Building Windows VirtualBox machines

I originally started this post in January, but after a couple of paragraphs I realized I was writing a more generic post – Believe in build automation. Now you know why I believe in automation and we can get straight to it. I’m a Linux guy – I’d rather work with Linux and I always prefer UNIX/Linux on servers – but I run a Windows desktop to conform. After all, I can run anything in VirtualBox when I need it.

And sometimes what I need is just another Windows. But I don’t want to prepare the box manually all over again after it expires (evaluation), not to mention that I want a repeatable process (because I believe in it :-)). You may snapshot your virtual machines, but you cannot avoid the eventual end of the evaluation period.

State of affairs in Windows automation

A couple of years ago I got a new computer with Windows and I wanted to put all my favourite tools on it. Of course I didn’t have a list. But I had a feeling I was repeating myself. I also wanted to disable some Windows features. I had experimented with PowerShell before, so I turned to it with faith. I found out that there are some PowerShell modules that allow you to add/remove features or applications, but they are limited to Windows Server. A couple of ugly words ran through my head and I postponed my dream.

Now I know this was a hasty decision, because Microsoft does not offer just one good standard way to do it. As explained here, you can use one of two PowerShell modules. The ServerManager module was the one that made me a bit angry because it is restricted to server versions of Windows, but there is also the Dism module, available on any recent platform – not to mention dism.exe itself, which works for older Windows incarnations as well.
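For illustration, feature management with the Dism module looks roughly like this (a minimal sketch from an elevated PowerShell prompt; the feature name is just an example):

    # List the optional features of the running system and their state
    Get-WindowsOptionalFeature -Online | Sort-Object FeatureName | Select-Object FeatureName, State

    # Turn a feature off and back on without touching the GUI
    Disable-WindowsOptionalFeature -Online -FeatureName "WindowsMediaPlayer" -NoRestart
    Enable-WindowsOptionalFeature -Online -FeatureName "WindowsMediaPlayer" -NoRestart

    # The same idea with plain dism.exe on older Windows versions
    dism.exe /Online /Get-Features
    dism.exe /Online /Disable-Feature /FeatureName:WindowsMediaPlayer /NoRestart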

While this is all just a minor episode, it documents how difficult it may be for a newcomer to find the right way to perform various tasks on the command line (and preferably in PowerShell). And I wasn’t even that new to Windows.

But after this it got easier and easier to use the right words and ask the right questions the right way to get my answers. Most of them were on Stack Overflow, but I have to praise Microsoft’s sites too. Sure, sometimes you have to go through a couple of Microsoft pages, but overall you can find the answers.

Back to Windows virtual machine creation

Here you go, I nearly did it again! Wrote a different post than I wanted, that is. So back to the topic. Of course, you need to know your options for automation, so learning more about PowerShell and about ways to (un)install various Windows features is still important. But we also need to know the general workflow for bringing a Windows virtual machine to life. I decided to use Vagrant because it is aimed at developers and praised by them.

When it comes to Windows there is one big trouble – because Windows is big. The same trouble exists for Linux too, but it is smaller. We’re talking about automated installation. The good news is that both systems can be installed automatically. It comes as no surprise for Linux, but Windows also features so-called “unattended installation”, aimed at the corporate world where admins don’t want to sit through installations for all the computers in a big company.

It works in a simple way – you provide an XML file with the unattended configuration for the computer and Windows finds it during installation, for example on a floppy or USB drive.

I don’t know all the options for installing Windows automatically, but this one is good for virtual machines. All you need to do is provide the booting virtual machine with the Windows ISO image as a DVD and an Autounattend.xml file on a virtual floppy disk.
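Done by hand in VirtualBox this boils down to something like the following sketch (the VM name, controller names and paths are made up, and you first have to pack Autounattend.xml into a floppy image yourself – which is exactly the kind of chore the tooling discussed below takes off your hands):

    # Attach the answer file as a floppy and the Windows ISO as a DVD
    VBoxManage storagectl "win10-eval" --name "Floppy" --add floppy
    VBoxManage storageattach "win10-eval" --storagectl "Floppy" --port 0 --device 0 --type fdd --medium C:\vm\autounattend.img
    VBoxManage storageattach "win10-eval" --storagectl "SATA" --port 1 --device 0 --type dvddrive --medium C:\iso\win10-eval.iso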

Can Vagrant do it? OK, now you got me. 🙂 Probably it can, but a brief investigation on the Internet revealed that instead of doing this installation with Vagrant I should first use Packer. And – what a surprise – both tools are developed by the very same company/guy (HashiCorp/Mitchell Hashimoto, sorry for leaving out the other people from the company). I was not the only one confused about the differences between Vagrant and Packer.

In short (and maybe not 100% precisely), Packer is good at creating virtual machine base images and Vagrant is good at using them in your development process. A base image is not an installation ISO; it’s rather a snapshot of a virtual machine after installation, with everything you want to have there from the start. Not with everything possible though – that’s why it’s called a base image. Packer can build base images for various virtualization platforms, but we will focus on VirtualBox only.
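In practice the whole pipeline then boils down to a few commands like these (a sketch – the template and box file names are hypothetical, and the .box file comes from Packer’s vagrant post-processor):

    # Build the base image from the ISO and Autounattend.xml described in the template
    packer build windows10.json

    # Register the resulting .box file with Vagrant under a name of your choice
    vagrant box add win10 .\windows10_virtualbox.box

    # From now on a fresh environment is just two commands away
    vagrant init win10
    vagrant up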

My idol Matt Wrock

I decided to install Windows automatically somehow, using Vagrant and preferring PowerShell as much as possible. Searching for a solution I somehow found the post called In search of a light weight windows vagrant box by Matt Wrock. I read a bit of it, but I was also intrigued by the link to the updated version using Packer and Boxstarter. By this time I was already overwhelmed by new terms, but it was worth it, I promise!

Matt definitely knows his stuff, his Windows and automation experience is extensive and he can explain it properly as well. Just a day before that I wasn’t ready to add Packer to my arsenal, not to mention Boxstarter – and soon I learned about Chocolatey as well. Now, honestly, I still don’t get Boxstarter, so for me it’s just “some extension for PowerShell” (disregard at will), but I absolutely fell in love with Chocolatey, because with it managing programs feels like it does on Linux.
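To show why it feels so Linux-like – managing programs is reduced to commands like these (the package names are just examples):

    # Install, upgrade and remove programs from the command line
    choco install git 7zip -y
    choco upgrade all -y
    choco uninstall 7zip -y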

Matt’s instructions on how to use Packer and Boxstarter were pretty cool; he provides Packer files (configurations, or sort of recipes, for Packer, written in JSON format) for Windows Server 2012, Windows Nano (a very interesting addition to Microsoft’s arsenal) and Windows 7 (here you need a license key, as there is no evaluation ISO, shame). I definitely utilized Windows Server 2012 R2, as the server edition always comes in handy during development when you want to experiment with a domain controller, etc. But I also wanted a Packer template for Windows 10 – and I had to create that one myself.

Windows 10 experiments

Actually, the biggest problem with Windows 10 wasn’t the Packer template, but the Autounattend.xml file. I found a generator, but its output did not come without errors. I’m still pretty sure that XML is far from flawless, it’s not cleaned up properly and so on – but it works. Diving into every detail in a field that is mostly new to me (the whole world of Windows automation) would probably have stopped me before I got to the result, so take it as it is, or make it better if you can.

I highly recommend reading Matt’s article Creating windows base images using Packer and Boxstarter, as it is a very good introduction to the whole pipeline. His Packer templates also provided a great starting point for more experiments. I also liked the way he minimized the images by removing many Windows components, defragging the disk, zeroing out empty space, etc.

I summed up my experiments in a markdown file and, looking at it, it definitely is not perfect and finished. But does it have to be? In the week I played with it I probably installed Windows 10 forty times. For most of these tests I commented out the slow parts that were mere optimization steps mentioned above (minimizing the size of the image). I played a lot with some preinstalled software (using Chocolatey, of course) and tried to pre-configure it using registry changes where necessary. This is, however, in vain, as the sysprep.exe step wipes registry changes for the vagrant user. Talking about the vagrant user, be extra careful to spell it the same way everywhere. Once I messed it up – I had vagrant in Autounattend.xml and Vagrant in postunattend.xml (which is used as C:\Windows\Panther\Unattend\unattend.xml by sysprep.exe) – and had two Vagrant accounts. You don’t want that. 🙂
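A trivial sanity check like this would have saved me (a minimal sketch assuming both answer files sit in the current directory):

    # Show every line mentioning the account name in both answer files,
    # so a vagrant/Vagrant mismatch is easy to spot at a glance
    Select-String -Path .\Autounattend.xml, .\postunattend.xml -Pattern 'vagrant' |
        Select-Object Filename, LineNumber, Line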

I tried hard to perform some installation and configuration steps after sysprep; I tried to move it from Packer’s shutdown_command to a windows-restart provisioner step, but I wasn’t able to overcome some errors. After a while I settled on a script I simply copy to my Vagrant working environment directory and then run from the initialized box, where it appears in the c:\vagrant directory.
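From inside a freshly started box it then looks roughly like this (post-setup.ps1 is a hypothetical name for that script):

    # Vagrant shares the host working directory (next to the Vagrantfile)
    # as c:\vagrant inside the guest, so the copied script is already there
    Set-ExecutionPolicy Bypass -Scope Process -Force
    & C:\vagrant\post-setup.ps1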

Sure, I could do even better with full automation, but when things resist too much it’s sometimes good to step back, rethink the strategy and focus on quick wins.

Other options?

There are definitely more ways to prepare a Windows 10 box, or any Windows box for that matter, with or without Packer, but even when we focus on Packer solutions there’s a wide spectrum of approaches. Some don’t bother to use sysprep.exe to generalize their installation – after all, if it’s only for personal needs it really isn’t necessary. People on GitHub seem to agree on using PowerShell as a Packer provisioner, but one of the solutions used no provisioner at all (everything was part of the builders section). Widen the search to other Windows versions and you’ll see much more variability.

Conclusion

Using Packer as the first step in the pipeline is very practical. You can prepare a base image once and save a couple of hours any time you need a fresh environment (you can partially achieve the same with snapshots in VirtualBox, but it’s not the same thing).

I use Boxstarter in the process as recommended (although I’m not able to appreciate it fully) and Chocolatey to install/remove programs – during the Packer steps and also anytime later. When my evaluation Windows runs out, I simply rebuild the Packer image from the ISO and I’m done for the next 90 or 180 days (depending on the OS version).

Following the installation enthusiasm, I went on to install SQL Server 2014 Express from a Chocolatey package, configured it using PowerShell bits and pieces found across many blogs and Stack Overflow questions, and wrote it down on GitHub. I have actually got used to writing technical notes I may need later into these markdown files and it works very well for me. Now I’m on an automation hiatus, but I’m sure I’ll get back to it and it’s good to have all the notes at hand, including unresolved problems and ideas.

Good luck with Windows automation!

Believe in build automation

I was probably lucky to have a great colleague who turned our installation/upgrade process into a single install.sh script. The script contained a uuencoded ZIP as well; it unzipped the content (a couple of Java EARs), checked the application server to see whether all the resources were set up and then deployed these EARs. Having this experience, probably around 2005, I got totally hooked on the whole idea of automation.

2005? Wasn’t it late already?

I don’t remember exactly, maybe it was even 2006. In any case, it wasn’t late for me. It was the next step in my personal development and I was able to fully embrace it. It may sound foolish, but we did not measure the hours saved versus those we spent getting to this single-command upgrade of our system. We didn’t have many other automation experiences. We had no integration server (not sure we knew what it was) and we struggled with automated testing. Even if we had wanted to measure, we would probably have failed, because we simply were not there yet.

There is this idea of a maturity model – that when there are multiple levels of knowledge or understanding or skill or whatever, you simply can’t just learn the highest one, or even skip over any to get higher. Because without living through the steps one by one there is no solid foundation for the next one. You don’t need a formalized maturity model; very often there simply is one – call it natural progress.

Of course, I don’t mean to bring in a “maturity model” to hold you back. Aim high, but definitely reinforce your foundations when you feel they don’t work. Not all models are prescribed, sometimes it’s more of an exploration. But talking about build automation, you can find some maturity models for it too. (Funnily enough, one of them mentions Vagrant/Packer at the top of the pyramid – and these two tools, and the last two weeks spent with them, are what made me write this post. :-))

Personal and team maturity

We were a small team, far from the best in the business, but we were quite far ahead in our neighbourhood. We followed trends, not blindly, but there was no approval process to stop us from trying out interesting things. By 2010 we had our integration server (continuous build), managed to write automated tests for newly developed features and even created some for critical older parts.

Then I changed jobs and went to a kind of established software house, and I was shocked at how desperate the state of automation was there. Some bosses thought we had something like an integration server, but nobody really knew of one. What?! And of course, testing takes time, especially automated testing, and so on and so forth… There was no way I could just push them. I gave some talks about it; some people were on the same page, some tried to catch up.

We got Jenkins/Sonar up and running, but automated testing was lagging behind. We had a really important system that should have had some tests – but there were none. People tried, but failed. They did not pursue the goal; they saw only the problems (it takes time and adds code you have to maintain) but did not see the benefits. There are cases where doing more of the wrong thing does not make it any better (“let’s do an even more detailed specification, so that coding can be done by cheaper people”), but there are other cases where doing the right things the wrong way requires a different approach. It requires learning (reading, courses) and practice. There is no magic that will get you from zero to the fourth maturity level in any discipline.

You can have mature individuals, but the team (or division, or even company) can still be immature in what you pursue – and how it turns out strongly depends on the company culture. The other way around is much easier: coming into a more mature team means you can quickly get up and running at its level, although it is still important to catch up on those foundations, unless you want to do what you do only superficially. That may be acceptable in some areas; it may even be really efficient, as not everybody can know everything in full depth.

Proof vs belief

I really believe I got lucky to get hooked. You can throw books at people, you can argue and show them success stories. But just as with anything based on “levels of maturity”, you can’t simply show them the result. They don’t see it with your eyes. It never “just works”, and there are many agile/lean failures showing it – mostly caused by following the practices only and forgetting the values and principles (which is an analogy to a maturity model too).

I got my share of evidence – I saw the shell script enabling those “one-click” updates for us. But for some this would not be enough. I was always inclined towards learning and self-improvement. Sure, there are days when I do the repetitive work “manually” – I guess those are the days I’m out of mental fuel. But in most cases I was trying to “program” my problems away.

I’d rather mess around with vim macros for a while, even if it didn’t pay off the moment I needed to change 20 lines. The next time, when I needed to tweak 2000 lines, I was ready. I never thought about kata, but now I know that’s what I was doing. I was absolutely sure about why I was doing it. I didn’t see the lost time; what I imagined was my neurons bending around the problem first, only to bend later around the whole class of problems, around the pattern. I didn’t care about proof, I believed.

I see the proof now, some 15 years down the path. I see where believing got me and where many other programmers are stuck. I see how easily I now write a test that would have been a mind-blowing mission impossible just a couple of years ago. And I know that the path ahead is still a long one. I’m definitely much closer to the top of the trends than at any time before.

Final thoughts

I don’t think it was that install.sh script alone that brought the power of automation to my attention. I believe it was one of the defining moments – my favourite one when I try to convince people with a personal story. It saved us incredible amounts of time. It was a kind of documentation too, the best kind, because it actually ran. I would understand a lot of this much later, and it’s crystal clear to me now.

But it was belief in the first place. That made it easier for me. When I met a culture that didn’t support these ideas (at least not in practice), I was already strong enough to do it my way anyway. And I saw it paying off over and over and over again. While my colleagues declared “no time for testing”, I was in fact saving time with testing. People are so busy writing lines of code; maybe if they thought more, or started with a test, they would get to the result faster – with the benefit of higher quality packed in too.

How can we hope for continuous integration when we don’t start small with tests and other scripted tools? There are so many automation tools out there, and the concepts are so well understood by now, that staying stuck in the 2000s is just pure laziness. The bad kind of it. It doesn’t matter whether it’s a single person or the whole company.