In this post I am taking a different tack to write about my perspective on the underlying causes of the Windows Vista (codename Longhorn) debacle. While this happened over a decade ago, it was a crucial period in the shift to mobile and had long-running consequences inside Microsoft. I think there is a different story to tell — one that is better rooted in the actual facts of the projects and the real motivations of key parties.
This is not an effort at alternative history — I have no idea what would have happened if these mistakes were not made but they certainly did not help Microsoft navigate this critical inflection point in the computing industry.
This is also not investigative journalism — I have not done a deep series of interviews with key parties. This is my point of view based on what I experienced during that period and learned afterwards. Although I was in the Office organization at the time, I worked closely with many Windows partners and so was very aware of what was happening in the Windows organization.
I apologize for the length. The TL;DR version is: Microsoft badly misjudged the underlying trends in computer hardware, in particular the abrupt end of the long-running trend of rapid improvements in single-threaded processor speed and the matching improvements in other core elements of the PC.
Vista was planned for and built for hardware that did not exist. This was bad for desktops, worse for laptops and disastrous for mobile.
The bet on C# and managed code was poorly motivated and poorly executed. This failure in particular can be laid directly at the feet of Bill Gates and his fruitless Holy Grail effort to create a universal storage and universal canvas application infrastructure.
This had especially long-running consequences. Windows project management had teetered on the edge of catastrophe throughout its history, with a trail of late projects that stumbled to completion. Vista was a disaster, but it was just the culmination of a series of near-catastrophes in the core executive mission of complex project execution.
Since it is so critical to this story, I want to start with a short primer on the structure of the industry and value creation. Any device is constructed from hardware, an operating system (OS) and applications. At the most basic level, the OS manages and exposes hardware resources so they can be shared and used by applications. The OS exposes application programming interfaces (APIs) that allow different types of hardware to be connected to the device and APIs that give applications access to this hardware as well as OS-provided services.
The history of the OS, especially in the consumer world, is one of including more and more high-level services, either provided directly to users or exposed as APIs to applications.
This evolution of higher-level functionality is driven by the virtuous cycle and multi-sided network effects inherent in the OS business. More and more users of an OS attract more developers. More developers create more applications that make the OS more attractive to users. That results in a cycle of still more users leading to still more developers. The APIs exposed by the OS are what makes this such a stable business strategy for the winners in this contest. Millions of developers in aggregate expend massive effort programming to the system APIs and the services behind them.
The deeper some application depends on the sophisticated APIs exposed by a particular OS, the harder it is to move that application to some other OS. This means that even if a competitor matches the core functionality of another OS, they will be missing all those applications. There is no way a single OS provider can match the effort expended by those millions of developers.
With this dynamic, there are multiple reinforcing reasons for an OS provider to add more and more sophisticated functionality and APIs to their OS and make it easier for developers to access this functionality.
Sophisticated functionality should attract developers and make it easier for developers using these APIs to quickly build better applications. These better applications fit directly into the virtuous cycle to attract more users. A classic example was when Windows was the first OS that made it possible for an application to embed an HTML surface directly in the application.
Critically, when an application uses this functionality, it now makes it harder to move that application to another OS. If you look at Windows, iOS and Android, they all operate with this same dynamic despite the fact that Microsoft, Apple and Google all monetize differently.
Microsoft classically charged a per-device licensing fee that was paid by OEMs that sold Windows devices. This was a horizontal business strategy with lots of OEMs all paying Microsoft for the devices they built and sold. Apple monetizes by building and selling devices directly. Google does not charge OEMs an OS license but rather depends on post-sale monetization primarily through search. In fact the fear of being excluded from mobile search and mobile services generally by Apple and Microsoft is what drove Google to invest in Android in the first place.
Microsoft is also moving to direct device monetization through its Surface line as well as post-sales monetization through Bing and subscription-based services like Office 365.

Another important part of the story here is third-party middleware like Java and Adobe Flash. Middleware is in some ways no different from higher-level OS services, except that it is built and provided by a third party. OS providers have a love-hate relationship with middleware.
To the extent that it makes it possible for developers to more quickly build great applications for their platform, there is love. Certain types of middleware specifically target the challenge of building applications for multiple platforms. Applications built on this middleware do not depend directly on OS APIs and therefore can run on any platform where the middleware exists. Note that modern readers will think of Java as either server-side infrastructure for web sites or as the preferred language for Android development.
I am referring to its origin as a language for on-demand browser-based applications which was its main identity as Vista was planned. Cross-platform middleware disrupts the network effects driven by exclusive applications tightly bound to an OS through exclusive OS-specific APIs.
Some types of OS functionality generate their own internal network effects where the more applications that use the functionality, the better all the applications behave. Rich copy-and-paste is a classic example; the more applications allow rich content to be copied and pasted between them, the more valuable the OS is for each user.
If a third-party middleware provider blocks this dynamic, it has blocked a key opportunity to create further sustainable differentiation over time. The browser as an application-delivery platform is probably the most stable example of middleware that disrupts the OS API dynamic.
Looking back on 35 years of PC history, other approaches have existed for a time but ultimately collapsed for reasons that I do not need to dive into here. The critical thing for this story is that twenty years ago it was not so clear that this is how it would play out.
Fear of middleware and the disruption of sustainable API differentiation drove much of the thinking at the time of Vista. I am going to do a brutal job of summarizing, but I think it captures the essence. A Windows release generally had a key theme and a rough timeline. For example, Windows 95 was about modernizing consumer Windows: moving to 32-bit, a modern file system, an updated UI and standard networking including a browser. Beyond the main themes, individual developers and teams would determine the key features for their area on their own and begin development.
The product under development would lose stability as new features were checked in and for long periods could barely be packaged together as a release much less used broadly on a dogfood basis.
At a certain point, the team would determine they had made enough progress for the release and start the drive to stability and shipping. The history of the Windows team was that the release generally slipped significantly from the initial target date (Windows 95 was initially Windows 93) and important target functionality was either dropped or shipped well short of the original functional target.
I will note that a key distinction between Windows and Office was that after Office 97, Office would pick a release date and generally nail it. This made all the difference in achieving broad coordination with minimal process overhead.
This process contrasts significantly with modern engineering practices. Independent of whether individual feature ideas are driven top-down from a broad consistent vision or bottom-up from individual engineers and teams, modern practices generally involve maintaining continuous ship-level quality and actually shipping to customers on a very frequent basis.
Services might ship multiple times a day, while client code will ship weekly or monthly; updating clients has a cost for both the provider and the user, which militates against updating too rapidly.
This requires major architectural and engineering infrastructure to accomplish reliably for large complex systems like Windows or Office. This process does not necessarily make it easier to build big advances in complex functionality, but it dramatically increases the team's agility and ability to respond to external events and realities.
It also forces a much more honest ongoing appraisal of how much real progress is being made. It is probably worth a separate post to describe how Office made this transition, but for this story suffice it to say that Windows was nowhere near this process during this timeframe.
Windows XP was a massive release that followed this pattern. It unified the business and consumer platforms on the robust kernel of Windows NT and the consumer Windows user experience. Compatibility with all the applications built on top of the consumer Windows platform was a huge challenge but was key to enabling the transition to a single consumer and business platform. Unfortunately, Windows XP also had a headline-making zero-day security exploit on its public release date.
This and other high-profile security disasters started a pivot inside Microsoft to a huge overhaul of software and engineering practices around security and ultimately an immense service pack to Windows XP that consumed a large part of the Windows organization.
This was critical as Windows competed successfully to build up a larger enterprise server business. This requires a little more background. The browser started evolving as an application delivery environment from very early days. Microsoft built and invested in its own browser and its own ActiveX code embedding mechanism. Java arrived in this time frame as an alternate application delivery strategy.
Developers could use Java, a high level language with its own rich set of APIs, and have the code automatically downloaded and run in the browser.
Microsoft settled the Java lawsuit and ultimately decided to forge its own path with the C# language. This proved to be disastrous for a wide range of reasons. The language and its runtime use garbage collection to automatically recover any memory no longer in use.
Importantly for this period, the runtime also prevents the type of memory bugs that cause many of the security vulnerabilities we were seeing. At the time, and really for the following decade, there were passionate arguments about the impact of automatic memory management on programmer productivity and security. I will not try to have that debate here, but perhaps suffice it to say that the most successful OS of our current era, iOS, did not make this gamble.
Android sells more copies but iOS captures the vast majority of the profit. Managed environments have an inherent resource overhead compared to unmanaged environments so they tend to require more memory to run. Most environments that leverage the productivity benefits of managed code are careful to limit its use to where it makes the most sense rather than leverage it blindly.
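To make the managed-versus-unmanaged distinction concrete, here is a minimal sketch, in Java rather than C# purely for illustration (the class and values are hypothetical, not any actual Windows or .NET API). The point is the programming model: objects are allocated freely and never explicitly freed, the collector reclaims whatever becomes unreachable, and every object carries runtime bookkeeping that contributes to the memory overhead described above.

```java
import java.util.ArrayList;
import java.util.List;

public class ManagedMemoryDemo {
    // A tiny object. In a managed runtime each instance also carries
    // header words for the GC and type system -- part of the per-object
    // overhead managed environments pay relative to a raw C struct.
    static final class Node {
        final int value;
        Node(int value) { this.value = value; }
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 100_000; i++) {
            // Allocate and immediately drop each Node. There is no free():
            // the collector reclaims unreachable objects, so use-after-free
            // and double-free bugs are impossible by construction.
            Node n = new Node(i);
            sum += n.value;
        }
        // Objects stay alive only as long as something still references them.
        List<Node> retained = new ArrayList<>();
        retained.add(new Node(42));

        System.out.println(sum);                      // prints 4999950000
        System.out.println(retained.get(0).value);    // prints 42
    }
}
```

The same loop in C would need an explicit free() for every allocation, which is exactly where the use-after-free and double-free vulnerabilities of that era crept in; the trade is that the managed heap and collector consume memory even when the program is idle.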
But memory use, whether automatic or manual, is resource use, and a casual attitude toward resource use results in bloated code that requires more memory to run. Using more resources was part of the value system at the time, since it reinforced how important a large, rich client was to the computing experience vs. thinner alternatives.

Part of the bet on C# was also a bet on a rich base class library, with new client technology then built as a set of class libraries on top of this base.
The base library provides simple types like strings and arrays as well as more complex data structures and services like lists and hashtables.
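As a rough sketch of what a base class library means in practice — shown in Java rather than C#, so the names below are Java's collection types, not the actual .NET APIs — an application composes these primitive types and data structures directly, and every such use deepens its dependence on the platform:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BaseLibraryDemo {
    public static void main(String[] args) {
        // Simple types: strings and arrays come from the base library.
        String name = "Windows" + " " + "Vista";
        int[] years = {1995, 2001};

        // Richer structures layered on top: lists and hashtables.
        List<String> releases = new ArrayList<>();
        releases.add("Windows 95");
        releases.add("Windows XP");

        Map<String, Integer> releaseYears = new HashMap<>();
        for (int i = 0; i < releases.size(); i++) {
            releaseYears.put(releases.get(i), years[i]);
        }

        System.out.println(name);                           // prints Windows Vista
        System.out.println(releaseYears.get("Windows XP")); // prints 2001
    }
}
```

The strategic point is that code written against these types is code written against the platform: porting it elsewhere means replacing every one of these dependencies.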