Virtualization.info (http://www.virtualization.info/2009/04/how-microsoft-and-vmware-use.html) and vInternals (http://vinternals.com/2009/04/microsoft-myths-and-realities/) provide excellent analyses of recent internal documents and presentations from Microsoft and VMware respectively on how each company uses its own virtualization products internally. Microsoft's internal use of Hyper-V is described in a January 2009 Technical Case Study (http://technet.microsoft.com/en-us/library/cc974012.aspx), and VMware's internal use of ESX in a VMworld Europe 2009 presentation by CIO Taylor Stansbury (http://sessions.vmworld.com/mgrCourse/launchCourse.cfm?mL_method=player).
To summarize, VMware uses server virtualization internally for 97% of its servers, and the remaining servers will be virtualized by the end of this quarter. Microsoft, on the other hand, expects to have 50% of its server instances running on virtual machines by the end of Q2. VMware is able to handle nearly twice the number of VMs per 8 core/32 GB host as Microsoft (Microsoft actually uses a combination of 8 core and 12 core hosts, so the discrepancy is even greater). VMware's deployment has no performance or stability issues and is both flexible and easy to manage. Microsoft, on the other hand, has this to say about its deployment:
"Because of the required brief outage every time a virtual machine is moved from one host to another, Microsoft IT found that coordinating the server update processes with virtual machine owners was difficult. Because one physical host could contain several virtual machines, Microsoft IT had to communicate with each of the virtual machine owners and coordinate host server maintenance with virtual machine maintenance."
Microsoft discusses its limitation of 23 LUNs, which arises because each LUN must be available to the "Parent Partition" (Host OS). Windows does not deal with drives in terms of numbers; it deals in terms of letters (of which there are 26). "A" and "B" are traditionally reserved for floppy drives and, as such, cannot be assigned to a LUN. "C" is the local hard drive and cannot be assigned to a LUN. Therefore, Hyper-V can only address 23 LUNs. Microsoft says it had to acquire new storage frames that supported more LUNs because it ran out under its 1 LUN: 1 VM strategy.
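The drive-letter arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counting argument, not anything Hyper-V itself runs; the function name and the RESERVED set are made up for the example.

```python
import string

# Letters that are unavailable for LUNs, per the explanation above.
# (Illustrative constant, not a Windows API.)
RESERVED = {"A", "B", "C"}  # A/B: floppy drives, C: local hard drive


def available_lun_letters(reserved=RESERVED):
    """Return the drive letters left over for LUNs on the parent partition."""
    return [c for c in string.ascii_uppercase if c not in reserved]


letters = available_lun_letters()
print(len(letters))             # 23
print(letters[0], letters[-1])  # D Z
```

With one LUN per VM, the 23 remaining letters translate directly into a ceiling of 23 VMs per set of shared drive letters.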
Microsoft says it uses 802.1q trunking but not 802.3ad link aggregation, and doesn't support NIC teaming. This means that there is no network redundancy for the VMs in a Hyper-V environment. This is a huge failing on the network side, and much more important than the VLAN tagging it does support. If a network port carrying VM traffic goes down, all of the VMs behind it are offline until an administrator intervenes (Microsoft plans to fix this in the R2 release of Server 2008). VMware, on the other hand, not only fully supports 802.3ad and NIC teaming today, but in the upcoming vSphere its vNetwork Distributed Switch (VDS) will span many ESX hosts, allowing quick scale-up of networking capacity. vNetwork also includes Private VLAN support, network VMotion, off-loaded network I/O and the ability for organizations such as Cisco to build VMotion-aware networking applications on top of it. Cisco's Nexus 1000V virtual switch is expected to be announced on 4/21 in conjunction with vSphere.
When a VMware virtual machine communicates with the network, its traffic is processed by the Guest OS and then by the Hypervisor, which pushes it out to the network at large. On Hyper-V, the traffic must be processed first by the Guest OS, then by the Hypervisor, and then by the Parent Partition (the Host OS). The added layer (the Host OS) is the one that has traditionally been a bottleneck (which is one reason why ESX performs so much better than Hyper-V).
Here is a summary of the internal virtualization environments at VMware and Microsoft:

| Internal Virtualization Environment | VMware | Microsoft |
|---|---|---|
| Current % of servers that run as virtual instances | 97% | < 50% |
| % of servers that will be virtual instances by end of Q2 | 100% | 50% |
| # of production VMs per 8 core/32 GB host | 10.4 | 5.7 |
| High Availability enabled? | Yes | No |
| Migration of live server instances without downtime? | Yes | No |
| Natively supports NIC teaming (network redundancy) | Yes | No |
| Can add vSwitches without requiring host rebooting | Yes | No |
| Efficient communication of VMs with the network | Yes | No |
| LUN limitation | 255 | 23 |
| # of VMs managed per administrator | 145 | ? |
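
To put the VMs-per-host figures in perspective, here is a rough sketch of how many 8 core/32 GB hosts each density would imply for a fleet of a given size. The two densities come from the summary above; the 1,000-VM fleet is a hypothetical number chosen only for illustration, and the calculation ignores headroom for failover or maintenance.

```python
import math

# Production VMs per 8 core/32 GB host, taken from the summary above.
VMS_PER_HOST = {"VMware": 10.4, "Microsoft": 5.7}


def hosts_needed(vm_count, vms_per_host):
    """Hosts required to run vm_count VMs at the given average density."""
    return math.ceil(vm_count / vms_per_host)


ratio = VMS_PER_HOST["VMware"] / VMS_PER_HOST["Microsoft"]
print(round(ratio, 2))  # 1.82

FLEET = 1000  # hypothetical fleet size, not from either company
for vendor, density in VMS_PER_HOST.items():
    print(vendor, hosts_needed(FLEET, density))
# VMware 97
# Microsoft 176
```

On these reported densities the gap is about 1.8x, a little short of a full doubling on identical hardware.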
Thanks to Jason Coleman and Steve Jones of INX who explained the networking and storage issues for me (though any mistakes in transcribing their remarks are solely the author's).

You don't know what you're talking about. The LUN limit is an alphabet problem, so you can use cluster strings instead to get more LUNs. You're comparing the penetration of VMware within VMware against the penetration of Hyper-V within Microsoft, where 50% of Microsoft is two VMwares any day in infrastructure alone. There is load balancing for network redundancy built into Hyper-V, called NLB, and it also supports cluster-aware servers and applications (Exchange, SQL, etc.). Also, Quick Migration works for 80% of operational needs. For changes to the machine, you still have to shut down in VMware to add memory or CPU, and operationally you are more likely to do that than to need to move machines around based on utilization. Lastly, going to a buffet (Transparent Page Sharing), you get buffet food and quality. Going to a real restaurant and ordering what you like (memory commitment) is different, so more is not always better from a quality standpoint or a risk-exposure standpoint: if a server goes down, do you want 100 servers affected, or a smaller set, given the MTBF of the hardware?
Posted by: The One | April 10, 2009 at 05:15 PM
Hi “the one” - thank you for following Steve’s Blog! Please allow me to respond to some of your points.
You are, of course, correct that there are ways to work around the alphabetic limitation on LUN counts in Hyper-V. Unfortunately, these methods are quite complex and unwieldy. How unwieldy, you may ask? Please allow me to quote from a Microsoft Technet article published just three months ago (http://technet.microsoft.com/en-us/library/cc974012.aspx): "With Windows Server 2008 failover clustering, an administrator must store each virtual machine on an individual LUN. Because an administrator must provide all cluster nodes with access to the same shared storage by using the same drive letters, 23 is the maximum number of virtual machines that can run in a failover cluster. Microsoft IT could work around this limitation by using mount points and virtual machine groupings, but it considers this configuration too complex to administer." They then go on to say, "Because of these issues, Microsoft IT has not deployed failover clustering as the default standard for virtual machines." While there is a way to work around this limitation, it is so complicated that Microsoft, the company responsible for creating this technology, has determined that it is "too complex to administer".
This LUN limitation plays directly into High Availability. You are absolutely correct that Hyper-V supports clustering and, when compared to ESX, has identical behavior for "highly available" Virtual Machines. The key difference is that ESX's High Availability is infrastructure wide, whereas the Hyper-V implementation is, as demonstrated in the above referenced article, limited at best.
Please correct me if I’m wrong, but my impression is that NLB is utilized to balance network traffic between two or more servers for stateless applications, such as IIS. While this is a form of Load Balancing, it does not serve to replace the native support of NIC Teaming. NIC Teaming refers to technologies such as Link Aggregation (http://en.wikipedia.org/wiki/Link_aggregation), which allow multiple physical adapters to be bonded together to form a single, fault tolerant, logical adapter. Should a physical link unexpectedly stop transmitting traffic, the traffic will failover to the remaining physical links in the aggregate. By utilizing Link Aggregation to a Virtual Switch (one of ESX Server’s important features), every VM attached to that switch benefits from this redundancy.
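To make the failover behavior concrete, here is a toy Python model of a teamed uplink. It is a sketch of the idea only, not how ESX or any switch implements link aggregation: the class, the adapter names and the modulo-based link selection are all assumptions made for the example (real implementations typically hash on MAC/IP/port fields).

```python
class NicTeam:
    """Toy model: several physical NICs presented as one logical adapter."""

    def __init__(self, nics):
        # Every physical link starts out healthy.
        self.link_up = {nic: True for nic in nics}

    def fail(self, nic):
        """Mark one physical link as down."""
        self.link_up[nic] = False

    def active_links(self):
        return [nic for nic, up in self.link_up.items() if up]

    def send(self, frame_id):
        """Return the physical link that would carry this frame."""
        links = self.active_links()
        if not links:
            raise ConnectionError("all uplinks in the team are down")
        # Simplified distribution across whatever links are still up.
        return links[frame_id % len(links)]


team = NicTeam(["nic0", "nic1"])  # hypothetical adapter names
print(team.send(0), team.send(1))  # nic0 nic1
team.fail("nic0")
print(team.send(0), team.send(1))  # nic1 nic1
```

After one link fails, traffic keeps flowing over the survivor without anyone intervening, and every VM behind the logical adapter benefits. That is the property NLB inside a guest does not give you.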
Support for Guest OS Clustering options is neither new nor unique to Hyper-V; any Hypervisor worth its salt will allow such clustering. This type of clustering is not a replacement for Host level clustering.
Quick Migration places system administrators in the admirable position of being able to relocate Virtual Machines with only moments of downtime; as you say, it’s good enough for 80% of the times when system administrators need to relocate their Virtual Machines. VMotion, on the other hand, allows system administrators to relocate Virtual Machines with almost no downtime (the delay that VMotion incurs is primarily due to the time it takes to ARP and update the routes that the network packets must follow in order to arrive at their new destination, which has little to no impact in any but the most sensitive of applications), which is good enough for 99% of the times when an administrator may need to relocate a Virtual Machine. In fact, it’s so good that VMware allows for the automation of that process in order to facilitate VM load balancing between servers.
Both technologies require shutting down the VMs in order to make dramatic Virtual Machine hardware changes (such as adding memory or CPUs); only one of the technologies requires restarting the Host when making standard infrastructure changes (adding a vSwitch).
To respond to your second-to-last statement, your buffet analogy is completely fallacious. Transparent Page Sharing is not “going to the buffet” – it is recognizing that your waiters can carry more than one plate at a time. This way they aren’t wasting resources by running from the kitchen to the table over and over and over again. ESX allows for dedicated memory and shares identical pages between machines (only as long as they truly are identical, for when a machine tries to change such a page, it gets a unique copy spun off to satisfy that change) in order to preserve a valuable physical resource so that it can be more effectively utilized.
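For readers who prefer code to restaurant metaphors, here is a minimal Python sketch of content-based page sharing with copy-on-write. It is a toy model under simplifying assumptions, not ESX's implementation: the class and method names are invented, pages are short byte strings, and a hash match is treated as proof that two pages are identical (hash collisions are ignored for brevity).

```python
import hashlib


class SharedMemory:
    """Toy model of page sharing: identical pages occupy one physical copy."""

    def __init__(self):
        self.frames = {}      # content hash -> page bytes (one physical copy)
        self.refcount = {}    # content hash -> number of VM pages mapped to it
        self.page_table = {}  # (vm, page_no) -> content hash

    def _release(self, key):
        """Drop a VM page's old mapping, freeing the frame if it is unused."""
        old = self.page_table.pop(key, None)
        if old is not None:
            self.refcount[old] -= 1
            if self.refcount[old] == 0:
                del self.refcount[old]
                del self.frames[old]

    def write(self, vm, page_no, data):
        """Write a page; the writer gets its own copy if the content differs."""
        key = (vm, page_no)
        self._release(key)
        digest = hashlib.sha256(data).hexdigest()
        self.frames.setdefault(digest, data)
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        self.page_table[key] = digest

    def read(self, vm, page_no):
        return self.frames[self.page_table[(vm, page_no)]]

    def physical_pages(self):
        return len(self.frames)


mem = SharedMemory()
mem.write("vm1", 0, b"kernel")
mem.write("vm2", 0, b"kernel")
print(mem.physical_pages())   # 1  (identical pages share one copy)
mem.write("vm2", 0, b"patched")
print(mem.physical_pages())   # 2  (vm2 got its own copy on write)
print(mem.read("vm1", 0))     # b'kernel'
```

The point of the sketch is that vm1 never sees vm2's change: sharing lasts only as long as the pages truly are identical.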
As far as Mean Time Between Failures goes, I’m surprised that you even brought it up. ESX and Hyper-V are going to be running on the same class of hardware (server), from the same vendors, so we’re obviously not concerned with a difference in their physical MTBF (and Virtualization technologies are rapidly making hardware failures irrelevant by providing effective recovery or failure negation capabilities). I have many customers who have been running their ESX environments since before Hyper-V was even in beta and they have had no ESX software failures. So, don’t talk to me about MTBF.
Posted by: Jason Coleman | April 10, 2009 at 07:30 PM
Good information. Recommendation for a follow-on: a VMware ESX to XenServer comparison.
Posted by: Sean | April 10, 2009 at 07:47 PM
Hi Steve
A couple of comments:
It should be no surprise that VMware are attempting to get 100% of their servers running as virtual machines. Virtualization is what VMware does; not attempting to virtualize all their workloads could, by some, be considered an admission that the technology has limits, something that VMware may be unwilling to accept.
MS, on the other hand, with many more technologies in play, probably don't feel the need to aim quite as high. That said, the fact that they feel they can virtualize 50% of their environment at this stage in Hyper-V's life cycle does suggest a high degree of confidence. It also should be said that moving from a physical to a virtual data center takes time and significant resources; what is there to say that MS couldn't hit 80% of their servers today but just haven't got round to it yet? At the same time, of course, MS would be foolish to attempt to virtualize any workload with an SLA that has requirements that Hyper-V cannot currently meet, so maybe 50% is all they dare do.
What would be more interesting and relevant to report would be the ROI of virtualizing all these workloads: does the additional cost of running a virtual server environment deliver hard savings when extended to the edge in this way, and how long is the payback time?
regards
Simon
Posted by: Account Deleted | April 11, 2009 at 09:06 AM
Simon, I think you hit the nail on the head when you talk about ROI as being key (surprised?). The question is, with virtualization - how much can any organization (even VMware and Microsoft) save? In my experience it's a very consistent correlation: the more you virtualize, the more you save. In fact, since the bulk of virtualization expense is in the up-front HW/SW/Consulting/Training - savings become even greater as you virtualize more of your infrastructure. Virtualization also enables significant benefits in terms of high-availability, integrated test/dev, DR, etc. So the fact that MS is shooting for 50% virtualization while VMware is at 100%, to me, is very telling in and of itself. IT managers should run the numbers and see what impact virtualization will have for their organizations. If it makes sense to virtualize a little, it probably makes sense to virtualize everything (network, storage and DR along with servers). As Microsoft's internal struggles with virtualization show - VMware is the only viable enterprise data center virtualization platform today.
Posted by: Steve Kaplan | April 11, 2009 at 10:48 AM
> As Microsoft's internal struggles with virtualization show - VMware is the only viable enterprise data center virtualization platform today.
I don't think you are being quite fair there Steve. We don't know if MS are struggling to virtualize their data centers, just that they haven't progressed as far as VMware and that could be for many reasons unrelated to the maturity of their hypervisor. You could equally ask what percentage of Microsoft's servers are running 2008 and use that to suggest that Windows Server 2008 isn't as good as Windows Server 2003.
And you missed Microsoft's statements
"If a physical server is the only way that a group can meet business requirements, Microsoft IT provides one. However, in most cases, a virtual machine more than meets business requirements."
and
"With Windows Server 2008 Hyper-V, the expectation is that at least 80 percent of new server orders will be deployed as virtual machines."
So going forwards it's clear that MS have the situation under control.
And as an aside, I take the same position that MS have: I use virtual servers in my lab almost exclusively, but when I am working on server power management I use physical servers (how else can I confirm that the s/w can work as expected?).
Anyway, that's not really the point. VMware is without question the market leader in the virtualization space 'today', but if we are to put the interests of the customer first, we have to take into account the long view and look at the ROI over the life of the data center and not just the first couple of years. Will VMware's closed ecosystem ultimately prove to be more expensive than the open approach that MS/Citrix are offering, or can we achieve a short enough ROI with VMware that the long view does not matter? You are far better at calculating the big picture ROI than I am, so I'd be very interested in hearing your view here.
Posted by: Account Deleted | April 12, 2009 at 10:12 AM
Simon,
I think in the responses we've veered away from the original point of the blog, which is Microsoft's internal virtualization architecture. They openly discuss their many struggles in terms of HA, Live migration, storage and management. While there may be reasons to keep a very small minority of servers physical (though 20% seems far too high), a higher ROI is going to result from a greater % of virtualized servers, and at an increasing rate since the marginal cost to virtualize additional servers is very low. As I said in my last response, each organization needs to do their own financial analysis up front. If, however, their numbers validate my assumption that it makes strong economic sense to virtualize as much as possible - they then need to choose the optimal virtualization platform. At this point in time, there can be no question that VMware is it. Even the admittedly Microsoft-biased Redmond Magazine gave their 2008 Editors Choice Award for the most reliable technology to VMware. The IBM mainframe came in #2. Hyper-V was nowhere in sight. The competition to VMware is not Hyper-V, XenServer, Oracle VM, VirtualIron, etc. It's inertia. Every day that organizations don't virtualize is another day they incur increased risk of downtime, bear additional administrative requirements, consume extra power, rack space, network ports, SAN ports and have one less day until they need to upgrade their existing servers. It's one more day when they may be adding air conditioning, ordering a PDU, upgrading their UPS or generator, purchasing extra Windows server licenses or implementing Microsoft Clustering. I counsel our clients to beat the competition, inertia, and begin realization of all of the savings and other benefits by virtualizing today with the clearly superior hypervisor and the only one that is proven as an enterprise platform, VMware ESX.
If, in two years or so, Hyper-V or XenServer or another competitor is at an acceptable level, then the optimum infrastructure for their implementation is already in place. The organization can simply toss out VMware and put in the alternative. In the meantime, though, they'll probably have paid off the licensing costs hundreds or thousands of times over.
Posted by: Steve Kaplan | April 12, 2009 at 05:02 PM