There's a second trend that I've noticed but never appreciated. Super Micro really benefited from the "big array of cheap compute nodes" theory of data centers. This is what Google pioneered: a ton of cheap commodity machines you throw away as they die. But with virtualization we've moved to extremely reliable, overbuilt hosts for VMs.
I highly doubt that. What's your source for this? Software eats the world. Eventually everything will run on commoditized hardware because at large scales it will always be cheaper to implement redundancy in software than in hardware. Google / Amazon / Apple / Microsoft aren't going to deploy millions of (expensive) extremely reliable overbuilt hosts to minimize the chance that one of their computers will crash at some point. And small fish aren't going to deploy these servers because it's cheaper to rent capacity from one of the bigger players.
Call any manufacturer and ask for a quote on a virtualization box. Plus hanging around the industry.
But just think this through. Virtualization sells threads/cores, RAM and disk space. It's more cost-efficient to take a server from 10 cores to 20 cores and double the sticks of RAM than to rack another machine. In a data center you pay for every amp of electricity you use, and the marginal cost of more CPU cores and RAM is very small compared to an entirely new machine.
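To make that concrete, here's a quick back-of-envelope in Python. Every number in it is a made-up assumption purely for illustration; plug in your own quotes and power rates.

    # Marginal cost of upgrading an existing host vs. racking a whole new one.
    # All figures are illustrative assumptions, not real quotes.
    NEW_SERVER_COST = 8000     # USD for a second 1U box (CPU, RAM, NIC, chassis)
    UPGRADE_COST = 3500        # USD to go from 10 to 20 cores and double the RAM

    NEW_SERVER_WATTS = 350     # draw of an entire extra machine
    UPGRADE_WATTS = 120        # extra draw from more cores and DIMMs in the same box
    USD_PER_KWH = 0.12
    HOURS_PER_MONTH = 730

    def monthly_power(watts):
        return watts / 1000 * HOURS_PER_MONTH * USD_PER_KWH

    print(f"new box: ${NEW_SERVER_COST} up front + ${monthly_power(NEW_SERVER_WATTS):.2f}/mo in power")
    print(f"upgrade: ${UPGRADE_COST} up front + ${monthly_power(UPGRADE_WATTS):.2f}/mo in power")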
A few things changed to make this shift possible. First, converged Ethernet adapters: a host can have a single 10/25/40GbE adapter that's divided up into hundreds of small virtual adapters, one per VM. Second, core density: in 2010 Xeons had four or six cores. Now you can buy a Xeon Platinum with 28 cores, and in a dual-socket NUMA machine that's 56 cores, or 112 threads, per host. Xeons can support 1.5TB of RAM, all in a single 1U pizza box.
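Run the density math on one of those boxes and you can see where the VMs come from. A rough sketch, assuming a 2 vCPU / 8GB VM and a 4:1 CPU oversubscription ratio (both made-up assumptions, tune to taste):

    # How many small VMs fit on one dual-socket 1U host.
    # Core/RAM figures are from the Xeon Platinum example above;
    # the VM size and oversubscription ratio are assumptions.
    cores = 28 * 2            # two 28-core sockets
    threads = cores * 2       # with hyper-threading
    ram_gb = 1536             # 1.5TB

    vcpus_per_vm = 2
    ram_per_vm_gb = 8
    cpu_oversub = 4           # vCPUs sold per hardware thread

    by_cpu = threads * cpu_oversub // vcpus_per_vm
    by_ram = ram_gb // ram_per_vm_gb
    print(f"CPU allows {by_cpu} VMs, RAM allows {by_ram} VMs -> {min(by_cpu, by_ram)} per 1U")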
A cloud provider then uses something like OpenStack and spins up hundreds of VMs across a small handful of machines.
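In practice that's a short script against the OpenStack API. Here's a minimal sketch using the openstacksdk library; the cloud name, image, flavor and network names are all placeholders:

    import openstack

    # Credentials come from clouds.yaml; "mycloud" is a placeholder name.
    conn = openstack.connect(cloud="mycloud")

    # Hypothetical image/flavor/network; the scheduler decides which
    # hypervisors the VMs actually land on.
    image = conn.compute.find_image("ubuntu-22.04")
    flavor = conn.compute.find_flavor("m1.small")
    network = conn.network.find_network("tenant-net")

    for i in range(100):
        conn.compute.create_server(
            name=f"vm-{i:03d}",
            image_id=image.id,
            flavor_id=flavor.id,
            networks=[{"uuid": network.id}],
        )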
Effectively, what happened is that machines got faster more quickly than applications could keep up.
In terms of Amazon/Azure being cheaper, that's completely false. It's cheaper to buy and run your own servers, 100% of the time. I pay $500/mo for my half rack; the effective cost would be $20k/mo on AWS. I have a friend who runs IT at a large company, and they're moving to AWS. He said, "We know it's a lot more expensive, but it's easier than hiring; we can't hire good IT people. And the cost is opex vs. capex."
I know another large company that's moving to the cloud, and cost is irrelevant; it's because "IT is too slow, and we want to move quicker." That's a theme. My business partner has worked with a handful of Fortune 500 companies that he helped move to the cloud, and the reason was the same every time. Never cost (it's cheaper to buy and support your own); it's that it's easier to get around IT. Execs love the idea of a recurring monthly cost versus a big budget spend one year, then three years of nothing before IT is asking for an upgrade.
If you aren't convinced, I'd say work out the math. I have tossed around the idea of cloud hosting, and the numbers are insanely lucrative. The only way to do it is density: threads, cores, disk. Disk density is easy; you can buy a few Nimble arrays and plug everything into them.
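If you want a feel for why the numbers look lucrative, here's the shape of the math. Every figure below is made up; the point is the ratio, not the exact prices.

    # Toy per-host revenue model for a small hosting setup.
    # All figures are assumptions for illustration only.
    vms_per_host = 150           # density figure with some headroom
    price_per_vm = 20.0          # USD/month for a small VM

    host_cost = 15000.0          # server amortized over 36 months
    host_monthly = host_cost / 36
    colo_share = 300.0           # USD/month share of rack, power, transit

    revenue = vms_per_host * price_per_vm
    cost = host_monthly + colo_share
    print(f"per host: ${revenue:,.0f}/mo revenue vs ${cost:,.0f}/mo cost")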
The way you thread this all together is with OpenStack and software-defined networking. You create pools of VMs that can migrate between two or three hosts, and that gives you the reliability. The whole concept is built with automation in mind: you can change routes, VLANs, anything in the switch with a script, and those scripts can spin up and move VMs. It's all automated and hands-off.
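For example, draining a host before maintenance is a handful of lines against the same API. A sketch assuming openstacksdk with admin credentials; the hypervisor name is a placeholder, and block_migration=True assumes local rather than shared storage:

    import openstack

    conn = openstack.connect(cloud="mycloud")   # placeholder cloud name

    # Live-migrate everything off one hypervisor so it can be patched or rebooted.
    for server in conn.compute.servers(all_projects=True, host="hv-03"):
        print(f"moving {server.name} off hv-03")
        # host=None lets the scheduler pick the destination.
        conn.compute.live_migrate_server(server, host=None, block_migration=True)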
There are some awesome YouTube videos out there from Facebook and AWS explaining how they build out data centers. This is exactly it: extremely dense, and automated. To get the density they're using quality enterprise gear, not cheap off-the-shelf stuff. It's commodity in the sense that they've moved from customized and expensive blades to 1U boxes, but it isn't the off-the-shelf ATX motherboards Google uses, which is probably what you're thinking of.
The tl;dr of this is: consider that 10 machines, virtualized, might move to a single hypervisor. The entire point of virtualization is fewer machines, because VMs can share resources. What used to be 10 cheap servers now runs on a single, not-as-cheap, dense server. Look at some of the gear where there are two or four nodes in a single chassis now. It's all about density, not rooms of cheap single-purpose pizza boxes.