Half a decade ago, Jonathan Heiliger compared the world of Internet data centers to Fight Club.
It was the spring of 2011, and the giants of the Internet—including Google, Amazon, and Microsoft—were erecting a new kind of data center. Their online empires had grown so large that they could no longer rely on typical hardware from the likes of Dell, HP, and IBM. They needed hardware that was cheaper, more streamlined, and more malleable. So, behind the scenes, they designed this hardware from scratch and had it manufactured through little-known companies in Asia.
This shadow hardware market was rarely discussed in public. Companies like Google saw their latest data center hardware as a competitive advantage best kept secret from rivals. But then Facebook tore off the veil. It open sourced its latest server and data center designs, freely sharing them with the world under the aegis of a new organization called the Open Compute Project. “It’s time to stop treating data center design like Fight Club and demystify the way these things are built,” said Heiliger, then the vice president of technical operations at Facebook.
With the Open Compute Project, Facebook aimed to create a whole community of companies that would freely share their data center designs, hoping to accelerate the evolution of Internet hardware and, thanks to the economies of scale, drive down the cost of this hardware. That, among other things, boosts the Facebook bottom line. It worked—in a very big way. Microsoft soon shared its designs too. Companies like HP and Quanta began selling this new breed of streamlined gear. And businesses as diverse as Rackspace and Goldman Sachs used this hardware to expand their own massive online operations. Even Apple—that bastion of secrecy—eventually joined the project.
Two big holdouts remained: Google and Amazon. But today, that number dropped to one. At the annual Open Compute Summit in San Jose, California, Google announced that it too has joined the project. And it’s already working with Facebook on a new piece of open source hardware.
The announcement reaffirms the power of Facebook’s big idea. Google was the first company to rethink data center design for the modern age. For years, its technology was well ahead of anyone else’s. And when Heiliger complained of that Fight Club mentality, Google was surely top of mind. But through the Open Compute Project, Facebook has pushed the rest of the industry forward. And market forces have pushed Google to share its secrets in new ways. “The community and Google have inched closer and closer together,” says Jason Taylor, Facebook’s vice president of infrastructure.
But Google’s move also points to other changes inside the big Internet players. That joint open source project from Google and Facebook relates to the rise of deep learning, an artificial intelligence technology that is rapidly reinventing so many parts of the modern world. Both companies see AI as a key part of their future, and both believe they’ll get there faster if they share and collaborate on some of the core technologies that drive these neural networks.
Happy to Share
Google’s Urs Hölzle—one of the company’s first employees and the engineer most responsible for what is probably the world’s largest and most advanced computer network—doesn’t see today’s announcement as a big change. He points out that Google has openly discussed its internal hardware designs in the past.
“I know that historically in the press, there has been a tendency to position it as Open Compute Project versus Google. But it has never been like that,” he says. “Over the past ten years, we have shared many, many things with the industry. This is the latest one.”
Indeed, Google revealed some of its server designs in 2009. And last year, it lifted the curtain on its seminal approach to computer networking hardware. But typically, Google reveals its designs only after it has moved on to something else. And it doesn’t open source its gear a la Facebook. But now Google and Facebook are actively working together on hardware they intend to open source, which highlights the ever-evolving priorities of both companies.
More Power to You
Together, Google and Facebook are developing a new data center server rack—an enclosure for massive numbers of computer servers. This new rack can deliver about four times more electrical power to all those machines, jumping from 12 volts to 48. As Hölzle points out, as we pack more and more hardware into smaller and smaller spaces, data center racks require more and more power, and this need has only increased with the rise of graphics processing units, or GPUs, inside the data center—a rise occasioned by the increasing importance of deep neural networks.
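The arithmetic behind that fourfold figure can be sketched as follows, assuming (the article does not say so) that the current carried by a rack's conductors stays roughly the same while the distribution voltage rises. Here $P$ is power, $V$ is voltage, $I$ is current, and $R$ is the resistance of the wiring; these symbols are illustrative and are not taken from the rack design itself.

```latex
\begin{align*}
% Electrical power is voltage times current.
P &= V I \\
% Same current, four times the voltage: four times the power.
\frac{P_{48}}{P_{12}} &= \frac{48\,\mathrm{V} \cdot I}{12\,\mathrm{V} \cdot I} = 4 \\
% Alternatively, for the same delivered power P, the current falls to a
% quarter, so resistive loss in the same wiring falls sixteenfold.
P_{\text{loss}} &= I^2 R = \left(\frac{P}{V}\right)^2 R, \qquad
\frac{P_{\text{loss},48}}{P_{\text{loss},12}} = \left(\frac{12}{48}\right)^2 = \frac{1}{16}
\end{align*}
```

Under that assumption, the same wiring carries about four times the power at 48 volts. Read the other way, a fixed load draws a quarter of the current, and resistive losses in the wiring fall to one sixteenth.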
GPUs were originally designed as a way of rendering images for games and other graphics-intensive applications. But as it turns out, they’re also well suited to running deep neural nets, the AI technology that now helps companies like Google identify images, recognize commands spoken into smartphones, target ads, generate search results, and so much more. “Power density is going up,” Hölzle says. “GPUs are something that accelerates this—or amplifies it.”
Today, at the Open Compute Summit, Facebook is also open sourcing the designs for the GPU-based system that drives its neural networks. Hölzle indicates that the new rack standard that Facebook and Google are working on could help drive this kind of system.
It’s not news that deep neural nets have become a crucial part of our largest Internet services. Like Google and Facebook, so many others are moving toward an infrastructure in which AI plays a central role. What’s interesting is that so many of these companies believe that the best path toward improving their AI includes openly sharing their tech with the larger world instead of keeping it secret. This past fall, Google also open sourced TensorFlow, the software engine that drives its neural networks.
No More Fight Club
Why, exactly, is Google doing all this? Part of it is that the academics who drive AI research believe that such sharing can accelerate research, and that true progress comes from widespread collaboration. That’s the main reason Google open sourced TensorFlow. But the company is also looking for some good will. The new server rack project is a good example.
Google already uses a similar rack inside its data centers. If Facebook and others adopt this as a standard, it can potentially drive down the cost of this hardware. Economies of scale and all that. “We all benefit from an ecosystem that agrees on at least a few things,” Hölzle says. But in the end, he plays down this effect. He indicates that the company is really just trying to help others out. And when it comes right down to it, that generates good will—a circle of mutual benefit.
In the past, Google was mostly a company that offered Internet services to consumers. But now it’s intent on transforming itself into a cloud computing company, inviting a world of businesses to build and run their software on its vast online infrastructure. That means Google is also interested in showing the world what it has built inside its data centers and in currying favor among the larger tech community.
In short: the rise of AI and cloud computing has put an end to Fight Club. The first rule of data centers is now: let’s talk about our data centers. A lot.