Meta (Formerly Facebook) is Building AI Supercomputer to Power Metaverse
Following Meta’s (formerly Facebook’s) October announcement that it is staking a claim on the metaverse, the company today revealed the AI Research SuperCluster (RSC), which it says is already among the fastest AI supercomputers in operation. When completed, Meta claims RSC will be the world’s fastest AI supercomputer; the company plans to finish the build by the middle of this year. According to Chief Executive Mark Zuckerberg, the current system comprises 6,080 graphics processing units housed in 760 Nvidia DGX A100 systems.
The experiences the company is developing for the metaverse require massive processing power, Zuckerberg said: quintillions of operations per second. RSC will enable future AI models to learn from billions of examples and work across hundreds of languages, among other capabilities. That computing capacity is comparable to the Perlmutter supercomputer, which uses more than 6,000 of the same Nvidia GPUs and currently ranks as the world’s fifth-fastest supercomputer. In a second phase later this year, Meta intends to boost RSC’s performance by a factor of 2.5 by expanding the system to 16,000 GPUs.
Meta will employ RSC for a variety of research projects requiring next-generation performance, such as “multimodal” artificial intelligence that draws conclusions from a combination of sound, images, and actions rather than a single type of input data. That capability could help with one of Facebook’s thorniest problems: identifying harmful content.
Meta, a leading artificial intelligence researcher, hopes the investment pays off by using RSC to help build the company’s current priority: the virtual environment dubbed the metaverse. RSC may be powerful enough to interpret speech simultaneously for a large group of people, each speaking a different language.
When it comes to one of the most common applications of artificial intelligence, teaching a system to recognize what’s in a picture, RSC is nearly 20 times faster than Meta’s previous machine, built on 2017-era Nvidia hardware, researchers Kevin Lee and Shubho Sengupta said in a blog post. It is almost three times faster at deciphering human speech. Today, the phrase “artificial intelligence” usually refers to a technique known as machine learning or deep learning, which processes data in a manner loosely inspired by the human brain. It’s groundbreaking because AI models are trained on real-world data rather than explicitly programmed.
For instance, an AI model can learn what cats look like by studying hundreds of cat photographs, in contrast to conventional programming, in which a developer would have to describe the full range of feline fur, whiskers, eyes, and ears. RSC may also help with a particularly difficult AI challenge Meta refers to as self-supervised learning. Today, most AI models are trained on carefully labeled data: stop signs are annotated in the photographs used to train AI for autonomous cars, and audio used to train speech recognition AI is accompanied by a transcript.
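The labeled-data setup described above can be sketched in a few lines. This is a hypothetical toy, not Meta’s pipeline: every training example arrives as a (features, label) pair where the label is a human annotation, and a trivial nearest-centroid classifier learns from those pairs.

```python
# Toy supervised learning: each example is paired with a human-provided label.
# Names and data here are illustrative only.

def train_centroids(labeled_examples):
    """Average the feature vectors seen for each label (nearest-centroid)."""
    sums, counts = {}, {}
    for features, label in labeled_examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc]
            for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the input features."""
    def dist2(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], features))
    return min(centroids, key=dist2)

# The second element of each pair is the human annotation.
data = [([0.9, 0.8], "cat"), ([1.0, 0.7], "cat"),
        ([0.1, 0.2], "stop_sign"), ([0.0, 0.3], "stop_sign")]
model = train_centroids(data)
print(predict(model, [0.95, 0.75]))  # → cat
```

The key point is structural: the quality of the model is bounded by the quality and quantity of the human labels, which is exactly the bottleneck self-supervised learning tries to remove.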
Self-supervised training, on the other hand, is more challenging because it uses raw, unlabeled data. So far, it is an area where humans retain an advantage over computers. Meta and other proponents of artificial intelligence have shown that training AI models on ever-larger data sets produces better results. Training AI models requires far more computing power than running them, which is why an iPhone can unlock via face recognition without connecting to a data center full of servers. Supercomputer designers tailor their machines by balancing memory, GPU performance, CPU performance, power consumption, and internal data paths.
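The self-supervised idea can be illustrated with a toy example (hypothetical, far simpler than anything Meta runs): no human writes labels; instead, the training signal is manufactured from the raw data itself. Here each word in unlabeled text is “labeled” by the word that follows it, a miniature version of next-token prediction.

```python
# Toy self-supervised setup: the (context, target) pairs come from the raw
# data itself, not from human annotators. Illustrative example only.
from collections import Counter, defaultdict

def make_pairs(raw_text):
    """Turn unlabeled text into (context, target) pairs automatically."""
    words = raw_text.split()
    return list(zip(words, words[1:]))

def train_bigram(pairs):
    """Record which target most often follows each context word."""
    table = defaultdict(Counter)
    for context, target in pairs:
        table[context][target] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in table.items()}

corpus = "the cat sat on the mat the cat ate"
pairs = make_pairs(corpus)   # the "labels" were derived, not hand-written
model = train_bigram(pairs)
print(model["the"])  # → cat
```

Because the labels are free, this style of training scales to whatever raw data is available, which is why it pairs naturally with a machine like RSC that can digest enormous unlabeled datasets.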
Today’s AI workloads are dominated by the GPU, a type of processor originally built to accelerate graphics but now employed for a wide range of other computing tasks. Pure Storage, a data storage provider, and Nvidia, a chipmaker, both supplied components of Meta’s supercluster. Nvidia, in particular, has been a major proponent of the metaverse, billing its Omniverse product as the “metaverse for engineers.”
Nvidia’s cutting-edge A100 processors are optimized for AI and other high-performance data center workloads. Google and a slew of other firms are developing specialized AI processors, some of them the largest chips ever made. Meta chose the relatively flexible A100 because, paired with PyTorch, the open source AI framework Meta created, the company believes it provides the most productive environment for developers.
Rob Lee, CTO of Pure Storage, told VentureBeat via email that RSC is relevant to firms beyond Meta because the technologies underlying the metaverse, such as AI and augmented/virtual reality, are broadly applicable and in demand across industries. Technical decision makers are always on the lookout for cutting-edge practitioners to learn from, Lee said, and RSC offers strong validation of the basic components that run the world’s biggest AI supercomputer.
“Meta’s world-class team recognized the benefits of coupling Pure Storage’s performance, density, and simplicity with the Nvidia GPUs used to power this ground-breaking work pushing the frontiers of performance and scalability,” Lee said. He noted that businesses of all sizes can benefit from Meta’s efforts, knowledge, and lessons learned as they expand their own data, analytics, and artificial intelligence initiatives. Meta has faced repeated backlash in recent years over its privacy and data practices, with the Federal Trade Commission (FTC) saying in 2018 that it was examining serious concerns about Facebook’s privacy practices.
Meta says it is committed to addressing security and privacy concerns from the start, noting that RSC was architected with privacy and security in mind. This, Meta asserts, will allow its researchers to train models securely on encrypted user-generated data that is not decrypted until just before training. According to Meta, data must pass a privacy review to ensure it has been properly anonymized before being imported into RSC. Additionally, the company states that data is encrypted before it is used to train AI models, and that decryption keys are regularly deleted so that older data is no longer accessible.
Nvidia supplied the computational layer of the supercomputer, including the Nvidia DGX A100 systems that serve as its compute nodes. The GPUs communicate over an Nvidia Quantum 200 Gbps InfiniBand fabric in a two-level Clos topology. Lee emphasized that Penguin Computing’s hardware and software contributions serve as the “glue” binding the Nvidia and Pure Storage components together; the three partners collaborated extensively to deliver Meta a massive supercomputing solution. Raja Koduri, vice president of Intel’s accelerated computing systems and graphics group, said in December 2021 that today’s computational infrastructure would need to improve 1,000-fold to support the metaverse.
“You need access to petaflops [1,000 teraflops] of computation in less than a millisecond, or less than 10 milliseconds for real-time applications,” Koduri told Quartz at the time. The metaverse, sometimes described as the next generation of the internet, is a virtual arena where people can work, play, and socialize, often through virtual reality (VR) and augmented reality (AR) technologies.