Microsoft Build 2020 | World Top-5 Supercomputer With OpenAI, Microsoft Turing Models, Responsible ML Capabilities in AI Systems

Published in

SyncedReview

6 min readMay 20, 2020

Instead of completely cancelling as Google did with Google I/O 2020, Microsoft decided to shift its annual developers conference Microsoft Build online this year.

The 48-hour digital event kicked off yesterday, and the company wasted no time making impactful announcements that included a new supercomputer, a family of large AI models, and a Responsible ML on Microsoft Azure initiative.

“We’re living through extraordinary times. While it’s hard not to be together in person, I’m comforted by this community being gathered here virtually,” Microsoft CEO Satya Nadella said in his opening keynote.

World’s Top Supercomputer

Microsoft announced that it has built one of the top-five publicly disclosed supercomputers in the world, making new infrastructure available in Azure to train extremely large artificial intelligence models. Built in collaboration with and exclusively for OpenAI, the supercomputer hosted in Azure was designed specifically to train AI models for OpenAI. It represents a milestone in a partnership announced last year to jointly create new supercomputing technologies in Azure.

The current top five supercomputers are Oak Ridge National Laboratory’s Summit, Lawrence Livermore National Laboratory’s Sierra, Sunway TaihuLight of China’s National Supercomputing Center in Wuxi, Tianhe-2 of China’s National University of Defense Technology, and Frontera at the University of Texas.

The new supercomputer developed for OpenAI is a single system with more than 285,000 CPU cores, 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server. It ranks in the top five compared with other machines listed on the TOP500 supercomputers in the world, Microsoft says.

The Azure-hosted supercomputer benefits from all the capabilities of a robust modern cloud infrastructure including rapid deployment, sustainable data centers and access to Azure services.

This is a first step toward realizing the next generation of very large AI models and the infrastructure needed to train them available as a platform for other organizations and developers to build upon, Microsoft says in an AI blog post detailing the new supercomputer.

The Microsoft Turing Models

Training massive AI models requires not only advanced supercomputing infrastructure or clusters of SOTA hardware connected by high-bandwidth networks, but also tools to train the models across these interconnected computers.

As part of a companywide AI at Scale initiative, Microsoft has also developed its own family of large AI models — the Microsoft Turing models — used to improve many different language understanding tasks across Bing, Office, Dynamics and other productivity products.

These AI models, the company explains, can learn about language by examining billions of pages of publicly available documents on the internet including Wikipedia entries, self-published books, instruction manuals, history lessons, and human resources guidelines in a self-supervised learning manner. They therefore no longer rely on meticulously labelled human-generated data to teach AI systems to recognize an object or determine whether the answer to a question makes sense.

Microsoft says that through the AI at Scale initiative it wants to make large AI models, training optimization tools, and supercomputing resources available through Azure AI services and GitHub — so developers, data scientists, and business customers can all leverage the power of AI at Scale.

Earlier this year, Microsoft released to researchers the largest publicly available AI language model in the world — the Microsoft Turing model for natural language generation.

At the virtual Build, Microsoft announced that it will soon begin open-sourcing the Microsoft Turing models as well as recipes for training them in Azure Machine Learning. This will include making the powerful language models that the company has used to improve language understanding across its products available for developers.

Microsoft also unveiled a new version of DeepSpeed, an open source deep learning library for PyTorch that reduces the amount of computing power needed for large distributed model training. This update is significantly more efficient than the version released three months ago, enabling the training of models 15 times larger and 10 times faster than without DeepSpeed on the same infrastructure.

Responsible Machine Learning

Over the past several years, machine learning has moved out of research labs and into the mainstream, and has transformed from a niche discipline for data scientists to one where all developers are expected to be able to participate, noted Eric Boyd, corporate vice president of Microsoft Azure AI, in a blog post.

Microsoft built Azure Machine Learning to enable developers across the spectrum of data science expertise to build and deploy AI systems. Boyd noted that developers today are increasingly interested in building AI systems that are easy to explain and comply with non-discrimination and privacy regulations.

To navigate these hurdles, Microsoft announced innovations in its Responsible ML initiative to enable developers to better understand, protect and control their models throughout the machine learning lifecycle. These capabilities can be accessed through Azure Machine Learning and are also available in open source on GitHub.

In addition, Microsoft said the Fairlearn toolkit, which includes capabilities to assess and improve the fairness of AI systems, will be integrated with Azure Machine Learning in June.

The company also announced that WhiteNoise — a toolkit for differential privacy — is now open-sourced on GitHub for user experimentation, and can also be accessed through Azure Machine Learning. The differential privacy capabilities were developed in collaboration with researchers at the Harvard Institute for Quantitative Social Science and School of Engineering.

Azure Machine Learning now also features built-in controls that enable developers to track and automate their model building, training and deployment process. This capability, aka machine learning and operations (MLOps), provides an audit trail to help organizations meet regulatory and compliance requirements.

Sarah Bird, Microsoft’s Responsible AI lead at Azure AI, said she believes machine learning is changing the world for the better, and the Microsoft team wants to make the tools and resources the developers need available for them to build models in ways that centre around responsibility.

Journalist: Yuan Yuan | Editor: Michael Sarazen

We know you don’t want to miss any story. Subscribe to our popular Synced Global AI Weekly to get weekly AI updates.

Thinking of contributing to Synced Review? Synced’s new column Share My Research welcomes scholars to share their own research breakthroughs with global AI enthusiasts.

Need a comprehensive review of the past, present and future of modern AI research development? Trends of AI Technology Development Report is out!

2018 Fortune Global 500 Public Company AI Adaptivity Report is out!
Purchase a Kindle-formatted report on Amazon.
Apply for Insight Partner Program to get a complimentary full PDF report.

Microsoft Build 2020 | World Top-5 Supercomputer With OpenAI, Microsoft Turing Models, Responsible ML Capabilities in AI Systems

World’s Top Supercomputer

The Microsoft Turing Models

Responsible Machine Learning

Written by Synced