MTIA v1: Meta’s first-generation AI inference accelerator

May 18, 6:39 PM

In 2020, we initiated the Meta Training and Inference Accelerator (MTIA) family of chips to support our evolving AI workloads, starting with an inference accelerator ASIC for deep learning recommendation models (DLRMs).

The MTIA software (SW) stack aims to provide developer efficiency and high performance. It integrates fully with PyTorch, providing a familiar developer experience. Using PyTorch with MTIA is as easy as using PyTorch for CPUs or GPUs. The MTIA SW stack benefits from the flourishing PyTorch developer ecosystem and tooling. The compiler performs model-level transformations and optimizations using PyTorch FX IR and low-level optimizations using LLVM IR, with extensions to support the custom architecture and ISA of the MTIA accelerator.
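
To make the idea of a model-level transformation concrete, here is a minimal sketch of an operator-fusion pass over a toy graph IR. The `Node` class, the op names, and `fuse_linear_relu` are hypothetical and exist only for illustration. This is neither the PyTorch FX API nor the MTIA compiler, only an example of the kind of rewrite a graph-level compiler might perform before handing work to lower-level code generation.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One operation in a toy, single-chain graph IR (hypothetical)."""
    name: str      # name of the value this node produces
    op: str        # operator, e.g. "linear" or "relu"
    inputs: tuple  # names of the values this node consumes


def fuse_linear_relu(nodes):
    """Replace each `linear` immediately followed by a `relu` on its output
    with one `linear_relu` node.

    Assumes the linear output has no other consumer, which holds for the
    single-chain graphs used here.
    """
    fused = []
    i = 0
    while i < len(nodes):
        cur = nodes[i]
        nxt = nodes[i + 1] if i + 1 < len(nodes) else None
        if (
            cur.op == "linear"
            and nxt is not None
            and nxt.op == "relu"
            and nxt.inputs == (cur.name,)
        ):
            # The fused node keeps the relu's output name and the linear's inputs.
            fused.append(Node(nxt.name, "linear_relu", cur.inputs))
            i += 2
        else:
            fused.append(cur)
            i += 1
    return fused


graph = [
    Node("h1", "linear", ("x",)),
    Node("a1", "relu", ("h1",)),
    Node("h2", "linear", ("a1",)),
    Node("out", "sigmoid", ("h2",)),
]
optimized = fuse_linear_relu(graph)
print([n.op for n in optimized])  # ['linear_relu', 'linear', 'sigmoid']
```

Fusing the two nodes means the intermediate value `h1` never needs to be written out and read back, which is the usual motivation for this class of optimization.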

The PyTorch runtime for MTIA manages on-device execution and provides features such as MTIA tensors, memory management, and the APIs for scheduling operators on the accelerator. The runtime and firmware handle communication with the accelerator device. The SW stack supports different execution modes, such as eager mode and graph mode, and allows workloads to be partitioned across multiple accelerator cards. When a workload is partitioned this way, the SW stack also provides the necessary synchronization and communication between the accelerator boards.
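
The difference between eager mode and graph mode can be sketched in plain Python. This is a conceptual illustration only: the `Graph` class and the helper functions `add`, `scale`, and `run_eager` are hypothetical and do not reflect the MTIA runtime's actual APIs. In eager mode each operator runs as soon as it is called; in graph mode the operators are recorded first and executed later as a whole, which gives a compiler the chance to inspect and optimize the full sequence.

```python
def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def scale(a, k):
    """Multiply every element of a vector by the scalar k."""
    return [x * k for x in a]


def run_eager(x, y):
    """Eager mode: each operator executes immediately, one after another."""
    s = add(x, y)
    return scale(s, 2.0)


class Graph:
    """Graph mode: record operators first, execute them later in order."""

    def __init__(self):
        self.steps = []  # list of (output_name, function, arguments)

    def call(self, out, fn, *args):
        """Record `out = fn(*args)`. String arguments name earlier values."""
        self.steps.append((out, fn, args))
        return out

    def run(self, **inputs):
        """Execute all recorded steps and return the last value produced."""
        env = dict(inputs)
        for out, fn, args in self.steps:
            resolved = [env[a] if isinstance(a, str) else a for a in args]
            env[out] = fn(*resolved)
        return env[self.steps[-1][0]]


# Build the graph once; nothing is computed at this point.
g = Graph()
s = g.call("s", add, "x", "y")
g.call("out", scale, s, 2.0)

eager_result = run_eager([1.0, 2.0], [3.0, 4.0])
graph_result = g.run(x=[1.0, 2.0], y=[3.0, 4.0])
print(eager_result, graph_result)  # [8.0, 12.0] [8.0, 12.0]
```

Both paths produce the same values; the recorded graph can additionally be re-run on new inputs without re-tracing the Python code.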

The MTIA software stack.

There are multiple ways to author compute kernels that run on the accelerator: PyTorch, C/C++ (for hand-tuned, highly optimized kernels), and a new domain-specific language called KNYFE. KNYFE takes a short, high-level description of an ML operator as input and generates optimized, low-level C++ kernel code that implements this operator for MTIA.

Low-level code generation and optimizations leverage the open source LLVM compiler toolchain with MTIA extensions. The LLVM compiler then takes care of the next level of optimization and code generation to produce efficient executables that run on the processor cores within the PEs.

As part of the SW stack, we have also developed a library of hand-tuned, highly optimized kernels for performance-critical ML operators, such as fully connected and embedding-bag operators. The higher levels of the SW stack can instantiate and use these kernels during compilation and code generation.
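
For readers unfamiliar with the two operators named above, the following is a plain-Python reference sketch of what they compute. It is not MTIA kernel code and makes no attempt at optimization. The function names are illustrative, the pooling mode is assumed to be a sum, and the `indices`/`offsets` layout mirrors the convention used by `torch.nn.EmbeddingBag`, where each offset marks the start of a bag.

```python
def embedding_bag_sum(table, indices, offsets):
    """Sum-pool rows of `table` for each bag.

    Bag i covers indices[offsets[i]:offsets[i + 1]]; the last bag runs to
    the end of `indices`.
    """
    dim = len(table[0])
    bags = []
    for b, start in enumerate(offsets):
        end = offsets[b + 1] if b + 1 < len(offsets) else len(indices)
        pooled = [0.0] * dim
        for idx in indices[start:end]:
            row = table[idx]
            for d in range(dim):
                pooled[d] += row[d]
        bags.append(pooled)
    return bags


def fully_connected(x, weight, bias):
    """Compute y = weight @ x + bias for a single input vector.

    `weight` holds one row per output feature.
    """
    return [
        sum(xi * wi for xi, wi in zip(x, row)) + b
        for row, b in zip(weight, bias)
    ]


# A 4-row embedding table with 2-dimensional embeddings.
table = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [0.5, 0.5]]
indices = [0, 2, 1, 3, 3]  # lookups for all bags, concatenated
offsets = [0, 2]           # bag 0 = rows 0 and 2; bag 1 = rows 1, 3, 3

pooled = embedding_bag_sum(table, indices, offsets)
print(pooled)  # [[4.0, 3.0], [1.0, 3.0]]

dense = fully_connected(pooled[0], weight=[[1.0, 1.0], [2.0, 0.0]], bias=[0.5, -1.0])
print(dense)  # [7.5, 7.0]
```

The embedding-bag lookup is dominated by irregular memory reads, while the fully connected layer is dominated by multiply-accumulate arithmetic, which is one reason recommendation models stress an accelerator in two different ways.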

The MTIA SW stack continues to evolve through integration with PyTorch 2.0, which is faster and more Pythonic, yet as dynamic as ever. This integration will enable new features such as TorchDynamo and TorchInductor. We are also extending the Triton DSL to support MTIA accelerators and using MLIR for internal representations and advanced optimizations.
