TU Wien CAIML

Jingren Zhou: “Experience and Challenges in Building Large Language Models at Alibaba”

Jingren Zhou shares his experience of developing a large language model (LLM) in industry.

March 14th 2024

  • 12:00 – 13:00 CET
  • FAV Hörsaal 3, Zemanek
  • 1040 Vienna, Favoritenstraße 9-11
    Ground Floor, Room HH EG 01

On March 14, 2024, the seminar with Jingren Zhou will take place.

Abstract

In this talk, I will share our experience and lessons learned during the development of Tongyi Qianwen, also known as Qwen, a state-of-the-art large language model at Alibaba Cloud. I will first outline the key steps taken to construct a high-performing model with the ability to generate creative text, comprehend intricate instructions, tackle mathematical problems, and more. Subsequently, I will describe a variety of systems challenges in building large language models and present our innovative design in the areas of distributed storage, high-performance networking, resource scheduling, and execution frameworks. Such techniques significantly enhance the efficiency of handling complex AI workloads in a distributed environment.

About the Speaker

Jingren Zhou is the Chief Technology Officer at Alibaba Cloud, where he plays a pivotal role in driving technology innovation and product development. His responsibilities also include leading the development of AI foundational models and their applications in various business scenarios at Alibaba Cloud. Before taking on this role, he led the development of advanced techniques for personalized search, product recommendation, and advertising at Alibaba’s e-commerce platform and Alipay’s online payment platform. Prior to his time at Alibaba, he was a veteran at Microsoft, where he led Big Data and AI research and development. His research interests span cloud computing, databases, and large-scale machine learning systems. He holds a B.S. in Computer Science from the University of Science and Technology of China, and a Ph.D. in Computer Science from Columbia University. He is a Fellow of IEEE.

Supporter

This talk is supported by ZIF (Zentrum für Informatikforschung).