Generalist Agent AI (GAA) is a family of systems that generate effective actions in a given environment based on an understanding of multimodal sensory input. With the advent of large foundation models, numerous GAA systems have been proposed in fields ranging from basic research to applications. While these research areas are growing rapidly by integrating with the traditional technologies of each domain, they share common interests such as data collection, benchmarking, and ethical perspectives. In this tutorial, we focus on some representative research areas of Embodied GAA, including embodied multimodality, robotics, gaming (VR/AR/MR), and healthcare, and we aim to provide comprehensive knowledge of the common concerns discussed in these fields. As a result, we expect participants to learn the fundamentals of GAA and gain insights to further advance their research. Specific learning outcomes include:
The tutorial is led by experts from academia and industry. We expect it to be an interactive and enriching experience, combining lectures, case studies, and Q&A sessions that keep the material comprehensive and engaging for all participants.
Time Slot | Session / Speaker | Talk Title | Tutorial Materials |
---|---|---|---|
08:30 - 08:40 | Jianfeng Gao | Opening Remarks | Slides |
08:40 - 09:30 | Talk 1: Juan Carlos Niebles | Language-based AI Agents and Large Action Models (LAMs) | Slides |
09:30 - 09:50 | Coffee Break | | |
09:50 - 10:40 | Talk 2: Yong Jae Lee | Generalist Multimodal Models | Slides |
10:40 - 11:30 | Talk 3: Katsushi Ikeuchi | Agent Robotics: Learning-from-observation | Slides |
11:30 - 11:40 | Naoki Wake | Ending Remarks | Slides |