Building AI systems that learn from multiple sensory inputs, such as text, speech, video, real-world sensors, wearable devices, and medical data, holds great promise for impact across many scientific areas, with practical benefits including supporting human health and well-being, enabling multimedia content processing, and enhancing real-world autonomous agents. This workshop will cover practical methods for designing and deploying AI systems that can sense, learn, and interact with the real world through many sensory modalities, spanning a spectrum of methods from lightweight, efficient sensing tools to modern large-scale multimodal foundation models. The workshop will also explore applications of next-generation AI for extreme sensing, revealing hidden visual information beyond human perception, such as seeing around corners and turning everyday objects into cameras.
Faculty
Ramesh Raskar
Camera Culture, Associate Director and Associate Professor at MIT Media Lab
Paul Pu Liang
Multisensory Intelligence, Assistant Professor at MIT Media Lab and MIT EECS
Speakers
Akshat Dave
Camera Culture, Postdoctoral Associate at MIT Media Lab
Nikhil Behari
Camera Culture, Graduate Student at MIT Media Lab
Tzofi Klinghoffer
Camera Culture, Graduate Student at MIT Media Lab
Multisensory AI for Extreme Sensing
October 10, 2024, 1:00 PM - 2:30 PM, Room E15-359
Part 1 (45 min): Foundations of Multi-Sensory AI
Paul Liang
Introduction to Multi-Sensory AI
Core Challenges and Solutions
Latest Advances and Open Problems
Q&A and Open Discussion
Part 2 (45 min): Applications of Multi-Sensory AI for Extreme Sensing
Computer Vision at the Speed of Light for AV and AR
Akshat Dave
What’s in an Image? Unlocking Hidden Cues in Imaging
Nikhil Behari
AI-Driven Design of Vision Systems
Tzofi Klinghoffer