
Building AI systems that learn from multiple sensory inputs, such as text, speech, video, real-world sensors, wearable devices, and medical data, holds great promise for scientific and practical impact: supporting human health and well-being, enabling multimedia content processing, and enhancing real-world autonomous agents. This workshop will cover practical methods for designing and deploying AI systems that sense, learn, and interact with the real world through many sensory modalities, spanning a spectrum from lightweight, efficient sensing tools to modern large-scale multimodal foundation models. The workshop will also explore applications of next-generation AI for extreme sensing that reveal visual information beyond human perception, such as seeing around corners and turning everyday objects into cameras.
Faculty

Ramesh Raskar
Camera Culture
Associate Director and Associate Professor at MIT Media Lab

Paul Pu Liang
Multisensory Intelligence
Assistant Professor at MIT Media Lab and MIT EECS
Speakers

Siddharth Somasundaram
Camera Culture
Graduate Student at MIT Media Lab

Nikhil Behari
Camera Culture
Graduate Student at MIT Media Lab

Tzofi Klinghoffer
Camera Culture
Graduate Student at MIT Media Lab
Multimodal AI for Extreme Sensing
October 23, 2025 (9:30 AM - 11:00 AM and 11:30 AM - 1:00 PM)
MIT Media Lab E14-240
Part 1 (45 min): Foundations of Multi-Sensory AI
Paul Liang
Introduction to Multimodal AI
AI for Tactile Sensing
AI for Olfaction
AI Reasoning
Q&A and Open Discussion
Part 2 (45 min): Applications of Multimodal AI for Extreme Sensing
Seeing Around Corners with Consumer Single-Photon Cameras
Siddharth Somasundaram
What’s in a Picture? Unlocking Hidden Cues in Imaging
Nikhil Behari
AI-Driven Design of Vision Systems
Tzofi Klinghoffer