Head of Research · Moments Lab
Unraveling the science of video.
Large video collections are difficult to search because evidence is distributed across time, modalities, and incomplete metadata. I design indexing, retrieval, and reasoning pipelines that connect queries to relevant video segments and grounded outputs. The goal is to make broadcast and archival video usable for search, analysis, and decision support.
Research focus
Video understanding
Long-form video carries meaning across time, modalities, and context. I build shot detection, chaptering, and multimodal representation pipelines that turn raw footage into structured, machine-understandable units.
Video retrieval
Video queries combine language, vision, and time, which makes evidence retrieval ambiguous and expensive. I design multimodal retrieval systems that ground reasoning in relevant segments rather than coarse video summaries.
AI fairness
Models inherit and amplify the biases present in their training data. I study how disability, minority, and representation gaps propagate through vision and language systems, and how to evaluate and mitigate them.
Selected work
Representative outputs on multimodal retrieval, reasoning, and indexing for video data. The full curated list is on the Research page.
- Addresses how to ground answers in large video libraries by combining retrieval over video segments with generation conditioned on retrieved evidence.
- Shows that video reasoning results depend strongly on frame sampling and provides a benchmark for evaluating small vision-language models under controlled settings.
- Studies how to segment long-form news video into retrievable chapters and builds a multimodal chaptering pipeline for downstream search.
- Builds an industrial method for shot-level video indexing so large collections can be searched at scale.
Bio
I am Head of Research at Moments Lab, where I lead research on multimodal retrieval and reasoning over large-scale video data. I build video indexing systems, retrieval pipelines, and evaluation methods for long-form broadcast and archival collections.
I completed a PhD at Institut Polytechnique de Paris (advised by Jérôme Boudy and Gérard Chollet) on multimodal speaker diarization with work on robustness and fairness in real-world conditions.
Outside of Moments Lab, I co-founded VocaCoach, a speech training platform (VivaTech award, covered by Le Parisien). I also built UpToCure, an AI-powered rare disease research platform, and serve on the board of Universal Wings Mobility (accessible travel AI). I contributed to DiverseSpectrum, an open-source dataset for minority representation in AI.
Press & Talks
Selected talks and press coverage.
- How AI Video Understanding is Rewriting the Rules of Media Production and Discovery
- Generation AI (FOST)