publications
Here is a list of my main scientific publications. A more exhaustive list can be found on my Google Scholar profile.
2024
- [Multimodal RAG] Towards Retrieval Augmented Generation over Large Video Libraries. In Proceedings of HSI 2024, 2024
- [AI Fairness] Disability Representations: Finding Biases in Automatic Image Generation. In Workshop AVA: Accessibility, Vision, and Autonomy Meet, 2024
- [Video understanding] Multimodal Chaptering for Long-Form TV Newscast Video. 2024
- [VLM] Inserting Faces inside Captions: Image Captioning with Attention Guided Merging. arXiv preprint arXiv:2405.02305, 2024
- [Edge AI] Privacy Preserving Personal Assistant with On-Device Diarization and Spoken Dialogue System for Home and Beyond. Proceedings of IHIET 2024, 2024
2023
- [Speaker Diarization] Diarisation multimodale: vers des modèles robustes et justes en contexte réel. Institut Polytechnique de Paris, 2023
- [Speaker Diarization] Détection d’activité vocale Multi-flux pour la Diarisation du locuteur. Proceedings of GRETSI 2023, 2023
- [Speaker Diarization] Home monitoring for frailty detection through sound and speaker diarization analysis. In JETSAN 2023, 2023
- [Fairness] Towards measuring and scoring speaker diarization fairness. arXiv preprint arXiv:2302.09991, 2023
2022
- [Speech Processing] Multi-stream voice activity detection for robust speaker diarization. In GDR ISIS 2022: Information, Signal, Image et ViSion: Traitement du signal pour la voix, 2022
- [Speech Processing] The Newsbridge-Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description. VoxCeleb Speaker Recognition Challenge 2022 Track 4, 2022