Shravan Venkatraman

I am an M.Sc. computer vision student at MBZUAI. I am fortunate to be part of the Intelligent Visual Analytics Lab, where I am advised by Dr. Fahad Khan and Dr. Salman Khan. I completed my undergraduate degree in computer science at VIT University, advised by Dr. Joe Dhanith and Dr. Pandiyaraju, and have received the Sir C. V. Raman Award from VIT Chennai three times.

My research focuses on developing self-evolving large multimodal models for generalizable multimodal intelligence, within the broader context of multimodal representation learning for reasoning. I also work on unified large-scale models for image understanding and generation, and (on the side) I'm interested in geometry-aware representations and neural rendering for computer graphics.

I am currently a machine learning research intern at Aalto University in the Probabilistic ML and Generative AI Lab, advised by Dr. Arno Solin. Prior to this, I was a research intern at Nagasaki University in the Pattern Recognition and Machine Learning Lab, advised by Dr. Muthu Subash Kavitha. In the summer of 2024, I interned at MedxAI under the mentorship of Dr. Susan Elias and Dr. Sheena Pravin.

I'm always happy to chat about research, collaborations, or startups — feel free to send me an email! :)

Email CV Scholar GitHub

News

Jun 2026 – VISE (Visual Invariance Self-Evolution) has been accepted to ECCV 2026, and our preprint is up on arXiv! See you in Sweden 🇸🇪
Jun 2026 – Our paper, Ask, Solve, Generate, a self-evolving framework for unified image understanding and generation, is now on arXiv!
Jun 2026 – Excited to start my research internship at Aalto University in the Probabilistic ML and Generative AI Lab, working with Dr. Arno Solin, as part of the AScI Program! 🇫🇮
Apr 28, 2026 – Honored to have been elected President of the MBZUAI Student Council for AY 2026–27!
Apr 07, 2026 – Three papers (PCM-NeRF, TIDE, NTRM [Oral]) have been accepted to CVPR'26 Workshops! 🎉
Jan 13, 2026 – I've been selected for the MBZUAI ML Winter School 2026 on Representation Learning & GenAI!
Dec 16, 2025 – Grateful to have received the Sir C. V. Raman Award from VIT Chennai for the third time (post-graduation, too) for my work on SAG-ViT.
Nov 27, 2025 – SPROUT has been accepted to Neurocomputing!
Nov 21, 2025 – Our paper on EvoLMM, a purely self-evolving framework for LMMs, is now on arXiv!
Oct 27, 2025 – RG-ViT has been accepted to Computers in Biology and Medicine!
Sep 18, 2025 – I'm honored to serve as Student Representative for the Computer Vision department at MBZUAI!
Aug 10, 2025 – I'm excited to start my M.Sc. in Computer Vision at MBZUAI!
Jul 27, 2025 – SAG-ViT has been accepted to Complex and Intelligent Systems!
Jul 14, 2025 – UGPL has been accepted to ICCV'25 Workshops: CVAMD! Paper and code are available!
Apr 17, 2025 – I have successfully defended my bachelor's thesis (titled: Making NeRF See Structure, Not Just Light) at VIT Chennai!
Apr 07, 2025 – Honored to receive the Sir C. V. Raman Award from VIT Chennai for the second time in recognition of my research!
Feb 28, 2025 – FUSION has been accepted to CVPR'25 Workshops: NTIRE!
Feb 28, 2025 – We showcased and presented CerviLens at IInvenTiv'25 @IIT Madras, representing MedxAI Innovations!
Jan 25, 2025 – I am honored to have been admitted to the M.Sc. Computer Vision program at MBZUAI!
Dec 12, 2024 – Proud to have been selected as a recipient of the Sir C. V. Raman Award by VIT Chennai for my research!
Jun 23, 2024 – I presented our paper on attention-fused deep CNNs at ICRAS 2024 in Tokyo, Japan!
Selected Publications

VISE: Paying More Attention to Visual Tokens in Self-Evolving Large Multimodal Models
Shravan Venkatraman, Ritesh Thawkar, Omkar Thawakar, Rao Muhammad Anwer, Hisham Cholakkal, Salman Khan, Fahad Shahbaz Khan
European Conference on Computer Vision (ECCV) 2026
Geometric and semantic invariance rewards strengthen visual conditioning in self-evolving multimodal models, with no labels or external rewards.

Ask, Solve, Generate: Self-Evolving Unified Multimodal Understanding and Generation via Self-Consistency Rewards
Ritesh Thawkar, Shravan Venkatraman, Omkar Thawakar, Abdelrahman M Shaker, Fahad Shahbaz Khan, Hisham Cholakkal, Salman Khan, Rao Muhammad Anwer
arXiv
A unified multimodal model that self-improves both image understanding and generation from unlabeled images, using only self-consistency rewards.

EvoLMM: Self-Evolving Large Multimodal Models with Continuous Rewards
Omkar Thawakar*, Shravan Venkatraman*, Ritesh Thawkar*, Abdelrahman M Shaker, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Fahad Shahbaz Khan
arXiv
EvoLMM is a fully unsupervised self-evolving framework for large multimodal models (LMMs). It improves visual reasoning from raw images alone by coupling a Proposer and a Solver, both trained with continuous self-consistency rewards.

Teaching

BCSE332P - Deep Learning Lab (Fall 2024)


Last updated April 2026.