CVPR 2026 Workshop

WORKSHOP ON ANY-TO-ANY MULTIMODAL LEARNING

Call for Papers

Important Dates

We will follow the review-process dates suggested by CVPR 2026.

  • Mar 01, 2026 (AoE): Workshop Paper Submission Deadline
  • Mar 19, 2026: Workshop Paper Notification Date
  • Apr 10, 2026: Program, Camera-Ready Papers, and Videos Uploaded

Paper Submission and Acceptance

We welcome technical, position, or perspective papers related to the topics outlined below. All submissions must be written in English, follow the official CVPR proceedings format, and adhere to the double-blind review policy.

  • Tiny or Short Papers (2–4 pages) - We invite concise papers that present implementations and evaluations of unpublished but insightful ideas, modest but self-contained theoretical analyses, follow-up experiments, re-analyses of prior work, or new perspectives on existing research.
  • Regular Papers (up to 8 pages, including figures and tables) - We encourage submissions introducing original methods, novel research visions, applications, or discussions of open challenges in multimodal learning.

We accept both archival and non-archival submissions; authors should indicate their preferred option when submitting.

A Best Paper Award will be presented based on reviewer scores and the workshop committee’s evaluation.

All accepted papers will be presented as posters during the workshop, and a subset will be selected for short oral presentations. Poster sessions will be held onsite with dedicated time for interactive discussion. For remote attendees, we will offer a virtual poster gallery and live Q&A channels to ensure inclusive engagement.

Topics and Themes

We welcome all relevant submissions in the area of multimodal learning, with an emphasis on any-to-any multimodal intelligence, including:

  • Multimodal Representation Learning
  • Multimodal Transformation
  • Multimodal Synergistic Collaboration
  • Benchmarking and Evaluation for Any-to-Any Multimodal Learning

Other topics include, but are not limited to:

  • Unified multimodal foundation and agentic models.
  • Representation learning for embodied and interactive systems.
  • Integration of underexplored modalities and cognitive perspectives on multimodal perception and reasoning.

About

The recent surge of large multimodal models has driven strong progress in connecting language, vision, audio, and other modalities. Yet most existing systems remain constrained to fixed modality pairs and lack the flexibility to generalize or reason across arbitrary combinations. The Any-to-Any Multimodal Learning workshop aims to explore systems that can understand, align, transform, and generate across any set of modalities. We organize the discussion around three pillars: representation learning, transformation, and collaboration.

For the latest papers and datasets, please refer to Awesome-Any-to-Any-Generation. The repository is regularly updated and provides useful resources for preparing your submission.

Speakers

Tentative Schedule

Time    Session                                            Speaker

Morning Schedule
TBD     Introduction and Opening Remarks                   -
TBD     Keynote Talk 1                                     TBD
TBD     Keynote Talk 2                                     TBD
TBD     Oral Presentations                                 -
TBD     Coffee Break                                       -
TBD     Keynote Talk 3                                     TBD
TBD     Keynote Talk 4                                     TBD
TBD     Poster Session 1 (Interactive) + Virtual Gallery   -
TBD     Lunch Break                                        -

Afternoon Schedule
TBD     Keynote Talk 5                                     TBD
TBD     Keynote Talk 6                                     TBD
TBD     Poster Session 2 (Interactive) + Virtual Gallery   -
TBD     Coffee Break                                       -
TBD     Keynote Talk 7                                     TBD
TBD     Panel Discussion                                   TBD
TBD     Closing Remarks + Best Paper Award                 TBD

Organizers