Call for Papers
Important Dates
We will follow the review-process dates suggested by CVPR 2026.
| Date | Milestone |
|---|---|
| Mar 01, 2026 (AoE) | Workshop Paper Submission Deadline |
| Mar 19, 2026 | Workshop Paper Notification |
| Apr 10, 2026 | Program, Camera-Ready, and Videos Uploaded |
Paper Submission and Acceptance
We welcome technical, position, or perspective papers related to the topics outlined below. All submissions must be written in English, follow the official CVPR proceedings format, and adhere to the double-blind review policy.
- Tiny or Short Papers (2–4 pages) - We invite concise papers that present implementations and evaluations of unpublished but insightful ideas, moderate yet self-contained theoretical analyses, follow-up experiments, re-analyses of prior work, or new perspectives on existing research.
- Regular Papers (up to 8 pages, including figures and tables) - We encourage submissions introducing original methods, novel research visions, applications, or discussions of open challenges in multimodal learning.
We accept both archival and non-archival submissions; authors should indicate their preferred track at submission time.
A Best Paper Award will be presented based on reviewer scores and the workshop committee’s evaluation.
All accepted papers will be presented as posters during the workshop, and some of them will be selected for short oral presentations. Poster sessions will be conducted onsite with dedicated time for interactive discussions. For remote attendees, we will offer a virtual poster gallery and live Q&A channels to ensure inclusive engagement.
Topics and Themes
We welcome all relevant submissions in the area of multimodal learning, with an emphasis on any-to-any multimodal intelligence, including:
- Multimodal Representation Learning
- Multimodal Transformation
- Multimodal Synergistic Collaboration
- Benchmarking and Evaluation for Any-to-Any Multimodal Learning
Other topics include, but are not limited to:
- Unified multimodal foundation and agentic models
- Representation learning for embodied and interactive systems
- Integration of underexplored modalities and cognitive perspectives on multimodal perception and reasoning
About
The recent surge of multimodal large models has brought rapid progress in connecting language, vision, audio, and beyond. Yet most existing systems remain constrained to fixed modality pairs and lack the flexibility to generalize or reason across arbitrary combinations. The Any-to-Any Multimodal Learning workshop aims to explore systems that can understand, align, transform, and generate across any set of modalities. We organize the discussion around three pillars: representation learning, transformation, and collaboration.
For the latest papers and datasets, please refer to the Awesome-Any-to-Any-Generation repository, which is regularly updated and offers useful resources for preparing your submission.
Speakers
Tentative Schedule
| Time | Schedule | Speaker |
|---|---|---|
| Morning Schedule | | |
| TBD | Introduction and opening remarks | - |
| TBD | Keynote Talk 1 | TBD |
| TBD | Keynote Talk 2 | TBD |
| TBD | Oral Presentations | - |
| TBD | Coffee Break | - |
| TBD | Keynote Talk 3 | TBD |
| TBD | Keynote Talk 4 | TBD |
| TBD | Poster Session 1 (Interactive) + Virtual Gallery | - |
| TBD | Lunch Break | - |
| Afternoon Schedule | | |
| TBD | Keynote Talk 5 | TBD |
| TBD | Keynote Talk 6 | TBD |
| TBD | Poster Session 2 (Interactive) + Virtual Gallery | - |
| TBD | Coffee Break | - |
| TBD | Keynote Talk 7 | TBD |
| TBD | Panel Discussion | TBD |
| TBD | Closing Remarks + Best Paper Award | TBD |