SIGIR 2025 Tutorial:
Retrieval-Enhanced Machine Learning:
Synthesis and Opportunities

Carnegie Mellon University, Databricks, University of Massachusetts Amherst

Sunday July 13th 09:00 - 12:30 (GMT+2)
@ Donatello Floor 0

About this tutorial

Retrieval-enhanced machine learning (REML) refers to the use of information retrieval methods to support reasoning and inference in machine learning tasks. Although relatively recent, these approaches can substantially improve model performance, including generalization, knowledge grounding, scalability, freshness, attribution, interpretability, and on-device learning. To date, despite being influenced by work in the information retrieval community, REML research has predominantly been presented at natural language processing (NLP) conferences.

Our tutorial addresses this disconnect by introducing core REML concepts and synthesizing the literature from various domains in machine learning (ML), including but not limited to NLP. What is unique to our approach is the use of consistent notation, which provides researchers with a unified and expandable framework. The tutorial will be delivered in lecture format and is based on an existing manuscript, "Retrieval-Enhanced Machine Learning: Synthesis and Opportunities".

Schedule

Combined Slides: [Slides]

Time | Section | Presenter | In Manuscript
09:00 - 09:20 | Section 1: Introduction | Fernando Diaz | Chapters 1-2
09:20 - 09:45 | Section 2: Querying | Alireza Salemi | Chapter 3
09:45 - 09:55 | Section 3: Searching | Alireza Salemi | Chapter 4
09:55 - 10:30 | Section 4: Presentation & Consumption | Andrew Drozdov | Chapter 5
10:30 - 11:00 | Coffee Break
11:00 - 11:25 | Section 5: Storing | To Eun Kim | Chapter 6
11:25 - 11:45 | Section 6: Optimization | Hamed Zamani | Chapter 7
11:45 - 12:05 | Section 7: Evaluation | Fernando Diaz | Chapter 8
12:05 - 12:20 | Section 8: Future Direction & Conclusion | Fernando Diaz | Chapters 9-10
12:20 - 12:30 | Q & A | All

BibTeX (Manuscript)

      
@misc{kim2024retrievalenhancedmachinelearning,
  title={Retrieval-Enhanced Machine Learning: Synthesis and Opportunities},
  author={To Eun Kim and Alireza Salemi and Andrew Drozdov and Fernando Diaz and Hamed Zamani},
  year={2024},
  eprint={2407.12982},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2407.12982},
}