Catherine Chen
CS PhD Candidate

I am a fourth-year PhD candidate at Brown University, working with Carsten Eickhoff in the Health NLP Lab. Currently, I am spending Spring 2025 as a visiting researcher in the IRLab at the University of Amsterdam, hosted by Maarten de Rijke.

My research centers on explainable information retrieval (XIR), particularly developing methods that make search systems interpretable and transparent. I am also broadly interested in the interpretability of LLMs and in applications of ML/NLP/IR to the biomedical domain. Overall, my work aims to bridge the gap between model transparency and practical deployment, contributing to the development of more accountable and trustworthy AI systems.

Previously, I worked as a full-stack software engineer at FreeWheel in NYC, developing advertising technology solutions. I received my B.A. in Computer Science and French from Wellesley College, where I was also a member of the Track & Field and Swimming & Diving teams.

In my spare time, I enjoy running, ice skating, cooking, and exploring local restaurants/bars.

catherine_s_chen[at]brown[dot]edu

Education
  • Brown University
    • Ph.D. in Computer Science
      Sep 2021 - Present
    • Sc.M. in Computer Science
      Sep 2021 - May 2023
  • Wellesley College
    • B.A. in Computer Science and French
      Sep 2015 - May 2019
Employment
  • Blue Cross Blue Shield of MN
    Data Science PhD Intern
    Jun 2024 - Sep 2024
  • FreeWheel, a Comcast Company
    Software Engineer
    Aug 2019 - Aug 2021
News
2025
  • Feb 01: Started as a visiting researcher in the IRLab at the University of Amsterdam!
2024
  • Dec 16: Our demo paper, MechIR: A Mechanistic Interpretability Framework for Information Retrieval, was accepted at ECIR 2025
  • Sep 30: Gave a talk at the Glasgow IR Seminar
  • Jul 14: Attended SIGIR in Washington, DC, and presented my first two first-author papers!
  • Mar 25: 2/2 of our long papers were accepted to SIGIR 2024!
2023
  • Dec 06: Attended EMNLP in Singapore and presented our paper Outlier Dimensions Encode Task Specific Knowledge
  • Jul 23: Attended SIGIR in Taiwan and workshopped my dissertation in the Doctoral Consortium
  • May 28: Received my Master's degree and advanced to candidacy
2021
  • Sep 06: Started my PhD at Brown University!
Selected Publications
MechIR: A Mechanistic Interpretability Framework for Information Retrieval

Andrew Parry, Catherine Chen, Carsten Eickhoff, Sean MacAvaney

ECIR 2025

Mechanistic interpretability is an emerging diagnostic approach for neural models that has gained traction in broader natural language processing domains. This paradigm aims to provide attribution to components of neural systems where causal relationships between hidden layers and output were previously uninterpretable. As the use of neural models in IR for retrieval and evaluation becomes ubiquitous, we need to ensure that we can interpret why a model produces a given output for both transparency and the betterment of systems. This work comprises a flexible framework for diagnostic analysis and intervention within these highly parametric neural systems specifically tailored for IR tasks and architectures. In providing such a framework, we look to facilitate further research in interpretable IR with a broader scope for practical interventions derived from mechanistic interpretability. We provide preliminary analysis and look to demonstrate our framework through an axiomatic lens to show its applications and ease of use for those IR practitioners inexperienced in this emerging paradigm.
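
To make the framework's target concrete, here is a minimal sketch, in Python, of the paired perturbation-based diagnostic that such analyses start from: scoring a document before and after a term-frequency perturbation with an off-the-shelf cross-encoder, so that the resulting score change can later be attributed to internal model components. This is not the MechIR API; the model checkpoint and example texts are illustrative assumptions.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed public re-ranking checkpoint, used purely for illustration.
MODEL = "cross-encoder/ms-marco-MiniLM-L-6-v2"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

def relevance(query: str, doc: str) -> float:
    """Single relevance logit from the cross-encoder for a query-document pair."""
    inputs = tokenizer(query, doc, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits.squeeze().item()

query = "symptoms of influenza"
baseline = "Influenza is a respiratory illness caused by a virus."
perturbed = baseline + " influenza"  # TF-axiom style perturbation: append a query term

delta = relevance(query, perturbed) - relevance(query, baseline)
print(f"Score change under term-frequency perturbation: {delta:+.4f}")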

Axiomatic Causal Interventions for Reverse Engineering Relevance Computation in Neural Retrieval Models

Catherine Chen, Jack Merullo, Carsten Eickhoff

SIGIR 2024

Neural models have demonstrated remarkable performance across diverse ranking tasks. However, the processes and internal mechanisms along which they determine relevance are still largely unknown. Existing approaches for analyzing neural ranker behavior with respect to IR properties rely either on assessing overall model behavior or employing probing methods that may offer an incomplete understanding of causal mechanisms. To provide a more granular understanding of internal model decision-making processes, we propose the use of causal interventions to reverse engineer neural rankers, and demonstrate how mechanistic interpretability methods can be used to isolate components satisfying term-frequency axioms within a ranking model. We identify a group of attention heads that detect duplicate tokens in earlier layers of the model, then communicate with downstream heads to compute overall document relevance. More generally, we propose that this style of mechanistic analysis opens up avenues for reverse engineering the processes neural retrieval models use to compute relevance. This work aims to initiate granular interpretability efforts that will not only benefit retrieval model development and training, but ultimately ensure safer deployment of these models.
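
A rough illustration of the causal-intervention idea, under stated assumptions: an assumed public BERT-style cross-encoder checkpoint, a hypothetical layer/head index, and plain PyTorch forward hooks rather than the authors' released code. The sketch caches one attention head's output on a document containing a duplicated query term, patches it into the run on the unperturbed document, and reports how much of the relevance-score gap that single head restores.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "cross-encoder/ms-marco-MiniLM-L-6-v2"   # assumed checkpoint
LAYER, HEAD = 3, 7                               # hypothetical head to test

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()
attn = model.base_model.encoder.layer[LAYER].attention.self
n_heads = model.config.num_attention_heads
d_head = model.config.hidden_size // n_heads

def score(pair, hook=None):
    """Relevance logit for (query, doc), optionally with a hook on the chosen attention module."""
    handle = attn.register_forward_hook(hook) if hook else None
    inputs = tokenizer(*pair, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logit = model(**inputs).logits.squeeze().item()
    if handle:
        handle.remove()
    return logit

cache = {}

def save_hook(module, inp, out):
    cache["ctx"] = out[0].detach()               # clean-run self-attention output [1, seq, hidden]

def patch_hook(module, inp, out):
    ctx = out[0].clone()                         # corrupted-run output [1, seq, hidden]
    b, t, _ = ctx.shape
    ctx = ctx.view(b, t, n_heads, d_head)
    # Sketch-level approximation: the clean input shares its prefix with the corrupted one,
    # so the first t positions are assumed to align.
    clean = cache["ctx"][:, :t].view(b, t, n_heads, d_head)
    ctx[:, :, HEAD] = clean[:, :, HEAD]          # overwrite a single head's output
    return (ctx.view(b, t, -1),) + out[1:]

query = "symptoms of influenza"
corrupt_doc = "Influenza symptoms include fever, cough, and fatigue."
clean_doc = corrupt_doc + " influenza"           # TF perturbation: duplicate a query term

clean = score((query, clean_doc), save_hook)
corrupt = score((query, corrupt_doc))
patched = score((query, corrupt_doc), patch_hook)
print(f"Head ({LAYER}, {HEAD}) restores {(patched - corrupt) / (clean - corrupt):.1%} of the score gap")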

Evaluating Search System Explainability with Psychometrics and Crowdsourcing

Catherine Chen, Carsten Eickhoff

SIGIR 2024

Information retrieval (IR) systems have become an integral part of our everyday lives. As search engines, recommender systems, and conversational agents are employed across various domains from recreational search to clinical decision support, there is an increasing need for transparent and explainable systems to guarantee accountable, fair, and unbiased results. Despite many recent advances towards explainable AI and IR techniques, there is no consensus on what it means for a system to be explainable. Although a growing body of literature suggests that explainability is comprised of multiple subfactors, virtually all existing approaches treat it as a singular notion. In this paper, we examine explainability in Web search systems, leveraging psychometrics and crowdsourcing to identify human-centered factors of explainability.
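
For a sense of the psychometric workflow this points to, here is a hedged sketch: exploratory factor analysis over Likert-scale questionnaire ratings, a standard tool for surfacing latent factors from survey responses. The items, the planted two-factor structure, and the responses below are synthetic placeholders, not the paper's instrument or its crowdsourced data.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Hypothetical questionnaire items rated by hypothetical crowd workers.
items = ["understandable", "trustworthy", "sufficient_detail",
         "relevant_evidence", "actionable", "consistent"]

# Synthetic responses with two planted latent factors, mapped to a 1-7 Likert scale.
latent = rng.normal(size=(200, 2))
weights = np.array([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2],
                    [0.2, 0.8], [0.7, 0.3], [0.3, 0.7]])
raw = latent @ weights.T + 0.3 * rng.normal(size=(200, len(items)))
X = np.clip(np.round(4 + 1.5 * raw), 1, 7)

# Exploratory factor analysis with varimax rotation; inspect the top loadings per factor.
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
for k, loadings in enumerate(fa.components_):
    top = ", ".join(f"{item} ({w:+.2f})" for item, w in
                    sorted(zip(items, loadings), key=lambda p: -abs(p[1]))[:3])
    print(f"Factor {k + 1}: {top}")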

Outlier Dimensions Encode Task Specific Knowledge

William Rudman, Catherine Chen, Carsten Eickhoff

EMNLP 2023

Representations from large language models (LLMs) are known to be dominated by a small subset of dimensions with exceedingly high variance. Previous works have argued that although ablating these outlier dimensions in LLM representations hurts downstream performance, outlier dimensions are detrimental to the representational quality of embeddings. In this study, we investigate how fine-tuning impacts outlier dimensions and show that 1) outlier dimensions that occur in pre-training persist in fine-tuned models and 2) a single outlier dimension can complete downstream tasks with a minimal error rate. Our results suggest that outlier dimensions can encode crucial task-specific knowledge and that the value of a representation in a single outlier dimension drives downstream model decisions.
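
A minimal sketch of the style of analysis described above, assuming an off-the-shelf fine-tuned sentiment checkpoint and a handful of toy sentences (not the paper's models, datasets, or evaluation): locate the highest-variance "outlier" dimension of the [CLS] representations and probe the task from that single coordinate.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL = "distilbert-base-uncased-finetuned-sst-2-english"  # assumed sentiment checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

texts = ["A wonderful, heartfelt film.", "Sharp writing and great acting.",
         "Utterly boring and predictable.", "A complete waste of time.",
         "One of the best movies this year.", "Painfully dull from start to finish."]
labels = [1, 1, 0, 0, 1, 0]

with torch.no_grad():
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    hidden = model(**enc, output_hidden_states=True).hidden_states[-1][:, 0]  # [CLS] vectors

variance = hidden.var(dim=0)
outlier = int(variance.argmax())                 # single highest-variance dimension
print(f"Outlier dimension {outlier}: variance {variance[outlier]:.2f} "
      f"vs. mean {variance.mean():.2f} across all dimensions")

# Probe the downstream task using only that one coordinate of each representation.
single_dim = hidden[:, [outlier]].numpy()
clf = LogisticRegression().fit(single_dim, labels)
print("Training accuracy from a single dimension:", clf.score(single_dim, labels))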

All publications