Extending Bradley-Terry
The Bradley-Terry Model
The Bradley-Terry model assigns a strength parameter $\theta_i$ to each item $i$. The probability that item $i$ wins against item $j$ is:
\[P(i > j) = \sigma(\theta_i - \theta_j) = \frac{1}{1 + e^{-(\theta_i - \theta_j)}}\]where $\sigma$ is the logistic sigmoid function. The outcome depends on the difference in strengths between competitors.
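This probability is straightforward to compute with a vector of strengths and the `[N, 2]` matchup format described below. A minimal sketch (the function name `bt_win_prob` is illustrative, not part of the provided starter code):

```python
import torch

def bt_win_prob(theta: torch.Tensor, matchups: torch.Tensor) -> torch.Tensor:
    """P(first model beats second) under Bradley-Terry.

    theta:    [M] tensor of strength parameters, one per model.
    matchups: [N, 2] integer tensor; each entry indexes into the model list.
    """
    # Outcome depends only on the difference in strengths.
    diff = theta[matchups[:, 0]] - theta[matchups[:, 1]]
    return torch.sigmoid(diff)
```

Note that equal strengths give a win probability of exactly 0.5, and adding a constant to every $\theta_i$ leaves all predictions unchanged (the model is identified only up to a shift).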
The Loss Function
We fit the model by minimizing the binary cross-entropy loss (negative mean log-likelihood). For $N$ matchups with labels $y_i$ and predicted probabilities $p_i$:
\[\mathcal{L} = - \frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right]\]This loss penalizes incorrect predictions: the term $y_i \log(p_i)$ contributes when model A wins ($y_i=1.0$), while $(1 - y_i) \log(1 - p_i)$ contributes when model A loses ($y_i=0.0$). For ties ($y_i=0.5$), both terms contribute equally.
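The loss above can be written directly; a small numerical guard against $\log(0)$ is the only subtlety. A sketch (the name `bt_loss` and the epsilon value are choices, not part of the starter code; `torch.nn.functional.binary_cross_entropy` would also accept the soft tie labels):

```python
import torch

def bt_loss(p: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Mean binary cross-entropy; y takes values in {1.0, 0.0, 0.5}."""
    eps = 1e-12
    p = p.clamp(eps, 1 - eps)  # avoid log(0) for saturated predictions
    return -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()
```

For example, predicting $p = 0.5$ on a decisive outcome costs $\log 2 \approx 0.693$ nats, and a tie label $y = 0.5$ is minimized exactly at $p = 0.5$.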
Dataset
The dataset contains human preference judgments from pairwise comparisons of AI model responses. Each comparison involves:
- A user submitting a prompt
- Two different AI models generating responses
- The user selecting which response they prefer (or marking a tie)
Data Format
A load_data() function is provided which loads the data and splits it into train and eval sets, returned alongside the model and user lists. The key fields are:
- models: list of strings (length M) containing all unique model names, ordered alphabetically
- matchups: [N, 2] torch tensor of integers representing pairs of models being compared; each integer is an index into the models list
- outcomes: [N] torch tensor of floats with values in {1.0, 0.0, 0.5} indicating whether the first model won (1.0), lost (0.0), or tied (0.5)
- users: list of strings containing all unique user identifiers
- user_ids: [N] torch tensor of integers, where each integer identifies the user who made the judgment
- prompt_embs: [N, 32] torch tensor of float embeddings representing the semantic content of each prompt
- response_a_embs: [N, 32] torch tensor of float embeddings for the first model's response
- response_b_embs: [N, 32] torch tensor of float embeddings for the second model's response
All embeddings are 32-dimensional vectors obtained from a toy embedding model.
Example: given the following battles:
- Model A vs Model B, judged by User X who voted for Model A
- Model B vs Model C, judged by User Y who voted for Model C
- Model C vs Model D, judged by User X who voted tie
The data would look like:
train_data, val_data, models, users = load_data()
models
# ["Model A", "Model B", "Model C", "Model D"]
users
# ["User X", "User Y"]
train_data["matchups"]
# [[0, 1],  # A vs B
#  [1, 2],  # B vs C
#  [2, 3]]  # C vs D
train_data["outcomes"]
# [1.0,  # model in first column won
#  0.0,  # model in second column won
#  0.5]  # tie
train_data["user_ids"]
# [0,  # "User X" is index 0 in users
#  1,  # "User Y" is index 1 in users
#  0]  # "User X" is index 0 in users
Tasks
1. Bradley-Terry Warmup
Use the given starter code to implement the loss function and Bradley-Terry model.
Run it on the sample data.
2. User-Model Interaction Extension
Propose and implement a way to use the user_id information to create a more expressive extension of BT.
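One possible direction (an illustrative sketch, not the intended solution; the class name `UserBT` and the low-rank design are assumptions): keep the global strength $\theta_i$ but add a learned user-by-model interaction, so each user can systematically favor certain models.

```python
import torch
import torch.nn as nn

class UserBT(nn.Module):
    """Bradley-Terry plus a low-rank user x model preference term."""
    def __init__(self, num_models: int, num_users: int, dim: int = 4):
        super().__init__()
        self.theta = nn.Parameter(torch.zeros(num_models))  # global strengths
        self.user_emb = nn.Embedding(num_users, dim)
        self.model_emb = nn.Embedding(num_models, dim)
        # Small init so the model starts close to plain Bradley-Terry.
        nn.init.normal_(self.user_emb.weight, std=0.01)
        nn.init.normal_(self.model_emb.weight, std=0.01)

    def forward(self, matchups: torch.Tensor, user_ids: torch.Tensor) -> torch.Tensor:
        a, b = matchups[:, 0], matchups[:, 1]
        base = self.theta[a] - self.theta[b]
        # User-specific adjustment: dot product of user and model embeddings.
        u = self.user_emb(user_ids)
        inter = (u * self.model_emb(a)).sum(-1) - (u * self.model_emb(b)).sum(-1)
        return torch.sigmoid(base + inter)
```

The low-rank form keeps the parameter count at $O((M + U)\,d)$ rather than the $O(MU)$ of a full per-user, per-model table, which matters when many users cast only a few votes.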
3. Content Representation Extension
Propose and implement a method that uses the provided embeddings to model the vote outcome.
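One way this could look (an illustrative sketch under assumed names; `ContentBT` and the MLP scorer are not part of the starter code): score each response with a small network over the concatenated prompt and response embeddings, and add the score difference to the Bradley-Terry logit.

```python
import torch
import torch.nn as nn

class ContentBT(nn.Module):
    """Bradley-Terry plus a content score from prompt/response embeddings."""
    def __init__(self, num_models: int, emb_dim: int = 32, hidden: int = 16):
        super().__init__()
        self.theta = nn.Parameter(torch.zeros(num_models))  # global strengths
        # Scores one (prompt, response) pair; shared across both sides.
        self.scorer = nn.Sequential(
            nn.Linear(2 * emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, matchups, prompt_embs, resp_a_embs, resp_b_embs):
        base = self.theta[matchups[:, 0]] - self.theta[matchups[:, 1]]
        s_a = self.scorer(torch.cat([prompt_embs, resp_a_embs], -1)).squeeze(-1)
        s_b = self.scorer(torch.cat([prompt_embs, resp_b_embs], -1)).squeeze(-1)
        return torch.sigmoid(base + s_a - s_b)
```

Because the same scorer evaluates both responses, identical responses cancel exactly and the prediction falls back to the plain Bradley-Terry probability.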
