Rerankers
Reranker models are often CrossEncoder
models with 1 output class, i.e. given a pair of texts (query, answer), the model outputs one score. This score, either a float score that reasonably ranges between -10.0 and 10.0, or a score that’s bound to 0…1, denotes to what extent the answer can help answer the query.
Many reranker models are trained on MS MARCO:
But most likely, you will get the best results when training on your dataset. Because of this, this page includes some examples training scripts that you can adopt for your own data:
-
This example uses
BinaryCrossEntropyLoss
on labeled pair data that was mined from the GooAQ dataset using an efficientSentenceTransformers
.The model is evaluated on subsets of MS MARCO, NFCorpus, NQ via the
CrossEncoderNanoBEIREvaluator
. Additionally, it is evaluated on the performance gain when reranking the top 100 results from an efficientSentenceTransformers
on the GooAQ development set. -
This example uses
CachedMultipleNegativesRankingLoss
on positive pair data loaded from the GooAQ dataset.The model is evaluated on subsets of MS MARCO, NFCorpus, NQ via the
CrossEncoderNanoBEIREvaluator
. -
This example uses
LambdaLoss
on labeled list data that was mined from the GooAQ dataset using an efficientSentenceTransformers
.The model is evaluated on subsets of MS MARCO, NFCorpus, NQ via the
CrossEncoderNanoBEIREvaluator
. Additionally, it is evaluated on the performance gain when reranking the top 100 results from an efficientSentenceTransformers
on the GooAQ development set. -
This example uses a near-identical training script as
training_gooaq_bce.py
, except on the smaller NQ (natural questions) dataset.
BinaryCrossEntropyLoss
The BinaryCrossEntropyLoss
is a very strong yet simple loss. Given pairs of texts (e.g. (query, answer) pairs), this loss uses the CrossEncoder
model to compute prediction scores. It compares these against the gold (or silver, a.k.a. determined with some model) labels, and computes a lower loss the better the model is doing.
CachedMultipleNegativesRankingLoss
The CachedMultipleNegativesRankingLoss
(a.k.a. InfoNCE with GradCache) is more complex than the common BinaryCrossEntropyLoss
. It accepts positive pairs (i.e. (query, answer) pairs) or triplets (i.e. (query, right_answer, wrong_answer) triplets), and will then randomly find num_negatives
extra incorrect answers per query by taking answers from other questions in the batch. This is often referred to as “in-batch negatives”.
The loss will then compute scores for all (query, answer) pairs, including the incorrect answers ones it just selected. The loss will then use a Cross Entropy Loss to ensure that the score of the (query, correct_answer) is higher than (query, wrong_answer) for all (randomly selected) wrong answers.
The CachedMultipleNegativesRankingLoss
uses an approach called GradCache to allow computing the scores in mini-batches without increasing the memory usage excessively. This loss is recommended over the “standard” MultipleNegativesRankingLoss
(a.k.a. InfoNCE) loss, which does not have this clever mini-batching support and thus requires a lot of memory.
Experimentation with an activation_fn
and scale
is warranted for this loss. torch.nn.Sigmoid
with scale=10.0
works okay, torch.nn.Identity`
with scale=1.0
also works, and the mGTE paper authors suggest using torch.nn.Tanh
with scale=10.0
.
Inference
The tomaarsen/reranker-ModernBERT-base-gooaq-bce model was trained with the first script. If you want to try out the model before training something yourself, feel free to use this script:
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-ModernBERT-base-gooaq-bce")
# Get scores for pairs of texts
pairs = [
["how to obtain a teacher's certificate in texas?", 'Some aspiring educators may be confused about the difference between teaching certification and teaching certificates. Teacher certification is another term for the licensure required to teach in public schools, while a teaching certificate is awarded upon completion of an academic program.'],
["how to obtain a teacher's certificate in texas?", '["Step 1: Obtain a Bachelor\'s Degree. One of the most important Texas teacher qualifications is a bachelor\'s degree. ... ", \'Step 2: Complete an Educator Preparation Program (EPP) ... \', \'Step 3: Pass Texas Teacher Certification Exams. ... \', \'Step 4: Complete a Final Application and Background Check.\']'],
["how to obtain a teacher's certificate in texas?", "Washington Teachers Licensing Application Process Official transcripts showing proof of bachelor's degree. Proof of teacher program completion at an approved teacher preparation school. Passing scores on the required examinations. Completed application for teacher certification in Washington."],
["how to obtain a teacher's certificate in texas?", 'Teacher education programs may take 4 years to complete after which certification plans are prepared for a three year period. During this plan period, the teacher must obtain a Standard Certification within 1-2 years. Learn how to get certified to teach in Texas.'],
["how to obtain a teacher's certificate in texas?", 'In Texas, the minimum age to work is 14. Unlike some states, Texas does not require juvenile workers to obtain a child employment certificate or an age certificate to work. A prospective employer that wants one can request a certificate of age for any minors it employs, obtainable from the Texas Workforce Commission.'],
]
scores = model.predict(pairs)
print(scores)
# [0.00121048 0.97105724 0.00536712 0.8632406 0.00168043]
# Or rank different texts based on similarity to a single text
ranks = model.rank(
"how to obtain a teacher's certificate in texas?",
[
"[\"Step 1: Obtain a Bachelor's Degree. One of the most important Texas teacher qualifications is a bachelor's degree. ... \", 'Step 2: Complete an Educator Preparation Program (EPP) ... ', 'Step 3: Pass Texas Teacher Certification Exams. ... ', 'Step 4: Complete a Final Application and Background Check.']",
"Teacher education programs may take 4 years to complete after which certification plans are prepared for a three year period. During this plan period, the teacher must obtain a Standard Certification within 1-2 years. Learn how to get certified to teach in Texas.",
"Washington Teachers Licensing Application Process Official transcripts showing proof of bachelor's degree. Proof of teacher program completion at an approved teacher preparation school. Passing scores on the required examinations. Completed application for teacher certification in Washington.",
"Some aspiring educators may be confused about the difference between teaching certification and teaching certificates. Teacher certification is another term for the licensure required to teach in public schools, while a teaching certificate is awarded upon completion of an academic program.",
"In Texas, the minimum age to work is 14. Unlike some states, Texas does not require juvenile workers to obtain a child employment certificate or an age certificate to work. A prospective employer that wants one can request a certificate of age for any minors it employs, obtainable from the Texas Workforce Commission.",
],
)
print(ranks)
# [
# {'corpus_id': 0, 'score': 0.97105724},
# {'corpus_id': 1, 'score': 0.8632406},
# {'corpus_id': 2, 'score': 0.0053671156},
# {'corpus_id': 4, 'score': 0.0016804343},
# {'corpus_id': 3, 'score': 0.0012104829},
# ]