Pretrained Models

We have released various pre-trained Cross Encoder models via our Cross Encoder Hugging Face organization. Additionally, numerous community Cross Encoder models have been publicly released on the Hugging Face Hub.

Each of these models can be easily downloaded and used like so:

from sentence_transformers import CrossEncoder
import torch

# Load https://huggingface.co/cross-encoder/ms-marco-MiniLM-L6-v2
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2", activation_fn=torch.nn.Sigmoid())
scores = model.predict([
    ("How many people live in Berlin?", "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."),
    ("How many people live in Berlin?", "Berlin is well known for its museums."),
])
# => array([0.9998173 , 0.01312432], dtype=float32)

Cross-Encoders require text pairs as input and output a score between 0 and 1 (if the Sigmoid activation function is used). They do not work on individual sentences, and they do not compute embeddings for individual texts.

MS MARCO

MS MARCO Passage Retrieval is a large dataset of real user queries from the Bing search engine, annotated with relevant text passages. Models trained on this dataset are very effective as rerankers for search systems.

Note

You can initialize these models with activation_fn=torch.nn.Sigmoid() to force the model to return scores between 0 and 1. Otherwise, the raw value can reasonably range between -10 and 10.
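
For example, a minimal sketch of the difference (the raw score shown is illustrative, not an exact output):

from sentence_transformers import CrossEncoder
import torch

# Without an activation function, predict() returns the raw logit
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")
raw_scores = model.predict([("How many people live in Berlin?", "Berlin is well known for its museums.")])
# => e.g. array([-4.3], dtype=float32)  (illustrative raw value)

# Applying a sigmoid afterwards is equivalent to passing activation_fn=torch.nn.Sigmoid()
probabilities = torch.sigmoid(torch.from_numpy(raw_scores))
# => a value close to 0 for this irrelevant passage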

Model Name NDCG@10 (TREC DL 19) MRR@10 (MS MARCO Dev) Docs / Sec
cross-encoder/ms-marco-TinyBERT-L2-v2 69.84 32.56 9000
cross-encoder/ms-marco-MiniLM-L2-v2 71.01 34.85 4100
cross-encoder/ms-marco-MiniLM-L4-v2 73.04 37.70 2500
cross-encoder/ms-marco-MiniLM-L6-v2 74.30 39.01 1800
cross-encoder/ms-marco-MiniLM-L12-v2 74.31 39.02 960
cross-encoder/ms-marco-electra-base 71.99 36.41 340

For details on the usage, see Retrieve & Re-Rank.
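
As a taste of that workflow, here is a minimal reranking sketch built on CrossEncoder.rank(); the query and candidate passages are illustrative:

from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")

# Illustrative query and candidate passages
query = "How many people live in Berlin?"
passages = [
    "Berlin is well known for its museums.",
    "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
    "In 2014 the city state Berlin had 37,368 live births.",
]

# rank() scores every (query, passage) pair and returns them sorted by score;
# no activation function is needed here, since only the relative order matters
for hit in model.rank(query, passages):
    print(f"{hit['score']:.4f}", passages[hit["corpus_id"]])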

SQuAD (QNLI)

QNLI is based on the SQuAD dataset (HF) and was introduced by the GLUE Benchmark (HF). Given a passage from Wikipedia, annotators created questions that are answerable by that passage. These models output higher scores if a passage answers a question.

Model Name Accuracy on QNLI dev set
cross-encoder/qnli-distilroberta-base 90.96
cross-encoder/qnli-electra-base 93.21
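
A minimal usage sketch (the question–passage pairs are illustrative, and the Sigmoid activation is an assumption carried over from the MS MARCO note above):

from sentence_transformers import CrossEncoder
import torch

# As with the MS MARCO models, passing a Sigmoid maps the raw logit into 0...1
model = CrossEncoder("cross-encoder/qnli-electra-base", activation_fn=torch.nn.Sigmoid())
scores = model.predict([
    ("Where is the Eiffel Tower located?", "The Eiffel Tower is a wrought-iron lattice tower in Paris, France."),
    ("Where is the Eiffel Tower located?", "Paris is known for its cafes and museums."),
])
# The first pair should score close to 1 (the passage answers the question),
# the second close to 0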

STSbenchmark

The following models can be used like this:

from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/stsb-roberta-base")
scores = model.predict([
    ("It's a wonderful day outside.", "It's so sunny today!"),
    ("It's a wonderful day outside.", "He drove to work earlier."),
])
# => array([0.60443085, 0.00240758], dtype=float32)

They return a score between 0 and 1 indicating the semantic similarity of the given sentence pair.

Model Name STSbenchmark Test Performance
cross-encoder/stsb-TinyBERT-L4 85.50
cross-encoder/stsb-distilroberta-base 87.92
cross-encoder/stsb-roberta-base 90.17
cross-encoder/stsb-roberta-large 91.47

Quora Duplicate Questions

These models have been trained on the Quora duplicate questions dataset. They can be used like the STSb models and return a score between 0 and 1 indicating the probability that the two questions are duplicates.

Model Name Average Precision dev set
cross-encoder/quora-distilroberta-base 87.48
cross-encoder/quora-roberta-base 87.80
cross-encoder/quora-roberta-large 87.91
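
A minimal usage sketch (the question pairs are illustrative):

from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/quora-roberta-base")
scores = model.predict([
    ("How do I learn Python quickly?", "What is the fastest way to learn Python?"),
    ("How do I learn Python quickly?", "How do I learn Java quickly?"),
])
# The first pair should get a high duplicate probability, the second a low one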

Note

These models don’t work for question similarity. The questions “How to learn Java?” and “How to learn Python?” will get a low score, as they are not duplicates. For question similarity, a SentenceTransformer trained on the Quora dataset will yield much more meaningful results.
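
For illustration, a bi-encoder sketch along those lines (the checkpoint is an assumption, not necessarily a Quora-trained model):

from sentence_transformers import SentenceTransformer

# Illustrative checkpoint; substitute a SentenceTransformer trained for question similarity
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(["How to learn Java?", "How to learn Python?"])
# Cosine similarity is high here: the questions are semantically similar,
# even though they are not duplicates
similarity = model.similarity(embeddings[0], embeddings[1])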

NLI

Given two sentences, do they contradict each other, does one entail the other, or are they neutral? The following models were trained on the SNLI and MultiNLI datasets.

Model Name Accuracy on MNLI mismatched set
cross-encoder/nli-deberta-v3-base 90.04
cross-encoder/nli-deberta-base 88.08
cross-encoder/nli-deberta-v3-xsmall 87.77
cross-encoder/nli-deberta-v3-small 87.55
cross-encoder/nli-roberta-base 87.47
cross-encoder/nli-MiniLM2-L6-H768 86.89
cross-encoder/nli-distilroberta-base 83.98

from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/nli-deberta-v3-base")
scores = model.predict([
    ("A man is eating pizza", "A man eats something"),
    ("A black race car starts up in front of a crowd of people.", "A man is driving down a lonely road."),
])

# Convert scores to labels
label_mapping = ["contradiction", "entailment", "neutral"]
labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
# => ['entailment', 'contradiction']

Community Models

Some notable models from the community include: