Pretrained Models

We have released various pre-trained Cross Encoder models via our Cross Encoder Hugging Face organization. Additionally, numerous community CrossEncoder models have been publicly released on the Hugging Face Hub.

Each of these models can be easily downloaded and used like so:

from sentence_transformers import CrossEncoder
import torch

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2", default_activation_function=torch.nn.Sigmoid())
scores = model.predict([
    ("How many people live in Berlin?", "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."),
    ("How many people live in Berlin?", "Berlin is well known for its museums."),
])
# => array([0.9998173 , 0.01312432], dtype=float32)

Cross-Encoders require text pairs as inputs and output a score 0…1 (if the Sigmoid activation function is used). They do not work for individual sentences and they don’t compute embeddings for individual texts.


MS MARCO

MS MARCO Passage Retrieval is a large dataset of real user queries from the Bing search engine, annotated with relevant text passages.


You can initialize these models with default_activation_function=torch.nn.Sigmoid() to force the model to return scores between 0 and 1. Otherwise, the raw value can reasonably range between -10 and 10.
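To illustrate what the Sigmoid activation does to the raw scores, here is a minimal NumPy sketch. It is not the library's implementation; the `sigmoid` helper and the sample values are assumptions chosen to match the -10 to 10 range mentioned above.

```python
import numpy as np


def sigmoid(x):
    """Map raw scores (logits) into the open interval (0, 1)."""
    x = np.asarray(x, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-x))


# Illustrative raw scores, not real model output.
raw_scores = np.array([-10.0, 0.0, 10.0])
probabilities = sigmoid(raw_scores)
print(probabilities)
```

Because the sigmoid is monotonic, applying it changes the scale of the scores but not the order in which passages would be ranked.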

For details on the usage, see Retrieve & Re-Rank or MS MARCO Cross-Encoders.
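As a rough sketch of the re-ranking idea, the snippet below pairs one query with each candidate passage, scores the pairs, and sorts by score. The `rerank` helper and the `word_overlap_scores` function are hypothetical names introduced for illustration; the toy scorer only stands in for a real Cross Encoder so that the example runs without downloading a model. In practice, a callable such as `model.predict` could be passed as `score_fn`.

```python
def rerank(query, passages, score_fn):
    """Score (query, passage) pairs and return (passage, score) sorted best-first."""
    pairs = [(query, passage) for passage in passages]
    scores = [float(score) for score in score_fn(pairs)]
    return sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)


def _words(text):
    # Lowercase and strip trailing punctuation so "Berlin?" matches "Berlin".
    return {word.strip("?.,!") for word in text.lower().split()}


def word_overlap_scores(pairs):
    """Toy stand-in scorer (assumption): fraction of query words found in the passage."""
    scores = []
    for query, passage in pairs:
        query_words = _words(query)
        scores.append(len(query_words & _words(passage)) / max(len(query_words), 1))
    return scores


passages = [
    "Berlin is well known for its museums.",
    "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
]
ranked = rerank("How many people live in Berlin?", passages, word_overlap_scores)
for passage, score in ranked:
    print(f"{score:.2f}  {passage}")
```

Keeping the scorer as a parameter means the same ranking logic works with any function that accepts a list of text pairs and returns one score per pair.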


SQuAD (QNLI)

QNLI is based on the SQuAD dataset (HF) and was introduced by the GLUE Benchmark (HF). Given a passage from Wikipedia, annotators created questions that are answerable by that passage.


STSbenchmark

The following models can be used like this:

from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/stsb-roberta-base")
scores = model.predict([("It's a wonderful day outside.", "It's so sunny today!"), ("It's a wonderful day outside.", "He drove to work earlier.")])
# => array([0.60443085, 0.00240758], dtype=float32)

They return a score 0…1 indicating the semantic similarity of the given sentence pair.

Quora Duplicate Questions

These models have been trained on the Quora duplicate questions dataset. They can be used like the STSb models and give a score 0…1 indicating the probability that two questions are duplicates.


These models don’t work for question similarity. The questions "How to learn Java" and "How to learn Python" will get a low score, as they are not duplicates. For question similarity, the respective bi-encoder trained on the Quora dataset yields much more meaningful results.
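Since the Quora models return a duplicate probability, one simple way to turn the scores into yes/no decisions is to apply a cutoff. The sketch below assumes a threshold of 0.5 and uses made-up scores; both the `flag_duplicates` helper and the threshold are illustrative choices, not values recommended by the library.

```python
import numpy as np


def flag_duplicates(scores, threshold=0.5):
    """Return a boolean array: True where the score meets the threshold."""
    return np.asarray(scores, dtype=np.float64) >= threshold


# Made-up scores for three question pairs (not real model output).
scores = np.array([0.97, 0.08, 0.51])
is_duplicate = flag_duplicates(scores)
print(is_duplicate.tolist())
```

A stricter threshold trades recall for precision, so the right value depends on how costly a false duplicate is in the application.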


NLI

Given two sentences, do they contradict each other, does one entail the other, or are they neutral? The following models were trained on the SNLI and MultiNLI datasets.

from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/nli-deberta-v3-base")
scores = model.predict([
    ("A man is eating pizza", "A man eats something"),
    ("A black race car starts up in front of a crowd of people.", "A man is driving down a lonely road."),
])

# Convert scores to labels
label_mapping = ["contradiction", "entailment", "neutral"]
labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
# => ['entailment', 'contradiction']
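If per-class probabilities are wanted in addition to the label, a softmax can be applied to each row of scores. The sketch below assumes the model returns one row of unnormalized logits per pair, in the label order used above; the `softmax` helper and the sample logits are illustrative and not real model output.

```python
import numpy as np

LABELS = ["contradiction", "entailment", "neutral"]


def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() numerically stable."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=1, keepdims=True)
    exponentials = np.exp(shifted)
    return exponentials / exponentials.sum(axis=1, keepdims=True)


# Made-up logits for two sentence pairs (assumption, for illustration only).
logits = np.array([
    [-2.0, 4.0, 0.5],
    [3.5, -3.0, 0.2],
])
probabilities = softmax(logits)
labels = [LABELS[index] for index in probabilities.argmax(axis=1)]
print(labels)
```

Softmax preserves the ordering within each row, so the predicted label is the same as taking the argmax of the raw scores.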