This page documents the properties and methods available once you have loaded a SentenceTransformer model:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('model-name')
class sentence_transformers.SentenceTransformer(model_name_or_path: Optional[str] = None, modules: Optional[Iterable[torch.nn.modules.module.Module]] = None, device: Optional[str] = None, cache_folder: Optional[str] = None, use_auth_token: Optional[Union[bool, str]] = None)

Loads or creates a SentenceTransformer model that can be used to map sentences / text to embeddings.

  • model_name_or_path – If it is a filepath on disc, it loads the model from that path. If it is not a path, it first tries to download a pre-trained SentenceTransformer model. If that fails, it tries to construct a model with that name from the Hugging Face models repository.

  • modules – This parameter can be used to create custom SentenceTransformer models from scratch.

  • device – Device (like 'cuda' / 'cpu') that should be used for computation. If None, checks if a GPU can be used.

  • cache_folder – Path to the folder where downloaded models are stored

  • use_auth_token – HuggingFace authentication token to download private models.


property device: torch.device

Get the torch.device of the module, assuming that the whole module resides on a single device.

encode(sentences: Union[str, List[str]], batch_size: int = 32, show_progress_bar: Optional[bool] = None, output_value: str = 'sentence_embedding', convert_to_numpy: bool = True, convert_to_tensor: bool = False, device: Optional[str] = None, normalize_embeddings: bool = False) → Union[List[torch.Tensor], numpy.ndarray, torch.Tensor]

Computes sentence embeddings

  • sentences – the sentences to embed

  • batch_size – the batch size used for the computation

  • show_progress_bar – Whether to output a progress bar when encoding sentences

  • output_value – Default is sentence_embedding, to get sentence embeddings. Can be set to token_embeddings to get wordpiece token embeddings. Set to None to get all output values

  • convert_to_numpy – If true, the output is a list of numpy vectors. Else, it is a list of pytorch tensors.

  • convert_to_tensor – If true, one large stacked tensor is returned. Overrides any setting from convert_to_numpy

  • device – Which torch.device to use for the computation

  • normalize_embeddings – If set to true, returned vectors will have length 1. In that case, the faster dot-product (util.dot_score) can be used instead of cosine similarity.


By default, a list of tensors is returned. If convert_to_tensor is set, a stacked tensor is returned. If convert_to_numpy is set, a numpy matrix is returned.

encode_multi_process(sentences: List[str], pool: Dict[str, object], batch_size: int = 32, chunk_size: Optional[int] = None)

This method allows running encode() on multiple GPUs. The sentences are chunked into smaller packages and sent to individual processes, which encode them on the different GPUs. This method is only suitable for encoding large sets of sentences.

  • sentences – List of sentences

  • pool – A pool of workers started with SentenceTransformer.start_multi_process_pool

  • batch_size – Batch size used for encoding

  • chunk_size – Sentences are chunked and sent to the individual processes. If None, a sensible size is determined automatically.


Numpy matrix with all embeddings

evaluate(evaluator: sentence_transformers.evaluation.SentenceEvaluator.SentenceEvaluator, output_path: Optional[str] = None)

Evaluate the model

  • evaluator – the evaluator

  • output_path – the evaluator can write the results to this path

fit(train_objectives: typing.Iterable[typing.Tuple[torch.utils.data.dataloader.DataLoader, torch.nn.modules.module.Module]], evaluator: typing.Optional[sentence_transformers.evaluation.SentenceEvaluator.SentenceEvaluator] = None, epochs: int = 1, steps_per_epoch=None, scheduler: str = 'WarmupLinear', warmup_steps: int = 10000, optimizer_class: typing.Type[torch.optim.optimizer.Optimizer] = <class 'torch.optim.adamw.AdamW'>, optimizer_params: typing.Dict[str, object] = {'lr': 2e-05}, weight_decay: float = 0.01, evaluation_steps: int = 0, output_path: typing.Optional[str] = None, save_best_model: bool = True, max_grad_norm: float = 1, use_amp: bool = False, callback: typing.Optional[typing.Callable[[float, int, int], None]] = None, show_progress_bar: bool = True, checkpoint_path: typing.Optional[str] = None, checkpoint_save_steps: int = 500, checkpoint_save_total_limit: int = 0)

Train the model with the given training objectives. Each training objective is sampled in turn for one batch. We sample only as many batches from each objective as there are in the smallest one to ensure equal training with each dataset.

  • train_objectives – Tuples of (DataLoader, LossFunction). Pass more than one for multi-task learning

  • evaluator – An evaluator (sentence_transformers.evaluation) evaluates the model performance during training on held-out dev data. It is used to determine the best model that is saved to disc.

  • epochs – Number of epochs for training

  • steps_per_epoch – Number of training steps per epoch. If set to None (default), one epoch equals the DataLoader size from train_objectives.

  • scheduler – Learning rate scheduler. Available schedulers: constantlr, warmupconstant, warmuplinear, warmupcosine, warmupcosinewithhardrestarts

  • warmup_steps – Behavior depends on the scheduler. For WarmupLinear (default), the learning rate is increased from 0 up to the maximal learning rate. After these many training steps, the learning rate is decreased linearly back to zero.

  • optimizer_class – Optimizer

  • optimizer_params – Optimizer parameters

  • weight_decay – Weight decay for model parameters

  • evaluation_steps – If > 0, evaluate the model using evaluator after each number of training steps

  • output_path – Storage path for the model and evaluation files

  • save_best_model – If true, the best model (according to evaluator) is stored at output_path

  • max_grad_norm – Maximum gradient norm, used for gradient clipping.

  • use_amp – Use Automatic Mixed Precision (AMP). Requires PyTorch >= 1.6.0

  • callback – Callback function that is invoked after each evaluation. It must accept the following three parameters in this order: score, epoch, steps

  • show_progress_bar – If True, output a tqdm progress bar

  • checkpoint_path – Folder to save checkpoints during training

  • checkpoint_save_steps – A checkpoint is saved after this many training steps

  • checkpoint_save_total_limit – Total number of checkpoints to store



property max_seq_length

Property to get the maximal input sequence length for the model. Longer inputs will be truncated.

save(path: str, model_name: Optional[str] = None, create_model_card: bool = True, train_datasets: Optional[List[str]] = None)

Saves all elements of this sentence embedder into different sub-folders.

  • path – Path on disc

  • model_name – Optional model name

  • create_model_card – If True, create a model card with basic information about this model

  • train_datasets – Optional list with the names of the datasets used to train the model

save_to_hub(repo_name: str, organization: Optional[str] = None, private: Optional[bool] = None, commit_message: str = 'Add new SentenceTransformer model.', local_model_path: Optional[str] = None, exist_ok: bool = False, replace_model_card: bool = False, train_datasets: Optional[List[str]] = None)

Uploads all elements of this Sentence Transformer to a new HuggingFace Hub repository.

  • repo_name – Repository name for your model in the Hub.

  • organization – Organization in which you want to push your model or tokenizer (you must be a member of this organization).

  • private – Set to true for hosting a private model

  • commit_message – Message to commit while pushing.

  • local_model_path – Path of the model locally. If set, this file path will be uploaded. Otherwise, the current model will be uploaded

  • exist_ok – If true, saving to an existing repository is OK. If false, saving only to a new repository is possible

  • replace_model_card – If true, replace an existing model card in the hub with the automatically created model card

  • train_datasets – Datasets used to train the model. If set, the datasets will be added to the model card in the Hub.


Returns the URL of the commit of your model in the given repository.


smart_batching_collate(batch)

Transforms a batch from a SmartBatchingDataset to a batch of tensors for the model. Here, batch is a list of tuples: [(tokens, label), …]

  • batch – a batch from a SmartBatchingDataset


a batch of tensors for the model

start_multi_process_pool(target_devices: Optional[List[str]] = None)

Starts a multi-process pool to run the encoding with several independent processes. This method is recommended if you want to encode on multiple GPUs. It is advised to start only one process per GPU. This method works together with encode_multi_process.


target_devices – PyTorch target devices, e.g. cuda:0, cuda:1… If None, all available CUDA devices will be used


Returns a dict with the target processes, an input queue and an output queue.

static stop_multi_process_pool(pool)

Stops all processes started with start_multi_process_pool

tokenize(texts: Union[List[str], List[Dict], List[Tuple[str, str]]])

Tokenizes the texts

property tokenizer

Property to get the tokenizer that is used by this model