Image Search — Sentence Transformers documentation

Image Search

SentenceTransformers provides models that embed images and text into the same vector space. This makes it possible to find similar images and to implement image search.


Installation

Ensure that you have transformers installed to use the image-text models, and use a recent PyTorch version (tested with PyTorch 1.7.0). Image-text models were added in SentenceTransformers version 1.0.0 and are still in an experimental phase.
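
A typical setup installs the package with pip (a minimal sketch; sentence-transformers pulls in transformers as a dependency, so install it separately only if you need a specific version):

pip install -U sentence-transformers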

Usage

SentenceTransformers provides a wrapper for the OpenAI CLIP model, which was trained on a variety of (image, text) pairs.

from sentence_transformers import SentenceTransformer
from PIL import Image

# Load CLIP model
model = SentenceTransformer("clip-ViT-B-32")

# Encode an image:
img_emb = model.encode(Image.open("two_dogs_in_snow.jpg"))

# Encode text descriptions
text_emb = model.encode(
    ["Two dogs in the snow", "A cat on a table", "A picture of London at night"]
)

# Compute similarities
similarity_scores = model.similarity(img_emb, text_emb)
print(similarity_scores)

You can use the CLIP model for:

  • Text-to-Image / Image-to-Text / Image-to-Image / Text-to-Text search (see the retrieval sketch after this list)

  • Fine-tuning on your own image and text data with the regular SentenceTransformers training code
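
For example, text-to-image search can be built on top of util.semantic_search(). The following is a minimal sketch, assuming a handful of local image files (the file names are hypothetical placeholders):

from sentence_transformers import SentenceTransformer, util
from PIL import Image

# Load the CLIP model
model = SentenceTransformer("clip-ViT-B-32")

# Hypothetical local image files; replace with your own collection
image_paths = ["two_dogs_in_snow.jpg", "cat_on_table.jpg", "london_at_night.jpg"]
img_embs = model.encode([Image.open(path) for path in image_paths])

# Embed a text query into the same vector space as the images
query_emb = model.encode("Two dogs playing in the snow")

# Retrieve the images most similar to the text query
hits = util.semantic_search(query_emb, img_embs, top_k=3)[0]
for hit in hits:
    print(image_paths[hit["corpus_id"]], hit["score"])

The same pattern works in the other directions: swap the roles of the query and the corpus embeddings to get image-to-text or image-to-image search.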

Examples

  • Image_Search.ipynb (Colab Version) presents a larger example of text-to-image and image-to-image search using 25,000 free pictures from Unsplash.

  • Image_Search-multilingual.ipynb (Colab Version) shows multilingual text-to-image search for 50+ languages.

  • Image_Clustering.ipynb (Colab Version) shows how to perform image clustering. Given 25,000 free pictures from Unsplash, we find clusters of similar images. You can control how sensitive the clustering should be (a minimal local sketch follows this list).

  • Image_Duplicates.ipynb (Colab Version) shows how to find duplicate and near-duplicate images in a large collection of photos.

  • Image_Classification.ipynb (Colab Version) shows an example of (multilingual) zero-shot image classification.
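
The clustering shown in Image_Clustering.ipynb can be approximated locally with util.community_detection(), which groups embeddings whose cosine similarity exceeds a threshold. A minimal sketch, again assuming hypothetical local image files (threshold and min_community_size control how sensitive the clustering is):

from sentence_transformers import SentenceTransformer, util
from PIL import Image

model = SentenceTransformer("clip-ViT-B-32")

# Hypothetical local image files; replace with your own collection
image_paths = ["img_0.jpg", "img_1.jpg", "img_2.jpg"]
img_embs = model.encode(
    [Image.open(path) for path in image_paths], convert_to_tensor=True
)

# Group images whose cosine similarity exceeds the threshold;
# raising the threshold yields tighter, smaller clusters
clusters = util.community_detection(img_embs, threshold=0.9, min_community_size=2)
for i, cluster in enumerate(clusters):
    print(f"Cluster {i}:", [image_paths[idx] for idx in cluster])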
