AI & ML interests
The models listed here are tuned for generating sentence and text embeddings. They can be used with the sentence-transformers package.
Organization Card
SentenceTransformers 🤗 is a Python framework for state-of-the-art sentence, text and image embeddings.
Install the Sentence Transformers library.
pip install -U sentence-transformers
The usage is as simple as:
from sentence_transformers import SentenceTransformer
# 1. Load a pretrained Sentence Transformer model
model = SentenceTransformer("all-MiniLM-L6-v2")
# The sentences to encode
sentences = [
"The weather is lovely today.",
"It's so sunny outside!",
"He drove to the stadium.",
]
# 2. Calculate embeddings by calling model.encode()
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# 3. Calculate the embedding similarities
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6660, 0.1046],
# [0.6660, 1.0000, 0.1411],
# [0.1046, 0.1411, 1.0000]])
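To make the similarity step more concrete, here is a minimal sketch in plain Python, assuming the scores are cosine similarities (which should be the default for model.similarity unless a model is configured otherwise). The helper names cosine_similarity and most_similar_pair are hypothetical and not part of the library, and the toy vectors are made up rather than real model output. The second half reuses the matrix printed above to pick out the closest pair of sentences.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar_pair(matrix):
    """Return (i, j, score) for the highest entry above the diagonal."""
    best = None
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            if best is None or matrix[i][j] > best[2]:
                best = (i, j, matrix[i][j])
    return best


# Toy 2-dimensional vectors (hypothetical, not real embeddings)
toy = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
toy_matrix = [[cosine_similarity(a, b) for b in toy] for a in toy]

# The similarity matrix printed in the example above
printed = [
    [1.0000, 0.6660, 0.1046],
    [0.6660, 1.0000, 0.1411],
    [0.1046, 0.1411, 1.0000],
]
print(most_similar_pair(printed))  # (0, 1, 0.666)
```

Indices 0 and 1 correspond to the two weather sentences, which matches the intuition that they are closer to each other than either is to the stadium sentence.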
Hugging Face makes it easy to collaboratively build and showcase your Sentence Transformers models! You can collaborate with your organization, and upload and showcase your own models in your profile ❤️

Documentation

Push your Sentence Transformers models to the Hub ❤️

Find all Sentence Transformers models on the 🤗 Hub
To upload your Sentence Transformers models to the Hugging Face Hub, log in with huggingface-cli login and use the push_to_hub method within the Sentence Transformers library.
from sentence_transformers import SentenceTransformer
# Load or train a model
model = SentenceTransformer(...)
# Push to Hub
model.push_to_hub("my_new_model")
A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers
These datasets all have "english" and "non_english" columns and can be used to make embedding models multilingual.
- sentence-transformers/parallel-sentences-wikititles • 14.7M • 180 • 1
- sentence-transformers/parallel-sentences-tatoeba • 8.35M • 1.42k
- sentence-transformers/parallel-sentences-talks • 19.6M • 1.48k • 12
- sentence-transformers/parallel-sentences-europarl • 49.7M • 514 • 1
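One way such "english" / "non_english" pairs could be used is teacher-student distillation, where a student model is trained so that both sides of a pair land near the teacher's embedding of the English sentence. The page itself does not spell out the training recipe, so treat the following as a sketch under that assumption. The encoders toy_teacher and toy_student are hypothetical stand-ins for real models, and the row is an invented example, not taken from the datasets.

```python
def mse(u, v):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def distillation_loss(row, teacher_encode, student_encode):
    """Pull the student's embeddings of both sentences toward the
    teacher's embedding of the English sentence."""
    target = teacher_encode(row["english"])
    loss_english = mse(student_encode(row["english"]), target)
    loss_non_english = mse(student_encode(row["non_english"]), target)
    return loss_english + loss_non_english


# Hypothetical stand-in encoders; real ones would be embedding models.
def toy_teacher(text):
    return [float(len(text)), 1.0]


def toy_student(text):
    return [float(len(text)), 0.0]


# Invented row using the column names described above
row = {"english": "Hello", "non_english": "Hallo"}
print(distillation_loss(row, toy_teacher, toy_student))  # 1.0
```

In a real setup the loss would be minimized with a gradient-based optimizer over many rows; the sketch only shows how the two columns enter the objective.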
models (126)

- sentence-transformers/paraphrase-multilingual-mpnet-base-v2 • Sentence Similarity • 0.3B • 2.96M • 410
- sentence-transformers/stsb-mpnet-base-v2 • Sentence Similarity • 0.1B • 11k • 12
- sentence-transformers/paraphrase-mpnet-base-v2 • Sentence Similarity • 0.1B • 940k • 44
- sentence-transformers/nli-mpnet-base-v2 • Sentence Similarity • 0.1B • 70k • 14
- sentence-transformers/multi-qa-mpnet-base-dot-v1 • Sentence Similarity • 0.1B • 1.62M • 178
- sentence-transformers/multi-qa-mpnet-base-cos-v1 • Sentence Similarity • 0.1B • 634k • 41
- sentence-transformers/all-mpnet-base-v1 • Sentence Similarity • 0.1B • 4.07k • 11
- sentence-transformers/all-mpnet-base-v2 • Sentence Similarity • 0.1B • 17.7M • 1.14k
- sentence-transformers/average_word_embeddings_levy_dependency • Sentence Similarity
- sentence-transformers/average_word_embeddings_komninos • Sentence Similarity • 4
datasets (89)

- sentence-transformers/msmarco-scores-ms-marco-MiniLM-L6-v2 • 241M • 82 • 2
- sentence-transformers/msmarco • 527M • 650 • 5
- sentence-transformers/msmarco-msmarco-MiniLM-L6-v3 • 80.6M • 502 • 4
- sentence-transformers/NanoTouche2020-bm25 • 5.84k • 27
- sentence-transformers/NanoSciFact-bm25 • 3.02k • 49
- sentence-transformers/NanoArguAna-bm25 • 3.74k • 38
- sentence-transformers/NanoSCIDOCS-bm25 • 2.31k • 53
- sentence-transformers/NanoQuoraRetrieval-bm25 • 5.15k • 61
- sentence-transformers/NanoNQ-bm25 • 5.14k • 565
- sentence-transformers/NanoNFCorpus-bm25 • 3.05k • 541