setu4993/LaBSE
setu4993/LaBSE is a multilingual sentence-embedding model for use with the Hugging Face transformers library.
About setu4993/LaBSE
Language-agnostic BERT Sentence Encoder (LaBSE) is a BERT-based model trained to produce sentence embeddings for 109 languages. The pre-training process combines masked language modeling with translation language modeling. The model is useful for getting multilingual sentence embeddings and for bi-text retrieval. The model was trained in TensorFlow; to use this version, load the model with BertModel.from_pretrained("setu4993/LaBSE") and the matching tokenizer with BertTokenizerFast.from_pretrained("setu4993/LaBSE"). To get the sentence embeddings, use the pooler_output of the model's outputs. For similarity between sentences, L2-normalizing the embeddings is recommended before calculating the similarity, for example F.normalize(embeddings_1, p=2).
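The recommended similarity step can be illustrated without downloading the model. The following is a minimal sketch in plain Python, assuming toy 3-dimensional vectors in place of real sentence embeddings; the names l2_normalize, similarity, toy_english and toy_italian are hypothetical and chosen only for illustration. It mirrors what F.normalize(x, p=2) followed by a matrix product would compute on PyTorch tensors, which is the cosine similarity between every pair of rows.

```python
import math


def l2_normalize(vectors, eps=1e-12):
    """Scale each row to unit L2 norm, as F.normalize(x, p=2, dim=1) does."""
    normalized = []
    for row in vectors:
        norm = math.sqrt(sum(v * v for v in row))
        scale = max(norm, eps)  # guard against division by zero for an all-zero row
        normalized.append([v / scale for v in row])
    return normalized


def similarity(embeddings_1, embeddings_2):
    """Return the matrix of dot products between the unit-normalized rows."""
    a = l2_normalize(embeddings_1)
    b = l2_normalize(embeddings_2)
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b] for row_a in a]


# Toy vectors standing in for real sentence embeddings (an assumption for this sketch).
toy_english = [[3.0, 4.0, 0.0], [0.0, 0.0, 2.0]]
toy_italian = [[6.0, 8.0, 0.0], [1.0, 0.0, 0.0]]

# scores[i][j] is the cosine similarity between toy_english[i] and toy_italian[j].
scores = similarity(toy_english, toy_italian)
```

Because the rows are normalized first, vectors that point in the same direction score close to 1 regardless of their length, which is why normalization is applied before comparing embeddings of sentences in different languages.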