ceshine/TinyBERT_L-4_H-312_v2-distill-AllNLI

ceshine/TinyBERT_L-4_H-312_v2-distill-AllNLI is a machine learning model.

About ceshine/TinyBERT_L-4_H-312_v2-distill-AllNLI

This model is distilled from the bert-base-nli-stsb-mean-tokens pre-trained model from Sentence-Transformers. The embedding vector is obtained by mean (average) pooling of the last layer's hidden states. We compute cosine similarity scores between the embeddings of each sentence pair, then report the Spearman correlation of those scores against the gold similarity labels of the STS benchmark (bigger is better).

Update 20210325: Added the attention matrices imitation objective as in the TinyBERT paper, and the distill target has been changed from a distilbert-base model to bert-base-nli-stsb-mean-tokens.
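The evaluation pipeline described above (mean pooling, cosine similarity, Spearman correlation) can be illustrated with a minimal pure-Python sketch. It does not load the actual model: the toy hidden states, the attention mask, and the helper names `mean_pool`, `cosine_similarity`, `average_ranks`, and `spearman` are all assumptions made for illustration, and skipping padding positions during pooling is a common convention rather than something stated on this page.

```python
import math


def mean_pool(hidden_states, attention_mask):
    """Average the token vectors whose mask value is 1 (padding is skipped)."""
    dim = len(hidden_states[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(hidden_states, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    return [t / max(count, 1) for t in total]


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy last-layer hidden states: 3 token positions, 2 dimensions.
# The last position is padding, so its mask value is 0.
toy_tokens = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
toy_mask = [1, 1, 0]
toy_embedding = mean_pool(toy_tokens, toy_mask)

# Hypothetical predicted similarities and gold labels for three sentence pairs.
predicted = [0.1, 0.5, 0.9]
gold = [1.0, 3.0, 5.0]
correlation = spearman(predicted, gold)
```

Spearman correlation only compares rankings, so the cosine scores do not need to be on the same scale as the benchmark's labels; what matters is whether more similar pairs receive higher scores.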