Question 1

What is the DarshanDeshpande/marathi-distilbert model?

Accepted Answer

This version of Marathi-DistilBERT is trained from scratch on approximately 11.2 million sentences. It is trained with the Adam optimizer at a learning rate of 1e-4, using the default β1 and β2 values of 0.9 and 0.999 respectively, with a total batch size of 256 on a v3-8 TPU and a mask probability of 15%. …
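To make the quoted hyperparameters concrete, here is a minimal sketch that collects them as constants and shows what a 15% mask probability means when choosing token positions. It is not the author's training code. The constant names, the `select_mask_positions` helper, and the even split of the batch across 8 TPU cores are assumptions made for illustration; the answer does not say how the batch was sharded or how selected tokens were replaced.

```python
import random

# Hyperparameters quoted in the answer above.
LEARNING_RATE = 1e-4
ADAM_BETA1 = 0.9
ADAM_BETA2 = 0.999
TOTAL_BATCH_SIZE = 256
MASK_PROBABILITY = 0.15

# Assumption: a v3-8 TPU exposes 8 cores and the total batch is split evenly.
TPU_CORES = 8
per_core_batch_size = TOTAL_BATCH_SIZE // TPU_CORES


def select_mask_positions(num_tokens, mask_probability=MASK_PROBABILITY, seed=0):
    """Hypothetical helper: pick each token position independently
    with the given probability and return the chosen indices in order."""
    rng = random.Random(seed)
    return [i for i in range(num_tokens) if rng.random() < mask_probability]


if __name__ == "__main__":
    positions = select_mask_positions(10_000)
    print("per-core batch size:", per_core_batch_size)
    print("fraction of positions selected:", len(positions) / 10_000)
```

Under the even-split assumption each core would see 32 examples per step, and over a long sequence the selected fraction should sit close to 0.15. The sketch only selects positions; how the selected tokens are then replaced is left out because the answer does not describe it.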

Question 2

Who created DarshanDeshpande/marathi-distilbert?

Accepted Answer

Publisher information for DarshanDeshpande/marathi-distilbert is not recorded in our dataset.
