DeepPavlov/rubert-base-cased
DeepPavlov/rubert-base-cased is RuBERT, a cased BERT-base language model for Russian published by DeepPavlov.
About DeepPavlov/rubert-base-cased
RuBERT was trained on the Russian part of Wikipedia and on Russian news data. We used this training data to build a vocabulary of Russian subtokens, and we took a multilingual version of BERT-base as the initialization for RuBERT [1].
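To make the idea of a subtoken vocabulary concrete, here is a minimal sketch of the greedy longest-match splitting that BERT-style WordPiece tokenizers apply to a single word. The function name `wordpiece_tokenize` and the tiny `TOY_VOCAB` are hypothetical and chosen only for illustration; they are not the real RuBERT vocabulary or DeepPavlov's code.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one word into subtokens."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry a "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, NOT the actual RuBERT subtoken vocabulary.
TOY_VOCAB = {"моск", "##ва", "##овский", "город", "##а"}

print(wordpiece_tokenize("москва", TOY_VOCAB))      # ['моск', '##ва']
print(wordpiece_tokenize("московский", TOY_VOCAB))  # ['моск', '##овский']
```

A vocabulary built from Russian text can keep frequent Russian stems and endings as whole pieces, which is the motivation for building one from the training data. To inspect the published vocabulary itself, the tokenizer can usually be loaded with Hugging Face Transformers via `AutoTokenizer.from_pretrained("DeepPavlov/rubert-base-cased")`, assuming the `transformers` package and network access are available.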