cahya/bert-base-indonesian-1.5G
The cahya/bert-base-indonesian-1.5G model is an uncased BERT-base language model for Indonesian.
About cahya/bert-base-indonesian-1.5G
It is a BERT-base model pre-trained on Indonesian Wikipedia and Indonesian newspapers using a masked language modeling (MLM) objective. The model is uncased. It is one of several language models that have been pre-trained. The texts are lowercased and tokenized using WordPiece with a vocabulary size of 32,000. The inputs of the model are of the form: [CLS] Sentence A [SEP] Sentence B [SEP]. An example input text is "Silakan diganti dengan text apa saja" (Indonesian for "Please replace this with any text"). The model was pre,
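As a rough illustration of the input format described above, the Python sketch below lowercases one or two sentences and joins them with special tokens. It assumes the standard BERT layout of [CLS] Sentence A [SEP] Sentence B [SEP]. The function name build_pair_input is hypothetical, and plain whitespace splitting stands in for WordPiece, so the output shows only the overall layout, not the model's real subword tokens.

```python
def build_pair_input(sentence_a, sentence_b=None):
    """Assemble an uncased, BERT-style input sequence.

    Assumption: whitespace splitting is a stand-in for WordPiece, which
    would further break words into subword units from the 32,000-entry
    vocabulary.
    """
    # Uncased model: lowercase the text before tokenizing.
    tokens_a = sentence_a.lower().split()
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"]
    # Segment 0 covers [CLS], sentence A, and the first [SEP].
    segment_ids = [0] * len(tokens)

    if sentence_b is not None:
        tokens_b = sentence_b.lower().split()
        tokens += tokens_b + ["[SEP]"]
        # Segment 1 covers sentence B and the closing [SEP].
        segment_ids += [1] * (len(tokens_b) + 1)

    return tokens, segment_ids


tokens, segment_ids = build_pair_input("Silakan diganti", "dengan text apa saja")
print(tokens)
print(segment_ids)
```

In practice these steps would normally be handled by the tokenizer shipped with the model (for example, one loaded through the Hugging Face transformers library) rather than by hand-written code like this.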