bionlp/bluebert_pubmed_uncased_L-12_H-768_A-12

bionlp/bluebert_pubmed_uncased_L-12_H-768_A-12 is a machine learning model.

About bionlp/bluebert_pubmed_uncased_L-12_H-768_A-12

BlueBERT is a BERT model pre-trained on PubMed abstracts (pre-trained model: https://huggingface.co/bert-base-uncased-pre-trained ). The pre-training corpus contains ~4000M words extracted from the PubMed ASCII code version. The texts were pre-processed before pre-training: each text was tokenized with the NLTK Treebank tokenizer, TreebankWordTokenizer().tokenize(value), and then cleaned with a regular-expression substitution, re.sub(r"\s's,
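The pre-processing described above can be sketched roughly as follows. This is a minimal illustration, not the authors' exact script: the function name `preprocess` is hypothetical, joining the tokens with single spaces is an assumption, and because the `re.sub` call is cut off in the description, the pattern used here (re-attaching a detached possessive `'s` to the preceding word) is only a guess at its intent.

```python
import re

from nltk.tokenize import TreebankWordTokenizer


def preprocess(value: str) -> str:
    """Tokenize text with the NLTK Treebank tokenizer and tidy possessives.

    Hypothetical helper: a sketch of the described procedure, not the
    original pre-processing code.
    """
    # The Treebank tokenizer is rule-based and needs no downloaded data.
    tokens = TreebankWordTokenizer().tokenize(value)

    # Assumption: tokens are re-joined with single spaces.
    sentence = " ".join(tokens)

    # Assumption: the truncated re.sub call re-attaches a split-off "'s"
    # (e.g. "patient 's" -> "patient's"). The real pattern is not shown
    # in full in the description.
    sentence = re.sub(r"\s's\b", "'s", sentence)
    return sentence


if __name__ == "__main__":
    print(preprocess("The patient's fever resolved."))
```

Text prepared this way would then be passed to the model's own WordPiece tokenizer; the sketch covers only the step before that.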
