
Megatron-BERT

NVIDIA
Language modeling/generation

The Megatron-BERT model is a frontier language modeling/generation model from NVIDIA with 3.9 billion parameters.

About Megatron-BERT

Recent work in language modeling demonstrates that training large transformer models advances the state of the art in Natural Language Processing applications. However, very large models can be quite difficult to train due to memory constraints. In t

Details

Provider
NVIDIA
Task
Language modeling/generation
Parameters
3.9 billion
Released
2019-09-17
Open weights
No
