Mesh-TensorFlow Transformer 4.9B (language)

Google Brain | Language modeling/generation | Translation

Mesh-TensorFlow Transformer 4.9B (language) is a language modeling/generation model published by Google Brain in 2018, with 4.9 billion parameters.

About Mesh-TensorFlow Transformer 4.9B (language)

Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers fr

Details

Provider
Google Brain
Task
Language modeling/generation, Translation
Parameters
4,900,000,000 (4.9B)
Released
2018-11-05
Open weights
No
