Mesh-TensorFlow Transformer 2.9B (translation)

Google Brain, Language modeling/generation, Translation

Developed by Google Brain in 2018, Mesh-TensorFlow Transformer 2.9B (translation) is a language modeling/generation model with 2.9 billion parameters.

About Mesh-TensorFlow Transformer 2.9B (translation)

Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers fr

Details

Provider
Google Brain
Task
Language modeling/generation, Translation
Parameters
2.9 billion (2,900,000,000)
Released
2018-11-05
Open weights
No
