Mesh-TensorFlow Transformer 4.9B (language)
Google Brain · Language modeling/generation · Translation
Mesh-TensorFlow Transformer 4.9B (language) is a language modeling/generation model published by Google Brain in 2018, with 4.9 billion parameters.
About Mesh-TensorFlow Transformer 4.9B (language)
Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers fr
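As a rough illustration of the batch-splitting idea described above, the sketch below simulates data-parallelism in plain Python. It is not code from Mesh-TensorFlow or from the model itself: the 1-D linear model, the function names (`grad_mse`, `split_batch`, `data_parallel_grad`), and the toy data are all assumptions made for this example. Every simulated worker runs the same gradient function on its own shard of the batch (the SPMD pattern), and the per-worker gradients are then averaged.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # hypothetical (x, y) training pair


def grad_mse(w: float, batch: List[Example]) -> float:
    """Gradient of mean squared error for the toy 1-D model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def split_batch(batch: List[Example], num_workers: int) -> List[List[Example]]:
    """Split a batch into equal shards, one per simulated worker."""
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    size = len(batch) // num_workers
    return [batch[i * size:(i + 1) * size] for i in range(num_workers)]


def data_parallel_grad(w: float, batch: List[Example], num_workers: int) -> float:
    """Run the same program on every shard, then average the local gradients."""
    shards = split_batch(batch, num_workers)
    local_grads = [grad_mse(w, shard) for shard in shards]  # same code, different data
    return sum(local_grads) / num_workers  # stands in for an all-reduce mean


# Toy data; with equal-sized shards the averaged gradient matches the full-batch one.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
full = grad_mse(0.5, batch)
sharded = data_parallel_grad(0.5, batch, num_workers=2)
```

Note that in this sketch every worker holds a full copy of the single parameter `w`; only the data is divided among workers.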
Details
- Provider
- Google Brain
- Task
- Language modeling/generation, Translation
- Parameters
- 4,900,000,000 (4.9B)
- Released
- 2018-11-05
- Open weights
- No