DeepNet
Microsoft Research | Language modeling | Translation | Language modeling/generation
Developed by Microsoft Research and released in 2022, DeepNet is a model for language modeling and translation with 3.2 billion parameters.
About DeepNet
In this paper, we propose a simple yet effective method to stabilize extremely deep Transformers. Specifically, we introduce a new normalization function (DeepNorm) to modify the residual connection in the Transformer, accompanied by theoretically der
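The idea of modifying the residual connection can be illustrated with a short sketch. This is a minimal illustration, not the authors' implementation. It assumes that DeepNorm scales the residual input by a constant alpha before a standard layer normalization, that is, LayerNorm(alpha * x + sublayer(x)). The depth-dependent choice of alpha shown here, (2N)^(1/4) for an N-layer encoder-only stack, is likewise an assumption that should be checked against the paper. All function names are hypothetical.

```python
import math


def layer_norm(x, eps=1e-5):
    """Plain layer normalization over one vector (no learned gain or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def deepnorm_residual(x, sublayer, alpha):
    """Assumed DeepNorm form: LayerNorm(alpha * x + sublayer(x)).

    `sublayer` stands in for an attention or feed-forward block.
    """
    fx = sublayer(x)
    return layer_norm([alpha * a + b for a, b in zip(x, fx)])


def post_ln_residual(x, sublayer):
    """Standard Post-LN residual, which is the alpha = 1 special case."""
    return deepnorm_residual(x, sublayer, alpha=1.0)


def assumed_alpha(num_layers):
    """Assumed encoder-only scaling constant: (2N)^(1/4)."""
    return (2 * num_layers) ** 0.25


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    toy_sublayer = lambda v: [0.5 * t for t in v]  # hypothetical stand-in block
    alpha = assumed_alpha(num_layers=8)
    print(alpha)
    print(deepnorm_residual(x, toy_sublayer, alpha))
```

With alpha greater than 1, the residual input carries more weight than the sublayer output before normalization. That up-weighting is the only change to the residual connection this sketch models. The initialization scheme that the description goes on to mention is not modeled here.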
Details
- Provider: Microsoft Research
- Task: Language modeling, Translation, Language modeling/generation
- Parameters: 3.2 billion (3,200,000,000)
- Released: 2022-03-01
- Open weights: No