DeepNet

Microsoft Research | Language modeling | Translation | Language modeling/generation

Developed by Microsoft Research in 2022, DeepNet is a language modeling and translation model with 3.2 billion parameters.

About DeepNet

In this paper, we propose a simple yet effective method to stabilize extremely deep Transformers. Specifically, we introduce a new normalization function (DeepNorm) to modify the residual connection in Transformer, accompanying with theoretically der
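The residual modification described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the DeepNorm form from the DeepNet paper, in which the residual input is scaled by a constant before layer normalization, and it assumes the constants the paper gives for an encoder-only or decoder-only stack of N layers. The names `deepnorm_constants`, `deepnorm_residual`, `alpha`, and `beta` are hypothetical and do not appear on this page.

```python
import math


def layer_norm(x, eps=1e-5):
    """Plain layer normalization over a 1-D list (no learned gain or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def deepnorm_constants(num_layers):
    """Assumed constants for an encoder-only or decoder-only stack of N layers.

    alpha scales the residual branch; beta scales the initialization of
    selected sublayer weights (beta is returned for reference, not used below).
    """
    alpha = (2 * num_layers) ** 0.25
    beta = (8 * num_layers) ** -0.25
    return alpha, beta


def deepnorm_residual(x, sublayer, alpha):
    """Assumed DeepNorm update: LayerNorm(alpha * x + sublayer(x))."""
    out = sublayer(x)
    return layer_norm([alpha * xi + oi for xi, oi in zip(x, out)])


# Usage: a toy "sublayer" that halves its input, in an 8-layer stack.
alpha, beta = deepnorm_constants(8)
y = deepnorm_residual([1.0, 2.0, 3.0], lambda v: [0.5 * t for t in v], alpha)
```

A standard Post-LN block corresponds to `alpha = 1`. The larger `alpha` weights the residual path more heavily relative to the sublayer output, which is the stabilizing idea the abstract refers to.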

Details

Provider
Microsoft Research
Task
Language modeling, Translation, Language modeling/generation
Parameters
3.2 billion (3,200,000,000)
Released
2022-03-01
Open weights
No
