Segatron-XL large, M=384 + HCP
Microsoft Research, University of Waterloo | Language modeling
Developed by Microsoft Research and the University of Waterloo in 2022, Segatron-XL large, M=384 + HCP is a language model with approximately 257 million parameters.
About Segatron-XL large, M=384 + HCP
Class-based language models (LMs) have been long devised to address context sparsity in n-gram LMs. In this study, we revisit this approach in the context of neural LMs. We hypothesize that class-based prediction leads to an implicit context aggregat
Details
- Provider: Microsoft Research, University of Waterloo
- Task: Language modeling
- Parameters: 257,000,000
- Released: 2022-03-21
- Open weights: No