Segatron-XL large, M=384 + HCP

Microsoft Research | University of Waterloo | Language modeling

Developed by Microsoft Research and the University of Waterloo in 2022, Segatron-XL large, M=384 + HCP is a language model with approximately 257 million parameters.

About Segatron-XL large, M=384 + HCP

Class-based language models (LMs) have been long devised to address context sparsity in n-gram LMs. In this study, we revisit this approach in the context of neural LMs. We hypothesize that class-based prediction leads to an implicit context aggregat

Details

Provider
Microsoft Research, University of Waterloo
Task
Language modeling
Parameters
Approximately 257 million
Released
2022-03-21
Open weights
No