Transformer-XL DeFINE (141M)

University of Washington | Allen Institute for AI | Language modeling

Transformer-XL DeFINE (141M) is a language model from the University of Washington and the Allen Institute for AI, released in 2019 with 141 million parameters.

About Transformer-XL DeFINE (141M)

For sequence models with large vocabularies, a majority of network parameters lie in the input and output layers. In this work, we describe a new method, DeFINE, for learning deep token representations efficiently. Our architecture uses a hierarchica

Details

Provider
University of Washington, Allen Institute for AI
Task
Language modeling
Parameters
141,000,000
Released
2019-11-27
Open weights
No
