Transformer-XL DeFINE (141M)
University of Washington, Allen Institute for AI | Language modeling
Transformer-XL DeFINE (141M) is a language model from the University of Washington and the Allen Institute for AI, released in 2019 with 141 million parameters.
About Transformer-XL DeFINE (141M)
For sequence models with large vocabularies, a majority of network parameters lie in the input and output layers. In this work, we describe a new method, DeFINE, for learning deep token representations efficiently. Our architecture uses a hierarchica
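The claim that input and output layers dominate the parameter count can be checked with simple arithmetic. The sketch below is illustrative only: the vocabulary size, embedding width, and count of remaining parameters are hypothetical values chosen for the example, not figures for this model, and the helper name `embedding_param_share` is likewise an assumption.

```python
def embedding_param_share(vocab_size, embed_dim, other_params, tied=False):
    """Fraction of all parameters held by token input/output layers.

    Assumes a plain lookup table of shape (vocab_size, embed_dim) for the
    input and a same-sized output projection without bias. With tied=True
    the output projection reuses the input table, so it is counted once.
    """
    input_params = vocab_size * embed_dim
    output_params = 0 if tied else vocab_size * embed_dim
    token_params = input_params + output_params
    return token_params / (token_params + other_params)


# Hypothetical configuration (not taken from the model described here).
VOCAB_SIZE = 250_000        # assumed vocabulary size
EMBED_DIM = 512             # assumed embedding width
OTHER_PARAMS = 40_000_000   # assumed parameters in all remaining layers

share_untied = embedding_param_share(VOCAB_SIZE, EMBED_DIM, OTHER_PARAMS)
share_tied = embedding_param_share(VOCAB_SIZE, EMBED_DIM, OTHER_PARAMS, tied=True)

print(f"untied input/output layers: {share_untied:.1%} of parameters")
print(f"tied input/output layers:   {share_tied:.1%} of parameters")
```

Under these assumed numbers the token layers hold well over half of the parameters even when the input and output weights are tied, which is the situation that motivates a cheaper token representation.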
Details
- Provider: University of Washington, Allen Institute for AI
- Task: Language modeling
- Parameters: 141,000,000
- Released: 2019-11-27
- Open weights: No