Routing Transformer (WT-103)

Google Research · Language modeling · Open weights

Routing Transformer (WT-103) is a language model developed by Google Research and released in 2020. It has about 79.5 million parameters, and its weights are openly available.

About Routing Transformer (WT-103)

Self-attention has recently been adopted for a wide range of sequence modeling problems. Despite its effectiveness, self-attention suffers from quadratic compute and memory requirements with respect to sequence length. Successful approaches to reduce

Details

Provider: Google Research
Task: Language modeling
Parameters: 79,500,000
Released: 2020-03-12
Open weights: Yes