
Sandwich Transformer

Allen Institute for AI | Facebook AI Research | Language modeling

Developed by Allen Institute for AI and Facebook AI Research in 2019, Sandwich Transformer is a language model with 209 million parameters.

About Sandwich Transformer

Multilayer transformer networks consist of interleaved self-attention and feedforward sublayers. Could ordering the sublayers in a different pattern lead to better performance? We generate randomly ordered transformers and train them with the languag
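To make the idea of sublayer ordering concrete, here is a minimal sketch that writes an ordering as a string, with `s` for a self-attention sublayer and `f` for a feedforward sublayer. The function names and the string notation are illustrative assumptions, as is the choice to keep the number of each sublayer type fixed when shuffling. This is not the authors' code.

```python
import random


def interleaved(n_layers: int) -> str:
    """Baseline ordering: self-attention (s) then feedforward (f), repeated."""
    return "sf" * n_layers


def random_ordering(n_layers: int, seed: int) -> str:
    """Shuffle the baseline's sublayers into a random order.

    Assumption for illustration: the counts of s and f stay equal,
    so the shuffled model has the same sublayers as the baseline.
    """
    sublayers = list(interleaved(n_layers))
    random.Random(seed).shuffle(sublayers)
    return "".join(sublayers)


if __name__ == "__main__":
    print(interleaved(4))              # interleaved baseline
    print(random_ordering(4, seed=0))  # one randomly ordered candidate
```

Each string produced this way describes one candidate architecture. Training and comparing such candidates is outside the scope of this sketch.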

Details

Provider
Allen Institute for AI, Facebook AI Research
Task
Language modeling
Parameters
209,000,000
Released
2019-11-10
Open weights
No