Sandwich Transformer
Allen Institute for AI, Facebook AI Research | Language modeling
Developed by the Allen Institute for AI and Facebook AI Research in 2019, Sandwich Transformer is a language model with 209 million parameters.
About Sandwich Transformer
Multilayer transformer networks consist of interleaved self-attention and feedforward sublayers. Could ordering the sublayers in a different pattern lead to better performance? We generate randomly ordered transformers and train them with the languag
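As a rough illustration of the idea described above, the sketch below represents a sublayer ordering as a string of `s` (self-attention) and `f` (feedforward) characters and draws a random ordering. The function names, the string encoding, and the choice to keep equal numbers of each sublayer type are assumptions made for this example only; it is not the authors' code.

```python
import random


def interleaved_order(n_pairs):
    """Baseline ordering: self-attention (s) and feedforward (f) alternate."""
    return "sf" * n_pairs


def random_order(n_pairs, seed=None):
    """Shuffle n_pairs 's' and n_pairs 'f' sublayers into a random ordering.

    Assumption for this sketch: the shuffled model keeps the same number of
    each sublayer type as the interleaved baseline.
    """
    rng = random.Random(seed)
    layers = list("s" * n_pairs + "f" * n_pairs)
    rng.shuffle(layers)
    return "".join(layers)


if __name__ == "__main__":
    print("baseline:", interleaved_order(4))
    print("random:  ", random_order(4, seed=0))
```

Each resulting string could then be mapped to a stack of sublayer modules and trained, so that orderings can be compared against the interleaved baseline.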
Details
- Provider: Allen Institute for AI, Facebook AI Research
- Task: Language modeling
- Parameters: 209,000,000
- Released: 2019-11-10
- Open weights: No