PermuteFormer

Peking University | Language modeling

Developed by Peking University in 2021, PermuteFormer is a language model with 149,697,024 parameters (about 150 million).

About PermuteFormer

A recent variation of Transformer, Performer, scales Transformer to longer sequences with a linear attention mechanism. However, it is not compatible with relative position encoding, which has advantages over absolute position encoding. In this paper

Details

Provider
Peking University
Task
Language modeling
Parameters
149,697,024
Released
2021-09-06
Open weights
No