facebook/wiki_dpr
Fill Mask | Text Generation | EN | cc-by-nc-4.0
facebook/wiki_dpr is an English (EN) dataset from facebook, tagged for the Fill Mask and Text Generation tasks, with 126,091,800 records in Parquet format. It is distributed under the cc-by-nc-4.0 license, is listed in the 10M<n<100M size category, and has been downloaded about 68K times.
About facebook/wiki_dpr
This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words as pa...
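The splitting step described above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the function name `split_into_passages`, the whitespace tokenization, and the decision to keep a shorter final block are all assumptions, since the description here does not specify them.

```python
from typing import Iterator


def split_into_passages(text: str, words_per_passage: int = 100) -> Iterator[str]:
    """Yield consecutive, non-overlapping blocks of at most `words_per_passage` words.

    Hypothetical helper: whitespace tokenization and keeping the short
    trailing block are assumptions, not details confirmed by the dataset card.
    """
    if words_per_passage <= 0:
        raise ValueError("words_per_passage must be positive")
    words = text.split()  # assumption: words are whitespace-delimited
    for start in range(0, len(words), words_per_passage):
        # Each slice starts where the previous one ended, so blocks are disjoint.
        yield " ".join(words[start:start + words_per_passage])


# Usage: a synthetic 250-word "article" yields blocks of 100, 100, and 50 words.
article = " ".join(f"w{i}" for i in range(250))
passages = list(split_into_passages(article))
print([len(p.split()) for p in passages])  # [100, 100, 50]
```

Because the blocks are disjoint and consecutive, joining them with single spaces reproduces the whitespace-normalized article, which is a convenient sanity check when preparing passages for retrieval.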
Details
- Task: Fill Mask, Text Generation
- Language: EN
- Format: Parquet
- Rows / instances: 126,091,800
- Size: 10M<n<100M
- Creator: facebook
- Year: 2022
- License: cc-by-nc-4.0
- Downloads: 68,023
- Likes: 45