
facebook/wiki_dpr

Fill Mask, Text Generation, EN, cc-by-nc-4.0

facebook/wiki_dpr is an English (EN) dataset from facebook, tagged for the Fill Mask and Text Generation tasks, with 126,091,800 records in Parquet format. It is distributed under the cc-by-nc-4.0 license, is listed in the 10M<n<100M size category, and has been downloaded about 68K times.

About facebook/wiki_dpr

This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model. It contains 21M passages from Wikipedia along with their DPR embeddings. The Wikipedia articles were split into multiple, disjoint text blocks of 100 words as pa...
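The passage construction described above (disjoint 100-word blocks) can be illustrated with a minimal sketch. This is not the original DPR preprocessing code: the function name `split_into_passages` is hypothetical, words are assumed to be whitespace-separated tokens, and keeping the final shorter block as-is is an assumption.

```python
def split_into_passages(text: str, words_per_passage: int = 100) -> list[str]:
    """Split text into consecutive, non-overlapping blocks of at most N words.

    Assumption: a "word" is a whitespace-separated token, and the last
    block is kept even if it has fewer than N words.
    """
    if words_per_passage <= 0:
        raise ValueError("words_per_passage must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + words_per_passage])
        for i in range(0, len(words), words_per_passage)
    ]


# A synthetic 250-word "article" yields blocks of 100, 100, and 50 words.
article = " ".join(f"w{i}" for i in range(250))
passages = split_into_passages(article)
print([len(p.split()) for p in passages])  # [100, 100, 50]
```

Because the blocks are disjoint, joining them back together with single spaces reproduces the whitespace-normalized article, so no word appears in more than one passage.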

Details

Task: Fill Mask, Text Generation
Language: EN
Format: Parquet
Rows / instances: 126,091,800
Size: 10M<n<100M
Creator: facebook
Year: 2022
License: cc-by-nc-4.0
Downloads: 68,023
Likes: 45

