jhu-clsp/mmBERT-decay-data

Fill Mask | English | mit

jhu-clsp/mmBERT-decay-data is a fill-mask dataset from jhu-clsp, tagged as English and distributed in Parquet format under the MIT license. It has been downloaded 33,129 times.

About jhu-clsp/mmBERT-decay-data

mmBERT Decay Phase Data
Phase 3 of 3: annealed language learning decay phase (100B tokens) with massive multilingual expansion to 1833 languages.
📊 Data Composition
NOTE: there are multiple decay data mixtures: this mixture de...

Details

Task: Fill Mask
Language: English
Format: Parquet
Rows / instances: N/A
Creator: jhu-clsp
Year: 2025
License: mit
Downloads: 33129
Likes: 6
