joelniklaus/Multi_Legal_Pile
Fill Mask · BG, CS, DA · cc-by-nc-sa-4.0
joelniklaus/Multi_Legal_Pile is a fill-mask dataset in BG, CS, and DA, published by joelniklaus in Parquet format. It is distributed under the cc-by-nc-sa-4.0 license, falls in the 10M<n<100M size category, and has been downloaded about 5K times.
About joelniklaus/Multi_Legal_Pile
# Dataset Card for MultiLegalPile: A Large-Scale Multilingual Corpus for the Legal Domain
## Table of Contents
- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)…
Details
- Task: Fill Mask
- Language: BG, CS, DA
- Format: Parquet
- Rows / instances: N/A
- Size: 10M<n<100M
- Creator: joelniklaus
- Year: 2022
- License: cc-by-nc-sa-4.0
- Downloads: 4966
- Likes: 64