cryscan/multilingual-share
General NLP · EN, ZH · cc0-1.0
The cryscan/multilingual-share dataset is an English and Chinese (EN, ZH) General NLP resource published by cryscan in 2023. It has 46 downloads and 32 likes, is released under the cc0-1.0 license, and falls in the 100K<n<1M size category.
About cryscan/multilingual-share
Multilingual Share GPT
Multilingual Share GPT is a free multi-language corpus for LLM training. All text is converted to Markdown format and classified by language.
GitHub Repo
The project also has a GitHub repository.
Data Example
...
Details
- Task: General NLP
- Language: EN, ZH
- Format: Parquet
- Rows / instances: N/A
- Size: 100K<n<1M
- Creator: cryscan
- Year: 2023
- License: cc0-1.0
- Downloads: 46
- Likes: 32
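
The details above can be turned into a short loading sketch. It is only an illustration under stated assumptions: that the identifier cryscan/multilingual-share refers to a Hugging Face Hub dataset repository, that the repository loads with its default configuration through the `datasets` library, and that rows carry a language field. The column name `language` and the helper names `load_share_gpt` and `group_by_language` are hypothetical and not taken from this card, so inspect the real schema before relying on them.

```python
from collections import defaultdict

REPO_ID = "cryscan/multilingual-share"  # dataset identifier from this card


def load_share_gpt(repo_id=REPO_ID):
    """Download the dataset (needs network access and the `datasets` package)."""
    # Imported lazily so the rest of this sketch runs without the library.
    from datasets import load_dataset

    # Assumption: the repository loads with its default configuration.
    return load_dataset(repo_id)


def group_by_language(records, language_key="language"):
    """Bucket records by a language field.

    `language_key` is a hypothetical column name; check the real schema first.
    Records without that field are collected under "unknown".
    """
    groups = defaultdict(list)
    for record in records:
        groups[record.get(language_key, "unknown")].append(record)
    return dict(groups)


if __name__ == "__main__":
    # Toy rows standing in for real data, which is not reproduced here.
    sample = [
        {"language": "en", "text": "# Hello"},
        {"language": "zh", "text": "# 你好"},
        {"language": "en", "text": "Another row"},
    ]
    for lang, rows in group_by_language(sample).items():
        print(lang, len(rows))
```

Calling `load_share_gpt()` is kept separate from the grouping helper so that the grouping logic can be tried on small in-memory rows first, without downloading anything.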