maywell/korean_textbooks
General NLP, KO
maywell/korean_textbooks is a general-purpose NLP dataset in Korean (KO), distributed in Parquet format.
About maywell/korean_textbooks
Massive Korean synthetic dataset
This is a large-scale synthetic Korean dataset generated with Gemini Pro.
It follows the methodology described in "Creation of synthetic textbook-quality datasets" in Textbooks Are All You Need....
Details
- Task: General NLP
- Language: KO
- Format: Parquet
- Rows / instances: N/A
- Creator: maywell
- Year: 2023