globis-university/aozorabunko-clean
Text Generation | Text Classification | JA | cc-by-4.0
globis-university/aozorabunko-clean is a Japanese (JA) dataset for text generation and text classification, created by globis-university in 2023 and distributed in Parquet format. It has roughly 2.7K downloads and 46 likes. It is released under the cc-by-4.0 license and falls in the 10K<n<100K size category.
About globis-university/aozorabunko-clean
Overview
This dataset provides data from Aozora Bunko (青空文庫), a website that compiles public-domain books in Japan, in a convenient, user-friendly format suited to machine learning applications.
[For Japanese readers] An overview in Japanese is available on Qiita: https...
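As a rough illustration of how a Parquet-backed dataset like this one might be pulled into a machine learning workflow, here is a minimal sketch. It assumes the dataset is hosted on the Hugging Face Hub under the identifier shown in the title and that the third-party `datasets` library is installed; neither is stated on this page. The helper name `load_aozorabunko` is hypothetical, and no split or column names are assumed.

```python
# Minimal sketch, not an official loading recipe.
# Assumption: the dataset is available on the Hugging Face Hub under this ID.
DATASET_ID = "globis-university/aozorabunko-clean"


def load_aozorabunko(split=None, streaming=True):
    """Load the dataset with the Hugging Face `datasets` library.

    `load_aozorabunko` is a hypothetical helper name used for illustration.
    With streaming=True, records are fetched lazily instead of downloading
    every Parquet file up front. With split=None, all available splits are
    returned in a dict-like object keyed by split name.
    """
    # Imported lazily so that defining this helper does not require
    # the third-party package to be installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split, streaming=streaming)
```

Calling `load_aozorabunko()` needs network access. Inspecting the keys of the returned object is a reasonable first step, since the split and column names are not listed here.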
Details
- Task: Text Generation, Text Classification
- Language: JA
- Format: Parquet
- Rows / instances: N/A
- Size: 10K<n<100K
- Creator: globis-university
- Year: 2023
- License: cc-by-4.0
- Downloads: 2688
- Likes: 46