globis-university/aozorabunko-clean

Text Generation, Text Classification, JA, cc-by-4.0

Created by globis-university in 2023, globis-university/aozorabunko-clean is a Japanese (JA) text generation and text classification dataset in Parquet format. It has 2,688 downloads and 46 likes, is released under the cc-by-4.0 license, and falls in the 10K<n<100K size category.

About globis-university/aozorabunko-clean

Overview: This dataset provides data from Aozora Bunko (青空文庫), a website that compiles public-domain books in Japan, in a convenient, user-friendly format suited to machine learning applications. [For Japanese readers] A summary in Japanese is available on Qiita: https...
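As a rough illustration of how a dataset like this might be pulled into a machine learning workflow, the sketch below streams a few rows and prints short previews. It rests on several assumptions not stated on this page: that the dataset is hosted on the Hugging Face Hub under the ID shown above, that the `datasets` package is installed, that a `train` split exists, and that each row has a `text` field. The helper names `load_aozorabunko` and `preview` are hypothetical.

```python
def load_aozorabunko(split="train", streaming=True):
    """Load the dataset, assuming it is hosted on the Hugging Face Hub.

    Requires `pip install datasets` and network access. The split name
    "train" is an assumption, not something this page confirms.
    """
    # Imported lazily so the helpers below work without the dependency.
    from datasets import load_dataset

    return load_dataset(
        "globis-university/aozorabunko-clean",
        split=split,
        streaming=streaming,  # iterate rows without downloading everything first
    )


def preview(rows, n=3, width=40):
    """Return the first `n` rows as strings truncated to `width` characters.

    Assumes each row is a dict with a `text` field (hypothetical schema);
    rows without it yield an empty string.
    """
    snippets = []
    for i, row in enumerate(rows):
        if i >= n:
            break
        snippets.append(str(row.get("text", ""))[:width])
    return snippets


if __name__ == "__main__":
    for snippet in preview(load_aozorabunko()):
        print(snippet)
```

Streaming is used here only to keep the first run light; passing `streaming=False` would download the Parquet files in full instead.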

Details

Task: Text Generation, Text Classification
Language: JA
Format: Parquet
Rows / instances: N/A
Size: 10K<n<100K
Creator: globis-university
Year: 2023
License: cc-by-4.0
Downloads: 2688
Likes: 46
