CausalLM/Refined-Anime-Text

Text Generation | EN, ZH

Created by CausalLM in 2024, CausalLM/Refined-Anime-Text is a text generation dataset in English and Chinese (EN, ZH), distributed in Parquet format. It has 16 downloads and 273 likes. It is released under the wtfpl license and falls in the 1M<n<10M size category.

About CausalLM/Refined-Anime-Text

Refined Anime Text for Continual Pre-training of Language Models

This is a subset of our novel synthetic dataset of anime-themed text, containing over 1M entries and ~440M GPT-4/3.5 tokens. This dataset has never been publicly released before. We ...

Details

Task: Text Generation
Language: EN, ZH
Format: Parquet
Rows / instances: N/A
Size: 1M<n<10M
Creator: CausalLM
Year: 2024
License: wtfpl
Downloads: 16
Likes: 273
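
As a rough illustration of how the details above might be used in practice, the following Python sketch previews a few rows and interprets the size label. It rests on several assumptions not stated on this page: that the dataset is hosted on the Hugging Face Hub under the repository id CausalLM/Refined-Anime-Text, that it exposes a split named "train", and that the third-party `datasets` library is installed. The helper names `preview_rows` and `parse_size_bucket` are hypothetical and exist only for this example.

```python
from itertools import islice
from typing import Dict, List, Tuple


def parse_size_bucket(label: str) -> Tuple[int, int]:
    """Turn a size label such as '1M<n<10M' into numeric (low, high) bounds."""
    units = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

    def to_int(token: str) -> int:
        token = token.strip()
        if token[-1] in units:
            return int(token[:-1]) * units[token[-1]]
        return int(token)

    low, high = label.split("<n<")
    return to_int(low), to_int(high)


def preview_rows(
    repo_id: str = "CausalLM/Refined-Anime-Text",  # assumed Hub repository id
    split: str = "train",                          # assumed split name
    n: int = 3,
) -> List[Dict]:
    """Stream the first n rows without downloading every Parquet file."""
    # Imported lazily so this module loads even if `datasets` is not installed.
    from datasets import load_dataset

    stream = load_dataset(repo_id, split=split, streaming=True)
    return list(islice(stream, n))


if __name__ == "__main__":
    low, high = parse_size_bucket("1M<n<10M")
    print(f"Row count is expected to fall between {low:,} and {high:,}")
    for row in preview_rows():
        # Column names are not listed on this page, so only the keys are shown.
        print(sorted(row.keys()))
```

Streaming is used here because the row count is listed as N/A and the dataset is described as containing over 1M entries, so reading only a handful of rows is a cheaper first look than a full download.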