CausalLM/Refined-Anime-Text
Text Generation | EN, ZH
Created by CausalLM in 2024, CausalLM/Refined-Anime-Text is a text generation dataset in English and Chinese (EN, ZH), distributed in Parquet format. It has 16 downloads and 273 likes. It is released under the WTFPL license and falls in the 1M<n<10M size category.
About CausalLM/Refined-Anime-Text
Refined Anime Text for Continual Pre-training of Language Models
This is a subset of our novel synthetic dataset of anime-themed text, containing over 1M entries, ~440M GPT-4/3.5 tokens. This dataset has never been publicly released before. We ...
Details
- Task: Text Generation
- Language: EN, ZH
- Format: Parquet
- Rows / instances: N/A
- Size: 1M<n<10M
- Creator: CausalLM
- Year: 2024
- License: wtfpl
- Downloads: 16
- Likes: 273
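
Because the dataset is distributed as Parquet and has over 1M entries, it can be convenient to look at a few records before downloading everything. The following is a minimal sketch, not an official loading recipe: it assumes the Hugging Face `datasets` package is installed, that the repository is reachable under the name shown on this page, and that a split named "train" exists. The helper name `peek_rows` is hypothetical, and the column names are not listed here, so the records are returned as-is.

```python
from itertools import islice

# Repository name as given on this page.
DATASET_ID = "CausalLM/Refined-Anime-Text"


def peek_rows(n=3, split="train"):
    """Return the first n records without downloading the full dataset.

    Assumptions (not confirmed by this page): the `datasets` package is
    installed, the repository is publicly reachable, and the split is
    named "train".
    """
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    # streaming=True reads records on demand instead of fetching every file.
    stream = load_dataset(DATASET_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek_rows(3)` should return a list of up to three dictionaries, one per record, whose keys are the dataset's column names. If the split name differs, pass it through the `split` argument.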