LLM360/guru-RL-92k
General NLP, English
LLM360/guru-RL-92k is an English-language General NLP dataset distributed in Parquet format.
About LLM360/guru-RL-92k
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Dataset Description
Guru is a curated six-domain dataset for training large language models (LLMs) to perform complex reasoning with reinforcement learning (...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: LLM360
- Year: 2025
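Since the dataset is distributed as Parquet, it may help to see how it could be loaded. The following is a minimal sketch, assuming the repository is hosted on the Hugging Face Hub under the name shown on this page and that the `datasets` package is installed. The helper names `load_guru` and `describe` are hypothetical, and because this page lists neither split names nor a schema, the code avoids assuming either.

```python
REPO_ID = "LLM360/guru-RL-92k"  # repository name as given on this page


def load_guru(streaming: bool = False):
    """Load all available splits of the dataset (hypothetical helper).

    Assumes the `datasets` package is installed and the Hub is reachable.
    """
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    # No split is passed: split names are not listed on this page, so the
    # library returns whatever splits the repository defines.
    return load_dataset(REPO_ID, streaming=streaming)


def describe(dataset_dict) -> dict:
    """Map each split name to its column names, without assuming a schema."""
    return {name: split.column_names for name, split in dataset_dict.items()}
```

A typical use would be `describe(load_guru())` to inspect which splits and columns exist before writing any training code. Passing `streaming=True` avoids downloading the full files up front, which may be useful when the row count is not known in advance.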