databricks-synthetic-data-gen
Skill by databricks
Generates realistic synthetic data with Spark and Faker (the strongly recommended approach). Supports serverless execution, writes output as Parquet, JSON, CSV, or Delta, and scales from thousands to millions of rows. Small datasets (under 10K rows) can optionally be generated locally and uploaded to volumes. Use this skill when the user mentions 'synthetic data', 'test data', 'generate data', 'demo dataset', 'Faker', or 'sample data'.
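As a rough illustration of the small-dataset path described above, the sketch below generates about a thousand rows locally and serializes them to CSV. It is not the skill's own implementation. To stay dependency-free it uses Python's standard `random` module as a stand-in for Faker, and the schema, the name lists, and the helpers `generate_customers` and `to_csv` are all hypothetical. Uploading the result to a volume is left out.

```python
import csv
import io
import random
from datetime import date, timedelta

# Hypothetical value pools; with Faker these would come from its providers instead.
FIRST_NAMES = ["Ana", "Ben", "Chen", "Dara", "Eli"]
LAST_NAMES = ["Ito", "Khan", "Lopez", "Meyer", "Novak"]
PLANS = ["free", "pro", "enterprise"]


def generate_customers(n_rows, seed=42):
    """Return n_rows hypothetical customer records as a list of dicts."""
    rng = random.Random(seed)  # seeded so reruns produce identical rows
    start = date(2023, 1, 1)
    rows = []
    for i in range(n_rows):
        rows.append({
            "customer_id": i + 1,
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            # Weighted choice so most customers land on the free plan.
            "plan": rng.choices(PLANS, weights=[70, 25, 5])[0],
            "signup_date": (start + timedelta(days=rng.randrange(365))).isoformat(),
            # Log-normal spend gives a right-skewed distribution, as real spend often has.
            "monthly_spend": round(rng.lognormvariate(3.0, 0.8), 2),
        })
    return rows


def to_csv(rows):
    """Serialize a non-empty list of dict rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = generate_customers(1000)
csv_text = to_csv(rows)
```

Seeding the generator keeps demo datasets reproducible across runs. For larger volumes the same per-row logic would be distributed with Spark rather than run in a single local loop.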
Details
- Path: plugins/databricks/codex/skills/databricks-synthetic-data-gen/SKILL.md
- Dependencies: 3