add script to analyse dataset
@@ -133,6 +133,24 @@ just curate-dataset append=true
just curate-dataset append=true archive=true archive_dir=data/dataset/archive
```

Analyze dataset quality overall and per day (the report includes the best game overall and for each day):

```sh
python -m server.DatasetStats --input "good_moves-*.jsonl"
python -m server.DatasetStats --input data/dataset --output data/dataset/stats-report.json
```

The stats report now includes both:

- `best_game` (survival/length focused)
- `best_pressure_game` (focused on quality under pressure: fewer safe options combined with strong survival)

Or with `just`:

```sh
just analyze-dataset
just analyze-dataset input=data/dataset output=data/dataset/stats-report.json
```
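
The `best_game` and `best_pressure_game` entries mentioned above can be read back from the file written by `--output`. The following is a minimal sketch, assuming the report is a single JSON object with both keys at the top level; the report layout is not shown here, and `load_best_games` is a hypothetical helper, not part of `server.DatasetStats`:

```python
import json
from pathlib import Path
from typing import Any


def load_best_games(report_path: str) -> dict[str, Any]:
    """Return the two best-game entries from a stats report (None when a key is absent)."""
    report = json.loads(Path(report_path).read_text(encoding="utf-8"))
    # Assumption: both keys sit at the top level of the report object.
    return {key: report.get(key) for key in ("best_game", "best_pressure_game")}


if __name__ == "__main__":
    # Path taken from the commands above; skipped quietly if no report exists yet.
    path = "data/dataset/stats-report.json"
    if Path(path).exists():
        for name, entry in load_best_games(path).items():
            print(f"{name}: {entry}")
```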

To store compact dataset-only records (JSONL) and skip full per-game JSON files:

```sh