benchmarks
Frozen benchmark suites for DoTime.
This module exposes the consumer side of the released benchmarks: loading a versioned, immutable suite (downloading + caching it from Zenodo on first use) and iterating over its episodes for evaluation.
Public surface
Episode— one trajectory: obs/int data, intervention, ground truth.BenchmarkSuite— a named, versioned collection of episodes.load_benchmark()— fetch a suite by name (cached under~/.cache).available_suites()— list the registered suite names.
Notes for implementers
The download + parse path is stubbed where it touches real artifacts (marked
TODO(release)). The frozen on-disk format is a per-suite directory with a
manifest.json plus one or more parquet shards in the tidy schema produced by
scripts/build_release.py. Wire _parse_suite_dir() to that schema.
- class dotime.benchmarks.BenchmarkSuite(meta, episodes)[source]
Bases:
objectA named, versioned, immutable collection of
Episodeobjects.- Parameters:
meta (SuiteMetadata)
- by_structure()[source]
Yield
(structure_name, episodes)groups.Episodes with
structure is Noneare grouped under"_all".
- class dotime.benchmarks.Episode(x_obs, x_int, intervention, y_true, query_target, query_time, structure=None, scm_id=None, metadata=<factory>)[source]
Bases:
objectA single benchmark trajectory and its associated queries.
- Variables:
x_obs – Observational trajectory, shape
(T, N).x_int – Interventional trajectory under
intervention, shape(T, N).intervention – The applied intervention specification.
y_true – Ground-truth interventional outcome(s) for the query/queries, shape
(n_queries,).query_target – Index of the queried variable per query, shape
(n_queries,).query_time – Query time (float in
[0, 1]for continuous suites, or int step), shape(n_queries,).structure – Identification structure label (
"back_door", …), if applicable.scm_id – Stable id of the generating SCM within the suite.
metadata – Free-form per-episode metadata (effect magnitude, regime count, …).
- Parameters:
- intervention: InterventionSpec
- class dotime.benchmarks.SuiteMetadata(name, version, zenodo_record_id, doi, description, n_episodes, structures=(), license='CC-BY-4.0', hf_repo_id='')[source]
Bases:
objectStatic metadata for a released benchmark suite.
- Parameters:
- dotime.benchmarks.load_benchmark(name, version='latest', *, force_download=False, cache_dir=None)[source]
Load a frozen benchmark suite by name.
On first use the suite is downloaded from Zenodo into the cache directory (
~/.cache/dotimeby default, override with$DOTIME_CACHEor thecache_dirargument). Subsequent calls read from the cache.- Parameters:
- Return type:
- Returns:
BenchmarkSuite