Dataset batch hierarchies

Maturity labels

  • Now: Stable and supported in current releases.
  • Preview: Usable today, but behavior and APIs may evolve.
  • Planned: Not yet implemented.

Note

Status: Now

1) What it solves

Batch processing often devolves into nested loops with weak typing and unclear structure.

2) The idea

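Dataset[...] gives a batch keyed semantics: every item is stored under a name and typed by the model in the brackets. A Dataset can also hold other Datasets, so nesting expresses hierarchy.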
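To make the shape concrete, here is a plain-Python sketch of the kind of hand-written code that a nested, keyed structure is meant to replace. It uses only the standard library and is not how omnipy works internally. The names `InnerBatch`, `OuterBatch`, and `parse_hierarchy` are hypothetical, and converting leaves with `int()` is an assumption made for illustration.

```python
from typing import Any, Dict, Mapping

# Plain-dict stand-ins for the two levels of the hierarchy (not omnipy types).
InnerBatch = Dict[str, int]
OuterBatch = Dict[str, InnerBatch]


def parse_hierarchy(raw: Mapping[str, Mapping[str, Any]]) -> OuterBatch:
    """Convert every leaf value to int while keeping both key levels intact."""
    parsed: OuterBatch = {}
    for group_key, members in raw.items():
        # Assumption for this sketch: leaves are ints or strings of digits.
        parsed[group_key] = {
            item_key: int(value) for item_key, value in members.items()
        }
    return parsed


grouped = parse_hierarchy({'group1': {'a': '1', 'b': 2}, 'group2': {'x': 10}})
```

Every new level of grouping in this style needs another loop and another type alias, which is the bookkeeping a declared hierarchy is intended to absorb.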

3) Example

>>> from omnipy import Dataset, Model
>>> Inner = Dataset[Model[int]]
>>> Outer = Dataset[Inner]
>>> grouped = Outer({'group1': {'a': '1', 'b': 2}, 'group2': {'x': 10}})
>>> grouped.json()

4) Output / display

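╭───┬────────────────┬────────────┬────────┬──────────────────╮
│ # │ Data file name │ Type       │ Length │ Size (in memory) │
├───┼────────────────┼────────────┼────────┼──────────────────┤
│ 0 │ a              │ Model[int] │ -      │ 589 Bytes        │
│ 1 │ b              │ Model[int] │ -      │ 589 Bytes        │
╰───┴────────────────┴────────────┴────────┴──────────────────╯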

5) When to use / when not

Use it for record sets, grouped records, file collections, or keyed intermediate artifacts.

Skip it when you process a single scalar or record and need no grouping.

6) Gotchas

  • Define stable key semantics early (sample id, filename, partition key, etc.).
  • Very deep nesting usually means you need a clearer boundary between phases.
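Both gotchas can be checked with a few lines of plain Python before any data reaches a dataset. The sketch below is one possible convention, not an omnipy feature: `stable_key` and `nesting_depth` are hypothetical helpers, and using the lower-cased file stem as the key is an assumption chosen for illustration.

```python
from pathlib import PurePosixPath
from typing import Any, Mapping


def stable_key(filename: str) -> str:
    """Hypothetical convention: lower-cased file stem, no directory or extension."""
    return PurePosixPath(filename).stem.lower()


def nesting_depth(data: Any) -> int:
    """Count how many mapping levels sit above the leaf values."""
    if not isinstance(data, Mapping):
        return 0
    # An empty mapping still counts as one level.
    return 1 + max((nesting_depth(value) for value in data.values()), default=0)


files = ['runs/Sample_01.csv', 'runs/sample_02.csv']
keys = [stable_key(name) for name in files]

depth = nesting_depth({'group1': {'a': 1, 'b': 2}, 'group2': {'x': 10}})
```

A key function like this should be settled before the first batch is written, since changing it later renames every item. A depth check can serve as a cheap warning that a pipeline phase is carrying too much structure.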