
Tutorial 3: Dataset batch

This tutorial starts with batch parsing using Dataset and Model, then applies transformations to every item without explicit loops.

Setup

>>> import omnipy as om

Batch parsing with Dataset

>>> readings = om.Dataset[om.Model[int]]({'sensor_a': '1', 'sensor_b': 2.0, 'sensor_c': 3})
>>> readings
╭───┬────────────────┬────────────┬────────┬──────────────────╮
│ # │ Data file name │ Type       │ Length │ Size (in memory) │
├───┼────────────────┼────────────┼────────┼──────────────────┤
│ 0 │ sensor_a       │ Model[int] │ -      │ 589 Bytes        │
│ 1 │ sensor_b       │ Model[int] │ -      │ 589 Bytes        │
│ 2 │ sensor_c       │ Model[int] │ -      │ 589 Bytes        │
╰───┴────────────────┴────────────┴────────┴──────────────────╯
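
For comparison, here is a rough plain-Python picture of what "parsing every value as an int" means for these three inputs. This is only a sketch and not omnipy code: the helper name parse_all_as_int is hypothetical, and the built-in int() stands in for the model's parsing, on the assumption that both agree for '1', 2.0 and 3. A validating parser may treat edge cases (for example 2.5) differently from int().

```python
# Plain-Python sketch, not omnipy code: coerce each raw value to int by hand.
raw_readings = {'sensor_a': '1', 'sensor_b': 2.0, 'sensor_c': 3}


def parse_all_as_int(raw):
    """Return a new dict with every value coerced via the built-in int()."""
    return {name: int(value) for name, value in raw.items()}


parsed = parse_all_as_int(raw_readings)
print(parsed)
```

The typed container does this per-item coercion for you when it is constructed, so the parsing rule is stated once in the type rather than repeated at each call site.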

No-for-loop batch transform pattern

>>> incremented = readings.do(lambda value: int(value) + 1)
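
To see what the pattern replaces, the same increment can be written with an explicit loop over an ordinary dict. This is a plain-Python sketch for contrast only; it assumes the readings have already been parsed to the integers 1, 2 and 3, and the variable names are illustrative.

```python
# Plain-Python sketch, not omnipy code: the explicit loop that .do(...) avoids.
parsed_readings = {'sensor_a': 1, 'sensor_b': 2, 'sensor_c': 3}

incremented_by_hand = {}
for name, value in parsed_readings.items():
    # Same per-item function as the lambda passed to .do(...)
    incremented_by_hand[name] = int(value) + 1

print(incremented_by_hand)
```

The loop body is the only part that carries meaning; the batch call keeps that part (the lambda) and drops the bookkeeping around it.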

Hierarchical datasets

>>> from omnipy import Dataset, Model
>>> Inner = Dataset[Model[int]]
>>> Outer = Dataset[Inner]
>>> grouped = Outer({'group1': {'a': 1, 'b': 2}, 'group2': {'a': 10}})
>>> grouped
╭───┬────────────────┬─────────────────────┬────────┬──────────────────╮
│ # │ Data file name │ Type                │ Length │ Size (in memory) │
├───┼────────────────┼─────────────────────┼────────┼──────────────────┤
│ 0 │ group1         │ Dataset[Model[int]] │ 2      │ 1.9 kB           │
│ 1 │ group2         │ Dataset[Model[int]] │ 1      │ 1.4 kB           │
╰───┴────────────────┴─────────────────────┴────────┴──────────────────╯
>>> grouped_incremented = grouped.do(lambda dataset: dataset.do(lambda value: int(value) + 1))

You get batch behavior and hierarchy handling without writing explicit for loops.
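
The nested call has the same shape as mapping a function over the values of a dict of dicts. The sketch below mirrors it in plain Python; map_values is a hypothetical helper written for this illustration, not part of omnipy, and plain dicts stand in for the datasets.

```python
# Plain-Python sketch, not omnipy code: nested per-item transform on dicts.
def map_values(mapping, func):
    """Apply func to every value, keeping the keys and returning a new dict."""
    return {key: func(value) for key, value in mapping.items()}


grouped_plain = {'group1': {'a': 1, 'b': 2}, 'group2': {'a': 10}}

# The outer call visits each group; the inner call visits each value in it.
grouped_plus_one = map_values(
    grouped_plain,
    lambda inner: map_values(inner, lambda value: int(value) + 1),
)
print(grouped_plus_one)
```

The outer and inner keys are untouched and only the leaf values change, which is the sense in which the transform preserves structure.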

What you learned

  • Dataset[Model[int]](...) parses and batches many values in one typed container.
  • Dataset.do(...) lets you apply per-item transforms without explicit loops.
  • Nested datasets can be transformed hierarchically while preserving structure.

Common pitfalls

  • Forgetting to convert values inside lambdas when needed. When the inputs arrived in mixed forms (such as '1', 2.0 and 3), call int(value) before doing arithmetic.

Next steps