omnipy.data.dataset

CLASS DESCRIPTION
Dataset

Dict-based container of data files that follow a specific Model

FUNCTION DESCRIPTION
is_dataset_instance
is_dataset_subclass
ATTRIBUTE DESCRIPTION
dict_t

dict_t module-attribute

dict_t = dict

Dataset

Bases: DatasetDisplayMixin, TaskDatasetMixin, DataClassBase, pyd.GenericModel, UserDict[str, _ModelOrDatasetT], Generic[_ModelOrDatasetT]

Dict-based container of data files that follow a specific Model

Dataset is a generic class that cannot be instantiated directly. Instead, a Dataset class needs to be specialized with a data model before Dataset objects can be instantiated. A data model functions as a data parser and guarantees that the parsed data follows the specified model.

The specialization must be done through the use of Model, either directly, e.g.::

MyDataset = Dataset[Model[dict[str, list[int]]]]

... or indirectly, using a Model subclass, e.g.::

class MyModel(Model[dict[str, list[int]]]):
    pass

MyDataset = Dataset[MyModel]

... alternatively through the specification of a Dataset subclass::

class MyDataset(Dataset[MyModel]):
    pass

The specialization can also be done in a more deeply nested structure, e.g.::

class MyNumberList(Model[list[int]]):
    pass

class MyToplevelDict(Model[dict[str, MyNumberList]]):
    pass

class MyDataset(Dataset[MyToplevelDict]):
    pass
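
The requirement that a generic container be specialized before instantiation can be illustrated with a minimal, stdlib-only sketch. This is not omnipy's implementation (which builds on pydantic's GenericModel); `TypedContainer` and `item_type` are hypothetical names chosen for illustration only:

```python
from typing import Any, Optional


class TypedContainer:
    """Hypothetical stand-in for a generic container; not omnipy's Dataset."""

    # Only specialized subclasses carry a concrete item type.
    item_type: Optional[type] = None

    def __class_getitem__(cls, item_type: type) -> type:
        if not isinstance(item_type, type):
            raise TypeError(f'Invalid item type: {item_type!r}')
        # Build a specialized subclass that remembers its item type.
        name = f'{cls.__name__}[{item_type.__name__}]'
        return type(name, (cls,), {'item_type': item_type})

    def __init__(self, **items: Any) -> None:
        if self.item_type is None:
            raise TypeError('Specialize the container first, e.g. TypedContainer[int]')
        # Parse every value through the item type, much as a data model would.
        self.items = {key: self.item_type(val) for key, val in items.items()}


IntContainer = TypedContainer[int]
container = IntContainer(a='1', b=2)
print(container.items)
```

Calling `TypedContainer()` without a type argument raises `TypeError` in this sketch, which mirrors the rule stated above in a much simplified form.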

Once instantiated, a dataset object functions as a dict of data files, with the keys referring to the data file names and the content to the data file content, e.g.::

MyNumberListDataset = Dataset[Model[list[int]]]

my_dataset = MyNumberListDataset({'file_1': [1,2,3]})
my_dataset['file_2'] = [2,3,4]

print(my_dataset.keys())
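
The dict-like behaviour with per-file parsing can be approximated without omnipy. The following is a rough sketch built on `collections.UserDict`; `parse_int_list` and `NumberListFiles` are hypothetical names standing in for a data model and a specialized dataset, and the real Dataset class does considerably more:

```python
from collections import UserDict


def parse_int_list(value: object) -> list[int]:
    """Hypothetical parser standing in for a data model such as Model[list[int]]."""
    if isinstance(value, (str, bytes)) or not hasattr(value, '__iter__'):
        raise TypeError(f'Expected an iterable of integers, got: {type(value)}')
    return [int(item) for item in value]


class NumberListFiles(UserDict):
    """Dict of named 'data files'; each value is parsed on assignment."""

    def __setitem__(self, key: str, value: object) -> None:
        # Parsing happens before the store, so a failure leaves any
        # previous content under `key` untouched.
        self.data[key] = parse_int_list(value)


files = NumberListFiles({'file_1': [1, 2, 3]})
files['file_2'] = ['2', 3, 4]

try:
    files['file_1'] = 'not a list'
except TypeError:
    pass  # rejected; 'file_1' keeps its earlier content

print(list(files.keys()))
print(files['file_2'])
```

The sketch parses before storing. The `_set_data_file_and_validate` method in the source listing takes a different route to the same guarantee: it assigns first, validates, and restores the previous value if validation fails.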

The Dataset class is a wrapper class around the powerful GenericModel class from pydantic.

CLASS DESCRIPTION
Config
METHOD DESCRIPTION
__init__
absorb
absorb_and_replace
as_multi_model_dataset
browse

Opens the model or dataset in a browser, if possible.

clone_dataset_cls
copy
deepcopy_context
default_repr_to_terminal_str
dict
do
failed_task_details
from_data
from_json
full

Display the content of the Model or Dataset in full height.

get_type

Returns the concrete type (Model or Dataset class) used for all data files in the dataset.

json

Preview the data content of the Model or Dataset as JSON.

list

Displays a summary list of all models in the dataset.

load
load_into
peek

Display a preview of the Model or Dataset content.

pending_task_details
save
to
to_data
to_json
to_json_schema
update_forward_refs
validate

Hack to allow overwriting of the __iter__ method without compromising pydantic validation. Part of the pydantic API and not the Omnipy API.

ATTRIBUTE DESCRIPTION
available_data

TYPE: Self

config

TYPE: IsDataConfig

data

TYPE: dict[str, _ModelOrDatasetT]

failed_data

TYPE: Self

pending_data

TYPE: Self

reactive_objects

TYPE: IsReactiveObjects | None

snapshot_holder

TYPE: IsSnapshotHolder[HasContent, ContentT]

Source code in src/omnipy/data/dataset.py
class Dataset(
        DatasetDisplayMixin,
        TaskDatasetMixin,
        DataClassBase,
        pyd.GenericModel,
        UserDict[str, _ModelOrDatasetT],
        Generic[_ModelOrDatasetT],
        metaclass=_DatasetMetaclass):
    """
    Dict-based container of data files that follow a specific Model

    Dataset is a generic class that cannot be instantiated directly. Instead, a Dataset class needs
    to be specialized with a data model before Dataset objects can be instantiated. A data model
    functions as a data parser and guarantees that the parsed data follows the specified model.

    The specialization must be done through the use of Model, either directly, e.g.::

        MyDataset = Dataset[Model[dict[str, list[int]]]]

    ... or indirectly, using a Model subclass, e.g.::

        class MyModel(Model[dict[str, list[int]]]):
            pass

        MyDataset = Dataset[MyModel]

    ... alternatively through the specification of a Dataset subclass::

        class MyDataset(Dataset[MyModel]):
            pass

    The specialization can also be done in a more deeply nested structure, e.g.::

        class MyNumberList(Model[list[int]]):
            pass

        class MyToplevelDict(Model[dict[str, MyNumberList]]):
            pass

        class MyDataset(Dataset[MyToplevelDict]):
            pass

    Once instantiated, a dataset object functions as a dict of data files, with the keys
    referring to the data file names and the content to the data file content, e.g.::

        MyNumberListDataset = Dataset[Model[list[int]]]

        my_dataset = MyNumberListDataset({'file_1': [1,2,3]})
        my_dataset['file_2'] = [2,3,4]

        print(my_dataset.keys())

    The Dataset class is a wrapper class around the powerful `GenericModel` class from pydantic.
    """
    class Config:
        validate_assignment = True
        arbitrary_types_allowed = True

        # TODO: Use json serializer package from the pydantic config instead of 'json'

        # json_loads = orjson.loads
        # json_dumps = orjson_dumps

    if TYPE_CHECKING:
        data: dict[str, _ModelOrDatasetT] = pyd.Field(default={})

    else:

        # TODO: For pydantic v2, remove the hack in Dataset that stops e.g.
        #       [{'a': 'b', 'c': 'd'}] from being coerced into {'a': 'c'} (remove
        #       first part of the union below, and edit get_type() and to_json_schema())
        data: list[dict[str, _ModelOrDatasetT]] | dict[str, _ModelOrDatasetT] = pyd.Field(
            default={})

    # data: dict[str, _ModelOrDatasetT] = pyd.Field(default={})

    def __class_getitem__(  # type: ignore[override]
        cls,
        params: type[_ModelOrDatasetT] | tuple[type[_ModelOrDatasetT]]
        | tuple[type[_ModelOrDatasetT], Any] | TypeVar
        | tuple[TypeVar, ...],
    ) -> Self:
        # TODO: change model type to params: Type[Any] | tuple[Type[Any], ...]
        #       as in GenericModel.

        _params = cls._prepare_params(params)
        orig_params = cls._clean_type(_params)

        if cls == Dataset:
            for type_variant in split_to_union_variants(orig_params):
                from omnipy.data.model import is_model_subclass

                if (not isinstance(type_variant,
                                   (TypeVar, str)) and not is_model_subclass(type_variant)
                        and not is_dataset_subclass(type_variant)):
                    cls._raise_type_exception(f'Invalid model: {orig_params} ')
        else:
            if isinstance(orig_params, TypeVar):
                _params = get_default_if_typevar(orig_params)

        created_dataset = super().__class_getitem__(_params)
        cls._recursively_set_allow_none(created_dataset._get_data_field())
        cleanup_name_qualname_and_module(cls, created_dataset, orig_params)

        return cast(Self, created_dataset)

    @call_super_if_available(call_super_before_method=True)
    @classmethod
    def _clean_type(cls, _type: TypeForm) -> TypeForm:
        return _type

    def __init__(  # noqa: C901
        self,
        value: Mapping[str, object] | Iterable[tuple[str, object]] | UndefinedType = Undefined,
        *,
        data: Mapping[str, object] | UndefinedType = Undefined,
        **kwargs: object,
    ) -> None:
        from omnipy.data.model import is_model_instance, is_pure_pydantic_model

        # TODO: Error message when forgetting parentheses when creating Dataset should be improved.
        #       Unclear where this can be done, if anywhere? E.g.:
        #           a = Dataset[Model[int]]
        #           a['adsfas'] = 2
        #           Traceback (most recent call last):
        #             ...
        #           TypeError: 'ModelMetaclass' object does not support item assignment
        #
        # TODO: Disallow e.g.:
        #       Dataset[Model[str]](Model[int](5)) ==  Dataset[Model[str]](data=Model[int](5))
        #       == Dataset[Model[str]](data={'__root__': Model[str]('5')})

        super_kwargs = {}

        assert DATA_KEY not in kwargs, \
            ('Not allowed with "data" as kwargs key. Not sure how you managed this? Are you trying '
             'to break Dataset init on purpose?')

        if value != Undefined:
            assert data == Undefined, \
                'Not allowed to combine positional and "data" keyword argument'
            assert len(kwargs) == 0, \
                'Not allowed to combine positional and keyword arguments'
            super_kwargs[DATA_KEY] = value

        if data != Undefined:
            assert len(kwargs) == 0, \
                f"Not allowed to combine '{DATA_KEY}' with other keyword arguments"
            super_kwargs[DATA_KEY] = data

        if kwargs:
            if DATA_KEY not in super_kwargs:
                super_kwargs[DATA_KEY] = kwargs
                kwargs = {}

        _type = self.get_type()
        if _type == _ModelOrDatasetT:  # type: ignore[misc]
            self._raise_type_exception()

        def _validate_any_models_or_datasets(
                iterable_data: Iterable[tuple[str, object]]) -> tuple[dict, bool]:

            prepared_data = {}
            _model_or_dataset_as_input: bool = False

            for key, val in iterable_data:
                if is_model_instance(val):
                    _model_or_dataset_as_input = True
                    prepared_data[key] = self._validate_value_for_data_file(key, val)
                else:
                    prepared_data[key] = val
            return prepared_data, _model_or_dataset_as_input

        model_or_dataset_as_input = False
        if DATA_KEY in super_kwargs:
            input_data = super_kwargs[DATA_KEY]
            for_type_check = input_data.content if is_model_instance(input_data) else input_data
            match for_type_check:
                case Dataset():
                    model_or_dataset_as_input = True
                    super_kwargs[DATA_KEY] = cast(Dataset, input_data).to_data()
                case _input_data if is_pure_pydantic_model(_input_data):
                    super_kwargs[DATA_KEY], model_or_dataset_as_input = (
                        _validate_any_models_or_datasets(_input_data.dict().items()))
                case Mapping():
                    super_kwargs[DATA_KEY], model_or_dataset_as_input = (
                        _validate_any_models_or_datasets(cast(Mapping, input_data).items()))
                case Iterable():
                    try:
                        super_kwargs[DATA_KEY], model_or_dataset_as_input = (
                            _validate_any_models_or_datasets(self._check_iterable(input_data)))
                    except (TypeError, ValueError) as e:
                        raise TypeError(
                            'Data object must be a mapping or an iterable of '
                            '(key, val) pairs',
                            self.__class__) from e

                case _:
                    ...

        self._init(super_kwargs, **kwargs)

        try:
            self._primary_validation(super_kwargs)
        except ValidationError:
            if model_or_dataset_as_input:
                self._secondary_validation_from_data(super_kwargs)
            else:
                raise

        if not self.__doc__:
            self._set_standard_field_description()

    def _primary_validation(self, super_kwargs):
        # Pydantic validation of super_kwargs
        super().__init__(**super_kwargs)

    def _secondary_validation_from_data(self, super_kwargs):
        super().__init__()
        self.from_data(super_kwargs[DATA_KEY])

    def _init(self, super_kwargs: dict_t[str, Any], **kwargs: Any) -> None:
        ...

    # TODO: Revise with pydantic v2: __deepcopy__ is not defined for Dataset and Model, as it is not
    #       supported by pydantic v1. BaseModel.copy(deep=True) does not support a deepcopy memo.
    #       So we instead make use of the builtin support for deepcopy, which seems to work fine.
    #       However, __deepcopy__ in pydantic v2 is probably more efficient due to the memo and
    #       the Rust backend.

    def __copy__(self):
        return self.copy(deep=False)

    def copy(self, *, deep: bool = False, **kwargs) -> Self:
        pydantic_copy = pyd.GenericModel.copy(self, deep=deep, **kwargs)
        if not deep:
            object.__setattr__(pydantic_copy, DATA_KEY, pydantic_copy.__dict__[DATA_KEY].copy())

        return pydantic_copy  # pyright: ignore [reportReturnType]

    @classmethod
    def clone_dataset_cls(cls,
                          new_dataset_cls_name: str,
                          model_cls: type[_NewModelT] | None = None) -> type[Self]:
        if model_cls:
            generic_dataset_cls = cls.__bases__[0]
            new_base_cls = generic_dataset_cls[model_cls]  # type: ignore[index]
        else:
            new_base_cls = cls

        new_dataset_cls = type(new_dataset_cls_name, (new_base_cls,), {})
        return new_dataset_cls

    @classmethod
    def _get_data_field(cls) -> pyd.ModelField:
        return cast(pyd.ModelField, cls.__fields__.get(DATA_KEY))

    @classmethod
    @functools.cache
    def get_type(cls) -> type[_ModelOrDatasetT]:
        """
        Returns the concrete type (Model or Dataset class) used for all
        data files in the dataset, e.g.: `Model[list[int]]`, or
        `Dataset[Model[dict[str, float]]]` for nested datasets.
        :return: The concrete type (Model or Dataset class) used for all
                 data files in the dataset.
        """
        # Part of pydantic v1 hack to stop coercing of e.g.
        # [{'a': 'b', 'c': 'd'}] to {'a': 'c'}
        return cls._clean_type(cls._get_data_field().sub_fields[1].type_)  # type: ignore[index]
        # return cls._clean_type(cls._get_data_field().type_)

    @classmethod
    def _clean_type_caches(cls):
        cls.get_type.cache_clear()

    @staticmethod
    def _raise_type_exception(prefix_msg: str = '') -> None:
        msg = dedent("""\
            The Dataset class requires a concrete type (e.g. a Model class
            or a subclass) to be specified as a type hierarchy within
            brackets either directly, e.g.:

              model = Dataset[Model[list[int]]]()

            or indirectly in a subclass definition, e.g.:

              class MyNumberListDataset(Dataset[Model[list[int]]]): ...

            For anything other than the simplest cases, the definition of
            Model and Dataset subclasses is encouraged, e.g.:

              class MyNumberListModel(Model[list[int]]): ...
              class MyDataset(Dataset[MyNumberListModel]): ...

            Alternatively, a dataset can nest another dataset instead of a
            model, e.g.:

              class MyNestedDataset(Dataset[Dataset[Model[list[int]]]]): ...

            Note that at the bottom of the dataset nesting hierarchy, a
            Model class must always be specified.

            Unions of Model or Dataset classes are also supported, e.g.:

              model = Dataset[Model[int] | StrModel]()""")
        if prefix_msg:
            msg = prefix_msg + '\n\n' + msg
        raise TypeError(msg)

    def _set_standard_field_description(self) -> None:
        self.__fields__[DATA_KEY].field_info.description = self._get_standard_field_description()

    @classmethod
    def _get_standard_field_description(cls) -> str:
        return ('This class represents a dataset in the `omnipy` Python package and contains '
                'a set of named data items that follows the same data model. '
                'It is a statically typed specialization of the Dataset class according to a '
                'particular specialization of the Model class. Both main classes are wrapping '
                'the excellent Python package named `pydantic`.')

    if TYPE_CHECKING:  # noqa: C901

        # The code below is a hack needed because of a fundamental limitation of the current Python
        # typing syntax. There is no way (that we know of) to tell the type checkers that Model
        # objects can mimic the functionality of their type arguments, say that a Model[list] can
        # mimic a list. What we were aiming to do as a lesser hack was to tell the type checkers
        # that the Model objects can be considered as inheriting from both the Model class
        # and the type argument class, e.g. Model[list] and list, but in a general way, using type
        # variables. As a workaround, we have to overload the Model.__new__ and Dataset.__getitem__
        # methods for the most important types.

        @overload
        def __getitem__(
            self: 'Dataset[Model[float]]',
            selector: str | int,
        ) -> Model_float:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[Model[int]]',
            selector: str | int,
        ) -> Model_int:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[Model[bool]]',
            selector: str | int,
        ) -> Model_bool:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[Model[str]]',
            selector: str | int,
        ) -> Model_str:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[Model[bytes]]',
            selector: str | int,
        ) -> Model_bytes:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[Model[set[_ValT]]]',
            selector: str | int,
        ) -> Model_set[_ValT]:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[Model[list[_ValT]]]',
            selector: str | int,
        ) -> Model_list[_ValT]:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[Model[tuple[_ValT, ...]]]',
            selector: str | int,
        ) -> Model_tuple_same_type[_ValT]:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[Model[tuple[_ValT, _ValT2]]]',
            selector: str | int,
        ) -> Model_tuple_pair[_ValT, _ValT2]:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[Model[dict_t[_KeyT, _ValT]]]',
            selector: str | int,
        ) -> Model_dict[_KeyT, _ValT]:
            ...

        # For better typing of NestedDataset and similar. Will always type
        # as if a nested Dataset is returned, and will thus be wrong for the
        # terminal Model case. This only matters when multiple __getitem__
        # calls are chained, e.g.:
        #   nested_dataset = Dataset[Dataset[Model[list[int]]]](...)
        #   nested_dataset['a'][0] = 5  # <- here the type checker will
        #                               #    think nested_dataset['a'] is a
        #                               #    Dataset, not a Model[list[int]]

        @overload
        def __getitem__(
            self: 'Dataset[Model[Dataset[_ModelT]]]',
            selector: str | int,
        ) -> Model_Dataset[_ModelT]:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[_DatasetT | _ModelT ]',
            selector: str | int,
        ) -> _DatasetT:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[_DatasetT | _ModelT | _Model2T]',
            selector: str | int,
        ) -> _DatasetT:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[_DatasetT | _ModelT | _Model2T | _Model3T]',
            selector: str | int,
        ) -> _DatasetT:
            ...

        @overload
        def __getitem__(
            self: 'Dataset[_DatasetT | _ModelT | _Model2T | _Model3T | _Model4T]',
            selector: str | int,
        ) -> _DatasetT:
            ...

        # Even though these two overloads overlap, they are needed in this
        # order to solve typing for both nested and regular Dataset cases.

        @overload
        def __getitem__(  # type: ignore[overload-overlap]
            self: 'Dataset[_ModelOrDatasetT]',
            selector: str | int,
        ) -> _ModelOrDatasetT:
            ...

        # The only thing that should really be needed, if Python type hints had been able to
        # describe that Model objects can dynamically inherit from their type arguments. This would
        # at least go some way towards what we really want, which is a way to describe exactly the
        # way Model objects mimic the functionality of their type arguments.
        #
        # @overload
        # def __getitem__(self, selector: str | int) -> _ModelOrDatasetT:
        #     ...

        @overload
        def __getitem__(self, selector: slice | Iterable[str | int]) -> Self:
            ...

    def __getitem__(
        self,
        selector: str | int | slice | Iterable[str | int],
    ) -> '_DatasetT | _ModelOrDatasetT | Model | Self':
        selected_keys = select_keys(selector, self.data)

        if selected_keys.singular:
            value: _ModelOrDatasetT | Self = self.data[selected_keys.keys[0]]
        else:
            value = self.__class__({key: self.data[key] for key in selected_keys.keys})

        return self._check_value(value)

    @call_super_if_available(call_super_before_method=True)
    def _check_value(self, value: Any) -> Any:
        return value

    def __delitem__(self, selector: str | int | slice | Iterable[str | int]) -> None:
        selected_keys = select_keys(selector, self.data)

        if selected_keys.singular:
            del self.data[selected_keys.keys[0]]
        else:
            prev_data = copy(self.data)

            try:
                for key in selected_keys.keys:
                    del self.data[key]
            except Exception:
                self.data = prev_data
                raise

    @overload
    def __setitem__(self, selector: str | int, data_obj: object) -> None:
        ...

    @overload
    def __setitem__(self,
                    selector: slice | Iterable[str | int],
                    data_obj: Mapping[str, object] | Iterable[object]) -> None:
        ...

    def __setitem__(
        self,
        selector: str | int | slice | Iterable[str | int],
        data_obj: object | Mapping[str, object] | Iterable[object],
    ) -> None:
        selected_keys = select_keys(selector, self.data)

        if selected_keys.singular:
            self._set_data_file_and_validate(selected_keys.keys[0],
                                             cast(_ModelOrDatasetT, data_obj))
        else:
            key_2_data_item: Key2DataItemType[object]
            index_2_data_items: Index2DataItemsType[object]

            if isinstance(data_obj, MutableMapping):
                key_2_data_item, index_2_data_items = \
                    prepare_selected_items_with_mapping_data(
                        selected_keys.keys, selected_keys.last_index,
                        cast(Mapping[str, object], data_obj),
                    )

            elif is_iterable(data_obj) and not isinstance(data_obj, (str, bytes)):
                key_2_data_item, index_2_data_items = \
                    prepare_selected_items_with_iterable_data(
                        selected_keys.keys, selected_keys.last_index, tuple(data_obj),
                        cast(Mapping[str, object], self.data),
                    )

            else:
                raise TypeError('Data object must be a mapping or an iterable')

            self._update_selected_items_with_data_items(key_2_data_item, index_2_data_items)

    def _update_selected_items_with_data_items(
        self,
        key_2_data_item: Key2DataItemType[object],
        index_2_data_item: Index2DataItemsType[object],
    ) -> None:

        updated_mapping = create_updated_mapping(
            cast(MutableMapping[str, object], self.data), key_2_data_item,
            index_2_data_item)  # pyright: ignore [reportUndefinedVariable]
        self._replace_data_with_mapping(updated_mapping)

    def _replace_data_with_mapping(self, updated_mapping: MutableMapping[str, object]) -> None:
        prev_data = self.data
        try:
            self.absorb_and_replace(self.__class__(updated_mapping))
        except Exception:
            self.data = prev_data
            raise

    def _set_data_file_and_validate(self, key: str, val: _ModelOrDatasetT) -> None:
        has_prev_value = key in self.data
        if has_prev_value:
            prev_value = self.data[key]

        try:
            self.data[key] = val
            self._validate_data_file(key)
        except Exception:
            if has_prev_value:
                self.data[key] = prev_value
            else:
                del self.data[key]
            raise

    @classmethod
    def _check_iterable(cls, iterable: Iterable[Any]) -> Iterable[Any]:
        if isinstance(iterable, (str, bytes)):
            raise TypeError(
                'Outer data iterables cannot be strings or '
                f'bytes, got: {type(iterable)}', cls)

        def check_iterable_elements(iterable: Iterable) -> Iterator:
            for el in iterable:
                if not isinstance(el, (tuple, list)):
                    raise TypeError(
                        'Inner data iterable elements must be '
                        '(key, val) pairs, as tuples or lists, '
                        f'not: {type(el)}',
                        cls)
                if isinstance(el, Mapping):
                    yield from el.items()
                else:
                    yield el

        return check_iterable_elements(iterable)

    @classmethod
    def validate(cls, value: Any) -> Self:
        """
        Hack to allow overwriting of __iter__ method without compromising pydantic validation. Part
        of the pydantic API and not the Omnipy API.
        """
        # TODO: Doublecheck if validate() method is still needed for pydantic v2

        # validate_cls_counts[cls.__name__] += 1
        if is_iterable(value) and not isinstance(value, Mapping):
            value = cls._check_iterable(value)

        return super().validate({'data': value})

    @classmethod
    def update_forward_refs(
        cls,
        calling_module: str | None = None,
        prev_visited_classes: set[type] | None = None,
        **localns: Any,
    ) -> None:
        """
        Try to update ForwardRefs on fields based on this Model, globalns and localns.
        """
        from omnipy.data.model import is_model_subclass

        if prev_visited_classes is None:
            prev_visited_classes = set()
        elif cls in prev_visited_classes:
            return

        # Merge the namespaces of the Dataset's own module and the
        # calling module to the local namespace for evaluation of forward
        # references, which is necessary for cases where the Dataset is
        # defined in a different module than where it is used, e.g. when
        # the Dataset is defined in a library and used by a user in their
        # own code.
        if calling_module is None:
            calling_module = get_calling_module_name()
        own_module_ns, globalns = \
            build_own_module_and_global_namespace_for_forward_refs(cls, calling_module, **localns)

        prev_type = cls._get_data_field().type_

        super().update_forward_refs(**globalns)

        cls._get_data_field().type_ = evaluate_any_forward_refs_if_possible(prev_type, **globalns)
        cls.__annotations__[DATA_KEY] = evaluate_any_forward_refs_if_possible(
            cls.__annotations__[DATA_KEY], **globalns)

        prev_visited_classes.add(cls)

        # Merge the Dataset's own module namespace into
        # localns before propagating. This is to allow Model classes and
        # pydantic-generated parametrized base classes (which have
        # __module__='omnipy.data.dataset' rather than the defining
        # module) to still resolve forward refs that only exist
        # in the defining module's namespace.

        extra_ns: dict[str, Any] = {}
        extra_ns.update(own_module_ns)
        extra_ns.update(localns)

        # Propagate update_forward_refs to parent Dataset classes but
        # retaining the same calling module. This is needed to ensure the
        # correct context is used to resolve forward references in complex
        # inheritance hierarchies.
        #
        # We explicitly call `update_forward_refs` on immediate parent
        # classes (`__bases__`) instead of relying solely on
        # `super().update_forward_refs()`. This is because `super()`
        # inside this classmethod resolves relative to `Dataset` in the MRO,
        # silently bypassing custom logic on any intermediate `Dataset`
        # subclasses. Explicitly propagating through `__bases__` ensures
        # that class-level setups are correctly applied to all parents
        # exactly once, efficiently preventing redundant updates.
        for base in cls.__bases__:
            if is_dataset_subclass(base) and base is not Dataset:
                base.update_forward_refs(
                    calling_module=calling_module,
                    prev_visited_classes=prev_visited_classes,
                    **extra_ns,
                )

        # As above, but now propagate update_forward_refs to the types of
        # the Dataset (e.g. the Model).
        for type_variant in split_to_union_variants(cls.get_type()):
            if is_dataset_subclass(type_variant) or is_model_subclass(type_variant):
                type_variant.update_forward_refs(
                    calling_module=calling_module,
                    prev_visited_classes=prev_visited_classes,
                    **extra_ns,
                )

        cls.__name__ = remove_forward_ref_notation(cls.__name__)
        cls.__qualname__ = remove_forward_ref_notation(cls.__qualname__)

        cls._clean_type_caches()

    def _validate_data_file(self, data_file: str) -> None:
        from omnipy.data.model import is_model_instance

        val = self.data[data_file]
        if is_model_instance(val):
            self.data[data_file] = self._validate_value_for_data_file(data_file, val)
        else:
            self._force_full_validation()

    @staticmethod
    def _basic_validation_func(type_variant: 'type[Model | Dataset]',
                               value: UndefinedType | object) -> _ModelOrDatasetT:
        return cast(_ModelOrDatasetT, type_variant(value))  # type: ignore[arg-type]

    @classmethod
    def _validate_value_for_data_file(
        cls,
        data_file: str,
        value: UndefinedType | object,
        validation_func: (
            'Callable[[type[Model | Dataset], UndefinedType | object], _ModelOrDatasetT]'
        ) = _basic_validation_func,
    ) -> _ModelOrDatasetT:
        errors = []
        for type_variant in split_to_union_variants(cls.get_type()):
            try:
                return validation_func(cast('type[Model | Dataset]', type_variant), value)
            except (ValidationError, ValueError, TypeError) as exp:
                errors.append(exp)
        assert errors
        raise ValidationError([pyd.ErrorWrapper(exc, loc=data_file) for exc in errors], cls)

    def _force_full_validation(self):
        self.data = self.data  # Triggers pydantic validation, as validate_assignment=True

    @override
    def __iter__(self) -> Iterator[str]:  # type: ignore[override]
        return UserDict.__iter__(self)

    def __setattr__(self, attr: str, value: Any) -> None:
        if attr in self.__dict__ or attr == DATA_KEY or attr.startswith('__'):
            super().__setattr__(attr, value)
        elif attr == 'repr_state':
            prop = getattr(self.__class__, attr)
            prop.__set__(self, value)
        else:
            raise RuntimeError('Model does not allow setting of extra attributes')

    @pyd.root_validator
    def _parse_root_object(
        cls,
        root_obj: dict_t[str, dict_t[str, _ModelOrDatasetT]],
    ) -> Any:  # noqa
        assert DATA_KEY in root_obj
        data_dict = root_obj[DATA_KEY]
        for data_file, val in data_dict.items():
            if val is None:

                def validation_by_parse_obj(
                    type_variant: 'type[Model | Dataset]',
                    value: UndefinedType | object,
                ) -> _ModelOrDatasetT:
                    return cast(_ModelOrDatasetT, type_variant.parse_obj(value))

                data_dict[data_file] = cls._validate_value_for_data_file(
                    data_file,
                    val,
                    validation_by_parse_obj,
                )

        return {DATA_KEY: data_dict}

    def to(self, model_or_dataset_cls: type[_OtherModelOrDatasetT]) -> '_OtherModelOrDatasetT':
        return model_or_dataset_cls(self)

    def do(self, placeholder: F) -> 'Dataset[_ModelOrDatasetT]':
        new_dataset = self.__class__()
        for data_file, val in self.items():
            new_dataset[data_file] = placeholder(val)
        return new_dataset

    def to_data(self) -> dict_t[str, Any]:
        return {key: self._check_value(val) for key, val in self.dict(by_alias=True).items()}

    def dict(self, **kwargs) -> dict_t[str, Any]:
        return super().dict(**kwargs)[DATA_KEY]

    def from_data(self,
                  data: Mapping[str, Any] | Iterable[tuple[str, Any]],
                  update: bool = True) -> None:
        def callback_func(type_variant: 'Model | Dataset', content: Any):
            type_variant.from_data(content)

        self._from_dict_with_callback(data, update, callback_func)

    def _from_dict_with_callback(self,
                                 data: Mapping[str, Any] | Iterable[tuple[str, Any]],
                                 update: bool,
                                 callback_func: 'Callable[[Model | Dataset, Any], None]'):
        if isinstance(data, dict):
            data_as_dict: dict[str, Any] = data  # pyright: ignore [reportAssignmentType]
        else:
            data_as_dict = dict(data)

        if not update:
            self.clear()

        for data_file, content in data_as_dict.items():
            # TODO: Redefine from_data() also as classmethods on Model and
            #       Dataset. Here, we could then do
            #       type_variant.from_data(content) instead of creating a
            #       new instance and then calling from_data() on it.
            #       Instance-level from_data() should however also be kept,
            #       as it is useful in many cases. Note: Classmethod
            #       from_data() should still first create an empty instance
            #       and then call instance-level from_data() on it, to avoid
            #       issues with __init__ arguments (when e.g. 'self', 'value',
            #       'data' is used as keys in the data).

            def validation_by_callback_func(
                type_variant: 'type[Model | Dataset]',
                value: UndefinedType | object,
            ) -> _ModelOrDatasetT:
                new_instance = type_variant()
                callback_func(new_instance, value)
                return cast(_ModelOrDatasetT, new_instance)

            self.data[data_file] = self._validate_value_for_data_file(
                data_file,
                content,
                validation_by_callback_func,
            )

    def absorb(self, other: 'Dataset'):
        self.from_data(other.to_data(), update=True)

    def absorb_and_replace(self, other: 'Dataset'):
        self.from_data(other.to_data(), update=False)

    def to_json(self, pretty=True) -> dict_t[str, str]:
        result = {}

        for key, val in self.data.items():
            result[key] = val.to_json(pretty=pretty)

        return result

    def from_json(self,
                  data: Mapping[str, str] | Iterable[tuple[str, str]],
                  update: bool = True) -> None:
        def callback_func(type_variant: 'Model | Dataset', content: Any):
            type_variant.from_json(content)

        self._from_dict_with_callback(data, update, callback_func)

    # @classmethod
    # def get_type_args(cls):
    #     return cls.__fields__.get(DATA_KEY).type_
    #
    #
    # @classmethod
    # def create_from_json(cls, data: str, tuple[str]]):
    #     if isinstance(data, tuple):
    #         data = data[0]
    #
    #     obj = cls()
    #     obj.from_json(data, update=False)
    #     return obj
    #
    # def __reduce__(self):
    #     return self.__class__.create_from_json, (self.to_json(),)

    @classmethod
    def to_json_schema(cls, pretty: bool = True) -> str | dict_t[str, str]:
        result = {}
        clean_dataset = super(Dataset, Dataset).__class_getitem__(cls.get_type())
        schema = clean_dataset.schema()
        for key, val in schema['properties'][DATA_KEY].items():
            # Remove the first part of the type definition of 'data', added
            # as a hack to stop coercing of e.g. [{'a': 'b', 'c': 'd'}]
            # to {'a': 'c'}
            if key == 'anyOf':
                result['type'] = 'object'
                result['additionalProperties'] = {
                    '$ref': '#/definitions/' + pyd.normalize_name(clean_dataset.get_type().__name__)
                }
            else:
                result[key] = val

        result['title'] = clean_dataset.__name__
        result['definitions'] = schema['definitions']

        for model_desc in result['definitions'].values():
            if 'orig_model' in model_desc:
                del model_desc['orig_model']

        if pretty:
            return cls._pretty_print_json(result)
        else:
            return json.dumps(result)

    @staticmethod
    def _pretty_print_json(json_content: Any) -> str:
        return json.dumps(json_content, indent=2)

    def save(self, path: str):
        serializer_registry = self._get_serializer_registry()

        parsed_dataset, serializer = serializer_registry.auto_detect_tar_file_serializer(self)

        if serializer is None:
            print(f'Unable to find a serializer for dataset with data type "{type(self)}". '
                  f'Will abort saving...')
        else:
            if path.endswith('.tar.gz'):
                out_tar_gz_path = path
            else:
                out_tar_gz_path = f'{path}.tar.gz'

            print(f'Writing dataset as a gzipped tarpack to "{os.path.abspath(out_tar_gz_path)}"')

            with open(out_tar_gz_path, 'wb') as out_tar_gz_file:
                out_tar_gz_file.write(serializer.serialize(parsed_dataset))

            directory = os.path.abspath(out_tar_gz_path[:-7])
            if not os.path.exists(directory):
                os.makedirs(directory)

            print(f'Extracting content to directory "{directory}"')
            with tarfile.open(out_tar_gz_path) as tar:
                tar.extractall(path=directory)

    @classmethod
    def load(
        cls,
        paths_or_urls: IsPathsOrUrlsOneOrMoreOrNone = None,
        by_file_suffix: bool = False,
        as_mime_type: None | str = None,
        **kwargs: IsPathOrUrl,
    ) -> Self | asyncio.Task[Self]:
        dataset = cls()
        return dataset.load_into(
            paths_or_urls, by_file_suffix=by_file_suffix, as_mime_type=as_mime_type, **kwargs)

    def load_into(
        self,
        paths_or_urls: IsPathsOrUrlsOneOrMoreOrNone = None,
        by_file_suffix: bool = False,
        as_mime_type: None | str = None,
        **kwargs: IsPathOrUrl,
    ) -> Self | asyncio.Task[Self]:
        from omnipy.components.remote.datasets import HttpUrlDataset
        from omnipy.components.remote.models import HttpUrlModel

        if paths_or_urls is None:
            assert len(kwargs) > 0, 'No paths or urls specified'
            paths_or_urls = kwargs
        else:
            assert len(kwargs) == 0, 'No keyword arguments allowed when paths_or_urls is specified'

        match paths_or_urls:
            case HttpUrlDataset():
                return self._load_http_urls(paths_or_urls, as_mime_type=as_mime_type)

            case HttpUrlModel():
                return self._load_http_urls(
                    HttpUrlDataset({str(paths_or_urls): paths_or_urls}),
                    as_mime_type=as_mime_type,
                )

            case str():
                try:
                    http_url_dataset = HttpUrlDataset({paths_or_urls: paths_or_urls})
                except ValidationError:
                    return self._load_paths([paths_or_urls], by_file_suffix)
                return self._load_http_urls(http_url_dataset, as_mime_type=as_mime_type)

            case Mapping():
                try:
                    http_url_dataset = HttpUrlDataset(paths_or_urls)
                except ValidationError as exp:
                    raise NotImplementedError(
                        'Loading files with specified keys is not yet '
                        'implemented, as only tar.gz file import is '
                        'supported until serializers have been refactored.') from exp
                return self._load_http_urls(http_url_dataset, as_mime_type=as_mime_type)

            case Iterable():
                path_or_url_iterable = paths_or_urls
                try:
                    http_url_dataset = HttpUrlDataset(
                        zip(path_or_url_iterable, path_or_url_iterable))
                except ValidationError:
                    return self._load_paths(path_or_url_iterable, by_file_suffix)
                return self._load_http_urls(http_url_dataset, as_mime_type=as_mime_type)
            case _:
                raise TypeError(f'"paths_or_urls" argument is of incorrect type. Type '
                                f'{type(paths_or_urls)} is not supported.')

    def _load_http_urls(
        self,
        http_url_dataset: IsHttpUrlDataset,
        as_mime_type: None | str = None,
    ) -> Self | asyncio.Task[Self]:
        from omnipy.components.remote.helpers import RateLimitingClientSession
        from omnipy.components.remote.tasks import get_auto_from_api_endpoint

        hosts: defaultdict[str, list[int]] = defaultdict(list)
        for i, url in enumerate(http_url_dataset.values()):
            hosts[url.host].append(i)

        async def load_all(as_mime_type: None | str = None) -> 'Dataset[_ModelOrDatasetT]':
            tasks = []

            for host in hosts:
                async with RateLimitingClientSession(
                        self.config.http.for_host[host].requests_per_time_period,
                        self.config.http.for_host[host].time_period_in_secs) as client_session:
                    indices = hosts[host]
                    # fetch_task = get_auto_from_api_endpoint
                    # if as_mime_type:
                    #     match as_mime_type:
                    #         case 'application/json':
                    #             fetch_task = get_json_from_api_endpoint
                    #         case 'text/plain':
                    #             fetch_task = get_str_from_api_endpoint
                    #         case 'application/octet-stream' | _:
                    #             fetch_task = get_bytes_from_api_endpoint

                    ret = get_auto_from_api_endpoint.refine(
                        output_dataset_param='output_dataset').run(
                            http_url_dataset[indices],
                            client_session=client_session,
                            output_dataset=self,
                            as_mime_type=as_mime_type)

                    if not isinstance(ret, asyncio.Task):
                        assert inspect.iscoroutine(ret)
                        task = asyncio.create_task(ret)
                    else:
                        task = ret

                    tasks.append(task)

                    while not task.done():
                        await asyncio.sleep(ASYNC_LOAD_SLEEP_TIME)

            await asyncio.gather(*tasks)
            return self

        loop, loop_is_running = get_event_loop_and_check_if_loop_is_running()

        if loop and loop_is_running:
            return loop.create_task(load_all(as_mime_type=as_mime_type))
        else:
            return asyncio.run(load_all(as_mime_type=as_mime_type))

    def _load_paths(self, path_or_urls: Iterable[str], by_file_suffix: bool) -> Self:
        for path_or_url in path_or_urls:
            serializer_registry = self._get_serializer_registry()
            tar_gz_file_path = self._ensure_tar_gz_file(path_or_url)

            if by_file_suffix:
                loaded_dataset = \
                    serializer_registry.load_from_tar_file_path_based_on_file_suffix(
                        self, tar_gz_file_path, self)
            else:
                loaded_dataset = \
                    serializer_registry.load_from_tar_file_path_based_on_dataset_cls(
                        self, tar_gz_file_path, self, any_file_suffix=True)
            if loaded_dataset is not None:
                self.absorb(loaded_dataset)
                continue
            else:
                raise RuntimeError('Unable to load from serializer')
        return self

    @staticmethod
    def _ensure_tar_gz_file(path: str):
        assert os.path.exists(path), f'No file or directory at {path}'

        if not path.endswith('.tar.gz'):
            tar_gz_file_path = path + '.tar.gz'
            if not os.path.isfile(tar_gz_file_path):
                print(f'Creating compressed file {os.path.abspath(tar_gz_file_path)} from '
                      f'the content of "{os.path.abspath(path)}"')

                with tarfile.open(tar_gz_file_path, 'w:gz') as tar:
                    if os.path.isdir(path):
                        for fn in sorted(os.listdir(path)):
                            p = os.path.join(path, fn)
                            tar.add(p, arcname=fn)
                    elif os.path.isfile(path):
                        tar.add(path, arcname=os.path.basename(path))
            return tar_gz_file_path

        return path

    @staticmethod
    def _get_serializer_registry():
        from omnipy.components import get_serializer_registry
        return get_serializer_registry()

    def as_multi_model_dataset(self) -> 'IsMultiModelDataset[_ModelOrDatasetT]':
        from omnipy.data.multi import MultiModelDataset

        multi_model_dataset = MultiModelDataset[self.get_type()]()
        for data_file in self:
            multi_model_dataset.data[data_file] = self.data[data_file]
        return multi_model_dataset

    def __eq__(self, other: object) -> bool:
        # return self.__class__ == other.__class__ and super().__eq__(other)
        return isinstance(other, Dataset) \
            and self.__class__ == other.__class__ \
            and self.data == other.data \
            and self.to_data() == other.to_data()  # last is probably unnecessary, but just in case

    def __repr_args__(self):
        from omnipy.data.model import is_model_instance

        return [(k, v.content) if is_model_instance(v) else (k, v) for k, v in self.data.items()]
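
Two patterns from the listing above, the rollback on failed item assignment and the loop that tries each union variant in turn, can be illustrated in isolation. The following is a minimal standalone sketch, not omnipy code: `MiniDataset`, `parse_int_list` and `parse_str` are hypothetical names, and plain callables stand in for `Model` classes.

```python
from collections import UserDict
from typing import Any, Callable


def parse_int_list(value: Any) -> list[int]:
    """Hypothetical stand-in for a Model: accepts only lists of int-like values."""
    if not isinstance(value, list):
        raise TypeError(f'Expected a list, got: {type(value)}')
    return [int(el) for el in value]


def parse_str(value: Any) -> str:
    """Hypothetical stand-in for a second union variant."""
    if not isinstance(value, str):
        raise TypeError(f'Expected a str, got: {type(value)}')
    return value


class MiniDataset(UserDict):
    """Dict that validates each value against a sequence of variants."""

    variants: tuple[Callable[[Any], Any], ...] = (parse_int_list, parse_str)

    def _validate(self, key: str, value: Any) -> Any:
        errors: list[Exception] = []
        for variant in self.variants:
            try:
                # The first variant that accepts the value wins
                return variant(value)
            except (ValueError, TypeError) as exc:
                errors.append(exc)
        raise ValueError(f'No variant accepted data file {key!r}: {errors}')

    def __setitem__(self, key: str, value: Any) -> None:
        has_prev_value = key in self.data
        prev_value = self.data.get(key)
        try:
            self.data[key] = value  # tentative assignment, as in the listing
            self.data[key] = self._validate(key, value)
        except Exception:
            # Restore the previous state before re-raising
            if has_prev_value:
                self.data[key] = prev_value
            else:
                del self.data[key]
            raise


dataset = MiniDataset()
dataset['file_1'] = [1, '2', 3]  # coerced by parse_int_list
dataset['file_2'] = 'text'  # rejected by parse_int_list, accepted by parse_str

try:
    dataset['file_1'] = 3.14  # rejected by both variants, previous value is kept
except ValueError:
    pass

print(dict(dataset))  # {'file_1': [1, 2, 3], 'file_2': 'text'}
```

A failed assignment therefore leaves the container exactly as it was, which is the property the `try`/`except` block around `_validate_data_file` provides.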

available_data property

available_data: Self

config property

config: IsDataConfig

data class-attribute instance-attribute

data: dict[str, _ModelOrDatasetT] = pyd.Field(default={})

failed_data property

failed_data: Self

pending_data property

pending_data: Self

reactive_objects property

reactive_objects: IsReactiveObjects | None

snapshot_holder property

Config

ATTRIBUTE DESCRIPTION
arbitrary_types_allowed

validate_assignment

Source code in src/omnipy/data/dataset.py
class Config:
    validate_assignment = True
    arbitrary_types_allowed = True

arbitrary_types_allowed class-attribute instance-attribute

arbitrary_types_allowed = True

validate_assignment class-attribute instance-attribute

validate_assignment = True

__init__

__init__(
    value: Mapping[str, object] | Iterable[tuple[str, object]] | UndefinedType = Undefined,
    *,
    data: Mapping[str, object] | UndefinedType = Undefined,
    **kwargs: object,
) -> None
Source code in src/omnipy/data/dataset.py
def __init__(  # noqa: C901
    self,
    value: Mapping[str, object] | Iterable[tuple[str, object]] | UndefinedType = Undefined,
    *,
    data: Mapping[str, object] | UndefinedType = Undefined,
    **kwargs: object,
) -> None:
    from omnipy.data.model import is_model_instance, is_pure_pydantic_model

    # TODO: Error message when forgetting parentheses when creating Dataset should be improved.
    #       Unclear where this can be done, if anywhere? E.g.:
    #           a = Dataset[Model[int]]
    #           a['adsfas'] = 2
    #           Traceback (most recent call last):
    #             ...
    #           TypeError: 'ModelMetaclass' object does not support item assignment
    #
    # TODO: Disallow e.g.:
    #       Dataset[Model[str]](Model[int](5)) ==  Dataset[Model[str]](data=Model[int](5))
    #       == Dataset[Model[str]](data={'__root__': Model[str]('5')})

    super_kwargs = {}

    assert DATA_KEY not in kwargs, \
        ('Not allowed with "data" as kwargs key. Not sure how you managed this? Are you trying '
         'to break Dataset init on purpose?')

    if value != Undefined:
        assert data == Undefined, \
            'Not allowed to combine positional and "data" keyword argument'
        assert len(kwargs) == 0, \
            'Not allowed to combine positional and keyword arguments'
        super_kwargs[DATA_KEY] = value

    if data != Undefined:
        assert len(kwargs) == 0, \
            f"Not allowed to combine '{DATA_KEY}' with other keyword arguments"
        super_kwargs[DATA_KEY] = data

    if kwargs:
        if DATA_KEY not in super_kwargs:
            super_kwargs[DATA_KEY] = kwargs
            kwargs = {}

    _type = self.get_type()
    if _type == _ModelOrDatasetT:  # type: ignore[misc]
        self._raise_type_exception()

    def _validate_any_models_or_datasets(
            iterable_data: Iterable[tuple[str, object]]) -> tuple[dict, bool]:

        prepared_data = {}
        _model_or_dataset_as_input: bool = False

        for key, val in iterable_data:
            if is_model_instance(val):
                _model_or_dataset_as_input = True
                prepared_data[key] = self._validate_value_for_data_file(key, val)
            else:
                prepared_data[key] = val
        return prepared_data, _model_or_dataset_as_input

    model_or_dataset_as_input = False
    if DATA_KEY in super_kwargs:
        input_data = super_kwargs[DATA_KEY]
        for_type_check = input_data.content if is_model_instance(input_data) else input_data
        match for_type_check:
            case Dataset():
                model_or_dataset_as_input = True
                super_kwargs[DATA_KEY] = cast(Dataset, input_data).to_data()
            case _input_data if is_pure_pydantic_model(_input_data):
                super_kwargs[DATA_KEY], model_or_dataset_as_input = (
                    _validate_any_models_or_datasets(_input_data.dict().items()))
            case Mapping():
                super_kwargs[DATA_KEY], model_or_dataset_as_input = (
                    _validate_any_models_or_datasets(cast(Mapping, input_data).items()))
            case Iterable():
                try:
                    super_kwargs[DATA_KEY], model_or_dataset_as_input = (
                        _validate_any_models_or_datasets(self._check_iterable(input_data)))
                except (TypeError, ValueError) as e:
                    raise TypeError(
                        'Data object must be a mapping or an iterable of '
                        '(key, val) pairs',
                        self.__class__) from e

            case _:
                ...

    self._init(super_kwargs, **kwargs)

    try:
        self._primary_validation(super_kwargs)
    except ValidationError:
        if model_or_dataset_as_input:
            self._secondary_validation_from_data(super_kwargs)
        else:
            raise

    if not self.__doc__:
        self._set_standard_field_description()
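
The first part of `__init__` resolves three mutually exclusive ways of passing content: a positional value, the `data` keyword, or arbitrary keyword arguments interpreted as data files. The sketch below shows only that resolution step in isolation. `resolve_init_data` and the `_UNDEFINED` sentinel are hypothetical stand-ins, and the sketch raises `TypeError` where the original uses assertions.

```python
from __future__ import annotations

from typing import Any, Iterable, Mapping

_UNDEFINED: Any = object()  # hypothetical stand-in for omnipy's Undefined


def resolve_init_data(
    value: Mapping[str, Any] | Iterable[tuple[str, Any]] = _UNDEFINED,
    *,
    data: Mapping[str, Any] = _UNDEFINED,
    **kwargs: Any,
) -> dict[str, Any]:
    """Return the content as a dict, whichever way it was passed."""
    if value is not _UNDEFINED:
        if data is not _UNDEFINED or kwargs:
            raise TypeError('Positional content cannot be combined with keyword arguments')
        return dict(value)
    if data is not _UNDEFINED:
        if kwargs:
            raise TypeError("'data' cannot be combined with other keyword arguments")
        return dict(data)
    # Remaining keyword arguments are interpreted as data files
    return dict(kwargs)


# All four calls describe the same content
print(resolve_init_data({'file_1': [1, 2, 3]}))
print(resolve_init_data([('file_1', [1, 2, 3])]))
print(resolve_init_data(data={'file_1': [1, 2, 3]}))
print(resolve_init_data(file_1=[1, 2, 3]))
```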

absorb

absorb(other: Dataset)
Source code in src/omnipy/data/dataset.py
def absorb(self, other: 'Dataset'):
    self.from_data(other.to_data(), update=True)

absorb_and_replace

absorb_and_replace(other: Dataset)
Source code in src/omnipy/data/dataset.py
def absorb_and_replace(self, other: 'Dataset'):
    self.from_data(other.to_data(), update=False)
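
`absorb` and `absorb_and_replace` both delegate to `from_data` and differ only in the `update` flag: with `update=True` existing data files are kept and overwritten only where keys collide, while `update=False` clears the dataset first. A minimal sketch of that difference using plain dicts follows; the `from_data` function here is a hypothetical stand-in, not the omnipy method.

```python
from typing import Any


def from_data(target: dict[str, Any], data: dict[str, Any], update: bool = True) -> None:
    """Hypothetical dict-based stand-in mirroring the `update` semantics."""
    if not update:
        target.clear()
    for data_file, content in data.items():
        target[data_file] = content


first = {'a': [1], 'b': [2]}
from_data(first, {'b': [20], 'c': [30]}, update=True)  # like absorb()
print(first)  # {'a': [1], 'b': [20], 'c': [30]}

second = {'a': [1], 'b': [2]}
from_data(second, {'b': [20], 'c': [30]}, update=False)  # like absorb_and_replace()
print(second)  # {'b': [20], 'c': [30]}
```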

as_multi_model_dataset

as_multi_model_dataset() -> IsMultiModelDataset[_ModelOrDatasetT]
Source code in src/omnipy/data/dataset.py
def as_multi_model_dataset(self) -> 'IsMultiModelDataset[_ModelOrDatasetT]':
    from omnipy.data.multi import MultiModelDataset

    multi_model_dataset = MultiModelDataset[self.get_type()]()
    for data_file in self:
        multi_model_dataset.data[data_file] = self.data[data_file]
    return multi_model_dataset

browse

browse(
    *,
    width: pyd.NonNegativeInt | None = None,
    height: pyd.NonNegativeInt | None = None,
    tab: pyd.NonNegativeInt = 4,
    indent: pyd.NonNegativeInt = 2,
    printer: PrettyPrinterLib.Literals = "auto",
    syntax: SyntaxLanguageSpec.Literals | str = "auto",
    freedom: pyd.NonNegativeFloat | None = 2.5,
    debug: bool = False,
    ui: UserInterfaceType.Literals = "auto",
    system: DisplayColorSystem.Literals = "auto",
    style: AllColorStyles.Literals | str = "auto",
    dark: Literal = "auto",
    bg: bool = False,
    fonts: tuple = ("Menlo", "DejaVu Sans Mono", "Consolas", "Courier New", "monospace"),
    font_size: pyd.NonNegativeInt | None = 14,
    font_weight: pyd.NonNegativeInt | None = 400,
    line_height: pyd.NonNegativeFloat | None = 1.25,
    h_overflow: HorizontalOverflowMode.Literals = "ellipsis",
    v_overflow: VerticalOverflowMode.Literals = "ellipsis_bottom",
    panel: PanelDesign.Literals = "table",
    title_at_top: bool = True,
    max_title_height: MaxTitleHeight.Literals = -1,
    min_panel_width: pyd.NonNegativeInt = 3,
    min_crop_width: pyd.NonNegativeInt = 33,
    use_min_crop_width: bool = False,
    max_panels_hor: pyd.NonNegativeInt | None = 9,
    max_nesting_depth: pyd.NonNegativeInt | None = 3,
    justify: Justify.Literals = "left",
) -> None

Opens the model or dataset in a browser, if possible.

For models, this is a detailed view of the model's content, and for datasets this is a detailed view of each model contained in the dataset, one model per browser tab.

PARAMETER DESCRIPTION
width

Width in characters of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

height

Height in lines of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

tab

Number of spaces to use for each tab.

TYPE: NonNegativeInt DEFAULT: 4

indent

Number of spaces to use for each indentation level.

TYPE: NonNegativeInt DEFAULT: 2

printer

Library to use for pretty printing.

TYPE: PrettyPrinterLib.Literals DEFAULT: 'auto'

syntax

Syntax language for code highlighting. Supported lexers are defined in SyntaxLanguageSpec. For non-supported lexers, the user can specify a string with the Pygments lexer name. For this to work, the lexer must be registered in the Pygments library.

TYPE: SyntaxLanguageSpec.Literals | str DEFAULT: 'auto'

freedom

Parameter that controls the level of freedom for formatted text to follow the geometry of the frame size (=total available area) in a proportional manner. If the proportional freedom is 0 (the lowest), then the output area must not in any case be proportionally wider than the frame (i.e. a 16/9 frame will only produce output that is 16/9 or narrower). Larger values of proportional freedom allow the output to be proportionally wider than the total available frame, to a degree that relates to the size difference between the frame and the content (larger difference gives more freedom). The default value of 2.5 is a good compromise between readability/aesthetics and good use of the screen real estate. If None, the freedom is unlimited (i.e. proportionality is not taken into account at all).

TYPE: float | None DEFAULT: 2.5

debug

When True, enables additional debugging information in the output, such as the hierarchy of the Model objects. Currently, only Python pretty printers support debug=True. Hence, enabling debug mode will automatically set the printer to the default Python pretty printer if the printer config value is not already set.

TYPE: bool DEFAULT: False

ui

Type of user interface for which the output is being prepared. The user interface describes the technical solutions available for interacting with the user, encompassing the support available for displaying output as well as how the user interacts with the library (including the type of interactive interpreter used, if any).

TYPE: UserInterfaceType.Literals DEFAULT: 'auto'

system

Color system to use for terminal output. The default is AUTO, which automatically detects the color system based on particular environment variables. If color capabilities are not detected, the output will be in black and white. If the color system of a modern console/terminal is not auto-detected (which is the case for e.g. the PyCharm console), the user might want to set the color system manually to ANSI_RGB to force color output.

TYPE: DisplayColorSystem.Literals DEFAULT: 'auto'
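
Auto-detection of a color system from environment variables could look roughly like the sketch below. The variable names (`NO_COLOR`, `COLORTERM`, `TERM`), the returned labels and the precedence order are assumptions chosen for illustration. They are not taken from omnipy, whose detection logic may differ.

```python
from __future__ import annotations

import os
from typing import Mapping


def detect_color_system(env: Mapping[str, str]) -> str | None:
    """Guess a color system label from environment variables (illustrative only)."""
    if 'NO_COLOR' in env:
        return None  # fall back to black and white output
    if env.get('COLORTERM', '').lower() in ('truecolor', '24bit'):
        return 'ansi_rgb'
    term = env.get('TERM', '').lower()
    if term.endswith('256color'):
        return 'ansi_256'
    if term and term != 'dumb':
        return 'ansi_16'
    return None  # nothing detected: black and white output


# The result depends on the environment the interpreter runs in
print(detect_color_system(os.environ))
```

When a console sets none of these variables, a function like this returns `None`, which corresponds to the situation where the color system has to be set manually.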

style

Color style/theme for syntax highlighting and other display elements. Supported styles are defined in AllColorStyles. For non-supported styles, the user can specify a string with the Pygments style name. For this to work, the style must be registered in the Pygments library. If style is AUTO or any of the other RecommendedColorStyles, the style is automatically selected from the RecommendedColorStyles based on the detected user interface, the color system, and whether the background is dark or not.

TYPE: AllColorStyles.Literals | str DEFAULT: 'auto'

dark

Whether the background color of the output is dark. This is used to determine the appropriate color scheme for syntax highlighting. The default is AUTO, which automatically tries to detect whether the background is dark. Capability of auto-detection depends on the user interface.

TYPE: DarkBackground.Literals DEFAULT: 'auto'

bg

If False, uses transparent background for the output. In the case of terminal output, the background color will be the current background color of the terminal. For HTML output, the background color will be automatically set to pure black or pure white, depending on the luminosity of the foreground color.

TYPE: bool DEFAULT: False

fonts

Font families to use in HTML output, in order of preference (empty tuple for browser default).

TYPE: Tuple[str, ...] DEFAULT: ('Menlo', 'DejaVu Sans Mono', 'Consolas', 'Courier New', 'monospace')

font_size

Font size in pixels for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 14

font_weight

Font weight for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 400

line_height

Line height multiplier for HTML output (None for browser default).

TYPE: NonNegativeFloat | None DEFAULT: 1.25

h_overflow

How to handle text that exceeds the width.

TYPE: HorizontalOverflowMode.Literals DEFAULT: 'ellipsis'

v_overflow

How to handle text that exceeds the height.

TYPE: VerticalOverflowMode.Literals DEFAULT: 'ellipsis_bottom'

panel

Visual design of the panel used as container for the output. Only TABLE is currently supported, which displays the output in a table-like grid.

TYPE: PanelDesign.Literals DEFAULT: 'table'

title_at_top

Whether panel titles will be displayed over the panel content (True) or below the content (False).

TYPE: bool DEFAULT: True

max_title_height

Maximum height of the panel title. If AUTO, the height is determined by the content of the title, up to a maximum of two lines. If ZERO, the title is not displayed at all. If ONE or TWO, the title is displayed with a fixed height of max one or two lines, respectively.

TYPE: MaxTitleHeight.Literals DEFAULT: -1

min_panel_width

Minimum width in characters per panel.

TYPE: NonNegativeInt DEFAULT: 3

min_crop_width

Minimum cropping width in characters for panels in cases where more than one panel is to be displayed. This is for instance used to calculate the number of models to display in a Dataset peek(). Only applied if use_min_crop_width is set to True. min_crop_width must be equal to or larger than min_panel_width.

TYPE: NonNegativeInt DEFAULT: 33

use_min_crop_width

Whether the min_crop_width value should be considered in cases where more than one panel is to be displayed, potentially reducing the number of displayed panels.

TYPE: bool DEFAULT: False

max_panels_hor

Maximum number of panels to display horizontally side-by-side at the top level. This value also acts as a ceiling for nested panels; nested panels cannot exceed this limit even if the constant MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED is set to a higher value. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 9

max_nesting_depth

Maximum levels of nested panels to display. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 3

justify

Justification mode for the panel if inside a layout panel. This is only used for the panel content.

TYPE: Justify.Literals DEFAULT: 'left'

Source code in src/omnipy/data/_mixins/display.py
def browse(self, **kwargs) -> None:
    # %% Original docstring (managed by expand_docstr_macros.py) %%
    # {{BROWSE_SUMMARY}}
    #
    # {{BROWSE_DESCRIPTION}}
    #
    # {{DISPLAY_METHOD_ARGS}}
    #
    """Opens the model or dataset in a browser, if possible.

    For models, this is a detailed view of the model's content,
    and for datasets this is a detailed view of each model
    contained in the dataset, one model per browser tab.

    Args:
        width (NonNegativeInt | None):
            Width in characters of the output area (None for
            auto-detect based on available display dimensions).
        height (NonNegativeInt | None):
            Height in lines of the output area (None for
            auto-detect based on available display dimensions).
        tab (NonNegativeInt):
            Number of spaces to use for each tab.
        indent (NonNegativeInt):
            Number of spaces to use for each indentation level.
        printer (PrettyPrinterLib.Literals):
            Library to use for pretty printing.
        syntax (SyntaxLanguageSpec.Literals | str):
            Syntax language for code highlighting. Supported
            lexers are defined in SyntaxLanguageSpec. For
            non-supported lexers, the user can specify a string
            with the Pygments lexer name. For this to work, the
            lexer must be registered in the Pygments library.
        freedom (float | None):
            Parameter that controls the level of freedom for
            formatted text to follow the geometry of the frame
            size (=total available area) in a proportional manner.
            If the proportional freedom is 0 (the lowest), then
            the output area must not in any case be proportionally
            wider than the frame (i.e. a 16/9 frame will only
            produce output that is 16/9 or narrower). Larger
            values of proportional freedom allow the output to be
            proportionally wider than the total available frame,
            to a degree that relates to the size difference
            between the frame and the content (larger difference
            gives more freedom). The default value of 2.5 is a
            good compromise between readability/aesthetics and
            good use of screen real estate. If None, the freedom is
            unlimited (i.e. proportionality is not taken into
            account at all).
        debug (bool):
            When True, enables additional debugging information in
            the output, such as the hierarchy of the Model
            objects. Currently, only Python pretty printers support
            debug=True. Hence, enabling debug mode will
            automatically set the printer to the default Python
            pretty printer if the `printer` config value is not
            already set.
        ui (UserInterfaceType.Literals):
            Type of user interface for which the output is
            being prepared. The user interface describes the
            technical solutions available for interacting with the
            user, encompassing the support available for
            displaying output as well as how the user interacts
            with the library (including the type of interactive
            interpreter used, if any).
        system (ColorSystem.Literals):
            Color system to use for terminal output. The default
            is `AUTO`, which automatically detects the color
            system based on particular environment variables. If
            color capabilities are not detected, the output will
            be in black and white. If the color system of a modern
            console/terminal is not auto-detected (which is the
            case for e.g. the PyCharm console), the user might
            want to set the color system manually to ANSI_RGB to
            force color output.
        style (AllColorStyles.Literals | str):
            Color style/theme for syntax highlighting and other
            display elements. Supported styles are defined in
            AllColorStyles. For non-supported styles, the user can
            specify a string with the Pygments style name. For this to
            work, the style must be registered in the Pygments
            library. If style is `AUTO` or any of the other
            RecommendedColorStyles, the style is automatically
            selected from the RecommendedColorStyles based on the
            detected user interface, the color system, and whether the
            background is dark or not.
        dark (DarkBackground.Literals):
            Whether the background color of the output is dark.
            This is used to determine the appropriate color scheme
            for syntax highlighting. The default is AUTO, which
            automatically tries to detect whether the background
            is dark. Capability of auto-detection depends on the
            user interface.
        bg (bool):
            If False, uses transparent background for the output.
            In the case of terminal output, the background color
            will be the current background color of the terminal.
            For HTML output, the background color will be
            automatically set to pure black or pure white,
            depending on the luminosity of the foreground color.
        fonts (Tuple[str, ...]):
            Font families to use in HTML output, in order of
            preference (empty tuple for browser default).
        font_size (NonNegativeInt | None):
            Font size in pixels for HTML output (None for browser
            default).
        font_weight (NonNegativeInt | None):
            Font weight for HTML output (None for browser
            default).
        line_height (NonNegativeFloat | None):
            Line height multiplier for HTML output (None for
            browser default).
        h_overflow (HorizontalOverflowMode.Literals):
            How to handle text that exceeds the width.
        v_overflow (VerticalOverflowMode.Literals):
            How to handle text that exceeds the height.
        panel (PanelDesign.Literals):
            Visual design of the panel used as container for the
            output. Only `TABLE` is currently supported, which
            displays the output in a table-like grid.
        title_at_top (bool):
            Whether panel titles will be displayed over the panel
            content (True) or below the content (False).
        max_title_height (MaxTitleHeight.Literals):
            Maximum height of the panel title. If `AUTO`, the
            height is determined by the content of the title, up
            to a maximum of two lines. If `ZERO`, the title is not
            displayed at all. If `ONE` or `TWO`, the title is
            displayed with a fixed height of max one or two lines,
            respectively.
        min_panel_width (NonNegativeInt):
            Minimum width in characters per panel.
        min_crop_width (NonNegativeInt):
            Minimum cropping width in characters for panels in
            cases where more than one panel is to be displayed.
            This is for instance used to calculate the number of
            models to display in a Dataset peek(). Only applied if
            `use_min_crop_width` is set to `True`.
            `min_crop_width` must be equal to or larger than
            `min_panel_width`.
        use_min_crop_width (bool):
            Whether the `min_crop_width` value should be
            considered in cases where more than one panel is to
            be displayed, potentially reducing the number of
            displayed panels.
        max_panels_hor (NonNegativeInt | None):
            Maximum number of panels to display horizontally
            side-by-side at the top level. This value also acts as
            a ceiling for nested panels; nested panels cannot
            exceed this limit even if the constant
            `MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED` is set to a
            higher value. If None, there is no limit.
        max_nesting_depth (NonNegativeInt | None):
            Maximum levels of nested panels to display. If None,
            there is no limit.
        justify (Justify.Literals):
            Justification mode for the panel if inside a layout
            panel. This is only used for the panel content.
    """
    self._browse(**kwargs)

clone_dataset_cls classmethod

clone_dataset_cls(
    new_dataset_cls_name: str, model_cls: type[_NewModelT] | None = None
) -> type[Self]
Source code in src/omnipy/data/dataset.py
@classmethod
def clone_dataset_cls(cls,
                      new_dataset_cls_name: str,
                      model_cls: type[_NewModelT] | None = None) -> type[Self]:
    if model_cls:
        generic_dataset_cls = cls.__bases__[0]
        new_base_cls = generic_dataset_cls[model_cls]  # type: ignore[index]
    else:
        new_base_cls = cls

    new_dataset_cls = type(new_dataset_cls_name, (new_base_cls,), {})
    return new_dataset_cls
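
As a rough illustration of the `else` branch above (no `model_cls` given), the three-argument form of the built-in `type()` creates a new, independently named subclass. `ToyDataset` and `clone_cls` are hypothetical stand-ins used only for this sketch; they are not part of omnipy, and the `model_cls` re-specialization branch is not modelled here.

```python
class ToyDataset(dict):
    """Hypothetical stand-in for an already specialized Dataset class."""


def clone_cls(cls: type, new_cls_name: str) -> type:
    # Same call shape as in the listing above: type(name, bases, namespace)
    return type(new_cls_name, (cls,), {})


ClonedDataset = clone_cls(ToyDataset, 'ClonedDataset')

# The clone is a distinct class that still inherits from the original
print(ClonedDataset.__name__, issubclass(ClonedDataset, ToyDataset))
```

Because the clone is a subclass rather than an alias, class-level changes made to the clone do not leak back into the original class.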

copy

copy(*, deep: bool = False, **kwargs) -> Self
Source code in src/omnipy/data/dataset.py
def copy(self, *, deep: bool = False, **kwargs) -> Self:
    pydantic_copy = pyd.GenericModel.copy(self, deep=deep, **kwargs)
    if not deep:
        object.__setattr__(pydantic_copy, DATA_KEY, pydantic_copy.__dict__[DATA_KEY].copy())

    return pydantic_copy  # pyright: ignore [reportReturnType]
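
The listing above suggests that a shallow copy receives its own top-level data dict, while the contained values are still shared with the original. A minimal sketch of that distinction, using a hypothetical `ToyDataset` built on `collections.UserDict` rather than the real pydantic-based class:

```python
import copy
from collections import UserDict


class ToyDataset(UserDict):
    """Hypothetical stand-in; real Dataset values are Model objects."""

    def toy_copy(self, *, deep: bool = False) -> 'ToyDataset':
        if deep:
            return copy.deepcopy(self)
        new = self.__class__()
        # Shallow copy: new top-level dict, but the values are shared
        new.data = self.data.copy()
        return new


original = ToyDataset({'file_1': [1, 2, 3]})

shallow = original.toy_copy()
shallow['file_2'] = [4]        # only affects the copy
shallow['file_1'].append(4)    # shared list: also visible in the original

deep = original.toy_copy(deep=True)
deep['file_1'].append(5)       # independent list: original is untouched
```

In this sketch, adding or removing data files on a shallow copy leaves the original untouched, whereas mutating a shared value in place does not.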

deepcopy_context

deepcopy_context(
    top_level_entry_func: Callable[[], None], top_level_exit_func: Callable[[], None]
) -> ContextManager[int]
Source code in src/omnipy/data/_data_class_creator.py
def deepcopy_context(
    self,
    top_level_entry_func: Callable[[], None],
    top_level_exit_func: Callable[[], None],
) -> ContextManager[int]:
    return self.__class__.data_class_creator.deepcopy_context(top_level_entry_func,
                                                              top_level_exit_func)

default_repr_to_terminal_str

default_repr_to_terminal_str(ui_type: TerminalOutputUserInterfaceType.Literals) -> str
Source code in src/omnipy/data/_mixins/display.py
def default_repr_to_terminal_str(
    self,
    ui_type: TerminalOutputUserInterfaceType.Literals,
) -> str:
    return self._display_according_to_ui_type(
        ui_type=ui_type,
        return_output_if_str=True,
        output_method=self._default_panel,
    )

dict

dict(**kwargs) -> dict_t[str, Any]
Source code in src/omnipy/data/dataset.py
def dict(self, **kwargs) -> dict_t[str, Any]:
    return super().dict(**kwargs)[DATA_KEY]

do

do(placeholder: F) -> Dataset[_ModelOrDatasetT]
Source code in src/omnipy/data/dataset.py
def do(self, placeholder: F) -> 'Dataset[_ModelOrDatasetT]':
    new_dataset = self.__class__()
    for data_file, val in self.items():
        new_dataset[data_file] = placeholder(val)
    return new_dataset
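
Read from the listing above, `do()` applies a callable to every data file and collects the results in a new container of the same class. A minimal sketch of that pattern with a hypothetical `ToyDataset` (a plain `UserDict`, so no model validation takes place):

```python
from collections import UserDict
from typing import Any, Callable


class ToyDataset(UserDict):
    """Hypothetical stand-in for a Dataset; values are not validated."""

    def do(self, func: Callable[[Any], Any]) -> 'ToyDataset':
        new_dataset = self.__class__()
        for data_file, val in self.items():
            new_dataset[data_file] = func(val)
        return new_dataset


numbers = ToyDataset({'file_1': [1, 2, 3], 'file_2': [2, 3, 4]})
doubled = numbers.do(lambda values: [2 * v for v in values])
```

Since the result in the real method is an instance of the same Dataset class, the callable's return value presumably has to be parseable by that dataset's model.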

failed_task_details

failed_task_details() -> dict[str, IsFailedData]
Source code in src/omnipy/data/_mixins/task.py
def failed_task_details(self) -> dict[str, IsFailedData]:
    self_with_data = cast(HasData, self)
    return {  # pyright: ignore [reportReturnType]
        key: val for key, val in self_with_data.data.items() if isinstance(val, FailedData)
    }
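
The filtering step above is an ordinary dict comprehension keyed on `isinstance`. A self-contained sketch, where `ToyFailedData` is a hypothetical placeholder for omnipy's `FailedData` (whose actual fields are not shown in this documentation):

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyFailedData:
    """Hypothetical marker object for a data file whose task failed."""
    exception: Exception


data: dict[str, Any] = {
    'file_ok': [1, 2, 3],
    'file_bad': ToyFailedData(ValueError('could not parse')),
}

# Keep only the entries whose value is a failure marker
failed = {key: val for key, val in data.items() if isinstance(val, ToyFailedData)}
```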

from_data

from_data(data: Mapping[str, Any] | Iterable[tuple[str, Any]], update: bool = True) -> None
Source code in src/omnipy/data/dataset.py
def from_data(self,
              data: Mapping[str, Any] | Iterable[tuple[str, Any]],
              update: bool = True) -> None:
    def callback_func(type_variant: 'Model | Dataset', content: Any):
        type_variant.from_data(content)

    self._from_dict_with_callback(data, update, callback_func)

from_json

from_json(data: Mapping[str, str] | Iterable[tuple[str, str]], update: bool = True) -> None
Source code in src/omnipy/data/dataset.py
def from_json(self,
              data: Mapping[str, str] | Iterable[tuple[str, str]],
              update: bool = True) -> None:
    def callback_func(type_variant: 'Model | Dataset', content: Any):
        type_variant.from_json(content)

    self._from_dict_with_callback(data, update, callback_func)
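
According to the signature, `from_json` accepts either a mapping or an iterable of `(name, json_string)` pairs, with one JSON document per data file. The sketch below only illustrates that input shape with the standard `json` module. `parse_json_files` is a hypothetical helper, and neither the `update` flag nor model validation is modelled.

```python
from __future__ import annotations

import json
from collections.abc import Iterable, Mapping
from typing import Any


def parse_json_files(
        data: Mapping[str, str] | Iterable[tuple[str, str]]) -> dict[str, Any]:
    # dict() accepts both a mapping and an iterable of key/value pairs
    return {name: json.loads(content) for name, content in dict(data).items()}


from_mapping = parse_json_files({'file_1': '[1, 2, 3]', 'file_2': '{"a": 1}'})
from_pairs = parse_json_files([('file_1', '[1, 2, 3]'), ('file_2', '{"a": 1}')])
```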

full

full(
    *,
    width: pyd.NonNegativeInt | None = None,
    height: pyd.NonNegativeInt | None = None,
    tab: pyd.NonNegativeInt = 4,
    indent: pyd.NonNegativeInt = 2,
    printer: PrettyPrinterLib.Literals = "auto",
    syntax: SyntaxLanguageSpec.Literals | str = "auto",
    freedom: pyd.NonNegativeFloat | None = 2.5,
    debug: bool = False,
    ui: UserInterfaceType.Literals = "auto",
    system: DisplayColorSystem.Literals = "auto",
    style: AllColorStyles.Literals | str = "auto",
    dark: Literal = "auto",
    bg: bool = False,
    fonts: tuple = ("Menlo", "DejaVu Sans Mono", "Consolas", "Courier New", "monospace"),
    font_size: pyd.NonNegativeInt | None = 14,
    font_weight: pyd.NonNegativeInt | None = 400,
    line_height: pyd.NonNegativeFloat | None = 1.25,
    h_overflow: HorizontalOverflowMode.Literals = "ellipsis",
    v_overflow: VerticalOverflowMode.Literals = "ellipsis_bottom",
    panel: PanelDesign.Literals = "table",
    title_at_top: bool = True,
    max_title_height: MaxTitleHeight.Literals = -1,
    min_panel_width: pyd.NonNegativeInt = 3,
    min_crop_width: pyd.NonNegativeInt = 33,
    use_min_crop_width: bool = False,
    max_panels_hor: pyd.NonNegativeInt | None = 9,
    max_nesting_depth: pyd.NonNegativeInt | None = 3,
    justify: Justify.Literals = "left",
) -> Element | None

Display the content of the Model or Dataset in full height.

full() is a shorthand for peek(height=None) for both models and datasets. Both full-height views are automatically limited in width by the available display dimensions.

PARAMETER DESCRIPTION
width

Width in characters of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

height

Height in lines of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

tab

Number of spaces to use for each tab.

TYPE: NonNegativeInt DEFAULT: 4

indent

Number of spaces to use for each indentation level.

TYPE: NonNegativeInt DEFAULT: 2

printer

Library to use for pretty printing.

TYPE: PrettyPrinterLib.Literals DEFAULT: 'auto'

syntax

Syntax language for code highlighting. Supported lexers are defined in SyntaxLanguageSpec. For non-supported lexers, the user can specify a string with the Pygments lexer name. For this to work, the lexer must be registered in the Pygments library.

TYPE: SyntaxLanguageSpec.Literals | str DEFAULT: 'auto'

freedom

Parameter that controls the level of freedom for formatted text to follow the geometry of the frame size (=total available area) in a proportional manner. If the proportional freedom is 0 (the lowest), then the output area must not in any case be proportionally wider than the frame (i.e. a 16/9 frame will only produce output that is 16/9 or narrower). Larger values of proportional freedom allow the output to be proportionally wider than the total available frame, to a degree that relates to the size difference between the frame and the content (larger difference gives more freedom). The default value of 2.5 is a good compromise between readability/aesthetics and good use of screen real estate. If None, the freedom is unlimited (i.e. proportionality is not taken into account at all).

TYPE: float | None DEFAULT: 2.5

debug

When True, enables additional debugging information in the output, such as the hierarchy of the Model objects. Currently, only Python pretty printers support debug=True. Hence, enabling debug mode will automatically set the printer to the default Python pretty printer if the printer config value is not already set.

TYPE: bool DEFAULT: False

ui

Type of user interface for which the output is being prepared. The user interface describes the technical solutions available for interacting with the user, encompassing the support available for displaying output as well as how the user interacts with the library (including the type of interactive interpreter used, if any).

TYPE: UserInterfaceType.Literals DEFAULT: 'auto'

system

Color system to use for terminal output. The default is AUTO, which automatically detects the color system based on particular environment variables. If color capabilities are not detected, the output will be in black and white. If the color system of a modern console/terminal is not auto-detected (which is the case for e.g. the PyCharm console), the user might want to set the color system manually to ANSI_RGB to force color output.

TYPE: ColorSystem.Literals DEFAULT: 'auto'

style

Color style/theme for syntax highlighting and other display elements. Supported styles are defined in AllColorStyles. For non-supported styles, the user can specify a string with the Pygments style name. For this to work, the style must be registered in the Pygments library. If style is AUTO or any of the other RecommendedColorStyles, the style is automatically selected from the RecommendedColorStyles based on the detected user interface, the color system, and whether the background is dark or not.

TYPE: AllColorStyles.Literals | str DEFAULT: 'auto'

dark

Whether the background color of the output is dark. This is used to determine the appropriate color scheme for syntax highlighting. The default is AUTO, which automatically tries to detect whether the background is dark. Capability of auto-detection depends on the user interface.

TYPE: DarkBackground.Literals DEFAULT: 'auto'

bg

If False, uses transparent background for the output. In the case of terminal output, the background color will be the current background color of the terminal. For HTML output, the background color will be automatically set to pure black or pure white, depending on the luminosity of the foreground color.

TYPE: bool DEFAULT: False

fonts

Font families to use in HTML output, in order of preference (empty tuple for browser default).

TYPE: Tuple[str, ...] DEFAULT: ('Menlo', 'DejaVu Sans Mono', 'Consolas', 'Courier New', 'monospace')

font_size

Font size in pixels for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 14

font_weight

Font weight for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 400

line_height

Line height multiplier for HTML output (None for browser default).

TYPE: NonNegativeFloat | None DEFAULT: 1.25

h_overflow

How to handle text that exceeds the width.

TYPE: HorizontalOverflowMode.Literals DEFAULT: 'ellipsis'

v_overflow

How to handle text that exceeds the height.

TYPE: VerticalOverflowMode.Literals DEFAULT: 'ellipsis_bottom'

panel

Visual design of the panel used as container for the output. Only TABLE is currently supported, which displays the output in a table-like grid.

TYPE: PanelDesign.Literals DEFAULT: 'table'

title_at_top

Whether panel titles will be displayed over the panel content (True) or below the content (False).

TYPE: bool DEFAULT: True

max_title_height

Maximum height of the panel title. If AUTO, the height is determined by the content of the title, up to a maximum of two lines. If ZERO, the title is not displayed at all. If ONE or TWO, the title is displayed with a fixed height of max one or two lines, respectively.

TYPE: MaxTitleHeight.Literals DEFAULT: -1

min_panel_width

Minimum width in characters per panel.

TYPE: NonNegativeInt DEFAULT: 3

min_crop_width

Minimum cropping width in characters for panels in cases where more than one panel is to be displayed. This is for instance used to calculate the number of models to display in a Dataset peek(). Only applied if use_min_crop_width is set to True. min_crop_width must be equal to or larger than min_panel_width.

TYPE: NonNegativeInt DEFAULT: 33

use_min_crop_width

Whether the min_crop_width value should be considered in cases where more than one panel is to be displayed, potentially reducing the number of displayed panels.

TYPE: bool DEFAULT: False

max_panels_hor

Maximum number of panels to display horizontally side-by-side at the top level. This value also acts as a ceiling for nested panels; nested panels cannot exceed this limit even if the constant MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED is set to a higher value. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 9

max_nesting_depth

Maximum levels of nested panels to display. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 3

justify

Justification mode for the panel if inside a layout panel. This is only used for the panel content.

TYPE: Justify.Literals DEFAULT: 'left'

RETURNS DESCRIPTION
Element | None

If the UI type is Jupyter running in browser, the method returns a ReactivelyResizingHtml element which is a Jupyter widget to display HTML output in the browser. Otherwise, the method returns None.

Note

Any default argument value is overridden by the corresponding value in the relevant subsection of the UserInterfaceConfig.
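
The `max_panels_hor` ceiling described in the parameter list above can be read as a simple `min()` rule. The sketch below is one interpretation only: the constant's value is a made-up placeholder, `nested_panel_limit` is a hypothetical helper, and it is an assumption that the nested constant still applies when `max_panels_hor` is None.

```python
from __future__ import annotations

# Hypothetical value; the real constant's value is not shown in this documentation
MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED = 4


def nested_panel_limit(max_panels_hor: int | None) -> int:
    """Effective horizontal panel limit for nested panels (illustrative only)."""
    if max_panels_hor is None:
        # Assumption: with no top-level limit, only the nested constant applies
        return MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED
    # The top-level limit acts as a ceiling for the nested limit
    return min(MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED, max_panels_hor)


limit_default = nested_panel_limit(9)
limit_tight = nested_panel_limit(2)
```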

Source code in src/omnipy/data/_mixins/display.py
def full(self, **kwargs) -> 'Element | None':
    # %% Original docstring (managed by expand_docstr_macros.py) %%
    # {{FULL_SUMMARY}}
    #
    # {{FULL_DESCRIPTION}}
    #
    # {{DISPLAY_METHOD_ARGS}}
    #
    # {{DISPLAY_METHOD_RETURNS}}
    #
    # {{DISPLAY_METHOD_NOTE}}
    #
    #
    """Display the content of the Model or Dataset in full height.

    `full()` is a shorthand for `peek(height=None)` for both
    models and datasets. Both full-height views are automatically
    limited in width by the available display dimensions.

    Args:
        width (NonNegativeInt | None):
            Width in characters of the output area (None for
            auto-detect based on available display dimensions).
        height (NonNegativeInt | None):
            Height in lines of the output area (None for
            auto-detect based on available display dimensions).
        tab (NonNegativeInt):
            Number of spaces to use for each tab.
        indent (NonNegativeInt):
            Number of spaces to use for each indentation level.
        printer (PrettyPrinterLib.Literals):
            Library to use for pretty printing.
        syntax (SyntaxLanguageSpec.Literals | str):
            Syntax language for code highlighting. Supported
            lexers are defined in SyntaxLanguageSpec. For
            non-supported lexers, the user can specify a string
            with the Pygments lexer name. For this to work, the
            lexer must be registered in the Pygments library.
        freedom (float | None):
            Parameter that controls the level of freedom for
            formatted text to follow the geometry of the frame
            size (=total available area) in a proportional manner.
            If the proportional freedom is 0 (the lowest), then
            the output area must not in any case be proportionally
            wider than the frame (i.e. a 16/9 frame will only
            produce output that is 16/9 or narrower). Larger
            values of proportional freedom allow the output to be
            proportionally wider than the total available frame,
            to a degree that relates to the size difference
            between the frame and the content (larger difference
            gives more freedom). The default value of 2.5 is a
            good compromise between readability/aesthetics and
            good use of screen real estate. If None, the freedom is
            unlimited (i.e. proportionality is not taken into
            account at all).
        debug (bool):
            When True, enables additional debugging information in
            the output, such as the hierarchy of the Model
            objects. Currently, only Python pretty printers support
            debug=True. Hence, enabling debug mode will
            automatically set the printer to the default Python
            pretty printer if the `printer` config value is not
            already set.
        ui (UserInterfaceType.Literals):
            Type of user interface for which the output is
            being prepared. The user interface describes the
            technical solutions available for interacting with the
            user, encompassing the support available for
            displaying output as well as how the user interacts
            with the library (including the type of interactive
            interpreter used, if any).
        system (ColorSystem.Literals):
            Color system to use for terminal output. The default
            is `AUTO`, which automatically detects the color
            system based on particular environment variables. If
            color capabilities are not detected, the output will
            be in black and white. If the color system of a modern
            console/terminal is not auto-detected (which is the
            case for e.g. the PyCharm console), the user might
            want to set the color system manually to ANSI_RGB to
            force color output.
        style (AllColorStyles.Literals | str):
            Color style/theme for syntax highlighting and other
            display elements. Supported styles are defined in
            AllColorStyles. For non-supported styles, the user can
            specify a string with the Pygments style name. For this to
            work, the style must be registered in the Pygments
            library. If style is `AUTO` or any of the other
            RecommendedColorStyles, the style is automatically
            selected from the RecommendedColorStyles based on the
            detected user interface, the color system, and whether the
            background is dark or not.
        dark (DarkBackground.Literals):
            Whether the background color of the output is dark.
            This is used to determine the appropriate color scheme
            for syntax highlighting. The default is AUTO, which
            automatically tries to detect whether the background
            is dark. Capability of auto-detection depends on the
            user interface.
        bg (bool):
            If False, uses transparent background for the output.
            In the case of terminal output, the background color
            will be the current background color of the terminal.
            For HTML output, the background color will be
            automatically set to pure black or pure white,
            depending on the luminosity of the foreground color.
        fonts (Tuple[str, ...]):
            Font families to use in HTML output, in order of
            preference (empty tuple for browser default).
        font_size (NonNegativeInt | None):
            Font size in pixels for HTML output (None for browser
            default).
        font_weight (NonNegativeInt | None):
            Font weight for HTML output (None for browser
            default).
        line_height (NonNegativeFloat | None):
            Line height multiplier for HTML output (None for
            browser default).
        h_overflow (HorizontalOverflowMode.Literals):
            How to handle text that exceeds the width.
        v_overflow (VerticalOverflowMode.Literals):
            How to handle text that exceeds the height.
        panel (PanelDesign.Literals):
            Visual design of the panel used as container for the
            output. Only `TABLE` is currently supported, which
            displays the output in a table-like grid.
        title_at_top (bool):
            Whether panel titles will be displayed over the panel
            content (True) or below the content (False).
        max_title_height (MaxTitleHeight.Literals):
            Maximum height of the panel title. If `AUTO`, the
            height is determined by the content of the title, up
            to a maximum of two lines. If `ZERO`, the title is not
            displayed at all. If `ONE` or `TWO`, the title is
            displayed with a fixed height of max one or two lines,
            respectively.
        min_panel_width (NonNegativeInt):
            Minimum width in characters per panel.
        min_crop_width (NonNegativeInt):
            Minimum cropping width in characters for panels in
            cases where more than one panel is to be displayed.
            This is for instance used to calculate the number of
            models to display in a Dataset peek(). Only applied if
            `use_min_crop_width` is set to `True`.
            `min_crop_width` must be equal to or larger than
            `min_panel_width`.
        use_min_crop_width (bool):
            Whether the `min_crop_width` value should be
            considered in cases where more than one panel is to
            be displayed, potentially reducing the number of
            displayed panels.
        max_panels_hor (NonNegativeInt | None):
            Maximum number of panels to display horizontally
            side-by-side at the top level. This value also acts as
            a ceiling for nested panels; nested panels cannot
            exceed this limit even if the constant
            `MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED` is set to a
            higher value. If None, there is no limit.
        max_nesting_depth (NonNegativeInt | None):
            Maximum levels of nested panels to display. If None,
            there is no limit.
        justify (Justify.Literals):
            Justification mode for the panel if inside a layout
            panel. This is only used for the panel content.

    Returns:
        If the UI type is Jupyter running in browser, the
        method returns a ReactivelyResizingHtml element which
        is a Jupyter widget to display HTML output in the
        browser. Otherwise, the method returns None.

    Note:
        Any default argument value is overridden by the
        corresponding value in the relevant subsection of the
        UserInterfaceConfig.

    """
    return self._display_according_to_ui_type(
        ui_type=self._extract_ui_type(**kwargs),
        return_output_if_str=False,
        output_method=self._full,
        **kwargs)

get_type cached classmethod

get_type() -> type[_ModelOrDatasetT]

Returns the concrete type (Model or Dataset class) used for all data files in the dataset, e.g. Model[list[int]], or Dataset[Model[dict[str, float]]] for nested datasets.

RETURNS DESCRIPTION
type[_ModelOrDatasetT]

The concrete type (Model or Dataset class) used for all data files in the dataset.

Source code in src/omnipy/data/dataset.py
@classmethod
@functools.cache
def get_type(cls) -> type[_ModelOrDatasetT]:
    """
    Returns the concrete type (Model or Dataset class) used for all
    data files in the dataset, e.g.: `Model[list[int]]`, or
    `Dataset[Model[dict[str, float]]]` for nested datasets.
    :return: The concrete type (Model or Dataset class) used for all
             data files in the dataset.
    """
    # Part of pydantic v1 hack to stop coercing of e.g.
    # [{'a': 'b', 'c': 'd'}] to {'a': 'c'}
    return cls._clean_type(cls._get_data_field().sub_fields[1].type_)  # type: ignore[index]
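The coercion mentioned in the source comment can be reproduced with the built-in `dict` constructor alone. The sketch below is plain Python, not Omnipy or pydantic code. It only shows why a one-element list holding a two-key dict is accepted as an "iterable of pairs": iterating over a dict yields its keys, so the inner dict is read as the pair `('a', 'c')`.

```python
# A list with a single two-key dict, as in the comment in get_type().
data = [{'a': 'b', 'c': 'd'}]

# dict() expects an iterable of 2-item iterables. Iterating over the inner
# dict yields its keys 'a' and 'c', which are taken as one (key, value) pair.
coerced = dict(data)

print(coerced)  # {'a': 'c'}
```

The values `'b'` and `'d'` are silently lost, which is the kind of lossy conversion the comment says the hack is meant to stop.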

json

json(
    *,
    width: pyd.NonNegativeInt | None = None,
    height: pyd.NonNegativeInt | None = None,
    tab: pyd.NonNegativeInt = 4,
    indent: pyd.NonNegativeInt = 2,
    printer: PrettyPrinterLib.Literals = "auto",
    syntax: SyntaxLanguageSpec.Literals | str = "auto",
    freedom: pyd.NonNegativeFloat | None = 2.5,
    debug: bool = False,
    ui: UserInterfaceType.Literals = "auto",
    system: DisplayColorSystem.Literals = "auto",
    style: AllColorStyles.Literals | str = "auto",
    dark: Literal = "auto",
    bg: bool = False,
    fonts: tuple = ("Menlo", "DejaVu Sans Mono", "Consolas", "Courier New", "monospace"),
    font_size: pyd.NonNegativeInt | None = 14,
    font_weight: pyd.NonNegativeInt | None = 400,
    line_height: pyd.NonNegativeFloat | None = 1.25,
    h_overflow: HorizontalOverflowMode.Literals = "ellipsis",
    v_overflow: VerticalOverflowMode.Literals = "ellipsis_bottom",
    panel: PanelDesign.Literals = "table",
    title_at_top: bool = True,
    max_title_height: MaxTitleHeight.Literals = -1,
    min_panel_width: pyd.NonNegativeInt = 3,
    min_crop_width: pyd.NonNegativeInt = 33,
    use_min_crop_width: bool = False,
    max_panels_hor: pyd.NonNegativeInt | None = 9,
    max_nesting_depth: pyd.NonNegativeInt | None = 3,
    justify: Justify.Literals = "left",
) -> Element | None

Preview the data content of the Model or Dataset as JSON.

In contrast to e.g. peek(), json() displays the "data content" of the Model or Dataset, i.e. the content as plain Python objects, potentially converted from the internal data structure. This plain data is formatted in JSON (for compactness). Hence json() represents the basic compatibility layer of all Omnipy Model or Dataset objects. The view is automatically limited by the available display dimensions.
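To illustrate what "plain Python objects formatted in JSON" means, the sketch below uses only the standard-library `json` module. It is not Omnipy's implementation, and `plain_content` is a hypothetical stand-in for the data content of a dataset with two data files.

```python
import json

# Hypothetical data content of a dataset with two data files, already
# converted to plain Python objects (dicts, lists, ints).
plain_content = {'file_1': [1, 2, 3], 'file_2': [2, 3, 4]}

# indent=2 matches the default value shown for `indent` in the signature.
text = json.dumps(plain_content, indent=2)
print(text)
```

Because the output is plain JSON, it can be parsed back with `json.loads(text)` by any tool that knows nothing about Omnipy models.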

PARAMETER DESCRIPTION
width

Width in characters of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

height

Height in lines of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

tab

Number of spaces to use for each tab.

TYPE: NonNegativeInt DEFAULT: 4

indent

Number of spaces to use for each indentation level.

TYPE: NonNegativeInt DEFAULT: 2

printer

Library to use for pretty printing.

TYPE: PrettyPrinterLib.Literals DEFAULT: 'auto'

syntax

Syntax language for code highlighting. Supported lexers are defined in SyntaxLanguageSpec. For non-supported lexers, the user can specify a string with the Pygments lexer name. For this to work, the lexer must be registered in the Pygments library.

TYPE: SyntaxLanguageSpec.Literals | str DEFAULT: 'auto'

freedom

Parameter that controls the level of freedom for formatted text to follow the geometry of the frame size (=total available area) in a proportional manner. If the proportional freedom is 0 (the lowest), then the output area must not in any case be proportionally wider than the frame (i.e. a 16/9 frame will only produce output that is 16/9 or narrower). Larger values of proportional freedom allow the output to be proportionally wider than the total available frame, to a degree that relates to the size difference between the frame and the content (larger difference gives more freedom). The default value of 2.5 is a good compromise between readability/aesthetics and good use of the screen real estate. If None, the freedom is unlimited (i.e. proportionality is not taken into account at all).

TYPE: float | None DEFAULT: 2.5

debug

When True, enables additional debugging information in the output, such as the hierarchy of the Model objects. Currently, only Python pretty printers support debug=True. Hence, enabling debug mode will automatically set the printer to the default Python pretty printer if the printer config value is not already set.

TYPE: bool DEFAULT: False

ui

Type of user interface for which the output should be prepared. The user interface describes the technical solutions available for interacting with the user, encompassing the support available for displaying output as well as how the user interacts with the library (including the type of interactive interpreter used, if any).

TYPE: UserInterfaceType.Literals DEFAULT: 'auto'

system

Color system to use for terminal output. The default is AUTO, which automatically detects the color system based on particular environment variables. If color capabilities are not detected, the output will be in black and white. If the color system of a modern console/terminal is not auto-detected (which is the case for e.g. the PyCharm console), the user might want to set the color system manually to ANSI_RGB to force color output.

TYPE: ColorSystem.Literals DEFAULT: 'auto'

style

Color style/theme for syntax highlighting and other display elements. Supported styles are defined in AllColorStyles. For non-supported styles, the user can specify a string with the Pygments style name. For this to work, the style must be registered in the Pygments library. If style is AUTO or any of the other RecommendedColorStyles, the style is automatically selected from the RecommendedColorStyles based on the detected user interface, the color system, and whether the background is dark or not.

TYPE: AllColorStyles.Literals | str DEFAULT: 'auto'

dark

Whether the background color of the output is dark. This is used to determine the appropriate color scheme for syntax highlighting. The default is AUTO, which automatically tries to detect whether the background is dark. Capability of auto-detection depends on the user interface.

TYPE: DarkBackground.Literals DEFAULT: 'auto'

bg

If False, uses transparent background for the output. In the case of terminal output, the background color will be the current background color of the terminal. For HTML output, the background color will be automatically set to pure black or pure white, depending on the luminosity of the foreground color.

TYPE: bool DEFAULT: False

fonts

Font families to use in HTML output, in order of preference (empty tuple for browser default).

TYPE: Tuple[str, ...] DEFAULT: ('Menlo', 'DejaVu Sans Mono', 'Consolas', 'Courier New', 'monospace')

font_size

Font size in pixels for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 14

font_weight

Font weight for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 400

line_height

Line height multiplier for HTML output (None for browser default).

TYPE: NonNegativeFloat | None DEFAULT: 1.25

h_overflow

How to handle text that exceeds the width.

TYPE: HorizontalOverflowMode.Literals DEFAULT: 'ellipsis'

v_overflow

How to handle text that exceeds the height.

TYPE: VerticalOverflowMode.Literals DEFAULT: 'ellipsis_bottom'

panel

Visual design of the panel used as container for the output. Only TABLE is currently supported, which displays the output in a table-like grid.

TYPE: PanelDesign.Literals DEFAULT: 'table'

title_at_top

Whether panel titles will be displayed over the panel content (True) or below the content (False).

TYPE: bool DEFAULT: True

max_title_height

Maximum height of the panel title. If AUTO, the height is determined by the content of the title, up to a maximum of two lines. If ZERO, the title is not displayed at all. If ONE or TWO, the title is displayed with a fixed height of max one or two lines, respectively.

TYPE: MaxTitleHeight.Literals DEFAULT: -1

min_panel_width

Minimum width in characters per panel.

TYPE: NonNegativeInt DEFAULT: 3

min_crop_width

Minimum cropping width in characters for panels in cases where more than one panel is to be displayed. This is for instance used to calculate the number of models to display in a Dataset peek(). Only applied if use_min_crop_width is set to True. min_crop_width must be equal to or larger than min_panel_width.

TYPE: NonNegativeInt DEFAULT: 33

use_min_crop_width

Whether the min_crop_width value should be considered in cases where more than one panel is to be displayed, potentially reducing the number of displayed panels.

TYPE: bool DEFAULT: False

max_panels_hor

Maximum number of panels to display horizontally side-by-side at the top level. This value also acts as a ceiling for nested panels; nested panels cannot exceed this limit even if the constant MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED is set to a higher value. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 9

max_nesting_depth

Maximum levels of nested panels to display. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 3

justify

Justification mode for the panel if inside a layout panel. This is only used for the panel content.

TYPE: Justify.Literals DEFAULT: 'left'

RETURNS DESCRIPTION
Element | None

If the UI type is Jupyter running in browser, the method returns a ReactivelyResizingHtml element which is a Jupyter widget to display HTML output in the browser. Otherwise, the method returns None.

Note

Any default argument value is overridden by the corresponding value in the relevant subsection of the UserInterfaceConfig.
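The note above describes a precedence order between signature defaults and configuration values. The sketch below shows one way such a merge can be expressed with plain dicts. All names in it (`SIGNATURE_DEFAULTS`, `ui_config_values`, `resolve_display_args`) are hypothetical and not part of the Omnipy API, and it assumes that an explicitly passed argument takes precedence over the configuration, which the note does not state.

```python
# Hypothetical stand-ins; none of these names are part of the Omnipy API.
SIGNATURE_DEFAULTS = {'indent': 2, 'tab': 4, 'title_at_top': True}
ui_config_values = {'indent': 4}  # e.g. set once in a user's configuration


def resolve_display_args(**kwargs):
    """Merge the three sources; later sources take precedence."""
    # Assumption: explicit keyword arguments win over config values.
    return {**SIGNATURE_DEFAULTS, **ui_config_values, **kwargs}


# The config value replaces the signature default for `indent`.
print(resolve_display_args())

# An explicit argument replaces both (assumed behaviour).
print(resolve_display_args(indent=8))
```

The practical consequence is that the defaults listed in the signature describe the behaviour only when the configuration has not been changed.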

Source code in src/omnipy/data/_mixins/display.py
def json(self, **kwargs) -> 'Element | None':
    # %% Original docstring (managed by expand_docstr_macros.py) %%
    # {{JSON_SUMMARY}}
    #
    # {{JSON_DESCRIPTION}}
    #
    # {{DISPLAY_METHOD_ARGS}}
    #
    # {{DISPLAY_METHOD_RETURNS}}
    #
    # {{DISPLAY_METHOD_NOTE}}
    #
    """Preview the data content of the Model or Dataset as JSON.

    In contrast to e.g. `peek()`, `json()` displays the "data
    content" of the Model or Dataset, i.e. the content as plain
    Python objects, potentially converted from the internal data
    structure. This plain data is formatted in JSON (for
    compactness). Hence `json()` represents the basic
    compatibility layer of all Omnipy Model or Dataset objects.
    The view is automatically limited by the available display
    dimensions.

    Args:
        width (NonNegativeInt | None):
            Width in characters of the output area (None for
            auto-detect based on available display dimensions).
        height (NonNegativeInt | None):
            Height in lines of the output area (None for
            auto-detect based on available display dimensions).
        tab (NonNegativeInt):
            Number of spaces to use for each tab.
        indent (NonNegativeInt):
            Number of spaces to use for each indentation level.
        printer (PrettyPrinterLib.Literals):
            Library to use for pretty printing.
        syntax (SyntaxLanguageSpec.Literals | str):
            Syntax language for code highlighting. Supported
            lexers are defined in SyntaxLanguageSpec. For
            non-supported lexers, the user can specify a string
            with the Pygments lexer name. For this to work, the
            lexer must be registered in the Pygments library.
        freedom (float | None):
            Parameter that controls the level of freedom for
            formatted text to follow the geometry of the frame
            size (=total available area) in a proportional manner.
            If the proportional freedom is 0 (the lowest), then
            the output area must not in any case be proportionally
            wider than the frame (i.e. a 16/9 frame will only
            produce output that is 16/9 or narrower). Larger
            values of proportional freedom allow the output to be
            proportionally wider than the total available frame,
            to a degree that relates to the size difference
            between the frame and the content (larger difference
            gives more freedom). The default value of 2.5 is a
            good compromise between readability/aesthetics and
            good use of the screen real estate. If None, the freedom is
            unlimited (i.e. proportionality is not taken into
            account at all).
        debug (bool):
            When True, enables additional debugging information in
            the output, such as the hierarchy of the Model
            objects. Currently, only Python pretty printers support
            debug=True. Hence, enabling debug mode will
            automatically set the printer to the default Python
            pretty printer if the `printer` config value is not
            already set.
        ui (UserInterfaceType.Literals):
            Type of user interface for which the output should
            be prepared. The user interface describes the
            technical solutions available for interacting with the
            user, encompassing the support available for
            displaying output as well as how the user interacts
            with the library (including the type of interactive
            interpreter used, if any).
        system (ColorSystem.Literals):
            Color system to use for terminal output. The default
            is `AUTO`, which automatically detects the color
            system based on particular environment variables. If
            color capabilities are not detected, the output will
            be in black and white. If the color system of a modern
            console/terminal is not auto-detected (which is the
            case for e.g. the PyCharm console), the user might
            want to set the color system manually to ANSI_RGB to
            force color output.
        style (AllColorStyles.Literals | str):
            Color style/theme for syntax highlighting and other
            display elements. Supported styles are defined in
            AllColorStyles. For non-supported styles, the user can
            specify a string with the Pygments style name. For this to
            work, the style must be registered in the Pygments
            library. If style is `AUTO` or any of the other
            RecommendedColorStyles, the style is automatically
            selected from the RecommendedColorStyles based on the
            detected user interface, the color system, and whether the
            background is dark or not.
        dark (DarkBackground.Literals):
            Whether the background color of the output is dark.
            This is used to determine the appropriate color scheme
            for syntax highlighting. The default is AUTO, which
            automatically tries to detect whether the background
            is dark. Capability of auto-detection depends on the
            user interface.
        bg (bool):
            If False, uses transparent background for the output.
            In the case of terminal output, the background color
            will be the current background color of the terminal.
            For HTML output, the background color will be
            automatically set to pure black or pure white,
            depending on the luminosity of the foreground color.
        fonts (Tuple[str, ...]):
            Font families to use in HTML output, in order of
            preference (empty tuple for browser default).
        font_size (NonNegativeInt | None):
            Font size in pixels for HTML output (None for browser
            default).
        font_weight (NonNegativeInt | None):
            Font weight for HTML output (None for browser
            default).
        line_height (NonNegativeFloat | None):
            Line height multiplier for HTML output (None for
            browser default).
        h_overflow (HorizontalOverflowMode.Literals):
            How to handle text that exceeds the width.
        v_overflow (VerticalOverflowMode.Literals):
            How to handle text that exceeds the height.
        panel (PanelDesign.Literals):
            Visual design of the panel used as container for the
            output. Only `TABLE` is currently supported, which
            displays the output in a table-like grid.
        title_at_top (bool):
            Whether panel titles will be displayed over the panel
            content (True) or below the content (False).
        max_title_height (MaxTitleHeight.Literals):
            Maximum height of the panel title. If `AUTO`, the
            height is determined by the content of the title, up
            to a maximum of two lines. If `ZERO`, the title is not
            displayed at all. If `ONE` or `TWO`, the title is
            displayed with a fixed height of max one or two lines,
            respectively.
        min_panel_width (NonNegativeInt):
            Minimum width in characters per panel.
        min_crop_width (NonNegativeInt):
            Minimum cropping width in characters for panels in
            cases where more than one panel is to be displayed.
            This is for instance used to calculate the number of
            models to display in a Dataset peek(). Only applied if
            `use_min_crop_width` is set to `True`.
            `min_crop_width` must be equal to or larger than
            `min_panel_width`.
        use_min_crop_width (bool):
            Whether the `min_crop_width` value should be
            considered in cases where more than one panel is to
            be displayed, potentially reducing the number of
            displayed panels.
        max_panels_hor (NonNegativeInt | None):
            Maximum number of panels to display horizontally
            side-by-side at the top level. This value also acts as
            a ceiling for nested panels; nested panels cannot
            exceed this limit even if the constant
            `MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED` is set to a
            higher value. If None, there is no limit.
        max_nesting_depth (NonNegativeInt | None):
            Maximum levels of nested panels to display. If None,
            there is no limit.
        justify (Justify.Literals):
            Justification mode for the panel if inside a layout
            panel. This is only used for the panel content.

    Returns:
        If the UI type is Jupyter running in browser, the
        method returns a ReactivelyResizingHtml element which
        is a Jupyter widget to display HTML output in the
        browser. Otherwise, the method returns None.

    Note:
        Any default argument value is overridden by the
        corresponding value in the relevant subsection of the
        UserInterfaceConfig.
    """
    return self._display_according_to_ui_type(
        ui_type=self._extract_ui_type(**kwargs),
        return_output_if_str=False,
        output_method=self._json,
        **kwargs)

list

list(
    *,
    width: pyd.NonNegativeInt | None = None,
    height: pyd.NonNegativeInt | None = None,
    tab: pyd.NonNegativeInt = 4,
    indent: pyd.NonNegativeInt = 2,
    printer: PrettyPrinterLib.Literals = "auto",
    syntax: SyntaxLanguageSpec.Literals | str = "auto",
    freedom: pyd.NonNegativeFloat | None = 2.5,
    debug: bool = False,
    ui: UserInterfaceType.Literals = "auto",
    system: DisplayColorSystem.Literals = "auto",
    style: AllColorStyles.Literals | str = "auto",
    dark: Literal = "auto",
    bg: bool = False,
    fonts: tuple = ("Menlo", "DejaVu Sans Mono", "Consolas", "Courier New", "monospace"),
    font_size: pyd.NonNegativeInt | None = 14,
    font_weight: pyd.NonNegativeInt | None = 400,
    line_height: pyd.NonNegativeFloat | None = 1.25,
    h_overflow: HorizontalOverflowMode.Literals = "ellipsis",
    v_overflow: VerticalOverflowMode.Literals = "ellipsis_bottom",
    panel: PanelDesign.Literals = "table",
    title_at_top: bool = True,
    max_title_height: MaxTitleHeight.Literals = -1,
    min_panel_width: pyd.NonNegativeInt = 3,
    min_crop_width: pyd.NonNegativeInt = 33,
    use_min_crop_width: bool = False,
    max_panels_hor: pyd.NonNegativeInt | None = 9,
    max_nesting_depth: pyd.NonNegativeInt | None = 3,
    justify: Justify.Literals = "left",
) -> Element | None

Displays a summary list of all models in the dataset.

The summary list includes a number of key properties for each model, including data file names, types, lengths, and sizes in memory. The output is automatically limited by the available display dimensions.
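The kind of per-file properties named above can be illustrated with the standard library alone. The sketch below is not how Omnipy builds its summary. `files` and `summarize` are hypothetical, and `sys.getsizeof` reports only the shallow size of the container, which may differ from the memory figure Omnipy shows.

```python
import sys

# Hypothetical dataset content as a plain dict of data file name -> data.
files = {'file_1': [1, 2, 3], 'file_2': {'a': 1.5}}


def summarize(files):
    """Return one row per data file: name, type name, length, shallow size."""
    return [
        (name, type(data).__name__, len(data), sys.getsizeof(data))
        for name, data in files.items()
    ]


for row in summarize(files):
    print(row)
```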

PARAMETER DESCRIPTION
width

Width in characters of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

height

Height in lines of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

tab

Number of spaces to use for each tab.

TYPE: NonNegativeInt DEFAULT: 4

indent

Number of spaces to use for each indentation level.

TYPE: NonNegativeInt DEFAULT: 2

printer

Library to use for pretty printing.

TYPE: PrettyPrinterLib.Literals DEFAULT: 'auto'

syntax

Syntax language for code highlighting. Supported lexers are defined in SyntaxLanguageSpec. For non-supported lexers, the user can specify a string with the Pygments lexer name. For this to work, the lexer must be registered in the Pygments library.

TYPE: SyntaxLanguageSpec.Literals | str DEFAULT: 'auto'

freedom

Parameter that controls the level of freedom for formatted text to follow the geometry of the frame size (=total available area) in a proportional manner. If the proportional freedom is 0 (the lowest), then the output area must not in any case be proportionally wider than the frame (i.e. a 16/9 frame will only produce output that is 16/9 or narrower). Larger values of proportional freedom allow the output to be proportionally wider than the total available frame, to a degree that relates to the size difference between the frame and the content (larger difference gives more freedom). The default value of 2.5 is a good compromise between readability/aesthetics and good use of the screen real estate. If None, the freedom is unlimited (i.e. proportionality is not taken into account at all).

TYPE: float | None DEFAULT: 2.5

debug

When True, enables additional debugging information in the output, such as the hierarchy of the Model objects. Currently, only Python pretty printers support debug=True. Hence, enabling debug mode will automatically set the printer to the default Python pretty printer if the printer config value is not already set.

TYPE: bool DEFAULT: False

ui

Type of user interface for which the output should be prepared. The user interface describes the technical solutions available for interacting with the user, encompassing the support available for displaying output as well as how the user interacts with the library (including the type of interactive interpreter used, if any).

TYPE: UserInterfaceType.Literals DEFAULT: 'auto'

system

Color system to use for terminal output. The default is AUTO, which automatically detects the color system based on particular environment variables. If color capabilities are not detected, the output will be in black and white. If the color system of a modern console/terminal is not auto-detected (which is the case for e.g. the PyCharm console), the user might want to set the color system manually to ANSI_RGB to force color output.

TYPE: ColorSystem.Literals DEFAULT: 'auto'

style

Color style/theme for syntax highlighting and other display elements. Supported styles are defined in AllColorStyles. For non-supported styles, the user can specify a string with the Pygments style name. For this to work, the style must be registered in the Pygments library. If style is AUTO or any of the other RecommendedColorStyles, the style is automatically selected from the RecommendedColorStyles based on the detected user interface, the color system, and whether the background is dark or not.

TYPE: AllColorStyles.Literals | str DEFAULT: 'auto'

dark

Whether the background color of the output is dark. This is used to determine the appropriate color scheme for syntax highlighting. The default is AUTO, which automatically tries to detect whether the background is dark. Capability of auto-detection depends on the user interface.

TYPE: DarkBackground.Literals DEFAULT: 'auto'

bg

If False, uses transparent background for the output. In the case of terminal output, the background color will be the current background color of the terminal. For HTML output, the background color will be automatically set to pure black or pure white, depending on the luminosity of the foreground color.

TYPE: bool DEFAULT: False

fonts

Font families to use in HTML output, in order of preference (empty tuple for browser default).

TYPE: Tuple[str, ...] DEFAULT: ('Menlo', 'DejaVu Sans Mono', 'Consolas', 'Courier New', 'monospace')

font_size

Font size in pixels for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 14

font_weight

Font weight for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 400

line_height

Line height multiplier for HTML output (None for browser default).

TYPE: NonNegativeFloat | None DEFAULT: 1.25

h_overflow

How to handle text that exceeds the width.

TYPE: HorizontalOverflowMode.Literals DEFAULT: 'ellipsis'

v_overflow

How to handle text that exceeds the height.

TYPE: VerticalOverflowMode.Literals DEFAULT: 'ellipsis_bottom'

panel

Visual design of the panel used as container for the output. Only TABLE is currently supported, which displays the output in a table-like grid.

TYPE: PanelDesign.Literals DEFAULT: 'table'

title_at_top

Whether panel titles will be displayed over the panel content (True) or below the content (False).

TYPE: bool DEFAULT: True

max_title_height

Maximum height of the panel title. If AUTO, the height is determined by the content of the title, up to a maximum of two lines. If ZERO, the title is not displayed at all. If ONE or TWO, the title is displayed with a fixed height of max one or two lines, respectively.

TYPE: MaxTitleHeight.Literals DEFAULT: -1

min_panel_width

Minimum width in characters per panel.

TYPE: NonNegativeInt DEFAULT: 3

min_crop_width

Minimum cropping width in characters for panels in cases where more than one panel is to be displayed. This is for instance used to calculate the number of models to display in a Dataset peek(). Only applied if use_min_crop_width is set to True. min_crop_width must be equal to or larger than min_panel_width.

TYPE: NonNegativeInt DEFAULT: 33

use_min_crop_width

Whether the min_crop_width value should be considered in cases where more than one panel is to be displayed, potentially reducing the number of displayed panels.

TYPE: bool DEFAULT: False

max_panels_hor

Maximum number of panels to display horizontally side-by-side at the top level. This value also acts as a ceiling for nested panels; nested panels cannot exceed this limit even if the constant MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED is set to a higher value. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 9

max_nesting_depth

Maximum levels of nested panels to display. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 3

justify

Justification mode for the panel if inside a layout panel. This is only used for the panel content.

TYPE: Justify.Literals DEFAULT: 'left'

RETURNS DESCRIPTION
Element | None

If the UI type is Jupyter running in browser, the method returns a ReactivelyResizingHtml element which is a Jupyter widget to display HTML output in the browser. Otherwise, the method returns None.

Note

Any default argument value is overridden by the corresponding value in the relevant subsection of the UserInterfaceConfig.
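The interaction between `min_panel_width`, `min_crop_width`, `use_min_crop_width` and `max_panels_hor` can be approximated as below. This is a sketch of one plausible reading of the parameter descriptions, not Omnipy's layout algorithm. The function `panels_that_fit` is hypothetical, and the sketch assumes simple floor division of the frame width and ignores borders and padding.

```python
def panels_that_fit(frame_width, num_panels, *, min_panel_width=3,
                    min_crop_width=33, use_min_crop_width=False,
                    max_panels_hor=9):
    """Estimate how many panels can be shown side by side (hypothetical)."""
    # Documented constraint: min_crop_width >= min_panel_width.
    if min_crop_width < min_panel_width:
        raise ValueError('min_crop_width must be >= min_panel_width')

    # Assumption: the crop width only matters when explicitly enabled.
    per_panel = min_crop_width if use_min_crop_width else min_panel_width
    fit = frame_width // per_panel if per_panel > 0 else num_panels

    # max_panels_hor=None means "no limit".
    if max_panels_hor is not None:
        fit = min(fit, max_panels_hor)
    return min(fit, num_panels)


print(panels_that_fit(100, 20))                           # 9
print(panels_that_fit(100, 20, use_min_crop_width=True))  # 3
print(panels_that_fit(100, 20, max_panels_hor=None))      # 20
```

Under these assumptions, enabling `use_min_crop_width` trades the number of visible panels for more readable width per panel.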

Source code in src/omnipy/data/_mixins/display.py
def list(self, **kwargs) -> 'Element | None':
    # %% Original docstring (managed by expand_docstr_macros.py) %%
    # {{LIST_SUMMARY}}
    #
    # {{LIST_DESCRIPTION}}
    #
    # {{DISPLAY_METHOD_ARGS}}
    #
    # {{DISPLAY_METHOD_RETURNS}}
    #
    # {{DISPLAY_METHOD_NOTE}}
    #
    """Displays a summary list of all models in the dataset.

    The summary list includes a number of key properties for each
    model, including data file names, types, lengths, and sizes in
    memory. The output is automatically limited by the available
    display dimensions.

    Args:
        width (NonNegativeInt | None):
            Width in characters of the output area (None for
            auto-detect based on available display dimensions).
        height (NonNegativeInt | None):
            Height in lines of the output area (None for
            auto-detect based on available display dimensions).
        tab (NonNegativeInt):
            Number of spaces to use for each tab.
        indent (NonNegativeInt):
            Number of spaces to use for each indentation level.
        printer (PrettyPrinterLib.Literals):
            Library to use for pretty printing.
        syntax (SyntaxLanguageSpec.Literals | str):
            Syntax language for code highlighting. Supported
            lexers are defined in SyntaxLanguageSpec. For
            non-supported lexers, the user can specify a string
            with the Pygments lexer name. For this to work, the
            lexer must be registered in the Pygments library.
        freedom (float | None):
            Parameter that controls the level of freedom for
            formatted text to follow the geometry of the frame
            size (=total available area) in a proportional manner.
            If the proportional freedom is 0 (the lowest), then
            the output area must not in any case be proportionally
            wider than the frame (i.e. a 16/9 frame will only
            produce output that is 16/9 or narrower). Larger
            values of proportional freedom allow the output to be
            proportionally wider than the total available frame,
            to a degree that relates to the size difference
            between the frame and the content (larger difference
            gives more freedom). The default value of 2.5 is a
            good compromise between readability/aesthetics and
            good use of the screen real estate. If None, the freedom is
            unlimited (i.e. proportionality is not taken into
            account at all).
        debug (bool):
            When True, enables additional debugging information in
            the output, such as the hierarchy of the Model
            objects. Currently, only Python pretty printers support
            debug=True. Hence, enabling debug mode will
            automatically set the printer to the default Python
            pretty printer if the `printer` config value is not
            already set.
        ui (UserInterfaceType.Literals):
            Type of user interface for which the output should
            be prepared. The user interface describes the
            technical solutions available for interacting with the
            user, encompassing the support available for
            displaying output as well as how the user interacts
            with the library (including the type of interactive
            interpreter used, if any).
        system (ColorSystem.Literals):
            Color system to use for terminal output. The default
            is `AUTO`, which automatically detects the color
            system based on particular environment variables. If
            color capabilities are not detected, the output will
            be in black and white. If the color system of a modern
            console/terminal is not auto-detected (which is the
            case for e.g. the PyCharm console), the user might
            want to set the color system manually to ANSI_RGB to
            force color output.
        style (AllColorStyles.Literals | str):
            Color style/theme for syntax highlighting and other
            display elements. Supported styles are defined in
            AllColorStyles. For non-supported styles, the user can
            specify a string with the Pygments style name. For this to
            work, the style must be registered in the Pygments
            library. If style is `AUTO` or any of the other
            RecommendedColorStyles, the style is automatically
            selected from the RecommendedColorStyles based on the
            detected user interface, the color system, and whether the
            background is dark or not.
        dark (DarkBackground.Literals):
            Whether the background color of the output is dark.
            This is used to determine the appropriate color scheme
            for syntax highlighting. The default is AUTO, which
            automatically tries to detect whether the background
            is dark. Capability of auto-detection depends on the
            user interface.
        bg (bool):
            If False, uses transparent background for the output.
            In the case of terminal output, the background color
            will be the current background color of the terminal.
            For HTML output, the background color will be
            automatically set to pure black or pure white,
            depending on the luminosity of the foreground color.
        fonts (Tuple[str, ...]):
            Font families to use in HTML output, in order of
            preference (empty tuple for browser default).
        font_size (NonNegativeInt | None):
            Font size in pixels for HTML output (None for browser
            default).
        font_weight (NonNegativeInt | None):
            Font weight for HTML output (None for browser
            default).
        line_height (NonNegativeFloat | None):
            Line height multiplier for HTML output (None for
            browser default).
        h_overflow (HorizontalOverflowMode.Literals):
            How to handle text that exceeds the width.
        v_overflow (VerticalOverflowMode.Literals):
            How to handle text that exceeds the height.
        panel (PanelDesign.Literals):
            Visual design of the panel used as container for the
            output. Only `TABLE` is currently supported, which
            displays the output in a table-like grid.
        title_at_top (bool):
            Whether panel titles will be displayed over the panel
            content (True) or below the content (False).
        max_title_height (MaxTitleHeight.Literals):
            Maximum height of the panel title. If `AUTO`, the
            height is determined by the content of the title, up
            to a maximum of two lines. If `ZERO`, the title is not
            displayed at all. If `ONE` or `TWO`, the title is
            displayed with a fixed height of max one or two lines,
            respectively.
        min_panel_width (NonNegativeInt):
            Minimum width in characters per panel.
        min_crop_width (NonNegativeInt):
            Minimum cropping width in characters for panels in
            cases where more than one panel is to be displayed.
            This is for instance used to calculate the number of
            models to display in a Dataset peek(). Only applied if
            `use_min_crop_width` is set to `True`.
            `min_crop_width` must be equal to or larger than
            `min_panel_width`.
        use_min_crop_width (bool):
            Whether the `min_crop_width` value should be
            considered in cases where more than one panel is to
            be displayed, potentially reducing the number of
            displayed panels.
        max_panels_hor (NonNegativeInt | None):
            Maximum number of panels to display horizontally
            side-by-side at the top level. This value also acts as
            a ceiling for nested panels; nested panels cannot
            exceed this limit even if the constant
            `MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED` is set to a
            higher value. If None, there is no limit.
        max_nesting_depth (NonNegativeInt | None):
            Maximum levels of nested panels to display. If None,
            there is no limit.
        justify (Justify.Literals):
            Justification mode for the panel if inside a layout
            panel. This is only used for the panel content.

    Returns:
        If the UI type is Jupyter running in browser, the
        method returns a ReactivelyResizingHtml element which
        is a Jupyter widget to display HTML output in the
        browser. Otherwise, the method returns None.

    Note:
        Any default argument value is overridden by the
        corresponding value in the relevant subsection of the
        UserInterfaceConfig.
    """

    return self._display_according_to_ui_type(
        ui_type=self._extract_ui_type(**kwargs),
        return_output_if_str=False,
        output_method=self._list,
        **kwargs,
    )

load classmethod

load(
    paths_or_urls: IsPathsOrUrlsOneOrMoreOrNone = None,
    by_file_suffix: bool = False,
    as_mime_type: None | str = None,
    **kwargs: IsPathOrUrl,
) -> Self | asyncio.Task[Self]
Source code in src/omnipy/data/dataset.py
@classmethod
def load(
    cls,
    paths_or_urls: IsPathsOrUrlsOneOrMoreOrNone = None,
    by_file_suffix: bool = False,
    as_mime_type: None | str = None,
    **kwargs: IsPathOrUrl,
) -> Self | asyncio.Task[Self]:
    dataset = cls()
    return dataset.load_into(
        paths_or_urls, by_file_suffix=by_file_suffix, as_mime_type=as_mime_type, **kwargs)

load_into

load_into(
    paths_or_urls: IsPathsOrUrlsOneOrMoreOrNone = None,
    by_file_suffix: bool = False,
    as_mime_type: None | str = None,
    **kwargs: IsPathOrUrl,
) -> Self | asyncio.Task[Self]
Source code in src/omnipy/data/dataset.py
def load_into(
    self,
    paths_or_urls: IsPathsOrUrlsOneOrMoreOrNone = None,
    by_file_suffix: bool = False,
    as_mime_type: None | str = None,
    **kwargs: IsPathOrUrl,
) -> Self | asyncio.Task[Self]:
    from omnipy.components.remote.datasets import HttpUrlDataset
    from omnipy.components.remote.models import HttpUrlModel

    if paths_or_urls is None:
        assert len(kwargs) > 0, 'No paths or urls specified'
        paths_or_urls = kwargs
    else:
        assert len(kwargs) == 0, 'No keyword arguments allowed when paths_or_urls is specified'

    match paths_or_urls:
        case HttpUrlDataset():
            return self._load_http_urls(paths_or_urls, as_mime_type=as_mime_type)

        case HttpUrlModel():
            return self._load_http_urls(
                HttpUrlDataset({str(paths_or_urls): paths_or_urls}),
                as_mime_type=as_mime_type,
            )

        case str():
            try:
                http_url_dataset = HttpUrlDataset({paths_or_urls: paths_or_urls})
            except ValidationError:
                return self._load_paths([paths_or_urls], by_file_suffix)
            return self._load_http_urls(http_url_dataset, as_mime_type=as_mime_type)

        case Mapping():
            try:
                http_url_dataset = HttpUrlDataset(paths_or_urls)
            except ValidationError as exp:
                raise NotImplementedError(
                    'Loading files with specified keys is not yet '
                    'implemented, as only tar.gz file import is '
                    'supported until serializers have been refactored.') from exp
            return self._load_http_urls(http_url_dataset, as_mime_type=as_mime_type)

        case Iterable():
            path_or_url_iterable = paths_or_urls
            try:
                http_url_dataset = HttpUrlDataset(
                    zip(path_or_url_iterable, path_or_url_iterable))
            except ValidationError:
                return self._load_paths(path_or_url_iterable, by_file_suffix)
            return self._load_http_urls(http_url_dataset, as_mime_type=as_mime_type)
        case _:
            raise TypeError(f'"paths_or_urls" argument is of incorrect type. Type '
                            f'{type(paths_or_urls)} is not supported.')

peek

peek(
    *,
    width: pyd.NonNegativeInt | None = None,
    height: pyd.NonNegativeInt | None = None,
    tab: pyd.NonNegativeInt = 4,
    indent: pyd.NonNegativeInt = 2,
    printer: PrettyPrinterLib.Literals = "auto",
    syntax: SyntaxLanguageSpec.Literals | str = "auto",
    freedom: pyd.NonNegativeFloat | None = 2.5,
    debug: bool = False,
    ui: UserInterfaceType.Literals = "auto",
    system: DisplayColorSystem.Literals = "auto",
    style: AllColorStyles.Literals | str = "auto",
    dark: Literal = "auto",
    bg: bool = False,
    fonts: tuple = ("Menlo", "DejaVu Sans Mono", "Consolas", "Courier New", "monospace"),
    font_size: pyd.NonNegativeInt | None = 14,
    font_weight: pyd.NonNegativeInt | None = 400,
    line_height: pyd.NonNegativeFloat | None = 1.25,
    h_overflow: HorizontalOverflowMode.Literals = "ellipsis",
    v_overflow: VerticalOverflowMode.Literals = "ellipsis_bottom",
    panel: PanelDesign.Literals = "table",
    title_at_top: bool = True,
    max_title_height: MaxTitleHeight.Literals = -1,
    min_panel_width: pyd.NonNegativeInt = 3,
    min_crop_width: pyd.NonNegativeInt = 33,
    use_min_crop_width: bool = False,
    max_panels_hor: pyd.NonNegativeInt | None = 9,
    max_nesting_depth: pyd.NonNegativeInt | None = 3,
    justify: Justify.Literals = "left",
) -> Element | None

Display a preview of the Model or Dataset content.

For Model instances, peek() displays a preview of the model's content. For Dataset instances, peek() displays a side-by-side view of each model contained in the dataset. Both views are automatically limited by the available display dimensions.

PARAMETER DESCRIPTION
width

Width in characters of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

height

Height in lines of the output area (None for auto-detect based on available display dimensions).

TYPE: NonNegativeInt | None DEFAULT: None

tab

Number of spaces to use for each tab.

TYPE: NonNegativeInt DEFAULT: 4

indent

Number of spaces to use for each indentation level.

TYPE: NonNegativeInt DEFAULT: 2

printer

Library to use for pretty printing.

TYPE: PrettyPrinterLib.Literals DEFAULT: 'auto'

syntax

Syntax language for code highlighting. Supported lexers are defined in SyntaxLanguageSpec. For non-supported languages, the user can specify a string with the Pygments lexer name. For this to work, the lexer must be registered in the Pygments library.

TYPE: SyntaxLanguageSpec.Literals | str DEFAULT: 'auto'

freedom

Parameter that controls the level of freedom for formatted text to follow the geometry of the frame size (=total available area) in a proportional manner. If the proportional freedom is 0 (the lowest), then the output area must not in any case be proportionally wider than the frame (i.e. a 16/9 frame will only produce output that is 16/9 or narrower). Larger values of proportional freedom allow the output to be proportionally wider than the total available frame, to a degree that relates to the size difference between the frame and the content (larger difference gives more freedom). The default value of 2.5 is a good compromise between readability/aesthetics and good use of screen real estate. If None, the freedom is unlimited (i.e. proportionality is not taken into account at all).

TYPE: float | None DEFAULT: 2.5

debug

When True, enables additional debugging information in the output, such as the hierarchy of the Model objects. Currently, only Python pretty printers support debug=True. Hence, enabling debug mode will automatically set the printer to the default Python pretty printer if the printer config value is not already set.

TYPE: bool DEFAULT: False

ui

Type of user interface for which the output should be prepared. The user interface describes the technical solutions available for interacting with the user, encompassing the support available for displaying output as well as how the user interacts with the library (including the type of interactive interpreter used, if any).

TYPE: UserInterfaceType.Literals DEFAULT: 'auto'

system

Color system to use for terminal output. The default is AUTO, which automatically detects the color system based on particular environment variables. If color capabilities are not detected, the output will be in black and white. If the color system of a modern console/terminal is not auto-detected (which is the case for e.g. the PyCharm console), the user might want to set the color system manually to ANSI_RGB to force color output.

TYPE: ColorSystem.Literals DEFAULT: 'auto'

style

Color style/theme for syntax highlighting and other display elements. Supported styles are defined in AllColorStyles. For non-supported styles, the user can specify a string with the Pygments style name. For this to work, the style must be registered in the Pygments library. If style is AUTO or any of the other RecommendedColorStyles, the style is automatically selected from the RecommendedColorStyles based on the detected user interface, the color system, and whether the background is dark or not.

TYPE: AllColorStyles.Literals | str DEFAULT: 'auto'

dark

Whether the background color of the output is dark. This is used to determine the appropriate color scheme for syntax highlighting. The default is AUTO, which automatically tries to detect whether the background is dark. Capability of auto-detection depends on the user interface.

TYPE: DarkBackground.Literals DEFAULT: 'auto'

bg

If False, uses transparent background for the output. In the case of terminal output, the background color will be the current background color of the terminal. For HTML output, the background color will be automatically set to pure black or pure white, depending on the luminosity of the foreground color.

TYPE: bool DEFAULT: False

fonts

Font families to use in HTML output, in order of preference (empty tuple for browser default).

TYPE: Tuple[str, ...] DEFAULT: ('Menlo', 'DejaVu Sans Mono', 'Consolas', 'Courier New', 'monospace')

font_size

Font size in pixels for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 14

font_weight

Font weight for HTML output (None for browser default).

TYPE: NonNegativeInt | None DEFAULT: 400

line_height

Line height multiplier for HTML output (None for browser default).

TYPE: NonNegativeFloat | None DEFAULT: 1.25

h_overflow

How to handle text that exceeds the width.

TYPE: HorizontalOverflowMode.Literals DEFAULT: 'ellipsis'

v_overflow

How to handle text that exceeds the height.

TYPE: VerticalOverflowMode.Literals DEFAULT: 'ellipsis_bottom'

panel

Visual design of the panel used as container for the output. Only TABLE is currently supported, which displays the output in a table-like grid.

TYPE: PanelDesign.Literals DEFAULT: 'table'

title_at_top

Whether panel titles will be displayed over the panel content (True) or below the content (False).

TYPE: bool DEFAULT: True

max_title_height

Maximum height of the panel title. If AUTO, the height is determined by the content of the title, up to a maximum of two lines. If ZERO, the title is not displayed at all. If ONE or TWO, the title is displayed with a fixed height of max one or two lines, respectively.

TYPE: MaxTitleHeight.Literals DEFAULT: -1

min_panel_width

Minimum width in characters per panel.

TYPE: NonNegativeInt DEFAULT: 3

min_crop_width

Minimum cropping width in characters for panels in cases where more than one panel is to be displayed. This is for instance used to calculate the number of models to display in a Dataset peek(). Only applied if use_min_crop_width is set to True. min_crop_width must be equal to or larger than min_panel_width.

TYPE: NonNegativeInt DEFAULT: 33

use_min_crop_width

Whether the min_crop_width value should be considered in cases where more than one panel is to be displayed, potentially reducing the number of displayed panels.

TYPE: bool DEFAULT: False

max_panels_hor

Maximum number of panels to display horizontally side-by-side at the top level. This value also acts as a ceiling for nested panels; nested panels cannot exceed this limit even if the constant MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED is set to a higher value. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 9

max_nesting_depth

Maximum levels of nested panels to display. If None, there is no limit.

TYPE: NonNegativeInt | None DEFAULT: 3

justify

Justification mode for the panel if inside a layout panel. This is only used for the panel content.

TYPE: Justify.Literals DEFAULT: 'left'

RETURNS DESCRIPTION
Element | None

If the UI type is Jupyter running in browser, the method returns a ReactivelyResizingHtml element which is a Jupyter widget to display HTML output in the browser. Otherwise, the method returns None.

Note

Any default argument value is overridden by the corresponding value in the relevant subsection of the UserInterfaceConfig.

Source code in src/omnipy/data/_mixins/display.py
def peek(self, **kwargs) -> 'Element | None':
    # %% Original docstring (managed by expand_docstr_macros.py) %%
    # {{PEEK_SUMMARY}}
    #
    # {{PEEK_DESCRIPTION}}
    #
    # {{DISPLAY_METHOD_ARGS}}
    #
    # {{DISPLAY_METHOD_RETURNS}}
    #
    # {{DISPLAY_METHOD_NOTE}}
    #
    """Display a preview of the Model or Dataset content.

    For Model instances, `peek()` displays a preview of the
    model's content. For Dataset instances, `peek()` displays a
    side-by-side view of each model contained in the dataset. Both
    views are automatically limited by the available display
    dimensions.

    Args:
        width (NonNegativeInt | None):
            Width in characters of the output area (None for
            auto-detect based on available display dimensions).
        height (NonNegativeInt | None):
            Height in lines of the output area (None for
            auto-detect based on available display dimensions).
        tab (NonNegativeInt):
            Number of spaces to use for each tab.
        indent (NonNegativeInt):
            Number of spaces to use for each indentation level.
        printer (PrettyPrinterLib.Literals):
            Library to use for pretty printing.
        syntax (SyntaxLanguageSpec.Literals | str):
            Syntax language for code highlighting. Supported
            lexers are defined in SyntaxLanguageSpec. For
            non-supported languages, the user can specify a string
            with the Pygments lexer name. For this to work, the
            lexer must be registered in the Pygments library.
        freedom (float | None):
            Parameter that controls the level of freedom for
            formatted text to follow the geometry of the frame
            size (=total available area) in a proportional manner.
            If the proportional freedom is 0 (the lowest), then
            the output area must not in any case be proportionally
            wider than the frame (i.e. a 16/9 frame will only
            produce output that is 16/9 or narrower). Larger
            values of proportional freedom allow the output to be
            proportionally wider than the total available frame,
            to a degree that relates to the size difference
            between the frame and the content (larger difference
            gives more freedom). The default value of 2.5 is a
            good compromise between readability/aesthetics and
            good use of screen real estate. If None, the freedom is
            unlimited (i.e. proportionality is not taken into
            account at all).
        debug (bool):
            When True, enables additional debugging information in
            the output, such as the hierarchy of the Model
            objects. Currently, only Python pretty printers support
            debug=True. Hence, enabling debug mode will
            automatically set the printer to the default Python
            pretty printer if the `printer` config value is not
            already set.
        ui (UserInterfaceType.Literals):
            Type of user interface for which the output should
            be prepared. The user interface describes the
            technical solutions available for interacting with the
            user, encompassing the support available for
            displaying output as well as how the user interacts
            with the library (including the type of interactive
            interpreter used, if any).
        system (ColorSystem.Literals):
            Color system to use for terminal output. The default
            is `AUTO`, which automatically detects the color
            system based on particular environment variables. If
            color capabilities are not detected, the output will
            be in black and white. If the color system of a modern
            console/terminal is not auto-detected (which is the
            case for e.g. the PyCharm console), the user might
            want to set the color system manually to ANSI_RGB to
            force color output.
        style (AllColorStyles.Literals | str):
            Color style/theme for syntax highlighting and other
            display elements. Supported styles are defined in
            AllColorStyles. For non-supported styles, the user can
            specify a string with the Pygments style name. For this to
            work, the style must be registered in the Pygments
            library. If style is `AUTO` or any of the other
            RecommendedColorStyles, the style is automatically
            selected from the RecommendedColorStyles based on the
            detected user interface, the color system, and whether the
            background is dark or not.
        dark (DarkBackground.Literals):
            Whether the background color of the output is dark.
            This is used to determine the appropriate color scheme
            for syntax highlighting. The default is AUTO, which
            automatically tries to detect whether the background
            is dark. Capability of auto-detection depends on the
            user interface.
        bg (bool):
            If False, uses transparent background for the output.
            In the case of terminal output, the background color
            will be the current background color of the terminal.
            For HTML output, the background color will be
            automatically set to pure black or pure white,
            depending on the luminosity of the foreground color.
        fonts (Tuple[str, ...]):
            Font families to use in HTML output, in order of
            preference (empty tuple for browser default).
        font_size (NonNegativeInt | None):
            Font size in pixels for HTML output (None for browser
            default).
        font_weight (NonNegativeInt | None):
            Font weight for HTML output (None for browser
            default).
        line_height (NonNegativeFloat | None):
            Line height multiplier for HTML output (None for
            browser default).
        h_overflow (HorizontalOverflowMode.Literals):
            How to handle text that exceeds the width.
        v_overflow (VerticalOverflowMode.Literals):
            How to handle text that exceeds the height.
        panel (PanelDesign.Literals):
            Visual design of the panel used as container for the
            output. Only `TABLE` is currently supported, which
            displays the output in a table-like grid.
        title_at_top (bool):
            Whether panel titles will be displayed over the panel
            content (True) or below the content (False).
        max_title_height (MaxTitleHeight.Literals):
            Maximum height of the panel title. If `AUTO`, the
            height is determined by the content of the title, up
            to a maximum of two lines. If `ZERO`, the title is not
            displayed at all. If `ONE` or `TWO`, the title is
            displayed with a fixed height of max one or two lines,
            respectively.
        min_panel_width (NonNegativeInt):
            Minimum width in characters per panel.
        min_crop_width (NonNegativeInt):
            Minimum cropping width in characters for panels in
            cases where more than one panel is to be displayed.
            This is for instance used to calculate the number of
            models to display in a Dataset peek(). Only applied if
            `use_min_crop_width` is set to `True`.
            `min_crop_width` must be equal to or larger than
            `min_panel_width`.
        use_min_crop_width (bool):
            Whether the `min_crop_width` value should be
            considered in cases where more than one panel is to
            be displayed, potentially reducing the number of
            displayed panels.
        max_panels_hor (NonNegativeInt | None):
            Maximum number of panels to display horizontally
            side-by-side at the top level. This value also acts as
            a ceiling for nested panels; nested panels cannot
            exceed this limit even if the constant
            `MAX_PANELS_HORIZONTALLY_DEEPLY_NESTED` is set to a
            higher value. If None, there is no limit.
        max_nesting_depth (NonNegativeInt | None):
            Maximum levels of nested panels to display. If None,
            there is no limit.
        justify (Justify.Literals):
            Justification mode for the panel if inside a layout
            panel. This is only used for the panel content.

    Returns:
        If the UI type is Jupyter running in browser, the
        method returns a ReactivelyResizingHtml element which
        is a Jupyter widget to display HTML output in the
        browser. Otherwise, the method returns None.

    Note:
        Any default argument value is overridden by the
        corresponding value in the relevant subsection of the
        UserInterfaceConfig.
    """
    return self._display_according_to_ui_type(
        ui_type=self._extract_ui_type(**kwargs),
        return_output_if_str=False,
        output_method=self._peek,
        **kwargs,
    )

pending_task_details

pending_task_details() -> dict[str, IsPendingData]
Source code in src/omnipy/data/_mixins/task.py
def pending_task_details(self) -> dict[str, IsPendingData]:
    self_with_data = cast(HasData, self)
    return {  # pyright: ignore [reportReturnType]
        key: val for key, val in self_with_data.data.items() if isinstance(val, PendingData)
    }
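
The filtering step in `pending_task_details` can be illustrated without omnipy. In this minimal sketch, `PendingStandIn` is a hypothetical stand-in for a not-yet-finished value, and `pending_details` is a made-up helper; neither is part of the library.

```python
from dataclasses import dataclass


@dataclass
class PendingStandIn:
    """Hypothetical stand-in for a value whose task has not finished yet."""
    job_name: str


def pending_details(data: dict[str, object]) -> dict[str, PendingStandIn]:
    """Keep only the entries whose value is still pending."""
    return {key: val for key, val in data.items() if isinstance(val, PendingStandIn)}


data = {'file_1': [1, 2, 3], 'file_2': PendingStandIn(job_name='download')}
print(pending_details(data))
```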

save

save(path: str)
Source code in src/omnipy/data/dataset.py
def save(self, path: str):
    serializer_registry = self._get_serializer_registry()

    parsed_dataset, serializer = serializer_registry.auto_detect_tar_file_serializer(self)

    if serializer is None:
        print(f'Unable to find a serializer for dataset with data type "{type(self)}". '
              f'Will abort saving...')
    else:
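        # Always define the output path, also when `path` already ends with '.tar.gz'
        if path.endswith('.tar.gz'):
            out_tar_gz_path = path
        else:
            out_tar_gz_path = f'{path}.tar.gz'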

        print(f'Writing dataset as a gzipped tarpack to "{os.path.abspath(out_tar_gz_path)}"')

        with open(out_tar_gz_path, 'wb') as out_tar_gz_file:
            out_tar_gz_file.write(serializer.serialize(parsed_dataset))

        directory = os.path.abspath(out_tar_gz_path[:-7])
        if not os.path.exists(directory):
            os.makedirs(directory)

        tar = tarfile.open(out_tar_gz_path)
        print(f'Extracting content to directory "{os.path.abspath(out_tar_gz_path[:-7])}"')
        tar.extractall(path=directory)
        tar.close()
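
The write-then-extract flow of `save` can be sketched with the standard library alone. The helper names below are hypothetical, and a plain `tarfile` writer stands in for the omnipy serializer, so this is an illustration of the flow under those assumptions rather than the library's implementation.

```python
import io
import os
import tarfile
import tempfile


def normalize_tar_gz_path(path: str) -> str:
    """Hypothetical helper: append '.tar.gz' unless it is already present."""
    return path if path.endswith('.tar.gz') else f'{path}.tar.gz'


def write_and_extract(path: str, files: dict[str, bytes]) -> str:
    """Write `files` to a gzipped tar archive, then unpack it next to the archive."""
    out_tar_gz_path = normalize_tar_gz_path(path)

    # Stand-in for the serializer: build the archive directly from bytes.
    with tarfile.open(out_tar_gz_path, 'w:gz') as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))

    # The target directory is the archive path without the '.tar.gz' suffix.
    directory = os.path.abspath(out_tar_gz_path[:-len('.tar.gz')])
    os.makedirs(directory, exist_ok=True)
    with tarfile.open(out_tar_gz_path) as tar:
        tar.extractall(path=directory)
    return directory


with tempfile.TemporaryDirectory() as tmp_dir:
    out_dir = write_and_extract(
        os.path.join(tmp_dir, 'my_dataset'), {'file_1.json': b'[1, 2, 3]'})
    print(sorted(os.listdir(out_dir)))
```

Using `with` blocks closes the archive even if extraction fails, and `os.makedirs(..., exist_ok=True)` removes the need for a separate existence check.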

to

to(model_or_dataset_cls: type[_OtherModelOrDatasetT]) -> _OtherModelOrDatasetT
Source code in src/omnipy/data/dataset.py
def to(self, model_or_dataset_cls: type[_OtherModelOrDatasetT]) -> '_OtherModelOrDatasetT':
    return model_or_dataset_cls(self)

to_data

to_data() -> dict_t[str, Any]
Source code in src/omnipy/data/dataset.py
def to_data(self) -> dict_t[str, Any]:
    return {key: self._check_value(val) for key, val in self.dict(by_alias=True).items()}

to_json

to_json(pretty=True) -> dict_t[str, str]
Source code in src/omnipy/data/dataset.py
def to_json(self, pretty=True) -> dict_t[str, str]:
    result = {}

    for key, val in self.data.items():
        result[key] = val.to_json(pretty=pretty)

    return result

to_json_schema classmethod

to_json_schema(pretty: bool = True) -> str | dict_t[str, str]
Source code in src/omnipy/data/dataset.py
@classmethod
def to_json_schema(cls, pretty: bool = True) -> str | dict_t[str, str]:
    result = {}
    clean_dataset = super(Dataset, Dataset).__class_getitem__(cls.get_type())
    schema = clean_dataset.schema()
    for key, val in schema['properties'][DATA_KEY].items():
        # Remove the first part of the type definition of 'data', added
        # as a hack to stop coercing of e.g. [{'a': 'b', 'c': 'd'}]
        # to {'a': 'c'}
        if key == 'anyOf':
            result['type'] = 'object'
            result['additionalProperties'] = {
                '$ref': '#/definitions/' + pyd.normalize_name(clean_dataset.get_type().__name__)
            }
        else:
            result[key] = val

    result['title'] = clean_dataset.__name__
    result['definitions'] = schema['definitions']

    for model_desc in result['definitions'].values():
        if 'orig_model' in model_desc:
            del model_desc['orig_model']

    if pretty:
        return cls._pretty_print_json(result)
    else:
        return json.dumps(result)
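
The loop over the `data` property in `to_json_schema` replaces the `anyOf` entry with a plain object whose values reference the model definition. A minimal sketch of that rewrite on a made-up input follows; `flatten_data_schema` and the example dictionary are hypothetical and only illustrate the transformation.

```python
import json


def flatten_data_schema(data_schema: dict, model_name: str) -> dict:
    """Hypothetical helper mirroring the loop over the 'data' property."""
    result: dict = {}
    for key, val in data_schema.items():
        if key == 'anyOf':
            # Replace the union with a plain mapping from file name to model.
            result['type'] = 'object'
            result['additionalProperties'] = {'$ref': f'#/definitions/{model_name}'}
        else:
            result[key] = val
    return result


# Made-up input for illustration only
data_schema = {
    'title': 'Data',
    'default': {},
    'anyOf': [{'type': 'array'}, {'type': 'object'}],
}
print(json.dumps(flatten_data_schema(data_schema, 'MyModel'), indent=2))
```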

update_forward_refs classmethod

update_forward_refs(
    calling_module: str | None = None, prev_visited_classes: set[type] | None = None, **localns: Any
) -> None
Source code in src/omnipy/data/dataset.py
@classmethod
def update_forward_refs(
    cls,
    calling_module: str | None = None,
    prev_visited_classes: set[type] | None = None,
    **localns: Any,
) -> None:
    """
    Try to update ForwardRefs on fields based on this Model, globalns and localns.
    """
    from omnipy.data.model import is_model_subclass

    if prev_visited_classes is None:
        prev_visited_classes = set()
    elif cls in prev_visited_classes:
        return

    # Merge the namespaces of the Dataset's own module and the
    # calling module to the local namespace for evaluation of forward
    # references, which is necessary for cases where the Dataset is
    # defined in a different module than where it is used, e.g. when
    # the Dataset is defined in a library and used by a user in their
    # own code.
    if calling_module is None:
        calling_module = get_calling_module_name()
    own_module_ns, globalns = \
        build_own_module_and_global_namespace_for_forward_refs(cls, calling_module, **localns)

    prev_type = cls._get_data_field().type_

    super().update_forward_refs(**globalns)

    cls._get_data_field().type_ = evaluate_any_forward_refs_if_possible(prev_type, **globalns)
    cls.__annotations__[DATA_KEY] = evaluate_any_forward_refs_if_possible(
        cls.__annotations__[DATA_KEY], **globalns)

    prev_visited_classes.add(cls)

    # Merge the Dataset's own module namespace into
    # localns before propagating. This is to allow Model classes and
    # pydantic-generated parametrized base classes (which have
    # __module__='omnipy.data.dataset' rather than the defining
    # module) to still resolve forward refs that only exist
    # in the defining module's namespace.

    extra_ns: dict[str, Any] = {}
    extra_ns.update(own_module_ns)
    extra_ns.update(localns)

    # Propagate update_forward_refs to parent Dataset classes but
    # retaining the same calling module. This is needed to ensure the
    # correct context is used to resolve forward references in complex
    # inheritance hierarchies.
    #
    # We explicitly call `update_forward_refs` on immediate parent
    # classes (`__bases__`) instead of relying solely on
    # `super().update_forward_refs()`. This is because `super()`
    # inside this classmethod resolves relative to `Dataset` in the MRO,
    # silently bypassing custom logic on any intermediate `Dataset`
    # subclasses. Explicitly propagating through `__bases__` ensures
    # that class-level setups are correctly applied to all parents
    # exactly once, efficiently preventing redundant updates.
    for base in cls.__bases__:
        if is_dataset_subclass(base) and base is not Dataset:
            base.update_forward_refs(
                calling_module=calling_module,
                prev_visited_classes=prev_visited_classes,
                **extra_ns,
            )

    # As above, but now propagate update_forward_refs to the types of
    # the Dataset (e.g. the Model).
    for type_variant in split_to_union_variants(cls.get_type()):
        if is_dataset_subclass(type_variant) or is_model_subclass(type_variant):
            type_variant.update_forward_refs(
                calling_module=calling_module,
                prev_visited_classes=prev_visited_classes,
                **extra_ns,
            )

    cls.__name__ = remove_forward_ref_notation(cls.__name__)
    cls.__qualname__ = remove_forward_ref_notation(cls.__qualname__)

    cls._clean_type_caches()
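
The propagation pattern used above, where each class is visited at most once while walking `__bases__` with a shared set, can be sketched in isolation. The class names and the `refresh` method below are hypothetical and only stand in for the Dataset hierarchy; the sketch omits all namespace handling.

```python
from typing import Optional


class Base:
    @classmethod
    def refresh(cls, visited: Optional[set] = None) -> list:
        """Return the names of the classes visited, in visiting order."""
        if visited is None:
            visited = set()
        elif cls in visited:
            # Already handled through another inheritance path
            return []

        visited.add(cls)
        order = [cls.__name__]

        # Propagate to immediate parents, skipping the root class itself
        for base in cls.__bases__:
            if isinstance(base, type) and issubclass(base, Base) and base is not Base:
                order.extend(base.refresh(visited))
        return order


class A(Base):
    pass


class B(A):
    pass


class C(A):
    pass


class D(B, C):
    pass


visit_order = D.refresh()
print(visit_order)  # ['D', 'B', 'A', 'C']
```

In this diamond-shaped hierarchy, `A` is reachable through both `B` and `C` but is visited only once, because the same `visited` set is passed along every call.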

validate classmethod

validate(value: Any) -> Self

Hack to allow overwriting of __iter__ method without compromising pydantic validation. Part of the pydantic API and not the Omnipy API.

Source code in src/omnipy/data/dataset.py
@classmethod
def validate(cls, value: Any) -> Self:
    """
    Hack to allow overwriting of __iter__ method without compromising pydantic validation. Part
    of the pydantic API and not the Omnipy API.
    """
    # TODO: Double-check if validate() method is still needed for pydantic v2

    # validate_cls_counts[cls.__name__] += 1
    if is_iterable(value) and not isinstance(value, Mapping):
        value = cls._check_iterable(value)

    return super().validate({'data': value})
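
The dispatch in validate can be sketched without pydantic: non-mapping iterables are pre-processed, and the value is then wrapped under the 'data' key before being handed on. In the sketch below, `pairs_to_dict` is a hypothetical stand-in for the private `_check_iterable` helper, whose actual behavior is not shown on this page, and `isinstance(value, Iterable)` is assumed to approximate `is_iterable`.

```python
from collections.abc import Iterable, Mapping
from typing import Any


def pairs_to_dict(value: Iterable) -> dict:
    # Hypothetical stand-in for _check_iterable: treat the iterable as
    # a sequence of (key, value) pairs. The real helper may differ.
    return dict(value)


def wrap_for_validation(value: Any) -> dict:
    """Mirror the shape of the value passed on to the parent validate()."""
    if isinstance(value, Iterable) and not isinstance(value, Mapping):
        value = pairs_to_dict(value)
    return {'data': value}


from_pairs = wrap_for_validation([('file_1', [1, 2, 3])])
from_mapping = wrap_for_validation({'file_1': [1, 2, 3]})
print(from_pairs)
print(from_mapping)
```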

is_dataset_instance

is_dataset_instance(__obj: object) -> TypeIs[Dataset]
Source code in src/omnipy/data/dataset.py
def is_dataset_instance(__obj: object) -> 'TypeIs[Dataset]':
    return lenient_isinstance(__obj, Dataset)

is_dataset_subclass cached

is_dataset_subclass(__cls: TypeForm) -> TypeIs[type[Dataset]]
Source code in src/omnipy/data/dataset.py
@functools.cache
def is_dataset_subclass(__cls: TypeForm) -> 'TypeIs[type[Dataset]]':
    return lenient_issubclass(__cls, Dataset)
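
A cached, lenient subclass check of this kind can be sketched with the standard library alone. The names `Container` and `is_container_subclass` below are hypothetical, and the lenient check is an assumption about what `lenient_issubclass` does (returning False instead of raising for arguments that are not classes).

```python
import functools


class Container:
    pass


class SpecialContainer(Container):
    pass


@functools.cache
def is_container_subclass(candidate) -> bool:
    # functools.cache requires hashable arguments; classes and
    # parametrized generics such as list[int] are hashable.
    try:
        return isinstance(candidate, type) and issubclass(candidate, Container)
    except TypeError:
        # Some non-class type forms are rejected by issubclass()
        return False


print(is_container_subclass(SpecialContainer))  # True
print(is_container_subclass(list[int]))         # False
print(is_container_subclass(42))                # False
```

Repeated calls with the same argument are answered from the cache, which can be inspected through `is_container_subclass.cache_info()`.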