Code Generation

This library's data models are Pydantic classes produced by code generation, with plans to move to Patito soon.

It works by taking the OpenAPI JSON schemas (a.k.a. "Swagger" schemas) and reading their "components", which correspond to entities such as "StopPoint" (something like a bus stop or train platform).

These are 'DTOs' or Data Transfer Objects, and correspond pretty neatly to data models, which in Python can be validated with the Pydantic library's model classes.

I rolled my own schema handling for this; it's pretty straightforward. The one awkward part was getting informative names for the DTO models, which required cross-referencing against the old "Unified API" (marked as deprecated, i.e. 'do not use', in the official TfL API docs!), in a process I call "reference chasing".

Each API is self-contained, meaning that one API may redeclare the same entities as another, so the generated code repeats those DTO Pydantic models once per API. For instance, there's a model class called Mode in 3 different APIs (Line, Journey, and StopPoint), each representing the mode of transport.

The DTOs nest inside each other: a Pydantic model whose fields are other Pydantic models deserialises nested JSON into nested model instances, so it's an ideal choice here.
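As a rough illustration of that nesting, here is a minimal sketch assuming Pydantic v2. The two classes and their fields are made up for the example and are not the generated TfL models.

```python
from pydantic import BaseModel


class Mode(BaseModel):
    # Hypothetical field, chosen for illustration only.
    Name: str


class StopPoint(BaseModel):
    # Hypothetical fields; the real generated class has many more.
    Id: str
    Modes: list[Mode]  # a field whose items are themselves Pydantic models


# Nested JSON objects become nested model instances during validation.
raw = '{"Id": "stop-1", "Modes": [{"Name": "bus"}, {"Name": "tube"}]}'
stop = StopPoint.model_validate_json(raw)
mode_names = [m.Name for m in stop.Modes]
```

Each item in stop.Modes is a Mode instance rather than a plain dict, so attribute access works all the way down.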

For example, here's the Line class in the Line API. The class name is the name of the DTO, i.e. of the schema component/entity, and the API it's found in is stored in the _source_schema_name private attribute, which is not used for any functionality, only to keep track of where these objects are coming from. We can also see from _component_schema_name that its original name was "Tfl-19" which isn't very informative (the name 'Line' which ended up as the class name was found by cross-referencing against the DTOs in the Unified API).

class Line(BaseModel):
    """
    Autogenerated from Line::Tfl.Api.Presentation.Entities.Line
    """

    model_config = ConfigDict(
        alias_generator=AliasGenerator(validation_alias=to_camel_case),
    )

    Id: str = None
    Name: str = None
    ModeName: str = None
    Disruptions: list["DisruptionModel"]
    Created: datetime = None
    Modified: datetime = None
    LineStatuses: list["LineStatusModel"]
    RouteSections: list["MatchedRouteModel"]
    ServiceTypes: list["LineServiceTypeInfoModel"]
    Crowding: CrowdingModel = None
    _source_schema_name: str = PrivateAttr(default='Line')
    _component_schema_name: str = PrivateAttr(default='Tfl-19')

LineModel = Line

Notice that this Pydantic model has an alias_generator set in its model_config before the fields are declared. This means that, for instance, the JSON key modeName is used to populate the field ModeName. This is necessary because some of the keys in the TfL APIs are reserved words (in fact, words that are part of the Python language itself), which would therefore make the generated code fail to parse if used as class attribute names (the field names in Pydantic model classes).
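To make the reserved-word problem concrete, here is a small sketch assuming Pydantic 2.5 or later (where AliasGenerator is available). The to_camel_case function below is a hypothetical stand-in for the library's own helper, and the Leg class with its "from" key is invented for illustration.

```python
import keyword
from typing import Optional

from pydantic import AliasGenerator, BaseModel, ConfigDict


def to_camel_case(name: str) -> str:
    """Hypothetical stand-in: lower-case the first character (ModeName -> modeName)."""
    return name[:1].lower() + name[1:]


class Leg(BaseModel):
    # Illustrative model, not one of the generated DTOs.
    model_config = ConfigDict(
        alias_generator=AliasGenerator(validation_alias=to_camel_case),
    )

    # 'from' could not be written as a field name, but 'From' can.
    From: Optional[str] = None
    ModeName: Optional[str] = None


# The incoming keys are camelCase; the alias generator maps them onto the fields.
leg = Leg.model_validate({"from": "Euston", "modeName": "tube"})

# The standard library confirms which spelling collides with the language.
lower_is_keyword = keyword.iskeyword("from")
upper_is_keyword = keyword.iskeyword("From")
```

Capitalising the field name sidesteps the keyword clash, and the validation alias means the original camelCase key is still what gets read from the JSON.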

Also note: due to a current 'quirk' of Pydantic, all of the DTOs have a 'nickname', which is the class name with 'Model' on the end, and this is what the nested definitions refer to. So list["LineStatusModel"] in the LineStatuses field definition above resolves to a list of LineStatus instances, and the Line class itself is given the nickname 'LineModel'. There were other workarounds but this was the simplest!
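The mechanics of the nickname can be sketched as follows, assuming Pydantic v2. This only shows how a string annotation resolves through a module-level alias; it does not reproduce the quirk itself, and the fields are trimmed down for illustration.

```python
from pydantic import BaseModel


class Line(BaseModel):
    Id: str
    # The string annotation names the alias, which is not defined yet.
    LineStatuses: list["LineStatusModel"]


class LineStatus(BaseModel):
    # Single illustrative field; the generated class has more.
    StatusSeverityDescription: str


# The 'nickname': a second module-level name bound to the same class object.
LineStatusModel = LineStatus
LineModel = Line

# Resolve the deferred string annotation now that the alias exists.
Line.model_rebuild()

line = Line.model_validate(
    {"Id": "victoria", "LineStatuses": [{"StatusSeverityDescription": "Good Service"}]}
)
```

Because LineStatusModel is just another name for LineStatus, the validated items are ordinary LineStatus instances.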