kiara.pipeline.config

PipelineConfig pydantic-model

A class to hold the configuration for a PipelineModule.

If you want to control the pipeline input and output names, you need to provide a map that uses the autogenerated field name ([step_id]__[field_name] -- 2 underscores!) as key, and the desired field name as value. This schema for autogenerated field names exists because it is hard to guarantee the uniqueness of each field: some steps can have the same input field names but need different input values, while in other cases inputs of different steps should share the same value. To make sure the right values are always used, kiara defaults to a conservative approach, accepting that in some cases the user will be prompted for duplicate inputs for the same value.

To remedy that, the pipeline creator has the option to manually specify a mapping to rename some or all of the input/output fields.

Further, because in many cases there won't be any overlapping fields, the creator can specify auto, in which case Kiara will automatically create a mapping that maps each autogenerated field name to the shortest possible unique name.
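
For instance, the pipeline creator could let kiara derive the short names itself. A minimal sketch, assuming the and_step and not_step objects defined in the Python example below:

# 'auto' asks kiara to shorten the autogenerated field names where possible;
# the exact short names it picks are not guaranteed by this sketch.
auto_nand_conf = PipelineConfig(
    doc="Returns 'False' if both inputs are 'True'.",
    steps=[and_step, not_step],
    input_aliases="auto",
    output_aliases="auto",
)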

Examples:

Configuration for a pipeline module that functions as a nand logic gate (in Python):

and_step = PipelineStepConfig(module_type="and", step_id="and")
not_step = PipelineStepConfig(module_type="not", step_id="not", input_links={"a": ["and.y"]})
nand_p_conf = PipelineConfig(doc="Returns 'False' if both inputs are 'True'.",
                    steps=[and_step, not_step],
                    input_aliases={
                        "and__a": "a",
                        "and__b": "b"
                    },
                    output_aliases={
                        "not__y": "y"
                    })

Or, the same thing in json:

{
  "module_type_name": "nand",
  "doc": "Returns 'False' if both inputs are 'True'.",
  "steps": [
    {
      "module_type": "and",
      "step_id": "and"
    },
    {
      "module_type": "not",
      "step_id": "not",
      "input_links": {
        "a": "and.y"
      }
    }
  ],
  "input_aliases": {
    "and__a": "a",
    "and__b": "b"
  },
  "output_aliases": {
    "not__y": "y"
  }
}

context: Dict[str, Any] pydantic-field

Metadata for this workflow.

documentation: str pydantic-field

Documentation about what the pipeline does.

input_aliases: Union[str, Dict[str, str]] pydantic-field

A map of input aliases, with the calculated (__ -- double underscore!) name as key, and a string (the resulting workflow input alias) as value. Check the documentation of the config class to see which marker strings can be used to create this map automatically, if possible.

output_aliases: Union[str, Dict[str, str]] pydantic-field

A map of output aliases, with the calculated (__ -- double underscore!) name as key, and a string (the resulting workflow output alias) as value. Check the documentation of the config class to see which marker strings can be used to create this map automatically, if possible.

steps: List[kiara.pipeline.config.PipelineStepConfig] pydantic-field

A list of steps/modules of this pipeline, and their connections.

create_pipeline_config(config, module_config=None, kiara=None) classmethod

Create a PipelineConfig instance.

The main 'config' argument here can be either:

  • a string: in which case it needs to be (in that order):
    • a module id
    • an operation id
    • a path to a local file
  • a ModuleConfig object
  • a dict (to create a ModuleConfig from)

Parameters:

Name Type Description Default
config Union[kiara.module_config.ModuleConfig, Mapping[str, Any], str]

the 'main' config object

required
module_config Optional[Mapping[str, Any]]

in case the 'main' config object was a module id, this argument is used to instantiate the module

None
kiara Optional[Kiara]

the kiara context (will use default context instance if not provided)

None
Source code in kiara/pipeline/config.py
@classmethod
def create_pipeline_config(
    cls,
    config: typing.Union[ModuleConfig, typing.Mapping[str, typing.Any], str],
    module_config: typing.Optional[typing.Mapping[str, typing.Any]] = None,
    kiara: typing.Optional["Kiara"] = None,
) -> "PipelineConfig":
    """Create a PipelineModule instance.

    The main 'config' argument here can be either:

      - a string: in which case it needs to be (in that order):
        - a module id
        - an operation id
        - a path to a local file
      - a [ModuleConfig][kiara.module_config.ModuleConfig] object
      - a dict (to create a `ModuleConfig` from)


    Arguments:
        config: the 'main' config object
        module_config: in case the 'main' config object was a module id, this argument is used to instantiate the module
        kiara: the kiara context (will use default context instance if not provided)

    """

    if kiara is None:
        from kiara.kiara import Kiara

        kiara = Kiara.instance()

    module_config_obj = ModuleConfig.create_module_config(
        config=config, module_config=module_config, kiara=kiara
    )

    if not module_config_obj.module_type == "pipeline":
        raise Exception(f"Not a valid pipeline configuration: {config}")

    # TODO: this is a bit round-about, to create a module config first, but it probably doesn't matter
    pipeline_config_data = module_config_obj.module_config

    module: PipelineConfig = PipelineConfig(**pipeline_config_data)
    return module
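
For illustration, a minimal usage sketch. The dict layout below (module_type / module_config keys) is an assumption inferred from the source above, and the default kiara context is used because none is passed:

from kiara.pipeline.config import PipelineConfig

# Wrap the pipeline definition in a module-config style mapping and let
# create_pipeline_config resolve it into a PipelineConfig instance.
nand_config = PipelineConfig.create_pipeline_config(
    config={
        "module_type": "pipeline",
        "module_config": {
            "steps": [
                {"module_type": "and", "step_id": "and"},
                {"module_type": "not", "step_id": "not", "input_links": {"a": "and.y"}},
            ],
            "input_aliases": {"and__a": "a", "and__b": "b"},
            "output_aliases": {"not__y": "y"},
        },
    }
)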

PipelineStepConfig pydantic-model

A class to hold the configuration of one module within a PipelineModule.

input_links pydantic-field

A map with the name of an input link as key, and the connected module output name(s) as value.

step_id: str pydantic-field required

The id of the step.

StepDesc pydantic-model

Details of a single PipelineStep (which lives within a Pipeline).

input_connections: Dict[str, List[str]] pydantic-field required

A map that explains what elements connect to this step's inputs. A connection could either be a Pipeline input (indicated by the __pipeline__ token), or another step's output.

Examples:

input_connections: {
    "a": ["__pipeline__.a"],
    "b": ["step_one.a"]
}

output_connections: Dict[str, List[str]] pydantic-field required

A map that explains what elements connect to this step's outputs. A connection could be either a Pipeline output, or another step's input.
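
Examples:

By analogy with input_connections, a sketch of what such a map could look like; the __pipeline__ token for pipeline outputs is assumed to mirror the input case, and step_two.a stands in for a hypothetical downstream step input:

output_connections: {
    "y": ["__pipeline__.y", "step_two.a"]
}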

processing_stage: int pydantic-field required

The processing stage of this step within a Pipeline.

required: bool pydantic-field required

Whether this step is always required, or potentially could be skipped in case some inputs are not available.

step: PipelineStep pydantic-field required

Attributes of the step itself.