Getting started

This guide walks through some of the important (and some of the less important) features of kiara. The goal is to introduce new users to the overall framework, so they can get a feeling for what it can do and whether it might be useful for their own usage scenarios.

Setting up kiara

In order to use kiara, we'll need to install it into a Python virtual (or conda) environment, along with all the plugins we might want to use. For the purpose of this tutorial, we'll use conda to create such an environment, but you can of course use a 'normal' virtualenv if you prefer. How to install conda itself is out of scope for this tutorial, but you should not have problems finding instructions online.

One simple way is to install Anaconda (individual edition), then use the Anaconda Navigator to create a new environment. If your system does not already have 'git', install it in that environment (for example by running the conda install -c anaconda git command in your terminal). Finally, use the 'Open Terminal' option of that environment to start a terminal that has the virtual/conda environment already activated.

Here's how to create the environment, activate it, and then install the necessary dependencies (assuming conda is installed). At some point in the process, the terminal may prompt you to confirm before proceeding (generally by typing "y" and pressing Enter).

conda create -n kiara_tutorial python=3.9
conda activate kiara_tutorial
conda install -c conda-forge mamba
mamba install -c conda-forge -c dharpa kiara kiara_plugin.core_types kiara_plugin.tabular kiara_plugin.network_analysis

Note

We are using mamba as our package manager here, instead of 'pure' conda. This is optional, but recommended since it makes things a lot faster.
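
Before moving on, it can help to confirm that the activated environment actually exposes the kiara command. The following is a small sketch in Python (not part of kiara itself) that uses only the standard library; the helper name find_command is made up for this illustration.

```python
import shutil
from typing import Optional


def find_command(name: str) -> Optional[str]:
    """Return the full path of an executable found on the PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    # 'kiara' is the command-line tool installed in the steps above.
    location = find_command("kiara")
    if location:
        print(f"kiara found at: {location}")
    else:
        print("kiara not found; is the 'kiara_tutorial' environment activated?")
```

If the command is not found, the most likely cause is that the terminal is not running inside the environment created above.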

Getting some example data

For this tutorial, we'll need some example data, so we can use kiara against it. We've prepared a git repository for that purpose:

git clone https://github.com/DHARPA-Project/kiara.examples.git
cd kiara.examples

Specifically, here we'll be using two CSV files that were created by my colleague Lena Jaskov.

The files contain information about connections (edges) between medical journals (JournalEdges1902.csv), as well as additional metadata for the journals themselves (JournalNodes1902.csv). We'll use that data to create table and graph structures with kiara.

Checking for available operations

First, let's have a look at which operations are available, and what we can do with them:

kiara operation list
╭─ Available operations ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                          │
│   Id                                                             Type(s)       Description                                               │
│  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────  │
│   create.database.from.file                                      create_from   Create a database from a file.                            │
│   create.database.from.file_bundle                               create_from   Create a database from a file_bundle value.               │
│   create.database.from.table                                     create_from   Create a database value from a table.                     │
│   create.network_data.from.files                                 pipeline      Create table values from files containing edges and       │
│                                                                                node data, then assemble those to the network_data        │
│                                                                                result.                                                   │
│   create.network_data.from.tables                                              Create a graph object from one or two tables.             │
│   create.table.from.file                                         create_from   Create a table from a file, trying to auto-determine      │
│                                                                                the format of said file.                                  │
│   create.table.from.file_bundle                                  create_from   Create a table value from a text file_bundle.             │
│   date.check_range                                                             Check whether a date falls within a specified date        │
│                                                                                range.                                                    │
│   date.extract_from_string                                                     Extract a date object from a string.                      │
│   export.file.as.file                                            export_as     -- n/a --                                                 │
│   export.network_data.as.csv_files                               export_as     Export network data as 2 csv files (one for edges, one    │
│                                                                                for nodes.                                                │
│   export.network_data.as.graphml_file                            export_as     Export network data as graphml file.                      │
│   export.network_data.as.sql_dump                                export_as     Export network data as a sql dump file.                   │
│   export.network_data.as.sqlite_db                               export_as     Export network data as a sqlite database file.            │
│   export.table.as.csv_file                                       export_as     Export a table as csv file.                               │
│   extract.date_array.from.table                                  pipeline      Extract a date array from a table column.                 │
│   file_bundle.pick.file                                                        Pick a single file from a file_bundle value.              │
│   file_bundle.pick.sub_folder                                                  Pick a sub-folder from a file_bundle, resulting in a      │
│                                                                                new file_bundle.                                          │
│   filter.table                                                                 Filter a table.                                           │
│   import.database.from.local_file_path                           pipeline      Import a database from a csv file.                        │
│   import.local.file                                                            Import a file from the local filesystem.                  │
│   import.local.file_bundle                                                     Import a folder (file_bundle) from the local              │
│                                                                                filesystem.                                               │
│   import.network_data.from.local_file_paths                      pipeline      Onboard the edges and nodes from local files, create      │
│                                                                                table values from them, then assemble those to the        │
│                                                                                network_data result.                                      │
│   import.table.from.local_file_path                              pipeline      Import a table from a file on the local filesystem.       │
│   import.table.from.local_folder_path                            pipeline      Import a table from a local folder containing text        │
│                                                                                files.                                                    │
│   kiara_plugin.my_kiara_module.my_kiara_module.tutorial_module                 -- n/a --                                                 │
│   list.contains                                                                Check whether an element is in a list.                    │
│   logic.and                                                                    Returns 'True' if both inputs are 'True'.                 │
│   logic.nand                                                     pipeline      Returns 'False' if both inputs are 'True'.                │
│   logic.nor                                                      pipeline      Returns 'True' if both inputs are 'False'.                │
│   logic.not                                                                    Negates the input.                                        │
│   logic.or                                                                     Returns 'True' if one of the inputs is 'True'.            │
│   logic.xor                                                      pipeline      Returns 'True' if exactly one of it's two inputs is       │
│                                                                                'True'.                                                   │
│   my_kiara_module.example                                                      A very simple example module; concatenate two strings.    │
│   parse.date_array                                                             Create an array of date objects from an array of          │
│                                                                                strings.                                                  │
│   query.database                                                               Execute a sql query against a (sqlite) database.          │
│   query.table                                                                  Execute a sql query against an (Arrow) table.             │
│   string_filter.tokens                                           filter        -- n/a --                                                 │
│   table.pick.column                                                            Pick one column from a table, returning an array.         │
│   table_filter.drop_columns                                      filter        -- n/a --                                                 │
│   table_filter.select_columns                                    filter        -- n/a --                                                 │
│   table_filter.select_rows                                       filter        -- n/a --                                                 │
│                                                                                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Note

In this guide we'll use the term operation to indicate an entity that transforms data in some way or form. kiara also has the concept of a module (the differences are explained in more detail here), and in most cases, especially in the context of this 'Getting started' guide, the meaning of 'module' and 'operation' is roughly the same. Nonetheless, keep in mind that technically the two terms refer to different things.

Importing data, and creating a table

Tables are arguably the most used (and useful) data structures in data science and data engineering. They come in different forms; some people call them spreadsheets, or dataframes. We're not fancy, so we won't do that: we'll call them tables.

A depressingly large amount of (tabular) data comes in CSV files, which is why we'll use one as an example here. Specifically, we will use JournalNodes1902.csv. As stated above, this file contains information about historical medical journals (name, type, where it was from, etc.), and we'll later use it as the table that provides node information in a network graph. We want to convert this file into a 'proper' table structure, because that will make subsequent processing faster, and also simpler in a lot of cases. 'Proper', in this case, means we'll convert it into a format better suited for internal use, one that, among other things, records the data type of each column.
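
To make the idea of a 'proper' table a bit more concrete, here is a minimal sketch in plain Python of what it means to attach a data type to each column. This is not how kiara does it internally (the operation list above suggests it works with Arrow tables); the function names are invented for this example, and the sample is trimmed to the first four columns of three rows from the file preview shown later in this guide.

```python
import csv
import io
from typing import Callable, Dict, List

# Trimmed sample: first four columns of three rows from JournalNodes1902.csv.
SAMPLE_CSV = """Id,Label,JournalType,City
75,Psychiatrische en neurologische bladen,specialized: psychiatry and neurology,Amsterdam
36,The American Journal of Insanity,specialized: psychiatry and neurology,Baltimore
208,The American Journal of Psychology,specialized: psychology,Baltimore
"""


def infer_type(values: List[str]) -> Callable[[str], object]:
    """Pick the narrowest of int, float, str that can parse every value."""
    for caster in (int, float):
        try:
            for value in values:
                caster(value)
            return caster
        except ValueError:
            continue
    return str


def csv_to_typed_columns(text: str) -> Dict[str, list]:
    """Parse CSV text into a column-oriented dict with converted values."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        return {}
    columns: Dict[str, list] = {}
    for name in rows[0].keys():
        raw = [row[name] for row in rows]
        caster = infer_type(raw)
        columns[name] = [caster(value) for value in raw]
    return columns


table = csv_to_typed_columns(SAMPLE_CSV)
for name, values in table.items():
    print(f"{name}: {type(values[0]).__name__}")
```

In a CSV file every value is just text; after the conversion the Id column holds integers while the other columns stay strings, which is the kind of information later processing steps can rely on.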

Finding the right command, and how to use it

kiara likes its data 'onboarded' (or: 'imported'), meaning it prefers to work with data that was imported into its internal data store. This effectively duplicates a file on a user's filesystem (and, depending on the filesystem used, this could mean doubling the hard-disk space required for that particular dataset). The reason behind this preference is that it ensures the data won't be modified by an external application after import. This enables kiara to employ some techniques to save memory, hard-disk space, and CPU resources down the line.
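
The reasoning behind onboarding can be illustrated with a short sketch. It assumes a store that names each copied file after a hash of its content; that is a common technique, not a description of kiara's actual data store, and the function name onboard_file is hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def onboard_file(source: Path, store_dir: Path) -> Path:
    """Copy 'source' into 'store_dir' under a name derived from its content hash."""
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    target = store_dir / digest
    if not target.exists():  # identical content is only stored once
        store_dir.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "nodes.csv"
    original.write_text("Id,Label\n", encoding="utf-8")
    store = Path(tmp) / "store"

    stored = onboard_file(original, store)
    # Another application changing the original does not touch the stored copy.
    original.write_text("changed by another application\n", encoding="utf-8")
    print(stored.read_text(encoding="utf-8"))
```

Because the stored copy can no longer change behind the framework's back, results derived from it can be cached and reused safely, at the price of the extra disk space mentioned above.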

So, in most cases, the first thing you (as a user) want to do is 'import' the source data you want to work with. Let's run the operation list command again, this time filtering by the term 'import':

kiara operation list import
╭─ Filtered operations ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                          │
│   Id                                          Type(s)    Description                                                                     │
│  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────  │
│   import.database.from.local_file_path        pipeline   Import a database from a csv file.                                              │
│   import.local.file                                      Import a file from the local filesystem.                                        │
│   import.local.file_bundle                               Import a folder (file_bundle) from the local filesystem.                        │
│   import.network_data.from.local_file_paths   pipeline   Onboard the edges and nodes from local files, create table values from them,    │
│                                                          then assemble those to the network_data result.                                 │
│   import.table.from.local_file_path           pipeline   Import a table from a file on the local filesystem.                             │
│   import.table.from.local_folder_path         pipeline   Import a table from a local folder containing text files.                       │
│                                                                                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Importing the 'raw' file

After looking at the kiara operation list output, it looks like the import.local.file module might be just what we need (to be honest, import.table.from.local_file_path is what we'd really use if we weren't stuck in this getting-started guide, but doing that would skip over a few important basics that are worth understanding).

kiara has the run sub-command, which is used to execute operations. If we only provide a module name, and not any input, this command will tell us what it expects:

kiara run import.local.file
╭─ Run info: import.local.file ────────────────────────────────────────────────╮
│                                                                              │
│ Can't run operation: invalid or insufficient input(s)                        │
│                                                                              │
│ ──────────────────────────────────────────────────────────────────────────── │
│                                                                              │
│ Operation: import.local.file                                                 │
│                                                                              │
│ Import a file from the local filesystem.                                     │
│                                                                              │
│ Inputs:                                                                      │
│                                                                              │
│   field name   status    type     description           required   default   │
│  ──────────────────────────────────────────────────────────────────────────  │
│   path         not set   string   The local path to     yes                  │
│                                   the file.                                  │
│                                                                              │
│                                                                              │
│ Outputs:                                                                     │
│                                                                              │
│   field name   type   description                                            │
│  ──────────────────────────────────────────────────────────────────────────  │
│   file         file   The loaded files.                                      │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

As you might expect, we need to provide a path input of type string, which tells kiara where to pick up the file. The kiara command-line interface can take complex inputs like dictionaries, but fortunately this is not necessary here. If you ever find yourself in a situation where you need that, check out this section.
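
The 'invalid or insufficient input(s)' message above boils down to comparing the provided inputs against the fields an operation declares. The following sketch shows that idea in plain Python; the InputField class and the check_inputs function are hypothetical and only mirror the columns of the table above (field name, type, required), not kiara's real API.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class InputField:
    """Description of one operation input (hypothetical, for illustration only)."""
    name: str
    python_type: type
    required: bool = True


def check_inputs(schema: List[InputField], inputs: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the inputs are usable."""
    problems = []
    for field in schema:
        if field.name not in inputs:
            if field.required:
                problems.append(f"{field.name}: not set")
        elif not isinstance(inputs[field.name], field.python_type):
            problems.append(f"{field.name}: expected {field.python_type.__name__}")
    return problems


# Mirrors the single 'path' input (type string, required) listed above.
schema = [InputField(name="path", python_type=str)]

print(check_inputs(schema, {}))
print(check_inputs(schema, {"path": "examples/data/journals/JournalNodes1902.csv"}))
```

The first call reports the missing path input, the second returns an empty list because the one required field is present and has the expected type.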

For simple inputs like strings, all we need to do is provide the input name, followed by '=' and the value itself:

kiara run import.local.file path=examples/data/journals/JournalNodes1902.csv
╭─ Result ─────────────────────────────────────────────────────────────────────╮
│                                                                              │
│   field   data_type   value                                                  │
│  ──────────────────────────────────────────────────────────────────────────  │
│   file    file        Id,Label,JournalType,City,CountryNetworkTime,Prese…   │
│                       75,Psychiatrische en neurologische                     │
│                       bladen,specialized: psychiatry and                     │
│                       neurology,Amsterdam,Netherlands,Netherlands,52.3666…   │
│                       36,The American Journal of Insanity,specialized:       │
│                       psychiatry and neurology,Baltimore,United              │
│                       States,United States,39.289444,-76.615278,English      │
│                       208,The American Journal of Psychology,specialized:    │
│                       psychology,Baltimore,United States,United              │
│                       States,39.289444,-76.615278,English                    │
│                       295,Die Krankenpflege,specialized:                     │
│                       therapy,Berlin,German                                  │
│                       Empire,Germany,52.52,13.405,German                     │
│                       296,Die deutsche Klinik am Eingange des zwanzigsten    │
│                       Jahrhunderts,general medicine,Berlin,German            │
│                       Empire,Germany,52.52,13.405,German                     │
│                       300,Therapeutische Monatshefte,specialized:            │
│                       therapy,Berlin,German                                  │
│                       Empire,Germany,52.52,13.405,German                     │
│                       1,Allgemeine Zeitschrift für                           │
│                       Psychiatrie,specialized: psychiatry and                │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       7,Archiv für Psychiatrie und                           │
│                       Nervenkrankheiten,specialized: psychiatry and          │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       10,Berliner klinische Wochenschrift,general            │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       13,Charité Annalen,general medicine,Berlin,German      │
│                       Empire,Germany,52.52,13.405,German                     │
│                       21,Monatsschrift für Psychiatrie und                   │
│                       Neurologie,specialized: psychiatry and                 │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       29,Virchows Archiv,"specialized: anatomy, physiology   │
│                       and pathology",Berlin,German                           │
│                       Empire,Germany,52.52,13.405,German                     │
│                       31,Zeitschrift für pädagogische Psychologie und        │
│                       Pathologie,specialized: psychology and                 │
│                       pedagogy,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       42,Vierteljahrsschrift für gerichtliche Medizin und    │
│                       öffentliches Sanitätswesen,"specialized:               │
│                       anthropology, criminology and                          │
│                       forensics",Berlin,German                               │
│                       Empire,Germany,52.52,13.405,German                     │
│                       47,Centralblatt für Nervenheilkunde und                │
│                       Psychiatrie,specialized: psychiatry and                │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       50,Russische medicinische Rundschau,general            │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       76,Deutsche Aerzte-Zeitung,general                     │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       87,Monatsschrift für Geburtshülfe und                  │
│                       Gynäkologie,specialized: gynecology,Berlin,German      │
│                       Empire,Germany,52.52,13.405,German                     │
│                       108,Archiv für klinische Chirurgie,specialized:        │
│                       surgery,Berlin,German                                  │
│                       Empire,Germany,52.52,13.405,German                     │
│                       113,Zeitschrift für klinische Medicin,general          │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       159,Deutsche militärärztliche                          │
│                       Zeitschrift,specialized: military                      │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       162,Jahresbericht über die Leistungen und              │
│                       Fortschritte auf dem Gebiete der Neurologie und        │
│                       Psychiatrie,specialized: psychiatry and                │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       192,Ärztliche Sachverständigen-Zeitung,general         │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       198,Zeitschrift für die Behandlung Schwachsinniger     │
│                       und Epileptischer,specialized: psychiatry and          │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       258,Der Pfarrbote,news media,Berlin,German             │
│                       Empire,Germany,52.52,13.405,German                     │
│                       71,Correspondenz-Blatt für Schweizer Aerzte,general    │
│                       medicine,Bern,Switzerland,Switzerland,46.948056,7.4…   │
│                       6,Archiv für mikroskopische Anatomie,"specialized:     │
│                       anatomy, physiology and pathology",Bonn,German         │
│                       Empire,Germany,50.733333,7.1,German                    │
│                       203,The Journal of Abnormal Psychology,specialized:    │
│                       psychology,Boston,United States,United                 │
│                       States,42.358056,-71.063611,English                    │
│                       273,"Correspondenz-Blatt der Deutschen Gesellschaft    │
│                       für Anthropologie, Ethnologie und                      │
│                       Urgeschichte","specialized: anthropology,              │
│                       criminology and forensics",Braunschweig,German         │
│                       Empire,Germany,52.266667,10.516667,German              │
│                       303,Policlinique de Bruxelles,general                  │
│                       medicine,Brussels,Belgium,Belgium,50.85,4.35,French    │
│                       306,Annales de la Société Belge de                     │
│                       Neurologie,specialized: psychiatry and                 │
│                       neurology,Brussels,Belgium,Belgium,50.85,4.35,French   │
│                       19,Journal de neurologie,specialized: psychiatry and   │
│                       neurology,Brussels,Belgium,Belgium,50.85,4.35,French   │
│                       25,"Revue internationale d'électrothérapie, de         │
│                       physiologie, de médecine, de chirurgie,                │
│                       d'obstétrique, de thérapeutique, de chimie et de       │
│                       pharmacie",general                                     │
│                       medicine,Brussels,Belgium,Belgium,50.85,4.35,French    │
│                       35,Bulletin de la Société de Médecine Mentale de       │
│                       Belgique,specialized: psychiatry and                   │
│                       neurology,Brussels,Belgium,Belgium,50.85,4.35,French   │
│                       ...                                                    │
│                                                                              │
│                       ...                                                    │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

As you can see from the terminal output, this produced one piece of output data: file (referring to the imported file), along with a preview of the file's content. By itself, this doesn't do much yet: it just reads the file and then stops. What we want in this case is to 'save' the file, so we can refer to it again later. 'Saving' a value in kiara persists the file (or rather, its content and some metadata) in the kiara data store, gives it an internal unique id (a string), and allows the user to 'tag' the value with one or more aliases. Aliases are names that are meaningful to the user, which makes it easy to refer to datasets later on.
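
The relationship between values, ids, and aliases can be pictured with a tiny in-memory model. This is only a sketch under simplifying assumptions (random UUIDs as ids, everything kept in dictionaries); the TinyStore class is invented for this example and says nothing about how the kiara data store is implemented.

```python
import uuid
from typing import Dict


class TinyStore:
    """Hypothetical in-memory stand-in for a data store with ids and aliases."""

    def __init__(self) -> None:
        self._values: Dict[str, bytes] = {}
        self._aliases: Dict[str, str] = {}

    def save(self, content: bytes, *aliases: str) -> str:
        """Persist 'content', tag it with zero or more aliases, return its new id."""
        value_id = str(uuid.uuid4())
        self._values[value_id] = content
        for alias in aliases:
            self._aliases[alias] = value_id
        return value_id

    def load(self, id_or_alias: str) -> bytes:
        """Resolve an alias if one matches, otherwise treat the key as an id."""
        value_id = self._aliases.get(id_or_alias, id_or_alias)
        return self._values[value_id]


store = TinyStore()
value_id = store.save(b"Id,Label\n", "journal_nodes_file")
print(store.load("journal_nodes_file") == store.load(value_id))
```

The id is unique but hard to remember, while the alias is a human-friendly pointer to the same stored value.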

kiara supports saving any of the output values of a kiara run command via the --save flag. This --save parameter takes a single string as argument, and can be used in two ways:

  • if you want to save all output fields of a run, you can just provide a single string (for example imported_journal_csv) as the parameter. In this case, kiara will store all result items with an auto-generated alias in the form of [save_argument].[field_name]. In our case this would result in one item being stored in the data store, with the alias imported_journal_csv.file.
  • if you want to save only a subset of result values, or want to have more control over the aliases those results get, you can use the --save parameter for every field you want to persist. In this case the argument to --save must be in the form of: [field_name]=[alias]. You can use the --save parameter multiple times, with different field names.
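
The two forms described above can be summarised as a small mapping from output fields to aliases. The sketch below is an illustration in plain Python, not kiara's implementation; the function name resolve_save_aliases is hypothetical, and how the two forms interact when mixed is an assumption made here (an explicit [field_name]=[alias] takes precedence).

```python
from typing import Dict, List


def resolve_save_aliases(save_args: List[str], output_fields: List[str]) -> Dict[str, str]:
    """Map output field names to aliases, following the two forms described above."""
    explicit: Dict[str, str] = {}
    generated: Dict[str, str] = {}
    for arg in save_args:
        if "=" in arg:
            # Second form: [field_name]=[alias] saves one specific field.
            field_name, alias = arg.split("=", 1)
            if field_name not in output_fields:
                raise ValueError(f"unknown output field: {field_name}")
            explicit[field_name] = alias
        else:
            # First form: a bare string saves every field as [save_argument].[field_name].
            for field_name in output_fields:
                generated.setdefault(field_name, f"{arg}.{field_name}")
    # Assumption: explicitly named aliases win over auto-generated ones.
    return {**generated, **explicit}


print(resolve_save_aliases(["imported_journal_csv"], ["file"]))
print(resolve_save_aliases(["file=journal_nodes_file"], ["file"]))
```

For the import.local.file operation, which has a single output field named file, the first call yields the alias imported_journal_csv.file and the second yields journal_nodes_file.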

In our case, let's opt for the second option:

kiara run --save file=journal_nodes_file import.local.file path=examples/data/journals/JournalNodes1902.csv
╭─ Result ─────────────────────────────────────────────────────────────────────╮
│                                                                              │
│   field   data_type   value                                                  │
│  ──────────────────────────────────────────────────────────────────────────  │
│   file    file        Id,Label,JournalType,City,CountryNetworkTime,Prese…   │
│                       75,Psychiatrische en neurologische                     │
│                       bladen,specialized: psychiatry and                     │
│                       neurology,Amsterdam,Netherlands,Netherlands,52.3666…   │
│                       36,The American Journal of Insanity,specialized:       │
│                       psychiatry and neurology,Baltimore,United              │
│                       States,United States,39.289444,-76.615278,English      │
│                       208,The American Journal of Psychology,specialized:    │
│                       psychology,Baltimore,United States,United              │
│                       States,39.289444,-76.615278,English                    │
│                       295,Die Krankenpflege,specialized:                     │
│                       therapy,Berlin,German                                  │
│                       Empire,Germany,52.52,13.405,German                     │
│                       296,Die deutsche Klinik am Eingange des zwanzigsten    │
│                       Jahrhunderts,general medicine,Berlin,German            │
│                       Empire,Germany,52.52,13.405,German                     │
│                       300,Therapeutische Monatshefte,specialized:            │
│                       therapy,Berlin,German                                  │
│                       Empire,Germany,52.52,13.405,German                     │
│                       1,Allgemeine Zeitschrift für                           │
│                       Psychiatrie,specialized: psychiatry and                │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       7,Archiv für Psychiatrie und                           │
│                       Nervenkrankheiten,specialized: psychiatry and          │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       10,Berliner klinische Wochenschrift,general            │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       13,Charité Annalen,general medicine,Berlin,German      │
│                       Empire,Germany,52.52,13.405,German                     │
│                       21,Monatsschrift für Psychiatrie und                   │
│                       Neurologie,specialized: psychiatry and                 │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       29,Virchows Archiv,"specialized: anatomy, physiology   │
│                       and pathology",Berlin,German                           │
│                       Empire,Germany,52.52,13.405,German                     │
│                       31,Zeitschrift für pädagogische Psychologie und        │
│                       Pathologie,specialized: psychology and                 │
│                       pedagogy,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       42,Vierteljahrsschrift für gerichtliche Medizin und    │
│                       öffentliches Sanitätswesen,"specialized:               │
│                       anthropology, criminology and                          │
│                       forensics",Berlin,German                               │
│                       Empire,Germany,52.52,13.405,German                     │
│                       47,Centralblatt für Nervenheilkunde und                │
│                       Psychiatrie,specialized: psychiatry and                │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       50,Russische medicinische Rundschau,general            │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       76,Deutsche Aerzte-Zeitung,general                     │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       87,Monatsschrift für Geburtshülfe und                  │
│                       Gynäkologie,specialized: gynecology,Berlin,German      │
│                       Empire,Germany,52.52,13.405,German                     │
│                       108,Archiv für klinische Chirurgie,specialized:        │
│                       surgery,Berlin,German                                  │
│                       Empire,Germany,52.52,13.405,German                     │
│                       113,Zeitschrift für klinische Medicin,general          │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       159,Deutsche militärärztliche                          │
│                       Zeitschrift,specialized: military                      │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       162,Jahresbericht über die Leistungen und              │
│                       Fortschritte auf dem Gebiete der Neurologie und        │
│                       Psychiatrie,specialized: psychiatry and                │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       192,Ärztliche Sachverständigen-Zeitung,general         │
│                       medicine,Berlin,German                                 │
│                       Empire,Germany,52.52,13.405,German                     │
│                       198,Zeitschrift für die Behandlung Schwachsinniger     │
│                       und Epileptischer,specialized: psychiatry and          │
│                       neurology,Berlin,German                                │
│                       Empire,Germany,52.52,13.405,German                     │
│                       258,Der Pfarrbote,news media,Berlin,German             │
│                       Empire,Germany,52.52,13.405,German                     │
│                       71,Correspondenz-Blatt für Schweizer Aerzte,general    │
│                       medicine,Bern,Switzerland,Switzerland,46.948056,7.4…   │
│                       6,Archiv für mikroskopische Anatomie,"specialized:     │
│                       anatomy, physiology and pathology",Bonn,German         │
│                       Empire,Germany,50.733333,7.1,German                    │
│                       203,The Journal of Abnormal Psychology,specialized:    │
│                       psychology,Boston,United States,United                 │
│                       States,42.358056,-71.063611,English                    │
│                       273,"Correspondenz-Blatt der Deutschen Gesellschaft    │
│                       für Anthropologie, Ethnologie und                      │
│                       Urgeschichte","specialized: anthropology,              │
│                       criminology and forensics",Braunschweig,German         │
│                       Empire,Germany,52.266667,10.516667,German              │
│                       303,Policlinique de Bruxelles,general                  │
│                       medicine,Brussels,Belgium,Belgium,50.85,4.35,French    │
│                       306,Annales de la Société Belge de                     │
│                       Neurologie,specialized: psychiatry and                 │
│                       neurology,Brussels,Belgium,Belgium,50.85,4.35,French   │
│                       19,Journal de neurologie,specialized: psychiatry and   │
│                       neurology,Brussels,Belgium,Belgium,50.85,4.35,French   │
│                       25,"Revue internationale d'électrothérapie, de         │
│                       physiologie, de médecine, de chirurgie,                │
│                       d'obstétrique, de thérapeutique, de chimie et de       │
│                       pharmacie",general                                     │
│                       medicine,Brussels,Belgium,Belgium,50.85,4.35,French    │
│                       35,Bulletin de la Société de Médecine Mentale de       │
│                       Belgique,specialized: psychiatry and                   │
│                       neurology,Brussels,Belgium,Belgium,50.85,4.35,French   │
│                       ...                                                    │
│                                                                              │
│                       ...                                                    │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

╭─ Stored result value ────────────────────────────────────────────────────────╮
│                                                                              │
│   field   data type   stored id                         alias(es)            │
│  ──────────────────────────────────────────────────────────────────────────  │
│   file    file        8bb90738-ab11-4cfb-8ada-f43549…   journal_nodes_file   │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

Checking the data store

To check whether that worked, we can list all of our items in the data store, and see if the one we just created is in there:

kiara data list
╭─ Available aliases ──────────────────────────────────────────────────────────╮
│                                                                              │
│   alias                type       size                                       │
│  ──────────────────────────────────────                                      │
│   journal_nodes_file   file   33.43 KB                                       │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

All right! Looks like this worked.

Creating a table from an imported CSV file

CSV files are usually not much use by themselves; in most cases we want to create a table-like structure from them, so we can query the data efficiently. Creating the table usually also makes sure that the structure and format of the file are valid.
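
To make the idea of "turning a CSV file into a table and validating it along the way" more concrete, here is a minimal sketch in plain Python. It is not how kiara implements this internally; the function name csv_to_table is made up for illustration, and the sample rows are a shortened, four-column excerpt of the journal data shown above.

```python
import csv
import io


def csv_to_table(text):
    """Parse CSV text into a column-oriented table (column name -> list of values).

    Hypothetical helper for illustration only, not part of kiara.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        raise ValueError("empty CSV input")
    columns = {name: [] for name in header}
    for row in reader:
        # A row with the wrong number of fields means the file is malformed.
        if len(row) != len(header):
            raise ValueError(
                f"line {reader.line_num}: expected {len(header)} fields, got {len(row)}"
            )
        for name, value in zip(header, row):
            columns[name].append(value)
    return columns


sample = (
    "Id,Label,JournalType,City\n"
    "36,The American Journal of Insanity,specialized: psychiatry and neurology,Baltimore\n"
    '29,Virchows Archiv,"specialized: anatomy, physiology and pathology",Berlin\n'
)

table = csv_to_table(sample)
print(table["City"])
print(table["JournalType"][1])
```

Once the data is organised by column, a query like "all cities" becomes a simple lookup, and the quoted JournalType field that contains commas is kept together as one value instead of being split.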

Let's ask kiara what 'create' related operations it has available:

kiara operation list create
╭─ Filtered operations ────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                      │
│   Id                                 Type(s)       Description                                                       │
│  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────  │
│   create.database.from.file          create_from   Create a database from a file.                                    │
│   create.database.from.file_bundle   create_from   Create a database from a file_bundle value.                       │
│   create.database.from.table         create_from   Create a database value from a table.                             │
│   create.network_data.from.files     pipeline      Create table values from files containing edges and node data,    │
│                                                    then assemble those to the network_data result.                   │
│   create.network_data.from.tables                  Create a graph object from one or two tables.                     │
│   create.table.from.file             create_from   Create a table from a file, trying to auto-determine the format   │
│                                                    of said file.                                                     │
│   create.table.from.file_bundle      create_from   Create a table value from a text file_bundle.                     │
│                                                                                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Righto, looks like create.table.from.file might be our ticket! Let's see what it does:

kiara operation explain create.table.from.file
╭─ Operation: create.table.from.file ──────────────────────────────────────────╮
│                                                                              │
│   Documentation   Create a table from a file, trying to auto-determine the   │
│                   format of said file.                                       │
│                                                                              │
│   Inputs                                                                     │
│                     field                                                    │
│                     name        type   descripti…   Required   Default       │
│                    ──────────────────────────────────────────────────────    │
│                     file        file   The source   yes        -- no         │
│                                        value (of               default       │
│                                        type                    --            │
│                                        'file').                              │
│                                                                              │
│                                                                              │
│   Outputs                                                                    │
│                     field name   type    description                         │
│                    ──────────────────────────────────────────────────────    │
│                     table        table   The result value (of type           │
│                                          'table').                           │
│                                                                              │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

So, it needs a 'file'-named input of type ... file, and will return a 'table'-named output of type, well ... table. Looks good. Here is how we run this:

kiara run create.table.from.file file=alias:journal_nodes_file
╭─ Result ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                                                                      │
│   field   data_type   value                                                                                                                                                                          │
│  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────  │
│   table   table                                                                                                                                                                                      │
│                         Id    Label                              JournalType                         City        CountryNetworkTime        PresentDayCountry   Latitude    Longitude    Language     │
│                        ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────    │
│                         75    Psychiatrische en neurologische    specialized: psychiatry and neuro   Amsterdam   Netherlands               Netherlands         52.366667   4.9          Dutch        │
│                         36    The American Journal of Insanity   specialized: psychiatry and neuro   Baltimore   United States             United States       39.289444   -76.615278   English      │
│                         208   The American Journal of Psycholo   specialized: psychology             Baltimore   United States             United States       39.289444   -76.615278   English      │
│                         295   Die Krankenpflege                  specialized: therapy                Berlin      German Empire             Germany             52.52       13.405       German       │
│                         296   Die deutsche Klinik am Eingange    general medicine                    Berlin      German Empire             Germany             52.52       13.405       German       │
│                         300   Therapeutische Monatshefte         specialized: therapy                Berlin      German Empire             Germany             52.52       13.405       German       │
│                         1     Allgemeine Zeitschrift für Psych   specialized: psychiatry and neuro   Berlin      German Empire             Germany             52.52       13.405       German       │
│                         7     Archiv für Psychiatrie und Nerve   specialized: psychiatry and neuro   Berlin      German Empire             Germany             52.52       13.405       German       │
│                         10    Berliner klinische Wochenschrift   general medicine                    Berlin      German Empire             Germany             52.52       13.405       German       │
│                         13    Charité Annalen                    general medicine                    Berlin      German Empire             Germany             52.52       13.405       German       │
│                         21    Monatsschrift für Psychiatrie un   specialized: psychiatry and neuro   Berlin      German Empire             Germany             52.52       13.405       German       │
│                         29    Virchows Archiv                    specialized: anatomy, physiology    Berlin      German Empire             Germany             52.52       13.405       German       │
│                         31    Zeitschrift für pädagogische Psy   specialized: psychology and pedag   Berlin      German Empire             Germany             52.52       13.405       German       │
│                         42    Vierteljahrsschrift für gerichtl   specialized: anthropology, crimin   Berlin      German Empire             Germany             52.52       13.405       German       │
│                         47    Centralblatt für Nervenheilkunde   specialized: psychiatry and neuro   Berlin      German Empire             Germany             52.52       13.405       German       │
│                         50    Russische medicinische Rundschau   general medicine                    Berlin      German Empire             Germany             52.52       13.405       German       │
│                         ...   ...                                ...                                 ...         ...                       ...                 ...         ...          ...          │
│                         ...   ...                                ...                                 ...         ...                       ...                 ...         ...          ...          │
│                         277   L'arte medica                      general medicine                    Turin       Italy                     Italy               45.079167   7.676111     Italian      │
│                         288   Allgemeine österreichische Geric   specialized: anthropology, crimin   Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         18    Jahrbücher für Psychiatrie         specialized: psychiatry and neuro   Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         30    Wiener klinische Rundschau         general medicine                    Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         44    Wiener klinische Wochenschrift     general medicine                    Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         45    Wiener medizinische Wochenschrif   general medicine                    Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         72    Wiener medizinische Presse         general medicine                    Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         81    Monatsschrift für Gesundheitspfl   general medicine                    Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         93    Klinisch-therapeutische Wochensc   general medicine                    Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         151   Medicinisch-chirurgisches Centra   specialized: surgery                Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         199   Der Militärazt                     specialized: military medicine      Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                         261   Медицинская беседа                 general medicine                    Voronezh    Russian Empire            Russia              51.671667   39.210556    Russian      │
│                         77    Medycyna                           general medicine                    Warsaw      Russian Empire            Poland              52.233333   21.016667    Polish       │
│                         150   Kronika Lekarska                   general medicine                    Warsaw      Russian Empire            Poland              52.233333   21.016667    Polish       │
│                         86    Grenzfragen des Nerven- und Seel   specialized: psychiatry and neuro   Wiesbaden   German Empire             Germany             50.0825     8.24         German       │
│                         206   Ergebnisse der Allgemeinen Patho   specialized: anatomy, physiology    Wiesbaden   German Empire             Germany             50.0825     8.24         German       │
│                                                                                                                                                                                                      │
│                                                                                                                                                                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Note

In this example we prepend the right side of the file= argument with alias:. This is necessary to make it clear to kiara that we mean a dataset that lives in its data store, and that we want to refer to it via its alias. Otherwise, kiara would have interpreted the input as a plain string, and since that is the wrong input type (the operation needs a file), it would have thrown an error.

That output looks good, right? Much more table-y than before. Only thing is: we want to again 'save' this output, so we can use it later directly. No big deal, just like last time:
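
The distinction between an alias reference and a plain string can be sketched in a few lines of Python. This is only an illustration of the idea described in the note, under the assumption that an input is written as field=value; the function parse_input and its return format are hypothetical and do not reflect kiara's actual argument parsing.

```python
ALIAS_PREFIX = "alias:"


def parse_input(argument):
    """Split a 'field=value' argument and classify the value.

    Returns (field, kind, value), where kind is 'alias' if the value
    carries the 'alias:' prefix and 'string' otherwise.
    Hypothetical helper, not kiara's real parser.
    """
    field, separator, raw = argument.partition("=")
    if not separator:
        raise ValueError(f"expected 'field=value', got: {argument!r}")
    if raw.startswith(ALIAS_PREFIX):
        return field, "alias", raw[len(ALIAS_PREFIX):]
    return field, "string", raw


print(parse_input("file=alias:journal_nodes_file"))
print(parse_input("file=journal_nodes_file"))
```

The first call is classified as an alias lookup for journal_nodes_file, the second as a plain string, which is the case that would fail for an operation expecting a file.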

kiara run --output silent --save table=journal_nodes_table create.table.from.file file=alias:journal_nodes_file
╭─ Stored result value ────────────────────────────────────────────────────────╮
│                                                                              │
│   field   data type   stored id                        alias(es)             │
│  ──────────────────────────────────────────────────────────────────────────  │
│   table   table       8ef2a5ea-031a-4370-93ac-c8d42…   journal_nodes_table   │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

Note

Here we use the --output silent command line option to suppress any output of values. We've seen this already in the last invocation of this command. kiara will still tell us the id of the value it just saved.
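
As a mental model for what "saving a value under an alias" means, the following toy sketch keeps values under generated ids and maps human-readable aliases to those ids. It is an assumption-laden illustration only: the class ToyDataStore and its methods are invented here, and kiara's real data store is persistent and considerably more involved.

```python
import uuid


class ToyDataStore:
    """In-memory stand-in for a data store (hypothetical, not kiara's)."""

    def __init__(self):
        self.values = {}   # value id -> stored data
        self.aliases = {}  # alias -> value id

    def save(self, data, alias):
        """Store data under a fresh id, register the alias, return the id."""
        value_id = str(uuid.uuid4())
        self.values[value_id] = data
        self.aliases[alias] = value_id
        return value_id

    def load_by_alias(self, alias):
        """Look up the id behind an alias, then return the stored data."""
        return self.values[self.aliases[alias]]


store = ToyDataStore()
stored_id = store.save({"Id": ["75"], "City": ["Amsterdam"]}, "journal_nodes_table")
print(stored_id)
print(store.load_by_alias("journal_nodes_table"))
```

The id printed by save is random on every run, which is why an alias is the more convenient handle for later commands.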

Checking the data store, again

Now, let's look again at the content of the kiara data store:

kiara data list
╭─ Available aliases ──────────────────────────────────────────────────────────╮
│                                                                              │
│   alias                 type        size                                     │
│  ────────────────────────────────────────                                    │
│   journal_nodes_file    file    33.43 KB                                     │
│   journal_nodes_table   table   42.79 KB                                     │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

As you can see, there are two items now: one file, and one table. If you ever want to get more details about any of the items in the data store, you can use one of these commands:

Display information about the data: kiara data explain
kiara data explain alias:journal_nodes_table
╭─ Value details for: alias:journal_nodes_table ───────────────────────────────╮
│                                                                              │
│   value_id            8ef2a5ea-031a-4370-93ac-c8d42dc1ea3b                   │
│   kiara_id            bc41cc78-899c-433b-8e8d-33d0c9990791                   │
│                                                                              │
│                       ────────────────────────────────────────────────────   │
│   data_type_info                                                             │
│                         data_type_name     table                             │
│                         data_type_config   {}                                │
│                         characteristics    {                                 │
│                                              "is_scalar": false,             │
│                                              "is_json_serializable":         │
│                                            false                             │
│                                            }                                 │
│                         data_type_class                                      │
│                                              python_cla…   TableType         │
│                                              python_mod…   kiara_plug…       │
│                                              full_name     kiara_plug…       │
│                                                                              │
│                                                                              │
│   destiny_backlinks   {}                                                     │
│   enviroments         None                                                   │
│   property_links      {                                                      │
│                         "metadata.python_class":                             │
│                       "aae815fc-07fb-48ef-a8e3-cc18473d8389",                │
│                         "metadata.table":                                    │
│                       "30bec5f0-d69a-4b2e-bda2-e99a411d6463"                 │
│                       }                                                      │
│   value_hash          zdpuAn89Et1ENzfoASJRYcWEceyfRiPg664mN4nnHLFnjRLyg      │
│   value_schema                                                               │
│                         type          table                                  │
│                         type_config   {}                                     │
│                         default       __not_set__                            │
│                         optional      False                                  │
│                         is_constant   False                                  │
│                         doc           The result value (of type              │
│                                       'table').                              │
│                                                                              │
│   value_size          42.79 KB                                               │
│   value_status        -- set --                                              │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

This command prints out the metadata kiara has stored about a value item. It can display several internally important metadata details of stored datasets; check out the available options with kiara data explain --help. One option that is particularly interesting is --properties, which displays all the metadata properties kiara has collected about a value. We will experiment with this option a bit later in this tutorial.

Display the data itself: kiara data load
kiara data load -s alias:journal_nodes_table
  Id    Label                                            JournalType                                       City        CountryNetworkTime        PresentDayCountry   Latitude    Longitude    Language  
 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 
  75    Psychiatrische en neurologische bladen           specialized: psychiatry and neurology             Amsterdam   Netherlands               Netherlands         52.366667   4.9          Dutch     
  36    The American Journal of Insanity                 specialized: psychiatry and neurology             Baltimore   United States             United States       39.289444   -76.615278   English   
  208   The American Journal of Psychology               specialized: psychology                           Baltimore   United States             United States       39.289444   -76.615278   English   
  295   Die Krankenpflege                                specialized: therapy                              Berlin      German Empire             Germany             52.52       13.405       German    
  296   Die deutsche Klinik am Eingange des zwanzigste   general medicine                                  Berlin      German Empire             Germany             52.52       13.405       German    
  300   Therapeutische Monatshefte                       specialized: therapy                              Berlin      German Empire             Germany             52.52       13.405       German    
  1     Allgemeine Zeitschrift für Psychiatrie           specialized: psychiatry and neurology             Berlin      German Empire             Germany             52.52       13.405       German    
  7     Archiv für Psychiatrie und Nervenkrankheiten     specialized: psychiatry and neurology             Berlin      German Empire             Germany             52.52       13.405       German    
  10    Berliner klinische Wochenschrift                 general medicine                                  Berlin      German Empire             Germany             52.52       13.405       German    
  13    Charité Annalen                                  general medicine                                  Berlin      German Empire             Germany             52.52       13.405       German    
  21    Monatsschrift für Psychiatrie und Neurologie     specialized: psychiatry and neurology             Berlin      German Empire             Germany             52.52       13.405       German    
  29    Virchows Archiv                                  specialized: anatomy, physiology and pathology    Berlin      German Empire             Germany             52.52       13.405       German    
  31    Zeitschrift für pädagogische Psychologie und P   specialized: psychology and pedagogy              Berlin      German Empire             Germany             52.52       13.405       German    
  42    Vierteljahrsschrift für gerichtliche Medizin u   specialized: anthropology, criminology and fore   Berlin      German Empire             Germany             52.52       13.405       German    
  47    Centralblatt für Nervenheilkunde und Psychiatr   specialized: psychiatry and neurology             Berlin      German Empire             Germany             52.52       13.405       German    
  50    Russische medicinische Rundschau                 general medicine                                  Berlin      German Empire             Germany             52.52       13.405       German    
  ...   ...                                              ...                                               ...         ...                       ...                 ...         ...          ...       
  ...   ...                                              ...                                               ...         ...                       ...                 ...         ...          ...       
  277   L'arte medica                                    general medicine                                  Turin       Italy                     Italy               45.079167   7.676111     Italian   
  288   Allgemeine österreichische Gerichts-Zeitung      specialized: anthropology, criminology and fore   Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  18    Jahrbücher für Psychiatrie                       specialized: psychiatry and neurology             Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  30    Wiener klinische Rundschau                       general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  44    Wiener klinische Wochenschrift                   general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  45    Wiener medizinische Wochenschrift                general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  72    Wiener medizinische Presse                       general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  81    Monatsschrift für Gesundheitspflege              general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  93    Klinisch-therapeutische Wochenschrift            general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  151   Medicinisch-chirurgisches Centralblatt           specialized: surgery                              Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  199   Der Militärazt                                   specialized: military medicine                    Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German    
  261   Медицинская беседа                               general medicine                                  Voronezh    Russian Empire            Russia              51.671667   39.210556    Russian   
  77    Medycyna                                         general medicine                                  Warsaw      Russian Empire            Poland              52.233333   21.016667    Polish    
  150   Kronika Lekarska                                 general medicine                                  Warsaw      Russian Empire            Poland              52.233333   21.016667    Polish    
  86    Grenzfragen des Nerven- und Seelenlebens         specialized: psychiatry and neurology             Wiesbaden   German Empire             Germany             50.0825     8.24         German    
  206   Ergebnisse der Allgemeinen Pathologie und Path   specialized: anatomy, physiology and pathology    Wiesbaden   German Empire             Germany             50.0825     8.24         German    

Note

If you omit the -s flag, this command will let you browse the table (or any other supported data type) interactively, similar to a pager application.

This command loads the actual data, and prints out its content (or a representation of it that makes sense in a terminal-context).

Querying the table data

This section is a bit more advanced, so you can skip it if you want. It's just to show an example of what can be done with a stored table data item.

We'll be using the SQL query language to find the names and types of all journals from Berlin. The query for this is:

select Label, JournalType from data where City='Berlin'

The kiara module we are going to use is called query.table. Again, let's check which parameters this module expects:

kiara run query.table
╭─ Run info: query.table ──────────────────────────────────────────────────────╮
│                                                                              │
│ Can't run operation: invalid or insufficient input(s)                        │
│                                                                              │
│ ──────────────────────────────────────────────────────────────────────────── │
│                                                                              │
│ Operation: query.table                                                       │
│                                                                              │
│ Execute a sql query against an (Arrow) table.                                │
│                                                                              │
│ The default relation name for the sql query is 'data', but can be modified   │
│ by the 'relation_name' config option/input.                                  │
│                                                                              │
│ If the 'query' module config option is not set, users can provide their own  │
│ query, otherwise the pre-set one will be used.                               │
│                                                                              │
│ Inputs:                                                                      │
│                                                                              │
│   field name      status    type     description        required   default   │
│  ──────────────────────────────────────────────────────────────────────────  │
│   query           not set   string   The query, use     yes                  │
│                                      the value of the                        │
│                                      'relation_name'                         │
│                                      input as table,                         │
│                                      e.g. 'select *                          │
│                                      from data'.                             │
│   relation_name   valid     string   The name the       no         data      │
│                                      table is                                │
│                                      referred to in                          │
│                                      the sql query.                          │
│   table           not set   table    The table to       yes                  │
│                                      query                                   │
│                                                                              │
│                                                                              │
│ Outputs:                                                                     │
│                                                                              │
│   field name     type    description                                         │
│  ──────────────────────────────────────────────────────────────────────────  │
│   query_result   table   The query result.                                   │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

Aha. Both table and query are required. Good, we have both. In this example, we'll use the data item we stored earlier as input for another workflow. That goes like this:

kiara run query.table table=alias:journal_nodes_table query="select Label, JournalType from data where City='Berlin'"
╭─ Result ─────────────────────────────────────────────────────────────────────╮
│                                                                              │
│   field          data_type   value                                           │
│  ──────────────────────────────────────────────────────────────────────────  │
│   query_result   table                                                       │
│                                Label                 JournalType             │
│                               ───────────────────────────────────────────    │
│                                Die Krankenpflege     specialized: therap     │
│                                Die deutsche Klinik   general medicine        │
│                                Therapeutische Mona   specialized: therap     │
│                                Allgemeine Zeitschr   specialized: psychi     │
│                                Archiv für Psychiat   specialized: psychi     │
│                                Berliner klinische    general medicine        │
│                                Charité Annalen       general medicine        │
│                                Monatsschrift für P   specialized: psychi     │
│                                Virchows Archiv       specialized: anatom     │
│                                Zeitschrift für päd   specialized: psycho     │
│                                Vierteljahrsschrift   specialized: anthro     │
│                                Centralblatt für Ne   specialized: psychi     │
│                                Russische medicinis   general medicine        │
│                                Deutsche Aerzte-Zei   general medicine        │
│                                Monatsschrift für G   specialized: gyneco     │
│                                Archiv für klinisch   specialized: surger     │
│                                Zeitschrift für kli   general medicine        │
│                                Deutsche militärärz   specialized: milita     │
│                                Jahresbericht über    specialized: psychi     │
│                                Ärztliche Sachverst   general medicine        │
│                                Zeitschrift für die   specialized: psychi     │
│                                Der Pfarrbote         news media              │
│                                                                              │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

Note how we use the alias: prefix again here, to tell kiara that what follows is a reference to a stored dataset, and not a plain string.
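If you'd like to get a feeling for what the three inputs (table, query, relation_name) do without kiara, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. This is not how kiara runs the query internally (kiara works on an Arrow table), and the query_table helper is a hypothetical name made up for this illustration. The rows are a handful of journals copied from the table printed earlier.

```python
import sqlite3

# A few rows copied from the journals table shown earlier (Label, JournalType, City).
journals = [
    ("Therapeutische Monatshefte", "specialized: therapy", "Berlin"),
    ("Charité Annalen", "general medicine", "Berlin"),
    ("Virchows Archiv", "specialized: anatomy, physiology and pathology", "Berlin"),
    ("Wiener klinische Wochenschrift", "general medicine", "Vienna"),
    ("Medycyna", "general medicine", "Warsaw"),
]


def query_table(rows, query, relation_name="data"):
    """Hypothetical helper: run `query` against `rows`, which are exposed
    to the SQL query under the table name `relation_name`."""
    # The relation name ends up inside the SQL text, so only accept plain identifiers.
    if not relation_name.isidentifier():
        raise ValueError(f"invalid relation name: {relation_name!r}")
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            f'CREATE TABLE "{relation_name}" (Label TEXT, JournalType TEXT, City TEXT)'
        )
        conn.executemany(f'INSERT INTO "{relation_name}" VALUES (?, ?, ?)', rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# Default relation name 'data', same query as in the tutorial.
berlin = query_table(
    journals, "select Label, JournalType from data where City='Berlin'"
)

# Same query, but the table is referred to as 'journals' instead of 'data'.
berlin_renamed = query_table(
    journals,
    "select Label, JournalType from journals where City='Berlin'",
    relation_name="journals",
)

for label, journal_type in berlin:
    print(f"{label}: {journal_type}")
```

Both calls return the same three Berlin journals. The only thing relation_name changes is the name under which the table is visible to the query.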

Saving the result of the query

As it is, the result of this query won't be saved anywhere. This might be fine for exploratory queries. But in some cases we might want to store the result of our work, similar to how we imported the original table in the first place. The kiara run command can do that, using the --save flag. It takes a string as its argument. If that string contains a '=', it is interpreted as a key-value pair, where the key is the name of the output field we want to save, and the value is the alias we want to save it under. Here is how that goes:

kiara run query.table --output=silent --save query_result=berlin_journals table=alias:journal_nodes_table query="select Label, JournalType from data where City='Berlin'"
╭─ Stored result value ────────────────────────────────────────────────────────╮
│                                                                              │
│   field          data type   stored id                     alias(es)         │
│  ──────────────────────────────────────────────────────────────────────────  │
│   query_result   table       f32438b9-ab95-4d18-b15e-22…   berlin_journals   │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

From looking at the output, it seems that saving our result has worked. We can make sure by letting kiara 'explain' to us the data that is stored under the alias 'berlin_journals'. This time, let's also display the result table's properties (by using the --properties flag):

kiara data explain --properties alias:berlin_journals
╭─ Value details for: alias:berlin_journals ───────────────────────────────────╮
│                                                                              │
│   value_id            f32438b9-ab95-4d18-b15e-22f6e168454d                   │
│   kiara_id            bc41cc78-899c-433b-8e8d-33d0c9990791                   │
│                                                                              │
│                       ────────────────────────────────────────────────────   │
│   data_type_info                                                             │
│                         data_type_name     table                             │
│                         data_type_config   {}                                │
│                         characteristics    {                                 │
│                                              "is_scalar": false,             │
│                                              "is_json_serializable":         │
│                                            false                             │
│                                            }                                 │
│                         data_type_class                                      │
│                                              python_cla…   TableType         │
│                                              python_mod…   kiara_plug…       │
│                                              full_name     kiara_plug…       │
│                                                                              │
│                                                                              │
│   destiny_backlinks   {}                                                     │
│   enviroments         None                                                   │
│   properties                                                                 │
│                         field                   value                        │
│                        ──────────────────────────────────────────────────    │
│                         metadata.python_class   {                            │
│                                                   "python_class": {          │
│                                                     "python_class_name"…     │
│                                                     "python_module_name…     │
│                                                     "full_name": "kiara…     │
│                                                   }                          │
│                                                 }                            │
│                         metadata.table          {                            │
│                                                   "table": {                 │
│                                                     "column_names": [        │
│                                                       "Label",               │
│                                                       "JournalType"          │
│                                                     ],                       │
│                                                     "column_schema": {       │
│                                                       "Label": {             │
│                                                         "type_name": "s…     │
│                                                         "metadata": {        │
│                                                           "arrow_type_i…     │
│                                                         }                    │
│                                                       },                     │
│                                                       "JournalType": {       │
│                                                         "type_name": "s…     │
│                                                         "metadata": {        │
│                                                           "arrow_type_i…     │
│                                                         }                    │
│                                                       }                      │
│                                                     },                       │
│                                                     "rows": 22,              │
│                                                     "size": 1672             │
│                                                   }                          │
│                                                 }                            │
│                                                                              │
│   property_links      {                                                      │
│                         "metadata.python_class":                             │
│                       "e71ccadc-8eee-49cb-93e1-ee42faeaad96",                │
│                         "metadata.table":                                    │
│                       "ca42d233-c95c-4d80-a690-f1acd840cf9f"                 │
│                       }                                                      │
│   value_hash          zdpuAq5Ty5hNtUaKWouPmS75LxteiQQv6Ue6Jsq9v39QoMPyw      │
│   value_schema                                                               │
│                         type          table                                  │
│                         type_config   {}                                     │
│                         default       __not_set__                            │
│                         optional      False                                  │
│                         is_constant   False                                  │
│                         doc           The query result.                      │
│                                                                              │
│   value_size          2.63 KB                                                │
│   value_status        -- set --                                              │
│                                                                              │
│                       ────────────────────────────────────────────────────   │
│                                                                              │
│   properties                                                                 │
│                         metadata.python_class   {                            │
│                                                   "python_class": {          │
│                                                     "python_class_name"…     │
│                                                     "python_module_name…     │
│                                                     "full_name": "kiara…     │
│                                                   }                          │
│                                                 }                            │
│                         metadata.table          {                            │
│                                                   "table": {                 │
│                                                     "column_names": [        │
│                                                       "Label",               │
│                                                       "JournalType"          │
│                                                     ],                       │
│                                                     "column_schema": {       │
│                                                       "Label": {             │
│                                                         "type_name": "s…     │
│                                                         "metadata": {        │
│                                                           "arrow_type_i…     │
│                                                         }                    │
│                                                       },                     │
│                                                       "JournalType": {       │
│                                                         "type_name": "s…     │
│                                                         "metadata": {        │
│                                                           "arrow_type_i…     │
│                                                         }                    │
│                                                       }                      │
│                                                     },                       │
│                                                     "rows": 22,              │
│                                                     "size": 1672             │
│                                                   }                          │
│                                                 }                            │
│                                                                              │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

Generating a network graph

Our goal for this tutorial is to create a network graph, and investigate its properties. Network graphs are usually created from one or two pieces of data (both tabular in nature):

  • edges (mandatory): information about what nodes exist, and if and how they are connected
  • nodes information (optional): information about attributes of each node
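To illustrate how these two pieces fit together, here is a rough sketch in plain Python. It is not the data structure kiara uses for graphs. The build_graph helper and its return format are made up for this example, and it assumes that the first column of the nodes table is the id that the Source and Target columns of the edges file refer to. The sample values are copied from the journals data used in this tutorial.

```python
from collections import defaultdict

# (Source, Target, weight) rows, copied from the edges data of this tutorial.
edges = [(1, 1, 11), (1, 5, 1), (1, 7, 6), (147, 27, 11)]

# Optional node attributes, keyed by node id.
# ASSUMPTION: the first column of the nodes table is that id.
node_info = {
    1: {"Label": "Allgemeine Zeitschrift für Psychiatrie", "City": "Berlin"},
    7: {"Label": "Archiv für Psychiatrie und Nervenkrankheiten", "City": "Berlin"},
}


def build_graph(edges, node_info=None):
    """Hypothetical helper: derive the node set and a weighted adjacency
    mapping from the edges, and attach attributes where we have them."""
    node_info = node_info or {}
    adjacency = defaultdict(dict)
    nodes = {}
    for source, target, weight in edges:
        adjacency[source][target] = weight
        # Every id that appears in an edge is a node, even without attributes.
        for node_id in (source, target):
            nodes.setdefault(node_id, dict(node_info.get(node_id, {})))
    return nodes, dict(adjacency)


nodes, adjacency = build_graph(edges, node_info)
print(sorted(nodes))   # all node ids found in the edges
print(adjacency[1])    # neighbours of node 1, with edge weights
```

The edges alone are enough to know which nodes exist, which is why they are mandatory. The nodes information only adds attributes on top, so nodes without an entry simply end up with an empty attribute dictionary.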

Note

In this tutorial we'll go through all the steps necessary to create a network graph object from two CSV files, one by one. This is a bit cumbersome, but it'll help you understand what actually happens. In a later tutorial we'll show how to create a kiara pipeline to combine all those steps into one.

Importing edges data, creating a table item from it

We already have our nodes imported into kiara (with the alias journal_nodes_table). Now we need to do the same for our edges. As before, we want to import the file into the kiara data store and then convert it into a table. This time, let's use a prepared (so-called) pipeline operation, which runs both operations in one go and feeds the right output(s) into the right input(s):

kiara operation explain import.table.from.local_file_path
╭─ Operation: import.table.from.local_file_path ───────────────────────────────╮
│                                                                              │
│   Documentation   Import a table from a file on the local filesystem.        │
│                                                                              │
│   Inputs                                                                     │
│                     field                                                    │
│                     name        type     descrip…   Required   Default       │
│                    ──────────────────────────────────────────────────────    │
│                     path        string   The        yes        -- no         │
│                                          local                 default       │
│                                          path to               --            │
│                                          the                                 │
│                                          file.                               │
│                                                                              │
│                                                                              │
│   Outputs                                                                    │
│                     field name      type    description                      │
│                    ──────────────────────────────────────────────────────    │
│                     imported_file   file    The loaded files.                │
│                     table           table   The result value (of type        │
│                                             'table').                        │
│                                                                              │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

So, let's see:

kiara run --save journal_edges import.table.from.local_file_path path=examples/data/journals/JournalEdges1902.csv
╭─ Results ────────────────────────────────────────────────────────────────────╮
│                                                                              │
│   field           data_type   value                                          │
│  ──────────────────────────────────────────────────────────                  │
│   imported_file   file        Source,Target,weight                           │
│                               1,1,11                                         │
│                               1,5,1                                          │
│                               1,7,6                                          │
│                               1,8,15                                         │
│                               1,10,24                                        │
│                               1,13,1                                         │
│                               1,14,2                                         │
│                               1,15,8                                         │
│                               1,18,7                                         │
│                               1,20,48                                        │
│                               1,21,7                                         │
│                               1,22,4                                         │
│                               1,23,75                                        │
│                               1,24,1                                         │
│                               1,26,8                                         │
│                               1,29,1                                         │
│                               1,30,14                                        │
│                               1,35,16                                        │
│                               1,36,23                                        │
│                               1,37,4                                         │
│                               1,38,5                                         │
│                               1,39,4                                         │
│                               1,40,10                                        │
│                               1,41,2                                         │
│                               1,42,4                                         │
│                               1,43,2                                         │
│                               1,44,1                                         │
│                               1,45,5                                         │
│                               1,46,7                                         │
│                               1,47,2                                         │
│                               1,56,1                                         │
│                               1,58,34                                        │
│                               1,61,9                                         │
│                               1,63,12                                        │
│                               ...                                            │
│                                                                              │
│                               ...                                            │
│   table           table                                                      │
│                                 Source   Target   weight                     │
│                                ──────────────────────────                    │
│                                 1        1        11                         │
│                                 1        5        1                          │
│                                 1        7        6                          │
│                                 1        8        15                         │
│                                 1        10       24                         │
│                                 1        13       1                          │
│                                 1        14       2                          │
│                                 1        15       8                          │
│                                 1        18       7                          │
│                                 1        20       48                         │
│                                 1        21       7                          │
│                                 1        22       4                          │
│                                 1        23       75                         │
│                                 1        24       1                          │
│                                 1        26       8                          │
│                                 1        29       1                          │
│                                 ...      ...      ...                        │
│                                 ...      ...      ...                        │
│                                 51       108      1                          │
│                                 51       109      5                          │
│                                 51       110      1                          │
│                                 51       111      1                          │
│                                 51       112      1                          │
│                                 51       113      1                          │
│                                 51       114      2                          │
│                                 51       115      2                          │
│                                 51       116      1                          │
│                                 51       118      3                          │
│                                 51       119      2                          │
│                                 51       120      1                          │
│                                 51       121      1                          │
│                                 63       102      1                          │
│                                 147      27       11                         │
│                                 147      241      1                          │
│                                                                              │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

╭─ Stored result values ───────────────────────────────────────────────────────╮
│                                                                              │
│   field           data type   stored id               alias(es)              │
│  ──────────────────────────────────────────────────────────────────────────  │
│   imported_file   file        d524e19e-be9b-4f56-b…   journal_edges.impor…   │
│   table           table       0aea06f2-f729-4ec5-b…   journal_edges.table    │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

Note

Here we've used a simple string (without '=') with the --save option, and as you can see, kiara created two namespaced aliases for the result items.

At this stage we have two relevant tables in our store: journal_edges.table and journal_nodes_table (note that the two follow different naming schemes, because we used the --save option differently in each case):

kiara data list
╭─ Available aliases ──────────────────────────────────────────────────────────╮
│                                                                              │
│   alias                         type        size                             │
│  ────────────────────────────────────────────────                            │
│   journal_edges.table           table    9.13 KB                             │
│   journal_nodes_file            file    33.43 KB                             │
│   journal_nodes_table           table   42.79 KB                             │
│   journal_edges.imported_file   file     3.02 KB                             │
│   berlin_journals               table    2.63 KB                             │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
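The two alias naming schemes visible in this listing can be summarised in a few lines of Python. This is only an illustrative sketch of the convention observed in the outputs of this tutorial, not kiara's actual implementation; the function name expand_save_spec is hypothetical.

```python
def expand_save_spec(spec, output_fields):
    """Map a --save argument to a {field: alias} dict (illustration only).

    Assumption: this mirrors the behaviour seen in the tutorial output,
    it is not taken from kiara's source code.
    """
    if "=" in spec:
        # 'field=alias' form: only the named field gets the given alias.
        field, alias = spec.split("=", 1)
        return {field: alias}
    # Plain string form: every output field gets a namespaced alias.
    return {field: f"{spec}.{field}" for field in output_fields}


# Plain string, as used when importing the edges file:
aliases_edges = expand_save_spec("journal_edges", ["imported_file", "table"])
print(aliases_edges)

# 'field=alias' form:
aliases_graph = expand_save_spec("network_data=journals_graph", ["network_data"])
print(aliases_graph)
```

The plain string yields journal_edges.imported_file and journal_edges.table, matching the aliases in the listing, while the 'field=alias' form produces exactly one alias with the name we chose.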

Creating the graph

Now that we have the edges data in kiara in a useful format, we can create the graph object. The data type for graphs in kiara is called network_data, so let's check out all the operations kiara has to offer related to network_data:

kiara operation list network_data
╭─ Filtered operations ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                                                  │
│   Id                                          Type(s)     Description                                                                                                            │
│  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────  │
│   create.network_data.from.files              pipeline    Create table values from files containing edges and node data, then assemble those to the network_data result.         │
│   create.network_data.from.tables                         Create a graph object from one or two tables.                                                                          │
│   export.network_data.as.csv_files            export_as   Export network data as 2 csv files (one for edges, one for nodes.                                                      │
│   export.network_data.as.graphml_file         export_as   Export network data as graphml file.                                                                                   │
│   export.network_data.as.sql_dump             export_as   Export network data as a sql dump file.                                                                                │
│   export.network_data.as.sqlite_db            export_as   Export network data as a sqlite database file.                                                                         │
│   import.network_data.from.local_file_paths   pipeline    Onboard the edges and nodes from local files, create table values from them, then assemble those to the network_data   │
│                                                           result.                                                                                                                │
│                                                                                                                                                                                  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Hm, create.network_data.from.tables looks good, right? Let's look at that operation's interface:

kiara operation explain create.network_data.from.tables
╭─ Operation: create.network_data.from.tables ─────────────────────────────────────────────────────────────────────────╮
│                                                                                                                      │
│   Documentation   Create a graph object from one or two tables.                                                      │
│                                                                                                                      │
│                   This module needs at least one table as input, providing the edges of the resulting network data   │
│                   set.                                                                                               │
│                   If no further table is created, basic node information will be automatically created by using      │
│                   unique values from                                                                                 │
│                   the edges source and target columns.                                                               │
│                                                                                                                      │
│   Inputs                                                                                                             │
│                     field name           type     description                        Required   Default              │
│                    ──────────────────────────────────────────────────────────────────────────────────────────────    │
│                     edges                table    A table that contains the edges    yes        -- no default --     │
│                                                   data.                                                              │
│                     source_column_name   string   The name of the source column      no         source               │
│                                                   name in the edges table.                                           │
│                     target_column_name   string   The name of the target column      no         target               │
│                                                   name in the edges table.                                           │
│                     edges_column_map     dict     An optional map of original        no         -- no default --     │
│                                                   column name to desired.                                            │
│                     nodes                table    A table that contains the nodes    no         -- no default --     │
│                                                   data.                                                              │
│                     id_column_name       string   The name (before any potential     no         id                   │
│                                                   column mapping) of the                                             │
│                                                   node-table column that contains                                    │
│                                                   the node identifier (used in the                                   │
│                                                   edges table).                                                      │
│                     label_column_name    string   The name of a column that          no         -- no default --     │
│                                                   contains the node label (before                                    │
│                                                   any potential column name                                          │
│                                                   mapping). If not specified, the                                    │
│                                                   value of the id value will be                                      │
│                                                   used as label.                                                     │
│                     nodes_column_map     dict     An optional map of original        no         -- no default --     │
│                                                   column name to desired.                                            │
│                                                                                                                      │
│                                                                                                                      │
│   Outputs                                                                                                            │
│                     field name     type           description                                                        │
│                    ──────────────────────────────────────────────────────────────────────────────────────────────    │
│                     network_data   network_data   The network/graph data.                                            │
│                                                                                                                      │
│                                                                                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
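The documentation above says that, if no nodes table is supplied, basic node information is created from the unique values of the edges' source and target columns. A minimal sketch of that idea in plain Python follows; it is an assumption about the general approach, not kiara's code, and the function name derive_node_ids is hypothetical. The sample rows are taken from the edges table shown in this tutorial, with 'weight' assumed as the name of the third column.

```python
def derive_node_ids(edges, source_column="source", target_column="target"):
    """Collect unique node ids from the source and target columns of an
    edge list, preserving first-seen order (illustration only)."""
    seen = {}
    for row in edges:
        for column in (source_column, target_column):
            # Dict keys keep insertion order and drop duplicates.
            seen.setdefault(row[column], None)
    return list(seen)


# A few edge rows, using capitalised column names like our edges table.
edges = [
    {"Source": 1, "Target": 5, "weight": 1},
    {"Source": 1, "Target": 7, "weight": 6},
    {"Source": 51, "Target": 108, "weight": 1},
]

node_ids = derive_node_ids(edges, source_column="Source", target_column="Target")
print(node_ids)
```

Note that the column names have to be passed explicitly here, because the sample rows use 'Source' and 'Target' rather than the lowercase defaults. The same applies to the source_column_name and target_column_name inputs of the operation.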

From this information we can assemble our command, using alias:journal_edges.table as the main input and saving the result under the alias journals_graph. We can figure out the values for the other inputs by running kiara data explain --properties journal_edges.table, which will give us the edge column names, among other things (and, subsequently, kiara data explain --properties journal_nodes_table for the node columns). So, here goes nothing:

kiara run --save network_data=journals_graph create.network_data.from.tables edges=alias:journal_edges.table source_column_name=Source target_column_name=Target nodes=alias:journal_nodes_table id_column_name=Id label_column_name=Label
╭─ Result ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                                                                                                              │
│   field          data_type      value                                                                                                                                                                                                        │
│  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────  │
│   network_data   network_data                                                                                                                                                                                                                │
│                                 Table: edges                                                                                                                                                                                                 │
│                                                                                                                                                                                                                                              │
│                                   source   target   weight                                                                                                                                                                                   │
│                                  ──────────────────────────                                                                                                                                                                                  │
│                                   1        1        11                                                                                                                                                                                       │
│                                   1        5        1                                                                                                                                                                                        │
│                                   1        7        6                                                                                                                                                                                        │
│                                   1        8        15                                                                                                                                                                                       │
│                                   1        10       24                                                                                                                                                                                       │
│                                   1        13       1                                                                                                                                                                                        │
│                                   1        14       2                                                                                                                                                                                        │
│                                   1        15       8                                                                                                                                                                                        │
│                                   1        18       7                                                                                                                                                                                        │
│                                   1        20       48                                                                                                                                                                                       │
│                                   1        21       7                                                                                                                                                                                        │
│                                   1        22       4                                                                                                                                                                                        │
│                                   1        23       75                                                                                                                                                                                       │
│                                   1        24       1                                                                                                                                                                                        │
│                                   1        26       8                                                                                                                                                                                        │
│                                   1        29       1                                                                                                                                                                                        │
│                                   ...      ...      ...                                                                                                                                                                                      │
│                                   ...      ...      ...                                                                                                                                                                                      │
│                                   51       108      1                                                                                                                                                                                        │
│                                   51       109      5                                                                                                                                                                                        │
│                                   51       110      1                                                                                                                                                                                        │
│                                   51       111      1                                                                                                                                                                                        │
│                                   51       112      1                                                                                                                                                                                        │
│                                   51       113      1                                                                                                                                                                                        │
│                                   51       114      2                                                                                                                                                                                        │
│                                   51       115      2                                                                                                                                                                                        │
│                                   51       116      1                                                                                                                                                                                        │
│                                   51       118      3                                                                                                                                                                                        │
│                                   51       119      2                                                                                                                                                                                        │
│                                   51       120      1                                                                                                                                                                                        │
│                                   51       121      1                                                                                                                                                                                        │
│                                   63       102      1                                                                                                                                                                                        │
│                                   147      27       11                                                                                                                                                                                       │
│                                   147      241      1                                                                                                                                                                                        │
│                                                                                                                                                                                                                                              │
│                                 Table: nodes                                                                                                                                                                                                 │
│                                                                                                                                                                                                                                              │
│                                   id    label                                              JournalType                                       City        CountryNetworkTime        PresentDayCountry   Latitude    Longitude    Language     │
│                                  ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────    │
│                                   75    Psychiatrische en neurologische bladen             specialized: psychiatry and neurology             Amsterdam   Netherlands               Netherlands         52.366667   4.9          Dutch        │
│                                   36    The American Journal of Insanity                   specialized: psychiatry and neurology             Baltimore   United States             United States       39.289444   -76.615278   English      │
│                                   208   The American Journal of Psychology                 specialized: psychology                           Baltimore   United States             United States       39.289444   -76.615278   English      │
│                                   295   Die Krankenpflege                                  specialized: therapy                              Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   296   Die deutsche Klinik am Eingange des zwanzigsten    general medicine                                  Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   300   Therapeutische Monatshefte                         specialized: therapy                              Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   1     Allgemeine Zeitschrift für Psychiatrie             specialized: psychiatry and neurology             Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   7     Archiv für Psychiatrie und Nervenkrankheiten       specialized: psychiatry and neurology             Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   10    Berliner klinische Wochenschrift                   general medicine                                  Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   13    Charité Annalen                                    general medicine                                  Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   21    Monatsschrift für Psychiatrie und Neurologie       specialized: psychiatry and neurology             Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   29    Virchows Archiv                                    specialized: anatomy, physiology and pathology    Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   31    Zeitschrift für pädagogische Psychologie und Pat   specialized: psychology and pedagogy              Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   42    Vierteljahrsschrift für gerichtliche Medizin und   specialized: anthropology, criminology and fore   Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   47    Centralblatt für Nervenheilkunde und Psychiatrie   specialized: psychiatry and neurology             Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   50    Russische medicinische Rundschau                   general medicine                                  Berlin      German Empire             Germany             52.52       13.405       German       │
│                                   ...   ...                                                ...                                               ...         ...                       ...                 ...         ...          ...          │
│                                   ...   ...                                                ...                                               ...         ...                       ...                 ...         ...          ...          │
│                                   277   L'arte medica                                      general medicine                                  Turin       Italy                     Italy               45.079167   7.676111     Italian      │
│                                   288   Allgemeine österreichische Gerichts-Zeitung        specialized: anthropology, criminology and fore   Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   18    Jahrbücher für Psychiatrie                         specialized: psychiatry and neurology             Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   30    Wiener klinische Rundschau                         general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   44    Wiener klinische Wochenschrift                     general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   45    Wiener medizinische Wochenschrift                  general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   72    Wiener medizinische Presse                         general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   81    Monatsschrift für Gesundheitspflege                general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   93    Klinisch-therapeutische Wochenschrift              general medicine                                  Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   151   Medicinisch-chirurgisches Centralblatt             specialized: surgery                              Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   199   Der Militärazt                                     specialized: military medicine                    Vienna      Austro-Hungarian Empire   Austria             48.2        16.366667    German       │
│                                   261   Медицинская беседа                                 general medicine                                  Voronezh    Russian Empire            Russia              51.671667   39.210556    Russian      │
│                                   77    Medycyna                                           general medicine                                  Warsaw      Russian Empire            Poland              52.233333   21.016667    Polish       │
│                                   150   Kronika Lekarska                                   general medicine                                  Warsaw      Russian Empire            Poland              52.233333   21.016667    Polish       │
│                                   86    Grenzfragen des Nerven- und Seelenlebens           specialized: psychiatry and neurology             Wiesbaden   German Empire             Germany             50.0825     8.24         German       │
│                                   206   Ergebnisse der Allgemeinen Pathologie und Pathol   specialized: anatomy, physiology and pathology    Wiesbaden   German Empire             Germany             50.0825     8.24         German       │
│                                                                                                                                                                                                                                              │
│                                                                                                                                                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭─ Stored result value ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                                                                                                              │
│   field          data type      stored id                              alias(es)                                                                                                                                                             │
│  ─────────────────────────────────────────────────────────────────────────────────────                                                                                                                                                       │
│   network_data   network_data   aea3b645-09a7-4c80-b782-608c03d188d5   journals_graph                                                                                                                                                        │
│                                                                                                                                                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
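
The network_data value just stored bundles two tables, edges and nodes, and the data store reports per-table details such as column names and row counts for them. To make that layout concrete, here is a minimal sketch using only Python's standard library. It assumes a SQLite-style two-table layout; the helper name table_summary and the three placeholder journals are hypothetical and not part of kiara or the example data.

```python
import sqlite3

# Assumption: a graph stored as two plain tables, "nodes" and "edges".
# This illustrates the layout only; it is not kiara's implementation.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE nodes (id INTEGER NOT NULL, label TEXT NOT NULL);
    CREATE TABLE edges (
        source INTEGER NOT NULL,
        target INTEGER NOT NULL,
        weight INTEGER NOT NULL
    );
    """
)

# Hypothetical placeholder rows, not taken from the tutorial data.
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [(1, "Journal A"), (2, "Journal B"), (3, "Journal C")],
)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [(1, 2, 5), (2, 3, 1)],
)


def table_summary(connection, table):
    """Return the column names and the row count of one table."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]
    rows = connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return {"column_names": columns, "rows": rows}


summary = {name: table_summary(conn, name) for name in ("edges", "nodes")}
print(summary)
```

Each edge row refers to node ids through its source and target columns, which is why the two tables are enough to describe the whole graph.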

To confirm that our graph data was created, let's check the data store:

kiara data explain --properties alias:journals_graph
╭─ Value details for: alias:journals_graph ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                                                                                                              │
│   value_id            aea3b645-09a7-4c80-b782-608c03d188d5                                                                                                                                                                                   │
│   kiara_id            bc41cc78-899c-433b-8e8d-33d0c9990791                                                                                                                                                                                   │
│                                                                                                                                                                                                                                              │
│                       ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────   │
│   data_type_info                                                                                                                                                                                                                             │
│                         data_type_name     network_data                                                                                                                                                                                      │
│                         data_type_config   {}                                                                                                                                                                                                │
│                         characteristics    {                                                                                                                                                                                                 │
│                                              "is_scalar": false,                                                                                                                                                                             │
│                                              "is_json_serializable": false                                                                                                                                                                   │
│                                            }                                                                                                                                                                                                 │
│                         data_type_class                                                                                                                                                                                                      │
│                                              python_class_name    NetworkDataType                                                                                                                                                            │
│                                              python_module_name   kiara_plugin.network_analysis.data_types                                                                                                                                   │
│                                              full_name            kiara_plugin.network_analysis.data_types.NetworkDataType                                                                                                                   │
│                                                                                                                                                                                                                                              │
│                                                                                                                                                                                                                                              │
│   destiny_backlinks   {}                                                                                                                                                                                                                     │
│   enviroments         None                                                                                                                                                                                                                   │
│   properties                                                                                                                                                                                                                                 │
│                         field                       value                                                                                                                                                                                    │
│                        ─────────────────────────────────────────────────────────────────────────────────────────────────                                                                                                                     │
│                         metadata.database           {                                                                                                                                                                                        │
│                                                       "tables": {                                                                                                                                                                            │
│                                                         "edges": {                                                                                                                                                                           │
│                                                           "column_names": [                                                                                                                                                                  │
│                                                             "source",                                                                                                                                                                        │
│                                                             "target",                                                                                                                                                                        │
│                                                             "weight"                                                                                                                                                                         │
│                                                           ],                                                                                                                                                                                 │
│                                                           "column_schema": {                                                                                                                                                                 │
│                                                             "source": {                                                                                                                                                                      │
│                                                               "type_name": "INTEGER",                                                                                                                                                        │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "target": {                                                                                                                                                                      │
│                                                               "type_name": "INTEGER",                                                                                                                                                        │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "weight": {                                                                                                                                                                      │
│                                                               "type_name": "INTEGER",                                                                                                                                                        │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             }                                                                                                                                                                                │
│                                                           },                                                                                                                                                                                 │
│                                                           "rows": 321,                                                                                                                                                                       │
│                                                           "size": 4096                                                                                                                                                                       │
│                                                         },                                                                                                                                                                                   │
│                                                         "nodes": {                                                                                                                                                                           │
│                                                           "column_names": [                                                                                                                                                                  │
│                                                             "id",                                                                                                                                                                            │
│                                                             "label",                                                                                                                                                                         │
│                                                             "JournalType",                                                                                                                                                                   │
│                                                             "City",                                                                                                                                                                          │
│                                                             "CountryNetworkTime",                                                                                                                                                            │
│                                                             "PresentDayCountry",                                                                                                                                                             │
│                                                             "Latitude",                                                                                                                                                                      │
│                                                             "Longitude",                                                                                                                                                                     │
│                                                             "Language"                                                                                                                                                                       │
│                                                           ],                                                                                                                                                                                 │
│                                                           "column_schema": {                                                                                                                                                                 │
│                                                             "id": {                                                                                                                                                                          │
│                                                               "type_name": "INTEGER",                                                                                                                                                        │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "label": {                                                                                                                                                                       │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "JournalType": {                                                                                                                                                                 │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "City": {                                                                                                                                                                        │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "CountryNetworkTime": {                                                                                                                                                          │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "PresentDayCountry": {                                                                                                                                                           │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "Latitude": {                                                                                                                                                                    │
│                                                               "type_name": "REAL",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "Longitude": {                                                                                                                                                                   │
│                                                               "type_name": "REAL",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "Language": {                                                                                                                                                                    │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             }                                                                                                                                                                                │
│                                                           },                                                                                                                                                                                 │
│                                                           "rows": 276,                                                                                                                                                                       │
│                                                           "size": 40960                                                                                                                                                                      │
│                                                         }                                                                                                                                                                                    │
│                                                       }                                                                                                                                                                                      │
│                                                     }                                                                                                                                                                                        │
│                         metadata.graph_properties   {                                                                                                                                                                                        │
│                                                       "number_of_nodes": 276,                                                                                                                                                                │
│                                                       "properties_by_graph_type": [                                                                                                                                                          │
│                                                         {                                                                                                                                                                                    │
│                                                           "graph_type": "directed",                                                                                                                                                          │
│                                                           "number_of_edges": 321                                                                                                                                                             │
│                                                         },                                                                                                                                                                                   │
│                                                         {                                                                                                                                                                                    │
│                                                           "graph_type": "undirected",                                                                                                                                                        │
│                                                           "number_of_edges": 313                                                                                                                                                             │
│                                                         },                                                                                                                                                                                   │
│                                                         {                                                                                                                                                                                    │
│                                                           "graph_type": "directed-multi",                                                                                                                                                    │
│                                                           "number_of_edges": 321                                                                                                                                                             │
│                                                         },                                                                                                                                                                                   │
│                                                         {                                                                                                                                                                                    │
│                                                           "graph_type": "undirected-multi",                                                                                                                                                  │
│                                                           "number_of_edges": 321                                                                                                                                                             │
│                                                         }                                                                                                                                                                                    │
│                                                       ]                                                                                                                                                                                      │
│                                                     }                                                                                                                                                                                        │
│                         metadata.python_class       {                                                                                                                                                                                        │
│                                                       "python_class": {                                                                                                                                                                      │
│                                                         "python_class_name": "NetworkData",                                                                                                                                                  │
│                                                         "python_module_name": "kiara_plugin.network_analysis.models",                                                                                                                        │
│                                                         "full_name": "kiara_plugin.network_analysis.models.NetworkData"                                                                                                                      │
│                                                       }                                                                                                                                                                                      │
│                                                     }                                                                                                                                                                                        │
│                                                                                                                                                                                                                                              │
│   property_links      {                                                                                                                                                                                                                      │
│                         "metadata.database": "3c33c6b7-e152-4e52-b7bd-dfc28efb7045",                                                                                                                                                         │
│                         "metadata.graph_properties": "b8d152d7-0381-4ced-a104-900ceeb1e1d3",                                                                                                                                                 │
│                         "metadata.python_class": "3574d1d5-d9ed-435b-8b7a-395a43a4d4a1"                                                                                                                                                      │
│                       }                                                                                                                                                                                                                      │
│   value_hash          zdpuB17oZEahwMpecZvwQWEGDB17D9ppcHWaUQ6pWNLWsWKNX                                                                                                                                                                      │
│   value_schema                                                                                                                                                                                                                               │
│                         type          network_data                                                                                                                                                                                           │
│                         type_config   {}                                                                                                                                                                                                     │
│                         default       __not_set__                                                                                                                                                                                            │
│                         optional      False                                                                                                                                                                                                  │
│                         is_constant   False                                                                                                                                                                                                  │
│                         doc           The network/graph data.                                                                                                                                                                                │
│                                                                                                                                                                                                                                              │
│   value_size          61.44 KB                                                                                                                                                                                                               │
│   value_status        -- set --                                                                                                                                                                                                              │
│                                                                                                                                                                                                                                              │
│                       ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────   │
│                                                                                                                                                                                                                                              │
│   properties                                                                                                                                                                                                                                 │
│                         metadata.database           {                                                                                                                                                                                        │
│                                                       "tables": {                                                                                                                                                                            │
│                                                         "edges": {                                                                                                                                                                           │
│                                                           "column_names": [                                                                                                                                                                  │
│                                                             "source",                                                                                                                                                                        │
│                                                             "target",                                                                                                                                                                        │
│                                                             "weight"                                                                                                                                                                         │
│                                                           ],                                                                                                                                                                                 │
│                                                           "column_schema": {                                                                                                                                                                 │
│                                                             "source": {                                                                                                                                                                      │
│                                                               "type_name": "INTEGER",                                                                                                                                                        │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "target": {                                                                                                                                                                      │
│                                                               "type_name": "INTEGER",                                                                                                                                                        │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "weight": {                                                                                                                                                                      │
│                                                               "type_name": "INTEGER",                                                                                                                                                        │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             }                                                                                                                                                                                │
│                                                           },                                                                                                                                                                                 │
│                                                           "rows": 321,                                                                                                                                                                       │
│                                                           "size": 4096                                                                                                                                                                       │
│                                                         },                                                                                                                                                                                   │
│                                                         "nodes": {                                                                                                                                                                           │
│                                                           "column_names": [                                                                                                                                                                  │
│                                                             "id",                                                                                                                                                                            │
│                                                             "label",                                                                                                                                                                         │
│                                                             "JournalType",                                                                                                                                                                   │
│                                                             "City",                                                                                                                                                                          │
│                                                             "CountryNetworkTime",                                                                                                                                                            │
│                                                             "PresentDayCountry",                                                                                                                                                             │
│                                                             "Latitude",                                                                                                                                                                      │
│                                                             "Longitude",                                                                                                                                                                     │
│                                                             "Language"                                                                                                                                                                       │
│                                                           ],                                                                                                                                                                                 │
│                                                           "column_schema": {                                                                                                                                                                 │
│                                                             "id": {                                                                                                                                                                          │
│                                                               "type_name": "INTEGER",                                                                                                                                                        │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "label": {                                                                                                                                                                       │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "JournalType": {                                                                                                                                                                 │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "City": {                                                                                                                                                                        │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "CountryNetworkTime": {                                                                                                                                                          │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "PresentDayCountry": {                                                                                                                                                           │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "Latitude": {                                                                                                                                                                    │
│                                                               "type_name": "REAL",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "Longitude": {                                                                                                                                                                   │
│                                                               "type_name": "REAL",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             },                                                                                                                                                                               │
│                                                             "Language": {                                                                                                                                                                    │
│                                                               "type_name": "TEXT",                                                                                                                                                           │
│                                                               "metadata": {                                                                                                                                                                  │
│                                                                 "nullable": false,                                                                                                                                                           │
│                                                                 "primary_key": false                                                                                                                                                         │
│                                                               }                                                                                                                                                                              │
│                                                             }                                                                                                                                                                                │
│                                                           },                                                                                                                                                                                 │
│                                                           "rows": 276,                                                                                                                                                                       │
│                                                           "size": 40960                                                                                                                                                                      │
│                                                         }                                                                                                                                                                                    │
│                                                       }                                                                                                                                                                                      │
│                                                     }                                                                                                                                                                                        │
│                         metadata.graph_properties   {                                                                                                                                                                                        │
│                                                       "number_of_nodes": 276,                                                                                                                                                                │
│                                                       "properties_by_graph_type": [                                                                                                                                                          │
│                                                         {                                                                                                                                                                                    │
│                                                           "graph_type": "directed",                                                                                                                                                          │
│                                                           "number_of_edges": 321                                                                                                                                                             │
│                                                         },                                                                                                                                                                                   │
│                                                         {                                                                                                                                                                                    │
│                                                           "graph_type": "undirected",                                                                                                                                                        │
│                                                           "number_of_edges": 313                                                                                                                                                             │
│                                                         },                                                                                                                                                                                   │
│                                                         {                                                                                                                                                                                    │
│                                                           "graph_type": "directed-multi",                                                                                                                                                    │
│                                                           "number_of_edges": 321                                                                                                                                                             │
│                                                         },                                                                                                                                                                                   │
│                                                         {                                                                                                                                                                                    │
│                                                           "graph_type": "undirected-multi",                                                                                                                                                  │
│                                                           "number_of_edges": 321                                                                                                                                                             │
│                                                         }                                                                                                                                                                                    │
│                                                       ]                                                                                                                                                                                      │
│                                                     }                                                                                                                                                                                        │
│                         metadata.python_class       {                                                                                                                                                                                        │
│                                                       "python_class": {                                                                                                                                                                      │
│                                                         "python_class_name": "NetworkData",                                                                                                                                                  │
│                                                         "python_module_name": "kiara_plugin.network_analysis.models",                                                                                                                        │
│                                                         "full_name": "kiara_plugin.network_analysis.models.NetworkData"                                                                                                                      │
│                                                       }                                                                                                                                                                                      │
│                                                     }                                                                                                                                                                                        │
│                                                                                                                                                                                                                                              │
│                                                                                                                                                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

All good. Also, note the metadata kiara already knows about the graph, such as the number of nodes and the number of edges for each graph type.
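The edge counts in metadata.graph_properties differ by graph type: 321 for the directed and both multi-graph variants, 313 for the undirected one. The sketch below shows one way such counts can arise. It assumes the usual graph semantics, where a simple graph merges parallel edges and an undirected graph ignores direction. It is not kiara's actual implementation, and the toy edge list and the edge_counts helper are made up for illustration.

```python
# Toy edge list (hypothetical data, not the journals dataset).
edges = [
    ("a", "b"),
    ("b", "a"),  # reciprocal of the first edge
    ("a", "c"),
    ("a", "c"),  # parallel duplicate of the previous edge
    ("c", "d"),
]


def edge_counts(edges):
    """Count edges under four graph interpretations (assumed semantics)."""
    # Simple directed graph: direction matters, parallel duplicates are merged.
    directed = {(source, target) for source, target in edges}
    # Simple undirected graph: direction is ignored, duplicates are merged.
    undirected = {frozenset((source, target)) for source, target in edges}
    return {
        "directed": len(directed),
        "undirected": len(undirected),
        # Multi-graphs keep every row of the edge list as its own edge.
        "directed-multi": len(edges),
        "undirected-multi": len(edges),
    }


counts = edge_counts(edges)
print(counts)
```

Under this reading, the gap of 8 between the directed and undirected counts in the output above would correspond to eight pairs of nodes that are connected in both directions.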

Side-note: investigating the graph value lineage

kiara keeps track of all the modules and inputs that went into producing a value, in effect its entire ancestry. Why this is useful, and how powerful it can be, is beyond the scope of this section. But if you ever want to know what went into creating a particular value, you can find out with:

kiara data explain --lineage alias:journals_graph
╭─ Value details for: alias:journals_graph ────────────────────────────────────╮
│                                                                              │
│   value_id            aea3b645-09a7-4c80-b782-608c03d188d5                   │
│   kiara_id            bc41cc78-899c-433b-8e8d-33d0c9990791                   │
│                                                                              │
│                       ────────────────────────────────────────────────────   │
│   data_type_info                                                             │
│                         data_type_name     network_data                      │
│                         data_type_config   {}                                │
│                         characteristics    {                                 │
│                                              "is_scalar": false,             │
│                                              "is_json_serializable":         │
│                                            false                             │
│                                            }                                 │
│                         data_type_class                                      │
│                                              python_cla…   NetworkDat…       │
│                                              python_mod…   kiara_plug…       │
│                                              full_name     kiara_plug…       │
│                                                                              │
│                                                                              │
│   destiny_backlinks   {}                                                     │
│   enviroments         None                                                   │
│   property_links      {                                                      │
│                         "metadata.database":                                 │
│                       "3c33c6b7-e152-4e52-b7bd-dfc28efb7045",                │
│                         "metadata.graph_properties":                         │
│                       "b8d152d7-0381-4ced-a104-900ceeb1e1d3",                │
│                         "metadata.python_class":                             │
│                       "3574d1d5-d9ed-435b-8b7a-395a43a4d4a1"                 │
│                       }                                                      │
│   value_hash          zdpuB17oZEahwMpecZvwQWEGDB17D9ppcHWaUQ6pWNLWsWKNX      │
│   value_schema                                                               │
│                         type          network_data                           │
│                         type_config   {}                                     │
│                         default       __not_set__                            │
│                         optional      False                                  │
│                         is_constant   False                                  │
│                         doc           The network/graph data.                │
│                                                                              │
│   value_size          61.44 KB                                               │
│   value_status        -- set --                                              │
│                                                                              │
│                       ────────────────────────────────────────────────────   │
│                                                                              │
│   lineage             create.network_data.from.tables                        │
│                       ├── input: edges (table) =                             │
│                       │   0aea06f2-f729-4ec5-b4dc-707606dd7269               │
│                       │   └── create.table                                   │
│                       │       └── input: file (file) =                       │
│                       │           d524e19e-be9b-4f56-bcd1-103a3cc13f9f       │
│                       │           └── import.local.file                      │
│                       │               └── input: path (string) =             │
│                       │                   837a0090-f20b-436d-8125-a3076df…   │
│                       ├── input: edges_column_map (dict) =                   │
│                       │   1142536c-72ee-4e00-accf-55b3448ab1ae               │
│                       ├── input: id_column_name (string) =                   │
│                       │   48227a2a-a772-4653-8bed-92621f06fa6d               │
│                       ├── input: label_column_name (string) =                │
│                       │   bb52e30e-0b80-479f-8233-25cc069f1c5e               │
│                       ├── input: nodes (table) =                             │
│                       │   8ef2a5ea-031a-4370-93ac-c8d42dc1ea3b               │
│                       │   └── create.table                                   │
│                       │       └── input: file (file) =                       │
│                       │           8bb90738-ab11-4cfb-8ada-f43549cd6d20       │
│                       │           └── import.local.file                      │
│                       │               └── input: path (string) =             │
│                       │                   89ea8b61-7e84-42db-ac7f-0d3751c…   │
│                       ├── input: nodes_column_map (dict) =                   │
│                       │   bcc75a9e-e20d-4758-a48a-81637975a8cc               │
│                       ├── input: source_column_name (string) =               │
│                       │   e30154e1-22eb-4efd-8969-a42113770a7d               │
│                       └── input: target_column_name (string) =               │
│                           aa33a9c1-d880-4054-a4a6-6f74bf4c8172               │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

As you can see, the lineage describes the steps we have taken so far to get to this stage. If you were so inclined, you could now run kiara data explain value:<value_id> on each of the value ids you see here.
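To illustrate the shape of such a lineage, here is a minimal sketch that models a small part of the tree as nested dictionaries and walks it depth-first to collect the value ids. The dictionary layout, the placeholder ids, and the iter_value_ids helper are assumptions made for this example; they do not reflect kiara's internal data model or its Python API.

```python
from typing import Iterator

# Hypothetical, hand-written stand-in for a lineage tree.
# The value ids are placeholders, not the ids from the output above.
lineage = {
    "module": "create.network_data.from.tables",
    "inputs": {
        "edges": {
            "value_id": "edges-table-id",
            "lineage": {
                "module": "create.table",
                "inputs": {
                    "file": {
                        "value_id": "edges-file-id",
                        "lineage": {
                            "module": "import.local.file",
                            "inputs": {"path": {"value_id": "edges-path-id"}},
                        },
                    }
                },
            },
        },
        # Inputs without their own lineage are leaves of the tree.
        "source_column_name": {"value_id": "source-column-id"},
    },
}


def iter_value_ids(node: dict) -> Iterator[str]:
    """Yield every input value id in the tree, depth-first."""
    for value in node["inputs"].values():
        yield value["value_id"]
        if "lineage" in value:
            yield from iter_value_ids(value["lineage"])


value_ids = list(iter_value_ids(lineage))
for value_id in value_ids:
    # Build the command line one would run for each id.
    print(f"kiara data explain value:{value_id}")
```

Each module in the tree consumes values that may themselves have been produced by other modules, which is why the walk is recursive.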

More

... to come ...