
array

array.map

                                                                                
 Documentation                                                                  
                          Map a list of values into another list of values.     
                                                                                
                          This module must be configured with the type (and,
                          optionally, the configuration) of another kiara
                          module. This 'child' module is then used to compute
                          the array items of the result.
                                                                                
 Origin                                                                         
                          Authors   Markus Binsteiner (markus@frkl.io)          
                                                                                
 Context                                                                        
                          Tags         array, core                              
                          Labels       package: kiara_modules.core              
                          References   source_repo:                             
                                       https://github.com/DHARPA-Project/kia…   
                                       documentation:                           
                                       https://dharpa.org/kiara_modules.core/   
                                       module_doc:                              
                                       https://dharpa.org/kiara_modules.core…   
                                       source_url:                              
                                       https://github.com/DHARPA-Project/kia…   
                                                                                
 Module config                                                                  
                          Field           Type     Description       Required   
                         ─────────────────────────────────────────────────────  
                          constants       object   Value constants   no         
                                                   for this                     
                                                   module.                      
                          defaults        object   Value defaults    no         
                                                   for this                     
                                                   module.                      
                          module_type     string   The name of the   yes
                                                   kiara module to
                                                   apply to each
                                                   item of the
                                                   input array.
                          module_config   object   The config for    no
                                                   the kiara child
                                                   module.
                          input_name      string   The name of the   no
                                                   child module
                                                   input that
                                                   receives the
                                                   items from the
                                                   input array.
                                                   Can be omitted
                                                   if the
                                                   configured
                                                   module only has
                                                   a single input.
                          output_name     string   The name of the   no
                                                   child module
                                                   output that
                                                   provides the
                                                   items of the
                                                   result array.
                                                   Can be omitted
                                                   if the
                                                   configured
                                                   module only has
                                                   a single
                                                   output.
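
To illustrate how the fields in the table above fit together, here is a minimal sketch of a configuration for `array.map`, written as a Python dictionary together with a small checker. The child module type `example.child_module`, the names `item` and `result`, and the helper `check_map_config` are hypothetical placeholders and not part of kiara; only the field names and their required/optional status are taken from the table.

```python
# Field names and required status are taken from the 'Module config' table.
ALLOWED_FIELDS = {
    "constants",
    "defaults",
    "module_type",
    "module_config",
    "input_name",
    "output_name",
}
REQUIRED_FIELDS = {"module_type"}


def check_map_config(config: dict) -> list:
    """Return a list of problems found in the config (empty if none).

    Hypothetical helper for illustration only; kiara performs its own
    validation of module configuration.
    """
    problems = []
    for field in sorted(REQUIRED_FIELDS - set(config)):
        problems.append(f"missing required field: {field}")
    for field in sorted(set(config) - ALLOWED_FIELDS):
        problems.append(f"unknown field: {field}")
    return problems


# Hypothetical example values: the module type and the input/output names
# depend on which child module is actually used.
map_config = {
    "module_type": "example.child_module",
    "module_config": {},
    "input_name": "item",
    "output_name": "result",
}

print(check_map_config(map_config))
```

If the child module has only one input and one output, `input_name` and `output_name` could be left out of the dictionary, as the table notes.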
                                                                                
 Python class                                                                   
                          class_name    MapModule                               
                          module_name   kiara_modules.core.array                
                          full_name     kiara_modules.core.array.MapModule      
                                                                                
 Processing source code  ─────────────────────────────────────────────────────  
                          def process(self, inputs: ValueSet, outputs: Value…   
                                                                                
                              import pyarrow as pa                              
                                                                                
                              input_array: pa.Array = inputs.get_value_data(   
                                                                                
                              init_data: typing.Dict[str, typing.Any] = {}      
                              for input_name in self.input_schemas.keys():      
                                  if input_name in ["array", self.module_inp…   
                                      continue                                  
                                                                                
                                  init_data[input_name] = inputs.get_value_o…   
                                                                                
                              result_list = map_with_module(                    
                                  input_array,                                  
                                  module_input_name=self.module_input_name,     
                                  module_obj=self.child_module,                 
                                  init_data=init_data,                          
                                  module_output_name=self.module_output_name,   
                              )                                                 
                              outputs.set_value("array", pa.array(result_lis…   
                                                                                
                         ─────────────────────────────────────────────────────  
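
The mapping step in the source above can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation of `map_with_module`: the child module is replaced by an ordinary callable that returns a dictionary of outputs, and the names `map_with_callable` and `scale` are hypothetical. It shows the same idea of passing each array item to one named input, supplying the remaining inputs unchanged on every call, and collecting one named output.

```python
from typing import Any, Callable, Dict, Iterable, List


def map_with_callable(
    items: Iterable[Any],
    child: Callable[..., Dict[str, Any]],
    input_name: str,
    output_name: str,
    init_data: Dict[str, Any],
) -> List[Any]:
    """Apply 'child' to every item and collect one output per item.

    Hypothetical stand-in for the mapping helper used by the module.
    """
    results = []
    for item in items:
        # Start from the inputs shared by every call, then add the
        # current item under the configured input name.
        child_inputs = dict(init_data)
        child_inputs[input_name] = item
        child_outputs = child(**child_inputs)
        results.append(child_outputs[output_name])
    return results


def scale(value: float, factor: float) -> Dict[str, Any]:
    """Toy 'child module' with two inputs and one output."""
    return {"scaled": value * factor}


result = map_with_callable(
    [1, 2, 3],
    scale,
    input_name="value",
    output_name="scaled",
    init_data={"factor": 10},
)
print(result)
```

In the module itself the result list is converted back into an Arrow array before being set as the `array` output; that step is omitted here to keep the sketch free of dependencies.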
                                                                                

array.metadata

                                                                                
 Documentation                                                                  
                          Extract metadata from an 'array' value.               
                                                                                
 Origin                                                                         
                          Authors   Markus Binsteiner (markus@frkl.io)          
                                                                                
 Context                                                                        
                          Tags         array, core                              
                          Labels       package: kiara_modules.core              
                          References   source_repo:                             
                                       https://github.com/DHARPA-Project/kia…   
                                       documentation:                           
                                       https://dharpa.org/kiara_modules.core/   
                                       module_doc:                              
                                       https://dharpa.org/kiara_modules.core…   
                                       source_url:                              
                                       https://github.com/DHARPA-Project/kia…   
                                                                                
 Module config                                                                  
                          Field        Type     Description          Required   
                         ─────────────────────────────────────────────────────  
                          constants    object   Value constants      no         
                                                for this module.                
                          defaults     object   Value defaults for   no         
                                                this module.                    
                          value_type   string   The data type this   yes        
                                                module will be                  
                                                used for.                       
                                                                                
 Python class                                                                   
                          class_name    ArrayMetadataModule                     
                          module_name   kiara_modules.core.array                
                          full_name     kiara_modules.core.array.ArrayMetada…   
                                                                                
 Processing source code  ─────────────────────────────────────────────────────  
                          def process(self, inputs: ValueSet, outputs: Value…   
                                                                                
                              input_name = self.value_type                      
                              if input_name == "any":                           
                                  input_name = "value_item"                     
                                                                                
                              value = inputs.get_value_obj(input_name)          
                              if self.value_type != "any" and value.type_nam…   
                                  raise KiaraProcessingException(               
                                      f"Can't extract metadata for value of …   
                                  )                                             
                                                                                
                              # TODO: if type 'any', validate that the data …   
                                                                                
                              outputs.set_value("metadata_item_schema", self   
                              metadata = self.extract_metadata(value)           
                              if isinstance(metadata, BaseModel):               
                                  metadata = metadata.dict(exclude_none=True)   
                                                                                
                              # TODO: validate metadata?                        
                              outputs.set_value("metadata_item", metadata)      
                                                                                
                         ─────────────────────────────────────────────────────  
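
The last part of the source above converts a model object into a plain dictionary without `None` entries before setting it as output. The sketch below illustrates that normalization step only. It deliberately swaps in a standard-library dataclass for the pydantic `BaseModel` used by the module, so that it runs without extra dependencies; the class `ArrayMetadataSketch`, its fields, and the function `normalize_metadata` are hypothetical. Unlike pydantic's `exclude_none`, this sketch only filters top-level fields.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any, Dict, Optional


@dataclass
class ArrayMetadataSketch:
    """Hypothetical metadata model; the field names are illustrative."""

    length: int
    size_in_bytes: Optional[int] = None


def normalize_metadata(metadata: Any) -> Dict[str, Any]:
    """Turn a model instance into a dict without None-valued fields.

    Plain mappings are passed through as dictionaries, mirroring the
    way the module leaves non-model metadata untouched.
    """
    if is_dataclass(metadata) and not isinstance(metadata, type):
        return {k: v for k, v in asdict(metadata).items() if v is not None}
    return dict(metadata)


print(normalize_metadata(ArrayMetadataSketch(length=3)))
```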
                                                                                

array.sample

                                                                                
 Documentation                                                                  
                          Sample an array.                                      
                                                                                
                          Sampling randomly selects a subset of a dataset,
                          which makes it possible to test queries and
                          workflows on a smaller version of the original
                          data and to adjust parameters before a full run.
                                                                                
 Origin                                                                         
                          Authors   Markus Binsteiner (markus@frkl.io)          
                                                                                
 Context                                                                        
                          Tags         array, core                              
                          Labels       package: kiara_modules.core              
                          References   source_repo:                             
                                       https://github.com/DHARPA-Project/kia…   
                                       documentation:                           
                                       https://dharpa.org/kiara_modules.core/   
                                       module_doc:                              
                                       https://dharpa.org/kiara_modules.core…   
                                       source_url:                              
                                       https://github.com/DHARPA-Project/kia…   
                                                                                
 Module config                                                                  
                          Field         Type     Description         Required   
                         ─────────────────────────────────────────────────────  
                          constants     object   Value constants     no         
                                                 for this module.               
                          defaults      object   Value defaults      no         
                                                 for this module.               
                          sample_type   string   The sample          yes        
                                                 method.                        
                                                                                
 Python class                                                                   
                          class_name    SampleArrayModule                       
                          module_name   kiara_modules.core.array                
                          full_name     kiara_modules.core.array.SampleArray…   
                                                                                
 Processing source code  ─────────────────────────────────────────────────────  
                          def process(self, inputs: ValueSet, outputs: Value…   
                                                                                
                              sample_size: int = inputs.get_value_data("samp…   
                              sample_type: str = self.get_config_value("samp…   
                                                                                
                              if sample_size < 0:                               
                                  raise KiaraProcessingException(               
                                      f"Invalid sample size '{sample_size}':…   
                                  )                                             
                                                                                
                              input_name = self.get_value_type()                
                              if input_name == "any":                           
                                  input_name = "value_item"                     
                              value: Value = inputs.get_value_obj(input_name)   
                                                                                
                              func = getattr(self, f"sample_{sample_type}")     
                              result = func(value=value, sample_size=sample_…   
                                                                                
                              outputs.set_value("sampled_value", result)        
                                                                                
                         ─────────────────────────────────────────────────────  
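
The source above looks up a method named `sample_<sample_type>` and calls it with the value and the sample size. A minimal sketch of that dispatch pattern on plain Python lists follows. The class `ListSampler` and the sample types `first` and `random` are assumptions made for illustration; the sample types actually supported by the module are not listed in this page. `ValueError` stands in for `KiaraProcessingException`.

```python
import random
from typing import Any, List, Sequence


class ListSampler:
    """Hypothetical sampler that dispatches on the sample type name."""

    def sample_first(self, value: Sequence[Any], sample_size: int) -> List[Any]:
        # Take the first 'sample_size' items.
        return list(value[:sample_size])

    def sample_random(self, value: Sequence[Any], sample_size: int) -> List[Any]:
        # Draw without replacement; never ask for more items than exist.
        k = min(sample_size, len(value))
        return random.sample(list(value), k)

    def sample(
        self, value: Sequence[Any], sample_size: int, sample_type: str
    ) -> List[Any]:
        if sample_size < 0:
            raise ValueError(f"Invalid sample size '{sample_size}'")
        # Look up the method by name, as the module source does.
        func = getattr(self, f"sample_{sample_type}", None)
        if func is None:
            raise ValueError(f"Unknown sample type '{sample_type}'")
        return func(value=value, sample_size=sample_size)


sampler = ListSampler()
print(sampler.sample([1, 2, 3, 4], sample_size=2, sample_type="first"))
```

One consequence of this design is that supporting a new sample method only requires adding a method with the matching name, without touching the dispatch code.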
                                                                                

array.store

                                                                                
 Documentation                                                                  
                          Save an Arrow array to a file.                        
                                                                                
                          This module wraps the input array into an Arrow       
                          Table, and saves this table as a feather file.        
                                                                                
                          The output of this module is a dictionary
                          representing the configuration to be used with
                          kiara to re-assemble the array object from disk.
                                                                                
 Origin                                                                         
                          Authors   Markus Binsteiner (markus@frkl.io)          
                                                                                
 Context                                                                        
                          Tags         array, core                              
                          Labels       package: kiara_modules.core              
                          References   source_repo:                             
                                       https://github.com/DHARPA-Project/kia…   
                                       documentation:                           
                                       https://dharpa.org/kiara_modules.core/   
                                       module_doc:                              
                                       https://dharpa.org/kiara_modules.core…   
                                       source_url:                              
                                       https://github.com/DHARPA-Project/kia…   
                                                                                
 Module config                                                                  
                          Field        Type     Description          Required   
                         ─────────────────────────────────────────────────────  
                          constants    object   Value constants      no         
                                                for this module.                
                          defaults     object   Value defaults for   no         
                                                this module.                    
                          value_type   string   The type of the      yes        
                                                value to save.                  
                                                                                
 Python class                                                                   
                          class_name    StoreArrayTypeModule                    
                          module_name   kiara_modules.core.array                
                          full_name     kiara_modules.core.array.StoreArrayT…   
                                                                                
 Processing source code  ─────────────────────────────────────────────────────  
                          def process(self, inputs: ValueSet, outputs: Value…   
                                                                                
                              value_id: str = inputs.get_value_data("value_i…   
                              if not value_id:                                  
                                  raise KiaraProcessingException("No value i…   
                                                                                
                              field_name = self.get_config_value("value_type…   
                              if field_name == "any":                           
                                  field_name = "value_item"                     
                                                                                
                              value_obj: Value = inputs.get_value_obj(field_…   
                              base_path: str = inputs.get_value_data("base_p…   
                                                                                
                              result = self.store_value(value=value_obj, bas…   
                              if isinstance(result, typing.Mapping):            
                                  load_config = result                          
                                  result_value = value_obj                      
                              elif isinstance(result, tuple):                   
                                  load_config = result[0]                       
                                  if result[1]:                                 
                                      result_value = result[1]                  
                                  else:                                         
                                      result_value = value_obj                  
                              else:                                             
                                  raise KiaraProcessingException(               
                                      f"Invalid result type for 'store_value…   
                                  )                                             
                                                                                
                              load_config["value_id"] = value_id                
                                                                                
                              lc = LoadConfig(**load_config)                    
                                                                                
                              if lc.base_path_input_name and lc.base_path_in…   
                                  raise KiaraProcessingException(               
                                      f"Invalid load config: base path '{lc.   
                                  )                                             
                                                                                
                              outputs.set_values(                               
                                  metadata=None, lineage=None, **{"load_conf…   
                              )                                                 
                                                                                
                         ─────────────────────────────────────────────────────
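
The source above accepts two shapes of result from `store_value`: either a mapping holding the load configuration, or a tuple of load configuration and an optional replacement value. The following sketch isolates that branching. The function name `split_store_result` is hypothetical, and `TypeError` stands in for the `KiaraProcessingException` raised by the module.

```python
from collections.abc import Mapping
from typing import Any, Dict, Tuple


def split_store_result(result: Any, original_value: Any) -> Tuple[Dict[str, Any], Any]:
    """Return (load_config, value) for either accepted result shape."""
    if isinstance(result, Mapping):
        # Only a load config was returned: keep the original value.
        return dict(result), original_value
    if isinstance(result, tuple):
        load_config = dict(result[0])
        # A falsy second element means "no replacement value".
        value = result[1] if result[1] else original_value
        return load_config, value
    raise TypeError(f"Invalid result type: {type(result)}")


# Hypothetical load config content, for illustration only.
config, value = split_store_result({"module_type": "example.load"}, "original")
config["value_id"] = "some-id"  # the module adds the value id afterwards
print(config, value)
```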