Pipeline

class imagepypelines.Pipeline(blocks=[], name=None, skip_validation=False, track_types=True, debug=False)

Bases: object

Pipeline object to apply a sequence of algorithms to input data

Pipelines pass data between block objects and validate the integrity of a data processing pipeline. They are intended to be a quick, flexible, and modular way to create a processing graph. They also contain helper functions for documenting and saving pipelines for use by other researchers and users.

Parameters:
  • blocks (list) – list of blocks to instantiate this pipeline with; a shortcut for calling ‘add’ on each block. Defaults to []
  • name (str) – name for this pipeline; it is enumerated to make it unique. Defaults to Pipeline-<index>
name

unique name for this pipeline

Type:str
blocks

list of block objects being used by this pipeline, in order of their processing sequence

Type:list
verbose

whether or not this pipeline will print out its status

Type:bool
enable_text_graph

whether or not to print out a graph of pipeline blocks and outputs

Type:bool
printer

printer object for this pipeline, registered with ‘name’

Type:ip.Printer
uuid

universally unique hex id for this pipeline

Type:str

Attributes Summary

names Returns the names of all blocks
requires_labels Returns whether or not this pipeline requires labels
trained Returns whether or not this pipeline has been trained

Methods Summary

add(block) Adds processing block to the pipeline processing chain
clear() Clears all processing blocks from the pipeline processing chain
copy() Provides a deep copy of the pipeline processing chain
debug() Enables debug mode, which turns on all printouts for this pipeline to aid in debugging
graph() TODO: Placeholder function for @Ryan to create
insert(index, block) Inserts processing block into the pipeline processing chain
join(pipeline) Adds the blocks from an input pipeline to the current pipeline
predict_type_chain(data) Predict the types at each stage of the pipeline
process(data)
remove(block_name) Removes processing block from the pipeline processing chain
rename(name)
save([filename]) Pickles the entire pipeline and saves it to disk, so it can be used by others or at another time
train(data[, labels])
validate(data) Validates the integrity of the pipeline

Attributes Documentation

names

Returns the names of all blocks

requires_labels

Returns whether or not this pipeline requires labels

trained

Returns whether or not this pipeline has been trained

Methods Documentation

add(block)

Adds processing block to the pipeline processing chain

Parameters:block (ip.BaseBlock) – block object to add to this pipeline
Returns:None
Raise:
TypeError: if ‘block’ is not a subclass of BaseBlock
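The contract described for add() can be sketched with stand-in classes. `BaseBlock` and `MiniPipeline` below are hypothetical placeholders written for this illustration, not the imagepypelines implementation; the use of `isinstance` for the check is an assumption.

```python
class BaseBlock:
    """Hypothetical stand-in for ip.BaseBlock."""
    def __init__(self, name):
        self.name = name


class MiniPipeline:
    """Hypothetical stand-in that mirrors only the documented add() contract."""
    def __init__(self):
        self.blocks = []

    def add(self, block):
        # Assumption: the real check is an isinstance test against BaseBlock.
        if not isinstance(block, BaseBlock):
            raise TypeError("'block' must be a subclass of BaseBlock")
        self.blocks.append(block)


pipeline = MiniPipeline()
pipeline.add(BaseBlock("resize"))

# Anything that is not a block is rejected with TypeError.
try:
    pipeline.add("not a block")
    rejected = False
except TypeError:
    rejected = True
```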
clear()

Clears all processing blocks from the pipeline processing chain

Parameters:None
Returns:None
Raise:
None
copy()

Provides a deep copy of the pipeline processing chain

Parameters:None
Returns:a deep copy of the entire pipeline instance, ‘self’
Return type:Pipeline
Raise:
None
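What a deep copy implies in practice can be shown with a small sketch. `MiniPipeline` is a hypothetical stand-in (its blocks are plain dicts for brevity), and the use of `copy.deepcopy` is inferred from the description above rather than taken from the library source.

```python
import copy


class MiniPipeline:
    """Hypothetical stand-in; blocks are plain dicts for brevity."""
    def __init__(self, blocks=None):
        self.blocks = list(blocks) if blocks is not None else []

    def copy(self):
        # A deep copy duplicates the block objects too, not just the list.
        return copy.deepcopy(self)


original = MiniPipeline(blocks=[{"name": "resize"}])
duplicate = original.copy()

# Changes to the duplicate do not leak back into the original.
duplicate.blocks[0]["name"] = "renamed"
duplicate.blocks.append({"name": "extra"})
```

With a shallow copy, the rename on the duplicate's first block would also have been visible through `original`.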
debug()

Enables debug mode, which turns on all printouts for this pipeline to aid in debugging

graph()

TODO: Placeholder function for @Ryan to create

insert(index, block)

Inserts processing block into the pipeline processing chain

Parameters:
  • index (int) – index at which block object is to be inserted
  • block (ip.BaseBlock) – block object to add to this pipeline
Returns:

None

Raise:
TypeError: if ‘block’ is not a subclass of BaseBlock, or ‘index’ is not an instance of int
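Because blocks run in list order, the insertion index decides where the new block sits in the processing sequence. The sketch below uses a hypothetical helper `insert_block` and a stand-in `BaseBlock`; neither is part of imagepypelines, and the order of the two checks is an assumption.

```python
class BaseBlock:
    """Hypothetical stand-in for ip.BaseBlock."""
    def __init__(self, name):
        self.name = name


def insert_block(blocks, index, block):
    """Hypothetical helper mirroring the documented insert() checks."""
    if not isinstance(index, int):
        raise TypeError("'index' must be an int")
    if not isinstance(block, BaseBlock):
        raise TypeError("'block' must be a subclass of BaseBlock")
    blocks.insert(index, block)


blocks = [BaseBlock("load"), BaseBlock("classify")]
insert_block(blocks, 1, BaseBlock("resize"))  # lands between the two
order = [b.name for b in blocks]
```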
join(pipeline)

Adds the blocks from an input pipeline to the current pipeline

Parameters:pipeline (ip.Pipeline) – a valid pipeline object containing blocks
Returns:None
Raise:
None
predict_type_chain(data)

Predict the types at each stage of the pipeline

process(data)
remove(block_name)

Removes processing block from the pipeline processing chain

Parameters:block_name (str) – unique string name of block object to remove
Returns:None
Raise:

TypeError: if ‘block_name’ is not an instance of str

ValueError: if ‘block_name’ is not a member of the list self.names
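Removal by name, together with the two documented error cases, can be sketched as follows. `BaseBlock` and `MiniPipeline` are hypothetical stand-ins; how the real library locates and drops the block internally is not shown in this reference, so the list comprehension is an assumption.

```python
class BaseBlock:
    """Hypothetical stand-in for ip.BaseBlock."""
    def __init__(self, name):
        self.name = name


class MiniPipeline:
    """Hypothetical stand-in mirroring the documented remove() contract."""
    def __init__(self, blocks):
        self.blocks = list(blocks)

    @property
    def names(self):
        # Names of all blocks, in processing order.
        return [b.name for b in self.blocks]

    def remove(self, block_name):
        if not isinstance(block_name, str):
            raise TypeError("'block_name' must be a str")
        if block_name not in self.names:
            raise ValueError(f"no block named {block_name!r}")
        # Assumption: block names are unique, so this drops exactly one block.
        self.blocks = [b for b in self.blocks if b.name != block_name]


pipeline = MiniPipeline([BaseBlock("load"), BaseBlock("resize")])
pipeline.remove("load")
remaining = pipeline.names
```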

rename(name)
save(filename=None)

Pickles the entire pipeline and saves it to disk, so it can be used by others or at another time

Parameters:filename (str) – filename to save the pipeline to; if omitted, the pipeline is saved to the ip.cache
Returns:the filename the pipeline was saved to
Return type:str
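The save-and-reload round trip can be sketched with the standard `pickle` module. `save_object` is a hypothetical helper, and a plain list of block names stands in for the pipeline; the real method pickles the pipeline instance itself, and its default location in the ip.cache is not modelled here.

```python
import os
import pickle
import tempfile


def save_object(obj, filename):
    """Hypothetical helper: pickle 'obj' to 'filename' and return the filename."""
    with open(filename, "wb") as f:
        pickle.dump(obj, f)
    return filename


with tempfile.TemporaryDirectory() as tmpdir:
    target = os.path.join(tmpdir, "pipeline.pck")
    saved_to = save_object(["resize", "classify"], target)

    # Reloading restores an equal object at another time or on another machine.
    with open(saved_to, "rb") as f:
        restored = pickle.load(f)
```

Unpickling can execute arbitrary code, so saved pipelines should only be loaded from trusted sources.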
train(data, labels=None)
validate(data)

Validates the integrity of the pipeline

Verifies that all input-output shapes are compatible with each other

Developer Note:

This function could use a full refactor, especially with regard to printouts when an error is raised - Jeff

Type comparison between Blocks is complicated, and I suspect more bugs are yet to be discovered.

Raises:
  • TypeError – if ‘data’ isn’t a list or tuple
  • RuntimeError – if more than one block in the pipeline has the same name, or not all objects in the block list are BaseBlock subclasses
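The documented error conditions can be sketched as a stand-alone check. `check_pipeline` and `BaseBlock` are hypothetical names for this illustration; the shape-compatibility verification itself is not modelled, and the order of the checks is an assumption.

```python
from collections import Counter


class BaseBlock:
    """Hypothetical stand-in for ip.BaseBlock."""
    def __init__(self, name):
        self.name = name


def check_pipeline(blocks, data):
    """Hypothetical check covering only the documented error conditions."""
    if not isinstance(data, (list, tuple)):
        raise TypeError("'data' must be a list or tuple")
    if not all(isinstance(b, BaseBlock) for b in blocks):
        raise RuntimeError("all blocks must be BaseBlock subclasses")
    counts = Counter(b.name for b in blocks)
    duplicates = [name for name, count in counts.items() if count > 1]
    if duplicates:
        raise RuntimeError(f"duplicate block names: {duplicates}")


# A well-formed block list passes silently.
check_pipeline([BaseBlock("load"), BaseBlock("resize")], data=[1, 2, 3])

# Two blocks sharing a name are rejected.
try:
    check_pipeline([BaseBlock("load"), BaseBlock("load")], data=[1, 2, 3])
    message = None
except RuntimeError as err:
    message = str(err)
```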