API

main

Definition of pypescript main function.

pypescript.main.main(config_block=None, pipe_graph_fn=None, data_block=None, save_data_block=None)

pypescript main function.

Parameters
  • config_block (string, ConfigBlock, dict, default=None) – If string, path to configuration file. Otherwise, a ConfigBlock instance or dict providing the configuration options. See pypescript.config.ConfigBlock.

  • pipe_graph_fn (string, default=None) – If not None, path where to save pipeline graph.

  • data_block (DataBlock, default=None) – Structure containing data exchanged between modules. If None, creates one.

  • save_data_block (string, default=None) – If not None, path where to save pipeline pipe_block.

block module

Definition of DataBlock and related classes.

class pypescript.block.BlockMapping(data=None, sep=None)

Bases: block.BlockMapping, pypescript.utils.BaseClass

This class handles a mapping between different (section, name) entries in DataBlock. It is useful if one wants to locally (i.e. for a specific module) change the entry corresponding to an item saved in the DataBlock instance.

data

Single level dictionary containing (section, name) mapping.

Type

dict

Note

Only (section, name) DataBlock getters and setters are affected by the mapping. For example, DataBlock.items() lists the stored (section, name), value entries regardless of the mapping.

Initialize BlockMapping.

Parameters
  • data (dict, default=None) –

    Single level dictionary, where the DataBlock section2, name2 internal entry accessed when calling DataBlock getters and setters with section1, name1 is:

    data[section1,name1]
    

    If sep is not None, the corresponding data key may be section1[sep]name1 and the corresponding value section2[sep]name2 (see below). One can also specify the mapping for a whole section section1 to an internal section section2 with:

    data[section1]
    

  • sep (string, default=None) – If not None, data string keys and values will be split using the separator sep. For example, if sep == '.', the key section1.name1 will be split into section1, name1.
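The lookup rules above can be illustrated with a minimal pure-Python sketch. It does not reproduce the C-backed implementation; parse_mapping and resolve are hypothetical helpers, and the precedence of an entry-level mapping over a section-level one is an assumption.

```python
def parse_mapping(data, sep=None):
    """Normalise mapping keys and values into tuples (hypothetical helper)."""
    def split(entry):
        if isinstance(entry, str):
            # 'section1.name1' -> ('section1', 'name1') when sep == '.'
            return tuple(entry.split(sep)) if sep is not None else (entry,)
        return tuple(entry)
    return {split(key): split(value) for key, value in data.items()}


def resolve(mapping, section, name):
    """Return the internal (section, name) entry for a requested one."""
    if (section, name) in mapping:   # entry-level mapping first (assumption)
        return mapping[(section, name)]
    if (section,) in mapping:        # whole-section mapping
        return (mapping[(section,)][0], name)
    return (section, name)           # no mapping: entry is unchanged


mapping = parse_mapping({'section1.name1': 'section2.name2',
                         'section3': 'section4'}, sep='.')
print(resolve(mapping, 'section1', 'name1'))
print(resolve(mapping, 'section3', 'name1'))
print(resolve(mapping, 'section5', 'name1'))
```

Here the first call is redirected to another entry, the second to another section, and the third is left unchanged.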

setdefault(key, value)

Set default value. TODO: implement in C.

to_dict(sep=None)

Return as dictionary of tuples, joined by sep if not None.

class pypescript.block.DataBlock(data=None, mapping=None, add_sections=None)

Bases: block.DataBlock, pypescript.utils.BaseClass

The data structure fed to all modules.

It is essentially a dictionary, with items to be accessed through the key (section, name). Most useful methods are those to get (get, get_type, get_[type]…) and set objects. The class mostly inherits from the DataBlock type coded using the Python C API. Only a few convenience methods are written in Python below.

>>> data_block = DataBlock({'section1':{'name1':1}})
>>> data_block.get('section1','name1')
>>> data_block.get_int('section1','name1')
>>> data_block.get_string('section1','name1')
Traceback (most recent call last):
pypescript.block.TypeError: Wrong type for "name1" in section [section1].

data

Double level (section, name) dictionary.

Type

dict

mapping

See documentation of BlockMapping.

Type

BlockMapping

Initialize DataBlock.

Parameters
  • data (dict, default=None) –

    Double level dictionary, where the item corresponding to section1, name1 can be accessed through:

    data[section1][name1]
    

  • mapping (BlockMapping, dict, default=None) – See documentation of BlockMapping.

  • add_sections (list, default=None) – List of sections to be added to self. If None, defaults to syntax.common_sections.

copy(nocopy=None)

Return a shallow copy of self, i.e. only the dictionary mapping to the stored items is copied. The items themselves are not copied, except if they have an attribute _copy_if_datablock_copy set to True.

Parameters

nocopy (list, default=None) – List of sections to not copy (such that any change in these sections of self will affect the returned copy as well). If None, defaults to syntax.common_sections.

Note

mapping instance is simply added to the returned DataBlock instance, no copy is performed.
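The copy semantics described above can be sketched with plain dictionaries. This is an illustration only, not the DataBlock implementation: shallow_copy is a hypothetical function, and the _copy_if_datablock_copy special case is left out.

```python
def shallow_copy(data, nocopy=()):
    """Copy a two-level dict; sections listed in nocopy are shared, not copied."""
    new = {}
    for section, items in data.items():
        if section in nocopy:
            new[section] = items        # same dict object: changes are shared
        else:
            new[section] = dict(items)  # new dict, pointing to the same items
    return new


original = {'common': {'a': 1}, 'section1': {'b': [1, 2]}}
copied = shallow_copy(original, nocopy=['common'])
copied['common']['a'] = 2          # seen by original: section is shared
copied['section1']['c'] = 3        # not seen by original: section was copied
copied['section1']['b'].append(3)  # seen by original: items are not copied
```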

get_type(section, name, types, *args, **kwargs)

Wrapper around DataBlock.get() which further checks the output type and raises a TypeError if the result is not an instance of any of types.

Parameters
  • section (string) – Section name.

  • name (string) – Element name. section, name is the complete DataBlock entry.

  • types (list, tuple, string, type or class) – Types to check the return value of DataBlock.get() against. If list or tuple, check whether any of the proposed types matches. If a type is string, will search for the corresponding builtin type.

  • args (list) – Other arguments to DataBlock.get().

  • kwargs (dict) – Other arguments to DataBlock.get().

Returns

value – Return value in self if it exists and is of the correct type (else raises a TypeError), else default value if provided (else raises a KeyError).

Return type

object

Raises

TypeError – If the result of DataBlock.get() is not the default value (if provided) and not an instance of any of types.

Note

If a default value is provided, and returned (in case the required section, name is not in DataBlock) then no type-checking is performed (hence no exception is raised).
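The type-checking behaviour can be sketched on a plain two-level dictionary. This is a simplified stand-in, not the actual method: the function below is hypothetical, takes the dictionary as first argument, and raises the builtin TypeError rather than pypescript.block.TypeError.

```python
import builtins

_missing = object()  # sentinel: distinguishes "no default" from default=None


def get_type(data, section, name, types, default=_missing):
    """Return data[section][name] after checking its type against types."""
    if isinstance(types, (str, type)):
        types = [types]
    # Strings are looked up among builtin types, e.g. 'int' -> int
    types = tuple(getattr(builtins, t) if isinstance(t, str) else t
                  for t in types)
    try:
        value = data[section][name]
    except KeyError:
        if default is _missing:
            raise
        return default  # returned as is, without any type check
    if not isinstance(value, types):
        raise TypeError('Wrong type for "{}" in section [{}].'.format(name, section))
    return value


data = {'section1': {'name1': 1}}
value = get_type(data, 'section1', 'name1', ['int', float])
fallback = get_type(data, 'section1', 'name2', int, default='not an int')
```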

set_mapping(mapping=None)

Set mapping.

Parameters

mapping (BlockMapping, dict, default=None) – See documentation of BlockMapping.

setdefault(section, name, value)

Set default value. TODO: implement in C.

update(other, nocopy=None)

Update self, i.e. only the dictionary mapping to the stored items is updated with other, not the items themselves.

Parameters

nocopy (list, default=None) – List of sections to not update (such that any change in these sections of other will affect self as well). If None, defaults to syntax.common_sections.

Note

mapping instance is not updated.

class pypescript.block.SectionBlock(block, section)

Bases: object

Convenient wrapper to restrict a DataBlock instance to a given section, such that items can be accessed as in a single-level dictionary.

>>> section_block = SectionBlock(data_block,'section1')
>>> section_block.get('name1')

Initialize SectionBlock.

Parameters
  • block (DataBlock) – DataBlock instance.

  • section (string) – Section to restrict block to.

has(name)

Has this name?

items()

Yield (name, value) tuples.

keys()

Return names in the block section.

setdefault(name, value)

Set default value. TODO: implement in C.
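As an illustration of the idea, here is a minimal stand-in that restricts a plain two-level dictionary to one section. SectionView is a hypothetical class, not SectionBlock itself; in particular, returning None for a missing name in get() is an assumption of this sketch.

```python
class SectionView:
    """Hypothetical stand-in for SectionBlock, wrapping a two-level dict."""

    def __init__(self, block, section):
        self.block = block
        self.section = section

    def has(self, name):
        return name in self.block.get(self.section, {})

    def get(self, name, default=None):
        return self.block.get(self.section, {}).get(name, default)

    def keys(self):
        return list(self.block.get(self.section, {}))

    def items(self):
        # Yield (name, value) tuples of the restricted section
        for name in self.keys():
            yield name, self.block[self.section][name]

    def setdefault(self, name, value):
        # Writes go through to the underlying two-level dictionary
        return self.block.setdefault(self.section, {}).setdefault(name, value)


block = {'section1': {'name1': 1}}
view = SectionView(block, 'section1')
view.setdefault('name2', 2)
print(view.keys(), dict(view.items()))
```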

config module

Definition of ConfigBlock.

class pypescript.config.ConfigBlock(data=None, string=None, parser=None)

Bases: pypescript.block.DataBlock

This class handles the pipeline configurations. Extends DataBlock with an initialisation from a file.

Initialize ConfigBlock.

Parameters
  • data (string, ConfigBlock, dict, default=None) – Path to configuration file. Else, ConfigBlock instance to be (shallow) copied. Else, a dictionary as for initialisation of a DataBlock instance. If None, ignored.

  • string (string, default=None) – If not None, yaml format string to decode. Added on top of data.

  • parser (callable, default=yaml_parser) – Function that parses yaml string into a dictionary. Used when data is string, or string is not None.

copy()

Return shallow copy of self.

save_yaml(filename)

Save class to disk.

exception pypescript.config.ConfigError

Bases: Exception

Exception raised when there is an issue with the pypescript configuration.

module module

Definition of BaseModule.

class pypescript.module.BaseModule(name, options=None, config_block=None, data_block=None, description=None, pipeline=None)

Bases: object

Base module class, which wraps pure Python modules or Python C extensions. Modules interact with the rest of the pipeline through the three methods BaseModule.setup(), BaseModule.execute() and BaseModule.cleanup().

name

Module name, which is set by the pipeline configuration and is a unique module identifier. One can use the same module implementation with different module names in the same pipeline.

Type

string

config_block

Structure containing configuration options.

Type

DataBlock

data_block

Structure containing data exchanged between modules.

Type

DataBlock

description

Module description.

Type

ModuleDescription

Initialize BaseModule.

Parameters
  • name (string) – Module name, which is set by the pipeline configuration and is a unique module identifier. One can use the same module implementation with different module names in the same pipeline.

  • options (SectionBlock, dict, default=None) – Options for this module.

  • config_block (DataBlock, dict, string, default=None) – Structure containing configuration options. If None, creates one.

  • data_block (DataBlock, default=None) – Structure containing data exchanged between modules. If None, creates one.

  • description (string, ModuleDescription, dict, default=None) – Module description.

  • pipeline (BasePipeline) – Pipeline instance for which this module was created.

check_options()

Check that the provided options are mentioned in the description file (if it exists); otherwise raise ConfigError.

fetch_module(name='')

Fetch module/pipeline given (dot-separated) name.

classmethod from_filename(name='module', options=None, **kwargs)

Create BaseModule-type module from either module name or module file. The imported module can contain the following functions:

  • setup(name, config_block, data_block)

  • execute(name, config_block, data_block)

  • cleanup(name, config_block, data_block)

Or a class with the following methods:

  • setup(self)

  • execute(self)

  • cleanup(self)

which can use attributes name, config_block and data_block.

Parameters
  • name (string) – Module name, which is set by the pipeline configuration and is a unique module identifier.

  • options (SectionBlock, dict, default=None) – Options for this module. It should contain an entry ‘module_file’ OR (exclusive) ‘module_name’ (w.r.t. ‘base_dir’, defaulting to ‘.’). It may contain an entry ‘module_class’ containing a class name if the module consists of a class.

  • kwargs (dict) – Arguments for BaseModule.__init__().

property mpicomm

Return current MPI communicator.

classmethod plot_inheritance_graph(filename, exclude=None)

Plot inheritance graph to filename.

Parameters
  • filename (string) – Where to save graph (in ps format).

  • exclude (list) – List of module (base name) to exclude from the graph.

set_config_block(options=None, config_block=None)

Set config_block and options. Also sets:

  • _datablock_set, dictionary of (key, value) to set into data_block

  • _datablock_mapping, BlockMapping instance that maps data_block entries to others

  • _datablock_duplicate, BlockMapping instance used to duplicate data_block entries

Parameters
  • options (SectionBlock, dict, default=None) – Options for this module, which update those in config_block.

  • config_block (DataBlock, dict, string, default=None) – Structure containing configuration options, which will be updated with options.

set_data_block(data_block=None)

Set data_block.

Parameters

data_block (DataBlock, default=None) – DataBlock instance used by the module to retrieve and store items. If None, creates one.

class pypescript.module.MetaModule(name, bases, class_dict)

Bases: pypescript.utils.BaseMetaClass

Meta class to replace setup(), execute() and cleanup() module methods.

set_functions(functions)

Wrap input functions and add corresponding methods to class cls. Specifically:

  • before function calls, fill in BaseModule.data_block with values specified in BaseModule._datablock_set

  • after function calls, duplicate entries of BaseModule.data_block with key pairs specified in BaseModule._datablock_duplicate

  • set module BaseModule._state

  • exceptions occurring in function calls are complemented with module class and local name, for easy debugging

Parameters

functions (dict) – Dictionary of function name: callable.
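The wrapping described above can be sketched as a plain decorator factory. This is not the metaclass code: wrap_step and ToyModule are hypothetical, data_block is a plain dictionary here, the direction of the duplication (key is the new entry, value is the source) is an assumption, and so is the use of RuntimeError to carry the module class and name.

```python
import functools


def wrap_step(step, function):
    """Wrap a setup/execute/cleanup function as outlined in the list above."""
    @functools.wraps(function)
    def wrapper(self):
        # Before the call: set requested values into data_block
        for (section, name), value in self._datablock_set.items():
            self.data_block.setdefault(section, {})[name] = value
        try:
            function(self)
        except Exception as exc:
            # Add module class and local name to ease debugging
            raise RuntimeError('Exception in {} [{}] at step {}'.format(
                type(self).__name__, self.name, step)) from exc
        # After the call: duplicate entries (new entry <- source entry)
        for (s1, n1), (s2, n2) in self._datablock_duplicate.items():
            self.data_block.setdefault(s1, {})[n1] = self.data_block[s2][n2]
        self._state = step
    return wrapper


class ToyModule:
    def __init__(self, name):
        self.name = name
        self.data_block = {}
        self._state = None
        self._datablock_set = {('parameters', 'a'): 1}
        self._datablock_duplicate = {('copy', 'b'): ('results', 'b')}


def execute(self):
    self.data_block['results'] = {'b': self.data_block['parameters']['a'] + 1}


ToyModule.execute = wrap_step('execute', execute)
module = ToyModule('toy')
module.execute()
```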

pypescript.module.mimport(module_name, module_file=None, module_class=None, name=None, data_block=None, options=None)

Convenient function to load pypescript module.

Parameters
  • module_name (string, default=None) – Python module name.

  • module_file (string, default=None) – Module file, used if module_name not provided.

  • module_class (string, default=None) – Module class to load from Python module. Unnecessary if only one class in the module.

  • name (string, default=None) – Local (i.e. bound to the pipeline) module name.

  • data_block (DataBlock, default=None) – Structure containing data exchanged between modules. If None, creates one.

  • options (SectionBlock, dict, default=None) – Options for this module.

pipeline module

Definition of BasePipeline and subclasses.

class pypescript.pipeline.BasePipeline(name='main', options=None, config_block=None, data_block=None, description=None, pipeline=None, modules=None, setup=None, execute=None, cleanup=None)

Bases: pypescript.module.BaseModule

Extend BaseModule to load, set up, execute, and clean up several modules.

modules

List of modules.

Type

list

Initialize BasePipeline.

Parameters
  • name (string, default='main') – See BaseModule documentation. Defaults to ‘main’, the root of the full pipeline tree.

  • options (SectionBlock, dict, default=None) – Options for this module. It should contain an entry ‘modules’ listing module names to load (defaults to empty list).

  • config_block (DataBlock, dict, string, default=None) – Structure containing configuration options.

  • data_block (DataBlock, default=None) – Structure containing data exchanged between modules. If None, creates one.

  • description (string, ModuleDescription, dict, default=None) – Module description.

  • pipeline (BasePipeline) – Pipeline instance for which this (sub-)pipeline was created.

  • modules (list, default=None) – List of modules, which will be completed by those in ‘setup’, ‘execute’ and ‘cleanup’ entries of options.

  • setup (list, default=None) – List of ‘module.method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.setup().

  • execute (list, default=None) – List of ‘module.method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.execute().

  • cleanup (list, default=None) – List of ‘module:method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.cleanup().

get_module_from_name(name)

Return BaseModule instance corresponding to module (pipeline) name.

plot_pipeline_graph(filename)

Plot pipeline as a graph to filename.

set_config_block(options=None, config_block=None)

Set config_block and options. config_block is updated by that of all modules, then the resulting block is set in all modules. Also sets:

  • _datablock_set, dictionary of (key, value) to set into data_block

  • _datablock_mapping, BlockMapping instance that maps data_block entries to others

  • _datablock_duplicate, BlockMapping instance used to duplicate data_block entries

Parameters
  • options (SectionBlock, dict, default=None) – Options for this module, which update those in config_block.

  • config_block (DataBlock, dict, string, default=None) – Structure containing configuration options, which will be updated with options.

set_todos(modules=None, setup_todos=None, execute_todos=None, cleanup_todos=None)

Prepare ModuleTodo instances for setup, execute, and cleanup.

exception pypescript.pipeline.BatchError

Bases: Exception

Exception raised when there is an issue with a batch job.

class pypescript.pipeline.BatchPipeline(*args, **kwargs)

Bases: pypescript.pipeline.MPIPipeline

Extend MPIPipeline to execute a subpipeline with a batch job.

job_dir

Directory path where to save config files, DataBlock instance and possibly batch submission script for each job.

Type

string

mpiexec

Name of MPI executable. Used when job_template not provided.

Type

string

job_template

Template for job submission scripts. Should contain patterns {command} for command, and job_options keys.

Type

string

job_submit

Command to submit job.

Type

string

job_options

Options for job.

Type

dict
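How job_template, job_options and the {command} pattern fit together can be sketched with str.format. Whether BatchPipeline uses str.format internally is an assumption, and the template content, option names and command below are purely illustrative.

```python
# Hypothetical template: {command} plus one pattern per job_options key
job_template = """#!/bin/bash
#SBATCH --time={time}
#SBATCH --nodes={nodes}
{command}
"""

job_options = {'time': '00:10:00', 'nodes': 1}
command = 'echo run-task-0'  # placeholder for the actual task command

# Fill the template for one task
script = job_template.format(command=command, **job_options)
print(script)
```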

Initialize BasePipeline.

Parameters
  • name (string, default='main') – See BaseModule documentation. Defaults to ‘main’, the root of the full pipeline tree.

  • options (SectionBlock, dict, default=None) – Options for this module. It should contain an entry ‘modules’ listing module names to load (defaults to empty list).

  • config_block (DataBlock, dict, string, default=None) – Structure containing configuration options.

  • data_block (DataBlock, default=None) – Structure containing data exchanged between modules. If None, creates one.

  • description (string, ModuleDescription, dict, default=None) – Module description.

  • pipeline (BasePipeline) – Pipeline instance for which this (sub-)pipeline was created.

  • modules (list, default=None) – List of modules, which will be completed by those in ‘setup’, ‘execute’ and ‘cleanup’ entries of options.

  • setup (list, default=None) – List of ‘module.method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.setup().

  • execute (list, default=None) – List of ‘module.method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.execute().

  • cleanup (list, default=None) – List of ‘module:method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.cleanup().

execute_task(itask=0)

Execute single task number itask: either using the command line, or by executing a job script (if job_template is provided).

find_file_task(filetype, itask=None)

Return file name for task number itask corresponding to filetype:

  • config_block: iconfig_block (config_block for this task)

  • data_block: ipipe_block (pipe_block for this task)

  • save_data_block: pipe_block output by execution of subpipeline

  • job: job submission script

property is_datablock_saved

Whether to save the DataBlock instance: only if there are items to be propagated in data_block.

load_task(itask=0)

Load subpipeline output DataBlock instance for task number itask from disk. If not asked to be saved, return None. Else, if DataBlock does not exist on disk, raise a BatchError.

class pypescript.pipeline.MPIPipeline(name='main', options=None, config_block=None, data_block=None, description=None, pipeline=None, modules=None, setup=None, execute=None, cleanup=None)

Bases: pypescript.pipeline.BasePipeline

Extend BasePipeline to execute several modules in parallel with MPI.

nprocs_per_task

Number of processes for each task.

Type

int

_iter

Tasks to iterate on in the execute() step.

Type

list, iterator

_configblock_iter

Mapping of config_block entry to callable giving value for each iteration.

Type

dict

_datablock_iter

Mapping of data_block entry to callable giving value for each iteration.

Type

dict

_datablock_key_iter

Mapping of data_block entry to list of data_block keys, pointing to the data_block entries where to store the results for all iterations.

Type

dict

Initialize BasePipeline.

Parameters
  • name (string, default='main') – See BaseModule documentation. Defaults to ‘main’, the root of the full pipeline tree.

  • options (SectionBlock, dict, default=None) – Options for this module. It should contain an entry ‘modules’ listing module names to load (defaults to empty list).

  • config_block (DataBlock, dict, string, default=None) – Structure containing configuration options.

  • data_block (DataBlock, default=None) – Structure containing data exchanged between modules. If None, creates one.

  • description (string, ModuleDescription, dict, default=None) – Module description.

  • pipeline (BasePipeline) – Pipeline instance for which this (sub-)pipeline was created.

  • modules (list, default=None) – List of modules, which will be completed by those in ‘setup’, ‘execute’ and ‘cleanup’ entries of options.

  • setup (list, default=None) – List of ‘module.method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.setup().

  • execute (list, default=None) – List of ‘module.method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.execute().

  • cleanup (list, default=None) – List of ‘module:method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.cleanup().

run_iter(todos)

Run list of ModuleTodo for all iterations.

class pypescript.pipeline.MetaPipeline(name, bases, class_dict)

Bases: pypescript.module.MetaModule

Meta class to replace setup(), execute() and cleanup() pipeline methods.

set_functions(functions)

Wrap input functions and add corresponding methods to class cls. Specifically:

  • before function calls, fill in BasePipeline.data_block with values specified in BasePipeline._datablock_set

  • after function calls, copy entries of BasePipeline.pipe_block into BasePipeline.data_block with key pairs specified in BasePipeline._datablock_duplicate

  • set pipeline BasePipeline._state

  • exceptions occurring in function calls are complemented with module class and local name, for easy debugging

Parameters

functions (dict) – Dictionary of function name: callable.

class pypescript.pipeline.ModuleTodo(pipeline, module, step)

Bases: object

Helper class to run module BaseModule.setup(), BaseModule.execute() and BaseModule.cleanup(), based on each module’s BaseModule._state and the following decision tree:

  • if run ‘setup’: if state is ‘setup’ or ‘execute’, run ‘cleanup’ first

  • if run ‘execute’: if state is ‘cleanup’, run ‘setup’ first
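The decision tree above can be written as a small function. This is a sketch rather than the ModuleTodo.todo() code: the function takes the state and step as plain strings, and only encodes the two rules listed.

```python
def todo(state, step):
    """Return the list of steps to run, given the module state and requested step."""
    steps = []
    if step == 'setup' and state in ('setup', 'execute'):
        steps.append('cleanup')  # clean up a module that was already set up
    if step == 'execute' and state == 'cleanup':
        steps.append('setup')    # a cleaned-up module must be set up again
    steps.append(step)
    return steps


print(todo('execute', 'setup'))
print(todo('cleanup', 'execute'))
print(todo('setup', 'execute'))
```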

Initialize ModuleTodo.

Parameters
  • pipeline (BasePipeline) – Pipeline instance that will call ModuleTodo instance.

  • module (BaseModule) – Module to run.

  • step (string) – module method to call.

set_data_block()

Set module BaseModule.data_block to BasePipeline.pipe_block.

todo()

Return list of steps to run.

class pypescript.pipeline.StreamPipeline(name='main', options=None, config_block=None, data_block=None, description=None, pipeline=None, modules=None, setup=None, execute=None, cleanup=None)

Bases: pypescript.pipeline.BasePipeline

Extend BasePipeline to load, set up, execute, and clean up several modules without copying data_block.

Initialize BasePipeline.

Parameters
  • name (string, default='main') – See BaseModule documentation. Defaults to ‘main’, the root of the full pipeline tree.

  • options (SectionBlock, dict, default=None) – Options for this module. It should contain an entry ‘modules’ listing module names to load (defaults to empty list).

  • config_block (DataBlock, dict, string, default=None) – Structure containing configuration options.

  • data_block (DataBlock, default=None) – Structure containing data exchanged between modules. If None, creates one.

  • description (string, ModuleDescription, dict, default=None) – Module description.

  • pipeline (BasePipeline) – Pipeline instance for which this (sub-)pipeline was created.

  • modules (list, default=None) – List of modules, which will be completed by those in ‘setup’, ‘execute’ and ‘cleanup’ entries of options.

  • setup (list, default=None) – List of ‘module.method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.setup().

  • execute (list, default=None) – List of ‘module.method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.execute().

  • cleanup (list, default=None) – List of ‘module:method’ (method being one of (‘setup’, ‘execute’, ‘cleanup’)) strings. If method not specified, defaults to BaseModule.cleanup().

syntax module

class pypescript.syntax.Decoder(data=None, string=None, parser=None, decode=True, **kwargs)

Bases: collections.UserDict

Class that decodes configuration dictionary, taking care of template forms.

data

Decoded configuration dictionary.

Type

dict

raw

Raw (without decoding of template forms) configuration dictionary.

Type

dict

filename

Path to corresponding configuration file.

Type

string

parser

yaml parser.

Type

callable

Initialize Decoder.

Parameters
  • data (dict, string, default=None) – Dictionary or path to a configuration yaml file to decode.

  • string (string) – If not None, yaml format string to decode. Added on top of data.

  • parser (callable, default=yaml_parser) – Function that parses yaml string into a dictionary. Used when data is string, or string is not None.

  • decode (bool, default=True) – Whether to decode configuration dictionary, i.e. solve template forms.

  • kwargs (dict) – Arguments for parser().

decode()

Decode description dictionary data:

  • expand section.name: value entries into {'section': {'name': 'value'}} dictionary

  • generate repeats $(%)

  • replace ${filename:section.name} by corresponding value in configuration file at path filename, at section, name keys.

  • replace f'here is the value: ${filename:section.name}' templates by 'here is the value: value'

  • replace e'42 + ${filename:section.name}' forms by the evaluated expression 42 + value

  • decode DataBlock duplicate: $[section1.name1] = $[section2.name2]

  • decode DataBlock mapping: $[section1.name1] = &$[section2.name2]

  • decode DataBlock set: $[section1.name1] = value

  • decode pypescript keywords (starting with ‘$’)

  • decode ConfigBlock mapping: &${section.name}

decode_eval(word, di=None)

If word matches template e'42 + ${filename:section.name}', return the evaluated expression 42 + value (di is the local section dictionary, to be used for replacements - see decode_replace()). Else return None.

decode_format(word, di=None)

If word matches template f'here is the value: ${filename:section.name}', return 'here is the value: value' (di is the local section dictionary, to be used for replacements - see decode_replace()). Else return None.

decode_keyword(word)

If word matches template $keyword, with keyword pypescript keyword, return keyword. Else return None.

decode_mapping(word)

If word matches template &${section.name}, return tuple (section, name). Else return None.

decode_repeat(word, placeholder=None)

If word matches template start$(value)end:

  • if value is %, placeholder is not None: replace by $(placeholder)

  • else, try to find start$(%)end in data, return new word, new key in data (startvalueend), old key in data (start$(%)end) and value

Else, if placeholder is not None, replace $% by placeholder.

decode_replace(word, di=None)

If word matches template ${section.name}, return corresponding value in the configuration file at section, name keys (if only name provided, search first into current section dictionary di if not None). Else if word matches template ${filename:section.name}, return corresponding value in configuration file at path filename, at section, name keys. Else return None.
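The same-file case can be sketched with a regular expression. This is a simplified illustration, not the Decoder method: the function below is standalone, takes the configuration dictionary explicitly, uses a pattern chosen for this example, and leaves out the ${filename:section.name} form.

```python
import re


def decode_replace(word, config, di=None):
    """Return the value referenced by '${section.name}', or None if no match."""
    match = re.fullmatch(r'\$\{([^{}:]+)\}', word)
    if match is None:
        return None
    keys = match.group(1).split('.')
    # A bare name is first searched in the local section dictionary
    if len(keys) == 1 and di is not None and keys[0] in di:
        return di[keys[0]]
    value = config
    for key in keys:
        value = value[key]
    return value


config = {'section1': {'name1': 42}}
print(decode_replace('${section1.name1}', config))
print(decode_replace('${name1}', config, di=config['section1']))
print(decode_replace('plain string', config))
```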

read_file(filename)

Read file at path filename.

search(*keys)

Search value corresponding to the input sequence of keys.

exception pypescript.syntax.KeywordError(word)

Bases: Exception

Exception raised when there is an issue with a pypescript keyword.

pypescript.syntax.collapse_sections(di, maxdepth=None, sep='.')

Collapse nested dictionaries up to maxdepth.

>>> collapse_sections({'section1': {'section2': {'section3': 'value'}}},maxdepth=2,sep='.')
{'section1.section2': {'section3': 'value'}}

pypescript.syntax.expand_sections(di, sep='.')

Recursively replace sep-separated di keys by nested dictionaries.

>>> expand_sections({'section1.section2':'value'},sep='.')
{'section1': {'section2': 'value'}}

pypescript.syntax.remove_keywords(di, others=None)

Remove pypescript keyword entries (and other keys others) from input dictionary di.
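For the section helpers of this module, the expansion of separated keys documented for expand_sections can be sketched as follows. This is a minimal re-implementation for illustration, assuming string keys and no conflicting entries (such as both 'a' and 'a.b'); it is not the library code.

```python
def expand_sections(di, sep='.'):
    """Recursively turn 'a.b' keys into nested {'a': {'b': ...}} dictionaries."""
    out = {}
    for key, value in di.items():
        if isinstance(value, dict):
            value = expand_sections(value, sep=sep)
        *sections, last = key.split(sep)
        node = out
        for section in sections:
            node = node.setdefault(section, {})  # descend, creating levels
        node[last] = value
    return out


print(expand_sections({'section1.section2': 'value'}, sep='.'))
print(expand_sections({'a.b': {'c.d': 1}, 'a.e': 2}))
```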

MPI module

Task manager that distributes tasks over MPI processes.

Taken from https://github.com/bccp/nbodykit/blob/master/nbodykit/__init__.py and https://github.com/bccp/nbodykit/blob/master/nbodykit/batch.py.

class pypescript.mpi.CurrentMPIComm

Bases: object

Class to facilitate getting and setting the current MPI communicator.

static enable(func)

Decorator to attach the current MPI communicator to the input keyword arguments of func, via the mpicomm keyword.

classmethod enter(mpicomm)

Enter a context where the current default MPI communicator is modified to the argument mpicomm. After leaving the context manager, the communicator is restored.

Example:

with CurrentMPIComm.enter(comm):
    cat = UniformCatalog(...)

is identical to

cat = UniformCatalog(..., comm=comm)

classmethod get()

Get the default current MPI communicator. The initial value is MPI.COMM_WORLD.

classmethod pop()

Restore to the previous current default MPI communicator.

classmethod push(mpicomm)

Switch to a new current default MPI communicator.
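The get/push/pop/enter pattern can be sketched as a stack of communicators. The class below is a hypothetical stand-in that runs without mpi4py: communicators are represented by placeholder strings, and the stack-based storage is an assumption about how such a class may be built.

```python
from contextlib import contextmanager


class CurrentComm:
    """Hypothetical stand-in for CurrentMPIComm, using strings as communicators."""

    _stack = ['COMM_WORLD']  # placeholder for MPI.COMM_WORLD

    @classmethod
    def get(cls):
        return cls._stack[-1]

    @classmethod
    def push(cls, mpicomm):
        cls._stack.append(mpicomm)

    @classmethod
    def pop(cls):
        cls._stack.pop()

    @classmethod
    @contextmanager
    def enter(cls, mpicomm):
        cls.push(mpicomm)
        try:
            yield mpicomm
        finally:
            cls.pop()  # restore the previous communicator, even on error


with CurrentComm.enter('SUB_COMM'):
    inside = CurrentComm.get()
outside = CurrentComm.get()
```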

class pypescript.mpi.CurrentMPIState(mpistate)

Bases: object

Descriptor for current MPI state of a Python class:

  • SCATTERED: class content scattered on all ranks

  • GATHERED: class content gathered on root rank

  • BROADCAST: class “content” (e.g. arrays) broadcast on all ranks

pypescript.mpi.MPIBroadcast(func)

mpi_broadcast() decorator that first gathers class on mpiroot, then sets mpiroot, mpicomm and mpistate.

exception pypescript.mpi.MPIError

Bases: Exception

Exception raised when there is an issue with MPI operations.

pypescript.mpi.MPIGather(func)

mpi_gather() decorator that checks whether class is already gathered before gathering, then sets mpiroot and mpistate.

pypescript.mpi.MPIInit(func)

__init__() decorator that sets MPI attributes: mpiroot, mpicomm and mpistate.

pypescript.mpi.MPIScatter(func)

mpi_scatter() decorator that checks whether class is already scattered before scattering, then sets mpistate.

class pypescript.mpi.MPITaskManager(nprocs_per_task=1, use_all_nprocs=False, mpicomm=None)

Bases: object

An MPI task manager that distributes tasks over a set of MPI processes, using a specified number of independent workers to compute each task.

Given the specified number of independent workers (which compute tasks in parallel), the total number of available CPUs will be divided evenly.

The main function is iterate which iterates through a set of tasks, distributing the tasks in parallel over the available ranks.

Initialize MPITaskManager.

Parameters
  • nprocs_per_task (int, optional) – the desired number of processes assigned to compute each task

  • mpicomm (MPI communicator, optional) – the global communicator that will be split so each worker has a subset of CPUs available; default is COMM_WORLD

  • use_all_nprocs (bool, optional) – if True, use all available CPUs, including the remainder if nprocs_per_task does not divide the total number of CPUs evenly; default is False

is_root()

Is the current process the root process? Root is responsible for distributing the tasks to the other available ranks.

is_worker()

Is the current process a valid worker? Workers wait for instructions from the master.

iterate(tasks)

Iterate through a series of tasks in parallel.

Notes

This is a collective operation and should be called by all ranks.

Parameters

tasks (iterable) – An iterable of task items that will be yielded in parallel across all ranks.

Yields

task – The individual items of tasks, iterated through in parallel.

map(function, tasks)

Apply a function to all of the values in a list and return the list of results.

If tasks contains tuples, the arguments are passed to function using the *args syntax.

Notes

This is a collective operation and should be called by all ranks.

Parameters
  • function (callable) – The function to apply to the list.

  • tasks (list) – The list of tasks.

Returns

results – The list of the return values of function.

Return type

list
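The argument-passing convention of map() can be illustrated with a serial equivalent. serial_map is a hypothetical helper that ignores MPI entirely; it only shows how tuple tasks are unpacked with the *args syntax.

```python
def serial_map(function, tasks):
    """Serial illustration of map(): tuple tasks are unpacked as arguments."""
    results = []
    for task in tasks:
        if isinstance(task, tuple):
            results.append(function(*task))  # (2, 3) -> function(2, 3)
        else:
            results.append(function(task))
    return results


powers = serial_map(pow, [(2, 3), (3, 2)])
absolutes = serial_map(abs, [-1, 2])
```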

pypescript.mpi.broadcast_array(data, root=0, mpicomm=None)

Broadcast the input data array across all ranks, assuming data is initially only on root (and None on other ranks). This uses Scatterv, which avoids mpi4py pickling, and also avoids the 2 GB mpi4py limit for bytes using a custom datatype.

Parameters
  • data (array_like or None) – on root, this gives the data to split and scatter

  • mpicomm (MPI communicator) – the MPI communicator

  • root (int) – the rank number that initially has the data

  • counts (list of int) – list of the lengths of data to send to each rank

Returns

recvbuffer – the chunk of data that each rank gets

Return type

array_like

pypescript.mpi.enum(*sequential, **named)

Enumeration values to serve as status tags passed between processes.

pypescript.mpi.gather_array(data, root=0, mpicomm=None)

Gather the input data array from all ranks to the specified root. This uses Gatherv, which avoids mpi4py pickling, and also avoids the 2 GB mpi4py limit for bytes using a custom datatype. Taken from https://github.com/bccp/nbodykit/blob/master/nbodykit/utils.py

Parameters
  • data (array_like) – the data on each rank to gather

  • mpicomm (MPI communicator) – the MPI communicator

  • root (int, or Ellipsis) – the rank number to gather the data to. If root is Ellipsis or None, broadcast the result to all ranks.

Returns

recvbuffer – the gathered data on root, and None otherwise

Return type

array_like, None

pypescript.mpi.scatter_array(data, counts=None, root=0, mpicomm=None)

Scatter the input data array across all ranks, assuming data is initially only on root (and None on other ranks). This uses Scatterv, which avoids mpi4py pickling, and also avoids the 2 GB mpi4py limit for bytes using a custom datatype. Taken from https://github.com/bccp/nbodykit/blob/master/nbodykit/utils.py

Parameters
  • data (array_like or None) – on root, this gives the data to split and scatter

  • mpicomm (MPI communicator) – the MPI communicator

  • root (int) – the rank number that initially has the data

  • counts (list of int) – list of the lengths of data to send to each rank

Returns

recvbuffer – the chunk of data that each rank gets

Return type

array_like
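To illustrate what counts means, here is a pure-Python sketch with no MPI involved. The helper split_by_counts is hypothetical and not part of pypescript; it only mimics how a scatter with explicit counts hands consecutive chunks to successive ranks, and shows that concatenating the chunks in rank order (as a gather would) recovers the input.

```python
def split_by_counts(data, counts):
    """Split data into consecutive chunks; chunk i has length counts[i]."""
    if sum(counts) != len(data):
        raise ValueError('counts must sum to len(data)')
    chunks, start = [], 0
    for count in counts:
        chunks.append(data[start:start + count])
        start += count
    return chunks


data = list(range(10))
# One chunk per simulated rank: rank 0 gets 4 items, ranks 1 and 2 get 3 each.
chunks = split_by_counts(data, counts=[4, 3, 3])
# Gathering amounts to concatenating the chunks in rank order.
gathered = [item for chunk in chunks for item in chunk]
```

How scatter_array chooses counts when counts is None is not stated here, so the sketch requires them explicitly.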

pypescript.mpi.split_ranks(N_ranks, N, include_all=False)

Divide the ranks into chunks, attempting to have N ranks in each chunk. This removes the master (0) rank, such that N_ranks - 1 ranks are available to be grouped.

Parameters
  • N_ranks (int) – the total number of ranks available

  • N (int) – the desired number of ranks per worker

  • include_all (bool, optional) – if True, then do not force each group to have exactly N ranks, instead including the remainder as well; default is False
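The grouping described above can be sketched in plain Python. This is an illustration under assumptions, not the pypescript implementation: the name split_ranks_sketch is hypothetical, the return value (a list of rank lists) is a guess, and placing the remainder in its own smaller group when include_all is True is one possible reading of the parameter description.

```python
def split_ranks_sketch(n_ranks, n, include_all=False):
    """Group ranks 1..n_ranks-1 into chunks of n ranks (rank 0 is reserved)."""
    available = list(range(1, n_ranks))  # drop the master rank 0
    chunks = [available[i:i + n] for i in range(0, len(available), n)]
    # Without include_all, an incomplete trailing chunk is discarded.
    if chunks and len(chunks[-1]) < n and not include_all:
        chunks.pop()
    return chunks


even = split_ranks_sketch(7, 2)
dropped = split_ranks_sketch(8, 3)
kept = split_ranks_sketch(8, 3, include_all=True)
```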

utils module

A few utilities.

class pypescript.utils.BaseClass

Bases: pypescript.utils._BaseClass

Base template for pypescript MPI classes. It defines a couple of base methods.

classmethod from_state(state, mpiroot=0, mpicomm=None)

Instantiate and initialize class with state dictionary.

classmethod load(filename, mpiroot=0, mpicomm=None)

Load class from disk.

save(filename)

Save class to disk.

class pypescript.utils.BaseMetaClass(name, bases, class_dict)

Bases: type

Meta class to add logging attributes to BaseClass derived classes.

set_logger()

Add attributes for logging:

  • logger

  • methods log_debug, log_info, log_warning, log_error, log_critical
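A metaclass of this kind could look like the following sketch. The name LoggingMeta and the choice of naming each logger after its class are assumptions made for illustration; the actual BaseMetaClass may differ.

```python
import logging


class LoggingMeta(type):
    """Metaclass that attaches a logger and log_* shortcuts to each class."""

    def __init__(cls, name, bases, class_dict):
        super().__init__(name, bases, class_dict)
        cls.set_logger()

    def set_logger(cls):
        # Assumption: one logger per class, named after the class.
        cls.logger = logging.getLogger(cls.__name__)
        for level in ('debug', 'info', 'warning', 'error', 'critical'):
            # Logger methods are already bound, so they work from instances too.
            setattr(cls, 'log_' + level, getattr(cls.logger, level))


class Demo(metaclass=LoggingMeta):
    pass


class SubDemo(Demo):
    pass


Demo().log_warning('something happened')
```

Because the metaclass runs for every derived class, SubDemo gets its own logger rather than sharing the one of Demo.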

class pypescript.utils.BaseTaskManager(mpicomm=None)

Bases: object

A dumb task manager, that simply iterates through the tasks in series.

iterate(tasks)

Iterate through a series of tasks.

Parameters

tasks (iterable) – An iterable of tasks that will be yielded.

Yields

task – The individual items of tasks, iterated through in series.

map(function, tasks)

Apply a function to all of the values in a list and return the list of results.

If tasks contains tuples, the arguments are passed to function using the *args syntax.

Parameters
  • function (callable) – The function to apply to the list.

  • tasks (list) – The list of tasks.

Returns

results – The list of the return values of function.

Return type

list
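A serial task manager with this interface can be sketched in a few lines. SerialTaskManager is a hypothetical stand-in written for illustration, not the pypescript class; the context-manager methods are included only so it can be used in a with statement.

```python
class SerialTaskManager:
    """Minimal serial task manager: tasks are processed one after the other."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        return False  # do not swallow exceptions

    def iterate(self, tasks):
        for task in tasks:
            yield task

    def map(self, function, tasks):
        results = []
        for task in self.iterate(tasks):
            if isinstance(task, tuple):
                results.append(function(*task))  # tuples are unpacked as *args
            else:
                results.append(function(task))
        return results


with SerialTaskManager() as tm:
    squares = tm.map(lambda x: x ** 2, [1, 2, 3])
    powers = tm.map(pow, [(2, 3), (3, 2)])
```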

class pypescript.utils.MemoryMonitor(pid=None, msg='')

Bases: object

Class that monitors memory usage and clock, useful to check for memory leaks.

>>> with MemoryMonitor() as mem:
...     '''do something'''
...     mem()
...     '''do something else'''

Initialize MemoryMonitor and register current memory usage.

Parameters
  • pid (int, default=None) – Process identifier. If None, use the identifier of the current process.

  • msg (string, default='') – Additional message.
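The same usage pattern can be reproduced with the standard library alone. The sketch below is not the pypescript implementation: MemoryMonitorSketch is a hypothetical class that uses tracemalloc (which only tracks allocations made by Python) and time.perf_counter, and it has no pid argument.

```python
import time
import tracemalloc


class MemoryMonitorSketch:
    """Report the change in traced memory and elapsed time between calls."""

    def __init__(self, msg=''):
        self.msg = msg
        self.records = []

    def __enter__(self):
        tracemalloc.start()
        self._mem = tracemalloc.get_traced_memory()[0]
        self._time = time.perf_counter()
        return self

    def __call__(self):
        mem = tracemalloc.get_traced_memory()[0]
        now = time.perf_counter()
        record = (mem - self._mem, now - self._time)
        self.records.append(record)
        print('{} memory change: {:+d} bytes, elapsed: {:.3f} s'.format(self.msg, *record))
        self._mem, self._time = mem, now
        return record

    def __exit__(self, exc_type, exc_value, exc_traceback):
        self()  # final report on exit
        tracemalloc.stop()
        return False


with MemoryMonitorSketch(msg='demo') as mem:
    data = [0] * 100000
    mem()       # reports the allocation of data
    del data
```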

class pypescript.utils.ScatteredBaseClass(**attrs)

Bases: pypescript.utils._BaseClass

Base template for pypescript MPI classes. It defines a couple of base methods and attributes.

mpistate

See CurrentMPIState.

Type

CurrentMPIState

mpicomm

Current MPI communicator.

Type

MPI communicator

mpiroot

MPI root rank.

Type

int

classmethod from_state(state, mpistate=1, mpiroot=0, mpicomm=None)

Instantiate and initialize class with state dictionary.

classmethod load(filename, mpiroot=0, mpistate=1, mpicomm=None)

Load class from disk.

classmethod mpi_collect(self=None, sources=None, mpicomm=None)

Return new instance corresponding to self on larger mpicomm.

Parameters
  • self (object, None) – Instance to spread on mpicomm.

  • sources (list, None) – Ranks of processes of mpicomm where self lives. If None, takes the ranks of processes where self is not None.

  • mpicomm (MPI communicator) – New mpi communicator.

Returns

new

Return type

object

mpi_distribute(dests, mpicomm=None)

Return new instance corresponding to self on smaller mpicomm.

Parameters
  • self (object, None) – Instance to concentrate on mpicomm.

  • dests (list, None) – Ranks of processes of mpicomm where to send self. If None, takes the ranks of processes where self is not None.

  • mpicomm (MPI communicator) – New mpi communicator.

Returns

new

Return type

object, None

mpi_to_state(mpistate)

Return instance, changing current MPI state to mpistate.

save(filename)

Save class to disk.

pypescript.utils.TaskManager(mpicomm=None, nprocs_per_task=1, **kwargs)

Switch between non-MPI (ntasks=1) and MPI task managers. To be called as:

with TaskManager(...) as tm:
    # do stuff

pypescript.utils.exception_handler(exc_type, exc_value, exc_traceback)

Print exception with a logger.

pypescript.utils.is_of_type(value, types)

Check type of value.

Parameters
  • value (object) – Value to check type of.

  • types (list, string, type or class) – Types to check value against. If list or tuple, check whether any of the proposed types matches. If a type is a string, search for the corresponding builtin type.

Returns

oftype – Whether value is of any of types.

Return type

bool
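The check described above can be sketched as follows. The function name is_of_type_sketch is hypothetical, and resolving type names through the builtins module is an assumption about how a string such as 'int' is turned into a type.

```python
import builtins


def is_of_type_sketch(value, types):
    """Return True if value is an instance of any of types."""
    if not isinstance(types, (list, tuple)):
        types = [types]
    for type_ in types:
        if isinstance(type_, str):
            # Assumption: strings name builtin types, e.g. 'int' -> int.
            type_ = getattr(builtins, type_)
        if isinstance(value, type_):
            return True
    return False


single = is_of_type_sketch(1, 'int')
none_match = is_of_type_sketch(1.0, ['int', 'str'])
mixed = is_of_type_sketch('a', [int, 'str'])
```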

pypescript.utils.mkdir(dirname)

Try to create dirname and catch OSError.

pypescript.utils.savefile(func)

Wrapper for a class method that saves a file on disk. It creates the file directory (if it does not exist).
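A decorator with this behaviour might be written as below. This is a sketch under the assumption that the file name is the first argument after self; savefile_sketch and the Note class are hypothetical names used only for the demonstration.

```python
import functools
import os
import tempfile


def savefile_sketch(func):
    """Create the parent directory of filename before calling the method."""
    @functools.wraps(func)
    def wrapper(self, filename, *args, **kwargs):
        dirname = os.path.dirname(filename)
        if dirname:
            os.makedirs(dirname, exist_ok=True)
        return func(self, filename, *args, **kwargs)
    return wrapper


class Note:
    def __init__(self, text):
        self.text = text

    @savefile_sketch
    def save(self, filename):
        with open(filename, 'w') as file:
            file.write(self.text)


with tempfile.TemporaryDirectory() as tmp_dir:
    # The 'new/sub' directories do not exist yet; the decorator creates them.
    path = os.path.join(tmp_dir, 'new', 'sub', 'note.txt')
    Note('hello').save(path)
    with open(path) as file:
        saved_text = file.read()
```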

pypescript.utils.setup_logging(level=20, stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, filename=None, filemode='w', **kwargs)

Set up logging.

Parameters
  • level (string, int, default=logging.INFO) – Logging level.

  • stream (_io.TextIOWrapper, default=sys.stdout) – Where to stream.

  • filename (string, default=None) – If not None, stream to file filename.

  • filemode (string, default='w') – Mode to open file, only used if filename is not None.

  • kwargs (dict) – Other arguments for logging.basicConfig().

pypescript.utils.snake_to_pascal_case(snake)

Transform string in snake case (name1_name2) into Pascal case (Name1Name2).
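One way to perform this conversion is sketched below; snake_to_pascal_case_sketch is a hypothetical name and the handling of edge cases (empty words, existing capitals) is an assumption.

```python
def snake_to_pascal_case_sketch(snake):
    """Turn 'name1_name2' into 'Name1Name2'."""
    # Upper-case only the first character of each word, leaving the rest untouched.
    return ''.join(word[:1].upper() + word[1:] for word in snake.split('_'))


pascal = snake_to_pascal_case_sketch('name1_name2')
```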

libutils package

Implementation of numpy.distutils.core.setup() to install pypescript libraries like a charm.

class pypescript.libutils.setup.Extension(*args, module_dir='.', doc=None, description_file=None, **kwargs)

Bases: numpy.distutils.extension.Extension

Extend numpy.distutils.extension.Extension to pypescript extensions (i.e. modules extending Python).

module_dir

Module directory.

Type

string, default='.'

doc

Module documentation, to fill m_doc entry of Python C API’s PyModuleDef.

Type

string, default=None

description_file

Module description file name.

Type

string, default=None

args

Other arguments for numpy.distutils.extension.Extension.

Type

tuple

kwargs

Other arguments for numpy.distutils.extension.Extension.

Type

dict

Note

In numpy.distutils, f2py is called if Fortran source files are provided. Here, f2py is called if Fortran source files are provided AND a (possibly empty) list f2py_options is provided.

from_npextension(ext)

Convert numpy.distutils.extension.Extension ext to Extension.

Note

For compatibility, as in numpy.distutils, f2py will be called if Fortran source files are provided.

has_fortran_sources()

Compile Fortran sources?

pypescript.libutils.setup.addfirst(li, *args)

Add elements in args on top of list li and return the list.

class pypescript.libutils.setup.build_src(dist, **kw)

Bases: numpy.distutils.command.build_src.build_src

Extend numpy.distutils.command.build_src.build_src to generate the C file necessary to turn C/C++/Fortran sources into Python extension modules.

Construct the command for dist, updating vars(self) with any keyword parameters.

pymodule_csource(sources, extension)

Write C source file to compile C/C++/Fortran files as a Python extension.

pypescript.libutils.setup.fortran_pyf_ext_re(string, pos=0, endpos=9223372036854775807)

Matches zero or more characters at the beginning of the string.

class pypescript.libutils.setup.setup(name='pypescript_lib', base_dir='.', sections=None, version=None, author=None, maintainer=None, url=None, description='pypescript library', long_description=None, license=None, packages=None, py_modules=None, ext_modules=None, pype_module_names=None, install_requires=None, extras_require=None, data_files=None, libraries=None, **kwargs)

Bases: object

Class that extends the numpy.distutils.core.setup() function to setup a pypescript library.

Initialize setup and call numpy.distutils.core.setup() to install the pypescript library. Most arguments are similar to those of numpy.distutils.core.setup(). Only supplementary arguments are:

Parameters
  • base_dir (string, default='.') – Root of the directory tree to explore.

  • sections (string, list, default=None) – Section name yaml file, or list of sections (strings).

  • pype_module_names (string, dict, default=None) – Name of file containing a list of modules (w.r.t. base_dir) to install. See utils.read_path_list(). If None, all modules in base_dir are considered. Can be a dictionary of identifiers: list of modules, following the extras_require syntax of numpy.distutils.core.setup().

set_pype_modules(include_pype_module_names=None, exclude_pype_module_names=None)

Set modules to install. Split modules into Python extensions (i.e. to be compiled) pype_ext_modules and standard Python modules pype_modules. It also returns the requirements from each module as specified in their description file.

Generate pypescript modules rst documentation from module description files.

pypescript.libutils.generate_pype_module_doc.generate_pype_modules_rst_doc(section_underline='-', max_line_len=80, **kwargs)

Generate rst tables for several modules of the pypescript library.

Parameters
  • section_underline (string, default='-') – Section underline.

  • max_line_len (int, default=80) – Max table line length. See generate_rst_doc_table().

  • kwargs (dict) – Arguments for utils.walk_pype_modules().

Returns

rst – rst-formatted list of tables.

Return type

string

pypescript.libutils.generate_pype_module_doc.generate_rst_doc_table(rows, max_line_len=80)

Generate rst table from module documentation rows.

Parameters
  • rows (dict) – Documentation dictionary.

  • max_line_len (int, default=80) – Max table line length. Default length chosen to match the width of sphinx_rtd_theme. Note that only the width of the last column of the table will be impacted by this parameter (hence the minimum width of the table is set to the size of all the columns except the last one).

Returns

rst – rst-formatted table.

Return type

string

Note

This function has not been thoroughly tested.

pypescript.libutils.generate_pype_module_doc.write_pype_modules_rst_doc(filename, header='', title='Modules', title_underline='=', **kwargs)

Generate rst file with tables for several modules of the pypescript library.

Parameters
  • filename (string) – Where to save rst file.

  • header (string, default='') – To add on top of the rst file.

  • title (string, default='Modules') – Title.

  • title_underline (string, default='=') – Title underline.

  • kwargs (dict) – Arguments for generate_pype_modules_rst_doc().

Returns

rst – rst-formatted list of tables.

Return type

string

class pypescript.libutils.module_description.ModuleDescription(data=None, string=None, parser=None, **kwargs)

Bases: collections.UserDict

This class handles module description.

Initialize ModuleDescription.

Parameters
  • data (string, ModuleDescription, dict, default=None) – Path to description file. Else, ModuleDescription instance to be (shallow) copied. Else, a dictionary. If None, ignored.

  • string (string, default=None) – String to be parsed and used to update the internal dictionary.

  • parser (callable) – Parser which turns a string into a dictionary.

  • kwargs (dict) – Arguments for syntax_description.Decoder.

classmethod filename_from_module(module)

Return yaml description file name corresponding to Python module module.

classmethod from_module(module)

Return ModuleDescription instance(s) corresponding to Python module module, if they exist; else return None.

classmethod isinstance(filename)

Check whether the description file filename is a pypescript module description file.

classmethod load(filename, **kwargs)

Load a ModuleDescription instance from filename. If several descriptions are found in the same yaml file (i.e. separated by a horizontal --- line), return corresponding ModuleDescription instances.

Parameters
  • filename (string) – Description file name.

  • kwargs (dict) – Arguments for ModuleDescription.

class pypescript.libutils.syntax_description.Decoder(data=None, string=None, parser=<function yaml_parser>, filename=None, decode=True, decode_eval=True, **kwargs)

Bases: collections.UserDict

Class that decodes description dictionary, taking care of template forms.

data

Description dictionary.

Type

dict

filename

Path to corresponding description file.

Type

string

parser

yaml parser.

Type

callable

Initialize Decoder.

Parameters
  • data (dict, string, default=None) – Dictionary or path to a description yaml file to decode.

  • string (string, default=None) – If not None, yaml format string to decode. Added on top of data.

  • parser (callable, default=yaml_parser) – Function that parses yaml string into a dictionary. Used when data is string, or string is not None.

  • filename (string, default=None) – Path to description file. Not used if data is string.

  • decode (bool, default=True) – Whether to decode configuration dictionary, i.e. solve template forms.

  • decode_eval (bool, default=True) – Whether to decode eval template forms.

  • kwargs (dict) – Arguments for parser().

decode(decode_eval=True)

Decode description dictionary data:

  • expand section.name: value entries into {'section': {'name': 'value'}} dictionary

  • replace ${filename:index/name:section.name} by corresponding value in description file at path filename, index/name description (can be several in a file), at section, name keys.

  • replace f'here is the value: ${filename:index/name:section.name}' templates by 'here is the value: value'

  • replace e'42 + ${filename:index/name:section.name}' forms by 42 + value.

Parameters

decode_eval (bool, default=True) – Whether to decode eval template forms.
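The first decoding step listed above, expanding section.name: value entries into nested dictionaries, can be sketched independently of the template forms. The function expand_sections is a hypothetical helper written for illustration; how the Decoder merges conflicting entries is not specified here, so the merge rule below is an assumption.

```python
def expand_sections(flat, sep='.'):
    """Expand {'section.name': value} into {'section': {'name': value}}."""
    nested = {}
    for key, value in flat.items():
        if isinstance(value, dict):
            value = expand_sections(value, sep=sep)
        *sections, name = key.split(sep)
        target = nested
        for section in sections:
            target = target.setdefault(section, {})
        # Assumption: dictionaries landing on the same key are merged.
        if isinstance(value, dict) and isinstance(target.get(name), dict):
            target[name].update(value)
        else:
            target[name] = value
    return nested


simple = expand_sections({'section.name': 'value'})
merged = expand_sections({'a.b': 1, 'a': {'c': 2}})
```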

decode_eval(word)

If word matches template e'42 + ${filename:index/name:section.name}', return 42 + value. Else return None.

decode_format(word)

If word matches template f'here is the value: ${filename:index/name:section.name}', return 'here is the value: value'. Else return None.

decode_replace(word)

If word matches template ${filename:index/name:section.name}, return corresponding value in description file at path filename, index/name description (can be several in a file), at section, name keys. Else return None.

read_file(filename)

Read file at path filename.

search(*keys)

Search value corresponding to the input sequence of keys.

exception pypescript.libutils.syntax_description.ParserError

Bases: Exception

Exception raised when template form parsing fails.

class pypescript.libutils.syntax_description.YamlLoader(stream)

Bases: yaml.loader.SafeLoader

yaml loader that correctly parses numbers. Taken from https://stackoverflow.com/questions/30458977/yaml-loads-5e-6-as-string-and-not-a-number.

Initialize the scanner.

pypescript.libutils.syntax_description.join_sections(words, sep='.')

Join sections with separator sep.

pypescript.libutils.syntax_description.search_in_dict(di, *keys)

Return value corresponding to sequence of keys entries in nested dictionary di.
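A nested lookup of this kind amounts to indexing one level per key, as in the sketch below. The name search_in_dict_sketch is hypothetical, and letting a missing key raise KeyError is an assumption about the error behaviour.

```python
def search_in_dict_sketch(di, *keys):
    """Follow keys one level at a time through nested dictionaries."""
    value = di
    for key in keys:
        value = value[key]  # a missing key raises KeyError
    return value


description = {'section': {'name': 'value'}}
found = search_in_dict_sketch(description, 'section', 'name')
```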

pypescript.libutils.syntax_description.split_sections(word, sep='.', default_section=None)

Split word into a tuple of different sections.

Parameters
  • word (string) – String to be split into different sections.

  • sep (string, default='.') – Separator.

  • default_section (string, default=None) – If not None, and number of sections found is less than 2, add default_section at the beginning.

Returns

sections – Tuple of sections.

Return type

tuple
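Splitting and joining section names can be sketched as follows. Both functions are hypothetical illustrations (note the _sketch suffix) that follow the parameter descriptions above; they are not the pypescript code.

```python
def split_sections_sketch(word, sep='.', default_section=None):
    """Split 'section.name' into ('section', 'name')."""
    sections = tuple(word.split(sep))
    # Prepend the default section when only a bare name was given.
    if default_section is not None and len(sections) < 2:
        sections = (default_section,) + sections
    return sections


def join_sections_sketch(words, sep='.'):
    """Inverse operation: join sections with the separator."""
    return sep.join(words)


full = split_sections_sketch('section.name')
bare = split_sections_sketch('name', default_section='main')
joined = join_sections_sketch(full)
```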

pypescript.libutils.syntax_description.yaml_parser(string, index=None)

Parse string in yaml format.

A few utilities to walk through the pypescript library modules.

pypescript.libutils.utils.mkdir(filename)

Try to create directory of filename and catch OSError.

pypescript.libutils.utils.module_file_name(full_name, base_dir='.')

Return module file name (without extension).

Parameters
  • full_name (string) – Module full name, starting from base_dir. See module_full_name().

  • base_dir (string, default='.') – Base package directory.

Example

>>> module_file_name('/path/to/module/file.py', base_dir='/path/to')
module/file

pypescript.libutils.utils.module_full_name(module_file, base_dir='.')

Return module full name, starting from base_dir.

Parameters
  • module_file (string) – Module file name.

  • base_dir (string, default='.') – Base package directory.

Example

>>> module_full_name('/path/to/module/file.py', base_dir='/path/to')
module.file
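The conversion from a file path to a dotted module name can be sketched with os.path. The function module_full_name_sketch is a hypothetical illustration built only from the description and example of module_full_name; the actual implementation may differ.

```python
import os


def module_full_name_sketch(module_file, base_dir='.'):
    """Return the dotted module name of module_file relative to base_dir."""
    relative = os.path.relpath(module_file, start=base_dir)
    stem = os.path.splitext(relative)[0]  # drop the '.py' extension
    return stem.replace(os.sep, '.')


full_name = module_full_name_sketch('/path/to/module/file.py', base_dir='/path/to')
```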

pypescript.libutils.utils.read_path_list(filename)

Parse list of module names into modules to be included/excluded.

Parameters

filename (string) – File name of module list. This should respect the bash-syntax, except separators are dots.

Returns

  • include (list) – List of module name re patterns to include.

  • exclude (list) – List of module name re patterns to exclude.

Example

To include modules in directory dir, write dir.*. To exclude modules in directory dir starting with mod, write dir.mod*.
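The effect of such patterns on a list of module names can be sketched with the standard library. This is an assumption-laden illustration: fnmatch.translate is used here as one way to turn a shell-style pattern into a regular expression, pattern_to_regex is a hypothetical helper, and how the file itself marks a pattern for exclusion is not covered.

```python
import fnmatch
import re


def pattern_to_regex(pattern):
    """Compile a shell-style pattern such as 'dir.*' into a regex."""
    return re.compile(fnmatch.translate(pattern))


modules = ['dir.mod1', 'dir.other', 'other.mod1']
include = pattern_to_regex('dir.*')      # everything in directory dir
exclude = pattern_to_regex('dir.mod*')   # modules of dir starting with mod
selected = [name for name in modules
            if include.match(name) and not exclude.match(name)]
```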

pypescript.libutils.utils.walk_pype_modules(base_dir='.', include_pype_module_names=None, exclude_pype_module_names=None)

Walk through pypescript modules and yield (module directory, module full name (w.r.t. base_dir), description file name, description dictionary).

Parameters
  • base_dir (string, default='.') – Root of the directory tree to explore.

  • include_pype_module_names (list, default=None) – List of module names (w.r.t. base_dir) or regex to include. If None, all modules in base_dir are considered.

  • exclude_pype_module_names (list, default=None) – List of module names (w.r.t. base_dir) or regex to exclude. If None, no module is excluded.