ResultCollector

class fourinsight.engineroom.utils.ResultCollector(headers, handler=None, indexing_mode='auto')

Collect and store indexed results.

This class provides a simple interface to collect, store, and index intermediate results. The results are stored internally in a pandas.DataFrame. Using a handler, the results can be pushed to or pulled from a remote source.

Parameters:
  • headers (dict) – Header names and data types as key/value pairs; int, float, and str are allowed as data types. The collector will only accept intermediate results defined here.

  • handler (object) – Handler extended from BaseHandler. Default handler is NullHandler, which does not provide any push or pull functionality.

  • indexing_mode (str) – Indexing mode. Should be ‘auto’ (default) or ‘timestamp’.

Notes

The data types are cast to their pandas equivalents, so that missing values are handled correctly.

append(dataframe)

Append rows of dataframe to the results.

Columns of dataframe must be in the headers.

Parameters:

dataframe (pandas.DataFrame) – The results to append.

collect(**results)

Collect and store results under the current index.

Parameters:

results (keyword arguments) – The results are passed as keyword arguments, where the keyword must be one of the ‘headers’. Provided values must be of correct data type (defined during instantiation).
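
The header and type checks described for collect can be pictured with a small pure-Python stand-in. This is only a sketch under stated assumptions: MiniCollector, its rows property, and the exception types are hypothetical and chosen for illustration. The real ResultCollector stores its results in a pandas.DataFrame and may report errors differently.

```python
class MiniCollector:
    """Hypothetical stand-in for illustration; not the library's implementation."""

    def __init__(self, headers):
        # headers maps a result name to its allowed type: int, float, or str
        for name, dtype in headers.items():
            if dtype not in (int, float, str):
                raise ValueError(f"Unsupported data type for {name!r}: {dtype}")
        self._headers = dict(headers)
        self._rows = []  # one dict per row; the list position acts as the 'auto' index

    def new_row(self):
        # Start a new row with every header initialised to a missing value
        self._rows.append({name: None for name in self._headers})

    def collect(self, **results):
        # Store the results in the current (most recently created) row
        if not self._rows:
            raise RuntimeError("Call new_row() before collect().")
        for name, value in results.items():
            if name not in self._headers:
                raise KeyError(f"{name!r} is not one of the headers")
            if not isinstance(value, self._headers[name]):
                raise TypeError(f"{name!r} expects {self._headers[name].__name__}")
            self._rows[-1][name] = value

    @property
    def rows(self):
        # Return copies so callers cannot modify the internal state
        return [dict(row) for row in self._rows]


collector = MiniCollector({"rpm": float, "status": str})
collector.new_row()
collector.collect(rpm=1200.0, status="ok")
collector.new_row()
collector.collect(rpm=1180.5)  # 'status' stays missing for this row
```

In this sketch, a keyword that is not one of the headers is rejected with a KeyError, and a value of the wrong type with a TypeError.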

property dataframe

Return a (deep) copy of the internal dataframe.

delete_rows(index)

Delete rows.

The index will be reset if ‘indexing_mode’ is set to ‘auto’.

Parameters:

index (single label or list-like) – Index labels to drop.
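
To make the index reset concrete, the following sketch models 'auto' mode with a plain dict that maps index labels to rows. The helper delete_rows_auto is hypothetical and only mirrors the behaviour described above; it is not the library's code.

```python
def delete_rows_auto(rows, index):
    """Drop the rows at the given index labels, then renumber from 0."""
    # Accept a single label or a list-like of labels
    labels = {index} if isinstance(index, int) else set(index)
    kept = [row for label, row in rows.items() if label not in labels]
    # 'auto' mode: the surviving rows get a fresh 0..n-1 index
    return dict(enumerate(kept))


rows = {0: "a", 1: "b", 2: "c", 3: "d"}
remaining = delete_rows_auto(rows, [1, 2])
print(remaining)  # {0: 'a', 1: 'd'}
```

The row that had index 3 ends up at index 1, so index values held from before the deletion may no longer refer to the same rows.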

new_row(index=None)

Make a new row.

Parameters:

index (None or datetime-like) – The new index value. If indexing_mode is set to ‘auto’, index should be None. If indexing_mode is set to ‘timestamp’, index should be a unique datetime that is passed on to pandas.to_datetime().

pull(raise_on_missing=True, strict=True)

Pull results from source. Remote source overwrites existing values.

Parameters:
  • raise_on_missing (bool) – Raise an exception if results cannot be pulled from source.

  • strict (bool) – Whether to be strict with respect to headers in the source. Setting strict=True (default) requires the source to have exactly the same headers as the ResultCollector. Setting strict=False allows partial results to be pulled (i.e., headers that do not have results in the source are populated with None values).
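
The difference between strict=True and strict=False can be sketched with plain dictionaries standing in for the pulled rows. The function align_to_headers is hypothetical, and rejecting unknown source headers in non-strict mode is an assumption of this sketch rather than documented behaviour.

```python
def align_to_headers(source_rows, headers, strict=True):
    """Check pulled rows against the collector's headers and fill any gaps."""
    expected = set(headers)
    aligned = []
    for row in source_rows:
        found = set(row)
        if strict and found != expected:
            raise ValueError("Source headers must match the collector headers exactly")
        if not found <= expected:
            # Assumption of this sketch: unknown headers are always rejected
            raise ValueError(f"Unknown headers in source: {sorted(found - expected)}")
        # Headers without results in the source are populated with None
        aligned.append({name: row.get(name) for name in headers})
    return aligned


headers = ["rpm", "status"]
partial = [{"rpm": 1200.0}]  # the source has no 'status' results
filled = align_to_headers(partial, headers, strict=False)
print(filled)  # [{'rpm': 1200.0, 'status': None}]
```

Calling align_to_headers(partial, headers) with the default strict=True raises a ValueError in this sketch, because the source lacks the 'status' header.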

push()

Push results to source.

truncate(before=None, after=None)

Truncate results by deleting rows before and/or after given index values.

The index will be reset if ‘indexing_mode’ is set to ‘auto’.

Parameters:
  • before (int or datetime-like, optional) – Delete results with index smaller than this value.

  • after (int or datetime-like, optional) – Delete results with index greater than this value.
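
The truncation rule above (rows with an index smaller than before or greater than after are deleted, so both bounds are kept) can be sketched as follows. The helper truncate_rows and its reset_index flag are hypothetical; the flag stands in for the 'auto' indexing mode, and a dict stands in for the internal dataframe.

```python
from datetime import datetime


def truncate_rows(rows, before=None, after=None, reset_index=False):
    """Keep only rows whose index label lies within [before, after]."""
    kept = {
        label: row
        for label, row in rows.items()
        if (before is None or label >= before) and (after is None or label <= after)
    }
    if reset_index:
        # 'auto' mode: renumber the surviving rows from 0
        return dict(enumerate(kept.values()))
    return kept


# 'auto'-style integer index: labels 1 and 2 survive and are renumbered
auto_rows = {0: "a", 1: "b", 2: "c", 3: "d"}
auto_result = truncate_rows(auto_rows, before=1, after=2, reset_index=True)
print(auto_result)  # {0: 'b', 1: 'c'}

# 'timestamp'-style index: labels are kept as they are
stamped = {
    datetime(2024, 1, 1): "a",
    datetime(2024, 1, 2): "b",
    datetime(2024, 1, 3): "c",
}
stamped_result = truncate_rows(stamped, before=datetime(2024, 1, 2))
print(list(stamped_result.values()))  # ['b', 'c']
```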