Data Structures

DataReservoir.io works with two fundamental data structures: series and metadata. These data structures and their Python representations are explained below.

Series

A series is a one-dimensional sequence of numeric values (64-bit float) with unique indices (64-bit integer). (Consequently, each data point, i.e. an index together with its value, is natively represented with 128 bits.) Each series is assigned a unique identifier, TimeSeriesId (guid), for convenient access. Furthermore, a series can be enriched with metadata.

For time series, the index is interpreted as nanoseconds since the epoch (1970-01-01 00:00:00+00:00), i.e. nanosecond resolution is supported.

pandas.Series objects map perfectly to this paradigm. datareservoirio is designed around pandas.Series: it accepts and returns series data in this format.
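As a minimal sketch of this mapping (the sample values and timestamps are made up, and only pandas and NumPy are used here, not the datareservoirio API), a time series can be held as float values on an integer nanosecond index and viewed as timestamps when that is more convenient:

```python
import numpy as np
import pandas as pd

# Values are 64-bit floats; indices are unique 64-bit integers.
values = np.array([1.0, 2.5, 4.0], dtype="float64")
ns_since_epoch = np.array(
    [
        1_704_067_200_000_000_000,  # 2024-01-01 00:00:00 UTC
        1_704_067_200_000_000_001,  # one nanosecond later
        1_704_067_201_000_000_000,  # one second after the first sample
    ],
    dtype="int64",
)
series = pd.Series(values, index=ns_since_epoch)

# Each data point is an 8-byte index plus an 8-byte value, i.e. 128 bits.
bits_per_point = 8 * (series.index.dtype.itemsize + series.dtype.itemsize)

# Interpret the integer index as nanoseconds since the epoch.
as_datetime = series.copy()
as_datetime.index = pd.to_datetime(series.index, unit="ns", utc=True)

# Convert back to integer nanoseconds since the epoch.
epoch = pd.Timestamp("1970-01-01", tz="UTC")
roundtrip = (as_datetime.index - epoch) // pd.Timedelta(1, unit="ns")
```

Computing the integer index as an offset from the epoch avoids relying on the internal resolution of the datetime index.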

Higher-dimensional data, such as tables (dataframes), can be broken down into one-dimensional series; see the Cookbook for examples.
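One possible way to do this with plain pandas is sketched below. The column names and values are hypothetical, and the Cookbook remains the reference for how to handle this with datareservoirio itself:

```python
import pandas as pd

# A small table with two columns sharing one integer (nanosecond) index.
df = pd.DataFrame(
    {"voltage": [230.1, 229.8], "temperature": [21.5, 21.7]},
    index=[0, 1_000_000_000],
)

# Break the table down into one pandas.Series per column.
series_by_column = {name: df[name] for name in df.columns}

# The table can be reassembled from the individual series.
rebuilt = pd.DataFrame(series_by_column)
```

Each series in `series_by_column` is one-dimensional and keeps the shared index, so it fits the series structure described above.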

Metadata

Metadata is a set of key-value pairs that can be associated with one or more series. The purpose of metadata is to enrich series data with essential information such as units, origin, description, etc. The same metadata can also be used to search and find series data later.

DataReservoir.io employs a "schemaless" metadata store. That is, there are no minimum requirements and you can basically add anything your heart desires. However...

"With great power comes great responsibility!"

—Uncle Ben

But that responsibility is left to the user, app, or service that uses DataReservoir.io, simply because one size does not fit all!

Metadata entries are organized using a namespace and a key. A namespace can be thought of as a table, and the key as the row index. A row can then have an arbitrary number of columns. (Note that rows in a table do not have to share the same columns!) This resembles the "table storage" paradigm, for those who are familiar with it.

Thus, a namespace and key combination uniquely defines a metadata entry in DataReservoir.io. (That is, only one entry with a given namespace and key combination can exist in the entire system.) In addition, each entry is assigned an alias, MetadataId (guid), that can be used for direct and convenient access.

A table-like representation may look like this:

Namespace: vessel.electrical

Keys           Units  Vendor     Type
-------------  -----  ---------  ------
Voltmeter A    V      Company S
Thermometer Z  C                 Analog
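To make the namespace/key model concrete, here is a rough sketch in plain Python. It is not the DataReservoir.io API: `metadata_store` and `find_entries` are hypothetical names, and "Analog" is assumed to belong to the Type column. Each (namespace, key) pair maps to exactly one entry, and entries need not share columns:

```python
# Hypothetical in-memory model: one entry per (namespace, key) pair.
metadata_store = {
    ("vessel.electrical", "Voltmeter A"): {"Units": "V", "Vendor": "Company S"},
    ("vessel.electrical", "Thermometer Z"): {"Units": "C", "Type": "Analog"},
}


def find_entries(store, namespace, **criteria):
    """Return the keys in `namespace` whose columns match all criteria."""
    return [
        key
        for (ns, key), columns in store.items()
        if ns == namespace
        and all(columns.get(name) == value for name, value in criteria.items())
    ]


# Writing to an existing (namespace, key) pair replaces that entry;
# it never creates a second one.
metadata_store[("vessel.electrical", "Voltmeter A")] = {
    "Units": "V",
    "Vendor": "Company S",
}

volt_keys = find_entries(metadata_store, "vessel.electrical", Units="V")
analog_keys = find_entries(metadata_store, "vessel.electrical", Type="Analog")
```

Because the dictionary is keyed on the (namespace, key) tuple, uniqueness follows from the data structure itself, and a missing column simply never matches a search criterion.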

Best practices

DataReservoir.io won't enforce a particular schema when it comes to metadata, but we can suggest smart ways of approaching it.

One simple yet effective way of creating a hierarchy and taxonomy is to use period-separated names as namespaces, e.g.:

  • vessels.galactica.electrical

  • service.context.sensors

  • application.streams

We have found that this approach is rather easy to visualize and maps well to the physical world.
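As an illustration of how such names can be turned into a hierarchy on the client side, here is a small plain-Python sketch. The helper `build_namespace_tree` is hypothetical and not part of datareservoirio:

```python
def build_namespace_tree(namespaces):
    """Nest period-separated namespace names into a dictionary tree."""
    tree = {}
    for namespace in namespaces:
        node = tree
        for part in namespace.split("."):
            # Descend into the child node, creating it if it is missing.
            node = node.setdefault(part, {})
    return tree


tree = build_namespace_tree(
    [
        "vessels.galactica.electrical",
        "vessels.galactica.mechanical",  # made-up sibling for illustration
        "service.context.sensors",
        "application.streams",
    ]
)
```

Namespaces that share a prefix end up under the same branch, which is what makes the period-separated convention easy to browse and visualize.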

What is it NOT for

Despite its flexibility, DataReservoir.io has its limitations when it comes to metadata: it is NOT a general-purpose database that you can dump anything into, and it is not designed to keep track of complex hierarchical information. The query capabilities are also kept simple and efficient by design.

For very advanced use cases, it may be advisable to employ a purpose-built database solution (that complements DataReservoir.io for your application).