Access time series#

Access existing data#

You can access any data for which you have the TimeSeriesId (and authorization). It is possible to query aggregated data directly, e.g. you can query 1 minute average values for a specified period. The finest available aggregation period is “tick” (100 nanoseconds).

# set up
import datareservoirio as drio
import numpy as np
import pandas as pd

auth = drio.Authenticator()
# Follow instructions to authenticate

client = drio.Client(auth)

# Get timeseries data resampled to 15 minutes average for selected period
timeseries = client.get_samples_aggregate(series_id,
                        start='2024-01-01', end='2024-01-02',
                        aggregation_period='15m',
                        aggregation_function='mean')

# Get all data for selected time period
timeseries = client.get_samples_aggregate(series_id,
                        start='2024-01-01', end='2024-01-02',
                        aggregation_period='tick',
                        aggregation_function='mean')

Note

Client.get_samples_aggregate() returns a pandas.Series. The start, end, aggregation_period and aggregation_function parameters are required.
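
The returned series can be inspected and manipulated with the usual pandas tools. A minimal sketch, assuming the series is indexed by timestamps:

# Inspect the aggregated series returned above
print(timeseries.head())      # first few aggregated values
print(timeseries.index[:3])   # timestamps of the first samples
print(timeseries.describe())  # basic statistics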

Important

Time series data is archived 90 days after upload. Archived data can be accessed directly with the Client.get() method, or it can be restored by contacting support.

Access archived data#

You can access time series data using the Client.get() method, as long as you have the TimeSeriesId (and authorization). Note that this method only returns the raw data and does not support aggregation. Below is an example demonstrating how to access archived time series data. We strongly recommend using Client.get_samples_aggregate() as long as the data was uploaded within the last 90 days, or contacting support to restore archived data.

# Get entire timeseries
timeseries = client.get(series_id)

# Get a slice of time series
timeseries = client.get(series_id, start='2018-01-01 12:00:00',
                        end='2018-01-02 06:00:00')

Warning

The time resolution of aggregated data is in ticks (1 tick = 100 nanoseconds), while the time resolution of non-aggregated data is in nanoseconds. This may lead to discrepancies when comparing the two, and some datapoints may be lost when accessing data through aggregation if multiple datapoints fall within the same 100-nanosecond interval.
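
One way to check whether any datapoints were collapsed by tick aggregation is to compare the number of samples returned by the two access methods over the same window. A minimal sketch (the time window below is illustrative):

# Compare sample counts from aggregated ("tick") and raw access
aggregated = client.get_samples_aggregate(series_id,
                        start='2024-01-01', end='2024-01-02',
                        aggregation_period='tick',
                        aggregation_function='mean')
raw = client.get(series_id, start='2024-01-01', end='2024-01-02')

if len(raw) != len(aggregated):
    print(f"{len(raw) - len(aggregated)} datapoint(s) collapsed by tick aggregation")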

Warning

We introduced a breaking change in DataReservoir which affects data ingested after approximately 10 March 2025 10:00 CET. With this change, booleans are no longer automatically converted to the integer values 1 and 0. If this conversion is desired, it can be done manually after downloading the data, as shown below:

timeseries.replace({True: 1, 'true': 1, 'True': 1, False: 0, 'false': 0, 'False': 0}, inplace=True)
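
After the replacement, the series may still have object dtype. If you are certain it now contains only 0 and 1 (and no missing values), you could additionally cast it to an integer dtype:

# Only safe if the series now holds 0/1 exclusively
timeseries = timeseries.astype("int64")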

Tip

When handling high-frequency data and/or extended timespans, it is crucial to consider memory usage. Requesting an excessive amount of data at once can exhaust memory and cause your script to fail. The following is a recommended approach for accessing data in smaller chunks:

# Make a date iterator
start_end = pd.date_range(start="2020-01-01 00:00", end="2020-02-01 00:00", freq="1h")
start_end_iter = zip(start_end[:-1], start_end[1:])

series_id = "<your time series ID>"  # replace with your TimeSeriesId


# Get timeseries in chunks
for start, end in start_end_iter:
    timeseries = client.get(series_id, start=start, end=end)
    # Process or store each chunk here, before the next iteration overwrites it
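
If the full-resolution data does not fit in memory, each chunk can be reduced before it is combined with the others, so that only the reduced data is kept. A minimal sketch, assuming the returned series is indexed by timestamps and that 1-minute averages are an acceptable reduction:

# Downsample each chunk before combining, keeping only the reduced data in memory
chunks = []
for start, end in zip(start_end[:-1], start_end[1:]):
    chunk = client.get(series_id, start=start, end=end)
    chunks.append(chunk.resample("1min").mean())

downsampled = pd.concat(chunks)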