[2]:
# Add ../src to the module search path so the ontime package can be imported
import sys
sys.path.insert(0, '../src')
[3]:
import pandas as pd
import numpy as np
import ontime as on

Detectors#

A detector turns a TimeSeries into a binary signal that marks where a given condition holds. The condition can be:

  • an absolute threshold, or

  • a statistical threshold defined by a quantile.

Let’s walk through an example.

Generate a TimeSeries#

This step only generates some data to work with in the example.

[4]:
ts = on.generators.random_walk().generate(start=pd.Timestamp('2022-01-01'), end=pd.Timestamp('2022-12-31'))
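For readers who want to see what such data looks like without the library, here is a minimal sketch of a daily random walk built with NumPy and pandas. It is not the implementation behind `on.generators.random_walk()`; the normally distributed steps, the seed, and the names `rng`, `steps`, and `walk` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the generator: a daily random walk built by hand.
rng = np.random.default_rng(seed=0)  # seeded so the sketch is reproducible
index = pd.date_range("2022-01-01", "2022-12-31", freq="D")

# A random walk is the cumulative sum of independent random steps
# (standard normal steps are an assumption, not the library's choice).
steps = rng.normal(loc=0.0, scale=1.0, size=len(index))
walk = pd.Series(np.cumsum(steps), index=index, name="random_walk")

print(walk.head(3))
```

The series covers every day of 2022, so it has 365 points, the same length as the generated TimeSeries used in the rest of this page.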
[ ]:
ts.head(3)
<TimeSeries (DataArray) (time: 3, component: 1, sample: 1)> Size: 24B
array([[[-0.46778339]],

       [[-1.31388058]],

       [[-3.59231678]]])
Coordinates:
  * time       (time) datetime64[ns] 24B 2022-01-01 2022-01-02 2022-01-03
  * component  (component) object 8B 'random_walk'
Dimensions without coordinates: sample
Attributes:
    static_covariates:  None
    hierarchy:          None

Detection given an absolute threshold#

A single line of code creates an absolute threshold detector.

[6]:
td = on.detectors.threshold(low_threshold=-2)

Now, a detection can be run on any TimeSeries as follows:

[7]:
ats = td.detect(ts)

The return type of the detect function is a BinaryTimeSeries, meaning that its values are always 0 or 1.
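To make the idea of a 0/1 signal concrete, the following sketch applies a low threshold to a small pandas Series. It only illustrates the concept and is not how the detector is implemented; in particular, flagging values strictly below the threshold is an assumption, and `values` and `flags` are illustrative names.

```python
import pandas as pd

# Five hand-picked daily values, some of them below -2.
values = pd.Series(
    [0.5, -1.0, -2.5, -3.0, -1.5],
    index=pd.date_range("2022-01-01", periods=5, freq="D"),
)

low_threshold = -2  # same value as in the detector cell above

# Assumption: a point is flagged when it falls strictly below the low threshold.
flags = (values < low_threshold).astype(int)

print(flags.tolist())  # [0, 0, 1, 1, 0]
```

The result keeps the time index of the input and contains only 0s and 1s, which is the property a BinaryTimeSeries guarantees.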

[8]:
type(ats)
[8]:
ontime.core.time_series.binary_time_series.BinaryTimeSeries
[9]:
ats.plot()
[9]:

Detection given a statistical threshold (quantile)#

Here the idea is similar, but the threshold depends on the data. The quantile detector can be instantiated as follows:

[10]:
td = on.detectors.quantile(low_quantile=0.1)

It can then be fitted to the data:

[11]:
td.fit(ts)

Now, the usage is similar to the threshold detector.

[12]:
ats = td.detect(ts)

The result is also a BinaryTimeSeries, with values of 0 or 1.
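The fit-then-detect pattern can be sketched with plain pandas: the threshold is first learned from reference data as a quantile, then applied to the data to check. This is a conceptual illustration under stated assumptions, not the library's code; the strict comparison and the names `history`, `new_values`, and `threshold` are hypothetical.

```python
import pandas as pd

# Reference data used to learn the threshold: 1.0, 2.0, ..., 10.0
history = pd.Series([float(v) for v in range(1, 11)])

# "Fit": the 10% quantile of the reference data becomes the low threshold.
low_quantile = 0.1
threshold = history.quantile(low_quantile)  # linear interpolation by default

# "Detect": flag points that fall strictly below the learned threshold
# (the strict comparison is an assumption).
new_values = pd.Series([0.5, 3.0, 12.0])
flags = (new_values < threshold).astype(int)

print(round(threshold, 2))  # 1.9
print(flags.tolist())       # [1, 0, 0]
```

Because the threshold comes from the data, the same `low_quantile` yields different cut-off values on different series, which is the main difference from the absolute threshold.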

[13]:
type(ats)
[13]:
ontime.core.time_series.binary_time_series.BinaryTimeSeries
[14]:
ats.plot()
[14]:

Activate a logger when anomalies are detected#

Upon request, a logger can be activated; it produces a DataFrame in which the detection results are recorded. As an example, let’s reuse the previous detection code and enable logging on a new quantile detector.

[15]:
td = on.detectors.quantile(low_quantile=0.1, enable_logging=True)
td.fit(ts)

Run the detection:

[16]:
ats = td.detect(ts)
Index(['time', 'random_walk'], dtype='object', name='component')
/home/benjy/projects_dev/ontime/src/ontime/core/utils/anomaly_logger.py:46: FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation.
  self.log = pd.concat([self.log, log_df])

Check what the logger contains:

[17]:
td.logger.log
[17]:
      Timestamp       Description  Value
0    2022-01-01  QuantileDetector  False
1    2022-01-02  QuantileDetector  False
2    2022-01-03  QuantileDetector  False
3    2022-01-04  QuantileDetector  False
4    2022-01-05  QuantileDetector  False
...         ...               ...    ...
360  2022-12-27  QuantileDetector  False
361  2022-12-28  QuantileDetector  False
362  2022-12-29  QuantileDetector  False
363  2022-12-30  QuantileDetector  False
364  2022-12-31  QuantileDetector  False

365 rows × 3 columns
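Since the log holds one row per timestamp, a common next step is to keep only the rows where something was detected. The sketch below does this on a small hand-made DataFrame with the same three columns as the log above. The data is invented for illustration, and it assumes that a Value of True marks a detected anomaly.

```python
import pandas as pd

# Hypothetical log with the same columns as the logger output above.
log = pd.DataFrame(
    {
        "Timestamp": pd.to_datetime(
            ["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04"]
        ),
        "Description": ["QuantileDetector"] * 4,
        "Value": [False, True, False, True],
    }
)

# Assumption: Value is True where an anomaly was detected.
# astype(bool) guards against the column being stored with object dtype.
anomalies = log[log["Value"].astype(bool)]

print(anomalies["Timestamp"].dt.strftime("%Y-%m-%d").tolist())
# ['2022-01-02', '2022-01-04']
```

The same filter could presumably be applied to `td.logger.log` directly, provided its Value column holds booleans as the displayed table suggests.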