[2]:
# Add ../src to the import path so the local ontime package can be imported
import sys
sys.path.insert(0, '../src')
[3]:
import pandas as pd
import numpy as np
import ontime as on
Detectors#
Detectors produce a binary signal that marks where a TimeSeries meets a condition. The condition can be:
an absolute threshold, or
a statistical threshold defined by a quantile.
Let’s walk through an example.
Generate a TimeSeries#
The data below is generated only so that the example has something to work with.
[4]:
ts = on.generators.random_walk().generate(start=pd.Timestamp('2022-01-01'), end=pd.Timestamp('2022-12-31'))
[ ]:
ts.head(3)
<TimeSeries (DataArray) (time: 3, component: 1, sample: 1)> Size: 24B
array([[[-0.46778339]],

       [[-1.31388058]],

       [[-3.59231678]]])
Coordinates:
  * time       (time) datetime64[ns] 24B 2022-01-01 2022-01-02 2022-01-03
  * component  (component) object 8B 'random_walk'
Dimensions without coordinates: sample
Attributes:
    static_covariates:  None
    hierarchy:          None
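For readers unfamiliar with random walks, the following is a minimal sketch of how such a series can be built with NumPy and pandas. It is not ontime's generator: the helper name `random_walk_series`, the Gaussian step distribution, and the `seed` parameter are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def random_walk_series(start, end, seed=0):
    """Hypothetical helper: daily random walk built from cumulative Gaussian steps."""
    index = pd.date_range(start=start, end=end, freq="D")
    rng = np.random.default_rng(seed)
    # Each step is an independent draw; the walk is the running sum of the steps.
    steps = rng.normal(loc=0.0, scale=1.0, size=len(index))
    return pd.Series(np.cumsum(steps), index=index, name="random_walk")


walk = random_walk_series("2022-01-01", "2022-12-31")
print(len(walk))  # 365 daily points for the year 2022
```

Because each value depends on the previous one, a random walk drifts away from zero over time, which makes it a convenient test signal for threshold-based detection.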
Detection given an absolute threshold#
A single line of code creates an absolute threshold detector.
[6]:
td = on.detectors.threshold(low_threshold=-2)
The detector can now be applied to any TimeSeries as follows:
[7]:
ats = td.detect(ts)
The detect function returns a BinaryTimeSeries, so its values are always 0 or 1.
[8]:
type(ats)
[8]:
ontime.core.time_series.binary_time_series.BinaryTimeSeries
[9]:
ats.plot()
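To make the idea of an absolute threshold concrete, here is a small pandas sketch. It assumes that a value is flagged when it falls strictly below the low bound or strictly above the high bound; ontime's actual behaviour may differ, and the function `threshold_detect` is hypothetical.

```python
import pandas as pd


def threshold_detect(series, low_threshold=None, high_threshold=None):
    """Hypothetical sketch: return a 0/1 series, 1 where a value is outside the bounds."""
    flags = pd.Series(False, index=series.index)
    if low_threshold is not None:
        flags |= series < low_threshold
    if high_threshold is not None:
        flags |= series > high_threshold
    return flags.astype(int)


values = pd.Series(
    [-0.5, -1.3, -3.6, -2.0, 0.4],
    index=pd.date_range("2022-01-01", periods=5, freq="D"),
)
signal = threshold_detect(values, low_threshold=-2)
print(signal.tolist())  # [0, 0, 1, 0, 0]
```

Only -3.6 is flagged in this sketch, because -2.0 is not strictly below the bound.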
Detection given a statistical threshold (quantile)#
The idea is similar, but here the threshold depends on the data. The quantile detector can be instantiated as follows:
[10]:
td = on.detectors.quantile(low_quantile=0.1)
The detector is then fitted on the data:
[11]:
td.fit(ts)
Now, the usage is similar to the threshold detector.
[12]:
ats = td.detect(ts)
The result is again a BinaryTimeSeries, with values of 0 or 1.
[13]:
type(ats)
[13]:
ontime.core.time_series.binary_time_series.BinaryTimeSeries
[14]:
ats.plot()
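The fit-then-detect pattern of a quantile detector can be sketched in plain NumPy. This is an illustration under stated assumptions, not ontime's implementation: the class `SimpleQuantileDetector` is hypothetical, and it assumes that fitting stores the quantile of the training data as a threshold and that values strictly beyond that threshold are flagged.

```python
import numpy as np


class SimpleQuantileDetector:
    """Hypothetical stand-in for a quantile detector (not ontime's code)."""

    def __init__(self, low_quantile=None, high_quantile=None):
        self.low_quantile = low_quantile
        self.high_quantile = high_quantile
        self.low_threshold_ = None
        self.high_threshold_ = None

    def fit(self, values):
        # Turn the requested quantiles into concrete thresholds from the data.
        values = np.asarray(values, dtype=float)
        if self.low_quantile is not None:
            self.low_threshold_ = np.quantile(values, self.low_quantile)
        if self.high_quantile is not None:
            self.high_threshold_ = np.quantile(values, self.high_quantile)
        return self

    def detect(self, values):
        # Flag values that fall outside the fitted thresholds.
        values = np.asarray(values, dtype=float)
        flags = np.zeros(values.shape, dtype=bool)
        if self.low_threshold_ is not None:
            flags |= values < self.low_threshold_
        if self.high_threshold_ is not None:
            flags |= values > self.high_threshold_
        return flags.astype(int)


data = np.arange(100, dtype=float)  # 0.0, 1.0, ..., 99.0
detector = SimpleQuantileDetector(low_quantile=0.1).fit(data)
flags = detector.detect(data)
print(flags.sum())  # 10
```

With `low_quantile=0.1`, about one tenth of the fitted data ends up below the threshold, which is why the threshold adapts to whatever series the detector is fitted on.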
Activate a logger when anomalies are detected#
A logger can optionally be enabled. It produces a DataFrame in which the detections are recorded. As an example, let’s reuse the previous quantile detection and enable logging on a new detector.
[15]:
td = on.detectors.quantile(low_quantile=0.1, enable_logging=True)
td.fit(ts)
Run the detection:
[16]:
ats = td.detect(ts)
Index(['time', 'random_walk'], dtype='object', name='component')
/home/benjy/projects_dev/ontime/src/ontime/core/utils/anomaly_logger.py:46: FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation.
self.log = pd.concat([self.log, log_df])
Inspect the contents of the logger:
[17]:
td.logger.log
[17]:
 | Timestamp | Description | Value |
---|---|---|---|
0 | 2022-01-01 | QuantileDetector | False |
1 | 2022-01-02 | QuantileDetector | False |
2 | 2022-01-03 | QuantileDetector | False |
3 | 2022-01-04 | QuantileDetector | False |
4 | 2022-01-05 | QuantileDetector | False |
... | ... | ... | ... |
360 | 2022-12-27 | QuantileDetector | False |
361 | 2022-12-28 | QuantileDetector | False |
362 | 2022-12-29 | QuantileDetector | False |
363 | 2022-12-30 | QuantileDetector | False |
364 | 2022-12-31 | QuantileDetector | False |
365 rows × 3 columns
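A log with the same three columns can be sketched with pandas as follows. This is not ontime's `anomaly_logger`: the class `SimpleDetectionLogger` and its `record` method are hypothetical. The sketch collects non-empty frames in a list and concatenates them only when the log is read, which is one way to avoid concatenating onto an empty DataFrame.

```python
import pandas as pd


class SimpleDetectionLogger:
    """Hypothetical logger that stores one row per timestamp (not ontime's code)."""

    COLUMNS = ["Timestamp", "Description", "Value"]

    def __init__(self):
        self._frames = []

    def record(self, timestamps, description, flags):
        frame = pd.DataFrame({
            "Timestamp": list(timestamps),
            "Description": description,
            "Value": [bool(f) for f in flags],
        })
        # Skip empty frames so that concat never sees an empty entry.
        if not frame.empty:
            self._frames.append(frame)

    @property
    def log(self):
        if not self._frames:
            return pd.DataFrame(columns=self.COLUMNS)
        return pd.concat(self._frames, ignore_index=True)


logger = SimpleDetectionLogger()
days = pd.date_range("2022-01-01", periods=3, freq="D")
logger.record(days, "QuantileDetector", [0, 1, 0])
print(logger.log)
```

Each call to `record` adds one row per timestamp, with `Value` set to True where the detector flagged the point.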