{ "cells": [ { "cell_type": "code", "execution_count": 2, "id": "9286e0b8-3c78-4b0f-943c-d219e9840dfe", "metadata": { "papermill": { "duration": 0.014825, "end_time": "2024-01-31T17:50:26.847199", "exception": false, "start_time": "2024-01-31T17:50:26.832374", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "# Add ../src to the path so the ontime package can be imported\n", "import sys\n", "sys.path.insert(0, '../src')" ] }, { "cell_type": "code", "execution_count": 3, "id": "2028eed7-b1c3-4c9e-b6a0-00433caa7d0f", "metadata": { "papermill": { "duration": 2.515115, "end_time": "2024-01-31T17:50:29.365394", "exception": false, "start_time": "2024-01-31T17:50:26.850279", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import ontime as on" ] }, { "cell_type": "markdown", "id": "41296cc6-9d84-47c5-8a92-2d292f6f3c4a", "metadata": { "papermill": { "duration": 0.008378, "end_time": "2024-01-31T17:50:26.827798", "exception": false, "start_time": "2024-01-31T17:50:26.819420", "status": "completed" }, "tags": [] }, "source": [ "# Detectors" ] }, { "cell_type": "markdown", "id": "48992359-bd13-4347-8344-8bfb42fb126c", "metadata": {}, "source": [ "Detectors produce a binary signal that marks where a TimeSeries meets a condition. The condition can be:\n", "\n", "- an absolute threshold, or\n", "- a statistical threshold defined by a quantile.\n", "\n", "Let's walk through an example." ] }, { "cell_type": "markdown", "id": "e24da8ab-6a83-4c2f-9ff0-c633d4693a91", "metadata": { "papermill": { "duration": 0.001714, "end_time": "2024-01-31T17:50:29.375771", "exception": false, "start_time": "2024-01-31T17:50:29.374057", "status": "completed" }, "tags": [] }, "source": [ "## Generate a TimeSeries\n", "\n", "This series only serves as sample data for the example." 
] }, { "cell_type": "code", "execution_count": 4, "id": "e9a96d79-0423-4d79-b01d-726193216238", "metadata": { "papermill": { "duration": 0.006608, "end_time": "2024-01-31T17:50:29.384080", "exception": false, "start_time": "2024-01-31T17:50:29.377472", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "ts = on.generators.random_walk().generate(start=pd.Timestamp('2022-01-01'), end=pd.Timestamp('2022-12-31'))" ] }, { "cell_type": "code", "execution_count": 5, "id": "d463df9c-4f02-4c1e-b1a5-7162b9ea8c63", "metadata": { "papermill": { "duration": 0.009125, "end_time": "2024-01-31T17:50:29.394914", "exception": false, "start_time": "2024-01-31T17:50:29.385789", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [
       "<TimeSeries (DataArray) (time: 3, component: 1, sample: 1)> Size: 24B\n",
       "array([[[-0.46778339]],\n",
       "\n",
       "       [[-1.31388058]],\n",
       "\n",
       "       [[-3.59231678]]])\n",
       "Coordinates:\n",
       "  * time       (time) datetime64[ns] 24B 2022-01-01 2022-01-02 2022-01-03\n",
       "  * component  (component) object 8B 'random_walk'\n",
       "Dimensions without coordinates: sample\n",
       "Attributes:\n",
       "    static_covariates:  None\n",
       "    hierarchy:          None"
 ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ts.head(3)" ] }, { "cell_type": "markdown", "id": "5af625dd-ba6b-4f3b-9f42-462fe8918c5a", "metadata": { "papermill": { "duration": 0.001865, "end_time": "2024-01-31T17:50:29.406225", "exception": false, "start_time": "2024-01-31T17:50:29.404360", "status": "completed" }, "tags": [] }, "source": [ "## Detection given an absolute threshold" ] }, { "cell_type": "markdown", "id": "d90f942f-00ee-4953-81e2-795e3d87b292", "metadata": {}, "source": [ "A single line of code creates an absolute threshold detector." ] }, { "cell_type": "code", "execution_count": 6, "id": "8310ade1-a382-4d2a-b139-0331b3b8ebed", "metadata": { "papermill": { "duration": 0.004788, "end_time": "2024-01-31T17:50:29.412834", "exception": false, "start_time": "2024-01-31T17:50:29.408046", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "td = on.detectors.threshold(low_threshold=-2)" ] }, { "cell_type": "markdown", "id": "dc43b9bf-8736-4f90-8089-393ecdd2289f", "metadata": {}, "source": [ "The detector can now be applied to any TimeSeries as follows:" ] }, { "cell_type": "code", "execution_count": 7, "id": "15399a30-e23e-4dae-ac55-f6376f1f23e6", "metadata": {}, "outputs": [], "source": [ "ats = td.detect(ts)" ] }, { "cell_type": "markdown", "id": "7cd14648-4b33-4e49-94f1-2c30ead74886", "metadata": {}, "source": [ "The `detect` method returns a BinaryTimeSeries, whose values are always 0 or 1." 
] }, { "cell_type": "code", "execution_count": 8, "id": "5b3d020e-18cc-47f2-a553-eb00ff972ef3", "metadata": { "papermill": { "duration": 0.197344, "end_time": "2024-01-31T17:50:29.612099", "exception": false, "start_time": "2024-01-31T17:50:29.414755", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "ontime.core.time_series.binary_time_series.BinaryTimeSeries" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(ats)" ] }, { "cell_type": "code", "execution_count": 9, "id": "e1ffc02f-1ec9-4bd2-8fec-c8bfd3909ed9", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "alt.LayerChart(...)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ats.plot()" ] }, { "cell_type": "markdown", "id": "ffbed9d6-d331-4708-8d50-25882c85e60d", "metadata": { "papermill": { "duration": 0.005612, "end_time": "2024-01-31T17:50:29.624161", "exception": false, "start_time": "2024-01-31T17:50:29.618549", "status": "completed" }, "tags": [] }, "source": [ "## Detection given a statistical threshold (quantile)" ] }, { "cell_type": "markdown", "id": "22fa051a-a00a-454b-bfb9-844bb32b9d06", "metadata": {}, "source": [ "The idea is the same, but the threshold depends on the data. The quantile detector is instantiated as follows:" ] }, { "cell_type": "code", "execution_count": 10, "id": "04f2a0c4-5744-46bf-b622-9abaaaf6b35c", "metadata": { "papermill": { "duration": 0.016249, "end_time": "2024-01-31T17:50:29.649601", "exception": false, "start_time": "2024-01-31T17:50:29.633352", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "td = on.detectors.quantile(low_quantile=0.1)" ] }, { "cell_type": "markdown", "id": "675ff78a-4f74-494e-a049-8b239505ab69", "metadata": {}, "source": [ "Next, fit the detector on the data:" ] }, { "cell_type": "code", "execution_count": 11, "id": "02f12ec0-d1cc-41db-ba53-c53a98f6d8f3", "metadata": { "papermill": { "duration": 0.015824, "end_time": "2024-01-31T17:50:29.671330", "exception": false, "start_time": "2024-01-31T17:50:29.655506", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "td.fit(ts)" ] }, { "cell_type": "markdown", "id": "92a54b62-d159-47c1-b493-3d008e17e350", "metadata": {}, "source": [ "Once fitted, the detector is used the same way as the threshold detector." 
] }, { "cell_type": "code", "execution_count": 12, "id": "df0896b7-7857-42ad-9dd2-c831f8fa0123", "metadata": {}, "outputs": [], "source": [ "ats = td.detect(ts)" ] }, { "cell_type": "markdown", "id": "eca142b6-79bc-444f-96c4-d2872361b8a4", "metadata": {}, "source": [ "The result is again a BinaryTimeSeries, with values of 0 or 1." ] }, { "cell_type": "code", "execution_count": 13, "id": "caebf894-6cf9-4937-8748-19edd8123ac5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "ontime.core.time_series.binary_time_series.BinaryTimeSeries" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(ats)" ] }, { "cell_type": "code", "execution_count": 14, "id": "d640d149-f0eb-4d19-9e2b-10926d6fa26f", "metadata": { "papermill": { "duration": 0.159713, "end_time": "2024-01-31T17:50:29.836523", "exception": false, "start_time": "2024-01-31T17:50:29.676810", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "alt.LayerChart(...)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ats.plot()" ] }, { "cell_type": "markdown", "id": "5d09589f-fe40-4c7b-953c-7f507933e03c", "metadata": {}, "source": [ "## Activate a logger when anomalies are detected" ] }, { "cell_type": "markdown", "id": "73316597-fe16-43ac-bd11-86ff0af88d2e", "metadata": {}, "source": [ "Logging is optional. When it is enabled, the detector records each detection result in a DataFrame. As an example, let's reuse the previous quantile detection and create a new detector with `enable_logging=True`." ] }, { "cell_type": "code", "execution_count": 15, "id": "349e0ede-794b-42c9-af57-e8c9d809aad5", "metadata": {}, "outputs": [], "source": [ "td = on.detectors.quantile(low_quantile=0.1, enable_logging=True)\n", "td.fit(ts)" ] }, { "cell_type": "markdown", "id": "a63067db-74e8-4998-8c48-634f1622b3f6", "metadata": {}, "source": [ "Run the detection:" ] }, { "cell_type": "code", "execution_count": 16, "id": "e00da952-230e-40a1-a547-20b8c31b0d7f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['time', 'random_walk'], dtype='object', name='component')\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/benjy/projects_dev/ontime/src/ontime/core/utils/anomaly_logger.py:46: FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation.\n", " self.log = pd.concat([self.log, log_df])\n" ] } ], "source": [ "ats = td.detect(ts)" ] }, { "cell_type": "markdown", "id": "beb1509a-e131-4899-ae06-7f41689898d6", "metadata": {}, "source": [ "Inspect the log:" ] }, { "cell_type": "code", "execution_count": 17, "id": "6b13aa0c-4987-49b5-a2f0-da050c2dab7d", "metadata": {}, "outputs": [ { "data": { "text/plain": [
       "     Timestamp       Description  Value\n",
       "0   2022-01-01  QuantileDetector  False\n",
       "1   2022-01-02  QuantileDetector  False\n",
       "2   2022-01-03  QuantileDetector  False\n",
       "3   2022-01-04  QuantileDetector  False\n",
       "4   2022-01-05  QuantileDetector  False\n",
       "..         ...               ...    ...\n",
       "360 2022-12-27  QuantileDetector  False\n",
       "361 2022-12-28  QuantileDetector  False\n",
       "362 2022-12-29  QuantileDetector  False\n",
       "363 2022-12-30  QuantileDetector  False\n",
       "364 2022-12-31  QuantileDetector  False\n",
       "\n",
       "[365 rows x 3 columns]"
 ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "td.logger.log" ] } ], "metadata": { "kernelspec": { "display_name": "ontime-2OQVvbNf-py3.10", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" }, "papermill": { "default_parameters": {}, "duration": 4.80556, "end_time": "2024-01-31T17:50:30.903022", "environment_variables": {}, "exception": null, "input_path": "docs/user_guide/0_core/0.2-detectors-generators.ipynb", "output_path": "docs/user_guide/0_core/0.2-detectors-generators.ipynb", "parameters": {}, "start_time": "2024-01-31T17:50:26.097462", "version": "2.5.0" } }, "nbformat": 4, "nbformat_minor": 5 }