
DONUT

Build status: https://travis-ci.org/haowen-xu/donut.svg?branch=master
Test coverage: https://coveralls.io/repos/github/haowen-xu/donut/badge.svg?branch=master

Donut is an anomaly detection algorithm for seasonal KPIs (key performance indicators).

Citation

@inproceedings{donut,
  title={Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications},
  author={Xu, Haowen and Chen, Wenxiao and Zhao, Nengwen and Li, Zeyan and Bu, Jiahao and Li, Zhihan and Liu, Ying and Zhao, Youjian and Pei, Dan and Feng, Yang and others},
  booktitle={Proceedings of the 2018 World Wide Web Conference on World Wide Web},
  pages={187--196},
  year={2018},
  organization={International World Wide Web Conferences Steering Committee}
}

Dependencies

TensorFlow >= 1.5
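
For example, a compatible TensorFlow can be installed with pip. The upper version bound below is an assumption, not from the original instructions: the code base targets the TensorFlow 1.x API, which was removed in 2.x.

pip install "tensorflow>=1.5,<2.0"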

Installation

Check out this repository and execute:

pip install git+https://github.com/thu-ml/zhusuan.git
pip install git+https://github.com/haowen-xu/tfsnippet.git@v0.1.2
pip install .

This will first install ZhuSuan and TFSnippet, the two major dependencies of Donut, then install the Donut package itself.
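
As a quick sanity check (an optional step, not part of the official instructions), confirm that the package imports cleanly:

python -c "import donut"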

API Usage

To prepare the data:

import numpy as np
from donut import complete_timestamp, standardize_kpi

# Read the raw data.
timestamp, values, labels = ...
# If there is no label, simply use all zeros.
labels = np.zeros_like(values, dtype=np.int32)

# Complete the timestamp, and obtain the missing point indicators.
timestamp, missing, (values, labels) = \
    complete_timestamp(timestamp, (values, labels))

# Split the training and testing data.
test_portion = 0.3
test_n = int(len(values) * test_portion)
train_values, test_values = values[:-test_n], values[-test_n:]
train_labels, test_labels = labels[:-test_n], labels[-test_n:]
train_missing, test_missing = missing[:-test_n], missing[-test_n:]

# Standardize the training and testing data, excluding the anomalous
# and missing points from the mean/std estimation.
train_values, mean, std = standardize_kpi(
    train_values, excludes=np.logical_or(train_labels, train_missing))
test_values, _, _ = standardize_kpi(test_values, mean=mean, std=std)
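
For intuition, `complete_timestamp` reindexes the series onto an evenly spaced timestamp grid, filling any gaps and returning indicators for the filled points. A hypothetical illustration, assuming an interval of 1 between points:

# timestamp = [1, 2, 4, 5]   is completed to   [1, 2, 3, 4, 5]
# missing   =                                  [0, 0, 1, 0, 0]
# the filled-in entries of `values` and `labels` are zeros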

To construct a Donut model:

import tensorflow as tf
from donut import Donut
from tensorflow import keras as K
from tfsnippet.modules import Sequential

# Build the entire model within the scope of `model_vs`, so that it
# holds exactly all the variables of `model`, including the variables
# created by the Keras layers.
with tf.variable_scope('model') as model_vs:
    model = Donut(
        # Hidden network of p(x|z), the decoder.
        h_for_p_x=Sequential([
            K.layers.Dense(100, kernel_regularizer=K.regularizers.l2(0.001),
                           activation=tf.nn.relu),
            K.layers.Dense(100, kernel_regularizer=K.regularizers.l2(0.001),
                           activation=tf.nn.relu),
        ]),
        # Hidden network of q(z|x), the encoder.
        h_for_q_z=Sequential([
            K.layers.Dense(100, kernel_regularizer=K.regularizers.l2(0.001),
                           activation=tf.nn.relu),
            K.layers.Dense(100, kernel_regularizer=K.regularizers.l2(0.001),
                           activation=tf.nn.relu),
        ]),
        x_dims=120,  # length of each sliding window
        z_dims=5,    # dimensionality of the latent variable z
    )

To train the Donut model and use a trained model for prediction:

from donut import DonutTrainer, DonutPredictor

trainer = DonutTrainer(model=model, model_vs=model_vs)
predictor = DonutPredictor(model)

with tf.Session().as_default():
    trainer.fit(train_values, train_labels, train_missing, mean, std)
    test_score = predictor.get_score(test_values, test_missing)
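
The returned `test_score` contains one score per sliding window, aligned to the last point of each window. Below is a minimal sketch of turning scores into binary detections; the threshold value is purely hypothetical (the paper instead reports the best F-score over all candidate thresholds), and the comparison assumes the convention that a lower score means a point is more likely anomalous:

import numpy as np

# Each score corresponds to the last point of a 120-point window,
# so the first `x_dims - 1` test points carry no score.
aligned_test_values = test_values[120 - 1:]  # same length as `test_score`

threshold = -10.0  # hypothetical value; tune on labeled data instead
detected = test_score < threshold  # lower score = more likely anomalous
print('Detected %d of %d points as anomalous'
      % (int(detected.sum()), len(detected)))
print('Values at detected points:', aligned_test_values[detected])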

To save and restore a trained model:

from tfsnippet.utils import get_variables_as_dict, VariableSaver

with tf.Session().as_default():
    # Train the model.
    ...
    # Remember to get the model variables after the birth of a
    # `predictor` or a `trainer`.  The :class:`Donut` instance
    # does not build the graph until :meth:`Donut.get_score` or
    # :meth:`Donut.get_training_loss` is called, which is done
    # in the `predictor` or the `trainer`.
    var_dict = get_variables_as_dict(model_vs)
    # Save variables to `save_dir`.
    saver = VariableSaver(var_dict, save_dir)
    saver.save()

with tf.Session().as_default():
    # Restore variables from `save_dir`.
    saver = VariableSaver(get_variables_as_dict(model_vs), save_dir)
    saver.restore()

If you need more advanced outputs from the model, you may derive them by using `model.vae` directly, for example:

from donut import iterative_masked_reconstruct

# Obtain the reconstructed `x`, with MCMC missing data imputation.
# See also:
#   :meth:`donut.Donut.get_score`
#   :func:`donut.iterative_masked_reconstruct`
#   :meth:`tfsnippet.modules.VAE.reconstruct`
input_x = ...  # 2-D `float32` :class:`tf.Tensor`, input `x` windows
input_y = ...  # 2-D `int32` :class:`tf.Tensor`, missing point indicators
               # for the `x` windows
mcmc_iteration = ...  # number of MCMC missing-data imputation iterations
x = model.vae.reconstruct(
    iterative_masked_reconstruct(
        reconstruct=model.vae.reconstruct,
        x=input_x,
        mask=input_y,
        iter_count=mcmc_iteration,
        back_prop=False,
    )
)

# `x` is a :class:`tfsnippet.stochastic.StochasticTensor`, from which
# you may derive many useful outputs, for example:
x.tensor  # the `x` samples
x.log_prob(group_ndims=0)  # element-wise log p(x|z) of sampled x
x.distribution.log_prob(input_x)  # the reconstruction probability
x.distribution.mean, x.distribution.std  # mean and std of p(x|z)
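
A hypothetical sketch of actually evaluating these outputs follows. The placeholder shapes and the array names `windows` and `missing_windows` are assumptions for illustration, not part of the Donut API, and the trained variables must be restored into the session first:

import numpy as np
import tensorflow as tf

# Hypothetical placeholders for a batch of 120-point windows,
# matching the `x_dims` used when constructing the model above.
input_x = tf.placeholder(tf.float32, shape=[None, 120])
input_y = tf.placeholder(tf.int32, shape=[None, 120])
mcmc_iteration = 10  # assumed number of imputation iterations

# ... build `x` from `input_x` and `input_y` as in the snippet above ...

recon_prob = x.distribution.log_prob(input_x)
with tf.Session().as_default() as sess:
    # Restore the trained variables first (see the saving section above).
    prob = sess.run(recon_prob, feed_dict={
        input_x: windows,          # numpy array of shape (batch, 120)
        input_y: missing_windows,  # numpy array of shape (batch, 120)
    })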
