Commit f153fa0

committed Feb 12, 2019
adding
1 parent 6ff759b commit f153fa0

4 files changed, +16 -0 lines changed


‎articles/stream-analytics/stream-analytics-machine-learning-anomaly-detection.md

Lines changed: 16 additions & 0 deletions
@@ -15,12 +15,28 @@ ms.custom: seodec18

Azure Stream Analytics offers built-in machine learning-based anomaly detection capabilities that can be used to monitor the two most commonly occurring anomalies: temporary and persistent. With the **AnomalyDetection_SpikeAndDip** and **AnomalyDetection_ChangePoint** functions, you can perform anomaly detection directly in your Stream Analytics job.

The machine learning models assume a uniformly sampled time series. If the time series is not uniform, you may insert an aggregation step with a tumbling window prior to calling anomaly detection.
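
For instance, a minimal sketch of such an aggregation step might look like the following. It assumes a hypothetical input named `rawinput` with `eventTime` and `temperature` columns, and it averages readings into one event per second so the series is uniform before anomaly detection is applied.

```SQL
-- Sketch only: the input name (rawinput), the timestamp column (eventTime),
-- the temperature column, and the output name are assumptions.
WITH UniformSeries AS
(
    -- Average raw readings into exactly one event per second so the
    -- downstream anomaly detection model sees a uniformly sampled series.
    SELECT
        System.Timestamp() AS time,
        AVG(CAST(temperature AS float)) AS avgTemperature
    FROM rawinput TIMESTAMP BY eventTime
    GROUP BY TumblingWindow(second, 1)
)
SELECT *
INTO output
FROM UniformSeries
```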

The machine learning operations do not support seasonality trends or multi-variate correlations.

## Model accuracy and performance

Generally, the model's accuracy improves with more data in the sliding window. The data in the specified sliding window is treated as part of its normal range of values for that time frame. The model only considers event history over the sliding window to check if the current event is anomalous. As the sliding window moves, old values are evicted from the model's training.

The functions establish a baseline of normal behavior based on what they have seen so far. Outliers are identified by comparing against that baseline, within the specified confidence level. Base the window size on the minimum number of events required to train the model for normal behavior, so that the model can recognize an anomaly when it occurs.

Keep in mind that the model's response time increases with history size because the model must compare the current event against a larger number of past events. For better performance, include only the necessary number of events.

Gaps in the time series can occur when the model doesn't receive events at certain points in time. Stream Analytics handles this situation through imputation. The history size and the time duration of the same sliding window are used to calculate the average rate at which events are expected to arrive. For example, a history of 120 events in a 2-minute sliding window implies an expected rate of one event per second.
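
As a rough illustration of how these settings fit together, the following sketch (which assumes a hypothetical input named `input` with a numeric `temperature` column) scores each event against a 120-event history in a 2-minute sliding window at a 95% confidence level.

```SQL
-- Sketch only: the input name and the temperature column are assumptions.
SELECT
    AnomalyDetection_SpikeAndDip(
        CAST(temperature AS float),  -- the value to score
        95,                          -- confidence level
        120,                         -- history size: number of events the model trains on
        'spikesanddips')             -- detect both spikes and dips
    OVER (LIMIT DURATION(second, 120)) AS SpikeAndDipScores  -- 2-minute sliding window
FROM input
```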
## Spike and Dip

Temporary anomalies in a time series event stream are known as spikes and dips. Spikes and dips can be monitored using the machine learning-based operator, **AnomalyDetection_SpikeAndDip**.

![Example of spike and dip anomaly](./media/stream-analytics-machine-learning-anomaly-detection/anomaly-detection-spike-dip.png)

In the same sliding window, if a second spike is smaller than the first one, the computed score for the smaller spike is probably not significant enough compared to the score for the first spike within the specified confidence level. You can try decreasing the model's confidence level to catch such anomalies. However, if you start to get too many alerts, you can use a higher confidence level.

The following example query assumes a uniform input rate of one event per second in a 2-minute sliding window with a history of 120 events. The final SELECT statement extracts and outputs the score and anomaly status with a confidence level of 95%.

```SQL
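-- Sketch of the example described above: the input name, the output name, and
-- the 'temperature' column are assumptions; the Score and IsAnomaly fields are
-- extracted from the record returned by the anomaly detection function.
WITH AnomalyDetectionStep AS
(
    SELECT
        System.Timestamp() AS time,
        CAST(temperature AS float) AS temp,
        -- 95% confidence, 120-event history, detect both spikes and dips,
        -- evaluated over a 2-minute (120-second) sliding window.
        AnomalyDetection_SpikeAndDip(CAST(temperature AS float), 95, 120, 'spikesanddips')
            OVER (LIMIT DURATION(second, 120)) AS SpikeAndDipScores
    FROM input
)
SELECT
    time,
    temp,
    CAST(GetRecordPropertyValue(SpikeAndDipScores, 'Score') AS float) AS SpikeAndDipScore,
    CAST(GetRecordPropertyValue(SpikeAndDipScores, 'IsAnomaly') AS bigint) AS IsSpikeAndDipAnomaly
INTO output
FROM AnomalyDetectionStep
```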
