
Applications

The FOUNDATION Platform COS supports a variety of input types, including video, images, industrial control (SCADA), and network information security; others are in development. For each supported input data type, a dedicated extension processes and normalizes that data. Input modules may also include data input and output adapters to support various industry-standard data feed types. Additionally, data-specific reporting may augment the core analytics reports.

Training

The FOUNDATION Platform Learning Engine trains itself through observation, using unsupervised learning to establish a running baseline of normal events for a sensor. The Learning Engine uses an initial training period to determine patterns of normal events. Once this period is complete, it continues to learn and update its models, but it can now issue alerts when a previously unseen event is detected. By continuously learning and adjusting the model, the system adapts to input fluctuations and to sudden or gradual pattern shifts.
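As a rough illustration of this training-then-alert behavior, the sketch below keeps a running statistical baseline, suppresses alerts during an initial training period, and continues updating the baseline afterwards so it can track drift. The running mean/variance model and the parameter names are assumptions for illustration, not the platform's actual learning algorithm.

```python
# Illustrative sketch only: a running baseline that learns during a training
# period and then flags values far from the learned distribution, while still
# updating so gradual drift is absorbed into the model.
import math

class RunningBaseline:
    def __init__(self, training_samples: int = 1000, z_threshold: float = 4.0):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0                       # sum of squared deviations (Welford)
        self.training_samples = training_samples
        self.z_threshold = z_threshold

    def observe(self, value: float) -> bool:
        """Update the model with a new value; return True if it looks anomalous."""
        is_anomaly = False
        if self.n >= self.training_samples and self.n > 1:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(value - self.mean) / std > self.z_threshold:
                is_anomaly = True
        # Keep learning regardless, so the baseline adapts to pattern shifts.
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        return is_anomaly
```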

LINGUISTIC REPRESENTATION

The FOUNDATION Platform Learning Engine uses the input data received to create a dynamic model of normal events for a system or environment. It creates statistical clusters of event patterns using the received data point attributes. These clusters identify statistical, multi-dimensional ranges of data; their widths and frequencies of occurrence are continually updated to establish a generative model.

The statistical model is multi-dimensional, and the Learning Engine constructs a cognitive syntax to describe the environment it is modeling. Primitive feature/attribute values are named alpha symbols. Combinations of alphas are grouped into betas, and combinations of betas are grouped into gammas. Because each data point has a time component, the Learning Engine can also compute time-based, year-to-year trend analysis.
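The following sketch illustrates this symbol hierarchy in simplified form: primitive feature values are binned into alpha symbols, the alphas of one observation are grouped into a beta, and a short sequence of betas forms a gamma. The binning and grouping rules here are illustrative assumptions, not the platform's clustering method.

```python
# Illustrative sketch of the alpha/beta/gamma hierarchy described above.
from typing import List

def to_alpha(feature_index: int, value: float, bin_width: float = 1.0) -> str:
    """Map a primitive feature value to an alpha symbol via simple binning."""
    return f"a{feature_index}_{int(value // bin_width)}"

def to_beta(alphas: List[str]) -> str:
    """Group the alpha symbols of one observation into a beta symbol."""
    return "B(" + ",".join(alphas) + ")"

def to_gamma(betas: List[str]) -> str:
    """Group a short sequence of betas into a gamma symbol."""
    return "G(" + ";".join(betas) + ")"

# Example: two observations, each with three primitive feature values.
observations = [[0.7, 12.3, 4.1], [0.9, 12.8, 4.0]]
betas = [to_beta([to_alpha(i, v) for i, v in enumerate(obs)]) for obs in observations]
print(to_gamma(betas))
```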

Figure 8: Word Capacity of COS

Figure 8 compares the word capacity of the COS feature-word representation with the number of possible letter combinations in the English alphabet, for word lengths from 1 letter up to 32 letters. Not all of those combinations are valid English dictionary words. Even so, the plot shows that for 32-character words the linguistic model representation expressed by the COS is roughly one thousand times larger than the English-alphabet equivalent.
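For intuition, the short sketch below computes word capacity as the number of distinct words of length 1 through L over a given alphabet. The 32-symbol feature-word alphabet used here is an assumption for illustration; the actual COS symbol count, and therefore the exact ratio, may differ.

```python
# Illustrative capacity comparison behind Figure 8.
def word_capacity(alphabet_size: int, max_length: int) -> int:
    """Total number of distinct words with lengths 1 through max_length."""
    return sum(alphabet_size ** k for k in range(1, max_length + 1))

for max_length in (1, 2, 8, 16, 32):
    cos = word_capacity(32, max_length)    # hypothetical 32-symbol feature-word alphabet
    eng = word_capacity(26, max_length)    # English letters
    print(f"length <= {max_length:2d}: ratio {cos / eng:.1f}")
```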

FOUNDATION APPLICATION

The FOUNDATION Platform uses a variety of input extensions to receive data. For video, this means live video streams from cameras; an associated video adapter converts these video streams into meta data. For other types of data, such as SCADA, other adapters may pre-process and condition the data prior to passing it into the Learning Engine.

The data is passed from the data adapter to the Intellective AI COS Learning Engine. Data points may contain up to 31 feature/attribute values. An additional date/time attribute is included with each data point. The attributes may vary depending on the type of data being processed. Table 1 illustrates the video feature categories:

Table 1: Feature Categories of Video Application
Category      Feature Encoding
Time          Information about the time of day, day of week, etc.
Appearance    Information about the color, contrast, texture, etc. of an object.
Shape         Information about the symmetry, edge, etc. of an object.
Kinematics    Information about the location, velocity, axis, angle, etc. of an object.
Texture       Information about the gradient, texture, entropy, etc. of an object.

The FOUNDATION Platform Learning Engine may also be configured to group multiple different input sources, each with its own features/attributes, into composite groups for input fusing. These composite groups operate like physical input sources, but treat the combined data from all associated physical inputs as a single virtual object.
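A minimal sketch of such input fusion might simply concatenate the feature vectors of the associated physical sources into one virtual observation; the source names and fusion rule below are illustrative assumptions, not the platform's actual mechanism.

```python
# Illustrative sketch: fuse several physical sources into one virtual observation.
from typing import Dict, List

def fuse_inputs(sources: Dict[str, List[float]]) -> List[float]:
    """Concatenate per-source feature vectors into one composite vector."""
    fused: List[float] = []
    for name in sorted(sources):          # stable ordering across observations
        fused.extend(sources[name])
    return fused

composite = fuse_inputs({"camera_01": [0.2, 0.8], "scada_pump_3": [57.1, 3.4]})
print(composite)
```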

INTELLECTIVE AI FOUNDATION ANOMALY

The Learning Engine’s Cognitive Model analyzes the resulting syntax from the collection of alpha, beta, and gamma symbols to establish a score for each new data point received. This score describes the degree of normality or rarity and the odds of occurrence for that item. The Learning Engine computes formal Alert Odds for each data point.

Each input source can be configured to issue alerts based on an Alerting Likelihood parameter. This integer value determines the rarity with which alerts are published, expressed as the number of alerts desired per total number of observations made.

The Learning Engine assigns a rarity score to every observation. It represents the probability of the observed event occurring relative to all other events observed for the given input source. If the Alert Likelihood calculated for an event falls within the Alerting Likelihood configured for the input source from which it was observed, the system publishes an alert. In other words, any event whose calculated chance of occurrence (its Alert Likelihood) is lower than the Alerting Likelihood threshold for that input generates an alert.

Alert Odds commonly range from 1 in 100 (more common) to 1 in 100,000 (less common). To allow an installation to establish tradeoffs between unusualness and the number of anomalous alerts generated, an Alert Odds threshold can be set so that only events that exceed this threshold are sent.
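As a rough illustration of this decision rule, the sketch below estimates an event's chance of occurrence from simple frequency counts and alerts when it is rarer than the configured Alerting Likelihood. The frequency-count estimate and the parameter names are assumptions for illustration, not the platform's actual scoring method.

```python
# Illustrative sketch of the alerting decision: alert when an event is rarer
# than 1 in `alerting_likelihood` among all observations for this input source.
from collections import Counter

class AlertPolicy:
    def __init__(self, alerting_likelihood: int = 10_000):
        self.alerting_likelihood = alerting_likelihood
        self.counts = Counter()
        self.total = 0

    def observe(self, symbol: str) -> bool:
        """Record an observed event symbol; return True if an alert should be published."""
        self.counts[symbol] += 1
        self.total += 1
        probability = self.counts[symbol] / self.total   # rarity score estimate
        return probability < 1.0 / self.alerting_likelihood
```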

The Alert Odds value is configured differently for different input sources depending on a variety of factors, such as data variability and frequency of activity. The goal is a reasonable number of anomaly alerts while still providing adequate coverage of the desired anomalies.

INTELLECTIVE ALERT METRIC

Figure 9: Alert Likelihood

Figure 9 shows the number of alerts as a function of the number of samples generated by the application for various scenes. All the results approach the blue line, which is the theoretical benchmark. Depending on the configuration and sensor type, some anomalies may be major events and/or environmental situations, while others may be pre-warning signs indicating the likelihood of an impending major event. By tuning the Alert Odds to an optimal value, it is possible to catch valuable pre-warning signs before major events occur.

FOUNDATION VIDEO

The video extension receives real-time streaming video from video cameras and handles conditioning and deconstruction of the incoming video frames into values for kinematics and appearance, such as basic shapes of objects. This approach enables seamless integration with existing network video cameras (RTSP, MP4, H.263, H.264, etc.) and the industry’s top Video Management Systems (VMS). As a video stream arrives, the video frame images are deconstructed into sets of primitive shapes and objects. These, in turn, are grouped into compound objects with associated events.

Figure 10: Feature-Word generated from the Linguistic Engine

In concert with the Learning Engine, FOUNDATION Video is able to distinguish between different types of objects, such as humans, vehicles, and static background items, as shown in Figure 10. It evaluates attributes such as location, size, and shape, and events such as paths, direction, velocity, acceleration, and trajectories. Table 2 shows the meta data the video application sends to the COS. As the Learning Engine evaluates these compound objects using AI-based machine learning methods, it constructs a set of abstract models and associated hypotheses, which it stores as episodic or long-term memories, as appropriate. It then uses these memories to compare future observations with past events.

Table 2: Video Feature List
Index   Category     Feature Name
0       Time         Fractional Day
1:i     Appearance   Encoded appearance information
i:j     Shape        Encoded shape information
j:k     Kinematics   Encoded kinematic information
k:l     Texture      Encoded texture information
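As a rough illustration of this layout, the sketch below assembles a video data point as a fractional-day timestamp followed by encoded appearance, shape, kinematic, and texture blocks. The block contents and sizes are hypothetical, chosen only to mirror the structure of Table 2.

```python
# Illustrative sketch: assemble a video data point in the Table 2 layout.
from datetime import datetime
from typing import List

def fractional_day(ts: datetime) -> float:
    """Time of day expressed as a fraction of one day (index 0 in Table 2)."""
    return (ts.hour * 3600 + ts.minute * 60 + ts.second) / 86_400.0

def build_video_datapoint(appearance: List[float], shape: List[float],
                          kinematics: List[float], texture: List[float],
                          ts: datetime) -> List[float]:
    # Index 0 is time; the remaining ranges hold the encoded feature blocks.
    return [fractional_day(ts), *appearance, *shape, *kinematics, *texture]

point = build_video_datapoint([0.1, 0.4], [0.8], [2.5, 0.0, 1.1], [0.3], datetime.now())
print(point)
```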

The net result is a system that automatically trains itself through observation (unsupervised learning) to establish a running baseline of normal events in a camera’s field of view. By continuously learning and adjusting the model, the system adapts to environmental changes such as lighting fluctuations, scene instability, and sudden or gradual pattern shifts. The system can understand event patterns down to day of week and time of day.

FOUNDATION SCADA

The SCADA extension receives real-time streaming data from input sources that measure features such as pressure, temperature, and flow. When real-time analysis is started for SCADA data flows, the initial training for data analytics configuration typically requires a one-year sample of historic data to establish a calibration baseline for a sensor that normally collects 27,000 or more points. Dynamic auto-ranging of the raw data is calibrated, and the auto-range calibration model continues to be updated with real-time data even after the initial calibration and the start of real-time and/or historic data flow. Some large, distributed SCADA real-time systems use a dead-band feature, in which sensors only transmit updated values when a reading has changed. In that case, the front-end filtering process automatically re-transmits the last known value unless a timeout failure period expires.
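As an illustration of this dead-band handling, the sketch below keeps forwarding the last known sensor value until a timeout failure period expires; the class and parameter names are assumptions for illustration, not the platform's actual front-end filter.

```python
# Illustrative sketch: repeat the last known value for a dead-band sensor
# until a timeout failure period expires.
import time
from typing import Optional

class DeadbandFilter:
    def __init__(self, timeout_seconds: float = 300.0):
        self.timeout_seconds = timeout_seconds
        self.last_value: Optional[float] = None
        self.last_update: float = 0.0

    def update(self, value: float) -> None:
        """Record a genuine transmission from the sensor."""
        self.last_value = value
        self.last_update = time.time()

    def current(self) -> Optional[float]:
        """Return the value to forward downstream, or None after a timeout failure."""
        if self.last_value is None:
            return None
        if time.time() - self.last_update > self.timeout_seconds:
            return None                    # treat as a sensor timeout failure
        return self.last_value
```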

Table 3 shows the meta data the SCADA application sends to the COS.

Table 3: SCADA Feature List
Index   Category         Feature Name
0       Time             Fractional Day
1:i     Sensor           Physical Sensor Reading
i:j     Derived Feature  Time Domain Feature
j:k     Derived Feature  Frequency Domain Feature

One of the biggest drivers of maintenance cost is unscheduled maintenance due to unexpected failures. Continuous monitoring of sensor health with a Cognitive AI System can improve reliability and reduce maintenance costs by detecting failures before they reach a catastrophic stage.