Data Analysis and Visualization Modules
Statistics and Distributions The Statistics and Distributions module presents an overview of all available data attributes, their statistical properties and their value distributions.
Correlations Analysis The Correlations Analysis panel computes and displays field-field correlations between the available data attributes (fields). From there, a single pair of data fields can be examined in more detail by means of Bivariate Exploration.
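As an illustration of what a field-field correlation can look like, the following is a minimal sketch that computes the Pearson correlation coefficient for every pair of numeric fields. The choice of the Pearson coefficient, the function names and the sample data are assumptions made for this example; the text above does not state which correlation measure the panel uses.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(fields):
    """Map every ordered pair of field names to its correlation coefficient."""
    names = list(fields)
    return {(a, b): pearson(fields[a], fields[b]) for a in names for b in names}

# Hypothetical sample data: three numeric fields with five records each.
fields = {
    "age": [23, 35, 47, 51, 62],
    "income": [21000, 34000, 52000, 58000, 61000],
    "visits": [9, 7, 4, 3, 2],
}
matrix = correlation_matrix(fields)
for (a, b), r in matrix.items():
    if a < b:  # print each unordered pair once
        print(f"{a} / {b}: {r:+.3f}")
```

Values near +1 or -1 indicate a strong linear relation between two fields, values near 0 a weak one. Categorical fields would need a different measure.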
Bivariate Exploration The Bivariate Exploration panel provides a refinement of the field-field Correlations Analysis: for a given pair of data fields, it presents a matrix of value-value interrelations and offers further interactive drill-down capabilities.
Pivot Tables The Pivot Tables panel creates aggregation tables which show the values of a user-defined statistical measure of the data as a function of the value ranges of two or more data fields.
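The idea of an aggregation table over value ranges can be sketched as follows. This is only an illustrative example: the mean as the statistical measure, the fixed-width value ranges, and all field and function names are assumptions, not the panel's actual behavior.

```python
from collections import defaultdict

def value_range(value, width):
    """Return the half-open range [low, low + width) that contains value."""
    low = (value // width) * width
    return (low, low + width)

def pivot_mean(records, row_field, col_field, measure_field, row_width, col_width):
    """Mean of measure_field for each combination of row and column value range."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        key = (
            value_range(rec[row_field], row_width),
            value_range(rec[col_field], col_width),
        )
        sums[key] += rec[measure_field]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical records with two grouping fields and one measure.
records = [
    {"age": 23, "income": 21000, "spend": 120.0},
    {"age": 27, "income": 28000, "spend": 180.0},
    {"age": 35, "income": 34000, "spend": 200.0},
    {"age": 38, "income": 52000, "spend": 400.0},
]
table = pivot_mean(records, "age", "income", "spend", 10, 20000)
for (age_range, income_range), mean_spend in sorted(table.items()):
    print(age_range, income_range, mean_spend)
```

Each cell of the resulting table corresponds to one combination of an age range and an income range; combinations without records simply do not appear.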
Multivariate Exploration The Multivariate Exploration panel provides interactive multi-dimensional ad-hoc analysis and drill-down features with real-time response even on multi-gigabyte data.
Split Analysis In the Split Analysis panel, two disjoint data subsets can be defined: test data and control data. The control data can be further sampled so that it becomes representative of the test data with respect to certain data fields. On the remaining data fields, significant deviations between the test and the control data can then be studied and quantified.
Time Series Analysis In the Time Series Analysis panel, trends and seasonal patterns in time series data can be detected, and future values can be forecast.
Deviations, Inconsistencies In the Deviation Detection module, outliers, deviations and presumable data inconsistencies can be detected. Unlike traditional data quality checking tools, this module does not examine the values and value distribution of each data field separately for outliers. Rather, it finds cross-field inconsistencies.
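One way to make control data representative of test data with respect to a field is stratified sampling: draw control records so that each value of the field occurs in the same proportion as in the test data. The sketch below assumes a single categorical field; the function name, the field names and the data are hypothetical, and the panel's actual sampling method is not described here.

```python
import math
import random
from collections import Counter, defaultdict
from fractions import Fraction

def sample_control(test, control, field, seed=0):
    """Draw the largest control sample whose distribution of `field`
    matches the test data (up to rounding down per value)."""
    rng = random.Random(seed)
    test_counts = Counter(rec[field] for rec in test)
    by_value = defaultdict(list)
    for rec in control:
        by_value[rec[field]].append(rec)
    # Largest factor such that every value can supply factor * test_count records.
    factor = min(Fraction(len(by_value[v]), c) for v, c in test_counts.items())
    sample = []
    for value, count in test_counts.items():
        sample.extend(rng.sample(by_value[value], math.floor(factor * count)))
    return sample

# Hypothetical data: the test data has twice as many "north" as "south" records.
test = [
    {"region": "north", "spend": 10.0},
    {"region": "north", "spend": 12.0},
    {"region": "south", "spend": 30.0},
]
control = (
    [{"region": "north", "spend": 8.0 + i} for i in range(4)]
    + [{"region": "south", "spend": 20.0 + i} for i in range(7)]
)
sample = sample_control(test, control, "region")
print(Counter(rec["region"] for rec in sample))
```

After sampling, the control sample has the same 2:1 ratio of regions as the test data, so differences in other fields such as `spend` are no longer explained by a different regional mix.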
Associations Analysis An Associations Analysis detects typical patterns or atypical deviations in the data.
Sequences Analysis Sequences Analysis, also called Sequential Pattern Analysis, is a refinement of Associations Analysis: it detects time-ordered patterns and is a means for detecting causal relations in the data.
Self Organizing Maps (SOM) Self Organizing Maps (SOM) is a neural network approach in which a two-dimensional net of neurons 'learns' the training data. Afterwards, the SOM net can be used to detect homogeneous clusters in both the training data and new data sources, or for predicting missing values within these data.
Linear and Logistic Regression Linear and Logistic Regression are basic Data Mining techniques which try to predict the values of one data field, the target field, from the values of other data fields by combining them in a linear equation. Linear regression is suitable for numeric target fields, logistic regression for two-valued target fields with values such as male/female, yes/no or 0/1.
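To make the linear equation concrete, here is a minimal sketch with a single predictor field fitted by ordinary least squares, plus the logistic function that turns a linear combination into a probability for a two-valued target. The function names, the coefficients and the data are assumptions chosen for illustration; fitting the logistic coefficients themselves is not shown.

```python
import math

def fit_linear(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one predictor field."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope

def logistic_probability(intercept, coefficients, values):
    """Probability of the positive target value for given predictor values."""
    z = intercept + sum(c * v for c, v in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-z))

# Linear regression: hypothetical data that follows y = 1 + 2x exactly.
intercept, slope = fit_linear([1, 2, 3, 4], [3, 5, 7, 9])
predicted = intercept + slope * 5

# Logistic regression: hypothetical coefficients; here the linear part is 0.
p = logistic_probability(0.0, [1.0, -1.0], [2.0, 2.0])

print(intercept, slope, predicted, p)
```

In the linear case the equation yields the predicted numeric value directly. In the logistic case the same kind of linear combination is mapped to a value between 0 and 1, which is read as the probability of one of the two target values.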