InforSense Classification Studio
Category Intelligent Software>Data Mining Systems/Tools
Abstract The InforSense® Classification Studio is an interactive environment that simplifies the development and optimization of predictive models (data mining).
It provides a wizard environment that guides users through different steps of the 'predictive model' building process, to build, evaluate, optimize, and select 'classification models' based on a data table.
These input data sets are derived from the InforSense analytic workflow environment. All steps conducted by the wizard are also recorded and stored as an analytical workflow that captures the analysis steps from start to end.
The Classification Studio is open to any classifier and users can use it over any 'classification component' available within InforSense or InforSense In-Oracle Edition (IOE).
The main features/capabilities of the InforSense Classification Studio are:
1) Wizards for building workflows that capture the 'analytical process' from start to end --
- a) Data partitioning and preparation;
- b) Feature subset selection and evaluation;
- c) Predictive model building;
- d) Automated model parameter variation, and predictive model assessment.
2) Option to build an 'ensemble classifier' consisting of general models.
3) Environment for visualizing --
- a) Attribute importance tables, and classifier models;
- b) Performance metrics, evaluation reports, confusion matrices, lift and gains charts, and ROC plots.
4) Ability to compare and optimize classification tasks by overlaying charts (Lift, ROC).
5) Ability to construct pluggable classification workflows.
6) Fast and easy re-execution and re-usability of generated workflows.
Data Partitioning and Preparation -- The InforSense Classification Studio can partition data sets into training and validation subsets either systematically or randomly.
In addition, users are guided in selecting further data preparation steps such as normalization, binning, and outlier detection.
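The difference between systematic and random partitioning, and a simple normalization step, can be sketched in plain Python. This is only an illustration of the general techniques, not InforSense code: the function names, the every-4th-row rule, and the 25% validation fraction are all assumptions made for the example.

```python
import random

def systematic_split(rows, every=4):
    """Send every `every`-th row to validation and the rest to training."""
    train, valid = [], []
    for i, row in enumerate(rows):
        (valid if i % every == every - 1 else train).append(row)
    return train, valid

def random_split(rows, valid_fraction=0.25, seed=0):
    """Shuffle a copy of the rows, then cut off a validation slice."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_valid = int(round(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]

def min_max_normalize(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical data: twelve row identifiers standing in for table rows.
rows = list(range(12))
sys_train, sys_valid = systematic_split(rows, every=4)
rnd_train, rnd_valid = random_split(rows, valid_fraction=0.25, seed=0)
```

A systematic split is reproducible without a seed but can pick up periodic patterns in the row order; a random split avoids that at the cost of needing a fixed seed for repeatability.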
Feature Selection -- When building predictive models in Classification Studio, users can choose from a range of pluggable 'feature selection methods' for estimating feature importance and automatically removing unimportant features.
Methods include: RELIEF-F, Information Gain, Partial Least Squares, and Oracle Attribute Importance.
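As an illustration of one of the listed methods, Information Gain for a categorical feature can be computed as the reduction in class entropy after splitting on that feature. The sketch below uses the textbook definition; the function names, the toy table, and the 0.1 threshold are assumptions for the example and do not reflect how InforSense implements the method.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Class entropy minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def select_features(table, labels, threshold):
    """Score every column and keep those whose gain reaches the threshold."""
    scores = {name: information_gain(col, labels) for name, col in table.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return scores, kept

# Hypothetical toy table: one perfectly predictive column, one uninformative.
labels = ["yes", "yes", "no", "no"]
table = {
    "perfect": ["a", "a", "b", "b"],
    "useless": ["x", "y", "x", "y"],
}
scores, kept = select_features(table, labels, threshold=0.1)
```

Here the "perfect" column separates the classes completely and keeps the full one bit of gain, while the "useless" column leaves the class mix unchanged and is removed.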
Classification Components -- The Classification Studio guides the users in setting the parameter values for a wide range of components, including:
1) InforSense Classifiers - Decision Tree, Decision Rules, MLP Neural Network, Support Vector Machine (SVM), and Naïve Bayes.
2) Oracle 10g Classifiers - Decision Tree, Adaptive Bayesian Network, Naïve Bayes, and SVM.
3) R Statistics - Logistic Regression featuring stagewise, forward and backward variable selection.
Ensemble Learning -- In InforSense Classification Studio, classifier models are easily enhanced via the following ensemble learning methods: Boosting, Bagging, and Arcing.
Ensemble learning methods allow improvement in the performance of any classifier meeting certain requirements.
These methods succeed because they tend to produce decision boundaries that generalize better, allowing predictive models to perform more effectively on unseen data.
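Of the three methods named above, Bagging is the simplest to sketch: train the same base classifier on several bootstrap resamples of the training data and combine the predictions by majority vote. The following is a minimal sketch under stated assumptions, not the InforSense implementation: the base learner is a one-dimensional threshold "stump" chosen for brevity, and all names and the toy data are hypothetical.

```python
import random
from collections import Counter

def train_stump(samples):
    """Fit a 1-D threshold classifier that predicts 1 when x > threshold."""
    xs = sorted({x for x, _ in samples})
    # Candidate thresholds: below all points, between neighbours, above all points.
    candidates = ([xs[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
                  + [xs[-1] + 1.0])

    def errors(threshold):
        return sum((x > threshold) != bool(y) for x, y in samples)

    return min(candidates, key=errors)

def bagging(samples, n_models=15, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        resample = [rng.choice(samples) for _ in samples]
        models.append(train_stump(resample))
    return models

def predict(models, x):
    """Majority vote over the individual stump predictions."""
    votes = Counter(int(x > threshold) for threshold in models)
    return votes.most_common(1)[0][0]

# Hypothetical training data: (feature value, class label) pairs.
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
models = bagging(data, n_models=15, seed=0)
```

Each resample yields a slightly different threshold, and the vote averages out that variation. Boosting and Arcing differ in that they reweight or resample the training data adaptively, based on the errors of earlier models.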
Evaluation Components -- The Classification Studio provides access to a number of classifier assessment methods, including:
1) Confusion matrix for validation set, cross validation, and bootstrap testing.
2) ROC curve, AUC, and other binary performance metrics, such as sensitivity, specificity, and odds ratio.
3) Lift and Gains charts.
4) Multiple charts.
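The binary metrics listed above all derive from the confusion matrix and the ranking of model scores. The sketch below computes them from their standard definitions for a small, made-up validation set; the function names, the data, and the 0.5 decision cut-off are assumptions for illustration and are unrelated to the InforSense components.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels coded as 0 and 1."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    return tp, fp, fn, tn

def binary_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and odds ratio (assumes no empty cell)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "odds_ratio": (tp * tn) / (fp * fn),
    }

def auc(actual, scores):
    """Area under the ROC curve as the share of correctly ranked
    positive/negative pairs, counting ties as half."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def lift_at(actual, scores, fraction):
    """Positive rate in the top-scored fraction divided by the overall rate."""
    ranked = [a for _, a in sorted(zip(scores, actual), reverse=True)]
    k = max(1, int(len(ranked) * fraction))
    return (sum(ranked[:k]) / k) / (sum(actual) / len(actual))

# Hypothetical validation set: true labels and model scores.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
predicted = [int(s >= 0.5) for s in scores]  # assumed 0.5 cut-off

counts = confusion_counts(actual, predicted)
metrics = binary_metrics(*counts)
```

Sensitivity, specificity, and the odds ratio depend on the chosen cut-off, whereas AUC and lift depend only on how the scores rank the records.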
Automatic Workflow Capture -- The different data preparation and analysis steps conducted by the Classification Studio are captured as an analytical workflow that can be stored back into InforSense.
In addition to providing a record of the analysis steps conducted, it enables users to inspect and modify the analysis steps, and provides them with a mechanism for re-using the same analytic steps in other applications.
Deployment into End-User Applications -- The entire activity for each workflow produced in Classification Studio is packaged as a predictive service ready for deployment through any of the InforSense deployment methods, including portal, web service, command line interface or InforSense server API.
System Requirements
InforSense is a pure Java platform based on the J2EE architecture and has been validated on a wide range of platforms and operating systems.
Currently supported platforms include:
- Client: Microsoft Windows 2000/XP and Mac OS X Tiger (PowerPC architecture).
- Server: Microsoft Windows 2000/XP, Mac OS X Tiger (PowerPC architecture), Linux (Intel architecture), Solaris 8 (SPARC architecture).
Note: As an integrative analytics platform, InforSense co-ordinates the execution of software tools supplied by third parties and installed on different machines; such tools may have their own system requirements.
Manufacturer
- InforSense Limited
- Colet Court
- 100 Hammersmith Road
- London, W6 7JP
- United Kingdom
- Tel.: +44 (0) 20 8237 8440
- Fax: +44 (0) 20 8237 8441
- information@inforsense.com
Manufacturer Web Site InforSense Classification Studio
Price Contact manufacturer.
G6G Abstract Number 20347
G6G Manufacturer Number 101430


