KEEL

Category Intelligent Software>Data Mining Systems/Tools, Intelligent Software>Fuzzy Logic Systems/Tools, Intelligent Software>Genetic Programming Systems/Tools and Intelligent Software>Neural Network Systems/Tools

Abstract KEEL (Knowledge Extraction based on Evolutionary Learning) is a software tool to assess evolutionary algorithms (EAs) for Data Mining (DM) problems including regression, classification, clustering, pattern mining, etc.

KEEL contains a large collection of classical knowledge extraction algorithms, preprocessing techniques (instance selection, feature selection, discretization, imputation methods for missing values, etc.), Computational Intelligence based learning algorithms, including evolutionary rule learning algorithms based on different approaches (Pittsburgh, Michigan, IRL, etc.), and hybrid models such as ‘genetic fuzzy’ systems, evolutionary neural networks, etc.

KEEL allows you to perform a complete analysis of any learning model in comparison to existing ones, and includes a statistical test module for that comparison. Moreover, KEEL has been designed with two goals in mind: research and education.

The currently available version of KEEL consists of the following function blocks:

1) Data Management -- This part comprises a set of tools for building new data sets, exporting and importing data between other formats and the KEEL format, editing and visualizing data, applying transformations, partitioning data, etc.

2) Design of Experiments -- The aim of this part is the design of the desired experimentation over the selected data sets. It provides many options to choose from: type of validation, type of learning (classification, regression, unsupervised learning), etc.

3) Educational Experiments -- Structured like the previous part, this block lets you design an experiment that can be debugged step by step, so that the learning process of a given model can be shown and followed for educational purposes.
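The "type of validation" option mentioned under Design of Experiments typically refers to schemes such as k-fold cross-validation. The following is a minimal sketch of how such a partitioning can be built; it is not taken from the KEEL source code, and the class name `KFoldSketch`, its method names and the seed value are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: NOT KEEL's partitioning code.
public class KFoldSketch {

    /** Splits the instance indices 0..n-1 into k shuffled folds of near-equal size. */
    public static List<List<Integer>> folds(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // A fixed seed keeps the partition reproducible across runs.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        // Deal the shuffled indices round-robin so fold sizes differ by at most one.
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    /** Training indices for one round: every instance outside the held-out fold. */
    public static List<Integer> trainingIndices(List<List<Integer>> folds, int heldOut) {
        List<Integer> train = new ArrayList<>();
        for (int f = 0; f < folds.size(); f++) {
            if (f != heldOut) {
                train.addAll(folds.get(f));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = folds(10, 5, 42L);
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("fold " + f + ": test=" + folds.get(f)
                    + " train=" + trainingIndices(folds, f));
        }
    }
}
```

Each instance appears in exactly one test fold, so every model in an experiment can be trained and tested on the same partitions, which is what makes later comparisons between algorithms fair.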

Taking into account each one of the above function blocks, KEEL can be useful for different types of users, each of whom expects to find specific features in Data Mining (DM) software.

The following describes the ‘User Profiles’ that KEEL is designed for, its Main Features and the different ways of working with it.

KEEL User Profiles --

KEEL integrates an environment with a defined architecture and knowledge extraction algorithms developed as expandable modules. It is mainly intended for two (2) categories of users: researchers and students. Each group has a different set of needs:

1) KEEL as a research tool -- The most common use of this tool for a researcher will be the automated execution of experiments and the statistical analysis of their results. Routinely, an experimental design includes a mix of evolutionary algorithms, statistical and Artificial Intelligence (AI)-related techniques.

Special care has been taken to make it possible for a researcher to use KEEL to assess the relevance of their own procedures.

Since current standards in machine learning require heavy computational work, the research tool is not designed to offer a real-time view of the progress of the algorithms. Instead, it is designed to generate a script that is batch-executed on a cluster of computers.

The tool allows the researcher to apply the same sequence of pre-processing, experiments and analysis to large batteries of problems and to focus their attention on the summary of the results.

2) KEEL as an educational tool -- The needs of a student are quite different from those of a researcher. Generally speaking, the objective is no longer to make statistically sound comparisons between algorithms, so there is no need to repeat each experiment a large number of times.

If the tool is to be used in a class, the execution time must be short and a real-time view of the evolution of the algorithms is needed, since the student will use this information to learn how to adjust the parameters of the algorithms.

In this sense, the educational tool is a simplified version of the research tool, where only the most relevant algorithms are available.

The execution is made in real time. The user has visual feedback of the progress of the algorithms and can access the final results from the same interface used to design the experiment.
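To make the idea of a "real-time view" concrete, the sketch below runs a very small evolutionary algorithm and reports the best fitness after every generation through a callback, which is the kind of feedback a student could watch while adjusting the mutation rate. This is a generic (1+1) evolutionary algorithm on a toy bit-counting problem, not code from KEEL; the class `ProgressSketch`, the `run` method and all parameter values are assumptions made for illustration.

```java
import java.util.Random;
import java.util.function.BiConsumer;

// Illustrative sketch only: a toy (1+1) evolutionary algorithm, NOT a KEEL algorithm.
public class ProgressSketch {

    /** Toy fitness: the number of bits set to true (the "OneMax" problem). */
    public static int fitness(boolean[] bits) {
        int count = 0;
        for (boolean b : bits) {
            if (b) {
                count++;
            }
        }
        return count;
    }

    /**
     * Evolves a single bit string and calls onGeneration(generation, bestFitness)
     * after every generation so that a user interface could display progress.
     */
    public static int run(int length, int generations, double mutationRate, long seed,
                          BiConsumer<Integer, Integer> onGeneration) {
        Random rng = new Random(seed);
        boolean[] parent = new boolean[length];
        for (int i = 0; i < length; i++) {
            parent[i] = rng.nextBoolean();
        }
        int best = fitness(parent);

        for (int g = 1; g <= generations; g++) {
            // Mutation: flip each bit independently with probability mutationRate.
            boolean[] child = parent.clone();
            for (int i = 0; i < length; i++) {
                if (rng.nextDouble() < mutationRate) {
                    child[i] = !child[i];
                }
            }
            // Replacement: keep the child only if it is at least as good as the parent.
            int childFitness = fitness(child);
            if (childFitness >= best) {
                parent = child;
                best = childFitness;
            }
            onGeneration.accept(g, best);
        }
        return best;
    }

    public static void main(String[] args) {
        // Print progress every 20 generations; try different mutation rates to compare.
        run(32, 200, 1.0 / 32, 7L, (g, best) -> {
            if (g % 20 == 0) {
                System.out.println("generation " + g + " best fitness " + best);
            }
        });
    }
}
```

Because a child replaces the parent only when it is at least as fit, the reported best fitness never decreases, so the effect of a parameter change shows up directly in how quickly the curve rises.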

KEEL Main Features --

KEEL is a software tool developed to assemble and use different Data Mining models. KEEL is one of the first software toolkits of its type that contains a library of evolutionary learning algorithms with open source code in Java. The main features of KEEL are:

1) Evolutionary Algorithms (EAs) are presented in predictive models, pre-processing (evolutionary feature and instance selection) and post-processing (evolutionary tuning of fuzzy rules).

2) It includes data pre-processing algorithms proposed in specialized literature: data transformation, discretization, instance selection and feature selection.

3) It has a statistical library to analyze the results of the algorithms. The library comprises a set of statistical tests for analyzing the normality and heteroscedasticity of the results and for performing parametric and non-parametric comparisons among the algorithms.

4) Some of the algorithms have been developed with the Java Class Library for Evolutionary Computation (JCLEC).

5) KEEL provides a user-friendly interface, oriented to the analysis of algorithms.

6) The software’s aim is to create experiments containing multiple data sets and algorithms connected together to obtain an expected result. Experiments are generated as independent scripts from the ‘user interface’ for an off-line run on the same or on other machines.

7) KEEL also allows you to create experiments in on-line mode, aimed at educational support, in order to learn the operation of the algorithms included.

8) KEEL contains a ‘Knowledge Extraction Algorithms Library’, consisting of the incorporation of multiple evolutionary learning algorithms with classical learning approaches. The main library features include:

9) KEEL software operates via a web interface, allowing end-user access from all web-enabled computers.

Three (3) recent features of KEEL --

1) KEEL-dataset, a data set repository that includes the data set partitions in the KEEL format and also shows some results of the algorithms in these data sets. This repository can free researchers from merely doing “technical work” and makes the comparison of their models with existing models easier.

2) KEEL has been developed with the idea of being easily extended with new algorithms. For this reason, the manufacturer introduces some basic guidelines that the developer may take into account for managing the specific constraints of the KEEL tool.

Moreover, a source code template has been made available to manage all the restrictions of the KEEL software, including the input and output functions, the parsing of the parameters, and the class structure.

The manufacturer describes this template in detail, using a simple algorithm as an example: the “Steady-State Genetic Algorithm for Extracting Fuzzy Classification Rules From Data” (SGERD) procedure (see published paper below...).

3) A module of statistical procedures was developed in order to provide the researcher with a suitable tool to contrast the results obtained in any experimental study performed inside the KEEL environment.

The manufacturer describes this module and shows a case study using some non-parametric statistical tests for the multiple comparison of the performance of several ‘genetic rule’ learning methods for classification (see published paper below...).
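As an illustration of a non-parametric multiple comparison, the sketch below computes the Friedman statistic from a table of error rates (rows are data sets, columns are algorithms). The Friedman test is chosen here as a common example of such a test; the source does not say which tests the case study uses, and this is not KEEL's implementation. The class `FriedmanSketch`, its method names and the error values are hypothetical, and no tie correction is applied to the statistic.

```java
import java.util.Arrays;

// Illustrative sketch only: NOT the KEEL statistical module.
public class FriedmanSketch {

    /** Ranks one row of error values (lower is better); tied values share the average rank. */
    public static double[] rankRow(double[] row) {
        int k = row.length;
        Integer[] order = new Integer[k];
        for (int i = 0; i < k; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(row[a], row[b]));

        double[] ranks = new double[k];
        int i = 0;
        while (i < k) {
            int j = i;
            while (j + 1 < k && row[order[j + 1]] == row[order[i]]) {
                j++;
            }
            double average = (i + j) / 2.0 + 1.0; // mean of the 1-based positions i+1..j+1
            for (int t = i; t <= j; t++) {
                ranks[order[t]] = average;
            }
            i = j + 1;
        }
        return ranks;
    }

    /** Average rank of each algorithm (column) over all data sets (rows). */
    public static double[] averageRanks(double[][] errors) {
        int n = errors.length;
        int k = errors[0].length;
        double[] average = new double[k];
        for (double[] row : errors) {
            double[] r = rankRow(row);
            for (int j = 0; j < k; j++) {
                average[j] += r[j] / n;
            }
        }
        return average;
    }

    /** Friedman chi-square: 12N / (k(k+1)) * (sum of squared average ranks - k(k+1)^2 / 4). */
    public static double friedmanStatistic(double[][] errors) {
        int n = errors.length;
        int k = errors[0].length;
        double sumSquares = 0.0;
        for (double r : averageRanks(errors)) {
            sumSquares += r * r;
        }
        return 12.0 * n / (k * (k + 1.0)) * (sumSquares - k * (k + 1.0) * (k + 1.0) / 4.0);
    }

    public static void main(String[] args) {
        // Made-up error rates: 4 data sets (rows) x 3 algorithms (columns).
        double[][] errors = {
            {0.10, 0.20, 0.30},
            {0.15, 0.25, 0.35},
            {0.05, 0.10, 0.20},
            {0.12, 0.18, 0.40},
        };
        System.out.println("average ranks: " + Arrays.toString(averageRanks(errors)));
        System.out.println("Friedman statistic: " + friedmanStatistic(errors));
    }
}
```

In this made-up table the first algorithm is always best and the third always worst, so the statistic reaches its maximum N(k-1) = 8 for N = 4 data sets and k = 3 algorithms. In practice the statistic is compared against a chi-square distribution with k-1 degrees of freedom, and a post-hoc procedure is applied if the null hypothesis of equal performance is rejected.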

Published paper (for additional info) --

J. Alcalá-Fdez, A. Fernandez, J. Luengo, J. Derrac, S. García, L. Sánchez, F. Herrera. KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework. Journal of Multiple-Valued Logic and Soft Computing 17:2-3 (2011) 255-287.

System Requirements

KEEL software operates via a web interface, allowing end-user access from all web-enabled computers. Java version 6 or above must be installed on your system.

Manufacturer

This Project is being developed with the collaboration of five (5) Research Groups at the:

Spanish National Projects TIC2002-04036-C05, TIN2005-08386-C05 and TIN2008-06681-C06.

Manufacturer Web Site KEEL or KEEL (link 2)

Price Contact manufacturer.

G6G Abstract Number 20119R

G6G Manufacturer Number 101630