Web site and design © 2008 by G6G Consulting Group. All Rights Reserved. Most product content has been taken directly from manufacturer's web sites;
other product content is assembled by G6G Consulting Group. G6G welcomes any corrections and/or comments.
    CART 5.0

    Category  Intelligent Software>Data Mining Systems/Tools

    Abstract  CART data mining software is a decision tree tool that
    automatically sifts large, complex databases, searching for and
    isolating significant patterns and relationships. This discovered
    knowledge is then used to generate reliable, predictive models for
    applications such as credit risk scoring (probability of default, loss
    given default); fraud detection; targeted marketing (new customer
    acquisition, cross-sell, up-sell); churn modeling [and related customer
    relationship management (CRM)]; document classification; microarray
    data analysis; genomics, proteomics; manufacturing and production
    line quality control.

    In addition, CART is an advanced pre-processing complement to other
    data analysis techniques and data-mining packages, such as SAS.
    For example, CART's outputs (predicted values) can be used as inputs
    to improve the predictive accuracy of Neural Networks (NN) and
    Logistic Regression.

    In the first stage of a data-mining project, CART can extract the most
    important variables from a very large list of potential predictors.
    Focusing on the top variables from the CART model can significantly
    speed up neural networks and other data-mining techniques. For
    neural nets in particular, CART bypasses "noise" and irrelevant
    variables, quickly and effectively selecting the best variables for input.
    The result is a significant reduction in neural-net training time and
    more accurate and robust neural networks. In addition, the CART
    outputs, or "predicted values," can be used as inputs to the neural net.
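
    The variable-screening step described above can be sketched in a few
    lines of Python. This is a minimal illustration, not CART's own
    importance measure: it scores each candidate predictor by the largest
    Gini decrease a single binary split on it can achieve, then keeps the
    top k. The function names and the toy data are assumptions made for
    the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split_gain(values, labels):
    """Largest Gini decrease from any binary split 'value <= threshold'."""
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    for threshold in sorted(set(values))[:-1]:   # last value would leave one side empty
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

def top_variables(columns, labels, k):
    """Names of the k predictors with the highest single-split gain."""
    scores = {name: best_split_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical data: six records, three candidate predictors.
LABELS = [0, 0, 0, 1, 1, 1]
COLUMNS = {
    "income": [10, 20, 30, 40, 50, 60],
    "noise":  [5, 1, 4, 2, 6, 3],
    "age":    [25, 30, 45, 35, 50, 55],
}

print(top_variables(COLUMNS, LABELS, 2))   # ['income', 'age']
```

    Only the selected columns would then be passed to the neural network
    or other downstream model.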

    CART is an acronym for Classification and Regression Trees, a
    decision-tree procedure. A decision tree is a flow chart or diagram
    representing a classification system or predictive model. The tree is
    structured as a sequence of simple questions, and the answers to
    these questions trace a path down the tree. The end point reached
    determines the classification or prediction made by the model, which
    can be a qualitative judgment (e.g., these are responders) or a
    numerical forecast (e.g., sales will increase 15 percent).
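
    As a rough illustration of "a sequence of simple questions," the
    sketch below stores a small tree as nested dictionaries and follows
    the answers down to an end point. The variables, thresholds, and
    labels are hypothetical and are not taken from any CART model.

```python
# Each internal node asks one question of the form "variable <= threshold?";
# each end point (leaf) holds the prediction.
TREE = {
    "question": ("income", 40000),
    "yes": {"prediction": "non-responder"},
    "no": {
        "question": ("age", 35),
        "yes": {"prediction": "responder"},
        "no": {"prediction": "non-responder"},
    },
}

def predict(node, record):
    """Answer each question in turn until an end point is reached."""
    while "prediction" not in node:
        variable, threshold = node["question"]
        node = node["yes"] if record[variable] <= threshold else node["no"]
    return node["prediction"]

print(predict(TREE, {"income": 52000, "age": 29}))   # responder
```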

    CART's methodology is characterized by:

    1) A reliable pruning strategy -- CART's developers determined
    definitively that no stopping rule could be relied on to discover the
    optimal tree, so they introduced the notion of over-growing trees and
    then pruning back; this idea, fundamental to CART, ensures that
    important structure is not overlooked by stopping too soon.

    2) An advanced binary split search approach -- CART's binary decision
    trees are more sparing with data and detect more structure before too
    little data are left for learning.

    3) Automatic self-validation procedures -- In the search for patterns in
    databases it is essential to avoid the trap of "overfitting," or finding
    patterns that apply only to the training data. CART's embedded test
    disciplines ensure that the patterns found will hold up when applied to
    new data. Further, the testing and selection of the optimal tree are an
    integral part of the CART algorithm.
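
    The over-grow-then-prune idea can be sketched with weakest-link
    (cost-complexity) pruning, the approach described in the CART
    literature. Whether CART 5.0 implements it exactly this way is not
    stated here, so treat the code as an assumption-laden sketch. Each
    node carries a hypothetical "err" count: the training cases it would
    misclassify if it were turned into a terminal node.

```python
def leaves(node):
    """Number of terminal nodes in the subtree rooted at node."""
    if "left" not in node:
        return 1
    return leaves(node["left"]) + leaves(node["right"])

def subtree_error(node):
    """Training errors made by the terminal nodes of the subtree."""
    if "left" not in node:
        return node["err"]
    return subtree_error(node["left"]) + subtree_error(node["right"])

def internal_nodes(node):
    """Yield every node that still has children."""
    if "left" in node:
        yield node
        yield from internal_nodes(node["left"])
        yield from internal_nodes(node["right"])

def weakest_link(root):
    """Internal node whose collapse adds the least error per leaf removed."""
    def g(node):
        return (node["err"] - subtree_error(node)) / (leaves(node) - 1)
    return min(internal_nodes(root), key=g)

def prune_sequence(root):
    """Collapse weakest links one by one (modifies the tree in place).

    Returns (number of leaves, training errors) for each nested subtree.
    """
    steps = [(leaves(root), subtree_error(root))]
    while "left" in root:
        node = weakest_link(root)
        del node["left"], node["right"]      # the node becomes a leaf
        steps.append((leaves(root), subtree_error(root)))
    return steps

def example_tree():
    """Hypothetical over-grown tree with made-up error counts."""
    return {
        "err": 40,
        "left":  {"err": 10, "left": {"err": 4}, "right": {"err": 5}},
        "right": {"err": 15, "left": {"err": 2}, "right": {"err": 3}},
    }

print(prune_sequence(example_tree()))
# [(4, 14), (3, 15), (2, 25), (1, 40)]
```

    Training error alone always favors the largest tree, which is why the
    candidate subtrees in such a sequence would be compared on test data
    or by cross validation before one is selected.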

    In addition, CART accommodates many different types of real world
    modeling problems by providing a unique combination of automated
    solutions:

    1) Surrogate splitters intelligently handle missing values -- CART
    handles missing values in the database by substituting "surrogate
    splitters," which are back-up rules that closely mimic the action of
    primary splitting rules. The surrogate splitter contains information that
    is typically similar to what would be found in the primary splitter. In
    CART, each record is processed using data specific to that record; this
    allows records with different data patterns to be handled differently,
    which results in a better characterization of the data.

    2) Adjustable misclassification penalties help avoid the most costly
    errors -- CART can accommodate situations in which some
    misclassifications, or cases that have been incorrectly classified, are
    more serious than others. CART users can specify a higher penalty for
    misclassifying certain data, and the software will steer the tree away
    from that type of error. Further, when CART cannot guarantee a correct
    classification, it will try to ensure that the error it does make is less
    costly. If credit risk is classified as low, moderate, or high, for example,
    it would be much more costly to classify a high risk person as low risk
    than as moderate risk.

    3) Alternative splitting criteria make progress when other criteria fail --
    CART includes seven (7) single variable splitting criteria - Gini,
    symmetric Gini, twoing, ordered twoing and class probability for
    classification trees, and least squares and least absolute deviation for
    regression trees - and one multi-variable splitting criterion, the linear
    combinations method. The default Gini method typically performs best,
    but, given specific circumstances, other methods can generate more
    accurate models. CART's unique "twoing" procedure, for example, is
    tuned for classification problems with many classes, such as
    modeling which of 170 products would be chosen by a given
    consumer. To deal more effectively with select data patterns, CART
    also offers splits on linear combinations of continuous predictor
    variables.
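
    The misclassification-penalty idea from item 2 can be illustrated
    with the credit risk example. The sketch below only shows how a
    terminal node might be labeled once penalties are known; how CART
    uses penalties while growing the tree is not covered. The penalty
    values and class proportions are hypothetical.

```python
CLASSES = ["low", "moderate", "high"]

# COST[actual][predicted]: hypothetical penalties. Calling a high-risk
# person "low" is penalized far more than calling them "moderate".
COST = {
    "low":      {"low": 0,  "moderate": 1, "high": 2},
    "moderate": {"low": 2,  "moderate": 0, "high": 1},
    "high":     {"low": 10, "moderate": 3, "high": 0},
}

def expected_cost(predicted, class_probs):
    """Average penalty if every case in the node is labeled 'predicted'."""
    return sum(p * COST[actual][predicted] for actual, p in class_probs.items())

def cheapest_label(class_probs):
    """Label that minimizes the expected penalty for the node."""
    return min(CLASSES, key=lambda label: expected_cost(label, class_probs))

# Hypothetical terminal node: half of its training cases are low risk.
LEAF = {"low": 0.5, "moderate": 0.3, "high": 0.2}

print(max(LEAF, key=LEAF.get))   # low       (most common class)
print(cheapest_label(LEAF))      # moderate  (least costly label)
```

    With equal penalties the node would be labeled "low"; the heavy
    penalty on high-as-low errors moves the label to "moderate."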

    Model Deployment -- Any CART model can be easily deployed when
    translated into one of the supported languages, SAS-compatible code,
    C, or Predictive Model Markup Language (PMML)/Extensible Markup
    Language (XML), or into classic text output. This is critical for using
    your CART trees in large-scale production work.

    The decision logic of a CART tree, including the surrogate rules utilized
    if primary splitting values are missing, is automatically implemented.
    The resulting source code can be dropped into an external
    application, eliminating errors due to hand coding of decision rules
    and enabling fast and accurate model deployment.
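
    To give a feel for what translating a tree into source code involves,
    the sketch below turns a small tree into a C function as text. It is
    not CART's exporter: the tree layout, function names, and output
    format are assumptions, and surrogate rules for missing values are
    left out.

```python
def to_c(node, indent=1):
    """Render one node, and everything below it, as nested if/else."""
    pad = "    " * indent
    if "prediction" in node:
        return f"{pad}return {node['prediction']};\n"
    variable, threshold = node["question"]
    return (
        f"{pad}if ({variable} <= {threshold}) {{\n"
        + to_c(node["yes"], indent + 1)
        + f"{pad}}} else {{\n"
        + to_c(node["no"], indent + 1)
        + f"{pad}}}\n"
    )

def export_c(tree, name, variables):
    """Wrap the rendered rules in a C function that returns a class code."""
    args = ", ".join(f"double {v}" for v in variables)
    return f"int {name}({args}) {{\n" + to_c(tree) + "}\n"

# Hypothetical one-split tree; predictions are integer class codes.
SMALL_TREE = {
    "question": ("income", 40000),
    "yes": {"prediction": 0},
    "no": {"prediction": 1},
}

print(export_c(SMALL_TREE, "score", ["income"]))
```

    Generating the rules mechanically from the tree is what removes the
    hand-coding step.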

    Additional Features/Benefits are:

    1) Scalable -- Easily and quickly handles gigabyte-sized datasets.

    2) GUI and Command-Line Interfaces -- Intuitive point-and-click and
    command-line control modes (issue commands at a prompt or via
    batch files).

    3) Multiple Variable Types -- Efficiently searches any combination of
    categorical, continuous, and text data.

    4) Automatic Self-Testing Procedures -- Automatically validates tree
    results using cross validation or user-specified test data.

    5) Committee of Experts -- Yields higher accuracy by combining trees
    with new bootstrap resampling and ARCing technologies.
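
    The bootstrap side of the "committee of experts" idea can be sketched
    as follows: fit one small model per resample, drawn with replacement,
    and let the models vote. The base learner here is a deliberately tiny
    one-split rule, not a CART tree, and ARCing is not shown. All names
    and data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Tiny base learner: threshold halfway between the two class means.

    Assumes class 1 tends to have larger x values than class 0.
    """
    zeros = [x for x, y in zip(xs, ys) if y == 0]
    ones = [x for x, y in zip(xs, ys) if y == 1]
    if not zeros or not ones:            # the resample missed one class
        only = ys[0]
        return lambda x: only
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0

def bagged_committee(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap resample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]   # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def vote(models, x):
    """Majority vote of the committee for one new value."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

# Hypothetical one-dimensional training data.
XS = [1, 2, 3, 4, 6, 7, 8, 9]
YS = [0, 0, 0, 0, 1, 1, 1, 1]

COMMITTEE = bagged_committee(XS, YS)
print(vote(COMMITTEE, 10), vote(COMMITTEE, 0))
```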

    Note: For additional info and features for CART 5.0 please click here.

    System Requirements

    See http://www.salfordsystems.com/1132.php

    Manufacturer   Home office; see web site for international locations.

    Salford Systems
    4740 Murphy Canyon Rd. Ste 200
    San Diego, Calif. 92123
    Tel: 619.543.8880
    Fax: 619.543.8888
    info@salford-systems.com
    support@salford-systems.com

The G6G Directory of Omics and Intelligent Software