Web site and design © 2008 by G6G Consulting Group. All Rights Reserved. Most product content has been taken directly from manufacturer's web sites;
other product content is assembled by G6G Consulting Group. G6G welcomes any corrections and/or comments.
    CART 5.0 New and Enhanced Features

    Category  Intelligent Software>Data Mining Systems/Tools

    Abstract  CART data mining software is a decision tree tool that
    automatically sifts large, complex databases, searching for and
    isolating significant patterns and relationships. This discovered
    knowledge is then used to generate reliable, predictive models for
    applications such as credit risk scoring (probability of default, loss
    given default); fraud detection; targeted marketing (new customer
    acquisition, cross-sell, up-sell); churn modeling [and related customer
    relationship management (CRM)]; document classification; microarray
    data analysis; genomics and proteomics; and manufacturing and
    production line quality control.

    CART 5.0 incorporates many new user-requested enhancements and
    features:

    Discrete Variables -- Discrete variables (a.k.a. categorical variables)
    are those that take on a finite set of distinct values. Predictor variables
    can be discrete, as can the target variable (in which case the model is
    a classification tree).

    CART 5 handles discrete variables in the following more flexible and
    easier-to-use ways:

    1) Ability to Automatically Detect Distinct Classes -- It is no longer
    necessary to identify how many distinct classes a discrete variable
    has, even if the variable is the target. Thus, the user has only to identify
    which variables are to be treated as discrete; CART will figure out the
    rest.

    2) Fractional Values Can Be Specified as Categorical -- Numeric
    discrete variables no longer need to take on whole-number values,
    nor do the values need to be contiguous. For instance, the following
    series of distinct values is supported in CART 5: 0.01, 0.1, 1.0, 1.001,
    200, -500.

    3) Character Data -- Character variables are now fully supported, as
    predictors, as the target or as auxiliary variables (described below).
    This is an important new feature because modern data, especially
    those arising from web logs and Internet transactions, are often
    character in nature.
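
    The three behaviors above can be illustrated with a short Python
    sketch. This is not CART's implementation; the function names
    distinct_classes and encode are hypothetical. It only shows that
    classes can be found by scanning the data, whether the values are
    fractional, non-contiguous, or character strings.

```python
def distinct_classes(values):
    """Return the sorted distinct non-missing values of a discrete variable."""
    seen = {v for v in values if v is not None}
    try:
        # Numbers sort numerically, strings alphabetically.
        return sorted(seen)
    except TypeError:
        # Mixed types: fall back to ordering by string form.
        return sorted(seen, key=str)


def encode(values):
    """Map each value to the index of its class; missing values stay None."""
    classes = distinct_classes(values)
    index = {v: i for i, v in enumerate(classes)}
    return classes, [index.get(v) for v in values]


# The series of values quoted in the text, plus a character variable.
numeric_classes, numeric_codes = encode([0.01, 0.1, 1.0, 1.001, 200, -500])
text_classes, text_codes = encode(["GET", "POST", "GET", None])
print(numeric_classes, numeric_codes)
print(text_classes, text_codes)
```

    The user supplies only the column; the number of classes is never
    declared in advance.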

    Native Support for Text Data -- CART 5 includes native support for text
    datasets, the most flexible and natural format for many users to
    maintain data. A single delimiter is used throughout the dataset,
    usually a comma, but semicolon, space, and tab are also supported
    as delimiters.
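
    As an illustration of reading such a file, the Python sketch below
    tries each of the four delimiters named above and keeps the first one
    that splits every line into the same number of fields. The detection
    rule and the names guess_delimiter and read_table are assumptions made
    for this example; how CART itself chooses the delimiter is not stated
    here.

```python
import csv
import io

# Delimiters named in the text: comma, semicolon, tab, space.
CANDIDATE_DELIMITERS = [",", ";", "\t", " "]


def guess_delimiter(text):
    """Return the first candidate that splits every non-empty line into
    the same number (more than one) of fields, or None if none does."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for delim in CANDIDATE_DELIMITERS:
        widths = {len(row) for row in csv.reader(lines, delimiter=delim)}
        if len(widths) == 1 and widths.pop() > 1:
            return delim
    return None


def read_table(text, delimiter=None):
    """Split delimited text into a header row and a list of records."""
    delimiter = delimiter or guess_delimiter(text)
    if delimiter is None:
        raise ValueError("no single consistent delimiter found")
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    return rows[0], rows[1:]


sample = "age;income;region\n34;51000.5;west\n58;0;east\n"
header, records = read_table(sample)
print(header, records)
```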

    Data Information -- The 'Data Info' window is a new display in CART 5
    that offers summary information about the variables in your dataset.
    For continuous variables it reports N, mean, sum, min, max, variance,
    standard deviation, skewness, kurtosis, conditional mean, and the
    counts of values equal and unequal to 0.0; these statistics may be
    weighted by a case weight variable. Also available is a fully weighted
    tabulation of distinct values, along with quantiles, quartiles and
    interquartile range, and the N most and least frequent values.
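
    To make the role of the case weight concrete, here is a minimal
    Python sketch of a few of these statistics. The function name
    weighted_summary is hypothetical, and the variance is computed as a
    weighted population variance, which is an assumption; the exact
    formulas CART uses are not given here.

```python
import math


def weighted_summary(values, weights=None):
    """Weighted summary statistics for one continuous variable.

    Missing values (None) are skipped. Without weights, every case
    counts once. Variance is the weighted population variance (an
    assumption made for this sketch).
    """
    if weights is None:
        weights = [1.0] * len(values)
    pairs = [(x, w) for x, w in zip(values, weights) if x is not None]
    n = sum(w for _, w in pairs)
    if n <= 0:
        raise ValueError("no non-missing cases with positive total weight")
    total = sum(x * w for x, w in pairs)
    mean = total / n
    variance = sum(w * (x - mean) ** 2 for x, w in pairs) / n
    return {
        "n": n,
        "sum": total,
        "mean": mean,
        "min": min(x for x, _ in pairs),
        "max": max(x for x, _ in pairs),
        "variance": variance,
        "std_dev": math.sqrt(variance),
        "n_zero": sum(w for x, w in pairs if x == 0.0),
        "n_nonzero": sum(w for x, w in pairs if x != 0.0),
    }


print(weighted_summary([0.0, 2.0, 4.0, None]))
print(weighted_summary([1.0, 3.0], weights=[1.0, 3.0]))
```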

    Auxiliary Variables -- CART 5 introduces the "auxiliary" variable: any
    variable (discrete/continuous, character/numeric) that is summarized
    with descriptive statistics or a frequency distribution at any node
    level. Auxiliary variables need not be predictors in the model,
    although they can be. For example, profit and revenue measures in the
    dataset can be summarized for each node without affecting the growth
    of the tree (i.e., they are not predictors in the model), allowing the
    most profitable partitions to be identified.
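
    The profit example can be sketched in Python as follows. The sketch
    assumes each record has already been assigned to a node by the tree;
    the node identifiers, the profit figures, and the function names are
    all hypothetical.

```python
from collections import defaultdict


def summarize_by_node(node_ids, aux_values):
    """Return {node_id: (count, mean)} for an auxiliary variable.

    The auxiliary values play no part in building the tree; they are
    only grouped by the node each record fell into.
    """
    groups = defaultdict(list)
    for node, value in zip(node_ids, aux_values):
        if value is not None:
            groups[node].append(value)
    return {node: (len(vals), sum(vals) / len(vals))
            for node, vals in groups.items()}


def most_profitable_node(node_ids, aux_values):
    """Return the node whose auxiliary variable has the highest mean."""
    summary = summarize_by_node(node_ids, aux_values)
    return max(summary, key=lambda node: summary[node][1])


nodes = [1, 1, 2, 2, 3]                    # hypothetical node assignments
profit = [10.0, 20.0, 50.0, 70.0, 5.0]     # hypothetical profit per record
print(summarize_by_node(nodes, profit))
print(most_profitable_node(nodes, profit))
```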

    Groves -- CART 5 introduces "grove" files, which replace the pre-CART
    5 .TR1 file. A grove file is a binary file that stores all the information
    about the tree sequence needed to apply any tree from the sequence
    to new data, or to translate (export) the tree into a different presentation
    language. Grove files contain a variety of information, including node
    information, the optimal tree indicator, and predicted probabilities.
    Grove files are not limited to storing only one tree sequence, but may
    contain entire collections of trees obtained as a result of bagging,
    arcing, or cross validation. The file format is flexible enough to
    accommodate further extensions and exotic tree-related objects
    created in other Salford Systems' applications.

    Note: Once a grove file is created, it can be translated into SAS-
    compatible, C, and Predictive Model Markup Language (PMML)
    languages.

    Exporting CART Model Information -- CART 5 can export the model
    information contained in the binary grove file, including primary and
    surrogate splitting rules, as source code in several languages. The
    files containing the exported code can be used outside CART for
    scoring data. Export formats currently supported are SAS-compatible
    code, C, and PMML.

    Missing Value Summary Report -- This report identifies the proportion
    of records missing for the target and each predictor variable, and for
    each sample (learn/test), sorted from most- to least-missing.
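
    A report of this kind could be computed along the following lines.
    This Python sketch assumes records are dictionaries in which a
    missing value is stored as None; the function names and the sample
    data are hypothetical.

```python
def missing_value_report(records, variables):
    """Return [(variable, fraction_missing)], most-missing first."""
    total = len(records)
    if total == 0:
        return []
    report = []
    for var in variables:
        missing = sum(1 for rec in records if rec.get(var) is None)
        report.append((var, missing / total))
    return sorted(report, key=lambda item: item[1], reverse=True)


def report_by_sample(samples, variables):
    """Build one report per sample, e.g. {"learn": [...], "test": [...]}."""
    return {name: missing_value_report(recs, variables)
            for name, recs in samples.items()}


samples = {
    "learn": [
        {"target": 1, "x1": None, "x2": 3.0},
        {"target": 0, "x1": None, "x2": None},
        {"target": 1, "x1": 2.5, "x2": 1.0},
        {"target": 0, "x1": None, "x2": 4.0},
    ],
    "test": [
        {"target": None, "x1": 1.0, "x2": 2.0},
        {"target": 1, "x1": 0.5, "x2": 2.0},
    ],
}
print(report_by_sample(samples, ["target", "x1", "x2"]))
```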

    Entropy Splitting Rule -- This well-known splitting rule is related to the
    likelihood function. With multilevel targets it tends to look for splits
    where some or as many levels as possible are divided perfectly or
    near perfectly. As a result Entropy puts more emphasis on getting rare
    levels right relative to common levels than either Gini or Twoing. In
    different circumstances, its properties may be similar to Gini or Twoing
    or somewhere between them.
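
    For readers who want to compare the rules numerically, the Python
    sketch below gives the textbook definitions of entropy and Gini
    impurity and the improvement a split achieves under either one. It
    is a generic illustration, not CART's code, and it omits Twoing; the
    class counts in the example are hypothetical.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a node with the given class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def gini(counts):
    """Gini impurity of a node with the given class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def impurity_decrease(impurity, parent, left, right):
    """Parent impurity minus the size-weighted impurity of the children."""
    n, n_left, n_right = sum(parent), sum(left), sum(right)
    return (impurity(parent)
            - (n_left / n) * impurity(left)
            - (n_right / n) * impurity(right))


# Hypothetical split that perfectly isolates a rare class (10 of 100 cases).
parent, left, right = [90, 10], [0, 10], [90, 0]
print("entropy improvement:", impurity_decrease(entropy, parent, left, right))
print("gini improvement:   ", impurity_decrease(gini, parent, left, right))
```

    Running the same comparison on candidate splits from your own data
    shows where the two rules rank splits differently.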

    32-Character Variable Names -- CART 5 supports variable names up
    to 32 characters.

    Path Length Extended to Windows Maximum -- CART 5 supports a
    Windows maximum path length of 256 characters (including the file
    name).

    Improved Navigator Window --

    1) The tree topology navigator now allows you to display either the
    learn sample or the test sample.

    2) Toggle the secondary navigator window panel to display terminal
    node counts or the relative cost curve with an emphasis on all trees
    within one standard deviation.

    3) The new action button in the navigator allows the user to save the
    navigator and the grove file, score new data, or translate the tree into
    one of the available languages.

    4) Compare learn and test samples at any level of the tree.

    5) View the tree topology display with the focus on any specified
    auxiliary variable.

    6) View auxiliary variable descriptive statistics or frequency tables at
    any level of the tree.

    New and Improved Summary Reports --

    1) An improved terminal node report now enables you to evaluate the
    purity or homogeneity of the terminal nodes, an indication of how well
    CART has partitioned the classes.

    2) A prediction success report allows you to specify a focus target
    class, enabling quicker analysis of the most important class.

    3) A learn/test sample breakdown is available in a majority of the post-
    processing result windows and dialogs.

    4) Result windows allow the user to toggle displays between the
    percent of data or the number of cases.

    5) Result windows allow the user to choose among various graph
    forms.

    Additional Tree Details for Viewing and Printing -- An increased level of
    tree detail provides more information, allowing greater control when
    displaying and printing your trees.

    1) Display weighted and/or unweighted case counts.

    2) Specify level of decimal place precision.

    3) Target class breakdown in a color-coded histogram.

    4) Display a node-splitting variable name inside and/or outside the
    node display.

    5) Specify and save the default level of detail for subsequent displays.

    6) Specify detail for internal and terminal nodes separately.

    7) Quickly change or set the levels of magnification for clear on-screen
    viewing.

    8) Toggle the tree displays for "compact" or "expanded" views.

    9) Display a class table with or without color coding.

    Improved Model Setup -- Quickly specify the target, predictor,
    categorical, weighting, and auxiliary variables in a single setup tab.

    Easy Data Access -- The manufacturer has retained the direct link to
    DBMS/Copy, which provides access to over 90 different file formats,
    including more than ten (10) new formats. For example, you can import
    from and export to statistical analysis packages (e.g., SAS, SPSS) and
    spreadsheets (e.g., Excel, Lotus). Native support for text datasets
    has also been added.

    Printing -- Automatic page fitting allows the user to print trees on two
    (2) pages when possible. Upgraded support for large-format printing
    and plot printing allows the user to produce presentation-quality
    prints of large trees on a single sheet of paper.


    System Requirements

    See http://www.salfordsystems.com/1132.php

    Manufacturer   Home office; see web site for international locations.

    Salford Systems
    4740 Murphy Canyon Rd. Ste 200
    San Diego, Calif. 92123
    Tel: 619.543.8880
    Fax: 619.543.8888
    info@salford-systems.com
    support@salford-systems.com

The G6G Directory of Omics and Intelligent Software