XMLMiner
Category Intelligent Software>Data Mining Systems/Tools, Intelligent Software>Expert (Knowledge Based) Systems/Tools, Intelligent Software>Fuzzy Logic Systems/Tools and Intelligent Software>Genetic Algorithm Systems/Tools
Abstract XMLMiner is a web service and class library for mining data and text expressed in Extensible Markup Language (XML). It extracts knowledge and re-uses that knowledge in the form of ‘fuzzy logic’ expert system rules. With XMLMiner you can mine the structure, data and text of a document simultaneously. XMLMiner can also be used as a full-featured, low-cost, online Business Rules system.
Product features/capabilities include:
1) Use it to predict numeric values, to categorize and classify data, to infer the relevance and topics of text, and to mine the structure of XML documents.
2) XML data is everywhere and can be easily generated from any data source, but it can be unstructured and sparse. XmlMiner is one of the first data mining tools to mine any data that can be expressed in XML.
3) XmlMiner is configured via XML, reads XML, and creates results in XML using the manufacturer's 'Metarule' schema (see below).
4) XmlMiner performs both 'Supervised learning' of numeric, categorical, structural or textual values to a given numeric or categorical output and 'Association learning', where a data set is searched for all useful relationships between data or structural values.
5) You can convert Metarule to the easily understood English language 'if...then rules' using an 'Extensible Stylesheet Language (XSL) transform' that the manufacturer supplies, so you can see what's been discovered.
6) You can apply Metarule rules to new data, either supplied directly or embedded in an XML document, and have the results available for use in your programs or embedded into a copy of the source XML.
7) XmlMiner is standards-based and compatible with other standards-based tools.
8) XmlMiner comes with development tools integrated into the web service.
9) XmlMiner integrates 'text mining' seamlessly so that blocks of embedded text can be handled at the same time as numeric and categorical data.
10) XmlMiner is also available implemented as .Net and Java class libraries. You can create products for any platform.
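As a rough illustration of item 6 above (applying rules to data embedded in an XML document and writing the results into a copy of the source XML), here is a minimal Python sketch. It does not use XmlMiner or Metarule: the `classify` function is a hypothetical stand-in for a mined rule set, and the `customers`, `age`, `income` and `segment` element names are invented for the example.

```python
import copy
import xml.etree.ElementTree as ET

# Invented sample document; the element names are assumptions for illustration.
SOURCE = """<customers>
  <customer><age>25</age><income>90000</income></customer>
  <customer><age>60</age><income>30000</income></customer>
</customers>"""

def classify(age, income):
    """Hypothetical stand-in for a mined rule set (not real Metarule)."""
    return "premium" if age < 40 and income > 50000 else "standard"

def annotate(root):
    """Return a copy of the tree with a <segment> result added to each record."""
    result = copy.deepcopy(root)  # the source tree itself is left untouched
    for customer in result.iter("customer"):
        age = float(customer.findtext("age"))
        income = float(customer.findtext("income"))
        ET.SubElement(customer, "segment").text = classify(age, income)
    return result

source_root = ET.fromstring(SOURCE)
annotated = annotate(source_root)
print(ET.tostring(annotated, encoding="unicode"))
```

Working on a deep copy mirrors the "copy of the source XML" wording: the original document stays available unchanged alongside the annotated one.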
Metarule -- is the rule language used by XmlMiner. As you'd expect, it's created in XML, and is fully standards compliant.
Metarule is created as the output of data or text mining and can be used by the 'inference engine' to process new data. It can also be created by hand using the 'interactive guided editor'.
Note: The guided editor takes you through the process of creating rules interactively (No programming necessary). At each stage of editing it shows you only the alternatives that are valid, thus ensuring you can't create syntactically incorrect rules. You edit and develop rules by selecting from alternatives.
Metarule represents knowledge in the form of 'fuzzy logic' production rules. These can be of three (3) different forms:
Type 1) if "conditions" then "output" will be "category"
Type 2) if "conditions" then "output" will be "fuzzy set"
Type 3) if "conditions" then "output" will be "arithmetic expression"
The rules can contain both logical and numerical elements: you can calculate the value of expressions and then use them as the source of comparisons in logical expressions. Using 'Type 3' rules you can use logical expressions to decide which arithmetic expression applies in a particular case. The language supports 'fuzzy sets' and 'fuzzy numbers'.
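The three rule forms can be pictured with a small Python sketch. This is not Metarule syntax and not the XmlMiner inference engine; the fuzzy sets (`young`, `high_income`), the category and set names, and the use of `min` as the fuzzy AND are all assumptions chosen for illustration.

```python
# Illustrative only: none of these names come from Metarule or XmlMiner.

def triangular(a, b, c):
    """Return a membership function for a triangular fuzzy set peaking at b."""
    def membership(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return membership

# Hypothetical fuzzy sets over an "age" input and an "income" input.
young = triangular(0, 20, 40)
high_income = triangular(40_000, 80_000, 120_000)

def fuzzy_and(*degrees):
    """Combine condition degrees with min, one common choice of fuzzy AND."""
    return min(degrees)

def type1_rule(age, income):
    """Type 1: if conditions then output will be a category."""
    return ("premium", fuzzy_and(young(age), high_income(income)))

def type2_rule(age, income):
    """Type 2: if conditions then output will be a (named) fuzzy set."""
    return ("high", fuzzy_and(young(age), high_income(income)))

def type3_rule(age, income):
    """Type 3: a logical test decides which arithmetic expression applies."""
    if young(age) >= 0.5:
        return income * 0.10
    return income * 0.05
```

In this sketch the Type 1 and Type 2 rules return the degree to which their conditions hold alongside the output, while the Type 3 rule returns a computed number.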
Metarule contains a 'data dictionary' section that permits you to define the inputs and outputs used by a rule set. Inputs and outputs are typed, and the type of the outputs defines the rule forms that can be used.
Data types for inputs are: categorical, numeric, textual, arity and presence. The latter two (2) types are used in 'structure mining'; the textual data type permits text mining of embedded text items in XML data.
'Outputs' can be categorical, numeric, arity and presence.
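The typing constraints just described can be sketched as a small validator. Only the type names come from the description above; the `DataDictionary` class and its methods are hypothetical and say nothing about how Metarule actually stores its data dictionary.

```python
from dataclasses import dataclass, field

# Type names are taken from the text; everything else here is an assumption.
INPUT_TYPES = {"categorical", "numeric", "textual", "arity", "presence"}
OUTPUT_TYPES = {"categorical", "numeric", "arity", "presence"}

@dataclass
class DataDictionary:
    """Hypothetical in-memory model of a rule set's inputs and outputs."""
    inputs: dict = field(default_factory=dict)
    outputs: dict = field(default_factory=dict)

    def add_input(self, name, type_name):
        if type_name not in INPUT_TYPES:
            raise ValueError(f"unsupported input type: {type_name}")
        self.inputs[name] = type_name

    def add_output(self, name, type_name):
        # "textual" is valid for inputs but is not listed as an output type.
        if type_name not in OUTPUT_TYPES:
            raise ValueError(f"unsupported output type: {type_name}")
        self.outputs[name] = type_name

dictionary = DataDictionary()
dictionary.add_input("review_text", "textual")
dictionary.add_output("sentiment", "categorical")
```

Attempting `dictionary.add_output("summary", "textual")` raises `ValueError`, reflecting that textual values can be mined as inputs but are not among the listed output types.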
Lacuna module (testing) overview -- An important issue with any form of programming is that of testing.
Once you've created rule sets, either with the 'interactive editor' or with XmlMiner, how do you test them?
Both the editor and XmlMiner create syntactically correct Metarules automatically. The problem is deciding if your rule set does the things you intended, and more importantly doesn't do things you didn't intend.
In creating any rule-based system, the biggest problem is that a set of circumstances may have been overlooked. The author may have forgotten, or Not thought through, all the different combinations of inputs that might occur, and so Not coded solutions for them.
If you are using XmlMiner this situation can still arise if the data set used to create the rules is Not general enough. If there are No examples in the data set for a particular condition that might arise, XmlMiner can't learn that condition.
These 'gaps' in a set of rules are known as 'lacunae' - hence Lacuna's name.
Lacuna is intended to solve this problem. Lacuna takes a rule set and creates values automatically to fire at it. It uses the latest research in 'Genetic Algorithms' to create a large population of patterns that are genetically combined to improve the search for gaps in the rule set.
In a normal Lacuna testing run, millions of values are fired at the rule set to try to find its weak points. If you are using the web service version of this product, this is all performed on the manufacturer's servers.
The result is a report outlining the conditions that have been found that do Not generate an output, or generate only a low confidence output from the rule set.
It may be that when you look at the list, none of the lacunae found are important or likely to occur in the real world. If there are lacunae, the report tells you which combinations of input values are required to generate them, so you can construct further rules to get rid of them.
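The general idea of searching a rule set for gaps with a genetic algorithm can be sketched as follows. This is a generic, textbook-style GA and not Lacuna's actual algorithm: the toy rule set, the two inputs, the 0.2 confidence threshold, and all function names are assumptions made for the example.

```python
import random

def rule_confidence(x, y):
    """Toy rule set: strength of the strongest rule that fires, in [0, 1]."""
    r1 = max(0.0, 1.0 - x / 50.0)                      # "x is low"
    r2 = min(max(0.0, (x - 50.0) / 50.0),              # "x is high
             max(0.0, 1.0 - y / 50.0))                 #  and y is low"
    return max(r1, r2)

def find_lacunae(confidence, bounds, threshold=0.2, pop_size=40,
                 generations=20, seed=0):
    """Evolve input pairs toward low confidence; return those below threshold."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [(rng.uniform(lo, hi), rng.uniform(lo, hi))
                  for _ in range(pop_size)]
    found = set()
    for _ in range(generations):
        # Lower confidence means a "fitter" probe for gaps.
        scored = sorted(population, key=lambda p: confidence(*p))
        found.update(p for p in scored if confidence(*p) < threshold)
        parents = scored[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = (a[0], b[1])                       # simple crossover
            child = tuple(min(hi, max(lo, gene + rng.gauss(0, 5)))
                          for gene in child)           # mutation, clamped
            children.append(child)
        population = parents + children
    return sorted(found)

lacunae = find_lacunae(rule_confidence, (0.0, 100.0))
print(f"{len(lacunae)} low-confidence input combinations found")
```

Each entry in `lacunae` is an input combination for which the toy rule set produces little or no output, which is the kind of information the report described above is said to contain.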
Lacuna is part of the XmlMiner web service and class library. To use Lacuna you just supply the rule set and wait for processing to end. We recommend asynchronous calls with the web service as this can take seconds or minutes to process.
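The asynchronous calling pattern recommended here might look like the sketch below. The web service's real operations are not described in this listing, so `submit_rule_set` is a purely hypothetical placeholder that simulates a slow request with a short sleep; only the non-blocking structure is the point.

```python
import asyncio

async def submit_rule_set(rule_set_xml):
    """Hypothetical stand-in for a long-running Lacuna request (not a real API)."""
    await asyncio.sleep(0.1)  # simulates seconds or minutes of server-side work
    return {"lacunae": [], "rule_set_length": len(rule_set_xml)}

async def main():
    # Start the request without blocking, then keep doing other work.
    task = asyncio.create_task(submit_rule_set("<rules/>"))
    progress = []
    while not task.done():
        progress.append("still waiting")
        await asyncio.sleep(0.02)
    return await task, progress

report, progress = asyncio.run(main())
print(report, len(progress))
```

The caller remains responsive while the simulated job runs and collects the report once it completes.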
System Requirements
Web-based.
Manufacturer
- Scientio LLC
- 222 Lakeview Ave Ste. 160-141
- West Palm Beach
- Florida 33401
- USA
- Sales: sales@scientio.com
- Tel: (302)-351-5663
Manufacturer Web Site XMLMiner
Price Commercial: $5,999.00 Per CPU. Development: $599.00 Per developer seat.
G6G Abstract Number 20184
G6G Manufacturer Number 102337


