Data Mining Tools See5 and C5.0
Data mining is all about extracting patterns from an organization's stored or warehoused data. These patterns can be used to gain insight into aspects of the organization's operations, and to predict outcomes for future situations as an aid to decision-making.
Patterns often concern the categories to which situations belong. For example, is a loan applicant creditworthy or not? Will a certain segment of the population ignore a mailout or respond to it? Will a process give high, medium, or low yield on a batch of raw material?
See5 (Windows 8/10) and its Linux counterpart C5.0 are sophisticated data mining tools for discovering patterns that delineate categories, assembling them into classifiers, and using them to make predictions.
Some important features:
- See5/C5.0 has been designed to analyze substantial databases containing thousands to millions of records and tens to hundreds of numeric, time, date, or nominal fields. See5/C5.0 also takes advantage of computers with up to eight cores in one or more CPUs (including Intel Hyper-Threading) to speed up the analysis.
- To maximize interpretability, See5/C5.0 classifiers are expressed as decision trees or sets of if-then rules, forms that are generally easier to understand than neural networks.
- See5/C5.0 is available for Windows 8/10 and Linux.
- See5/C5.0 is easy to use and does not presume any special knowledge of statistics or machine learning (although these don't hurt, either!).
- RuleQuest provides C source code so that classifiers constructed by See5/C5.0 can be embedded in your organization's own systems.
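
To give a feel for the if-then rule form mentioned in the features above, here is a minimal sketch in Python built around the loan applicant example. Everything in it is an assumption made for illustration: the field names (`income`, `missed_payments`), the thresholds, the rules themselves, and the first-match evaluation order are invented, and none of it is See5/C5.0 output or the way See5/C5.0 itself applies a ruleset.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Rule:
    description: str                      # human-readable form of the rule
    condition: Callable[[Record], bool]   # test applied to one record
    label: str                            # category predicted when the test holds


# Hypothetical rules, invented for illustration only (not mined from real data).
RULES: List[Rule] = [
    Rule(
        "income >= 50000 and missed_payments == 0",
        lambda r: r["income"] >= 50000 and r["missed_payments"] == 0,
        "creditworthy",
    ),
    Rule(
        "missed_payments > 2",
        lambda r: r["missed_payments"] > 2,
        "not creditworthy",
    ),
]

# Category used when no rule applies (also an assumption of this sketch).
DEFAULT_LABEL = "not creditworthy"


def classify(record: Record,
             rules: List[Rule] = RULES,
             default: str = DEFAULT_LABEL) -> str:
    """Return the label of the first rule whose condition holds, else the default."""
    for rule in rules:
        if rule.condition(record):
            return rule.label
    return default


if __name__ == "__main__":
    applicant = {"income": 60000, "missed_payments": 0}
    print(classify(applicant))
```

Because each rule is a readable condition paired with a category, a reviewer can see which rule produced a given prediction, which is the interpretability advantage the feature list describes.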
If you would like to learn more about See5/C5.0 or try out the systems, here are some useful links:
- Source code for a single-threaded version of C5.0 (Linux) is available under the GNU GPL. Please see the downloads page.
- Links to several publications by See5/C5.0 users are available here.
- Tutorials describing and illustrating the use of See5/C5.0 are available for the Windows and Linux versions.
- Free demonstration versions (limited to small datasets) and the public code to read and interpret See5/C5.0 classifiers are available from our downloads page.
- If you have tried earlier versions of See5/C5.0, here is a summary of new features in Release 2.11a.
© RULEQUEST RESEARCH 2020 | Last updated October 2020