New in Release 2.07

64-bit Windows support

This release includes 64-bit versions of Cubist and CubistX (the batch executable). These versions allow the use of more than 2GB of memory, as required by some extremely large data mining tasks.

The 32-bit release of Cubist will run under either 32-bit or 64-bit Windows, so there is no need to change unless your tasks may use more than 2GB of memory. The 64-bit version of Cubist will run only under 64-bit Windows XP, Windows Vista, or Windows 7.

The network version of Cubist includes both 32-bit and 64-bit versions for installation on client PCs. A client PC running 64-bit Windows XP/Vista/7 can install and use the 64-bit version, even if the server runs 32-bit Windows.

Cubist continues to be available in both 32-bit and 64-bit versions for Linux.

New option: unbiased rules

By default, Cubist rules attempt to minimize absolute error on unseen cases. Minimizing absolute error centers the median residual, rather than the mean residual, on zero, so a Cubist rule is generally biased: its mean prediction differs from the mean target value of the training cases that it covers. The new option leads to approximately unbiased rules, but at the cost of greater absolute error.

This option is recommended for applications where there are many cases with the same target value (such as zero). Unbiased rules will usually give more variation in predicted values near this common value.
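The trade-off can be illustrated with a small sketch. It is a deliberate simplification: a real Cubist rule predicts with a linear model, whereas here each "rule" predicts a single constant, and the target values are invented for illustration. The helper names `mean_abs_error` and `bias` are hypothetical and are not part of Cubist.

```python
from statistics import mean, median

def mean_abs_error(values, prediction):
    """Average absolute error of a constant prediction over the values."""
    return sum(abs(v - prediction) for v in values) / len(values)

def bias(values, prediction):
    """Constant prediction minus the mean of the values."""
    return prediction - mean(values)

# Hypothetical targets covered by one rule: many zeros, a few large values.
targets = [0, 0, 0, 0, 0, 0, 1, 2, 10, 27]

median_pred = median(targets)  # the constant that minimizes absolute error
mean_pred = mean(targets)      # the constant with zero bias

for name, pred in [("median", median_pred), ("mean", mean_pred)]:
    print(name, pred, mean_abs_error(targets, pred), bias(targets, pred))
```

With these values the median prediction has the lower absolute error but sits well below the mean of the cases, while the mean prediction is unbiased and pays for it in absolute error. This mirrors, in miniature, why an unbiased rule can be preferable when a single target value such as zero dominates.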

Changes to composite models

Release 2.07 is faster when it generates composite models for large applications. Cubist's procedures for setting the number of nearest neighbors have also been changed, and distances are now calculated to a higher precision.

Improved public code

Rulequest provides public source code that enables models generated by Cubist to be employed in users' programs.

Warning: The public code for Release 2.07 cannot be used with models produced by previous Cubist releases.

Bug fixes

For composite models, the distance to a neighbor could be underestimated in some rare circumstances.

The scatterplot of real versus predicted values could fail for very large applications with hundreds of thousands of cases. In such situations Cubist now shows a scatterplot for only a sample of cases, although the statistics displayed are still computed over all cases.
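The general idea of plotting a sample while reporting statistics over every case can be sketched as follows. This is not Cubist's implementation; the function names, the `max_points` threshold, and the synthetic data are all assumptions made for illustration.

```python
import random

def plot_sample(actual, predicted, max_points=20000, seed=0):
    """Return at most max_points (actual, predicted) pairs for plotting."""
    pairs = list(zip(actual, predicted))
    if len(pairs) <= max_points:
        return pairs
    # Sample without replacement; a fixed seed keeps the plot reproducible.
    return random.Random(seed).sample(pairs, max_points)

def average_abs_error(actual, predicted):
    """Average absolute error computed over all cases, not just the sample."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Synthetic stand-in for a very large application.
rng = random.Random(1)
actual = [rng.uniform(0, 100) for _ in range(300_000)]
predicted = [a + rng.gauss(0, 5) for a in actual]

points = plot_sample(actual, predicted)        # only these would be drawn
error = average_abs_error(actual, predicted)   # uses all 300,000 cases
print(len(points), round(error, 2))
```

Keeping the sampling step separate from the statistics makes it harder for a change to the plot size to alter the reported figures by accident.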

The public code sometimes incorrectly flagged a case as having a value outside the range observed in the training cases.

When the public code was used with the -i option and without a label attribute, incorrect nearest neighbors were shown.

Here is a summary of changes in previous releases.


For Licensees Only:

Licensees who purchased Cubist within the last 12 months are welcome to upgrade to Release 2.07.

Click the appropriate link(s) below to download Release 2.07. You will be asked to re-enter your licence ID before running the system. If you use the source code for reading and interpreting Cubist models, you should also download the latest version.

Cubist (Linux):

Cubist (Windows 2000/XP/Vista/7):

Either: source code for reading/interpreting models

If you purchased Cubist more than 12 months ago but would like to try Release 2.07, please complete the evaluation form to obtain a free ten-day licence.

© RULEQUEST RESEARCH 2010 Last updated February 2010

