
computer intelligence sports betting resource

Evaluating machine learning software, one package against another, is not as simple a task as it might seem. Certain products have strengths in particular areas, or have a multitude of settings which can be adjusted to suit a task.
Software vendors in this area frequently use industry 'standard' data sets to illustrate
the prowess of their particular product. But let's face it, if they are trying to
convince the potential purchaser that theirs is the one to go with, they're unlikely
to select a problem which does not show their package in its best light. Also, the
often-used 'problem' data sets are not problems at all. Asking a machine to sort
out what's going on in an XOR function is more like an exhibition AI obstacle course;
it's not really going to help an individual understand anything. Another favourite
is Iris classification (the flower) from various bits of specific information.
Again, the rules are so simple that it is not, in my opinion, representative of
a real-
Another shortcoming of the Iris and XOR examples is that they have definite, specific
solutions. If x, y AND z then the iris species is definitely Virginica. In my area of
interest, as I suspect with many others, there are very few, if any, definite answers
to my queries; I'm looking for trends. For instance, a specific set of pre-
So here I'll present a comparison of some commercial packages that have been trained and tested on a couple of sports-related data sets of my own compilation.
A couple of caveats:
The data configurations used make no claim to be the best solution to the given task, but at least they are the same for every program.
So far as possible, each program was used in its 'default' state. (Some allow a host of configurations and settings, letting a user 'drill down' to a possibly better solution.)
Software used:
neural net -
neural net -
genetic programming -
genetic programming -
regression splines -
Sample Task 1: (football)
10 Inputs per record. League games won, drawn, lost, goals scored and goals conceded for each of home and away teams. Each data item is divided by the number of games played, producing per game figures. e.g. won 9 (but from 20 games) = 0.45
One output, an integer representing the home team advantage in goals.
e.g. score = 3-
Trained upon data sampled from two seasons (1,422 records), tested against a third season (2,121 records).
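The input layout for this task can be sketched in Python. This is only an illustration under stated assumptions: the TeamRecord class, its field names and the build_inputs helper are hypothetical and do not come from any of the packages compared here. Only the division by games played follows the description above.

```python
from dataclasses import dataclass


@dataclass
class TeamRecord:
    """League record to date for one team (hypothetical field names)."""
    played: int
    won: int
    drawn: int
    lost: int
    goals_for: int
    goals_against: int

    def per_game(self) -> list:
        """Divide each figure by games played, e.g. won 9 from 20 games = 0.45."""
        if self.played == 0:
            raise ValueError("team has not played any games yet")
        return [
            self.won / self.played,
            self.drawn / self.played,
            self.lost / self.played,
            self.goals_for / self.played,
            self.goals_against / self.played,
        ]


def build_inputs(home: TeamRecord, away: TeamRecord) -> list:
    """Return the 10 inputs: 5 per-game figures for the home team, then 5 for the away team."""
    return home.per_game() + away.per_game()
```

For example, build_inputs(TeamRecord(20, 9, 6, 5, 30, 22), TeamRecord(20, 5, 5, 10, 18, 31)) gives a list of 10 values whose first entry is 0.45.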
Fitness measurements compared;
R-
Sum of actual errors *
Sum of raw errors *
R-
0.04525 Tiberius
0.04484 WARD neural mode
0.04102 MARS
0.03620 GeneXProTools
0.03944 Discipulus best 'team'
0.03186 WARD genetic mode
0.03180 Discipulus best program
Sum of actual errors
155.49 GeneXProTools
226.94 WARD genetic mode
227.70 Tiberius
255.15 WARD neural mode
286.02 MARS
306.17 Discipulus best program
320.41 Discipulus best 'team'
Sum of raw errors
2719.89 Tiberius
2721.66 Discipulus best 'team'
2729.83 WARD neural mode
2730.41 MARS
2767.60 GeneXProTools
2780.54 Discipulus best program
2818.13 WARD genetic mode
Within half-
560 Discipulus best 'team'
554 Tiberius
553 WARD neural mode
553 MARS
553 GeneXProTools
530 WARD genetic mode
529 Discipulus best program
Sample Task 2: (horse racing favourites spreads)
3 Inputs per record. Race a handicap or not, Number of runners, Odds of favourite
One output, an integer representing the spreads value for a favourite's performance, where: win = 25pts, coming 2nd = 10pts, finishing 3rd = 5pts, otherwise zero points.
Trained upon data sample of 1,000, tested against out-
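The output value for this task can be written as a small Python function. This is a minimal sketch: the function name spread_points and the use of a finishing position as its argument are my own assumptions for illustration; only the points values come from the description above.

```python
def spread_points(finishing_position: int) -> int:
    """Spreads value for a favourite: win = 25, 2nd = 10, 3rd = 5, otherwise 0."""
    points_by_position = {1: 25, 2: 10, 3: 5}
    # Any position outside the first three scores zero.
    return points_by_position.get(finishing_position, 0)
```

So spread_points(1) returns 25, spread_points(3) returns 5, and spread_points(7) returns 0.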
Fitness measurements taken;
R-
Sum of actual errors
Sum of raw errors
Software shown in ranking order, best score nearer the top;
* if the software predictions for 4 cases were 2 x 5 too high, 2 x 5 too low. Sum
of Actual = 5+5-
If another package predicted all 4 cases 2 too high, actual = 8 and raw = 8. So the Actual figure gives an overview of the distribution of errors: the nearer to zero, the better its focus. The Raw figure gives the accumulated error over the whole data set (lower is better).
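The two error sums can be sketched in Python as follows. This assumes that the 'actual' figure is the sum of signed errors (prediction minus result) and the 'raw' figure is the sum of absolute errors, which is consistent with the worked example of four predictions each 2 too high. The function names and the sample numbers are hypothetical.

```python
def sum_actual_errors(predicted, actual):
    """Sum of signed errors: over- and under-predictions cancel each other out."""
    return sum(p - a for p, a in zip(predicted, actual))


def sum_raw_errors(predicted, actual):
    """Sum of absolute errors: every miss adds to the total, whatever its direction."""
    return sum(abs(p - a) for p, a in zip(predicted, actual))


results = [10, 10, 10, 10]

# Two predictions 5 too high, two predictions 5 too low.
package_a = [15, 15, 5, 5]
print(sum_actual_errors(package_a, results))  # 0
print(sum_raw_errors(package_a, results))     # 20

# All four predictions 2 too high.
package_b = [12, 12, 12, 12]
print(sum_actual_errors(package_b, results))  # 8
print(sum_raw_errors(package_b, results))     # 8
```

The first package looks perfectly balanced on the signed sum yet has the larger accumulated error, which is why the two figures are best read together.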
Within half-
By far & away the most significant two statistical measures are R-
Approximate training times (both exercises);
================================
MARS < 1 minute
WARD genetic 1 hour
WARD neural < 1 minute
GeneXProTools 1 hour
Discipulus 1 hour (both individual & team are trained simultaneously)
Tiberius <10 minutes
Software Prices
===========
WARD Predictor US$550.00
Discipulus Professional US$495.00
Tiberius US$265.00 (3-
MARS: Salford Systems quoted me for the least expensive option, which was $4,995.00 for a single-user license with a further $1,998.00 annual renewal charge. If it makes any difference, the MARS price does include tech support, maintenance, all upgrades to future versions and internet training for a single user. Seats at any upcoming Salford Systems MARS training will be discounted by 55%.
Testing was performed without bias, either in selection of tasks or otherwise. The results are, in my experience, quite typical and perhaps underline why Tiberius is not only my package of choice, but the one against which I now judge all others.
The software chosen for this comparison is, in my experience, the cream of the current
(2008) commercial machine learning software. A package performing poorly in this
company does not imply the software is not up to scratch. I have tried & tested many
products -
Attrasoft
BrainCom
Crespin
Emergent
ExcelNeural
FANN
Joone
Membrain
Neurosolutions
Pythia
QNet
RapidMiner
RockEye
Tanagra
Trajan (which is also the Neural Network add-
XLPert
My rejection of these was for a variety of factors. It is not my intention to review these products individually, and of course my reasons for rejection may not be valid cause for others to do the same.
The products in the above rejection list, in this reviewer's opinion, suffer from at least one (and in a few cases a good few more than one) of the following negative factors:
Very poor at out-
Flaky and/or bug-
Frequent program crashes
Overly complex user interface (some are possibly targeted primarily at academic users)
Very poor user support (sometimes NO user support)
Sample Task 2 results: (horse racing favourites spreads)
R-
0.09133 MARS
0.09108 Tiberius
0.09018 WARD genetic mode
0.08953 GeneXProTools
0.08654 Discipulus best 'team'
0.08076 Discipulus best program
0.07090 WARD neural mode
Sum of actual errors
20.79 WARD genetic mode
-
22.85 Discipulus best 'team'
109.20 Tiberius
132.71 GeneXProTools
173.85 MARS
201.80 WARD neural mode
Sum of raw errors
7173.65 Tiberius
7188.63 MARS
7203.16 WARD genetic mode
7221.83 GeneXProTools
7232.79 Discipulus best program
7239.55 Discipulus best 'team'
7315.22 WARD neural mode