computer intelligence sports betting resource

(c) 2003 - 2008 All Rights Reserved
Neural networks are a very effective tool for predicting almost anything, but they rely heavily upon what the user inputs as training factors. In much the same way, a Ferrari is an extremely fast motor car, but if you put cat pee in the fuel tank it won't even start, never mind go fast. GIGO: garbage in, garbage out.

Basic horse racing prediction can be a problem area for NNets, though, because the most informative net inputs should ideally sum up the 'situation'. Encapsulating the differences between contenders in a typical horse race requires more than a few inputs, and there is evidence that NNets (like most other selection techniques) are not necessarily improved by adding more variables (often the exact opposite is true).

To see the potential power of the NNet, it is probably better to start with assessments created elsewhere. The Racing Post provides the NNet student with many such training file factors, and this is as good a place as any to start.

Use any factors you decide are relevant, and the more relevant the better. The Racing Post provides ratings such as Topspeed and Postmark, a betting forecast, etc. They also provide a service called Postdata, which reviews several factors per runner and presents them in a table format.

Where the factors are not numeric you need to convert them. For example, in Postdata some items are populated by graphical symbols. Three, two or one tick marks signify a better (3 tick marks) or worse performance under a particular heading. An 'X' under the heading indicates a negative rating (the runner has shown particularly poor performance in this area).

To feed such items into a neural network, translate them into numbers: 3 ticks become the number 3, 2 ticks become 2, 1 tick becomes 1, and an X becomes minus 1.
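If you are typing the table in by hand, a few lines of Python can do the translation for you. This is only a sketch: the function name is made up, it assumes you type each tick as the letter T (so 3 ticks is "TTT"), and treating a blank cell as 0 is my own assumption, not something Postdata defines.

```python
def encode_symbol(cell: str) -> int:
    """Turn a hand-typed Postdata cell into a number for the NNet."""
    text = cell.strip().upper()
    if text == "X":
        return -1              # negative rating
    if text == "":
        return 0               # assumption: a blank cell is treated as neutral
    if set(text) == {"T"} and len(text) <= 3:
        return len(text)       # one, two or three ticks
    raise ValueError(f"unrecognised Postdata cell: {cell!r}")


print(encode_symbol("TTT"))    # 3
print(encode_symbol("T"))      # 1
print(encode_symbol("x"))      # -1
```

Raising an error on anything unexpected is deliberate: a typo in the data bank is better caught at this stage than fed quietly into the training file.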

For the neural network output, use a single numeric value indicating the result. This could be 1 = won, 0 = lost. Better still would be lengths behind the winner: 0 = won, 20 = finished 20 lengths behind the first-placed horse, etc.

At this stage, use your imagination and try to be creative. Rather than using bare Postdata (or whatever other ratings or indicators of potential) as inputs, include some reference to the rest of the field in the race.

Remember, Neural Networks will only see a single line of information at a time, which would be a little restrictive for us as humans, never mind a dumb computer. Imagine you saw just one runner's info in front of you. It would not be as valuable to you as seeing the information for all the other runners for comparison purposes, or the whole of a Postdata table, which allows you to evaluate each row against all the other runners.

Whenever you assemble data for neural network training, simply ask yourself this question: if I were viewing this, could I interpret what's happening, or would I need some comparisons or some reference to a wider picture?

So, Postdata 'recent form' of 1 tick, although it means something in isolation, has far more potency if it can be related to the whole table. One way to indicate this to the NNet could be to use 2 inputs for every Postdata column, rather than one.

Let's say the best figure in the Recent Form column is 3 ticks. Using the 2 inputs idea for our 1 tick runner, the resultant two inputs would be 3 (the best figure in the race) and minus 2 (how this runner relates to the best in the race).

The Ability column shows this runner with 2 ticks, and the best in the race is also 2 ticks. The 2 inputs here would be 2 and 0.
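The two worked examples above boil down to one small calculation per column. A minimal sketch follows; the function name and the figures for the other runners in the race are invented purely for illustration.

```python
def column_inputs(runner_value, race_values):
    """Return the two NNet inputs for one Postdata column:
    the best figure in the race, and this runner's figure minus that best."""
    best = max(race_values)
    return best, runner_value - best


# Recent Form: our runner has 1 tick, the best in the race has 3
print(column_inputs(1, [1, 3, 2, -1]))   # (3, -2)

# Ability: our runner has 2 ticks, which is also the best in the race
print(column_inputs(2, [2, 1, 2, 0]))    # (2, 0)
```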

Build a picture of the whole race, runner by runner, using the 6 Postdata columns (or 7 columns if you're interrogating flat racing, where draw position is an additional factor). When you've done this for each runner you'll have 12 inputs and one output per line. For flat racing that becomes 13 inputs if the draw goes in as a single figure, or 14 if you give the draw the same two-input treatment.
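For anyone who would rather not build the lines by hand, here is a rough sketch of how a whole race could be turned into training rows. Everything in it is an assumption for illustration: the function name, the dictionary keys 'scores' and 'beaten', and the three made-up runners. It uses lengths behind the winner as the output and works for however many columns you supply.

```python
def build_race_rows(runners):
    """Turn one race into training rows.

    runners: list of dicts, each with
      'scores' - one encoded figure per column (e.g. 3, 2, 1, 0 or -1)
      'beaten' - lengths behind the winner (0 for the winner)
    Each row holds two inputs per column (best in race, runner minus best)
    followed by the single output.
    """
    n_cols = len(runners[0]["scores"])
    best = [max(r["scores"][c] for r in runners) for c in range(n_cols)]
    rows = []
    for r in runners:
        inputs = []
        for c in range(n_cols):
            inputs += [best[c], r["scores"][c] - best[c]]
        rows.append(inputs + [r["beaten"]])
    return rows


# Made-up three-runner race with six columns per runner
race = [
    {"scores": [1, 2, 3, 0, -1, 2], "beaten": 0},
    {"scores": [3, 2, 1, 1, 2, 0], "beaten": 2.5},
    {"scores": [2, 1, 0, 3, 1, 1], "beaten": 20},
]
for row in build_race_rows(race):
    print(row)
```

With six columns each printed row has 12 inputs followed by the one output, which matches the layout described above.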

Or you may consider that TopSpeed (speed ratings) and/or PostMark (form ratings) are essential for a better picture. If so, either add them to the rest, or drop a couple of other data items to accommodate them. The essential thing is to make sure the picture you're painting for the Neural Network to study is as clear as you can make it.

Leonardo Da Vinci would have made a good NNet programmer, but Jackson Pollock - forget it Jackson, stick to your paint splashing.

If you're hand coding, it will take some time to establish a decent sized data bank, but the results could well be worth it!

If you're adept at programming and have past racing results, you might be better off selecting items that are available from standard results. Just remember to add the association element (how each runner relates to the rest of the field), so the Neural Network knows what you're talking about and can best visualise the race. Otherwise you might as well be pouring cat pee in the Ferrari's petrol tank. Sacrilege.


ALTERNATIVE STRATEGIES

The above is just one suggested approach; there are of course countless others.

Rather than the simple processed numeric values (horse 1 rating minus horse 2 rating stuff) you could try defining an element with a simple ranking figure: 1 = best or equal to best, 2 = second best, etc. This approach might benefit from adding [number of runners] as the first data item of each line, so that the neural network can keep things in proportion. After all, 3rd best figure in a 3-runner race is somewhat different to 3rd best in a 30-runner race.
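The ranking idea might look something like this in Python. It is only a sketch: the function name is made up, and the way ties are handled (equal figures share a rank and the next distinct figure takes the next number) is my reading of "1 = best or equal to best", so adjust it if you prefer a different rule.

```python
def rank_inputs(ratings):
    """For each runner return (number of runners, rank in race),
    where rank 1 is the best or equal-best figure in the race."""
    distinct = sorted(set(ratings), reverse=True)
    rank_of = {value: position + 1 for position, value in enumerate(distinct)}
    n_runners = len(ratings)
    return [(n_runners, rank_of[r]) for r in ratings]


# Four runners, two of them sharing the top rating of 100
print(rank_inputs([95, 100, 90, 100]))
# [(4, 2), (4, 1), (4, 3), (4, 1)]
```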

Another alternative tactic would be to note the difference between best and worst in the race for each category, then express each runner's distance from the best as a fraction of that range, thereby putting each into a race context.

e.g.
Object runner rated = 95
Best rated in race = 100
Worst rated in race = 90

Inputs would be 95 (actual rating)
and 0.5 (its position within the race's range, where zero would be best and 1 worst)

Whereas if the object runner was again rated = 95
and the best rated was again = 100
but this time the bottom rated = 50

Inputs would be 95 (actual rating)
and 0.1 (better ranked in the context of the opposition)

Any of these can be simply set up in a spreadsheet so it will do the calculations for you.

This final percentage idea would just need a formula like:
(TopRated-ThisRating)/(TopRated-BottomRated)
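If you prefer code to a spreadsheet, the same formula can be sketched in Python as below. The function name is invented, and returning 0 when every runner has the same rating is my own assumption to avoid dividing by zero; the formula itself says nothing about that case.

```python
def race_context(this_rating, race_ratings):
    """(TopRated - ThisRating) / (TopRated - BottomRated):
    0 means best in the race, 1 means worst."""
    top = max(race_ratings)
    bottom = min(race_ratings)
    if top == bottom:
        return 0.0   # assumption: all runners equal, so treat each as best
    return (top - this_rating) / (top - bottom)


# The two examples from the text: a runner rated 95, best in race 100
print(race_context(95, [100, 95, 90]))   # 0.5 (worst in race is 90)
print(race_context(95, [100, 95, 50]))   # 0.1 (worst in race is 50)
```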

These suggestions are not exhaustive; just let your imagination take you along.

Remember, you're attempting to illustrate what you consider to be the key items in the context of the race to a dumb computer, in the guise of a Neural Network or other machine learning technique. The better you can illustrate the problem, the better the machine's (and software's) chances of cracking it.

. . . and good luck!