Attackpoint - performance and training tools for orienteering athletes

Discussion: Ranking scores

in: Orienteering; General

Oct 14, 2009 1:25 AM # 
Nikolay:
I am looking at the 3 latest races in the USOF rankings: US Long champs, and two Boulder Dash Classic days.

Both weekends featured healthy competition (probably a slightly stronger field in the ultra long).
The gnarliness value for the ultra long is about twice as big as the ones for the Boulder Dash events: 15300 vs 7858 and 6051.

Still, the top runners in the Boulder Dash races received larger scores than those in the ultra long event.

What's the reason for these differences in the scores?
Oct 14, 2009 1:36 AM # 
feet:
The reason is (always) that either the field strength is different, or the proportional spread between the winners and the rest is different. Here it is the latter. The ranking system says (roughly) that Ross' day 1 win in NH by 6 min would have equated to a 12-15 minute win in the ultralong (which was 2 to 2 1/2 times as long); since nobody managed that, nobody scores that highly in the ultralong.

Of course, it is not the gap from first to second that matters, but the distribution of times in the whole field; I explain it that way just for simplicity.
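
To make the proportionality concrete, here is a minimal Python sketch of the arithmetic only. The function name is made up for illustration, and it looks at just the first-to-second gap, which is the simplification used above rather than what the ranking system actually computes.

```python
def equivalent_margin(margin_min, length_ratio):
    """Scale a winning margin to a race that takes length_ratio times as long."""
    return margin_min * length_ratio


# A 6 minute win, scaled to a race 2 to 2.5 times as long.
for ratio in (2.0, 2.5):
    scaled = equivalent_margin(6.0, ratio)
    print(f"{ratio}x as long: a 6 min win is like a {scaled:.0f} min win")
```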
Oct 14, 2009 6:22 PM # 
O-scores:
I would be glad to comment on the issue, but no Boulder Dash data is posted on WinSplits.

Can I get hold of the Boulder Dash times in CSV?

My O-scores system doesn't have issues of "proportional spread between the winners and the rest" and should offer a better explanation of the results.
Oct 14, 2009 6:25 PM # 
JanetT:
SI was not used at the Boulder Dash so you're not going to be able to get that info.
Oct 14, 2009 6:36 PM # 
iansmith:
Krechet, your algorithm still correlates ranking to 1/(race time) up to some constant for individual races. My understanding was just that the normalization you use is different, and while RLP, RGP, and RRD might be illuminating, the time distribution will still determine the scoring distribution.

Please correct me if I am mistaken (as if that statement were needed).
Oct 14, 2009 6:52 PM # 
feet:
I agree with Ian. krechet's scores for a single race will be proportional to USOF scores for a single race. krechet's scores overall will be roughly proportional to USOF scores. (They would be exactly proportional if both the scores were calculated on the same data set; they are not.) In practice the differences don't matter much - krechet uses a larger, noisier data set than the USOF scores, which could be good or bad, and we have had this argument before - but for a single race, the two sets of scores are proportional.

nb: edited to correct an error.
Oct 14, 2009 7:44 PM # 
Nikolay:
Thanks Will.
Oct 14, 2009 8:27 PM # 
O-scores:
@JanetT:
You are answering, so I assume you have something to do with it. So the question is: do you have anything in a (name;time) format?
(I can grab it from the web, but if you have it, it would be much easier... thanks in advance)

To the rest:

You are all right about the proportionality of the results.
I will repeat here the two main differences of my system:

First, my normalization is bottom-up, starting from the "average runner", who always performs at "average = 50 points". So if somebody (Ross) performs very well while "you" perform as usual, "you" will get the same score as usual while Ross will get a much better one. (The current system would keep Ross, as the winner, at ~100 and scale you down according to your time ratio.)

Second, and probably less relevant here, my system boils everybody in the same kettle (similar to what DVOA does), which brings a lot of clarity to events like the Hudson Highlander, where all runners run the same course.
The current system would go crazy not knowing what to do with most of the Brown/Green runners, but it doesn't deal with the HH anyway.

But my approach brings more stability, as we have quite a lot of runners who run the Blue/Red (or Red/Green) course depending on their mood, and this affects the statistics a lot.
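
The first difference can be sketched in Python as follows. This is only an illustration under loud assumptions: both functions are simplified stand-ins (neither is the real O-scores nor the real USOF calculation), the times are made up, and the "average runner" time is taken as a known, fixed number.

```python
def winner_anchored(times):
    """Top-down: the winner gets 100, everyone else is scaled by time ratio."""
    best = min(times)
    return [100.0 * best / t for t in times]


def average_anchored(times, average_time):
    """Bottom-up: a runner matching the average runner's time gets 50."""
    return [50.0 * average_time / t for t in times]


AVERAGE_TIME = 90.0  # hypothetical time of the "average runner" on this course

usual_day = [60.0, 80.0]  # Ross, "you" (made-up minutes)
great_day = [50.0, 80.0]  # Ross runs very well, "you" run as usual

for label, times in (("usual", usual_day), ("great", great_day)):
    top_down = winner_anchored(times)
    bottom_up = average_anchored(times, AVERAGE_TIME)
    print(label, "top-down:", top_down, "bottom-up:", bottom_up)
```

In this toy setup "your" top-down score drops when Ross has a great day, while "your" bottom-up score stays put and only Ross's goes up.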

Have fun
Shura
Oct 14, 2009 9:40 PM # 
JanetT:
Shura, I don't have the data, I was just in attendance at the meet -- we used pin punches. :-)

J-J was in charge of results so he should be able to send you the results file.
Oct 15, 2009 1:08 AM # 
Shep:
i got drawn into reading this discussion cause i've been thinking about how to calculate ranking scores over this last week or so. i'm not a real fan of using the rankings and spread of times of the whole field, and here's why.

Imagine in a race i beat Julian and Simon, which would be quite an achievement ;) Assume there was a large start field with the runners having a range of ranking scores. The Australian ranking system (very similar to the world ranking system) only includes runners who run within the winner's time + 50% to calculate the scores for the race. But if i start reducing that cutoff my points go up. As an example, in one of our NOL races a couple of weeks ago my points would go up (in the order of several percent) decreasing that cutoff to +20%. So why is my race, where i (hypothetically!) beat Jules and Simon, made to be less of an achievement (in terms of ranking points scored) just because there were other runners in the race who increased the spread of times (which happens to a BIG extent in tough races - technically and physically tough)? I still (hypothetically) beat the top 2 ranked guys in Australia! If you want to score good points then it would be better if those guys at the bottom of the results list didn't race...

So if world ranking points ever became important for eg WOC qualification (as has been suggested) then we'd be much better off running our WREs with an A and B race (as happens at eg Swedish Elite Series races). We wouldn't even need to run different courses - just have different classes. If we had, eg only the top 10 ranked runners in the race, they'd score more points than if we had every runner in the same "class". (I could be wrong here, the Aussie system has this property but i'm not 100% sure about the world ranking system...)
Oct 15, 2009 4:08 AM # 
O-scores:
@Shep

There are two separate issues here and I can try to guessplain them.

First, as many people noticed even in this thread, a ranking system is defined up to a multiplicative constant. In the case you are describing, your relative points would still be higher than those of Julian and Simon, or whomever you are faster than. The ratio is constant, not the value of the points.

Second issue: that your points went higher is quite arbitrary; they could just as easily go lower, depending on how the normalization is done.

And the constancy of the normalization is exactly what I'm working on. My claim is that the most stable quantity we have on hand is the average runner, with the average covering as many people as possible, ideally the whole world.

My claim is that your system will be less and less stable as the cutoff decreases to +40%, +30%, +20%, and more stable with nobody cut (cutoff +infinity).
Oct 15, 2009 4:56 AM # 
Shep:
yeh the ratio between runners in a given race is constant, and yeh basically the issue i have with the system is that the value of points is not. you obviously can't have a constant number of points for a win, that would be stupid. but, given two races with jules, simon and myself, the points i score for beating them should be "more constant". i know that don't make much sense! so i'll try and explain... you're correct in that my points could have gone down when i reduced the cutoff, but if the only 3 runners in the race are jules, simon, and myself, i will always score more points than if we have other runners with lower rankings (assuming they run an average race). so if jules beats me and simon in a race with only the 3 of us, he'll score more points than simon will if he beats me and jules in a race with other runners, even if our times are the same (relatively). it gets especially bad for long distance races, where the standard deviation of times is big. it's that lack of consistency across races that is the problem...
Oct 15, 2009 5:07 AM # 
O-scores:
I'm lost, sorry
:)
Oct 15, 2009 5:21 AM # 
Shep:
ok heres an example
Australian Shopping Centre Championships
1st jules 10min
2nd simon 11min
3rd shep 12min

mean points MP = 1100
mean time MT = 11min
standard deviation SD = 1
ideal mean time IMT = MT + (MP-1000)*SD/200 = 11.5
Points P = 1000 + 200*(IMT-RT)/SD, where RT is the runner's time
Jules 1300
Simon 1100
Shep 900

it doesn't matter what the winning time is, if simon is 10% behind, and i'm 20% behind, we'll all get the same points as above.
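
A minimal Python sketch of the formula above, using only the quantities in the example (MP, MT, SD, IMT, RT). The function name is made up, and the sample standard deviation is an assumption, chosen because it gives SD = 1 for times of 10, 11 and 12.

```python
from statistics import mean, stdev


def aus_points(times, mean_points):
    """Points from the simplified formula in the example above."""
    mt = mean(times)
    sd = stdev(times)  # sample SD (assumption); equals 1 for 10, 11, 12
    imt = mt + (mean_points - 1000) * sd / 200
    return [1000 + 200 * (imt - rt) / sd for rt in times]


# Jules, Simon, Shep with Simon 10% behind and Shep 20% behind.
for times in ([10, 11, 12], [30, 33, 36], [100, 110, 120]):
    print(times, [round(p) for p in aus_points(times, 1100)])
```

All three sets of times give 1300, 1100 and 900, whatever the winning time is.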

Now with more runners (ignoring the fact that IMT will change, but assuming none of the other runners run above their ability/ranking), if we lower MP then Jules gets fewer points. If we increase SD then again Jules gets fewer points. The SD is always higher in a long race, so even if Jules, Simon and myself fill the top 3 with the same relative time gaps, the fact that the long has a higher SD means the winner of that race will get fewer points than the winner of the sprint.

so as an example (adding to the example above) if we add in another runner such that the MP doesn't change, but who runs 50% longer than julian, then the points will be
Jules 1285
Simon 1192
Shep 1100
New Guy 822

dropping the MP to 1050 (assuming the New Guy isn't so good) then the scores become
Jules 1235
Simon 1142
Shep 1050
New Guy 772
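
The two New Guy tables can be reproduced with a short Python sketch of the same formula. It assumes New Guy runs 15 min (50% longer than 10 min) and uses the sample standard deviation; the function name is made up. The printed values agree with the tables to within a point.

```python
from statistics import mean, stdev


def aus_points(times, mean_points):
    """Points from the simplified formula: 1000 + 200*(IMT - RT)/SD."""
    mt = mean(times)
    sd = stdev(times)  # sample SD (assumption)
    imt = mt + (mean_points - 1000) * sd / 200
    return [1000 + 200 * (imt - rt) / sd for rt in times]


names = ["Jules", "Simon", "Shep", "New Guy"]
times = [10, 11, 12, 15]  # minutes; New Guy is 50% slower than Jules

for mp in (1100, 1050):
    points = aus_points(times, mp)
    print("MP =", mp, {n: round(p, 1) for n, p in zip(names, points)})
```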

hence my complaint that there is too much emphasis on the entire field.

i haven't looked at the maths yet, so this sentence might be real dumb, but could it possibly be fixed by normalizing the standard deviation wrt the winning time?
Oct 15, 2009 2:08 PM # 
feet:
Shep, there is an important difference between the US system and the IOF/Australian systems here - and I think the US system is better.

The IOF/Australian system derives its scores for a given race from the distribution of proportional gaps from the mean time:
(runner's time - mean time)/(standard deviation).

The US system derives its scores for a given race from the distribution of speeds, which is proportional to:
1/runner's time.

So in your examples, if the times are 11min*(1-x), 11 min, and 11 min*(1+x) for Jules, Simon, and Shep, the Australian system will always assign points 1300, 1100, and 900 to this set of times, independent of x.

The US system, if the initial scores were the same, would assign scores that vary with x. (The USOF scores from this data would be proportional to 1/(1-x), 1, and 1/(1+x).) Then under the US system, if the times are 10:00, 11:00, and 12:00, you have x = 1/11 and the scores would be 1203, 1094, and 1003.

(I am also ignoring the effect that this event's scores will cause the whole set of scores to be recalculated: this is exactly right for the US system if the previous scores are really well set because there were hundreds of races before. This is similar to you ignoring the adjustment to IMT.)

If the times were 10:59, 11:00, and 11:01, the Australian system would still assign 1300, 1100, and 900 as the scores. The US system would assign 1101.7, 1100, and 1098.3.

If the times were 9:00, 11:00, and 13:00, the AUS system would still give 1300, 1100, and 900; the US system would give 1314, 1075, and 910.
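
A Python sketch comparing the two systems on these three sets of times. Both functions are simplified and their names are made up. The US version assumes scores are proportional to 1/time and scaled so that the field's mean score equals its mean incoming score (1100 here); that normalization is an assumption that happens to reproduce the figures quoted here, not a statement of the actual USOF procedure.

```python
from statistics import mean, stdev


def aus_points(times, mean_points):
    """Simplified AUS-style points: 1000 + 200*(IMT - RT)/SD."""
    mt = mean(times)
    sd = stdev(times)  # sample SD (assumption)
    imt = mt + (mean_points - 1000) * sd / 200
    return [1000 + 200 * (imt - rt) / sd for rt in times]


def us_scores(times, incoming_scores):
    """Scores proportional to speed (1/time), keeping the mean score fixed."""
    speeds = [1.0 / t for t in times]
    scale = sum(incoming_scores) / sum(speeds)
    return [scale * s for s in speeds]


incoming = [1300, 1100, 900]  # Jules, Simon, Shep before the race
examples = {
    "10:00, 11:00, 12:00": [600, 660, 720],  # times in seconds
    "10:59, 11:00, 11:01": [659, 660, 661],
    "9:00, 11:00, 13:00": [540, 660, 780],
}

for label, seconds in examples.items():
    aus = [round(p, 1) for p in aus_points(seconds, mean(incoming))]
    us = [round(s, 1) for s in us_scores(seconds, incoming)]
    print(label, "AUS:", aus, "US:", us)
```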

(Actually, this is almost the same as normalizing the standard deviation by the winning time, in effect...)

Is this good or bad? Well, it depends. If you take a technical course and add a 1km road run at the end that everybody runs the same time on, then the AUS scores will not be compressed; the US scores will be. On the other hand, I find the idea that a 10:59, 11:00, 11:01 distribution of times deserves the same points as a 0:01, 11:00, and 21:59 distribution kind of strange.

Now, the issue you raise is: what happens if you add someone to the field?

In both systems, if everyone performs proportionally to their ranking, then the scores are unchanged. That is, in the Australian system, if the pre-existing scores are 1300, 1100, 900, and 1100 for New Guy (your first example), then any set of times where New Guy runs the same time as Simon and the other two are equal distances ahead and behind will lead to no change in the scores. In the US system, if those were the scores, then any time the speeds are proportional to 1300, 1100, 900, and 1100, again, the new race will give everyone the same score they came in with.

But then the question is, what happens if New Guy performs differently to his ranking? In your example, New Guy dramatically underperforms: he has a ranking of 1100 (because he doesn't change the mean points), but he runs like a guy with a terrible ranking (because he is many standard deviations below the mean when he runs 20 min in a race where everyone else is clustered near the top). In your second example, New Guy only has a ranking of 900, so he underperforms by less, so he brings fewer points to the table. But he gets beaten by the same amount. The system concludes that Jules, Simon, and Shep, who beat this guy by a ratio of 10/11/12 to 20, must have had a really good day when this guy has a ranking of 1100 - looking at the times of J/S/S and the four scores before the race, you would predict that New Guy would run 11:00. When he only has a ranking of 900, you would predict he runs 12:00, so you are less impressed by the other guys' performance.

Under the US system, the scores without New Guy are 1203, 1094, and 1003 (note: even without New Guy, if those are the rankings, then Jules is underperforming and Shep is overperforming if their scores coming in are 1300, 1100, and 900, and they run 10 min, 11 min, and 12 min). With New Guy ranked at 1100 coming in but running 20 min, the scores become 1357, 1234, 1131, and 679. With New Guy ranked at 900 coming in, the scores become 1295, 1178, 1079, and 648. Similar behavior to the AUS system in that the lower New Guy's score coming in, then the lower everybody's score given the time distribution. Different from the AUS system in that the presence of New Guy taking a huge beating is good for everyone else in the US system (because if one guy underperformed, then relative to him, everyone else must have outperformed) and bad for them in the AUS system (because if one guy underperforms, he blows up the standard deviation of times, meaning that gaps at the top of the field are also less impressive.)
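
The US-system numbers with New Guy can be reproduced with a short Python sketch. It assumes scores proportional to 1/time, scaled so the field's total score matches the total of the incoming scores; that normalization is an assumption that matches the figures above, and the function name is made up.

```python
def us_scores(times, incoming_scores):
    """Scores proportional to speed (1/time), keeping the total score fixed."""
    speeds = [1.0 / t for t in times]
    scale = sum(incoming_scores) / sum(speeds)
    return [scale * s for s in speeds]


names = ["Jules", "Simon", "Shep", "New Guy"]
times = [10, 11, 12, 20]  # minutes; New Guy runs 20 min

for new_guy_score in (1100, 900):
    incoming = [1300, 1100, 900, new_guy_score]
    scores = us_scores(times, incoming)
    print("New Guy at", new_guy_score,
          {n: round(s) for n, s in zip(names, scores)})
```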

It would be an interesting question to check which is less sensitive to erratic performances by erratic runners. (My guess is the USOF system is less sensitive: a 20 minute mistake by someone running 20 minutes per km on a 5 km course changes their speed from 3 kph to 2.5 kph. This is a small change. Whereas in the Australian system, if all the other times are closely packed and one guy makes a huge mistake, he could be many standard deviations away from the mean and have a big effect compacting the distribution of all the other scores by blowing up the overall standard deviation.)
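
A rough Python sketch of the two effects in that guess. It does not settle which system is less sensitive. The packed field is made up for illustration, the sample standard deviation is an assumption, and the points gap uses the simplified 200-points-per-SD formula from Shep's example.

```python
from statistics import stdev


def speed_kph(distance_km, minutes):
    """Average speed in km/h."""
    return distance_km * 60.0 / minutes


# Speed view: 5 km at 20 min/km, without and with a 20 minute mistake.
print(speed_kph(5, 100), speed_kph(5, 120))  # 3.0 2.5

# Standard-deviation view: a hypothetical tight field, in minutes.
packed = [60, 62, 64, 64, 66, 68]
with_mistake = [60, 62, 64, 84, 66, 68]  # one runner loses 20 minutes

for label, times in (("clean", packed), ("mistake", with_mistake)):
    sd = stdev(times)
    gap_points = 200 * (times[1] - times[0]) / sd  # first-to-second gap
    print(f"{label}: SD = {sd:.2f} min, 1st-2nd gap = {gap_points:.0f} points")
```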

Another advantage of the USOF system is that because ranking scores are proportional to predicted speed, it is easier to interpret them, by the way. If my score is 110 and someone else's score is 100, then I run 10% faster than them according to the rankings.
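
As a small Python illustration of that reading of the scores (function names are made up, and the 66 minute time is an arbitrary example):

```python
def speed_ratio(my_score, other_score):
    """How many times faster I am predicted to run than the other runner."""
    return my_score / other_score


def predicted_time(my_score, other_score, other_time):
    """My predicted time, given the other runner's time on the same course."""
    return other_time * other_score / my_score


print(speed_ratio(110, 100))           # 10% faster
print(predicted_time(110, 100, 66.0))  # minutes, if the other runner takes 66
```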
Oct 15, 2009 11:33 PM # 
O-scores:
@ Nikolay

Thanks to Valerie I got the data, and here is the Boulder Dash / Ultra Long comparison:

http://tinyurl.com/yfodsvo

In short, feet is right and spread made a difference.

Eddie and Ross were exceptionally good at the Boulder Dash in winning their respective races.

On the other hand, the same Eddie won the Ultra Long by being just plainly good, not exceptionally good. Here you go.

Enjoy.

This discussion thread is closed.