A Nerdy, Boring Statistical Analysis of Fighter Age and Win Probability

latoya johnson

I wanted to construct a model that determines the win probability of a fighter based on his age and the age of his opponent.

In general, I wanted a model to determine the probability of an event (in this case a fight outcome) as a function of one input variable (in this case age advantage).

The function that I constructed is
upload_2017-6-4_22-44-36.png
The sensitivity constant, k, measures how strongly the event probability responds to the input variable, x.
Both k and x must be positive.




How to Measure the Age Advantage
I measured a fighter's age advantage as his opponent's distance from the prime fighting age minus his own distance from it, where distance means the absolute difference in years.
upload_2017-6-4_21-16-11.png
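
A minimal sketch of that definition in Python, assuming "difference" means the absolute difference in years (this matches the worked numbers given later in the thread). The function and parameter names are illustrative and do not come from the original post; the prime age of 27.5 used in the example is the value arrived at further down.

```python
def age_advantage(fighter_age, opponent_age, prime_age):
    """Opponent's distance from the prime age minus the fighter's own distance.

    Assumption: "distance" is the absolute difference in years.
    """
    return abs(opponent_age - prime_age) - abs(fighter_age - prime_age)


# Jon Jones (about 30) against Daniel Cormier (about 38), prime age 27.5
print(age_advantage(30, 38, 27.5))  # 8.0
```

A positive result means the fighter is closer to the prime age than his opponent; a negative result means the opponent is closer.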

The challenge was to determine the prime fighting age. The key lay in the sensitivity constant, k.

An ill-defined or irrelevant input variable should have little or no correlation with win probability, so the sensitivity constant fitted for it should come out at or near zero.

The ideally chosen prime age, however, would yield realistic values for the age advantage, causing the input variable to strongly forecast win probability. This would maximize the sensitivity constant.

The prime age that maximized k was 27.5,
yielding a maximal sensitivity value of 0.2327.
upload_2017-6-4_21-25-43.png





Finally, having determined these parameters, the win probability of a fighter as a function of his age advantage is given by
upload_2017-6-4_22-38-28.png

Notes:
1. This model only works for positive values of the input variable. However, it can be modified to allow for negative values using absolute value operators.
2. There were some simplistic underlying assumptions in this model (like the linearity of the age advantage). It may be improved by modifying the definition of the input variable.
3. The model could also be improved substantially by extending it to accept several input variables, if possible.






Example:
Fighters Daniel Cormier and Jon Jones are scheduled to fight each other on July 29, 2017.
At the date of the bout, Jon will be approximately 30 years old and Daniel will be approximately 38.

The probability of Jon's victory (accounting only for age, of course) may be calculated as follows:
upload_2017-6-4_22-41-13.png


TLDR
The prime fighting age is 27.5 years old.
 
So Jon Jones is 86% likely to beat Daniel Cormier?



@Threetrees I forgot to mention, the optimal values for prime age and k were based on recent bout results (39 trials),
so they were experimentally derived.


Also, here's a graph for fun.
Win Probability Graph.PNG
 
Wait, wouldn't it be a more impressive model if k was experimentally derived?
 
db9a1e20342657c26b510c5eca0d7637.jpg
 
MMAnalytics??? :eek: It might actually be interesting if this catches on.

But you don't need math to know that fighters over 35 are less likely to win than fighters under 30. This also ignores the real reason why Cormier will never beat Jones--the absurd reach deficit he faces that no amount of training or effort will address. In this case it wouldn't matter if Cormier was 10-15 years younger.
 
You need to do one for the probability of a fighter winning with and without dick pills in his system

And as a negro with a Finance degree, i think i just fell in like with you.
 
He had a similar reach disadvantage against Gus and still won that fight.

Let's not blame the reach disadvantage for why Jones wins and will win again. Jones is a better and more skilled fighter than Cormier.
 
I don't understand what you're talking about so I disagree with it!

i-disagree_o_594075.jpg
 
You need to do one for the probability of a fighter winning with and without dick pills in his system

And as a negro with a Finance degree, i think i just fell in like with you.
I love finance!
 
now all we need is to look at a sample of maybe last year's fights and plot a graph to confirm the authenticity of your algo
The biggest issue that I have is that your algorithm takes relative age into the equation. For example, if a fighter is below your prime fighting age of 27.5 then the age difference should work in the negative direction with the win chances increasing as the fighter tends towards the PFA
 
if a fighter is below your prime fighting age of 27.5 then the age difference should work in the negative direction with the win chances increasing as the fighter tends towards the PFA
This is true. My model allows for this.

When I said the input variable must be positive, that only means the net age advantage should be positive.

So if a 28-year-old is fighting a 24-year-old, rather than saying a = (.5) - (3.5) = -3 for the younger fighter,
we would reverse perspective to be optimistic
and say that a = (3.5) - (.5) = +3 for the older fighter.

Just keep the net age advantage positive.
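
The perspective switch described above can be sketched as follows. This is only an illustration: the helper name `favored_perspective` is hypothetical, and "difference" is again assumed to mean absolute difference from the prime age.

```python
def age_advantage(fighter_age, opponent_age, prime_age):
    """Opponent's distance from the prime age minus the fighter's own distance."""
    return abs(opponent_age - prime_age) - abs(fighter_age - prime_age)


def favored_perspective(age_a, age_b, prime_age=27.5):
    """Return (age of the age-favored fighter, non-negative age advantage).

    Hypothetical helper: if the advantage is negative from fighter A's side,
    switch to fighter B's side so the model input stays positive.
    """
    a = age_advantage(age_a, age_b, prime_age)
    if a >= 0:
        return age_a, a
    return age_b, -a


# 24-year-old vs 28-year-old: the 28-year-old is favored with a = +3
print(favored_perspective(24, 28))  # (28, 3.0)
```

Swapping the two fighters only flips the sign of the age advantage, so negating it is the same as recomputing it from the other fighter's side.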
 
reading-ikea-intructions-big-lebowski-confused.gif
 
Very good thread and well thought out. I agree. I don't know a lot about numbers, not so good with anything to do with math, but it's still awesome because it's you. high five kitty.
 
Cool idea!

Questions:
1) Can you back test this model against a database of MMA fights to see how accurate it is?

2) Would a model based not on fighter age but # of MMA fights, or # of MMA rounds, or # of combat sport fights, or # of combat sport rounds, be more predictive?

3) How big of a problem is the linearity assumption? On the one hand, it seems like the decline becomes steeper as age past prime increases. On the other hand, those fighters who continue to fight at a late age might be the ones least affected by age-related decline, which is why they don't retire sooner. So maybe there is a modest decline past prime, then a steep decline for a period well past prime, then a slower decline into truly advanced fighter age. Who knows.

4) I don't understand the reasoning in your choice of k. You say that you want to choose k so as to yield a realistic age advantage, then you seem to choose k to maximize the modeled age-based advantage. Doesn't this make the model question-begging? Why not choose a k so that the model most accurately matches the outcome in a dataset of old fights? Maybe I just don't understand what you were saying cuz I don't know any statistics.
 
How was the "prime age" of 27.5 arrived at? Was it the age that led to the highest percentage probability over an average age range for fighters? say 18 - 40 or something like that? or...something else like sports medicine's theory for prime age of an athlete?
 