Version 1.4 Update

For version 1.4 I’ve decided to double down on making fewer predictions, but more accurate ones. I have eliminated heavyweight fighters, who statistically fight very differently from every other weight class. They’re an outlier weight class, just like women’s MMA is. I have retuned the hyperparameters using Optuna, and I have added a feature incorporating title fight losses because I’ve noticed a pattern of fighters who recently lost the title going on skids. The training data now cuts off at 2016 rather than 2014, and I increased the testing data set size to 15% of the training dataset. This means it’s being tested against the last 1.25 years of fights.

Last, I calculated the decline of fighters as they age and incorporated this into each fighter’s Elo score. Elo is the best predictive feature in the whole dataset, but it is very inaccurate at predicting fighter decline as they age. I calculated the average win rate of fighters at each year of age and found the peak is at 27.5 years old. Real decline starts around 30, and win rate declines about 1% per year until 34 or 35, at which point it declines even faster. By calculating how much win probability 1 point of Elo over your opponent is worth, I can convert that yearly decline into Elo points, so I now decline aging fighters’ Elo scores on a daily basis based on how old they are.
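As a rough sanity check on that conversion, the sketch below compares the value of 1 Elo point against the conventional Elo expectation curve and then repeats the per-day arithmetic. It assumes the standard 400-point logistic formula, which this post does not confirm is the variant in use. `elo_expected` is a hypothetical helper, and the 0.8% and 0.14% figures are taken from the comments in `age_decline` further down.

```python
def elo_expected(diff: float, scale: float = 400.0) -> float:
    """Win probability for a fighter rated `diff` Elo points above the opponent.

    Assumption: the conventional logistic Elo curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


# Win chance gained from a 1-point Elo edge in an otherwise even matchup,
# expressed in percentage points.
pct_per_elo_point = (elo_expected(1.0) - elo_expected(0.0)) * 100.0

# Figures from the age_decline comments: 0.8% lost per year, 0.14% per Elo point.
yearly_decline_pct = 0.8
daily_decline_pct = yearly_decline_pct / 365.0
elo_points_per_day = daily_decline_pct / 0.14

print(f"1 Elo point is worth about {pct_per_elo_point:.3f}% win chance")
print(f"Decline of about {elo_points_per_day:.4f} Elo points per day")
```

Under that assumption the standard curve puts 1 Elo point at roughly 0.144% near an even matchup, which is close to the 0.14% used here, and the per-day figure rounds to the 0.0157 used in the decline function.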

Hyperparameters

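import xgboost as xgb

# Note: 'eta' is XGBoost's alias for 'learning_rate', so the two entries below overlap.
# Note: 'gpu_hist' is deprecated from XGBoost 2.0 in favor of tree_method='hist' with device='cuda'.
params = {'tree_method': 'gpu_hist', 'objective': 'binary:logistic', 'verbosity': 0, 'n_jobs': -1,
          'learning_rate': 0.006379731330665644, 'min_child_weight': 5, 'max_depth': 1,
          'subsample': 0.4329771439302427, 'colsample_bytree': 0.28566614739884083,
          'gamma': 0.047745011818589665, 'n_estimators': 158, 'eta': 0.10543103310179618}
clf = xgb.XGBClassifier(**params)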

Features

precomp_avg_takedowns_attempts_per_min_peak_vs_opp 0.021260858
precomp_change_recent_avg_total_takedowns_absorbed_differential_vs_peak_vs_opp 0.021508418
precomp_recent_avg_ground_strikes_absorbed_differential 0.021862723
precomp_avg_days_since_last_comp_vs_opp 0.02355229
precomp_avg_clinch_strikes_absorbed_vs_opp 0.023759399
precomp_recent_avg_head_strikes_absorbed_differential_valley_vs_opp 0.023968771
precomp_avg_distance_strikes_landed_differential_valley_vs_opp 0.02415467
precomp_avg_distance_strikes_def_differential_valley_vs_opp 0.025220215
precomp_avg_win_streak_vs_opp 0.02570903
precomp_takedowns_attempts_per_min_peak_vs_opp 0.025853954
precomp_recent_avg_sig_strikes_absorbed_differential_valley_vs_opp 0.026118034
precomp_avg_head_strikes_absorbed_differential_peak 0.02732838
precomp_avg_sig_strikes_acc_differential_valley 0.027433267
precomp_avg_sig_strikes_def_peak_vs_opp 0.027729398
precomp_recent_avg_takedowns_attempts_per_min_peak_vs_opp 0.028087102
precomp_win_loss_ratio_vs_opp 0.028091302
precomp_control_differential_peak_vs_opp 0.028421916
precomp_avg_takedowns_attempts_per_min_vs_opp 0.028529506
precomp_sig_strikes_attempts_differential_valley_vs_opp 0.029160006
precomp_recent_avg_head_strikes_def_vs_opp 0.029465998
precomp_avg_ground_strikes_absorbed_differential_vs_opp 0.029726882
precomp_recent_avg_sig_strikes_absorbed_differential 0.029921925
precomp_avg_sig_strikes_absorbed_differential_peak_vs_opp 0.02993867
precomp_avg_head_strikes_landed_per_min_differential_vs_opp 0.03018053
precomp_avg_ground_strikes_absorbed_peak_vs_opp 0.030213127
precomp_avg_sig_strikes_absorbed_differential_vs_opp 0.030442381
precomp_recent_avg_sig_strikes_absorbed_differential_vs_opp 0.030662917
precomp_avg_lose_streak_vs_opp 0.030725233
precomp_recent_avg_head_strikes_absorbed_differential_vs_opp 0.031872895
precomp_avg_distance_strikes_absorbed_differential_vs_opp 0.03287488
precomp_change_avg_elo_differential 0.033649307
precomp_avg_head_strikes_absorbed_differential_vs_opp 0.03492086
precomp_recent_avg_age_vs_opp 0.03507302
precomp_age_differential_vs_opp 0.035898283
precomp_elo_differential 0.03668383

The age-related decline applied to Elo scores is calculated as follows:

def age_decline(self, age: float, dslc: float) -> float:
    """Return the Elo points to deduct for aging.

    age:  fighter's age in days
    dslc: days since last competition
    """
    # Age decline
    # You lose about .8% win chance per year after 30
    # 1 elo point = ~.14% win increase chance
    # .8 / 365 days = .0022 loss of win % per day
    # .14 / .0022 = 63.6 days to lose 1 elo point
    # 1 / 63.6 = .0157 elo points per day

    # 34+ (12418.5 days = 34 * 365.25): we add 30% decline (.0157 * 1.3 = .02041)
    if age > 12418.5:
        decline = dslc * .02041
    # 30-34 (10957.5 days = 30 * 365.25): we add avg decline
    elif age > 10957.5:
        decline = dslc * .0157
    else:
        decline = 0.0

    return decline
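To make the effect concrete, here is a minimal usage sketch. It restates the function above as a standalone function (without `self`) and subtracts the result from a rating. The names `adjusted_elo`, `DAYS_PER_YEAR`, and the rate constants are hypothetical, and subtracting the decline from the pre-fight Elo is an assumption about how the value is applied.

```python
DAYS_PER_YEAR = 365.25
AGE_30_DAYS = 10957.5   # 30 * 365.25
AGE_34_DAYS = 12418.5   # 34 * 365.25
BASE_RATE = 0.0157      # Elo points lost per day between 30 and 34
FAST_RATE = 0.02041     # 30% faster decline after 34


def age_decline(age_days: float, dslc: float) -> float:
    """Standalone copy of the decline rule: Elo points lost over a layoff."""
    if age_days > AGE_34_DAYS:
        return dslc * FAST_RATE
    if age_days > AGE_30_DAYS:
        return dslc * BASE_RATE
    return 0.0


def adjusted_elo(elo: float, age_years: float, dslc: float) -> float:
    """Assumed application: subtract the age decline from the current Elo."""
    return elo - age_decline(age_years * DAYS_PER_YEAR, dslc)


# Three fighters with the same rating and a 200-day layoff, at different ages.
for age_years in (28.0, 32.0, 36.0):
    print(age_years, round(adjusted_elo(1600.0, age_years, 200.0), 2))
```

One thing this shows is that the whole layoff is charged at the rate of the fighter's current age bracket, so a fighter who turned 30 partway through a layoff is treated as if they had been over 30 for all of it.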

Last, I have updated the Upcoming page to include the event, the predicted winner, a confidence score out of 5 for that winner, and the Vegas odds to compare against when they’re available.