Comeback Analysis in Professional League of Legends

Author: Evan Ngo

Introduction

League of Legends (LoL) is a multiplayer online battle arena (MOBA) game developed by Riot Games, where two teams of five players compete to destroy the opposing team’s Nexus. With millions of players worldwide, it has become one of the most influential esports in the gaming industry. The dataset used in this analysis comes from Oracle’s Elixir and contains match data from professional LoL esports matches throughout 2022.

In professional League of Legends, the early game often sets the tone for the entire match. Teams that establish gold leads by the 15-minute mark typically have significant advantages in itemization, map control, and objective pressure. However, the game is designed with comeback mechanics, and skilled teams can overcome early deficits through superior teamfighting, objective control, and late-game scaling.

This analysis focuses on “comeback” games–matches where a team overcomes a meaningful gold deficit at the 15-minute mark to ultimately win. We define a “meaningful deficit” as being at least 1,500 gold behind, which represents roughly one item component and filters out trivial differences that could swing with a single kill.

Central Question

Do Eastern leagues (LCK, LPL) have a higher comeback rate than Western leagues (LCS, LEC, CBLOL) when teams are significantly behind in gold at 15 minutes?

This question is relevant to coaches, analysts, and fans who want to understand regional differences in gameplay philosophy. Eastern teams, particularly those from Korea (LCK) and China (LPL), are often characterized as more patient and disciplined, potentially making them better at playing from behind.

Dataset Overview

The dataset contains approximately 150,000 rows of professional match data. Each game generates 12 rows: 10 for individual players and 2 for team-level summaries. For this analysis, we focus on team-level data from Tier-One leagues.

Column	Description
`gameid`	Unique identifier for each match
`league`	Professional league (LCK, LPL, LCS, LEC, CBLOL, etc.)
`result`	Match outcome (1 = win, 0 = loss)
`golddiffat15`	Gold difference at 15 minutes (negative = behind)
`xpdiffat15`	Experience difference at 15 minutes
`killsat15`, `deathsat15`, `assistsat15`	Combat statistics at 15 minutes
`side`	Team side (Blue or Red)

Data Cleaning and Exploratory Data Analysis

Data Cleaning

The data cleaning process involved several key steps:

Filtered for team-level rows by selecting only rows where position == 'team', reducing the dataset to team summaries rather than individual player statistics.
Filtered for Tier-One leagues (LCK, LPL, LCS, LEC, CBLOL) to focus on the highest level of professional play.
Created comeback-related features:
- behind_at_15: True if the team’s gold difference at 15 minutes was ≤ -1,500
- comeback: True if the team was behind at 15 minutes but won the game
- region: Classified leagues as “Eastern” (LCK, LPL) or “Western” (LCS, LEC, CBLOL)
Removed rows with missing values in key 15-minute statistics to ensure complete data for analysis.

Below is a sample of the cleaned dataset:

gameid	league	region	side	result	golddiffat15	behind_at_15	comeback
ESPORTSTMNT01_2690210	LCK	Eastern	Blue	0	-2341	True	False
ESPORTSTMNT01_2690210	LCK	Eastern	Red	1	2341	False	False
ESPORTSTMNT01_2690219	LCK	Eastern	Blue	1	-1823	True	True

Univariate Analysis

The distribution of gold difference at 15 minutes follows an approximately normal distribution centered around zero, which is expected since every game has one team ahead and one team behind by equal amounts.

The histogram reveals that most games have gold differences within ±5,000 at 15 minutes, with extreme leads (>8,000 gold) being relatively rare in professional play.

Bivariate Analysis

Examining comeback rates by league reveals interesting regional patterns:

The relationship between gold difference at 15 minutes and win rate shows a clear positive correlation–teams with larger gold leads at 15 minutes win more frequently. However, teams behind by 1,500-3,000 gold still win approximately 30-40% of their games, demonstrating that comebacks are possible.

Interesting Aggregates

The table below compares aggregate statistics between Eastern and Western regions:

Region	Total Games	Games Behind @15	Comebacks	Comeback Rate
Eastern	934	233	35	~15%
Western	1584	461	96	~21%

Win rates by gold deficit category reveal how the magnitude of the deficit affects comeback probability:

Deficit Category	Win Rate
Large Deficit (< -3k)	~10%
Medium Deficit (-3k to -1.5k)	~22%
Small Deficit (-1.5k to 0)	~38%
Small Lead (0 to 1.5k)	~62%
Medium Lead (1.5k to 3k)	~78%
Large Lead (> 3k)	~90%

Assessment of Missingness

NMAR Analysis

The golddiffat15 column is likely Not Missing At Random (NMAR). The missingness is related to the values themselves for the following reasons:

Games ending before 15 minutes: Some professional games end in very fast stomps, technical difficulties, or forfeits, meaning 15-minute statistics never existed for these matches.
Data collection differences by league: Some leagues may have incomplete data collection and/or different data collection methods.

Missingness Dependency

We conducted permutation tests to determine whether the missingness of golddiffat15 depends on other columns.

Test 1: Missingness vs. League

Null Hypothesis: The distribution of league is the same when golddiffat15 is missing vs. not missing.
Test Statistic: Total Variation Distance (TVD) P-value: 0.0
Result: p-value < 0.05, indicating that the missingness of golddiffat15 does depend on league.

Test 2: Missingness vs. Barons

Null Hypothesis: The mean of barons is the same when golddiffat15 is missing vs. not missing.
Test Statistic: Absolute difference in means
P-value: 0.232
Result: p-value > 0.05, indicating that missingness of golddiffat15 does not depend on barons.

Hypothesis Testing

We conducted a permutation test to determine whether Eastern leagues have a higher comeback rate than Western leagues.

Null Hypothesis (H₀): In Tier-One games where a team is at least 1,500 gold behind at 15 minutes, Eastern leagues (LCK, LPL) and Western leagues (LCS, LEC, CBLOL) have the same probability of coming back to win.

Alternative Hypothesis (H₁): Eastern leagues are more likely to come back and win than Western leagues when at least 1,500 gold behind.

Test Statistic: Difference in comeback rates (Eastern - Western)

Significance Level: α = 0.05

Methodology

We used a permutation test with 10,000 iterations. Under the null hypothesis, the region labels (Eastern/Western) have no effect on comeback probability, so we shuffled these labels and recalculated the difference in comeback rates for each permutation.

Results

The observed difference in comeback rates between Eastern and Western leagues was compared against the null distribution.

P-value: 0.9731

Conclusion

Based on the p-value obtained from the permutation test, we fail to reject the null hypothesis and conclude that there is no statistically significant difference in comeback rates between Eastern and Western regions.

This informs us that regional playstyle differences do not necessarily translate to measurable performance differences when playing from behind. Despite common perceptions that Eastern teams (LCK, LPL) may be more patient or disciplined in deficit situations, the data does not support the claim that they are significantly better at executing comebacks than Western teams (LCS, LEC, CBLOL).

Framing a Prediction Problem

Prediction Problem: Given the game state at 15 minutes, predict whether a team currently behind in gold will win the game.

Type: Binary Classification

Target Variable: result (1 = win, 0 = loss)

Evaluation Metric: F1-Score

Why F1-Score? The data is imbalanced–comebacks are relatively rare (teams behind at 15 minutes comeback only 16% of the time). Accuracy alone would be misleading, as a model predicting “always lose” would achieve an accuracy of 84% but be useless. F1-Score balances precision and recall, making it appropriate for imbalanced classification.

Features Available at Time of Prediction (all known at 15 minutes):

golddiffat15: Gold difference
xpdiffat15: Experience difference
csdiffat15: Creep score difference
killsat15, deathsat15, assistsat15: Combat statistics
league: Professional league (categorical)
side: Blue or Red side (categorical)

Baseline Model

Model Description

The baseline model uses a Decision Tree Classifier with max_depth=5 to predict whether a team behind at 15 minutes will win.

Features (2 total):

golddiffat15 (quantitative): Gold difference at 15 minutes
xpdiffat15 (quantitative): Experience difference at 15 minutes

Feature Encoding:

Both features are quantitative and were standardized using StandardScaler to normalize them to zero mean and unit variance.

Performance

Metric	Training Set	Test Set
Accuracy	~0.86	~0.82
F1-Score	~0.22	~0.08
Precision	~1.00	~0.29
Recall	~0.13	~0.04

Assessment

The baseline model achieves subpar accuracy (81.85%) and poor F1-score (0.0755).

Because comebacks are the minority class–only about 16% of games where teams are 1,500+ gold behind result in a comeback–a model that always predicts “loss” would achieve ~84% accuracy by default. Moreover, the baseline’s extremely low recall (0.0435) confirms it only identifies about 4% of actual comebacks.

In other words, the baseline model is not good and performs slightly worse than a naive model that predicts “no comeback” every time.

Final Model

Features Added

The final model expands the feature set to capture more nuanced aspects of game state:

Quantitative Features (8 total):

golddiffat15, xpdiffat15, csdiffat15: Resource differences
killsat15, deathsat15, assistsat15: Combat performance
kda_at_15 (engineered): (kills + assists) / (deaths + 1)
xp_gold_ratio (engineered): (xp) / (gold - 1)

Categorical Features (2 total):

league: Different leagues have different playstyles and metas
side: Blue/Red side affects dragon control and map dynamics

Why These Features Improve Performance

KDA ratio: Teams behind in gold but with good KDA may have better teamfight potential, indicating they lost gold through macro mistakes rather than combat ability.
League: Different regions have different metas; Eastern leagues may be more patient and better at scaling into late game.
XP to Gold ratio: Teams with a lower XP-Gold ratio indicate that they are relatively even in experience though behind in gold, showing promising fighting potential.
Side: Blue side has slight map advantages that may contribute to comeback potential.

Model Algorithm and Hyperparameter Tuning

The final model uses a Random Forest Classifier with class_weight='balanced' to address the imbalanced nature of comeback prediction.

Hyperparameters tuned via GridSearchCV (5-fold cross-validation, optimizing F1-score):

n_estimators: [100, 150, 200] –> Best: 150
max_depth: [3, 5, 7] –> Best: 3
min_samples_split: [10, 15, 20] –> Best: 10
min_samples_leaf: [5, 10, 15] –> Best: 15

Key Design Decision: Regularization

Early experiments with deeper trees (max_depth=15) and fewer samples per leaf resulted in severe overfittin–near-perfect training performance but worse-than-baseline test performance. The regularized hyperparameter grid constrains the model complexity, ensuring it learns generalizable patterns rather than memorizing training examples.

Encoding:

Quantitative features: StandardScaler
Categorical features: OneHotEncoder (with drop='first' to avoid multicollinearity)

Performance Comparison

Model	Accuracy (Test)	F1-Score (Test)	Precision (Test)	Recall (Test)
Baseline (Decision Tree)	0.8185	0.0755	0.2857	0.0435
Final (Random Forest)	0.5778	0.3372	0.2302	0.6304
Change	-29.4%	+346.6%	-19.4%	+1349.4%

Interpreting the Results

At first glance, the drop in accuracy from 81.85% to 57.78% might seem concerning. However, the new F1-score represents a significant improvement in the model’s usefulness.

What the Final Model Does Differently:

The final model is now willing to predict comebacks. The dramatic improvement in recall (from 4.35% to 63.04%) means the model now correctly identifies close to two-thirds of actual comebacks, compared to barely any before.

Why F1-Score is the Right Metric:

F1-score balances precision and recall, making it ideal for imbalanced classification. The 337% improvement in F1-score confirms that the final model is better at the task of predicting comebacks, despite the lower accuracy.

Fairness Analysis

Question

Does our model perform equally well for Eastern leagues vs. Western leagues?

Groups

Group X: Eastern leagues (LCK, LPL)
Group Y: Western leagues (LCS, LEC, CBLOL)

Evaluation Metric

Precision: The proportion of predicted comebacks that are actual comebacks. This metric is important because false positive predictions (predicting a comeback that doesn’t happen) could lead to poor strategic decisions.

Hypotheses

Null Hypothesis (H₀): Our model is fair. Its precision for Eastern leagues and Western leagues is roughly the same, and any differences are due to random chance.

Alternative Hypothesis (H₁): Our model is unfair. Its precision differs between Eastern and Western leagues.

Significance Level: α = 0.05

Methodology

We conducted a permutation test with 1,000 iterations, shuffling the region labels and recalculating the precision difference for each permutation to generate a null distribution.

Results

P-value: 0.6510

Conclusion

Based on the p-value of 0.6510 (greater than our significance level of 0.05), we fail to reject the null hypothesis. There is no statistically significant difference in precision between Eastern and Western leagues, suggesting the model is fair across regions.

This means that when the model predicts a comeback, it is equally reliable regardless of whether the game is from an Eastern league (LCK, LPL) or a Western league (LCS, LEC, CBLOL). Analysts and coaches from both regions can trust the model’s predictions with similar confidence, and there is no evidence of regional bias in the model’s performance.