Improving Online Algorithms via ML Predictions


TL;DR

This paper proposes a theoretical framework that uses machine learning (ML) predictions to improve the performance of online algorithms. The designed algorithms achieve near-optimal performance when the predictions are accurate (consistency), and do not degrade significantly when the predictions are wrong, gracefully falling back to the level of classic online algorithms (robustness).

Key Definitions

The paper introduces or adopts the following concepts, which are crucial for understanding the combination of online algorithms and ML predictions:

There are currently two main paradigms for handling uncertainty: one is machine learning, which builds models from historical data to predict the future; the other is online algorithms, which aim to design strategies with guaranteed worst-case performance, but such strategies are often too conservative and cannot exploit potentially favorable information.

Recent research has begun to combine the two, using ML predictions to improve the performance of online algorithms. The core challenge is how to design an algorithm that:

  1. When the predictions are accurate, it can beat the worst-case lower bounds of classical online algorithms (good consistency).
  2. When the predictions are wrong, its performance does not collapse and degrades gracefully to at least the level of classical online algorithms (robustness).

This paper aims to design new algorithms with both consistency and robustness for two classic online problems: ski rental and non-clairvoyant job scheduling.

Method

The core idea of this paper is not to blindly trust ML predictions, but to use them as a basis for adjusting the decision-making strategy of classic online algorithms, introducing a hyperparameter $\lambda$ to balance trust in the predictions against protection against the worst case.

Ski Rental

Problem setting: Renting skis costs 1 unit per day, while buying them outright costs $b$ units. The skier does not know how many days they will ski in total (the true number of days is $x$), but there is an ML model that predicts the number of days as $y$.
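The rent-versus-buy trade-off can be made concrete with a small cost function (a minimal sketch; the names `cost` and `opt` are ours, not from the paper):

```python
def cost(x: int, buy_day: int, b: int) -> int:
    """Total cost of skiing x days when we rent until buy_day,
    then buy for b units at the start of buy_day."""
    if x < buy_day:
        return x              # trip ends before we buy: rent every day
    return (buy_day - 1) + b  # rent for buy_day - 1 days, then buy

def opt(x: int, b: int) -> int:
    # The offline optimum knows x: buy up front if x >= b, else rent throughout.
    return min(x, b)

print(cost(x=10, buy_day=5, b=7))  # 4 days of rent + 7 = 11
print(opt(10, 7))                  # 7
```

An online strategy is then just a rule for choosing `buy_day` from $b$ and the prediction $y$, without knowing $x$.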

Limitations of the Naive Algorithm

A simple idea is to fully trust the prediction: if $y \geq b$, buy on the first day; otherwise, keep renting. This algorithm (Algorithm 1) is optimal when the prediction is accurate (prediction error $\eta = |y - x| = 0$), so it is 1-consistent. But if the prediction badly underestimates (for example, it predicts a short trip $y < b$ while the actual number of days is much larger, $x \gg b$), the skier rents forever and the cost can be arbitrarily large relative to the optimum, so the algorithm is not robust.
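This failure mode is easy to check numerically. In the sketch below (helper names are ours), `naive_buy_day` trusts the prediction blindly; when $y < b$ but $x$ is large, the ratio to the offline optimum $\min(x, b)$ grows without bound:

```python
def naive_buy_day(y: int, b: int) -> float:
    # Trust the prediction: buy immediately if y >= b, otherwise never buy.
    return 1 if y >= b else float("inf")

def cost(x, buy_day, b):
    # Rent until buy_day, then pay b; if the trip ends first, pay x in rent.
    return x if x < buy_day else (buy_day - 1) + b

b, y = 100, 50  # prediction says "short trip", so we never buy
for x in (50, 1_000, 10_000):
    ratio = cost(x, naive_buy_day(y, b), b) / min(x, b)
    print(x, ratio)  # once x > b the ratio is x / b: unbounded
```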

Deterministic Robust and Consistent Algorithm (Algorithm 2)

To address this issue, the paper proposes a deterministic algorithm with a hyperparameter $\lambda \in (0, 1)$: if $y \geq b$, buy at the start of day $\lceil \lambda b \rceil$; otherwise, buy at the start of day $\lceil b/\lambda \rceil$. This makes the algorithm $(1+\lambda)$-consistent and $(1+1/\lambda)$-robust.

Innovation: a single parameter $\lambda$ balances trust in the prediction (buying early when a long trip is predicted) against a conservative fallback (still buying eventually even when a short trip is predicted).
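Algorithm 2's rule (buy on day $\lceil \lambda b \rceil$ if $y \geq b$, else on day $\lceil b/\lambda \rceil$) can be sketched in a few lines; the helper names are ours:

```python
import math

def det_buy_day(y: int, b: int, lam: float) -> int:
    # Buy early when the prediction says "long trip",
    # buy late (but still eventually) when it says "short trip".
    return math.ceil(lam * b) if y >= b else math.ceil(b / lam)

def cost(x, buy_day, b):
    # Rent until buy_day, then pay b; if the trip ends first, pay x in rent.
    return x if x < buy_day else (buy_day - 1) + b

b, lam = 100, 0.5
# Wrong "short trip" prediction with a very long season: cost stays bounded.
x, y = 10_000, 50
print(cost(x, det_buy_day(y, b, lam), b) / min(x, b))  # 2.99 <= 1 + 1/lam
```

With an accurate "long trip" prediction ($y \geq b$ and $x \geq b$), the same rule buys on day $\lceil \lambda b \rceil$ and pays at most $(1+\lambda)$ times the optimum.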

Randomized Robust and Consistent Algorithm (Algorithm 3)

To obtain a better performance trade-off, the paper further designs a randomized algorithm. Instead of buying on a fixed day, it randomly selects a purchase day within a time window according to a specific probability distribution.

Innovation: By randomizing, the algorithm smooths out the “sharpness” of the decision and avoids being targeted by the adversary at a specific decision point.
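The paper samples the buy day from a carefully chosen distribution over a prediction-dependent window. The geometric-style weights below, echoing the classic randomized ski-rental algorithm, are our reconstruction and should be treated as an assumption rather than the paper's exact distribution:

```python
import math
import random

def rand_buy_day(y: int, b: int, lam: float,
                 rng: random.Random = random.Random(0)) -> int:
    # Window size depends on the prediction, as in the deterministic version.
    k = math.ceil(lam * b) if y >= b else math.ceil(b / lam)
    # Geometric-style weights put most probability mass near day k,
    # so no single buy day can be targeted by the adversary.
    weights = [((b - 1) / b) ** (k - i) for i in range(1, k + 1)]
    return rng.choices(range(1, k + 1), weights=weights)[0]

b, lam = 100, 0.5
days = [rand_buy_day(120, b, lam) for _ in range(5)]
print(days)  # each sampled day lies in 1..ceil(lam * b) = 1..50
```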


Non-clairvoyant Job Scheduling

Problem setting: Schedule $n$ jobs on a single machine with the goal of minimizing the total completion time. The actual processing time $x_j$ of each job is unknown, but a predicted value $y_j$ is available. Jobs can be preempted and resumed at any time. The classic optimal non-clairvoyant algorithm is Round-Robin (RR), with a competitive ratio of 2.

Preferential Round-Robin (PRR)

The paper proposes the PRR algorithm, which combines two strategies:

  1. Shortest Predicted Job First (SPJF): Execute jobs in increasing order of predicted processing time $y_j$. This strategy has good consistency but poor robustness.
  2. Round-Robin (RR): Allocate CPU time evenly among all unfinished jobs. This strategy is robust (competitive ratio 2).

The PRR algorithm executes the SPJF strategy at rate $\lambda$ while executing RR at rate $1-\lambda$. Specifically, at any time, the currently “shortest predicted” unfinished job receives processing rate $\lambda$, while all unfinished jobs share the remaining $1-\lambda$ processing rate.

Innovation: This algorithm is a hybrid strategy that uses a parameter $\lambda$ to combine a high-performance prediction-based strategy with a classic worst-case-guaranteed strategy.
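The rate-sharing rule can be sketched with a small discrete-time simulation (our sketch: the step size `dt` and all helper names are ours; the paper's algorithm runs in continuous time):

```python
def prr_total_completion_time(x, y, lam, dt=1e-3):
    """Simulate Preferential Round-Robin: the unfinished job with the
    smallest prediction runs at extra rate lam, while all unfinished
    jobs share the remaining 1 - lam equally. Returns the sum of
    completion times."""
    remaining = list(map(float, x))
    done = [False] * len(x)
    t, total = 0.0, 0.0
    while not all(done):
        alive = [j for j in range(len(x)) if not done[j]]
        fav = min(alive, key=lambda j: y[j])  # shortest predicted job
        for j in alive:
            rate = (1 - lam) / len(alive) + (lam if j == fav else 0.0)
            remaining[j] -= rate * dt
        t += dt
        for j in alive:
            if remaining[j] <= 0:
                done[j] = True
                total += t
    return total

# With perfect predictions, lam near 1 approaches the SPJF schedule
# (total 11 for these jobs), while lam = 0 is pure RR (total 15).
x = [1.0, 2.0, 4.0]
print(prr_total_completion_time(x, x, lam=0.9))
```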

Experimental Results

The paper validates the effectiveness of the proposed algorithms through simulation experiments.

Ski Rental Experiment

Oblivious Scheduling Experiment

| N  | min | max   | mean | $\sigma$ |
|----|-----|-------|------|----------|
| 50 | 1   | 22352 | 2168 | 5475.42  |

Figure: average competitive ratio under different prediction errors, for (a) Ski Rental and (b) Oblivious Scheduling.

Final Conclusion: The experimental results strongly support the theoretical analysis, showing that the algorithmic framework proposed in this paper can effectively leverage machine learning predictions in practice to improve the performance of online decision-making while maintaining the necessary safety guarantees.