3 months ago

HoTPP Benchmark: Are We Good at the Long Horizon Events Forecasting?

Ivan Karpukhin, Foma Shipilov, Andrey Savchenko

Abstract

Forecasting multiple future events within a given time horizon is essential for applications in finance, retail, social networks, and healthcare. Marked Temporal Point Processes (MTPP) provide a principled framework to model both the timing and labels of events. However, most existing research focuses on predicting only the next event, leaving long-horizon forecasting largely underexplored. To address this gap, we introduce HoTPP, the first benchmark specifically designed to rigorously evaluate long-horizon predictions. We identify shortcomings in widely used evaluation metrics, propose a theoretically grounded T-mAP metric, present strong statistical baselines, and offer efficient implementations of popular models. Our empirical results demonstrate that modern MTPP approaches often underperform simple statistical baselines. Furthermore, we analyze the diversity of predicted sequences and find that most methods exhibit mode collapse. Finally, we analyze the impact of autoregression and intensity-based losses on prediction quality, and outline promising directions for future research. The HoTPP source code, hyperparameters, and full evaluation results are available on GitHub.

Code Repositories

ivan-chai/hotpp-benchmark (official, PyTorch)
ivan-chai/torch-linear-assignment (PyTorch)

Benchmarks

| Benchmark | Methodology | Accuracy (%) | MAE | OTD | T-mAP |
|---|---|---|---|---|---|
| point-processes-on-agegroup-transactions-mtpp | NHP | 35.43 | 0.696 | 6.97 | 5.61 |
| point-processes-on-agegroup-transactions-mtpp | ODE-RNN | 35.6 | 0.695 | 6.97 | 5.52 |
| point-processes-on-agegroup-transactions-mtpp | IFTPP | 34.08 | 0.693 | 6.90 | 5.88 |
| point-processes-on-agegroup-transactions-mtpp | RMTPP | 34.15 | 0.749 | 6.88 | 6.69 |
| point-processes-on-amazon-mtpp | IFTPP | 35.73 | 0.242 | 6.52 | 22.56 |
| point-processes-on-amazon-mtpp | RMTPP | 35.76 | 0.294 | 6.57 | 20.06 |
| point-processes-on-amazon-mtpp | NHP | 11.06 | 0.449 | 9.02 | 26.29 |
| point-processes-on-retweet-mtpp | ODE-RNN | 59.95 | 18.38 | 165.3 | 48.81 |
| point-processes-on-retweet-mtpp | NHP | 60.08 | 18.42 | 165.8 | 45.07 |
| point-processes-on-retweet-mtpp | AttNHP | 60.03 | 18.39 | 171.6 | 25.85 |
| point-processes-on-retweet-mtpp | RMTPP | 60.07 | 18.45 | 166.7 | 44.74 |
| point-processes-on-retweet-mtpp | IFTPP | 59.95 | 18.27 | 172.7 | 31.75 |
| point-processes-on-stackoverflow-mtpp | IFTPP | 45.41 | 0.641 | 13.64 | 8.31 |
| point-processes-on-stackoverflow-mtpp | RMTPP | 45.43 | 0.701 | 13.17 | 12.72 |
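
To make the two simpler columns concrete, here is a minimal sketch of how label accuracy and time MAE could be computed for next-event predictions. It assumes each event is a (time delta, label) pair and that predictions are aligned one-to-one with ground truth. The function name `next_event_metrics` and the data layout are hypothetical and are not taken from the HoTPP code base. OTD and T-mAP compare whole predicted sequences and are not sketched here; see the paper for their definitions.

```python
def next_event_metrics(true_events, predicted_events):
    """Compute label accuracy (in percent) and time MAE for aligned events.

    Both arguments are lists of (time_delta, label) pairs, where the i-th
    prediction corresponds to the i-th true event (an assumption of this sketch).
    """
    if not true_events or len(true_events) != len(predicted_events):
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(true_events)
    # Count predictions whose label matches the true label.
    correct = sum(
        1
        for (_, label), (_, label_hat) in zip(true_events, predicted_events)
        if label == label_hat
    )
    # Sum absolute errors between true and predicted time deltas.
    abs_error = sum(
        abs(dt - dt_hat)
        for (dt, _), (dt_hat, _) in zip(true_events, predicted_events)
    )
    return {"accuracy_pct": 100.0 * correct / n, "mae": abs_error / n}


# Toy data, invented for illustration only.
true_events = [(1.0, 0), (2.0, 1), (0.5, 2), (3.0, 1)]
predicted_events = [(1.5, 0), (2.0, 2), (1.0, 2), (2.0, 1)]
metrics = next_event_metrics(true_events, predicted_events)
print(metrics)
```

Both quantities score only one event at a time, which is why the table reports them alongside the sequence-level OTD and T-mAP columns.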
