HiRID-ICU-Benchmark -- A Comprehensive Machine Learning Benchmark on High-resolution ICU Data

Abstract

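The recent success of machine learning methods applied to time series collected in intensive care units (ICUs) has exposed the lack of standardized machine learning benchmarks for developing and comparing such methods. Although raw datasets such as MIMIC-IV or eICU can be freely accessed on Physionet, the choice of tasks and preprocessing is often made ad hoc for each paper, which limits comparability across papers. In this work, we aim to improve this situation by providing a benchmark that covers a broad range of ICU-related tasks. Using the HiRID dataset, we define multiple clinically relevant tasks in collaboration with clinicians. In addition, we provide a reproducible end-to-end pipeline for constructing both data and labels. Finally, we provide an in-depth analysis of current state-of-the-art sequence modeling methods, highlighting some limitations of deep learning approaches on this type of data. With this benchmark, we hope to give the research community a way to compare their work fairly.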

Code Repositories

ratschlab/HIRID-ICU-Benchmark
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark | Method | Metric
circulatory-failure-on-hirid | LSTM | AUPRC: 0.322±0.008
circulatory-failure-on-hirid | LGBM | AUPRC: 0.389±0.003
circulatory-failure-on-hirid | LGBM (+ hand-crafted features) | AUPRC: 0.388±0.002
circulatory-failure-on-hirid | TCN | AUPRC: 0.358±0.006
circulatory-failure-on-hirid | GRU | AUPRC: 0.368±0.005
circulatory-failure-on-hirid | Transformer | AUPRC: 0.352±0.006
circulatory-failure-on-hirid | Logistic Regression | AUPRC: 0.305±0.000
icu-mortality-on-hirid | Logistic Regression | AUPRC: 0.581±0.000
icu-mortality-on-hirid | Transformer | AUPRC: 0.610±0.008
icu-mortality-on-hirid | GRU | AUPRC: 0.603±0.016
icu-mortality-on-hirid | LSTM | AUPRC: 0.600±0.009
icu-mortality-on-hirid | LGBM | AUPRC: 0.546±0.008
icu-mortality-on-hirid | LGBM (+ hand-crafted features) | AUPRC: 0.626±0.000
icu-mortality-on-hirid | TCN | AUPRC: 0.602±0.011
kidney-function-on-hirid | LSTM | MAE: 0.50±0.01
kidney-function-on-hirid | GRU | MAE: 0.49±0.02
kidney-function-on-hirid | LGBM (+ hand-crafted features) | MAE: 0.45±0.00
kidney-function-on-hirid | Transformer | MAE: 0.48±0.02
kidney-function-on-hirid | TCN | MAE: 0.50±0.01
kidney-function-on-hirid | LGBM | MAE: 0.45±0.00
patient-phenotyping-on-hirid | TCN | Balanced Accuracy: 41.6±2.3
patient-phenotyping-on-hirid | LGBM | Balanced Accuracy: 40.4±0.8
patient-phenotyping-on-hirid | LGBM (+ hand-crafted features) | Balanced Accuracy: 45.8±2.0
patient-phenotyping-on-hirid | Transformer | Balanced Accuracy: 42.7±1.4
patient-phenotyping-on-hirid | GRU | Balanced Accuracy: 39.2±2.1
patient-phenotyping-on-hirid | Logistic Regression | Balanced Accuracy: 39.1±0.0
patient-phenotyping-on-hirid | LSTM | Balanced Accuracy: 39.5±1.2
remaining-length-of-stay-on-hirid | LGBM (+ hand-crafted features) | MAE: 57.0±0.3
remaining-length-of-stay-on-hirid | LGBM | MAE: 56.9±0.4
remaining-length-of-stay-on-hirid | Transformer | MAE: 59.5±2.8
remaining-length-of-stay-on-hirid | TCN | MAE: 59.8±2.8
remaining-length-of-stay-on-hirid | LSTM | MAE: 60.7±1.6
remaining-length-of-stay-on-hirid | GRU | MAE: 60.6±0.9
respiratory-failure-on-hirid | LSTM | AUPRC: 0.569±0.003
respiratory-failure-on-hirid | LGBM (+ hand-crafted features) | AUPRC: 0.604±0.002
respiratory-failure-on-hirid | TCN | AUPRC: 0.589±0.003
respiratory-failure-on-hirid | GRU | AUPRC: 0.592±0.003
respiratory-failure-on-hirid | Logistic Regression | AUPRC: 0.530±0.000
respiratory-failure-on-hirid | LGBM | AUPRC: 0.585±0.001
respiratory-failure-on-hirid | Transformer | AUPRC: 0.594±0.003
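
The results above use three kinds of metric: AUPRC for the binary tasks, balanced accuracy for phenotyping, and MAE for the regression tasks. The following is a minimal sketch of how such numbers can be computed and reported, using only the Python standard library. It is not the benchmark's own evaluation code. The function names are illustrative, the toy data is made up, AUPRC is approximated by step-wise average precision, and the ± is assumed to be a sample standard deviation over repeated runs, which this page does not state.

```python
from statistics import mean, stdev


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (average precision).

    Assumes binary labels (0/1), at least one positive, and no tied scores.
    """
    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return mean(recalls)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def summarize(scores, digits=3):
    """Format per-run scores as 'mean±std' (assumed reporting convention)."""
    return f"{mean(scores):.{digits}f}±{stdev(scores):.{digits}f}"


# Toy data, made up for illustration only.
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
print(balanced_accuracy([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0]))
print(mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]))
print(summarize([0.60, 0.61, 0.62]))
```

Higher is better for AUPRC and balanced accuracy, and lower is better for MAE, so the three groups of rows cannot be ranked on a single scale.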
