Publication - Professor Alan Champneys

    Use of machine learning to analyze routinely collected intensive care unit data: a systematic review

    Citation

    Sterne, J, Shillan, D, Champneys, A & Gibbison, B, 2019, ‘Use of machine learning to analyze routinely collected intensive care unit data: a systematic review’. Critical Care.

    Abstract

    Background: Intensive Care Units (ICUs) face financial, bed management, and staffing constraints. Detailed data covering all aspects of patients’ journeys into and through intensive care are now collected and stored in electronic health records: machine learning has been used to analyse such data in order to provide decision support to clinicians.
    Methods: Systematic review of the applications of machine learning to routinely collected ICU data. Web of Science and MEDLINE databases were searched to identify candidate articles: those on image processing were excluded. Study aim, the type of machine learning used, size of dataset analysed, whether and how the model was validated, and measures of predictive accuracy were extracted.
    Results: Of 2,450 papers identified, 258 fulfilled eligibility criteria. The most common study aims were predicting complications (77 papers [29.8% of studies]), predicting mortality (70 [27.1%]), improving prognostic models (43 [16.7%]), and classifying sub-populations (29 [11.2%]). Median sample size was 488 (IQR 108–4,099): 41 studies analysed data on >10,000 patients. Analyses focused on 169 (65.5%) papers that used machine learning to predict complications, mortality, length of stay, or improvement of health. Predictions were validated in 161 (95.2%) of these studies: the area under the ROC curve (AUC) was reported by 97 (60.2%) but only 10 (6.2%) validated predictions using independent data. The median AUC was 0.83 in studies of 1,000–10,000 patients, rising to 0.94 in studies of >100,000 patients. The most common machine learning methods were neural networks (72 studies [42.6%]), support vector machines (40 [23.7%]) and classification/decision trees (34 [20.1%]). Among the 125 studies (48.4%) published since 2015, the most common methods were support vector machines (37 studies [29.6%]) and random forests (29 [23.2%]).

    Full details in the University publications repository