Algorithm for prehospital diagnosis of acute coronary syndrome using machine learning: a prospective observational study
This study was a prospective multicenter observational study conducted in an urban area of Japan (Chiba City, population 1 million). Patients enrolled from September 1, 2018 to March 5, 2021 were assigned to the internal cohort, and patients enrolled from March 6, 2021 to April 27, 2022 to the external cohort. Consecutive adult patients (≥20 years of age) identified by EMS personnel as having suspected ACS who were transported to one of twelve participating facilities were included in the study. Symptoms that indicated ACS to EMS personnel included pain, discomfort, or pressure in the chest, epigastric region, neck, jaw, or shoulder within the preceding 24 hours. Patients presenting with other symptoms in whom the onset of ACS was strongly suspected were also included. Patients in cardiac arrest were excluded because they could not be interviewed in a manner consistent with the other patients.
The study was approved by the Chiba University Graduate School of Medicine Ethics Review Board (#2733). In accordance with ethical guidelines for medical and health research involving human subjects in Japan, the requirement for written informed consent was waived by the review board.
Data collection and definition
We collected data on 45 characteristics used to predict ACS in the prehospital setting from 663 patients in the internal cohort. These characteristics included medical history, vital signs, 3-lead ECG monitoring, and 21 symptoms (Supplementary Table S5). However, we used only 43 features after excluding two low-variance features whose values were identical across more than 95% of the sample, specifically medical history of ‘prior coronary artery bypass graft (CABG)’ and ‘intracranial haemorrhage’. Time of onset and weather conditions were considered but discarded in the final analysis (see Supplementary Note S1 for the contribution of time of onset and weather conditions).
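The low-variance screening step can be sketched as follows, assuming the 95% criterion is applied to the mode frequency of each feature; the column names and toy data are illustrative, not the study's actual variables:

```python
import pandas as pd

def drop_low_variance(df: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Drop features whose most frequent value covers more than `threshold` of samples."""
    keep = [c for c in df.columns
            if df[c].value_counts(normalize=True, dropna=False).iloc[0] <= threshold]
    return df[keep]

# toy example: 'prior_cabg' is 0 for 97% of patients, so it is dropped
toy = pd.DataFrame({
    "prior_cabg": [0] * 97 + [1] * 3,
    "chest_pain": [0, 1] * 50,
})
filtered = drop_low_variance(toy)
print(list(filtered.columns))  # ['chest_pain']
```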
ST changes were assessed using leads I, II, or III of the ECG monitor. ST-segment changes included ST-segment elevation and depression. Assessment of ST changes was left to the discretion of EMS personnel. Symptom content was determined based on previous studies12,13,14,22,23. Symptoms 1 and 2 were assessed by palpation, and symptoms 3 to 21 were assessed via interviews. Detailed interview data are presented in Supplementary Table S6. The diagnosis of ACS was made by cardiologists based on the results of catheter angiography according to current guidelines24. ACS was defined as acute myocardial infarction (AMI) and unstable angina (UA).
Of the 663 patients screened in the internal cohort, 555 patients were included in the final analysis after excluding 108 patients due to missing diagnostic data, multiple entries and cardiac arrest (Supplementary Fig. S4). Of the 69 patients screened in the external cohort, 61 patients were included in the final analysis after excluding 8 patients due to missing diagnostic data and multiple entries.
As our data had missing values for some features, we performed imputation before building the machine learning models. We used the imputed values as inputs even in the gradient boosting model, which can handle missing values internally, because we found that the imputation approach described below improved its performance compared with an implementation without imputation. Following domain knowledge, we mutually imputed missing values in some features: symptoms 4 to 21 (excluding symptoms 19 and 20) and the pair of systolic and diastolic blood pressure. Vital signs, including body temperature, blood oxygen saturation, and respiratory rate, were imputed with their respective median values. For all other categorical attributes, missing values were replaced with a new “Unknown” subcategory.
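The three imputation strategies above can be sketched as follows. The exact rule behind the mutual imputation is not specified in the text, so scikit-learn's `IterativeImputer` (which lets related features predict each other's missing values) is used here as one plausible realization; the column names and toy data are hypothetical:

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# toy data with hypothetical column names (the study's features are in Table S5)
df = pd.DataFrame({
    "symptom_04": [1.0, np.nan, 0.0, 1.0, 0.0],
    "symptom_05": [np.nan, 1.0, 0.0, 0.0, 1.0],
    "sbp":        [120.0, np.nan, 140.0, 110.0, 130.0],
    "dbp":        [80.0, 90.0, np.nan, 70.0, 85.0],
    "body_temp":  [36.5, np.nan, 37.2, 36.8, 36.6],
    "sex":        ["M", None, "F", "M", "F"],
})

# 1) mutual imputation: related features (paired symptoms, SBP/DBP)
#    predict each other's missing values
mutual_cols = ["symptom_04", "symptom_05", "sbp", "dbp"]
df[mutual_cols] = IterativeImputer(random_state=0).fit_transform(df[mutual_cols])

# 2) median imputation for the remaining continuous vital signs
df["body_temp"] = df["body_temp"].fillna(df["body_temp"].median())

# 3) an explicit "Unknown" level for categorical features
df["sex"] = df["sex"].fillna("Unknown")

print(df.isna().sum().sum())  # 0 — no missing values remain
```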
Development of machine learning models
In this study, we used nested cross-validation to assess the predictive performance of the models because the nested cross-validation procedure yields robust and unbiased performance estimates regardless of sample size25,26,27 (see Supplementary Note S2 for detailed descriptions of our nested cross-validation).
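In the common scikit-learn formulation (a sketch, not the authors' exact implementation), nested cross-validation wraps hyperparameter tuning in an inner loop inside an outer loop that estimates generalization performance:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# synthetic stand-in for the study's tabular data
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # tunes hyperparameters
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # estimates performance

# the grid search is refit inside every outer training fold, so the
# outer test folds never influence hyperparameter selection
clf = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    scoring="roc_auc",
    cv=inner,
)
scores = cross_val_score(clf, X, y, scoring="roc_auc", cv=outer)
print(scores.mean())
```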
First, we developed binary classification models for ACS prediction as the main outcome based on nine machine learning algorithms: XGBoost, logistic regression, random forest, SVM (linear), SVM (radial basis function), MLP, LDA, Light Gradient Boosting Machine (LightGBM) classifier, and a voting classifier composed of all the other machine learning algorithms used in this study. These algorithms were chosen from popular methods with reference to previous reports28,29. The voting classifier was selected as the ensemble of all the other classifiers above. As secondary outcomes, we constructed binary classification models for AMI and STEMI prediction. Non-ST-segment elevation myocardial infarction (NSTEMI) was not included in the secondary analysis because of its small number of cases. Hyperparameters were optimized using the grid search method with nested cross-validation.
We assessed the importance of features in the machine learning model based on the SHapley Additive exPlanations (SHAP) value30, which was calculated using the machine learning algorithm with the highest AUC in the test score. The voting classifier was excluded from the algorithms for calculating SHAP values due to the lack of available code. The Shapley value is a solution concept from game theory; the SHAP value is calculated from the difference in model output resulting from the inclusion of a feature in the algorithm, providing information on the impact of each feature on the output. SHAP values are widely used for the interpretability of machine learning models and also serve as a feature selection tool. A higher absolute SHAP value indicates a more important feature.
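To make the Shapley idea concrete, the following toy computes exact Shapley values for a 4-feature model, approximating an "absent" feature by its training mean. This value function and the `GradientBoostingClassifier` are illustrative assumptions; the SHAP library30 uses more refined estimators such as `TreeExplainer`:

```python
import itertools
from math import factorial

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)
baseline = X.mean(axis=0)  # "absent" features are replaced by their mean

def value(subset, x):
    """Model output with only the features in `subset` taken from x."""
    z = baseline.copy()
    z[list(subset)] = x[list(subset)]
    return model.predict_proba(z.reshape(1, -1))[0, 1]

def shap_values(x, n_features=4):
    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for r in range(len(others) + 1):
            for S in itertools.combinations(others, r):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n_features - len(S) - 1) / factorial(n_features)
                # marginal contribution of feature i given coalition S
                phi[i] += w * (value(S + (i,), x) - value(S, x))
    return phi

phi = shap_values(X[0])
# efficiency property: contributions sum to f(x) - f(baseline)
print(phi.sum(), value((0, 1, 2, 3), X[0]) - value((), X[0]))
```

The exact computation enumerates all feature coalitions and so scales exponentially; the SHAP library exists precisely to approximate this efficiently for real feature counts.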
We also performed feature selection by removing redundant and prediction-irrelevant features to improve the performance and interpretability of the model using XGBoost. We used XGBoost for feature selection because the algorithm handles linear, nonlinear, and missing data efficiently and flexibly. Moreover, the accuracy of the algorithm is stable even in analyses with redundant variables31. Feature selection was performed according to the following steps: (1) we built models using 42 features by removing one feature from the 43 features and evaluated each model using nested CV (outer 5-fold and inner 5-fold); (2) we replaced the removed feature with another feature and repeated this for all 43 features; (3) the best combination of features was selected by ROC AUC among these 43 models; (4) procedures (1) to (3) were repeated until the number of features became one. This process was repeated 10 times to prevent less important features from appearing by chance in the higher rankings. Following the iterations, we determined the most plausible number of features (i.e. the most important features to include) from the model that showed the best performance in the average CV scores. After feature selection, we built classification models for ACS prediction using the nine machine learning algorithms with the 17 selected features.
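A condensed sketch of this backward elimination loop on synthetic data, with `GradientBoostingClassifier` standing in for XGBoost and plain CV replacing the nested procedure (and without the 10 stabilizing repetitions):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier  # stand-in for XGBoost
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=6, n_informative=4, random_state=0)
features = list(range(X.shape[1]))
history = []  # (surviving feature subset, mean ROC AUC) at each elimination step

while len(features) > 1:
    # try removing each remaining feature in turn; keep the removal
    # that leaves the highest cross-validated ROC AUC
    best_auc, best_subset = -np.inf, None
    for f in features:
        subset = [g for g in features if g != f]
        auc = cross_val_score(
            GradientBoostingClassifier(n_estimators=50, random_state=0),
            X[:, subset], y, scoring="roc_auc", cv=3,
        ).mean()
        if auc > best_auc:
            best_auc, best_subset = auc, subset
    features = best_subset
    history.append((features, best_auc))

# the plausible feature set is the one with the best average CV score
best_features, best_score = max(history, key=lambda t: t[1])
print(len(best_features), best_score)
```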