Machine learning-assisted real-time deformability cytometry of CD34+ cells helps identify patients with myelodysplastic syndromes

This article provides a proof of concept for using RT-DC for MDS detection. As RT-DC captures cell morphology, its information content is similar to that of morphological analyses of BM smears, which are currently the gold standard for MDS diagnosis. Additionally, the mechanical readout of RT-DC is a promising feature, as previous studies have shown alterations in the actin cytoskeleton in association with MDS8,9,10.

Current MDS diagnostic routines are being re-examined due to reproducibility issues, high labor intensity, and the requirement for expert personnel2,4,19. These issues could be addressed by combining imaging flow cytometry (IFC) for high-throughput acquisition with machine learning for automated data analysis20,21. In IFC, fluorescence images are captured, which allows different cell types and intracellular structures to be labeled. However, it has already been demonstrated that, with deep learning, brightfield images alone are sufficient, for example, to predict differentiation lineages or to distinguish cell types in blood22,23. Therefore, the label-free approach of RT-DC could be advantageous, as the staining process can be omitted.

In the present work, we use RT-DC for the first time for MDS detection. For each captured cell, seven features are calculated in real time; these were then used to train a random forest model, which achieved an accuracy of 82.9% for the classification of healthy and MDS samples. As RT-DC performs real-time image analysis, an MDS classification result could be provided immediately during measurement. The label-free nature of RT-DC and its real-time analysis could shorten the time needed for diagnosis.

Using a model interpretation technique, we found that the width of the cell size distribution is one of the most important criteria used by the random forest classification model. Although using a single feature for classification reduces accuracy (78%), it may be more suitable for observation in clinical practice. Interestingly, our finding is consistent with WHO guidelines, which suggest considering cell size when assessing morphology. Our measurements consistently show that a subpopulation of cells in the size range 25 µm² ≤ A ≤ 45 µm² is underrepresented in MDS samples (see Fig. 1D and Supplementary Fig. S3). This effect could be explained by the reduced number of B cell precursor cells in MDS24, which are CD34+ and could be present in the sample after sorting based on CD34 expression25. Additionally, the cell size histogram in Fig. 1D shows a narrow peak at 50 µm² for MDS, while the healthy counterpart has a wider distribution. Therefore, it is mainly the width of the distribution that plays a role, rather than the mean or the median, which are similar for the two samples. However, since only 41 samples were used to train and validate the random forest model, the extrapolation of this study to the highly heterogeneous MDS population is limited, as the model might be overfitted to this small dataset. In addition, random forest models do not perform well in extrapolation tasks. Therefore, a larger prospective clinical study is needed to reach more decisive conclusions.
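The point that distribution width, rather than the median, separates the two groups can be illustrated with a minimal sketch. The numbers below are synthetic and purely illustrative (they are not data from the study), and the interquartile range is only one assumed way of quantifying width; the width measure used by the authors is not specified here.

```python
import random
import statistics


def size_summary(areas):
    """Return (median, interquartile range) of per-cell areas in µm²."""
    q1, _, q3 = statistics.quantiles(areas, n=4)
    return statistics.median(areas), q3 - q1


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical samples: same center (50 µm²), different spread.
narrow_sample = [rng.gauss(50, 3) for _ in range(2000)]   # narrow peak
wide_sample = [rng.gauss(50, 10) for _ in range(2000)]    # wider distribution

narrow_median, narrow_iqr = size_summary(narrow_sample)
wide_median, wide_iqr = size_summary(wide_sample)

print(f"narrow: median={narrow_median:.1f} µm², IQR={narrow_iqr:.1f} µm²")
print(f"wide:   median={wide_median:.1f} µm², IQR={wide_iqr:.1f} µm²")
```

In this toy setting the two medians are nearly identical while the interquartile ranges differ severalfold, which is the kind of situation in which a width feature carries information that a location feature does not.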

Our work considered seven features obtained using RT-DC, which can be summarized in three groups: features describing cell size (A, Lx, Ly), mechanical properties (γ, D, I) and porosity (Ω). However, updated versions of RT-DC technology are able to save the brightfield image and calculate transparency features in real time, which has been shown to allow discrimination between different blood cell types26. Additionally, images can be evaluated by a deep neural network, which allows fine details of the image to be used for accurate classification22,23. Future research should incorporate these new modalities to improve label-free detection of MDS using RT-DC.

MDS is caused by the accumulation of genetic mutations that can be identified by whole genome sequencing. While the costs of whole genome sequencing have fallen from a hundred million to a thousand dollars over the past 20 years, only targeted sequencing currently plays a role in clinical practice27. Here, only selected genes that are frequently affected in MDS are checked, which is problematic due to the large genomic heterogeneity present in various types of MDS28,29. Therefore, standard diagnosis relies on an assessment of cell morphology as an indirect readout of genetic properties. Morphological alterations are accompanied by changes in F-actin distribution and structural changes in the cytoskeleton8,9,10. RT-DC measures the mechanical properties of cells, which are determined by the cytoskeleton5,30,31. Diseases such as malaria, leukemia or spherocytosis have already been shown to lead to measurable differences in mechanical properties26,32. To link mechanical and genetic changes, we measured HSCs of MDS patients using RT-DC and performed molecular analysis in parallel. Figure 2B indicates that a greater number of genetic mutations corresponds to a lower median deformation (Dmedian). Therefore, RT-DC could provide an additional indirect readout of acquired mutations that has a low cost per measurement, a short measurement time, and offers real-time analysis results. However, despite the strong correlation, we consider this result to be hypothesis-generating due to the small sample size (n = 10). Furthermore, we could identify neither an association between mutation type and deformation nor a significant mechanical difference between the low- and high-risk groups (data not shown); rather, biological characteristics of the blast cells, such as the number of mutations, correlated with the mechanical properties. The importance of Dmedian in the random forest model is low (see Fig. 1B). This suggests that Dmedian is similar for healthy and MDS samples. Therefore, the approach of inferring the number of mutations from Dmedian is only valid for samples in which MDS has already been diagnosed.
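As a rough illustration of how a monotonic relationship between mutation count and median deformation could be quantified for a small cohort, the sketch below computes a Spearman rank correlation using only the standard library. The choice of Spearman's coefficient is an assumption made for this example (the statistic used for Fig. 2B is not stated here), and the ten value pairs are hypothetical placeholders, not measurements from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for n = 10 patients (illustrative only).
n_mutations = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6]
d_median = [0.060, 0.058, 0.057, 0.055, 0.056, 0.052, 0.050, 0.049, 0.047, 0.045]

rho = spearman(n_mutations, d_median)
print(f"Spearman rho = {rho:.2f}")
```

With only ten samples, even a strongly negative coefficient has a wide uncertainty interval, which is consistent with treating such a result as hypothesis-generating.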

HSCs make up only about 1% of bone marrow cells33,34. To focus our study on this small subpopulation, we used MACS for CD34 enrichment of HSCs prior to measurement. However, since cells produced by mutated HSCs are likely to differ morphologically from their healthy counterparts, a future study should evaluate unsorted bone marrow in RT-DC using an approach similar to that presented here. In addition, the efficiency of CD34 isolation is low, which results in a small total number of cells for measurement. As a result, our measurements could not fully utilize the available throughput capacity of RT-DC. The samples presented in this manuscript were subjected to cryopreservation and thawing, which could potentially alter cell morphology and the outcome of MDS prediction. A follow-up project should therefore ideally use fresh BM samples.

Overall, our study shows that RT-DC has the potential to advance the current state of MDS diagnostics. The morphological and mechanical readouts of RT-DC are promising parameters for the identification of MDS. Whether this method can complement standard diagnostic procedures in borderline cases or serve as a rapid and reliable test in the initial diagnosis remains to be demonstrated in prospective clinical studies.

Sherry J. Basler