Researchers and health care professionals across fields including disease, radiology, and ontology have recently identified several common and serious shortcomings in machine learning models built for Covid-19 diagnosis and prognosis.
When the Covid-19 pandemic took center stage last year, startups such as DarwinAI and major companies such as Nvidia launched initiatives to detect Covid-19 with the aid of CT scans, X-rays, and other forms of medical imaging. The central promise of this technology is that it can help medical practitioners distinguish pneumonia from Covid-19 and give them more options for patient diagnosis. Some models go further, predicting from a CT scan whether a patient will die or will need a ventilator. At the same time, however, researchers caution that major changes are needed before this form of machine learning can be used in a clinical setting.
The researchers assessed more than 2,200 papers in total, first removing duplicates and irrelevant titles to narrow the pool to 320 papers that underwent a full-text quality review. In the end, 62 papers were deemed fit for inclusion in what the authors describe as a systematic review of published research and preprints shared on open research paper repositories.
Of those 62 papers, almost half made no attempt to perform external validation of training data, did not assess model sensitivity or robustness, and did not report the demographics of the people represented in the training data.
“Frankenstein” datasets, those assembled from duplicate images obtained from other datasets, were also found to be a common problem. Moreover, only one in five Covid-19 diagnosis or prognosis models shared its code, leaving others unable to reproduce the results claimed in the literature.
The paper claims: “In their current reported form, none of the machine learning models included in this review are likely candidates for clinical translation for the diagnosis/prognosis of Covid-19. Despite the huge efforts of researchers to develop machine learning models for Covid-19 diagnosis and prognosis, we found methodological flaws and many biases throughout the literature, leading to highly optimistic reported performance.”
The research was published this month, in the March issue of Nature Machine Intelligence, by researchers at the University of Cambridge and the University of Manchester. Other common issues were found with machine learning models developed using medical imaging data: in many cases there was virtually no assessment of bias, and models were trained on too few images. Nearly all of the reviewed papers were found to carry either a high or an uncertain risk of bias.
Publicly available datasets fared no better: they commonly suffered from low-quality image formats and were not large enough to make training reliable AI models practical. The researchers used the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) and the Radiomics Quality Score (RQS) to assess both the datasets and the models.
Conclusion: we still have a long way to go
As the findings above show, we are not yet ready to rely on machine learning for reliable reports on Covid-19 and its implications for individual patients. While the process may become smoother in the future, such integration is unlikely to involve Covid-19 itself, as the disease will hopefully have been eradicated by then. For other diseases, however, integrating machine learning into diagnosis remains very promising, to say the least.