First published 2023
Diabetes is a widespread chronic condition, with an estimated 463 million adults affected globally in 2019, a number projected to rise to 600 million by 2040. The rate of diabetes among Chinese adults has escalated from 9.7% in 2010 to 12.8% in 2018. This condition can cause serious damage to various body systems, notably leading to diabetic retinopathy (DR), a major complication that affects approximately 34.6% of diabetic patients worldwide and is a leading cause of blindness in the working-age population. The prevalence of DR is significant in various regions, including China (18.45%), India (17.6%), and the United States (33.2%).
DR often goes unnoticed in its initial stages as it does not affect vision immediately, resulting in many patients missing early diagnosis and treatment, which are crucial for preventing vision impairment. The disease is characterised by distinct retinal vascular abnormalities and can be categorised based on severity into stages ranging from no apparent retinopathy to proliferative DR, the most advanced form. Diabetic macular edema (DME), another condition that can occur at any DR stage, involves fluid accumulation in the retina and is independently assessed due to its potential to impair vision severely.
Diagnosis of DR and DME is typically made through various methods such as ophthalmoscopy, biomicroscopy, fundus photography, optical coherence tomography (OCT), and other imaging techniques. While ophthalmoscopes and slit lamps are common due to their affordability, fundus photography is the international standard for DR screening. OCT, despite its higher cost, is increasingly recognised for its diagnostic value but is not universally accessible for screening purposes.
Current DR screening practice emphasises early detection to improve outcomes for diabetic patients. In the United States, the American Academy of Ophthalmology recommends annual eye exams for individuals with type 1 diabetes beginning five years after diagnosis, and annual exams for those with type 2 diabetes starting at the time of diagnosis. Despite these guidelines, compliance with screening is low: a significant proportion of diabetic patients do not receive regular eye exams, and only a small percentage adhere to the recommended screening intervals.
In the United Kingdom, a national diabetic eye screening program initiated in 2003 has been credited with ending DR’s position as the leading cause of blindness among the working-age population. The program’s success is attributed to its high screening coverage of diabetic individuals nationwide.
Non-compliance with screening recommendations is attributed to factors such as a lack of disease awareness, limited access to medical resources, and insufficient medical insurance. Patients with more severe DR or those who already have vision impairment tend to comply more with screening, suggesting that the lack of symptoms in early DR leads to underestimation of the need for regular check-ups.
The use of telemedicine has been proposed to increase accessibility to screening, exemplified by the Singapore Integrated Diabetic Retinopathy Program, which remotely obtains fundus images for evaluation, reducing medical costs. Telemedicine has been found cost-effective, especially in large populations. Recently, the development of artificial intelligence (AI) has presented an alternative to enhance patient compliance and the efficiency of telemedicine in DR screening. AI can potentially streamline the grading of fundus images, reducing reliance on human resources and improving the screening process.
AI’s origins trace back to 1956 when McCarthy first introduced the concept. Shortly after, in 1959, Arthur Samuel coined the term “machine learning” (ML), emphasising the ability of machines to learn from data without being explicitly programmed. Deep learning (DL), a subset of ML, uses multi-layer neural networks for learning; within this, convolutional neural networks (CNNs) are specialised for image processing, featuring layers designed for pattern recognition.
CNN architectures like AlexNet, VGGNet, and ResNet have been pivotal in advancing AI, achieving high accuracy through end-to-end training on labelled image datasets and optimising parameters via backpropagation algorithms. Transfer learning, another ML technique, leverages pre-trained models on new domains, allowing for effective learning from smaller datasets.
In the medical field, AI’s image processing capabilities have significantly impacted radiology, dermatology, pathology, and ophthalmology. Specifically in ophthalmology, AI assists in diagnosing conditions like DR, glaucoma, and macular degeneration. The FDA’s 2018 approval of IDx-DR, the first AI software for DR, marked a milestone. The system captures fundus images with a Topcon NW400 camera and analyses them on a cloud server to provide diagnostic guidance.
Further developments in AI for ophthalmology include EyeArt and Retmarker DR, both recognised for their high sensitivity and specificity in DR detection. These AI systems have demonstrated advantages in efficiency, accuracy, and reduced demand for human resources. They have been shown not only to expedite the screening process, as evidenced by an Australian study where AI-based screening took about 7 minutes per patient, but also to outperform manual screening in both accuracy and patient preference.
AI’s ability to analyse fundus photographs or OCT images at primary care facilities simplifies the screening process, potentially improving patient compliance and significantly reducing ophthalmologists’ workloads. With AI providing immediate grading and recommendations for follow-up or referral, diabetic patients can more easily access and undergo screening, thereby enhancing the management of DR.
To ensure the efficacy and accuracy of AI-based diagnostic systems for DR, the dataset must be divided into three separate, non-overlapping sets for training, validation, and testing, each with a specific function. The training set forms the foundation, where the AI algorithm learns to identify and interpret fundus photographs; this set must be extensive and comprise high-quality images that have been carefully evaluated and labelled by expert ophthalmologists. According to guidelines issued by Chinese authorities, if the system uses fundus photographs, the images should be collected from at least two different medical institutions to ensure varied and comprehensive training data. The validation set is used to tune the AI parameters and optimise the algorithm during development. Lastly, the testing set is used to evaluate the AI system’s clinical performance under real-world conditions. To preserve the integrity of the results, this set is kept separate from the training and validation sets, preventing biases that could skew the system’s measured accuracy.
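The non-overlapping split described above can be sketched as follows. This is a minimal illustration, not a method taken from the guidelines: grouping by patient identifier, the 70/15/15 proportions, and the names `split_by_patient` and `records` are all assumptions made for the example.

```python
import random
from collections import defaultdict


def split_by_patient(records, train_fraction=0.7, validation_fraction=0.15, seed=0):
    """Split (patient_id, image_path) records into three disjoint subsets.

    Assumption: every image of a patient goes to the same subset, so no
    patient contributes to more than one of train/validation/test.
    Patients not assigned to train or validation fall into the test subset.
    """
    by_patient = defaultdict(list)
    for patient_id, image_path in records:
        by_patient[patient_id].append(image_path)

    # Shuffle patients (not images) with a fixed seed for reproducibility.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    n_train = round(len(patient_ids) * train_fraction)
    n_val = round(len(patient_ids) * validation_fraction)
    groups = {
        "train": patient_ids[:n_train],
        "validation": patient_ids[n_train:n_train + n_val],
        "test": patient_ids[n_train + n_val:],
    }
    return {
        name: [(pid, img) for pid in ids for img in by_patient[pid]]
        for name, ids in groups.items()
    }


# Hypothetical manifest: 20 patients with one image per eye.
records = [(f"P{i:03d}", f"P{i:03d}_{eye}.jpg") for i in range(20) for eye in ("L", "R")]
splits = split_by_patient(records)
```

Splitting on patients rather than on individual images is one common way to keep images of the same eye out of both the training and the testing set; other grouping keys (for example, by institution) may suit a given study better.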
The training set should contain a diverse range of images, including at least 1,000 single-field fundus photographs (FPs) or 1,000 pairs of two-field FPs, 500 non-readable FPs or pairs, and 500 images or pairs showing fundus diseases other than DR. The images should be graded by at least three qualified ophthalmologists, with the majority opinion determining the final grade. The standard testing set should include 5,000 FPs or pairs, with no fewer than 2,500 images or pairs showing DR of stage I or above and 500 images or pairs showing other fundus diseases. A random selection of 2,000 images or pairs should be used to evaluate the AI system’s performance on the DR stages.
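The majority-opinion rule for grading can be illustrated with a short sketch. The function name `consensus_grade`, the integer coding of grades, and the choice to return `None` when no strict majority exists are assumptions for this example; the text does not say how disagreements without a majority are resolved.

```python
from collections import Counter


def consensus_grade(grades):
    """Return the grade given by a strict majority of graders, else None.

    Assumption: grades are integer codes (e.g. 0-4 for increasing DR
    severity) and a case without a strict majority is left unresolved.
    """
    if len(grades) < 3:
        raise ValueError("at least three graders are required")
    grade, votes = Counter(grades).most_common(1)[0]
    return grade if votes > len(grades) / 2 else None


agreed = consensus_grade([2, 2, 3])      # two of three graders chose grade 2
unresolved = consensus_grade([0, 1, 2])  # all three graders disagree
```

In practice an unresolved case would need some further step, such as review by a senior grader, before the image could be used as a labelled example.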
Current research has indicated some issues with the training sets used in existing AI systems. These include the use of FPs from a single source and including fewer than the recommended 500 non-readable images or pairs. Furthermore, some training sets sourced from online datasets do not provide access to important patient demographics like gender and age, which can be crucial for comprehensive training and accurate diagnostics.
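As a rough illustration, the shortcomings listed above could be checked automatically before training starts. The sketch below uses only the minimums quoted in the text; the class `TrainingSetSummary`, its field names, and the function `training_set_issues` are hypothetical, and the demographics check reflects the concern raised in this paragraph rather than a stated guideline requirement.

```python
from dataclasses import dataclass


@dataclass
class TrainingSetSummary:
    fundus_items: int       # single-field FPs, or two-field pairs counted once
    non_readable: int       # non-readable FPs or pairs
    other_disease: int      # FPs or pairs showing fundus diseases other than DR
    institutions: int       # number of contributing medical institutions
    has_age_and_sex: bool   # whether patient age and gender are available


def training_set_issues(summary):
    """Return a list of human-readable problems; an empty list means none found."""
    issues = []
    if summary.fundus_items < 1000:
        issues.append("fewer than 1,000 FPs or pairs")
    if summary.non_readable < 500:
        issues.append("fewer than 500 non-readable FPs or pairs")
    if summary.other_disease < 500:
        issues.append("fewer than 500 FPs or pairs with other fundus diseases")
    if summary.institutions < 2:
        issues.append("images come from fewer than two institutions")
    if not summary.has_age_and_sex:
        issues.append("patient age and gender are not available")
    return issues


# Hypothetical single-source dataset with too few non-readable images.
example = TrainingSetSummary(1200, 300, 500, 1, True)
problems = training_set_issues(example)
```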
The Iowa Detection Program (IDP) is an early example of an AI system for diabetic retinopathy (DR) screening that showed promise in Caucasian and African populations by grading fundus photographs (FP) and identifying characteristic lesions, albeit without employing deep learning (DL) techniques. Its sensitivity was commendable, but it suffered from low specificity. In contrast, IDx-DR incorporated a convolutional neural network (CNN) into the IDP framework, enhancing the specificity of DR detection. Clinical studies highlighted that while IDx-DR’s sensitivity in real-world settings didn’t quite match its testing set performance, it nonetheless demonstrated a satisfactory balance of sensitivity and specificity.
EyeArt expanded AI’s reach into mobile technology, becoming the first system to detect DR using smartphones. A study in India involving 296 type 2 diabetes patients revealed a very high sensitivity and reasonable specificity, proving its potential for remote DR screening. Moreover, systems like Google’s AI for DR screening can adjust sensitivity and specificity thresholds to meet clinical needs, suggesting that a hybrid approach of AI and manual screening could maximise efficiency and minimise missed referable DR cases.
However, most AI systems for DR rely on FPs, which are limited to two dimensions and can only detect diabetic macular edema (DME) through the presence of hard exudates in the posterior pole, potentially missing some cases. Optical coherence tomography (OCT), with its higher detection rate for DME, offers a more advanced diagnostic tool. Combining OCT with AI has led to the development of systems with impressive sensitivity, specificity, and area under the curve (AUC) metrics, as reflected in various studies. Despite these advancements, accessibility remains a challenge, especially in resource-limited areas. Hwang et al.’s AI system for OCT, for example, still requires OCT equipment and the transfer of images to a smartphone, so patients in underserved regions may remain out of reach.
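The idea of tuning an operating point can be made concrete with a small sketch. It assumes the AI system outputs a score between 0 and 1 for referable DR and that reference labels are available; the function names and the toy data are hypothetical and do not describe how any named system sets its thresholds.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold count as positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def threshold_for_sensitivity(scores, labels, target):
    """Return (threshold, sensitivity, specificity) for the highest threshold
    whose sensitivity reaches the target, which keeps specificity as high as
    the target allows."""
    for threshold in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(scores, labels, threshold)
        if sens >= target:
            return threshold, sens, spec
    raise ValueError("target sensitivity cannot be reached")


# Toy data: 1 = referable DR, 0 = not referable.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
screening_point = threshold_for_sensitivity(scores, labels, 1.0)
balanced_point = threshold_for_sensitivity(scores, labels, 0.8)
```

On this toy data, demanding that no referable case is missed lowers the threshold and with it the specificity, which is the trade-off a hybrid AI and manual workflow would have to absorb through additional human review.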
The landscape of AI-based diagnostic systems for diabetic retinopathy (DR) is expansive, yet it confronts numerous challenges. Many systems are trained on online datasets such as Messidor and EyePACS, which are limited by homogeneity in image sources and quality, as well as disease scope. These datasets often fail to encapsulate the diversity of real-world clinical environments, leading to potential misdiagnoses. A lack of standardised protocols for algorithm training exacerbates this, with the variability in sample sizes, image quality, and study designs from different sources undermining the generalisability of these AI systems.
Furthermore, while most research adheres to the International Clinical Diabetic Retinopathy Severity Scale for classifying DR severity, debates continue about its suitability. Some argue that classifications like the Early Treatment Diabetic Retinopathy Study may be more appropriate, as they could reduce unnecessary referrals by better reflecting the slower progression of milder DR forms. Inconsistencies in classification standards among studies affect both algorithm validity and cross-study comparisons.
Compounding these issues is the absence of a unified criterion for evaluating AI algorithms, with significant discrepancies in testing sets and performance metrics such as sensitivity, specificity, and area under the curve (AUC) across studies. Without universal benchmarks, comparing and validating these tools remains challenging. Moreover, AI diagnostics suffer from the “black box” phenomenon—the opaque nature of the decision-making process within AI systems. This obscurity impedes understanding and trust in the algorithms, as users cannot ascertain the rationale behind the AI’s assessments or intervene if necessary.
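For reference, the three metrics named above can be computed as in the following sketch. It assumes binary reference labels and a continuous AI score, and it computes AUC as the probability that a randomly chosen diseased case scores higher than a randomly chosen healthy one (ties counted as one half); the function names and the toy data are hypothetical.

```python
def sensitivity(predicted, actual):
    """Share of truly positive cases that were predicted positive."""
    positives = [p for p, a in zip(predicted, actual) if a == 1]
    return sum(positives) / len(positives)


def specificity(predicted, actual):
    """Share of truly negative cases that were predicted negative."""
    negatives = [p for p, a in zip(predicted, actual) if a == 0]
    return sum(1 - p for p in negatives) / len(negatives)


def auc(scores, actual):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, a in zip(scores, actual) if a == 1]
    neg = [s for s, a in zip(scores, actual) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = DR present, 0 = DR absent.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # threshold of 0.5 is arbitrary

sens = sensitivity(predicted, actual)
spec = specificity(predicted, actual)
area = auc(scores, actual)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises performance across all thresholds, which is one reason figures from studies that report different metrics are hard to compare directly.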
Legal and ethical concerns also arise, particularly regarding liability for misdiagnoses. Responsibility cannot be assigned squarely to either the developers or the medical practitioners using AI systems. Presently, this has restricted AI’s application primarily to DR screening. Obstacles such as cataracts, unclear media, or poor patient cooperation further reduce the extent to which AI can be relied upon, necessitating ophthalmologist involvement.
Patient data security represents another critical issue. Because AI systems for DR screening could process vast amounts of personal information, it is paramount to ensure that this data is used solely for medical purposes and to prevent breaches.
Finally, there’s the limitation of disease specificity in AI systems, where most are trained to detect only DR during fundus examinations. However, some studies have reported AI systems capable of identifying multiple conditions simultaneously, like age-related macular degeneration alongside DR, which could streamline diagnostic processes if widely adopted. Addressing these multifaceted challenges is crucial for the advancement and reliable integration of AI into ophthalmic diagnostics.
Artificial intelligence (AI) holds considerable promise in the field of diabetic retinopathy (DR) screening and diagnosis, with the potential to reshape current approaches significantly. The future could see the proliferation of AI systems designed for portable devices, such as smartphones, enabling patients to conduct DR screenings at home, which may drastically reduce the dependency on professional medical staff and advanced medical equipment. This shift could make DR screening much more accessible, particularly under the constraints imposed by events like the COVID-19 pandemic, where telemedicine’s importance has surged, providing vast benefits and convenience to both patients and healthcare providers.
Most AI-assisted DR screening systems currently rely on traditional fundus imaging. However, as newer examination techniques evolve, AI is expected to integrate with diverse types of ocular assessments, such as multispectral fundus imaging and optical coherence tomography (OCT), which could further enhance diagnostic accuracy. Beyond screening, AI is poised to play a crucial role in DR diagnosis. Some studies have already shown that AI can match or even surpass the sensitivity of human ophthalmologists, supporting the potential of AI-assisted systems to augment the diagnostic process with higher precision and efficiency.
Overall, in countries where DR screening programs are established, integrating AI-based diagnostic systems could significantly alleviate human resource burdens and boost operational efficiency. Despite the optimism, the datasets currently used to train AI algorithms are somewhat restricted in scope. For AI to be more broadly applicable in clinical settings, it’s essential to leverage diverse clinical resources to create more varied datasets and to refine standards for image quality and labelling, ensuring AI systems are both standardised and effective. At this juncture, the technology is not yet at a point where it can replace ophthalmologists entirely. Therefore, in the interim, a combined approach where AI complements the work of medical professionals may offer the most realistic and advantageous path forward for the clinical adoption of AI in DR management.
Links
https://www.gov.uk/guidance/diabetic-eye-screening-programme-overview
https://drc.bmj.com/content/5/1/e000333
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9559815/
https://www.mdpi.com/2504-2289/6/4/152
https://www.thelancet.com/journals/landig/article/PIIS2589-7500(20)30250-8/fulltext
https://pubmed.ncbi.nlm.nih.gov/20580421/
https://www.aao.org/education/preferred-practice-pattern/diabetic-retinopathy-ppp
https://pubmed.ncbi.nlm.nih.gov/27726962/
https://onlinelibrary.wiley.com/doi/10.1046/j.1464-5491.2000.00338.x
https://iovs.arvojournals.org/article.aspx?articleid=2565719