Korean J Radiol. 2019 May;20(5):749-758. 10.3348/kjr.2018.0530.

Effect of a Deep Learning Framework-Based Computer-Aided Diagnosis System on the Diagnostic Performance of Radiologists in Differentiating between Malignant and Benign Masses on Breast Ultrasonography

Affiliations
  • 1 Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea. bkhan@skku.edu

Abstract


OBJECTIVE
To investigate whether a computer-aided diagnosis (CAD) system based on a deep learning framework (deep learning-based CAD) improves the diagnostic performance of radiologists in differentiating between malignant and benign masses on breast ultrasound (US).
MATERIALS AND METHODS
B-mode US images were prospectively obtained for 253 breast masses (173 benign, 80 malignant) in 226 consecutive patients. The US findings of each mass were retrospectively analyzed by deep learning-based CAD and by four radiologists. For predicting malignancy, the CAD results were dichotomized (possibly benign vs. possibly malignant). The radiologists independently assigned Breast Imaging Reporting and Data System (BI-RADS) final assessment categories for two datasets (US images alone and US images with CAD). For each dataset, the radiologists' final assessments were classified as positive (category 4a or higher) or negative (category 3 or lower). The diagnostic performances of the radiologists for the two datasets (US alone vs. US with CAD) were compared.
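The dichotomization rule described in the methods (category 4a or higher is positive, category 3 or lower is negative) and the resulting performance measures can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the category ordering list, the function names, and the toy data are assumptions made for the example.

```python
# Assumed ordering of BI-RADS final assessment categories (illustrative only).
BIRADS_ORDER = ["1", "2", "3", "4a", "4b", "4c", "5"]


def is_positive(category: str) -> bool:
    """Category 4a or higher counts as positive; category 3 or lower as negative."""
    return BIRADS_ORDER.index(category) >= BIRADS_ORDER.index("4a")


def performance(categories, malignant):
    """Compute standard diagnostic measures from one reader's assessments.

    categories: BI-RADS category strings, one per mass
    malignant:  booleans from the reference standard, one per mass
    Assumes every cell combination needed for the ratios is non-empty.
    """
    tp = fp = tn = fn = 0
    for category, truth in zip(categories, malignant):
        positive = is_positive(category)
        if positive and truth:
            tp += 1
        elif positive and not truth:
            fp += 1
        elif not positive and truth:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Toy data, invented purely to exercise the functions.
toy_categories = ["3", "4a", "4b", "2", "5", "3"]
toy_malignant = [False, True, False, False, True, True]
toy_result = performance(toy_categories, toy_malignant)
```

Running the same function once on the assessments for US alone and once on those for US with CAD would give the two sets of measures that the study compares.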
RESULTS
When the CAD results were added to the US images, the radiologists showed significant improvement in specificity (range of all radiologists for US alone vs. US with CAD: 72.8-92.5% vs. 82.1-93.1%; p < 0.001), accuracy (77.9-88.9% vs. 86.2-90.9%; p = 0.038), and positive predictive value (PPV) (60.2-83.3% vs. 70.4-85.2%; p = 0.001). However, there were no significant changes in sensitivity (81.3-88.8% vs. 86.3-95.0%; p = 0.120) and negative predictive value (91.4-93.5% vs. 92.9-97.3%; p = 0.259).
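Because the study includes 80 malignant and 173 benign masses, each reported sensitivity or specificity endpoint should correspond to a whole number of masses. The sketch below back-calculates those counts from the percentages in the results. The counts are inferred by arithmetic and are not reported in the abstract; the function name and the rounding tolerance are assumptions.

```python
def matching_counts(percent: float, total: int, tol: float = 0.05 + 1e-9):
    """Return every integer count k for which 100 * k / total agrees with
    `percent` to within half a unit of the first decimal place."""
    return [k for k in range(total + 1) if abs(100.0 * k / total - percent) <= tol]


N_MALIGNANT = 80   # from the methods
N_BENIGN = 173     # from the methods

# Sensitivity endpoints (US alone: 81.3-88.8%; US with CAD: 86.3-95.0%)
sensitivity_counts = {p: matching_counts(p, N_MALIGNANT) for p in (81.3, 88.8, 86.3, 95.0)}

# Specificity endpoints (US alone: 72.8-92.5%; US with CAD: 82.1-93.1%)
specificity_counts = {p: matching_counts(p, N_BENIGN) for p in (72.8, 92.5, 82.1, 93.1)}
```

Under these assumptions, a sensitivity of 81.3% matches 65 of 80 malignant masses called positive, and a specificity of 72.8% matches 126 of 173 benign masses called negative.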
CONCLUSION
Deep learning-based CAD could improve radiologists' diagnostic performance by increasing their specificity, accuracy, and PPV in differentiating between malignant and benign masses on breast US.

Keywords

CAD; Deep learning; Breast; Ultrasound; Radiologist; Diagnostic performance

MeSH Terms

Breast*
Dataset
Diagnosis*
Humans
Information Systems
Learning*
Prospective Studies
Retrospective Studies
Sensitivity and Specificity
Ultrasonography
Ultrasonography, Mammary*

Figure

  • Fig. 1 24-year-old woman diagnosed with fibroadenoma using US-guided biopsy. A. Transverse B-mode US image shows 15-mm oval hypoechoic mass (arrows). B. After radiologist clicked on center point of mass on US image shown, two-dimensional region of interest (green line) was automatically drawn along mass margin through deep learning-based CAD. Following this, deep learning-based CAD analyzed US features of mass according to BI-RADS lexicon and displayed final assessment of “possibly benign” on screen. During first reading session (US images alone), two readers classified mass as BI-RADS category 4a because they assessed that margin of mass was angular (right arrow in A), whereas other two readers did not and classified mass as category 3. During second reading session (US images with CAD), two readers who previously classified mass as category 4a reassessed it as category 3, whereas two readers who previously classified it as category 3 did not change their classifications. BI-RADS = Breast Imaging Reporting and Data System, CAD = computer-aided diagnosis, US = ultrasound

  • Fig. 2 ROC curves for radiologists for two datasets (US images alone vs. US images with CAD) based on probability of malignancy risk. When deep learning-based CAD results were added to US, the readers' AUCs (right; range, 0.914–0.951) were significantly higher than those for US images alone (left; range, 0.884–0.919; p < 0.001). AUC = area under curve, ROC = receiver operating characteristic

  • Fig. 3 50-year-old woman diagnosed with ductal carcinoma in situ using US-guided biopsy and surgical excision. A. Transverse B-mode US image shows 13-mm oval mass with slightly heterogeneous echo pattern (arrows). B. Deep learning-based CAD analyzed US features of mass (green line) and displayed final assessment of “possibly malignant” on screen. During first reading session (US images alone), all four readers classified mass as BI-RADS category 3. During second reading session (US images with CAD), three of four readers changed their assessment to category 4a.

  • Fig. 4 48-year-old woman diagnosed with invasive ductal carcinoma using US-guided biopsy and surgical excision. A. Transverse B-mode US image shows 19-mm isoechoic mass (arrows). B. Deep learning-based CAD analyzed US features of mass (green line) and displayed final assessment of “possibly benign” on screen. During first reading session (US images alone), mass was classified as BI-RADS category 4b by one reader, category 4a by another reader, and category 3 by other two readers. During second reading session (US images with CAD), reader who previously classified mass as category 4b reassessed it as category 4a, whereas reader who classified it as category 4a reassessed it as category 3. Two readers who classified mass as category 3 did not change their classifications.
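
The AUCs in Fig. 2 summarize how well each reader's probability-of-malignancy ratings separate malignant from benign masses. The caption does not state how the AUCs were estimated, so the sketch below uses the nonparametric (Mann-Whitney) estimate as one common choice. The function name and the toy ratings are assumptions for illustration only.

```python
def auc_mann_whitney(scores_malignant, scores_benign):
    """Nonparametric AUC: the fraction of (malignant, benign) pairs in which
    the malignant mass received the higher score, with ties counted as half."""
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))


# Toy ratings, invented purely to exercise the function.
toy_malignant_scores = [0.9, 0.8, 0.4]
toy_benign_scores = [0.1, 0.4, 0.7]
toy_auc = auc_mann_whitney(toy_malignant_scores, toy_benign_scores)
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which gives a scale for reading the reported ranges of 0.884–0.919 and 0.914–0.951.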

