This study explores the integration of radiologists' feedback into computer-aided diagnosis systems to improve early detection rates. Using data from the Lung Image Database Consortium, the authors applied three methods: simple distance metrics, linear regression, and principal component analysis. The results show promising matches. Future work includes restricting the analysis to nodules on which the radiologists agree and using a leave-one-out technique to address the small size of the data set.
Integration of Radiologists’ Feedback into Computer-Aided Diagnosis Systems • Sarah A. Jabon (a), Daniela S. Raicu (b), Jacob D. Furst (b) • (a) Rose-Hulman Institute of Technology, Terre Haute, IN 47803 • (b) School of Computing, CDM, DePaul University, Chicago, IL 60604
Overview • Introduction • Related Work • The Data • Methodology • Simple Distance Metrics • Linear Regression • Principal Component Analysis • Results • Simple Distance Metrics • Linear Regression • Principal Component Analysis • Conclusions • Future Work
Introduction • The 2008 official estimate for lung cancer • 215,020 new cases will be diagnosed • 161,840 deaths will occur • Five-year relative survival rate (1996–2004): 15.2% • Computer-aided diagnosis systems can help improve early detection
Related Work • El-Naqa et al. • mammography images • neural networks and support vector machines • Muramatsu et al. • mammography images • three-layered artificial neural network to predict the semantic similarity rating between two nodules • Park et al. • linear distance-weighted K-nearest neighbor algorithm to identify similar images
Related Work • ASSERT by Purdue University • Content-based features: co-occurrence, shape, Fourier Transforms, global gray level statistics • Radiologists also provide features • BiasMap by Zhou and Huang • Relevance feedback, content-based features • Analysis: biased-discriminant analysis (BDA)
The Data • Lung Image Database Consortium • Reduced 1,989 images down to 149 (one for each nodule) • Summarized the radiologists’ ratings (up to 4 per nodule) into a single vector • Each nodule has 7 semantic-based characteristics and 64 content-based characteristics
Methodology: Simple Distance Metrics Semantic-Based Similarity Content-Based Similarity
Simple Distance Metrics Content-Based Similarity Values (Euclidean) Semantic-Based Similarity Values (1 – Cosine)
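The two measures named on this slide can be sketched as follows. This is a minimal illustration only: the function names and the toy vectors are hypothetical, and any normalization the authors may have applied before computing distances is not shown on the slides and is not reproduced here.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_dissimilarity(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy vectors; in the study each nodule has
# 64 content-based and 7 semantic-based characteristics.
content_a, content_b = [0.0, 3.0], [4.0, 0.0]
semantic_a, semantic_b = [1.0, 0.0], [0.0, 1.0]

content_distance = euclidean_distance(content_a, content_b)
semantic_distance = cosine_dissimilarity(semantic_a, semantic_b)
```

Both values are dissimilarities: 0 means the two nodules are identical under that measure, and larger values mean they are less alike.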
Methodology: Principal Component Analysis • Content-Based Features: • 77 pairs with a correlation > 0.9 • 136 pairs with a correlation > 0.8 or < -0.8
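Pair counts like these can be obtained from a feature correlation matrix. The sketch below assumes Pearson correlation and a feature matrix with one row per nodule; the function name and the synthetic data are hypothetical and stand in for the 149 x 64 content-based matrix.

```python
import numpy as np


def count_correlated_pairs(features, threshold, absolute=False):
    """Count feature pairs (i < j) whose Pearson correlation exceeds threshold.

    features has shape (n_samples, n_features). With absolute=True the test
    is |r| > threshold, i.e. r > threshold or r < -threshold.
    """
    corr = np.corrcoef(features, rowvar=False)
    rows, cols = np.triu_indices(corr.shape[0], k=1)  # each pair counted once
    values = np.abs(corr[rows, cols]) if absolute else corr[rows, cols]
    return int(np.sum(values > threshold))


# Synthetic stand-in data: three strongly related columns and one noise column.
rng = np.random.default_rng(0)
base = rng.normal(size=(149, 1))
features = np.hstack([
    base,
    2 * base + 0.01 * rng.normal(size=(149, 1)),
    -base,
    rng.normal(size=(149, 1)),
])

strong_positive = count_correlated_pairs(features, 0.9)
strong_either_sign = count_correlated_pairs(features, 0.8, absolute=True)
```

Many highly correlated pairs indicate redundant features, which is the usual motivation for reducing dimensionality with PCA.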
Methodology: Principal Component Analysis • PCA on content-based features • accounts for 99% of the variance • 23 components • PCA on semantic-based characteristics • Method 1 • accounts for 92% of the variance • 4 components • Method 2 • accounts for 98% of the variance • 6 components
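One way to arrive at a component count for a given share of variance is sketched below. It assumes PCA on mean-centered features via a singular value decomposition; whether the authors also scaled the features, and how Method 1 and Method 2 differ, is not stated on the slides. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def components_for_variance(features, target):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches target (between 0 and 1)."""
    centered = features - features.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values ** 2  # proportional to component variances
    cumulative = np.cumsum(explained) / explained.sum()
    count = int(np.searchsorted(cumulative, target)) + 1
    return min(count, len(cumulative))  # guard against rounding at target = 1.0


# Synthetic stand-in: three features, the third a sum of the first two,
# so two components carry essentially all of the variance.
rng = np.random.default_rng(0)
x = rng.normal(size=(149, 2))
features = np.hstack([x, x[:, :1] + x[:, 1:]])

needed_for_99 = components_for_variance(features, 0.99)
```

Applied to the real content-based matrix with a 0.99 target, a function like this would report the analogous count for that data.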
Future Work • Perform the analysis only on nodules on which all three radiologists agree • To address the small size of the data set, perform the analysis using a leave-one-out technique (instead of 2/3 training and 1/3 testing) • Incorporate relevance feedback into the system
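The leave-one-out idea can be sketched as a split generator: each nodule is held out once for testing while all the others are used for training. The function name is hypothetical, and the model fitting and scoring steps are left out because the slides do not specify them.

```python
def leave_one_out_splits(n_items):
    """Yield (train_indices, test_index) with each item held out exactly once."""
    for held_out in range(n_items):
        train = [i for i in range(n_items) if i != held_out]
        yield train, held_out


# With the 149 nodules in the data set, every round trains on 148 nodules,
# whereas a 2/3 training split would train on roughly 99.
splits = list(leave_one_out_splits(149))
train_sizes = {len(train) for train, _ in splits}
```

The cost is one model fit per nodule, which is manageable for a data set of this size.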