
The MammoGrid Project Grids Architecture

The MammoGrid Project Grids Architecture. Richard McClatchey, CHEP’03, San Diego, March 24th 2003. On behalf of the MammoGrid Consortium: CERN, Mirada Solutions, Univ of Oxford, Univ of Sassari & Pisa, Univ West of England, Univ Hospitals of Cambridge (Addenbrooke’s) & Udine.


Presentation Transcript


  1. The MammoGrid Project Grids Architecture. Richard McClatchey, CHEP’03, San Diego, March 24th 2003. On behalf of the MammoGrid Consortium: CERN, Mirada Solutions, Univ of Oxford, Univ of Sassari & Pisa, Univ West of England, Univ Hospitals of Cambridge (Addenbrooke’s) & Udine

  2. Contents • The MammoGrid project objectives • Project challenges and philosophy • HEP vs distributed medical image analysis • The MammoGrid infrastructure • Implementation and current status • Future plans • Conclusions & questions R. McClatchey, CHEP’03 San Diego March 2003

  3. What is the MammoGrid? • EU FP5 project to build a pan-European distributed database of mammography images using GRID technologies. • Aim: To provide a demonstrator for use in epidemiological studies, quality control and validation of computer aided detection algorithms.

  4. MammoGrid Objectives • To evaluate current Grids technologies and determine the requirements for Grid-compliance in a pan-European mammography database. • To implement the MammoGrid database, using novel Grid-compliant and Federated-Database technologies that will provide improved access to distributed data and will allow rapid deployment of software packages to operate on locally stored information. • To deploy enhanced versions of a standardization system that enables comparison of mammograms in terms of intrinsic tissue properties independently of scanner settings, and to explore its place in the context of medical image formats (DICOM). • To develop software tools to automatically extract image information that can be used to perform quality controls on the acquisition process of participating centers (e.g. average brightness, contrast). • To develop software tools to automatically extract tissue information that can be used to perform clinical studies (e.g. breast density, presence, number and location of micro-calcifications) in order to increase the performance of breast cancer screening programs. • To use the annotated information and the images in the database to benchmark the performance of the software described in points 3, 4 and 5. • To exploit the MammoGrid database and the algorithms to propose initial pan-European quality controls on mammographic acquisition and ultimately to provide a benchmarking system to third party algorithms.
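The quality-control objective above names average brightness and contrast as examples of automatically extracted image information. A minimal sketch of such a measurement follows; it assumes the image is a plain 2-D list of grey levels and takes "contrast" to mean RMS contrast (the standard deviation of the pixel values). Both the function name and that definition of contrast are illustrative assumptions, not the project's actual software.

```python
from statistics import fmean, pstdev


def quality_metrics(pixels):
    """Return (mean brightness, RMS contrast) for a 2-D list of grey levels.

    Assumption: contrast is measured as RMS contrast, i.e. the population
    standard deviation of the pixel values. Other definitions exist.
    """
    flat = [value for row in pixels for value in row]
    if not flat:
        raise ValueError("image has no pixels")
    brightness = fmean(flat)   # average grey level
    contrast = pstdev(flat)    # spread of grey levels around the mean
    return brightness, contrast


# Hypothetical 2x2 image used only to exercise the function.
image = [[0, 100], [100, 200]]
brightness, contrast = quality_metrics(image)
print(f"brightness={brightness:.1f} contrast={contrast:.2f}")
```

Computing the same two numbers for every acquisition at a centre would allow drift in scanner settings to be flagged by comparing against that centre's own history.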

  5. MammoGrid Philosophy • Project concentrates on applying emerging GRID technology rather than on developing it. • It plans to implement a ‘lightweight’ (but fully functional) GRID and study its usage in hospitals • It will draw heavily on other Grids projects e.g. DataGrid • It will deliver a prototype federated database of mammograms in hospitals in the UK and Italy • It will provide rapid feedback from the Hospital community • And will inform the next generation of HealthGrids developments

  6. Why a Mammography Database? • Breast cancer is a huge problem: • 10% of women develop breast cancer, • 19% of cancer deaths are due to breast cancer, • 24% of all cancer cases are breast cancers, • there are 348,000 cases in EU & USA, 50,000 die every year, • fortunately there is a solution. • Early diagnosis through mammography screening improves prognosis

  7. ...but • Quality control in acquisition and diagnosis, and efficient data management, are vital. • Improving the reliability of screening and early diagnosis requires: • better epidemiological understanding, • improved diagnostic tools, • enhanced quality control, • continuous training and • efficient management of data and records. • A way to achieve the above is through repositories of mammography data for research and training that contain sufficiently large statistical samples e.g. • MammoGrid-EU, • NDMA-US, • eDIAMonD-UK (Mirada, IBM, Oxford, Edin., KCL, UCL) • GPCalma-Italy

  8. The MammoGrid Challenge • Building this repository is not trivial because: • Large numbers of exemplars are required. • Cases must be obtained from many geographically remote locations. • Data itself is large: 2 breasts × 2 views × 4K × 4K pix × 2 bytes = 128 Mbyte per patient per visit, 3M women per year in the UK, ~ 400 Terabytes in the UK alone, • Acquisition is highly variable; the same image may look different depending on machine and parameters. How do you compare? • Patient privacy and data security are key. • Many relevant items of metadata.
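The data-volume estimate on this slide can be checked with a few lines of arithmetic. The sketch below assumes "4K" means 4096 pixels and one visit per woman per year; it reports the per-visit size in binary megabytes and the yearly total in decimal terabytes, which is one plausible reading of the slide's units.

```python
BREASTS = 2
VIEWS = 2
WIDTH = HEIGHT = 4096          # assumption: "4K" read as 4096 pixels
BYTES_PER_PIXEL = 2
WOMEN_PER_YEAR_UK = 3_000_000  # figure quoted on the slide

# Raw image data generated by one screening visit.
bytes_per_visit = BREASTS * VIEWS * WIDTH * HEIGHT * BYTES_PER_PIXEL
mbyte_per_visit = bytes_per_visit / 2**20

# Yearly UK total, assuming one visit per woman per year.
tbyte_per_year = bytes_per_visit * WOMEN_PER_YEAR_UK / 10**12

print(f"{mbyte_per_visit:.0f} Mbyte per visit")
print(f"{tbyte_per_year:.0f} Tbyte per year in the UK")
```

Under these assumptions the per-visit figure is exactly 128 Mbyte and the yearly total lands close to the slide's "~ 400 Terabytes".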

  9. A GRID Infrastructure is ideal • The databases needed to statistically validate image-based clinical hypotheses: • Are populated by a large number of cases • Contain large files (1 mammogram 10 Mb+) • Are held in geographically distributed repositories • Use heterogeneous database formats • Need to be accessible to co-workers • Development and validation of medical image analysis solutions demands: • Computationally expensive simulations. • Repeated runs for optimal parameter tuning. • Statistical test rigs. • Remote execution and maintenance • Services (e.g. security) must be system-resident, invisible, generic

  10. High Energy Physics vs. MammoGrid • MammoGrid relies heavily on technologies developed primarily in the field of high energy physics. • Similarities • Large number of big files • Files can be sensibly organized in a directory tree • Need to replicate and move file copies between sites • Need to execute commands on the node which hosts data locally • Difficulties • Complexity of co-working in a medical environment • Lack of trained IT personnel • Confidentiality

  11. Federated System Solution: massively distributed data AND distributed analyses [Diagram labels: Healthcare Institute; University Database; Hospital Italy; Hospital UK; Local Query (×4); Local Analysis (×4); Query; Result; GRID; Clinician’s Workstations; Shared meta-data; Analysis-specific data] • Knowledge is stored alongside data • Active (meta-)objects manage various versions of data and algorithms • Small network bandwidth required
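The federated approach on this slide, where each site answers a local query and only results travel over the network, can be sketched as follows. The `Record`, `Site` and `federated_query` names, the site names used as data, and the breast-density values are all hypothetical and chosen for illustration; this is not the MammoGrid query interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical per-image metadata held at one site."""
    site: str
    patient_id: str
    breast_density: float


class Site:
    """A node that keeps its data locally and answers queries in place."""

    def __init__(self, name, records):
        self.name = name
        self._records = list(records)

    def local_query(self, predicate):
        # Only matching metadata leaves the site, not the images themselves.
        return [r for r in self._records if predicate(r)]


def federated_query(sites, predicate):
    """Send the same query to every site and merge the result sets."""
    results = []
    for site in sites:
        results.extend(site.local_query(predicate))
    return results


sites = [
    Site("cambridge", [Record("cambridge", "c1", 0.62),
                       Record("cambridge", "c2", 0.31)]),
    Site("udine", [Record("udine", "u1", 0.55)]),
]
dense = federated_query(sites, lambda r: r.breast_density > 0.5)
print([r.patient_id for r in dense])
```

Because filtering happens where the data lives, the traffic between sites is proportional to the result set rather than to the size of the image archive, which matches the slide's point about small bandwidth.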

  12. MammoGrid Implementation [Work-package diagram; labels as extracted: Project Management, WP 1 - CERN (Vitamib); Information infrastructure; GRID/DB infrastructure; Integration test bed; Use case/validation; User Req’s & Specs; WP 3 - CERN/UWE; specifications; H/W local node implem.; WP 5 - CERN; WP 4 - Mirada; WP 2 - CERN/UWE; Hospitals, WP 9&10 - Cambridge, Udine; Application S/W; Standardisation S/W; WP 7&8 - Oxford, Pisa/Sassari; WP 6 - Mirada; Dissemination & Exploitation, WP 11 - All]

  13. Mammogram Analysis Use-Case • Example Use-Case: • Mammogram Analysis • View and Annotate Images • Run CAD • Execute Queries

  14. MammoGrid Data Structures • Database Entities: • Hospitals • Users (Radiologists) • Equipment • Patients • Studies • Series • Images
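One way to picture how the listed entities might relate is the sketch below. The slide only names the entities; the nesting of patient, study, series and image follows the usual DICOM hierarchy, and the links to hospitals, radiologists and equipment by identifier are assumptions made for illustration. All field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Image:
    image_id: str


@dataclass
class Series:
    series_id: str
    equipment_id: str      # assumed link to the Equipment entity
    images: List[Image] = field(default_factory=list)


@dataclass
class Study:
    study_id: str
    radiologist_id: str    # assumed link to the Users (Radiologists) entity
    series: List[Series] = field(default_factory=list)


@dataclass
class Patient:
    patient_id: str
    hospital_id: str       # assumed link to the Hospitals entity
    studies: List[Study] = field(default_factory=list)


def count_images(patient: Patient) -> int:
    """Count every image reachable from a patient record."""
    return sum(len(s.images) for study in patient.studies for s in study.series)


# Hypothetical record: one study with a single series of four views.
views = [Image(f"img-{i}") for i in range(4)]
patient = Patient("p1", "hosp-1",
                  [Study("st1", "rad-1", [Series("se1", "eq-1", views)])])
print(count_images(patient))
```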

  15. Mammograms & Annotation [Image: mammogram]

  16. Main Deliverables/milestones • User Requirements Specification and Technical System Specification (months 3, 6) • Prototype GRID-compliant database and information infrastructure (first release month 18, final release month 36) • Packaged medical imaging workstation with interface to GRID, secure GRID box (month 12) • Grid compliant SMF software (month 12) • Application software (months 12, 24, 36) • Clinical Trial results (months 24, 36)

  17. Overall Grids Architecture [Diagram labels: Workstations; Mirada WST (MAS); GRID VPN Network; Central File Catalogue; File Cat. Replica (×3); AliEn Backup; AliEn Data (×3); GridBox (×4); MammoGrid Data Backup; MammoGrid Data (×3); High Security Level; Cambridge Site; Data replication]

  18. Local Site Architecture. GRID: MammoGrid – AliEn [Diagram labels, as extracted: Workstations; MAS: Mirada Acquisition System; Web Services; AliEn File Catalogue; LFNs; Object: Patient; Information Service; DICOM File: Description Inf., Image; AliEn Database; PFNs; File Transfer Daemon; DICOM Server; Patient Personal Information, Additional Information, …; MammoGrid Database; Mirada Workstation; SOAP Messages; Local Cache; Sends DICOM Files; Read / Write operations; Digitizer]

  19. Clinician to Data

  20. MammoGrid AliEn Prototype [Diagram labels: cambridge; Mirada-AliEn Interface; AliEn prototype; cern; Interface; Perl SOAP Server; AliEn Catalogue; …; udine] • The catalogue is divided into several databases, which can be distributed. • The catalogue keeps the LFN-PFN mapping and the metadata.
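The catalogue's role, mapping logical file names (LFNs) to physical file names (PFNs) and holding metadata, can be illustrated with a small in-memory sketch. The `FileCatalogue` class, its methods and the example paths are hypothetical; this is not the AliEn API, and a real catalogue would be backed by distributed databases rather than dictionaries.

```python
class FileCatalogue:
    """Toy catalogue: one LFN maps to one or more PFNs (replicas) plus metadata."""

    def __init__(self):
        self._replicas = {}   # LFN -> list of PFNs
        self._metadata = {}   # LFN -> dict of metadata

    def register(self, lfn, pfn, **metadata):
        """Record a physical copy of a logical file and merge its metadata."""
        self._replicas.setdefault(lfn, []).append(pfn)
        self._metadata.setdefault(lfn, {}).update(metadata)

    def resolve(self, lfn):
        """Return all known physical locations; raises KeyError if unknown."""
        return list(self._replicas[lfn])

    def find(self, **criteria):
        """Return the LFNs whose metadata matches every given key/value pair."""
        return sorted(
            lfn for lfn, md in self._metadata.items()
            if all(md.get(k) == v for k, v in criteria.items())
        )


catalogue = FileCatalogue()
# Hypothetical names: the same logical image replicated at two sites.
catalogue.register("/mammogrid/p1/img1.dcm", "cambridge:/data/0001.dcm", view="CC")
catalogue.register("/mammogrid/p1/img1.dcm", "udine:/data/9f3a.dcm")
catalogue.register("/mammogrid/p1/img2.dcm", "cambridge:/data/0002.dcm", view="MLO")

print(catalogue.resolve("/mammogrid/p1/img1.dcm"))
print(catalogue.find(view="CC"))
```

Keeping the mapping separate from the files lets clients refer to a stable logical name while replicas are added or moved between sites.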

  21. Interaction Diagram [Diagram labels, as extracted. Case READ: SOAP Messages; GRID Environment; MammoGrid - AliEn; Mirada WST; Query; IS Information Service; FTD File Transfer Daemon; Negotiation; Result Set; Reads; File Catalogue. Case WRITE: MammoGrid - AliEn; Mirada WST; DICOM Server; Push(DICOM File); Negotiation; FTD File Transfer Daemon; File Handle; Updates; File Catalogue. Legend: AliEn Service; MammoGrid Service]

  22. Current Hardware Setup • GridBox specifications: • 2x Intel Xeon processors, • 2 GB DDR 200/266 MHz, • Redundant Power Supply, • 2x 20 GB IDE HDD (7200 rpm) UDMA, • RAID-1 IDE adapter, • 360 GB usable, RAID-1, • Ethernet network adapter 10/100 Mb/s, • Gigabit network adapter

  23. Conclusions • Distributed Health informatics is an important application area for Grids technologies – HealthGrid • Many similarities with High Energy Physics • Need rapid feedback from the user community – MammoGrid user requirements specified BUT • Effective Grid deployment needed now and • Many open questions, e.g.: • How to resolve distributed queries? • What role for meta-data? • How to maintain secure, reliable data? • MammoGrid: First results expected late 2003
