
Toward Knowledge Discovery in Databases Attached to Grids

Peter Brezany, Institute for Software Science, University of Vienna. E-mail: brezany@par.univie.ac.at


Presentation Transcript


  1. Toward Knowledge Discovery in Databases Attached to Grids. Peter Brezany, Institute for Software Science, University of Vienna. E-mail: brezany@par.univie.ac.at

  2. Media That Radically Influenced Society
     - 1500s: Printing Press
     - 1840s: Penny Post
     - 1850s: Telegraph
     - 1920s: Telephone
     - 1930s: Radio
     - 1950s: TV
     - 1990s: Web
     - 20xx: Grid

  3. Talk Outline
     - Data Mining on the Grid – Background Information
     - Application Examples
     - Architecture of a Traditional Data Mining System
     - GridMiner – A Framework for Data Mining on the Grid
     - GridMiner Architecture
     - Functional and Data Access Model
     - Conclusions

  4. Data Mining on the Grid
     - Data mining on the Grid (DMG): finding unknown data patterns in an environment with geographically distributed data and computation.
     - Data may be highly heterogeneous, with a high update frequency.
     - A good DMG algorithm analyzes data in a distributed fashion with modest data communication overhead.
     - A typical DMG algorithm involves local data analysis followed by the generation of a global data model.
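The "local data analysis followed by the generation of a global data model" pattern can be illustrated with a minimal Python sketch. It is not taken from the talk: the names `LocalSummary`, `analyze_locally`, and `build_global_model` are hypothetical, and the "model" is deliberately trivial (a global mean and population variance). The point is only that each site ships a small summary instead of its raw records, which keeps communication overhead modest.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class LocalSummary:
    """Sufficient statistics a site sends instead of its raw records (hypothetical type)."""
    count: int
    total: float
    total_sq: float


def analyze_locally(values: Iterable[float]) -> LocalSummary:
    """Local data analysis: reduce one site's values to three numbers."""
    count, total, total_sq = 0, 0.0, 0.0
    for v in values:
        count += 1
        total += v
        total_sq += v * v
    return LocalSummary(count, total, total_sq)


def build_global_model(summaries: List[LocalSummary]) -> dict:
    """Global model generation: merge the per-site summaries."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no records at any site")
    mean = sum(s.total for s in summaries) / n
    # Population variance computed from the merged sums: E[x^2] - E[x]^2.
    variance = sum(s.total_sq for s in summaries) / n - mean * mean
    return {"count": n, "mean": mean, "variance": variance}


# Two sites with made-up values; only the summaries cross the network.
site_a = [1.0, 2.0, 3.0]
site_b = [4.0, 5.0]
model = build_global_model([analyze_locally(site_a), analyze_locally(site_b)])
```

Sums of this kind merge exactly, so the global result equals what a centralized computation would give. Most real mining models do not decompose this cleanly, which is where the design of a DMG algorithm becomes hard.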

  5. Application Examples
     - Finding out how the emergence of hepatitis C depends on weather patterns: this requires access to a large hepatitis-C DB at one location and an environmental DB at another location.
     - Two major financial organizations want to cooperate. They need to share data patterns relevant to the data mining task, but they do not want to share the data itself because it is sensitive, so combining the databases may not be feasible.
     - Federating Brain Data Project: integrating several neuroscience DBs.
     - A major multinational corporation wants to analyze its customer transaction records to develop successful business strategies quickly.
       - It has thousands of establishments throughout the world.
       - Collecting all the data in a centralized data warehouse, followed by analysis using existing commercial data mining software, takes too long.

  6. Telemedical Applications: AMG – Austrian Medical Grid (diagram; labels: Database, Raw Medical Data, Derived Medical Data, Database, Reconstructed Medical Data, Web)

  7. Telemedical Collaboration - Example
     - A patient living in a remote village has a heart problem.
     - An EEG is taken by the local doctor, and all the patient's details are stored in the doctor's PC-based telemedical system.
     - MRI and CT scans are taken in different departments of a general hospital and stored in the telemedical DB.
     - A consultant compiles a report and saves it in the DB.
     - If necessary, a 3D ultrasound scan is taken in a specialized clinic and a further report is compiled.
     - If complicated surgery is required, an external specialist uses Virtual Reality techniques to define how the surgery should be planned.
     - The resulting operation is recorded on video, e.g., for education.
     - Data mining support/assistance is needed.

  8. Architecture of a Data Mining System (diagram; labels: Graphical user interface, Pattern evaluation, Data mining engine, Database or data warehouse server, Data cleaning, Data integration, Filtering, Database, Data warehouse, Knowledge base)

  9. On-Line Analytical Mining (OLAM)

  10. GridMiner – A Framework for Data Mining on Grids
      System requirements:
      - Algorithm and data publishing and integration
      - Compatibility with grid infrastructure and Grid awareness
      - Openness
      - Scalability
      - Security and data privacy
      Functionality requirements:
      - Mining different kinds of knowledge in databases
      - Incremental data mining algorithms
      - Interactive mining of knowledge at multiple levels of abstraction

  11. GridMiner (Layered) Architecture (based on K.F. Jeffery's idea)

  12. Functional and Data Access Model (diagram; surviving label: MDS)

  13. Example: Mining Patterns for Data Classification and Associations
      Classification query:
          use database dat1, dat2
          mine classifications
          analyze credit_rating
          using g_parsimony
          display as tree
      Association query (template):
          use database DBs attributes
          mine associations
          using method attributes
          display as rules
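To make the clause structure of the example queries concrete, here is a minimal Python sketch that splits such a query into its clauses. It is an illustration only, not GridMiner's parser: the function `parse_query` and the keyword list are hypothetical, and it assumes that each clause keyword (`use database`, `mine`, `analyze`, `using`, `display as`) is separated from its argument by whitespace and never appears as an argument.

```python
import re
from typing import Dict

# Clause keywords as they appear in the slide's example (assumed complete for this sketch).
CLAUSES = ("use database", "mine", "analyze", "using", "display as")
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in CLAUSES) + r")\b",
    re.IGNORECASE,
)


def parse_query(text: str) -> Dict[str, str]:
    """Split a query into {clause keyword: argument string}."""
    normalized = " ".join(text.split())  # collapse newlines and repeated spaces
    # With a capturing group, re.split returns [prefix, kw1, arg1, kw2, arg2, ...].
    parts = _PATTERN.split(normalized)
    query = {}
    for keyword, argument in zip(parts[1::2], parts[2::2]):
        query[keyword.lower()] = argument.strip()
    return query


parsed = parse_query("""
    use database dat1, dat2
    mine classifications
    analyze credit_rating
    using g_parsimony
    display as tree
""")
```

A production query language would need a real grammar (quoting, reserved words, error reporting); the sketch only shows that each query names its data sources, the kind of knowledge to mine, the method, and the presentation form.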

  14. Knowledge Grid Architecture Layers
      High level layer:
      - Data Access Service
      - Tools and Algorithms Access Service
      - Execution Plan Management
      - Result Present. Service
      Core layer:
      - Knowledge Directory Service
      - Resource Allocation Execution Management
      Underlying layer:
      - Generic Grid and Data Grid Services

  15. Conclusions
      - Grid data mining is a relevant research topic.
      - The GridMiner approach may contribute to this research domain.
      - Collaborations are needed.
      - IPG (Information Power Grid) is the only Grid project that aims to address knowledge discovery issues.
      - We are looking for one or more pilot applications.
      - Open issue: which basic Grid technology to build on (Globus, DataGrid, Jini, JXTA?)

  16. Data Storage and the Components (diagram; labels: Site A, Site B, Site C, Site D, each with Preprocessing and Local DM; Construction of the Global Model; GUI; Site E)
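The three stages named in this diagram (Preprocessing, Local DM, Construction of the Global Model) can be sketched in Python for one concrete task, frequent itemset counting. This is an assumption-laden illustration, not GridMiner code: the function names, the sample transactions, and the choice of exhaustively counting itemsets of up to two items (rather than Apriori-style pruning) are all made up for the example.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Transaction = FrozenSet[str]


def preprocess(raw: List[List[str]]) -> List[Transaction]:
    """Per-site preprocessing: normalize item names, drop blanks and empty transactions."""
    cleaned = []
    for items in raw:
        transaction = frozenset(i.strip().lower() for i in items if i.strip())
        if transaction:
            cleaned.append(transaction)
    return cleaned


def local_dm(transactions: List[Transaction], max_size: int = 2) -> Tuple[int, Counter]:
    """Local DM: count every itemset of up to max_size items at one site."""
    counts: Counter = Counter()
    for transaction in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(transaction), size):
                counts[frozenset(combo)] += 1
    return len(transactions), counts


def build_global_model(
    local_results: List[Tuple[int, Counter]], min_support: float
) -> Dict[FrozenSet[str], float]:
    """Global model: merge per-site counts and keep itemsets meeting min_support."""
    total = sum(n for n, _ in local_results)
    if total == 0:
        return {}
    merged: Counter = Counter()
    for _, counts in local_results:
        merged.update(counts)
    return {s: c / total for s, c in merged.items() if c / total >= min_support}


# Hypothetical raw data at two sites; only counts leave each site.
sites = {
    "A": [["Bread", "milk"], ["bread", " "]],
    "B": [["milk", "bread", "eggs"], ["eggs"]],
}
local_results = [local_dm(preprocess(raw)) for raw in sites.values()]
global_model = build_global_model(local_results, min_support=0.5)
```

Because every site reports counts for all candidate itemsets, the merged supports are exact. A scheme that ships only locally frequent itemsets would reduce traffic further but would need an extra round to confirm global supports.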
