
Information Resources Management



Presentation Transcript


  1. Information Resources Management April 24, 2001

  2. Agenda • Administrivia • Object-Oriented & Databases • Data Warehousing • Data Mining • SQL Extensions • XML

  3. Administrivia • Homework #8 • Homework #9 • Current Scores • Final Review Session?

  4. OODBMS vs. ORDBMS • OODBMS - Object-Oriented • ORDBMS - Object-Relational • Appendix A

  5. OODBMS • Persistent Objects • By class • By creation • By marking • By reference • Storage/Retrieval Methods

  6. OODBMS - Benefits • Match • Programming • Methodology • Data types & structures • Ease of programming • Inheritance

  7. OODBMS - Challenges • Standards • ODMG - Object Database Management Group • Performance • Database vs. persistent language • Loss of integrity, queries • Storage Space • Maturity

  8. ORDBMS • Extensions to relational model • Complex data types • Inheritance • References • Migration path • Use existing applications and knowledge base

  9. ORDBMS - Benefits • SQL • Existing Systems • Vendors

  10. ORDBMS - Challenges • Standards • “Fit” with the development language • Programming Complexity

  11. Using a relational database to store data from an object-oriented system has been likened to parking your car in your garage. With an OODBMS you park the car in the garage. If an (O)RDBMS is used, to park your car in the garage, you must first completely disassemble it and put each part in its specific location on a shelf. This process must then be reversed the next time you want to go for a drive.
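
The analogy on slide 11 can be made concrete with a small sketch. It is illustrative only: the `Car` and `Wheel` classes, the table layout, and the `park`/`drive` helpers are all hypothetical names, and Python's built-in `sqlite3` module stands in for a relational DBMS. The object is taken apart into rows on the way in and rebuilt on the way out.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Wheel:  # hypothetical part type
    position: str


@dataclass
class Car:  # hypothetical object with nested parts
    vin: str
    model: str
    wheels: list = field(default_factory=list)


def park(conn, car):
    """Disassemble the object: one row per part, one table per part type."""
    conn.execute("INSERT INTO car (vin, model) VALUES (?, ?)", (car.vin, car.model))
    conn.executemany(
        "INSERT INTO wheel (car_vin, position) VALUES (?, ?)",
        [(car.vin, w.position) for w in car.wheels],
    )


def drive(conn, vin):
    """Reassemble the object from its rows."""
    car_vin, model = conn.execute(
        "SELECT vin, model FROM car WHERE vin = ?", (vin,)
    ).fetchone()
    rows = conn.execute(
        "SELECT position FROM wheel WHERE car_vin = ? ORDER BY rowid", (vin,)
    ).fetchall()
    return Car(car_vin, model, [Wheel(position) for (position,) in rows])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (vin TEXT PRIMARY KEY, model TEXT)")
conn.execute("CREATE TABLE wheel (car_vin TEXT REFERENCES car(vin), position TEXT)")

original = Car("VIN123", "Sedan", [Wheel("front-left"), Wheel("front-right")])
park(conn, original)
restored = drive(conn, "VIN123")
```

An object database would, in principle, persist `original` directly; the mapping code in `park` and `drive` is the "disassembly" the slide describes.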

  12. OODBMS/ORDBMS Products

  13. OODBMS/ORDBMS Products

  14. Other Links • Object Database Management Group www.odmg.org • Object Database Newsgroup comp.databases.object

  15. Data Mining • Corporations have colossal amounts of data • Usually only used for very specific purposes (operations) • Automated attempt to learn from the data • Find statistical rules and patterns in the data • Example: Giant Eagle Advantage Card

  16. Goals of Data Mining • Explanatory - Why? • Confirmatory - Is it? • Exploratory - ???

  17. Approaches to Data Mining • Classification - identify rules that create groups • Association - find related conditions or events • Correlation - relationships between values • User Guided - hypothesis driven • Automatic - data driven, AI based

  18. Data Warehouse • A subject-oriented, integrated, time-variant, nonvolatile collection of data • Usually all data for a corporation • Multidimensional database

  19. Data Warehousing • Single location • Long-term storage • Greater availability • Separate “data” processing from day-to-day operations (performance) • All data is historical • Support data mining, et al.

  20. Data Warehousing Questions • What data needs to be kept? • Where is it from? • How good is it? • How long should it be kept? • Can it be summarized? When? • Will it make sense? What is the schema? • When is it updated?

  21. Data Warehousing - Benefits • Support for decision making tools • DSS, EIS, Data Mining • Separation of information and day-to-day processing • Unification - Centralization • Improved quality and consistency

  22. Data Warehousing - Challenges • Costs: Storage, Setup, Maintenance • Historical data issues • Defining the warehouse schema • Doing the conversion • Implementation & every time • Keeping up with operational system changes • Answering the questions

  23. Multidimensional Databases • Two views • Multidimensional tables • Star schema • Multidimensional table • each cell is attribute • dimensions are “interesting” categories

  24. Multidimensional Table • Cell - sales • Dimensions • day • person • store • item

  25. Star Schema • Multiple tables • Central table - data item (cell) • Surrounding tables - information about each category (dimensions)

  26. Star Schema (diagram: Sales table at the center, linked to the Person, Day, Store, and Item tables)

  27. Star Schema Sales (Day, Person, Store, Item, sales) Day (Day, day info) Person (Person, person info) Store (Store, store info) Item (Item, item info)
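
A minimal sketch of the star schema from slide 27, using Python's built-in `sqlite3` module. The column names (`day_id`, `city`, `amount`, and so on) and the sample rows are assumptions made for illustration; the slide only gives the table outline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: information about each category (columns are assumed).
CREATE TABLE Day    (day_id    INTEGER PRIMARY KEY, weekday  TEXT);
CREATE TABLE Person (person_id INTEGER PRIMARY KEY, name     TEXT);
CREATE TABLE Store  (store_id  INTEGER PRIMARY KEY, city     TEXT);
CREATE TABLE Item   (item_id   INTEGER PRIMARY KEY, category TEXT);

-- Central fact table: one row per cell of the multidimensional table.
CREATE TABLE Sales (
    day_id    INTEGER REFERENCES Day(day_id),
    person_id INTEGER REFERENCES Person(person_id),
    store_id  INTEGER REFERENCES Store(store_id),
    item_id   INTEGER REFERENCES Item(item_id),
    amount    REAL,
    PRIMARY KEY (day_id, person_id, store_id, item_id)
);

-- Made-up sample data.
INSERT INTO Day    VALUES (1, 'Mon'), (2, 'Tue');
INSERT INTO Person VALUES (1, 'Ann');
INSERT INTO Store  VALUES (1, 'Pittsburgh'), (2, 'Cleveland');
INSERT INTO Item   VALUES (1, 'Grocery'), (2, 'Dairy');
INSERT INTO Sales  VALUES (1, 1, 1, 1, 10.0), (1, 1, 2, 1, 5.0), (2, 1, 1, 2, 7.5);
""")

# Typical star-schema query: join the fact table to one dimension and aggregate.
totals = conn.execute("""
    SELECT Store.city, SUM(Sales.amount)
    FROM Sales JOIN Store ON Sales.store_id = Store.store_id
    GROUP BY Store.city
    ORDER BY Store.city
""").fetchall()
```

Each additional dimension used in a query adds one more join against the central table, which is what gives the schema its star shape.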

  28. Building/Maintaining a Data Warehouse 1. Capture 2. Scrub 3. Transform 4. Load and Index
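
The four steps on slide 28 could be sketched as a tiny pipeline. This is one possible reading, not a prescribed design: the function names, the sample rows, and the `daily_sales` table are all hypothetical, and `sqlite3` stands in for the warehouse.

```python
import sqlite3


def capture():
    """1. Capture: stand-in for reading rows from an operational system."""
    return [
        {"store": " Pittsburgh ", "date": "2001-04-23", "amount": "10.50"},
        {"store": "pittsburgh", "date": "2001-04-23", "amount": "4.50"},
        {"store": "Cleveland", "date": "2001-04-24", "amount": "n/a"},  # bad value
    ]


def scrub(rows):
    """2. Scrub: normalize store names and drop rows with unusable amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        clean.append(
            {"store": row["store"].strip().title(), "date": row["date"], "amount": amount}
        )
    return clean


def transform(rows):
    """3. Transform: summarize to one total per store per day."""
    totals = {}
    for row in rows:
        key = (row["store"], row["date"])
        totals[key] = totals.get(key, 0.0) + row["amount"]
    return [(store, date, amount) for (store, date), amount in sorted(totals.items())]


def load_and_index(conn, rows):
    """4. Load and Index: insert the rows, then index a commonly queried column."""
    conn.execute("CREATE TABLE daily_sales (store TEXT, date TEXT, amount REAL)")
    conn.executemany("INSERT INTO daily_sales VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_daily_sales_store ON daily_sales (store)")


conn = sqlite3.connect(":memory:")
load_and_index(conn, transform(scrub(capture())))
loaded = conn.execute("SELECT store, date, amount FROM daily_sales").fetchall()
```

Dropping the bad row silently is a simplification; a real scrub step would more likely log or quarantine it, which ties back to the "How good is it?" question on slide 20.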

  29. Data Marts • Making specific data available • Different ones for different needs • (Diagram: Operational Systems, DW, DM1, DM2)

  30. Data Mining • Corporations have colossal amounts of data • Usually only used for very specific purposes (operations) • Automated attempt to learn from the data • Find statistical rules and patterns in the data • Example: Giant Eagle Advantage Card

  31. Goals of Data Mining • Explanatory - Why? • Confirmatory - Is it? • Exploratory - ???

  32. Approaches to Data Mining • Classification - identify rules that create groups • Association - find related conditions or events • Correlation - relationships between values • User Guided - hypothesis driven • Automatic - data driven, AI based
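
As a rough illustration of the "Association" approach, the sketch below computes support and confidence, two measures commonly used for association rules. The slides do not name these measures, and the basket data is made up.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`.

    Assumes the antecedent occurs at least once (otherwise this divides by zero).
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets, e.g. as collected through a loyalty card.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

bread_and_milk = support(baskets, {"bread", "milk"})
bread_implies_milk = confidence(baskets, {"bread"}, {"milk"})
```

Here "bread and milk" appears in half of the baskets, and two of the three baskets containing bread also contain milk.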

  33. Data Mining - Benefits • Use data • Learn new things • Improve decision making

  34. Data Mining - Challenges • Time (human and/or computer) • Spurious results • Separating the wheat from the chaff • Availability of data • Amount of data • Changes in tools and technologies • Validity over time

  35. Enhanced Data Analysis • Beyond SUM, COUNT, and AVG • SQL extensions (suggested) • GROUP BY … AS PERCENTILE • Specific percentiles • GROUP BY … WITH CUBE • Cross-tabulations • Statistical package interface • SAS, S++, others
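
To show what a cross-tabulation in the spirit of GROUP BY … WITH CUBE produces, here is a plain-Python emulation. It is a sketch of the idea rather than any vendor's syntax: the `cube` function and the sample rows are hypothetical, and `None` marks a dimension that has been rolled up.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Sum `measure` over every subset of `dims`, as a CUBE-style grouping would."""
    totals = defaultdict(float)
    for size in range(len(dims) + 1):
        for kept in combinations(dims, size):
            for row in rows:
                # Keep the value for grouped dimensions, None for rolled-up ones.
                key = tuple(row[d] if d in kept else None for d in dims)
                totals[key] += row[measure]
    return dict(totals)


# Made-up sales rows.
rows = [
    {"store": "A", "item": "milk", "sales": 3.0},
    {"store": "A", "item": "eggs", "sales": 2.0},
    {"store": "B", "item": "milk", "sales": 4.0},
]

result = cube(rows, ("store", "item"), "sales")
```

With two dimensions the result holds the per-cell values, the per-store and per-item subtotals, and the grand total under the key `(None, None)`. The number of groupings doubles with each added dimension, which relates to the "Processing requirements" challenge on slide 37.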

  36. Enhanced Data Analysis - Benefits • Greater functionality • Improved decision making

  37. Enhanced Data Analysis - Challenges • Lack of standards • Understandability • Processing requirements • Cost of poorly written queries • “ad hoc” queries aren’t reviewed

  38. Extending Relational DBs • Spatial and Geographic Databases • Multimedia Databases • Changing the data stored while retaining the benefits of relational databases

  39. Spatial & Geographic DBs • Spatial - CAD • Geographic - GIS • Similar issue • How to store and retrieve such data

  40. Spatial Databases • Geometric objects (2 or 3 dimensions) • Locations • Connections • Nonspatial information about each object • Substructures • Spatial integrity constraints • Two things can’t occupy the same space
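
The spatial integrity constraint "two things can't occupy the same space" can be sketched for the simple case of axis-aligned rectangles. The `Rect` and `Layout` classes are hypothetical, and a real spatial database would typically use a spatial index rather than the linear scan shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle with corners (x1, y1) and (x2, y2), where x1 < x2 and y1 < y2."""
    x1: float
    y1: float
    x2: float
    y2: float


def overlaps(a, b):
    """True if the interiors intersect; rectangles that only touch at an edge do not overlap."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


class Layout:
    """Stores named objects and rejects any insert that violates the constraint."""

    def __init__(self):
        self.objects = {}

    def place(self, name, rect):
        for other_name, other in self.objects.items():
            if overlaps(rect, other):
                raise ValueError(f"{name} would occupy the same space as {other_name}")
        self.objects[name] = rect


layout = Layout()
layout.place("desk", Rect(0, 0, 2, 2))
layout.place("chair", Rect(2, 0, 4, 2))  # shares an edge with the desk: allowed

try:
    layout.place("cabinet", Rect(1, 1, 3, 3))  # intersects both: rejected
    rejected = False
except ValueError:
    rejected = True
```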

  41. GIS Databases • Raster Data (fractal data) • Pictures - possibly over time • Maps • Vector Data • Locations • Connections • Nongeographic information

  42. Spatial & Geographic DB - Benefits • DBMS • Specialized queries • Spatial & Geographic Data • “Standard” Data • Mix of the two • Integrity constraints

  43. Spatial & Geographic DB - Challenges • Space requirements • Level of detail • Understandability - Complexity • Processing requirements • Compatibility between systems • Lack of standards

  44. Multimedia Databases • Images, Audio, Video • Nonmultimedia data (text) about each • Database Enhancements • BLOBs (Binary Large Objects) • Similarity-based queries • Guaranteed steady rate • Synchronization of audio and video
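
One way to picture a similarity-based query: each media object is reduced to a feature vector, and the query returns the stored objects closest to a query vector. The sketch below assumes Euclidean distance and three-number feature vectors; the file names, the values, and the `nearest` helper are all hypothetical (requires Python 3.8+ for `math.dist`).

```python
import math


def nearest(query, catalog, k=1):
    """Return the names of the k catalog entries closest to `query` by Euclidean distance."""
    return sorted(catalog, key=lambda name: math.dist(query, catalog[name]))[:k]


# Hypothetical feature vectors, e.g. average red, green, and blue intensity per image.
catalog = {
    "sunset.jpg": (0.9, 0.4, 0.1),
    "forest.jpg": (0.1, 0.8, 0.2),
    "ocean.jpg": (0.1, 0.3, 0.9),
}

match = nearest((0.8, 0.5, 0.2), catalog)
```

Unlike an exact-match lookup, the query never has to equal a stored value; the answer is whichever objects are "close enough" under the chosen distance.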

  45. Multimedia Databases - Benefits • DBMS • Greater compression may be possible • “Paperless” office - document imaging • Workflow redesign - improvements • Greater availability

  46. Multimedia Databases - Challenges • S T O R A G E • Specialized DBMS • Unity of database and network • Usually requires ATM • Specialized hardware • “juke boxes” • optical disks

  47. XML • What is it? • What isn’t it? • What are the goals? • Who controls it? • Who’s using it? • Beyond XML

  48. What is XML? • eXtensible Markup Language • Markup language for “structured information” • “structured” - content & role of that content • markup - identify structures • “meta language for describing markup languages”

  49. Huh? • Storing structured data in a text file • spreadsheet, address book, transactions (think EDI) • Looks like HTML, <tags>, but isn’t • Text is universal, but not efficient • Does disk space matter? • What about network capacity? • XML is license-free & platform-independent
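
A small sketch of "storing structured data in a text file", using the address book example from slide 49 and Python's standard `xml.etree.ElementTree` module. The tag names and contact data are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical address book: plain text, with tags naming the role of each piece of content.
doc = """<addressbook>
  <contact id="1"><name>Ann</name><phone>555-0100</phone></contact>
  <contact id="2"><name>Bob</name><phone>555-0101</phone></contact>
</addressbook>"""

root = ET.fromstring(doc)

# Read the structure back out as ordinary records.
contacts = [
    {"id": c.get("id"), "name": c.findtext("name"), "phone": c.findtext("phone")}
    for c in root.findall("contact")
]

# "Extensible": a new tag can be added without changing the language itself.
ET.SubElement(root[0], "email").text = "ann@example.com"
serialized = ET.tostring(root, encoding="unicode")
```

The tags look like HTML, but here they describe what the data is (`name`, `phone`) rather than how to display it.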

  50. What XML isn’t • HTML • SGML - Standard Generalized Markup Language - printing • Limited to current definitions (tags) • XML is the way to add new definitions • A relational database management system • A database, or is it?
