1 / 67

Chapter 10

Chapter 10. Data and Knowledge Management. Agenda. Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining. Data. Set of discrete, objective facts about events Business - structured records of transactions

Download Presentation

Chapter 10

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 10 Data and Knowledge Management

  2. Agenda • Information processing • Database • Data Administrator • The DBMS • Distributing data • Data warehousing and data mining

  3. Data • Set of discrete, objective facts about events • Business - structured records of transactions • Little relevance or purpose

  4. Information • Message with sender and receiver • Meant to change way receiver perceives something • Have an impact on his judgment / behavior

  5. Data Processing • Contextualize - why was data gathered? • Categorize - what are its key components? • Calculate - analyze mathematically • Condense - summarize in more concise form

  6. Information Processing • Compare - in kind and in time • Consequences - how used in decisions / actions • Connections - relation to other information • Conversation - what other people think about this information

  7. Agenda • Information processing • Database • Data Administrator • The DBMS • Distributing data • Data warehousing and data mining

  8. Database • Element • Types • Structure • Models • Creation • Topology

  9. Element • Bit, byte, field, record, file, database • Entity, attribute, key field • Relation • Class, object

  10. Database Types • Business database • Geographical information database • Knowledge database / deductive database • Multimedia database • Data warehouse • Data marts • Multimedia and hypermedia database • Object-oriented database

  11. Database Structure • Data definition language • Schema & subschema • Data Manipulation language • Structured Query Language (SQL) • Query By Example (QBE) • Data dictionary

  12. Database Models • Hierarchical • One to many • TPS or routine MIS • Network • Many to many • TPS or routine MIS • Relational • Normalization • Ad hoc reports or DSS • Object-oriented • E-commerce

  13. Database Creation • Conceptual design • Logical view • Entity-relationship (ER) diagram • Normalization

  14. Entity Relationship Diagram • Entity: object or concept • Relationship: meaning association between objects • Attribute: property of an object • Simple & Composite • Single-valued & multi-valued • Derived • Key • Primary key • Foreign key

  15. Normalization • A technique for identifying a true primary key for a relation • Types • First normal form: not repeating group • Second normal form: every non-primary-key attribute is fully functionally dependent on the entire primary key • Third normal form: no transit dependency

  16. Structured Query Language • Select • Join

  17. SQL DML - SELECT • SELECT [DISTINCT|ALL] {* | [col-expr [AS newname]][,...] FROM table-name [alias] [,...] [WHERE condition] [GROUP by colm [, colm] [HAVING condition]] ORDER BY colm [, colm]

  18. SQL DML - SELECT • SELECT attributes (or calculations: +, -, /, *) FROM relation • SELECT DISTINCT attributes FROM relation

  19. Examples • SELECT stuname FROM student; • SELECT stuid, stuname, credit FROM student; • SELECT stuid, stuname, credit+10 FROM student; • SELECT DISTINCT major FROM student;

  20. SQL DML - SELECT • SELECT attributes (or * wild card) FROM relation WHERE condition

  21. Examples • SELECT * FROM student; • SELECT stuname, major, credit FROM student WHERE stuid = ‘S114’; • SELECT * FROM faculty WHERE dept = ‘MIS’;

  22. SELECT - WHERE condition • AND OR • NOT IN • NOT IN BETWEEN • IS NULL IS NOT NULL • LIKE '%' multiple characters • LIKE ‘_’ single characters

  23. Examples • SELECT * FROM faculty WHERE dept = ‘MIS’ AND rank = ‘full professor’; • SELECT * FROM faculty WHERE dept = ‘MIS’ OR rank = ‘full professor’; • SELECT * FROM faculty WHERE dept = ‘MIS’ NOT rank = ‘full professor’;

  24. SELECT * FROM class WHERE room LIKE ‘B_S%’; • SELECT * FROM class WHERE room NOT LIKE ‘BUS%’; • SELECT productid, productname FROM inventory WHERE onhand BETWEEN 50 and 100;

  25. SELECT companyid, companyname FROM company WHERE companyname BETWEEN ‘G’ AND ‘K’; • SELECT productid, productname FROM inventory WHERE onhand NOT BETWEEN 50 and 100; • SELECT companyid, companyname FROM company WHERE companyname NOT BETWEEN ‘G’ AND ‘K’;

  26. SELECT facname FROM faculty WHERE dept IN (‘MIS’, ‘ACT’); • SELECT facname FROM faculty WHERE rank NOT IN (‘assistant’, ‘lecture’); • SELECT customername FROM customer WHERE emailadd IS NOT NULL;

  27. SELECT customername FROM customer WHERE creditlimit IS NULL;

  28. SELECT - aggregate functions • COUNT (*) • COUNT • SUM • AVG • MIN • MAX

  29. Examples • SELECT COUNT(*) FROM student; • SELECT COUNT(major) FROM student; • SELECT COUNT(DISTINCT major) FROM student;

  30. SELECT COUNT(stuid), SUM(credit), AVG(credit), MAX(credit), MIN(credit) FROM student;

  31. SELECT - GROUP • GROUP BY • HAVING

  32. Examples • SELECT major, AVG(credit) FROM student GROUP BY major HAVING COUNT(*) > 2; • SELECT course#, COUNT(stuid) FROM enrollment GROUP BY course# HAVING COUNT(*) > 2;

  33. SELECT major, AVG(credit) FROM student WHERE major IN (‘MIS’, ‘ACT’) GROUP BY major HAVING COUNT(*) > 2;

  34. SELECT - ORDER BY • ORDER BY • ORDER BY ... DESC

  35. Examples • SELECT facname, rank FROM faculty ORDER BY facname; • SELECT facname, rank FROM faculty ORDER BY rank DESC, facname;

  36. SELECT - JOIN Tables • Multiple tables in FROM clause • MUST have join conditions!!!

  37. Examples • SELECT stuname, grade FROM student, enrollment WHERE student.stuid = enrollment.stuid;

  38. SELECT enrollment.course#, stuname, major FROM class, enrollment, student WHERE class.course# = enrollment.course# AND enrollment.stuid = student.stuid AND facid = ‘F114’ ORDER BY enrollment.course#;

  39. SUBQUERY, EXIST, NOT EXIST • SELECT s.stuname, major FROM student s WHERE EXIST (SELECT * FROM enrollment e WHERE s.stuid = e.stuid);

  40. SELECT s.stuname, major FROM student s WHERE NOT EXIST (SELECT * FROM enrollment e WHERE s.stuid = e.stuid);

  41. Database Creation • Physical design • Physical view • Data topology (organization) • Centralized • Distributed database • Replicated database • Partitioned • Organization & access method • Sequential file • Indexed sequential file • Direct or random file • Security • Logical, physical, and transmitting

  42. Selection Criteria • User’ needs (type of application) • Compatibility • Portability • Reliability • Cost • Feature • Performance • Vendor’s support • Others?

  43. Agenda • Information processing • Database • Data Administrator • The DBMS • Distributing data • Data warehousing and data mining

  44. Data Administrator • Clean up data definitions • Control shared data • Manage distributed data • Maintain data quality

  45. Clean Up Definitions • Synonyms / aliases • Standard data definitions • Names and formats • Data Dictionary • Active • Integrated

  46. Control Shared Data • Local - used by one unit • Shared - used by two or more activities • Impact of proposed program changes on shared data • Program-to-data element matrix

  47. Manage Distributed Data • Geographically dispersed • Whether shared data or not • Different levels of detail • Different management levels

  48. Maintain Data Quality • Put owners in charge of data • Verify data accuracy and quality • Purge old data

  49. Agenda • Information processing • Database • Data Administrator • The DBMS • Distributing data • Data warehousing and data mining

  50. The DBMS Data Base Management System: software that permits a firm to: • Centralize data • Manage them efficiently • Provide accessto applications • Such as payroll, inventory

More Related