
Introduction to Computing Lecture # 13




  1. Introduction to Computing Lecture # 13

  2. Outline • Managing Files: Basic Concepts • Database Definition • Database Management Systems • Database Models • Data Mining

  3. Managing Files: Basic Concepts • Data storage hierarchy - levels of data stored in a computer: • Database • Files • Records • Fields • Characters (bytes) • Bits

  4. Managing Files: Basic Concepts • Database – an organized collection of integrated files. • File – a collection of related records. • Record – a collection of related fields. Often called a row. • Field – a unit (individual piece) of data consisting of one or more characters (bytes). Often called a column. • Character (byte) – a letter, number, or special character. • Key field – a field that is chosen to uniquely identify a record so that it can be easily retrieved and processed.
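The hierarchy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `StudentRecord` class, its fields, and the sample values are hypothetical and do not come from the lecture.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """A record: a collection of related fields (one row)."""
    student_id: int  # key field: uniquely identifies the record
    name: str        # field made of characters
    major: str       # field made of characters


# A file: a collection of related records, indexed here by the key field.
student_file = {}
for record in [
    StudentRecord(1001, "Ada", "Mathematics"),
    StudentRecord(1002, "Alan", "Computing"),
]:
    student_file[record.student_id] = record

# The key field lets a single record be retrieved directly.
found = student_file[1002]

# Characters are stored as bytes, and each byte is 8 bits.
name_bytes = found.name.encode("ascii")
name_bits = len(name_bytes) * 8

print(found.name, len(name_bytes), name_bits)
```

A database, in these terms, would be several such files organized together, for example one for students and one for courses.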

  5. Managing Files: Basic Concepts • [Figure: a sample table with a field name, a field, and a record labeled]

  6. Database Definition • Structured set of data held in a computer. (Pocket Oxford Dictionary) • An organized collection of related (integrated) files. (Williams and Sawyer) • A database is a collection of related data or facts. (Peter Norton)

  7. Database Management Systems • Database management system (DBMS) – programs that control the structure of a database and access to the data. (Williams and Sawyer) • DBMS is a collection of programs that control the database. (Peter Norton) • Advantages of DBMSes: • File sharing • Reduced data redundancy • Data redundancy – situation in which the same data fields appear in many different files and often in different formats. • Improved data integrity • Data integrity – measure of how accurate, consistent, and up-to-date data is. • Increased security

  8. Database Models • Just as files can be organized in different ways, so databases can be organized in ways to best fit their use. • The four most common arrangements are: • Hierarchical • Network • Relational • Object-oriented

  9. Database Models • Hierarchical database – fields or records are arranged in related groups, resembling a family tree, with child (lower-level) records subordinate to parent (higher-level) records.
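A minimal sketch of the hierarchical idea, assuming a made-up department/course/student tree (the names and the helper functions `find_parent` and `path_from_root` are hypothetical). The point it illustrates is that every child record has exactly one parent, so there is a single path from the root to any record.

```python
# Each parent record maps to its child records (a family-tree shape).
tree = {
    "Department": ["Course A", "Course B"],
    "Course A": ["Student 1", "Student 2"],
    "Course B": ["Student 3"],
}


def find_parent(tree, child):
    """Return the single parent of a child record, or None for the root."""
    for parent, children in tree.items():
        if child in children:
            return parent
    return None


def path_from_root(tree, node):
    """Walk upward from a record to the root and return the path top-down."""
    path = [node]
    parent = find_parent(tree, node)
    while parent is not None:
        path.append(parent)
        parent = find_parent(tree, parent)
    return list(reversed(path))


print(path_from_root(tree, "Student 3"))
```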

  10. Database Models • Network database – similar to a hierarchical database, but each child record can have more than one parent record.
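The difference from the hierarchical model can be sketched as follows, again with hypothetical names: here a child record lists all of its parents, so one student can belong to more than one course.

```python
# Each child record maps to a list of parent records (possibly several).
parents_of = {
    "Student 1": ["Course A", "Course B"],  # two parents: not a strict tree
    "Student 2": ["Course A"],
}


def children_of(parents_of, parent):
    """Return every child record linked to the given parent record."""
    return sorted(
        child for child, parents in parents_of.items() if parent in parents
    )


print(children_of(parents_of, "Course A"))
print(parents_of["Student 1"])
```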

  11. Database Models • Relational database – a database which relates (connects) data in different files through the use of a key field, or common data element.

  12. Database Models • SQL (Structured Query Language) – the standard language used to create, modify, maintain, and query relational databases. • SQL is often pronounced “sequel.” • How did this acronym get such an unlikely pronunciation? • The first structured query language was developed by IBM in the 1970s; its early versions were named “SEQUEL” (Structured English Query Language) and “SEQUEL/2,” and the pronunciation stuck after the name was shortened to SQL. • E. F. Codd is considered the “father” of relational database management systems – the most common model of databases. • His article entitled “A Relational Model of Data for Large Shared Data Banks” was published in the June 1970 “Communications of the ACM.”
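As a rough illustration of SQL creating, modifying, and querying a relational database, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and sample rows are assumptions made for the example; the two tables are related through the key field `student_id`.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: two tables (files) that share the key field student_id.
cur.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE enrollments (student_id INTEGER, course TEXT)")

cur.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(1001, "Ada"), (1002, "Alan")],
)
cur.executemany(
    "INSERT INTO enrollments VALUES (?, ?)",
    [(1001, "Databases"), (1002, "Databases"), (1002, "Data Mining")],
)

# Modify: change one field of one record, located by its key field.
cur.execute("UPDATE students SET name = ? WHERE student_id = ?", ("Alan T.", 1002))

# Query: relate the two tables through the common key field.
rows = cur.execute(
    "SELECT s.name, e.course "
    "FROM students AS s JOIN enrollments AS e ON s.student_id = e.student_id "
    "ORDER BY s.student_id, e.course"
).fetchall()
conn.close()

for name, course in rows:
    print(name, course)
```

SQLite is used here only because it ships with Python; the same statements would look much the same in the other relational systems surveyed later in the lecture.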

  13. Database Models • Object-oriented database – database which uses “objects” (software written in small, reusable chunks) as elements within database files • An object consists of: • Data in any form, and • Instructions on the actions to take on the data
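The two parts of an object, data plus the instructions that act on it, can be sketched with an ordinary Python class. The `Account` class and the plain dictionary standing in for a database file are hypothetical; a real object-oriented DBMS would also handle persistence and querying.

```python
class Account:
    """An object: data together with the actions allowed on that data."""

    def __init__(self, owner, balance=0.0):
        # Data held by the object.
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        # Instructions on the action to take on the data.
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount
        return self.balance


# A stand-in for a database file whose elements are whole objects.
object_store = {"acct-1": Account("Ada", 100.0)}

new_balance = object_store["acct-1"].deposit(25.0)
print(new_balance)
```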

  14. Survey of Database Systems • Databases for individuals • Manage aspects of your life • Organize hobbies or school work • Microsoft Access is the most popular • Common Corporate DBMS • Oracle • DB2 • Microsoft SQL Server • MySQL

  15. Survey of Database Systems • Oracle • Most popular enterprise-level DBMS • Very flexible storage system • Can be very complex • Platform independent • Offers a wide range of solutions • DB2 • Venerable IBM database • Platform independent • Only database using pure SQL

  16. Survey of Database Systems • Microsoft SQL Server • Fastest growing DBMS • Only runs on Microsoft platforms • Eight different versions exist • Extremely scalable architecture • Software can grow with the data • MySQL • Leading DBMS for Linux • Very inexpensive • Offers the features most businesses need • Often faster than other DBMSes • Platform independent

  17. Data Mining • Data mining (DM) – the computer-assisted process of sifting through and analyzing vast amounts of data in order to extract meaning and discover new knowledge. • Searches for trends and patterns • Makes predictions about events • Supplies ideas for improving business • Data mining begins with acquiring data and preparing it for what is known as the data warehouse, in the following stages: • Data sources • Data fusion and cleansing • Data and meta-data • Data warehouse

  18. Data Mining • Data sources • Data may come from a number of sources: • Point-of-sale transactions in flat files on mainframes; • Databases of all kinds; • Other, e.g., news articles, online articles, etc.; and • Data from data warehouses

  19. Data Mining • Data fusion and cleansing • Data from diverse sources must be fused (joined) together, then put through a process known as data cleansing, or scrubbing. • The data may be of poor quality, full of errors and inconsistencies. • In short: combine the data from the various sources, then “scrub” it to eliminate errors and inconsistencies.
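A minimal sketch of fusion followed by cleansing, assuming two made-up point-of-sale sources (`store_a`, `store_b`) and a hypothetical `cleanse` helper. The cleansing rules shown here (trim and normalize names, drop non-numeric amounts) are only examples of the kind of scrubbing the slide describes.

```python
# Two sources with inconsistent formatting and one bad value.
store_a = [
    {"customer": " Alice ", "amount": "10.50"},
    {"customer": "BOB", "amount": "7"},
]
store_b = [
    {"customer": "alice", "amount": "3.25"},
    {"customer": "Carol", "amount": "n/a"},  # error: amount is not a number
]


def cleanse(record):
    """Return a scrubbed copy of the record, or None if it cannot be repaired."""
    name = record["customer"].strip().title()
    try:
        amount = float(record["amount"])
    except ValueError:
        return None
    return {"customer": name, "amount": amount}


# Fusion: put the data from the different sources together.
fused = store_a + store_b

# Cleansing: eliminate errors and inconsistencies.
cleaned = [c for c in (cleanse(r) for r in fused) if c is not None]

for row in cleaned:
    print(row)
```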

  20. Data Mining • Data and meta-data • Cleaned-up data and meta-data (data about data) • The cleansing process yields both the cleaned-up data and a variation of it called meta-data. • Meta-data shows the origins of the data, the transformations it has undergone, and summary information about it, which makes it more useful than the cleansed but unintegrated, unsummarized data.
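One way to picture meta-data is as a small record kept alongside the cleaned-up data. The sketch below is an assumption-laden illustration: the field names (`sources`, `transformations`, `summary`), the sample rows, and the `warehouse` dictionary are all hypothetical.

```python
# Cleaned-up data, with each row tagged by the source it came from.
cleaned = [
    {"customer": "Alice", "amount": 10.5, "source": "store_a"},
    {"customer": "Bob", "amount": 7.0, "source": "store_a"},
    {"customer": "Alice", "amount": 3.25, "source": "store_b"},
]

# Meta-data: origins, transformations applied, and summary information.
metadata = {
    "sources": sorted({row["source"] for row in cleaned}),
    "transformations": [
        "trimmed whitespace",
        "normalized name capitalization",
        "dropped non-numeric amounts",
    ],
    "summary": {
        "record_count": len(cleaned),
        "total_amount": sum(row["amount"] for row in cleaned),
    },
}

# Both the data and the meta-data go to the data warehouse.
warehouse = {"data": cleaned, "metadata": metadata}

print(warehouse["metadata"]["summary"])
```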

  21. Data Mining • Data warehouse • A special database of cleaned-up data and meta-data. • Both the data and the meta-data are sent to the data warehouse.

  22. Data Mining • Some applications of data mining: • Marketing: • Marketers use data mining tools to mine point-of-sale databases of retail stores, which contain facts for thousands of products in hundreds of geographic areas. • By understanding customer preferences and buying patterns, marketers hope to target consumers’ individual needs. • Health: • A coach in the U.S. Gymnastics Federation used a data mining system called IDIS to discover what long-term factors contributed to athletes’ performance, so as to know what problems to treat early on.

  23. End • Questions?
