Administração e Optimização de Bases de Dados • Enquadramento • Programa • Bibliografia • Planeamento teóricas e laboratórios • Avaliação • Corpo docente
Enquadramento Cadeiras de gestão de dados e tratamento de informação DEI 2007/08 : • BD (3º ano LEIC e LERCI) • GTI (1º ano MEIC/SIE) • SAD (1º ano MEIC e MERC) • AOBD (1º ano MEIC/TSI e MERC)
Bibliografia • Database System Concepts, Silberchatz, Korth, Sudarshan 5ª edição, McGraw Hill, 2005 • Database Management Systems, Raghu Ramakrishnan, Johannes Gehrke 3ª edição, Mc Graw Hill, 2002 • Database Tuning: Principles, Experiments, and Troubleshooting Techniques, Dennis Shasha & Philippe Bonnet, Morgan Kaufmann, 2003.
Programa (1) • Armazenamento e Sistema de Ficheiros: cap 11 (Silb), 9 (Raghu) • Técnicas de indexação (B+tree e hash-based) – cap 12 (Silb), 10 e 11 (Raghu) • Processamento de interrogações – cap 13(Silb), 12 e 13 (Raghu), • Optimização de interrogações – cap 14 (Silb), 15 (Raghu) • Gestão de Transações – cap 15 (Silb), 16 (Raghu) • Controlo de Concorrência – cap 16(Silb), 17 (Raghu) • Gestão de Recuperação – cap 17(Silb), 18 (Raghu)
Programa (2) • Principios de tuning • Lock tuning • Log tuning • Storage tuning • OS tuning • Index tuning • Query processing tuning • API tuning (Shasha & Bonnet e alguns capítulos do Raghu e/ou Silberchatz)
Planeamento • Aulas teóricas: 1H30 • Matéria (slides baseados no livro e artigos) • Aulas laboratóro: 1H30 • SQLServer 2005 • Material de apoio adicional: livros on-line e guias de laboratório
Avaliação • Exame (52%): • Nota mínima: 9,5v • Mini-projectos (48%): • Grupos de 3 pessoas • Exercícios “de papel e lápis” e/ou com SQLServer 2005. • Nota mínima: 9,5v
Corpo docente • Helena Galhardas (responsável) • Pável Calado • Bruno Martins (laboratórios)
Chapter 1: Introduction • Purpose of database systems • Data abstraction levels • Data models • SQL :Data Definition Language and Data Manipulation Language • Transaction management • Database users • DBMS structure • Typical database architectures
Database management systems (DBMS) • Collection of interrelated data and a set of programs to access the data • DBMS contains relevant information about a particular enterprise • DBMS provides an environment that is both convenient and efficient to use. • Database applications: • Banking: all transactions • Airlines: reservations, schedules • Universities: registration, grades • Sales: customers, products, purchases • Manufacturing: production, inventory, orders, supply chain • Human resources: employee records, salaries, tax deductions
DBMS vs File Systems (1) • In the early days, database applications were built on top of file systems • Drawbacks of using file systems to store data: • Data redundancy and inconsistency: Multiple file formats, duplication of information in different files • Difficulty in accessing data: Need to write a new program to carry out each new task • Data isolation: Multiple files and formats • Integrity problems: Integrity constraints (e.g. account balance > 0) become part of program code; Hard to add new constraints or change existing ones
DBMS vs File Systems (2) • Drawbacks of using file systems (cont.) • Atomicity of updates • Failures may leave database in an inconsistent state with partial updates carried out • E.g. transfer of funds from one account to another should either complete or not happen at all • Concurrent access by multiple users • Concurrent access needed for performance • Uncontrolled concurrent accesses can lead to inconsistencies • E.g. two people reading a balance and updating it at the same time • Security problems …. DBMS offer solutions to all the above problems!
Why use a DBMS? • Data independence and efficient access • Reduced application development time. • Data integrity and security. • Uniform data administration. • Concurrent access, recovery from crashes.
Levels of data abstraction (1) Physical level: describes how data are stored. Logical level: describes data stored in the database, and the relationships among the data. View level: application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
Data Models • A collection of conceptual tools for describing: • data • data relationships • data semantics • data constraints Ex: • Entity-Relationship model, UML class diagram • Relational model • Other models: • semi-structured data models (XML used to represent sstr. data) • older models: network model and hierarchical model
Entity-Relationship Model Example of schema in the entity-relationship model
Data Independence • Applications insulated from how data is structured and stored. Logical data independence:Protection from changes in the logicalstructure of data. CustomerPublic(customer-id, customer-name) CustomerPrivate(customer-id, salary) Physical data independence:Protection from changes in the physical structure of data. • One of the most important benefits of using a DBMS!
Instances and Schemas Schema: the logical structure of the database, a description of a particular collection of data, using a given data model • Analogous to type information of a variable in a program • Physical schema: database design at the physical level • Logical schema: database design at the logical level Instance: the actual content of the database at a particular point in time • Analogous to the value of a variable
Example External schema (View): CustomerPublic (customer-id, customer-name) Logical schema: Customer(customer-id, customer-name, customer-street, customer-age) Account(account-number, balance) Depositor(account-number, customer-id) Physical schema: • Relations stored as unordered files. • Index on first column of Customer.
SQL: Structured Query Language • A DBMS provides: • a Data Definition Language (DDL) • a Data Manipulation Language (DML) as part of a single DB language: SQL • Most widely used declarative query language • Procedural – user specifies what data is required and how to get those data • Declarative – user specifies what data is required without specifying how to get those data
SQL: Data Definition Language (DDL) • DDL: Specification notation for defining the database • Ex: create tableaccount( account-numberchar(10), balanceinteger) • DDL compiler generates a set of tables stored in a data dictionary • Data dictionary contains metadata (i.e., data about data) • Ex: Database schema, consistency constraints, access methods
SQL: Data Manipulation Language (DML) • Language for accessing and manipulating data • Ex: find the name of the customer with customer-id 192-83-7465selectcustomer.customer-namefromcustomerwherecustomer.customer-id = ‘192-83-7465’
Transaction Management • A transaction is a collection of operations that performs a single logical function in a database application • Ex: funds transfer includes withdraw from account A and deposit into account B • Transaction management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures. • Transactions must be atomic, consistent, isolated and durable/persistent
Database Users (1) • Users are differentiated by the way they expect to interact with the system Application programmers: interact with system through DML calls Sophisticated users: form requests in a database query language Naïve/End users: invoke one of the permanent application programs that have been written previously • E.g. people accessing database over the web, bank tellers, clerical staff
Database Users (2) • Database administrators Coordinates all the activities of the database system; has a good understanding of the enterprise’s information resources and needs. Duties include: • Schema definition • Storage structure and access method definition • Schema and physical organization modification • Granting user authority to access the database • Specifying integrity constraints • Acting as liaison with users • Monitoring performance, tuning the system and responding to changes in requirements
Database Tuning Database Tuning is the activity of making a database application run more quickly. “More quickly” usually means higher throughput, though it may mean lower response time for time-critical applications.
ApplicationProgrammer(e.g., business analyst, Data architect) Application SophisticatedApplicationProgrammer(e.g., SAP admin) QueryProcessor Indexes Storage Subsystem Concurrency Control Recovery DBA,Tuner Operating System Hardware[Processor(s), Disk(s), Memory]
Next lecture • Ch 11: Storage and File Structure