C-Store: The Life of a Query Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY Mar 6, 2009
Main Components of a DBMS (1) • A typical DBMS has 5 main components • Client Communication Manager • Process Manager • Relational Query Processor • Transactional Storage Manager • Shared Components and Utilities
A Single-Query Transaction • At an airport, a gate agent clicks on a form to request the passenger list for a flight. • Example of a possible SQL statement: SELECT name FROM Passenger WHERE flight = '510275'
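The gate agent's request can be tried end to end with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for the DBMS; the slide only gives the query, so the table layout (two text columns, `name` and `flight`), the helper names, and the sample rows are assumptions made for illustration.

```python
import sqlite3

def make_demo_db():
    """Build an in-memory database with a few made-up rows (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: the slide shows only the columns used by the query.
    conn.execute("CREATE TABLE Passenger (name TEXT, flight TEXT)")
    conn.executemany(
        "INSERT INTO Passenger VALUES (?, ?)",
        [("Alice", "510275"), ("Bob", "510275"), ("Carol", "999999")],
    )
    conn.commit()
    return conn

def passenger_names(conn, flight):
    """Return the names of passengers booked on the given flight."""
    # A parameterized query keeps the flight number out of the SQL text itself.
    cur = conn.execute("SELECT name FROM Passenger WHERE flight = ?", (flight,))
    return [row[0] for row in cur.fetchall()]
```

Calling `passenger_names(make_demo_db(), "510275")` returns the two made-up passengers on that flight. SQLite is an embedded engine, so it does not show the client/server hop described in Stage 1.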
Stage 1: Submit the SQL Statement • The Client: the personal computer at the airport gate • Calls an API to create a connection with the Client Communication Manager • The Client Communication Manager • Performs a security check on the Client • Sets up state to remember the connection • Also remembers the SQL statement • Forwards the Client’s request deeper into the DBMS for processing.
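The responsibilities listed for the Client Communication Manager can be sketched as a toy class. Everything here is hypothetical: the class and method names, the plain dictionary of credentials, and the `forward` callback that stands in for the rest of the DBMS. A real system would use a proper authentication mechanism and a network protocol.

```python
import itertools

class ClientCommunicationManager:
    """Toy model: authenticates clients, tracks connections, forwards SQL."""

    def __init__(self, credentials, forward):
        self._credentials = credentials   # user -> password (hypothetical store)
        self._forward = forward           # callable that processes a SQL string
        self._connections = {}            # connection id -> per-connection state
        self._ids = itertools.count(1)

    def connect(self, user, password):
        # Security check: refuse the connection if the credentials do not match.
        if user not in self._credentials or self._credentials[user] != password:
            raise PermissionError("authentication failed")
        conn_id = next(self._ids)
        # Set up state that remembers this connection.
        self._connections[conn_id] = {"user": user, "last_sql": None}
        return conn_id

    def submit(self, conn_id, sql):
        state = self._connections[conn_id]  # KeyError if the connection is unknown
        state["last_sql"] = sql             # remember the statement
        return self._forward(sql)           # hand the request deeper into the DBMS

    def last_statement(self, conn_id):
        return self._connections[conn_id]["last_sql"]
```

A client would first call `connect(...)` to obtain a connection id and then pass that id to `submit(...)` with each SQL statement.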
Different Connection Arrangements • Two-tier or Client-server: • Client → Database Server • Via ODBC or JDBC • Three-tier • Client → Web Server → Database Server • Four-tier • Client → Web Server → App Server → Database Server
Stage 2: Assign a Thread of Computation • Upon receiving the SQL statement from the Client Communication Manager • The Process Manager first performs Admission Control • Decides whether the system should begin processing the query at once • Or defer it until enough resources are available for execution. • If the query should be executed at once, the Process Manager allocates a thread of control for it.
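Admission control followed by thread allocation might look roughly like the sketch below. It assumes the only resource being rationed is a fixed number of concurrent query slots, modeled with a semaphore; the class name `ProcessManager` and its methods are invented for this example, and real systems weigh memory, I/O, and other resources as well.

```python
import threading

class ProcessManager:
    """Toy admission control: at most `max_active` queries run at once."""

    def __init__(self, max_active):
        self._slots = threading.Semaphore(max_active)

    def try_admit(self):
        # Non-blocking check: True means the query may start right away.
        return self._slots.acquire(blocking=False)

    def release(self):
        # Give the slot back once the query has finished.
        self._slots.release()

    def run(self, query_fn):
        """Wait for a free slot (deferral), then run the query on its own thread."""
        result = {}

        def worker():
            with self._slots:  # blocks until a slot is available
                result["value"] = query_fn()

        thread = threading.Thread(target=worker)
        thread.start()
        thread.join()
        return result["value"]
```

`try_admit` corresponds to the "begin at once" decision, while `run` shows the deferred case: the worker thread simply waits until a slot frees up.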
Stage 3: Query Processing • The Relational Query Processor executes the gate agent’s query • Checks that the agent is authorized to run the query • If authorized, compiles the SQL query text into an internal query plan. • Query Parsing, Query Rewrite, Query Optimization • Once compiled, the Plan Executor handles the query plan. • Invokes relational operators
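One way to see the result of compilation is to ask an engine for the plan it chose. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` statement through Python's `sqlite3` module; SQLite is only a stand-in for the DBMS in the slides, and the helper names, the assumed two-column schema, and the index `idx_flight` are hypothetical.

```python
import sqlite3

SQL = "SELECT name FROM Passenger WHERE flight = ?"

def make_db(with_index):
    """Create an empty Passenger table, optionally with an index on flight."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Passenger (name TEXT, flight TEXT)")
    if with_index:
        # Hypothetical index; it gives the optimizer an alternative to a full scan.
        conn.execute("CREATE INDEX idx_flight ON Passenger (flight)")
    return conn

def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the textual description of the step.
    return [row[-1] for row in rows]
```

Comparing `query_plan(make_db(False), SQL, ("510275",))` with the indexed variant shows how the same SQL text can compile to different plans. The exact wording of the plan text varies between SQLite versions.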
Relational Operators (1) • Selection • File scan, B-Tree, Hash Index (Equality Selection) • Projection • Remove unwanted attributes • Eliminate any duplicates • Implement via sorting or hashing • Join • Nested Loops Join, Sort-Merge Join, Hash Join
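The three operators above can be sketched over in-memory rows. This is a simplified illustration, assuming rows are Python dictionaries that fit in memory: selection is shown only as a file scan, projection removes duplicates by hashing, and the join is a hash join. The function names are made up for the example.

```python
def select(rows, predicate):
    """Selection by file scan: test every row against the predicate."""
    return (row for row in rows if predicate(row))

def project(rows, attrs):
    """Projection: keep only `attrs` and drop duplicates using a hash set."""
    seen = set()
    for row in rows:
        out = tuple(row[a] for a in attrs)
        if out not in seen:
            seen.add(out)
            yield dict(zip(attrs, out))

def hash_join(left, right, key):
    """Equi-join: build a hash table on `left`, then probe it with `right`."""
    table = {}
    for l_row in left:
        table.setdefault(l_row[key], []).append(l_row)
    for r_row in right:
        for l_row in table.get(r_row[key], []):
            yield {**l_row, **r_row}
```

Because each operator consumes and produces an iterable of rows, they compose directly, for example `project(select(rows, pred), ["name"])`.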
Relational Operators (2) • Set operations • Union, Intersection, Difference, Cross product • Implemented via sorting or hashing • Aggregation • SUM, MIN, MAX, COUNT, AVG • Data cube • Implemented via sorting or hashing • Sorting • Produces ordered output, a property that can speed up later steps of a query
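To make the "sorting or hashing" choice concrete, here is a small sketch of a hash-based aggregation and a sort-based union. Both assume the input fits in memory; the function names and the idea of grouping on one column and aggregating another are assumptions made for the example.

```python
from collections import defaultdict

def hash_aggregate(rows, group_key, value_key):
    """Group rows by `group_key` (hashing) and compute the five aggregates per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {
        group: {
            "COUNT": len(values),
            "SUM": sum(values),
            "MIN": min(values),
            "MAX": max(values),
            "AVG": sum(values) / len(values),
        }
        for group, values in groups.items()
    }

def sorted_union(a, b):
    """Set union via sorting: sort the combined input, then skip adjacent duplicates."""
    out = []
    for item in sorted(list(a) + list(b)):
        if not out or out[-1] != item:
            out.append(item)
    return out
```

The union also returns its result in sorted order, which is the kind of property a later operator can exploit.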
Stage 4: Fetch Data from Transactional Storage Manager • Plan Executor’s operators request data • Transactional Storage Manager manages calls for • All data access (READ) • All data manipulation (CREATE, UPDATE, DELETE) • And ensures ACID properties of transactions • Get locks from the Lock Manager • Interact with Log Manager for recovery preparation
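The atomicity part of ACID can be observed from the outside with a short sketch. It again uses Python's `sqlite3` module as a stand-in engine; locking and logging happen inside SQLite and are not shown. The `NOT NULL` schema and the helper names are assumptions chosen so that the second statement of a transaction can fail.

```python
import sqlite3

def make_db():
    """In-memory table with one made-up booking (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Passenger (name TEXT NOT NULL, flight TEXT NOT NULL)")
    conn.execute("INSERT INTO Passenger VALUES ('Alice', '510275')")
    conn.commit()
    return conn

def rebook(conn, name, new_flight):
    """Delete the old booking and insert the new one as a single transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("DELETE FROM Passenger WHERE name = ?", (name,))
            conn.execute("INSERT INTO Passenger VALUES (?, ?)", (name, new_flight))
        return True
    except sqlite3.IntegrityError:
        # The INSERT violated NOT NULL, so the DELETE was rolled back as well.
        return False

def flight_of(conn, name):
    """Look up the passenger's current flight, or None if there is no booking."""
    row = conn.execute(
        "SELECT flight FROM Passenger WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

If `rebook` is called with `new_flight=None`, the insert fails and the earlier delete is undone, so the passenger keeps the original booking.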
Access Methods and Buffer Management • Access Methods • Algorithms and data structures for organizing and accessing data on disk. • Such as B-Tree, Hash, Bitmap Index • Buffer Management • Decides when and what data to transfer between disk and memory buffers.
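A buffer manager's core decision, which page to keep in memory and which to evict, can be sketched as follows. The slide does not name a replacement policy, so least-recently-used (LRU) is an assumption here, as are the class name and the `read_page` callback that stands in for a disk read. Writing dirty pages back to disk is left out.

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer pool with a fixed number of frames and LRU replacement."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page   # callable: page_id -> page contents ("disk" read)
        self.frames = OrderedDict()  # page_id -> contents, least recently used first
        self.misses = 0

    def get(self, page_id):
        if page_id in self.frames:
            # Hit: mark the page as most recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        # Miss: evict the least recently used page if the pool is full.
        self.misses += 1
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)
        self.frames[page_id] = self.read_page(page_id)
        return self.frames[page_id]
```

An access method such as a B-Tree lookup would call `get` for each page it touches, and only the misses would reach the disk.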
Stage 5: Unwinding the Stack • After data access, access methods return control to the query executor’s operators. • Operators generate result tuples. • Result tuples are placed in a buffer for the Client Communication Manager • The Client Communication Manager ships the result tuples back to the Client. • At the end of the query, the transaction is completed. • Each involved component then performs its clean-up.
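The unwinding order can be illustrated with pull-based operators written as Python generators. This is a simplified sketch, not how any particular DBMS is implemented: the operator names, the `log` list used to record clean-up, and the plain list used as the result buffer are all assumptions.

```python
def scan(table, log):
    """Leaf operator: hands rows up to its parent one at a time."""
    try:
        for row in table:
            yield row
    finally:
        log.append("scan closed")    # clean-up for the lowest layer

def filter_op(child, predicate, log):
    """Parent operator: pulls rows from its child and keeps the matching ones."""
    try:
        for row in child:
            if predicate(row):
                yield row
    finally:
        log.append("filter closed")  # clean-up for the layer above

def run_query(table, predicate):
    """Drive the plan to completion and collect result tuples in a buffer."""
    log = []
    buffer = list(filter_op(scan(table, log), predicate, log))
    return buffer, log
```

When the scan runs out of rows it cleans up first, and control then returns to the operator above it, so the log records the clean-up from the bottom of the stack upward.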
References • Joseph M. Hellerstein, M. Stonebraker and J. Hamilton. Architecture of a Database System. Foundations and Trends in Databases 1(2). 2007. • Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. Second Edition. McGraw-Hill Science. 2000.