Approximate Query Processing
By Kavya Reddy Musani.





Types of RDBMS

There are two types of RDBMS:

  • Operational/Transactional Databases:
    • Used for day-to-day operations.
    • The data changes rapidly.
  • Data Warehousing:
    • Useful for decision support and data analysis.
    • Not used for day-to-day operations.
    • The data doesn’t change rapidly.

Analysis of Data Warehousing
  • Decision support
  • OLAP tools
  • Statistics
  • Data Mining
Ways to improve Query Processing
  • Indexes
    • Indexes are used to speed up querying on huge databases.
    • The different kinds of indexes are B+ trees, Hash indexes, Bitmap indexes, etc.
  • Materialized Views
    • Materialized views improve query performance by precomputing expensive join and aggregation operations before query time and storing the results in the database.
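The two techniques above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a tuned setup: the `Sales` table, its three columns, the rows, and the names `idx_sales_state` and `sales_by_state` are all assumptions made for the example. SQLite has no materialized views, so a precomputed summary table stands in for one here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified Sales table used only for this sketch.
conn.execute("CREATE TABLE Sales (Product_ID INTEGER, State TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, "Texas", 10.0), (1, "Ohio", 5.0), (2, "Texas", 7.5), (1, "Texas", 2.5)],
)

# An index on State lets lookups by state avoid scanning the whole table.
conn.execute("CREATE INDEX idx_sales_state ON Sales (State)")

# Stand-in for a materialized view: run the aggregation once and store the result.
conn.execute(
    "CREATE TABLE sales_by_state AS "
    "SELECT State, SUM(Price) AS total_price FROM Sales GROUP BY State"
)

# Later queries read the stored totals instead of re-aggregating Sales.
totals = dict(conn.execute("SELECT State, total_price FROM sales_by_state"))
print(totals)
```

Unlike a true materialized view, the summary table in this sketch is not refreshed when `Sales` changes; it would have to be rebuilt or updated explicitly.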
Example of Approximation
  • Suppose we have a “Sales” table recording product sales, with attributes Product_ID, Product_Name, Price, Quantity and State.
  • We want to know the sales of one product in each state of a country.
  • So we write a query that outputs the total sales of that product in each state.
  • Example query (:product_id is a placeholder for the product of interest):

SELECT State, SUM(Price) AS total_price FROM Sales WHERE Product_ID = :product_id GROUP BY State;


  • We can round total_price to a coarse value so that the behaviour of the product’s sales across states is easier to compare.
  • For example, if the total_price for Texas is 56788.9866 dollars, we can approximate it as 60000.
  • We then need to report the error of the approximation in an accompanying column.
  • This error is usually expressed as a confidence interval: a range around the approximate value that is expected to contain the exact value with a stated probability.
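The Texas figure can be worked through in a few lines of Python. This is only a sketch of the rounding step: the helper name `round_with_error` and the choice of rounding to the nearest 10,000 are assumptions for illustration. The error it reports is a plain rounding error, not a statistical confidence interval.

```python
def round_with_error(value: float, step: float) -> tuple[float, float]:
    """Round value to the nearest multiple of step; return (rounded, absolute error)."""
    rounded = round(value / step) * step
    return rounded, abs(rounded - value)


# Texas total from the example, rounded to the nearest 10,000 dollars.
approx, error = round_with_error(56788.9866, 10_000)
print(f"approximate total: {approx:.0f}, rounding error: {error:.4f}")
```

Reporting the error next to the rounded value tells the reader how far the displayed figure may be from the exact total.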
Synopsis Method
  • Compress the data into a smaller representation (known as a “synopsis”).
  • Execute the query against the synopsis to produce approximate answers.
Methods of Data Compression
  • Sampling.
  • Building models using statistics.
  • Curve Fitting.
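As one possible illustration of curve fitting as compression, the sketch below fits a straight line by ordinary least squares and keeps only the two fitted parameters in place of the data points. The helper `fit_line` and the monthly sales figures are hypothetical, and a straight line is just the simplest model that could be fitted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical monthly sales that happen to lie exactly on 10 + 2 * month.
months = [1, 2, 3, 4, 5, 6]
sales = [12.0, 14.0, 16.0, 18.0, 20.0, 22.0]

# The synopsis is just two numbers instead of six data points.
slope, intercept = fit_line(months, sales)

# Answer a query about month 4 from the model alone.
estimate = slope * 4 + intercept
print(slope, intercept, estimate)
```

With real data the points would not lie exactly on the line, so answers read from the model would be approximate.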
Types of Synopsis
  • Random Sampling (lossy compression).
  • Histograms.
  • Wavelets.
  • Learning Joint Distribution.
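To make the histogram entry concrete, here is a minimal sketch of an equi-width histogram used as a synopsis for a range-count query. The function names, the bucket count, and the price values are assumptions for illustration, and the estimate assumes values are spread evenly inside each bucket.

```python
def build_histogram(values, low, high, buckets):
    """Equi-width histogram: count of values in each of `buckets` ranges over [low, high)."""
    width = (high - low) / buckets
    counts = [0] * buckets
    for v in values:
        idx = min(int((v - low) / width), buckets - 1)
        counts[idx] += 1
    return counts, width


def estimate_count(counts, width, low, a, b):
    """Estimate how many values fall in [a, b), assuming an even spread within each bucket."""
    total = 0.0
    for i, c in enumerate(counts):
        start = low + i * width
        end = start + width
        overlap = max(0.0, min(b, end) - max(a, start))
        total += c * overlap / width
    return total


# Hypothetical price column; the synopsis keeps 4 bucket counts instead of 15 values.
prices = [5, 12, 18, 25, 31, 37, 44, 52, 58, 63, 71, 79, 85, 92, 99]
counts, width = build_histogram(prices, 0, 100, 4)

approx = estimate_count(counts, width, 0, 0, 40)  # answered from the histogram
exact = sum(1 for p in prices if 0 <= p < 40)     # answered from the raw data
print(counts, approx, exact)
```

The estimate differs slightly from the exact count because the query boundary cuts through a bucket, which is the price paid for the smaller representation.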
Random Sampling
  • Simple random sampling is a sampling procedure that ensures each element in the population has an equal chance of being selected.
  • Other random sampling techniques include systematic sampling, stratified sampling, cluster sampling and multi-stage sampling.
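Three of these techniques can be sketched with Python's standard random module. The function names, the (state, id) rows, and the 10% fraction are assumptions for illustration; cluster and multi-stage sampling are not shown.

```python
import random


def simple_random_sample(rows, k, rng):
    """Every row has the same chance of being picked."""
    return rng.sample(rows, k)


def systematic_sample(rows, step, rng):
    """Pick a random starting offset, then take every `step`-th row."""
    start = rng.randrange(step)
    return rows[start::step]


def stratified_sample(rows, key, fraction, rng):
    """Sample the same fraction from each group, keeping at least one row per group."""
    strata = {}
    for row in rows:
        strata.setdefault(key(row), []).append(row)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


rng = random.Random(0)
# Hypothetical table: 90 rows for one large state and 10 for a small one.
rows = [("TX", i) for i in range(90)] + [("RI", i) for i in range(10)]

srs = simple_random_sample(rows, 10, rng)
systematic = systematic_sample(rows, 10, rng)
stratified = stratified_sample(rows, key=lambda r: r[0], fraction=0.1, rng=rng)
print(len(srs), len(systematic), len(stratified))
```

In this sketch the stratified sample always contains a row from the small state, whereas a simple random sample of the same size may miss it entirely.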
Role of Random Sampling in Approximate Query Processing
  • Suppose we want to find the average salary of the employees of a big company for a particular year.
  • Reading every employee’s salary to compute the average slows down the processing.
  • So we draw a small random sample (the synopsis) from the huge table and compute the average salary on it.
  • This improves the speed of processing the query.
  • However, care should be taken when extracting the random sample, since a biased sample gives a biased answer.
  • If the sample is representative, we can keep it and answer further queries against the synopsis, which reduces the processing time considerably. An average computed on the sample estimates the full-table average directly; totals and counts are projected to the full table by dividing by the sampling fraction.
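The salary example might look like the sketch below. The salaries are synthetic, and the sample size and all variable names are assumptions made for illustration. The interval uses the usual normal approximation (1.96 standard errors for roughly 95% confidence), which is one common choice rather than the only one.

```python
import math
import random
import statistics

rng = random.Random(42)
# Synthetic salaries standing in for the large employee table.
salaries = [rng.uniform(30_000, 120_000) for _ in range(100_000)]

sample_size = 1_000
fraction = sample_size / len(salaries)  # sampling fraction (1% here)
sample = rng.sample(salaries, sample_size)

# The sample mean estimates the full-table mean directly; no scaling is needed.
mean_est = statistics.mean(sample)
# Approximate 95% confidence interval half-width from the standard error.
half_width = 1.96 * statistics.stdev(sample) / math.sqrt(sample_size)

# A total is scaled up by dividing the sample total by the sampling fraction.
total_est = sum(sample) / fraction

true_mean = sum(salaries) / len(salaries)
true_total = sum(salaries)
print(f"mean ~ {mean_est:.0f} +/- {half_width:.0f} (exact {true_mean:.0f})")
print(f"total ~ {total_est:.0f} (exact {true_total:.0f})")
```

The sample touches 1% of the rows, and the half-width printed with the estimate is the error indication that accompanies the approximate answer.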