1 / 17

Time in Databases

Time in Databases. CSCI 6442. With thanks to Richard Snodgrass, 1985 ACM 0-89791-160-1/85/005/0236. Agenda. Taxonomy of time in databases Point-in-time databases. The Problem.

eshe
Download Presentation

Time in Databases

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Time in Databases CSCI 6442 With thanks to Richard Snodgrass, 1985 ACM 0-89791-160-1/85/005/0236

  2. Agenda • Taxonomy of time in databases • Point-in-time databases

  3. The Problem • Suppose you are away from your company for 6 months assigned to work at an outpost in Australia, in the opal mines in Coober Peedy, underground, out of communication. • You are eligible for a raise in the midst of this. When you return they notice this. • Now it is approved and entered into the database within a few days, but you get the extra money from 3 months ago • Which times get recorded in the database?

  4. Snodgrass • Wrote “A Taxonomy of Time in Databases” (1985) • Regarded as the authoritative work • Defines three types of time

  5. Types of Time • There are three times: • Transaction time—when the change is made to the database • Valid time—when the change was approved • User-defined time—when the raise is effective (application-specific)

  6. Static Database • Database that reflects changing values with a snapshot at any particular time • Does not necessarily reflect the state of the real world • We discard past states of the database • Can’t answer questions about what past values were • Can’t show trends in change over time

  7. Static Rollback Database • Adds some time sequence • Ability to back out transactions is provided • Can back out one at a time to previous states of the database • Such a database shows the history of transactions rather than the history of the actual data

  8. What’s The Right Approach? • We might like to look at previous states of the database • We might be more interested in the evolution of the data over time rather than the transactions • In addition, recovery mechanisms provide us the ability to back out any transactions as desired

  9. Point In Time Databases* • A single database models the present and all past states of the data as it matters to the organization • Enough information about past states is stored so that the past state can be obtained using a SQL query • Instead of deleting rows they are marked as no longer valid • When values are updated, the old values are kept as well • This was impractical in 1970 because of cost and efficiency issues • Is it practical today? * Not Snodgrass’s terminology

  10. What We Store • With each change to a row, we insert a new row, instead of changing the previous row • Consider EMP(EMPNO, ENAME, SAL) • We add the attributes START and END • EMP(EMPNO, ENAME, SAL, START, END) • New attributes are start and end times for validity • When we change a row, that time is the END time for validity of the previous row, START time of validity for the new row • These times are usually what Snodgrass calls “user-defined” time because that’s the time that matters to the organization • His “valid time” is just a date of transaction approval • Transaction time is when entered into the database • Let’s call this time “effective time” because it’s the time when the row’s values are effective

  11. Point In Time Operations • Insert: • New row is added • Start time is set • Stop time is left null • Delete: • Stop time is set • Update: • New row added with new values, start time • Old row is kept and stop time is set • Read: • Current values: STOP ISNULL • As of values: ASOF >= START AND (ASOF < STOP OR STOP ISNULL)

  12. Example: Retroactive Pay Increase • Chen is given a salary increase • The salary increase is approved at date t1 • The increase is effective at date t2. • However, the increase does not get entered into the database until some other time, t3>t2 (called a retroactive salary increase). • In this case, t1 is an attribute of the approval process, t2 is the start of effective time and t3is the transaction time. • Most important to the organization is the time when Chen is getting the higher pay

  13. An Example: EMP Schema Definition: EMP(EMPNO,ENAME,JOB,SAL,DEPTNO) Now we add START and STOP for validity dates: EMP(EMPNO,ENAME,JOB,SAL,DEPTNO,START,STOP) • START gives the first time of validity for the values in the row • STOP gives the last time of validity for the values in the row • A row with no value for STOP is the current value

  14. Example To find the status of EMP on 1/1/2008 SELECT * FROM EMP WHERE START <= to_date( ‘01-JAN-2008','dd-mmm-yyyy') “1/1/2008” AND (STOP >= to_date( '01-JAN-2008','dd-mmm-yyyy') OR STOP ISNULL); DELETE statements will now be UPDATEs instead

  15. Complexities • Not every column value needs to be tracked • Can simplify by not tracking some column values • This is a semantic issue • Change in primary key • If primary key changes, hard to relate changed row to original row • Potential case for use of generated key • Cascading changes • Other tables may change because of one change • Use trigger instead of automatic foreign key update • Cascading deletes • Other tables may have rows deleted because of one change • Use trigger instead of automatic foreign key delete

  16. Capabilities • These techniques allow you to implement a database that can show its state as of any time in the past • This can be done while the database is actively being used and even updated by multiple processes • One process can be looking at the database as of two years ago while another is doing current processing • You can record retroactive changes • You can implement applications to process retroactive changes

  17. Bottom Line • This is a practical approach that potentially simplifies data models and processing • One table replaces duplication of data and completely different functions • One somewhat more complex function replaces two (or more) separate functions • Use can be presented a unifying interface that simplifies use of the application • It’s a tradeoff—sometimes use a separate history table

More Related