
Oracle Solutions for Data Archiving


Presentation Transcript


  1. Oracle Solutions for Data Archiving Luca Canali, CERN/IT-DM Distributed Database Operations Workshop November 11th, 2008

  2. Data life cycle management - WHAT • Use cases ultimately come from the experiments • They typically require removal of large chunks of data on a time-based selection • Related use case: make part of the data read-only

  3. Data life cycle management - WHY • Manageability • A database should not become a ‘data dump’ • Very large DBs are hard to manage • Performance • Some application-critical SQL statements may not scale well (full scans, hash joins, etc.) • More and more attention to SQL tuning is needed with very large tables • Cost • When the DB grows, extra HW resources are needed to maintain the service levels • e.g. extra spindles to provide the IOPS needed to read stale data

  4. Why it’s hard • There is no default Oracle mechanism for archiving parts of an application • It must be implemented as a custom solution • Rules on how to move and restore data must be put in place by the ‘data owners’ • What can help • Oracle features for bulk data movement • These typically require DBA privileges • Additional HW to hold the ‘offline data’

  5. Oracle and large data movements • Normal DML is too slow • delete operations in particular • Alternatives (two are sketched below) • DDL such as partition exchange/movement • Transportable tablespaces • CTAS (create table as select) • Data Pump
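A minimal sketch of two of these techniques, assuming a hypothetical EVENTS table range-partitioned by time and a staging table EVENTS_2008_01 with a matching column layout:

    -- Partition exchange: swaps the partition's segment with the standalone
    -- table as a metadata-only operation, so it is near-instant.
    ALTER TABLE events
      EXCHANGE PARTITION p_2008_01 WITH TABLE events_2008_01
      INCLUDING INDEXES WITHOUT VALIDATION;

    -- CTAS: bulk-copies a time-based slice using direct-path inserts,
    -- far faster than row-by-row DML.
    CREATE TABLE events_archive_2008 NOLOGGING PARALLEL 4
      AS SELECT * FROM events
         WHERE event_time < DATE '2009-01-01';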

  6. Oracle and large data sets • Tables need to allow the extraction of ‘chunks’ in a time-based manner • Oracle partitioning by time is one possibility (sketched below) • Beware of the many limitations of partitioning (it’s not a magic wand) • Do-it-yourself partitioning: generate multiple tables/schemas, for example one per year
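A sketch of time-based range partitioning; table and column names are hypothetical:

    -- Each year lands in its own partition, so a whole year can later be
    -- exchanged out or dropped as a single DDL operation.
    CREATE TABLE measurements (
      meas_id    NUMBER,
      meas_time  DATE NOT NULL,
      payload    VARCHAR2(4000)
    )
    PARTITION BY RANGE (meas_time) (
      PARTITION p_2007 VALUES LESS THAN (DATE '2008-01-01'),
      PARTITION p_2008 VALUES LESS THAN (DATE '2009-01-01'),
      PARTITION p_max  VALUES LESS THAN (MAXVALUE)
    );
    -- The do-it-yourself alternative is simply one table per period,
    -- e.g. MEASUREMENTS_2007, MEASUREMENTS_2008, ...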

  7. Additional challenge • Lookup tables/metadata need to be archived too • Besides the main table(s) to archive • This can involve a large number of tables • It is not easy to identify a ‘consistent set’ to archive
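One way to start mapping such a set is to walk the foreign keys in the data dictionary; a sketch, where LHC_APP and EVENTS are hypothetical names:

    -- List child tables whose foreign keys point at the main table,
    -- i.e. tables that must travel with it to keep the set consistent.
    SELECT c.owner, c.table_name, c.constraint_name
    FROM   dba_constraints c
           JOIN dba_constraints p
             ON  p.owner           = c.r_owner
             AND p.constraint_name = c.r_constraint_name
    WHERE  c.constraint_type = 'R'      -- referential (foreign key)
    AND    p.owner           = 'LHC_APP'
    AND    p.table_name      = 'EVENTS';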

  8. Technology • Main requirements • Must be able to restore the data if and when needed (over a time scale of 10 years, i.e. 3-4 Oracle versions) • Must keep track of what is where (see the sketch below) • Should use one of the few Oracle technologies suited to moving large chunks of data • Possible solutions • Move data to a specialized Oracle database that collects ‘application chunks’ • Export to files (flat files or Data Pump dumps) is not considered reliable given current experience
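What ‘keeping track of what is where’ could look like: a catalog table held in the archive database, one row per archived chunk (all names are hypothetical):

    CREATE TABLE archive_catalog (
      chunk_id        NUMBER        PRIMARY KEY,
      source_db       VARCHAR2(30)  NOT NULL,   -- production service name
      source_schema   VARCHAR2(30)  NOT NULL,
      source_table    VARCHAR2(30)  NOT NULL,
      time_range_lo   DATE          NOT NULL,   -- time-based selection bounds
      time_range_hi   DATE          NOT NULL,
      archive_schema  VARCHAR2(30)  NOT NULL,   -- where the chunk now lives
      archive_table   VARCHAR2(30)  NOT NULL,
      moved_on        DATE          DEFAULT SYSDATE,
      oracle_version  VARCHAR2(20)              -- version that wrote the chunk
    );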

  9. Possible Scenarios • Scenario 1 • Application owners identify consistent sets of data to be moved • The identified data sets are moved by the DBAs (sketched below) • Data is then dropped from production • requires considerable effort from the developers • gives the cleanest result
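A sketch of the DBA step, pulling an identified data set into the archive database over a database link; PROD_LINK, LHC_APP and LHC_ARCH are hypothetical names:

    -- On the archive DB: direct-path copy of the identified slice.
    CREATE TABLE lhc_arch.events_2008 NOLOGGING
      AS SELECT * FROM lhc_app.events@prod_link
         WHERE event_time < DATE '2009-01-01';

    -- Sanity check before anything is dropped from production.
    SELECT (SELECT COUNT(*) FROM lhc_app.events@prod_link
            WHERE event_time < DATE '2009-01-01') AS prod_rows,
           (SELECT COUNT(*) FROM lhc_arch.events_2008) AS archived_rows
    FROM   dual;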

  10. Possible Scenarios • Scenario 2 • A clone (complete copy) of the selected applications is taken (for example at the end of each year) and archived with the chosen archive technology • Application owners then take care of deleting from production, trusting that the data has been archived and is kept • easy to implement • more data than strictly needed is copied
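One way to take such a clone without intermediate dump files is a network-mode Data Pump job run on the archive database; a sketch, where LHC_APP and PROD_LINK are hypothetical names:

    -- Pull the whole application schema from production over a database
    -- link and land it under a year-stamped schema name in the archive DB.
    DECLARE
      h      NUMBER;
      state  VARCHAR2(30);
    BEGIN
      h := DBMS_DATAPUMP.OPEN(operation   => 'IMPORT',
                              job_mode    => 'SCHEMA',
                              remote_link => 'PROD_LINK');
      DBMS_DATAPUMP.METADATA_FILTER(h, 'SCHEMA_EXPR', 'IN (''LHC_APP'')');
      DBMS_DATAPUMP.METADATA_REMAP(h, 'REMAP_SCHEMA', 'LHC_APP', 'LHC_APP_2008');
      DBMS_DATAPUMP.START_JOB(h);
      DBMS_DATAPUMP.WAIT_FOR_JOB(h, state);
    END;
    /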

  11. Example of data movements
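One possible illustration, using the transportable tablespace route from slide 5; the tablespace name, directory object and file path are hypothetical:

    -- 1. Freeze the tablespace on production (this also covers the
    --    'make part of the data read-only' use case).
    ALTER TABLESPACE data_2008 READ ONLY;
    -- 2. Export its metadata with Data Pump (at the OS prompt):
    --      expdp system DIRECTORY=dpump_dir DUMPFILE=tts_2008.dmp
    --            TRANSPORT_TABLESPACES=data_2008
    -- 3. Copy the datafiles and the dump file to the archive host, then:
    --      impdp system DIRECTORY=dpump_dir DUMPFILE=tts_2008.dmp
    --            TRANSPORT_DATAFILES='/arch/oradata/data_2008_01.dbf'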

  12. Example of data purging after archive
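A sketch of the purge step itself, reusing the hypothetical names from the earlier sketches; it runs on production only after the chunk is confirmed present in the archive:

    -- DDL-based purge: dropping a partition is far faster than DELETE.
    ALTER TABLE lhc_app.events
      DROP PARTITION p_2008 UPDATE GLOBAL INDEXES;

    -- Do-it-yourself variant (one table per year): drop the whole table.
    -- DROP TABLE lhc_app.events_2008 PURGE;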

  13. Conclusions • LHC databases are growing fast • Offlining data offers significant performance and cost gains • Mechanisms for offlining data are application-dependent • DB services aim to provide ideas and tools to move and restore data • A large Oracle ‘Archive DB’ could be part of the solution

  14. Q&A Thank you
