1 / 48

Software Development using Production Data

Software Development using Production Data. By Karen Ambrose Wellcome Trust Sanger Institute (WTSI, UK). Monday 28 th April 2008 – OLSUG, BOSTON, MA. Disclaimer. The information contained within this presentation is based on systems at the Wellcome Trust Sanger Institute. Overview.

dahlia
Download Presentation

Software Development using Production Data

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Software Development using Production Data By Karen Ambrose Wellcome Trust Sanger Institute(WTSI, UK) Monday 28th April 2008 – OLSUG, BOSTON, MA

  2. Disclaimer The information contained within this presentation is based on systems at the Wellcome Trust Sanger Institute.

  3. Overview • About the Wellcome Trust Sanger Institute (WTSI) • What do we do? • Case Study – Sequencing pipeline • Development requirements • Smoke and Mirrors • Logical intervention – Logical Standby • Questions?

  4. About the WTSI • One of the leading genomics centres in the world. • Founded in 1993 by the Wellcome Trust (major funding provider) and the UK Medical Research Council (MRC). • Formerly the Sanger Centre. • Named after the double Nobel prize-winning biochemist Dr. Fred Sanger

  5. What do we do? • Human Genetics - e.g. Cancer studies, SNP, WTCCC, Copy Number Variation (CNV) • Model Organism & Pathogen Genetics -Mouse, Zebrafish, Pathogens

  6. What do we do? • Model Organism Sequencing – e.g. Human, Mouse, Yeast, Worm • Bioinformatics –Analysis, Annotation, Data Storage, Data Processing etc.

  7. What do we do? • Current Finished Sequence Total:  3,684,395,290 bases (25th April 2008) • Data produced is made freely available to researchers worldwide.

  8. Case Study – Sequencing Pipeline • High throughput sequencing application pipelines to support laboratory practices. • Mixture of new bespoke and legacy systems. • Multiple Oracle databases support a collection of co-operating production application systems. • Result of multi-developer effort over a period of 10 years.

  9. Mapping Viral sequencing Restriction digests Dna_reception Subcloning Exoseq EST Back end Picking Prepping Sequencing Finishing Devmin Corf Post sequence processing Assembly Epigenomics Internal trace Checking Genetrap External trace Gull/kit ETS EPS Archives Chromoview Abstract sequencing pipeline map

  10. Development requirements • Availability of “up to date” datasets for development testing. • Developer flexibility and autonomy. • Stable and robust development environment. • Ability to test component parts in isolation. • Easily define, store, recreate and test use cases.

  11. Smoke and Mirrors – 1st solution Using the production database system with an additional development database to support read/write processes.

  12. PRODUCTION/ PRIMARY DATABASE (100GB) PHYSICAL STANDBY DATABASE PHYSICAL STANDBY DATABASE (100GB) Archive Redo Logs DEVELOPMENT DATABASE (20GB) Manual copies of datasets Smoke and Mirrors - Architecture ORACLE 9i (9.2.0.5) on Compaq Tru64 UNIX V5.1

  13. Production/ Primary DB Development DB Production schema Public synonyms DS1 Private synonyms Across db links. SS1 Smoke and Mirrors - Setup • Setup new “special_” user on the production database i.e. SS1 • Setup new user on the development database i.e. DS1

  14. Development DB Production/ Primary DB Triggers replace FK which reference R/O production tables Public synonyms DS1 T Database links Production schema T1 SS1 Private synonyms Across db links. Development schemas with read/write Tables Smoke and Mirrors - Setup • Isolate read/write (r/w) and read only (r/o) table access for an application system. • Foreign keys replaced with triggers (where applicable).

  15. Development DB Production/ Primary DB Triggers replace FK which reference R/O production tables DS1 Production schema Public synonyms T Database links T1 S SS1 Private synonyms Across db links. T1@ Database links Development schemas with read/write Tables Smoke and Mirrors - Setup • For r/w access - Setup private synonyms over database links to the new development user. • For r/o access – Use public synonyms from production schemas or create private synonyms.

  16. Development DB Production/ Primary DB Triggers replace FK which reference R/O production tables Public synonyms Database links T1 DS1 T Production schema SS1 Private synonyms Across db links. Development schemas with read/write Tables S Database links T1@ Application Login Smoke and Mirrors - Setup • Run the software application using the “special_” new user login.

  17. Smoke and Mirrors - Setup • Run SQL check to ensure there are no leakages using the new special_ user. select * from all_tab_privs where grantee in (select granted_role from user_role_privs where username = (SELECT USER FROM DUAL)) UNION select * from all_tab_privs where GRANTEE = (SELECT USER FROM DUAL) UNION select * from all_tab_privs where GRANTEE = 'PUBLIC' and TABLE_SCHEMA not in ('SYS', 'SYSTEM', ‘WMSYS’, 'EXFSYS', 'DMSYS', 'XDB’) and PRIVILEGE != 'SELECT';

  18. Advantages Current dataset available for R/O access. Less disk space required for development database. Quick setup. Disadvantages Possibility of leakage into production database. Possible performance issues across database links. Less developer autonomy. Smoke and Mirrors

  19. Logical Standby – 2nd solution Using the logical standby technology, the development schemas and production schemas are within the same database without possibility of leakage onto the production database.

  20. PHYSICAL STANDBY DATABASE PHYSICAL STANDBY DATABASE (100GB) PRODUCTION/ PRIMARY DATABASE (100GB) Archive Redo Logs LOGICAL STANDBY DATABASE (120GB) Archive Redo Logs Logical Standby - Architecture SQL apply ORACLE 10GR2 (10.2.0.2) on SUSE LINUX (x86_64) SLES9

  21. Logical Standby - Setup Prepare the Production DB to support a logical standby. • Determine support for Datatypes and storage attributes. • Ensure table rows can be uniquely identified.

  22. Logical Standby – Setup Production DB - Dataguard parameters db_name=‘DB1’ db_unique_name=‘DB1’ log_archive_config='DG_CONFIG=(DB1,DB3)' log_archive_dest_3='service=DB3 valid_for=(online_logfiles,primary_role) db_unique_name=DB3 optional reopen=15' log_archive_dest_state_3=ENABLE

  23. Logical Standby - Setup • Create physical standby using RMAN Duplicate target database for standby dorecover; • Stop Redo apply. • Build redo dictionary on production db execute DBMS_LOGSTDBY.BUILD

  24. Logical Standby – Setup Physical Standby - Dataguard parameters *.db_name = ‘DB1’ *.db_unique_name=‘DB3‘ *.fal_client=‘DB3' *.fal_server=‘DB1' *.standby_file_management='AUTO'

  25. Logical Standby - Setup On the Standby • Transition the Physical to a Logical standby Alter database recover to logical standbynew_dbname; • Shutdown and amend the pfile parameters. *.db_name=‘DB3‘ *.standby_archive_dest='/oracle/lnnn/logstby/DB3'

  26. Logical Standby – Setup • Create a new password file. • Create new SPFILE. • Open database resetlogs. • Commence the SQL apply process. • Alter database start logical standby apply;

  27. Logical Standby – Development Setup • Alter Dataguard level to STANDBY. • Create separate tablespaces for new development users and objects. • Assign space quotas for each new user’s development schema. • Create new development user accounts in a standard format. *

  28. Logical Standby - Development Setup • Create tables in development schema for read write (r/w) access. • Use tables maintained by SQL apply process for read only (r/o) access. • Foreign keys are replaced with triggers (where applicable). ** • Run application with development user login.

  29. Production/ Primary DB Logical Standby DB Public synonyms Archived redo log transfer Production schema SQL apply R/O S T Production schema DS1 R/W T1 • Development schema areas • Private synonyms. • Local objects with R/W access. • Triggers replace FK. • Use public synonyms for R/O table access. Application login Logical Standby – Development Setup

  30. Development Area - Setup • Create additional development users in a standard format using the USER_ADMINISTRATION package PROCEDURE CREATE_USER Argument Name Type In/Out Default? ------------------------------ ---------------------- ------ -------- P_DEBUG BOOLEAN IN DEFAULT P_USERNAME VARCHAR2(30) IN

  31. Development Area - Setup PROCEDURE DROP_USER Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- P_DEBUG BOOLEAN IN DEFAULT P_USERNAME VARCHAR2(30) IN PROCEDURE VALIDATE Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- P_DEBUG BOOLEAN IN DEFAULT P_USERNAME VARCHAR2(30) IN P_MODE VARCHAR2 IN

  32. Development Area – Setup Package Usage Examples:- • exec USER_ADMINSTRATION.CREATE_USER(TRUE,’DEV_user’); Output as follows:- CREATE USER DEV_user IDENTIFIED BY user DEFAULT TABLESPACE DEV_TB_01 TEMPORARY TABLESPACE DEV_TMP_01 QUOTA 1024M ON DEV_TB_01 GRANT RESOURCE, CONNECT, DEV_USER_ADMINISTRATION TO DEV_user CREATE SYNONYM DEV_user.QUICK_FK_TRIGGER_PACKAGE GRANT EXECUTE ON QUICK_FK_TRIGGER_PACKAGE TO DEV_user

  33. Development Area - Setup • NO foreign keys permitted from the development sandbox to SQL apply maintained schemas. 2 solutions: • Copy the required table locally and create the FK within the development area. • Use a package which replaces FK's with triggers.

  34. Development Area - Setup • Create triggers to replace foreign keys between developer sandboxes and schemas maintained by SQL apply process using the QUICK_FK_TRIGGER package. PROCEDURE QUICK_CREATE_FK_TRIG Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- LOCAL_TABLE VARCHAR2 IN REFERENCE_TABLE VARCHAR2 IN REFERENCE_DBLINK VARCHAR2 IN DEFAULT DEBUG NUMBER IN DEFAULT

  35. Development Area – Setup Package Usage Examples:- • exec QUICK_FK_TRIGGER_PACKAGE. QUICK_CREATE_FK_TRIG('DEV_user.table_name.column_na me',‘ref_user.table_name.column_name',[db.domain|NULL],1); e.g.QUICK_FK_TRIGGER_PACKAGE.QUICK_CREATE_FK_TRI G('DEV_USER.FINISH_BATCH.PROJECTNAME',‘REF_USER.PR OJECT.PROJECTNAME','testdb.world',1);

  36. Development Area - Setup CREATE OR REPLACE TRIGGER AG_PROJECTNAME_FINISH_B_BR_IU BEFORE INSERT OR UPDATE ON FINISH_BATCH FOR EACH ROW DECLARE FK_ENTRY FINISH_BATCH.PROJECTNAME%TYPE; BEGIN ---------------------------------------- -- AUTO GENERATED TRIGGER FROM THE -- QUICK_FK_TRIGGER_PACKAGE -- -- NAME: AG_PROJECTNAME_FINISH_B_BR_IU -- AUTH: kva -- DATE: 05-DEC-07 ---------------------------------------- FK_ENTRY:= :NEW.PROJECTNAME IF FK_ENTRY IS NOT NULL THEN QUICK_FK_TRIGGER_PACKAGE.QUICK_FK_CHECKS(FK_ENTRY,’REF_USER',‘PR OJECT',‘PROJECTNAME',‘TESTDB.WORLD'); END IF; END AG_PROJECTNAME_FINISH_B_BR_IU;

  37. Development Area - Setup PROCEDURE QUICK_FK_CHECKS Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- VAL VARCHAR2 IN REFOWNER VARCHAR2 IN REFTABLE VARCHAR2 IN REFCOL VARCHAR2 IN REFDBLINK VARCHAR2 IN DEFAULT DEBUG NUMBER(38) IN DEFAULT

  38. Development Area - Setup PROCEDURE QUICK_FK_CHECKS • If no DBLINK specified SELECT projectname FROM ref_user.project WHERE projectname = ‘PROJECT1’; • If DB link specified SELECT projectname FROM ref_user.project@testdb.world WHERE projectname= ‘PROJECT1; • EXCEPTION RAISED FOR MISSING ENTRY RAISE_APPLICATION_ERROR(-20000,'Entry does not exist for PROJECT1 - MISSING ENTRY IN ref_user.project.projectname');

  39. Development Area - Setup • Any attempt to alter the schema being maintained by the SQL apply process, raises the following error: ORA-16224: DATABASE GUARD IS ENABLED

  40. Advantages Current dataset for RO access. All development contained within one database. More developer autonomy. No direct interaction with the production DB. Full copy of production data. Disadvantages More disk space required (1 database plus development area) Security of sensitive production data. Logical Standby

  41. Maintenance & Monitoring • Skipping sensitive schemas by executing the DBMS_LOGSTDBY.SKIP package. exec dbms_logstdby.skip('SCHEMA_DDL',‘schema','%'); exec dbms_logstdby.skip('DML',‘schema','%');

  42. Maintenance & Monitoring • Various views to monitor the processes which maintain the Logical Standby database.

  43. Summary • Both solutions provide access to current datasets for application integration and development testing. • The Smoke and mirrors solution is relatively quick to setup. • Logical standby is a more stable solution and provides more development flexibility and scaling.

  44. Conclusion The Logical Standby has currently proved to be a more stable and popular solution for our immediate development requirements.

  45. Future Plans • Investigation of Rapid Application Testing (RAT), new in Oracle 11g.

  46. Web References • LOGICAL STANDBY CREATION -10gR2 (ORACLE) http://download.oracle.com/docs/cd/B19306_01/server.102/b14239/cre ate_ls.htm#SBYDB00300 • LOGICAL STANDBY BY Julian Dyke http://64.233.183.104/search?q=cache:2VixdNn9HN0J:julian.dyke.user s.btopenworld.com/com/Presentations/LogicalStandby.ppt+logical+stan dby+-+julian+Dyke&hl=en&ct=clnk&cd=2&gl=uk • LOGICAL STANDBY MANAGEMENT -10gR2 (ORACLE) http://download.oracle.com/docs/cd/B19306_01/server.102/b14239/manage_ls. htm

  47. Acknowledgements • DBA group @Sanger Institute (UK) • Production Development group @Sanger Institute (UK) • Wellcome Trust Sanger Institute (UK)

  48. Questions? Contact kva@sanger.ac.uk, if you require more information.

More Related