html5-img
1 / 54

IDS Extensibility for Business Advantage

Informix User Forum 2005 Moving Forward With Informix. IDS Extensibility for Business Advantage. Jacques Roy Technical Lead/IBM. Atlanta, Georgia December 8-9, 2005. Agenda. What is Database Extensibility?

marsha
Download Presentation

IDS Extensibility for Business Advantage

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Informix User Forum 2005Moving Forward With Informix IDS Extensibilityfor Business Advantage Jacques Roy Technical Lead/IBM Atlanta, Georgia December 8-9, 2005

  2. Agenda • What is Database Extensibility? • Database and Extensibility background • IDS features • Extensibility Examples • DataBlade Modules • Bladelets • Considerations for building your own

  3. What is Database Extensibility? • Ability to add business components in the database • Tailor the database to the business environment • Reduce application complexity • Put the processing where it makes the most sense • Set processing provided by the database • Higher performance • Less data movement, better data representation, better indexing • Faster development, lower maintenance cost Databases are not commodities!

  4. Database Background • Hierarchical/Network models • Tied to the physical representation of the data • Solves one problem very well • Relational Model • Logical organization of the data • Multiple views of the data • Object-Relational Model (extensibility) • Business representation and processing • Adapts to the business environment

  5. Extensibility Background • Historic Approach • user exits • device drivers • Relational Databases • triggers, constraints • Stored Procedures • New development platforms • Web Server: CGI, NSAPI/ISAPI • App Servers: J2EE • Object-Relational Capabilities • Relational Framework Databases are not commodities!

  6. DB Corporation Regions Branches . . . . . . . . . . . . . . . Accounts . . . Objects Instantiation Problem Ex: 10 regions 100 branches 10,000 accnts Over 10M objs And 20M msgs At 1KB per object you use 10GB of memory!

  7. User-defined types Distinct, opaque Table/Type inheritance Functional index R-Tree index Smart Large Objects More. . .Extensions can be written in:SPL, C, Java Complex types Row, set, multiset, list Polymorphism User-Defined Function User-Defined Aggregate Primary/Secondary Access Methods IDS Extensibility Features

  8. CPU cache System bus Memory Disk Controller Performance Relativity • 336MHz SPARC  5200MB/sec • Memory  1300MB/sec • System Bus  100MB/sec • SCSI Bus  20MB/sec • SCSI Disk  0.24 to 80MB/sec • Track-to-track seek time, rotation latency, etc. • Network  1 to 10MB/sec • Users  0.00001 MB/sec(600 char/min, ~120 words/min)

  9. When to Use UDFs/UDRs • Eliminate large data transfer • Transfer time is much larger than processing time • Simplify or improve processing • Eliminate table scan • Eliminate application set processing • Define new sorting order • Replace store procedures with aggregates (IDS only) • Provide business processing to applications • Consistency of result • Eliminate the need for custom interfaces Databases are not commodities!

  10. Better Grouping • Quarter() function:SELECT Quarter(date), SUM(income)FROM ordersWHERE Quarter(date) LIKE '2003Q%‘GROUP BY 1ORDER BY 1; • AgeGroup() function:SELECT AgeGroup(birthdate, 20, 10) AgeGrouping, SUM(total_price) TotalFROM customer c, orders o, items IWHERE c.customer_num = o.customer_numAND o.order_num = i.order_numGROUP BY 1ORDER BY 1 DESC; AgeGrouping Total 40-49 $12646.97 50-59 $ 5507.80

  11. Quarter Function #include <mi.h> mi_lvarchar *quarter(mi_date date, MI_FPARAM *fparam) { mi_lvarchar *RetVal; /* The return value. */ short mdy[3]; mi_integer ret, qt; char *pdate, buffer[10]; ret = rjulmdy(date, mdy); /* Extract month, day, and year from the date */ qt = (mdy[0] - 1) / 3; /* calculate the quarter */ qt++; sprintf(buffer, "%4dQ%d", mdy[2], qt); RetVal = mi_new_var(strlen(buffer)); mi_set_vardata(RetVal, buffer); /* Return the function's return value. */ return RetVal; }

  12. Compiling and Linking COMPILE=-I$(INFORMIXDIR)/incl/public -O -c cc -DMI_SERVBUILD $(COMPILE) quarter.c ld -G -o mylib.bld quarter.o chmod a+x mylib.bld (775 or read only to other)

  13. Creating the function CREATE FUNCTION quarter(date) RETURNS varchar(10) WITH (not variant, parallelizable) external name "$INFORMIXDIR/extend/class/qlib.bld(quarter)" LANGUAGE C; GRANT EXECUTE ON FUNCTION quarter (date) TO public;

  14. Removing the function DROP FUNCTION quarter(date);

  15. UDR Processing Stored Procedures SQL group UDR order order scan scan scan scan

  16. Better date manipulation: Day of the year Week of the year Week of the month Quarter Unit conversion Feet  Meters Gallons  Liters Fahrenheit  Celsius Functional indexes CREATE FUNCTION quarter(dt date) RETURNS integer WITH (NOT VARIANT) RETURN (YEAR(dt) * 100) + 1 + (MONTH(dt) - 1) / 3; END FUNCTION; Example: EXECUTE FUNCTION quarter('9/2/2005'); (expression) 200503 Using SPL for Extensions See Developer Works Informix zone:www-128.ibm.com/developerworks/db2/zones/informix/

  17. 1.0 1.1 1.2 1.3 1.2.3 1.2.2 1.2.1 1.2.3.2 1.2.3.3 1.2.3.4 1.2.3.4.5 The Node Type • "Hierarchically" aware type: Node • Pre-processed the hierarchical relationships • ex: Chapter 11, section 7, paragraph 3: 11.7.3 • Add new way to relate objects to one another • IsAncestor(), IsChild(), IsDescendant(), IsParent() • Can change processing from exponential to linear • examples: • policies, product classification,bill-of-material, LDAP, XML, etc.

  18. Node Application Example • Geo Hierarchy: • country > state > metro > city GeoNode Policy Resource Q: What policy apply to the Hyatt in Denver? A: A Colorado Policy

  19. component Parts Bill-of-Material Example CREATE TABLE Components ( ComponentId Node, Sequence integer, Quantity integer, Name varchar(30), PartNumber integer); CREATE TABLE Parts ( PartNumber integer, Name varchar(30), ProviderId integer, Price Money(10,2) ); • A component can be made up of multiple components • A component is made up of multiple parts

  20. Node performance vs standard relational Databases are not commodities!

  21. For more information on the Node Type “Downloadable bladelets and demos”http://www-106.ibm.com/developerworks/db2/zones/informix/library/samples/db_downloads.html Or “Open-Source Components for Informix Dynamic Server 9.x”Jacques Roy, William W. White, Jean T. Anderson, Paul G. BrownISBN 0-13-042827-2

  22. What is XML? • XML is a Structured Document Definition • It Represents a Hierarchy of Elements • Two Main Types of XML Documents • Document and Data • Two Ways to Process XML Documents • Text Retrieval System • Explode the XML Document into Columns

  23. Generating XML • Converting rows to XML formatSELECT genxml("customer", customer) FROM customer;orSELECT genxml("stats",ROW(customer_num, COUNT(*))) FROM cust_calls GROUP BY customer_num; • Creating a complete XML documentEXECUTE FUNCTION genxmlhdr("customer_set", "SELECT * FROM customer"); • User-Defined Aggregate:SELECT aggrxml(customer, "customer_set")FROM customer; See: Generating XML from IDS 9.x//www-106.ibm.com/developerworks/db2/zones/informix/library/techarticle/0302roy/0302roy2.html

  24. Other XML Possibilities • Decompose (Shred) an XML document into a table row • Generate rows from an XML document stored in the database • Use a generic table to keep track of the element type and position in the hierarchy • Use the XML Path language to extract values from an XML document • /table/row/lname • /table/row[1]/lname • /table/row[lname='Pauli']/customer_num • /table/row[@operation='I']/lname • /table/*/lname Stay Tuned to DeveloperWorks!

  25. Fine-Grained Auditing • Use triggers and user-define routine • Register event processing • Commit or rollback • Send events to file or outside process • Use a generic function for any table • Available in IDS 9.40.xC4 or later See the developerworks article:Event-driven fined-grained auditing with Informix Dynamic Server//www-106.ibm.com/developerworks/db2/library/techarticle/dm-0410roy/

  26. New Trigger Use CREATE TRIGGER tab1instrig INSERT ON tab1 FOR EACH ROW ( EXECUTE PROCEDURE do_auditing2() );

  27. Commit/Rollback MonitorProgram Callback Statement Register 8 4 7 3 2 1 6 5 Trigger Table EventTable Event-Driven Architecture

  28. loans Retail Financial Services Healthcare Telco Manufacturing Clothing Food Novelties Inheritance and Polymorphism

  29. Inheritance and Polymorphism (cont.) • SELECT branch_id, AVG(risk(loans)) FROM loans GROUP BY 1HAVING AVG(risk(loans)) > 1ORDER BY 2 DESC; • SELECT branch_id, AVGRISK(loans) FROM loans GROUP BY 1HAVING AVGRISK(loans) > 1ORDER BY 2 DESC;

  30. Replacing Store Procedures with UDA • Business problem: Merger multi-polygon types (ESRI) • Original Solution: • SPL stored procedure (82 lines, 3 SELECT statements, 1 insert statement, 2 embedded FOREACH) • New Solution: User-Defined Aggregate • 23 lines of SPL, no SQL, no FOREACH SELECT a.user_id, a.case_id, a.case_type, a.serial_nr_full,a.spt_disp_theme_cd, do_union(a.shp) shpFROM case_shp aWHERE user_id = "user0001“GROUP BY 1, 2, 3, 4, 5INTO TEMP my_temp_table WITH NO LOG;

  31. User-Defined Aggregate Code CREATE FUNCTION do_union_iter(state lvarchar, arg lvarchar) RETURNING lvarchar DEFINE retval lvarchar; IF (state IS NULL ) THEN RETURN arg ; END IF IF (arg IS NULL) THEN RETURN state ; END IF LET retval = state || arg ; RETURN retval ; END FUNCTION ; CREATE AGGREGATE do_union WITH (ITER=do_union_iter, COMBINE=do_union_iter);

  32. User-Defined Aggregates • A lot more flexible than standard aggregate functions • Can take any type as input • ex.: row type • Can take an initialization parameter of any type • ex: row type • Can return complex types • ex: list

  33. Fabric Classification • Business Problem:Provide an efficient way to select fabrics • Indexing Colors:Cielab coding (3 dimensions) and other attributes(Use the R-tree multi-dimensional indexing method) • Fabric Type HierarchyRequires a hierarchy-aware typeex: We want a “natural” fabric • Fabric Style and Patternsex: What does “Victorian” mean?

  34. Other examples • Other multi-dimensional problems: • 3D CAD drawings • Support for Globally Unique Identifier (GUID)http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0401roy/index.html • Implementation of unsigned integer • Soundex/Phonex

  35. Key to Successful Extensibility Project • Start small, develop your expertise • remember the first OO projects • Use pre-built extensions • Study examples • Approaches: • Use DataBlade Modules • Use Bladelets • Build your own

  36. What are DataBlade Modules? • Business Components • A set of functionality that solves a specific business problem • Building Blocks • Can include: • User-defined types • User-defined Functions • Tables, views • Client component • An DataBlade can come from IBM, a third party or be built in-house

  37. Spatial Free of charge with IDS 9.30 and higher Geodetic TimeSeries NAG Real-Time Loader Large Object Locator (LLD) Comes with IDS 9.x MQSeries Included in IDS 10.0 xC3 C-ISAM Image Foundation Video Foundation Web Excalibur Text Excalibur Image XSLT IDS DataBlade Modules

  38. Spatial is Everywhere • Where are my stores located related to my distributors? • How can I efficiently route my delivery trucks? • How can I micro-market to customers fitting a particular profile near my worst performing store? • How can I set insurance rates near to flood plain? • Where are the parcels in the city that are impacted by a zoning change? • Which bank branches do I keep after the merger based on my customers locations (among other things)?

  39. Complex Spatial Example CREATE TABLE e_Yellow_Pages ( Name VARCHAR(128) NOT NULL, Business Business_Type NOT NULL, Description Document NOT NULL, LocationGeoPoint NOT NULL, Occupied SET( Period NOT NULL ) ); -- “Show me available service stations specializing -- in Porsche brakes within 30 miles of where -- I am?” -- SELECT Y.Name FROM e_Yellow_Pages Y WHEREContains ( Y.Location, Circle( :GPS_Loc, ‘30 Miles’ ) ) AND Y.Business MATCH “Automotive Service” ANDDocContains(Y.Description, “Porsche AND brakes”) AND NOTBooked ( Y.Occupied, Period (TODAY,TODAY+5)); • Raise the level of abstraction at the database level. • Buy components to solve common problems, and build components to achieve a competitive edge

  40. TimeSeries DataBlade Module • A time series is a set of data as it varies over time • TimeSeries DataBlade optimizes storage usage • 50% Savings not uncommon • Optimized Access Time • 10 times performance improvement typical • Calendars • Defines period of time that data may or may not be collected • SQL, Java, and C interfaces • VTI Interface • Makes a time series look like a regular table • Office Connect Interface • Web enables Microsoft Excel plug-in

  41. Who’s Interested in TimeSeries • Capital Markets • Arbitrage opportunities, breakout signals, risk/return optimization, portfolio management, VaR calculations, simulations, backtesting... • Telecommunications: • Network monitoring, load prediction, blocked calls (lost revenue) from load, phone usage, fraud detection and analysis... • Manufacturing: • Machinery going out of spec; process sampling and analysis • Logistics: • Location of a fleet (e.g. GPS); route analysis • Scientific research: • Temperature over time...

  42. The NAG DataBlade Module • NAG: Numerical Analysis Group • over 30 years experience • Experts in numerical and statistical computation • Reputation for accuracy and performance • Business functions: convert timeseries data to vectors and vectors of returns; present value calculations; ErlangB (network) calculations; data export functions • NAG Fortran functions: correlation and regression analysis; variance-covariance matrix generation; optimization; eigenvectors, eigenvalues; ... • Examples: • Oil industry: temperature variability down an oil well • Finance: calculate volume weighted average price and volatility • Spatial routing: Find the shortest path between two points

  43. IDS Bladelets • Located at:http://www-106.ibm.com/developerworks/db2/zones/informix/library/samples/db_downloads.htmlandhttp://www.iiug.org/ver1/software/index_ORDBMS.html • Bladelets: • mrLvarchar, Node, regexp, shape, exec, period, etc. • For detailed information, see: "Open-Source Components for Informix Dynamic Server 9.x "Jacques Roy, William W. White, Jean T. Anderson, Paul G. BrownISBN 0-13-042827-2 includes: Node, Period, ffvti, exec, shape, sqllib/IUtil, regexp, mrLvarchar, JPGImage

  44. mrLvarchar and regexp • Store data efficiently based on its length • Document data vary in length • Many business data set include such documents • Web pages, XML documents, product description, articles, etc. • mrLvarchar stays in row when shorter than 2KB, in BLOB otherwise • Includes useful functions • Snip, clip, concat, concatAll, Upper, Lower, Length, Instr, regexp functions • RegExp • regexp_match, regexp_replace, regexp_extract, regexp_split • Search on mrLvarchar • more complete search capabilities than LIKE and MATCHES

  45. Period • Time periods comparisons based on dates or datetimes • Manages information about fixed intervals in a timeline • Rich comparison functions • Equal(), NotEqual(), WithinNotTouches(), Within(), ContainsNotTouch(), Contains(), AfterTouches(), BeforeTouches(), Overlap(), Before() After() • R-Tree Indexing • Can be used in any type of scheduling • Hotel room reservations • Consultant scheduling • Equipment scheduling • Advantages: • simpler SQL, Range indexing (R-Tree), proper behavior of a range • Example: "Are there any situations where two different trains are scheduled on the same track over the next week?"

  46. New Applications • Tracking Systems (TimeSeries, Spatial) • RTD buses, U-Haul, Cell phones • Customer Services: closest location • Security Systems: face recognition, fingerprints, etc. • Handling hierarchical problems: • material types, policies, XML, bill-of-material, etc. WHAT ABOUT YOUR BUSINESS PROBLEMS?

  47. Building Your Own Extensions • Use the DataBlade Development Kit (DBDK) • Learn the DataBlade API • Many ESQL/C functions are also part of the DAPI • Review the latest documentation in the release directory • For More Information on the DataBlade API: "Informix Dynamic Server.2000:Server-Side Programming in C"Jacques Roy ISBN 0-13-013709-X

  48. Development Environment • SPL • Standard SPL environment • C • Include directory: $INFORMIXDIR/incl/public • Makefile generated by DBDK • DBDK Include file for Makefile: $INFORMIX/incl/dbdk/makeinc.<platform> • Java • Server configuration: onconfig parameters(see release notice and machine notes) • $(INFORMIXDIR)/extend/krakatoa/krakatoa.jar

  49. DBDK Components • GUI tool, runs only on Windows NT • Bladesmith • Defines/build DataBlade modules • Generates functional tests • Generates script info for install • Bladepack • Organize distribution • Generates installshield script • Blade manager • Stores info in the database • install/uninstall DataBlade Modules in the application database

  50. Programming Environment • Proprietary Thread implementation • Non-Preemptible Threads • Multiple Processes on UNIX • Multiple Threads on NT

More Related