Designing High Performance BIRT Reports Mica J. Block Director Actuate Corporate Engineers Actuate Corporation
Topics • Understanding generation performance • External factors • Overhead • Estimated Component Times • Performance tips • Generation • Rendering • NOTE: This presentation deals with the performance of the reports themselves regardless of the server technology being used.
“Pages-per-Second” Myth • Assumes all reports are equal • Ignores: • Number of report items per page • Complexity of the query • That pages are defined at render time • Impact of aggregates, and more… • Reality: • Report items-per-second is a better metric • Pages-per-second is comparable only between runs of the same report
External Factors • System: CPU, load • Raw CPU power • RAM • Overall load in server environment • JVM • User expectations • DB performance • Database design • Query design & optimization • Performance of vendor’s query features • Network overhead
Actuate Architecture (diagram) • Development Tier: IT builds reports, blueprints, metadata, & templates for different reporting styles • Storage, Data Access & Integration Tier: dedicated, secure storage locations for accessing data and storing project & report content • Content Production Tier: single, scalable cluster for generating content for different reporting styles • Presentation (Web/Portal) Tier: dedicated tier for accessing & presenting report & dashboard content to users • Client Tier (Web Browsers): users consume content according to their analytic objective
Estimated Component Times • Estimated from a simple listing report using a single table (10,000 rows) in SQL Server • Generation only; does not include rendering • Not a scientific methodology (done on a laptop) • Your mileage will vary • Use your own data • Try in your own environment • Focus on specific reports with problems
Estimated Component Times • Pages: little to no effect • Changed page break interval from 200 to 100, which doubles the number of pages • Adds < 2% to report run • Formatting: little to no effect • Added numeric and date formatting • Run was slightly faster • Groups: moderate to significant • Added two group levels to the simple listing • Adds ~5-20% to report run per group • Depends on the number of group breaks • Depends on how the data is sorted
Estimated Component Times • One-pass aggregates: moderate • Added two aggregates • Adds ~4% per aggregate to report run • Depends on number of groups • Look-ahead aggregates: significant • Total for group as percent of overall total • Adds ~2-8% per aggregate to report run • Depends on number of groups and number of data items • Charts: Very significant • One chart added ~33% to report run • One chart per group ~30-150% to report run • Depends on number of groups (i.e. charts).
Implications • Report generation depends on: • Number of report items • Presence of aggregates • Number of groups • Sorting of data • Presence of charts • Time per page depends on output format • Pages per second depends on layout • Halving the page break interval “doubles” pages-per-second without making the report any faster!
Performance Strategies • Use report items-per-second as a guide • Relatively fixed for a platform • Determine a time budget • How many report items can the report afford? • Performance strategies • Remove application-specific bottlenecks • Make report items work harder • Reduce impact of aggregates
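The time-budget idea above can be sketched as simple arithmetic. This is a minimal illustration, not a BIRT feature: the function name `item_budget` and all the numbers are hypothetical, and the items-per-second rate is assumed to have been measured on your own platform.

```python
def item_budget(time_budget_s: float, fixed_overhead_s: float,
                items_per_second: float) -> int:
    """Return how many report items fit in the time budget.

    fixed_overhead_s covers work that does not scale with report items
    (query time, startup, and so on); it is an assumed, measured value.
    """
    available_s = max(time_budget_s - fixed_overhead_s, 0.0)
    return int(available_s * items_per_second)


# Hypothetical figures: users accept 10 s, 2 s goes to the query,
# and the platform was measured at 5,000 report items per second.
budget = item_budget(10.0, 2.0, 5000.0)
print(budget)
```

If a report design needs more items than the budget allows, the strategies on this slide apply: remove bottlenecks, make each report item carry more data, or reduce the aggregate cost.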
How to Analyze Performance • Test functionality separately • Write timers to a log file in key areas • Collect run times • Remove all content from the report • Collect run times again • Difference is the cost of processing report items • Remainder is the per-row cost • Example:
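The subtraction described on this slide can be sketched as follows. The run times and the helper name `split_costs` are hypothetical; dividing the remainder by the row count to get an average per-row figure is an assumption added here for illustration.

```python
def split_costs(full_run_s: float, empty_run_s: float, rows: int):
    """Split a measured run time into report-item cost and per-row cost.

    full_run_s  : run time of the complete report
    empty_run_s : run time of the same report with all content removed
    rows        : number of rows the data set returns
    """
    # Difference between the two runs is the cost of processing report items.
    item_cost_s = full_run_s - empty_run_s
    # The remainder is spent fetching rows; average it per row (assumption).
    per_row_s = empty_run_s / rows if rows else 0.0
    return item_cost_s, per_row_s


# Hypothetical measurements: 12.5 s with content, 4.5 s without, 10,000 rows.
item_cost, per_row = split_costs(12.5, 4.5, 10_000)
print(f"report items: {item_cost:.1f} s, per row: {per_row * 1000:.2f} ms")
```

A large item cost points at layout, aggregates, or charts; a large per-row cost points at the query or the data source.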
General Observations • Report optimization is a trial-and-error effort • Some report optimization techniques require additional development time • These techniques are not necessary when reports already perform within user requirements • Apply them only to reports that need optimizing
Use Latest Version • Use the latest version of BIRT • Has many performance improvements • Do not use ‘Total’ functions • These functions are deprecated in BIRT 2.2.2 • They have some performance issues • Especially with filters
Optimize Database Access • Extra time from queries, DB overhead, computation, etc. • Minimize query time • Make sure query is optimized • Reduce the number of columns and rows returned • Reduce number of queries needed • Use stored procedures • Use materialized views
Optimize XML Access • XML is a versatile and powerful way to describe metadata and actual data in one file • BIRT has a “generic” XML ODA which uses an extremely efficient XPath algorithm to parse the results • “Generic” is great for solving a multitude of needs, but does not solve any single need especially well • If the XML Schema will not change and high user loads are required, specialized connectors should be built to improve overall system performance
Optimize XML Access • Java Architecture for XML Binding (JAXB) is a specialized Java API used to parse a fixed-schema XML data file quickly and efficiently • Upside – may be 10x faster than the “generic” XML ODA • Downside – if the XML Schema changes, the JAXB classes will need to be re-compiled • Downside – no UI exists to create data sets; JAXB classes must be used with a scripted data source • The same applies to the Web Services ODA
Filtering • BIRT enables filtering at different layers, such as in the table • Push filtering to the database (if possible) • Reduces the size of the result set • Extremely important with two-pass (look-ahead) aggregates
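To illustrate the difference between filtering late and pushing the filter to the database, here is a small sketch using an in-memory SQLite database. The `orders` table, its columns, and the data are hypothetical; the point is only how many rows leave the database in each case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "EMEA" if i % 4 == 0 else "APAC", float(i)) for i in range(1000)],
)

# Filter applied late (comparable to a table-level filter in the report):
# every row is fetched, then most of them are thrown away.
all_rows = conn.execute("SELECT id, region, amount FROM orders").fetchall()
filtered_late = [row for row in all_rows if row[1] == "EMEA"]

# Filter pushed to the database: only matching rows are returned.
filtered_early = conn.execute(
    "SELECT id, region, amount FROM orders WHERE region = ? ORDER BY id",
    ("EMEA",),
).fetchall()

print(len(all_rows), "rows fetched without push-down")
print(len(filtered_early), "rows fetched with push-down")
```

Both approaches produce the same report rows, but the second moves a quarter of the data in this example, and any two-pass aggregate then walks the smaller set.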
Sorting • When you add a group section, BIRT will automatically sort the data set in memory • There is no setting to tell BIRT that the data is already sorted • It is always better to push the sort to the database
Getting caught in a (Data) Bind • Applies as of BIRT 2.1.3; this will change in a future release with data set caching • Each report item with a specified data binding forces that data set to re-execute, once per binding • Bindings cascade down to contained report items (data bindings on a table cascade down to items inside the table) • In nearly all reports, each data set should have only one binding specified • Only extremely complex reports with interwoven data set requirements will require multiple bindings per data set • Joint Data Sets can be used in some cases to avoid multiple bindings on a single data set • Do not bind data sets on the Master Page
Aggregates • Aggregates: Sum( ), Count( ), Min( ), etc. • Two types • Running – done while creating the table • Look-ahead - requires two passes over data • For performance, review look-ahead type • Create a stored procedure to do calculation • Use a separate query • Use a data filter to merge totals into each row • Compare to out-of-box solution
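The two aggregate types can be contrasted with a plain-Python sketch. This is not BIRT code; the sample rows are hypothetical and the loops only mimic how many times the data has to be walked.

```python
rows = [("East", 40.0), ("West", 60.0), ("East", 100.0)]

# Running aggregate: one pass. The value is known as each row is emitted.
running_total = []
total_so_far = 0.0
for _, amount in rows:
    total_so_far += amount
    running_total.append(total_so_far)

# Look-ahead aggregate: "percent of overall total" needs the grand total
# before the first row can be emitted, so the data is walked twice.
grand_total = sum(amount for _, amount in rows)                          # pass 1
percent_of_total = [amount * 100 / grand_total for _, amount in rows]    # pass 2

print(running_total)
print(percent_of_total)
```

The alternatives listed on the slide (a stored procedure or a separate query) amount to having the database supply `grand_total` up front, so the report itself only needs a single pass.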
Charts • Good news: most time is spent in rendering (using drawing primitives in Swing) • Actual code is optimized • Size and resolution will impact performance • All points are loaded in memory • Avoid charts with many points • You can discern little more in a chart with 10,000 points than in one with 500 points • More points also take longer to render, as there is more to draw • Make sure you use the table binding, not the data set binding
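One way to keep the point count down is to thin the series before it reaches the chart. The sketch below uses simple stride sampling; the slides do not prescribe a method, so this technique and the helper name `downsample` are assumptions. Stride sampling can skip narrow peaks, so aggregating in the query (for example, one average per time bucket) is often the better choice.

```python
def downsample(points, max_points):
    """Return at most max_points items by taking every n-th point."""
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    if len(points) <= max_points:
        return list(points)
    step = -(-len(points) // max_points)  # ceiling division
    return list(points[::step])


series = list(range(10_000))       # hypothetical 10,000-point series
thinned = downsample(series, 500)  # chart receives 500 points instead
print(len(thinned))
```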
Charts • 3D charts might take more time, as they use a real 3D algorithm to sort surfaces • 2D charts with depth have no significant performance impact • Grouping inside charts is the number one cause of slowdowns • Chart engine uses a different grouping algorithm • Group the data in the data set instead • BIRT 2.3 will use the DTE grouping capabilities • Avoid extra markers, labels, shadows, gradients, etc. • They impact performance because they mean more shapes and fills to draw
General Tips • Reduce the number of report items • Concatenate values where it makes sense • e.g. First Name + Last Name in one report item • Avoid table data bindings when not used • Use the new Crosstab report item when appropriate, as it is tuned for such operations
Rendering Tips • PDF • Set appropriate page size in the master page • Will significantly decrease dynamic geometry • HTML • Avoid group sections with many items • Will cause a long TOC list and will impact viewing performance