1. May 2, 2012 Tamino Technical Overview John Fitzgerald
Business Integration Technologist
2. Agenda Introductions
Tamino Introduction and Overview
What is Tamino?
Tamino Server Architecture
Search and Retrieval
Document Management
Enterprise Class Features
APIs and Tools
Demo
What you can do Today!
3. What is Tamino?
Storage system (DBMS) for ...
Semi-structured data
XML documents, messages, metadata
Stores in native XML format
Multiple indexing methods
Unstructured data
Storage and indexing of non-XML objects
Images, video, audio, MS Office files, PDF, ...
Major Differentiators:
Robust & complete for mission-critical enterprise use
Built-in Internet File System
Superior XML-aware searching
Comprehensive developer support
Available for offline use
4. Why Use Tamino? Businesses use Tamino
For lowest TCO on managing XML and unstructured data
Store, find, re-compose, present multi-channel
Repurpose to save cost, time & resources
To increase development productivity
Faster through standards, many APIs & tools
Less effort to adapt to changes
To avoid vendor lock-in through open standards
Access 'all areas' via XQuery
Fit for SOA, e.g. UDDI, SOAP
For investment protection and mission-critical use
Provide secure & trusted access to existing back-end data
Deliver robust operation and protect against business outages
5. Tamino in Action Today Approx. 700 customers worldwide
Vodafone
Electronic Billing (> 1 Terabyte)
RTL
Highly available news editorial system providing information for TV, news broadcasters
Juris German Legislation
XML-based document management system (~600,000 documents)
Commerzbank
Rule-based transfer of all trading transactions to a standardized view (financial products)
Daimler Chrysler
Consistent management of all car testing and diagnostics information & fleet management
HellerBank
Automated handling of monthly reports across multiple channels
6. Tamino XML Server Architecture
7. Tamino XML Data Store
8. Tamino Data Map
9. XML Schema Support Complete Support for XML Schema 1.0 Specification
Industry Schema Support:
DocBook 4.4, FpML 4.1, METS 1.0, NewsML 1.2, SVG 1.0, UBL, VoiceML, Word 2003, XBRL 2.1
Full DTD Support
10. Tamino Search and Retrieval W3C XQuery Support
User-defined functions
If-Then-Else
Node-level update
XPath Support
Extended with text search
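The query features above can be illustrated with a small sketch that only builds a request URL and sends nothing. It is not Software AG's client API. The host, the database name (`mydb`), the collection (`hospital`) and the element names are hypothetical, and the `_xquery` request parameter, the `input()` function and the `tf:containsText` text-search function are assumptions about Tamino's HTTP interface that should be checked against the documentation for your version.

```python
from urllib.parse import urlencode

# Hypothetical server, database and collection names.
BASE_URL = "http://localhost/tamino/mydb/hospital"

# XQuery combining a text-search predicate with if-then-else.
# input() and tf:containsText are assumed Tamino extensions; verify before use.
XQUERY = """
for $p in input()/patient
where tf:containsText($p/remarks, "allerg*")
return if ($p/age > 65) then <senior>{ $p/name }</senior>
       else <adult>{ $p/name }</adult>
"""


def build_query_url(base_url: str, xquery: str) -> str:
    """Return a GET URL that carries the query in the (assumed) _xquery parameter."""
    return base_url + "?" + urlencode({"_xquery": xquery.strip()})


url = build_query_url(BASE_URL, XQUERY)
```

Keeping URL construction separate from the HTTP call makes it easy to inspect the encoded query before sending it to a server.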
11. Tamino Indexing and Retrieval Standard
Classical database indexes
Index any combination of elements and attributes
Supports relational operators, exact comparisons, sorting
Text
Use in conjunction with text retrieval functions
Supports wildcard searches
Structure
Index declared on the document
Registers instances of undeclared nodes
Reference
Indexes specific sub-trees of a document (e.g. /doc/a/b)
Useful for documents of high complexity (multiplicity of sub-trees)
Multipath
Index any element or attribute that meets an XPath expression
Compound
Index a combination of two elements (e.g. lastname and firstname)
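To make the idea of a compound index concrete, here is a toy sketch in plain Python. It does not show how Tamino implements its indexes; it only illustrates why indexing the pair (lastname, firstname) supports both exact lookups and sorted retrieval. The documents and field names are invented for the example.

```python
from bisect import bisect_left, bisect_right

# Toy documents keyed by a document id; all values are illustrative.
DOCS = {
    1: {"lastname": "Meyer", "firstname": "Anna"},
    2: {"lastname": "Meyer", "firstname": "Karl"},
    3: {"lastname": "Schmidt", "firstname": "Anna"},
}


def build_compound_index(docs, first_key, second_key):
    """Return a sorted list of ((first, second), doc_id) entries."""
    return sorted(((d[first_key], d[second_key]), doc_id) for doc_id, d in docs.items())


def lookup(index, first, second):
    """Find all document ids whose key pair matches exactly, via binary search."""
    key = (first, second)
    lo = bisect_left(index, (key, float("-inf")))
    hi = bisect_right(index, (key, float("inf")))
    return [doc_id for _, doc_id in index[lo:hi]]


INDEX = build_compound_index(DOCS, "lastname", "firstname")
matches = lookup(INDEX, "Meyer", "Karl")
sorted_ids = [doc_id for _, doc_id in INDEX]  # ordered by lastname, then firstname
```

Because the entries are kept sorted on the combined key, the same structure serves exact comparisons and sorting without scanning the documents.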
12. Tamino and Document Management Integrated Internet File System (WebDAV) for document management
Drag & drop storage/retrieval
Instant document validation
Higher performance & scalability
Check-In, Check-Out, Workspaces, ...
Embedded security via ACLs
XQuery for property searches
Versioning - on document level
via WebDAV
"natively" (auto-versioning)
Non-XML Indexer
Microsoft Office, PDF, ZIP...
Indexing is extensible
RFC 2518 WebDAV base specification
file system view on data
HTTP 1.1: PUT / GET / DELETE / OPTIONS / HEAD
COPY / MOVE / MKCOL / REPORT
properties (PROPFIND / PROPPATCH)
locking
RFC 3253 Versioning in WebDAV
basic features
VERSION-CONTROL, CHECKOUT, CHECKIN, LABEL
histories, workspaces, working resources
advanced features
UPDATE (fork)
RFC 3744 WebDAV-ACL
ACLs and ACEs similar to Tamino Security with hierarchical inheritance but instance-based
defines no authentication
authentication like in HTTP protocol
users and groups from Tamino are used
WebDAV Search (Internet Draft)
defines a search grammar for searching on properties
protocol is extensible (maybe with XQuery?)
Tamino supports DAV:basicsearch-grammar
Querying on WebDAV metadata is possible using XQuery
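The property access described in the WebDAV notes (PROPFIND) can be sketched with the Python standard library. This is a generic RFC 2518 illustration, not Tamino-specific code: it builds a PROPFIND request body and parses a 207 Multi-Status response. The sample response and the path `/docs/report.xml` are made up for the example.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # the WebDAV XML namespace defined by RFC 2518


def propfind_body(*prop_names: str) -> bytes:
    """Build a PROPFIND request body asking for the named DAV: properties."""
    ET.register_namespace("D", DAV)
    root = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV}}}prop")
    for name in prop_names:
        ET.SubElement(prop, f"{{{DAV}}}{name}")
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def parse_multistatus(xml_text: str) -> dict:
    """Map each href in a Multi-Status body to the properties returned for it."""
    result = {}
    for response in ET.fromstring(xml_text).findall(f"{{{DAV}}}response"):
        href = response.findtext(f"{{{DAV}}}href")
        props = {}
        for prop in response.iter(f"{{{DAV}}}prop"):
            for child in prop:
                # Strip the "{DAV:}" namespace prefix from the tag name.
                props[child.tag.split("}", 1)[1]] = (child.text or "").strip()
        result[href] = props
    return result


# Hypothetical server response for one resource.
SAMPLE = """<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/docs/report.xml</D:href>
    <D:propstat>
      <D:prop><D:getcontentlength>2048</D:getcontentlength></D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>"""

body = propfind_body("getcontentlength", "getlastmodified")
properties = parse_multistatus(SAMPLE)
```

In a real client the body would be sent with the `PROPFIND` method and a `Depth` header; the server URL and credentials depend on the installation.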
13. Non-XML Indexer Indexes proprietary Office documents on storage
Depending on MIME type
e.g. Word 2003, RTF, StarOffice, OpenOffice, PDF, MP3, ZIP, ...
Further arbitrary formats indexable via new, extensible infrastructure (Server Extension w/ open Java interface)
Stores XML-converted "shadow" documents in parallel to originals
Content searchable via XQuery
Original documents additionally stored 1:1
Modification with Tamino not supported
Non-XML Indexer:
Server extension which accepts non-XML documents as input and creates so-called XML shadow documents.
Just like any other XML document, these shadow documents can be stored, indexed, and queried in Tamino.
For doctypes that make use of the Non-XML Indexer querying is performed on the shadow documents.
Once a query has retrieved a document's ino:id or its ino:docname, you can access the non-XML document itself via plain URL addressing.
Indexing is supported by plugins for PDF, Microsoft Word, Rich Text Format and StarOffice. The generated shadow documents conform to the OASIS schemas.
Microsoft Office files: Microsoft Word / Excel
OpenOffice files: OpenOffice Writer/ Calc
StarOffice files: StarOffice Writer/ Calc
Adobe PDF files
Plain text files (UTF-8) / Plain text files
MPEG Audio files (often known as MP3 files)
RTF (Rich Text Format) files
Zip files
With Tamino version 4.4, the Non-XML Indexer implementation has been opened for extracting application-specific information from non-XML documents (e.g. GIF, JPG, etc.) via added custom plugins.
Furthermore, the Non-XML Indexer can now deal with text documents that need not be converted within Tamino.
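The shadow-document idea can be shown with a toy converter. This is only a conceptual sketch: it is not the Tamino server extension or its Java plugin interface, and the element names (`shadow`, `meta`, `content`, `p`) are invented rather than taken from the OASIS schemas mentioned above. It takes text extracted from a non-XML file and wraps it in XML so that it can be searched like any other document.

```python
import xml.etree.ElementTree as ET


def make_shadow_document(filename: str, mime_type: str, text: str) -> ET.Element:
    """Wrap extracted text and basic metadata in a queryable XML element."""
    shadow = ET.Element("shadow")  # hypothetical element names throughout
    meta = ET.SubElement(shadow, "meta")
    ET.SubElement(meta, "filename").text = filename
    ET.SubElement(meta, "mimetype").text = mime_type
    body = ET.SubElement(shadow, "content")
    # One <p> element per non-empty paragraph of the extracted text.
    for paragraph in filter(None, (p.strip() for p in text.split("\n\n"))):
        ET.SubElement(body, "p").text = paragraph
    return shadow


doc = make_shadow_document(
    "notes.txt", "text/plain", "First paragraph.\n\nSecond paragraph."
)
# A search runs against the shadow document, not against the original file.
hits = [p.text for p in doc.iter("p") if "Second" in p.text]
```

The original file would be stored unchanged next to the shadow document; the `filename` kept in the metadata stands in for the link back to it.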
14. Tamino Enterprise Features
Replication
Replicated databases available for parallel read access
High Availability (Hot Standby and Failover)
Cluster support
Used by Schiphol Airport, Euredit, RTL / MaxiMedia, Ideal, Sun, ...
Security
Support for LDAP and OS Security via Tamino Manager
2 Phase Commit
Available both for Java and .NET
Network drive support (NAS/SAN)
15. Tamino Tools and APIs Tamino APIs:
Java
.NET
C
SOAP
UDDI
Tamino Tools:
Schema Editor
X-Plorer
XQuery Editor
Interactive Interface
16. Performance Highlights & Market Acceptance ~ 1TB of Data in 3 Tamino DBs (Vodafone - Spain)
~ 1.125 billion logical reads / month (~ 430 reads/s at Migros Online - CH)
~ 180 million documents in 8 Tamino DBs (tested by IDEAL Greece)
~ 15,000 transactions daily via 400 concurrent users
~ 400 transactions daily via 2000 subscribed users
~ Tamino internal compression of 1:20 (non-XML), 1:4 (XML)
~ 1-3 sec application response time w/ 420 users querying every 10s (RTL - D)
~ Load 7.2 million docs/hr, 75 MB/s, 16 clients, 1 index/doctype (Commerzbank- D)
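The rates on this slide can be cross-checked with simple arithmetic. The sketch below assumes a 30-day month and decimal units (1 MB = 10^6 bytes); neither assumption is stated on the slide.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month

# Migros Online: 1.125 billion logical reads per month.
reads_per_second = 1.125e9 / SECONDS_PER_MONTH

# Commerzbank: 7.2 million documents per hour at 75 MB/s.
docs_per_second = 7.2e6 / 3600
avg_doc_kb = 75e6 / docs_per_second / 1e3  # implied average document size in kB
```

Under these assumptions the monthly figure works out to about 434 reads/s, in line with the ~430 reads/s quoted, and the load rate implies 2,000 documents per second at an average of 37.5 kB each.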
17. Tamino Differentiators
Integrated high-performance Internet file system (via built-in WebDAV)
Built-in versioning (via integrated WebDAV & natively)
Standards support: XML, XML Schema, Web services, XQuery, UDDI3
Enterprise features (HA support, Replication, 2-Phase-Commit)
Multiple indexes for efficient native XML storage, search, access and retrieval
Multiple XML documents & schemas allowed per DB
Support for efficient structure changes (schema evolution)
Structure-independent retrieval times
Rich tool & utilities set (Schema editor, XML-Indexer, text-retrieval, ...)
Smart disk space management (compression)
Available across multiple OS platforms
18. VODAFONE Multi-Channel Electronic Bills Presentation Mission
Allow customer information from disparate systems to be combined together to form a service that is unique -> competitive advantage
Solution
A system that allows customers and internal users to see their invoices and the billing information via Internet
Using XML, Tamino XML Server, Web-Logic and IXOS products
Result
Access invoices from the web in diverse formats (HTML, XML, PDF and Excel)
Minimize paper and mail delivery
Send billing information to clients through SMS
Feed Vodafone's System Data Warehouse
19. VODAFONE Multi-Channel Electronic Bills Presentation Performance Data
Cluster solution
HP11i-64
4 CPU, 2GB RAM
Total of >> 1 TB on 3 DBs [>250 .. 350 GB each]
1:40h backup or restore time for 250 GB DB
30 million docs, avg.~ 35kB [min. 9kB .. max. 180MB]
> 2.5 million bills / month (split into 1 MB chunks), 1 Doctype w/ 200 elements & 200 attribs.
> 9h loading for 250GB DBs; Server response times: 0.1 - 1 sec
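As a rough plausibility check, the performance data above can be turned into throughput figures. The sketch assumes decimal units (1 GB = 1000 MB, 1 TB = 10^12 bytes) and treats the "> 9h" loading time as exactly 9 hours, so the load rate is an upper bound.

```python
db_size_gb = 250
backup_seconds = 1 * 3600 + 40 * 60  # 1:40 h backup or restore
load_seconds = 9 * 3600              # "> 9h", so the resulting rate is an upper bound

backup_mb_per_s = db_size_gb * 1000 / backup_seconds
load_mb_per_s = db_size_gb * 1000 / load_seconds

# 30 million documents at an average of ~35 kB each.
total_tb = 30e6 * 35e3 / 1e12
```

This gives roughly 41.7 MB/s for backup or restore, at most about 7.7 MB/s for loading, and about 1.05 TB of document data, which fits the stated total of more than 1 TB across 3 DBs.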
20. For More Information Feel Free to Contact me
John.Fitzgerald@SoftwareAGUSA.com
(703) 391-8177
Visit Our Website
http://www.softwareagusa.com
Download a FREE trial Copy
http://www.xmlstarterkit.com/