1 / 34

Umbrella 0.5 Project Report #1

Umbrella 0.5 Project Report #1. Heterogeneous Data Source Access Using Standard SQL Queries Eric Matson – MSE Kansas State University. Agenda. Overview Methods & Tools Requirements Project Management and Control GUI Walkthrough/Prototyping Summary. Overview. Purpose Goals

payers
Download Presentation

Umbrella 0.5 Project Report #1

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Umbrella 0.5Project Report #1 Heterogeneous Data Source Access Using Standard SQL Queries Eric Matson – MSE Kansas State University

  2. Agenda • Overview • Methods & Tools • Requirements • Project Management and Control • GUI Walkthrough/Prototyping • Summary

  3. Overview • Purpose • Goals • Description • Architecture • Key Elements • Process • Test Lab • Scope • Future Directions • Literature Search Results

  4. Purpose • Query heterogeneous data sources using standard SQL statements • Achieve database integration and access at the query level • Allow the data to be “black box” whereas users don’t need to have understanding to source, architecture, or technical details

  5. Goals • Complete the Requirements of CIS895 • Successfully Navigate Software Engineering Process • Prove Umbrella Concept • Prove Ability to Process Umbrella Queries • Split the query into to process heterogeneous sources • Manage the connection processing with the data sources • Return and create new relation(s) • Create easy to use Graphical User Interface (GUI)

  6. Description • Enter queries using standard SQL statements • Query heterogeneous data sources • Text Files • Formatted • Delimited • Relational Databases • PostgreSQL • MySQL • Informix • Knowledge Base Formats • Prolog Rules • CLIPS Deftemplate facts • Non-relational Databases • ODBC Data Sources • Unify Data together as single relation

  7. Description – Architecture Relational Database Parser D a t a C o n n e c t o r s Object Database API Splitter TCP/IP Text File Router Umbrella GUI Unifier KB File Profile Data Server Data Sources Engine

  8. Description – Key Elements • GUI: User Interface • Umbrella Engine Agent • Parser: Checks the query for correctness and validity • Splitter: Splits query to appropriate data sources • Router: Manages routing and connection to data source • Unifier: Unites data upon return • Profile: Database to manage data source information • Data Connector: Interface to data sources • Data Server: • Text: “Wraps” data and serves the Umbrella Engine • KB: “Wraps” fact and serves the Umbrella Engine

  9. Description - Process SELECT * FROM Job, Name, Addr WHERE Job.SSN = Name.SSN AND Job.SSN = Addr.SSN SSN Title Salary Job Parse SELECT * FROM Job SSN Title Salary First Middle Last Street State ZIP Split SSN First Middle Last Unify SELECT * FROM Name DC Name DC Route DC SSN Street State ZIP SELECT * FROM Addr Addr

  10. Description - Test Lab Server Client Network PostgreSQL Linksys 10 Mb Hub Windows 98 MySQL Red Hat 7.1 192.168.1.2 192.168.1.1 Text

  11. Scope • Functionality • Umbrella 0.5 Query Engine • Graphical User Interface (GUI) • Data Servers for Text Data Sources • Data Connectors • SQL Select Statement in Limited Format • Data Sources • PostgreSQL: ANSI Compliant ORDBMS • MySQL: non-ANSI Compliant RDBMS • Formatted Text • Delimited Text

  12. Future Directions • Version 0.6 – SQL release • Implement full grammar support for SQL • Enhance dataset join algorithm • Enhance GUI • Add ODBC, Berkeley DB, Interbase and ADABAS data connectors • Version 0.7 – KD/AI release • Add Lisp, Prolog, CLIPS data connectors • Enhance Query Efficiency Algorithms • Version 0.8 – Integration release • Add API layer for access by C/C++, Java, VB • Add tools for advanced searches and Data Mining • Add Security Module • Version 0.9 – Data Connector release • Add Oracle, Informix, UDB, Sybase data connectors • Add Object Database data connectors • Add XML data connector • Version 1.0 – Mainframe release • Add IMS, DB2, VSAM data connectors

  13. Literature Search • Looked at US/Canadian Research Units • Companies • Universities • Similar Research/Product Development • WWW.Jibe.com - Only JDBC/ODBC sources • Queens University(Canada)/IBM – Older pre-Internet “Explosion” • Stanford Mariposa Project • Summary • Some Similar Projects • Most are Tangential in Scope and Function

  14. Methods & Tools (M&T) • Development Process • Object Modeling • Formalism/Constraints • Development Tools • Infrastructure

  15. M&T – Development ProcessCOMET(Concurrent Object Modeling and architectural design mEThod – Hassan Gomaa) Planning Delivery Non-COMET Process COMET Process Requirements Modeling Analysis Modeling Design Modeling Incremental Software Construction Throwaway Prototyping Incremental Software Integration Systems Testing Incremental Prototyping Phase 1 Phase 2 Phase 3

  16. M&T – Object Modeling • Unified Modeling Language (UML) used • UML is standard OO modeling language • Well understood • UML Tool • Poseidon • ArgoUML • Both are very similar, Poseidon is an extension of ArgoUML

  17. M&T – Formalism/Constraints • Object Constraint Language (OCL) • Easier to understand • Less mathematical • Less developed • USE • Graphical tool to support OCL

  18. M&T – Development Tools • Sun Java 1.3 JDK/JRE • Newest release of development kit • Write Once, Run Anywhere (Portable) • Forte (Netbeans) Integrated Development Environment • Community Edition (CE) • Easy to use • Free!

  19. M&T - Infrastructure Server OS: Red Hat Linux 7.1 Object/Relational Database Server: PostgreSQL Relational Database Server: MySQL

  20. Requirements • Object Diagram • Class Descriptions • SQL Grammar • SQL Example

  21. Requirements – Object Diagram DataConn Parser Engine RelDataConn TextDataConn Query Splitter JDBCConn FormDataConn DelDataConn SubQuery Interagator MySQLConn PostgresConn Profile Router JDBCProfile TextProfile TextDesc Unifier FormTextProf DelTextProf ResultSet DataServer

  22. Requirements – Class Descriptions • Engine: Control agent of Umbrella architecture • Parser: Checks query for grammar/valid data objects • Splitter: Splits queries into valid sub-queries • Router: Routes queries to data source/manages connection • Unifier: Unifies sub-query result sets into single relation • Query: SQL statement • SubQuery: Created by Splitter for specific data source • ResultSet: Data returned from sub-query of data source • TextDesc: Defines characteristics of text data source • Interrogator: Queries a new data source for information on field definitions • DataServer: Server that supports access to text data sources

  23. Requirements – Class Descriptions • Profile: Defines characteristics of a data source • JDBCProfile • FormattedTextProfile • DelimitedTextProfile • DataConnector: Defines specifics to query a data source • RelationalDataConnector • JDBCDataConnector • MySQLDataConnector • PostgreSQLDataConnector • TextDataConnector • FormattedTextDataConnector • DelimitedTextDataConnector

  24. Requirements – SQL Grammar • <statement> ::= SELECT <select> FROM <table>.| SELECT <select> FROM <table> WHERE <condition>. • <select> ::= <field> | <select>,<field> | * • <field> ::= <fieldname> | <tablename>.<fieldname> • <fieldname> ::= <alpha> {<alpha>}{<digit>} • <table> ::= <tablename> | <tablename> {,<tablename>} • <tablename> ::= <alpha> {<alpha>}{<digit>} • <conditionals> ::= <relation> {AND <relation>}{OR <relation>} • <relation> ::= <field> <operand> <rightside> • <operand> ::= < | > | = • <rightside> ::= <field> | <vartype> • <vartype> ::= <string> | <integer> | <real> • <string> ::= “” | “<alpha> {<alpha>}{<digit>} “ • <integer> ::= <digit> {<digit>} • <real> ::= <integer> . <integer> • <alpha> ::= a|b|c…x|y|z|A|B|C…X|Y|Z • <digit> ::= 0|1|2|3|4|5|6|7|8|9

  25. Requirements – SQL Example • SELECT * FROM address; • SELECT ssn, age FROM employee; • SELECT lastname, firstname FROM employee WHERE age > 40; • SELECT lastname, firstname, salary FROM employee, hr WHERE employee.ssn = hr.ssn AND employee.age > 25;

  26. Project Planning and Control • Project Planning • Work Breakdown Structure (WBS) • Gantt Chart • Description of Estimation Techniques • Project Control • Effort/Budget • Project Log and Time Tracking • Change Management (Feedback Loops)

  27. Project Planning - WBS 1 Umbrella Goals: 1.1 Planning 1.1.1 Define Purpose • Tasks: .25 < hours < 8 1.2 Requirements Modeling 1.1.2 Define Scope Outcome: 1.3 Throwaway Prototyping 1.1.3 Define Purpose • Stay on schedule 1.4 Analysis Modeling 1.1.4 Define Goals • Prevent “creep” 1.5 Design Modeling … 1.6 Incr. SW Construction 1.7 Incr. SW Integration 1.8 Incr. Prototyping 1.9 Systems Testing 1.10 Delivery

  28. Project Planning – Gantt Chart September October November December January Planning Req’s Modeling Throwaway Prototyping Analysis Mod. Design Mod. Incr. SW Const Incr. SW Integ. Incr. Prototyping Systems Testing Delivery Review #1 Review #2 Review #3

  29. Project Planning – Estimation Techniques • Estimate Tasks on Defined WBS • All tasks broken into “chunks” of work between .25 and 8 hours • Keep project on schedule • WBS Estimation Used • Small Project Team • Well Understood Capabilities of Project Member(s) • Change Budget Allocated at 20% of Initial Estimate • Works with Iterative Schemes • Change is Planned and Accepted

  30. Project Control – Effort/Budget • Effort • Effort Data Hours Man/Mo (168 hr/mo) • Predicted 565 3.36 • Current 183.25 1.09 • Budget • Time ($50/hr) $28,250 • Current $9,162.50

  31. Project Control – Time Log/Issues • Time Log • Track time spent each day on project tasks • WBS Number • Hours • Viewed on Engineering Notebook Site • Issues • As design issues are realized • Issues Documented • Solutions Arrived at and Documented • Folded in to Change Management Process • If Impact and In Scope

  32. Project Control – Change Mgmt. • Development Process (COMET) Oriented for Change • Feedback changes will be tracked • Change Control Number • Task Name • Added Hours to Project Schedule • Description of Change • WBS Number of Artifacts Altered • Dependencies

  33. Graphical User Interface File Connection Query Help Connections --- 192.168.1.2 ---- /opt/data ----formatted.txt ---- PostgreSQL Server ----Employee ----Salary --- 192.168.1.1 ---- MySQL Server ----Items ----Invoice ----Customers ---- /var/lib/data ----delimited.txt Queries --- Employees and Customers --- Employee ---- ssn, lastname --- Customers ---- ssn Define Query Name: Employees and Customers Query: SELECT ssn, lastname FRO 111-22-3333 Elvis Presley 222-33-4444 Richard Nixon 333-44-5555 Mikhail Gorbechev Results Status: Subquery1: OK Subquery2: Failed

  34. Summary • Umbrella accesses heterogeneous data sources • Defined set of tools and methodologies • High Level Requirements Defined • Project Management Well Defined • Easy to Use GUI • Next Steps • Questions • Feedback • Signoff • Proceed to Phase #2

More Related