Develop software to gather data from various websites into a centralized location accessible via a web interface. The system includes information scraping, ETL processing, and database management for user-friendly display.
Project Omnigatherer • Team: Joe Briggie, Kenny Trytek, Abby Birkett, Derek Woods (CPR E, SE) • Client: Kingland Systems
Project Goal • Create software that can take information from many sources (websites) and put the information in a central location, accessible through a web interface. Diagram: Many Sources → Central Location → User's Display
System Diagram • Scraper Tool (HTML Parser, PDF Parser) pulls WWW data into a flat file • ETL Tool checks for conflicts, normalizes the data, and loads the Database • DAL exposes Create, Read, Update, Delete operations • Web Services serve the Employee UI and Client UI
Information Harvesters • Scour sites for information and produce a de-normalized flat file for use by the ETL (Extract, Transform, Load) component. • Currently have two harvesters: one for the FDIC website and one for the FFIEC site. Diagram: Many Sources → Harvester Components → Flat File
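For illustration, a minimal harvester sketch in Python, assuming the requests and BeautifulSoup libraries; the URL and the table being scraped are hypothetical stand-ins for the FDIC/FFIEC pages the real harvesters target:

```python
import csv
import requests
from bs4 import BeautifulSoup

# Hypothetical source URL; the real harvesters target FDIC and FFIEC pages.
SOURCE_URL = "https://example.com/institutions"

def harvest(url: str, out_path: str) -> None:
    """Scrape an HTML table and write each row to a de-normalized flat file."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    with open(out_path, "w", newline="") as flat_file:
        writer = csv.writer(flat_file)
        for row in soup.select("table tr"):
            cells = [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
            if cells:
                writer.writerow(cells)

if __name__ == "__main__":
    harvest(SOURCE_URL, "fdic_flat_file.csv")
```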
ETL (Extract, Transform, Load) Component • Transform the file created by the harvesters into a format usable by the database. • Load the data into the database. Diagram: Flat File → ETL → Database
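A minimal ETL sketch, assuming the flat file is CSV and the target is a SQLite table; the column names and schema here are hypothetical, since the real schema and database engine are not specified in the slides:

```python
import csv
import sqlite3

def etl(flat_file_path: str, db_path: str) -> None:
    """Extract rows from the harvester's flat file, transform them, and load the database."""
    connection = sqlite3.connect(db_path)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS institution (cert_id TEXT PRIMARY KEY, name TEXT, city TEXT)"
    )

    with open(flat_file_path, newline="") as flat_file:
        for row in csv.DictReader(flat_file):
            # Transform: trim whitespace and normalize casing before loading.
            record = (row["cert_id"].strip(), row["name"].strip().title(), row["city"].strip())
            connection.execute(
                "INSERT OR REPLACE INTO institution (cert_id, name, city) VALUES (?, ?, ?)",
                record,
            )

    connection.commit()
    connection.close()
```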
DAL (Database Access Layer) • Provides an interface for CRUD operations on the database. • Helps ensure normalization through standardization of input/output. Diagram: Database with DAL
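A sketch of what the DAL's standardized CRUD interface could look like, again assuming the hypothetical SQLite `institution` table from the ETL example:

```python
import sqlite3

class InstitutionDal:
    """Minimal DAL sketch exposing Create/Read/Update/Delete for one table."""

    def __init__(self, db_path: str) -> None:
        self.connection = sqlite3.connect(db_path)

    def create(self, cert_id: str, name: str, city: str) -> None:
        self.connection.execute(
            "INSERT INTO institution (cert_id, name, city) VALUES (?, ?, ?)",
            (cert_id, name, city),
        )
        self.connection.commit()

    def read(self, cert_id: str):
        cursor = self.connection.execute(
            "SELECT cert_id, name, city FROM institution WHERE cert_id = ?", (cert_id,)
        )
        return cursor.fetchone()

    def update(self, cert_id: str, name: str, city: str) -> None:
        self.connection.execute(
            "UPDATE institution SET name = ?, city = ? WHERE cert_id = ?",
            (name, city, cert_id),
        )
        self.connection.commit()

    def delete(self, cert_id: str) -> None:
        self.connection.execute("DELETE FROM institution WHERE cert_id = ?", (cert_id,))
        self.connection.commit()
```

Routing every read and write through one class like this is what lets the DAL enforce consistent, normalized input and output for the rest of the system.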
Web Services and User Interface • Expose the database to a client application. • Allow users to access the data through the interface. Diagram: Database with DAL → Web Services → User Interface
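A minimal web-service sketch, assuming Flask and the hypothetical `InstitutionDal` class from the DAL example (the module name `dal` and database file are also assumptions):

```python
from flask import Flask, jsonify, abort

from dal import InstitutionDal  # hypothetical module containing the DAL sketch

app = Flask(__name__)
dal = InstitutionDal("omnigatherer.db")  # hypothetical database file

@app.route("/institutions/<cert_id>")
def get_institution(cert_id: str):
    """Expose a single record to client applications as JSON."""
    row = dal.read(cert_id)
    if row is None:
        abort(404)
    return jsonify({"cert_id": row[0], "name": row[1], "city": row[2]})

if __name__ == "__main__":
    app.run()
```

A client UI would then call endpoints like `/institutions/1234` rather than touching the database directly.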
Test Plan • Test individual modules for basic functionality. • Integrate all modules. • Test individual modules for full functionality with continuous integration to accommodate changes in each module. • Final testing for validation and client approval.
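As a sketch of the module-level tests in the first phase, assuming Python's unittest and the hypothetical `InstitutionDal` class from the DAL example, run against an in-memory database:

```python
import unittest

from dal import InstitutionDal  # hypothetical module containing the DAL sketch

class DalBasicFunctionalityTest(unittest.TestCase):
    """Module-level test: exercise basic CRUD against an in-memory database."""

    def setUp(self) -> None:
        self.dal = InstitutionDal(":memory:")
        self.dal.connection.execute(
            "CREATE TABLE institution (cert_id TEXT PRIMARY KEY, name TEXT, city TEXT)"
        )

    def test_create_then_read(self) -> None:
        self.dal.create("1234", "First Example Bank", "Ames")
        self.assertEqual(self.dal.read("1234"), ("1234", "First Example Bank", "Ames"))

if __name__ == "__main__":
    unittest.main()
```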