Water Analytics Platform on AWS



Team Members
  • Srinivasan Vembuli
  • Rikio Chiba
  • Romeo Luka

Under the supervision of Prof. Murlikrishna Viswanathan

Background
  • The Department of Environment, Water and Natural Resources (DEWNR) leads the management of South Australia’s most valuable resource.
  • DEWNR collects water data from various sources and disseminates it to other agencies.
  • Data is currently stored in multiple systems:
      • Hydstra (legacy FoxPro DB)
      • SQL Server
  • The data is currently used by the Bureau of Meteorology (BOM) for its analytics applications and by DEWNR in its Water Connect website applications.
WDTF
  • The Water Data Transfer Format (WDTF), an XML-based format developed in 2008, is a national standard for transferring water information.
  • Over 240 organisations are required to give specified water information to the Bureau of Meteorology under the Water Regulations 2008.
  • BOM receives data from the current system in WDTF.
Existing System Architecture
  • [Diagram with three columns: Data Source, Storage / Application, Output]
  • Labelled elements: Field Sensors, Other Data, Hydstra (FoxPro DB), SQL Server, Data Mart, GIS Application, WDTF, Analysis
  • Arrows between elements are labelled Raw Data

Problem Definition
  • The current architecture relies on multiple systems, including legacy software such as Hydstra (FoxPro DB).
  • This leads to increased costs and inefficiency in service delivery.
  • The current architecture does not fully utilise WDTF as the universal data format standard.
Project Objectives
  • Help DEWNR use WDTF data to generate analytical data, similar to BOM's, for public consumption (Open Data: OTF is a facilitator for the SA Government).
  • Develop a cloud-based ETL system to manage water data (in WDTF) from across Australia.
  • Provide useful analytics and insights from this data using data mining and visualization techniques.
    • Examples include time series analysis of aggregated groundwater and surface-water data, and real-time mapping of water data using dashboards and mapping APIs.
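
As a rough illustration of the time series aggregation mentioned above, the following minimal sketch averages water-level readings per calendar month. The function name `monthly_means`, the tuple layout, and the sample values are assumptions made for this example; they are not taken from the project or from WDTF.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def monthly_means(readings):
    """Group (ISO timestamp, value) pairs by calendar month and average them."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        month = datetime.fromisoformat(timestamp).strftime("%Y-%m")
        buckets[month].append(value)
    # Sort by month key so the result reads chronologically.
    return {month: mean(values) for month, values in sorted(buckets.items())}


# Hypothetical groundwater levels in metres, for illustration only.
sample = [
    ("2014-01-05T00:00:00", 12.0),
    ("2014-01-20T00:00:00", 14.0),
    ("2014-02-03T00:00:00", 11.0),
]
result = monthly_means(sample)
print(result)
```

A dashboard or mapping front end could consume a summary like this directly; in the proposed architecture the same aggregation would more likely run as a query inside the data warehouse.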
Solution
  • Host water data in the cloud
    • Establish an integrated data analysis platform
    • Publish water data for use by third-party organisations
Architecture on AWS
  • [Diagram] AWS Data Pipeline with four daily tasks
  • Task labels: Copy WDTF files, Parse WDTF files (JSON), Store the data to Redshift, Analysis
  • Components: Local FTP Server, Amazon S3, Amazon EC2, Amazon EMR, Amazon Redshift
  • Open questions noted on the slide: Why do we need to use Redshift? Why do we need to use EMR?
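
The hand-off from the parsing task to the "Store the data to Redshift" task implies turning nested JSON into flat, table-shaped rows. The sketch below shows one possible way to do that with the Python standard library. The record layout, the dotted column names, and the helpers `flatten` and `to_csv` are all hypothetical; the slides do not say how the team performs this step.

```python
import csv
import io


def flatten(record, prefix=""):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


def to_csv(records):
    """Render a list of nested records as CSV text with a header row."""
    rows = [flatten(r) for r in records]
    # Use the union of all keys so every row has the same columns.
    columns = sorted({col for row in rows for col in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up records standing in for parsed water data.
sample = [
    {"site": "A1", "reading": {"time": "2014-01-05", "level_m": 12.0}},
    {"site": "B2", "reading": {"time": "2014-01-05", "level_m": 9.5}},
]
csv_text = to_csv(sample)
print(csv_text)
```

A file like this could then be uploaded to Amazon S3 and loaded into a table with Redshift's `COPY` command; that step is not shown here.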

Current Project Status
  • Wrote two Perl parser programs:
    • The first parser unzips Zip files to produce XML files
    • The second parser converts the XML files to JSON

Zipped file -> (Unzip Parser) -> XML files -> (Convert Parser) -> JSON

  • Researching how to convert the JSON files to tables using EMR and load them into Redshift for analytics
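
The team's parsers are written in Perl and are not reproduced in the slides. Purely as an illustration of the same unzip-then-convert flow, here is a minimal Python sketch that reads XML members from a Zip archive and emits JSON. The `@` and `#text` key conventions, the function names, and the sample XML are assumptions for this example; the sample is not real WDTF.

```python
import io
import json
import zipfile
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an XML element to a dict: '@' attributes, '#text', child lists."""
    node = {f"@{key}": value for key, value in element.attrib.items()}
    text = (element.text or "").strip()
    if text:
        node["#text"] = text
    for child in element:
        # Repeated child tags are collected into a list under the tag name.
        node.setdefault(child.tag, []).append(element_to_dict(child))
    return node


def zip_to_json(zip_bytes):
    """Return {member name: JSON string} for every .xml file in the archive."""
    output = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xml"):
                root = ET.fromstring(archive.read(name))
                output[name] = json.dumps({root.tag: element_to_dict(root)})
    return output


# Build a tiny in-memory archive with a made-up XML document (not real WDTF).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sample.xml", "<site id='A1'><level unit='m'>12.0</level></site>")
converted = zip_to_json(buffer.getvalue())
print(converted["sample.xml"])
```

Real WDTF files use XML namespaces, so element tags parsed this way would carry a `{namespace}` prefix that a production parser would need to handle.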
Deliverables
  • Prototype of the proposed architecture
  • Technical document
  • Project report