Connecticut is Data Rich but Information Poor


Our Vision: Connecting the Silos


PATH Presentation: CT Data Collaborative, June 2014

  • How PATH Works

  • Example of PATH installed as P20WIN

  • PATH vs Desktop Integrator



How PATH Works

  • Virtual Data Warehouse

  • Identity Resolution across multiple sources that don’t share a Gold Standard Identifier

  • HIPAA and FERPA Compliant

  • Always transfers Fact data separately from Demographic data or Personally Identifiable Information

  • Data Owners control which data is exported to a location outside of their data center

  • Data Owners approve all queries



PATH History

Completed Phases

  • 2007 - Established in Statute - Public Act 07-02

  • 2008 - Initial Development as CHIN, inclusion of 4 initial data sources

  • 2009 - Implemented advanced record linkage in a virtual data warehouse

  • 2011 - Scalability to 1M+ individuals, ability to add additional data sources and manage metadata w/o code modifications, unlimited data sources

  • 2014 - Implemented for P20WIN: 40M Records, 1.6B Data Elements

    Now Available to CT Agencies and Organizations as PATH



Data Categories

  • People Records

    • Demographic Information such as Name, Address, SSN, DOB, etc.

    • Also known as PII – Personally Identifiable Information

  • Fact Records

    • Education, Health, Labor, etc.: information about a person BUT without the PII

    • De-Identified or Anonymized Data
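
The split between People Records and Fact Records can be illustrated with a minimal sketch. The field names, the `PII_FIELDS` set, and the `split_record` helper are assumptions made for illustration; the slides only name examples such as Name, Address, SSN, and DOB.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed PII field names; not taken from the PATH software itself.
PII_FIELDS = {"name", "address", "ssn", "dob"}


@dataclass(frozen=True)
class PeopleRecord:
    record_index: int
    pii: dict


@dataclass(frozen=True)
class FactRecord:
    record_index: int
    facts: dict


def split_record(record_index: int, row: dict) -> Tuple[PeopleRecord, FactRecord]:
    """Separate one source row into a People Record and a de-identified Fact Record."""
    pii = {k: v for k, v in row.items() if k in PII_FIELDS}
    facts = {k: v for k, v in row.items() if k not in PII_FIELDS}
    # The two halves share nothing but the record index.
    return PeopleRecord(record_index, pii), FactRecord(record_index, facts)


row = {"name": "Pat Doe", "dob": "1990-04-01", "ssn": "000-00-0000",
       "program": "Nursing", "credits": 30}
people, fact = split_record(17, row)
```

Here `fact` carries only `program` and `credits`, so it can be handled separately from the PII held in `people`.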




Step 1

  • PATH Remote Software installed at each Participating Agency

  • Agency Data Steward uses the PATH Metadata Editor to Identify:

    • Table/Record Schema of Agency Data

    • Data at the Field or Table Level marked Available or Unavailable for Download

    • Common Data Element fields used for linking records - provides Identity Resolution across the different sources
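
What a Data Steward records in Step 1 might look like the sketch below. The `FieldMeta` and `TableMeta` classes and all table and field names are hypothetical; the actual format used by the PATH Metadata Editor is not described in the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldMeta:
    name: str
    available: bool = False               # may this field be downloaded from the agency?
    common_element: Optional[str] = None  # linking role, or None if not used for linking


@dataclass(frozen=True)
class TableMeta:
    table: str
    fields: tuple

    def downloadable(self):
        """Names of fields marked Available for Download."""
        return [f.name for f in self.fields if f.available]

    def linking_fields(self):
        """Map each Common Data Element to the local field that supplies it."""
        return {f.common_element: f.name for f in self.fields if f.common_element}


enrollment = TableMeta(
    table="enrollment",
    fields=(
        FieldMeta("stu_last", common_element="last_name"),
        FieldMeta("stu_dob", common_element="dob"),
        FieldMeta("program", available=True),
        FieldMeta("advisor_notes"),  # defaults to Unavailable
    ),
)
```

Defaulting `available` to `False` is a design choice of this sketch: a field leaves the agency only if the steward explicitly releases it.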

[Diagram: PATH Remote (Metadata Editor & ETL) installed alongside the Agency Data at each of four agencies: SDE, CCC, CSU, and DOL.]


Step 2

During Remote Initialization the Extract/Transform/Load function of PATH builds a Record Index of the People Records from each Data Source
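
One way to picture a Record Index is as a compact integer handle for each People Record. The sketch below assumes the index maps that handle to the agency name and the record's local key; the slides do not specify what the real index stores, and `build_record_index` is a hypothetical helper.

```python
def build_record_index(source_name, people_rows, key_field):
    """Map a compact integer index to each People Record's local key."""
    index = {}
    for position, row in enumerate(people_rows):
        # Only the source name and the local key are kept, not the PII fields.
        index[position] = (source_name, row[key_field])
    return index


rows = [
    {"student_id": "A100", "name": "Pat Doe"},
    {"student_id": "A200", "name": "Sam Roe"},
]
record_index = build_record_index("CSU", rows, "student_id")
```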

[Diagram: each agency (SDE, CCC, CSU, DOL) now holds a Record Index next to its Agency Data and its Metadata Editor & ETL.]


Step 3

PATH Software installed at a Main Location - for P20WIN this location is DAS/BEST

[Diagram: the four agency installations (SDE, CCC, CSU, DOL) plus Main @ DAS/BEST, which hosts the Probabilistic Integrator (Pi) and the UI, Security, Workflow, and Query Engine.]


Step 4

During Main Initialization, PATH:

  • Extracts Common Data Elements from People Records using each Agency’s Record Index

  • Sends them to Main and loads them into memory ONLY

  • Combines multiple records for individuals into Clusters via the Probabilistic Integration Utility

  • Keeps in memory a Table of Clusters containing only Agency Record Indices

  • Flushes Agency PII from memory

[Diagram: the four agency installations (SDE, CCC, CSU, DOL), each with Agency Data, a Record Index, and Metadata Editor & ETL, connected to Main @ DAS/BEST, which hosts the Probabilistic Integrator (Pi) and the UI, Security, Workflow, and Query Engine.]
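
The clustering in Step 4 can be sketched as below. The slides do not describe how the Probabilistic Integrator scores matches, so this uses a generic weighted-agreement score with union-find grouping; the weights, threshold, and field names are all invented for illustration.

```python
from itertools import combinations

# Hypothetical agreement weights and threshold, chosen only for illustration.
WEIGHTS = {"last_name": 4.0, "first_name": 2.0, "dob": 5.0}
THRESHOLD = 8.0


def match_score(a, b):
    """Sum the weights of the common data elements on which two records agree."""
    return sum(w for field, w in WEIGHTS.items()
               if a.get(field) is not None and a.get(field) == b.get(field))


def cluster(records):
    """records: {(agency, record_index): {common data elements}}.
    Returns clusters as sorted lists of (agency, record_index) keys only."""
    parent = {key: key for key in records}

    def find(key):
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    for a, b in combinations(records, 2):
        if match_score(records[a], records[b]) >= THRESHOLD:
            parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for key in records:
        groups.setdefault(find(key), []).append(key)
    return sorted(sorted(members) for members in groups.values())


in_memory = {
    ("SDE", 0): {"first_name": "pat", "last_name": "doe", "dob": "1990-04-01"},
    ("CSU", 7): {"first_name": "patricia", "last_name": "doe", "dob": "1990-04-01"},
    ("DOL", 3): {"first_name": "sam", "last_name": "roe", "dob": "1988-11-23"},
}
clusters = cluster(in_memory)
in_memory.clear()  # discard the PII-bearing elements; only index clusters remain
```

The SDE and CSU records agree on last name and date of birth (score 9.0), so they land in one cluster even though the first names differ. The resulting table holds only `(agency, record_index)` pairs.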


Step 5

Use UI features to establish user Roles, Login, etc.

[Diagram: the four agency installations (SDE, CCC, CSU, DOL) connected to Main @ DAS/BEST, which hosts the Probabilistic Integrator (Pi) and the UI, Security, Workflow, and Query Engine.]


Step 6

The Query Engine uses the Clusters of Indices to:

  • Get the needed Agency Record Indices

  • Query only Agency Data marked Available for Download

  • Transfer only data marked Available for Download to the Main

  • Download only Approved Queries

[Diagram: the four agency installations (SDE, CCC, CSU, DOL) connected to Main @ DAS/BEST, which hosts the Probabilistic Integrator (Pi) and the UI, Security, Workflow, and Query Engine, and outputs De-identified Integrated Data.]
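
A rough sketch of the Step 6 query path follows. The `run_query` function, the data layout, and the agency field names are assumptions; in PATH the availability filter is applied at each agency, whereas this single-process sketch only imitates that behavior.

```python
def run_query(query, clusters, agency_stores, availability):
    """Return de-identified rows, one per cluster, using only released fields."""
    if not query["approved"]:
        raise PermissionError("query has not been approved by the data owners")
    results = []
    for cluster_id, members in enumerate(clusters):
        row = {"cluster": cluster_id}  # an opaque cluster number, no PII
        for agency, record_index in members:
            facts = agency_stores[agency][record_index]
            for field in query["fields"].get(agency, []):
                if field in availability[agency]:  # skip anything marked Unavailable
                    row[f"{agency}.{field}"] = facts[field]
        results.append(row)
    return results


clusters = [[("CSU", 7), ("DOL", 3)]]
agency_stores = {
    "CSU": {7: {"program": "Nursing", "advisor_notes": "private"}},
    "DOL": {3: {"quarterly_wage": 12500}},
}
availability = {"CSU": {"program"}, "DOL": {"quarterly_wage"}}
query = {
    "approved": True,
    "fields": {"CSU": ["program", "advisor_notes"], "DOL": ["quarterly_wage"]},
}
result = run_query(query, clusters, agency_stores, availability)
```

Although the query asks for `advisor_notes`, that field is not marked Available, so it never appears in `result`.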


3 User Roles


Query Workflow


Data Output


PATH Components

  • Remote Components

    • Metadata Editor

    • Extract, Transform and Load Module

  • Main Components

    • Integration Engine

    • User Interface

    • Security

    • Workflow Module

    • Query Engine with Filtering

[Diagram: PATH Components. Each agency holds Agency Data, a Record Index, and Metadata Editor & ETL; Main @ DAS/BEST hosts the Integration Engine (Probabilistic Integrator - Pi) and the UI, Security, Workflow, and Query Engine, and outputs De-identified Integrated Data.]


PATH Functionality

  • Security

    • Personally Identifiable Information never written outside of Agency Data Center

    • Encrypted transfer of all data

    • PII & Fact records never transmitted together

    • Audit logs

    • Query Approval Workflow

    • Multiple Secure User Roles

  • Ease of Use

    • System Administration

    • Data Management

    • Query Filtering

    • Query results delivered as de-identified data
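
The Query Approval Workflow and Audit logs listed above could be modeled along these lines. The `QueryRequest` class, its states, and the rule that every involved agency must approve are assumptions for illustration, not a description of PATH's actual workflow module.

```python
from datetime import datetime, timezone


class QueryRequest:
    """A query that becomes runnable only after every data owner approves it."""

    def __init__(self, requester, agencies):
        self.requester = requester
        self.pending = set(agencies)  # data owners who have not yet approved
        self.denied = False
        self.audit_log = []
        self._log("submitted", requester)

    def _log(self, action, actor):
        # Every state change is recorded with a UTC timestamp.
        self.audit_log.append((datetime.now(timezone.utc).isoformat(), action, actor))

    def approve(self, agency):
        if agency not in self.pending:
            raise ValueError(f"{agency} has no pending approval for this query")
        self.pending.discard(agency)
        self._log("approved", agency)

    def deny(self, agency):
        self.denied = True
        self._log("denied", agency)

    @property
    def runnable(self):
        return not self.denied and not self.pending


request = QueryRequest("analyst1", ["CSU", "DOL"])
request.approve("CSU")
still_waiting = request.runnable  # DOL has not approved yet
request.approve("DOL")
```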

[Diagram: PATH Functionality. The PATH Components diagram annotated with function labels: Data Mgmt, Encrypted Xfer, PII & Facts separate Xfer, No PII, User Roles, Audit logs, Sys Admin, Approval req’d, Query Filtering, and De-identified Integrated Data.]


Competitor Components

  • Remote Components

    • Metadata Editor

    • Extract, Transform and Load Module

  • Main Components

    • Integration Engine

    • User Interface

    • Security

    • Workflow Module

    • Query Engine with Filtering

[Diagram: Competitor Components. The labels repeat those of the PATH Functionality diagram, without Main @ DAS/BEST or the Probabilistic Integrator.]


Competitor Deficits

Desktop Integration Engine

  • Minimal Security

    • No Encrypted Transfer of Data

    • No Audit Logs

    • Transfer of Facts with PII

    • No Secure Logins

    • FTP or Thumb Drive Transfers

    • No Anonymized Data

  • No Access Control - No Approval Workflow

  • No Chain of Custody Assurance – Possibility for Cherry-Picked Data

[Diagram: Competitor Deficits. Copies of Agency Data from four agencies feed a single Integration Engine, which produces PII Visible Integrated Data.]


Take a Test Drive