Dana Soloff Director, Statistical Programming, Genzyme September 25, 2009

Is that an ADaM dataset on the Janus wall? (or SDTM V) The Humpty-Dumpty challenge of modeling study data with HL7-RIM. Dana Soloff, Director, Statistical Programming, Genzyme. September 25, 2009. BACUN – Boston Area CDISC Users Network

Caveats • This presentation represents my thoughts, and not necessarily those of Genzyme.

Outline • Background • Our ADaM-HL7 Pilot • What we did • Our motivation • What we learned • What do we (pharma) do next? • Our final analysis

CDISC-HL7 is coming! “The FDA has embraced the HL7-RIM… We envision the CDISC content to be sent to FDA as XML messages based on the HL7-RIM… SDTM will evolve from a submission standard to an analysis view…” BEHRMAN 2008

When is it coming?!! balloted 9/2009 *PDUFA: Prescription Drug User Fee Act. Oliva, March 2009.

Genzyme ADaM-HL7 Pilot • We hired consultants! • We trained on HL7 V3! • We tried to model ADaM to HL7! • We gave up! • (But Genzyme is trying again!)

What are the CDISC-HL7 Messages? • Study Design* • What will be done? • Study Participation* • Who is involved? • Subject Data • What was observed? • Includes analysis data • Isn’t a message anymore *Has a published Domain Analysis Model (DAM) (think CDISC I.G.) and passed DSTU ballot 9/09.

Study Data: Now “CDA” • No DAM – BRIDG? • It just says “Document”

Why? We were curious. • What is “CDISC-HL7”? • Why is the FDA doing this? • Will this impact statistical reviews? • How will we get our SDTM and ADaM data into CDISC-HL7? • Does this change our vision for data standards architecture?

SDTM & ADaM: Traditional datasets • Two dimensional rows and columns • Keys relate datasets • Requires human readable metadata • Relationships between values often implicit

What is the impact of “implicit relationships?” • Analysis Dataset • One observation row • Value for concomitant medication • Value for adverse event • Did the conmed cause the stroke? • Or was the conmed administered because of the stroke?

Specifications Match Structure of Datasets SDTMIG 3.1.2 SDTM 3.1.2

CDISC-HL7 • Break each dataset up by variables and values • Elements floating in multi-dimensional space • Wrapped in little pods of metadata called attributes • Explicitly modeled relationships
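To make the contrast with a flat dataset row concrete, here is a minimal Python sketch. It is not HL7 V3 or RIM syntax; the `Element` and `Relationship` classes, the dataset name `ADXX`, and the relationship label are all hypothetical, chosen only to show values being lifted out of a row, wrapped in their own metadata, and then linked explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One value lifted out of a dataset row, carrying its own metadata."""
    element_id: str
    variable: str
    value: str
    attributes: tuple  # (name, value) pairs; stands in for the "pod" of metadata


@dataclass(frozen=True)
class Relationship:
    """An explicit, typed link between two elements (hypothetical vocabulary)."""
    source_id: str
    kind: str
    target_id: str


def decompose(dataset, key_var, row):
    """Break one row into elements, one per non-key variable."""
    subject = row[key_var]
    return [
        Element(
            element_id=f"{dataset}.{subject}.{variable}",
            variable=variable,
            value=value,
            attributes=(("dataset", dataset), ("subject", subject)),
        )
        for variable, value in row.items()
        if variable != key_var
    ]


# In the flat row, the link between the conmed and the adverse event is implicit.
row = {"USUBJID": "S-001", "CMTRT": "ASPIRIN", "AETERM": "STROKE"}
elements = decompose("ADXX", "USUBJID", row)

# Once decomposed, the direction of the relationship has to be stated outright.
link = Relationship("ADXX.S-001.CMTRT", "administered_because_of", "ADXX.S-001.AETERM")
```

The point of the sketch is the last line: the row alone cannot say whether the conmed caused the stroke or was given because of it, while an explicit relationship must pick one.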

Protocol Representation Model BRIDG2.2

What does it mean to map SDTM to BRIDG? (apologies to Diane Wold) [Diagram labels: Define.xml, AE, DM, LB, Subject, Person, Plays, SDTMs, Race, Usubjid, Ethnic, Birthdt]

What is CDISC-HL7? • It’s not just SDTM reformatted • It’s a very big change • We have a lot to learn • The standards require further development • Complexity and our inexperience constrain our effective participation

Our questions • What is “CDISC-HL7”? • Why is the FDA doing this? • Will this impact statistical reviews? • How will we get our SDTM and ADaM data into CDISC-HL7? • Does this change our vision for data standards architecture?

The FDA wants more information on relationships between data This adverse event was the result of this concomitant medication administered by this investigator on this date in response to this lab value… • Great for medical review! • Can be modeled in web browser and no limit to instant clicking around to understand relationships!

Pharma needs to join healthcare • EHR (electronic health record) is HL7 based • Efficiencies, reduced development time using same source data • Potential to combine sponsor clinical trial data with subject’s healthcare record data • Personalized Medicine – Genzyme!

Why analysis data in HL7? Combine statistical results with “point of care” statistical results? No… Combine complex study-specific derived values across sponsors? No… Statistical analysis is performed on groups of observations Healthcare is performed on individuals Linking the two is tough!

What is the main reason provided in the HL7 subject data use cases? Transparency. Reviewers want more transparency between collected data and results They want derived data and collected data together They want to be able to easily identify which observations we imputed, excluded, etc.

Events Findings Interventions Why not put analysis data on SDTM? There’s no place to put it. Response = Mean weekly lab > 5 units, no rescue therapies, no adverse events of interest over the evaluation period
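The response definition on this slide draws on all three SDTM observation classes at once, which is why it has no natural home in any one of them. A minimal Python sketch of the rule, assuming the inputs have already been restricted to the evaluation period; the function name, argument names, and the treatment of a subject with no lab values are illustrative assumptions, not part of the slide:

```python
def is_responder(weekly_lab_values, rescue_therapies, adverse_events_of_interest,
                 threshold=5.0):
    """Apply the slide's response rule to one subject's evaluation-period data.

    weekly_lab_values: lab results (Findings)
    rescue_therapies: rescue therapy records (Interventions)
    adverse_events_of_interest: AE records of interest (Events)
    """
    if not weekly_lab_values:
        # Assumption: a subject with no lab data is treated as a non-responder.
        return False
    mean_lab = sum(weekly_lab_values) / len(weekly_lab_values)
    return (
        mean_lab > threshold
        and not rescue_therapies
        and not adverse_events_of_interest
    )


responder = is_responder([6.0, 5.5, 7.0], [], [])
rescued = is_responder([6.0, 5.5, 7.0], ["TRANSFUSION"], [])
low_lab = is_responder([4.0, 4.5, 3.5], [], [])
```

Even this toy version has to make a data-handling decision (the empty-lab case) that is visible only in the code.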

HL7 could theoretically solve this. Response Findings Events Interventions

Except there are interim calculations • Data handling and algorithms applied at every step • Imputations, selected observations based on values or time windows • Often comparisons to other variables before choosing or calculating value • Last follow up date could come from AE, EOS, LB, etc.
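The last follow-up date is a small example of such an interim calculation: the value, and the domain it comes from, are chosen only after comparing candidates. A sketch in Python, assuming each domain contributes a list of candidate dates with missing values represented as `None`; the function and variable names are hypothetical:

```python
from datetime import date


def last_follow_up(candidates):
    """Return (domain, date) for the latest non-missing date across domains.

    candidates maps a domain name (e.g. "AE", "EOS", "LB") to a list of dates;
    None marks a missing date. Returns None if no date is available at all.
    """
    best = None
    for domain, dates in candidates.items():
        for d in dates:
            if d is not None and (best is None or d > best[1]):
                best = (domain, d)
    return best


subject_dates = {
    "AE": [date(2009, 3, 2), None],
    "EOS": [date(2009, 6, 30)],
    "LB": [date(2009, 7, 14), date(2009, 5, 1)],
}
result = last_follow_up(subject_dates)
```

Keeping the source domain alongside the date is one way to preserve traceability for a reviewer.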

RESPONSE Rate of change Adverse Events Selected ATC Codes Transfusion P Mean Lab A P

The FDA also wants to compare actual to planned! • Protocol was amended four times! • A lot of unplanned things happened • New drugs came on the market • Sick people didn’t make it to scheduled visits • Trials weren’t executed perfectly • Bizarre data values happened • Some samples were incorrectly analyzed

And then we get busy… • Perform same calculations on different populations • ITT, Per Protocol • And by different imputation methods • LOCF, WOCF • We may plan to use an observation in one analysis and not another…
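The two imputation methods named above can give different values for the same missing visit, so the same collected data yields more than one analysis value. A minimal Python sketch, under these assumptions: visits are in chronological order, `None` marks a missing value, WOCF uses the worst value observed so far, and "worse" means higher unless another function is passed in.

```python
def locf(values):
    """Last observation carried forward; leading gaps stay missing."""
    imputed, last = [], None
    for v in values:
        if v is not None:
            last = v
        imputed.append(last)
    return imputed


def wocf(values, worst=max):
    """Worst observation carried forward, using the worst value seen so far.

    `worst` picks the worse of two values; max assumes higher is worse.
    """
    imputed, worst_so_far = [], None
    for v in values:
        if v is not None:
            worst_so_far = v if worst_so_far is None else worst(worst_so_far, v)
            imputed.append(v)
        else:
            imputed.append(worst_so_far)
    return imputed


visits = [4.0, None, 7.0, 5.0, None]
by_locf = locf(visits)   # [4.0, 4.0, 7.0, 5.0, 5.0]
by_wocf = wocf(visits)   # [4.0, 4.0, 7.0, 5.0, 7.0]
```

The final visit is 5.0 under LOCF and 7.0 under WOCF, and a reviewer needs to be able to tell which values were imputed and by which method.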

Analysis data is different than collected data! • Real surgery doesn’t have do-overs! • ITT, Per Protocol, Safety • More complexity and diversity in modeling statistics • Entities, acts, etc. don’t always make sense

Can we model ADaM in HL7? • Certainly not today! • It might be possible in the future • There will always be considerable room for error • Is there too MUCH information? • Is HL7 the best way to provide more transparency to reviewers? • Is the cost-benefit ratio acceptable?

Our questions • What is “CDISC-HL7”? • Why is the FDA doing this? • Will this impact statistical reviews? • How will we get our SDTM and ADaM into CDISC-HL7? • Does this change our vision for data standards architecture?

The data are submitted and the fun begins! • FDA receives HL7 messages • Janus generates • views of SDTM and ADaM that match ours • additional analysis views with both collected and derived data

Would the FDA’s views of SDTM and ADaM match ours? • SDTM and ADaM allow flexibility in modeling • How can one model from the specific to the general without a human or rules? • One will never have standard messages defined to cover all cases

Will the FDA reviewer use our datasets or theirs? • SDTM is a collected data standard • Original error was assuming that SDTM could ever be basis for statistical review • If reviewers are unhappy that there is no analysis data on SDTM… • And sponsors are required to model data in HL7 because SDTM is inadequate… • And reviewers have access to another view than SDTM that includes analysis data created from HL7… • Why would they use SDTM? • Other than for WebSDM, iReview

Will views be reassembled correctly? The Humpty-Dumpty Problem! • The data and the metadata are in pieces! • Some is part of HL7 attributes • The rest is in our black box • How will they put together an accurate view of the analysis data? • Our “metadata” – define.xml won’t document their view • Will derived variables be used incorrectly when used out of dataset context?

There will be challenges for the FDA and sponsor communication • If we have different input datasets • Or the Janus generated views are not accurate • How will this promote transparency with regard to statistical review? • It might help if FDA reviewers provide sponsors with their analysis data views • We need define.xml! • And ODM format! 

Is all that really better than this? Selection criteria described in define.xml

Who prepares the submission? Study, Data & Analysis SME CDISC-HL7 SME

Understanding trial, data content and analyses key to correct modeling • Relationships between observed and calculated not all captured as data • Until we have structured protocol & SAP, a fairly complete set of robust messages, and maybe even then… • We need SMEs to model data • They don’t have the HL7 expertise

How do we QC the result? • Is this double work? • Complex specs for ADaM datasets • Complex specs for HL7 • Complex specs to reassemble ADaM from HL7 • Double-program pre-HL7 ADaM with post-HL7?
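One way to picture the last bullet is a record-by-record comparison of the ADaM dataset built before HL7 with the one reassembled afterwards. The Python sketch below is a hypothetical illustration, not an established QC procedure; it assumes each dataset is a list of dictionaries with a unique key variable, and the names `compare_datasets`, `pre_hl7`, and `post_hl7` are invented for the example.

```python
def compare_datasets(expected, actual, key):
    """List differences between two datasets keyed on one variable.

    Returns tuples of (key value, variable or None, description).
    """
    exp = {row[key]: row for row in expected}
    act = {row[key]: row for row in actual}
    findings = []
    for k in sorted(exp.keys() | act.keys()):
        if k not in act:
            findings.append((k, None, "missing after reassembly"))
        elif k not in exp:
            findings.append((k, None, "unexpected after reassembly"))
        else:
            for var in sorted(exp[k].keys() | act[k].keys()):
                if exp[k].get(var) != act[k].get(var):
                    findings.append(
                        (k, var, f"{exp[k].get(var)!r} != {act[k].get(var)!r}")
                    )
    return findings


pre_hl7 = [
    {"USUBJID": "S-001", "AVAL": 5.2, "DTYPE": "LOCF"},
    {"USUBJID": "S-002", "AVAL": 6.1, "DTYPE": ""},
]
post_hl7 = [
    {"USUBJID": "S-001", "AVAL": 5.2, "DTYPE": ""},  # imputation flag lost
]
differences = compare_datasets(pre_hl7, post_hl7, key="USUBJID")
```

Here the comparison flags a lost imputation flag for S-001 and a missing record for S-002, the kind of discrepancy a round trip through another format could introduce.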

Almost everything we’ve done has been valuable! All our plans are usable! • Governance • End-to-end metadata driven data standards roadmap • Metadata Repository • Structured Protocol • Central Lab Standard • CDASH based collection standards

Where should we go from here? ADaM may be better than HL7 in providing transparency • Be realistic about what SDTM can do • It’s fine as a collected data standard • But not as a base for FDA analysis review • Implement ADaM 2.1 and ADaMIG 1.0! • Improve our metadata (define.xml) • Err on the side of traceability! • Inclusion of SDTM data a priority • Intermediate datasets when helpful • Provide FDA multiple “views” of the same data • Provide FDA helper variables for analysis

What can we learn from our SDTM experience? • Less time spent developing model • More time testing actual data! • Engage FDA to understand and develop joint vision • Collaborate with each other on tool development and share costs • Pitch into fund to hire HL7 “technical lobbyist”

What else should sponsors do? • Buy a really big color printer! • Buy really big paper! • Buy a really big magnifying glass! • Buy a big HL7 warehouse! (eventually)

CDISC-HL7: Current sentiment heard around town… HL7 may make sense for collected data - but we don’t like it! - and we still need SDTM as a base for ADaM ODM makes more sense for analysis data - until we have proof that HL7 satisfies the use cases

Is HL7 TOO MUCH information? • The world is round and we do need a jet for collected data… • But are you sure we should take our jet to the ADaM grocery store? • Let’s give pharma a chance to upgrade to more robust ADaM!

HL7 is coming! This isn’t right!
