
Translate, Transfer, Transform: Academia, Industry and Enabling Data Analytic Tools


Presentation Transcript


  1. Translate, Transfer, Transform: Academia, Industry and Enabling Data Analytic Tools Martin Owen, Verity Fisher, Emily Matthews, David Woods

  2. Why do we need to do something differently? “Because of the complexity and volume of dissolution data, it is not easy to keep all of the key results in your mind to make an informed decision. Historically only a few people have had enough working memory in their brain to do this… thus having a dashboard which visualizes the data during the meeting allows us to go beyond the limits of our brain capacity and make the right decisions”. Head of Drug Delivery, GlaxoSmithKline

  3. What is the generalised problem we want to solve? “How can we empower data creators and data users to improve the way project teams communicate risk, evaluate options through interactive models and make informed, evidence-based decisions?”
  Case study: support the development of a formulated product in the pharmaceutical industry
  • Impact of formulation design on stability
  • Impact of formulation design on dissolution

  4. How are we going to do this? The multidisciplinary collaboration
  • Aim: to generate enhanced statistical design, modelling and visualisation capability
  • Fundamental science
  • Development of innovative statistical methodology
  • Implementation and application of the solutions
  • Translation and communication of the results
  Desired output
  • The use of maximally efficient quantitative and statistical methods
  • Better understanding and quantification of uncertainty and risk to product quality

  5. Informatics
  • SEEK, INCUBATE, INDUSTRIALISE the creation of task-specific applications
  Case studies
  • Incubation (JMP): Accelerated stability
  • Seek (R): Modelling dissolution profiles

  6. (1) Standardised, automated data import
  Fig 1a: Launch dashboard
  Fig 1b: Switch project (1c), access data tables, re-run dashboard, retrieve existing models
  Fig 1c: Create new project, select existing project
  Fig 1d: Confirm table selection, access exploratory data analysis options
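The dashboard in these figures is a JMP application; as a rough analogue, here is a minimal R sketch of what a standardised, automated import step could look like. The folder layout, file format and required column names are assumptions for illustration only, not part of the original tool.

```r
# Hypothetical helper: read every CSV in a project folder, check that each
# table carries the expected columns, and return one combined data table.
read_dissolution_project <- function(project_dir) {
  files <- list.files(project_dir, pattern = "\\.csv$", full.names = TRUE)
  required_cols <- c("Batch", "Vessel", "Time", "PercentDissolved")  # assumed schema
  tables <- lapply(files, function(f) {
    d <- read.csv(f, stringsAsFactors = FALSE)
    missing <- setdiff(required_cols, names(d))
    if (length(missing) > 0)
      stop(sprintf("%s is missing columns: %s", basename(f),
                   paste(missing, collapse = ", ")))
    d
  })
  do.call(rbind, tables)  # one standardised table for the selected project
}

# Example call (hypothetical project folder):
# dissolution <- read_dissolution_project("projects/capsule_study")
```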

  7. (2) Exploratory Data Analysis/Cleanse data
  • Multifaceted views of the same data set give different insights
  • Does the data make sense?
  • Data manipulation (stack/split) occurs behind the scenes (see the sketch below)
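A minimal sketch of the behind-the-scenes stack/split manipulation using tidyr; the wide-format column names (t5, t15, t30) and the example values are assumed purely for illustration.

```r
library(tidyr)

# Wide layout: one row per batch, one column per time point (assumed names)
wide <- data.frame(
  Batch = c("A", "B"),
  t5  = c(12.1, 10.4),   # % dissolved at 5 minutes
  t15 = c(55.3, 48.9),
  t30 = c(88.7, 83.2)
)

# "Stack": one row per (Batch, Time) observation, ready for plotting/modelling
long <- pivot_longer(wide, cols = c(t5, t15, t30),
                     names_to = "Time", values_to = "PercentDissolved")
long$Time <- as.numeric(sub("t", "", long$Time))

# "Split": back to the wide layout when a tabular view is needed
wide_again <- pivot_wider(long, names_from = "Time",
                          values_from = "PercentDissolved", names_prefix = "t")
```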

  8. (3a) Overview of risk and uncertainty

  9. (3a) Overview of risk and uncertainty

  10. (4a) Centre focus

  11. (4b) Centre focus drill down

  12. (5) Automated model selection heuristics

  13. (6) Diagnostics: How good is the model?

  14. (6) What can the model tell us?

  15. (6) Depiction of uncertainty

  16. Application: Compare concepts and contrast the specifics
  Accelerated Stability
  • Standardised, automated data workflow
  • Exploratory data analysis: views A, B, C, ...
  • Automated modelling and manual assessment of quality of models
  • Observed long term data
  • Relative Humidity Linear Time Kinetic
  • Absolute Humidity Linear Time Kinetic
  • Relative Humidity Accelerating Kinetic
  • Absolute Humidity Accelerating Kinetic
  • Relative Humidity Decelerating Kinetic
  • Absolute Humidity Decelerating Kinetic
  • Relative Humidity Power Model
  • Absolute Humidity Power Model
  • Heuristic-based model selection
  • Diagnostic evaluation
  Dissolution
  • Standardised, automated data workflow
  • Exploratory data analysis: views X, Y, Z, ...
  • Automated modelling and manual assessment of quality of models
  • Observed dissolution data; observed test reference data
  • Weibull
  • Gompertz
  • Asymptotic
  • Origin Logistic
  • Heuristic-based model selection (see the sketch below)
  • Diagnostic evaluation
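To make the heuristic-based model selection step concrete, the sketch below fits a few candidate dissolution forms to a single curve and keeps the one with the smallest AIC. The parameterisations, starting values and data are assumptions for illustration; they are not the forms or heuristics implemented in the actual applications.

```r
# Fit several candidate nonlinear forms to one dissolution curve and
# select the best-fitting one by AIC (a simple selection heuristic).
time <- c(5, 10, 15, 20, 30, 45, 60)
y    <- c(18, 35, 50, 62, 78, 90, 95)   # % dissolved (illustrative data)
curve_data <- data.frame(time = time, y = y)

candidates <- list(
  Weibull  = y ~ a * (1 - exp(-(time / b)^c)),
  Gompertz = y ~ a * exp(-exp(b - c * time)),
  Logistic = y ~ a / (1 + exp(b - c * time))
)
starts <- list(
  Weibull  = list(a = 100, b = 15, c = 1),
  Gompertz = list(a = 100, b = 1, c = 0.1),
  Logistic = list(a = 100, b = 2, c = 0.1)
)

# try() keeps the loop going if any individual fit fails to converge
fits <- Map(function(form, st) try(nls(form, data = curve_data, start = st),
                                   silent = TRUE),
            candidates, starts)
ok   <- !vapply(fits, inherits, logical(1), what = "try-error")
aics <- vapply(fits[ok], AIC, numeric(1))
best <- fits[ok][[which.min(aics)]]

print(aics)
summary(best)
```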

  17. Modelling Dissolution Profiles Emily Matthews and Dave Woods Southampton Statistical Sciences Research Institute {E.S.Matthews, D. Woods}@soton.ac.uk

  18. Introduction • Hierarchical Modelling • Stage 1 • Stage 2 • Model Assessment • Visualisation Using RStudio

  19. What is a Profile?

  20. Why Automate the Modelling?
  • Capsule experiment – non-regular fractional factorial design.
  • 6 factors, 16 runs.
  • Aim: model dissolution profiles to identify treatments which pass tests.
  • Profile for a treatment must be ‘suitably’ close to the reference.
  • Four dissolution tests in four different media.
  • Capsule dissolved in three to twelve vessels per test.
  • 243 dissolution curves.

  21. Two-Stage Hierarchical Model
  We have used a two-stage hierarchical model for the dissolution curves:
  • Stage 1: fit a model to each dissolution curve, $y_{ij} = f(t_{ij}; \theta_i) + \epsilon_{ij}$, with $\epsilon_{ij} \sim N(0, \sigma^2)$. Model fit assessed using R², AIC and BIC.
  • Stage 2: fit a linear regression model $\hat{\theta}_i = X_i \beta + \delta_i$ to the p stage 1 parameter estimates.
  To predict the dissolution curve for a new treatment, we evaluate the stage 1 model using parameters predicted using the stage 2 model.

  22. Example – Gompertz Model
  • Stage 1: a Gompertz curve is fitted to each dissolution profile.
  • Stage 2: a linear regression relates the Gompertz parameter estimates to the design factors (sketched below).
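A minimal R sketch of the two-stage fit with a Gompertz stage 1 model. The parameterisation f(t) = a * exp(-exp(b - c*t)), the simulated curves and the single coded design factor are assumptions for illustration only, not the exact model from the case study.

```r
set.seed(1)
time   <- c(5, 10, 15, 20, 30, 45, 60)
design <- data.frame(curve = 1:8, x1 = rep(c(-1, 1), each = 4))  # one coded factor

# Simulate one dissolution curve per design point (illustrative data)
sim_curve <- function(x1) {
  a <- 95 + 3 * x1; b <- 1; c <- 0.08 + 0.02 * x1
  a * exp(-exp(b - c * time)) + rnorm(length(time), sd = 2)
}
curves <- lapply(design$x1, sim_curve)

# Stage 1: fit the Gompertz curve to each profile, keep the parameter estimates
stage1 <- t(sapply(curves, function(y) {
  fit <- nls(y ~ a * exp(-exp(b - c * time)),
             data = data.frame(time = time, y = y),
             start = list(a = 100, b = 1, c = 0.1))
  coef(fit)
}))

# Stage 2: linear regression of each stage 1 estimate on the design factor
stage2 <- lapply(colnames(stage1),
                 function(p) lm(stage1[, p] ~ x1, data = design))
names(stage2) <- colnames(stage1)

# Predict the curve for a new treatment (x1 = 0) via the stage 2 models
theta_new <- sapply(stage2, function(m)
  unname(predict(m, newdata = data.frame(x1 = 0))))
pred <- theta_new["a"] * exp(-exp(theta_new["b"] - theta_new["c"] * time))
```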

  23. Parametric Models for Dissolution Profiles

  24. Problems with Profiles: Test 1 and 2

  25. Models for Stage 1
  • Test 1: linear model, using lm in R.
  • Test 2: non-parametric model, principal components analysis (Jolliffe, 2002) using svd in R (see the sketch below).
  • Test 4: Weibull model.
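For the Test 2 option, the slide points to principal components analysis computed with svd in R; the sketch below shows one way that could look, representing each curve by its scores on the leading components. The simulated curve matrix and the 99% variance cut-off are assumptions for illustration.

```r
set.seed(2)
time     <- seq(5, 60, by = 5)
n_curves <- 12
# Simulated dissolution curves, one per row (illustrative data)
Y <- t(replicate(n_curves,
                 100 * (1 - exp(-(time / runif(1, 10, 20))^1.2)) +
                   rnorm(length(time), sd = 1.5)))

Yc  <- scale(Y, center = TRUE, scale = FALSE)  # centre each time point
dec <- svd(Yc)

# Proportion of variation explained by each component
prop_var <- dec$d^2 / sum(dec$d^2)

# Keep enough components to explain ~99% of the variation (assumed cut-off)
k      <- which(cumsum(prop_var) >= 0.99)[1]
scores <- Yc %*% dec$v[, 1:k, drop = FALSE]   # stage 1 "parameters" per curve

# A curve is reconstructed from its scores plus the mean curve
recon <- sweep(scores %*% t(dec$v[, 1:k, drop = FALSE]), 2,
               attr(Yc, "scaled:center"), "+")
```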

  26. Modelling for Stage 2
  Two methods considered to find estimates for the stage two parameters:
  • The Expectation-Maximisation (EM) algorithm (Davidian and Giltinan, 1995).
  • Samples from the Metropolis-Hastings within Gibbs sampling algorithm of Matthews and Woods (2015).
  • Variable selection.
  • Model-averaged predictions (see the sketch below).
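The EM and Metropolis-Hastings within Gibbs algorithms referenced above are too involved for a short sketch, so the code below only illustrates the model-averaged prediction idea for a single stage 1 parameter, using BIC weights over candidate stage 2 regressions. The data, candidate models and weighting scheme are assumptions and not the Bayesian spike-and-slab approach of Matthews and Woods (2015).

```r
set.seed(3)
# Coded design factors and (simulated) stage 1 estimates of one parameter
design <- expand.grid(x1 = c(-1, 1), x2 = c(-1, 1), x3 = c(-1, 1))
design$theta_hat <- 90 + 4 * design$x1 + rnorm(nrow(design), sd = 1)

# Candidate stage 2 regression models (assumed for illustration)
models <- list(
  m1 = lm(theta_hat ~ x1,           data = design),
  m2 = lm(theta_hat ~ x1 + x2,      data = design),
  m3 = lm(theta_hat ~ x1 + x2 + x3, data = design)
)

# BIC-based model weights
bic <- sapply(models, BIC)
w   <- exp(-0.5 * (bic - min(bic)))
w   <- w / sum(w)

# Model-averaged prediction of the parameter for a new treatment
new_point      <- data.frame(x1 = 1, x2 = -1, x3 = 0)
preds          <- sapply(models, predict, newdata = new_point)
model_averaged <- sum(w * preds)
```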

  27. Model Assessment

  28. Centre Point

  29. Factor 2 Low

  30. Factor 2 High

  31. Factor 4 Low

  32. Factor 4 High

  33. Factor 6 Low

  34. Factor 6 High

  35. Bibliography
  Davidian, M. and Giltinan, D.M. (1995) Nonlinear Models for Repeated Measurement Data. No. 62 in Monographs on Statistics and Applied Probability. Florida: Chapman and Hall.
  Jolliffe, I.T. (2002) Principal Component Analysis. New York: Springer, second edn.
  Matthews, E.S. and Woods, D.C. (2015) A Bayesian analysis of split-plot designs with spike-and-slab priors. In preparation.

  36. Summary

  37. Seek and Incubate Formulation Workflow (diagram)
  • Data
  • “New” model building methodology & selection (R)
  • Import model description
  • Data visualisation and “existing” model building & selection (JMP)
  • Dashboard application output for general users (JMP)
  • Mainstream scientific decision makers

  38. Acknowledgements include....
  Case study 1
  • ASAP Incubation Team: Don Clancy, Jonathan Dean, Neil Hodnett, Rachel Orr, Martin Owen, John Peterson
  • David Burnham (PegaAnalytics)
  Case study 2
  • Capsule Challenge Team: the GSK project team members
  • Emily Matthews, Dave Woods, Verity Fisher
