
a starting point for:

“Using simulation in parallel computing for faster sample size calculations in complex random effects models”. Toni Price, University of Bristol.



  1. a starting point for: “Using simulation in parallel computing for faster sample size calculations in complex random effects models” Toni Price, University of Bristol

  2. MLPowSim
     • Developed in a separate ESRC-funded project
     • Generates both MLwiN macro code and R language code for performing sample size calculations on multilevel models
     • Works for a selection of multilevel nested and crossed designs
     • Text-based interface
     • Uses C code to gather user input and generate output

  3. Initial objective: Use MLPowSim as a basis and extend to support a broader range of models
     • Good starting point, but would benefit from an automated way of testing that generated code matches expected output (especially as new and more complex models are added)

  4. First step
     Put into a cohesive framework:
     • Streamline duplicated code (e.g. for user input which is similar across different models)
       • Also improves code maintenance (e.g. bug fixes impacting fewer lines of code)
     • Improve input validation
       • Makes for a better user experience and reduces crashes
     • Automate testing of generated code and results
     • Add multiple user interfaces, e.g. command line / file input / web-based

  5. Ruby is …
     • Much like Python in a number of ways
     • Cross-platform
     • A good choice for metaprogramming
     • Excellent for text processing
     … though in the end it boils down to personal preference

  6. … moving to Ruby
     In the words of the official Ruby site (http://www.ruby-lang.org/en/), Ruby is “A dynamic, open source programming language with a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write.” (… I agree!)

  7. Input methods
     • Command line
       • Current input method
     • File input
       • Useful during development
       • Facilitates automated testing
     • Web interface
       • Familiar mode of input
       • ‘Easy’ to use

  8. File input – Example for a 1-level model

     # Input params
     #
     # Example 1 (p. 8 in MLPowSim user manual)
     # MLwiN code output
     general:
       output_lang: mlwin
       rnd_num_seed: 1
       sig_level: 0.025
       n_sims: 1000
     model:
       n_levels: 1
       response_type: normal
       est_method: igls
       include_fixed_intercept: yes
       n_explanatory_vars: 0
     estimates:
       beta_0: -0.140
       sigma_sq_e: 1.051
     sample_size:
       level_1:
         low: 20
         hi: 600
         step: 20

  9. File input – Example for a 2-level model

     # Input params
     #
     # Example 8 (p. 39 in MLPowSim user manual)
     # MLwiN code output
     general:
       output_lang: mlwin
       rnd_num_seed: 1
       sig_level: 0.025
       n_sims: 1000
     model:
       n_levels: 2
       is_balanced: yes
       structure: nested #=> nested | cross-classified
       response_type: normal
       est_method: igls
       include_fixed_intercept: yes
       include_random_intercept: yes
       n_explanatory_vars: 0
     estimates:
       beta_0: -0.177
       sigma_sq_u: 0.151
       sigma_sq_e: 0.916
     sample_size:
       level_2:
         low: 10
         hi: 50
         step: 10
       level_1:
         low: 10
         hi: 60
         step: 10
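A minimal sketch of how a file like the ones above might be read and validated in Ruby. This is not the project's actual code: the helper names `expand_sizes` and `load_sample_sizes` are hypothetical, and the embedded input only borrows the `sample_size` keys shown in the 2-level example. It shows the `low` / `hi` / `step` triples being expanded into explicit lists of sample sizes, with a basic validity check.

```ruby
require "yaml"

# Hypothetical input, mirroring the sample_size section of the 2-level example.
INPUT = <<~YAML
  general:
    sig_level: 0.025
    n_sims: 1000
  sample_size:
    level_2:
      low: 10
      hi: 50
      step: 10
    level_1:
      low: 10
      hi: 60
      step: 10
YAML

# Expand a {low, hi, step} hash into the list of sample sizes it describes.
# Raises ArgumentError on missing, non-integer or non-positive values.
def expand_sizes(spec)
  low, hi, step = spec.values_at("low", "hi", "step")
  unless [low, hi, step].all? { |v| v.is_a?(Integer) && v > 0 }
    raise ArgumentError, "low, hi and step must be positive integers"
  end
  raise ArgumentError, "low must not exceed hi" if low > hi

  low.step(hi, step).to_a
end

# Parse the YAML text and return one list of sample sizes per level.
def load_sample_sizes(yaml_text)
  params = YAML.safe_load(yaml_text)
  params.fetch("sample_size").transform_values { |spec| expand_sizes(spec) }
end

load_sample_sizes(INPUT).each do |level, sizes|
  puts "#{level}: #{sizes.join(', ')}"
end
```

Keeping validation in one small function is one way to get the "bug fixes impacting fewer lines of code" benefit mentioned earlier, since every model type can share it.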

  10. Advantages of adding a Web interface
      • More accessible
        • No download required
        • Indexed by search engines
        • Cross-platform (Windows/Mac/Linux)
      • Up-to-date version available as soon as deployed
        • Centralised bug fixes
        • New features
        • No distribution overhead
      • Opportunity to collect usage information
        • E.g. model parameters
      … aligned with e-Stat objectives

  11. Disadvantages of Web interface
      • “Constrained” by browser functionality
      • Need to be online to use it
      • Needs hosting resources
      … fine for code-generation app as it stands, but would be too resource-intensive to run simulations and model-fitting on server

  12. [Demo of command-line and Web-based interfaces for MLPowSim]

  13. Improving speed
      • Another, parallel (so to speak ☺) objective is using parallelization to speed up run-time for generated power calculation code
      • Have taken an initial look at using capabilities of multi-core processors by executing more than one run simultaneously
      • Exploratory code makes use of Unix (Linux) ‘forking’ to create sub-processes
      • This approach will not work on Windows (since Windows does not support forks)
      • Precludes possibility of using this approach for MLwiN
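The forking idea can be sketched in Ruby roughly as follows. This is an illustration under stated assumptions, not the exploratory code referred to above: the helper name `run_in_forks` is hypothetical, and the work done per sub-process here is a trivial placeholder where a real run would launch one power calculation. Each task gets its own forked child, and results come back to the parent through a pipe.

```ruby
# Run the given block once per task, each in its own forked sub-process,
# and return the results in task order.
# Process.fork is a Unix facility, so this does not run on Windows.
def run_in_forks(tasks)
  children = tasks.map do |task|
    reader, writer = IO.pipe
    pid = Process.fork do
      reader.close
      writer.write(Marshal.dump(yield(task))) # send the result to the parent
      writer.close
      exit!(0) # leave the child immediately, skipping at_exit handlers
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  children.map do |pid, reader|
    result = Marshal.load(reader.read) # blocks until the child has finished writing
    reader.close
    Process.wait(pid)
    result
  end
end

# Placeholder workload: one sub-process per sample size.
p run_in_forks([400, 500, 600]) { |n| n * 2 }
```

One sub-process per sample size is only one possible split; the simulations within a single sample size could equally be divided among the children.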

  14. Improving speed … contd.
      • For now, doing tests on R code in Linux
      Initial results (very rough, just a starting point):
      • Model: 1-Level, Normal response, Fixed intercept, No explanatory variables
      • R code with sample sizes from 400 to 600 in steps of 100 (i.e. 400, 500, 600)
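For orientation, the kind of computation being timed can be sketched as below. This is a simplified Ruby stand-in, not the R code MLPowSim generates: it assumes power is estimated as the proportion of simulated datasets in which the intercept differs significantly from zero, uses the `beta_0` and `sigma_sq_e` values from the 1-level file input example, and assumes a critical value of 1.96 (reading `sig_level: 0.025` as one tail of a two-sided 5% test). The function names are hypothetical.

```ruby
# Draw one standard normal deviate using the Box-Muller transform.
def std_normal(rng)
  Math.sqrt(-2.0 * Math.log(1.0 - rng.rand)) * Math.cos(2.0 * Math::PI * rng.rand)
end

# Estimate the power to detect a non-zero intercept with n observations:
# simulate n_sims datasets, test each one, and return the share found significant.
def simulated_power(n, beta0:, sigma_sq_e:, n_sims:, z_crit: 1.96, seed: 1)
  rng = Random.new(seed)
  sd = Math.sqrt(sigma_sq_e)
  hits = n_sims.times.count do
    ys = Array.new(n) { beta0 + sd * std_normal(rng) }
    mean = ys.sum / n
    var = ys.sum { |y| (y - mean)**2 } / (n - 1)
    (mean / Math.sqrt(var / n)).abs > z_crit # intercept estimate over its standard error
  end
  hits.to_f / n_sims
end

# The three sample sizes used in the timing tests.
[400, 500, 600].each do |n|
  power = simulated_power(n, beta0: -0.140, sigma_sq_e: 1.051, n_sims: 1000)
  puts format("n = %d  power = %.3f", n, power)
end
```

Because each sample size is simulated independently, the three runs do not depend on one another, which is what makes them candidates for running simultaneously.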

  15. Improving speed … contd.

  16. Improving speed … contd. Summary

  17. Where to from here?
      … this is just a small start …
      • Extend MLPowSim to support more models
      • Add test cases for code generation to cope with more models
      • Add automated tests for verifying actual numerical output
      • Further develop Web interface
      • Continue investigating speed improvements through parallelization
