Accessing the data: going beyond what the author wanted to tell you

Interactive Publications and the Record of Science, ICSTI Winter Workshop, Paris, Monday, February 8, 2010. Brian McMahon, International Union of Crystallography, 5 Abbey Square, Chester CH1 2HU, UK. bm@iucr.org

Presentation Transcript


  1. Accessing the data: going beyond what the author wanted to tell you • Interactive Publications and the Record of Science, ICSTI Winter Workshop, Paris, Monday, February 8, 2010 • Brian McMahon, International Union of Crystallography, 5 Abbey Square, Chester CH1 2HU, UK • bm@iucr.org

  2. PDFs and data impoverishment • Henry Rzepa, 19 December 2009: "Publishers are likely to love interactive PDF, since it is easy to archive. However ... such objects are data impoverished. Whereas with Jmol, one is obliged to provide semantically accurate data (e.g. CML or equivalent), the PDF object is simply a (pre)rendering of that data. Thus reconstituting a useful molecule from Jmol is trivial (and that reconstitution can then be used for many other purposes), reconstituting a molecule from a 3D PDF is likely to be non trivial, and will almost certainly suffer information loss compared to the original data. By all means, provide both, but I strongly urge that a 3D PDF should not be the only object provided." http://www.mail-archive.com/jmol-users@lists.sourceforge.net/msg13417.html
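
The point about semantically accurate data can be illustrated with a minimal sketch: when coordinates are published rather than a pre-rendered picture, a reader can derive quantities the author never reported. The coordinates and the helper names `distance` and `angle` below are hypothetical, chosen only for illustration, and are not taken from the talk.

```python
import math

# Hypothetical Cartesian coordinates in angstroms for a water-like molecule.
# These values are illustrative assumptions, not data from the presentation.
atoms = {
    "O": (0.000, 0.000, 0.000),
    "H1": (0.957, 0.000, 0.000),
    "H2": (-0.240, 0.927, 0.000),
}


def distance(a, b):
    """Return the distance between two named atoms."""
    return math.dist(atoms[a], atoms[b])


def angle(a, centre, b):
    """Return the angle a-centre-b in degrees."""
    va = [p - q for p, q in zip(atoms[a], atoms[centre])]
    vb = [p - q for p, q in zip(atoms[b], atoms[centre])]
    dot = sum(x * y for x, y in zip(va, vb))
    cos_theta = dot / (math.hypot(*va) * math.hypot(*vb))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    print(f"O-H1 distance: {distance('O', 'H1'):.3f}")
    print(f"H1-O-H2 angle: {angle('H1', 'O', 'H2'):.1f}")
```

Neither value could be recovered reliably from a rendered image alone, which is the information loss the quotation warns about.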

  3. Jmol interactive visualizations • Not new • Biochem. J. (2008). 412, 399–413 • Bespoke design/implementation • Expensive • Requires consultation • Supplementary information

  4. The right tool for the job • Jmol • Then (ca. 2004): protein structures (RasMol); small organic chemical molecules (Chime) • Now: crystal lattices (symmetry); inorganic materials (coordination polyhedra); displacement ellipsoids; symmetry operations; electron orbitals; electron density maps

  5. Making it easier to use • Editing toolkit • http://submission.iucr.org/jtkt • High-quality immediate visual feedback • Context-sensitive help • Manuals, examples, tutorials • Reference: McMahon, B. & Hanson, R.M. (2008). A toolkit for publishing enhanced figures. J. Appl. Cryst. 41, 811-814.

  6. Interactive molecular visualizations enhance understanding Acta Cryst. (2008). F64, 156-162 • Rotate • Modify orientation • Alternative representations • Overlay representations • Interrogate

  7. Infrastructure for publication workflow • Server/client architecture • Ability to create interactive figures before or during article submission/review • Opportunity for peer review/revision • Auto-generation of static equivalent • Easy generation/activation of multiple scripts to provide alternative views

  8. Requirements for routine publication of enhanced figures • Platform independence • Web access for authors • Serving visualization application and data • Integration into submission/review procedures • Integration into journal production workflow • Automated generation of static copy (for failsafe/PDF edition/archiving) • Authoring tools

  9. The authoring environment • The author uploads a data file (CIF) • The system provides different default styles according to the type of structure • The author edits and annotates the view • The author may supply additional scripts • The author saves the result as an enhanced figure + publication-quality static figure
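
As a rough sketch of how a system might choose a default style from the type of structure, the function below maps simple properties of an uploaded structure to a preset name. The preset names, the metal list, and the atom-count threshold are all assumptions made for illustration; the talk does not describe the toolkit's actual rules.

```python
# Hypothetical subset of metal symbols used to guess "inorganic material".
METALS = {"Ti", "Mn", "Fe", "Co", "Ni", "Cu", "Zn"}

# Hypothetical threshold above which a structure is treated as a macromolecule.
MACROMOLECULE_ATOMS = 1000


def default_style(elements, atom_count):
    """Return an illustrative preset name for a structure.

    elements   -- iterable of element symbols present in the structure
    atom_count -- total number of atoms in the structure
    """
    if atom_count >= MACROMOLECULE_ATOMS:
        return "ribbon"  # e.g. protein structures
    if METALS & set(elements):
        return "polyhedra"  # e.g. inorganic materials
    return "ball-and-stick"  # e.g. small organic molecules


if __name__ == "__main__":
    print(default_style(["C", "H", "O"], 24))
    print(default_style(["Ti", "O"], 6))
    print(default_style(["C", "H", "N", "O", "S"], 5200))
```

The author would then edit and annotate the view starting from this preset rather than from a blank canvas.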

  10. Saving the enhanced figure • Interactive applet • Active scripts provided by the author • High-resolution static image • Option to view dynamic or static image online • Link to allow peer review

  11. The toolkit editing interface • Essential tool for authors • Accommodates novice and advanced users • Tabbed interface allows authors to concentrate on scientific aspects of visualization • Presets tuned to journal style requirements • Live testing, preview and feedback mechanisms

  12. Submission/review • Author may prepare enhanced figure ahead of publication • Simply enter URL of edit workspace when asked to ‘upload source files’ • Presented alongside other conventional figures • Available for peer review • Can be edited in response to referee comments

  13. Interactive authorship: publBio http://publbio.iucr.org • Start with the data (PDB), example 3jw1 • Add structured text • Online look-up: authors; references; crystallization solution components • Validation: references • Visualisation (Jmol) • Update data file as submission vehicle

  14. Uniform (compatible) markup systems • Crystallographic Information Framework (CIF) • Treat data/metadata, text/numerical data as peers • Domain-specific extensions (dictionaries = ontologies) • Image format • Some data fields may need to contain richer content: text markup; mathematical equations; interactive figure scripts • Machine validation of dictionary attributes • Methods
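
To make the idea of machine validation against a dictionary concrete, here is a minimal sketch that reads simple tag/value pairs from a CIF-like text and checks them against a toy dictionary. It is an assumption-laden simplification: it ignores `loop_` constructs, multi-line text fields, and standard uncertainties such as `5.431(2)`, and the `TOY_DICTIONARY` mapping is a stand-in for the far richer real CIF dictionaries.

```python
# Illustrative CIF-like fragment; the values are made up for this example.
CIF_TEXT = """\
data_example
_cell_length_a 5.431
_cell_length_b 5.431
_cell_length_c 5.431
_chemical_formula_sum 'Si'
"""

# Toy dictionary: tag -> expected Python type (an assumption for illustration).
TOY_DICTIONARY = {
    "_cell_length_a": float,
    "_cell_length_b": float,
    "_cell_length_c": float,
    "_chemical_formula_sum": str,
}


def parse_simple_cif(text):
    """Collect single-line `_tag value` pairs; everything else is skipped."""
    items = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("_"):
            continue  # skips blank lines, comments, and data_ block headers
        parts = line.split(None, 1)
        if len(parts) == 2:
            tag, value = parts
            items[tag] = value.strip("'\"")
    return items


def validate(items, dictionary):
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    for tag, value in items.items():
        expected = dictionary.get(tag)
        if expected is None:
            problems.append(f"{tag}: not defined in dictionary")
        elif expected is float:
            try:
                float(value)
            except ValueError:
                problems.append(f"{tag}: expected a number, got {value!r}")
    return problems


if __name__ == "__main__":
    items = parse_simple_cif(CIF_TEXT)
    print(items)
    print(validate(items, TOY_DICTIONARY))
```

Because data and metadata share one tagged format, the same check applies equally to a cell length and to a text field.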

  15. Conclusions • The working scientist really wants to interact with the data • What interactive PDF offers is currently limited • Publishers should develop compatible architectures • Need domain-specific implementations (learned societies) • Investment in new applications; integration with workflow • Education for a new paradigm • Archiving: requires more standardisation; proper compound document model; concentrate on data (or semantic content), not the implementation; ‘record not what it looks like, but what you are looking at’ • Distributed content sources: data not necessarily integral part of document; retrieval of non-discrete data sets
