The CCPN Project. Tim Stevens and Wayne Boucher, October 2005. CCPN at Göteborg: Day 1. Introduction to CCPN, the CcpNmr applications, Analysis basics, future developments, Analysis advanced. CCPN at Göteborg: Day 2. An overview of the data model, API tutorial, Analysis macros.


The CCPN Project

Tim Stevens and Wayne Boucher

October 2005


CCPN at Göteborg: Day 1

  • Introduction to CCPN
  • The CcpNmr applications
  • Analysis basics
  • Future developments
  • Analysis advanced

CCPN at Göteborg: Day 2

  • An overview of the data model
  • API Tutorial
  • Analysis Macros
  • Widgets and Popups
The CCPN Project
  • Collaborative Computing Project for NMR
    • Started in 1999
    • Collaborators in several countries
    • Developers at University of Cambridge and EBI
  • Unifying platform for NMR software
    • Similar to CCP4 (X-ray)
  • Main goals:
    • Data standards and data exchange
    • Software development and distribution
    • Meetings to determine and disseminate best practice
    • Open source access
People
  • Cambridge
    • Ernest Laue
    • Rasmus Fogh
    • Dan O’Donovan
  • EBI, Hinxton
    • Kim Henrick
    • John Ionides
    • Wim Vranken
    • Anne Pajon
History
  • Workshops:
    • EBI (2000, 2001)
    • Washington (2000)
  • Funding:
    • BBSRC (2000-2003, 2003-2006)
    • NMRQUAL (2001-2004)
    • TEMBLOR (2002-2005)
    • NMR-EXTEND (2005-2008)
NMR Software
  • Problem - Heterogeneous development
    • Lots of proprietary data formats
    • Lots of stand-alone programs
    • Data is ‘lost’ along the way
    • Dedicated converters needed
    • Not acceptable for structural genomics projects
  • Solution - Unity
    • Data standards
      • Ease of transfer between programs
      • Completeness, integrity, deposition, data mining
    • Libraries
Data Format vs. Data Model
  • Data format - How data is stored
    • STAR
    • XML
    • SQL
    • Tab-separated ascii
  • Data model - What data means
    • RCSB (PDB) mmCIF
    • XML DTD or schemas
    • SQL schema
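The format/model distinction can be sketched in a few lines of Python. This is only an illustration: the `ShiftRecord` class and its field names are hypothetical and not part of the CCPN data model. One model (what a chemical shift record means) is written in two formats (how it is stored), and both round-trip to the same object.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict

# Hypothetical, minimal "data model": what one chemical shift record means.
@dataclass
class ShiftRecord:
    atom_name: str
    residue_number: int
    value_ppm: float

# Data format 1: tab-separated ASCII.
def to_tsv(record):
    return "\t".join(str(value) for value in asdict(record).values())

def from_tsv(line):
    atom_name, residue_number, value_ppm = line.split("\t")
    return ShiftRecord(atom_name, int(residue_number), float(value_ppm))

# Data format 2: XML, one element with one attribute per field.
def to_xml(record):
    attributes = {key: str(value) for key, value in asdict(record).items()}
    return ET.tostring(ET.Element("ShiftRecord", attributes), encoding="unicode")

def from_xml(text):
    element = ET.fromstring(text)
    return ShiftRecord(element.get("atom_name"),
                       int(element.get("residue_number")),
                       float(element.get("value_ppm")))

record = ShiftRecord("HA", 42, 4.31)
tsv_line = to_tsv(record)
xml_text = to_xml(record)
```

The application code only ever sees `ShiftRecord`; swapping the storage format touches the reader and writer functions and nothing else.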
CCPN Approach
  • Data model rather than data format
    • Format independent
    • Language independent
    • Scientifically descriptive (NMR)
  • Library (API): in memory manipulation
    • Create, update, delete & query objects
    • One for each language
    • Error checking
  • I/O modules: load/store data from/to disk
    • One for each (storage format, language)
    • Bookkeeping
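A rough sketch of what "in memory manipulation with error checking" can look like, assuming a toy `Project` class with hypothetical method names (this is not the CCPN API). It covers create, query and delete; update would follow the same pattern.

```python
class ApiError(Exception):
    """Raised when a change would leave the in-memory data invalid."""

class Project:
    """Hypothetical root object holding experiment records by name."""

    def __init__(self):
        self._experiments = {}

    def new_experiment(self, name, num_dim):
        # Create: validity is checked before anything is stored.
        if name in self._experiments:
            raise ApiError("duplicate experiment name: %r" % name)
        if not isinstance(num_dim, int) or num_dim < 1:
            raise ApiError("num_dim must be a positive integer")
        self._experiments[name] = {"name": name, "num_dim": num_dim}
        return self._experiments[name]

    def find_experiments(self, **conditions):
        # Query: return every experiment matching all keyword conditions.
        return [expt for expt in self._experiments.values()
                if all(expt.get(key) == value for key, value in conditions.items())]

    def delete_experiment(self, name):
        # Delete: refuse to remove something that does not exist.
        if name not in self._experiments:
            raise ApiError("no such experiment: %r" % name)
        del self._experiments[name]

project = Project()
project.new_experiment("HSQC", 2)
project.new_experiment("HNCA", 3)
three_d = project.find_experiments(num_dim=3)
```

Nothing here reads or writes files; in the approach described above that job belongs to separate I/O modules.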
Application View

[Diagram: layered architecture. User → GUI → applications (Application1, Application2, Application3) → API, the in-memory representation (Python, Java, C++, Perl) → I/O → data store (XML, SQL).]

Model-Driven Architecture
  • UML: Unified Modelling Language
    • Abstract representation of semantics
    • Pictorial
  • Mapping from UML: to anything
    • Multi-language
    • Multi-format
    • Architecture neutral (e.g. distributed or not)
  • Power: good and bad
  • CCPN uses Object Domain as its UML tool
    • Python as scripting language
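[Diagram: the MEMOPS framework. Domain experts and program developers work on the UML model (Package 1, Package 2, Package 3). Autogeneration, with about 1% hand-written code, produces the documentation, the APIs (C, Python, Java, Perl) and the storage layers (SQL, XML) used for applications, users and deposition.]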

Data Model Packages

[Diagram: data model packages, grouped under reference and experimental data. Recoverable labels: CcpNmr programs, citations, laboratory, NMR, protocols, samples, nuclei and isotopes, molecule, sequence, structure and coordinates, compound, source, preparation, molecular system, residue template, targets, project tracking, organisms and taxonomy, X-ray crystallisation, crystallography.]

CCPN API
  • Classes for developers
    • Mainly getters and setters
    • More than just code stubs
    • Constraints (e.g. cardinality) enforced
    • Links are the hard part
  • Mostly (> 99%) auto generated from UML
    • Some helper functions and constraints hand coded
  • Currently around 360k lines in Python and 650k lines in Java
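The idea of generating checked getters and setters from a model can be sketched as follows. This is a toy: the `MODEL` dictionary stands in for the UML model, `make_class` is a hypothetical helper, and the real generator writes source code rather than building classes at run time.

```python
# Hypothetical model description: attribute name -> (type, is_mandatory).
MODEL = {
    "Peak": {"height": (float, False), "serial": (int, True)},
}

def make_class(class_name, attributes):
    """Build a class with checked get/set methods from a model description."""

    def __init__(self, **values):
        self._values = {}
        # Cardinality check: mandatory attributes must be supplied.
        for name, (_, mandatory) in attributes.items():
            if mandatory and name not in values:
                raise ValueError("%s.%s is mandatory" % (class_name, name))
        # Route every initial value through its generated setter.
        for name, value in values.items():
            getattr(self, "set" + name.capitalize())(value)

    namespace = {"__init__": __init__}
    for name, (attr_type, _) in attributes.items():
        # Default arguments bind the current name and type for each method.
        def getter(self, _name=name):
            return self._values.get(_name)

        def setter(self, value, _name=name, _type=attr_type):
            if not isinstance(value, _type):
                raise TypeError("%s must be %s" % (_name, _type.__name__))
            self._values[_name] = value

        namespace["get" + name.capitalize()] = getter
        namespace["set" + name.capitalize()] = setter
    return type(class_name, (object,), namespace)

Peak = make_class("Peak", MODEL["Peak"])
peak = Peak(serial=1)
peak.setHeight(2.5e6)
```

Adding an attribute to `MODEL` yields a new checked getter and setter without any hand-written code, which is the point of the autogeneration figure quoted above.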
Developer Benefits
  • Specified data model and API
  • No I/O code
  • Concentrate on science, not bookkeeping
  • Extendible
    • Application data can be assigned to any object
    • UML model can be extended (packages)
  • Notification system
    • Register interest when specified attribute changes (class, not object, level)
  • Undo/Redo (in future)
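A minimal sketch of class-level notification, assuming hypothetical names (`register_notify`, `Notifying`) that are not taken from the CCPN code. A callback is registered for a class and attribute, and then fires for every object of that class.

```python
from collections import defaultdict

# (class, attribute name) -> callbacks. Registration is per class, not per object.
_notifiers = defaultdict(list)

def register_notify(callback, cls, attr_name):
    _notifiers[(cls, attr_name)].append(callback)

class Notifying:
    """Hypothetical base class whose attribute changes trigger notifiers."""

    def __setattr__(self, attr_name, value):
        object.__setattr__(self, attr_name, value)
        for callback in _notifiers[(type(self), attr_name)]:
            callback(self)

class Peak(Notifying):
    pass

changed = []
register_notify(changed.append, Peak, "height")

peak_a, peak_b = Peak(), Peak()
peak_a.height = 1.0    # notifies
peak_b.height = 2.0    # notifies: same class, different object
peak_a.volume = 5.0    # nothing registered for 'volume'
```

A GUI table listing peaks could register one such callback and redraw itself whenever any peak height changes.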
Current Status of API
  • Stable and released:
    • Python and XML code generation
    • NMR, molecule description and structure data model
  • In testing stages:
    • Java and SQL database code generation
    • Protein production data model
  • Preliminary:
    • X-ray crystallography data model
Structural Biology Pipeline

[Diagram: NMR machine → data processing → spectrum analysis → structure calculation → databases.]

NMR Applications

[Diagram: the CCPN Data Model, with its reference data, at the centre, linked to CcpNmr Processing, CcpNmr Analysis, ARIA 2.0, validation software, NMRStar 3.0 and, through the CcpNmr FormatConverter, other formats (NmrView, XEasy, …).]

Main CcpNmr Applications
  • Format Converter
    • Conversion to and from legacy formats
  • Analysis
    • Graphical analysis (e.g. assignment) program
  • Processing (coming soon)
    • Azara “process” wrapped in data model
CcpNmr Format Converter
  • Import/export of data formats to the Data Model
    • For harvesting/deposition purposes
    • Allow people to use or try out the data model
    • Interaction with existing programs
  • Fully or partially handles:
    • Ansig, Auremol, Autoassign, Azara, Bruker, Charmm, CNS/XPLOR/ARIA, Concoord, Diana/Dyana/Cyana, Discover, Fasta, Felix, Module, .mol, Molmol, Monte, NmrDraw, NMRPipe, NMR-STAR (v2.1.1, v3.0), NmrView, Pdb, Pipp, Pistachio, Pronto, Sparky, Talos, Varian, XEasy
    • Sequences, chemical compounds, coordinates, NMR measurements, constraints and peak lists, processing and acquisition parameters.
Format Converter - The NMR Translator

[Diagram: format-specific readers take peaks and chemical shifts (XEasy, NmrView, ...) and acquisition parameters (Bruker, Varian) into generic peak, chemical shift and acquisition parameter converters, which make the data model entry in the CCPN Data Model. Format-specific writers then export peaks and chemical shifts (XEasy, NmrView, ...) and processing parameters (Azara, NMRPipe).]
Format Converter Design
  • Wim Vranken (EBI)
  • Set of Python scripts
  • Accessed via:
    • Tkinter (Tcl/Tk)
    • custom Python scripts
  • http://www.ebi.ac.uk/msd-srv/docs/NMR/NMRtoolkit/main.html
CcpNmr Analysis
  • Requirements
    • Cross platform
    • Scalable
    • Extensible
    • Open and easy scripting language
    • Modern graphical user interface
    • Uses CCPN data model and API
  • Software
    • Python, Tcl/Tk, C, OpenGL
    • (Java, X, Motif)
  • OS
    • Linux, Sun, SGI, OSX (Windows)
Spectrum Windows
  • N-dim. windows
  • Multiple spectra
  • Automatic mapping
  • Contours on the fly
  • Aliasing
  • Strips & cells
  • Mouse and key
  • Blocked data
    • Azara
    • Felix
    • NMRPipe
    • UCSF
Graphical Interface
  • Menus and popup dialogues
    • CcpNmr widgets
  • Main objects
    • Spectra
    • Windows
    • Peaks
    • Resonances
    • Molecules
    • Structures
Assignment
  • Peak finding and fitting
  • Rich assignment model
  • Mainly mouse-driven
  • Can assign to atoms
  • Ambiguous contributions
  • Existing structure
  • Short resonance list
  • Multiple peaks easily
  • Navigation
The CLOUDS Protocol

  • Automated assignment & structure determination
    • Miguel Llinas, Alex Grishaev, et al.
    • Spatial distribution of anonymous resonances generated with NOEs
  • Integrated within CCPN
    • An Analysis module
    • Data Model glues modules
    • Functional platform
    • Distribution network

[Flowchart: spectra → pick peaks, link shifts & combine → spin systems; spectra → pick peaks & normalise → NOE intensities → relaxation matrix optimisation → distance constraints → hydrogen atom molecular dynamics → proton clouds → chain fitting & molecular replacement → chain assignment → full structure calculation → protein structure.]
The CLOUDS Protocol

[Images: a fitted protein backbone; a family of Clouds.]

Other Features
  • Works with FormatConverter
  • Chemical compounds database
  • NMR reference information
  • Hard copy
    • PostScript
    • PDF
  • Table export
  • Rate analysis
  • Macros
  • Structures
Extend-NMR
  • EU STREP application funded to fully integrate software from:
    • Bruker (TOPSPIN, acquisition)
    • Billeter, Orekhov (Garant, Munin, MDD)
    • Kalbitzer (Auremol)
    • Llinas (CLOUDS)
    • Nilges (Inferential Structure Determination)
    • Bonvin (Haddock, RECOORD)
    • Vriend, Vuister (Queen, What-Check)
    • Henrick, Vranken (NMR database)
  • Focus on complexes and development of better software methodology
LIMS Collaborations
  • PIMS project collaboration
    • Protein production LIMS (with EBI, Sport Consortia, OPPF and Poupon)
  • EU STREP application (SFGLIMS) to work with:
    • Poupon (Protein Production)
    • Perrakis (Biophysical methods, crystallisation)
    • Bricogne (X-ray data collection and structure generation)
    • Prilusky, Sussman (Bioinformatics, data mining)
Data Model Extensions
  • EXTEND-NMR
    • New NMR applications
  • Solid state NMR
  • PIMS
    • LIMS for protein production
  • SFGLIMS
    • LIMS for NMR and X-ray structure determination
  • X-ray
  • Chemoinformatics
  • (Metabolomics?)
Code Generation Plans
  • C++/C/FORTRAN code
    • Needed for Extend-NMR and for CcpNmr Processing
    • Needed for interface to CYANA, NMRPIPE, AUTOPSY, etc.
  • Java/Database code
    • Extend for LIMS, high-throughput projects, NMRVIEW
  • Basic Machinery
    • Upgrades for long term extensibility/maintainability and performance
API Languages and Formats

[Diagram: grid of languages (Python, Java, C++, Perl) against storage formats (XML, SQL).]

For all languages:

  • Metamodel
  • Documentation

For all formats:

  • Schemas
  • I/O mappings
New Core API technology
  • Reduce burden of adding new languages, formats
    • Languages (Python, Java, C++, Perl)
    • Storage formats (XML, SQL)

[Diagram: the code divides into four parts: language and format independent (most of the logic), language dependent only, format dependent only, and language and format dependent. The dependent parts are marked as the code required for a new language or for a new format.]

Core API technology, cont.
  • Remodelling of implementation details
    • Storages, collection types, root objects, etc.
  • Complex data types
    • e.g. rotation matrix
  • Client/Server architecture
    • For PIMS and SFGLIMS
Analysis Development
  • Beyond CLOUDS
    • Large proteins, homologues
  • Processing linked in
  • Couplings (RDCs, TROSY), dihedral constraints
  • Titrations (Ka, Kd)
  • Chain states (alternate conformations)
  • Solid State NMR
  • Organic chemistry NMR (1D)
  • Publication-ready diagrams and tables
  • Windows version
Developments in Extend-NMR
  • Integrated Bayesian, maximum entropy, … methods for data-processing, analysis and structure calculation
  • ‘Molecular replacement’ for NMR
  • Further RECOORD development
  • Databank for Experimental NMR spectra (DEN)
  • MSD database analysis
Licenses
  • GPL
    • Data model
    • Scripts which produce APIs
  • LGPL
    • Generic libraries
    • Widget libraries
    • Format Converter
  • CCPN
    • Analysis
Resources, 1
  • SourceForge:
    • CVS repository for code
    • API and FormatConverter releases
    • http://sourceforge.net/projects/ccpn
  • CCPN:
    • Meetings, workshops
    • API, FormatConverter and Analysis releases
    • http://www.ccpn.ac.uk
Resources, 2
  • EBI:
    • Format Converter
    • Databases (MSD group)
    • http://www.ebi.ac.uk/msd-srv/docs/NMR/NMRtoolkit/main.html
  • JISCMAIL:
    • Email list
    • http://www.jiscmail.ac.uk/lists/ccpnmr.html
    • (http://www.jiscmail.ac.uk/lists/nmrgen.html)

CCPN at Göteborg: Day 2

  • An overview of the data model
  • API Tutorial
  • Analysis Macros
  • Widgets and Popups

CCPN Packages

  • Groupings of related data
    • e.g. NMR, X-ray, Molecular description
  • Connections between packages
    • e.g. NMR loads Nucleus (isotope) information
  • Allows lazy loading
    • Only load relevant data
    • Only load when a link is queried
  • Save only modified
  • Reference packages
    • Chemical compound, Reference chemical shifts

[Diagram: linked packages: Molecule, ChemComp, People, MolSystem, Nucleus, Sample, Coordinates, Nmr.]

Molecules and MolSystems

  • Molecules
    • Templates for specifying molecular connectivity.
    • Sequences, chemical components, protonation state etc.
    • A kind of reference, e.g. “Lysozyme”
  • MolSystems
    • Contain chains, which contain residues, which contain atoms.
    • The objects you assign to.
    • Built using molecule templates, e.g. a homo-oligomer is built using the same template to make different chains.
  • Stored in different packages
    • Molecule.xml, MolSystem.xml
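The template relationship can be illustrated with plain dataclasses. The class and field names below are simplified stand-ins rather than the real data model classes, and the three-residue sequence is made up.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Molecule:
    """Template: a name plus a sequence of residue types."""
    name: str
    sequence: tuple

@dataclass
class Chain:
    code: str
    molecule: Molecule
    residues: list = field(default_factory=list)

@dataclass
class MolSystem:
    name: str
    chains: list = field(default_factory=list)

    def new_chain(self, code, molecule):
        # Each chain gets its own residue records built from the template.
        residues = [{"seq_id": i, "type": res_type}
                    for i, res_type in enumerate(molecule.sequence, 1)]
        chain = Chain(code, molecule, residues)
        self.chains.append(chain)
        return chain

template = Molecule("Lysozyme", ("LYS", "VAL", "PHE"))   # made-up short sequence
dimer = MolSystem("homodimer")
chain_a = dimer.new_chain("A", template)
chain_b = dimer.new_chain("B", template)
```

Both chains point at one template, but each owns separate residue objects, so assignments to chain A do not touch chain B.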

Experiment, Spectrum & Shift List Objects

  • Experiment
    • The set-up under particular conditions at a particular time, not a class of experiment.
  • Spectrum
    • Known as Data Source in the data model. A pointer to a chunk of data that results from an experiment. Several spectra may result from the same experiment if they are processed differently.
  • Peak List
    • A set of crosspeaks that have been picked for a spectrum. A spectrum can have several peak lists. The user can separate peaks into classes, e.g. picked in different ways.
  • Shift List
    • A set of chemical shifts, which are derived from peaks and may be linked to atoms. Valid for a set of experiments with similar conditions that give similar chemical shifts. Using different shift lists doesn’t change assignments, but it does change which peaks are used in the calculation of a shift value.
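The last point, that the choice of shift list changes which peaks feed a shift value but not the assignment, can be sketched as follows. The record layout and the unweighted mean are assumptions made for illustration; the slides do not say how Analysis averages peak positions.

```python
from statistics import mean

# Hypothetical flat records: (experiment name, resonance id, peak position in ppm).
peak_positions = [
    ("HSQC_pH7", "res1", 8.20),
    ("HNCA_pH7", "res1", 8.22),
    ("HSQC_pH5", "res1", 8.45),
]

# Each shift list covers the experiments recorded under similar conditions.
shift_lists = {
    "pH7": {"HSQC_pH7", "HNCA_pH7"},
    "pH5": {"HSQC_pH5"},
}

def shift_value(resonance_id, shift_list_name):
    """Average the positions of peaks from experiments in one shift list."""
    experiments = shift_lists[shift_list_name]
    values = [ppm for expt, res, ppm in peak_positions
              if res == resonance_id and expt in experiments]
    return mean(values) if values else None

shift_ph7 = shift_value("res1", "pH7")
shift_ph5 = shift_value("res1", "pH5")
```

The resonance `res1` keeps the same identity throughout; only the set of contributing peaks, and hence the shift value, differs between the two lists.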

Resonances and Assignment

  • Resonances
    • The centre of the NMR data model
  • Connect to peaks
    • Different peaks may be caused by the same thing.
  • Connect to atoms
    • A connection to NMR equivalent atoms. Need not be set if anonymous.
  • Have chemical shifts
    • May have different shifts under different conditions.

[Diagram: Resonance at the centre of the NMR data model, surrounded by Experiment (spectra, conditions), Measurement (chemical shift, relaxation, coupling), Peak (dimensions), Constraint (distance, dihedral), Annotation, Spin System (connectivity, residue type), Structure (co-ordinates) and Molecule (atoms, residues, chains).]


Development in the CCPN framework

  • CcpNmr Macros
    • Small home-use Python functions
  • Additions to function library
    • Functions incorporated in software release
    • Community sharing
  • Embedded options
    • Extension to CcpNmr application
  • Stand-alone applications
    • Built on CCPN libraries and API
  • CcpNmr Clouds has examples of all of these

The Python interface to the CCPN Data Model

  • Find the number of assigned peaks in a spectrum

count = 0
for peakList in spectrum.peakLists:
    for peak in peakList.peaks:
        for peakDim in peak.peakDims:
            # a dimension with contributions is an assigned dimension
            if peakDim.peakDimContribs:
                count += 1
                break  # count each peak once

  • Find all H-C partners in a residue

pairs = []
for atom in residue.atoms:
    chemAtom = atom.chemAtom
    if chemAtom.elementSymbol == 'C':
        for bond in chemAtom.chemBonds:
            # the bond holds two chemAtoms; remove the carbon to get its partner
            chemAtoms = list(bond.chemAtoms)
            chemAtoms.remove(chemAtom)
            chemAtom2 = chemAtoms[0]
            if chemAtom2.elementSymbol == 'H':
                pairs.append([atom, residue.findFirstAtom(chemAtom=chemAtom2)])
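Navigation code like this can be tried out without a CCPN project by using stand-in objects. The sketch below reuses only the attribute names shown on the slide (`peakLists`, `peaks`, `peakDims`, `peakDimContribs`); the `SimpleNamespace` objects and the helper `make_peak` are test scaffolding, not API objects.

```python
from types import SimpleNamespace as NS

def count_assigned_peaks(spectrum):
    """Count peaks that have at least one assigned dimension."""
    count = 0
    for peak_list in spectrum.peakLists:
        for peak in peak_list.peaks:
            if any(peak_dim.peakDimContribs for peak_dim in peak.peakDims):
                count += 1
    return count

def make_peak(*contribs_per_dim):
    # Each argument is the list of assignment contributions for one dimension.
    return NS(peakDims=[NS(peakDimContribs=contribs)
                        for contribs in contribs_per_dim])

# Stand-in spectrum: two peak lists, three peaks, two of them assigned.
spectrum = NS(peakLists=[
    NS(peaks=[make_peak(["H"], ["N"]), make_peak([], [])]),
    NS(peaks=[make_peak([], ["N"])]),
])
assigned = count_assigned_peaks(spectrum)
```

Using `any(...)` replaces the explicit inner loop and `break`, and still counts each peak once.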


CcpNmr Analysis Macros

  • Python scripts/functions
  • Accessible from Analysis and embeddable
  • Argument server
    • An interface to the Analysis program
    • Access to objects
      • Selected peaks
      • Cursor position
      • Spectra
      • Windows
      • Etc…
  • High-level function library
    • Windows, Assignment, Molecules, Constraints
    • Documented

Macro 1 - Simple stuff

  • Python language
  • Function anatomy
  • Import library functions
  • ArgumentServer
  • Simple program

def addMarksToPeaks(argServer, peaks=None):
    """Descrn: Adds position line markers to the selected peaks.
       Inputs: ArgumentServer, List of Nmr.Peaks
       Output: None
    """
    from ccpnmr.analysis.MarkBasic import createPeakMark

    if not peaks:
        peaks = argServer.getCurrentPeaks()

    # no peaks - nothing happens
    for peak in peaks:
        createPeakMark(peak, remove=0)

Macro 2 - Ask the user

def calcAveragePeakListIntensity(argServer, peakList=None, intensityType='height'):
    """Descrn: Find the average intensity (height or volume) of peaks in a peak list.
       Inputs: ArgumentServer, Nmr.PeakList
       Output: Float
    """
    from ccpnmr.analysis.ConstraintBasic import getMeanPeakIntensity

    if not peakList:
        peakList = argServer.getPeakList()

    if not peakList:
        argServer.showWarning('No peak list selected')
        return

    answer = argServer.askYesNo('Use peak volumes? Height will be used otherwise.')
    if answer: # is true
        intensityType = 'volume'

    spec = peakList.dataSource
    expt = spec.experiment
    intensity = getMeanPeakIntensity(peakList.peaks, intensityType=intensityType)

    data = (intensityType, expt.name, spec.name, peakList.serial, intensity)
    argServer.showInfo('Mean peak %s for %s %s peak list %d is %e' % data)

    return intensity
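A macro's dialogue logic can be exercised outside Analysis with a scripted stand-in for the argument server. The `StubArgumentServer` class below is hypothetical; it copies only the method names used on this slide (`getPeakList`, `askYesNo`, `showWarning`, `showInfo`) and their signatures are assumed from that usage.

```python
class StubArgumentServer:
    """Scripted stand-in for the argument server; not part of CcpNmr."""

    def __init__(self, peak_list=None, yes_no_answers=()):
        self._peak_list = peak_list
        self._answers = list(yes_no_answers)
        self.messages = []           # everything the macro tried to show

    def getPeakList(self):
        return self._peak_list

    def askYesNo(self, question):
        self.messages.append(("ask", question))
        return self._answers.pop(0)

    def showWarning(self, text):
        self.messages.append(("warning", text))

    def showInfo(self, text):
        self.messages.append(("info", text))

def choose_intensity_type(argServer, intensityType="height"):
    """The decision logic of Macro 2, isolated from the Analysis library."""
    if not argServer.getPeakList():
        argServer.showWarning("No peak list selected")
        return None
    if argServer.askYesNo("Use peak volumes? Height will be used otherwise."):
        intensityType = "volume"
    return intensityType

no_list = StubArgumentServer(peak_list=None)
result_none = choose_intensity_type(no_list)

with_list = StubArgumentServer(peak_list=["peak"], yes_no_answers=[True])
result_volume = choose_intensity_type(with_list)
```

Recording the messages instead of opening dialogs makes it possible to check both branches of the macro in an ordinary Python session.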


Macro 3 - Popup loader

# module-level imports used by the macro and by the popup class
from memops.gui.BasePopup import BasePopup
from memops.gui.ButtonList import ButtonList
from memops.gui.ScrolledGraph import ScrolledGraph
from ccpnmr.analysis.PeakBasic import getPeakHeight, getPeakVolume

def openMyPopup(argServer):
    """Descrn: Opens an example popup.
       Inputs: ArgumentServer
       Output: None
    """
    peakList = argServer.getPeakList()
    popup = MyPopup(argServer.parent, peakList)


Macro 3 - The popup

class MyPopup(BasePopup):

    def __init__(self, parent, peakList, *args, **kw):
        self.peakList = peakList
        self.colours = ['red', 'green']
        self.dataSets = []
        BasePopup.__init__(self, parent=parent, title='Test Popup', **kw)

    def body(self, guiParent):
        row = 0
        self.graph = ScrolledGraph(guiParent)
        self.graph.grid(row=row, column=0, sticky='NSEW')

        row += 1
        texts = ['Draw graph', 'Goodbye']
        commands = [self.draw, self.destroy]
        buttons = ButtonList(guiParent, texts=texts, commands=commands)
        buttons.grid(row=row, column=0, sticky='NSEW')

    def draw(self):
        self.dataSets = self.getData()
        self.graph.update(self.dataSets, self.colours)

    def getData(self):
        # sort peaks by volume so the graph runs from smallest to largest
        peakData = [(getPeakVolume(peak) or 0.0, peak) for peak in self.peakList.peaks]
        peakData.sort()

        heights = []
        volumes = []
        i = 0
        for volume, peak in peakData:
            heights.append([i, getPeakHeight(peak) or 0.0])
            volumes.append([i, volume])
            i += 1

        # the transcript ends here; returning one data set per colour
        # is assumed from how draw() uses the result
        return [heights, volumes]


CcpNmr Graphical Widgets

  • A library for any developer to use

[Screenshot: example widgets, labelled ColorList, PulldownMenu, ScrolledMatrix, LabelFrame, CheckButton, Button, Label, Entry, ButtonList.]


CcpNmr Mega Widgets

  • Build them into your own code!
    • ScrolledMatrix
    • ScrolledGraph
    • StructureFrame

Ccp Stand-Alone AppTemplate

  • Menu System
  • Project handling
    • New
    • Load
    • Save
    • Backup
  • Popup template
    • Widgets
    • Geometry
    • Plumbing

Popup Constructors and Notifiers

  • Init
    • Set up local variables
    • Subclass popup window
  • Body
    • Arrange graphical elements
    • Set up Data Model notifiers
    • Set initial state
  • Update
    • Process updated values
    • Redraw widgets based on status
  • Widget callback
    • From entry, buttons etc.
    • User functions
    • Data Model change

[Diagram: flow from Initialisation to Body (widgets, Data Model notifiers), then through an Update Filter to Update. Widgets are labelled with user influence and Data Model notifiers with external influence.]

Aftercare
  • www.ccpn.ac.uk
    • Downloads
    • Data Model documentation
    • Analysis documentation
    • Tutorials
  • Mailing List
    • http://www.jiscmail.ac.uk/lists/CCPNMR.html
    • Quick response
    • Bugs
    • Requests