CARMEN and spike detection and sorting Leslie S. Smith University of Stirling, Scotland, UK http://www.carmen.org.uk
Contents
• CARMEN architecture and project
• Neural Data Format (NDF)
• Workflows (?)
• Project status
• Where we are now
• Some reflections
INCF June 17 2012
CARMEN ‘Cloud’ (CAIRN)
[Architecture diagram. Labels: Portal (Search & Visualisation); Rich Clients; Web; Security (Security Policies Controlling Access to Data & Code); Raw & Derived Data Store (Raw Signal Data); Structured Metadata Store Enabling Search & Annotation; Workflow Enactment Engine (Enactment of scientific analysis processes); Compute Cluster on which Services are Dynamically Deployed; Service Repository (Analysis Code Store); Registry (Search for Data & Analysis Code)]
CARMEN project status
• Initially a 4 year UK e-Science project
  • From September 2006 to March 2011
• Extended with BBSRC tools and techniques grant
  • To September 2014
• Major work in last 2 years has been
  • Improving User Interface
  • NDF implementation (working)
  • Workflow implementation (nearly there!)
CARMEN and spikes
Note that CARMEN also has other services, including higher-level services.
Data, services and workflows
• CARMEN supports
  • Data and metadata
  • Services: which process data, and
  • Workflows (almost): concatenations of services.
• Initially:
  • We allowed more or less any data format
  • Services processed one data format and produced a different data format
• …but to develop workflows (and to enable interoperability between services)
  • We now strongly recommend using our Neural Data Format (NDF)
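The idea of a workflow as a concatenation of services can be sketched in a few lines of Python. This is only an illustration, not CARMEN code: the `Signal` type stands in for a shared data format such as NDF, and `remove_mean`, `rectify` and `make_workflow` are hypothetical names invented here.

```python
from functools import reduce
from typing import Callable, List, Sequence

# Assumption: every service reads and writes the same type, standing in for NDF.
Signal = List[float]
Service = Callable[[Signal], Signal]


def remove_mean(signal: Signal) -> Signal:
    """Hypothetical service: subtract the mean from every sample."""
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


def rectify(signal: Signal) -> Signal:
    """Hypothetical service: take the absolute value of every sample."""
    return [abs(x) for x in signal]


def make_workflow(services: Sequence[Service]) -> Service:
    """Chain services so that each one consumes the previous one's output."""
    def workflow(signal: Signal) -> Signal:
        return reduce(lambda data, service: service(data), services, signal)
    return workflow


pipeline = make_workflow([remove_mean, rectify])
result = pipeline([1.0, 2.0, 3.0])
print(result)
```

Chaining only works here because every service accepts and returns the same type, which is the interoperability argument for a single recommended format.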
Neural Data Format (NDF)
• An NDF dataset consists of a configuration file in XML format which contains metadata and references to the associated host data files.
• A special XML data element, History, is included within the header file for recording data processing history. This element contains the full history (recording chain) of previous processing.
• The NDF API has been implemented as a C library. The NDF API translates the XML tree/nodes to C style data structures and insulates the data structures within the binary data file from the clients.
NDF: Supported datatypes
NDF XML file
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<ndtfDataCfg xmlns="http://www.carmen.org.uk" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.carmen.org.uk ndtfDataCfg.xsd">
  <Version>1.0.1</Version>
  <NdtfDataID>897A9272-4E6F-4F32-8A63-89C699F99120</NdtfDataID>
  <GeneralInfo>
    <Description>NDF Spike Detector Service (COB) Version 2 - SNN</Description>
    <Laboratory>Carmen VLE</Laboratory>
    <CreateDate>2012-06-13</CreateDate>
    <CreateTime>16:29:03</CreateTime>
    <RecordID>N/A</RecordID>
  </GeneralInfo>
  <DataSet> …
NDF XML file cont’d
  <History>
    <Processor>
      <ProcessingDateTime StartDateTime="2012-06-01T16:16:08"/>
      <CommandLine>mat2ndf (m0192_all.mat,m0192_all.ndf)</CommandLine>
    </Processor>
    <Processor>
      <ProcessingDateTime StartDateTime="2012-06-13T17:05:58"/>
      <CommandLine>Spike detector COB NDF m0192_all.ndf, 256, 0.002, 15, 30, no</CommandLine>
      <ProcessingSettings>Spike Detector (COB) for NDF data</ProcessingSettings>
    </Processor>
  </History>
</ndtfDataCfg>
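A processing history like the one above can be read back with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` rather than the NDF C API, whose functions are not shown on the slides. It assumes the element and attribute names are exactly those in the example, with a space restored between each tag name and its attributes. `SAMPLE` and `read_history` are names made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on the root element of the example file.
NS = {"ndf": "http://www.carmen.org.uk"}

# Assumption: a cut-down copy of the slide's example, History element only.
SAMPLE = """<ndtfDataCfg xmlns="http://www.carmen.org.uk">
  <Version>1.0.1</Version>
  <History>
    <Processor>
      <ProcessingDateTime StartDateTime="2012-06-01T16:16:08"/>
      <CommandLine>mat2ndf (m0192_all.mat,m0192_all.ndf)</CommandLine>
    </Processor>
    <Processor>
      <ProcessingDateTime StartDateTime="2012-06-13T17:05:58"/>
      <CommandLine>Spike detector COB NDF m0192_all.ndf, 256, 0.002, 15, 30, no</CommandLine>
      <ProcessingSettings>Spike Detector (COB) for NDF data</ProcessingSettings>
    </Processor>
  </History>
</ndtfDataCfg>
"""


def read_history(xml_text):
    """Return (start time, command line) for each Processor, in file order."""
    root = ET.fromstring(xml_text)
    steps = []
    for processor in root.findall("ndf:History/ndf:Processor", NS):
        when = processor.find("ndf:ProcessingDateTime", NS)
        command = processor.find("ndf:CommandLine", NS)
        steps.append((
            when.get("StartDateTime") if when is not None else None,
            command.text if command is not None else None,
        ))
    return steps


history = read_history(SAMPLE)
for start, command in history:
    print(start, "|", command)
```

Because each service appends its own Processor entry, reading the list in order reconstructs the chain of processing that produced the dataset.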
NDF based services
• Filtering services:
  • HPF, LPF, BPF
• Spike detectors:
  • Single or multiple channel signal
  • Simple thresholding, positive/negative/both sided, NEO (Teager energy operator), Cepstrum of Bispectrum
• Spike sorters
  • K-means
  • Waveclus (superparamagnetic clustering)
• We can add new spike detectors and new spike sorters reasonably easily.
• Wrapping services
• The User Interface allows specific channels and sections of the dataset to be selected
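Two of the detector types listed above, simple thresholding and NEO, are easy to sketch. The code below is a minimal Python illustration, not the CARMEN services themselves. It uses the standard discrete form of the operator, ψ[n] = x[n]² − x[n−1]·x[n+1]. The function names, the `mode` parameter, and the choice to report only the first sample of each threshold crossing are assumptions made for this example.

```python
def neo(signal):
    """Nonlinear (Teager) energy: psi[n] = x[n]^2 - x[n-1]*x[n+1].

    The two end samples have no neighbour on one side and are set to 0.
    """
    out = [0.0] * len(signal)
    for n in range(1, len(signal) - 1):
        out[n] = signal[n] ** 2 - signal[n - 1] * signal[n + 1]
    return out


def detect_threshold(signal, threshold, mode="both"):
    """Return the index of the first sample of each threshold crossing.

    mode: "positive" tests x, "negative" tests -x, "both" tests |x|.
    """
    if mode == "positive":
        score = list(signal)
    elif mode == "negative":
        score = [-x for x in signal]
    elif mode == "both":
        score = [abs(x) for x in signal]
    else:
        raise ValueError("mode must be 'positive', 'negative' or 'both'")

    crossings = []
    above = False
    for i, value in enumerate(score):
        if value > threshold and not above:
            crossings.append(i)
        above = value > threshold
    return crossings


# Toy trace with one positive and one negative deflection.
trace = [0.0, 0.1, -0.1, 0.0, 2.0, 0.1, 0.0, -3.0, 0.0, 0.1]

positive_hits = detect_threshold(trace, 1.0, "positive")
negative_hits = detect_threshold(trace, 1.0, "negative")
both_hits = detect_threshold(trace, 1.0, "both")
neo_hits = detect_threshold(neo(trace), 1.0, "positive")
print(positive_hits, negative_hits, both_hits, neo_hits)
```

NEO output is large wherever the signal is both high in amplitude and changing quickly, so a one-sided threshold on it picks up deflections of either polarity.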
Workflows
• Currently at alpha stage testing:
  • Can create workflows (graphically), generate scripts, store them, and apply security and sharing appropriately. Execution of workflows is almost ready.
  • It will be possible to create workflows either graphically or using a scripting language.
Workflow graphical interface
Where are we now? On the cluster
• Can run single services:
  • But joining them together requires user intervention
  • (NDF services do read each other’s data correctly)
• New NDF services can be “wrapped”
  • Have run workshops on this
• Can turn a variety of formats into NDF
  • mcd, nev, nex, plx, map, smr, abf, abf2
• Spike detectors of three sorts
  • Can process multi-electrode data in one service
• Spike sorters of two sorts (K-means and waveclus)
• But no workflows yet (promised soon!)
• … also not really enough public datasets
Where are we now? Local systems
• NDF toolbox is available for Matlab (downloadable).
  • Runs on recent versions of Matlab: PC, Mac, Linux.
• Services which run on the cluster are/will shortly be available to run locally under Matlab
  • Not really the intent of the project, but does enable service running and testing (and debugging!) to be carried out locally
  • Environment is essentially the same as on the cluster.
• Local workflows enabled through writing XML files, and simple (-ish) scripts.
• Can test the wrapping of scripts locally
CARMEN and validation
• Validation of services
  • Testing services on multiple datasets
  • Testing multiple services on datasets
• Locally
• On the Portal.
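Testing multiple services on multiple datasets amounts to filling in a services × datasets table. The Python sketch below is an illustration only. It assumes each dataset comes with known spike positions and scores each detector by the fraction of known spikes it finds within a tolerance. The slides do not say how CARMEN scores services, so the metric, the toy detectors and all names here are hypothetical.

```python
from itertools import product


def hit_rate(detected, truth, tolerance=1):
    """Fraction of known spikes with a detection within `tolerance` samples."""
    if not truth:
        return 1.0
    hits = sum(
        1 for t in truth if any(abs(d - t) <= tolerance for d in detected)
    )
    return hits / len(truth)


def validate(detectors, datasets):
    """Run every detector on every dataset; return {(detector, dataset): score}."""
    return {
        (det_name, set_name): hit_rate(detector(signal), truth)
        for (det_name, detector), (set_name, (signal, truth))
        in product(detectors.items(), datasets.items())
    }


# Hypothetical detectors with a fixed threshold of 1.0.
def positive_threshold(signal):
    return [i for i, x in enumerate(signal) if x > 1.0]


def two_sided_threshold(signal):
    return [i for i, x in enumerate(signal) if abs(x) > 1.0]


# Toy datasets: (signal, indices of known spikes).
datasets = {
    "set_a": ([0.0, 2.0, 0.0, 0.0], [1]),
    "set_b": ([0.0, -2.0, 0.0, 2.0], [1, 3]),
}

table = validate(
    {"positive": positive_threshold, "two_sided": two_sided_threshold},
    datasets,
)
for key, value in sorted(table.items()):
    print(key, value)
```

A hit rate on its own ignores false detections, so a fuller comparison would also count detections that match no known spike.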
Why is this so difficult? Why has it taken so long? What lessons can we learn?
• Initially allowed users to write services for their own data types
  • Ties services to specific types: not easily shareable
• NDF proved complex to implement
  • Generality, multiple language support
• User Interface for portal was difficult
  • Wanted to support non-technical users
• Existing software proved difficult to use
  • Software developed for R&D systems proved not to be robust: expecting it to be was overly optimistic
• Insufficient development staff
  • Underestimated programming/development requirements
  • Supporting early development projects (“low hanging fruit”) took a lot of time.
CARMEN consortium