Pedigree Import
IBP Activity 2.2.2, Subactivity 2: Develop Genealogy Manager Application
Principal Investigator: Mylah Anacleto, IRRI
Presenter: Alex Cañeda, IRRI
Pedigree Import
• Imports germplasm entries from a predefined file format containing pedigree strings
• Lets the user verify whether the entries in the imported file already exist in the database
• Parses the pedigree strings according to the selected rice nomenclature rules and name standardization
• If a pattern is unrecognized, lets the user edit portions of the pedigree string, check the resulting split for correctness, and apply the changes back to the pedigree entry
Pedigree Import
• The target end users are data managers and data curators whose tasks include bulk loading of historical pedigree entries
Timeline and milestones
• October 2013 – Beta version
• January 2014 – Release candidate 1
• June 2014 – Release candidate 2
Sample input pedigree strings
"IR" cross number designation: used for all crosses made in IRRI; assigned by the database administrator of PBGB
• IR 88888 is an F2 plant
• IR 88888-21 is an F3: the 21st selection from the IR 88888 F2 population
• IR 88888-21-2 is an F4: the 2nd selection from IR 88888-21 (F3)
• IR 88888-21-2-2 is an F5: the 2nd selection from IR 88888-21-2 (F4)
• IR 88888-21-2-2-2 is an F6: the 2nd selection from IR 88888-21-2-2 (F5)
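The generation rule implied by these samples (the bare cross number is an F2, and each dash-separated selection adds one generation) can be sketched as a small function. This is only an illustration inferred from the examples listed on the slide; the function name `filial_generation` is hypothetical and is not part of the importer.

```python
def filial_generation(designation: str) -> str:
    """Return the filial generation implied by an "IR" designation.

    Assumption (taken from the sample strings only): the bare cross
    number is F2 and every dash-separated selection adds one generation.
    """
    # Strip stray spaces around tokens, e.g. "IR 88888 -21" -> ["IR 88888", "21"].
    parts = [p.strip() for p in designation.split("-")]
    cross_number, selections = parts[0], parts[1:]
    if not cross_number or any(not s for s in selections):
        raise ValueError(f"malformed designation: {designation!r}")
    return f"F{2 + len(selections)}"
```

For example, `filial_generation("IR 88888-21-2")` counts two selections after the cross number and returns `"F4"`.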
Features
• Loads large volumes of historical germplasm data into the ICIS GMS database
• Parses cross history strings of unknown crosses and looks for the parents in the database
• The user controls the selection of parents found in the database, or creates a new entry for a parent
• Derivative names of parents are recognized by looking for a cross number followed by letters and numbers separated by dashes
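The recognition of derivative names described in the last bullet could look roughly like the sketch below. The regular expression is an assumption: it only covers "IR" cross numbers followed by dash-separated tokens of letters and digits, and the names `DERIVATIVE` and `split_derivative` are hypothetical, not taken from the application.

```python
import re

# Hypothetical pattern: a cross number ("IR" plus digits) followed by one or
# more dash-separated selection tokens made of letters and digits.
DERIVATIVE = re.compile(
    r"^(?P<cross>IR ?\d+)(?P<selections>(?:-[A-Za-z0-9]+)+)$"
)

def split_derivative(name: str):
    """Return (cross_number, selections), or None if name is not a derivative."""
    match = DERIVATIVE.match(name.strip())
    if match is None:
        return None
    # "-21-2" -> ["21", "2"]
    selections = match.group("selections").lstrip("-").split("-")
    return match.group("cross"), selections
```

A name with no selection tokens, such as a bare cross number, yields `None`, so the caller can treat it as a cross rather than a derivative line.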
Features
• Pedigree Importer will use the same parsing algorithm that was developed for GMSInput, the Delphi-based ICIS GMS parser application
• Note: the parser was developed in Java based on the expected output specified in the guide on nomenclature rules; it is not certain that the algorithm is the same
• Basic data validation (for example, a parent should not be younger than its offspring): not yet done
• The other features can be viewed on the IBP site
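Since the validation rule is listed as not yet done, the following is only a sketch of what such a check might look like. It assumes each germplasm entry carries a date that can be compared; which database field holds that date is not stated on the slide, and `find_younger_parents` is a hypothetical name.

```python
from datetime import date
from typing import Dict, List

def find_younger_parents(offspring_date: date,
                         parent_dates: Dict[str, date]) -> List[str]:
    """Return the names of parents dated later than their offspring.

    An empty list means the entry passes the "parent should not be
    younger than its offspring" check.
    """
    return [name for name, parent_date in parent_dates.items()
            if parent_date > offspring_date]
```

Returning the offending parent names, rather than a plain true/false, would let the GUI point the data curator at the entry to correct.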
Technologies used
• Web-based: GUI written in PHP
• Java Web Services (Jersey)
• Web servers: Apache Tomcat and Apache httpd
• Database: MySQL while testing the IBP middleware; will eventually need to move to PostgreSQL
GUI • Home page
GUI • The file is uploaded. Pedigree strings that are not in the standardized format are shown in red.
GUI • Pedigree strings are standardized after the Standardize button is clicked. Unrecognized patterns are shown in red. Click a pedigree string to edit it.
GUI • Edit Germplasm Name
GUI • Changes are applied after editing germplasm names with unrecognized patterns.
GUI • Create GIDs • After the Create GID button is clicked, a link to the strings created with GIDs is shown.
GUI • Sample of the Create GIDs page.
Features: string parsers
Pedigree string parser
• Identify tokens using '-' as the delimiter
For each token:
• Name standardization
• Use of regex libraries
• Check spaces and patterns
• Check cross notations (single cross, compound cross, backcross)
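The parsing steps listed above can be sketched as follows. This is a minimal illustration, not the parser used by the application. The cross notation symbols are an assumption, since the slide does not spell them out: "/" for a single cross, "//" or "/n/" for a compound cross, and "*" for a backcross. All function names are hypothetical.

```python
import re
from typing import List

def standardize(name: str) -> str:
    """Collapse repeated spaces and remove spaces around dashes."""
    name = re.sub(r"\s+", " ", name.strip())
    return re.sub(r"\s*-\s*", "-", name)

def tokenize(name: str) -> List[str]:
    """Split a standardized name into tokens using '-' as the delimiter."""
    return standardize(name).split("-")

def cross_type(pedigree: str) -> str:
    """Classify a pedigree string by the (assumed) cross notation it uses."""
    if "*" in pedigree:                      # assumed backcross marker, e.g. A*2/B
        return "backcross"
    if re.search(r"//|/\d+/", pedigree):     # assumed compound markers: // or /3/
        return "compound cross"
    if "/" in pedigree:                      # assumed single cross marker: A/B
        return "single cross"
    return "not a cross"
```

The checks run in a fixed order and return the first match, so a real pedigree that combines notations (for instance a backcross inside a compound cross) would need a fuller grammar than this sketch provides.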
Interaction of application components
[Diagram] Components and labels recoverable from the figure:
• Pedigree Import GUI
• RESTful Web Services
• Pedigree Import Web Services Component (pedigree string parsers)
• Apache Web Server
• IBP middleware (.jar file): create germplasm, search germplasm
• IRIS GMS
• Data exchanged: pedigree string(s) to process; standardized name(s); matching germplasm; created GIDs; filename (where the result was stored)
With availability of IBP Web services
[Diagram] Components and labels recoverable from the figure:
• Pedigree Import GUI
• RESTful Web Services (shown twice in the figure)
• Pedigree Import Web Services Component (pedigree string parsers)
• IBP middleware: create germplasm, search germplasm
• Apache Web Server
• IRIS GMS
• Pedigree Import
* Only 1 deployment in IRRI
Next module: Pedigree Editor
• Communicate with IBP/Efficio on middleware requirements
• Scope covers pedigree/genealogy editing needs (limited germplasm editing): being able to edit the information uploaded with the Pedigree Import tool
Project Information
Software developer: Nikki Carumba
Nikki is part of the Breeding Information Management group of IRRI-PBGB
With user inputs from IRRI: Dr. Ruaraidh Sackville-Hamilton, William Eusebio