Diverse Research and Development Projects in Computer Science and Bioinformatics
This presentation by Satyanarayan Rao outlines projects from undergraduate studies through work at SCFBio, IIT Delhi. Undergraduate projects include a Java GUI tutorial application for tree data structures, an online linear programming solver built with Matlab, PHP, and HTML, and an HMM-based speech recognizer applied to VoIP through Asterisk. The master's thesis produced a toolkit for grid-enabled high-resolution image processing. Work at SCFBio covered optimization of protein energy minimization code, analysis of the conformational space of native proteins, and a scoring function for selecting native-like structures.
Work and research experiences Presented by Satyanarayan Rao
Outline • Projects during undergrad (fall 2005 – summer 2008) • Projects during master's (fall 2008 – summer 2010) • Projects at SCFBio, IIT Delhi (Jan 2011 – July 2012)
Projects during undergrad • Tree tech tutor (Guide: Dr. P. K. Das) • Tutorial on tree data structures for beginners • Developed a GUI application • Implemented several tree data structures: • Binary search tree • Red-black tree • AVL tree • Technology used: Java
Continued… • Methods implemented • Create trees • Add new nodes • Delete nodes • Show each operation step by step
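The operations listed above can be sketched as follows. This is a minimal illustration in Python rather than the original Java, and it covers only a plain binary search tree (no red-black or AVL rebalancing). The names Node, insert, delete, inorder, and the steps list are assumptions made for this example. Each call records the comparisons it makes, which is one way a tutor could show the process step by step.

```python
class Node:
    """A binary search tree node (hypothetical name for this sketch)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key, steps):
    """Insert key and append a description of each comparison to steps."""
    if root is None:
        steps.append(f"place {key}")
        return Node(key)
    if key < root.key:
        steps.append(f"{key} < {root.key}: go left")
        root.left = insert(root.left, key, steps)
    elif key > root.key:
        steps.append(f"{key} > {root.key}: go right")
        root.right = insert(root.right, key, steps)
    else:
        steps.append(f"{key} already present")
    return root


def delete(root, key, steps):
    """Delete key (if present) and append a description of each step."""
    if root is None:
        steps.append(f"{key} not found")
        return None
    if key < root.key:
        steps.append(f"{key} < {root.key}: go left")
        root.left = delete(root.left, key, steps)
    elif key > root.key:
        steps.append(f"{key} > {root.key}: go right")
        root.right = delete(root.right, key, steps)
    else:
        if root.left is None:
            steps.append(f"remove {key}: promote right child")
            return root.right
        if root.right is None:
            steps.append(f"remove {key}: promote left child")
            return root.left
        # Two children: copy the in-order successor, then delete it below.
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        steps.append(f"remove {key}: replace with successor {succ.key}")
        root.key = succ.key
        root.right = delete(root.right, succ.key, steps)
    return root


def inorder(root):
    """Return the keys in sorted order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)
```

Printing the steps list after each call gives a textual trace that a GUI could animate one entry at a time.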
Continued… • Online simulator for solving linear programming problems (as part of coursework) • Maximize c'x • Subject to Ax <= b • x >= 0 • Implemented the simplex algorithm to solve linear programming problems, e.g., the transportation problem • Technology used • Matlab to implement the algorithm • PHP and HTML for the front end
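The standard form above can be solved with a small tableau simplex. The sketch below is in Python, not the Matlab code from the project, and it assumes every entry of b is non-negative so that the slack variables give a starting feasible basis. It therefore does not handle the equality constraints of a transportation problem without further reformulation. The function name simplex and the use of exact fractions with Bland's pivoting rule are choices made for this illustration.

```python
from fractions import Fraction


def simplex(c, A, b):
    """Maximize c.x subject to A x <= b and x >= 0, assuming all b[i] >= 0.

    Returns (x, optimal_value) as exact Fractions.
    """
    m, n = len(A), len(c)
    # Constraint rows are [A | I | b]; the last row is [-c | 0 | 0].
    T = [
        [Fraction(v) for v in A[i]]
        + [Fraction(int(i == j)) for j in range(m)]
        + [Fraction(b[i])]
        for i in range(m)
    ]
    T.append([Fraction(-v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]  # slack variables start in the basis

    while True:
        # Bland's rule: enter the lowest-index column with a negative cost.
        col = next((j for j in range(n + m) if T[-1][j] < 0), None)
        if col is None:
            break  # no improving column, so the tableau is optimal
        rows = [i for i in range(m) if T[i][col] > 0]
        if not rows:
            raise ValueError("problem is unbounded")
        # Ratio test, with ties broken by the smallest basis index.
        row = min(rows, key=lambda i: (T[i][-1] / T[i][col], basis[i]))
        pivot = T[row][col]
        T[row] = [v / pivot for v in T[row]]
        for i in range(m + 1):
            if i != row and T[i][col] != 0:
                factor = T[i][col]
                T[i] = [a - factor * r for a, r in zip(T[i], T[row])]
        basis[row] = col

    x = [Fraction(0)] * n
    for i, j in enumerate(basis):
        if j < n:
            x[j] = T[i][-1]
    return x, T[-1][-1]
```

For example, `simplex([3, 5], [[1, 0], [0, 2], [3, 2]], [4, 12, 18])` maximizes 3x + 5y under three constraints and returns x = 2, y = 6 with an objective value of 36.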
B. Tech. Project • Modelling of an HMM-based speech recognizer and its applications • Motivation: • An attempt to develop a speech recognizer that can be integrated into various applications. • Concept: • HMMs are statistical models that output a sequence of symbols or quantities. In HMM-based speech recognition, it is assumed that the sequence of observed speech vectors corresponding to each word is generated by a Markov model. • Why not another method, such as a neural network? • A neural network works well for individual phones or isolated words, but not for whole sentences.
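The idea that each word has its own Markov model can be illustrated with the forward algorithm: score the observation sequence under every word model and pick the most likely word. The sketch below is a toy in Python that assumes discrete observation symbols, whereas a real recognizer works on continuous speech vectors. The function names and the model layout (start, transition, and emission probabilities as nested lists) are assumptions for this example and not the project's code.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete HMM using the forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def recognize(obs, models):
    """Return the word whose model assigns obs the highest likelihood."""
    return max(models, key=lambda word: forward_likelihood(obs, *models[word]))
```

Here `models` is a dictionary that maps each word to a `(start, trans, emit)` tuple. For long sequences the probabilities underflow, so practical implementations work with log probabilities or scaling.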
Steps involved • Data preparation • Generate monophone HMMs • Generate tied-state Triphones • Recognizer evaluation
Application of recognizer • Automated speech recognizer • Application of speech recognition in Asterisk • Asterisk is a software suite that enables attached phones to make calls over VoIP. • We bought a computer telephony interface and installed it on the computer running the Asterisk server. • Hands-free learning application for children
Projects during master's thesis • Toolkit for grid-enabled high-resolution image processing • Motivation: • Processing high-resolution images, for example satellite images. • Set up a 3-node cluster using the Condor scheduler (similar to PBS) • Used Matlab for image processing.
High-resolution images are about 20-200 MB in size • The goal was to implement an easy-to-use interface for researchers to do image processing.
Projects at SCFBio, IIT Delhi • Optimization of energy minimization code • Issues: memory usage was highly dependent on sequence length; parameters were stored redundantly • The exploration of conformational space of native proteins • What is the hypothesis here? • Do "preferential interactions" between amino acids drive protein folding? Mittal A, Jayaram B et al. 2010
The backbone conformation has been analyzed • For each protein, a 20x20 matrix was computed giving the number of "neighbors" within a defined neighborhood distance for each pair of amino acid types.
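One way to build such a matrix is sketched below in Python. The input format (a list of one-letter residue codes paired with 3D coordinates), the use of one representative point per residue, and counting each ordered pair separately are all assumptions for this illustration. The transcript does not state which atoms or which counting convention the study used.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def neighbor_matrix(residues, cutoff):
    """Count neighbors within cutoff for every pair of amino acid types.

    residues : list of (one_letter_code, (x, y, z)) tuples
    cutoff   : neighborhood distance, in the same units as the coordinates
    Returns a 20x20 list of lists where entry [a][b] is the number of
    type-b residues found within cutoff of type-a residues.
    """
    counts = [[0] * 20 for _ in range(20)]
    for i, (aa_i, pos_i) in enumerate(residues):
        for j, (aa_j, pos_j) in enumerate(residues):
            # A residue is not counted as its own neighbor.
            if i != j and math.dist(pos_i, pos_j) <= cutoff:
                counts[INDEX[aa_i]][INDEX[aa_j]] += 1
    return counts
```

Repeating the count over a range of cutoff values gives, for each residue pair, a curve of neighbor count against distance.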
Characteristics of the curve • Sigmoidal in nature and follows the equation: • n and k are the free parameters in the above equation. • I implemented the curve-fitting program (Levenberg–Marquardt algorithm) in C++. • The claim is that the neighborhood behavior is almost the same for any pair of residues.
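A Levenberg–Marquardt fit with two free parameters can be sketched as below. The equation from the slide did not survive in this transcript, so the model here is a stand-in: a logistic curve in which n is treated as the steepness and k as the midpoint. It was chosen only because it is sigmoidal and has two parameters, and it should not be read as the equation used in the study. The sketch is in Python rather than the original C++, and the function names are hypothetical.

```python
import math


def model(x, n, k):
    """Stand-in sigmoid (an assumption, not the study's equation)."""
    z = max(-700.0, min(700.0, n * (x - k)))  # clamp to avoid exp overflow
    return 1.0 / (1.0 + math.exp(-z))


def fit_lm(xs, ys, n, k, iters=100, lam=1e-3):
    """Fit n and k by Levenberg-Marquardt, starting from the given guess."""
    def sse(n_, k_):
        return sum((y - model(x, n_, k_)) ** 2 for x, y in zip(xs, ys))

    err = sse(n, k)
    for _ in range(iters):
        # Accumulate J^T J and J^T r from analytic partial derivatives.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            f = model(x, n, k)
            r = y - f
            dn = f * (1.0 - f) * (x - k)  # df/dn
            dk = -f * (1.0 - f) * n       # df/dk
            a11 += dn * dn
            a12 += dn * dk
            a22 += dk * dk
            g1 += dn * r
            g2 += dk * r
        # Solve (J^T J + lam * diag(J^T J)) * step = J^T r for a 2x2 system.
        d11 = a11 * (1.0 + lam)
        d22 = a22 * (1.0 + lam)
        det = d11 * d22 - a12 * a12
        if det == 0.0:
            break
        step_n = (g1 * d22 - g2 * a12) / det
        step_k = (g2 * d11 - g1 * a12) / det
        new_err = sse(n + step_n, k + step_k)
        if new_err < err:
            # Accept the step and move toward Gauss-Newton behavior.
            n, k, err = n + step_n, k + step_k, new_err
            lam /= 10.0
        else:
            # Reject the step and move toward gradient-descent behavior.
            lam *= 10.0
    return n, k
```

The damping factor lam is what separates this from plain Gauss-Newton: it shrinks after a successful step and grows after a rejected one, which keeps the fit stable when the starting guess is poor.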
Development of scoring function • Story behind it • Structure generation. • Need for a robust scoring function to select the native or native-like structures from the ensemble. • Important factors • Energy • Accessible area • Euclidean distance • Secondary structure • We designed a scoring function that assigns a cumulative score (CS) to a given structure. Mishra A., Rao S. et al. 2013
Smaller values indicate a better structure. • A1: fractional area of exposed nonpolar residues • A2: fractional area of exposed nonpolar part of residues • A3: weighted exposed area • A4: total surface area • PH, PS: helix and sheet penalties • M1: Euclidean distance
Tools used: • R, bash shell scripting, Perl.