Protein – Protein Interactions

Implementation/Algorithm
• The algorithm approximates the minimum set of domain pairs.
• It must choose domain-domain (d-d) pairs in an educated, not a randomized, fashion.
• Weight functions make this possible: each domain pair is given a weight, and the pair with the largest weight is chosen.
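
The weighted choice described above can be sketched as a greedy selection. This is a minimal illustration, not the authors' algorithm: it assumes the "minimum set" is the smallest set of domain pairs that accounts for every observed protein interaction, and it assumes one particular weight (how many still-unexplained interactions a pair could account for). The function name `greedy_domain_pairs` and the toy data are hypothetical.

```python
from itertools import product


def greedy_domain_pairs(interactions, protein_domains):
    """Greedily pick domain pairs until every observed interaction is explained.

    interactions: iterable of (protein_a, protein_b) observed pairs.
    protein_domains: dict mapping protein name -> set of domain names.
    Weight (an assumption of this sketch): the number of still-unexplained
    interactions that a domain pair could account for.
    """
    # Map each candidate domain pair to the interactions it could explain.
    covers = {}
    for a, b in interactions:
        for da, db in product(protein_domains[a], protein_domains[b]):
            pair = tuple(sorted((da, db)))
            covers.setdefault(pair, set()).add((a, b))

    # Only interactions with at least one candidate domain pair can be explained.
    unexplained = set().union(*covers.values()) if covers else set()
    chosen = []
    while unexplained:
        # Educated choice: take the pair with the largest weight.
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(covers), key=lambda p: len(covers[p] & unexplained))
        chosen.append(best)
        unexplained -= covers[best]
    return chosen


# Toy data (hypothetical): the pair (A, B) explains both interactions.
toy_domains = {"P1": {"A"}, "P2": {"B"}, "P3": {"A", "C"}, "P4": {"B"}}
toy_interactions = [("P1", "P2"), ("P3", "P4")]
selected = greedy_domain_pairs(toy_interactions, toy_domains)
```

Other weight functions can be swapped in by changing the `key` passed to `max`.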

Plan for Testing
• From the available data bank, create training data sets of different sizes (fractions .01, .25, .5, .75, and 1 of the data).
• Run a program that takes the domain pairs chosen from the training data by our algorithm.
• The program creates all possible P-P interactions and calculates each pair's probability of interacting by looking at the protein structure.
• It compares the calculated P-P interactions with the observed interactions (number of matches, false positive, and false negative P-P interactions).
• Calculate fold, specificity, and sensitivity in order to compare with previous research.
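
One way the training sets of different sizes might be drawn is sketched below. The sampling scheme is an assumption: the slides do not say how the subsets are chosen, so this version shuffles once with a fixed seed and takes prefixes, which makes each smaller set a subset of the larger ones. The name `make_training_sets` is hypothetical.

```python
import random


def make_training_sets(interactions, fractions=(0.01, 0.25, 0.5, 0.75, 1), seed=0):
    """Return {fraction: list of interactions} sampled from the full data set.

    Assumption of this sketch: subsets are nested prefixes of one shuffle,
    so results for different sizes are directly comparable.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(interactions)
    rng.shuffle(shuffled)
    training_sets = {}
    for fraction in fractions:
        # Keep at least one interaction even for very small fractions.
        size = max(1, round(fraction * len(shuffled)))
        training_sets[fraction] = shuffled[:size]
    return training_sets


# Placeholder data: 100 hypothetical interaction pairs.
all_pairs = [(f"P{i}", f"P{i + 1}") for i in range(100)]
sets_by_fraction = make_training_sets(all_pairs)
```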

Prediction Input
• A program has been written which reads in three files:
  • Protein Structure:
    • Protein name: p
    • List of proteins p interacts with
    • List of domains p contains
  • Domain Structure:
    • Name of domain: d
    • List of proteins which host d
  • Domain Interaction:
    • A predicted pair of interacting domains and their interaction probability.

Prediction Data Structures
• There are four two-dimensional vectors:
  • Protein Interactions: (observed/predicted)
  • Protein Domains: (observed)
  • Domain Hosts: (observed)
  • Domain Interactions: (predicted)
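
The four structures above could be held as follows. This is a sketch only: the slides describe two-dimensional vectors, while this version swaps in Python dictionaries and sets for readability. The class name `PredictionData` and its methods are hypothetical, and filling Domain Hosts from the protein records (rather than from the separate Domain Structure file) is an assumption made to keep the example short.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PredictionData:
    # Protein Interactions (observed): protein -> set of partner proteins
    observed_interactions: dict = field(default_factory=lambda: defaultdict(set))
    # Protein Domains (observed): protein -> set of domains it contains
    protein_domains: dict = field(default_factory=lambda: defaultdict(set))
    # Domain Hosts (observed): domain -> set of proteins that host it
    domain_hosts: dict = field(default_factory=lambda: defaultdict(set))
    # Domain Interactions (predicted): unordered domain pair -> probability
    domain_interactions: dict = field(default_factory=dict)

    def add_protein(self, name, partners, domains):
        """Record one Protein Structure entry: partners and domains of a protein."""
        for partner in partners:
            # Interactions are stored symmetrically.
            self.observed_interactions[name].add(partner)
            self.observed_interactions[partner].add(name)
        for domain in domains:
            self.protein_domains[name].add(domain)
            self.domain_hosts[domain].add(name)

    def add_domain_interaction(self, d1, d2, probability):
        """Record one Domain Interaction entry; the pair is unordered."""
        self.domain_interactions[frozenset((d1, d2))] = probability


data = PredictionData()
data.add_protein("P1", partners=["P2"], domains=["A"])
data.add_protein("P2", partners=["P1"], domains=["B"])
data.add_domain_interaction("A", "B", 0.8)
```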

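Prediction
• For all domains Di
    For all domains Dj that Di interacts with
      For all proteins Pi that Di is hosted by
        For all proteins Pj that Dj is hosted by
          set Pi as interacting with Pj, with the probability of Di and Dj interacting
• For all proteins Pi
    For all proteins Pj
      For all domains Di that Pi contains
        For all domains Dj that Pj contains
          probability Pi_Pj = 1 – Π (1 – d[Di][Dj])
  Here Π is the product over every domain pair (Di, Dj) with Di in Pi and Dj in Pj, and d[Di][Dj] is the predicted probability that Di and Dj interact.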
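
The two passes above can be sketched in Python. The dictionary layout, the function name `predict_protein_interactions`, and the choice to skip a protein paired with itself are assumptions of this sketch; domain pairs without a predicted probability are treated as d = 0.

```python
def predict_protein_interactions(protein_domains, domain_hosts, domain_prob):
    """Return {(Pi, Pj): probability} for protein pairs linked by a predicted domain pair.

    protein_domains: dict protein -> set of domains it contains
    domain_hosts:    dict domain -> set of proteins hosting it
    domain_prob:     dict (Di, Dj) -> predicted interaction probability
    """
    def d(di, dj):
        # Domain interactions are unordered; missing pairs count as 0.
        return domain_prob.get((di, dj), domain_prob.get((dj, di), 0.0))

    # Pass 1: every predicted domain pair nominates the protein pairs hosting it.
    candidates = set()
    for di, dj in domain_prob:
        for pi in domain_hosts.get(di, ()):
            for pj in domain_hosts.get(dj, ()):
                if pi != pj:  # self-pairs are skipped in this sketch
                    candidates.add(tuple(sorted((pi, pj))))

    # Pass 2: P(Pi, Pj) = 1 - product over domain pairs of (1 - d[Di][Dj]).
    predictions = {}
    for pi, pj in candidates:
        none_interact = 1.0
        for di in protein_domains.get(pi, ()):
            for dj in protein_domains.get(pj, ()):
                none_interact *= 1.0 - d(di, dj)
        predictions[(pi, pj)] = 1.0 - none_interact
    return predictions


# Toy data (hypothetical): two independent domain pairs, each with probability 0.5.
toy_protein_domains = {"P1": {"A", "C"}, "P2": {"B", "D"}}
toy_domain_hosts = {"A": {"P1"}, "C": {"P1"}, "B": {"P2"}, "D": {"P2"}}
toy_domain_prob = {("A", "B"): 0.5, ("C", "D"): 0.5}
toy_predictions = predict_protein_interactions(
    toy_protein_domains, toy_domain_hosts, toy_domain_prob
)
```

In the toy data the chance that neither domain pair interacts is 0.5 × 0.5 = 0.25, so the predicted probability for P1 and P2 is 0.75.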

Metrics for Comparison
• By comparing the observed protein interactions with the predicted protein interactions:
  • False Positive: number of predicted protein interactions which are not observed experimentally.
  • False Negative: number of protein interactions which were observed experimentally but not predicted.
  • Fold: (number of matching protein pairs between predicted and observed / number of protein pairs with probability greater than some threshold) / (total number of protein pairs observed / total number of protein pairs)
  • Specificity: number of matches / total number of predicted interactions
  • Sensitivity: number of matches / total number of observed interactions
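
The definitions above translate directly into code. In this sketch the function name `compare_predictions`, the default threshold of 0.5, and the toy numbers are assumptions; the slides only say "some threshold". A pair counts as predicted when its probability exceeds the threshold.

```python
def compare_predictions(predicted, observed, total_pairs, threshold=0.5):
    """Compare predicted and observed protein interactions.

    predicted:   dict (Pi, Pj) -> predicted interaction probability
    observed:    iterable of (Pi, Pj) experimentally observed pairs
    total_pairs: total number of possible protein pairs
    threshold:   probability cut-off (0.5 is an assumed default)
    """
    # Pairs are unordered, so (P1, P2) and (P2, P1) are the same interaction.
    predicted_pairs = {frozenset(p) for p, prob in predicted.items() if prob > threshold}
    observed_pairs = {frozenset(p) for p in observed}
    matches = len(predicted_pairs & observed_pairs)

    specificity = matches / len(predicted_pairs) if predicted_pairs else 0.0
    sensitivity = matches / len(observed_pairs) if observed_pairs else 0.0
    # Fraction of all protein pairs that are observed to interact.
    background = len(observed_pairs) / total_pairs if total_pairs else 0.0
    return {
        "matches": matches,
        "false_positives": len(predicted_pairs - observed_pairs),
        "false_negatives": len(observed_pairs - predicted_pairs),
        "specificity": specificity,
        "sensitivity": sensitivity,
        # Fold: enrichment of matches among predictions relative to background.
        "fold": specificity / background if background else 0.0,
    }


# Toy data (hypothetical): 4 proteins, hence 6 possible pairs.
toy_predicted = {("P1", "P2"): 0.9, ("P1", "P3"): 0.8, ("P2", "P3"): 0.1}
toy_observed = [("P2", "P1"), ("P3", "P4")]
toy_metrics = compare_predictions(toy_predicted, toy_observed, total_pairs=6)
```

Note that specificity as defined on the slide (matches over predicted interactions) is the quantity often called precision elsewhere.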

Work Finished
• Started writing paper
• Program prototypes written
• Tested using 75% of available data.
• Calculated:
  • False Positives =
  • False Negatives =
  • Fold =
  • Specificity =
  • Sensitivity =

Work in Progress
• Writing paper
• Cleaning code
• Getting the testing done
• Maybe making a few more weight functions
  • Adding or subtracting weight depending on different assumptions.
• Comparing with different algorithms/papers out there
