# Functions and Separate Compilation


## Functions and Separate Compilation

Dr. Nancy Warter-Perez

May 7, 2002

### Outline

• Discuss solution to homework 6

• Introduction of workshop 11

• Overview of

• File I/O - workshop 11.1

• Switch statement - workshop 11.2

• Functions - workshop 11.3

• Separate compilation - workshop 11.4

• Homework 11

Bioinformatics Programming

### Homework 6 - Solution

// This program will compute the average hydrophobicity of a given sequence for a specified sliding window size.

// Written By: Prof. Warter-Perez

// Date Created: April 23, 2002

//April 23, 2002 - Modified the program to write data to an output file.

//April 24, 2002 - Modified the program to compute hydrophobicity using Kyte-Doolittle scale

//May 7, 2002 - Modified the program to read sequence from an input file. The input and

//output files are user defined.

#include<stdlib.h>

#include<ctype.h> // needed for toupper

#include<string>

#include<iostream>

#include<fstream>

using namespace std;


int main () {

string seq, file_in, file_out;

float count;

int i, j, window_size;

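// Kyte-Doolittle values indexed by letter, 'A' through 'Z'; letters that are not amino acids score 0.

float hydro[26] = {1.8, 0, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, 0, -3.9, 3.8, 1.9, -3.5, 0, -1.6, -3.5, -4.5, -0.8, -0.7, 0, 4.2, -0.9, 0, -1.3, 0};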

fstream fout, fin;


cout << "This program will compute the hydrophobicity of a given sequence \nfor a specified sliding window size.\n" << endl;

// Open the input file.

cout << "Please enter the input filename:\t" << flush;

cin >> file_in;

fin.open(file_in.c_str(), ios::in);

if(fin.fail()) {

cout << "Error: input file does not exist. Program Terminating." << endl;

return EXIT_FAILURE;

}


// Open the output file.

cout << "Please enter the output filename:\t" << flush;

cin >> file_out;

fout.open(file_out.c_str(), ios::out);

fin >> seq;

// Read in the window size.

cout << "Enter the window size: "<< flush;

cin >> window_size;


// Compute the average hydrophobicity for specified window

// size and write to output file.

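// Comparing i + window_size against the length avoids unsigned underflow

// when the window is larger than the sequence.

for (i = 0; i + window_size <= (int) seq.size(); i++) {

count = 0;

for (j = i; j < i + window_size; j++) {

count = count + hydro[toupper(seq[j]) - 'A'];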

}

fout << i << "\t" << count/window_size << endl;

}

return EXIT_SUCCESS;

}
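Since this lecture goes on to introduce functions, the sliding-window loop above could be pulled out of main into a function of its own. The following is only a sketch under assumptions: the name `window_averages`, the use of `std::vector`, and the choice to score non-letters as 0 are illustrative and are not part of the original solution.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Kyte-Doolittle values indexed by letter ('A' = 0 ... 'Z' = 25).
// Letters that are not standard amino acids score 0.
const float HYDRO[26] = {1.8f, 0, 2.5f, -3.5f, -3.5f, 2.8f, -0.4f, -3.2f, 4.5f,
                         0, -3.9f, 3.8f, 1.9f, -3.5f, 0, -1.6f, -3.5f, -4.5f,
                         -0.8f, -0.7f, 0, 4.2f, -0.9f, 0, -1.3f, 0};

// Hypothetical helper: returns the average hydrophobicity of every window
// of length window_size. Returns an empty vector when the window size is
// not positive or is longer than the sequence.
std::vector<float> window_averages(const std::string& seq, int window_size) {
    std::vector<float> averages;
    if (window_size <= 0) return averages;
    const std::size_t w = static_cast<std::size_t>(window_size);
    if (w > seq.size()) return averages;

    for (std::size_t i = 0; i + w <= seq.size(); ++i) {
        float sum = 0;
        for (std::size_t j = i; j < i + w; ++j) {
            int idx = std::toupper(static_cast<unsigned char>(seq[j])) - 'A';
            if (idx >= 0 && idx < 26) sum += HYDRO[idx];  // ignore non-letters
        }
        averages.push_back(sum / window_size);
    }
    return averages;
}
```

With this shape, main only has to handle the file names and the output loop, and the computation can be tested on short strings without any files.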


### Bacteriorhodopsin Sequence (short)

The protein sequence in FASTA format:

>gi|461612|sp|P33972|BACR_HALHS BACTERIORHODOPSIN (BR)

LWLGTAGMFLGMLYFIARGWGETDGRRQKFYIATILITAIAFVNYLAMALGFGLTFIEFGGEQHPIYWAR

VLLYFLFSSLSGRVANLPSDTRSTFKTLRNLVTVVWLVYPVWWLVGSEGLGLVGIGIETAGFMVIDLVA


### Window size = 14


### Bacteriorhodopsin Sequence (long)

• Bacteriorhodopsin precursor (BR) (number P02945)

• www.ncbi.nlm.nih.gov/entrez

• FASTA format

• (Thanks to Edain Velazquez)


### Hydrophobicity – Bacteriorhodopsin (Window size = 10)

### Lecture 4 - Plot


### Workshop #11

• Workshop 11.1

Write a program to read in a PAM matrix into a 2-dimensional array. To test, print the 2-D array to stdout. Assume the 2-D array is a global array.

• Workshop 11.2

Convert the program of 11.1 into a function without the prints to stdout. Test with dummy programs that display output to stdout.


### Workshop #11

• Workshop 11.3

Write a function that takes index I and index J and returns the PAM score for row I, column J. Assume the PAM matrix is a global array.

• Workshop 11.4

Test each function separately and then combine them into one program that prompts the user for 2 amino acids and returns their PAM score. Place the support functions developed in workshops 11.2 and 11.3 in a separate file from the main function. Use a header file to link them.

### File I/O (C++)

• #include <fstream>

• fstream fin, fout; // fin and fout are object names

• fin.open("infilename", ios::in); // open a file to read

• int x;

fin >> x; // read an integer from input file into x

• char c;

fin >> c; // read a character from input file into c

• string s;

fin >> s; // read a string from input file into s

• fout.open("outfilename", ios::out); // open a file to write

• fout << x; // will write x into output file
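The extraction operators above behave the same on any input stream, so one way to make file-reading code easy to test is to write it against `std::istream` and open the file in a thin wrapper. This is a minimal sketch; the `Record` struct and both function names are assumptions made for illustration.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record holding the three kinds of values read on the slide.
struct Record {
    int x;
    char c;
    std::string s;
};

// Reads an int, a char, and a string (whitespace-separated) from any stream.
// Returns false if any of the three extractions fails.
bool read_record(std::istream& in, Record& r) {
    return static_cast<bool>(in >> r.x >> r.c >> r.s);
}

// Opens the named file and reads one record from it.
// Returns false if the file cannot be opened or the data is malformed.
bool read_record_from_file(const std::string& path, Record& r) {
    std::ifstream fin(path.c_str());  // ifstream opens for reading by default
    if (!fin) return false;
    return read_record(fin, r);
}

// Usage without a file:
//   std::istringstream demo("42 A MKTLL");
//   Record r;
//   bool ok = read_record(demo, r);
```

Checking the result of the extraction, rather than only checking that the file opened, also catches input that has the wrong type.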


### 2-D Arrays

• int nums[2][3] = {{2,4,6},{-9,-7,-5}};

nums[0][0] == 2

nums[0][1] == 4

nums[0][2] == 6

nums[1][0] == -9

nums[1][1] == -7

nums[1][2] == -5

| | [0] | [1] | [2] |
|---|---|---|---|
| [0] | 2 | 4 | 6 |
| [1] | -9 | -7 | -5 |
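A 2-D array is usually processed with nested loops, the outer one over rows and the inner one over columns. The sketch below assumes the same 2 x 3 shape as `nums`; the constant `COLS` and the function names are illustrative only.

```cpp
const int COLS = 3;  // number of columns must be known at compile time

// Sums one row of a 2-D array that has COLS columns.
int row_sum(const int nums[][COLS], int row) {
    int sum = 0;
    for (int col = 0; col < COLS; ++col) {
        sum += nums[row][col];
    }
    return sum;
}

// Sums every element by visiting the rows in order.
int total(const int nums[][COLS], int rows) {
    int sum = 0;
    for (int row = 0; row < rows; ++row) {
        sum += row_sum(nums, row);
    }
    return sum;
}
```

When a 2-D array is passed to a function, every dimension except the first has to appear in the parameter type, which is why `COLS` is fixed here while the row count is passed as an argument.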

### Workshop #11

• Workshop 11.1

Write a function to read in a PAM matrix into a 2-dimensional array.

• Have to parse the input to ignore file heading information and matrix column and row headings.
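One possible way to handle the parsing is sketched below. It assumes a layout in which heading lines start with `#`, the first remaining line holds the column labels, and every later line starts with a row label followed by integer scores; check this against the actual PAM file before relying on it. It also uses `std::vector` instead of the global 2-D array the workshop asks for, and the names `ScoreMatrix` and `read_matrix` are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the parsed matrix.
struct ScoreMatrix {
    std::string symbols;                   // column labels in file order
    std::vector<std::vector<int> > scores; // scores[row][col]
};

// Parses a matrix from a stream, skipping '#' heading lines, the
// column-label line, and the row label at the start of each row.
ScoreMatrix read_matrix(std::istream& in) {
    ScoreMatrix m;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;  // file heading
        std::istringstream fields(line);
        if (m.symbols.empty()) {                       // column labels
            char label;
            while (fields >> label) m.symbols += label;
            continue;
        }
        char row_label;
        if (!(fields >> row_label)) continue;          // blank line
        std::vector<int> row;
        int value;
        while (fields >> value) row.push_back(value);
        m.scores.push_back(row);
    }
    return m;
}
```

Keeping the column labels alongside the scores makes it possible to translate an amino-acid letter into an index later without hard-coding the order used by the file.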


### Switch Statement

int x, y;

switch (x) {

case 0: y = 1;

break;

case 1: y = 2;

break;

case 2: y = 3;

default: y = 4;

}

• x = 0: y = 1

• x = 1: y = 2

• x = 2: y = 4 (case 2 has no break, so execution falls through to default)

• Any other x: y = 4
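The fall-through behavior can be checked by wrapping the same switch in a function and calling it with each value. The function name `y_for` is an assumption made for this sketch.

```cpp
// Returns the value y ends up with for a given x, using the same
// switch statement as the slide.
int y_for(int x) {
    int y = 0;
    switch (x) {
        case 0:
            y = 1;
            break;
        case 1:
            y = 2;
            break;
        case 2:
            y = 3;   // no break here, so control continues into default
        default:
            y = 4;
    }
    return y;
}
```

Adding `break;` after `y = 3;` would make `y_for(2)` return 3 instead of 4.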


### Switch Statement

• Works with char and int

char c; int y;

switch (c) {

case 'a': y = 1;

break;

case 'b':

case 'c': y = 2;

break;

case 'z': y = 3;

}


### 1-D Arrays as Look-Up Tables

int table[3] = {1, 2, 2};

char c; int y;

// Valid only when c is 'a', 'b', 'c', or 'z'; any other value indexes outside the table.

if(c != 'z')

y = table[c - 'a'];

else

y = 3;
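To see that the table and the switch agree, both can be written as functions and compared. This is a sketch under one added assumption: characters outside 'a', 'b', 'c', and 'z' return 0, a value chosen here only so that both functions are defined for every input.

```cpp
// Switch version, as on the previous slide, with a default added.
int y_switch(char c) {
    int y = 0;  // assumed result for characters the slide does not cover
    switch (c) {
        case 'a': y = 1; break;
        case 'b':
        case 'c': y = 2; break;
        case 'z': y = 3; break;
    }
    return y;
}

// Look-up table version with a bounds check before indexing.
int y_table(char c) {
    static const int table[3] = {1, 2, 2};
    if (c == 'z') return 3;
    if (c >= 'a' && c <= 'c') return table[c - 'a'];
    return 0;   // same assumed result as the switch version
}
```

The bounds check matters because `table[c - 'a']` reads outside the array for any character other than 'a', 'b', or 'c'.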


### Functions

• Break program into modules or functions

• Easier to understand program

• Functions can be reused (e.g., library functions)

• Easier to develop a program step by step

• Can test each function independently

• Every program must have a function named "main"; execution starts there

### Functions

<return_type> func_name (arg1_typ arg1_name, …, argN_typ argN_name)

{

function body

}

• func_name – name of the function

• return_type – type of value returned by function

• Arguments

• call-by-value – the function receives copies of its arguments, so changes made inside the function do not affect the caller's variables

• Function prototype (used in header files [*.h])

<return_type> func_name (arg1_typ, …, argN_typ);

• Library functions – commonly used functions

• stdlib.h, stdio.h, math.h, string.h (to name a few)
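The effect of call-by-value is easiest to see next to its alternative, call-by-reference, which the slide does not cover but which C++ provides through `&` parameters. Both function names below are made up for this sketch.

```cpp
// Call-by-value: n is a copy, so the caller's variable is unchanged.
void bump_copy(int n) {
    n = n + 1;
}

// Call-by-reference: n is an alias for the caller's variable,
// so the increment is visible after the call.
void bump_ref(int& n) {
    n = n + 1;
}

// Prototypes as they would appear in a header file; parameter names
// are optional in a prototype:
//   void bump_copy(int);
//   void bump_ref(int&);
```

After `int a = 5; bump_copy(a);` the value of `a` is still 5, while a following `bump_ref(a);` changes it to 6.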


### Workshop #11

• Workshop 11.2

Convert the programs of 11.1 into a function without the prints to stdout. Test with dummy programs that display output to stdout.


### Projects Separate Compilation

Diagram: each source file (File1.c … FileN.c, or .cpp) is compiled into an object file (*.obj); the object files are then combined with library functions (*.a) to produce the executable (*.exe).

### Separate Compilation

• Break program into different files (can be developed by different people)

• Arrange functions logically into files

• Information is communicated between functions using header files

• Projects contain all files that need to be compiled for a given executable


• filex.cpp can export information to other files using a header file, filex.h

• usually contains

• function prototypes

• constant declarations

• global variables (extern)

• user defined

• #include "filex.h" // can specify the path if not in same directory
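A header that exports the three kinds of items listed above might look like the following sketch. The two files are shown in one block for brevity; the include guard, `MAX_RESIDUES`, `call_count`, and `next_value` are all hypothetical names chosen for illustration.

```cpp
// ---- filex.h (hypothetical contents) ----
#ifndef FILEX_H          // include guard: the body is processed only once
#define FILEX_H          // even if the header is included several times

const int MAX_RESIDUES = 26;   // constant declaration
extern int call_count;         // global variable, defined in filex.cpp
int next_value(int a);         // function prototype

#endif // FILEX_H

// ---- filex.cpp (would begin with #include "filex.h") ----
int call_count = 0;            // the one definition of the global

int next_value(int a) {
    ++call_count;              // record that the function was called
    return a + 1;
}
```

The `extern` line only declares the variable; exactly one .cpp file should define it, otherwise combining the object files reports a duplicate symbol.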

• filex.cpp

#include <iostream>

#include "filex.h"

using namespace std;

int call_x(int a) {

cout << a << endl;

return a++; // post-increment: returns the original value of a

}

• filex.h

int call_x(int a);

• filey.cpp

#include <iostream>

#include "filex.h"

using namespace std;

int main () {

int y;

y = call_x(5);

cout << y << endl;

return 0;

}

### Workshop #11

• Workshop 11.3

Write a function that takes as input the PAM matrix, index I, and index J and returns the PAM score for row I, column J.

• Workshop 11.4

Test each function and combine them into one program that prompts the user for 2 amino acids and returns their PAM score. Place the support functions developed in workshops 11.2 and 11.3 in a separate file from the main function. Use a header file to link them.
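The look-up step can be sketched as a function that translates each amino-acid letter into an index and then reads the matrix. This is only one possible shape: it assumes the matrix is passed in as a `std::vector` together with a string of labels in matrix order, and it throws an exception for unknown letters; none of these choices come from the workshop text.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Returns the score for amino acids a and b.
// symbols holds the row/column labels in matrix order, for example "AR".
int pam_score(const std::vector<std::vector<int> >& matrix,
              const std::string& symbols, char a, char b) {
    // Convert both letters to upper case before searching the labels.
    char ua = static_cast<char>(std::toupper(static_cast<unsigned char>(a)));
    char ub = static_cast<char>(std::toupper(static_cast<unsigned char>(b)));
    std::string::size_type i = symbols.find(ua);
    std::string::size_type j = symbols.find(ub);
    if (i == std::string::npos || j == std::string::npos) {
        throw std::invalid_argument("unknown amino acid");
    }
    return matrix.at(i).at(j);  // at() checks the bounds
}
```

A main function in a separate file would then only need to prompt for the two letters, call `pam_score`, and print the result.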

### Homework #11 – due 5/14

• Write a function to determine the score of 2 sequences aligned by the Needleman-Wunsch method using the scoring method proposed in Lecture 5. A gap in an aligned sequence will be represented by a period (“.”).

• Test your function with a program that reads the sequences from standard input and displays the score to standard output. You should test your program with different sequences.

• Modify your program to use PAM scoring rather than Match/Mismatch scores.


### Needleman-Wunsch Method

The sequences

abcdefghajklm

abbdhijk

are aligned and scored like this

a b c d e f g h a j k l m

| | | | | |

a b b d . . . h i j k

• match: 6 × 4 = 24

• mismatch: 2 × (-3) = -6

• gap_open: 1 × (-2) = -2

• gap_extend: 3 × (-1) = -3

for a total score of 24 - 6 - 2 - 3 = 13
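The arithmetic in this example can be reproduced with a short scoring function. The sketch below is written under assumptions drawn from the example itself: each gap character costs gap_extend, the first character of each run of gaps additionally costs gap_open, and positions past the end of the shorter aligned string are not scored. The function name is hypothetical, and this is not presented as the course's reference solution.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Scores two already-aligned sequences in which '.' marks a gap.
int alignment_score(const std::string& a, const std::string& b) {
    const int MATCH = 4, MISMATCH = -3, GAP_OPEN = -2, GAP_EXTEND = -1;
    int score = 0;
    bool in_gap = false;  // true while inside a run of gap characters
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] == '.' || b[i] == '.') {
            if (!in_gap) score += GAP_OPEN;  // charged once per run
            score += GAP_EXTEND;             // charged for every gap character
            in_gap = true;
        } else {
            score += (a[i] == b[i]) ? MATCH : MISMATCH;
            in_gap = false;
        }
    }
    return score;
}
```

Calling `alignment_score("abcdefghajklm", "abbd...hijk")` follows the same steps as the worked example: six matches, two mismatches, and one gap of length three.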