C Programming (the Final Lecture)

• How to test that your code really works.
• How to earn your fortune illegally through computer crime (do not do this bit).
• The famous travelling salesman problem (and when a problem is _really_ hard).
• Three ways to "solve" any optimisation problem (using kangaroos).

Testing your code really works
• Just because a piece of code works once doesn't mean it will work again.
• Getting the right answer for an input of 'n' does not mean we will get the right answer for 'm'.
• Working code should never "crash": whatever its input, it should always exit with an error.
• You should know how your code will behave when asked "the wrong question".

Test boundary conditions
• Consider what might happen if the input is very large or very small.
• If there is a possibility that your code will get such input you should make sure it can deal with it.
• Always beware of the divide by zero error.
• In 1997 the guided-missile cruiser USS Yorktown was shut down for several hours when a crew member mistakenly input zero to one of the computers. Don't let your code work like this.

Boundary conditions example
• What is wrong with this code, which is supposed to be like strlen?

    int my_strlen (char *string)
    /* What is wrong with this code to find the length of a string */
    {
        int len= 1;
        while (string[len] != '\0')
            len++;
        return len;
    }
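One possible answer, as a sketch: because the counter starts at 1, my_strlen above happens to give the right length for any non-empty string, but for the empty string "" it skips the terminator and reads past the end of the array. A corrected version (the name my_strlen_fixed is ours, not the lecture's) starts counting from zero:

```c
#include <stddef.h>

/* Corrected sketch of my_strlen: start at index 0 so that the boundary
   case "" gives 0 instead of reading past the terminating '\0'.
   The name my_strlen_fixed is hypothetical. */
size_t my_strlen_fixed(const char *string)
{
    size_t len = 0;
    while (string[len] != '\0')
        len++;
    return len;
}
```

The empty string is the boundary condition here, so it is the first input worth testing.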

Overflows of numbers
• If you are going to work with large numbers then be sure you know how large a number your variables can deal with.
• In most implementations of C a signed char can be from -128 to 127 and an unsigned char from 0 to 255. How big an int can be varies from computer to computer.
• In June 1996 Ariane 5 exploded as a direct result of a programming error which tried to fit a 64-bit floating point number into a 16-bit int.
• Similar problems have caused test-driven cars to switch to reverse at high speed.
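As a rough illustration of guarding against this kind of overflow, the sketch below refuses to squeeze a double into a 16-bit integer unless it fits. The function name double_to_int16 is made up for this example, and the range check is deliberately conservative (for instance 32767.5 is rejected even though truncating it would fit).

```c
#include <stdint.h>

/* Convert a double to a 16-bit signed integer only if it is in range.
   Returns 1 and stores the result in *out on success; returns 0 and
   leaves *out alone otherwise. The name is hypothetical. */
int double_to_int16(double value, int16_t *out)
{
    /* Written as a negated test so that NaN (for which every
       comparison is false) is rejected as well. */
    if (!(value >= INT16_MIN && value <= INT16_MAX))
        return 0;
    *out = (int16_t)value;   /* in range, so the cast just truncates towards zero */
    return 1;
}
```

The caller is forced to look at the return value before using the result, which is the point: an out-of-range number becomes an error that can be handled rather than a silently wrong value.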

What if the user asks "the wrong question"
• This code finds the average of 'n' doubles - under what conditions does it fail?

    double avg (double a[], int n)
    /* a is an array of n doubles */
    {
        int i;
        double sum= 0;
        for (i= 0; i < n; i++) {
            sum+= a[i];
        }
        return sum/n;
    }
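One way to answer the question, as a sketch only: avg divides by zero when n is 0, gives a meaningless result when n is negative, and cannot know if n is larger than the real array. The version below (safe_avg is a hypothetical name, and returning a success flag is our design choice) rejects the cases it is able to detect.

```c
#include <stddef.h>

/* Average of n doubles with the detectable "wrong questions" refused.
   Returns 1 and stores the mean in *result on success, 0 on bad input.
   It still cannot tell whether n matches the true size of a[]. */
int safe_avg(const double a[], int n, double *result)
{
    int i;
    double sum = 0.0;

    if (a == NULL || result == NULL || n <= 0)
        return 0;
    for (i = 0; i < n; i++)
        sum += a[i];
    *result = sum / n;
    return 1;
}
```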

Program defensively
• In some cases (not all) you might add code to weed out rogue values.

    void class_of_degree (char degree[], double percent)
    /* Work out the approx. class of degree from someone's
       percentage overall mark */
    {
        if (percent < 0 || percent > 100)
            strcpy(degree,"Error in mark");
        else if (percent >= 70)
            strcpy(degree,"First");
        else if (percent >= 60)
            strcpy(degree,"Two-one");
        .
        .
    }

(The first two lines of the function body are just here out of caution.)

How to test your code while writing
• A good programmer doesn't sit down, write 10,000 lines of code and then run it.
• It will make your life easier if you test your program as you write it.
• Write the smallest possible part of the program you think will _do something_ and test it.
• Build the program up gradually - testing as you go.
• I like to compile every dozen lines or so - as soon as I've made a significant change. (I use a separate window to compile in.)

When and what to test
• If your program takes no input but simply runs and produces an answer then it may not need much testing. Most programs are not like this.
• If you are doing the cryptography project or Zipf's law projects, for example, your programs should be taking strings of input.
• What would happen if those strings of input were just rubbish instead of well behaved strings of words and letters?
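As a sketch of how a program might check for "rubbish" before processing a string, the function below accepts only letters, digits, whitespace and punctuation. The name is_plain_text and the choice of what counts as acceptable are assumptions made for this example; a real project would pick its own rules.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if every character of s is a letter, digit, whitespace or
   punctuation character, and 0 otherwise (including when s is NULL).
   The name and the acceptance rule are hypothetical. */
int is_plain_text(const char *s)
{
    size_t i;

    if (s == NULL)
        return 0;
    for (i = 0; s[i] != '\0'; i++) {
        /* The <ctype.h> functions must be given unsigned char values. */
        unsigned char c = (unsigned char)s[i];
        if (!isalnum(c) && !isspace(c) && !ispunct(c))
            return 0;
    }
    return 1;
}
```

Feeding such a check deliberately bad input (control characters, an empty string, a NULL pointer) is a cheap way to find out how the rest of the program would behave.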

Document your testing
• Documenting your testing is critical and it will be important in your project.
• If appropriate, you should include in your write-up some evidence that you have tested your code with various inputs.
• Failing to document testing can have important consequences.
• One of the problems which beset the Mars Pathfinder probe had actually been spotted in testing before launch - but forgotten about. It had to be solved while in flight.

How to hack into computers
• Computer hacking - "cracking" as it is correctly called - involves illegally accessing computers.
• Usually this involves finding an "exploit" or bug in the operating system of the computer.
• All currently known computer systems have these "exploits" - they are the results of inadequate testing and sloppy programming.
• This shows the importance of "testing" and "defensive programming".

Buffer Overflow Exploits
• By far the majority of modern "exploits" are "buffer overflows".
• What happens to this code if it is given a longer string in str2 than str1?

    void strcpy(char str1[], char str2[])
    /* Copy to str1 from str2 */
    {
        int i= 0;
        while ((str1[i]= str2[i]) != '\0')   /* Another complex line which assigns and compares. */
            i++;
    }
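The copy loop above keeps writing until it meets the terminator in str2, whatever the size of str1. A minimal sketch of a more defensive copy is given below; the name bounded_copy and its return convention are our own, not part of the standard library.

```c
#include <stddef.h>

/* Copy src into dest, writing at most dest_size bytes including the
   terminating '\0'. Returns 1 if the whole string fitted, 0 if it was
   truncated or the arguments were unusable. The name is hypothetical. */
int bounded_copy(char dest[], size_t dest_size, const char src[])
{
    size_t i = 0;

    if (dest == NULL || src == NULL || dest_size == 0)
        return 0;
    while (i + 1 < dest_size && src[i] != '\0') {   /* leave room for '\0' */
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';
    return src[i] == '\0';   /* did we reach the end of the source? */
}
```

The caller has to pass the size of the destination array, which is exactly the piece of information the original loop never had.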

Where do buffer overflows come from?
• Here are just some common ways that buffer overflows arise:
  • Incautious use of "strcpy" (copying a potentially larger string into a smaller one).
  • Use of the gets function instead of fgets to read from stdin (which is why I didn't even teach you about gets).
  • Forgetting to check array bounds on input strings.
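To make the fgets point concrete, here is a sketch of a line reader that never writes more than the buffer can hold. The helper name read_line and its three return codes are assumptions for this example; only fgets, fgetc and strlen are standard library calls.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from fp into buf (capacity size) and strip the newline.
   Returns 1 on success, 0 at end of file or on error, and -1 if the line
   was too long (buf then holds the start of the line and the rest is
   discarded). The name and return codes are hypothetical. */
int read_line(FILE *fp, char buf[], size_t size)
{
    size_t len;
    int c;

    if (size == 0 || fgets(buf, (int)size, fp) == NULL)
        return 0;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    /* No newline stored: the line filled the buffer or the file ended. */
    c = fgetc(fp);
    if (c == EOF || c == '\n')
        return 1;
    while (c != '\n' && c != EOF)   /* throw away the rest of the long line */
        c = fgetc(fp);
    return -1;
}
```

Calling read_line(stdin, buf, sizeof buf) in place of gets(buf) means an over-long line is reported to the caller instead of overflowing the array.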

So how does a buffer overflow work?
Computer memory:
• Some other junk
• The array we are about to overflow
• The bits of the program that are being run
What we write to the array:
• Lots of "no operations"
• Our evil program
After writing to the array, the program tries to continue but has been overwritten with our evil program.

So what do you do? (or not do)
• Find some program which you can access which has the correct permissions and a "buffer overflow exploit".
• Send your data containing your evil program to the input of the bugged program.
• In 1988 the "Internet Worm" used this method to place self-replicating code which automatically hacked computers.
• Much of the internet was overloaded or shut down when the worm ran out of control.

Optimisation (the travelling salesman)
• The travelling salesman problem is a classic optimisation problem. A salesman must visit all of 'n' cities in the shortest possible time.
    1 city has 1 possible ordering
    2 cities have 2
    3 cities have 6
    4 cities have 24
    n cities have n!
• This is a hard problem: its difficulty of solution is O(n!) (we can reduce this by a factor of 2 by symmetry).
(Figure: five cities, numbered 1 to 5.)
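To see where the n! comes from, the sketch below simply tries every ordering and keeps the shortest. It assumes the salesman returns to his starting city (a closed tour), which the slide does not state, and it fixes city 0 as the start so that only (n-1)! orderings are tried. The names shortest_tour, MAX_CITIES and the distance-table layout are all made up for this example.

```c
#include <float.h>

#define MAX_CITIES 10   /* brute force is hopeless long before this */

/* Length of the closed tour that visits the cities in the given order. */
static double tour_length(int n, double dist[][MAX_CITIES], const int order[])
{
    double total = 0.0;
    int i;
    for (i = 0; i < n; i++)
        total += dist[order[i]][order[(i + 1) % n]];
    return total;
}

/* Try every arrangement of order[k..n-1], remembering the shortest tour. */
static void permute(int n, double dist[][MAX_CITIES], int order[], int k,
                    double *best)
{
    int i, tmp;
    if (k == n) {
        double len = tour_length(n, dist, order);
        if (len < *best)
            *best = len;
        return;
    }
    for (i = k; i < n; i++) {
        tmp = order[k]; order[k] = order[i]; order[i] = tmp;
        permute(n, dist, order, k + 1, best);
        tmp = order[k]; order[k] = order[i]; order[i] = tmp;   /* put back */
    }
}

/* Shortest closed tour length by brute force, or -1.0 for a bad n. */
double shortest_tour(int n, double dist[][MAX_CITIES])
{
    int order[MAX_CITIES];
    double best = DBL_MAX;
    int i;

    if (n < 1 || n > MAX_CITIES)
        return -1.0;
    for (i = 0; i < n; i++)
        order[i] = i;
    permute(n, dist, order, 1, &best);   /* city 0 stays first */
    return best;
}
```

Each extra city multiplies the work by roughly the number of cities, which is why this approach stops being usable at quite small n.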

What's the point of this silly problem?
• The TSP represents a class of problems known as NP-hard.
• In practice this means that no known algorithm arrives at a solution in polynomial time.
• The NP-complete problems (which include the yes/no version of the TSP) are computationally equivalent - a polynomial-time solution to one gives a polynomial-time solution to all.
• Finding such a solution (or proving that none exists) is one of the seven famous "Millennium Prize Problems" in maths - each with a $1m prize.

So how do we solve this?
• We cannot use "brute force and ignorance" to solve this problem. There are simply too many combinations. We must be cleverer.
• Irritatingly, this type of "hard" optimisation problem is extremely common in computing.
• There are three commonly used ways to get an approximate solution:
  • Hill Climbing
  • Simulated Annealing
  • Genetic Algorithms

Hill Climbing
• To visualise hill climbing, imagine our problem space is a landscape and we wish to reach the highest point.
• Starting with a random solution we can either:
  • Find an ascent direction and move a small amount in that direction
  • or keep picking random directions until we find one which is an improvement.
• Repeat until no improvement can be found.

Hill Climbing (2)
• In the case of the travelling salesman problem we might pick a solution at random and then keep swapping the position of one city in the route until no further swap produces an improvement.
• Pros:
  • Computationally simple
  • Fast to run
• Cons:
  • Gets stuck in local maxima
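The swapping scheme described above might be sketched as follows. Since a better route is a shorter one, "uphill" here means reducing the tour length. The function name hill_climb, the closed-tour assumption and the choice to try swaps in a fixed order (rather than at random) are ours, made to keep the example short and repeatable.

```c
#define MAX_CITIES 10

/* Length of the closed tour that visits the cities in the given order. */
static double tour_length(int n, double dist[][MAX_CITIES], const int order[])
{
    double total = 0.0;
    int i;
    for (i = 0; i < n; i++)
        total += dist[order[i]][order[(i + 1) % n]];
    return total;
}

/* Hill climbing by swapping pairs of cities. order[] holds the starting
   route and is improved in place; the final tour length is returned.
   Stops when no single swap makes the route shorter (a local optimum). */
double hill_climb(int n, double dist[][MAX_CITIES], int order[])
{
    double current = tour_length(n, dist, order);
    int improved = 1;

    while (improved) {
        int i, j;
        improved = 0;
        for (i = 0; i < n - 1; i++) {
            for (j = i + 1; j < n; j++) {
                double candidate;
                int tmp = order[i]; order[i] = order[j]; order[j] = tmp;
                candidate = tour_length(n, dist, order);
                if (candidate < current) {
                    current = candidate;      /* keep the improvement */
                    improved = 1;
                } else {
                    tmp = order[i]; order[i] = order[j]; order[j] = tmp;  /* undo */
                }
            }
        }
    }
    return current;
}
```

Nothing in the loop ever accepts a longer route, so whichever local optimum is nearest to the starting route is where it stops.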

Simulated Annealing
• Simulated annealing is based upon a physics analogy - crystal formation in metals.
• We set a temperature variable.
• As with hill climbing, we pick some "direction" to take steps in.
• Unlike with hill climbing, we might choose to move "down-hill" if the temperature is high.

Simulated Annealing (2)
• By gradually lowering the "temperature" we eventually move from allowing wild jumps across the landscape to becoming more like a hill-climbing process.
• Pros:
  • Avoids some local maxima.
  • Only a little more computationally intensive.
• Cons:
  • A random process - doesn't guarantee a good solution every time.

Genetic Algorithms
• Begin by producing a whole bunch of random solutions.
• Let our selection of solutions "breed" by producing hybrid solutions by mixing them.
• Mutate solutions by randomly swapping parts of them.
• Occasionally kill off some of the solutions which are less optimal.
• By a process of killing some of the worst solutions the best solutions "survive" and "breed".

Genetic Algorithms (2)
• Works like "evolution" to produce good solutions.
• Pros:
  • Not very susceptible to local maxima.
  • Likely to find good solutions for many problems.
• Cons:
  • Computationally both difficult and slow.
  • Doesn't work for every problem.
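The breed, mutate and kill steps can be sketched on a toy problem. Mixing two salesman routes needs extra care to keep each city exactly once, so this example instead evolves strings of 16 bits where the "best" solution has the most 1s. The problem, the names evolve and fitness, the population size and the mutation rate are all assumptions chosen to keep the sketch short.

```c
#include <stdlib.h>

#define GENES 16   /* bits per solution (toy problem size) */
#define POP   20   /* number of solutions alive at once */

/* Fitness of a solution: here simply the number of 1 genes. */
static int fitness(const int genes[])
{
    int i, ones = 0;
    for (i = 0; i < GENES; i++)
        ones += genes[i];
    return ones;
}

/* Index of the fittest (want_best != 0) or least fit population member. */
static int extreme_index(int pop[][GENES], int want_best)
{
    int i, pick = 0;
    for (i = 1; i < POP; i++) {
        int fi = fitness(pop[i]), fp = fitness(pop[pick]);
        if (want_best ? (fi > fp) : (fi < fp))
            pick = i;
    }
    return pick;
}

/* Run the toy genetic algorithm and return the best fitness at the end. */
int evolve(int generations, unsigned int seed)
{
    int pop[POP][GENES];
    int child[GENES];
    int g, i, j;

    srand(seed);
    for (i = 0; i < POP; i++)              /* a whole bunch of random solutions */
        for (j = 0; j < GENES; j++)
            pop[i][j] = rand() % 2;

    for (g = 0; g < generations; g++) {
        int mum = rand() % POP;
        int dad = rand() % POP;
        int cut = rand() % GENES;
        int worst;

        for (j = 0; j < GENES; j++)        /* breed: front of mum, back of dad */
            child[j] = (j < cut) ? pop[mum][j] : pop[dad][j];

        if (rand() % 10 == 0) {            /* mutate roughly one child in ten */
            j = rand() % GENES;
            child[j] = !child[j];
        }

        worst = extreme_index(pop, 0);     /* the child replaces the worst */
        if (fitness(child) >= fitness(pop[worst]))
            for (j = 0; j < GENES; j++)
                pop[worst][j] = child[j];
    }
    return fitness(pop[extreme_index(pop, 1)]);
}
```

Because only the least fit member is ever replaced, the best fitness in the population can never go down from one generation to the next.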

A sensible comparison of the methods

A Silly Comparison of Methods
Hill climbing is like dropping a kangaroo somewhere on the surface of the earth, telling it to only hop uphill and hoping it will get to the top of Mount Everest.

A Silly Comparison of Methods
Simulated Annealing is like doing the same but getting the kangaroo very, very drunk first. ("Hic!")

A Silly Comparison of Methods
Genetic Algorithms are like taking a whole plane load of kangaroos and letting them reproduce freely (not pictured)...

A Silly Comparison of Methods
...and regularly shooting the ones at lower altitudes. ("Aaaargh! Ouch!")

That’s all
• You have now learnt the C programming language
• How to document and test your code
• Some simple algorithms to solve general problems
• The rest is just practice...
No kangaroos were harmed in the making of this lecture.
