Strings

Learn the fundamentals of handling strings in C/C++, including character storage in RAM, the use of null characters, and string manipulation techniques.

albertshaw

Presentation Transcript


  1. Strings (Chapter 5)

  2. What you MUST know before we start (Remember: the topics in this course build on each other):
     • What characters are
     • How characters are stored in RAM
     • How characters differ from, and are the same as, integers
     • The use of C/C++ to manipulate characters

  3. Strings
     • A string is simply a numeric array (of data type char):
         int main(void)
         {
             int i;
             char chararray[5];
             chararray[0] = 'H';
             chararray[1] = 'e';
             chararray[2] = 'l';
             chararray[3] = 'l';
             chararray[4] = 'o';
             return 0;
         }
     • The declaration reserves 5 bytes of RAM at address chararray.
     • The assignments initialize each element of the array with a character.

  4. We could also have written the C code as:
         int main(void)
         {
             int i;
             char chararray[5];
             chararray[0] = 72;
             chararray[1] = 101;
             chararray[2] = 108;
             chararray[3] = 108;
             chararray[4] = 111;
     which would have exactly the same effect (on a machine that uses ASCII character codes).
     To print the array:
             for (i = 0; i < 5; i++)
                 printf("%c", chararray[i]);
             return 0;
         }

  5. How would this be stored in RAM? Assume that the base address of chararray (== &chararray[0]) is 1200:
         Address   Character   Decimal   Binary
         1200      H           72        01001000
         1201      e           101       01100101
         1202      l           108       01101100
         1203      l           108       01101100
         1204      o           111       01101111
         1205      'Garbage'             00011011
         1206      'Garbage'             11011010

  6. This is exactly the same as numeric arrays! True, except for one difference. Consider the sentences:
         The quality of mercy is not strained. It droppeth as the gentle rains from the heavens upon the place beneath.
     How many characters are there in the sentences (don't forget to count spaces and special characters)? As written here, with a single space between the two sentences, there are 110.
     There are a few points to be made:
     • Do we really want to count how many characters there are?
     • Do we really care about the positions (offsets from the base address) of the characters?

  7. What's the solution? What do we need to know?
     • The base address of the array. We can readily determine this by referring to the variable name (in our case, chararray == &chararray[0]).
     • Where the string ends.
     How do we know where a string ends? Right now, we don't. BUT if we were to add an additional character at the end of the string, we could check whether we had reached the end of the string.

  8. What character? In C, we add a null character ('\0') at the end of the array. Rewriting our (previous) C code:
         int main(void)
         {
             int i;
             char chararray[6];
             chararray[0] = 'H';
             chararray[1] = 'e';
             chararray[2] = 'l';
             chararray[3] = 'l';
             chararray[4] = 'o';
             chararray[5] = '\0';
             return 0;
         }
     NOTICE: We must allocate 1 byte more than the number of visible characters.

  9. How does this make things easier for us? In a number of respects:
     • Because we know that a null character will be added at the end of a string, we do NOT have to declare how many characters are in the string when we initialize it. We could have declared our string as:
         int main(void)
         {
             int i;
             char chararray[] = "Hello";
             /* ... */
         }
       which would have the same effect as our previous code.
     • We could also have printed our string with the command:
             puts(chararray);    OR    printf("%s\n", chararray);
       which prints the same characters as the loop below (followed by a newline):
             for (i = 0; i < 5; i++)
                 printf("%c", chararray[i]);

  10. How would this be stored in RAM? Again, assume that the base address of chararray (== &chararray[0]) is 1200:
          Address   Character   Decimal   Binary
          1200      H           72        01001000
          1201      e           101       01100101
          1202      l           108       01101100
          1203      l           108       01101100
          1204      o           111       01101111
          1205      \0          0         00000000
          1206      'Garbage'             11011010

  11. How does puts work?
      • puts is a standard function declared in <stdio.h>.
      • The function receives a base address and continues printing the elements of the array until it encounters a null character ('\0').
      • If we were to pass the base address of our string (chararray) as:
              puts(chararray);
        a simplified version of the function might appear as:
              void puts(char *base)              // we could have used: int i = 0;
              {
                  while (*base != '\0')          // or: while (base[i] != '\0')
                      printf("%c", *base++);     // or: printf("%c", base[i++]);
              }
        (The real puts also writes a newline after the string and returns an int.)
      How does this work?

  12. Assume, again, that the base address of:
              char chararray[] = "Hello";
      was 1200. The call:
              puts(chararray);
      to the function void puts(char *base) places the address 1200 in location base (assume address 1750). Looking at RAM, we would see:
          Address    1200   1201   1202   1203   1204   1205   |   1750 (base)
          Contents   H      e      l      l      o      \0     |   1200
      The first pass:
              while (*base != '\0')       True
              printf("%c", *base++);      Output so far: H      (base becomes 1201)

  13. Given our RAM layout:
          Address    1200   1201   1202   1203   1204   1205   |   1750 (base)
          Contents   H      e      l      l      o      \0     |   1201
      The next pass:
              while (*base != '\0')       True
              printf("%c", *base++);      Output so far: He     (base becomes 1202)
      RAM now appears as:
          Address    1200   1201   1202   1203   1204   1205   |   1750 (base)
          Contents   H      e      l      l      o      \0     |   1202
      The next pass:
              while (*base != '\0')       True
              printf("%c", *base++);      Output so far: Hel    (base becomes 1203)

  14. Given our new RAM layout:
          Address    1200   1201   1202   1203   1204   1205   |   1750 (base)
          Contents   H      e      l      l      o      \0     |   1203
      The next pass:
              while (*base != '\0')       True
              printf("%c", *base++);      Output so far: Hell   (base becomes 1204)
      RAM now appears as:
          Address    1200   1201   1202   1203   1204   1205   |   1750 (base)
          Contents   H      e      l      l      o      \0     |   1204
      The next pass:
              while (*base != '\0')       True
              printf("%c", *base++);      Output so far: Hello  (base becomes 1205)

  15. Given our new RAM layout:
          Address    1200   1201   1202   1203   1204   1205   |   1750 (base)
          Contents   H      e      l      l      o      \0     |   1205
      The next pass:
              while (*base != '\0')       False
      We are done with the loop.
      Are there other functions associated with strings (in the file <stdio.h>)? Yes, a number of them, including:
          gets     Get a string from the keyboard (until CR is entered). Note that gets cannot limit how much it reads and was removed from the C standard in C11.
          fputs    Write a string to a file
          fgets    Get a string from a file
      There is even a header file for strings: <string.h>

  16. What happens if we forget to add a null character? Strange things can happen. Consider the following C code:
          int main(void)
          {
              char chararray[5];
              chararray[0] = 'H';
              chararray[1] = 'e';
              chararray[2] = 'l';
              chararray[3] = 'l';
              chararray[4] = 'o';
              puts(chararray);
              return 0;
          }
      The output of this program might appear as: HelloZ followed by whatever other characters happen to be in memory. Why?

  17. Let's look at how chararray might be stored in RAM (again, assuming a base address of 1200):
          Address   Character            Decimal   Binary
          1200      H                    72        01001000
          1201      e                    101       01100101
          1202      l                    108       01101100
          1203      l                    108       01101100
          1204      o                    111       01101111
          1205      Z                    90        01011010
          1206      (not ASCII)          228       11100100
          1207      \0                   0         00000000
      REMEMBER: The function puts will keep printing characters until the null character is reached. Here it runs past the end of the array (addresses 1205 and 1206) before it finds one at 1207.

  18. Notice also that we have a convenient way to declare strings:
          int main(void)
          {
              int i;
              char chararray[] = "Hello";
              /* ... */
          }
      • We do NOT need to count the number of characters (the compiler will count for us).
      • We do NOT need to add a null character at the end (the compiler will add one for us).
      • In this case, 6 bytes (including one for the null character) will be reserved at base address chararray, and it will appear in RAM as before.

  19. Another function we mentioned, gets, will get a string from the keyboard (until a CR is entered) AND place a null character at the end. Assume we wished to get a number from the keyboard:
          int main(void)
          {
              char number[5];
              gets(number);
              return 0;
          }
      If we were to enter the number 42, and the base address of number were 9832, it would appear as:
          Address   Character   Decimal   Binary
          9832      '4'         52        00110100
          9833      '2'         50        00110010
          9834      '\0'        0         00000000
      BUT, this is NOT a number!

  20. When we type keystrokes, we enter characters. If we wish to store keystrokes as integer values, we must convert them. HOW?
      FIRST, we need to determine which characters can be converted:
      • the characters '0' to '9'
      • the characters '+' and '-', but only if they are the first character in the string
      When converting, what must we consider?
      • Whether the character is legal
      • The position of the character in the array
      Why position?
      • If the string = "6", the integer value is: 6
      • If the string = "32", the integer value is: 3*10 + 2 = 32
      • If the string = "675", the integer value is: 6*100 + 7*10 + 5 = 600 + 70 + 5 = 675

  21. Consider the following C code:
          int main(void)
          {
              char nstring[] = "724";   // the character string to convert
              int num = 0,              // num will hold the converted value
                  offset = 0;           // array index/offset
              // check if the characters are legal
              while ((nstring[offset] >= '0') && (nstring[offset] <= '9'))
                  // if yes, then convert AND set positional value
                  num = num*10 + nstring[offset++] - '0';
              return 0;
          }
      Following the instructions through the loop:
          offset   nstring[offset]   condition   num = num*10 + nstring[offset++] - '0'   num (after)
          0        '7' = 55          TRUE        0*10 + 55 - 48 = 7                       7
          1        '2' = 50          TRUE        7*10 + 50 - 48 = 72                      72
          2        '4' = 52          TRUE        72*10 + 52 - 48 = 724                    724
          3        '\0' = 0          FALSE       ** Loop terminated                       724

  22. The only drawback to this program is that it takes into account neither the sign nor 'white space' (spaces, CR and tabs). Consider the following C code:
          int atoi(const char *stringnum);          // function prototype

          int main(void)
          {
              int num;
              const char *nstring = "-82";          // let's check a negative number
              num = atoi(nstring);                  // call the function
              return 0;
          }

          int atoi(const char *stringnum)
          {
              int n = 0,                            // n will hold the number
                  sign = 1;                         // if unsigned then positive
              while (*stringnum == ' ' || *stringnum == '\n' || *stringnum == '\t')
                  stringnum++;                      // skip the white space
              if ((*stringnum == '+') || (*stringnum == '-'))   // if signed
              {
                  if (*stringnum == '-')            // negative number?
                      sign = -1;                    // then mark it
                  stringnum++;                      // and go to next character
              }
              // the rest is the old procedure, BUT using pointers
              while ((*stringnum >= '0') && (*stringnum <= '9'))   // legal value?
              {
                  n = n * 10 + *stringnum - '0';    // determine number to date
                  stringnum++;                      // go to next position
              }
              return (sign * n);                    // return the SIGNED value
          }

  23. Must we write this code each time we input characters from the keyboard (OR from a file)? YES and NO:
      • Each time we get numeric data from the keyboard (or an ASCII file), we MUST convert.
      • Because the conversions are so common, there are readily available library routines in <stdlib.h>:
          Function Name   Meaning            Action
          atoi            alpha to integer   Convert: string to int
          atol            alpha to long      Convert: string to long
          atof            alpha to float     Convert: string to double

  24. Do we need to convert integers (or longs, or floats) to strings? YES. Whenever we store numeric values to an ASCII file, we MUST convert to a string.
      How do we convert? Much like we did when we converted from decimal to binary. Consider the conversion needed for 87 (base 10) to binary:
          Divisor   Dividend   Quotient   Remainder
          2         87         43         1
          2         43         21         1
          2         21         10         1
          2         10         5          0
          2         5          2          1
          2         2          1          0
          2         1          0          1
      Remember the conversion:
      • Divide by 2 (for binary)
      • Collect the remainders from bottom to top: 1 0 1 0 1 1 1
      What does this have to do with converting from integers to strings?

  25. Instead of dividing by 2 and keeping track of the remainder, we divide by 10 and keep track of the remainder. Consider converting the integer 5409 to the string "5409":
          Divisor   Dividend   Quotient   Remainder   Convert to
          10        5409       540        9           '9'
          10        540        54         0           '0'
          10        54         5          4           '4'
          10        5          0          5           '5'
      Collect from the bottom: "5409"
      The only difference is that we must first convert each digit to a character. How do we convert to a character? Simple: add 48 (the character '0') to the remainder.

  26. Consider the following C code:
          int main(void)
          {
              int decimal = 5409,     // the integer we wish to convert
                  idx = 0;            // the character array index
              char bin[10];           // the character array for our string
              while (decimal > 0)     // continue until the quotient is 0 (zero)
              {
                  bin[idx++] = decimal % 10 + '0';   // store the remainder as a character
                  decimal = decimal / 10;            // get the new quotient
              }
              bin[idx] = '\0';        // set in the null character
              return 0;
          }
      This would execute as:
          decimal   idx   decimal > 0   decimal % 10 + 48      decimal / 10       bin
          5409      0     TRUE          5409 % 10 + 48 = 57    5409 / 10 = 540    "9"
          540       1     TRUE          540 % 10 + 48 = 48     540 / 10 = 54      "90"
          54        2     TRUE          54 % 10 + 48 = 52      54 / 10 = 5        "904"
          5         3     TRUE          5 % 10 + 48 = 53       5 / 10 = 0         "9045"
          0         4     FALSE
      and: bin[idx] = '\0' gives "9045\0"

  27. NOTE, however, that our string is "9045\0". We must reverse the order. If we were to ADD the C code:
              int offset = 0;             // we will start with the first character
              char temp;                  // for temporary storage
              idx--;                      // DON'T move '\0' (idx now contains 3)
              while (idx > offset)
              {
                  temp = bin[idx];            // store uppermost non-swapped character
                  bin[idx--] = bin[offset];   // move lower char. to upper & decrement
                  bin[offset++] = temp;       // move in the old uppermost character
              }
      Following the loop:
          idx   offset   bin (before)   idx > offset   temp   bin[idx--]   bin[offset++]   bin (after)
          3     0        "9045\0"       TRUE           '5'    '9'          '5'             "5049\0"
          2     1        "5049\0"       TRUE           '4'    '0'          '4'             "5409\0"
          1     2        "5409\0"       FALSE          *** We are out of the loop
      And the string is in the correct order.

  28. Must we write this code each time we write numeric values to an ASCII file? YES and NO:
      • Each time we write numeric data to an ASCII file, we MUST convert.
      • Because the conversions are so common, some compilers provide library routines in <stdlib.h>:
          Function Name   Meaning            Action
          itoa            integer to alpha   Convert: int to string
          ltoa            long to alpha      Convert: long to string
          ftoa            float to alpha     Convert: float to string
      Check your manuals: these routines are not part of standard C, so they may be missing from your compiler, and the parameters passed differ from one library to another. The standard formatted-output functions in <stdio.h> (such as sprintf) perform the same conversions portably.

  29. Strings
