Data Representation, Overflow, Limits

There are no good explanations of this topic in the books, so try to understand it in class and from the slides.

Representation of Data
All data in the computer's memory and files is represented as a sequence of bits.
• Bit: the unit of storage; it represents the level of an electrical charge and can be either 0 or 1.
• Byte: another unit of storage, made up of 8 bits.
A bit sequence can represent many different things:
• A bit string such as 10000000111000001011111110100010 can mean several different things, depending on the representation that is agreed upon.
So, how should we represent integers, characters, real numbers, strings and structures in terms of bits?
• Representations must be efficient and convenient.
• We will see some of them.

Characters
In C/C++, characters are actually one-byte integers with a special meaning as characters, given by the ASCII mapping.
    char c = 'a'; // stores the code corresponding to the letter 'a', but
                  // prints the character a when printed
ASCII Standard
• American Standard Code for Information Interchange
• Dates back to the 1960s
• Standard ASCII defines 128 codes (0 . . . 127); one byte can hold 256 different codes (0 . . . 255) and corresponding characters
• The characters with codes 0 . . 31 and 127 are control characters
• At that time, standardizing communication and telegraphic codes was important. That is why most of the control characters serve this purpose and are now obsolete, although some OSs still implement some of them.
• Extended ASCII (128 – 255): there are different conventions and interpretations

ASCII
These are the ASCII codes. Of course you are not expected to memorize them; just know that there are codes for special characters and that the digits, and the letters of each case, are consecutive (so one can do '1'+5 to get the code of '6', or subtract 32 from a lowercase letter to get the corresponding uppercase letter).
Codes 0 . . 31 and 127 are special control characters. Most of them are now obsolete. Some OSs implement some of them, and the meanings may change from OS to OS. The control characters shown in blue on the slide are the ones important for Windows/DOS.
See http://www.asciitable.com/ and http://en.wikipedia.org/wiki/ASCII for more info.
|  0 NUL|  1 SOH|  2 STX|  3 ETX|  4 EOT|  5 ENQ|  6 ACK|  7 BEL|
|  8 BS |  9 HT | 10 LF | 11 VT | 12 FF | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |
| 32 SP | 33 !  | 34 "  | 35 #  | 36 $  | 37 %  | 38 &  | 39 '  |
| 40 (  | 41 )  | 42 *  | 43 +  | 44 ,  | 45 -  | 46 .  | 47 /  |
| 48 0  | 49 1  | 50 2  | 51 3  | 52 4  | 53 5  | 54 6  | 55 7  |
| 56 8  | 57 9  | 58 :  | 59 ;  | 60 <  | 61 =  | 62 >  | 63 ?  |
| 64 @  | 65 A  | 66 B  | 67 C  | 68 D  | 69 E  | 70 F  | 71 G  |
| 72 H  | 73 I  | 74 J  | 75 K  | 76 L  | 77 M  | 78 N  | 79 O  |
| 80 P  | 81 Q  | 82 R  | 83 S  | 84 T  | 85 U  | 86 V  | 87 W  |
| 88 X  | 89 Y  | 90 Z  | 91 [  | 92 \  | 93 ]  | 94 ^  | 95 _  |
| 96 `  | 97 a  | 98 b  | 99 c  |100 d  |101 e  |102 f  |103 g  |
|104 h  |105 i  |106 j  |107 k  |108 l  |109 m  |110 n  |111 o  |
|112 p  |113 q  |114 r  |115 s  |116 t  |117 u  |118 v  |119 w  |
|120 x  |121 y  |122 z  |123 {  |124 |  |125 }  |126 ~  |127 DEL|
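The two tricks mentioned with the table ('1'+5 and subtracting 32) can be sketched as small helper functions. This is only an illustration: the names shift_digit and to_upper_ascii are hypothetical, and the code assumes an ASCII-compatible character set.

```cpp
// Characters are small integers, so ordinary arithmetic works on them.
// Both helpers assume an ASCII-compatible character set.

// Digit character that lies `offset` places after `digit`, e.g. '1' + 5 gives '6'.
// The caller is assumed to stay within '0'..'9'.
constexpr char shift_digit(char digit, int offset) {
    return static_cast<char>(digit + offset);
}

// Lowercase ASCII letter to uppercase: the codes differ by 'a' - 'A' = 32.
// Any other character is returned unchanged.
constexpr char to_upper_ascii(char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c - ('a' - 'A')) : c;
}
```

For example, shift_digit('1', 5) yields '6' and to_upper_ascii('q') yields 'Q'.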

Integer Number Representation
• Sign-Magnitude Representation
• 1's complement Representation
• 2's complement Representation
• Comparison of different representations

Number Representation Fundamental problem: • Fixed-size representation (e.g. 4 bytes for integers) can’t encode all numbers • Usually sufficient in most applications, • But a potential source of bugs: overflow • need to be careful of it Other problems: • How to represent negative numbers, floating points? • Historically, many different representations. • How to do subtraction effectively?

Base 2 – unsigned numbers
MSB – Most Significant Bit, LSB – Least Significant Bit
8-bit binary representation of non-negative integers:
    0 0 0 0 0 0 0 0 → 0
    0 0 0 0 0 0 0 1 → 1
    0 0 0 0 0 0 1 0 → 2
    0 0 0 0 0 0 1 1 → 3
    ...
    1 1 1 1 1 1 1 1 → 255
• Representation: an n-bit number in base b has decimal value $\sum_{i=0}^{n-1} d_i \cdot b^i$
• $d_i$ is the coefficient of the ith digit (the ith bit in base 2).
• Bit 0 is the LSB and bit n-1 is the MSB.
• Example for base 2 (binary): $1011_2 = 1 \times 2^0 + 1 \times 2^1 + 0 \times 2^2 + 1 \times 2^3 = 11_{10}$
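The positional rule for unsigned binary numbers can be sketched in a few lines. The function name unsigned_value is hypothetical, and the sketch assumes the input contains only the characters '0' and '1' and has at most 32 of them.

```cpp
#include <cstdint>

// Value of an unsigned bit string such as "1011".
// Reading left to right, each step doubles the running value and adds the
// next bit, which is equivalent to summing d_i * 2^i with bit 0 as the LSB.
// Assumes only '0'/'1' characters and at most 32 bits.
constexpr std::uint32_t unsigned_value(const char* bits) {
    std::uint32_t value = 0;
    for (const char* p = bits; *p != '\0'; ++p) {
        value = value * 2 + static_cast<std::uint32_t>(*p - '0');
    }
    return value;
}
```

With this sketch, unsigned_value("1011") is 11 and unsigned_value("11111111") is 255.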

Sign/Magnitude representation (also called "signed representation")
• use one of the bits (the first bit = Most Significant Bit) as a sign bit.
• use the rest for the magnitude
• e.g.
    000 = +0
    001 = +1
    010 = +2    positive numbers
    011 = +3
    100 = -0
    101 = -1
    110 = -2    negative numbers
    111 = -3
• range: $-(2^{n-1}-1)$ to $2^{n-1}-1$, where n is the total number of bits
    For n = 4: $[-(2^3-1) \ldots 2^3-1] = [-7 \ldots 7]$

Alternative representations Most computers don’t use a “sign and magnitude” representation • Drawbacks of the Sign-Magnitude representation: • two 0s: one positive one negative • addition and subtraction involving negative numbers are complicated Alternatives? • 1's complement representation • 2's complement representation: today's standard These two representations seem very similar in approach, but they differ in: • Representation of negative numbers (positives are the same in all 3 representations) and • Ease of arithmetic operations involving negative numbers

Signed numbers: 1's complement
• Positive numbers: the first bit is 0, and the rest is the binary equivalent of the number.
• Negative numbers: represented by the 1's complement of the corresponding positive number
• 1's complement: invert all the bits (0's become 1; 1's become 0)
    e.g. +8 = 0000 1000 (0 for the + sign, and 000 1000 for 8)
         -8 = 1111 0111
• So, effectively the first bit is used for the sign, but negative numbers look different from those of the sign-magnitude representation.
• How about 0?

Number   1's complement
+7       0111
+6       0110
+5       0101
+4       0100
+3       0011
+2       0010
+1       0001
+0       0000
-0       1111
-1       1110
-2       1101
-3       1100
-4       1011
-5       1010
-6       1001
-7       1000
As in the sign-magnitude ("signed") representation, there is a +0 and a -0.
Range: $[-(2^{n-1}-1) \ldots 2^{n-1}-1]$
For n = 4: $[-(2^3-1) \ldots 2^3-1] = [-7 \ldots 7]$

Signed numbers: 2's complement
Signed 2's complement is the common representation for signed numbers used in computers, and the current standard for representing signed integers.
• For positive numbers, the first bit is 0 and the remaining bits are the binary equivalent of the magnitude.
• Negative numbers are represented by the 2's complement of the corresponding positive number.
• 2's complement: invert all bits and add 1
• Alternative (easier) method: going from right to left, copy all the bits up to and including the first 1, then invert the rest
    Ex: +20 = 0001 0100
        -20 = 1110 1100
• single 0
• addition and subtraction complexities are simplified
• note the range (one more negative number as compared to 1's complement): $[-2^{n-1} \ldots 2^{n-1}-1]$
    For n = 4: $[-2^3 \ldots 2^3-1] = [-8 \ldots 7]$

Number   2's complement
+7       0111
+6       0110
+5       0101
+4       0100
+3       0011
+2       0010
+1       0001
 0       0000
-1       1111
-2       1110
-3       1101
-4       1100
-5       1011
-6       1010
-7       1001
-8       1000
There is only one zero.
There is one more negative number than there are positive numbers.

Possible Representations: summary
Bits   Sign-Magnitude   One's Complement   Two's Complement
000    +0               +0                 +0
001    +1               +1                 +1
010    +2               +2                 +2
011    +3               +3                 +3
100    -0               -3                 -4
101    -1               -2                 -3
110    -2               -1                 -2
111    -3               -0                 -1
Notice: positive numbers are represented the same way (same bit strings) in all representations! So for all three representations, the representation of a positive number is a direct decimal to binary / binary to decimal conversion.

Decimal Conversion for Negatives
If you are given a bit string representing a negative number, you can find the decimal equivalent depending on the number representation used.
• if sign/magnitude representation is used:
    • If the MSB is 1, the number is negative,
    • but this bit makes no contribution to the magnitude. Convert the remaining bits to decimal to get the magnitude.
    • For example, 10010010 is equivalent to -18: -(1x16 + 1x2)
• if 1's complement representation is used:
    • If the MSB is 1, the number is negative
    • To find the magnitude:
        • invert all bits (i.e. negate): 10010010 => 01101101
        • find the positive number corresponding to the negated string
        • 1x64 + 1x32 + 1x8 + 1x4 + 1x1 = 109
    • 10010010 is equivalent to -109
    • note that this is the reverse of what we would do to find the bit representation of -109 (find the bit representation of 109, take its 1's complement)

Decimal Conversion for Negatives – ctd.
• if 2's complement representation is used:
    • If the MSB is 1, the number is negative.
    • To find the magnitude:
        • invert all bits: 1001 0010 => 0110 1101
        • add 1 => 0110 1110
        • this is the negated value
        • find the positive number corresponding to the negated string (01101110)
        • 1x64 + 1x32 + 1x8 + 1x4 + 1x2 = 110
    • 10010010 is equivalent to -110
    • note that this is the reverse of what we would do to find the bit representation of -110 (find the bit representation of 110, take its 2's complement)

Alternative decimal conversion – 2's complement
You can also directly and quickly find the decimal equivalent of a 2's complement number:
• use the usual binary to decimal conversion, but give the most significant bit a negative weight
    Bit:       1     0     0     1     0     0     0+1   0
    (pattern 1 0 0 1 0 0 1 0, read left to right)
    Weight:   -128   64    32    16    8     4     2     1
    Power:    -2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
Hence: $10010010_2 = -1 \times 2^7 + 1 \times 2^4 + 1 \times 2^1 = -110_{10}$
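To compare the three decoding rules side by side, here is a minimal sketch that reads the same 8-bit pattern under each representation. The function names are hypothetical; the pattern is passed as an unsigned byte so that the code itself does not depend on how the compiler stores negative numbers.

```cpp
#include <cstdint>

// Sign/magnitude: the MSB is only the sign, the low 7 bits are the magnitude.
constexpr int sign_magnitude_value(std::uint8_t bits) {
    return (bits & 0x80) ? -(bits & 0x7F) : (bits & 0x7F);
}

// 1's complement: if the MSB is set, invert all 8 bits to get the magnitude.
constexpr int ones_complement_value(std::uint8_t bits) {
    return (bits & 0x80) ? -((~bits) & 0xFF) : bits;
}

// 2's complement: usual weights, except that the MSB has weight -128.
constexpr int twos_complement_value(std::uint8_t bits) {
    return (bits & 0x7F) - ((bits & 0x80) ? 128 : 0);
}
```

For the pattern 10010010 (0x92) these return -18, -109 and -110 respectively, matching the worked conversions. Note that an int cannot distinguish -0 from +0, so the two zero patterns of the first two representations both come back as 0.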

Conversion to decimal with 32-bit numbers – 2's complement
Same idea as for 8-bit 2's complement integers, but the weight of the most significant bit is $-2^{31}$.
    Weight:   -2,147,483,648   ...   64    32    16    8     4     2     1
    Power:    -2^31   2^30     ...   2^6   2^5   2^4   2^3   2^2   2^1   2^0

A note
Converting n-bit numbers into numbers with more than n bits (sign extension):
• copy the most significant bit (the sign bit) into the new bits
• Example: 4-bit to 8-bit
    0010 -> 0000 0010 (both have decimal value 2)
    1010 -> 1111 1010 (both have decimal value -6 in 2's complement)
• This method is valid for both 1's and 2's complement representations
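Copying the sign bit into the new bit positions can be sketched for the 4-bit to 8-bit case as follows. The name sign_extend_4_to_8 is hypothetical, and the 4-bit pattern is assumed to sit in the low four bits of the argument.

```cpp
#include <cstdint>

// Widens a 4-bit pattern (held in the low nibble) to 8 bits by copying
// the sign bit (bit 3) into bits 4..7.
constexpr std::uint8_t sign_extend_4_to_8(std::uint8_t nibble) {
    return (nibble & 0x08) ? static_cast<std::uint8_t>(nibble | 0xF0)
                           : static_cast<std::uint8_t>(nibble & 0x0F);
}
```

Here 0010 becomes 0000 0010 (0x02) and 1010 becomes 1111 1010 (0xFA).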

Subtraction
a-b can always be written as a+(-b). Doing the arithmetic this way with plain binary addition gives wrong results in the sign-magnitude and 1's complement representations, but not in 2's complement. We will see an example now.
Consider 3 - 2, which is the same as 3 + (-2).
In sign-magnitude representation using 4 bits, 3 + (-2) should give 1, but instead we get -5!
      0 011 = +3
      1 010 = -2
    +---------
      1 101 = -5   which is a wrong result
To remedy this, the operation can take special notice of the sign bits and perform a subtraction instead. This complicates the implementation; we have a better solution using 2's complement (next slide).

Subtraction
In 1's complement representation using 4 bits, 3 + (-2) becomes 0!
      0 011 = +3
      1 101 = -2
    +---------
    1 0 000 =  0   which is a wrong result (1's complement needs an extra step, adding the carry out back into the sum, to get 0001)
But two's complement addition results in the correct sum without hassle.
      0 011 = +3
      1 110 = -2
    +---------
    1 0 001 = +1   which is the correct result!
We got rid of the carry out of the MSB automatically since it does not fit.

Why 2's Complement?
There is only one zero.
The range for negative numbers is one more than in the other representations.
Subtraction can be implemented as addition (a - b = a + (-b)). Thus no borrowing logic needs to be implemented.
Let us give two 8-bit examples.
    97 - 120 = ?                  -51 - 70 = ?
      01100001 →   97               11001101 →  -51
      10001000 → -120               10111010 →  -70
    +-------------                +-------------
      11101001 →  -23             1 10000111 → -121
Due to the fixed width of the registers, the carry out of the MSB is lost automatically.
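The idea that subtraction is just addition of the negated operand, with the carry dropped by the fixed width, can be sketched on raw 8-bit patterns. The names negate8 and subtract8 are hypothetical; unsigned bytes are used so that the bit manipulation is fully defined by the language.

```cpp
#include <cstdint>

// Two's complement negation on an 8-bit pattern: invert all bits, add 1.
constexpr std::uint8_t negate8(std::uint8_t bits) {
    return static_cast<std::uint8_t>(~bits + 1);
}

// a - b computed as a + (-b); the cast keeps only the low 8 bits,
// which discards any carry out of the MSB.
constexpr std::uint8_t subtract8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + negate8(b));
}
```

With these, subtract8(0x61, 0x78) gives 0xE9 (1110 1001, i.e. -23 for 97 - 120), and subtract8(0xCD, 0x46) gives 0x87 (1000 0111, i.e. -121 for -51 - 70). Negating twice returns the original pattern.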

Two's Complement – Negation
Negating a two's complement number is simple:
• Start at the least significant bit. Copy through the first "1"; after that, invert each bit.
• Example: 0010101100 → 1101010100
• Alternatively, invert all bits and add one to the least significant bit
If you negate twice, you arrive back at the same number:
    0011 ( 3) → 1101 (-3) → 0011 ( 3)

Important Note on Terminology!
"2's complement" (or "two's complement") does not mean a negative number! 2's complement is a representation used for all integers, not just negative integers! So 2's complement is a format specification, but we also use the term "2's complement of a number" for its negation. E.g. when we want to negate a number, either from positive to negative or from negative to positive, we may say "take its 2's complement".

Overview of Built-in Types and Their Ranges

Built-in Types in C++
The types which are part of the C++ language; not implemented as a class
• char
• int, long (a.k.a. long int), short (a.k.a. short int)
• float, double
• Mostly for numeric data representation
• There are signed (the default) and unsigned versions for integer/char storage
• Signed integer representation uses 2's complement
Now we are going to see some characteristics of these types and their limits. Some of these discussions are not new to you (discussed in CS201), but after learning data representation and 2's complement, they will mean more to you.

char
The type char is known to store an ASCII character, but actually it stores a signed one-byte integer (2's complement representation).
• Since there is no other one-byte integer type in C++, char is widely used as an integer as well
• Of course, it is also used to store a character (as seen at the beginning of these slides)
    char ch;
    ch = 'A'; //valid
    ch = 99;  //valid
The type char is "signed" by default in Visual Studio
• The range is -128 . . 127
    ch = -25; //valid
    ch = 135; //out of range, but not a syntax error. Compiler gives a warning
• 135 is out of range but still fits in 8 bits (1000 0111). When you have
    cout << ch;
• the output is the character for which the ASCII code is 135. However, when you have
    cout << (int) ch;
• the stored 8-bit string 1000 0111 is interpreted as a signed (2's complement) integer and extended to 32 bits: 11111111 11111111 11111111 10000111, which is -121 in 2's complement representation. Thus you see -121 as the output.

unsigned char
You can change the default behaviour to unsigned by changing the project properties
• Open the project's Property Pages dialog box.
• Click on "C/C++"
• Click on "Command Line"
• Add the /J compiler option.
You can also explicitly specify a char variable as unsigned by putting the keyword unsigned before char.
• For non-negative one-byte integers. Since there are no negatives, there is no need to use 2's complement
• The range is 0 . . 255.
• For ASCII interpretation, signed and unsigned do not make a difference: the character is the one corresponding to the binary representation
    unsigned char ch;
    ch = 200; //valid; the ASCII character with code 200
    ch = -25; //out of range, but not a syntax error. Compiler gives a warning
• -25 is out of range but can be represented in 2's complement as 11100111 in binary, and the unsigned interpretation of this bit string is 231. Thus:
    cout << ch;
• displays the character for which the ASCII code is 231.

int, short, long, long long
The "signed" integer types of C++
They use 2's complement representation
int
• The most commonly used signed integer type of C++
• Traditionally the number of bytes used is the word size of the processor
• So on 32-bit computers it is 4 bytes, and by that logic it would be 8 bytes on 64-bit computers
• However, Visual Studio (like most current compilers) fixes it to 4 bytes: thus, in CS204 we can assume that int always uses 4 bytes
• But if you port your code to another platform using another compiler, do not trust that int uses 4 bytes.
• Range: INT_MIN to INT_MAX (these are defined in the limits.h or climits header file)
    $-2^{n-1} \ldots 2^{n-1}-1$ where n is the number of bits used
    32 bits (our case): $-2^{31} \ldots 2^{31}-1$, i.e. -2,147,483,648 . . . 2,147,483,647
    64 bits: $-2^{63} \ldots 2^{63}-1$, i.e. -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807
long (can also be written as long int)
    long num; //can also be defined as long int num;
• Signed integer that always uses 4 bytes in Visual Studio (on some other platforms it is 8 bytes)
• With 4 bytes, the range is the same as for 32-bit int

int, short, long, long long
long long (can also be written as long long int)
    long long wow; //can also be defined as long long int wow;
• 64-bit signed integer in Visual Studio (the C++ standard guarantees at least 64 bits)
• It started out as a compiler extension and became part of standard C++ with C++11, so code to be ported to older compilers may not be able to use it.
• Range: LLONG_MIN to LLONG_MAX ($-2^{63} \ldots 2^{63}-1$)
short (can also be written as short int)
    short count; //can also be defined as short int count;
• Signed integer that uses 2 bytes (in Visual Studio and on most platforms)
• Range: SHRT_MIN to SHRT_MAX (these are defined in the limits.h or climits header file)
    $-2^{15} \ldots 2^{15}-1$, i.e. -32768 . . . 32767
    count = 31500; //valid
    count = 35000; //out of range, but not a syntax error. Compiler gives a warning
• So, what is the output of cout << count; ?
• It displays -30536. Why?
• Write 35000 in binary in 16 bits and interpret this bit string as a 2's complement signed number
    $35000_{10} = 1000\ 1000\ 1011\ 1000_2$, which as a signed number is 8 + 16 + 32 + 128 + 2048 - 32768 = -30536

unsigned integers
In order to store only non-negative values, char, int, short, long and long long can be defined as unsigned by putting the unsigned keyword before the type name.
    unsigned int mynum;
    unsigned short cinekop;     // same as unsigned short int cinekop;
    unsigned long lufer;        // same as unsigned long int lufer;
    unsigned long long kofana;  // same as unsigned long long int kofana;
In unsigned representation there is no sign bit; the most significant bit is part of the magnitude. Thus we do not need 2's complement.
• In this way, we can use the full range (2, 4 or 8 bytes) for zero and positive values.
The ranges become (note that the positive range is almost doubled as compared to signed integers):
• 16-bit: 0 to USHRT_MAX (defined in the limits.h or climits header file)
    $0 \ldots 2^{16}-1$, i.e. 0 . . . 65535
• 32-bit: 0 to UINT_MAX (defined in the limits.h or climits header file)
    $0 \ldots 2^{32}-1$, i.e. 0 . . . 4,294,967,295
• 64-bit: 0 to ULLONG_MAX (defined in the limits.h or climits header file)
    $0 \ldots 2^{64}-1$, i.e. 0 . . . 18,446,744,073,709,551,615

unsigned integers
Unsigned variables do not store negatives, but nothing can stop us from assigning a negative value to an unsigned variable.
    unsigned short num;
    num = -25;
    cout << num;
-25 is negative, so it is represented using 2's complement. The resulting bit string is then interpreted as an unsigned number since it is assigned to an unsigned variable (implicit typecasting).
    $-25_{10} = 1111\ 1111\ 1110\ 0111_2 = 65511_{10}$
So the output becomes 65511.
Of course, it is not normal programmer behavior to assign a negative value to an unsigned variable, but such things may occur unintentionally. If you use a literal or constant on the right-hand side of the assignment, the compiler may warn you (depending on the warning level). However, if the right-hand side is an expression, then the problem occurs at run time and the compiler cannot see it. Thus, you have to know what happens in such situations to locate the problem easily.

Limits
You can include limits.h, which defines the ranges of integers (depending on your platform/computer)
    #include <limits.h>
    OR
    #include <climits>
Tip: Type #include <limits.h> (or any other filename) in your program, then go to that line, right click on the file name and choose "Open Document". That will bring up this header file. You can do this in general, and it will save you the effort of locating the file.
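As a small illustration of querying the limits from code, the sketch below reads them through std::numeric_limits from the standard <limits> header, which reports the same values as the macros in <climits>. The helper names bits_in, lowest_value and highest_value are hypothetical.

```cpp
#include <climits>
#include <limits>

// Size of a built-in type in bits on the current platform.
template <typename T>
constexpr int bits_in() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Smallest and largest value of a built-in integer type, taken from <limits>.
template <typename T>
constexpr T lowest_value() {
    return std::numeric_limits<T>::min();
}

template <typename T>
constexpr T highest_value() {
    return std::numeric_limits<T>::max();
}
```

For example, highest_value<short>() equals SHRT_MAX and lowest_value<int>() equals INT_MIN, whatever those happen to be on the platform in use.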

Typecasting between signed and unsigned numbers
Typecasting may be done explicitly, or sometimes it happens implicitly (e.g. when you assign an unsigned variable to a signed one, or vice versa)
• So, you should know how it works
Signed to unsigned typecasting
• Represent the signed number using 2's complement format.
• Interpret this bit string as unsigned
    • If the MSB is 0, then the signed and unsigned values are the same
    • If the MSB is 1, then the signed value is negative. For the unsigned conversion, the MSB is not considered as the sign bit; it is interpreted as part of the magnitude.
• 2 slides ago we had an example, but let us give another one.
    short ints = -30000;
    unsigned short intus = ints; //implicit typecasting
    cout << intus;
Output is 35536, which has the same bit representation as -30000

Typecasting between signed and unsigned numbers
Unsigned to signed typecasting
• Represent the unsigned number as a bit string.
• Interpret this bit string as signed
    • If the MSB is 0, then the unsigned and signed values are the same
    • If the MSB is 1, then interpret the bit string as a negative value in 2's complement
        • I do not mean to take the 2's complement.
        • But you can take the 2's complement to understand the magnitude of this negative number
• Examples
    unsigned short usnum = 30000;
    cout << (short) usnum;
    • Output is 30000; the MSB of usnum is 0.
    unsigned short usnum = 63000;
    cout << (short) usnum;
    • Output is -2536; the MSB of usnum is 1.
Moral of the story behind typecasting: they are all the same bit strings; the only thing that changes is how to interpret them
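Both directions of the conversion can be sketched for 16-bit values. The function names are hypothetical. The unsigned-to-signed direction is written arithmetically here, as an assumption-free alternative to a plain cast, because the result of casting an out-of-range value to a signed type was implementation-defined before C++20.

```cpp
#include <cstdint>

// Signed -> unsigned: the bit pattern is kept and the MSB becomes part of
// the magnitude. This cast is well defined: the result is value modulo 2^16.
constexpr std::uint16_t as_unsigned16(std::int16_t value) {
    return static_cast<std::uint16_t>(value);
}

// Unsigned -> signed: if the MSB is set, the pattern stands for bits - 2^16.
// Computed in int, so the final cast is always in range.
constexpr std::int16_t as_signed16(std::uint16_t bits) {
    return static_cast<std::int16_t>(bits >= 0x8000 ? bits - 0x10000 : bits);
}
```

as_unsigned16(-30000) is 35536 and as_unsigned16(-25) is 65511, while as_signed16(63000) is -2536 and as_signed16(30000) stays 30000.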

Some tips about selecting integer type
You may consider using an unsigned variable if you will store a non-negative number. Well, if you are too close to 0, this is a bit risky. Consider the following loop:
    unsigned int j;
    for (j = 5; j >= 0; j--)
        cout << j << endl;
This loop is infinite. When j is 0, it is decremented and you expect to have -1. Actually it is -1 as a bit string (a bit string with all 1's in it). This bit string is the largest unsigned integer when interpreted as unsigned int. Thus it is >= 0.
Moral of the story: use unsigned only if you are sure that the value of the variable will never go below 0. Otherwise use signed integers.
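If an unsigned counter really is needed, one common way to count down to 0 without the wrap-around is to test before decrementing. The sketch below is one possible workaround, not the only one; the function name is hypothetical, and it sums the visited values only so that the loop has an observable result. It assumes start is smaller than the largest unsigned int.

```cpp
// Sums start + (start - 1) + ... + 0 using an unsigned counter.
// The condition `j-- > 0` compares first and decrements afterwards, so the
// body sees j = start, start - 1, ..., 0 and the loop ends when the
// comparison finds 0, instead of relying on j ever becoming negative.
constexpr unsigned int sum_down_to_zero(unsigned int start) {
    unsigned int sum = 0;
    for (unsigned int j = start + 1; j-- > 0; ) {
        sum += j;
    }
    return sum;
}
```

sum_down_to_zero(5) visits 5, 4, 3, 2, 1, 0 and returns 15.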

Some tips about selecting integer type
Do not mix signed and unsigned numbers in an expression. Some strange things may happen. Consider the following code:
    unsigned int a = 5;
    int b = -10;
    if (a+b < 0)
        cout << "hede" << endl;
    else
        cout << "hodo" << endl;
You expect the output to be "hede", but it displays "hodo". Why?
In C++, there is a rule saying that the built-in operands of an operator must be of the same type. If they are different, one of them is implicitly typecast to the other before the expression is evaluated.
• Typical case: if you add an integer to a double, the integer is converted to double before the operation.
• In the example above, the signed operand (b) is typecast to unsigned. -10 is a binary number with lots of 1's in 2's complement format. Thus as unsigned it is a big number. Adding 5 gives another big unsigned number, and an unsigned value can never be less than 0.
RULE: If there is a signed and an unsigned number of the same size in an expression (e.g. int and unsigned int), the signed one is automatically converted to unsigned by the compiler before the evaluation.
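The effect of the implicit conversion, and one way around it, can be sketched as follows. The function names are hypothetical, and the second function assumes that long long is wider than unsigned int, which holds on common platforms but is not guaranteed by the language.

```cpp
// Sum as the compiler evaluates it for `unsigned int + int`:
// b is converted to unsigned int first, so the result is unsigned too.
constexpr unsigned int mixed_sum(unsigned int a, int b) {
    return a + b;
}

// Same sum after converting to a wider signed type.
// Assumption: long long can hold every unsigned int value.
constexpr long long signed_sum(unsigned int a, int b) {
    return static_cast<long long>(a) + b;
}
```

mixed_sum(5u, -10) is a large non-negative number (the unsigned reading of the bit pattern of -5), whereas signed_sum(5u, -10) is -5, so a test against 0 behaves as expected on the second one.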

Some tips about selecting integer type
Which integer size to use? Of course, this depends on the possible range of values you want to store in the variable.
• Using the largest one all the time may cause unnecessary usage of memory. This is not good for efficiency.
• But on the other hand, allow yourself some margin to defend proactively against unanticipated problems.
If you want to store a constant or literal bigger than the capacity, the compiler warns you (at compile time)
    short num = 45000; //compiler warns
However, if you assign an expression that goes beyond the capacity, then the compiler cannot see this and cannot warn you. This is a big problem, technically called "overflow", and we are going to see overflow today (if time permits, otherwise in the next lecture).

Wrap-up: Number Representations
• Unsigned: for non-negative integers
• Two's complement: for signed integers (zero, negative or positive)
    Unless otherwise noted (as unsigned), always assume that the numbers we consider are in 2's complement representation.
• IEEE 754 floating-point: for real numbers (float, double)
    • We did not add anything for float and double on top of what you know from CS201. At the end of these slides you can find how IEEE 754 floating-point representation works, but we will not talk about this and you are not responsible for it.

Arithmetic Overflow

Overflow
In this subsection, we will see the related topic of overflow, which basically means that after an operation such as addition or subtraction the result is not correct because it cannot be represented in the allocated space.
There is overflow in the following piece of code, since a + b goes beyond the range covered by c
    unsigned char a = 200;
    unsigned char b = 255;
    unsigned char c = a + b;
• We will also give the rule for determining the value of c after the overflow.
• This may not look essential since the result is already wrong, but going that deep may help us to find logic errors during debugging.
We will start with small cases where the storage is 4 bits to understand the basics, and later we will generalize to the built-in types of C++.
We will give the basics of overflow on addition and subtraction with two operands
• Other arithmetic operations and expressions with multiple operations may also cause overflow. We will generalize to this case at the end

How Can We Detect Arithmetic Overflow?
Having a carry out of the MSB?
• Arithmetic overflow cannot always be recognized by a carry out of the MSB
• If there is a carry out of the MSB, then we say that there is a "carry overflow", but this may or may not mean that there is "arithmetic overflow" and that the result is wrong.
E.g. 7 - 6 = 7 + (-6)
      0111  ( 7)
    + 1010  (-6)
    ------
    1 0001  ( 1)   There is a carry out of the MSB; it is discarded and the result is correct!
E.g. -7 - 6 = -7 + (-6)
      1001  (-7)
    + 1010  (-6)
    ------
    1 0011  ( 3)   There is a carry out of the MSB; it is discarded and the result is wrong!

How Can We Detect Arithmetic Overflow?
Having no carry out of the MSB?
• Does not always mean that there is no arithmetic overflow. We may have arithmetic overflow even if there is no carry out of the MSB
E.g. 7 + 1
      0111  ( 7)
    + 0001  ( 1)
    ------
      1000  (-8)   There is no carry out of the MSB, but the result is incorrect!
E.g. 2 - 3 = 2 + (-3)
      0010  ( 2)
    + 1101  (-3)
    ------
      1111  (-1)   There is no carry out of the MSB and the result is correct!

Overflow and 8-bit addition
      1111          (carries)
      01111000       120
    + 01111000     + 120
    ----------
      11110000       -16   Overflow!
It fits, but it's still overflow!
Reminder: the 2's complement range with 8 bits is -128 to +127
$01111000_2 = 1 \times 64 + 1 \times 32 + 1 \times 16 + 1 \times 8 = 120_{10}$
$11110000_2 = -1 \times 128 + 1 \times 64 + 1 \times 32 + 1 \times 16 = -16_{10}$

Overflow – definition & detection
Overflow means that the right answer doesn't fit!
If you think in decimal and know the ranges, it is easy to detect.
120 + 120 = 240 and the range of a signed 8-bit integer is -128 . . 127 → 240 is not in this range, so there is overflow
More formally, for addition there is arithmetic overflow when the signs of the two numbers are the same AND the sign of the result is different from the sign of the numbers

Detecting Overflow
There can't be an overflow when adding a positive and a negative number
• Why? Basically because the magnitude of the result gets smaller than that of one of the operands, so it stays in range
There can't be an overflow when the signs are the same for subtraction
• Why? Same as above, since arithmetically this is adding a positive to a negative.
Overflow occurs when the value affects the sign:
1. adding two positives yields a negative
2. or, adding two negatives gives a positive
3. or, subtracting a negative from a positive gives a negative (similar to 1)
4. or, subtracting a positive from a negative gives a positive (similar to 2)
Of course, this rule is for signed integers; for unsigned, we will see later
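The sign rule for addition can be sketched for 8-bit two's complement values as below. The function name add_overflows8 is hypothetical. The addition is carried out on the unsigned bit patterns so that the wrap-around itself is well defined, and the signs are then read from the MSBs.

```cpp
#include <cstdint>

// True when adding two 8-bit two's complement values overflows:
// both operands have the same sign and the 8-bit sum has the other sign.
constexpr bool add_overflows8(std::int8_t a, std::int8_t b) {
    const std::uint8_t ua = static_cast<std::uint8_t>(a);          // bit pattern of a
    const std::uint8_t ub = static_cast<std::uint8_t>(b);          // bit pattern of b
    const std::uint8_t sum = static_cast<std::uint8_t>(ua + ub);   // carry discarded
    const bool a_negative = (ua & 0x80) != 0;
    const bool b_negative = (ub & 0x80) != 0;
    const bool sum_negative = (sum & 0x80) != 0;
    return (a_negative == b_negative) && (sum_negative != a_negative);
}
```

add_overflows8(120, 120) is true (the 8-bit sum reads as -16), add_overflows8(127, 1) is true, and add_overflows8(7, -6) is false even though that addition produces a carry out of the MSB.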

Visualizing Overflow
Let us visualize the reason for overflow on the 4-bit case for signed integers (2's complement)
Number   2's complement
 0       0000
+1       0001
+2       0010
+3       0011
+4       0100
+5       0101
+6       0110
+7       0111
---------------------------------   (the red line)
-8       1000
-7       1001
-6       1010
-5       1011
-4       1100
-3       1101
-2       1110
-1       1111
Start with the first operand and
• go up by the second operand for subtraction
• go down by the second operand for addition
• Overflow occurs if our arithmetic operation causes us to pass this red line (in either direction)
Wrapping around (0 → -1 or -1 → 0) does not mean overflow

Visualizing Overflow for char and short
char
Number   2's complement
 0       0000 0000
+1       0000 0001
+2       0000 0010
. . .    . . .
+124     0111 1100
+125     0111 1101
+126     0111 1110
+127     0111 1111
---------------------------------
-128     1000 0000
-127     1000 0001
-126     1000 0010
. . .    . . .
-4       1111 1100
-3       1111 1101
-2       1111 1110
-1       1111 1111
short
Number   2's complement
 0       0000 0000 0000 0000
+1       0000 0000 0000 0001
+2       0000 0000 0000 0010
. . .    . . .
+32764   0111 1111 1111 1100
+32765   0111 1111 1111 1101
+32766   0111 1111 1111 1110
+32767   0111 1111 1111 1111
-----------------------------------------------
-32768   1000 0000 0000 0000
-32767   1000 0000 0000 0001
-32766   1000 0000 0000 0010
. . .    . . .
-4       1111 1111 1111 1100
-3       1111 1111 1111 1101
-2       1111 1111 1111 1110
-1       1111 1111 1111 1111

Detecting Overflow – Complex Expressions
The rule of detecting the change of sign in the result applies to all signed integer types of C++.
• But only for simple addition and subtraction
• What about more complex operations?
No simple formula for that; apply these steps
• Simply calculate using decimal arithmetic and see if the result fits in the range.
• If it does not fit, then there is overflow
• Convert the overflowed result to binary and truncate it so that it fits in n bits (where n is the number of bits in the corresponding type)
• Interpret the truncated bit string in 2's complement logic
• Examples
    char d = 3*200+21;
    • 621 is not between -128 . . 127, so there is overflow.
    • $621_{10} = 10\ 0110\ 1101_2$. Discard the most significant two bits since they do not fit in 8 bits (storage for char). The resulting bit string is 0110 1101, which is 109 (decimal)
    char d = 2*200+15;
    • 415 is not between -128 . . 127, so there is overflow.
    • $415_{10} = 1\ 1001\ 1111_2$. Discard the most significant bit since it does not fit in 8 bits (storage for char). The resulting bit string is 1001 1111, which is -97 (2's complement)
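The calculate, truncate and reinterpret steps can be sketched as one function. The name wrap_to_signed is hypothetical, and the sketch assumes the bit count is between 1 and 32 so that the intermediate values fit comfortably in 64 bits.

```cpp
#include <cstdint>

// Keeps the low `bits` bits of `value` and reads them as two's complement.
// Assumes 1 <= bits <= 32.
constexpr std::int64_t wrap_to_signed(std::int64_t value, int bits) {
    const std::int64_t modulus = std::int64_t{1} << bits;   // 2^bits
    std::int64_t low = value % modulus;                      // may be negative
    if (low < 0) {
        low += modulus;                                      // now 0 .. 2^bits - 1
    }
    // If the MSB of the truncated pattern is set, the value is negative.
    return (low >= modulus / 2) ? low - modulus : low;
}
```

wrap_to_signed(621, 8) gives 109 and wrap_to_signed(415, 8) gives -97, matching the two char examples; wrap_to_signed(35000, 16) gives -30536.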

Detecting Overflow - unsigned integers
Let us visualize the overflow case on 4-bit unsigned integers
Dec. number   Binary number
----------------------------------------   (red line)
 0            0000
+1            0001
+2            0010
+3            0011
+4            0100
+5            0101
+6            0110
+7            0111
+8            1000
+9            1001
+10           1010
+11           1011
+12           1100
+13           1101
+14           1110
+15           1111
----------------------------------------   (red line)
We have two red lines here
• There is overflow if you go below 0 or beyond 15
• If you add 1 to 15 you end up with 10000 in binary, and when you discard the overflow bit, the resulting value becomes 0.
• Similarly, subtracting 1 from 0 yields 15
Generalization of this case to n-bit unsigned integers is trivial
• The max value is $2^n-1$ and the number of bits in binary is n
Evaluation of a complex expression is similar to the signed case
• Do the operation, convert to binary, discard the overflowed bits
• But this time interpret the result as an unsigned number
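The unsigned case follows the same recipe without the final sign interpretation. Below is a minimal sketch with the hypothetical name wrap_to_unsigned, again assuming a bit count between 1 and 32.

```cpp
#include <cstdint>

// Keeps the low `bits` bits of `value` and reads them as an unsigned number.
// Assumes 1 <= bits <= 32.
constexpr std::int64_t wrap_to_unsigned(std::int64_t value, int bits) {
    const std::int64_t modulus = std::int64_t{1} << bits;   // 2^bits
    const std::int64_t low = value % modulus;                // may be negative
    return (low < 0) ? low + modulus : low;                  // 0 .. 2^bits - 1
}
```

In 4 bits, wrap_to_unsigned(15 + 1, 4) is 0 and wrap_to_unsigned(0 - 1, 4) is 15. For the earlier unsigned char example, wrap_to_unsigned(200 + 255, 8) is 199.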
