
OMSE 510: Computing Foundations Intro Lecture


Chris Gilmore <grimjack@cs.pdx.edu>

Portland State University/OMSE

- Website
http://web.cecs.pdx.edu/~grimjack/OMSE510CF/ComputerFoundations.html

- Mailing List:
omse510@cecs.pdx.edu

- Personal Email:
grimjack@cs.pdx.edu

- Course Rationale:
This course has been designed for graduate level software engineering students who are lacking key foundation computer science knowledge in the areas of computer architecture and operating systems. This course may also be taken by students needing or wanting to upgrade their knowledge in these areas. With the approval of an OMSE advisor, OMSE students may register in this course and count it for credit as an OMSE elective.

Divided into two halves

- Computer Architecture
- How the hardware works
- 4 Sessions + Midterm

- Operating Systems
- How the software interacts with the hardware
- 5 Sessions + Final

- Four Assignments (40%), Midterm (30%), Final (30%)

- Transistors, logic gates & Lower-level functionality
- In-depth Floating Point/Integer Arithmetic
- Networking
- Hardware Description Languages (Verilog, ISP’)
- Security
- The History of anything
- Theoretical Architectures

- I like feedback
- It’s good to ask questions in class. Email is less good.
- If you don’t understand, ask NOW. Probably other people don’t understand. And we always build on existing material.
- One or two breaks in a 3 hour class.

Today’s lecture covers the very basics – should probably be review!

If you’re bored, that’s good! The interesting stuff comes later

- Amdahl’s Law
- Data Representation
- Conventions: (binary/hex/oct)
- Unsigned/signed integers
- Floating point

- Brief on Compilers

Fundamental design principle in computer architecture design.

Make things FAST.

Amdahl’s law is a guideline for making things faster.

Suppose some task takes time t_orig minutes to perform

Eg.

Flying from PDX to YVR, 80 mins

Boeing 727, ~900 km/h

But time is important to us! Let’s take the Concorde instead!

Flying from PDX to YVR

- Boeing 727, ~900 km/h, 80 mins

- Concorde, ~2200 km/h, 40 mins

40 minutes saved!

Flying from PDX to YVR

t_old = 80 mins (Boeing 727)

t_new = 40 mins (Concorde)

Speedup = t_old / t_new = 80 min / 40 min = 2

2x speed improvement! That’s great!

.. But is it really?

Time actually spent traveling from PDX to YVR:

30 mins MAX to airport

20 mins getting your ticket

45 mins getting through security

30 mins boarding/taxiing

80 mins flying

40 mins landing + customs

= 245 minutes

Time actually spent traveling from PDX to YVR:

245 minutes (Boeing 727)

205 minutes (Concorde)

Where’d that 2x speedup go?

30 mins MAX to airport

20 mins getting your ticket

45 mins getting through security

30 mins boarding/taxiing

80 mins flying (only 33% of total time!)

40 mins landing + customs

= 245 minutes

The variables:

t_old = 245 mins (Original travel time)

α = 33% (Fraction of time actually spent flying)

k = 2 (Speedup factor)

t_new = (1-α) t_old + α t_old / k

= 67% * 245 mins + 33% * 245 mins / 2

≈ 205 mins

Speedup, S

S = t_old / t_new

= 1 / [ (1-α) + α/k ]

≈ 1.2

Much less than 2x!

Moral of the story: To improve the system, you have to work harder than you want

Special case – set k = ∞

S∞ = 1 / (1 – α)

Most amount of speedup you can get out of tuning one component.

ie. Are you wasting your time?

Most important to Computer Architecture/Operating system design:

Speed!

Not necessarily like regular programming. More important than correctness (almost)

Foundation Idea #2:

Computers represent everything with numbers

Everything in a computer is represented as a number.

Letters -> Numbers

Pictures -> Numbers

Programs -> Numbers

Data = Numbers

(This should be old hat for you)

Non-negative Integers:

Decimal (Human) Numbers:

0,1,2,…..256, …. 1024… 2048….

Data in computers exists in only two states: on and off (1 or 0).

This means it’s hard for them to count in decimal…

Decimal  Binary
0        0
1        1
2        10
3        11
4        100
5        101

Decimal

12345 = abcde

Number = a*10^4 + b*10^3 + c*10^2 + d*10^1 + e*10^0

= 1*10000 + 2*1000 + 3*100 + 4*10 + 5*1

= 10000 + 2000 + 300 + 40 + 5

= 12345

Binary (Base 2)

10101 = abcde

Number = a*2^4 + b*2^3 + c*2^2 + d*2^1 + e*2^0

= 1*16 + 0*8 + 1*4 + 0*2 + 1*1

= 16 + 0 + 4 + 0 + 1

= 21


Okay, computers like binary…

But binary is too hard to read for humans.

… But we want to express powers of two conveniently

Octal

00, 01, 02,…, 07, 010, … 017, 020…..

Octal (Base 8)

012345 = 0abcde

Number = a*8^4 + b*8^3 + c*8^2 + d*8^1 + e*8^0

= 1*4096+ 2*512 + 3*64 + 4*8 + 5*1

= 4096 + 1024 + 192 + 32 + 5

= 5349

Decimal  Binary  Octal
0        0       00
1        1       01
2        10      02
3        11      03
4        100     04
5        101     05
8        1000    010
12       1100    014
47       101111  057

But octal is still cumbersome, because computers often prefer grouping in sets of 4 binary digits.

(Octal groups bits in sets of 3)

Hex Format (The preferred choice)

0x0, 0x1, 0x2,…0xf, 0x10, 0x11, .. 0x1a,0x20

Hex (Base 16)

0x12345 = 0xabcde

Number = a*16^4 + b*16^3 + c*16^2 + d*16^1 + e*16^0

= 1*65536+ 2*4096 + 3*256 + 4*16 + 5*1

= 65536 + 8192 + 768 + 64 + 5

= 74565

Hex Digits: Need more than 10 digits (0-9)

So we use a b c d e f

Decimal: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17

Hexadecimal: 0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xA, 0xB, 0xC, 0xD, 0xE, 0xF, 0x10, 0x11

Decimal  Binary  Octal  Hex
0        0       00     0x0
1        1       01     0x1
2        10      02     0x2
3        11      03     0x3
4        100     04     0x4
5        101     05     0x5
8        1000    010    0x8
12       1100    014    0xC
47       101111  057    0x2F

*Chinese Remainder Theorem to convert

Oct Dec Hex Char

-------------------------------

101 65 41 A

102 66 42 B

103 67 43 C

104 68 44 D

105 69 45 E

106 70 46 F

107 71 47 G

110 72 48 H

111 73 49 I

112 74 4A J

113 75 4B K

114 76 4C L

115 77 4D M

Oct Dec Hex Char

-------------------------------

116 78 4E N

117 79 4F O

120 80 50 P

121 81 51 Q

122 82 52 R

123 83 53 S

124 84 54 T

125 85 55 U

126 86 56 V

127 87 57 W

130 88 58 X

131 89 59 Y

132 90 5A Z

Rolex Newbie FAQ

Is it okay to peel off the hologram sticker from the back of my new rolex?

Yes. It will not devalue your watch, nor void your warranty. Hologram stickers

are not a good way of differentiating real and fake Rolexes. Even fake ones

often come with a hologram sticker.

00000000 52 6F 6C 65 78 20 4E 65 77 62 69 65 20 46 41 51 Rolex Newbie FAQ

00000010 0D 0A 0D 0A 49 73 20 69 74 20 6F 6B 61 79 20 74 ....Is it okay t

00000020 6F 20 70 65 65 6C 20 6F 66 66 20 74 68 65 20 68 o peel off the h

00000030 6F 6C 6F 67 72 61 6D 20 73 74 69 63 6B 65 72 20 ologram sticker

00000040 66 72 6F 6D 20 74 68 65 20 62 61 63 6B 20 6F 66 from the back of

00000050 20 6D 79 20 6E 65 77 20 72 6F 6C 65 78 3F 0D 0A my new rolex?..

00000060 20 59 65 73 2E 20 49 74 20 77 69 6C 6C 20 6E 6F Yes. It will no

00000070 74 20 64 65 76 61 6C 75 65 20 79 6F 75 72 20 77 t devalue your w

00000080 61 74 63 68 2C 20 6E 6F 72 20 76 6F 69 64 20 79 atch, nor void y

00000090 6F 75 72 20 77 61 72 72 61 6E 74 79 2E 20 48 6F our warranty. Ho

000000A0 6C 6F 67 72 61 6D 20 73 74 69 63 6B 65 72 73 0D logram stickers.

000000B0 0A 20 61 72 65 20 6E 6F 74 20 61 20 67 6F 6F 64 . are not a good

000000C0 20 77 61 79 20 6F 66 20 64 69 66 66 65 72 65 6E way of differen

Each Pixel is a 3-tuple, (Red, Green, Blue)

$ dump lena.jpg

00000000 ffd8 ffe0 0010 4a46 4946 0001 0101 0048 .X.`..JFIF.....H

00000010 0048 0000 ffdb 0043 0006 0404 0405 0406 .H...[.C........

00000020 0505 0609 0605 0609 0b08 0606 080b 0c0a ................

00000030 0a0b 0a0a 0c10 0c0c 0c0c 0c0c 100c 0e0f ................

00000040 100f 0e0c 1313 1414 1313 1c1b 1b1b 1c20 ...............

00000050 2020 2020 2020 2020 20ff db00 4301 0707 .[.C...

00000060 070d 0c0d 1810 1018 1a15 1115 1a20 2020 .............

00000070 2020 2020 2020 2020 2020 2020 2020 2020

00000080 2020 2020 2020 2020 2020 2020 2020 2020

00000090 2020 2020 2020 2020 2020 2020 2020 ffc0 .@

000000a0 0011 0802 5803 2003 0111 0002 1101 0311 ....X. .........

000000b0 01ff c400 1c00 0001 0501 0101 0000 0000 ..D.............

000000c0 0000 0000 0000 0200 0103 0405 0607 08ff ................

000000d0 c400 5310 0001 0203 0406 0607 0408 0307 D.S.............

000000e0 0303 0403 0102 0300 0411 0512 2131 0613 ............!1..

000000f0 2241 5161 1432 7181 91a1 0723 4252 b1c1 "AQa.2q..!.#BR1A

All the numbers we’ve discussed are unsigned. (ie. Non-negative integers)

Assume 8-bits of information:

Eg. 0000 0000 = 0

0000 0001 = 1

1000 0000 = 128

1111 1111 = 255

Range is [0,255]

What if we want to represent negative numbers?

Naïve Solution: Sign/Magnitude Notation

Use first bit to represent +/- (sign bit)

Eg. 0000 0000 = 0

0000 0001 = 1

1000 0001 = -1

0111 1111 = 127

1111 1111 = -127

Range is [-127,127]. But this is wasteful! There are two ways of representing 0! (+0, -0)

Another approach: Bias Notation

Take the unsigned number, subtract b (eg. b = 127)

Eg. 0000 0000 = 0 – 127 = -127

0000 0001 = 1 – 127 = -126

0111 1111 = 127 – 127 = 0

1000 0000 = 128 – 127 = 1

1111 1111 = 255 – 127 = 128

Range is [-127,128]. This works, and has its purposes, but usually we prefer….

Usual approach: Two’s Complement

MSB is considered to have negative weight.

Eg. 0000 0000 = 0

0000 0001 = 1

1111 1111 = -1

1000 0000 = -128

0111 1111 = 127

Range is [-128,127].

It seems goofy, but there are a lot of good reasons for it

Advantages:

- Easy to negate: Take the bitwise complement, add one
- Efficient – adding and what logical operator?
- Overflow is handled “gracefully”
- Easy to tell if a number is negative – if MSB is set
- More details in your req’d reading :)

Ones’ Complement: Mostly theoretical (no one uses it)

MSB is considered to have weight -(2^(w-1) - 1) instead of -2^(w-1). (eg. MSB = -127 instead of -128)

Eg. 0000 0000 = 0

0000 0001 = 1

1111 1110 = -1

1000 0000 = -127

0111 1111 = 127

1111 1111 = 0

Range is [-127,127].

Note again there are two ways of representing 0

Okay great, we know how to represent all kinds of integers:

Non-negative Integers: Unsigned format

Integers: Sign-Magnitude

Bias Notation

Two’s Complement

Ones’ Complement

But how do we represent fractional numbers? Eg. ½

Idea: How do we represent it in decimals?

½ = 0.5

We can introduce a decimal point to binary:

Decimal -> Binary

0.5 -> 0.1

1.5 -> 1.1

2.5 -> 10.1

0.25 -> 0.01

0.75 -> 0.11

This follows from our original definition

1010.1010 = abcd.efgh

Number = a*2^3 + b*2^2 + c*2^1 + d*2^0

+ e*2^-1 + f*2^-2 + g*2^-3 + h*2^-4

= 1*8 + 0*4 + 1*2 + 0*1

+ 1*1/2 + 0*1/4 + 1*1/8 + 0*1/16

= 8 + 2 + .5 + .125

= 10.625

So if we have 8 bits of information, and we say that the decimal point occurs between the two sets of 4 bits, we have a convention for representing fractions:

0000 0000 = 0

0001 0000 = 1

0000 1000 = 0.5

0001 1000 = 1.5

1010 1010 = 10.625

So called Fixed Point representation

But with w bits, our range is still very small.

[0, 2^(w/2))

We want to be able to express a very large range (and negative numbers) very compactly.

Let’s think about scientific notation:

1.2e10 = 1.2 * 10^10

Binary Equivalent!

Binary equivalent of scientific notation is called “floating point”

value * 2^exponent

So since our decimal point is “floating”, we have a much larger expressible range

Standardized representation of floating point

(-1)^sign * mantissa * 2^exponent


The mantissa is unsigned,

The exponent is expressed in bias notation.

*Bryant & O’Hallaron call it “significand” instead of mantissa

An in-depth example:

(-1)^sign * mantissa * 2^exponent

Suppose we have 9 bits to play with:

sign (1 bit) mantissa (4 bits) exponent (4 bits)

sign s: 0 or 1

mantissa, M: Fixed point number in the range [1,2)

exponent, E: Bias notation in the range [-6,7]*

*Why not [-7,8]? Those values used for something special

sign (1 bit) mantissa (4 bits) exponent (4 bits)

s abcd efgh

Mantissa: Fixed point notation – implied decimal point

a.bcd

eg. 1.0 -> 1.000

1.125 -> 1.001

1.25 -> 1.010

1.5 -> 1.100

1.75 -> 1.110

The mantissa encodes a value in the range [1,2)

Realization: The most significant digit is always 1! Don’t need to encode it!

sign (1 bit) mantissa (4 bits) exponent (4 bits)

s 1.abcd efgh

So the mantissa has a precision of 2^-4 = 1/16

sign (1 bit) mantissa (4 bits) exponent (4 bits)

s 1.abcd efgh

Exponent, E has k bits, in bias notation

Bias is 2^(k-1) - 1 = 7

So the range is [-7,8]

Encoding Table

sign (1 bit) mantissa (4 bits) exponent (4 bits)

s 1.abcd efgh

Special Values for Exponent, E:

If exponent field is all 0’s, the number is considered denormalized: Mantissa does not have an implied leading 1.

If exponent field is all 1’s, then there’s a special interpretation to encode values such as infinity, and NaN

So the range becomes [-6,7]

Encoding Table

Closing notes:

- Some numbers, such as 0.2, cannot be represented exactly using any of the formats we’ve described
- IEEE 32-bit single-precision float (usually C float):
1 sign bit, 23-bit mantissa, 8-bit exponent

Approximately 7 decimal digits of precision

- IEEE 64-bit double-precision float (usually C double):
1 sign bit, 52-bit mantissa, 11-bit exponent

- Rounding imprecision is a BIG problem with floating point numbers.
    #include <stdbool.h>

    bool equal( float x, float y ) {   // Never do this
        return x == y;                 // exact comparison ignores rounding error
    }

- printf rounds floats to be more human readable

Some terminology:

- Byte: Smallest addressable unit on an architecture. Usually an octet (8 bits)

- Nibble: Half a byte (4 bits)
- Word: Natural Unit of data on the architecture
- 8086: 16 bits; IA32, PPC: 32 bits
- (Often the size of address space)

- Dword (Double word), Quad-word
- Cache lines are often 64 bytes (x86)
- Memory Pages (x86 4096 bytes)
- Disk Sectors (512 bytes common)

b = bits

B = bytes

KB = Kilobyte = 2^10 = 1024

MB = Megabyte = 2^20 = 1024*1024 = 1048576

GB = Gigabyte = 2^30 = 1073741824

TB = Terabyte = 2^40 = 1099511627776

*Note: MB = Megabyte, Mb = Megabit

**k and K are used interchangeably

b = bits

B = bytes

KB = Kilobyte = 10^3 = 1000

MB = Megabyte = 10^6 = 1,000,000

GB = Gigabyte = 10^9 = 1,000,000,000

TB = Terabyte = 10^12 = 1,000,000,000,000

Reason: Makes numbers seem bigger and cooler

*Note: MB, mb, Mb all used interchangeably

[Figure: system block diagram showing the CPU, Memory, Disk Controller, Disk, and System Bus]

$ cat hello.c

#include <stdio.h>

int main() {
    printf( "Hello, world\n" );
    return 0;
}

$ ./hello

Hello, world

But the computer doesn’t understand C code! C is for humans.

Machine code looks like this:

00000000 4d5a 9000 0300 0000 0400 0000 ffff 0000 MZ..............

00000010 b800 0000 0000 0000 4000 0000 0000 0000 8.......@.......

00000020 0000 0000 0000 0000 0000 0000 0000 0000 ................

00000030 0000 0000 0000 0000 0000 0000 8000 0000 ................

00000040 0e1f ba0e 00b4 09cd 21b8 014c cd21 5468 ..:..4.M!8.LM!Th

00000050 6973 2070 726f 6772 616d 2063 616e 6e6f is program canno

00000060 7420 6265 2072 756e 2069 6e20 444f 5320 t be run in DOS

00000070 6d6f 6465 2e0d 0d0a 2400 0000 0000 0000 mode....$.......

00000080 5045 0000 4c01 0400 0951 ee42 000c 0000 PE..L....QnB....

00000090 f800 0000 e000 0703 0b01 0238 0004 0000 x...`......8....

000000a0 0004 0000 0002 0000 0010 0000 0010 0000 ................

000000b0 0000 0000 0000 4000 0010 0000 0002 0000 ......@.........

The compiler translates C to machine code…

Hello.c (text file) -> [Compiler Magic] -> Hello (binary object)

$ gcc hello.c -o hello

$ ./hello

Hello, world

Demystifying (slightly)

Hello.c (C code) -> [Preprocessor] -> Hello.i (preprocessed, simplified C) -> [Compiler] -> Hello.s (assembly) -> [Assembler] -> Hello (binary object)

Compilation is divided into stages to simplify it.

Let’s follow the hello world example through each stage

Start with source code:

#include <stdio.h>

int main() {
    printf( "Hello, world\n" );
    return 0;
}

Translates C to “simplified” C. Expands macros, resolves file references, and evaluates preprocessor conditionals

#include <file.h>

#if, #ifdef, #else, #endif

#define

$ gcc -E hello.c >hello.i

Translates preprocessed C into a simple language called Assembly. Still human-readable, but barely. Very close to machine language

pushl %ebp

movl %esp, %ebp

subl $8, %esp

andl $-16, %esp

movl $0, %eax

addl $15, %eax

addl $15, %eax

shrl $4, %eax

sall $4, %eax

movl %eax, -4(%ebp)

movl -4(%ebp), %eax

call __alloca

call ___main

movl $LC0, (%esp)

call _printf

movl $0, %eax

leave

ret

$ gcc -S hello.i

Translates assembly to machine code.

This stage is very simple – 1:1 mapping between assembly and machine code

00000070 0000 0000 0000 0000 0000 0000 0000 0000 ................

00000080 0000 0000 0000 0000 8000 00c0 2e72 6461 ...........@.rda

00000090 7461 0000 0000 0000 0000 0000 1000 0000 ta..............

000000a0 f400 0000 0000 0000 0000 0000 0000 0000 t...............

000000b0 4000 0040 5589 e583 ec08 83e4 f0b8 0000 @..@U.e.l..dp8..

000000c0 0000 83c0 0f83 c00f c1e8 04c1 e004 8945 ...@..@.Ah.A`..E

000000d0 fc8b 45fc e800 0000 00e8 0000 0000 c704 |.E|h....h....G.

000000e0 2400 0000 00e8 0000 0000 b800 0000 00c9 $....h....8....I

000000f0 c390 9090 4865 6c6c 6f2c 2077 6f72 6c64 C...Hello, world

$ gcc -c hello.s -o hello.o