Chapter 1: Data Storage • 1.1 Bits and Their Storage • 1.2 Main Memory • 1.3 Mass Storage • 1.4 Representing Information as Bit Patterns • 1.5 The Binary System
Chapter 1: Data Storage (continued) • 1.6 Storing Integers • 1.7 Storing Fractions • 1.8 Data Compression • 1.9 Communications Errors
Bits and Bit Patterns • Information is encoded as patterns of 0s and 1s. These digits are called bits (short for binary digits) • Bit: Binary Digit (0 or 1) • Bit Patterns are used to represent information. • Numbers • Text characters • Images • Sound • And others
Boolean Operations • George Boole (1815-1864). Pioneer in the field of math called logic. • Boolean Operation: An operation that manipulates one or more true/false values • Specific operations • AND • OR • XOR (exclusive or) • NOT
Figure 1.1 The Boolean operations AND, OR, and XOR (exclusive or)
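The four operations can be tried out directly. The following is a minimal Python sketch, assuming true and false are written as the bits 1 and 0; the function names are chosen here for illustration and do not come from the slides.

```python
# Boolean operations on single bits (0 = false, 1 = true).
def AND(a, b):
    return a & b      # 1 only when both inputs are 1

def OR(a, b):
    return a | b      # 1 when at least one input is 1

def XOR(a, b):
    return a ^ b      # 1 when the inputs differ

def NOT(a):
    return 1 - a      # flips 0 to 1 and 1 to 0

# Print the truth table for the two-input operations.
print("a b | AND OR XOR")
for a in (0, 1):
    for b in (0, 1):
        print(f"{a} {b} |  {AND(a, b)}   {OR(a, b)}   {XOR(a, b)}")
```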
Gates • Gate: A device that computes a Boolean operation • A gate can be constructed from a variety of technologies such as gears, relays, and optical devices. • In today’s computers, gates are usually implemented as small electronic circuits in which the digits 0 and 1 are represented as voltage levels.
Figure 1.2 A pictorial representation of AND, OR, XOR, and NOT gates as well as their input and output values
Flip-flops • Flip-flop: A circuit built from gates that can store one bit. • A flip-flop produces an output value of 0 or 1, which remains constant until a temporary pulse from another circuit causes it to shift to the other value. • The output flips or flops between two values under the control of external stimuli. • One input line is used to set its stored value to 1 • One input line is used to set its stored value to 0 • While both input lines are 0, the most recently stored value is preserved
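The behaviour described above can be imitated in software. This is only a behavioural sketch, not a gate-level circuit: the class name `FlipFlop` and the method `pulse` are hypothetical, and rejecting a pulse on both lines at once is an assumption made here, since the slide does not cover that case.

```python
class FlipFlop:
    """Behavioural model of a one-bit store with a set line and a reset line."""

    def __init__(self):
        self.value = 0  # assumed initial state

    def pulse(self, set_line, reset_line):
        # Assumption: pulsing both lines at once is treated as an error.
        if set_line and reset_line:
            raise ValueError("set and reset must not both be 1")
        if set_line:
            self.value = 1      # set line stores a 1
        elif reset_line:
            self.value = 0      # reset line stores a 0
        # Both lines 0: the most recently stored value is preserved.
        return self.value


ff = FlipFlop()
print(ff.pulse(1, 0))  # store 1
print(ff.pulse(0, 0))  # value is held
print(ff.pulse(0, 1))  # store 0
```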
Number Systems • Decimal (base 10): digits 0...9 • Binary (base 2): digits 0 and 1 • Octal (base 8): digits 0...7 • Hexadecimal (base 16): digits 0...9 and A...F • When considering the internal activities of a computer, we must deal with strings of bits, some of which can be quite long. • A long string of bits is often called a stream. • Streams are difficult for humans to comprehend. • Hexadecimal notation is a shorthand notation that makes them easier to read.
Hexadecimal Notation • Hexadecimal notation: A shorthand notation for long bit patterns • Divides a pattern into groups of four bits each • Represents each group by a single symbol • Example: 10100011 becomes A3
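The grouping rule can be written as a short Python sketch. The function name `bits_to_hex` is made up for this illustration, and the sketch assumes the pattern length is a multiple of four.

```python
def bits_to_hex(bits):
    """Convert a string of 0s and 1s to hexadecimal, four bits per symbol."""
    if len(bits) % 4 != 0:
        raise ValueError("pattern length must be a multiple of 4")
    symbols = []
    for i in range(0, len(bits), 4):
        group = bits[i:i + 4]                        # one group of four bits
        symbols.append(format(int(group, 2), "X"))   # one hex symbol per group
    return "".join(symbols)


print(bits_to_hex("10100011"))  # the slide's example: 1010 -> A, 0011 -> 3
```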
Main Memory Cells • Main memory is organized in manageable units called cells with a typical cell size being 8 bits. • Cell: A unit of main memory (typically 8 bits which is one byte). Although there is no left or right within a computer, we normally envision the bits arranged in a row • Most significant bit: the bit at the left (high-order) end of the conceptual row of bits in a memory cell • Least significant bit: the bit at the right (low-order) end of the conceptual row of bits in a memory cell • Large computers may have billions of cells in their main memory.
Main Memory Addresses • To identify individual cells, each cell is assigned a unique name. • Address: A “name” that uniquely identifies one cell in the computer’s main memory • The names are actually numbers. • These numbers are assigned consecutively starting at zero. • Numbering the cells in this manner associates an order with the memory cells.
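As a rough illustration of cells and addresses, main memory can be modelled as a row of 8-bit cells numbered from zero. The class `MainMemory` and the helper functions below are hypothetical names used only for this sketch.

```python
class MainMemory:
    """Toy model: consecutive 8-bit cells addressed 0, 1, 2, ..."""

    def __init__(self, cell_count):
        self.cells = bytearray(cell_count)  # every cell starts as 00000000

    def _check(self, address):
        if not 0 <= address < len(self.cells):
            raise IndexError("no cell has this address")

    def write(self, address, value):
        self._check(address)
        self.cells[address] = value  # value must fit in 8 bits (0..255)

    def read(self, address):
        self._check(address)
        return self.cells[address]


def most_significant_bit(cell):
    return (cell >> 7) & 1   # bit at the left (high-order) end

def least_significant_bit(cell):
    return cell & 1          # bit at the right (low-order) end


memory = MainMemory(16)
memory.write(3, 0b10100010)
cell = memory.read(3)
print(format(cell, "08b"), most_significant_bit(cell), least_significant_bit(cell))
```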
Memory Terminology • Random Access Memory (RAM): Memory in which individual cells can be easily accessed in any order • Dynamic Memory (DRAM): RAM composed of volatile memory
Measuring Memory Capacity • Kilobyte: 2^10 bytes = 1024 bytes • Example: 3 KB = 3 times 1024 bytes • Sometimes “kibi” rather than “kilo” • Megabyte: 2^20 bytes = 1,048,576 bytes • Example: 3 MB = 3 times 1,048,576 bytes • Sometimes “mebi” rather than “mega” • Gigabyte: 2^30 bytes = 1,073,741,824 bytes • Example: 3 GB = 3 times 1,073,741,824 bytes • Sometimes “gibi” rather than “giga” • Terabyte • Petabyte • Exabyte • Zettabyte • Yottabyte • Brontobyte • Nisabyte (rumors) • Zotzabyte (rumors)
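The three worked examples follow from the powers of two. A small Python check, using constant names chosen here for illustration:

```python
# Memory units as powers of two, as defined on the slide.
KILOBYTE = 2 ** 10   # 1024 bytes
MEGABYTE = 2 ** 20   # 1,048,576 bytes
GIGABYTE = 2 ** 30   # 1,073,741,824 bytes

for name, unit in (("KB", KILOBYTE), ("MB", MEGABYTE), ("GB", GIGABYTE)):
    print(f"3 {name} = {3 * unit:,} bytes")
```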
Mass Storage • Hard disks, CDs, DVDs, flash drives, etc. • On-line (attached to the device) versus off-line (detached from the device) • Advantages over main memory • Less volatility • Large storage capacity • Low cost • Disadvantage • Typically slower than main memory: most mass storage systems require mechanical motion and therefore require more time to store and retrieve data
Mass Storage Systems • Magnetic Systems • Disk • Tape • Optical Systems • CD • DVD • Flash Drives
Figure 1.9 A magnetic disk storage system. The locations of tracks and sectors are not a permanent part of the disk’s physical structure. The capacity of a disk storage system depends on the number of disks used and the density in which the tracks and sectors are placed.
Measurements for Disk Performance • Seek time: the time required to move the read/write heads from one track to another • Rotation delay (latency time): half the time required for the disk to make a complete rotation • Access time: seek time + latency time • Transfer rate: the rate at which data can be transferred to or from the disk
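These definitions combine into a simple calculation. The sketch below assumes the rotation speed is given in revolutions per minute; the 7200 rpm and 9 ms figures are illustrative values, not measurements from the slides.

```python
def latency_ms(rpm):
    """Rotation delay: half the time of one complete rotation, in milliseconds."""
    rotation_ms = 60_000 / rpm   # one minute is 60,000 ms
    return rotation_ms / 2

def access_time_ms(seek_ms, rpm):
    """Access time = seek time + latency time."""
    return seek_ms + latency_ms(rpm)


# Hypothetical drive: 7200 rpm with a 9 ms seek time.
print(round(latency_ms(7200), 2), "ms latency")
print(round(access_time_ms(9, 7200), 2), "ms access time")
```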
Files • File: A unit of data stored in a mass storage system • Fields and key fields • Physical record versus logical record • Buffer: A memory area used for the temporary storage of data (usually as a step in transferring the data)
Representing Text • Each character (letter, punctuation, etc.) is assigned a unique bit pattern. • ASCII (American Standard Code for Information Interchange): Uses patterns of 7 bits to represent most symbols used in written English text • Unicode: Uses patterns of 16 bits to represent the major symbols used in languages worldwide • Developed through cooperation of several leading manufacturers of hardware and software • ISO (International Organization for Standardization) standard: Uses patterns of 32 bits to represent most symbols used in languages worldwide
Try decoding the following ASCII message: 1010100 1001000 1001001 1010011 0100000 1001001 1010011 0100000 1000001 0100000 1010011 1010100 1010101 1010000 1001001 1000100 0100000 1010000 1010010 1001111 1000010 1001100 1000101 1001101 0100001
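One way to check an answer is a short Python sketch like the one below. The function names are hypothetical, and the sample message is deliberately a different one so the exercise is not given away.

```python
def decode_ascii(patterns):
    """Decode space-separated 7-bit patterns into text."""
    chars = []
    for pattern in patterns.split():
        if len(pattern) != 7 or set(pattern) - {"0", "1"}:
            raise ValueError(f"not a 7-bit pattern: {pattern!r}")
        chars.append(chr(int(pattern, 2)))   # bit pattern -> code -> character
    return "".join(chars)

def encode_ascii(text):
    """Encode text as space-separated 7-bit patterns."""
    return " ".join(format(ord(ch), "07b") for ch in text)


print(decode_ascii("1001000 1101001"))
print(encode_ascii("Hi"))
```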
Representing Numeric Values • Storing information in terms of encoded characters is inefficient when the information is numeric. • Binary notation is a way of representing numeric values • Binary notation: Uses bits to represent a number in base two • Limitations of computer representations of numeric values • Overflow – occurs when a value is too big to be represented • Truncation – occurs when a value cannot be represented accurately
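Both limitations can be demonstrated with a small sketch. It assumes an 8-bit unsigned integer and a fixed number of binary fraction digits; these formats and the function names are assumptions chosen for illustration, not the storage schemes covered later in the chapter.

```python
def store_unsigned(value, bits=8):
    """Return the binary pattern for value, or fail if it does not fit."""
    largest = 2 ** bits - 1
    if not 0 <= value <= largest:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    return format(value, f"0{bits}b")

def truncate_fraction(x, fraction_bits=4):
    """Keep only fraction_bits binary digits after the point (x >= 0)."""
    scale = 2 ** fraction_bits
    return int(x * scale) / scale   # digits beyond the limit are dropped


print(store_unsigned(255))          # largest value that fits in 8 bits
print(truncate_fraction(2.625))     # exactly representable, unchanged
print(truncate_fraction(0.1))       # not representable, value is truncated
```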
Representing Images • Popular techniques for representing images can be classified into two categories: • Bit map techniques • An image is represented as a collection of dots, each of which is called a pixel (picture element) • Each pixel is represented by a combination of bits indicating the appearance of that pixel. There are two approaches: • RGB • Luminance and chrominance • Disadvantage is that an image cannot be rescaled to an arbitrary size • Vector techniques • An image is represented as a collection of lines and curves • Scalable • TrueType and PostScript
Representation of Images (continued) • Pictures: A picture must be transformed into numeric form before it can be stored or manipulated by the computer. Each picture is subdivided into a grid of squares called pixels (picture elements).
Representation of Images (continued) • An image on paper can be converted into pixels using a scanner. Digital cameras store their images as digital images, i.e. the picture is already stored in the camera’s memory as pixels.
Representation of Images (continued) • A picture with only black and white pixels: 1 represents black. 0 represents white.
Representation of Images (continued) • 010101010101010101010110101101001001000111110000 • 011010101010101010101001011010010110010100000110 • 100101010101010101010110110001010000101001010100 • 101101101011011010110101100110010110100010001001 • 011010010110100101101010001001100100101101010010 • 100101101100101011010101110110011001010010101100 • 011010010011010110010010001001100110101010010001 • 010101101100101100100101110110011001010100100101 • 010101010101010011011010001001100010100001010100 • 101010101010101100010010110010001101001110100001 • 010101010101010001000101000101101000010000001101 • 110110101010010100110100011010010011100101101000 • 101001010100100010100101100101101100001010000010 • 101011010001001001001001011110101011010100101100 • 101010000100010010010111110101111100101001001001 • 010100101001000100101010101110101011010010010000 • 101001000010011001101111101011101010101000100101 • 010010010100100011011000011110111011010110101000 • 000100000001001100100111111111110110111000000010 • 101000101010010011011000010101011101000010101000 • 000010000100101101010011111111111111011101000101 • 001000101001101010100100011101111110100010010000 • 010010010110001001001001111011110101101100100101 • 100100100000111010010010010111111111011001001000
Representation of Images (5) • Photographic quality images have a gray-scale. Several shades between black and white are used.
Representation of Images (6) • 4 level gray-scale means 4 shades are used. Each pixel needs 2 bits: 00 - represents white • 01 - represents light gray • 10 - represents dark gray • 11 - represents black • This picture has 4 levels of gray (this uses two bits per pixel).
Representation of Images (7) • 256 level gray-scale means 8 bits per pixel are needed for 256 shades of gray. This picture has 256 levels of gray (this uses eight bits per pixel).
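The link between the number of gray levels and the bits needed per pixel can be sketched as follows. The function names and the 48 by 24 image size are illustrative assumptions.

```python
def bits_per_pixel(levels):
    """Smallest number of bits that can distinguish the given number of shades."""
    if levels < 2:
        raise ValueError("need at least two levels")
    return (levels - 1).bit_length()

def image_size_bytes(width, height, levels):
    """Storage for an uncompressed bit map, rounded up to whole bytes."""
    total_bits = width * height * bits_per_pixel(levels)
    return (total_bits + 7) // 8


for levels in (2, 4, 256):
    print(levels, "levels:", bits_per_pixel(levels), "bits per pixel,",
          image_size_bytes(48, 24, levels), "bytes for a 48 x 24 image")
```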
Representation of Images (8) • We could also use 8 bits (known as a byte) to represent the colour of a pixel. • A byte can represent 256 different numbers, so we can have 256 different colours in the image. This picture uses 256 colours.
Graphics Colours • Red, Green, Blue (RGB): The primary light colours use three values per pixel. One number is used for each of the amounts of red, green, and blue on the computer screen. [Figure: red, green, and blue values and the resulting colour of the pixel]
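A common way to store the three values is one byte per component, giving 24 bits per pixel. The slide does not state a component size, so the 8-bit assumption and the function names below are illustrative only.

```python
def pack_rgb(red, green, blue):
    """Combine three 8-bit components into one 24-bit pixel value."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("each component must be in 0..255")
    return (red << 16) | (green << 8) | blue

def unpack_rgb(pixel):
    """Split a 24-bit pixel value back into (red, green, blue)."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


yellow = pack_rgb(255, 255, 0)      # full red + full green, no blue
print(format(yellow, "024b"))
print(unpack_rgb(yellow))
```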
Representation of Images (again) • This is a full colour image.
Representing Sound • Sampling techniques • The most generic method • Sample the amplitude of the sound wave at regular intervals and record the series of values obtained. • Used for high quality recordings • Records actual audio • MIDI • Used in music synthesizers • Records “musical score” • Wave Table Synthesis
Sampling: Analog-to-Digital Converter (ADC) • Each dot in the figure above represents one audio sample. • There are two factors that determine the quality of a digital recording: • Sample rate: the rate at which the samples are captured, measured in hertz (Hz), or samples per second. An audio CD has a sample rate of 44,100 Hz (44.1 kHz). • Sample format or sample size: the number of bits in the digital representation of each sample.
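The two quality factors can be seen in a small sampling sketch. It samples a pure sine tone, which is an assumption made for illustration, and maps each sample onto the integer range allowed by the sample size. The function names are hypothetical. The 16-bit, two-channel figures used for the bit rate are the standard audio CD format.

```python
import math

def sample_wave(frequency_hz, sample_rate_hz, duration_s, sample_bits):
    """Sample a sine tone at regular intervals and quantize each sample."""
    levels = 2 ** sample_bits
    count = round(sample_rate_hz * duration_s)
    samples = []
    for i in range(count):
        t = i / sample_rate_hz                                  # time of sample i
        amplitude = math.sin(2 * math.pi * frequency_hz * t)    # range -1..1
        # Map -1..1 onto the integers 0..levels-1.
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples

def bit_rate(sample_rate_hz, sample_bits, channels):
    """Bits of audio data produced per second."""
    return sample_rate_hz * sample_bits * channels


tone = sample_wave(440, 44_100, 0.01, 16)
print(len(tone), "samples in 10 ms")
print(bit_rate(44_100, 16, 2), "bits per second for CD audio")
```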
Representing Sound (continued) • These voltages are sent down the speaker wires to produce sound. A sound wave represented by the sequence: 0, 1.5, 2.0, 1.5, 2.0, 3.0, 4.0, 3.0, 0 (amplitude values)