lecture 17 i o interfaces and i o busses n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Lecture 17 I/O Interfaces and I/O Busses PowerPoint Presentation
Download Presentation
Lecture 17 I/O Interfaces and I/O Busses

Loading in 2 Seconds...

play fullscreen
1 / 31

Lecture 17 I/O Interfaces and I/O Busses - PowerPoint PPT Presentation


  • 97 Views
  • Uploaded on

Lecture 17 I/O Interfaces and I/O Busses. Storage System Issues. Historical Context of Storage I/O Storage I/O Performance Measures Secondary and Tertiary Storage Devices Processor Interface Issues I/O Buses Redundant Arrays of Inexpensive Disks (RAID). Disk Time Example.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Lecture 17 I/O Interfaces and I/O Busses' - maddock


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
lecture 17 i o interfaces and i o busses

Lecture 17 I/O Interfaces and I/O Busses

CS510 Computer Architectures

storage system issues
Storage System Issues
  • Historical Context of Storage I/O
  • Storage I/O Performance Measures
  • Secondary and Tertiary Storage Devices
  • Processor Interface Issues
  • I/O Buses
  • Redundant Arrays of Inexpensive Disks (RAID)

CS510 Computer Architectures

disk time example
Disk Time Example
  • Disk Parameters:
    • Transfer size is 8K bytes
    • Advertised average seek is 12 ms
    • Disk spins at 7200 RPM
    • Transfer rate is 4 MB/sec
    • Controller overhead is 2 ms
  • Assume that disk is idle so no queuing delay
  • What is Average Disk Access Time for a Sector?
    • Ave seek + ave rot delay + transfer time + controller overhead
    • 12 ms + 0.5/(7200 RPM/60) + 8 KB/4 MB/s + 2 ms
    • 12 + 4.15 + 2 + 2 = 20 ms
  • Advertised seek time assumes no locality: typically 1/4 to 1/3 advertised seek time
    • Average Disk Access Time for a Sector: 20 ms => 12 ms

CS510 Computer Architectures

processor interface issues
Processor Interface Issues
  • Interconnections
    • Busses
  • Processor interface
    • Interrupts
    • Memory mapped I/O
  • I/O Control Structures
    • Polling
    • Interrupts
    • DMA
    • I/O Controllers
    • I/O Processors
  • Capacity, Access Time, Bandwidth

CS510 Computer Architectures

i o interface

CPU

Memory

memory

bus

Interface

Interface

Peripheral

Peripheral

CPU

40 Mbytes/sec

optimistically

10 MIPS processor

completely

saturates the bus!

VME bus

Multibus-II

Nubus

Memory

Interface

Interface

Peripheral

Peripheral

I/O Interface

Independent I/O Bus

Seperate I/O instructions (in,out)

Lines distinguish between

I/O and memory transfers

Common Memory

& I/O bus

CS510 Computer Architectures

memory mapped i o

CPU

Single Memory & I/O Bus

No Separate I/O Instructions

ROM

Memory

Interface

Interface

RAM

Peripheral

Peripheral

CPU

I/O

$

L2 $

Memory Bus

I/O bus

Memory

Bus Adapter

Memory Mapped I/O

CS510 Computer Architectures

programmed i o polling

Is data

ready?

CPU

no

yes

read

data

Memory

IOC

store

data

device

done?

no

yes

Programmed I/O (Polling)
  • busy on wait loop
  • not an efficient way to
  • use the CPU, unless
  • the device is very fast!
  • but checks for I/O
  • completion can be
  • dispersed among
  • computationally
  • intensive code

CS510 Computer Architectures

interrupt driven data transfer

CPU

add

sub

and

or

nop

(1) I/O

interrupt

user

program

(2) save PC

Memory

IOC

(3) interrupt

service addr

device

read

store

...

rti

interrupt

service

routine

(4)

memory

1000 transfers/second:

1000 interrupts @ 2 msec per interrupt=> 2 msec

1000 interrupt service @ 98 msec each=> 98 msec

100 msec = 0.1 CPU seconds

Interrupt Driven Data Transfer

User program halts only

during actual transfer

CS510 Computer Architectures

direct memory access

Time to do 1000 xfers in 1 msec:

1 DMA set-up sequence: @ 50 msec

1 interrupt: @ 2 msec

1 interrupt service sequence: @ 48 msec

100msec

.0001 second of CPU time

CPU sends a starting address,

direction(R/W), and word count

to DMAC. Then issues "start".

CPU

0

ROM

Memory

DMAC

IOC

Memory

Mapped I/O

RAM

I/O

device

DMAC provides;

Peripheral controller Handshake signals

Memory Addresses

Handshake signals

Peripherals

DMAC

n

Direct Memory Access

CS510 Computer Architectures

input output processors

D1

CPU

issues

instruction

to IOP

IOP

(1)

D2

interrupts

when done

main memory

bus

(4)

Mem

. . .

target device

Dn

I/O

bus

where cmnds are

CPU

IOP

OP Device Address

looks in memory for commands

(2)

Command

(3)

memory

OP Addr Cnt Other

what

to do

special

requests

where

to put

data

how

much

Input/Output Processors

Device to/from memory transfers are controlled by the IOP directly.

IOP steals memory cycles from CPU

CS510 Computer Architectures

relationship to processor architecture
Relationship to Processor Architecture
  • I/O instructions and busses have largely disappeared
  • Interrupt vectors have been replaced by jump tablesPC <- M [ IVA + interrupt number ]
  • Interrupts:
    • Stack replaced by shadow registers
    • Handler saves registers and re-enables higher priority interrupts

CS510 Computer Architectures

relationship to processor architecture1
Relationship to Processor Architecture
  • Caches required for processor performance cause problems for I/O
    • Flushing is expensive, I/O pollutes cache
    • Solution is borrowed from shared memory multiprocessors "snooping"
  • Virtual memory frustrates DMA
  • Stateful processors are hard to switch context

CS510 Computer Architectures

interconnect trends

Network

Channel

Backplane

>1000 m

10 - 100 m

Distance

1 m

10 - 100 Mb/s

40 - 1000 Mb/s

Bandwidth

320 - 1000+ Mb/s

high (>ms)

medium

Latency

low (<µs)

low

medium

Reliability

high

Extensive CRC

Byte Parity

Byte Parity

Interconnect Trends
  • Interconnect = glue that interfaces computer system components
  • High speed hardware interfaces + logical protocols
  • Networks, channels, backplanes
  • Message-based
  • Narrow pathways
  • Distributed arbitration
  • Memory-mapped
  • Wide pathways
  • Centralized arbit.

CS510 Computer Architectures

backplane architectures

Metric

VME

Futur

eBus

MultiBus II

SCSI-I

Bus Width (signals)

128

96

96

25

Address data multiplexed?

No

Y

es

Y

es

na

Data Width

16 - 32

32

32

8

Xfer Size

Single/Multiple

Single/Multiple

Single/Multiple

Single/Multiple

# of Bus Masters

Multiple

Multiple

Multiple

Multiple

Split Transactions

No

Optional

Optional

Optional

Clocking

Async

Sync

Async

Either

or

d (0 ns mem)

Bandwidth, Single W

25

20

37

5, 1.5

Bandwidth, Single Word (150ns mem)

12.9

10

15.5

5, 1.5

Bandwidth Multiple W

27.9

40

or

d (0 ns mem)

95.2

5, 1.5

or

d (150 ns mem)

Bandwidth Multiple W

13.6

13.3

20.8

5, 1.5

Max # of devices

21

21

20

7

Max Bus Length

.5 m

.5 m

.5 m

25 m

Standard

IEEE 1014

IEEE 896

ANSI/IEEE 1296

ANSI X3.131

Backplane Architectures

Distinctions begin to blur:

SCSI channel is like a bus

FutureBus is like a channel (disconnect/reconnect)

HIPPI forms links in high speed switching fabrics

CS510 Computer Architectures

bus based interconnect
Bus-Based Interconnect
  • Bus: a shared communication link between subsystems
    • Low cost: a single set of wires is shared multiple ways
    • Versatility: Easy to add new devices & peripherals may even be ported between computers using common bus
  • Disadvantage
    • A communication bottleneck, possibly limiting the maximum I/O throughput
  • Bus speed is limited by physical factors
    • the bus length
    • the number of devices (and, hence, bus loading).
    • these physical limits prevent arbitrary bus speedup.

CS510 Computer Architectures

bus based interconnect1
Bus-Based Interconnect
  • Two generic types of busses:
    • I/O busses (sometimes called a channel)
      • lengthy
      • many types of devices connected
      • wide range in the data bandwidth
      • follow a bus standard
    • CPU-Memory buses (sometimes called a backplane)
      • short
      • high speed, matched to the memory system to maximize memory-CPU bandwidth
    • To lower costs, low cost (older) systems combine together
  • Bus transaction
    • Sending address & receiving or sending data

CS510 Computer Architectures

bus protocols

Master

Slave

° ° °

Control Lines

Address Lines

Data Lines

Bus Protocols
  • Bus Master: has ability to control the bus, initiates a transaction
  • Bus Slave: module activated by the transaction
  • Bus Communication Protocol: specification of sequence of events and timing requirements in transferring information.
    • Asynchronous Bus Transfers: control lines (req., ack.) serve to orchestrate sequencing
    • Synchronous Bus Transfers: sequence relative to the common clock

Multibus: 20 address, 16 data, 5 control, 50ns Pause

CS510 Computer Architectures

synchronous bus protocols

Read transaction

Clock

Address

Data

Read

Wait

begin read

Read complete

Pipelined/Split transaction Bus Protocol

Address

Data

Wait

addr 1

addr 2

addr 3

data 0

data 1

wait 1

OK 1

Synchronous Bus Protocols

CS510 Computer Architectures

asynchronous handshake

Address

Data

Read

Req.

Ack.

Next Address

Master Asserts Address

Master Asserts Data

4 Cycle Handshake

t0 t1 t2 t3 t4 t5

t0:Master has obtained control and asserts address, direction, data; Waits a specified amount of time for slaves to decode target

t1: Master asserts request line

t2:Slave asserts ack, indicating data received

t3:Master releases req

t4:Slave releases ack

Asynchronous Handshake

Write Transaction

CS510 Computer Architectures

asynchronous handshake1

Master Asserts Address

Address

Data

Read

Req

Ack

Next Address

4 Cycle Handshake

t0 t1 t2 t3 t4 t5

t0:Master has obtained control and asserts address, direction;

Waits a specified amount of time for slaves to decode

t1:Master asserts request line

t3:Master releases req, data received

t4:Slave releases ack

Asynchronous Handshake

Read Transaction

t2:Slave asserts ack, indicating ready to transmit data;

Slave asserts data

Time Multiplexed Bus: address and data share lines

CS510 Computer Architectures

bus arbitration

Bus Arbiter

BR: Bus Request

BG: Bus Grant

BR BG

BR BG

BR BG

Bus sequence:

BR

BG

Busy

data

M

M

M

BGi BGo

BG

BR

BGi BGo

BGiBGo

M

M

M

A.U.

BR

BR

BR

M

M

M

A.U.

BR A

BR A

BR A

BR

A

Bus Arbitration

Parallel (Centralized) Arbitration

Serial Arbitration (daisy chaining)

Polling

CS510 Computer Architectures

bus options

Option High performance Low cost

Bus Options

Bus width Separate address Multiplex address & data lines & data lines

Data width Wider is faster Narrower is cheaper (e.g., 32 bits) (e.g., 8 bits)

Transfer size Multiple words has Single-word transfer less bus overhead is simpler

Bus masters Multiple Single master (requires arbitration) (no arbitration)

Split Yes - separate No - continuous transaction? Request and Reply connection is cheaper packets gets higher and has lower latency bandwidth (needs multiple masters)

Clocking Synchronous Asynchronous

CS510 Computer Architectures

1990 bus survey

VME FutureBus Multibus II IPI SCSI

1990 Bus Survey

Signals 128 96 96 16 8

Addr/Data mux no yes yes n/a n/a

Data width 16 - 32 32 32 16 8

Masters multi multi multi single multi

Clocking Async Async Sync Async either

MB/s 25 37 20 25 1.5 (asyn)

(0ns, word) 5 (sync)

150ns word 12.9 15.5 10 = =

0ns block 27.9 95.2 40 = =

150ns block 13.6 20.8 13.3 = =

Max devices 21 20 21 8 7

Max meters 0.5 0.5 0.5 50 25

Standard IEEE 1014 IEEE 896.1 ANSI/IEEE ANSI X3.129 ANSI X3.131

1296

CS510 Computer Architectures

slide24
VME
  • 3 96-pin connectors
  • 128 defined as standard, rest customer defined
    • 32 address
    • 32 data
    • 64 command & power/ground lines

CS510 Computer Architectures

scsi small computer system interface
SCSI: Small Computer System Interface
  • Up to 8 devices to communicate on a bus or “string” at sustained speeds of 4-5 MBytes/sec
  • SCSI-2 up to 20 MB/sec
  • Devices can be slave (“target”) or master(“initiator”)
  • SCSI protocol: a series of phases, during which specific actions are taken by the controller and the SCSI disks
    • Bus Free: No device is currently accessing the bus
    • Arbitration: When the SCSI bus goes free, multiple devices may request (arbitrate for) the bus; fixed priority by address
    • Selection: informs the target that it will participate (Reselection if disconnected)
    • Command: the initiator reads the SCSI command bytes from host memory and sends them to the target
    • Data Transfer: data in or out, initiator: target
    • Message Phase: message in or out, initiator: target (identify, save/restore data pointer, disconnect, command complete)
    • Status Phase: target, just before command complete

CS510 Computer Architectures

scsi bus channel architecture

Command Setup

Arbitration

Selection

Message Out (Identify)

Command

Disconnect to seek/¼

ll buf

fer

Message In (Disconnect)

- - Bus Free - -

Arbitration

If no disconnect is needed

Reselection

Message In (Identify)

Data

T

ransfer

Data In

Completion

ll buf

fer

Disconnect to ¼

Command Completion

Message In (Save Data Ptr)

Status

Message In (Disconnect)

Message In (Command Complete)

- - Bus Free - -

Arbitration

Reselection

Message In (Identify)

Message In (Restore Data Ptr)

SCSI “Bus”: Channel Architecture

peer-to-peer protocols

initiator/target

linear byte streams

disconnect/reconnect

CS510 Computer Architectures

1993 i o bus survey

Bus SBus TurboChannel MicroChannel PCI

1993 I/O Bus Survey

Originator Sun DEC IBM Intel

Clock Rate (MHz) 16-25 12.5-25 async 33

Addressing Virtual Physical Physical Physical

Data Sizes (bits) 8,16,32 8,16,24,32 8,16,24,32,64 8,16,24,32,64

Master Multi Single Multi Multi

Arbitration Central Central Central Central

32 bit read (MB/s) 33 25 20 33

Peak (MB/s) 89 84 75 111 (222)

Max Power (W) 16 26 13 25

CS510 Computer Architectures

1993 mp server memory bus survey

Bus Summit Challenge XDBus

1993 MP Server Memory Bus Survey

Originator HP SGI Sun

Clock Rate (MHz) 60 48 66

Split transaction?YesYesYes?

Address lines 48 40 ??

Data lines 128 256 144 (parity)

Data Sizes (bits) 512 1024 512

Clocks/transfer 4 5 4?

Peak (MB/s)96012001056

Master Multi Multi Multi

Arbitration Central Central Central

Addressing Physical Physical Physical

Slots 16 9 10

Busses/system 1 1 2

Length 13 inches 12? inches 17 inches

CS510 Computer Architectures

communications networks

Network Controller

Processor Memory

List of request blocks

Node

Control

. . .

Processor

Reg. I/F

Data to be transmitted

DMA

List of free blocks

Request

. . .

Block

Net

Memory

I/F

List of receive blocks

Receive

. . .

Block

DMA

Data received

Media

Peripheral Backplane Bus

Communications Networks

Performance limiter is memory system, OS overhead, not protocols

  • Send/receive queues in processor memories
  • Network controller copies back and forth via DMA
  • No host intervention needed
  • Interrupt host when message sent or received

CS510 Computer Architectures

i o controller architecture

Peripheral Bus (VME, FutureBus, etc.)

Peripheral Bus Interface/DMA

Host

Memory

Buf

fer

Memory

Processor

µProc

Cache

ROM

Host

I/O Channel Interface

Processor

I/O Controller

I/O Controller Architecture
  • Request/Response block interface
  • Backdoor access to host memory

CS510 Computer Architectures

i o data flow

Application

Address Space

Memory-to-Memory Copy

Host Processor

fers (>10 MByte)

OS Buf

DMA

over Peripheral Bus

Buf

fers (1 M - 4 MBytes)

I/O Controller

HBA

Xfer over Disk Channel

Embedded Controller

T

rack Buf

fers (32K - 256KBytes)

Xfer over Serial Interface

I/O Device

Head/Disk

Assembly

I/O Data Flow

Impediment to high performance: multiple copies,

complex hierarchy

CS510 Computer Architectures