
Computer Architecture

Presentation Transcript


  1. Computer Architecture

  2. History • What is the oldest computing device? • Computer Architecture Development • First generation used vacuum tubes -- 1940 – 1950 • Second generation: transistors -- 1950 – 1964 • Third generation used integrated circuits -- 1964 – 1971 • Fourth generation uses microprocessor chips -- 1971 – present

  3. Whose Computer is Faster? • The need for a performance measure • MIPS vs MFLOPS • MIPS – million instructions per second • instruction count divided by (execution time x 10^6) • MFLOPS – million floating-point operations per second • number of floating-point operations in a program divided by (execution time x 10^6)
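
Taking the definitions above at face value, here is a quick sketch of how the two measures are computed; the program statistics below are invented purely for illustration.

```python
# MIPS and MFLOPS from the slide's definitions.
# The program statistics are invented for illustration only.

instruction_count = 250_000_000      # total instructions executed
flop_count        = 40_000_000       # floating-point operations executed
execution_time    = 2.0              # seconds

mips   = instruction_count / (execution_time * 1e6)
mflops = flop_count / (execution_time * 1e6)

print(f"MIPS   = {mips:.1f}")        # 125.0
print(f"MFLOPS = {mflops:.1f}")      # 20.0
```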

  4. Commercial Computers

  5. Basic Components The motherboard comprises three components: Processor – also called the Central Processing Unit (CPU), responsible for following program instructions; Memory – holds programs while they are running and the data those programs need; I/O – input feeds the computer, and output is the result of the computation sent to the user. The processor itself comprises two main components: Datapath – performs arithmetic operations; Control – tells the datapath, memory, and I/O devices what to do according to the instructions of the program

  6. A Simple Processor and Memory The processor interfaces to memory through a small set of signals. [Figure: the processor (CPU), containing the Control and Datapaths, connects to memory (MEM) via address lines, data lines, a read/write line, a strobe, and an MFC signal.]

  7. Inside the Processor [Figure: processor internals showing the PC, IR, MAR, MDR, Y, and Z registers, the register file (REG file), and the ALU, with read/write and MFC lines to memory (MEM). Red lines communicate between the processor and the memory. Blue lines are internal, dedicated electrical connections. Green lines are connections that can be made or broken as the control specifies.]
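
To make the roles of these registers concrete, here is a minimal fetch-decode-execute sketch. It is only an illustration: the register names (PC, IR, MAR, MDR, Y, Z) follow the figure, but the instruction format, the tiny register file, and the program contents are invented.

```python
# Minimal fetch-decode-execute sketch (illustrative only).
# Register names follow the slide's figure; the 3-field instruction
# encoding and the program/data in memory are invented.

memory = {
    0: ("LOAD", "R0", 100),   # R0 <- MEM[100]
    1: ("LOAD", "R1", 101),   # R1 <- MEM[101]
    2: ("ADD",  "R0", "R1"),  # R0 <- R0 + R1 (via the ALU inputs Y and Z)
    3: ("HALT", None, None),
    100: 7,
    101: 35,
}

regs = {"R0": 0, "R1": 0}     # a tiny register file
PC = 0                        # program counter

while True:
    MAR = PC                  # address of the next instruction goes out
    MDR = memory[MAR]         # memory's reply arrives (signalled by MFC)
    IR = MDR                  # latch it into the instruction register
    PC += 1
    op, a, b = IR             # the control unit decodes IR...
    if op == "LOAD":
        MAR = b
        MDR = memory[MAR]
        regs[a] = MDR
    elif op == "ADD":
        Y, Z = regs[a], regs[b]   # ALU input latches
        regs[a] = Y + Z           # ...and steers the datapath
    elif op == "HALT":
        break

print(regs)                   # {'R0': 42, 'R1': 35}
```

Each loop iteration mirrors the figure: the address goes out through MAR, the memory's reply lands in MDR, the control decodes IR, and the ALU combines its Y and Z inputs into a result.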

  8. Memory SRAM – Static RAM DRAM – Dynamic RAM SDRAM – Synchronous Dynamic RAM SIMM – Single Inline Memory Module DIMM - Dual Inline Memory Modules Cache L1, L2 ROM – Read Only Memory RAM – Random Access Memory

  9. Storage Punch Cards Tape Silos Hard Drive interfaces: ST506, ESDI, SCSI, IDE, EIDE Floppy/ZIP CD-ROM types: CD-i, TV-based proprietary players, CD-R/CD-RW, Photo CD players, computer-based CD-ROM drives DVD/DVD+/DVD-RW

  10. Hard Drive Interfaces: ST506 – can only issue one disk-move command at a time, supports up to 16-head disks ESDI – (Enhanced Small Device Interface, early '80s) supports much larger disks and higher data transfer rates SCSI – (Small Computer Systems Interface) a very intelligent system-level interface, responds to more complex commands than ST506 or ESDI IDE – (Integrated Drive Electronics) Shortest driver/cable in the world EIDE – (Enhanced IDE) Supports LBA translation, allowing drives larger than 504 MB, and offers higher data transfer rates

  11. Integrated Circuits VESA - Video Electronics Standards Association ISA - Industry Standard Architecture PCI - Peripheral Component Interconnect AGP - Accelerated Graphics Port IDE - Integrated Drive Electronics EIDE - Enhanced IDE UDMA - Ultra Direct Memory Access

  12. VESA VL-Bus is short for VESA local bus (VLB). The VESA local bus is an older local bus architecture popular on 486 computer systems in 1993 and 1994. The VL-Bus has been completely replaced by PCI and AGP bus architectures. The VL-Bus provides a high-speed data path between the CPU and peripherals (video, disk, network, etc.) running at the speed of the processor. The VL-Bus is a 32-bit bus that supports bus mastering and runs at speeds up to 40MHz.

  13. VESA Advantages • Faster Processing • Faster access • Direct access to the processor bus, which is local to the CPU • Direct access to system memory at the speed of the processor itself • 32-bit data transfer capability • 128MBps to 132MBps maximum throughput • Different physical slot that prevents plugging a slower card into a fast slot

  14. VESA Limitations • Available only for 486 processors • Not developed for the speed of the Pentium • Maximum speed of the VESA specification is 66MHz, but in reality the speed is limited to 33MHz • Limited to a maximum of three cards depending on system resources • Poor implementation of bus mastering • Does not support plug-and-play

  15. ISA The original ISA bus on the IBM PC was 8 bits wide The ISA bus was expanded to 16 bits in 1984. The ISA bus eventually became a bottleneck to performance and was augmented with additional high-speed buses, but ISA persists because of the truly enormous base of existing peripherals using the standard. There are still many devices for which the ISA's speed is more than sufficient, and will be for some time to come (standard modems being an example). After 17 years it appears that ISA may finally be going the way of the dodo. Market leaders Intel and Microsoft want to move the industry away from the use of the ISA bus in new machines.

  16. PCI Currently by far the most popular local I/O bus, the Peripheral Component Interconnect (PCI) bus was developed by Intel and introduced in 1993 PCI is a 32-bit bus that normally runs at a maximum of 33 MHz. The key to PCI's advantages over its predecessor, the VESA local bus, lies in the chipset that controls it.

  17. PCI The speed of the PCI bus can be set synchronously or asynchronously In a synchronized setup (used by most PCs), the PCI bus runs at half the memory bus speed; since the memory bus is usually 50, 60 or 66 MHz, the PCI bus would run at 25, 30 or 33 MHz respectively. In an asynchronous setup the speed of the PCI bus can be set independently of the memory bus speed.
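
A quick sketch of the synchronous case described above: the divide-by-two gives the PCI clock, and the 32-bit width then gives the familiar peak-bandwidth figures (the clock values are the ones quoted in the slide).

```python
# Synchronous PCI clocking: the PCI clock is the memory (system) bus
# clock divided by two, and peak bandwidth is bus width x clock
# (32 bits = 4 bytes per transfer).

bus_width_bytes = 4                      # 32-bit PCI bus

for memory_bus_mhz in (50, 60, 66):
    pci_mhz = memory_bus_mhz / 2         # half the memory bus speed
    peak_mb_s = bus_width_bytes * pci_mhz
    print(f"memory bus {memory_bus_mhz} MHz -> PCI {pci_mhz:g} MHz, "
          f"peak ~{peak_mb_s:g} MB/s")

# memory bus 50 MHz -> PCI 25 MHz, peak ~100 MB/s
# memory bus 60 MHz -> PCI 30 MHz, peak ~120 MB/s
# memory bus 66 MHz -> PCI 33 MHz, peak ~132 MB/s
```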

  18. PCI Most peripherals use PCI connections The PCI bus was the first bus to popularize bus mastering, probably in part because it was the first to arrive alongside operating systems and software really capable of taking advantage of it. PCI's design allows bus mastering of multiple devices on the bus simultaneously, with the arbitration circuitry working to ensure that no device on the bus (including the processor!) locks out any other device The PCI bus is part of the Plug and Play standard developed by Intel, with cooperation from Microsoft and many other companies. PCI systems were the first to popularize the use of Plug and Play.

  19. AGP Accelerated Graphics Port (AGP) technology provides a dedicated, high-speed port for the movement of large blocks of 3D texture data between the PC's graphics controller and system memory. Much as was the case with the ISA bus before it, traffic on the PCI bus is starting to become heavy on high-end PCs, with video, hard disk and peripheral data all competing for the same I/O bandwidth. To combat the eventual saturation of the PCI bus with video information, an interface has been pioneered by Intel, designed specifically for the video subsystem. The idea behind AGP is simple: create a faster, dedicated interface between the video chipset and the system processor. AGP is considered a port, and not a bus, because it only involves two devices (the processor and video card) and is not expandable

  20. AGP The AGP bus is 32 bits wide, just the same as PCI is, but instead of running at half of the system (memory) bus speed the way PCI does, it runs at full bus speed Peak bandwidth is four-times higher than the PCI bus thanks to pipelining, sideband addressing, and data transfers that occur on both rising and falling edges of the clock. This shows how the processing of texture maps is currently supported on the PC. AGP relieves the graphics bottleneck by adding a new dedicated high-speed bus directly between the chipset and the graphics controller.
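
To see where the "four times the PCI bus" figure comes from, here is a back-of-the-envelope sketch. It assumes the commonly cited 66 MHz AGP base clock and the 2x double-pumped mode implied by the both-edges remark above.

```python
# Both buses are 32 bits (4 bytes) wide, but AGP runs at the full
# 66 MHz system bus clock and (in 2x mode) transfers data on both
# the rising and falling clock edges.

bytes_per_transfer = 4

pci_peak = bytes_per_transfer * 33e6             # one transfer per clock
agp_peak = bytes_per_transfer * 66e6 * 2         # both clock edges (AGP 2x)

print(f"PCI peak: {pci_peak / 1e6:.0f} MB/s")    # ~132 MB/s
print(f"AGP peak: {agp_peak / 1e6:.0f} MB/s")    # ~528 MB/s
print(f"ratio:    {agp_peak / pci_peak:.0f}x")   # 4x
```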

  21. IDE The most popular interface used in modern hard disks Offers excellent performance at relatively low cost; it is challenged only by SCSI. This interface is also known by a truly staggering variety of other names, such as: ATA, ATA/ATAPI, EIDE, ATA-2, Fast ATA, ATA-3, Ultra ATA, Ultra DMA

  22. ATA-1 The "proper" name for the IDE interface is AT Attachment, or ATA The first formal standard defining the AT Attachment interface was submitted to ANSI for approval in 1990. This standard is sometimes called ATA-1 to distinguish it from its successors. The original IDE/ATA standard defines the following features and transfer modes: • Two Hard Disks: The specification calls for a single channel in a PC, shared by two devices that are configured as master and slave. • PIO Modes: ATA includes support for PIO modes 0, 1 and 2. • The oldest method of transferring data over the IDE/ATA interface is through the use of programmed I/O. • DMA Modes: ATA includes support for single word DMA modes 0, 1 and 2, and multiword DMA mode 0. "Plain" ATA does not include support for enhancements such as ATA Packet Interface (ATAPI), hence is no longer used. The ATAPI standard is used for devices like optical, tape and removable storage drives. It enables them to plug into the standard IDE cable used by IDE/ATA hard disks, and be configured as master or slave, etc. just like a hard disk would be.

  23. ATA-2 • ATA-2 was a significant enhancement of the original ATA standard. It defines the following improvements over the base ATA standard (with which it is backward compatible): • Faster PIO Modes: ATA-2 adds the faster PIO modes 3 and 4 to those supported by ATA. • Faster DMA Modes: ATA-2 adds multiword DMA modes 1 and 2 to the ATA modes. • Block Transfers: ATA-2 adds commands to allow block transfers for improved performance. • Logical Block Addressing (LBA): ATA-2 defines support (by the hard disk) for logical block addressing. Using LBA requires BIOS support on the other end of the interface as well. • Improved "Identify Drive" Command: This command allows hard disks to respond to inquiries from software, with more accurate information about their geometry and other characteristics. Now, ATA-2 is obsolete.
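
LBA replaces the older cylinder/head/sector (CHS) addressing with a single linear sector number. Here is a minimal sketch of the standard CHS-to-LBA mapping; the drive geometry used (16 heads, 63 sectors per track) is just an illustrative example, not taken from the slides.

```python
# Standard CHS -> LBA mapping: sectors are numbered linearly,
# cylinder by cylinder, head by head. Sector numbers start at 1,
# cylinders and heads at 0. The geometry here is purely illustrative.

HEADS_PER_CYLINDER = 16
SECTORS_PER_TRACK  = 63

def chs_to_lba(cylinder, head, sector):
    return ((cylinder * HEADS_PER_CYLINDER + head)
            * SECTORS_PER_TRACK + (sector - 1))

print(chs_to_lba(0, 0, 1))    # 0    (first sector on the disk)
print(chs_to_lba(0, 1, 1))    # 63   (first sector under the next head)
print(chs_to_lba(1, 0, 1))    # 1008 (= 16 heads * 63 sectors)
```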

  24. EIDE Enhanced IDE, also called EIDE, is a term that Western Digital coined in 1994 to represent a particular set of extensions it devised to the original AT Attachment standard. • The original Enhanced IDE program included the following improvements over ATA: • ATA-2 Enhancements: EIDE includes all of the improvements that are defined as part of the ATA-2 standard, including the higher-speed transfer modes. • ATAPI: The EIDE definition includes support for non-hard-disk ATAPI devices on the IDE/ATA channel. Note that at that time, ATAPI was not part of the ATA standard at all. • Dual IDE/ATA Host Adapters: The EIDE standard specifically includes support for dual IDE/ATA channels, allowing four IDE/ATA/ATAPI devices to be used. Some people in the hard disk industry apparently feel that the creation of "Enhanced IDE" was one of the worst things to ever happen to the IDE/ATA interface! Lots of Criticism!

  25. ATA-3 The ATA-3 standard is a minor revision of ATA-2 • Improved Reliability: ATA-3 improves the reliability of the higher-speed transfer modes, which can be an issue due to the low-performance standard cable used up to that point in IDE/ATA. (An improved cable was defined as part of ATA/ATAPI-4.) • Self-Monitoring Analysis and Reporting Technology (SMART): ATA-3 introduced this reliability feature. • Security Feature: ATA-3 defined security mode, which allows devices to be protected with a password.

  26. UDMA With the increase in performance of hard disks over the last few years, the use of programmed I/O modes became a hindrance to performance. As a result, focus was placed on the use of direct memory access (DMA) modes. Of course, hard disks get faster and faster, and the maximum speed of multiword DMA mode 2, 16.7 MB/s, quickly became insufficient for the fastest drives. The key technological advance introduced to IDE/ATA in Ultra DMA was double transition clocking. Ultra DMA also introduced the use of cyclical redundancy checking, or CRC, on the interface Just like ATA and ATAPI, there are several modes of UDMA. In common parlance, drives that use Ultra DMA are often called "Ultra ATA/xx", where "xx" is the speed of the interface. So few people really talk about current drives being "Ultra DMA mode 5"; they say they are "Ultra ATA/100".

  27. UDMA (Ultra DMA) Modes
  Mode     Cycle Time (ns)   Maximum Transfer Rate (MB/s)   Defining Standard
  Mode 0   240               16.7                           ATA/ATAPI-4
  Mode 1   160               25.0                           ATA/ATAPI-4
  Mode 2   120               33.3                           ATA/ATAPI-4
  Mode 3   90                44.4                           ATA/ATAPI-5
  Mode 4   60                66.7                           ATA/ATAPI-5
  Mode 5   40                100.0                          ATA/ATAPI-6
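
The transfer rates in the table follow directly from the cycle times: the IDE/ATA data path is 16 bits (two bytes) wide, and double transition clocking moves two words per cycle, i.e. 4 bytes per cycle time. A quick check:

```python
# Each UDMA cycle moves two 16-bit words (double transition clocking),
# i.e. 4 bytes per cycle. Rate = 4 bytes / cycle time.

cycle_times_ns = {0: 240, 1: 160, 2: 120, 3: 90, 4: 60, 5: 40}

for mode, t_ns in cycle_times_ns.items():
    rate_mb_s = 4 / (t_ns * 1e-9) / 1e6
    print(f"UDMA mode {mode}: {rate_mb_s:.1f} MB/s")

# mode 0: 16.7, mode 1: 25.0, mode 2: 33.3,
# mode 3: 44.4, mode 4: 66.7, mode 5: 100.0
```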

  28. SCSI Small Computer Systems Interface, abbreviated SCSI and pronounced "skuzzy". SCSI is a much more advanced interface than its chief competitor, IDE/ATA, and has several advantages over IDE that make it preferable for many situations, usually in higher-end machines It is far less commonly used than IDE/ATA due to its higher cost and the fact that its advantages are not useful for the typical home or business desktop user. In terms of standards, SCSI suffers from the same problem  that IDE/ATA does: there are too many different ones and it can be hard to understand what is what. While IDE is an interface, SCSI is really a system-level bus. SCSI offers performance, expandability and compatibility unmatched by any other current PC interface.

  29. SCSI "Regular" SCSI (SCSI-1) Wide SCSI Fast SCSI Fast Wide SCSI Ultra SCSI Wide Ultra SCSI Ultra2 SCSI Wide Ultra2 SCSI Ultra3 SCSI Ultra160 (Ultra160/m) SCSI Ultra160+ SCSI Ultra320 SCSI

  30. USB The USB interface is specifically designed to allow easy connection of a wide variety of devices; it is intended to be user-friendly and truly "plug and play". The USB connection runs with a maximum throughput of 12 Mbits/second (1.5 Mbytes/second), which is shared by all devices. There is also a slower-speed, 187.5 kbytes/second mode available for very slow devices, such as keyboards. Seeing the limitations of USB, Intel has spurred the development and implementation of USB 2.0, an updated version of the interface that increases throughput from 12 Mbits per second all the way up to 480 Mbits per second
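
The throughput figures above are straightforward unit conversions (8 bits per byte). A small sketch, using the standard USB names (low speed, full speed, high speed) for the three quoted rates:

```python
# Converting the quoted USB signalling rates from megabits to megabytes
# per second (8 bits per byte). These are raw bus rates, shared by all
# devices on the bus; real payload throughput is lower.

for name, mbits in (("USB 1.1 low speed", 1.5),
                    ("USB 1.1 full speed", 12),
                    ("USB 2.0 high speed", 480)):
    print(f"{name}: {mbits} Mbits/s = {mbits / 8:g} Mbytes/s")

# USB 1.1 low speed:  1.5 Mbits/s = 0.1875 Mbytes/s (187.5 kbytes/s)
# USB 1.1 full speed: 12 Mbits/s  = 1.5 Mbytes/s
# USB 2.0 high speed: 480 Mbits/s = 60 Mbytes/s
```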

  31. FireWire It was originally developed by Apple, which called it FireWire; this name became popular, but Apple owns the rights to it, and many companies refused to pay to license the name Everyone started to refer to it by the standard number assigned to it by the IEEE, which formally published the interface as a standard in 1995. IEEE-1394 is defined as part of the SCSI-3 family of related standards, and was at one point sometimes called "serial SCSI". It is a serial interface that supports dozens of daisy-chained devices, hot-swapping, and plug-and-play IEEE-1394 supports up to 400 Mbits/second.

  32. Slot 1 Motherboard

  33. Notice the ISA slots (ISA is being phased out), the AGP port, and the Slot 1 processor.

  34. Notice the processor slots: this motherboard has both a Socket 7 and a Slot 1, and it can support dual processors. Notice the AGP port, the PCI slots, one ISA slot, and 3 DIMM slots.

  35. Pentium 1

  36. Pentium II

  37. Pentium III (same size as the Pentium II)

  38. Pentium 4

  39. Superscalar • What is a superscalar processor? • an architecture that contains more than one execution unit, or pipeline, allowing more than one instruction per clock cycle • e.g. Pentium processor has two side-by-side pipelines for integer instructions
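
As a rough illustration of why independent instructions matter for a dual-pipeline design like the Pentium's, here is a hypothetical sketch that greedily pairs adjacent independent operations. It models only data dependences, not the Pentium's real U/V pairing rules.

```python
# Toy dual-issue model: each instruction is (dest, src1, src2).
# Two adjacent instructions can issue in the same cycle only if the
# second one does not read or write the first one's destination.

program = [
    ("r1", "a", "b"),     # r1 = a + b
    ("r2", "c", "d"),     # r2 = c + d   (independent -> pairs with above)
    ("r3", "r1", "r2"),   # r3 = r1 + r2 (starts a new issue pair)
    ("r4", "e", "f"),     # r4 = e + f   (independent -> pairs with r3)
]

cycles = 0
i = 0
while i < len(program):
    cycles += 1
    if i + 1 < len(program):
        d1, *_ = program[i]
        d2, s1, s2 = program[i + 1]
        if d1 not in (d2, s1, s2):   # no dependence: dual-issue
            i += 2
            continue
    i += 1                           # dependence: single issue

print(f"{len(program)} instructions in {cycles} cycles")  # 4 in 2 cycles
```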

  40. Parallel Processing • Parallel computing can speed things up from 2 to 500 times faster than using a single processor • Local memory machines that communicate by messages • Shared memory machines that communicate through memory • Using messages sent between processors – copy memory • Using operating system threads – use memory in-place
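
A minimal sketch of the two styles mentioned above, using Python's standard library: separate processes with a message queue for the "copy memory" style, and threads for the shared-memory, "use memory in-place" style. The work function is invented for illustration.

```python
# Message passing (separate processes, data is copied through a queue)
# versus shared memory (threads operate on the same list in place).

import threading
from multiprocessing import Process, Queue

def square_range(lo, hi):
    return [x * x for x in range(lo, hi)]

def mp_worker(lo, hi, q):
    q.put(square_range(lo, hi))          # result is copied to the parent

def thread_worker(lo, hi, out):
    out[lo:hi] = square_range(lo, hi)    # result written in place

if __name__ == "__main__":
    # Message passing: two worker processes send results back.
    q = Queue()
    procs = [Process(target=mp_worker, args=(0, 4, q)),
             Process(target=mp_worker, args=(4, 8, q))]
    for p in procs: p.start()
    messages = [q.get() for _ in procs]
    for p in procs: p.join()

    # Shared memory: two threads fill halves of the same list.
    shared = [0] * 8
    threads = [threading.Thread(target=thread_worker, args=(0, 4, shared)),
               threading.Thread(target=thread_worker, args=(4, 8, shared))]
    for t in threads: t.start()
    for t in threads: t.join()

    print(sorted(sum(messages, [])))     # [0, 1, 4, 9, 16, 25, 36, 49]
    print(shared)                        # [0, 1, 4, 9, 16, 25, 36, 49]
```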

  41. Programming issues • Parallel computers are difficult to program • Automatic parallelization techniques are only partially successful • Programming languages are few, not well supported, and difficult to use • Parallel algorithms are difficult to design • Parallelizing code – Implicit: write sequential algorithms, use a parallelizing compiler, and rely on the compiler to find parallelism – Explicit: design parallel algorithms, write in a parallel language, and rely on the human to find parallelism

  42. Vector • Related to SIMD • Work with linear arrays of numbers, or vectors • Operate on a few vector elements per clock cycle – e.g. A = B x C • Advantages over SISD • Result is independent of previous results • Single instruction does more • Access memory a block at a time • Access memory with known patterns
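
As a software-level analogue of the A = B x C example above, here is a sketch using NumPy, whose whole-array operations map naturally onto SIMD/vector hardware that processes a few elements per cycle. The arrays are invented for illustration.

```python
# A single array expression operates on whole vectors; each element's
# result is independent of the others, and memory is accessed in
# regular, block-at-a-time patterns.

import numpy as np

B = np.array([1.0, 2.0, 3.0, 4.0])
C = np.array([10.0, 20.0, 30.0, 40.0])

A = B * C                 # elementwise vector multiply: A = B x C
print(A)                  # [ 10.  40.  90. 160.]

# The scalar (SISD) equivalent needs an explicit element-by-element loop:
A_scalar = np.empty_like(B)
for i in range(len(B)):
    A_scalar[i] = B[i] * C[i]
print(np.array_equal(A, A_scalar))   # True
```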

  43. Tightly Coupled Machines • Processing Bottleneck • Lack of Reliability • Clusters with fast networking • Cost of Memory • Overheads of Interconnection • Overheads of Data Communication

  44. Supercomputing • Computer: At the Pittsburgh Supercomputer Center there is a system of 64 interconnected Compaq ES40 AlphaServers, each housing four EV67 microprocessors, with a peak capability of 342 billion calculations per second (Gflops) • Human: Every man, woman and child on Earth would have to perform 1,000 calculations every second to keep pace with the machine’s peak speed
