Prelude to Multiprocessing

Detecting cpu and system-board capabilities with CPUID and the MP Configuration Table

CPUID
  • Recent Intel processors provide a ‘cpuid’ instruction (opcode 0x0F, 0xA2) to assist software in detecting a CPU’s capabilities
  • If it’s implemented, this instruction can be executed in any of the processor modes, and at any of its four privilege levels
  • But this ‘cpuid’ instruction might not be implemented (e.g., 8086, 80286, 80386)
Intel x86 EFLAGS register

[Figure: layout of the 32-bit EFLAGS register, listed from the high bit down]
  • bits 31-22: 0 (reserved)
  • bits 21-16: ID, VIP, VIF, AC, VM, RF
  • bits 15-8: 0, NT, IOPL (2 bits), OF, DF, IF, TF
  • bits 7-0: SF, ZF, 0, AF, 0, PF, 1, CF

Software can ‘toggle’ the ID-bit (bit #21) in the 32-bit EFLAGS register if the processor is capable of executing the ‘cpuid’ instruction

But what if there’s no EFLAGS?
  • The early Intel processors (8086, 80286) did not implement any 32-bit registers
  • The FLAGS register was only 16 bits wide
  • So there was no ID-bit that software could try to ‘toggle’ (to see if ‘cpuid’ existed)
  • How can software be sure that the 32-bit EFLAGS register exists within the CPU?
Detecting 32-bit processors
  • There’s a subtle difference in the way the shift/rotate instructions work when register CL contains the ‘shift-factor’
  • On the 32-bit processors (e.g., 80386+) the value in CL is truncated to 5 bits, but not so on the original 8086/8088; note that Intel’s manuals say the 80286 also truncates the count, so this test by itself cannot tell an 80286 from a 32-bit CPU
  • Software can exploit this distinction as a first check on whether EFLAGS may be implemented
Detecting EFLAGS

# Here’s a test for the presence of EFLAGS

        mov     $-1, %ax        # a nonzero value
        mov     $32, %cl        # shift-factor of 32
        shl     %cl, %ax        # do logical shift
        or      %ax, %ax        # test result in AX
        jnz     is32bit         # AX unchanged: count was truncated to 5 bits
        jmp     is16bit         # AX cleared: EFLAGS absent

Testing for ID-bit ‘toggle’

# Here’s a test for the presence of the CPUID instruction

        pushfl                  # copy EFLAGS contents
        pop     %eax            # to accumulator register
        mov     %eax, %edx      # save a duplicate image
        btc     $21, %eax       # toggle the ID-bit (bit 21)
        push    %eax            # copy revised contents
        popfl                   # back into EFLAGS
        pushfl                  # copy EFLAGS contents
        pop     %eax            # back into accumulator
        xor     %edx, %eax      # do XOR with prior value
        bt      $21, %eax       # did ID-bit get toggled?
        jc      y_cpuid         # yes, can execute ‘cpuid’
        jmp     n_cpuid         # else ‘cpuid’ unimplemented

How does CPUID work?
  • Step 1: load value 0 into register EAX
  • Step 2: execute ‘cpuid’ instruction
  • Step 3: Verify the ‘GenuineIntel’ character-string in registers EBX, EDX, ECX (in that order)
  • Step 4: Find maximum CPUID input-value in the EAX register
Version and Features
  • load 1 into EAX and execute CPUID
  • Processor model and stepping information is returned in register EAX
  • Layout of the version information in EAX:
    • bits 27-20: Extended Family ID
    • bits 19-16: Extended Model ID
    • bits 13-12: Type
    • bits 11-8: Family ID
    • bits 7-4: Model
    • bits 3-0: Stepping ID

Some Feature Flags in EDX

  • bit 28: HTT = HyperThreading Technology (1 = yes, 0 = no)
  • bit 13: PGE = Page Global Entries (1 = yes, 0 = no)
  • bit 9: APIC = Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
  • bit 3: PSE = Page-Size Extensions (1 = yes, 0 = no)
  • bit 2: DE = Debugging Extensions (1 = yes, 0 = no)
  • bit 1: VME = Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
  • bit 0: FPU = Floating-Point Unit on-chip (1 = yes, 0 = no)

Some Feature Flags in ECX

  • bit 5: VMX = Virtual Machine Extensions (1 = yes, 0 = no)

Multiprocessor Specification
  • It’s an industry standard, allowing OS software to use multiple processors in a uniform way
  • OS software searches three regions of the physical address-space below 1 megabyte for a “paragraph-aligned” data-structure, 16 bytes long, called the MP Floating Pointer Structure:
    • Search in lowest KB of Extended BIOS Data Area
    • Search in topmost KB of conventional 640K RAM
    • Search in the 128KB ROM-BIOS (0xE0000-0xFFFFF)
MP Floating Pointer Structure
  • This structure may contain an ID-number for one of a small number of standard SMP system architectures, or may contain the memory address of a more extensive MP Configuration Table having entries that specify a “customized” system architecture
  • The machines in our classroom employ the latter of these two options
An example record
  • The MP Configuration Table will contain a record for each logical processor

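[Figure: layout of a 20-byte processor record, by byte offset]
  • 0x00: Entry Type (= 0)
  • 0x01: Local-APIC ID
  • 0x02: Local-APIC version
  • 0x03: CPU Flags: BP (bit 1), EN (bit 0)
  • 0x04: CPU signature (stepping, model, family)
  • 0x08: Feature Flags
  • 0x0C: reserved (=0)
  • 0x10: reserved (=0)

BP = Bootstrap Processor (1=yes, 0=no), EN = Enabled (1=yes, 0=no)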

Our ‘mpinfo.cpp’ utility
  • We created a Linux utility that will display the system-information contained in the MP Configuration Table (in hex format)
  • You can refer to the ‘MP Specification 1.4’ document (online) to interpret this display
  • This utility needs a device-driver ‘dram.c’ to be pre-installed (so that it can directly access the system’s memory)
A processor’s Local-APIC
  • The purpose of each processor’s APIC is to allow the CPUs in a multiprocessor system to send messages to one another and to manage the delivery of the interrupt-requests from the various peripheral devices to one (or more) of the CPUs in a dynamically programmable way
  • Each processor’s Local-APIC has a variety of registers, all ‘memory mapped’ to paragraph-aligned addresses within the 4KB page at physical-address 0xFEE00000
Local-APIC’s register-space

[Figure: the 4GB physical address-space, with RAM starting at 0x00000000 and the Local-APIC’s registers at 0xFEE00000]

Analogies with the PIC
  • Among the registers in a Local-APIC are these (which had analogues in the older 8259 PIC’s design):
    • IRR: Interrupt Request Register (256 bits)
    • ISR: In-Service Register (256 bits)
    • TMR: Trigger-Mode Register (256 bits)
  • For each of these, its 256 bits are divided among eight 32-bit register addresses
New way to do ‘EOI’
  • Instead of using a special End-Of-Interrupt command-byte, the Local-APIC contains a dedicated ‘write-only’ register (named the EOI Register) which an Interrupt Handler writes to when it is ready to signal an EOI

# issuing EOI to the Local-APIC

        mov     $0xFEE00000, %ebx       # address of the cpu’s Local-APIC
        movl    $0, %fs:0xB0(%ebx)      # write any value into EOI register

# Here we assume segment-register FS holds the selector for a segment-descriptor
# for a ‘writable’ 4GB-size expand-up data-segment whose base-address equals 0

Each CPU has its own timer!
  • Four of the Local-APIC registers are used to implement a programmable timer
  • It can privately deliver a periodic interrupt (or one-shot interrupt) just to its own CPU
    • 0xFEE00320: Timer Vector register
    • 0xFEE00380: Initial Count register
    • 0xFEE00390: Current Count register
    • 0xFEE003E0: Divider Configuration register
Timer’s Local Vector Table

[Figure: bit-fields of the timer’s Local Vector Table register at 0xFEE00320]
  • bit 17: MODE (0=one-shot, 1=periodic)
  • bit 16: MASK (0=unmasked, 1=masked)
  • bit 12: BUSY (0=not busy, 1=busy)
  • bits 7-0: Interrupt ID-number

Timer’s ‘Divide-Configuration’

[Figure: bit-fields of the Divide Configuration register at 0xFEE003E0]
  • bits 31-4: reserved (=0)
  • bit 2: 0
  • Divider-Value field (bits 3, 1, and 0):
    • 000 = divide by 2
    • 001 = divide by 4
    • 010 = divide by 8
    • 011 = divide by 16
    • 100 = divide by 32
    • 101 = divide by 64
    • 110 = divide by 128
    • 111 = divide by 1

Initial and Current Counts

  • 0xFEE00380: Initial Count Register (read/write)
  • 0xFEE00390: Current Count Register (read-only)

When the timer is programmed for ‘periodic’ mode, the Current Count is automatically reloaded from the Initial Count register, then counts down with each CPU bus-cycle, generating an interrupt when it reaches zero

Using the timer’s interrupts
  • Set up your desired Initial Count value
  • Select your desired Divide Configuration
  • Set up the APIC-timer’s LVT register with your desired interrupt-ID number and counting mode (‘periodic’ or ‘one-shot’), and clear the LVT register’s ‘Mask’ bit to initiate the automatic countdown operation
In-class exercise #1
  • Run the ‘cpuid.cpp’ Linux application (on our course website) to see if the CPUs in our classroom implement HyperThreading (i.e., multiple logical processors in a cpu)
  • Then run the ‘mpinfo.cpp’ application, to see if the MP Base Configuration Table has entries for more than one processor
  • If both results hold true, then we can write our own multiprocessing software in H235!
In-class exercise #2
  • Run the ‘apictick.s’ demo (on our CS 630 website) to observe the APIC’s ‘periodic’ interrupt-handler drawing ‘T’s onscreen
  • It executes for ten milliseconds (the 8254 is used here to create that timed delay)
  • Try reprogramming the APIC’s Divider Configuration register, to cut the interrupt frequency in half (or perhaps to double it)