Processors with Hyper-Threading and AliRoot performance
Jiří Chudoba, FZÚ, Prague

Motivation
  • How to choose the optimal hardware
  • Contributions are counted in SPECInts
  • But how to measure it?

www.spec.org: CPU2000 tests – 150 USD

Many results are published, but:

    • The hardware is often not identical to our machines
    • Results depend on OS, compiler, …

chudoba@fzu.cz

HP ProLiant DL360 G3

2 × Xeon 2.8 GHz with HT, 512 KB cache, 2 × 18.2 GB Ultra320 hot-pluggable drives

http://h18004.www1.hp.com/products/servers/proliantdl360/description-g3.html

SPECint2000 results:

  • ProLiant DL560, 2.8 GHz Intel Xeon MP (2 MB L3 cache), HT disabled in BIOS: SPECint2000 1247 (SPECint_base2000 1196)
  • ProLiant DL360 G3, 3.06 GHz Intel Xeon, 512 KB L2, 1 MB L3, no HT: SPECint2000 1258
  • Dell PowerEdge 2650, 2.8 GHz Xeon, 512 KB L2 cache, HT disabled: SPECint2000 907
  • Intel D875PBZ motherboard (2.80C GHz PIV, HT maybe on, the default status): SPECint2000 1204
  • Intel D875PBZ motherboard (AA-301) (2.8E GHz, 1 MB cache, HT maybe on): SPECint2000 1269


With Hyper-Threading, each physical processor appears as 2 logical processors: the architectural state is duplicated for each logical processor, while both share one set of processor execution resources.

Details on http://www.intel.com/technology/hyperthread/

10:49pm up 3 days, 4:03, 1 user, load average: 0.00, 0.00, 0.00
31 processes: 30 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU2 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU3 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
Mem: 2069804K av, 1386140K used, 683664K free, 0K shrd, 145584K buff
Swap: 2097112K av, 0K used, 2097112K free 1006064K cached


Not Doubled Performance

Note that a CPU that supports hyper-threading is not going to provide performance comparable to two physical processors rated at the same speed. The simple reason is that the two logical processors that make up a hyper-threaded CPU have to share resources, namely the execution engine, cache, and access to the system bus.

Intel promises 10-30% performance increase ...

Hyper-Threading – not always better

… but this is not always the case:

http://www.2cpu.com/Hardware/ht_analysis/3.html

Other tests

Unix Benchmark Utility v.0.3, author: Sergei Viznyuk <sv@phystech.com>

Klaus Schossmaier reported (numbers per CPU):

  Opteron   1.4 GHz   74955
            1.8 GHz   97749
  Xeon      2.4 GHz   88064
  Itanium   1.0 GHz   66714


Results for AliRoot

Real time, with the ratio to the noHT time for 2 parallel jobs in parentheses:

                     noHT                  HT                    HT with scheduling
2 jobs, parallel     297 ± 1 s   (1)       337 ± 48 s  (1.13)    296 ± 2 s  (1)
4 jobs, parallel     596 ± 10 s  (2)       515 ± 5 s   (1.73)    514 ± 3 s  (1.73)
2+2 jobs             594 s       (2)       674 s       (2.26)    592 s      (2)

Two snapshots of the CPU states with two CPUs busy:

CPU0 states: 100.0% user, 0.0% system, 0.0% nice, 0.0% idle
CPU1 states: 100.0% user, 0.0% system, 0.0% nice, 0.0% idle
CPU2 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU3 states: 0.0% user, 0.1% system, 0.0% nice, 99.0% idle

CPU0 states: 100.0% user, 0.0% system, 0.0% nice, 0.0% idle
CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU2 states: 100.0% user, 0.0% system, 0.0% nice, 0.0% idle
CPU3 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
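The ratios quoted with the AliRoot times can be checked with a few lines of arithmetic. The sketch below assumes that the three rows of times belong to the 2-job, 4-job and 2+2-job runs in that order, and that the reference is the noHT time for 2 parallel jobs; the dictionary layout and variable names are illustrative only.

```python
# Mean real times in seconds, copied from the AliRoot results.
# Assumption: rows are the 2-job, 4-job and 2+2-job runs, in that order.
times = {
    "2 jobs": {"noHT": 297, "HT": 337, "HT sched": 296},
    "4 jobs": {"noHT": 596, "HT": 515, "HT sched": 514},
    "2+2 jobs": {"noHT": 594, "HT": 674, "HT sched": 592},
}

# Reference: 2 parallel jobs on the 2 physical CPUs without HT.
reference = times["2 jobs"]["noHT"]

# Real time of every configuration relative to the reference.
relative = {
    row: {col: t / reference for col, t in cols.items()}
    for row, cols in times.items()
}

# Throughput gain from HT when 4 jobs run in parallel:
# the same work finishes in 515 s instead of 596 s.
gain = times["4 jobs"]["noHT"] / times["4 jobs"]["HT"] - 1

for row, cols in relative.items():
    print(row, {col: round(r, 2) for col, r in cols.items()})
print(f"HT gain for 4 parallel jobs: {gain:.1%}")
```

Under these assumptions the 4-job HT run comes out at about 1.73 times the reference, and the gain is consistent with the roughly 15% quoted in the conclusions.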

CERN RH 7.3.3, kernel 2.4.20, AliRoot v4-01-05, 1000 tracks HIJINGParam, Real time

ftp://ftp.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/ + http://freshmeat.net/projects/sched-utils/
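The "HT with scheduling" runs relied on the CPU-affinity kernel patch and the sched-utils tools linked above. As a rough illustration of the same idea on a current Linux system, the sketch below picks one logical CPU per physical core and offers a helper that pins a process to it. It is not the setup used for these measurements: `one_cpu_per_core`, `pin_to` and the example topology are hypothetical, and the mapping of logical CPUs to cores is an assumption that has to be read from the actual machine.

```python
import os


def one_cpu_per_core(siblings):
    """Pick the lowest-numbered logical CPU from each group of HT siblings.

    `siblings` maps a logical CPU number to the set of logical CPUs that
    share its physical core (hypothetical input format for this sketch).
    """
    chosen = []
    seen = set()
    for cpu in sorted(siblings):
        if cpu in seen:
            continue  # a sibling of this CPU was already selected
        chosen.append(cpu)
        seen.update(siblings[cpu])
    return chosen


def pin_to(cpu, pid=0):
    """Restrict process `pid` (0 = the caller) to one logical CPU.

    Uses os.sched_setaffinity, which is only available on Linux;
    on other platforms this helper does nothing.
    """
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(pid, {cpu})


# Illustrative topology only: CPU0/CPU2 share one core, CPU1/CPU3 the other.
example = {0: {0, 2}, 1: {1, 3}, 2: {0, 2}, 3: {1, 3}}
targets = one_cpu_per_core(example)
print(targets)
```

With two jobs, each would be pinned to a different entry of `targets`, so that they never compete for the execution resources of the same physical processor.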


Conclusions

  • CPU resource estimates are probably very rough
  • HT can add 15% in performance, but in some cases it adds to the real time instead
  • Publicly available results of some of our standard CPU tests would help

(an update of the Root benchmark tests?)


Root benchmark stress results:

Root 3.10.02, gcc 3.2, -O

4 parallel jobs, 9000 events, HT: 512

4 parallel jobs, 9000 events, noHT: 732

2 parallel jobs, 9000 events, HT: 733

Klaus Schossmaier
