Performance Model for Future Multicore Process Designs

Yipkei Kwok, 02/06/2008

A Non-Work-Conserving Operating System Scheduler For SMT Processors
  • Authors: A. Fedorova et al.
  • Calculates the optimal level of parallelism for SMT processors at run time
  • Analytical model
  • Estimates the workload's IPC for a given degree of concurrency
  • First, identify the performance bottleneck
  • Suppressing L2 misses yields the largest performance improvement
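
To make the idea of estimating IPC for each degree of concurrency and then choosing the best one concrete, here is a minimal Python sketch. It is not the model from Fedorova et al.: the formula, the quadratic contention term, and all constants (`base_mpi`, `contention`, `perfect_cache_cpi`, `miss_cost`) are hypothetical values chosen only so that shared-L2 misses grow with the number of running threads.

```python
def misses_per_instruction(n, base_mpi=0.002, contention=0.5):
    """Hypothetical L2 misses per instruction when n threads share the cache.

    The quadratic growth is an illustrative assumption, not a measured law.
    """
    return base_mpi * (1.0 + contention * (n - 1) ** 2)


def estimate_ipc(n, perfect_cache_cpi=1.5, miss_cost=200.0):
    """Toy aggregate IPC estimate for n concurrently running threads.

    Per-thread CPI = CPI with a perfect cache + (misses/instr * miss penalty).
    Aggregate IPC is then n threads divided by that per-thread CPI.
    """
    cpi = perfect_cache_cpi + misses_per_instruction(n) * miss_cost
    return n / cpi


def best_concurrency(max_threads):
    """Return the thread count in 1..max_threads with the highest estimate."""
    return max(range(1, max_threads + 1), key=estimate_ipc)


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, round(estimate_ipc(n), 3))
    print("best N:", best_concurrency(4))
```

With these made-up constants the estimate peaks below the maximum thread count, which is the situation in which a non-work-conserving scheduler would deliberately leave a hardware context idle.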
A Non-Work-Conserving Operating System Scheduler For SMT Processors
  • Factors
    • N
    • perf_cache_CPI(N)
    • L2_RMR
    • L2_WMR
    • L2_WBR_R
    • L2_WBR_W
    • WSC
    • L2_MCOST
A Non-Work-Conserving Operating System Scheduler For SMT Processors
  • Two-phase scheduling
    • Preparation phase
      • Collect model inputs under full parallelism
      • Using hardware counters
      • Until the 100 millionth instruction retires
    • Optimization phase
      • Estimate optimal N
      • Enforce it
      • Until …
        • New locality phase
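
The two phases above can be sketched as a small state machine. This is an illustrative outline only: `TwoPhaseScheduler`, `on_sample`, and the `estimate_optimal_n` hook are hypothetical names, the counter dictionary is a stand-in for real hardware-counter reads, and returning to the preparation phase when a new locality phase is detected is an assumed reading of the last bullet.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

PREP_INSTRUCTIONS = 100_000_000  # preparation length stated on the slide


@dataclass
class TwoPhaseScheduler:
    max_threads: int
    # Hypothetical hook: maps accumulated counter values to an estimated N.
    estimate_optimal_n: Callable[[Dict[str, int]], int]
    active_threads: int = 0
    phase: str = "preparation"
    retired: int = 0
    samples: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        # The preparation phase runs under full parallelism.
        self.active_threads = self.max_threads

    def on_sample(self, retired_delta, counters, new_locality_phase=False):
        """Feed one sampling interval of (simulated) counter data."""
        if self.phase == "preparation":
            self.retired += retired_delta
            for name, value in counters.items():
                self.samples[name] = self.samples.get(name, 0) + value
            if self.retired >= PREP_INSTRUCTIONS:
                # Optimization phase: estimate N, then enforce it.
                n = self.estimate_optimal_n(self.samples)
                self.active_threads = max(1, min(self.max_threads, n))
                self.phase = "optimization"
        elif new_locality_phase:
            # Assumption: a new locality phase restarts data collection.
            self.phase = "preparation"
            self.retired = 0
            self.samples = {}
            self.active_threads = self.max_threads
```

In use, the caller would invoke `on_sample` once per sampling interval and read `active_threads` to decide how many hardware contexts to keep busy.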
Limitations
  • 3-56% improvement, but …
  • Empirical model based on UltraSparc T1
  • SMT only
    • But expandable with, hopefully, reasonable effort
  • Once expanded, usable for performance prediction
  • What is needed?
    • Extra factors?
What new factors?
  • Depends on systems to model
  • Shared-memory machine
  • Threaded parallel workloads
  • SMP of CMPs
  • SMT per core
What new factors?
  • Architecture
    • Homogeneous/heterogeneous cores
      • Difference in speed or functionality
    • Level of cache sharing
    • Interconnects
What new factors?
  • Parameters
    • Number of cores
    • Cache size
    • Degree of set-associativity
    • Number of cores sharing a cache
    • Bus, ring, crossbar, tiny network
    • Switching and flow-control mechanisms
    • Routing algorithms
    • Fault tolerance techniques
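
One way to carry the parameters listed above into an expanded model is a single validated configuration record. The sketch below is only a possible layout: `ModelParams`, `Interconnect`, the field names, and the rule that the core count must divide evenly among shared caches are all assumptions made for illustration, not part of the presented work.

```python
from dataclasses import dataclass
from enum import Enum


class Interconnect(Enum):
    BUS = "bus"
    RING = "ring"
    CROSSBAR = "crossbar"
    TINY_NETWORK = "tiny-network"


@dataclass(frozen=True)
class ModelParams:
    num_cores: int
    cache_size_kib: int
    associativity: int
    cores_per_cache: int
    interconnect: Interconnect
    # Left as free-form labels because the slides do not enumerate options.
    switching: str = "unspecified"
    routing: str = "unspecified"
    fault_tolerance: str = "unspecified"

    def __post_init__(self):
        for name in ("num_cores", "cache_size_kib",
                     "associativity", "cores_per_cache"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        # Assumed constraint: every shared cache serves the same core count.
        if self.num_cores % self.cores_per_cache != 0:
            raise ValueError("num_cores must be a multiple of cores_per_cache")

    def shared_cache_count(self):
        """Number of shared caches implied by the sharing degree."""
        return self.num_cores // self.cores_per_cache
```

For example, `ModelParams(8, 3072, 12, 4, Interconnect.CROSSBAR).shared_cache_count()` describes eight cores split across two shared caches.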
What new factors?
  • Protocols
    • Cache coherence protocol for dedicated/semi-shared caches
  • Algorithms
    • Block replacement algorithm
    • Algorithms of cache coherence and data consistency protocols
Potential uses
  • Performance prediction for future processors
  • Scheduler
Does similar work exist?
  • Multi2Sim (2007)
    • Framework simulating the system working as a whole
    • Yet application-only simulation
    • Evaluate multicore-multithreaded processors
    • 3 major components simulated
      • Core
      • Cache hierarchy
      • Interconnect
    • Note: source code available
Enough?
  • Limitations
    • Homogeneous cores
    • Topology
      • Bus only
      • With variable bus width, though