Fast multi threading on shared memory multi processors
This presentation is the property of its rightful owner.
Sponsored Links
1 / 11

Fast Multi-Threading on Shared Memory Multi-Processors PowerPoint PPT Presentation


  • 63 Views
  • Uploaded on
  • Presentation posted in: General

Fast Multi-Threading on Shared Memory Multi-Processors. Joseph Cordina B.Sc. Computer Science and Physics Year IV. Aims of Project. Implementation of MESH, a user level threads package, on to a shared memory multi-processor machine

Download Presentation

Fast Multi-Threading on Shared Memory Multi-Processors

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Fast multi threading on shared memory multi processors

Fast Multi-Threading on Shared Memory Multi-Processors

Joseph Cordina

B.Sc. Computer Science and Physics

Year IV


Aims of project

Aims of Project

  • Implementation of MESH, a user level threads package, on to a shared memory multi-processor machine

  • Take advantage of concurrent processing while maintaining the advantages of fine grain user level thread scheduling with low latency context switching

  • Enable concurrent inter-process communication on same machine and on an Ethernet network through the NIC


What is mesh

What Is MESH ?

  • A tightly coupled fine grain uni-processor user level thread scheduler for the C language

  • MESH provides an environment in which to manage user level threads

  • Makes use of inline active context switching relying on the compiler knowledge of the registers in use at any one time (min.c.s/w 55ns)

  • Direct hardware access close to maximum theoretical limit when using jumbo frames

  • Communication API supports message pools, ports and poolports


Concurrent resource access

Concurrent Resource Access

  • Scheduler entry points are explicit

  • Scheduler entry occurs concurrently when using more than one thread of execution

  • Access to global data needs to be protected from concurrency

    • Data read access does not need to be protected

    • Data write access cannot occur concurrently with data reads

  • Spin-lock protected resources with small critical section providing minimum busy wait time

  • Spin on read to preserve cache


Scheduling in smp mesh

Scheduling in SMP-MESH

  • Shared run queue to store user level threads descriptors at 32 levels of priority

  • Multiple Kernel level threads access it to retrieve threads and place new ones and can lead to data corruption

  • Lock protected run queue forces synchronisation

  • Fine thread granularity increases contention for run queue lock


Scheduling in smp mesh 2

Scheduling in SMP-MESH (2)

Linux does not provide a private memory area for each Kernel Level Thread unlike SunOS LWPs

  • Kernel level threads need knowledge of self identification achieved through comparing stack space

  • Kernel level thread should equal number of processors for best utilization


Well behaved idling

Well Behaved Idling

  • Upon finding the run queue empty, kernel level threads sleep in the kernel giving up the processor for other applications to execute

  • Sleeping on a semaphore removes risk of lost wakeup unlike signals and message passing

  • Upon re-awaking, the new user level threads are passed directly to the sleeping thread, without invoking run queue access


Load balancing

Load Balancing

  • No Kernel Level Thread is idle when a user level thread is on the shared run-queue

  • The run-queue’s FIFO structure ensures that oldest threads will be executed first

  • Cache consistency is not ensured when using shared run-queue


Communication in smp mesh

Communication in SMP-MESH

  • Inter thread communication on same system and in between different systems

  • All instances of message pools, ports and poolports have a private lock providing maximum concurrent communication

  • Consecutive memory needs to be protected when creating messages

  • Message transmission to NIC using a lock, reception from NIC using no lock


Results

Results

  • 500,000 context switches at differing thread granularity

  • Contention for shared resources at fine thread granularity gives worse performance on an SMP machine than on a uni-processor machine


Conclusion

Conclusion

  • SMP-MESH takes advantage of multi-processors for fine grain multi-threading

  • Concurrency is encouraged in all areas unless risk of data corruption exists

  • Overheads on the uni-processor MESH are expected yet counter balanced as number of processors are increased

  • Considerable speedup is available at minimum cost, the main disadvantage is requiring more careful synchronisation in application design


  • Login