
Parallel ClockDesigner

Parallel ClockDesigner. Andrew Menard 18.377 Final Project Spring 2002. The Problem. Distribute a clock signal from one source to a large number of latches using a tree. Delay must be the same on every path through the tree. Each branch must satisfy minimum and maximum size constraints.


Presentation Transcript


  1. Parallel ClockDesigner. Andrew Menard, 18.377 Final Project, Spring 2002

  2. The Problem
     • Distribute a clock signal from one source to a large number of latches using a tree.
     • Delay must be the same on every path through the tree.
     • Each branch must satisfy minimum and maximum size constraints.
     • Need to minimize resources used.

  3. Initial algorithm
     • Add latches to a net until it violates a constraint, then create another net.
     • The resulting solution is legal but suboptimal.
     • This legal solution can be handed to a refinement optimizer, which will improve it.
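The greedy construction on this slide can be sketched in a few lines. This is a minimal illustration, not the project's code: the names `Net`, `build_initial_nets`, and `max_load` are hypothetical, and the "constraint" is assumed to be a single upper bound on the summed load of a net (the minimum-size constraint is not modeled).

```python
from dataclasses import dataclass, field


@dataclass
class Net:
    """One branch of the clock tree; holds the loads of its latches."""
    latches: list = field(default_factory=list)

    def load(self):
        return sum(self.latches)


def build_initial_nets(latch_loads, max_load):
    """Fill a net until the next latch would exceed max_load, then open a new net.

    ASSUMPTION: the only constraint is an upper bound on total load per net.
    """
    nets = []
    for load in latch_loads:
        if load > max_load:
            raise ValueError("a single latch exceeds max_load")
        # Open a new net if there is none yet or the current one would overflow.
        if not nets or nets[-1].load() + load > max_load:
            nets.append(Net())
        nets[-1].latches.append(load)
    return nets


nets = build_initial_nets([3, 3, 3, 3, 3], max_load=7)
print([n.latches for n in nets])
```

Every net produced this way respects the bound, but the last net is often nearly empty, which is the kind of waste the refinement step is meant to reduce.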

  4. Serial refinement optimizer
     • For each pair of nets, check whether any combination of latches can be swapped between them to make the larger net smaller.
     • Repeat several times over the set of nets, so that every pair of nets is refined several times.
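A pairwise refinement pass might look like the sketch below. It is deliberately simplified and rests on assumptions not stated in the slides: nets are plain lists of latch loads, "size" means summed load, and only single latch-for-latch swaps are tried rather than arbitrary combinations. The names `refine_pair` and `refine` are hypothetical.

```python
from itertools import combinations


def refine_pair(a, b):
    """Apply the single latch swap that most reduces the larger net's load.

    Returns True if a swap was applied, False if no swap helps.
    ASSUMPTION: nets are lists of loads and only one-for-one swaps are tried.
    """
    sa, sb = sum(a), sum(b)
    best = max(sa, sb)
    best_swap = None
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            new_max = max(sa - x + y, sb - y + x)
            if new_max < best:
                best, best_swap = new_max, (i, j)
    if best_swap is None:
        return False
    i, j = best_swap
    a[i], b[j] = b[j], a[i]
    return True


def refine(nets, passes=3):
    """Visit every pair of nets, repeating for a few passes or until nothing improves."""
    for _ in range(passes):
        improved = False
        for a, b in combinations(nets, 2):
            improved |= refine_pair(a, b)
        if not improved:
            break
    return nets


nets = refine([[5, 5], [1, 1]])
print(nets, [sum(n) for n in nets])
```

Visiting all pairs is quadratic in the number of nets; restricting each net to a list of adjacent nets, as the parallel version does, keeps the work per net bounded.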

  5. Parallel refinement optimizer
     • Mother process spawns one child process per processor.
     • Mother process sends a net and a list of adjacent nets to each child, then repeats.
     • Child process spins waiting for data, optimizes that net against its list of adjacent nets, then waits again.
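The mother/child dispatch pattern can be illustrated with a shared work queue. The sketch below makes several assumptions: it uses Python threads as a stand-in for the child processes, the children block on the queue instead of spinning, and `run_workers`, `optimize`, and the job tuple layout are hypothetical names chosen for the example.

```python
import os
import queue
import threading


def run_workers(jobs, optimize, n_workers=None):
    """Dispatch (net_id, adjacent_ids) jobs to a fixed pool of children.

    ASSUMPTION: threads stand in for child processes; optimize is any
    callable taking (net_id, adjacent_ids).
    """
    n_workers = n_workers or os.cpu_count() or 1
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def child():
        while True:
            job = work.get()      # blocks (rather than spins) until data arrives
            if job is None:       # sentinel: the mother has no more work
                break
            net_id, adjacent_ids = job
            outcome = optimize(net_id, adjacent_ids)
            with results_lock:
                results.append((net_id, outcome))

    children = [threading.Thread(target=child) for _ in range(n_workers)]
    for t in children:
        t.start()
    for job in jobs:              # mother hands out one net at a time
        work.put(job)
    for _ in children:            # one sentinel per child
        work.put(None)
    for t in children:
        t.join()
    return results


jobs = [(0, [1]), (1, [0, 2]), (2, [1])]
print(sorted(run_workers(jobs, lambda net, adj: len(adj), n_workers=2)))
```

Because children pull the next job as soon as they finish, a slow pair of nets delays only the child working on it, although the mother still has to wait for the slowest child at the end.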

  6. Complications
     • Child processes have to lock the nets they are working on; locking is a significant time sink.
     • Different pairs of nets can take wildly different amounts of time, which causes some serialization at the end of each job.
     • Starting threads is expensive.
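One common way to lock the nets involved in a refinement step is to keep one lock per net and always acquire them in a fixed order, so that two children working on overlapping pairs cannot deadlock. The slides do not say how locking was implemented; the `NetLocks` class and its `hold` method below are hypothetical, and the ordered-acquisition rule is an assumption added for the sketch.

```python
import threading
from contextlib import ExitStack


class NetLocks:
    """One lock per net, acquired in ascending net-id order.

    ASSUMPTION: nets are identified by integer ids 0..n_nets-1.
    """

    def __init__(self, n_nets):
        self._locks = [threading.Lock() for _ in range(n_nets)]

    def hold(self, *net_ids):
        """Return a context manager that holds the locks for the given nets."""
        stack = ExitStack()
        # Sorting gives every child the same acquisition order.
        for i in sorted(set(net_ids)):
            stack.enter_context(self._locks[i])
        return stack


locks = NetLocks(3)
with locks.hold(2, 0):
    # Nets 0 and 2 are held here; net 1 remains free for other children.
    print([lock.locked() for lock in locks._locks])
print([lock.locked() for lock in locks._locks])
```

Each acquisition costs time even when uncontended, which is consistent with the slide's remark that locking is a significant time sink.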

  7. Results
     • 20% speedup in the 4-processor version over the serial version.
     • Negligible change in memory.
     • Scheduler matters a lot.
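To put the speedup figure in context, the short calculation below converts it into a parallel efficiency and, using Amdahl's law S = 1 / ((1 - p) + p / n), into an implied parallel fraction p. It assumes that "20% speedup" means the parallel run is 1.2 times as fast as the serial one; the slides do not define the term, and Amdahl's law is only a rough model here because locking and thread start-up add overhead rather than a fixed serial part. The function names are hypothetical.

```python
def parallel_efficiency(speedup, n_procs):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / n_procs


def amdahl_parallel_fraction(speedup, n_procs):
    """Solve S = 1 / ((1 - p) + p / n) for p."""
    return (1 - 1 / speedup) / (1 - 1 / n_procs)


# ASSUMPTION: "20% speedup" is read as a speedup factor of 1.2.
S, n = 1.2, 4
print(f"efficiency: {parallel_efficiency(S, n):.2f}")
print(f"implied parallel fraction: {amdahl_parallel_fraction(S, n):.3f}")
```

Under that reading, the four processors are used at roughly 30% efficiency, which fits the slide's observations about locking cost and end-of-job serialization.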
