
RAMP-White

RAMP-White. Derek Chiou The University of Texas at Austin. High Level Characteristics. Coherent distributed shared memory machine Scalable at the same level as other RAMP machines 1K eventual target Intended to be ISA/Architecture independent Use different cores


Presentation Transcript


  1. RAMP-White • Derek Chiou • The University of Texas at Austin

  2. High Level Characteristics • Coherent distributed shared memory machine • Scalable at the same level as other RAMP machines • 1K eventual target • Intended to be ISA/Architecture independent • Use different cores • All RAMP efforts are intended to be ISA independent • Intended to integrate components from other RAMP participants • A testbed for sharing of IP

  3. Our Additions • New code in Bluespec rather than Verilog/VHDL • More configurable • That’s what my group is using • Embedded PowerPC as one core • Leon was decided on after we started and only recently boots Debian • Wanted to determine issues of different cores • My research needs fast cores • Eventually an SMP OS; initially a multi-OS shared space

  4. Issues • Architecture • Implementation • Operating System • Sharing IP • Language • Maturity • Infrastructure (CVS, etc.)

  5. Three Stages (for Implementation Ease) • Incoherent shared memory • No hardware global cache, just global shared memory support • Optimal cache for local memory • However, software can maintain coherence if necessary • Network virtual memory • Run a simulator on top of the processor • Ring-based coherence (scalable bus) • Requires a coherent cache • Running essentially a snoopy protocol • True coherence engine not required • But, very restricted communication • Good for testing, modeling many targets • General network-based coherence • Requires general coherence engine
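
The second stage (ring-based coherence running "essentially a snoopy protocol") can be illustrated with a toy model. This is only a sketch under stated assumptions, not the RAMP-White design: the `Node` class, the write-through policy, and the functions `ring_write` and `ring_read` are hypothetical names invented for illustration. An invalidate message travels once around a unidirectional ring, each node snoops it and drops its stale copy, and the write completes when the message returns to the sender.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One ring stop with a tiny cache (address -> value). Hypothetical model."""
    node_id: int
    cache: dict = field(default_factory=dict)

    def snoop_invalidate(self, addr):
        """Drop our copy of addr if present; report whether we held one."""
        if addr in self.cache:
            del self.cache[addr]
            return True
        return False


def ring_write(nodes, memory, writer, addr, value):
    """Send an invalidate once around a unidirectional ring, then write.

    The message visits every other node in ring order and stops when it
    would arrive back at the writer. Returns the ids of the nodes whose
    copies were invalidated.
    """
    n = len(nodes)
    invalidated = []
    hop = (writer + 1) % n
    while hop != writer:
        if nodes[hop].snoop_invalidate(addr):
            invalidated.append(hop)
        hop = (hop + 1) % n
    nodes[writer].cache[addr] = value
    memory[addr] = value  # write-through: an assumption that keeps the sketch small
    return invalidated


def ring_read(nodes, memory, reader, addr):
    """Read from the local cache, falling back to shared memory on a miss."""
    cache = nodes[reader].cache
    if addr not in cache:
        cache[addr] = memory.get(addr, 0)
    return cache[addr]
```

Because every request must pass every node, no directory or general coherence engine is needed, which matches the slide's point; the cost is the "very restricted communication" the slide also notes.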

  6. Generalized Architecture • [Block diagram. Components: Proc, $ (cache), Mem, MC, IU (Intersection Unit), NIU (Network Interface Unit), PLB, OPB bridge. Annotations: ISA dependent, ISA independent]

  7. Intersection Unit • Sits between the Processor (cache), PLB bus and NIU • Processor interface: Slave, Eventually snoop • Network interface: Master, Slave • Memory interface: Master, Eventually snoop • Hooks for coherency engine • Incoherent version is a special case • Programmable regions: Global (local and remote), Local • [Block diagram: Proc, $, Mem, MC, IU, NIU, PLB, OPB bridge]
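
The "programmable regions" idea on the Intersection Unit slide can be sketched as an address decode: an access that falls in a global region is served locally if this node is the region's home and forwarded to the NIU otherwise, while everything outside the global regions stays local. The names below (`GlobalRegion`, `Target`, `route`) and the per-region `home_node` field are assumptions made for illustration; the slides do not say how regions are encoded.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    LOCAL_MEMORY = "local memory (via PLB)"
    REMOTE_NODE = "remote node (via NIU)"


@dataclass(frozen=True)
class GlobalRegion:
    """A programmable address window that is globally visible (hypothetical)."""
    base: int
    size: int
    home_node: int  # node whose memory backs this window (assumed field)

    def contains(self, addr):
        return self.base <= addr < self.base + self.size


def route(addr, my_node, global_regions):
    """Decide where an access goes, returning (target, node id).

    Global regions are checked first; any address not covered by one is
    treated as purely local.
    """
    for region in global_regions:
        if region.contains(addr):
            if region.home_node == my_node:
                return Target.LOCAL_MEMORY, my_node   # global, but homed here
            return Target.REMOTE_NODE, region.home_node
    return Target.LOCAL_MEMORY, my_node               # plain local region
```

For example, with two 4 KiB global windows homed on nodes 0 and 1, node 0 serves the first window from its own memory and sends accesses to the second window out through the NIU.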

  8. Network Interface Unit • Split into two components • Msg composition/Queuing • Net transmit/receive • Insert/extract for ring • Intended to permit other transmit/receive • One input/one output • Creates a simple unidirectional ring • Can interface to more advanced fabrics • [Block diagram: Proc, $, Mem, MC, IU, NIU, PLB, OPB bridge]
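
The two NIU halves described on this slide (message composition/queuing, and ring insert/extract with one input and one output) can be modeled in a few lines. This is a behavioral sketch only; `Message`, `RingNIU`, `run_ring`, and the rule that pass-through traffic takes priority over local inserts are assumptions, not details from the slides.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    src: int
    dst: int
    payload: bytes


class RingNIU:
    """Toy network interface for a unidirectional ring (hypothetical model)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.tx_queue = deque()  # composed messages waiting for a free slot
        self.rx_queue = deque()  # messages extracted for this node

    def send(self, dst, payload):
        """Message composition/queuing half."""
        self.tx_queue.append(Message(self.node_id, dst, payload))

    def step(self, incoming):
        """Transmit/receive half: one input slot in, one output slot out."""
        if incoming is not None:
            if incoming.dst != self.node_id:
                return incoming               # forward; local traffic waits
            self.rx_queue.append(incoming)    # extract from the ring
        if self.tx_queue:
            return self.tx_queue.popleft()    # insert into the free slot
        return None                           # empty slot


def run_ring(nius, cycles):
    """Advance the ring; slots[i] is the link feeding node i."""
    slots = [None] * len(nius)
    for _ in range(cycles):
        outputs = [niu.step(slots[i]) for i, niu in enumerate(nius)]
        # Node i-1's output becomes node i's input (index -1 wraps around).
        slots = [outputs[i - 1] for i in range(len(nius))]
```

Keeping the queueing half separate from the transmit/receive half is what would let a different fabric replace the ring without touching message composition, as the slide suggests.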

  9. Operating System • Started by looking at PowerPC • Wanted an SMP OS • Knew we didn’t have coherent cache • But, also missing TLB Invalidation & OpenPIC (interprocessor interrupts, bring-up) • But, do have load-reservation/store-conditional instructions • Leon is SMP-capable, so should avoid these issues • Starting with separate OSes • Region of memory is global • (no Block Address Translation (BAT) so need to manage global pages) • mmap
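
The "separate OSes sharing a global region through mmap" approach can be approximated on an ordinary host. In the sketch below an ordinary file stands in for the global memory region, and two independent mappings stand in for two OS images; on the real hardware the mapping would instead come from a device or physical-memory interface that the slides do not specify. The helper names and the big-endian word format (chosen to resemble a PowerPC target) are assumptions.

```python
import mmap
import os
import struct

REGION_SIZE = mmap.PAGESIZE  # hypothetical size: one page


def open_global_region(path, size=REGION_SIZE):
    """Map a shared region read/write; the file stands in for global memory."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        return mmap.mmap(fd, size)  # default mapping is shared and writable
    finally:
        os.close(fd)  # the mapping keeps its own handle to the file


def write_word(region, offset, value):
    """Store a 32-bit big-endian word (byte order is an assumption)."""
    struct.pack_into(">I", region, offset, value)


def read_word(region, offset):
    """Load a 32-bit big-endian word."""
    return struct.unpack_from(">I", region, offset)[0]
```

A word written through one mapping is visible through the other. Note that on the incoherent Phase 1 hardware, software would additionally have to manage caching of these pages itself, which this host-side sketch does not model.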

  10. Status: Hari Angepat • Bluespec learned • NIU code complete and unit tested • IU code complete, being tested on XUP • 2 PowerPC processors • Supports interfaces • Processor Slave • PLB Master • NIU • Hardware intended to target different ISAs • Some preliminary OS work • SMP Linux investigation • Multi-image mmap interface currently targeted • Targets Phase 1 (incoherent shared memory) • 2 IUs, 1 MC with an arbiter

  11. Our Long Term Plans • Phase 1, XUP, complete end of 1Q07 • With multi-OS support (with help from Stanford?) • Phase 2, 1 BEE2 board, hopefully 2Q07 • Larger scalability, BEE2, Berkeley MC, Leon?, RDL? • Phase 3, hopefully 4Q07 • Arbitrary network, cache coherency engine, SMP OS?, Leon?, RDL? • x86 CMP/SMP on top of RAMP-White • Full cycle accurate (separate timing model) • RAMP-White executes functional model in parallel • Heterogeneous hosts! • Start with Phase 1 (separate team) • For Phase 3, tie target coherence system to RAMP-White • Cache maintained by target coherence, not by host coherence

  12. Sharing IP: Some Preliminary Experience • We looked at RAMP-Red XUP • Used some code (PLB master) • Red-BEE is not ready to distribute • Looking for switch code • Berkeley’s code on CVS repository • But, we can’t use memory controller because we don’t have BEE2 board yet • Bluespec • We are spinning almost all of our own code right now • Would like to steal software • OS (kernel proxy) • SMP OS port • Naming • MPI reference design in BEE2 repository • Is that RAMP-Blue? • A central CVS repository for RAMP code?

  13. Sharing Over the Long Term • Processor is shared: Leon, PowerPC, MicroBlaze, Everything else • MC is shared: Xilinx or Berkeley • Coherent cache can be shared: Transactional/traditional, Borrow Stanford’s? • Coherency engine can be shared: CMU/Stanford • IU functionality can be shared: Trying to make ours general • NIU can be shared: Borrow half from Berkeley? • Network can be shared: Borrow Berkeley’s? • [Block diagram: Proc, $, Mem, MC, IU, NIU, CCE, Peripherals]

  14. Conclusions • RAMP-White has started • Hari has been working full time for 1 semester • Have a clear first direction • Architecture looks fairly flexible • Would like to discuss how to share IP better so we don’t reinvent the wheel
