A Fault Tolerant Gaussian Elimination Solver for the Cell Broadband Engine
James Geraci, Lead Researcher
Square Enix Co., Ltd., Research and Development Division

Introduction to Square Enix Group
Square Enix Group is a Japanese entertainment content/service developer and publisher.
The fault tolerance idea is to back up on-chip data into main memory.
The algorithm's natural serialization points are used as checkpoints.
When a fault occurs, the backed-up data is used to redistribute the workload among the remaining cores.
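
The checkpoint-and-redistribute idea can be illustrated with a minimal Python sketch. It is only a simulation under stated assumptions, not the Cell implementation from the talk: cores are plain integer ids, "on-chip" data is a Python list, "main memory" is a deep copy of it, the end of each pivot step is taken as the serialization point, rows are assigned round-robin, and no pivoting is performed. The names `assign_rows`, `solve_with_checkpoints`, and `fail_at` are hypothetical.

```python
import copy


def assign_rows(n_rows, cores):
    """Round-robin assignment of row indices to core ids (assumed policy)."""
    if not cores:
        raise RuntimeError("no cores left to take the workload")
    owner = {c: [] for c in cores}
    for r in range(n_rows):
        owner[cores[r % len(cores)]].append(r)
    return owner


def solve_with_checkpoints(A, b, cores, fail_at=None):
    """Solve Ax = b by Gaussian elimination without pivoting.

    fail_at maps a pivot step to the id of a core that is simulated
    to fail during that step (hypothetical test hook).
    Returns the solution and the list of cores still alive.
    """
    n = len(A)
    # "On-chip" working data: augmented rows [A | b].
    local = [list(row) + [rhs] for row, rhs in zip(A, b)]
    # Backup of the on-chip data in "main memory".
    main_memory = copy.deepcopy(local)
    cores = list(cores)
    owner = assign_rows(n, cores)
    fail_at = dict(fail_at or {})

    k = 0
    while k < n - 1:
        failed = fail_at.pop(k, None)
        # Each core eliminates column k from the rows it owns.
        for c in cores:
            for r in owner[c]:
                if r > k:
                    f = local[r][k] / local[k][k]
                    local[r] = [x - f * y for x, y in zip(local[r], local[k])]
        if failed is not None:
            # Fault: discard on-chip data, restore the last checkpoint,
            # and redistribute rows among the remaining cores.
            cores.remove(failed)
            local = copy.deepcopy(main_memory)
            owner = assign_rows(n, cores)
            continue  # redo pivot step k
        # Serialization point reached: checkpoint into main memory.
        main_memory = copy.deepcopy(local)
        k += 1

    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(local[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (local[i][n] - s) / local[i][i]
    return x, cores
```

For example, `solve_with_checkpoints(A, b, cores=[0, 1, 2], fail_at={1: 1})` simulates core 1 failing during pivot step 1: the step is redone from the last checkpoint with the rows spread over cores 0 and 2, and the solution is the same as in a fault-free run.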
Core failures lead to redistribution of workload among the remaining cores.
Addition of Cores:
Cores are added and rows are dynamically redistributed.
N failed cores with M new cores
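
As a rough illustration of how rows might be reassigned when some cores fail and others are added, the following Python sketch recomputes a row-to-core mapping. The function name `repartition_rows`, the integer core ids, and the round-robin policy are assumptions made for this example; the slides do not specify the actual distribution scheme.

```python
def repartition_rows(active_rows, cores, failed=(), added=()):
    """Return a new {core_id: [row indices]} mapping.

    Failed cores are dropped, newly added cores are appended, and the
    still-active rows are dealt out round-robin (assumed policy).
    """
    failed = set(failed)
    survivors = [c for c in cores if c not in failed]
    new_cores = survivors + [c for c in added if c not in survivors]
    if not new_cores:
        raise RuntimeError("no cores available after repartitioning")
    assignment = {c: [] for c in new_cores}
    for i, r in enumerate(active_rows):
        assignment[new_cores[i % len(new_cores)]].append(r)
    return assignment
```

For instance, with four cores `[0, 1, 2, 3]`, two failures (cores 1 and 3), and one new core (core 4), `repartition_rows(range(6), [0, 1, 2, 3], failed=[1, 3], added=[4])` spreads the six rows evenly over cores 0, 2, and 4.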