Wednesday, November 16, 2005
Design and Implementation of Multiple Fault-Tolerant MPI over Myrinet
Session: MPI and Network Transport
Event Type: Paper
Time: 11:30am - 12:00pm
Session Chair: Scott Pakin
Speaker(s): Hyungsoo Jung, Dongin Shin, Hyuck Han, Jai W. Kim, Heon Y. Yeom, Jongsuk Lee
Location: 608-609
Abstract:
Advances in network technology and computing power have inspired the emergence of high-performance cluster computing systems. While cluster management and hardware high-availability tools are readily available, practical and easily deployable fault-tolerant systems have not been successfully adopted commercially. We present a fault-tolerant system, Multiple fault-tolerant MPI over Myrinet (M-cube), that differs in notable respects from other proposed fault-tolerant systems in the literature. M-cube is built on top of Myrinet because Myrinet is regarded as one of the best solutions for high-performance networks and is widely used in cluster computing systems.
M-cube requires no modification of application code and preserves many of the high-performance characteristics of Myrinet. Experimental results substantiate our assertion that M-cube can be a good candidate for practically deployable fault-tolerant systems in a very large, high-performance Myrinet cluster, and that its protocol can be applied to a wide variety of parallel communication libraries without difficulty.
This paper can be found in the ACM and IEEE Digital Libraries.
Chair/Speaker Details:
Scott Pakin (Chair), Los Alamos National Laboratory
Hyungsoo Jung, Seoul National University
Dongin Shin, Seoul National University
Hyuck Han, Seoul National University
Jai W. Kim, Seoul National University
Heon Y. Yeom, Seoul National University
Jongsuk Lee, Korea Institute of Science and Technology Information