| Publisher | French National Institute for Research in Computer Science and Control | ||
|---|---|---|---|
| Format | 319.8KB PDF | Date added | 06 Feb 2008 |
| Topics | Fault-Tolerant Servers, High Performance Computing, Middleware | ||
| Downloads | 5 | ||
GridRPC middleware usually manages failures by relying on a failure detector provided by TCP or another network link layer, on automatic checkpoints of sequential jobs, and on a centralized stable agent that performs scheduling. Recent developments have introduced new mechanisms: the optimal Chandra, Toueg and Aguilera failure detector; optimized checkpoint routines, which most numerical libraries now provide themselves; and GridRPC architectures with distributed scheduling. This paper adapts to these developments by providing the first implementation and evaluation in a grid system of the optimal failure detector, a novel and simple checkpoint API that manages both service-provided and automatic checkpoints, and a recovery algorithm for the scheduling hierarchy that tolerates several simultaneous failures.
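As a rough illustration of the kind of heartbeat-based failure detector the abstract refers to, the sketch below estimates when the next heartbeat should arrive from a sliding window of past arrivals and suspects the peer once that estimate plus a safety margin has passed. It is not the paper's implementation: the class name, method names, and parameters (`interval`, `safety_margin`, `window`) are hypothetical, and it assumes heartbeats carry sequence numbers and are sent at a fixed interval.

```python
from collections import deque


class HeartbeatFailureDetector:
    """Illustrative sketch only; all names and parameters are assumptions."""

    def __init__(self, interval, safety_margin, window=10):
        self.interval = interval            # assumed fixed sending period
        self.safety_margin = safety_margin  # slack added to the estimate
        self.samples = deque(maxlen=window)  # recent (seq, arrival_time) pairs
        self.last_seq = None

    def heartbeat(self, seq, arrival_time):
        """Record heartbeat number `seq`, received at `arrival_time`."""
        if self.last_seq is not None and seq <= self.last_seq:
            return  # ignore duplicated or out-of-order heartbeats
        self.samples.append((seq, arrival_time))
        self.last_seq = seq

    def freshness_point(self):
        """Time after which the peer is suspected, or None if nothing was received."""
        if not self.samples:
            return None
        # Average each arrival shifted back by its nominal sending time,
        # then shift forward to the next expected sequence number.
        offset = sum(a - self.interval * s for s, a in self.samples) / len(self.samples)
        expected_next = offset + self.interval * (self.last_seq + 1)
        return expected_next + self.safety_margin

    def suspects(self, now):
        """True if the peer should be suspected at time `now`."""
        deadline = self.freshness_point()
        # With no heartbeat received yet there is no evidence the peer is alive.
        return deadline is None or now >= deadline
```

A monitoring process would call `heartbeat` on each received message and poll `suspects` with its local clock. A larger `safety_margin` trades slower detection for fewer false suspicions.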