Fault Tolerance Management for a Hierarchical GridRPC Middleware

Publisher: French National Institute for Research in Computer Science and Control
Format: 319.8KB PDF
Date added: 06 Feb 2008
Topics: Fault-Tolerant Servers, High Performance Computing, Middleware
Downloads: 5

GridRPC middleware usually manages failures with a failure detector provided by TCP or another network link layer, automatic checkpoints of sequential jobs, and a centralized stable agent that performs scheduling. Recent developments have introduced new mechanisms: the optimal failure detector of Chandra, Toueg and Aguilera, numerical libraries that now provide their own optimized checkpoint routines, and GridRPC architectures with distributed scheduling. This paper adapts to these developments by providing the first implementation and evaluation in a grid system of the optimal failure detector, a novel and simple checkpoint API that manages both service-provided and automatic checkpoints, and a recovery algorithm for the scheduling hierarchy that tolerates several simultaneous failures.
