A Large-Scale Study of Failures in High-Performance-Computing Systems

Publisher: Carnegie Mellon University
Format: 568.0 KB PDF
Date added: 01 Dec 2005
Topics: High Performance Computing

Designing highly dependable systems requires a good understanding of failure characteristics. Unfortunately, little raw data on failures in large IT installations is publicly available, because such data is usually confidential. This paper analyzes soon-to-be-public failure data covering systems at a large high-performance-computing site. The data has been collected over the past 9 years at Los Alamos National Laboratory and includes 23,000 failures recorded on more than 20 different systems, mostly large clusters of SMP and NUMA nodes. The paper studies the statistics of the data, including the root causes of failures, the mean time between failures, and the mean time to repair.
