OLS 2001 coverage

Linux High Availability Working Group

July 25, 2001
Forrest Cook
Alan Robertson and Lars Marowsky-Bree led a working group session on Linux High Availability at OLS on Wednesday, July 25.

The goal of the session was to discuss ideas for creating a common framework for High Availability (HA) and High Performance (HP) computing on Linux. The Open Cluster Framework (OCF) is intended to provide common ground for the development of cluster software.

Currently, the Linux clustering environment consists of many different packages, many of which were ported from other operating systems. The packages have many functions in common (every cluster implementation needs to start, stop, and monitor processes, for example), but they are incompatible with each other, which fragments the effort of cluster software developers.
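To make the start, stop, and monitor example concrete, here is a minimal Python sketch of what a shared interface for those three operations might look like. The class and method names (ManagedResource, ProcessResource) are assumptions made for illustration; they are not taken from any OCF document or existing cluster package.

```python
import subprocess
import sys
from abc import ABC, abstractmethod


class ManagedResource(ABC):
    """Hypothetical common interface; the names are illustrative only."""

    @abstractmethod
    def start(self) -> None:
        """Bring the resource up (no-op if it is already running)."""

    @abstractmethod
    def stop(self) -> None:
        """Bring the resource down (no-op if it is not running)."""

    @abstractmethod
    def monitor(self) -> bool:
        """Return True if the resource is currently running."""


class ProcessResource(ManagedResource):
    """One possible implementation: a resource backed by a local process."""

    def __init__(self, argv):
        self.argv = list(argv)
        self._proc = None

    def start(self) -> None:
        # Only launch a new process if one is not already running.
        if not self.monitor():
            self._proc = subprocess.Popen(self.argv)

    def stop(self) -> None:
        if self._proc is not None:
            if self._proc.poll() is None:
                self._proc.terminate()
            self._proc.wait()  # reap the child so it does not linger
            self._proc = None

    def monitor(self) -> bool:
        # poll() returns None while the child process is still alive.
        return self._proc is not None and self._proc.poll() is None


if __name__ == "__main__":
    # Stand-in workload: a Python child process that sleeps for a while.
    res = ProcessResource([sys.executable, "-c", "import time; time.sleep(30)"])
    res.start()
    print("running after start:", res.monitor())
    res.stop()
    print("running after stop:", res.monitor())
```

Cluster software written against such an interface would not need to know how a given package starts or checks its processes, which is the kind of duplicated effort described above.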

Furthermore, people getting into clustering for the first time face a bewildering number of existing frameworks to choose from.

Companies that want to provide commercial Linux clustering solutions face the task of supporting many systems, which makes working with Linux more expensive and less focused.

The Open Cluster Framework aims to unify the Linux clustering efforts in a number of ways and solve all of these problems.

One proposal is to start an active discussion on the linux-cluster mailing list and to adopt the RFC model for presenting papers and ideas. The Linux Standard Base was mentioned as a model of cooperation that the OCF could follow.

Some of the goals of the OCF are:

  • Agree on the main requirements for a clustering framework.
  • Assess currently available solutions for their viability.
  • Build a list of required components.
  • Identify components for functionality, features, scope, and communications paths.
  • Rework the currently disconnected pile of cluster components into a unified framework.
  • Figure out which components can be reused and which need to be rewritten.
  • Design an API from the component list.
  • Create feedback from component designs to the API design.
  • Achieve clustering world domination on the Linux platform.

Two of the more typical focuses for cluster systems are High Availability (HA) and High Performance (HP). Some of the goals of HA and HP are mutually exclusive, but the two areas also have much in common. For example, high performance systems need to emphasize high bandwidth while high availability systems need to focus on low latency, and those two goals tend to conflict. The OCF intends to provide for overlapping component functions in order to satisfy the needs of both HA and HP cluster users.

A system designer could then pick the appropriate tools to build a system tuned to their needs. The overlapping component design would also allow proprietary solutions to be plugged in to implement various components. Open source implementations are planned for all of the main components. This strategy has the advantage of encouraging creativity and innovation, and it can maximize reuse and add flexibility.
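The idea of overlapping, pluggable components can be sketched as a small registry in which several implementations fill the same role and the designer chooses one per role. This is an illustration under stated assumptions, not the OCF design: the role name "messaging", the implementation names, and the functions register and build_cluster are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# (role, implementation name) -> factory. Both keys are hypothetical labels.
_REGISTRY: Dict[Tuple[str, str], Callable[[], object]] = {}


def register(role: str, name: str):
    """Class decorator that records an implementation for a component role."""
    def decorator(factory):
        _REGISTRY[(role, name)] = factory
        return factory
    return decorator


@register("messaging", "low-latency")
class LowLatencyMessaging:
    def describe(self) -> str:
        return "messaging tuned for low latency (HA-style)"


@register("messaging", "high-bandwidth")
class HighBandwidthMessaging:
    def describe(self) -> str:
        return "messaging tuned for high bandwidth (HP-style)"


def build_cluster(choices: Dict[str, str]) -> Dict[str, object]:
    """Instantiate one implementation per role, as chosen by the designer."""
    cluster = {}
    for role, name in choices.items():
        try:
            factory = _REGISTRY[(role, name)]
        except KeyError:
            raise LookupError(
                f"no {name!r} implementation registered for {role!r}"
            ) from None
        cluster[role] = factory()
    return cluster


if __name__ == "__main__":
    ha_cluster = build_cluster({"messaging": "low-latency"})
    print(ha_cluster["messaging"].describe())
```

A proprietary implementation would plug in the same way as an open source one: it registers itself under an existing role, and code that consumes the role does not change.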

In designing the API, a reference implementation would be required to prove that the API really works. Problems in the implementation could be fed back to the API design for optimization in a real-world situation.

The goal of pulling existing projects together into the same API was discussed. A problem area is that different projects currently have incompatible components. The project would aim to achieve enough agreement to be able to move forward despite these incompatibilities. Hopefully, working within the common framework would provide sufficient benefits to offset the effort required to rework existing software.

Having a common playground where everyone can bring toys to play with will create a desirable development environment. Innovation would be encouraged by the existence of a solid, well-defined platform that supports extensibility from the beginning. Apparently, many vendors are unhappy with the current state of Linux cluster software because they need to support far too many models. Proprietary vendors are showing interest; the bottom line for them is that a common API would allow them to sell software to more people for less effort.

Intellectual property issues were also discussed; a clear policy needs to be established. No core component should be required to be proprietary, but conforming proprietary implementations should be allowed.

The OCF concept will again be presented at the Linux Kongress 2001 this fall in Enschede, the Netherlands.

Alan Robertson provided us with some notes from the session.

Peter Badovinatz also provided his notes, which detail the discussions.

Copyright 2002 Eklektix, Inc. All rights reserved.
Linux ® is a registered trademark of Linus Torvalds