Presenter/Topics

9:15-9:30 Welcome

Mike Wangsmo, symposium organizer, stated his motive: to bring together in one room people who have demonstrated interest and expertise in Linux High Availability/Clustering, so that they may begin to design and develop a GPL infrastructure for a comprehensive Linux solution.

9:30-noon Core Services

Stephen Tweedie presented his understanding of where we are now, where we need to go, and how to get there. Currently there is a great deal of HA/Cluster development activity, but it is fragmented and focused on short-term goals: on rapidly developing products or product components that solve the easy problems. If we continue in this way, we will end up with a disparate collection of incompatible, partial solutions. We need to start now designing and building a framework capable of solving the hard problems as well as the easy ones. This is doable, but it requires a long-term view.

The hardest problem is cluster partitioning, the situation in which a node becomes isolated and independently updates a database that another node is also independently updating. This problem affects shared-everything and shared-nothing approaches equally. To prevent partitioning, we need to be able to guarantee that every cluster node is informed when a node fails, and we need a technique to prevent surviving cluster nodes from acting independently. Quorum is such a technique: cluster nodes vote, and none may act unless it has a majority. To break ties, the disk (or another device) may be given an extra vote. We need an API that provides Quorum as an independent cluster service.

Another hard problem, scalability, can be solved by a hierarchical design, which will work for Massively Parallel Processing (MPP) as well as for heterogeneous/HA clusters. In this design, clusters of nodes are arranged in a hierarchy, with each cluster having a leader.
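As an illustration of the quorum rule described in the Core Services talk (this is not code presented at the symposium), a minimal Python sketch might look like the following. The names `Voter` and `has_quorum` are hypothetical and do not come from any existing cluster API. The sketch assumes each node holds one vote and that a tie-breaker device, such as a shared disk, is modeled as one more voter.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Voter:
    """A cluster member that holds votes (hypothetical type for this sketch)."""
    name: str
    votes: int = 1


def has_quorum(reachable: Iterable[Voter], membership: Iterable[Voter]) -> bool:
    """Return True only if the reachable voters hold a strict majority of all votes."""
    total = sum(v.votes for v in membership)
    held = sum(v.votes for v in reachable)
    return 2 * held > total


node_a, node_b = Voter("node-a"), Voter("node-b")
quorum_disk = Voter("quorum-disk")  # assumed tie-breaker device with one extra vote

# Two nodes, no tie-breaker: after a partition each side holds 1 of 2 votes,
# so neither side has a majority and neither may act.
two_nodes = [node_a, node_b]
print(has_quorum([node_a], two_nodes))  # False

# With the tie-breaker, the side that can still reach the disk holds 2 of 3 votes.
with_disk = [node_a, node_b, quorum_disk]
print(has_quorum([node_a, quorum_disk], with_disk))  # True
print(has_quorum([node_b], with_disk))               # False
```

The strict inequality is the point of the sketch: with an even number of votes, a partition into equal halves leaves both sides without quorum, which is why an extra vote for a disk or other device is useful.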
In such a hierarchy, any node among thousands of nodes can be informed quickly of the failure of any other node. The arrangement also allows a cluster transition to be isolated to the cluster peers at one level (usually a localized geographical area), so there is no need to involve other levels in a recovery or in the election of a new peer leader.

A number of objections were raised, the most significant concerning the motive for such an arrangement: what is the advantage of, say, binding a Scotland cluster into a UK cluster and the UK cluster into a Red Hat cluster, over just leaving them as independent clusters? Answer: for distributed heterogeneous clusters, ease of administration and administrative flexibility; for MPP clusters, a hierarchical arrangement makes node recovery possible. (It was not clear whether these answers were acceptable to all present.)

1:00-4:00 Storage and DLM

Peter Braam

4:00-5:00 Open discussion