
Date: Wed, 12 Apr 2000 21:03:09 +0200
To: lwn@lwn.net
From: Antoine Brenner <abrenner@alinka.com>
Subject: ALINKA Linux Clustering Letter:


Dear Lwn,

I am pleased to announce the weekly ALINKA Linux Clustering Letter:

clustering@alinka.com is a free weekly e-mail newsletter on Linux clustering
from ALINKA.
 
To subscribe to the list, send e-mail to clustering@alinka.com from the
address you wish to subscribe, with the word "subscribe" in the subject.

To unsubscribe from the list, send e-mail to clustering@alinka.com from the
address you wish to unsubscribe, with the word "unsubscribe" in the
subject.

Alinka is the publisher of the ALINKA ORANGES and ALINKA RAISIN administration
software for Linux clusters. (Web site: http://www.alinka.com )

clustering@alinka.com provides a summary of the weekly activity in
mailing lists related to Linux clustering (such as beowulf, linux virtual
server or linux-ha) and general clustering news.

Here is the first Alinka Linux Clustering Letter:
======================================================================
======================================================================
 

This is the ALINKA Linux Clustering Letter of Wednesday, April 12th, 2000.


News from the High Performance world, by Dr Laurent Gatineau
(lgatineau@alinka.com)
======================================================================
New Beowulf clusters in the world
LosLobos

The University of New Mexico [1] and IBM [2] have built a cluster of
256 dual-processor IBM Netfinity servers. According to Wired [3], this
cluster will deliver a processing speed of 375 gigaflops, or 375
billion operations per second. That will only rank it 24th on the list
of the top 500 fastest supercomputers [4].

It's clear that clusters need good software and good hardware before
they can outperform supercomputers. As Dr. Frank Gilfeather said,
Beowulf clusters need management tools like those on supercomputers,
and they need good I/O, including scalable file systems.

[1] http://www.unm.edu/
[2] http://www.ibm.com/
[3] http://www.wired.com/news/technology/0,1282,35113,00.html
[4] http://www.top500.org/

=============
Jet

The Forecast Systems Laboratory [1] has built a Beowulf cluster for
numerical weather prediction. According to Linux Weekly News [2], the
FSL cluster (called "Jet") currently consists of 276 nodes. Each node
has a 667 MHz Alpha processor with 512 MB of memory. As with the
LosLobos cluster (and many other Beowulf clusters), the
interconnection network is Myrinet.


[1] http://www.fsl.noaa.gov/
[2] http://lwn.net/2000/features/FSLCluster/

=============
Lotus

The University of Maine (Université du Maine, Le Mans, France) [1]
has built a Beowulf cluster of 34 nodes. Each node is based on two
600 MHz Pentium III processors with 512 MB of memory and two Fast
Ethernet cards. This cluster is available for testing, computational
physics and computational physical chemistry.


[1] http://weblotus.univ-lemans.fr/w3lotus/index.html


Conference about Clustering
=============
Extreme Linux Workshop/Conference #3

EL2000 will be held in conjunction with the 4th Annual Linux
Showcase & Conference on October 12-14, 2000 in Atlanta, GA. Besides
the Extreme Linux track there will be two other refereed tracks: Hack
Linux and Use Linux. There will also be vendor exhibits, tutorials,
birds-of-a-feather sessions, and work-in-progress sessions. Attendees
of the Extreme Linux workshop will be full attendees of the conference
and able to attend any sessions they want.

http://www.extremelinux.org/activities/usenix00/
http://www.linuxshowcase.org/



Software for Beowulf clusters
=============
dsh - distributed shell

dsh executes one or more commands on a collection  of hosts. The hosts
may be specified  on the command line  or as nodegroups.  Commands are
executed  sequentially on each host,  and the output from each command
is prepended with the  hostname. If a command  is not specified on the
command line, the user is prompted for commands to execute.

From: Beowulf mailing list
Home Page: http://www.ccr.buffalo.edu/dsh.htm
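
To illustrate the behaviour described above (commands run sequentially
on each host, with every output line prefixed by the hostname), here is
a minimal Python sketch. It is not dsh's code: the names run_on_hosts
and ssh_runner are hypothetical, and the default runner assumes
passwordless ssh to each node, which dsh itself may or may not rely on.

```python
import subprocess


def ssh_runner(host, command):
    """Run `command` on `host` over ssh and return its combined output.

    Assumption: passwordless ssh works for every node in the cluster.
    """
    result = subprocess.run(
        ["ssh", host, command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def run_on_hosts(hosts, command, runner=ssh_runner):
    """Run `command` on each host in turn, one host after the other.

    Returns the output lines, each prefixed with "<hostname>: ".
    """
    lines = []
    for host in hosts:
        for line in runner(host, command).splitlines():
            lines.append("%s: %s" % (host, line))
    return lines


def fake_runner(host, command):
    """Stand-in runner for trying the loop without a real cluster."""
    return "ran %s\nok\n" % command


if __name__ == "__main__":
    # Hypothetical node names; replace fake_runner with ssh_runner
    # (the default) to reach real machines.
    for line in run_on_hosts(["node1", "node2"], "uptime", fake_runner):
        print(line)
```

Keeping the transport in a separate runner function means the loop can
be tried with a fake runner, as in the demo, before touching any node.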

=============
PVFS Kernel Interface v0.8.1

Allows mounting of PVFS file systems on  Linux machines running 2.2.xx
kernels (tested on 2.2.12, 2.2.13, 2.2.15pre4). This version no longer
requires patching the kernel.

From: http://www.beowulf-underground.org/
Home Page: http://www.parl.clemson.edu/pvfs

=============
SCMS 1.2 Early Test Version

An early beta version of SCMS 1.2 is now available for testing. If you
try it and it doesn't work, please report the problem. New features
include:
1. All in Java now. Much better user interface.
2. KCAP is now a separate package with many improvements.
3. Realtime monitoring works much better.


See [m1] and [m2] for download information.

From: the beowulf mailing list
[m1] http://www.beowulf.org/listarchives/beowulf/2000/04/0031.html
[m2] http://www.beowulf.org/listarchives/beowulf/2000/04/0033.html

=============
TCP patches for Red Hat 6.2

Josip Loncaric's TCP patches for Red Hat 6.2 (Linux kernel 2.2.14-5.0)
are available. See:
Explanation/use: http://www.icase.edu/coral/LinuxTCP2.html
Patch for RH6.2: http://www.icase.edu/~josip/tcp-patch-for-2.2.14-5.0

From: the beowulf mailing list
[m1] http://www.beowulf.org/listarchives/beowulf/2000/04/0035.html


Tips and tricks from the Beowulf mailing list
=============

* There were some threads about memory and clusters: how to buy, test
  and benchmark it?

  A simple conclusion could be: "buy expensive memory and you won't
  have problems", but that's not always true, and Beowulf clusters
  should be able to use all kinds of hardware. So if you trust your
  vendor, buy its memory and test it. In fact, you should always test
  your hardware before using it.

  There are some tools to test it:
  . the famous memtest86 [1]
  . Adam Lazur [m1] gives the URL of memtester [2] and said that "he
    prefers memtester over memtest86 as it has a lot of algo's for
    finding bad RAM"
  On the subject of memory testers, Douglas Eadline [m2] reports
  that "floating point on x86 hits the RAM especially hard", although
  he hasn't verified this.

  Another way to test a node is to benchmark its memory. A good tool
  for this is stream [3]. By testing the performance of your memory
  you also test important hardware such as the cache and the bus, and
  you can test the scalability of SMP nodes. This is the subject of
  David Konerding's thread [m3].

[1] http://reality.sgi.com/cbrady_denver/memtest86/
[2] http://www.qcc.sk.ca/~charlesc/software/memtester/
[3] http://www.cs.virginia.edu/stream/

[m1] http://www.beowulf.org/listarchives/beowulf/2000/03/0307.html
[m2] http://www.beowulf.org/listarchives/beowulf/2000/03/0309.html
[m3] http://www.beowulf.org/listarchives/beowulf/2000/03/0286.html
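
The two ideas in the memory item above (write patterns and read them
back; time a large copy) can be sketched in a few lines of Python. This
is only an illustration and no substitute for memtest86, memtester or
stream: it runs in user space, so it only touches the memory the
operating system hands to the process, and the interpreter adds its
own overhead. The function names and the pattern set are hypothetical
choices, not taken from those tools.

```python
import time


def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each byte pattern across a buffer and read it back.

    Returns a list of (pattern, offset) pairs that read back wrong;
    an empty list means no mismatch was seen.
    """
    buf = bytearray(size_bytes)
    errors = []
    for pattern in patterns:
        buf[:] = bytes([pattern]) * size_bytes   # write pass
        if buf.count(pattern) != size_bytes:     # quick read-back check
            errors.extend(
                (pattern, offset)
                for offset, value in enumerate(buf)
                if value != pattern
            )
    return errors


def copy_bandwidth(size_bytes, repeats=5):
    """Best observed rate, in bytes per second, of a buffer-to-buffer copy.

    Like the Copy kernel of stream, both the bytes read and the bytes
    written are counted, hence the factor of two.
    """
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src
        best = min(best, time.perf_counter() - start)
    return 2 * size_bytes / max(best, 1e-9)  # guard against a zero timing


if __name__ == "__main__":
    size = 16 * 1024 * 1024  # 16 MB, an arbitrary size for the demo
    print("mismatches:", pattern_test(size))
    print("copy rate: %.0f MB/s" % (copy_bandwidth(size) / 1e6))
```

Use a buffer much larger than the processor cache, otherwise the copy
mostly measures the cache rather than the memory and the bus.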


* Ole Holm Nielsen [m1] has written a Beowulf cluster mini-HOWTO, which
  you can find at this URL [1]. Documentation is important: thanks for
  this work!

[1] http://www.fysik.dtu.dk/CAMP/cluster-howto.html 

[m1] http://www.beowulf.org/listarchives/beowulf-announce/2000/03/0011.html


* Borries Demeler [m1] asks for node cloning software. This is one of
  the most important tools in cluster management software (and it is
  fully provided in our software, Alinka Raisin). Robert G. Brown [m2]
  proposes the kickstart installation process from RedHat [1]; dwight
  [m3] and Pfenniger Daniel [m4] find kickstart very useful. Alex
  Lancaster [m5] wrote that one of the limitations of kickstart is that
  it's not easy to configure software that requires human interaction.

[1] http://www.redhat.com

[m1] http://www.beowulf.org/listarchives/beowulf/2000/03/0274.html
[m2] http://www.beowulf.org/listarchives/beowulf/2000/03/0279.html
[m3] http://www.beowulf.org/listarchives/beowulf/2000/03/0290.html
[m4] http://www.beowulf.org/listarchives/beowulf/2000/03/0295.html
[m5] http://www.beowulf.org/listarchives/beowulf/2000/04/0023.html


* Alexander Korenkov [m1] is looking for tricks to tune his MPI setup:
  with his 100 Mbit Fast Ethernet, his network speed is only about
  7.2 Mbit/s. Given the description of his algorithm, the best answer
  is probably Jeff Squyres's [m2]: "use the persistent mode sends and
  receives". He gives a good description of the cost of sending a
  message: you have to take into account the cost of going through the
  system and the cost of going through the network (the first one can
  be very expensive when you send lots of small messages).

[m1] http://www.beowulf.org/listarchives/beowulf/2000/04/0002.html
[m2] http://www.beowulf.org/listarchives/beowulf/2000/04/0003.html
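
The cost description in the MPI item above can be turned into a small
model: the time to send a message is a fixed cost for going through
the system plus the time spent on the wire. The Python sketch below
uses that model to show how small messages can hold a 100 Mbit/s link
down to a few Mbit/s. The 100 microsecond overhead is a hypothetical
figure chosen for illustration; it does not come from the thread.

```python
def message_time(size_bits, overhead_s, bandwidth_bps):
    """Seconds to send one message: fixed system cost + time on the wire."""
    return overhead_s + size_bits / bandwidth_bps


def effective_throughput(size_bits, overhead_s, bandwidth_bps):
    """Bits per second actually achieved with messages of this size."""
    return size_bits / message_time(size_bits, overhead_s, bandwidth_bps)


def size_for_throughput(target_bps, overhead_s, bandwidth_bps):
    """Message size (in bits) at which the model yields `target_bps`.

    Solves n / (o + n/B) = R for n, which gives n = R*o / (1 - R/B).
    """
    if not 0 < target_bps < bandwidth_bps:
        raise ValueError("target must be between 0 and the link bandwidth")
    return target_bps * overhead_s / (1 - target_bps / bandwidth_bps)


if __name__ == "__main__":
    OVERHEAD = 100e-6  # hypothetical per-message system cost, in seconds
    LINK = 100e6       # 100 Mbit/s Fast Ethernet

    bits = size_for_throughput(7.2e6, OVERHEAD, LINK)
    print("message size giving 7.2 Mbit/s: %.0f bytes" % (bits / 8))

    for size_bytes in (64, 1024, 65536):
        rate = effective_throughput(size_bytes * 8, OVERHEAD, LINK)
        print("%6d bytes -> %5.1f Mbit/s" % (size_bytes, rate / 1e6))
```

Under that assumed overhead, messages of roughly 100 bytes are enough
to explain a rate of about 7.2 Mbit/s. Anything that lowers the fixed
per-message cost, or packs more data into each message, raises the
effective rate.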


* Jose Marin [m1] asks for a network traffic monitor. There are lots
  of network meters; some do lots of things, others are lightweight.
  Robert G. Brown [m2] wrote procstatd [1] and gives a pointer [m3] to
  mgm [3], which is probably not suited to professional use. Jose
  Marin answers his own question and gives a pointer to an SGI tool
  (Performance Co-Pilot) [2], which monitors all system resources, and
  two other pointers [4], [5] to the tools iptraf and ntop, which are
  recommended by Jay Sherman and Felix Rauch [m4]. Two other tools
  were cited by Lyle Bickley and Paul Nowoczynski: etherape [6] and
  Ethereal [7] (which is a network protocol analyzer).

[1] http://www.phy.duke.edu/brahma/
[2] http://oss.sgi.com/projects/pcp/
[3] http://www.xiph.org/mgm/
[4] http://cebu.mozcom.com/riker/iptraf/
[5] http://www.ntop.org/
[6] http://etherape.sourceforge.net/download.html
[7] http://ethereal.zing.org/

[m1] http://www.beowulf.org/listarchives/beowulf/2000/04/0036.html
[m2] http://www.beowulf.org/listarchives/beowulf/2000/04/0038.html
[m3] http://www.beowulf.org/listarchives/beowulf/2000/04/0045.html
[m4] http://www.beowulf.org/listarchives/beowulf/2000/04/0040.html
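
For a node where none of the tools above is installed, a very
lightweight traffic meter can be built on the byte counters that Linux
exposes in /proc/net/dev. The Python sketch below is an illustration
only and is not how any of the tools cited above works. It assumes the
/proc/net/dev layout with two header lines and eight receive columns
before the transmit columns; the function names and the sample figures
are made up for the example.

```python
# Two snapshots of /proc/net/dev, taken one second apart (made-up figures).
SAMPLE_BEFORE = """Inter-|   Receive                                   |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
    lo:    1000   10 0 0 0 0 0 0    1000   10 0 0 0 0 0 0
  eth0: 5000000 4000 0 0 0 0 0 0 2000000 3000 0 0 0 0 0 0
"""

SAMPLE_AFTER = """Inter-|   Receive                                   |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
    lo:    1000   10 0 0 0 0 0 0    1000   10 0 0 0 0 0 0
  eth0: 6250000 5000 0 0 0 0 0 0 2125000 3100 0 0 0 0 0 0
"""


def parse_net_dev(text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes; field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rates_mbit(before, after, interval_s):
    """Per-interface (rx, tx) rates in Mbit/s between two snapshots."""
    rates = {}
    for name, (rx1, tx1) in after.items():
        if name in before:
            rx0, tx0 = before[name]
            rates[name] = (
                (rx1 - rx0) * 8 / interval_s / 1e6,
                (tx1 - tx0) * 8 / interval_s / 1e6,
            )
    return rates


if __name__ == "__main__":
    # On a Linux node, read /proc/net/dev twice instead of using samples.
    before = parse_net_dev(SAMPLE_BEFORE)
    after = parse_net_dev(SAMPLE_AFTER)
    for name, (rx, tx) in sorted(rates_mbit(before, after, 1.0).items()):
        print("%-5s rx %6.2f Mbit/s  tx %6.2f Mbit/s" % (name, rx, tx))
```

This only reports totals per interface; to see which hosts or
protocols generate the traffic, a tool such as iptraf, ntop or
Ethereal is still needed.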


* Kragen Sitaker [m1] wants to change the BIOS of all his nodes
  without doing it by hand. Erik Arjan Hendriks [m2] gives a solution
  with the tool Bios Writer [1].

[1] http://sourceforge.net/project/?group_id=2965

[m1] http://www.beowulf.org/listarchives/beowulf/2000/04/0058.html
[m2] http://www.beowulf.org/listarchives/beowulf/2000/04/0059.html


* J.Dube [m1] wants to know if "a Beowulf on an outside connection is
  a huge security hole". A short answer could be that a Beowulf is
  like a single computer, so it's no more and no less of a security
  hole than putting a computer on an outside connection. It depends on
  how the Beowulf is configured. Robert G. Brown [m2] wrote a long
  mail explaining how a Beowulf can be configured to close as many
  security holes as possible, and dwight [m3] added some comments.

[m1] http://www.beowulf.org/listarchives/beowulf/2000/04/0030.html
[m2] http://www.beowulf.org/listarchives/beowulf/2000/04/0032.html
[m3] http://www.beowulf.org/listarchives/beowulf/2000/04/0052.html


News from the High Availability world, by Christophe Massiot
(cmassiot@alinka.com)
======================================================================
LVS
===
* Jean-Christophe Boggio describes [lvs1] a solution for load
balancing with MySQL. The application must separate read queries from
write queries: read queries are dispatched normally through LVS, and
write queries are repeated at the application level, once for each SQL
server. Sean Ward has developed such a solution, which reads
log files for replication [lvs2]. The source code is currently
available [lvs3].

* Phil Z. sends in a report [lvs4] on the RealNetworks G2 server. To
make it work under LVS, the audio/video daemon must be configured to
listen/respond to both the VIP and its real IP, and both ports 7070
and 554 must be redirected.

* Using LVS for balancing SMTP servers should work, provided you do not
have ident loops with multiple SMTP servers. To avoid this, turn on the
-R option of tcpserver (qmail/tcpserver) [lvs5], or read the HOWTO for
sendmail [lvs6]. POP will only work if you have an NFS-safe POP server
[lvs7].

* RedHat has created a new mailing list for piranha, a web interface for
Linux clustering. [lvs8]

* IPVS patch 0.9.10 is out. Here is the ChangeLog:
        * Julian added the droprate and secure_tcp defense strategies.
        * The dropentry defense strategy was revisited.
        * The fwmark service lookup was added by Horms, Julian and Wensong.
          Use a firewall-marking to denote a virtual service instead of a
          triplet <protocol,addr,port>.  The marking of packets with a
          firewall-mark is done by firewalling code. This feature can be
          used to build a virtual service associated to different IP
          addresses or port numbers, but sharing the same real servers, such
          as multiple-homed LVS. [lvs9]

* A document on LVS defense strategies against DoS attacks is available.
[lvs10]

[lvs1]
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=95484927402602&w=2
[lvs2]
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=95486218014132&w=2
[lvs3] http://lsdproject.sourceforge.net
[lvs4]
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=95486498617326&w=2
[lvs5]
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=95496469906008&w=2
[lvs6]
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=95404893314823&w=2
[lvs7] http://www.clubi.ie/%7eross/sendmail-maildir.html
[lvs8]
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=95503184623136&w=2
[lvs9]
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=95528038608970&w=2
[lvs10] http://www.LinuxVirtualServer.org/defense.html
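
The read/write split from the MySQL item at the top of this LVS
section can be sketched as follows. This is an illustration in Python,
not Jean-Christophe Boggio's or Sean Ward's code: in-memory SQLite
databases stand in for the MySQL servers, a simple round-robin stands
in for the dispatching that LVS would do, and the ReplicatedPool class
and its prefix test for read queries are hypothetical.

```python
import itertools
import sqlite3

# Statements treated as reads; everything else is treated as a write.
READ_PREFIXES = ("select", "show")


class ReplicatedPool:
    """Send reads to one server and repeat writes on every server."""

    def __init__(self, connections):
        self.connections = list(connections)
        # Round-robin over the servers; in the setup described above,
        # LVS would pick the real server for each read connection.
        self._readers = itertools.cycle(self.connections)

    def execute(self, sql, params=()):
        if sql.lstrip().lower().startswith(READ_PREFIXES):
            return next(self._readers).execute(sql, params).fetchall()
        # Write: done as many times as there are SQL servers.
        for conn in self.connections:
            conn.execute(sql, params)
            conn.commit()
        return []


if __name__ == "__main__":
    servers = [sqlite3.connect(":memory:") for _ in range(3)]
    pool = ReplicatedPool(servers)
    pool.execute("CREATE TABLE nodes (id INTEGER, name TEXT)")
    pool.execute("INSERT INTO nodes VALUES (?, ?)", (1, "node1"))
    # Each read goes to a different server but sees the same data.
    for _ in servers:
        print(pool.execute("SELECT id, name FROM nodes"))
```

The sketch does nothing about a server that fails in the middle of a
write, after which the copies would no longer match; a real setup
needs some way to detect and repair that.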

Linux-HA
========
* Legato has released a clustering solution called Legato Cluster
Enterprise. It supports Solaris, Linux (RedHat, Caldera), Windows NT/2000,
HP-UX and AIX platforms. [lha1]

* Mike Wangsmo has released kernels with ext3 patches for testing purposes
only. [lha2]

[lha1] http://www.legato.com/News/pr00031index.html
[lha2] ftp://people.redhat.com/wanger/clustering/ext3

LinuxFailSafe
=============
* Chris Wright summarizes the status of LinuxFailSafe [lfs1]. The port
to Linux is well under way, and is being done largely by SGI (release date
in the summer?). The intent is to open source almost all of FailSafe,
with a license close to the GPL/LGPL.

* The presentations made by SGI at the Linux FailSafe Symposium in Denver
on March 31st are now available. [lfs2]

[lfs1] http://lists.tummy.com/pipermail/linuxfailsafe/2000-April/000011.html
[lfs2] http://oss.sgi.com/projects/failsafe/


News on the Filesystems front, by Ludovic Ishiomin
(lishiomin@alinka.com)
======================================================================

In [1m], someone asks whether InterMezzo could be used for filesystem
replication, in order to achieve high availability. The answer
is that InterMezzo can handle this job, but it's not ready
yet for critical tasks.

JFS 0.0.5 for Linux, a well-known journaled filesystem which
comes from IBM AIX, has been announced, but it is not yet ready
for a public release. More information can be found at [1].

In the linux-lvm mailing list, there has been a discussion about the
ability to convert an existing filesystem into a logical volume.
The conclusion is that it should be possible, but some code is
needed in the ext2resize utility.


[1m] http://www.inter-mezzo.org/list-archives/intermezzo-discuss/msg00007.html
[1] http://oss.software.ibm.com/developerworks/opensource/jfs/index.html
======================================================================


This letter was brought to you by ALINKA (http://www.alinka.com), the
publisher of the ALINKA ORANGES and ALINKA RAISIN administration
software for Linux clusters.

-- 
abrenner@alinka.com
http://www.alinka.com; AlinkA : Cluster Solutions
Tel: (+33) 1 49 35 29 29