
Linux links of the week


Sendmail.net is apparently intended to be a community forum site for sendmail users (i.e. most of us). They have gotten off to a more ambitious start, though, featuring interviews with Tim O'Reilly, Brian Behlendorf, and, of course, Eric Allman. Over the next week they plan to add others as well: Eric Raymond, Kirk McKusick, Paul Vixie, and more. Worth a look.

Michael Hammel has set up his new home at graphics-muse.com. Michael is the author of the Linux Journal "Graphics Muse" column, as well as SSC's book on the GIMP. His site contains a nice mix of writings about Linux and the GIMP, and examples of what can be done with the GIMP.

Section Editor: Jon Corbet


October 21, 1999

Letters to the editor


Letters to the editor should be sent to letters@lwn.net. Preference will be given to letters which are short, to the point, and well written. If you want your email address "anti-spammed" in some way please be sure to let us know. We do not have a policy against anonymous letters, but we will be reluctant to include them.
 
   
From: bret r robideaux <bret.r.robideaux@mail.sprint.com>
Date: Fri, 15 Oct 1999 11:39:36 -0500
Subject: High Availability


I am by no means an HA expert, but my present employment requires me to 
have a passing knowledge of the subject.

3-nines availability means, in raw numbers, that over the course of one 
year (31,536,000 seconds) you can expect down time that does not exceed 
31,536 seconds (or 8.76 hours/year). Adding that fourth nine means 
dropping to under 53 minutes/year.

Outages (the period of time your system is not providing the services 
it is expected to provide) are caused by several things: hardware 
failure, application failure, operating system failure as well as 
scheduled maintenance on hardware, applications and the operating 
system.

Any of these outages can be included in or omitted from your calculations.

Considering that M$ is not a hardware vendor, and doesn't seem to have any 
illusions of being in that market, it is not only likely, but even 
reasonable (believe it or not), that they completely omit hardware 
failures, upgrades and maintenance from their calculations. 
This further stands to reason because almost none of the hardware M$ 
products are designed to run on supports fault tolerance anyway. 
Therefore, it is highly likely that no outage, scheduled or 
unscheduled, that is due to hardware counts against their 31.5K seconds 
per year.

This leaves operating system upgrades (application of service packs) and 
application upgrades to account for scheduled maintenance (read: 
outages).

Performance loss reboots, BSODs, and application crashes make up the 
unscheduled outages. My (limited) experience with NT suggests that 
rebooting an NT system once a week (whether it needs it or not) tends 
to significantly reduce (and even eliminate) app crashes, performance 
loss and BSODs.

Call it 5 minutes to reboot (probably a little generous, but let's be 
nice) X 52 weeks/year totals 260 minutes or 4 hours and 20 minutes a 
year. That leaves a solid 4 hours and 25 minutes a year for scheduled 
outages.

Even if it takes an hour to apply a service pack and another hour to 
apply an upgrade to the application being hosted (no one I know runs 
more than one major application per NT server), you're still 2 hours 
under the requirement for 3-nines availability.
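The downtime arithmetic above can be rechecked with a few lines of Python. The figures are the letter's own; the script and its names are just a sketch for verifying them:

```python
# Yearly downtime budgets implied by "N nines" availability, plus the
# letter's weekly-reboot scenario for an NT server.

SECONDS_PER_YEAR = 365 * 24 * 3600        # 31,536,000

def downtime_budget(nines):
    """Allowed downtime, in seconds per year, for N-nines availability."""
    return SECONDS_PER_YEAR * 10 ** (-nines)

three_nines = downtime_budget(3)          # 31,536 s, about 8.76 hours
four_nines = downtime_budget(4)           # about 3,154 s, under 53 minutes

# One 5-minute reboot a week, 52 weeks a year:
reboot_time = 5 * 60 * 52                 # 15,600 s = 4 h 20 min
remaining = three_nines - reboot_time     # 15,936 s, about 4 h 25 min
```

Subtracting two hours of scheduled maintenance from `remaining` still leaves the server inside its 3-nines budget, which is the letter's point.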

Certified 99.9% availability seems awfully impressive until you break 
it down. Now it just seems rather pathetic that 3-nines is the best 
they could do.

But really, none of that is the point. This is the point: anecdotal 
evidence is completely ignored in the corporate boardroom. Until we get 
hard evidence in this category, we're just wasting our breath in the 
long run. Microsoft has chosen another battle ground (setting the 
testing criteria and the score). To continue Linux's expansion, we have 
to rise to this challenge as well.

Bret


   
Date: Thu, 14 Oct 1999 02:09:45 -0700
From: Nathan Myers <ncm@nospam.cantrip.org>
To: letters@lwn.net
Subject: 99.9% uptime


To the editor,

I'd like to follow up on Mike Richardson's analysis, in which he pointed 
out that at 10 minutes of downtime per crash, 99.9% uptime implies a 
crash per week.
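Richardson's figure checks out; here is a quick recomputation (mine, not his):

```python
# One 10-minute crash in a 10,080-minute week already brings a machine
# down to roughly 99.9% uptime.
MINUTES_PER_WEEK = 7 * 24 * 60            # 10,080
uptime = 1 - 10 / MINUTES_PER_WEEK        # ~0.9990, i.e. about 99.9%
```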

If crashes happened at random times, ten minutes a week wouldn't seem 
too bad, if in fact the crashes resulted in no cumulative damage.  After
all, _most_ of the time _most_ machines aren't doing much anyway.

Crashes don't happen at random times, though; they tend to happen during 
periods of peak load.  That means if a machine crashes, it tends to crash 
when you most need it to be working.  Seen in this light, a 99.9% uptime 
is as embarrassing to Microsoft as their frequent on-stage demo failures.  
"It only crashes during the week's peak demand" does not inspire confidence.

This leads us to a working definition of an otherwise annoyingly vague 
term:  What is an enterprise server?  In marketing text the term occurs 
in connection with terms like "scaling" and "multiprocessor", but that 
confuses goals with techniques.  In practice, when we talk about an 
enterprise server we're talking about an application where even a short 
failure costs more than the price of the entire system, and easily 
justifies throwing it out and replacing it with something better.  An 
hour's downtime on a warship may cost billions (or worse) and a minute's 
downtime on a surgical monitor may cost a life.  (Both of these examples 
are drawn from real failures.)

Is Linux qualified to act as an enterprise server?  Better scaling allows 
it to take on bigger jobs where more is at stake, but the key is still 
reliability.  A $400 co-hosting web server that logs $2000 in business in 
a peak hour is an enterprise server, by this definition, and the value 
managed by the fleet of such machines deployed among thousands of small 
businesses easily matches that handled by the biggest "big iron".  While
individual failures may attract less attention than downtime on a massive 
server, their cumulative effect is the same.  The difference is that such 
servers can be, and are being, replaced incrementally.  We experience that
process as growth in Linux's (and xBSDs') popularity.

In short, reliability is a more important measure of "enterprise readiness" 
than scalability.  Linux developers and users are already attuned to this
fact, but Linux reliability could still improve.  More code reviews (e.g.
for graduate credit?) and better in-kernel monitoring and data-gathering 
apparatus would help.

Nathan Myers
ncm@nospam.cantrip.org

   
To: letters@lwn.net
Subject: RE: Gerstner's speech
From: Guillaume Laurent <glaurent@worldnet.fr>
Date: 14 Oct 1999 16:51:28 +0200


I'd like to respond to Walt Smith's comments on Lou Gerstner's
speech. I've worked for IBM and still know a couple of IBMers, and I
believe Mr Smith is quite a bit misinformed.

> I don't know whether to sell my IBM stock or hope for a replacement
> for Gerstner.  Clearly the man is living in a different world.

IBM has enjoyed its most profitable years ever under his leadership,
and every IBMer agrees that without him the company would have
disappeared by now.

> A significant part of IBM is it's proprietary properties
> and manufacturing!

IBM is doing everything it can to make that part shrink, and it's
succeeding.

For the past three years or so, IBM has been changing into a service-
oriented company, because it can't follow the competition in the
hardware domain, be it the network appliance market (hence the recent
agreement with Cisco) or the PC one.

I've witnessed this in the IBM lab where I used to work. Hardware-oriented
projects are slowly dying while the new ones deal with services 
(like SAP R/3 or the IBM Global Network).

I believe the whole IT industry has been witnessing it, actually. :-)

> Yes, internet appliances will make an impact, but not in the way he
> believes. I won't elaborate unless I get a check for consulting;
> that information is very valuable to IBM's marketing!!

I don't have anything to reply to this one, I just left it because
it's really very funny. :-)

-- 
						Guillaume
   
Date: Mon, 18 Oct 1999 23:47:57 -0600
From: Alan Robertson <alanr@bell-labs.com>
To: pankaj_chowdhry@zd.com
CC: letters@lwn.net, pcwonline@zd.com
Subject: Someone to trust...

This letter is in reply to your article for PCWeek online entitled:
	Open source meets the 'Baywatch' factor

Mr. Chowdhry:

I read your article with interest, but at the end found myself asking "Why do
you run Microsoft code at all?"

Microsoft has gotten caught multiple times putting hooks into their code to
collect confidential information from their customers.

This hasn't happened yet in open source projects.

All Microsoft has to do is declare it as "good for Microsoft", and it's done --
and the only way you can find out is to sniff every packet on the wire and try
to figure out what they've done to you ... again...

Reading the source is much easier than this, and much more entertaining... 
Despite the formidable difficulties associated with monitoring closed-source
operating systems from the outside, Microsoft has been caught in apparent
misdeeds more than once, and several bugs have been found this way.

If it weren't such a serious matter, Mulder and Scully would laugh at your
analysis.  It appears that you don't have any idea how many layers you have to
trust from the bottom to the top just to log in.  OS patches are only the
tiniest tip of the iceberg.

For a simplified view, you can start with:
	Chip designers	(witness Pentium III)
	Chip design toolmakers
	Compiler authors
	Library authors
	OS authors
	Dozens of software component authors
	BIOS authors
	router manufacturers
	Hardware (motherboard and card) designers
	Your ISP's security procedures
	Your ISP's trusted personnel
	Authentication server authors
	The US government
	Internet backbone providers
	Telcos

And, if you use Windows:
	Microsoft

You blindly trust all those people every day.  The one you appear to trust the
most (Microsoft) has a poor track record, you can't check up on them, and yet,
inexplicably, you rant about Linux instead. Linux authors stand the best chance
of getting caught in misdeeds, or having their mistakes corrected.  Moreover,
security patches ARE carefully scrutinized by more than one person before being
put out.  Since these people don't have any common interest, except in the
security of Linux, this is very good checking indeed.

You make pejorative emotional statements devoid of experience or fact concerning
autorpm.
	
Your article is filled with naïve assumptions that make it difficult for you or
PC Week to be seen as credible.

The more you write commentaries full of unsubstantiated emotional appeal, the
more clearly the subtext of your article says: "To read how PC Week propped the
door open for Linux hackers leaving them a sign saying 'Hack Here', while
carefully guarding Microsoft's reputation, click here".

You're not doing yourself, PC Week, or your readers any favors here.


	-- Alan Robertson
	   alanr@henge.com
   
Date: Tue, 19 Oct 1999 16:07:00 +0200
Subject: Open source meets the 'Baywatch
To: pankaj_chowdhry@zd.com
Cc: editor@lwn.net
From: Martin.Skjoldebrand@forumsyd.se (Martin Skjoldebrand)

Dear Sir,

In your article "Open source meets the 'Baywatch' factor" you write that:


>Our test struck the ire of the Linux community. Most of them suggest
>going to the Red Hat Web site and looking at its security page. This
>solution somewhat works but flies in the face of the whole Red
>Hat-is-not-Linux argument. Red Hat does offer signed versions of RPMs to
>verify their authenticity, but what sort of code verification do they do? 

and: 

>And pay no attention that you have a single source to look for all
>security updates. 
>Although I don't trust any of these [Novell, Microsoft, Sun] companies,
>they give me someone to sue, or at the very least, someone to yell at. 

I don't really follow you. Do you mean that the fact that the NT server
was patched while the Linux server wasn't depends on either:

a/ You are too bored to download 21 files from Red Hat (or get a CD with
those on), while not bored enough to download a MB-thingie off of the MS
server (or order a CD with those on); or

b/ You are paranoid enough to avoid Linux patches, while not nearly
paranoid enough to apply Microsoft patches. Microsoft is a company you
don't trust right?

Open source lives by code done by all kinds of people, so if you are
paranoid enough to trust no one, you have to code your OS yourself, I'm
afraid. But you do bring up a point which has been discussed before, I
think: someone may actually post malicious code on a public server.
Someone did, a while back, but it was spotted almost immediately.

Cheers,

Martin S.
http://www.forumsyd.se
martin.skjoldebrand@forumsyd.se
Y2K? - What's so special about the year 2048?

Eklektix, Inc. Linux powered! Copyright © 1999 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds