
Date: 16 May 2000 15:02:02 -0000
To: python-list@python.org
Subject: [FAQTS] Python Knowledge Base Update -- May 16th, 2000
From: Fiona Czuczman <fiona@sitegnome.com>


Hi All,

Another bundle of entries into http://python.faqts.com

Cheers, Fiona Czuczman


## New Entries #################################################


-------------------------------------------------------------
How does the statement 'global' work?
http://www.faqts.com/knowledge-base/view.phtml/aid/2902
-------------------------------------------------------------
Fiona Czuczman
Remco Gerlich, Thomas Wouters

Answer1:

Inside functions, if you assign to a variable, it is assumed to be local 
to the function. If you want to assign to a global (module namespace) 
variable, you'll have to tell Python that it's a global first. 

ie,

x = 4
def spam():
  x = 5
spam()

Doesn't change the module's x. But

x = 4
def spam():
  global x
  x = 5
spam()

does.

If, inside a function, you only use the variable and never assign to it, 
it can't be a local variable, so Python assumes you mean a global.

This doesn't only hold for functions, but classes and methods too.
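
A minimal sketch of the point above, written for a current Python 3 interpreter rather than the 1.5.2 of this post. The names counter, read_counter and Bumper are made up for illustration. Reading a module-level name needs no declaration, while rebinding it from a method needs 'global' just as it does in a plain function:

```python
counter = 4  # module-level (global) name


def read_counter():
    # No assignment to 'counter' in this function, so the lookup
    # falls through to the module namespace.
    return counter + 1


class Bumper:
    def bump(self):
        # Rebinding the module-level name from a method needs 'global' too.
        global counter
        counter = counter + 1
        return counter


first = read_counter()    # reads the global, leaves it untouched
second = Bumper().bump()  # rebinds the global
```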

Answer2:

Python has two namespaces it uses for name lookups: the 'local' 
namespace and the 'global' namespace (you could call it the 'search 
path'.) The local namespace is the function or class you are in, the 
global namespace is the namespace of the module. If you are not in a 
function or class, the local namespace is the global namespace.

However, *assigning* to a variable does not use this search path! 
Instead, if you assign, you always assign in the local namespace. So, 
for instance:

X = None

def FillX(x):
        X = x

would not work, because the X you assign to is the *local* X, not the
global one. This wouldn't even generate an error, by the way. It is even
more confusing if you read from and assign to 'X' in the same function:

X = None

def FillX(x):
        if not X:
                X = x

In Python 1.5.2, this generates a NameError:

Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in FillX
NameError: X

and in Python 1.6, an UnboundLocalError:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 3, in FillX
UnboundLocalError: X

It generates this strange error because the 'X' name exists in the local
namespace, but at the time of the test on X, it isn't filled in yet. The
interpreter knows it should be there, but can't find it at the time of
execution.

And this is what the 'global' keyword is for. It says 'this name is 
global, *even if* it gets assigned to'. You only need it when you are 
assigning to globals, not when you are mutating them (appending to 
lists, adding to dictionaries, etc.)
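
The difference between assigning and mutating can be sketched as follows. This is an illustration for a current Python 3 interpreter, and the names registry, total, register, add and broken_add are invented for the example:

```python
registry = []  # mutated in place below, never rebound
total = 0      # rebound below, so 'global' is required


def register(item):
    # Mutating the existing list object: no 'global' needed.
    registry.append(item)


def add(n):
    global total  # without this line, 'total' would be a local name
    total = total + n


def broken_add(n):
    # The assignment makes 'total' local to this function, so the read
    # on the right-hand side fails before anything is stored.
    total = total + n


register("spam")
add(3)
try:
    broken_add(1)
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"
```

broken_add is the same trap as the second FillX above: the name is local because it is assigned to somewhere in the function, even though the read comes first.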


-------------------------------------------------------------
How do Python and Java compare?
http://www.faqts.com/knowledge-base/view.phtml/aid/2906
-------------------------------------------------------------
Fiona Czuczman
Courageous, Warren Postma, dana_booth, Cameron Laird, Roman Milner, Martijn Faassen, Bob Hays, Andrew Cooke, Glyph Lefkowitz

Topic 1:
I would just like to know how seriously Python is taken in the IT industry 
and how it compares to something like Java.

Answer(s):
- Recent events have proven that the computer industry is a lot less 
staid and unmoving than it used to be. Everybody's getting paranoid that 
someone else will beat them to the next big thing. That being so, I 
expect a big cloud of Python hype any day now. Perl had its hype day, 
as did Java.
- Larger companies probably do not take Python seriously. In the 
industry, though, small, faster moving companies can have an advantage. 
I work at a mid-sized manufacturing plant, and we're given the freedom 
to use whatever we like, as long as the plant keeps working. :) We use 
OpenBSD servers running MySQL and Samba. Shell, Perl, and Python 
scripting drives everything, and the Windows users are connecting using 
TCL/TK apps. We do lots of things that many plants our size do not do, 
or spend a lot of money doing.
That being said, it's been my experience that Python is easier to use 
than Java. While I use Perl for small text parsing scripts, I've found 
Python to be very well suited for creating even semi-large programs. 
Like many, I was excited about Java a few years ago, but found it to be 
a bit cumbersome for the small to medium sized apps that we needed to 
create. I especially don't like the I/O in Java.
- Larger companies like Intel, Motorola, Microsoft, Compaq, IBM, ...?
See, for example,
<URL:http://developer.intel.com/technology/efi/toolkit_overview.htm>.
That's a cheap shot on my part.  Mr. Booth is right.  Conservative
(in some sense) MIS departments *are* wont to scorn Python.  It can
be tough getting approval from many of them to use Python for a
project such as that under consideration here.
It's worth it, though--*particularly*, I claim, for Web work.
- In the company I work for, the two languages are both used for 
different (but overlapping) areas.  Java is used for the product; Python 
is used for scripting tasks (automating builds, etc).
This strikes me as quite sensible.  The real reason is probably
historical, but I feel Java is better for large projects because:
+ near-static typing helps enforce interfaces
+ various language constructions (eg Interfaces) help enforce a modular 
approach
+ easier to hire for
+ programmers used to Java code, which tends to be more explicit (long 
winded and tedious if you like), and so easier to understand
On the other hand, Python is better for scripts because:
+ compact
+ flexible (can use functions as well as objects; objects blend nicely
  into the general syntax using __ methods)
+ quicker to write
+ easier to modify
Note that the argument above doesn't mean Java is better - in many ways
it is worse, but safer/more common.  Cue arguments about blunt knives
;-)
Performance is not critical in the software I am involved in - provided
the speed is reasonable, reliability is more important.  Otherwise, I
guess we'd use C.

Topic 2:
If a web application was written in both Java and Python,
which would companies prefer?

Answer(s):
- Java. This is because IT professionals pick products that they can't 
get fired for picking. An old saying in the IT industry: "nobody ever 
got fired for picking Oracle."
- I heard it as "nobody ever got fired for buying IBMs".  Then the clone
industry sort of stole IBM's thunder, technological lead, etcetera.  
IBMs became Also Rans. So will Java.  Once the hype evaporates, you have 
just one more language, with good bits and not so good bits.
- *companies* would always prefer Java, because that has all the buzz,
they've heard of it, it's hyped all over the place. Whether they 
*should* prefer Java is another question.
- I work in the IT organization of a large multinational bank (in the
top ten world-wide).
When we had to build an interface system that required mapping data
and we didn't know the target map (it kept changing - big surprise,
right?), we used Python for the mapping.  What used to take days to
change (in C++) then took at most an hour; testing became the long leg
of development....
The only problem I had in getting Python accepted was that most people
have Java experience but not Python experience.  I handled that by
having four people in the group learn Python enough to modify our
production app - it took each of them less than three days to get up
to speed.  We brought in a consultant to port the application to a new
version of the underlying library (which is SWIG wrapped) and it took
him two days.  Case made, no more problems....
BTW, we've used JPython in a second production application already.
- Java, mostly because Java has a cooler logo, and Java is backed by
Sun, which is a big company.  Software companies, especially, prefer
this, because they like to think that having a large company behind
something means something.
That's if you're marketing it as being in a certain language.  One
thing that we do too much in this industry is focus on our
implementation technology.  YOU should be familiar with the
technology, YOU should be aware of its strengths and weaknesses, but
your customers should just know what it does and how well it does it.
It would be cool if you could get python a little bit more press, but
you don't necessarily have to try to ride Java's (or Python's)
marketing success.  If you do that, the success of your product
becomes bound up with successes and failings of the language and
platform which may or may not have anything to do with your program at
all.

Topic 3:
Which is more maintainable?

Answer(s):
- In my humble opinion, I would put it this way:
    1. It takes less Python code to do something than it takes in Java
    2. The Less Code you write, the less there is to maintain.
As a matter of opinion, I find Python far more "readable". The syntax is
clear, the built in data structures and types are simple enough to 
grasp, but powerful enough to be useable.
- Depends on the code you write, I imagine. Python may be considered 
less maintainable as it's harder to enforce interfaces, can't do static 
type checking, etc. Python may be considered more maintainable as its 
code is shorter, more readable, you get flatter inheritance hierarchies,
you can more easily change things around, and you can more easily 
reimplement things. I myself would prefer Python; I think a solidly 
engineered piece of Python code would be as maintainable if not more 
than a solidly engineered piece of Java code, and faster to set up to 
boot.
- Can't choose.  Python is easier to change but IMHO works better as a
prototyping and quick-mod layer, not quite as robust as Java.
- Python.  Python can do just about anything in fewer lines of code, and
is vastly more readable to boot (not to mention having lovely
space-saving features like function and method objects and
module-global variables.)  I don't think I've *ever* found a language
more maintainable than python, and I love it.  Even one-liners I
tossed off to solve very specific problems are completely readable
(and even somewhat reusable!) to me weeks after I write them, even if
I'm trying to be purposefully obtuse or clever ;-)
Not to mention the fact that everything is a first-class object in
python, so you don't end up worrying about how the heck you're going
to write your code to deal with int/long/double/float without
incurring extra overhead both in syntax and in memory: you can just
treat everything as a number in python.  (As if it were object
oriented or something!)
Compare the following two snippets:

class Foo:
    def __init__(self,x=15):
        self.x=x
    def bar(self, a, b):
        return self.x+a+b

...

public class Foo {
    int x;
    public Foo(int x) {
        this.x=x;
    }
    public int bar(int a, int b) {
        return x+a+b;
    }
}

The java version looks slightly more complex, but they appear to be
approximately the same: until you realize that Java is operating on
integers, and Python is operating on *anything which can be added*!
This could be strings, complex numbers, integers, floating point
numbers (despite the fact that we should, of course, never use
floating-point (thanks tim)). In order to get similar functionality
in the Java version with appropriately aesthetic syntax, the code
would be about 20 times longer.  Less code == less to fix == less to
maintain == fewer bugs.
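
To make the "anything which can be added" claim concrete, here is a small sketch that reuses the Python Foo class from the snippet above unchanged and calls bar with several operand types. It is written for a current Python 3 interpreter, and the variable names are only illustrative:

```python
class Foo:
    def __init__(self, x=15):
        self.x = x

    def bar(self, a, b):
        # Works for any operands that support '+', not just integers.
        return self.x + a + b


int_result = Foo().bar(1, 2)                   # integers, using the default x
str_result = Foo("spam").bar(" and ", "eggs")  # string concatenation
complex_result = Foo(1j).bar(2, 3)             # complex arithmetic
list_result = Foo([1]).bar([2], [3])           # list concatenation
```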

Topic 4:
Which is more scalable?

Answer(s):
- Keep in mind that a lot of your python can connect to third party 
tools at native speeds, if you learn a trick or two.
- Scalable up or scalable down? Python scales DOWN much better. There's 
a lot more overhead in making and distributing a Java application, so 
Java scales down poorly.
On the other hand, I'd say more work has gone into Java, and using Java 
for what would traditionally be called a "transaction processing" (TP) 
system. Java has some features in its Enterprise Java Beans 
architecture specifically designed to handle high volume transaction 
technology. This is not garden variety stuff, and not something that 
most small or medium sized businesses need to worry about. This is more 
like "I run a bank and I have 100,000 online customers accessing their 
account data and doing transactions at the same time" type stuff.  Not 
everybody needs Java or Enterprise Java Beans, but some people do. For 
the time being at least, Python has no equivalent technology.
- As for features, I'd say this would be Java; it gets more development
in this department, and I imagine it can run on SMP machines without
any global interpreter lock.
That said, the Python environment is more dynamic which may help
you in scaling (distributed objects and so on), the Python environment
is more compact, Python may be even more portable than Java, so you
can scale up to bigger machines.
- Solely depends upon your architecture, not your language IMHO.
- Python has a global interpreter lock, so multithreading doesn't speed
it up as much.  So, if you're going to buy an Enterprise Server
1000000000 or whatever, with altogether too many processors to run
your code, the answer is *probably* java.  (There are ways around this
in python, but not all of them are immediately obvious.)  The minimum
investment to get good performance out of Java is higher than that to
get good performance out of python though.
However, Java has some issues with memory usage.  I have a machine
with more than enough RAM to do this:

glyph@helix:~% python
Python 1.5.2 (#0, Apr  3 2000, 14:46:48)  [GCC 2.95.2 20000313 (Debian GNU/Linux)] on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> x=range(5000000)
>>> 
glyph@helix:~% cd app/IBMJPython-1.1 
glyph@helix:~/app/IBMJPython-1.1% ./jpython
JPython 1.1 on java1.3.0 (JIT: jitc)
Copyright (C) 1997-1999 Corporation for National Research Initiatives
>>> x=range(5000000)
Out of Memory
You might want to try the -mx flag to increase heap size
So it depends how you want to scale.
In JPython, which is translating the Python to Java bytecodes and
running it as if it were java, this allocation is too large.  Not
because java has more overhead and I don't have enough RAM for it, but
because Java gives up halfway through when it realizes there's not
enough space in its "allowed" memory block for that list.  Notice that
I'm using "IBMJpython" here, which is JPython installed on the newest,
funkiest JVM available for Linux, the IBM JDK 1.3 port.  It is
*possible* to increase maximum memory usage by passing commandline
options, but (A) who wants to figure out how much memory a
long-running application is going to take ahead of time and have to
restart it if it overgrows that and (B) performance begins to suffer
as you do that.
Also, you can easily bind your Python code to other systems, and
remove the performance bottlenecks that profiling turns up, by
extending or embedding Python in C or C++.
And finally; this point is often overlooked, but it is VERY important;
critical, almost -- bindings for Python to native functionality on
your platform *probably already exist*.  Java shoehorns you into ONE
API, ONE loosely-defined standard implemented shoddily by Sun, ONE way
to write GUI's, no ability to do multiplexing (all I/O is blocking and
in threads) ... all in all, the poor performance and poor scalability
of Java's standard library damn the language more thoroughly than any
other feature of it.  After all, java "scales" because it has true
multithreading that will take advantage of multiple processors, but
there are optimizing algorithms for servers that are *impossible* in
java because of library decisions (no multiplexing, as I said),
whereas Python will give them to you.
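
The multiplexing mentioned in the last answer can be sketched with the standard select module: one thread waits on several sockets at once and reads only from the ones that have data. This is an illustration for a current Python 3 interpreter. The helper ready_messages is hypothetical, and the socket pairs merely stand in for real client connections:

```python
import select
import socket


def ready_messages(socks, timeout=1.0):
    """Return data from whichever sockets are readable, without threads."""
    readable, _, _ = select.select(socks, [], [], timeout)
    return [s.recv(1024) for s in readable]


# Two connected socket pairs stand in for two client connections.
a_send, a_recv = socket.socketpair()
b_send, b_recv = socket.socketpair()
try:
    a_send.sendall(b"hello")  # only the first "client" speaks
    got = ready_messages([a_recv, b_recv])
finally:
    for s in (a_send, a_recv, b_send, b_recv):
        s.close()
```

Only a_recv is reported as readable, so the silent connection never blocks the loop.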

Topic 5:
Which is faster eg. Java servlets or Python CGI with mod_python?

Answer(s):
- Don't know about that - but there is some strong evidence that Zope
(python based application server) is faster than servlets:
http://www.zope.org/Members/BwanaZulia/zope_benchmarks/tomcat1.html
I think any major web project should consider Zope.  It has
transformed the way I think of web development.  FWIW, I work at a
moderate sized telco and we do all of our web sites in Zope.
More on Zope at: http://www.zope.org
- In theory Java code should be faster; it does a lot more optimizing.
In practice however..
Zope seems to outspeed some Java servlet servers:
http://www.zope.org/Members/BwanaZulia/zope_benchmarks/benchmarks.html
More on Zope: http://www.zope.org 
Zope is based on Python. Definitely do look at Zope if you're into
web programming.
Doing the same task in Python may often be faster than Java, possibly
due to Python's high-level constructs and the huge overhead of the 
average Java runtime environment. Here's some research on that and more:
http://wwwipd.ira.uka.de/~prechelt/Biblio/jccpprtTR.pdf
Note that the *development speed* with Python is generally estimated to 
be quite a bit faster than the development speed with Java. This is also 
an important factor!
- Can't tell you.  However, we did some timings on Python, JPython and
Java (I rewrote a few Python/JPython classes in Java - took < 1 hour)
and found that Java was faster than both, with Python coming in next
and JPython being pretty slow.
- Since Python does all of its I/O buffering in C, and Java does all of
its buffering in Java, Python is going to be faster, despite all of
Java's "theoretically optimal" interpreter optimizations.  If your
code is *really really* CPU bound, java might do better, but given
Java's wonky cpu-hog GC behavior, it's likely that you'll lose there
too.  (If you're seriously that CPU bound, nothing beats C, so a C
application with python "steering" would beat java anyway.)

Servlets also use the standard-output facility in java: as shown on my
Java-versus-Python page --

        http://www.twistedmatrix.com/~glyph/rant/python-vs-java.html

this is NOT very fast at all.  I have no idea why the performance
difference is so significant.

Final Notes:

- Well if it's performance you want, ASP and embedded COM is
probably what you want.
- Performance at the price of portability, definitely.  ASP + native COM
interfaces in C/C++ and Com+'s transactions. The "MTS" feature of COM+ 
would give you speed and scalability and  absolutely NO portability!
You'd also be working in an environment with a lot of inertia to 
overcome whenever changes are required in the system. Python is great 
from Prototype to working systems precisely because it adapts quickly.
I'd accept lower performance and use SOAP instead of COM+/MTS. From what 
I've read and seen, Microsoft's MTS "object runtime environment" is 
extremely fast.  Writing COM components in Visual C++ can be a pain in 
the butt, but if all you really need is performance, and you don't need 
flexibility, it's definitely going to scale up nicely.
Now if you factor the Real and Messy world in, where requirements are
constantly changing, you can color me Python again.  I would rather 
maintain and develop a system in python where I can adapt quickly to 
changing requirements than work on a system where draining every last 
drop of performance from the bare iron was a driving force in the design 
of the system.
In the end, it's not always about performance.
- The main advantage of Python would be its readability and high 
development speed. Another big advantage of Python is the existence of 
Zope. Yet another advantage is Python and Zope's open sourceness. The 
main advantage of Java would be its massive industry support.
- Have you seen
<URL:http://www-4.ibm.com/software/developer/library/script-survey/>?
Which is faster?  Both are.  Run-time performance of the common
implementations (and Web embeddings) of Java and Python is sufficiently
close that the comparison depends strongly on the details of
the application under consideration.  Vulgar recognition of this
often appears as, "benchmarks are garbage."  In fact, benchmarks are
very valuable.  It's quite likely in your case that run-time
performance is *not* a significant differentiator.
Others have already written you about scalability and maintainability.
Python's more portable than Java.  You didn't ask, but you should
know that.  Java's improving, and someday will probably dominate here.
A few years ago, we thought surely it'd happen by now.  It hasn't, yet.
As others have hinted, it's not necessary to put the two in opposition.
It can be quite rational to use both Python and Java, sometimes
together, with JPython (or even more esoteric bindings).  Yes, I
understand MISthink that claims to want to standardize on One True
Language.  If that's truly a constraint on you, we can discuss
strategies for dealing with it.
- I am a "java expert".  I've been working with the language since its
inception.  It started out as a genuinely good thing: but it has
fallen into complete decay.  If you don't decide to go with python for
this project of yours, I would highly recommend finding something
other than Java for this.  History is *already* littered with the
corpses of projects which thought that Java would solve their
problems.  (Corel Office for Java, Marimba, hundreds of unreleased,
unpublicized projects...)
I am a "python newbie".  I have been working with python for 3-6
months (I don't remember exactly how long).  Even in this short time,
I have come to love python, not because it's the end-all be-all of
programming languages, but because it actually picks some things that
it wants to be good at, and does those things very well.  The
language's strengths are well-matched to the interpreter's, and the
environment is overall a positive experience.  Not only that, I look
around every day and see successful projects, both open (Zope) and
closed (UltraSeek) that are using python with success.
Java attempts to be everything to everyone, and, as such things are
fated, becomes nothing to anyone.  Java CLAIMS to do everything well,
but actually does everything so poorly that Sun has to promise that it
will be "better in the next release", and they've been doing this for
long enough that it amazes me that people believe them anymore.
- I do not intend to discredit java here; it is certainly worthwhile for
some things, but it is *really* not everything Sun claims it is.
Hopefully if people realize this, Sun will actually make strides
towards delivering on all of these wonderful promises that they've
made, or relinquish control to someone who will.
The thing that Java is most useful for at the moment is interfacing
with other applications written in Java; it's very easy to link any
arbitrary java code to any other arbitrary java code, even without a
"development kit".  This is nothing on python's introspection, but
it's certainly leaps and bounds beyond C (I won't even talk about C++.
Fragile base classes?  Yuck.), and a lot of things are available to
work with.  If you can afford to take a slight speed hit for
maintainability, readability, and flexibility, but you still need
interoperability with Java, JPython is a *wonderful* thing.  Check it
out at www.jpython.org.


-------------------------------------------------------------
Should __getattr__ increment the reference count before it returns the appropriate attribute, or not? writing a Python module in C++
http://www.faqts.com/knowledge-base/view.phtml/aid/2907
-------------------------------------------------------------
Fiona Czuczman
Gordon McMillan

It should incref the attribute and leave the owner alone.

Imagine a sequence like this:
 newref = a.b # here's your __getattr__
 a = None
Now a's refcount drops. If it drops to 0, it gets deallocated, which 
will decref b. Without an incref in __getattr__, the user would have 
an invalid reference.
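
The sequence above can be watched from the Python side. The sketch below is not the C code in question (where the incref would be a Py_INCREF on the attribute before it is returned). It only shows the behaviour a caller relies on, and it assumes CPython's immediate reference counting. The class names Owner and Payload are invented for the example:

```python
import weakref


class Payload:
    def __init__(self, label):
        self.label = label


class Owner:
    def __init__(self):
        self.b = Payload("still alive")


def demo():
    a = Owner()
    owner_gone = weakref.ref(a)  # lets us observe when 'a' is deallocated
    newref = a.b                 # attribute lookup hands back a new reference
    a = None                     # owner's refcount drops to 0 (CPython)
    # The owner is gone, but the attribute survives through 'newref'.
    return owner_gone() is None, newref.label


owner_freed, label = demo()
```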


-------------------------------------------------------------
Are colons unnecessary in my Python code?
http://www.faqts.com/knowledge-base/view.phtml/aid/2910
-------------------------------------------------------------
Fiona Czuczman
Courageous, David Porter, François Pinard, Moshe Zadka, Dirck Blaskey

Lead in discussion:

Sometimes colons seem syntactically unnecessary. For example:

        if <condition>:
                statement
        else:
                statement

Really, else doesn't need a colon, as far as I can tell (I can see the 
need for the if, supposing you want to have the statement on the same 
line).

Answer(s):

- When using python-mode in Emacs (or jed), the colon facilitates
auto-indentation. Also, if you forget the colon, the next line will not be
indented, so you will catch your mistake. 

- Granted, but it is always good for a language to have a bit of 
redundant information.  When properly thought out, such redundancy 
prevents various mistakes by programmers (once they are used to it, of 
course :-), and often increases overall legibility.

- Theoretically, a colon is only necessary in things like

if yes: print "yes"

Since otherwise the parser can figure out when to stick a colon. 
However, usability studies show people are more comfortable when the 
beginning of a block is signaled, and I can see why:

if yes
        print "yes"

Seems....naked. Much less readable than 

if yes:
        print "yes"

Guido didn't want 10 ways (or even 2) to spell things, so the colon is
mandated for all.

- > if yes: print "yes"

Oddly enough, the parser doesn't really need the colon here either.
It can manage to figure out where the if expression ends without it.

The colon is almost entirely for readability purposes.
(there are a couple of places where ambiguity occurs without it).

If you're curious about the other thread,
or about Python without colons,
or to test my above assertion,
check out:

http://www.danbala.com/python/colopt
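
Whatever a modified parser could cope with, the stock interpreter does insist on the colon. A quick way to check is to hand source text to the built-in compile(). This is a sketch for a current Python 3 interpreter, hence print as a function, and the helper name compiles is made up:

```python
def compiles(source):
    """Return True if the interpreter's parser accepts the source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


with_colon = compiles("if yes:\n    print('yes')\n")
one_liner = compiles("if yes: print('yes')\n")
without_colon = compiles("if yes\n    print('yes')\n")
```

The first two forms are accepted. The third is rejected with a SyntaxError.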


-------------------------------------------------------------
Does anyone have any hints in configuring Python for BeOS?
http://www.faqts.com/knowledge-base/view.phtml/aid/2911
-------------------------------------------------------------
Fiona Czuczman
Donn Cave

The Problem:

I've managed to compile it but under make test it just hangs on 
test_popen2.py. Once I remove that file from the testing sequence, it 
seems to go into an infinite loop after test_threading. The struct 
module test failed as well.

A possible path to the solution:

Well, I have three hints:

 1. Change the optimization from '-O3 -mpentiumpro' to just plain '-O'.
    I'd start with a clean unpack of the distribution and make this
    change to "configure", and then run configure with the usual flags.
    (--with-thread  --prefix=/boot/home/config)

That's the most important.  The problem with struct is some incorrect 
code from gcc -O3.  (I assume you're on Intel hardware.)  I don't know 
exactly what it does, but it works fine if I insert a printf.  Who knows 
what else could suffer, maybe the test_threading problem is also a 
compiler error.  The worst thing was when some experiments with popen2 
dropped the whole system into the kernel debugger.

 2. Don't bother with test_popen2.  I don't know what's wrong with it,
    but it's going to hang.

 3. Don't bother with test_select.  Likewise don't know what the actual
    problem is, but BeOS select has bugs, defects, weaknesses etc.
    One thing you might fix in selectmodule.c, if you need select(),
    is to use FD_SETSIZE for the 1st parameter to the C function,
    instead of the calculated maximum size.

With those three changes, I made it through the test suite with only the
usual failures.