Date: 16 May 2000 15:02:02 -0000
To: python-list@python.org
Subject: [FAQTS] Python Knowledge Base Update -- May 16th, 2000
From: Fiona Czuczman <fiona@sitegnome.com>

Hi All,

Another bundle of entries into http://python.faqts.com

Cheers, Fiona Czuczman

## New Entries #################################################

-------------------------------------------------------------
How does the statement 'global' work?
http://www.faqts.com/knowledge-base/view.phtml/aid/2902
-------------------------------------------------------------
Fiona Czuczman
Remco Gerlich, Thomas Wouters

Answer1:

Inside functions, if you assign to a variable, it is assumed to be local to the function. If you want to assign to a global (module namespace) variable, you'll have to tell Python that it's a global first.

i.e.,

    x = 4
    def spam():
        x = 5
    spam()

doesn't change the module's x. But

    x = 4
    def spam():
        global x
        x = 5
    spam()

does.

If, inside a function, you only use the variable and never assign to it, it can't be a local variable, so Python assumes you mean a global.

This doesn't only hold for functions, but classes and methods too.

Answer2:

Python has two namespaces it uses for name lookups: the 'local' namespace and the 'global' namespace (you could call it the 'search path'.) The local namespace is the function or class you are in, the global namespace is the namespace of the module. If you are not in a function or class, the local namespace is the global namespace.

However, *assigning* to a variable does not use this search path! Instead, if you assign, you always assign in the local namespace. So, for instance:

    X = None
    def FillX(x):
        X = x

would not work, because the X you assign to is the *local* X, not the global one. This wouldn't even generate an error, by the way. It is even more confusing if you read from and assign to 'X' in the same function:

    X = None
    def FillX(x):
        if not X:
            X = x

In Python 1.5.2, this generates a NameError:

    Traceback (innermost last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 2, in FillX
    NameError: X

and in Python 1.6, an UnboundLocalError:

    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 3, in FillX
    UnboundLocalError: X

It generates this strange error because the 'X' name exists in the local namespace, but at the time of the test on X, it isn't filled in yet. The interpreter knows it should be there, but can't find it at the time of execution.

And this is what the 'global' keyword is for. It says 'this name is global, *even if* it gets assigned to'. You only need it when you are assigning to globals, not when you are mutating them (appending to lists, adding to dictionaries, etc.)

-------------------------------------------------------------
How do Python and Java compare?
http://www.faqts.com/knowledge-base/view.phtml/aid/2906
-------------------------------------------------------------
Fiona Czuczman
Courageous, Warren Postma, dana_booth, Cameron Laird, Roman Milner, Martijn Faassen, Bob Hays, Andrew Cooke, Glyph Lefkowitz

Topic 1: I would just like to know how serious python is taken in the IT industry and how it compares to something like Java.

Answer(s):

- Recent events have proven that the computer industry is a lot less staid and unmoving than it used to be. Everybody's getting paranoid that someone else will beat them to the next big thing. That being so, I expect a big cloud of Python hype any day now. Perl had its hype day, as did Java.

- Larger companies probably do not take Python seriously. In the industry, though, small, faster moving companies can have an advantage. I work at a mid-sized manufacturing plant, and we're given the freedom to use whatever we like, as long as the plant keeps working. :) We use OpenBSD servers running MySQL and Samba. Shell, Perl, and Python scripting drives everything, and the Windows users are connecting using TCL/TK apps. We do lots of things that many plants our size do not do, or spend a lot of money doing.
  That being said, it's been my experience that Python is easier to use than Java. While I use Perl for small text parsing scripts, I've found Python to be very well suited for creating even semi-large programs. Like many, I was excited about Java a few years ago, but found it to be a bit cumbersome for the small to medium sized apps that we needed to create. I especially don't like the I/O in Java.

- Larger companies like Intel, Motorola, Microsoft, Compaq, IBM, ...? See, for example, <URL:http://developer.intel.com/technology/efi/toolkit_overview.htm>.

  That's a cheap shot on my part. Mr. Booth is right. Conservative (in some sense) MIS departments *are* wont to scorn Python. It can be tough getting approval from many of them to use Python for a project such as that under consideration here. It's worth it, though--*particularly*, I claim, for Web work.

- In the company I work for, the two languages are both used for different (but overlapping) areas. Java is used for the product; Python is used for scripting tasks (automating builds, etc). This strikes me as quite sensible. The real reason is probably historical, but I feel Java is better for large projects because:

  + near-static typing helps enforce interfaces
  + various language constructions (eg Interfaces) help enforce a modular approach
  + easier to hire for
  + programmers used to Java code, which tends to be more explicit (long winded and tedious if you like), and so easier to understand

  On the other hand, Python is better for scripts because:

  + compact
  + flexible (can use functions as well as objects; objects blend nicely into the general syntax using __ methods)
  + quicker to write
  + easier to modify

  Note that the argument above doesn't mean Java is better - in many ways it is worse, but safer/more common. Cue arguments about blunt knives ;-)

  Performance is not critical in the software I am involved in - provided the speed is reasonable, reliability is more important. Otherwise, I guess we'd use C.
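On the "can use functions as well as objects; objects blend nicely into the general syntax using __ methods" point above, a minimal sketch may help. The names Money and total are hypothetical, made up purely for illustration; nothing here comes from the posters' own code.

```python
class Money:
    """Hypothetical value type, invented for this illustration."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Called for `self + other`, so instances work with the + operator.
        return Money(self.cents + other.cents)

    def __repr__(self):
        return "Money(%d)" % self.cents


def total(items):
    """Plain module-level function: adds up anything that supports +."""
    result = items[0]
    for item in items[1:]:
        result = result + item
    return result


print(total([1, 2, 3]))                  # prints 6
print(total([Money(150), Money(250)]))   # prints Money(400)
```

The same plain function works on built-in numbers and on the user-defined class, because the class opts into the + syntax through __add__. That is roughly what "blend nicely" seems to mean here.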
Topic 2: If a web application was written in both Java and Python, which would companies prefer?

Answer(s):

- Java. This is because IT professionals pick products that they can't get fired for picking. An old saying in the IT industry: "nobody ever got fired for picking Oracle."

- I heard it as "nobody ever got fired for buying IBMs". Then the clone industry sort of stole IBM's thunder, technological lead, etcetera. IBMs became Also Rans. So will Java. Once the hype evaporates, you have just one more language, with good bits and not so good bits.

- *companies* would always prefer Java, because that has all the buzz, they've heard of it, it's hyped all over the place. Whether they *should* prefer Java is another question.

- I work in the IT organization of a large multinational bank (in the top ten world-wide). When we had to build an interface system that required mapping data and we didn't know the target map (it kept changing - big surprise, right?), we used Python for the mapping. What used to take days to change (in C++) then took at most an hour; testing became the long leg of development....

  The only problem I had in getting Python accepted was that most people have Java experience but not Python experience. I handled that by having four people in the group learn Python enough to modify our production app - it took each of them less than three days to get up to speed. We brought in a consultant to port the application to a new version of the underlying library (which is SWIG wrapped) and it took him two days. Case made, no more problems....

  BTW, we've used JPython in a second production application already.

- Java, mostly because Java has a cooler logo, and Java is backed by Sun, which is a big company. Software companies, especially, prefer this, because they like to think that having a large company behind something means something.

  That's if you're marketing it as being in a certain language.
  One thing that we do too much in this industry is focus on our implementation technology. YOU should be familiar with the technology, YOU should be aware of its strengths and weaknesses, but your customers should just know what it does and how well it does it. It would be cool if you could get python a little bit more press, but you don't necessarily have to try to ride Java's (or Python's) marketing success. If you do that, the success of your product becomes bound up with successes and failings of the language and platform which may or may not have anything to do with your program at all.

Topic 3: Which is more maintainable?

Answer(s):

- In my humble opinion, I would put it this way:

  1. It takes less Python code to do something than it takes in Java.
  2. The less code you write, the less there is to maintain.

  As a matter of opinion, I find Python far more "readable". The syntax is clear, the built in data structures and types are simple enough to grasp, but powerful enough to be useable.

- Depends on the code you write, I imagine. Python may be considered less maintainable as it's harder to enforce interfaces, can't do static type checking, etc. Python may be considered more maintainable as its code is shorter, more readable, you get flatter inheritance hierarchies, you can more easily change things around, and you can more easily reimplement things.

  I myself would prefer Python; I think a solidly engineered piece of Python code would be as maintainable as, if not more than, a solidly engineered piece of Java code, and faster to set up to boot.

- Can't choose. Python is easier to change but IMHO works better as a prototyping and quick-mod layer, not quite as robust as Java.

- Python. Python can do just about anything in fewer lines of code, and is vastly more readable to boot (not to mention having lovely space-saving features like function and method objects and module-global variables.)

  I don't think I've *ever* found a language more maintainable than python, and I love it.
  Even one-liners I tossed off to solve very specific problems are completely readable (and even somewhat reusable!) to me weeks after I write them, even if I'm trying to be purposefully obtuse or clever ;-)

  Not to mention the fact that everything is a first-class object in python, so you don't end up worrying about how the heck you're going to write your code to deal with int/long/double/float without incurring extra overhead both in syntax and in memory: you can just treat everything as a number in python. (As if it were object oriented or something!)

  Compare the following two snippets:

      class Foo:
          def __init__(self, x=15):
              self.x = x
          def bar(self, a, b):
              return self.x + a + b

      ...

      public class Foo {
          int x;
          public Foo(int x) {
              this.x = x;
          }
          public int bar(int a, int b) {
              return x + a + b;
          }
      }

  The java version looks slightly more complex, but they appear to be approximately the same: until you realize that Java is operating on integers, and Python is operating on *anything which can be added*! This could be strings, complex numbers, integers, floating point numbers (despite the fact that we should, of course, never use floating-point (thanks tim)). In order to get similar functionality in the Java version with appropriately aesthetic syntax, the code would be about 20 times longer.

  Less code == less to fix == less to maintain == fewer bugs.

Topic 4: Which is more scalable?

Answer(s):

- Keep in mind that a lot of your python can connect to third party tools at native speeds, if you learn a trick or two.

- Scalable up or scalable down? Python scales DOWN much better. There's a lot more overhead in making and distributing a Java application, so Java scales down poorly.

  On the other hand, I'd say more work has gone into Java, and using Java for what would traditionally be called a "transaction processing" (TP) system. Java has some features in its Enterprise Java Beans architecture specifically designed to handle high volume transaction technology.
  This is not garden variety stuff, and not something that most small or medium sized businesses need to worry about. This is more like "I run a bank and I have 100,000 online customers accessing their account data and doing transactions at the same time" type stuff. Not everybody needs Java or Enterprise Java Beans, but some people do. For the time being at least, Python has no equivalent technology.

- As for features, I'd say this would be Java; it gets more development in this department, and I imagine it can run on SMP machines without any global interpreter lock.

  That said, the Python environment is more dynamic which may help you in scaling (distributed objects and so on), the Python environment is more compact, Python may be even more portable than Java, so you can scale up to bigger machines.

- Solely depends upon your architecture, not your language IMHO.

- Python has a global interpreter lock, so multithreading doesn't speed it up as much. So, if you're going to buy an Enterprise Server 1000000000 or whatever, with altogether too many processors to run your code, the answer is *probably* java. (There are ways around this in python, but not all of them are immediately obvious.) The minimum investment to get good performance out of Java is higher than that to get good performance out of python though.

  However, Java has some issues with memory usage. I have a machine with more than enough RAM to do this:

      glyph@helix:~% python
      Python 1.5.2 (#0, Apr 3 2000, 14:46:48) [GCC 2.95.2 20000313 (Debian GNU/Linux)] on linux2
      Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
      >>> x=range(5000000)
      >>>
      glyph@helix:~% cd app/IBMJPython-1.1
      glyph@helix:~/app/IBMJPython-1.1% ./jpython
      JPython 1.1 on java1.3.0 (JIT: jitc)
      Copyright (C) 1997-1999 Corporation for National Research Initiatives
      >>> x=range(5000000)
      Out of Memory
      You might want to try the -mx flag to increase heap size

  So it depends how you want to scale.
  In JPython, which is translating the Python to Java bytecodes and running it as if it were java, this allocation is too large. Not because java has more overhead and I don't have enough RAM for it, but because Java gives up halfway through when it realizes there's not enough space in its "allowed" memory block for that list. Notice that I'm using "IBMJpython" here, which is JPython installed on the newest, funkiest JVM available for Linux, the IBM JDK 1.3 port. It is *possible* to increase maximum memory usage by passing commandline options, but (A) who wants to figure out how much memory a long-running application is going to take ahead of time and have to restart it if it overgrows that and (B) performance begins to suffer as you do that.

  Also, you can easily bind your Python code to other systems and profile out the performance bottlenecks by extending or embedding Python in C or C++.

  And finally, this point is often overlooked, but it is VERY important; critical, almost -- bindings for Python to native functionality on your platform *probably already exist*. Java shoehorns you into ONE API, ONE loosely-defined standard implemented shoddily by Sun, ONE way to write GUI's, no ability to do multiplexing (all I/O is blocking and in threads) ... all in all, the poor performance and poor scalability of Java's standard library damn the language more thoroughly than any other feature of it. After all, java "scales" because it has true multithreading that will take advantage of multiple processors, but there are optimizing algorithms for servers that are *impossible* in java because of library decisions (no multiplexing, as I said), whereas Python will give them to you.

Topic 5: Which is faster, e.g. Java servlets or Python CGI with mod_python?
Answer(s):

- Don't know about that - but there is some strong evidence that Zope (python based application server) is faster than servlets:

  http://www.zope.org/Members/BwanaZulia/zope_benchmarks/tomcat1.html

  I think any major web project should consider Zope. It has transformed the way I think of web development. FWIW, I work at a moderate sized telco and we do all of our web sites in Zope.

  More on Zope at: http://www.zope.org

- In theory Java code should be faster; it does a lot more optimizing. In practice, however, Zope seems to outspeed some Java servlet servers:

  http://www.zope.org/Members/BwanaZulia/zope_benchmarks/benchmarks.html

  More on Zope: http://www.zope.org

  Zope is based on Python. Definitely do look at Zope if you're into web programming.

  Doing the same task in Python may often be faster than Java, possibly due to Python's high-level constructs and the huge overhead of the average Java runtime environment. Here's some research on that and more:

  http://wwwipd.ira.uka.de/~prechelt/Biblio/jccpprtTR.pdf

  Note that the *development speed* with Python is generally estimated to be quite a bit faster than the development speed with Java. This is also an important factor!

- Can't tell you. However, we did some timings on Python, JPython and Java (I rewrote a few Python/JPython classes in Java - took < 1 hour) and found that Java was faster than both, with Python coming in next and JPython being pretty slow.

- Since Python does all of its I/O buffering in C, and Java does all of its buffering in Java, Python is going to be faster, despite all of Java's "theoretically optimal" interpreter optimizations. If your code is *really really* CPU bound, java might do better, but given Java's wonky cpu-hog GC behavior, it's likely that you'll lose there too. (If you're seriously that CPU bound, nothing beats C, so a C application with python "steering" would beat java anyway.)
  Servlets also use the standard-output facility in java: as shown on my Java-versus-Python page -- http://www.twistedmatrix.com/~glyph/rant/python-vs-java.html -- this is NOT very fast at all. I have no idea why the performance difference is so significant.

Final Notes:

- Well if it's performance you want, ASP and embedded COM is probably what you want.

- Performance at the price of portability, definitely. ASP + native COM interfaces in C/C++ and COM+'s transactions. The "MTS" feature of COM+ would give you speed and scalability and absolutely NO portability! You'd also be working in an environment with a lot of inertia to overcome whenever changes are required in the system. Python is great from Prototype to working systems precisely because it adapts quickly. I'd accept lower performance, and use SOAP instead of COM+/MTS.

  From what I've read and seen, Microsoft's MTS "object runtime environment" is extremely fast. Writing COM components in Visual C++ can be a pain in the butt, but if all you really need is performance, and you don't need flexibility, it's definitely going to scale up nicely.

  Now if you factor the Real and Messy world in, where requirements are constantly changing, you can color me Python again. I would rather maintain and develop a system in python where I can adapt quickly to changing requirements than work on a system where draining every last drop of performance from the bare iron was a driving force in the design of the system.

  In the end, it's not always about performance.

- The main advantage of Python would be its readability and high development speed. Another big advantage of Python is the existence of Zope. Yet another advantage is Python and Zope's open sourceness.

  The main advantage of Java would be its massive industry support.

- Have you seen <URL:http://www-4.ibm.com/software/developer/library/script-survey/>?

  Which is faster? Both are.
  Run-time performance of the common implementations (and Web embeddings) of Java and Python is sufficiently close that the comparison depends strongly on the details of the application under consideration. Vulgar recognition of this often appears as, "benchmarks are garbage." In fact, benchmarks are very valuable. It's quite likely in your case that run-time performance is *not* a significant differentiator.

  Others have already written you about scalability and maintainability.

  Python's more portable than Java. You didn't ask, but you should know that. Java's improving, and someday will probably dominate here. A few years ago, we thought surely it'd happen by now. It hasn't, yet.

  As others have hinted, it's not necessary to put the two in opposition. It can be quite rational to use both Python and Java, sometimes together, with JPython (or even more esoteric bindings). Yes, I understand MISthink that claims to want to standardize on One True Language. If that's truly a constraint on you, we can discuss strategies for dealing with it.

- I am a "java expert". I've been working with the language since its inception. It started out as a genuinely good thing: but it has fallen into complete decay. If you don't decide to go with python for this project of yours, I would highly recommend finding something other than Java for this. History is *already* littered with the corpses of projects which thought that Java would solve their problems. (Corel Office for Java, Marimba, hundreds of unreleased, unpublicized projects...)

  I am a "python newbie". I have been working with python for 3-6 months (I don't remember exactly how long). Even in this short time, I have come to love python, not because it's the end-all be-all of programming languages, but because it actually picks some things that it wants to be good at, and does those things very well. The language's strengths are well-matched to the interpreter's, and the environment is overall a positive experience.
  Not only that, I look around every day and see successful projects, both open (Zope) and closed (UltraSeek) that are using python with success.

  Java attempts to be everything to everyone, and, as such things are fated, becomes nothing to anyone. Java CLAIMS to do everything well, but actually does everything so poorly that Sun has to promise that it will be "better in the next release", and they've been doing this for long enough that it amazes me that people believe them anymore.

- I do not intend to discredit java here; it is certainly worthwhile for some things, but it is *really* not everything Sun claims it is. Hopefully if people realize this, Sun will actually make strides towards delivering on all of these wonderful promises that they've made, or relinquish control to someone who will.

  The thing that Java is most useful for at the moment is interfacing with other applications written in Java; it's very easy to link any arbitrary java code to any other arbitrary java code, even without a "development kit". This is nothing on python's introspection, but it's certainly leaps and bounds beyond C (I won't even talk about C++. Fragile base classes? Yuck.), and a lot of things are available to work with.

  If you can afford to take a slight speed hit for maintainability, readability, and flexibility, but you still need interoperability with Java, JPython is a *wonderful* thing. Check it out at www.jpython.org.

-------------------------------------------------------------
Should __getattr__ increment the reference count before it returns the appropriate attribute, or not? writing a Python module in C++
http://www.faqts.com/knowledge-base/view.phtml/aid/2907
-------------------------------------------------------------
Fiona Czuczman
Gordon McMillan

It should incref the attribute and leave the owner alone. Imagine a sequence like this:

    newref = a.b   # here's your __getattr__
    a = None

Now a's refcount drops. If it drops to 0, it gets deallocated, which will decref b.
Without an incref in __getattr__, the user would have an invalid reference.

-------------------------------------------------------------
Are colons unnecessary in my Python code?
http://www.faqts.com/knowledge-base/view.phtml/aid/2910
-------------------------------------------------------------
Fiona Czuczman
Courageous, David Porter, François Pinard, Moshe Zadka, Dirck Blaskey

Lead in discussion:

Sometimes colons seem syntactically unnecessary. For example:

    if <condition>: statement
    else: statement

Really, else doesn't need a colon, as far as I can tell (I can see the need for the if, supposing you want to have the statement on the same line).

Answer(s):

- When using python-mode in Emacs (or jed), the colon facilitates auto-indentation. Also, if you forget the colon, the next line will not be indented, so you will catch your mistake.

- Granted, but it is always good for a language to have a bit of redundant information. When properly thought out, such redundancy prevents various programmer mistakes (once they are used to it, of course :-), and often increases overall legibility.

- Theoretically, a colon is only necessary in things like

      if yes: print "yes"

  Since otherwise the parser can figure out when to stick a colon. However, usability studies show people are more comfortable when the beginning of a block is signaled, and I can see why:

      if yes
          print "yes"

  Seems....naked. Much less readable than

      if yes:
          print "yes"

  Guido didn't want 10 ways (or even 2) to spell things, so the colon is mandated for all.

- > if yes: print "yes"

  Oddly enough, the parser doesn't really need the colon here either. It can manage to figure out where the if expression ends without it. The colon is almost entirely for readability purposes. (there are a couple of places where ambiguity occurs without it).
  If you're curious about the other thread, or about Python without colons, or to test my above assertion, check out: http://www.danbala.com/python/colopt

-------------------------------------------------------------
Does anyone have any hints in configuring Python for BeOS?
http://www.faqts.com/knowledge-base/view.phtml/aid/2911
-------------------------------------------------------------
Fiona Czuczman
Donn Cave

The Problem:

I've managed to compile it but under make test it just hangs on test_p_open2.py. Once I remove that file from the testing sequence, it seems to go into an infinite loop after test_threading. Also the struct module test failed as well.

A possible path to the solution:

Well, I have three hints:

1. Change the optimization from '-O3 -mpentiumpro' to just plain '-O'. I'd start with a clean unpack of the distribution and make this change to "configure", and then run configure with the usual flags. (--with-thread --prefix=/boot/home/config)

   That's the most important. The problem with struct is some incorrect code from gcc -O3. (I assume you're on Intel hardware.) I don't know exactly what it does, but it works fine if I insert a printf. Who knows what else could suffer, maybe the test_threading problem is also a compiler error. The worst thing was when some experiments with popen2 dropped the whole system into the kernel debugger.

2. Don't bother with test_popen2. I don't know what's wrong with it, but it's going to hang.

3. Don't bother with test_select. Likewise don't know what the actual problem is, but BeOS select has bugs, defects, weaknesses etc. One thing you might fix in selectmodule.c, if you need select(), is to use FD_SETSIZE for the 1st parameter to the C function, instead of the calculated maximum size.

With those three changes, I made it through the test suite with only the usual failures.