Attn: Development Editor, Latest Caml Weekly News

From:	 Alan Schmitt <alan.schmitt@inria.fr>
To:	 lwn@lwn.net
Subject: Attn: Development Editor, Latest Caml Weekly News
Date:	 Tue, 7 May 2002 12:19:09 +0200

Hello,

Here is the latest Caml Weekly News, weeks 30 april to 7 may, 2002. 

1) Danvy "Functional Unparsing" style output in OCaml
2) "high end" type theory for working programmers?
3) input_line is blocking

======================================================================
1) Danvy "Functional Unparsing" style output in OCaml
----------------------------------------------------------------------
T. Kurt Bond answered Francois Pottier and announced:

> Check out Olivier Danvy's paper `Functional Unparsing':
> 
>   http://www.brics.dk/RS/98/12/
> 
> It describes a very nice way of programming `printf' within ML's type
> system. `scanf' could be handled in a similar way. Furthermore, this approach
> yields code that is claimed to be more efficient than O'Caml's current
> `printf' implementation (because the overhead of interpreting format strings
> is lower). Lastly, it scales up to more expressive format directives, such as
> a directive for printing a list.
> 
> It might be worth adding a module that implements Danvy-style `printf' and
> `scanf' to the standard library. Has anyone written such a module already?
> Otherwise, I might consider doing it.

Back during an earlier discussion of Danvy-style output (probably on
this) I implemented a simple module for this (possibly starting from
some code that flew by on the list).

This round of discussion prompted me to go back and finish it up and
knock of a few of the rough edges and package it up.  Here's the README:

Cpsio is an Objective Caml implementation of the
continuation-passing-style output from Olivier Danvy's paper
Functional Unparsing.  It is available from:

   http://tkb.mpl.com/~tkb/software.html

The distribution is a gzipped tar file. It includes two modules:

   + Cpsio, which has a format function comparable to sprintf; and

   + Cpsio_exp, which has format_string, format_out, and
     format_err functions comparable to sprintf, printf, and
     eprintf, and a format function that allows the user to
     specify an accumulator/output function and an initial value
     (the later function is used to build the three previous
     functions and is available to clients for use with
     client-defined accumualtor/output functions).

Both modules have (rough) Ocamldoc documentation. The distribution
also includes test/example/benchmark programs for both Cpsio and Cpsio_exp,
and benchmark programs for comparing against OCaml and C
printf-style output. Perfomance with the bytecode compiler mostly
seems slightly faster than the OCaml printf, while performance with
the native code compiler seems to range from slower than the OCaml
printf to barely faster than the OCaml printf.

Deficiencies
   + Despite the "io" in the name, it unfortunately does not
     include input at this time, just output.
   + The Makefile is weak and does not have an install target.

I do not claim that this code is most elegant or most efficent
implementation of this idea.

I would welcome comments on the code.

This software is in the public domain.

======================================================================
2) "high end" type theory for working programmers?
----------------------------------------------------------------------
Chris Hecker asked:

The list has had a lot of discussions about type theory behind the module
system, tuples, and the like lately.  Most of it has been over my head, 
which is fun, because it presents a challenge to try to figure out what
people are saying.  I am wondering how much of it is useful for actually
writing "regular" code (as opposed to compilers or theorem provers).  Are
there books (or survey papers) on this stuff that are meant to educate
working programmers, as opposed to language researchers?  For example,
where should I go to learn what this means, and whether I care (just a
randomly chosen sentence representative of stuff that's currently over my
head from the past few days on the list):

"That functor is essentially the polymorphic identity functor, while the
other variation was a polymorphic eta-expansion of the abstraction operator."

or another example:

"In this encoding, modules are only records, so module types are ordinary
types, and there is no distinction between ordinary abstract types
(introduced by explicit polymorphic abstraction) and ``abstract
signatures''. There is, as far as I can tell, no need for kind polymorphism."

I started using caml to find out if a "higher level" language could make a
difference in my programming productivity (writing video games).  As I
continue with that experiment, I'm curious to know whether understanding
this high end type theory stuff would help make me a better programmer, or
just more able to understand the list lately.  Either is fine, but both
would obviously be great.  :)

Michael Vanier replied:

I highly recommend Benjamin Pierce's new book "Types in Programming
Languages" from MIT press.  It's very well-written, covers much of the  
material you describe, and includes implementations in ocaml ;-)

Neel Krishnaswami added:

Let me second this recommendation. It's a great book. I'm a regular
programmer and I found it extremely useful.

I think that Olivier Danvy's "Functional Unparsing" paper is one
of the best illustrations of why this stuff is useful for regular
programming. There's nothing more practical in the world than
printing text, and here he uses continuation-passing style, combinators,
higher-order functions, and all that stuff to derive a blisteringly   
fast statically-typed printf. It's amazing. (And you can make the 
library nearly perfect to use if you use labels and optional arguments.)

This technique is apparently an instance of a more general technique
that Zhe Yang describes in his paper "Encoding Types in ML-like
Languages", at <http://citeseer.nj.nec.com/53925.html>.

And Will Benton said:

Some great references (that explain these issues fairly clearly) are: 

    http://citeseer.nj.nec.com/cardelli97type.html
    http://citeseer.nj.nec.com/cardelli88basic.html
    http://citeseer.nj.nec.com/pierce95foundational.html

The first one (a massive survey that came from a CRC handbook) covers
what is meant by "well-typed" and contains the rules for proving that a
language/construct is well-typed.  The second covers type inference in
the face of polymorphism and other "fun" language features.  The third
covers lambda-calculus, the formal model for all functional (and
otherwise) languages (it also covers pi-calculus, which is a model for
communicating processes).  As a general rule, if you see the greek
letters alpha, beta, or eta in a PL-theory context, you can assume that
it's because someone is talking about the lambda calculus.  :-)

In any case, I think if you read those, you'll be able to follow some of 
the more "esoteric" discussions.

If you are really interested in learning about this stuff (types,
l-calculus, and PL theory in general), a great book is _Essentials of
Programming Languages_ by Friedman, Wand, and Haynes.  I have the first
edition, which is supposedly better for self-study (it was my undergrad
PL textbook), but the second edition is supposedly a better textbook
from what I've heard.  I have not seen the 2e, but I know that it has
some newer/improved algorithms for some program transformations.

This stuff *will* make you a better programmer -- you have probably
already observed that the strong typing in OCaml makes it easier to
write working code, and learning about how and why it works is helpful
for a lot of peoples' thought/design processes.  However, other PL
theory topics (ones that might seem esoteric, or only useful for
interpreter/compiler writers) will even make you write better code, as
the following anecdotes indicate:

http://groups.google.com/groups?hl=en&safe=off&selm=j7vk9d3eh1q.fsf%40new-world.cs.rice.edu&rnum=1

The last one in particular is a gem.

======================================================================
3) input_line is blocking
----------------------------------------------------------------------
Gerd Stolpmann answered Warp and announced:

> Hi
> I'm running "ocamlc" by using Unix.open_process_full in order to write an
> automatic compiler.
> Right now, it's working fine, but I got now a problem with input_line :
> 
> after running open_process_full , i'm first reading all its stdout lines of
> the process until End_of_file is raised, then all its stderr lines using the
> same function.
> 
> It has work fine for few weeks now, but now I found that in some cases
> input_line will block, not raising End_of_file.

I wrote a library exactly for such advanced usage of sub processes:
http://www.ocaml-programming.de/packages/documentation/shell/  

For example, to call ocamlc one would do:

open Shell
let stdout = Buffer.create 16 in
let stderr = Buffer.create 16 in
call ~stdout:(to_buffer stdout) ~stderr:(to_buffer stderr) [ cmd "ocamlc" args ]

The Shell library includes the necessary logic to read from multiple
file descriptors (using Unix.select).

One drawback: Shell works only for Unix (because of Unix.fork). I think
that you have to use multi-threading for a platform-independent
solution.

======================================================================

Alan Schmitt


-- 
The hacker: someone who figured things out and made something cool happen.