From: Alan Schmitt <alan.schmitt@inria.fr>
To: lwn@lwn.net
Subject: Attn: Development Editor, Latest Caml Weekly News
Date: Tue, 5 Feb 2002 20:51:07 +0100

Hello,

Here is the latest Caml Weekly News, for the week of 30 January to
5 February 2002.

Summary:
1) otags-3.04.3
2) IoXML - OCaml from/to XML
3) IoXML goes on...
4) ocamlnet-0.92 has been released
5) OCamlCVS 1.1
6) OCamlAgrep 1.0 (string searching with errors)

======================================================================
1) otags-3.04.3
----------------------------------------------------------------------
Jean-Francois Monin announced:

New version of otags, with better handling of classes and mutually
recursive types.
Thanks: Hendrik Tews

http://www.multimania.com/moninjf/Ocaml/otags-3.04.3.tar.gz

See HISTORY, README and -help for details.

---- from README ---------
otags is used to build TAGS tables for emacs, like etags does, but
for OCaml source files. It is based on the camlp4 parsers for Caml,
which makes it more accurate than versions based on regexps: otags
finds references to constructors of sum types, fields of records, etc.
--------------------------

======================================================================
2) IoXML - OCaml from/to XML
----------------------------------------------------------------------
Daniel de Rauglaudre announced:

Hi,

I have the pleasure to announce the first release of IoXML.

IoXML is a syntax extension (using Camlp4) which automatically adds
XML parsers and printers to all types you define in your mli and ml
files.

It is like input_value/output_value for your data, but in XML instead
of binary. The dumped data is tagged by the constructor names and the
record label names.

Main page and distribution here:
    http://cristal.inria.fr/~ddr/IoXML/

======================================================================
3) IoXML goes on...
----------------------------------------------------------------------
Daniel de Rauglaudre then added:

Hi all,

I just released version 0.4 of IoXML. See the file CHANGES to see what
happened this weekend:
    http://cristal.inria.fr/~ddr/IoXML/

---

IoXML is a Camlp4 syntax extension for OCaml mli and ml files which
generates XML parsers and printers for all types you define. It offers
a kind of input_value/output_value but in XML instead of binary.

Useful to export your data to other programs, or to save it, to debug
by traces, to be fashionable, and so on...

======================================================================
4) ocamlnet-0.92 has been released
----------------------------------------------------------------------
Gerd Stolpmann announced:

Hi,

The ocamlnet project has released a new version of its library, the
successor of the well-known netstring library. This release mainly
focuses on smoothing the interoperation of its modules, especially by
using the concept of channel objects everywhere. There are also a
number of new features; see below.

Ocamlnet now has a rich set of functions for the following tasks:

- Deal with stacks of channel transformers, e.g. "read a certain
  section of a file and decode the BASE64 strings, and then convert
  everything to UTF-8". This can usually be done without allocating
  large buffers, by processing the data block by block. (NEW)
- Separate a channel into sections, and process the sections
  independently. For example, one can define a boundary string, and
  read a file section by section. (NEW)
- Encode and decode strings, convert between character sets. (The
  character set conversion needs to be revised.)
- Parse and print date strings
- Parse and print HTML text
- Parse and print MIME mail messages. There are now high-level
  functions for manipulating mail messages (NEW)
- Compose mail messages and send them (using sendmail). The message
  composer creates MIME messages, too, and can deal with arbitrary
  character sets.
  (NEW)
- Parse and print URLs (This module is not yet up to date)
- Ocamlnet includes a complete and modularized implementation of the
  CGI protocol. (UPDATED)
- Ocamlnet has experimental support for building application servers
  with Apache and the Jserv protocol (NEW)
- Ocamlnet includes an experimental POP3 client

The library has been rewritten so that many functions accept so-called
netchannels (or channel objects) as input/output media. The idea is
very simple: a netchannel represents a real channel, a buffer, a
string, a lexer buffer, a filter, or one of several more complicated
entities. A netchannel for input has methods for reading from the
channel, for example

    let line = ch # input_line()

to read the next line, and an output netchannel provides the usual
output methods, like

    ch # output_string "xyz"

to output a string to the netchannel. The same type is used no matter
what kind of thing the netchannel really is:

    let ch1 = new output_channel stdout ;;
    let buf = Buffer.create 100 ;;
    let ch2 = new output_buffer buf ;;

Here, ch1 is a netchannel on top of O'Caml's built-in out_channel, and
ch2 is on top of a buffer. Ocamlnet contains many functions that can
be applied to any kind of netchannel, for example

    write_mime_message ch1 msg    (* write msg to stdout *)
    write_mime_message ch2 msg    (* write msg to the buffer buf *)

This helps a lot because it saves the programmer from providing the
same functionality for various data types.

A more complicated netchannel is

    let ch3 = new output_filter (new Base64.encoding_pipe()) ch2 ;;

Any output to ch3 is BASE64-encoded and transferred to ch2. The
netchannels can be stacked without any restriction, and if the stack
is not too high, the performance loss is quite small. For instance,
the BASE64 encoding is applied block by block, so that the additional
cost of method lookups is paid once per block, not once per byte.

Netstreams are another interesting programming tool.
These are input channels with look-ahead, i.e. you can look at the
bytes you will read in the future. The look-ahead buffer can be
enlarged as needed. It is possible to form substreams of netstreams,
in order to view only a certain section of the whole stream. The
substream will return EOF where the section ends, but the main stream
can of course read beyond the section boundary. (See Netstream.)
Substreaming can in turn be applied recursively, to form substreams of
substreams. One application is the parser for nested multipart MIME
messages.

Netchannels and netstreams are two examples illustrating the current
direction of Ocamlnet development. We are trying to find out the basic
types that are needed everywhere, and that make up the glue between
the various parts of the library.

And before I forget it the second time: Ocamlnet includes a routine to
translate Str regular expressions to Pcre regular expressions (because
we are only using Pcre now).

Ocamlnet can be found here:

- Project page: http://sourceforge.net/projects/ocamlnet
- Home page: http://ocamlnet.sourceforge.net
- Download of release 0.92:
  http://prdownloads.sourceforge.net/ocamlnet/ocamlnet-0.92.tar.gz
  http://www.ocaml-programming.de/packages/ocamlnet-0.92.tar.gz

If you want to join the Ocamlnet developers team, please subscribe to
the ocamlnet mailing list, and describe what you want to contribute.
(More information on the project page.)

======================================================================
5) OCamlCVS 1.1
----------------------------------------------------------------------
Maxence Guesdon announced:

Hi all,

A new release (1.1) of OCamlCVS is available.
Changes are:
- a merge assistant added, to resolve merge conflicts with your magic
  mouse
- the configure script now checks the ocaml version (thanks to Alain
  Frisch for pointing this out :-)
- a fix for a bug that caused ocamlcvs to find no files when using a
  remote cvs server

OCamlCVS can be found at
http://www.maxence-g.net/Tools/ocamlcvs/ocamlcvs.html

Enjoy!

[You can use this new version in Cameleon: just compile ocamlcvs,
install, then re-compile Cameleon.]

======================================================================
6) OCamlAgrep 1.0 (string searching with errors)
----------------------------------------------------------------------
Xavier Leroy announced:

It is my pleasure to release the OCamlAgrep library:

    ftp://ftp.inria.fr/lang/caml-light/bazar-ocaml/ocamlagrep-1.0.tar.gz

This library implements the Wu-Manber algorithm for string searching
with errors, popularized by the "agrep" Unix command and the "glimpse"
file indexing tool. It was developed as part of a search engine for a
largish MP3 collection; the "with error" searching comes in handy for
those who can't spell Liszt or Shostakovitch.

Given a search pattern and a string, this algorithm determines whether
the string contains a substring that matches the pattern up to a
parameterizable number N of "errors". An "error" is either a
substitution (replace a character of the string with another
character), a deletion (remove a character), or an insertion (add a
character to the string). In more scientific terms, the number of
errors is the Levenshtein edit distance between the pattern and the
matched substring.

The search patterns are roughly those of the Unix shell, including
one-character wildcard (?), character classes ([0-9]) and
multi-character wildcard (*). In addition, conjunction (&) and
alternative (|) are supported. General regular expressions are not
supported, however.
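
The notion of "matching up to N errors" described above can be
illustrated with a short sketch. This is not the OCamlAgrep API and it
does not use the bit-parallel Wu-Manber technique: it is the plain
dynamic-programming formulation of the same question, restricted to
literal patterns (no wildcards). The function name matches_with_errors
and its max_errors parameter are hypothetical, chosen only for this
example.

```ocaml
(* Illustrative sketch only: NOT the OCamlAgrep API, and plain dynamic
   programming rather than bit-parallel Wu-Manber. The name
   [matches_with_errors] is hypothetical. *)

(* [matches_with_errors ~max_errors pattern text] is true when some
   substring of [text] is within Levenshtein distance [max_errors] of
   the literal [pattern]. Wildcards are not interpreted. *)
let matches_with_errors ~max_errors pattern text =
  let m = String.length pattern in
  (* col.(i) = least edit distance between the first i characters of
     the pattern and some substring of the text ending at the current
     text position. Before any text is read, that is i deletions. *)
  let col = Array.init (m + 1) (fun i -> i) in
  let found = ref (col.(m) <= max_errors) in
  String.iter
    (fun c ->
      (* diag is the previous column's value at row i - 1.
         col.(0) stays 0 because a match may start anywhere. *)
      let diag = ref col.(0) in
      for i = 1 to m do
        let old = col.(i) in
        let cost = if pattern.[i - 1] = c then 0 else 1 in
        (* substitution or match, versus insertion or deletion *)
        col.(i) <- min (!diag + cost) (1 + min old col.(i - 1));
        diag := old
      done;
      if col.(m) <= max_errors then found := true)
    text;
  !found
```

With this sketch, searching for "List" in "Franz Liszt" fails with
max_errors set to 0 and succeeds with max_errors set to 1, since one
insertion turns "List" into "Liszt". The cost is proportional to the
pattern length times the text length, which is the work a bit-parallel
implementation packs into machine-word operations.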

Performance is quite good: for short patterns (fewer than 31
characters) and no errors, this library is about 8 times faster than
OCaml's "Str" regular expression library. Speed decreases with the
number of errors allowed, but even with 3 errors it is still faster
than "Str".

The algorithm is described in S. Wu and U. Manber, "Fast Text
Searching With Errors", tech. rep. TR 91-11, University of Arizona,
1991. It's a nice exercise in dynamic programming and bit-parallel
implementation.

======================================================================

Alan Schmitt

--
The hacker: someone who figured things out and made something cool
happen.