To: mozilla-general@mozilla.org
From: Pat Gunn <pgunn01@ibm.net>
Subject: Proposal: Mozilla Transformation Services
Date: Thu, 21 May 1998 19:19:19 -0400

Proposal for Transformation Services in Mozilla
by Pat Gunn
Distribution and modification of this document is unlimited

The ability to preprocess pages before they display
would be a welcome addition to Mozilla's capabilities. This could
serve a variety of purposes, from providing a basis
for Machine Translation to HTML filtering. While it is true that
some of these features are already possible to some degree
using some of the undocumented tools in Mozilla, adopting
the TS services described in this document would provide an easy,
flexible, and consistent means of doing this.

TS is based on the idea of a very simple, open API, and the use of
various modules which users may install and configure through the
preferences panels. These modules would receive the webpage
before it is fully parsed, and transform it as they are programmed, passing
the transformed webpage either to the next module (they may be
chained), or to the rendering/parsing engine. Naturally, users may want to
run more than one module at a time, perhaps one that acts as an
HTML filter to remove hostile tags (like BLINK and EMBED), and
another as a simple lingual translation engine. Similarly, users
may wish for modules to be applied to only some webpages,
perhaps those in a foreign language or with a hostile PICS rating,
and not others. Properly designed and integrated, such a system
should not slow browsing significantly, provided that the plugins
are not doing something highly computationally expensive (e.g.
lingual translation or PDF->HTML conversion).
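
As a rough illustration of the chaining idea, here is a minimal sketch in
Python. It assumes that a module is nothing more than a function taking the
page source as a string and returning the transformed source. The names
Module, strip_hostile_tags, dejargonize and run_chain are hypothetical and
are not part of any existing Mozilla interface.

```python
import re
from typing import Callable, Iterable

# Hypothetical module type: takes the raw page source, returns the transformed source.
Module = Callable[[str], str]


def strip_hostile_tags(page: str) -> str:
    """Remove BLINK and EMBED tags (opening and closing), keeping any enclosed text."""
    return re.sub(r"</?(?:blink|embed)\b[^>]*>", "", page, flags=re.IGNORECASE)


def dejargonize(page: str) -> str:
    """Toy stand-in for a translation module: replaces a single word."""
    return page.replace("utilize", "use")


def run_chain(page: str, modules: Iterable[Module]) -> str:
    """Pass the page through each module in order; each output feeds the next module."""
    for module in modules:
        page = module(page)
    return page


if __name__ == "__main__":
    raw = "<p><BLINK>We utilize tags</BLINK><embed src='x'></p>"
    print(run_chain(raw, [strip_hostile_tags, dejargonize]))
```

The result of the last module is what would be handed to the
rendering/parsing engine.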

Plugins should be highly controllable using the preferences panel,
with the user being able to specify if the plugin should be active
for all pages, only for pages in a certain language or with a certain
PICS rating, or only applied on command (perhaps using aurora or a menu).
This naturally raises a concern about time spent on the page before it
displays, as the browser may need to determine the language or
PICS rating of the page before it parses it fully. Furthermore,
JavaScript-generated pages are a major concern. Ideally, the
metacode for the browser receiving a webpage would be:
	Retrieve page
	Preparse:
		Get PICS rating (if any)
		Get Language declaration (if any)
		Resolve JavaScript's document.write() calls, making
			the page static (is this possible?)
	Pass the page to TS Layer:
		Pass through preferred ordering of modules,
		running any that are:
			set to run for all PICS levels
			set to run for this PICS level (if any)
			set to run for all Languages
			set to run for this Language
			explicitly told by the user to run for this page
	Pass the page to the parser
	Pass the page to the display engine
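
The module-selection step of the metacode above could be sketched as
follows. This is only one reading of it: the conditions are treated as
alternatives, so a module runs if any one of them holds. It assumes the
preparse step has already filled in the PICS rating and language. PageInfo,
ModuleConfig, should_run and ts_layer are hypothetical names, and resolving
document.write() is not attempted.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PageInfo:
    """Result of the (assumed) preparse step."""
    source: str
    pics_rating: Optional[str] = None
    language: Optional[str] = None


@dataclass
class ModuleConfig:
    """Per-module settings as they might be stored by the preferences panel."""
    name: str
    transform: Callable[[str], str]
    all_pics: bool = False
    pics_levels: set = field(default_factory=set)
    all_languages: bool = False
    languages: set = field(default_factory=set)


def should_run(cfg: ModuleConfig, page: PageInfo, requested: frozenset) -> bool:
    """True if any of the conditions listed in the metacode applies."""
    if cfg.name in requested:              # explicitly told by the user
        return True
    if cfg.all_pics or cfg.all_languages:  # set to run for all levels/languages
        return True
    if page.pics_rating is not None and page.pics_rating in cfg.pics_levels:
        return True
    return page.language is not None and page.language in cfg.languages


def ts_layer(page: PageInfo, modules: list, requested=frozenset()) -> str:
    """Run the matching modules in the user's preferred order (the list order)."""
    source = page.source
    for cfg in modules:
        if should_run(cfg, page, frozenset(requested)):
            source = cfg.transform(source)
    return source  # ready to be passed to the parser
```

For example, a module configured with languages={"fr"} would be skipped for
a page declared as English unless the user asked for it by name.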

Since for the majority of users the time a webpage takes to
display is I/O bound rather than CPU bound, simple plugins should
not make TS a major performance concern.

The API of TS should be as simple as possible, with the full HTML/XML
source being passed to the module. The module may set up a system
to preprocess and defer the handling of the webpages to another
piece of software, if the platform permits, or load files needed to
perform specific transformation tasks. For instance, on Unix and possibly
other systems, a module might pass the document to a perl or
shell script, and receive the resulting document back. Another
plugin might be a dejargonizing filter, and load a dictionary file
to do the translation. Yet another filter might prepare the document
in another format (RDF?) to prepare for advanced machine translation
between human languages, pass the prepared document to another
process, and receive the translated document back.
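
A module that defers to an external program might look roughly like the
sketch below. It assumes the external program reads the page on standard
input and writes the transformed page to standard output. external_filter
and shout are hypothetical names, and the one-line Python program merely
stands in for a perl or shell script so that the example is self-contained.

```python
import subprocess
import sys
from typing import Callable, Sequence


def external_filter(command: Sequence[str], timeout: float = 10.0) -> Callable[[str], str]:
    """Build a module that pipes the page through an external command."""
    def module(page: str) -> str:
        result = subprocess.run(
            list(command),
            input=page,           # page source goes to the program's stdin
            capture_output=True,  # transformed page comes back on stdout
            text=True,
            timeout=timeout,      # do not hang the browser on a stuck script
            check=True,           # raise if the program exits with an error
        )
        return result.stdout
    return module


# Stand-in for a perl or shell script: upper-cases whatever it reads.
shout = external_filter(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
)

if __name__ == "__main__":
    print(shout("<p>hello</p>"))
```

On Unix the command list could just as well name a perl script. The timeout
is a design choice of this sketch, not something the proposal specifies.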

Given these factors, it should be made evident to users, especially
users of HTTPS, that they are not seeing the page as it was sent. Adding
a third state to the lock logo in Mozilla, perhaps a red X, would
serve to indicate that a filter has been used in rendering the current
page.
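
The three-state indicator could be decided along these lines. This sketch
assumes that any applied filter overrides the usual open/locked display,
whether or not the page came over HTTPS. LockState and lock_state are
hypothetical names.

```python
from enum import Enum


class LockState(Enum):
    OPEN = "open"          # ordinary, unencrypted page
    LOCKED = "locked"      # HTTPS page shown exactly as received
    FILTERED = "filtered"  # proposed third state, e.g. drawn as a red X


def lock_state(is_https: bool, modules_applied: int) -> LockState:
    """Pick the indicator; a filtered page is flagged regardless of transport."""
    if modules_applied > 0:
        return LockState.FILTERED
    return LockState.LOCKED if is_https else LockState.OPEN
```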

Because plugins could be written that allow the user to do almost
anything imaginable in any programming language to enhance
their web experience, and because the interface to that functionality
is so simple and lightweight, TS services in Mozilla would be a
popular and useful addition to Mozilla's already wide array of
capabilities.

-- 
---------------------------------------------------
Pat Gunn, moderator:comp.sys.newton.announce comoderator:comp.os.os2.moderated
"You can always judge a man by the quality of his enemies." -- Dr Who
http://junior.apk.net/~qc
------------------------------------------------