
From: Caolan McNamara <Caolan.McNamara@ul.ie>
Subject: mswordview, convert word 8 (office 97) to html
Date: Tue, 26 May 1998 13:06:53 GMT

-----BEGIN PGP SIGNED MESSAGE-----


Announcing

mswordview, an MS Word 8 Decoder

mswordview is a program that can understand the Microsoft Word 8
binary file format (Office 97). It currently converts Word documents
into HTML, which can then be read with a browser. It's based on the
Word 8 format documentation that MS released, and uses laola
(included) to split an OLE file into its constituent streams.
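
For anyone wondering what an OLE file looks like at the byte level: a
Word 8 document is stored as an OLE2 compound file, and those start with
a fixed 8-byte signature. The sketch below is not taken from mswordview
or laola. `looks_like_ole` is a hypothetical helper that only checks
that signature, which is a cheap test to run before handing a file to a
stream extractor.

```python
# Sketch only: checks for the 8-byte signature that opens an OLE2
# compound file. It does not parse the file or extract any streams.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole(data: bytes) -> bool:
    """Return True if data starts with the OLE2 compound file signature."""
    return data[:8] == OLE_MAGIC
```

A file that fails this check is not an OLE2 container at all, so it
cannot be a Word 8 document, although passing it does not prove the
file is a Word document or that it is intact.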

                Features include
                    1) ability to understand fastsaved files as well as
                       non-fastsaved files.
                    2) conversion of Word heading paragraph styles into
                       the appropriate heading levels of HTML.
                    3) conversion of basic font attributes such as
                       italic, bold and font size into HTML tags.
                    4) conversion of Word tables into HTML tables.
                    5) a fair understanding of lists.
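
                The heading and font attribute conversions listed above
                can be sketched roughly as follows. This is an
                illustration only, not mswordview's own code: the
                (text, bold, italic) run tuples and the function names
                are assumptions made up for the example. Word has nine
                built-in heading levels while HTML has six, so the
                sketch caps the level at h6.

```python
from html import escape


def heading_tag(level: int) -> str:
    """Map a Word heading level to an HTML heading tag name (h1..h6)."""
    return "h%d" % min(max(level, 1), 6)


def run_to_html(text: str, bold: bool = False, italic: bool = False) -> str:
    """Wrap one run of text in <b>/<i> tags according to its attributes."""
    out = escape(text)  # protect &, < and > that appear in the document text
    if italic:
        out = "<i>" + out + "</i>"
    if bold:
        out = "<b>" + out + "</b>"
    return out


def paragraph_to_html(runs, heading_level=None) -> str:
    """Render a paragraph given as (text, bold, italic) tuples.

    heading_level is None for body text, or the Word heading level.
    """
    body = "".join(run_to_html(*run) for run in runs)
    tag = heading_tag(heading_level) if heading_level else "p"
    return "<%s>%s</%s>" % (tag, body, tag)
```

                For example, paragraph_to_html([("Intro", True, False)],
                heading_level=1) gives "<h1><b>Intro</b></h1>".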
                Unsupported features include
                    1) embedded graphics or other embedded types.
                    2) headers and footers.
                    3) fully correct conversion of tab stops and other
                       formatting that the user has done with whitespace,
                       because you can't really do this in HTML.
                    4) correct conversion of lists. All lists become
                       bullet-pointed lists (<ul>); the list format is
                       a toughy.
                    5) other extraneous stuff like multiple columns, tables
                       of contents, and special fields in general.
                    6) Word 6, 7 etc. aren't currently supported, just
                       Word 8.
                Defects are
                    1) mswordview uses laola to extract the OLE streams
                       from the document, and on occasion laola can't cope
                       with some files, i.e. corrupt docs and some large
                       docs.
                    2) I've only tested it with whatever Word 8 files I have
                       at hand. If you have some that blow mswordview up,
                       or get wrong output out of it, you can submit
                       your file at the web gateway listed below.
                Web Page for download and information at
                http://www.csn.ul.ie/~caolan/docs/MSWordView.html
                Web gateway to mswordview at
                http://www.csn.ul.ie/~caolan/docs/MSWordView-Demo.html

I've submitted it to sunsite, and it's in incoming there. I'd imagine it'll
end up at
ftp://sunsite.unc.edu/pub/Linux/utils/file/mswordview-0.0.14.tar.gz

It's also available at the website mentioned above and at
ftp://skynet.csn.ul.ie/pub/linux/utilsmswordview-0.0.14.tar.gz

I'm of course interested to hear if it works for you, if it doesn't, and
if you have bug fixes for it.

(P.S. If someone like Applixware or Star Division would like this to
convert from Word to their format, they can get in touch :-) )

C.
Real Life: Caolan McNamara           *  Doing: MSc in HCI
Work: Caolan.McNamara@ul.ie          *  Phone: +353-61-202699
URL: http://skynet.csn.ul.ie/~caolan *  Sig: an oblique strategy
Infinitesimal gradations



- -- 
This article has been digitally signed by the moderator, using PGP.
http://www.iki.fi/mjr/cola-public-key.asc has PGP key for validating signature.
Send submissions for comp.os.linux.announce to: linux-announce@news.ornl.gov
PLEASE remember a short description of the software and the LOCATION.
This group is archived at http://www.iki.fi/mjr/linux/cola.html

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3ia
Charset: latin1

iQCVAgUBNWq+blrUI/eHXJZ5AQHI8gP/Up6fF41lfZ7+qduLnO9moHv9UGnhbC5t
Eev0crZjfm0LVTZAPrvTq4Tvcamd7B3Il6VDg1vLVBAYZixgr4ujO58h6MamtPsf
xeWJRNnMqYASrXDG52LHd0aSPsEKzbiBMpenSNRCob2DargBTF1uFZLkNGuanpNt
XM1S9VveTjE=
=2tLo
-----END PGP SIGNATURE-----