
Re: Open Source document merging? -- with DOC? Did you miss the



On Tue, 2006-01-31 at 17:17 -0600, Michael Stephens wrote:
> Is there a web based open source program that will allow web site visitors
> to select single-page Word documents and have what they've selected merged
> into one document?

Oh boy, where do I begin.

First off, you've just hit on the #1 reason why DOC (MS Word) is _not_
used for everything from professional publication to technical
documentation.  Neither MS Word nor any other MS Office application
format has a well-documented structure that allows individual authoring
and piecemeal recombination.

There's no better proof of this than the fact that Microsoft uses
Adobe FrameMaker to write its own MS Office manuals.  And before you ask
where I got that info, I got it from the on-site Microsoft employee tech
support at your local Fortune 20 company.

Secondly, why would you assume a Freedomware (open source, open
standard) program can solve _any_ issue with a Hostageware
(unmaintainable source, unmaintainable standard) program that the
Hostageware program cannot solve itself?  There are no automated
collaborative features in MS Word other than its built-in revisioning,
which only it can do -- and which it often screws up anyway.

Now this is at the heart of why many companies are clamoring for an
XML-based native format, with full, documented XML metadata such as
schemas, XSLTs, etc., in MS Office 12.  The previous XML in MS Office 11
(2003) is content export-only (no native format), and is used _only_ for
_integration_ of _external_ content and style, _not_ for the actual
documentation itself.

Lastly and simply put, _every_ major Freedomware application, and even
most other Commerceware (closed source, closed standard), has a
well-structured documentation language underneath.  MS Word _never_ has.
That's why you can piece together documents in typesetting languages
like DocBook and [La]TeX, in more traditional word processor formats
like OpenOffice XML and Corel WordPerfect -- even convert between LaTeX
and OpenOffice XML+MathML, etc... -- as well as in desktop publishing
applications like FrameMaker, QuarkXPress and Scribus.

MS Word has _always_ been on its own "island" when it comes to a
well-defined, structured, underlying language -- it completely _lacks_
one.  That's why it's very, very difficult to do what you wish.  There
are a few projects and libraries that attempt to document all the
variations/revisions of DOC (including its endless conflicts) and the
1:1 related variations of RTF.  But every new DOC version results in a
new RTF, at least the way MS Word exports it (only Write/WordPad seems
to be more "pure"), and MS Word has the _nasty_ habit of polluting its
RTF export to the point that AbiWord, OpenOffice and Corel just use
their MS Word importers for RTF anyway.

If you want large-scale collaboration where you can piece together
documentation from multiple authors, use an application that has a
_real_ typesetting language underneath.  In the "old days," the American
Mathematical Society (AMS) and the Institute of Electrical and
Electronics Engineers (IEEE -- who produce 25% of the world's technical
documentation) used LaTeX.  LaTeX has _everything_ you could ever want
(one would argue way too much after 27 years).  A popular WYSIWYM GUI
front-end to LaTeX is LyX.
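
As a rough illustration of that piecemeal approach, here is a minimal
sketch, in Python, of how a web back end might generate a LaTeX master
file that pulls in whichever pages a visitor selected.  The function
name build_master and the file names are hypothetical, and it assumes
each fragment is a body-only .tex file with no preamble of its own.

```python
from pathlib import PurePosixPath


def build_master(fragments):
    """Return LaTeX source for a master file that \\input's each fragment.

    `fragments` is a list of relative .tex paths in the order they should
    appear.  Each fragment is assumed to contain body text only.
    """
    lines = [
        r"\documentclass{article}",
        r"\begin{document}",
    ]
    for name in fragments:
        # \input adds the .tex extension itself, so drop it here.
        stem = PurePosixPath(name).with_suffix("").as_posix()
        lines.append(r"\input{" + stem + "}")
        # Start each selected page on a fresh page of the merged output.
        lines.append(r"\clearpage")
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical selection made by a site visitor.
    print(build_master(["intro.tex", "chapters/setup.tex"]))
```

Saving the result as, say, master.tex next to the fragments and running
pdflatex on it should yield one merged document.  The point is only that
the merge is a few lines of text manipulation once the source format is
a documented, plain-text typesetting language.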

But more recently, many use Adobe FrameMaker and increasingly
OpenOffice.org's Writer, especially since OpenOffice XML, which uses
MathML for equations, can be converted to/from LaTeX.  OOo Writer is
compared to Adobe FrameMaker in its underlying capabilities more and
more nowadays.

For more formal desktop publishing, consider Scribus.


-- 
Bryan J. Smith   mailto:b.j.smith@ieee.org
http://thebs413.blogspot.com
------------------------------------------
Some things (or athletes) money can't buy.
For everything else there's "ManningCard."


