Looking at the new Open XML API, introduced by Kevin Boske here, makes you realise that old-style COM automation wasn’t so bad after all.
There are two distinct aspects to working programmatically with OOXML. First, there’s the Packaging API, which deals with how the various XML files which make up a document get stored in a ZIP archive. Second, there’s the XML specification itself, which defines the schema of elements and attributes that form the content of an OOXML document.
The new wrapper classes really only deal with the packaging aspect. You still have to work out how to parse and/or generate the correct XML content using your favourite XML parser. And it’s a lot more complex than HTML.
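To make the split concrete, here is a rough sketch in Python using only the standard library, not the Microsoft SDK. The helper names build_minimal_package and read_paragraphs are invented for illustration. The part name word/document.xml and the WordprocessingML namespace follow the usual layout of a .docx file. The archive built here is deliberately incomplete (it has no [Content_Types].xml and no relationship parts), so treat it as a demonstration of the two layers rather than a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used by word/document.xml in a .docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_minimal_package(text):
    """Build an in-memory ZIP holding a single, minimal document part.

    Hypothetical helper: a real package also needs [Content_Types].xml
    and relationship parts, which are omitted here.
    """
    # Content layer: you have to know the schema (document > body > p > r > t).
    ET.register_namespace("w", W_NS)
    document = ET.Element(f"{{{W_NS}}}document")
    body = ET.SubElement(document, f"{{{W_NS}}}body")
    para = ET.SubElement(body, f"{{{W_NS}}}p")
    run = ET.SubElement(para, f"{{{W_NS}}}r")
    ET.SubElement(run, f"{{{W_NS}}}t").text = text

    # Packaging layer: the XML part is just an entry in a ZIP archive.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", ET.tostring(document, encoding="utf-8"))
    return buffer.getvalue()


def read_paragraphs(package_bytes):
    """Return the text of each paragraph found in word/document.xml."""
    # Packaging layer: locate and extract the part from the archive.
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        xml_bytes = package.read("word/document.xml")

    # Content layer: walk the elements yourself with a generic XML parser.
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        pieces = [t.text or "" for t in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs
```

Getting the part out of the archive takes two lines. Everything after that is schema knowledge you have to supply yourself, and a real document adds styles, numbering, tables and relationships on top of this.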
By contrast, the old COM automation API for Office presents a programmatic object model for the content, and you don’t have to worry much about how the document gets stored – you just tell Word or Excel to save it.
The (very big) downside of the COM object model is that it depends on the presence of Microsoft Office. That brings high resource requirements and version problems, ties you to Windows, and makes it inappropriate for server apps.
We seem to have traded one problem for another. What Microsoft needs to provide is wrapper classes for the content, rather than just its packaging.
Both Microsoft and an open source Java team are working on wrappers, called Microsoft SDK for Open XML Formats and OpenXML4J respectively.
Jonathan,
Well, the library I linked to is called Microsoft SDK for Open XML Formats, but it deals mainly with packaging, not content.
It looks like OpenXML4J is the same, judging by this FAQ item: