Earlier today, IBM announced that it is stepping up its involvement in the OpenOffice.org community. IBM joining OpenOffice.org isn’t about IBM helping to clone MS Office in an open source project. It’s more about throwing technical resources into the community as it currently stands, in an effort to broaden public knowledge and consumption of an open document format. OpenOffice is quite mature, so many of these resources will go toward new projects that embed the ‘OpenOffice Technology’ in new and innovative offerings. So this announcement is a bellwether of things to come. All the derivative works will share a common standard for document file formats: ISO/IEC 26300, OpenDocument Format.

During ODF Day at aKademy 2006 (the KDE developers conference) last September, Rob Weir talked about building an ecosystem around an open document format. In fact, he stated that the adoption of an open document format could lead to a “golden age” of document processing, on both the client and the server side. With so much “marketing speak” in this presentation, it was difficult to see where he was going with this, but lights went on in the audience when he started talking about the types of document processing that can only happen with an open document format (one that ANYONE could implement against):

  • Interactive document creation in a heavy-weight client application (office suite)
  • Interactive document creation in a light-weight web-based application
  • Collaborative (multi-author) editing
  • Automatic document creation in response to a database query (report generation)
  • Indexing/scanning of document for search
  • Document scanning by anti-virus tools
  • Other types of document scanning, perhaps for regulatory compliance, legal or forensic purposes
  • Validation of documents to specifications, house style guidelines, accessibility best practices, etc.
  • Read-only display of documents on machines without full editors (viewer)
  • Conversion of documents from one editable format to another
  • Conversion of a document into a presentation format, such as PDF, PS, print or fax
  • Rendering of a document via other modes such as sound or video (speech synthesis)
  • Reduction/simplification of a document to render on a sub-desktop device such as cell phone or PDA
  • Importing data from an office document into a non-office application, e.g., import of spreadsheet data into statistical analysis software
  • Exporting data from a non-office application into an office format, such as an export of a spreadsheet from a personal finance application
  • Application which takes an existing document and outputs a modified version of that document, e.g., fills out a template, translates the language, etc.
  • Software which adds or verifies digital signatures on a document in order to control access (DRM)
  • Software which uses documents in part of a workflow, but treats the document as a black box, or perhaps is aware of only basic metadata
  • Software which treats documents as part of a workflow, but is able to introspect the document and make decisions based on the content
  • Software which packs/unpacks a document into relational database form
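
To make the "indexing/scanning" item concrete, here is a minimal sketch of how a search indexer might pull text out of an ODF text document using nothing but a standard library. It relies on two facts about the format: an ODF file is a ZIP package, and the document body lives in a member named content.xml. The function name extract_paragraphs is made up for this illustration, and the sketch ignores details a real tool would handle (encrypted packages, paragraphs nested inside frames or notes, spreadsheets and presentations).

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI for text elements, as defined by the OpenDocument specification.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def extract_paragraphs(odf_file):
    """Return the text of each non-empty heading and paragraph.

    odf_file may be a filesystem path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    # An ODF document is a ZIP package; the body is stored in content.xml.
    with zipfile.ZipFile(odf_file) as package:
        root = ET.fromstring(package.read("content.xml"))

    wanted = {f"{{{TEXT_NS}}}h", f"{{{TEXT_NS}}}p"}
    paragraphs = []
    for element in root.iter():
        if element.tag in wanted:
            # itertext() also collects text inside nested spans and links.
            text = "".join(element.itertext()).strip()
            if text:
                paragraphs.append(text)
    return paragraphs
```

Calling extract_paragraphs("report.odt") (the file name is only an example) would return a list of strings ready to feed into an index. The point is less the code than the fact that nothing in it depends on a particular office suite or vendor library.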


There are only two things that would complicate a vibrant document processing ecosystem: (1) multiple open document formats and (2) an open document format that could not be fully used to implement document processing tools. In other words, if parts of a standard document format are proprietary or reference proprietary or non-standard code, the standard becomes unusable by the masses.

In general, multiple candidate standards can be a positive thing while standards are being formulated. However, once an open standard is accepted, it is to the advantage of the entire development community and the ecosystem to focus on and improve a single standard. Otherwise, tool developers will need to replicate their work for multiple standards, and they will suffer the pain and expense of incompatibilities and conversions.

Kudos to IBM for putting resources into the game to back up their stand on open document formats. In the end, it will be the consumers of the technology that will benefit. Documents will not only have long and unencumbered lives, but innovation will flourish around mining the content of these documents.


