

The Promise of XML.

15.May.02--The StarOffice 6.0 office suite software--a Sun Open Net Environment (Sun ONE) product offering--supports human-readable XML file formats, giving document authors unprecedented flexibility to access, search, reuse, and publish the contents of their files. XML can help your organization take advantage of innovations in information technology, including Web services, knowledge management solutions, content and document management systems, and advanced search capabilities. XML file formats provide a high degree of investment protection, security, and portability for the valuable data stored in your StarOffice 6.0 documents. Here's how it is done.

What We Did

In response to customer demands for content stability, manageable file sizes, and the flexibility to create, manage, and access complex documents and Web pages for years to come, the StarOffice software team has replaced the binary file format of StarOffice 5.2 with a new, XML-based file format.

The new file name extensions are:

  • .sxw for StarOffice 6.0 Writer
  • .sxg for StarOffice 6.0 Writer Master Document
  • .sxc for StarOffice 6.0 Calc
  • .sxd for StarOffice 6.0 Draw
  • .sxi for StarOffice 6.0 Impress
  • .sxm for StarOffice 6.0 Math

In addition to the actual XML standard, the StarOffice suite XML file format makes use of elements and attributes from HTML, XLink, XSL-FO, Dublin Core, and SVG. For the StarOffice Math software, we used MathML. Developers familiar with these formats can easily pick up the StarOffice software format.

The StarOffice software format, which is also used in the open source version of the suite, OpenOffice.org 1.0, is available under the GNU Lesser General Public License (LGPL), so you aren't at the mercy of a single company for improvements and fixes to the format and its supporting software.

How It Works

The StarOffice suite XML file format allows the constituent parts of a document--its content, layout, meta information (who wrote it and when), images, and embedded files--to be stored in separate streams of a ZIP-based package file. Using separate streams allows users to read, interpret, and modify the content types independent of one another. This gives document owners flexibility in how they access, search, manage, and publish the content in their files.

When saving files to XML, the XML filters of the various StarOffice modules store packages instead of plain XML files. To inspect or validate the XML contents, you must first unzip the package. In general, look in "content.xml" for document content, "styles.xml" for style information, and "meta.xml" for meta information. There might also be additional XML streams in the package.
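The package layout described above can be explored with any ZIP-aware library. The following is a minimal sketch in Python, not part of the StarOffice software: it lists the streams in a package and parses one of them. Because no real document is at hand, it builds a tiny stand-in package in memory; the element names (`doc`, `p`, `meta`, `creator`) and the helper names are assumptions made for illustration and are not the real StarOffice schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_streams(package):
    """Return the names of all streams stored in a ZIP-based package."""
    with zipfile.ZipFile(package) as zf:
        return zf.namelist()


def parse_stream(package, name="content.xml"):
    """Parse one XML stream of the package and return its root element."""
    with zipfile.ZipFile(package) as zf:
        with zf.open(name) as stream:
            return ET.parse(stream).getroot()


def build_sample_package():
    """Build a tiny stand-in package in memory (NOT a real StarOffice file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Hypothetical, simplified XML; real documents use a richer schema.
        zf.writestr("content.xml", "<doc><p>Hello, XML</p></doc>")
        zf.writestr("meta.xml", "<meta><creator>Jane Doe</creator></meta>")
    return buf.getvalue()


if __name__ == "__main__":
    data = build_sample_package()
    print(list_streams(io.BytesIO(data)))
    print(parse_stream(io.BytesIO(data)).find("p").text)
```

Both helpers accept a file path or a file-like object, so the same calls should work on a document saved to disk.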

The Benefits

XML provides a platform- and application-independent environment for defining document markup that enables you to output and exchange StarOffice suite documents for years to come.

The StarOffice 6.0 office suite's XML file formats give you the following benefits.

Increased Robustness: XML makes it easy to ignore or tolerate problems or errors in corrupted documents, reducing the likelihood that you will lose the content of your files.

Document Longevity: For some users, long-term storage of documents is important. With binary file formats, documents can be read only as long as the supporting application and operating system exist. XML, being a text-based and human-readable format, allows files to be read even if the original application or platform is no longer available.

Version Interoperability: A well-known problem with office documents is file format versioning: new versions of an office suite usually come with a new version of the file format, which older versions don't know about. To read newer documents, users find themselves forced to upgrade their application. With the StarOffice suite XML-based format, new features can be added to the format without losing the ability to read older files. Older applications simply ignore the new, unrecognized content and read the files as best they can. The result is a high degree of backward and forward compatibility for StarOffice software files.
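The idea of an older reader ignoring content it does not recognize can be illustrated with a short Python sketch. This is only an illustration of the general technique, not how the StarOffice software is implemented; the tag names in `KNOWN_TAGS` and in the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names that an "older" reader understands.
KNOWN_TAGS = {"doc", "h", "p"}


def extract_text(xml_text):
    """Collect text from known elements, skipping unknown elements entirely."""
    root = ET.fromstring(xml_text)
    parts = []

    def walk(elem):
        if elem.tag not in KNOWN_TAGS:
            return  # unknown (possibly newer) element: ignore it and its children
        if elem.text and elem.text.strip():
            parts.append(elem.text.strip())
        for child in elem:
            walk(child)
            # Text that follows a child still belongs to the known parent.
            if child.tail and child.tail.strip():
                parts.append(child.tail.strip())

    walk(root)
    return parts


if __name__ == "__main__":
    # A "newer" document containing elements the old reader has never seen.
    newer = (
        "<doc><h>Title</h>"
        "<p>Body <sparkle>new feature</sparkle>text</p>"
        "<chart3d/></doc>"
    )
    print(extract_text(newer))
```

The reader still recovers the heading and paragraph text while the unknown `sparkle` and `chart3d` elements are skipped rather than treated as errors.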

Transparency: With the StarOffice suite XML format, users can finally inspect the files they send and receive to check for macro viruses or other potentially harmful content. A simple combination of "unzip" and "grep" allows you to check for suspicious content. If you need to quickly find a certain document, just use the UNIX® "grep" command or the Windows Explorer context menu to search through the meta information, which is stored as plain text.
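The same kind of inspection can be scripted. The sketch below is a rough Python equivalent of unzipping a package and running "grep" over its streams; it is an illustration under stated assumptions, not a virus scanner. The sample package, the stream name `macros/Module1.txt`, and the search patterns are all hypothetical.

```python
import io
import re
import zipfile


def grep_package(package, pattern):
    """Return (stream name, matching line) pairs for every stream in a package."""
    regex = re.compile(pattern)
    hits = []
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            # Binary streams (such as images) are decoded leniently.
            text = zf.read(name).decode("utf-8", errors="replace")
            for line in text.splitlines():
                if regex.search(line):
                    hits.append((name, line.strip()))
    return hits


def build_sample_package():
    """Build a stand-in package in memory (NOT a real StarOffice file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("meta.xml", "<meta>\n  <creator>Jane Doe</creator>\n</meta>")
        # Hypothetical stream name and content standing in for macro code.
        zf.writestr("macros/Module1.txt", 'Sub Main\n  MsgBox "hi"\nEnd Sub')
    return buf.getvalue()


if __name__ == "__main__":
    data = build_sample_package()
    # Search the meta information for an author name.
    print(grep_package(io.BytesIO(data), r"Jane Doe"))
    # Flag lines that look like macro procedure boundaries.
    print(grep_package(io.BytesIO(data), r"(?i)\bsub\b"))
```

Reading each stream through the ZIP library matters here because streams in a package are typically compressed, so a plain text search over the raw package file may miss them.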

Openness: Because it is based on XML and ZIP, the StarOffice suite file format can be used with a growing number of widely available tools that can process these formats.

  • Any available XML viewers can be used to examine document content, and XML editors can be used to make changes manually.

  • XML transformation tools and libraries, such as XSLT engines or XPathScript (Perl), can be used to automatically edit, modify, or generate StarOffice suite documents.

  • A growing number of XML-aware databases and storage products can be used to store, index, query, and manipulate StarOffice suite documents.

  • Standard ZIP tools may be used to change the package content. For example, using any ZIP tool, embedded graphics can be changed from low resolution to high resolution before the document is given to a print shop.
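The graphics-replacement example in the last bullet can be sketched in Python. This is a minimal illustration, assuming a package held in memory: since the standard `zipfile` module cannot overwrite an entry in place, the sketch copies every stream into a new package and substitutes the bytes of one of them. The stream name `Pictures/logo.png` and the placeholder image bytes are hypothetical.

```python
import io
import zipfile


def replace_stream(package_bytes, name, new_data):
    """Return a copy of the package with the bytes of one stream replaced."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        if name not in src.namelist():
            raise KeyError(name)
        for info in src.infolist():
            data = new_data if info.filename == name else src.read(info.filename)
            # Entries are rewritten by name; timestamps are not preserved.
            dst.writestr(info.filename, data, compress_type=info.compress_type)
    return out.getvalue()


def build_sample_package():
    """Build a stand-in package in memory (NOT a real StarOffice file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("content.xml", "<doc/>")
        # Hypothetical stream name and placeholder bytes for a low-res image.
        zf.writestr("Pictures/logo.png", b"LOW-RES-BYTES")
    return buf.getvalue()


if __name__ == "__main__":
    original = build_sample_package()
    updated = replace_stream(original, "Pictures/logo.png", b"HIGH-RES-BYTES")
    with zipfile.ZipFile(io.BytesIO(updated)) as zf:
        print(zf.namelist())
        print(zf.read("Pictures/logo.png"))
```

Writing a fresh package rather than appending to the old one avoids leaving two entries with the same name in the archive.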


Copyright 1994-2008 Sun Microsystems, Inc.