Standards in Content Management

The great thing about standards is there are so many to choose from.

In content management the old saying has not really applied, because there have not been many standards. Then one day JSR-170 turned up, a Java content management standard.

Then CMIS.

Then the other day JSR-283, ostensibly the simple successor to JSR-170, suddenly split into two: one part the data model and the other part the Java API, clearly leaving room for a non-Java API track. Not being on the standards group, I do not know when this happened, but it does smell like a bit of a reaction to CMIS. And indeed Roy Fielding from Day did react to CMIS recently, and not happily.

Previously most of us outside the Java CMS world looked on the Java standard as a pure Java move. Many standards attempts now are about strategic commercial interests. Within the Java world, though, it seemed to be quite well accepted, and Magnolia and Day could live in the same world. Deeper down, though, is it an attempt to define ranges of functionality within product ranges? Although a specification does not necessarily correspond to an implementation, each constrains the other, especially in some parts of these specifications.

CMIS was a different matter: a different set of vendors, a similar model, and a different set of APIs (a non-RESTful REST that Fielding could lay into!). But the biggest criticism of JSR-170 was simply the J; anything could look credible by being more inclusive. Hence, I think, the current changes in JSR-283.

The CMIS draft states:

The JSR standard requires a particular type of implementation for ECM repositories: Whereas CMIS restricts itself to specifying only generic/universal concepts for ECM constructs like Documents and Object Types that could be layered on most existing ECM implementation, the JSR standard requires a highly-specific & feature-completion implementation of a repository. This structure may not be appropriate for many types of applications, or efficiently layered on existing ECM repositories.

This is core to Fielding’s substantive criticism: CMIS is just trying to model folders and files, and a WebDAV interface does that fine. JSR defines abstract item and property trees, which the CMIS players feel do not fit their content models. The CMIS draft mentions WebDAV, but says it misses out on types and queries, locking, and non-HTTP interfaces(!).

Jon Marks (http://jonontech.com/2009/04/09/cmis-is-xpath-just-a-bit-too-tricksy/) points to the XPath/SQL distinction. XPath too is a bit modern for the CMIS vendors. The XPath expressions refer to the XML representation of the property tree, and if that representation does not correspond to your internal models or implementation method, it is quite a lot of work to implement.
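To make the XPath/SQL distinction a little more concrete, here is a minimal sketch in Python. It is not JCR or CMIS query syntax; the element names, the table layout and the function names are all hypothetical, chosen only to show the same question ("which documents did this author write?") asked against a tree view and against a flat table.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical content, shown as the XML view a tree-oriented query expects.
XML_VIEW = """
<root>
  <folder name="reports">
    <document name="q1" author="alice"/>
    <document name="q2" author="bob"/>
  </folder>
  <folder name="drafts">
    <document name="notes" author="alice"/>
  </folder>
</root>
"""

# The same hypothetical content as flat rows: (name, author, folder).
ROWS = [
    ("q1", "alice", "reports"),
    ("q2", "bob", "reports"),
    ("notes", "alice", "drafts"),
]


def find_by_author_xpath(author):
    """Tree-style query: walk the XML view and match on an attribute."""
    root = ET.fromstring(XML_VIEW)
    matches = root.findall(f".//document[@author='{author}']")
    return sorted(d.get("name") for d in matches)


def find_by_author_sql(author):
    """Table-style query: filter rows in an in-memory relational store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE document (name TEXT, author TEXT, folder TEXT)")
    conn.executemany("INSERT INTO document VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(
        "SELECT name FROM document WHERE author = ? ORDER BY name", (author,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Both functions return the same names, but a repository that stores its content as rows has to build or simulate the XML view before it can answer the first kind of query, and that is where the implementation work goes.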

There is some truth in the implementation-specific parts of the criticism of JSR, in that I am not aware of any implementation of the JSR standards that does not use a JSR-native content repository as the base implementation (e.g. Jackrabbit). Maybe I have missed one. Partly that is because it is a strong model for a modern CMS, with excellent open and closed source implementations. Partly it is also that not many vendors have yet explored related but dissimilar frameworks (it is not clear that an RDF model would fit well, for example, as properties are first class; though a mapping may be possible, the semantics of an object may end up differing). The other area in which differences will probably become apparent is the versioning models, though these will not necessarily be exposed through interfaces if there is a big mismatch.

In an abstract sense the differences between the two models seem small. In CMIS, documents have a (single) body and then additional properties. In JSR, objects simply have properties, and the “content” is simply another property. That sounds subtle, and it sounds easy to switch models: just add a type property and a content property to documents, but not to folders. The further you go, though, the more model differences there are. Locking, for example. And the rules about folders not being versioned while files are, and the primacy of the document-folder containment relationship. Another difference in CMIS is the large list of optional facilities, such as the query types available, and whether checked out copies or versioned copies of documents are accessible through the query mechanisms.
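The "easy" part of that switch can be sketched in a few lines of Python. This is only an illustration of the two shapes described above, not either specification's API: the class names, the `type` and `content` property keys and the two conversion functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CmisStyleDocument:
    """A document with one privileged body plus additional properties."""
    name: str
    body: Optional[bytes] = None
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass
class JsrStyleNode:
    """An item that only has properties; content is just one of them."""
    name: str
    properties: Dict[str, object] = field(default_factory=dict)


def to_node(doc):
    """Fold the body into the property map and tag the node as a document."""
    props = dict(doc.properties)
    props["type"] = "document"  # hypothetical key
    if doc.body is not None:
        props["content"] = doc.body  # hypothetical key
    return JsrStyleNode(doc.name, props)


def to_document(node):
    """Lift the content property back out; refuse nodes that are not documents."""
    props = dict(node.properties)
    if props.pop("type", None) != "document":
        raise ValueError("only nodes typed as documents have a body")
    body = props.pop("content", None)
    return CmisStyleDocument(node.name, body, props)
```

A document survives the round trip through `to_node` and `to_document` unchanged, provided its own properties do not already use the reserved keys. Nothing here says anything about locking, versioning or containment, which is exactly where the mapping stops being this simple.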

Overall there is a difference in what the “generic/universal concepts for ECM constructs” are. Having two formalized models is actually a useful start in classifying the CMS models that exist. A web services interface to JSR will help that model expand outside the Java-only area. Islands of interoperation, or at least of data transfer, can only help the industry as a whole.
