<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Technology of Content</title>
	<atom:link href="http://blog.technologyofcontent.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.technologyofcontent.com</link>
	<description>Ramblings on the technology of content management</description>
	<lastBuildDate>Sun, 21 Feb 2010 13:05:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>NoSQL and content management</title>
		<link>http://blog.technologyofcontent.com/2010/02/nosql-and-content-management/</link>
		<comments>http://blog.technologyofcontent.com/2010/02/nosql-and-content-management/#comments</comments>
		<pubDate>Sun, 14 Feb 2010 23:34:15 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[CMS]]></category>
		<category><![CDATA[data modelling]]></category>
		<category><![CDATA[nosql]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=216</guid>
		<description><![CDATA[I went to many of the first ever NoSQL devroom talks at FOSDEM this year. For anyone who hasn&#8217;t been, FOSDEM is a great place, and the NoSQL room was well organized and full of interest. The term NoSQL is not even a year old; I first came across CouchDB around a year ago from [...]]]></description>
			<content:encoded><![CDATA[<p>I went to many of the first ever <a href="http://nosql.mypopescu.com/post/385372130/your-chance-to-review-the-fosdem-nosql-event">NoSQL devroom</a> talks at <a href="http://fosdem.org">FOSDEM</a> this year. For anyone who hasn&#8217;t been, FOSDEM is a great place, and the NoSQL room was well organized and full of interest. The term NoSQL is not even a year old; I first came across CouchDB around a year ago from memory; Tim Anglade gave an excellent introduction where he reminded people of the historical roots, both before relational databases and since then; so not new but there is a renewed focus now. Why is that? I am going to look here at the field of content management and why you might be interested in different data models if that is your problem space, based loosely on some of the ideas from the talks at FOSDEM. There was a talk about <a href="http://outerthought.org/blog/blog/353-OTC.html">content management specifically and the Lily CMS by Evert Arckens</a> although I missed it, but I have added some comments after watching the video.</p>

<p><a href="http://www.flickr.com/photos/justincormack/4375594326/" title="FOSDEM by Justin Cormack, on Flickr"><img src="http://farm5.static.flickr.com/4029/4375594326_7ebdafd796.jpg" width="450"  alt="FOSDEM" /></a></p>

<h2>The data model for content management</h2>

<p>I have another draft post on this subject in more detail, which I am working on as part of my REST modelling in content management work, but I will outline some of the types of data relations that are important. I will be quite abstract here; if you want more concrete examples you will have to wait for the other post. Database models like the ones we are talking about here are more easily understood in the abstract, I think.</p>

<p>First we need our unit of modeling. This in itself is the first issue. Content management tends to deal with, at the conceptual level, something that looks like a document. It may be a fragment, in the sense that it is, say, a page component (an asset, if you use that terminology) rather than a whole item, but the unit that the user edits, and which is usually versioned, is a structured object itself. The processing model tends to treat it almost as a binary blob, except that certain properties can be extracted, such as metadata, links in HTML and so forth, but it is stored as an item rather than decomposed further.</p>

<p>OK, so we have a piece of content and some attributes extracted from it as one basic model. This corresponds pretty much to the JCR data model for example. There are variations; sometimes people do not store metadata in the file formats, as historically many file formats had poor support for arbitrary structured metadata, although that is largely obsolete now, and the advantages of actually storing metadata and relations substantially within documents are high. External storage does not change the model much, just complicates processing and storage. Another variant, often seen in document management systems, is to be able to have multiple &#8217;streams&#8217;, i.e. several document variants rolled into one, for example a video and a still from it. You can however, from the modelling point of view, regard these as another compound document format kept together because conceptually they are a bundle of content; you might distribute them as a zip file if you haven&#8217;t got any other suitable container format.</p>

<p>So now we have a storage model where we have a blob, with rich media operations on it, and extracted structural and metadata information. There is also versioning to consider, but let us ignore that and treat it either as part of the blob, or as a new document with some relation to the old ones, those being the two core versioning models; this does not really affect anything else.</p>

<p>There are two kinds of metadata, although they are more similar than they appear, properties and relations. Properties are the standard attributes (this picture depicts sheep), while relations join two items in the repository (this is a cropped version of this other picture). Although this distinction seems clear, in the end richer information architectures demand that everything becomes a relation, so I can browse a sheep node and find all the sheep items, turning every attribute value of any significance into a node with relations instead. Pure attribute values are only left for the less interesting properties (this PDF file is 176k in size).</p>

<p>They are also less interesting from a relational versus non relational storage point of view, although there is one important point, which is the dense versus sparse question, so let us take a look at this. Most real world attributes are sparse, that is, most attributes are not set on most items. In the relational model we have a row for our item, and columns for all the attributes, so we are saying most are NULL. (I was brought up on matrix algorithms and still think in terms of sparse versus dense matrices as this is exactly the same problem, and matrices represent graphs anyway). Storing huge mainly null tables is not very efficient, so there are two common practices in relational mapping of attributes in content management systems. The first is to define a type based system, where a particular type of content item is defined to have certain attributes, and each type can therefore have its own table, which is assumed to have fewer NULL values (or at least fewer NULLs than one big table!). Mixins, sets of properties that live across types, can potentially be added to this model, as can inheritance schemes, but the basic idea is one table per type. This gives a nice simple direct database programming model, and causes a complete nightmare if you ever want to change the schema, for example to add an attribute, as for any large database most DBMSs will effectively shut down the system while a schema change takes place, as schema changes require pretty much all locks. <a href="http://www.silverstripe.com">Silverstripe</a> is one example of a content management system built like this; there are many others.</p>

<p>The alternative is the <a href="http://en.wikipedia.org/wiki/Entity-attribute-value_model">entity attribute value</a> (EAV) model (terrible Wikipedia article, please fix), where rather than a direct mapping of the attributes to relations, you indirectly map, creating a table that joins entities, attributes and values; this table of course looks just like RDF triples. Doing this though loses everything that makes a relational database useful: constraints, typing, query optimization. It adds an extra layer of logical schema above the physical schema which the database layer does not understand. This is a pretty common relational mapping for content management systems, as it allows full flexibility in defining and redefining attributes. To implement it well you need a large mid layer to manage the constraints, provide an API layer and generate efficient queries; effectively, to manage the map from the logical layer to the physical layer. The <a href="http://drupal.org/node/82661">Drupal CCK</a> is an example of this model.</p>
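
<p>As a rough illustration of the EAV mapping described above, here is a minimal sketch using SQLite from Python. The table name, the column names, the sample rows and the <code>find_entities</code> helper are all assumptions made up for this example, not taken from any particular CMS. It shows the two costs mentioned: every value ends up as untyped text, and each extra attribute in a query costs another self-join that some mapping layer has to generate.</p>

```python
import sqlite3

# Hypothetical schema: one table of (entity, attribute, value) rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        ("photo1", "depicts", "sheep"),
        ("photo1", "format", "jpeg"),
        ("photo2", "depicts", "sheep"),
        ("report1", "format", "pdf"),
        ("report1", "size_kb", "176"),  # a number, but stored as text
    ],
)


def find_entities(conn, **wanted):
    """Return entities that have every requested attribute=value pair.

    Each requested attribute adds one more self-join on the same table;
    this is the query generation a mapping layer has to take care of.
    """
    joins, params = [], []
    for i, (attr, value) in enumerate(sorted(wanted.items())):
        joins.append(
            f"JOIN eav AS t{i} ON t{i}.entity = e.entity "
            f"AND t{i}.attribute = ? AND t{i}.value = ?"
        )
        params += [attr, value]
    sql = (
        "SELECT DISTINCT e.entity FROM eav AS e "
        + " ".join(joins)
        + " ORDER BY e.entity"
    )
    return [row[0] for row in conn.execute(sql, params)]


sheep_jpegs = find_entities(conn, depicts="sheep", format="jpeg")
```

<p>Adding a new attribute here needs no schema change at all, which is the attraction; the database, however, can no longer check that <code>size_kb</code> is a number.</p>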

<p>Of course this is not to say that the two relational models do not work. The direct mapping works well with simple, unchanging content types in small websites, for example, or in models where attributes are not very sparse, or the sparseness is worth the overhead, and changing the schema is rare. EAV works well too, if managed carefully; it helps if the types of query required on the model are not too complex.</p>

<p>Once you add relations as well as attributes, the already difficult mapping layer gets harder; you add another set of operations (recursion to handle tree structures) that the relational model does not handle well, so you may need to add more into the mapping layer. The promise of NoSQL is that you can bypass this for these types of applications, and program directly to a database model that handles sparse attributes and relations natively. But how much do the NoSQL databases get you? You can argue that if you are already looking at EAV, then you are already not getting much from a relational database, and you are building a modeling layer on top of it, so dropping that and going for something that maps the logical data layer directly does make sense from a development point of view. Whether that really helps performance is less clear; much of the original work for NoSQL has come out of huge scaling, big problems, not actually providing efficient solutions to the types of data mapping problem we are seeing here on a medium scale; of course for huge sites there may be benefits.</p>

<p>The types of NoSQL database vary in their level of support for attributes and relations as they are used in content management. Document oriented databases do not give you much more than retrieval of content items; associative ones give key value type attribute lookups; graph databases should, in principle, let you query relations directly, expressing the types of queries that are needed for information architecture problems. Examples I am thinking of are things like tag clouds, which are simple to express as a graph problem, as a tag cloud is simply a count of the number of edges from a set of nodes. Indeed most information architecture problems look like graph problems, and also like <a href="http://en.wikipedia.org/wiki/OLAP_cube">OLAP processing operations</a>, which also do not work well on relational databases. And of course one of the things that NoSQL has shared with OLAP is the use of denormalization; you can use simpler models if you denormalize data to match the queries you will be using, rather than assuming that the types of query you will use can necessarily be optimized and made efficient by a general purpose system.</p>
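
<p>To make the tag cloud remark concrete, here is a small sketch in plain Python rather than in any real graph database. The edge list and the <code>tag_cloud</code> function are hypothetical; the point is only that the cloud is a count of edges leaving a chosen set of nodes.</p>

```python
from collections import Counter

# Hypothetical edge list: (content item, relation, tag node).
edges = [
    ("photo1", "depicts", "sheep"),
    ("photo2", "depicts", "sheep"),
    ("photo2", "depicts", "field"),
    ("article1", "mentions", "sheep"),
]


def tag_cloud(edges, items):
    """Count the edges from the chosen items into each tag node."""
    selected = set(items)
    return Counter(tag for item, _relation, tag in edges if item in selected)


cloud = tag_cloud(edges, ["photo1", "photo2"])
```

<p>A graph database would presumably answer this from its edge indexes, whereas the relational mappings above need a join and a grouping per relation type.</p>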

<p>Denormalization is not without its difficulties, although arguably it could become a tool embedded in databases like indexes are now. One of the issues with NoSQL is most of the database systems leave denormalization to the user: you need to use it because joins are not available, but you have to manage that yourself. Building an infrastructure to explicitly manage denormalization as a first class database item akin to an index might be interesting. So that gives us a first issue, as in any NoSQL system except a graph database we will either need to denormalize or compose queries to get the results we want.</p>

<p>So I think there are four realistic models for content management backends going forward:</p>

<ol>
<li>The direct relational model for small systems with simple data models, rare attribute changes, little or no use of relations.</li>
<li>EAV models wrapped in a content modeling layer; JCR is an example of this, hiding the underlying SQL layer very well, and indeed allowing it to be replaced with another underlying storage model potentially; I am sure someone is testing a Neo4J backend somewhere. This is where most production solutions are at now.</li>
<li>Direct, non-denormalized graph database backends, with the raw content stored in a document store. This cuts out a special purpose middle level by mapping the domain more directly. As <a href="http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html">Emil Neo</a> says, it may not scale right up as far as the other NoSQL technologies, but it cuts complexity of implementation; there are also issues about whether all the kinds of queries required are available efficiently. I think this will be the sweet spot in a few years once the products mature and we see more open source activity in the field. Of course RDF based solutions, for example using SPARQL, fall into this category too, and the maturity of products around these technologies will help drive this category as well as the NoSQL models.</li>
<li>Big, denormalized systems, probably with software support for managing the denormalization, and using underlying simple but scalable technologies like key-value stores. These already exist in large scale web applications, but may remain niche if the development effort remains high. If frameworks for modelling more easily on these turn up they may trickle down for performance reasons even on smaller datasets; a key value store runs fine on a relational database backend, although the types of processing required probably means a specialized backend is useful.</li>
</ol>

<p>Note that the <a href="http://lilycms.org/">Lily CMS</a> which there was a talk about fits very much into the fourth option above; this is where the NoSQL technologies have perhaps seen most use, but I think there will be a lot of work in order to build a CMS like this now, in particular in terms of tools to support denormalization strategies that are needed. The outlined approach sounded much like the outlines I have been thinking about for this type of model, although I would focus more on tooling for denormalized queries and less on scaling other parts like full text search right now. It will be interesting to follow the progress of this project.</p>

<p>We are at an interesting juncture, where it looks like there are some options that will let us do domain modelling in a way that corresponds more directly to the domain, but there are a lot of interesting challenges on the way.</p>

<p><a href="http://dilbert.com/strips/comic/2008-02-12/" title="Dilbert.com"><img src="http://dilbert.com/dyn/str_strip/000000000/00000000/0000000/000000/00000/1000/800/1869/1869.strip.gif" border="0" alt="Dilbert.com" width="440"/></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/02/nosql-and-content-management/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>JSON vs XML</title>
		<link>http://blog.technologyofcontent.com/2010/01/json-vs-xml/</link>
		<comments>http://blog.technologyofcontent.com/2010/01/json-vs-xml/#comments</comments>
		<pubDate>Wed, 27 Jan 2010 21:16:09 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=206</guid>
		<description><![CDATA[A lot of web developers you meet hate XML with a passion, and JSON has taken its place as the format of choice for a lot of API work. There are some advantages to JSON, but some disadvantages, and XML does have some problems, but the arguments are not as simple as generally made out.

I [...]]]></description>
			<content:encoded><![CDATA[<p>A lot of web developers you meet hate XML with a passion, and JSON has taken its place as the format of choice for a lot of API work. There are some advantages to JSON, but some disadvantages, and XML does have some problems, but the arguments are not as simple as generally made out.</p>

<p>I have been looking at the issue of writing filters for formats that basically change them as little as possible. This is a slightly difficult field in many ways. You have to store a fair amount of extra information in order to do this, but if you are say changing some metadata items for a user they may not want their CDATA removed and replaced with a semantically identical but syntactically different form. So I am looking at the formats partly from this point of view. Sometimes this brings up the conflicts between the human readable and computer manipulation aspects of these formats, the logical and physical structures. I have also been looking at making simple tools to allow modification of document formats, which has also raised some issues.</p>

<h2>Pro JSON</h2>

<p>JSON is simple and now well defined. It used not to be clear in a few places, such as how to encode characters outside the two-byte UTF-16 range using the \u notation. Many people do manage to generate invalid JSON (no quotes around identifiers, use of single quotes, use of a BOM at the start) it seems, which is a problem. I mean if people cannot even get possibly the simplest format ever invented right, what hope is there for civilization? This should get better as standard libraries that do the right thing come out, one would hope! Introducing lax JSON parsers (in the style of HTML 5) seems unnecessary for a simple format that is normally generated by a computer. A strict JSON parser is not a lot of code.</p>
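
<p>For the \u point, the resolution is that a character outside the two-byte range is written as a UTF-16 surrogate pair, i.e. two \u escapes in a row. The sketch below works that out by hand and checks it against Python's standard <code>json</code> module; the helper name <code>escape_astral</code> is made up for the example.</p>

```python
import json


def escape_astral(ch):
    """Write one character above U+FFFF as a pair of \\u escapes."""
    offset = ord(ch) - 0x10000
    high = 0xD800 + (offset >> 10)    # top ten bits go in the high surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom ten bits go in the low surrogate
    return "\\u%04x\\u%04x" % (high, low)


clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the two-byte range
escaped = escape_astral(clef)

# A conforming parser joins the pair back into a single character.
decoded = json.loads('"' + escaped + '"')
```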

<p>JSON has a simple way of showing which of the allowable encodings it is in, based on the zero bytes at the start, as the first two characters must be ASCII. Allowing a BOM and then allowing unicode whitespace might be more standard, but the whitespace has no function except for use in text editors.</p>
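
<p>The zero byte rule can be sketched in a few lines. This is a minimal illustration, assuming input that follows the rule above (the first two characters are ASCII and there is no BOM); the function name <code>detect_json_encoding</code> is hypothetical.</p>

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of a JSON text from its first four bytes.

    Because the first two characters are ASCII, the positions of the
    zero bytes identify the encoding. A BOM is deliberately not handled.
    """
    head = data[:4]
    if len(head) == 4:
        zeros = tuple(b == 0 for b in head)
        if zeros == (True, True, True, False):
            return "utf-32-be"
        if zeros == (False, True, True, True):
            return "utf-32-le"
        if zeros == (True, False, True, False):
            return "utf-16-be"
        if zeros == (False, True, False, True):
            return "utf-16-le"
    return "utf-8"


# Round trip the same text through each encoding.
text = '{"name": "sheep"}'
results = {}
for enc in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    raw = text.encode(enc)
    results[enc] = json.loads(raw.decode(detect_json_encoding(raw)))
```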

<p>Despite the attempts by, for example, E4X to add a simple XML native format and processing model to Javascript, JSON remains much easier to process in most languages, as it is built around structures that most languages just have natively, while XML is not. Some languages have issues with a mismatch to JSON, but most are fine. E4X on the other hand has security issues client side, and is not seeing adoption except in some server side applications.</p>

<h2>Against JSON</h2>

<p>JSON does not have a native hyperlink type. This is unacceptable in a web format in my opinion. For example a REST interface (a real one with HATEOAS, not the clean URLs that many people who use JSON think are REST) requires native links and link types. A native link format such as {&#8220;next&#8221;: <a href="http://example.com/next">http://example.com/next</a>} would solve a lot of issues, and would be compatible. There is JSON-Schema trying to add schemas that can extend the type system, but having to have a schema to understand links just seems overkill to me.</p>

<p>JSON ought to specify that unicode strings are normalized really. I guess most of them are, but it does mean you should normalize before doing comparisons on keys for example.</p>
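
<p>A small sketch of why this matters, using Python's standard <code>unicodedata</code> module. The two sample documents and the <code>normalize_keys</code> helper are invented for the example; the choice of NFC is an assumption, since JSON itself does not mandate any normalization form.</p>

```python
import json
import unicodedata


def normalize_keys(obj, form="NFC"):
    """Recursively normalize object keys so equal-looking keys compare equal."""
    if isinstance(obj, dict):
        return {
            unicodedata.normalize(form, key): normalize_keys(value, form)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        return [normalize_keys(value, form) for value in obj]
    return obj


composed = json.loads('{"caf\\u00e9": 1}')      # e-acute as one code point
decomposed = json.loads('{"cafe\\u0301": 1}')   # e plus a combining acute

same_before = composed == decomposed
same_after = normalize_keys(composed) == normalize_keys(decomposed)
```

<p>The two keys render identically, but only compare equal after normalization.</p>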

<p>There are some syntactically different representations: arbitrary white space, although this does not include the full unicode white space definition, and backslash escaped characters, which can mostly be represented directly in the unicode encoding of the document. Whitespace clearly needs to be preserved for readability and use of line oriented editors and tools. It is unclear how inconvenient it would be if \u codes were normalized to unicode, which is the sane default. I think there were some tools that did not support unicode, although it is mandated by the standard; it is odd perhaps that an ASCII encoding is not an option, but it seems unlikely to be important. In fact though, preserving the exact syntax used is not difficult in many applications as, unlike with some cases in XML, this does not involve much additional state.</p>

<p>Although JSON was designed to serialize data structures, its small set of types is limiting. We are not going to get around this though for now, as computer languages still have such different ideas of types. Most uses of JSON have an implicit schema, which is both a strength and weakness. Most implementations were tightly coupled, for example in AJAX. Now we are seeing more APIs exposed to the world using JSON; these have more need for a schema. I tend to prefer the idea of HATEOAS, the REST idea of hypertext as the interface design constraint, rather than published schemas in the SOAP WSDL style, and JSON seems to be more inclined to move to the latter, especially if people use JSON-RPC; I thought people had given up on the RPC style on the web, but it appears not.</p>

<h2>Data model</h2>

<p>The JSON data model is simpler than XML. This is less clear as a differentiator. XML nodes have attributes and children, JSON ones attributes or children if you consider the object to model an attributes set and the array or list type to model an ordered list of children. This difference is not hard to work around, and it is very domain specific what the requirements are.</p>

<p><a href="http://twitter.com/dehora">Bill de hOra</a> pointed out &#8220;you should add field cardinality to the distinctions &#8211; json needs to change structure [], xml needs just another element&#8221; which is a very good point.</p>
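
<p>The cardinality point can be sketched as follows. The <code>post</code> and <code>author</code> names and the helper functions are hypothetical: going from one author to two changes the type of the JSON value, while the XML version just gains a sibling element and the same query keeps working.</p>

```python
import json
import xml.etree.ElementTree as ET

# JSON: going from one author to two changes the value from string to array.
one_json = json.loads('{"author": "alice"}')
two_json = json.loads('{"author": ["alice", "bob"]}')


def authors_from_json(doc):
    """Consumers must handle both shapes unless an array is always used."""
    value = doc["author"]
    return value if isinstance(value, list) else [value]


def make_xml(authors):
    """Build a hypothetical post element with one author child per name."""
    post = ET.Element("post")
    for name in authors:
        ET.SubElement(post, "author").text = name
    return post


def authors_from_xml(post):
    """The same query works for any number of author elements."""
    return [element.text for element in post.findall("author")]


one_xml = make_xml(["alice"])
two_xml = make_xml(["alice", "bob"])
```

<p>The usual workaround on the JSON side is to use an array from the start for any field that might ever repeat.</p>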

<h2>Schemas</h2>

<p>Schemas for validating are great. Validation is an important activity. It is complicated in general though, with rules such as &#8220;this must be filled in if that is not&#8221; and so on. Essentially a validation schema might need to be very complicated, but many are very simple. Having a choice of languages to express these constraints in seems to me to be a good thing. The XML DTD is too weak, and should not have been included in the language, as discussed below. Some constraints are computationally complex and need very expressive languages.</p>

<p>The second function of a schema is interpretation; this may relate to validation in that a field must be readable as a number say, and we are also going to read it as a number. This is a different requirement, as in many cases it is about object modelling and code generation, when a validated structure is then mapped to a native language object. These are conceptually separate processes, as a number may be constrained to be between 3 and 5 for domain reasons, but the representation in say Java may be an integer, but it need not be. Of course the validation stage here is essential for security reasons, to stop overflows and type errors; however these are conceptually different activities and may have different schemas.</p>

<h2>Against both</h2>

<p>Binary data is a big problem. We will need a lot of other formats for anything that has binary data, they are just so much more efficient, even after compression. So ideas of a universal format are not going to happen.</p>

<h2>Against XML</h2>

<p>XML is weak on unordered items. Most of the structure in an XML document is the child relations and these are ordered. This is used as a criticism, but I am not sure it is that reasonable, as attributes are unordered and as said elsewhere there is an equivalence with the two structures provided by JSON, named and unordered items, and unnamed ordered ones which seems natural.</p>

<h2>Pro XML</h2>

<p>It was pointed out <a href="http://twitter.com/dret">by Erik Wilde</a> that I had missed out the pro XML section. This was an accident. I am actually very pro XML in many ways. First it has enough structure that we can build rich data structures; and to add to that it has some standard forms (such as XHTML) with rich sets of attributes and elements which can be reused in a variety of domains, and standard link relations. The other big thing is the set of extraction and transformation tools, which are generally quite well designed and fairly complete. There are stream and DOM parsers widely available.</p>

<h2>Against XML DTDs</h2>

<p>The DTD, which XML inherited from SGML, is an anomaly in many ways. First it has a non XML syntax, so we need another set of parsers and tools to work with it. It has several functions that really need to be separated. The first function is as a schema for validating documents against. Unfortunately it is not a very good schema language, as the constraints it can apply against documents are limited. Now we have for example XML Schema and RELAX-NG, which are better schema languages, but the DTD has a special position in the specification that is difficult to drop.</p>

<p>In addition to being a schema, the DTD can also define default values for attributes that the application should see just as if they were in the document. This is the kind of thing that makes preserving the textual form difficult, as there is a syntactic but not semantic difference between certain attributes. I also do not think that this is used much, as real defaults would be implied by the processing model not the document. Clearly it is easy to remove this feature from documents simply by adding in all the implied defaults explicitly.</p>

<p>There are security issues due to the parsing issues with entities, which means that <a href="http://msdn.microsoft.com/en-us/library/ms756016%28VS.85%29.aspx">some parsers disable DTD parsing for security reasons</a>. SOAP for example does not support DTDs. This is of course non conforming, but clearly a good idea in many situations.</p>

<p>DTDs are not namespace aware, which makes them unusable in many cases with documents with namespaces. Another reason to deprecate them.</p>

<h2>Against XML entities</h2>

<p>Then there are entities. My reading of the initial spec is that entities were designed to save typing for people, but I do not think that they are used for anything except for memorable encodings of characters outside the ASCII set. The thing about this use case is that it is perfectly alright to substitute the values for them, as they never change, whereas if I create my own arbitrary entity in a DTD for the name of something it may be because I wish to use this like a search and replace function to substitute whatever I want in. This is in my opinion not really appropriate at the document format level; this is an application level tool, and the application should use regular XML tags for this type of user level structure.</p>

<p>XML entities can also be used as an inclusion mechanism; again the DTD is not the place to define this. XInclude seems much better if this facility is needed.</p>

<p>Entities can contain other entities, markup and so on. Recursion, and unbalanced markup are not allowed. This whole thing adds enormously to parsing complexity, when the use case is entirely as character data.</p>

<h2>Against XML namespaces</h2>

<p>I am not against XML namespaces per se, but there are <a href="http://lists.xml.org/archives/xml-dev/200204/msg00170.html">pathological cases</a> which make them very hard to process sanely. In particular, you can redefine the same namespace name to refer to multiple URIs in the same document, and you can refer to the same URI with different names. This effectively means that all processing needs to refer to both the short name and the full name. As this is exactly what the spec was trying to avoid, it is pretty bad. The amount of state you need to keep a namespaced document textually the same after processing is very large; the nasty mess one tends to get from parsers to let you cope with namespaces is one measure; another is the complexity of xpath on namespaced documents, especially ones containing any of the pathological cases.</p>
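
<p>Both pathological cases fit in a three element document. The sketch below uses Python's standard <code>xml.etree.ElementTree</code>; the document and its example.com URIs are made up. The parser reports expanded names because the prefix alone is ambiguous, and in doing so it throws away the prefixes, which is exactly the state you would need to write the document back out unchanged.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical document: prefix "a" is bound to two different URIs,
# and the first URI is also reachable through a second prefix "b".
DOC = (
    '<a:root xmlns:a="http://example.com/one">'
    '<a:item xmlns:a="http://example.com/two"/>'
    '<b:item xmlns:b="http://example.com/one"/>'
    '</a:root>'
)

root = ET.fromstring(DOC)

# ElementTree exposes "{uri}localname" and discards the prefixes.
expanded = [element.tag for element in root.iter()]
```

<p>Here <code>a:item</code> is in a different namespace from <code>a:root</code>, while <code>b:item</code> is in the same one, so a search for the text <code>a:</code> tells you nothing reliable.</p>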

<p>The simple solutions seem to involve not allowing redefinition of namespaces to a different URI in the same document, or the converse; declaring all the namespaces that will be used in the root element is also an option. This means processing can be more or less namespace unaware, as xsd:type will mean the same thing regardless of the context. This falls in with the standard usage, where a fairly small set of namespaces are used and they have abbreviations by convention that remain constant across large sets of documents. This means that very little namespace awareness complexity is needed.</p>

<h2>Other issues</h2>

<p>Mixed content, the role of CDATA, the significance of whitespace, these are all extremely complex issues that could be simplified.</p>

<h2>Minimal XML proposals</h2>

<p>XML, quite hard but worth it? For the applications I am interested in, I think simplification is needed. The first issue is that security and simplicity are related. Anything web facing will get hostile documents thrown at it, and having more constraint helps, in a way that the document processing industry does not see so much as an issue.</p>

<p>There was a time ten years or so ago, when minimal XML proposals were fashionable. XML itself was of course an attempt at a minimal SGML proposal, but not enough was cut or changed, and much compatibility was kept. <a href="http://simonstl.com/articles/cxmlspec.txt">Common XML</a> seems the most reasonable to me, and addresses many of the issues. XML tools do not work in the way that was perhaps envisaged, and making things simpler and easier, evolving them, will make them more robust. JSON shows that the demands for simplicity are there, and XML will suffer if it does not answer these.</p>

<p>The first thing is to drop the DTD. It serves no real function now we have alternative schema languages for XML. Radically, I think we can drop entities too, other than the necessary ones for escaping (amp, quot etc), and numeric ones which are again syntactic. The only possibility for requiring named entities is XHTML, but it barely exists now, and those entities could be special cased there without difficulty, as their values will never change and they do not contain markup or other things that cause parsing issues. Arguably these named entities could be added to the XML spec anyway for all documents, changed to a purely syntactic thing. I am not aware of any other XML usage of entities; there may be a few I suppose.</p>

<p>For namespaces, there needs to be a solution that maps syntax to semantics, so that an attribute or element syntactic name has the same semantics throughout the document. Renaming in different scopes makes global transformations, comparisons, and simple processing too hard. It breaks simple search and replace, even that needs to be namespace aware.</p>

<h2>Data versus applications</h2>

<p>Part of the conflict is due to whether XML is an application protocol, or a data format. Some of the bits that have issues, like entities, are really part of an application data format, for a class of applications that work according to the model in the mind of the XML designers, which in turn was based on real SGML applications. But data formats are winning really. We want to attach additional semantics to data now through standard mechanisms, such as relations, RDF and so on, not by expanding the storage format. Simplicity is winning here: complexity in a data format does not add to the richness that can be expressed; simple uniform mechanisms can do this. And simplicity is going to win; linked data over Microsoft Word style application data formats.</p>

<h2>What will happen?</h2>

<p>I actually think these changes are, informally, happening. DTDs and entities are not used in many cases now. They may be in some publishing applications, especially those based on SGML, but the web document architecture does not use them significantly. Namespaces are used in a particular way, usually. HTML5 has shown what the logic of human readability and writeability implies, which is a non XML language. The great advantage of XML is the variety of ways in which it can be processed, but issues such as security against hostile documents, parsing complexity, performance, and ease of processing really matter a lot, and despite many weaknesses JSON is showing the way of radical simplicity. But a simplified XML would be no more complex than JSON I think, and have the advantages of richer tool support, and widespread use. Most of the XML in the wild and in APIs is very simple; the sorts of XML that are embedded in other documents as metadata are simple too. Security is limiting processing, and the traditional publishing applications that historically used more of the functionality could change too, although more slowly. Will simplicity win, and will JSON replace XML? I think not, because so much XML is in use, but I think a specification of an XML subset is needed to stabilise the situation.</p>

<p><a href="http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags"><img src="http://blog.technologyofcontent.com/wp-content/uploads/2010/01/parse.png" width="440" alt="you cannot parse XML with regular expressions"></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/01/json-vs-xml/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Scaling, Security and architecture in 2010</title>
		<link>http://blog.technologyofcontent.com/2010/01/scaling-security-and-architecture-in-2010/</link>
		<comments>http://blog.technologyofcontent.com/2010/01/scaling-security-and-architecture-in-2010/#comments</comments>
		<pubDate>Sun, 17 Jan 2010 18:47:53 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=200</guid>
		<description><![CDATA[This post is about a bunch of stuff I have noticed recently, things that are affecting software and hardware architectures, and security; it is a bit miscellaneous perhaps. As application architectures on the enterprise move towards emulating web scale architectures these trends will affect software more widely. This concentrates on Linux, the operating system the [...]]]></description>
			<content:encoded><![CDATA[<p>This post is about a bunch of stuff I have noticed recently, things that are affecting software and hardware architectures, and security; it is a bit miscellaneous perhaps. As application architectures on the enterprise move towards emulating web scale architectures these trends will affect software more widely. This concentrates on Linux, the operating system the internet is now built on, and how it is modifying the trends to fit with ways of doing things that may be different from what goes on in other communities. Security continues to be more and more important as the environment for applications becomes more hostile.</p>

<h2>Virtualization</h2>

<p>Virtualization mainly started as a way to deal with issues in running multiple services on Windows, due to compatibility issues. This has always been much less of an issue with Linux applications, due to the scale of supporting libraries packaged by distributions. It is still an issue though, for security reasons (Apache without suexec for shared hosting still exists, bypassing OS based multi tenancy security, a model that should have gone years ago). KVM, which uses Linux as a hypervisor and uses the hardware virtualization capabilities of newer hardware, is now in the Linux kernel, and supported in Red Hat Enterprise Linux. I suspect this will gradually overtake Xen and VMware in areas where only Linux is of interest, due to the built in kernel support; however lighter weight solutions for the security issues such as containers will probably take off instead for many applications where running multiple kernels is unnecessary.</p>

<h2>Containers</h2>

<p>Linux now has a full container model called LXC, similar in principle to BSD jails and Solaris zones. It arrived a bit gradually as a set of patches to namespace various parts of the system such as the process ID space, so a container has its own init process with ID 1 and can have the same IDs as other containers (this also is needed for process migration). There is also a network namespace, so each container has its own loopback device, and independently named network devices (that can for example be bridged back to the host). There is also a read only bind mount which can be used to safely export libraries and binaries to multiple containers with updates done centrally if required; otherwise the container can be managed as a standalone system just sharing the kernel. This environment provides a level of secure isolation between containers that solutions such as chroot never had. Processes in containers can be seen from the container host so obviously this needs to be well secured. Because containers do not need hardware support and are very lightweight I think they will grow rapidly in popularity; they can also run within a virtual machine guest for process isolation in a virtual environment. Ubuntu 10.04 will have <a href="https://wiki.ubuntu.com/ContainersSpec">full support</a>; earlier versions do work.</p>

<h2>Capabilities</h2>

<p>The old high risk ways of setuid binaries (with broad permissions) are going at last, replaced by a fine grained capabilities system. In principle this means you can drop root capabilities completely, making root an unprivileged user. There is a <a href="http://ols.fedoraproject.org/OLS/Reprints-2008/hallyn-reprint.pdf">good summary article on this</a> and <a href="http://www.linuxjournal.com/article/10249">another on trying to remove root access</a>. It seems that we will not see pure capabilities based Linux distributions for a while, and will have setuid binaries in general purpose systems, but there is no reason why single application sandboxes should not drop root capabilities in their init process and just use capabilities set in the file system. Fedora seems the furthest ahead in trying this out as a full distribution, and hopefully this will move ahead, adding another security layer in addition to SELinux.</p>
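
<p>To make the idea of fine grained capabilities a little more concrete, here is a minimal sketch of how a capability set is just a bitmask, of the kind Linux reports in the CapEff field of /proc/self/status. The helper names <code>decode_caps</code> and <code>drop</code> are my own inventions for illustration, only a handful of capabilities are listed, and the code only manipulates masks; it does not change the privileges of any real process.</p>

```python
# Hypothetical helpers: decode and edit a Linux capability bitmask.
# Bit numbers follow linux/capability.h; only a subset is listed here.
CAP_NAMES = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    10: "CAP_NET_BIND_SERVICE",
    13: "CAP_NET_RAW",
    18: "CAP_SYS_CHROOT",
    21: "CAP_SYS_ADMIN",
}


def decode_caps(hex_mask):
    """Return the known capability names that are set in a hex mask string."""
    mask = int(hex_mask, 16)
    return [name for bit, name in sorted(CAP_NAMES.items())
            if (mask // (2 ** bit)) % 2 == 1]


def drop(hex_mask, name):
    """Return a new 16-digit hex mask with one named capability cleared."""
    bit = next(b for b, n in CAP_NAMES.items() if n == name)
    mask = int(hex_mask, 16)
    if (mask // (2 ** bit)) % 2 == 1:
        mask -= 2 ** bit
    return format(mask, "016x")
```

<p>A web server that only needs to bind to port 80 would, in this picture, keep the single CAP_NET_BIND_SERVICE bit and drop everything else, rather than running with the full root mask.</p>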

<h2>Sandboxing</h2>

<p>Privilege separation in network applications has been around for a while, but it is starting to spread, with the best example being the <a href="http://blog.chromium.org/2008/10/new-approach-to-browser-security-google.html">Chrome security model</a>. The thing that has really started to change is treating all complex bits of code, such as HTML rendering in Chrome, as potentially hostile as they are likely to be buggy. There is a lot to do to get good security thinking pervasive in application design, but having some well thought out examples is a good start. Currently Linux Chrome seems to offer a <a href="http://code.google.com/p/chromium/wiki/LinuxSandboxing">choice of sandboxing methods</a> of varying effectiveness, from a suid helper to using <a href="http://lwn.net/Articles/332974/">seccomp</a>.</p>

<h2>SELinux</h2>

<p>SELinux has been available in Linux, providing a Mandatory Access Control framework for ten years now, but it has taken that long for it to get really widespread use, mainly pushed by RedHat. Gradually it is extending to other applications, such as mod_selinux for Apache that runs web applications in appropriate security contexts; Postgres SELinux extensions are also available. We are getting to a point when OS security mechanisms can and will be used as they provide the types of security hooks that modern applications need, after a period where we have had applications inventing their own security mechanisms because the OS did not provide the right ones.</p>

<h2>Physicalization</h2>

<p>There was an interesting new buzzword this year: <a href="http://arstechnica.com/business/news/2009/11/basics-of-physicalization.ars">physicalization</a>. Yes, just when you thought virtualization was an important new trend, along comes the opposite. What is the idea?</p>

<p>A two socket 8 core server with 16GB RAM and multiple ethernet ports divided into four virtual servers is actually quite expensive compared to four commodity low end boxes. There is a server premium built into the chip manufacture profit model for a start, and also a volume issue.</p>

<p>The price arbitrage is fairly compelling, although the other costs (disks, motherboards, networking) add up and reduce the saving. The example systems are things like <a href="http://www.sgi.com/products/servers/microslice/">SGI&#8217;s Microslice</a> &#8211; yes SGI, that name from the past! This offers dual core but single CPU systems, but with ECC, for significantly lower price and power consumption than typical two way servers, and potentially more throughput per $, for some workloads.</p>

<p>There are even some suggestions that for Linux workloads non x86 architectures (eg ARM) might be competitive for applications that scale out effectively to multiple machines, although I think the risk of introducing these would be high, and there would need to be a big buyer.</p>

<h2>Cloud</h2>

<p>The big coming trend as the world comes out of recession is that cloud computing platforms are cheap, very cheap, compared to in house server provision. Some estimates put it at 20% of cost now, falling to 10% this year. Part of this is economies of scale, part is standardized components and architectural options, and economies of scale in administration. Part of it may be untrue, as there certainly do not appear to be good figures. What is clear is that the SAAS model is compelling for many kinds of product, and fits in with a general movement to charge software as an expense not an investment. There is a lot of hype, and a lot of people have seen the cloud idea before under different names, but the web has produced a viable delivery mechanism, and the uniformity of hosting environments like EC2 cuts costs. Costs such as upgrades are much lower in a SAAS environment too; although the architecture of this software needs to be different to support that.</p>

<h2>Availability</h2>

<p>Over the last year or so, high availability programming has reached wider awareness. The <a href="http://www.infoq.com/presentations/Systems-that-Never-Stop-Joe-Armstrong">Erlang model</a> has become better known, bringing more awareness of the base elements for building reliable systems such as process supervision. We are starting to see other implementations, such as <a href="http://akkasource.org/">Akka</a>. This is a great move, as availability needs to move from being a sysadmin and maintenance issue to being a coding issue; for too long effective handling of failure has been ignored by programmers.</p>
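
<p>Process supervision is easier to see in code than to describe. The following is only a toy sketch in Python, not how Erlang or Akka actually implement it; <code>supervise</code>, <code>max_restarts</code> and <code>make_flaky</code> are names I have made up. The idea it shows is that the worker does not try to recover from its own failure; something above it restarts it, and gives up and escalates once a restart budget is used.</p>

```python
# A toy supervisor in the spirit of Erlang-style supervision.
# All names here are illustrative, not a real library API.

def supervise(worker, max_restarts=3):
    """Run worker(); restart it on failure up to max_restarts times.

    Returns (result, restarts). Re-raises the last error once the restart
    budget is exhausted, so the failure escalates to the caller.
    """
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception:
            if restarts == max_restarts:
                raise
            restarts += 1


def make_flaky(failures):
    """Build a worker that crashes `failures` times and then succeeds."""
    state = {"left": failures}

    def worker():
        if state["left"]:
            state["left"] -= 1
            raise RuntimeError("worker crashed")
        return "ok"

    return worker
```

<p>Calling <code>supervise(make_flaky(2))</code> rides over two crashes, while a worker that keeps failing past the budget causes the error to propagate upwards, where a higher level supervisor could decide what to do.</p>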

<h2>Locks</h2>

<p>As applications start to scale to more threads on multicore CPUs, locking becomes more of an issue. <a href="http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms">Lock-free algorithms</a> are one interesting answer that has emerged that can work well for some  algorithms. Getting past the scaling issues as architectures get more cores needs innovation in lots of areas such as this. Locks are definitely in the sequential areas that limit scaling through <a href="http://en.wikipedia.org/wiki/Amdahl%27s_law">Amdahl&#8217;s law</a>.</p>
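
<p>Amdahl&#8217;s law itself is a one line formula: if a fraction p of the work can run in parallel on n cores, the best possible speedup is 1 / ((1 - p) + p / n). A small sketch, with a function name of my own choosing, makes the effect of the sequential part easy to play with.</p>

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative workload: 90% parallel, 10% serialised (for example by locks).
    for cores in (2, 8, 64):
        print(cores, round(amdahl_speedup(0.9, cores), 2))
```

<p>With 10% of the time spent in sequential sections, the speedup can never exceed 10 however many cores are added, which is why shrinking the locked regions matters more than adding hardware.</p>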

<h2>Summary</h2>

<p>Software architecture is at an interesting point; the principles of web architecture and the security mindset are gradually feeding into tools and infrastructure and becoming more widespread, and delivery is also changing. Scalable, available and secure systems are the aim.</p>

<p><a href="http://dilbert.com/strips/comic/2009-11-19/" title="Dilbert.com"><img src="http://dilbert.com/dyn/str_strip/000000000/00000000/0000000/000000/70000/4000/100/74150/74150.strip.gif" border="0" alt="Dilbert.com" width="450"/></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/01/scaling-security-and-architecture-in-2010/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Standards Diagram for Content Management</title>
		<link>http://blog.technologyofcontent.com/2010/01/standards-diagram/</link>
		<comments>http://blog.technologyofcontent.com/2010/01/standards-diagram/#comments</comments>
		<pubDate>Tue, 12 Jan 2010 23:17:33 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[CMS]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=191</guid>
		<description><![CDATA[This is attempt number 1 of a diagram which I promised Jon Marks after his post. No it still does not have OSGI in! As Jon&#8217;s presentation used Prezi I thought I would give it a go. It takes a while to get the hang of it but it is fun. I can&#8217;t work out [...]]]></description>
			<content:encoded><![CDATA[<p>This is attempt number 1 of a diagram which I promised <a href="http://jonontech.com/2010/01/10/an-incomplete-directory-of-open-standards/">Jon Marks</a> after his post. No it still does not have OSGI in! As Jon&#8217;s presentation used <a href="http://prezi.com/">Prezi</a> I thought I would give it a go. It takes a while to get the hang of it but it is fun. I can&#8217;t work out how to get an overview at the end though&#8230;</p>

<p>Just press the play button to move around.</p>

<iframe height="300" src="http://prezi.com/nifkatyvrk02/view" width="450"></iframe>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/01/standards-diagram/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Open source and Content Management (for Janus Boye)</title>
		<link>http://blog.technologyofcontent.com/2010/01/open-source-and-content-management-for-janus-boye/</link>
		<comments>http://blog.technologyofcontent.com/2010/01/open-source-and-content-management-for-janus-boye/#comments</comments>
		<pubDate>Sun, 10 Jan 2010 14:45:31 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[CMS]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=186</guid>
		<description><![CDATA[Janus Boye said the other day after the BCS open source seminar in London


  @McBoof I left London dazed and confused when it comes to open source. Somebody pls. help me explain what open source really means #idiot


Now I only spoke to him very briefly before he had to rush to the airport, but [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://twitter.com/janusboye">Janus Boye</a> said the other day after the BCS open source seminar in London</p>

<blockquote>
  <p>@<a href="http://twitter.com/McBoof">McBoof</a> I left London dazed and confused when it comes to open source. Somebody pls. help me explain what open source really means #idiot</p>
</blockquote>

<p>Now I only spoke to him very briefly before he had to rush to the airport, but hopefully the following will be helpful: first an overview of the important things about open source in general, and then how they are and will affect content management in particular.</p>

<h2>Open Source</h2>

<p>I think it is easiest for software developers to understand open source. It came from that community, and it addresses our needs. For a long time no one outside that community was really concerned with it. I think the first time I noticed someone who was not a developer showing an interest was when I was stopped at a tram stop in Vienna by an American, as I was wearing a Red Hat T-shirt just after their float, and was asked who the next open source IPO was going to be. That was 1999, ten years ago now. I suppose that IPO was a big event in the spreading awareness of open source, although it did not perhaps spread much information about what it was really about.</p>

<p>I think the best place to start with trying to understand open source is with three things. I tend to have a bit of a historical approach to things&#8230;</p>

<p>The first is Richard Stallman. I recommend <a href="http://www.fsf.org/events/rms-speeches.html">him in person</a> rather than in writing actually. Actually that reminds me the first time I ever saw him I was sitting in the <a href="http://www.foundry.tv/">Foundry in Old Street</a> and he walked in and proceeded to autograph a woman&#8217;s breasts. Anyone who wants to understand open source should hear him explain the roots of the open source movement. I will not really try to explain all that here, but openness is what created the scientific method, and the idea that software got to the point where it was no longer possible to make it do what you wanted because you did not have access to source code, the point where you could not build on stuff any more or fix it, where control of your tools is taken away is a key part of it. Some people have tried to sanitize Richard out of things (the open source vs free software mess) but that is a mistake.</p>

<p>Second is Eric Raymond&#8217;s essay <a href="http://catb.org/~esr/writings/homesteading/">The Cathedral and the Bazaar</a> which was very influential at the birth of commercial open source. It is strictly about software development methodologies, and much of the discussion about the cathedral methods is applicable to open source software too. It is about the huge changes that the internet brought in open source development, the birth of a development method that no longer copied the methods of closed source development but utilised the openness to create true large scale community development in a way that was not possible before, and which closed source cannot replicate. Linux is of course the classic early example of this.</p>

<p>Which brings us to the third thing, community. Open source is first of all participatory, not just for consumption, perhaps a bit against the grain of late twentieth century culture. Actually I am an optimist, <a href="http://www.herecomeseverybody.org/2008/04/looking-for-the-mouse.html">with Clay Shirky and against the sitcom</a>, and think culture is swinging this way but we shall see. So for open source, start by using it, then participate. No, you do not have to code, although you can learn; there are other ways: bugs, documentation, all sorts. If you just want to see what the community looks like, I can&#8217;t recommend anything better than going to a good conference, like <a href="http://fosdem.org/">FOSDEM next month in Brussels</a>.</p>

<h2>Open source in content management</h2>

<p>Open source has not affected content management much yet. Almost all content management by volume takes place on open source products (by volume WordPress, Joomla! and Drupal far outweigh anything else). By value it is less clear: open source always has an issue with by value calculations as the revenue models are different. Linux is not the leading server operating system by value, but it is by installed base, and probably also by the value of the services running on it.</p>

<p>But arguably open source content management software has not affected the industry yet, looking now at the larger installations, and the areas that Janus is interested in, indeed that I am. The industry has grown up in a mess as far as standards, ideas, infrastructure are concerned, but the <a href="http://en.wikipedia.org/wiki/Reality_Checkpoint">reality checkpoint</a> has been reached. Two standards have so far started to change the technology landscape of content management, JCR and CMIS, and almost all the implementations of these are open source, and most are cross-vendor projects. This change will grow as more standardization and commoditization sweeps the industry, as the industry adopts a web infrastructure rather than the pre-web legacies inherited from the document management history of the business. Everything that this business deals with will be served through the web; almost all web infrastructure is open source software; content management will be no different.</p>

<p>In this field this is all just beginning. Like open source as I said above, it started with developers, about more efficient ways of building, architecting and delivering software; in terms of influence on the end users it is still small. But things are turning as people become aware of open source in the industry, but they clearly still need some help understanding it. I hope this has helped.</p>

<p><a href="http://dilbert.com/strips/comic/2007-08-03/" title="Dilbert.com"><img src="http://dilbert.com/dyn/str_strip/000000000/00000000/0000000/000000/00000/1000/600/1676/1676.strip.gif" border="0" alt="Dilbert.com" width="480"/></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/01/open-source-and-content-management-for-janus-boye/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>The bottom 10 things of 2009</title>
		<link>http://blog.technologyofcontent.com/2009/12/the-bottom-10-things-of-2009/</link>
		<comments>http://blog.technologyofcontent.com/2009/12/the-bottom-10-things-of-2009/#comments</comments>
		<pubDate>Tue, 22 Dec 2009 23:40:04 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=180</guid>
		<description><![CDATA[Ok so I agreed to write a bottom 10 list for 2009, in a twitter agreement with @pmonks. Unfortunately I have just had another bout of winter flu so it has got a bit late so I may not make it to 10, unless there is another last minute entry (any suggestions?). Actually checking the [...]]]></description>
			<content:encoded><![CDATA[<p>Ok so I agreed to write a bottom 10 list for 2009, in a twitter agreement with <a href="http://twitter.com/pmonks">@pmonks</a>. Unfortunately I have just had another bout of winter flu so it has got a bit late so I may not make it to 10, unless there is another last minute entry (any suggestions?). Actually checking the tweet, it said bottom for 2010, but it is traditional to do that in the new year. So here goes, here is what kept me awake in the night in 2009.</p>

<h2>10. The Pirate Bay saga</h2>

<p>Yet another mess in the ongoing spectacle of the entertainment industry preferring legal to creative solutions was the Pirate Bay trial. All this really showed us was that the laws are just not well framed, so anyone could win, and it may all change on appeal. This time it was not suing your customers directly, but legal action is not going to make anyone change their mind. Obviously the next step is going to be to influence the passing of bad laws, not the creation of business value. It seems uncoincidental that Spotify is Swedish. In times of change, business model engineering and service engineering are as important as product engineering. Legal action in the way the entertainment business is conducting it creates nothing long term.</p>

<h2>9. <a href="http://en.wikipedia.org/wiki/Internet_censorship_in_Australia">Australian internet censorship</a></h2>

<p>Get your act together Australians and stop this. Many other governments are looking at ways to start doing this, so it is an important example.</p>

<h2>8. The EU MySQL Oracle Sun delay</h2>

<p>You cannot make industrial policy on this sort of timeline. If the EU were to turn down the deal now Sun would be destroyed. Oddly MySQL was at a transition point anyway. I am very much in favour of the <a href="http://en.wikipedia.org/wiki/Drizzle_%28database_server%29">Drizzle idea of the future of MySQL</a>; who knows where it will end up but it may well be outside Oracle anyway.</p>

<h2>7. SPARQL is a query language without a resource model</h2>

<p>Looks like this has a chance of being fixed in 2010 at last, although I have temporarily mislaid the references, check for the newer references to named graphs. The idea that you could launch a query language for the web without a resource model was yet another of the dumb W3C ideas. The model appeared to be to build XML in Prolog. That sucks. Unfortunately the fixes are quite substantial (quads not triples for example).</p>

<h2>6. WebDAV</h2>

<p>Although not exactly something from this year, remarkably it has kind of held on, since people still specifically mention it as an alternative to CMIS. Indeed it is kind of useful sometimes, in strange situations, and it does work in a limited way, but it is not a modern HTTP interface. You have to remember how early it is, as work started in 1996, when it was not clear how the web would develop, or indeed how HTTP would develop (HTTP 1.1 was out but not much used and it was shipped in a mostly 1.0 environment). Even at the time some of the mistakes were clear, but the great thing is they are all documented in Yaron Goland&#8217;s <a href="http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0308.html">The WebDAV Book of Why</a>. Like the issues of hierarchy that make mixing WebDAV with normal HTTP impossible, and the <a href="http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0303.html">depth header disaster</a>. There are also some comments about it in the <a href="http://jonudell.net/udell/2006-08-25-a-conversation-with-roy-fielding-about-http-rest-webdav-jsr-170-and-waka.html">Roy Fielding podcast</a> in which Roy tries to avoid talking about JSR. The best thing about WebDAV is how well documented the mistakes are; this should be compulsory for all standards.</p>

<h2>5. <a href="http://en.wikipedia.org/wiki/GeoCities">Geocities</a></h2>

<p>The embarrassing kiddie years of the internet dead and buried. Mostly not the worst of 2009, but the idea you could still nurture your city. Obviously an anti archival moment for future historians to curse about. Still, a hubris reminder too, this was once the third most popular site on the internet, and sold for $3.57 billion; look on their works ye mighty and really despair.</p>

<h2>4. Microformats</h2>

<p>Never going to work. We really need generic metadata representations that have sane serializations or embeddings into all formats. Metadata <a href="http://blog.technologyofcontent.com/2009/08/metadata-is-not-what-it-used-to-be/">now lives within documents</a>; it used to get lost before that. So the RDF model has won, and microformats have lost. Oh, and the standards process sucked.</p>

<h2>3. The XHTML2 débâcle</h2>

<p>Had to happen, but why did it take so long for the W3C to fall behind HTML5 rather than XHTML2? This was a huge diversion of resource. The W3C churns out stuff and some of it gets adopted, some is implementable, some of it is not implementable realistically. The organization needs to change or it will be irrelevant.</p>

<h2>2. The Go programming language</h2>

<p>I am an aficionado of programming languages. I have programmed in many of them, C, Haskell, you name it. Lua and Erlang are my new ones for the year, though it&#8217;s getting a bit late and I have barely started. I know my combinators from my closures. What is the point of <a href="http://en.wikipedia.org/wiki/Go_%28programming_language%29">Go</a>? It does not really offer anything for the currently interesting problems, and I do not think it is going to make it anywhere. I would be surprised if it ever gets onto the allowed Google programming language list, which is <a href="http://steve-yegge.blogspot.com/2007/06/rhino-on-rails.html">C++, Python, Java, Javascript</a> since you ask. Google is doing some cool performance work on Python though, under the name <a href="http://code.google.com/p/unladen-swallow/wiki/ProjectPlan">unladen swallow</a>.</p>

<h2>1. I4I&#8217;s patent win over Microsoft</h2>

<p>A last minute entry here. I4I has an <a href="http://www.theregister.co.uk/2009/12/22/microsoft_loses_word_patent_appeal/">injunction against Microsoft selling Word</a> without the generic XML editing functionality removed. Obviously it will be removed, and it is not a feature that a lot of people used. However <a href="http://broadcast.oreilly.com/2009/08/microsoft-and-the-two-xml-pate.html">analysis of the patent</a> indicates that it clearly has prior art, is unclearly applicable, and could affect many other XML applications. The affected part of Word is designed to be a fairly general XML processor, with similar capabilities to <a href="http://en.wikipedia.org/wiki/XForms">XForms</a>. We need to support Microsoft in getting the judgement reversed.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2009/12/the-bottom-10-things-of-2009/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Smart resources, or why you should care about HTTP PATCH</title>
		<link>http://blog.technologyofcontent.com/2009/12/smart-resources-or-why-you-should-care-about-http-patch/</link>
		<comments>http://blog.technologyofcontent.com/2009/12/smart-resources-or-why-you-should-care-about-http-patch/#comments</comments>
		<pubDate>Fri, 11 Dec 2009 23:42:06 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[REST]]></category>
		<category><![CDATA[HTTP]]></category>
		<category><![CDATA[PATCH]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=169</guid>
		<description><![CDATA[Unusually, there has been a significant change to the HTTP protocol this week. The PATCH method was approved by the IETF.

This is a big change as one of the parts of the HTTP model is the small &#8220;uniform interface&#8221;, where there are very few things you can do to web resources. GET is the most [...]]]></description>
			<content:encoded><![CDATA[<p>Unusually, there has been a significant change to the HTTP protocol this week. The <a href="http://greenbytes.de/tech/webdav/draft-dusseault-http-patch-16.html">PATCH method</a> was approved by the <a href="https://datatracker.ietf.org/drafts/draft-dusseault-http-patch/">IETF</a>.</p>

<p>This is a big change as one of the parts of the HTTP model is the small &#8220;uniform interface&#8221;, where there are very few things you can do to web resources. GET is the most common, to retrieve a resource representation. Then there is PUT to update a resource, and DELETE to delete it. Then there is POST, which tends to cover everything else you might want to do. The problem with that is that discovering the interface for POST is difficult, as is knowing exactly what it will do. (There are a few other verbs too).</p>

<p>PATCH is much more straightforward. PUT updates an entire resource with a new version, while PATCH just makes an amendment to a resource. For some types of resource, the entire resource may be large, so that just sending differences will save bandwidth. Also, sending the full resource may unnecessarily make the changes sequential, for example append operations where the order of the operations is not significant. One example given is a log file, where many processes  may be adding entries, and if they had to retrieve the whole log, append a new entry and write it back there would be a lot of extra traffic, and a chance of either lost updates or processes having to retry if the resource was modified during this process. Clearly a PATCH operation here that does an append would make sense. I am not sure that is actually a very good example though, as you would  almost certainly create a resource for each log entry, rather than one for the whole lot, but clearly other similar patterns exist.</p>
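
<p>The difference between the two styles of update can be sketched with an in-memory stand-in for the log resource. This is only an illustration under my own assumptions: <code>LogResource</code> and its methods are made-up names rather than any HTTP library, the version counter stands in for an ETag, and the returned numbers mimic HTTP status codes.</p>

```python
# In-memory sketch of a log resource updated by PUT or by an append PATCH.
# LogResource and its methods are hypothetical names, not a real API.

class LogResource:
    def __init__(self):
        self.entries = []
        self.version = 0  # stands in for an ETag

    def get(self):
        """Return a copy of the full representation and its version."""
        return list(self.entries), self.version

    def put(self, entries, if_match):
        """Replace the whole resource; refuse if someone else wrote first."""
        if if_match != self.version:
            return 412  # Precondition Failed: the client must GET and retry
        self.entries = list(entries)
        self.version += 1
        return 200

    def patch_append(self, entry):
        """Apply an append-only change; no prior GET or version is needed."""
        self.entries.append(entry)
        self.version += 1
        return 200
```

<p>Two writers that both GET, modify and PUT will collide, and the second has to fetch the whole log again and retry. With the append style PATCH neither needs to read the log at all, and the order in which their entries arrive does not matter.</p>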

<h2>HTTP is not a filesystem</h2>

<p>When it was new, people tended to treat HTTP like a filesystem. After all that was the common model for storage, and web servers generally stored web pages as files, so they tended to be treated much like that, with filename extensions and index files, and WebDAV was created to try to make the web usable as a filesystem protocol. This model does not really work very well however, as it does not model the things you can and can&#8217;t do with HTTP. The methods are one example; updating entire resources at once means they tend to be small units, rather than, for example, log files. File systems generally struggle to store millions of tiny files without wasting a lot of space, and without becoming slower. The web resource does not have to support full Unix filesystem semantics (a topic that oddly Wikipedia seems to be missing an entry on! May have to rectify that), and supports a much simpler update model.</p>

<p>Maybe the easiest way of thinking about HTTP is to see every URL as a small <a href="http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller">Model View Controller</a> (MVC) system. The model is an abstract resource, which we can see through GET, which retrieves views. There can be multiple views, as a request can ask for different media types and encodings for a single resource. The controllers are the media types supported by PUT, which are usually the same as those for GET, but need not be; because both the view and the controller need to represent the whole resource state, they do tend to be quite complex representations, such as XML documents. PATCH however is also a controller, but a more interesting one in many ways, as it can send just state changes to the model, which tends to be how many MVC systems work.</p>

<p>Another thing that PATCH enables is resources that hide some of their state. A resource could only support PATCH and not PUT so that state modifications were only changes. If the state returned by GET is not the complete state, the resource could hide parts of the model. An example could be a voting method that records but does not reveal who has voted, only returning totals, which accepts votes as PATCH requests.</p>
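
<p>A rough sketch of the voting example, again in plain Python rather than real HTTP. The <code>Poll</code> class and the status codes it returns are my own illustrative choices, not part of any specification. The point is that there is no PUT at all, and GET exposes only part of the model.</p>

```python
class Poll:
    """Toy resource: accepts votes as amendments, reveals only totals."""

    def __init__(self, options):
        self._totals = {option: 0 for option in options}
        self._voters = set()  # recorded, but never part of any representation

    def patch(self, voter, option):
        # Each PATCH is one state change; return an HTTP-like status code.
        if option not in self._totals:
            return 400  # unknown option
        if voter in self._voters:
            return 409  # this voter has already voted
        self._voters.add(voter)
        self._totals[option] += 1
        return 204

    def get(self):
        # The view is deliberately partial: totals only, no voter identities.
        return dict(self._totals)


poll = Poll(["tea", "coffee"])
statuses = [
    poll.patch("alice", "tea"),
    poll.patch("bob", "coffee"),
    poll.patch("alice", "coffee"),  # a second vote from alice is refused
]
```

<p>Because the hidden state (who voted) never appears in a GET response, a client could not reconstruct the resource and PUT it back even if it wanted to; amendments are the only way in.</p>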

<h2>Server side scripting</h2>

<p>One PATCH format that makes a lot of sense is actually to use executable code, rather than say diff files. There is no reason why you should not send the server a PATCH request that is some Javascript to modify the DOM of an HTML resource, which can be executed serverside, or an XSL transform to modify an XML object. Sending code is an efficient way of making changes to a resource, and it can be executed in a sandbox like the browser sandbox. This will be another driving factor for Javascript on the server side, as it is well suited for embedding like this, and already has a DOM model for transforms.</p>

<p>All these changes take us further away from the filesystem model. Web resources will more and more combine some storage with some computation, including the ability to execute code in a controlled way. Smart resources will become more common, over dumb storage only resources.</p>

<h2>In other news</h2>

<p>Next in line for HTTP, hopefully, is the <a href="http://tools.ietf.org/html/draft-nottingham-http-link-header-06">Link header</a> which adds a new header for legacy document formats that do not include a native hyperlinking capability. This will allow relationships between these documents to be included in the retrieved resource, such as a link to metadata or other related resources. The HTTP replacement for the file system model is getting serious.</p>

<!--a href="http://geekandpoke.typepad.com/geekandpoke/2009/11/service-calling-made-easy-part-1.htm"><img src="http://geekandpoke.typepad.com/.a/6a00d8341d3df553ef012875f312f9970c-pi" width="400"/></a-->
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2009/12/smart-resources-or-why-you-should-care-about-http-patch/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Social {space&#124;media&#124;policy} @Starbucks</title>
		<link>http://blog.technologyofcontent.com/2009/12/social-space-media-policy-starbucks/</link>
		<comments>http://blog.technologyofcontent.com/2009/12/social-space-media-policy-starbucks/#comments</comments>
		<pubDate>Sun, 06 Dec 2009 13:52:50 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[flickr]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[starbucks]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=164</guid>
		<description><![CDATA[Companies still don&#8217;t get social media. Starbucks may have 5,151,861 fans on Facebook but they don&#8217;t get the way social media actually involves engagement and changing the way you work.

I happen to like pictures of people unposed, and I sometimes take them when I am in the right mood. The decisive moment of course has [...]]]></description>
			<content:encoded><![CDATA[<p>Companies still don&#8217;t get social media. Starbucks may have 5,151,861 <a href="http://www.facebook.com/Starbucks">fans on Facebook</a> but they don&#8217;t get the way social media actually involves engagement and changing the way you work.</p>

<p>I happen to like pictures of people unposed, and I sometimes take them when I am in the <a href="http://www.flickr.com/photos/justincormack/2632480082/">right mood</a>. The <a href="http://en.wikipedia.org/wiki/Henri_Cartier-Bresson">decisive moment</a> of course has an important role to play in the history of photography; when I was on holiday I happened to take a picture of a man sitting on the street outside a Starbucks, and when I <a href="http://www.flickr.com/photos/justincormack/">posted it to Flickr</a> I thought I would find a Starbucks group to post it to. And that&#8217;s where I <a href="http://www.flickr.com/groups/starbuckscoffeecompany/discuss/72157622351418443/">found this hilarious thread</a>, which is a warning to people about what happens if you jump into social media at the deep end.</p>

<p>At the end of September, when the <a href="http://www.flickr.com/groups/starbuckscoffeecompany/">official Starbucks group</a> was started, in the social media frenzy of 2009, this <a href="http://www.flickr.com/groups/starbuckscoffeecompany/discuss/72157622351418443/">thread was started</a>, pointing out that many people had been asked not to photograph in Starbucks stores, or had been thrown out for taking photographs.</p>

<p>The official responses from the official moderator <a href="http://www.flickr.com/photos/42346097@N02/">analisamarie</a> started off fairly optimistically</p>

<blockquote>
  <p>Our formal policy is that all press-related photo inquiries need to contact press@starbucks.com prior to taking pictures in a Starbucks store. However, we have no formal policy around customers taking non-press related pictures in-store so if you hear otherwise, it might just be because your barista is camera-shy <img src='http://blog.technologyofcontent.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
  
  <p>Hmmm- good discussion! Sounds like there is a bit of confusion out there &#8211; let me take this back to my team and see what we can do to help. Thanks for bringing this up&#8230;more to come!</p>
</blockquote>

<p>Then got bogged down in legal</p>

<blockquote>
  <p>I am making great headway here and hope to have some detailed information for you all shortly. To give you an idea of what I&#8217;m up to, I am researching if some of our international markets have policies around photography in stores. Since international laws and regulations vary country by country, this is quite the task <img src='http://blog.technologyofcontent.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  I&#8217;m also working to see where the confusion is stemming from in some US stores. Again, stay tuned. I&#8217;m working on it!</p>
  
  <p>I have been meeting with various teams in the building and learning a lot about the world of policies <img src='http://blog.technologyofcontent.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  I hope to have something more concrete to share with you soon &#8211; thanks for your patience while I work through the details.</p>
  
  <p>I am getting closer to a final ruling each day. I have a big meeting on Wednesday and after that, I will post here with an update.</p>
  
  <p>I did have a very productive meeting on Wednesday of last week. We read through each of your comments and now the legal team is reviewing some of your feedback around public and private property. More meetings this week&#8230;more to come!</p>
</blockquote>

<p>The social networker starts networking internally</p>

<blockquote>
  <p>Still here and haven&#8217;t forgotten about you. I&#8217;m writing a blog this weekend/next week about this discussion and hope to post by the end of the week. I&#8217;ll keep you in the know. Have a good weekend!</p>
  
  <p>Just wrote a blog response that my legal team is currently reviewing&#8230;once I have final approval I&#8217;ll post it and let you know. I know it&#8217;s taken a while and I know I&#8217;ve said it before but I appreciate your patience. This has been quite an interesting project to work on and has involved many meetings with all sorts of teams throughout the building. SO glad you guys brought this to our attention so that we could sort it out for you!</p>
</blockquote>

<p>Hints on something more negative</p>

<blockquote>
  <p>We want to do this in the best way possible. There are many perspectives to take into consideration as part of this discussion. That means considering our baristas&#8217; daily work and their privacy, our customers&#8217; experience in our stores as well as your photographic expression of that experience. We have a lot of things to consider when making decisions that affect what happens in our stores. It has to be the right thing for our partners (employees) and customers, and it has to work well for stores around the world. Please continue to be patient while we work on a solution. In the meantime, I do ask that you continue to be respectful of customers and partners in our stores. If a barista asks you not to take pictures, please respect their request. More to come &#8211; Anali</p>
</blockquote>

<p>And more strongly hinting that the answer is going to be no. As several of the people in the thread point out they might as well close the sponsorship agreement with Flickr if they are going to say no to photography.</p>

<blockquote>
  <p>I have to add that this group isn&#8217;t explicitly here for the purpose of taking pictures inside Starbucks stores. That is one part of the Starbucks Experience but pictures of your experience out-of-store are welcome in this group as well.</p>
</blockquote>

<p>This is currently her last post, a few days ago, so the story may still unfold. Possibly not as dramatically as when <a href="http://www.schneier.com/blog/archives/2009/02/man_arrested_by.html">Amtrak police arrested someone for taking pictures for their photo competition</a> but it has slightly broader issues than that.</p>

<p>First there is the timetable thing. Do not bother with social media if you can&#8217;t make decisions quickly as an organization. Period. When issues are raised by social media, you have to respond fast, because things have a habit of going viral. Two months is a joke. Fix the response times before you do anything, or when something blows up you will bodge the response.</p>

<p>Second, be a bit lateral. I mean, surely someone would have thought about this issue, maybe the &#8220;camera-shy baristas&#8221; themselves, if the social media plans had been discussed internally, but obviously this social media plan comes from head office.</p>

<p>Third, social media is not about head office marketing, it is about running an open, transparent business. Flickr is not Facebook, and does not have quite the same vibe (and far fewer photos actually). It is mainly a subscription platform, and many of the people there are generally quite articulate. The issue is not that they are going to complain about your coffee, which has carefully planned responses; they are complaining about the way you treat them as a community. Starbucks walked right into it, unprepared.</p>

<p>Engagement has to be seen as a two way street; if you are not prepared to change in the social engagement and you treat it like an advertising campaign you may come unstuck.</p>

<p>The stupid thing is of course that Starbucks is a social space. The coffee shop, it&#8217;s Central Perk in Friends, it&#8217;s the pub in the British community, it&#8217;s the office when travelling. Casual photography is entwined with the social space since the <a href="http://en.wikipedia.org/wiki/Brownie_%28camera%29">Kodak Brownie</a>; it has been reckoned that most of the data the human race has ever produced is in the form of photos (lacking a citation for that right now; would welcome one). Every mobile phone has a camera now. It is not actually hard to work out what the answer to the question should be; it even seems that that is already the policy, though no one can actually tell, as Starbucks policies appear to be secrets.</p>

<p>If your organization doesn&#8217;t grok social media, don&#8217;t copycat and try it anyway. Maybe try a dose of Enterprise 2.0 first, and I don&#8217;t mean writing blogs for lawyers to read. This online stuff is going to change the way things work; until you understand that, you will get it wrong.</p>

<p>Will be interesting to watch and see if Starbucks manage to sort this out.</p>

<p><em>Update</em> We finally have a complete copout policy &#8220;Here&#8217;s the answer that you&#8217;ve been waiting for &#8230;Photos are allowed in our stores for the purpose of sharing them in our Flickr group.&#8221;</p>

<p><a href="http://www.flickr.com/photos/justincormack/4158745476/" title="Man at Starbucks by Justin Cormack, on Flickr"><img src="http://farm3.static.flickr.com/2486/4158745476_786894d534.jpg" width="500" height="399" alt="Man at Starbucks" /></a></p>

<p>(Note picture taken outside Starbucks in a public space, without purchase of Starbucks beverage; however I cannot post it to the Starbucks pool as I don&#8217;t have a model release).</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2009/12/social-space-media-policy-starbucks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Search APIs with HTTP interfaces</title>
		<link>http://blog.technologyofcontent.com/2009/11/search-apis-with-http-interfaces/</link>
		<comments>http://blog.technologyofcontent.com/2009/11/search-apis-with-http-interfaces/#comments</comments>
		<pubDate>Sat, 28 Nov 2009 22:45:55 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[REST]]></category>
		<category><![CDATA[API]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=156</guid>
		<description><![CDATA[I had a brief exchange with Erik Wilde or @dret on twitter about REST and query languages; specifically SPARQL and whether the SPARQL DESCRIBE could be made RESTful with GET and PUT.

SPARQL DESCRIBE is a somewhat controversial feature of SPARQL that returns an RDF graph &#8220;around&#8221; the query, including all the referenced URIs and some [...]]]></description>
			<content:encoded><![CDATA[<p>I had a brief exchange with Erik Wilde or <a href="http://twitter.com/dret">@dret</a> on twitter about REST and query languages; specifically SPARQL and whether the SPARQL DESCRIBE could be made RESTful with GET and PUT.</p>

<p>SPARQL DESCRIBE is a <a href="http://www.w3.org/2001/sw/DataAccess/ftf4.html#item14">somewhat controversial</a> feature of SPARQL that returns an RDF graph &#8220;around&#8221; the query, including all the referenced URIs and some sort of domain specific context. It is less well defined than CONSTRUCT, which also extracts a graph.</p>

<p>But there are not many guidelines for how to actually write useful interfaces for search type functionality for REST web service APIs.</p>

<p>A lot of this applies to other collection type interfaces, collection in the Atom sense, which is essentially a predefined filter rather than one you can submit a query to. Resource collections are a fundamental feature of any real world system, being able to take a set of resources and manipulate it as one. Query languages are generally an efficient way of defining interesting collections.</p>

<p>I do not specifically refer to SPARQL here. Like many of the W3C recommendations, it is XML document oriented in its specification, not HTTP oriented, and so does not try to address the types of issues here. I have some issues with the W3C approach of sidestepping implementation issues; indeed the whole RDF as Prolog wrapped in XML does not seem very productive, and I think a more pragmatic approach from practice will work better, as with some of the other W3C initiatives. Items like DESCRIBE imply much more domain knowledge, and it is unclear what interoperability would mean.</p>

<h2>What&#8217;s the problem with query languages and REST?</h2>

<p>The main issue is that a SQL style SELECT should map to an HTTP GET, but instead we are passing a query with another verb in it. One method of avoiding this would be to work with predefined refinement &#8220;queries&#8221; where you could walk through paths of filters from a resource that represents everything. Imagine a faceted navigation system drilling down to construct what becomes a filter query through a selection of refine links (where you might end up with a hierarchical URL in the end, not a query document). This might work better in a system with a finite vocabulary, although it could still be navigated via forms potentially. However this starts to work less well with complex nested queries with sorting and limits, and subterms, as the number of options is large, and it would be complex to make a comprehensive set of link types to make them understandable (you might just end up tokenizing SQL in links, which does not seem elegant).</p>

<h2>REST search and GET queries</h2>

<p>First off, for standard read only GET queries, there is one model that is really useful and is too rarely used. For some reason most search queries in &#8220;REST&#8221; frameworks involve submitting a query as a POST request because the query string would be too long for a GET request. This however generally does not create a resource, it just returns a response, or a temporary response resource. CMIS for example works along these lines. However the sane model along these lines is to submit a query document with POST (or PUT) and have it persistent until DELETEd, and be able to requery as needed. Also, it should be possible to create query resources with unbound variables, which can be used by a system like prepared queries in SQL databases; in particular the creation of these should allow a system to work out perhaps if it needs additional indexes based on the queries that have been created. These can then be evaluated by using query strings to bind the parameters. For some reason you rarely if ever see a query specification for a REST API that has unbound variables in the query, thus missing out on the ability to use this pattern.</p>

<p>Note that the query itself is a resource, so we can for example DELETE the prepared query, eg http://example.com/query. There are also resources corresponding to the result sets, eg http://example.com/query?q=red.</p>
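
<p>Here is a small Python sketch of the prepared query pattern. Everything in it is an assumption for illustration: the query document format (a <code>where</code> mapping from field to variable name), the in-memory <code>ITEMS</code> data and the function names are all invented, and no real server is involved. It shows a stored query with an unbound variable being bound from the query string.</p>

```python
from urllib.parse import parse_qs, urlsplit

ITEMS = [
    {"name": "apple", "colour": "red"},
    {"name": "lime", "colour": "green"},
    {"name": "cherry", "colour": "red"},
]

queries = {}  # path -> stored query document


def put_query(path, document):
    # Store a query document; it persists until deleted.
    queries[path] = document


def delete_query(path):
    # Remove the prepared query itself, not the items it matches.
    queries.pop(path, None)


def get(url):
    # Evaluate a stored query, binding its variables from the query string.
    parts = urlsplit(url)
    document = queries.get(parts.path)
    if document is None:
        return 404, []
    params = {k: v[0] for k, v in parse_qs(parts.query).items()}
    try:
        wanted = {field: params[var] for field, var in document["where"].items()}
    except KeyError:
        return 400, []  # an unbound variable was left without a value
    rows = [item for item in ITEMS if all(item[f] == v for f, v in wanted.items())]
    return 200, rows


# "q" is an unbound variable: the colour is supplied at GET time.
put_query("/query", {"where": {"colour": "q"}})
status, rows = get("http://example.com/query?q=red")
```

<p>Because the server sees the query document when it is created, rather than on every request, it has a natural point at which to decide whether an index on <code>colour</code> would be worthwhile.</p>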

<p>As well as unbound variables for constants, another feature missing from most query languages (probably because they were not designed for HTTP transport) is the ability to specify a URL that corresponds to a subexpression. This can be considered as another extension of unbound variables; you can also see this as the extension of the language to support SQL style views that can themselves be queried. Like a view, you could either use the definition or the current materialized version for performing further queries.</p>

<h2>REST and update queries</h2>

<p>Although it rarely seems to be done, there is no particular reason that once you have a query resource like the ones defined above, you should not be able to DELETE http://example.com/query?q=red to delete all red objects, or PUT a new set of red objects to replace the existing ones. Most collection oriented protocols, such as Atom, do not allow PUT and DELETE methods on collections, only on the member resources. POST is allowed in Atom, to define collection membership; this only makes sense where collection membership cannot be defined by visible properties of a resource; the instruction &#8220;create this item with whatever properties are needed to put this item in this query result&#8221; is not necessarily something that works for any domain. Certainly one can think of models in which allowing PUT and DELETE on search results resources would be useful, although there might be issues with paging and large document sizes on some types of resource. PATCH would be a useful addition for doing the equivalent of SQL UPDATE, in order to reduce the document sizes, rather than having to PUT back everything.</p>
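
<p>As a sketch of what methods on a result set might mean, here is a toy Python version. The <code>items</code> store and the <code>patch_results</code> and <code>delete_results</code> functions are hypothetical names of my own; they stand in for PATCH and DELETE against something like /query?q=red and behave like SQL UPDATE and DELETE with a WHERE clause.</p>

```python
items = {
    1: {"name": "apple", "colour": "red", "price": 3},
    2: {"name": "lime", "colour": "green", "price": 2},
    3: {"name": "cherry", "colour": "red", "price": 5},
}


def matching_ids(colour):
    # The members of the result-set resource for the given colour.
    return [key for key, item in items.items() if item["colour"] == colour]


def patch_results(colour, changes):
    # Like SQL UPDATE ... WHERE: apply the same amendment to every member.
    ids = matching_ids(colour)
    for key in ids:
        items[key].update(changes)
    return len(ids)


def delete_results(colour):
    # Like SQL DELETE ... WHERE: remove every member of the result set.
    ids = matching_ids(colour)
    for key in ids:
        del items[key]
    return len(ids)


updated = patch_results("red", {"price": 4})
prices_after_patch = sorted(item["price"] for item in items.values())
deleted = delete_results("red")
```

<p>Note how small the PATCH body is compared with PUTting back every red object: only the changed property travels, however many members the collection has.</p>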

<p>There may be other domain issues, and in some systems consistency issues with mass deletion and insertion that mean that actions on individual items not collections are more efficient of course, but extending the uniform HTTP interface to collections should certainly not be ruled out for many domains.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2009/11/search-apis-with-http-interfaces/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>&#8220;Web content is, for the most part, crap&#8221;</title>
		<link>http://blog.technologyofcontent.com/2009/11/web-content-is-for-the-most-part-crap/</link>
		<comments>http://blog.technologyofcontent.com/2009/11/web-content-is-for-the-most-part-crap/#comments</comments>
		<pubDate>Wed, 25 Nov 2009 01:29:58 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=151</guid>
		<description><![CDATA[Book review: Content Strategy for the web, by Kristina Halvorson

I like books. I read a lot of them. The office is full of books, almost all of them mine. Apparently technical people don&#8217;t read books much. I seem to remember a figure (from Joel Spolsky? I am on a plane so I can&#8217;t check right [...]]]></description>
			<content:encoded><![CDATA[<p><em>Book review: Content Strategy for the web, by Kristina Halvorson</em></p>

<p>I like books. I read a lot of them. The office is full of books, almost all of them mine. Apparently technical people don&#8217;t read books much. I seem to remember a figure (from Joel Spolsky? I am on a plane so I can&#8217;t check right now) saying that the average programmer had never read any books on programming, and never would. What&#8217;s that about? If you don&#8217;t think about what you do in that sort of introspective way that makes you want to read books about other people thinking about it that way, then you are not really very engaged in what you are doing.</p>

<p>This blog was meant to have more book reviews than it does; in fact this is the first one. Expect more. It&#8217;s not that I am not reading the books, but there are other things I need to write about. Expect some more. Why this book? Well I had a plane trip, and it was on the shelf at <a href="http://www.foyles.co.uk/">Foyles</a> and it sounded interesting. I follow <a href="http://twitter.com/halvorson">@halvorson</a> on twitter, and she sounds interesting, though I have not heard her speak. It&#8217;s a short book, so I won&#8217;t make the review too long; after all you can just read the book next time you are on a plane.</p>

<p>I am not a content strategist, but I do like writing. And reading, as I mentioned. Now it is therefore possible that there is a big self selection issue here. It could well be that everyone who reads this book will agree with its central premise about the importance of being strategic and serious about writing on the web. It could be that actually only people who read books ever read anything on the web either. Maybe everyone else just looks at the pictures and laughs at the <a href="http://www.comparethemeerkat.com">funny meerkat</a>. Aleksandr the meerkat is the most successful advertising campaign this year, and has turned a failing web property into a great success with no words at all, and not much money either.</p>

<p>Fortunately I don&#8217;t have any figures on the importance of written content, and no internet access on the plane to find any. So I am going to say, I read content on the internet. Almost all of what I read is good content, I don&#8217;t read the other. If it is boring I skip it. If it is too short I skip it. I don&#8217;t watch videos, due to lack of time and headphones. Reading is faster. If you want me in your target audience, you need to write, and write well. Or catch me in the pub over a pint of bitter. (Actually I like good visuals too, underused on the web).</p>

<p>So what are the key things I took away from the book? Think like a publisher. Remember there used to be a whole business managing content; it was called publishing. They planned it and commissioned it and had whole branded collections of it called &#8220;magazines&#8221;, &#8220;books&#8221; and &#8220;newspapers&#8221;. People paid for these they were so good. There is a whole industry to steal ideas and methods from (and people for that matter).</p>

<p>If you don&#8217;t have a content strategy, you are just hoping it will all work out. You might get lucky, especially if you have a good writer, who will perhaps unconsciously create you a strategy, or at least stuff people want to read. But help yourself: think strategically about content like you do in other areas of the business. Plan, execute, measure, regroup.</p>

<p>&#8220;Page tables&#8221;: a content wireframe. Don&#8217;t like the term, but they are needed for a web project and I can&#8217;t think of a better name.</p>

<p>Content is to support the aims of the organization. It is not just marketing any more, it is branding, it is product design, it is sales, and it is the conversations you are having through social media that position your company in the marketplace. &#8220;Recognize content as a valuable business asset&#8221;: is it in your investment programme? Is it on your balance sheet?</p>

<p>Web content is long lasting, unlike print, which cannot be changed and often has a short shelf life. Kristina has a chilling example of a corporate YouTube channel that no one has logged into for a year. Many corporate blogs just die from lack of care and feeding. A blog is for life, not just for Christmas.</p>

<p>&#8220;Push &#8216;user experience design&#8217; off the pedestal&#8221;. UX without content is like a fish without a bicycle: looks pretty and maybe it&#8217;s tasty, but it is not getting you out to the next village. Or something. Actually some agencies do cover content quite well, but many do not, sticking with the pretty pictures and the ooh shiny widgets.</p>

<p>So there you are. Content, words, unsexy, black like coal, and almost as much effort to get out of the ground. An anthropomorphic pun might work for some people, but not you and me, those of us who reached the end of these words.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2009/11/web-content-is-for-the-most-part-crap/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
