<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Technology of Content &#187; justin</title>
	<atom:link href="http://blog.technologyofcontent.com/author/justin/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.technologyofcontent.com</link>
	<description>Ramblings on the technology of content management</description>
	<lastBuildDate>Sun, 25 Apr 2010 21:45:47 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>iPad review</title>
		<link>http://blog.technologyofcontent.com/2010/04/ipad-review/</link>
		<comments>http://blog.technologyofcontent.com/2010/04/ipad-review/#comments</comments>
		<pubDate>Sun, 25 Apr 2010 21:45:47 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ipad]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/2010/04/ipad-review/</guid>
		<description><![CDATA[We got an iPad at work the other week and are sharing it in rotation so here are my thoughts after the first week. Note that this review is written on the iPad as a full test!

Out of the box

The out of the box experience is pretty terrible you sit in the airline lounge and [...]]]></description>
			<content:encoded><![CDATA[<p>We got an iPad at work the other week and are sharing it in rotation so here are my thoughts after the first week. Note that this review is written on the iPad as a full test!</p>

<h2>Out of the box</h2>

<p>The out of the box experience is pretty terrible: you sit in the airline lounge, open it up and turn it on, and then it just insistently asks for the mother&#8217;s nipple, its iTunes parent, for imprinting. No playing until that&#8217;s done.</p>

<p>It gets better after that though. As you would expect, it is nice to look at and to use, like most Apple products. Pictures in particular look lovely, and it is fast, smooth and responsive. Web pages vary in how quick they are, but somehow slow pages seem more annoying.</p>

<p>I haven&#8217;t installed many apps for various reasons: the UK app store does not have all the apps in, for example none of the Apple ones. The pad won&#8217;t even download apps over the air, saying &#8220;the app store is not supported in your country&#8221;, but it will sync them. However, my work Mac that it was mothered to died, leaving it orphaned at a young age.</p>

<h2>Comparison to a netbook</h2>

<p>I mostly write my blog posts on my Eee PC running Ubuntu netbook remix. The iPad seems noticeably faster. Part of this is clearly the screen rendering which on the Eee is a terrible integrated Intel graphics chipset. Mind you it was much cheaper too. The bigger screen size and higher resolution of the iPad make it nicer to use too. But there is a limit to what you can do as there are very few applications. Personally I like to have a command line and a programming language locally rather than on the web, although I guess one could rig something up in browser with local storage; there seem to be <a href="http://robrohan.com/projects/9ne/">a few editors around now</a>.
Will think about that model although it needs a lot of infrastructure to work.</p>

<h2>Web</h2>

<p>The web works well. In portrait view, which I have mostly been using, you can see a long way down a page and generally read the text too. That&#8217;s quite a nice page view. The big issue though is browser detection, which is a terrible thing. People are detecting browsers, not capabilities. For example the BBC iPlayer thinks the browser wants Flash even though they have a QuickTime iPhone version (the videos may be iPhone resolution only I suppose, but that would be better than trying to show Flash). Other sites show the mobile version, which should mainly be about screen resolution detection, not the browser identifier. Some sites just don&#8217;t work because they expect hover states. Other than that you just want to manipulate things rather than press buttons, as a mouse interface just feels unnatural. It is going to take a while before many web sites have gestural interfaces.</p>

<h2>Writing</h2>

<p>Google Docs turned out to be a bit confused and I was unable to create a new doc on the website for this review; I think it was only showing the mobile version even when I selected desktop. I couldn&#8217;t bear to use Notes with its hideous use of the Marker Felt font. Fortunately I had installed the Evernote app earlier so that&#8217;s what I am using. Typing is not great: you are pressing on glass, and you have to hold it up with the other hand as typing on the lap doesn&#8217;t feel right. Doable but not ideal. Also I can&#8217;t work out how to turn the clickiness off other than just turning the volume down.</p>

<h2>Gestures</h2>

<p>After a short time gestural interfaces become very natural. There are a few issues with standards and ways of doing things that are not yet well defined but the basic movements are simple and other operations are easily learned. The big screen makes things much easier than the iPhone and multi finger operations make sense like bunching and spreading out photo albums in the picture viewer. Our hands are good at learning these sorts of operations and being precise about them. Touch is far more natural than say speaking to a computer. It is interesting that Apple and others have gradually been introducing elements of touch such as two finger scrolling on touch pads that I find it hard to manage without. Who wants to move a mouse to a picture of an elevator  when you can just stroke the screen?</p>

<h2>Walking around</h2>

<p>The iPad feels the right sort of thing to use in meetings for looking at reference material (the Basecamp overview page works well for example), looking at diagrams, the web, taking notes and so on. It is almost as easy to walk around with as a pad of paper and generally as useful, although I don&#8217;t find diagram drawing very intuitive yet. A camera would be useful for capturing whiteboard pictures and so on; having a device without a way to get rough pictures easily is a bit annoying. Oddly though I don&#8217;t feel the same way about the netbook, which has a front facing camera for video calls that I don&#8217;t use. If the phone could talk to the iPad easily that would help but it doesn&#8217;t. It is one of the Apple annoyances that they want to sell iPad 3G contracts rather than make the iPhone and iPad work together as a unit.</p>

<h2>Future of portable devices?</h2>

<p>I think the gestural touch interface is going to win over the mouse mediated interface. The keyboard will last, but maybe as an accessory like with the iPad rather than joined. However vertical screens don&#8217;t work with touch as your arms get tired. Pad is perhaps the right model, reflecting how we use paper most of the time. There are issues about how to hold and use it that will need to be ironed out, and there are issues with Apple&#8217;s idea that it should be a simplified computer, as they have perhaps gone too far. Indeed I would be very happy if it had a gestural version of Ubuntu on it like my netbook; I think that might be perfect.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/04/ipad-review/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Trends in content management 2010</title>
		<link>http://blog.technologyofcontent.com/2010/03/trends-in-content-management-2010/</link>
		<comments>http://blog.technologyofcontent.com/2010/03/trends-in-content-management-2010/#comments</comments>
		<pubDate>Mon, 08 Mar 2010 19:05:08 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[CMS]]></category>
		<category><![CDATA[trends]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=226</guid>
		<description><![CDATA[This is an overview of the medium term trends in content management, from a mostly technology point of view.

Standards


repository
feeds
CMIS
JCR
terminology, ways of thinking, industry model


Standardization has really started to affect the content management industry. The industry was very immature, a bit of a landgrab, and not very customer focussed. This has changed rapidly, with the [...]]]></description>
			<content:encoded><![CDATA[<p>This is an overview of the medium term trends in content management, from a mostly technology point of view.</p>

<h2>Standards</h2>

<ul>
<li>repository</li>
<li>feeds</li>
<li>CMIS</li>
<li>JCR</li>
<li>terminology, ways of thinking, industry model</li>
</ul>

<p>Standardization has really started to affect the content management industry. The industry was very immature, a bit of a landgrab, and not very customer focussed. This has changed rapidly, with the wide adoption of the JCR standards, but particularly with the process around CMIS. What is being set now is the model of the industry for the next five years: what the customers expect and what the products will deliver. Setting the agenda matters, and now is the opportunity to participate.</p>

<h2>CMS as a platform</h2>

<ul>
<li>build applications on a content platform</li>
<li>API driven development</li>
<li>SOA</li>
<li>embed code everywhere in domain level scripting languages</li>
</ul>

<p>A content management system is at last becoming less of a product that lets you do some stuff and more of a platform for working with content and building content centered applications in a service oriented world. A pervasive invasion of scripting languages such as JavaScript into this is coming. The web programming model of pervasive agile scripting and rich REST APIs is going to be the norm, not large scale Java programming or application specific templating languages.</p>

<h2>Co-opetition and community</h2>

<ul>
<li>collaboration on standards, infrastructure</li>
<li>open source as community</li>
<li>twitter, blogs, enterprise 2.0</li>
<li>end of NIH</li>
<li>customers are community too</li>
</ul>

<p>In the last year especially the landscape of content management as a community has changed. First through the standards processes, particularly CMIS and JCR, and then through social media, particularly twitter, as well as via events and blogs, there is now a growing cross vendor technical content management community, particularly with the open source players, and joint projects, for example with CMIS. This is in addition to the developer communities that are strongest around the open source products, although the .net products are trying hard to build around the Microsoft developer relations model. And of course the community of customers, who are becoming more vocal.</p>

<h2>Rich content</h2>

<ul>
<li>richer xhtml and xml</li>
<li>enhanced metadata; richer metadata in other formats</li>
<li>constraints not just validation</li>
<li>RDF and semantic web, linked data</li>
<li>relations and IA expressed in metadata</li>
<li>enhancement via deeply integrated search</li>
<li>document management, DAM and WCM converge</li>
<li>richer presentation layers, richer APIs</li>
<li>Flash is dead, plugins are dead, HTML5 is winning faster than anyone thought</li>
</ul>

<p>As we have moved from document management, where the focus was on whole documents, to web content management, which is more component and assembly based, there has been a gradual push to do more with the documents. Standardized rich document semantics are after all one of the main advantages of web documents. It is taking a while but making use of the potential here is beginning to happen, now we have <a href="http://googlewebmastercentral.blogspot.com/2009/05/introducing-rich-snippets.html">Google indexing rich snippets</a> and even <a href="http://rdfa.info/2010/01/20/uk-retail-chain-tesco-adopts-rdfa/">Tesco using RDFa</a>. There is a lot more standardization work to do here.</p>

<p>In the front end the aim of this backend information enhancement is to build richer interfaces more easily, and to enhance findability, search and navigation, as well as to enable repurposing, richer APIs, and linked data. Authoring is the biggest challenge, as the majority of users need to be given interfaces that are independent of the IA, simple to use, but support generation and modification of complex data structures.</p>

<h2>SAAS and the service business</h2>

<ul>
<li>cloud</li>
<li>internal delivery in a SAAS way</li>
<li>devops</li>
<li>APIs and standardization forced by SAAS</li>
<li>changes to customer service model</li>
</ul>

<p>Software as a service models are winning because no one wants to buy software as a product any more. I will cover more of this in another article I have been working on for a bit, but the main point is that enterprise software as a paid big ticket product is dead. The replacements are open source software and SAAS. These are not alternatives though, as people want the open source software delivered as a service, albeit maybe a more commoditized one if there are multiple providers, and many of the SAAS products delivered will be largely built of open source components by companies that run a mixed model. Microsoft is <a href="http://www.theregister.co.uk/2010/03/04/ballmer_on_azure/">going headlong into cloud</a> in a way that redefines what the operating system is. Even purchased software will be delivered in internal clouds.</p>

<p>This changes both how code is written and how it is administered, with <a href="http://lethargy.org/~jesus/writes/a-job,-a-mission,-a-career-all-without-a-path-or-a-name.">web operations</a> joining up into rolling delivery and creating the emerging field of <a href="http://www.devopsdays.org/">devops</a>. Developers need to understand operations and how to build code for this environment.</p>

<p>The service business as a business is different from the product business. Open source companies have got that better than product based vendors, but the less there is lockin the more key these changes become. The <a href="http://www.interwest.com/software-as-a-service/on-demand/vp-of-customer-success-critical-to-the-saas-business-model/">success of the customer using the services becomes the key business driver</a>.</p>

<h2>Performance and scaling, real time</h2>

<ul>
<li>cloud has pushed scale up out of picture</li>
<li>scale out transparently</li>
<li>new technologies beyond RDBMS that fit CMS </li>
<li>dynamic generation becoming the norm; Google pushing the performance thing; the industry norm of 100ms will fall</li>
<li>real time becomes more important &#8211; dynamic updates, forget crawling, Google is going push</li>
<li>backend: queuing (0MQ, AMQP)</li>
<li>frontend: websockets, XMPP, long polling </li>
</ul>

<p>Just buying big hardware for scale up is really becoming difficult; the web vibe has always been to scale horizontally on commodity hardware. There is a lot of development around scale out <a href="http://blog.technologyofcontent.com/2010/02/nosql-and-content-management/">technologies such as NoSQL</a> which fit into the WCM data models, which are those of the web after all.</p>

<p>As well as scaling for volume, latency and real time are becoming key. Google&#8217;s time to crawl has been falling rapidly, to a few or less, but is <a href="http://www.readwriteweb.com/archives/google_developing_real_time_index.php">moving to real time</a> with push updates. Twitter has really pushed the boundaries of expectation for real time. Behind the scenes there are a lot of technologies for efficiently pushing around notifications and events both at the backend and on the frontend. Real time is going to become increasingly pervasive.</p>

<p>Page generation times will need to fall; the standard industry benchmark of 100ms per component will probably need to be halved; overall total times under 1s will become the norm.</p>

<h2>Security</h2>

<ul>
<li>web increasingly hostile</li>
<li>every bug is a potential security issue</li>
<li>security focussed on fewer areas, push into the OS not out to applications</li>
</ul>

<p>I read the excellent <a href="http://lwn.net">Linux Weekly News</a> every week, and every week there are <a href="http://cwe.mitre.org/top25/">security exploits</a> for many pieces of software; one that really struck me recently was the <a href="http://www.h-online.com/security/news/item/Possible-backdoor-in-the-e107-CMS-913588.html">major exploit against the CMS e107</a>. What happened here was that a group of crackers found a serious security flaw in the CMS, which they began attacking systematically. When the patch was released however, they already had control of the developer&#8217;s website via the flaw, so they replaced the patched version of the code with a version with a backdoor. Hacked websites are a vital part of the underground <a href="http://www.securitytube.net/Phishing-%28Evil-on-the-Internet%29-FOSDEM-Talk-video.aspx">online crime scene</a>, and a content management system is a high value target. Expect much more of this, and be prepared.</p>

<p>The way forward is narrowing the security into fewer points of vulnerability, sandboxing, and using every available facet of the operating system&#8217;s security layers: make the most of processes, permissions, everything that you get there. I <a href="http://blog.technologyofcontent.com/2010/01/scaling-security-and-architecture-in-2010/">wrote more about this in an earlier post on emerging trends</a>. File format parsing is another common area of vulnerability.</p>

<p>It is war out there on the internet, and many people underestimate or ignore the issues, and too many programmers do not code defensively by habit.</p>

<h2>Summary</h2>

<p>It is an exciting time in web content management right now; the industry is growing up beyond its beginnings as a way of getting web sites up, towards being the core of the broader content management industry. The choices made now will shape the industry; the next generation of products will be a big step forward for the industry.</p>

<p><a href="http://dilbert.com/strips/comic/2009-07-26/" title="Dilbert.com"><img src="http://dilbert.com/dyn/str_strip/000000000/00000000/0000000/000000/60000/1000/700/61747/61747.strip.sunday.gif" border="0" alt="Dilbert.com" width="450"/></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/03/trends-in-content-management-2010/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>NoSQL and content management</title>
		<link>http://blog.technologyofcontent.com/2010/02/nosql-and-content-management/</link>
		<comments>http://blog.technologyofcontent.com/2010/02/nosql-and-content-management/#comments</comments>
		<pubDate>Sun, 14 Feb 2010 23:34:15 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[CMS]]></category>
		<category><![CDATA[data modelling]]></category>
		<category><![CDATA[nosql]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=216</guid>
		<description><![CDATA[I went to many of the first ever NoSQL devroom talks at FOSDEM this year. For anyone who hasn&#8217;t been, FOSDEM is a great place, and the NoSQL room was well organized and full of interest. The term NoSQL is not even a year old; I first came across CouchDB around a year ago from [...]]]></description>
			<content:encoded><![CDATA[<p>I went to many of the first ever <a href="http://nosql.mypopescu.com/post/385372130/your-chance-to-review-the-fosdem-nosql-event">NoSQL devroom</a> talks at <a href="http://fosdem.org">FOSDEM</a> this year. For anyone who hasn&#8217;t been, FOSDEM is a great place, and the NoSQL room was well organized and full of interest. The term NoSQL is not even a year old; I first came across CouchDB around a year ago from memory; Tim Anglade gave an excellent introduction where he reminded people of the historical roots, both before relational databases and since then; so not new but there is a renewed focus now. Why is that? I am going to look here at the field of content management and why you might be interested in different data models if that is your problem space, based loosely on some of the ideas from the talks at FOSDEM. There was a talk about <a href="http://outerthought.org/blog/blog/353-OTC.html">content management specifically and the Lily CMS by Evert Arckens</a> although I missed it, but I have added some comments after watching the video.</p>

<p><a href="http://www.flickr.com/photos/justincormack/4375594326/" title="FOSDEM by Justin Cormack, on Flickr"><img src="http://farm5.static.flickr.com/4029/4375594326_7ebdafd796.jpg" width="450"  alt="FOSDEM" /></a></p>

<h2>The data model for content management</h2>

<p>I have another draft post on this subject in more detail, which I am working on as part of my REST modelling in content management work, but I will outline some of the types of data relations that are important. I will be quite abstract here; if you want more concrete examples you will have to wait for the other post. Database models like the ones we are talking about here are more easily understood in the abstract, I think.</p>

<p>First we need our unit of modeling. This in itself is the first issue. Content management tends to deal with, at the conceptual level, something that looks like a document. It may be a fragment, in the sense that it is say a page component (asset if you use that terminology) rather than a whole item, but the unit for the user to edit, and which is usually versioned, is a structured object itself. The processing model tends to treat it almost as a binary blob, except that certain properties can be extracted, such as metadata, links in HTML and so forth, but it is stored as an item rather than decomposed further.</p>

<p>OK, so we have a piece of content and some attributes extracted from it as one basic model. This corresponds pretty much to the JCR data model for example. There are variations; sometimes people do not store metadata in the file formats, as historically many file formats had poor support for arbitrary structured metadata, although that is largely obsolete now, and the advantages of actually storing metadata and relations substantially within documents are high. External storage does not change the model much, just complicates processing and storage. Another variant, often seen in document management systems, is to be able to have multiple &#8216;streams&#8217;, i.e. several document variants rolled into one, for example a video and a still from it. You can however from the modelling point of view regard these as another compound document format, kept together because conceptually they are a bundle of content; you might distribute them as a zip file if you haven&#8217;t got any other suitable container format.</p>

<p>So now we have a storage model where we have a blob, with rich media operations on it, and extracted structural and metadata information. There is also versioning to consider, but let us ignore that and treat it either as part of the blob, or as a new document with some relation to the old ones, those being the two core versioning models, this does not really affect anything else.</p>

<p>There are two kinds of metadata, although they are more similar than they appear, properties and relations. Properties are the standard attributes (this picture depicts sheep), while relations join two items in the repository (this is a cropped version of this other picture). Although this distinction seems clear, in the end richer information architectures demand that everything becomes a relation, so I can browse a sheep node and find all the sheep items, turning every attribute value of any significance into a node with relations instead. Pure attribute values are only left for the less interesting properties (this PDF file is 176k in size).</p>

<p>They are also less interesting from a relational versus non relational storage point of view, although there is one important point, which is the dense versus sparse question, so let us take a look at this. Most real world attributes are sparse, that is most attributes are not set on most items. In the relational model we have a row for our item, and columns for all the attributes, so we are saying most are NULL. (I was brought up on matrix algorithms and still think in terms of sparse versus dense matrices as this is exactly the same problem, and matrices represent graphs anyway). Storing huge mainly null tables is not very efficient, so there are two common practices in relational mapping of attributes in content management systems. First is to define a type based system, where a particular type of content item is defined to have certain attributes, and each type therefore can have its own table which is assumed to have fewer NULL values. Mixins, sets of properties that live across types, can potentially be added to this model, as can inheritance schemes, but the basic idea is one table per type. This gives a nice simple direct database programming model, and causes a complete nightmare if you ever want to change the schema, for example add an attribute, as for any large database most DBMSs will effectively shut down the system while a schema change takes place, as schema changes require pretty much all locks. <a href="http://www.silverstripe.com">Silverstripe</a> is one example of a content management system built like this; there are many others.</p>
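
<p>To make the sparse versus dense point concrete, here is a rough sketch that counts the NULL cells you would store in one wide table against one table per type. The items and attribute names are made up for illustration, and real schemas will of course be messier than this.</p>

```python
# Hypothetical items: each dict holds only the attributes actually set.
items = [
    {"type": "image", "width": 800, "height": 600},
    {"type": "image", "width": 1024, "height": 768},
    {"type": "pdf", "pages": 12},
    {"type": "video", "duration_s": 95, "codec": "h264"},
]

def null_cells(rows):
    """Count empty cells if rows share one table with a column per attribute."""
    columns = set()
    for row in rows:
        columns.update(key for key in row if key != "type")
    return sum(1 for row in rows for col in columns if col not in row)

def null_cells_per_type(rows):
    """Count empty cells when each content type gets its own table."""
    by_type = {}
    for row in rows:
        by_type.setdefault(row["type"], []).append(row)
    return sum(null_cells(group) for group in by_type.values())

print(null_cells(items), null_cells_per_type(items))
```

<p>With these four items the single wide table carries 13 empty cells and the per type tables carry none, which is the whole attraction of the type based mapping, at least until the schema has to change.</p>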

<p>The alternative is the <a href="http://en.wikipedia.org/wiki/Entity-attribute-value_model">entity attribute value</a> (EAV) model (terrible Wikipedia article, please fix), where rather than a direct mapping of the attributes to relations, you indirectly map, creating a table that joins entities, attributes and values; this table of course looks just like RDF triples. Doing this though loses everything that makes a relational database useful: constraints, typing, query optimization. It adds an extra layer of logical schema above the physical schema which the database layer does not understand. This is a pretty common relational mapping for content management systems, as it allows full flexibility in defining and redefining attributes. To implement well it needs a large mid layer to manage the constraints, provide an API layer, generate efficient queries, effectively to manage the logical layer to physical layer map. The <a href="http://drupal.org/node/82661">Drupal CCK</a> is an example of this model.</p>
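
<p>As a minimal sketch of what an EAV table looks like in practice, here is one in SQLite via Python. The table name, column names and sample rows are my own assumptions for illustration; this is not how Drupal or any particular CMS lays out its schema.</p>

```python
import sqlite3

# Hypothetical schema: one narrow table holding (entity, attribute, value) rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        ("photo1", "depicts", "sheep"),
        ("photo1", "format", "jpeg"),
        ("report1", "format", "pdf"),
        ("report1", "size_kb", "176"),
    ],
)

def load_item(entity):
    """Rebuild one content item as a dict from its attribute rows."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity,)
    )
    return dict(rows.fetchall())

def find_items(attribute, value):
    """Return the entities that carry a given attribute value."""
    rows = conn.execute(
        "SELECT entity FROM eav WHERE attribute = ? AND value = ? ORDER BY entity",
        (attribute, value),
    )
    return [row[0] for row in rows]

print(load_item("report1"))
print(find_items("format", "jpeg"))
```

<p>Note that every value comes back as text, including the file size: the database no longer knows the type of anything, so typing and constraints have to live in the mid layer described above.</p>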

<p>Of course this is not to say that either of the two relational models does not work. The direct mapping works well with simple, unchanging content types in small websites, for example, or in models where attributes are not very sparse, or the sparseness is worth the overhead, and changing the schema is rare. EAV works well too, if managed carefully; it helps if the types of query required on the model are not too complex.</p>

<p>Once you add relations as well as attributes, the already difficult mapping layer gets harder; you add another set of operations (recursion to handle tree structures) that the relational model does not handle well, so you may need to add more into the mapping layer. The promise of NoSQL is that you can bypass this for these types of applications, and program directly to a database model that handles sparse attributes and relations natively. But how much do the NoSQL databases get you? You can argue that if you are already looking at EAV, then you are already not getting much from a relational database, and you are building a modeling layer on top of it, so dropping that and going for something that maps the logical data layer directly does make sense from a development point of view. Whether that really helps performance is less clear; much of the original work for NoSQL has come out of huge scaling, big problems, not actually providing efficient solutions to the types of data mapping problem we are seeing here on a medium scale; of course for huge sites there may be benefits.</p>

<p>The types of NoSQL database vary in their level of support for attributes and relations as they are used in content management. Document oriented databases do not give you much more than retrieval of content items; associative ones give key value type attribute lookups; graph databases should let you query relations directly, expressing the types of queries that are needed for information architecture problems directly, in principle. Examples I am thinking of are things like tag clouds, which are simple to express as a graph problem, as a tag cloud is simply a count of the number of edges from a set of nodes. Indeed most information architecture problems look like graph problems, and also like <a href="http://en.wikipedia.org/wiki/OLAP_cube">OLAP processing operations</a> which also do not work well on relational databases. And of course one of the things that NoSQL has shared with OLAP is the use of denormalization; you can use simpler models if you denormalize data to match the queries you will be using, rather than assuming that the types of query you will use can necessarily be optimized and made efficient by a general purpose system.</p>
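
<p>The tag cloud case can be sketched in a few lines, assuming the graph is just a list of (item, tag) edges. The data and the function name here are hypothetical, and a real graph database would do the counting for you.</p>

```python
from collections import Counter

# Hypothetical "tagged" relations: (content item, tag node) edges.
edges = [
    ("post1", "cms"),
    ("post1", "nosql"),
    ("post2", "cms"),
    ("post3", "cms"),
    ("post3", "sql"),
]

def tag_cloud(edge_list):
    """Weight each tag node by the number of edges that point at it."""
    return Counter(tag for _item, tag in edge_list)

cloud = tag_cloud(edges)
print(cloud.most_common())
```

<p>The weight of each tag is nothing more than the edge count on its node, which is why this kind of query feels natural on a graph model and awkward once it has to be expressed through an EAV mapping layer.</p>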

<p>Denormalization is not without its difficulties, although arguably it could become a tool embedded in databases like indexes are now. One of the issues with NoSQL is most of the database systems leave denormalization to the user: you need to use it because joins are not available, but you have to manage that yourself. Building an infrastructure to explicitly manage denormalization as a first class database item akin to an index might be interesting. So that gives us a first issue, as in any NoSQL system except a graph database we will either need to denormalize or compose queries to get the results we want.</p>
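
<p>Here is a rough sketch of what managing denormalization yourself means, with plain dicts standing in for a key value store. All the names are made up for illustration; the point is only that every write has to update the denormalized copy by hand.</p>

```python
# Hypothetical key-value store: plain dicts stand in for the database.
items = {}          # item id -> set of tags (the source of truth)
items_by_tag = {}   # tag -> set of item ids (denormalized copy for lookups)

def set_tags(item_id, tags):
    """Write an item's tags and keep the denormalized index in step."""
    for old in items.get(item_id, set()):
        items_by_tag[old].discard(item_id)
    items[item_id] = set(tags)
    for tag in tags:
        items_by_tag.setdefault(tag, set()).add(item_id)

set_tags("post1", ["cms", "nosql"])
set_tags("post2", ["cms"])
set_tags("post1", ["cms"])  # retagging must also clean up the old index entry

print(sorted(items_by_tag["cms"]))
```

<p>Forget the cleanup loop and the index silently goes stale, which is exactly the kind of bookkeeping a database managed denormalization, akin to an index, would take off your hands.</p>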

<p>So I think there are four realistic models for content management backends going forward:</p>

<ol>
<li>The direct relational model for small systems with simple data models, rare attribute changes, little or no use of relations.</li>
<li>EAV models wrapped in a content modeling layer; JCR is an example of this, hiding the underlying SQL layer very well, and indeed allowing it to be replaced with another underlying storage model potentially; I am sure someone is testing a Neo4J backend somewhere. This is where most production solutions are at now.</li>
<li>Direct, nondenormalized graph database backends, with the raw content stored in a document store. Cuts out a special purpose middle level by mapping the domain more directly. As <a href="http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html">Emil Neo</a> says, it may not scale right up as far as the other NoSQL technologies, but it cuts complexity of implementation; there are also issues about whether all the kinds of queries required are available efficiently. I think this will be the sweet spot in a few years once the products mature and we see more open source activity in the field. Of course RDF based solutions, for example using SPARQL, fall into this category too, and the maturity of products around these technologies will help drive this category as well as the NoSQL models.</li>
<li>Big, denormalized systems, probably with software support for managing the denormalization, and using underlying simple but scalable technologies like key-value stores. These already exist in large scale web applications, but may remain niche if the development effort remains high. If frameworks for modelling more easily on these turn up they may trickle down for performance reasons even on smaller datasets; a key value store runs fine on a relational database backend, although the types of processing required probably means a specialized backend is useful.</li>
</ol>
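
<p>As a small illustration of the remark in the fourth option that a key value store runs fine on a relational backend, here is a sketch using Python's built in <code>sqlite3</code> module. The table name <code>kv</code>, the helper functions and the sample key are assumptions made up for this example.</p>

```python
import sqlite3

# An in-memory SQLite database stands in for the relational backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")


def put(key, value):
    """Insert the value, or overwrite it if the key already exists."""
    conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))


def get(key, default=None):
    """Return the stored value, or the default when the key is absent."""
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else default


put("page:/about", "About us")
put("page:/about", "About us (revised)")
print(get("page:/about"))
```

<p>It works, but it uses almost none of what the relational engine offers, which is why a specialized backend probably makes sense at scale.</p>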

<p>Note that the <a href="http://lilycms.org/">Lily CMS</a> which there was a talk about fits very much into the fourth option above; this is where the NoSQL technologies have perhaps seen most use, but I think there will be a lot of work in order to build a CMS like this now, in particular in terms of tools to support denormalization strategies that are needed. The outlined approach sounded much like the outlines I have been thinking about for this type of model, although I would focus more on tooling for denormalized queries and less on scaling other parts like full text search right now. It will be interesting to follow the progress of this project.</p>

<p>We are at an interesting juncture, where it looks like there are some options that will let us do domain modelling in a way that corresponds more directly to the domain, but there are a lot of interesting challenges on the way.</p>

<p><a href="http://dilbert.com/strips/comic/2008-02-12/" title="Dilbert.com"><img src="http://dilbert.com/dyn/str_strip/000000000/00000000/0000000/000000/00000/1000/800/1869/1869.strip.gif" border="0" alt="Dilbert.com" width="440"/></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/02/nosql-and-content-management/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>JSON vs XML</title>
		<link>http://blog.technologyofcontent.com/2010/01/json-vs-xml/</link>
		<comments>http://blog.technologyofcontent.com/2010/01/json-vs-xml/#comments</comments>
		<pubDate>Wed, 27 Jan 2010 21:16:09 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=206</guid>
		<description><![CDATA[A lot of web developers you meet hate XML with a passion, and JSON has taken its place as the format of choice for a lot of API work. There are some advantages to JSON, but some disadvantages, and XML does have some problems, but the arguments are not as simple as generally made out.

I [...]]]></description>
			<content:encoded><![CDATA[<p>A lot of web developers you meet hate XML with a passion, and JSON has taken its place as the format of choice for a lot of API work. There are some advantages to JSON, but some disadvantages, and XML does have some problems, but the arguments are not as simple as generally made out.</p>

<p>I have been looking at the issue of writing filters for formats that basically change them as little as possible. This is a slightly difficult field in many ways. You have to store a fair amount of extra information in order to do this, but if you are say changing some metadata items for a user they may not want their CDATA removed and replaced with a semantically identical but syntactically different form. So I am looking at the formats partly from this point of view. Sometimes this brings up the conflicts between the human readable and computer manipulation aspects of these formats, the logical and physical structures. I have also been looking at making simple tools to allow modification of document formats, which has also raised some issues.</p>

<h2>Pro JSON</h2>

<p>JSON is simple and now well defined. It used not to be clear in a few places, such as how to use the \u notation to encode characters outside the Basic Multilingual Plane, which need a surrogate pair in UTF-16. It seems many people do manage to generate invalid JSON (no quotes around identifiers, use of single quotes, use of a BOM at the start), which is a problem. I mean if people cannot even get possibly the simplest format ever invented right, what hope is there for civilization? This should get better as standard libraries that do the right thing come out, one would hope! Introducing lax JSON parsers (in the style of HTML 5) seems to be unnecessary for a simple format that is normally generated by a computer. A strict JSON parser is not a lot of code.</p>
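
<p>For what it is worth, here is how the three common mistakes above fare against one strict parser, Python's standard <code>json</code> module. The sample documents and the <code>is_valid_json</code> helper are invented for illustration; the last line shows a \u surrogate pair decoding to a single character outside the Basic Multilingual Plane.</p>

```python
import json

bad_documents = [
    '{name: 1}',            # no quotes around the identifier
    "{'name': 1}",          # single quotes
    '\ufeff{"name": 1}',    # BOM at the start
]


def is_valid_json(text):
    """Return True if the standard library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


results = [is_valid_json(doc) for doc in bad_documents]
print(results)

# A surrogate pair written with \u escapes becomes one code point.
astral = json.loads('"\\ud83d\\ude00"')
print(len(astral))
```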

<p>JSON has a simple way of showing which of the allowable encodings it is in, based on the zero bytes at the start, as the first two characters must be ASCII. Allowing a BOM and then allowing unicode whitespace might be more standard, but the whitespace has no function except for use in text editors.</p>
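
<p>The zero byte trick can be sketched in a few lines of Python. This follows the pattern table in RFC 4627 section 3 as I read it; the function name <code>detect_json_encoding</code> is my own, and it assumes there is no BOM and that the document starts with two ASCII characters.</p>

```python
def detect_json_encoding(data):
    """Guess the encoding of JSON bytes from where the zero bytes fall."""
    head = data[:4]
    if len(head) >= 4:
        zeros = tuple(byte == 0 for byte in head)
        if zeros == (True, True, True, False):   # 00 00 00 xx
            return "utf-32-be"
        if zeros == (False, True, True, True):   # xx 00 00 00
            return "utf-32-le"
    if len(head) >= 2:
        if head[0] == 0 and head[1] != 0:        # 00 xx ...
            return "utf-16-be"
        if head[0] != 0 and head[1] == 0:        # xx 00 ...
            return "utf-16-le"
    return "utf-8"                               # xx xx ...


sample = '{"a": 1}'
for encoding in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    print(encoding, detect_json_encoding(sample.encode(encoding)))
```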

<p>Despite the attempts by, for example, E4X to add a simple XML native format and processing model to Javascript, JSON remains much easier in most languages to process, as it is built around structures that most languages just have natively, while XML is not. Some languages have issues with a mismatch to JSON, but most are fine. E4X on the other hand has security issues client side, and is not seeing adoption except in some server side applications.</p>

<h2>Against JSON</h2>

<p>JSON does not have a native hyperlink type. This is unacceptable in a web format in my opinion. For example a REST interface (a real one, not the clean URLs many people who use JSON think are REST, one with HATEOAS) requires native links and link types. A native link format such as {&#8220;next&#8221;: <a href="http://example.com/next">http://example.com/next</a>} would solve a lot of issues, and would be compatible. There is JSON-Schema trying to add schemas that can extend the type system, but having to have a schema to understand links just seems overkill to me.</p>

<p>JSON ought to specify that unicode strings are normalized really. I guess most of them are, but it does mean you should normalize before doing comparisons on keys for example.</p>
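
<p>A quick sketch of why this matters, using Python's standard <code>unicodedata</code> module. The two keys below look identical on screen but one uses a precomposed character and the other a combining accent; the <code>normalize_keys</code> helper is hypothetical.</p>

```python
import json
import unicodedata

# One key uses U+00E9, the other "e" followed by combining U+0301.
doc = json.loads('{"caf\\u00e9": 1, "cafe\\u0301": 2}')


def normalize_keys(obj, form="NFC"):
    """Return a copy of a JSON object with Unicode-normalized keys."""
    return {unicodedata.normalize(form, key): value for key, value in obj.items()}


normalized = normalize_keys(doc)
print(len(doc), len(normalized))
```

<p>The parser sees two distinct keys; after normalization they collide, and the later value silently wins, so a real tool would need a policy for that.</p>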

<p>There are some syntactically different representations: arbitrary white space, although this does not include the full unicode white space definition, and backslash escaped characters, which can mostly be represented directly in the unicode encoding of the document. Whitespace clearly needs to be preserved for readability and use of line oriented editors and tools. It is unclear how inconvenient it would be if \u codes were normalized to unicode, which is the sane default. I think there were some tools that did not support unicode, although it is mandated by the standard; it is odd perhaps that an ASCII encoding is not an option, but it seems unlikely to be important. In fact though, preserving the exact syntax used is not difficult in many applications as, unlike with some cases in XML, this does not involve much additional state.</p>
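
<p>To illustrate the escaped versus direct forms, here is a short sketch with Python's <code>json</code> module. The sample strings are made up; <code>ensure_ascii</code> is the real parameter of <code>json.dumps</code> that chooses between the two output forms.</p>

```python
import json

escaped = '{"name": "caf\\u00e9"}'   # backslash escaped form
direct = '{"name": "café"}'          # character written directly

# Both parse to the same value; the difference is purely syntactic.
same_value = json.loads(escaped) == json.loads(direct)

# Serializing picks one form, so the original spelling is lost.
ascii_only = json.dumps(json.loads(direct))
unicode_out = json.dumps(json.loads(escaped), ensure_ascii=False)
print(same_value, ascii_only, unicode_out)
```

<p>A filter that wants to leave documents textually unchanged therefore has to remember which form each string used, although that is a small amount of state.</p>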

<p>Although JSON was designed to serialize data structures, its small set of types is limiting. We are not going to get around this though for now, as computer languages still have such different ideas of types. Most uses of JSON have an implicit schema, which is both a strength and weakness. Most implementations were tightly coupled, for example in AJAX. Now we are seeing more APIs exposed to the world using JSON; these have more need for a schema. I tend to prefer the idea of HATEOAS, the REST idea of hypertext as the interface design constraint, rather than published schemas in the SOAP WSDL style, and JSON seems to be more inclined to move to the latter, especially if people use JSON-RPC. I thought people had given up on the RPC style on the web, but it appears not.</p>

<h2>Data model</h2>

<p>The JSON data model is simpler than XML. This is less clear as a differentiator. XML nodes have attributes and children, JSON ones attributes or children if you consider the object to model an attributes set and the array or list type to model an ordered list of children. This difference is not hard to work around, and it is very domain specific what the requirements are.</p>

<p><a href="http://twitter.com/dehora">Bill de hOra</a> pointed out &#8220;you should add field cardinality to the distinctions &#8211; json needs to change structure [], xml needs just another element&#8221; which is a very good point.</p>
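
<p>Bill's cardinality point is easy to show in code. In this sketch (Python standard library only; the <code>author</code> field and both helper functions are invented for the example) the JSON reader has to cope with a value that changes shape from a string to an array, while the XML reader uses the same query whether there is one element or several.</p>

```python
import json
import xml.etree.ElementTree as ET


def json_authors(text):
    """The value may be a single string or a list, so both must be handled."""
    value = json.loads(text)["author"]
    return value if isinstance(value, list) else [value]


def xml_authors(text):
    """One more author is just one more element; the query does not change."""
    return [el.text for el in ET.fromstring(text).findall("author")]


print(json_authors('{"author": "justin"}'))
print(json_authors('{"author": ["justin", "bill"]}'))
print(xml_authors("<post><author>justin</author></post>"))
print(xml_authors("<post><author>justin</author><author>bill</author></post>"))
```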

<h2>Schemas</h2>

<p>Schemas for validating are great. Validation is an important activity. It is complicated in general though, with rules such as &#8220;this must be filled in if that is not&#8221; and so on. Essentially a validation schema might need to be very complicated, but many are very simple. Having a choice of languages to express these constraints in seems to me to be a good thing. The XML DTD is too weak, and should not have been included in the language, as discussed below. Some constraints are computationally complex and need very expressive languages.</p>

<p>The second function of a schema is interpretation; this may relate to validation in that a field must be readable as a number say, and we are also going to read it as a number. This is a different requirement, as in many cases it is about object modelling and code generation, when a validated structure is then mapped to a native language object. These are conceptually separate processes, as a number may be constrained to be between 3 and 5 for domain reasons, but the representation in say Java may be an integer, but it need not be. Of course the validation stage here is essential for security reasons, to stop overflows and type errors; however these are conceptually different activities and may have different schemas.</p>
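
<p>A tiny sketch of the two activities kept apart, in Python. The function names and the 3 to 5 range are taken from the example in the paragraph above or invented for illustration; the point is only that the native type chosen by interpretation and the domain rule checked by validation are separate decisions.</p>

```python
from fractions import Fraction


def interpret_as_int(raw):
    """Interpretation: map the text to one possible native type."""
    return int(raw)


def interpret_as_fraction(raw):
    """Interpretation: the same text could map to a different native type."""
    return Fraction(raw)


def validate(value):
    """Validation: a domain rule that does not care about the native type."""
    return 3 <= value <= 5


print(validate(interpret_as_int("4")))
print(validate(interpret_as_fraction("4")))
print(validate(interpret_as_int("7")))
```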

<h2>Against both</h2>

<p>Binary data is a big problem. We will need a lot of other formats for anything that has binary data, they are just so much more efficient, even after compression. So ideas of a universal format are not going to happen.</p>

<h2>Against XML</h2>

<p>XML is weak on unordered items. Most of the structure in an XML document is the child relations and these are ordered. This is used as a criticism, but I am not sure it is that reasonable, as attributes are unordered and as said elsewhere there is an equivalence with the two structures provided by JSON, named and unordered items, and unnamed ordered ones which seems natural.</p>

<h2>Pro XML</h2>

<p>It was pointed out <a href="http://twitter.com/dret">by Erik Wilde</a> that I had missed out the pro XML section. This was an accident. I am actually very pro XML in many ways. First it has enough structure that we can build rich data structures; and to add to that it has some standard forms (such as XHTML) with rich sets of attributes and elements which can be reused in a variety of domains, and standard link relations. The other big thing is the set of extraction and transformation tools, which are generally quite well designed and fairly complete. There are stream and DOM parsers widely available.</p>

<h2>Against XML DTDs</h2>

<p>The DTD, which XML inherited from SGML, is an anomaly in many ways. First it has a non XML syntax, so we need another set of parsers and tools to work with it. It has several functions that really need to be separated. The first function is as a schema for validating documents against. Unfortunately it is not a very good schema language, as the constraints it can apply against documents are limited. Now we have for example XML Schema and RELAX-NG, which are better schema languages, but the DTD has a special position in the specification that is difficult to drop.</p>

<p>In addition to being a schema, the DTD can also define default values for attributes that the application should see just as if they were in the document. This is the kind of thing that makes preserving the textual form difficult, as there is a syntactic but not semantic difference between certain attributes. I also do not think that this is used much, as real defaults would be implied by the processing model not the document. Clearly it is easy to remove this feature from documents simply by adding in all the implied defaults explicitly.</p>
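
<p>Here is a small sketch of the defaulting behaviour, using Python's <code>xml.etree.ElementTree</code>, which sits on the expat parser. The <code>item</code> element and <code>status</code> attribute are made up. As far as I can tell expat applies defaults declared in the internal DTD subset, so the two documents below look the same to the application even though they differ textually.</p>

```python
import xml.etree.ElementTree as ET

# The attribute is never written on the element, only declared in the DTD.
with_default = ET.fromstring(
    '<!DOCTYPE item [<!ATTLIST item status CDATA "draft">]><item/>'
)

# Here the same attribute is written out explicitly.
explicit = ET.fromstring('<item status="draft"/>')

print(with_default.attrib)
print(explicit.attrib)
```

<p>A filter that wants to write the first document back unchanged has to remember that the attribute was implied rather than present.</p>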

<p>There are security issues due to the parsing issues with entities, which means that <a href="http://msdn.microsoft.com/en-us/library/ms756016%28VS.85%29.aspx">some parsers disable DTD parsing for security reasons</a>. SOAP for example does not support DTDs. This is of course non conforming, but clearly a good idea in many situations.</p>

<p>DTDs are not namespace aware, which makes them unusable in many cases with documents with namespaces. Another reason to deprecate them.</p>

<h2>Against XML entities</h2>

<p>Then there are entities. My reading of the initial spec is that entities were designed to save typing for people, but I do not think that they are used for anything except for memorable encodings of characters outside the ASCII set. The thing about this use case is it is perfectly alright to substitute the values for them, as they never change, whereas if I create my own arbitrary entity in a DTD for the name of something it may be because I wish to use this like a search and replace function to substitute whatever I want in. This is in my opinion not really appropriate at the document format level; this is an application level tool, and the application should use regular XML tags for this type of user level structure.</p>

<p>XML entities can also be used as an inclusion mechanism; again the DTD is not the place to define this. XInclude seems much better if this facility is needed.</p>

<p>Entities can contain other entities, markup and so on. Recursion, and unbalanced markup are not allowed. This whole thing adds enormously to parsing complexity, when the use case is entirely as character data.</p>
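
<p>The different kinds of entity can be seen side by side in a short Python sketch, again with <code>xml.etree.ElementTree</code>. The <code>product</code> entity and its value are hypothetical. The user defined entity, the predefined <code>amp</code> and the numeric reference all end up as plain character data, but only the first needs the DTD to be parsed.</p>

```python
import xml.etree.ElementTree as ET

doc = (
    '<!DOCTYPE note [<!ENTITY product "Acme CMS">]>'
    "<note>&product; &amp; &#233;</note>"
)

note = ET.fromstring(doc)
# After parsing, no trace of the entity references remains in the text.
print(note.text)
```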

<h2>Against XML namespaces</h2>

<p>I am not against XML namespaces per se, but there are <a href="http://lists.xml.org/archives/xml-dev/200204/msg00170.html">pathological cases</a> which make them very hard to process sanely. In particular, you can redefine the same namespace name to refer to multiple URIs in the same document, and you can refer to the same URI with different names. This effectively means that all processing needs to refer to both the short name and the full name. As this is exactly what the spec was trying to avoid it is pretty bad. The amount of state you need to keep a namespaced document textually the same after processing is very large; the nasty mess one tends to get from parsers to let you cope with namespaces is one measure; another is the complexity of XPath on namespaced documents, especially ones with any of the pathological cases in.</p>
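
<p>Both pathological cases fit in one small document. This sketch uses Python's <code>xml.etree.ElementTree</code>, which reports names in its <code>{uri}local</code> form; the example.com URIs and element names are made up.</p>

```python
import xml.etree.ElementTree as ET

doc = """
<root xmlns:a="http://example.com/one">
  <a:item/>
  <inner xmlns:a="http://example.com/two">
    <a:item/>
  </inner>
  <b:item xmlns:b="http://example.com/one"/>
</root>
"""

# Collect the expanded names of every element called "item".
tags = [el.tag for el in ET.fromstring(doc.strip()).iter() if el.tag.endswith("}item")]
for tag in tags:
    print(tag)
```

<p>The two <code>a:item</code> elements are different names, while the first <code>a:item</code> and <code>b:item</code> are the same name, so a textual search for <code>a:item</code> gets both answers wrong. Note too that the parser has thrown the prefixes away, which is the state you would need to keep to write the document back unchanged.</p>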

<p>The simple solutions seem to involve not allowing redefinition of namespaces to a different URI in the same document, or the converse; declaring all the namespaces that will be used in the root element is also an option. This means processing can be more or less namespace unaware, as xsd:type will mean the same thing regardless of the context. This falls in with the standard usage, where a fairly small set of namespaces are used and they have abbreviations by convention that remain constant across large sets of documents. This means that very little namespace awareness complexity is needed.</p>

<h2>Other issues</h2>

<p>Mixed content, the role of CDATA, the significance of whitespace, these are all extremely complex issues that could be simplified.</p>

<h2>Minimal XML proposals</h2>

<p>XML, quite hard but worth it? For the applications I am interested in, I think simplification is needed. The first issue is that security and simplicity are related. Anything web facing will get hostile documents thrown at it, and having more constraint helps, in a way that the document processing industry does not see so much as an issue.</p>

<p>There was a time ten years or so ago, when minimal XML proposals were fashionable. XML itself was of course an attempt at a minimal SGML proposal, but not enough was cut or changed, and much compatibility was kept. <a href="http://simonstl.com/articles/cxmlspec.txt">Common XML</a> seems the most reasonable to me, and addresses many of the issues. XML tools do not work in the way that was perhaps envisaged, and making things simpler and easier, evolving them, will make them more robust. JSON shows that the demands for simplicity are there, and XML will suffer if it does not answer these.</p>

<p>The first thing is to drop the DTD. It serves no real function now we have alternative schema languages for XML. Radically, I think we can drop entities too, other than the necessary ones for escaping (amp, quot etc), and numeric ones which are again syntactic. The only possibility for requiring named entities is XHTML, but it barely exists now, and those entities could be special cased there without difficulty, as their values will never change and they do not contain markup or other things that cause parsing issues. Arguably these named entities could be added to the XML spec anyway for all documents, changed to a purely syntactic thing. I am not aware of any other XML usage of entities; there may be a few I suppose.</p>

<p>For namespaces, there needs to be a solution that maps syntax to semantics, so that an attribute or element syntactic name has the same semantics throughout the document. Renaming in different scopes makes global transformations, comparisons, and simple processing too hard. It breaks simple search and replace, even that needs to be namespace aware.</p>

<h2>Data versus applications</h2>

<p>Part of the conflict is due to whether XML is an application protocol, or a data format. Some of the bits that have issues, like entities, are really part of an application data format, for a class of applications that work according to the model in the mind of the XML designers, which in turn was based on real SGML applications. But data formats are winning really. We want to attach additional semantics to data now through standard mechanisms, such as relations, RDF and so on, not be expanding the storage format. Simplicity is winning here: complexity in a data format does not add to the richness that can be expressed; simple uniform mechanisms can do this. And simplicity is going to win; linked data over Microsoft Word style application data formats.</p>

<h2>What will happen?</h2>

<p>I actually think these changes are, informally, happening. DTDs and entities are not used in many cases now. They may be in some publishing applications, especially those based on SGML, but the web document architecture does not use them significantly. Namespaces are used in a particular way, usually. HTML5 has shown what the logic of human readability and writeability implies, which is a non XML language. The great advantage of XML is the variety of ways in which it can be processed, but issues such as security against hostile documents, parsing complexity, performance, and ease of processing really matter a lot, and despite many weaknesses JSON is showing the way of radical simplicity. But a simplified XML would be no more complex than JSON I think, and have the advantages of richer tool support, and widespread use. Most of the XML in the wild in APIs is very simple; the sorts of XML that are embedded in other documents as metadata are simple too. Security is limiting processing, and the traditional publishing applications that historically used more of the functionality could change too, although more slowly. Will simplicity win, and will JSON replace XML? I think not, because so much XML is in use, but I think a specification of an XML subset is needed to stabilise the situation.</p>

<p><a href="http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags"><img src="http://blog.technologyofcontent.com/wp-content/uploads/2010/01/parse.png" width="440" alt="you cannot parse XML with regular expressions"></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/01/json-vs-xml/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Scaling, Security and architecture in 2010</title>
		<link>http://blog.technologyofcontent.com/2010/01/scaling-security-and-architecture-in-2010/</link>
		<comments>http://blog.technologyofcontent.com/2010/01/scaling-security-and-architecture-in-2010/#comments</comments>
		<pubDate>Sun, 17 Jan 2010 18:47:53 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=200</guid>
		<description><![CDATA[This post is about a bunch of stuff I have noticed recently, things that are affecting software and hardware architectures, and security; it is a bit miscellaneous perhaps. As application architectures on the enterprise move towards emulating web scale architectures these trends will affect software more widely. This concentrates on Linux, the operating system the [...]]]></description>
			<content:encoded><![CDATA[<p>This post is about a bunch of stuff I have noticed recently, things that are affecting software and hardware architectures, and security; it is a bit miscellaneous perhaps. As application architectures on the enterprise move towards emulating web scale architectures these trends will affect software more widely. This concentrates on Linux, the operating system the internet is now built on, and how it is modifying the trends to fit with ways of doing things that may be different from what goes on in other communities. Security continues to be more and more important as the environment for applications becomes more hostile.</p>

<h2>Virtualization</h2>

<p>Virtualization mainly started as a way to deal with issues in running multiple services on Windows, due to compatibility issues. This has always been much less of an issue with Linux applications, due to the scale of supporting libraries packaged by distributions. It is still an issue though, for security reasons (apache without suexec for shared hosting still exists, bypassing OS based multi tenancy security, a model that should have gone years ago). KVM, which uses Linux as a hypervisor and uses the hardware virtualization capabilities of newer hardware, is now in the Linux kernel and supported in Redhat Enterprise Linux. I suspect this will gradually overtake Xen and VMWare in areas where only Linux is of interest, due to the built in kernel support; however lighter weight solutions for the security issues such as containers will probably take off instead for many applications where running multiple kernels is unnecessary.</p>

<h2>Containers</h2>

<p>Linux now has a full container model called LXC, similar in principle to BSD jails and Solaris zones. It arrived a bit gradually as a set of patches to namespace various parts of the system such as the process ID space, so a container has its own init process with ID 1 and can have the same IDs as other containers (this also is needed for process migration). There is also a network namespace, so each container has its own loopback device, and independently named network devices (that can for example be bridged back to the host). There is also a read only bind mount which can be used to safely export libraries and binaries to multiple containers with updates done centrally if required; otherwise the container can be managed as a standalone system just sharing the kernel. This environment provides a level of secure isolation between containers that solutions such as chroot never had. Processes in containers can be seen from the container host so obviously this needs to be well secured. Because containers do not need hardware support and are very lightweight I think they will grow rapidly in popularity; they can also run within a virtual machine guest for process isolation in a virtual environment. Ubuntu 10.04 will have <a href="https://wiki.ubuntu.com/ContainersSpec">full support</a>; earlier versions do work.</p>

<h2>Capabilities</h2>

<p>The old high risk ways of setuid binaries (with broad permissions) are going at last, replaced by a fine grained capabilities system. In principle this means you can drop root capabilities completely, making root an unprivileged user. There is a <a href="http://ols.fedoraproject.org/OLS/Reprints-2008/hallyn-reprint.pdf">good summary article on this</a> and <a href="http://www.linuxjournal.com/article/10249">another on trying to remove root access</a>. It seems that we will not see pure capabilities based Linux distributions for a while, and will have setuid binaries in general purpose systems, but there is no reason why single application sandboxes should not drop root capabilities in their init process and just use capabilities set in the file system. Fedora seems the furthest ahead in trying this out as a full distribution, and hopefully this will move ahead, adding another security layer in addition to SELinux.</p>

<h2>Sandboxing</h2>

<p>Privilege separation in network applications has been around for a while, but it is starting to spread, with the best example being the <a href="http://blog.chromium.org/2008/10/new-approach-to-browser-security-google.html">Chrome security model</a>. The thing that has really started to change is treating all complex bits of code, such as HTML rendering in Chrome, as potentially hostile as they are likely to be buggy. There is a lot to do to get good security thinking pervasive in application design, but having some well thought out examples is a good start. Currently Linux Chrome seems to offer a <a href="http://code.google.com/p/chromium/wiki/LinuxSandboxing">choice of sandboxing methods</a> of varying effectiveness, from a suid helper to using <a href="http://lwn.net/Articles/332974/">seccomp</a>.</p>

<h2>SELinux</h2>

<p>SELinux has been available in Linux, providing a Mandatory Access Control framework for ten years now, but it has taken that long for it to get really widespread use, mainly pushed by RedHat. Gradually it is extending to other applications, such as mod_selinux for Apache that runs web applications in appropriate security contexts; Postgres SELinux extensions are also available. We are getting to a point when OS security mechanisms can and will be used as they provide the types of security hooks that modern applications need, after a period where we have had applications inventing their own security mechanisms because the OS did not provide the right ones.</p>

<h2>Physicalization</h2>

<p>There was an interesting new buzzword this year: <a href="http://arstechnica.com/business/news/2009/11/basics-of-physicalization.ars">physicalization</a>. Yes, just when you thought virtualization was an important new trend, along comes the opposite. What is the idea?</p>

<p>A two socket 8 core server with 16GB RAM and multiple ethernet ports divided into four virtual servers is actually quite expensive compared to four commodity low end boxes. There is a server premium built into the chip manufacture profit model for a start, and also a volume issue.</p>

<p>The price arbitrage is fairly compelling, although the other costs (disks, motherboards, networking) add up and reduce the saving. The example systems are things like <a href="http://www.sgi.com/products/servers/microslice/">SGI&#8217;s Microslice</a> &#8211; yes SGI, that name from the past! This offers dual core but single CPU systems, but with ECC, for significantly lower price and power consumption than typical two way servers, and potentially more throughput per $, for some workloads.</p>

<p>There are even some suggestions that for Linux workloads non x86 architectures (eg ARM) might be competitive for applications that scale out effectively to multiple machines, although I think the risk of introducing these would be high, and there would need to be a big buyer.</p>

<h2>Cloud</h2>

<p>The big coming trend as the world comes out of recession is that cloud computing platforms are cheap, very cheap, compared to in house server provision. Some estimates put it at 20% of cost now, falling to 10% this year. Part of this is economies of scale, part is standardized components and architectural options, and economies of scale in administration. Part of it may be untrue, as there certainly do not appear to be good figures. What is clear is that the SAAS model is compelling for many kinds of product, and fits in with a general movement to charge software as an expense not an investment. There is a lot of hype, and a lot of people have seen the cloud idea before under different names, but the web has produced a viable delivery mechanism, and the uniformity of hosting environments like EC2 cuts costs. Costs such as upgrades are much lower in a SAAS environment too; although the architecture of this software needs to be different to support that.</p>

<h2>Availability</h2>

<p>The last year or so, high availability programming has reached out into awareness a bit. The <a href="http://www.infoq.com/presentations/Systems-that-Never-Stop-Joe-Armstrong">Erlang model</a> has become better known, bringing more awareness of the base elements for building reliable systems such as process supervision. We are starting to see other implementations, such as <a href="http://akkasource.org/">Akka</a>. This is a great move, as availability needs to move from being a sysadmin and maintenance issue to being a coding issue; for too long effective handling of failure has been ignored by programmers.</p>
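
<p>To give a flavour of process supervision, here is a deliberately tiny sketch in Python. It is not how Erlang or Akka do it (they supervise separate processes or actors, with restart strategies and supervision trees); the <code>supervise</code> function, the restart limit and the flaky worker are all invented to show the basic idea of letting a worker crash and restarting it.</p>

```python
def supervise(worker, max_restarts=3):
    """Run worker(); if it raises, restart it, giving up after max_restarts."""
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise  # escalate: the supervisor itself fails
            restarts += 1


attempts = {"count": 0}


def flaky_worker():
    """Hypothetical worker that crashes on its first two runs."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated crash")
    return "done"


outcome = supervise(flaky_worker)
print(outcome)
```

<p>The worker contains no error handling at all; the failure policy lives in one place, which is the part of the model worth stealing.</p>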

<h2>Locks</h2>

<p>As applications start to scale to more threads on multicore CPUs, locking becomes more of an issue. <a href="http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms">Lock-free algorithms</a> are one interesting answer that has emerged that can work well for some algorithms. Getting past the scaling issues as architectures get more cores needs innovation in lots of areas such as this. Locks are definitely in the sequential areas that limit scaling through <a href="http://en.wikipedia.org/wiki/Amdahl%27s_law">Amdahl&#8217;s law</a>.</p>
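
<p>Amdahl's law itself is one line of arithmetic, sketched here in Python. The function name and the 90% parallel figure are just assumptions for illustration; the formula is the standard one, speedup = 1 / (serial + parallel / cores).</p>

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# With 10% of the time spent serialized (for example behind a lock),
# adding cores helps less and less.
for cores in (2, 8, 64, 1024):
    print(cores, round(amdahl_speedup(0.9, cores), 2))
```

<p>However many cores you add, the speedup in this example can never reach 10, which is why shrinking the serialized part matters more than adding hardware.</p>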

<h2>Summary</h2>

<p>Software architecture is at an interesting point; the principles of web architecture and the security mindset are gradually feeding into tools and infrastructure and becoming more widespread, and delivery is also changing. Scalable, available and secure systems are the aim.</p>

<p><a href="http://dilbert.com/strips/comic/2009-11-19/" title="Dilbert.com"><img src="http://dilbert.com/dyn/str_strip/000000000/00000000/0000000/000000/70000/4000/100/74150/74150.strip.gif" border="0" alt="Dilbert.com" width="450"/></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/01/scaling-security-and-architecture-in-2010/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Standards Diagram for Content Management</title>
		<link>http://blog.technologyofcontent.com/2010/01/standards-diagram/</link>
		<comments>http://blog.technologyofcontent.com/2010/01/standards-diagram/#comments</comments>
		<pubDate>Tue, 12 Jan 2010 23:17:33 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[CMS]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=191</guid>
		<description><![CDATA[This is attempt number 1 of a diagram which I promised Jon Marks after his post. No it still does not have OSGI in! As Jon&#8217;s presentation used Prezi I thought I would give it a go. It takes a while to get the hang of it but it is fun. I can&#8217;t work out [...]]]></description>
			<content:encoded><![CDATA[<p>This is attempt number 1 of a diagram which I promised <a href="http://jonontech.com/2010/01/10/an-incomplete-directory-of-open-standards/">Jon Marks</a> after his post. No it still does not have OSGI in! As Jon&#8217;s presentation used <a href="http://prezi.com/">Prezi</a> I thought I would give it a go. It takes a while to get the hang of it but it is fun. I can&#8217;t work out how to get an overview at the end though&#8230;</p>

<p>Just press the play button to move around.</p>

<iframe height="300" src="http://prezi.com/nifkatyvrk02/view" width="450"></iframe>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/01/standards-diagram/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Open source and Content Management (for Janus Boye)</title>
		<link>http://blog.technologyofcontent.com/2010/01/open-source-and-content-management-for-janus-boye/</link>
		<comments>http://blog.technologyofcontent.com/2010/01/open-source-and-content-management-for-janus-boye/#comments</comments>
		<pubDate>Sun, 10 Jan 2010 14:45:31 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[CMS]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=186</guid>
		<description><![CDATA[Janus Boye said the other day after the BCS open source seminar in London


  @McBoof I left London dazed and confused when it comes to open source. Somebody pls. help me explain what open source really means #idiot


Now I only spoke to him very briefly before he had to rush to the airport, but [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://twitter.com/janusboye">Janus Boye</a> said the other day after the BCS open source seminar in London</p>

<blockquote>
  <p>@<a href="http://twitter.com/McBoof">McBoof</a> I left London dazed and confused when it comes to open source. Somebody pls. help me explain what open source really means #idiot</p>
</blockquote>

<p>Now I only spoke to him very briefly before he had to rush to the airport, but hopefully the following will be helpful: first an overview of the important things about open source in general, and then how they are and will affect content management in particular.</p>

<h2>Open Source</h2>

<p>I think it is easiest for software developers to understand open source. It came from that community, and it addresses our needs. For a long time no one outside that community was really concerned with it. I think the first time I noticed someone who was not a developer showing an interest was in 1999, ten years ago now: I was stopped at a tram stop in Vienna by an American, as I was wearing a Red Hat T-shirt just after their float, and was asked who the next open source IPO was going to be. I suppose that IPO was a big event in spreading awareness of open source, although it did not perhaps spread much information about what it was really about.</p>

<p>I think the best place to start with trying to understand open source is with three things. I tend to have a bit of a historical approach to things&#8230;</p>

<p>The first is Richard Stallman. I recommend <a href="http://www.fsf.org/events/rms-speeches.html">him in person</a> rather than in writing. That reminds me: the first time I ever saw him I was sitting in the <a href="http://www.foundry.tv/">Foundry in Old Street</a> and he walked in and proceeded to autograph a woman&#8217;s breasts. Anyone who wants to understand open source should hear him explain the roots of the open source movement. I will not really try to explain all that here, but openness is what created the scientific method. A key part of it is the idea that software got to the point where it was no longer possible to make it do what you wanted because you did not have access to the source code: the point where you could not build on stuff any more or fix it, where control of your tools was taken away. Some people have tried to sanitize Richard out of things (the open source vs free software mess) but that is a mistake.</p>

<p>Second is Eric Raymond&#8217;s essay <a href="http://catb.org/~esr/writings/homesteading/">The Cathedral and the Bazaar</a> which was very influential at the birth of commercial open source. It is strictly about software development methodologies, and much of the discussion about the cathedral methods is applicable to open source software too. It is about the huge changes that the internet brought in open source development, the birth of a development method that no longer copied the methods of closed source development but utilised the openness to create true large scale community development in a way that was not possible before, and which closed source cannot replicate. Linux is of course the classic early example of this.</p>

<p>Which brings us to the third thing, community. Open source is first of all participatory, not just for consumption, perhaps a bit against the grain of late twentieth century culture. Actually I am an optimist, <a href="http://www.herecomeseverybody.org/2008/04/looking-for-the-mouse.html">with Clay Shirky and against the sitcom</a>, and think culture is swinging this way but we shall see. So for open source, start by using it, then participate. No, you do not have to code, although you can learn; there are other ways: bugs, documentation, all sorts. If you just want to see what the community looks like, I can&#8217;t recommend anything better than going to a good conference, like <a href="http://fosdem.org/">FOSDEM next month in Brussels</a>.</p>

<h2>Open source in content management</h2>

<p>Open source has not affected content management much yet. Almost all content management by volume takes place on open source products (by volume WordPress, Joomla! and Drupal far outweigh anything else). By value it is less clear. Open source always has an issue with by-value calculations, as the revenue models are different: Linux is not the leading server operating system by value, but it is by installed base, and probably also by the value of the services running on it.</p>

<p>But arguably open source content management software has not affected the industry yet, looking now at the larger installations, and the areas that Janus is interested in, indeed that I am. The industry has grown up in a mess as far as standards, ideas, infrastructure are concerned, but the <a href="http://en.wikipedia.org/wiki/Reality_Checkpoint">reality checkpoint</a> has been reached. Two standards have so far started to change the technology landscape of content management, JCR and CMIS, and almost all the implementations of these are open source, and most are cross-vendor projects. This change will grow as more standardization and commoditization sweeps the industry, as the industry adopts a web infrastructure rather than the pre-web legacies inherited from the document management history of the business. Everything that this business deals with will be served through the web; almost all web infrastructure is open source software; content management will be no different.</p>

<p>In this field this is all just beginning. Like open source as I said above, it started with developers, about more efficient ways of building, architecting and delivering software; in terms of influence on the end users it is still small. But things are turning as people become aware of open source in the industry, although they clearly still need some help understanding it. I hope this has helped.</p>

<p><a href="http://dilbert.com/strips/comic/2007-08-03/" title="Dilbert.com"><img src="http://dilbert.com/dyn/str_strip/000000000/00000000/0000000/000000/00000/1000/600/1676/1676.strip.gif" border="0" alt="Dilbert.com" width="480"/></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2010/01/open-source-and-content-management-for-janus-boye/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>The bottom 10 things of 2009</title>
		<link>http://blog.technologyofcontent.com/2009/12/the-bottom-10-things-of-2009/</link>
		<comments>http://blog.technologyofcontent.com/2009/12/the-bottom-10-things-of-2009/#comments</comments>
		<pubDate>Tue, 22 Dec 2009 23:40:04 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=180</guid>
		<description><![CDATA[Ok so I agreed to write a bottom 10 list for 2009, in a twitter agreement with @pmonks. Unfortunately I have just had another bout of winter flu so it has got a bit late so I may not make it to 10, unless there is another last minute entry (any suggestions?). Actually checking the [...]]]></description>
			<content:encoded><![CDATA[<p>Ok so I agreed to write a bottom 10 list for 2009, in a twitter agreement with <a href="http://twitter.com/pmonks">@pmonks</a>. Unfortunately I have just had another bout of winter flu so it has got a bit late so I may not make it to 10, unless there is another last minute entry (any suggestions?). Actually checking the tweet, it said bottom for 2010, but it is traditional to do that in the new year. So here goes, here is what kept me awake in the night in 2009.</p>

<h2>10. The Pirate Bay saga</h2>

<p>Yet another mess in the ongoing spectacle of the entertainment industry preferring legal to creative solutions was the Pirate Bay trial. All this really showed us was that the laws are just not well framed, so anyone could win, and it may all change on appeal. This time it was not suing your customers directly, but legal action is not going to make anyone change their mind. Obviously the next step is going to be to influence the passing of bad laws, not the creation of business value. It seems uncoincidental that Spotify is Swedish. In times of change, business model engineering and service engineering are as important as product engineering. Legal action in the way the entertainment business is conducting it creates nothing long term.</p>

<h2>9. <a href="http://en.wikipedia.org/wiki/Internet_censorship_in_Australia">Australian internet censorship</a></h2>

<p>Get your act together Australians and stop this. Many other governments are looking at ways to start doing this, so it is an important example.</p>

<h2>8. The EU MySQL Oracle Sun delay</h2>

<p>You cannot make industrial policy on this sort of timeline. If the EU were to turn down the deal now Sun would be destroyed. Oddly MySQL was at a transition point anyway. I am very much in favour of the <a href="http://en.wikipedia.org/wiki/Drizzle_%28database_server%29">Drizzle idea of the future of MySQL</a>; who knows where it will end up but it may well be outside Oracle anyway.</p>

<h2>7. SPARQL is a query language without a resource model</h2>

<p>Looks like this has a chance of being fixed in 2010 at last, although I have temporarily mislaid the references, check for the newer references to named graphs. The idea that you could launch a query language for the web without a resource model was yet another of the dumb W3C ideas. The model appeared to be to build XML in Prolog. That sucks. Unfortunately the fixes are quite substantial (quads not triples for example).</p>

<h2>6. WebDAV</h2>

<p>Although not exactly something from this year, remarkably it has kind of held on, and people still specifically mention it as an alternative to CMIS. Indeed it is kind of useful sometimes, in strange situations, and it does work in a limited way, but it is not a modern HTTP interface. You have to remember how early it is, as work started in 1996, when it was not clear how the web would develop, or indeed how HTTP would develop (HTTP 1.1 was out but not much used and it was shipped in a mostly 1.0 environment). Even at the time some of the mistakes were clear, but the great thing is they are all documented in Yaron Goland&#8217;s <a href="http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0308.html">The WebDAV Book of Why</a>. Like the issues of hierarchy that make mixing WebDAV with normal HTTP impossible, and the <a href="http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0303.html">depth header disaster</a>. There are also some comments about it in the <a href="http://jonudell.net/udell/2006-08-25-a-conversation-with-roy-fielding-about-http-rest-webdav-jsr-170-and-waka.html">Roy Fielding podcast</a> in which Roy tries to avoid talking about JSR. The best thing about WebDAV is how well documented the mistakes are; this should be compulsory for all standards.</p>

<h2>5. <a href="http://en.wikipedia.org/wiki/GeoCities">Geocities</a></h2>

<p>The embarrassing kiddie years of the internet dead and buried. Mostly not the worst of 2009, but the idea you could still nurture your city. Obviously an anti archival moment for future historians to curse about. Still, a hubris reminder too, this was once the third most popular site on the internet, and sold for $3.57 billion; look on their works ye mighty and really despair.</p>

<h2>4. Microformats</h2>

<p>Never going to work. We really need generic metadata representations that have sane serializations or embeddings into all formats. Metadata <a href="http://blog.technologyofcontent.com/2009/08/metadata-is-not-what-it-used-to-be/">now lives within documents</a>; it used to get lost before that. So the RDF model has won, and microformats have lost. Oh, and the standards process sucked.</p>

<h2>3. The XHTML2 débâcle</h2>

<p>Had to happen, but why did it take so long for the W3C to fall behind HTML5 rather than XHTML2? This was a huge diversion of resource. The W3C churns out stuff and some of it gets adopted, some is implementable, some of it is not implementable realistically. The organization needs to change or it will be irrelevant.</p>

<h2>2. The Go programming language</h2>

<p>I am an aficionado of programming languages. I have programmed in many of them, C, Haskell, you name it. Lua and Erlang are my new ones for the year, though it&#8217;s getting a bit late and I have barely started. I know my combinators from my closures. What is the point of <a href="http://en.wikipedia.org/wiki/Go_%28programming_language%29">Go</a>? It does not really offer anything for the currently interesting problems, and I do not think it is going to make it anywhere. I would be surprised if it ever gets onto the allowed Google programming language list, which is <a href="http://steve-yegge.blogspot.com/2007/06/rhino-on-rails.html">C++, Python, Java, Javascript</a> since you ask. Google is doing some cool performance work on Python though, under the name <a href="http://code.google.com/p/unladen-swallow/wiki/ProjectPlan">unladen swallow</a>.</p>

<h2>1. I4I&#8217;s patent win over Microsoft</h2>

<p>A last minute entry here. I4I has an <a href="http://www.theregister.co.uk/2009/12/22/microsoft_loses_word_patent_appeal/">injunction against Microsoft selling Word</a> without the generic XML editing functionality removed. Obviously it will be removed, and it is not a feature that a lot of people used. However <a href="http://broadcast.oreilly.com/2009/08/microsoft-and-the-two-xml-pate.html">analysis of the patent</a> indicates that it clearly has prior art, is unclearly applicable, and could affect many other XML applications. The affected part of Word is designed to be a fairly general XML processor, with similar capabilities to <a href="http://en.wikipedia.org/wiki/XForms">XForms</a>. We need to support Microsoft in getting the judgement reversed.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2009/12/the-bottom-10-things-of-2009/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Smart resources, or why you should care about HTTP PATCH</title>
		<link>http://blog.technologyofcontent.com/2009/12/smart-resources-or-why-you-should-care-about-http-patch/</link>
		<comments>http://blog.technologyofcontent.com/2009/12/smart-resources-or-why-you-should-care-about-http-patch/#comments</comments>
		<pubDate>Fri, 11 Dec 2009 23:42:06 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[REST]]></category>
		<category><![CDATA[HTTP]]></category>
		<category><![CDATA[PATCH]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=169</guid>
		<description><![CDATA[Unusually, there has been a significant change to the HTTP protocol this week. The PATCH method was approved by the IETF.

This is a big change as one of the parts of the HTTP model is the small &#8220;uniform interface&#8221;, where there are very few things you can do to web resources. GET is the most [...]]]></description>
			<content:encoded><![CDATA[<p>Unusually, there has been a significant change to the HTTP protocol this week. The <a href="http://greenbytes.de/tech/webdav/draft-dusseault-http-patch-16.html">PATCH method</a> was approved by the <a href="https://datatracker.ietf.org/drafts/draft-dusseault-http-patch/">IETF</a>.</p>

<p>This is a big change as one of the parts of the HTTP model is the small &#8220;uniform interface&#8221;, where there are very few things you can do to web resources. GET is the most common, to retrieve a resource representation. Then there is PUT to update a resource, and DELETE to delete it. Then there is POST, which tends to cover everything else you might want to do. The problem with that is that discovering the interface for POST is difficult, as is knowing exactly what it will do. (There are a few other verbs too).</p>

<p>PATCH is much more straightforward. PUT updates an entire resource with a new version, while PATCH just makes an amendment to a resource. For some types of resource, the entire resource may be large, so that just sending differences will save bandwidth. Also, sending the full resource may unnecessarily make the changes sequential, for example append operations where the order of the operations is not significant. One example given is a log file, where many processes may be adding entries, and if they had to retrieve the whole log, append a new entry and write it back there would be a lot of extra traffic, and a chance of either lost updates or processes having to retry if the resource was modified during this process. Clearly a PATCH operation here that does an append would make sense. I am not sure that is actually a very good example though, as you would almost certainly create a resource for each log entry, rather than one for the whole lot, but clearly other similar patterns exist.</p>
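
<p>To make the lost update problem concrete, here is a minimal Python sketch. The <code>LogResource</code> class and its method names are hypothetical, an in-memory stand-in for a resource rather than a real HTTP server, and the append-style patch is just one assumed patch format.</p>

```python
# Hypothetical in-memory stand-in for a log resource; not a real HTTP server.
class LogResource:
    def __init__(self):
        self.entries = []

    def get(self):
        # GET: return a copy of the full representation.
        return list(self.entries)

    def put(self, entries):
        # PUT: replace the entire resource state.
        self.entries = list(entries)

    def patch_append(self, entry):
        # PATCH: send only the change (here, an append).
        self.entries.append(entry)


def lost_update_with_put():
    log = LogResource()
    # Two clients both read the (empty) log before either writes back.
    seen_by_a = log.get()
    seen_by_b = log.get()
    log.put(seen_by_a + ["entry from A"])
    log.put(seen_by_b + ["entry from B"])  # silently overwrites A's entry
    return log.get()


def append_with_patch():
    log = LogResource()
    # Each client sends only its own entry; neither needs the full log.
    log.patch_append("entry from A")
    log.patch_append("entry from B")
    return log.get()
```

<p>With the read-then-PUT cycle only the entry from B survives; with the append patch both entries are kept, and neither client had to fetch the log first.</p>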

<h2>HTTP is not a filesystem</h2>

<p>When it was new, people tended to treat HTTP like a filesystem. After all that was the common model for storage, and web servers generally stored web pages as files, so they tended to be treated much like that, with filename extensions and index files, and WebDAV was created to try to make the web usable as a filesystem protocol. This model does not really work very well however, as it does not model the things you can and can&#8217;t do with HTTP. The methods are one example; updating entire resources at once means they tend to be small units, rather than, for example, log files. File systems generally struggle to store millions of tiny files without wasting a lot of space, and without becoming slower. The web resource does not have to support full Unix filesystem semantics (a topic that oddly Wikipedia seems to be missing an entry on! May have to rectify that), and supports a much simpler update model.</p>

<p>Maybe the easiest way of thinking about HTTP is to see every URL as a small <a href="http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller">Model View Controller</a> (MVC) system. The model is an abstract resource, which we can see through GET, which retrieves views. There can be multiple views, as a request can ask for different media types and encodings for a single resource. The controllers are the media types supported by PUT, which are usually the same as those for GET, but need not be; because both the view and the controller need to represent the whole resource state, they do tend to be quite complex representations, such as XML documents. PATCH however is also a controller, but a more interesting one in many ways, as it can send just state changes to the model, which tends to be how many MVC systems work.</p>

<p>Another thing that PATCH enables is resources that hide some of their state. A resource could only support PATCH and not PUT so that state modifications were only changes. If the state returned by GET is not the complete state, the resource could hide parts of the model. An example could be a voting method that records but does not reveal who has voted, only returning totals, which accepts votes as PATCH requests.</p>
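
<p>A rough sketch of that voting resource in Python might look like the following. Everything here is an assumption for illustration: the <code>VoteResource</code> name, the shape of the patch (a voter and an option), and the choice of status codes.</p>

```python
# Hypothetical voting resource: accepts votes as patches, refuses PUT,
# and only ever reveals the totals through GET.
class VoteResource:
    def __init__(self, options):
        self._totals = {option: 0 for option in options}
        self._voters = set()  # hidden state, never part of any view

    def get(self):
        # GET: the view contains totals only, not who voted.
        return dict(self._totals)

    def put(self, representation):
        # PUT is refused: the full state cannot be written by clients.
        raise NotImplementedError("405 Method Not Allowed: use PATCH")

    def patch(self, voter, option):
        # PATCH: record one vote; status codes are illustrative choices.
        if voter in self._voters:
            return 409  # this voter has already voted
        if option not in self._totals:
            return 422  # no such option
        self._voters.add(voter)
        self._totals[option] += 1
        return 204
```

<p>A client can add a vote and read the running totals, but there is no representation it can GET or PUT that exposes the set of voters.</p>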

<h2>Server side scripting</h2>

<p>One PATCH format that makes a lot of sense is actually to use executable code, rather than say diff files. There is no reason why you should not send the server a PATCH request that is some Javascript to modify the DOM of an HTML resource which can be executed serverside, or an XSL transform to modify an XML object. Sending code is an efficient way of making changes to a resource, and can be executed in a sandbox like the browser sandbox. This will be another driving factor for Javascript on the server side, as it is well suited for embedding like this, and already has a DOM model for transforms.</p>
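
<p>Python has no built-in sandbox comparable to the browser one, so the sketch below swaps in a simpler technique: the patch body is a small program of whitelisted operations that the server interprets, rather than arbitrary code. The function name, the operation names and the dictionary used as resource state are all hypothetical.</p>

```python
import copy


def apply_patch_program(state, program):
    """Run a list of (operation, key, value) steps against a copy of state.

    Only operations in the whitelist can run, which stands in for a real
    sandbox in this sketch.
    """
    allowed = {
        "set": lambda doc, key, value: doc.__setitem__(key, value),
        "append": lambda doc, key, value: doc.setdefault(key, []).append(value),
        "remove": lambda doc, key, value: doc.pop(key, None),
    }
    # Work on a deep copy so a failing program leaves the resource untouched.
    result = copy.deepcopy(state)
    for operation, key, value in program:
        if operation not in allowed:
            raise ValueError("operation not permitted: " + operation)
        allowed[operation](result, key, value)
    return result
```

<p>The client sends only the program, for example <code>[("set", "title", "Final"), ("append", "tags", "patch")]</code>, and the server decides which operations it is willing to execute.</p>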

<p>All these changes take us further away from the filesystem model. Web resources will more and more combine some storage with some computation, including the ability to execute code in a controlled way. Smart resources will become more common, over dumb storage only resources.</p>

<h2>In other news</h2>

<p>Next in line for HTTP, hopefully, is the <a href="http://tools.ietf.org/html/draft-nottingham-http-link-header-06">Link header</a> which adds a new header for legacy document formats that do not include a native hyperlinking capability. This will allow relationships between these documents to be included in the retrieved resource, such as a link to metadata or other related resources. The HTTP replacement for the file system model is getting serious.</p>
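
<p>As an illustration, a server could build the header value for a PDF that has a separate metadata resource along these lines. The helper name and the example URI are made up, and picking <code>describedby</code> as the relation type is an assumption about what suits a metadata link.</p>

```python
def format_link_header(links):
    """Build a Link header value from (target URI, relation type) pairs."""
    return ", ".join('<{}>; rel="{}"'.format(uri, rel) for uri, rel in links)


# Hypothetical metadata resource sitting next to a PDF document.
header_value = format_link_header(
    [("http://example.org/report.pdf.meta", "describedby")]
)
```

<p>The resulting value would be sent as <code>Link: ...</code> alongside the PDF itself, so the relationship travels with the response even though the PDF format has no way to express it.</p>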

<!--a href="http://geekandpoke.typepad.com/geekandpoke/2009/11/service-calling-made-easy-part-1.htm"><img src="http://geekandpoke.typepad.com/.a/6a00d8341d3df553ef012875f312f9970c-pi" width="400"/></a-->
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2009/12/smart-resources-or-why-you-should-care-about-http-patch/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Social {space&#124;media&#124;policy} @Starbucks</title>
		<link>http://blog.technologyofcontent.com/2009/12/social-space-media-policy-starbucks/</link>
		<comments>http://blog.technologyofcontent.com/2009/12/social-space-media-policy-starbucks/#comments</comments>
		<pubDate>Sun, 06 Dec 2009 13:52:50 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[flickr]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[starbucks]]></category>

		<guid isPermaLink="false">http://blog.technologyofcontent.com/?p=164</guid>
		<description><![CDATA[Companies still don&#8217;t get social media. Starbucks may have 5,151,861 fans on Facebook but they don&#8217;t get the way social media actually involves engagement and changing the way you work.

I happen to like pictures of people unposed, and I sometimes take them when I am in the right mood. The decisive moment of course has [...]]]></description>
			<content:encoded><![CDATA[<p>Companies still don&#8217;t get social media. Starbucks may have 5,151,861 <a href="http://www.facebook.com/Starbucks">fans on Facebook</a> but they don&#8217;t get the way social media actually involves engagement and changing the way you work.</p>

<p>I happen to like pictures of people unposed, and I sometimes take them when I am in the <a href="http://www.flickr.com/photos/justincormack/2632480082/">right mood</a>. The <a href="http://en.wikipedia.org/wiki/Henri_Cartier-Bresson">decisive moment</a> of course has an important role to play in the history of photography; when I was on holiday I happened to take a picture of a man sitting on the street outside a Starbucks, and when I <a href="http://www.flickr.com/photos/justincormack/">posted it to Flickr</a> I thought I would find a Starbucks group to post it to. And that&#8217;s where I <a href="http://www.flickr.com/groups/starbuckscoffeecompany/discuss/72157622351418443/">found this hilarious thread</a>, which is a warning to people about what happens if you jump into social media at the deep end.</p>

<p>At the end of September, when the <a href="http://www.flickr.com/groups/starbuckscoffeecompany/">official Starbucks group</a> was started, in the social media frenzy of 2009, this <a href="http://www.flickr.com/groups/starbuckscoffeecompany/discuss/72157622351418443/">thread was started</a>, pointing out that many people had been asked not to photograph in Starbucks stores, or had been thrown out for taking photographs.</p>

<p>The official responses from the official moderator <a href="http://www.flickr.com/photos/42346097@N02/">analisamarie</a> started off fairly optimistically</p>

<blockquote>
  <p>Our formal policy is that all press-related photo inquiries need to contact press@starbucks.com prior to taking pictures in a Starbucks store. However, we have no formal policy around customers taking non-press related pictures in-store so if you hear otherwise, it might just be because your barista is camera-shy <img src='http://blog.technologyofcontent.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
  
  <p>Hmmm- good discussion! Sounds like there is a bit of confusion out there &#8211; let me take this back to my team and see what we can do to help. Thanks for bringing this up&#8230;more to come!</p>
</blockquote>

<p>Then got bogged down in legal</p>

<blockquote>
  <p>I am making great headway here and hope to have some detailed information for you all shortly. To give you an idea of what I&#8217;m up to, I am researching if some of our international markets have policies around photography in stores. Since international laws and regulations vary country by country, this is quite the task <img src='http://blog.technologyofcontent.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  I&#8217;m also working to see where the confusion is stemming from in some US stores. Again, stay tuned. I&#8217;m working on it!</p>
  
  <p>I have been meeting with various teams in the building and learning a lot about the world of policies <img src='http://blog.technologyofcontent.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  I hope to have something more concrete to share with you soon &#8211; thanks for your patience while I work through the details.</p>
  
  <p>I am getting closer to a final ruling each day. I have a big meeting on Wednesday and after that, I will post here with an update.</p>
  
  <p>I did have a very productive meeting on Wednesday of last week. We read through each of your comments and now the legal team is reviewing some of your feedback around public and private property. More meetings this week&#8230;more to come!</p>
</blockquote>

<p>The social networker starts networking internally</p>

<blockquote>
  <p>Still here and haven&#8217;t forgotten about you. I&#8217;m writing a blog this weekend/next week about this discussion and hope to post by the end of the week. I&#8217;ll keep you in the know. Have a good weekend!</p>
  
  <p>Just wrote a blog response that my legal team is currently reviewing&#8230;once I have final approval I&#8217;ll post it and let you know. I know it&#8217;s taken a while and I know I&#8217;ve said it before but I appreciate your patience. This has been quite an interesting project to work on and has involved many meetings with all sorts of teams throughout the building. SO glad you guys brought this to our attention so that we could sort it out for you!</p>
</blockquote>

<p>Hints on something more negative</p>

<blockquote>
  <p>We want to do this in the best way possible. There are many perspectives to take into consideration as part of this discussion. That means considering our baristas&#8217; daily work and their privacy, our customers&#8217; experience in our stores as well as your photographic expression of that experience. We have a lot of things to consider when making decisions that affect what happens in our stores. It has to be the right thing for our partners (employees) and customers, and it has to work well for stores around the world. Please continue to be patient while we work on a solution. In the meantime, I do ask that you continue to be respectful of customers and partners in our stores. If a barista asks you not to take pictures, please respect their request. More to come &#8211; Anali</p>
</blockquote>

<p>And more strongly hinting that the answer is going to be no. As several of the people in the thread point out they might as well close the sponsorship agreement with Flickr if they are going to say no to photography.</p>

<blockquote>
  <p>I have to add that this group isn&#8217;t explicitly here for the purpose of taking pictures inside Starbucks stores. That is one part of the Starbucks Experience but pictures of your experience out-of-store are welcome in this group as well.</p>
</blockquote>

<p>This is currently her last post, a few days ago, so the story may still unfold. Possibly not as dramatically as when <a href="http://www.schneier.com/blog/archives/2009/02/man_arrested_by.html">Amtrak police arrested someone for taking pictures for their photo competition</a> but it has slightly broader issues than that.</p>

<p>First there is the timetable thing. Do not bother with social media if you can&#8217;t make decisions quickly as an organization. Period. When issues are raised by social media, you have to respond fast, because things have a habit of going viral. Two months is a joke. Fix the response times before you do anything, or when something blows up you will bodge the response.</p>

<p>Second, be a bit lateral. I mean, surely someone would have thought about this issue, maybe the &#8220;camera-shy baristas&#8221;, if internally the social media plans were discussed, but obviously this social media plan comes from head office.</p>

<p>Third, social media is not about head office marketing, it is about running an open, transparent business. Flickr is not Facebook, and does not have quite the same vibe (and far fewer photos actually). It is mainly a subscription platform, and many of the people there are generally quite articulate. The issue is not that they are going to complain about your coffee, which has carefully planned responses; they are complaining about the way you treat them as a community. Walk right into it, unprepared.</p>

<p>Engagement has to be seen as a two way street; if you are not prepared to change in the social engagement and you treat it like an advertising campaign you may come unstuck.</p>

<p>The stupid thing is of course that Starbucks is a social space. The coffee shop, it&#8217;s Central Perk in Friends, it&#8217;s the pub in the British community, it&#8217;s the office when travelling. Casual photography has been entwined with the social space since the <a href="http://en.wikipedia.org/wiki/Brownie_%28camera%29">Kodak Brownie</a>; it has been reckoned that most of the data the human race has ever produced is in the form of photos (lacking a citation for that right now; would welcome one). Every mobile phone has a camera now. It is not actually hard to work out what the answer to the question should be; it even seems that that is already the policy, though no one can actually tell, as Starbucks policies appear to be secrets.</p>

<p>If your organization doesn&#8217;t grok social media, don&#8217;t copycat and try it anyway. Maybe try a dose of Enterprise 2.0 first, and I don&#8217;t mean writing blogs for lawyers to read. This online stuff is going to change the way things work; until you understand that, you will get it wrong.</p>

<p>Will be interesting to watch and see if Starbucks manage to sort this out.</p>

<p><em>Update</em> We finally have a complete copout policy &#8220;Here&#8217;s the answer that you&#8217;ve been waiting for &#8230;Photos are allowed in our stores for the purpose of sharing them in our Flickr group.&#8221;</p>

<p><a href="http://www.flickr.com/photos/justincormack/4158745476/" title="Man at Starbucks by Justin Cormack, on Flickr"><img src="http://farm3.static.flickr.com/2486/4158745476_786894d534.jpg" width="500" height="399" alt="Man at Starbucks" /></a></p>

<p>(Note picture taken outside Starbucks in a public space, without purchase of Starbucks beverage; however I cannot post it to the Starbucks pool as I don&#8217;t have a model release).</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.technologyofcontent.com/2009/12/social-space-media-policy-starbucks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
