Geert's Blog: February 2012

(originally posted on XMLHolland as day one and day two)

Cross-pollination, ER, and Pigs wearing Lipstick

a report on behalf of XMLAmsterdam by @grtjn

XMLPrague is great in so many ways, it is impossible to properly describe. I will be jotting down some thoughts and observations here anyhow, hoping to spark some good and funny memories of those who were there, and hopefully giving the others at least some impression of what they have missed..

Be there next time!

Before I start, I have to say this: Prague is absolutely one of my top favorite cities. I have been there four times (twice for XMLPrague, and twice on bicycle), and it still hasn’t lost any of its charm. Yes, it was slippery. Yes, it was (very) cold. And yes, some complained about coffee, queues, draft, power supply, and blocked views. But that doesn’t make Prague any less beautiful, nor XMLPrague any less interesting. On the contrary, it just added to the experience!

For those who never were..

About XMLPrague in general – for those who have never attended it (so far): this conference is quite different from all others I have attended. It is not just an ordinary set of talks with questions afterwards. It is also about discussing the talks, sharing thoughts (both with speakers and among the audience, even during each talk thanks to Twitter and the live TwitterWall), debating, inventing new ideas, starting new initiatives, making fun with/about each other, etc. It really feels much more like one big collective than just a bunch of geeks that happen to share some interests (or not).

Why is it so different, you might ask? Good question, no accurate answer. I don’t think XMLPrague is unique in this respect, by the way. Surely there are other communities as cohesive as this one. One contributing factor must have been the decline in the number of XML adepts this community has faced over the years. The remaining people are the more persevering – or nostalgic – ones. Anyhow, the community is surprisingly tight despite the fact that it is quite literally scattered across all continents.

The general theme of day 1..

The topics at XMLPrague have always been of a high level in various ways: highly technical, talks about the progress of the standards themselves, highly advanced topics, and – not least – high quality, not to mention all those renowned speakers! At this year’s conference it seems to go even further. Topics really go beyond standards; it is all about crossing ‘borders’, building bridges. Making different techniques work together, for better (the A-team – weird quadruple, but they always succeed) or worse (chimera, pigs wearing lipstick – ugly). Standards learning or borrowing from each other (cross-pollination). Boldly going where no one has gone before… – well, kinda.

Opening keynote..

This starts off with the opening keynote by Jeni Tennison. Her talk is about the fact that standards are often competing over the same space. HTML, JSON, XML, and RDF are all seeking Web dominance, for instance. Others try to take the best of some, and merge them together, building bridges or chimeras. These certainly fill a need, but the result isn’t always pretty, nor does it always work. Jeni says that instead of competing or merging, the standards should work together and coexist as they are. Each standard has its niche, and each should be used for what it is best at. It often requires just a little bit of glue to make it work.

(Jeni uses the project legislation.gov.uk she currently works on as example, in which she glues the four standards together through URLs.)

Morning sessions..

The remaining morning sessions, as well as the discussion panel just before lunch, more or less expand on (or ‘contradict’) this statement.

Eric van der Vlist looks back over the period from when the XML hype started till now. XML was said to be the data-interchange format, and was flexible enough to be applied to anything. XML advocates (quite literally) tried this, but things got overhyped. XHTML for instance has never become the success that some had envisioned. The biggest problem is that XHTML and HTML aren’t compatible (syntax-wise at least). Also, newer and more successful ideas like Web 2.0, HTML5, and JSON have overtaken XML and XHTML. But, says Eric, XML isn’t just about syntax. It is still built on top of a strong and flexible data model, and has lots of tooling. He suggests we should consider allowing data structures like triples and JSON into the data model. Unfortunately, he doesn’t say how to prevent it from becoming a chimera itself.

Robin Berjon teams up with Norm Walsh to show us there are good ideas in both HTML and XML. Looking beyond the syntax, they discuss ideas in which they take features from one domain and apply them to the other. You can already use XML together with CSS to show a web page. You can even include script elements in the XML. SVG is already mostly supported as part of HTML, but with HTML-like syntax. It is also possible to utilize the broad support of JavaScript to bring ideas from the XML domain into HTML. There is pdf.js, which might be altered to accept XSL-FO as input. There are JavaScript implementations of XSLT (Saxon-CE for instance), and XProc (written by Vojtech Toman). Web developers usually dislike the pointy brackets though. Robin suggests using CSS selectors to make transformations more accessible to them. He also suggests adding Schematron-like validation features to CSS.

As if the previous talks hadn’t gotten the crowd stirred enough yet, Anne van Kesteren surely did. In his relatively short presentation, he pretty much suggested dropping the strict XML well-formedness requirement, and allowing HTML/SGML syntax again. This was a good lead-in to the panel discussion that followed, in which convergence between XML and HTML was discussed. The panel consisted of all previous speakers, as well as Steven Pemberton. To summarize briefly: some argued that you can’t drop well-formedness in general, since it helps in the editing process for instance. Others argued that the end user, the one looking at web pages for instance, shouldn’t be ‘punished’ for the mistakes of developers. It was suggested to apply the Postel’s law approach (be conservative in what you send, liberal in what you accept). Before the end of day 2, a new W3C Community Group was set up to address the idea of improved XML Error Recovery.
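To make the panel's two positions a bit more tangible, here is a minimal Python sketch of my own (not something shown at the conference). It feeds the same broken markup to a strict XML parser and to a lenient HTML-style parser from the Python standard library. The sample markup and the helper names `strict_parse`, `TextCollector` and `lenient_text` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: the <b> element is never closed.
BROKEN = "<p>Hello <b>world</p>"


def strict_parse(markup):
    """XML-style: reject the whole document on the first well-formedness error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """HTML-style: keep going and salvage whatever text can be found."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def lenient_text(markup):
    parser = TextCollector()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


print(strict_parse(BROKEN))   # False: the strict parser refuses the input
print(lenient_text(BROKEN))   # Hello world: the lenient parser recovers the text
```

The strict parser is helpful to the author, who gets told about the mistake right away. The lenient one is helpful to the reader, who still gets to see the content.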

Afternoon..

After a lunch of mashed potatoes with schnitzel, the topics become less philosophical. Vojtech Toman starts with support for non-XML data in XProc. The XProc WG has looked at the need to handle such data within XProc pipelines. That is currently pretty much impossible without implementation-specific extensions. By adding a simple content-type attribute on inputs and outputs, and adding some extension functions and steps, the XProc processor ‘knows’ how to flow non-XML data appropriately. It will do conversion where appropriate. Vojtech shows an example in which he converts an image from PNG to JPEG just by specifying the appropriate input and output type. The idea for support for non-XML data was generally well received.

Next is a talk by George Bina. He talks about NVDL, which is a standard for handling validation of XML with mixed namespaces. It allows different parts to be validated with different schema languages (RELAX NG, XML Schema, etc.). The ISO standard also provides a few sophisticated features that allow detailed control of how each part of the document should be validated. Bina shows how XProc and XSLT are used to implement NVDL support in oXygen.

After poster presentations and a coffee break, the conference continues with a more delicate matter again: JSON. Jonathan Robie shows some quotes telling that XQuery was meant to be a universal query language. But JSON people dislike XML. That is why a new query language is being proposed that builds a bridge: JSONiq. There are two syntaxes: XQ-- and XQ++. The former is a stripped-down XQuery syntax, with support for JSON structures only. It could help JSON-minded people to leverage the power of XQuery (and thus possibly that of XML databases). The latter is based on full XQuery syntax, but extended with JSON constructors and expressions. It allows XQuery-minded people to interact with JSON applications.

Norm Walsh presents another way to make life easier for JSON people, and others that don’t like to learn XQuery. Corona is an open source project that exposes many MarkLogic features through a REST interface. It allows for responses in both JSON and XML. It provides features like CRUD and search, but also allows management of all kinds of indexes and search facets. It also allows you to upload transformations in XSLT and XQuery that can be applied in later requests.

The final presentation of day one is given by Steven Pemberton. He talks about the history of XForms, and the new features of XForms 2.0. XForms 1.0 didn’t work out well, but the standard became Turing complete with XForms 1.1. That proved its value. XForms 2.0 brings support for XPath 2.0, and AVTs. It also supports non-XML data as input, JSON in particular. Steven explains this was easiest to do by simply mapping JSON to XML. This is not trivial, but doable. He explains which mapping is being used in XForms, which is different from existing ones. The audience makes remarks about the yet-another-JSON-mapping, but Steven explains it is for XForms internal use only. End users don’t need to know about it.
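To give an impression of what such a JSON-to-XML mapping involves, here is a small Python sketch of my own. To be clear: this is not the mapping used in XForms 2.0 (I am not reproducing that here), just one hypothetical convention. The `type` attribute, the `item` element name and the function name `to_xml` are all assumptions made for the sake of the example.

```python
import json
import xml.etree.ElementTree as ET


def to_xml(name, value):
    """Map one JSON value to an element called `name` (hypothetical convention).

    Assumes every object key is a valid XML element name; handling keys that
    are not (spaces, leading digits, ...) is part of what makes a real mapping
    non-trivial.
    """
    elem = ET.Element(name)
    if isinstance(value, dict):
        elem.set("type", "object")
        for key, child in value.items():
            elem.append(to_xml(key, child))
    elif isinstance(value, list):
        elem.set("type", "array")
        for child in value:
            elem.append(to_xml("item", child))
    elif isinstance(value, bool):  # must come before the number check
        elem.set("type", "boolean")
        elem.text = "true" if value else "false"
    elif value is None:
        elem.set("type", "null")
    elif isinstance(value, (int, float)):
        elem.set("type", "number")
        elem.text = str(value)
    else:
        elem.set("type", "string")
        elem.text = value
    return elem


doc = json.loads('{"name": "XForms", "tags": ["a", "b"]}')
print(ET.tostring(to_xml("json", doc), encoding="unicode"))
```

Once the data is in this shape, ordinary XPath expressions can address it, which is presumably why mapping was the easiest route.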

Dinner and demojam..

Most of the attendees attend the social dinner in the Cloister on top of the hill in ‘The other side’ of Prague. The food is good and plentiful, as is the beer. Around nine, 10 contestants (including me) prepare for the demojam, sponsored by MarkLogic. Norm keeps a strict eye on the clock, as the contestants demo or dance their 5 minutes full. The applaud-o-meter helps the jury to come to a verdict. Robbert Broersma with “XSLT for hipsters” ties with Gerrit Imsieke with “floodit.xsl”. Norm generously grants both an iPad 2.

PS: I’ll blog more details about my demojam app ‘Mark my Tweet’ on my personal blog soon.

Theme of day 2..

The theme of the second day is mostly about the newest features in XML standards, very advanced usages of them, extending coverage of XML standards, and bridging between worlds.

Morning sessions..

Sharon Adler is supposed to open the second day, but unfortunately she can’t make it for health reasons. Instead, Jonathan Robie and Michael Kay get extra time to talk about the current status of various standards. A brief summary:

XPath and XQuery 3.0 are in Last Call. The addition of dynamic function calls, inline functions, windowing in FLWOR, try/catch, higher-order functions and such is mostly known. A new string concatenation operator, support for EQNames, and outer join in FLWOR are new to me. I’m guessing the SQL people will love that concat operator.

XML Schema 1.1 has reached Proposed Recommendation stage. It has a ton of new possibilities that lift most of the unwanted limitations. It includes conditional type assignment, allowing elements in multiple substitution groups, open content models, and more. Most notable is perhaps the addition of assertions, inspired by Schematron. Personally, I think it sounds like a feature that could be very popular, but could turn out to be a chimera as well.

XSLT 3.0 makes slower progress. Its streaming features require a lot of research, which is not something the W3C is meant to do. Apart from streaming it also includes features partly inspired by XQuery, like try/catch, iterate, evaluate, and such. Things I hadn’t yet heard about: matching templates on atomic values, accumulative counting in a for-each, breaking out of it, and packaging. Packaging takes modularization of code a step further, adding more control over visibility and dependencies. There will also be support for maps, and functions that can convert between JSON and maps. The latter sounds like a nice A-team approach to me!

Adam Retter continues with a presentation on RESTful XQuery after the coffee break. He shows that while most XQuery databases allow RESTful web applications, all of them rely on extensions and implementation-specific strategies. He instead proposes using function annotations based on JSR-311 to control exposure, and letting the XQuery processor take care of the request handling. You would only need to specify the accepted method, the URL pattern (including parameters), and input/output content types for each function that needs to be exposed. The idea is very well received, and Liam suggests W3C should perhaps pick it up. Yes, please!
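The annotation idea is easy to picture outside of XQuery too. Below is a rough Python sketch of my own in which decorators play the role of the annotations. It is not Adam's proposed syntax, and the names `rest`, `ROUTES`, `dispatch`, `get_talk` and `get_speaker` are all hypothetical. Each function only declares its method, URL pattern and output content type, and a generic dispatcher does the request handling.

```python
import re

# Registry filled in by the (hypothetical) annotation below.
ROUTES = []  # entries: (method, compiled URL pattern, content type, handler)


def rest(method, path, produces="application/xml"):
    """Expose the decorated function for `method` requests matching `path`."""
    # Turn "/talks/{talk_id}" into a regex with one named group per parameter.
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", path)
    regex = re.compile("^" + pattern + "$")

    def register(func):
        ROUTES.append((method, regex, produces, func))
        return func

    return register


@rest("GET", "/talks/{talk_id}")
def get_talk(talk_id):
    return f"<talk id='{talk_id}'/>"


@rest("GET", "/talks/{talk_id}/speaker", produces="application/json")
def get_speaker(talk_id):
    return '{"talk": "%s"}' % talk_id


def dispatch(method, url):
    """Generic request handling: find the first matching function and call it."""
    for route_method, regex, produces, handler in ROUTES:
        match = regex.match(url)
        if route_method == method and match:
            return produces, handler(**match.groupdict())
    return None, None  # no function is exposed for this request


print(dispatch("GET", "/talks/42"))
print(dispatch("GET", "/talks/42/speaker"))
```

The point of the design is that the functions themselves contain no request-handling code at all, so in principle they stay portable between processors that understand the same annotations.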

Alain Couthures follows with a presentation on supporting XQuery in the browser, by transforming it to JavaScript on the client side using XSLT 1.0. He argues that interpreting it with JavaScript would have been slow, while XSLT processing in the browser is fast. He elaborates on how he is building XQuery support into XSLTForms through XQueryX using YAPP, BNFs, and some uhm.. quite complex XSLT templates. Personally, I would be interested to compare its performance with, for instance, Saxon-CE.

Afternoon sessions..

After a good lunch of rice and sauce (or pasta with cheese sauce) – in which I get entangled in a loud discussion about American politics that I allegedly have caused –, we continue with vegetables. Evan Lenz presents the idea of a transformation language derived from XQuery, but altered to support expressing template-based processing of content, in an effort to bring the best of both worlds together. He calls his language Carrot, because of the use of the hmm.. caret sign. Code expressed in this language could be supported natively or (in theory) be translated into either XSLT or XQuery. He gives a brief demonstration in which he uses the online REx parser generator by Gunther Rademacher to create an XQuery parser for Carrot. From there he transforms the parsed tree into XSLT. The audience seems to like the idea, but some debate the chosen syntax. I could imagine that it feels a bit like yet-another-transform-language to some.

John Snelson, a colleague of Evan, continues on the same topic, but presents a different strategy. Instead of creating a new or meta language for transformations, he suggests using annotated functions. The functions serve as the template bodies, the annotations specify the matching conditions. In a swirling Prezi he also demonstrates that it is possible to use the new XQuery 3.0 function features to implement the matching algorithms, and the aforementioned REx parser generator to create a parser for the match patterns. I’m afraid quite a few in the audience lost track due to the fancy tumbling, sliding and zooming of his Prezi, but his demo does show how much the expressiveness of XQuery improves with its latest features.

After the last coffee break of this conference, we have just two presentations left. The first is by Charles Foster. He presents his work on XQJ. Unlike JDBC, for instance, XQJ is an API that explicitly bridges between Java and XQuery. He argues that techniques like Hibernate are suboptimal for marshaling complex object structures to flat/tabular relational database structures. The idea of XQJ is that your code talks to a façade. Method invocations get automatically relayed to the other side. Either you have a façade on the Java side, in which case Java classes and methods are generated from XQuery code, or a façade on the XQuery side, in which case XQuery functions are generated from Java code. There are implementations for MarkLogic, eXist and Sedna.

The last talk is presented by Lorenzo Bossi. He talks about the difficulties of maintaining sites like those owned by 7pixel. These include web shops with compare features that contain many items. Maintaining so many requires a collaborative editing approach like a wiki. This also includes maintaining the structure or templates behind the items on those sites. Such template changes can easily result in invalid documents. Lorenzo shows how a good update strategy can help. By checking document changes caused by template updates before committing them, problems can be detected at an early stage.

Closing keynote..

The XMLPrague conference is traditionally closed in unparalleled ways by Michael Sperberg-McQueen. Trying to summarize it is daunting, but this article isn’t complete without an attempt:

Michael draws parallels between John Amos Comenius and XML. Comenius was a religious man, the last bishop of the Unity of the Brethren, but exiled. He has been proclaimed the founder of modern education, and wrote many books, but his books were burnt, some lost forever. His legacy is still highly valued, but has to grow on you. The same goes for XML: it is also verbose, and you have to experience it to appreciate it. Religions tend to try to rule out deviations. Some say XML tries to do the same, and say that XML fails to do so, and has therefore failed as a whole. But XML hasn’t really failed, nor did Comenius really try to convert other people. Other formats, like many binary formats for instance, have their purpose. And though XML hasn’t taken over the world, it is used in more places than people are aware of. The XML formats of Word, Excel, and Open Office are examples of that. But XML is used in much less obvious areas as well, like for instance forms for writing speeding tickets in North Carolina.

Michael talks about NOTATIONs in DTDs. They allow referring to data that isn’t in SGML/XML format. It was never the intent to disallow other formats, but to have them coexist. This is why for instance media types in HTML are such a success. That pluralism is also seen in NVDL. XML tooling is a different case. XProc, XQuery and XForms do try to achieve universality by including support for non-XML formats like binary and JSON. Michael warns these attempts might become pigs wearing lipstick, but also sees great value in them. And even though supporting XQuery and transformations through JavaScript or CSS seems far-fetched, why not, if browsers won’t support it and people want it?

Michael explains Comenius’ religion was all about tolerance. Comenius also stood for universal education. He didn’t achieve his goals, but these take a long time. Michael thinks the XML community has been, and still is, striving for similar goals, and that XMLPrague helps to get a step closer. These warm thoughts conclude this year’s conference.

XMLAmsterdam and more..

For those who couldn’t come to XMLPrague this year, come to:

XMLAmsterdam 2012, September 19th, 2012

The call for papers will open soon. More details via http://twitter.com/xmlamsterdam and http://www.xmlamsterdam.com/!

I also collected a bunch of links to slides, proceedings, photos, and other blogs. In order of appearance:

· XMLPrague 2012 proceedings
http://www.xmlprague.cz/2012/files/xmlprague-2012-proceedings.pdf

· XMLPrague 2012 sessions and slides
http://www.xmlprague.cz/2012/sessions.html

· XMLPrague 2012 video archive
http://www.xmlprague.cz/2012/files/video-archive-1.html

· Conference photos taken by Thomas White
http://www.flickr.com/photos/thomas-white/collections/72157629283332045/

· Detailed notes of conference by Inigo Surguy
http://67bricks.com/xmlprague2012/xmlprague.html

· The newest W3C Community Group: XML-ER
http://www.w3.org/community/xml-er/

· Blog article of conference by Pieter Masereeuw
http://www.xmlholland.nl/content/congresverslag-xml-praag

· Lengthy blog article by speaker Eric van der Vlist
http://eric.van-der-vlist.com/blog/2012/02/15/xml-prague-2012-the-web-would-be-so-cool-without-the-web-developers/

Thursday, February 16, 2012

XMLPrague 2012, day one and two