Wednesday, 16 May 2007

RDF - An alternative to Topic Maps?

In my last post I described how Topic Maps appeared to be a possible mechanism for creating a flexible database, capable of storing a range of different, and possibly unrelated, items - a heterogeneous database. However, as I also mentioned previously, a couple of things worried me about Topic Maps: firstly, that work in the area seems to have gone a bit quiet in the last couple of years and, secondly, that there appears to be quite a large European bias to the companies and institutions working in the field - not that this is a bad thing, but what is everyone else up to?

These two things were playing on my mind, so rather than rushing headlong into Topic Maps, I decided to take a step back and spend a bit more time investigating other possible solutions. This search revealed that there is another possible solution to the problem - RDF, the Resource Description Framework (I'd already suggested in my last post that this could be a possibility). Indeed, RDF and Topic Maps appear to cover very similar ground and, from an initial study of the available information, it's hard to decide between the two. It may be that subtle differences exist between the two technologies, but my initial impression is that they are pretty much going head-to-head - another VHS versus Betamax, but with probably slightly less interest from the general public.

So what are the actual differences between RDF and Topic Maps, and which technology is the most suitable for my purposes? As with Topic Maps, the documentation for RDF is rather sparse and also mostly seems to be a couple of years old - the only book I could find that described both subjects was The Explorer's Guide to the Semantic Web, published in 2004. While this provided a good overview of both technologies, it didn't really cover them at a low enough level for my needs and, by the end, I was still left uncertain of the differences between the two. In addition, RDF has other related technologies - namely RDFS (RDF Schema) and OWL (Web Ontology Language - which surely should be called WOL, although this would have reduced the opportunities to publish books with bird covers) - and I felt that these weren't covered in enough detail. The only thing for it was to dig a bit deeper into each subject.

The definitive (and about the only) text on RDF is Shelley Powers' Practical RDF. This book did give low-level descriptions of RDF, RDFS and OWL and I'd definitely recommend it for anyone interested in RDF. My only criticism would be that the chapters describing RDF tools are now a bit out of date (the book was published in 2003) and I found quite a few typos and mistakes in the RDF examples - a new edition of the book is surely required. However, by the end, I felt that I'd gained enough knowledge to pursue RDF a bit further.

I think I now understand how RDF, RDFS and OWL fit together. My interpretation is that RDF is a mechanism that can describe simple relationships (subject-predicate-object expressions, known as triples, e.g. apples have the colour green). RDF can be serialized using RDF/XML (i.e. an XML-based language can be used to describe the RDF relationships). RDF Schema can then be used, in much the same way as XML Schema (XSD) is used with XML, to describe the contents and structure of the RDF (the classes and properties it uses), allowing relationships between them to be defined. OWL is effectively built on top of RDFS, extending it to allow the generation of complex class and sub-class relationships.
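To make the triple idea a bit more concrete, here is a minimal sketch in plain Python (no RDF library involved). It stores a few subject-predicate-object triples, including the apple example, and follows rdfs:subClassOf links to work out which classes a resource belongs to. The http://example.org/fruit# vocabulary, the class names and the helper functions match, superclasses and types_of are all invented for illustration, and the colour literal is simplified to a bare string; a real RDF toolkit would handle all of this differently.

```python
from typing import Optional

# A triple is just (subject, predicate, object).
Triple = tuple[str, str, str]

# Hypothetical vocabulary, invented for this illustration only.
EX = "http://example.org/fruit#"
# Standard RDF and RDFS property URIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

triples: set[Triple] = {
    # "apples have the colour green" (literal simplified to a plain string)
    (EX + "apple", EX + "hasColour", "green"),
    # Schema-style statements: what kind of thing an apple is
    (EX + "apple", RDF_TYPE, EX + "Apple"),
    (EX + "Apple", RDFS_SUBCLASS_OF, EX + "Fruit"),
    (EX + "Fruit", RDFS_SUBCLASS_OF, EX + "Food"),
}


def match(s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> list[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def superclasses(cls: str) -> set[str]:
    """Collect every class reachable from cls via rdfs:subClassOf."""
    found: set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for _, _, parent in match(s=current, p=RDFS_SUBCLASS_OF):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def types_of(resource: str) -> set[str]:
    """Direct rdf:type classes of a resource plus their superclasses."""
    direct = {o for _, _, o in match(s=resource, p=RDF_TYPE)}
    inferred: set[str] = set()
    for cls in direct:
        inferred |= superclasses(cls)
    return direct | inferred
```

Calling match(s=EX + "apple") lists everything stated about the apple, while types_of(EX + "apple") also picks up Fruit and Food, which were never stated directly. That second step is roughly the sort of thing the schema layer makes possible, at least as I currently understand it.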

I'm still not sure about the differences between Topic Maps and RDF (the RDF book didn't cover Topic Maps), so I think the only thing for it is to play with the two technologies. I'll let you know how it goes in the next post.