Umm... things aren't as straightforward as I'd initially imagined. On investigating what's required for heterogeneous databases I've discovered that there's a whole world of stuff out there that I know little or nothing about: facets, synonym rings, and controlled vocabularies, to name but a few. In fact this world has a name, Information Architecture, and even its own bible, the legendary Polar Bear Book. Hidden amongst all this lies the mysterious field of Topic Maps. From my initial skimming of the available resources and documentation on the subject, it sounds as if these may be exactly what I’m after.
I describe Topic Maps as being mysterious for two reasons. Firstly, after an initial flurry of activity in the area at the turn of the century, it appears as if things have now gone a bit quiet. The Topic Maps standard, ISO/IEC 13250, and the related XML Topic Maps (XTM) were last published in 2002 and 2001 respectively. The only book I could find on the subject, XML Topic Maps, was also published in 2002. Additionally, many of the main published links on the subject no longer exist. Secondly, the vast majority of the research, and of the companies working in the area, seems to be concentrated in Europe, while America seems to be ignoring their existence. Either Europe is at the cutting edge of this technology, or something equivalent must exist that’s being used in America and I haven’t discovered it yet! (RDF and OWL perhaps?)
Topic Maps – The Basics
In Topic Maps nearly everything is represented as a standalone entity, a topic, and relationships can be created between these topics. For example, if we wished to describe a book, we could create separate topics for the name of the author, the title of the book, the publisher of the book, etc., and other topics to describe the relationships between these topics (for example, that this book was written by this author). Because the relationships are themselves just topics, it's possible to create relationships for practically anything, and therefore it’s also possible to create a relational database that can be used to represent anything, which is exactly what I’m after. In addition, new topics and relationships can be added as more becomes known about a subject.
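To make the idea a little more concrete, here is a minimal Python sketch of the "everything is a topic" approach described above. It is only my own illustration, not the data model from the Topic Maps standard (which also has roles, occurrences and scopes that I've left out). The class names, the `related` helper and the "Example Book" / "Example Author" topics are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """Anything we want to talk about: a book, an author, even a relationship type."""
    name: str


@dataclass(frozen=True)
class Association:
    """A relationship between two topics; the kind of relationship is itself a topic."""
    kind: Topic
    source: Topic
    target: Topic


class TopicMap:
    """A deliberately tiny in-memory topic map (hypothetical, for illustration only)."""

    def __init__(self):
        self.topics = set()
        self.associations = []

    def topic(self, name):
        # Create a topic and register it in the map.
        t = Topic(name)
        self.topics.add(t)
        return t

    def associate(self, kind, source, target):
        # Record that `source` is related to `target` by `kind`.
        a = Association(kind, source, target)
        self.associations.append(a)
        return a

    def related(self, source, kind):
        # All topics reachable from `source` through associations of the given kind.
        return [a.target for a in self.associations
                if a.source == source and a.kind == kind]


tm = TopicMap()
book = tm.topic("Example Book")          # placeholder title
author = tm.topic("Example Author")      # placeholder author
publisher = tm.topic("Example Publisher")
written_by = tm.topic("written by")      # the relationship types are topics too
published_by = tm.topic("published by")

tm.associate(written_by, book, author)
tm.associate(published_by, book, publisher)

print([t.name for t in tm.related(book, written_by)])
```

Adding a new kind of information later (say, the building an architect designed) only needs new topics and associations, not a new class or table, which is the flexibility that caught my eye.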
In the next post I’ll hopefully describe how a topic map database can be created and how a range of different subjects can be stored. For anyone who can’t wait till then, some useful Topic Map information can be found at the following links:
"Metadata? Thesauri? Taxonomies? Topic Maps!" http://www.ontopia.net/topicmaps/materials/tm-vs-thesauri.html
"Topic Maps and Relational Databases" http://www.xml.com/pub/a/2003/03/05/tmrdb.html
Sunday, 15 April 2007
Monday, 2 April 2007
In search of database flexibility
Initially, to refresh my knowledge of ASP.NET and to get up to speed with SQL and relational databases (RDBs), I built a simple web site capable of storing book and CD information. (This was mainly based on the examples given in this book: Beginning ASP.NET 2.0.)
However, even at this stage, I was beginning to realise that it was going to be difficult to expand the RDB tables to provide a generic means of storage. To solve the problem I tried applying some object orientated design, creating a base table to contain all the common information, from which all the other tables would be derived. However, it doesn't appear that relational databases lend themselves particularly well to object orientation, and I was still left with the situation where a new table would need to be created for each new type of item to be stored.
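For anyone wondering what the base table approach looks like, here is a rough sketch using Python's built-in sqlite3 module. It isn't my actual schema: the table names (`item`, `book`, `cd`), the columns and the `add_cd` helper are made up for illustration, and SQLite is used only so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One base table holds what every item has in common; each item type
# then gets its own "derived" table that shares the base table's key.
conn.executescript("""
CREATE TABLE item (
    item_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE book (
    item_id INTEGER PRIMARY KEY REFERENCES item(item_id),
    author  TEXT NOT NULL
);
CREATE TABLE cd (
    item_id INTEGER PRIMARY KEY REFERENCES item(item_id),
    artist  TEXT NOT NULL
);
""")


def add_cd(conn, title, artist):
    """Insert the common part into item, then the CD-specific part into cd."""
    cur = conn.execute("INSERT INTO item (title) VALUES (?)", (title,))
    item_id = cur.lastrowid
    conn.execute("INSERT INTO cd (item_id, artist) VALUES (?, ?)", (item_id, artist))
    return item_id


cd_id = add_cd(conn, "Eyes Open", "Snow Patrol")

# Reading a CD back means joining the base table to the derived table.
row = conn.execute(
    "SELECT i.title, c.artist FROM item i "
    "JOIN cd c ON c.item_id = i.item_id WHERE i.item_id = ?",
    (cd_id,),
).fetchone()
print(row)
```

The snag is easy to see here: storing a new type of item, such as a building, still means writing another CREATE TABLE statement and another helper like `add_cd`.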
To categorize the items added to the database, and as a means of navigating, I used a three-level hierarchy of Department, Section and Category. So, for example, "Eyes Open" by Snow Patrol was stored under "Entertainment - Music - Indie". This seemed to provide a fairly sensible breakdown for music and books, but it seemed that it might prove a bit restrictive for other sorts of item. It forced items to always be placed into a category (i.e. items couldn't be stored directly in departments or sections), and categories couldn't be broken down any further to give more specific classifications.
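The restriction is easier to see in a schema. The following is a sketch of how a fixed Department, Section and Category hierarchy might be laid out, again using Python's sqlite3 module purely for convenience. The table and column names are my guesses for illustration rather than the ones I actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Exactly three levels, each tied to the one above it, and every item
# must point at a category (the bottom level).
conn.executescript("""
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE section (
    section_id    INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    department_id INTEGER NOT NULL REFERENCES department(department_id)
);
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    section_id  INTEGER NOT NULL REFERENCES section(section_id)
);
CREATE TABLE item (
    item_id     INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(category_id)
);
""")

dept = conn.execute("INSERT INTO department (name) VALUES ('Entertainment')").lastrowid
sect = conn.execute(
    "INSERT INTO section (name, department_id) VALUES ('Music', ?)", (dept,)
).lastrowid
cat = conn.execute(
    "INSERT INTO category (name, section_id) VALUES ('Indie', ?)", (sect,)
).lastrowid
conn.execute("INSERT INTO item (title, category_id) VALUES (?, ?)", ("Eyes Open", cat))

# Navigating to an item always walks all three levels.
path = conn.execute("""
    SELECT d.name, s.name, c.name
    FROM item i
    JOIN category c   ON c.category_id = i.category_id
    JOIN section s    ON s.section_id = c.section_id
    JOIN department d ON d.department_id = s.department_id
    WHERE i.title = ?
""", ("Eyes Open",)).fetchone()
print(" - ".join(path))

# An item with no category (for example, one I only know how to place
# at department level) is rejected by the schema.
try:
    conn.execute("INSERT INTO item (title, category_id) VALUES (?, NULL)",
                 ("Some building",))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Because the three levels are baked into the table structure itself, there is nowhere to hang an item at department or section level, and no way to split "Indie" into anything finer without changing the schema.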
Indeed, when I came to try to store more esoteric information, I soon discovered that the rigid nature of my database made it very difficult to add anything that didn't conform to a 3-layer classification scheme. For example, I wanted to add information about buildings but, knowing nothing about architecture, I had no idea what parent category ("Department" in my classification scheme) architecture would belong to, nor what sub-categories it would contain.
Now, whilst I could have gone off and studied more about architecture to know how it should be classified, this wasn't really what I wanted. Instead I wanted a database that would give me the flexibility to add items at any level, plus the ability to reclassify items and insert hierarchies as my knowledge of a subject increased. Clearly my initial database wasn't up to the job. As a result, I've now set off in search of a database structure that can give me the flexibility I require.