Tuesday 3 July 2007

Using SemWeb with MySQL

Getting SemWeb to work with MySQL is pretty much the same as for SQLite (see the last post). Again, using the news.rdf file from Mozilla, the command line to enter data into the MySQL database now becomes:

rdfstorage news.rdf --out "mysql:rdf:database=news;username=test"

This assumes that a new database (schema) called "news" has been created in MySQL and that a user "test" exists that doesn't require a password (both these things can be done using the MySQL Administrator UI).

Initially running this command line in the installed version of the SemWeb\bin directory results in the following error being generated:

System.IO.FileNotFoundException: Could not load file or assembly 'MySql.Data, Version=1.0.8.0, Culture=neutral, PublicKeyToken=c5687fc88969c44d' or one of its dependencies. The system cannot find the file specified.


This error means that the MySQL.Data.dll is missing, so the first step is to get this from MySQL ADO.Net Data Provider 5.0.7

After placing this DLL in the bin directory and running the command line again, the following error will appear:

System.IO.FileLoadException: Could not load file or assembly 'MySql.Data, Version=1.0.8.0, Culture=neutral, PublicKeyToken=c5687fc88969c44d' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)


In this case the important part is the "manifest definition does not match the assembly reference" bit - as happened with the SQLite DLL, this MySQL DLL is newer than the version that was used to build the SemWeb.MySQLStore.dll and so the 2 DLLs are incompatible. We need to rebuild this SemWeb DLL.


The steps to do this are as follows:

  • Open the SemWeb solution in Visual Studio
  • Add a new project of type "Class Library" and called "SemWeb.MySQLStore"
  • Delete the default "Class1.cs" that gets added to the project
  • Choose to add an existing item to the project and pick "MySQLStore.cs" from the SemWeb\src directory
  • Add a reference to the "MySQL.Data.dll " that should already be in the SemWeb\bin directory
  • Add a reference to the SemWeb project
  • In the project properties, set the output path to the SemWeb\bin directory

These are the exact same steps that were required for the SQLiteStore DLL (although obviously referencing different files). However, building the MySQLStore DLL now results in the following compiler error:

The type or namespace name 'MySqlConnection' could not be found (are you missing a using directive or an assembly reference?)

To fix this error the following needs to be added to the top of the MySQLStore.cs file:

using MySql.Data.MySqlClient;


So the complete set of using directives at the top of the file should now appear as:


using System;
using System.Collections;
using MySql.Data.MySqlClient;
using System.Data;

The project should now rebuild successfully and the new SemWeb.MySQLStore.dll should now appear in the SemWeb\bin directory.

Unfortunately this still isn't the end of the process - running the command line again results in the following error:

MySql.Data.MySqlClient.MySqlException: Table 'news.rdf_statements' doesn't exist

The DLL is now ok, its just that the database that its creating has a few problems; inspecting the database in MySQL Administrator shows that the "rdf_literals" and "rdf_entities" tables have been created but the "rdf_statements" table hasn't been made.

Setting the SEMWEB_DEBUG_SQL environment variable (type "set SEMWEB_DEBUG_SQL=1" at the command prompt) and then running the command line again generates some debug information (alternatively put a break point on the "
CreateTable" function in the SQLStore.cs file and run the program in the debugger, after first setting up the command line arguments in the project properties). The error generated is:

System.FormatException: Input string was not in a correct format.

This is caused by the SemWeb database "Open" function (from MySQLStore.cs) querying the version of the database and then trying to set this into the .NET Version class; however, the version string returned from my version of MySQL is "5.0.41-community-nt" and this doesn't match with the expected .NET version string format, which is expecting only integers seperated by a period character ('.') - hence an error is thrown on the first call to "Open" and so the creation of the first database table ("rdf_statements") fails.

To fix this I've just put a try-catch around the setting of the version string, so basically I just ignore when the version can't be set.




private void Open()
{
if (connection != null)
return;
MySqlConnection c = new MySqlConnection(connectionString);
c.Open();
connection = c; // only set field if open was successful

using (IDataReader reader = RunReader(@"show variables like 'version'"))
{
reader.Read();
try
{
version = new Version(reader.GetString(1));
}
catch(Exception e)
{
if (Debug) Console.Error.WriteLine(e.Message);
}
}
}


Finally, rebuilding the SemWeb DLL and re-running the command line successfully adds the RDF to the database:

news.rdf 0m0s, 81 statements, 1728 st/sec
Total Time: 0m2s, 81 statements, 30 st/sec

4 comments:

christian said...

Hi,

I tried this on Windows XP and it worked fine. But when I try it on Windows Server 2003 I get the following error message when I enter
rdfstorage news.rdf --out "mysql:rdf:database=news;username=test"

Error:
System.NotSupportedException: The storage type in (SemWeb.Stores.MySQLStore, SemWeb.MySQLStore) could not be found.
at SemWeb.Store.Create(String spec, Boolean output)
at SemWeb.Store.CreateForOutput(String spec)
at RDFStorage.Main(String[] args)

Can anyone help me?
Thank You

formica said...

I have the same error of christian under linux. Is there a solution?

Regards
Roberto

Steve said...

Sorry to hear you're having problems. I don't actually have a Windows 2003 or Linux platform to build on, so can't test for the error you're getting.

Is it Mono that you're using on Linux? If so then I'd have thought that this would have been tested in the standard SemWeb release. It may be worth you posting a help request on the main SemWeb forum.

Sorry I can't be of more help.

Steve

Pekka Henttonen said...

I took a look at the source. It seems that the documentation has an error. The string should start with "sqlserver", like this "sqlserver:mysql:rdf:Database=rdf..."

At least this brings me forward one line of code - to a new error message :-)