A Semantic Web Primer

Tuesday 20 February 2007 21:39

Grigoris Antoniou and Frank van Harmelen (MIT Press, 2004)

The reason I bought this book is described within: "We think there is a need [for a book on the Semantic Web] because on the Web there are too many sources of varying quality and too much information. Some information is valid, some outdated, some wrong, and most sources talk about obscure details." I myself was pretty much lost, especially about what is important and what isn't, and what is the vision behind it.

The book is written very clearly, with a simple structure: The Semantic Web Vision, XML, RDF, OWL, Inference, Applications, Ontology Engineering, Conclusions and Outlook. It is probably best to read the book after you have wrestled your way through some of the online documentation. For a complete novice the book will probably be a bit too much at once. If you already have some idea of what the Semantic Web is, the book can make all the pieces of the puzzle fall into place.

For me the description of XML Schema (the successor to the DTD) was the most interesting part, since I learnt XML at a time when DTDs were the only way. RDF Schema ('preceding' OWL as a simple ontology language) was also completely new to me. The book taught me some things I didn't know about RDF and about OWL. Now I know the idea behind the three types of OWL and why they exist. And that's quite useful. If you are new to logical inference, Horn clauses and such, there is an interesting chapter about that too.
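To give an idea of what a Horn clause does in this setting, here is a minimal sketch of my own in Python, not taken from the book. It assumes facts are stored as (subject, predicate, object) triples and applies one rule, "if x is located in y and y is located in z, then x is located in z", until nothing new can be derived. The predicate `locatedIn`, the place names and the function names are all made up for illustration.

```python
# Facts as (subject, predicate, object) triples; all names are hypothetical.
facts = {
    ("Amsterdam", "locatedIn", "Netherlands"),
    ("Netherlands", "locatedIn", "Europe"),
}


def transitive_rule(known):
    """Horn clause: locatedIn(x, z) :- locatedIn(x, y), locatedIn(y, z)."""
    derived = set()
    for (x, p1, y1) in known:
        for (y2, p2, z) in known:
            if p1 == p2 == "locatedIn" and y1 == y2:
                derived.add((x, "locatedIn", z))
    return derived


def forward_chain(known, rules):
    """Apply every rule repeatedly until no new facts appear (a fixed point)."""
    known = set(known)
    while True:
        new = set()
        for rule in rules:
            new |= rule(known)
        if new <= known:  # nothing new was derived, so stop
            return known
        known |= new


closure = forward_chain(facts, [transitive_rule])
print(("Amsterdam", "locatedIn", "Europe") in closure)
```

Prolog works the other way around (it starts from a query and searches backwards), but the kind of rule is the same: a conjunction of conditions that implies a single conclusion.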

However, I came away from the book with the idea that the Semantic Web is unlikely to succeed, for the following reasons:

The syntax of all these XML languages is too gruesome for humans and too bloated for computers.
The book speaks of First Order Predicate Logic representations and reifications in OWL and complex logical constructs: transitivity, symmetry, class unions, an open world assumption, Description Logic, etc. However, when logical inference is treated, the book's authors fall back on the proven method of using a variant of Horn clauses, as used in Prolog, for example. There is a large gap between what is theoretically possible and what seems to be feasible.
The Semantic Web does not require identical concepts to be represented the same way everywhere. This is a problem for the concepts themselves, but it becomes really problematic when the internal representation of concepts differs from domain to domain. The Semantic Web allows too much freedom in this respect. To connect two separate domains you need a conversion layer, just as you would when connecting separate XML files.
The question of how all semantic knowledge of the world is to be integrated is left unanswered. I am assuming a future Google creates a gigantic private web built from all the local semantic webs. But even then the book does not even begin to answer how this huge web is to be queried.
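To illustrate the third point, here is a rough Python sketch of the kind of conversion layer I mean. It is my own example, not something from the book. It assumes two sources that describe the same book with different, invented vocabularies (`lib:` and `shop:`) and different identifiers; the mapping tables have to be written by hand before the two can be queried as one.

```python
# Two hypothetical sources describing the same book in different vocabularies.
library_triples = [
    ("book:1", "lib:author", "Grigoris Antoniou"),
    ("book:1", "lib:title", "A Semantic Web Primer"),
]
shop_triples = [
    ("item:42", "shop:writtenBy", "Frank van Harmelen"),
    ("item:42", "shop:name", "A Semantic Web Primer"),
]

# The conversion layer: hand-written mappings between the two domains.
SHOP_TO_LIB = {"shop:writtenBy": "lib:author", "shop:name": "lib:title"}
SAME_AS = {"item:42": "book:1"}


def convert(triples, predicate_map, subject_map):
    """Rewrite subjects and predicates into the target vocabulary."""
    return [
        (subject_map.get(s, s), predicate_map.get(p, p), o)
        for (s, p, o) in triples
    ]


# Only after conversion can both sources be queried together.
merged = set(library_triples) | set(convert(shop_triples, SHOP_TO_LIB, SAME_AS))
authors = sorted(o for (s, p, o) in merged if s == "book:1" and p == "lib:author")
print(authors)
```

Without the two mapping tables the query would find only one of the authors, and nothing in the data itself tells you that the tables are needed or what they should contain.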

What can I say? The book is a really good introduction to this field, and it has made me think. Maybe even a little too much...