A Question of Semantics

I’ve read a few blog articles lately on Linked Data and the Semantic Web. Ross Bates, Paul Miller, Ian Davis, and Semantics Incorporated have all explored the ideas of Linked Data and Web 3.0/Semantic Web. This got me thinking about the Semantic Web and the direction of efforts to reform web data into a more ‘object-oriented’ model. There are a number of resources out there that cover it beyond the items I linked. In the end, it boils down to structuring data on the web so that machines can understand context and thus manipulate data the way humans do, through reasoning. While this is a noble effort, and promises to restructure the way the web operates, I have to wonder if the entire approach isn’t slightly backwards.

The Semantic Web certainly feels nice on paper, and in practice, the idea of linking data objects and allowing intelligent agents to understand these objects seems extremely logical. The ability for an object to be sorted, recognized, stored, and even manipulated and calculated by the agent without human interaction opens the door to promising applications and real leaps in the future of the web. Imagine a web store that had objects built right into the products, that allowed your browser to understand a “price”, and let your agent perform tasks that it knew belonged to prices, like automating currency conversion or calculating sales tax, without the web store needing to involve itself in the discussion. Not only that, but your agent knew what the item you were purchasing was, knew that it was something you needed for a defined task, and knew it no matter which store you purchased it from. Without you needing to browse, your agent could find the best fit at the best price at the closest location, and you’d never even need to click a button beyond defining your needs. Even this simple example is only the tip of the Semantic iceberg, and developers far smarter than I am are likely to find new ways to link and manipulate data that go well beyond it.
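To make the web store idea a little more concrete, here is a rough Python sketch of what an agent might do once it can recognize a “price” as an object. Everything in it is an assumption made for illustration: the Price and Offer classes, the exchange rates, the tax rate, and the store names are hypothetical and do not come from any real Semantic Web vocabulary or library.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical exchange rates and tax rate, held by the agent rather than the store.
RATES_TO_USD = {"USD": Decimal("1.00"), "EUR": Decimal("1.40")}
SALES_TAX = Decimal("0.08")
CENT = Decimal("0.01")


@dataclass(frozen=True)
class Price:
    """A price the agent understands: an amount plus a currency code."""
    amount: Decimal
    currency: str

    def to_usd(self) -> "Price":
        # Currency conversion is done by the agent, not the store.
        return Price((self.amount * RATES_TO_USD[self.currency]).quantize(CENT), "USD")

    def with_tax(self, rate: Decimal = SALES_TAX) -> "Price":
        # Sales tax is also something the agent knows how to apply to a price.
        return Price((self.amount * (1 + rate)).quantize(CENT), self.currency)


@dataclass(frozen=True)
class Offer:
    """The same item, identified the same way, offered by some store."""
    store: str
    item: str
    price: Price


def best_offer(offers, item):
    """Pick the cheapest offer for an item after normalizing every price to USD."""
    matching = [o for o in offers if o.item == item]
    return min(matching, key=lambda o: o.price.to_usd().amount)


offers = [
    Offer("Store A", "4x2 wooden dowel", Price(Decimal("3.00"), "USD")),
    Offer("Store B", "4x2 wooden dowel", Price(Decimal("2.00"), "EUR")),
]
cheapest = best_offer(offers, "4x2 wooden dowel")
total = cheapest.price.to_usd().with_tax()
print(cheapest.store, total)
```

The point of the sketch is only that neither store had to implement conversion or tax logic; the agent carries that behavior because it knows what a price is.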

But this example, while very simple, highlights some problems that I see with semantic web data.

  1. The developer must be inclined to define his objects. The current definition of the semantic web requires that those creating the website create, define, and code their objects to allow their context to be understood, which leads us to…
  2. We are allowing the developers to create context for us, instead of allowing us to define our own experience and values.

In the example of the web store above, we may be quite happy to allow the developer to define our experience for us. A price is a price, for instance, and a 4×2 wooden dowel is just what it says it is. But what if you are not in a store? What if you’re reading a blog post by a friend, talking about your dog? Your friend may define the dog as an object, but how would you tell the web and your agents that this dog was YOUR dog? How do you create a web that is relevant to you, beyond the definitions provided? How do you tell an agent that you are the manufacturer of an item, and carry that data across the web?

I believe that this leaves some room open for clients and services that synchronize to create social semantic webs. Instead of defining the semantics on the server side, the client could be instructed to create connections and synchronize them to a social web service. You could define a cat as a cat, and your dog as your dog, and as the system began to crawl, your definitions would shape the way that your agents viewed the web. The social web service would create a background layer to share trusted data and definitions, to link and identify commonalities, and to establish identities and build peer relationships that would increase relevancy. Indeed, these clients and services could even stand in the background, browsing your web patterns for relevant content and presenting you with options on linking. Over time they could attempt to build new links intelligently, based upon data you have previously provided, with simple options to allow you to accept, reject, or redefine an identified pattern.
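As a very rough sketch of that client-side layer, the Python below keeps my own definitions in front of the ones published by site authors, and lets me accept or reject a link the client suggests. The identifiers (ex:dog42, ex:cat7), the dictionaries, and the review_suggestion helper are all hypothetical names invented for this illustration; synchronizing to a social web service is left out entirely.

```python
from collections import ChainMap

# Hypothetical definitions published by site authors (server side).
published = {
    "ex:dog42": {"type": "Dog"},
    "ex:cat7": {"type": "Cat"},
}

# Hypothetical definitions I created in my own client.
personal = {
    "ex:dog42": {"type": "Dog", "owner": "me"},
}

# Lookups consult my personal layer first, then fall back to what was published.
view = ChainMap(personal, published)


def review_suggestion(subject, facts, accept):
    """Record a link the client suggested, but only if the user accepts it."""
    if accept:
        # Merge the suggested facts over whatever is currently known.
        personal[subject] = {**view.get(subject, {}), **facts}
    return accept


# The client noticed a pattern and proposes two links; I accept one and reject the other.
review_suggestion("ex:cat7", {"owner": "neighbor"}, accept=True)
review_suggestion("ex:dog42", {"owner": "someone else"}, accept=False)

print(view["ex:dog42"])
print(view["ex:cat7"])
```

The published definitions are never modified; only my personal layer changes, which is the piece a social service could later share with trusted peers.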

I believe that a pure semantic web won’t be so easily reached just by defining RDF and creating server-side code. There will always be the lazy, and those who simply lie about objects and data to take advantage of, or hide themselves from, a search system or agent. In order for a truly semantic web to be realized, it will take client-side applications (browsers, OS, agents), services (semantic banks, trust centers, universal IDs, social connections), and back-end semantic definitions to allow us to utilize the data intelligently in a way that applies to us personally.

July 27th, 2009 | Blogosphere, Semantic Web, Social Media, Web 3.0

4 comments

I found some fascinating articles at http://www.mindswap.org/papers/ regarding building reputation networks and data accountability. I’m still reading through them, but they certainly seem to speak broadly to these subjects. They are worth a look.

Comment by xian — July 29, 2009 @ 8:32 pm

Yup – I also think there are opportunities in the market for trusted intermediaries. Companies that can aggregate, integrate, and verify triples – they’ll be the trusted source for customers. I wouldn’t be surprised to see search become more niche in a scenario like this.

Comment by Ross Bates — July 29, 2009 @ 7:20 am

You’re right about getting sucked in. I’ve spent most of the evening reading the W3C specifications for RDF / RDFa as well as N3. There’s a lot out there to digest.

I can see what you’re saying about extending objects, and I agree that there are inherent dangers. I think this points even more to the need for creating these ‘trusted’ objects that can accept or reject extensions, either explicitly or implicitly. Being able to define both trusted attributes and trusted social connections allows you to create an object that retains meaning.

Of course, this creates its own issues, because limiting objects to approved attributes creates a conflict of interest. For instance, if you’re building a rating or review system, an object could simply reject these types of items and consider them untrustworthy. So the question then becomes: How do you create an object with attributes you can TRUST without creating a wall that prevents honest modification? Perhaps you simply have to create separate trust circles for the base object and the external operations that extend it? That way, you could extend an object and accept extensions from connections that you trust.
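One way to picture the separate trust circles is the Python sketch below. It is only a thought experiment under my own assumptions: the TrustedObject class, its fields, and the names of the people involved are all hypothetical, and here the extension circle is chosen by the reader rather than by the object’s owner, so the owner cannot wall off honest reviews.

```python
from dataclasses import dataclass, field


@dataclass
class TrustedObject:
    """An object whose base attributes and extensions are guarded by separate circles."""
    name: str
    owner: str
    base_circle: set = field(default_factory=set)       # may edit base attributes
    extension_circle: set = field(default_factory=set)  # may add extensions
    base: dict = field(default_factory=dict)
    extensions: dict = field(default_factory=dict)

    def set_base(self, author, key, value):
        # Only the owner or the base circle may change what the object *is*.
        if author != self.owner and author not in self.base_circle:
            return False
        self.base[key] = value
        return True

    def extend(self, author, key, value):
        # Extensions come from the reader's trusted connections.
        if author not in self.extension_circle:
            return False
        # An extension may add to the object but never overwrite a base attribute.
        if key in self.base:
            return False
        self.extensions.setdefault(key, []).append((author, value))
        return True


dowel = TrustedObject(name="4x2 wooden dowel", owner="manufacturer")
dowel.extension_circle = {"friend"}  # chosen by me, the reader, not by the owner

results = [
    dowel.set_base("manufacturer", "material", "wood"),  # owner defines the base
    dowel.extend("friend", "rating", 2),                 # honest review from someone I trust
    dowel.extend("stranger", "rating", 5),               # unknown source is ignored
    dowel.extend("friend", "material", "plastic"),       # cannot redefine the base
    dowel.set_base("friend", "material", "plastic"),     # not in the base circle
]
print(results)
print(dowel.extensions)
```

Keeping the two circles apart means the manufacturer controls what the object is, while I control whose opinions about it my agent will listen to.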

Comment by xian — July 28, 2009 @ 8:50 pm

Christian – some nice thoughts…. watch out though, you are going to get sucked in! :)

I definitely agree with you about the need for smarter client tools to synthesize the web of data. Like you said, a cat is just a cat. It’s not representative of my cat until it hasColor: white, hasTemperament: friendly, resides: Dallas…. anyway, you get the point: these things need to be aggregated and glued together in the right way that applies to me. You are spot on there.

Where I would disagree is that RDF requires someone to define the object ahead of time. Or, let me try it this way…. you can define the object ahead of time if you’d like to, but anyone else can come along and extend it in any way they see fit. That’s the power of an RDF data model deployed on the WWW – if I want to create a new attribute for your cat class called hasFleas: I can do it without asking for your blessing.
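As a loose illustration of that open model, the Python sketch below treats a graph as nothing more than a set of (subject, predicate, object) statements. It deliberately avoids any real RDF library, and the ex:, other:, and me: prefixes, along with the describe and properties_for helpers, are hypothetical names made up for the example.

```python
# A graph is modeled here as a plain set of (subject, predicate, object) statements.
graph = set()

# The original author publishes a cat class with one attribute.
graph.add(("ex:Cat", "rdf:type", "rdfs:Class"))
graph.add(("ex:hasColor", "rdfs:domain", "ex:Cat"))

# Someone else adds a new attribute for that class, without touching the
# original statements and without asking permission.
graph.add(("other:hasFleas", "rdfs:domain", "ex:Cat"))

# A third person describes their own cat using both vocabularies.
graph.add(("me:myCat", "rdf:type", "ex:Cat"))
graph.add(("me:myCat", "ex:hasColor", "white"))
graph.add(("me:myCat", "other:hasFleas", "false"))


def properties_for(cls):
    """Every attribute anyone has declared for a class."""
    return {s for s, p, o in graph if p == "rdfs:domain" and o == cls}


def describe(subject):
    """Everything anyone has said about a subject, as a predicate-to-object mapping."""
    return {p: o for s, p, o in graph if s == subject}


print(sorted(properties_for("ex:Cat")))
print(describe("me:myCat"))
```

Because statements are only ever added, the original definition of the cat class stays intact while the new hasFleas attribute sits alongside it.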

Yes, it sounds like pure insanity at first: anyone can say whatever about anything. That being said, I would counter that we’re already operating under this model today with the web of documents, and it’s proving to work very well. The URLs we create to connect people, places, things… anyone can do it and the links that have formed are mind boggling. RDF and Linked Data are just taking that next step towards structured everything.

Comment by Ross Bates — July 28, 2009 @ 3:22 pm