Semantic Microblogging Architecture

SMOB (Semantic Microblogging) is an interesting system because its architecture is similar to the one I will need to build the scientific profiling tool for Twitter. SMOB was described in a paper on microblogging by Passant et al., which also presented an initial prototype that brings microblogging in line with the Linked Data guidelines. At the time of writing, microblogging services were centralised and confined (and they still are today), and work remains to be done before microblogging becomes part of the Social Semantic Web. The authors first gave an overview of classical microblogging, explained why it deserves attention, and highlighted the issues it raises. They then showed how the Semantic Web can help resolve these issues and what it offers that traditional services cannot. In their view, the Semantic Web is an elegant way to open this data up from proprietary silos, to provide machine-processable data and metadata for microblogging, and to deliver an open and distributed microblogging environment.

In the next section the authors wrote about the architecture of a semantic microblogging service. To model the metadata of a microblogging service, they relied on two ontologies that are widely used on the Social Semantic Web: FOAF and SIOC. Although microblog posts are by nature relatively light in content, it is worthwhile to identify the things they mention, which is a problematic area for current systems. There is therefore a need either to extract URIs or concepts (semi-)automatically from the plain text, or to let users annotate their posts much as they already do on Twitter with hash tags, but with more powerful processing that can derive URIs from those tags. The authors think that using URIs and RDF to model this data can be useful for two reasons:

  1. The content is modelled with existing, unambiguous resource definitions.
  2. Microblogging entries are opened up to the Linked Data Web, provided these URIs are available on the Web and, better still, already linked to other content.

They illustrated this with a clear example:

Rather than writing “Visiting #Eiffel Tower in #Paris”, someone could microblog “Visiting #dbp:Eiffel Tower in #geo:Paris France”, so that the processor could extract the two hash tags and, thanks to a predefined prefix mapping, query DBpedia and GeoNames to retrieve the URIs of the related concepts. The update would then be linked automatically to existing URIs rather than to plain text strings, which are meaningless from a software agent's point of view.
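To get a feel for the hash tag processing described above, here is a minimal sketch in Python. It is my own illustration, not SMOB's code. It assumes that spaces inside a tag are written as underscores, that the `dbp` prefix maps directly onto DBpedia resource URIs, and that the GeoNames lookup is replaced by a small hypothetical table with a placeholder URI (a real resolver would query the GeoNames service instead). All function names are made up for the example.

```python
import re

# Matches tags of the form #prefix:Local_Name.
# Assumption: underscores stand in for spaces inside a tag.
TAG_PATTERN = re.compile(r"#(\w+):(\w+)")


def extract_prefixed_tags(text):
    """Return (prefix, local_name) pairs for every #prefix:name tag in text."""
    return TAG_PATTERN.findall(text)


def dbpedia_resolver(name):
    """Build a DBpedia resource URI from a tag name (no lookup is performed)."""
    return "http://dbpedia.org/resource/" + name


# Hypothetical stand-in for a GeoNames query; the URI is a placeholder.
GEO_STUB = {"Paris_France": "http://example.org/geo/Paris_France"}


def geo_stub_resolver(name):
    """Return the placeholder URI for a known place name, or None."""
    return GEO_STUB.get(name)


def resolve_tags(text, resolvers):
    """Map each recognised tag to a URI; tags with unknown prefixes are skipped."""
    uris = []
    for prefix, name in extract_prefixed_tags(text):
        resolver = resolvers.get(prefix)
        if resolver is None:
            continue
        uri = resolver(name)
        if uri is not None:
            uris.append(uri)
    return uris


if __name__ == "__main__":
    resolvers = {"dbp": dbpedia_resolver, "geo": geo_stub_resolver}
    update = "Visiting #dbp:Eiffel_Tower in #geo:Paris_France"
    for uri in resolve_tags(update, resolvers):
        print(uri)
```

Keeping one resolver per prefix means a new source of URIs can be added without touching the extraction step.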

Extracting data in this way and interlinking it with existing URIs makes the content easier to find on the Semantic Web. Lookup services such as Sindice crawl the Web for RDF data and for links between documents, so a search for “Eiffel Tower” could surface the update above.
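To make the modelling more concrete, the sketch below shows how a single post could be written out as Turtle using FOAF and SIOC terms. The choice of properties (`sioct:MicroblogPost`, `foaf:maker`, `sioc:content`, `sioc:topic`) is my assumption about a reasonable mapping, not necessarily the exact one used in the paper, and the example URIs are placeholders.

```python
# Namespace declarations for the SIOC core, SIOC types and FOAF vocabularies.
PREFIXES = (
    "@prefix sioc: <http://rdfs.org/sioc/ns#> .\n"
    "@prefix sioct: <http://rdfs.org/sioc/types#> .\n"
    "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n"
)


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def post_to_turtle(post_uri, author_uri, content, topic_uris):
    """Serialise one microblog post as a Turtle document (illustrative mapping)."""
    statements = [
        "a sioct:MicroblogPost",
        f"foaf:maker <{author_uri}>",
        f'sioc:content "{escape_literal(content)}"',
    ]
    # One sioc:topic statement per URI extracted from the post.
    statements += [f"sioc:topic <{uri}>" for uri in topic_uris]
    body = " ;\n    ".join(statements)
    return f"{PREFIXES}\n<{post_uri}>\n    {body} .\n"


if __name__ == "__main__":
    print(post_to_turtle(
        "http://example.org/post/1",
        "http://example.org/alice#me",
        "Visiting the Eiffel Tower",
        ["http://dbpedia.org/resource/Eiffel_Tower"],
    ))
```

Because the topics are URIs rather than strings, a crawler that already knows the DBpedia resource can connect this post to everything else linked to it.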

Passant et al. wanted the microblogging system to be open and distributed, so they envisioned a multitude of publishing services and aggregation servers interacting with each other. A publishing service makes the posts of one or more authors available on the Web in RDF. An aggregation server receives pings from publishers and retrieves the posts it deems relevant for further use. Aggregators thus function as super-peers in the network: they take the burden of following publishers off the readers and make it simpler for publishers to announce new posts. The authors also discussed users' ownership of their data, as well as security and privacy issues.
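The interaction between publishers and aggregators can be sketched as a small in-memory simulation. This is only my reading of the description above: the class and method names are hypothetical, plain strings stand in for RDF documents, direct method calls stand in for HTTP pings and fetches, and the relevance rule is an arbitrary example.

```python
class Aggregator:
    """Receives pings and keeps only the posts it deems relevant."""

    def __init__(self, is_relevant):
        self.is_relevant = is_relevant  # callable: content -> bool
        self.store = {}                 # post URI -> content

    def ping(self, publisher, post_uri):
        # On a ping, retrieve the post from the publisher, then filter it.
        content = publisher.fetch(post_uri)
        if self.is_relevant(content):
            self.store[post_uri] = content


class Publisher:
    """Makes an author's posts available and announces them to aggregators."""

    def __init__(self, base_uri):
        self.base_uri = base_uri
        self.posts = {}        # post URI -> content
        self.aggregators = []

    def register(self, aggregator):
        self.aggregators.append(aggregator)

    def publish(self, slug, content):
        post_uri = f"{self.base_uri}/post/{slug}"
        self.posts[post_uri] = content
        # Announce the new post to every known aggregator.
        for aggregator in self.aggregators:
            aggregator.ping(self, post_uri)
        return post_uri

    def fetch(self, post_uri):
        return self.posts[post_uri]


if __name__ == "__main__":
    semweb = Aggregator(lambda content: "#semweb" in content)
    alice = Publisher("http://example.org/alice")
    alice.register(semweb)
    alice.publish("1", "Reading about #semweb")
    alice.publish("2", "Lunch time")
    print(sorted(semweb.store))
```

The posts stay with the publisher, which fits the authors' point about users owning their data; the aggregator only holds copies of what it chose to retrieve.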

In the last section they demonstrated their ideas by explaining their implementation of a prototype: SMOB. It consists of two parts: a publishing client and a set of server web services. The publishing client is accessed as a web page that contains a small form field for the content. Submitting the form creates the post and makes it available on the Web in RDF. Based on the pings it receives from clients, the server loads all posts into its triple store.

In my own application I will also have to consider how to store triples and how to convert the stream of information published by Twitter users into RDF. Passant et al. mentioned SPARUL (SPARQL/Update), which is implemented in Jena, OpenLink Virtuoso and ARC2; the SMOB server itself uses ARC2 to store its data. These are systems I need to take a closer look at.
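As a first step towards storing triples, the sketch below builds an update request as a string. Note the assumptions: it uses the `INSERT DATA` form from the later SPARQL 1.1 Update standard, whereas the SPARUL dialects available when the paper was written differ in syntax from store to store. The function name and the graph URI are hypothetical, and the terms are expected to be already written in SPARQL syntax (IRIs in angle brackets, literals in quotes).

```python
def build_insert_data(graph_uri, triples):
    """Build a SPARQL 1.1 Update INSERT DATA request for a named graph.

    triples is an iterable of (subject, predicate, object) strings that are
    already valid SPARQL terms; no escaping is done here.
    """
    lines = [f"    {s} {p} {o} ." for s, p, o in triples]
    body = "\n".join(lines)
    return f"INSERT DATA {{\n  GRAPH <{graph_uri}> {{\n{body}\n  }}\n}}"


if __name__ == "__main__":
    print(build_insert_data(
        "http://example.org/graph/posts",
        [
            (
                "<http://example.org/post/1>",
                "<http://rdfs.org/sioc/ns#content>",
                '"Visiting the Eiffel Tower"',
            ),
        ],
    ))
```

The resulting string would still have to be sent to whatever update endpoint the chosen triple store exposes, which is not covered here.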

In this paper, the authors introduced the architecture and a first implementation of a distributed semantic microblogging platform. While approaches for converting microblogging services to RDF already exist for Twitter and Jaiku, their approach relies on a completely open and distributed view, using some of the standards of the Social Semantic Web. Moreover, some parts of their work, such as the hash tag processing, could be adapted to services such as Twitter to add some semantics to existing tools.

References

Passant et al. Microblogging: A semantic and distributed approach. Proceedings of the 4th Workshop on Scripting for the Semantic Web (2008)


About laurensdv
Computer Science Student, interested in creating more innovating user experiences for information access. Fond of travelling around Europe!
