Can linked data assist in expert profiling?

Scientific profiling in social networks involves examining the content generated by a candidate (user). To determine whether this content (in this case, microblog posts) has scientific relevance, and thus whether a Twitter user is an expert in a certain domain, we link hashtags to the linked data cloud. Specifically, we try to discover scientific conferences, locations, people and events. In the literature we found important support for this idea. The general conclusion is that the sources needed to build such a system are available, but they are not properly interlinked. This thesis project is an effort to provide the interlinking between several LOD sources (most importantly Colinda, GeoNames and DBpedia). Other resources could certainly extend the possibilities of the framework, but to prove the case we strictly limit the effort to people in technical scientific fields, and we use the hypothesis that people who attend similar scientific conferences are a good match.
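
To make the matching hypothesis concrete, here is a minimal sketch that scores two users by the overlap of the conferences they attended. The Jaccard measure, the user names and the conference labels are all assumptions chosen for illustration; the thesis framework may well use a different similarity measure.

```python
from __future__ import annotations


def conference_similarity(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two sets of conference identifiers."""
    if not a and not b:
        return 0.0  # no evidence at all, so no match
    return len(a & b) / len(a | b)


# Hypothetical users and conference labels, for illustration only.
attended = {
    "alice": {"ISWC 2011", "ESWC 2012", "WWW 2012"},
    "bob": {"ISWC 2011", "ESWC 2012"},
    "carol": {"ICSE 2012"},
}


def best_matches(user: str, attended: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Rank all other users by conference overlap with `user`, best first."""
    scored = [
        (other, conference_similarity(attended[user], confs))
        for other, confs in attended.items()
        if other != user
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in best_matches("alice", attended):
        print(f"{other}: {score:.2f}")
```

In practice the conference identifiers would be URIs obtained by linking hashtags to a LOD source rather than plain labels, so that two spellings of the same event compare as equal.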

Stankovic et al. studied expert search and profiling systems. Such systems aim to identify candidate experts and rank them by their estimated expertise on a given topic, using the available evidence. The authors found that traditional expert search and profiling systems exploit structured data from closed systems (e.g. an email program) or unstructured data from open systems (e.g. the Web). However, on today’s Web there is a growing number of data sets published according to the Linked Data principles, the majority of them being part of the Linked Open Data (LOD) cloud. As LOD connects data and people across different platforms in a meaningful way, one can assume that expert search and profiling systems would benefit from harnessing LOD.

Stankovic et al. conducted several experiments to evaluate the feasibility of existing expert search and profiling approaches on a recent snapshot of the LOD cloud. Their findings indicate that the LOD cloud is already a useful source for some kinds of expert search approaches (e.g. those based on publications and professional events) but still has to meet certain requirements in order to reach its full potential. In the existing literature on expert finding, different authors make different assumptions about what makes an expert and how expertise can be assessed. Stankovic et al. call these assumptions expertise hypotheses.

One section of the paper presents hypotheses which assume that a user’s online activities related to a certain topic imply expertise in that topic. To be a useful source of evidence for expert search, the LOD cloud needs to satisfy certain conditions. The authors designed tests to verify whether the current LOD cloud meets those conditions and ran these tests for each expertise hypothesis. Three hypotheses are particularly relevant to this thesis project:

  • H4: If a user answers questions (on topic X) from experts on topic X, then he might himself be an expert on topic X.
  • H5: If a user is among the first to discover (and share) “important/good” resources (i.e. resources which later become popular) on topic X, then he might be an expert on topic X.
  • H6: If a user participates in a collaborative software development project, then he might be an expert in the programming language that is used in the project.

 

The authors found some interesting linked data resources: for H5, Faviki and Virtuoso (via Sponger), and for H6, DoapStore and RDFOhloh. Q&A sites are a useful source of data about expertise, but although they could be represented using the SIOC ontology, the authors did not find any such website that provides a SIOC-based data export. H4 was thus not applicable to the current LOD cloud.

Faviki is another example of a source that falls short. It provides useful tagging data with links to DBpedia, but the time of tagging is missing, which makes it difficult to design expert search approaches based on H5. In other cases LOD is not a good source for expert finding because the datasets that a given hypothesis would draw on are not interlinked. During their evaluation the authors found some examples of data that would be a useful source of expertise evidence if they were interlinked.
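
The role of the tagging time in H5 can be illustrated with a small sketch. It assumes tagging records of the form (user, resource, timestamp) that have already been restricted to one topic; the record layout, the popularity threshold `min_users` and the cut-off `early_k` are all hypothetical choices, not something taken from the paper. Without the timestamp column the ranking step below cannot be computed at all, which is exactly the gap noted for Faviki.

```python
from collections import defaultdict


def early_discoverer_scores(taggings, min_users=3, early_k=1):
    """Count, per user, how often they were among the first `early_k`
    users to tag a resource that later became popular.

    taggings: iterable of (user, resource, timestamp) tuples.
    A resource counts as popular once `min_users` distinct users tagged it.
    """
    # resource -> {user: earliest timestamp at which that user tagged it}
    first_seen = defaultdict(dict)
    for user, resource, ts in taggings:
        prev = first_seen[resource].get(user)
        if prev is None or ts < prev:
            first_seen[resource][user] = ts

    scores = defaultdict(int)
    for resource, by_user in first_seen.items():
        if len(by_user) < min_users:
            continue  # never became popular, so it gives no evidence
        # Order users by the time they first tagged the resource.
        ranked = sorted(by_user, key=lambda u: by_user[u])
        for user in ranked[:early_k]:
            scores[user] += 1
    return dict(scores)


# Hypothetical tagging log; timestamps are plain integers for brevity.
taggings = [
    ("alice", "r1", 1), ("bob", "r1", 2), ("carol", "r1", 3),
    ("bob", "r2", 1), ("alice", "r2", 5), ("carol", "r2", 6),
    ("carol", "r3", 1),  # tagged by a single user only
]

if __name__ == "__main__":
    print(early_discoverer_scores(taggings))
```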

For approaches based on H6, a useful data source is RDFOhloh, the export of data related to software development projects that take place at Ohloh. This source provides both inverse functional properties for the members of the projects and links to DBpedia concepts identifying the programming languages that are used. It is thus well suited for finding experts on specific programming languages.
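
As a rough sketch of how H6 could be applied to data of this shape, the following joins project membership with project language and counts projects per member. The two lists are simplified, hypothetical stand-ins for what such an export would contain: the project identifiers and member keys are invented, and the member key merely stands in for whatever inverse functional property value identifies a person.

```python
from collections import defaultdict

PYTHON = "http://dbpedia.org/resource/Python_(programming_language)"
JAVA = "http://dbpedia.org/resource/Java_(programming_language)"

# Hypothetical (project, member key) pairs.
project_members = [
    ("proj:parser", "member:ann"),
    ("proj:parser", "member:ben"),
    ("proj:crawler", "member:ann"),
    ("proj:server", "member:ben"),
]

# Hypothetical (project, DBpedia language URI) pairs.
project_languages = [
    ("proj:parser", PYTHON),
    ("proj:crawler", PYTHON),
    ("proj:server", JAVA),
]


def candidates_by_language(project_members, project_languages):
    """Return {language URI: {member key: number of projects}}."""
    languages_of = defaultdict(set)
    for project, language in project_languages:
        languages_of[project].add(language)

    counts = defaultdict(lambda: defaultdict(int))
    for project, member in project_members:
        # Projects without a known language contribute no evidence.
        for language in languages_of.get(project, ()):
            counts[language][member] += 1
    return {language: dict(members) for language, members in counts.items()}


if __name__ == "__main__":
    ranking = candidates_by_language(project_members, project_languages)
    for member, n in sorted(ranking[PYTHON].items(), key=lambda kv: -kv[1]):
        print(member, n)
```

Because the languages are identified by URIs rather than literals, the join needs no string normalisation, which is the practical benefit of the DBpedia links.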

DoapStore is a promising source for H6-based approaches. It contains data on software development projects and their participants. Although programming language data are present, they are given only in the form of literals, and links to general concepts (e.g. those of DBpedia or Freebase) are not common. H6-based approaches may rely on RDFOhloh for more complete support. RDFOhloh also provides direct links to DoapStore descriptions, which makes integration possible despite the lack of links in DoapStore.

The authors concluded that expert search and profiling systems aggregate and analyze certain types of data depending on the expertise hypotheses they use. Traditional approaches tend to retrieve their data from closed or limited data corpora. LOD, on the other hand, allows querying the whole Web like a huge database, thus surpassing the limits of closed data sets and closed online communities. The authors believe that this opens new possibilities for traditional expert search and profiling systems, which usually rely only on data from their local and limited databases or on unstructured data gathered from the Web. LOD also holds great promise for delivering multi-purpose data that can be used to find experts in many domains and with many different expertise hypotheses. In their paper the authors explored the potential and the drawbacks of LOD in comparison to the traditional data sources used for expert search. They asked not only what LOD can do, but also what one can do for LOD to make it an even better source of expertise evidence.

 

References

  • Stankovic, M., Wagner, C., Laublet, P., Jovanovic, J.: Looking for experts? What can linked data do for you? In: Linked Data on the Web.

 


About laurensdv
Computer Science Student, interested in creating more innovating user experiences for information access. Fond of travelling around Europe!
