JCDL Conference 2008



Tutorial 2: Semantic Digital Libraries
Full Day

The aim of this tutorial is to educate attendees on the applications of Semantic Web and Social Networking (Web 2.0) technologies in digital library systems. These applications include metadata management, semantic search and browsing, personalized and community-aware services, and semantically empowered federations of digital libraries. Semantic digital libraries can be used in enterprise-scale systems such as knowledge management systems, medical records systems, legal research systems, and others, which will be discussed in some depth. After this tutorial, the audience will be able to start using existing semantic digital libraries and apply Semantic Web technologies to digital library systems. We will start by defining problems in the domain of semantic digital libraries and present solutions that provide building blocks for semantic digital libraries, such as WordNet, DMoz, SKOS, CIDOC-CRM, and OAI-ORE. We will then present the architecture of existing semantic digital libraries, elaborate on resource management, search, and browsing features, and identify management and communication interfaces. We will discuss in detail the problems and solutions for bibliographic metadata management and interoperability, followed by a presentation of semantic search and browsing solutions and other personalized and community-aware services. We will discuss the future of federations of digital libraries in the context of the Semantic Web and the Web 2.0 Internet. Finally, we will present six initiatives that adhere to the idea of a semantic digital library, including SIMILE, Corrib.org, BRICKS, Fedora, and DELOS.

Tutorial web site: http://semdl.corrib.org/Tutorial/

Learning Objectives:
The participants of the tutorial will learn about the main design goals and features of semantic digital libraries. They will get to know Semantic Web and social networking technologies that can improve current information management and retrieval, as well as search and browsing solutions in digital libraries. The participants will learn about six different approaches to building semantic digital libraries; they will also gain initial practical experience with installing, configuring, and using these libraries. After the tutorial, participants will be able to select and use the most appropriate solution for their needs, whether this is a complete semantic digital library or a single component that can be applied to an existing infrastructure.

Target Audience:
Researchers, practitioners, and computer scientists from the (digital) library, Semantic Web, distributed systems, and knowledge management communities.

Level of Experience: 
Introductory or intermediate level of experience in the presented topics.

Sebastian Ryszard Kruk is a lead researcher (Semantic Infrastructure Lab at e-Learning Cluster in DERI Galway) and project manager (Corrib.org) affiliated with DERI, National University of Ireland, Galway and Gdansk University of Technology (GUT). His main areas of interest cover Semantic Web technologies, digital libraries, information retrieval, security, and distributed computing. In 2002, together with Prof. Henryk Krawczyk, he conceived of the semantic digital library implemented at GUT as Elvis-DL. Since 2004, he has continued development of this system under the JeromeDL project, involving collaboration between DERI and GUT. To improve the quality of the JeromeDL system, he started the MarcOnt Initiative and the FOAFRealm project. Since 2005, both have been supported by DERI and GUT under the umbrella of Corrib.org. He initiated the work on a lightweight implementation of the HyperCuP protocol that later became an independent project. In 2005, as a part of the FOAFRealm project, he envisioned and delivered the first prototype of Social Semantic Collaborative Filtering, a unique bookmark-sharing solution. In 2006, he delivered the JOnto component, a unified API to access different taxonomies (used in, e.g., JeromeDL and FOAFRealm). He initiated work on Didaskon, which delivers a framework for assembling a curriculum from existing learning objects provided by e-Learning services; the selection of learning objects will be based on the semantically annotated specification of the user's current skills. He also delivered the TagsTreeMaps component for efficient filtering of a tagged information space, and HexBrowser for representing an information space using the HoneyComb paradigm. In 2007, he developed a prototype of MultiBeeBrowse (MBB), providing collaborative, faceted navigation on unstructured metadata. MBB, together with other components, constitutes the social semantic search and browsing cycle (S3B). Based on the S3B components, he and Adam Gzella set up notitio.us, a social semantic service for discovering and sharing information sources. He contributes to the Open Source community with a number of other projects, including all aforementioned Corrib.org solutions. As an active member of the Semantic Web research community, he works with Stefan Decker, Daniel Schwabe, and Henryk Krawczyk. He has published a number of scientific articles in international conferences and journals.

Stefan Decker is a professor at the National University of Ireland, Galway, director of the Digital Enterprise Research Institute (leading a research institute employing over 100 people), and Cluster Leader of the Semantic Web Cluster within the institute. He received a Master’s degree in Computer Science in 1995 from the University of Kaiserslautern (awarded with distinction), and a Ph.D. in Computer Science in 2002 from the University of Karlsruhe (awarded with distinction). From 1999 to 2002, he worked as a Postdoc and Research Associate in the Computer Science Department of Stanford University, where he established one of the first Semantic Web research groups. From July 2002 to July 2005, he worked as a Computer Scientist and Research Assistant Professor at the Information Sciences Institute of the University of Southern California, USA. Since October 2003, he has been engaged in establishing a new Research Institute, leading the Semantic Web research group as a Senior Research Fellow and Adjunct Lecturer responsible for 10 group members within the Institute and the National University of Ireland. His current research interests include the Semantic Web, metadata, ontologies and semi-structured data, web services, and applications for digital libraries, knowledge management, information integration, and peer-to-peer technology. He has published around 70 papers in books, journals, and conference and workshop proceedings. CiteSeer ranks him at 1035 in their list of most cited computer scientists (see http://citeseer.ist.psu.edu/allcitedn.html). He has co-organized about 35 scientific workshops and conferences and has edited several special issues of scientific journals. He was editor-in-chief of Elsevier's Journal of Web Semantics and is an editorial committee member of the Electronic Transactions on Artificial Intelligence (ETAI) (the Semantic Web), the Journal on Internet Research, and the Journal on Web Intelligence and Agent Systems (WIAS). Dr. Decker is recognized as one of the most widely cited Semantic Web scientists. His dissertation work was quoted as one of the inspirations for the DARPA DAML program, which spans the semantic web effort.

Dean Krafft received his Ph.D. in Computer Science from Cornell University in 1981. He is currently the Director of Information Technology for Computing and Information Science at Cornell University, and he is the Principal Investigator on the NSF-funded National Science Digital Library Project at Cornell. He led the effort over the past two years to convert the NSDL core infrastructure to a web-services digital object repository architecture based on the Fedora repository middleware. The project's current technical efforts focus on extending open-source collaborative applications to create content and context around the resources of the NSDL, using semantic web technologies to represent the relationships among the underlying objects and to support search and discovery within the library. Krafft has been working on digital libraries since 1992, when he worked on the Dienst and NCSTRL projects. He has been a researcher with NSDL since the program's inception in 2001.

Predrag Knežević received his Ph.D. at Fraunhofer IPSI Institute in Darmstadt, Germany in 2007. He holds a diploma in computer software and engineering from the School of Electrical Engineering, University of Belgrade. Since 2001, as a member of the OASYS and i-Info divisions, he has been a lead software architect in several EU and national projects such as TeachwareOnDemand, MGN, and BRICKS. Before joining Fraunhofer IPSI, he spent five years working in system and network engineering at the broadcasting company B92, Belgrade. His main interests are peer-to-peer storage, replication protocols, and decentralized systems.

Mariusz Cygan is a research assistant who has been affiliated with the Digital Enterprise Research Institute since 2005. He is currently a developer and a lead software architect of the JeromeDL project. His main area of interest concerns the introduction of semantic technologies into production environments. His current work focuses on delivering JeromeDL as a production system. Mariusz Cygan received his M.Sc. in Informatics from Gdansk University of Technology in 2006. He specializes in distributed applications and Internet systems. His publications concern ubiquitous search and browsing, as well as heterogeneous networks of digital libraries.