Preprint

Concept-based semantic annotation, indexing and retrieval of office-like document units

  • Nešić, Saša Faculty of Informatics, Università della Svizzera italiana, Switzerland
  • Jazayeri, Mehdi Faculty of Informatics, Università della Svizzera italiana, Switzerland
  • Crestani, Fabio Faculty of Informatics, Università della Svizzera italiana, Switzerland
  • Gašević, Dragan School of Computing and Information Systems, Athabasca University, Canada
    2010

11 p.

We present an ontology-driven approach to the semantic annotation, indexing and retrieval of document units. The approach is based on a novel semantic document model (SDM) that we developed so that office-like document units can be uniquely identified, semantically annotated with concepts from annotation ontologies, and linked across document boundaries. In the semantic annotation model that we propose, we first lexically expand the descriptions of ontological concepts to enhance syntactic matching. Next, we expand the set of syntactic matches with semantically related concepts (i.e., semantic matches) discovered by exploring the annotation ontology. We then calculate the annotation weight of both the syntactic and semantic matches by taking into account the effects of the lexical expansion and by measuring the semantic distance between ontological concepts. The retrieval model for document units uses an inverted concept index that we generate from the concepts used in the annotation and their weights for the document units they annotate. Results of a preliminary evaluation conducted with a prototype implementation are promising, and we present an analysis of these results.
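The pipeline described in the abstract (semantic expansion of syntactic matches, weighting, and retrieval over an inverted concept index) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `decay ** distance` weighting, and the additive scoring are all assumptions made for the example, since the abstract does not give the actual formulas.

```python
from collections import defaultdict


def expand_with_related(matches, related, decay=0.5):
    """Add semantically related concepts to a set of syntactic matches.

    matches: {concept: weight} for the syntactic matches.
    related: {concept: [(neighbor, distance), ...]} taken from the ontology.
    ASSUMPTION: a neighbor's weight shrinks as decay ** distance; the paper's
    actual weighting scheme is not stated in the abstract.
    """
    expanded = dict(matches)
    for concept, weight in matches.items():
        for neighbor, distance in related.get(concept, []):
            candidate = weight * decay ** distance
            # Keep the strongest weight seen for each concept.
            expanded[neighbor] = max(expanded.get(neighbor, 0.0), candidate)
    return expanded


def build_concept_index(annotations):
    """Build an inverted index: concept -> {document unit id: weight}.

    annotations: iterable of (unit_id, concept, weight) triples.
    """
    index = defaultdict(dict)
    for unit_id, concept, weight in annotations:
        previous = index[concept].get(unit_id, 0.0)
        index[concept][unit_id] = max(previous, weight)
    return index


def retrieve(index, query_concepts):
    """Rank document units by the summed weights of the query concepts.

    ASSUMPTION: scores are simply added across concepts (hypothetical).
    """
    scores = defaultdict(float)
    for concept in query_concepts:
        for unit_id, weight in index.get(concept, {}).items():
            scores[unit_id] += weight
    # Highest score first; ties broken by unit id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

In this sketch a document unit annotated with a concept only through semantic expansion still appears in the results for that concept, but ranks below units that matched it directly, which is the general effect the annotation weights are meant to achieve.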
Language
  • English
Classification
Computer science and technology
Identifiers
  • RERO DOC 22114
  • ARK ark:/12658/srd1318154
Persistent URL
https://n2t.net/ark:/12658/srd1318154