Creating and Querying Personalized Versions of Wikidata on a Laptop

Datasets


  • Datasets for Creating and Querying Personalized Versions of Wikidata in a Laptop (Wikidata workshop, 2021)

    • Description: These datasets support the results of the paper "Datasets for Creating and Querying Personalized Versions of Wikidata in a Laptop", submitted to the Wikidata Workshop 2021 (https://wikidataworkshop.github.io/2021/) at the International Semantic Web Conference. The datasets were derived from the Wikidata dump 20210215. To make querying easier, the dump is organized into separate files:
      • claims.time.tsv.gz: time-related assertions.
      • claims.wikibase-item.tsv-006.gz: item-related assertions.
      • derived.P279.tsv.gz: subclass of (P279) statements.
      • derived.P279star.tsv.gz: subclass of (P279) statements, including their chains.
      • derived.P31.tsv.gz: instance of (P31) statements.
      • labels.en.tsv-004.gz: labels in English.
      • claims.external-id.tsv-005.gz: external identifiers for each item.
      • ulan.tsv: ULAN IDs (used to link external identifiers to Wikidata identifiers).
      • wikidata_infobox.tsv.gz: information about DBpedia infoboxes.
    • Uploaded by: Daniel Garijo
    • License: 'https://creativecommons.org/licenses/by/4.0/legalcode' and 'info:eu-repo/semantics/openAccess'
    • Author: Pedro Szekely
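
Most of the files listed in the description are gzip-compressed, tab-separated tables, so they can be inspected without unpacking them to disk. The following is a minimal sketch using only the Python standard library. The helper name `peek_tsv_gz` is hypothetical, and the sketch assumes only that the first line of each file is a header row. It makes no assumption about which columns a given file contains.

```python
import csv
import gzip
from itertools import islice


def peek_tsv_gz(path, n=5):
    """Return (header, rows) for the first n data rows of a gzipped TSV file.

    Hypothetical helper for illustration; assumes the first line is a header.
    """
    # Open in text mode so the csv module receives decoded strings.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps quote characters inside values (such as labels) as-is.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = next(reader, [])
        # islice stops after n rows, so large files are never read in full.
        rows = list(islice(reader, n))
    return header, rows
```

For example, once derived.P31.tsv.gz has been downloaded to the working directory, `peek_tsv_gz("derived.P31.tsv.gz", n=3)` would return its header and first three rows. Reading lazily matters here because some of these files are too large to load into memory comfortably on a laptop.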

Software


Pointers to the main software used are listed below:

    Knowledge Graph Toolkit

    Readme
    License: MIT License
    Notebook
    KGTK: A Toolkit for Large Knowledge Graph Manipulation and Analysis
    Citation
    Installation
    Usage
    Documentation
    Download
    Languages: Python, HTML, Shell

    Code and datasets for the KGTK demo at the 2021 Wikidata Workshop at ISWC

    Readme
    License: MIT License
    Language: Python

Bibliography


  • Vrandečić, D., Krötzsch, M.: Wikidata: a free collaborative knowledgebase. Communications of the ACM 57(10), 78–85 (2014)

About the authors


    Daniel Garijo

    Distinguished Researcher

    Universidad Politécnica de Madrid, University of Southern California

    http://w3id.org/people/dgarijo

    I am a researcher at Universidad Politécnica de Madrid. My research activities focus on e-Science and the Semantic Web, specifically on how to increase the ease of use of software and scientific workflows using provenance, metadata, intermediate results and Linked Data.

    Hans Chalupsky

    Author

    Research Lead

    Research Lead at the Information Sciences Institute, University of Southern California.

    Pedro Szekely

    Author

    Research Director

    Research Director at the Center on Knowledge Graphs, Information Sciences Institute, University of Southern California.

    Filip Ilievski

    Author

    Research Scientist

    Researcher at the Information Sciences Institute, University of Southern California.

    Kartik Shenoy

    Author

    Student worker

    Master's student at the University of Southern California.