3 Data Modeling

In this document you’ll find explainations and descriptions of the data model used in the Remove NA project.

Remove NA collects data about real world entities, such as people, organizations or locations, and the relationships between them.

3.1 Import to Factgrid

Factgrid is a database for historians, technically based on a Wikibase instance. I can use this platform to publish data gathered for this project.

3.1.1 Data modeling and import

To import the data in Factgrid, I have to model the data accordingly. I developed a process for doing so.

All scripts can be found in data-publishing/factgrid/.

A simple one, is the one for addings awards.

3.2 Create RDF files from database

RDF files created with RML rules.

Scripts in data-modelling

  1. run parser:
  • Knowledge Graph v1 with de-duplicated books, authors, publishers (best version!): yarrrml-parser -i mappings/kg_v1.yml -o mappings/kg_v1.r2rml.ttl -f R2RML
  • for posters: yarrrml-parser -i mappings/posters-rules.yml -o mappings/posters-mapping.r2rml.ttl -f R2RML
  • for books: yarrrml-parser -i mappings/books-rules.yml -o mappings/books-mapping.r2rml.ttl -f R2RML
  • for books and authors yarrrml-parser -i mappings/books-authors.yml -o mappings/books-authors.r2rml.ttl -f R2RML
  1. create RDF files with python3 create-rdf.py