Linked Data Modelling and Conversion

project

Swirrl is a linked-data company whose service PublishMyData allows you to publish data in linked-data format as part of the semantic web. PublishMyData provides an interface for browsing the data and a variety of APIs (including SPARQL endpoints) for programmatic access. The service is popular among public sector organisations seeking to publish open data in a way that maximises the possibilities for re-use by providing a solid technical platform for third-party application development.

Infonomics works with Swirrl to model and translate data, typically from a tabular form, into linked-data graphs. This process enriches the quality of the data by:

  • providing a multi-dimensional representation that better reflects the underlying facts - this is highly normalised to ensure integrity and maximum flexibility for future transformation;
  • cleaning and standardising data types to prepare the data for analysis;
  • relating the data to reference structures so that it may be integrated with other data in the semantic web;
  • establishing vocabularies (concept schemes and ontologies) which provide metadata descriptions to aid discovery and interpretation and to ensure consistency.
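
As a rough illustration of the steps listed above, the following Python sketch turns a small table into linked-data triples in N-Triples syntax. It is not the Tripod or Grafter pipeline: it uses only the standard library, and the example.org namespace, the property URIs (refArea, refPeriod, population), the area codes and the figures are all made up for the example. The only real vocabulary terms are rdf:type, xsd:integer and qb:Observation from the RDF Data Cube vocabulary.

```python
import csv
import io

# Hypothetical namespace and invented sample data, for illustration only.
BASE = "http://example.org"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB_OBSERVATION = "http://purl.org/linked-data/cube#Observation"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

SAMPLE_CSV = """area_code,year,population
A1, 2013 ,"514,400"
B2,2013,"239,000"
"""


def clean_int(raw):
    """Standardise a messy numeric cell: trim spaces, drop thousands separators."""
    return int(raw.strip().replace(",", ""))


def row_to_triples(row):
    """Model one table row as one observation with area and period dimensions."""
    area = row["area_code"].strip()
    year = clean_int(row["year"])
    value = clean_int(row["population"])
    obs = f"<{BASE}/data/population/{area}/{year}>"
    return [
        f"{obs} <{RDF_TYPE}> <{QB_OBSERVATION}> .",
        # Dimensions point at reference URIs rather than plain strings,
        # so other datasets can link to the same area and period.
        f"{obs} <{BASE}/def/refArea> <{BASE}/id/area/{area}> .",
        f"{obs} <{BASE}/def/refPeriod> <{BASE}/id/year/{year}> .",
        # The measure is a typed literal, ready for numeric analysis.
        f'{obs} <{BASE}/def/population> "{value}"^^<{XSD_INTEGER}> .',
    ]


def csv_to_ntriples(text):
    """Convert every row of a CSV document into a flat list of N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        triples.extend(row_to_triples(row))
    return triples


if __name__ == "__main__":
    print("\n".join(csv_to_ntriples(SAMPLE_CSV)))
```

Each row becomes four triples: one typing the observation, two relating it to reference resources for area and period, and one carrying the cleaned, typed value. A real conversion would also publish the concept schemes and ontology that describe those properties.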

The "RDFisation" process combines a variety of tools, including Swirrl's in-house Ruby-based RDF ORM (object-relational mapping) library Tripod and its Clojure-based ETL (extract, transform and load) pipeline tool Grafter.

linked data, open data, data migration, modelling, analysis, open source software, data cleansing