Data collection pipeline

Hi all,

Motivated by our knowledge exchanges and discussions during the General Assembly, we would like to follow up here.

As we are (all?) challenged by decentralized and unstructured data sources, we would like to get in touch with you about your data collection pipelines. We have been talking to some of you already and would like to open a discussion we can all benefit from, starting with the following three questions:

  1. Where is your data collected (e.g. machines, eLab, manually) and in which format?
  2. How do you exchange and update data between machines and triple store?
  3. Where and how is the data stored? Are you just using a triple store or some additional software?

And our answers to the questions are:

  1. Currently, we are collecting data manually through a MediaWiki form. However, we are planning to implement automated data input from measurement instruments.
  2. For now, the data is transformed into RDF triples and stays within our MediaWiki. Data can still be edited and deleted manually afterwards.
  3. We are using a Semantic MediaWiki, which allows us to store triples. Since our data is created manually, we have not yet run into performance issues. Once we implement automated data input, we will introduce an external triple store.
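
To make the automated-input idea more concrete, here is a minimal sketch of what such a step could look like: one instrument reading is serialized as RDF (N-Triples) and wrapped in a SPARQL `INSERT DATA` update that could be POSTed to an external triple store. All names here (the base IRI, the predicates, the sample ID) are our own assumptions for illustration, not part of any existing setup:

```python
# Hypothetical sketch: turn one instrument measurement into RDF triples
# and prepare a SPARQL UPDATE string for an external triple store.
# The namespace and predicate names below are made up for this example.

BASE = "https://example.org/lab/"  # assumed base IRI


def measurement_to_ntriples(sample_id: str, quantity: str,
                            value: float, unit: str) -> str:
    """Serialize a single measurement as N-Triples."""
    subject = f"<{BASE}measurement/{sample_id}>"
    return "\n".join([
        f'{subject} <{BASE}quantity> "{quantity}" .',
        f'{subject} <{BASE}value> '
        f'"{value}"^^<http://www.w3.org/2001/XMLSchema#double> .',
        f'{subject} <{BASE}unit> "{unit}" .',
    ])


def sparql_insert(ntriples: str) -> str:
    """Wrap N-Triples in a SPARQL UPDATE request body."""
    return f"INSERT DATA {{\n{ntriples}\n}}"


# Example reading; in practice this would come from the instrument.
update = sparql_insert(
    measurement_to_ntriples("S-001", "temperature", 23.4, "degC"))
print(update)
```

The resulting string could then be sent to the store's SPARQL update endpoint (e.g. via an HTTP POST); the exact endpoint URL and authentication depend on the triple store chosen.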