Triplestore including File Management

I have seen the demo of the platform's ontodocker triplestore (https://ontodocker.material-digital.de/). I was wondering whether there are any developments in the PMD concerning the integration of files with the data. For example, many experiments include time series data, which should not be stored in a triplestore but rather referenced using a standard like CSVW. Are there solutions for this already?
This question also extends to other file types, such as figures.


@paulzierep a tricky question. I suppose the ontodocker developers are the people most likely to be able to answer: @jannis.grundmann @henkbirkholz. As far as I know, they have ongoing collaborations with experimentalists…

The time series data is not stored in any triplestore; only information about the data fields themselves (e.g. identifier and unit) is stored there.
Which document-oriented database (e.g. MongoDB) will be used for long-term storage is still under discussion.


Thank you for this answer.
Concerning the information about the data fields: what does the platform intend for the data models involved? I would, for example, expect to use CSVW (see "CSV on the Web: A Primer") to describe metadata about CSV files. But I have also seen different approaches, and I think it would be advantageous for platform projects to use similar data models for similar data types.
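
To make the CSVW idea concrete, here is a minimal sketch of the kind of metadata description I have in mind. The file name and the column names are made up for illustration, and the helper `csvw_metadata` is hypothetical; only the keys (`@context`, `url`, `tableSchema`, `columns`, `name`, `titles`, `datatype`) come from the CSVW metadata vocabulary.

```python
import json


def csvw_metadata(csv_url, columns):
    """Build a minimal CSVW table description for a single CSV file.

    `columns` is a list of (name, title, datatype, description) tuples.
    """
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_url,  # reference to the CSV file; the rows stay in that file
        "tableSchema": {
            "columns": [
                {
                    "name": name,
                    "titles": title,
                    "datatype": datatype,
                    "dc:description": description,
                }
                for name, title, datatype, description in columns
            ]
        },
    }


# Hypothetical file and columns, for illustration only.
metadata = csvw_metadata(
    "tensile-test.csv",
    [
        ("time", "Time", "number", "Elapsed time in seconds"),
        ("force", "Force", "number", "Measured force in newtons"),
    ],
)

print(json.dumps(metadata, indent=2))
```

The resulting JSON would typically be saved next to the CSV file, so that only this description, not the time series itself, needs to be referenced from the triplestore.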


Concerning the open discussion about document/file data storage: is the idea that the platform will provide solutions for this, or should every project look for its own data storage system? Also, in which forum are these points discussed?


While the concepts presented in the W3C document „CSV on the Web: A Primer“ are feasible in principle, for the sake of brevity we store the rows of two-dimensional bulk data (e.g. time series) not as JSON objects but as JSON arrays.

Additionally, the two-dimensional data in a CSV file is often preceded by label/value-formatted data (sometimes called metadata) that has to be processed differently.

In order to account for both the two-dimensional data series and the metadata in today’s CSV-modeled data, we introduce an optimized data model using the JSON serialization that goes beyond what is called „the primer“ in the W3C standard.
exemplary-timeseries-bulkdata_Batch-3353.json
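
For readers without access to the attachment, the difference between rows as JSON objects and rows as JSON arrays can be sketched roughly as follows. This is an illustration only and not the layout of the attached file: the key names (`metadata`, `fields`, `data`) and all values are assumptions made up for this example.

```python
import json

# Label/value pairs that precede the table in many CSV exports (hypothetical values).
header = {"specimen": "hypothetical-specimen-01", "machine": "hypothetical-machine"}

# Field descriptions (identifier and unit) are given once, not repeated per row.
fields = [{"id": "time", "unit": "s"}, {"id": "force", "unit": "N"}]

# Two-dimensional bulk data: each row is a plain JSON array.
rows = [[0.0, 0.0], [0.5, 12.3], [1.0, 24.9]]

# Object style: every row repeats the field identifiers as keys.
field_ids = [field["id"] for field in fields]
as_objects = [dict(zip(field_ids, row)) for row in rows]

# Array style: header, field descriptions and rows are kept side by side.
compact = {"metadata": header, "fields": fields, "data": rows}

size_objects = len(json.dumps(as_objects))
size_arrays = len(json.dumps(rows))
print(f"rows as objects: {size_objects} characters")
print(f"rows as arrays:  {size_arrays} characters")
```

The saving grows with the number of rows, because the field identifiers are written once instead of once per row, while the label/value header stays a separate part of the document.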

Maybe this would be an interesting topic for Friday’s „Ontology Playground“ meeting. What do you think?
