How to allow for a workflow with ontological typing

I tried to launch a discussion during the general assembly with @bernd.bayerlein and @markus.schilling, but it remained on a rather vague level, so I'll try to summarize what we on the workflow-developer side want to achieve, in the hope that it becomes clearer.

Info: I don’t distinguish between “functions” and “nodes” in the following, as we consider “functions” as the numerical realization of “nodes” in the workflow sense.

Our current strategy in PMD is to develop workflows and ontologies separately: ontological descriptions are applied to plain workflows, and the workflows themselves carry no scientific context. We workflow developers think users should be able to give nodes ontological types as soon as they create them, so that the ontological description is included intrinsically when a workflow is executed, rather than being applied independently afterwards.

In this spirit, we recently developed a scheme for data typing that allows a user to indicate units via:

from uniton.typing import u

def get_speed(distance: u(float, "meter"), time: u(float, "second")) -> u(float, "meter/second"):
    return distance / time

And if the user wants to enforce it, they can apply the units decorator, which interprets the annotations so that the inputs are actually checked:

from pint import UnitRegistry
from uniton.typing import u
from uniton.converter import units

@units
def get_speed(distance: u(float, "meter"), time: u(float, "second")) -> u(float, "meter/second"):
    return distance / time

ureg = UnitRegistry()

get_speed(1, 1) # units ignored
get_speed(1 * ureg.meter, 1 * ureg.second) # 1 meter / second
get_speed(1 * ureg.millimeter, 1 * ureg.second) # 0.001 meter / second
get_speed(1 * ureg.joule, 1 * ureg.second) # error: joule is not a unit of length

More can be found here: https://github.com/pyiron/uniton

Now we would like to be able to include ontological types in the same package. Our best-case scenario would be to be able to write something like:

def add_tomato(my_pizza: some_type_hinting_algorithm(ontological_pizza)) -> modified_ontological_pizza:
    ...
    return pizza_with_tomatoes

The logic would then hopefully be interpreted by some decorator, just like in the case of units, so that each function carries physical context in addition to performing numerical operations.

In a very simple case, I can imagine the user using dataclass from Python and doing something like:

from dataclasses import dataclass

@dataclass
class Pizza:
    pass

@dataclass
class VegetarianPizza(Pizza):
    pass

@check_type
def add_tomato(my_pizza: Pizza) -> Pizza:
    return my_pizza

And we could check the types inside check_type (or something else) e.g. via assert isinstance(input_arg, data_type).
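
Just to make that concrete, here is a minimal sketch of what such a check_type could look like, assuming it simply reads the annotations and asserts isinstance for every annotated argument (the name and the implementation are purely illustrative, not an existing API):

import functools
import inspect

def check_type(func):
    # Hypothetical decorator: assert that every annotated argument is an
    # instance of its annotated class before calling the function.
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            annotation = signature.parameters[name].annotation
            if annotation is inspect.Parameter.empty:
                continue  # unannotated arguments are not checked
            if isinstance(annotation, type):
                assert isinstance(value, annotation), (
                    f"{name} should be {annotation.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper

With this, add_tomato(my_pizza=VegetarianPizza()) passes (a VegetarianPizza is a Pizza), while add_tomato(my_pizza="margherita") raises an AssertionError.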

Now, the problem is that I know an ontology is not just about data types; there are triples that attach additional information. In this particular example, the output of add_tomato is probably not a new type PizzaWithTomato; instead, you would append an attribute such as has_tomatoes = True. This might conflict with the data stored in the dataclass, e.g.:

class MyPizza(Pizza):
    has_tomato = True # Ontological attribute
    price = 15 # Data

So, in short, we cannot straightforwardly use dataclasses. That being said, if it is only about has_xyz attributes, we can probably rewrite the current type-hinting scheme, but I presume that in reality there is a whole range of possible logical statements that would have to be covered by the interpreter.
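
One direction I could imagine for such a rewrite (purely a sketch of mine, not an existing API) is to keep such statements out of the data fields entirely and attach them as annotation metadata via typing.Annotated, so that an interpreter can collect them without them ever becoming attributes of the dataclass:

from dataclasses import dataclass
from typing import Annotated, get_type_hints

@dataclass
class Pizza:
    price: float  # plain data, no ontological meaning

# Hypothetical helper: bundle ontological statements with a type hint
def o(data_type, **statements):
    return Annotated[data_type, statements]

def add_tomato(my_pizza: Pizza) -> o(Pizza, has_tomato=True):
    return my_pizza

# An interpreter (e.g. a decorator) could later collect the statements:
hints = get_type_hints(add_tomato, include_extras=True)
print(hints["return"].__metadata__)  # ({'has_tomato': True},)

Whether this kind of key/value metadata is expressive enough for real ontological statements, beyond simple has_xyz flags, is exactly the open question.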

To summarize:

  • Ultimate goal: enable functions to contain ontological information
  • Seemingly possible option: Use type hinting for ontological types
  • Potential technical implementation candidate: data classes
  • Unknown factor: mapping of ontological logics

Any comments on any aspect would be appreciated. Thanks!

EDIT: No need to read the text above anymore; see the update below.

In the meantime I had a good discussion with Jörg Neugebauer, and I think I have an answer to my own question. In this particular example, from the workflow side we would not define another ontological type with tomatoes; instead, we would hand over the information that the Pizza went through the function add_tomato, i.e. we would have something like the following code:

wf = Workflow("make_pizza")
wf.plain_pizza = my_pizza # which is of the type Pizza
wf.pizza_with_tomato = add_tomato(my_pizza=wf.plain_pizza)

So, in this regard, instead of defining a data type Pizza with has_tomato, we now have a Pizza that went through the process add_tomato, in the hope that this provenance can somehow be incorporated into the ontology database. Does this make sense?

In parallel, there is something I don't know: while it appears to make sense to use classes for this construct, it does not seem to me that there is a good base ontological class from which I could derive my Pizza. I was recently told that owlready2 is not really in use anymore. Is there a better alternative that we can stick to? I heard rdflib is good enough, but there are only a few examples in its Getting Started guide and not a single example on the examples page. I frankly don't understand how it is applied to real data, especially data produced by a workflow.
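
For whatever it is worth, here is the kind of minimal rdflib sketch I have in mind for writing down "this Pizza went through add_tomato" as triples; the example.org namespace and the PROV-O-style predicates are just my assumption of how it might look:

from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/pizza/")   # placeholder namespace
PROV = Namespace("http://www.w3.org/ns/prov#")

g = Graph()

# Nodes for the two data objects and the process
plain_pizza = EX.plain_pizza
pizza_with_tomato = EX.pizza_with_tomato
add_tomato = EX.add_tomato

g.add((plain_pizza, RDF.type, EX.Pizza))
g.add((pizza_with_tomato, RDF.type, EX.Pizza))
g.add((add_tomato, RDF.type, PROV.Activity))

# "pizza_with_tomato is a Pizza generated by add_tomato, which used plain_pizza"
g.add((pizza_with_tomato, PROV.wasGeneratedBy, add_tomato))
g.add((add_tomato, PROV.used, plain_pizza))

print(g.serialize(format="turtle"))

If something along these lines is acceptable from the ontology side, a workflow engine could emit such triples automatically while the workflow is being constructed or executed.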

It may very well be that my post makes no sense at all from the ontology perspective. In that (quite possible) case, I would like to emphasise the bottom line: we would like to attach scientific/ontological concepts to the workflow methods and data, so that as the workflow is constructed, its ontology is intrinsically applied. It would be great if you could reply from that perspective.

I managed to get a good example from Sarath from NFDI here. There is no way to close the discussion here, so I'll leave it as it is, but there is no need to read my entries above.