Separate workflows into modules and examples

Maybe a question here to the app store experts: how would you define a workflow (with inputs/outputs), where the inputs are e.g. some material parameters and the output is the least-squares error of the model response compared to some experimental data? Then you could optimize your model parameters to best represent your experimental data by just using the workflow as a callback function for scipy.optimize. Note that this workflow is not a single job, but rather a complete workflow that might include multiple (dependent) jobs. I could imagine wrapping the notebook that currently defines the workflow, but that seems suboptimal.
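A minimal sketch of what I mean, with the whole (possibly multi-job) pipeline stubbed behind a single Python callable; the data, the tensile_workflow function, and the parameter names are invented for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Invented "experimental" stress-strain data for illustration
strain = np.array([0.0, 0.1, 0.2, 0.3])
stress_exp = np.array([0.0, 21.0, 39.5, 60.5])


def tensile_workflow(params):
    """Hypothetical workflow: in reality this would run a chain of
    dependent jobs for the given material parameters; here the whole
    pipeline is stubbed by a linear elastic model stress = E * strain.
    Returns the least-squares error against the experimental data."""
    (youngs_modulus,) = params
    stress_model = youngs_modulus * strain
    return float(np.sum((stress_model - stress_exp) ** 2))


# The workflow is just a callable, so it plugs directly into scipy
res = minimize(tensile_workflow, x0=[100.0], method="Nelder-Mead")
print(res.x[0])  # fitted Young's modulus
```

The point is that the initial guess (x0) lives outside the workflow definition, so the same workflow object could be stored abstractly and instantiated with different inputs.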
I would rather have the workflow separated, and e.g. have in the notebook an initial section (that is not part of the workflow) that defines the initial values and only then calls the workflow (so the abstract definition of a workflow stored in the app store would not include the instantiation of the inputs).
So the essential question is what we store in the app store: is it really a notebook that can be executed, or should we separate workflows as modules/classes that provide some functionality (computing outputs for given inputs including the compute environment, storing the data provenance graph, …) and then additionally publish examples/test cases that demonstrate/test the workflow for a specific setup?

In the future, I could even imagine that a workflow module is published (as a conda/PyPI package) such that I can import other workflows and compose new ones out of them by just doing something like import TensileTest.
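To make the composition idea concrete, a hedged sketch: if a TensileTest workflow existed as an importable package, a new workflow could wrap and extend it. TensileTest and all names here are hypothetical stand-ins, not an existing API:

```python
# from tensile_test import TensileTest  # hypothetical published workflow module


class TensileTest:
    """Stand-in for the imported workflow (no such package exists yet)."""

    def run(self, youngs_modulus):
        # Pretend this launches the full tensile-test job chain;
        # here it just evaluates stress at a fixed strain of 0.25.
        return {"max_stress": youngs_modulus * 0.25}


class CalibratedTensileTest:
    """New workflow composed out of the imported one: same interface,
    but with an extra calibration factor applied to the output."""

    def __init__(self, scale):
        self.base = TensileTest()
        self.scale = scale

    def run(self, youngs_modulus):
        out = self.base.run(youngs_modulus)
        out["max_stress"] *= self.scale
        return out


result = CalibratedTensileTest(2.0).run(100.0)
print(result["max_stress"])
```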

Hi @junger, this question is more directed at the pyiron framework. However, in terms of SimStack, we can in principle save the workflow itself together with the modules, or WaNos, and with that you can extend the workflow to fit your needs.

Dear @celso.rego, from my point of view this goes beyond a discussion within pyiron: I still think it might be a good idea to define the workflow independently of any tool, and only afterwards incorporate it into a single application for the execution, such as pyiron or SimStack. So (independent of the question of implementing the above within pyiron) I think it would be interesting to evaluate what makes it potentially difficult to use a common format to store our workflows (and, even better, how we could solve that).

At least the single modules/processes (I still think we would have to agree on the wording), which are jobs in pyiron or WaNos in SimStack, are characterized by a compute environment (here I would suggest a separate environment for each individual process rather than putting everything into a single global one, or at least providing that option) and some inputs/outputs: the input is mapped to the more complex input of the underlying application, the application is called, and its results are mapped back to the output of interest. Why would that be workflow-tool specific? (I am talking about the information that defines the process, not how this information is implemented and used afterwards in the execution.) I think the success of the project strongly depends on how well we extend and integrate multiple frameworks.
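The tool-agnostic process description I have in mind could be sketched like this; ProcessDefinition and all field names are assumptions for illustration, not an agreed format:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ProcessDefinition:
    """Hypothetical tool-agnostic description of a single process
    (a 'job' in pyiron, a 'WaNo' in SimStack)."""

    name: str
    environment: Dict[str, str]  # per-process compute environment
    map_input: Callable[[Dict[str, Any]], Dict[str, Any]]   # workflow input -> application input
    run: Callable[[Dict[str, Any]], Dict[str, Any]]         # call the underlying application
    map_output: Callable[[Dict[str, Any]], Dict[str, Any]]  # application output -> output of interest

    def execute(self, inputs):
        """Map inputs in, run the application, map results back out."""
        return self.map_output(self.run(self.map_input(inputs)))


# Trivial demo process: the "application" just doubles its input
proc = ProcessDefinition(
    name="demo",
    environment={"conda_env": "py311"},  # invented environment spec
    map_input=lambda i: {"x": i["strain"]},
    run=lambda a: {"y": 2 * a["x"]},
    map_output=lambda o: {"stress": o["y"]},
)
print(proc.execute({"strain": 3}))
```

Nothing in this description is specific to any one workflow tool; pyiron or SimStack would only differ in how they execute such a definition.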

Dear @junger, as promised, I made a WaNo for the tensile example mentioned during our last meeting. You can run it in SimStack by loading the input data from .json files or by directly changing the exposed parameters in the GUI.
