Implicit dependencies #25

Open
illusional opened this issue May 25, 2020 · 2 comments

Comments

@illusional
Member

Cases exist where we want some stepB to run AFTER stepA, even though there might not be an explicit dependency. We'd likely do this with dummy variables.

I have a few concerns with this feature:

  • It might mean steps rely on side effects to work, which could cause issues with reproducibility and idempotency, and obscure what a step actually requires in order to run.

Neither CWL nor WDL explicitly supports this feature, but we, as an intermediate, could generate the required annotations (provided this logic works).

Syntax

This syntax is a suggestion

  • Introduce a depends_on parameter on step creation.
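from janis_core import WorkflowBuilder
from janis_unix.tools import Echo

# StepWithNoExplicitOutput is a placeholder for any tool with no usable output
wf = WorkflowBuilder("depends_on_wf")
wf.input("inp", str)

wf.step("stepA", StepWithNoExplicitOutput(inp=wf.inp))
wf.step(
    "stepB",
    Echo(inp="Step A has finished"),
    depends_on=[wf.stepA],  # proposed parameter; a single step could also be accepted
)

wf.output("out", source=wf.stepB.out)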
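
To make the intended semantics concrete, here is a rough sketch of how depends_on could be folded into step ordering. It is plain Python and not Janis code: the helper names normalize_depends_on and execution_order are made up for illustration, and steps are represented by their names only.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def normalize_depends_on(depends_on):
    """Accept None, a single step name, or a list/tuple of step names."""
    if depends_on is None:
        return []
    if isinstance(depends_on, (list, tuple)):
        return list(depends_on)
    return [depends_on]


def execution_order(data_deps, implicit_deps):
    """Order steps so that data and implicit dependencies are both respected.

    Both arguments map a step name to the step(s) it must run after.
    This is a hypothetical helper, not part of any workflow library.
    """
    graph = {}
    for step in set(data_deps) | set(implicit_deps):
        explicit = set(normalize_depends_on(data_deps.get(step)))
        implicit = set(normalize_depends_on(implicit_deps.get(step)))
        graph[step] = explicit | implicit
    return list(TopologicalSorter(graph).static_order())


# stepB has no data dependency on stepA, only the implicit one.
order = execution_order(
    data_deps={"stepA": [], "stepB": []},
    implicit_deps={"stepB": "stepA"},  # singular form
)
print(order)
```

The point of the sketch is only that an implicit dependency behaves like one more edge in the step graph; how that edge is expressed in CWL or WDL is a separate question.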

Dummy variable

Introduce a dummy variable of type int[] on the inputs and outputs of EVERY tool, and then add an extra connection between the depended-on step and the dependent step.
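
As a rough illustration of the dummy-variable idea, the sketch below uses plain dictionaries rather than real Janis, CWL or WDL objects. The port names _dep_in and _dep_out and both helper functions are assumptions made up for this example.

```python
import copy

# Hypothetical names for the dummy ports; nothing in Janis defines these.
DUMMY_IN = "_dep_in"
DUMMY_OUT = "_dep_out"
DUMMY_TYPE = "int[]"


def add_dummy_ports(tools):
    """Return a copy of `tools` where every tool has a dummy input and output."""
    patched = copy.deepcopy(tools)
    for tool in patched.values():
        tool["inputs"][DUMMY_IN] = DUMMY_TYPE
        tool["outputs"][DUMMY_OUT] = DUMMY_TYPE
    return patched


def connect_implicit(connections, upstream, downstream):
    """Add the extra edge from the upstream dummy output to the downstream dummy input."""
    edge = ((upstream, DUMMY_OUT), (downstream, DUMMY_IN))
    return connections + [edge]


tools = {
    "stepA": {"inputs": {"inp": "string"}, "outputs": {}},
    "stepB": {"inputs": {"inp": "string"}, "outputs": {"out": "string"}},
}

patched = add_dummy_ports(tools)
connections = connect_implicit([], "stepA", "stepB")
print(connections)
```

With this wiring the engine sees an ordinary data dependency from stepA to stepB, which is why the approach needs no support from the target language, at the cost of touching every tool definition.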

Thanks to @drtconway for the feature suggestion, and for helping spec this feature out.

@drtconway

I like your proposed syntax better than the version I proposed in chat.

@mr-c
Contributor

mr-c commented Apr 29, 2021

> I have a few concerns with this feature:
>
> • It might mean steps are using side-effects to work - this might cause issues with reproducibility, idempotency and what is actually required for a step to work.

This is the only reason to have an explicit dependency beyond a data dependency.

CWL is explicitly designed for self-contained workflows that do not interact with stateful services. If we did design for that, we would need complexity on the order of Taverna or Pegasus. Ideally that would include transactions, rollbacks, and other advanced error handling. I presume the same is true for the other workflow languages that Janis targets.

So I would suggest that this would be a bad feature to implement as it makes dangerous practices easy.
