Implement a subset of the Common Workflow Language. #47
@jmchilton This is fantastic, thank you for the refresh! Right now the conformance tests are running at https://ci.commonwl.org/job/galaxy-planemo-conformance/ using With regards to
This is already supported in Example: (the
$namespaces: { edam: "http://edamontology.org/" }
$schemas: [ "http://edamontology.org/EDAM_1.16.owl" ]
I explicitly demo and promote use of EDAM for bioinformatics tools and workflows (though the example workflow repo still needs updating)
Good news: you can do this today without modifying or extending CWL. Any Galaxy output type (a file format, in CWL parlance) that isn't represented in EDAM can be added as an additional format specifier (preferably in a galaxyproject.org namespace): http://www.commonwl.org/v1.0/CommandLineTool.html#CommandInputParameter
Then the Galaxy user interface can choose to display only the files that have the Galaxy-specific type(s) when a CWL description specifies both generic and Galaxy-specific formats, thus giving the best user experience. In the case of Galaxy's Hypothetical example if Galaxy's
$namespaces: { edam: "http://edamontology.org/", galaxy: "https://galaxyproject.org/" }
$schemas: [ "http://edamontology.org/EDAM_1.16.owl", "https://galaxyproject.org/formats-release_17.01.owl" ]
Obviously it would be best for bioinformatics CWL descriptions to only use EDAM formats, but this approach means that you won't have to wait for EDAM updates to still have the best user experience in Galaxy (though EDAM releases much faster than they used to).
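The format-selection idea in the comment above could look roughly like the following. This is only a sketch: the function name, the `GALAXY_NS` constant, and the Galaxy format IRI used in the usage note are hypothetical, and nothing here is taken from Galaxy's code.

```python
from typing import Iterable, List

# Hypothetical namespace prefix; the comment only suggests "a galaxyproject.org namespace".
GALAXY_NS = "https://galaxyproject.org/"


def formats_to_display(formats: Iterable[str], galaxy_ns: str = GALAXY_NS) -> List[str]:
    """Prefer Galaxy-specific format IRIs, falling back to every declared format."""
    declared = list(formats)
    # Keep only the IRIs that live in the (assumed) Galaxy namespace.
    galaxy_specific = [iri for iri in declared if iri.startswith(galaxy_ns)]
    # If the description declares no Galaxy-specific format, show all of them.
    return galaxy_specific or declared
```

With an input declaring both an EDAM IRI and a made-up IRI such as `https://galaxyproject.org/formats/bam`, only the latter would be returned; with EDAM formats alone, the list comes back unchanged.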
Co-authored-by: Nicola Soranzo <[email protected]>
Download conformance tests to `test/functional/tools/cwl_tools`
I think that might have just not been merged properly when we fixed this in upstream Galaxy (galaxyproject@4df1de3).
So basically revert #128.
Fixes waiting for history state to become OK if the history contains only empty collections.
@@ -314,7 +314,7 @@ def __init__(self, dataset_populator, workflow_populator, history_id, workflow_i
         self.invocation_id = invocation_id

     def _output_name_to_object(self, output_name):
-        invocation_response = self.dataset_populator._get(f"workflows/{self.invocation_id}/invocations/{self.workflow_id}")
+        invocation_response = self.dataset_populator._get(f"workflows/{self.workflow_id}/invocations/{self.invocation_id}")
         api_asserts.assert_status_code_is(invocation_response, 200)
We can just use the invocation ID now also.
Good point, addressed in a commit that I'll push soon.
Closing this with the aim of opening this against galaxyproject/dev ASAP.
New location is galaxyproject#12909; thank you @nsoranzo!
This should support a subset of draft-3 and v1.0 tools.
What is holding us back from merging the progress so far?
CWL Support (Tools):
- `secondaryFiles` that are actual Files are implemented; secondaryFiles containing directories are not yet implemented.
- `InlineJavascriptRequirement`s are supported to define output files (see the `test_cat3` test case).
- `EnvVarRequirement`s are supported (see the `test_env_tool1` and `test_env_tool2` test cases).
- `parseInt-tool` test case).
CWL Support (Workflows):
- `step-valueFrom` and `step-valueFrom2`). This work doesn't yet model non-tool parameters to steps, so complex `valueFrom` expressions like the one in `step-valueFrom3` do not work yet.
Remaining Work
The work remaining is vast and will be tracked at https://github.com/common-workflow-language/galaxy/issues for the time being.
Implementation Notes:
Tools:
- `expression.json` files. Traditionally Galaxy hasn't supported non-File outputs from tools, but CWL Galaxy has work in progress on bringing native Galaxy support for such outputs (Add Expression Tools to Galaxy #27).
- `__secondary_files__` directory in the dataset's extra_files_path directory and indexed in a file called __secondary_files_index.json in extra_files_path. The upload tool has been augmented to allow attaching arbitrary extra files as a tar file to support getting data into this format initially. CWL requires staging files to include their parent File's `basename`, but tools describe inputs as just the extension. I'm not sure which way Galaxy should store secondary_files in its objectstore (just with the extension, or with the basename and extension); both options are implemented and can be swapped by setting the boolean STORE_SECONDARY_FILES_WITH_BASENAME in galaxy.tools.cwl.util.
- `inputs_representation` parameter that can be set to "cwl" now. The `cwl` representation for running tools corresponds to the CWL job json format with {class: "File", path: "/path/to/file"} inputs replaced with {"src": "hda", "id": "<dataset_id>"}. Code for building these requests for CWL job json is available in the test class.
- `File` or non-`File` and determined at runtime, so `galaxy.json` is used to dynamically adjust output extension as needed for non-`File` parameters.
Workflows:
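To make the `cwl` inputs representation from the tool notes above more concrete, here is a minimal sketch of the File-to-`hda` substitution. It is not the code from the test class: the function name and the `dataset_ids` mapping (local path to the ID of an already uploaded dataset) are assumptions made for illustration.

```python
from typing import Any, Dict


def cwl_job_to_galaxy_request(job: Any, dataset_ids: Dict[str, str]) -> Any:
    """Recursively replace CWL File objects with Galaxy dataset references.

    `dataset_ids` is a hypothetical mapping from a local file path to the
    ID of the Galaxy dataset that was uploaded for that path.
    """
    if isinstance(job, dict):
        if job.get("class") == "File":
            # {class: "File", path: ...} becomes {"src": "hda", "id": ...}
            return {"src": "hda", "id": dataset_ids[job["path"]]}
        return {key: cwl_job_to_galaxy_request(value, dataset_ids) for key, value in job.items()}
    if isinstance(job, list):
        return [cwl_job_to_galaxy_request(item, dataset_ids) for item in job]
    # Non-File values (numbers, strings, booleans, null) pass through unchanged.
    return job
```

The sketch ignores secondary files and directories, which would need their own handling.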
Implementation Description:
The reference implementation Python library (mainly developed by Peter Amstutz - https://github.com/common-workflow-language/common-workflow-language/tree/master/reference) is used to load tool files ending with `.json` or `.cwl`, and proxy objects are created to adapt these tools to Galaxy representations. In particular, input and output descriptions are loaded from the tool.
When the tool is submitted, a specialized tool class is used to build a cwltool-compatible job description from the supplied Galaxy inputs, and the CWL reference implementation is used to generate a CWL reference implementation Job object. A command line is generated from this Job object.
As a result of this, Galaxy largely does not need to worry about the details of command-line adapters, expressions, etc.
Galaxy writes a description of the CWL job that it can reload to the job working directory. After the process is complete (on the Galaxy compute server, but outside the Docker container) this representation is reloaded and the dynamic outputs are discovered and moved to fixed locations as expected by Galaxy. CWL allows for much more expressive output locations than Galaxy, for better or worse, and this step uses cwltool to adapt CWL to Galaxy outputs.
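The "move dynamic outputs to fixed locations" step can be pictured with the sketch below. It assumes the CWL outputs have already been resolved into a plain dictionary and that Galaxy's expected paths are known; the function name and both arguments are hypothetical, and the real implementation relies on cwltool for output discovery rather than on anything shown here.

```python
import shutil
from pathlib import Path
from typing import Any, Dict


def move_outputs_to_fixed_locations(cwl_outputs: Dict[str, Any], galaxy_paths: Dict[str, str]) -> Dict[str, str]:
    """Move each discovered CWL File output to the path Galaxy expects for it."""
    moved = {}
    for name, value in cwl_outputs.items():
        is_file = isinstance(value, dict) and value.get("class") == "File"
        # Non-File outputs and outputs without a known Galaxy path are skipped here.
        if is_file and name in galaxy_paths:
            target = Path(galaxy_paths[name])
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(value["path"], str(target))
            moved[name] = str(target)
    return moved
```

Non-File outputs are deliberately left alone in this sketch; the tool notes describe those as being handled through `expression.json` and `galaxy.json`.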
Currently all `File` outputs are sniffed to determine a Galaxy datatype. CWL allows refinement on this, and this remains work to be done.
Implementation Links:
Hundreds of commits have been rebased into this one, so the details of individual parts of the implementation and how they built on each other are not entirely clear. To see the original ideas behind individual features, here are some relevant links:
Testing:
Start Galaxy.
Open http://localhost:8080/ and see CWL test tools (along with all Galaxy test tools) in left hand tool panel.
To go a step further and actually run CWL jobs within their designated Docker containers, copy the following minimal Galaxy job configuration file to `config/job_conf.xml`. (Adjust the `docker_sudo` parameter based on how you execute Docker.)
https://gist.github.com/jmchilton/3997fa471d1b4c556966
Run API tests demonstrating the various CWL demo tools with the following command.
The first two execute various tool and workflow test cases manually crafted during implementation of this work. The third is an auto-generated test case class that contains Python tests for every CWL conformance test found with the reference specification.
An individual conformance test can be run using this pattern:
Issues and Contact
Report issues at https://github.com/common-workflow-language/galaxy/issues and feel free to ping jmchilton on the CWL Gitter channel.