s3a URL not working #2096
Comments
Hello @dekinsitro, thank you for submitting this issue. The docs suggest including
Are you running Spark on AWS, perhaps via EMR?
I'm running on a simple Ubuntu 18.04 EC2 VM, not EMR. Spark/EMR on AWS already includes the necessary S3 connector jars. Using your command changes the error, but it is still roughly the same problem.
I don't see any indication that it even attempts to download the packages; it just looks for them in the cache.
Right, things can be a little bit different depending on the Spark installation. For example, for me on Cloudera CDH only the
I don't know why your version of Spark isn't trying to download the necessary dependencies. Perhaps there are network or Ivy settings issues? Another option would be to pull the dependencies into your local Ivy cache using
I'll try hopping on an Ubuntu EC2 instance tomorrow to see if I can replicate your issue.
Interesting suggestion. Please do try to reproduce this problem with a modern (Ubuntu 18.04) VM if possible. I'm basically installing with either "conda install -c conda-forge adam" or "pip install bdgenomics.adam".
Sorry for dropping this for a while. I'll try to replicate this later this week with the new 0.27.0 release.
I am trying to follow the documentation to allow ADAM to read a BAM file from S3.
According to https://adam.readthedocs.io/en/latest/deploying/aws/#input-and-output-data-on-hdfs-and-s3 I should run a command like this:
adam-submit --packages com.amazonaws:aws-java-sdk-pom:1.11.463,net.fnothaft:jsr203-s3a:0.0.1 -- transformAlignments s3a://1000genomes/phase1/data/NA12878/exome_alignment/NA12878.mapped.illumina.mosaik.CEU.exome.20110411.bam /mnt/test.adam
When I run that command, I get an error with many unresolved dependency jars:
:: problems summary ::
:::: WARNINGS
[NOT FOUND ] org.apache.commons#commons-math3;3.1.1!commons-math3.jar (0ms)
....
:::: WARNINGS
[NOT FOUND ] org.apache.commons#commons-math3;3.1.1!commons-math3.jar (0ms)
It's not clear to me (I don't work with Java much) what is going on, but my guess is that the tool that should download the package dependencies never runs, and it only looks for cached data in the Maven cache.
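For anyone triaging a similar report, a minimal sketch of how the problems summary above could be condensed into a list of unique missing artifacts. It is not part of ADAM or Spark: the helper names `missing_artifacts` and `as_maven_coordinates` are hypothetical, and the regular expression assumes every line follows the `[NOT FOUND ] org#module;revision!artifact` shape seen in this log.

```python
import re

# Assumed line shape, taken from the log in this issue:
#   [NOT FOUND ] org.apache.commons#commons-math3;3.1.1!commons-math3.jar (0ms)
NOT_FOUND = re.compile(
    r"\[NOT FOUND\s*\]\s*"
    r"(?P<org>[^#\s]+)#(?P<module>[^;\s]+);(?P<rev>[^!\s]+)!(?P<artifact>\S+)"
)


def missing_artifacts(log_text):
    """Return unique (org, module, revision) tuples in first-seen order."""
    seen = []
    for match in NOT_FOUND.finditer(log_text):
        coord = (match["org"], match["module"], match["rev"])
        if coord not in seen:
            seen.append(coord)
    return seen


def as_maven_coordinates(coords):
    """Format tuples as group:artifact:version, the form --packages takes."""
    return [":".join(coord) for coord in coords]


# Sample input copied from the problems summary quoted in this issue.
sample = """\
:: problems summary ::
:::: WARNINGS
[NOT FOUND ] org.apache.commons#commons-math3;3.1.1!commons-math3.jar (0ms)
....
:::: WARNINGS
[NOT FOUND ] org.apache.commons#commons-math3;3.1.1!commons-math3.jar (0ms)
"""

if __name__ == "__main__":
    for coordinate in as_maven_coordinates(missing_artifacts(sample)):
        print(coordinate)
```

Deduplicating the entries makes it easier to see how many distinct jars are actually missing, since the same artifact is reported more than once in the summary.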