SNOW-1859664: Issues uploading data via PUT to s3 on driver v1.8 or above (including 1.12.1) #1279
Hi - thanks for filing this issue with us. One likely change influencing the behaviour you're seeing between 1.7.1 and 1.8.0 is #991, where we stopped swallowing the errors we got back from cloud storage :) So no extra error was introduced; we simply surface the error now - an error which was always there - instead of silently ignoring it. I just confirmed this with a quick test on my end. A Snowflake PUT consists of two phases: first the file is uploaded to the internal stage, then the driver verifies the upload (the HeadObject request you see in the error).
The second operation is what seems to be failing in your case. The file itself is likely uploaded (phase 1); just the verification step appears to fail. As a next step, you could trace on your side where the connection gets closed, e.g. with a packet capture.
Alternatively, instead of fixing the underlying issue on the infrastructure, you can decide to keep ignoring PUT/GET errors by setting the corresponding configuration option. Please let me know how this went.
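For illustration, a minimal sketch of how the surfaced error shows up on the caller side - the DSN, stage and file names below are placeholders, not taken from this issue:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/snowflakedb/gosnowflake"
)

func main() {
	// Placeholder DSN; in a real setup this comes from your configuration.
	db, err := sql.Open("snowflake", "user:password@myaccount/mydb/myschema")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Phase 1 uploads the file, phase 2 verifies it on the stage.
	// With driver 1.8.0+ a failing verification is returned as an error here,
	// instead of being silently swallowed as in 1.7.1.
	if _, err := db.Exec("PUT file:///tmp/data.csv @~/test_stage AUTO_COMPRESS=TRUE"); err != nil {
		log.Printf("PUT failed: %v", err)
	}
}
```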
I can confirm that this matches what we are seeing. I assume all of the hosts are 100% Snowflake-owned, so there should be nothing on the receiving side blocking this?
Based on what you are saying: would we still get the detailed log output when setting the option to ignore PUT/GET errors?
But ignoring or surfacing the same errors does not fix them. If they were originating from the Snowflake end, I would expect tons of issues reported to us from other users, but there aren't any (besides this one). I still would like to re-do my test in the same Snowflake deployment as yours, just to be sure. Can you please share in which Snowflake deployment you're experiencing this problem?
This is for the *.eu-central-1.snowflakecomputing.com Snowflake deployment. The connections are coming from a data center (not AWS), so maybe there is something blocked on the side of the Snowflake deployment, but I am very certain that nothing is blocked on the data center side.
I understand that hiding the error doesn't fix it... I want a somewhat safe way to get the upgrade into a release so that I do not have to roll it back five minutes after seeing the first issue, given that this is apparently not a critical element at the moment.
Perfect - my tests were run on the exact same deployment:
and worked, so I can confirm it works on the Snowflake side. The next step is probably for you to work together with your network guys and trace down on which hop the socket gets closed, which in turn leads to the `broken pipe` error. We can help of course by giving hints here; I already suggested some tools for it above. Since you mentioned the source is a data center, I would also suggest checking with your network guys whether you're perhaps using an S3 gateway, which can contribute to this behaviour if it's not configured to transparently allow all kinds of requests to Snowflake.
That is entirely correct and I agree with you. I guess you need to consider: are the files actually uploaded correctly to the Snowflake internal stage with 1.8.0 / 1.12.1 too, the same way as with 1.7.1? You can verify it e.g. by downloading the same file again with GET.
I think that's all I can add to this issue for now. If you need Snowflake's help with looking into information which you might not want to share here (the driver's logs, which can contain sensitive information, the packet capture, etc.), you can file an official case with Snowflake Support and we can take it from there. Please do understand, however, that Snowflake cannot fix the issue, regardless of whether it's filed on GitHub or in an official Support ticket, when it is related to non-Snowflake infrastructure. Thank you for your kind understanding.
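As a rough sketch of that verification (stage name, file name and local paths are placeholders): list the stage and pull the file back down with GET, then compare it with the original:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/snowflakedb/gosnowflake"
)

func main() {
	db, err := sql.Open("snowflake", "user:password@myaccount/mydb/myschema") // placeholder DSN
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// LIST shows what actually landed on the internal stage.
	rows, err := db.Query("LIST @~/test_stage")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	var name, md5, lastModified string
	var size int64
	for rows.Next() {
		if err := rows.Scan(&name, &size, &md5, &lastModified); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("staged: %s (%d bytes)\n", name, size)
	}

	// GET downloads the staged file again so it can be diffed against the source.
	if _, err := db.Exec("GET @~/test_stage/data.csv.gz file:///tmp/verify/"); err != nil {
		log.Fatal(err)
	}
}
```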
No worries, I understand. This information has already been very valuable in understanding the underlying issue better. I am 100% certain that we do not have an S3 proxy in play here, and I also checked that the HEAD requests are generally working from the machines making those requests. That's why I assumed that it is probably either a change in what the driver does or something not working on your end. So from our data center side there is nothing blocking this. I will look into it and provide more details if it turns out that it might be driver-related after all (which I doubt, based on your information).
What you just wrote is interesting:
This is new information. So: if the HEAD requests are generally working, but you have a certain particular flow which always (or sometimes) fails and can generate the issue, then I would like to test the exact same flow on my setup and see if I can reproduce it locally, from a different network. Do you perhaps have a reproduction setup which you can share, which I can try on my end to see if it reproduces for me? For now, I was testing general functionality with a small file, but apparently general functionality is working for you too. Without this bit of information I was under the impression that general functionality doesn't work - the error suggested as much:
meaning that even after 3 attempts, the HeadObject request still fails. Any information might be helpful, e.g. whether the file must be bigger than X MB to trigger the issue - any details. Best of course would be a repro code snippet, a runnable program, or a repro GitHub repo, if that's an option here. Thank you in advance!
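Something along these lines would already be enough - a small self-contained program that generates a file of a given size and PUTs it (DSN and stage are placeholders; the size can be varied to check whether it matters):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"
	"os"

	_ "github.com/snowflakedb/gosnowflake"
)

func main() {
	const sizeMB = 100 // vary this to check whether the failure depends on file size

	// Generate a throwaway file of sizeMB megabytes.
	f, err := os.CreateTemp("", "repro-*.bin")
	if err != nil {
		log.Fatal(err)
	}
	defer os.Remove(f.Name())
	if err := f.Truncate(int64(sizeMB) * 1024 * 1024); err != nil {
		log.Fatal(err)
	}
	f.Close()

	db, err := sql.Open("snowflake", "user:password@myaccount/mydb/myschema") // placeholder DSN
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	query := fmt.Sprintf("PUT file://%s @~/repro_stage AUTO_COMPRESS=FALSE OVERWRITE=TRUE", f.Name())
	if _, err := db.Exec(query); err != nil {
		log.Fatalf("PUT of %d MB file failed: %v", sizeMB, err)
	}
	log.Printf("PUT of %d MB file succeeded", sizeMB)
}
```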
Hi, regarding this one:
This is perfectly fine. It is the only way to use PUTs with the Go driver. You can set this value in the connection configuration:
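Roughly speaking, and with placeholder values throughout (the option name in the second variant is illustrative, not an actual driver parameter), a driver option can be set either via the Config struct or directly in the DSN:

```go
package main

import (
	"database/sql"
	"log"

	sf "github.com/snowflakedb/gosnowflake"
)

func main() {
	// Option 1: build the DSN from a Config struct. The option discussed above
	// would be set as the corresponding Config field (all values here are placeholders).
	cfg := &sf.Config{
		Account:   "myaccount",
		User:      "user",
		Password:  "password",
		Database:  "mydb",
		Schema:    "myschema",
		Warehouse: "mywh",
	}
	dsn, err := sf.DSN(cfg)
	if err != nil {
		log.Fatal(err)
	}

	db, err := sql.Open("snowflake", dsn)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Option 2: append the option to a hand-written DSN, e.g.
	// "user:password@myaccount/mydb/myschema?someOption=true"
	// ("someOption" is a placeholder name, not a real driver parameter).
}
```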
I ran what you suggested above. Sure, I can try to stitch together an example case.
Happy New Year! 🚀 So I stitched together an example. It fails when trying to do the OCSP check, so there might be some issue in our network setup that prevents that information from being retrieved.
When going back down to v1.7.1 I am not getting any of that, although I have seen some OCSP cache errors at times, making things slower - those were issues with DNS resolution, which was sometimes a bit glitchy. On v1.12.1 it seems to just get stuck and keep retrying forever.
So, assuming that it is OCSP, I tried to disable it following https://community.snowflake.com/s/article/How-to-turn-off-OCSP-checking-in-Snowflake-client-drivers and added the parameter described there. To make sure that nothing else was the matter, I also disabled the OCSP checks directly in the vendored gosnowflake code... and it worked. I also checked the domain: when the connection was set up 2 1/2 years ago from our data center (routed through some AWS setup on our end), everything under the corresponding privatelink domain was part of the setup, including the OCSP domain.
Hey @tobischo, Happy New Year to you too! I'm currently on leave and will be back around the middle of next week to look into your reproduction and this comment, but I still wanted to drop by quickly to thank you for the effort of putting the reproduction together and sharing the initial observations. Really appreciated! Some quick remarks without deep analysis:
Indeed, the expected behaviour (when either of the above flags is specified) is to use a different HTTP transport which has the OCSP-related checks disabled altogether, not even attempting them. I will return to this issue next week; again, thank you very much, this is very helpful!
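For reference, a sketch of switching the OCSP checks off via the Config struct; the field name below is from memory and may differ between driver versions, so please double-check it against the documentation of the version in use:

```go
package main

import (
	"database/sql"
	"log"

	sf "github.com/snowflakedb/gosnowflake"
)

func main() {
	cfg := &sf.Config{
		Account:  "myaccount", // placeholder values
		User:     "user",
		Password: "password",
		// Skips the certificate revocation (OCSP) check entirely; field name
		// recalled from memory and may differ between driver versions.
		InsecureMode: true,
	}
	dsn, err := sf.DSN(cfg)
	if err != nil {
		log.Fatal(err)
	}

	db, err := sql.Open("snowflake", dsn)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```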
What version of GO driver are you using?
1.12.1
What operating system and processor architecture are you using?
Debian Linux, x86
What version of GO are you using?
1.23.3
Server version (e.g. 1.90.1)?
8.46.1
What did you do?
We have been using the Snowflake Go driver on v1.7.1 for quite a long time now, as we first encountered issues with v1.8. Back then we downgraded and did not follow up on the matter, under the assumption that it might just be a client issue and would be fixed in the next versions. Recently we updated to 1.12.1 to verify whether this is working and received the following error on the client:
Failed to upload data to snowflake via PUT: 264003: unexpected error while retrieving header: operation error S3: HeadObject, exceeded maximum number of attempts, 3, https response error StatusCode: 0, RequestID: , HostID: , request send failed, Head "https://<s3url>": write tcp <internal IP>-><aws ip>:443: write: broken pipe
Downgrading back to 1.7.1 fixes the issue for now.
What did you expect to see?
v1.12.1 to work the same as v1.7.1
Can you set logging to DEBUG and collect the logs?
Not easily
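For later reference, should it become feasible: a sketch of turning on verbose driver logging; the API names are from memory and worth double-checking against the driver version in use:

```go
package main

import (
	"log"
	"os"

	sf "github.com/snowflakedb/gosnowflake"
)

func main() {
	// Route driver logs to stderr and raise the level to debug.
	logger := sf.GetLogger()
	logger.SetOutput(os.Stderr)
	if err := logger.SetLogLevel("debug"); err != nil {
		log.Fatal(err)
	}
	// Alternatively (from memory), setting Tracing: "debug" in the Config
	// struct should have a similar effect.
	// ... open the connection and run the failing PUT as usual ...
}
```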