Wrong gzip compression recognized/identified by header #11066
Comments
@todor-ivanov before your day is over, please let me know where we stand with this issue and I might continue working on it. Regarding the warning message:
it has been there for many years and AFAIK it's harmless. However, it might be doing something unexpected now that gzip is defined at the lowest level (pycurl_manager module).
If it is gzip decompression, you need to decide what the default behavior should be. According to line https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/Services/pycurl_manager.py#L80 it is currently set up to fail if the given body is not compressed. We have two options here:
But I think we have the wrong default settings in Requests.py here https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/Services/Requests.py#L186 and https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/Services/Requests.py#L215, which set gzip, along with others, as the default. If the couch code can handle gzip encoding (and I think it can), it will try to do so. The message
This issue can be reproduced with:
and a fix has been tested and proposed in: #11069
Impact of the bug
Reqmgr2, WorkQueue, and maybe some others as well
Describe the bug
While validating the April release we noticed an error coming from the newly introduced decompress function in the pycurl_manager module. This function was added in PR [1]. It decompresses compressed streams received over HTTP, which are identified through negotiation of the content compression between the server and the client via the HTTP headers. It looks like in some cases the content's compression has been wrongly set to gzip [2]. How the proper compression is identified is discussed in more detail in the issue/PR above. One thing is suspicious in the log message for the current bug. At the end it says "Getting from Cache due to: CouchNotFoundError - reason: Object Not Found, data: {} result: None", which may be a sign of reading from a cache file that does not have the proper compression.
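One possible way to tolerate a body whose Content-Encoding header says gzip while the bytes are not actually compressed is to check the gzip magic number before decompressing. The following is only a minimal sketch under that assumption. The name safe_decompress and its signature are hypothetical and are not the WMCore pycurl_manager implementation.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def safe_decompress(body, headers):
    """Hypothetical helper: decompress only when the body really is gzip.

    body    -- raw response bytes
    headers -- dict of response headers (assumed to use the canonical
               'Content-Encoding' key, as in the log of this issue)
    """
    encoding = headers.get("Content-Encoding", "")
    if encoding.lower() != "gzip":
        # The server did not announce gzip: return the body untouched.
        return body
    if not body.startswith(GZIP_MAGIC):
        # The header claims gzip but the bytes do not look like gzip,
        # which is the situation reported here. Fall back to the raw body.
        return body
    return gzip.decompress(body)
```

Whether silently falling back is the right default, or whether the call should fail loudly, is exactly the design decision discussed in the comments of this issue.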
How to reproduce it
Not clear yet. If it is related to reading gzip data from the cache, it might be reproducible by sending the proper header from the client and then reading the data from an old cache file whose contents are not compressed (still to be confirmed).
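Independently of where the uncompressed bytes come from, the symptom itself can be shown in isolation: passing non-gzipped bytes to gzip.decompress raises the same kind of error as in the traceback [2]. This is only a sketch of the client-side failure, with a made-up payload and a hypothetical helper name (try_decompress). It does not reproduce the server or cache condition.

```python
import gzip


def try_decompress(body):
    """Hypothetical helper: report whether gzip.decompress accepts the body."""
    try:
        return gzip.decompress(body)
    except OSError as exc:
        # gzip.BadGzipFile (Python 3.8+) is a subclass of OSError.
        return "failed: %s" % exc


# Made-up payload: plain bytes that do not start with the gzip magic number.
plain_body = b"cc this is not compressed"

print(try_decompress(plain_body))
print(try_decompress(gzip.compress(b"compressed payload")))
```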
Expected behavior
A clear and concise description of what you expected to happen.
Additional context and error message
[1]
#11036
[2]
[05/Apr/2022:02:49:24] Updating request "tivanov_ReReco_Parents_HG2204_Val_220401_123539_5532" with these user-provided args: {'total_jobs': 0, 'input_events': 0, 'input_lumis': 0, 'input_num_files': 0, 'RequestName': 'tivanov_ReReco_Parents_HG2204_Val_220401_123539_5532'}
ERROR:root:While processing decompress function with headers: {'Date': 'Tue, 05 Apr 2022 00:49:24 GMT', 'Server': 'CouchDB/1.6.1 (Erlang OTP/R16B03-1)', 'Set-Cookie': 'cms-node=624b9214001f36ffe1331bdb2ae875f748469a9f8732fb707323f8b02b6822441ae6cccea426fbbd05c2c42b;path=/;secure;httponly', 'ETag': '"W1AVV/xAOj0Xndl63XAMIA=="', 'Content-Type': 'application/json', 'Content-MD5': 'W1AVV/xAOj0Xndl63XAMIA==', 'Content-Length': '17414', 'Content-Encoding': 'gzip', 'Cache-Control': 'must-revalidate', 'Accept-Ranges': 'none', 'CMS-Server-Time': 'D=12028 t=1649119764746333'}, we were unable to decompress gzip content. Details: Not a gzipped file (b'cc')
Traceback (most recent call last):
  File "/data/srv/HG2204b/sw/slc7_amd64_gcc630/cms/reqmgr2/2.0.2.pre2/lib/python3.8/site-packages/WMCore/Services/pycurl_manager.py", line 80, in decompress
    return gzip.decompress(body)
  File "/data/srv/HG2204b/sw/slc7_amd64_gcc630/external/python3/3.8.2-comp/lib/python3.8/gzip.py", line 548, in decompress
    return f.read()
  File "/data/srv/HG2204b/sw/slc7_amd64_gcc630/external/python3/3.8.2-comp/lib/python3.8/gzip.py", line 292, in read
    return self._buffer.read(size)
  File "/data/srv/HG2204b/sw/slc7_amd64_gcc630/external/python3/3.8.2-comp/lib/python3.8/gzip.py", line 479, in read
    if not self._read_gzip_header():
  File "/data/srv/HG2204b/sw/slc7_amd64_gcc630/external/python3/3.8.2-comp/lib/python3.8/gzip.py", line 427, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'cc')
Getting from Cache due to: CouchNotFoundError - reason: Object Not Found, data: {} result: None