Occasional "hangs" with FsspecStacIO #457

Open
TomAugspurger opened this issue Aug 16, 2023 · 0 comments
TomAugspurger commented Aug 16, 2023

Describe the bug

Dumping this less than complete bug report here. I'll try to fill in more details later.

We're using FsspecStacIO via stactools.sentinel2 and are observing regular "hangs". Here are the logs from the process creating the STAC items for a bunch of scenes:

[INFO] 2023-08-15 21:55:48,346 - (000.06%) [0.78s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/VC/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SVC_20230508T135506.SAFE/manifest.safe (1 of 1684)
[INFO] 2023-08-15 21:55:48,347 - Created item
[INFO] 2023-08-15 21:55:49,103 - (000.12%) [0.69s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/WB/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SWB_20230508T133704.SAFE/manifest.safe (2 of 1684)
[INFO] 2023-08-15 21:55:49,103 - Created item
[INFO] 2023-08-15 22:00:50,805 - (000.18%) [301.67s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/XB/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SXB_20230508T133650.SAFE/manifest.safe (3 of 1684)
[INFO] 2023-08-15 22:00:50,805 - Created item
[INFO] 2023-08-15 22:00:51,453 - (000.24%) [0.62s]  - blob://sentinel2l2a01/sentinel2-l2/01/V/CC/2023/05/07/S2A_MSIL2A_20230507T231551_N0509_R087_T01VCC_20230508T070652.SAFE/manifest.safe (4 of 1684)
[INFO] 2023-08-15 22:00:51,453 - Created item

The ~300s item creation time is suspiciously close to the default 300s timeout of aiohttp (the HTTP library fsspec uses internally) plus the normal item creation time. I haven't been able to debug exactly what the issue is yet.
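For anyone wanting to check their own logs for the same pattern, here is a rough sketch that flags items whose bracketed duration is at or above a threshold. It assumes the progress-line layout shown above; `slow_items` and `PROGRESS` are names made up for this illustration, and the 300-second threshold is just the suspected aiohttp timeout, not a confirmed cause.

```python
import re

# Matches the "[301.67s]" duration, the href, and the "(3 of 1684)" counter
# in the progress lines shown in the logs above.
PROGRESS = re.compile(
    r"\[(?P<seconds>\d+(?:\.\d+)?)s\]\s+-\s+(?P<href>\S+)\s+\((?P<index>\d+) of \d+\)"
)


def slow_items(log_lines, threshold=300.0):
    """Yield (index, seconds, href) for items that took >= threshold seconds."""
    for line in log_lines:
        match = PROGRESS.search(line)
        if match and float(match["seconds"]) >= threshold:
            yield int(match["index"]), float(match["seconds"]), match["href"]


# Two progress lines copied from the logs in this issue.
SAMPLE = [
    "[INFO] 2023-08-15 21:55:49,103 - (000.12%) [0.69s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/WB/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SWB_20230508T133704.SAFE/manifest.safe (2 of 1684)",
    "[INFO] 2023-08-15 22:00:50,805 - (000.18%) [301.67s]  - blob://sentinel2l2a01/sentinel2-l2/36/S/XB/2023/05/08/S2B_MSIL2A_20230508T082609_N0509_R021_T36SXB_20230508T133650.SAFE/manifest.safe (3 of 1684)",
]

if __name__ == "__main__":
    for index, seconds, href in slow_items(SAMPLE):
        print(index, seconds, href)
```

If the flagged durations cluster just above 300s rather than being spread out, that would be consistent with the timeout theory, though it would not prove it.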

To reproduce

Something like this occasionally reproduces it, but see below for caveats:

import pystac
from stactools.sentinel2 import stac
import planetary_computer
from stactools.core.utils.antimeridian import Strategy

lines = [
 '47/D/PG/2023/01/01/S2B_MSIL2A_20230101T023549_N0400_R060_T47DPG_20230101T153257.SAFE',
 '44/T/QK/2023/01/01/S2B_MSIL2A_20230101T052229_N0400_R062_T44TQK_20230101T183526.SAFE',
 '47/P/QQ/2023/01/01/S2B_MSIL2A_20230101T034139_N0400_R061_T47PQQ_20230101T180408.SAFE',
 '49/S/BB/2023/01/01/S2B_MSIL2A_20230101T034139_N0400_R061_T49SBB_20230101T165453.SAFE',
 '28/H/EC/2023/01/01/S2B_MSIL2A_20230101T105029_N0400_R065_T28HEC_20230101T211552.SAFE',
 '28/H/FC/2023/01/01/S2B_MSIL2A_20230101T105029_N0400_R065_T28HFC_20230102T002315.SAFE',
 '28/H/GC/2023/01/01/S2B_MSIL2A_20230101T105029_N0400_R065_T28HGC_20230102T002146.SAFE',
 '31/S/GV/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T31SGV_20230101T224703.SAFE',
 '29/H/KT/2023/01/01/S2B_MSIL2A_20230101T105029_N0400_R065_T29HKT_20230101T220421.SAFE',
 '29/S/QS/2023/01/01/S2A_MSIL2A_20230101T111451_N0400_R137_T29SQS_20230102T001441.SAFE',
 '32/S/KD/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T32SKD_20230101T231854.SAFE',
 '32/S/KE/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T32SKE_20230101T215858.SAFE',
 '32/S/LD/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T32SLD_20230101T214510.SAFE',
 '32/S/LE/2023/01/01/S2B_MSIL2A_20230101T102339_N0400_R065_T32SLE_20230101T235039.SAFE',
 '33/P/UR/2023/01/01/S2A_MSIL2A_20230101T093411_N0400_R136_T33PUR_20230102T001951.SAFE',
 '32/P/QB/2023/01/01/S2A_MSIL2A_20230101T093411_N0400_R136_T32PQB_20230102T000130.SAFE',
 '13/T/DG/2023/01/01/S2A_MSIL2A_20230101T175741_N0400_R141_T13TDG_20230102T042809.SAFE',
 '13/T/DH/2023/01/01/S2A_MSIL2A_20230101T175741_N0400_R141_T13TDH_20230102T045803.SAFE',
 '13/T/EG/2023/01/01/S2A_MSIL2A_20230101T175741_N0400_R141_T13TEG_20230102T062330.SAFE',
 '13/T/EH/2023/01/01/S2A_MSIL2A_20230101T175741_N0400_R141_T13TEH_20230102T045007.SAFE'
]

for line in lines:
    print(line)
    granule_href = f"https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/{line}"
    item: pystac.Item = stac.create_item(
        granule_href=granule_href,
        read_href_modifier=planetary_computer.sign,
        antimeridian_strategy=Strategy.NORMALIZE,
        coordinate_precision=7,
    )

Unfortunately, I've only been able to reproduce it in an environment where the process running the command is in the same Azure region as the data (West Europe in this case). I haven't been able to reproduce it in an environment where I can actually inspect the process to see what's going on.
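One possible way to get stack information out of a process that can't be inspected interactively is the standard library's faulthandler module, which can write the tracebacks of all threads to a file if a block takes too long. The sketch below is only that: `dump_stacks_if_stuck` is a hypothetical helper name, the 60-second value in the usage comment is arbitrary, and the dump only shows Python thread stacks, so it may not reveal what the coroutine on fsspec's event loop is waiting on.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def dump_stacks_if_stuck(seconds, file=None):
    """Dump all thread tracebacks to `file` every `seconds` while the block runs.

    `file` must have a real file descriptor (fileno()); it defaults to stderr.
    """
    target = sys.stderr if file is None else file
    # repeat=True keeps dumping on every timeout until cancelled, so a long
    # hang produces several snapshots that can be compared.
    faulthandler.dump_traceback_later(seconds, repeat=True, file=target)
    try:
        yield
    finally:
        faulthandler.cancel_dump_traceback_later()


# Hypothetical usage around the call that hangs:
#
# with dump_stacks_if_stuck(60):
#     item = stac.create_item(granule_href=granule_href, ...)
```

Because the dump is written by a watchdog thread straight to the file descriptor, it should still fire while the main thread is blocked in `event.wait`.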

When it does hang, here's the traceback:

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[8], line 32
     30 print(line)
     31 granule_href = f"https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/{line}"
---> 32 item: pystac.Item = stac.create_item(
     33     granule_href=granule_href,
     34     read_href_modifier=planetary_computer.sign,
     35     antimeridian_strategy=Strategy.NORMALIZE,
     36     coordinate_precision=7,
     37 )

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/sentinel2/stac.py:57, in create_item(granule_href, additional_providers, read_href_modifier, antimeridian_strategy, coordinate_precision)
     53 safe_manifest = SafeManifest(granule_href, read_href_modifier)
     55 product_metadata = ProductMetadata(safe_manifest.product_metadata_href,
     56                                    read_href_modifier)
---> 57 granule_metadata = GranuleMetadata(safe_manifest.granule_metadata_href,
     58                                    read_href_modifier)
     60 item = pystac.Item(
     61     id=product_metadata.scene_id,
     62     geometry=product_metadata.geometry,
   (...)
     65     properties={},
     66 )
     68 # --Common metadata--

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/sentinel2/granule_metadata.py:22, in GranuleMetadata.__init__(self, href, read_href_modifier)
     17 def __init__(self,
     18              href,
     19              read_href_modifier: Optional[ReadHrefModifier] = None):
     20     self.href = href
---> 22     self._root = XmlElement.from_file(href, read_href_modifier)
     24     geocoding_node = self._root.find('n1:Geometric_Info/Tile_Geocoding')
     25     if geocoding_node is None:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/core/io/xml.py:74, in XmlElement.from_file(cls, href, read_href_modifier)
     70 @classmethod
     71 def from_file(
     72     cls, href: str, read_href_modifier: Optional[ReadHrefModifier] = None
     73 ) -> "XmlElement":
---> 74     text = read_text(href, read_href_modifier)
     75     return cls(etree.fromstring(bytes(text, encoding="utf-8")))

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/core/io/__init__.py:20, in read_text(href, read_href_modifier)
     18     return StacIO.default().read_text(href)
     19 else:
---> 20     return StacIO.default().read_text(read_href_modifier(href))

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pystac/stac_io.py:279, in DefaultStacIO.read_text(self, source, *_, **__)
    274 """A concrete implementation of :meth:`StacIO.read_text
    275 <pystac.StacIO.read_text>`. Converts the ``source`` argument to a string (if it
    276 is not already) and delegates to :meth:`DefaultStacIO.read_text_from_href` for
    277 opening and reading the file."""
    278 href = str(os.fspath(source))
--> 279 return self.read_text_from_href(href)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/stactools/core/io/__init__.py:25, in FsspecStacIO.read_text_from_href(self, href, *args, **kwargs)
     24 def read_text_from_href(self, href: str, *args: Any, **kwargs: Any) -> str:
---> 25     with fsspec.open(href, "r") as f:
     26         s = f.read()
     27         if isinstance(s, str):

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/core.py:102, in OpenFile.__enter__(self)
     99 def __enter__(self):
    100     mode = self.mode.replace("t", "").replace("b", "") + "b"
--> 102     f = self.fs.open(self.path, mode=mode)
    104     self.fobjects = [f]
    106     if self.compression is not None:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/spec.py:1241, in AbstractFileSystem.open(self, path, mode, block_size, cache_options, compression, **kwargs)
   1239 else:
   1240     ac = kwargs.pop("autocommit", not self._intrans)
-> 1241     f = self._open(
   1242         path,
   1243         mode=mode,
   1244         block_size=block_size,
   1245         autocommit=ac,
   1246         cache_options=cache_options,
   1247         **kwargs,
   1248     )
   1249     if compression is not None:
   1250         from fsspec.compression import compr

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/implementations/http.py:356, in HTTPFileSystem._open(self, path, mode, block_size, autocommit, cache_type, cache_options, size, **kwargs)
    354 kw["asynchronous"] = self.asynchronous
    355 kw.update(kwargs)
--> 356 size = size or self.info(path, **kwargs)["size"]
    357 session = sync(self.loop, self.set_session)
    358 if block_size and size:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:121, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    118 @functools.wraps(func)
    119 def wrapper(*args, **kwargs):
    120     self = obj or args[0]
--> 121     return sync(self.loop, func, *args, **kwargs)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:94, in sync(loop, func, timeout, *args, **kwargs)
     91 asyncio.run_coroutine_threadsafe(_runner(event, coro, result, timeout), loop)
     92 while True:
     93     # this loops allows thread to get interrupted
---> 94     if event.wait(1):
     95         break
     96     if timeout is not None:

File /srv/conda/envs/notebook/lib/python3.11/threading.py:622, in Event.wait(self, timeout)
    620 signaled = self._flag
    621 if not signaled:
--> 622     signaled = self._cond.wait(timeout)
    623 return signaled

File /srv/conda/envs/notebook/lib/python3.11/threading.py:324, in Condition.wait(self, timeout)
    322 else:
    323     if timeout > 0:
--> 324         gotit = waiter.acquire(True, timeout)
    325     else:
    326         gotit = waiter.acquire(False)

KeyboardInterrupt:

Here are the versions of the relevant packages

aiohttp                       3.8.4
fsspec                        2023.6.0
stactools                     0.3.1
stactools-sentinel2           0.2.1

To work around

At least for our use case, we can disable the use of fsspec by running

pystac.StacIO.set_default(pystac.stac_io.DefaultStacIO)

after stactools.sentinel2 is imported. That workaround might help out others.

@TomAugspurger TomAugspurger added the bug Something isn't working label Aug 16, 2023
@gadomski gadomski self-assigned this Aug 16, 2023