Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to access state_group of rejected event #17961

Open
kegsay opened this issue Nov 26, 2024 · 1 comment
Open

Attempt to access state_group of rejected event #17961

kegsay opened this issue Nov 26, 2024 · 1 comment

Comments

@kegsay
Copy link
Contributor

kegsay commented Nov 26, 2024

Description

Sometimes CSAPI requests to send messages or leave a room fail with an HTTP 500 due to RuntimeError: Attempt to access state_group of rejected event. This happens under moderate load over federation with a patchy federation connection.

Steps to reproduce

  • Setup a federation of two homeservers
  • Join shared rooms
  • Rapidly send messages / leave / rejoin as different users
  • Eventually synapse will throw this up.

Homeserver

Local

Synapse Version

docker.io/matrixdotorg/synapse:v1.116.0

Installation Method

Docker (matrixdotorg/synapse)

Database

PostgreSQL, single, no port, no backup

Workers

Multiple workers

Platform

1x federation reader worker
1x main
1x redis
1x postgres

Configuration

No.

Relevant log output

The 500:

2024-11-26 12:15:15,703 - synapse.http.server - 146 - ERROR - POST-58934 - Failed handle request via 'RoomMembershipRestServlet': <XForwardedForRequest at 0xffff7a5ad650 method='POST' uri='/_matrix/client/v3/rooms/%21sWrx.../leave' clientproto='HTTP/1.1' site='8008'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/synapse/http/server.py", line 332, in _async_render_wrapper
    callback_return = await self._async_render(request)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/http/server.py", line 544, in _async_render
    callback_return = await raw_callback_return
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/rest/client/room.py", line 1158, in on_POST
    return await self._do(request, requester, room_id, membership_action, None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/rest/client/room.py", line 1131, in _do
    await self.room_member_handler.update_membership(
  File "/usr/local/lib/python3.11/site-packages/synapse/metrics/background_process_metrics.py", line 251, in run
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/handlers/federation_event.py", line 901, in _process_new_pulled_events
    await self._process_pulled_event(origin, ev, backfilled=backfilled)
  File "/usr/local/lib/python3.11/site-packages/synapse/logging/opentracing.py", line 922, in _wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/logging/opentracing.py", line 922, in _wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/handlers/federation_event.py", line 1007, in _process_pulled_event
    await self._process_received_pdu(
  File "/usr/local/lib/python3.11/site-packages/synapse/logging/opentracing.py", line 922, in _wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/handlers/federation_event.py", line 1451, in _process_received_pdu
    await self._run_push_actions_and_persist_event(event, context, backfilled)
  File "/usr/local/lib/python3.11/site-packages/synapse/logging/opentracing.py", line 922, in _wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/handlers/room_member.py", line 666, in update_membership
    result = await self.update_membership_locked(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/handlers/room_member.py", line 1177, in update_membership_locked
    return await self._local_membership_update(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/handlers/room_member.py", line 503, in _local_membership_update
    await self.event_creation_handler.handle_new_client_event(
  File "/usr/local/lib/python3.11/site-packages/synapse/util/metrics.py", line 120, in measured_func
    r = await func(self, *args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/handlers/message.py", line 1489, in handle_new_client_event
    result, _ = await make_deferred_yieldable(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/internet/defer.py", line 2014, in _inlineCallbacks
    result = context.run(gen.send, result)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/handlers/federation_event.py", line 2210, in _run_push_actions_and_persist_event
    await self.persist_events_and_notify(
  File "/usr/local/lib/python3.11/site-packages/synapse/handlers/federation_event.py", line 2271, in persist_events_and_notify
    ) = await self._storage_controllers.persistence.persist_events(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/logging/opentracing.py", line 922, in _wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/controllers/persist_events.py", line 427, in persist_events
    ret_vals = await yieldable_gather_results(enqueue, partitioned.items())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/util/async_helpers.py", line 305, in yieldable_gather_results
    raise dfe.subFailure.value from None
  File "/usr/local/lib/python3.11/site-packages/twisted/internet/defer.py", line 2010, in _inlineCallbacks
    result = context.run(
             ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/failure.py", line 549, in throwExceptionIntoGenerator
    return g.throw(self.value.with_traceback(self.tb))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/controllers/persist_events.py", line 422, in enqueue
    return await self._event_persist_queue.add_to_queue(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/controllers/persist_events.py", line 245, in add_to_queue
    res = await make_deferred_yieldable(end_item.deferred.observe())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/controllers/persist_events.py", line 288, in handle_queue_loop
    ret = await self._per_item_callback(room_id, item.task)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/controllers/persist_events.py", line 368, in _process_event_persist_queue_task
    return await self._persist_event_batch(room_id, task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/controllers/persist_events.py", line 623, in _persist_event_batch
    ) = await self._calculate_new_forward_extremities_and_state_delta(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/controllers/persist_events.py", line 715, in _calculate_new_forward_extremities_and_state_delta
    res = await self._get_new_state_after_events(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/controllers/persist_events.py", line 861, in _get_new_state_after_events
    if ctx.state_group is None:
       ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/events/snapshot.py", line 243, in state_group
    raise RuntimeError("Attempt to access state_group of rejected event")
RuntimeError: Attempt to access state_group of rejected event
2024-11-26 12:15:15,707 - synapse.access.http.8008 - 473 - INFO - POST-58934 - 192.168.65.1 - 8008 - {@user-17326229428...} Processed request: 0.082sec/0.002sec (0.004sec, 0.002sec) (0.013sec/0.005sec/9) 55B 500 "POST /_matrix/client/v3/rooms/%21sWrx.../leave HTTP/1.1" "Go-http-client/1.1" [0 dbevts]

Relevant earlier logs:

2024-11-26 12:15:15,626 - synapse.http.matrixfederationclient - 364 - INFO - _process_new_pulled_events_with_failed_pull_attempts-1-$9A-KfjOFJxpcBASTg7lNXBZqGe-tG7UX8jmIn8tUIb4 - {GET-O-11830} [hs2.localhost] Completed request: 200 OK in 0.10 secs, got 148916 bytes - GET matrix-federation://hs2.localhost/_matrix/federation/v1/event_auth/%21sWrxCKwxttGHGueqaS%3Ahs1.localhost/%249A-KfjOFJxpcBASTg7lNXBZqGe-tG7UX8jmIn8tUIb4
2024-11-26 12:15:15,651 - synapse.handlers.federation_event - 2164 - INFO - _process_new_pulled_events_with_failed_pull_attempts-1-$9A-KfjOFJxpcBASTg7lNXBZqGe-tG7UX8jmIn8tUIb4 - /event_auth returned 163 events
2024-11-26 12:15:15,652 - synapse.handlers.federation_event - 1690 - INFO - _process_new_pulled_events_with_failed_pull_attempts-1-$9A-KfjOFJxpcBASTg7lNXBZqGe-tG7UX8jmIn8tUIb4 - Persisting 2 remaining outliers: ['$g_HvHHE49GDt7liTxffpj8dYHyi5uJQo31XI6ztW5do', '$LanUoVz5bCYEhm5Bi87G7WPg08wVrRpG-zrG6gDcIZw']
[...]
2024-11-26 12:15:15,680 - synapse.handlers.federation_event - 1907 - INFO - _process_new_pulled_events_with_failed_pull_attempts-1-$9A-KfjOFJxpcBASTg7lNXBZqGe-tG7UX8jmIn8tUIb4 - event's auth_events are different to our calculated auth_events. Claimed but not calculated: [<FrozenEventV3 event_id=$LanUoVz5bCYEhm5Bi87G7WPg08wVrRpG-zrG6gDcIZw, type=m.room.member, state_key=@user-1732622942853-3:hs2.localhost, outlier=True>]. Calculated but not claimed: []
2024-11-26 12:15:15,680 - synapse.handlers.federation_event - 1927 - WARNING - _process_new_pulled_events_with_failed_pull_attempts-1-$9A-KfjOFJxpcBASTg7lNXBZqGe-tG7UX8jmIn8tUIb4 - While checking auth of <FrozenEventV3 event_id=$9A-KfjOFJxpcBASTg7lNXBZqGe-tG7UX8jmIn8tUIb4, type=m.room.message, state_key=None, outlier=False> against room state before the event: 403: User @user-1732622942853-3:hs2.localhost not in room !sWrxCKwxttGHGueqaS:hs1.localhost (None)
2024-11-26 12:15:15,685 - synapse.metrics.background_process_metrics - 253 - ERROR - _process_new_pulled_events_with_failed_pull_attempts-1 - Background process '_process_new_pulled_events_with_failed_pull_attempts' threw an exception

Events

// $LanUoVz5bCYEhm5Bi87G7WPg08wVrRpG-zrG6gDcIZw

{
    "auth_events": [
        "$N1K0VX128jNd72SNto_sQq4-7cbCEd6XP_tFYAFmqH8",
        "$fStWWu_EDbUab1-Zz9RXQD-UJtj6GUUOONPR2zAGfr4",
        "$g_HvHHE49GDt7liTxffpj8dYHyi5uJQo31XI6ztW5do",
        "$wfhk9ujUayraOaLCuyTMYtmPFj1diPF_Z7reYSg9yQQ"
    ],
    "prev_events": [
        "$g_HvHHE49GDt7liTxffpj8dYHyi5uJQo31XI6ztW5do",
        "$S4Rvt5w7dQwtHMwjiG_R0bHdEYuYys3tewdPMOMtpbM"
    ],
    "type": "m.room.member",
    "room_id": "!sWrxCKwxttGHGueqaS:hs1.localhost",
    "sender": "@user-1732622942853-3:hs2.localhost",
    "content": {
        "displayname": "user-1732622942853-3",
        "membership": "join"
    },
    "depth": 2148,
    "state_key": "@user-1732622942853-3:hs2.localhost",
    "origin": "hs2.localhost",
    "origin_server_ts": 1732623284997,
    "hashes": {
        "sha256": "Ngu/00VSvO7Hm7+4e0AM9Rxp4dpAsYdnSOcgeZNmOn4"
    },
    "signatures": {
        "hs2.localhost": {
            "ed25519:0": "omYcCGwD0F7nGddJTOjgt2+lMbW5iOEhBvg0UkKBCilyBtaeuaYhIHQiJlX2QdrPDclGKsiV83pu4xmmqDncCA"
        }
    },
    "unsigned": {
        "age": 30543
    }
}

// $9A-KfjOFJxpcBASTg7lNXBZqGe-tG7UX8jmIn8tUIb4
// From the other HS as it was never persisted in psql

{
    "auth_events": [
        "$fStWWu_EDbUab1-Zz9RXQD-UJtj6GUUOONPR2zAGfr4",
        "$LanUoVz5bCYEhm5Bi87G7WPg08wVrRpG-zrG6gDcIZw",
        "$wfhk9ujUayraOaLCuyTMYtmPFj1diPF_Z7reYSg9yQQ"
    ],
    "prev_events": [
        "$pF9Inmcvu9TkOJwJYho71LB5gE3esF5RkK2Rryhpf7A",
        "$9bNaBFAau0XcbT15fpiv3vtDBPIYeZrmBv27d1HRucQ"
    ],
    "type": "m.room.message",
    "room_id": "!sWrxCKwxttGHGueqaS:hs1.localhost",
    "sender": "@user-1732622942853-3:hs2.localhost",
    "content": {
        "msgtype": "m.text",
        "body": "majestic fridge"
    },
    "depth": 2197,
    "origin": "hs2.localhost",
    "origin_server_ts": 1732623288414,
    "hashes": {
        "sha256": "92ip+myQHEFTTWRymQdBmwCZuJeJb5IWOdv/vsAg4jc"
    },
    "signatures": {
        "hs2.localhost": {
            "ed25519:0": "BjPMMEXOyPZkZsItcaq84Gab9L/2q5ndW+aKqvqDGP84LPMXtikK/sN86N7Q5d+pcDaZXO7UfXdN7+HbalPQCA"
        }
    },
    "unsigned": {
        "age_ts": 1732623288414
    }
}

Anything else that would be useful to know?

No response

@richvdh
Copy link
Member

richvdh commented Nov 28, 2024

duplicate of #6867, potentially?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants