Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

【deepin_V23_beta3】【一般】【正常】【发散测试】【内核】【6.5.4】Dell Inc. XPS 9320 (0C10) Notebook相机无法使用 #30

Open
kobe337 opened this issue Sep 21, 2023 · 1 comment

Comments

@kobe337
Copy link

kobe337 commented Sep 21, 2023

SUMMARY | 问题概要

Dell Inc. XPS 9320 (0C10) Notebook(笔记本)相机无法使用,显示无连接

SOFTWARE/OS VERSIONS | 软件/系统版本信息

1. 镜像:https://cdimage.uniontech.com/community/releases/23-Beta2/
2. 内核:Linux deepin-PC 6.5.4-amd64-desktop-hwe #2 SMP PREEMPT_DYNAMIC Wed Sep 20 11:35:35 CST 2023 x86_64 GNU/Linux

PRE-REQUIREMENT | 预置条件

STEPS TO REPRODUCE | 复现步骤

1.进入桌面环境,通过启动器打开相机应用

OBSERVED RESULT | 观察到的结果

1.Dell Inc. XPS 9320 (0C10) Notebook相机无法使用,显示无连接

EXPECTED RESULT | 期望的结果

进入桌面环境,通过启动器打开相机应用,相机正常工作

ADDITIONAL INFORMATION | 额外补充

[设备信息]
设备: Dell Inc. XPS 9320 (0C10) Notebook
操作系统: Deepin 23 Community(Beta2) Linux version 6.5.3 (abuild@deepin) (gcc (Deepin 11.2.0-deepin2+u004) 11.2.0, GNU ld (GNU Binutils for Deepin) 2.38) #1 SMP PREEMPT_DYNAMIC Thu Sep 14 09:30:05 UTC 2023
处理器: 13th Gen Intel(R) Core(TM) i7-1360P (十二核 / 十六逻辑处理器)
主板: 07608N
内存: 4GB(Not Specified 6400 MT/s)/4GB(Not Specified 6400 MT/s)/4GB(Not Specified 6400 MT/s)/4GB(Not Specified 6400 MT/s)/4GB(Not Specified 6400 MT/s)/4GB(Not Specified 6400 MT/s)/4GB(Not Specified 6400 MT/s)/4GB(Not Specified 6400 MT/s)
显示适配器: Raptor Lake-P [Iris Xe Graphics]
音频适配器: Intel Corporation
存储设备: PC801 NVMe SK hynix 1TB (1.00 TB)
蓝牙: deepin-PC
网络适配器: Intel Corporation
鼠标: PS/2 Generic Mouse (PS/2 Generic Mouse)/ELAN900C:00 04F3:2E25 (ELAN900C:00 04F3:2E25)/VEN_04F3:00 04F3:31D1 Mouse (VEN_04F3:00 04F3:31D1 Mouse)/VEN_04F3:00 04F3:31D1 Touchpad (VEN_04F3:00 04F3:31D1 Touchpad)/DDE Cooperation Mouse (DDE Cooperation Mouse)/DDE Cooperation Touchpad (DDE Cooperation Touchpad)
键盘: AT Translated Set 2 keyboard (AT Translated Set 2 keyboard)/DDE Cooperation Keyboard (DDE Cooperation Keyboard)/DDE Cooperation Keyboard (DDE Cooperation Keyboard)/DDE Cooperation Keyboard (DDE Cooperation Keyboard)/DDE Cooperation Keyboard (DDE Cooperation Keyboard)/DDE Cooperation Keyboard (DDE Cooperation Keyboard)/DDE Cooperation Keyboard (DDE Cooperation Keyboard)/DDE Cooperation Keyboard (DDE Cooperation Keyboard)/DDE Cooperation Keyboard (DDE Cooperation Keyboard)
显示设备: 8VXVT 134XK04 LCD Monitor(13.4 英寸(288mm X 180mm))
其他设备: USB Bridge/Goodix USB2.0 MISC
No response

@xzl01
Copy link

xzl01 commented Sep 25, 2023

相机无法使用问题在于intel上游驱动尚未合并进入主线,并非内核问题

matrix-wsk pushed a commit that referenced this issue Nov 21, 2023
…it_event_hooks()

[ Upstream commit 91cfe0b ]

When CONFIG_HID_UCLOGIC=y and CONFIG_KUNIT_ALL_TESTS=y, launch kernel and
then the below user-memory-access bug occurs.

In hid_test_uclogic_params_cleanup_event_hooks(),it call
uclogic_params_ugee_v2_init_event_hooks() with the first arg=NULL, so
when it calls uclogic_params_ugee_v2_has_battery(), the hid_get_drvdata()
will access hdev->dev with hdev=NULL, which will cause below
user-memory-access.

So add a fake_device with quirks member and call hid_set_drvdata()
to assign hdev->dev->driver_data which avoids the null-ptr-def bug
for drvdata->quirks in uclogic_params_ugee_v2_has_battery(). After applying
this patch, the below user-memory-access bug never occurs.

 general protection fault, probably for non-canonical address 0xdffffc0000000329: 0000 [#1] PREEMPT SMP KASAN
 KASAN: probably user-memory-access in range [0x0000000000001948-0x000000000000194f]
 CPU: 5 PID: 2189 Comm: kunit_try_catch Tainted: G    B   W        N 6.6.0-rc2+ #30
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
 RIP: 0010:uclogic_params_ugee_v2_init_event_hooks+0x87/0x600
 Code: f3 f3 65 48 8b 14 25 28 00 00 00 48 89 54 24 60 31 d2 48 89 fa c7 44 24 30 00 00 00 00 48 c7 44 24 28 02 f8 02 01 48 c1 ea 03 <80> 3c 02 00 0f 85 2c 04 00 00 48 8b 9d 48 19 00 00 48 b8 00 00 00
 RSP: 0000:ffff88810679fc88 EFLAGS: 00010202
 RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000
 RDX: 0000000000000329 RSI: ffff88810679fd88 RDI: 0000000000001948
 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffed1020f639f0
 R10: ffff888107b1cf87 R11: 0000000000000400 R12: 1ffff11020cf3f92
 R13: ffff88810679fd88 R14: ffff888100b97b08 R15: ffff8881030bb080
 FS:  0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000005286001 CR4: 0000000000770ee0
 DR0: ffffffff8fdd6cf4 DR1: ffffffff8fdd6cf5 DR2: ffffffff8fdd6cf6
 DR3: ffffffff8fdd6cf7 DR6: 00000000fffe0ff0 DR7: 0000000000000600
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? die_addr+0x3d/0xa0
  ? exc_general_protection+0x144/0x220
  ? asm_exc_general_protection+0x22/0x30
  ? uclogic_params_ugee_v2_init_event_hooks+0x87/0x600
  ? sched_clock_cpu+0x69/0x550
  ? uclogic_parse_ugee_v2_desc_gen_params+0x70/0x70
  ? load_balance+0x2950/0x2950
  ? rcu_trc_cmpxchg_need_qs+0x67/0xa0
  hid_test_uclogic_params_cleanup_event_hooks+0x9e/0x1a0
  ? uclogic_params_ugee_v2_init_event_hooks+0x600/0x600
  ? __switch_to+0x5cf/0xe60
  ? migrate_enable+0x260/0x260
  ? __kthread_parkme+0x83/0x150
  ? kunit_try_run_case_cleanup+0xe0/0xe0
  kunit_generic_run_threadfn_adapter+0x4a/0x90
  ? kunit_try_catch_throw+0x80/0x80
  kthread+0x2b5/0x380
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x2d/0x70
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork_asm+0x11/0x20
  </TASK>
 Modules linked in:
 Dumping ftrace buffer:
    (ftrace buffer empty)
 ---[ end trace 0000000000000000 ]---
 RIP: 0010:uclogic_params_ugee_v2_init_event_hooks+0x87/0x600
 Code: f3 f3 65 48 8b 14 25 28 00 00 00 48 89 54 24 60 31 d2 48 89 fa c7 44 24 30 00 00 00 00 48 c7 44 24 28 02 f8 02 01 48 c1 ea 03 <80> 3c 02 00 0f 85 2c 04 00 00 48 8b 9d 48 19 00 00 48 b8 00 00 00
 RSP: 0000:ffff88810679fc88 EFLAGS: 00010202
 RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000
 RDX: 0000000000000329 RSI: ffff88810679fd88 RDI: 0000000000001948
 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffed1020f639f0
 R10: ffff888107b1cf87 R11: 0000000000000400 R12: 1ffff11020cf3f92
 R13: ffff88810679fd88 R14: ffff888100b97b08 R15: ffff8881030bb080
 FS:  0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000005286001 CR4: 0000000000770ee0
 DR0: ffffffff8fdd6cf4 DR1: ffffffff8fdd6cf5 DR2: ffffffff8fdd6cf6
 DR3: ffffffff8fdd6cf7 DR6: 00000000fffe0ff0 DR7: 0000000000000600
 PKRU: 55555554
 Kernel panic - not syncing: Fatal exception
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Kernel Offset: disabled
 Rebooting in 1 seconds..

Fixes: a251d65 ("HID: uclogic: Handle wireless device reconnection")
Signed-off-by: Jinjie Ruan <[email protected]>
Reviewed-by: José Expósito <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
matrix-wsk pushed a commit that referenced this issue Nov 21, 2023
[ Upstream commit d45f72b ]

When CONFIG_HID_UCLOGIC=y and CONFIG_KUNIT_ALL_TESTS=y, launch
kernel and then the below work->entry not empty bug occurs.

In hid_test_uclogic_exec_event_hook_test(), the filter->work is not
initialized to be added to p.event_hooks->list, and then the
schedule_work() in uclogic_exec_event_hook() will call __queue_work(),
which check whether the work->entry is empty and cause the below
warning call trace.

So call INIT_WORK() with a fake work to solve the issue. After applying
this patch, the below work->entry not empty bug never occurs.

 WARNING: CPU: 0 PID: 2177 at kernel/workqueue.c:1787 __queue_work.part.0+0x780/0xad0
 Modules linked in:
 CPU: 0 PID: 2177 Comm: kunit_try_catch Tainted: G    B   W        N 6.6.0-rc2+ #30
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
 RIP: 0010:__queue_work.part.0+0x780/0xad0
 Code: 44 24 20 0f b6 00 84 c0 74 08 3c 03 0f 8e 52 03 00 00 f6 83 00 01 00 00 02 74 6f 4c 89 ef e8 c7 d8 f1 02 f3 90 e9 e5 f8 ff ff <0f> 0b e9 63 fc ff ff 89 e9 49 8d 57 68 4c 89 e6 4c 89 ff 83 c9 02
 RSP: 0000:ffff888102bb7ce8 EFLAGS: 00010086
 RAX: 0000000000000000 RBX: ffff888106b8e460 RCX: ffffffff84141cc7
 RDX: 1ffff11020d71c8c RSI: 0000000000000004 RDI: ffff8881001d0118
 RBP: dffffc0000000000 R08: 0000000000000001 R09: ffffed1020576f92
 R10: 0000000000000003 R11: ffff888102bb7980 R12: ffff888106b8e458
 R13: ffff888119c38800 R14: 0000000000000000 R15: ffff8881001d0100
 FS:  0000000000000000(0000) GS:ffff888119c00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff888119506000 CR3: 0000000005286001 CR4: 0000000000770ef0
 DR0: ffffffff8fdd6ce0 DR1: ffffffff8fdd6ce1 DR2: ffffffff8fdd6ce3
 DR3: ffffffff8fdd6ce5 DR6: 00000000fffe0ff0 DR7: 0000000000000600
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __warn+0xc9/0x260
  ? __queue_work.part.0+0x780/0xad0
  ? report_bug+0x345/0x400
  ? handle_bug+0x3c/0x70
  ? exc_invalid_op+0x14/0x40
  ? asm_exc_invalid_op+0x16/0x20
  ? _raw_spin_lock+0x87/0xe0
  ? __queue_work.part.0+0x780/0xad0
  ? __queue_work.part.0+0x249/0xad0
  queue_work_on+0x48/0x50
  uclogic_exec_event_hook.isra.0+0xf7/0x160
  hid_test_uclogic_exec_event_hook_test+0x2f1/0x5d0
  ? try_to_wake_up+0x151/0x13e0
  ? uclogic_exec_event_hook.isra.0+0x160/0x160
  ? _raw_spin_lock_irqsave+0x8d/0xe0
  ? __sched_text_end+0xa/0xa
  ? __sched_text_end+0xa/0xa
  ? migrate_enable+0x260/0x260
  ? kunit_try_run_case_cleanup+0xe0/0xe0
  kunit_generic_run_threadfn_adapter+0x4a/0x90
  ? kunit_try_catch_throw+0x80/0x80
  kthread+0x2b5/0x380
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x2d/0x70
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork_asm+0x11/0x20
  </TASK>

Fixes: a251d65 ("HID: uclogic: Handle wireless device reconnection")
Signed-off-by: Jinjie Ruan <[email protected]>
Reviewed-by: José Expósito <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
matrix-wsk pushed a commit that referenced this issue Nov 21, 2023
…it_event_hooks()

[ Upstream commit 91cfe0b ]

When CONFIG_HID_UCLOGIC=y and CONFIG_KUNIT_ALL_TESTS=y, launch kernel and
then the below user-memory-access bug occurs.

In hid_test_uclogic_params_cleanup_event_hooks(),it call
uclogic_params_ugee_v2_init_event_hooks() with the first arg=NULL, so
when it calls uclogic_params_ugee_v2_has_battery(), the hid_get_drvdata()
will access hdev->dev with hdev=NULL, which will cause below
user-memory-access.

So add a fake_device with quirks member and call hid_set_drvdata()
to assign hdev->dev->driver_data which avoids the null-ptr-def bug
for drvdata->quirks in uclogic_params_ugee_v2_has_battery(). After applying
this patch, the below user-memory-access bug never occurs.

 general protection fault, probably for non-canonical address 0xdffffc0000000329: 0000 [#1] PREEMPT SMP KASAN
 KASAN: probably user-memory-access in range [0x0000000000001948-0x000000000000194f]
 CPU: 5 PID: 2189 Comm: kunit_try_catch Tainted: G    B   W        N 6.6.0-rc2+ #30
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
 RIP: 0010:uclogic_params_ugee_v2_init_event_hooks+0x87/0x600
 Code: f3 f3 65 48 8b 14 25 28 00 00 00 48 89 54 24 60 31 d2 48 89 fa c7 44 24 30 00 00 00 00 48 c7 44 24 28 02 f8 02 01 48 c1 ea 03 <80> 3c 02 00 0f 85 2c 04 00 00 48 8b 9d 48 19 00 00 48 b8 00 00 00
 RSP: 0000:ffff88810679fc88 EFLAGS: 00010202
 RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000
 RDX: 0000000000000329 RSI: ffff88810679fd88 RDI: 0000000000001948
 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffed1020f639f0
 R10: ffff888107b1cf87 R11: 0000000000000400 R12: 1ffff11020cf3f92
 R13: ffff88810679fd88 R14: ffff888100b97b08 R15: ffff8881030bb080
 FS:  0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000005286001 CR4: 0000000000770ee0
 DR0: ffffffff8fdd6cf4 DR1: ffffffff8fdd6cf5 DR2: ffffffff8fdd6cf6
 DR3: ffffffff8fdd6cf7 DR6: 00000000fffe0ff0 DR7: 0000000000000600
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? die_addr+0x3d/0xa0
  ? exc_general_protection+0x144/0x220
  ? asm_exc_general_protection+0x22/0x30
  ? uclogic_params_ugee_v2_init_event_hooks+0x87/0x600
  ? sched_clock_cpu+0x69/0x550
  ? uclogic_parse_ugee_v2_desc_gen_params+0x70/0x70
  ? load_balance+0x2950/0x2950
  ? rcu_trc_cmpxchg_need_qs+0x67/0xa0
  hid_test_uclogic_params_cleanup_event_hooks+0x9e/0x1a0
  ? uclogic_params_ugee_v2_init_event_hooks+0x600/0x600
  ? __switch_to+0x5cf/0xe60
  ? migrate_enable+0x260/0x260
  ? __kthread_parkme+0x83/0x150
  ? kunit_try_run_case_cleanup+0xe0/0xe0
  kunit_generic_run_threadfn_adapter+0x4a/0x90
  ? kunit_try_catch_throw+0x80/0x80
  kthread+0x2b5/0x380
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x2d/0x70
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork_asm+0x11/0x20
  </TASK>
 Modules linked in:
 Dumping ftrace buffer:
    (ftrace buffer empty)
 ---[ end trace 0000000000000000 ]---
 RIP: 0010:uclogic_params_ugee_v2_init_event_hooks+0x87/0x600
 Code: f3 f3 65 48 8b 14 25 28 00 00 00 48 89 54 24 60 31 d2 48 89 fa c7 44 24 30 00 00 00 00 48 c7 44 24 28 02 f8 02 01 48 c1 ea 03 <80> 3c 02 00 0f 85 2c 04 00 00 48 8b 9d 48 19 00 00 48 b8 00 00 00
 RSP: 0000:ffff88810679fc88 EFLAGS: 00010202
 RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000
 RDX: 0000000000000329 RSI: ffff88810679fd88 RDI: 0000000000001948
 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffed1020f639f0
 R10: ffff888107b1cf87 R11: 0000000000000400 R12: 1ffff11020cf3f92
 R13: ffff88810679fd88 R14: ffff888100b97b08 R15: ffff8881030bb080
 FS:  0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000005286001 CR4: 0000000000770ee0
 DR0: ffffffff8fdd6cf4 DR1: ffffffff8fdd6cf5 DR2: ffffffff8fdd6cf6
 DR3: ffffffff8fdd6cf7 DR6: 00000000fffe0ff0 DR7: 0000000000000600
 PKRU: 55555554
 Kernel panic - not syncing: Fatal exception
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Kernel Offset: disabled
 Rebooting in 1 seconds..

Fixes: a251d65 ("HID: uclogic: Handle wireless device reconnection")
Signed-off-by: Jinjie Ruan <[email protected]>
Reviewed-by: José Expósito <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
matrix-wsk pushed a commit that referenced this issue Nov 21, 2023
[ Upstream commit d45f72b ]

When CONFIG_HID_UCLOGIC=y and CONFIG_KUNIT_ALL_TESTS=y, launch
kernel and then the below work->entry not empty bug occurs.

In hid_test_uclogic_exec_event_hook_test(), the filter->work is not
initialized to be added to p.event_hooks->list, and then the
schedule_work() in uclogic_exec_event_hook() will call __queue_work(),
which check whether the work->entry is empty and cause the below
warning call trace.

So call INIT_WORK() with a fake work to solve the issue. After applying
this patch, the below work->entry not empty bug never occurs.

 WARNING: CPU: 0 PID: 2177 at kernel/workqueue.c:1787 __queue_work.part.0+0x780/0xad0
 Modules linked in:
 CPU: 0 PID: 2177 Comm: kunit_try_catch Tainted: G    B   W        N 6.6.0-rc2+ #30
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
 RIP: 0010:__queue_work.part.0+0x780/0xad0
 Code: 44 24 20 0f b6 00 84 c0 74 08 3c 03 0f 8e 52 03 00 00 f6 83 00 01 00 00 02 74 6f 4c 89 ef e8 c7 d8 f1 02 f3 90 e9 e5 f8 ff ff <0f> 0b e9 63 fc ff ff 89 e9 49 8d 57 68 4c 89 e6 4c 89 ff 83 c9 02
 RSP: 0000:ffff888102bb7ce8 EFLAGS: 00010086
 RAX: 0000000000000000 RBX: ffff888106b8e460 RCX: ffffffff84141cc7
 RDX: 1ffff11020d71c8c RSI: 0000000000000004 RDI: ffff8881001d0118
 RBP: dffffc0000000000 R08: 0000000000000001 R09: ffffed1020576f92
 R10: 0000000000000003 R11: ffff888102bb7980 R12: ffff888106b8e458
 R13: ffff888119c38800 R14: 0000000000000000 R15: ffff8881001d0100
 FS:  0000000000000000(0000) GS:ffff888119c00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff888119506000 CR3: 0000000005286001 CR4: 0000000000770ef0
 DR0: ffffffff8fdd6ce0 DR1: ffffffff8fdd6ce1 DR2: ffffffff8fdd6ce3
 DR3: ffffffff8fdd6ce5 DR6: 00000000fffe0ff0 DR7: 0000000000000600
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __warn+0xc9/0x260
  ? __queue_work.part.0+0x780/0xad0
  ? report_bug+0x345/0x400
  ? handle_bug+0x3c/0x70
  ? exc_invalid_op+0x14/0x40
  ? asm_exc_invalid_op+0x16/0x20
  ? _raw_spin_lock+0x87/0xe0
  ? __queue_work.part.0+0x780/0xad0
  ? __queue_work.part.0+0x249/0xad0
  queue_work_on+0x48/0x50
  uclogic_exec_event_hook.isra.0+0xf7/0x160
  hid_test_uclogic_exec_event_hook_test+0x2f1/0x5d0
  ? try_to_wake_up+0x151/0x13e0
  ? uclogic_exec_event_hook.isra.0+0x160/0x160
  ? _raw_spin_lock_irqsave+0x8d/0xe0
  ? __sched_text_end+0xa/0xa
  ? __sched_text_end+0xa/0xa
  ? migrate_enable+0x260/0x260
  ? kunit_try_run_case_cleanup+0xe0/0xe0
  kunit_generic_run_threadfn_adapter+0x4a/0x90
  ? kunit_try_catch_throw+0x80/0x80
  kthread+0x2b5/0x380
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x2d/0x70
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork_asm+0x11/0x20
  </TASK>

Fixes: a251d65 ("HID: uclogic: Handle wireless device reconnection")
Signed-off-by: Jinjie Ruan <[email protected]>
Reviewed-by: José Expósito <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
matrix-wsk pushed a commit to matrix-wsk/kernel-6.6 that referenced this issue Mar 11, 2024
[ Upstream commit fad87db ]

The PAPR spec spells the function name as

  "ibm,reset-pe-dma-windows"

but in practice firmware uses the singular form:

  "ibm,reset-pe-dma-window"

in the device tree. Since we have the wrong spelling in the RTAS
function table, reverse lookups (token -> name) fail and warn:

  unexpected failed lookup for token 86
  WARNING: CPU: 1 PID: 545 at arch/powerpc/kernel/rtas.c:659 __do_enter_rtas_trace+0x2a4/0x2b4
  CPU: 1 PID: 545 Comm: systemd-udevd Not tainted 6.8.0-rc4 deepin-community#30
  Hardware name: IBM,9105-22A POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NL1060_028) hv:phyp pSeries
  NIP [c0000000000417f0] __do_enter_rtas_trace+0x2a4/0x2b4
  LR [c0000000000417ec] __do_enter_rtas_trace+0x2a0/0x2b4
  Call Trace:
   __do_enter_rtas_trace+0x2a0/0x2b4 (unreliable)
   rtas_call+0x1f8/0x3e0
   enable_ddw.constprop.0+0x4d0/0xc84
   dma_iommu_dma_supported+0xe8/0x24c
   dma_set_mask+0x5c/0xd8
   mlx5_pci_init.constprop.0+0xf0/0x46c [mlx5_core]
   probe_one+0xfc/0x32c [mlx5_core]
   local_pci_probe+0x68/0x12c
   pci_call_probe+0x68/0x1ec
   pci_device_probe+0xbc/0x1a8
   really_probe+0x104/0x570
   __driver_probe_device+0xb8/0x224
   driver_probe_device+0x54/0x130
   __driver_attach+0x158/0x2b0
   bus_for_each_dev+0xa8/0x120
   driver_attach+0x34/0x48
   bus_add_driver+0x174/0x304
   driver_register+0x8c/0x1c4
   __pci_register_driver+0x68/0x7c
   mlx5_init+0xb8/0x118 [mlx5_core]
   do_one_initcall+0x60/0x388
   do_init_module+0x7c/0x2a4
   init_module_from_file+0xb4/0x108
   idempotent_init_module+0x184/0x34c
   sys_finit_module+0x90/0x114

And oopses are possible when lockdep is enabled or the RTAS
tracepoints are active, since those paths dereference the result of
the lookup.

Use the correct spelling to match firmware's behavior, adjusting the
related constants to match.

Signed-off-by: Nathan Lynch <[email protected]>
Fixes: 8252b88 ("powerpc/rtas: improve function information lookups")
Reported-by: Gaurav Batra <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/20240222-rtas-fix-ibm-reset-pe-dma-window-v1-1-7aaf235ac63c@linux.ibm.com
Signed-off-by: Sasha Levin <[email protected]>
Avenger-285714 pushed a commit to Avenger-285714/DeepinKernel that referenced this issue May 20, 2024
commit 52a6947 upstream.

Currently, enabling SG_DEBUG in the kernel will cause nouveau to hit a
BUG() on startup:

  kernel BUG at include/linux/scatterlist.h:187!
  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 7 PID: 930 Comm: (udev-worker) Not tainted 6.9.0-rc3Lyude-Test+ deepin-community#30
  Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.I0 01/22/2019
  RIP: 0010:sg_init_one+0x85/0xa0
  Code: 69 88 32 01 83 e1 03 f6 c3 03 75 20 a8 01 75 1e 48 09 cb 41 89 54
  24 08 49 89 1c 24 41 89 6c 24 0c 5b 5d 41 5c e9 7b b9 88 00 <0f> 0b 0f 0b
  0f 0b 48 8b 05 5e 46 9a 01 eb b2 66 66 2e 0f 1f 84 00
  RSP: 0018:ffffa776017bf6a0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffffa77600d87000 RCX: 000000000000002b
  RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffa77680d87000
  RBP: 000000000000e000 R08: 0000000000000000 R09: 0000000000000000
  R10: ffff98f4c46aa508 R11: 0000000000000000 R12: ffff98f4c46aa508
  R13: ffff98f4c46aa008 R14: ffffa77600d4a000 R15: ffffa77600d4a018
  FS:  00007feeb5aae980(0000) GS:ffff98f5c4dc0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f22cb9a4520 CR3: 00000001043ba000 CR4: 00000000003506f0
  Call Trace:
   <TASK>
   ? die+0x36/0x90
   ? do_trap+0xdd/0x100
   ? sg_init_one+0x85/0xa0
   ? do_error_trap+0x65/0x80
   ? sg_init_one+0x85/0xa0
   ? exc_invalid_op+0x50/0x70
   ? sg_init_one+0x85/0xa0
   ? asm_exc_invalid_op+0x1a/0x20
   ? sg_init_one+0x85/0xa0
   nvkm_firmware_ctor+0x14a/0x250 [nouveau]
   nvkm_falcon_fw_ctor+0x42/0x70 [nouveau]
   ga102_gsp_booter_ctor+0xb4/0x1a0 [nouveau]
   r535_gsp_oneinit+0xb3/0x15f0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? nvkm_udevice_new+0x95/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? ktime_get+0x47/0xb0
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_oneinit_+0x4f/0x120 [nouveau]
   nvkm_subdev_init_+0x39/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_init+0x44/0x90 [nouveau]
   nvkm_device_init+0x166/0x2e0 [nouveau]
   nvkm_udevice_init+0x47/0x70 [nouveau]
   nvkm_object_init+0x41/0x1c0 [nouveau]
   nvkm_ioctl_new+0x16a/0x290 [nouveau]
   ? __pfx_nvkm_client_child_new+0x10/0x10 [nouveau]
   ? __pfx_nvkm_udevice_new+0x10/0x10 [nouveau]
   nvkm_ioctl+0x126/0x290 [nouveau]
   nvif_object_ctor+0x112/0x190 [nouveau]
   nvif_device_ctor+0x23/0x60 [nouveau]
   nouveau_cli_init+0x164/0x640 [nouveau]
   nouveau_drm_device_init+0x97/0x9e0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? pci_update_current_state+0x72/0xb0
   ? srso_return_thunk+0x5/0x5f
   nouveau_drm_probe+0x12c/0x280 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   local_pci_probe+0x45/0xa0
   pci_device_probe+0xc7/0x270
   really_probe+0xe6/0x3a0
   __driver_probe_device+0x87/0x160
   driver_probe_device+0x1f/0xc0
   __driver_attach+0xec/0x1f0
   ? __pfx___driver_attach+0x10/0x10
   bus_for_each_dev+0x88/0xd0
   bus_add_driver+0x116/0x220
   driver_register+0x59/0x100
   ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau]
   do_one_initcall+0x5b/0x320
   do_init_module+0x60/0x250
   init_module_from_file+0x86/0xc0
   idempotent_init_module+0x120/0x2b0
   __x64_sys_finit_module+0x5e/0xb0
   do_syscall_64+0x83/0x160
   ? srso_return_thunk+0x5/0x5f
   entry_SYSCALL_64_after_hwframe+0x71/0x79
  RIP: 0033:0x7feeb5cc20cd
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89
  f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0
  ff ff 73 01 c3 48 8b 0d 1b cd 0c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffcf220b2c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
  RAX: ffffffffffffffda RBX: 000055fdd2916aa0 RCX: 00007feeb5cc20cd
  RDX: 0000000000000000 RSI: 000055fdd29161e0 RDI: 0000000000000035
  RBP: 00007ffcf220b380 R08: 00007feeb5d8fb20 R09: 00007ffcf220b310
  R10: 000055fdd2909dc0 R11: 0000000000000246 R12: 000055fdd29161e0
  R13: 0000000000020000 R14: 000055fdd29203e0 R15: 000055fdd2909d80
   </TASK>

We hit this when trying to initialize firmware of type
NVKM_FIRMWARE_IMG_DMA because we allocate our memory with
dma_alloc_coherent, and DMA allocations can't be turned back into memory
pages - which a scatterlist needs in order to map them.

So, fix this by allocating the memory with vmalloc instead().

V2:
* Fixup explanation as the prior one was bogus

Signed-off-by: Lyude Paul <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Cc: [email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Avenger-285714 pushed a commit to Avenger-285714/DeepinKernel that referenced this issue May 20, 2024
commit 52a6947 upstream.

Currently, enabling SG_DEBUG in the kernel will cause nouveau to hit a
BUG() on startup:

  kernel BUG at include/linux/scatterlist.h:187!
  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 7 PID: 930 Comm: (udev-worker) Not tainted 6.9.0-rc3Lyude-Test+ deepin-community#30
  Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.I0 01/22/2019
  RIP: 0010:sg_init_one+0x85/0xa0
  Code: 69 88 32 01 83 e1 03 f6 c3 03 75 20 a8 01 75 1e 48 09 cb 41 89 54
  24 08 49 89 1c 24 41 89 6c 24 0c 5b 5d 41 5c e9 7b b9 88 00 <0f> 0b 0f 0b
  0f 0b 48 8b 05 5e 46 9a 01 eb b2 66 66 2e 0f 1f 84 00
  RSP: 0018:ffffa776017bf6a0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffffa77600d87000 RCX: 000000000000002b
  RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffa77680d87000
  RBP: 000000000000e000 R08: 0000000000000000 R09: 0000000000000000
  R10: ffff98f4c46aa508 R11: 0000000000000000 R12: ffff98f4c46aa508
  R13: ffff98f4c46aa008 R14: ffffa77600d4a000 R15: ffffa77600d4a018
  FS:  00007feeb5aae980(0000) GS:ffff98f5c4dc0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f22cb9a4520 CR3: 00000001043ba000 CR4: 00000000003506f0
  Call Trace:
   <TASK>
   ? die+0x36/0x90
   ? do_trap+0xdd/0x100
   ? sg_init_one+0x85/0xa0
   ? do_error_trap+0x65/0x80
   ? sg_init_one+0x85/0xa0
   ? exc_invalid_op+0x50/0x70
   ? sg_init_one+0x85/0xa0
   ? asm_exc_invalid_op+0x1a/0x20
   ? sg_init_one+0x85/0xa0
   nvkm_firmware_ctor+0x14a/0x250 [nouveau]
   nvkm_falcon_fw_ctor+0x42/0x70 [nouveau]
   ga102_gsp_booter_ctor+0xb4/0x1a0 [nouveau]
   r535_gsp_oneinit+0xb3/0x15f0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? nvkm_udevice_new+0x95/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? ktime_get+0x47/0xb0
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_oneinit_+0x4f/0x120 [nouveau]
   nvkm_subdev_init_+0x39/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_init+0x44/0x90 [nouveau]
   nvkm_device_init+0x166/0x2e0 [nouveau]
   nvkm_udevice_init+0x47/0x70 [nouveau]
   nvkm_object_init+0x41/0x1c0 [nouveau]
   nvkm_ioctl_new+0x16a/0x290 [nouveau]
   ? __pfx_nvkm_client_child_new+0x10/0x10 [nouveau]
   ? __pfx_nvkm_udevice_new+0x10/0x10 [nouveau]
   nvkm_ioctl+0x126/0x290 [nouveau]
   nvif_object_ctor+0x112/0x190 [nouveau]
   nvif_device_ctor+0x23/0x60 [nouveau]
   nouveau_cli_init+0x164/0x640 [nouveau]
   nouveau_drm_device_init+0x97/0x9e0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? pci_update_current_state+0x72/0xb0
   ? srso_return_thunk+0x5/0x5f
   nouveau_drm_probe+0x12c/0x280 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   local_pci_probe+0x45/0xa0
   pci_device_probe+0xc7/0x270
   really_probe+0xe6/0x3a0
   __driver_probe_device+0x87/0x160
   driver_probe_device+0x1f/0xc0
   __driver_attach+0xec/0x1f0
   ? __pfx___driver_attach+0x10/0x10
   bus_for_each_dev+0x88/0xd0
   bus_add_driver+0x116/0x220
   driver_register+0x59/0x100
   ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau]
   do_one_initcall+0x5b/0x320
   do_init_module+0x60/0x250
   init_module_from_file+0x86/0xc0
   idempotent_init_module+0x120/0x2b0
   __x64_sys_finit_module+0x5e/0xb0
   do_syscall_64+0x83/0x160
   ? srso_return_thunk+0x5/0x5f
   entry_SYSCALL_64_after_hwframe+0x71/0x79
  RIP: 0033:0x7feeb5cc20cd
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89
  f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0
  ff ff 73 01 c3 48 8b 0d 1b cd 0c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffcf220b2c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
  RAX: ffffffffffffffda RBX: 000055fdd2916aa0 RCX: 00007feeb5cc20cd
  RDX: 0000000000000000 RSI: 000055fdd29161e0 RDI: 0000000000000035
  RBP: 00007ffcf220b380 R08: 00007feeb5d8fb20 R09: 00007ffcf220b310
  R10: 000055fdd2909dc0 R11: 0000000000000246 R12: 000055fdd29161e0
  R13: 0000000000020000 R14: 000055fdd29203e0 R15: 000055fdd2909d80
   </TASK>

We hit this when trying to initialize firmware of type
NVKM_FIRMWARE_IMG_DMA because we allocate our memory with
dma_alloc_coherent, and DMA allocations can't be turned back into memory
pages - which a scatterlist needs in order to map them.

So, fix this by allocating the memory with vmalloc instead().

V2:
* Fixup explanation as the prior one was bogus

Signed-off-by: Lyude Paul <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Cc: [email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Avenger-285714 pushed a commit that referenced this issue May 20, 2024
commit 52a6947 upstream.

Currently, enabling SG_DEBUG in the kernel will cause nouveau to hit a
BUG() on startup:

  kernel BUG at include/linux/scatterlist.h:187!
  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 7 PID: 930 Comm: (udev-worker) Not tainted 6.9.0-rc3Lyude-Test+ #30
  Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.I0 01/22/2019
  RIP: 0010:sg_init_one+0x85/0xa0
  Code: 69 88 32 01 83 e1 03 f6 c3 03 75 20 a8 01 75 1e 48 09 cb 41 89 54
  24 08 49 89 1c 24 41 89 6c 24 0c 5b 5d 41 5c e9 7b b9 88 00 <0f> 0b 0f 0b
  0f 0b 48 8b 05 5e 46 9a 01 eb b2 66 66 2e 0f 1f 84 00
  RSP: 0018:ffffa776017bf6a0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffffa77600d87000 RCX: 000000000000002b
  RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffa77680d87000
  RBP: 000000000000e000 R08: 0000000000000000 R09: 0000000000000000
  R10: ffff98f4c46aa508 R11: 0000000000000000 R12: ffff98f4c46aa508
  R13: ffff98f4c46aa008 R14: ffffa77600d4a000 R15: ffffa77600d4a018
  FS:  00007feeb5aae980(0000) GS:ffff98f5c4dc0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f22cb9a4520 CR3: 00000001043ba000 CR4: 00000000003506f0
  Call Trace:
   <TASK>
   ? die+0x36/0x90
   ? do_trap+0xdd/0x100
   ? sg_init_one+0x85/0xa0
   ? do_error_trap+0x65/0x80
   ? sg_init_one+0x85/0xa0
   ? exc_invalid_op+0x50/0x70
   ? sg_init_one+0x85/0xa0
   ? asm_exc_invalid_op+0x1a/0x20
   ? sg_init_one+0x85/0xa0
   nvkm_firmware_ctor+0x14a/0x250 [nouveau]
   nvkm_falcon_fw_ctor+0x42/0x70 [nouveau]
   ga102_gsp_booter_ctor+0xb4/0x1a0 [nouveau]
   r535_gsp_oneinit+0xb3/0x15f0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? nvkm_udevice_new+0x95/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? ktime_get+0x47/0xb0
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_oneinit_+0x4f/0x120 [nouveau]
   nvkm_subdev_init_+0x39/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_init+0x44/0x90 [nouveau]
   nvkm_device_init+0x166/0x2e0 [nouveau]
   nvkm_udevice_init+0x47/0x70 [nouveau]
   nvkm_object_init+0x41/0x1c0 [nouveau]
   nvkm_ioctl_new+0x16a/0x290 [nouveau]
   ? __pfx_nvkm_client_child_new+0x10/0x10 [nouveau]
   ? __pfx_nvkm_udevice_new+0x10/0x10 [nouveau]
   nvkm_ioctl+0x126/0x290 [nouveau]
   nvif_object_ctor+0x112/0x190 [nouveau]
   nvif_device_ctor+0x23/0x60 [nouveau]
   nouveau_cli_init+0x164/0x640 [nouveau]
   nouveau_drm_device_init+0x97/0x9e0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? pci_update_current_state+0x72/0xb0
   ? srso_return_thunk+0x5/0x5f
   nouveau_drm_probe+0x12c/0x280 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   local_pci_probe+0x45/0xa0
   pci_device_probe+0xc7/0x270
   really_probe+0xe6/0x3a0
   __driver_probe_device+0x87/0x160
   driver_probe_device+0x1f/0xc0
   __driver_attach+0xec/0x1f0
   ? __pfx___driver_attach+0x10/0x10
   bus_for_each_dev+0x88/0xd0
   bus_add_driver+0x116/0x220
   driver_register+0x59/0x100
   ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau]
   do_one_initcall+0x5b/0x320
   do_init_module+0x60/0x250
   init_module_from_file+0x86/0xc0
   idempotent_init_module+0x120/0x2b0
   __x64_sys_finit_module+0x5e/0xb0
   do_syscall_64+0x83/0x160
   ? srso_return_thunk+0x5/0x5f
   entry_SYSCALL_64_after_hwframe+0x71/0x79
  RIP: 0033:0x7feeb5cc20cd
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89
  f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0
  ff ff 73 01 c3 48 8b 0d 1b cd 0c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffcf220b2c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
  RAX: ffffffffffffffda RBX: 000055fdd2916aa0 RCX: 00007feeb5cc20cd
  RDX: 0000000000000000 RSI: 000055fdd29161e0 RDI: 0000000000000035
  RBP: 00007ffcf220b380 R08: 00007feeb5d8fb20 R09: 00007ffcf220b310
  R10: 000055fdd2909dc0 R11: 0000000000000246 R12: 000055fdd29161e0
  R13: 0000000000020000 R14: 000055fdd29203e0 R15: 000055fdd2909d80
   </TASK>

We hit this when trying to initialize firmware of type
NVKM_FIRMWARE_IMG_DMA because we allocate our memory with
dma_alloc_coherent, and DMA allocations can't be turned back into memory
pages - which a scatterlist needs in order to map them.

So, fix this by allocating the memory with vmalloc instead().

V2:
* Fixup explanation as the prior one was bogus

Signed-off-by: Lyude Paul <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Cc: [email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
opsiff pushed a commit that referenced this issue May 21, 2024
commit 52a6947 upstream.

Currently, enabling SG_DEBUG in the kernel will cause nouveau to hit a
BUG() on startup:

  kernel BUG at include/linux/scatterlist.h:187!
  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 7 PID: 930 Comm: (udev-worker) Not tainted 6.9.0-rc3Lyude-Test+ #30
  Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.I0 01/22/2019
  RIP: 0010:sg_init_one+0x85/0xa0
  Code: 69 88 32 01 83 e1 03 f6 c3 03 75 20 a8 01 75 1e 48 09 cb 41 89 54
  24 08 49 89 1c 24 41 89 6c 24 0c 5b 5d 41 5c e9 7b b9 88 00 <0f> 0b 0f 0b
  0f 0b 48 8b 05 5e 46 9a 01 eb b2 66 66 2e 0f 1f 84 00
  RSP: 0018:ffffa776017bf6a0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffffa77600d87000 RCX: 000000000000002b
  RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffa77680d87000
  RBP: 000000000000e000 R08: 0000000000000000 R09: 0000000000000000
  R10: ffff98f4c46aa508 R11: 0000000000000000 R12: ffff98f4c46aa508
  R13: ffff98f4c46aa008 R14: ffffa77600d4a000 R15: ffffa77600d4a018
  FS:  00007feeb5aae980(0000) GS:ffff98f5c4dc0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f22cb9a4520 CR3: 00000001043ba000 CR4: 00000000003506f0
  Call Trace:
   <TASK>
   ? die+0x36/0x90
   ? do_trap+0xdd/0x100
   ? sg_init_one+0x85/0xa0
   ? do_error_trap+0x65/0x80
   ? sg_init_one+0x85/0xa0
   ? exc_invalid_op+0x50/0x70
   ? sg_init_one+0x85/0xa0
   ? asm_exc_invalid_op+0x1a/0x20
   ? sg_init_one+0x85/0xa0
   nvkm_firmware_ctor+0x14a/0x250 [nouveau]
   nvkm_falcon_fw_ctor+0x42/0x70 [nouveau]
   ga102_gsp_booter_ctor+0xb4/0x1a0 [nouveau]
   r535_gsp_oneinit+0xb3/0x15f0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? nvkm_udevice_new+0x95/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? ktime_get+0x47/0xb0
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_oneinit_+0x4f/0x120 [nouveau]
   nvkm_subdev_init_+0x39/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_init+0x44/0x90 [nouveau]
   nvkm_device_init+0x166/0x2e0 [nouveau]
   nvkm_udevice_init+0x47/0x70 [nouveau]
   nvkm_object_init+0x41/0x1c0 [nouveau]
   nvkm_ioctl_new+0x16a/0x290 [nouveau]
   ? __pfx_nvkm_client_child_new+0x10/0x10 [nouveau]
   ? __pfx_nvkm_udevice_new+0x10/0x10 [nouveau]
   nvkm_ioctl+0x126/0x290 [nouveau]
   nvif_object_ctor+0x112/0x190 [nouveau]
   nvif_device_ctor+0x23/0x60 [nouveau]
   nouveau_cli_init+0x164/0x640 [nouveau]
   nouveau_drm_device_init+0x97/0x9e0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? pci_update_current_state+0x72/0xb0
   ? srso_return_thunk+0x5/0x5f
   nouveau_drm_probe+0x12c/0x280 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   local_pci_probe+0x45/0xa0
   pci_device_probe+0xc7/0x270
   really_probe+0xe6/0x3a0
   __driver_probe_device+0x87/0x160
   driver_probe_device+0x1f/0xc0
   __driver_attach+0xec/0x1f0
   ? __pfx___driver_attach+0x10/0x10
   bus_for_each_dev+0x88/0xd0
   bus_add_driver+0x116/0x220
   driver_register+0x59/0x100
   ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau]
   do_one_initcall+0x5b/0x320
   do_init_module+0x60/0x250
   init_module_from_file+0x86/0xc0
   idempotent_init_module+0x120/0x2b0
   __x64_sys_finit_module+0x5e/0xb0
   do_syscall_64+0x83/0x160
   ? srso_return_thunk+0x5/0x5f
   entry_SYSCALL_64_after_hwframe+0x71/0x79
  RIP: 0033:0x7feeb5cc20cd
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89
  f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0
  ff ff 73 01 c3 48 8b 0d 1b cd 0c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffcf220b2c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
  RAX: ffffffffffffffda RBX: 000055fdd2916aa0 RCX: 00007feeb5cc20cd
  RDX: 0000000000000000 RSI: 000055fdd29161e0 RDI: 0000000000000035
  RBP: 00007ffcf220b380 R08: 00007feeb5d8fb20 R09: 00007ffcf220b310
  R10: 000055fdd2909dc0 R11: 0000000000000246 R12: 000055fdd29161e0
  R13: 0000000000020000 R14: 000055fdd29203e0 R15: 000055fdd2909d80
   </TASK>

We hit this when trying to initialize firmware of type
NVKM_FIRMWARE_IMG_DMA because we allocate our memory with
dma_alloc_coherent, and DMA allocations can't be turned back into memory
pages - which a scatterlist needs in order to map them.

So, fix this by allocating the memory with vmalloc instead().

V2:
* Fixup explanation as the prior one was bogus

Signed-off-by: Lyude Paul <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Cc: [email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
opsiff pushed a commit that referenced this issue Nov 23, 2024
commit 9b340aeb26d50e9a9ec99599e2a39b035fac978e upstream.

Currently, enabling SG_DEBUG in the kernel will cause nouveau to hit a
BUG() on startup, when the iommu is enabled:

kernel BUG at include/linux/scatterlist.h:187!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 930 Comm: (udev-worker) Not tainted 6.9.0-rc3Lyude-Test+ #30
Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.I0 01/22/2019
RIP: 0010:sg_init_one+0x85/0xa0
Code: 69 88 32 01 83 e1 03 f6 c3 03 75 20 a8 01 75 1e 48 09 cb 41 89 54
24 08 49 89 1c 24 41 89 6c 24 0c 5b 5d 41 5c e9 7b b9 88 00 <0f> 0b 0f 0b
0f 0b 48 8b 05 5e 46 9a 01 eb b2 66 66 2e 0f 1f 84 00
RSP: 0018:ffffa776017bf6a0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffa77600d87000 RCX: 000000000000002b
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffa77680d87000
RBP: 000000000000e000 R08: 0000000000000000 R09: 0000000000000000
R10: ffff98f4c46aa508 R11: 0000000000000000 R12: ffff98f4c46aa508
R13: ffff98f4c46aa008 R14: ffffa77600d4a000 R15: ffffa77600d4a018
FS:  00007feeb5aae980(0000) GS:ffff98f5c4dc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f22cb9a4520 CR3: 00000001043ba000 CR4: 00000000003506f0
Call Trace:
 <TASK>
 ? die+0x36/0x90
 ? do_trap+0xdd/0x100
 ? sg_init_one+0x85/0xa0
 ? do_error_trap+0x65/0x80
 ? sg_init_one+0x85/0xa0
 ? exc_invalid_op+0x50/0x70
 ? sg_init_one+0x85/0xa0
 ? asm_exc_invalid_op+0x1a/0x20
 ? sg_init_one+0x85/0xa0
 nvkm_firmware_ctor+0x14a/0x250 [nouveau]
 nvkm_falcon_fw_ctor+0x42/0x70 [nouveau]
 ga102_gsp_booter_ctor+0xb4/0x1a0 [nouveau]
 r535_gsp_oneinit+0xb3/0x15f0 [nouveau]
 ? srso_return_thunk+0x5/0x5f
 ? srso_return_thunk+0x5/0x5f
 ? nvkm_udevice_new+0x95/0x140 [nouveau]
 ? srso_return_thunk+0x5/0x5f
 ? srso_return_thunk+0x5/0x5f
 ? ktime_get+0x47/0xb0

Fix this by using the non-coherent allocator instead, I think there
might be a better answer to this, but it involve ripping up some of
APIs using sg lists.

Cc: [email protected]
Fixes: 2541626 ("drm/nouveau/acr: use common falcon HS FW code for ACR FWs")
Signed-off-by: Dave Airlie <[email protected]>
Signed-off-by: Danilo Krummrich <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit cc29c5546c6a373648363ac49781f1d74b530707)
opsiff pushed a commit that referenced this issue Dec 16, 2024
commit 9b340aeb26d50e9a9ec99599e2a39b035fac978e upstream.

Currently, enabling SG_DEBUG in the kernel will cause nouveau to hit a
BUG() on startup, when the iommu is enabled:

kernel BUG at include/linux/scatterlist.h:187!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 930 Comm: (udev-worker) Not tainted 6.9.0-rc3Lyude-Test+ #30
Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.I0 01/22/2019
RIP: 0010:sg_init_one+0x85/0xa0
Code: 69 88 32 01 83 e1 03 f6 c3 03 75 20 a8 01 75 1e 48 09 cb 41 89 54
24 08 49 89 1c 24 41 89 6c 24 0c 5b 5d 41 5c e9 7b b9 88 00 <0f> 0b 0f 0b
0f 0b 48 8b 05 5e 46 9a 01 eb b2 66 66 2e 0f 1f 84 00
RSP: 0018:ffffa776017bf6a0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffa77600d87000 RCX: 000000000000002b
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffa77680d87000
RBP: 000000000000e000 R08: 0000000000000000 R09: 0000000000000000
R10: ffff98f4c46aa508 R11: 0000000000000000 R12: ffff98f4c46aa508
R13: ffff98f4c46aa008 R14: ffffa77600d4a000 R15: ffffa77600d4a018
FS:  00007feeb5aae980(0000) GS:ffff98f5c4dc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f22cb9a4520 CR3: 00000001043ba000 CR4: 00000000003506f0
Call Trace:
 <TASK>
 ? die+0x36/0x90
 ? do_trap+0xdd/0x100
 ? sg_init_one+0x85/0xa0
 ? do_error_trap+0x65/0x80
 ? sg_init_one+0x85/0xa0
 ? exc_invalid_op+0x50/0x70
 ? sg_init_one+0x85/0xa0
 ? asm_exc_invalid_op+0x1a/0x20
 ? sg_init_one+0x85/0xa0
 nvkm_firmware_ctor+0x14a/0x250 [nouveau]
 nvkm_falcon_fw_ctor+0x42/0x70 [nouveau]
 ga102_gsp_booter_ctor+0xb4/0x1a0 [nouveau]
 r535_gsp_oneinit+0xb3/0x15f0 [nouveau]
 ? srso_return_thunk+0x5/0x5f
 ? srso_return_thunk+0x5/0x5f
 ? nvkm_udevice_new+0x95/0x140 [nouveau]
 ? srso_return_thunk+0x5/0x5f
 ? srso_return_thunk+0x5/0x5f
 ? ktime_get+0x47/0xb0

Fix this by using the non-coherent allocator instead, I think there
might be a better answer to this, but it involve ripping up some of
APIs using sg lists.

Cc: [email protected]
Fixes: 2541626 ("drm/nouveau/acr: use common falcon HS FW code for ACR FWs")
Signed-off-by: Dave Airlie <[email protected]>
Signed-off-by: Danilo Krummrich <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit cc29c5546c6a373648363ac49781f1d74b530707)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants