kernel NULL pointer dereference when accessing clkpm #753

Open
1 of 2 tasks
gulafaran opened this issue Dec 20, 2024 · 0 comments
Labels
bug Something isn't working


NVIDIA Open GPU Kernel Modules Version

565.77

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • I confirm that this does not happen with the proprietary driver package.

Operating System and Version

Gentoo Linux

Kernel Release

Linux 6.12.5, self-built.

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Hardware: GPU

GPU 0: NVIDIA GeForce RTX 4080 Laptop GPU (UUID: GPU-c14b4260-e302-0ed3-abed-c517e7fee34f)

Describe the bug

Reading the clkpm attribute, e.g. with cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/link/clkpm, causes a kernel NULL pointer dereference:

[56813.210846] BUG: kernel NULL pointer dereference, address: 0000000000000030
[56813.210850] #PF: supervisor read access in kernel mode
[56813.210851] #PF: error_code(0x0000) - not-present page
[56813.210852] PGD 0 P4D 0 
[56813.210853] Oops: Oops: 0000 [#1] PREEMPT SMP
[56813.210856] CPU: 0 UID: 1000 PID: 426007 Comm: cat Tainted: G S   U     O       6.12.5-gentoo-tom #3
[56813.210858] Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER, [O]=OOT_MODULE
[56813.210859] Hardware name: LENOVO 82WQ/LNVNB161216, BIOS KWCN46WW 07/04/2024
[56813.210860] RIP: 0010:clkpm_show+0x4c/0x70
[56813.210864] Code: bf 28 09 00 00 48 8b 4f 10 48 83 79 10 00 74 18 48 8b 49 38 48 85 c9 74 0f 80 79 6c 00 74 09 48 8b 89 b8 00 00 00 eb 02 31 c9 <48> 8b 51 30 48 c1 ea 28 83 e2 01 48 89 c7 48 c7 c6 67 23 53 8d e8
[56813.210865] RSP: 0018:ffffa85ee5f7fc40 EFLAGS: 00010202
[56813.210866] RAX: ffff92a8879de000 RBX: ffffffff8dc20280 RCX: 0000000000000000
[56813.210867] RDX: ffff92a8879de000 RSI: ffffffff8dc20280 RDI: ffff92a6c5426000
[56813.210868] RBP: ffff92a6c3f4e3e8 R08: 0000000000001000 R09: ffff92a8879de000
[56813.210869] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8d786cd8
[56813.210869] R13: ffff92a6cd5866c0 R14: ffff92a6c54260c8 R15: ffff92a8879de000
[56813.210870] FS:  0000555555518740(0000) GS:ffff92be0d400000(0000) knlGS:0000000000000000
[56813.210871] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[56813.210872] CR2: 0000000000000030 CR3: 00000003e3636000 CR4: 00000000007526f0
[56813.210873] PKRU: 55555554
[56813.210873] Call Trace:
[56813.210876]  <TASK>
[56813.210878]  ? __die_body+0x66/0xb0
[56813.210880]  ? page_fault_oops+0x34c/0x3a0
[56813.210883]  ? exc_page_fault+0x63/0xd0
[56813.210885]  ? asm_exc_page_fault+0x22/0x30
[56813.210887]  ? clkpm_show+0x4c/0x70
[56813.210887]  dev_attr_show+0x14/0x40
[56813.210889]  sysfs_kf_seq_show+0x91/0xf0
[56813.210893]  seq_read_iter+0x16d/0x3c0
[56813.210895]  vfs_read+0x285/0x320
[56813.210898]  ksys_read+0x68/0xc0
[56813.210900]  do_syscall_64+0x80/0x150
[56813.210902]  ? __count_memcg_events+0x65/0xf0
[56813.210905]  ? handle_mm_fault+0x937/0xab0
[56813.210906]  ? do_user_addr_fault+0x1b7/0x6e0
[56813.210908]  ? exc_page_fault+0x63/0xd0
[56813.210909]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[56813.210910] RIP: 0033:0x555555319bed
[56813.210912] Code: 29 52 0e 00 f7 d8 64 89 02 b8 ff ff ff ff eb bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 80 3d 79 d4 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec
[56813.210913] RSP: 002b:00007fffffffc5a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[56813.210914] RAX: ffffffffffffffda RBX: 0000000000040000 RCX: 0000555555319bed
[56813.210915] RDX: 0000000000040000 RSI: 00005555554d7000 RDI: 0000000000000003
[56813.210916] RBP: 0000000000040000 R08: 00007ffff7ffc480 R09: 0000000000000000
[56813.210916] R10: 0000000000000022 R11: 0000000000000246 R12: 00005555554d7000
[56813.210917] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000001
[56813.210918]  </TASK>
[56813.210918] Modules linked in: snd_usb_audio snd_ump snd_usbmidi_lib snd_rawmidi rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_ctl_led bnep vfat fat intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl intel_powerclamp snd_sof_intel_hda_generic soundwire_intel soundwire_cadence coretemp snd_sof_intel_hda_common iwlmvm snd_soc_hdac_hda snd_hda_codec_hdmi kvm_intel snd_sof_intel_hda_mlink mac80211 snd_sof_intel_hda snd_sof_pci snd_hda_codec_realtek snd_sof_xtensa_dsp kvm snd_hda_scodec_component libarc4 snd_hda_codec_generic snd_sof rapl snd_sof_utils intel_cstate snd_hda_ext_core snd_soc_acpi_intel_match soundwire_generic_allocation snd_soc_acpi int3403_thermal intel_uncore soundwire_bus snd_hda_scodec_tas2781_i2c snd_hda_intel snd_soc_tas2781_fmwlib snd_intel_dspcfg snd_soc_tas2781_comlib snd_intel_sdw_acpi crc8 iwlwifi uvcvideo snd_hda_codec spi_nor btusb uvc snd_soc_core processor_thermal_device_pci videobuf2_vmalloc
[56813.210948]  btrtl mtd videobuf2_memops processor_thermal_device snd_hda_core ac97_bus btmtk processor_thermal_power_floor videobuf2_v4l2 snd_pcm_dmaengine processor_thermal_wt_hint btbcm processor_thermal_wt_req videobuf2_common snd_hwdep snd_compress btintel r8169 i2c_i801 spi_intel_pci processor_thermal_rfim cfg80211 thunderbolt snd_pcm nvidia_wmi_ec_backlight wmi_bmof videodev intel_rapl_msr processor_thermal_mbox realtek i2c_smbus spi_intel processor_thermal_rapl snd_timer bluetooth idma64 intel_rapl_common mc int340x_thermal_zone snd soundcore mei_hdcp mei_pxp intel_pmc_core intel_vsec int3400_thermal pmt_telemetry acpi_pad acpi_thermal_rel pmt_class acpi_tad joydev mei_me mei legion_laptop(O) fuse loop nfnetlink crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 nvme ucsi_acpi sha256_ssse3 hid_multitouch typec_ucsi sha1_ssse3 nvme_core nvme_auth typec i2c_hid_acpi i2c_hid pinctrl_alderlake serio_raw uas hid_logitech_hidpp usb_storage i2c_nvidia_gpu
[56813.210980]  nvidia_drm(O) drm_ttm_helper nvidia_uvm(O) nvidia_modeset(O) nvidia(O) i915 cec i2c_algo_bit drm_display_helper drm_buddy ttm ideapad_laptop sparse_keymap rfkill platform_profile video wmi
[56813.210989] CR2: 0000000000000030
[56813.210990] ---[ end trace 0000000000000000 ]---
[56813.210990] RIP: 0010:clkpm_show+0x4c/0x70
[56813.210991] Code: bf 28 09 00 00 48 8b 4f 10 48 83 79 10 00 74 18 48 8b 49 38 48 85 c9 74 0f 80 79 6c 00 74 09 48 8b 89 b8 00 00 00 eb 02 31 c9 <48> 8b 51 30 48 c1 ea 28 83 e2 01 48 89 c7 48 c7 c6 67 23 53 8d e8
[56813.210992] RSP: 0018:ffffa85ee5f7fc40 EFLAGS: 00010202
[56813.210993] RAX: ffff92a8879de000 RBX: ffffffff8dc20280 RCX: 0000000000000000
[56813.210993] RDX: ffff92a8879de000 RSI: ffffffff8dc20280 RDI: ffff92a6c5426000
[56813.210994] RBP: ffff92a6c3f4e3e8 R08: 0000000000001000 R09: ffff92a8879de000
[56813.210994] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8d786cd8
[56813.210995] R13: ffff92a6cd5866c0 R14: ffff92a6c54260c8 R15: ffff92a8879de000
[56813.210995] FS:  0000555555518740(0000) GS:ffff92be0d400000(0000) knlGS:0000000000000000
[56813.210996] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[56813.210997] CR2: 0000000000000030 CR3: 00000003e3636000 CR4: 00000000007526f0
[56813.210997] PKRU: 55555554
[56813.210998] note: cat[426007] exited with irqs disabled
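
For readers who want to check the fault address against the trace, here is a rough sketch that decodes the instruction at RIP from the "Code:" line above. It is not a general disassembler: it handles only the single `mov r64, [r64 + disp8]` encoding that appears here, and the helper names (`split_at_rip`, `decode_mov_load`) are made up for illustration. The RCX value is taken from the register dump.

```python
# Register names indexed by their 3-bit x86-64 encoding (no REX.R/REX.B extension).
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

# "Code:" bytes copied from the oops; the <..> token marks the byte at RIP.
CODE = ("bf 28 09 00 00 48 8b 4f 10 48 83 79 10 00 74 18 48 8b 49 38 48 85 c9 "
        "74 0f 80 79 6c 00 74 09 48 8b 89 b8 00 00 00 eb 02 31 c9 <48> 8b 51 30 "
        "48 c1 ea 28 83 e2 01 48 89 c7 48 c7 c6 67 23 53 8d e8")


def split_at_rip(code):
    """Split the hex dump into (bytes before RIP, bytes starting at RIP)."""
    tokens = code.split()
    rip = next(i for i, t in enumerate(tokens) if t.startswith("<"))

    def to_bytes(ts):
        return bytes(int(t.strip("<>"), 16) for t in ts)

    return to_bytes(tokens[:rip]), to_bytes(tokens[rip:])


def decode_mov_load(insn):
    """Decode 'REX.W 8B /r' with mod=01: mov dest, [base + disp8]."""
    rex, opcode, modrm, disp8 = insn[:4]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if rex != 0x48 or opcode != 0x8B or mod != 0b01 or rm == 0b100:
        raise ValueError("not a plain 'mov r64, [r64 + disp8]' encoding")
    disp = disp8 - 256 if disp8 > 127 else disp8  # disp8 is sign-extended
    return REGS64[reg], REGS64[rm], disp


before, at_rip = split_at_rip(CODE)
dest, base, disp = decode_mov_load(at_rip)

registers = {"rcx": 0x0}  # RCX: 0000000000000000 in the register dump
fault_addr = registers[base] + disp

print(f"mov {dest}, [{base} + {disp:#x}] -> faults at {fault_addr:#x}")
# The two bytes just before RIP are 31 c9, i.e. xor ecx, ecx.
print("zeroed just before:", before[-2:] == bytes([0x31, 0xC9]))
```

The decoded load reads from RCX + 0x30 with RCX = 0, which matches CR2 = 0x30. The preceding `xor ecx, ecx` suggests that clkpm_show reaches this load with a pointer it has explicitly set to NULL and then reads a field at offset 0x30 through it. Which structure that is cannot be determined from the trace alone.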

To Reproduce

The simplest reproducer is to run cat on the clkpm attribute.
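
To see which devices expose the attribute before touching it, something like the following could be used. It is only a sketch: it assumes the usual `/sys/bus/pci/devices/<BDF>/link/clkpm` sysfs layout, and the function names are hypothetical. It only lists paths by default, because reading one of them (via `read_attr` or cat) is exactly what triggers the oops on an affected system.

```python
import glob
import os


def find_clkpm_attrs(sysfs_pci_root="/sys/bus/pci/devices"):
    """Return the paths of all link/clkpm attributes under the given root."""
    pattern = os.path.join(sysfs_pci_root, "*", "link", "clkpm")
    return sorted(glob.glob(pattern))


def read_attr(path):
    """Read one attribute; on an affected system this is the crashing step."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    # Listing is harmless; call read_attr(path) manually to reproduce.
    for path in find_clkpm_attrs():
        print(path)
```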

Bug Incidence

Always

nvidia-bug-report.log.gz


More Info

No response

@gulafaran gulafaran added the bug Something isn't working label Dec 20, 2024