amdgpu: Kernel panic while playing a video (vm_fault_lookup: fault on nofault entry, addr: 0xfffffe016e1e0000) #347

@monwarez

Description

Describe the bug
The kernel panics in amdgpu, usually while a video is playing.
It happens somewhat randomly, but most of the time it occurs during video playback.
I recently switched to a custom kernel to fix an issue with linsysfs, but I was already getting this panic before that change.

The panic message is the following:
panic: vm_fault_lookup: fault on nofault entry, addr: 0xfffffe016e1e0000

FreeBSD version
FreeBSD msi.local 14.2-RELEASE-p2 FreeBSD 14.2-RELEASE-p2 feat-linsysfs-drm-fixup-n269519-5ae45e12f63f LINSYSFSFIX amd64 1402000 1402000

PCI Info

pciconf -lv
hostb0@pci0:0:0:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1630 subvendor=0x1022 subdevice=0x1630
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne Root Complex'
    class      = bridge
    subclass   = HOST-PCI
amdiommu0@pci0:0:0:2: class=0x080600 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1631 subvendor=0x1022 subdevice=0x1631
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne IOMMU'
    class      = base peripheral
    subclass   = IOMMU
hostb1@pci0:0:1:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1632 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir PCIe Dummy Host Bridge'
    class      = bridge
    subclass   = HOST-PCI
hostb2@pci0:0:2:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1632 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir PCIe Dummy Host Bridge'
    class      = bridge
    subclass   = HOST-PCI
pcib1@pci0:0:2:1: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x1634 subvendor=0x1022 subdevice=0x1453
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne PCIe GPP Bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib5@pci0:0:2:2: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x1634 subvendor=0x1022 subdevice=0x1453
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne PCIe GPP Bridge'
    class      = bridge
    subclass   = PCI-PCI
hostb3@pci0:0:8:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1632 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir PCIe Dummy Host Bridge'
    class      = bridge
    subclass   = HOST-PCI
pcib6@pci0:0:8:1: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x1635 subvendor=0x1022 subdevice=0x1635
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Internal PCIe GPP Bridge to Bus'
    class      = bridge
    subclass   = PCI-PCI
intsmb0@pci0:0:20:0: class=0x0c0500 rev=0x51 hdr=0x00 vendor=0x1022 device=0x790b subvendor=0x1849 subdevice=0xffff
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'FCH SMBus Controller'
    class      = serial bus
    subclass   = SMBus
isab0@pci0:0:20:3: class=0x060100 rev=0x51 hdr=0x00 vendor=0x1022 device=0x790e subvendor=0x1849 subdevice=0xffff
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'FCH LPC Bridge'
    class      = bridge
    subclass   = PCI-ISA
hostb4@pci0:0:24:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1448 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Device 24: Function 0'
    class      = bridge
    subclass   = HOST-PCI
hostb5@pci0:0:24:1: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1449 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Device 24: Function 1'
    class      = bridge
    subclass   = HOST-PCI
hostb6@pci0:0:24:2: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x144a subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Device 24: Function 2'
    class      = bridge
    subclass   = HOST-PCI
hostb7@pci0:0:24:3: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x144b subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Device 24: Function 3'
    class      = bridge
    subclass   = HOST-PCI
hostb8@pci0:0:24:4: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x144c subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Device 24: Function 4'
    class      = bridge
    subclass   = HOST-PCI
hostb9@pci0:0:24:5: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x144d subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Device 24: Function 5'
    class      = bridge
    subclass   = HOST-PCI
hostb10@pci0:0:24:6: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x144e subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Device 24: Function 6'
    class      = bridge
    subclass   = HOST-PCI
hostb11@pci0:0:24:7: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x144f subvendor=0x0000 subdevice=0x0000
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir Device 24: Function 7'
    class      = bridge
    subclass   = HOST-PCI
xhci0@pci0:1:0:0: class=0x0c0330 rev=0x00 hdr=0x00 vendor=0x1022 device=0x43ec subvendor=0x1b21 subdevice=0x1142
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    class      = serial bus
    subclass   = USB
ahci0@pci0:1:0:1: class=0x010601 rev=0x00 hdr=0x00 vendor=0x1022 device=0x43eb subvendor=0x1b21 subdevice=0x1062
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = '500 Series Chipset SATA Controller'
    class      = mass storage
    subclass   = SATA
pcib2@pci0:1:0:2: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x43e9 subvendor=0x1b21 subdevice=0x0201
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = '500 Series Chipset Switch Upstream Port'
    class      = bridge
    subclass   = PCI-PCI
pcib3@pci0:2:0:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x43ea subvendor=0x1b21 subdevice=0x3308
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    class      = bridge
    subclass   = PCI-PCI
pcib4@pci0:2:1:0: class=0x060400 rev=0x00 hdr=0x01 vendor=0x1022 device=0x43ea subvendor=0x1b21 subdevice=0x3308
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    class      = bridge
    subclass   = PCI-PCI
re0@pci0:3:0:0: class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x1849 subdevice=0x8168
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
iwm0@pci0:4:0:0: class=0x028000 rev=0x10 hdr=0x00 vendor=0x8086 device=0x24fb subvendor=0x8086 subdevice=0x2110
    vendor     = 'Intel Corporation'
    device     = 'Dual Band Wireless-AC 3168NGW [Stone Peak]'
    class      = network
nvme0@pci0:5:0:0: class=0x010802 rev=0x03 hdr=0x00 vendor=0x8086 device=0xf1aa subvendor=0x8086 subdevice=0x390f
    vendor     = 'Intel Corporation'
    device     = 'SSD 670p Series [Keystone Harbor]'
    class      = mass storage
    subclass   = NVM
vgapci0@pci0:6:0:0: class=0x030000 rev=0xc9 hdr=0x00 vendor=0x1002 device=0x1636 subvendor=0x1002 subdevice=0x1636
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Renoir [Radeon Vega Series / Radeon Vega Mobile Series]'
    class      = display
    subclass   = VGA
hdac0@pci0:6:0:1: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1002 device=0x1637 subvendor=0x1002 subdevice=0x1637
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Renoir Radeon High Definition Audio Controller'
    class      = multimedia
    subclass   = HDA
none0@pci0:6:0:2: class=0x108000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15df subvendor=0x1022 subdevice=0x15df
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h (Models 10h-1fh) Platform Security Processor'
    class      = encrypt/decrypt
xhci1@pci0:6:0:3: class=0x0c0330 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1639 subvendor=0x1849 subdevice=0xffff
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne USB 3.1'
    class      = serial bus
    subclass   = USB
xhci2@pci0:6:0:4: class=0x0c0330 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1639 subvendor=0x1849 subdevice=0xffff
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Renoir/Cezanne USB 3.1'
    class      = serial bus
    subclass   = USB
hdac1@pci0:6:0:6: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15e3 subvendor=0x1849 subdevice=0x288a
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h/19h/1ah HD Audio Controller'
    class      = multimedia
    subclass   = HDA
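
For anyone skimming the PCI list for the GPU: the entry that matters for amdgpu is the one whose class code starts with 0x03 (the PCI base class for display controllers). The following is only a rough sketch of how that entry could be picked out of `pciconf -lv` header lines. The function name `display_devices` and the regular expression are my own illustration, not part of pciconf, and the sample text is a small subset of the listing in this report.

```python
import re

# A few header lines copied from the pciconf -lv listing in this report.
PCICONF_HEADERS = """\
hostb0@pci0:0:0:0: class=0x060000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x1630 subvendor=0x1022 subdevice=0x1630
nvme0@pci0:5:0:0: class=0x010802 rev=0x03 hdr=0x00 vendor=0x8086 device=0xf1aa subvendor=0x8086 subdevice=0x390f
vgapci0@pci0:6:0:0: class=0x030000 rev=0xc9 hdr=0x00 vendor=0x1002 device=0x1636 subvendor=0x1002 subdevice=0x1636
hdac0@pci0:6:0:1: class=0x040300 rev=0x00 hdr=0x00 vendor=0x1002 device=0x1637 subvendor=0x1002 subdevice=0x1637
"""

# Hypothetical pattern for a pciconf header line: name@selector: class=... vendor=... device=...
HEADER_RE = re.compile(
    r"^(?P<name>[^@\s]+)@(?P<sel>pci\d+:\d+:\d+:\d+):\s+"
    r"class=0x(?P<cls>[0-9a-f]{6})\s.*?"
    r"vendor=(?P<vendor>0x[0-9a-f]{4})\s+device=(?P<device>0x[0-9a-f]{4})"
)


def display_devices(text):
    """Return (name, selector, vendor, device) for entries with PCI base class 0x03."""
    found = []
    for line in text.splitlines():
        m = HEADER_RE.match(line)
        # The first byte of the 24-bit class code is the base class; 0x03 is display.
        if m and m.group("cls").startswith("03"):
            found.append((m.group("name"), m.group("sel"), m.group("vendor"), m.group("device")))
    return found


print(display_devices(PCICONF_HEADERS))
```

On this machine that singles out vgapci0 (vendor 0x1002, device 0x1636), the Renoir integrated GPU.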

DRM KMOD version
drm-61-kmod 6.1.128.1402000_2

To Reproduce
There is no reliable reproducer; it happens occasionally.

Additional context
The backtrace is the following:

#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:405
#2  0xffffffff80b3d7a7 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:523
#3  0xffffffff80b3dc7e in vpanic (fmt=0xffffffff8112d293 "%s: fault on nofault entry, addr: %#lx", ap=ap@entry=0xfffffe00df9406a0)
    at /usr/src/sys/kern/kern_shutdown.c:967
#4  0xffffffff80b3dad3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:891
#5  0xffffffff80eb9ecf in vm_fault_lookup (fs=0xfffffe00df940710) at /usr/src/sys/vm/vm_fault.c:912
#6  vm_fault (map=<optimized out>, vaddr=18446741880828723200, fault_type=1 '\001', fault_flags=<optimized out>, m_hold=m_hold@entry=0x0)
    at /usr/src/sys/vm/vm_fault.c:1569
#7  0xffffffff80eb8841 in vm_fault_trap (map=<optimized out>, vaddr=<optimized out>, fault_type=<optimized out>, fault_flags=<unavailable>,
    fault_flags@entry=0, signo=0x0, ucode=0x0) at /usr/src/sys/vm/vm_fault.c:712
#8  0xffffffff81025bce in trap_pfault (frame=0xfffffe00df940890, usermode=false, signo=<unavailable>, ucode=<unavailable>)
    at /usr/src/sys/amd64/amd64/trap.c:845
#9  <signal handler called>
#10 memcpy_std () at /usr/src/sys/amd64/amd64/support.S:547
#11 0xffffffff83a15a4a in dc_resource_state_copy_construct () from /boot/modules/amdgpu.ko
#12 0xffffffff839dfabb in amdgpu_dm_atomic_commit_tail () from /boot/modules/amdgpu.ko
#13 0xffffffff8361cee5 in commit_tail () from /boot/modules/drm.ko
#14 0xffffffff80dc3054 in linux_work_fn (context=0x2e9d055f, pending=<optimized out>)
    at /usr/src/sys/compat/linuxkpi/common/src/linux_work.c:301
#15 0xffffffff80ba0d72 in taskqueue_run_locked (queue=queue@entry=0xfffff80001a56700) at /usr/src/sys/kern/subr_taskqueue.c:518
#16 0xffffffff80ba1ff2 in taskqueue_thread_loop (arg=arg@entry=0xfffff80001a0a980) at /usr/src/sys/kern/subr_taskqueue.c:830
#17 0xffffffff80af760f in fork_exit (callout=0xffffffff80ba1f30 <taskqueue_thread_loop>, arg=0xfffff80001a0a980, frame=0xfffffe00df940f40)
    at /usr/src/sys/kern/kern_fork.c:1164
#18 <signal handler called>
#19 0x364e58673a0e5863 in ?? ()
Backtrace stopped: Cannot access memory at address 0x9179c32b9d39c3af
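
As a quick sanity check, the decimal `vaddr` shown in frame #6 can be converted to hex to confirm it is the same address that appears in the panic message. This is just a small sketch; the variable names are mine.

```python
# Decimal fault address as printed by the debugger in frame #6 (vm_fault).
vaddr_decimal = 18446741880828723200

# Address from the panic message: "fault on nofault entry, addr: 0xfffffe016e1e0000".
panic_addr = 0xfffffe016e1e0000

vaddr_hex = hex(vaddr_decimal)
print(vaddr_hex)                    # hex form of the frame #6 address
print(vaddr_decimal == panic_addr)  # whether both refer to the same address
```

The two values match, so the faulting access in `memcpy_std` (frame #10, called from `dc_resource_state_copy_construct`) is the one that hit the nofault entry.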
