Date:      Mon, 27 Aug 2018 10:13:10 +0200
From:      Phil Norman <philnorm@gmail.com>
To:        Meowthink <meowthink@gmail.com>
Cc:        freebsd-hackers@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: Help diagnose my Ryzen build problem
Message-ID:  <CAOa8eG4UGCo3Evz7sp7w72irtP2yb=-9-KURrvCQGu6Z-1HwVA@mail.gmail.com>
In-Reply-To: <CABnABoZA4DUOFfr7JdbbBAWxak3=ge6zX0HXtu1RffQH7tSb2Q@mail.gmail.com>
References:  <CABnABoZA4DUOFfr7JdbbBAWxak3=ge6zX0HXtu1RffQH7tSb2Q@mail.gmail.com>

Hi.

I have a similar setup: Ryzen 3 and Fatal1ty X370 mini-ITX. I had some
trouble with instability, although my problems weren't panics, but rather
two issues. One was random lockups (with no evidence left in logs), but I
*think* this was down to an inadequately cooled graphics card.

The other problem I had was with USB: the logs filled up with messages
about USB reinitialisation. However, eventually I figured out that the
problem didn't occur if I booted the system from a completely powered-down
state. That is, use the physical switch on the PSU to cut power entirely,
re-enable, then boot from that state. Since then I've had 67 days of
uninterrupted uptime, with no USB issues at all.

It sounds like your problem is different, but trying a boot-from-cold might
be worthwhile, just in case ASRock have a consistent problem in this regard.
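
If you want to check whether the xhci0 interrupt rate you mention looks any
different after a cold boot, something along these lines might help. It is
only a rough sketch: it parses the output of `vmstat -i` and lists sources
whose rate is at or above a threshold. The sample text, the irq number in
it, the 1000/s threshold and the function names are all made up for
illustration. Note that the rate column is an average since boot, so it
needs some uptime to be meaningful.

```python
import subprocess

# Hypothetical sample in the layout printed by FreeBSD's `vmstat -i`;
# the numbers are invented and not taken from a real machine.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                           2          0
cpu0:timer                       5812345        100
irq264: xhci0                  116123456       1998
Total                          121935803       2098
"""


def parse_vmstat_i(text):
    """Return {interrupt source: average rate per second}."""
    rates = {}
    for line in text.splitlines():
        # The last two whitespace-separated fields are total and rate;
        # everything before them is the source name (which may contain spaces).
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()):
            continue  # header line
        name = name.strip()
        if name == "Total":
            continue  # summary line
        rates[name] = int(rate)
    return rates


def noisy_sources(rates, threshold=1000):
    """Names of sources at or above `threshold` interrupts per second."""
    return sorted(name for name, rate in rates.items() if rate >= threshold)


if __name__ == "__main__":
    try:
        text = subprocess.run(["vmstat", "-i"], capture_output=True,
                              text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        text = SAMPLE  # fall back to the sample when vmstat -i is unavailable
    for name in noisy_sources(parse_vmstat_i(text)):
        print("high interrupt rate:", name)
```

Running it once after a warm reboot and once after a cold boot would give
two lists to compare.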

Cheers,
Phil

On 26 August 2018 at 13:20, Meowthink <meowthink@gmail.com> wrote:

> Hello all,
>
> Recently I tried to build up a Ryzen system and run FreeBSD on it.
> CPU:  AMD Ryzen 5 2400G with Radeon Vega Graphics (0x810f10)
> Mobo: Asrock Fatal1ty AB350 Gaming-ITX/ac ( with up-to-date BIOS with
> PinnaclePI-AM4_1.0.0.4, microcode 0x810100b )
> Mem:  2x Crucial 16GB DDR4-2400 EUDIMM CL17 ( ECC Unregistered but ECC
> actually won't work :( )
>
> But the system is unstable - it can't last a few days even when nearly
> idle. The system panics even at midnight. It almost always panics while
> or after I build something large. Surprisingly, I didn't encounter any
> user program faults, badly built binaries etc., only panics.
>
> Then I tried lots of BIOS settings, e.g. SMT, C6 idle current,
> underclocking the RAM, but none seems to have any effect.
> It passes memtest86 V7.5 without error, as well as various benchmarks
> under Windows, thus I think the problem is not in the hardware but in
> the software.
>
> In the meantime, I realized that the rate of irqs from xhci0 is too
> high - it's about 1998/s. I found [1] and tried to MFC r331665. It
> didn't fix the problem, but disabling the bluetooth module does stop
> the irq storm.
>
> After that the system lasts much longer before it panics. It can now
> compile the ports tree, build the world, scrub the zpool, all without
> annoying reboots.
> Then I assumed this is related to [2]? So I also tried cpuctl, binding
> all processes to 2-7.
> But the problem is still there, only the chance has become very low. It
> still panics occasionally, after idling a week or stressing a few hours.
> Stress seems to raise the chance of a panic, but it differs by type of
> load. Things like llvm will always build, but gcc will cause a panic
> every few passes.
>
> The system was 11.2 but then moved on to stable/11 (r337906
> currently). I've got the last 10 coredumps saved but my kernel isn't
> compiled with debugging. So I'll put some backtraces from core.txt.?
> at the end.
>
> I really want to eliminate this problem. Could someone guide me on how
> to track it down? What should I try next?
>
> Best regards,
> Meowthink
>
> [1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224886
> [2] https://reviews.freebsd.org/D11780
>
> Backtraces, newest to oldest:
> ------------------------------------------------------------------------
> Panic while compiling gcc:
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80afa5fb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80afa863 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081e962790,
>     eva=18446735309538549504) at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081e962790, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f7b984 in trap (frame=0xfffffe081e962790)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5bccc in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff822950a8 in arc_change_state ()
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1800
> #9  0xffffffff8229328b in arc_access () at time.h:145
> #10 0xffffffff82296232 in arc_write_done (zio=0xfffff8065f886410)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6169
> #11 0xffffffff82334cbe in zio_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:4032
> #12 0xffffffff8233070c in zio_execute (zio=0xfffff8065f886410)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768
> #13 0xffffffff80b52cc4 in taskqueue_run_locked (queue=0xfffff8000d9e6e00)
>     at /usr/src/sys/kern/subr_taskqueue.c:463
> #14 0xffffffff80b53e28 in taskqueue_thread_loop (arg=<value optimized out>)
>     at /usr/src/sys/kern/subr_taskqueue.c:755
> #15 0xffffffff80abd813 in fork_exit (
>     callout=0xffffffff80b53d90 <taskqueue_thread_loop>,
>     arg=0xfffff8000d967030, frame=0xfffffe081e962ac0)
>     at /usr/src/sys/kern/kern_fork.c:1072
> #16 0xffffffff80f5cc7e in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:972
> #17 0x0000000000000000 in ?? ()
> Current language:  auto; currently minimal
> (kgdb)
>
> ------------------------------------------------------------------------
> Panic while shutting down:
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80afa5fb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80afa863 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081ed30700, eva=0)
>     at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081ed30700, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f7b984 in trap (frame=0xfffffe081ed30700)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5bccc in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff80dfe4ad in vm_object_terminate (object=0xfffff805bf66d5a0)
>     at /usr/src/sys/vm/vm_object.c:768
> #9  0xffffffff80dfd0f8 in vm_object_deallocate (object=0x0)
>     at /usr/src/sys/vm/vm_object.c:677
> #10 0xffffffff80df3189 in _vm_map_unlock (map=<value optimized out>,
>     file=<value optimized out>, line=<value optimized out>)
>     at /usr/src/sys/vm/vm_map.c:2939
> #11 0xffffffff80df7be2 in vm_map_remove (map=0xfffff80018673000, start=4096,
>     end=140737488351232) at /usr/src/sys/vm/vm_map.c:3137
> #12 0xffffffff80df2e49 in vmspace_exit (td=0xfffff80039ec0620)
>     at /usr/src/sys/vm/vm_map.c:337
> #13 0xffffffff80ab72b9 in exit1 (td=0xfffff80039ec0620,
>     rval=<value optimized out>, signo=<value optimized out>)
>     at /usr/src/sys/kern/kern_exit.c:401
> #14 0xffffffff80ab6ced in sys_sys_exit (td=<value optimized out>,
>     uap=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:180
> #15 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff80039ec0620, traced=0)
>     at subr_syscall.c:132
> #16 0xffffffff80f5c5ad in fast_syscall_common ()
>     at /usr/src/sys/amd64/amd64/exception.S:494
> #17 0x00000008028d034a in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb)
>
> ------------------------------------------------------------------------
> Panic while running only my single-threaded python script:
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80afa5fb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80afa863 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081f635ec0, eva=952)
>     at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081f635ec0, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f7b984 in trap (frame=0xfffffe081f635ec0)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5bccc in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff80af57ad in __rw_wlock_hard (c=0xfffff80016f8f798,
>     v=<value optimized out>) at /usr/src/sys/kern/kern_rwlock.c:977
> #9  0xffffffff80bbca92 in bufobj_invalbuf (bo=<value optimized out>, flags=1,
>     slpflag=1017770744, slptimeo=<value optimized out>)
>     at /usr/src/sys/kern/vfs_subr.c:1609
> #10 0xffffffff80bbf8be in vgonel (vp=0xfffff8053ca9f1d8)
>     at /usr/src/sys/kern/vfs_subr.c:1655
> #11 0xffffffff80bbbcc4 in vnlru_free_locked (count=1, mnt_op=0x0)
>     at /usr/src/sys/kern/vfs_subr.c:1227
> #12 0xffffffff80bbbe14 in getnewvnode_reserve (count=1)
>     at /usr/src/sys/kern/vfs_subr.c:1287
> #13 0xffffffff82327fb4 in zfs_zget (zfsvfs=0xfffff80076574000, obj_num=34941,
>     zpp=0xfffffe081f6362a8)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1122
> #14 0xffffffff823421ad in zfs_dirent_lookup (dzp=0xfffff804ff94e420,
>     name=0xfffffe081f6363e0 "filename.ext", zpp=0xfffffe081f6362a8, flag=2)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:187
> #15 0xffffffff82342267 in zfs_dirlook (dzp=0xfffff804ff94e420,
>     name=<value optimized out>, zpp=0xfffffe081f636360)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:238
> #16 0xffffffff8235a4ef in zfs_lookup ()
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1658
> #17 0xffffffff8235ac1e in zfs_freebsd_lookup (ap=0xfffffe081f636548)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4956
> #18 0xffffffff810fe89c in VOP_CACHEDLOOKUP_APV (vop=<value optimized out>,
>     a=0xfffffe081f636548) at vnode_if.c:195
> #19 0xffffffff80ba8d56 in vfs_cache_lookup (ap=<value optimized out>)
>     at vnode_if.h:80
> #20 0xffffffff810fe77c in VOP_LOOKUP_APV (vop=<value optimized out>,
>     a=0xfffffe081f636610) at vnode_if.c:127
> #21 0xffffffff80bb2761 in lookup (ndp=0xfffffe081f636748) at vnode_if.h:54
> #22 0xffffffff80bb1c29 in namei (ndp=0xfffffe081f636748)
>     at /usr/src/sys/kern/vfs_lookup.c:448
> #23 0xffffffff80bc8238 in kern_statat (td=0xfffff8013ba1b620,
>     flag=<value optimized out>, fd=-100,
>     path=0x80332c910 <Address 0x80332c910 out of bounds>,
>     pathseg=UIO_USERSPACE, sbp=0xfffffe081f636900, hook=0)
>     at /usr/src/sys/kern/vfs_syscalls.c:2023
> #24 0xffffffff80bc817d in sys_stat (td=<value optimized out>,
>     uap=0xfffff8013ba1bb58) at /usr/src/sys/kern/vfs_syscalls.c:1978
> #25 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff8013ba1b620, traced=0)
>     at subr_syscall.c:132
> #26 0xffffffff80f5c5ad in fast_syscall_common ()
>     at /usr/src/sys/amd64/amd64/exception.S:494
> #27 0x0000000801a5b9ca in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb)
>
> ------------------------------------------------------------------------
> Panic while using mplayer:
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80af91cb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80af95f1 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80af9433 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7a13f in trap_fatal (frame=0xfffffe081f70d380, eva=0)
>     at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7a199 in trap_pfault (frame=0xfffffe081f70d380, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f79974 in trap (frame=0xfffffe081f70d380)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5a00c in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff8088c030 in hdac_stream_start (dev=<value optimized out>,
>     child=<value optimized out>, dir=0, stream=1, buf=1889533952, blksz=2048,
>     blkcnt=2) at /usr/src/sys/dev/sound/pci/hda/hdac.c:1927
> #9  0xffffffff8088437d in hdaa_channel_start (ch=<value optimized out>)
>     at hdac_if.h:84
> #10 0xffffffff80887e0d in hdaa_channel_trigger (obj=<value optimized out>,
>     data=0xfffff8007102c480, go=1)
>     at /usr/src/sys/dev/sound/pci/hda/hdaa.c:2161
> #11 0xffffffff80893b8e in chn_trigger (c=0xfffff80071058400, go=1)
>     at channel_if.h:131
> #12 0xffffffff8089751b in chn_notify (c=0xfffff80071058400,
>     flags=<value optimized out>) at /usr/src/sys/dev/sound/pcm/channel.c:2281
> #13 0xffffffff808b697f in vchan_trigger (obj=<value optimized out>,
>     data=<value optimized out>, go=1)
>     at /usr/src/sys/dev/sound/pcm/vchan.c:171
> #14 0xffffffff80893b8e in chn_trigger (c=0xfffff80071057c00, go=1)
>     at channel_if.h:131
> #15 0xffffffff8089de10 in dsp_ioctl (i_dev=<value optimized out>,
>     cmd=<value optimized out>, arg=0xfffffe081f70d8d0 "\003",
>     mode=<value optimized out>, td=<value optimized out>)
>     at /usr/src/sys/dev/sound/pcm/dsp.c:1733
> #16 0xffffffff809c5b38 in devfs_ioctl_f (fp=0xfffff802c5563c80,
>     com=2147766288, data=0xfffffe081f70d8d0, cred=0xfffff8004c482500,
>     td=0xfffff802f24ac000) at /usr/src/sys/fs/devfs/devfs_vnops.c:791
> #17 0xffffffff80b5c00d in kern_ioctl (td=0xfffff802f24ac000, fd=51,
>     com=2147766288, data=<value optimized out>) at file.h:323
> #18 0xffffffff80b5bd2c in sys_ioctl (td=0xfffff802f24ac000,
>     uap=0xfffff802f24ac538) at /usr/src/sys/kern/sys_generic.c:745
> #19 0xffffffff80f7b1c8 in amd64_syscall (td=0xfffff802f24ac000, traced=0)
>     at subr_syscall.c:132
> #20 0xffffffff80f5a8ed in fast_syscall_common ()
>     at /usr/src/sys/amd64/amd64/exception.S:494
> #21 0x0000000801fb94aa in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb)
>
> ------------------------------------------------------------------------
> Panic while idle; it seems cron jobs triggered ZFS ARC cleanup:
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80af95fb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80af9a21 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80af9863 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7b13f in trap_fatal (frame=0xfffffe081ee186e0, eva=201697507)
>     at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7b199 in trap_pfault (frame=0xfffffe081ee186e0, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f7a974 in trap (frame=0xfffffe081ee186e0)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5a5bc in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff80ad596e in free (addr=0xfffff802472af200,
>     mtp=0xffffffff825bfc00) at /usr/src/sys/kern/kern_malloc.c:583
> #9  0xffffffff8232a667 in zfs_inactive (vp=<value optimized out>,
>     cr=<value optimized out>, ct=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4333
> #10 0xffffffff82332a1d in zfs_freebsd_inactive (ap=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:5364
> #11 0xffffffff810ff6b2 in VOP_INACTIVE_APV (vop=<value optimized out>,
>     a=0xfffffe081ee18858) at vnode_if.c:1955
> #12 0xffffffff80bbd7bc in vinactive (vp=0xfffff803ae8b3760,
>     td=0xfffff803ae23b620) at vnode_if.h:807
> #13 0xffffffff80bbdcc7 in vputx (vp=0xfffff803ae8b3760, func=1)
>     at /usr/src/sys/kern/vfs_subr.c:2688
> #14 0xffffffff80bc5180 in sys_fchdir (td=0xfffff803ae23b620,
>     uap=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:724
> #15 0xffffffff80f7c1c8 in amd64_syscall (td=0xfffff803ae23b620, traced=0)
>     at subr_syscall.c:132
> #16 0xffffffff80f5ae9d in fast_syscall_common ()
>     at /usr/src/sys/amd64/amd64/exception.S:494
> #17 0x00000008008a99aa in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb)


