Date:      Mon, 13 Feb 2017 09:49:53 -0800
From:      Lee Damon <nomad@castle.org>
To:        freebsd-stable@freebsd.org
Subject:   Re: FBSD 10.3 + ZFS + Sun x4500 = utter lock up.
Message-ID:  <fc17381c-0eba-1731-b232-451ab562626c@castle.org>
In-Reply-To: <op.yvcgvjqdkndu52@53556c9c.cm-6-6b.dynamic.ziggo.nl>
References:  <44ecebcb-fb48-d828-7f08-47a981b732d2@castle.org> <op.yvcgvjqdkndu52@53556c9c.cm-6-6b.dynamic.ziggo.nl>

In what was arguably a silly attempt, I bound all IRQs to CPU0 and ...
the host has stayed up through multiple attempts to crash it. I'm not
calling it fixed yet, but there appears to be hope.

Right now I have a script -- /usr/local/etc/rc.d/cpuset.sh -- that's
doing the work. That seems a sub-optimal place to do it, as the host
could crash before the script runs at boot. Is there an option in the
boot loader (or something related) for setting these, or is cpuset(1)
my only option?
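
For reference, here is a minimal sketch of the kind of thing such a
script could do. It is not the actual contents of cpuset.sh. The
function names irq_numbers and pin_irqs_to_cpu0 and the DRY_RUN
variable are made up for illustration. It assumes "vmstat -i" lists
device interrupts as "irqNN: name" (as in the output quoted below) and
uses the same "cpuset -l 0 -x NN" form already mentioned in this thread.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are hypothetical, not from cpuset.sh.

# Read `vmstat -i` output on stdin and print one IRQ number per line.
# Lines that do not start with "irqNN:" (header, cpuN:timer, Total) are skipped.
irq_numbers() {
    sed -n 's/^irq\([0-9][0-9]*\):.*/\1/p'
}

# Read IRQ numbers on stdin and bind each one to CPU 0.
# With DRY_RUN set to a non-empty value the commands are printed, not run.
pin_irqs_to_cpu0() {
    while read -r irq; do
        ${DRY_RUN:+echo} cpuset -l 0 -x "$irq"
    done
}

# On the host, as root, the two pieces would be combined like this:
#   vmstat -i | irq_numbers | pin_irqs_to_cpu0
```

Setting DRY_RUN=1 before running the pipeline shows which cpuset
commands would be issued without changing any interrupt affinity.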

thanks,
nomad

>> FreeBSD [redacted] 10.3-STABLE FreeBSD 10.3-STABLE #2 r313008: Tue Jan
>> 31 01:50:49 PST 2017     lvd@[redacted]:/usr/obj/usr/src/sys/GENERIC 
>> amd64
>>
>> I'm trying to get FBSD 10.3 working on a Sun x4500 (don't ask) for use
>> as a ZFS-based backup server. However, whenever any amount of data is
>> put into a zpool and then zpool scrub is run the host locks up hard. On
>> reboot it complains that a "Hyper transport sync flood occurred".
>>
>> I found
>> https://lists.freebsd.org/pipermail/freebsd-stable/2012-January/065542.html
>>
>> which seems to match but when I try the cpuset command mentioned there I
>> get an error:
>>
>> ; sudo cpuset -c -l 0 -x 58
>> cpuset: setaffinity: Invalid argument
>>
>> Looks like the -c was invalid. After removing that I was informed -x 58
>> wasn't valid. Sure enough, there's no mpt0 or IRQ 58 on the host:
>>
>> ; vmstat -i
>> interrupt                          total       rate
>> irq17: ohci2                        8578          2
>> irq18: ohci3                         473          0
>> irq19: ohci0 ohci1+                 4924          1
>> irq24: mvs0                          457          0
>> irq32: mvs1                          453          0
>> irq38: mvs2                          451          0
>> irq46: mvs3                         8063          1
>> irq52: em0                        152354         35
>> irq53: em1                           140          0
>> irq68: mvs4                          450          0
>> irq76: mvs5                          454          0
>> cpu0:timer                        208311         48
>> cpu1:timer                         98318         23
>> cpu2:timer                        105704         24
>> cpu3:timer                        106202         24
>> Total                             695332        162
>>
>> Looking around with some help from #freebsd on efnet I found mvs0-5
>> which are connected to the Marvell drive controllers on the host. I then
>> used
>>   ; sudo cpuset -l 0 -x ##
>> where I replaced ## with 24, 32, 38, 46, 68, and 76.
>>
>> After rebuilding the zpool I started writing to it. It took a lot less
>> time to crash - I didn't even need to run zpool scrub - but instead of
>> completely locking up it just rebooted. I did not see reference to the
>> hyper transport problem while watching it boot but given the poor
>> performance of the serial console I can't be 100% sure it wasn't there.
>>
>> So now I turn here to ask for guidance. Is anyone currently successfully
>> running 10.x on an x4500 and if so, how are you doing it? If not, how can
>> I get this working?
>>
>> thanks,
>> nomad


