Date:      Thu, 31 Oct 2013 17:33:49 +0400
From:      Boris Samorodov <bsam@passap.ru>
To:        pyunyh@gmail.com
Cc:        FreeBSD Stable Users <freebsd-stable@freebsd.org>
Subject:   Re: regression: msk0 watchdog timeout and interrupt storm
Message-ID:  <52725C3D.2030602@passap.ru>
In-Reply-To: <20131030021650.GA3106@michelle.cdnetworks.com>
References:  <526FBA53.9000208@passap.ru> <20131030021650.GA3106@michelle.cdnetworks.com>

30.10.2013 06:16, Yonghyeon PYUN wrote:
> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:

>> From time to time I use a notebook and boot FreeBSD from USB
>> stick. FreeBSD 9.2-i386 works OK. So I tried to use
>> FreeBSD 10.0-i386 BETA2 and the network adapter works for
>> some 10-15 seconds and then stops with diagnostic message
>> "msk0:watchdog timeout". I've found similar case at
>> freebsd-current@ with no workaround. Yes, there is an
>> interrupt storm as well.
> 
> There had been no functional changes for very long time so I'm not
> sure what's going on here.  I've attached local change I have at
> this moment but I'm afraid it wouldn't address the issue above.
> 
> I recall jhb also reported interrupt storm in the past but the root
> cause was not identified yet.  Could you change msk_intr() and let
> me know which interrupt is firing?

I've yet to organize a build.

>> Here is some additional info:
>> -----
>> mskc0@pci0:3:0:0:       class=0x020000 card=0xff501179 chip=0x435511ab rev=0x12 hdr=0x00
>>     vendor     = 'Marvell Technology Group Ltd.'
>>     device     = '88E8040T PCI-E Fast Ethernet Controller'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[48] = powerspec 3  supports D0 D1 D2 D3  current D0
>>     cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
>>     cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
>>                  speed 2.5(2.5) ASPM disabled(L0s/L1)
>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>     ecap 0003[130] = Serial 1 b8b063ffff681e00
>> -----

Meanwhile, some more investigation. Here is "vmstat -i" while calm and during the storm:
-----
interrupt                          total       rate
irq1: atkbd0                        1025          2
irq9: acpi0                          204          0
irq14: ata0                          327          0
irq16: uhci0+                        246          0
irq20: hpet0                       22472         52
irq23: uhci2 ehci1                 10341         24
irq256: hdac0                         52          0
irq257: mskc0                        258          0
irq258: ahci0                        221          0
Total                              35146         81
-----
interrupt                          total       rate
irq1: atkbd0                        1508          2
irq9: acpi0                          234          0
irq14: ata0                          409          0
irq16: uhci0+                        246          0
irq20: hpet0                       72288        131
irq23: uhci2 ehci1                 10846         19
irq256: hdac0                         52          0
irq257: mskc0                    4419760       8021
irq258: ahci0                        221          0
Total                            4505564       8177
-----
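
To make the difference between the two snapshots easier to read, here is a rough sketch that parses both "vmstat -i" listings and prints how much each counter grew. The helper names (parse_vmstat_i, growth) are made up for illustration, and the parser assumes every data row ends in two integers (total and rate), as in the output above.

```python
CALM = """\
interrupt                          total       rate
irq1: atkbd0                        1025          2
irq9: acpi0                          204          0
irq14: ata0                          327          0
irq16: uhci0+                        246          0
irq20: hpet0                       22472         52
irq23: uhci2 ehci1                 10341         24
irq256: hdac0                         52          0
irq257: mskc0                        258          0
irq258: ahci0                        221          0
Total                              35146         81
"""

STORM = """\
interrupt                          total       rate
irq1: atkbd0                        1508          2
irq9: acpi0                          234          0
irq14: ata0                          409          0
irq16: uhci0+                        246          0
irq20: hpet0                       72288        131
irq23: uhci2 ehci1                 10846         19
irq256: hdac0                         52          0
irq257: mskc0                    4419760       8021
irq258: ahci0                        221          0
Total                            4505564       8177
"""


def parse_vmstat_i(text):
    """Return {source: total} for every data row of `vmstat -i` output."""
    totals = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 2)
        # Data rows end in two integers; this skips the header and rules.
        if len(parts) != 3 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue
        totals[parts[0].strip()] = int(parts[1])
    return totals


def growth(before, after):
    """Per-source increase in the interrupt count between two snapshots."""
    return {src: after[src] - before.get(src, 0) for src in after}


diff = growth(parse_vmstat_i(CALM), parse_vmstat_i(STORM))
for src, n in sorted(diff.items(), key=lambda kv: kv[1], reverse=True):
    if src != "Total":
        print(f"{src:<24}{n:>10}")
```

On these two snapshots nearly all of the growth lands on irq257 (mskc0); the other sources barely move.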

And "vmstat -w1" for calm and storm:
-----
 procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr mm0 ad0   in   sy   cs us sy  id
 0 0 0  206928  956040   277   0   2   0   330   4   0   0  117  476  454  0  1  99
 0 0 0  206928  956036     0   0   0   0     8   4   0   0   50  123  137  0  0 100
 0 0 0  206928  956036     0   0   0   0     0   4   0   0   47  120   92  0  1  99
 0 0 0  206928  956036     0   0   0   0     0   4   0   0   43  123  119  0  1  99
 0 0 0  206928  956036     0   0   0   0     0   4   0   0   55  132  123  0  1  99
 0 0 0  206928  956004     0   0   0   0     0   4   0   0   68  123  185  0  1  99
 0 0 0  206928  956036     0   0   0   0     8   4   0   0   86  123  266  0  1  99
 0 0 0  206928  956036     0   0   0   0     0   4   0   0   44  125  124  0  0 100
 0 0 0  206928  956036     0   0   0   0     0   4   0   0   64  128  164  0  1  99
 0 0 0  206928  956036     0   0   0   0     0   4   0   0   42  131  101  0  1  99
-----
 procs      memory      page                    disks     faults             cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr mm0 ad0     in   sy     cs us sy id
 0 0 0  213648  954676   104   0   1   0   121   4   0   0  22299  204  44262  0 10 90
 0 0 0  213648  954672     0   0   0   0     8   4   0   0 112259  123 222379  0 44 56
 0 0 0  213648  954672     0   0   0   0     0   4   0   0 111792  123 221489  0 43 57
 0 0 0  213648  954672     1   0   0   0     0   4   0   0 109887  183 217754  0 43 57
 0 0 0  213648  954668     2   0   0   0     0   4   0   0 109543  146 216963  0 44 56
 0 0 0  213648  954668     0   0   0   0     0   4   0   0 110142  123 218187  0 45 55
 0 0 0  213648  954660   472   0   0   0   474   4   0   0 109340  717 216674  0 42 57
 0 0 0  213648  954656     2   0   0   0     0   4   0   0 109459  147 216831  0 43 57
 0 0 0  213648  954656     0   0   0   0     0   4   0   0 109462  131 216827  0 43 57
 0 0 0  213648  954656     0   0   0   0     0   4   0   0 109454  123 216803  0 42 58
-----
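
As a rough cross-check of the "vmstat -w1" numbers, the sketch below averages the "in" column for both runs and relates context switches to interrupts during the storm. The first row of each run is dropped on the assumption that it reflects an average since boot rather than a one-second sample; the variable names are only illustrative.

```python
from statistics import mean

# "in" (interrupts/s) and "cs" (context switches/s) columns copied from
# the two vmstat -w1 runs above.
CALM_IN = [117, 50, 47, 43, 55, 68, 86, 44, 64, 42]
STORM_IN = [22299, 112259, 111792, 109887, 109543,
            110142, 109340, 109459, 109462, 109454]
STORM_CS = [44262, 222379, 221489, 217754, 216963,
            218187, 216674, 216831, 216827, 216803]

# Assumption: the first row is a since-boot average, so skip it.
calm_rate = mean(CALM_IN[1:])
storm_rate = mean(STORM_IN[1:])
cs_per_intr = sum(STORM_CS[1:]) / sum(STORM_IN[1:])

print(f"calm:  {calm_rate:.0f} interrupts/s")
print(f"storm: {storm_rate:.0f} interrupts/s")
print(f"context switches per interrupt during storm: {cs_per_intr:.2f}")
```

The storm rate seen here is far above the "rate" column that "vmstat -i" prints for mskc0, which is expected if that column is the total divided by uptime rather than an instantaneous rate.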

Dmesg is here: ftp://ftp.wart.ru/pub/misc/tos.dmesg.boot.txt .

BTW, some more observations. While downloading a file the system
hits the watchdog timeout rather quickly, but the system keeps working.
If I upload files instead, the system works much longer (for a couple of
minutes) but then freezes. Ctrl-Alt-Esc gets no response; only a cold
restart helps.

Thanks!
-- 
WBR, Boris Samorodov (bsam)
FreeBSD Committer, http://www.FreeBSD.org The Power To Serve


