Date:      Wed, 19 Aug 2009 22:27:00 +0600
From:      Дмитрий Замураев <gigabyte.tmn@gmail.com>
To:        <alexpalias-bsdnet@yahoo.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: em driver input errors
Message-ID:  <000e01ca20e9$e19caa10$1e010a0a@in72.ru>
References:  <24727.68667.qm@web56404.mail.re3.yahoo.com>

Hello Alex.

Which scheduler are you using, ULE or 4BSD?
Does your NIC share an IRQ with other hardware?
What is your HZ value? 1000?

>Thanks for the suggestion.
>From a "clean" box:
>dev.em.0.rx_int_delay: 0
>dev.em.0.tx_int_delay: 66
>dev.em.0.rx_abs_int_delay: 66
>dev.em.0.tx_abs_int_delay: 66
>I reset all the values (errors still appearing), then tried your suggestion 
>(rx_int_delay=600, rx_abs_int_delay=1000).  This has reduced the number of 
>interrupts for em0 (from about 7200/sec to around 6500/sec).  After some 
>time, I started getting errors again.
Hmm, try the maximum value, 67108, and see what happens...
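
For reference, a rough sketch of where 67108 probably comes from. It assumes (from my reading of the em driver source, not from anything stated in this thread) that the delay registers are 16-bit counters in units of 1.024 usec and that the driver rounds when converting. The function names are only illustrative.

```python
# Assumption: em(4) interrupt delay registers are 16 bits wide and count
# ticks of 1.024 microseconds; the sysctl values are in microseconds.
MAX_TICKS = 0xFFFF  # largest value a 16-bit register can hold


def ticks_to_usecs(ticks: int) -> int:
    """Convert register ticks to microseconds, rounding to nearest."""
    return (1024 * ticks + 500) // 1000


def usecs_to_ticks(usecs: int) -> int:
    """Convert microseconds to register ticks, rounding to nearest."""
    return (1000 * usecs + 512) // 1024


print(ticks_to_usecs(MAX_TICKS))   # 67108, the maximum mentioned above
print(ticks_to_usecs(64))          # 66, the default seen on the "clean" box
print(usecs_to_ticks(600), usecs_to_ticks(1000))  # ticks for the suggested values
```

If that assumption holds, the default of 66 is simply 64 ticks, and 67108 is the largest delay the register can express.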

> But that has made me try this also:
>dev.em.0.tx_int_delay=600
>dev.em.0.tx_abs_int_delay=1000
I think it's a bad idea, but I can't say for sure, because:
>Meaning using your suggested values for tx too.  Now em0 is seeing about 
>1800 interrupts/second, which is way better, but after some time I saw 
>errors again...

>From the output of "netstat -nI em0 -w 5":
Maybe that's a mistake; did you mean "netstat -w5 em0"?

I have a PPPoE concentrator based on an S3000AHV motherboard with a Core2Quad 
6600 and four Intel PCI-E x1 and PCI-E x4 NICs (four, to load all the CPU 
cores).
My load:
bras1 [/usr/home/dm]# netstat -w5 em0
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
    943831     0  803741196     932221     0  766771487     0
^C
bras1 [/usr/home/dm]# netstat -w1 -Iem0
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
     24067     0   20593033      17152     0   17361755     0
^C
bras1 [/usr/home/dm]# netstat -w1 -Ilagg0
            input        (lagg0)           output
   packets  errs      bytes    packets  errs      bytes colls
     47085     0   38454150      46708     0   38128482     0
     44888     0   36087138      44714     0   35985529     0
     49607     0   40467232      49326     0   40227456     0
^C
bras1 [/usr/home/dm]# netstat -w5 -Ilagg0
            input        (lagg0)           output
   packets  errs      bytes    packets  errs      bytes colls
    230260     0  187650240     228911     0  186485136     0
    238023     0  194650670     236648     0  193471650     0
    218424     0  175576014     216860     0  174282762     0
^C
The lagg0 interface aggregates em0, em1, em2 and em3 using LACP, and 
communicates with a Cisco 2960G switch.
vmstat -i says:
interrupt                          total       rate
irq4: sio0                         95234          0
irq19: atapci1                   8430157          1
cpu0: timer                   1275549106        258
irq256: em0                   2329917460        472
irq257: em1                    645070135        130
irq258: em2                   3527395550        715
irq259: em3                   3923746474        795
cpu1: timer                   1275548822        258
cpu3: timer                   1275548798        258
cpu2: timer                   1275548865        258
Total                        15536850601       3149
And I don't have any problems. I think I chose good hardware.
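
As a rough way to compare interrupt load between boxes, the em lines from "vmstat -i" can be summed. This is only a sketch: the parsing assumes the four-column layout shown above, the function name is made up for illustration, and the rate column in vmstat -i is an average since boot, so it is not directly comparable to a momentary netstat sample.

```python
# The four em lines from the vmstat -i output above.
VMSTAT = """\
irq256: em0               2329917460        472
irq257: em1               645070135        130
irq258: em2              3527395550        715
irq259: em3             3923746474        795
"""


def em_interrupt_rates(text: str) -> dict:
    """Return {device: average interrupts/sec} for em devices."""
    rates = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: "irqNNN:", device name, total count, rate
        if len(parts) == 4 and parts[1].startswith("em"):
            rates[parts[1]] = int(parts[3])
    return rates


rates = em_interrupt_rates(VMSTAT)
for dev, rate in sorted(rates.items()):
    print(f"{dev}: {rate}/s")
print("all em NICs:", sum(rates.values()), "interrupts/s on average")
```

On these numbers the four NICs together average well below the 7200/sec quoted for a single em0 earlier in the thread.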

>            input          (em0)           output
>   packets  errs      bytes    packets  errs      bytes colls
>     87267     0   50372599     106931     0   81598993     0
>     86496     0   50990332     105467     0   80064657     0
>     81726  3056   49876613      99080     0   73273640     0
>     90425     0   59172531     105299     0   77110096     0
>    120292     0   70369292     109597     0   78626248     0
>... a few minutes pass with zero errors ...
>    89646     0   56951878     111240     0   86493393     0
>     86031     0   53549721     108695     0   83592747     0
>     77760  3054   48505562      96912     0   73185576     0
>     87508     0   56116394     106094     0   79130608     0
>     89031     0   56490982     103039     0   77398567     0
>What's interesting is that I'm seeing errors in a 80k packets/5 sec (so 
>around 16k packets/s) zone, but no errors at 120k packets/5sec (24kpps).
Yes, that's not normal.
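
To put numbers on that observation, here is a small sketch that turns the quoted 5-second samples into packets per second and an input error percentage. The sample rows are copied from the first block of the quoted output; the helper name and the choice to express errors as a percentage of received packets are my own assumptions.

```python
INTERVAL_S = 5  # sampling interval, from "netstat -nI em0 -w 5"

# Columns: in-packets, in-errs, in-bytes, out-packets, out-errs, out-bytes, colls
SAMPLE = """\
     87267     0   50372599     106931     0   81598993     0
     86496     0   50990332     105467     0   80064657     0
     81726  3056   49876613      99080     0   73273640     0
     90425     0   59172531     105299     0   77110096     0
    120292     0   70369292     109597     0   78626248     0
"""


def summarize(text: str, interval_s: int = INTERVAL_S) -> list:
    """Return (input pps, input errors, errors as % of input packets) per row."""
    rows = []
    for line in text.splitlines():
        fields = line.lstrip("> ").split()
        # Skip headers and anything that is not a 7-column numeric row.
        if len(fields) != 7 or not all(f.isdigit() for f in fields):
            continue
        in_pkts, in_errs = int(fields[0]), int(fields[1])
        pps = in_pkts / interval_s
        err_pct = 100.0 * in_errs / in_pkts if in_pkts else 0.0
        rows.append((pps, in_errs, err_pct))
    return rows


for pps, errs, err_pct in summarize(SAMPLE):
    print(f"{pps:8.0f} pps  {errs:5d} errs  {err_pct:5.2f}%")
```

The row with errors works out to roughly 16k pps, while the busiest row (about 24k pps) is clean, which matches what Alex describes.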

>Interrupts total (as reported by systat):  around 13500/second.  I would 
>estimate the old IRQ load at around 30000-35000/second, which doesn't seem 
>too much to me, for a dual xeon machine.
I think it depends on the motherboard. What is the full hardware 
specification you are using, including chip names?

>Speaking of which, I did compile the kernel with "options DEVICE_POLLING", 
>but enabling polling only made the errors appear more often, and in greater 
>numbers.
I don't use polling on FreeBSD 7.x; it is usable on older FreeBSD versions.

> - 1 x dual-port gigabit interface, PCI-X
Maybe I have this card. It was unstable for me; I don't remember exactly 
what happened, but in tcpdump I saw "truncated IP, missing XX bytes".

Good luck. 



