Date:      Mon, 01 Jul 2013 12:27:25 +0400
From:      "Alexander V. Chernikov" <melifaro@FreeBSD.org>
To:        Navdeep Parhar <np@FreeBSD.org>
Cc:        net@freebsd.org
Subject:   Re: cxgbetool & hw filtering issues
Message-ID:  <51D13D6D.7030603@FreeBSD.org>
In-Reply-To: <51D089D9.6080901@FreeBSD.org>
References:  <51D03FCE.1060102@FreeBSD.org> <51D089D9.6080901@FreeBSD.org>

On 30.06.2013 23:41, Navdeep Parhar wrote:
> On 06/30/13 07:25, Alexander V. Chernikov wrote:
>> Hello list!
>>
>> While experimenting with Chelsio T440-CR (cxgbe) internal firewall, I'm
>> getting some kind of unexpected results:
> One bit of general advice to begin with: add "hitcnts 1" to all your
> filter rules and then you can see how many incoming packets hit that
> filter in the output of "cxgbetool t4nex0 filter list".  I really should
> make hitcnts=1 the default in the driver.
Thanks for the hint.
>
>> filtering 'type ipv4 action drop' permits IPv4 TCP traffic with bad
>> checksum.
> It may be that a bad checksum makes it an invalid IPv4 packet to the
> chip and so it doesn't hit the "type ipv4" rule.  There is an entirely
> separate knob available to have the chip drop bad packets if you don't
> want to see them.  The default is to let them through so that users can
> examine them with tcpdump etc.
That's OK; Intel has a similar tunable (the IXGBE_FCTRL_SBP flag).
How can I tune this?
>
>> filtering 'type IPv6 action drop' permits IPv6 traffic to multicast
>> addresses (MLDv2, etc..)
> The DMAC is an L2 multicast address?  Try "proto 58 hitcnts 1 action
> drop" to get these ICMP6 packets.
>
>> filtering 'ethtype 34525 action drop' (drop all IPv6) results in
>> 'CHELSIO_T4_SET_FILTER: Argument list too long', contrary to what the
>> budget table in cxgbetool.8 says
> This _would_ have gotten everything with ethertype ipv6 but the default
> filter mode doesn't have ethtype enabled, which is why it's complaining:
> # cxgbetool t4nex0 filter mode
> ipv4 ipv6 sip dip sport dport matchtype proto ivlan iport fcoe
Well,
./cxgbetool t4nex0 filter mode ipv4 ipv6 sip dip sport dport matchtype proto vlan iport
cxgbetool: CHELSIO_T4_SET_FILTER_MODE: Operation not supported

(Probably because -t4_set_filter_mode() is still under "#ifdef notyet" 
in t4_main.c) :)
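
A minimal sketch of checking for this before installing a rule, assuming the mode string has the space-separated form quoted above. The helper name mode_has_field is hypothetical, and the mode is hard-coded here rather than read from the card:

```shell
#!/bin/sh
# mode_has_field MODE_STRING FIELD  (hypothetical helper, not part of cxgbetool)
# Returns 0 if FIELD appears as a whole word in MODE_STRING, 1 otherwise.
mode_has_field() {
    for f in $1; do
        [ "$f" = "$2" ] && return 0
    done
    return 1
}

# Default mode as quoted above; on a live system this would come from
#   cxgbetool t4nex0 filter mode
mode="ipv4 ipv6 sip dip sport dport matchtype proto ivlan iport fcoe"

if mode_has_field "$mode" ethtype; then
    echo "ethtype is in the filter mode"
else
    echo "ethtype is not in the filter mode"
fi

# 34525 is the decimal form of the IPv6 ethertype seen in the tcpdump output.
printf '0x%04x\n' 34525
```

With the default mode this reports that ethtype is missing, which matches the error on the 'ethtype 34525' rule.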
>
>> filtering 'matchtype 4 action drop' or similar (4,5,4:0,4:4, 5:0, 5:5)
>> does not match anything, even though some traffic definitely meets those
>> conditions.
>> filtering 'action drop' and 'iport X action drop' filters IPv4 traffic
>> only.
> Strange.  I use "iport X action drop hitcnts 1" as a packet black hole
> all the time.  Were these the only filters when you tried them?  Are you
> sure your packets didn't hit some other rule and were delivered as a
> result of that?  Check the order in "cxgbetool t4nex0 filter list"
  TESTING COUNTER:
# ipfw show 200
00200     432677     57910898 deny ip from any to any via cxgbe3
# while true; do sleep 1; ipfw show 200 ; ipfw -q zero 200 ;done

[## EMPTY # ./cxgbetool t4nex0 filter list      ##]

00200     281878     80450397 deny ip from any to any via cxgbe3
00200     281451     80296577 deny ip from any to any via cxgbe3
00200     299594     85351560 deny ip from any to any via cxgbe3

[##
# ./cxgbetool t4nex0 filter 0 iport 3 hitcnts 1 action drop
# ./cxgbetool t4nex0 filter list
  Idx     Hits FCoE Port      vld:VLAN  Prot MPS Frag                  DIP                  SIP     DPORT     SPORT Action
    0  1841792  0/0  3/7 0:0000/0:0000 00/00 0/0  0/0 00000000/00000000    00000000/00000000 0000/0000 0000/0000 Drop
##]

00200     115487     15451587 deny ip from any to any via cxgbe3
00200     115148     15414229 deny ip from any to any via cxgbe3
00200     116008     15526682 deny ip from any to any via cxgbe3
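
To put a number on the two sets of samples, here is a rough sketch that averages the packet column before and after the filter was installed. It assumes each line covers about one second (the loop sleeps 1 and zeroes the counter each pass); the helper name avg_pkts is made up for this example:

```shell
#!/bin/sh
# avg_pkts (hypothetical helper): read "ipfw show" lines on stdin and print
# the mean of the packet counter (second column), truncated to an integer.
avg_pkts() {
    awk '{ sum += $2; n++ } END { if (n) printf "%d\n", sum / n }'
}

# Samples copied from the transcript: no hardware filter installed.
before=$(printf '%s\n' \
    "00200     281878     80450397 deny ip from any to any via cxgbe3" \
    "00200     281451     80296577 deny ip from any to any via cxgbe3" \
    "00200     299594     85351560 deny ip from any to any via cxgbe3" | avg_pkts)

# Samples copied from the transcript: "filter 0 iport 3 hitcnts 1 action drop".
after=$(printf '%s\n' \
    "00200     115487     15451587 deny ip from any to any via cxgbe3" \
    "00200     115148     15414229 deny ip from any to any via cxgbe3" \
    "00200     116008     15526682 deny ip from any to any via cxgbe3" | avg_pkts)

echo "before: $before pkts/sample, after: $after pkts/sample"
echo "reduction: $(( (before - after) * 100 / before ))%"
```

The point is only that the rate drops but stays well above zero, so a sizeable share of the traffic still reaches the host.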

[ ## Same as before: IPv4 TCP packets with bad checksums and IPv6 traffic to
L2 multicast MAC addresses still get through:

# tcpdump -i cxgbe3 -lns0 -c1 ip
tcpdump: WARNING: cxgbe3: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cxgbe3, link-type EN10MB (Ethernet), capture size 65535 bytes
12:09:42.249299 IP 95.108.170.36.39215 > 93.158.158.93.80: Flags [P.], seq 2064108148:2064108546, ack 4252238260, win 1040, options [nop,nop,TS val 538195909 ecr 1194268184], length 398

12:12 [0] test25# tcpdump -i cxgbe3 -lnes0 -c10 ip6
tcpdump: WARNING: cxgbe3: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cxgbe3, link-type EN10MB (Ethernet), capture size 65535 bytes
12:12:16.728912 80:49:71:11:8d:a2 > 33:33:00:00:00:fb, ethertype IPv6 (0x86dd), length 324: fe80::8249:71ff:fe11:8da2.5353 > ff02::fb.5353: 0*- [0q] 2/0/7 PTR zivot-osx._smb._tcp.local., TXT "model=MacBookAir4,2" (262)
12:12:16.728923 00:25:90:0e:00:b8 > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 134: fe80::225:90ff:fe0e:b8 > ff02::1: ICMP6, router advertisement, length 80
12:12:16.728942 5c:26:0a:6e:b4:76 > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 130: fe80::884:a1e8:86ae:57f7 > ff02::16: HBH ICMP6, multicast listener report v2, 3 group record(s), length 68
12:12:16.728968 5c:26:0a:6e:b4:76 > 33:33:ff:0e:00:b8, ethertype IPv6 (0x86dd), length 86: fe80::884:a1e8:86ae:57f7 > ff02::1:ff0e:b8: ICMP6, neighbor solicitation, who has fe80::225:90ff:fe0e:b8, length 32
12:12:16.728971 5c:26:0a:6e:b4:76 > 33:33:ff:0e:00:b8, ethertype IPv6 (0x86dd), length 86: fe80::884:a1e8:86ae:57f7 > ff02::1:ff0e:b8: ICMP6, neighbor solicitation, who has fe80::225:90ff:fe0e:b8, length 32
12:12:16.729011 5c:26:0a:6e:b4:76 > 33:33:ff:0e:00:b8, ethertype IPv6 (0x86dd), length 86: fe80::884:a1e8:86ae:57f7 > ff02::1:ff0e:b8: ICMP6, neighbor solicitation, who has fe80::225:90ff:fe0e:b8, length 32
12:12:16.729011 20:c9:d0:2b:b7:28 > 33:33:00:00:00:fb, ethertype IPv6 (0x86dd), length 321: fe80::22c9:d0ff:fe2b:b728.5353 > ff02::fb.5353: 0*- [0q] 2/0/7 PTR octo-osx._smb._tcp.local., TXT "model=MacBookAir5,2" (259)
12:12:16.729012 20:c9:d0:7c:cb:1d > 33:33:00:00:00:fb, ethertype IPv6 (0x86dd), length 95: fe80::22c9:d0ff:fe7c:cb1d.5353 > ff02::fb.5353: 0 PTR (QM)? _smb._tcp.local. (33)
12:12:16.729021 5c:26:0a:6e:b4:76 > 33:33:ff:4f:dc:69, ethertype IPv6 (0x86dd), length 78: :: > ff02::1:ff4f:dc69: ICMP6, neighbor solicitation, who has 2a02:6b8:0:401:c599:50e2:184f:dc69, length 24
12:12:16.729022 5c:26:0a:6e:b4:76 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: fe80::884:a1e8:86ae:57f7 > ff02::2: ICMP6, router solicitation, length 16

## ]
>
> Also, are you going by the ifnet rx stats as displayed by netstat etc.?
>   Right now the driver fills the ifnet stats directly from hardware
> registers rather than counting the packets that it actually received
> from the chip.  The hardware registers include packets that would have
> been delivered to the driver if no filters were present but are dropped
> due to a filter.
I'm counting packets with the ipfw "deny ip from any to any via cxgbe3"
counter, as described in the setup scenario :)
>
>> filter 'type ipv6 ...' can be set on (0,4,8,12,...) filter numbers
>> yelling 'CHELSIO_T4_SET_FILTER: Invalid argument' on other numbers.
> Yes, IPv6 filters take 4 tid's (non-IPv6 take 1) and these tid's have to
> start at a naturally aligned boundary.  No way around this.
No problem :)
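
For scripting rule installation, the alignment constraint can be checked up front. This is a small sketch that assumes only what is stated above (an IPv6 filter occupies 4 consecutive slots and must start at an index that is a multiple of 4); both function names are hypothetical:

```shell
#!/bin/sh
# ipv6_filter_idx_ok IDX  (hypothetical helper)
# Succeeds if IDX is a valid starting index for a 4-slot IPv6 filter.
ipv6_filter_idx_ok() {
    [ $(( $1 % 4 )) -eq 0 ]
}

# next_ipv6_filter_idx IDX  (hypothetical helper)
# Prints the first naturally aligned index that is >= IDX.
next_ipv6_filter_idx() {
    echo $(( ($1 + 3) / 4 * 4 ))
}

for idx in 0 1 4 5 8; do
    if ipv6_filter_idx_ok "$idx"; then
        echo "index $idx: usable for 'type ipv6'"
    else
        echo "index $idx: not aligned, next usable is $(next_ipv6_filter_idx "$idx")"
    fi
done
```

Note that an IPv6 filter at index N also consumes N+1 through N+3, so IPv4 rules have to be numbered around it.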
>
>> What can I do to debug further/fix this behavior?
>>
>> Some more questions:
>> Does anybody know how I can get/set the total number of HW firewall
>> records? There is such a tunable in the Linux version.
> I will add a simple sysctl for this.  For now you can indirectly figure
> this out from the output of "sysctl -n dev.t4nex.0.misc.tids" -- the
> FTIDs are the filter tids.  For example I see 1456 filters on this card:
> trantor:~# sysctl -n dev.t4nex.0.misc.tids
> ATID range: 0-8191, in use: 0
> TID range: 2048-18431, in use: 0
> STID range: 0-511, in use: 0
> FTID range: 512-1967
> HW TID usage: 0 IP users, 0 IPv6 users
> trantor:~# echo $((1967 - 512 + 1))
> 1456
Thanks!
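
Until that sysctl exists, the same arithmetic can be scripted. A minimal sketch, assuming the "FTID range: first-last" line keeps the format shown above; ftid_count is a made-up helper name and the input is the sample output from this thread rather than a live sysctl call:

```shell
#!/bin/sh
# ftid_count (hypothetical helper): read the dev.t4nex.0.misc.tids text on
# stdin and print the number of filter tids, i.e. last - first + 1.
ftid_count() {
    awk -F'[: -]+' '/^FTID range/ { print $4 - $3 + 1 }'
}

# Sample output quoted above; on a live system pipe in
#   sysctl -n dev.t4nex.0.misc.tids
tids='ATID range: 0-8191, in use: 0
TID range: 2048-18431, in use: 0
STID range: 0-511, in use: 0
FTID range: 512-1967
HW TID usage: 0 IP users, 0 IPv6 users'

printf '%s\n' "$tids" | ftid_count
```

For the card in the example this prints 1456, the same figure as the manual calculation.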
>
>> Is there any way to retrieve _host_ interface statistic (e.g. how much
>> traffic in packets/bytes are thrown to NIC driver)?
> cxgbe(4) doesn't count this stuff itself.  Currently it just reads the
Understood. I'll use hitcnts counters, then.
> hardware registers once per second and it's done.  Software stats would
> have to be per queue (and then aggregated from time to time).  I'll wait
The counter(9) framework handles this aggregation automatically for sysctls.
> to see where the PCPU counter work in the kernel goes before reworking
> this part of the driver.
Well, it's actually working, and working great :)
We're using PCPU counters for if_vlan, ipfw and IP stack statistics.
>
> Regards,
> Navdeep
>



