Date:      Sat, 5 Mar 2011 10:18:56 -0300
From:      Luiz Otavio O Souza <lists.br@gmail.com>
To:        Eduardo Schoedler <eschoedler@gmail.com>
Cc:        freebsd-net@FreeBSD.org, bug-followup@FreeBSD.org
Subject:   Re: kern/155177: [route] [panic] Panic when inject routes in kernel
Message-ID:  <B75A64E0-4CF8-4440-BF05-9CC60558F825@gmail.com>
In-Reply-To: <!&!AAAAAAAAAAAYAAAAAAAAAIuUu3nmKcFHrbjqCBaxr87CgAAAEAAAAGwEZ7agMEtAuuQPNdgBzq8BAAAAAA==@gmail.com>
References:  <!&!AAAAAAAAAAAYAAAAAAAAAIuUu3nmKcFHrbjqCBaxr87CgAAAEAAAAGwEZ7agMEtAuuQPNdgBzq8BAAAAAA==@gmail.com>


On Mar 4, 2011, at 9:10 AM, Eduardo Schoedler wrote:
> Hello,
>
> I've found another (easy) way to reproduce the problem with two scripts:
> routes-add.sh and routes-remove.sh.
> First run routes-add.sh for a while; then execute routes-remove.sh.
> Cancel with CTRL+C and execute routes-remove.sh again.
>

<snip>

Hi Eduardo,

I found another problem while trying something like what you proposed,
but it can be reproduced easily by just trying to remove a network route
that is not in the table (probably what your script does when you press
Ctrl+C and restart it).

The problem I found produces the following backtrace:

#0  doadump () at pcpu.h:244
244     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump () at pcpu.h:244
#1  0xc04d7de9 in db_fncall (dummy1=1, dummy2=0, dummy3=-1056933504,
    dummy4=0xe69ee798 "") at /usr/src/sys/ddb/db_command.c:548
#2  0xc04d81e1 in db_command (last_cmdp=0xc0e303dc, cmd_table=0x0, dopager=1)
    at /usr/src/sys/ddb/db_command.c:445
#3  0xc04d833a in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
#4  0xc04da25d in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:229
#5  0xc0902672 in kdb_trap (type=3, code=0, tf=0xe69ee948)
    at /usr/src/sys/kern/subr_kdb.c:533
#6  0xc0c137bb in trap (frame=0xe69ee948) at /usr/src/sys/i386/i386/trap.c:717
#7  0xc0bfc7ec in calltrap () at /usr/src/sys/i386/i386/exception.s:168
#8  0xc09024fa in kdb_enter (why=0xc0ce86fa "panic", msg=0xc0ce86fa "panic")
    at cpufunc.h:71
#9  0xc08cea24 in panic (fmt=0xc0cfedcb "radix node disappeared")
    at /usr/src/sys/kern/kern_shutdown.c:574
#10 0xc0996900 in rtrequest1_fib (req=2, info=0xe69eea50, ret_nrt=0xe69eea84,
    fibnum=Variable "fibnum" is not available.
) at /usr/src/sys/net/route.c:968
#11 0xc099abbd in route_output (m=0xc43a6b00, so=0xc48b0000)
    at /usr/src/sys/net/rtsock.c:630
#12 0xc09959da in raw_usend (so=0xc48b0000, flags=Variable "flags" is not available.
)
    at /usr/src/sys/net/raw_usrreq.c:228
#13 0xc0999275 in rts_send (so=0xc48b0000, flags=0, m=0xc43a6b00, nam=0x0,
    control=0x0, td=0xc49d18a0) at /usr/src/sys/net/rtsock.c:354
#14 0xc093ceed in sosend_generic (so=0xc48b0000, addr=0x0, uio=0xe69eec28,
    top=0xc43a6b00, control=0x0, flags=0, td=0xc49d18a0)
    at /usr/src/sys/kern/uipc_socket.c:1301
#15 0xc0938ddf in sosend (so=0xc48b0000, addr=0x0, uio=0xe69eec28, top=0x0,
    control=0x0, flags=0, td=0xc49d18a0)
    at /usr/src/sys/kern/uipc_socket.c:1345
#16 0xc0920ae3 in soo_write (fp=0xc4690d58, uio=0xe69eec28,
    active_cred=0xc47e8e00, flags=0, td=0xc49d18a0)
    at /usr/src/sys/kern/sys_socket.c:100
#17 0xc0919a65 in dofilewrite (td=0xc49d18a0, fd=3, fp=0xc4690d58,
    auio=0xe69eec28, offset=-1, flags=0) at file.h:238
#18 0xc091b208 in kern_writev (td=0xc49d18a0, fd=3, auio=0xe69eec28)
    at /usr/src/sys/kern/sys_generic.c:447
#19 0xc091b31f in write (td=0xc49d18a0, uap=0xe69eecec)
    at /usr/src/sys/kern/sys_generic.c:363
#20 0xc090fda3 in syscallenter (td=0xc49d18a0, sa=0xe69eece4)
    at /usr/src/sys/kern/subr_trap.c:344
#21 0xc0c13064 in syscall (frame=0xe69eed28)
    at /usr/src/sys/i386/i386/trap.c:1080
#22 0xc0bfc851 in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:266
#23 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)


Are you sure that your scripts produce the backtrace you posted? I
cannot reproduce it here.

As for the problem I found ("radix node disappeared") when removing a
nonexistent route (route delete x.y.w.z/24, where x.y.w.z/24 is _not_
in the routing table): it is related to the code that checks the
gateway when there are multiple gateways for a route, which clearly was
not the case here.

After some thought I crafted the following patch, which fixes the "radix
node disappeared" problem (for me, at least). Can you try your scripts
with this patch? I am not sure yet whether this is related to the first
problem you reported.
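
To illustrate what the attached patch changes, here is a minimal
user-space sketch of the condition before and after. It is only a model
of the branch structure visible in the diff: the struct, macro, enum and
function names (sockaddr_model, rtentry_model, RTF_GATEWAY_MODEL,
check_before, check_after) are made up for this example and are not the
kernel's, and the body of the kernel's else block is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types; only the fields the check reads. */
struct sockaddr_model {
	unsigned char sa_len;		/* total length of the address */
	unsigned char sa_data[15];
};

struct rtentry_model {
	int rt_flags;
	struct sockaddr_model *rt_gateway;
};

#define RTF_GATEWAY_MODEL 0x2		/* illustrative flag value for this sketch */

/* Which path the check takes; names are illustrative only. */
enum outcome { BRANCH_SKIPPED, GATEWAY_MISMATCH, ELSE_BLOCK };

/* Same length-then-memcmp comparison as in the diff. */
static int
gw_differs(const struct rtentry_model *rt, const struct sockaddr_model *gateway)
{
	assert(rt->rt_gateway != NULL && gateway != NULL);
	return (rt->rt_gateway->sa_len != gateway->sa_len ||
	    memcmp(rt->rt_gateway, gateway, gateway->sa_len) != 0);
}

/* Before the patch: a NULL gateway still enters the gateway-route branch. */
static enum outcome
check_before(const struct rtentry_model *rt, const struct sockaddr_model *gateway)
{
	if (rt->rt_flags & RTF_GATEWAY_MODEL) {
		if (gateway && gw_differs(rt, gateway))
			return (GATEWAY_MISMATCH);	/* error = ESRCH */
		return (ELSE_BLOCK);
	}
	return (BRANCH_SKIPPED);
}

/* After the patch: the whole branch is skipped when no gateway was given. */
static enum outcome
check_after(const struct rtentry_model *rt, const struct sockaddr_model *gateway)
{
	if (gateway && (rt->rt_flags & RTF_GATEWAY_MODEL)) {
		if (gw_differs(rt, gateway))
			return (GATEWAY_MISMATCH);	/* error = ESRCH */
		return (ELSE_BLOCK);
	}
	return (BRANCH_SKIPPED);
}
```

For a gateway route and a NULL gateway (presumably what a plain
"route delete x.y.w.z/24" passes), check_before() ends up in the else
block while check_after() skips the branch entirely; when a gateway is
supplied, both versions behave the same.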


Thanks,
Luiz



Attachment: radix_remove_gateway.diff

Index: sys/net/route.c
===================================================================
--- sys/net/route.c	(revision 219261)
+++ sys/net/route.c	(working copy)
@@ -946,7 +946,7 @@
 			RT_LOCK(rto);
 			rto->rt_flags |= RTF_UP;
 			RT_UNLOCK(rto);
-		} else if (rt->rt_flags & RTF_GATEWAY) {
+		} else if (gateway && rt->rt_flags & RTF_GATEWAY) {
 			/*
 			 * For gateway routes, we need to 
 			 * make sure that we we are deleting
@@ -955,9 +955,8 @@
 			 * check the case when there is only
 			 * one route in the chain.  
 			 */
-			if (gateway &&
-			    (rt->rt_gateway->sa_len != gateway->sa_len ||
-				memcmp(rt->rt_gateway, gateway, gateway->sa_len)))
+			if (rt->rt_gateway->sa_len != gateway->sa_len ||
+			    memcmp(rt->rt_gateway, gateway, gateway->sa_len))
 				error = ESRCH;
 			else {
 				/*
@@ -1002,7 +1001,6 @@
 nondelete:
 	if (req != RTM_DELETE)
 		panic("unrecognized request %d", req);
-	
 
 	/*
 	 * If the caller wants it, then it can have it,

> 
> 
> Backtrace:
> ==========
> 
> # cat /var/crash/core.txt.1
> <snip>
> Unread portion of the kernel message buffer:
> panic: rtfree 2
> cpuid = 4
> KDB: stack backtrace:
> #0 0xffffffff80416e43 at kdb_backtrace+0x5e
> #1 0xffffffff803e68a8 at panic+0x182
> #2 0xffffffff804b2274 at rtalloc1_fib+0
> #3 0xffffffff804b5b92 at route_output+0x304
> #4 0xffffffff8044b776 at sosend_generic+0x366
> #5 0xffffffff8042cd5c at soo_write+0x54
> #6 0xffffffff80425bee at dofilewrite+0x7a
> #7 0xffffffff80425ec1 at kern_writev+0x52
> #8 0xffffffff80425f3f at write+0x4e
> #9 0xffffffff80422408 at syscallenter+0x186
> #10 0xffffffff8065b4f7 at syscall+0x40
> #11 0xffffffff806449f2 at Xfast_syscall+0xe2
> Uptime: 37m16s
> Physical memory: 4084 MB
> Dumping 497 MB:VOP_STRATEGY: bp is not locked but should be
> 482 466 450 434 418 402 386 370 354 338 322 306 290 274 258 242 226 210 194
> 178 162 146 130 114 98 82 66 50 34 18 2
> 
> #0  doadump () at pcpu.h:224
> 224     pcpu.h: No such file or directory.
>        in pcpu.h
> (kgdb) #0  doadump () at pcpu.h:224
> #1  0xffffffff803e6425 in boot (howto=260)
>    at /usr/src/sys/kern/kern_shutdown.c:419
> #2  0xffffffff803e6892 in panic (fmt=Variable "fmt" is not available.
> )
>    at /usr/src/sys/kern/kern_shutdown.c:592
> #3  0xffffffff804b2274 in rtfree (rt=Variable "rt" is not available.
> ) at /usr/src/sys/net/route.c:446
> #4  0xffffffff804b5b92 in route_output (m=0xffffff0004790700,
>    so=0xffffff00b07ead48) at /usr/src/sys/net/rtsock.c:863
> #5  0xffffffff8044b776 in sosend_generic (so=0xffffff00b07ead48, addr=0x0,
>    uio=0xffffff830ff98a90, top=0xffffff0004790700, control=0x0, flags=0,
>    td=0xffffff0004a13000) at /usr/src/sys/kern/uipc_socket.c:1260
> #6  0xffffffff8042cd5c in soo_write (fp=Variable "fp" is not available.
> )
>    at /usr/src/sys/kern/sys_socket.c:102
> #7  0xffffffff80425bee in dofilewrite (td=0xffffff0004a13000, fd=3,
>    fp=0xffffff0004977af0, auio=0xffffff830ff98a90, offset=Variable "offset"
> is not available.
> ) at file.h:239
> #8  0xffffffff80425ec1 in kern_writev (td=0xffffff0004a13000, fd=3,
>    auio=0xffffff830ff98a90) at /usr/src/sys/kern/sys_generic.c:447
> #9  0xffffffff80425f3f in write (td=Variable "td" is not available.
> ) at /usr/src/sys/kern/sys_generic.c:363
> #10 0xffffffff80422408 in syscallenter (td=0xffffff0004a13000,
>    sa=0xffffff830ff98ba0) at /usr/src/sys/kern/subr_trap.c:315
> #11 0xffffffff8065b4f7 in syscall (frame=0xffffff830ff98c40)
>    at /usr/src/sys/amd64/amd64/trap.c:944
> #12 0xffffffff806449f2 in Xfast_syscall ()
>    at /usr/src/sys/amd64/amd64/exception.S:381
> #13 0x0000000800735afc in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
> <snip>
> 
> Again, removing RADIX_MPATH from kernel, it's working fine.
> 
> 
> Regards,
> 
> --
> Eduardo Schoedler




