Date:      Sat, 18 Nov 2017 18:16:44 +0200
From:      "Jukka A. Ukkonen" <jau789@gmail.com>
To:        questions@freebsd.org
Subject:   10.4-stable systematically crashing inside pselect() when a tun device is used
Message-ID:  <fe0a45cb-4b2e-343e-a7cc-1eb933f29b19@gmail.com>

Hello all,

As briefly stated in the subject, I have a 10-stable system on
which I have been testing a program which opens either a tun
device or a tap device, waits in pselect() for the descriptor
to become readable, and then proceeds to read the packet/frame.
When using a tun descriptor the pselect() call systematically
panics the kernel with the complaints shown in the text dump
snippet at the end of this message. When using a tap device
the same code works just fine.
After a little eyeballing I failed to notice any obvious reason
for this in the tun device code. I hope someone who knows the tun
device better might be able to tell me what I should be looking for.
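
For illustration, the open/pselect()/read pattern described above can be
sketched roughly as follows. This is not the actual test program, only a
minimal reconstruction of the described steps; the helper names
wait_readable() and read_one_packet() are hypothetical, and the device
path is left to the caller.

```c
#include <sys/types.h>
#include <sys/select.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno is set).
 * A NULL timeout blocks indefinitely. On EINTR the wait is restarted
 * with the full timeout.
 */
int
wait_readable(int fd, const struct timespec *timeout)
{
	fd_set rfds;
	int n;

	/* FD_SET() is undefined for descriptors outside [0, FD_SETSIZE). */
	if (fd < 0 || fd >= FD_SETSIZE) {
		errno = EBADF;
		return (-1);
	}
	do {
		FD_ZERO(&rfds);
		FD_SET(fd, &rfds);
		/* NULL sigmask: the signal mask is left unchanged. */
		n = pselect(fd + 1, &rfds, NULL, NULL, timeout, NULL);
	} while (n == -1 && errno == EINTR);

	if (n > 0)
		assert(FD_ISSET(fd, &rfds));
	return (n > 0 ? 1 : n);
}

/*
 * Hypothetical helper: open a tun or tap device node, block until it
 * is readable, then read one packet/frame into buf.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t
read_one_packet(const char *devpath, void *buf, size_t len)
{
	ssize_t nread = -1;
	int fd, saved_errno;

	fd = open(devpath, O_RDWR);
	if (fd == -1)
		return (-1);
	if (wait_readable(fd, NULL) == 1)
		nread = read(fd, buf, len);
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return (nread);
}
```

A caller would pass a device node path such as /dev/tun0 or /dev/tap0
(assumed names) together with a buffer large enough for one packet or
frame.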

At the very minimum I would expect the pselect() call to fail
properly with an error code. Raising a panic and crashing the
whole kernel gives me the impression that there is something
very seriously wrong there. At least for now it just has not
dawned on me what that something is.
The system doing this is just another amd64 running 10-stable.
So, this should not be a hardware-related issue on rarely used
hardware.
Any hints, pointers, helpful sophisticated guesses etc. would be
welcome.

—jau



The following 12 lines were manually copied from a photo of the
console display after the panic was triggered...

Fatal trap 12: page fault while in kernel mode
cpuid = 10; apic id = 13
fault virtual address	= 0x8
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80b29699
stack pointer		= 0x28:0xfffffe03e72a8a70
frame pointer		= 0x28:0xfffffe03e72a8ab0
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor flags		= interrupt enabled, resume, IOPL = 0
current process		= 12 (swi6: task queue)
trap number		= 12



The rest has been pulled from the core.text.0 file, but
this is apparently the exact same data that got dumped
to the console display as well...


trap number             = 12
panic: page fault
cpuid = 10
KDB: stack backtrace:
#0 0xffffffff80a97b60 at kdb_backtrace+0x60
#1 0xffffffff80a57d26 at vpanic+0x126
#2 0xffffffff80a57bf3 at panic+0x43
#3 0xffffffff80e8b84d at trap_fatal+0x35d
#4 0xffffffff80e8bb68 at trap_pfault+0x308
#5 0xffffffff80e8b1ca at trap+0x47a
#6 0xffffffff80e6f93c at calltrap+0x8
#7 0xffffffff80aaa645 at taskqueue_run_locked+0xf5
#8 0xffffffff80aaa4f3 at taskqueue_run+0x93
#9 0xffffffff80a1f209 at intr_event_execute_handlers+0xb9
#10 0xffffffff80a1f676 at ithread_loop+0x96
#11 0xffffffff80a1c93a at fork_exit+0x9a
#12 0xffffffff80e6fe7e at fork_trampoline+0xe
Uptime: 11m7s
Dumping 865 out of 16346 MB:..2%..12%..21%..32%..41%..52%..61%..71%..82%..91%



