Date: Wed, 20 Dec 2017 17:14:48 +0000 From: bugzilla-noreply@freebsd.org To: freebsd-bugs@FreeBSD.org Subject: [Bug 203129] syslogd: /dev/console: Interrupted system call Message-ID: <bug-203129-8-ueDAunTmzI@https.bugs.freebsd.org/bugzilla/> In-Reply-To: <bug-203129-8@https.bugs.freebsd.org/bugzilla/> References: <bug-203129-8@https.bugs.freebsd.org/bugzilla/>
next in thread | previous in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D203129 --- Comment #10 from heas@shrubbery.net --- I have noticed another regressive behavior from this: Local system status: 3:01AM up 36 days, 7 hrs, 3 users, load averages: 165.55, 164.19, 161.10 I am forced to reboot at this point. This is a headless server with a service processor and a serial console with h/w flow control running 11.1-p4. A similar server without a head and no se= rial console does not have this problem, nor does a headless 10.3 server, nor did this server when it was running 10.3. By process of elimination, I have found that removing syslog.conf lines that log to /dev/console, this is avoided (or seemes to be after 2 days without reoccurring). What happens is that the daily cron newsyslog rotates files and HUPs syslog= d.=20 At some point after that, usually a few hours, the load starts to climb. On this server, it is many postgres processes driving that load in a tight error/retry loop. Postgres is configured to lot to syslog. That loop traces as: 96701 postgres CALL sendto(0x7,0x7fffffffceb0,0x9d,0,0,0) 96701 postgres RET sendto -1 errno 55 No buffer space available 96701 postgres CALL nanosleep(0x7fffffffc980,0) 96701 postgres RET nanosleep 0 96701 postgres CALL sendto(0x7,0x7fffffffceb0,0x9d,0,0,0) 96701 postgres RET sendto -1 errno 55 No buffer space available 96701 postgres CALL nanosleep(0x7fffffffc980,0) I failed to take note of what FD 7 is. PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMA= ND 96701 pgsql 1 48 0 179M 144M RUN 9 102:21 9.47% postg= res 40970 pgsql 1 48 0 179M 144M RUN 8 69:55 9.08% postg= res 44637 pgsql 1 48 0 179M 144M RUN 10 67:07 8.98% postg= res 70361 pgsql 1 48 0 179M 144M RUN 7 77:29 8.59% postg= res ...... USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND pgsql 32593 9.1 0.2 182788 147740 - Rs 22:50 95:11.30 postgre= s: gld gld 127.0.0.1(57118) (postgres) pgsql 80018 9.0 0.2 182788 147932 - Rs 00:26 74:00.04 postgre= s: gld gld 127.0.0.1(18337) (postgres) pgsql 67956 8.9 0.2 182788 147940 - Rs 00:26 73:59.81 postgre= s: gld gld 127.0.0.1(18309) (postgres) ..... Active Internet connections (including servers) Proto Recv-Q Send-Q Local Address Foreign Address (state) gld to postfix: tcp4 0 0 127.0.0.1.5432 127.0.0.1.50479 ESTABLISH= ED tcp4 0 0 127.0.0.1.50479 127.0.0.1.5432 ESTABLISH= ED tcp4 0 0 127.0.0.1.2525 127.0.0.1.50478 CLOSE_WAIT tcp4 0 0 127.0.0.1.5432 127.0.0.1.29439 ESTABLISH= ED tcp4 0 0 127.0.0.1.29439 127.0.0.1.5432 ESTABLISH= ED tcp4 0 0 127.0.0.1.2525 127.0.0.1.29438 CLOSE_WAIT tcp4 0 0 127.0.0.1.5432 127.0.0.1.37203 CLOSED tcp4 0 0 127.0.0.1.5432 127.0.0.1.31279 CLOSED tcp4 0 0 127.0.0.1.5432 127.0.0.1.31277 CLOSED tcp4 0 0 127.0.0.1.5432 127.0.0.1.35576 CLOSED ...... these would be postfix to gld: tcp4 596 0 127.0.0.1.2525 127.0.0.1.37998 CLOSE_WAI= T=20 tcp4 627 0 127.0.0.1.2525 127.0.0.1.37994 CLOSE_WAIT tcp4 627 0 127.0.0.1.2525 127.0.0.1.37993 CLOSE_WAIT tcp4 602 0 127.0.0.1.2525 127.0.0.1.37992 CLOSE_WAIT tcp4 605 0 127.0.0.1.2525 127.0.0.1.37988 CLOSE_WAIT tcp4 605 0 127.0.0.1.2525 127.0.0.1.37987 CLOSE_WAIT tcp4 605 0 127.0.0.1.2525 127.0.0.1.37986 CLOSE_WAIT ...... Active UNIX domain sockets Address Type Recv-Q Send-Q Inode Conn=20= =20=20=20=20=20=20=20 Refs Nextref Addr fffff8012e26b690 dgram 0 0 fffff8012e2d3938 0=20= =20=20=20=20=20=20=20 0 0 /var/named/var/run/log fffff8012e26b780 dgram 49898 0 fffff8012e2d3b10 0 fffff808375de3c0 0 /var/run/logpriv fffff8012e26b870 dgram 44120 0 fffff8012e2d3ce8 0 fffff81015ef8870 0 /var/run/log fffff8012e2b51e0 seqpac 0 0 fffff80077b11938 0=20= =20=20=20=20=20=20=20 0 0 /var/run/devd.seqpacket.pipe .... Active Internet connections Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-= MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rex= mt persist keep 2msl delack rcvtime tcp4 0 0 127.0.0.1.5432 127.0.0.1.38803 0 = 0 0 0 81720 49032 1 2048 0 0 653760 392256 0.= 00=20=20 0.00 7145.47 0.00 0.00 54.52 tcp4 0 0 127.0.0.1.38803 127.0.0.1.5432 0 = 0 0 0 81720 49032 1 2048 0 0 653760 392256 0.= 00=20=20 0.00 7145.57 0.00 0.00 54.42 tcp4 668 0 127.0.0.1.2525 127.0.0.1.54066 1 = 0 1 0 81720 49032 1 2048 2304 0 653760 392256 0.= 00=20=20 0.00 7145.47 0.00 0.00 54.52 tcp4 0 0 127.0.0.1.54066 127.0.0.1.2525 0 = 0 0 0 81720 49032 1 2048 0 0 653760 392256 0.= 00=20=20 0.00 7145.57 0.00 0.00 54.42 tcp4 668 0 127.0.0.1.2525 127.0.0.1.33204 1 = 0 1 0 81720 49032 1 2048 2304 0 653760 392256 0.= 00=20=20 0.00 7145.47 0.00 0.00 54.52 tcp4 0 0 127.0.0.1.33204 127.0.0.1.2525 0 = 0 0 0 81720 49032 1 2048 0 0 653760 392256 0.= 00=20=20 0.00 7145.47 545.47 0.00 54.52 ..... That is as far as I've analyzed this, but toward a vote to revert whatever changed with console from 10.3. --=20 You are receiving this mail because: You are the assignee for the bug.=
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-203129-8-ueDAunTmzI>