Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 20 Dec 2017 17:14:48 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 203129] syslogd: /dev/console: Interrupted system call
Message-ID:  <bug-203129-8-ueDAunTmzI@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-203129-8@https.bugs.freebsd.org/bugzilla/>
References:  <bug-203129-8@https.bugs.freebsd.org/bugzilla/>

next in thread | previous in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D203129

--- Comment #10 from heas@shrubbery.net ---
I have noticed another regressive behavior from this:

Local system status:
 3:01AM  up 36 days, 7 hrs, 3 users, load averages: 165.55, 164.19, 161.10

I am forced to reboot at this point.

This is a headless server with a service processor and a serial console with
h/w flow control running 11.1-p4. A similar server without a head and no se=
rial
console does not have this problem, nor does a headless 10.3 server, nor did
this server when it was running 10.3.

By process of elimination, I have found that removing syslog.conf lines that
log to /dev/console, this is avoided (or seemes to be after 2 days without
reoccurring).

What happens is that the daily cron newsyslog rotates files and HUPs syslog=
d.=20
At some point after that, usually a few hours, the load starts to climb.  On
this server, it is many postgres processes driving that load in a tight
error/retry loop.  Postgres is configured to lot to syslog. That loop traces
as:

 96701 postgres CALL  sendto(0x7,0x7fffffffceb0,0x9d,0,0,0)
 96701 postgres RET   sendto -1 errno 55 No buffer space available
 96701 postgres CALL  nanosleep(0x7fffffffc980,0)
 96701 postgres RET   nanosleep 0
 96701 postgres CALL  sendto(0x7,0x7fffffffceb0,0x9d,0,0,0)
 96701 postgres RET   sendto -1 errno 55 No buffer space available
 96701 postgres CALL  nanosleep(0x7fffffffc980,0)

I failed to take note of what FD 7 is.

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMA=
ND
96701 pgsql         1  48    0   179M   144M RUN     9 102:21   9.47% postg=
res
40970 pgsql         1  48    0   179M   144M RUN     8  69:55   9.08% postg=
res
44637 pgsql         1  48    0   179M   144M RUN    10  67:07   8.98% postg=
res
70361 pgsql         1  48    0   179M   144M RUN     7  77:29   8.59% postg=
res
       ......

USER      PID %CPU %MEM    VSZ    RSS TT  STAT STARTED         TIME COMMAND
pgsql   32593  9.1  0.2 182788 147740  -  Rs   22:50       95:11.30 postgre=
s:
gld gld 127.0.0.1(57118)  (postgres)
pgsql   80018  9.0  0.2 182788 147932  -  Rs   00:26       74:00.04 postgre=
s:
gld gld 127.0.0.1(18337)  (postgres)
pgsql   67956  8.9  0.2 182788 147940  -  Rs   00:26       73:59.81 postgre=
s:
gld gld 127.0.0.1(18309)  (postgres)
   .....

Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
gld to postfix:
tcp4       0      0 127.0.0.1.5432         127.0.0.1.50479        ESTABLISH=
ED
tcp4       0      0 127.0.0.1.50479        127.0.0.1.5432         ESTABLISH=
ED
tcp4       0      0 127.0.0.1.2525         127.0.0.1.50478        CLOSE_WAIT
tcp4       0      0 127.0.0.1.5432         127.0.0.1.29439        ESTABLISH=
ED
tcp4       0      0 127.0.0.1.29439        127.0.0.1.5432         ESTABLISH=
ED
tcp4       0      0 127.0.0.1.2525         127.0.0.1.29438        CLOSE_WAIT
tcp4       0      0 127.0.0.1.5432         127.0.0.1.37203        CLOSED
tcp4       0      0 127.0.0.1.5432         127.0.0.1.31279        CLOSED
tcp4       0      0 127.0.0.1.5432         127.0.0.1.31277        CLOSED
tcp4       0      0 127.0.0.1.5432         127.0.0.1.35576        CLOSED
   ......
these would be postfix to gld:
tcp4     596      0 127.0.0.1.2525         127.0.0.1.37998        CLOSE_WAI=
T=20
tcp4     627      0 127.0.0.1.2525         127.0.0.1.37994        CLOSE_WAIT
tcp4     627      0 127.0.0.1.2525         127.0.0.1.37993        CLOSE_WAIT
tcp4     602      0 127.0.0.1.2525         127.0.0.1.37992        CLOSE_WAIT
tcp4     605      0 127.0.0.1.2525         127.0.0.1.37988        CLOSE_WAIT
tcp4     605      0 127.0.0.1.2525         127.0.0.1.37987        CLOSE_WAIT
tcp4     605      0 127.0.0.1.2525         127.0.0.1.37986        CLOSE_WAIT
   ......

Active UNIX domain sockets
Address          Type   Recv-Q Send-Q            Inode             Conn=20=
=20=20=20=20=20=20=20
    Refs          Nextref Addr
fffff8012e26b690 dgram       0      0 fffff8012e2d3938                0=20=
=20=20=20=20=20=20=20
       0                0 /var/named/var/run/log
fffff8012e26b780 dgram   49898      0 fffff8012e2d3b10                0
fffff808375de3c0                0 /var/run/logpriv
fffff8012e26b870 dgram   44120      0 fffff8012e2d3ce8                0
fffff81015ef8870                0 /var/run/log
fffff8012e2b51e0 seqpac      0      0 fffff80077b11938                0=20=
=20=20=20=20=20=20=20
       0                0 /var/run/devd.seqpacket.pipe
   ....

Active Internet connections
Proto Recv-Q Send-Q Local Address          Foreign Address        R-MBUF S-=
MBUF
R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX   rex=
mt
persist    keep    2msl  delack rcvtime
tcp4       0      0 127.0.0.1.5432         127.0.0.1.38803             0   =
   0
     0      0  81720  49032      1   2048      0      0 653760 392256    0.=
00=20=20
 0.00 7145.47    0.00    0.00   54.52
tcp4       0      0 127.0.0.1.38803        127.0.0.1.5432              0   =
   0
     0      0  81720  49032      1   2048      0      0 653760 392256    0.=
00=20=20
 0.00 7145.57    0.00    0.00   54.42
tcp4     668      0 127.0.0.1.2525         127.0.0.1.54066             1   =
   0
     1      0  81720  49032      1   2048   2304      0 653760 392256    0.=
00=20=20
 0.00 7145.47    0.00    0.00   54.52
tcp4       0      0 127.0.0.1.54066        127.0.0.1.2525              0   =
   0
     0      0  81720  49032      1   2048      0      0 653760 392256    0.=
00=20=20
 0.00 7145.57    0.00    0.00   54.42
tcp4     668      0 127.0.0.1.2525         127.0.0.1.33204             1   =
   0
     1      0  81720  49032      1   2048   2304      0 653760 392256    0.=
00=20=20
 0.00 7145.47    0.00    0.00   54.52
tcp4       0      0 127.0.0.1.33204        127.0.0.1.2525              0   =
   0
     0      0  81720  49032      1   2048      0      0 653760 392256    0.=
00=20=20
 0.00 7145.47  545.47    0.00   54.52
   .....

That is as far as I've analyzed this, but toward a vote to revert whatever
changed with console from 10.3.

--=20
You are receiving this mail because:
You are the assignee for the bug.=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-203129-8-ueDAunTmzI>