Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 8 Dec 2016 15:34:38 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Eric van Gyzen <vangyzen@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r309676 - in head: bin/ps lib/libkvm sys/compat/freebsd32 sys/kern sys/sys usr.bin/procstat usr.bin/top
Message-ID:  <20161208130730.O1089@besplex.bde.org>
In-Reply-To: <201612071504.uB7F4MCi020382@repo.freebsd.org>
References:  <201612071504.uB7F4MCi020382@repo.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 7 Dec 2016, Eric van Gyzen wrote:

> Log:
>  Export the whole thread name in kinfo_proc
>
>  kinfo_proc::ki_tdname is three characters shorter than
>  thread::td_name.  Add a ki_moretdname field for these three
>  extra characters.  Add the new field to kinfo_proc32, as well.
>  Update all in-tree consumers to read the new field and assemble
>  the full name, except for lldb's HostThreadFreeBSD.cpp, which
>  I will handle separately.  Bump __FreeBSD_version.

tdname stuff was very badly implemented.  It abuses the ki_ocomm
field for ki_tdname.  ki_ocomm was reserved for the old command
name to support old applications.  It should have been populated
with an intelligently abbreviated copy of ki_comm, so that old
applications which don't know about the ki_comm renaming see a
useful name.  It was actually populated by blindly truncating
p_comm.  This was good enough in most cases.  Then when thread
support was added, ki_ocomm was broken copying td_name to it.
For the non-threaded case, this makes no difference, but for
the threaded case this breaks ki_ocomm for old applications and
it doesn't even work right for the threaded case since ki_ocomm
is too short.  This is the problem that you are partly fixing now.

ki_ocomm should have remained as the truncated ki_comm.  ki_comm
should have been modified to contain an intelligently abbreviated
copy of the command and thread names, so that not so old applications
see any extra info in the thread name.  A new field was needed for
the thread name, and perhaps for the previous (non-mixed) command
name, so that applications can see the raw names p_comm and td_name
if the understand threads.

Renaming ki_comm was rather gratuitous.  It was only expanded from
16+1 to 19+1 bytes.  This was done mainly because 16+1 gives either
misalgnment or wastes space.  The latter occurred for p_comm[] in
struct proc.  So we expanded to 19+1 but no more to get a slightly
longer name at no cost in struct proc.  But there was a large cost
for using this.  ac_comm[] in struct acct still has size 16+0 (it
is not NUL-terminated).  And the expansion didn't just work in user.h
since the ki_comm[16+1] was packed there.  The expansion was done
cleanly by keeping renaming the old field but renaming it to
ki_ocomm[16+1], and adding a new field ki_comm[19+1].  Then abusing
ki_ocomm made a mess.

ki_ocomm and OCOMMLEN are under a bogus BURN_BRIDGES ifdef in user.h.
This doesn't even break the ABI.  It only breaks the API, but that is
less important and can be done in the head branch without any ifdefs
(just after you update all uses of the API in src).  The ABI was broken
by abusing ki_ocomm.

There is also much ugliness and loss of info by combining ki_comm and
ki_tdname unintelligently and and then truncating.  The combined length
of 19+16 (now +3 more) is too wide for most displays.  The
unintelligences also adds excessive markup like '[]{}:' characters,
giving lengths of 40+.  Then blind truncation tends to lost the most
useful info at the end.

Your 'more' field instead of renaming again not as clean or wasteful of
space as renaming again, but it is less invasive.

> Modified: head/bin/ps/print.c
> ==============================================================================
> --- head/bin/ps/print.c	Wed Dec  7 14:35:05 2016	(r309675)
> +++ head/bin/ps/print.c	Wed Dec  7 15:04:22 2016	(r309676)
> @@ -120,11 +120,12 @@ command(KINFO *k, VARENT *ve)
> 	if (cflag) {
> 		/* If it is the last field, then don't pad */
> 		if (STAILQ_NEXT(ve, next_ve) == NULL) {
> -			asprintf(&str, "%s%s%s%s",
> +			asprintf(&str, "%s%s%s%s%s",
> 			    k->ki_d.prefix ? k->ki_d.prefix : "",
> 			    k->ki_p->ki_comm,
> 			    (showthreads && k->ki_p->ki_numthreads > 1) ? "/" : "",
> -			    (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_tdname : "");
> +			    (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_tdname : "",
> +			    (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_moretdname : "");

This expands the style bugs by copying a too-long line and expanding it
further.

With intelligent combination, you couldn't do it in a big printf() arg
list.

FreeBSD cluster machines seem to have stopped showing system processes
(even idle) in ps and top, so I can't easily see many bad  messes from
unintelligent combination or debug things related to system load.  Locally,
I get for top -SH:

   PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
    10 root     155 ki31     0K    64K CPU3    3   1:59 100.27% idle{idle: cpu3}
    10 root     155 ki31     0K    64K CPU5    5   1:59 100.27% idle{idle: cpu5}
    10 root     155 ki31     0K    64K CPU6    6   1:59 100.27% idle{idle: cpu6}
    10 root     155 ki31     0K    64K CPU7    7   1:59 100.27% idle{idle: cpu7}
    10 root     155 ki31     0K    64K CPU2    2   1:59 100.27% idle{idle: cpu2}
    10 root     155 ki31     0K    64K CPU1    1   1:58 100.27% idle{idle: cpu1}
    10 root     155 ki31     0K    64K RUN     0   1:58 100.27% idle{idle: cpu0}
    10 root     155 ki31     0K    64K CPU4    4   1:58 100.27% idle{idle: cpu4}
     0 root     -16    -     0K   160K swapin  5   0:29   0.00% kernel{swapper}
    13 root     -68    -     0K   120K -       0   0:00   0.00% usb{usbus2}
    13 root     -68    -     0K   120K -       5   0:00   0.00% usb{usbus1}
     0 root       8    -     0K   160K -       0   0:00   0.00% kernel{thread ta
     1 root       8    0   732K   372K wait    4   0:00   0.00% init
    11 root     -60    -     0K   168K WAIT    0   0:00   0.00% intr{swi4: clock
    11 root     -88    -     0K   168K WAIT    4   0:00   0.00% intr{irq265: ahc
    13 root     -68    -     0K   120K -       0   0:00   0.00% usb{usbus0}
    12 root      -8    -     0K    24K -       0   0:00   0.00% geom{g_event}
   678 bde        8    0  6344K  6168K wait    6   0:00   0.00% bash

This suffers some truncation, but freefall is much worse: some undisplayed
username is wider, so 5 columns are wasted for USERNAME and only 11 are left.
for COMMAND.

The above shows the common lossage that the first part of ki_tdname is the
same as the whole ki_comm.  ki_tdname needs to be longer to hold this, but
is shorter.  Unintelligent combination duplicates ki_comm and then truncates
info.  The other names are less consistently bad.  "usb" is duplicated, and
"intr" is duplicated as "irq".  Other names mostly are just too long.
ki_tdname is sometimes a description, not a name.  ki_comm doesn't claim to
be a name, but more usually is.  It takes -a to get args for top.  top -a
and ps l suffer from to much truncation of COMMAND to actually show many
args.


> 		} else
> 			str = strdup(k->ki_p->ki_comm);
>
> @@ -172,14 +173,16 @@ ucomm(KINFO *k, VARENT *ve)
> 	char *str;
>
> 	if (STAILQ_NEXT(ve, next_ve) == NULL) {	/* last field, don't pad */
> -		asprintf(&str, "%s%s%s%s",
> +		asprintf(&str, "%s%s%s%s%s",
> 		    k->ki_d.prefix ? k->ki_d.prefix : "",
> 		    k->ki_p->ki_comm,
> 		    (showthreads && k->ki_p->ki_numthreads > 1) ? "/" : "",
> -		    (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_tdname : "");
> +		    (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_tdname : "",
> +		    (showthreads && k->ki_p->ki_numthreads > 1) ? k->ki_p->ki_moretdname : "");

As above.

> 	} else {
> 		if (showthreads && k->ki_p->ki_numthreads > 1)
> -			asprintf(&str, "%s/%s", k->ki_p->ki_comm, k->ki_p->ki_tdname);
> +			asprintf(&str, "%s/%s%s", k->ki_p->ki_comm,
> +			    k->ki_p->ki_tdname, k->ki_p->ki_moretdname);
> 		else
> 			str = strdup(k->ki_p->ki_comm);
> 	}

Here you fixed the formatting style bug, but it might be considered a
style bug to use strdup() instead of asprintf() for the simple case.
style(9) says to always use fprintf() and not fputs()...or whatever,
and we can do the same things for asprintf() to get uniform logic
for the allocation.  The printf() part is just not so uniform when its
arg list depends on (showthreads && k->ki_p->ki_numthreads > 1).

> @@ -192,7 +195,8 @@ tdnam(KINFO *k, VARENT *ve __unused)
> 	char *str;
>
> 	if (showthreads && k->ki_p->ki_numthreads > 1)
> -		str = strdup(k->ki_p->ki_tdname);
> +		asprintf(&str, "%s%s", k->ki_p->ki_tdname,
> +		    k->ki_p->ki_moretdname);
> 	else
> 		str = strdup("      ");

Here always using asprintf() is clearly better because it avoids having to
count the spaces in the literal string.

> Modified: head/lib/libkvm/kvm_proc.c
> ==============================================================================
> --- head/lib/libkvm/kvm_proc.c	Wed Dec  7 14:35:05 2016	(r309675)
> +++ head/lib/libkvm/kvm_proc.c	Wed Dec  7 15:04:22 2016	(r309676)
> @@ -426,8 +426,6 @@ nopgrp:
> 			kp->ki_pri.pri_native = mtd.td_base_pri;
> 			kp->ki_lastcpu = mtd.td_lastcpu;
> 			kp->ki_wchan = mtd.td_wchan;
> -			if (mtd.td_name[0] != 0)
> -				strlcpy(kp->ki_tdname, mtd.td_name, MAXCOMLEN);

This seems to have been dead code.

> 			kp->ki_oncpu = mtd.td_oncpu;
> 			if (mtd.td_name[0] != '\0')
> 				strlcpy(kp->ki_tdname, mtd.td_name, sizeof(kp->ki_tdname));
>

In my version, most string lengths are not #defined and/or not exported in
user.h, so as to force use of sizeof() like here.

> Modified: head/sys/compat/freebsd32/freebsd32.h
> ==============================================================================
> --- head/sys/compat/freebsd32/freebsd32.h	Wed Dec  7 14:35:05 2016	(r309675)
> +++ head/sys/compat/freebsd32/freebsd32.h	Wed Dec  7 15:04:22 2016	(r309676)
> @@ -315,7 +315,8 @@ struct kinfo_proc32 {
> 	char	ki_comm[COMMLEN+1];
> 	char	ki_emul[KI_EMULNAMELEN+1];
> 	char	ki_loginclass[LOGINCLASSLEN+1];
> -	char	ki_sparestrings[50];
> +	char	ki_moretdname[MAXCOMLEN-TDNAMLEN+1];
> +	char	ki_sparestrings[46];

It's inconsistent to use #define'd values to calculate 3+1 for the size
of the new field, then hard code this 3+1 in the adjustment of 50 to 46.
I prefer to hard-code all the array sizes and then use sizeof() to recover
them.  I only do this in usee.h, not here.

> 	int	ki_spareints[KI_NSPARE_INT];

The #define's for the spare counts are more actively harmful.  You needed
to not have one for the char spares to make the above magic adjustment of
4 as easy as possible.

> 	int	ki_oncpu;
> 	int	ki_lastcpu;
>
> Modified: head/sys/kern/kern_proc.c
> ==============================================================================
> --- head/sys/kern/kern_proc.c	Wed Dec  7 14:35:05 2016	(r309675)
> +++ head/sys/kern/kern_proc.c	Wed Dec  7 15:04:22 2016	(r309676)
> @@ -1020,7 +1020,14 @@ fill_kinfo_thread(struct thread *td, str
> 		strlcpy(kp->ki_wmesg, td->td_wmesg, sizeof(kp->ki_wmesg));
> 	else
> 		bzero(kp->ki_wmesg, sizeof(kp->ki_wmesg));
> -	strlcpy(kp->ki_tdname, td->td_name, sizeof(kp->ki_tdname));
> +	if (strlcpy(kp->ki_tdname, td->td_name, sizeof(kp->ki_tdname)) >=
> +	    sizeof(kp->ki_tdname)) {
> +		strlcpy(kp->ki_moretdname,
> +		    td->td_name + sizeof(kp->ki_tdname) - 1,
> +		    sizeof(kp->ki_moretdname));
> +	} else {
> +		bzero(kp->ki_moretdname, sizeof(kp->ki_moretdname));
> +	}

I think this bzero() and the one above for wmesg and a couple of others
are bogus.  There should have been a bzero() of the whole struct to fill
any padding, and that fills all zero fields too.

> Modified: head/sys/sys/user.h
> ==============================================================================
> --- head/sys/sys/user.h	Wed Dec  7 14:35:05 2016	(r309675)
> +++ head/sys/sys/user.h	Wed Dec  7 15:04:22 2016	(r309676)
> @@ -180,12 +180,13 @@ struct kinfo_proc {
> 	char	ki_comm[COMMLEN+1];	/* command name */
> 	char	ki_emul[KI_EMULNAMELEN+1];  /* emulation name */
> 	char	ki_loginclass[LOGINCLASSLEN+1]; /* login class */
> +	char	ki_moretdname[MAXCOMLEN-TDNAMLEN+1];	/* more thread name */
> 	/*
> 	 * When adding new variables, take space for char-strings from the
> 	 * front of ki_sparestrings, and ints from the end of ki_spareints.
> 	 * That way the spare room from both arrays will remain contiguous.
> 	 */
> -	char	ki_sparestrings[50];	/* spare string space */
> +	char	ki_sparestrings[46];	/* spare string space */
> 	int	ki_spareints[KI_NSPARE_INT];	/* spare room for growth */
> 	int	ki_oncpu;		/* Which cpu we are on */
> 	int	ki_lastcpu;		/* Last cpu we were on */
>

The #define's for the sizes are actually sort of needed.  We are (ab)using
the MI values in the compat headers.  The values should be, but aren't
required to be the same, especially for the spares.  They are just what
they have to be to preserve the ABI, and the ABIs have the same number of
spares for historical reasons.  Only strings are sure to have the same ABI
for individual fields.

> Modified: head/usr.bin/procstat/procstat.c
> ==============================================================================
> --- head/usr.bin/procstat/procstat.c	Wed Dec  7 14:35:05 2016	(r309675)
> +++ head/usr.bin/procstat/procstat.c	Wed Dec  7 15:04:22 2016	(r309676)
> @@ -35,6 +35,7 @@
> #include <libprocstat.h>
> #include <stdio.h>
> #include <stdlib.h>
> +#include <string.h>
> #include <sysexits.h>
> #include <unistd.h>
>
> @@ -126,6 +127,21 @@ kinfo_proc_sort(struct kinfo_proc *kipp,
> 	qsort(kipp, count, sizeof(*kipp), kinfo_proc_compare);
> }
>
> +const char *
> +kinfo_proc_thread_name(const struct kinfo_proc *kipp)
> +{
> +	static char name[MAXCOMLEN+1];

Don't use the style bug of no spaces around binary operators when nearby
code sdoesn't already have it.

> +
> +	strlcpy(name, kipp->ki_tdname, sizeof(name));
> +	strlcat(name, kipp->ki_moretdname, sizeof(name));

Should probably use 1 snprintf().

> ...
> Modified: head/usr.bin/procstat/procstat_threads.c
> ==============================================================================
> --- head/usr.bin/procstat/procstat_threads.c	Wed Dec  7 14:35:05 2016	(r309675)
> +++ head/usr.bin/procstat/procstat_threads.c	Wed Dec  7 15:04:22 2016	(r309676)
> @@ -71,11 +71,10 @@ procstat_threads(struct procstat *procst
> 		xo_open_container(threadid);
> 		xo_emit("{dk:process_id/%5d/%d} ", kipp->ki_pid);
> 		xo_emit("{:thread_id/%6d/%d} ", kipp->ki_tid);
> -		xo_emit("{d:command/%-16s/%s} ", strlen(kipp->ki_comm) ?
> +		xo_emit("{d:command/%-19s/%s} ", strlen(kipp->ki_comm) ?
> 		    kipp->ki_comm : "-");
> -		xo_emit("{:thread_name/%-16s/%s} ", (strlen(kipp->ki_tdname) &&
> -		    (strcmp(kipp->ki_comm, kipp->ki_tdname) != 0)) ?
> -		    kipp->ki_tdname : "-");
> +		xo_emit("{:thread_name/%-19s/%s} ",
> +		    kinfo_proc_thread_name(kipp));
> 		if (kipp->ki_oncpu != 255)
> 			xo_emit("{:cpu/%3d/%d} ", kipp->ki_oncpu);
> 		else if (kipp->ki_lastcpu != 255)
>

It's actually more invasive to make applications concatentate the names,
but if you added a new field with the correct length you would have to
change lots of hard-coded 16's for the old wrong size.  Hard-coding these
in printf formats is easiest and almost OK since the ABI prevents the
size changing often.

> Modified: head/usr.bin/top/machine.c
> ==============================================================================
> --- head/usr.bin/top/machine.c	Wed Dec  7 14:35:05 2016	(r309675)
> +++ head/usr.bin/top/machine.c	Wed Dec  7 15:04:22 2016	(r309676)
> @@ -991,8 +991,8 @@ format_next_process(caddr_t handle, char
> 	if (!(flags & FMT_SHOWARGS)) {
> 		if (ps.thread && pp->ki_flag & P_HADTHREADS &&
> 		    pp->ki_tdname[0]) {
> -			snprintf(cmdbuf, cmdlen, "%s{%s}", pp->ki_comm,
> -			    pp->ki_tdname);
> +			snprintf(cmdbuf, cmdlen, "%s{%s%s}", pp->ki_comm,
> +			    pp->ki_tdname, pp->ki_moretdname);
> 		} else {
> 			snprintf(cmdbuf, cmdlen, "%s", pp->ki_comm);
> 		}

procstat has a little more intelligence to produce '-' when the thread
name just duplicates the command name, but according to my top output
it rarely does.  I think it mainly produces '-' by finding a null
thread name for the unthreaded case.  procstat uses strcmp().  Comparing
prefixes could do more.  The above uses the null thread name test and
other things to produce a different format instead of a thread name of
'-'.  It is not so easy to use a general function to prepare the names
for special formatting.

top doesn't have the hard-coded 16's and 19's in format strings.  It would
have just worked (except for bad combination) with a clean rename.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20161208130730.O1089>