Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 20 Feb 2013 09:04:14 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        Mikolaj Golub <trociny@freebsd.org>
Cc:        Stanislav Sedov <stas@freebsd.org>, Kostik Belousov <kib@freebsd.org>, "Robert N. M. Watson" <rwatson@freebsd.org>, Attilio Rao <attilio@freebsd.org>, freebsd-hackers@freebsd.org
Subject:   Re: libprocstat(3): retrieve process command line args and environment
Message-ID:  <201302200904.15324.jhb@freebsd.org>
In-Reply-To: <20130212215054.GA9839@gmail.com>
References:  <20130119151253.GB88025@gmail.com> <201301251531.43540.jhb@freebsd.org> <20130212215054.GA9839@gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tuesday, February 12, 2013 4:50:54 pm Mikolaj Golub wrote:
> On Fri, Jan 25, 2013 at 03:31:43PM -0500, John Baldwin wrote:
> 
> > BTW, one off-ball thought I have is that I would like to have a mode where 
> > libprocstat operates on a core file (of a process, not a kernel crash dump), 
> > so it could list the threads from a core dump, and possibly file descriptor 
> > info (if PR kern/173723 is implemented).
> > 
> > We certainly could have a 'raw' mode where it spat out name: value or XML
> > of the entire kinfo_proc perhaps.
> > 
> 
> It looks very interesting! Do you mean something like this?

Yes, exactly like this!  Very nice indeed.

> root@lisa:/root # sh -c 'kill -5 $$'
> Trace/BPT trap (core dumped)
> root@lisa:/root # ls -l sh.core 
> -rw-------  1 root  wheel  8790016 Feb 12 21:17 sh.core
> root@lisa:/root # procstat sh.core 
>   PID  PPID  PGID   SID  TSID THR LOGIN    WCHAN     EMUL          COMM        
>   674   657   674   657   657   1 root     -         FreeBSD ELF32 sh          
> root@lisa:/root # procstat -f sh.core 
>   PID COMM               FD T V FLAGS     REF  OFFSET PRO NAME        
>   674 sh               text v r r--------   -       - -   /bin/sh           
>   674 sh               ctty v c rw-------   -       - -   /dev/pts/0        
>   674 sh                cwd v d r--------   -       - -   /root             
>   674 sh               root v d r--------   -       - -   /                 
>   674 sh                  0 v c rw-------   8    2537 -   /dev/pts/0        
>   674 sh                  1 v c rw-------   8    2537 -   /dev/pts/0        
>   674 sh                  2 v c rw-------   8    2537 -   /dev/pts/0        
> root@lisa:/root # procstat -v sh.core
>   PID      START        END PRT  RES PRES REF SHD   FL TP PATH
>   674  0x8048000  0x8064000 r-x   28    0   1   0 CN-- vn /bin/sh
>   674  0x8064000  0x8066000 rw-    2    0   1   0 ---- df 
>   674 0x28064000 0x2807a000 r-x   22    0  17   0 CN-- vn /libexec/ld-elf.so.1
>   674 0x2807a000 0x28083000 rw-    9    0   1   0 ---- df 
>   674 0x28084000 0x280a3000 r-x   31   32   2   1 CN-- vn /lib/libedit.so.7
>   674 0x280a3000 0x280a5000 rw-    2    0   1   0 C--- vn /lib/libedit.so.7
>   674 0x280a5000 0x280a7000 rw-    0    0   0   0 ---- -- 
>   674 0x280a7000 0x280e0000 r-x   57    0   4   2 CN-- vn /lib/libncurses.so.8
>   674 0x280e0000 0x280e3000 rw-    3    0   1   0 C--- vn /lib/libncurses.so.8
>   674 0x280e3000 0x28213000 r-x  304    0  34  17 CN-- vn /lib/libc.so.7
>   674 0x28213000 0x2821a000 rw-    7    0   1   0 C--- vn /lib/libc.so.7
>   674 0x2821a000 0x28243000 rw-   16    0   2   0 ---- df 
>   674 0x28400000 0x28c00000 rw-   24    0   2   0 ---- df 
>   674 0xbfbdf000 0xbfbff000 rwx    3    0   1   0 ---D df 
>   674 0xbfbff000 0xbfc00000 r-x    0    0  20   0 CN-- ph 
> 
> Here is my attempt to implement it:
> 
> http://people.freebsd.org/~trociny/procstat_core.1.patch
> 
> The patch needs much work yet, especially the userland part, but looks
> like it is good enough for demonstration purposes to discuss how this
> should be done properly.
> 
> So, procstat data is stored in a core as note sections with name
> "FreeBSD" and types NT_PROCSTAT_PROC, NT_PROCSTAT_FILES, ...
> 
> The current format of notes is a header of sizeof(int) and data in the
> format as it is returned by a related sysctl call. I think the
> header should provide some versioning and for the cases I implemented
> (kinfo_proc, kinfo_file, kinfo_vmentry) it contains a value of the
> corresponding kinfo struct size (e.g. KINFO_VMENTRY_SIZE). It might be
> not the best solution and I would be glad for suggestions.

I think including the size is good and is probably sufficient for
versioning.

> (BTW, why don't we have constants like KINFO_VMENTRY_SIZE defined for
> all archs?)

I cannot speak to that.

> To avoid code duplication I changed the code of kinfo sysctl handlers
> to output kinfo data to sbuf instead of calling SYSCTL_OUT directly so
> these functions could be used by both sysctl handlers and the coredump
> routine.

Ok.
 
> Another thing I am not sure about is writing procstat data in the
> coredump routine. The coredump routine on the first pass calculates
> core sizes and on the second pass does actual writing. I added
> procstat in that way that procstat data is collected on the first run
> to internal buffers and on the second pass is copied from the buffers
> to the core. I could do this another way, e.g running kern_proc_*out()
> twice, on the fist pass with tiny buffer and a drain routine that
> would calculate data length, but this looks less efficient,
> complicates things and currently I am not sure that procstat sizes are
> not changeable between the passes. The issue with my approach I see is
> that after the first pass, before actually doing allocations we check
> corefile limits. I do allocations before checking the limit, so
> potentially using some kernel space when it might not be allowed by
> limits. Is this a problem I should worry and try another approach?

The process should be stopped by the time we dump a core, so running it
multiple times should be ok in that the sizes should not change.  I would
say that you should try to implement a "determine sizes" pass that doesn't
allocate anything, but others should comment on that.

> I do nothing yet for freebsd32 compat case. I suppose I should check
> if the dumping process is running 32 binary and store procstat data in
> freebsd32 format?

Hmmmmm, that would allow you to use a 32-bit procstat to examine it, but
then you'd also need to extend libprocstat to support 32-bit structures
so that the native 64-bit procstat can still examine the structures.  Also,
this information is nice to have for Linux binaries as well.  I'm not sure
it is worth doing this since usually you want to examine a core dump on the
machine that generated the core (that has all the same libraries, etc.).

> As I said there is much to do in userland, and I would like to
> concentrate on the kernel part and the format first, but I would be
> glad if somebody helps me with the problem I faced linking libelf to
> libprocstat. With the modifications to the libprocstat makefile:
> 
> -DPADD=		${LIBKVM} ${LIBUTIL}
> -LDADD=		-lkvm -lutil
> +DPADD=		${LIBELF} ${LIBKVM} ${LIBUTIL}
> +LDADD=		-lelf -lkvm -lutil
> 
> buildworld fails with the error:
> 
> make: don't know how to make /scratch2/tmp/trociny/obj/i386.i386/home/trociny/svn/base/head/tmp/usr/lib/libelf.a. Stop

This I have no idea on.

One other thing to consider is if gcore needs to be updated to output these
records as well.  It could use the sysctls to fetch the data and then write
out appropriate notes I think, so perhaps it wouldn't be too difficult to add
this as a followup commit once the kernel version has settled and the file
format is set?

-- 
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201302200904.15324.jhb>