From owner-freebsd-ports@freebsd.org Thu Apr 11 15:32:52 2019 Return-Path: Delivered-To: freebsd-ports@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E25F2158747E for ; Thu, 11 Apr 2019 15:32:51 +0000 (UTC) (envelope-from crest@rlwinm.de) Received: from mail.rlwinm.de (mail.rlwinm.de [138.201.35.217]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A0FC26BF85 for ; Thu, 11 Apr 2019 15:32:50 +0000 (UTC) (envelope-from crest@rlwinm.de) Received: from crest.bultmann.eu (unknown [IPv6:2a00:c380:c0d5:1:a82c:c361:a35a:85ff]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.rlwinm.de (Postfix) with ESMTPSA id 21DC51C72; Thu, 11 Apr 2019 15:32:43 +0000 (UTC) Subject: Re: python 3 subprocess performance To: Konstantin Schukraft Cc: freebsd-ports@freebsd.org References: <20190411161649.1b740d21@vm2.home.zagrebin.ru> <8f3f8413-60f2-bb03-a6b4-4f6364cdc3df@rlwinm.de> <20190411143926.5rg4jskmodt4shhi@laparbeit> From: Jan Bramkamp Message-ID: <9729db47-12c4-caf4-cdcf-1913dab73c8e@rlwinm.de> Date: Thu, 11 Apr 2019 17:32:42 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: <20190411143926.5rg4jskmodt4shhi@laparbeit> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: A0FC26BF85 X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of crest@rlwinm.de designates 138.201.35.217 as permitted sender) smtp.mailfrom=crest@rlwinm.de X-Spamd-Result: default: False [-2.21 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-0.98)[-0.983,0]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+mx]; NEURAL_HAM_LONG(-1.00)[-0.998,0]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[rlwinm.de]; NEURAL_SPAM_SHORT(0.11)[0.113,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MX_GOOD(-0.01)[mail.rlwinm.de]; RCPT_COUNT_TWO(0.00)[2]; IP_SCORE(-0.03)[ipnet: 138.201.0.0/16(1.88), asn: 24940(-2.03), country: DE(-0.01)]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:24940, ipnet:138.201.0.0/16, country:DE]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-ports@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting software to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2019 15:32:52 -0000 The reason is that that python does something stupid (tm). It tries to close all file descriptors (except a few whitelisted ones) up to the maximum file descriptor number. It does this by asking the kernel for the maximum possible number and closing everything it doesn't want to keep. Some time later someone came up with an optimization (read the open file descriptors from /dev/fd). All of this pain and suffering is caused by good old Ulrich Drepper braindamage: https://sourceware.org/bugzilla/show_bug.cgi?id=10353. Most Linux distros have lower default file descriptor limits than FreeBSD making this workaround less painful. The correct solution would be to teach python3 about closefrom(2). On 11.04.19 16:39, Konstantin Schukraft wrote: >> Please run python under truss -f. Does it try to close(2) all possible >> file descriptors? Does the runtime decrease if fdescfs is mounted at >> /dev/fd? > truss -f -p 17407 > > 17407: read(5,0x1265d2e9120,50000)               = 0 (0x0) > 17407: close(5)                                  = 0 (0x0) > 17407: lseek(3,0x0,SEEK_CUR)                     ERR#29 'Illegal seek' > 17407: fstat(3,{ mode=p--------- ,inode=1610971,size=0,blksize=4096 }) = > 0 (0x0) > 17407: read(3,"FreeBSD midgard 12.0-STABLE-HBSD"...,8192) = 88 (0x58) > 17407: read(3,0x1265ceda078,8104)                = 0 (0x0) > 17407: close(3)                                  = 0 (0x0) > 17407: wait4(1948,{ EXITED,val=0 },0x0,0x0)      = 1948 (0x79c) > 17407: write(1,"63\n",3)                         = 3 (0x3) > 17407: pipe2(0x63857dad22e8,O_CLOEXEC)           = 0 (0x0) > 17407: fstat(3,{ mode=p--------- ,inode=1610975,size=0,blksize=4096 }) = > 0 (0x0) > 17407: ioctl(3,TIOCGETA,0x63857dad21a0)          ERR#25 'Inappropriate > ioctl for device' > 17407: lseek(3,0x0,SEEK_CUR)                     ERR#29 'Illegal seek' > 17407: pipe2(0x63857dad21e8,O_CLOEXEC)           = 0 (0x0) > 17407: sigprocmask(SIG_BLOCK,{ > SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF| > > SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0) > 97164: > 17407: fork()                                    = 97164 (0x17b8c) > 97164: thr_self(0x1265c7e9000)                   = 0 (0x0) > 17407: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0) > 97164: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0) > 97164: close(3)                                  = 0 (0x0) > 97164: close(5)                                  = 0 (0x0) > 17407: close(6)                                  = 0 (0x0) > 97164: dup2(4,1)                                 = 1 (0x1) > 17407: close(4)                                  = 0 (0x0) > 97164: dup2(4,2)                                 = 2 (0x2) > 97164: sigprocmask(SIG_SETMASK,{ > SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPRO > > F|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0) > 97164: sigaction(SIGPIPE,{ SIG_DFL 0x0 ss_t },{ SIG_IGN 0x0 ss_t }) = 0 > (0x0) > 97164: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0) > 97164: sigprocmask(SIG_SETMASK,{ > SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPRO > > F|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0) > 97164: sigaction(SIGXFSZ,{ SIG_DFL 0x0 ss_t },{ SIG_IGN 0x0 ss_t }) = 0 > (0x0) > 97164: sigprocmask(SIG_SETMASK,{ },0x0)          = 0 (0x0) > 97164: fstatat(AT_FDCWD,"/dev",{ mode=dr-xr-xr-x > ,inode=2,size=512,blksize=4096 },0x0) = 0 (0x0) > 97164: fstatat(AT_FDCWD,"/dev/fd",{ mode=dr-xr-xr-x > ,inode=135,size=512,blksize=4096 },0x0) = 0 (0x0) > 97164: getrlimit(RLIMIT_NOFILE,{ cur=940563,max=940563 }) = 0 (0x0) > 97164: close(3)                                  ERR#9 'Bad file > descriptor' > 97164: close(4)                                  = 0 (0x0) > 97164: close(5)                                  ERR#9 'Bad file > descriptor' > 97164: close(7)                                  ERR#9 'Bad file > descriptor' > 97164: close(8)                                  ERR#9 'Bad file > descriptor' > 97164: close(9)                                  ERR#9 'Bad file > descriptor' > > The last error repeats several 10000 times. > > Mounting fdescfs does indeed speed python3 up to python2 levels. > >> On 11.04.19 15:16, Alexander Zagrebin wrote: >>> Hi! >>> >>> I've noticed the subprocess performance issue with python 3. >>> For example, this simple script takes just 0,15 second to complete >>> with python 2.7, but more than 5 sec with python 3.6: >>> >>> import subprocess >>> >>> for i in range(100): >>>     p = subprocess.Popen(['uname', '-a'], stdout=subprocess.PIPE, >>> stderr=subprocess.STDOUT) >>>     (stdoutdata, stderrdata) = p.communicate() >>> >>> Profiling with the cProfile shows, that this excessive 5 seconds was >>> wasted in the "{built-in method posix.read}" >>> >>> Could anybody confirm this issue? >>> >> _______________________________________________ >> freebsd-ports@freebsd.org mailing list >> https://lists.freebsd.org/mailman/listinfo/freebsd-ports >> To unsubscribe, send any mail to "freebsd-ports-unsubscribe@freebsd.org"