On Apr 22, 2024, at 3:26 AM, Alexander Leidinger <Alexander@Leidinger.net> wrote:


> Hi,
>
> I have been seeing a higher failure rate in socket/network-related
> stuff for a while. Those failures are transient. Directly executing the
> same thing again may or may not succeed. I'm not able to reproduce this
> at will. Sometimes they just show up.
>
> Examples:
> - poudriere runs with the sccache overlay (like ccache but also works
>   for rust) sometimes fail to create the communication socket and as
>   such the build fails. I have 3 different poudriere bulk runs one after
>   another in my build script, and when the first one fails, the second
>   and third still run. If the first fails due to the sccache issue, the
>   second and third may or may not fail. Sometimes the first fails and
>   the rest are ok. Sometimes all fail, and if I then run one by hand it
>   works (the script does the same as the manual run, the script is
>   simply a
>   "for type in A B C; do poudriere bulk -O sccache -j $type -f ${type}.pkglist; done"
>   which I execute from the same shell, and the script doesn't do
>   env-sanitizing).
> - A webmail interface (inet / local net -> nginx (rev-proxy) -> nginx
>   (webmail service) -> php -> imap) sees intermittent issues. Opening
>   the same email again directly afterwards normally works. I've also
>   seen transient issues with pgp signing (webmail interface -> gnupg /
>   gpg-agent on the server); simply hitting send again after a failure
>   works fine.
>
> Gleb, could this be related to the socket stuff you did 2 weeks ago?
> My world is from 2024-04-17-112537. I have noticed this since at least
> then, but I'm not sure if the failures were there before that and I
> simply didn't notice them. They are surely "new recently"; that amount
> of issues I haven't seen in January. The last two updates of current I
> did before the last one were on 2024-03-31-120210 and
> 2024-04-08-112551.
>
> I could also imagine that some memory related transient failure could
> cause this, but with >3 GB free I do not expect this. Important here
> may be that I have https://reviews.freebsd.org/D40575 in my tree, which
> is memory related, but it's only a metric to quantify memory
> fragmentation.
>
> Any ideas how to track this down more easily than running the entire
> poudriere in ktrace (e.g. a hint/script which dtrace probes to use)?


No answers, I'm afraid, just a "me too."

I have the same problem as you describe when using
ports-mgmt/sccache-overlay when building packages with Poudriere.  In my
case, I'm using FreeBSD 14-STABLE (stable/14-13952fbca).

I actually stopped using ports-mgmt/sccache-overlay because it got to
the point where it didn't work more often than it did.  Then, a few
months ago, I decided to start using it again on a whim and it worked
reliably for me.  Then, starting a few weeks ago, it has reverted to the
behaviour you describe above.  It is not as bad right now as it got when
I quit using it.  Now, sometimes it will fail, but it will succeed when
re-running a "poudriere bulk" run.
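
Since a re-run often succeeds, one stopgap is to wrap each bulk run in a
small retry helper. This is only a workaround sketch in POSIX sh, not a
fix for the underlying socket failure; the retry function name and the
attempt count are made up for illustration and are not part of Poudriere
or sccache:

```shell
#!/bin/sh
# Hypothetical helper (not part of Poudriere): run a command up to N
# times and stop at the first success.
# Usage: retry N command [args...]
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        # Run the command exactly as given; return as soon as it succeeds.
        "$@" && return 0
        echo "attempt $attempt of $max failed: $*" >&2
        attempt=$((attempt + 1))
    done
    # Every attempt failed.
    return 1
}
```

With that in place, something like
"retry 3 poudriere bulk -O sccache -j A -f A.pkglist" would try the
build for jail A up to three times before giving up. It hides the
transient failure rather than explaining it, so it is no substitute for
tracking down the cause.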

I'd love it to go back to when it was working 100% of the time.

Cheers,

Paul.



