On Apr 22, 2024, at 3:26 AM, Alexander Leidinger <Alexander@Leidinger.net> wrote:

> Hi,
>
> I see a higher failure rate of socket/network related stuff since a while. Those failures are transient. Directly executing the same thing again may or may not result in success/failure. I'm not able to reproduce this at will. Sometimes they show up.
>
> Examples:
> - poudriere runs with the sccache overlay (like ccache but also works for rust) sometimes fail to create the communication socket and as such the build fails. I have 3 different poudriere bulk runs after each other in my build script, and when the first one fails, the second and third still run. If the first fails due to the sccache issue, the second and 3rd may or may not fail. Sometimes the first fails and the rest is ok. Sometimes all fail, and if I then run one by hand it works (the script does the same as the manual run, the script is simply a "for type in A B C; do poudriere bulk -O sccache -j $type -f ${type}.pkglist; done" which I execute from the same shell, and the script doesn't do env-sanitizing).
> - A webmail interface (inet / local net -> nginx (rev-proxy) -> nginx (webmail service) -> php -> imap) sees intermittent issues sometimes. Opening the same email directly again afterwards normally works. I've also seen transient issues with pgp signing (webmail interface -> gnupg / gpg-agent on the server), simply hitting send again after a failure works fine.
>
> Gleb, could this be related to the socket stuff you did 2 weeks ago? My world is from 2024-04-17-112537. I do notice this since at least then, but I'm not sure if they were there before that and I simply didn't notice them. They are surely "new recently", that amount of issues I haven't seen in January. The last two updates of current I did before the last one were on 2024-03-31-120210 and 2024-04-08-112551.
>
> I could also imagine that some memory related transient failure could cause this, but with >3 GB free I do not expect this. Important here may be that I have https://reviews.freebsd.org/D40575 in my tree, which is memory related, but it's only a metric to quantify memory fragmentation.
>
> Any ideas how to track this down more easily than running the entire poudriere in ktrace (e.g. a hint/script which dtrace probes to use)?

No answers, I'm afraid, just a "me too."

I have the same problem as you describe when using ports-mgmt/sccache-overlay when building packages with Poudriere. In my case, I'm using FreeBSD 14-STABLE (stable/14-13952fbca).

I actually stopped using ports-mgmt/sccache-overlay because it got to the point where it didn't work more often than it did. Then, a few months ago, I decided to start using it again on a whim and it worked reliably for me. Then, starting a few weeks ago, it has reverted to the behaviour you describe above. It is not as bad right now as it got when I quit using it. Now, sometimes it will fail, but it will succeed when re-running a "poudriere bulk" run.

I'd love it to go back to when it was working 100% of the time.

Cheers,

Paul.
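
Since the failures are transient, it may help to record which of the consecutive bulk runs failed on each invocation. The following is a minimal sketch of the quoted build loop with per-jail status reporting, assuming a POSIX sh. The function name `run_bulk_all` and the `POUDRIERE_CMD` variable are hypothetical and not part of poudriere; the poudriere arguments are the ones from the quoted command.

```shell
#!/bin/sh
# Sketch only: the quoted "for type in A B C" loop, with the result of
# each run recorded. POUDRIERE_CMD is a hypothetical override so the
# loop can be dry-run with a stub; it defaults to the real poudriere.
: "${POUDRIERE_CMD:=poudriere}"

# run_bulk_all JAIL...
# Runs one bulk build per jail and keeps going after a failure, as the
# original script does. Prints one status line per jail and returns the
# number of failed runs as its exit status.
run_bulk_all() {
    failures=0
    for type in "$@"; do
        if $POUDRIERE_CMD bulk -O sccache -j "$type" -f "${type}.pkglist"; then
            echo "$type ok"
        else
            # In the else branch, $? still holds the status of the failed command.
            echo "$type failed (exit $?)"
            failures=$((failures + 1))
        fi
    done
    return "$failures"
}
```

Calling `run_bulk_all A B C` and appending its output to a log together with a timestamp would show, over several days, whether failures cluster on the first run or are spread evenly across the three.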