Date:      Sat, 09 Feb 2019 20:37:38 -0800
From:      Cy Schubert <Cy.Schubert@cschubert.com>
To:        "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>
Cc:        Cy Schubert <Cy.Schubert@cschubert.com>, cem@freebsd.org, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: nosh init system
Message-ID:  <201902100437.x1A4bcGK042486@slippy.cwsent.com>
In-Reply-To: Message from "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net> of "Sat, 09 Feb 2019 20:20:28 -0800." <201902100420.x1A4KSxA064573@pdx.rh.CN85.dnsmgr.net>

In message <201902100420.x1A4KSxA064573@pdx.rh.CN85.dnsmgr.net>,
"Rodney W. Grimes" writes:
> > In message <CAG6CVpWXOA6r_aJcefxQBu2QZxprf1ZpDoTb4eb2JSwWsE2m+g@mail.gmail.com>,
> > Conrad Meyer writes:
> > > Hi Cy,
> > >
> > > On Sat, Feb 9, 2019 at 3:35 PM Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> > > > I don't see what's so "incredibly fragile" about rc(8). That's not to
> > > > say there aren't better solutions, like SMF.
> > >
> > > Maybe "incredibly" as a choice of adjective is inappropriate.  I think
> > > we (you, me, and ngie@) can all agree it is somewhat fragile, and
> > > there are things SMF/systemd/nosh get right that rc(8) does not
> > > (today).  Anyway, your next paragraph goes on to be a good start at
> > > describing some of rc's fragility.  :-)
> > >
> > > > Where rc(8) falls down is any port or a customer's (user of FreeBSD) rc
> > > > script could fail, hosing the boot or, worse, hosing the system*. Where a
> > > > solution like SMF solves the problem is that should a service which
> > > > other services depend on fail, only that branch of the startup tree
> > > > would fail.
> > >
> > > Right; that's a great example.
> > >
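
To make the "only that branch fails" behaviour concrete, here is a
rough sketch in Python, purely for illustration. It is not how SMF is
implemented; the service names and the start_services() helper are
hypothetical. Services are started in dependency order, and a service
is skipped only if something it requires did not come up.

```python
from collections import deque


def start_services(deps, failing):
    """Start services in dependency order, isolating failures.

    deps maps each service to the list of services it requires; every
    required service must itself be a key of deps. failing is the set
    of services whose own start fails.
    Returns (started, failed, skipped) as lists in processing order.
    """
    # Build reverse edges and count unmet requirements (Kahn's algorithm).
    pending = {svc: len(reqs) for svc, reqs in deps.items()}
    dependents = {svc: [] for svc in deps}
    for svc, reqs in deps.items():
        for req in reqs:
            dependents[req].append(svc)

    ready = deque(sorted(svc for svc, n in pending.items() if n == 0))
    started, failed, skipped = [], [], []
    up = {}
    while ready:
        svc = ready.popleft()
        if not all(up[req] for req in deps[svc]):
            skipped.append(svc)      # a requirement is down: prune branch
            up[svc] = False
        elif svc in failing:
            failed.append(svc)       # this service itself failed
            up[svc] = False
        else:
            started.append(svc)
            up[svc] = True
        for dep in dependents[svc]:
            pending[dep] -= 1
            if pending[dep] == 0:
                ready.append(dep)
    return started, failed, skipped


# Hypothetical boot: the database fails, sshd is on another branch.
deps = {
    "netif": [],
    "sshd": ["netif"],
    "database": ["netif"],
    "webapp": ["database"],
}
started, failed, skipped = start_services(deps, {"database"})
print("started:", started)
print("failed: ", failed)
print("skipped:", skipped)
```

In this toy boot sshd still comes up even though the database branch
is lost, which is the property rc(8) lacks today.
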
> > > > In that scenario, if a service fails but sshd starts, a
> > > > sysadmin would still be able to login remotely to resolve the problem.
> > > > So in this regard rc(8) is at a disadvantage.
> > > >
> > > > We could address the above paragraph by starting sshd earlier during
> > > > boot thereby allowing the opportunity to fix remotely.
> > >
> > > I don't think that is really sufficient without substantially
> > > modifying init+rc to be closer to something like systemd or SMF,
> > > anyway.  And then we'd rather just have something like SMF :-).
> > 
> > I'd rather see SMF but a number felt a CDDL licensed init was 
> > unacceptable -- except for the fact that SMF doesn't replace init.
> > 
> > >
> > > As soon as *any* rc service fails to start (signal, non-zero exit,
> > > stop_boot), rc(8) exits non-zero, causing init(8) to go to single
> > > user.  All service state is thrown away with rc(8) exit, but any rc.d
> > > "services" that managed to start before boot failed are not
> > > terminated.  Even if an admin manages to log in and fix the
> > > configuration, re-starting rc(8) restarts the runcom process from
> > > scratch, as if nothing had already been done, without first stopping
> > > anything that was already running.  The only safe, reproducible way to
> > > re-start rc(8) is to fully reboot the system.
>
> It -should- be safe to restart rc, as rc scripts should check to
> see if the item they are being requested to start is already running.
> rc scripts that fail to have this check are defective and should be
> fixed.  You should be able to invoke /etc/rc.d/foo start as many
> times as you want in a row and only get 1 instance of foo, with the
> other starts returning "foo already running".  Same with stop.
>
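
For reference, the check described above boils down to something like
the following. This is a minimal sketch in Python rather than sh, and
it is not the actual rc.subr code; is_running() and start() are
hypothetical helpers, and it assumes a POSIX system where signal 0
only probes for the process. It also shows why the check is only as
good as the pid file: once the file is wiped, a second start goes
through.

```python
import os
import tempfile


def is_running(pidfile):
    """Return True if pidfile names a process that still exists."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return False                 # no usable pid file: assume stopped
    try:
        os.kill(pid, 0)              # signal 0: existence check, no signal sent
    except ProcessLookupError:
        return False                 # stale pid file
    except PermissionError:
        return True                  # exists, but owned by another user
    return True


def start(name, pidfile, launch):
    """Idempotent start: launch() runs only if the service is not up."""
    if is_running(pidfile):
        return f"{name} already running"
    pid = launch()                   # launch() returns the new pid
    with open(pidfile, "w") as f:
        f.write(str(pid))
    return f"Starting {name}"


with tempfile.TemporaryDirectory() as run_dir:
    pidfile = os.path.join(run_dir, "foo.pid")
    launch = os.getpid               # stand-in for a daemon: our own pid
    first = start("foo", pidfile, launch)
    second = start("foo", pidfile, launch)
    os.remove(pidfile)               # simulate the pid file being wiped
    third = start("foo", pidfile, launch)

print(first)
print(second)
print(third)
```

The third call starts foo again although it never stopped, because
the only record that it was running is gone.
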
> > 
> > It wasn't that way 10-15 years ago. It's evolved to become this. Even 
> > if we stay with rc(8), quickly cobbling together a patch isn't going to 
> > fix it long term. Whether we use another init, an add-on like SMF, or 
> > make rc(8) more robust, it will not be fixed by a simple tweak here or 
> > there.
>
> Much gets broken in the name of new features sadly.

That was my point. Tunnel vision.

>
> > 
> > >
> > > E.g., the major pain point we run into repeatedly with restarted boot
> > > is that cleanvar / cleartmp run again.  This breaks ld-elf.so.hints
> > > cache (anything linking /usr/local libraries -- hope your admin is
> > > running base sshd and not openssh-portable!) as well as wiping out
> > > /var/run pid files (breaking "already running?" rc pid checks).  As a
> > > result, services get double-started.
> > >
> > > Cleanvar could maybe be improved to avoid this problem -- e.g., we
> > > could coordinate with the kernel to set a per-boot, persisted flag
> > > that cleanvar has completed, even if rc(8) exits -- but the broad class
> > > of problems would remain (rc.d autostart is stateful, but any partial
> > > failure destroys all state).
> > 
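
One way the per-boot flag idea could look, as a sketch only: instead
of a flag held by the kernel, this compares a boot identifier against
a marker file and skips the cleanup when they match. The
clean_once_per_boot() helper, the marker path, and the boot_id strings
are all hypothetical; on FreeBSD the identifier might be derived from
something like kern.boottime, but that is an assumption, not a tested
design.

```python
import os
import tempfile


def clean_once_per_boot(run_dir, marker, boot_id):
    """Remove files in run_dir unless already done for this boot.

    Returns True if the cleanup ran, False if it was skipped.
    """
    try:
        with open(marker) as f:
            if f.read() == boot_id:
                return False         # already cleaned during this boot
    except FileNotFoundError:
        pass                         # first run ever: fall through and clean
    for name in os.listdir(run_dir):
        path = os.path.join(run_dir, name)
        if os.path.isfile(path):
            os.remove(path)
    with open(marker, "w") as f:     # record completion for this boot
        f.write(boot_id)
    return True


with tempfile.TemporaryDirectory() as top:
    run_dir = os.path.join(top, "run")
    os.mkdir(run_dir)
    marker = os.path.join(top, "cleanvar.done")   # kept outside run_dir
    pidfile = os.path.join(run_dir, "foo.pid")

    def write_pid():
        with open(pidfile, "w") as f:
            f.write("123")

    write_pid()                      # stale pid file from a previous boot
    first = clean_once_per_boot(run_dir, marker, "boot-1")
    write_pid()                      # foo starts and records its pid
    second = clean_once_per_boot(run_dir, marker, "boot-1")  # rc restarted
    pid_survived = os.path.exists(pidfile)
    third = clean_once_per_boot(run_dir, marker, "boot-2")   # after a reboot
    pid_after_reboot = os.path.exists(pidfile)

print(first, second, pid_survived, third, pid_after_reboot)
```

This only addresses the cleanvar symptom; the lost service state is
untouched.
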
> > This needs more than improving cleanvar or some other script. It's like 
> > whack-a-mole. (The rest of this is not specifically directed at you, 
> > Conrad.) This is why every one to two months this topic comes up again 
> > and again and again. It's a pain point. (And also the shiny new object 
> > syndrome.) Various people suggest their favourite init(8) replacement 
> > and the bikeshed starts up again.
>
> Shiny new things also come with shiny new problems.  I would actually
> often rather repair a broken old something than get a new shiny
> something, as I know the defects of the ratty old something.

Agreed. Like building on a foundation of sand.

>
> > To avoid bikeshedding this to death again, we enumerated two issues so 
> > far. Let's continue to list issues. I also think that this should be a 
> > BSDCan devsummit whiteboard topic where we list issues in one column 
> > and next to it we list possible solutions, after listing all the issues 
> > first. And finally if this is too large for one person to work on, 
> > assign the various issues to willing developers.
>
> We do not need to wait for BSDCan, there are more of us here on this
> list than at any dev summit.

True but whiteboards help. My point is let's itemize a list of issues 
first. Write down (figuratively speaking) all ideas to address the 
listed issues. Select the best ideas. Implement them.

>
> > 
> > One final thought. init(8) and rc(8) requirements for desktop/laptop, 
> > server, embedded, and mobile are probably different enough that their 
> > requirements may compete with each other. Some embedded applications 
> > may desire a simple rc(8) whereas server or desktop a heavier weight 
> > solution.
>
> It is rather simple to just drop the whole rc.d and rewrite
> /etc/rc for the embedded situation, going back to the 4.3 era.
>
> Though we might want to go over to the rc mailing list?

Excellent idea!


-- 
Cheers,
Cy Schubert <Cy.Schubert@cschubert.com>
FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  http://www.FreeBSD.org

	The need of the many outweighs the greed of the few.




