Date:      Sun, 29 Nov 2015 12:22:13 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Tim Kientzle <tim@kientzle.com>
Cc:        "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>, Michal Ratajsky <michal.ratajsky@gmail.com>,  Brooks Davis <brooks@freebsd.org>
Subject:   Re: mtree "language" enhancements
Message-ID:  <CANCZdfp%2BtCnXDkbMan9crp9YepVnZKT_hSw%2Bi43OAzZX3VWhXg@mail.gmail.com>
In-Reply-To: <AFF9BC5D-536B-4F7D-83CC-E26D9CBA8BF3@kientzle.com>
References:  <CANCZdfrDtfkwKxMV3o9tcQNzBQDKZdTx1JErkTKtC7UZORT5aA@mail.gmail.com> <AFF9BC5D-536B-4F7D-83CC-E26D9CBA8BF3@kientzle.com>

On Sun, Nov 29, 2015 at 11:59 AM, Tim Kientzle <tim@kientzle.com> wrote:

> Sounds interesting.
>
> Have you talked with Michal (CCed) who is working on a libmtree library?
>

No, I haven't. I've mostly been thinking about the fastest way to get
NanoBSD working in a nopriv (-DNO_ROOT) environment, in a form that
wouldn't be hard to push into a library later.


> The capabilities you're describing here really need to be bundled into a
> library, I think.  In particular, the ability to "unlink", "copy", etc, is
> much more useful if you can directly query the mtree file contents to
> perform conditional changes.  (For example, it may be important to remove
> an empty directory which requires you to be able to query whether a
> directory has files in it.)
>

In the NanoBSD context, these entries would be automatically generated,
so the tree is at hand. There'd be no need for this conditional stuff,
though having it as an additional extension wouldn't be bad.


> I would also be interested in a description of the processing model.  It
> sounds like you're assuming the same model used by the current mtree
> program -- mtree files are processed sequentially line-by-line as they are
> read.
>

The processing model is that the resulting mtree file is read sequentially.
Each new entry either creates a new node in an internal representation, or
modifies a previous node. Once everything has been processed, the internal
representation would be used to do something. In my case, I'd output an
mtree file free of these extensions.
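
A minimal sketch of that model, in Python rather than awk and for
illustration only. It assumes every entry carries a full path (as a metalog
does) and ignores /set, continuation lines, and escaping. The names
parse_line, replay, and emit are made up for this example.

```python
def parse_line(line):
    """Split one entry into (path, {keyword: value})."""
    fields = line.split()
    keywords = {}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        keywords[key] = value
    return fields[0], keywords


def replay(lines):
    """Read entries first to last, building a path -> keywords table."""
    tree = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        path, keywords = parse_line(line)
        # A new path creates a node; a repeated path modifies the old one.
        tree.setdefault(path, {}).update(keywords)
    return tree


def emit(tree):
    """Write the final state back out as plain mtree-style lines."""
    return [
        " ".join([path] + [f"{k}={v}" for k, v in kw.items()])
        for path, kw in tree.items()
    ]
```

Fed two lines for ./etc/ttys, the second carrying only mode=0600, replay
keeps one node whose mode is the later value, and emit writes it as a
single line.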


> For instance, libarchive's mtree processor works differently; it reads the
> entire input, merging redundant lines for the same file, and then processes
> the list.  This is more explicitly declarative, and simplifies things like
> modifying the ownership or permissions of already-listed files.


Yes. My awk script that is the first manifestation of these extensions
is implemented this way. That's why I described it as a journal, but I
didn't explain that, in my nomenclature, a journal is processed
first to last to get the current state.


>
> > Each action entry would have an 'action' keyword.
>
> In terms of the language per se, this seems unnecessary.    I've proposed
> alternate language below that omits the unnecessary "type=action" by just
> adding new keywords.


That would work too. I came up with the type=action thing as a way to avoid
a lot of new keywords, and to segregate the new actions from the old, but
what you propose would also work and might be more general.

> > The keywords I've defined
> > so far are as follows:
> > 1. "unlink" which throws away the previous entry. That entry has been
> > removed. It may apply to files or directories, but it is an error not to
> > remove all entries in a directory when removing the directory.
>
> # When set on an entry, a matching file on disk will be removed.
> # This would also be useful for things like ObsoleteFiles
> unlink=true


OK. That's a little different than what I had in mind. My notion was that
the tree would be modified in place to remove the file, and this entry
would announce that action so the mtree internal representation could
be modified to reflect that. Though I do see value in your approach.
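
The in-memory side of that notion could be sketched as follows, in Python
for illustration. The tree is assumed to be a plain dict mapping full paths
to keyword dicts, and apply_unlink is a hypothetical helper name. The
directory check mirrors the rule from the original proposal that all
entries in a directory must be removed before the directory itself.

```python
def apply_unlink(tree, path):
    """Forget `path`, which the wrapper has already removed on disk."""
    if path not in tree:
        raise KeyError(f"unlink of unknown entry: {path}")
    prefix = path.rstrip("/") + "/"
    leftovers = [p for p in tree if p.startswith(prefix)]
    if leftovers:
        # Proposal: it is an error to remove a directory with live entries.
        raise ValueError(f"{path} still has entries: {leftovers}")
    del tree[path]
```

Under the other reading (unlink=true as an instruction to remove a file on
disk), the entry would instead stay in the tree with the flag set.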


>
> > 2. "move" which relocates a previous entry. An additional targetpath
> > keyword specifies the ultimate destination for this entry.
>
> # When set on an entry, moves the existing file to the new name
> rename=<targetpath>
>
> # Example
> foo/bar type=file owner=root mode=0755 rename=foo/baz


That would work.

>
> > 3. "copy" which duplicates a previous entry. It too takes target path.
>
> # As with rename, except it copies the contents.
> copy_from=<original>
>

Yes.


> # properties that are not specified will be copied as well
> # Create foo/bar by copying foo/baz, preserving all attributes
> foo/bar type=file copy_from=foo/baz
> # Create foo/bar as above, but modify the owner
> foo/bar owner=dialer type=file copy_from=foo/baz


s/owner=/uname=/, but I like that.
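
A rough sketch of how copy_from with overrides might behave on an in-memory
tree, in Python for illustration. It assumes the tree is a dict of full
path to keyword dict; apply_copy_from is a hypothetical name, and the
keyword spelling follows the copy_from proposal quoted above with uname in
place of owner.

```python
def apply_copy_from(tree, path, keywords):
    """Create `path` from the entry named by copy_from, then apply overrides."""
    source = keywords["copy_from"]
    if source not in tree:
        raise KeyError(f"copy_from of unknown entry: {source}")
    node = dict(tree[source])  # unspecified properties are copied as well
    for key, value in keywords.items():
        if key != "copy_from":  # the directive itself is not an attribute
            node[key] = value
    tree[path] = node
```

With foo/baz recorded as type=file uname=root mode=0755, the line
"foo/bar uname=dialer type=file copy_from=foo/baz" would give foo/bar the
same type and mode but uname=dialer, leaving foo/baz untouched.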


> > 4. "meta" which changes the meta data of the previous entry. All keywords
> > on this are merged with the previous entry.
>
> As above, libarchive's mtree processor already does this by default; no
> language change is needed.


OK. If it matches existing practice, I'm cool with the change.


> > The one other thing that my merging tool does is to remove all size
> > keywords. ... [comments about modifying existing files]
>
> One common case here is appending new contents to an existing file.  That
> could similarly be handled with the same pattern:
>
> # Append from source
> foo/bar append_from=<target path>
>

That's a novel idea. My post-processor might have a little trouble with it
if we were trying not to modify the actual target tree. But with modify in
place, we could make it work.
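
For the modify-in-place case, the append step itself is small. A sketch in
Python, for illustration only: root stands for the staging tree, path for
the entry's path within it, and source for the file named by append_from.
The helper name apply_append_from is made up, and recording the change in
the metalog is left out.

```python
from pathlib import Path


def apply_append_from(root, path, source):
    """Append the bytes of `source` to `root`/`path`, in place."""
    target = Path(root) / path
    with open(source, "rb") as src, open(target, "ab") as dst:
        dst.write(src.read())
```

Since the target's size changes, this only works together with dropping
the size keywords from the final metalog.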


> In particular, that removes the need to find the source file to modify it
> in-place.  I've run into various headaches with Crochet when the /usr/obj
> layout changes between releases and Crochet cannot find the new location of
> a file.  This would remove the need to always modify the file in-place.
> (But not all.)
>

It is a useful pattern.

Most of the nanobsd scripts I've seen use >> to append to individual files,
one line at a time.


Warner

> Cheers,
>
> Tim
>
>
>
> > On Nov 29, 2015, at 10:04 AM, Warner Losh <imp@bsdimp.com> wrote:
> >
> > Greetings,
> >
> > As part of making NanoBSD buildable by non-root, I've found a need to
> > have a richer mtree language than we currently have.
> >
> > mtree started out as a language to express hierarchies of files. It does
> > a decent job at that, even if some of the tools that we have in the tree
> > aren't so great about manipulating them. One could easily wish for better
> > tools, but that's not the topic of this thread.
> >
> > So, I've started to move the language into one that can also journal
> > changes to a tree, and have been moving NanoBSD to using wrappers that do
> > the changes to the tree and record the journal events at the end of the
> > metalog produced from buildworld. I have a second tool that reads the
> > metalog, and applies the actions to the earlier entries and then
> > produces a final metalog that's used for makefs. These tools are still
> > evolving, but before I got too close to the point of committing, I
> > thought I'd post a proposed extension to mtree for comments so I don't
> > have to change too much.
> >
> > I'd like a new type called 'action' (so type=action in the records). This
> > type is defined loosely to manipulate an earlier entry (or maybe entries,
> > still unsure) in the file.
> >
> > Each action entry would have an 'action' keyword. The keywords I've
> > defined so far are as follows:
> > 1. "unlink" which throws away the previous entry. That entry has been
> > removed. It may apply to files or directories, but it is an error not to
> > remove all entries in a directory when removing the directory.
> > 2. "move" which relocates a previous entry. An additional targetpath
> > keyword specifies the ultimate destination for this entry.
> > 3. "copy" which duplicates a previous entry. It too takes targetpath.
> > 4. "meta" which changes the meta data of the previous entry. All keywords
> > on this are merged with the previous entry.
> >
> > The one other thing that my merging tool does is to remove all size
> > keywords. In the NanoBSD environment, size is irrelevant. Files are
> > replaced and appended to all the time in the build process, and it
> > doesn't make sense to track the size. makefs fails if the size is
> > different, so post-processing of the tree, say to add a new default to
> > /etc/defaults/rc.conf or to tweak /etc/ttys to turn on/off a tty (or
> > append a new entry) will cause it to fail. It would be nice if mtree
> > could do this, but it simply can't (but see above for whining about
> > better tools being beyond the scope of this).
> >
> > If things go well, we could eventually move these extensions into mtree
> > so that the post-processing stage is no longer necessary. I'm content to
> > maintain the hundred or two lines of awk I've written to implement it. I
> > chose awk because it does the job well enough, though python might do it
> > better. But I don't want to talk about that choice since right now it is
> > purely internal to NanoBSD (though I hope that other build orchestration
> > systems like src/release and crochet look to adopt it).
> >
> > Comments?
> >
> > Warner
> >
>
>


