Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 22 Aug 2004 14:29:15 -0500
From:      "Jacques A. Vidrine" <nectar@FreeBSD.org>
To:        Dan Langille <dan@langille.org>
Cc:        freebsd-vuxml@freebsd.org
Subject:   Re: database tables for VuXML
Message-ID:  <20040822192915.GA17478@madman.celabo.org>
In-Reply-To: <41279E59.13631.76AD2AB9@localhost>
References:  <41279E59.13631.76AD2AB9@localhost>

next in thread | previous in thread | raw e-mail | index | archive | help
[Added freebsd-vuxml@]

On Sat, Aug 21, 2004 at 07:11:21PM -0400, Dan Langille wrote:
> Hi folks,
>
> I'm in the early stage of incorporating VuXML information into
> FreshPorts.

Alright!  That'll be cool.

> I have a preliminary database table design which appears
> below.  If you think I've missed any essential information out of
> this design, please let me know.
>
> First: has anyone already worked on creating database tables for the
> VuXML data?

I have an old prototype application which uses SQLite.  FWIW, here is
the schema I used then. (the formatting is a little funny because this
was embedded in some other code)

        CREATE TABLE vuxml (
              vid          VARCHAR PRIMARY KEY UNIQUE
           ,  topic        VARCHAR
           ,  description  VARCHAR
           ,  discovery    DATE
           ,  entry        DATE
           ,  modified     DATE
	   -- computed max(entry, modified) for convenience in
	   -- other expressions
           ,  last         DATE
           );

        CREATE TABLE refs (
             vid           VARCHAR
           , type          VARCHAR
           , text          VARCHAR
           );

        CREATE TABLE affected (
              vid          VARCHAR
           ,  type         VARCHAR
           );

        CREATE TABLE names (
              affected     INTEGER
           ,  name         VARCHAR
           );

        CREATE TABLE ranges (
              affected     INTEGER
           ,  lowop        VARCHAR
           ,  low          VARCHAR
           ,  glow0        INTEGER
           ,  glow1        INTEGER
           ,  glow2        INTEGER
           ,  glow3        INTEGER
           ,  highop       VARCHAR
           ,  high         VARCHAR
           ,  ghigh0       INTEGER
           ,  ghigh1       INTEGER
           ,  ghigh2       INTEGER
           ,  ghigh3       INTEGER
           );

The `ranges' table perhaps looks erm interesting.  In this case,
I encoded each package version number into four 32-bit integers.
This allowed one to lookup an affected package completely in SQL.
However, the encoding I used may now be incorrect in some ways, since
pkg_install et al were recently changed to treat version numbers
differently.  Also, SQLite could not really optimize the complex SQL
statement that resulted.  So I think it is interesting for further
study, but I doubt I'd do it that way again unless I had a very
particular need to use pure SQL or the database was huge.

> create table vuxml
> (
>     id              serial                not null,
> -- internal FreshPorts ID
>     vid             text                  not null,
> -- the vuln vid. e.g. c4b025bb-f05d-11d8-9837-000c41e2cdad
>     topic           text                  not null,
>     description     text                  not null,
>     date_discovery  date                          ,
>     date_entry      date                          ,
>     date_modified   date                          ,
>     status          date                  not null,
>     primary key (id)
> );
> 
> create table vuxml_port_xref
> (
>     id                    serial                not null,
>     vuxml_id              integer               not null,
>     port_id               integer               not null,
>     primary key (id)
> );
> 
> create table vuxml_port_xref_range
> (
>     id                    serial                not null,
>     vuxml_port_xref_id    text                          ,
>     range_operator_start  text                          ,
>     range_operator_end    text                          ,
>     range_version_start   text                          ,
>     range_version_end     text                          ,
>     primary key (id)
> );
>
> If you consider the output from: vxquery -t vuxml ~/VuXML/vuln.xml
> tnftpd, the entry in vuxml_port_xref_range for tnftpd might be:
>
> (1, 'lt', 20040810, NULL, NULL)

Seem reasonable.  For what its worth, storing the range operators as
`>', `=>', etc lets you get something human readable as the output of a
SELECT, e.g.

   SELECT range_version_start, range_operator_start, port_name,
          range_operator_end, range_version_end FROM ... WHERE ...;

      ('20040810', '<', 'tnftpd', '', '')

I found that handy for manual inspection of the database.


You may notice from my example that I used separate `name', `ranges',
and `affected' tables.  I found this easier when filling the database,
since the <package>/<system> elements are kind of a cross-product
operator.  For example, when processing something like

      <package>
	<name>mutt</name>
	<name>ja-mutt</name>
	<range><ge>1.4</ge><lt>1.4.2</lt></range>
      </package>

I first created an `affected' entry to contain the other references.
Then, as I hit each `name' I just associated it with the `affected'
entry, and as I hit each `range' I did the same.  (Otherwise one
needs to construct a few lists, and then make all entries once
the </package> end tag is seen.)  I think this is also easier to
modify if we get new children of <package>, e.g. the <category> and
<architecture> tags that are currently in the VuXML pre-1.2 DTD.

Have fun, and please let me know if I can assist!

Cheers,
-- 
Jacques Vidrine / nectar@celabo.org / jvidrine@verio.net / nectar@freebsd.org



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20040822192915.GA17478>