Date:      Tue, 25 Mar 1997 21:03:40 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        hasty@rah.star-gate.com (Amancio Hasty)
Cc:        terry@lambert.org, hackers@freebsd.org
Subject:   Re: Cool Web page interface to mail + search engine?
Message-ID:  <199703260403.VAA26869@phaeton.artisoft.com>
In-Reply-To: <199703260131.RAA15627@rah.star-gate.com> from "Amancio Hasty" at Mar 25, 97 05:31:56 pm

> Odd, I think that you think like a Physics. There are LAWS which can
> not be broken. 

Yes, physics is the only real science (mathematics is not a science,
it is a tool).  What has that got to do with the price of tea in
China?


> Terry, I am going to leave this simple exercise for you to solve .
> When you have a "cool" solution, please let us know about it.
>
> >From The Desk Of Terry Lambert :
> > > It is just a distraction from Terry for now just ignore it and
> > > concentrate on the real project .
> > 
> > Ignore this:	HOW WILL YOU DESIGNATE A MESSAGE'S THREAD WITHOUT
> > 		AN "In-Reply-To:" HEADER?
> > 
> > That's the point of the discussion.


OK, I have a cool solution; the pain is that it took so long to
write up (I think vastly faster than I type):



The problem may be solved in two parts:

I	Assignment of message ID's to messages that came into the
	list server without a "Message-ID:" header.

	Discussion:	There are several examples of messages
			without "Message-ID:" headers in the
			archives (mostly exmh users).

	List Change:	Messages received without "Message-ID:"
			headers shall be given one in the list
			remailing process, before the messages
			are archived.  This means the
			messages will have an archival ID, and
			they will be likely to generate correct
			"In-reply-to:" headers from user mailers,
			guaranteeing threadability of correctly
			formatted replies/followups.
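
A minimal sketch of the List Change above, using Python's standard
email package.  The function name and the "lists.example.org" domain
are placeholders made up for illustration; this is not how the actual
list software does it.

```python
from email.message import Message
from email.utils import make_msgid


def ensure_message_id(msg: Message, list_domain: str = "lists.example.org") -> Message:
    """Give msg a "Message-ID:" header if it arrived without a usable one.

    list_domain is a hypothetical domain for the list server.
    """
    current = str(msg.get("Message-ID", "")).strip()
    if not current:
        # Drop an empty header if present; deleting a missing header is a no-op.
        del msg["Message-ID"]
        # make_msgid() returns a unique ID of the form "<...@list_domain>".
        msg["Message-ID"] = make_msgid(domain=list_domain)
    return msg
```

A message that already carries a "Message-ID:" passes through
untouched, so user mailers still see the ID the sender generated.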


II	Designation of root messages and ordering for messages that
	do not have an "In-reply-to:" header.

	Case 1:		The message does not have an "In-reply-to:"
			because it is the head of a new thread

	Case 2:		The message does not have an "In-reply-to:"
			because the MUA used by the user failed
			to provide one

	Case 3:		The message has an "In-reply-to:" for a
			message archived by the list

	Case 4:		The message has an "In-reply-to:" for a
			message not archived by the list

	Archive Change:	Case 1 may be grossly distinguished from
			case 2 by examining the "Subject:"
			line contents.

			If the "Subject:" line contains a "Re:"
			prefix to the subject, it is considered
			an "orphan" message from an existing
			thread.

			Alternate: If processing time can be
			spared, an index search of existing
			subjects should be used to identify the
			subject, and categorize the message
			that way; a date-cutoff parameter (which
			we will call <timeout #1>) should allow
			a subject that recurs after the cutoff to
			start a new thread instead of being
			treated as an orphan.
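
The case 1 / case 2 split might be sketched as follows.  It applies
the simple "Re:" rule first and then the alternate subject-index
lookup; combining the two in one function is my assumption, as are
all the names.  The last_seen mapping (normalized subject to the time
the subject was last added to) stands in for the archive's subject
index.

```python
import re
from datetime import datetime, timedelta
from typing import Dict

# One or more leading "Re:" prefixes, in any letter case.
RE_PREFIX = re.compile(r"^\s*(re\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Strip "Re:" prefixes and case so replies index under one key."""
    return RE_PREFIX.sub("", subject).strip().lower()


def classify_without_reply_header(
    subject: str,
    now: datetime,
    last_seen: Dict[str, datetime],
    timeout1: timedelta,
) -> str:
    """Return "orphan" or "new-thread" for a message with no In-Reply-To."""
    # Simple rule: a "Re:" prefix means the MUA dropped the header (case 2).
    if RE_PREFIX.match(subject):
        return "orphan"
    # Alternate rule: a known subject that is still inside <timeout #1>.
    last = last_seen.get(normalize_subject(subject))
    if last is not None and now - last <= timeout1:
        return "orphan"
    # Unknown subject, or one that recurred after the cutoff (case 1).
    return "new-thread"
```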

	Archive Change:	Case 3 is time domain indistinguishable
			from case 4 for a given time window in
			the case that a case 4 message resulted
			from a transport propagation delay.  This
			may occur if a list and a user are both
			copied on a message, and the user replies
			before the list archives the message, and
			the network propagation delay topology is
			such that the original message arrives
			after the response to both the list and
			the original sender (most often caused by
			"group replies").

			The Archive processing must be two-tiered,
			such that there is a configurable window
			of time in which a non-archived referenced
			message may arrive before the referencing
			message is archived (which we will call
			<timeout #2>).

			If <timeout #2> expires prior to the arrival
			of the original message, then the message
			that followed it is declared an "orphan".
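
The two-tiered processing could look something like the holding queue
below.  This is only a sketch under my own assumptions: the class and
field names are invented, message IDs are plain strings, and the
caller is expected to call sweep() periodically with the current
time.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Set, Tuple


@dataclass
class Pending:
    message_id: str
    in_reply_to: str
    received: datetime


@dataclass
class HoldingQueue:
    timeout2: timedelta
    archived: Set[str] = field(default_factory=set)
    pending: List[Pending] = field(default_factory=list)

    def submit(self, message_id: str, in_reply_to: Optional[str],
               now: datetime) -> List[Tuple[str, str]]:
        """Archive at once if the parent is known (or absent), else hold."""
        if in_reply_to is None or in_reply_to in self.archived:
            self.archived.add(message_id)
            return [(message_id, "archived")] + self.sweep(now)
        self.pending.append(Pending(message_id, in_reply_to, now))
        return self.sweep(now)

    def sweep(self, now: datetime) -> List[Tuple[str, str]]:
        """Release held messages whose parent arrived or whose window expired."""
        results = []
        progress = True
        while progress:  # repeat: one release may unblock another held reply
            progress = False
            for p in list(self.pending):
                if p.in_reply_to in self.archived:
                    outcome = "archived"
                elif now - p.received >= self.timeout2:
                    outcome = "orphan"
                else:
                    continue
                self.pending.remove(p)
                self.archived.add(p.message_id)
                results.append((p.message_id, outcome))
                progress = True
        return results
```

A reply that beats its parent to the list is held, then archived
normally once the parent shows up; only if <timeout #2> runs out
first is it archived as an orphan.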

	Define Orphan:	An orphan message is the result of a missing
			"In-Reply-to:", a new message with an
			archived subject which recurs within a
			<timeout #1> window, or a followup to a non-list
			message (an 'invalid' "In-reply-to:") for
			which an original message has not appeared
			within a <timeout #2> window.

			Said messages shall be marked with the headers:

			"X-Orphan: no in-reply-to",
			"X-Orphan: no in-reply-to AND active subject",
			"X-Orphan: unarchived in-reply-to 'reply-to'",

			respectively ('reply-to' in the last is the
			original "In-reply-to:" header).
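
Marking could be as simple as the sketch below.  The three short keys
and the function name are hypothetical; the header values are the
ones listed above, with the original "In-reply-to:" contents copied
in without surrounding quotes (an assumption about the exact format).

```python
from email.message import Message

# Hypothetical category keys mapped to the "X-Orphan:" values listed above.
ORPHAN_REASONS = {
    "no-reply-header": "no in-reply-to",
    "active-subject": "no in-reply-to AND active subject",
    "unarchived-parent": "unarchived in-reply-to {reply_to}",
}


def mark_orphan(msg: Message, kind: str) -> Message:
    """Set the "X-Orphan:" header for one of the three orphan categories."""
    reply_to = str(msg.get("In-Reply-To", "")).strip()
    # Replace any earlier marking rather than adding a second header.
    del msg["X-Orphan"]
    msg["X-Orphan"] = ORPHAN_REASONS[kind].format(reply_to=reply_to)
    return msg
```

Keeping the original reference inside the header means it survives
even if the "In-reply-to:" itself is later adjusted.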

	Handle Orphan:	If the subject does not exist, it is treated
			as the head of a new thread.

			If the subject exists, but has not been
			added to in <timeout #1> days, the subject
			is considered to have "recurred", and it is
			treated as the head of a new thread with the
			same subject.

			If the subject exists, and has been added
			to in <timeout #1> days, the subject is
			considered to be "live" and the message is
			considered to be "unthreadable".  A message
			which is "unthreadable" is determined to have
			occurred in the "unthreadable" thread for the
			subject, and is treated as follows:

			a)	If it is the first "unthreadable"
				message for a subject, the root
				message gains a new thread as if
				the unthreadable message were a
				response to the root message itself.
				The message's "In-Reply-to:" is
				adjusted accordingly.

			b)	If it is not the first "unthreadable"
				message for a subject, the root
				message's "unthreadable" message list
				is followed to the tail of the thread,
				and it is inserted there.  The message's
				"In-reply-to:" is adjusted to point
				to the "Message-ID:" of the message at
				the tail of the thread.

			Alternate (b):

			b)	If sufficient processing power exists,
				the message is inserted in the thread
				"unthreadable" based on its date/time
				stamp in UTC.  Both its "In-Reply-to:"
				field and that of the message following
				it are adjusted to indicate the link
				insertion.
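
Rules (a), (b) and alternate (b) amount to maintaining a linked chain
hanging off the root message.  A rough sketch, with invented class
and method names and a plain in-memory list standing in for whatever
the archive really stores:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Msg:
    message_id: str
    date: datetime                      # date/time stamp, assumed already in UTC
    in_reply_to: Optional[str] = None


@dataclass
class SubjectThread:
    root: Msg
    unthreadable: List[Msg] = field(default_factory=list)

    def append_unthreadable(self, msg: Msg) -> None:
        """Rules (a) and (b): reply to the root if first, else to the tail."""
        parent = self.unthreadable[-1] if self.unthreadable else self.root
        msg.in_reply_to = parent.message_id
        self.unthreadable.append(msg)

    def insert_by_date(self, msg: Msg) -> None:
        """Alternate (b): insert in date order and relink both neighbours."""
        pos = 0
        while pos < len(self.unthreadable) and self.unthreadable[pos].date <= msg.date:
            pos += 1
        self.unthreadable.insert(pos, msg)
        prev = self.unthreadable[pos - 1] if pos > 0 else self.root
        msg.in_reply_to = prev.message_id
        # The message that now follows must point at the inserted one.
        if pos + 1 < len(self.unthreadable):
            self.unthreadable[pos + 1].in_reply_to = msg.message_id
```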


	It should be obvious that given the time-order domain of
	message header adjustments, it is possible to:

	o	Reindex messages in the case that the message index
		is somehow lost

	o	Periodically reindex messages based on the "X-Orphan:"
		header to "unorphan" messages for which the:

			"X-Orphan: unarchived in-reply-to 'reply-to'"

		unarchived parent has subsequently arrived outside the
		<timeout #2> period (the purpose of which is to get
		important data into the archive in a timely fashion).

	o	Periodically reindex messages based on the "X-Orphan:"
		header to "unorphan" messages for which the:

			"X-Orphan: no in-reply-to AND active subject"

		has caused the messages to be interpreted as active
		subject orphans rather than new thread heads (typically
		this would be done when <timeout #1> has been increased).

	o	Periodically reindex messages based on the lack of an
		"In-reply-to:" header that are considered message heads
		to force them into the:

			"X-Orphan: no in-reply-to AND active subject"

		category (typically this would be done when <timeout #1>
		has been decreased).

	o	Apply administrative fiat to adjust headers to force
		messages from one category to another, and to join and
		split threads, as necessary.  To support this, there
		should be a valid "In-reply-to: root" or similar thread
		head directive which is treated as a pseudo-directive.
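
As an illustration of the second reindexing pass above (late-arriving
parents), here is a sketch that assumes the archive can be viewed as
a mapping from "Message-ID:" to a dictionary of headers.  That
representation, the function name, and the unquoted header format are
assumptions of mine.

```python
from typing import Dict, List

UNARCHIVED = "unarchived in-reply-to "


def unorphan_late_parents(archive: Dict[str, Dict[str, str]]) -> List[str]:
    """Relink orphans whose referenced parent has since been archived.

    Returns the Message-IDs that were unorphaned.
    """
    fixed = []
    for message_id, headers in archive.items():
        reason = headers.get("X-Orphan", "")
        if not reason.startswith(UNARCHIVED):
            continue  # not an "unarchived in-reply-to" orphan
        parent = reason[len(UNARCHIVED):].strip()
        if parent in archive:
            # Restore the original reference and drop the orphan marking.
            headers["In-Reply-To"] = parent
            del headers["X-Orphan"]
            fixed.append(message_id)
    return fixed
```

The pass works only because the "X-Orphan:" header preserved the
original reference; orphans whose parent is still missing are left
alone for the next run.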


I think that about covers it...

					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


