Date:      Wed,  3 Sep 2008 21:33:30 -0400 (EDT)
From:      vogelke+software@pobox.com (Karl Vogel)
To:        freebsd-questions@freebsd.org
Subject:   Re: script to assist ASCII text
Message-ID:  <20080904013330.B1E92B7BD@kev.msw.wpafb.af.mil>
In-Reply-To: <1219723211.4994.165.camel@localhost> (message from Gary Kline on Mon, 25 Aug 2008 21:00:10 -0700)

>> On Mon, 25 Aug 2008 21:00:10 -0700, 
>> Gary Kline <kline@thought.org> said:

G> This had eluded me for years and it may not be possible, but here goes.
G> I write using vi or, less frequently vim.  Is there any sh script that
G> would make sure that there were exactly one space ('\040') between words,
G> and three spaces between sentences?  My definition of "a sentence" is a
G> string of words that ends in a period or question-mark, exclamation-mark,
G> or ellipse ("... . || ... ? || ... !)  Also, any dash "--" could not have
G> any whitespace around it.

   I like a similar setup -- one space between words, sentences ending
   with a period followed by two spaces.  The GNU version of "fmt" handles
   this pretty well.  Here's the first part of your message, formatted to
   50-character-wide lines, with the type of spacing that drives me nuts:

     me% cat -n msg
       1  This had eluded me for years and it may not be
       2  possible, but here goes. I write using vi or,
       3  less frequently vim. Is there any sh script that
       4  would make sure that there were exactly one
       5  space ('\040') between words, and three spaces
       6  between sentences? My definition of "a sentence"
       7  is a string of words that ends in a period or
       8  question-mark, exclamation-mark, or ellipse.

   Putting one word on each line and then letting GNU fmt decide on
   sentence-handling does almost exactly what you want (you get two
   spaces after each sentence rather than three):
   
     me% gfmt -1 msg | gfmt -50 | cat -n
       1  This had eluded me for years and it may not be
       2  possible, but here goes.  I write using vi or,
       3  less frequently vim.  Is there any sh script
       4  that would make sure that there were exactly one
       5  space ('\040') between words, and three spaces
       6  between sentences?  My definition of "a sentence"
       7  is a string of words that ends in a period or
       8  question-mark, exclamation-mark, or ellipse.
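
   If GNU fmt isn't available, the spacing rules themselves can be
   roughed out with sed.  This is only a sketch, not what gfmt does
   internally: the name fix_spacing is made up for illustration, it
   works one line at a time without refilling, and it assumes every
   period, question mark or exclamation mark followed by a space ends
   a sentence.  It also closes up "--", as asked in the original
   question.

```shell
#!/bin/sh
# fix_spacing: hypothetical helper, not part of GNU fmt.
# Reads text on stdin and writes the respaced text on stdout.
fix_spacing() {
    # 1. squeeze every run of whitespace down to one space
    # 2. put two spaces after ., ? or ! when a space follows
    # 3. remove any spaces around a double dash
    sed -e 's/[[:space:]][[:space:]]*/ /g' \
        -e 's/\([.?!]\) /\1  /g' \
        -e 's/ *-- */--/g'
}
```

   Run it as "fix_spacing < msg".  For three spaces between sentences,
   widen the replacement in the second expression.  Leading indentation
   is squeezed as well, and a signature separator ("-- ") loses its
   trailing space, so this is best kept to body text.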

   Here's a script I use as a driver for GNU fmt.  It looks for an
   optional environment variable FMTWIDTH to decide how long each line
   should be.  This comes in handy if I call vi/vim from within a script:

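     #!/bin/sh
     # driver for fmt.

     # Use FMTWIDTH, if set, as the default line width.
     case "$FMTWIDTH" in
         "") opt= ;;
         *)  opt="-$FMTWIDTH" ;;
     esac
     # An option given on the command line overrides FMTWIDTH.
     case "$1" in
         -*) opt= ;;
         *)  ;;
     esac
     # ${1+"$@"} passes the arguments through unchanged; it works around
     # old shells that turn "$@" into one empty argument when there are none.
     exec /usr/local/bin/gfmt $opt ${1+"$@"}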
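
   To see which width the driver ends up handing to gfmt, the same two
   case statements can be wrapped in a function that prints the option
   instead of exec'ing.  The name width_opt is hypothetical, and the
   ${FMTWIDTH-} and ${1-} forms are there only so the sketch also runs
   under "set -u":

```shell
#!/bin/sh
# width_opt: hypothetical helper mirroring the driver's option logic.
# Prints the width option the driver would pass to gfmt (possibly empty).
width_opt() {
    case "${FMTWIDTH-}" in
        "") opt= ;;
        *)  opt="-$FMTWIDTH" ;;
    esac
    case "${1-}" in
        -*) opt= ;;     # the caller supplied an option of their own
    esac
    printf '%s\n' "$opt"
}

FMTWIDTH=50
width_opt msg          # prints -50
width_opt -72 msg      # prints an empty line
```

   Note that any leading option, not just a width, switches off the
   FMTWIDTH default.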

   Here's a mapping I use for quickly reformatting a section of text
   in vim.  I set mark 'a' where the section starts (ma), then move down
   to the end of the section and hit 'v':

     jmbk:'a,.!fmt -1|fmt<CR>'b

   A similar mapping will reformat whatever paragraph I'm in, with no
   need to set a mark first:

     }jmbk{ma}:'a,.!fmt -1|fmt<CR>'b

   The script below helps me clean up a file or message after running fmt,
   which makes strings like "U.S.A." look like the end of a sentence
   even when they're not.  This should give you some ideas.

-- 
Karl Vogel                      I don't speak for the USAF or my company
Panda Mating Fails; Veterinarian Takes Over     --actual news headline, 1997

---------------------------------------------------------------------------
#!/usr/bin/perl
#
# $Id: cm,v 1.3 2008/08/17 20:25:49 vogelke Exp $
# $Source: /home/vogelke/bin/RCS/cm,v $
#
# cm: clean mail message

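while (<>) {
    # fmt puts two spaces after every period.  Month abbreviations
    # lose the period and get a single space.
    s/Jan\.  /Jan /g;
    s/Feb\.  /Feb /g;
    s/Aug\.  /Aug /g;
    s/Sept\.  /Sept /g;
    s/Oct\.  /Oct /g;
    s/Nov\.  /Nov /g;
    s/Dec\.  /Dec /g;
    # Titles keep the period, followed by one space.
    s/Mr\.  /Mr. /g;
    s/Mrs\.  /Mrs. /g;
    s/Ms\.  /Ms. /g;
    s/Dr\.  /Dr. /g;
    # Sen. and Rep. are spelled out.
    s/Sen\.  /Senator /g;
    s/Rep\.  /Representative /g;
    # Dotted acronyms lose their periods.
    s/U\.S\.A\.  /USA /g;
    s/U\.S\.  /US /g;
    s/D\.C\.  /DC /g;
    s/U\.N\.  /UN /g;
    s/B\.S\.  /BS /g;
    s/M\.B\.A\.  /MBA /g;
    # A single capital initial keeps its period, followed by one space.
    s/ ([A-Z]\.)  / $1 /g;
    # Paired quote marks ('' and ``) become plain double quotes.
    s/''/\"/g;
    s/``/\"/g;

    # These come from saving Firefox pages: UTF-8 bytes for curly quotes.
    s/\342\200\231/'/g;     # U+2019 right single quote
    s/\342\200\234/"/g;     # U+201C left double quote
    s/\342\200\235/"/g;     # U+201D right double quote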

    print;
}

exit(0);
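
Since the original question asked for an sh script, here is a rough
sh/sed take on the titles part of cm.  It is a sketch, not a
replacement for the Perl above: the function name fix_abbrevs and the
short word list are assumptions made for illustration, and only the
"keep the period, use one space" rule is covered.

```shell
#!/bin/sh
# fix_abbrevs: hypothetical sh/sed version of one part of cm.
# Keeps the period on each listed title but reduces the two spaces
# that fmt put after it to one.
fix_abbrevs() {
    script=
    for word in Mr Mrs Ms Dr; do
        # Append one sed substitution per word, each on its own line.
        script="${script}s/${word}\\.  /${word}. /g
"
    done
    sed -e "$script"
}
```

Run it as "fix_abbrevs < msg"; adding a word to the "for" list adds
another substitution.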


