Date:      Tue, 9 Aug 2011 09:40:26 -0400
From:      Rod Person <rodperson@rodperson.com>
To:        Anton Shterenlikht <mexas@bristol.ac.uk>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: extracting text from docx files
Message-ID:  <20110809094026.dea10d7a.rodperson@rodperson.com>
In-Reply-To: <20110809133632.GA37445@mech-cluster241.men.bris.ac.uk>
References:  <20110809133632.GA37445@mech-cluster241.men.bris.ac.uk>

On Tue, 9 Aug 2011 14:36:32 +0100
Anton Shterenlikht <mexas@bristol.ac.uk> wrote:

> Usually I unzip a docx and then search
> through all *xml  files to find the
> useful data. However, I can't find any
> xml styles to use, so I have to convert
> the relevant xml file(s) to plain text
> by hand. I wonder if anybody can suggest
> a better way. Perhaps there's something
> in ports that can help.

For plain-text conversion only, you could try docx2txt:
http://docx2txt.sourceforge.net/
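
If you would rather script the unzip-and-search approach yourself, something like the following Python sketch might do. It is not how docx2txt works internally; it assumes the body text lives in word/document.xml inside the archive and that paragraphs and text runs use the standard WordprocessingML w:p and w:t elements. The function name docx_to_text is made up for illustration. Headers, footers, footnotes and all formatting are ignored, and paragraphs nested inside text boxes may show up twice.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the usual "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    `source` is a file path or a binary file-like object.
    Assumption: the main text is stored in word/document.xml.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each w:p is a paragraph; its visible text is split across w:t runs.
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Usage would be along the lines of print(docx_to_text("report.docx")), where report.docx is a placeholder file name.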

-- 
Rod


