Date:      Tue, 9 Aug 2011 14:36:32 +0100
From:      Anton Shterenlikht <mexas@bristol.ac.uk>
To:        freebsd-questions@freebsd.org
Subject:   extracting text from docx files
Message-ID:  <20110809133632.GA37445@mech-cluster241.men.bris.ac.uk>

I often receive information in *.docx format
from my MS-using colleagues. Sometimes I can
ask for a PDF (or similar) instead, but not always.

Usually I unzip the docx and then search
through all the *.xml files to find the
useful data. However, I can't find any
XML stylesheets to use, so I have to convert
the relevant XML file(s) to plain text
by hand. I wonder if anybody can suggest
a better way. Perhaps there's something
in ports that can help.
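
For reference, the unzip-and-search steps above can be sketched
in Python using only the standard library. This is a minimal
sketch, not a complete converter. It assumes the main text lives
in word/document.xml inside the archive (the usual location) and
it only handles plain runs of text, tabs and line breaks. The
function name docx_to_text is made up for this illustration.
Headers, footers, footnotes and text boxes are ignored, and text
in nested paragraphs may be repeated.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path):
    """Return the body text of a .docx file, one line per paragraph.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(path) as archive:
        # Assumption: the document body is stored at word/document.xml
        root = ET.fromstring(archive.read("word/document.xml"))

    lines = []
    for para in root.iter(W_NS + "p"):        # each paragraph
        parts = []
        for run in para.iter(W_NS + "r"):     # each run inside it
            for node in run:                  # direct children of the run
                if node.tag == W_NS + "t" and node.text:
                    parts.append(node.text)   # literal text
                elif node.tag == W_NS + "tab":
                    parts.append("\t")        # tab character
                elif node.tag == W_NS + "br":
                    parts.append("\n")        # manual line break
        lines.append("".join(parts))
    return "\n".join(lines)
```

Calling print(docx_to_text("report.docx")), where report.docx is
a placeholder file name, would write the extracted text to
standard output, from where it can be piped to grep or less.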

Many thanks
Anton


-- 
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 331 5944
Fax: +44 (0)117 929 4423


