Date: Fri, 29 Jan 2010 13:01:06 -0800
From: Murray Stokely <murray@stokely.org>
To: FreeBSD doc list <freebsd-doc@freebsd.org>
Subject: Re: Proposed new doc hierarchy for closed-captions / transcripts from conferences
Message-ID: <2a7894eb1001291301u2e0b5f17q8dc381fad5b76285@mail.gmail.com>
In-Reply-To: <2a7894eb1001172357t754cee36u760d9ddd1d6a7665@mail.gmail.com>
References: <2a7894eb1001172357t754cee36u760d9ddd1d6a7665@mail.gmail.com>
No comments?  I will proceed with this plan then.

    - Murray

On Sun, Jan 17, 2010 at 11:57 PM, Murray Stokely <murray@stokely.org> wrote:
> As some of you might be aware, I have been working on getting closed
> captions for the videos of FreeBSD-related talks at conferences.  In
> the last month I've started using the YouTube Machine Learning to
> produce the first automatic transcript and then paying human editors
> through Amazon Mechanical Turk to improve the technical vocabulary /
> general editing of the transcripts.
>
> There are now four videos in the BSD Conferences YouTube channel with
> relatively good quality human-edited English-language transcripts.
> (e.g. pointers at
> http://freebsd.stokely.org/2010/01/improved-conference-captions-from.html)
>
> The caption files themselves are simple ASCII text files with one line
> for the start/end time of the text to be displayed, 1 or 2 lines for
> the text to be displayed, and a blank line to separate the next
> record.
>
> I would like to start checking in these text files under
> doc/en_US.ISO8859-1/captions/ for a number of reasons.
>
> 1. I want to make it easier for others to correct any mistakes in the
>    captions.
> 2. I want to make it easier for translators to produce localized
>    captions for the most popular videos.
> 3. Keep a centralized repository of the captions outside of YouTube,
>    so other hosting sites or systems are able to use them.
> 4. Increase discoverability of technical content discussed in the
>    conference talks with indexable transcripts open to search engines.
>
> The blog post above has some example text files that I'd like to check
> in.  It then becomes a matter of choosing the hierarchy.
>
> I might suggest:
>
> doc/${LANG}/captions/${YEAR}/${CONFERENCE}/${TALK}
>
> e.g.
>
> doc/en_US.ISO8859-1/captions/2009/asiabsdcon/mckusick-kernelinternals.sbv
>
> Thoughts?
>
>    - Murray
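
For illustration only, here is a minimal Python sketch of the two ideas in the quoted message: reading a caption file laid out as described (one timing line, one or two text lines, a blank line between records) and building a path in the suggested hierarchy. The function names parse_captions and caption_path are hypothetical, and the sample timing lines use an assumed "start,end" layout that the message itself does not spell out; the parser treats the timing line as opaque text for that reason.

```python
from pathlib import PurePosixPath


def parse_captions(text):
    """Split caption text into records.

    Each record is a dict holding the first line of a block (the timing
    line, kept as an opaque string) and the remaining text lines.
    Records are separated by one or more blank lines.
    """
    records = []
    block = []
    # The appended empty string flushes the final block even when the
    # input does not end with a blank line.
    for line in text.splitlines() + [""]:
        if line.strip():
            block.append(line.rstrip())
        elif block:
            records.append({"timing": block[0], "text": block[1:]})
            block = []
    return records


def caption_path(lang, year, conference, talk):
    """Build doc/${LANG}/captions/${YEAR}/${CONFERENCE}/${TALK}."""
    return str(PurePosixPath("doc", lang, "captions", str(year), conference, talk))


# Hypothetical sample data; the timestamp layout is an assumption.
SAMPLE = """0:00:01.000,0:00:04.000
Welcome to the talk.

0:00:04.500,0:00:08.000
Today we will look at
the kernel internals.
"""

if __name__ == "__main__":
    for record in parse_captions(SAMPLE):
        print(record["timing"], "|", " ".join(record["text"]))
    print(caption_path("en_US.ISO8859-1", 2009, "asiabsdcon",
                       "mckusick-kernelinternals.sbv"))
```

Keeping the timing line as a plain string avoids guessing at the timestamp syntax; a translator's tool could copy it through unchanged and replace only the text lines.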