ANNOUNCE: brillig 0.3 - not quite the Brill tagger

Eric Kow eric.kow at gmail.com
Wed Sep 7 12:52:09 BST 2011


Hi!

On Wed, Sep 07, 2011 at 13:32:17 +0200, Grzegorz Chrupała wrote:
> I think I don't get it. How would you use a discriminatively trained
> tagger like sequor in combination with the Brill tagger?

So, at the moment, I don't know what a "discriminatively trained" tagger
is and wouldn't know how to start trying to answer such a question...

But hopefully I won't have to, because I was actually just saying
something incredibly simple and non-technical: that the brillig
executable could provide a thin wrapper around different kinds of
taggers (as alternatives to each other, completely disjoint).
You know, files go in, tags come out... But that was before I looked
at the training file format and understood that this is what sequor
provides.  Oh well, this probably makes brillig just a bit redundant in
infrastructure terms. :-)
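
To make the "files go in, tags come out" idea concrete, here is a
minimal sketch of such a thin wrapper, written in Python purely for
brevity.  Everything in it is hypothetical: the two toy back ends, the
TAGGERS table and the tag_text function are illustrative stand-ins and
have nothing to do with the actual brillig or sequor code or their
file formats.

```python
from typing import Callable, Dict, List, Tuple

# A tagger is anything that turns a list of tokens into (token, tag) pairs.
Tagger = Callable[[List[str]], List[Tuple[str, str]]]


def toy_rule_tagger(tokens: List[str]) -> List[Tuple[str, str]]:
    """Hypothetical rule-based back end: capitalised words become NNP."""
    return [(t, "NNP" if t[:1].isupper() else "NN") for t in tokens]


def toy_default_tagger(tokens: List[str]) -> List[Tuple[str, str]]:
    """Hypothetical baseline back end: every token is tagged NN."""
    return [(t, "NN") for t in tokens]


# The back ends are alternatives to each other; exactly one is used per run.
TAGGERS: Dict[str, Tagger] = {
    "rules": toy_rule_tagger,
    "default": toy_default_tagger,
}


def tag_text(backend: str, text: str) -> str:
    """Tag whitespace-tokenised text, one sentence per line, as token/TAG."""
    tagger = TAGGERS[backend]
    out_lines = []
    for line in text.splitlines():
        tagged = tagger(line.split())
        out_lines.append(" ".join(f"{tok}/{tag}" for tok, tag in tagged))
    return "\n".join(out_lines)
```

The wrapper only knows the shared input/output shape, so swapping one
tagger for another is a matter of changing the back-end name.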

> For what it's worth, I just trained Sequor (using several spelling
> features as encoded in the data/mlcomp2.features template) on the
> initial 90% of the Brown corpus, and tested on the final 10%, and got
> an accuracy of 96.2%. Training takes several hours, but tagging runs
> at more than 3000 words/second.

Cool!

PS. Can we have a small release with '-rtsopts'?

-- 
Eric Kow <http://erickow.com>