Ann: Chatter - a simple library for language processing
Grzegorz Chrupała
G.A.Chrupala at uvt.nl
Tue Nov 19 10:48:47 GMT 2013
Nice!
Regarding working with Text in the Tokenize lib: wouldn't it be just as
efficient to use "pack . tokenize . unpack"? Tokenization involves quite
a bit of character-by-character processing anyway.
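
A minimal sketch of that wrapping idea, assuming a String-based tokenizer
of type String -> [String]. The name tokenizeString is a hypothetical
stand-in (implemented with words) rather than the actual Tokenize API, and
tokenizeText is likewise an illustrative name. Since the tokenizer returns
a list of tokens, pack has to be mapped over the result:

```haskell
import qualified Data.Text as T

-- Hypothetical stand-in for a String-based tokenizer. The real Tokenize
-- library is not used here, so that the sketch stays self-contained.
tokenizeString :: String -> [String]
tokenizeString = words

-- Wrap the String tokenizer so that it accepts and returns Text:
-- unpack the input once, tokenize, then pack each resulting token.
tokenizeText :: T.Text -> [T.Text]
tokenizeText = map T.pack . tokenizeString . T.unpack
```

In ghci, tokenizeText (T.pack "This is a test .") can be compared against
the output of the String version. Whether the two conversions cost anything
noticeable would need to be measured on real input.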
--
Grzegorz
On Mon, Nov 18, 2013 at 10:53 PM, Rogan Creswick <creswick at gmail.com> wrote:
> I've been working on a simple NLP library over the past month or two, and I
> think it may finally be useful to others. I would love to hear comments,
> criticisms, contributions, etc... ;)
>
> My main objective was to make it extremely easy to do basic NLP tasks in
> Haskell, such as POS tagging and document similarity (and later, chunking,
> NER, co-reference resolution, etc.).
>
> The best example of this is Part-of-speech tagging with Chatter:
>
> {{{
> cabal install chatter
> ghci
> > :m +NLP.POS
> > t <- defaultTagger
> > tagStr t "This is a test."
> "This/dt is/bez a/at test/nn ./."
> }}}
>
> Chatter provides POS tagging (with backoff taggers, and a ~83% accurate
> trained default tagger), TF-IDF measures, and cosine document similarity.
>
> It also currently contains an adapted version of the Tokenize library,
> because I wanted to tokenize Text. That's a short-term solution; I haven't
> had time to make a patch to the tokenize lib.
>
> Links:
> - Hackage: http://hackage.haskell.org/package/chatter-0.0.0.2
> - Github: http://github.com/creswick/chatter
>
> --Rogan
>
>
> _______________________________________________
> NLP mailing list
> NLP at projects.haskell.org
> http://projects.haskell.org/cgi-bin/mailman/listinfo/nlp
>