ANNOUNCE: brillig 0.3 - not quite the Brill tagger
Daniël de Kok
me at danieldk.eu
Sat Sep 3 10:54:26 BST 2011
Hi Eric,
On Sep 3, 2011, at 10:32 AM, Eric Y. Kow wrote:
> This is largely a seed-planting exercise, naive implementations of
> simple algorithms so that we have something rather than nothing
> (see also fullstop, a sentence segmenter). Is there a Haskell NLTK?
> No, but…
Thank you! It's good to see some NLP infrastructure is being written. From a quick glance through the code, I wonder if transformation rules can be made more expressive easily. E.g. it would be nice if a rule has a list of atomic conditions that are checked. Something along these lines (String and less type classes used for brevity):
---
{-# LANGUAGE ExistentialQuantification #-}
import Data.List.Zipper (Zipper(..))
type Tag = String
type Word = String
data Context =
Context {
ctxWords :: Zipper Word,
ctxTags :: Zipper Tag
}
class Condition c where
conditionApplies :: c -> Context -> Bool
data CondBox = forall c . (Condition c, Show c) => MkCond c
instance Show CondBox where
show (MkCond c) = show c
data Transform =
Transform {
conditions :: [CondBox],
replace :: Replacement
} deriving Show
data Replacement =
Replacement {
from :: Tag,
to :: Tag
} deriving (Show)
---
Now conditions can be defined briefly:
---
-- The Nth word to the left
data PrevWordCondition =
PrevWordCondition Word Int
deriving Show
instance Condition PrevWordCondition where
conditionApplies (PrevWordCondition word n) (Context (Zip left _) _) =
(left !! n) == word
-- The Nth tag to the left
data PrevTagCondition =
PrevTagCondition Tag Int
deriving Show
instance Condition PrevTagCondition where
conditionApplies (PrevTagCondition tag n) (Context _ (Zip left _)) =
(left !! n) == tag
---
One sample rule:
---
rule1 :: Transform
rule1 =
Transform
[ MkCond $ PrevWordCondition "very" 1,
MkCond $ PrevTagCondition "ADJ" 1]
(Replacement "N" "ADJ")
---
The obvious next step would be to make generators for conditions, so that you can make rule templates.
Take care,
Daniël
More information about the NLP
mailing list