[Haddock] [haddock] #20: We don't handle non-ASCII characters in doc comments
haddock
haddock at projects.haskell.org
Mon Dec 5 09:19:27 GMT 2011
#20: We don't handle non-ASCII characters in doc comments
-------------------+--------------------------------------------------------
Reporter: waern | Owner:
Type: defect | Status: new
Priority: major | Milestone:
Version: | Resolution:
Keywords: |
-------------------+--------------------------------------------------------
Comment(by simonmar):
The comments from GHC are lexed again by Haddock using an Alex lexer, and
I would expect that step to mangle the Unicode. From `src\Lex.x`:
{{{
alexGetByte :: AlexInput -> Maybe (Word8,AlexInput)
alexGetByte (p,c,[]) = Nothing
alexGetByte (p,_,(c:s)) = let p' = alexMove p c
                          in p' `seq` Just (fromIntegral (ord c),
                                            (p', c, s))

-- for compat with Alex 2.x:
alexGetChar :: AlexInput -> Maybe (Char,AlexInput)
alexGetChar i = case alexGetByte i of
  Nothing     -> Nothing
  Just (b,i') -> Just (chr (fromIntegral b), i')
}}}
You can see we apply `ord` in `alexGetByte`, narrowing the result to a
`Word8`, and `chr` again in `alexGetChar`, so any character above U+00FF
should be squashed to its low 8 bits.
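
 To illustrate the effect in isolation, here is a minimal sketch of that
 round trip. `squash` and `examples` are hypothetical names that do not
 appear in `Lex.x`; the sketch only assumes that `fromIntegral` from `Int`
 to `Word8` wraps modulo 256, which is what GHC does:
 {{{
 import Data.Char (chr, ord)
 import Data.Word (Word8)

 -- Hypothetical helper (not in Lex.x): the ord -> Word8 -> chr round trip
 -- that alexGetByte followed by alexGetChar performs on each character.
 squash :: Char -> Char
 squash c = chr (fromIntegral byte)
   where
     byte :: Word8
     byte = fromIntegral (ord c)  -- wraps modulo 256, keeping the low 8 bits

 -- 'a' (U+0061) and U+00E9 fit in a byte; U+03BB and U+20AC do not.
 examples :: [(Char, Char)]
 examples = [ (c, squash c) | c <- "a\233\955\8364" ]
 }}}
 Loading this in GHCi, `examples` should pair `'a'` and `'\233'` with
 themselves, but `'\955'` (λ) with `'\187'` (») and `'\8364'` (€) with
 `'\172'` (¬), which would match the kind of mangling reported here.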
--
Ticket URL: <http://trac.haskell.org/haddock/ticket/20#comment:12>
haddock <http://www.haskell.org/haddock>
Haddock, The Haskell Documentation Tool