[Haddock] [haddock] #191: Incorrect handling of character references
haddock
haddock at projects.haskell.org
Mon Jan 9 03:47:07 GMT 2012
#191: Incorrect handling of character references
---------------------+------------------------------------------------------
 Reporter:  selinger |      Owner:
     Type:  defect   |     Status:  new
 Priority:  minor    |  Milestone:  2.10.0
  Version:  2.9.4    |   Keywords:
---------------------+------------------------------------------------------
In Haddock, a character reference such as &#252; is used to represent
non-ASCII characters, such as the German u-umlaut (ü).
However, this does not work in the following situations:
* if the character appears in italics,
* if the character appears in a code block with ">",
* if the character appears in a URL.
Moreover, if such a character appears in a Haskell identifier between
single quotes, the character is rendered correctly, but the word is not
recognized as a Haskell identifier (so the surrounding quotes are copied
to the output and the identifier is not linked).
See the attached file for examples.
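The attached file is not reproduced here. As a rough illustration only, a hypothetical source fragment exercising the situations listed above might look like the following sketch. The identifier `grün`, the place name, and the example.org URL are made up for this example and are not taken from the attachment.

```haskell
-- | Plain text: M&#252;nchen (the case reported to work).
--
-- Italics: /M&#252;nchen/ (reported above as not working).
--
-- Code block with ">" (reported above as not working):
--
-- > putStrLn "M&#252;nchen"
--
-- URL (reported above as not working): <http://example.org/M&#252;nchen>
--
-- Identifier between single quotes: 'gr&#252;n' (reported above as
-- rendered correctly but not recognized or linked).
grün :: Int
grün = 1
```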
Here are some comments on how I think it could be fixed. In my opinion,
the best way to handle the &#252; syntax would be to treat it as an input
encoding, i.e., handle it at the I/O level, before any lexing and parsing
is done by Haddock proper. In other words, the sequence &#252; should be
treated as if it were a single character literally present in the input
file.
If it were done this way, then one could use the &#252; syntax in *every*
context, and one could even use escapes to represent actual ASCII
characters, for example, &#38; to represent a literal "&". Thus, if the
sequence of 6 characters &#252; had to appear literally in a comment, one
could type it as &#38;#252; - although &\#252; would achieve the same
result in a simpler way.
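To make the proposal concrete, here is a minimal sketch of such a pre-lexing pass. It is not Haddock's actual code: the function name `decodeCharRefs` is hypothetical, and the sketch assumes only decimal references of the form &#DIGITS; need to be handled (hexadecimal forms are left out). It works in a single left-to-right pass, so text produced by decoding is never scanned again.

```haskell
import Data.Char (chr, isDigit)

-- | Hypothetical pre-lexing pass: replace each decimal character
-- reference of the form &#DIGITS; with the character it names.
-- Anything that does not match that shape is copied through unchanged.
decodeCharRefs :: String -> String
decodeCharRefs ('&' : '#' : rest)
  | not (null digits)
  , length digits <= 7          -- keeps 'read' within Int range
  , (';' : after) <- more       -- the reference must end with ';'
  , code <= 0x10FFFF            -- largest valid Unicode code point
  = chr code : decodeCharRefs after
  where
    (digits, more) = span isDigit rest
    code = read digits :: Int
decodeCharRefs (c : cs) = c : decodeCharRefs cs
decodeCharRefs [] = []
```

Because decoded output is not rescanned, the input &#38;#252; comes out as the six literal characters &#252; rather than as a single ü, which matches the escaping behaviour described above. The input &\#252; does not match the reference shape, so this pass leaves it for the later markup parser to handle.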
--
Ticket URL: <http://trac.haskell.org/haddock/ticket/191>
haddock <http://www.haskell.org/haddock>
Haddock, The Haskell Documentation Tool