NLP Digest, Vol 12, Issue 5
wren ng thornton
wren at freegeek.org
Sun Oct 30 01:20:11 BST 2011
On 10/29/11 7:04 PM, Dan Maftei wrote:
>> I mean the functions (\x -> log (1+x)) and (\x -> exp x - 1). They're
>> inaccurate for very small x because of details about how Float/Double
>> are represented. The details aren't too important, but just think about
>> what happens when we add a really small number to 1. There are many
>> values where 1+x == 1 because we can't represent 1+x with the available
>> precision and 1 is the closest thing we can represent.
>>
>> Suffice it to say that (\x -> log (1+x)) will underflow around x=2**-53
>> for Double, whereas we could implement a denotationally equivalent
>> function (\x -> log1p x) which only underflows around x=2**-1074 for
>> Double. The same idea goes for the naive (\x -> exp x - 1) vs a
>> denotationally equivalent implementation (\x -> expm1 x).
>>
>> N.B., (log1p . expm1) == id, and (expm1 . log1p) == id
>
> Perfect. That makes sense, and I tested on ghci too.
>
> But why is adding 1 to a small number then taking its log so common? When
> we move into logspace, we take logs of small numbers, and add or multiply
> them. Where does (\x -> log (1+x)) come into play?
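First, to make the precision point above concrete, here is what happens on
Doubles (a sketch you can try in GHCi, assuming log1p and expm1 are in
scope, e.g. the versions exported by the logfloat package):

    log (1 + 1e-20) :: Double    ==> 0.0        -- 1 + 1e-20 rounds to 1
    log1p 1e-20     :: Double    ==> 1.0e-20
    exp 1e-20 - 1   :: Double    ==> 0.0        -- exp 1e-20 rounds to 1
    expm1 1e-20     :: Double    ==> 1.0e-20

The naive forms lose all of the information, while log1p and expm1 give the
answer to full precision.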
Well, consider the implementation of logSum. Naively we could define it as:
    (+) (LogFloat x) (LogFloat y) = LogFloat (log (exp x + exp y))
And in mathematics that's fine; however, when using floating point
instead of actual real numbers, the exponentiation will cause underflow
in all the places we're trying to avoid it. Thus, the naive
implementation doesn't achieve our goals. Through some manipulation we
can come up with the following implementation which avoids underflow:
    (+) (LogFloat x) (LogFloat y)
        | x == y
          && isInfinite x
          && isInfinite y = LogFloat x
        | x >= y    = LogFloat (x + log1p (exp (y - x)))
        | otherwise = LogFloat (y + log1p (exp (x - y)))
The first case says that @0+0 == 0@ and @infinity+infinity == infinity@
also hold in the log-domain (where log 0 = -inf and log +inf = +inf).
The other two cases are mirror images of each other, ensuring that we
always start from the larger argument.
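Note that the first case cannot be folded into the other two: with
x = y = -infinity (log-domain 0+0), IEEE-754 gives y - x = NaN, and the NaN
propagates through the whole formula. A quick sketch of what the general
case would do there (a hypothetical GHCi session, again assuming a log1p
for Double is in scope):

    let x = -1/0 :: Double   -- i.e. -Infinity, the log-domain zero
    let y = -1/0 :: Double
    y - x                    ==> NaN
    x + log1p (exp (y - x))  ==> NaN    -- instead of the correct -Infinity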
To understand the x >= y equation in the normal domain, we exponentiate it
(recalling that log1p z = log (1 + z)):

    exp (x + log1p (exp (y - x)))
      = exp x * exp (log (1 + exp (y - x)))
      = exp x * (1 + exp (y - x))
      = exp x + exp x * exp (y - x)
      = exp x + exp (x + y - x)
      = exp x + exp y
So, it does actually succeed in adding two log-domain numbers. But why do
it this way? Because we "factor x out of y" before exponentiating, the
thing we exponentiate is smaller than either argument (provided they have
the same sign). By pushing things towards zero we push them away from the
boundaries where exp underflows or overflows. As for why we want x >= y: it
guarantees that y - x <= 0, so exp (y - x) lies in (0,1] and can never
overflow; and if it underflows, the sum simply rounds to the larger
argument, which is the correct answer at that precision. It also keeps the
argument of log1p small, right where log1p is accurate.
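Concretely (Double's exp underflows to 0 below about -745 and overflows to
+Infinity above about 710; the figures here are worked by hand):

    -- Both addends underflow in the normal domain:
    naive : log (exp (-1000) + exp (-1001))   ==> -Infinity
    robust: -1000 + log1p (exp (-1))          ==> -999.6867...

    -- Both addends overflow in the normal domain:
    naive : log (exp 1000 + exp 999)          ==> Infinity
    robust: 1000 + log1p (exp (-1))           ==> 1000.3132...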
Another reason these functions are so important is that crossing the
log/exp border is extremely expensive. The naive implementation crosses
that border three times (2 exp & 1 log), whereas the modified version
crosses it only twice (1 exp & 1 log1p). Since those transcendental calls
dominate the cost of logSum, we've cut its running time by about a third!
Because 1 plays such a special role with respect to the field operations,
we can use log1p, expm1, and some basic algebra to make programs dealing
with logarithms and exponentiation much, much faster. The fact that these
functions also have special implementations which are very precise is
icing on the cake.
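Putting it all together, here is a minimal, self-contained sketch of the
idea. This is just an illustration, not the actual logfloat internals, and
it assumes a log1p for Double is in scope (newer GHCs export one from
GHC.Float; the logfloat package provides its own):

    module Main (main) where

    import GHC.Float (log1p)  -- assumption: base >= 4.9

    -- A non-negative real stored in the log domain:
    -- @LogFloat x@ represents the number @exp x@.
    newtype LogFloat = LogFloat Double
        deriving (Eq, Ord, Show)

    -- Embed a normal-domain number, and project back out.
    logFloat :: Double -> LogFloat
    logFloat = LogFloat . log

    fromLogFloat :: LogFloat -> Double
    fromLogFloat (LogFloat x) = exp x

    -- Log-domain addition, exactly as above.
    logSum :: LogFloat -> LogFloat -> LogFloat
    logSum (LogFloat x) (LogFloat y)
        | x == y
          && isInfinite x
          && isInfinite y = LogFloat x     -- 0+0 and inf+inf
        | x >= y          = LogFloat (x + log1p (exp (y - x)))
        | otherwise       = LogFloat (y + log1p (exp (x - y)))

    main :: IO ()
    main = do
        -- Round-trip through the log domain:
        print (fromLogFloat (logSum (logFloat 0.5) (logFloat 0.25)))
        -- ==> ~0.75
        -- Addends whose normal-domain values underflow Double entirely:
        print (logSum (LogFloat (-1000)) (LogFloat (-1001)))
        -- ==> LogFloat (-999.6867...), where the naive version
        --     gives -Infinity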
--
Live well,
~wren