gawk/embedded-nul.diff

From nobody Wed Nov 30 23:49:26 2005
From: Paul Eggert <eggert@CS.UCLA.EDU>
Subject: Re: gawk: length return incorrect value when MB_CUR_MAX > 1
To: Hirofumi Saito <hi_saito@yk.rim.or.jp>
Cc: bug-gawk@gnu.org, KIMURA Koichi <kimura.koichi@canon.co.jp>
Date: Wed, 30 Nov 2005 13:39:56 -0800

Hirofumi Saito <hi_saito@yk.rim.or.jp> writes:
> And then, I tried to use gawk 3.1.5 which I build with sarge.
>
> $ LANG=ja_JP.utf8 gawk 'BEGIN {print length("abc\0def")}'
> 7
> $ LANG=ja_JP.eucJP gawk 'BEGIN {print length("abc\0def")}'
> 3

Very strange. I don't get this result with Debian sarge x86; instead,
I get 3 in both cases. And that is what I would expect to get, given
the source code. Perhaps your locales weren't all built? (Also, I
set LC_ALL rather than LANG; that's safer.)

> By the way, I patched Kimura's patch, then:

Yes, his patch should work.

Here's a slightly more-efficient patch:

--- node.c-bak 2005-07-26 11:07:43.000000000 -0700
+++ node.c 2005-11-30 13:33:44.000000000 -0800
@@ -749,9 +749,10 @@ str2wstr(NODE *n, size_t **ptr)
 		switch (count) {
 		case (size_t) -2:
 		case (size_t) -1:
-		case 0:
 			goto done;
+		case 0:
+			count = 1;
 		default:
 			*wsp++ = wc;
 			src_count -= count;
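
For anyone reading the patch without the gawk sources at hand, here is a
minimal standalone sketch of the same idea. It is not gawk code:
count_chars is a hypothetical helper written for illustration, and it
only counts characters rather than building a wide string as str2wstr
does. mbrtowc returns 0 when it decodes a NUL, so the loop treats that
case as a one-byte character and falls through to the default branch
instead of stopping.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of gawk: count the characters in the
   src_count bytes starting at sp, using the current locale's encoding.
   An embedded NUL counts as one character.  */
size_t
count_chars(const char *sp, size_t src_count)
{
	mbstate_t mbs;
	size_t n = 0;

	memset(&mbs, 0, sizeof mbs);	/* start in the initial shift state */
	while (src_count > 0) {
		wchar_t wc;
		size_t count = mbrtowc(&wc, sp, src_count, &mbs);

		switch (count) {
		case (size_t) -2:	/* incomplete multibyte sequence */
		case (size_t) -1:	/* invalid multibyte sequence */
			return n;
		case 0:
			/* mbrtowc decoded a NUL, which occupies one byte.  */
			count = 1;
			/* fall through */
		default:
			n++;
			sp += count;
			src_count -= count;
		}
	}
	return n;
}
```

Called on the seven bytes of "abc\0def", this sketch counts all seven
characters instead of stopping after "abc", assuming a locale in which
ASCII characters are single bytes.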
_______________________________________________
bug-gnu-utils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-gnu-utils