From d58c8ecf79a549a79694670aec35bc56c11251c668b4a74d352004e61c900b4a Mon Sep 17 00:00:00 2001
From: Philipp Thomas
Date: Fri, 7 May 2010 15:54:35 +0000
Subject: [PATCH 1/7] - Update to 8.5:
  Bug fixes
  * cp and mv once again support preserving extended attributes.
  * cp now preserves "capabilities" when also preserving file ownership.
  * ls --color once again honors the 'NORMAL' dircolors directive.
    [bug introduced in coreutils-6.11]
  * sort -M now handles abbreviated months that are aligned using blanks
    in the locale database. Also locales with 8 bit characters are
    handled correctly, including multi byte locales with the caveat
    that multi byte characters are matched case sensitively.
  * sort again handles obsolescent key formats (+POS -POS) correctly.
    Previously if -POS was specified, 1 field too many was used in the
    sort. [bug introduced in coreutils-7.2]
  New features
  * join now accepts the --header option, to treat the first line of
    each file as a header line to be joined and printed
    unconditionally.
  * timeout now accepts the --kill-after option which sends a kill
    signal to the monitored command if it's still running the
    specified duration after the initial signal was sent.
  * who: the "+/-" --mesg (-T) indicator of whether a user/tty is
    accepting messages could be incorrectly listed as "+", when in
    fact, the user was not accepting messages (mesg no). Before, who
    would examine only the permission bits, and not consider the group
    of the TTY device file. Thus, if a login tty's group would change
    somehow e.g., to "root", that would make it unwritable (via
    write(1)) by normal users, in spite of whatever the permission
    bits might imply. Now, when configured using the
    --with-tty-group[=NAME] option, who also compares the group of the
    TTY device with NAME (or "tty" if no group name is specified).
  Changes in behavior
  * ls --color no longer emits the final 3-byte color-resetting escape
    sequence when it would be a no-op.
  * join -t '' no longer emits an error and instead operates on each
    line as a whole (even if they contain NUL characters).
  For other changes since 7.1 see NEWS.
- Split-up coreutils-%%{version}.diff as far as possible.
- Prefix all patches with coreutils-.
- All patches have the .patch suffix.
- Use the i18n patch from Archlinux as it fixes at least one test
  suite failure.

OBS-URL: https://build.opensuse.org/package/show/Base:System/coreutils?expand=0&rev=9
---
 coreutils-5.3.0-i18n-0.1.patch                | 4015 ----------------
 ...n4su.diff => coreutils-5.3.0-sbin4su.patch |   14 +-
 ...tils-6.8-su.diff => coreutils-6.8-su.patch |  281 +-
 ....8.0-pie.diff => coreutils-6.8.0-pie.patch |  109 +-
 coreutils-7.1.diff                            |  194 -
 coreutils-7.1.tar.xz                          |    3 -
 coreutils-8.5-i18n.patch                      | 4066 +++++++++++++++++
 coreutils-8.5.patch                           |   67 +
 coreutils-8.5.tar.xz                          |    3 +
 coreutils-add_ogv.patch                       |    8 +-
 coreutils-cifs-afs.diff                       |   35 -
 coreutils-fix_distcheck.patch                 |   80 -
 coreutils-getaddrinfo.diff                    |   16 -
 coreutils-getaddrinfo.patch                   |   17 +
 coreutils-gl_printf_safe.patch                |   24 +
 coreutils-i18n-infloop.patch                  |   14 +
 ...ield.diff => coreutils-i18n-limfield.patch |   14 +-
 ...ort.diff => coreutils-i18n-monthsort.patch |    6 +-
 ...random.diff => coreutils-i18n-random.patch |    6 +-
 coreutils-i18n-uninit.patch                   |   16 +
 coreutils-invalid-ids.patch                   |   26 +
 coreutils-no_hostname_and_hostid.patch        |  122 +
 ...ls-sysinfo.diff => coreutils-sysinfo.patch |   14 +-
 coreutils.changes                             |   53 +-
 coreutils.spec                                |   70 +-
 i18n-infloop.diff                             |   14 -
 i18n-uninit.diff                              |   29 -
 invalid-ids.diff                              |   49 -
 28 files changed, 4673 insertions(+), 4692 deletions(-)
 delete mode 100644 coreutils-5.3.0-i18n-0.1.patch
 rename coreutils-5.3.0-sbin4su.diff => coreutils-5.3.0-sbin4su.patch (90%)
 rename coreutils-6.8-su.diff => coreutils-6.8-su.patch (78%)
 rename coreutils-6.8.0-pie.diff => coreutils-6.8.0-pie.patch (76%)
 delete mode 100644 coreutils-7.1.diff
 delete mode 100644 coreutils-7.1.tar.xz
 create mode 100644 coreutils-8.5-i18n.patch
 create mode 100644
coreutils-8.5.patch create mode 100644 coreutils-8.5.tar.xz delete mode 100644 coreutils-cifs-afs.diff delete mode 100644 coreutils-fix_distcheck.patch delete mode 100644 coreutils-getaddrinfo.diff create mode 100644 coreutils-getaddrinfo.patch create mode 100644 coreutils-gl_printf_safe.patch create mode 100644 coreutils-i18n-infloop.patch rename i18n-limfield.diff => coreutils-i18n-limfield.patch (85%) rename i18n-monthsort.diff => coreutils-i18n-monthsort.patch (66%) rename i18n-random.diff => coreutils-i18n-random.patch (70%) create mode 100644 coreutils-i18n-uninit.patch create mode 100644 coreutils-invalid-ids.patch create mode 100644 coreutils-no_hostname_and_hostid.patch rename coreutils-sysinfo.diff => coreutils-sysinfo.patch (86%) delete mode 100644 i18n-infloop.diff delete mode 100644 i18n-uninit.diff delete mode 100644 invalid-ids.diff diff --git a/coreutils-5.3.0-i18n-0.1.patch b/coreutils-5.3.0-i18n-0.1.patch deleted file mode 100644 index b07d63d..0000000 --- a/coreutils-5.3.0-i18n-0.1.patch +++ /dev/null @@ -1,4015 +0,0 @@ -Index: lib/linebuffer.h -=================================================================== ---- coreutils-7.1/lib/linebuffer.h.orig 2008-09-18 09:08:01.000000000 +0200 -+++ coreutils-7.1/lib/linebuffer.h 2010-06-29 18:49:31.855522069 +0200 -@@ -21,6 +21,11 @@ - - # include - -+/* Get mbstate_t. */ -+# if HAVE_WCHAR_H -+# include -+# endif -+ - /* A `struct linebuffer' holds a line of text. */ - - struct linebuffer -@@ -28,6 +33,9 @@ struct linebuffer - size_t size; /* Allocated. */ - size_t length; /* Used. */ - char *buffer; -+# if HAVE_WCHAR_H -+ mbstate_t state; -+# endif - }; - - /* Initialize linebuffer LINEBUFFER for use. 
*/ -Index: src/cut.c -=================================================================== ---- coreutils-7.1/src/cut.c.orig 2008-09-18 09:06:57.000000000 +0200 -+++ coreutils-7.1/src/cut.c 2010-06-29 18:49:31.855522069 +0200 -@@ -28,6 +28,12 @@ - #include - #include - #include -+ -+/* Get mbstate_t, mbrtowc(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ - #include "system.h" - - #include "error.h" -@@ -36,6 +42,13 @@ - #include "quote.h" - #include "xstrndup.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "cut" - -@@ -77,6 +90,54 @@ struct range_pair - size_t hi; - }; - -+/* Refill the buffer BUF. */ -+#define REFILL_BUFFER(BUF, BUFPOS, BUFLEN, STREAM) \ -+ do \ -+ { \ -+ if (BUFLEN < MB_LEN_MAX && !feof (STREAM) && !ferror (STREAM)) \ -+ { \ -+ memmove (BUF, BUFPOS, BUFLEN); \ -+ BUFLEN += fread (BUF + BUFLEN, sizeof(char), BUFSIZ, STREAM); \ -+ BUFPOS = BUF; \ -+ } \ -+ } \ -+ while (0) -+ -+/* Get wide character which starts at BUFPOS. If the byte sequence is -+ not valid as a character, CONVFAIL is 1. Otherwise 0. */ -+#define GET_NEXT_WC_FROM_BUFFER(WC, BUFPOS, BUFLEN, MBLENGTH, STATE, CONVFAIL) \ -+ do \ -+ { \ -+ wchar_t tmp; \ -+ mbstate_t state_bak; \ -+ \ -+ if (BUFLEN < 1) \ -+ { \ -+ WC = WEOF; \ -+ break; \ -+ } \ -+ \ -+ /* Get a wide character. */ \ -+ CONVFAIL = 0; \ -+ state_bak = STATE; \ -+ MBLENGTH = mbrtowc (&tmp, BUFPOS, BUFLEN, &STATE); \ -+ WC = tmp; \ -+ \ -+ switch (MBLENGTH) \ -+ { \ -+ case (size_t)-1: \ -+ case (size_t)-2: \ -+ ++CONVFAIL; \ -+ STATE = state_bak; \ -+ /* Fall througn. 
*/ \ -+ \ -+ case 0: \ -+ MBLENGTH = 1; \ -+ break; \ -+ } \ -+ } \ -+ while (0) -+ - /* This buffer is used to support the semantics of the -s option - (or lack of same) when the specified field list includes (does - not include) the first field. In both of those cases, the entire -@@ -89,7 +150,7 @@ static char *field_1_buffer; - /* The number of bytes allocated for FIELD_1_BUFFER. */ - static size_t field_1_bufsize; - --/* The largest field or byte index used as an endpoint of a closed -+/* The largest field, character or byte index used as an endpoint of a closed - or degenerate range specification; this doesn't include the starting - index of right-open-ended ranges. For example, with either range spec - `2-5,9-', `2-3,5,9-' this variable would be set to 5. */ -@@ -101,10 +162,11 @@ static size_t eol_range_start; - - /* This is a bit vector. - In byte mode, which bytes to output. -+ In character mode, which characters to output. - In field mode, which DELIM-separated fields to output. -- Both bytes and fields are numbered starting with 1, -+ Bytes, characters and fields are numbered starting with 1, - so the zeroth bit of this array is unused. -- A field or byte K has been selected if -+ A byte, character or field K has been selected if - (K <= MAX_RANGE_ENDPOINT and is_printable_field(K)) - || (EOL_RANGE_START > 0 && K >= EOL_RANGE_START). */ - static unsigned char *printable_field; -@@ -113,15 +175,25 @@ enum operating_mode - { - undefined_mode, - -- /* Output characters that are in the given bytes. */ -+ /* Output bytes that are in the given bytes. */ - byte_mode, - -+ /* Output characters that are at the given positions. */ -+ character_mode, -+ - /* Output the given delimeter-separated fields. */ - field_mode - }; - - static enum operating_mode operating_mode; - -+/* If true, when in byte mode, don't split multibyte characters. 
*/ -+static bool byte_mode_character_aware; -+ -+/* If true, the function for single byte locale is work -+ if this program runs on multibyte locale. */ -+static bool force_singlebyte_mode; -+ - /* If true do not output lines containing no delimeter characters. - Otherwise, all such lines are printed. This option is valid only - with field mode. */ -@@ -133,6 +205,9 @@ static bool complement; - - /* The delimeter character for field mode. */ - static unsigned char delim; -+#if HAVE_WCHAR_H -+static wchar_t wcdelim; -+#endif - - /* True if the --output-delimiter=STRING option was specified. */ - static bool output_delimiter_specified; -@@ -206,7 +281,7 @@ Mandatory arguments to long options are - -f, --fields=LIST select only these fields; also print any line\n\ - that contains no delimiter character, unless\n\ - the -s option is specified\n\ -- -n (ignored)\n\ -+ -n with -b: don't split multibyte characters\n\ - "), stdout); - fputs (_("\ - --complement complement the set of selected bytes, characters\n\ -@@ -365,7 +440,7 @@ set_fields (const char *fieldstr) - in_digits = false; - /* Starting a range. */ - if (dash_found) -- FATAL_ERROR (_("invalid byte or field list")); -+ FATAL_ERROR (_("invalid byte, character or field list")); - dash_found = true; - fieldstr++; - -@@ -389,7 +464,9 @@ set_fields (const char *fieldstr) - if (!rhs_specified) - { - /* `n-'. From `initial' to end of line. */ -- eol_range_start = initial; -+ if (eol_range_start == 0 -+ || (eol_range_start != 0 && eol_range_start > initial)) -+ eol_range_start = initial; - field_found = true; - } - else -@@ -486,7 +563,7 @@ set_fields (const char *fieldstr) - fieldstr++; - } - else -- FATAL_ERROR (_("invalid byte or field list")); -+ FATAL_ERROR (_("invalid byte, character or field list")); - } - - max_range_endpoint = 0; -@@ -579,6 +656,81 @@ cut_bytes (FILE *stream) - } - } - -+#if HAVE_MBRTOWC -+/* This function is in use for the following case. -+ -+ 1. 
Read from the stream STREAM, printing to standard output any selected -+ characters. -+ -+ 2. Read from stream STREAM, printing to standard output any selected bytes, -+ without splitting multibyte characters. */ -+ -+static void -+cut_characters_or_cut_bytes_no_split (FILE *stream) -+{ -+ size_t idx; /* Number of bytes or characters in the line so far. */ -+ /* Whether to begin printing delimiters between ranges for the current line. -+ Set after we've begun printing data corresponding to the first range. */ -+ bool print_delimiter; -+ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen; /* The length of the byte sequence in buf. */ -+ wint_t wc; /* A gotten wide character. */ -+ size_t mblength; /* The byte size of a multibyte character which shows -+ as same character as WC. */ -+ mbstate_t state; /* State of the stream. */ -+ int convfail; /* 1, when conversion is failed. Otherwise 0. */ -+ -+ -+ idx = 0; -+ print_delimiter = false; -+ buflen = 0; -+ bufpos = buf; -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ while (1) -+ { -+ REFILL_BUFFER (buf, bufpos, buflen, stream); -+ -+ GET_NEXT_WC_FROM_BUFFER (wc, bufpos, buflen, mblength, state, convfail); -+ -+ if (wc == WEOF) -+ { -+ if (idx > 0) -+ putchar ('\n'); -+ break; -+ } -+ else if (wc == L'\n') -+ { -+ putchar ('\n'); -+ idx = 0; -+ print_delimiter = false; -+ } -+ else -+ { -+ bool range_start; -+ bool *rs = output_delimiter_specified ? &range_start : NULL; -+ -+ idx += (operating_mode == byte_mode) ? 
mblength : 1; -+ if (print_kth (idx, rs)) -+ { -+ if (rs && *rs && print_delimiter) -+ { -+ fwrite (output_delimiter_string, sizeof (char), -+ output_delimiter_length, stdout); -+ } -+ print_delimiter = true; -+ fwrite (bufpos, mblength, sizeof (char), stdout); -+ } -+ } -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+} -+#endif -+ - /* Read from stream STREAM, printing to standard output any selected fields. */ - - static void -@@ -701,13 +853,190 @@ cut_fields (FILE *stream) - } - } - -+#if HAVE_MBRTOWC -+static void -+cut_fields_mb (FILE *stream) -+{ -+ int c; -+ size_t field_idx = 1; -+ bool found_any_selected_field = false; -+ bool buffer_first_field; -+ int empty_input; -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen; /* The length of the byte sequence in buf. */ -+ wint_t wc; /* A gotten wide character. */ -+ size_t mblength; /* The byte size of a multibyte character which shows -+ as same character as WC. */ -+ mbstate_t state; /* State of the stream. */ -+ int convfail; /* 1, when conversion is failed. Otherwise 0. */ -+ -+ bufpos = buf; -+ buflen = 0; -+ memset (&state, '\0', sizeof (mbstate_t)); -+ -+ c = getc (stream); -+ empty_input = (c == EOF); -+ if (c != EOF) -+ ungetc (c, stream); -+ else -+ wc = WEOF; -+ -+ /* To support the semantics of the -s flag, we may have to buffer -+ all of the first field to determine whether it is `delimited.' -+ But that is unnecessary if all non-delimited lines must be printed -+ and the first field has been selected, or if non-delimited lines -+ must be suppressed and the first field has *not* been selected. -+ That is because a non-delimited line has exactly one field. 
*/ -+ buffer_first_field = (suppress_non_delimited ^ !print_kth (1, NULL)); -+ -+ while (1) -+ { -+ if (field_idx == 1 && buffer_first_field) -+ { -+ size_t n_bytes = 0; -+ -+ while (1) -+ { -+ REFILL_BUFFER (buf, bufpos, buflen, stream); -+ -+ GET_NEXT_WC_FROM_BUFFER -+ (wc, bufpos, buflen, mblength, state, convfail); -+ -+ if (wc == WEOF) -+ break; -+ -+ field_1_buffer = xrealloc (field_1_buffer, n_bytes + mblength); -+ memcpy (field_1_buffer + n_bytes, bufpos, mblength); -+ n_bytes += mblength; -+ buflen -= mblength; -+ bufpos += mblength; -+ -+ if (!convfail && (wc == L'\n' || wc == wcdelim)) -+ break; -+ } -+ -+ if (wc == WEOF) -+ break; -+ -+ /* If the first field extends to the end of line (it is not -+ delimited) and we are printing all non-delimited lines, -+ print this one. */ -+ if (convfail || (!convfail && wc != wcdelim)) -+ { -+ if (suppress_non_delimited) -+ { -+ /* Empty. */ -+ } -+ else -+ { -+ fwrite (field_1_buffer, sizeof (char), n_bytes, stdout); -+ /* Make sure the output line is newline terminated. */ -+ if (convfail || (!convfail && wc != L'\n')) -+ putchar ('\n'); -+ } -+ continue; -+ } -+ -+ if (print_kth (1, NULL)) -+ { -+ /* Print the field, but not the trailing delimiter. 
*/ -+ fwrite (field_1_buffer, sizeof (char), n_bytes - 1, stdout); -+ found_any_selected_field = true; -+ } -+ ++field_idx; -+ } -+ -+ if (wc != WEOF) -+ { -+ if (print_kth (field_idx, NULL)) -+ { -+ if (found_any_selected_field) -+ { -+ fwrite (output_delimiter_string, sizeof (char), -+ output_delimiter_length, stdout); -+ } -+ found_any_selected_field = true; -+ } -+ -+ while (1) -+ { -+ REFILL_BUFFER (buf, bufpos, buflen, stream); -+ -+ GET_NEXT_WC_FROM_BUFFER -+ (wc, bufpos, buflen, mblength, state, convfail); -+ -+ if (wc == WEOF) -+ break; -+ else if (!convfail && (wc == wcdelim || wc == L'\n')) -+ { -+ buflen -= mblength; -+ bufpos += mblength; -+ break; -+ } -+ -+ if (print_kth (field_idx, NULL)) -+ fwrite (bufpos, mblength, sizeof (char), stdout); -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+ } -+ -+ if ((!convfail || wc == L'\n') && buflen < 1) -+ wc = WEOF; -+ -+ if (!convfail && wc == wcdelim) -+ ++field_idx; -+ else if (wc == WEOF || (!convfail && wc == L'\n')) -+ { -+ if (found_any_selected_field -+ || (!empty_input && !(suppress_non_delimited && field_idx == 1))) -+ putchar ('\n'); -+ if (wc == WEOF) -+ break; -+ field_idx = 1; -+ found_any_selected_field = false; -+ } -+ } -+} -+#endif -+ - static void - cut_stream (FILE *stream) - { -- if (operating_mode == byte_mode) -- cut_bytes (stream); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) -+ { -+ switch (operating_mode) -+ { -+ case byte_mode: -+ if (byte_mode_character_aware) -+ cut_characters_or_cut_bytes_no_split (stream); -+ else -+ cut_bytes (stream); -+ break; -+ -+ case character_mode: -+ cut_characters_or_cut_bytes_no_split (stream); -+ break; -+ -+ case field_mode: -+ cut_fields_mb (stream); -+ break; -+ -+ default: -+ abort (); -+ } -+ } - else -- cut_fields (stream); -+#endif -+ { -+ if (operating_mode == field_mode) -+ cut_fields (stream); -+ else -+ cut_bytes (stream); -+ } - } - - /* Process file FILE to standard output. 
-@@ -757,6 +1086,8 @@ main (int argc, char **argv) - bool ok; - bool delim_specified = false; - char *spec_list_string IF_LINT(= NULL); -+ char mbdelim[MB_LEN_MAX + 1]; -+ size_t delimlen = 0; - - initialize_main (&argc, &argv); - set_program_name (argv[0]); -@@ -779,7 +1110,6 @@ main (int argc, char **argv) - switch (optc) - { - case 'b': -- case 'c': - /* Build the byte list. */ - if (operating_mode != undefined_mode) - FATAL_ERROR (_("only one type of list may be specified")); -@@ -787,6 +1117,14 @@ main (int argc, char **argv) - spec_list_string = optarg; - break; - -+ case 'c': -+ /* Build the character list. */ -+ if (operating_mode != undefined_mode) -+ FATAL_ERROR (_("only one type of list may be specified")); -+ operating_mode = character_mode; -+ spec_list_string = optarg; -+ break; -+ - case 'f': - /* Build the field list. */ - if (operating_mode != undefined_mode) -@@ -798,9 +1136,32 @@ main (int argc, char **argv) - case 'd': - /* New delimiter. */ - /* Interpret -d '' to mean `use the NUL byte as the delimiter.' */ -- if (optarg[0] != '\0' && optarg[1] != '\0') -- FATAL_ERROR (_("the delimiter must be a single character")); -- delim = optarg[0]; -+#if HAVE_MBRTOWC -+ if(MB_CUR_MAX > 1) -+ { -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ delimlen = mbrtowc (&wcdelim, optarg, MB_LEN_MAX, &state); -+ -+ if (delimlen == (size_t)-1 || delimlen == (size_t)-2) -+ force_singlebyte_mode = true; -+ else -+ { -+ delimlen = (delimlen < 1) ? 
1 : delimlen; -+ if (wcdelim != L'\0' && *(optarg + delimlen) != '\0') -+ FATAL_ERROR (_("the delimiter must be a single character")); -+ memcpy (mbdelim, optarg, delimlen); -+ } -+ } -+ -+ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) -+#endif -+ { -+ if (optarg[0] != '\0' && optarg[1] != '\0') -+ FATAL_ERROR (_("the delimiter must be a single character")); -+ delim = (unsigned char) optarg[0]; -+ } - delim_specified = true; - break; - -@@ -814,6 +1175,7 @@ main (int argc, char **argv) - break; - - case 'n': -+ byte_mode_character_aware = true; - break; - - case 's': -@@ -836,7 +1198,7 @@ main (int argc, char **argv) - if (operating_mode == undefined_mode) - FATAL_ERROR (_("you must specify a list of bytes, characters, or fields")); - -- if (delim != '\0' && operating_mode != field_mode) -+ if (delim_specified && operating_mode != field_mode) - FATAL_ERROR (_("an input delimiter may be specified only\ - when operating on fields")); - -@@ -863,15 +1225,34 @@ main (int argc, char **argv) - } - - if (!delim_specified) -- delim = '\t'; -+ { -+ delim = '\t'; -+#ifdef HAVE_MBRTOWC -+ wcdelim = L'\t'; -+ mbdelim[0] = '\t'; -+ mbdelim[1] = '\0'; -+ delimlen = 1; -+ } -+#endif - - if (output_delimiter_string == NULL) - { -- static char dummy[2]; -- dummy[0] = delim; -- dummy[1] = '\0'; -- output_delimiter_string = dummy; -- output_delimiter_length = 1; -+#ifdef HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) -+ { -+ output_delimiter_string = xstrdup (mbdelim); -+ output_delimiter_length = delimlen; -+ } -+ -+ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) -+#endif -+ { -+ static char dummy[2]; -+ dummy[0] = delim; -+ dummy[1] = '\0'; -+ output_delimiter_string = dummy; -+ output_delimiter_length = 1; -+ } - } - - if (optind == argc) -Index: src/expand.c -=================================================================== ---- coreutils-7.1/src/expand.c.orig 2008-11-10 14:17:52.000000000 +0100 -+++ coreutils-7.1/src/expand.c 2010-06-29 18:49:31.871522014 +0200 
-@@ -37,11 +37,31 @@ - #include - #include - #include -+ -+/* Get mbstate_t, mbrtowc, wcwidth. */ -+#if HAVE_WCHAR_H -+# include -+#endif -+#if HAVE_WCTYPE_H -+# include -+#endif -+ - #include "system.h" - #include "error.h" - #include "quote.h" - #include "xstrndup.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "expand" - -@@ -343,9 +363,12 @@ expand (void) - } - else - { -- column++; -- if (!column) -- error (EXIT_FAILURE, 0, _("input line is too long")); -+ if (!iscntrl (c)) -+ { -+ column++; -+ if (!column) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ } - } - - convert &= convert_entire_line | !! isblank (c); -@@ -361,6 +384,165 @@ expand (void) - } - } - -+#if HAVE_MBRTOWC && HAVE_WCTYPE_H -+static void -+expand_multibyte (void) -+{ -+ /* Input stream. */ -+ FILE *fp = next_file (NULL); -+ -+ mbstate_t i_state; /* Current shift state of the input stream. */ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen = 0; /* The length of the byte sequence in buf. */ -+ -+ if (!fp) -+ return; -+ -+ memset (&i_state, '\0', sizeof (mbstate_t)); -+ -+ for (;;) -+ { -+ /* Input character, or EOF. */ -+ wint_t wc; -+ -+ /* If true, perform translations. */ -+ bool convert = true; -+ -+ -+ /* The following variables have valid values only when CONVERT -+ is true: */ -+ -+ /* Column of next input character. */ -+ uintmax_t column = 0; -+ -+ /* Index in TAB_LIST of next tab stop to examine. 
*/ -+ size_t tab_index = 0; -+ -+ -+ /* Convert a line of text. */ -+ -+ do -+ { -+ wchar_t w; -+ size_t mblength; /* The byte size of a multibyte character -+ which shows as same character as WC. */ -+ mbstate_t i_state_bak; /* Back up the I_STATE. */ -+ -+ /* Fill buffer */ -+ if (buflen < MB_LEN_MAX) -+ { -+ if (!feof(fp) && !ferror(fp)) -+ { -+ if (buflen > 0) -+ memmove (buf, bufpos, buflen); -+ buflen += fread (buf + buflen, sizeof (char), BUFSIZ, fp); -+ bufpos = buf; -+ } -+ } -+ -+ if (buflen < 1) -+ { -+ /* Move to the next file */ -+ if (feof (fp) || ferror (fp)) -+ fp = next_file(fp); -+ if (!fp) -+ return; -+ memset (&i_state, '\0', sizeof (mbstate_t)); -+ continue; -+ } -+ -+ i_state_bak = i_state; -+ mblength = mbrtowc (&w, bufpos, buflen, &i_state); -+ wc = w; -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ i_state = i_state_bak; -+ wc = L'\0'; -+ column += convert; -+ mblength = 1; -+ } -+ -+ if (convert) -+ { -+ if (wc == L'\t') -+ { -+ /* Column the next input tab stop is on. */ -+ uintmax_t next_tab_column; -+ -+ if (tab_size) -+ next_tab_column = column + (tab_size - column % tab_size); -+ else -+ for (;;) -+ if (tab_index == first_free_tab) -+ { -+ next_tab_column = column + 1; -+ break; -+ } -+ else -+ { -+ uintmax_t tab = tab_list[tab_index++]; -+ if (column < tab) -+ { -+ next_tab_column = tab; -+ break; -+ } -+ } -+ -+ if (next_tab_column < column) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ -+ while (++column < next_tab_column) -+ if (putchar (' ') < 0) -+ error (EXIT_FAILURE, errno, _("write error")); -+ -+ *bufpos = ' '; -+ } -+ else if (wc == L'\b') -+ { -+ /* Go back one column, and force recalculation of the -+ next tab stop. 
*/ -+ column -= !!column; -+ tab_index -= !!tab_index; -+ } -+ else -+ { -+ if (!iswcntrl (wc)) -+ { -+ int width = wcwidth (wc); -+ if (width > 0) -+ { -+ if (column > (column + width)) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ column += width; -+ } -+ } -+ } -+ -+ convert &= convert_entire_line | iswblank (wc); -+ } -+ -+ if (mblength) -+ { -+ if (fwrite (bufpos, sizeof (char), mblength, stdout) < mblength) -+ error (EXIT_FAILURE, errno, _("write error")); -+ } -+ else -+ { -+ if (putchar ('\0')) -+ error (EXIT_FAILURE, errno, _("write error")); -+ mblength = 1; -+ } -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+ while (wc != L'\n'); -+ } -+} -+#endif -+ - int - main (int argc, char **argv) - { -@@ -425,7 +607,12 @@ main (int argc, char **argv) - - file_list = (optind < argc ? &argv[optind] : stdin_argv); - -- expand (); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ expand_multibyte (); -+ else -+#endif -+ expand (); - - if (have_read_stdin && fclose (stdin) != 0) - error (EXIT_FAILURE, errno, "-"); -Index: src/fold.c -=================================================================== ---- coreutils-7.1/src/fold.c.orig 2008-09-18 09:06:57.000000000 +0200 -+++ coreutils-7.1/src/fold.c 2010-06-29 18:49:31.896029818 +0200 -@@ -22,6 +22,19 @@ - #include - #include - -+/* Get MB_CUR_MAX. */ -+#include -+ -+/* Get mbrtowc, mbstate_t, wcwidth(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswprint(), iswctype(), wctype(). */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ - #include "system.h" - #include "error.h" - #include "quote.h" -@@ -29,11 +42,54 @@ - - #define TAB_WIDTH 8 - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. 
*/ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ -+#ifndef HAVE_DECL_WCWIDTH -+"this configure-time declaration test was not run" -+#endif -+#if !HAVE_DECL_WCWIDTH -+extern int wcwidth (); -+#endif -+ -+/* If wcwidth() doesn't exist, assume all printable characters have -+ width 1. */ -+#if !defined wcwidth && !HAVE_WCWIDTH -+# define wcwidth(wc) ((wc) == 0 ? 0 : iswprint (wc) ? 1 : -1) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "fold" - - #define AUTHORS proper_name ("David MacKenzie") - -+#define FATAL_ERROR(Message) \ -+ do \ -+ { \ -+ error (0, 0, (Message)); \ -+ usage (2); \ -+ } \ -+ while (0) -+ -+enum operating_mode -+{ -+ /* Fold texts by columns that are at the given positions. */ -+ column_mode, -+ -+ /* Fold texts by bytes that are at the given positions. */ -+ byte_mode, -+ -+ /* Fold texts by characters that are at the given positions. */ -+ character_mode, -+}; -+ -+/* The argument shows current mode. (Default: column_mode) */ -+static enum operating_mode operating_mode; -+ - /* If nonzero, try to break on whitespace. */ - static bool break_spaces; - -@@ -43,11 +99,17 @@ static bool count_bytes; - /* If nonzero, at least one of the files we read was standard input. 
*/ - static bool have_read_stdin; - --static char const shortopts[] = "bsw:0::1::2::3::4::5::6::7::8::9::"; -+static char const shortopts[] = "bcsw:0::1::2::3::4::5::6::7::8::9::"; -+ -+/* wide character class `blank' */ -+#if HAVE_MBRTOWC -+wctype_t blank_type; -+#endif - - static struct option const longopts[] = - { - {"bytes", no_argument, NULL, 'b'}, -+ {"characters", no_argument, NULL, 'c'}, - {"spaces", no_argument, NULL, 's'}, - {"width", required_argument, NULL, 'w'}, - {GETOPT_HELP_OPTION_DECL}, -@@ -77,6 +139,7 @@ Mandatory arguments to long options are - "), stdout); - fputs (_("\ - -b, --bytes count bytes rather than columns\n\ -+ -c, --characters count characters rather than columns\n\ - -s, --spaces break at spaces\n\ - -w, --width=WIDTH use WIDTH columns instead of 80\n\ - "), stdout); -@@ -94,7 +157,7 @@ Mandatory arguments to long options are - static size_t - adjust_column (size_t column, char c) - { -- if (!count_bytes) -+ if (operating_mode != byte_mode) - { - if (c == '\b') - { -@@ -113,14 +176,9 @@ adjust_column (size_t column, char c) - return column; - } - --/* Fold file FILENAME, or standard input if FILENAME is "-", -- to stdout, with maximum line length WIDTH. -- Return true if successful. */ -- --static bool --fold_file (char const *filename, size_t width) -+static int -+fold_text (FILE *istream, size_t width) - { -- FILE *istream; - int c; - size_t column = 0; /* Screen column where next char will go. */ - size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ -@@ -128,20 +186,6 @@ fold_file (char const *filename, size_t - static size_t allocated_out = 0; - int saved_errno; - -- if (STREQ (filename, "-")) -- { -- istream = stdin; -- have_read_stdin = true; -- } -- else -- istream = fopen (filename, "r"); -- -- if (istream == NULL) -- { -- error (0, errno, "%s", filename); -- return false; -- } -- - while ((c = getc (istream)) != EOF) - { - if (offset_out + 1 >= allocated_out) -@@ -219,6 +263,234 @@ fold_file (char const *filename, size_t - if (offset_out) - fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); - -+ return saved_errno; -+} -+ -+#if HAVE_MBRTOWC -+static void -+fold_multibyte_text (FILE *istream, size_t width) -+{ -+ int i; -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ size_t buflen; /* The length of the byte sequence in buf. */ -+ char *bufpos; /* Next read position of BUF. */ -+ wint_t wc; /* A gotten wide character. */ -+ wchar_t tmp; -+ size_t mblength; /* The byte size of a multibyte character which shows -+ as same character as WC. */ -+ mbstate_t state, state_bak; /* State of the stream. */ -+ int convfail; /* 1, when conversion is failed. Otherwise 0. */ -+ -+ char *line_out = NULL; -+ size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ -+ size_t allocated_out = 1024; -+ -+ int increment; -+ size_t column = 0; -+ -+ size_t last_blank_pos; -+ size_t last_blank_column; -+ int is_blank_seen; -+ int last_blank_increment; -+ int is_bs_following_last_blank; -+ size_t bs_following_last_blank_num; -+ int is_cr_after_last_blank; -+ -+ -+#define CLEAR_FLAGS \ -+ do \ -+ { \ -+ last_blank_pos = 0; \ -+ last_blank_column = 0; \ -+ is_blank_seen = 0; \ -+ is_bs_following_last_blank = 0; \ -+ bs_following_last_blank_num = 0; \ -+ is_cr_after_last_blank = 0; \ -+ } \ -+ while (0) -+ -+#define START_NEW_LINE \ -+ do \ -+ { \ -+ putchar ('\n'); \ -+ column = 0; \ -+ offset_out = 0; \ -+ CLEAR_FLAGS; \ -+ } \ -+ while (0) -+ -+ CLEAR_FLAGS; -+ -+ memset (&state, '\0', sizeof (mbstate_t)); -+ line_out = xmalloc (allocated_out); -+ -+ buflen = fread (buf, sizeof (char), BUFSIZ, istream); -+ bufpos = buf; -+ -+ for (;; bufpos += mblength, buflen -= mblength) -+ { -+ if (buflen < MB_LEN_MAX && !feof (istream) && !ferror (istream)) -+ { -+ memmove (buf, bufpos, buflen); -+ buflen += fread (buf + buflen, sizeof (char), BUFSIZ, istream); -+ bufpos = buf; -+ } -+ -+ if (buflen < 1) -+ break; -+ -+ /* Get a wide character. */ -+ convfail = 0; -+ state_bak = state; -+ mblength = mbrtowc (&tmp, bufpos, buflen, &state); -+ wc = tmp; -+ -+ switch (mblength) -+ { -+ case (size_t)-1: -+ case (size_t)-2: -+ convfail++; -+ state = state_bak; -+ /* Fall through. */ -+ -+ case 0: -+ mblength = 1; -+ break; -+ } -+ -+ if (!convfail && wc == L'\n') -+ { -+ if (offset_out > 0) -+ { -+ fwrite (line_out, sizeof (char), offset_out, stdout); -+ START_NEW_LINE; -+ } -+ continue; -+ } -+ -+ rescan: -+ if (operating_mode == byte_mode) /* byte mode */ -+ increment = mblength; -+ else if (operating_mode == character_mode) /* character mode */ -+ increment = 1; -+ else /* column mode */ -+ { -+ if (convfail) -+ increment = 1; -+ else -+ { -+ switch (wc) -+ { -+ case L'\b': -+ increment = (column > 0) ? 
-1 : 0; -+ break; -+ -+ case L'\r': -+ increment = -1 * column; -+ break; -+ -+ case L'\t': -+ increment = 8 - column % 8; -+ break; -+ -+ default: -+ increment = wcwidth (wc); -+ increment = (increment < 0) ? 0 : increment; -+ } -+ } -+ } -+ -+ if (column + increment > width && break_spaces && last_blank_pos) -+ { -+ fwrite (line_out, sizeof (char), last_blank_pos, stdout); -+ putchar ('\n'); -+ -+ offset_out = offset_out - last_blank_pos; -+ column = (column - last_blank_column -+ + (is_cr_after_last_blank -+ ? last_blank_increment : bs_following_last_blank_num)); -+ memmove (line_out, line_out + last_blank_pos, offset_out); -+ CLEAR_FLAGS; -+ goto rescan; -+ } -+ -+ if (column + increment > width && column != 0) -+ { -+ fwrite (line_out, sizeof (char), offset_out, stdout); -+ START_NEW_LINE; -+ goto rescan; -+ } -+ -+ if (allocated_out < offset_out + mblength) -+ line_out = x2nrealloc (line_out, &allocated_out, sizeof *line_out); -+ -+ for (i = 0; i < mblength; i++) -+ { -+ line_out[offset_out] = bufpos[i]; -+ ++offset_out; -+ } -+ -+ column += increment; -+ -+ if (is_blank_seen && !convfail && wc == L'\r') -+ is_cr_after_last_blank = 1; -+ -+ if (is_bs_following_last_blank && !convfail && wc == L'\b') -+ ++bs_following_last_blank_num; -+ else -+ is_bs_following_last_blank = 0; -+ -+ if (break_spaces && !convfail && iswctype (wc, blank_type)) -+ { -+ last_blank_pos = offset_out; -+ last_blank_column = column; -+ is_blank_seen = 1; -+ last_blank_increment = increment; -+ is_bs_following_last_blank = 1; -+ bs_following_last_blank_num = 0; -+ is_cr_after_last_blank = 0; -+ } -+ } -+ -+ if (offset_out) -+ fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); -+ -+ free(line_out); -+} -+#endif -+ -+/* Fold file FILENAME, or standard input if FILENAME is "-", -+ to stdout, with maximum line length WIDTH. -+ Return true if successful. 
*/ -+ -+static bool -+fold_file (char const *filename, size_t width) -+{ -+ FILE *istream; -+ int saved_errno; -+ -+ if (STREQ (filename, "-")) -+ { -+ istream = stdin; -+ have_read_stdin = true; -+ } -+ else -+ istream = fopen (filename, "r"); -+ -+ if (istream == NULL) -+ { -+ error (0, errno, "%s", filename); -+ return false; -+ } -+ -+ /* Define how ISTREAM is being folded. */ -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ fold_multibyte_text (istream, width); -+ else -+#endif -+ saved_errno = fold_text (istream, width); -+ - if (ferror (istream)) - { - error (0, saved_errno, "%s", filename); -@@ -251,6 +523,10 @@ main (int argc, char **argv) - - atexit (close_stdout); - -+#if HAVE_MBRTOWC -+ blank_type = wctype ("blank"); -+#endif -+ operating_mode = column_mode; - break_spaces = count_bytes = have_read_stdin = false; - - while ((optc = getopt_long (argc, argv, shortopts, longopts, NULL)) != -1) -@@ -260,7 +536,15 @@ main (int argc, char **argv) - switch (optc) - { - case 'b': /* Count bytes rather than columns. */ -- count_bytes = true; -+ if (operating_mode != column_mode) -+ FATAL_ERROR (_("only one way of folding may be specified")); -+ operating_mode = byte_mode; -+ break; -+ -+ case 'c': /* Count characters rather than columns. */ -+ if (operating_mode != column_mode) -+ FATAL_ERROR (_("only one way of folding may be specified")); -+ operating_mode = character_mode; - break; - - case 's': /* Break at word boundaries. */ -Index: src/join.c -=================================================================== ---- coreutils-7.1/src/join.c.orig 2008-11-10 14:17:52.000000000 +0100 -+++ coreutils-7.1/src/join.c 2010-06-29 18:49:31.923528009 +0200 -@@ -22,6 +22,16 @@ - #include - #include - -+/* Get mbstate_t, mbrtowc, mbrtowc, wcwidth. */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswblank, towupper. 
*/ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ - #include "system.h" - #include "error.h" - #include "linebuffer.h" -@@ -32,6 +42,11 @@ - #include "xstrtol.h" - #include "argmatch.h" - -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "join" - -@@ -120,10 +135,13 @@ static struct outlist outlist_head; - /* Last element in `outlist', where a new element can be added. */ - static struct outlist *outlist_end = &outlist_head; - --/* Tab character separating fields. If negative, fields are separated -+/* Tab character separating fields. If NULL, fields are separated - by any nonempty string of blanks, otherwise by exactly one - tab character whose value (when cast to unsigned char) equals TAB. */ --static int tab = -1; -+static const char *tab = NULL; -+ -+/* The number of bytes used for tab. */ -+static size_t tablen = 0; - - /* If nonzero, check that the input is correctly ordered. */ - static enum -@@ -237,10 +255,10 @@ xfields (struct line *line) - if (ptr == lim) - return; - -- if (0 <= tab) -+ if (tab != NULL) - { - char *sep; -- for (; (sep = memchr (ptr, tab, lim - ptr)) != NULL; ptr = sep + 1) -+ for (; (sep = memchr (ptr, tab[0], lim - ptr)) != NULL; ptr = sep + 1) - extract_field (line, ptr, sep - ptr); - } - else -@@ -285,56 +303,115 @@ keycmp (struct line const *line1, struct - size_t jf_1, size_t jf_2) - { - /* Start of field to compare in each file. */ -- char *beg1; -- char *beg2; -- -- size_t len1; -- size_t len2; /* Length of fields to compare. */ -+ char *beg[2]; -+ char *copy[2]; -+ size_t len[2]; /* Length of fields to compare. 
*/ - int diff; -+ int i, j; - - if (jf_1 < line1->nfields) - { -- beg1 = line1->fields[jf_1].beg; -- len1 = line1->fields[jf_1].len; -+ beg[0] = line1->fields[jf_1].beg; -+ len[0] = line1->fields[jf_1].len; - } - else - { -- beg1 = NULL; -- len1 = 0; -+ beg[0] = NULL; -+ len[0] = 0; - } - - if (jf_2 < line2->nfields) - { -- beg2 = line2->fields[jf_2].beg; -- len2 = line2->fields[jf_2].len; -+ beg[1] = line2->fields[jf_2].beg; -+ len[1] = line2->fields[jf_2].len; - } - else - { -- beg2 = NULL; -- len2 = 0; -+ beg[1] = NULL; -+ len[1] = 0; - } - -- if (len1 == 0) -- return len2 == 0 ? 0 : -1; -- if (len2 == 0) -+ if (len[0] == 0) -+ return len[1] == 0 ? 0 : -1; -+ if (len[1] == 0) - return 1; - - if (ignore_case) - { -- /* FIXME: ignore_case does not work with NLS (in particular, -- with multibyte chars). */ -- diff = memcasecmp (beg1, beg2, MIN (len1, len2)); -+#ifdef HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ size_t mblength; -+ wchar_t wc, uwc; -+ mbstate_t state, state_bak; -+ -+ memset (&state, '\0', sizeof (mbstate_t)); -+ -+ for (i = 0; i < 2; i++) -+ { -+ copy[i] = alloca (len[i] + 1); -+ -+ for (j = 0; j < MIN (len[0], len[1]);) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, beg[i] + j, len[i] - j, &state); -+ -+ switch (mblength) -+ { -+ case (size_t) -1: -+ case (size_t) -2: -+ state = state_bak; -+ /* Fall through */ -+ case 0: -+ mblength = 1; -+ break; -+ -+ default: -+ uwc = towupper (wc); -+ -+ if (uwc != wc) -+ { -+ mbstate_t state_wc; -+ -+ memset (&state_wc, '\0', sizeof (mbstate_t)); -+ wcrtomb (copy[i] + j, uwc, &state_wc); -+ } -+ else -+ memcpy (copy[i] + j, beg[i] + j, mblength); -+ } -+ j += mblength; -+ } -+ copy[i][j] = '\0'; -+ } -+ return xmemcoll (copy[0], len[0], copy[1], len[1]); -+ } -+#endif -+ if (hard_LC_COLLATE) -+ { -+ for (i = 0; i < 2; i++) -+ { -+ copy[i] = alloca (len[i] + 1); -+ -+ for (j = 0; j < MIN (len[0], len[1]); j++) -+ copy[i][j] = toupper (beg[i][j]); -+ -+ copy[i][j] = '\0'; -+ } -+ return xmemcoll 
(copy[0], len[0], copy[1], len[1]); -+ } -+ else -+ diff = memcasecmp (beg[0], beg[1], MIN (len[0], len[1])); - } - else - { - if (hard_LC_COLLATE) -- return xmemcoll (beg1, len1, beg2, len2); -- diff = memcmp (beg1, beg2, MIN (len1, len2)); -+ return xmemcoll (beg[0], len[0], beg[1], len[1]); -+ diff = memcmp (beg[0], beg[1], MIN (len[0], len[1])); - } - - if (diff) - return diff; -- return len1 < len2 ? -1 : len1 != len2; -+ return len[0] < len[1] ? -1 : len[0] != len[1]; - } - - /* Check that successive input lines PREV and CURRENT from input file -@@ -388,6 +465,133 @@ init_linep (struct line **linep) - return line; - } - -+#if HAVE_MBRTOWC -+static void -+xfields_multibyte (struct line *line) -+{ -+ int i; -+ char *ptr0 = line->buf.buffer; -+ char *ptr; -+ char *lim; -+ wchar_t wc = 0; -+ size_t mblength; -+ mbstate_t state, state_bak; -+ -+ memset (&state, 0, sizeof (mbstate_t)); -+ -+ ptr = ptr0; -+ lim = ptr0 + line->buf.length - 1; -+ -+ if (tab == NULL) -+ { -+ /* Skip leading blanks before the first field. */ -+ while (ptr < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ mblength = 1; -+ state = state_bak; -+ break; -+ } -+ mblength = (mblength < 1) ? 1 : mblength; -+ -+ if (!iswblank (wc)) -+ break; -+ ptr += mblength; -+ } -+ } -+ -+ for (i = 0; ptr < lim; ++i) -+ { -+ if (tab != NULL) -+ { -+ char *beg = ptr; -+ while (ptr < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ mblength = 1; -+ state = state_bak; -+ } -+ mblength = (mblength < 1) ? 
1 : mblength; -+ -+ if (mblength == tablen && !memcmp (ptr, tab, mblength)) -+ break; -+ else -+ { -+ ptr += mblength; -+ continue; -+ } -+ } -+ -+ extract_field (line, beg, ptr - beg); -+ if (ptr < lim) -+ ptr += mblength; -+ } -+ else -+ { -+ char *beg = ptr; -+ while (ptr < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ mblength = 1; -+ state = state_bak; -+ } -+ mblength = (mblength < 1) ? 1 : mblength; -+ -+ if (iswblank (wc)) -+ break; -+ else -+ { -+ ptr += mblength; -+ continue; -+ } -+ } -+ -+ extract_field (line, beg, ptr - beg); -+ if (ptr < lim) -+ ptr += mblength; -+ } -+ } -+ -+ if (ptr != ptr0) -+ { -+ mblength = mbrtowc (&wc, ptr - mblength, mblength, &state); -+ wc = (mbsinit (&state) && *(ptr - mblength) == '\0') ? L'\0' : wc; -+ if (tab != NULL) -+ { -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ mblength = 1; -+ -+ if (mblength == tablen && !memcmp (ptr - mblength, tab, mblength)) -+ /* Add one more (empty) field because the last character of -+ the line was a delimiter. */ -+ extract_field (line, NULL, 0); -+ } -+ else -+ { -+ if (mblength != (size_t) -1 && mblength != (size_t) -2) -+ { -+ if (iswblank (wc)) -+ /* Add one more (empty) field because the last character of -+ the line was a delimiter. */ -+ extract_field (line, NULL, 0); -+ } -+ } -+ } -+} -+#endif -+ - /* Read a line from FP into LINE and split it into fields. - Return true if successful. */ - -@@ -415,7 +619,12 @@ get_line (FILE *fp, struct line **linep, - return false; - } - -- xfields (line); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ xfields_multibyte (line); -+ else -+#endif -+ xfields (line); - - if (prevline[which - 1]) - check_order (prevline[which - 1], line, which); -@@ -520,7 +729,8 @@ static void - prjoin (struct line const *line1, struct line const *line2) - { - const struct outlist *outlist; -- char output_separator = tab < 0 ? 
' ' : tab; -+ const char *output_separator = tab == NULL ? " " : tab; -+ size_t output_separator_len = tab == NULL ? 1 : tablen; - - outlist = outlist_head.next; - if (outlist) -@@ -555,7 +765,7 @@ prjoin (struct line const *line1, struct - o = o->next; - if (o == NULL) - break; -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - } - putchar ('\n'); - } -@@ -573,23 +783,23 @@ prjoin (struct line const *line1, struct - prfield (join_field_1, line1); - for (i = 0; i < join_field_1 && i < line1->nfields; ++i) - { -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - prfield (i, line1); - } - for (i = join_field_1 + 1; i < line1->nfields; ++i) - { -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - prfield (i, line1); - } - - for (i = 0; i < join_field_2 && i < line2->nfields; ++i) - { -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - prfield (i, line2); - } - for (i = join_field_2 + 1; i < line2->nfields; ++i) - { -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - prfield (i, line2); - } - putchar ('\n'); -@@ -1020,20 +1230,40 @@ main (int argc, char **argv) - - case 't': - { -- unsigned char newtab = optarg[0]; -- if (! newtab) -+ const char *newtab = optarg; -+ size_t newtablen; -+ if (! 
newtab[0]) - error (EXIT_FAILURE, 0, _("empty tab")); -- if (optarg[1]) -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ mbstate_t state; -+ -+ memset (&state, 0, sizeof (mbstate_t)); -+ newtablen = mbrtowc (NULL, newtab, strlen (newtab), &state); -+ if (newtablen == (size_t) 0 -+ || newtablen == (size_t) -1 || newtablen == (size_t) -2) -+ newtablen = 1; -+ } -+ else -+#endif -+ newtablen = 1; -+ if (optarg[newtablen]) - { - if (STREQ (optarg, "\\0")) -- newtab = '\0'; -+ { -+ newtab = "\0"; -+ newtablen = 1; -+ } - else - error (EXIT_FAILURE, 0, _("multi-character tab %s"), - quote (optarg)); - } -- if (0 <= tab && tab != newtab) -+ if (tab != NULL -+ && (tablen != newtablen || memcmp (tab, newtab, tablen) != 0)) - error (EXIT_FAILURE, 0, _("incompatible tabs")); - tab = newtab; -+ tablen = newtablen; - } - break; - -Index: src/pr.c -=================================================================== ---- coreutils-7.1/src/pr.c.orig 2009-01-27 22:11:25.000000000 +0100 -+++ coreutils-7.1/src/pr.c 2010-06-29 18:49:31.931969742 +0200 -@@ -312,6 +312,32 @@ - - #include - #include -+ -+/* Get MB_LEN_MAX. */ -+#include -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX == 1 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Get MB_CUR_MAX. */ -+#include -+ -+/* Solaris 2.5 has a bug: must be included before . */ -+/* Get mbstate_t, mbrtowc(), wcwidth(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswprint(). -- for wcwidth(). */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+#if !defined iswprint && !HAVE_ISWPRINT -+# define iswprint(wc) 1 -+#endif -+ - #include "system.h" - #include "error.h" - #include "mbswidth.h" -@@ -321,6 +347,18 @@ - #include "strftime.h" - #include "xstrtol.h" - -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. 
*/ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ -+#ifndef HAVE_DECL_WCWIDTH -+"this configure-time declaration test was not run" -+#endif -+#if !HAVE_DECL_WCWIDTH -+extern int wcwidth (); -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "pr" - -@@ -414,8 +452,21 @@ struct COLUMN - typedef struct COLUMN COLUMN; - - #define NULLCOL (COLUMN *)0 -+ -+/* Funtion pointers to switch functions for single byte locale or for -+ multibyte locale. If multibyte functions do not exist in your sysytem, -+ these pointers always point the function for single byte locale. */ -+static void (*print_char) (char c); -+static int (*char_to_clump) (char c); -+ -+/* Functions for single byte locale. */ -+static void print_char_single (char c); -+static int char_to_clump_single (char c); -+ -+/* Functions for multibyte locale. */ -+static void print_char_multi (char c); -+static int char_to_clump_multi (char c); - --static int char_to_clump (char c); - static bool read_line (COLUMN *p); - static bool print_page (void); - static bool print_stored (COLUMN *p); -@@ -425,6 +476,7 @@ static void print_header (void); - static void pad_across_to (int position); - static void add_line_number (COLUMN *p); - static void getoptarg (char *arg, char switch_char, char *character, -+ int *character_length, int *character_width, - int *number); - void usage (int status); - static void print_files (int number_of_files, char **av); -@@ -439,7 +491,6 @@ static void store_char (char c); - static void pad_down (int lines); - static void read_rest_of_line (COLUMN *p); - static void skip_read (COLUMN *p, int column_number); --static void print_char (char c); - static void cleanup (void); - static void print_sep_string (void); - static void separator_string (const char *optarg_S); -@@ -451,7 +502,7 @@ static COLUMN *column_vector; - we store the leftmost columns contiguously in buff. 
- To print a line from buff, get the index of the first character - from line_vector[i], and print up to line_vector[i + 1]. */ --static char *buff; -+static unsigned char *buff; - - /* Index of the position in buff where the next character - will be stored. */ -@@ -555,7 +606,7 @@ static int chars_per_column; - static bool untabify_input = false; - - /* (-e) The input tab character. */ --static char input_tab_char = '\t'; -+static char input_tab_char[MB_LEN_MAX] = "\t"; - - /* (-e) Tabstops are at chars_per_tab, 2*chars_per_tab, 3*chars_per_tab, ... - where the leftmost column is 1. */ -@@ -565,7 +616,10 @@ static int chars_per_input_tab = 8; - static bool tabify_output = false; - - /* (-i) The output tab character. */ --static char output_tab_char = '\t'; -+static char output_tab_char[MB_LEN_MAX] = "\t"; -+ -+/* (-i) The byte length of output tab character. */ -+static int output_tab_char_length = 1; - - /* (-i) The width of the output tab. */ - static int chars_per_output_tab = 8; -@@ -639,7 +693,13 @@ static int power_10; - static bool numbered_lines = false; - - /* (-n) Character which follows each line number. */ --static char number_separator = '\t'; -+static char number_separator[MB_LEN_MAX] = "\t"; -+ -+/* (-n) The byte length of the character which follows each line number. */ -+static int number_separator_length = 1; -+ -+/* (-n) The character width of the character which follows each line number. */ -+static int number_separator_width = 0; - - /* (-n) line counting starts with 1st line of input file (not with 1st - line of 1st page printed). */ -@@ -692,6 +752,7 @@ static bool use_col_separator = false; - -a|COLUMN|-m is a `space' and with the -J option a `tab'. 
*/ - static char *col_sep_string = (char *) ""; - static int col_sep_length = 0; -+static int col_sep_width = 0; - static char *column_separator = (char *) " "; - static char *line_separator = (char *) "\t"; - -@@ -848,6 +909,13 @@ separator_string (const char *optarg_S) - col_sep_length = (int) strlen (optarg_S); - col_sep_string = xmalloc (col_sep_length + 1); - strcpy (col_sep_string, optarg_S); -+ -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ col_sep_width = mbswidth (col_sep_string, 0); -+ else -+#endif -+ col_sep_width = col_sep_length; - } - - int -@@ -872,6 +940,21 @@ main (int argc, char **argv) - - atexit (close_stdout); - -+/* Define which functions are used, the ones for single byte locale or the ones -+ for multibyte locale. */ -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ print_char = print_char_multi; -+ char_to_clump = char_to_clump_multi; -+ } -+ else -+#endif -+ { -+ print_char = print_char_single; -+ char_to_clump = char_to_clump_single; -+ } -+ - n_files = 0; - file_names = (argc > 1 - ? xmalloc ((argc - 1) * sizeof (char *)) -@@ -948,8 +1031,12 @@ main (int argc, char **argv) - break; - case 'e': - if (optarg) -- getoptarg (optarg, 'e', &input_tab_char, -- &chars_per_input_tab); -+ { -+ int dummy_length, dummy_width; -+ -+ getoptarg (optarg, 'e', input_tab_char, &dummy_length, -+ &dummy_width, &chars_per_input_tab); -+ } - /* Could check tab width > 0. */ - untabify_input = true; - break; -@@ -962,8 +1049,12 @@ main (int argc, char **argv) - break; - case 'i': - if (optarg) -- getoptarg (optarg, 'i', &output_tab_char, -- &chars_per_output_tab); -+ { -+ int dummy_width; -+ -+ getoptarg (optarg, 'i', output_tab_char, &output_tab_char_length, -+ &dummy_width, &chars_per_output_tab); -+ } - /* Could check tab width > 0. 
*/ - tabify_output = true; - break; -@@ -990,8 +1081,8 @@ main (int argc, char **argv) - case 'n': - numbered_lines = true; - if (optarg) -- getoptarg (optarg, 'n', &number_separator, -- &chars_per_number); -+ getoptarg (optarg, 'n', number_separator, &number_separator_length, -+ &number_separator_width, &chars_per_number); - break; - case 'N': - skip_count = false; -@@ -1031,6 +1122,7 @@ main (int argc, char **argv) - /* Reset an additional input of -s, -S dominates -s */ - col_sep_string = bad_cast (""); - col_sep_length = 0; -+ col_sep_width = 0; - use_col_separator = true; - if (optarg) - separator_string (optarg); -@@ -1187,10 +1279,45 @@ main (int argc, char **argv) - a number. */ - - static void --getoptarg (char *arg, char switch_char, char *character, int *number) -+getoptarg (char *arg, char switch_char, char *character, int *character_length, -+ int *character_width, int *number) - { - if (!ISDIGIT (*arg)) -- *character = *arg++; -+ { -+#ifdef HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) /* for multibyte locale. */ -+ { -+ wchar_t wc; -+ size_t mblength; -+ int width; -+ mbstate_t state = {'\0'}; -+ -+ mblength = mbrtowc (&wc, arg, strlen (arg), &state); -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ *character_length = 1; -+ *character_width = 1; -+ } -+ else -+ { -+ *character_length = (mblength < 1) ? 1 : mblength; -+ width = wcwidth (wc); -+ *character_width = (width < 0) ? 0 : width; -+ } -+ -+ strncpy (character, arg, *character_length); -+ arg += *character_length; -+ } -+ else /* for single byte locale. 
*/ -+#endif -+ { -+ *character = *arg++; -+ *character_length = 1; -+ *character_width = 1; -+ } -+ } -+ - if (*arg) - { - long int tmp_long; -@@ -1249,7 +1376,7 @@ init_parameters (int number_of_files) - else - col_sep_string = column_separator; - -- col_sep_length = 1; -+ col_sep_length = col_sep_width = 1; - use_col_separator = true; - } - /* It's rather pointless to define a TAB separator with column -@@ -1280,11 +1407,11 @@ init_parameters (int number_of_files) - TAB_WIDTH (chars_per_input_tab, chars_per_number); */ - - /* Estimate chars_per_text without any margin and keep it constant. */ -- if (number_separator == '\t') -+ if (number_separator[0] == '\t') - number_width = chars_per_number + - TAB_WIDTH (chars_per_default_tab, chars_per_number); - else -- number_width = chars_per_number + 1; -+ number_width = chars_per_number + number_separator_width; - - /* The number is part of the column width unless we are - printing files in parallel. */ -@@ -1299,7 +1426,7 @@ init_parameters (int number_of_files) - } - - chars_per_column = (chars_per_line - chars_used_by_number - -- (columns - 1) * col_sep_length) / columns; -+ (columns - 1) * col_sep_width) / columns; - - if (chars_per_column < 1) - error (EXIT_FAILURE, 0, _("page width too narrow")); -@@ -1424,7 +1551,7 @@ init_funcs (void) - - /* Enlarge p->start_position of first column to use the same form of - padding_not_printed with all columns. */ -- h = h + col_sep_length; -+ h = h + col_sep_width; - - /* This loop takes care of all but the rightmost column. 
*/ - -@@ -1458,7 +1585,7 @@ init_funcs (void) - } - else - { -- h = h_next + col_sep_length; -+ h = h_next + col_sep_width; - h_next = h + chars_per_column; - } - } -@@ -1748,9 +1875,9 @@ static void - align_column (COLUMN *p) - { - padding_not_printed = p->start_position; -- if (padding_not_printed - col_sep_length > 0) -+ if (padding_not_printed - col_sep_width > 0) - { -- pad_across_to (padding_not_printed - col_sep_length); -+ pad_across_to (padding_not_printed - col_sep_width); - padding_not_printed = ANYWHERE; - } - -@@ -2021,13 +2148,13 @@ store_char (char c) - /* May be too generous. */ - buff = X2REALLOC (buff, &buff_allocated); - } -- buff[buff_current++] = c; -+ buff[buff_current++] = (unsigned char) c; - } - - static void - add_line_number (COLUMN *p) - { -- int i; -+ int i, j; - char *s; - int left_cut; - -@@ -2050,22 +2177,24 @@ add_line_number (COLUMN *p) - /* Tabification is assumed for multiple columns, also for n-separators, - but `default n-separator = TAB' hasn't been given priority over - equal column_width also specified by POSIX. */ -- if (number_separator == '\t') -+ if (number_separator[0] == '\t') - { - i = number_width - chars_per_number; - while (i-- > 0) - (p->char_func) (' '); - } - else -- (p->char_func) (number_separator); -+ for (j = 0; j < number_separator_length; j++) -+ (p->char_func) (number_separator[j]); - } - else - /* To comply with POSIX, we avoid any expansion of default TAB - separator with a single column output. No column_width requirement - has to be considered. 
*/ - { -- (p->char_func) (number_separator); -- if (number_separator == '\t') -+ for (j = 0; j < number_separator_length; j++) -+ (p->char_func) (number_separator[j]); -+ if (number_separator[0] == '\t') - output_position = POS_AFTER_TAB (chars_per_output_tab, - output_position); - } -@@ -2226,7 +2355,7 @@ print_white_space (void) - while (goal - h_old > 1 - && (h_new = POS_AFTER_TAB (chars_per_output_tab, h_old)) <= goal) - { -- putchar (output_tab_char); -+ fwrite (output_tab_char, 1, output_tab_char_length, stdout); - h_old = h_new; - } - while (++h_old <= goal) -@@ -2246,6 +2375,7 @@ print_sep_string (void) - { - char *s; - int l = col_sep_length; -+ int not_space_flag; - - s = col_sep_string; - -@@ -2259,6 +2389,7 @@ print_sep_string (void) - { - for (; separators_not_printed > 0; --separators_not_printed) - { -+ not_space_flag = 0; - while (l-- > 0) - { - /* 3 types of sep_strings: spaces only, spaces and chars, -@@ -2272,12 +2403,15 @@ print_sep_string (void) - } - else - { -+ not_space_flag = 1; - if (spaces_not_printed > 0) - print_white_space (); - putchar (*s++); -- ++output_position; - } - } -+ if (not_space_flag) -+ output_position += col_sep_width; -+ - /* sep_string ends with some spaces */ - if (spaces_not_printed > 0) - print_white_space (); -@@ -2304,8 +2438,9 @@ print_clump (COLUMN *p, int n, char *clu - a nonspace is encountered, call print_white_space() to print the - required number of tabs and spaces. 
*/ - -+ - static void --print_char (char c) -+print_char_single (char c) - { - if (tabify_output) - { -@@ -2329,6 +2464,75 @@ print_char (char c) - putchar (c); - } - -+#ifdef HAVE_MBRTOWC -+static void -+print_char_multi (char c) -+{ -+ static size_t mbc_pos = 0; -+ static unsigned char mbc[MB_LEN_MAX] = {'\0'}; -+ static mbstate_t state = {'\0'}; -+ mbstate_t state_bak; -+ wchar_t wc; -+ unsigned char uc = (unsigned char) c; -+ size_t mblength; -+ int width; -+ -+ if (tabify_output) -+ { -+ state_bak = state; -+ mbc[mbc_pos++] = uc; -+ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); -+ -+ while (mbc_pos > 0) -+ { -+ switch (mblength) -+ { -+ case (size_t) -2: -+ state = state_bak; -+ return; -+ -+ case (size_t) -1: -+ state = state_bak; -+ ++output_position; -+ putchar (mbc[0]); -+ memmove (mbc, mbc + 1, MB_CUR_MAX - 1); -+ --mbc_pos; -+ break; -+ -+ case 0: -+ mblength = 1; -+ -+ default: -+ if (wc == L' ') -+ { -+ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); -+ --mbc_pos; -+ ++spaces_not_printed; -+ return; -+ } -+ else if (spaces_not_printed > 0) -+ print_white_space (); -+ -+ /* Nonprintables are assumed to have width 0, except L'\b'. */ -+ if ((width = wcwidth (wc)) < 1) -+ { -+ if (wc == L'\b') -+ --output_position; -+ } -+ else -+ output_position += width; -+ -+ fwrite (mbc, 1, mblength, stdout); -+ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); -+ mbc_pos -= mblength; -+ } -+ } -+ return; -+ } -+ putchar (uc); -+} -+#endif -+ - /* Skip to page PAGE before printing. - PAGE may be larger than total number of pages. 
*/ - -@@ -2506,9 +2710,9 @@ read_line (COLUMN *p) - align_empty_cols = false; - } - -- if (padding_not_printed - col_sep_length > 0) -+ if (padding_not_printed - col_sep_width > 0) - { -- pad_across_to (padding_not_printed - col_sep_length); -+ pad_across_to (padding_not_printed - col_sep_width); - padding_not_printed = ANYWHERE; - } - -@@ -2609,9 +2813,9 @@ print_stored (COLUMN *p) - } - } - -- if (padding_not_printed - col_sep_length > 0) -+ if (padding_not_printed - col_sep_width > 0) - { -- pad_across_to (padding_not_printed - col_sep_length); -+ pad_across_to (padding_not_printed - col_sep_width); - padding_not_printed = ANYWHERE; - } - -@@ -2624,8 +2828,8 @@ print_stored (COLUMN *p) - if (spaces_not_printed == 0) - { - output_position = p->start_position + end_vector[line]; -- if (p->start_position - col_sep_length == chars_per_margin) -- output_position -= col_sep_length; -+ if (p->start_position - col_sep_width == chars_per_margin) -+ output_position -= col_sep_width; - } - - return true; -@@ -2643,8 +2847,9 @@ print_stored (COLUMN *p) - characters in clump_buff. (e.g, the width of '\b' is -1, while the - number of characters is 1.) 
*/ - -+ - static int --char_to_clump (char c) -+char_to_clump_single (char c) - { - unsigned char uc = c; - char *s = clump_buff; -@@ -2654,10 +2859,10 @@ char_to_clump (char c) - int chars; - int chars_per_c = 8; - -- if (c == input_tab_char) -+ if (c == input_tab_char[0]) - chars_per_c = chars_per_input_tab; - -- if (c == input_tab_char || c == '\t') -+ if (c == input_tab_char[0] || c == '\t') - { - width = TAB_WIDTH (chars_per_c, input_position); - -@@ -2738,6 +2943,155 @@ char_to_clump (char c) - return chars; - } - -+#ifdef HAVE_MBRTOWC -+static int -+char_to_clump_multi (char c) -+{ -+ static size_t mbc_pos = 0; -+ static unsigned char mbc[MB_LEN_MAX] = {'\0'}; -+ static mbstate_t state = {'\0'}; -+ mbstate_t state_bak; -+ wchar_t wc; -+ unsigned char uc = (unsigned char) c; -+ size_t mblength; -+ int wc_width; -+ register char *s = clump_buff; -+ register int i, j; -+ char esc_buff[4]; -+ int width; -+ int chars; -+ int chars_per_c = 8; -+ -+ state_bak = state; -+ mbc[mbc_pos++] = uc; -+ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); -+ -+ width = 0; -+ chars = 0; -+ while (mbc_pos > 0) -+ { -+ switch (mblength) -+ { -+ case (size_t) -2: -+ state = state_bak; -+ return 0; -+ -+ case (size_t) -1: -+ state = state_bak; -+ mblength = 1; -+ -+ if (use_esc_sequence || use_cntrl_prefix) -+ { -+ width = +4; -+ chars = +4; -+ *s++ = '\\'; -+ sprintf (esc_buff, "%03o", mbc[0]); -+ for (i = 0; i <= 2; ++i) -+ *s++ = (int) esc_buff[i]; -+ } -+ else -+ { -+ width += 1; -+ chars += 1; -+ *s++ = mbc[0]; -+ } -+ break; -+ -+ case 0: -+ mblength = 1; -+ /* Fall through */ -+ -+ default: -+ if (memcmp (mbc, input_tab_char, mblength) == 0) -+ chars_per_c = chars_per_input_tab; -+ -+ if (memcmp (mbc, input_tab_char, mblength) == 0 || c == '\t') -+ { -+ int width_inc; -+ -+ width_inc = TAB_WIDTH (chars_per_c, input_position); -+ width += width_inc; -+ -+ if (untabify_input) -+ { -+ for (i = width_inc; i; --i) -+ *s++ = ' '; -+ chars += width_inc; -+ } -+ else -+ { -+ for (i = 
0; i < mblength; i++) -+ *s++ = mbc[i]; -+ chars += mblength; -+ } -+ } -+ else if ((wc_width = wcwidth (wc)) < 1) -+ { -+ if (use_esc_sequence) -+ { -+ for (i = 0; i < mblength; i++) -+ { -+ width += 4; -+ chars += 4; -+ *s++ = '\\'; -+ sprintf (esc_buff, "%03o", uc); -+ for (j = 0; j <= 2; ++j) -+ *s++ = (int) esc_buff[j]; -+ } -+ } -+ else if (use_cntrl_prefix) -+ { -+ if (wc < 0200) -+ { -+ width += 2; -+ chars += 2; -+ *s++ = '^'; -+ *s++ = wc ^ 0100; -+ } -+ else -+ { -+ for (i = 0; i < mblength; i++) -+ { -+ width += 4; -+ chars += 4; -+ *s++ = '\\'; -+ sprintf (esc_buff, "%03o", uc); -+ for (j = 0; j <= 2; ++j) -+ *s++ = (int) esc_buff[j]; -+ } -+ } -+ } -+ else if (wc == L'\b') -+ { -+ width += -1; -+ chars += 1; -+ *s++ = c; -+ } -+ else -+ { -+ width += 0; -+ chars += mblength; -+ for (i = 0; i < mblength; i++) -+ *s++ = mbc[i]; -+ } -+ } -+ else -+ { -+ width += wc_width; -+ chars += mblength; -+ for (i = 0; i < mblength; i++) -+ *s++ = mbc[i]; -+ } -+ } -+ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); -+ mbc_pos -= mblength; -+ } -+ -+ input_position += width; -+ return chars; -+} -+#endif -+ - /* We've just printed some files and need to clean up things before - looking for more options and printing the next batch of files. - -Index: src/sort.c -=================================================================== ---- coreutils-7.1/src/sort.c.orig 2009-01-30 19:46:06.000000000 +0100 -+++ coreutils-7.1/src/sort.c 2010-06-29 18:51:17.203522566 +0200 -@@ -26,6 +26,19 @@ - #include - #include - #include -+#include -+ -+/* Get mbstate_t, mbrtowc(), wcrtomb(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswprint(), iswctype() towupper(). 
*/ -+#if HAVE_WCTYPE_H -+# include -+wctype_t blank_type; /* = wctype ("blank"); */ -+#endif -+ - #include "system.h" - #include "argmatch.h" - #include "error.h" -@@ -53,6 +66,17 @@ struct rlimit { size_t rlim_cur; }; - # define getrlimit(Resource, Rlp) (-1) - #endif - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX == 1 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "sort" - -@@ -121,14 +145,38 @@ static int decimal_point; - /* Thousands separator; if -1, then there isn't one. */ - static int thousands_sep; - -+static int force_general_numcompare = 0; -+ - /* Nonzero if the corresponding locales are hard. */ - static bool hard_LC_COLLATE; --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - static bool hard_LC_TIME; - #endif - - #define NONZERO(x) ((x) != 0) - -+/* get a multibyte character's byte length. */ -+#define GET_BYTELEN_OF_CHAR(LIM, PTR, MBLENGTH, STATE) \ -+ do \ -+ { \ -+ wchar_t wc; \ -+ mbstate_t state_bak; \ -+ \ -+ state_bak = STATE; \ -+ mblength = mbrtowc (&wc, PTR, LIM - PTR, &STATE); \ -+ \ -+ switch (MBLENGTH) \ -+ { \ -+ case (size_t)-1: \ -+ case (size_t)-2: \ -+ STATE = state_bak; \ -+ /* Fall through. */ \ -+ case 0: \ -+ MBLENGTH = 1; \ -+ } \ -+ } \ -+ while (0) -+ - /* The kind of blanks for '-b' to skip in various options. */ - enum blanktype { bl_start, bl_end, bl_both }; - -@@ -264,13 +312,11 @@ static bool reverse; - they were read if all keys compare equal. */ - static bool stable; - --/* If TAB has this value, blanks separate fields. */ --enum { TAB_DEFAULT = CHAR_MAX + 1 }; -- --/* Tab character separating fields. 
If TAB_DEFAULT, then fields are -- separated by the empty string between a non-blank character and a blank -+/* Tab character separating fields. If NULL, then fields are separated by -+ the empty string between a non-blank character and a blank - character. */ --static int tab = TAB_DEFAULT; -+static const char *tab; -+static size_t tab_length = 1; - - /* Flag to remove consecutive duplicate lines from the output. - Only the last of a sequence of equal lines will be output. */ -@@ -702,6 +748,43 @@ reap_some (void) - update_proc (pid); - } - -+/* Fucntion pointers. */ -+static char * -+(* begfield) (const struct line *line, const struct keyfield *key); -+ -+static char * -+(* limfield) (const struct line *line, const struct keyfield *key); -+ -+static int -+(*getmonth) (const char *s, size_t len); -+ -+static int -+(* keycompare) (const struct line *a, const struct line *b); -+ -+/* Test for white space multibyte character. -+ Set LENGTH the byte length of investigated multibyte character. */ -+#if HAVE_MBRTOWC -+static int -+ismbblank (const char *str, size_t *length) -+{ -+ size_t mblength; -+ wchar_t wc; -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ mblength = mbrtowc (&wc, str, MB_LEN_MAX, &state); -+ -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ *length = 1; -+ return 0; -+ } -+ -+ *length = (mblength < 1) ? 1 : mblength; -+ return (iswctype (wc, blank_type)); -+} -+#endif -+ - /* Clean up any remaining temporary files. */ - - static void -@@ -1042,7 +1125,7 @@ zaptemp (const char *name) - free (node); - } - --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - - static int - struct_month_cmp (const void *m1, const void *m2) -@@ -1069,7 +1152,7 @@ inittables (void) - fold_toupper[i] = toupper (i); - } - --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - /* If we're not in the "C" locale, read different names for months. 
*/ - if (hard_LC_TIME) - { -@@ -1151,6 +1234,71 @@ specify_nmerge (int oi, char c, char con - xstrtol_fatal (e, oi, c, long_options, s); - } - -+#if HAVE_MBRTOWC -+static void -+inittables_mb (void) -+{ -+ int i, j, k, l; -+ char *name, *s; -+ size_t s_len, mblength; -+ char mbc[MB_LEN_MAX]; -+ wchar_t wc, pwc; -+ mbstate_t state_mb, state_wc; -+ -+ for (i = 0; i < MONTHS_PER_YEAR; i++) -+ { -+ s = (char *) nl_langinfo (ABMON_1 + i); -+ s_len = strlen (s); -+ monthtab[i].name = name = (char *) xmalloc (s_len + 1); -+ monthtab[i].val = i + 1; -+ -+ memset (&state_mb, '\0', sizeof (mbstate_t)); -+ memset (&state_wc, '\0', sizeof (mbstate_t)); -+ -+ for (j = 0; j < s_len;) -+ { -+ if (!ismbblank (s + j, &mblength)) -+ break; -+ j += mblength; -+ } -+ -+ for (k = 0; j < s_len;) -+ { -+ mblength = mbrtowc (&wc, (s + j), (s_len - j), &state_mb); -+ /* If conversion is failed, fall back into single byte sorting. */ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ for (l = 0; l <= i; l++) -+ free ((void *) monthtab[l].name); -+ inittables(); -+ return; -+ } -+ else if (mblength == 0) -+ break; -+ -+ pwc = towupper (wc); -+ if (pwc == wc) -+ { -+ memcpy (mbc, s + j, mblength); -+ j += mblength; -+ } -+ else -+ { -+ j += mblength; -+ mblength = wcrtomb (mbc, wc, &state_wc); -+ assert (mblength != (size_t) 0 && mblength != (size_t) -1); -+ } -+ -+ for (l = 0; l < mblength; l++) -+ name[k++] = mbc[l]; -+ } -+ name[k] = '\0'; -+ } -+ qsort ((void *) monthtab, MONTHS_PER_YEAR, -+ sizeof *monthtab, struct_month_cmp); -+} -+#endif -+ - /* Specify the amount of main memory to use when sorting. */ - static void - specify_sort_size (int oi, char c, char const *s) -@@ -1361,7 +1509,7 @@ buffer_linelim (struct buffer const *buf - by KEY in LINE. 
*/ - - static char * --begfield (const struct line *line, const struct keyfield *key) -+begfield_uni (const struct line *line, const struct keyfield *key) - { - char *ptr = line->text, *lim = ptr + line->length - 1; - size_t sword = key->sword; -@@ -1371,10 +1519,10 @@ begfield (const struct line *line, const - /* The leading field separator itself is included in a field when -t - is absent. */ - -- if (tab != TAB_DEFAULT) -+ if (tab != NULL) - while (ptr < lim && sword--) - { -- while (ptr < lim && *ptr != tab) -+ while (ptr < lim && *ptr != tab[0]) - ++ptr; - if (ptr < lim) - ++ptr; -@@ -1402,11 +1550,70 @@ begfield (const struct line *line, const - return ptr; - } - -+#if HAVE_MBRTOWC -+static char * -+begfield_mb (const struct line *line, const struct keyfield *key) -+{ -+ int i; -+ char *ptr = line->text, *lim = ptr + line->length - 1; -+ size_t sword = key->sword; -+ size_t schar = key->schar; -+ size_t mblength; -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ if (tab != NULL) -+ while (ptr < lim && sword--) -+ { -+ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ } -+ else -+ while (ptr < lim && sword--) -+ { -+ while (ptr < lim && ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ while (ptr < lim && !ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ } -+ -+ if (key->skipsblanks) -+ while (ptr < lim && ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ -+ for (i = 0; i < schar; i++) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ -+ if (ptr + mblength > lim) -+ break; -+ else -+ ptr += mblength; -+ } -+ -+ return ptr; -+} -+#endif -+ - /* Return the limit of (a pointer to the first character after) the field - in LINE specified by 
KEY. */ - - static char * --limfield (const struct line *line, const struct keyfield *key) -+limfield_uni (const struct line *line, const struct keyfield *key) - { - char *ptr = line->text, *lim = ptr + line->length - 1; - size_t eword = key->eword, echar = key->echar; -@@ -1419,10 +1626,10 @@ limfield (const struct line *line, const - `beginning' is the first character following the delimiting TAB. - Otherwise, leave PTR pointing at the first `blank' character after - the preceding field. */ -- if (tab != TAB_DEFAULT) -+ if (tab != NULL) - while (ptr < lim && eword--) - { -- while (ptr < lim && *ptr != tab) -+ while (ptr < lim && *ptr != tab[0]) - ++ptr; - if (ptr < lim && (eword | echar)) - ++ptr; -@@ -1468,7 +1675,7 @@ limfield (const struct line *line, const - */ - - /* Make LIM point to the end of (one byte past) the current field. */ -- if (tab != TAB_DEFAULT) -+ if (tab != NULL) - { - char *newlim; - newlim = memchr (ptr, tab, lim - ptr); -@@ -1504,6 +1711,107 @@ limfield (const struct line *line, const - return ptr; - } - -+#if HAVE_MBRTOWC -+static char * -+limfield_mb (const struct line *line, const struct keyfield *key) -+{ -+ char *ptr = line->text, *lim = ptr + line->length - 1; -+ size_t eword = key->eword, echar = key->echar; -+ int i; -+ size_t mblength; -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ if (tab != NULL) -+ while (ptr < lim && eword--) -+ { -+ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ } -+ else -+ while (ptr < lim && eword--) -+ { -+ while (ptr < lim && ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ while (ptr < lim && !ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ } -+ -+# ifdef POSIX_UNSPECIFIED -+ -+ /* 
Make LIM point to the end of (one byte past) the current field. */ -+ if (tab != NULL) -+ { -+ char *newlim, *p; -+ -+ newlim = NULL; -+ for (p = ptr; p < lim;) -+ { -+ if (memcmp (p, tab, tab_length) == 0) -+ { -+ newlim = p; -+ break; -+ } -+ -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ p += mblength; -+ } -+ } -+ else -+ { -+ char *newlim; -+ newlim = ptr; -+ -+ while (newlim < lim && ismbblank (newlim, &mblength)) -+ newlim += mblength; -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ while (newlim < lim && !ismbblank (newlim, &mblength)) -+ newlim += mblength; -+ lim = newlim; -+ } -+# endif -+ -+ /* If we're skipping leading blanks, don't start counting characters -+ until after skipping past any leading blanks. */ -+ if (key->skipeblanks) -+ while (ptr < lim && ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ /* Advance PTR by ECHAR (if possible), but no further than LIM. */ -+ for (i = 0; i < echar; i++) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ -+ if (ptr + mblength > lim) -+ break; -+ else -+ ptr += mblength; -+ } -+ -+ return ptr; -+} -+#endif -+ - /* Fill BUF reading from FP, moving buf->left bytes from the end - of buf->buf to the beginning first. If EOF is reached and the - file wasn't terminated by a newline, supply one. 
Set up BUF's line -@@ -1586,8 +1894,22 @@ fillbuf (struct buffer *buf, FILE *fp, c - else - { - if (key->skipsblanks) -- while (blanks[to_uchar (*line_start)]) -- line_start++; -+ { -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ size_t mblength; -+ -+ while (ismbblank (line_start, &mblength)) -+ line_start += mblength; -+ } -+ else -+#endif -+ { -+ while (blanks[to_uchar (*line_start)]) -+ line_start++; -+ } -+ } - line->keybeg = line_start; - } - } -@@ -1642,15 +1964,59 @@ general_numcompare (const char *sa, cons - /* FIXME: maybe add option to try expensive FP conversion - only if A and B can't be compared more cheaply/accurately. */ - -- char *ea; -- char *eb; -- double a = strtod (sa, &ea); -- double b = strtod (sb, &eb); -+ char *bufa, *ea; -+ char *bufb, *eb; -+ double a; -+ double b; -+ -+ char *p; -+ struct lconv *lconvp = localeconv (); -+ size_t thousands_sep_len = strlen (lconvp->thousands_sep); -+ -+ bufa = (char *) xmalloc (strlen (sa) + 1); -+ bufb = (char *) xmalloc (strlen (sb) + 1); -+ strcpy (bufa, sa); -+ strcpy (bufb, sb); -+ -+ if (force_general_numcompare) -+ { -+ while (1) -+ { -+ a = strtod (bufa, &ea); -+ if (memcmp (ea, lconvp->thousands_sep, thousands_sep_len) == 0) -+ { -+ for (p = ea; *(p + thousands_sep_len) != '\0'; p++) -+ *p = *(p + thousands_sep_len); -+ *p = '\0'; -+ continue; -+ } -+ break; -+ } -+ -+ while (1) -+ { -+ b = strtod (bufb, &eb); -+ if (memcmp (eb, lconvp->thousands_sep, thousands_sep_len) == 0) -+ { -+ for (p = eb; *(p + thousands_sep_len) != '\0'; p++) -+ *p = *(p + thousands_sep_len); -+ *p = '\0'; -+ continue; -+ } -+ break; -+ } -+ } -+ else -+ { -+ a = strtod (bufa, &ea); -+ b = strtod (bufb, &eb); -+ } -+ - - /* Put conversion errors at the start of the collating sequence. */ -- if (sa == ea) -- return sb == eb ? 0 : -1; -- if (sb == eb) -+ if (bufa == ea) -+ return bufb == eb ? 0 : -1; -+ if (bufb == eb) - return 1; - - /* Sort numbers in the usual way, where -0 == +0. 
Put NaNs after -@@ -1668,7 +2034,7 @@ general_numcompare (const char *sa, cons - Return 0 if the name in S is not recognized. */ - - static int --getmonth (char const *month, size_t len) -+getmonth_uni (char const *month, size_t len) - { - size_t lo = 0; - size_t hi = MONTHS_PER_YEAR; -@@ -1849,11 +2215,79 @@ compare_version (char *restrict texta, s - return diff; - } - -+#if HAVE_MBRTOWC -+static int -+getmonth_mb (char const *s, size_t len) -+{ -+ char *month; -+ register size_t i; -+ register int lo = 0, hi = MONTHS_PER_YEAR, result; -+ char *tmp; -+ size_t wclength, mblength; -+ const char **pp; -+ const wchar_t **wpp; -+ wchar_t *month_wcs; -+ mbstate_t state; -+ -+ while (len > 0 && ismbblank (s, &mblength)) -+ { -+ s += mblength; -+ len -= mblength; -+ } -+ -+ if (len == 0) -+ return 0; -+ -+ month = (char *) alloca (len + 1); -+ -+ tmp = (char *) alloca (len + 1); -+ memcpy (tmp, s, len); -+ tmp[len] = '\0'; -+ pp = (const char **) &tmp; -+ month_wcs = (wchar_t *) alloca ((len + 1) * sizeof (wchar_t)); -+ memset (&state, '\0', sizeof (mbstate_t)); -+ -+ wclength = mbsrtowcs (month_wcs, pp, len + 1, &state); -+ assert (wclength != 1 && *pp == NULL); -+ -+ for (i = 0; i < wclength; i++) -+ { -+ month_wcs[i] = towupper (month_wcs[i]); -+ if (iswctype (month_wcs[i], blank_type)) -+ { -+ month_wcs[i] = L'\0'; -+ break; -+ } -+ } -+ -+ wpp = (const wchar_t **) &month_wcs; -+ -+ mblength = wcsrtombs (month, wpp, len + 1, &state); -+ assert (mblength != (-1) && *wpp == NULL); -+ -+ do -+ { -+ int ix = (lo + hi) / 2; -+ -+ if (strncmp (month, monthtab[ix].name, strlen (monthtab[ix].name)) < 0) -+ hi = ix; -+ else -+ lo = ix; -+ } -+ while (hi - lo > 1); -+ -+ result = (!strncmp (month, monthtab[lo].name, strlen (monthtab[lo].name)) -+ ? monthtab[lo].val : 0); -+ -+ return result; -+} -+#endif -+ - /* Compare two lines A and B trying every key in sequence until there - are no more keys or a difference is found. 
*/ - - static int --keycompare (const struct line *a, const struct line *b) -+keycompare_uni (const struct line *a, const struct line *b) - { - struct keyfield const *key = keylist; - -@@ -2022,11 +2456,190 @@ keycompare (const struct line *a, const - - return 0; - -- greater: -+greater: -+ diff = 1; -+not_equal: -+ return key->reverse ? -diff : diff; -+} -+ -+#if HAVE_MBRTOWC -+static int -+keycompare_mb (const struct line *a, const struct line *b) -+{ -+ struct keyfield *key = keylist; -+ -+ /* For the first iteration only, the key positions have been -+ precomputed for us. */ -+ char *texta = a->keybeg; -+ char *textb = b->keybeg; -+ char *lima = a->keylim; -+ char *limb = b->keylim; -+ -+ size_t mblength_a, mblength_b; -+ wchar_t wc_a, wc_b; -+ mbstate_t state_a, state_b; -+ -+ int diff; -+ -+ memset (&state_a, '\0', sizeof (mbstate_t)); -+ memset (&state_b, '\0', sizeof (mbstate_t)); -+ -+ for (;;) -+ { -+ register char const *translate = key->translate; -+ register bool const *ignore = key->ignore; -+ -+ /* Find the lengths. */ -+ size_t lena = lima <= texta ? 0 : lima - texta; -+ size_t lenb = limb <= textb ? 0 : limb - textb; -+ -+ /* Actually compare the fields. */ -+ if (key->numeric | key->general_numeric) -+ { -+ char savea = *lima, saveb = *limb; -+ -+ *lima = *limb = '\0'; -+ if (force_general_numcompare) -+ diff = general_numcompare (texta, textb); -+ else -+ diff = ((key->numeric ? numcompare : general_numcompare) -+ (texta, textb)); -+ *lima = savea, *limb = saveb; -+ } -+ else if (key->version) -+ diff = compare_version (texta, lena, textb, lenb); -+ else if (key->month) -+ diff = getmonth (texta, lena) - getmonth (textb, lenb); -+ else -+ { -+ if (ignore || translate) -+ { -+ char buf[4000]; -+ size_t size = lena + 1 + lenb + 1; -+ char *copy_a = (size <= sizeof buf ? buf : xmalloc (size)); -+ char *copy_b = copy_a + lena + 1; -+ size_t new_len_a, new_len_b; -+ size_t i, j; -+ -+ /* Ignore and/or translate chars before comparing. 
*/ -+# define IGNORE_CHARS(NEW_LEN, LEN, TEXT, COPY, WC, MBLENGTH, STATE) \ -+ do \ -+ { \ -+ wchar_t uwc; \ -+ char mbc[MB_LEN_MAX]; \ -+ mbstate_t state_wc; \ -+ \ -+ for (NEW_LEN = i = 0; i < LEN;) \ -+ { \ -+ mbstate_t state_bak; \ -+ \ -+ state_bak = STATE; \ -+ MBLENGTH = mbrtowc (&WC, TEXT + i, LEN - i, &STATE); \ -+ \ -+ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1 \ -+ || MBLENGTH == 0) \ -+ { \ -+ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ -+ STATE = state_bak; \ -+ if (!ignore) \ -+ COPY[NEW_LEN++] = TEXT[i++]; \ -+ continue; \ -+ } \ -+ \ -+ if (ignore) \ -+ { \ -+ if ((ignore == nonprinting && !iswprint (WC)) \ -+ || (ignore == nondictionary \ -+ && !iswalnum (WC) && !iswctype (WC, blank_type))) \ -+ { \ -+ i += MBLENGTH; \ -+ continue; \ -+ } \ -+ } \ -+ \ -+ if (translate) \ -+ { \ -+ \ -+ uwc = toupper(WC); \ -+ if (WC == uwc) \ -+ { \ -+ memcpy (mbc, TEXT + i, MBLENGTH); \ -+ i += MBLENGTH; \ -+ } \ -+ else \ -+ { \ -+ i += MBLENGTH; \ -+ WC = uwc; \ -+ memset (&state_wc, '\0', sizeof (mbstate_t)); \ -+ \ -+ MBLENGTH = wcrtomb (mbc, WC, &state_wc); \ -+ assert (MBLENGTH != (size_t)-1 && MBLENGTH != 0); \ -+ } \ -+ \ -+ for (j = 0; j < MBLENGTH; j++) \ -+ COPY[NEW_LEN++] = mbc[j]; \ -+ } \ -+ else \ -+ for (j = 0; j < MBLENGTH; j++) \ -+ COPY[NEW_LEN++] = TEXT[i++]; \ -+ } \ -+ COPY[NEW_LEN] = '\0'; \ -+ } \ -+ while (0) -+ -+ IGNORE_CHARS (new_len_a, lena, texta, copy_a, -+ wc_a, mblength_a, state_a); -+ IGNORE_CHARS (new_len_b, lenb, textb, copy_b, -+ wc_b, mblength_b, state_b); -+ diff = xmemcoll (copy_a, new_len_a, copy_b, new_len_b); -+ -+ if (sizeof buf < size) -+ free (copy_a); -+ } -+ else if (lena == 0) -+ diff = - NONZERO (lenb); -+ else if (lenb == 0) -+ goto greater; -+ else -+ diff = xmemcoll (texta, lena, textb, lenb); -+ } -+ -+ if (diff) -+ goto not_equal; -+ -+ key = key->next; -+ if (! key) -+ break; -+ -+ /* Find the beginning and limit of the next field. 
*/ -+ if (key->eword != SIZE_MAX) -+ lima = limfield (a, key), limb = limfield (b, key); -+ else -+ lima = a->text + a->length - 1, limb = b->text + b->length - 1; -+ -+ if (key->sword != SIZE_MAX) -+ texta = begfield (a, key), textb = begfield (b, key); -+ else -+ { -+ texta = a->text, textb = b->text; -+ if (key->skipsblanks) -+ { -+ while (texta < lima && ismbblank (texta, &mblength_a)) -+ texta += mblength_a; -+ while (textb < limb && ismbblank (textb, &mblength_b)) -+ textb += mblength_b; -+ } -+ } -+ } -+ -+ return 0; -+ -+greater: - diff = 1; -- not_equal: -+not_equal: - return key->reverse ? -diff : diff; - } -+#endif - - /* Compare two lines A and B, returning negative, zero, or positive - depending on whether A compares less than, equal to, or greater than B. */ -@@ -2857,6 +3470,11 @@ set_ordering (const char *s, struct keyf - break; - case 'M': - key->month = true; -+#if HAVE_MBRTOWC -+ if (strcmp (setlocale (LC_CTYPE, NULL), setlocale (LC_TIME, NULL))) -+ error (0, 0, _("As LC_TIME differs from LC_CTYPE, the results may be strange.")); -+ inittables_mb (); -+#endif - break; - case 'n': - key->numeric = true; -@@ -2915,7 +3533,7 @@ main (int argc, char **argv) - initialize_exit_failure (SORT_FAILURE); - - hard_LC_COLLATE = hard_locale (LC_COLLATE); --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - hard_LC_TIME = hard_locale (LC_TIME); - #endif - -@@ -2928,14 +3546,40 @@ main (int argc, char **argv) - add support for multibyte decimal points. */ - decimal_point = to_uchar (locale->decimal_point[0]); - if (! decimal_point || locale->decimal_point[1]) -- decimal_point = '.'; -+ { -+ decimal_point = '.'; -+ if (locale->decimal_point[0] && locale->decimal_point[1]) -+ force_general_numcompare = 1; -+ } - - /* FIXME: add support for multibyte thousands separators. */ - thousands_sep = to_uchar (*locale->thousands_sep); - if (! 
thousands_sep || locale->thousands_sep[1]) -- thousands_sep = -1; -+ { -+ thousands_sep = -1; -+ if (locale->thousands_sep[0] && locale->thousands_sep[1]) -+ force_general_numcompare = 1; -+ } - } - -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ blank_type = wctype ("blank"); -+ begfield = begfield_mb; -+ limfield = limfield_mb; -+ getmonth = getmonth_mb; -+ keycompare = keycompare_mb; -+ } -+ else -+#endif -+ { -+ begfield = begfield_uni; -+ limfield = limfield_uni; -+ keycompare = keycompare_uni; -+ getmonth = getmonth_uni; -+ } -+ - have_read_stdin = false; - inittables (); - -@@ -3196,13 +3840,32 @@ main (int argc, char **argv) - - case 't': - { -- char newtab = optarg[0]; -- if (! newtab) -+ const char *newtab = optarg; -+ size_t newtab_length; -+ if (! newtab[0]) - error (SORT_FAILURE, 0, _("empty tab")); -- if (optarg[1]) -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ mbstate_t state; -+ -+ memset (&state, 0, sizeof (mbstate_t)); -+ newtab_length = mbrtowc (NULL, newtab, strlen (newtab), &state); -+ if (newtab_length == (size_t) 0 -+ || newtab_length == (size_t) -1 -+ || newtab_length == (size_t) -2) -+ newtab_length = 1; -+ } -+ else -+#endif -+ newtab_length = 1; -+ if (optarg[newtab_length]) - { - if (STREQ (optarg, "\\0")) -- newtab = '\0'; -+ { -+ newtab = "\0"; -+ newtab_length = 1; -+ } - else - { - /* Provoke with `sort -txx'. 
Complain about -@@ -3213,9 +3876,12 @@ main (int argc, char **argv) - quote (optarg)); - } - } -- if (tab != TAB_DEFAULT && tab != newtab) -+ if (tab != NULL -+ && (tab_length != newtab_length -+ || memcmp (tab, newtab, tab_length) != 0)) - error (SORT_FAILURE, 0, _("incompatible tabs")); - tab = newtab; -+ tab_length = newtab_length; - } - break; - -Index: src/unexpand.c -=================================================================== ---- coreutils-7.1/src/unexpand.c.orig 2008-11-10 14:17:52.000000000 +0100 -+++ coreutils-7.1/src/unexpand.c 2010-06-29 18:49:31.975522293 +0200 -@@ -38,11 +38,34 @@ - #include - #include - #include -+ -+/* Get mbstate_t, mbrtowc(), wcwidth() */ -+#if HAVE_WCHAR_H -+# include -+#endif -+/* Get iswblank */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ -+ -+/* A sentinel value that's placed at the end of the list of tab stops. -+ * This value must be a large number, but not so large that adding the -+ * length of a line to it would cause the column variable to overflow. */ -+#define TAB_STOP_SENTINEL INT_MAX -+ - #include "system.h" - #include "error.h" - #include "quote.h" - #include "xstrndup.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "unexpand" - -@@ -449,6 +472,237 @@ unexpand (void) - } - } - -+#if HAVE_MBRTOWC && HAVE_WCTYPE_H -+static void -+unexpand_multibyte (void) -+{ -+ /* Input stream. */ -+ FILE *fp = next_file (NULL); -+ -+ mbstate_t i_state; /* Current shift state of the input stream. */ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen = 0; /* The length of the byte sequence in buf. */ -+ -+ /* The array of pending blanks. 
In non-POSIX locales, blanks can -+ include characters other than spaces, so the blanks must be -+ stored, not merely counted. */ -+ char *pending_blank; -+ -+ if (!fp) -+ return; -+ -+ /* The worst case is a non-blank character, then one blank, then a -+ tab stop, then MAX_COLUMN_WIDTH - 1 blanks, then a non-blank; so -+ allocate MAX_COLUMN_WIDTH bytes to store the blanks. */ -+ pending_blank = xmalloc (max_column_width); -+ -+ memset (&i_state, '\0', sizeof(mbstate_t)); -+ -+ for (;;) -+ { -+ /* A gotten wide character. */ -+ wint_t wc; -+ -+ /* If true, perform translations. */ -+ bool convert = true; -+ -+ /* The following variables have valid values only when CONVERT -+ is true: */ -+ -+ /* Column of next input character. */ -+ uintmax_t column = 0; -+ -+ /* Column the next input tab stop is on. */ -+ uintmax_t next_tab_column = 0; -+ -+ /* Index in TAB_LIST of next tab stop to examine. */ -+ size_t tab_index = 0; -+ -+ /* If true, the first pending blank came just before a tab stop. */ -+ bool one_blank_before_tab_stop = false; -+ -+ /* If true, the previous input character was a blank. This is -+ initially true, since initial strings of blanks are treated -+ as if the line was preceded by a blank. */ -+ bool prev_blank = true; -+ -+ /* Number of pending columns of blanks. */ -+ size_t pending = 0; -+ -+ /* Convert a line of text. */ -+ do -+ { -+ wchar_t w; -+ size_t mblength; /* The byte size of a multibyte character -+ which shows as same character as WC. */ -+ mbstate_t i_state_bak; /* Back up the I_STATE. 
*/ -+ -+ /* Fill buffer */ -+ if (buflen < MB_LEN_MAX) -+ { -+ if (!feof (fp) && !ferror (fp)) -+ { -+ if (buflen > 0) -+ memmove (buf, bufpos, buflen); -+ buflen += fread (buf + buflen, sizeof (char), BUFSIZ, fp); -+ bufpos = buf; -+ } -+ } -+ -+ if (buflen < 1) -+ { -+ /* Move to the next file */ -+ if (feof (fp) || ferror (fp)) -+ fp = next_file (fp); -+ if (!fp) -+ { -+ if (pending) -+ { -+ if (fwrite (pending_blank, 1, pending, stdout) != pending) -+ error (EXIT_FAILURE, errno, _("write error")); -+ } -+ free (pending_blank); -+ return; -+ } -+ continue; -+ } -+ -+ i_state_bak = i_state; -+ mblength = mbrtowc (&w, bufpos, buflen, &i_state); -+ wc = w; -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ i_state = i_state_bak; -+ wc = L'\0'; -+ column += convert; -+ mblength = 1; -+ } -+ -+ if (convert) -+ { -+ bool blank = iswblank (wc); -+ -+ if (blank) -+ { -+ if (next_tab_column <= column) -+ { -+ if (tab_size) -+ next_tab_column = -+ column + (tab_size - column % tab_size); -+ else -+ for (;;) -+ if (tab_index == first_free_tab) -+ { -+ convert = false; -+ break; -+ } -+ else -+ { -+ uintmax_t tab = tab_list[tab_index++]; -+ if (column < tab) -+ { -+ next_tab_column = tab; -+ break; -+ } -+ } -+ } -+ -+ if (convert) -+ { -+ if (next_tab_column < column) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ -+ if (wc == L'\t') -+ { -+ column = next_tab_column; -+ -+ /* Discard pending blanks, unless it was a single -+ blank just before the previous tab stop. */ -+ if (! (pending == 1 && one_blank_before_tab_stop)) -+ { -+ pending = 0; -+ one_blank_before_tab_stop = false; -+ } -+ } -+ else -+ { -+ column++; -+ -+ if (! (prev_blank && column == next_tab_column)) -+ { -+ /* It is not yet known whether the pending blanks -+ will be replaced by tabs. 
*/ -+ if (column == next_tab_column) -+ one_blank_before_tab_stop = true; -+ pending_blank[pending++] = ' '; -+ prev_blank = true; -+ buflen -= mblength; -+ bufpos += mblength; -+ continue; -+ } -+ -+ /* Replace the pending blanks by a tab or two. */ -+ pending_blank[0] = *bufpos = '\t'; -+ pending = one_blank_before_tab_stop; -+ } -+ } -+ } -+ else if (wc == L'\b') -+ { -+ /* Go back one column, and force recalculation of the -+ next tab stop. */ -+ column -= !!column; -+ next_tab_column = column; -+ tab_index -= !!tab_index; -+ } -+ else -+ { -+ if (!iswcntrl (wc)) -+ { -+ int width = wcwidth (wc); -+ if (width > 0) -+ { -+ if (column > (column + width)) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ column += width; -+ } -+ } -+ } -+ -+ if (pending) -+ { -+ if (fwrite (pending_blank, 1, pending, stdout) != pending) -+ error (EXIT_FAILURE, errno, _("write error")); -+ pending = 0; -+ one_blank_before_tab_stop = false; -+ } -+ -+ prev_blank = blank; -+ convert &= convert_entire_line | blank; -+ } -+ -+ if (mblength) -+ { -+ if (fwrite (bufpos, sizeof (char), mblength, stdout) < mblength) -+ error (EXIT_FAILURE, errno, _("write error")); -+ } -+ else -+ { -+ if (putchar ('\0')) -+ error (EXIT_FAILURE, errno, _("write error")); -+ mblength = 1; -+ } -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+ while (wc != L'\n'); -+ } -+} -+#endif -+ - int - main (int argc, char **argv) - { -@@ -527,7 +781,12 @@ main (int argc, char **argv) - - file_list = (optind < argc ? 
&argv[optind] : stdin_argv); - -- unexpand (); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ unexpand_multibyte (); -+ else -+#endif -+ unexpand (); - - if (have_read_stdin && fclose (stdin) != 0) - error (EXIT_FAILURE, errno, "-"); -Index: src/uniq.c -=================================================================== ---- coreutils-7.1/src/uniq.c.orig 2008-11-10 14:17:52.000000000 +0100 -+++ coreutils-7.1/src/uniq.c 2010-06-29 18:49:32.040030047 +0200 -@@ -22,6 +22,16 @@ - #include - #include - -+/* Get mbstate_t, mbrtowc(), wcrtomb() */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswctype(), wctype(), towupper)(. */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ - #include "system.h" - #include "argmatch.h" - #include "linebuffer.h" -@@ -32,6 +42,13 @@ - #include "xstrtol.h" - #include "memcasecmp.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "uniq" - -@@ -106,6 +123,12 @@ static enum delimit_method const delimit - /* Select whether/how to delimit groups of duplicate lines. */ - static enum delimit_method delimit_groups; - -+/* Function pointers. */ -+static char * (*find_field) (struct linebuffer *line); -+ -+/* Show the blank character class. */ -+wctype_t blank_type; -+ - static struct option const longopts[] = - { - {"count", no_argument, NULL, 'c'}, -@@ -202,7 +225,7 @@ size_opt (char const *opt, char const *m - return a pointer to the beginning of the line's field to be compared. 
*/ - - static char * --find_field (struct linebuffer const *line) -+find_field_uni (struct linebuffer const *line) - { - size_t count; - char const *lp = line->buffer; -@@ -223,6 +246,83 @@ find_field (struct linebuffer const *lin - return line->buffer + i; - } - -+#if HAVE_MBRTOWC -+ -+# define MBCHAR_TO_WCHAR(WC, MBLENGTH, LP, POS, SIZE, STATEP, CONVFAIL) \ -+ do \ -+ { \ -+ mbstate_t state_bak; \ -+ \ -+ CONVFAIL = 0; \ -+ state_bak = *STATEP; \ -+ \ -+ MBLENGTH = mbrtowc (&WC, LP + POS, SIZE - POS, STATEP); \ -+ \ -+ switch (MBLENGTH) \ -+ { \ -+ case (size_t)-2: \ -+ case (size_t)-1: \ -+ *STATEP = state_bak; \ -+ CONVFAIL++; \ -+ /* Fall through */ \ -+ case 0: \ -+ MBLENGTH = 1; \ -+ } \ -+ } \ -+ while (0) -+ -+static char * -+find_field_multi (struct linebuffer const *line) -+{ -+ size_t count; -+ char *lp = line->buffer; -+ size_t size = line->length - 1; -+ size_t pos; -+ size_t mblength; -+ wchar_t wc; -+ mbstate_t *statep; -+ int convfail; -+ -+ pos = 0; -+ statep = &line->state; -+ -+ /* skip fields. */ -+ for (count = 0; count < skip_fields && pos < size; count++) -+ { -+ while (pos < size) -+ { -+ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); -+ -+ if (convfail || !iswctype (wc, blank_type)) -+ { -+ pos += mblength; -+ break; -+ } -+ pos += mblength; -+ } -+ -+ while (pos < size) -+ { -+ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); -+ -+ if (!convfail && iswctype (wc, blank_type)) -+ break; -+ -+ pos += mblength; -+ } -+ } -+ -+ /* skip fields. */ -+ for (count = 0; count < skip_chars && pos < size; count++) -+ { -+ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); -+ pos += mblength; -+ } -+ -+ return lp + pos; -+} -+#endif -+ - /* Return false if two strings OLD and NEW match, true if not. - OLD and NEW point not to the beginnings of the lines - but rather to the beginnings of the fields to compare. 
-@@ -247,6 +347,73 @@ different (char *old, char *new, size_t - return oldlen != newlen || memcmp (old, new, oldlen); - } - -+#if HAVE_MBRTOWC -+static int -+different_multi (const char *old, const char *new, size_t oldlen, size_t newlen, mbstate_t oldstate, mbstate_t newstate) -+{ -+ size_t i, j, chars; -+ const char *str[2]; -+ char *copy[2]; -+ size_t len[2]; -+ mbstate_t state[2]; -+ size_t mblength; -+ wchar_t wc, uwc; -+ mbstate_t state_bak; -+ -+ str[0] = old; -+ str[1] = new; -+ len[0] = oldlen; -+ len[1] = newlen; -+ state[0] = oldstate; -+ state[1] = newstate; -+ -+ for (i = 0; i < 2; i++) -+ { -+ copy[i] = alloca (len[i] + 1); -+ -+ for (j = 0, chars = 0; j < len[i] && chars < check_chars; chars++) -+ { -+ state_bak = state[i]; -+ mblength = mbrtowc (&wc, str[i] + j, len[i] - j, &state[i]); -+ -+ switch (mblength) -+ { -+ case (size_t)-1: -+ case (size_t)-2: -+ state[i] = state_bak; -+ /* Fall through */ -+ case 0: -+ mblength = 1; -+ break; -+ -+ default: -+ if (ignore_case) -+ { -+ uwc = towupper (wc); -+ -+ if (uwc != wc) -+ { -+ mbstate_t state_wc; -+ -+ memset (&state_wc, '\0', sizeof (mbstate_t)); -+ wcrtomb (copy[i] + j, uwc, &state_wc); -+ } -+ else -+ memcpy (copy[i] + j, str[i] + j, mblength); -+ } -+ else -+ memcpy (copy[i] + j, str[i] + j, mblength); -+ } -+ j += mblength; -+ } -+ copy[i][j] = '\0'; -+ len[i] = j; -+ } -+ -+ return xmemcoll (copy[0], len[0], copy[1], len[1]); -+} -+#endif -+ - /* Output the line in linebuffer LINE to standard output - provided that the switches say it should be output. - MATCH is true if the line matches the previous line. 
-@@ -299,15 +466,42 @@ check_file (const char *infile, const ch - { - char *prevfield IF_LINT (= NULL); - size_t prevlen IF_LINT (= 0); -+#if HAVE_MBRTOWC -+ mbstate_t prevstate; - -+ memset (&prevstate, '\0', sizeof (mbstate_t)); -+#endif - while (!feof (stdin)) - { - char *thisfield; - size_t thislen; -+#if HAVE_MBRTOWC -+ mbstate_t thisstate; -+#endif - if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) - break; - thisfield = find_field (thisline); - thislen = thisline->length - 1 - (thisfield - thisline->buffer); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ thisstate = thisline->state; -+ -+ if (prevline->length == 0 -+ || different_multi (thisfield, prevfield, thislen, prevlen, -+ thisstate, prevstate)) -+ { -+ fwrite (thisline->buffer, sizeof (char), -+ thisline->length, stdout); -+ -+ SWAP_LINES (prevline, thisline); -+ prevfield = thisfield; -+ prevlen = thislen; -+ prevstate = thisstate; -+ } -+ } -+ else -+#endif - if (prevline->length == 0 - || different (thisfield, prevfield, thislen, prevlen)) - { -@@ -326,17 +520,26 @@ check_file (const char *infile, const ch - size_t prevlen; - uintmax_t match_count = 0; - bool first_delimiter = true; -+#if HAVE_MBRTOWC -+ mbstate_t prevstate; -+#endif - - if (readlinebuffer_delim (prevline, stdin, delimiter) == 0) - goto closefiles; - prevfield = find_field (prevline); - prevlen = prevline->length - 1 - (prevfield - prevline->buffer); -+#if HAVE_MBRTOWC -+ prevstate = prevline->state; -+#endif - - while (!feof (stdin)) - { - bool match; - char *thisfield; - size_t thislen; -+#if HAVE_MBRTOWC -+ mbstate_t thisstate; -+#endif - if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) - { - if (ferror (stdin)) -@@ -345,6 +548,15 @@ check_file (const char *infile, const ch - } - thisfield = find_field (thisline); - thislen = thisline->length - 1 - (thisfield - thisline->buffer); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ thisstate = thisline->state; -+ match = !different_multi (thisfield, 
prevfield, -+ thislen, prevlen, thisstate, prevstate); -+ } -+ else -+#endif - match = !different (thisfield, prevfield, thislen, prevlen); - match_count += match; - -@@ -377,6 +589,9 @@ check_file (const char *infile, const ch - SWAP_LINES (prevline, thisline); - prevfield = thisfield; - prevlen = thislen; -+#if HAVE_MBRTOWC -+ prevstate = thisstate; -+#endif - if (!match) - match_count = 0; - } -@@ -422,6 +637,18 @@ main (int argc, char **argv) - - atexit (close_stdout); - -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ find_field = find_field_multi; -+ blank_type = wctype ("blank"); -+ } -+ else -+#endif -+ { -+ find_field = find_field_uni; -+ } -+ - skip_chars = 0; - skip_fields = 0; - check_chars = SIZE_MAX; -Index: tests/misc/cut -=================================================================== ---- coreutils-7.1/tests/misc/cut.orig 2008-09-18 09:06:57.000000000 +0200 -+++ coreutils-7.1/tests/misc/cut 2010-06-29 18:49:32.091533700 +0200 -@@ -26,7 +26,7 @@ use strict; - my $prog = 'cut'; - my $try = "Try \`$prog --help' for more information.\n"; - my $from_1 = "$prog: fields and positions are numbered from 1\n$try"; --my $inval = "$prog: invalid byte or field list\n$try"; -+my $inval = "$prog: invalid byte, character or field list\n$try"; - my $no_endpoint = "$prog: invalid range with no endpoint: -\n$try"; - - my @Tests = diff --git a/coreutils-5.3.0-sbin4su.diff b/coreutils-5.3.0-sbin4su.patch similarity index 90% rename from coreutils-5.3.0-sbin4su.diff rename to coreutils-5.3.0-sbin4su.patch index bf2cc6c..3af4168 100644 --- a/coreutils-5.3.0-sbin4su.diff +++ b/coreutils-5.3.0-sbin4su.patch @@ -1,8 +1,8 @@ Index: src/su.c =================================================================== ---- src/su.c.orig 2010-05-04 17:29:12.779359204 +0200 -+++ src/su.c 2010-05-04 17:29:12.939359620 +0200 -@@ -467,6 +467,117 @@ correct_password (const struct passwd *p +--- src/su.c.orig 2010-05-05 14:46:48.000000000 +0200 ++++ src/su.c 2010-05-05 14:48:55.023359308 
+0200 +@@ -454,6 +454,117 @@ correct_password (const struct passwd *p #endif /* !USE_PAM */ } @@ -120,7 +120,7 @@ Index: src/su.c /* Update `environ' for the new shell based on PW, with SHELL being the value for the SHELL environment variable. */ -@@ -506,6 +617,22 @@ modify_environment (const struct passwd +@@ -493,6 +604,22 @@ modify_environment (const struct passwd DEFAULT_LOGIN_PATH) : getdef_str ("SUPATH", DEFAULT_ROOT_LOGIN_PATH))); @@ -140,6 +140,6 @@ Index: src/su.c + free (new); + } + } - if (pw->pw_uid) - { - xsetenv ("USER", pw->pw_name); + if (pw->pw_uid) + { + xsetenv ("USER", pw->pw_name); diff --git a/coreutils-6.8-su.diff b/coreutils-6.8-su.patch similarity index 78% rename from coreutils-6.8-su.diff rename to coreutils-6.8-su.patch index 090b0f3..c8e3e05 100644 --- a/coreutils-6.8-su.diff +++ b/coreutils-6.8-su.patch @@ -1,6 +1,10 @@ ---- Makefile.in -+++ Makefile.in -@@ -732,6 +732,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Add pam support in su + +Index: Makefile.in +=================================================================== +--- Makefile.in.orig 2010-04-23 17:58:41.000000000 +0200 ++++ Makefile.in 2010-05-06 19:37:44.784359208 +0200 +@@ -961,6 +961,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -8,41 +12,35 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- configure -+++ configure -@@ -612,6 +612,7 @@ OPTIONAL_BIN_PROGS +Index: configure +=================================================================== +--- configure.orig 2010-05-06 19:37:44.688359301 +0200 ++++ configure 2010-05-06 19:37:44.816359169 +0200 +@@ -631,6 +631,7 @@ OPTIONAL_BIN_PROGS INSTALL_SU LIB_GMP LIB_CRYPT +PAM_LIBS + GNULIB_WARN_CFLAGS WERROR_CFLAGS SEQ_LIBM - LIB_CAP -@@ -1231,6 +1232,7 @@ with_included_regex - enable_xattr +@@ -1501,6 +1502,7 @@ enable_xattr enable_libcap + with_tty_group enable_gcc_warnings +enable_pam with_gmp 
enable_install_program enable_no_install_program -@@ -1877,6 +1879,7 @@ Optional Features: +@@ -2152,6 +2154,7 @@ Optional Features: --disable-xattr do not support extended attributes --disable-libcap disable libcap support - --enable-gcc-warnings turn on lots of GCC warnings (not recommended) -+ --disable-pam Enable PAM support in su (default=auto) + --enable-gcc-warnings turn on lots of GCC warnings (for developers) ++ --disable-pam Disable PAM support in su (default=auto) --enable-install-program=PROG_LIST install the programs in PROG_LIST (comma-separated, default: none) -@@ -26931,7 +26934,6 @@ fi - - - -- - XGETTEXT_EXTRA_OPTIONS="$XGETTEXT_EXTRA_OPTIONS --keyword='proper_name:1,\"This is a proper name. See the gettext manual, section Names.\"'" - - -@@ -39096,6 +39098,111 @@ $as_echo "#define HAVE_WORKING_FORK 1" > +@@ -51989,6 +51992,111 @@ $as_echo "#define HAVE_WORKING_FORK 1" > fi @@ -152,11 +150,13 @@ +$as_echo "$enable_pam" >&6; } + optional_bin_progs= - for ac_func in uname - do ---- configure.ac -+++ configure.ac -@@ -79,6 +79,20 @@ fi + for ac_func in chroot + do : +Index: configure.ac +=================================================================== +--- configure.ac.orig 2010-03-13 16:14:09.000000000 +0100 ++++ configure.ac 2010-05-06 19:37:44.843292013 +0200 +@@ -128,6 +128,20 @@ fi AC_FUNC_FORK @@ -175,11 +175,13 @@ +AC_MSG_RESULT([$enable_pam]) + optional_bin_progs= - AC_CHECK_FUNCS([uname], - gl_ADD_PROG([optional_bin_progs], [uname])) ---- doc/Makefile.in -+++ doc/Makefile.in -@@ -713,6 +713,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ + AC_CHECK_FUNCS([chroot], + gl_ADD_PROG([optional_bin_progs], [chroot])) +Index: doc/Makefile.in +=================================================================== +--- doc/Makefile.in.orig 2010-04-23 17:58:37.000000000 +0200 ++++ doc/Makefile.in 2010-05-06 19:37:44.868359246 +0200 +@@ -957,6 +957,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ 
PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -187,9 +189,11 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- gnulib-tests/Makefile.in -+++ gnulib-tests/Makefile.in -@@ -1421,6 +1421,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Index: gnulib-tests/Makefile.in +=================================================================== +--- gnulib-tests/Makefile.in.orig 2010-04-23 18:00:33.000000000 +0200 ++++ gnulib-tests/Makefile.in 2010-05-06 19:37:44.871374260 +0200 +@@ -2191,6 +2191,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -197,9 +201,11 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- lib/Makefile.in -+++ lib/Makefile.in -@@ -763,6 +763,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Index: lib/Makefile.in +=================================================================== +--- lib/Makefile.in.orig 2010-04-23 17:58:38.000000000 +0200 ++++ lib/Makefile.in 2010-05-06 19:37:59.594863753 +0200 +@@ -1006,6 +1006,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -207,9 +213,11 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- man/Makefile.in -+++ man/Makefile.in -@@ -703,6 +703,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Index: man/Makefile.in +=================================================================== +--- man/Makefile.in.orig 2010-05-06 19:37:44.618920753 +0200 ++++ man/Makefile.in 2010-05-06 19:37:44.934868934 +0200 +@@ -926,6 +926,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -217,24 +225,28 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- src/Makefile.am -+++ src/Makefile.am -@@ -147,7 +147,8 @@ tail_LDADD = $(nanosec_libs) - # If necessary, add -lm to resolve 
use of pow in lib/strtod.c. - uptime_LDADD = $(LDADD) $(POW_LIB) $(GETLOADAVG_LIBS) +Index: src/Makefile.am +=================================================================== +--- src/Makefile.am.orig 2010-04-23 15:44:14.000000000 +0200 ++++ src/Makefile.am 2010-05-06 19:37:59.594863753 +0200 +@@ -364,7 +364,8 @@ factor_LDADD += $(LIB_GMP) + uptime_LDADD += $(GETLOADAVG_LIBS) --su_LDADD = $(LDADD) $(LIB_CRYPT) + # for crypt +-su_LDADD += $(LIB_CRYPT) +su_SOURCES = su.c getdef.c +su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) - dir_LDADD += $(LIB_ACL) - ls_LDADD += $(LIB_ACL) ---- src/Makefile.in -+++ src/Makefile.in -@@ -605,9 +605,10 @@ stty_OBJECTS = stty.$(OBJEXT) - stty_LDADD = $(LDADD) - stty_DEPENDENCIES = libver.a ../lib/libcoreutils.a \ - $(am__DEPENDENCIES_1) ../lib/libcoreutils.a + # for various ACL functions + copy_LDADD += $(LIB_ACL) +Index: src/Makefile.in +=================================================================== +--- src/Makefile.in.orig 2010-04-23 18:35:11.000000000 +0200 ++++ src/Makefile.in 2010-05-06 19:37:59.594863753 +0200 +@@ -553,9 +553,10 @@ stdbuf_DEPENDENCIES = $(am__DEPENDENCIES + stty_SOURCES = stty.c + stty_OBJECTS = stty.$(OBJEXT) + stty_DEPENDENCIES = $(am__DEPENDENCIES_2) -su_SOURCES = su.c -su_OBJECTS = su.$(OBJEXT) -su_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) @@ -244,40 +256,28 @@ + $(am__DEPENDENCIES_1) sum_SOURCES = sum.c sum_OBJECTS = sum.$(OBJEXT) - sum_LDADD = $(LDADD) -@@ -735,11 +736,11 @@ SOURCES = $(nodist_libver_a_SOURCES) $(_ - $(rm_SOURCES) $(rmdir_SOURCES) runcon.c seq.c setuidgid.c \ - $(sha1sum_SOURCES) $(sha224sum_SOURCES) $(sha256sum_SOURCES) \ - $(sha384sum_SOURCES) $(sha512sum_SOURCES) shred.c shuf.c \ -- sleep.c sort.c split.c stat.c stty.c su.c sum.c sync.c tac.c \ -- tail.c tee.c test.c $(timeout_SOURCES) touch.c tr.c true.c \ -- truncate.c tsort.c tty.c $(uname_SOURCES) unexpand.c uniq.c \ -- unlink.c uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c \ -- yes.c -+ sleep.c sort.c 
split.c stat.c stty.c $(su_SOURCES) sum.c \ -+ sync.c tac.c tail.c tee.c test.c $(timeout_SOURCES) touch.c \ -+ tr.c true.c truncate.c tsort.c tty.c $(uname_SOURCES) \ -+ unexpand.c uniq.c unlink.c uptime.c users.c $(vdir_SOURCES) \ -+ wc.c who.c whoami.c yes.c - DIST_SOURCES = $(__SOURCES) $(arch_SOURCES) base64.c basename.c cat.c \ - chcon.c $(chgrp_SOURCES) chmod.c $(chown_SOURCES) chroot.c \ - cksum.c comm.c $(cp_SOURCES) csplit.c cut.c date.c dd.c df.c \ -@@ -754,10 +755,10 @@ DIST_SOURCES = $(__SOURCES) $(arch_SOURC + sum_DEPENDENCIES = $(am__DEPENDENCIES_2) +@@ -665,8 +666,8 @@ SOURCES = $(nodist_libver_a_SOURCES) $(_ $(rmdir_SOURCES) runcon.c seq.c setuidgid.c $(sha1sum_SOURCES) \ $(sha224sum_SOURCES) $(sha256sum_SOURCES) $(sha384sum_SOURCES) \ $(sha512sum_SOURCES) shred.c shuf.c sleep.c sort.c split.c \ -- stat.c stty.c su.c sum.c sync.c tac.c tail.c tee.c test.c \ -- $(timeout_SOURCES) touch.c tr.c true.c truncate.c tsort.c \ -- tty.c $(uname_SOURCES) unexpand.c uniq.c unlink.c uptime.c \ -- users.c $(vdir_SOURCES) wc.c who.c whoami.c yes.c -+ stat.c stty.c $(su_SOURCES) sum.c sync.c tac.c tail.c tee.c \ -+ test.c $(timeout_SOURCES) touch.c tr.c true.c truncate.c \ -+ tsort.c tty.c $(uname_SOURCES) unexpand.c uniq.c unlink.c \ -+ uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c yes.c - HEADERS = $(noinst_HEADERS) - ETAGS = etags - CTAGS = ctags -@@ -1209,6 +1210,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +- stat.c stdbuf.c stty.c su.c sum.c sync.c tac.c tail.c tee.c \ +- test.c $(timeout_SOURCES) touch.c tr.c true.c truncate.c \ ++ stat.c stdbuf.c stty.c $(su_SOURCES) sum.c sync.c tac.c tail.c \ ++ tee.c test.c $(timeout_SOURCES) touch.c tr.c true.c truncate.c \ + tsort.c tty.c $(uname_SOURCES) unexpand.c uniq.c unlink.c \ + uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c yes.c + DIST_SOURCES = $(__SOURCES) $(arch_SOURCES) base64.c basename.c cat.c \ +@@ -683,7 +684,7 @@ DIST_SOURCES = $(__SOURCES) $(arch_SOURC + $(rm_SOURCES) $(rmdir_SOURCES) 
runcon.c seq.c setuidgid.c \ + $(sha1sum_SOURCES) $(sha224sum_SOURCES) $(sha256sum_SOURCES) \ + $(sha384sum_SOURCES) $(sha512sum_SOURCES) shred.c shuf.c \ +- sleep.c sort.c split.c stat.c stdbuf.c stty.c su.c sum.c \ ++ sleep.c sort.c split.c stat.c stdbuf.c stty.c $(su_SOURCES) sum.c \ + sync.c tac.c tail.c tee.c test.c $(timeout_SOURCES) touch.c \ + tr.c true.c truncate.c tsort.c tty.c $(uname_SOURCES) \ + unexpand.c uniq.c unlink.c uptime.c users.c $(vdir_SOURCES) \ +@@ -1338,6 +1339,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -285,17 +285,17 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ -@@ -1511,7 +1513,8 @@ tail_LDADD = $(nanosec_libs) +@@ -1743,7 +1745,8 @@ stdbuf_LDADD = $(LDADD) $(LIBICONV) + stty_LDADD = $(LDADD) - # If necessary, add -lm to resolve use of pow in lib/strtod.c. - uptime_LDADD = $(LDADD) $(POW_LIB) $(GETLOADAVG_LIBS) + # for crypt -su_LDADD = $(LDADD) $(LIB_CRYPT) +su_SOURCES = su.c getdef.c +su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) - stat_LDADD = $(LDADD) $(LIB_SELINUX) - - # programs that use getaddrinfo (e.g., via canon_host) -@@ -2040,6 +2043,7 @@ distclean-compile: + sum_LDADD = $(LDADD) + sync_LDADD = $(LDADD) + tac_LDADD = $(LDADD) $(LIB_GETHRXTIME) +@@ -2386,6 +2389,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/false.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fmt.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fold.Po@am__quote@ @@ -303,8 +303,10 @@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/getlimits.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-copy.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-cp-hash.Po@am__quote@ ---- src/getdef.c -+++ src/getdef.c +Index: src/getdef.c +=================================================================== +--- /dev/null 1970-01-01 
00:00:00.000000000 +0000 ++++ src/getdef.c 2010-05-06 19:37:45.014990147 +0200 @@ -0,0 +1,259 @@ +/* Copyright (C) 2003, 2004, 2005 Thorsten Kukuk + Author: Thorsten Kukuk @@ -565,8 +567,10 @@ +} + +#endif ---- src/getdef.h -+++ src/getdef.h +Index: src/getdef.h +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ src/getdef.h 2010-05-06 19:37:45.054863903 +0200 @@ -0,0 +1,29 @@ +/* Copyright (C) 2003, 2005 Thorsten Kukuk + Author: Thorsten Kukuk @@ -597,8 +601,10 @@ +extern void free_getdef_data (void); + +#endif /* _GETDEF_H_ */ ---- src/su.c -+++ src/su.c +Index: src/su.c +=================================================================== +--- src/su.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/su.c 2010-05-06 19:37:59.538860383 +0200 @@ -37,6 +37,16 @@ restricts who can su to UID 0 accounts. RMS considers that to be fascist. @@ -616,7 +622,7 @@ Compile-time options: -DSYSLOG_SUCCESS Log successful su's (by default, to root) with syslog. -DSYSLOG_FAILURE Log failed su's (by default, to root) with syslog. -@@ -52,6 +62,13 @@ +@@ -52,12 +62,22 @@ #include #include #include @@ -628,9 +634,8 @@ +#include +#endif - /* Hide any system prototype for getusershell. - This is necessary because some Cray systems have a conflicting -@@ -65,6 +82,9 @@ + #include "system.h" + #include "getpass.h" #if HAVE_SYSLOG_H && HAVE_SYSLOG # include @@ -640,7 +645,7 @@ #else # undef SYSLOG_SUCCESS # undef SYSLOG_FAILURE -@@ -98,19 +118,13 @@ +@@ -91,19 +111,13 @@ # include #endif @@ -664,18 +669,20 @@ /* The shell to run if none is given in the user's passwd entry. */ #define DEFAULT_SHELL "/bin/sh" -@@ -118,13 +132,22 @@ +@@ -111,8 +125,9 @@ /* The user to become if none is specified. 
*/ #define DEFAULT_USER "root" +#ifndef USE_PAM char *crypt (char const *key, char const *salt); +- +#endif - char *getusershell (void); - void endusershell (void); - void setusershell (void); + static void run_shell (char const *, char const *, char **, size_t) + ATTRIBUTE_NORETURN; - extern char **environ; +@@ -125,6 +140,13 @@ static bool simulate_login; + /* If true, change some environment vars to indicate the user su'd to. */ + static bool change_environment; +#ifdef USE_PAM +static bool _pam_session_opened; @@ -684,10 +691,10 @@ +static void create_watching_parent (void); +#endif + - static void run_shell (char const *, char const *, char **, size_t) - ATTRIBUTE_NORETURN; - -@@ -212,7 +235,162 @@ log_su (struct passwd const *pw, bool su + static struct option const longopts[] = + { + {"command", required_argument, NULL, 'c'}, +@@ -200,7 +222,162 @@ log_su (struct passwd const *pw, bool su } #endif @@ -772,7 +779,7 @@ + /* the child proceeds to run the shell */ + if (child == 0) + return; -+ ++ + /* In the parent watch the child. */ + + /* su without pam support does not have a helper that keeps @@ -850,7 +857,7 @@ Return true if the user gives the correct password for entry PW, false if not. Return true without asking for a password if run by UID 0 or if PW has an empty password. 
*/ -@@ -220,10 +398,52 @@ log_su (struct passwd const *pw, bool su +@@ -208,10 +385,52 @@ log_su (struct passwd const *pw, bool su static bool correct_password (const struct passwd *pw) { @@ -904,7 +911,7 @@ endspent (); if (sp) -@@ -244,6 +464,7 @@ correct_password (const struct passwd *p +@@ -232,6 +451,7 @@ correct_password (const struct passwd *p encrypted = crypt (unencrypted, correct); memset (unencrypted, 0, strlen (unencrypted)); return STREQ (encrypted, correct); @@ -912,33 +919,33 @@ } /* Update `environ' for the new shell based on PW, with SHELL being -@@ -268,8 +489,8 @@ modify_environment (const struct passwd +@@ -256,8 +476,8 @@ modify_environment (const struct passwd xsetenv ("USER", pw->pw_name); xsetenv ("LOGNAME", pw->pw_name); xsetenv ("PATH", (pw->pw_uid -- ? DEFAULT_LOGIN_PATH -- : DEFAULT_ROOT_LOGIN_PATH)); +- ? DEFAULT_LOGIN_PATH +- : DEFAULT_ROOT_LOGIN_PATH)); + ? getdef_str ("PATH", DEFAULT_LOGIN_PATH) + : getdef_str ("SUPATH", DEFAULT_ROOT_LOGIN_PATH))); } else { -@@ -279,6 +500,12 @@ modify_environment (const struct passwd - { - xsetenv ("HOME", pw->pw_dir); - xsetenv ("SHELL", shell); +@@ -267,6 +487,12 @@ modify_environment (const struct passwd + { + xsetenv ("HOME", pw->pw_dir); + xsetenv ("SHELL", shell); + if (getdef_bool ("ALWAYS_SET_PATH", 0)) + xsetenv ("PATH", (pw->pw_uid + ? 
getdef_str ("PATH", + DEFAULT_LOGIN_PATH) + : getdef_str ("SUPATH", + DEFAULT_ROOT_LOGIN_PATH))); - if (pw->pw_uid) - { - xsetenv ("USER", pw->pw_name); -@@ -286,19 +513,41 @@ modify_environment (const struct passwd - } - } + if (pw->pw_uid) + { + xsetenv ("USER", pw->pw_name); +@@ -274,19 +500,41 @@ modify_environment (const struct passwd + } + } } + +#ifdef USE_PAM @@ -955,7 +962,7 @@ #ifdef HAVE_INITGROUPS errno = 0; if (initgroups (pw->pw_name, pw->pw_gid) == -1) -- error (EXIT_FAILURE, errno, _("cannot set groups")); +- error (EXIT_CANCELED, errno, _("cannot set groups")); + { +#ifdef USE_PAM + cleanup_pam (PAM_ABORT); @@ -978,17 +985,17 @@ +change_identity (const struct passwd *pw) +{ if (setgid (pw->pw_gid)) - error (EXIT_FAILURE, errno, _("cannot set group id")); + error (EXIT_CANCELED, errno, _("cannot set group id")); if (setuid (pw->pw_uid)) -@@ -491,6 +740,7 @@ main (int argc, char **argv) +@@ -479,6 +727,7 @@ main (int argc, char **argv) #ifdef SYSLOG_FAILURE log_su (pw, false); #endif + sleep (getdef_num ("FAIL_DELAY", 1)); - error (EXIT_FAILURE, 0, _("incorrect password")); + error (EXIT_CANCELED, 0, _("incorrect password")); } #ifdef SYSLOG_SUCCESS -@@ -512,9 +762,21 @@ main (int argc, char **argv) +@@ -500,9 +749,21 @@ main (int argc, char **argv) shell = NULL; } shell = xstrdup (shell ? 
shell : pw->pw_shell); @@ -1011,9 +1018,11 @@ if (simulate_login && chdir (pw->pw_dir) != 0) error (0, errno, _("warning: cannot change directory to %s"), pw->pw_dir); ---- tests/Makefile.in -+++ tests/Makefile.in -@@ -677,6 +677,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Index: tests/Makefile.in +=================================================================== +--- tests/Makefile.in.orig 2010-04-23 17:58:39.000000000 +0200 ++++ tests/Makefile.in 2010-05-06 19:37:45.091861849 +0200 +@@ -986,6 +986,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ diff --git a/coreutils-6.8.0-pie.diff b/coreutils-6.8.0-pie.patch similarity index 76% rename from coreutils-6.8.0-pie.diff rename to coreutils-6.8.0-pie.patch index 36f565f..2a22116 100644 --- a/coreutils-6.8.0-pie.diff +++ b/coreutils-6.8.0-pie.patch @@ -1,28 +1,35 @@ ---- lib/Makefile.am -+++ lib/Makefile.am -@@ -18,6 +18,7 @@ +Index: lib/Makefile.am +=================================================================== +--- lib/Makefile.am.orig 2010-01-01 14:06:47.000000000 +0100 ++++ lib/Makefile.am 2010-05-05 14:38:03.083359277 +0200 +@@ -17,7 +17,7 @@ + include gnulib.mk - AM_CFLAGS = $(WARN_CFLAGS) # $(WERROR_CFLAGS) -+AM_CFLAGS += -fpie +-AM_CFLAGS += $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) ++AM_CFLAGS += $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) -fpie libcoreutils_a_SOURCES += \ buffer-lcm.c buffer-lcm.h \ ---- lib/Makefile.in -+++ lib/Makefile.in -@@ -1169,7 +1169,7 @@ GPERF = gperf - LINK_WARNING_H = $(top_srcdir)/build-aux/link-warning.h - charset_alias = $(DESTDIR)$(libdir)/charset.alias - charset_tmp = $(DESTDIR)$(libdir)/charset.tmp --AM_CFLAGS = $(WARN_CFLAGS) # $(WERROR_CFLAGS) -+AM_CFLAGS = $(WARN_CFLAGS) -fpie - all: $(BUILT_SOURCES) config.h - $(MAKE) $(AM_MAKEFLAGS) all-recursive - ---- src/Makefile.am -+++ src/Makefile.am -@@ -149,6 +149,10 @@ uptime_LDADD = $(LDADD) $(POW_LIB) $(GET - +Index: lib/Makefile.in 
+=================================================================== +--- lib/Makefile.in.orig 2010-05-05 14:37:08.000000000 +0200 ++++ lib/Makefile.in 2010-05-05 14:38:31.946859277 +0200 +@@ -1432,7 +1432,7 @@ DISTCLEANFILES = + MAINTAINERCLEANFILES = getdate.c iconv_open-aix.h iconv_open-hpux.h \ + iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h + AM_CPPFLAGS = +-AM_CFLAGS = $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) ++AM_CFLAGS = $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) -fpie + libcoreutils_a_SOURCES = set-mode-acl.c copy-acl.c file-has-acl.c \ + areadlink.c areadlink-with-size.c areadlinkat.c argv-iter.c \ + argv-iter.h base64.h base64.c bitrotate.h c-ctype.h c-ctype.c \ +Index: src/Makefile.am +=================================================================== +--- src/Makefile.am.orig 2010-05-05 14:37:08.000000000 +0200 ++++ src/Makefile.am 2010-05-05 14:39:20.956359221 +0200 +@@ -366,6 +366,10 @@ uptime_LDADD += $(GETLOADAVG_LIBS) + # for crypt su_SOURCES = su.c getdef.c su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) +su_CFLAGS = -fpie @@ -30,14 +37,16 @@ +timeout_CFLAGS = -fpie +timeout_LDFLAGS = -pie - dir_LDADD += $(LIB_ACL) - ls_LDADD += $(LIB_ACL) ---- src/Makefile.in -+++ src/Makefile.in -@@ -605,10 +605,12 @@ stty_OBJECTS = stty.$(OBJEXT) - stty_LDADD = $(LDADD) - stty_DEPENDENCIES = libver.a ../lib/libcoreutils.a \ - $(am__DEPENDENCIES_1) ../lib/libcoreutils.a + # for various ACL functions + copy_LDADD += $(LIB_ACL) +Index: src/Makefile.in +=================================================================== +--- src/Makefile.in.orig 2010-05-05 14:37:08.000000000 +0200 ++++ src/Makefile.in 2010-05-05 14:46:02.318905172 +0200 +@@ -553,10 +553,12 @@ stdbuf_DEPENDENCIES = $(am__DEPENDENCIES + stty_SOURCES = stty.c + stty_OBJECTS = stty.$(OBJEXT) + stty_DEPENDENCIES = $(am__DEPENDENCIES_2) -am_su_OBJECTS = su.$(OBJEXT) getdef.$(OBJEXT) +am_su_OBJECTS = su-su.$(OBJEXT) su-getdef.$(OBJEXT) su_OBJECTS = $(am_su_OBJECTS) @@ -47,8 +56,8 @@ + $@ sum_SOURCES = 
sum.c sum_OBJECTS = sum.$(OBJEXT) - sum_LDADD = $(LDADD) -@@ -633,9 +635,12 @@ tee_DEPENDENCIES = libver.a ../lib/libco + sum_DEPENDENCIES = $(am__DEPENDENCIES_2) +@@ -576,9 +578,12 @@ tee_DEPENDENCIES = $(am__DEPENDENCIES_2) test_SOURCES = test.c test_OBJECTS = test.$(OBJEXT) test_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) @@ -62,36 +71,36 @@ touch_SOURCES = touch.c touch_OBJECTS = touch.$(OBJEXT) touch_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) -@@ -1515,6 +1520,10 @@ tail_LDADD = $(nanosec_libs) - uptime_LDADD = $(LDADD) $(POW_LIB) $(GETLOADAVG_LIBS) +@@ -1747,6 +1752,10 @@ stty_LDADD = $(LDADD) + # for crypt su_SOURCES = su.c getdef.c su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) +su_CFLAGS = -fpie +su_LDFLAGS = -pie +timeout_CFLAGS = -fpie +timeout_LDFLAGS = -pie - stat_LDADD = $(LDADD) $(LIB_SELINUX) - - # programs that use getaddrinfo (e.g., via canon_host) -@@ -1933,7 +1942,7 @@ stty$(EXEEXT): $(stty_OBJECTS) $(stty_DE - $(LINK) $(stty_OBJECTS) $(stty_LDADD) $(LIBS) + sum_LDADD = $(LDADD) + sync_LDADD = $(LDADD) + tac_LDADD = $(LDADD) $(LIB_GETHRXTIME) +@@ -2279,7 +2288,7 @@ stty$(EXEEXT): $(stty_OBJECTS) $(stty_DE + $(AM_V_CCLD)$(LINK) $(stty_OBJECTS) $(stty_LDADD) $(LIBS) su$(EXEEXT): $(su_OBJECTS) $(su_DEPENDENCIES) @rm -f su$(EXEEXT) -- $(LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) -+ $(su_LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) +- $(AM_V_CCLD)$(LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) ++ $(AM_V_CCLD)$(su_LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) sum$(EXEEXT): $(sum_OBJECTS) $(sum_DEPENDENCIES) @rm -f sum$(EXEEXT) - $(LINK) $(sum_OBJECTS) $(sum_LDADD) $(LIBS) -@@ -1954,7 +1963,7 @@ test$(EXEEXT): $(test_OBJECTS) $(test_DE - $(LINK) $(test_OBJECTS) $(test_LDADD) $(LIBS) + $(AM_V_CCLD)$(LINK) $(sum_OBJECTS) $(sum_LDADD) $(LIBS) +@@ -2300,7 +2309,7 @@ test$(EXEEXT): $(test_OBJECTS) $(test_DE + $(AM_V_CCLD)$(LINK) $(test_OBJECTS) $(test_LDADD) $(LIBS) timeout$(EXEEXT): $(timeout_OBJECTS) $(timeout_DEPENDENCIES) @rm -f 
timeout$(EXEEXT) -- $(LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) -+ $(timeout_LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) +- $(AM_V_CCLD)$(LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) ++ $(AM_V_CCLD)$(timeout_LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) touch$(EXEEXT): $(touch_OBJECTS) $(touch_DEPENDENCIES) @rm -f touch$(EXEEXT) - $(LINK) $(touch_OBJECTS) $(touch_LDADD) $(LIBS) -@@ -2043,7 +2052,6 @@ distclean-compile: + $(AM_V_CCLD)$(LINK) $(touch_OBJECTS) $(touch_LDADD) $(LIBS) +@@ -2389,7 +2398,6 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/false.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fmt.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fold.Po@am__quote@ @@ -99,9 +108,9 @@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/getlimits.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-copy.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-cp-hash.Po@am__quote@ -@@ -2104,14 +2112,16 @@ distclean-compile: - @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/split.Po@am__quote@ +@@ -2453,14 +2461,16 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stat.Po@am__quote@ + @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stdbuf.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stty.Po@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/su.Po@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/su-getdef.Po@am__quote@ @@ -118,9 +127,9 @@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/touch.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tr.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/true.Po@am__quote@ -@@ -2286,6 +2296,62 @@ sha512sum-md5sum.obj: md5sum.c +@@ -2649,6 +2659,62 @@ sha512sum-md5sum.obj: md5sum.c @AMDEP_TRUE@@am__fastdepCC_FALSE@ DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@ - @am__fastdepCC_FALSE@ $(CC) $(DEFS) $(DEFAULT_INCLUDES) 
$(INCLUDES) $(sha512sum_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o sha512sum-md5sum.obj `if test -f 'md5sum.c'; then $(CYGPATH_W) 'md5sum.c'; else $(CYGPATH_W) '$(srcdir)/md5sum.c'; fi` + @am__fastdepCC_FALSE@ $(AM_V_CC@am__nodep@)$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(sha512sum_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o sha512sum-md5sum.obj `if test -f 'md5sum.c'; then $(CYGPATH_W) 'md5sum.c'; else $(CYGPATH_W) '$(srcdir)/md5sum.c'; fi` +su-su.o: su.c +@am__fastdepCC_TRUE@ $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(su_CFLAGS) $(CFLAGS) -MT su-su.o -MD -MP -MF $(DEPDIR)/su-su.Tpo -c -o su-su.o `test -f 'su.c' || echo '$(srcdir)/'`su.c diff --git a/coreutils-7.1.diff b/coreutils-7.1.diff deleted file mode 100644 index f755a19..0000000 --- a/coreutils-7.1.diff +++ /dev/null @@ -1,194 +0,0 @@ ---- configure -+++ configure -@@ -3029,7 +3029,6 @@ as_fn_append ac_func_list " fchmod" - as_fn_append ac_func_list " alarm" - as_fn_append ac_header_list " sys/statvfs.h" - as_fn_append ac_header_list " sys/select.h" --gl_printf_safe=yes - as_fn_append ac_func_list " readlink" - as_fn_append ac_header_list " utmp.h" - as_fn_append ac_header_list " utmpx.h" ---- doc/coreutils.texi -+++ doc/coreutils.texi -@@ -66,8 +66,6 @@ - * fold: (coreutils)fold invocation. Wrap long input lines. - * groups: (coreutils)groups invocation. Print group names a user is in. - * head: (coreutils)head invocation. Output the first part of files. --* hostid: (coreutils)hostid invocation. Print numeric host identifier. --* hostname: (coreutils)hostname invocation. Print or set system name. - * id: (coreutils)id invocation. Print user identity. - * install: (coreutils)install invocation. Copy and change attributes. - * join: (coreutils)join invocation. Join lines on a common field. -@@ -195,7 +193,7 @@ Free Documentation License''. 
- * File name manipulation:: dirname basename pathchk - * Working context:: pwd stty printenv tty - * User information:: id logname whoami groups users who --* System context:: date uname hostname hostid uptime -+* System context:: date uname uptime - * SELinux context:: chcon runcon - * Modified command invocation:: chroot env nice nohup su timeout - * Process control:: kill -@@ -409,8 +407,6 @@ System context - * arch invocation:: Print machine hardware name - * date invocation:: Print or set system date and time - * uname invocation:: Print system information --* hostname invocation:: Print or set system name --* hostid invocation:: Print numeric host identifier - * uptime invocation:: Print system uptime and load - - @command{date}: Print or set system date and time -@@ -12969,8 +12965,6 @@ information. - * arch invocation:: Print machine hardware name. - * date invocation:: Print or set system date and time. - * uname invocation:: Print system information. --* hostname invocation:: Print or set system name. --* hostid invocation:: Print numeric host identifier. - * uptime invocation:: Print system uptime and load - @end menu - -@@ -13928,54 +13922,6 @@ Print the kernel version. - @exitstatus - - --@node hostname invocation --@section @command{hostname}: Print or set system name -- --@pindex hostname --@cindex setting the hostname --@cindex printing the hostname --@cindex system name, printing --@cindex appropriate privileges -- --With no arguments, @command{hostname} prints the name of the current host --system. With one argument, it sets the current host name to the --specified string. You must have appropriate privileges to set the host --name. Synopsis: -- --@example --hostname [@var{name}] --@end example -- --The only options are @option{--help} and @option{--version}. @xref{Common --options}. -- --@exitstatus -- -- --@node hostid invocation --@section @command{hostid}: Print numeric host identifier. 
-- --@pindex hostid --@cindex printing the host identifier -- --@command{hostid} prints the numeric identifier of the current host --in hexadecimal. This command accepts no arguments. --The only options are @option{--help} and @option{--version}. --@xref{Common options}. -- --For example, here's what it prints on one system I use: -- --@example --$ hostid --1bac013d --@end example -- --On that system, the 32-bit quantity happens to be closely --related to the system's Internet address, but that isn't always --the case. -- --@exitstatus -- - @node uptime invocation - @section @command{uptime}: Print system uptime and load - ---- gnulib-tests/test-isnanl.h -+++ gnulib-tests/test-isnanl.h -@@ -75,7 +75,7 @@ main () - /* Quiet NaN. */ - ASSERT (isnanl (0.0L / 0.0L)); - --#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT -+#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT && 0 - /* A bit pattern that is different from a Quiet NaN. With a bit of luck, - it's a Signalling NaN. */ - { -@@ -117,6 +117,7 @@ main () - { LDBL80_WORDS (0xFFFF, 0x83333333, 0x00000000) }; - ASSERT (isnanl (x.value)); - } -+#if 0 - /* The isnanl function should recognize Pseudo-NaNs, Pseudo-Infinities, - Pseudo-Zeroes, Unnormalized Numbers, and Pseudo-Denormals, as defined in - Intel IA-64 Architecture Software Developer's Manual, Volume 1: -@@ -150,6 +151,7 @@ main () - ASSERT (isnanl (x.value)); - } - #endif -+#endif - - return 0; - } ---- m4/gnulib-comp.m4 -+++ m4/gnulib-comp.m4 -@@ -287,7 +287,6 @@ AC_DEFUN([gl_INIT], - gl_POSIXVER - gl_FUNC_PRINTF_FREXP - gl_FUNC_PRINTF_FREXPL -- m4_divert_text([INIT_PREPARE], [gl_printf_safe=yes]) - m4_ifdef([AM_XGETTEXT_OPTION], - [AM_XGETTEXT_OPTION([--keyword='proper_name:1,\"This is a proper name. See the gettext manual, section Names.\"']) - AM_XGETTEXT_OPTION([--keyword='proper_name_utf8:1,\"This is a proper name. 
See the gettext manual, section Names.\"'])]) ---- man/Makefile.am -+++ man/Makefile.am -@@ -184,7 +184,7 @@ check-x-vs-1: - PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ - t=ls-files.$$$$; \ - (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ -- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ -+ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ - | tr -s ' ' '\n' | sed 's/\.1$$//') \ - | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ - rm $$t ---- man/Makefile.in -+++ man/Makefile.in -@@ -1275,7 +1275,7 @@ check-x-vs-1: - PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ - t=ls-files.$$$$; \ - (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ -- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ -+ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ - | tr -s ' ' '\n' | sed 's/\.1$$//') \ - | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ - rm $$t ---- src/system.h -+++ src/system.h -@@ -156,7 +156,7 @@ enum - # define DEV_BSIZE BBSIZE - #endif - #ifndef DEV_BSIZE --# define DEV_BSIZE 4096 -+# define DEV_BSIZE 512 - #endif - - /* Extract or fake data from a `struct stat'. ---- tests/misc/help-version -+++ tests/misc/help-version -@@ -182,6 +182,7 @@ lbracket_args=": ]" - for i in $built_programs; do - # Skip these. 
- case $i in chroot|stty|tty|false|chcon|runcon) continue;; esac -+ case $i in df) continue;; esac - - rm -rf $tmp_in $tmp_in2 $tmp_dir $tmp_out - echo > $tmp_in ---- tests/other-fs-tmpdir -+++ tests/other-fs-tmpdir -@@ -42,6 +42,8 @@ for d in $CANDIDATE_TMP_DIRS; do - fi - - done -+# Autobuild hack -+test -f /bin/uname.bin && other_partition_tmpdir= - - if test -z "$other_partition_tmpdir"; then - skip_test_ \ diff --git a/coreutils-7.1.tar.xz b/coreutils-7.1.tar.xz deleted file mode 100644 index 5f576f5..0000000 --- a/coreutils-7.1.tar.xz +++ /dev/null @@ -1,3 +0,0 @@ -version https://git-lfs.github.com/spec/v1 -oid sha256:a584c6ce92f390c684dac00032e5c790ecc15cb0fa3e61891ac62401832ae108 -size 3967824 diff --git a/coreutils-8.5-i18n.patch b/coreutils-8.5-i18n.patch new file mode 100644 index 0000000..b043447 --- /dev/null +++ b/coreutils-8.5-i18n.patch @@ -0,0 +1,4066 @@ +Index: lib/linebuffer.h +=================================================================== +--- lib/linebuffer.h.orig 2010-04-23 15:44:00.000000000 +0200 ++++ lib/linebuffer.h 2010-05-07 16:13:30.696492151 +0200 +@@ -21,6 +21,11 @@ + + # include + ++/* Get mbstate_t. */ ++# if HAVE_WCHAR_H ++# include ++# endif ++ + /* A `struct linebuffer' holds a line of text. */ + + struct linebuffer +@@ -28,6 +33,9 @@ struct linebuffer + size_t size; /* Allocated. */ + size_t length; /* Used. */ + char *buffer; ++# if HAVE_WCHAR_H ++ mbstate_t state; ++# endif + }; + + /* Initialize linebuffer LINEBUFFER for use. */ +Index: src/cut.c +=================================================================== +--- src/cut.c.orig 2010-04-20 21:52:04.000000000 +0200 ++++ src/cut.c 2010-05-07 16:40:46.225492013 +0200 +@@ -28,6 +28,11 @@ + #include + #include + #include ++ ++/* Get mbstate_t, mbrtowc(). 
*/ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif + #include "system.h" + + #include "error.h" +@@ -36,6 +41,18 @@ + #include "quote.h" + #include "xstrndup.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# undef MB_LEN_MAX ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "cut" + +@@ -71,6 +88,52 @@ + } \ + while (0) + ++/* Refill the buffer BUF to get a multibyte character. */ ++#define REFILL_BUFFER(BUF, BUFPOS, BUFLEN, STREAM) \ ++ do \ ++ { \ ++ if (BUFLEN < MB_LEN_MAX && !feof (STREAM) && !ferror (STREAM)) \ ++ { \ ++ memmove (BUF, BUFPOS, BUFLEN); \ ++ BUFLEN += fread (BUF + BUFLEN, sizeof(char), BUFSIZ, STREAM); \ ++ BUFPOS = BUF; \ ++ } \ ++ } \ ++ while (0) ++ ++/* Get wide character on BUFPOS. BUFPOS is not included after that. ++ If byte sequence is not valid as a character, CONVFAIL is 1. Otherwise 0. */ ++#define GET_NEXT_WC_FROM_BUFFER(WC, BUFPOS, BUFLEN, MBLENGTH, STATE, CONVFAIL) \ ++ do \ ++ { \ ++ mbstate_t state_bak; \ ++ \ ++ if (BUFLEN < 1) \ ++ { \ ++ WC = WEOF; \ ++ break; \ ++ } \ ++ \ ++ /* Get a wide character. */ \ ++ CONVFAIL = 0; \ ++ state_bak = STATE; \ ++ MBLENGTH = mbrtowc ((wchar_t *)&WC, BUFPOS, BUFLEN, &STATE); \ ++ \ ++ switch (MBLENGTH) \ ++ { \ ++ case (size_t)-1: \ ++ case (size_t)-2: \ ++ CONVFAIL++; \ ++ STATE = state_bak; \ ++ /* Fall through. */ \ ++ \ ++ case 0: \ ++ MBLENGTH = 1; \ ++ break; \ ++ } \ ++ } \ ++ while (0) ++ + struct range_pair + { + size_t lo; +@@ -89,7 +152,7 @@ static char *field_1_buffer; + /* The number of bytes allocated for FIELD_1_BUFFER.
*/ + static size_t field_1_bufsize; + +-/* The largest field or byte index used as an endpoint of a closed ++/* The largest byte, character or field index used as an endpoint of a closed + or degenerate range specification; this doesn't include the starting + index of right-open-ended ranges. For example, with either range spec + `2-5,9-', `2-3,5,9-' this variable would be set to 5. */ +@@ -101,10 +164,11 @@ static size_t eol_range_start; + + /* This is a bit vector. + In byte mode, which bytes to output. ++ In character mode, which characters to output. + In field mode, which DELIM-separated fields to output. +- Both bytes and fields are numbered starting with 1, ++ Bytes, characters and fields are numbered starting with 1, + so the zeroth bit of this array is unused. +- A field or byte K has been selected if ++ A byte, character or field K has been selected if + (K <= MAX_RANGE_ENDPOINT and is_printable_field(K)) + || (EOL_RANGE_START > 0 && K >= EOL_RANGE_START). */ + static unsigned char *printable_field; +@@ -113,15 +177,25 @@ enum operating_mode + { + undefined_mode, + +- /* Output characters that are in the given bytes. */ ++ /* Output bytes that are at the given positions. */ + byte_mode, + ++ /* Output characters that are at the given positions. */ ++ character_mode, ++ + /* Output the given delimeter-separated fields. */ + field_mode + }; + + static enum operating_mode operating_mode; + ++/* If nonzero, when in byte mode, don't split multibyte characters. */ ++static int byte_mode_character_aware; ++ ++/* If nonzero, the function for single byte locale is work ++ if this program runs on multibyte locale. */ ++static int force_singlebyte_mode; ++ + /* If true do not output lines containing no delimeter characters. + Otherwise, all such lines are printed. This option is valid only + with field mode. */ +@@ -133,6 +207,9 @@ static bool complement; + + /* The delimeter character for field mode. 
*/ + static unsigned char delim; ++#if HAVE_WCHAR_H ++static wchar_t wcdelim; ++#endif + + /* True if the --output-delimiter=STRING option was specified. */ + static bool output_delimiter_specified; +@@ -206,7 +283,7 @@ Mandatory arguments to long options are + -f, --fields=LIST select only these fields; also print any line\n\ + that contains no delimiter character, unless\n\ + the -s option is specified\n\ +- -n (ignored)\n\ ++ -n with -b: don't split multibyte characters\n\ + "), stdout); + fputs (_("\ + --complement complement the set of selected bytes, characters\n\ +@@ -365,7 +442,7 @@ set_fields (const char *fieldstr) + in_digits = false; + /* Starting a range. */ + if (dash_found) +- FATAL_ERROR (_("invalid byte or field list")); ++ FATAL_ERROR (_("invalid byte, character or field list")); + dash_found = true; + fieldstr++; + +@@ -389,14 +466,16 @@ set_fields (const char *fieldstr) + if (!rhs_specified) + { + /* `n-'. From `initial' to end of line. */ +- eol_range_start = initial; ++ if (eol_range_start == 0 || ++ (eol_range_start != 0 && eol_range_start > initial)) ++ eol_range_start = initial; + field_found = true; + } + else + { + /* `m-n' or `-n' (1-n). */ + if (value < initial) +- FATAL_ERROR (_("invalid decreasing range")); ++ FATAL_ERROR (_("invalid byte, character or field list")); + + /* Is there already a range going to end of line? 
*/ + if (eol_range_start != 0) +@@ -476,6 +555,9 @@ set_fields (const char *fieldstr) + if (operating_mode == byte_mode) + error (0, 0, + _("byte offset %s is too large"), quote (bad_num)); ++ else if (operating_mode == character_mode) ++ error (0, 0, ++ _("character offset %s is too large"), quote (bad_num)); + else + error (0, 0, + _("field number %s is too large"), quote (bad_num)); +@@ -486,7 +568,7 @@ set_fields (const char *fieldstr) + fieldstr++; + } + else +- FATAL_ERROR (_("invalid byte or field list")); ++ FATAL_ERROR (_("invalid byte, character or field list")); + } + + max_range_endpoint = 0; +@@ -579,6 +661,63 @@ cut_bytes (FILE *stream) + } + } + ++#if HAVE_MBRTOWC ++/* This function is in use for the following case. ++ ++ 1. Read from the stream STREAM, printing to standard output any selected ++ characters. ++ ++ 2. Read from stream STREAM, printing to standard output any selected bytes, ++ without splitting multibyte characters. */ ++ ++static void ++cut_characters_or_cut_bytes_no_split (FILE *stream) ++{ ++ int idx; /* number of bytes or characters in the line so far. */ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen; /* The length of the byte sequence in buf. */ ++ wint_t wc; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character which shows ++ as same character as WC. */ ++ mbstate_t state; /* State of the stream. */ ++ int convfail; /* 1, when conversion is failed. Otherwise 0. */ ++ ++ idx = 0; ++ buflen = 0; ++ bufpos = buf; ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ while (1) ++ { ++ REFILL_BUFFER (buf, bufpos, buflen, stream); ++ ++ GET_NEXT_WC_FROM_BUFFER (wc, bufpos, buflen, mblength, state, convfail); ++ ++ if (wc == WEOF) ++ { ++ if (idx > 0) ++ putchar ('\n'); ++ break; ++ } ++ else if (wc == L'\n') ++ { ++ putchar ('\n'); ++ idx = 0; ++ } ++ else ++ { ++ idx += (operating_mode == byte_mode) ? 
mblength : 1; ++ if (print_kth (idx, NULL)) ++ fwrite (bufpos, mblength, sizeof(char), stdout); ++ } ++ ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++} ++#endif ++ + /* Read from stream STREAM, printing to standard output any selected fields. */ + + static void +@@ -701,13 +840,192 @@ cut_fields (FILE *stream) + } + } + ++#if HAVE_MBRTOWC ++static void ++cut_fields_mb (FILE *stream) ++{ ++ int c; ++ unsigned int field_idx; ++ int found_any_selected_field; ++ int buffer_first_field; ++ int empty_input; ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen; /* The length of the byte sequence in buf. */ ++ wint_t wc = 0; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character which shows ++ as same character as WC. */ ++ mbstate_t state; /* State of the stream. */ ++ int convfail; /* 1, when conversion is failed. Otherwise 0. */ ++ ++ found_any_selected_field = 0; ++ field_idx = 1; ++ bufpos = buf; ++ buflen = 0; ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ c = getc (stream); ++ empty_input = (c == EOF); ++ if (c != EOF) ++ ungetc (c, stream); ++ else ++ wc = WEOF; ++ ++ /* To support the semantics of the -s flag, we may have to buffer ++ all of the first field to determine whether it is `delimited.' ++ But that is unnecessary if all non-delimited lines must be printed ++ and the first field has been selected, or if non-delimited lines ++ must be suppressed and the first field has *not* been selected. ++ That is because a non-delimited line has exactly one field. 
*/ ++ buffer_first_field = (suppress_non_delimited ^ !print_kth (1, NULL)); ++ ++ while (1) ++ { ++ if (field_idx == 1 && buffer_first_field) ++ { ++ int len = 0; ++ ++ while (1) ++ { ++ REFILL_BUFFER (buf, bufpos, buflen, stream); ++ ++ GET_NEXT_WC_FROM_BUFFER ++ (wc, bufpos, buflen, mblength, state, convfail); ++ ++ if (wc == WEOF) ++ break; ++ ++ field_1_buffer = xrealloc (field_1_buffer, len + mblength); ++ memcpy (field_1_buffer + len, bufpos, mblength); ++ len += mblength; ++ buflen -= mblength; ++ bufpos += mblength; ++ ++ if (!convfail && (wc == L'\n' || wc == wcdelim)) ++ break; ++ } ++ ++ if (wc == WEOF) ++ break; ++ ++ /* If the first field extends to the end of line (it is not ++ delimited) and we are printing all non-delimited lines, ++ print this one. */ ++ if (convfail || (!convfail && wc != wcdelim)) ++ { ++ if (suppress_non_delimited) ++ { ++ /* Empty. */ ++ } ++ else ++ { ++ fwrite (field_1_buffer, sizeof (char), len, stdout); ++ /* Make sure the output line is newline terminated. */ ++ if (convfail || (!convfail && wc != L'\n')) ++ putchar ('\n'); ++ } ++ continue; ++ } ++ ++ if (print_kth (1, NULL)) ++ { ++ /* Print the field, but not the trailing delimiter. 
*/ ++ fwrite (field_1_buffer, sizeof (char), len - 1, stdout); ++ found_any_selected_field = 1; ++ } ++ ++field_idx; ++ } ++ ++ if (wc != WEOF) ++ { ++ if (print_kth (field_idx, NULL)) ++ { ++ if (found_any_selected_field) ++ { ++ fwrite (output_delimiter_string, sizeof (char), ++ output_delimiter_length, stdout); ++ } ++ found_any_selected_field = 1; ++ } ++ ++ while (1) ++ { ++ REFILL_BUFFER (buf, bufpos, buflen, stream); ++ ++ GET_NEXT_WC_FROM_BUFFER ++ (wc, bufpos, buflen, mblength, state, convfail); ++ ++ if (wc == WEOF) ++ break; ++ else if (!convfail && (wc == wcdelim || wc == L'\n')) ++ { ++ buflen -= mblength; ++ bufpos += mblength; ++ break; ++ } ++ ++ if (print_kth (field_idx, NULL)) ++ fwrite (bufpos, mblength, sizeof(char), stdout); ++ ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++ } ++ ++ if ((!convfail || wc == L'\n') && buflen < 1) ++ wc = WEOF; ++ ++ if (!convfail && wc == wcdelim) ++ ++field_idx; ++ else if (wc == WEOF || (!convfail && wc == L'\n')) ++ { ++ if (found_any_selected_field ++ || (!empty_input && !(suppress_non_delimited && field_idx == 1))) ++ putchar ('\n'); ++ if (wc == WEOF) ++ break; ++ field_idx = 1; ++ found_any_selected_field = 0; ++ } ++ } ++} ++#endif ++ + static void + cut_stream (FILE *stream) + { +- if (operating_mode == byte_mode) +- cut_bytes (stream); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) ++ { ++ switch (operating_mode) ++ { ++ case byte_mode: ++ if (byte_mode_character_aware) ++ cut_characters_or_cut_bytes_no_split (stream); ++ else ++ cut_bytes (stream); ++ break; ++ ++ case character_mode: ++ cut_characters_or_cut_bytes_no_split (stream); ++ break; ++ ++ case field_mode: ++ cut_fields_mb (stream); ++ break; ++ ++ default: ++ abort (); ++ } ++ } + else +- cut_fields (stream); ++#endif ++ { ++ if (operating_mode == field_mode) ++ cut_fields (stream); ++ else ++ cut_bytes (stream); ++ } + } + + /* Process file FILE to standard output. 
+@@ -757,6 +1075,8 @@ main (int argc, char **argv) + bool ok; + bool delim_specified = false; + char *spec_list_string IF_LINT (= NULL); ++ char mbdelim[MB_LEN_MAX + 1]; ++ size_t delimlen = 0; + + initialize_main (&argc, &argv); + set_program_name (argv[0]); +@@ -779,7 +1099,6 @@ main (int argc, char **argv) + switch (optc) + { + case 'b': +- case 'c': + /* Build the byte list. */ + if (operating_mode != undefined_mode) + FATAL_ERROR (_("only one type of list may be specified")); +@@ -787,6 +1106,14 @@ main (int argc, char **argv) + spec_list_string = optarg; + break; + ++ case 'c': ++ /* Build the character list. */ ++ if (operating_mode != undefined_mode) ++ FATAL_ERROR (_("only one type of list may be specified")); ++ operating_mode = character_mode; ++ spec_list_string = optarg; ++ break; ++ + case 'f': + /* Build the field list. */ + if (operating_mode != undefined_mode) +@@ -798,10 +1125,35 @@ main (int argc, char **argv) + case 'd': + /* New delimiter. */ + /* Interpret -d '' to mean `use the NUL byte as the delimiter.' */ +- if (optarg[0] != '\0' && optarg[1] != '\0') +- FATAL_ERROR (_("the delimiter must be a single character")); +- delim = optarg[0]; +- delim_specified = true; ++ { ++#if HAVE_MBRTOWC ++ if(MB_CUR_MAX > 1) ++ { ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ delimlen = mbrtowc (&wcdelim, optarg, strnlen(optarg, MB_LEN_MAX), &state); ++ ++ if (delimlen == (size_t)-1 || delimlen == (size_t)-2) ++ ++force_singlebyte_mode; ++ else ++ { ++ delimlen = (delimlen < 1) ? 
1 : delimlen; ++ if (wcdelim != L'\0' && *(optarg + delimlen) != '\0') ++ FATAL_ERROR (_("the delimiter must be a single character")); ++ memcpy (mbdelim, optarg, delimlen); ++ } ++ } ++ ++ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) ++#endif ++ { ++ if (optarg[0] != '\0' && optarg[1] != '\0') ++ FATAL_ERROR (_("the delimiter must be a single character")); ++ delim = (unsigned char) optarg[0]; ++ } ++ delim_specified = true; ++ } + break; + + case OUTPUT_DELIMITER_OPTION: +@@ -814,6 +1166,7 @@ main (int argc, char **argv) + break; + + case 'n': ++ byte_mode_character_aware = 1; + break; + + case 's': +@@ -836,7 +1189,7 @@ main (int argc, char **argv) + if (operating_mode == undefined_mode) + FATAL_ERROR (_("you must specify a list of bytes, characters, or fields")); + +- if (delim != '\0' && operating_mode != field_mode) ++ if (delim_specified && operating_mode != field_mode) + FATAL_ERROR (_("an input delimiter may be specified only\ + when operating on fields")); + +@@ -863,15 +1216,34 @@ main (int argc, char **argv) + } + + if (!delim_specified) +- delim = '\t'; ++ { ++ delim = '\t'; ++#ifdef HAVE_MBRTOWC ++ wcdelim = L'\t'; ++ mbdelim[0] = '\t'; ++ mbdelim[1] = '\0'; ++ delimlen = 1; ++#endif ++ } + + if (output_delimiter_string == NULL) + { +- static char dummy[2]; +- dummy[0] = delim; +- dummy[1] = '\0'; +- output_delimiter_string = dummy; +- output_delimiter_length = 1; ++#ifdef HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) ++ { ++ output_delimiter_string = xstrdup(mbdelim); ++ output_delimiter_length = delimlen; ++ } ++ ++ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) ++#endif ++ { ++ static char dummy[2]; ++ dummy[0] = delim; ++ dummy[1] = '\0'; ++ output_delimiter_string = dummy; ++ output_delimiter_length = 1; ++ } + } + + if (optind == argc) +Index: src/expand.c +=================================================================== +--- src/expand.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/expand.c 2010-05-07 16:13:30.748169979 
+0200 +@@ -38,11 +38,28 @@ + #include + #include + #include ++ ++/* Get mbstate_t, mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ + #include "system.h" + #include "error.h" + #include "quote.h" + #include "xstrndup.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "expand" + +@@ -358,6 +375,142 @@ expand (void) + } + } + ++#if HAVE_MBRTOWC ++static void ++expand_multibyte (void) ++{ ++ FILE *fp; /* Input stream. */ ++ mbstate_t i_state; /* Current shift state of the input stream. */ ++ mbstate_t i_state_bak; /* Back up the I_STATE. */ ++ mbstate_t o_state; /* Current shift state of the output stream. */ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos = buf; /* Next read position of BUF. */ ++ size_t buflen = 0; /* The length of the byte sequence in buf. */ ++ wchar_t wc; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character ++ which shows as same character as WC. */ ++ int tab_index = 0; /* Index in `tab_list' of next tabstop. */ ++ int column = 0; /* Column on screen of the next char. */ ++ int next_tab_column; /* Column the next tab stop is on. */ ++ int convert = 1; /* If nonzero, perform translations. */ ++ ++ fp = next_file ((FILE *) NULL); ++ if (fp == NULL) ++ return; ++ ++ memset (&o_state, '\0', sizeof(mbstate_t)); ++ memset (&i_state, '\0', sizeof(mbstate_t)); ++ ++ for (;;) ++ { ++ /* Refill the buffer BUF.
*/ ++ if (buflen < MB_LEN_MAX && !feof(fp) && !ferror(fp)) ++ { ++ memmove (buf, bufpos, buflen); ++ buflen += fread (buf + buflen, sizeof(char), BUFSIZ, fp); ++ bufpos = buf; ++ } ++ ++ /* No character is left in BUF. */ ++ if (buflen < 1) ++ { ++ fp = next_file (fp); ++ ++ if (fp == NULL) ++ break; /* No more files. */ ++ else ++ { ++ memset (&i_state, '\0', sizeof(mbstate_t)); ++ continue; ++ } ++ } ++ ++ /* Get a wide character. */ ++ i_state_bak = i_state; ++ mblength = mbrtowc (&wc, bufpos, buflen, &i_state); ++ ++ switch (mblength) ++ { ++ case (size_t)-1: /* illegal byte sequence. */ ++ case (size_t)-2: ++ mblength = 1; ++ i_state = i_state_bak; ++ if (convert) ++ { ++ ++column; ++ if (convert_entire_line == 0) ++ convert = 0; ++ } ++ putchar (*bufpos); ++ break; ++ ++ case 0: /* null. */ ++ mblength = 1; ++ if (convert && convert_entire_line == 0) ++ convert = 0; ++ putchar ('\0'); ++ break; ++ ++ default: ++ if (wc == L'\n') /* LF. */ ++ { ++ tab_index = 0; ++ column = 0; ++ convert = 1; ++ putchar ('\n'); ++ } ++ else if (wc == L'\t' && convert) /* Tab. */ ++ { ++ if (tab_size == 0) ++ { ++ /* Do not let tab_index == first_free_tab; ++ stop when it is 1 less. */ ++ while (tab_index < first_free_tab - 1 ++ && column >= tab_list[tab_index]) ++ tab_index++; ++ next_tab_column = tab_list[tab_index]; ++ if (tab_index < first_free_tab - 1) ++ tab_index++; ++ if (column >= next_tab_column) ++ next_tab_column = column + 1; ++ } ++ else ++ next_tab_column = column + tab_size - column % tab_size; ++ ++ while (column < next_tab_column) ++ { ++ putchar (' '); ++ ++column; ++ } ++ } ++ else /* Others. */ ++ { ++ if (convert) ++ { ++ if (wc == L'\b') ++ { ++ if (column > 0) ++ --column; ++ } ++ else ++ { ++ int width; /* The width of WC. */ ++ ++ width = wcwidth (wc); ++ column += (width > 0) ? 
width : 0; ++ if (convert_entire_line == 0) ++ convert = 0; ++ } ++ } ++ fwrite (bufpos, sizeof(char), mblength, stdout); ++ } ++ } ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++} ++#endif ++ + int + main (int argc, char **argv) + { +@@ -422,7 +575,12 @@ main (int argc, char **argv) + + file_list = (optind < argc ? &argv[optind] : stdin_argv); + +- expand (); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ expand_multibyte (); ++ else ++#endif ++ expand (); + + if (have_read_stdin && fclose (stdin) != 0) + error (EXIT_FAILURE, errno, "-"); +Index: src/fold.c +=================================================================== +--- src/fold.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/fold.c 2010-05-07 16:39:03.220004781 +0200 +@@ -22,11 +22,33 @@ + #include + #include + ++/* Get mbstate_t, mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include ++#endif ++ ++/* Get iswprint(), iswblank(), wcwidth(). */ ++#if HAVE_WCTYPE_H ++# include ++#endif ++ + #include "system.h" + #include "error.h" + #include "quote.h" + #include "xstrtol.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# undef MB_LEN_MAX ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + #define TAB_WIDTH 8 + + /* The official name of this program (e.g., no `g' prefix). */ +@@ -34,20 +56,41 @@ + + #define AUTHORS proper_name ("David MacKenzie") + ++#define FATAL_ERROR(Message) \ ++ do \ ++ { \ ++ error (0, 0, (Message)); \ ++ usage (2); \ ++ } \ ++ while (0) ++ ++enum operating_mode ++{ ++ /* Fold texts by columns that are at the given positions. */ ++ column_mode, ++ ++ /* Fold texts by bytes that are at the given positions. 
*/ ++ byte_mode, ++ ++ /* Fold texts by characters that are at the given positions. */ ++ character_mode, ++}; ++ ++/* The argument shows current mode. (Default: column_mode) */ ++static enum operating_mode operating_mode; ++ + /* If nonzero, try to break on whitespace. */ + static bool break_spaces; + +-/* If nonzero, count bytes, not column positions. */ +-static bool count_bytes; +- + /* If nonzero, at least one of the files we read was standard input. */ + static bool have_read_stdin; + +-static char const shortopts[] = "bsw:0::1::2::3::4::5::6::7::8::9::"; ++static char const shortopts[] = "bcsw:0::1::2::3::4::5::6::7::8::9::"; + + static struct option const longopts[] = + { + {"bytes", no_argument, NULL, 'b'}, ++ {"characters", no_argument, NULL, 'c'}, + {"spaces", no_argument, NULL, 's'}, + {"width", required_argument, NULL, 'w'}, + {GETOPT_HELP_OPTION_DECL}, +@@ -77,6 +120,7 @@ Mandatory arguments to long options are + "), stdout); + fputs (_("\ + -b, --bytes count bytes rather than columns\n\ ++ -c, --characters count characters rather than columns\n\ + -s, --spaces break at spaces\n\ + -w, --width=WIDTH use WIDTH columns instead of 80\n\ + "), stdout); +@@ -94,7 +138,7 @@ Mandatory arguments to long options are + static size_t + adjust_column (size_t column, char c) + { +- if (!count_bytes) ++ if (operating_mode != byte_mode) + { + if (c == '\b') + { +@@ -117,30 +161,14 @@ adjust_column (size_t column, char c) + to stdout, with maximum line length WIDTH. + Return true if successful. */ + +-static bool +-fold_file (char const *filename, size_t width) ++static void ++fold_text (FILE *istream, size_t width, int *saved_errno) + { +- FILE *istream; + int c; + size_t column = 0; /* Screen column where next char will go. */ + size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ + static char *line_out = NULL; + static size_t allocated_out = 0; +- int saved_errno; +- +- if (STREQ (filename, "-")) +- { +- istream = stdin; +- have_read_stdin = true; +- } +- else +- istream = fopen (filename, "r"); +- +- if (istream == NULL) +- { +- error (0, errno, "%s", filename); +- return false; +- } + + while ((c = getc (istream)) != EOF) + { +@@ -168,6 +196,15 @@ fold_file (char const *filename, size_t + bool found_blank = false; + size_t logical_end = offset_out; + ++ /* If LINE_OUT has no wide character, ++ put a new wide character in LINE_OUT ++ if column is bigger than width. */ ++ if (offset_out == 0) ++ { ++ line_out[offset_out++] = c; ++ continue; ++ } ++ + /* Look for the last blank. */ + while (logical_end) + { +@@ -214,11 +251,222 @@ fold_file (char const *filename, size_t + line_out[offset_out++] = c; + } + +- saved_errno = errno; ++ *saved_errno = errno; ++ ++ if (offset_out) ++ fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); ++ ++} ++ ++#if HAVE_MBRTOWC ++static void ++fold_multibyte_text (FILE *istream, size_t width, int *saved_errno) ++{ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ size_t buflen = 0; /* The length of the byte sequence in buf. */ ++ char *bufpos = NULL; /* Next read position of BUF. */ ++ wint_t wc; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character which shows ++ as same character as WC. */ ++ mbstate_t state, state_bak; /* State of the stream. */ ++ int convfail; /* 1, when conversion is failed. Otherwise 0. */ ++ ++ static char *line_out = NULL; ++ size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ ++ static size_t allocated_out = 0; ++ ++ int increment; ++ size_t column = 0; ++ ++ size_t last_blank_pos; ++ size_t last_blank_column; ++ int is_blank_seen; ++ int last_blank_increment = 0; ++ int is_bs_following_last_blank; ++ size_t bs_following_last_blank_num; ++ int is_cr_after_last_blank; ++ ++#define CLEAR_FLAGS \ ++ do \ ++ { \ ++ last_blank_pos = 0; \ ++ last_blank_column = 0; \ ++ is_blank_seen = 0; \ ++ is_bs_following_last_blank = 0; \ ++ bs_following_last_blank_num = 0; \ ++ is_cr_after_last_blank = 0; \ ++ } \ ++ while (0) ++ ++#define START_NEW_LINE \ ++ do \ ++ { \ ++ putchar ('\n'); \ ++ column = 0; \ ++ offset_out = 0; \ ++ CLEAR_FLAGS; \ ++ } \ ++ while (0) ++ ++ CLEAR_FLAGS; ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ for (;; bufpos += mblength, buflen -= mblength) ++ { ++ if (buflen < MB_LEN_MAX && !feof (istream) && !ferror (istream)) ++ { ++ memmove (buf, bufpos, buflen); ++ buflen += fread (buf + buflen, sizeof(char), BUFSIZ, istream); ++ bufpos = buf; ++ } ++ ++ if (buflen < 1) ++ break; ++ ++ /* Get a wide character. */ ++ convfail = 0; ++ state_bak = state; ++ mblength = mbrtowc ((wchar_t *)&wc, bufpos, buflen, &state); ++ ++ switch (mblength) ++ { ++ case (size_t)-1: ++ case (size_t)-2: ++ convfail++; ++ state = state_bak; ++ /* Fall through. */ ++ ++ case 0: ++ mblength = 1; ++ break; ++ } ++ ++rescan: ++ if (operating_mode == byte_mode) /* byte mode */ ++ increment = mblength; ++ else if (operating_mode == character_mode) /* character mode */ ++ increment = 1; ++ else /* column mode */ ++ { ++ if (convfail) ++ increment = 1; ++ else ++ { ++ switch (wc) ++ { ++ case L'\n': ++ fwrite (line_out, sizeof(char), offset_out, stdout); ++ START_NEW_LINE; ++ continue; ++ ++ case L'\b': ++ increment = (column > 0) ? -1 : 0; ++ break; ++ ++ case L'\r': ++ increment = -1 * column; ++ break; ++ ++ case L'\t': ++ increment = 8 - column % 8; ++ break; ++ ++ default: ++ increment = wcwidth (wc); ++ increment = (increment < 0) ? 
0 : increment; ++ } ++ } ++ } ++ ++ if (column + increment > width && break_spaces && last_blank_pos) ++ { ++ fwrite (line_out, sizeof(char), last_blank_pos, stdout); ++ putchar ('\n'); ++ ++ offset_out = offset_out - last_blank_pos; ++ column = column - last_blank_column + ((is_cr_after_last_blank) ++ ? last_blank_increment : bs_following_last_blank_num); ++ memmove (line_out, line_out + last_blank_pos, offset_out); ++ CLEAR_FLAGS; ++ goto rescan; ++ } ++ ++ if (column + increment > width && column != 0) ++ { ++ fwrite (line_out, sizeof(char), offset_out, stdout); ++ START_NEW_LINE; ++ goto rescan; ++ } ++ ++ if (allocated_out < offset_out + mblength) ++ { ++ line_out = X2REALLOC (line_out, &allocated_out); ++ } ++ ++ memcpy (line_out + offset_out, bufpos, mblength); ++ offset_out += mblength; ++ column += increment; ++ ++ if (is_blank_seen && !convfail && wc == L'\r') ++ is_cr_after_last_blank = 1; ++ ++ if (is_bs_following_last_blank && !convfail && wc == L'\b') ++ ++bs_following_last_blank_num; ++ else ++ is_bs_following_last_blank = 0; ++ ++ if (break_spaces && !convfail && iswblank (wc)) ++ { ++ last_blank_pos = offset_out; ++ last_blank_column = column; ++ is_blank_seen = 1; ++ last_blank_increment = increment; ++ is_bs_following_last_blank = 1; ++ bs_following_last_blank_num = 0; ++ is_cr_after_last_blank = 0; ++ } ++ } ++ ++ *saved_errno = errno; + + if (offset_out) + fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); + ++} ++#endif ++ ++/* Fold file FILENAME, or standard input if FILENAME is "-", ++ to stdout, with maximum line length WIDTH. ++ Return 0 if successful, 1 if an error occurs. 
*/ ++ ++static bool ++fold_file (char *filename, size_t width) ++{ ++ FILE *istream; ++ int saved_errno; ++ ++ if (STREQ (filename, "-")) ++ { ++ istream = stdin; ++ have_read_stdin = 1; ++ } ++ else ++ istream = fopen (filename, "r"); ++ ++ if (istream == NULL) ++ { ++ error (0, errno, "%s", filename); ++ return 1; ++ } ++ ++ /* Define how ISTREAM is being folded. */ ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ fold_multibyte_text (istream, width, &saved_errno); ++ else ++#endif ++ fold_text (istream, width, &saved_errno); ++ + if (ferror (istream)) + { + error (0, saved_errno, "%s", filename); +@@ -251,7 +499,8 @@ main (int argc, char **argv) + + atexit (close_stdout); + +- break_spaces = count_bytes = have_read_stdin = false; ++ operating_mode = column_mode; ++ break_spaces = have_read_stdin = false; + + while ((optc = getopt_long (argc, argv, shortopts, longopts, NULL)) != -1) + { +@@ -260,7 +509,15 @@ main (int argc, char **argv) + switch (optc) + { + case 'b': /* Count bytes rather than columns. */ +- count_bytes = true; ++ if (operating_mode != column_mode) ++ FATAL_ERROR (_("only one way of folding may be specified")); ++ operating_mode = byte_mode; ++ break; ++ ++ case 'c': ++ if (operating_mode != column_mode) ++ FATAL_ERROR (_("only one way of folding may be specified")); ++ operating_mode = character_mode; + break; + + case 's': /* Break at word boundaries. */ +Index: src/join.c +=================================================================== +--- src/join.c.orig 2010-04-20 21:52:04.000000000 +0200 ++++ src/join.c 2010-05-07 16:41:17.564268573 +0200 +@@ -22,17 +22,31 @@ + #include + #include + ++/* Get mbstate_t, mbrtowc(), mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ ++/* Get iswblank(), towupper.
*/ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++ + #include "system.h" + #include "error.h" + #include "hard-locale.h" + #include "linebuffer.h" +-#include "memcasecmp.h" + #include "quote.h" + #include "stdio--.h" + #include "xmemcoll.h" + #include "xstrtol.h" + #include "argmatch.h" + ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "join" + +@@ -121,10 +135,12 @@ static struct outlist outlist_head; + /* Last element in `outlist', where a new element can be added. */ + static struct outlist *outlist_end = &outlist_head; + +-/* Tab character separating fields. If negative, fields are separated +- by any nonempty string of blanks, otherwise by exactly one +- tab character whose value (when cast to unsigned char) equals TAB. */ +-static int tab = -1; ++/* Tab character separating fields. If NULL, fields are separated ++ by any nonempty string of blanks. */ ++static char *tab = NULL; ++ ++/* The number of bytes used for tab. */ ++static size_t tablen = 0; + + /* If nonzero, check that the input is correctly ordered.
*/ + static enum +@@ -248,10 +264,11 @@ xfields (struct line *line) + if (ptr == lim) + return; + +- if (0 <= tab) ++ if (tab != NULL) + { ++ unsigned char t = tab[0]; + char *sep; +- for (; (sep = memchr (ptr, tab, lim - ptr)) != NULL; ptr = sep + 1) ++ for (; (sep = memchr (ptr, t, lim - ptr)) != NULL; ptr = sep + 1) + extract_field (line, ptr, sep - ptr); + } + else +@@ -278,6 +295,148 @@ xfields (struct line *line) + extract_field (line, ptr, lim - ptr); + } + ++#if HAVE_MBRTOWC ++static void ++xfields_multibyte (struct line *line) ++{ ++ char *ptr = line->buf.buffer; ++ char const *lim = ptr + line->buf.length - 1; ++ wchar_t wc = 0; ++ size_t mblength = 1; ++ mbstate_t state, state_bak; ++ ++ memset (&state, 0, sizeof (mbstate_t)); ++ ++ if (ptr >= lim) ++ return; ++ ++ if (tab != NULL) ++ { ++ unsigned char t = tab[0]; ++ char *sep = ptr; ++ for (; ptr < lim; ptr = sep + mblength) ++ { ++ sep = ptr; ++ while (sep < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, sep, lim - sep + 1, &state); ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ if (mblength == tablen && !memcmp (sep, tab, mblength)) ++ break; ++ else ++ { ++ sep += mblength; ++ continue; ++ } ++ } ++ ++ if (sep >= lim) ++ break; ++ ++ extract_field (line, ptr, sep - ptr); ++ } ++ } ++ else ++ { ++ /* Skip leading blanks before the first field. */ ++ while(ptr < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 
1 : mblength; ++ ++ if (!iswblank(wc)) ++ break; ++ ptr += mblength; ++ } ++ ++ do ++ { ++ char *sep; ++ state_bak = state; ++ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ sep = ptr + mblength; ++ while (sep < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, sep, lim - sep + 1, &state); ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ if (iswblank (wc)) ++ break; ++ ++ sep += mblength; ++ } ++ ++ extract_field (line, ptr, sep - ptr); ++ if (sep >= lim) ++ return; ++ ++ state_bak = state; ++ mblength = mbrtowc (&wc, sep, lim - sep + 1, &state); ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ ptr = sep + mblength; ++ while (ptr < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ if (!iswblank (wc)) ++ break; ++ ++ ptr += mblength; ++ } ++ } ++ while (ptr < lim); ++ } ++ ++ extract_field (line, ptr, lim - ptr); ++} ++#endif ++ + static void + freeline (struct line *line) + { +@@ -299,56 +458,115 @@ keycmp (struct line const *line1, struct + size_t jf_1, size_t jf_2) + { + /* Start of field to compare in each file. */ +- char *beg1; +- char *beg2; +- +- size_t len1; +- size_t len2; /* Length of fields to compare. */ ++ char *beg[2]; ++ char *copy[2]; ++ size_t len[2]; /* Length of fields to compare. 
*/ + int diff; ++ int i, j; + + if (jf_1 < line1->nfields) + { +- beg1 = line1->fields[jf_1].beg; +- len1 = line1->fields[jf_1].len; ++ beg[0] = line1->fields[jf_1].beg; ++ len[0] = line1->fields[jf_1].len; + } + else + { +- beg1 = NULL; +- len1 = 0; ++ beg[0] = NULL; ++ len[0] = 0; + } + + if (jf_2 < line2->nfields) + { +- beg2 = line2->fields[jf_2].beg; +- len2 = line2->fields[jf_2].len; ++ beg[1] = line2->fields[jf_2].beg; ++ len[1] = line2->fields[jf_2].len; + } + else + { +- beg2 = NULL; +- len2 = 0; ++ beg[1] = NULL; ++ len[1] = 0; + } + +- if (len1 == 0) +- return len2 == 0 ? 0 : -1; +- if (len2 == 0) ++ if (len[0] == 0) ++ return len[1] == 0 ? 0 : -1; ++ if (len[1] == 0) + return 1; + + if (ignore_case) + { +- /* FIXME: ignore_case does not work with NLS (in particular, +- with multibyte chars). */ +- diff = memcasecmp (beg1, beg2, MIN (len1, len2)); ++#ifdef HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ size_t mblength; ++ wchar_t wc, uwc; ++ mbstate_t state, state_bak; ++ ++ memset (&state, '\0', sizeof (mbstate_t)); ++ ++ for (i = 0; i < 2; i++) ++ { ++ copy[i] = alloca (len[i] + 1); ++ ++ for (j = 0; j < MIN (len[0], len[1]);) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, beg[i] + j, len[i] - j, &state); ++ ++ switch (mblength) ++ { ++ case (size_t) -1: ++ case (size_t) -2: ++ state = state_bak; ++ /* Fall through */ ++ case 0: ++ mblength = 1; ++ break; ++ ++ default: ++ uwc = towupper (wc); ++ ++ if (uwc != wc) ++ { ++ mbstate_t state_wc; ++ ++ memset (&state_wc, '\0', sizeof (mbstate_t)); ++ wcrtomb (copy[i] + j, uwc, &state_wc); ++ } ++ else ++ memcpy (copy[i] + j, beg[i] + j, mblength); ++ } ++ j += mblength; ++ } ++ copy[i][j] = '\0'; ++ } ++ } ++ else ++#endif ++ { ++ for (i = 0; i < 2; i++) ++ { ++ copy[i] = alloca (len[i] + 1); ++ ++ for (j = 0; j < MIN (len[0], len[1]); j++) ++ copy[i][j] = toupper (beg[i][j]); ++ ++ copy[i][j] = '\0'; ++ } ++ } + } + else + { +- if (hard_LC_COLLATE) +- return xmemcoll (beg1, len1, beg2, len2); +- diff 
= memcmp (beg1, beg2, MIN (len1, len2)); ++ copy[0] = (unsigned char *) beg[0]; ++ copy[1] = (unsigned char *) beg[1]; + } + ++ if (hard_LC_COLLATE) ++ return xmemcoll ((char *) copy[0], len[0], (char *) copy[1], len[1]); ++ diff = memcmp (copy[0], copy[1], MIN (len[0], len[1])); ++ ++ + if (diff) + return diff; +- return len1 < len2 ? -1 : len1 != len2; ++ return len[0] - len[1]; + } + + /* Check that successive input lines PREV and CURRENT from input file +@@ -429,6 +647,11 @@ get_line (FILE *fp, struct line **linep, + return false; + } + ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ xfields_multibyte (line); ++ else ++#endif + xfields (line); + + if (prevline[which - 1]) +@@ -528,11 +751,18 @@ prfield (size_t n, struct line const *li + + /* Print the join of LINE1 and LINE2. */ + ++#define PUT_TAB_CHAR \ ++ do \ ++ { \ ++ (tab != NULL) ? \ ++ fwrite(tab, sizeof(char), tablen, stdout) : putchar (' '); \ ++ } \ ++ while (0) ++ + static void + prjoin (struct line const *line1, struct line const *line2) + { + const struct outlist *outlist; +- char output_separator = tab < 0 ? 
' ' : tab; + + outlist = outlist_head.next; + if (outlist) +@@ -567,7 +797,7 @@ prjoin (struct line const *line1, struct + o = o->next; + if (o == NULL) + break; +- putchar (output_separator); ++ PUT_TAB_CHAR; + } + putchar ('\n'); + } +@@ -585,23 +815,23 @@ prjoin (struct line const *line1, struct + prfield (join_field_1, line1); + for (i = 0; i < join_field_1 && i < line1->nfields; ++i) + { +- putchar (output_separator); ++ PUT_TAB_CHAR; + prfield (i, line1); + } + for (i = join_field_1 + 1; i < line1->nfields; ++i) + { +- putchar (output_separator); ++ PUT_TAB_CHAR; + prfield (i, line1); + } + + for (i = 0; i < join_field_2 && i < line2->nfields; ++i) + { +- putchar (output_separator); ++ PUT_TAB_CHAR; + prfield (i, line2); + } + for (i = join_field_2 + 1; i < line2->nfields; ++i) + { +- putchar (output_separator); ++ PUT_TAB_CHAR; + prfield (i, line2); + } + putchar ('\n'); +@@ -1039,21 +1269,46 @@ main (int argc, char **argv) + + case 't': + { +- unsigned char newtab = optarg[0]; ++ char *newtab; ++ size_t newtablen; ++ newtab = xstrdup (optarg); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ mbstate_t state; ++ ++ memset (&state, 0, sizeof (mbstate_t)); ++ newtablen = mbrtowc (NULL, newtab, ++ strnlen (newtab, MB_LEN_MAX), ++ &state); ++ if (newtablen == (size_t) 0 ++ || newtablen == (size_t) -1 ++ || newtablen == (size_t) -2) ++ newtablen = 1; ++ } ++ else ++#endif ++ newtablen = 1; + if (! newtab) +- newtab = '\n'; /* '' => process the whole line. */ ++ { ++ newtab[0] = '\n'; /* '' => process the whole line. 
*/ ++ } + else if (optarg[1]) + { +- if (STREQ (optarg, "\\0")) +- newtab = '\0'; +- else +- error (EXIT_FAILURE, 0, _("multi-character tab %s"), +- quote (optarg)); ++ if (newtablen == 1 && newtab[1]) ++ { ++ if (STREQ (newtab, "\\0")) ++ newtab[0] = '\0'; ++ } ++ } ++ if (tab != NULL && strcmp (tab, newtab)) ++ { ++ free (newtab); ++ error (EXIT_FAILURE, 0, _("incompatible tabs")); + } +- if (0 <= tab && tab != newtab) +- error (EXIT_FAILURE, 0, _("incompatible tabs")); + tab = newtab; +- } ++ tablen = newtablen; ++ } + break; + + case NOCHECK_ORDER_OPTION: +Index: src/pr.c +=================================================================== +--- src/pr.c.orig 2010-03-13 16:14:09.000000000 +0100 ++++ src/pr.c 2010-05-07 16:13:30.836003733 +0200 +@@ -312,6 +312,32 @@ + + #include + #include ++ ++/* Get MB_LEN_MAX. */ ++#include <limits.h> ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX == 1 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Get MB_CUR_MAX. */ ++#include <stdlib.h> ++ ++/* Solaris 2.5 has a bug: <wchar.h> must be included before <wctype.h>. */ ++/* Get mbstate_t, mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ ++/* Get iswprint(). -- for wcwidth(). */ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++#if !defined iswprint && !HAVE_ISWPRINT ++# define iswprint(wc) 1 ++#endif ++ + #include "system.h" + #include "error.h" + #include "hard-locale.h" +@@ -322,6 +348,18 @@ + #include "strftime.h" + #include "xstrtol.h" + ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ ++#ifndef HAVE_DECL_WCWIDTH ++"this configure-time declaration test was not run" ++#endif ++#if !HAVE_DECL_WCWIDTH ++extern int wcwidth (); ++#endif ++ + /* The official name of this program (e.g., no `g' prefix).
*/ + #define PROGRAM_NAME "pr" + +@@ -414,7 +452,20 @@ struct COLUMN + + typedef struct COLUMN COLUMN; + +-static int char_to_clump (char c); ++/* Function pointers to switch functions for single byte locale or for ++ multibyte locale. If multibyte functions do not exist in your system, ++ these pointers always point to the function for single byte locale. */ ++static void (*print_char) (char c); ++static int (*char_to_clump) (char c); ++ ++/* Functions for single byte locale. */ ++static void print_char_single (char c); ++static int char_to_clump_single (char c); ++ ++/* Functions for multibyte locale. */ ++static void print_char_multi (char c); ++static int char_to_clump_multi (char c); ++ + static bool read_line (COLUMN *p); + static bool print_page (void); + static bool print_stored (COLUMN *p); +@@ -424,6 +475,7 @@ static void print_header (void); + static void pad_across_to (int position); + static void add_line_number (COLUMN *p); + static void getoptarg (char *arg, char switch_char, char *character, ++ int *character_length, int *character_width, + int *number); + void usage (int status); + static void print_files (int number_of_files, char **av); +@@ -438,7 +490,6 @@ static void store_char (char c); + static void pad_down (int lines); + static void read_rest_of_line (COLUMN *p); + static void skip_read (COLUMN *p, int column_number); +-static void print_char (char c); + static void cleanup (void); + static void print_sep_string (void); + static void separator_string (const char *optarg_S); +@@ -450,7 +501,7 @@ static COLUMN *column_vector; + we store the leftmost columns contiguously in buff. + To print a line from buff, get the index of the first character + from line_vector[i], and print up to line_vector[i + 1]. */ +-static char *buff; ++static unsigned char *buff; + + /* Index of the position in buff where the next character + will be stored.
*/ +@@ -554,7 +605,7 @@ static int chars_per_column; + static bool untabify_input = false; + + /* (-e) The input tab character. */ +-static char input_tab_char = '\t'; ++static char input_tab_char[MB_LEN_MAX] = "\t"; + + /* (-e) Tabstops are at chars_per_tab, 2*chars_per_tab, 3*chars_per_tab, ... + where the leftmost column is 1. */ +@@ -564,7 +615,10 @@ static int chars_per_input_tab = 8; + static bool tabify_output = false; + + /* (-i) The output tab character. */ +-static char output_tab_char = '\t'; ++static char output_tab_char[MB_LEN_MAX] = "\t"; ++ ++/* (-i) The byte length of output tab character. */ ++static int output_tab_char_length = 1; + + /* (-i) The width of the output tab. */ + static int chars_per_output_tab = 8; +@@ -638,7 +692,13 @@ static int power_10; + static bool numbered_lines = false; + + /* (-n) Character which follows each line number. */ +-static char number_separator = '\t'; ++static char number_separator[MB_LEN_MAX] = "\t"; ++ ++/* (-n) The byte length of the character which follows each line number. */ ++static int number_separator_length = 1; ++ ++/* (-n) The character width of the character which follows each line number. */ ++static int number_separator_width = 0; + + /* (-n) line counting starts with 1st line of input file (not with 1st + line of 1st page printed). */ +@@ -691,6 +751,7 @@ static bool use_col_separator = false; + -a|COLUMN|-m is a `space' and with the -J option a `tab'. 
*/ + static char *col_sep_string = (char *) ""; + static int col_sep_length = 0; ++static int col_sep_width = 0; + static char *column_separator = (char *) " "; + static char *line_separator = (char *) "\t"; + +@@ -847,6 +908,13 @@ separator_string (const char *optarg_S) + col_sep_length = (int) strlen (optarg_S); + col_sep_string = xmalloc (col_sep_length + 1); + strcpy (col_sep_string, optarg_S); ++ ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ col_sep_width = mbswidth (col_sep_string, 0); ++ else ++#endif ++ col_sep_width = col_sep_length; + } + + int +@@ -871,6 +939,21 @@ main (int argc, char **argv) + + atexit (close_stdout); + ++/* Define which functions are used, the ones for single byte locale or the ones ++ for multibyte locale. */ ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ print_char = print_char_multi; ++ char_to_clump = char_to_clump_multi; ++ } ++ else ++#endif ++ { ++ print_char = print_char_single; ++ char_to_clump = char_to_clump_single; ++ } ++ + n_files = 0; + file_names = (argc > 1 + ? xmalloc ((argc - 1) * sizeof (char *)) +@@ -947,8 +1030,12 @@ main (int argc, char **argv) + break; + case 'e': + if (optarg) +- getoptarg (optarg, 'e', &input_tab_char, +- &chars_per_input_tab); ++ { ++ int dummy_length, dummy_width; ++ ++ getoptarg (optarg, 'e', input_tab_char, &dummy_length, ++ &dummy_width, &chars_per_input_tab); ++ } + /* Could check tab width > 0. */ + untabify_input = true; + break; +@@ -961,8 +1048,12 @@ main (int argc, char **argv) + break; + case 'i': + if (optarg) +- getoptarg (optarg, 'i', &output_tab_char, +- &chars_per_output_tab); ++ { ++ int dummy_width; ++ ++ getoptarg (optarg, 'i', output_tab_char, &output_tab_char_length, ++ &dummy_width, &chars_per_output_tab); ++ } + /* Could check tab width > 0. 
*/ + tabify_output = true; + break; +@@ -989,8 +1080,8 @@ main (int argc, char **argv) + case 'n': + numbered_lines = true; + if (optarg) +- getoptarg (optarg, 'n', &number_separator, +- &chars_per_number); ++ getoptarg (optarg, 'n', number_separator, &number_separator_length, ++ &number_separator_width, &chars_per_number); + break; + case 'N': + skip_count = false; +@@ -1029,7 +1120,7 @@ main (int argc, char **argv) + old_s = false; + /* Reset an additional input of -s, -S dominates -s */ + col_sep_string = bad_cast (""); +- col_sep_length = 0; ++ col_sep_length = col_sep_width = 0; + use_col_separator = true; + if (optarg) + separator_string (optarg); +@@ -1186,10 +1277,45 @@ main (int argc, char **argv) + a number. */ + + static void +-getoptarg (char *arg, char switch_char, char *character, int *number) ++getoptarg (char *arg, char switch_char, char *character, int *character_length, ++ int *character_width, int *number) + { + if (!ISDIGIT (*arg)) +- *character = *arg++; ++ { ++#ifdef HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) /* for multibyte locale. */ ++ { ++ wchar_t wc; ++ size_t mblength; ++ int width; ++ mbstate_t state = {'\0'}; ++ ++ mblength = mbrtowc (&wc, arg, strnlen(arg, MB_LEN_MAX), &state); ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ *character_length = 1; ++ *character_width = 1; ++ } ++ else ++ { ++ *character_length = (mblength < 1) ? 1 : mblength; ++ width = wcwidth (wc); ++ *character_width = (width < 0) ? 0 : width; ++ } ++ ++ strncpy (character, arg, *character_length); ++ arg += *character_length; ++ } ++ else /* for single byte locale. 
*/ ++#endif ++ { ++ *character = *arg++; ++ *character_length = 1; ++ *character_width = 1; ++ } ++ } ++ + if (*arg) + { + long int tmp_long; +@@ -1248,7 +1374,7 @@ init_parameters (int number_of_files) + else + col_sep_string = column_separator; + +- col_sep_length = 1; ++ col_sep_length = col_sep_width = 1; + use_col_separator = true; + } + /* It's rather pointless to define a TAB separator with column +@@ -1279,11 +1405,11 @@ init_parameters (int number_of_files) + TAB_WIDTH (chars_per_input_tab, chars_per_number); */ + + /* Estimate chars_per_text without any margin and keep it constant. */ +- if (number_separator == '\t') ++ if (number_separator[0] == '\t') + number_width = chars_per_number + + TAB_WIDTH (chars_per_default_tab, chars_per_number); + else +- number_width = chars_per_number + 1; ++ number_width = chars_per_number + number_separator_width; + + /* The number is part of the column width unless we are + printing files in parallel. */ +@@ -1298,7 +1424,7 @@ init_parameters (int number_of_files) + } + + chars_per_column = (chars_per_line - chars_used_by_number - +- (columns - 1) * col_sep_length) / columns; ++ (columns - 1) * col_sep_width) / columns; + + if (chars_per_column < 1) + error (EXIT_FAILURE, 0, _("page width too narrow")); +@@ -1423,7 +1549,7 @@ init_funcs (void) + + /* Enlarge p->start_position of first column to use the same form of + padding_not_printed with all columns. */ +- h = h + col_sep_length; ++ h = h + col_sep_width; + + /* This loop takes care of all but the rightmost column. 
*/ + +@@ -1457,7 +1583,7 @@ init_funcs (void) + } + else + { +- h = h_next + col_sep_length; ++ h = h_next + col_sep_width; + h_next = h + chars_per_column; + } + } +@@ -1747,9 +1873,9 @@ static void + align_column (COLUMN *p) + { + padding_not_printed = p->start_position; +- if (padding_not_printed - col_sep_length > 0) ++ if (padding_not_printed - col_sep_width > 0) + { +- pad_across_to (padding_not_printed - col_sep_length); ++ pad_across_to (padding_not_printed - col_sep_width); + padding_not_printed = ANYWHERE; + } + +@@ -2020,13 +2146,13 @@ store_char (char c) + /* May be too generous. */ + buff = X2REALLOC (buff, &buff_allocated); + } +- buff[buff_current++] = c; ++ buff[buff_current++] = (unsigned char) c; + } + + static void + add_line_number (COLUMN *p) + { +- int i; ++ int i, j; + char *s; + int left_cut; + +@@ -2049,22 +2175,24 @@ add_line_number (COLUMN *p) + /* Tabification is assumed for multiple columns, also for n-separators, + but `default n-separator = TAB' hasn't been given priority over + equal column_width also specified by POSIX. */ +- if (number_separator == '\t') ++ if (number_separator[0] == '\t') + { + i = number_width - chars_per_number; + while (i-- > 0) + (p->char_func) (' '); + } + else +- (p->char_func) (number_separator); ++ for (j = 0; j < number_separator_length; j++) ++ (p->char_func) (number_separator[j]); + } + else + /* To comply with POSIX, we avoid any expansion of default TAB + separator with a single column output. No column_width requirement + has to be considered. 
*/ + { +- (p->char_func) (number_separator); +- if (number_separator == '\t') ++ for (j = 0; j < number_separator_length; j++) ++ (p->char_func) (number_separator[j]); ++ if (number_separator[0] == '\t') + output_position = POS_AFTER_TAB (chars_per_output_tab, + output_position); + } +@@ -2225,7 +2353,7 @@ print_white_space (void) + while (goal - h_old > 1 + && (h_new = POS_AFTER_TAB (chars_per_output_tab, h_old)) <= goal) + { +- putchar (output_tab_char); ++ fwrite (output_tab_char, sizeof(char), output_tab_char_length, stdout); + h_old = h_new; + } + while (++h_old <= goal) +@@ -2245,6 +2373,7 @@ print_sep_string (void) + { + char *s; + int l = col_sep_length; ++ int not_space_flag; + + s = col_sep_string; + +@@ -2258,6 +2387,7 @@ print_sep_string (void) + { + for (; separators_not_printed > 0; --separators_not_printed) + { ++ not_space_flag = 0; + while (l-- > 0) + { + /* 3 types of sep_strings: spaces only, spaces and chars, +@@ -2271,12 +2401,15 @@ print_sep_string (void) + } + else + { ++ not_space_flag = 1; + if (spaces_not_printed > 0) + print_white_space (); + putchar (*s++); +- ++output_position; + } + } ++ if (not_space_flag) ++ output_position += col_sep_width; ++ + /* sep_string ends with some spaces */ + if (spaces_not_printed > 0) + print_white_space (); +@@ -2304,7 +2437,7 @@ print_clump (COLUMN *p, int n, char *clu + required number of tabs and spaces. 
*/ + + static void +-print_char (char c) ++print_char_single (char c) + { + if (tabify_output) + { +@@ -2328,6 +2461,74 @@ print_char (char c) + putchar (c); + } + ++#ifdef HAVE_MBRTOWC ++static void ++print_char_multi (char c) ++{ ++ static size_t mbc_pos = 0; ++ static char mbc[MB_LEN_MAX] = {'\0'}; ++ static mbstate_t state = {'\0'}; ++ mbstate_t state_bak; ++ wchar_t wc; ++ size_t mblength; ++ int width; ++ ++ if (tabify_output) ++ { ++ state_bak = state; ++ mbc[mbc_pos++] = c; ++ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); ++ ++ while (mbc_pos > 0) ++ { ++ switch (mblength) ++ { ++ case (size_t)-2: ++ state = state_bak; ++ return; ++ ++ case (size_t)-1: ++ state = state_bak; ++ ++output_position; ++ putchar (mbc[0]); ++ memmove (mbc, mbc + 1, MB_CUR_MAX - 1); ++ --mbc_pos; ++ break; ++ ++ case 0: ++ mblength = 1; ++ ++ default: ++ if (wc == L' ') ++ { ++ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); ++ --mbc_pos; ++ ++spaces_not_printed; ++ return; ++ } ++ else if (spaces_not_printed > 0) ++ print_white_space (); ++ ++ /* Nonprintables are assumed to have width 0, except L'\b'. */ ++ if ((width = wcwidth (wc)) < 1) ++ { ++ if (wc == L'\b') ++ --output_position; ++ } ++ else ++ output_position += width; ++ ++ fwrite (mbc, sizeof(char), mblength, stdout); ++ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); ++ mbc_pos -= mblength; ++ } ++ } ++ return; ++ } ++ putchar (c); ++} ++#endif ++ + /* Skip to page PAGE before printing. + PAGE may be larger than total number of pages. 
*/ + +@@ -2507,9 +2708,9 @@ read_line (COLUMN *p) + align_empty_cols = false; + } + +- if (padding_not_printed - col_sep_length > 0) ++ if (padding_not_printed - col_sep_width > 0) + { +- pad_across_to (padding_not_printed - col_sep_length); ++ pad_across_to (padding_not_printed - col_sep_width); + padding_not_printed = ANYWHERE; + } + +@@ -2610,9 +2811,9 @@ print_stored (COLUMN *p) + } + } + +- if (padding_not_printed - col_sep_length > 0) ++ if (padding_not_printed - col_sep_width > 0) + { +- pad_across_to (padding_not_printed - col_sep_length); ++ pad_across_to (padding_not_printed - col_sep_width); + padding_not_printed = ANYWHERE; + } + +@@ -2625,8 +2826,8 @@ print_stored (COLUMN *p) + if (spaces_not_printed == 0) + { + output_position = p->start_position + end_vector[line]; +- if (p->start_position - col_sep_length == chars_per_margin) +- output_position -= col_sep_length; ++ if (p->start_position - col_sep_width == chars_per_margin) ++ output_position -= col_sep_width; + } + + return true; +@@ -2645,7 +2846,7 @@ print_stored (COLUMN *p) + number of characters is 1.) 
*/ + + static int +-char_to_clump (char c) ++char_to_clump_single (char c) + { + unsigned char uc = c; + char *s = clump_buff; +@@ -2655,10 +2856,10 @@ char_to_clump (char c) + int chars; + int chars_per_c = 8; + +- if (c == input_tab_char) ++ if (c == input_tab_char[0]) + chars_per_c = chars_per_input_tab; + +- if (c == input_tab_char || c == '\t') ++ if (c == input_tab_char[0] || c == '\t') + { + width = TAB_WIDTH (chars_per_c, input_position); + +@@ -2739,6 +2940,154 @@ char_to_clump (char c) + return chars; + } + ++#ifdef HAVE_MBRTOWC ++static int ++char_to_clump_multi (char c) ++{ ++ static size_t mbc_pos = 0; ++ static char mbc[MB_LEN_MAX] = {'\0'}; ++ static mbstate_t state = {'\0'}; ++ mbstate_t state_bak; ++ wchar_t wc; ++ size_t mblength; ++ int wc_width; ++ register char *s = clump_buff; ++ register int i, j; ++ char esc_buff[4]; ++ int width; ++ int chars; ++ int chars_per_c = 8; ++ ++ state_bak = state; ++ mbc[mbc_pos++] = c; ++ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); ++ ++ width = 0; ++ chars = 0; ++ while (mbc_pos > 0) ++ { ++ switch (mblength) ++ { ++ case (size_t)-2: ++ state = state_bak; ++ return 0; ++ ++ case (size_t)-1: ++ state = state_bak; ++ mblength = 1; ++ ++ if (use_esc_sequence || use_cntrl_prefix) ++ { ++ width = +4; ++ chars = +4; ++ *s++ = '\\'; ++ sprintf (esc_buff, "%03o", mbc[0]); ++ for (i = 0; i <= 2; ++i) ++ *s++ = (int) esc_buff[i]; ++ } ++ else ++ { ++ width += 1; ++ chars += 1; ++ *s++ = mbc[0]; ++ } ++ break; ++ ++ case 0: ++ mblength = 1; ++ /* Fall through */ ++ ++ default: ++ if (memcmp (mbc, input_tab_char, mblength) == 0) ++ chars_per_c = chars_per_input_tab; ++ ++ if (memcmp (mbc, input_tab_char, mblength) == 0 || c == '\t') ++ { ++ int width_inc; ++ ++ width_inc = TAB_WIDTH (chars_per_c, input_position); ++ width += width_inc; ++ ++ if (untabify_input) ++ { ++ for (i = width_inc; i; --i) ++ *s++ = ' '; ++ chars += width_inc; ++ } ++ else ++ { ++ for (i = 0; i < mblength; i++) ++ *s++ = mbc[i]; ++ chars += 
mblength; ++ } ++ } ++ else if ((wc_width = wcwidth (wc)) < 1) ++ { ++ if (use_esc_sequence) ++ { ++ for (i = 0; i < mblength; i++) ++ { ++ width += 4; ++ chars += 4; ++ *s++ = '\\'; ++ sprintf (esc_buff, "%03o", c); ++ for (j = 0; j <= 2; ++j) ++ *s++ = (int) esc_buff[j]; ++ } ++ } ++ else if (use_cntrl_prefix) ++ { ++ if (wc < 0200) ++ { ++ width += 2; ++ chars += 2; ++ *s++ = '^'; ++ *s++ = wc ^ 0100; ++ } ++ else ++ { ++ for (i = 0; i < mblength; i++) ++ { ++ width += 4; ++ chars += 4; ++ *s++ = '\\'; ++ sprintf (esc_buff, "%03o", c); ++ for (j = 0; j <= 2; ++j) ++ *s++ = (int) esc_buff[j]; ++ } ++ } ++ } ++ else if (wc == L'\b') ++ { ++ width += -1; ++ chars += 1; ++ *s++ = c; ++ } ++ else ++ { ++ width += 0; ++ chars += mblength; ++ for (i = 0; i < mblength; i++) ++ *s++ = mbc[i]; ++ } ++ } ++ else ++ { ++ width += wc_width; ++ chars += mblength; ++ for (i = 0; i < mblength; i++) ++ *s++ = mbc[i]; ++ } ++ } ++ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); ++ mbc_pos -= mblength; ++ } ++ ++ input_position += width; ++ return chars; ++} ++#endif ++ + /* We've just printed some files and need to clean up things before + looking for more options and printing the next batch of files. + +Index: src/sort.c +=================================================================== +--- src/sort.c.orig 2010-04-21 09:06:17.000000000 +0200 ++++ src/sort.c 2010-05-07 16:34:36.664210645 +0200 +@@ -22,10 +22,19 @@ + + #include + ++#include + #include + #include + #include + #include ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++/* Get isw* functions. */ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++ + #include "system.h" + #include "argmatch.h" + #include "error.h" +@@ -124,14 +133,38 @@ static int decimal_point; + /* Thousands separator; if -1, then there isn't one. */ + static int thousands_sep; + ++static int force_general_numcompare = 0; ++ + /* Nonzero if the corresponding locales are hard.
*/ + static bool hard_LC_COLLATE; +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + static bool hard_LC_TIME; + #endif + + #define NONZERO(x) ((x) != 0) + ++/* Get a multibyte character's byte length. */ ++#define GET_BYTELEN_OF_CHAR(LIM, PTR, MBLENGTH, STATE) \ ++ do \ ++ { \ ++ wchar_t wc; \ ++ mbstate_t state_bak; \ ++ \ ++ state_bak = STATE; \ ++ MBLENGTH = mbrtowc (&wc, PTR, LIM - PTR, &STATE); \ ++ \ ++ switch (MBLENGTH) \ ++ { \ ++ case (size_t)-1: \ ++ case (size_t)-2: \ ++ STATE = state_bak; \ ++ /* Fall through. */ \ ++ case 0: \ ++ MBLENGTH = 1; \ ++ } \ ++ } \ ++ while (0) ++ + /* The kind of blanks for '-b' to skip in various options. */ + enum blanktype { bl_start, bl_end, bl_both }; + +@@ -270,13 +303,11 @@ static bool reverse; + they were read if all keys compare equal. */ + static bool stable; + +-/* If TAB has this value, blanks separate fields. */ +-enum { TAB_DEFAULT = CHAR_MAX + 1 }; +- +-/* Tab character separating fields. If TAB_DEFAULT, then fields are ++/* Tab character separating fields. If tab_length is 0, then fields are + separated by the empty string between a non-blank character and a blank + character. */ +-static int tab = TAB_DEFAULT; ++static char tab[MB_LEN_MAX + 1]; ++static size_t tab_length = 0; + + /* Flag to remove consecutive duplicate lines from the output. + Only the last of a sequence of equal lines will be output. */ +@@ -714,6 +745,44 @@ reap_some (void) + update_proc (pid); + } + ++/* Function pointers. */ ++static void ++(*inittables) (void); ++static char * ++(*begfield) (const struct line*, const struct keyfield *); ++static char * ++(*limfield) (const struct line*, const struct keyfield *); ++static int ++(*getmonth) (char const *, size_t); ++static int ++(*keycompare) (const struct line *, const struct line *); ++static int ++(*numcompare) (const char *, const char *); ++ ++/* Test for white space multibyte character. ++ Set LENGTH to the byte length of the investigated multibyte character. 
*/ ++#if HAVE_MBRTOWC ++static int ++ismbblank (const char *str, size_t len, size_t *length) ++{ ++ size_t mblength; ++ wchar_t wc; ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ mblength = mbrtowc (&wc, str, len, &state); ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ *length = 1; ++ return 0; ++ } ++ ++ *length = (mblength < 1) ? 1 : mblength; ++ return iswblank (wc); ++} ++#endif ++ + /* Clean up any remaining temporary files. */ + + static void +@@ -1158,7 +1227,7 @@ zaptemp (const char *name) + free (node); + } + +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + + static int + struct_month_cmp (const void *m1, const void *m2) +@@ -1173,7 +1242,7 @@ struct_month_cmp (const void *m1, const + /* Initialize the character class tables. */ + + static void +-inittables (void) ++inittables_uni (void) + { + size_t i; + +@@ -1185,7 +1254,7 @@ inittables (void) + fold_toupper[i] = toupper (i); + } + +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + /* If we're not in the "C" locale, read different names for months. 
*/ + if (hard_LC_TIME) + { +@@ -1268,6 +1337,64 @@ specify_nmerge (int oi, char c, char con + xstrtol_fatal (e, oi, c, long_options, s); + } + ++#if HAVE_MBRTOWC ++static void ++inittables_mb (void) ++{ ++ int i, j, k, l; ++ char *name, *s; ++ size_t s_len, mblength; ++ char mbc[MB_LEN_MAX]; ++ wchar_t wc, pwc; ++ mbstate_t state_mb, state_wc; ++ ++ for (i = 0; i < MONTHS_PER_YEAR; i++) ++ { ++ s = (char *) nl_langinfo (ABMON_1 + i); ++ s_len = strlen (s); ++ monthtab[i].name = name = (char *) xmalloc (s_len + 1); ++ monthtab[i].val = i + 1; ++ ++ memset (&state_mb, '\0', sizeof (mbstate_t)); ++ memset (&state_wc, '\0', sizeof (mbstate_t)); ++ ++ for (j = 0; j < s_len;) ++ { ++ if (!ismbblank (s + j, s_len - j, &mblength)) ++ break; ++ j += mblength; ++ } ++ ++ for (k = 0; j < s_len;) ++ { ++ mblength = mbrtowc (&wc, (s + j), (s_len - j), &state_mb); ++ assert (mblength != (size_t)-1 && mblength != (size_t)-2); ++ if (mblength == 0) ++ break; ++ ++ pwc = towupper (wc); ++ if (pwc == wc) ++ { ++ memcpy (mbc, s + j, mblength); ++ j += mblength; ++ } ++ else ++ { ++ j += mblength; ++ mblength = wcrtomb (mbc, pwc, &state_wc); ++ assert (mblength != (size_t)0 && mblength != (size_t)-1); ++ } ++ ++ for (l = 0; l < mblength; l++) ++ name[k++] = mbc[l]; ++ } ++ name[k] = '\0'; ++ } ++ qsort ((void *) monthtab, MONTHS_PER_YEAR, ++ sizeof (struct month), struct_month_cmp); ++} ++#endif ++ + /* Specify the amount of main memory to use when sorting. */ + static void + specify_sort_size (int oi, char c, char const *s) +@@ -1478,7 +1605,7 @@ buffer_linelim (struct buffer const *buf + by KEY in LINE. 
*/ + + static char * +-begfield (const struct line *line, const struct keyfield *key) ++begfield_uni (const struct line *line, const struct keyfield *key) + { + char *ptr = line->text, *lim = ptr + line->length - 1; + size_t sword = key->sword; +@@ -1487,10 +1614,10 @@ begfield (const struct line *line, const + /* The leading field separator itself is included in a field when -t + is absent. */ + +- if (tab != TAB_DEFAULT) ++ if (tab_length) + while (ptr < lim && sword--) + { +- while (ptr < lim && *ptr != tab) ++ while (ptr < lim && *ptr != tab[0]) + ++ptr; + if (ptr < lim) + ++ptr; +@@ -1516,11 +1643,70 @@ begfield (const struct line *line, const + return ptr; + } + ++#if HAVE_MBRTOWC ++static char * ++begfield_mb (const struct line *line, const struct keyfield *key) ++{ ++ int i; ++ char *ptr = line->text, *lim = ptr + line->length - 1; ++ size_t sword = key->sword; ++ size_t schar = key->schar; ++ size_t mblength; ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ if (tab_length) ++ while (ptr < lim && sword--) ++ { ++ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ } ++ else ++ while (ptr < lim && sword--) ++ { ++ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ while (ptr < lim && !ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ } ++ ++ if (key->skipsblanks) ++ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ ++ for (i = 0; i < schar; i++) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ++ if (ptr + mblength > lim) ++ break; ++ else ++ ptr += mblength; ++ } ++ ++ return ptr; ++} ++#endif ++ + /* Return the limit of (a pointer to the first character after) the 
field + in LINE specified by KEY. */ + + static char * +-limfield (const struct line *line, const struct keyfield *key) ++limfield_uni (const struct line *line, const struct keyfield *key) + { + char *ptr = line->text, *lim = ptr + line->length - 1; + size_t eword = key->eword, echar = key->echar; +@@ -1535,10 +1721,10 @@ limfield (const struct line *line, const + `beginning' is the first character following the delimiting TAB. + Otherwise, leave PTR pointing at the first `blank' character after + the preceding field. */ +- if (tab != TAB_DEFAULT) ++ if (tab_length) + while (ptr < lim && eword--) + { +- while (ptr < lim && *ptr != tab) ++ while (ptr < lim && *ptr != tab[0]) + ++ptr; + if (ptr < lim && (eword || echar)) + ++ptr; +@@ -1584,10 +1770,10 @@ limfield (const struct line *line, const + */ + + /* Make LIM point to the end of (one byte past) the current field. */ +- if (tab != TAB_DEFAULT) ++ if (tab_length) + { + char *newlim; +- newlim = memchr (ptr, tab, lim - ptr); ++ newlim = memchr (ptr, tab[0], lim - ptr); + if (newlim) + lim = newlim; + } +@@ -1618,6 +1804,113 @@ limfield (const struct line *line, const + return ptr; + } + ++#if HAVE_MBRTOWC ++static char * ++limfield_mb (const struct line *line, const struct keyfield *key) ++{ ++ char *ptr = line->text, *lim = ptr + line->length - 1; ++ size_t eword = key->eword, echar = key->echar; ++ int i; ++ size_t mblength; ++ mbstate_t state; ++ ++ if (echar == 0) ++ eword++; /* skip all of end field. 
*/ ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ if (tab_length) ++ while (ptr < lim && eword--) ++ { ++ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ if (ptr < lim && (eword | echar)) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ } ++ else ++ while (ptr < lim && eword--) ++ { ++ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ while (ptr < lim && !ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ } ++ ++ ++# ifdef POSIX_UNSPECIFIED ++ /* Make LIM point to the end of (one byte past) the current field. */ ++ if (tab_length) ++ { ++ char *newlim, *p; ++ ++ newlim = NULL; ++ for (p = ptr; p < lim;) ++ { ++ if (memcmp (p, tab, tab_length) == 0) ++ { ++ newlim = p; ++ break; ++ } ++ ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ p += mblength; ++ } ++ } ++ else ++ { ++ char *newlim; ++ newlim = ptr; ++ ++ while (newlim < lim && ismbblank (newlim, lim - newlim, &mblength)) ++ newlim += mblength; ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ while (newlim < lim && !ismbblank (newlim, lim - newlim, &mblength)) ++ newlim += mblength; ++ lim = newlim; ++ } ++# endif ++ ++ if (echar != 0) ++ { ++ /* If we're skipping leading blanks, don't start counting characters ++ * until after skipping past any leading blanks. */ ++ if (key->skipsblanks) ++ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ /* Advance PTR by ECHAR (if possible), but no further than LIM. 
*/ ++ for (i = 0; i < echar; i++) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ++ if (ptr + mblength > lim) ++ break; ++ else ++ ptr += mblength; ++ } ++ } ++ ++ return ptr; ++} ++#endif ++ + /* Fill BUF reading from FP, moving buf->left bytes from the end + of buf->buf to the beginning first. If EOF is reached and the + file wasn't terminated by a newline, supply one. Set up BUF's line +@@ -1700,8 +1993,24 @@ fillbuf (struct buffer *buf, FILE *fp, c + else + { + if (key->skipsblanks) +- while (blanks[to_uchar (*line_start)]) +- line_start++; ++ { ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ size_t mblength; ++ mbstate_t state; ++ memset (&state, '\0', sizeof(mbstate_t)); ++ while (line_start < line->keylim && ++ ismbblank (line_start, ++ line->keylim - line_start, ++ &mblength)) ++ line_start += mblength; ++ } ++ else ++#endif ++ while (blanks[to_uchar (*line_start)]) ++ line_start++; ++ } + line->keybeg = line_start; + } + } +@@ -1739,7 +2048,7 @@ fillbuf (struct buffer *buf, FILE *fp, c + hideously fast. */ + + static int +-numcompare (const char *a, const char *b) ++numcompare_uni (const char *a, const char *b) + { + while (blanks[to_uchar (*a)]) + a++; +@@ -1848,6 +2157,25 @@ human_numcompare (const char *a, const c + : strnumcmp (a, b, decimal_point, thousands_sep)); + } + ++#if HAVE_MBRTOWC ++static int ++numcompare_mb (const char *a, const char *b) ++{ ++ size_t mblength, len; ++ len = strlen (a); /* okay for UTF-8 */ ++ while (*a && ismbblank (a, len > MB_CUR_MAX ? MB_CUR_MAX : len, &mblength)) ++ { ++ a += mblength; ++ len -= mblength; ++ } ++ len = strlen (b); /* okay for UTF-8 */ ++ while (*b && ismbblank (b, len > MB_CUR_MAX ? 
MB_CUR_MAX : len, &mblength)) ++ b += mblength; ++ ++ return strnumcmp (a, b, decimal_point, thousands_sep); ++} ++#endif /* HAVE_MBRTOWC */ ++ + static int + general_numcompare (const char *sa, const char *sb) + { +@@ -1881,7 +2209,7 @@ general_numcompare (const char *sa, cons + Return 0 if the name in S is not recognized. */ + + static int +-getmonth (char const *month, size_t len) ++getmonth_uni (char const *month, size_t len) + { + size_t lo = 0; + size_t hi = MONTHS_PER_YEAR; +@@ -2062,11 +2390,79 @@ compare_version (char *restrict texta, s + return diff; + } + ++#if HAVE_MBRTOWC ++static int ++getmonth_mb (const char *s, size_t len) ++{ ++ char *month; ++ register size_t i; ++ register int lo = 0, hi = MONTHS_PER_YEAR, result; ++ char *tmp; ++ size_t wclength, mblength; ++ const char **pp; ++ const wchar_t **wpp; ++ wchar_t *month_wcs; ++ mbstate_t state; ++ ++ while (len > 0 && ismbblank (s, len, &mblength)) ++ { ++ s += mblength; ++ len -= mblength; ++ } ++ ++ if (len == 0) ++ return 0; ++ ++ month = (char *) alloca (len + 1); ++ ++ tmp = (char *) alloca (len + 1); ++ memcpy (tmp, s, len); ++ tmp[len] = '\0'; ++ pp = (const char **)&tmp; ++ month_wcs = (wchar_t *) alloca ((len + 1) * sizeof (wchar_t)); ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ wclength = mbsrtowcs (month_wcs, pp, len + 1, &state); ++ assert (wclength != (size_t)-1 && *pp == NULL); ++ ++ for (i = 0; i < wclength; i++) ++ { ++ month_wcs[i] = towupper(month_wcs[i]); ++ if (iswblank (month_wcs[i])) ++ { ++ month_wcs[i] = L'\0'; ++ break; ++ } ++ } ++ ++ wpp = (const wchar_t **)&month_wcs; ++ ++ mblength = wcsrtombs (month, wpp, len + 1, &state); ++ assert (mblength != (-1) && *wpp == NULL); ++ ++ do ++ { ++ int ix = (lo + hi) / 2; ++ ++ if (strncmp (month, monthtab[ix].name, strlen (monthtab[ix].name)) < 0) ++ hi = ix; ++ else ++ lo = ix; ++ } ++ while (hi - lo > 1); ++ ++ result = (!strncmp (month, monthtab[lo].name, strlen (monthtab[lo].name)) ++ ? 
monthtab[lo].val : 0); ++ ++ return result; ++} ++#endif ++ + /* Compare two lines A and B trying every key in sequence until there + are no more keys or a difference is found. */ + + static int +-keycompare (const struct line *a, const struct line *b) ++keycompare_uni (const struct line *a, const struct line *b) + { + struct keyfield *key = keylist; + +@@ -2246,6 +2642,179 @@ keycompare (const struct line *a, const + return key->reverse ? -diff : diff; + } + ++#if HAVE_MBRTOWC ++static int ++keycompare_mb (const struct line *a, const struct line *b) ++{ ++ struct keyfield *key = keylist; ++ ++ /* For the first iteration only, the key positions have been ++ precomputed for us. */ ++ char *texta = a->keybeg; ++ char *textb = b->keybeg; ++ char *lima = a->keylim; ++ char *limb = b->keylim; ++ ++ size_t mblength_a, mblength_b; ++ wchar_t wc_a, wc_b; ++ mbstate_t state_a, state_b; ++ ++ int diff; ++ ++ memset (&state_a, '\0', sizeof(mbstate_t)); ++ memset (&state_b, '\0', sizeof(mbstate_t)); ++ ++ for (;;) ++ { ++ char const *translate = key->translate; ++ bool const *ignore = key->ignore; ++ ++ /* Find the lengths. */ ++ size_t lena = lima <= texta ? 0 : lima - texta; ++ size_t lenb = limb <= textb ? 0 : limb - textb; ++ ++ /* Actually compare the fields. */ ++ if (key->random) ++ diff = compare_random (texta, lena, textb, lenb); ++ else if (key->numeric | key->general_numeric | key->human_numeric) ++ { ++ char savea = *lima, saveb = *limb; ++ ++ *lima = *limb = '\0'; ++ diff = (key->numeric ? numcompare (texta, textb) ++ : key->general_numeric ? 
general_numcompare (texta, textb) ++ : human_numcompare (texta, textb, key)); ++ *lima = savea, *limb = saveb; ++ } ++ else if (key->version) ++ diff = compare_version (texta, lena, textb, lenb); ++ else if (key->month) ++ diff = getmonth (texta, lena) - getmonth (textb, lenb); ++ else ++ { ++ if (ignore || translate) ++ { ++ char *copy_a = (char *) alloca (lena + 1 + lenb + 1); ++ char *copy_b = copy_a + lena + 1; ++ size_t new_len_a, new_len_b; ++ size_t i, j; ++ ++ /* Ignore and/or translate chars before comparing. */ ++# define IGNORE_CHARS(NEW_LEN, LEN, TEXT, COPY, WC, MBLENGTH, STATE) \ ++ do \ ++ { \ ++ wchar_t uwc; \ ++ char mbc[MB_LEN_MAX]; \ ++ mbstate_t state_wc; \ ++ \ ++ for (NEW_LEN = i = 0; i < LEN;) \ ++ { \ ++ mbstate_t state_bak; \ ++ \ ++ state_bak = STATE; \ ++ MBLENGTH = mbrtowc (&WC, TEXT + i, LEN - i, &STATE); \ ++ \ ++ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1 \ ++ || MBLENGTH == 0) \ ++ { \ ++ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ ++ STATE = state_bak; \ ++ if (!ignore) \ ++ COPY[NEW_LEN++] = TEXT[i++]; \ ++ continue; \ ++ } \ ++ \ ++ if (ignore) \ ++ { \ ++ if ((ignore == nonprinting && !iswprint (WC)) \ ++ || (ignore == nondictionary \ ++ && !iswalnum (WC) && !iswblank (WC))) \ ++ { \ ++ i += MBLENGTH; \ ++ continue; \ ++ } \ ++ } \ ++ \ ++ if (translate) \ ++ { \ ++ \ ++ uwc = towupper(WC); \ ++ if (WC == uwc) \ ++ { \ ++ memcpy (mbc, TEXT + i, MBLENGTH); \ ++ i += MBLENGTH; \ ++ } \ ++ else \ ++ { \ ++ i += MBLENGTH; \ ++ WC = uwc; \ ++ memset (&state_wc, '\0', sizeof (mbstate_t)); \ ++ \ ++ MBLENGTH = wcrtomb (mbc, WC, &state_wc); \ ++ assert (MBLENGTH != (size_t)-1 && MBLENGTH != 0); \ ++ } \ ++ \ ++ for (j = 0; j < MBLENGTH; j++) \ ++ COPY[NEW_LEN++] = mbc[j]; \ ++ } \ ++ else \ ++ for (j = 0; j < MBLENGTH; j++) \ ++ COPY[NEW_LEN++] = TEXT[i++]; \ ++ } \ ++ COPY[NEW_LEN] = '\0'; \ ++ } \ ++ while (0) ++ IGNORE_CHARS (new_len_a, lena, texta, copy_a, ++ wc_a, mblength_a, state_a); ++ IGNORE_CHARS 
(new_len_b, lenb, textb, copy_b, ++ wc_b, mblength_b, state_b); ++ diff = xmemcoll (copy_a, new_len_a, copy_b, new_len_b); ++ } ++ else if (lena == 0) ++ diff = - NONZERO (lenb); ++ else if (lenb == 0) ++ goto greater; ++ else ++ diff = xmemcoll (texta, lena, textb, lenb); ++ } ++ ++ if (diff) ++ goto not_equal; ++ ++ key = key->next; ++ if (! key) ++ break; ++ ++ /* Find the beginning and limit of the next field. */ ++ if (key->eword != -1) ++ lima = limfield (a, key), limb = limfield (b, key); ++ else ++ lima = a->text + a->length - 1, limb = b->text + b->length - 1; ++ ++ if (key->sword != -1) ++ texta = begfield (a, key), textb = begfield (b, key); ++ else ++ { ++ texta = a->text, textb = b->text; ++ if (key->skipsblanks) ++ { ++ while (texta < lima && ismbblank (texta, lima - texta, &mblength_a)) ++ texta += mblength_a; ++ while (textb < limb && ismbblank (textb, limb - textb, &mblength_b)) ++ textb += mblength_b; ++ } ++ } ++ } ++ ++ return 0; ++ ++greater: ++ diff = 1; ++not_equal: ++ return key->reverse ? -diff : diff; ++} ++#endif ++ + /* Compare two lines A and B, returning negative, zero, or positive + depending on whether A compares less than, equal to, or greater than B. 
*/ + +@@ -3244,7 +3813,7 @@ main (int argc, char **argv) + initialize_exit_failure (SORT_FAILURE); + + hard_LC_COLLATE = hard_locale (LC_COLLATE); +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + hard_LC_TIME = hard_locale (LC_TIME); + #endif + +@@ -3265,6 +3834,27 @@ main (int argc, char **argv) + thousands_sep = -1; + } + ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ inittables = inittables_mb; ++ begfield = begfield_mb; ++ limfield = limfield_mb; ++ getmonth = getmonth_mb; ++ keycompare = keycompare_mb; ++ numcompare = numcompare_mb; ++ } ++ else ++#endif ++ { ++ inittables = inittables_uni; ++ begfield = begfield_uni; ++ limfield = limfield_uni; ++ getmonth = getmonth_uni; ++ keycompare = keycompare_uni; ++ numcompare = numcompare_uni; ++ } ++ + have_read_stdin = false; + inittables (); + +@@ -3536,13 +4126,35 @@ main (int argc, char **argv) + + case 't': + { +- char newtab = optarg[0]; +- if (! newtab) ++ char newtab[MB_LEN_MAX + 1]; ++ size_t newtab_length = 1; ++ strncpy (newtab, optarg, MB_LEN_MAX); ++ if (! newtab[0]) + error (SORT_FAILURE, 0, _("empty tab")); +- if (optarg[1]) ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ wchar_t wc; ++ mbstate_t state; ++ size_t i; ++ ++ memset (&state, '\0', sizeof (mbstate_t)); ++ newtab_length = mbrtowc (&wc, newtab, strnlen (newtab, ++ MB_LEN_MAX), ++ &state); ++ switch (newtab_length) ++ { ++ case (size_t) -1: ++ case (size_t) -2: ++ case 0: ++ newtab_length = 1; ++ } ++ } ++#endif ++ if (newtab_length == 1 && optarg[1]) + { + if (STREQ (optarg, "\\0")) +- newtab = '\0'; ++ newtab[0] = '\0'; + else + { + /* Provoke with `sort -txx'. 
Complain about +@@ -3553,9 +4165,12 @@ main (int argc, char **argv) + quote (optarg)); + } + } +- if (tab != TAB_DEFAULT && tab != newtab) ++ if (tab_length ++ && (tab_length != newtab_length ++ || memcmp (tab, newtab, tab_length) != 0)) + error (SORT_FAILURE, 0, _("incompatible tabs")); +- tab = newtab; ++ memcpy (tab, newtab, newtab_length); ++ tab_length = newtab_length; + } + break; + +Index: src/unexpand.c +=================================================================== +--- src/unexpand.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/unexpand.c 2010-05-07 16:13:31.016492129 +0200 +@@ -39,11 +39,28 @@ + #include + #include + #include ++ ++/* Get mbstate_t, mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ + #include "system.h" + #include "error.h" + #include "quote.h" + #include "xstrndup.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "unexpand" + +@@ -103,6 +120,208 @@ static struct option const longopts[] = + {NULL, 0, NULL, 0} + }; + ++static FILE *next_file (FILE *fp); ++ ++#if HAVE_MBRTOWC ++static void ++unexpand_multibyte (void) ++{ ++ FILE *fp; /* Input stream. */ ++ mbstate_t i_state; /* Current shift state of the input stream. */ ++ mbstate_t i_state_bak; /* Back up the I_STATE. */ ++ mbstate_t o_state; /* Current shift state of the output stream. */ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos = buf; /* Next read position of BUF. */ ++ size_t buflen = 0; /* The length of the byte sequence in buf. */ ++ wint_t wc; /* A gotten wide character. 
*/ ++ size_t mblength; /* The byte size of a multibyte character ++ which shows as same character as WC. */ ++ ++ /* Index in `tab_list' of next tabstop: */ ++ int tab_index = 0; /* For calculating width of pending tabs. */ ++ int print_tab_index = 0; /* For printing as many tabs as possible. */ ++ unsigned int column = 0; /* Column on screen of next char. */ ++ int next_tab_column; /* Column the next tab stop is on. */ ++ int convert = 1; /* If nonzero, perform translations. */ ++ unsigned int pending = 0; /* Pending columns of blanks. */ ++ ++ fp = next_file ((FILE *) NULL); ++ if (fp == NULL) ++ return; ++ ++ memset (&o_state, '\0', sizeof(mbstate_t)); ++ memset (&i_state, '\0', sizeof(mbstate_t)); ++ ++ for (;;) ++ { ++ if (buflen < MB_LEN_MAX && !feof(fp) && !ferror(fp)) ++ { ++ memmove (buf, bufpos, buflen); ++ buflen += fread (buf + buflen, sizeof(char), BUFSIZ, fp); ++ bufpos = buf; ++ } ++ ++ /* Get a wide character. */ ++ if (buflen < 1) ++ { ++ mblength = 1; ++ wc = WEOF; ++ } ++ else ++ { ++ i_state_bak = i_state; ++ mblength = mbrtowc ((wchar_t *)&wc, bufpos, buflen, &i_state); ++ } ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ i_state = i_state_bak; ++ wc = L'\0'; ++ } ++ ++ if (wc == L' ' && convert && column < INT_MAX) ++ { ++ ++pending; ++ ++column; ++ } ++ else if (wc == L'\t' && convert) ++ { ++ if (tab_size == 0) ++ { ++ /* Do not let tab_index == first_free_tab; ++ stop when it is 1 less. */ ++ while (tab_index < first_free_tab - 1 ++ && column >= tab_list[tab_index]) ++ tab_index++; ++ next_tab_column = tab_list[tab_index]; ++ if (tab_index < first_free_tab - 1) ++ tab_index++; ++ if (column >= next_tab_column) ++ { ++ convert = 0; /* Ran out of tab stops. */ ++ goto flush_pend_mb; ++ } ++ } ++ else ++ { ++ next_tab_column = column + tab_size - column % tab_size; ++ } ++ pending += next_tab_column - column; ++ column = next_tab_column; ++ } ++ else ++ { ++flush_pend_mb: ++ /* Flush pending spaces. 
Print as many tabs as possible, ++ then print the rest as spaces. */ ++ if (pending == 1) ++ { ++ putchar (' '); ++ pending = 0; ++ } ++ column -= pending; ++ while (pending > 0) ++ { ++ if (tab_size == 0) ++ { ++ /* Do not let print_tab_index == first_free_tab; ++ stop when it is 1 less. */ ++ while (print_tab_index < first_free_tab - 1 ++ && column >= tab_list[print_tab_index]) ++ print_tab_index++; ++ next_tab_column = tab_list[print_tab_index]; ++ if (print_tab_index < first_free_tab - 1) ++ print_tab_index++; ++ } ++ else ++ { ++ next_tab_column = ++ column + tab_size - column % tab_size; ++ } ++ if (next_tab_column - column <= pending) ++ { ++ putchar ('\t'); ++ pending -= next_tab_column - column; ++ column = next_tab_column; ++ } ++ else ++ { ++ --print_tab_index; ++ column += pending; ++ while (pending != 0) ++ { ++ putchar (' '); ++ pending--; ++ } ++ } ++ } ++ ++ if (wc == WEOF) ++ { ++ fp = next_file (fp); ++ if (fp == NULL) ++ break; /* No more files. */ ++ else ++ { ++ memset (&i_state, '\0', sizeof(mbstate_t)); ++ continue; ++ } ++ } ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ if (convert) ++ { ++ ++column; ++ if (convert_entire_line == 0) ++ convert = 0; ++ } ++ mblength = 1; ++ putchar (buf[0]); ++ } ++ else if (mblength == 0) ++ { ++ if (convert && convert_entire_line == 0) ++ convert = 0; ++ mblength = 1; ++ putchar ('\0'); ++ } ++ else ++ { ++ if (convert) ++ { ++ if (wc == L'\b') ++ { ++ if (column > 0) ++ --column; ++ } ++ else ++ { ++ int width; /* The width of WC. */ ++ ++ width = wcwidth (wc); ++ column += (width > 0) ? 
width : 0; ++ if (convert_entire_line == 0) ++ convert = 0; ++ } ++ } ++ ++ if (wc == L'\n') ++ { ++ tab_index = print_tab_index = 0; ++ column = pending = 0; ++ convert = 1; ++ } ++ fwrite (bufpos, sizeof(char), mblength, stdout); ++ } ++ } ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++} ++#endif ++ ++ + void + usage (int status) + { +@@ -524,7 +743,12 @@ main (int argc, char **argv) + + file_list = (optind < argc ? &argv[optind] : stdin_argv); + +- unexpand (); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ unexpand_multibyte (); ++ else ++#endif ++ unexpand (); + + if (have_read_stdin && fclose (stdin) != 0) + error (EXIT_FAILURE, errno, "-"); +Index: src/uniq.c +=================================================================== +--- src/uniq.c.orig 2010-03-13 16:14:09.000000000 +0100 ++++ src/uniq.c 2010-05-07 16:41:34.000063405 +0200 +@@ -21,6 +21,16 @@ + #include + #include + ++/* Get mbstate_t, mbrtowc(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ ++/* Get isw* functions. */ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++ + #include "system.h" + #include "argmatch.h" + #include "linebuffer.h" +@@ -31,7 +41,19 @@ + #include "stdio--.h" + #include "xmemcoll.h" + #include "xstrtol.h" +-#include "memcasecmp.h" ++#include "xmemcoll.h" ++ ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "uniq" +@@ -107,6 +129,10 @@ static enum delimit_method const delimit + /* Select whether/how to delimit groups of duplicate lines. */ + static enum delimit_method delimit_groups; + ++/* Function pointers. 
*/ ++static char * ++(*find_field) (struct linebuffer *line); ++ + static struct option const longopts[] = + { + {"count", no_argument, NULL, 'c'}, +@@ -206,7 +232,7 @@ size_opt (char const *opt, char const *m + return a pointer to the beginning of the line's field to be compared. */ + + static char * +-find_field (struct linebuffer const *line) ++find_field_uni (struct linebuffer *line) + { + size_t count; + char const *lp = line->buffer; +@@ -227,6 +253,83 @@ find_field (struct linebuffer const *lin + return line->buffer + i; + } + ++#if HAVE_MBRTOWC ++ ++# define MBCHAR_TO_WCHAR(WC, MBLENGTH, LP, POS, SIZE, STATEP, CONVFAIL) \ ++ do \ ++ { \ ++ mbstate_t state_bak; \ ++ \ ++ CONVFAIL = 0; \ ++ state_bak = *STATEP; \ ++ \ ++ MBLENGTH = mbrtowc (&WC, LP + POS, SIZE - POS, STATEP); \ ++ \ ++ switch (MBLENGTH) \ ++ { \ ++ case (size_t)-2: \ ++ case (size_t)-1: \ ++ *STATEP = state_bak; \ ++ CONVFAIL++; \ ++ /* Fall through */ \ ++ case 0: \ ++ MBLENGTH = 1; \ ++ } \ ++ } \ ++ while (0) ++ ++static char * ++find_field_multi (struct linebuffer *line) ++{ ++ size_t count; ++ char *lp = line->buffer; ++ size_t size = line->length - 1; ++ size_t pos; ++ size_t mblength; ++ wchar_t wc; ++ mbstate_t *statep; ++ int convfail; ++ ++ pos = 0; ++ statep = &(line->state); ++ ++ /* skip fields. */ ++ for (count = 0; count < skip_fields && pos < size; count++) ++ { ++ while (pos < size) ++ { ++ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); ++ ++ if (convfail || !iswblank (wc)) ++ { ++ pos += mblength; ++ break; ++ } ++ pos += mblength; ++ } ++ ++ while (pos < size) ++ { ++ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); ++ ++ if (!convfail && iswblank (wc)) ++ break; ++ ++ pos += mblength; ++ } ++ } ++ ++ /* skip chars. 
*/ ++ for (count = 0; count < skip_chars && pos < size; count++) ++ { ++ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); ++ pos += mblength; ++ } ++ ++ return lp + pos; ++} ++#endif ++ + /* Return false if two strings OLD and NEW match, true if not. + OLD and NEW point not to the beginnings of the lines + but rather to the beginnings of the fields to compare. +@@ -235,6 +338,8 @@ find_field (struct linebuffer const *lin + static bool + different (char *old, char *new, size_t oldlen, size_t newlen) + { ++ char *copy_old, *copy_new; ++ + if (check_chars < oldlen) + oldlen = check_chars; + if (check_chars < newlen) +@@ -242,15 +347,93 @@ different (char *old, char *new, size_t + + if (ignore_case) + { +- /* FIXME: This should invoke strcoll somehow. */ +- return oldlen != newlen || memcasecmp (old, new, oldlen); ++ size_t i; ++ ++ copy_old = alloca (oldlen + 1); ++ copy_new = alloca (oldlen + 1); ++ ++ for (i = 0; i < oldlen; i++) ++ { ++ copy_old[i] = toupper (old[i]); ++ copy_new[i] = toupper (new[i]); ++ } + } +- else if (hard_LC_COLLATE) +- return xmemcoll (old, oldlen, new, newlen) != 0; + else +- return oldlen != newlen || memcmp (old, new, oldlen); ++ { ++ copy_old = (char *)old; ++ copy_new = (char *)new; ++ } ++ ++ return xmemcoll (copy_old, oldlen, copy_new, newlen); + } + ++#if HAVE_MBRTOWC ++static int ++different_multi (const char *old, const char *new, size_t oldlen, size_t newlen, mbstate_t oldstate, mbstate_t newstate) ++{ ++ size_t i, j, chars; ++ const char *str[2]; ++ char *copy[2]; ++ size_t len[2]; ++ mbstate_t state[2]; ++ size_t mblength; ++ wchar_t wc, uwc; ++ mbstate_t state_bak; ++ ++ str[0] = old; ++ str[1] = new; ++ len[0] = oldlen; ++ len[1] = newlen; ++ state[0] = oldstate; ++ state[1] = newstate; ++ ++ for (i = 0; i < 2; i++) ++ { ++ copy[i] = alloca (len[i] + 1); ++ ++ for (j = 0, chars = 0; j < len[i] && chars < check_chars; chars++) ++ { ++ state_bak = state[i]; ++ mblength = mbrtowc (&wc, str[i] + j, len[i] - j, 
&(state[i])); ++ ++ switch (mblength) ++ { ++ case (size_t)-1: ++ case (size_t)-2: ++ state[i] = state_bak; ++ /* Fall through */ ++ case 0: ++ mblength = 1; ++ break; ++ ++ default: ++ if (ignore_case) ++ { ++ uwc = towupper (wc); ++ ++ if (uwc != wc) ++ { ++ mbstate_t state_wc; ++ ++ memset (&state_wc, '\0', sizeof(mbstate_t)); ++ wcrtomb (copy[i] + j, uwc, &state_wc); ++ } ++ else ++ memcpy (copy[i] + j, str[i] + j, mblength); ++ } ++ else ++ memcpy (copy[i] + j, str[i] + j, mblength); ++ } ++ j += mblength; ++ } ++ copy[i][j] = '\0'; ++ len[i] = j; ++ } ++ ++ return xmemcoll (copy[0], len[0], copy[1], len[1]); ++} ++#endif ++ + /* Output the line in linebuffer LINE to standard output + provided that the switches say it should be output. + MATCH is true if the line matches the previous line. +@@ -303,15 +486,43 @@ check_file (const char *infile, const ch + { + char *prevfield IF_LINT (= NULL); + size_t prevlen IF_LINT (= 0); ++#if HAVE_MBRTOWC ++ mbstate_t prevstate; ++ ++ memset (&prevstate, '\0', sizeof (mbstate_t)); ++#endif + + while (!feof (stdin)) + { + char *thisfield; + size_t thislen; ++#if HAVE_MBRTOWC ++ mbstate_t thisstate; ++#endif ++ + if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) + break; + thisfield = find_field (thisline); + thislen = thisline->length - 1 - (thisfield - thisline->buffer); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ thisstate = thisline->state; ++ ++ if (prevline->length == 0 || different_multi ++ (thisfield, prevfield, thislen, prevlen, thisstate, prevstate)) ++ { ++ fwrite (thisline->buffer, sizeof (char), ++ thisline->length, stdout); ++ ++ SWAP_LINES (prevline, thisline); ++ prevfield = thisfield; ++ prevlen = thislen; ++ prevstate = thisstate; ++ } ++ } ++ else ++#endif + if (prevline->length == 0 + || different (thisfield, prevfield, thislen, prevlen)) + { +@@ -330,17 +541,26 @@ check_file (const char *infile, const ch + size_t prevlen; + uintmax_t match_count = 0; + bool first_delimiter = true; ++#if 
HAVE_MBRTOWC ++ mbstate_t prevstate; ++#endif + + if (readlinebuffer_delim (prevline, stdin, delimiter) == 0) + goto closefiles; + prevfield = find_field (prevline); + prevlen = prevline->length - 1 - (prevfield - prevline->buffer); ++#if HAVE_MBRTOWC ++ prevstate = prevline->state; ++#endif + + while (!feof (stdin)) + { + bool match; + char *thisfield; + size_t thislen; ++#if HAVE_MBRTOWC ++ mbstate_t thisstate; ++#endif + if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) + { + if (ferror (stdin)) +@@ -349,6 +569,15 @@ check_file (const char *infile, const ch + } + thisfield = find_field (thisline); + thislen = thisline->length - 1 - (thisfield - thisline->buffer); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ thisstate = thisline->state; ++ match = !different_multi (thisfield, prevfield, ++ thislen, prevlen, thisstate, prevstate); ++ } ++ else ++#endif + match = !different (thisfield, prevfield, thislen, prevlen); + match_count += match; + +@@ -381,6 +610,9 @@ check_file (const char *infile, const ch + SWAP_LINES (prevline, thisline); + prevfield = thisfield; + prevlen = thislen; ++#if HAVE_MBRTOWC ++ prevstate = thisstate; ++#endif + if (!match) + match_count = 0; + } +@@ -426,6 +658,19 @@ main (int argc, char **argv) + + atexit (close_stdout); + ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ find_field = find_field_multi; ++ } ++ else ++#endif ++ { ++ find_field = find_field_uni; ++ } ++ ++ ++ + skip_chars = 0; + skip_fields = 0; + check_chars = SIZE_MAX; +Index: tests/Makefile.am +=================================================================== +--- tests/Makefile.am.orig 2010-04-20 21:52:05.000000000 +0200 ++++ tests/Makefile.am 2010-05-07 16:38:36.972072320 +0200 +@@ -224,6 +224,7 @@ TESTS = \ + misc/sort-compress \ + misc/sort-continue \ + misc/sort-files0-from \ ++ misc/sort-mb-tests \ + misc/sort-merge \ + misc/sort-merge-fdlimit \ + misc/sort-month \ +@@ -474,6 +475,10 @@ TESTS = \ + $(root_tests) + + pr_data = \ ++ misc/mb1.X \ ++ 
misc/mb1.I \ ++ misc/mb2.X \ ++ misc/mb2.I \ + pr/0F \ + pr/0FF \ + pr/0FFnt \ +Index: tests/misc/cut +=================================================================== +--- tests/misc/cut.orig 2010-01-01 14:06:47.000000000 +0100 ++++ tests/misc/cut 2010-05-07 16:13:31.144492080 +0200 +@@ -26,7 +26,7 @@ use strict; + my $prog = 'cut'; + my $try = "Try \`$prog --help' for more information.\n"; + my $from_1 = "$prog: fields and positions are numbered from 1\n$try"; +-my $inval = "$prog: invalid byte or field list\n$try"; ++my $inval = "$prog: invalid byte, character or field list\n$try"; + my $no_endpoint = "$prog: invalid range with no endpoint: -\n$try"; + + my @Tests = +@@ -141,7 +141,7 @@ my @Tests = + + # None of the following invalid ranges provoked an error up to coreutils-6.9. + ['inval1', qw(-f 2-0), {IN=>''}, {OUT=>''}, {EXIT=>1}, +- {ERR=>"$prog: invalid decreasing range\n$try"}], ++ {ERR=>"$prog: invalid byte, character or field list\n$try"}], + ['inval2', qw(-f -), {IN=>''}, {OUT=>''}, {EXIT=>1}, {ERR=>$no_endpoint}], + ['inval3', '-f', '4,-', {IN=>''}, {OUT=>''}, {EXIT=>1}, {ERR=>$no_endpoint}], + ['inval4', '-f', '1-2,-', {IN=>''}, {OUT=>''}, {EXIT=>1}, {ERR=>$no_endpoint}], +Index: tests/misc/mb1.I +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/mb1.I 2010-05-07 16:13:31.188492096 +0200 +@@ -0,0 +1,4 @@ ++Apple@10 ++Banana@5 ++Citrus@20 ++Cherry@30 +Index: tests/misc/mb1.X +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/mb1.X 2010-05-07 16:13:31.224492101 +0200 +@@ -0,0 +1,4 @@ ++Banana@5 ++Apple@10 ++Citrus@20 ++Cherry@30 +Index: tests/misc/mb2.I +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/mb2.I 2010-05-07 16:13:31.248492220 +0200 +@@ -0,0 +1,4 @@ ++Apple@AA10@@20 ++Banana@AA5@@30 
++Citrus@AA20@@5 ++Cherry@AA30@@10 +Index: tests/misc/mb2.X +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/mb2.X 2010-05-07 16:13:31.276492153 +0200 +@@ -0,0 +1,4 @@ ++Citrus@AA20@@5 ++Cherry@AA30@@10 ++Apple@AA10@@20 ++Banana@AA5@@30 +Index: tests/misc/sort-mb-tests +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/sort-mb-tests 2010-05-07 16:13:31.312492158 +0200 +@@ -0,0 +1,58 @@ ++#! /bin/sh ++case $# in ++ 0) xx='../src/sort';; ++ *) xx="$1";; ++esac ++test "$VERBOSE" && echo=echo || echo=: ++$echo testing program: $xx ++errors=0 ++test "$srcdir" || srcdir=. ++test "$VERBOSE" && $xx --version 2> /dev/null ++ ++export LC_ALL=en_US.UTF-8 ++locale -k LC_CTYPE 2>&1 | grep -q charmap.*UTF-8 || exit 77 ++errors=0 ++ ++$xx -t @ -k2 -n misc/mb1.I > misc/mb1.O ++code=$? ++if test $code != 0; then ++ $echo "Test mb1 failed: $xx return code $code differs from expected value 0" 1>&2 ++ errors=`expr $errors + 1` ++else ++ cmp misc/mb1.O $srcdir/misc/mb1.X > /dev/null 2>&1 ++ case $? in ++ 0) if test "$VERBOSE"; then $echo "passed mb1"; fi;; ++ 1) $echo "Test mb1 failed: files misc/mb1.O and $srcdir/misc/mb1.X differ" 1>&2 ++ (diff -c misc/mb1.O $srcdir/misc/mb1.X) 2> /dev/null ++ errors=`expr $errors + 1`;; ++ 2) $echo "Test mb1 may have failed." 1>&2 ++ $echo The command "cmp misc/mb1.O $srcdir/misc/mb1.X" failed. 1>&2 ++ errors=`expr $errors + 1`;; ++ esac ++fi ++ ++$xx -t @ -k4 -n misc/mb2.I > misc/mb2.O ++code=$? ++if test $code != 0; then ++ $echo "Test mb2 failed: $xx return code $code differs from expected value 0" 1>&2 ++ errors=`expr $errors + 1` ++else ++ cmp misc/mb2.O $srcdir/misc/mb2.X > /dev/null 2>&1 ++ case $? 
in ++ 0) if test "$VERBOSE"; then $echo "passed mb2"; fi;; ++ 1) $echo "Test mb2 failed: files misc/mb2.O and $srcdir/misc/mb2.X differ" 1>&2 ++ (diff -c misc/mb2.O $srcdir/misc/mb2.X) 2> /dev/null ++ errors=`expr $errors + 1`;; ++ 2) $echo "Test mb2 may have failed." 1>&2 ++ $echo The command "cmp misc/mb2.O $srcdir/misc/mb2.X" failed. 1>&2 ++ errors=`expr $errors + 1`;; ++ esac ++fi ++ ++if test $errors = 0; then ++ $echo Passed all 113 tests. 1>&2 ++else ++ $echo Failed $errors tests. 1>&2 ++fi ++test $errors = 0 || errors=1 ++exit $errors diff --git a/coreutils-8.5.patch b/coreutils-8.5.patch new file mode 100644 index 0000000..159f791 --- /dev/null +++ b/coreutils-8.5.patch @@ -0,0 +1,67 @@ +Index: gnulib-tests/test-isnanl.h +=================================================================== +--- gnulib-tests/test-isnanl.h.orig 2010-03-13 16:21:09.000000000 +0100 ++++ gnulib-tests/test-isnanl.h 2010-05-05 13:47:16.003024388 +0200 +@@ -63,7 +63,7 @@ main () + /* Quiet NaN. */ + ASSERT (isnanl (NaNl ())); + +-#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT ++#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT && 0 + /* A bit pattern that is different from a Quiet NaN. With a bit of luck, + it's a Signalling NaN. 
*/ + { +@@ -105,6 +105,7 @@ main () + { LDBL80_WORDS (0xFFFF, 0x83333333, 0x00000000) }; + ASSERT (isnanl (x.value)); + } ++#if 0 + /* The isnanl function should recognize Pseudo-NaNs, Pseudo-Infinities, + Pseudo-Zeroes, Unnormalized Numbers, and Pseudo-Denormals, as defined in + Intel IA-64 Architecture Software Developer's Manual, Volume 1: +@@ -138,6 +139,7 @@ main () + ASSERT (isnanl (x.value)); + } + #endif ++#endif + + return 0; + } +Index: src/system.h +=================================================================== +--- src/system.h.orig 2010-04-20 21:52:05.000000000 +0200 ++++ src/system.h 2010-05-05 13:38:20.923127872 +0200 +@@ -138,7 +138,7 @@ enum + # define DEV_BSIZE BBSIZE + #endif + #ifndef DEV_BSIZE +-# define DEV_BSIZE 4096 ++# define DEV_BSIZE 512 + #endif + + /* Extract or fake data from a `struct stat'. +Index: tests/misc/help-version +=================================================================== +--- tests/misc/help-version.orig 2010-04-20 21:52:05.000000000 +0200 ++++ tests/misc/help-version 2010-05-05 13:44:11.919859133 +0200 +@@ -239,6 +239,7 @@ lbracket_setup () { args=": ]"; } + for i in $built_programs; do + # Skip these. 
+ case $i in chroot|stty|tty|false|chcon|runcon) continue;; esac ++ case $i in df) continue;; esac + + rm -rf $tmp_in $tmp_in2 $tmp_dir $tmp_out $bigZ_in $zin $zin2 + echo z |gzip > $zin +Index: tests/other-fs-tmpdir +=================================================================== +--- tests/other-fs-tmpdir.orig 2010-01-01 14:06:47.000000000 +0100 ++++ tests/other-fs-tmpdir 2010-05-05 13:38:20.982872202 +0200 +@@ -43,6 +43,8 @@ for d in $CANDIDATE_TMP_DIRS; do + fi + + done ++# Autobuild hack ++test -f /bin/uname.bin && other_partition_tmpdir= + + if test -z "$other_partition_tmpdir"; then + skip_test_ \ diff --git a/coreutils-8.5.tar.xz b/coreutils-8.5.tar.xz new file mode 100644 index 0000000..cd6bae3 --- /dev/null +++ b/coreutils-8.5.tar.xz @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5aa855caa08b94ccd632510d9ab265646d2ee11498c7efff205b27c2437dec5a +size 4531488 diff --git a/coreutils-add_ogv.patch b/coreutils-add_ogv.patch index b6fbd40..9b43b11 100644 --- a/coreutils-add_ogv.patch +++ b/coreutils-add_ogv.patch @@ -1,6 +1,8 @@ ---- src/dircolors.hin -+++ src/dircolors.hin -@@ -151,6 +151,7 @@ +Index: src/dircolors.hin +=================================================================== +--- src/dircolors.hin.orig 2010-04-20 21:52:04.000000000 +0200 ++++ src/dircolors.hin 2010-05-05 16:22:16.375859309 +0200 +@@ -158,6 +158,7 @@ EXEC 01;32 .m2v 01;35 .mkv 01;35 .ogm 01;35 diff --git a/coreutils-cifs-afs.diff b/coreutils-cifs-afs.diff deleted file mode 100644 index 41cd49f..0000000 --- a/coreutils-cifs-afs.diff +++ /dev/null @@ -1,35 +0,0 @@ ---- src/fs.h -+++ src/fs.h -@@ -5,10 +5,12 @@ - #if defined __linux__ - # define S_MAGIC_ADFS 0xADF5 - # define S_MAGIC_AFFS 0xADFF -+# define S_MAGIC_AFS 0x6B414653 - # define S_MAGIC_AUTOFS 0x187 - # define S_MAGIC_BEFS 0x42465331 - # define S_MAGIC_BFS 0x1BADFACE - # define S_MAGIC_BINFMT_MISC 0x42494e4d -+# define S_MAGIC_CIFS 0xFF534D42 - # define S_MAGIC_CODA 0x73757245 - # define 
S_MAGIC_COH 0x012FF7B7 - # define S_MAGIC_CRAMFS 0x28CD3D45 ---- src/stat.c -+++ src/stat.c -@@ -219,6 +219,8 @@ human_fstype (STRUCT_STATVFS const *stat - return "adfs"; - case S_MAGIC_AFFS: /* 0xADFF */ - return "affs"; -+ case S_MAGIC_AFS: /* 0x6B414653 */ -+ return "afs"; - case S_MAGIC_AUTOFS: /* 0x187 */ - return "autofs"; - case S_MAGIC_BEFS: /* 0x42465331 */ -@@ -227,6 +229,8 @@ human_fstype (STRUCT_STATVFS const *stat - return "bfs"; - case S_MAGIC_BINFMT_MISC: /* 0x42494e4d */ - return "binfmt_misc"; -+ case S_MAGIC_CIFS: /* 0xFF534D42 */ -+ return "cifs"; - case S_MAGIC_CODA: /* 0x73757245 */ - return "coda"; - case S_MAGIC_COH: /* 0x012FF7B7 */ diff --git a/coreutils-fix_distcheck.patch b/coreutils-fix_distcheck.patch deleted file mode 100644 index 9fc3c8e..0000000 --- a/coreutils-fix_distcheck.patch +++ /dev/null @@ -1,80 +0,0 @@ -Index: maint.mk -=================================================================== ---- maint.mk.orig 2009-02-18 16:13:19.000000000 +0100 -+++ maint.mk 2010-05-04 17:45:14.515359143 +0200 -@@ -623,14 +623,14 @@ bin=bin-$$$$ - - write_loser = printf '\#!%s\necho $$0: bad path 1>&2; exit 1\n' '$(SHELL)' - --TMPDIR ?= /tmp --t=$(TMPDIR)/$(PACKAGE)/test -+tmpdir = $(abs_top_builddir)/tests/torture -+ - pfx=$(t)/i - - # More than once, tainted build and source directory names would - # have caused at least one "make check" test to apply "chmod 700" - # to all directories under $HOME. Make sure it doesn't happen again. 
--tp := $(shell echo "$(TMPDIR)/$(PACKAGE)-$$$$") -+tp = $(tmpdir)/taint - t_prefix = $(tp)/a - t_taint = '$(t_prefix) b' - fake_home = $(tp)/home -@@ -648,10 +648,11 @@ taint-distcheck: $(DIST_ARCHIVES) - touch $(fake_home)/f - mkdir -p $(fake_home)/d/e - ls -lR $(fake_home) $(t_prefix) > $(tp)/.ls-before -+ HOME=$(fake_home); export HOME; \ - cd $(t_taint)/$(distdir) \ - && ./configure \ - && $(MAKE) \ -- && HOME=$(fake_home) $(MAKE) check \ -+ && $(MAKE) check \ - && ls -lR $(fake_home) $(t_prefix) > $(tp)/.ls-after \ - && diff $(tp)/.ls-before $(tp)/.ls-after \ - && test -d $(t_prefix) -@@ -670,6 +671,7 @@ endef - # Install, then verify that all binaries and man pages are in place. - # Note that neither the binary, ginstall, nor the ].1 man page is installed. - define my-instcheck -+ echo running my-instcheck; \ - $(MAKE) prefix=$(pfx) install \ - && test ! -f $(pfx)/bin/ginstall \ - && { fail=0; \ -@@ -688,6 +690,7 @@ endef - - define coreutils-path-check - { \ -+ echo running coreutils-path-check; \ - if test -f $(srcdir)/src/true.c; then \ - fail=1; \ - mkdir $(bin) \ -@@ -732,19 +735,20 @@ my-distcheck: $(DIST_ARCHIVES) $(local-c - -rm -rf $(t) - mkdir -p $(t) - GZIP=$(GZIP_ENV) $(AMTAR) -C $(t) -zxf $(distdir).tar.gz -- cd $(t)/$(distdir) \ -- && ./configure --disable-nls \ -- && $(MAKE) CFLAGS='$(warn_cflags)' \ -- AM_MAKEFLAGS='$(null_AM_MAKEFLAGS)' \ -- && $(MAKE) dvi \ -- && $(install-transform-check) \ -- && $(my-instcheck) \ -- && $(coreutils-path-check) \ -+ cd $(t)/$(distdir) \ -+ && ./configure --quiet --enable-gcc-warnings --disable-nls \ -+ && $(MAKE) CFLAGS='$(warn_cflags)' \ -+ AM_MAKEFLAGS='$(null_AM_MAKEFLAGS)' \ -+ && $(MAKE) dvi \ -+ && $(install-transform-check) \ -+ && $(my-instcheck) \ -+ && $(coreutils-path-check) \ - && $(MAKE) distclean - (cd $(t) && mv $(distdir) $(distdir).old \ - && $(AMTAR) -zxf - ) < $(distdir).tar.gz - diff -ur $(t)/$(distdir).old $(t)/$(distdir) - -rm -rf $(t) -+ rmdir $(tmpdir)/$(PACKAGE) $(tmpdir) - @echo 
"========================"; \ - echo "$(distdir).tar.gz is ready for distribution"; \ - echo "========================" diff --git a/coreutils-getaddrinfo.diff b/coreutils-getaddrinfo.diff deleted file mode 100644 index 39a0f38..0000000 --- a/coreutils-getaddrinfo.diff +++ /dev/null @@ -1,16 +0,0 @@ -Index: coreutils-6.9.90/gnulib-tests/test-getaddrinfo.c -================================================================================ ---- coreutils-7.1/gnulib-tests/test-getaddrinfo.c -+++ coreutils-7.1/gnulib-tests/test-getaddrinfo.c -@@ -71,10 +71,7 @@ int simple (char *host, char *service) - the test merely because someone is down the country on their - in-law's farm. */ - if (res == EAI_AGAIN) -- { -- fprintf (stderr, "skipping getaddrinfo test: no network?\n"); -- return 77; -- } -+ return 0; - /* IRIX reports EAI_NONAME for "https". Don't fail the test - merely because of this. */ - if (res == EAI_NONAME) diff --git a/coreutils-getaddrinfo.patch b/coreutils-getaddrinfo.patch new file mode 100644 index 0000000..d5b0720 --- /dev/null +++ b/coreutils-getaddrinfo.patch @@ -0,0 +1,17 @@ +Index: gnulib-tests/test-getaddrinfo.c +=================================================================== +--- gnulib-tests/test-getaddrinfo.c.orig 2010-03-13 16:21:08.000000000 +0100 ++++ gnulib-tests/test-getaddrinfo.c 2010-05-05 14:51:40.343025353 +0200 +@@ -88,11 +88,7 @@ simple (char const *host, char const *se + the test merely because someone is down the country on their + in-law's farm. */ + if (res == EAI_AGAIN) +- { +- skip++; +- fprintf (stderr, "skipping getaddrinfo test: no network?\n"); +- return 77; +- } ++ return 0; + /* IRIX reports EAI_NONAME for "https". Don't fail the test + merely because of this. 
*/ + if (res == EAI_NONAME) diff --git a/coreutils-gl_printf_safe.patch b/coreutils-gl_printf_safe.patch new file mode 100644 index 0000000..ed5cef0 --- /dev/null +++ b/coreutils-gl_printf_safe.patch @@ -0,0 +1,24 @@ +Index: configure +=================================================================== +--- configure.orig 2010-04-23 18:06:40.000000000 +0200 ++++ configure 2010-05-05 13:40:11.419859163 +0200 +@@ -3340,7 +3340,6 @@ as_fn_append ac_func_list " alarm" + as_fn_append ac_header_list " sys/statvfs.h" + as_fn_append ac_header_list " sys/select.h" + as_fn_append ac_func_list " nl_langinfo" +-gl_printf_safe=yes + as_fn_append ac_header_list " utmp.h" + as_fn_append ac_header_list " utmpx.h" + as_fn_append ac_func_list " utmpname" +Index: m4/gnulib-comp.m4 +=================================================================== +--- m4/gnulib-comp.m4.orig 2010-04-21 20:12:06.000000000 +0200 ++++ m4/gnulib-comp.m4 2010-05-05 13:40:58.875859176 +0200 +@@ -1158,7 +1158,6 @@ AC_DEFUN([gl_INIT], + # Code from module printf-frexpl: + gl_FUNC_PRINTF_FREXPL + # Code from module printf-safe: +- m4_divert_text([INIT_PREPARE], [gl_printf_safe=yes]) + # Code from module priv-set: + gl_PRIV_SET + # Code from module progname: diff --git a/coreutils-i18n-infloop.patch b/coreutils-i18n-infloop.patch new file mode 100644 index 0000000..ede0365 --- /dev/null +++ b/coreutils-i18n-infloop.patch @@ -0,0 +1,14 @@ +Index: src/sort.c +=================================================================== +--- src/sort.c.orig 2010-05-07 16:52:08.068491875 +0200 ++++ src/sort.c 2010-05-07 16:53:44.704992155 +0200 +@@ -2720,7 +2720,8 @@ keycompare_mb (const struct line *a, con + if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ + STATE = state_bak; \ + if (!ignore) \ +- COPY[NEW_LEN++] = TEXT[i++]; \ ++ COPY[NEW_LEN++] = TEXT[i]; \ ++ i++; \ + continue; \ + } \ + \ diff --git a/i18n-limfield.diff b/coreutils-i18n-limfield.patch similarity index 85% rename from i18n-limfield.diff rename 
to coreutils-i18n-limfield.patch index b27c3c9..11d0832 100644 --- a/i18n-limfield.diff +++ b/coreutils-i18n-limfield.patch @@ -1,8 +1,8 @@ Index: src/sort.c =================================================================== ---- src/sort.c.orig 2010-05-04 17:29:12.419359202 +0200 -+++ src/sort.c 2010-05-04 17:29:12.479359419 +0200 -@@ -1731,7 +1731,7 @@ limfield_mb (const struct line *line, co +--- src/sort.c.orig 2010-05-05 16:22:15.815859271 +0200 ++++ src/sort.c 2010-05-05 16:22:15.875859173 +0200 +@@ -1845,7 +1845,7 @@ limfield_mb (const struct line *line, co GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ptr += mblength; } @@ -11,7 +11,7 @@ Index: src/sort.c { GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ptr += mblength; -@@ -1742,11 +1742,6 @@ limfield_mb (const struct line *line, co +@@ -1856,11 +1856,6 @@ limfield_mb (const struct line *line, co { while (ptr < lim && ismbblank (ptr, &mblength)) ptr += mblength; @@ -23,7 +23,7 @@ Index: src/sort.c while (ptr < lim && !ismbblank (ptr, &mblength)) ptr += mblength; } -@@ -1756,20 +1751,19 @@ limfield_mb (const struct line *line, co +@@ -1870,20 +1865,19 @@ limfield_mb (const struct line *line, co /* Make LIM point to the end of (one byte past) the current field. */ if (tab != NULL) { @@ -56,7 +56,7 @@ Index: src/sort.c } else { -@@ -1778,24 +1772,20 @@ limfield_mb (const struct line *line, co +@@ -1892,24 +1886,20 @@ limfield_mb (const struct line *line, co while (newlim < lim && ismbblank (newlim, &mblength)) newlim += mblength; @@ -86,7 +86,7 @@ Index: src/sort.c /* Advance PTR by ECHAR (if possible), but no further than LIM. 
*/ for (i = 0; i < echar; i++) -@@ -1803,9 +1793,9 @@ limfield_mb (const struct line *line, co +@@ -1917,9 +1907,9 @@ limfield_mb (const struct line *line, co GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); if (ptr + mblength > lim) diff --git a/i18n-monthsort.diff b/coreutils-i18n-monthsort.patch similarity index 66% rename from i18n-monthsort.diff rename to coreutils-i18n-monthsort.patch index 58bf214..345f565 100644 --- a/i18n-monthsort.diff +++ b/coreutils-i18n-monthsort.patch @@ -1,8 +1,8 @@ Index: src/sort.c =================================================================== ---- src/sort.c.orig 2010-05-04 17:28:43.820359291 +0200 -+++ src/sort.c 2010-05-04 17:30:44.507859357 +0200 -@@ -1285,7 +1285,7 @@ inittables_mb (void) +--- src/sort.c.orig 2010-05-05 16:22:15.487859132 +0200 ++++ src/sort.c 2010-05-05 16:23:20.267859249 +0200 +@@ -1402,7 +1402,7 @@ inittables_mb (void) else { j += mblength; diff --git a/i18n-random.diff b/coreutils-i18n-random.patch similarity index 70% rename from i18n-random.diff rename to coreutils-i18n-random.patch index 566e2de..93bfabb 100644 --- a/i18n-random.diff +++ b/coreutils-i18n-random.patch @@ -1,8 +1,8 @@ Index: src/sort.c =================================================================== ---- src/sort.c.orig 2010-05-04 17:29:12.395359111 +0200 -+++ src/sort.c 2010-05-04 17:29:59.979859336 +0200 -@@ -2494,7 +2494,10 @@ keycompare_mb (const struct line *a, con +--- src/sort.c.orig 2010-05-06 15:16:27.475859128 +0200 ++++ src/sort.c 2010-05-06 15:16:53.899859247 +0200 +@@ -2712,7 +2712,10 @@ keycompare_mb (const struct line *a, con size_t lenb = limb <= textb ? 0 : limb - textb; /* Actually compare the fields. 
*/ diff --git a/coreutils-i18n-uninit.patch b/coreutils-i18n-uninit.patch new file mode 100644 index 0000000..c3b8ebc --- /dev/null +++ b/coreutils-i18n-uninit.patch @@ -0,0 +1,16 @@ +Index: src/cut.c +=================================================================== +--- src/cut.c.orig 2010-05-06 15:16:26.851859241 +0200 ++++ src/cut.c 2010-05-06 15:16:27.095859170 +0200 +@@ -878,7 +878,10 @@ cut_fields_mb (FILE *stream) + c = getc (stream); + empty_input = (c == EOF); + if (c != EOF) +- ungetc (c, stream); ++ { ++ ungetc (c, stream); ++ wc = 0; ++ } + else + wc = WEOF; + diff --git a/coreutils-invalid-ids.patch b/coreutils-invalid-ids.patch new file mode 100644 index 0000000..a7cdbb1 --- /dev/null +++ b/coreutils-invalid-ids.patch @@ -0,0 +1,26 @@ +While uid_t and gid_t are both unsigned, the values (uid_t) -1 and +(gid_t) -1 are reserved. A uid or gid argument of -1 to the chown(2) +system call means to leave the uid/gid unchanged. Catch this case +so that trying to set a uid or gid to -1 will result in an error. + +Test cases: + + chown 4294967295 file + chown :4294967295 file + chgrp 4294967295 file + +Andreas Gruenbacher + +Index: src/chgrp.c +=================================================================== +--- src/chgrp.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/chgrp.c 2010-05-05 14:03:28.279359192 +0200 +@@ -89,7 +89,7 @@ parse_group (const char *name) + { + unsigned long int tmp; + if (! 
(xstrtoul (name, NULL, 10, &tmp, "") == LONGINT_OK +- && tmp <= GID_T_MAX)) ++ && tmp <= GID_T_MAX && (gid_t) tmp != (gid_t) -1)) + error (EXIT_FAILURE, 0, _("invalid group: %s"), quote (name)); + gid = tmp; + } diff --git a/coreutils-no_hostname_and_hostid.patch b/coreutils-no_hostname_and_hostid.patch new file mode 100644 index 0000000..b3657e0 --- /dev/null +++ b/coreutils-no_hostname_and_hostid.patch @@ -0,0 +1,122 @@ +Index: doc/coreutils.texi +=================================================================== +--- doc/coreutils.texi.orig 2010-05-06 15:17:48.132359317 +0200 ++++ doc/coreutils.texi 2010-05-06 15:21:02.631693747 +0200 +@@ -65,8 +65,6 @@ + * fold: (coreutils)fold invocation. Wrap long input lines. + * groups: (coreutils)groups invocation. Print group names a user is in. + * head: (coreutils)head invocation. Output the first part of files. +-* hostid: (coreutils)hostid invocation. Print numeric host identifier. +-* hostname: (coreutils)hostname invocation. Print or set system name. + * id: (coreutils)id invocation. Print user identity. + * install: (coreutils)install invocation. Copy and change attributes. + * join: (coreutils)join invocation. Join lines on a common field. +@@ -197,7 +195,7 @@ Free Documentation License''. 
+ * File name manipulation:: dirname basename pathchk mktemp + * Working context:: pwd stty printenv tty + * User information:: id logname whoami groups users who +-* System context:: date arch nproc uname hostname hostid uptime ++* System context:: date arch nproc uname uptime + * SELinux context:: chcon runcon + * Modified command invocation:: chroot env nice nohup stdbuf su timeout + * Process control:: kill +@@ -413,8 +411,6 @@ System context + * date invocation:: Print or set system date and time + * nproc invocation:: Print the number of processors + * uname invocation:: Print system information +-* hostname invocation:: Print or set system name +-* hostid invocation:: Print numeric host identifier + * uptime invocation:: Print system uptime and load + + @command{date}: Print or set system date and time +@@ -13449,8 +13445,6 @@ information. + * arch invocation:: Print machine hardware name. + * nproc invocation:: Print the number of processors. + * uname invocation:: Print system information. +-* hostname invocation:: Print or set system name. +-* hostid invocation:: Print numeric host identifier. + * uptime invocation:: Print system uptime and load. + @end menu + +@@ -14272,55 +14266,6 @@ Print the kernel version. + + @exitstatus + +- +-@node hostname invocation +-@section @command{hostname}: Print or set system name +- +-@pindex hostname +-@cindex setting the hostname +-@cindex printing the hostname +-@cindex system name, printing +-@cindex appropriate privileges +- +-With no arguments, @command{hostname} prints the name of the current host +-system. With one argument, it sets the current host name to the +-specified string. You must have appropriate privileges to set the host +-name. Synopsis: +- +-@example +-hostname [@var{name}] +-@end example +- +-The only options are @option{--help} and @option{--version}. @xref{Common +-options}. 
+- +-@exitstatus +- +- +-@node hostid invocation +-@section @command{hostid}: Print numeric host identifier +- +-@pindex hostid +-@cindex printing the host identifier +- +-@command{hostid} prints the numeric identifier of the current host +-in hexadecimal. This command accepts no arguments. +-The only options are @option{--help} and @option{--version}. +-@xref{Common options}. +- +-For example, here's what it prints on one system I use: +- +-@example +-$ hostid +-1bac013d +-@end example +- +-On that system, the 32-bit quantity happens to be closely +-related to the system's Internet address, but that isn't always +-the case. +- +-@exitstatus +- + @node uptime invocation + @section @command{uptime}: Print system uptime and load + +Index: man/Makefile.am +=================================================================== +--- man/Makefile.am.orig 2010-05-06 15:17:48.136359276 +0200 ++++ man/Makefile.am 2010-05-06 15:18:44.844359168 +0200 +@@ -197,7 +197,7 @@ check-x-vs-1: + @PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ + t=$@-t; \ + (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ +- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ ++ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ + | tr -s ' ' '\n' | sed 's/\.1$$//') \ + | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ + rm $$t +Index: man/Makefile.in +=================================================================== +--- man/Makefile.in.orig 2010-05-06 15:17:48.136359276 +0200 ++++ man/Makefile.in 2010-05-06 15:18:44.875852631 +0200 +@@ -1574,7 +1574,7 @@ check-x-vs-1: + @PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ + t=$@-t; \ + (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ +- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ ++ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ + | tr -s ' ' '\n' | sed 's/\.1$$//') \ + | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ + rm $$t diff --git a/coreutils-sysinfo.diff 
b/coreutils-sysinfo.patch similarity index 86% rename from coreutils-sysinfo.diff rename to coreutils-sysinfo.patch index 3096103..4e5b9c4 100644 --- a/coreutils-sysinfo.diff +++ b/coreutils-sysinfo.patch @@ -1,10 +1,10 @@ Index: src/uname.c =================================================================== ---- src/uname.c.orig 2010-05-04 17:27:48.679359310 +0200 -+++ src/uname.c 2010-05-04 17:29:03.011859260 +0200 +--- src/uname.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/uname.c 2010-05-05 13:58:03.471359120 +0200 @@ -339,6 +339,36 @@ main (int argc, char **argv) # endif - } + } #endif + if (element == unknown) + { @@ -37,11 +37,11 @@ Index: src/uname.c +#endif + } if (! (toprint == UINT_MAX && element == unknown)) - print_element (element); + print_element (element); } @@ -364,6 +394,18 @@ main (int argc, char **argv) - element = hardware_platform; - } + element = hardware_platform; + } #endif + if (element == unknown) + { @@ -56,5 +56,5 @@ Index: src/uname.c + element = hardware_platform; + } if (! (toprint == UINT_MAX && element == unknown)) - print_element (element); + print_element (element); } diff --git a/coreutils.changes b/coreutils.changes index cfcabee..59c33d3 100644 --- a/coreutils.changes +++ b/coreutils.changes @@ -1,8 +1,55 @@ ------------------------------------------------------------------- -Tue Jun 29 20:18:04 CEST 2010 - pth@suse.de +Fri May 7 15:44:53 UTC 2010 - pth@novell.com -- Fix 'sort -V' not working because the i18n (mb handling) patch - wasn't updated to handle the new option (bnc#615073). +- Update to 8.5: + Bug fixes + * cp and mv once again support preserving extended attributes. + * cp now preserves "capabilities" when also preserving file ownership. + * ls --color once again honors the 'NORMAL' dircolors directive. + [bug introduced in coreutils-6.11] + * sort -M now handles abbreviated months that are aligned using + blanks in the locale database.
Also locales with 8 bit characters + are handled correctly, including multi byte locales with the caveat + that multi byte characters are matched case sensitively. + * sort again handles obsolescent key formats (+POS -POS) correctly. + Previously if -POS was specified, 1 field too many was used in the + sort. [bug introduced in coreutils-7.2] + + New features + + * join now accepts the --header option, to treat the first line of + each file as a header line to be joined and printed + unconditionally. + + * timeout now accepts the --kill-after option which sends a kill + signal to the monitored command if it's still running the specified + duration after the initial signal was sent. + + * who: the "+/-" --mesg (-T) indicator of whether a user/tty is + accepting messages could be incorrectly listed as "+", when in + fact, the user was not accepting messages (mesg no). Before, who + would examine only the permission bits, and not consider the group + of the TTY device file. Thus, if a login tty's group would change + somehow e.g., to "root", that would make it unwritable (via + write(1)) by normal users, in spite of whatever the permission bits + might imply. Now, when configured using the + --with-tty-group[=NAME] option, who also compares the group of the + TTY device with NAME (or "tty" if no group name is specified). + + Changes in behavior + + * ls --color no longer emits the final 3-byte color-resetting escape + sequence when it would be a no-op. + + * join -t '' no longer emits an error and instead operates on each + line as a whole (even if they contain NUL characters). + + For other changes since 7.1 see NEWS. +- Split-up coreutils-%%{version}.diff as far as possible. +- Prefix all patches with coreutils-. +- All patches have the .patch suffix. +- Use the i18n patch from Archlinux as it fixes at least one test + suite failure. 
------------------------------------------------------------------- Tue May 4 17:13:37 UTC 2010 - pth@novell.com diff --git a/coreutils.spec b/coreutils.spec index f3a1de5..35c3da7 100644 --- a/coreutils.spec +++ b/coreutils.spec @@ -1,5 +1,5 @@ # -# spec file for package coreutils (Version 7.1) +# spec file for package coreutils (Version 8.5) # # Copyright (c) 2010 SUSE LINUX Products GmbH, Nuernberg, Germany. # @@ -23,10 +23,10 @@ BuildRequires: help2man libacl-devel libcap-devel libselinux-devel pam-devel xz Url: http://www.gnu.org/software/coreutils/ License: GFDLv1.2 ; GPLv2+ ; GPLv3+ Group: System/Base -Version: 7.1 -Release: 6 -Provides: fileutils sh-utils stat textutils mktemp -Obsoletes: fileutils sh-utils stat textutils mktemp +Version: 8.5 +Release: 1 +Provides: fileutils = %{version}, sh-utils = %{version}, stat = %{version}, textutils = %{version}, mktemp = %{version} +Obsoletes: fileutils < %{version}, sh-utils < %{version}, stat < %{version}, textutils < %{version}, mktemp < %{version} Obsoletes: libselinux <= 1.23.11-3 libselinux-32bit = 9 libselinux-64bit = 9 libselinux-x86 = 9 AutoReqProv: on PreReq: %{install_info_prereq} @@ -35,22 +35,19 @@ Source: coreutils-%{version}.tar.xz Source1: su.pamd Source2: su.default Source3: baselibs.conf -Patch: coreutils-%{version}.diff -Patch4: coreutils-5.3.0-i18n-0.1.patch -Patch5: i18n-uninit.diff -Patch6: i18n-infloop.diff -Patch8: coreutils-sysinfo.diff -Patch11: i18n-monthsort.diff -Patch12: i18n-random.diff -Patch16: invalid-ids.diff -Patch17: i18n-limfield.diff -Patch20: coreutils-6.8-su.diff -Patch21: coreutils-6.8.0-pie.diff -Patch22: coreutils-5.3.0-sbin4su.diff -Patch23: coreutils-getaddrinfo.diff -Patch25: coreutils-cifs-afs.diff +Patch0: coreutils-%{version}.patch +Patch1: coreutils-no_hostname_and_hostid.patch +Patch2: coreutils-gl_printf_safe.patch +Patch4: coreutils-8.5-i18n.patch +Patch5: coreutils-i18n-uninit.patch +Patch6: coreutils-i18n-infloop.patch +Patch8: coreutils-sysinfo.patch +Patch16:
coreutils-invalid-ids.patch +Patch20: coreutils-6.8-su.patch +Patch21: coreutils-6.8.0-pie.patch +Patch22: coreutils-5.3.0-sbin4su.patch +Patch23: coreutils-getaddrinfo.patch Patch26: coreutils-add_ogv.patch -Patch27: coreutils-fix_distcheck.patch BuildRoot: %{_tmppath}/%{name}-%{version}-build %description @@ -107,48 +104,44 @@ Authors: %lang_package %prep %setup -q -%patch4 -p1 +%patch4 %patch5 %patch6 -%patch +%patch0 +%patch1 +%patch2 %patch8 -%patch11 -%patch12 %patch16 -%patch17 %patch20 %patch21 %patch22 -%patch23 -p1 -%patch25 +%patch23 %patch26 -%patch27 %build -#AUTOPOINT=true autoreconf -fi -./configure CFLAGS="$RPM_OPT_FLAGS -Wall" \ - --prefix=%{_prefix} --mandir=%{_mandir} \ - --infodir=%{_infodir} --without-included-regex \ +AUTOPOINT=true autoreconf -fi +export CFLAGS="%optflags -Wall" +%configure --without-included-regex \ --enable-install-program=arch,su \ gl_cv_func_printf_directive_n=yes \ gl_cv_func_isnanl_works=yes \ DEFAULT_POSIX2_VERSION=199209 -make %{?jobs:-j%jobs} PAMLIBS="-lpam -ldl" +make %{?jobs:-j%jobs} PAMLIBS="-lpam -ldl" V=1 %check if test $EUID -eq 0; then - su nobody -c make %{?jobs:-j%jobs} check VERBOSE=yes - make %{?jobs:-j%jobs} check-root VERBOSE=yes + su nobody -c make %{?jobs:-j%jobs} check VERBOSE=yes V=1 + make %{?jobs:-j%jobs} check-root VERBOSE=yes V=1 else %ifarch %arm - make -k %{?jobs:-j%jobs} check VERBOSE=yes || echo make check failed + make -k %{?jobs:-j%jobs} check VERBOSE=yes V=1 || echo make check failed %else - make %{?jobs:-j%jobs} check VERBOSE=yes + make %{?jobs:-j%jobs} check VERBOSE=yes V=1 %endif fi %install -make DESTDIR="$RPM_BUILD_ROOT" install +%makeinstall test -f $RPM_BUILD_ROOT%{_bindir}/su || \ install src/su $RPM_BUILD_ROOT%{_bindir}/su install -d $RPM_BUILD_ROOT/bin @@ -182,6 +175,7 @@ rm -rf $RPM_BUILD_ROOT %config /etc/pam.d/su-l %config(noreplace) /etc/default/su %{_bindir}/* +%{_libdir}/%{name} %doc %{_infodir}/coreutils.info*.gz %doc %{_mandir}/man1/*.1.gz %dir 
%{_prefix}/share/locale/*/LC_TIME diff --git a/i18n-infloop.diff b/i18n-infloop.diff deleted file mode 100644 index dbfcc29..0000000 --- a/i18n-infloop.diff +++ /dev/null @@ -1,14 +0,0 @@ -Index: src/sort.c -=================================================================== ---- src/sort.c.orig 2010-05-04 17:27:49.103359264 +0200 -+++ src/sort.c 2010-05-04 17:28:43.820359291 +0200 -@@ -2540,7 +2540,8 @@ keycompare_mb (const struct line *a, con - if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ - STATE = state_bak; \ - if (!ignore) \ -- COPY[NEW_LEN++] = TEXT[i++]; \ -+ COPY[NEW_LEN++] = TEXT[i]; \ -+ i++; \ - continue; \ - } \ - \ diff --git a/i18n-uninit.diff b/i18n-uninit.diff deleted file mode 100644 index 8952a0d..0000000 --- a/i18n-uninit.diff +++ /dev/null @@ -1,29 +0,0 @@ -Index: src/cut.c -=================================================================== ---- src/cut.c.orig 2010-05-04 17:27:29.879859350 +0200 -+++ src/cut.c 2010-05-04 17:27:30.131859395 +0200 -@@ -878,7 +878,10 @@ cut_fields_mb (FILE *stream) - c = getc (stream); - empty_input = (c == EOF); - if (c != EOF) -- ungetc (c, stream); -+ { -+ ungetc (c, stream); -+ wc = 0; -+ } - else - wc = WEOF; - -Index: src/expand.c -=================================================================== ---- src/expand.c.orig 2010-05-04 17:27:29.915859239 +0200 -+++ src/expand.c 2010-05-04 17:27:30.155859324 +0200 -@@ -404,7 +404,7 @@ expand_multibyte (void) - for (;;) - { - /* Input character, or EOF. */ -- wint_t wc; -+ wint_t wc = 0; - - /* If true, perform translations. */ - bool convert = true; diff --git a/invalid-ids.diff b/invalid-ids.diff deleted file mode 100644 index 35f435c..0000000 --- a/invalid-ids.diff +++ /dev/null @@ -1,49 +0,0 @@ -While uid_t and gid_t are both unsigned, the values (uid_t) -1 and -(gid_t) -1 are reserved. A uid or gid argument of -1 to the chown(2) -system call means to leave the uid/gid unchanged. 
Catch this case -so that trying to set a uid or gid to -1 will result in an error. - -Test cases: - - chown 4294967295 file - chown :4294967295 file - chgrp 4294967295 file - -Andreas Gruenbacher - -Index: lib/userspec.c -=================================================================== ---- lib/userspec.c.orig 2010-05-04 17:27:48.479359439 +0200 -+++ lib/userspec.c 2010-05-04 17:29:12.439359267 +0200 -@@ -169,7 +169,7 @@ parse_with_separator (char const *spec, - { - unsigned long int tmp; - if (xstrtoul (u, NULL, 10, &tmp, "") == LONGINT_OK -- && tmp <= MAXUID) -+ && tmp <= MAXUID && tmp != (uid_t) -1) - unum = tmp; - else - error_msg = E_invalid_user; -@@ -200,7 +200,8 @@ parse_with_separator (char const *spec, - if (grp == NULL) - { - unsigned long int tmp; -- if (xstrtoul (g, NULL, 10, &tmp, "") == LONGINT_OK && tmp <= MAXGID) -+ if (xstrtoul (g, NULL, 10, &tmp, "") == LONGINT_OK && tmp <= MAXGID -+ && tmp != (gid_t) -1) - gnum = tmp; - else - error_msg = E_invalid_group; -Index: src/chgrp.c -=================================================================== ---- src/chgrp.c.orig 2010-05-04 17:27:48.479359439 +0200 -+++ src/chgrp.c 2010-05-04 17:29:12.443359269 +0200 -@@ -89,7 +89,7 @@ parse_group (const char *name) - { - unsigned long int tmp; - if (! (xstrtoul (name, NULL, 10, &tmp, "") == LONGINT_OK -- && tmp <= GID_T_MAX)) -+ && tmp <= GID_T_MAX && tmp != (gid_t) -1)) - error (EXIT_FAILURE, 0, _("invalid group: %s"), quote (name)); - gid = tmp; - } From 8ba42a9e885fe200a50967d902e894d33e1fc000791645983226f2647a254a79 Mon Sep 17 00:00:00 2001 From: Thorsten Kukuk Date: Fri, 18 Jun 2010 10:45:27 +0000 Subject: [PATCH 2/7] - Last part of fix for [bnc#533249]: Don't run account part of PAM stack for su as root. Requires pam > 1.1.1. 
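
PAM evaluates the entries of one type in file order, and a successful "sufficient" module ends that phase early, so the pam_rootok.so entry only has the intended effect if it comes before the common-account include. A minimal sketch of a check for that ordering follows; the helper name account_rootok_first and the temporary sample file are made up for illustration and are not part of the package.

```shell
#!/bin/sh
# Hypothetical helper, not part of the package: succeed only if the first
# "account" line in a PAM service file is "account sufficient pam_rootok.so".
account_rootok_first() {
    awk '$1 == "account" {
             ok = ($2 == "sufficient" && $3 == "pam_rootok.so")
             exit
         }
         END { exit (ok ? 0 : 1) }' "$1"
}

# Sample input mirroring su.pamd as it looks with this patch applied.
pamd=$(mktemp)
cat > "$pamd" <<'EOF'
#%PAM-1.0
auth     sufficient     pam_rootok.so
auth     include        common-auth
account  sufficient     pam_rootok.so
account  include        common-account
password include        common-password
session  include        common-session
EOF

if account_rootok_first "$pamd"; then
    echo "account: pam_rootok.so is evaluated first"
else
    echo "account: pam_rootok.so is missing or out of order"
fi
rm -f "$pamd"
```

The check looks only at the first account line on purpose: a pam_rootok.so entry placed after the include would be reached only once common-account had already run.
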
OBS-URL: https://build.opensuse.org/package/show/Base:System/coreutils?expand=0&rev=10 --- coreutils.changes | 6 ++++++ coreutils.spec | 1 + su.pamd | 1 + 3 files changed, 8 insertions(+) diff --git a/coreutils.changes b/coreutils.changes index 59c33d3..26139e2 100644 --- a/coreutils.changes +++ b/coreutils.changes @@ -1,3 +1,9 @@ +------------------------------------------------------------------- +Fri Jun 18 11:57:47 CEST 2010 - kukuk@suse.de + +- Last part of fix for [bnc#533249]: Don't run account part of + PAM stack for su as root. Requires pam > 1.1.1. + ------------------------------------------------------------------- Fri May 7 15:44:53 UTC 2010 - pth@novell.com diff --git a/coreutils.spec b/coreutils.spec index 35c3da7..7a646c3 100644 --- a/coreutils.spec +++ b/coreutils.spec @@ -31,6 +31,7 @@ Obsoletes: libselinux <= 1.23.11-3 libselinux-32bit = 9 libselinux-64bit = AutoReqProv: on PreReq: %{install_info_prereq} Requires: %{name}-lang = %version +Requires: pam >= 1.1.1.90 Source: coreutils-%{version}.tar.xz Source1: su.pamd Source2: su.default diff --git a/su.pamd b/su.pamd index b729046..88ddbaf 100644 --- a/su.pamd +++ b/su.pamd @@ -1,6 +1,7 @@ #%PAM-1.0 auth sufficient pam_rootok.so auth include common-auth +account sufficient pam_rootok.so account include common-account password include common-password session include common-session From c50a85acddf4e3d73415c5c4c50e2b9c1ecd9456eadc9c89db8c3250199f3ff0 Mon Sep 17 00:00:00 2001 From: Philipp Thomas Date: Mon, 28 Jun 2010 10:54:49 +0000 Subject: [PATCH 3/7] - Fix typo in spec file (% missing from version). 
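
This class of typo can be spotted by grepping the spec for references to the version macro that lack the "%" or the opening "{". The following is only a sketch: the function name check_version_macros and the sample lines are illustrative assumptions, and the pattern covers nothing but the version macro.

```shell
#!/bin/sh
# Hypothetical helper: print spec lines holding a malformed reference to the
# "version" macro, i.e. "{version}" without a leading "%", or "%version}"
# without the opening "{". Well-formed %{version} and %version are ignored.
check_version_macros() {
    grep -nE '(^|[^%])\{version\}|%version\}' "$1"
}

# Illustrative input, modelled on the Provides line touched by this patch.
spec=$(mktemp)
cat > "$spec" <<'EOF'
Provides: fileutils = %{version}
Provides: sh-utils = {version}
Provides: stat = %version}
Provides: textutils = %version
EOF

# Reports the second and third sample lines, prefixed with their line numbers.
check_version_macros "$spec"
rm -f "$spec"
```

Run against the Provides and Obsoletes lines as added by this patch, the same pattern would still flag the "stat = %version}" and "stat < %version}" entries, which lack the opening brace.
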
OBS-URL: https://build.opensuse.org/package/show/Base:System/coreutils?expand=0&rev=11 --- coreutils.changes | 5 +++++ coreutils.spec | 4 ++-- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/coreutils.changes b/coreutils.changes index 26139e2..1ba9f84 100644 --- a/coreutils.changes +++ b/coreutils.changes @@ -1,3 +1,8 @@ +------------------------------------------------------------------- +Mon Jun 28 12:52:15 CEST 2010 - pth@suse.de + +- Fix typo in spec file (% missing from version). + ------------------------------------------------------------------- Fri Jun 18 11:57:47 CEST 2010 - kukuk@suse.de diff --git a/coreutils.spec b/coreutils.spec index 7a646c3..1f2fc2e 100644 --- a/coreutils.spec +++ b/coreutils.spec @@ -25,8 +25,8 @@ License: GFDLv1.2 ; GPLv2+ ; GPLv3+ Group: System/Base Version: 8.5 Release: 1 -Provides: fileutils = %{version}, sh-utils = {version}, stat = %version}, textutils = %{version}, mktemp = %{version} -Obsoletes: fileutils < %{version}, sh-utils < {version}, stat < %version}, textutils < %{version}, mktemp < %{version} +Provides: fileutils = %{version}, sh-utils = %{version}, stat = %version}, textutils = %{version}, mktemp = %{version} +Obsoletes: fileutils < %{version}, sh-utils < %{version}, stat < %version}, textutils < %{version}, mktemp < %{version} Obsoletes: libselinux <= 1.23.11-3 libselinux-32bit = 9 libselinux-64bit = 9 libselinux-x86 = 9 AutoReqProv: on PreReq: %{install_info_prereq} From 5ae8424cb77b7e57f9ff07d1f5beab5c849ecd5857eb1eb68785b3db4884295d Mon Sep 17 00:00:00 2001 From: Philipp Thomas Date: Fri, 2 Jul 2010 09:36:06 +0000 Subject: [PATCH 4/7] Accepting request 42399 from home:jengelh:smp Copy from home:jengelh:smp/coreutils via accept of submit request 42399 revision 2. Request was accepted with message: reviewed ok. 
OBS-URL: https://build.opensuse.org/request/show/42399 OBS-URL: https://build.opensuse.org/package/show/Base:System/coreutils?expand=0&rev=12 --- coreutils.changes | 5 +++++ coreutils.spec | 10 +++++----- 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/coreutils.changes b/coreutils.changes index 1ba9f84..9d4179a 100644 --- a/coreutils.changes +++ b/coreutils.changes @@ -1,3 +1,8 @@ +------------------------------------------------------------------- +Thu Jul 1 21:23:40 UTC 2010 - jengelh@medozas.de + +- Use %_smp_mflags + ------------------------------------------------------------------- Mon Jun 28 12:52:15 CEST 2010 - pth@suse.de diff --git a/coreutils.spec b/coreutils.spec index 1f2fc2e..53e4a44 100644 --- a/coreutils.spec +++ b/coreutils.spec @@ -127,17 +127,17 @@ export CFLAGS="%optflags -Wall" gl_cv_func_printf_directive_n=yes \ gl_cv_func_isnanl_works=yes \ DEFAULT_POSIX2_VERSION=199209 -make %{?jobs:-j%jobs} PAMLIBS="-lpam -ldl" V=1 +make %{?_smp_mflags} PAMLIBS="-lpam -ldl" V=1 %check if test $EUID -eq 0; then - su nobody -c make %{?jobs:-j%jobs} check VERBOSE=yes V=1 - make %{?jobs:-j%jobs} check-root VERBOSE=yes V=1 + su nobody -c make %{?_smp_mflags} check VERBOSE=yes V=1 + make %{?_smp_mflags} check-root VERBOSE=yes V=1 else %ifarch %arm - make -k %{?jobs:-j%jobs} check VERBOSE=yes V=1 || echo make check failed + make -k %{?_smp_mflags} check VERBOSE=yes V=1 || echo make check failed %else - make %{?jobs:-j%jobs} check VERBOSE=yes V=1 + make %{?_smp_mflags} check VERBOSE=yes V=1 %endif fi From e5d1a797ba5efe3760dc03067a572876b98db3cb0b8638c183814d40319d8ca7 Mon Sep 17 00:00:00 2001 From: Ruediger Oertel Date: Wed, 14 Jul 2010 13:13:42 +0000 Subject: [PATCH 5/7] OBS-URL: https://build.opensuse.org/package/show/Base:System/coreutils?expand=0&rev=14 --- coreutils-i18n-limfield.patch | 100 --------------------------------- coreutils-i18n-monthsort.patch | 13 ----- coreutils-i18n-random.patch | 16 ------ coreutils.changes | 6 ++ 4 files 
changed, 6 insertions(+), 129 deletions(-) delete mode 100644 coreutils-i18n-limfield.patch delete mode 100644 coreutils-i18n-monthsort.patch delete mode 100644 coreutils-i18n-random.patch diff --git a/coreutils-i18n-limfield.patch b/coreutils-i18n-limfield.patch deleted file mode 100644 index 11d0832..0000000 --- a/coreutils-i18n-limfield.patch +++ /dev/null @@ -1,100 +0,0 @@ -Index: src/sort.c -=================================================================== ---- src/sort.c.orig 2010-05-05 16:22:15.815859271 +0200 -+++ src/sort.c 2010-05-05 16:22:15.875859173 +0200 -@@ -1845,7 +1845,7 @@ limfield_mb (const struct line *line, co - GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); - ptr += mblength; - } -- if (ptr < lim) -+ if (ptr < lim && (eword | echar)) - { - GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); - ptr += mblength; -@@ -1856,11 +1856,6 @@ limfield_mb (const struct line *line, co - { - while (ptr < lim && ismbblank (ptr, &mblength)) - ptr += mblength; -- if (ptr < lim) -- { -- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -- ptr += mblength; -- } - while (ptr < lim && !ismbblank (ptr, &mblength)) - ptr += mblength; - } -@@ -1870,20 +1865,19 @@ limfield_mb (const struct line *line, co - /* Make LIM point to the end of (one byte past) the current field. 
*/ - if (tab != NULL) - { -- char *newlim, *p; -+ char *newlim; - -- newlim = NULL; -- for (p = ptr; p < lim;) -- { -- if (memcmp (p, tab, tab_length) == 0) -- { -- newlim = p; -- break; -- } -- -- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -- p += mblength; -- } -+ for (newlim = ptr; newlim < lim;) -+ { -+ if (memcmp (newlim, tab, tab_length) == 0) -+ { -+ lim = newlim; -+ break; -+ } -+ -+ GET_BYTELEN_OF_CHAR (lim, newlim, mblength, state); -+ newlim += mblength; -+ } - } - else - { -@@ -1892,24 +1886,20 @@ limfield_mb (const struct line *line, co - - while (newlim < lim && ismbblank (newlim, &mblength)) - newlim += mblength; -- if (ptr < lim) -- { -- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -- ptr += mblength; -- } - while (newlim < lim && !ismbblank (newlim, &mblength)) -- newlim += mblength; -+ newlim += mblength; - lim = newlim; - } - # endif - -- /* If we're skipping leading blanks, don't start counting characters -- until after skipping past any leading blanks. */ -+ /* If we're ignoring leading blanks when computing the End -+ of the field, don't start counting bytes until after skipping -+ past any leading blanks. */ - if (key->skipeblanks) - while (ptr < lim && ismbblank (ptr, &mblength)) - ptr += mblength; - -- memset (&state, '\0', sizeof(mbstate_t)); -+ memset (&state, '\0', sizeof (mbstate_t)); - - /* Advance PTR by ECHAR (if possible), but no further than LIM. 
*/ - for (i = 0; i < echar; i++) -@@ -1917,9 +1907,9 @@ limfield_mb (const struct line *line, co - GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); - - if (ptr + mblength > lim) -- break; -+ break; - else -- ptr += mblength; -+ ptr += mblength; - } - - return ptr; diff --git a/coreutils-i18n-monthsort.patch b/coreutils-i18n-monthsort.patch deleted file mode 100644 index 345f565..0000000 --- a/coreutils-i18n-monthsort.patch +++ /dev/null @@ -1,13 +0,0 @@ -Index: src/sort.c -=================================================================== ---- src/sort.c.orig 2010-05-05 16:22:15.487859132 +0200 -+++ src/sort.c 2010-05-05 16:23:20.267859249 +0200 -@@ -1402,7 +1402,7 @@ inittables_mb (void) - else - { - j += mblength; -- mblength = wcrtomb (mbc, wc, &state_wc); -+ mblength = wcrtomb (mbc, pwc, &state_wc); - assert (mblength != (size_t) 0 && mblength != (size_t) -1); - } - diff --git a/coreutils-i18n-random.patch b/coreutils-i18n-random.patch deleted file mode 100644 index 93bfabb..0000000 --- a/coreutils-i18n-random.patch +++ /dev/null @@ -1,16 +0,0 @@ -Index: src/sort.c -=================================================================== ---- src/sort.c.orig 2010-05-06 15:16:27.475859128 +0200 -+++ src/sort.c 2010-05-06 15:16:53.899859247 +0200 -@@ -2712,7 +2712,10 @@ keycompare_mb (const struct line *a, con - size_t lenb = limb <= textb ? 0 : limb - textb; - - /* Actually compare the fields. 
*/ -- if (key->numeric | key->general_numeric) -+ -+ if (key->random) -+ diff = compare_random (texta, lena, textb, lenb); -+ else if (key->numeric | key->general_numeric) - { - char savea = *lima, saveb = *limb; - diff --git a/coreutils.changes b/coreutils.changes index 9d4179a..d023ebf 100644 --- a/coreutils.changes +++ b/coreutils.changes @@ -3,6 +3,12 @@ Thu Jul 1 21:23:40 UTC 2010 - jengelh@medozas.de - Use %_smp_mflags +------------------------------------------------------------------- +Tue Jun 29 20:18:04 CEST 2010 - pth@suse.de + +- Fix 'sort -V' not working because the i18n (mb handling) patch + wasn't updated to handle the new option (bnc#615073). + ------------------------------------------------------------------- Mon Jun 28 12:52:15 CEST 2010 - pth@suse.de From 769e06bec70c3e5ac0aa1014dd2b81d20461744cbe1275b9378270d8b6a71178 Mon Sep 17 00:00:00 2001 From: OBS User autobuild Date: Mon, 19 Jul 2010 12:12:47 +0000 Subject: [PATCH 6/7] Accepting request 42907 from Base:System checked in (request 42907) OBS-URL: https://build.opensuse.org/request/show/42907 OBS-URL: https://build.opensuse.org/package/show/Base:System/coreutils?expand=0&rev=15 --- coreutils-5.3.0-i18n-0.1.patch | 4015 ++++++++++++++++ ...n4su.patch => coreutils-5.3.0-sbin4su.diff | 14 +- ...tils-6.8-su.patch => coreutils-6.8-su.diff | 281 +- ....8.0-pie.patch => coreutils-6.8.0-pie.diff | 109 +- coreutils-7.1.diff | 194 + coreutils-7.1.tar.xz | 3 + coreutils-8.5-i18n.patch | 4066 ----------------- coreutils-8.5.patch | 67 - coreutils-8.5.tar.xz | 3 - coreutils-add_ogv.patch | 8 +- coreutils-cifs-afs.diff | 35 + coreutils-fix_distcheck.patch | 80 + coreutils-getaddrinfo.diff | 16 + coreutils-getaddrinfo.patch | 17 - coreutils-gl_printf_safe.patch | 24 - coreutils-i18n-infloop.patch | 14 - coreutils-i18n-uninit.patch | 16 - coreutils-invalid-ids.patch | 26 - coreutils-no_hostname_and_hostid.patch | 122 - ...ls-sysinfo.patch => coreutils-sysinfo.diff | 14 +- coreutils.changes | 69 - 
coreutils.spec | 71 +- i18n-infloop.diff | 14 + i18n-limfield.diff | 100 + i18n-monthsort.diff | 13 + i18n-random.diff | 16 + i18n-uninit.diff | 29 + invalid-ids.diff | 49 + su.pamd | 1 - 29 files changed, 4805 insertions(+), 4681 deletions(-) create mode 100644 coreutils-5.3.0-i18n-0.1.patch rename coreutils-5.3.0-sbin4su.patch => coreutils-5.3.0-sbin4su.diff (90%) rename coreutils-6.8-su.patch => coreutils-6.8-su.diff (78%) rename coreutils-6.8.0-pie.patch => coreutils-6.8.0-pie.diff (76%) create mode 100644 coreutils-7.1.diff create mode 100644 coreutils-7.1.tar.xz delete mode 100644 coreutils-8.5-i18n.patch delete mode 100644 coreutils-8.5.patch delete mode 100644 coreutils-8.5.tar.xz create mode 100644 coreutils-cifs-afs.diff create mode 100644 coreutils-fix_distcheck.patch create mode 100644 coreutils-getaddrinfo.diff delete mode 100644 coreutils-getaddrinfo.patch delete mode 100644 coreutils-gl_printf_safe.patch delete mode 100644 coreutils-i18n-infloop.patch delete mode 100644 coreutils-i18n-uninit.patch delete mode 100644 coreutils-invalid-ids.patch delete mode 100644 coreutils-no_hostname_and_hostid.patch rename coreutils-sysinfo.patch => coreutils-sysinfo.diff (86%) create mode 100644 i18n-infloop.diff create mode 100644 i18n-limfield.diff create mode 100644 i18n-monthsort.diff create mode 100644 i18n-random.diff create mode 100644 i18n-uninit.diff create mode 100644 invalid-ids.diff diff --git a/coreutils-5.3.0-i18n-0.1.patch b/coreutils-5.3.0-i18n-0.1.patch new file mode 100644 index 0000000..b07d63d --- /dev/null +++ b/coreutils-5.3.0-i18n-0.1.patch @@ -0,0 +1,4015 @@ +Index: lib/linebuffer.h +=================================================================== +--- coreutils-7.1/lib/linebuffer.h.orig 2008-09-18 09:08:01.000000000 +0200 ++++ coreutils-7.1/lib/linebuffer.h 2010-06-29 18:49:31.855522069 +0200 +@@ -21,6 +21,11 @@ + + # include + ++/* Get mbstate_t. 
*/ ++# if HAVE_WCHAR_H ++# include ++# endif ++ + /* A `struct linebuffer' holds a line of text. */ + + struct linebuffer +@@ -28,6 +33,9 @@ struct linebuffer + size_t size; /* Allocated. */ + size_t length; /* Used. */ + char *buffer; ++# if HAVE_WCHAR_H ++ mbstate_t state; ++# endif + }; + + /* Initialize linebuffer LINEBUFFER for use. */ +Index: src/cut.c +=================================================================== +--- coreutils-7.1/src/cut.c.orig 2008-09-18 09:06:57.000000000 +0200 ++++ coreutils-7.1/src/cut.c 2010-06-29 18:49:31.855522069 +0200 +@@ -28,6 +28,12 @@ + #include + #include + #include ++ ++/* Get mbstate_t, mbrtowc(). */ ++#if HAVE_WCHAR_H ++# include ++#endif ++ + #include "system.h" + + #include "error.h" +@@ -36,6 +42,13 @@ + #include "quote.h" + #include "xstrndup.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# undef MB_LEN_MAX ++# define MB_LEN_MAX 16 ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "cut" + +@@ -77,6 +90,54 @@ struct range_pair + size_t hi; + }; + ++/* Refill the buffer BUF. */ ++#define REFILL_BUFFER(BUF, BUFPOS, BUFLEN, STREAM) \ ++ do \ ++ { \ ++ if (BUFLEN < MB_LEN_MAX && !feof (STREAM) && !ferror (STREAM)) \ ++ { \ ++ memmove (BUF, BUFPOS, BUFLEN); \ ++ BUFLEN += fread (BUF + BUFLEN, sizeof(char), BUFSIZ, STREAM); \ ++ BUFPOS = BUF; \ ++ } \ ++ } \ ++ while (0) ++ ++/* Get wide character which starts at BUFPOS. If the byte sequence is ++ not valid as a character, CONVFAIL is 1. Otherwise 0. */ ++#define GET_NEXT_WC_FROM_BUFFER(WC, BUFPOS, BUFLEN, MBLENGTH, STATE, CONVFAIL) \ ++ do \ ++ { \ ++ wchar_t tmp; \ ++ mbstate_t state_bak; \ ++ \ ++ if (BUFLEN < 1) \ ++ { \ ++ WC = WEOF; \ ++ break; \ ++ } \ ++ \ ++ /* Get a wide character. 
*/ \ ++ CONVFAIL = 0; \ ++ state_bak = STATE; \ ++ MBLENGTH = mbrtowc (&tmp, BUFPOS, BUFLEN, &STATE); \ ++ WC = tmp; \ ++ \ ++ switch (MBLENGTH) \ ++ { \ ++ case (size_t)-1: \ ++ case (size_t)-2: \ ++ ++CONVFAIL; \ ++ STATE = state_bak; \ ++ /* Fall througn. */ \ ++ \ ++ case 0: \ ++ MBLENGTH = 1; \ ++ break; \ ++ } \ ++ } \ ++ while (0) ++ + /* This buffer is used to support the semantics of the -s option + (or lack of same) when the specified field list includes (does + not include) the first field. In both of those cases, the entire +@@ -89,7 +150,7 @@ static char *field_1_buffer; + /* The number of bytes allocated for FIELD_1_BUFFER. */ + static size_t field_1_bufsize; + +-/* The largest field or byte index used as an endpoint of a closed ++/* The largest field, character or byte index used as an endpoint of a closed + or degenerate range specification; this doesn't include the starting + index of right-open-ended ranges. For example, with either range spec + `2-5,9-', `2-3,5,9-' this variable would be set to 5. */ +@@ -101,10 +162,11 @@ static size_t eol_range_start; + + /* This is a bit vector. + In byte mode, which bytes to output. ++ In character mode, which characters to output. + In field mode, which DELIM-separated fields to output. +- Both bytes and fields are numbered starting with 1, ++ Bytes, characters and fields are numbered starting with 1, + so the zeroth bit of this array is unused. +- A field or byte K has been selected if ++ A byte, character or field K has been selected if + (K <= MAX_RANGE_ENDPOINT and is_printable_field(K)) + || (EOL_RANGE_START > 0 && K >= EOL_RANGE_START). */ + static unsigned char *printable_field; +@@ -113,15 +175,25 @@ enum operating_mode + { + undefined_mode, + +- /* Output characters that are in the given bytes. */ ++ /* Output bytes that are in the given bytes. */ + byte_mode, + ++ /* Output characters that are at the given positions. */ ++ character_mode, ++ + /* Output the given delimeter-separated fields. 
*/ + field_mode + }; + + static enum operating_mode operating_mode; + ++/* If true, when in byte mode, don't split multibyte characters. */ ++static bool byte_mode_character_aware; ++ ++/* If true, the function for single byte locale is work ++ if this program runs on multibyte locale. */ ++static bool force_singlebyte_mode; ++ + /* If true do not output lines containing no delimeter characters. + Otherwise, all such lines are printed. This option is valid only + with field mode. */ +@@ -133,6 +205,9 @@ static bool complement; + + /* The delimeter character for field mode. */ + static unsigned char delim; ++#if HAVE_WCHAR_H ++static wchar_t wcdelim; ++#endif + + /* True if the --output-delimiter=STRING option was specified. */ + static bool output_delimiter_specified; +@@ -206,7 +281,7 @@ Mandatory arguments to long options are + -f, --fields=LIST select only these fields; also print any line\n\ + that contains no delimiter character, unless\n\ + the -s option is specified\n\ +- -n (ignored)\n\ ++ -n with -b: don't split multibyte characters\n\ + "), stdout); + fputs (_("\ + --complement complement the set of selected bytes, characters\n\ +@@ -365,7 +440,7 @@ set_fields (const char *fieldstr) + in_digits = false; + /* Starting a range. */ + if (dash_found) +- FATAL_ERROR (_("invalid byte or field list")); ++ FATAL_ERROR (_("invalid byte, character or field list")); + dash_found = true; + fieldstr++; + +@@ -389,7 +464,9 @@ set_fields (const char *fieldstr) + if (!rhs_specified) + { + /* `n-'. From `initial' to end of line. 
*/ +- eol_range_start = initial; ++ if (eol_range_start == 0 ++ || (eol_range_start != 0 && eol_range_start > initial)) ++ eol_range_start = initial; + field_found = true; + } + else +@@ -486,7 +563,7 @@ set_fields (const char *fieldstr) + fieldstr++; + } + else +- FATAL_ERROR (_("invalid byte or field list")); ++ FATAL_ERROR (_("invalid byte, character or field list")); + } + + max_range_endpoint = 0; +@@ -579,6 +656,81 @@ cut_bytes (FILE *stream) + } + } + ++#if HAVE_MBRTOWC ++/* This function is in use for the following case. ++ ++ 1. Read from the stream STREAM, printing to standard output any selected ++ characters. ++ ++ 2. Read from stream STREAM, printing to standard output any selected bytes, ++ without splitting multibyte characters. */ ++ ++static void ++cut_characters_or_cut_bytes_no_split (FILE *stream) ++{ ++ size_t idx; /* Number of bytes or characters in the line so far. */ ++ /* Whether to begin printing delimiters between ranges for the current line. ++ Set after we've begun printing data corresponding to the first range. */ ++ bool print_delimiter; ++ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen; /* The length of the byte sequence in buf. */ ++ wint_t wc; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character which shows ++ as same character as WC. */ ++ mbstate_t state; /* State of the stream. */ ++ int convfail; /* 1, when conversion is failed. Otherwise 0. 
*/ ++ ++ ++ idx = 0; ++ print_delimiter = false; ++ buflen = 0; ++ bufpos = buf; ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ while (1) ++ { ++ REFILL_BUFFER (buf, bufpos, buflen, stream); ++ ++ GET_NEXT_WC_FROM_BUFFER (wc, bufpos, buflen, mblength, state, convfail); ++ ++ if (wc == WEOF) ++ { ++ if (idx > 0) ++ putchar ('\n'); ++ break; ++ } ++ else if (wc == L'\n') ++ { ++ putchar ('\n'); ++ idx = 0; ++ print_delimiter = false; ++ } ++ else ++ { ++ bool range_start; ++ bool *rs = output_delimiter_specified ? &range_start : NULL; ++ ++ idx += (operating_mode == byte_mode) ? mblength : 1; ++ if (print_kth (idx, rs)) ++ { ++ if (rs && *rs && print_delimiter) ++ { ++ fwrite (output_delimiter_string, sizeof (char), ++ output_delimiter_length, stdout); ++ } ++ print_delimiter = true; ++ fwrite (bufpos, mblength, sizeof (char), stdout); ++ } ++ } ++ ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++} ++#endif ++ + /* Read from stream STREAM, printing to standard output any selected fields. */ + + static void +@@ -701,13 +853,190 @@ cut_fields (FILE *stream) + } + } + ++#if HAVE_MBRTOWC ++static void ++cut_fields_mb (FILE *stream) ++{ ++ int c; ++ size_t field_idx = 1; ++ bool found_any_selected_field = false; ++ bool buffer_first_field; ++ int empty_input; ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen; /* The length of the byte sequence in buf. */ ++ wint_t wc; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character which shows ++ as same character as WC. */ ++ mbstate_t state; /* State of the stream. */ ++ int convfail; /* 1, when conversion is failed. Otherwise 0. 
*/ ++ ++ bufpos = buf; ++ buflen = 0; ++ memset (&state, '\0', sizeof (mbstate_t)); ++ ++ c = getc (stream); ++ empty_input = (c == EOF); ++ if (c != EOF) ++ ungetc (c, stream); ++ else ++ wc = WEOF; ++ ++ /* To support the semantics of the -s flag, we may have to buffer ++ all of the first field to determine whether it is `delimited.' ++ But that is unnecessary if all non-delimited lines must be printed ++ and the first field has been selected, or if non-delimited lines ++ must be suppressed and the first field has *not* been selected. ++ That is because a non-delimited line has exactly one field. */ ++ buffer_first_field = (suppress_non_delimited ^ !print_kth (1, NULL)); ++ ++ while (1) ++ { ++ if (field_idx == 1 && buffer_first_field) ++ { ++ size_t n_bytes = 0; ++ ++ while (1) ++ { ++ REFILL_BUFFER (buf, bufpos, buflen, stream); ++ ++ GET_NEXT_WC_FROM_BUFFER ++ (wc, bufpos, buflen, mblength, state, convfail); ++ ++ if (wc == WEOF) ++ break; ++ ++ field_1_buffer = xrealloc (field_1_buffer, n_bytes + mblength); ++ memcpy (field_1_buffer + n_bytes, bufpos, mblength); ++ n_bytes += mblength; ++ buflen -= mblength; ++ bufpos += mblength; ++ ++ if (!convfail && (wc == L'\n' || wc == wcdelim)) ++ break; ++ } ++ ++ if (wc == WEOF) ++ break; ++ ++ /* If the first field extends to the end of line (it is not ++ delimited) and we are printing all non-delimited lines, ++ print this one. */ ++ if (convfail || (!convfail && wc != wcdelim)) ++ { ++ if (suppress_non_delimited) ++ { ++ /* Empty. */ ++ } ++ else ++ { ++ fwrite (field_1_buffer, sizeof (char), n_bytes, stdout); ++ /* Make sure the output line is newline terminated. */ ++ if (convfail || (!convfail && wc != L'\n')) ++ putchar ('\n'); ++ } ++ continue; ++ } ++ ++ if (print_kth (1, NULL)) ++ { ++ /* Print the field, but not the trailing delimiter. 
*/ ++ fwrite (field_1_buffer, sizeof (char), n_bytes - 1, stdout); ++ found_any_selected_field = true; ++ } ++ ++field_idx; ++ } ++ ++ if (wc != WEOF) ++ { ++ if (print_kth (field_idx, NULL)) ++ { ++ if (found_any_selected_field) ++ { ++ fwrite (output_delimiter_string, sizeof (char), ++ output_delimiter_length, stdout); ++ } ++ found_any_selected_field = true; ++ } ++ ++ while (1) ++ { ++ REFILL_BUFFER (buf, bufpos, buflen, stream); ++ ++ GET_NEXT_WC_FROM_BUFFER ++ (wc, bufpos, buflen, mblength, state, convfail); ++ ++ if (wc == WEOF) ++ break; ++ else if (!convfail && (wc == wcdelim || wc == L'\n')) ++ { ++ buflen -= mblength; ++ bufpos += mblength; ++ break; ++ } ++ ++ if (print_kth (field_idx, NULL)) ++ fwrite (bufpos, mblength, sizeof (char), stdout); ++ ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++ } ++ ++ if ((!convfail || wc == L'\n') && buflen < 1) ++ wc = WEOF; ++ ++ if (!convfail && wc == wcdelim) ++ ++field_idx; ++ else if (wc == WEOF || (!convfail && wc == L'\n')) ++ { ++ if (found_any_selected_field ++ || (!empty_input && !(suppress_non_delimited && field_idx == 1))) ++ putchar ('\n'); ++ if (wc == WEOF) ++ break; ++ field_idx = 1; ++ found_any_selected_field = false; ++ } ++ } ++} ++#endif ++ + static void + cut_stream (FILE *stream) + { +- if (operating_mode == byte_mode) +- cut_bytes (stream); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) ++ { ++ switch (operating_mode) ++ { ++ case byte_mode: ++ if (byte_mode_character_aware) ++ cut_characters_or_cut_bytes_no_split (stream); ++ else ++ cut_bytes (stream); ++ break; ++ ++ case character_mode: ++ cut_characters_or_cut_bytes_no_split (stream); ++ break; ++ ++ case field_mode: ++ cut_fields_mb (stream); ++ break; ++ ++ default: ++ abort (); ++ } ++ } + else +- cut_fields (stream); ++#endif ++ { ++ if (operating_mode == field_mode) ++ cut_fields (stream); ++ else ++ cut_bytes (stream); ++ } + } + + /* Process file FILE to standard output. 
+@@ -757,6 +1086,8 @@ main (int argc, char **argv) + bool ok; + bool delim_specified = false; + char *spec_list_string IF_LINT(= NULL); ++ char mbdelim[MB_LEN_MAX + 1]; ++ size_t delimlen = 0; + + initialize_main (&argc, &argv); + set_program_name (argv[0]); +@@ -779,7 +1110,6 @@ main (int argc, char **argv) + switch (optc) + { + case 'b': +- case 'c': + /* Build the byte list. */ + if (operating_mode != undefined_mode) + FATAL_ERROR (_("only one type of list may be specified")); +@@ -787,6 +1117,14 @@ main (int argc, char **argv) + spec_list_string = optarg; + break; + ++ case 'c': ++ /* Build the character list. */ ++ if (operating_mode != undefined_mode) ++ FATAL_ERROR (_("only one type of list may be specified")); ++ operating_mode = character_mode; ++ spec_list_string = optarg; ++ break; ++ + case 'f': + /* Build the field list. */ + if (operating_mode != undefined_mode) +@@ -798,9 +1136,32 @@ main (int argc, char **argv) + case 'd': + /* New delimiter. */ + /* Interpret -d '' to mean `use the NUL byte as the delimiter.' */ +- if (optarg[0] != '\0' && optarg[1] != '\0') +- FATAL_ERROR (_("the delimiter must be a single character")); +- delim = optarg[0]; ++#if HAVE_MBRTOWC ++ if(MB_CUR_MAX > 1) ++ { ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ delimlen = mbrtowc (&wcdelim, optarg, MB_LEN_MAX, &state); ++ ++ if (delimlen == (size_t)-1 || delimlen == (size_t)-2) ++ force_singlebyte_mode = true; ++ else ++ { ++ delimlen = (delimlen < 1) ? 
1 : delimlen; ++ if (wcdelim != L'\0' && *(optarg + delimlen) != '\0') ++ FATAL_ERROR (_("the delimiter must be a single character")); ++ memcpy (mbdelim, optarg, delimlen); ++ } ++ } ++ ++ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) ++#endif ++ { ++ if (optarg[0] != '\0' && optarg[1] != '\0') ++ FATAL_ERROR (_("the delimiter must be a single character")); ++ delim = (unsigned char) optarg[0]; ++ } + delim_specified = true; + break; + +@@ -814,6 +1175,7 @@ main (int argc, char **argv) + break; + + case 'n': ++ byte_mode_character_aware = true; + break; + + case 's': +@@ -836,7 +1198,7 @@ main (int argc, char **argv) + if (operating_mode == undefined_mode) + FATAL_ERROR (_("you must specify a list of bytes, characters, or fields")); + +- if (delim != '\0' && operating_mode != field_mode) ++ if (delim_specified && operating_mode != field_mode) + FATAL_ERROR (_("an input delimiter may be specified only\ + when operating on fields")); + +@@ -863,15 +1225,34 @@ main (int argc, char **argv) + } + + if (!delim_specified) +- delim = '\t'; ++ { ++ delim = '\t'; ++#ifdef HAVE_MBRTOWC ++ wcdelim = L'\t'; ++ mbdelim[0] = '\t'; ++ mbdelim[1] = '\0'; ++ delimlen = 1; ++#endif ++ } + + if (output_delimiter_string == NULL) + { +- static char dummy[2]; +- dummy[0] = delim; +- dummy[1] = '\0'; +- output_delimiter_string = dummy; +- output_delimiter_length = 1; ++#ifdef HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) ++ { ++ output_delimiter_string = xstrdup (mbdelim); ++ output_delimiter_length = delimlen; ++ } ++ ++ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) ++#endif ++ { ++ static char dummy[2]; ++ dummy[0] = delim; ++ dummy[1] = '\0'; ++ output_delimiter_string = dummy; ++ output_delimiter_length = 1; ++ } + } + + if (optind == argc) +Index: src/expand.c +=================================================================== +--- coreutils-7.1/src/expand.c.orig 2008-11-10 14:17:52.000000000 +0100 ++++ coreutils-7.1/src/expand.c 2010-06-29 18:49:31.871522014 +0200 
+@@ -37,11 +37,31 @@ + #include + #include + #include ++ ++/* Get mbstate_t, mbrtowc, wcwidth. */ ++#if HAVE_WCHAR_H ++# include ++#endif ++#if HAVE_WCTYPE_H ++# include ++#endif ++ + #include "system.h" + #include "error.h" + #include "quote.h" + #include "xstrndup.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "expand" + +@@ -343,9 +363,12 @@ expand (void) + } + else + { +- column++; +- if (!column) +- error (EXIT_FAILURE, 0, _("input line is too long")); ++ if (!iscntrl (c)) ++ { ++ column++; ++ if (!column) ++ error (EXIT_FAILURE, 0, _("input line is too long")); ++ } + } + + convert &= convert_entire_line | !! isblank (c); +@@ -361,6 +384,165 @@ expand (void) + } + } + ++#if HAVE_MBRTOWC && HAVE_WCTYPE_H ++static void ++expand_multibyte (void) ++{ ++ /* Input stream. */ ++ FILE *fp = next_file (NULL); ++ ++ mbstate_t i_state; /* Current shift state of the input stream. */ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen = 0; /* The length of the byte sequence in buf. */ ++ ++ if (!fp) ++ return; ++ ++ memset (&i_state, '\0', sizeof (mbstate_t)); ++ ++ for (;;) ++ { ++ /* Input character, or EOF. */ ++ wint_t wc; ++ ++ /* If true, perform translations. */ ++ bool convert = true; ++ ++ ++ /* The following variables have valid values only when CONVERT ++ is true: */ ++ ++ /* Column of next input character. */ ++ uintmax_t column = 0; ++ ++ /* Index in TAB_LIST of next tab stop to examine. 
*/ ++ size_t tab_index = 0; ++ ++ ++ /* Convert a line of text. */ ++ ++ do ++ { ++ wchar_t w; ++ size_t mblength; /* The byte size of a multibyte character ++ which shows as same character as WC. */ ++ mbstate_t i_state_bak; /* Back up the I_STATE. */ ++ ++ /* Fill buffer */ ++ if (buflen < MB_LEN_MAX) ++ { ++ if (!feof(fp) && !ferror(fp)) ++ { ++ if (buflen > 0) ++ memmove (buf, bufpos, buflen); ++ buflen += fread (buf + buflen, sizeof (char), BUFSIZ, fp); ++ bufpos = buf; ++ } ++ } ++ ++ if (buflen < 1) ++ { ++ /* Move to the next file */ ++ if (feof (fp) || ferror (fp)) ++ fp = next_file(fp); ++ if (!fp) ++ return; ++ memset (&i_state, '\0', sizeof (mbstate_t)); ++ continue; ++ } ++ ++ i_state_bak = i_state; ++ mblength = mbrtowc (&w, bufpos, buflen, &i_state); ++ wc = w; ++ ++ if (mblength == (size_t) -1 || mblength == (size_t) -2) ++ { ++ i_state = i_state_bak; ++ wc = L'\0'; ++ column += convert; ++ mblength = 1; ++ } ++ ++ if (convert) ++ { ++ if (wc == L'\t') ++ { ++ /* Column the next input tab stop is on. */ ++ uintmax_t next_tab_column; ++ ++ if (tab_size) ++ next_tab_column = column + (tab_size - column % tab_size); ++ else ++ for (;;) ++ if (tab_index == first_free_tab) ++ { ++ next_tab_column = column + 1; ++ break; ++ } ++ else ++ { ++ uintmax_t tab = tab_list[tab_index++]; ++ if (column < tab) ++ { ++ next_tab_column = tab; ++ break; ++ } ++ } ++ ++ if (next_tab_column < column) ++ error (EXIT_FAILURE, 0, _("input line is too long")); ++ ++ while (++column < next_tab_column) ++ if (putchar (' ') < 0) ++ error (EXIT_FAILURE, errno, _("write error")); ++ ++ *bufpos = ' '; ++ } ++ else if (wc == L'\b') ++ { ++ /* Go back one column, and force recalculation of the ++ next tab stop. 
*/ ++ column -= !!column; ++ tab_index -= !!tab_index; ++ } ++ else ++ { ++ if (!iswcntrl (wc)) ++ { ++ int width = wcwidth (wc); ++ if (width > 0) ++ { ++ if (column > (column + width)) ++ error (EXIT_FAILURE, 0, _("input line is too long")); ++ column += width; ++ } ++ } ++ } ++ ++ convert &= convert_entire_line | iswblank (wc); ++ } ++ ++ if (mblength) ++ { ++ if (fwrite (bufpos, sizeof (char), mblength, stdout) < mblength) ++ error (EXIT_FAILURE, errno, _("write error")); ++ } ++ else ++ { ++ if (putchar ('\0')) ++ error (EXIT_FAILURE, errno, _("write error")); ++ mblength = 1; ++ } ++ ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++ while (wc != L'\n'); ++ } ++} ++#endif ++ + int + main (int argc, char **argv) + { +@@ -425,7 +607,12 @@ main (int argc, char **argv) + + file_list = (optind < argc ? &argv[optind] : stdin_argv); + +- expand (); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ expand_multibyte (); ++ else ++#endif ++ expand (); + + if (have_read_stdin && fclose (stdin) != 0) + error (EXIT_FAILURE, errno, "-"); +Index: src/fold.c +=================================================================== +--- coreutils-7.1/src/fold.c.orig 2008-09-18 09:06:57.000000000 +0200 ++++ coreutils-7.1/src/fold.c 2010-06-29 18:49:31.896029818 +0200 +@@ -22,6 +22,19 @@ + #include + #include + ++/* Get MB_CUR_MAX. */ ++#include ++ ++/* Get mbrtowc, mbstate_t, wcwidth(). */ ++#if HAVE_WCHAR_H ++# include ++#endif ++ ++/* Get iswprint(), iswctype(), wctype(). */ ++#if HAVE_WCTYPE_H ++# include ++#endif ++ + #include "system.h" + #include "error.h" + #include "quote.h" +@@ -29,11 +42,54 @@ + + #define TAB_WIDTH 8 + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. 
*/ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# undef MB_LEN_MAX ++# define MB_LEN_MAX 16 ++#endif ++ ++#ifndef HAVE_DECL_WCWIDTH ++"this configure-time declaration test was not run" ++#endif ++#if !HAVE_DECL_WCWIDTH ++extern int wcwidth (); ++#endif ++ ++/* If wcwidth() doesn't exist, assume all printable characters have ++ width 1. */ ++#if !defined wcwidth && !HAVE_WCWIDTH ++# define wcwidth(wc) ((wc) == 0 ? 0 : iswprint (wc) ? 1 : -1) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "fold" + + #define AUTHORS proper_name ("David MacKenzie") + ++#define FATAL_ERROR(Message) \ ++ do \ ++ { \ ++ error (0, 0, (Message)); \ ++ usage (2); \ ++ } \ ++ while (0) ++ ++enum operating_mode ++{ ++ /* Fold texts by columns that are at the given positions. */ ++ column_mode, ++ ++ /* Fold texts by bytes that are at the given positions. */ ++ byte_mode, ++ ++ /* Fold texts by characters that are at the given positions. */ ++ character_mode, ++}; ++ ++/* The argument shows current mode. (Default: column_mode) */ ++static enum operating_mode operating_mode; ++ + /* If nonzero, try to break on whitespace. */ + static bool break_spaces; + +@@ -43,11 +99,17 @@ static bool count_bytes; + /* If nonzero, at least one of the files we read was standard input. 
*/ + static bool have_read_stdin; + +-static char const shortopts[] = "bsw:0::1::2::3::4::5::6::7::8::9::"; ++static char const shortopts[] = "bcsw:0::1::2::3::4::5::6::7::8::9::"; ++ ++/* wide character class `blank' */ ++#if HAVE_MBRTOWC ++wctype_t blank_type; ++#endif + + static struct option const longopts[] = + { + {"bytes", no_argument, NULL, 'b'}, ++ {"characters", no_argument, NULL, 'c'}, + {"spaces", no_argument, NULL, 's'}, + {"width", required_argument, NULL, 'w'}, + {GETOPT_HELP_OPTION_DECL}, +@@ -77,6 +139,7 @@ Mandatory arguments to long options are + "), stdout); + fputs (_("\ + -b, --bytes count bytes rather than columns\n\ ++ -c, --characters count characters rather than columns\n\ + -s, --spaces break at spaces\n\ + -w, --width=WIDTH use WIDTH columns instead of 80\n\ + "), stdout); +@@ -94,7 +157,7 @@ Mandatory arguments to long options are + static size_t + adjust_column (size_t column, char c) + { +- if (!count_bytes) ++ if (operating_mode != byte_mode) + { + if (c == '\b') + { +@@ -113,14 +176,9 @@ adjust_column (size_t column, char c) + return column; + } + +-/* Fold file FILENAME, or standard input if FILENAME is "-", +- to stdout, with maximum line length WIDTH. +- Return true if successful. */ +- +-static bool +-fold_file (char const *filename, size_t width) ++static int ++fold_text (FILE *istream, size_t width) + { +- FILE *istream; + int c; + size_t column = 0; /* Screen column where next char will go. */ + size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ +@@ -128,20 +186,6 @@ fold_file (char const *filename, size_t + static size_t allocated_out = 0; + int saved_errno; + +- if (STREQ (filename, "-")) +- { +- istream = stdin; +- have_read_stdin = true; +- } +- else +- istream = fopen (filename, "r"); +- +- if (istream == NULL) +- { +- error (0, errno, "%s", filename); +- return false; +- } +- + while ((c = getc (istream)) != EOF) + { + if (offset_out + 1 >= allocated_out) +@@ -219,6 +263,234 @@ fold_file (char const *filename, size_t + if (offset_out) + fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); + ++ return saved_errno; ++} ++ ++#if HAVE_MBRTOWC ++static void ++fold_multibyte_text (FILE *istream, size_t width) ++{ ++ int i; ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ size_t buflen; /* The length of the byte sequence in buf. */ ++ char *bufpos; /* Next read position of BUF. */ ++ wint_t wc; /* A gotten wide character. */ ++ wchar_t tmp; ++ size_t mblength; /* The byte size of a multibyte character which shows ++ as same character as WC. */ ++ mbstate_t state, state_bak; /* State of the stream. */ ++ int convfail; /* 1, when conversion is failed. Otherwise 0. */ ++ ++ char *line_out = NULL; ++ size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ ++ size_t allocated_out = 1024; ++ ++ int increment; ++ size_t column = 0; ++ ++ size_t last_blank_pos; ++ size_t last_blank_column; ++ int is_blank_seen; ++ int last_blank_increment; ++ int is_bs_following_last_blank; ++ size_t bs_following_last_blank_num; ++ int is_cr_after_last_blank; ++ ++ ++#define CLEAR_FLAGS \ ++ do \ ++ { \ ++ last_blank_pos = 0; \ ++ last_blank_column = 0; \ ++ is_blank_seen = 0; \ ++ is_bs_following_last_blank = 0; \ ++ bs_following_last_blank_num = 0; \ ++ is_cr_after_last_blank = 0; \ ++ } \ ++ while (0) ++ ++#define START_NEW_LINE \ ++ do \ ++ { \ ++ putchar ('\n'); \ ++ column = 0; \ ++ offset_out = 0; \ ++ CLEAR_FLAGS; \ ++ } \ ++ while (0) ++ ++ CLEAR_FLAGS; ++ ++ memset (&state, '\0', sizeof (mbstate_t)); ++ line_out = xmalloc (allocated_out); ++ ++ buflen = fread (buf, sizeof (char), BUFSIZ, istream); ++ bufpos = buf; ++ ++ for (;; bufpos += mblength, buflen -= mblength) ++ { ++ if (buflen < MB_LEN_MAX && !feof (istream) && !ferror (istream)) ++ { ++ memmove (buf, bufpos, buflen); ++ buflen += fread (buf + buflen, sizeof (char), BUFSIZ, istream); ++ bufpos = buf; ++ } ++ ++ if (buflen < 1) ++ break; ++ ++ /* Get a wide character. */ ++ convfail = 0; ++ state_bak = state; ++ mblength = mbrtowc (&tmp, bufpos, buflen, &state); ++ wc = tmp; ++ ++ switch (mblength) ++ { ++ case (size_t)-1: ++ case (size_t)-2: ++ convfail++; ++ state = state_bak; ++ /* Fall through. */ ++ ++ case 0: ++ mblength = 1; ++ break; ++ } ++ ++ if (!convfail && wc == L'\n') ++ { ++ if (offset_out > 0) ++ { ++ fwrite (line_out, sizeof (char), offset_out, stdout); ++ START_NEW_LINE; ++ } ++ continue; ++ } ++ ++ rescan: ++ if (operating_mode == byte_mode) /* byte mode */ ++ increment = mblength; ++ else if (operating_mode == character_mode) /* character mode */ ++ increment = 1; ++ else /* column mode */ ++ { ++ if (convfail) ++ increment = 1; ++ else ++ { ++ switch (wc) ++ { ++ case L'\b': ++ increment = (column > 0) ? 
-1 : 0; ++ break; ++ ++ case L'\r': ++ increment = -1 * column; ++ break; ++ ++ case L'\t': ++ increment = 8 - column % 8; ++ break; ++ ++ default: ++ increment = wcwidth (wc); ++ increment = (increment < 0) ? 0 : increment; ++ } ++ } ++ } ++ ++ if (column + increment > width && break_spaces && last_blank_pos) ++ { ++ fwrite (line_out, sizeof (char), last_blank_pos, stdout); ++ putchar ('\n'); ++ ++ offset_out = offset_out - last_blank_pos; ++ column = (column - last_blank_column ++ + (is_cr_after_last_blank ++ ? last_blank_increment : bs_following_last_blank_num)); ++ memmove (line_out, line_out + last_blank_pos, offset_out); ++ CLEAR_FLAGS; ++ goto rescan; ++ } ++ ++ if (column + increment > width && column != 0) ++ { ++ fwrite (line_out, sizeof (char), offset_out, stdout); ++ START_NEW_LINE; ++ goto rescan; ++ } ++ ++ if (allocated_out < offset_out + mblength) ++ line_out = x2nrealloc (line_out, &allocated_out, sizeof *line_out); ++ ++ for (i = 0; i < mblength; i++) ++ { ++ line_out[offset_out] = bufpos[i]; ++ ++offset_out; ++ } ++ ++ column += increment; ++ ++ if (is_blank_seen && !convfail && wc == L'\r') ++ is_cr_after_last_blank = 1; ++ ++ if (is_bs_following_last_blank && !convfail && wc == L'\b') ++ ++bs_following_last_blank_num; ++ else ++ is_bs_following_last_blank = 0; ++ ++ if (break_spaces && !convfail && iswctype (wc, blank_type)) ++ { ++ last_blank_pos = offset_out; ++ last_blank_column = column; ++ is_blank_seen = 1; ++ last_blank_increment = increment; ++ is_bs_following_last_blank = 1; ++ bs_following_last_blank_num = 0; ++ is_cr_after_last_blank = 0; ++ } ++ } ++ ++ if (offset_out) ++ fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); ++ ++ free(line_out); ++} ++#endif ++ ++/* Fold file FILENAME, or standard input if FILENAME is "-", ++ to stdout, with maximum line length WIDTH. ++ Return true if successful. 
*/ ++ ++static bool ++fold_file (char const *filename, size_t width) ++{ ++ FILE *istream; ++ int saved_errno; ++ ++ if (STREQ (filename, "-")) ++ { ++ istream = stdin; ++ have_read_stdin = true; ++ } ++ else ++ istream = fopen (filename, "r"); ++ ++ if (istream == NULL) ++ { ++ error (0, errno, "%s", filename); ++ return false; ++ } ++ ++ /* Define how ISTREAM is being folded. */ ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ fold_multibyte_text (istream, width); ++ else ++#endif ++ saved_errno = fold_text (istream, width); ++ + if (ferror (istream)) + { + error (0, saved_errno, "%s", filename); +@@ -251,6 +523,10 @@ main (int argc, char **argv) + + atexit (close_stdout); + ++#if HAVE_MBRTOWC ++ blank_type = wctype ("blank"); ++#endif ++ operating_mode = column_mode; + break_spaces = count_bytes = have_read_stdin = false; + + while ((optc = getopt_long (argc, argv, shortopts, longopts, NULL)) != -1) +@@ -260,7 +536,15 @@ main (int argc, char **argv) + switch (optc) + { + case 'b': /* Count bytes rather than columns. */ +- count_bytes = true; ++ if (operating_mode != column_mode) ++ FATAL_ERROR (_("only one way of folding may be specified")); ++ operating_mode = byte_mode; ++ break; ++ ++ case 'c': /* Count characters rather than columns. */ ++ if (operating_mode != column_mode) ++ FATAL_ERROR (_("only one way of folding may be specified")); ++ operating_mode = character_mode; + break; + + case 's': /* Break at word boundaries. */ +Index: src/join.c +=================================================================== +--- coreutils-7.1/src/join.c.orig 2008-11-10 14:17:52.000000000 +0100 ++++ coreutils-7.1/src/join.c 2010-06-29 18:49:31.923528009 +0200 +@@ -22,6 +22,16 @@ + #include + #include + ++/* Get mbstate_t, mbrtowc, mbrtowc, wcwidth. */ ++#if HAVE_WCHAR_H ++# include ++#endif ++ ++/* Get iswblank, towupper. 
*/ ++#if HAVE_WCTYPE_H ++# include ++#endif ++ + #include "system.h" + #include "error.h" + #include "linebuffer.h" +@@ -32,6 +42,11 @@ + #include "xstrtol.h" + #include "argmatch.h" + ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "join" + +@@ -120,10 +135,13 @@ static struct outlist outlist_head; + /* Last element in `outlist', where a new element can be added. */ + static struct outlist *outlist_end = &outlist_head; + +-/* Tab character separating fields. If negative, fields are separated ++/* Tab character separating fields. If NULL, fields are separated + by any nonempty string of blanks, otherwise by exactly one + tab character whose value (when cast to unsigned char) equals TAB. */ +-static int tab = -1; ++static const char *tab = NULL; ++ ++/* The number of bytes used for tab. */ ++static size_t tablen = 0; + + /* If nonzero, check that the input is correctly ordered. */ + static enum +@@ -237,10 +255,10 @@ xfields (struct line *line) + if (ptr == lim) + return; + +- if (0 <= tab) ++ if (tab != NULL) + { + char *sep; +- for (; (sep = memchr (ptr, tab, lim - ptr)) != NULL; ptr = sep + 1) ++ for (; (sep = memchr (ptr, tab[0], lim - ptr)) != NULL; ptr = sep + 1) + extract_field (line, ptr, sep - ptr); + } + else +@@ -285,56 +303,115 @@ keycmp (struct line const *line1, struct + size_t jf_1, size_t jf_2) + { + /* Start of field to compare in each file. */ +- char *beg1; +- char *beg2; +- +- size_t len1; +- size_t len2; /* Length of fields to compare. */ ++ char *beg[2]; ++ char *copy[2]; ++ size_t len[2]; /* Length of fields to compare. 
*/ + int diff; ++ int i, j; + + if (jf_1 < line1->nfields) + { +- beg1 = line1->fields[jf_1].beg; +- len1 = line1->fields[jf_1].len; ++ beg[0] = line1->fields[jf_1].beg; ++ len[0] = line1->fields[jf_1].len; + } + else + { +- beg1 = NULL; +- len1 = 0; ++ beg[0] = NULL; ++ len[0] = 0; + } + + if (jf_2 < line2->nfields) + { +- beg2 = line2->fields[jf_2].beg; +- len2 = line2->fields[jf_2].len; ++ beg[1] = line2->fields[jf_2].beg; ++ len[1] = line2->fields[jf_2].len; + } + else + { +- beg2 = NULL; +- len2 = 0; ++ beg[1] = NULL; ++ len[1] = 0; + } + +- if (len1 == 0) +- return len2 == 0 ? 0 : -1; +- if (len2 == 0) ++ if (len[0] == 0) ++ return len[1] == 0 ? 0 : -1; ++ if (len[1] == 0) + return 1; + + if (ignore_case) + { +- /* FIXME: ignore_case does not work with NLS (in particular, +- with multibyte chars). */ +- diff = memcasecmp (beg1, beg2, MIN (len1, len2)); ++#ifdef HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ size_t mblength; ++ wchar_t wc, uwc; ++ mbstate_t state, state_bak; ++ ++ memset (&state, '\0', sizeof (mbstate_t)); ++ ++ for (i = 0; i < 2; i++) ++ { ++ copy[i] = alloca (len[i] + 1); ++ ++ for (j = 0; j < MIN (len[0], len[1]);) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, beg[i] + j, len[i] - j, &state); ++ ++ switch (mblength) ++ { ++ case (size_t) -1: ++ case (size_t) -2: ++ state = state_bak; ++ /* Fall through */ ++ case 0: ++ mblength = 1; ++ break; ++ ++ default: ++ uwc = towupper (wc); ++ ++ if (uwc != wc) ++ { ++ mbstate_t state_wc; ++ ++ memset (&state_wc, '\0', sizeof (mbstate_t)); ++ wcrtomb (copy[i] + j, uwc, &state_wc); ++ } ++ else ++ memcpy (copy[i] + j, beg[i] + j, mblength); ++ } ++ j += mblength; ++ } ++ copy[i][j] = '\0'; ++ } ++ return xmemcoll (copy[0], len[0], copy[1], len[1]); ++ } ++#endif ++ if (hard_LC_COLLATE) ++ { ++ for (i = 0; i < 2; i++) ++ { ++ copy[i] = alloca (len[i] + 1); ++ ++ for (j = 0; j < MIN (len[0], len[1]); j++) ++ copy[i][j] = toupper (beg[i][j]); ++ ++ copy[i][j] = '\0'; ++ } ++ return xmemcoll 
(copy[0], len[0], copy[1], len[1]); ++ } ++ else ++ diff = memcasecmp (beg[0], beg[1], MIN (len[0], len[1])); + } + else + { + if (hard_LC_COLLATE) +- return xmemcoll (beg1, len1, beg2, len2); +- diff = memcmp (beg1, beg2, MIN (len1, len2)); ++ return xmemcoll (beg[0], len[0], beg[1], len[1]); ++ diff = memcmp (beg[0], beg[1], MIN (len[0], len[1])); + } + + if (diff) + return diff; +- return len1 < len2 ? -1 : len1 != len2; ++ return len[0] < len[1] ? -1 : len[0] != len[1]; + } + + /* Check that successive input lines PREV and CURRENT from input file +@@ -388,6 +465,133 @@ init_linep (struct line **linep) + return line; + } + ++#if HAVE_MBRTOWC ++static void ++xfields_multibyte (struct line *line) ++{ ++ int i; ++ char *ptr0 = line->buf.buffer; ++ char *ptr; ++ char *lim; ++ wchar_t wc = 0; ++ size_t mblength; ++ mbstate_t state, state_bak; ++ ++ memset (&state, 0, sizeof (mbstate_t)); ++ ++ ptr = ptr0; ++ lim = ptr0 + line->buf.length - 1; ++ ++ if (tab == NULL) ++ { ++ /* Skip leading blanks before the first field. */ ++ while (ptr < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); ++ ++ if (mblength == (size_t) -1 || mblength == (size_t) -2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ if (!iswblank (wc)) ++ break; ++ ptr += mblength; ++ } ++ } ++ ++ for (i = 0; ptr < lim; ++i) ++ { ++ if (tab != NULL) ++ { ++ char *beg = ptr; ++ while (ptr < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); ++ ++ if (mblength == (size_t) -1 || mblength == (size_t) -2) ++ { ++ mblength = 1; ++ state = state_bak; ++ } ++ mblength = (mblength < 1) ? 
1 : mblength; ++ ++ if (mblength == tablen && !memcmp (ptr, tab, mblength)) ++ break; ++ else ++ { ++ ptr += mblength; ++ continue; ++ } ++ } ++ ++ extract_field (line, beg, ptr - beg); ++ if (ptr < lim) ++ ptr += mblength; ++ } ++ else ++ { ++ char *beg = ptr; ++ while (ptr < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); ++ ++ if (mblength == (size_t) -1 || mblength == (size_t) -2) ++ { ++ mblength = 1; ++ state = state_bak; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ if (iswblank (wc)) ++ break; ++ else ++ { ++ ptr += mblength; ++ continue; ++ } ++ } ++ ++ extract_field (line, beg, ptr - beg); ++ if (ptr < lim) ++ ptr += mblength; ++ } ++ } ++ ++ if (ptr != ptr0) ++ { ++ mblength = mbrtowc (&wc, ptr - mblength, mblength, &state); ++ wc = (mbsinit (&state) && *(ptr - mblength) == '\0') ? L'\0' : wc; ++ if (tab != NULL) ++ { ++ if (mblength == (size_t) -1 || mblength == (size_t) -2) ++ mblength = 1; ++ ++ if (mblength == tablen && !memcmp (ptr - mblength, tab, mblength)) ++ /* Add one more (empty) field because the last character of ++ the line was a delimiter. */ ++ extract_field (line, NULL, 0); ++ } ++ else ++ { ++ if (mblength != (size_t) -1 && mblength != (size_t) -2) ++ { ++ if (iswblank (wc)) ++ /* Add one more (empty) field because the last character of ++ the line was a delimiter. */ ++ extract_field (line, NULL, 0); ++ } ++ } ++ } ++} ++#endif ++ + /* Read a line from FP into LINE and split it into fields. + Return true if successful. */ + +@@ -415,7 +619,12 @@ get_line (FILE *fp, struct line **linep, + return false; + } + +- xfields (line); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ xfields_multibyte (line); ++ else ++#endif ++ xfields (line); + + if (prevline[which - 1]) + check_order (prevline[which - 1], line, which); +@@ -520,7 +729,8 @@ static void + prjoin (struct line const *line1, struct line const *line2) + { + const struct outlist *outlist; +- char output_separator = tab < 0 ? 
' ' : tab; ++ const char *output_separator = tab == NULL ? " " : tab; ++ size_t output_separator_len = tab == NULL ? 1 : tablen; + + outlist = outlist_head.next; + if (outlist) +@@ -555,7 +765,7 @@ prjoin (struct line const *line1, struct + o = o->next; + if (o == NULL) + break; +- putchar (output_separator); ++ fwrite (output_separator, 1, output_separator_len, stdout); + } + putchar ('\n'); + } +@@ -573,23 +783,23 @@ prjoin (struct line const *line1, struct + prfield (join_field_1, line1); + for (i = 0; i < join_field_1 && i < line1->nfields; ++i) + { +- putchar (output_separator); ++ fwrite (output_separator, 1, output_separator_len, stdout); + prfield (i, line1); + } + for (i = join_field_1 + 1; i < line1->nfields; ++i) + { +- putchar (output_separator); ++ fwrite (output_separator, 1, output_separator_len, stdout); + prfield (i, line1); + } + + for (i = 0; i < join_field_2 && i < line2->nfields; ++i) + { +- putchar (output_separator); ++ fwrite (output_separator, 1, output_separator_len, stdout); + prfield (i, line2); + } + for (i = join_field_2 + 1; i < line2->nfields; ++i) + { +- putchar (output_separator); ++ fwrite (output_separator, 1, output_separator_len, stdout); + prfield (i, line2); + } + putchar ('\n'); +@@ -1020,20 +1230,40 @@ main (int argc, char **argv) + + case 't': + { +- unsigned char newtab = optarg[0]; +- if (! newtab) ++ const char *newtab = optarg; ++ size_t newtablen; ++ if (! 
newtab[0]) + error (EXIT_FAILURE, 0, _("empty tab")); +- if (optarg[1]) ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ mbstate_t state; ++ ++ memset (&state, 0, sizeof (mbstate_t)); ++ newtablen = mbrtowc (NULL, newtab, strlen (newtab), &state); ++ if (newtablen == (size_t) 0 ++ || newtablen == (size_t) -1 || newtablen == (size_t) -2) ++ newtablen = 1; ++ } ++ else ++#endif ++ newtablen = 1; ++ if (optarg[newtablen]) + { + if (STREQ (optarg, "\\0")) +- newtab = '\0'; ++ { ++ newtab = "\0"; ++ newtablen = 1; ++ } + else + error (EXIT_FAILURE, 0, _("multi-character tab %s"), + quote (optarg)); + } +- if (0 <= tab && tab != newtab) ++ if (tab != NULL ++ && (tablen != newtablen || memcmp (tab, newtab, tablen) != 0)) + error (EXIT_FAILURE, 0, _("incompatible tabs")); + tab = newtab; ++ tablen = newtablen; + } + break; + +Index: src/pr.c +=================================================================== +--- coreutils-7.1/src/pr.c.orig 2009-01-27 22:11:25.000000000 +0100 ++++ coreutils-7.1/src/pr.c 2010-06-29 18:49:31.931969742 +0200 +@@ -312,6 +312,32 @@ + + #include + #include ++ ++/* Get MB_LEN_MAX. */ ++#include <limits.h> ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX == 1 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Get MB_CUR_MAX. */ ++#include <stdlib.h> ++ ++/* Solaris 2.5 has a bug: <wchar.h> must be included before <wctype.h>. */ ++/* Get mbstate_t, mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ ++/* Get iswprint(). -- for wcwidth(). */ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++#if !defined iswprint && !HAVE_ISWPRINT ++# define iswprint(wc) 1 ++#endif ++ + #include "system.h" + #include "error.h" + #include "mbswidth.h" +@@ -321,6 +347,18 @@ + #include "strftime.h" + #include "xstrtol.h" + ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. 
*/ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ ++#ifndef HAVE_DECL_WCWIDTH ++"this configure-time declaration test was not run" ++#endif ++#if !HAVE_DECL_WCWIDTH ++extern int wcwidth (); ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "pr" + +@@ -414,8 +452,21 @@ struct COLUMN + typedef struct COLUMN COLUMN; + + #define NULLCOL (COLUMN *)0 ++ ++/* Function pointers to switch functions for single byte locale or for ++ multibyte locale. If multibyte functions do not exist in your system, ++ these pointers always point to the function for single byte locale. */ ++static void (*print_char) (char c); ++static int (*char_to_clump) (char c); ++ ++/* Functions for single byte locale. */ ++static void print_char_single (char c); ++static int char_to_clump_single (char c); ++ ++/* Functions for multibyte locale. */ ++static void print_char_multi (char c); ++static int char_to_clump_multi (char c); + +-static int char_to_clump (char c); + static bool read_line (COLUMN *p); + static bool print_page (void); + static bool print_stored (COLUMN *p); +@@ -425,6 +476,7 @@ static void print_header (void); + static void pad_across_to (int position); + static void add_line_number (COLUMN *p); + static void getoptarg (char *arg, char switch_char, char *character, ++ int *character_length, int *character_width, + int *number); + void usage (int status); + static void print_files (int number_of_files, char **av); +@@ -439,7 +491,6 @@ static void store_char (char c); + static void pad_down (int lines); + static void read_rest_of_line (COLUMN *p); + static void skip_read (COLUMN *p, int column_number); +-static void print_char (char c); + static void cleanup (void); + static void print_sep_string (void); + static void separator_string (const char *optarg_S); +@@ -451,7 +502,7 @@ static COLUMN *column_vector; + we store the leftmost columns contiguously in buff. 
+ To print a line from buff, get the index of the first character + from line_vector[i], and print up to line_vector[i + 1]. */ +-static char *buff; ++static unsigned char *buff; + + /* Index of the position in buff where the next character + will be stored. */ +@@ -555,7 +606,7 @@ static int chars_per_column; + static bool untabify_input = false; + + /* (-e) The input tab character. */ +-static char input_tab_char = '\t'; ++static char input_tab_char[MB_LEN_MAX] = "\t"; + + /* (-e) Tabstops are at chars_per_tab, 2*chars_per_tab, 3*chars_per_tab, ... + where the leftmost column is 1. */ +@@ -565,7 +616,10 @@ static int chars_per_input_tab = 8; + static bool tabify_output = false; + + /* (-i) The output tab character. */ +-static char output_tab_char = '\t'; ++static char output_tab_char[MB_LEN_MAX] = "\t"; ++ ++/* (-i) The byte length of output tab character. */ ++static int output_tab_char_length = 1; + + /* (-i) The width of the output tab. */ + static int chars_per_output_tab = 8; +@@ -639,7 +693,13 @@ static int power_10; + static bool numbered_lines = false; + + /* (-n) Character which follows each line number. */ +-static char number_separator = '\t'; ++static char number_separator[MB_LEN_MAX] = "\t"; ++ ++/* (-n) The byte length of the character which follows each line number. */ ++static int number_separator_length = 1; ++ ++/* (-n) The character width of the character which follows each line number. */ ++static int number_separator_width = 0; + + /* (-n) line counting starts with 1st line of input file (not with 1st + line of 1st page printed). */ +@@ -692,6 +752,7 @@ static bool use_col_separator = false; + -a|COLUMN|-m is a `space' and with the -J option a `tab'. 
*/ + static char *col_sep_string = (char *) ""; + static int col_sep_length = 0; ++static int col_sep_width = 0; + static char *column_separator = (char *) " "; + static char *line_separator = (char *) "\t"; + +@@ -848,6 +909,13 @@ separator_string (const char *optarg_S) + col_sep_length = (int) strlen (optarg_S); + col_sep_string = xmalloc (col_sep_length + 1); + strcpy (col_sep_string, optarg_S); ++ ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ col_sep_width = mbswidth (col_sep_string, 0); ++ else ++#endif ++ col_sep_width = col_sep_length; + } + + int +@@ -872,6 +940,21 @@ main (int argc, char **argv) + + atexit (close_stdout); + ++/* Define which functions are used, the ones for single byte locale or the ones ++ for multibyte locale. */ ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ print_char = print_char_multi; ++ char_to_clump = char_to_clump_multi; ++ } ++ else ++#endif ++ { ++ print_char = print_char_single; ++ char_to_clump = char_to_clump_single; ++ } ++ + n_files = 0; + file_names = (argc > 1 + ? xmalloc ((argc - 1) * sizeof (char *)) +@@ -948,8 +1031,12 @@ main (int argc, char **argv) + break; + case 'e': + if (optarg) +- getoptarg (optarg, 'e', &input_tab_char, +- &chars_per_input_tab); ++ { ++ int dummy_length, dummy_width; ++ ++ getoptarg (optarg, 'e', input_tab_char, &dummy_length, ++ &dummy_width, &chars_per_input_tab); ++ } + /* Could check tab width > 0. */ + untabify_input = true; + break; +@@ -962,8 +1049,12 @@ main (int argc, char **argv) + break; + case 'i': + if (optarg) +- getoptarg (optarg, 'i', &output_tab_char, +- &chars_per_output_tab); ++ { ++ int dummy_width; ++ ++ getoptarg (optarg, 'i', output_tab_char, &output_tab_char_length, ++ &dummy_width, &chars_per_output_tab); ++ } + /* Could check tab width > 0. 
*/ + tabify_output = true; + break; +@@ -990,8 +1081,8 @@ main (int argc, char **argv) + case 'n': + numbered_lines = true; + if (optarg) +- getoptarg (optarg, 'n', &number_separator, +- &chars_per_number); ++ getoptarg (optarg, 'n', number_separator, &number_separator_length, ++ &number_separator_width, &chars_per_number); + break; + case 'N': + skip_count = false; +@@ -1031,6 +1122,7 @@ main (int argc, char **argv) + /* Reset an additional input of -s, -S dominates -s */ + col_sep_string = bad_cast (""); + col_sep_length = 0; ++ col_sep_width = 0; + use_col_separator = true; + if (optarg) + separator_string (optarg); +@@ -1187,10 +1279,45 @@ main (int argc, char **argv) + a number. */ + + static void +-getoptarg (char *arg, char switch_char, char *character, int *number) ++getoptarg (char *arg, char switch_char, char *character, int *character_length, ++ int *character_width, int *number) + { + if (!ISDIGIT (*arg)) +- *character = *arg++; ++ { ++#ifdef HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) /* for multibyte locale. */ ++ { ++ wchar_t wc; ++ size_t mblength; ++ int width; ++ mbstate_t state = {'\0'}; ++ ++ mblength = mbrtowc (&wc, arg, strlen (arg), &state); ++ ++ if (mblength == (size_t) -1 || mblength == (size_t) -2) ++ { ++ *character_length = 1; ++ *character_width = 1; ++ } ++ else ++ { ++ *character_length = (mblength < 1) ? 1 : mblength; ++ width = wcwidth (wc); ++ *character_width = (width < 0) ? 0 : width; ++ } ++ ++ strncpy (character, arg, *character_length); ++ arg += *character_length; ++ } ++ else /* for single byte locale. 
*/ ++#endif ++ { ++ *character = *arg++; ++ *character_length = 1; ++ *character_width = 1; ++ } ++ } ++ + if (*arg) + { + long int tmp_long; +@@ -1249,7 +1376,7 @@ init_parameters (int number_of_files) + else + col_sep_string = column_separator; + +- col_sep_length = 1; ++ col_sep_length = col_sep_width = 1; + use_col_separator = true; + } + /* It's rather pointless to define a TAB separator with column +@@ -1280,11 +1407,11 @@ init_parameters (int number_of_files) + TAB_WIDTH (chars_per_input_tab, chars_per_number); */ + + /* Estimate chars_per_text without any margin and keep it constant. */ +- if (number_separator == '\t') ++ if (number_separator[0] == '\t') + number_width = chars_per_number + + TAB_WIDTH (chars_per_default_tab, chars_per_number); + else +- number_width = chars_per_number + 1; ++ number_width = chars_per_number + number_separator_width; + + /* The number is part of the column width unless we are + printing files in parallel. */ +@@ -1299,7 +1426,7 @@ init_parameters (int number_of_files) + } + + chars_per_column = (chars_per_line - chars_used_by_number - +- (columns - 1) * col_sep_length) / columns; ++ (columns - 1) * col_sep_width) / columns; + + if (chars_per_column < 1) + error (EXIT_FAILURE, 0, _("page width too narrow")); +@@ -1424,7 +1551,7 @@ init_funcs (void) + + /* Enlarge p->start_position of first column to use the same form of + padding_not_printed with all columns. */ +- h = h + col_sep_length; ++ h = h + col_sep_width; + + /* This loop takes care of all but the rightmost column. 
*/ + +@@ -1458,7 +1585,7 @@ init_funcs (void) + } + else + { +- h = h_next + col_sep_length; ++ h = h_next + col_sep_width; + h_next = h + chars_per_column; + } + } +@@ -1748,9 +1875,9 @@ static void + align_column (COLUMN *p) + { + padding_not_printed = p->start_position; +- if (padding_not_printed - col_sep_length > 0) ++ if (padding_not_printed - col_sep_width > 0) + { +- pad_across_to (padding_not_printed - col_sep_length); ++ pad_across_to (padding_not_printed - col_sep_width); + padding_not_printed = ANYWHERE; + } + +@@ -2021,13 +2148,13 @@ store_char (char c) + /* May be too generous. */ + buff = X2REALLOC (buff, &buff_allocated); + } +- buff[buff_current++] = c; ++ buff[buff_current++] = (unsigned char) c; + } + + static void + add_line_number (COLUMN *p) + { +- int i; ++ int i, j; + char *s; + int left_cut; + +@@ -2050,22 +2177,24 @@ add_line_number (COLUMN *p) + /* Tabification is assumed for multiple columns, also for n-separators, + but `default n-separator = TAB' hasn't been given priority over + equal column_width also specified by POSIX. */ +- if (number_separator == '\t') ++ if (number_separator[0] == '\t') + { + i = number_width - chars_per_number; + while (i-- > 0) + (p->char_func) (' '); + } + else +- (p->char_func) (number_separator); ++ for (j = 0; j < number_separator_length; j++) ++ (p->char_func) (number_separator[j]); + } + else + /* To comply with POSIX, we avoid any expansion of default TAB + separator with a single column output. No column_width requirement + has to be considered. 
*/ + { +- (p->char_func) (number_separator); +- if (number_separator == '\t') ++ for (j = 0; j < number_separator_length; j++) ++ (p->char_func) (number_separator[j]); ++ if (number_separator[0] == '\t') + output_position = POS_AFTER_TAB (chars_per_output_tab, + output_position); + } +@@ -2226,7 +2355,7 @@ print_white_space (void) + while (goal - h_old > 1 + && (h_new = POS_AFTER_TAB (chars_per_output_tab, h_old)) <= goal) + { +- putchar (output_tab_char); ++ fwrite (output_tab_char, 1, output_tab_char_length, stdout); + h_old = h_new; + } + while (++h_old <= goal) +@@ -2246,6 +2375,7 @@ print_sep_string (void) + { + char *s; + int l = col_sep_length; ++ int not_space_flag; + + s = col_sep_string; + +@@ -2259,6 +2389,7 @@ print_sep_string (void) + { + for (; separators_not_printed > 0; --separators_not_printed) + { ++ not_space_flag = 0; + while (l-- > 0) + { + /* 3 types of sep_strings: spaces only, spaces and chars, +@@ -2272,12 +2403,15 @@ print_sep_string (void) + } + else + { ++ not_space_flag = 1; + if (spaces_not_printed > 0) + print_white_space (); + putchar (*s++); +- ++output_position; + } + } ++ if (not_space_flag) ++ output_position += col_sep_width; ++ + /* sep_string ends with some spaces */ + if (spaces_not_printed > 0) + print_white_space (); +@@ -2304,8 +2438,9 @@ print_clump (COLUMN *p, int n, char *clu + a nonspace is encountered, call print_white_space() to print the + required number of tabs and spaces. 
*/ + ++ + static void +-print_char (char c) ++print_char_single (char c) + { + if (tabify_output) + { +@@ -2329,6 +2464,75 @@ print_char (char c) + putchar (c); + } + ++#ifdef HAVE_MBRTOWC ++static void ++print_char_multi (char c) ++{ ++ static size_t mbc_pos = 0; ++ static unsigned char mbc[MB_LEN_MAX] = {'\0'}; ++ static mbstate_t state = {'\0'}; ++ mbstate_t state_bak; ++ wchar_t wc; ++ unsigned char uc = (unsigned char) c; ++ size_t mblength; ++ int width; ++ ++ if (tabify_output) ++ { ++ state_bak = state; ++ mbc[mbc_pos++] = uc; ++ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); ++ ++ while (mbc_pos > 0) ++ { ++ switch (mblength) ++ { ++ case (size_t) -2: ++ state = state_bak; ++ return; ++ ++ case (size_t) -1: ++ state = state_bak; ++ ++output_position; ++ putchar (mbc[0]); ++ memmove (mbc, mbc + 1, MB_CUR_MAX - 1); ++ --mbc_pos; ++ break; ++ ++ case 0: ++ mblength = 1; ++ ++ default: ++ if (wc == L' ') ++ { ++ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); ++ --mbc_pos; ++ ++spaces_not_printed; ++ return; ++ } ++ else if (spaces_not_printed > 0) ++ print_white_space (); ++ ++ /* Nonprintables are assumed to have width 0, except L'\b'. */ ++ if ((width = wcwidth (wc)) < 1) ++ { ++ if (wc == L'\b') ++ --output_position; ++ } ++ else ++ output_position += width; ++ ++ fwrite (mbc, 1, mblength, stdout); ++ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); ++ mbc_pos -= mblength; ++ } ++ } ++ return; ++ } ++ putchar (uc); ++} ++#endif ++ + /* Skip to page PAGE before printing. + PAGE may be larger than total number of pages. 
*/ + +@@ -2506,9 +2710,9 @@ read_line (COLUMN *p) + align_empty_cols = false; + } + +- if (padding_not_printed - col_sep_length > 0) ++ if (padding_not_printed - col_sep_width > 0) + { +- pad_across_to (padding_not_printed - col_sep_length); ++ pad_across_to (padding_not_printed - col_sep_width); + padding_not_printed = ANYWHERE; + } + +@@ -2609,9 +2813,9 @@ print_stored (COLUMN *p) + } + } + +- if (padding_not_printed - col_sep_length > 0) ++ if (padding_not_printed - col_sep_width > 0) + { +- pad_across_to (padding_not_printed - col_sep_length); ++ pad_across_to (padding_not_printed - col_sep_width); + padding_not_printed = ANYWHERE; + } + +@@ -2624,8 +2828,8 @@ print_stored (COLUMN *p) + if (spaces_not_printed == 0) + { + output_position = p->start_position + end_vector[line]; +- if (p->start_position - col_sep_length == chars_per_margin) +- output_position -= col_sep_length; ++ if (p->start_position - col_sep_width == chars_per_margin) ++ output_position -= col_sep_width; + } + + return true; +@@ -2643,8 +2847,9 @@ print_stored (COLUMN *p) + characters in clump_buff. (e.g, the width of '\b' is -1, while the + number of characters is 1.) 
*/ + ++ + static int +-char_to_clump (char c) ++char_to_clump_single (char c) + { + unsigned char uc = c; + char *s = clump_buff; +@@ -2654,10 +2859,10 @@ char_to_clump (char c) + int chars; + int chars_per_c = 8; + +- if (c == input_tab_char) ++ if (c == input_tab_char[0]) + chars_per_c = chars_per_input_tab; + +- if (c == input_tab_char || c == '\t') ++ if (c == input_tab_char[0] || c == '\t') + { + width = TAB_WIDTH (chars_per_c, input_position); + +@@ -2738,6 +2943,155 @@ char_to_clump (char c) + return chars; + } + ++#ifdef HAVE_MBRTOWC ++static int ++char_to_clump_multi (char c) ++{ ++ static size_t mbc_pos = 0; ++ static unsigned char mbc[MB_LEN_MAX] = {'\0'}; ++ static mbstate_t state = {'\0'}; ++ mbstate_t state_bak; ++ wchar_t wc; ++ unsigned char uc = (unsigned char) c; ++ size_t mblength; ++ int wc_width; ++ register char *s = clump_buff; ++ register int i, j; ++ char esc_buff[4]; ++ int width; ++ int chars; ++ int chars_per_c = 8; ++ ++ state_bak = state; ++ mbc[mbc_pos++] = uc; ++ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); ++ ++ width = 0; ++ chars = 0; ++ while (mbc_pos > 0) ++ { ++ switch (mblength) ++ { ++ case (size_t) -2: ++ state = state_bak; ++ return 0; ++ ++ case (size_t) -1: ++ state = state_bak; ++ mblength = 1; ++ ++ if (use_esc_sequence || use_cntrl_prefix) ++ { ++ width = +4; ++ chars = +4; ++ *s++ = '\\'; ++ sprintf (esc_buff, "%03o", mbc[0]); ++ for (i = 0; i <= 2; ++i) ++ *s++ = (int) esc_buff[i]; ++ } ++ else ++ { ++ width += 1; ++ chars += 1; ++ *s++ = mbc[0]; ++ } ++ break; ++ ++ case 0: ++ mblength = 1; ++ /* Fall through */ ++ ++ default: ++ if (memcmp (mbc, input_tab_char, mblength) == 0) ++ chars_per_c = chars_per_input_tab; ++ ++ if (memcmp (mbc, input_tab_char, mblength) == 0 || c == '\t') ++ { ++ int width_inc; ++ ++ width_inc = TAB_WIDTH (chars_per_c, input_position); ++ width += width_inc; ++ ++ if (untabify_input) ++ { ++ for (i = width_inc; i; --i) ++ *s++ = ' '; ++ chars += width_inc; ++ } ++ else ++ { ++ for (i = 
0; i < mblength; i++) ++ *s++ = mbc[i]; ++ chars += mblength; ++ } ++ } ++ else if ((wc_width = wcwidth (wc)) < 1) ++ { ++ if (use_esc_sequence) ++ { ++ for (i = 0; i < mblength; i++) ++ { ++ width += 4; ++ chars += 4; ++ *s++ = '\\'; ++ sprintf (esc_buff, "%03o", uc); ++ for (j = 0; j <= 2; ++j) ++ *s++ = (int) esc_buff[j]; ++ } ++ } ++ else if (use_cntrl_prefix) ++ { ++ if (wc < 0200) ++ { ++ width += 2; ++ chars += 2; ++ *s++ = '^'; ++ *s++ = wc ^ 0100; ++ } ++ else ++ { ++ for (i = 0; i < mblength; i++) ++ { ++ width += 4; ++ chars += 4; ++ *s++ = '\\'; ++ sprintf (esc_buff, "%03o", uc); ++ for (j = 0; j <= 2; ++j) ++ *s++ = (int) esc_buff[j]; ++ } ++ } ++ } ++ else if (wc == L'\b') ++ { ++ width += -1; ++ chars += 1; ++ *s++ = c; ++ } ++ else ++ { ++ width += 0; ++ chars += mblength; ++ for (i = 0; i < mblength; i++) ++ *s++ = mbc[i]; ++ } ++ } ++ else ++ { ++ width += wc_width; ++ chars += mblength; ++ for (i = 0; i < mblength; i++) ++ *s++ = mbc[i]; ++ } ++ } ++ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); ++ mbc_pos -= mblength; ++ } ++ ++ input_position += width; ++ return chars; ++} ++#endif ++ + /* We've just printed some files and need to clean up things before + looking for more options and printing the next batch of files. + +Index: src/sort.c +=================================================================== +--- coreutils-7.1/src/sort.c.orig 2009-01-30 19:46:06.000000000 +0100 ++++ coreutils-7.1/src/sort.c 2010-06-29 18:51:17.203522566 +0200 +@@ -26,6 +26,19 @@ + #include + #include + #include ++#include ++ ++/* Get mbstate_t, mbrtowc(), wcrtomb(). */ ++#if HAVE_WCHAR_H ++# include ++#endif ++ ++/* Get iswprint(), iswctype() towupper(). 
*/ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++wctype_t blank_type; /* = wctype ("blank"); */ ++#endif ++ + #include "system.h" + #include "argmatch.h" + #include "error.h" +@@ -53,6 +66,17 @@ struct rlimit { size_t rlim_cur; }; + # define getrlimit(Resource, Rlp) (-1) + #endif + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX == 1 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "sort" + +@@ -121,14 +145,38 @@ static int decimal_point; + /* Thousands separator; if -1, then there isn't one. */ + static int thousands_sep; + ++static int force_general_numcompare = 0; ++ + /* Nonzero if the corresponding locales are hard. */ + static bool hard_LC_COLLATE; +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + static bool hard_LC_TIME; + #endif + + #define NONZERO(x) ((x) != 0) + ++/* get a multibyte character's byte length. */ ++#define GET_BYTELEN_OF_CHAR(LIM, PTR, MBLENGTH, STATE) \ ++ do \ ++ { \ ++ wchar_t wc; \ ++ mbstate_t state_bak; \ ++ \ ++ state_bak = STATE; \ ++ mblength = mbrtowc (&wc, PTR, LIM - PTR, &STATE); \ ++ \ ++ switch (MBLENGTH) \ ++ { \ ++ case (size_t)-1: \ ++ case (size_t)-2: \ ++ STATE = state_bak; \ ++ /* Fall through. */ \ ++ case 0: \ ++ MBLENGTH = 1; \ ++ } \ ++ } \ ++ while (0) ++ + /* The kind of blanks for '-b' to skip in various options. */ + enum blanktype { bl_start, bl_end, bl_both }; + +@@ -264,13 +312,11 @@ static bool reverse; + they were read if all keys compare equal. */ + static bool stable; + +-/* If TAB has this value, blanks separate fields. */ +-enum { TAB_DEFAULT = CHAR_MAX + 1 }; +- +-/* Tab character separating fields.
If TAB_DEFAULT, then fields are +- separated by the empty string between a non-blank character and a blank ++/* Tab character separating fields. If NULL, then fields are separated by ++ the empty string between a non-blank character and a blank + character. */ +-static int tab = TAB_DEFAULT; ++static const char *tab; ++static size_t tab_length = 1; + + /* Flag to remove consecutive duplicate lines from the output. + Only the last of a sequence of equal lines will be output. */ +@@ -702,6 +748,43 @@ reap_some (void) + update_proc (pid); + } + ++/* Function pointers. */ ++static char * ++(* begfield) (const struct line *line, const struct keyfield *key); ++ ++static char * ++(* limfield) (const struct line *line, const struct keyfield *key); ++ ++static int ++(*getmonth) (const char *s, size_t len); ++ ++static int ++(* keycompare) (const struct line *a, const struct line *b); ++ ++/* Test for white space multibyte character. ++ Set LENGTH the byte length of investigated multibyte character. */ ++#if HAVE_MBRTOWC ++static int ++ismbblank (const char *str, size_t *length) ++{ ++ size_t mblength; ++ wchar_t wc; ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ mblength = mbrtowc (&wc, str, MB_LEN_MAX, &state); ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ *length = 1; ++ return 0; ++ } ++ ++ *length = (mblength < 1) ? 1 : mblength; ++ return (iswctype (wc, blank_type)); ++} ++#endif ++ + /* Clean up any remaining temporary files. */ + + static void +@@ -1042,7 +1125,7 @@ zaptemp (const char *name) + free (node); + } + +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + + static int + struct_month_cmp (const void *m1, const void *m2) +@@ -1069,7 +1152,7 @@ inittables (void) + fold_toupper[i] = toupper (i); + } + +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + /* If we're not in the "C" locale, read different names for months.
*/ + if (hard_LC_TIME) + { +@@ -1151,6 +1234,71 @@ specify_nmerge (int oi, char c, char con + xstrtol_fatal (e, oi, c, long_options, s); + } + ++#if HAVE_MBRTOWC ++static void ++inittables_mb (void) ++{ ++ int i, j, k, l; ++ char *name, *s; ++ size_t s_len, mblength; ++ char mbc[MB_LEN_MAX]; ++ wchar_t wc, pwc; ++ mbstate_t state_mb, state_wc; ++ ++ for (i = 0; i < MONTHS_PER_YEAR; i++) ++ { ++ s = (char *) nl_langinfo (ABMON_1 + i); ++ s_len = strlen (s); ++ monthtab[i].name = name = (char *) xmalloc (s_len + 1); ++ monthtab[i].val = i + 1; ++ ++ memset (&state_mb, '\0', sizeof (mbstate_t)); ++ memset (&state_wc, '\0', sizeof (mbstate_t)); ++ ++ for (j = 0; j < s_len;) ++ { ++ if (!ismbblank (s + j, &mblength)) ++ break; ++ j += mblength; ++ } ++ ++ for (k = 0; j < s_len;) ++ { ++ mblength = mbrtowc (&wc, (s + j), (s_len - j), &state_mb); ++ /* If conversion is failed, fall back into single byte sorting. */ ++ if (mblength == (size_t) -1 || mblength == (size_t) -2) ++ { ++ for (l = 0; l <= i; l++) ++ free ((void *) monthtab[l].name); ++ inittables(); ++ return; ++ } ++ else if (mblength == 0) ++ break; ++ ++ pwc = towupper (wc); ++ if (pwc == wc) ++ { ++ memcpy (mbc, s + j, mblength); ++ j += mblength; ++ } ++ else ++ { ++ j += mblength; ++ mblength = wcrtomb (mbc, wc, &state_wc); ++ assert (mblength != (size_t) 0 && mblength != (size_t) -1); ++ } ++ ++ for (l = 0; l < mblength; l++) ++ name[k++] = mbc[l]; ++ } ++ name[k] = '\0'; ++ } ++ qsort ((void *) monthtab, MONTHS_PER_YEAR, ++ sizeof *monthtab, struct_month_cmp); ++} ++#endif ++ + /* Specify the amount of main memory to use when sorting. */ + static void + specify_sort_size (int oi, char c, char const *s) +@@ -1361,7 +1509,7 @@ buffer_linelim (struct buffer const *buf + by KEY in LINE. 
*/ + + static char * +-begfield (const struct line *line, const struct keyfield *key) ++begfield_uni (const struct line *line, const struct keyfield *key) + { + char *ptr = line->text, *lim = ptr + line->length - 1; + size_t sword = key->sword; +@@ -1371,10 +1519,10 @@ begfield (const struct line *line, const + /* The leading field separator itself is included in a field when -t + is absent. */ + +- if (tab != TAB_DEFAULT) ++ if (tab != NULL) + while (ptr < lim && sword--) + { +- while (ptr < lim && *ptr != tab) ++ while (ptr < lim && *ptr != tab[0]) + ++ptr; + if (ptr < lim) + ++ptr; +@@ -1402,11 +1550,70 @@ begfield (const struct line *line, const + return ptr; + } + ++#if HAVE_MBRTOWC ++static char * ++begfield_mb (const struct line *line, const struct keyfield *key) ++{ ++ int i; ++ char *ptr = line->text, *lim = ptr + line->length - 1; ++ size_t sword = key->sword; ++ size_t schar = key->schar; ++ size_t mblength; ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ if (tab != NULL) ++ while (ptr < lim && sword--) ++ { ++ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ } ++ else ++ while (ptr < lim && sword--) ++ { ++ while (ptr < lim && ismbblank (ptr, &mblength)) ++ ptr += mblength; ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ while (ptr < lim && !ismbblank (ptr, &mblength)) ++ ptr += mblength; ++ } ++ ++ if (key->skipsblanks) ++ while (ptr < lim && ismbblank (ptr, &mblength)) ++ ptr += mblength; ++ ++ for (i = 0; i < schar; i++) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ++ if (ptr + mblength > lim) ++ break; ++ else ++ ptr += mblength; ++ } ++ ++ return ptr; ++} ++#endif ++ + /* Return the limit of (a pointer to the first character after) the field + in LINE specified by 
KEY. */ + + static char * +-limfield (const struct line *line, const struct keyfield *key) ++limfield_uni (const struct line *line, const struct keyfield *key) + { + char *ptr = line->text, *lim = ptr + line->length - 1; + size_t eword = key->eword, echar = key->echar; +@@ -1419,10 +1626,10 @@ limfield (const struct line *line, const + `beginning' is the first character following the delimiting TAB. + Otherwise, leave PTR pointing at the first `blank' character after + the preceding field. */ +- if (tab != TAB_DEFAULT) ++ if (tab != NULL) + while (ptr < lim && eword--) + { +- while (ptr < lim && *ptr != tab) ++ while (ptr < lim && *ptr != tab[0]) + ++ptr; + if (ptr < lim && (eword | echar)) + ++ptr; +@@ -1468,7 +1675,7 @@ limfield (const struct line *line, const + */ + + /* Make LIM point to the end of (one byte past) the current field. */ +- if (tab != TAB_DEFAULT) ++ if (tab != NULL) + { + char *newlim; + newlim = memchr (ptr, tab, lim - ptr); +@@ -1504,6 +1711,107 @@ limfield (const struct line *line, const + return ptr; + } + ++#if HAVE_MBRTOWC ++static char * ++limfield_mb (const struct line *line, const struct keyfield *key) ++{ ++ char *ptr = line->text, *lim = ptr + line->length - 1; ++ size_t eword = key->eword, echar = key->echar; ++ int i; ++ size_t mblength; ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ if (tab != NULL) ++ while (ptr < lim && eword--) ++ { ++ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ } ++ else ++ while (ptr < lim && eword--) ++ { ++ while (ptr < lim && ismbblank (ptr, &mblength)) ++ ptr += mblength; ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ while (ptr < lim && !ismbblank (ptr, &mblength)) ++ ptr += mblength; ++ } ++ ++# ifdef POSIX_UNSPECIFIED ++ ++ /* 
Make LIM point to the end of (one byte past) the current field. */ ++ if (tab != NULL) ++ { ++ char *newlim, *p; ++ ++ newlim = NULL; ++ for (p = ptr; p < lim;) ++ { ++ if (memcmp (p, tab, tab_length) == 0) ++ { ++ newlim = p; ++ break; ++ } ++ ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ p += mblength; ++ } ++ } ++ else ++ { ++ char *newlim; ++ newlim = ptr; ++ ++ while (newlim < lim && ismbblank (newlim, &mblength)) ++ newlim += mblength; ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ while (newlim < lim && !ismbblank (newlim, &mblength)) ++ newlim += mblength; ++ lim = newlim; ++ } ++# endif ++ ++ /* If we're skipping leading blanks, don't start counting characters ++ until after skipping past any leading blanks. */ ++ if (key->skipeblanks) ++ while (ptr < lim && ismbblank (ptr, &mblength)) ++ ptr += mblength; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ /* Advance PTR by ECHAR (if possible), but no further than LIM. */ ++ for (i = 0; i < echar; i++) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ++ if (ptr + mblength > lim) ++ break; ++ else ++ ptr += mblength; ++ } ++ ++ return ptr; ++} ++#endif ++ + /* Fill BUF reading from FP, moving buf->left bytes from the end + of buf->buf to the beginning first. If EOF is reached and the + file wasn't terminated by a newline, supply one. 
Set up BUF's line +@@ -1586,8 +1894,22 @@ fillbuf (struct buffer *buf, FILE *fp, c + else + { + if (key->skipsblanks) +- while (blanks[to_uchar (*line_start)]) +- line_start++; ++ { ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ size_t mblength; ++ ++ while (ismbblank (line_start, &mblength)) ++ line_start += mblength; ++ } ++ else ++#endif ++ { ++ while (blanks[to_uchar (*line_start)]) ++ line_start++; ++ } ++ } + line->keybeg = line_start; + } + } +@@ -1642,15 +1964,59 @@ general_numcompare (const char *sa, cons + /* FIXME: maybe add option to try expensive FP conversion + only if A and B can't be compared more cheaply/accurately. */ + +- char *ea; +- char *eb; +- double a = strtod (sa, &ea); +- double b = strtod (sb, &eb); ++ char *bufa, *ea; ++ char *bufb, *eb; ++ double a; ++ double b; ++ ++ char *p; ++ struct lconv *lconvp = localeconv (); ++ size_t thousands_sep_len = strlen (lconvp->thousands_sep); ++ ++ bufa = (char *) xmalloc (strlen (sa) + 1); ++ bufb = (char *) xmalloc (strlen (sb) + 1); ++ strcpy (bufa, sa); ++ strcpy (bufb, sb); ++ ++ if (force_general_numcompare) ++ { ++ while (1) ++ { ++ a = strtod (bufa, &ea); ++ if (memcmp (ea, lconvp->thousands_sep, thousands_sep_len) == 0) ++ { ++ for (p = ea; *(p + thousands_sep_len) != '\0'; p++) ++ *p = *(p + thousands_sep_len); ++ *p = '\0'; ++ continue; ++ } ++ break; ++ } ++ ++ while (1) ++ { ++ b = strtod (bufb, &eb); ++ if (memcmp (eb, lconvp->thousands_sep, thousands_sep_len) == 0) ++ { ++ for (p = eb; *(p + thousands_sep_len) != '\0'; p++) ++ *p = *(p + thousands_sep_len); ++ *p = '\0'; ++ continue; ++ } ++ break; ++ } ++ } ++ else ++ { ++ a = strtod (bufa, &ea); ++ b = strtod (bufb, &eb); ++ } ++ + + /* Put conversion errors at the start of the collating sequence. */ +- if (sa == ea) +- return sb == eb ? 0 : -1; +- if (sb == eb) ++ if (bufa == ea) ++ return bufb == eb ? 0 : -1; ++ if (bufb == eb) + return 1; + + /* Sort numbers in the usual way, where -0 == +0. 
Put NaNs after +@@ -1668,7 +2034,7 @@ general_numcompare (const char *sa, cons + Return 0 if the name in S is not recognized. */ + + static int +-getmonth (char const *month, size_t len) ++getmonth_uni (char const *month, size_t len) + { + size_t lo = 0; + size_t hi = MONTHS_PER_YEAR; +@@ -1849,11 +2215,79 @@ compare_version (char *restrict texta, s + return diff; + } + ++#if HAVE_MBRTOWC ++static int ++getmonth_mb (char const *s, size_t len) ++{ ++ char *month; ++ register size_t i; ++ register int lo = 0, hi = MONTHS_PER_YEAR, result; ++ char *tmp; ++ size_t wclength, mblength; ++ const char **pp; ++ const wchar_t **wpp; ++ wchar_t *month_wcs; ++ mbstate_t state; ++ ++ while (len > 0 && ismbblank (s, &mblength)) ++ { ++ s += mblength; ++ len -= mblength; ++ } ++ ++ if (len == 0) ++ return 0; ++ ++ month = (char *) alloca (len + 1); ++ ++ tmp = (char *) alloca (len + 1); ++ memcpy (tmp, s, len); ++ tmp[len] = '\0'; ++ pp = (const char **) &tmp; ++ month_wcs = (wchar_t *) alloca ((len + 1) * sizeof (wchar_t)); ++ memset (&state, '\0', sizeof (mbstate_t)); ++ ++ wclength = mbsrtowcs (month_wcs, pp, len + 1, &state); ++ assert (wclength != 1 && *pp == NULL); ++ ++ for (i = 0; i < wclength; i++) ++ { ++ month_wcs[i] = towupper (month_wcs[i]); ++ if (iswctype (month_wcs[i], blank_type)) ++ { ++ month_wcs[i] = L'\0'; ++ break; ++ } ++ } ++ ++ wpp = (const wchar_t **) &month_wcs; ++ ++ mblength = wcsrtombs (month, wpp, len + 1, &state); ++ assert (mblength != (-1) && *wpp == NULL); ++ ++ do ++ { ++ int ix = (lo + hi) / 2; ++ ++ if (strncmp (month, monthtab[ix].name, strlen (monthtab[ix].name)) < 0) ++ hi = ix; ++ else ++ lo = ix; ++ } ++ while (hi - lo > 1); ++ ++ result = (!strncmp (month, monthtab[lo].name, strlen (monthtab[lo].name)) ++ ? monthtab[lo].val : 0); ++ ++ return result; ++} ++#endif ++ + /* Compare two lines A and B trying every key in sequence until there + are no more keys or a difference is found. 
*/ + + static int +-keycompare (const struct line *a, const struct line *b) ++keycompare_uni (const struct line *a, const struct line *b) + { + struct keyfield const *key = keylist; + +@@ -2022,11 +2456,190 @@ keycompare (const struct line *a, const + + return 0; + +- greater: ++greater: ++ diff = 1; ++not_equal: ++ return key->reverse ? -diff : diff; ++} ++ ++#if HAVE_MBRTOWC ++static int ++keycompare_mb (const struct line *a, const struct line *b) ++{ ++ struct keyfield *key = keylist; ++ ++ /* For the first iteration only, the key positions have been ++ precomputed for us. */ ++ char *texta = a->keybeg; ++ char *textb = b->keybeg; ++ char *lima = a->keylim; ++ char *limb = b->keylim; ++ ++ size_t mblength_a, mblength_b; ++ wchar_t wc_a, wc_b; ++ mbstate_t state_a, state_b; ++ ++ int diff; ++ ++ memset (&state_a, '\0', sizeof (mbstate_t)); ++ memset (&state_b, '\0', sizeof (mbstate_t)); ++ ++ for (;;) ++ { ++ register char const *translate = key->translate; ++ register bool const *ignore = key->ignore; ++ ++ /* Find the lengths. */ ++ size_t lena = lima <= texta ? 0 : lima - texta; ++ size_t lenb = limb <= textb ? 0 : limb - textb; ++ ++ /* Actually compare the fields. */ ++ if (key->numeric | key->general_numeric) ++ { ++ char savea = *lima, saveb = *limb; ++ ++ *lima = *limb = '\0'; ++ if (force_general_numcompare) ++ diff = general_numcompare (texta, textb); ++ else ++ diff = ((key->numeric ? numcompare : general_numcompare) ++ (texta, textb)); ++ *lima = savea, *limb = saveb; ++ } ++ else if (key->version) ++ diff = compare_version (texta, lena, textb, lenb); ++ else if (key->month) ++ diff = getmonth (texta, lena) - getmonth (textb, lenb); ++ else ++ { ++ if (ignore || translate) ++ { ++ char buf[4000]; ++ size_t size = lena + 1 + lenb + 1; ++ char *copy_a = (size <= sizeof buf ? buf : xmalloc (size)); ++ char *copy_b = copy_a + lena + 1; ++ size_t new_len_a, new_len_b; ++ size_t i, j; ++ ++ /* Ignore and/or translate chars before comparing. 
*/ ++# define IGNORE_CHARS(NEW_LEN, LEN, TEXT, COPY, WC, MBLENGTH, STATE) \ ++ do \ ++ { \ ++ wchar_t uwc; \ ++ char mbc[MB_LEN_MAX]; \ ++ mbstate_t state_wc; \ ++ \ ++ for (NEW_LEN = i = 0; i < LEN;) \ ++ { \ ++ mbstate_t state_bak; \ ++ \ ++ state_bak = STATE; \ ++ MBLENGTH = mbrtowc (&WC, TEXT + i, LEN - i, &STATE); \ ++ \ ++ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1 \ ++ || MBLENGTH == 0) \ ++ { \ ++ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ ++ STATE = state_bak; \ ++ if (!ignore) \ ++ COPY[NEW_LEN++] = TEXT[i++]; \ ++ continue; \ ++ } \ ++ \ ++ if (ignore) \ ++ { \ ++ if ((ignore == nonprinting && !iswprint (WC)) \ ++ || (ignore == nondictionary \ ++ && !iswalnum (WC) && !iswctype (WC, blank_type))) \ ++ { \ ++ i += MBLENGTH; \ ++ continue; \ ++ } \ ++ } \ ++ \ ++ if (translate) \ ++ { \ ++ \ ++ uwc = toupper(WC); \ ++ if (WC == uwc) \ ++ { \ ++ memcpy (mbc, TEXT + i, MBLENGTH); \ ++ i += MBLENGTH; \ ++ } \ ++ else \ ++ { \ ++ i += MBLENGTH; \ ++ WC = uwc; \ ++ memset (&state_wc, '\0', sizeof (mbstate_t)); \ ++ \ ++ MBLENGTH = wcrtomb (mbc, WC, &state_wc); \ ++ assert (MBLENGTH != (size_t)-1 && MBLENGTH != 0); \ ++ } \ ++ \ ++ for (j = 0; j < MBLENGTH; j++) \ ++ COPY[NEW_LEN++] = mbc[j]; \ ++ } \ ++ else \ ++ for (j = 0; j < MBLENGTH; j++) \ ++ COPY[NEW_LEN++] = TEXT[i++]; \ ++ } \ ++ COPY[NEW_LEN] = '\0'; \ ++ } \ ++ while (0) ++ ++ IGNORE_CHARS (new_len_a, lena, texta, copy_a, ++ wc_a, mblength_a, state_a); ++ IGNORE_CHARS (new_len_b, lenb, textb, copy_b, ++ wc_b, mblength_b, state_b); ++ diff = xmemcoll (copy_a, new_len_a, copy_b, new_len_b); ++ ++ if (sizeof buf < size) ++ free (copy_a); ++ } ++ else if (lena == 0) ++ diff = - NONZERO (lenb); ++ else if (lenb == 0) ++ goto greater; ++ else ++ diff = xmemcoll (texta, lena, textb, lenb); ++ } ++ ++ if (diff) ++ goto not_equal; ++ ++ key = key->next; ++ if (! key) ++ break; ++ ++ /* Find the beginning and limit of the next field. 
*/ ++ if (key->eword != SIZE_MAX) ++ lima = limfield (a, key), limb = limfield (b, key); ++ else ++ lima = a->text + a->length - 1, limb = b->text + b->length - 1; ++ ++ if (key->sword != SIZE_MAX) ++ texta = begfield (a, key), textb = begfield (b, key); ++ else ++ { ++ texta = a->text, textb = b->text; ++ if (key->skipsblanks) ++ { ++ while (texta < lima && ismbblank (texta, &mblength_a)) ++ texta += mblength_a; ++ while (textb < limb && ismbblank (textb, &mblength_b)) ++ textb += mblength_b; ++ } ++ } ++ } ++ ++ return 0; ++ ++greater: + diff = 1; +- not_equal: ++not_equal: + return key->reverse ? -diff : diff; + } ++#endif + + /* Compare two lines A and B, returning negative, zero, or positive + depending on whether A compares less than, equal to, or greater than B. */ +@@ -2857,6 +3470,11 @@ set_ordering (const char *s, struct keyf + break; + case 'M': + key->month = true; ++#if HAVE_MBRTOWC ++ if (strcmp (setlocale (LC_CTYPE, NULL), setlocale (LC_TIME, NULL))) ++ error (0, 0, _("As LC_TIME differs from LC_CTYPE, the results may be strange.")); ++ inittables_mb (); ++#endif + break; + case 'n': + key->numeric = true; +@@ -2915,7 +3533,7 @@ main (int argc, char **argv) + initialize_exit_failure (SORT_FAILURE); + + hard_LC_COLLATE = hard_locale (LC_COLLATE); +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + hard_LC_TIME = hard_locale (LC_TIME); + #endif + +@@ -2928,14 +3546,40 @@ main (int argc, char **argv) + add support for multibyte decimal points. */ + decimal_point = to_uchar (locale->decimal_point[0]); + if (! decimal_point || locale->decimal_point[1]) +- decimal_point = '.'; ++ { ++ decimal_point = '.'; ++ if (locale->decimal_point[0] && locale->decimal_point[1]) ++ force_general_numcompare = 1; ++ } + + /* FIXME: add support for multibyte thousands separators. */ + thousands_sep = to_uchar (*locale->thousands_sep); + if (! 
thousands_sep || locale->thousands_sep[1]) +- thousands_sep = -1; ++ { ++ thousands_sep = -1; ++ if (locale->thousands_sep[0] && locale->thousands_sep[1]) ++ force_general_numcompare = 1; ++ } + } + ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ blank_type = wctype ("blank"); ++ begfield = begfield_mb; ++ limfield = limfield_mb; ++ getmonth = getmonth_mb; ++ keycompare = keycompare_mb; ++ } ++ else ++#endif ++ { ++ begfield = begfield_uni; ++ limfield = limfield_uni; ++ keycompare = keycompare_uni; ++ getmonth = getmonth_uni; ++ } ++ + have_read_stdin = false; + inittables (); + +@@ -3196,13 +3840,32 @@ main (int argc, char **argv) + + case 't': + { +- char newtab = optarg[0]; +- if (! newtab) ++ const char *newtab = optarg; ++ size_t newtab_length; ++ if (! newtab[0]) + error (SORT_FAILURE, 0, _("empty tab")); +- if (optarg[1]) ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ mbstate_t state; ++ ++ memset (&state, 0, sizeof (mbstate_t)); ++ newtab_length = mbrtowc (NULL, newtab, strlen (newtab), &state); ++ if (newtab_length == (size_t) 0 ++ || newtab_length == (size_t) -1 ++ || newtab_length == (size_t) -2) ++ newtab_length = 1; ++ } ++ else ++#endif ++ newtab_length = 1; ++ if (optarg[newtab_length]) + { + if (STREQ (optarg, "\\0")) +- newtab = '\0'; ++ { ++ newtab = "\0"; ++ newtab_length = 1; ++ } + else + { + /* Provoke with `sort -txx'. 
Complain about +@@ -3213,9 +3876,12 @@ main (int argc, char **argv) + quote (optarg)); + } + } +- if (tab != TAB_DEFAULT && tab != newtab) ++ if (tab != NULL ++ && (tab_length != newtab_length ++ || memcmp (tab, newtab, tab_length) != 0)) + error (SORT_FAILURE, 0, _("incompatible tabs")); + tab = newtab; ++ tab_length = newtab_length; + } + break; + +Index: src/unexpand.c +=================================================================== +--- coreutils-7.1/src/unexpand.c.orig 2008-11-10 14:17:52.000000000 +0100 ++++ coreutils-7.1/src/unexpand.c 2010-06-29 18:49:31.975522293 +0200 +@@ -38,11 +38,34 @@ + #include + #include + #include ++ ++/* Get mbstate_t, mbrtowc(), wcwidth() */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++/* Get iswblank */ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++ ++ ++/* A sentinel value that's placed at the end of the list of tab stops. ++ * This value must be a large number, but not so large that adding the ++ * length of a line to it would cause the column variable to overflow. */ ++#define TAB_STOP_SENTINEL INT_MAX ++ + #include "system.h" + #include "error.h" + #include "quote.h" + #include "xstrndup.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# undef MB_LEN_MAX ++# define MB_LEN_MAX 16 ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "unexpand" + +@@ -449,6 +472,237 @@ unexpand (void) + } + } + ++#if HAVE_MBRTOWC && HAVE_WCTYPE_H ++static void ++unexpand_multibyte (void) ++{ ++ /* Input stream. */ ++ FILE *fp = next_file (NULL); ++ ++ mbstate_t i_state; /* Current shift state of the input stream. */ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen = 0; /* The length of the byte sequence in buf. */ ++ ++ /* The array of pending blanks.
In non-POSIX locales, blanks can ++ include characters other than spaces, so the blanks must be ++ stored, not merely counted. */ ++ char *pending_blank; ++ ++ if (!fp) ++ return; ++ ++ /* The worst case is a non-blank character, then one blank, then a ++ tab stop, then MAX_COLUMN_WIDTH - 1 blanks, then a non-blank; so ++ allocate MAX_COLUMN_WIDTH bytes to store the blanks. */ ++ pending_blank = xmalloc (max_column_width); ++ ++ memset (&i_state, '\0', sizeof(mbstate_t)); ++ ++ for (;;) ++ { ++ /* A gotten wide character. */ ++ wint_t wc; ++ ++ /* If true, perform translations. */ ++ bool convert = true; ++ ++ /* The following variables have valid values only when CONVERT ++ is true: */ ++ ++ /* Column of next input character. */ ++ uintmax_t column = 0; ++ ++ /* Column the next input tab stop is on. */ ++ uintmax_t next_tab_column = 0; ++ ++ /* Index in TAB_LIST of next tab stop to examine. */ ++ size_t tab_index = 0; ++ ++ /* If true, the first pending blank came just before a tab stop. */ ++ bool one_blank_before_tab_stop = false; ++ ++ /* If true, the previous input character was a blank. This is ++ initially true, since initial strings of blanks are treated ++ as if the line was preceded by a blank. */ ++ bool prev_blank = true; ++ ++ /* Number of pending columns of blanks. */ ++ size_t pending = 0; ++ ++ /* Convert a line of text. */ ++ do ++ { ++ wchar_t w; ++ size_t mblength; /* The byte size of a multibyte character ++ which shows as same character as WC. */ ++ mbstate_t i_state_bak; /* Back up the I_STATE. 
*/ ++ ++ /* Fill buffer */ ++ if (buflen < MB_LEN_MAX) ++ { ++ if (!feof (fp) && !ferror (fp)) ++ { ++ if (buflen > 0) ++ memmove (buf, bufpos, buflen); ++ buflen += fread (buf + buflen, sizeof (char), BUFSIZ, fp); ++ bufpos = buf; ++ } ++ } ++ ++ if (buflen < 1) ++ { ++ /* Move to the next file */ ++ if (feof (fp) || ferror (fp)) ++ fp = next_file (fp); ++ if (!fp) ++ { ++ if (pending) ++ { ++ if (fwrite (pending_blank, 1, pending, stdout) != pending) ++ error (EXIT_FAILURE, errno, _("write error")); ++ } ++ free (pending_blank); ++ return; ++ } ++ continue; ++ } ++ ++ i_state_bak = i_state; ++ mblength = mbrtowc (&w, bufpos, buflen, &i_state); ++ wc = w; ++ ++ if (mblength == (size_t) -1 || mblength == (size_t) -2) ++ { ++ i_state = i_state_bak; ++ wc = L'\0'; ++ column += convert; ++ mblength = 1; ++ } ++ ++ if (convert) ++ { ++ bool blank = iswblank (wc); ++ ++ if (blank) ++ { ++ if (next_tab_column <= column) ++ { ++ if (tab_size) ++ next_tab_column = ++ column + (tab_size - column % tab_size); ++ else ++ for (;;) ++ if (tab_index == first_free_tab) ++ { ++ convert = false; ++ break; ++ } ++ else ++ { ++ uintmax_t tab = tab_list[tab_index++]; ++ if (column < tab) ++ { ++ next_tab_column = tab; ++ break; ++ } ++ } ++ } ++ ++ if (convert) ++ { ++ if (next_tab_column < column) ++ error (EXIT_FAILURE, 0, _("input line is too long")); ++ ++ if (wc == L'\t') ++ { ++ column = next_tab_column; ++ ++ /* Discard pending blanks, unless it was a single ++ blank just before the previous tab stop. */ ++ if (! (pending == 1 && one_blank_before_tab_stop)) ++ { ++ pending = 0; ++ one_blank_before_tab_stop = false; ++ } ++ } ++ else ++ { ++ column++; ++ ++ if (! (prev_blank && column == next_tab_column)) ++ { ++ /* It is not yet known whether the pending blanks ++ will be replaced by tabs. 
*/ ++ if (column == next_tab_column) ++ one_blank_before_tab_stop = true; ++ pending_blank[pending++] = ' '; ++ prev_blank = true; ++ buflen -= mblength; ++ bufpos += mblength; ++ continue; ++ } ++ ++ /* Replace the pending blanks by a tab or two. */ ++ pending_blank[0] = *bufpos = '\t'; ++ pending = one_blank_before_tab_stop; ++ } ++ } ++ } ++ else if (wc == L'\b') ++ { ++ /* Go back one column, and force recalculation of the ++ next tab stop. */ ++ column -= !!column; ++ next_tab_column = column; ++ tab_index -= !!tab_index; ++ } ++ else ++ { ++ if (!iswcntrl (wc)) ++ { ++ int width = wcwidth (wc); ++ if (width > 0) ++ { ++ if (column > (column + width)) ++ error (EXIT_FAILURE, 0, _("input line is too long")); ++ column += width; ++ } ++ } ++ } ++ ++ if (pending) ++ { ++ if (fwrite (pending_blank, 1, pending, stdout) != pending) ++ error (EXIT_FAILURE, errno, _("write error")); ++ pending = 0; ++ one_blank_before_tab_stop = false; ++ } ++ ++ prev_blank = blank; ++ convert &= convert_entire_line | blank; ++ } ++ ++ if (mblength) ++ { ++ if (fwrite (bufpos, sizeof (char), mblength, stdout) < mblength) ++ error (EXIT_FAILURE, errno, _("write error")); ++ } ++ else ++ { ++ if (putchar ('\0')) ++ error (EXIT_FAILURE, errno, _("write error")); ++ mblength = 1; ++ } ++ ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++ while (wc != L'\n'); ++ } ++} ++#endif ++ + int + main (int argc, char **argv) + { +@@ -527,7 +781,12 @@ main (int argc, char **argv) + + file_list = (optind < argc ? 
&argv[optind] : stdin_argv); + +- unexpand (); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ unexpand_multibyte (); ++ else ++#endif ++ unexpand (); + + if (have_read_stdin && fclose (stdin) != 0) + error (EXIT_FAILURE, errno, "-"); +Index: src/uniq.c +=================================================================== +--- coreutils-7.1/src/uniq.c.orig 2008-11-10 14:17:52.000000000 +0100 ++++ coreutils-7.1/src/uniq.c 2010-06-29 18:49:32.040030047 +0200 +@@ -22,6 +22,16 @@ + #include + #include + ++/* Get mbstate_t, mbrtowc(), wcrtomb() */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ ++/* Get iswctype(), wctype(), towupper)(. */ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++ + #include "system.h" + #include "argmatch.h" + #include "linebuffer.h" +@@ -32,6 +42,13 @@ + #include "xstrtol.h" + #include "memcasecmp.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# undef MB_LEN_MAX ++# define MB_LEN_MAX 16 ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "uniq" + +@@ -106,6 +123,12 @@ static enum delimit_method const delimit + /* Select whether/how to delimit groups of duplicate lines. */ + static enum delimit_method delimit_groups; + ++/* Function pointers. */ ++static char * (*find_field) (struct linebuffer *line); ++ ++/* Show the blank character class. */ ++wctype_t blank_type; ++ + static struct option const longopts[] = + { + {"count", no_argument, NULL, 'c'}, +@@ -202,7 +225,7 @@ size_opt (char const *opt, char const *m + return a pointer to the beginning of the line's field to be compared.
*/ + + static char * +-find_field (struct linebuffer const *line) ++find_field_uni (struct linebuffer const *line) + { + size_t count; + char const *lp = line->buffer; +@@ -223,6 +246,83 @@ find_field (struct linebuffer const *lin + return line->buffer + i; + } + ++#if HAVE_MBRTOWC ++ ++# define MBCHAR_TO_WCHAR(WC, MBLENGTH, LP, POS, SIZE, STATEP, CONVFAIL) \ ++ do \ ++ { \ ++ mbstate_t state_bak; \ ++ \ ++ CONVFAIL = 0; \ ++ state_bak = *STATEP; \ ++ \ ++ MBLENGTH = mbrtowc (&WC, LP + POS, SIZE - POS, STATEP); \ ++ \ ++ switch (MBLENGTH) \ ++ { \ ++ case (size_t)-2: \ ++ case (size_t)-1: \ ++ *STATEP = state_bak; \ ++ CONVFAIL++; \ ++ /* Fall through */ \ ++ case 0: \ ++ MBLENGTH = 1; \ ++ } \ ++ } \ ++ while (0) ++ ++static char * ++find_field_multi (struct linebuffer const *line) ++{ ++ size_t count; ++ char *lp = line->buffer; ++ size_t size = line->length - 1; ++ size_t pos; ++ size_t mblength; ++ wchar_t wc; ++ mbstate_t *statep; ++ int convfail; ++ ++ pos = 0; ++ statep = &line->state; ++ ++ /* skip fields. */ ++ for (count = 0; count < skip_fields && pos < size; count++) ++ { ++ while (pos < size) ++ { ++ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); ++ ++ if (convfail || !iswctype (wc, blank_type)) ++ { ++ pos += mblength; ++ break; ++ } ++ pos += mblength; ++ } ++ ++ while (pos < size) ++ { ++ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); ++ ++ if (!convfail && iswctype (wc, blank_type)) ++ break; ++ ++ pos += mblength; ++ } ++ } ++ ++ /* skip fields. */ ++ for (count = 0; count < skip_chars && pos < size; count++) ++ { ++ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); ++ pos += mblength; ++ } ++ ++ return lp + pos; ++} ++#endif ++ + /* Return false if two strings OLD and NEW match, true if not. + OLD and NEW point not to the beginnings of the lines + but rather to the beginnings of the fields to compare. 
+@@ -247,6 +347,73 @@ different (char *old, char *new, size_t + return oldlen != newlen || memcmp (old, new, oldlen); + } + ++#if HAVE_MBRTOWC ++static int ++different_multi (const char *old, const char *new, size_t oldlen, size_t newlen, mbstate_t oldstate, mbstate_t newstate) ++{ ++ size_t i, j, chars; ++ const char *str[2]; ++ char *copy[2]; ++ size_t len[2]; ++ mbstate_t state[2]; ++ size_t mblength; ++ wchar_t wc, uwc; ++ mbstate_t state_bak; ++ ++ str[0] = old; ++ str[1] = new; ++ len[0] = oldlen; ++ len[1] = newlen; ++ state[0] = oldstate; ++ state[1] = newstate; ++ ++ for (i = 0; i < 2; i++) ++ { ++ copy[i] = alloca (len[i] + 1); ++ ++ for (j = 0, chars = 0; j < len[i] && chars < check_chars; chars++) ++ { ++ state_bak = state[i]; ++ mblength = mbrtowc (&wc, str[i] + j, len[i] - j, &state[i]); ++ ++ switch (mblength) ++ { ++ case (size_t)-1: ++ case (size_t)-2: ++ state[i] = state_bak; ++ /* Fall through */ ++ case 0: ++ mblength = 1; ++ break; ++ ++ default: ++ if (ignore_case) ++ { ++ uwc = towupper (wc); ++ ++ if (uwc != wc) ++ { ++ mbstate_t state_wc; ++ ++ memset (&state_wc, '\0', sizeof (mbstate_t)); ++ wcrtomb (copy[i] + j, uwc, &state_wc); ++ } ++ else ++ memcpy (copy[i] + j, str[i] + j, mblength); ++ } ++ else ++ memcpy (copy[i] + j, str[i] + j, mblength); ++ } ++ j += mblength; ++ } ++ copy[i][j] = '\0'; ++ len[i] = j; ++ } ++ ++ return xmemcoll (copy[0], len[0], copy[1], len[1]); ++} ++#endif ++ + /* Output the line in linebuffer LINE to standard output + provided that the switches say it should be output. + MATCH is true if the line matches the previous line. 
+@@ -299,15 +466,42 @@ check_file (const char *infile, const ch + { + char *prevfield IF_LINT (= NULL); + size_t prevlen IF_LINT (= 0); ++#if HAVE_MBRTOWC ++ mbstate_t prevstate; + ++ memset (&prevstate, '\0', sizeof (mbstate_t)); ++#endif + while (!feof (stdin)) + { + char *thisfield; + size_t thislen; ++#if HAVE_MBRTOWC ++ mbstate_t thisstate; ++#endif + if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) + break; + thisfield = find_field (thisline); + thislen = thisline->length - 1 - (thisfield - thisline->buffer); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ thisstate = thisline->state; ++ ++ if (prevline->length == 0 ++ || different_multi (thisfield, prevfield, thislen, prevlen, ++ thisstate, prevstate)) ++ { ++ fwrite (thisline->buffer, sizeof (char), ++ thisline->length, stdout); ++ ++ SWAP_LINES (prevline, thisline); ++ prevfield = thisfield; ++ prevlen = thislen; ++ prevstate = thisstate; ++ } ++ } ++ else ++#endif + if (prevline->length == 0 + || different (thisfield, prevfield, thislen, prevlen)) + { +@@ -326,17 +520,26 @@ check_file (const char *infile, const ch + size_t prevlen; + uintmax_t match_count = 0; + bool first_delimiter = true; ++#if HAVE_MBRTOWC ++ mbstate_t prevstate; ++#endif + + if (readlinebuffer_delim (prevline, stdin, delimiter) == 0) + goto closefiles; + prevfield = find_field (prevline); + prevlen = prevline->length - 1 - (prevfield - prevline->buffer); ++#if HAVE_MBRTOWC ++ prevstate = prevline->state; ++#endif + + while (!feof (stdin)) + { + bool match; + char *thisfield; + size_t thislen; ++#if HAVE_MBRTOWC ++ mbstate_t thisstate; ++#endif + if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) + { + if (ferror (stdin)) +@@ -345,6 +548,15 @@ check_file (const char *infile, const ch + } + thisfield = find_field (thisline); + thislen = thisline->length - 1 - (thisfield - thisline->buffer); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ thisstate = thisline->state; ++ match = !different_multi (thisfield, 
prevfield, ++ thislen, prevlen, thisstate, prevstate); ++ } ++ else ++#endif + match = !different (thisfield, prevfield, thislen, prevlen); + match_count += match; + +@@ -377,6 +589,9 @@ check_file (const char *infile, const ch + SWAP_LINES (prevline, thisline); + prevfield = thisfield; + prevlen = thislen; ++#if HAVE_MBRTOWC ++ prevstate = thisstate; ++#endif + if (!match) + match_count = 0; + } +@@ -422,6 +637,18 @@ main (int argc, char **argv) + + atexit (close_stdout); + ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ find_field = find_field_multi; ++ blank_type = wctype ("blank"); ++ } ++ else ++#endif ++ { ++ find_field = find_field_uni; ++ } ++ + skip_chars = 0; + skip_fields = 0; + check_chars = SIZE_MAX; +Index: tests/misc/cut +=================================================================== +--- coreutils-7.1/tests/misc/cut.orig 2008-09-18 09:06:57.000000000 +0200 ++++ coreutils-7.1/tests/misc/cut 2010-06-29 18:49:32.091533700 +0200 +@@ -26,7 +26,7 @@ use strict; + my $prog = 'cut'; + my $try = "Try \`$prog --help' for more information.\n"; + my $from_1 = "$prog: fields and positions are numbered from 1\n$try"; +-my $inval = "$prog: invalid byte or field list\n$try"; ++my $inval = "$prog: invalid byte, character or field list\n$try"; + my $no_endpoint = "$prog: invalid range with no endpoint: -\n$try"; + + my @Tests = diff --git a/coreutils-5.3.0-sbin4su.patch b/coreutils-5.3.0-sbin4su.diff similarity index 90% rename from coreutils-5.3.0-sbin4su.patch rename to coreutils-5.3.0-sbin4su.diff index 3af4168..bf2cc6c 100644 --- a/coreutils-5.3.0-sbin4su.patch +++ b/coreutils-5.3.0-sbin4su.diff @@ -1,8 +1,8 @@ Index: src/su.c =================================================================== ---- src/su.c.orig 2010-05-05 14:46:48.000000000 +0200 -+++ src/su.c 2010-05-05 14:48:55.023359308 +0200 -@@ -454,6 +454,117 @@ correct_password (const struct passwd *p +--- src/su.c.orig 2010-05-04 17:29:12.779359204 +0200 ++++ src/su.c 2010-05-04 17:29:12.939359620 
+0200 +@@ -467,6 +467,117 @@ correct_password (const struct passwd *p #endif /* !USE_PAM */ } @@ -120,7 +120,7 @@ Index: src/su.c /* Update `environ' for the new shell based on PW, with SHELL being the value for the SHELL environment variable. */ -@@ -493,6 +604,22 @@ modify_environment (const struct passwd +@@ -506,6 +617,22 @@ modify_environment (const struct passwd DEFAULT_LOGIN_PATH) : getdef_str ("SUPATH", DEFAULT_ROOT_LOGIN_PATH))); @@ -140,6 +140,6 @@ Index: src/su.c + free (new); + } + } - if (pw->pw_uid) - { - xsetenv ("USER", pw->pw_name); + if (pw->pw_uid) + { + xsetenv ("USER", pw->pw_name); diff --git a/coreutils-6.8-su.patch b/coreutils-6.8-su.diff similarity index 78% rename from coreutils-6.8-su.patch rename to coreutils-6.8-su.diff index c8e3e05..090b0f3 100644 --- a/coreutils-6.8-su.patch +++ b/coreutils-6.8-su.diff @@ -1,10 +1,6 @@ -Add pam support in su - -Index: Makefile.in -=================================================================== ---- Makefile.in.orig 2010-04-23 17:58:41.000000000 +0200 -+++ Makefile.in 2010-05-06 19:37:44.784359208 +0200 -@@ -961,6 +961,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +--- Makefile.in ++++ Makefile.in +@@ -732,6 +732,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -12,35 +8,41 @@ Index: Makefile.in PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ -Index: configure -=================================================================== ---- configure.orig 2010-05-06 19:37:44.688359301 +0200 -+++ configure 2010-05-06 19:37:44.816359169 +0200 -@@ -631,6 +631,7 @@ OPTIONAL_BIN_PROGS +--- configure ++++ configure +@@ -612,6 +612,7 @@ OPTIONAL_BIN_PROGS INSTALL_SU LIB_GMP LIB_CRYPT +PAM_LIBS - GNULIB_WARN_CFLAGS WERROR_CFLAGS SEQ_LIBM -@@ -1501,6 +1502,7 @@ enable_xattr + LIB_CAP +@@ -1231,6 +1232,7 @@ with_included_regex + enable_xattr enable_libcap - with_tty_group enable_gcc_warnings +enable_pam 
with_gmp enable_install_program enable_no_install_program -@@ -2152,6 +2154,7 @@ Optional Features: +@@ -1877,6 +1879,7 @@ Optional Features: --disable-xattr do not support extended attributes --disable-libcap disable libcap support - --enable-gcc-warnings turn on lots of GCC warnings (for developers) -+ --disable-pam Disable PAM support in su (default=auto) + --enable-gcc-warnings turn on lots of GCC warnings (not recommended) ++ --disable-pam Enable PAM support in su (default=auto) --enable-install-program=PROG_LIST install the programs in PROG_LIST (comma-separated, default: none) -@@ -51989,6 +51992,111 @@ $as_echo "#define HAVE_WORKING_FORK 1" > +@@ -26931,7 +26934,6 @@ fi + + + +- + XGETTEXT_EXTRA_OPTIONS="$XGETTEXT_EXTRA_OPTIONS --keyword='proper_name:1,\"This is a proper name. See the gettext manual, section Names.\"'" + + +@@ -39096,6 +39098,111 @@ $as_echo "#define HAVE_WORKING_FORK 1" > fi @@ -150,13 +152,11 @@ Index: configure +$as_echo "$enable_pam" >&6; } + optional_bin_progs= - for ac_func in chroot - do : -Index: configure.ac -=================================================================== ---- configure.ac.orig 2010-03-13 16:14:09.000000000 +0100 -+++ configure.ac 2010-05-06 19:37:44.843292013 +0200 -@@ -128,6 +128,20 @@ fi + for ac_func in uname + do +--- configure.ac ++++ configure.ac +@@ -79,6 +79,20 @@ fi AC_FUNC_FORK @@ -175,13 +175,11 @@ Index: configure.ac +AC_MSG_RESULT([$enable_pam]) + optional_bin_progs= - AC_CHECK_FUNCS([chroot], - gl_ADD_PROG([optional_bin_progs], [chroot])) -Index: doc/Makefile.in -=================================================================== ---- doc/Makefile.in.orig 2010-04-23 17:58:37.000000000 +0200 -+++ doc/Makefile.in 2010-05-06 19:37:44.868359246 +0200 -@@ -957,6 +957,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ + AC_CHECK_FUNCS([uname], + gl_ADD_PROG([optional_bin_progs], [uname])) +--- doc/Makefile.in ++++ doc/Makefile.in +@@ -713,6 +713,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = 
@PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -189,11 +187,9 @@ Index: doc/Makefile.in PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ -Index: gnulib-tests/Makefile.in -=================================================================== ---- gnulib-tests/Makefile.in.orig 2010-04-23 18:00:33.000000000 +0200 -+++ gnulib-tests/Makefile.in 2010-05-06 19:37:44.871374260 +0200 -@@ -2191,6 +2191,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +--- gnulib-tests/Makefile.in ++++ gnulib-tests/Makefile.in +@@ -1421,6 +1421,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -201,11 +197,9 @@ Index: gnulib-tests/Makefile.in PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ -Index: lib/Makefile.in -=================================================================== ---- lib/Makefile.in.orig 2010-04-23 17:58:38.000000000 +0200 -+++ lib/Makefile.in 2010-05-06 19:37:59.594863753 +0200 -@@ -1006,6 +1006,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +--- lib/Makefile.in ++++ lib/Makefile.in +@@ -763,6 +763,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -213,11 +207,9 @@ Index: lib/Makefile.in PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ -Index: man/Makefile.in -=================================================================== ---- man/Makefile.in.orig 2010-05-06 19:37:44.618920753 +0200 -+++ man/Makefile.in 2010-05-06 19:37:44.934868934 +0200 -@@ -926,6 +926,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +--- man/Makefile.in ++++ man/Makefile.in +@@ -703,6 +703,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -225,28 +217,24 @@ Index: man/Makefile.in PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL 
= @POSIX_SHELL@ -Index: src/Makefile.am -=================================================================== ---- src/Makefile.am.orig 2010-04-23 15:44:14.000000000 +0200 -+++ src/Makefile.am 2010-05-06 19:37:59.594863753 +0200 -@@ -364,7 +364,8 @@ factor_LDADD += $(LIB_GMP) - uptime_LDADD += $(GETLOADAVG_LIBS) +--- src/Makefile.am ++++ src/Makefile.am +@@ -147,7 +147,8 @@ tail_LDADD = $(nanosec_libs) + # If necessary, add -lm to resolve use of pow in lib/strtod.c. + uptime_LDADD = $(LDADD) $(POW_LIB) $(GETLOADAVG_LIBS) - # for crypt --su_LDADD += $(LIB_CRYPT) +-su_LDADD = $(LDADD) $(LIB_CRYPT) +su_SOURCES = su.c getdef.c +su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) - # for various ACL functions - copy_LDADD += $(LIB_ACL) -Index: src/Makefile.in -=================================================================== ---- src/Makefile.in.orig 2010-04-23 18:35:11.000000000 +0200 -+++ src/Makefile.in 2010-05-06 19:37:59.594863753 +0200 -@@ -553,9 +553,10 @@ stdbuf_DEPENDENCIES = $(am__DEPENDENCIES - stty_SOURCES = stty.c - stty_OBJECTS = stty.$(OBJEXT) - stty_DEPENDENCIES = $(am__DEPENDENCIES_2) + dir_LDADD += $(LIB_ACL) + ls_LDADD += $(LIB_ACL) +--- src/Makefile.in ++++ src/Makefile.in +@@ -605,9 +605,10 @@ stty_OBJECTS = stty.$(OBJEXT) + stty_LDADD = $(LDADD) + stty_DEPENDENCIES = libver.a ../lib/libcoreutils.a \ + $(am__DEPENDENCIES_1) ../lib/libcoreutils.a -su_SOURCES = su.c -su_OBJECTS = su.$(OBJEXT) -su_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) @@ -256,28 +244,40 @@ Index: src/Makefile.in + $(am__DEPENDENCIES_1) sum_SOURCES = sum.c sum_OBJECTS = sum.$(OBJEXT) - sum_DEPENDENCIES = $(am__DEPENDENCIES_2) -@@ -665,8 +666,8 @@ SOURCES = $(nodist_libver_a_SOURCES) $(_ - $(rmdir_SOURCES) runcon.c seq.c setuidgid.c $(sha1sum_SOURCES) \ - $(sha224sum_SOURCES) $(sha256sum_SOURCES) $(sha384sum_SOURCES) \ - $(sha512sum_SOURCES) shred.c shuf.c sleep.c sort.c split.c \ -- stat.c stdbuf.c stty.c su.c sum.c sync.c tac.c tail.c tee.c \ -- test.c $(timeout_SOURCES) 
touch.c tr.c true.c truncate.c \ -+ stat.c stdbuf.c stty.c $(su_SOURCES) sum.c sync.c tac.c tail.c \ -+ tee.c test.c $(timeout_SOURCES) touch.c tr.c true.c truncate.c \ - tsort.c tty.c $(uname_SOURCES) unexpand.c uniq.c unlink.c \ - uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c yes.c - DIST_SOURCES = $(__SOURCES) $(arch_SOURCES) base64.c basename.c cat.c \ -@@ -683,7 +684,7 @@ DIST_SOURCES = $(__SOURCES) $(arch_SOURC + sum_LDADD = $(LDADD) +@@ -735,11 +736,11 @@ SOURCES = $(nodist_libver_a_SOURCES) $(_ $(rm_SOURCES) $(rmdir_SOURCES) runcon.c seq.c setuidgid.c \ $(sha1sum_SOURCES) $(sha224sum_SOURCES) $(sha256sum_SOURCES) \ $(sha384sum_SOURCES) $(sha512sum_SOURCES) shred.c shuf.c \ -- sleep.c sort.c split.c stat.c stdbuf.c stty.c su.c sum.c \ -+ sleep.c sort.c split.c stat.c stdbuf.c stty.c $(su_SOURCES) sum.c \ - sync.c tac.c tail.c tee.c test.c $(timeout_SOURCES) touch.c \ - tr.c true.c truncate.c tsort.c tty.c $(uname_SOURCES) \ - unexpand.c uniq.c unlink.c uptime.c users.c $(vdir_SOURCES) \ -@@ -1338,6 +1339,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +- sleep.c sort.c split.c stat.c stty.c su.c sum.c sync.c tac.c \ +- tail.c tee.c test.c $(timeout_SOURCES) touch.c tr.c true.c \ +- truncate.c tsort.c tty.c $(uname_SOURCES) unexpand.c uniq.c \ +- unlink.c uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c \ +- yes.c ++ sleep.c sort.c split.c stat.c stty.c $(su_SOURCES) sum.c \ ++ sync.c tac.c tail.c tee.c test.c $(timeout_SOURCES) touch.c \ ++ tr.c true.c truncate.c tsort.c tty.c $(uname_SOURCES) \ ++ unexpand.c uniq.c unlink.c uptime.c users.c $(vdir_SOURCES) \ ++ wc.c who.c whoami.c yes.c + DIST_SOURCES = $(__SOURCES) $(arch_SOURCES) base64.c basename.c cat.c \ + chcon.c $(chgrp_SOURCES) chmod.c $(chown_SOURCES) chroot.c \ + cksum.c comm.c $(cp_SOURCES) csplit.c cut.c date.c dd.c df.c \ +@@ -754,10 +755,10 @@ DIST_SOURCES = $(__SOURCES) $(arch_SOURC + $(rmdir_SOURCES) runcon.c seq.c setuidgid.c $(sha1sum_SOURCES) \ + $(sha224sum_SOURCES) 
$(sha256sum_SOURCES) $(sha384sum_SOURCES) \ + $(sha512sum_SOURCES) shred.c shuf.c sleep.c sort.c split.c \ +- stat.c stty.c su.c sum.c sync.c tac.c tail.c tee.c test.c \ +- $(timeout_SOURCES) touch.c tr.c true.c truncate.c tsort.c \ +- tty.c $(uname_SOURCES) unexpand.c uniq.c unlink.c uptime.c \ +- users.c $(vdir_SOURCES) wc.c who.c whoami.c yes.c ++ stat.c stty.c $(su_SOURCES) sum.c sync.c tac.c tail.c tee.c \ ++ test.c $(timeout_SOURCES) touch.c tr.c true.c truncate.c \ ++ tsort.c tty.c $(uname_SOURCES) unexpand.c uniq.c unlink.c \ ++ uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c yes.c + HEADERS = $(noinst_HEADERS) + ETAGS = etags + CTAGS = ctags +@@ -1209,6 +1210,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -285,17 +285,17 @@ Index: src/Makefile.in PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ -@@ -1743,7 +1745,8 @@ stdbuf_LDADD = $(LDADD) $(LIBICONV) - stty_LDADD = $(LDADD) +@@ -1511,7 +1513,8 @@ tail_LDADD = $(nanosec_libs) - # for crypt + # If necessary, add -lm to resolve use of pow in lib/strtod.c. 
+ uptime_LDADD = $(LDADD) $(POW_LIB) $(GETLOADAVG_LIBS) -su_LDADD = $(LDADD) $(LIB_CRYPT) +su_SOURCES = su.c getdef.c +su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) - sum_LDADD = $(LDADD) - sync_LDADD = $(LDADD) - tac_LDADD = $(LDADD) $(LIB_GETHRXTIME) -@@ -2386,6 +2389,7 @@ distclean-compile: + stat_LDADD = $(LDADD) $(LIB_SELINUX) + + # programs that use getaddrinfo (e.g., via canon_host) +@@ -2040,6 +2043,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/false.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fmt.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fold.Po@am__quote@ @@ -303,10 +303,8 @@ Index: src/Makefile.in @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/getlimits.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-copy.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-cp-hash.Po@am__quote@ -Index: src/getdef.c -=================================================================== ---- /dev/null 1970-01-01 00:00:00.000000000 +0000 -+++ src/getdef.c 2010-05-06 19:37:45.014990147 +0200 +--- src/getdef.c ++++ src/getdef.c @@ -0,0 +1,259 @@ +/* Copyright (C) 2003, 2004, 2005 Thorsten Kukuk + Author: Thorsten Kukuk @@ -567,10 +565,8 @@ Index: src/getdef.c +} + +#endif -Index: src/getdef.h -=================================================================== ---- /dev/null 1970-01-01 00:00:00.000000000 +0000 -+++ src/getdef.h 2010-05-06 19:37:45.054863903 +0200 +--- src/getdef.h ++++ src/getdef.h @@ -0,0 +1,29 @@ +/* Copyright (C) 2003, 2005 Thorsten Kukuk + Author: Thorsten Kukuk @@ -601,10 +597,8 @@ Index: src/getdef.h +extern void free_getdef_data (void); + +#endif /* _GETDEF_H_ */ -Index: src/su.c -=================================================================== ---- src/su.c.orig 2010-01-01 14:06:47.000000000 +0100 -+++ src/su.c 2010-05-06 19:37:59.538860383 +0200 +--- src/su.c ++++ src/su.c @@ -37,6 +37,16 @@ restricts who can su to UID 0 accounts. 
RMS considers that to be fascist. @@ -622,7 +616,7 @@ Index: src/su.c Compile-time options: -DSYSLOG_SUCCESS Log successful su's (by default, to root) with syslog. -DSYSLOG_FAILURE Log failed su's (by default, to root) with syslog. -@@ -52,12 +62,22 @@ +@@ -52,6 +62,13 @@ #include #include #include @@ -634,8 +628,9 @@ Index: src/su.c +#include +#endif - #include "system.h" - #include "getpass.h" + /* Hide any system prototype for getusershell. + This is necessary because some Cray systems have a conflicting +@@ -65,6 +82,9 @@ #if HAVE_SYSLOG_H && HAVE_SYSLOG # include @@ -645,7 +640,7 @@ Index: src/su.c #else # undef SYSLOG_SUCCESS # undef SYSLOG_FAILURE -@@ -91,19 +111,13 @@ +@@ -98,19 +118,13 @@ # include #endif @@ -669,20 +664,18 @@ Index: src/su.c /* The shell to run if none is given in the user's passwd entry. */ #define DEFAULT_SHELL "/bin/sh" -@@ -111,8 +125,9 @@ +@@ -118,13 +132,22 @@ /* The user to become if none is specified. */ #define DEFAULT_USER "root" +#ifndef USE_PAM char *crypt (char const *key, char const *salt); -- +#endif - static void run_shell (char const *, char const *, char **, size_t) - ATTRIBUTE_NORETURN; + char *getusershell (void); + void endusershell (void); + void setusershell (void); -@@ -125,6 +140,13 @@ static bool simulate_login; - /* If true, change some environment vars to indicate the user su'd to. 
*/ - static bool change_environment; + extern char **environ; +#ifdef USE_PAM +static bool _pam_session_opened; @@ -691,10 +684,10 @@ Index: src/su.c +static void create_watching_parent (void); +#endif + - static struct option const longopts[] = - { - {"command", required_argument, NULL, 'c'}, -@@ -200,7 +222,162 @@ log_su (struct passwd const *pw, bool su + static void run_shell (char const *, char const *, char **, size_t) + ATTRIBUTE_NORETURN; + +@@ -212,7 +235,162 @@ log_su (struct passwd const *pw, bool su } #endif @@ -779,7 +772,7 @@ Index: src/su.c + /* the child proceeds to run the shell */ + if (child == 0) + return; -+ ++ + /* In the parent watch the child. */ + + /* su without pam support does not have a helper that keeps @@ -857,7 +850,7 @@ Index: src/su.c Return true if the user gives the correct password for entry PW, false if not. Return true without asking for a password if run by UID 0 or if PW has an empty password. */ -@@ -208,10 +385,52 @@ log_su (struct passwd const *pw, bool su +@@ -220,10 +398,52 @@ log_su (struct passwd const *pw, bool su static bool correct_password (const struct passwd *pw) { @@ -911,7 +904,7 @@ Index: src/su.c endspent (); if (sp) -@@ -232,6 +451,7 @@ correct_password (const struct passwd *p +@@ -244,6 +464,7 @@ correct_password (const struct passwd *p encrypted = crypt (unencrypted, correct); memset (unencrypted, 0, strlen (unencrypted)); return STREQ (encrypted, correct); @@ -919,33 +912,33 @@ Index: src/su.c } /* Update `environ' for the new shell based on PW, with SHELL being -@@ -256,8 +476,8 @@ modify_environment (const struct passwd +@@ -268,8 +489,8 @@ modify_environment (const struct passwd xsetenv ("USER", pw->pw_name); xsetenv ("LOGNAME", pw->pw_name); xsetenv ("PATH", (pw->pw_uid -- ? DEFAULT_LOGIN_PATH -- : DEFAULT_ROOT_LOGIN_PATH)); +- ? DEFAULT_LOGIN_PATH +- : DEFAULT_ROOT_LOGIN_PATH)); + ? 
getdef_str ("PATH", DEFAULT_LOGIN_PATH) + : getdef_str ("SUPATH", DEFAULT_ROOT_LOGIN_PATH))); } else { -@@ -267,6 +487,12 @@ modify_environment (const struct passwd - { - xsetenv ("HOME", pw->pw_dir); - xsetenv ("SHELL", shell); +@@ -279,6 +500,12 @@ modify_environment (const struct passwd + { + xsetenv ("HOME", pw->pw_dir); + xsetenv ("SHELL", shell); + if (getdef_bool ("ALWAYS_SET_PATH", 0)) + xsetenv ("PATH", (pw->pw_uid + ? getdef_str ("PATH", + DEFAULT_LOGIN_PATH) + : getdef_str ("SUPATH", + DEFAULT_ROOT_LOGIN_PATH))); - if (pw->pw_uid) - { - xsetenv ("USER", pw->pw_name); -@@ -274,19 +500,41 @@ modify_environment (const struct passwd - } - } + if (pw->pw_uid) + { + xsetenv ("USER", pw->pw_name); +@@ -286,19 +513,41 @@ modify_environment (const struct passwd + } + } } + +#ifdef USE_PAM @@ -962,7 +955,7 @@ Index: src/su.c #ifdef HAVE_INITGROUPS errno = 0; if (initgroups (pw->pw_name, pw->pw_gid) == -1) -- error (EXIT_CANCELED, errno, _("cannot set groups")); +- error (EXIT_FAILURE, errno, _("cannot set groups")); + { +#ifdef USE_PAM + cleanup_pam (PAM_ABORT); @@ -985,17 +978,17 @@ Index: src/su.c +change_identity (const struct passwd *pw) +{ if (setgid (pw->pw_gid)) - error (EXIT_CANCELED, errno, _("cannot set group id")); + error (EXIT_FAILURE, errno, _("cannot set group id")); if (setuid (pw->pw_uid)) -@@ -479,6 +727,7 @@ main (int argc, char **argv) +@@ -491,6 +740,7 @@ main (int argc, char **argv) #ifdef SYSLOG_FAILURE log_su (pw, false); #endif + sleep (getdef_num ("FAIL_DELAY", 1)); - error (EXIT_CANCELED, 0, _("incorrect password")); + error (EXIT_FAILURE, 0, _("incorrect password")); } #ifdef SYSLOG_SUCCESS -@@ -500,9 +749,21 @@ main (int argc, char **argv) +@@ -512,9 +762,21 @@ main (int argc, char **argv) shell = NULL; } shell = xstrdup (shell ? 
shell : pw->pw_shell); @@ -1018,11 +1011,9 @@ Index: src/su.c if (simulate_login && chdir (pw->pw_dir) != 0) error (0, errno, _("warning: cannot change directory to %s"), pw->pw_dir); -Index: tests/Makefile.in -=================================================================== ---- tests/Makefile.in.orig 2010-04-23 17:58:39.000000000 +0200 -+++ tests/Makefile.in 2010-05-06 19:37:45.091861849 +0200 -@@ -986,6 +986,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +--- tests/Makefile.in ++++ tests/Makefile.in +@@ -677,6 +677,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ diff --git a/coreutils-6.8.0-pie.patch b/coreutils-6.8.0-pie.diff similarity index 76% rename from coreutils-6.8.0-pie.patch rename to coreutils-6.8.0-pie.diff index 2a22116..36f565f 100644 --- a/coreutils-6.8.0-pie.patch +++ b/coreutils-6.8.0-pie.diff @@ -1,35 +1,28 @@ -Index: lib/Makefile.am -=================================================================== ---- lib/Makefile.am.orig 2010-01-01 14:06:47.000000000 +0100 -+++ lib/Makefile.am 2010-05-05 14:38:03.083359277 +0200 -@@ -17,7 +17,7 @@ - +--- lib/Makefile.am ++++ lib/Makefile.am +@@ -18,6 +18,7 @@ include gnulib.mk --AM_CFLAGS += $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) -+AM_CFLAGS += $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) -fpie + AM_CFLAGS = $(WARN_CFLAGS) # $(WERROR_CFLAGS) ++AM_CFLAGS += -fpie libcoreutils_a_SOURCES += \ buffer-lcm.c buffer-lcm.h \ -Index: lib/Makefile.in -=================================================================== ---- lib/Makefile.in.orig 2010-05-05 14:37:08.000000000 +0200 -+++ lib/Makefile.in 2010-05-05 14:38:31.946859277 +0200 -@@ -1432,7 +1432,7 @@ DISTCLEANFILES = - MAINTAINERCLEANFILES = getdate.c iconv_open-aix.h iconv_open-hpux.h \ - iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h - AM_CPPFLAGS = --AM_CFLAGS = $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) -+AM_CFLAGS = $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) -fpie - 
libcoreutils_a_SOURCES = set-mode-acl.c copy-acl.c file-has-acl.c \ - areadlink.c areadlink-with-size.c areadlinkat.c argv-iter.c \ - argv-iter.h base64.h base64.c bitrotate.h c-ctype.h c-ctype.c \ -Index: src/Makefile.am -=================================================================== ---- src/Makefile.am.orig 2010-05-05 14:37:08.000000000 +0200 -+++ src/Makefile.am 2010-05-05 14:39:20.956359221 +0200 -@@ -366,6 +366,10 @@ uptime_LDADD += $(GETLOADAVG_LIBS) - # for crypt +--- lib/Makefile.in ++++ lib/Makefile.in +@@ -1169,7 +1169,7 @@ GPERF = gperf + LINK_WARNING_H = $(top_srcdir)/build-aux/link-warning.h + charset_alias = $(DESTDIR)$(libdir)/charset.alias + charset_tmp = $(DESTDIR)$(libdir)/charset.tmp +-AM_CFLAGS = $(WARN_CFLAGS) # $(WERROR_CFLAGS) ++AM_CFLAGS = $(WARN_CFLAGS) -fpie + all: $(BUILT_SOURCES) config.h + $(MAKE) $(AM_MAKEFLAGS) all-recursive + +--- src/Makefile.am ++++ src/Makefile.am +@@ -149,6 +149,10 @@ uptime_LDADD = $(LDADD) $(POW_LIB) $(GET + su_SOURCES = su.c getdef.c su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) +su_CFLAGS = -fpie @@ -37,16 +30,14 @@ Index: src/Makefile.am +timeout_CFLAGS = -fpie +timeout_LDFLAGS = -pie - # for various ACL functions - copy_LDADD += $(LIB_ACL) -Index: src/Makefile.in -=================================================================== ---- src/Makefile.in.orig 2010-05-05 14:37:08.000000000 +0200 -+++ src/Makefile.in 2010-05-05 14:46:02.318905172 +0200 -@@ -553,10 +553,12 @@ stdbuf_DEPENDENCIES = $(am__DEPENDENCIES - stty_SOURCES = stty.c - stty_OBJECTS = stty.$(OBJEXT) - stty_DEPENDENCIES = $(am__DEPENDENCIES_2) + dir_LDADD += $(LIB_ACL) + ls_LDADD += $(LIB_ACL) +--- src/Makefile.in ++++ src/Makefile.in +@@ -605,10 +605,12 @@ stty_OBJECTS = stty.$(OBJEXT) + stty_LDADD = $(LDADD) + stty_DEPENDENCIES = libver.a ../lib/libcoreutils.a \ + $(am__DEPENDENCIES_1) ../lib/libcoreutils.a -am_su_OBJECTS = su.$(OBJEXT) getdef.$(OBJEXT) +am_su_OBJECTS = su-su.$(OBJEXT) su-getdef.$(OBJEXT) su_OBJECTS = $(am_su_OBJECTS) 
@@ -56,8 +47,8 @@ Index: src/Makefile.in + $@ sum_SOURCES = sum.c sum_OBJECTS = sum.$(OBJEXT) - sum_DEPENDENCIES = $(am__DEPENDENCIES_2) -@@ -576,9 +578,12 @@ tee_DEPENDENCIES = $(am__DEPENDENCIES_2) + sum_LDADD = $(LDADD) +@@ -633,9 +635,12 @@ tee_DEPENDENCIES = libver.a ../lib/libco test_SOURCES = test.c test_OBJECTS = test.$(OBJEXT) test_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) @@ -71,36 +62,36 @@ Index: src/Makefile.in touch_SOURCES = touch.c touch_OBJECTS = touch.$(OBJEXT) touch_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) -@@ -1747,6 +1752,10 @@ stty_LDADD = $(LDADD) - # for crypt +@@ -1515,6 +1520,10 @@ tail_LDADD = $(nanosec_libs) + uptime_LDADD = $(LDADD) $(POW_LIB) $(GETLOADAVG_LIBS) su_SOURCES = su.c getdef.c su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) +su_CFLAGS = -fpie +su_LDFLAGS = -pie +timeout_CFLAGS = -fpie +timeout_LDFLAGS = -pie - sum_LDADD = $(LDADD) - sync_LDADD = $(LDADD) - tac_LDADD = $(LDADD) $(LIB_GETHRXTIME) -@@ -2279,7 +2288,7 @@ stty$(EXEEXT): $(stty_OBJECTS) $(stty_DE - $(AM_V_CCLD)$(LINK) $(stty_OBJECTS) $(stty_LDADD) $(LIBS) + stat_LDADD = $(LDADD) $(LIB_SELINUX) + + # programs that use getaddrinfo (e.g., via canon_host) +@@ -1933,7 +1942,7 @@ stty$(EXEEXT): $(stty_OBJECTS) $(stty_DE + $(LINK) $(stty_OBJECTS) $(stty_LDADD) $(LIBS) su$(EXEEXT): $(su_OBJECTS) $(su_DEPENDENCIES) @rm -f su$(EXEEXT) -- $(AM_V_CCLD)$(LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) -+ $(AM_V_CCLD)$(su_LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) +- $(LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) ++ $(su_LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) sum$(EXEEXT): $(sum_OBJECTS) $(sum_DEPENDENCIES) @rm -f sum$(EXEEXT) - $(AM_V_CCLD)$(LINK) $(sum_OBJECTS) $(sum_LDADD) $(LIBS) -@@ -2300,7 +2309,7 @@ test$(EXEEXT): $(test_OBJECTS) $(test_DE - $(AM_V_CCLD)$(LINK) $(test_OBJECTS) $(test_LDADD) $(LIBS) + $(LINK) $(sum_OBJECTS) $(sum_LDADD) $(LIBS) +@@ -1954,7 +1963,7 @@ test$(EXEEXT): $(test_OBJECTS) $(test_DE + $(LINK) $(test_OBJECTS) $(test_LDADD) $(LIBS) 
timeout$(EXEEXT): $(timeout_OBJECTS) $(timeout_DEPENDENCIES) @rm -f timeout$(EXEEXT) -- $(AM_V_CCLD)$(LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) -+ $(AM_V_CCLD)$(timeout_LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) +- $(LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) ++ $(timeout_LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) touch$(EXEEXT): $(touch_OBJECTS) $(touch_DEPENDENCIES) @rm -f touch$(EXEEXT) - $(AM_V_CCLD)$(LINK) $(touch_OBJECTS) $(touch_LDADD) $(LIBS) -@@ -2389,7 +2398,6 @@ distclean-compile: + $(LINK) $(touch_OBJECTS) $(touch_LDADD) $(LIBS) +@@ -2043,7 +2052,6 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/false.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fmt.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fold.Po@am__quote@ @@ -108,9 +99,9 @@ Index: src/Makefile.in @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/getlimits.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-copy.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-cp-hash.Po@am__quote@ -@@ -2453,14 +2461,16 @@ distclean-compile: +@@ -2104,14 +2112,16 @@ distclean-compile: + @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/split.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stat.Po@am__quote@ - @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stdbuf.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stty.Po@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/su.Po@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/su-getdef.Po@am__quote@ @@ -127,9 +118,9 @@ Index: src/Makefile.in @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/touch.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tr.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/true.Po@am__quote@ -@@ -2649,6 +2659,62 @@ sha512sum-md5sum.obj: md5sum.c +@@ -2286,6 +2296,62 @@ sha512sum-md5sum.obj: md5sum.c @AMDEP_TRUE@@am__fastdepCC_FALSE@ 
DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@ - @am__fastdepCC_FALSE@ $(AM_V_CC@am__nodep@)$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(sha512sum_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o sha512sum-md5sum.obj `if test -f 'md5sum.c'; then $(CYGPATH_W) 'md5sum.c'; else $(CYGPATH_W) '$(srcdir)/md5sum.c'; fi` + @am__fastdepCC_FALSE@ $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(sha512sum_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o sha512sum-md5sum.obj `if test -f 'md5sum.c'; then $(CYGPATH_W) 'md5sum.c'; else $(CYGPATH_W) '$(srcdir)/md5sum.c'; fi` +su-su.o: su.c +@am__fastdepCC_TRUE@ $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(su_CFLAGS) $(CFLAGS) -MT su-su.o -MD -MP -MF $(DEPDIR)/su-su.Tpo -c -o su-su.o `test -f 'su.c' || echo '$(srcdir)/'`su.c diff --git a/coreutils-7.1.diff b/coreutils-7.1.diff new file mode 100644 index 0000000..f755a19 --- /dev/null +++ b/coreutils-7.1.diff @@ -0,0 +1,194 @@ +--- configure ++++ configure +@@ -3029,7 +3029,6 @@ as_fn_append ac_func_list " fchmod" + as_fn_append ac_func_list " alarm" + as_fn_append ac_header_list " sys/statvfs.h" + as_fn_append ac_header_list " sys/select.h" +-gl_printf_safe=yes + as_fn_append ac_func_list " readlink" + as_fn_append ac_header_list " utmp.h" + as_fn_append ac_header_list " utmpx.h" +--- doc/coreutils.texi ++++ doc/coreutils.texi +@@ -66,8 +66,6 @@ + * fold: (coreutils)fold invocation. Wrap long input lines. + * groups: (coreutils)groups invocation. Print group names a user is in. + * head: (coreutils)head invocation. Output the first part of files. +-* hostid: (coreutils)hostid invocation. Print numeric host identifier. +-* hostname: (coreutils)hostname invocation. Print or set system name. + * id: (coreutils)id invocation. Print user identity. + * install: (coreutils)install invocation. Copy and change attributes. + * join: (coreutils)join invocation. Join lines on a common field. +@@ -195,7 +193,7 @@ Free Documentation License''. 
+ * File name manipulation:: dirname basename pathchk + * Working context:: pwd stty printenv tty + * User information:: id logname whoami groups users who +-* System context:: date uname hostname hostid uptime ++* System context:: date uname uptime + * SELinux context:: chcon runcon + * Modified command invocation:: chroot env nice nohup su timeout + * Process control:: kill +@@ -409,8 +407,6 @@ System context + * arch invocation:: Print machine hardware name + * date invocation:: Print or set system date and time + * uname invocation:: Print system information +-* hostname invocation:: Print or set system name +-* hostid invocation:: Print numeric host identifier + * uptime invocation:: Print system uptime and load + + @command{date}: Print or set system date and time +@@ -12969,8 +12965,6 @@ information. + * arch invocation:: Print machine hardware name. + * date invocation:: Print or set system date and time. + * uname invocation:: Print system information. +-* hostname invocation:: Print or set system name. +-* hostid invocation:: Print numeric host identifier. + * uptime invocation:: Print system uptime and load + @end menu + +@@ -13928,54 +13922,6 @@ Print the kernel version. + @exitstatus + + +-@node hostname invocation +-@section @command{hostname}: Print or set system name +- +-@pindex hostname +-@cindex setting the hostname +-@cindex printing the hostname +-@cindex system name, printing +-@cindex appropriate privileges +- +-With no arguments, @command{hostname} prints the name of the current host +-system. With one argument, it sets the current host name to the +-specified string. You must have appropriate privileges to set the host +-name. Synopsis: +- +-@example +-hostname [@var{name}] +-@end example +- +-The only options are @option{--help} and @option{--version}. @xref{Common +-options}. +- +-@exitstatus +- +- +-@node hostid invocation +-@section @command{hostid}: Print numeric host identifier. 
+- +-@pindex hostid +-@cindex printing the host identifier +- +-@command{hostid} prints the numeric identifier of the current host +-in hexadecimal. This command accepts no arguments. +-The only options are @option{--help} and @option{--version}. +-@xref{Common options}. +- +-For example, here's what it prints on one system I use: +- +-@example +-$ hostid +-1bac013d +-@end example +- +-On that system, the 32-bit quantity happens to be closely +-related to the system's Internet address, but that isn't always +-the case. +- +-@exitstatus +- + @node uptime invocation + @section @command{uptime}: Print system uptime and load + +--- gnulib-tests/test-isnanl.h ++++ gnulib-tests/test-isnanl.h +@@ -75,7 +75,7 @@ main () + /* Quiet NaN. */ + ASSERT (isnanl (0.0L / 0.0L)); + +-#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT ++#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT && 0 + /* A bit pattern that is different from a Quiet NaN. With a bit of luck, + it's a Signalling NaN. */ + { +@@ -117,6 +117,7 @@ main () + { LDBL80_WORDS (0xFFFF, 0x83333333, 0x00000000) }; + ASSERT (isnanl (x.value)); + } ++#if 0 + /* The isnanl function should recognize Pseudo-NaNs, Pseudo-Infinities, + Pseudo-Zeroes, Unnormalized Numbers, and Pseudo-Denormals, as defined in + Intel IA-64 Architecture Software Developer's Manual, Volume 1: +@@ -150,6 +151,7 @@ main () + ASSERT (isnanl (x.value)); + } + #endif ++#endif + + return 0; + } +--- m4/gnulib-comp.m4 ++++ m4/gnulib-comp.m4 +@@ -287,7 +287,6 @@ AC_DEFUN([gl_INIT], + gl_POSIXVER + gl_FUNC_PRINTF_FREXP + gl_FUNC_PRINTF_FREXPL +- m4_divert_text([INIT_PREPARE], [gl_printf_safe=yes]) + m4_ifdef([AM_XGETTEXT_OPTION], + [AM_XGETTEXT_OPTION([--keyword='proper_name:1,\"This is a proper name. See the gettext manual, section Names.\"']) + AM_XGETTEXT_OPTION([--keyword='proper_name_utf8:1,\"This is a proper name. 
See the gettext manual, section Names.\"'])]) +--- man/Makefile.am ++++ man/Makefile.am +@@ -184,7 +184,7 @@ check-x-vs-1: + PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ + t=ls-files.$$$$; \ + (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ +- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ ++ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ + | tr -s ' ' '\n' | sed 's/\.1$$//') \ + | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ + rm $$t +--- man/Makefile.in ++++ man/Makefile.in +@@ -1275,7 +1275,7 @@ check-x-vs-1: + PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ + t=ls-files.$$$$; \ + (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ +- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ ++ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ + | tr -s ' ' '\n' | sed 's/\.1$$//') \ + | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ + rm $$t +--- src/system.h ++++ src/system.h +@@ -156,7 +156,7 @@ enum + # define DEV_BSIZE BBSIZE + #endif + #ifndef DEV_BSIZE +-# define DEV_BSIZE 4096 ++# define DEV_BSIZE 512 + #endif + + /* Extract or fake data from a `struct stat'. +--- tests/misc/help-version ++++ tests/misc/help-version +@@ -182,6 +182,7 @@ lbracket_args=": ]" + for i in $built_programs; do + # Skip these. 
+ case $i in chroot|stty|tty|false|chcon|runcon) continue;; esac ++ case $i in df) continue;; esac + + rm -rf $tmp_in $tmp_in2 $tmp_dir $tmp_out + echo > $tmp_in +--- tests/other-fs-tmpdir ++++ tests/other-fs-tmpdir +@@ -42,6 +42,8 @@ for d in $CANDIDATE_TMP_DIRS; do + fi + + done ++# Autobuild hack ++test -f /bin/uname.bin && other_partition_tmpdir= + + if test -z "$other_partition_tmpdir"; then + skip_test_ \ diff --git a/coreutils-7.1.tar.xz b/coreutils-7.1.tar.xz new file mode 100644 index 0000000..5f576f5 --- /dev/null +++ b/coreutils-7.1.tar.xz @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a584c6ce92f390c684dac00032e5c790ecc15cb0fa3e61891ac62401832ae108 +size 3967824 diff --git a/coreutils-8.5-i18n.patch b/coreutils-8.5-i18n.patch deleted file mode 100644 index b043447..0000000 --- a/coreutils-8.5-i18n.patch +++ /dev/null @@ -1,4066 +0,0 @@ -Index: lib/linebuffer.h -=================================================================== ---- lib/linebuffer.h.orig 2010-04-23 15:44:00.000000000 +0200 -+++ lib/linebuffer.h 2010-05-07 16:13:30.696492151 +0200 -@@ -21,6 +21,11 @@ - - # include - -+/* Get mbstate_t. */ -+# if HAVE_WCHAR_H -+# include -+# endif -+ - /* A `struct linebuffer' holds a line of text. */ - - struct linebuffer -@@ -28,6 +33,9 @@ struct linebuffer - size_t size; /* Allocated. */ - size_t length; /* Used. */ - char *buffer; -+# if HAVE_WCHAR_H -+ mbstate_t state; -+# endif - }; - - /* Initialize linebuffer LINEBUFFER for use. */ -Index: src/cut.c -=================================================================== ---- src/cut.c.orig 2010-04-20 21:52:04.000000000 +0200 -+++ src/cut.c 2010-05-07 16:40:46.225492013 +0200 -@@ -28,6 +28,11 @@ - #include - #include - #include -+ -+/* Get mbstate_t, mbrtowc(). 
*/ -+#if HAVE_WCHAR_H -+# include -+#endif - #include "system.h" - - #include "error.h" -@@ -36,6 +41,18 @@ - #include "quote.h" - #include "xstrndup.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "cut" - -@@ -71,6 +88,52 @@ - } \ - while (0) - -+/* Refill the buffer BUF to get a multibyte character. */ -+#define REFILL_BUFFER(BUF, BUFPOS, BUFLEN, STREAM) \ -+ do \ -+ { \ -+ if (BUFLEN < MB_LEN_MAX && !feof (STREAM) && !ferror (STREAM)) \ -+ { \ -+ memmove (BUF, BUFPOS, BUFLEN); \ -+ BUFLEN += fread (BUF + BUFLEN, sizeof(char), BUFSIZ, STREAM); \ -+ BUFPOS = BUF; \ -+ } \ -+ } \ -+ while (0) -+ -+/* Get wide character on BUFPOS. BUFPOS is not included after that. -+ If byte sequence is not valid as a character, CONVFAIL is 1. Otherwise 0. */ -+#define GET_NEXT_WC_FROM_BUFFER(WC, BUFPOS, BUFLEN, MBLENGTH, STATE, CONVFAIL) \ -+ do \ -+ { \ -+ mbstate_t state_bak; \ -+ \ -+ if (BUFLEN < 1) \ -+ { \ -+ WC = WEOF; \ -+ break; \ -+ } \ -+ \ -+ /* Get a wide character. */ \ -+ CONVFAIL = 0; \ -+ state_bak = STATE; \ -+ MBLENGTH = mbrtowc ((wchar_t *)&WC, BUFPOS, BUFLEN, &STATE); \ -+ \ -+ switch (MBLENGTH) \ -+ { \ -+ case (size_t)-1: \ -+ case (size_t)-2: \ -+ CONVFAIL++; \ -+ STATE = state_bak; \ -+ /* Fall througn. */ \ -+ \ -+ case 0: \ -+ MBLENGTH = 1; \ -+ break; \ -+ } \ -+ } \ -+ while (0) -+ - struct range_pair - { - size_t lo; -@@ -89,7 +152,7 @@ static char *field_1_buffer; - /* The number of bytes allocated for FIELD_1_BUFFER. 
*/ - static size_t field_1_bufsize; - --/* The largest field or byte index used as an endpoint of a closed -+/* The largest byte, character or field index used as an endpoint of a closed - or degenerate range specification; this doesn't include the starting - index of right-open-ended ranges. For example, with either range spec - `2-5,9-', `2-3,5,9-' this variable would be set to 5. */ -@@ -101,10 +164,11 @@ static size_t eol_range_start; - - /* This is a bit vector. - In byte mode, which bytes to output. -+ In character mode, which characters to output. - In field mode, which DELIM-separated fields to output. -- Both bytes and fields are numbered starting with 1, -+ Bytes, characters and fields are numbered starting with 1, - so the zeroth bit of this array is unused. -- A field or byte K has been selected if -+ A byte, character or field K has been selected if - (K <= MAX_RANGE_ENDPOINT and is_printable_field(K)) - || (EOL_RANGE_START > 0 && K >= EOL_RANGE_START). */ - static unsigned char *printable_field; -@@ -113,15 +177,25 @@ enum operating_mode - { - undefined_mode, - -- /* Output characters that are in the given bytes. */ -+ /* Output bytes that are at the given positions. */ - byte_mode, - -+ /* Output characters that are at the given positions. */ -+ character_mode, -+ - /* Output the given delimeter-separated fields. */ - field_mode - }; - - static enum operating_mode operating_mode; - -+/* If nonzero, when in byte mode, don't split multibyte characters. */ -+static int byte_mode_character_aware; -+ -+/* If nonzero, the function for single byte locale is work -+ if this program runs on multibyte locale. */ -+static int force_singlebyte_mode; -+ - /* If true do not output lines containing no delimeter characters. - Otherwise, all such lines are printed. This option is valid only - with field mode. */ -@@ -133,6 +207,9 @@ static bool complement; - - /* The delimeter character for field mode. 
*/ - static unsigned char delim; -+#if HAVE_WCHAR_H -+static wchar_t wcdelim; -+#endif - - /* True if the --output-delimiter=STRING option was specified. */ - static bool output_delimiter_specified; -@@ -206,7 +283,7 @@ Mandatory arguments to long options are - -f, --fields=LIST select only these fields; also print any line\n\ - that contains no delimiter character, unless\n\ - the -s option is specified\n\ -- -n (ignored)\n\ -+ -n with -b: don't split multibyte characters\n\ - "), stdout); - fputs (_("\ - --complement complement the set of selected bytes, characters\n\ -@@ -365,7 +442,7 @@ set_fields (const char *fieldstr) - in_digits = false; - /* Starting a range. */ - if (dash_found) -- FATAL_ERROR (_("invalid byte or field list")); -+ FATAL_ERROR (_("invalid byte, character or field list")); - dash_found = true; - fieldstr++; - -@@ -389,14 +466,16 @@ set_fields (const char *fieldstr) - if (!rhs_specified) - { - /* `n-'. From `initial' to end of line. */ -- eol_range_start = initial; -+ if (eol_range_start == 0 || -+ (eol_range_start != 0 && eol_range_start > initial)) -+ eol_range_start = initial; - field_found = true; - } - else - { - /* `m-n' or `-n' (1-n). */ - if (value < initial) -- FATAL_ERROR (_("invalid decreasing range")); -+ FATAL_ERROR (_("invalid byte, character or field list")); - - /* Is there already a range going to end of line? 
*/ - if (eol_range_start != 0) -@@ -476,6 +555,9 @@ set_fields (const char *fieldstr) - if (operating_mode == byte_mode) - error (0, 0, - _("byte offset %s is too large"), quote (bad_num)); -+ else if (operating_mode == character_mode) -+ error (0, 0, -+ _("character offset %s is too large"), quote (bad_num)); - else - error (0, 0, - _("field number %s is too large"), quote (bad_num)); -@@ -486,7 +568,7 @@ set_fields (const char *fieldstr) - fieldstr++; - } - else -- FATAL_ERROR (_("invalid byte or field list")); -+ FATAL_ERROR (_("invalid byte, character or field list")); - } - - max_range_endpoint = 0; -@@ -579,6 +661,63 @@ cut_bytes (FILE *stream) - } - } - -+#if HAVE_MBRTOWC -+/* This function is in use for the following case. -+ -+ 1. Read from the stream STREAM, printing to standard output any selected -+ characters. -+ -+ 2. Read from stream STREAM, printing to standard output any selected bytes, -+ without splitting multibyte characters. */ -+ -+static void -+cut_characters_or_cut_bytes_no_split (FILE *stream) -+{ -+ int idx; /* number of bytes or characters in the line so far. */ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen; /* The length of the byte sequence in buf. */ -+ wint_t wc; /* A gotten wide character. */ -+ size_t mblength; /* The byte size of a multibyte character which shows -+ as same character as WC. */ -+ mbstate_t state; /* State of the stream. */ -+ int convfail; /* 1, when conversion is failed. Otherwise 0. */ -+ -+ idx = 0; -+ buflen = 0; -+ bufpos = buf; -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ while (1) -+ { -+ REFILL_BUFFER (buf, bufpos, buflen, stream); -+ -+ GET_NEXT_WC_FROM_BUFFER (wc, bufpos, buflen, mblength, state, convfail); -+ -+ if (wc == WEOF) -+ { -+ if (idx > 0) -+ putchar ('\n'); -+ break; -+ } -+ else if (wc == L'\n') -+ { -+ putchar ('\n'); -+ idx = 0; -+ } -+ else -+ { -+ idx += (operating_mode == byte_mode) ? 
mblength : 1; -+ if (print_kth (idx, NULL)) -+ fwrite (bufpos, mblength, sizeof(char), stdout); -+ } -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+} -+#endif -+ - /* Read from stream STREAM, printing to standard output any selected fields. */ - - static void -@@ -701,13 +840,192 @@ cut_fields (FILE *stream) - } - } - -+#if HAVE_MBRTOWC -+static void -+cut_fields_mb (FILE *stream) -+{ -+ int c; -+ unsigned int field_idx; -+ int found_any_selected_field; -+ int buffer_first_field; -+ int empty_input; -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen; /* The length of the byte sequence in buf. */ -+ wint_t wc = 0; /* A gotten wide character. */ -+ size_t mblength; /* The byte size of a multibyte character which shows -+ as same character as WC. */ -+ mbstate_t state; /* State of the stream. */ -+ int convfail; /* 1, when conversion is failed. Otherwise 0. */ -+ -+ found_any_selected_field = 0; -+ field_idx = 1; -+ bufpos = buf; -+ buflen = 0; -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ c = getc (stream); -+ empty_input = (c == EOF); -+ if (c != EOF) -+ ungetc (c, stream); -+ else -+ wc = WEOF; -+ -+ /* To support the semantics of the -s flag, we may have to buffer -+ all of the first field to determine whether it is `delimited.' -+ But that is unnecessary if all non-delimited lines must be printed -+ and the first field has been selected, or if non-delimited lines -+ must be suppressed and the first field has *not* been selected. -+ That is because a non-delimited line has exactly one field. 
*/ -+ buffer_first_field = (suppress_non_delimited ^ !print_kth (1, NULL)); -+ -+ while (1) -+ { -+ if (field_idx == 1 && buffer_first_field) -+ { -+ int len = 0; -+ -+ while (1) -+ { -+ REFILL_BUFFER (buf, bufpos, buflen, stream); -+ -+ GET_NEXT_WC_FROM_BUFFER -+ (wc, bufpos, buflen, mblength, state, convfail); -+ -+ if (wc == WEOF) -+ break; -+ -+ field_1_buffer = xrealloc (field_1_buffer, len + mblength); -+ memcpy (field_1_buffer + len, bufpos, mblength); -+ len += mblength; -+ buflen -= mblength; -+ bufpos += mblength; -+ -+ if (!convfail && (wc == L'\n' || wc == wcdelim)) -+ break; -+ } -+ -+ if (wc == WEOF) -+ break; -+ -+ /* If the first field extends to the end of line (it is not -+ delimited) and we are printing all non-delimited lines, -+ print this one. */ -+ if (convfail || (!convfail && wc != wcdelim)) -+ { -+ if (suppress_non_delimited) -+ { -+ /* Empty. */ -+ } -+ else -+ { -+ fwrite (field_1_buffer, sizeof (char), len, stdout); -+ /* Make sure the output line is newline terminated. */ -+ if (convfail || (!convfail && wc != L'\n')) -+ putchar ('\n'); -+ } -+ continue; -+ } -+ -+ if (print_kth (1, NULL)) -+ { -+ /* Print the field, but not the trailing delimiter. 
*/ -+ fwrite (field_1_buffer, sizeof (char), len - 1, stdout); -+ found_any_selected_field = 1; -+ } -+ ++field_idx; -+ } -+ -+ if (wc != WEOF) -+ { -+ if (print_kth (field_idx, NULL)) -+ { -+ if (found_any_selected_field) -+ { -+ fwrite (output_delimiter_string, sizeof (char), -+ output_delimiter_length, stdout); -+ } -+ found_any_selected_field = 1; -+ } -+ -+ while (1) -+ { -+ REFILL_BUFFER (buf, bufpos, buflen, stream); -+ -+ GET_NEXT_WC_FROM_BUFFER -+ (wc, bufpos, buflen, mblength, state, convfail); -+ -+ if (wc == WEOF) -+ break; -+ else if (!convfail && (wc == wcdelim || wc == L'\n')) -+ { -+ buflen -= mblength; -+ bufpos += mblength; -+ break; -+ } -+ -+ if (print_kth (field_idx, NULL)) -+ fwrite (bufpos, mblength, sizeof(char), stdout); -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+ } -+ -+ if ((!convfail || wc == L'\n') && buflen < 1) -+ wc = WEOF; -+ -+ if (!convfail && wc == wcdelim) -+ ++field_idx; -+ else if (wc == WEOF || (!convfail && wc == L'\n')) -+ { -+ if (found_any_selected_field -+ || (!empty_input && !(suppress_non_delimited && field_idx == 1))) -+ putchar ('\n'); -+ if (wc == WEOF) -+ break; -+ field_idx = 1; -+ found_any_selected_field = 0; -+ } -+ } -+} -+#endif -+ - static void - cut_stream (FILE *stream) - { -- if (operating_mode == byte_mode) -- cut_bytes (stream); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) -+ { -+ switch (operating_mode) -+ { -+ case byte_mode: -+ if (byte_mode_character_aware) -+ cut_characters_or_cut_bytes_no_split (stream); -+ else -+ cut_bytes (stream); -+ break; -+ -+ case character_mode: -+ cut_characters_or_cut_bytes_no_split (stream); -+ break; -+ -+ case field_mode: -+ cut_fields_mb (stream); -+ break; -+ -+ default: -+ abort (); -+ } -+ } - else -- cut_fields (stream); -+#endif -+ { -+ if (operating_mode == field_mode) -+ cut_fields (stream); -+ else -+ cut_bytes (stream); -+ } - } - - /* Process file FILE to standard output. 
-@@ -757,6 +1075,8 @@ main (int argc, char **argv) - bool ok; - bool delim_specified = false; - char *spec_list_string IF_LINT (= NULL); -+ char mbdelim[MB_LEN_MAX + 1]; -+ size_t delimlen = 0; - - initialize_main (&argc, &argv); - set_program_name (argv[0]); -@@ -779,7 +1099,6 @@ main (int argc, char **argv) - switch (optc) - { - case 'b': -- case 'c': - /* Build the byte list. */ - if (operating_mode != undefined_mode) - FATAL_ERROR (_("only one type of list may be specified")); -@@ -787,6 +1106,14 @@ main (int argc, char **argv) - spec_list_string = optarg; - break; - -+ case 'c': -+ /* Build the character list. */ -+ if (operating_mode != undefined_mode) -+ FATAL_ERROR (_("only one type of list may be specified")); -+ operating_mode = character_mode; -+ spec_list_string = optarg; -+ break; -+ - case 'f': - /* Build the field list. */ - if (operating_mode != undefined_mode) -@@ -798,10 +1125,35 @@ main (int argc, char **argv) - case 'd': - /* New delimiter. */ - /* Interpret -d '' to mean `use the NUL byte as the delimiter.' */ -- if (optarg[0] != '\0' && optarg[1] != '\0') -- FATAL_ERROR (_("the delimiter must be a single character")); -- delim = optarg[0]; -- delim_specified = true; -+ { -+#if HAVE_MBRTOWC -+ if(MB_CUR_MAX > 1) -+ { -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ delimlen = mbrtowc (&wcdelim, optarg, strnlen(optarg, MB_LEN_MAX), &state); -+ -+ if (delimlen == (size_t)-1 || delimlen == (size_t)-2) -+ ++force_singlebyte_mode; -+ else -+ { -+ delimlen = (delimlen < 1) ? 
1 : delimlen; -+ if (wcdelim != L'\0' && *(optarg + delimlen) != '\0') -+ FATAL_ERROR (_("the delimiter must be a single character")); -+ memcpy (mbdelim, optarg, delimlen); -+ } -+ } -+ -+ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) -+#endif -+ { -+ if (optarg[0] != '\0' && optarg[1] != '\0') -+ FATAL_ERROR (_("the delimiter must be a single character")); -+ delim = (unsigned char) optarg[0]; -+ } -+ delim_specified = true; -+ } - break; - - case OUTPUT_DELIMITER_OPTION: -@@ -814,6 +1166,7 @@ main (int argc, char **argv) - break; - - case 'n': -+ byte_mode_character_aware = 1; - break; - - case 's': -@@ -836,7 +1189,7 @@ main (int argc, char **argv) - if (operating_mode == undefined_mode) - FATAL_ERROR (_("you must specify a list of bytes, characters, or fields")); - -- if (delim != '\0' && operating_mode != field_mode) -+ if (delim_specified && operating_mode != field_mode) - FATAL_ERROR (_("an input delimiter may be specified only\ - when operating on fields")); - -@@ -863,15 +1216,34 @@ main (int argc, char **argv) - } - - if (!delim_specified) -- delim = '\t'; -+ { -+ delim = '\t'; -+#ifdef HAVE_MBRTOWC -+ wcdelim = L'\t'; -+ mbdelim[0] = '\t'; -+ mbdelim[1] = '\0'; -+ delimlen = 1; -+#endif -+ } - - if (output_delimiter_string == NULL) - { -- static char dummy[2]; -- dummy[0] = delim; -- dummy[1] = '\0'; -- output_delimiter_string = dummy; -- output_delimiter_length = 1; -+#ifdef HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) -+ { -+ output_delimiter_string = xstrdup(mbdelim); -+ output_delimiter_length = delimlen; -+ } -+ -+ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) -+#endif -+ { -+ static char dummy[2]; -+ dummy[0] = delim; -+ dummy[1] = '\0'; -+ output_delimiter_string = dummy; -+ output_delimiter_length = 1; -+ } - } - - if (optind == argc) -Index: src/expand.c -=================================================================== ---- src/expand.c.orig 2010-01-01 14:06:47.000000000 +0100 -+++ src/expand.c 2010-05-07 16:13:30.748169979 
+0200 -@@ -38,11 +38,28 @@ - #include - #include - #include -+ -+/* Get mbstate_t, mbrtowc(), wcwidth(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ - #include "system.h" - #include "error.h" - #include "quote.h" - #include "xstrndup.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "expand" - -@@ -358,6 +375,142 @@ expand (void) - } - } - -+#if HAVE_MBRTOWC -+static void -+expand_multibyte (void) -+{ -+ FILE *fp; /* Input strem. */ -+ mbstate_t i_state; /* Current shift state of the input stream. */ -+ mbstate_t i_state_bak; /* Back up the I_STATE. */ -+ mbstate_t o_state; /* Current shift state of the output stream. */ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen = 0; /* The length of the byte sequence in buf. */ -+ wchar_t wc; /* A gotten wide character. */ -+ size_t mblength; /* The byte size of a multibyte character -+ which shows as same character as WC. */ -+ int tab_index = 0; /* Index in `tab_list' of next tabstop. */ -+ int column = 0; /* Column on screen of the next char. */ -+ int next_tab_column; /* Column the next tab stop is on. */ -+ int convert = 1; /* If nonzero, perform translations. */ -+ -+ fp = next_file ((FILE *) NULL); -+ if (fp == NULL) -+ return; -+ -+ memset (&o_state, '\0', sizeof(mbstate_t)); -+ memset (&i_state, '\0', sizeof(mbstate_t)); -+ -+ for (;;) -+ { -+ /* Refill the buffer BUF. 
*/ -+ if (buflen < MB_LEN_MAX && !feof(fp) && !ferror(fp)) -+ { -+ memmove (buf, bufpos, buflen); -+ buflen += fread (buf + buflen, sizeof(char), BUFSIZ, fp); -+ bufpos = buf; -+ } -+ -+ /* No character is left in BUF. */ -+ if (buflen < 1) -+ { -+ fp = next_file (fp); -+ -+ if (fp == NULL) -+ break; /* No more files. */ -+ else -+ { -+ memset (&i_state, '\0', sizeof(mbstate_t)); -+ continue; -+ } -+ } -+ -+ /* Get a wide character. */ -+ i_state_bak = i_state; -+ mblength = mbrtowc (&wc, bufpos, buflen, &i_state); -+ -+ switch (mblength) -+ { -+ case (size_t)-1: /* illegal byte sequence. */ -+ case (size_t)-2: -+ mblength = 1; -+ i_state = i_state_bak; -+ if (convert) -+ { -+ ++column; -+ if (convert_entire_line == 0) -+ convert = 0; -+ } -+ putchar (*bufpos); -+ break; -+ -+ case 0: /* null. */ -+ mblength = 1; -+ if (convert && convert_entire_line == 0) -+ convert = 0; -+ putchar ('\0'); -+ break; -+ -+ default: -+ if (wc == L'\n') /* LF. */ -+ { -+ tab_index = 0; -+ column = 0; -+ convert = 1; -+ putchar ('\n'); -+ } -+ else if (wc == L'\t' && convert) /* Tab. */ -+ { -+ if (tab_size == 0) -+ { -+ /* Do not let tab_index == first_free_tab; -+ stop when it is 1 less. */ -+ while (tab_index < first_free_tab - 1 -+ && column >= tab_list[tab_index]) -+ tab_index++; -+ next_tab_column = tab_list[tab_index]; -+ if (tab_index < first_free_tab - 1) -+ tab_index++; -+ if (column >= next_tab_column) -+ next_tab_column = column + 1; -+ } -+ else -+ next_tab_column = column + tab_size - column % tab_size; -+ -+ while (column < next_tab_column) -+ { -+ putchar (' '); -+ ++column; -+ } -+ } -+ else /* Others. */ -+ { -+ if (convert) -+ { -+ if (wc == L'\b') -+ { -+ if (column > 0) -+ --column; -+ } -+ else -+ { -+ int width; /* The width of WC. */ -+ -+ width = wcwidth (wc); -+ column += (width > 0) ? 
width : 0; -+ if (convert_entire_line == 0) -+ convert = 0; -+ } -+ } -+ fwrite (bufpos, sizeof(char), mblength, stdout); -+ } -+ } -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+} -+#endif -+ - int - main (int argc, char **argv) - { -@@ -422,7 +575,12 @@ main (int argc, char **argv) - - file_list = (optind < argc ? &argv[optind] : stdin_argv); - -- expand (); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ expand_multibyte (); -+ else -+#endif -+ expand (); - - if (have_read_stdin && fclose (stdin) != 0) - error (EXIT_FAILURE, errno, "-"); -Index: src/fold.c -=================================================================== ---- src/fold.c.orig 2010-01-01 14:06:47.000000000 +0100 -+++ src/fold.c 2010-05-07 16:39:03.220004781 +0200 -@@ -22,11 +22,33 @@ - #include - #include - -+/* Get mbstate_t, mbrtowc(), wcwidth(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswprint(), iswblank(), wcwidth(). */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ - #include "system.h" - #include "error.h" - #include "quote.h" - #include "xstrtol.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - #define TAB_WIDTH 8 - - /* The official name of this program (e.g., no `g' prefix). */ -@@ -34,20 +56,41 @@ - - #define AUTHORS proper_name ("David MacKenzie") - -+#define FATAL_ERROR(Message) \ -+ do \ -+ { \ -+ error (0, 0, (Message)); \ -+ usage (2); \ -+ } \ -+ while (0) -+ -+enum operating_mode -+{ -+ /* Fold texts by columns that are at the given positions. */ -+ column_mode, -+ -+ /* Fold texts by bytes that are at the given positions. 
*/ -+ byte_mode, -+ -+ /* Fold texts by characters that are at the given positions. */ -+ character_mode, -+}; -+ -+/* The argument shows current mode. (Default: column_mode) */ -+static enum operating_mode operating_mode; -+ - /* If nonzero, try to break on whitespace. */ - static bool break_spaces; - --/* If nonzero, count bytes, not column positions. */ --static bool count_bytes; -- - /* If nonzero, at least one of the files we read was standard input. */ - static bool have_read_stdin; - --static char const shortopts[] = "bsw:0::1::2::3::4::5::6::7::8::9::"; -+static char const shortopts[] = "bcsw:0::1::2::3::4::5::6::7::8::9::"; - - static struct option const longopts[] = - { - {"bytes", no_argument, NULL, 'b'}, -+ {"characters", no_argument, NULL, 'c'}, - {"spaces", no_argument, NULL, 's'}, - {"width", required_argument, NULL, 'w'}, - {GETOPT_HELP_OPTION_DECL}, -@@ -77,6 +120,7 @@ Mandatory arguments to long options are - "), stdout); - fputs (_("\ - -b, --bytes count bytes rather than columns\n\ -+ -c, --characters count characters rather than columns\n\ - -s, --spaces break at spaces\n\ - -w, --width=WIDTH use WIDTH columns instead of 80\n\ - "), stdout); -@@ -94,7 +138,7 @@ Mandatory arguments to long options are - static size_t - adjust_column (size_t column, char c) - { -- if (!count_bytes) -+ if (operating_mode != byte_mode) - { - if (c == '\b') - { -@@ -117,30 +161,14 @@ adjust_column (size_t column, char c) - to stdout, with maximum line length WIDTH. - Return true if successful. */ - --static bool --fold_file (char const *filename, size_t width) -+static void -+fold_text (FILE *istream, size_t width, int *saved_errno) - { -- FILE *istream; - int c; - size_t column = 0; /* Screen column where next char will go. */ - size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ - static char *line_out = NULL; - static size_t allocated_out = 0; -- int saved_errno; -- -- if (STREQ (filename, "-")) -- { -- istream = stdin; -- have_read_stdin = true; -- } -- else -- istream = fopen (filename, "r"); -- -- if (istream == NULL) -- { -- error (0, errno, "%s", filename); -- return false; -- } - - while ((c = getc (istream)) != EOF) - { -@@ -168,6 +196,15 @@ fold_file (char const *filename, size_t - bool found_blank = false; - size_t logical_end = offset_out; - -+ /* If LINE_OUT has no wide character, -+ put a new wide character in LINE_OUT -+ if column is bigger than width. */ -+ if (offset_out == 0) -+ { -+ line_out[offset_out++] = c; -+ continue; -+ } -+ - /* Look for the last blank. */ - while (logical_end) - { -@@ -214,11 +251,222 @@ fold_file (char const *filename, size_t - line_out[offset_out++] = c; - } - -- saved_errno = errno; -+ *saved_errno = errno; -+ -+ if (offset_out) -+ fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); -+ -+} -+ -+#if HAVE_MBRTOWC -+static void -+fold_multibyte_text (FILE *istream, size_t width, int *saved_errno) -+{ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ size_t buflen = 0; /* The length of the byte sequence in buf. */ -+ char *bufpos = NULL; /* Next read position of BUF. */ -+ wint_t wc; /* A gotten wide character. */ -+ size_t mblength; /* The byte size of a multibyte character which shows -+ as same character as WC. */ -+ mbstate_t state, state_bak; /* State of the stream. */ -+ int convfail; /* 1, when conversion is failed. Otherwise 0. */ -+ -+ static char *line_out = NULL; -+ size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ -+ static size_t allocated_out = 0; -+ -+ int increment; -+ size_t column = 0; -+ -+ size_t last_blank_pos; -+ size_t last_blank_column; -+ int is_blank_seen; -+ int last_blank_increment = 0; -+ int is_bs_following_last_blank; -+ size_t bs_following_last_blank_num; -+ int is_cr_after_last_blank; -+ -+#define CLEAR_FLAGS \ -+ do \ -+ { \ -+ last_blank_pos = 0; \ -+ last_blank_column = 0; \ -+ is_blank_seen = 0; \ -+ is_bs_following_last_blank = 0; \ -+ bs_following_last_blank_num = 0; \ -+ is_cr_after_last_blank = 0; \ -+ } \ -+ while (0) -+ -+#define START_NEW_LINE \ -+ do \ -+ { \ -+ putchar ('\n'); \ -+ column = 0; \ -+ offset_out = 0; \ -+ CLEAR_FLAGS; \ -+ } \ -+ while (0) -+ -+ CLEAR_FLAGS; -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ for (;; bufpos += mblength, buflen -= mblength) -+ { -+ if (buflen < MB_LEN_MAX && !feof (istream) && !ferror (istream)) -+ { -+ memmove (buf, bufpos, buflen); -+ buflen += fread (buf + buflen, sizeof(char), BUFSIZ, istream); -+ bufpos = buf; -+ } -+ -+ if (buflen < 1) -+ break; -+ -+ /* Get a wide character. */ -+ convfail = 0; -+ state_bak = state; -+ mblength = mbrtowc ((wchar_t *)&wc, bufpos, buflen, &state); -+ -+ switch (mblength) -+ { -+ case (size_t)-1: -+ case (size_t)-2: -+ convfail++; -+ state = state_bak; -+ /* Fall through. */ -+ -+ case 0: -+ mblength = 1; -+ break; -+ } -+ -+rescan: -+ if (operating_mode == byte_mode) /* byte mode */ -+ increment = mblength; -+ else if (operating_mode == character_mode) /* character mode */ -+ increment = 1; -+ else /* column mode */ -+ { -+ if (convfail) -+ increment = 1; -+ else -+ { -+ switch (wc) -+ { -+ case L'\n': -+ fwrite (line_out, sizeof(char), offset_out, stdout); -+ START_NEW_LINE; -+ continue; -+ -+ case L'\b': -+ increment = (column > 0) ? -1 : 0; -+ break; -+ -+ case L'\r': -+ increment = -1 * column; -+ break; -+ -+ case L'\t': -+ increment = 8 - column % 8; -+ break; -+ -+ default: -+ increment = wcwidth (wc); -+ increment = (increment < 0) ? 
0 : increment; -+ } -+ } -+ } -+ -+ if (column + increment > width && break_spaces && last_blank_pos) -+ { -+ fwrite (line_out, sizeof(char), last_blank_pos, stdout); -+ putchar ('\n'); -+ -+ offset_out = offset_out - last_blank_pos; -+ column = column - last_blank_column + ((is_cr_after_last_blank) -+ ? last_blank_increment : bs_following_last_blank_num); -+ memmove (line_out, line_out + last_blank_pos, offset_out); -+ CLEAR_FLAGS; -+ goto rescan; -+ } -+ -+ if (column + increment > width && column != 0) -+ { -+ fwrite (line_out, sizeof(char), offset_out, stdout); -+ START_NEW_LINE; -+ goto rescan; -+ } -+ -+ if (allocated_out < offset_out + mblength) -+ { -+ line_out = X2REALLOC (line_out, &allocated_out); -+ } -+ -+ memcpy (line_out + offset_out, bufpos, mblength); -+ offset_out += mblength; -+ column += increment; -+ -+ if (is_blank_seen && !convfail && wc == L'\r') -+ is_cr_after_last_blank = 1; -+ -+ if (is_bs_following_last_blank && !convfail && wc == L'\b') -+ ++bs_following_last_blank_num; -+ else -+ is_bs_following_last_blank = 0; -+ -+ if (break_spaces && !convfail && iswblank (wc)) -+ { -+ last_blank_pos = offset_out; -+ last_blank_column = column; -+ is_blank_seen = 1; -+ last_blank_increment = increment; -+ is_bs_following_last_blank = 1; -+ bs_following_last_blank_num = 0; -+ is_cr_after_last_blank = 0; -+ } -+ } -+ -+ *saved_errno = errno; - - if (offset_out) - fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); - -+} -+#endif -+ -+/* Fold file FILENAME, or standard input if FILENAME is "-", -+ to stdout, with maximum line length WIDTH. -+ Return 0 if successful, 1 if an error occurs. 
*/ -+ -+static bool -+fold_file (char *filename, size_t width) -+{ -+ FILE *istream; -+ int saved_errno; -+ -+ if (STREQ (filename, "-")) -+ { -+ istream = stdin; -+ have_read_stdin = 1; -+ } -+ else -+ istream = fopen (filename, "r"); -+ -+ if (istream == NULL) -+ { -+ error (0, errno, "%s", filename); -+ return 1; -+ } -+ -+ /* Define how ISTREAM is being folded. */ -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ fold_multibyte_text (istream, width, &saved_errno); -+ else -+#endif -+ fold_text (istream, width, &saved_errno); -+ - if (ferror (istream)) - { - error (0, saved_errno, "%s", filename); -@@ -251,7 +499,8 @@ main (int argc, char **argv) - - atexit (close_stdout); - -- break_spaces = count_bytes = have_read_stdin = false; -+ operating_mode = column_mode; -+ break_spaces = have_read_stdin = false; - - while ((optc = getopt_long (argc, argv, shortopts, longopts, NULL)) != -1) - { -@@ -260,7 +509,15 @@ main (int argc, char **argv) - switch (optc) - { - case 'b': /* Count bytes rather than columns. */ -- count_bytes = true; -+ if (operating_mode != column_mode) -+ FATAL_ERROR (_("only one way of folding may be specified")); -+ operating_mode = byte_mode; -+ break; -+ -+ case 'c': -+ if (operating_mode != column_mode) -+ FATAL_ERROR (_("only one way of folding may be specified")); -+ operating_mode = character_mode; - break; - - case 's': /* Break at word boundaries. */ -Index: src/join.c -=================================================================== ---- src/join.c.orig 2010-04-20 21:52:04.000000000 +0200 -+++ src/join.c 2010-05-07 16:41:17.564268573 +0200 -@@ -22,17 +22,31 @@ - #include - #include - -+/* Get mbstate_t, mbrtowc(), mbrtowc(), wcwidth(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswblank(), towupper. 
*/ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ - #include "system.h" - #include "error.h" - #include "hard-locale.h" - #include "linebuffer.h" --#include "memcasecmp.h" - #include "quote.h" - #include "stdio--.h" - #include "xmemcoll.h" - #include "xstrtol.h" - #include "argmatch.h" - -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "join" - -@@ -121,10 +135,12 @@ static struct outlist outlist_head; - /* Last element in `outlist', where a new element can be added. */ - static struct outlist *outlist_end = &outlist_head; - --/* Tab character separating fields. If negative, fields are separated -- by any nonempty string of blanks, otherwise by exactly one -- tab character whose value (when cast to unsigned char) equals TAB. */ --static int tab = -1; -+/* Tab character separating fields. If NULL, fields are separated -+ by any nonempty string of blanks. */ -+static char *tab = NULL; -+ -+/* The number of bytes used for tab. */ -+static size_t tablen = 0; - - /* If nonzero, check that the input is correctly ordered. 
*/ - static enum -@@ -248,10 +264,11 @@ xfields (struct line *line) - if (ptr == lim) - return; - -- if (0 <= tab) -+ if (tab != NULL) - { -+ unsigned char t = tab[0]; - char *sep; -- for (; (sep = memchr (ptr, tab, lim - ptr)) != NULL; ptr = sep + 1) -+ for (; (sep = memchr (ptr, t, lim - ptr)) != NULL; ptr = sep + 1) - extract_field (line, ptr, sep - ptr); - } - else -@@ -278,6 +295,148 @@ xfields (struct line *line) - extract_field (line, ptr, lim - ptr); - } - -+#if HAVE_MBRTOWC -+static void -+xfields_multibyte (struct line *line) -+{ -+ char *ptr = line->buf.buffer; -+ char const *lim = ptr + line->buf.length - 1; -+ wchar_t wc = 0; -+ size_t mblength = 1; -+ mbstate_t state, state_bak; -+ -+ memset (&state, 0, sizeof (mbstate_t)); -+ -+ if (ptr >= lim) -+ return; -+ -+ if (tab != NULL) -+ { -+ unsigned char t = tab[0]; -+ char *sep = ptr; -+ for (; ptr < lim; ptr = sep + mblength) -+ { -+ sep = ptr; -+ while (sep < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, sep, lim - sep + 1, &state); -+ -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ mblength = 1; -+ state = state_bak; -+ } -+ mblength = (mblength < 1) ? 1 : mblength; -+ -+ if (mblength == tablen && !memcmp (sep, tab, mblength)) -+ break; -+ else -+ { -+ sep += mblength; -+ continue; -+ } -+ } -+ -+ if (sep >= lim) -+ break; -+ -+ extract_field (line, ptr, sep - ptr); -+ } -+ } -+ else -+ { -+ /* Skip leading blanks before the first field. */ -+ while(ptr < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); -+ -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ mblength = 1; -+ state = state_bak; -+ break; -+ } -+ mblength = (mblength < 1) ? 
1 : mblength; -+ -+ if (!iswblank(wc)) -+ break; -+ ptr += mblength; -+ } -+ -+ do -+ { -+ char *sep; -+ state_bak = state; -+ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ mblength = 1; -+ state = state_bak; -+ break; -+ } -+ mblength = (mblength < 1) ? 1 : mblength; -+ -+ sep = ptr + mblength; -+ while (sep < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, sep, lim - sep + 1, &state); -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ mblength = 1; -+ state = state_bak; -+ break; -+ } -+ mblength = (mblength < 1) ? 1 : mblength; -+ -+ if (iswblank (wc)) -+ break; -+ -+ sep += mblength; -+ } -+ -+ extract_field (line, ptr, sep - ptr); -+ if (sep >= lim) -+ return; -+ -+ state_bak = state; -+ mblength = mbrtowc (&wc, sep, lim - sep + 1, &state); -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ mblength = 1; -+ state = state_bak; -+ break; -+ } -+ mblength = (mblength < 1) ? 1 : mblength; -+ -+ ptr = sep + mblength; -+ while (ptr < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ mblength = 1; -+ state = state_bak; -+ break; -+ } -+ mblength = (mblength < 1) ? 1 : mblength; -+ -+ if (!iswblank (wc)) -+ break; -+ -+ ptr += mblength; -+ } -+ } -+ while (ptr < lim); -+ } -+ -+ extract_field (line, ptr, lim - ptr); -+} -+#endif -+ - static void - freeline (struct line *line) - { -@@ -299,56 +458,115 @@ keycmp (struct line const *line1, struct - size_t jf_1, size_t jf_2) - { - /* Start of field to compare in each file. */ -- char *beg1; -- char *beg2; -- -- size_t len1; -- size_t len2; /* Length of fields to compare. */ -+ char *beg[2]; -+ char *copy[2]; -+ size_t len[2]; /* Length of fields to compare. 
*/ - int diff; -+ int i, j; - - if (jf_1 < line1->nfields) - { -- beg1 = line1->fields[jf_1].beg; -- len1 = line1->fields[jf_1].len; -+ beg[0] = line1->fields[jf_1].beg; -+ len[0] = line1->fields[jf_1].len; - } - else - { -- beg1 = NULL; -- len1 = 0; -+ beg[0] = NULL; -+ len[0] = 0; - } - - if (jf_2 < line2->nfields) - { -- beg2 = line2->fields[jf_2].beg; -- len2 = line2->fields[jf_2].len; -+ beg[1] = line2->fields[jf_2].beg; -+ len[1] = line2->fields[jf_2].len; - } - else - { -- beg2 = NULL; -- len2 = 0; -+ beg[1] = NULL; -+ len[1] = 0; - } - -- if (len1 == 0) -- return len2 == 0 ? 0 : -1; -- if (len2 == 0) -+ if (len[0] == 0) -+ return len[1] == 0 ? 0 : -1; -+ if (len[1] == 0) - return 1; - - if (ignore_case) - { -- /* FIXME: ignore_case does not work with NLS (in particular, -- with multibyte chars). */ -- diff = memcasecmp (beg1, beg2, MIN (len1, len2)); -+#ifdef HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ size_t mblength; -+ wchar_t wc, uwc; -+ mbstate_t state, state_bak; -+ -+ memset (&state, '\0', sizeof (mbstate_t)); -+ -+ for (i = 0; i < 2; i++) -+ { -+ copy[i] = alloca (len[i] + 1); -+ -+ for (j = 0; j < MIN (len[0], len[1]);) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, beg[i] + j, len[i] - j, &state); -+ -+ switch (mblength) -+ { -+ case (size_t) -1: -+ case (size_t) -2: -+ state = state_bak; -+ /* Fall through */ -+ case 0: -+ mblength = 1; -+ break; -+ -+ default: -+ uwc = towupper (wc); -+ -+ if (uwc != wc) -+ { -+ mbstate_t state_wc; -+ -+ memset (&state_wc, '\0', sizeof (mbstate_t)); -+ wcrtomb (copy[i] + j, uwc, &state_wc); -+ } -+ else -+ memcpy (copy[i] + j, beg[i] + j, mblength); -+ } -+ j += mblength; -+ } -+ copy[i][j] = '\0'; -+ } -+ } -+ else -+#endif -+ { -+ for (i = 0; i < 2; i++) -+ { -+ copy[i] = alloca (len[i] + 1); -+ -+ for (j = 0; j < MIN (len[0], len[1]); j++) -+ copy[i][j] = toupper (beg[i][j]); -+ -+ copy[i][j] = '\0'; -+ } -+ } - } - else - { -- if (hard_LC_COLLATE) -- return xmemcoll (beg1, len1, beg2, len2); -- diff 
= memcmp (beg1, beg2, MIN (len1, len2)); -+ copy[0] = (unsigned char *) beg[0]; -+ copy[1] = (unsigned char *) beg[1]; - } - -+ if (hard_LC_COLLATE) -+ return xmemcoll ((char *) copy[0], len[0], (char *) copy[1], len[1]); -+ diff = memcmp (copy[0], copy[1], MIN (len[0], len[1])); -+ -+ - if (diff) - return diff; -- return len1 < len2 ? -1 : len1 != len2; -+ return len[0] - len[1]; - } - - /* Check that successive input lines PREV and CURRENT from input file -@@ -429,6 +647,11 @@ get_line (FILE *fp, struct line **linep, - return false; - } - -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ xfields_multibyte (line); -+ else -+#endif - xfields (line); - - if (prevline[which - 1]) -@@ -528,11 +751,18 @@ prfield (size_t n, struct line const *li - - /* Print the join of LINE1 and LINE2. */ - -+#define PUT_TAB_CHAR \ -+ do \ -+ { \ -+ (tab != NULL) ? \ -+ fwrite(tab, sizeof(char), tablen, stdout) : putchar (' '); \ -+ } \ -+ while (0) -+ - static void - prjoin (struct line const *line1, struct line const *line2) - { - const struct outlist *outlist; -- char output_separator = tab < 0 ? 
' ' : tab; - - outlist = outlist_head.next; - if (outlist) -@@ -567,7 +797,7 @@ prjoin (struct line const *line1, struct - o = o->next; - if (o == NULL) - break; -- putchar (output_separator); -+ PUT_TAB_CHAR; - } - putchar ('\n'); - } -@@ -585,23 +815,23 @@ prjoin (struct line const *line1, struct - prfield (join_field_1, line1); - for (i = 0; i < join_field_1 && i < line1->nfields; ++i) - { -- putchar (output_separator); -+ PUT_TAB_CHAR; - prfield (i, line1); - } - for (i = join_field_1 + 1; i < line1->nfields; ++i) - { -- putchar (output_separator); -+ PUT_TAB_CHAR; - prfield (i, line1); - } - - for (i = 0; i < join_field_2 && i < line2->nfields; ++i) - { -- putchar (output_separator); -+ PUT_TAB_CHAR; - prfield (i, line2); - } - for (i = join_field_2 + 1; i < line2->nfields; ++i) - { -- putchar (output_separator); -+ PUT_TAB_CHAR; - prfield (i, line2); - } - putchar ('\n'); -@@ -1039,21 +1269,46 @@ main (int argc, char **argv) - - case 't': - { -- unsigned char newtab = optarg[0]; -+ char *newtab; -+ size_t newtablen; -+ newtab = xstrdup (optarg); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ mbstate_t state; -+ -+ memset (&state, 0, sizeof (mbstate_t)); -+ newtablen = mbrtowc (NULL, newtab, -+ strnlen (newtab, MB_LEN_MAX), -+ &state); -+ if (newtablen == (size_t) 0 -+ || newtablen == (size_t) -1 -+ || newtablen == (size_t) -2) -+ newtablen = 1; -+ } -+ else -+#endif -+ newtablen = 1; - if (! newtab) -- newtab = '\n'; /* '' => process the whole line. */ -+ { -+ newtab[0] = '\n'; /* '' => process the whole line. 
*/ -+ } - else if (optarg[1]) - { -- if (STREQ (optarg, "\\0")) -- newtab = '\0'; -- else -- error (EXIT_FAILURE, 0, _("multi-character tab %s"), -- quote (optarg)); -+ if (newtablen == 1 && newtab[1]) -+ { -+ if (STREQ (newtab, "\\0")) -+ newtab[0] = '\0'; -+ } -+ } -+ if (tab != NULL && strcmp (tab, newtab)) -+ { -+ free (newtab); -+ error (EXIT_FAILURE, 0, _("incompatible tabs")); - } -- if (0 <= tab && tab != newtab) -- error (EXIT_FAILURE, 0, _("incompatible tabs")); - tab = newtab; -- } -+ tablen = newtablen; -+ } - break; - - case NOCHECK_ORDER_OPTION: -Index: src/pr.c -=================================================================== ---- src/pr.c.orig 2010-03-13 16:14:09.000000000 +0100 -+++ src/pr.c 2010-05-07 16:13:30.836003733 +0200 -@@ -312,6 +312,32 @@ - - #include - #include -+ -+/* Get MB_LEN_MAX. */ -+#include -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX == 1 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Get MB_CUR_MAX. */ -+#include -+ -+/* Solaris 2.5 has a bug: must be included before . */ -+/* Get mbstate_t, mbrtowc(), wcwidth(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswprint(). -- for wcwidth(). */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+#if !defined iswprint && !HAVE_ISWPRINT -+# define iswprint(wc) 1 -+#endif -+ - #include "system.h" - #include "error.h" - #include "hard-locale.h" -@@ -322,6 +348,18 @@ - #include "strftime.h" - #include "xstrtol.h" - -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ -+#ifndef HAVE_DECL_WCWIDTH -+"this configure-time declaration test was not run" -+#endif -+#if !HAVE_DECL_WCWIDTH -+extern int wcwidth (); -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). 
*/ - #define PROGRAM_NAME "pr" - -@@ -414,7 +452,20 @@ struct COLUMN - - typedef struct COLUMN COLUMN; - --static int char_to_clump (char c); -+/* Funtion pointers to switch functions for single byte locale or for -+ multibyte locale. If multibyte functions do not exist in your sysytem, -+ these pointers always point the function for single byte locale. */ -+static void (*print_char) (char c); -+static int (*char_to_clump) (char c); -+ -+/* Functions for single byte locale. */ -+static void print_char_single (char c); -+static int char_to_clump_single (char c); -+ -+/* Functions for multibyte locale. */ -+static void print_char_multi (char c); -+static int char_to_clump_multi (char c); -+ - static bool read_line (COLUMN *p); - static bool print_page (void); - static bool print_stored (COLUMN *p); -@@ -424,6 +475,7 @@ static void print_header (void); - static void pad_across_to (int position); - static void add_line_number (COLUMN *p); - static void getoptarg (char *arg, char switch_char, char *character, -+ int *character_length, int *character_width, - int *number); - void usage (int status); - static void print_files (int number_of_files, char **av); -@@ -438,7 +490,6 @@ static void store_char (char c); - static void pad_down (int lines); - static void read_rest_of_line (COLUMN *p); - static void skip_read (COLUMN *p, int column_number); --static void print_char (char c); - static void cleanup (void); - static void print_sep_string (void); - static void separator_string (const char *optarg_S); -@@ -450,7 +501,7 @@ static COLUMN *column_vector; - we store the leftmost columns contiguously in buff. - To print a line from buff, get the index of the first character - from line_vector[i], and print up to line_vector[i + 1]. */ --static char *buff; -+static unsigned char *buff; - - /* Index of the position in buff where the next character - will be stored. 
*/ -@@ -554,7 +605,7 @@ static int chars_per_column; - static bool untabify_input = false; - - /* (-e) The input tab character. */ --static char input_tab_char = '\t'; -+static char input_tab_char[MB_LEN_MAX] = "\t"; - - /* (-e) Tabstops are at chars_per_tab, 2*chars_per_tab, 3*chars_per_tab, ... - where the leftmost column is 1. */ -@@ -564,7 +615,10 @@ static int chars_per_input_tab = 8; - static bool tabify_output = false; - - /* (-i) The output tab character. */ --static char output_tab_char = '\t'; -+static char output_tab_char[MB_LEN_MAX] = "\t"; -+ -+/* (-i) The byte length of output tab character. */ -+static int output_tab_char_length = 1; - - /* (-i) The width of the output tab. */ - static int chars_per_output_tab = 8; -@@ -638,7 +692,13 @@ static int power_10; - static bool numbered_lines = false; - - /* (-n) Character which follows each line number. */ --static char number_separator = '\t'; -+static char number_separator[MB_LEN_MAX] = "\t"; -+ -+/* (-n) The byte length of the character which follows each line number. */ -+static int number_separator_length = 1; -+ -+/* (-n) The character width of the character which follows each line number. */ -+static int number_separator_width = 0; - - /* (-n) line counting starts with 1st line of input file (not with 1st - line of 1st page printed). */ -@@ -691,6 +751,7 @@ static bool use_col_separator = false; - -a|COLUMN|-m is a `space' and with the -J option a `tab'. 
*/ - static char *col_sep_string = (char *) ""; - static int col_sep_length = 0; -+static int col_sep_width = 0; - static char *column_separator = (char *) " "; - static char *line_separator = (char *) "\t"; - -@@ -847,6 +908,13 @@ separator_string (const char *optarg_S) - col_sep_length = (int) strlen (optarg_S); - col_sep_string = xmalloc (col_sep_length + 1); - strcpy (col_sep_string, optarg_S); -+ -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ col_sep_width = mbswidth (col_sep_string, 0); -+ else -+#endif -+ col_sep_width = col_sep_length; - } - - int -@@ -871,6 +939,21 @@ main (int argc, char **argv) - - atexit (close_stdout); - -+/* Define which functions are used, the ones for single byte locale or the ones -+ for multibyte locale. */ -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ print_char = print_char_multi; -+ char_to_clump = char_to_clump_multi; -+ } -+ else -+#endif -+ { -+ print_char = print_char_single; -+ char_to_clump = char_to_clump_single; -+ } -+ - n_files = 0; - file_names = (argc > 1 - ? xmalloc ((argc - 1) * sizeof (char *)) -@@ -947,8 +1030,12 @@ main (int argc, char **argv) - break; - case 'e': - if (optarg) -- getoptarg (optarg, 'e', &input_tab_char, -- &chars_per_input_tab); -+ { -+ int dummy_length, dummy_width; -+ -+ getoptarg (optarg, 'e', input_tab_char, &dummy_length, -+ &dummy_width, &chars_per_input_tab); -+ } - /* Could check tab width > 0. */ - untabify_input = true; - break; -@@ -961,8 +1048,12 @@ main (int argc, char **argv) - break; - case 'i': - if (optarg) -- getoptarg (optarg, 'i', &output_tab_char, -- &chars_per_output_tab); -+ { -+ int dummy_width; -+ -+ getoptarg (optarg, 'i', output_tab_char, &output_tab_char_length, -+ &dummy_width, &chars_per_output_tab); -+ } - /* Could check tab width > 0. 
*/ - tabify_output = true; - break; -@@ -989,8 +1080,8 @@ main (int argc, char **argv) - case 'n': - numbered_lines = true; - if (optarg) -- getoptarg (optarg, 'n', &number_separator, -- &chars_per_number); -+ getoptarg (optarg, 'n', number_separator, &number_separator_length, -+ &number_separator_width, &chars_per_number); - break; - case 'N': - skip_count = false; -@@ -1029,7 +1120,7 @@ main (int argc, char **argv) - old_s = false; - /* Reset an additional input of -s, -S dominates -s */ - col_sep_string = bad_cast (""); -- col_sep_length = 0; -+ col_sep_length = col_sep_width = 0; - use_col_separator = true; - if (optarg) - separator_string (optarg); -@@ -1186,10 +1277,45 @@ main (int argc, char **argv) - a number. */ - - static void --getoptarg (char *arg, char switch_char, char *character, int *number) -+getoptarg (char *arg, char switch_char, char *character, int *character_length, -+ int *character_width, int *number) - { - if (!ISDIGIT (*arg)) -- *character = *arg++; -+ { -+#ifdef HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) /* for multibyte locale. */ -+ { -+ wchar_t wc; -+ size_t mblength; -+ int width; -+ mbstate_t state = {'\0'}; -+ -+ mblength = mbrtowc (&wc, arg, strnlen(arg, MB_LEN_MAX), &state); -+ -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ *character_length = 1; -+ *character_width = 1; -+ } -+ else -+ { -+ *character_length = (mblength < 1) ? 1 : mblength; -+ width = wcwidth (wc); -+ *character_width = (width < 0) ? 0 : width; -+ } -+ -+ strncpy (character, arg, *character_length); -+ arg += *character_length; -+ } -+ else /* for single byte locale. 
*/ -+#endif -+ { -+ *character = *arg++; -+ *character_length = 1; -+ *character_width = 1; -+ } -+ } -+ - if (*arg) - { - long int tmp_long; -@@ -1248,7 +1374,7 @@ init_parameters (int number_of_files) - else - col_sep_string = column_separator; - -- col_sep_length = 1; -+ col_sep_length = col_sep_width = 1; - use_col_separator = true; - } - /* It's rather pointless to define a TAB separator with column -@@ -1279,11 +1405,11 @@ init_parameters (int number_of_files) - TAB_WIDTH (chars_per_input_tab, chars_per_number); */ - - /* Estimate chars_per_text without any margin and keep it constant. */ -- if (number_separator == '\t') -+ if (number_separator[0] == '\t') - number_width = chars_per_number + - TAB_WIDTH (chars_per_default_tab, chars_per_number); - else -- number_width = chars_per_number + 1; -+ number_width = chars_per_number + number_separator_width; - - /* The number is part of the column width unless we are - printing files in parallel. */ -@@ -1298,7 +1424,7 @@ init_parameters (int number_of_files) - } - - chars_per_column = (chars_per_line - chars_used_by_number - -- (columns - 1) * col_sep_length) / columns; -+ (columns - 1) * col_sep_width) / columns; - - if (chars_per_column < 1) - error (EXIT_FAILURE, 0, _("page width too narrow")); -@@ -1423,7 +1549,7 @@ init_funcs (void) - - /* Enlarge p->start_position of first column to use the same form of - padding_not_printed with all columns. */ -- h = h + col_sep_length; -+ h = h + col_sep_width; - - /* This loop takes care of all but the rightmost column. 
*/ - -@@ -1457,7 +1583,7 @@ init_funcs (void) - } - else - { -- h = h_next + col_sep_length; -+ h = h_next + col_sep_width; - h_next = h + chars_per_column; - } - } -@@ -1747,9 +1873,9 @@ static void - align_column (COLUMN *p) - { - padding_not_printed = p->start_position; -- if (padding_not_printed - col_sep_length > 0) -+ if (padding_not_printed - col_sep_width > 0) - { -- pad_across_to (padding_not_printed - col_sep_length); -+ pad_across_to (padding_not_printed - col_sep_width); - padding_not_printed = ANYWHERE; - } - -@@ -2020,13 +2146,13 @@ store_char (char c) - /* May be too generous. */ - buff = X2REALLOC (buff, &buff_allocated); - } -- buff[buff_current++] = c; -+ buff[buff_current++] = (unsigned char) c; - } - - static void - add_line_number (COLUMN *p) - { -- int i; -+ int i, j; - char *s; - int left_cut; - -@@ -2049,22 +2175,24 @@ add_line_number (COLUMN *p) - /* Tabification is assumed for multiple columns, also for n-separators, - but `default n-separator = TAB' hasn't been given priority over - equal column_width also specified by POSIX. */ -- if (number_separator == '\t') -+ if (number_separator[0] == '\t') - { - i = number_width - chars_per_number; - while (i-- > 0) - (p->char_func) (' '); - } - else -- (p->char_func) (number_separator); -+ for (j = 0; j < number_separator_length; j++) -+ (p->char_func) (number_separator[j]); - } - else - /* To comply with POSIX, we avoid any expansion of default TAB - separator with a single column output. No column_width requirement - has to be considered. 
*/ - { -- (p->char_func) (number_separator); -- if (number_separator == '\t') -+ for (j = 0; j < number_separator_length; j++) -+ (p->char_func) (number_separator[j]); -+ if (number_separator[0] == '\t') - output_position = POS_AFTER_TAB (chars_per_output_tab, - output_position); - } -@@ -2225,7 +2353,7 @@ print_white_space (void) - while (goal - h_old > 1 - && (h_new = POS_AFTER_TAB (chars_per_output_tab, h_old)) <= goal) - { -- putchar (output_tab_char); -+ fwrite (output_tab_char, sizeof(char), output_tab_char_length, stdout); - h_old = h_new; - } - while (++h_old <= goal) -@@ -2245,6 +2373,7 @@ print_sep_string (void) - { - char *s; - int l = col_sep_length; -+ int not_space_flag; - - s = col_sep_string; - -@@ -2258,6 +2387,7 @@ print_sep_string (void) - { - for (; separators_not_printed > 0; --separators_not_printed) - { -+ not_space_flag = 0; - while (l-- > 0) - { - /* 3 types of sep_strings: spaces only, spaces and chars, -@@ -2271,12 +2401,15 @@ print_sep_string (void) - } - else - { -+ not_space_flag = 1; - if (spaces_not_printed > 0) - print_white_space (); - putchar (*s++); -- ++output_position; - } - } -+ if (not_space_flag) -+ output_position += col_sep_width; -+ - /* sep_string ends with some spaces */ - if (spaces_not_printed > 0) - print_white_space (); -@@ -2304,7 +2437,7 @@ print_clump (COLUMN *p, int n, char *clu - required number of tabs and spaces. 
*/ - - static void --print_char (char c) -+print_char_single (char c) - { - if (tabify_output) - { -@@ -2328,6 +2461,74 @@ print_char (char c) - putchar (c); - } - -+#ifdef HAVE_MBRTOWC -+static void -+print_char_multi (char c) -+{ -+ static size_t mbc_pos = 0; -+ static char mbc[MB_LEN_MAX] = {'\0'}; -+ static mbstate_t state = {'\0'}; -+ mbstate_t state_bak; -+ wchar_t wc; -+ size_t mblength; -+ int width; -+ -+ if (tabify_output) -+ { -+ state_bak = state; -+ mbc[mbc_pos++] = c; -+ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); -+ -+ while (mbc_pos > 0) -+ { -+ switch (mblength) -+ { -+ case (size_t)-2: -+ state = state_bak; -+ return; -+ -+ case (size_t)-1: -+ state = state_bak; -+ ++output_position; -+ putchar (mbc[0]); -+ memmove (mbc, mbc + 1, MB_CUR_MAX - 1); -+ --mbc_pos; -+ break; -+ -+ case 0: -+ mblength = 1; -+ -+ default: -+ if (wc == L' ') -+ { -+ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); -+ --mbc_pos; -+ ++spaces_not_printed; -+ return; -+ } -+ else if (spaces_not_printed > 0) -+ print_white_space (); -+ -+ /* Nonprintables are assumed to have width 0, except L'\b'. */ -+ if ((width = wcwidth (wc)) < 1) -+ { -+ if (wc == L'\b') -+ --output_position; -+ } -+ else -+ output_position += width; -+ -+ fwrite (mbc, sizeof(char), mblength, stdout); -+ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); -+ mbc_pos -= mblength; -+ } -+ } -+ return; -+ } -+ putchar (c); -+} -+#endif -+ - /* Skip to page PAGE before printing. - PAGE may be larger than total number of pages. 
*/ - -@@ -2507,9 +2708,9 @@ read_line (COLUMN *p) - align_empty_cols = false; - } - -- if (padding_not_printed - col_sep_length > 0) -+ if (padding_not_printed - col_sep_width > 0) - { -- pad_across_to (padding_not_printed - col_sep_length); -+ pad_across_to (padding_not_printed - col_sep_width); - padding_not_printed = ANYWHERE; - } - -@@ -2610,9 +2811,9 @@ print_stored (COLUMN *p) - } - } - -- if (padding_not_printed - col_sep_length > 0) -+ if (padding_not_printed - col_sep_width > 0) - { -- pad_across_to (padding_not_printed - col_sep_length); -+ pad_across_to (padding_not_printed - col_sep_width); - padding_not_printed = ANYWHERE; - } - -@@ -2625,8 +2826,8 @@ print_stored (COLUMN *p) - if (spaces_not_printed == 0) - { - output_position = p->start_position + end_vector[line]; -- if (p->start_position - col_sep_length == chars_per_margin) -- output_position -= col_sep_length; -+ if (p->start_position - col_sep_width == chars_per_margin) -+ output_position -= col_sep_width; - } - - return true; -@@ -2645,7 +2846,7 @@ print_stored (COLUMN *p) - number of characters is 1.) 
*/ - - static int --char_to_clump (char c) -+char_to_clump_single (char c) - { - unsigned char uc = c; - char *s = clump_buff; -@@ -2655,10 +2856,10 @@ char_to_clump (char c) - int chars; - int chars_per_c = 8; - -- if (c == input_tab_char) -+ if (c == input_tab_char[0]) - chars_per_c = chars_per_input_tab; - -- if (c == input_tab_char || c == '\t') -+ if (c == input_tab_char[0] || c == '\t') - { - width = TAB_WIDTH (chars_per_c, input_position); - -@@ -2739,6 +2940,154 @@ char_to_clump (char c) - return chars; - } - -+#ifdef HAVE_MBRTOWC -+static int -+char_to_clump_multi (char c) -+{ -+ static size_t mbc_pos = 0; -+ static char mbc[MB_LEN_MAX] = {'\0'}; -+ static mbstate_t state = {'\0'}; -+ mbstate_t state_bak; -+ wchar_t wc; -+ size_t mblength; -+ int wc_width; -+ register char *s = clump_buff; -+ register int i, j; -+ char esc_buff[4]; -+ int width; -+ int chars; -+ int chars_per_c = 8; -+ -+ state_bak = state; -+ mbc[mbc_pos++] = c; -+ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); -+ -+ width = 0; -+ chars = 0; -+ while (mbc_pos > 0) -+ { -+ switch (mblength) -+ { -+ case (size_t)-2: -+ state = state_bak; -+ return 0; -+ -+ case (size_t)-1: -+ state = state_bak; -+ mblength = 1; -+ -+ if (use_esc_sequence || use_cntrl_prefix) -+ { -+ width = +4; -+ chars = +4; -+ *s++ = '\\'; -+ sprintf (esc_buff, "%03o", mbc[0]); -+ for (i = 0; i <= 2; ++i) -+ *s++ = (int) esc_buff[i]; -+ } -+ else -+ { -+ width += 1; -+ chars += 1; -+ *s++ = mbc[0]; -+ } -+ break; -+ -+ case 0: -+ mblength = 1; -+ /* Fall through */ -+ -+ default: -+ if (memcmp (mbc, input_tab_char, mblength) == 0) -+ chars_per_c = chars_per_input_tab; -+ -+ if (memcmp (mbc, input_tab_char, mblength) == 0 || c == '\t') -+ { -+ int width_inc; -+ -+ width_inc = TAB_WIDTH (chars_per_c, input_position); -+ width += width_inc; -+ -+ if (untabify_input) -+ { -+ for (i = width_inc; i; --i) -+ *s++ = ' '; -+ chars += width_inc; -+ } -+ else -+ { -+ for (i = 0; i < mblength; i++) -+ *s++ = mbc[i]; -+ chars += 
mblength; -+ } -+ } -+ else if ((wc_width = wcwidth (wc)) < 1) -+ { -+ if (use_esc_sequence) -+ { -+ for (i = 0; i < mblength; i++) -+ { -+ width += 4; -+ chars += 4; -+ *s++ = '\\'; -+ sprintf (esc_buff, "%03o", c); -+ for (j = 0; j <= 2; ++j) -+ *s++ = (int) esc_buff[j]; -+ } -+ } -+ else if (use_cntrl_prefix) -+ { -+ if (wc < 0200) -+ { -+ width += 2; -+ chars += 2; -+ *s++ = '^'; -+ *s++ = wc ^ 0100; -+ } -+ else -+ { -+ for (i = 0; i < mblength; i++) -+ { -+ width += 4; -+ chars += 4; -+ *s++ = '\\'; -+ sprintf (esc_buff, "%03o", c); -+ for (j = 0; j <= 2; ++j) -+ *s++ = (int) esc_buff[j]; -+ } -+ } -+ } -+ else if (wc == L'\b') -+ { -+ width += -1; -+ chars += 1; -+ *s++ = c; -+ } -+ else -+ { -+ width += 0; -+ chars += mblength; -+ for (i = 0; i < mblength; i++) -+ *s++ = mbc[i]; -+ } -+ } -+ else -+ { -+ width += wc_width; -+ chars += mblength; -+ for (i = 0; i < mblength; i++) -+ *s++ = mbc[i]; -+ } -+ } -+ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); -+ mbc_pos -= mblength; -+ } -+ -+ input_position += width; -+ return chars; -+} -+#endif -+ - /* We've just printed some files and need to clean up things before - looking for more options and printing the next batch of files. - -Index: src/sort.c -=================================================================== ---- src/sort.c.orig 2010-04-21 09:06:17.000000000 +0200 -+++ src/sort.c 2010-05-07 16:34:36.664210645 +0200 -@@ -22,10 +22,19 @@ - - #include - -+#include - #include - #include - #include - #include -+#if HAVE_WCHAR_H -+# include -+#endif -+/* Get isw* functions. */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ - #include "system.h" - #include "argmatch.h" - #include "error.h" -@@ -124,14 +133,38 @@ static int decimal_point; - /* Thousands separator; if -1, then there isn't one. */ - static int thousands_sep; - -+static int force_general_numcompare = 0; -+ - /* Nonzero if the corresponding locales are hard. 
*/ - static bool hard_LC_COLLATE; --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - static bool hard_LC_TIME; - #endif - - #define NONZERO(x) ((x) != 0) - -+/* get a multibyte character's byte length. */ -+#define GET_BYTELEN_OF_CHAR(LIM, PTR, MBLENGTH, STATE) \ -+ do \ -+ { \ -+ wchar_t wc; \ -+ mbstate_t state_bak; \ -+ \ -+ state_bak = STATE; \ -+ mblength = mbrtowc (&wc, PTR, LIM - PTR, &STATE); \ -+ \ -+ switch (MBLENGTH) \ -+ { \ -+ case (size_t)-1: \ -+ case (size_t)-2: \ -+ STATE = state_bak; \ -+ /* Fall through. */ \ -+ case 0: \ -+ MBLENGTH = 1; \ -+ } \ -+ } \ -+ while (0) -+ - /* The kind of blanks for '-b' to skip in various options. */ - enum blanktype { bl_start, bl_end, bl_both }; - -@@ -270,13 +303,11 @@ static bool reverse; - they were read if all keys compare equal. */ - static bool stable; - --/* If TAB has this value, blanks separate fields. */ --enum { TAB_DEFAULT = CHAR_MAX + 1 }; -- --/* Tab character separating fields. If TAB_DEFAULT, then fields are -+/* Tab character separating fields. If tab_length is 0, then fields are - separated by the empty string between a non-blank character and a blank - character. */ --static int tab = TAB_DEFAULT; -+static char tab[MB_LEN_MAX + 1]; -+static size_t tab_length = 0; - - /* Flag to remove consecutive duplicate lines from the output. - Only the last of a sequence of equal lines will be output. */ -@@ -714,6 +745,44 @@ reap_some (void) - update_proc (pid); - } - -+/* Function pointers. */ -+static void -+(*inittables) (void); -+static char * -+(*begfield) (const struct line*, const struct keyfield *); -+static char * -+(*limfield) (const struct line*, const struct keyfield *); -+static int -+(*getmonth) (char const *, size_t); -+static int -+(*keycompare) (const struct line *, const struct line *); -+static int -+(*numcompare) (const char *, const char *); -+ -+/* Test for white space multibyte character. -+ Set LENGTH the byte length of investigated multibyte character. 
*/ -+#if HAVE_MBRTOWC -+static int -+ismbblank (const char *str, size_t len, size_t *length) -+{ -+ size_t mblength; -+ wchar_t wc; -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ mblength = mbrtowc (&wc, str, len, &state); -+ -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ *length = 1; -+ return 0; -+ } -+ -+ *length = (mblength < 1) ? 1 : mblength; -+ return iswblank (wc); -+} -+#endif -+ - /* Clean up any remaining temporary files. */ - - static void -@@ -1158,7 +1227,7 @@ zaptemp (const char *name) - free (node); - } - --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - - static int - struct_month_cmp (const void *m1, const void *m2) -@@ -1173,7 +1242,7 @@ struct_month_cmp (const void *m1, const - /* Initialize the character class tables. */ - - static void --inittables (void) -+inittables_uni (void) - { - size_t i; - -@@ -1185,7 +1254,7 @@ inittables (void) - fold_toupper[i] = toupper (i); - } - --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - /* If we're not in the "C" locale, read different names for months. 
*/ - if (hard_LC_TIME) - { -@@ -1268,6 +1337,64 @@ specify_nmerge (int oi, char c, char con - xstrtol_fatal (e, oi, c, long_options, s); - } - -+#if HAVE_MBRTOWC -+static void -+inittables_mb (void) -+{ -+ int i, j, k, l; -+ char *name, *s; -+ size_t s_len, mblength; -+ char mbc[MB_LEN_MAX]; -+ wchar_t wc, pwc; -+ mbstate_t state_mb, state_wc; -+ -+ for (i = 0; i < MONTHS_PER_YEAR; i++) -+ { -+ s = (char *) nl_langinfo (ABMON_1 + i); -+ s_len = strlen (s); -+ monthtab[i].name = name = (char *) xmalloc (s_len + 1); -+ monthtab[i].val = i + 1; -+ -+ memset (&state_mb, '\0', sizeof (mbstate_t)); -+ memset (&state_wc, '\0', sizeof (mbstate_t)); -+ -+ for (j = 0; j < s_len;) -+ { -+ if (!ismbblank (s + j, s_len - j, &mblength)) -+ break; -+ j += mblength; -+ } -+ -+ for (k = 0; j < s_len;) -+ { -+ mblength = mbrtowc (&wc, (s + j), (s_len - j), &state_mb); -+ assert (mblength != (size_t)-1 && mblength != (size_t)-2); -+ if (mblength == 0) -+ break; -+ -+ pwc = towupper (wc); -+ if (pwc == wc) -+ { -+ memcpy (mbc, s + j, mblength); -+ j += mblength; -+ } -+ else -+ { -+ j += mblength; -+ mblength = wcrtomb (mbc, pwc, &state_wc); -+ assert (mblength != (size_t)0 && mblength != (size_t)-1); -+ } -+ -+ for (l = 0; l < mblength; l++) -+ name[k++] = mbc[l]; -+ } -+ name[k] = '\0'; -+ } -+ qsort ((void *) monthtab, MONTHS_PER_YEAR, -+ sizeof (struct month), struct_month_cmp); -+} -+#endif -+ - /* Specify the amount of main memory to use when sorting. */ - static void - specify_sort_size (int oi, char c, char const *s) -@@ -1478,7 +1605,7 @@ buffer_linelim (struct buffer const *buf - by KEY in LINE. 
*/ - - static char * --begfield (const struct line *line, const struct keyfield *key) -+begfield_uni (const struct line *line, const struct keyfield *key) - { - char *ptr = line->text, *lim = ptr + line->length - 1; - size_t sword = key->sword; -@@ -1487,10 +1614,10 @@ begfield (const struct line *line, const - /* The leading field separator itself is included in a field when -t - is absent. */ - -- if (tab != TAB_DEFAULT) -+ if (tab_length) - while (ptr < lim && sword--) - { -- while (ptr < lim && *ptr != tab) -+ while (ptr < lim && *ptr != tab[0]) - ++ptr; - if (ptr < lim) - ++ptr; -@@ -1516,11 +1643,70 @@ begfield (const struct line *line, const - return ptr; - } - -+#if HAVE_MBRTOWC -+static char * -+begfield_mb (const struct line *line, const struct keyfield *key) -+{ -+ int i; -+ char *ptr = line->text, *lim = ptr + line->length - 1; -+ size_t sword = key->sword; -+ size_t schar = key->schar; -+ size_t mblength; -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ if (tab_length) -+ while (ptr < lim && sword--) -+ { -+ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ } -+ else -+ while (ptr < lim && sword--) -+ { -+ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) -+ ptr += mblength; -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ while (ptr < lim && !ismbblank (ptr, lim - ptr, &mblength)) -+ ptr += mblength; -+ } -+ -+ if (key->skipsblanks) -+ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) -+ ptr += mblength; -+ -+ for (i = 0; i < schar; i++) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ -+ if (ptr + mblength > lim) -+ break; -+ else -+ ptr += mblength; -+ } -+ -+ return ptr; -+} -+#endif -+ - /* Return the limit of (a pointer to the first character after) the 
field - in LINE specified by KEY. */ - - static char * --limfield (const struct line *line, const struct keyfield *key) -+limfield_uni (const struct line *line, const struct keyfield *key) - { - char *ptr = line->text, *lim = ptr + line->length - 1; - size_t eword = key->eword, echar = key->echar; -@@ -1535,10 +1721,10 @@ limfield (const struct line *line, const - `beginning' is the first character following the delimiting TAB. - Otherwise, leave PTR pointing at the first `blank' character after - the preceding field. */ -- if (tab != TAB_DEFAULT) -+ if (tab_length) - while (ptr < lim && eword--) - { -- while (ptr < lim && *ptr != tab) -+ while (ptr < lim && *ptr != tab[0]) - ++ptr; - if (ptr < lim && (eword || echar)) - ++ptr; -@@ -1584,10 +1770,10 @@ limfield (const struct line *line, const - */ - - /* Make LIM point to the end of (one byte past) the current field. */ -- if (tab != TAB_DEFAULT) -+ if (tab_length) - { - char *newlim; -- newlim = memchr (ptr, tab, lim - ptr); -+ newlim = memchr (ptr, tab[0], lim - ptr); - if (newlim) - lim = newlim; - } -@@ -1618,6 +1804,113 @@ limfield (const struct line *line, const - return ptr; - } - -+#if HAVE_MBRTOWC -+static char * -+limfield_mb (const struct line *line, const struct keyfield *key) -+{ -+ char *ptr = line->text, *lim = ptr + line->length - 1; -+ size_t eword = key->eword, echar = key->echar; -+ int i; -+ size_t mblength; -+ mbstate_t state; -+ -+ if (echar == 0) -+ eword++; /* skip all of end field. 
*/ -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ if (tab_length) -+ while (ptr < lim && eword--) -+ { -+ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ if (ptr < lim && (eword | echar)) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ } -+ else -+ while (ptr < lim && eword--) -+ { -+ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) -+ ptr += mblength; -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ while (ptr < lim && !ismbblank (ptr, lim - ptr, &mblength)) -+ ptr += mblength; -+ } -+ -+ -+# ifdef POSIX_UNSPECIFIED -+ /* Make LIM point to the end of (one byte past) the current field. */ -+ if (tab_length) -+ { -+ char *newlim, *p; -+ -+ newlim = NULL; -+ for (p = ptr; p < lim;) -+ { -+ if (memcmp (p, tab, tab_length) == 0) -+ { -+ newlim = p; -+ break; -+ } -+ -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ p += mblength; -+ } -+ } -+ else -+ { -+ char *newlim; -+ newlim = ptr; -+ -+ while (newlim < lim && ismbblank (newlim, lim - newlim, &mblength)) -+ newlim += mblength; -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ while (newlim < lim && !ismbblank (newlim, lim - newlim, &mblength)) -+ newlim += mblength; -+ lim = newlim; -+ } -+# endif -+ -+ if (echar != 0) -+ { -+ /* If we're skipping leading blanks, don't start counting characters -+ * until after skipping past any leading blanks. */ -+ if (key->skipsblanks) -+ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) -+ ptr += mblength; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ /* Advance PTR by ECHAR (if possible), but no further than LIM. 
*/ -+ for (i = 0; i < echar; i++) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ -+ if (ptr + mblength > lim) -+ break; -+ else -+ ptr += mblength; -+ } -+ } -+ -+ return ptr; -+} -+#endif -+ - /* Fill BUF reading from FP, moving buf->left bytes from the end - of buf->buf to the beginning first. If EOF is reached and the - file wasn't terminated by a newline, supply one. Set up BUF's line -@@ -1700,8 +1993,24 @@ fillbuf (struct buffer *buf, FILE *fp, c - else - { - if (key->skipsblanks) -- while (blanks[to_uchar (*line_start)]) -- line_start++; -+ { -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ size_t mblength; -+ mbstate_t state; -+ memset (&state, '\0', sizeof(mbstate_t)); -+ while (line_start < line->keylim && -+ ismbblank (line_start, -+ line->keylim - line_start, -+ &mblength)) -+ line_start += mblength; -+ } -+ else -+#endif -+ while (blanks[to_uchar (*line_start)]) -+ line_start++; -+ } - line->keybeg = line_start; - } - } -@@ -1739,7 +2048,7 @@ fillbuf (struct buffer *buf, FILE *fp, c - hideously fast. */ - - static int --numcompare (const char *a, const char *b) -+numcompare_uni (const char *a, const char *b) - { - while (blanks[to_uchar (*a)]) - a++; -@@ -1848,6 +2157,25 @@ human_numcompare (const char *a, const c - : strnumcmp (a, b, decimal_point, thousands_sep)); - } - -+#if HAVE_MBRTOWC -+static int -+numcompare_mb (const char *a, const char *b) -+{ -+ size_t mblength, len; -+ len = strlen (a); /* okay for UTF-8 */ -+ while (*a && ismbblank (a, len > MB_CUR_MAX ? MB_CUR_MAX : len, &mblength)) -+ { -+ a += mblength; -+ len -= mblength; -+ } -+ len = strlen (b); /* okay for UTF-8 */ -+ while (*b && ismbblank (b, len > MB_CUR_MAX ? 
MB_CUR_MAX : len, &mblength)) -+ b += mblength; -+ -+ return strnumcmp (a, b, decimal_point, thousands_sep); -+} -+#endif /* HAV_EMBRTOWC */ -+ - static int - general_numcompare (const char *sa, const char *sb) - { -@@ -1881,7 +2209,7 @@ general_numcompare (const char *sa, cons - Return 0 if the name in S is not recognized. */ - - static int --getmonth (char const *month, size_t len) -+getmonth_uni (char const *month, size_t len) - { - size_t lo = 0; - size_t hi = MONTHS_PER_YEAR; -@@ -2062,11 +2390,79 @@ compare_version (char *restrict texta, s - return diff; - } - -+#if HAVE_MBRTOWC -+static int -+getmonth_mb (const char *s, size_t len) -+{ -+ char *month; -+ register size_t i; -+ register int lo = 0, hi = MONTHS_PER_YEAR, result; -+ char *tmp; -+ size_t wclength, mblength; -+ const char **pp; -+ const wchar_t **wpp; -+ wchar_t *month_wcs; -+ mbstate_t state; -+ -+ while (len > 0 && ismbblank (s, len, &mblength)) -+ { -+ s += mblength; -+ len -= mblength; -+ } -+ -+ if (len == 0) -+ return 0; -+ -+ month = (char *) alloca (len + 1); -+ -+ tmp = (char *) alloca (len + 1); -+ memcpy (tmp, s, len); -+ tmp[len] = '\0'; -+ pp = (const char **)&tmp; -+ month_wcs = (wchar_t *) alloca ((len + 1) * sizeof (wchar_t)); -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ wclength = mbsrtowcs (month_wcs, pp, len + 1, &state); -+ assert (wclength != (size_t)-1 && *pp == NULL); -+ -+ for (i = 0; i < wclength; i++) -+ { -+ month_wcs[i] = towupper(month_wcs[i]); -+ if (iswblank (month_wcs[i])) -+ { -+ month_wcs[i] = L'\0'; -+ break; -+ } -+ } -+ -+ wpp = (const wchar_t **)&month_wcs; -+ -+ mblength = wcsrtombs (month, wpp, len + 1, &state); -+ assert (mblength != (-1) && *wpp == NULL); -+ -+ do -+ { -+ int ix = (lo + hi) / 2; -+ -+ if (strncmp (month, monthtab[ix].name, strlen (monthtab[ix].name)) < 0) -+ hi = ix; -+ else -+ lo = ix; -+ } -+ while (hi - lo > 1); -+ -+ result = (!strncmp (month, monthtab[lo].name, strlen (monthtab[lo].name)) -+ ? 
monthtab[lo].val : 0); -+ -+ return result; -+} -+#endif -+ - /* Compare two lines A and B trying every key in sequence until there - are no more keys or a difference is found. */ - - static int --keycompare (const struct line *a, const struct line *b) -+keycompare_uni (const struct line *a, const struct line *b) - { - struct keyfield *key = keylist; - -@@ -2246,6 +2642,179 @@ keycompare (const struct line *a, const - return key->reverse ? -diff : diff; - } - -+#if HAVE_MBRTOWC -+static int -+keycompare_mb (const struct line *a, const struct line *b) -+{ -+ struct keyfield *key = keylist; -+ -+ /* For the first iteration only, the key positions have been -+ precomputed for us. */ -+ char *texta = a->keybeg; -+ char *textb = b->keybeg; -+ char *lima = a->keylim; -+ char *limb = b->keylim; -+ -+ size_t mblength_a, mblength_b; -+ wchar_t wc_a, wc_b; -+ mbstate_t state_a, state_b; -+ -+ int diff; -+ -+ memset (&state_a, '\0', sizeof(mbstate_t)); -+ memset (&state_b, '\0', sizeof(mbstate_t)); -+ -+ for (;;) -+ { -+ char const *translate = key->translate; -+ bool const *ignore = key->ignore; -+ -+ /* Find the lengths. */ -+ size_t lena = lima <= texta ? 0 : lima - texta; -+ size_t lenb = limb <= textb ? 0 : limb - textb; -+ -+ /* Actually compare the fields. */ -+ if (key->random) -+ diff = compare_random (texta, lena, textb, lenb); -+ else if (key->numeric | key->general_numeric | key->human_numeric) -+ { -+ char savea = *lima, saveb = *limb; -+ -+ *lima = *limb = '\0'; -+ diff = (key->numeric ? numcompare (texta, textb) -+ : key->general_numeric ? 
general_numcompare (texta, textb) -+ : human_numcompare (texta, textb, key)); -+ *lima = savea, *limb = saveb; -+ } -+ else if (key->version) -+ diff = compare_version (texta, lena, textb, lenb); -+ else if (key->month) -+ diff = getmonth (texta, lena) - getmonth (textb, lenb); -+ else -+ { -+ if (ignore || translate) -+ { -+ char *copy_a = (char *) alloca (lena + 1 + lenb + 1); -+ char *copy_b = copy_a + lena + 1; -+ size_t new_len_a, new_len_b; -+ size_t i, j; -+ -+ /* Ignore and/or translate chars before comparing. */ -+# define IGNORE_CHARS(NEW_LEN, LEN, TEXT, COPY, WC, MBLENGTH, STATE) \ -+ do \ -+ { \ -+ wchar_t uwc; \ -+ char mbc[MB_LEN_MAX]; \ -+ mbstate_t state_wc; \ -+ \ -+ for (NEW_LEN = i = 0; i < LEN;) \ -+ { \ -+ mbstate_t state_bak; \ -+ \ -+ state_bak = STATE; \ -+ MBLENGTH = mbrtowc (&WC, TEXT + i, LEN - i, &STATE); \ -+ \ -+ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1 \ -+ || MBLENGTH == 0) \ -+ { \ -+ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ -+ STATE = state_bak; \ -+ if (!ignore) \ -+ COPY[NEW_LEN++] = TEXT[i++]; \ -+ continue; \ -+ } \ -+ \ -+ if (ignore) \ -+ { \ -+ if ((ignore == nonprinting && !iswprint (WC)) \ -+ || (ignore == nondictionary \ -+ && !iswalnum (WC) && !iswblank (WC))) \ -+ { \ -+ i += MBLENGTH; \ -+ continue; \ -+ } \ -+ } \ -+ \ -+ if (translate) \ -+ { \ -+ \ -+ uwc = towupper(WC); \ -+ if (WC == uwc) \ -+ { \ -+ memcpy (mbc, TEXT + i, MBLENGTH); \ -+ i += MBLENGTH; \ -+ } \ -+ else \ -+ { \ -+ i += MBLENGTH; \ -+ WC = uwc; \ -+ memset (&state_wc, '\0', sizeof (mbstate_t)); \ -+ \ -+ MBLENGTH = wcrtomb (mbc, WC, &state_wc); \ -+ assert (MBLENGTH != (size_t)-1 && MBLENGTH != 0); \ -+ } \ -+ \ -+ for (j = 0; j < MBLENGTH; j++) \ -+ COPY[NEW_LEN++] = mbc[j]; \ -+ } \ -+ else \ -+ for (j = 0; j < MBLENGTH; j++) \ -+ COPY[NEW_LEN++] = TEXT[i++]; \ -+ } \ -+ COPY[NEW_LEN] = '\0'; \ -+ } \ -+ while (0) -+ IGNORE_CHARS (new_len_a, lena, texta, copy_a, -+ wc_a, mblength_a, state_a); -+ IGNORE_CHARS 
(new_len_b, lenb, textb, copy_b, -+ wc_b, mblength_b, state_b); -+ diff = xmemcoll (copy_a, new_len_a, copy_b, new_len_b); -+ } -+ else if (lena == 0) -+ diff = - NONZERO (lenb); -+ else if (lenb == 0) -+ goto greater; -+ else -+ diff = xmemcoll (texta, lena, textb, lenb); -+ } -+ -+ if (diff) -+ goto not_equal; -+ -+ key = key->next; -+ if (! key) -+ break; -+ -+ /* Find the beginning and limit of the next field. */ -+ if (key->eword != -1) -+ lima = limfield (a, key), limb = limfield (b, key); -+ else -+ lima = a->text + a->length - 1, limb = b->text + b->length - 1; -+ -+ if (key->sword != -1) -+ texta = begfield (a, key), textb = begfield (b, key); -+ else -+ { -+ texta = a->text, textb = b->text; -+ if (key->skipsblanks) -+ { -+ while (texta < lima && ismbblank (texta, lima - texta, &mblength_a)) -+ texta += mblength_a; -+ while (textb < limb && ismbblank (textb, limb - textb, &mblength_b)) -+ textb += mblength_b; -+ } -+ } -+ } -+ -+ return 0; -+ -+greater: -+ diff = 1; -+not_equal: -+ return key->reverse ? -diff : diff; -+} -+#endif -+ - /* Compare two lines A and B, returning negative, zero, or positive - depending on whether A compares less than, equal to, or greater than B. 
*/ - -@@ -3244,7 +3813,7 @@ main (int argc, char **argv) - initialize_exit_failure (SORT_FAILURE); - - hard_LC_COLLATE = hard_locale (LC_COLLATE); --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - hard_LC_TIME = hard_locale (LC_TIME); - #endif - -@@ -3265,6 +3834,27 @@ main (int argc, char **argv) - thousands_sep = -1; - } - -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ inittables = inittables_mb; -+ begfield = begfield_mb; -+ limfield = limfield_mb; -+ getmonth = getmonth_mb; -+ keycompare = keycompare_mb; -+ numcompare = numcompare_mb; -+ } -+ else -+#endif -+ { -+ inittables = inittables_uni; -+ begfield = begfield_uni; -+ limfield = limfield_uni; -+ getmonth = getmonth_uni; -+ keycompare = keycompare_uni; -+ numcompare = numcompare_uni; -+ } -+ - have_read_stdin = false; - inittables (); - -@@ -3536,13 +4126,35 @@ main (int argc, char **argv) - - case 't': - { -- char newtab = optarg[0]; -- if (! newtab) -+ char newtab[MB_LEN_MAX + 1]; -+ size_t newtab_length = 1; -+ strncpy (newtab, optarg, MB_LEN_MAX); -+ if (! newtab[0]) - error (SORT_FAILURE, 0, _("empty tab")); -- if (optarg[1]) -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ wchar_t wc; -+ mbstate_t state; -+ size_t i; -+ -+ memset (&state, '\0', sizeof (mbstate_t)); -+ newtab_length = mbrtowc (&wc, newtab, strnlen (newtab, -+ MB_LEN_MAX), -+ &state); -+ switch (newtab_length) -+ { -+ case (size_t) -1: -+ case (size_t) -2: -+ case 0: -+ newtab_length = 1; -+ } -+ } -+#endif -+ if (newtab_length == 1 && optarg[1]) - { - if (STREQ (optarg, "\\0")) -- newtab = '\0'; -+ newtab[0] = '\0'; - else - { - /* Provoke with `sort -txx'. 
Complain about -@@ -3553,9 +4165,12 @@ main (int argc, char **argv) - quote (optarg)); - } - } -- if (tab != TAB_DEFAULT && tab != newtab) -+ if (tab_length -+ && (tab_length != newtab_length -+ || memcmp (tab, newtab, tab_length) != 0)) - error (SORT_FAILURE, 0, _("incompatible tabs")); -- tab = newtab; -+ memcpy (tab, newtab, newtab_length); -+ tab_length = newtab_length; - } - break; - -Index: src/unexpand.c -=================================================================== ---- src/unexpand.c.orig 2010-01-01 14:06:47.000000000 +0100 -+++ src/unexpand.c 2010-05-07 16:13:31.016492129 +0200 -@@ -39,11 +39,28 @@ - #include - #include - #include -+ -+/* Get mbstate_t, mbrtowc(), wcwidth(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ - #include "system.h" - #include "error.h" - #include "quote.h" - #include "xstrndup.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "unexpand" - -@@ -103,6 +120,208 @@ static struct option const longopts[] = - {NULL, 0, NULL, 0} - }; - -+static FILE *next_file (FILE *fp); -+ -+#if HAVE_MBRTOWC -+static void -+unexpand_multibyte (void) -+{ -+ FILE *fp; /* Input stream. */ -+ mbstate_t i_state; /* Current shift state of the input stream. */ -+ mbstate_t i_state_bak; /* Back up the I_STATE. */ -+ mbstate_t o_state; /* Current shift state of the output stream. */ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen = 0; /* The length of the byte sequence in buf. */ -+ wint_t wc; /* A gotten wide character. 
*/ -+ size_t mblength; /* The byte size of a multibyte character -+ which shows as same character as WC. */ -+ -+ /* Index in `tab_list' of next tabstop: */ -+ int tab_index = 0; /* For calculating width of pending tabs. */ -+ int print_tab_index = 0; /* For printing as many tabs as possible. */ -+ unsigned int column = 0; /* Column on screen of next char. */ -+ int next_tab_column; /* Column the next tab stop is on. */ -+ int convert = 1; /* If nonzero, perform translations. */ -+ unsigned int pending = 0; /* Pending columns of blanks. */ -+ -+ fp = next_file ((FILE *) NULL); -+ if (fp == NULL) -+ return; -+ -+ memset (&o_state, '\0', sizeof(mbstate_t)); -+ memset (&i_state, '\0', sizeof(mbstate_t)); -+ -+ for (;;) -+ { -+ if (buflen < MB_LEN_MAX && !feof(fp) && !ferror(fp)) -+ { -+ memmove (buf, bufpos, buflen); -+ buflen += fread (buf + buflen, sizeof(char), BUFSIZ, fp); -+ bufpos = buf; -+ } -+ -+ /* Get a wide character. */ -+ if (buflen < 1) -+ { -+ mblength = 1; -+ wc = WEOF; -+ } -+ else -+ { -+ i_state_bak = i_state; -+ mblength = mbrtowc ((wchar_t *)&wc, bufpos, buflen, &i_state); -+ } -+ -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ i_state = i_state_bak; -+ wc = L'\0'; -+ } -+ -+ if (wc == L' ' && convert && column < INT_MAX) -+ { -+ ++pending; -+ ++column; -+ } -+ else if (wc == L'\t' && convert) -+ { -+ if (tab_size == 0) -+ { -+ /* Do not let tab_index == first_free_tab; -+ stop when it is 1 less. */ -+ while (tab_index < first_free_tab - 1 -+ && column >= tab_list[tab_index]) -+ tab_index++; -+ next_tab_column = tab_list[tab_index]; -+ if (tab_index < first_free_tab - 1) -+ tab_index++; -+ if (column >= next_tab_column) -+ { -+ convert = 0; /* Ran out of tab stops. */ -+ goto flush_pend_mb; -+ } -+ } -+ else -+ { -+ next_tab_column = column + tab_size - column % tab_size; -+ } -+ pending += next_tab_column - column; -+ column = next_tab_column; -+ } -+ else -+ { -+flush_pend_mb: -+ /* Flush pending spaces. 
Print as many tabs as possible, -+ then print the rest as spaces. */ -+ if (pending == 1) -+ { -+ putchar (' '); -+ pending = 0; -+ } -+ column -= pending; -+ while (pending > 0) -+ { -+ if (tab_size == 0) -+ { -+ /* Do not let print_tab_index == first_free_tab; -+ stop when it is 1 less. */ -+ while (print_tab_index < first_free_tab - 1 -+ && column >= tab_list[print_tab_index]) -+ print_tab_index++; -+ next_tab_column = tab_list[print_tab_index]; -+ if (print_tab_index < first_free_tab - 1) -+ print_tab_index++; -+ } -+ else -+ { -+ next_tab_column = -+ column + tab_size - column % tab_size; -+ } -+ if (next_tab_column - column <= pending) -+ { -+ putchar ('\t'); -+ pending -= next_tab_column - column; -+ column = next_tab_column; -+ } -+ else -+ { -+ --print_tab_index; -+ column += pending; -+ while (pending != 0) -+ { -+ putchar (' '); -+ pending--; -+ } -+ } -+ } -+ -+ if (wc == WEOF) -+ { -+ fp = next_file (fp); -+ if (fp == NULL) -+ break; /* No more files. */ -+ else -+ { -+ memset (&i_state, '\0', sizeof(mbstate_t)); -+ continue; -+ } -+ } -+ -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ if (convert) -+ { -+ ++column; -+ if (convert_entire_line == 0) -+ convert = 0; -+ } -+ mblength = 1; -+ putchar (buf[0]); -+ } -+ else if (mblength == 0) -+ { -+ if (convert && convert_entire_line == 0) -+ convert = 0; -+ mblength = 1; -+ putchar ('\0'); -+ } -+ else -+ { -+ if (convert) -+ { -+ if (wc == L'\b') -+ { -+ if (column > 0) -+ --column; -+ } -+ else -+ { -+ int width; /* The width of WC. */ -+ -+ width = wcwidth (wc); -+ column += (width > 0) ? 
width : 0; -+ if (convert_entire_line == 0) -+ convert = 0; -+ } -+ } -+ -+ if (wc == L'\n') -+ { -+ tab_index = print_tab_index = 0; -+ column = pending = 0; -+ convert = 1; -+ } -+ fwrite (bufpos, sizeof(char), mblength, stdout); -+ } -+ } -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+} -+#endif -+ -+ - void - usage (int status) - { -@@ -524,7 +743,12 @@ main (int argc, char **argv) - - file_list = (optind < argc ? &argv[optind] : stdin_argv); - -- unexpand (); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ unexpand_multibyte (); -+ else -+#endif -+ unexpand (); - - if (have_read_stdin && fclose (stdin) != 0) - error (EXIT_FAILURE, errno, "-"); -Index: src/uniq.c -=================================================================== ---- src/uniq.c.orig 2010-03-13 16:14:09.000000000 +0100 -+++ src/uniq.c 2010-05-07 16:41:34.000063405 +0200 -@@ -21,6 +21,16 @@ - #include - #include - -+/* Get mbstate_t, mbrtowc(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get isw* functions. */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ - #include "system.h" - #include "argmatch.h" - #include "linebuffer.h" -@@ -31,7 +41,19 @@ - #include "stdio--.h" - #include "xmemcoll.h" - #include "xstrtol.h" --#include "memcasecmp.h" -+#include "xmemcoll.h" -+ -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "uniq" -@@ -107,6 +129,10 @@ static enum delimit_method const delimit - /* Select whether/how to delimit groups of duplicate lines. */ - static enum delimit_method delimit_groups; - -+/* Function pointers. 
*/ -+static char * -+(*find_field) (struct linebuffer *line); -+ - static struct option const longopts[] = - { - {"count", no_argument, NULL, 'c'}, -@@ -206,7 +232,7 @@ size_opt (char const *opt, char const *m - return a pointer to the beginning of the line's field to be compared. */ - - static char * --find_field (struct linebuffer const *line) -+find_field_uni (struct linebuffer *line) - { - size_t count; - char const *lp = line->buffer; -@@ -227,6 +253,83 @@ find_field (struct linebuffer const *lin - return line->buffer + i; - } - -+#if HAVE_MBRTOWC -+ -+# define MBCHAR_TO_WCHAR(WC, MBLENGTH, LP, POS, SIZE, STATEP, CONVFAIL) \ -+ do \ -+ { \ -+ mbstate_t state_bak; \ -+ \ -+ CONVFAIL = 0; \ -+ state_bak = *STATEP; \ -+ \ -+ MBLENGTH = mbrtowc (&WC, LP + POS, SIZE - POS, STATEP); \ -+ \ -+ switch (MBLENGTH) \ -+ { \ -+ case (size_t)-2: \ -+ case (size_t)-1: \ -+ *STATEP = state_bak; \ -+ CONVFAIL++; \ -+ /* Fall through */ \ -+ case 0: \ -+ MBLENGTH = 1; \ -+ } \ -+ } \ -+ while (0) -+ -+static char * -+find_field_multi (struct linebuffer *line) -+{ -+ size_t count; -+ char *lp = line->buffer; -+ size_t size = line->length - 1; -+ size_t pos; -+ size_t mblength; -+ wchar_t wc; -+ mbstate_t *statep; -+ int convfail; -+ -+ pos = 0; -+ statep = &(line->state); -+ -+ /* skip fields. */ -+ for (count = 0; count < skip_fields && pos < size; count++) -+ { -+ while (pos < size) -+ { -+ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); -+ -+ if (convfail || !iswblank (wc)) -+ { -+ pos += mblength; -+ break; -+ } -+ pos += mblength; -+ } -+ -+ while (pos < size) -+ { -+ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); -+ -+ if (!convfail && iswblank (wc)) -+ break; -+ -+ pos += mblength; -+ } -+ } -+ -+ /* skip fields. 
*/ -+ for (count = 0; count < skip_chars && pos < size; count++) -+ { -+ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); -+ pos += mblength; -+ } -+ -+ return lp + pos; -+} -+#endif -+ - /* Return false if two strings OLD and NEW match, true if not. - OLD and NEW point not to the beginnings of the lines - but rather to the beginnings of the fields to compare. -@@ -235,6 +338,8 @@ find_field (struct linebuffer const *lin - static bool - different (char *old, char *new, size_t oldlen, size_t newlen) - { -+ char *copy_old, *copy_new; -+ - if (check_chars < oldlen) - oldlen = check_chars; - if (check_chars < newlen) -@@ -242,15 +347,93 @@ different (char *old, char *new, size_t - - if (ignore_case) - { -- /* FIXME: This should invoke strcoll somehow. */ -- return oldlen != newlen || memcasecmp (old, new, oldlen); -+ size_t i; -+ -+ copy_old = alloca (oldlen + 1); -+ copy_new = alloca (oldlen + 1); -+ -+ for (i = 0; i < oldlen; i++) -+ { -+ copy_old[i] = toupper (old[i]); -+ copy_new[i] = toupper (new[i]); -+ } - } -- else if (hard_LC_COLLATE) -- return xmemcoll (old, oldlen, new, newlen) != 0; - else -- return oldlen != newlen || memcmp (old, new, oldlen); -+ { -+ copy_old = (char *)old; -+ copy_new = (char *)new; -+ } -+ -+ return xmemcoll (copy_old, oldlen, copy_new, newlen); - } - -+#if HAVE_MBRTOWC -+static int -+different_multi (const char *old, const char *new, size_t oldlen, size_t newlen, mbstate_t oldstate, mbstate_t newstate) -+{ -+ size_t i, j, chars; -+ const char *str[2]; -+ char *copy[2]; -+ size_t len[2]; -+ mbstate_t state[2]; -+ size_t mblength; -+ wchar_t wc, uwc; -+ mbstate_t state_bak; -+ -+ str[0] = old; -+ str[1] = new; -+ len[0] = oldlen; -+ len[1] = newlen; -+ state[0] = oldstate; -+ state[1] = newstate; -+ -+ for (i = 0; i < 2; i++) -+ { -+ copy[i] = alloca (len[i] + 1); -+ -+ for (j = 0, chars = 0; j < len[i] && chars < check_chars; chars++) -+ { -+ state_bak = state[i]; -+ mblength = mbrtowc (&wc, str[i] + j, len[i] - j, 
&(state[i])); -+ -+ switch (mblength) -+ { -+ case (size_t)-1: -+ case (size_t)-2: -+ state[i] = state_bak; -+ /* Fall through */ -+ case 0: -+ mblength = 1; -+ break; -+ -+ default: -+ if (ignore_case) -+ { -+ uwc = towupper (wc); -+ -+ if (uwc != wc) -+ { -+ mbstate_t state_wc; -+ -+ memset (&state_wc, '\0', sizeof(mbstate_t)); -+ wcrtomb (copy[i] + j, uwc, &state_wc); -+ } -+ else -+ memcpy (copy[i] + j, str[i] + j, mblength); -+ } -+ else -+ memcpy (copy[i] + j, str[i] + j, mblength); -+ } -+ j += mblength; -+ } -+ copy[i][j] = '\0'; -+ len[i] = j; -+ } -+ -+ return xmemcoll (copy[0], len[0], copy[1], len[1]); -+} -+#endif -+ - /* Output the line in linebuffer LINE to standard output - provided that the switches say it should be output. - MATCH is true if the line matches the previous line. -@@ -303,15 +486,43 @@ check_file (const char *infile, const ch - { - char *prevfield IF_LINT (= NULL); - size_t prevlen IF_LINT (= 0); -+#if HAVE_MBRTOWC -+ mbstate_t prevstate; -+ -+ memset (&prevstate, '\0', sizeof (mbstate_t)); -+#endif - - while (!feof (stdin)) - { - char *thisfield; - size_t thislen; -+#if HAVE_MBRTOWC -+ mbstate_t thisstate; -+#endif -+ - if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) - break; - thisfield = find_field (thisline); - thislen = thisline->length - 1 - (thisfield - thisline->buffer); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ thisstate = thisline->state; -+ -+ if (prevline->length == 0 || different_multi -+ (thisfield, prevfield, thislen, prevlen, thisstate, prevstate)) -+ { -+ fwrite (thisline->buffer, sizeof (char), -+ thisline->length, stdout); -+ -+ SWAP_LINES (prevline, thisline); -+ prevfield = thisfield; -+ prevlen = thislen; -+ prevstate = thisstate; -+ } -+ } -+ else -+#endif - if (prevline->length == 0 - || different (thisfield, prevfield, thislen, prevlen)) - { -@@ -330,17 +541,26 @@ check_file (const char *infile, const ch - size_t prevlen; - uintmax_t match_count = 0; - bool first_delimiter = true; -+#if 
HAVE_MBRTOWC -+ mbstate_t prevstate; -+#endif - - if (readlinebuffer_delim (prevline, stdin, delimiter) == 0) - goto closefiles; - prevfield = find_field (prevline); - prevlen = prevline->length - 1 - (prevfield - prevline->buffer); -+#if HAVE_MBRTOWC -+ prevstate = prevline->state; -+#endif - - while (!feof (stdin)) - { - bool match; - char *thisfield; - size_t thislen; -+#if HAVE_MBRTOWC -+ mbstate_t thisstate; -+#endif - if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) - { - if (ferror (stdin)) -@@ -349,6 +569,15 @@ check_file (const char *infile, const ch - } - thisfield = find_field (thisline); - thislen = thisline->length - 1 - (thisfield - thisline->buffer); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ thisstate = thisline->state; -+ match = !different_multi (thisfield, prevfield, -+ thislen, prevlen, thisstate, prevstate); -+ } -+ else -+#endif - match = !different (thisfield, prevfield, thislen, prevlen); - match_count += match; - -@@ -381,6 +610,9 @@ check_file (const char *infile, const ch - SWAP_LINES (prevline, thisline); - prevfield = thisfield; - prevlen = thislen; -+#if HAVE_MBRTOWC -+ prevstate = thisstate; -+#endif - if (!match) - match_count = 0; - } -@@ -426,6 +658,19 @@ main (int argc, char **argv) - - atexit (close_stdout); - -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ find_field = find_field_multi; -+ } -+ else -+#endif -+ { -+ find_field = find_field_uni; -+ } -+ -+ -+ - skip_chars = 0; - skip_fields = 0; - check_chars = SIZE_MAX; -Index: tests/Makefile.am -=================================================================== ---- tests/Makefile.am.orig 2010-04-20 21:52:05.000000000 +0200 -+++ tests/Makefile.am 2010-05-07 16:38:36.972072320 +0200 -@@ -224,6 +224,7 @@ TESTS = \ - misc/sort-compress \ - misc/sort-continue \ - misc/sort-files0-from \ -+ misc/sort-mb-tests \ - misc/sort-merge \ - misc/sort-merge-fdlimit \ - misc/sort-month \ -@@ -474,6 +475,10 @@ TESTS = \ - $(root_tests) - - pr_data = \ -+ misc/mb1.X \ -+ 
misc/mb1.I \ -+ misc/mb2.X \ -+ misc/mb2.I \ - pr/0F \ - pr/0FF \ - pr/0FFnt \ -Index: tests/misc/cut -=================================================================== ---- tests/misc/cut.orig 2010-01-01 14:06:47.000000000 +0100 -+++ tests/misc/cut 2010-05-07 16:13:31.144492080 +0200 -@@ -26,7 +26,7 @@ use strict; - my $prog = 'cut'; - my $try = "Try \`$prog --help' for more information.\n"; - my $from_1 = "$prog: fields and positions are numbered from 1\n$try"; --my $inval = "$prog: invalid byte or field list\n$try"; -+my $inval = "$prog: invalid byte, character or field list\n$try"; - my $no_endpoint = "$prog: invalid range with no endpoint: -\n$try"; - - my @Tests = -@@ -141,7 +141,7 @@ my @Tests = - - # None of the following invalid ranges provoked an error up to coreutils-6.9. - ['inval1', qw(-f 2-0), {IN=>''}, {OUT=>''}, {EXIT=>1}, -- {ERR=>"$prog: invalid decreasing range\n$try"}], -+ {ERR=>"$prog: invalid byte, character or field list\n$try"}], - ['inval2', qw(-f -), {IN=>''}, {OUT=>''}, {EXIT=>1}, {ERR=>$no_endpoint}], - ['inval3', '-f', '4,-', {IN=>''}, {OUT=>''}, {EXIT=>1}, {ERR=>$no_endpoint}], - ['inval4', '-f', '1-2,-', {IN=>''}, {OUT=>''}, {EXIT=>1}, {ERR=>$no_endpoint}], -Index: tests/misc/mb1.I -=================================================================== ---- /dev/null 1970-01-01 00:00:00.000000000 +0000 -+++ tests/misc/mb1.I 2010-05-07 16:13:31.188492096 +0200 -@@ -0,0 +1,4 @@ -+Apple@10 -+Banana@5 -+Citrus@20 -+Cherry@30 -Index: tests/misc/mb1.X -=================================================================== ---- /dev/null 1970-01-01 00:00:00.000000000 +0000 -+++ tests/misc/mb1.X 2010-05-07 16:13:31.224492101 +0200 -@@ -0,0 +1,4 @@ -+Banana@5 -+Apple@10 -+Citrus@20 -+Cherry@30 -Index: tests/misc/mb2.I -=================================================================== ---- /dev/null 1970-01-01 00:00:00.000000000 +0000 -+++ tests/misc/mb2.I 2010-05-07 16:13:31.248492220 +0200 -@@ -0,0 +1,4 @@ -+Apple@AA10@@20 -+Banana@AA5@@30 
-+Citrus@AA20@@5 -+Cherry@AA30@@10 -Index: tests/misc/mb2.X -=================================================================== ---- /dev/null 1970-01-01 00:00:00.000000000 +0000 -+++ tests/misc/mb2.X 2010-05-07 16:13:31.276492153 +0200 -@@ -0,0 +1,4 @@ -+Citrus@AA20@@5 -+Cherry@AA30@@10 -+Apple@AA10@@20 -+Banana@AA5@@30 -Index: tests/misc/sort-mb-tests -=================================================================== ---- /dev/null 1970-01-01 00:00:00.000000000 +0000 -+++ tests/misc/sort-mb-tests 2010-05-07 16:13:31.312492158 +0200 -@@ -0,0 +1,58 @@ -+#! /bin/sh -+case $# in -+ 0) xx='../src/sort';; -+ *) xx="$1";; -+esac -+test "$VERBOSE" && echo=echo || echo=: -+$echo testing program: $xx -+errors=0 -+test "$srcdir" || srcdir=. -+test "$VERBOSE" && $xx --version 2> /dev/null -+ -+export LC_ALL=en_US.UTF-8 -+locale -k LC_CTYPE 2>&1 | grep -q charmap.*UTF-8 || exit 77 -+errors=0 -+ -+$xx -t @ -k2 -n misc/mb1.I > misc/mb1.O -+code=$? -+if test $code != 0; then -+ $echo "Test mb1 failed: $xx return code $code differs from expected value 0" 1>&2 -+ errors=`expr $errors + 1` -+else -+ cmp misc/mb1.O $srcdir/misc/mb1.X > /dev/null 2>&1 -+ case $? in -+ 0) if test "$VERBOSE"; then $echo "passed mb1"; fi;; -+ 1) $echo "Test mb1 failed: files misc/mb1.O and $srcdir/misc/mb1.X differ" 1>&2 -+ (diff -c misc/mb1.O $srcdir/misc/mb1.X) 2> /dev/null -+ errors=`expr $errors + 1`;; -+ 2) $echo "Test mb1 may have failed." 1>&2 -+ $echo The command "cmp misc/mb1.O $srcdir/misc/mb1.X" failed. 1>&2 -+ errors=`expr $errors + 1`;; -+ esac -+fi -+ -+$xx -t @ -k4 -n misc/mb2.I > misc/mb2.O -+code=$? -+if test $code != 0; then -+ $echo "Test mb2 failed: $xx return code $code differs from expected value 0" 1>&2 -+ errors=`expr $errors + 1` -+else -+ cmp misc/mb2.O $srcdir/misc/mb2.X > /dev/null 2>&1 -+ case $? 
in -+ 0) if test "$VERBOSE"; then $echo "passed mb2"; fi;; -+ 1) $echo "Test mb2 failed: files misc/mb2.O and $srcdir/misc/mb2.X differ" 1>&2 -+ (diff -c misc/mb2.O $srcdir/misc/mb2.X) 2> /dev/null -+ errors=`expr $errors + 1`;; -+ 2) $echo "Test mb2 may have failed." 1>&2 -+ $echo The command "cmp misc/mb2.O $srcdir/misc/mb2.X" failed. 1>&2 -+ errors=`expr $errors + 1`;; -+ esac -+fi -+ -+if test $errors = 0; then -+ $echo Passed all 113 tests. 1>&2 -+else -+ $echo Failed $errors tests. 1>&2 -+fi -+test $errors = 0 || errors=1 -+exit $errors diff --git a/coreutils-8.5.patch b/coreutils-8.5.patch deleted file mode 100644 index 159f791..0000000 --- a/coreutils-8.5.patch +++ /dev/null @@ -1,67 +0,0 @@ -Index: gnulib-tests/test-isnanl.h -=================================================================== ---- gnulib-tests/test-isnanl.h.orig 2010-03-13 16:21:09.000000000 +0100 -+++ gnulib-tests/test-isnanl.h 2010-05-05 13:47:16.003024388 +0200 -@@ -63,7 +63,7 @@ main () - /* Quiet NaN. */ - ASSERT (isnanl (NaNl ())); - --#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT -+#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT && 0 - /* A bit pattern that is different from a Quiet NaN. With a bit of luck, - it's a Signalling NaN. 
*/ - { -@@ -105,6 +105,7 @@ main () - { LDBL80_WORDS (0xFFFF, 0x83333333, 0x00000000) }; - ASSERT (isnanl (x.value)); - } -+#if 0 - /* The isnanl function should recognize Pseudo-NaNs, Pseudo-Infinities, - Pseudo-Zeroes, Unnormalized Numbers, and Pseudo-Denormals, as defined in - Intel IA-64 Architecture Software Developer's Manual, Volume 1: -@@ -138,6 +139,7 @@ main () - ASSERT (isnanl (x.value)); - } - #endif -+#endif - - return 0; - } -Index: src/system.h -=================================================================== ---- src/system.h.orig 2010-04-20 21:52:05.000000000 +0200 -+++ src/system.h 2010-05-05 13:38:20.923127872 +0200 -@@ -138,7 +138,7 @@ enum - # define DEV_BSIZE BBSIZE - #endif - #ifndef DEV_BSIZE --# define DEV_BSIZE 4096 -+# define DEV_BSIZE 512 - #endif - - /* Extract or fake data from a `struct stat'. -Index: tests/misc/help-version -=================================================================== ---- tests/misc/help-version.orig 2010-04-20 21:52:05.000000000 +0200 -+++ tests/misc/help-version 2010-05-05 13:44:11.919859133 +0200 -@@ -239,6 +239,7 @@ lbracket_setup () { args=": ]"; } - for i in $built_programs; do - # Skip these. 
- case $i in chroot|stty|tty|false|chcon|runcon) continue;; esac -+ case $i in df) continue;; esac - - rm -rf $tmp_in $tmp_in2 $tmp_dir $tmp_out $bigZ_in $zin $zin2 - echo z |gzip > $zin -Index: tests/other-fs-tmpdir -=================================================================== ---- tests/other-fs-tmpdir.orig 2010-01-01 14:06:47.000000000 +0100 -+++ tests/other-fs-tmpdir 2010-05-05 13:38:20.982872202 +0200 -@@ -43,6 +43,8 @@ for d in $CANDIDATE_TMP_DIRS; do - fi - - done -+# Autobuild hack -+test -f /bin/uname.bin && other_partition_tmpdir= - - if test -z "$other_partition_tmpdir"; then - skip_test_ \ diff --git a/coreutils-8.5.tar.xz b/coreutils-8.5.tar.xz deleted file mode 100644 index cd6bae3..0000000 --- a/coreutils-8.5.tar.xz +++ /dev/null @@ -1,3 +0,0 @@ -version https://git-lfs.github.com/spec/v1 -oid sha256:5aa855caa08b94ccd632510d9ab265646d2ee11498c7efff205b27c2437dec5a -size 4531488 diff --git a/coreutils-add_ogv.patch b/coreutils-add_ogv.patch index 9b43b11..b6fbd40 100644 --- a/coreutils-add_ogv.patch +++ b/coreutils-add_ogv.patch @@ -1,8 +1,6 @@ -Index: src/dircolors.hin -=================================================================== ---- src/dircolors.hin.orig 2010-04-20 21:52:04.000000000 +0200 -+++ src/dircolors.hin 2010-05-05 16:22:16.375859309 +0200 -@@ -158,6 +158,7 @@ EXEC 01;32 +--- src/dircolors.hin ++++ src/dircolors.hin +@@ -151,6 +151,7 @@ .m2v 01;35 .mkv 01;35 .ogm 01;35 diff --git a/coreutils-cifs-afs.diff b/coreutils-cifs-afs.diff new file mode 100644 index 0000000..41cd49f --- /dev/null +++ b/coreutils-cifs-afs.diff @@ -0,0 +1,35 @@ +--- src/fs.h ++++ src/fs.h +@@ -5,10 +5,12 @@ + #if defined __linux__ + # define S_MAGIC_ADFS 0xADF5 + # define S_MAGIC_AFFS 0xADFF ++# define S_MAGIC_AFS 0x6B414653 + # define S_MAGIC_AUTOFS 0x187 + # define S_MAGIC_BEFS 0x42465331 + # define S_MAGIC_BFS 0x1BADFACE + # define S_MAGIC_BINFMT_MISC 0x42494e4d ++# define S_MAGIC_CIFS 0xFF534D42 + # define S_MAGIC_CODA 0x73757245 + # define 
S_MAGIC_COH 0x012FF7B7 + # define S_MAGIC_CRAMFS 0x28CD3D45 +--- src/stat.c ++++ src/stat.c +@@ -219,6 +219,8 @@ human_fstype (STRUCT_STATVFS const *stat + return "adfs"; + case S_MAGIC_AFFS: /* 0xADFF */ + return "affs"; ++ case S_MAGIC_AFS: /* 0x6B414653 */ ++ return "afs"; + case S_MAGIC_AUTOFS: /* 0x187 */ + return "autofs"; + case S_MAGIC_BEFS: /* 0x42465331 */ +@@ -227,6 +229,8 @@ human_fstype (STRUCT_STATVFS const *stat + return "bfs"; + case S_MAGIC_BINFMT_MISC: /* 0x42494e4d */ + return "binfmt_misc"; ++ case S_MAGIC_CIFS: /* 0xFF534D42 */ ++ return "cifs"; + case S_MAGIC_CODA: /* 0x73757245 */ + return "coda"; + case S_MAGIC_COH: /* 0x012FF7B7 */ diff --git a/coreutils-fix_distcheck.patch b/coreutils-fix_distcheck.patch new file mode 100644 index 0000000..9fc3c8e --- /dev/null +++ b/coreutils-fix_distcheck.patch @@ -0,0 +1,80 @@ +Index: maint.mk +=================================================================== +--- maint.mk.orig 2009-02-18 16:13:19.000000000 +0100 ++++ maint.mk 2010-05-04 17:45:14.515359143 +0200 +@@ -623,14 +623,14 @@ bin=bin-$$$$ + + write_loser = printf '\#!%s\necho $$0: bad path 1>&2; exit 1\n' '$(SHELL)' + +-TMPDIR ?= /tmp +-t=$(TMPDIR)/$(PACKAGE)/test ++tmpdir = $(abs_top_builddir)/tests/torture ++ + pfx=$(t)/i + + # More than once, tainted build and source directory names would + # have caused at least one "make check" test to apply "chmod 700" + # to all directories under $HOME. Make sure it doesn't happen again. 
+-tp := $(shell echo "$(TMPDIR)/$(PACKAGE)-$$$$") ++tp = $(tmpdir)/taint + t_prefix = $(tp)/a + t_taint = '$(t_prefix) b' + fake_home = $(tp)/home +@@ -648,10 +648,11 @@ taint-distcheck: $(DIST_ARCHIVES) + touch $(fake_home)/f + mkdir -p $(fake_home)/d/e + ls -lR $(fake_home) $(t_prefix) > $(tp)/.ls-before ++ HOME=$(fake_home); export HOME; \ + cd $(t_taint)/$(distdir) \ + && ./configure \ + && $(MAKE) \ +- && HOME=$(fake_home) $(MAKE) check \ ++ && $(MAKE) check \ + && ls -lR $(fake_home) $(t_prefix) > $(tp)/.ls-after \ + && diff $(tp)/.ls-before $(tp)/.ls-after \ + && test -d $(t_prefix) +@@ -670,6 +671,7 @@ endef + # Install, then verify that all binaries and man pages are in place. + # Note that neither the binary, ginstall, nor the ].1 man page is installed. + define my-instcheck ++ echo running my-instcheck; \ + $(MAKE) prefix=$(pfx) install \ + && test ! -f $(pfx)/bin/ginstall \ + && { fail=0; \ +@@ -688,6 +690,7 @@ endef + + define coreutils-path-check + { \ ++ echo running coreutils-path-check; \ + if test -f $(srcdir)/src/true.c; then \ + fail=1; \ + mkdir $(bin) \ +@@ -732,19 +735,20 @@ my-distcheck: $(DIST_ARCHIVES) $(local-c + -rm -rf $(t) + mkdir -p $(t) + GZIP=$(GZIP_ENV) $(AMTAR) -C $(t) -zxf $(distdir).tar.gz +- cd $(t)/$(distdir) \ +- && ./configure --disable-nls \ +- && $(MAKE) CFLAGS='$(warn_cflags)' \ +- AM_MAKEFLAGS='$(null_AM_MAKEFLAGS)' \ +- && $(MAKE) dvi \ +- && $(install-transform-check) \ +- && $(my-instcheck) \ +- && $(coreutils-path-check) \ ++ cd $(t)/$(distdir) \ ++ && ./configure --quiet --enable-gcc-warnings --disable-nls \ ++ && $(MAKE) CFLAGS='$(warn_cflags)' \ ++ AM_MAKEFLAGS='$(null_AM_MAKEFLAGS)' \ ++ && $(MAKE) dvi \ ++ && $(install-transform-check) \ ++ && $(my-instcheck) \ ++ && $(coreutils-path-check) \ + && $(MAKE) distclean + (cd $(t) && mv $(distdir) $(distdir).old \ + && $(AMTAR) -zxf - ) < $(distdir).tar.gz + diff -ur $(t)/$(distdir).old $(t)/$(distdir) + -rm -rf $(t) ++ rmdir $(tmpdir)/$(PACKAGE) $(tmpdir) + @echo 
"========================"; \ + echo "$(distdir).tar.gz is ready for distribution"; \ + echo "========================" diff --git a/coreutils-getaddrinfo.diff b/coreutils-getaddrinfo.diff new file mode 100644 index 0000000..39a0f38 --- /dev/null +++ b/coreutils-getaddrinfo.diff @@ -0,0 +1,16 @@ +Index: coreutils-6.9.90/gnulib-tests/test-getaddrinfo.c +================================================================================ +--- coreutils-7.1/gnulib-tests/test-getaddrinfo.c ++++ coreutils-7.1/gnulib-tests/test-getaddrinfo.c +@@ -71,10 +71,7 @@ int simple (char *host, char *service) + the test merely because someone is down the country on their + in-law's farm. */ + if (res == EAI_AGAIN) +- { +- fprintf (stderr, "skipping getaddrinfo test: no network?\n"); +- return 77; +- } ++ return 0; + /* IRIX reports EAI_NONAME for "https". Don't fail the test + merely because of this. */ + if (res == EAI_NONAME) diff --git a/coreutils-getaddrinfo.patch b/coreutils-getaddrinfo.patch deleted file mode 100644 index d5b0720..0000000 --- a/coreutils-getaddrinfo.patch +++ /dev/null @@ -1,17 +0,0 @@ -Index: gnulib-tests/test-getaddrinfo.c -=================================================================== ---- gnulib-tests/test-getaddrinfo.c.orig 2010-03-13 16:21:08.000000000 +0100 -+++ gnulib-tests/test-getaddrinfo.c 2010-05-05 14:51:40.343025353 +0200 -@@ -88,11 +88,7 @@ simple (char const *host, char const *se - the test merely because someone is down the country on their - in-law's farm. */ - if (res == EAI_AGAIN) -- { -- skip++; -- fprintf (stderr, "skipping getaddrinfo test: no network?\n"); -- return 77; -- } -+ return 0; - /* IRIX reports EAI_NONAME for "https". Don't fail the test - merely because of this. 
*/ - if (res == EAI_NONAME) diff --git a/coreutils-gl_printf_safe.patch b/coreutils-gl_printf_safe.patch deleted file mode 100644 index ed5cef0..0000000 --- a/coreutils-gl_printf_safe.patch +++ /dev/null @@ -1,24 +0,0 @@ -Index: configure -=================================================================== ---- configure.orig 2010-04-23 18:06:40.000000000 +0200 -+++ configure 2010-05-05 13:40:11.419859163 +0200 -@@ -3340,7 +3340,6 @@ as_fn_append ac_func_list " alarm" - as_fn_append ac_header_list " sys/statvfs.h" - as_fn_append ac_header_list " sys/select.h" - as_fn_append ac_func_list " nl_langinfo" --gl_printf_safe=yes - as_fn_append ac_header_list " utmp.h" - as_fn_append ac_header_list " utmpx.h" - as_fn_append ac_func_list " utmpname" -Index: m4/gnulib-comp.m4 -=================================================================== ---- m4/gnulib-comp.m4.orig 2010-04-21 20:12:06.000000000 +0200 -+++ m4/gnulib-comp.m4 2010-05-05 13:40:58.875859176 +0200 -@@ -1158,7 +1158,6 @@ AC_DEFUN([gl_INIT], - # Code from module printf-frexpl: - gl_FUNC_PRINTF_FREXPL - # Code from module printf-safe: -- m4_divert_text([INIT_PREPARE], [gl_printf_safe=yes]) - # Code from module priv-set: - gl_PRIV_SET - # Code from module progname: diff --git a/coreutils-i18n-infloop.patch b/coreutils-i18n-infloop.patch deleted file mode 100644 index ede0365..0000000 --- a/coreutils-i18n-infloop.patch +++ /dev/null @@ -1,14 +0,0 @@ -Index: src/sort.c -=================================================================== ---- src/sort.c.orig 2010-05-07 16:52:08.068491875 +0200 -+++ src/sort.c 2010-05-07 16:53:44.704992155 +0200 -@@ -2720,7 +2720,8 @@ keycompare_mb (const struct line *a, con - if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ - STATE = state_bak; \ - if (!ignore) \ -- COPY[NEW_LEN++] = TEXT[i++]; \ -+ COPY[NEW_LEN++] = TEXT[i]; \ -+ i++; \ - continue; \ - } \ - \ diff --git a/coreutils-i18n-uninit.patch b/coreutils-i18n-uninit.patch deleted file mode 100644 index 
c3b8ebc..0000000 --- a/coreutils-i18n-uninit.patch +++ /dev/null @@ -1,16 +0,0 @@ -Index: src/cut.c -=================================================================== ---- src/cut.c.orig 2010-05-06 15:16:26.851859241 +0200 -+++ src/cut.c 2010-05-06 15:16:27.095859170 +0200 -@@ -878,7 +878,10 @@ cut_fields_mb (FILE *stream) - c = getc (stream); - empty_input = (c == EOF); - if (c != EOF) -- ungetc (c, stream); -+ { -+ ungetc (c, stream); -+ wc = 0; -+ } - else - wc = WEOF; - diff --git a/coreutils-invalid-ids.patch b/coreutils-invalid-ids.patch deleted file mode 100644 index a7cdbb1..0000000 --- a/coreutils-invalid-ids.patch +++ /dev/null @@ -1,26 +0,0 @@ -While uid_t and gid_t are both unsigned, the values (uid_t) -1 and -(gid_t) -1 are reserved. A uid or gid argument of -1 to the chown(2) -system call means to leave the uid/gid unchanged. Catch this case -so that trying to set a uid or gid to -1 will result in an error. - -Test cases: - - chown 4294967295 file - chown :4294967295 file - chgrp 4294967295 file - -Andreas Gruenbacher - -Index: src/chgrp.c -=================================================================== ---- src/chgrp.c.orig 2010-01-01 14:06:47.000000000 +0100 -+++ src/chgrp.c 2010-05-05 14:03:28.279359192 +0200 -@@ -89,7 +89,7 @@ parse_group (const char *name) - { - unsigned long int tmp; - if (! 
(xstrtoul (name, NULL, 10, &tmp, "") == LONGINT_OK -- && tmp <= GID_T_MAX)) -+ && tmp <= GID_T_MAX && (gid_t) tmp != (gid_t) -1)) - error (EXIT_FAILURE, 0, _("invalid group: %s"), quote (name)); - gid = tmp; - } diff --git a/coreutils-no_hostname_and_hostid.patch b/coreutils-no_hostname_and_hostid.patch deleted file mode 100644 index b3657e0..0000000 --- a/coreutils-no_hostname_and_hostid.patch +++ /dev/null @@ -1,122 +0,0 @@ -Index: doc/coreutils.texi -=================================================================== ---- doc/coreutils.texi.orig 2010-05-06 15:17:48.132359317 +0200 -+++ doc/coreutils.texi 2010-05-06 15:21:02.631693747 +0200 -@@ -65,8 +65,6 @@ - * fold: (coreutils)fold invocation. Wrap long input lines. - * groups: (coreutils)groups invocation. Print group names a user is in. - * head: (coreutils)head invocation. Output the first part of files. --* hostid: (coreutils)hostid invocation. Print numeric host identifier. --* hostname: (coreutils)hostname invocation. Print or set system name. - * id: (coreutils)id invocation. Print user identity. - * install: (coreutils)install invocation. Copy and change attributes. - * join: (coreutils)join invocation. Join lines on a common field. -@@ -197,7 +195,7 @@ Free Documentation License''. 
- * File name manipulation:: dirname basename pathchk mktemp - * Working context:: pwd stty printenv tty - * User information:: id logname whoami groups users who --* System context:: date arch nproc uname hostname hostid uptime -+* System context:: date arch nproc uname uptime - * SELinux context:: chcon runcon - * Modified command invocation:: chroot env nice nohup stdbuf su timeout - * Process control:: kill -@@ -413,8 +411,6 @@ System context - * date invocation:: Print or set system date and time - * nproc invocation:: Print the number of processors - * uname invocation:: Print system information --* hostname invocation:: Print or set system name --* hostid invocation:: Print numeric host identifier - * uptime invocation:: Print system uptime and load - - @command{date}: Print or set system date and time -@@ -13449,8 +13445,6 @@ information. - * arch invocation:: Print machine hardware name. - * nproc invocation:: Print the number of processors. - * uname invocation:: Print system information. --* hostname invocation:: Print or set system name. --* hostid invocation:: Print numeric host identifier. - * uptime invocation:: Print system uptime and load. - @end menu - -@@ -14272,55 +14266,6 @@ Print the kernel version. - - @exitstatus - -- --@node hostname invocation --@section @command{hostname}: Print or set system name -- --@pindex hostname --@cindex setting the hostname --@cindex printing the hostname --@cindex system name, printing --@cindex appropriate privileges -- --With no arguments, @command{hostname} prints the name of the current host --system. With one argument, it sets the current host name to the --specified string. You must have appropriate privileges to set the host --name. Synopsis: -- --@example --hostname [@var{name}] --@end example -- --The only options are @option{--help} and @option{--version}. @xref{Common --options}. 
-- --@exitstatus -- -- --@node hostid invocation --@section @command{hostid}: Print numeric host identifier -- --@pindex hostid --@cindex printing the host identifier -- --@command{hostid} prints the numeric identifier of the current host --in hexadecimal. This command accepts no arguments. --The only options are @option{--help} and @option{--version}. --@xref{Common options}. -- --For example, here's what it prints on one system I use: -- --@example --$ hostid --1bac013d --@end example -- --On that system, the 32-bit quantity happens to be closely --related to the system's Internet address, but that isn't always --the case. -- --@exitstatus -- - @node uptime invocation - @section @command{uptime}: Print system uptime and load - -Index: man/Makefile.am -=================================================================== ---- man/Makefile.am.orig 2010-05-06 15:17:48.136359276 +0200 -+++ man/Makefile.am 2010-05-06 15:18:44.844359168 +0200 -@@ -197,7 +197,7 @@ check-x-vs-1: - @PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ - t=$@-t; \ - (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ -- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ -+ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ - | tr -s ' ' '\n' | sed 's/\.1$$//') \ - | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ - rm $$t -Index: man/Makefile.in -=================================================================== ---- man/Makefile.in.orig 2010-05-06 15:17:48.136359276 +0200 -+++ man/Makefile.in 2010-05-06 15:18:44.875852631 +0200 -@@ -1574,7 +1574,7 @@ check-x-vs-1: - @PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ - t=$@-t; \ - (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ -- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ -+ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ - | tr -s ' ' '\n' | sed 's/\.1$$//') \ - | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ - rm $$t diff --git a/coreutils-sysinfo.patch 
b/coreutils-sysinfo.diff similarity index 86% rename from coreutils-sysinfo.patch rename to coreutils-sysinfo.diff index 4e5b9c4..3096103 100644 --- a/coreutils-sysinfo.patch +++ b/coreutils-sysinfo.diff @@ -1,10 +1,10 @@ Index: src/uname.c =================================================================== ---- src/uname.c.orig 2010-01-01 14:06:47.000000000 +0100 -+++ src/uname.c 2010-05-05 13:58:03.471359120 +0200 +--- src/uname.c.orig 2010-05-04 17:27:48.679359310 +0200 ++++ src/uname.c 2010-05-04 17:29:03.011859260 +0200 @@ -339,6 +339,36 @@ main (int argc, char **argv) # endif - } + } #endif + if (element == unknown) + { @@ -37,11 +37,11 @@ Index: src/uname.c +#endif + } if (! (toprint == UINT_MAX && element == unknown)) - print_element (element); + print_element (element); } @@ -364,6 +394,18 @@ main (int argc, char **argv) - element = hardware_platform; - } + element = hardware_platform; + } #endif + if (element == unknown) + { @@ -56,5 +56,5 @@ Index: src/uname.c + element = hardware_platform; + } if (! (toprint == UINT_MAX && element == unknown)) - print_element (element); + print_element (element); } diff --git a/coreutils.changes b/coreutils.changes index d023ebf..cfcabee 100644 --- a/coreutils.changes +++ b/coreutils.changes @@ -1,78 +1,9 @@ -------------------------------------------------------------------- -Thu Jul 1 21:23:40 UTC 2010 - jengelh@medozas.de - -- Use %_smp_mflags - ------------------------------------------------------------------- Tue Jun 29 20:18:04 CEST 2010 - pth@suse.de - Fix 'sort -V' not working because the i18n (mb handling) patch wasn't updated to handle the new option (bnc#615073). -------------------------------------------------------------------- -Mon Jun 28 12:52:15 CEST 2010 - pth@suse.de - -- Fix typo in spec file (% missing from version). 
- -------------------------------------------------------------------- -Fri Jun 18 11:57:47 CEST 2010 - kukuk@suse.de - -- Last part of fix for [bnc#533249]: Don't run account part of - PAM stack for su as root. Requires pam > 1.1.1. - -------------------------------------------------------------------- -Fri May 7 15:44:53 UTC 2010 - pth@novell.com - -- Update to 8.5: - Bug fixes - * cp and mv once again support preserving extended attributes. - * cp now preserves "capabilities" when also preserving file ownership.7 - * ls --color once again honors the 'NORMAL' dircolors directive. - [bug introduced in coreutils-6.11] - * sort -M now handles abbreviated months that are aligned using - blanks in the locale database. Also locales with 8 bit characters - are handled correctly, including multi byte locales with the caveat - that multi byte characters are matched case sensitively. - * sort again handles obsolescent key formats (+POS -POS) correctly. - Previously if -POS was specified, 1 field too many was used in the - sort. [bug introduced in coreutils-7.2] - - New features - - * join now accepts the --header option, to treat the first line of - each file as a header line to be joined and printed - unconditionally. - - * timeout now accepts the --kill-after option which sends a kill - signal to the monitored command if it's still running the specified - duration after the initial signal was sent. - - * who: the "+/-" --mesg (-T) indicator of whether a user/tty is - accepting messages could be incorrectly listed as "+", when in - fact, the user was not accepting messages (mesg no). Before, who - would examine only the permission bits, and not consider the group - of the TTY device file. Thus, if a login tty's group would change - somehow e.g., to "root", that would make it unwritable (via - write(1)) by normal users, in spite of whatever the permission bits - might imply. 
Now, when configured using the - --with-tty-group[=NAME] option, who also compares the group of the - TTY device with NAME (or "tty" if no group name is specified). - - Changes in behavior - - * ls --color no longer emits the final 3-byte color-resetting escape - sequence when it would be a no-op. - - * join -t '' no longer emits an error and instead operates on each - line as a whole (even if they contain NUL characters). - - For other changes since 7.1 see NEWS. -- Split-up coreutils-%%{version}.diff as far as possible. -- Prefix all patches with coreutils-. -- All patches have the .patch suffix. -- Use the i18n patch from Archlinux as it fixes at least one test - suite failure. - ------------------------------------------------------------------- Tue May 4 17:13:37 UTC 2010 - pth@novell.com diff --git a/coreutils.spec b/coreutils.spec index 53e4a44..f3a1de5 100644 --- a/coreutils.spec +++ b/coreutils.spec @@ -1,5 +1,5 @@ # -# spec file for package coreutils (Version 8.5) +# spec file for package coreutils (Version 7.1) # # Copyright (c) 2010 SUSE LINUX Products GmbH, Nuernberg, Germany. 
# @@ -23,32 +23,34 @@ BuildRequires: help2man libacl-devel libcap-devel libselinux-devel pam-devel xz Url: http://www.gnu.org/software/coreutils/ License: GFDLv1.2 ; GPLv2+ ; GPLv3+ Group: System/Base -Version: 8.5 -Release: 1 -Provides: fileutils = %{version}, sh-utils = %{version}, stat = %version}, textutils = %{version}, mktemp = %{version} -Obsoletes: fileutils < %{version}, sh-utils < %{version}, stat < %version}, textutils < %{version}, mktemp < %{version} +Version: 7.1 +Release: 6 +Provides: fileutils sh-utils stat textutils mktemp +Obsoletes: fileutils sh-utils stat textutils mktemp Obsoletes: libselinux <= 1.23.11-3 libselinux-32bit = 9 libselinux-64bit = 9 libselinux-x86 = 9 AutoReqProv: on PreReq: %{install_info_prereq} Requires: %{name}-lang = %version -Requires: pam >= 1.1.1.90 Source: coreutils-%{version}.tar.xz Source1: su.pamd Source2: su.default Source3: baselibs.conf -Patch0: coreutils-%{version}.patch -Patch1: coreutils-no_hostname_and_hostid.patch -Patch2: coreutils-gl_printf_safe.patch -Patch4: coreutils-8.5-i18n.patch -Patch5: coreutils-i18n-uninit.patch -Patch6: coreutils-i18n-infloop.patch -Patch8: coreutils-sysinfo.patch -Patch16: coreutils-invalid-ids.patch -Patch20: coreutils-6.8-su.patch -Patch21: coreutils-6.8.0-pie.patch -Patch22: coreutils-5.3.0-sbin4su.patch -Patch23: coreutils-getaddrinfo.patch +Patch: coreutils-%{version}.diff +Patch4: coreutils-5.3.0-i18n-0.1.patch +Patch5: i18n-uninit.diff +Patch6: i18n-infloop.diff +Patch8: coreutils-sysinfo.diff +Patch11: i18n-monthsort.diff +Patch12: i18n-random.diff +Patch16: invalid-ids.diff +Patch17: i18n-limfield.diff +Patch20: coreutils-6.8-su.diff +Patch21: coreutils-6.8.0-pie.diff +Patch22: coreutils-5.3.0-sbin4su.diff +Patch23: coreutils-getaddrinfo.diff +Patch25: coreutils-cifs-afs.diff Patch26: coreutils-add_ogv.patch +Patch27: coreutils-fix_distcheck.patch BuildRoot: %{_tmppath}/%{name}-%{version}-build %description @@ -105,44 +107,48 @@ Authors: %lang_package %prep %setup -q 
-%patch4 +%patch4 -p1 %patch5 %patch6 -%patch0 -%patch1 -%patch2 +%patch %patch8 +%patch11 +%patch12 %patch16 +%patch17 %patch20 %patch21 %patch22 -%patch23 +%patch23 -p1 +%patch25 %patch26 +%patch27 %build -AUTOPOINT=true autoreconf -fi -export CFLAGS="%optflags -Wall" -%configure --without-included-regex \ +#AUTOPOINT=true autoreconf -fi +./configure CFLAGS="$RPM_OPT_FLAGS -Wall" \ + --prefix=%{_prefix} --mandir=%{_mandir} \ + --infodir=%{_infodir} --without-included-regex \ --enable-install-program=arch,su \ gl_cv_func_printf_directive_n=yes \ gl_cv_func_isnanl_works=yes \ DEFAULT_POSIX2_VERSION=199209 -make %{?_smp_mflags} PAMLIBS="-lpam -ldl" V=1 +make %{?jobs:-j%jobs} PAMLIBS="-lpam -ldl" %check if test $EUID -eq 0; then - su nobody -c make %{?_smp_mflags} check VERBOSE=yes V=1 - make %{?_smp_mflags} check-root VERBOSE=yes V=1 + su nobody -c make %{?jobs:-j%jobs} check VERBOSE=yes + make %{?jobs:-j%jobs} check-root VERBOSE=yes else %ifarch %arm - make -k %{?_smp_mflags} check VERBOSE=yes V=1 || echo make check failed + make -k %{?jobs:-j%jobs} check VERBOSE=yes || echo make check failed %else - make %{?_smp_mflags} check VERBOSE=yes V=1 + make %{?jobs:-j%jobs} check VERBOSE=yes %endif fi %install -%makeinstall +make DESTDIR="$RPM_BUILD_ROOT" install test -f $RPM_BUILD_ROOT%{_bindir}/su || \ install src/su $RPM_BUILD_ROOT%{_bindir}/su install -d $RPM_BUILD_ROOT/bin @@ -176,7 +182,6 @@ rm -rf $RPM_BUILD_ROOT %config /etc/pam.d/su-l %config(noreplace) /etc/default/su %{_bindir}/* -%{_libdir}/%{name} %doc %{_infodir}/coreutils.info*.gz %doc %{_mandir}/man1/*.1.gz %dir %{_prefix}/share/locale/*/LC_TIME diff --git a/i18n-infloop.diff b/i18n-infloop.diff new file mode 100644 index 0000000..dbfcc29 --- /dev/null +++ b/i18n-infloop.diff @@ -0,0 +1,14 @@ +Index: src/sort.c +=================================================================== +--- src/sort.c.orig 2010-05-04 17:27:49.103359264 +0200 ++++ src/sort.c 2010-05-04 17:28:43.820359291 +0200 +@@ -2540,7 +2540,8 
@@ keycompare_mb (const struct line *a, con + if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ + STATE = state_bak; \ + if (!ignore) \ +- COPY[NEW_LEN++] = TEXT[i++]; \ ++ COPY[NEW_LEN++] = TEXT[i]; \ ++ i++; \ + continue; \ + } \ + \ diff --git a/i18n-limfield.diff b/i18n-limfield.diff new file mode 100644 index 0000000..b27c3c9 --- /dev/null +++ b/i18n-limfield.diff @@ -0,0 +1,100 @@ +Index: src/sort.c +=================================================================== +--- src/sort.c.orig 2010-05-04 17:29:12.419359202 +0200 ++++ src/sort.c 2010-05-04 17:29:12.479359419 +0200 +@@ -1731,7 +1731,7 @@ limfield_mb (const struct line *line, co + GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); + ptr += mblength; + } +- if (ptr < lim) ++ if (ptr < lim && (eword | echar)) + { + GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); + ptr += mblength; +@@ -1742,11 +1742,6 @@ limfield_mb (const struct line *line, co + { + while (ptr < lim && ismbblank (ptr, &mblength)) + ptr += mblength; +- if (ptr < lim) +- { +- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); +- ptr += mblength; +- } + while (ptr < lim && !ismbblank (ptr, &mblength)) + ptr += mblength; + } +@@ -1756,20 +1751,19 @@ limfield_mb (const struct line *line, co + /* Make LIM point to the end of (one byte past) the current field. 
*/ + if (tab != NULL) + { +- char *newlim, *p; ++ char *newlim; + +- newlim = NULL; +- for (p = ptr; p < lim;) +- { +- if (memcmp (p, tab, tab_length) == 0) +- { +- newlim = p; +- break; +- } +- +- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); +- p += mblength; +- } ++ for (newlim = ptr; newlim < lim;) ++ { ++ if (memcmp (newlim, tab, tab_length) == 0) ++ { ++ lim = newlim; ++ break; ++ } ++ ++ GET_BYTELEN_OF_CHAR (lim, newlim, mblength, state); ++ newlim += mblength; ++ } + } + else + { +@@ -1778,24 +1772,20 @@ limfield_mb (const struct line *line, co + + while (newlim < lim && ismbblank (newlim, &mblength)) + newlim += mblength; +- if (ptr < lim) +- { +- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); +- ptr += mblength; +- } + while (newlim < lim && !ismbblank (newlim, &mblength)) +- newlim += mblength; ++ newlim += mblength; + lim = newlim; + } + # endif + +- /* If we're skipping leading blanks, don't start counting characters +- until after skipping past any leading blanks. */ ++ /* If we're ignoring leading blanks when computing the End ++ of the field, don't start counting bytes until after skipping ++ past any leading blanks. */ + if (key->skipeblanks) + while (ptr < lim && ismbblank (ptr, &mblength)) + ptr += mblength; + +- memset (&state, '\0', sizeof(mbstate_t)); ++ memset (&state, '\0', sizeof (mbstate_t)); + + /* Advance PTR by ECHAR (if possible), but no further than LIM. 
*/ + for (i = 0; i < echar; i++) +@@ -1803,9 +1793,9 @@ limfield_mb (const struct line *line, co + GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); + + if (ptr + mblength > lim) +- break; ++ break; + else +- ptr += mblength; ++ ptr += mblength; + } + + return ptr; diff --git a/i18n-monthsort.diff b/i18n-monthsort.diff new file mode 100644 index 0000000..58bf214 --- /dev/null +++ b/i18n-monthsort.diff @@ -0,0 +1,13 @@ +Index: src/sort.c +=================================================================== +--- src/sort.c.orig 2010-05-04 17:28:43.820359291 +0200 ++++ src/sort.c 2010-05-04 17:30:44.507859357 +0200 +@@ -1285,7 +1285,7 @@ inittables_mb (void) + else + { + j += mblength; +- mblength = wcrtomb (mbc, wc, &state_wc); ++ mblength = wcrtomb (mbc, pwc, &state_wc); + assert (mblength != (size_t) 0 && mblength != (size_t) -1); + } + diff --git a/i18n-random.diff b/i18n-random.diff new file mode 100644 index 0000000..566e2de --- /dev/null +++ b/i18n-random.diff @@ -0,0 +1,16 @@ +Index: src/sort.c +=================================================================== +--- src/sort.c.orig 2010-05-04 17:29:12.395359111 +0200 ++++ src/sort.c 2010-05-04 17:29:59.979859336 +0200 +@@ -2494,7 +2494,10 @@ keycompare_mb (const struct line *a, con + size_t lenb = limb <= textb ? 0 : limb - textb; + + /* Actually compare the fields. 
*/ +- if (key->numeric | key->general_numeric) ++ ++ if (key->random) ++ diff = compare_random (texta, lena, textb, lenb); ++ else if (key->numeric | key->general_numeric) + { + char savea = *lima, saveb = *limb; + diff --git a/i18n-uninit.diff b/i18n-uninit.diff new file mode 100644 index 0000000..8952a0d --- /dev/null +++ b/i18n-uninit.diff @@ -0,0 +1,29 @@ +Index: src/cut.c +=================================================================== +--- src/cut.c.orig 2010-05-04 17:27:29.879859350 +0200 ++++ src/cut.c 2010-05-04 17:27:30.131859395 +0200 +@@ -878,7 +878,10 @@ cut_fields_mb (FILE *stream) + c = getc (stream); + empty_input = (c == EOF); + if (c != EOF) +- ungetc (c, stream); ++ { ++ ungetc (c, stream); ++ wc = 0; ++ } + else + wc = WEOF; + +Index: src/expand.c +=================================================================== +--- src/expand.c.orig 2010-05-04 17:27:29.915859239 +0200 ++++ src/expand.c 2010-05-04 17:27:30.155859324 +0200 +@@ -404,7 +404,7 @@ expand_multibyte (void) + for (;;) + { + /* Input character, or EOF. */ +- wint_t wc; ++ wint_t wc = 0; + + /* If true, perform translations. */ + bool convert = true; diff --git a/invalid-ids.diff b/invalid-ids.diff new file mode 100644 index 0000000..35f435c --- /dev/null +++ b/invalid-ids.diff @@ -0,0 +1,49 @@ +While uid_t and gid_t are both unsigned, the values (uid_t) -1 and +(gid_t) -1 are reserved. A uid or gid argument of -1 to the chown(2) +system call means to leave the uid/gid unchanged. Catch this case +so that trying to set a uid or gid to -1 will result in an error. 
+ +Test cases: + + chown 4294967295 file + chown :4294967295 file + chgrp 4294967295 file + +Andreas Gruenbacher + +Index: lib/userspec.c +=================================================================== +--- lib/userspec.c.orig 2010-05-04 17:27:48.479359439 +0200 ++++ lib/userspec.c 2010-05-04 17:29:12.439359267 +0200 +@@ -169,7 +169,7 @@ parse_with_separator (char const *spec, + { + unsigned long int tmp; + if (xstrtoul (u, NULL, 10, &tmp, "") == LONGINT_OK +- && tmp <= MAXUID) ++ && tmp <= MAXUID && tmp != (uid_t) -1) + unum = tmp; + else + error_msg = E_invalid_user; +@@ -200,7 +200,8 @@ parse_with_separator (char const *spec, + if (grp == NULL) + { + unsigned long int tmp; +- if (xstrtoul (g, NULL, 10, &tmp, "") == LONGINT_OK && tmp <= MAXGID) ++ if (xstrtoul (g, NULL, 10, &tmp, "") == LONGINT_OK && tmp <= MAXGID ++ && tmp != (gid_t) -1) + gnum = tmp; + else + error_msg = E_invalid_group; +Index: src/chgrp.c +=================================================================== +--- src/chgrp.c.orig 2010-05-04 17:27:48.479359439 +0200 ++++ src/chgrp.c 2010-05-04 17:29:12.443359269 +0200 +@@ -89,7 +89,7 @@ parse_group (const char *name) + { + unsigned long int tmp; + if (! 
(xstrtoul (name, NULL, 10, &tmp, "") == LONGINT_OK +- && tmp <= GID_T_MAX)) ++ && tmp <= GID_T_MAX && tmp != (gid_t) -1)) + error (EXIT_FAILURE, 0, _("invalid group: %s"), quote (name)); + gid = tmp; + } diff --git a/su.pamd b/su.pamd index 88ddbaf..b729046 100644 --- a/su.pamd +++ b/su.pamd @@ -1,7 +1,6 @@ #%PAM-1.0 auth sufficient pam_rootok.so auth include common-auth -account sufficient pam_rootok.so account include common-account password include common-password session include common-session From 74bac430a88c66500e4aef492f7cfbb0b237e7ab1a4cba2e780c3844ebd6f3a3 Mon Sep 17 00:00:00 2001 From: OBS User buildservice-autocommit Date: Mon, 19 Jul 2010 12:13:47 +0000 Subject: [PATCH 7/7] Updating link to change in openSUSE:Factory/coreutils revision 43.0 OBS-URL: https://build.opensuse.org/package/show/Base:System/coreutils?expand=0&rev=a7b4c6efd1592fb3aac750d37166b7e4 --- coreutils-5.3.0-i18n-0.1.patch | 4015 ---------------- ...n4su.diff => coreutils-5.3.0-sbin4su.patch | 14 +- ...tils-6.8-su.diff => coreutils-6.8-su.patch | 281 +- ....8.0-pie.diff => coreutils-6.8.0-pie.patch | 109 +- coreutils-7.1.diff | 194 - coreutils-7.1.tar.xz | 3 - coreutils-8.5-i18n.patch | 4066 +++++++++++++++++ coreutils-8.5.patch | 67 + coreutils-8.5.tar.xz | 3 + coreutils-add_ogv.patch | 8 +- coreutils-cifs-afs.diff | 35 - coreutils-fix_distcheck.patch | 80 - coreutils-getaddrinfo.diff | 16 - coreutils-getaddrinfo.patch | 17 + coreutils-gl_printf_safe.patch | 24 + coreutils-i18n-infloop.patch | 14 + coreutils-i18n-uninit.patch | 16 + coreutils-invalid-ids.patch | 26 + coreutils-no_hostname_and_hostid.patch | 122 + ...ls-sysinfo.diff => coreutils-sysinfo.patch | 14 +- coreutils.changes | 69 + coreutils.spec | 71 +- i18n-infloop.diff | 14 - i18n-limfield.diff | 100 - i18n-monthsort.diff | 13 - i18n-random.diff | 16 - i18n-uninit.diff | 29 - invalid-ids.diff | 49 - su.pamd | 1 + 29 files changed, 4681 insertions(+), 4805 deletions(-) delete mode 100644 coreutils-5.3.0-i18n-0.1.patch 
rename coreutils-5.3.0-sbin4su.diff => coreutils-5.3.0-sbin4su.patch (90%) rename coreutils-6.8-su.diff => coreutils-6.8-su.patch (78%) rename coreutils-6.8.0-pie.diff => coreutils-6.8.0-pie.patch (76%) delete mode 100644 coreutils-7.1.diff delete mode 100644 coreutils-7.1.tar.xz create mode 100644 coreutils-8.5-i18n.patch create mode 100644 coreutils-8.5.patch create mode 100644 coreutils-8.5.tar.xz delete mode 100644 coreutils-cifs-afs.diff delete mode 100644 coreutils-fix_distcheck.patch delete mode 100644 coreutils-getaddrinfo.diff create mode 100644 coreutils-getaddrinfo.patch create mode 100644 coreutils-gl_printf_safe.patch create mode 100644 coreutils-i18n-infloop.patch create mode 100644 coreutils-i18n-uninit.patch create mode 100644 coreutils-invalid-ids.patch create mode 100644 coreutils-no_hostname_and_hostid.patch rename coreutils-sysinfo.diff => coreutils-sysinfo.patch (86%) delete mode 100644 i18n-infloop.diff delete mode 100644 i18n-limfield.diff delete mode 100644 i18n-monthsort.diff delete mode 100644 i18n-random.diff delete mode 100644 i18n-uninit.diff delete mode 100644 invalid-ids.diff diff --git a/coreutils-5.3.0-i18n-0.1.patch b/coreutils-5.3.0-i18n-0.1.patch deleted file mode 100644 index b07d63d..0000000 --- a/coreutils-5.3.0-i18n-0.1.patch +++ /dev/null @@ -1,4015 +0,0 @@ -Index: lib/linebuffer.h -=================================================================== ---- coreutils-7.1/lib/linebuffer.h.orig 2008-09-18 09:08:01.000000000 +0200 -+++ coreutils-7.1/lib/linebuffer.h 2010-06-29 18:49:31.855522069 +0200 -@@ -21,6 +21,11 @@ - - # include - -+/* Get mbstate_t. */ -+# if HAVE_WCHAR_H -+# include -+# endif -+ - /* A `struct linebuffer' holds a line of text. */ - - struct linebuffer -@@ -28,6 +33,9 @@ struct linebuffer - size_t size; /* Allocated. */ - size_t length; /* Used. */ - char *buffer; -+# if HAVE_WCHAR_H -+ mbstate_t state; -+# endif - }; - - /* Initialize linebuffer LINEBUFFER for use. 
*/ -Index: src/cut.c -=================================================================== ---- coreutils-7.1/src/cut.c.orig 2008-09-18 09:06:57.000000000 +0200 -+++ coreutils-7.1/src/cut.c 2010-06-29 18:49:31.855522069 +0200 -@@ -28,6 +28,12 @@ - #include - #include - #include -+ -+/* Get mbstate_t, mbrtowc(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ - #include "system.h" - - #include "error.h" -@@ -36,6 +42,13 @@ - #include "quote.h" - #include "xstrndup.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "cut" - -@@ -77,6 +90,54 @@ struct range_pair - size_t hi; - }; - -+/* Refill the buffer BUF. */ -+#define REFILL_BUFFER(BUF, BUFPOS, BUFLEN, STREAM) \ -+ do \ -+ { \ -+ if (BUFLEN < MB_LEN_MAX && !feof (STREAM) && !ferror (STREAM)) \ -+ { \ -+ memmove (BUF, BUFPOS, BUFLEN); \ -+ BUFLEN += fread (BUF + BUFLEN, sizeof(char), BUFSIZ, STREAM); \ -+ BUFPOS = BUF; \ -+ } \ -+ } \ -+ while (0) -+ -+/* Get wide character which starts at BUFPOS. If the byte sequence is -+ not valid as a character, CONVFAIL is 1. Otherwise 0. */ -+#define GET_NEXT_WC_FROM_BUFFER(WC, BUFPOS, BUFLEN, MBLENGTH, STATE, CONVFAIL) \ -+ do \ -+ { \ -+ wchar_t tmp; \ -+ mbstate_t state_bak; \ -+ \ -+ if (BUFLEN < 1) \ -+ { \ -+ WC = WEOF; \ -+ break; \ -+ } \ -+ \ -+ /* Get a wide character. */ \ -+ CONVFAIL = 0; \ -+ state_bak = STATE; \ -+ MBLENGTH = mbrtowc (&tmp, BUFPOS, BUFLEN, &STATE); \ -+ WC = tmp; \ -+ \ -+ switch (MBLENGTH) \ -+ { \ -+ case (size_t)-1: \ -+ case (size_t)-2: \ -+ ++CONVFAIL; \ -+ STATE = state_bak; \ -+ /* Fall througn. 
*/ \ -+ \ -+ case 0: \ -+ MBLENGTH = 1; \ -+ break; \ -+ } \ -+ } \ -+ while (0) -+ - /* This buffer is used to support the semantics of the -s option - (or lack of same) when the specified field list includes (does - not include) the first field. In both of those cases, the entire -@@ -89,7 +150,7 @@ static char *field_1_buffer; - /* The number of bytes allocated for FIELD_1_BUFFER. */ - static size_t field_1_bufsize; - --/* The largest field or byte index used as an endpoint of a closed -+/* The largest field, character or byte index used as an endpoint of a closed - or degenerate range specification; this doesn't include the starting - index of right-open-ended ranges. For example, with either range spec - `2-5,9-', `2-3,5,9-' this variable would be set to 5. */ -@@ -101,10 +162,11 @@ static size_t eol_range_start; - - /* This is a bit vector. - In byte mode, which bytes to output. -+ In character mode, which characters to output. - In field mode, which DELIM-separated fields to output. -- Both bytes and fields are numbered starting with 1, -+ Bytes, characters and fields are numbered starting with 1, - so the zeroth bit of this array is unused. -- A field or byte K has been selected if -+ A byte, character or field K has been selected if - (K <= MAX_RANGE_ENDPOINT and is_printable_field(K)) - || (EOL_RANGE_START > 0 && K >= EOL_RANGE_START). */ - static unsigned char *printable_field; -@@ -113,15 +175,25 @@ enum operating_mode - { - undefined_mode, - -- /* Output characters that are in the given bytes. */ -+ /* Output bytes that are in the given bytes. */ - byte_mode, - -+ /* Output characters that are at the given positions. */ -+ character_mode, -+ - /* Output the given delimeter-separated fields. */ - field_mode - }; - - static enum operating_mode operating_mode; - -+/* If true, when in byte mode, don't split multibyte characters. 
*/ -+static bool byte_mode_character_aware; -+ -+/* If true, the function for single byte locale is work -+ if this program runs on multibyte locale. */ -+static bool force_singlebyte_mode; -+ - /* If true do not output lines containing no delimeter characters. - Otherwise, all such lines are printed. This option is valid only - with field mode. */ -@@ -133,6 +205,9 @@ static bool complement; - - /* The delimeter character for field mode. */ - static unsigned char delim; -+#if HAVE_WCHAR_H -+static wchar_t wcdelim; -+#endif - - /* True if the --output-delimiter=STRING option was specified. */ - static bool output_delimiter_specified; -@@ -206,7 +281,7 @@ Mandatory arguments to long options are - -f, --fields=LIST select only these fields; also print any line\n\ - that contains no delimiter character, unless\n\ - the -s option is specified\n\ -- -n (ignored)\n\ -+ -n with -b: don't split multibyte characters\n\ - "), stdout); - fputs (_("\ - --complement complement the set of selected bytes, characters\n\ -@@ -365,7 +440,7 @@ set_fields (const char *fieldstr) - in_digits = false; - /* Starting a range. */ - if (dash_found) -- FATAL_ERROR (_("invalid byte or field list")); -+ FATAL_ERROR (_("invalid byte, character or field list")); - dash_found = true; - fieldstr++; - -@@ -389,7 +464,9 @@ set_fields (const char *fieldstr) - if (!rhs_specified) - { - /* `n-'. From `initial' to end of line. */ -- eol_range_start = initial; -+ if (eol_range_start == 0 -+ || (eol_range_start != 0 && eol_range_start > initial)) -+ eol_range_start = initial; - field_found = true; - } - else -@@ -486,7 +563,7 @@ set_fields (const char *fieldstr) - fieldstr++; - } - else -- FATAL_ERROR (_("invalid byte or field list")); -+ FATAL_ERROR (_("invalid byte, character or field list")); - } - - max_range_endpoint = 0; -@@ -579,6 +656,81 @@ cut_bytes (FILE *stream) - } - } - -+#if HAVE_MBRTOWC -+/* This function is in use for the following case. -+ -+ 1. 
Read from the stream STREAM, printing to standard output any selected -+ characters. -+ -+ 2. Read from stream STREAM, printing to standard output any selected bytes, -+ without splitting multibyte characters. */ -+ -+static void -+cut_characters_or_cut_bytes_no_split (FILE *stream) -+{ -+ size_t idx; /* Number of bytes or characters in the line so far. */ -+ /* Whether to begin printing delimiters between ranges for the current line. -+ Set after we've begun printing data corresponding to the first range. */ -+ bool print_delimiter; -+ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen; /* The length of the byte sequence in buf. */ -+ wint_t wc; /* A gotten wide character. */ -+ size_t mblength; /* The byte size of a multibyte character which shows -+ as same character as WC. */ -+ mbstate_t state; /* State of the stream. */ -+ int convfail; /* 1, when conversion is failed. Otherwise 0. */ -+ -+ -+ idx = 0; -+ print_delimiter = false; -+ buflen = 0; -+ bufpos = buf; -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ while (1) -+ { -+ REFILL_BUFFER (buf, bufpos, buflen, stream); -+ -+ GET_NEXT_WC_FROM_BUFFER (wc, bufpos, buflen, mblength, state, convfail); -+ -+ if (wc == WEOF) -+ { -+ if (idx > 0) -+ putchar ('\n'); -+ break; -+ } -+ else if (wc == L'\n') -+ { -+ putchar ('\n'); -+ idx = 0; -+ print_delimiter = false; -+ } -+ else -+ { -+ bool range_start; -+ bool *rs = output_delimiter_specified ? &range_start : NULL; -+ -+ idx += (operating_mode == byte_mode) ? 
mblength : 1; -+ if (print_kth (idx, rs)) -+ { -+ if (rs && *rs && print_delimiter) -+ { -+ fwrite (output_delimiter_string, sizeof (char), -+ output_delimiter_length, stdout); -+ } -+ print_delimiter = true; -+ fwrite (bufpos, mblength, sizeof (char), stdout); -+ } -+ } -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+} -+#endif -+ - /* Read from stream STREAM, printing to standard output any selected fields. */ - - static void -@@ -701,13 +853,190 @@ cut_fields (FILE *stream) - } - } - -+#if HAVE_MBRTOWC -+static void -+cut_fields_mb (FILE *stream) -+{ -+ int c; -+ size_t field_idx = 1; -+ bool found_any_selected_field = false; -+ bool buffer_first_field; -+ int empty_input; -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen; /* The length of the byte sequence in buf. */ -+ wint_t wc; /* A gotten wide character. */ -+ size_t mblength; /* The byte size of a multibyte character which shows -+ as same character as WC. */ -+ mbstate_t state; /* State of the stream. */ -+ int convfail; /* 1, when conversion is failed. Otherwise 0. */ -+ -+ bufpos = buf; -+ buflen = 0; -+ memset (&state, '\0', sizeof (mbstate_t)); -+ -+ c = getc (stream); -+ empty_input = (c == EOF); -+ if (c != EOF) -+ ungetc (c, stream); -+ else -+ wc = WEOF; -+ -+ /* To support the semantics of the -s flag, we may have to buffer -+ all of the first field to determine whether it is `delimited.' -+ But that is unnecessary if all non-delimited lines must be printed -+ and the first field has been selected, or if non-delimited lines -+ must be suppressed and the first field has *not* been selected. -+ That is because a non-delimited line has exactly one field. 
*/ -+ buffer_first_field = (suppress_non_delimited ^ !print_kth (1, NULL)); -+ -+ while (1) -+ { -+ if (field_idx == 1 && buffer_first_field) -+ { -+ size_t n_bytes = 0; -+ -+ while (1) -+ { -+ REFILL_BUFFER (buf, bufpos, buflen, stream); -+ -+ GET_NEXT_WC_FROM_BUFFER -+ (wc, bufpos, buflen, mblength, state, convfail); -+ -+ if (wc == WEOF) -+ break; -+ -+ field_1_buffer = xrealloc (field_1_buffer, n_bytes + mblength); -+ memcpy (field_1_buffer + n_bytes, bufpos, mblength); -+ n_bytes += mblength; -+ buflen -= mblength; -+ bufpos += mblength; -+ -+ if (!convfail && (wc == L'\n' || wc == wcdelim)) -+ break; -+ } -+ -+ if (wc == WEOF) -+ break; -+ -+ /* If the first field extends to the end of line (it is not -+ delimited) and we are printing all non-delimited lines, -+ print this one. */ -+ if (convfail || (!convfail && wc != wcdelim)) -+ { -+ if (suppress_non_delimited) -+ { -+ /* Empty. */ -+ } -+ else -+ { -+ fwrite (field_1_buffer, sizeof (char), n_bytes, stdout); -+ /* Make sure the output line is newline terminated. */ -+ if (convfail || (!convfail && wc != L'\n')) -+ putchar ('\n'); -+ } -+ continue; -+ } -+ -+ if (print_kth (1, NULL)) -+ { -+ /* Print the field, but not the trailing delimiter. 
*/ -+ fwrite (field_1_buffer, sizeof (char), n_bytes - 1, stdout); -+ found_any_selected_field = true; -+ } -+ ++field_idx; -+ } -+ -+ if (wc != WEOF) -+ { -+ if (print_kth (field_idx, NULL)) -+ { -+ if (found_any_selected_field) -+ { -+ fwrite (output_delimiter_string, sizeof (char), -+ output_delimiter_length, stdout); -+ } -+ found_any_selected_field = true; -+ } -+ -+ while (1) -+ { -+ REFILL_BUFFER (buf, bufpos, buflen, stream); -+ -+ GET_NEXT_WC_FROM_BUFFER -+ (wc, bufpos, buflen, mblength, state, convfail); -+ -+ if (wc == WEOF) -+ break; -+ else if (!convfail && (wc == wcdelim || wc == L'\n')) -+ { -+ buflen -= mblength; -+ bufpos += mblength; -+ break; -+ } -+ -+ if (print_kth (field_idx, NULL)) -+ fwrite (bufpos, mblength, sizeof (char), stdout); -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+ } -+ -+ if ((!convfail || wc == L'\n') && buflen < 1) -+ wc = WEOF; -+ -+ if (!convfail && wc == wcdelim) -+ ++field_idx; -+ else if (wc == WEOF || (!convfail && wc == L'\n')) -+ { -+ if (found_any_selected_field -+ || (!empty_input && !(suppress_non_delimited && field_idx == 1))) -+ putchar ('\n'); -+ if (wc == WEOF) -+ break; -+ field_idx = 1; -+ found_any_selected_field = false; -+ } -+ } -+} -+#endif -+ - static void - cut_stream (FILE *stream) - { -- if (operating_mode == byte_mode) -- cut_bytes (stream); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) -+ { -+ switch (operating_mode) -+ { -+ case byte_mode: -+ if (byte_mode_character_aware) -+ cut_characters_or_cut_bytes_no_split (stream); -+ else -+ cut_bytes (stream); -+ break; -+ -+ case character_mode: -+ cut_characters_or_cut_bytes_no_split (stream); -+ break; -+ -+ case field_mode: -+ cut_fields_mb (stream); -+ break; -+ -+ default: -+ abort (); -+ } -+ } - else -- cut_fields (stream); -+#endif -+ { -+ if (operating_mode == field_mode) -+ cut_fields (stream); -+ else -+ cut_bytes (stream); -+ } - } - - /* Process file FILE to standard output. 
-@@ -757,6 +1086,8 @@ main (int argc, char **argv) - bool ok; - bool delim_specified = false; - char *spec_list_string IF_LINT(= NULL); -+ char mbdelim[MB_LEN_MAX + 1]; -+ size_t delimlen = 0; - - initialize_main (&argc, &argv); - set_program_name (argv[0]); -@@ -779,7 +1110,6 @@ main (int argc, char **argv) - switch (optc) - { - case 'b': -- case 'c': - /* Build the byte list. */ - if (operating_mode != undefined_mode) - FATAL_ERROR (_("only one type of list may be specified")); -@@ -787,6 +1117,14 @@ main (int argc, char **argv) - spec_list_string = optarg; - break; - -+ case 'c': -+ /* Build the character list. */ -+ if (operating_mode != undefined_mode) -+ FATAL_ERROR (_("only one type of list may be specified")); -+ operating_mode = character_mode; -+ spec_list_string = optarg; -+ break; -+ - case 'f': - /* Build the field list. */ - if (operating_mode != undefined_mode) -@@ -798,9 +1136,32 @@ main (int argc, char **argv) - case 'd': - /* New delimiter. */ - /* Interpret -d '' to mean `use the NUL byte as the delimiter.' */ -- if (optarg[0] != '\0' && optarg[1] != '\0') -- FATAL_ERROR (_("the delimiter must be a single character")); -- delim = optarg[0]; -+#if HAVE_MBRTOWC -+ if(MB_CUR_MAX > 1) -+ { -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ delimlen = mbrtowc (&wcdelim, optarg, MB_LEN_MAX, &state); -+ -+ if (delimlen == (size_t)-1 || delimlen == (size_t)-2) -+ force_singlebyte_mode = true; -+ else -+ { -+ delimlen = (delimlen < 1) ? 
1 : delimlen; -+ if (wcdelim != L'\0' && *(optarg + delimlen) != '\0') -+ FATAL_ERROR (_("the delimiter must be a single character")); -+ memcpy (mbdelim, optarg, delimlen); -+ } -+ } -+ -+ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) -+#endif -+ { -+ if (optarg[0] != '\0' && optarg[1] != '\0') -+ FATAL_ERROR (_("the delimiter must be a single character")); -+ delim = (unsigned char) optarg[0]; -+ } - delim_specified = true; - break; - -@@ -814,6 +1175,7 @@ main (int argc, char **argv) - break; - - case 'n': -+ byte_mode_character_aware = true; - break; - - case 's': -@@ -836,7 +1198,7 @@ main (int argc, char **argv) - if (operating_mode == undefined_mode) - FATAL_ERROR (_("you must specify a list of bytes, characters, or fields")); - -- if (delim != '\0' && operating_mode != field_mode) -+ if (delim_specified && operating_mode != field_mode) - FATAL_ERROR (_("an input delimiter may be specified only\ - when operating on fields")); - -@@ -863,15 +1225,34 @@ main (int argc, char **argv) - } - - if (!delim_specified) -- delim = '\t'; -+ { -+ delim = '\t'; -+#ifdef HAVE_MBRTOWC -+ wcdelim = L'\t'; -+ mbdelim[0] = '\t'; -+ mbdelim[1] = '\0'; -+ delimlen = 1; -+ } -+#endif - - if (output_delimiter_string == NULL) - { -- static char dummy[2]; -- dummy[0] = delim; -- dummy[1] = '\0'; -- output_delimiter_string = dummy; -- output_delimiter_length = 1; -+#ifdef HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) -+ { -+ output_delimiter_string = xstrdup (mbdelim); -+ output_delimiter_length = delimlen; -+ } -+ -+ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) -+#endif -+ { -+ static char dummy[2]; -+ dummy[0] = delim; -+ dummy[1] = '\0'; -+ output_delimiter_string = dummy; -+ output_delimiter_length = 1; -+ } - } - - if (optind == argc) -Index: src/expand.c -=================================================================== ---- coreutils-7.1/src/expand.c.orig 2008-11-10 14:17:52.000000000 +0100 -+++ coreutils-7.1/src/expand.c 2010-06-29 18:49:31.871522014 +0200 
-@@ -37,11 +37,31 @@ - #include - #include - #include -+ -+/* Get mbstate_t, mbrtowc, wcwidth. */ -+#if HAVE_WCHAR_H -+# include <wchar.h> -+#endif -+#if HAVE_WCTYPE_H -+# include <wctype.h> -+#endif -+ - #include "system.h" - #include "error.h" - #include "quote.h" - #include "xstrndup.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "expand" - -@@ -343,9 +363,12 @@ expand (void) - } - else - { -- column++; -- if (!column) -- error (EXIT_FAILURE, 0, _("input line is too long")); -+ if (!iscntrl (c)) -+ { -+ column++; -+ if (!column) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ } - } - - convert &= convert_entire_line | !! isblank (c); -@@ -361,6 +384,165 @@ expand (void) - } - } - -+#if HAVE_MBRTOWC && HAVE_WCTYPE_H -+static void -+expand_multibyte (void) -+{ -+ /* Input stream. */ -+ FILE *fp = next_file (NULL); -+ -+ mbstate_t i_state; /* Current shift state of the input stream. */ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen = 0; /* The length of the byte sequence in buf. */ -+ -+ if (!fp) -+ return; -+ -+ memset (&i_state, '\0', sizeof (mbstate_t)); -+ -+ for (;;) -+ { -+ /* Input character, or EOF. */ -+ wint_t wc; -+ -+ /* If true, perform translations. */ -+ bool convert = true; -+ -+ -+ /* The following variables have valid values only when CONVERT -+ is true: */ -+ -+ /* Column of next input character. */ -+ uintmax_t column = 0; -+ -+ /* Index in TAB_LIST of next tab stop to examine. 
*/ -+ size_t tab_index = 0; -+ -+ -+ /* Convert a line of text. */ -+ -+ do -+ { -+ wchar_t w; -+ size_t mblength; /* The byte size of a multibyte character -+ which shows as same character as WC. */ -+ mbstate_t i_state_bak; /* Back up the I_STATE. */ -+ -+ /* Fill buffer */ -+ if (buflen < MB_LEN_MAX) -+ { -+ if (!feof(fp) && !ferror(fp)) -+ { -+ if (buflen > 0) -+ memmove (buf, bufpos, buflen); -+ buflen += fread (buf + buflen, sizeof (char), BUFSIZ, fp); -+ bufpos = buf; -+ } -+ } -+ -+ if (buflen < 1) -+ { -+ /* Move to the next file */ -+ if (feof (fp) || ferror (fp)) -+ fp = next_file(fp); -+ if (!fp) -+ return; -+ memset (&i_state, '\0', sizeof (mbstate_t)); -+ continue; -+ } -+ -+ i_state_bak = i_state; -+ mblength = mbrtowc (&w, bufpos, buflen, &i_state); -+ wc = w; -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ i_state = i_state_bak; -+ wc = L'\0'; -+ column += convert; -+ mblength = 1; -+ } -+ -+ if (convert) -+ { -+ if (wc == L'\t') -+ { -+ /* Column the next input tab stop is on. */ -+ uintmax_t next_tab_column; -+ -+ if (tab_size) -+ next_tab_column = column + (tab_size - column % tab_size); -+ else -+ for (;;) -+ if (tab_index == first_free_tab) -+ { -+ next_tab_column = column + 1; -+ break; -+ } -+ else -+ { -+ uintmax_t tab = tab_list[tab_index++]; -+ if (column < tab) -+ { -+ next_tab_column = tab; -+ break; -+ } -+ } -+ -+ if (next_tab_column < column) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ -+ while (++column < next_tab_column) -+ if (putchar (' ') < 0) -+ error (EXIT_FAILURE, errno, _("write error")); -+ -+ *bufpos = ' '; -+ } -+ else if (wc == L'\b') -+ { -+ /* Go back one column, and force recalculation of the -+ next tab stop. 
*/ -+ column -= !!column; -+ tab_index -= !!tab_index; -+ } -+ else -+ { -+ if (!iswcntrl (wc)) -+ { -+ int width = wcwidth (wc); -+ if (width > 0) -+ { -+ if (column > (column + width)) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ column += width; -+ } -+ } -+ } -+ -+ convert &= convert_entire_line | iswblank (wc); -+ } -+ -+ if (mblength) -+ { -+ if (fwrite (bufpos, sizeof (char), mblength, stdout) < mblength) -+ error (EXIT_FAILURE, errno, _("write error")); -+ } -+ else -+ { -+ if (putchar ('\0')) -+ error (EXIT_FAILURE, errno, _("write error")); -+ mblength = 1; -+ } -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+ while (wc != L'\n'); -+ } -+} -+#endif -+ - int - main (int argc, char **argv) - { -@@ -425,7 +607,12 @@ main (int argc, char **argv) - - file_list = (optind < argc ? &argv[optind] : stdin_argv); - -- expand (); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ expand_multibyte (); -+ else -+#endif -+ expand (); - - if (have_read_stdin && fclose (stdin) != 0) - error (EXIT_FAILURE, errno, "-"); -Index: src/fold.c -=================================================================== ---- coreutils-7.1/src/fold.c.orig 2008-09-18 09:06:57.000000000 +0200 -+++ coreutils-7.1/src/fold.c 2010-06-29 18:49:31.896029818 +0200 -@@ -22,6 +22,19 @@ - #include - #include - -+/* Get MB_CUR_MAX. */ -+#include <stdlib.h> -+ -+/* Get mbrtowc, mbstate_t, wcwidth(). */ -+#if HAVE_WCHAR_H -+# include <wchar.h> -+#endif -+ -+/* Get iswprint(), iswctype(), wctype(). */ -+#if HAVE_WCTYPE_H -+# include <wctype.h> -+#endif -+ - #include "system.h" - #include "error.h" - #include "quote.h" -@@ -29,11 +42,54 @@ - - #define TAB_WIDTH 8 - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. 
*/ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ -+#ifndef HAVE_DECL_WCWIDTH -+"this configure-time declaration test was not run" -+#endif -+#if !HAVE_DECL_WCWIDTH -+extern int wcwidth (); -+#endif -+ -+/* If wcwidth() doesn't exist, assume all printable characters have -+ width 1. */ -+#if !defined wcwidth && !HAVE_WCWIDTH -+# define wcwidth(wc) ((wc) == 0 ? 0 : iswprint (wc) ? 1 : -1) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "fold" - - #define AUTHORS proper_name ("David MacKenzie") - -+#define FATAL_ERROR(Message) \ -+ do \ -+ { \ -+ error (0, 0, (Message)); \ -+ usage (2); \ -+ } \ -+ while (0) -+ -+enum operating_mode -+{ -+ /* Fold texts by columns that are at the given positions. */ -+ column_mode, -+ -+ /* Fold texts by bytes that are at the given positions. */ -+ byte_mode, -+ -+ /* Fold texts by characters that are at the given positions. */ -+ character_mode, -+}; -+ -+/* The argument shows current mode. (Default: column_mode) */ -+static enum operating_mode operating_mode; -+ - /* If nonzero, try to break on whitespace. */ - static bool break_spaces; - -@@ -43,11 +99,17 @@ static bool count_bytes; - /* If nonzero, at least one of the files we read was standard input. 
*/ - static bool have_read_stdin; - --static char const shortopts[] = "bsw:0::1::2::3::4::5::6::7::8::9::"; -+static char const shortopts[] = "bcsw:0::1::2::3::4::5::6::7::8::9::"; -+ -+/* wide character class `blank' */ -+#if HAVE_MBRTOWC -+wctype_t blank_type; -+#endif - - static struct option const longopts[] = - { - {"bytes", no_argument, NULL, 'b'}, -+ {"characters", no_argument, NULL, 'c'}, - {"spaces", no_argument, NULL, 's'}, - {"width", required_argument, NULL, 'w'}, - {GETOPT_HELP_OPTION_DECL}, -@@ -77,6 +139,7 @@ Mandatory arguments to long options are - "), stdout); - fputs (_("\ - -b, --bytes count bytes rather than columns\n\ -+ -c, --characters count characters rather than columns\n\ - -s, --spaces break at spaces\n\ - -w, --width=WIDTH use WIDTH columns instead of 80\n\ - "), stdout); -@@ -94,7 +157,7 @@ Mandatory arguments to long options are - static size_t - adjust_column (size_t column, char c) - { -- if (!count_bytes) -+ if (operating_mode != byte_mode) - { - if (c == '\b') - { -@@ -113,14 +176,9 @@ adjust_column (size_t column, char c) - return column; - } - --/* Fold file FILENAME, or standard input if FILENAME is "-", -- to stdout, with maximum line length WIDTH. -- Return true if successful. */ -- --static bool --fold_file (char const *filename, size_t width) -+static int -+fold_text (FILE *istream, size_t width) - { -- FILE *istream; - int c; - size_t column = 0; /* Screen column where next char will go. */ - size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ -@@ -128,20 +186,6 @@ fold_file (char const *filename, size_t - static size_t allocated_out = 0; - int saved_errno; - -- if (STREQ (filename, "-")) -- { -- istream = stdin; -- have_read_stdin = true; -- } -- else -- istream = fopen (filename, "r"); -- -- if (istream == NULL) -- { -- error (0, errno, "%s", filename); -- return false; -- } -- - while ((c = getc (istream)) != EOF) - { - if (offset_out + 1 >= allocated_out) -@@ -219,6 +263,234 @@ fold_file (char const *filename, size_t - if (offset_out) - fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); - -+ return saved_errno; -+} -+ -+#if HAVE_MBRTOWC -+static void -+fold_multibyte_text (FILE *istream, size_t width) -+{ -+ int i; -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ size_t buflen; /* The length of the byte sequence in buf. */ -+ char *bufpos; /* Next read position of BUF. */ -+ wint_t wc; /* A gotten wide character. */ -+ wchar_t tmp; -+ size_t mblength; /* The byte size of a multibyte character which shows -+ as same character as WC. */ -+ mbstate_t state, state_bak; /* State of the stream. */ -+ int convfail; /* 1, when conversion is failed. Otherwise 0. */ -+ -+ char *line_out = NULL; -+ size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ -+ size_t allocated_out = 1024; -+ -+ int increment; -+ size_t column = 0; -+ -+ size_t last_blank_pos; -+ size_t last_blank_column; -+ int is_blank_seen; -+ int last_blank_increment; -+ int is_bs_following_last_blank; -+ size_t bs_following_last_blank_num; -+ int is_cr_after_last_blank; -+ -+ -+#define CLEAR_FLAGS \ -+ do \ -+ { \ -+ last_blank_pos = 0; \ -+ last_blank_column = 0; \ -+ is_blank_seen = 0; \ -+ is_bs_following_last_blank = 0; \ -+ bs_following_last_blank_num = 0; \ -+ is_cr_after_last_blank = 0; \ -+ } \ -+ while (0) -+ -+#define START_NEW_LINE \ -+ do \ -+ { \ -+ putchar ('\n'); \ -+ column = 0; \ -+ offset_out = 0; \ -+ CLEAR_FLAGS; \ -+ } \ -+ while (0) -+ -+ CLEAR_FLAGS; -+ -+ memset (&state, '\0', sizeof (mbstate_t)); -+ line_out = xmalloc (allocated_out); -+ -+ buflen = fread (buf, sizeof (char), BUFSIZ, istream); -+ bufpos = buf; -+ -+ for (;; bufpos += mblength, buflen -= mblength) -+ { -+ if (buflen < MB_LEN_MAX && !feof (istream) && !ferror (istream)) -+ { -+ memmove (buf, bufpos, buflen); -+ buflen += fread (buf + buflen, sizeof (char), BUFSIZ, istream); -+ bufpos = buf; -+ } -+ -+ if (buflen < 1) -+ break; -+ -+ /* Get a wide character. */ -+ convfail = 0; -+ state_bak = state; -+ mblength = mbrtowc (&tmp, bufpos, buflen, &state); -+ wc = tmp; -+ -+ switch (mblength) -+ { -+ case (size_t)-1: -+ case (size_t)-2: -+ convfail++; -+ state = state_bak; -+ /* Fall through. */ -+ -+ case 0: -+ mblength = 1; -+ break; -+ } -+ -+ if (!convfail && wc == L'\n') -+ { -+ if (offset_out > 0) -+ { -+ fwrite (line_out, sizeof (char), offset_out, stdout); -+ START_NEW_LINE; -+ } -+ continue; -+ } -+ -+ rescan: -+ if (operating_mode == byte_mode) /* byte mode */ -+ increment = mblength; -+ else if (operating_mode == character_mode) /* character mode */ -+ increment = 1; -+ else /* column mode */ -+ { -+ if (convfail) -+ increment = 1; -+ else -+ { -+ switch (wc) -+ { -+ case L'\b': -+ increment = (column > 0) ? 
-1 : 0; -+ break; -+ -+ case L'\r': -+ increment = -1 * column; -+ break; -+ -+ case L'\t': -+ increment = 8 - column % 8; -+ break; -+ -+ default: -+ increment = wcwidth (wc); -+ increment = (increment < 0) ? 0 : increment; -+ } -+ } -+ } -+ -+ if (column + increment > width && break_spaces && last_blank_pos) -+ { -+ fwrite (line_out, sizeof (char), last_blank_pos, stdout); -+ putchar ('\n'); -+ -+ offset_out = offset_out - last_blank_pos; -+ column = (column - last_blank_column -+ + (is_cr_after_last_blank -+ ? last_blank_increment : bs_following_last_blank_num)); -+ memmove (line_out, line_out + last_blank_pos, offset_out); -+ CLEAR_FLAGS; -+ goto rescan; -+ } -+ -+ if (column + increment > width && column != 0) -+ { -+ fwrite (line_out, sizeof (char), offset_out, stdout); -+ START_NEW_LINE; -+ goto rescan; -+ } -+ -+ if (allocated_out < offset_out + mblength) -+ line_out = x2nrealloc (line_out, &allocated_out, sizeof *line_out); -+ -+ for (i = 0; i < mblength; i++) -+ { -+ line_out[offset_out] = bufpos[i]; -+ ++offset_out; -+ } -+ -+ column += increment; -+ -+ if (is_blank_seen && !convfail && wc == L'\r') -+ is_cr_after_last_blank = 1; -+ -+ if (is_bs_following_last_blank && !convfail && wc == L'\b') -+ ++bs_following_last_blank_num; -+ else -+ is_bs_following_last_blank = 0; -+ -+ if (break_spaces && !convfail && iswctype (wc, blank_type)) -+ { -+ last_blank_pos = offset_out; -+ last_blank_column = column; -+ is_blank_seen = 1; -+ last_blank_increment = increment; -+ is_bs_following_last_blank = 1; -+ bs_following_last_blank_num = 0; -+ is_cr_after_last_blank = 0; -+ } -+ } -+ -+ if (offset_out) -+ fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); -+ -+ free(line_out); -+} -+#endif -+ -+/* Fold file FILENAME, or standard input if FILENAME is "-", -+ to stdout, with maximum line length WIDTH. -+ Return true if successful. 
*/ -+ -+static bool -+fold_file (char const *filename, size_t width) -+{ -+ FILE *istream; -+ int saved_errno; -+ -+ if (STREQ (filename, "-")) -+ { -+ istream = stdin; -+ have_read_stdin = true; -+ } -+ else -+ istream = fopen (filename, "r"); -+ -+ if (istream == NULL) -+ { -+ error (0, errno, "%s", filename); -+ return false; -+ } -+ -+ /* Define how ISTREAM is being folded. */ -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ fold_multibyte_text (istream, width); -+ else -+#endif -+ saved_errno = fold_text (istream, width); -+ - if (ferror (istream)) - { - error (0, saved_errno, "%s", filename); -@@ -251,6 +523,10 @@ main (int argc, char **argv) - - atexit (close_stdout); - -+#if HAVE_MBRTOWC -+ blank_type = wctype ("blank"); -+#endif -+ operating_mode = column_mode; - break_spaces = count_bytes = have_read_stdin = false; - - while ((optc = getopt_long (argc, argv, shortopts, longopts, NULL)) != -1) -@@ -260,7 +536,15 @@ main (int argc, char **argv) - switch (optc) - { - case 'b': /* Count bytes rather than columns. */ -- count_bytes = true; -+ if (operating_mode != column_mode) -+ FATAL_ERROR (_("only one way of folding may be specified")); -+ operating_mode = byte_mode; -+ break; -+ -+ case 'c': /* Count characters rather than columns. */ -+ if (operating_mode != column_mode) -+ FATAL_ERROR (_("only one way of folding may be specified")); -+ operating_mode = character_mode; - break; - - case 's': /* Break at word boundaries. */ -Index: src/join.c -=================================================================== ---- coreutils-7.1/src/join.c.orig 2008-11-10 14:17:52.000000000 +0100 -+++ coreutils-7.1/src/join.c 2010-06-29 18:49:31.923528009 +0200 -@@ -22,6 +22,16 @@ - #include - #include - -+/* Get mbstate_t, mbrtowc, mbrtowc, wcwidth. */ -+#if HAVE_WCHAR_H -+# include <wchar.h> -+#endif -+ -+/* Get iswblank, towupper. 
*/ -+#if HAVE_WCTYPE_H -+# include <wctype.h> -+#endif -+ - #include "system.h" - #include "error.h" - #include "linebuffer.h" -@@ -32,6 +42,11 @@ - #include "xstrtol.h" - #include "argmatch.h" - -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "join" - -@@ -120,10 +135,13 @@ static struct outlist outlist_head; - /* Last element in `outlist', where a new element can be added. */ - static struct outlist *outlist_end = &outlist_head; - --/* Tab character separating fields. If negative, fields are separated -+/* Tab character separating fields. If NULL, fields are separated - by any nonempty string of blanks, otherwise by exactly one - tab character whose value (when cast to unsigned char) equals TAB. */ --static int tab = -1; -+static const char *tab = NULL; -+ -+/* The number of bytes used for tab. */ -+static size_t tablen = 0; - - /* If nonzero, check that the input is correctly ordered. */ - static enum -@@ -237,10 +255,10 @@ xfields (struct line *line) - if (ptr == lim) - return; - -- if (0 <= tab) -+ if (tab != NULL) - { - char *sep; -- for (; (sep = memchr (ptr, tab, lim - ptr)) != NULL; ptr = sep + 1) -+ for (; (sep = memchr (ptr, tab[0], lim - ptr)) != NULL; ptr = sep + 1) - extract_field (line, ptr, sep - ptr); - } - else -@@ -285,56 +303,115 @@ keycmp (struct line const *line1, struct - size_t jf_1, size_t jf_2) - { - /* Start of field to compare in each file. */ -- char *beg1; -- char *beg2; -- -- size_t len1; -- size_t len2; /* Length of fields to compare. */ -+ char *beg[2]; -+ char *copy[2]; -+ size_t len[2]; /* Length of fields to compare. 
*/ - int diff; -+ int i, j; - - if (jf_1 < line1->nfields) - { -- beg1 = line1->fields[jf_1].beg; -- len1 = line1->fields[jf_1].len; -+ beg[0] = line1->fields[jf_1].beg; -+ len[0] = line1->fields[jf_1].len; - } - else - { -- beg1 = NULL; -- len1 = 0; -+ beg[0] = NULL; -+ len[0] = 0; - } - - if (jf_2 < line2->nfields) - { -- beg2 = line2->fields[jf_2].beg; -- len2 = line2->fields[jf_2].len; -+ beg[1] = line2->fields[jf_2].beg; -+ len[1] = line2->fields[jf_2].len; - } - else - { -- beg2 = NULL; -- len2 = 0; -+ beg[1] = NULL; -+ len[1] = 0; - } - -- if (len1 == 0) -- return len2 == 0 ? 0 : -1; -- if (len2 == 0) -+ if (len[0] == 0) -+ return len[1] == 0 ? 0 : -1; -+ if (len[1] == 0) - return 1; - - if (ignore_case) - { -- /* FIXME: ignore_case does not work with NLS (in particular, -- with multibyte chars). */ -- diff = memcasecmp (beg1, beg2, MIN (len1, len2)); -+#ifdef HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ size_t mblength; -+ wchar_t wc, uwc; -+ mbstate_t state, state_bak; -+ -+ memset (&state, '\0', sizeof (mbstate_t)); -+ -+ for (i = 0; i < 2; i++) -+ { -+ copy[i] = alloca (len[i] + 1); -+ -+ for (j = 0; j < MIN (len[0], len[1]);) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, beg[i] + j, len[i] - j, &state); -+ -+ switch (mblength) -+ { -+ case (size_t) -1: -+ case (size_t) -2: -+ state = state_bak; -+ /* Fall through */ -+ case 0: -+ mblength = 1; -+ break; -+ -+ default: -+ uwc = towupper (wc); -+ -+ if (uwc != wc) -+ { -+ mbstate_t state_wc; -+ -+ memset (&state_wc, '\0', sizeof (mbstate_t)); -+ wcrtomb (copy[i] + j, uwc, &state_wc); -+ } -+ else -+ memcpy (copy[i] + j, beg[i] + j, mblength); -+ } -+ j += mblength; -+ } -+ copy[i][j] = '\0'; -+ } -+ return xmemcoll (copy[0], len[0], copy[1], len[1]); -+ } -+#endif -+ if (hard_LC_COLLATE) -+ { -+ for (i = 0; i < 2; i++) -+ { -+ copy[i] = alloca (len[i] + 1); -+ -+ for (j = 0; j < MIN (len[0], len[1]); j++) -+ copy[i][j] = toupper (beg[i][j]); -+ -+ copy[i][j] = '\0'; -+ } -+ return xmemcoll 
(copy[0], len[0], copy[1], len[1]); -+ } -+ else -+ diff = memcasecmp (beg[0], beg[1], MIN (len[0], len[1])); - } - else - { - if (hard_LC_COLLATE) -- return xmemcoll (beg1, len1, beg2, len2); -- diff = memcmp (beg1, beg2, MIN (len1, len2)); -+ return xmemcoll (beg[0], len[0], beg[1], len[1]); -+ diff = memcmp (beg[0], beg[1], MIN (len[0], len[1])); - } - - if (diff) - return diff; -- return len1 < len2 ? -1 : len1 != len2; -+ return len[0] < len[1] ? -1 : len[0] != len[1]; - } - - /* Check that successive input lines PREV and CURRENT from input file -@@ -388,6 +465,133 @@ init_linep (struct line **linep) - return line; - } - -+#if HAVE_MBRTOWC -+static void -+xfields_multibyte (struct line *line) -+{ -+ int i; -+ char *ptr0 = line->buf.buffer; -+ char *ptr; -+ char *lim; -+ wchar_t wc = 0; -+ size_t mblength; -+ mbstate_t state, state_bak; -+ -+ memset (&state, 0, sizeof (mbstate_t)); -+ -+ ptr = ptr0; -+ lim = ptr0 + line->buf.length - 1; -+ -+ if (tab == NULL) -+ { -+ /* Skip leading blanks before the first field. */ -+ while (ptr < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ mblength = 1; -+ state = state_bak; -+ break; -+ } -+ mblength = (mblength < 1) ? 1 : mblength; -+ -+ if (!iswblank (wc)) -+ break; -+ ptr += mblength; -+ } -+ } -+ -+ for (i = 0; ptr < lim; ++i) -+ { -+ if (tab != NULL) -+ { -+ char *beg = ptr; -+ while (ptr < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ mblength = 1; -+ state = state_bak; -+ } -+ mblength = (mblength < 1) ? 
1 : mblength; -+ -+ if (mblength == tablen && !memcmp (ptr, tab, mblength)) -+ break; -+ else -+ { -+ ptr += mblength; -+ continue; -+ } -+ } -+ -+ extract_field (line, beg, ptr - beg); -+ if (ptr < lim) -+ ptr += mblength; -+ } -+ else -+ { -+ char *beg = ptr; -+ while (ptr < lim) -+ { -+ state_bak = state; -+ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ mblength = 1; -+ state = state_bak; -+ } -+ mblength = (mblength < 1) ? 1 : mblength; -+ -+ if (iswblank (wc)) -+ break; -+ else -+ { -+ ptr += mblength; -+ continue; -+ } -+ } -+ -+ extract_field (line, beg, ptr - beg); -+ if (ptr < lim) -+ ptr += mblength; -+ } -+ } -+ -+ if (ptr != ptr0) -+ { -+ mblength = mbrtowc (&wc, ptr - mblength, mblength, &state); -+ wc = (mbsinit (&state) && *(ptr - mblength) == '\0') ? L'\0' : wc; -+ if (tab != NULL) -+ { -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ mblength = 1; -+ -+ if (mblength == tablen && !memcmp (ptr - mblength, tab, mblength)) -+ /* Add one more (empty) field because the last character of -+ the line was a delimiter. */ -+ extract_field (line, NULL, 0); -+ } -+ else -+ { -+ if (mblength != (size_t) -1 && mblength != (size_t) -2) -+ { -+ if (iswblank (wc)) -+ /* Add one more (empty) field because the last character of -+ the line was a delimiter. */ -+ extract_field (line, NULL, 0); -+ } -+ } -+ } -+} -+#endif -+ - /* Read a line from FP into LINE and split it into fields. - Return true if successful. */ - -@@ -415,7 +619,12 @@ get_line (FILE *fp, struct line **linep, - return false; - } - -- xfields (line); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ xfields_multibyte (line); -+ else -+#endif -+ xfields (line); - - if (prevline[which - 1]) - check_order (prevline[which - 1], line, which); -@@ -520,7 +729,8 @@ static void - prjoin (struct line const *line1, struct line const *line2) - { - const struct outlist *outlist; -- char output_separator = tab < 0 ? 
' ' : tab; -+ const char *output_separator = tab == NULL ? " " : tab; -+ size_t output_separator_len = tab == NULL ? 1 : tablen; - - outlist = outlist_head.next; - if (outlist) -@@ -555,7 +765,7 @@ prjoin (struct line const *line1, struct - o = o->next; - if (o == NULL) - break; -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - } - putchar ('\n'); - } -@@ -573,23 +783,23 @@ prjoin (struct line const *line1, struct - prfield (join_field_1, line1); - for (i = 0; i < join_field_1 && i < line1->nfields; ++i) - { -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - prfield (i, line1); - } - for (i = join_field_1 + 1; i < line1->nfields; ++i) - { -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - prfield (i, line1); - } - - for (i = 0; i < join_field_2 && i < line2->nfields; ++i) - { -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - prfield (i, line2); - } - for (i = join_field_2 + 1; i < line2->nfields; ++i) - { -- putchar (output_separator); -+ fwrite (output_separator, 1, output_separator_len, stdout); - prfield (i, line2); - } - putchar ('\n'); -@@ -1020,20 +1230,40 @@ main (int argc, char **argv) - - case 't': - { -- unsigned char newtab = optarg[0]; -- if (! newtab) -+ const char *newtab = optarg; -+ size_t newtablen; -+ if (! 
newtab[0]) - error (EXIT_FAILURE, 0, _("empty tab")); -- if (optarg[1]) -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ mbstate_t state; -+ -+ memset (&state, 0, sizeof (mbstate_t)); -+ newtablen = mbrtowc (NULL, newtab, strlen (newtab), &state); -+ if (newtablen == (size_t) 0 -+ || newtablen == (size_t) -1 || newtablen == (size_t) -2) -+ newtablen = 1; -+ } -+ else -+#endif -+ newtablen = 1; -+ if (optarg[newtablen]) - { - if (STREQ (optarg, "\\0")) -- newtab = '\0'; -+ { -+ newtab = "\0"; -+ newtablen = 1; -+ } - else - error (EXIT_FAILURE, 0, _("multi-character tab %s"), - quote (optarg)); - } -- if (0 <= tab && tab != newtab) -+ if (tab != NULL -+ && (tablen != newtablen || memcmp (tab, newtab, tablen) != 0)) - error (EXIT_FAILURE, 0, _("incompatible tabs")); - tab = newtab; -+ tablen = newtablen; - } - break; - -Index: src/pr.c -=================================================================== ---- coreutils-7.1/src/pr.c.orig 2009-01-27 22:11:25.000000000 +0100 -+++ coreutils-7.1/src/pr.c 2010-06-29 18:49:31.931969742 +0200 -@@ -312,6 +312,32 @@ - - #include - #include -+ -+/* Get MB_LEN_MAX. */ -+#include <limits.h> -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX == 1 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Get MB_CUR_MAX. */ -+#include <stdlib.h> -+ -+/* Solaris 2.5 has a bug: <wchar.h> must be included before <wctype.h>. */ -+/* Get mbstate_t, mbrtowc(), wcwidth(). */ -+#if HAVE_WCHAR_H -+# include <wchar.h> -+#endif -+ -+/* Get iswprint(). -- for wcwidth(). */ -+#if HAVE_WCTYPE_H -+# include <wctype.h> -+#endif -+#if !defined iswprint && !HAVE_ISWPRINT -+# define iswprint(wc) 1 -+#endif -+ - #include "system.h" - #include "error.h" - #include "mbswidth.h" -@@ -321,6 +347,18 @@ - #include "strftime.h" - #include "xstrtol.h" - -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. 
*/ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ -+#ifndef HAVE_DECL_WCWIDTH -+"this configure-time declaration test was not run" -+#endif -+#if !HAVE_DECL_WCWIDTH -+extern int wcwidth (); -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "pr" - -@@ -414,8 +452,21 @@ struct COLUMN - typedef struct COLUMN COLUMN; - - #define NULLCOL (COLUMN *)0 -+ -+/* Funtion pointers to switch functions for single byte locale or for -+ multibyte locale. If multibyte functions do not exist in your sysytem, -+ these pointers always point the function for single byte locale. */ -+static void (*print_char) (char c); -+static int (*char_to_clump) (char c); -+ -+/* Functions for single byte locale. */ -+static void print_char_single (char c); -+static int char_to_clump_single (char c); -+ -+/* Functions for multibyte locale. */ -+static void print_char_multi (char c); -+static int char_to_clump_multi (char c); - --static int char_to_clump (char c); - static bool read_line (COLUMN *p); - static bool print_page (void); - static bool print_stored (COLUMN *p); -@@ -425,6 +476,7 @@ static void print_header (void); - static void pad_across_to (int position); - static void add_line_number (COLUMN *p); - static void getoptarg (char *arg, char switch_char, char *character, -+ int *character_length, int *character_width, - int *number); - void usage (int status); - static void print_files (int number_of_files, char **av); -@@ -439,7 +491,6 @@ static void store_char (char c); - static void pad_down (int lines); - static void read_rest_of_line (COLUMN *p); - static void skip_read (COLUMN *p, int column_number); --static void print_char (char c); - static void cleanup (void); - static void print_sep_string (void); - static void separator_string (const char *optarg_S); -@@ -451,7 +502,7 @@ static COLUMN *column_vector; - we store the leftmost columns contiguously in buff. 
- To print a line from buff, get the index of the first character - from line_vector[i], and print up to line_vector[i + 1]. */ --static char *buff; -+static unsigned char *buff; - - /* Index of the position in buff where the next character - will be stored. */ -@@ -555,7 +606,7 @@ static int chars_per_column; - static bool untabify_input = false; - - /* (-e) The input tab character. */ --static char input_tab_char = '\t'; -+static char input_tab_char[MB_LEN_MAX] = "\t"; - - /* (-e) Tabstops are at chars_per_tab, 2*chars_per_tab, 3*chars_per_tab, ... - where the leftmost column is 1. */ -@@ -565,7 +616,10 @@ static int chars_per_input_tab = 8; - static bool tabify_output = false; - - /* (-i) The output tab character. */ --static char output_tab_char = '\t'; -+static char output_tab_char[MB_LEN_MAX] = "\t"; -+ -+/* (-i) The byte length of output tab character. */ -+static int output_tab_char_length = 1; - - /* (-i) The width of the output tab. */ - static int chars_per_output_tab = 8; -@@ -639,7 +693,13 @@ static int power_10; - static bool numbered_lines = false; - - /* (-n) Character which follows each line number. */ --static char number_separator = '\t'; -+static char number_separator[MB_LEN_MAX] = "\t"; -+ -+/* (-n) The byte length of the character which follows each line number. */ -+static int number_separator_length = 1; -+ -+/* (-n) The character width of the character which follows each line number. */ -+static int number_separator_width = 0; - - /* (-n) line counting starts with 1st line of input file (not with 1st - line of 1st page printed). */ -@@ -692,6 +752,7 @@ static bool use_col_separator = false; - -a|COLUMN|-m is a `space' and with the -J option a `tab'. 
*/ - static char *col_sep_string = (char *) ""; - static int col_sep_length = 0; -+static int col_sep_width = 0; - static char *column_separator = (char *) " "; - static char *line_separator = (char *) "\t"; - -@@ -848,6 +909,13 @@ separator_string (const char *optarg_S) - col_sep_length = (int) strlen (optarg_S); - col_sep_string = xmalloc (col_sep_length + 1); - strcpy (col_sep_string, optarg_S); -+ -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ col_sep_width = mbswidth (col_sep_string, 0); -+ else -+#endif -+ col_sep_width = col_sep_length; - } - - int -@@ -872,6 +940,21 @@ main (int argc, char **argv) - - atexit (close_stdout); - -+/* Define which functions are used, the ones for single byte locale or the ones -+ for multibyte locale. */ -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ print_char = print_char_multi; -+ char_to_clump = char_to_clump_multi; -+ } -+ else -+#endif -+ { -+ print_char = print_char_single; -+ char_to_clump = char_to_clump_single; -+ } -+ - n_files = 0; - file_names = (argc > 1 - ? xmalloc ((argc - 1) * sizeof (char *)) -@@ -948,8 +1031,12 @@ main (int argc, char **argv) - break; - case 'e': - if (optarg) -- getoptarg (optarg, 'e', &input_tab_char, -- &chars_per_input_tab); -+ { -+ int dummy_length, dummy_width; -+ -+ getoptarg (optarg, 'e', input_tab_char, &dummy_length, -+ &dummy_width, &chars_per_input_tab); -+ } - /* Could check tab width > 0. */ - untabify_input = true; - break; -@@ -962,8 +1049,12 @@ main (int argc, char **argv) - break; - case 'i': - if (optarg) -- getoptarg (optarg, 'i', &output_tab_char, -- &chars_per_output_tab); -+ { -+ int dummy_width; -+ -+ getoptarg (optarg, 'i', output_tab_char, &output_tab_char_length, -+ &dummy_width, &chars_per_output_tab); -+ } - /* Could check tab width > 0. 
*/ - tabify_output = true; - break; -@@ -990,8 +1081,8 @@ main (int argc, char **argv) - case 'n': - numbered_lines = true; - if (optarg) -- getoptarg (optarg, 'n', &number_separator, -- &chars_per_number); -+ getoptarg (optarg, 'n', number_separator, &number_separator_length, -+ &number_separator_width, &chars_per_number); - break; - case 'N': - skip_count = false; -@@ -1031,6 +1122,7 @@ main (int argc, char **argv) - /* Reset an additional input of -s, -S dominates -s */ - col_sep_string = bad_cast (""); - col_sep_length = 0; -+ col_sep_width = 0; - use_col_separator = true; - if (optarg) - separator_string (optarg); -@@ -1187,10 +1279,45 @@ main (int argc, char **argv) - a number. */ - - static void --getoptarg (char *arg, char switch_char, char *character, int *number) -+getoptarg (char *arg, char switch_char, char *character, int *character_length, -+ int *character_width, int *number) - { - if (!ISDIGIT (*arg)) -- *character = *arg++; -+ { -+#ifdef HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) /* for multibyte locale. */ -+ { -+ wchar_t wc; -+ size_t mblength; -+ int width; -+ mbstate_t state = {'\0'}; -+ -+ mblength = mbrtowc (&wc, arg, strlen (arg), &state); -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ *character_length = 1; -+ *character_width = 1; -+ } -+ else -+ { -+ *character_length = (mblength < 1) ? 1 : mblength; -+ width = wcwidth (wc); -+ *character_width = (width < 0) ? 0 : width; -+ } -+ -+ strncpy (character, arg, *character_length); -+ arg += *character_length; -+ } -+ else /* for single byte locale. 
*/ -+#endif -+ { -+ *character = *arg++; -+ *character_length = 1; -+ *character_width = 1; -+ } -+ } -+ - if (*arg) - { - long int tmp_long; -@@ -1249,7 +1376,7 @@ init_parameters (int number_of_files) - else - col_sep_string = column_separator; - -- col_sep_length = 1; -+ col_sep_length = col_sep_width = 1; - use_col_separator = true; - } - /* It's rather pointless to define a TAB separator with column -@@ -1280,11 +1407,11 @@ init_parameters (int number_of_files) - TAB_WIDTH (chars_per_input_tab, chars_per_number); */ - - /* Estimate chars_per_text without any margin and keep it constant. */ -- if (number_separator == '\t') -+ if (number_separator[0] == '\t') - number_width = chars_per_number + - TAB_WIDTH (chars_per_default_tab, chars_per_number); - else -- number_width = chars_per_number + 1; -+ number_width = chars_per_number + number_separator_width; - - /* The number is part of the column width unless we are - printing files in parallel. */ -@@ -1299,7 +1426,7 @@ init_parameters (int number_of_files) - } - - chars_per_column = (chars_per_line - chars_used_by_number - -- (columns - 1) * col_sep_length) / columns; -+ (columns - 1) * col_sep_width) / columns; - - if (chars_per_column < 1) - error (EXIT_FAILURE, 0, _("page width too narrow")); -@@ -1424,7 +1551,7 @@ init_funcs (void) - - /* Enlarge p->start_position of first column to use the same form of - padding_not_printed with all columns. */ -- h = h + col_sep_length; -+ h = h + col_sep_width; - - /* This loop takes care of all but the rightmost column. 
*/ - -@@ -1458,7 +1585,7 @@ init_funcs (void) - } - else - { -- h = h_next + col_sep_length; -+ h = h_next + col_sep_width; - h_next = h + chars_per_column; - } - } -@@ -1748,9 +1875,9 @@ static void - align_column (COLUMN *p) - { - padding_not_printed = p->start_position; -- if (padding_not_printed - col_sep_length > 0) -+ if (padding_not_printed - col_sep_width > 0) - { -- pad_across_to (padding_not_printed - col_sep_length); -+ pad_across_to (padding_not_printed - col_sep_width); - padding_not_printed = ANYWHERE; - } - -@@ -2021,13 +2148,13 @@ store_char (char c) - /* May be too generous. */ - buff = X2REALLOC (buff, &buff_allocated); - } -- buff[buff_current++] = c; -+ buff[buff_current++] = (unsigned char) c; - } - - static void - add_line_number (COLUMN *p) - { -- int i; -+ int i, j; - char *s; - int left_cut; - -@@ -2050,22 +2177,24 @@ add_line_number (COLUMN *p) - /* Tabification is assumed for multiple columns, also for n-separators, - but `default n-separator = TAB' hasn't been given priority over - equal column_width also specified by POSIX. */ -- if (number_separator == '\t') -+ if (number_separator[0] == '\t') - { - i = number_width - chars_per_number; - while (i-- > 0) - (p->char_func) (' '); - } - else -- (p->char_func) (number_separator); -+ for (j = 0; j < number_separator_length; j++) -+ (p->char_func) (number_separator[j]); - } - else - /* To comply with POSIX, we avoid any expansion of default TAB - separator with a single column output. No column_width requirement - has to be considered. 
*/ - { -- (p->char_func) (number_separator); -- if (number_separator == '\t') -+ for (j = 0; j < number_separator_length; j++) -+ (p->char_func) (number_separator[j]); -+ if (number_separator[0] == '\t') - output_position = POS_AFTER_TAB (chars_per_output_tab, - output_position); - } -@@ -2226,7 +2355,7 @@ print_white_space (void) - while (goal - h_old > 1 - && (h_new = POS_AFTER_TAB (chars_per_output_tab, h_old)) <= goal) - { -- putchar (output_tab_char); -+ fwrite (output_tab_char, 1, output_tab_char_length, stdout); - h_old = h_new; - } - while (++h_old <= goal) -@@ -2246,6 +2375,7 @@ print_sep_string (void) - { - char *s; - int l = col_sep_length; -+ int not_space_flag; - - s = col_sep_string; - -@@ -2259,6 +2389,7 @@ print_sep_string (void) - { - for (; separators_not_printed > 0; --separators_not_printed) - { -+ not_space_flag = 0; - while (l-- > 0) - { - /* 3 types of sep_strings: spaces only, spaces and chars, -@@ -2272,12 +2403,15 @@ print_sep_string (void) - } - else - { -+ not_space_flag = 1; - if (spaces_not_printed > 0) - print_white_space (); - putchar (*s++); -- ++output_position; - } - } -+ if (not_space_flag) -+ output_position += col_sep_width; -+ - /* sep_string ends with some spaces */ - if (spaces_not_printed > 0) - print_white_space (); -@@ -2304,8 +2438,9 @@ print_clump (COLUMN *p, int n, char *clu - a nonspace is encountered, call print_white_space() to print the - required number of tabs and spaces. 
*/ - -+ - static void --print_char (char c) -+print_char_single (char c) - { - if (tabify_output) - { -@@ -2329,6 +2464,75 @@ print_char (char c) - putchar (c); - } - -+#ifdef HAVE_MBRTOWC -+static void -+print_char_multi (char c) -+{ -+ static size_t mbc_pos = 0; -+ static unsigned char mbc[MB_LEN_MAX] = {'\0'}; -+ static mbstate_t state = {'\0'}; -+ mbstate_t state_bak; -+ wchar_t wc; -+ unsigned char uc = (unsigned char) c; -+ size_t mblength; -+ int width; -+ -+ if (tabify_output) -+ { -+ state_bak = state; -+ mbc[mbc_pos++] = uc; -+ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); -+ -+ while (mbc_pos > 0) -+ { -+ switch (mblength) -+ { -+ case (size_t) -2: -+ state = state_bak; -+ return; -+ -+ case (size_t) -1: -+ state = state_bak; -+ ++output_position; -+ putchar (mbc[0]); -+ memmove (mbc, mbc + 1, MB_CUR_MAX - 1); -+ --mbc_pos; -+ break; -+ -+ case 0: -+ mblength = 1; -+ -+ default: -+ if (wc == L' ') -+ { -+ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); -+ --mbc_pos; -+ ++spaces_not_printed; -+ return; -+ } -+ else if (spaces_not_printed > 0) -+ print_white_space (); -+ -+ /* Nonprintables are assumed to have width 0, except L'\b'. */ -+ if ((width = wcwidth (wc)) < 1) -+ { -+ if (wc == L'\b') -+ --output_position; -+ } -+ else -+ output_position += width; -+ -+ fwrite (mbc, 1, mblength, stdout); -+ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); -+ mbc_pos -= mblength; -+ } -+ } -+ return; -+ } -+ putchar (uc); -+} -+#endif -+ - /* Skip to page PAGE before printing. - PAGE may be larger than total number of pages. 
*/ - -@@ -2506,9 +2710,9 @@ read_line (COLUMN *p) - align_empty_cols = false; - } - -- if (padding_not_printed - col_sep_length > 0) -+ if (padding_not_printed - col_sep_width > 0) - { -- pad_across_to (padding_not_printed - col_sep_length); -+ pad_across_to (padding_not_printed - col_sep_width); - padding_not_printed = ANYWHERE; - } - -@@ -2609,9 +2813,9 @@ print_stored (COLUMN *p) - } - } - -- if (padding_not_printed - col_sep_length > 0) -+ if (padding_not_printed - col_sep_width > 0) - { -- pad_across_to (padding_not_printed - col_sep_length); -+ pad_across_to (padding_not_printed - col_sep_width); - padding_not_printed = ANYWHERE; - } - -@@ -2624,8 +2828,8 @@ print_stored (COLUMN *p) - if (spaces_not_printed == 0) - { - output_position = p->start_position + end_vector[line]; -- if (p->start_position - col_sep_length == chars_per_margin) -- output_position -= col_sep_length; -+ if (p->start_position - col_sep_width == chars_per_margin) -+ output_position -= col_sep_width; - } - - return true; -@@ -2643,8 +2847,9 @@ print_stored (COLUMN *p) - characters in clump_buff. (e.g, the width of '\b' is -1, while the - number of characters is 1.) 
*/ - -+ - static int --char_to_clump (char c) -+char_to_clump_single (char c) - { - unsigned char uc = c; - char *s = clump_buff; -@@ -2654,10 +2859,10 @@ char_to_clump (char c) - int chars; - int chars_per_c = 8; - -- if (c == input_tab_char) -+ if (c == input_tab_char[0]) - chars_per_c = chars_per_input_tab; - -- if (c == input_tab_char || c == '\t') -+ if (c == input_tab_char[0] || c == '\t') - { - width = TAB_WIDTH (chars_per_c, input_position); - -@@ -2738,6 +2943,155 @@ char_to_clump (char c) - return chars; - } - -+#ifdef HAVE_MBRTOWC -+static int -+char_to_clump_multi (char c) -+{ -+ static size_t mbc_pos = 0; -+ static unsigned char mbc[MB_LEN_MAX] = {'\0'}; -+ static mbstate_t state = {'\0'}; -+ mbstate_t state_bak; -+ wchar_t wc; -+ unsigned char uc = (unsigned char) c; -+ size_t mblength; -+ int wc_width; -+ register char *s = clump_buff; -+ register int i, j; -+ char esc_buff[4]; -+ int width; -+ int chars; -+ int chars_per_c = 8; -+ -+ state_bak = state; -+ mbc[mbc_pos++] = uc; -+ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); -+ -+ width = 0; -+ chars = 0; -+ while (mbc_pos > 0) -+ { -+ switch (mblength) -+ { -+ case (size_t) -2: -+ state = state_bak; -+ return 0; -+ -+ case (size_t) -1: -+ state = state_bak; -+ mblength = 1; -+ -+ if (use_esc_sequence || use_cntrl_prefix) -+ { -+ width = +4; -+ chars = +4; -+ *s++ = '\\'; -+ sprintf (esc_buff, "%03o", mbc[0]); -+ for (i = 0; i <= 2; ++i) -+ *s++ = (int) esc_buff[i]; -+ } -+ else -+ { -+ width += 1; -+ chars += 1; -+ *s++ = mbc[0]; -+ } -+ break; -+ -+ case 0: -+ mblength = 1; -+ /* Fall through */ -+ -+ default: -+ if (memcmp (mbc, input_tab_char, mblength) == 0) -+ chars_per_c = chars_per_input_tab; -+ -+ if (memcmp (mbc, input_tab_char, mblength) == 0 || c == '\t') -+ { -+ int width_inc; -+ -+ width_inc = TAB_WIDTH (chars_per_c, input_position); -+ width += width_inc; -+ -+ if (untabify_input) -+ { -+ for (i = width_inc; i; --i) -+ *s++ = ' '; -+ chars += width_inc; -+ } -+ else -+ { -+ for (i = 
0; i < mblength; i++) -+ *s++ = mbc[i]; -+ chars += mblength; -+ } -+ } -+ else if ((wc_width = wcwidth (wc)) < 1) -+ { -+ if (use_esc_sequence) -+ { -+ for (i = 0; i < mblength; i++) -+ { -+ width += 4; -+ chars += 4; -+ *s++ = '\\'; -+ sprintf (esc_buff, "%03o", uc); -+ for (j = 0; j <= 2; ++j) -+ *s++ = (int) esc_buff[j]; -+ } -+ } -+ else if (use_cntrl_prefix) -+ { -+ if (wc < 0200) -+ { -+ width += 2; -+ chars += 2; -+ *s++ = '^'; -+ *s++ = wc ^ 0100; -+ } -+ else -+ { -+ for (i = 0; i < mblength; i++) -+ { -+ width += 4; -+ chars += 4; -+ *s++ = '\\'; -+ sprintf (esc_buff, "%03o", uc); -+ for (j = 0; j <= 2; ++j) -+ *s++ = (int) esc_buff[j]; -+ } -+ } -+ } -+ else if (wc == L'\b') -+ { -+ width += -1; -+ chars += 1; -+ *s++ = c; -+ } -+ else -+ { -+ width += 0; -+ chars += mblength; -+ for (i = 0; i < mblength; i++) -+ *s++ = mbc[i]; -+ } -+ } -+ else -+ { -+ width += wc_width; -+ chars += mblength; -+ for (i = 0; i < mblength; i++) -+ *s++ = mbc[i]; -+ } -+ } -+ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); -+ mbc_pos -= mblength; -+ } -+ -+ input_position += width; -+ return chars; -+} -+#endif -+ - /* We've just printed some files and need to clean up things before - looking for more options and printing the next batch of files. - -Index: src/sort.c -=================================================================== ---- coreutils-7.1/src/sort.c.orig 2009-01-30 19:46:06.000000000 +0100 -+++ coreutils-7.1/src/sort.c 2010-06-29 18:51:17.203522566 +0200 -@@ -26,6 +26,19 @@ - #include - #include - #include -+#include -+ -+/* Get mbstate_t, mbrtowc(), wcrtomb(). */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswprint(), iswctype() towupper(). 
*/ -+#if HAVE_WCTYPE_H -+# include -+wctype_t blank_type; /* = wctype ("blank"); */ -+#endif -+ - #include "system.h" - #include "argmatch.h" - #include "error.h" -@@ -53,6 +66,17 @@ struct rlimit { size_t rlim_cur; }; - # define getrlimit(Resource, Rlp) (-1) - #endif - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX == 1 -+# define MB_LEN_MAX 16 -+#endif -+ -+/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ -+#if HAVE_MBRTOWC && defined mbstate_t -+# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "sort" - -@@ -121,14 +145,38 @@ static int decimal_point; - /* Thousands separator; if -1, then there isn't one. */ - static int thousands_sep; - -+static int force_general_numcompare = 0; -+ - /* Nonzero if the corresponding locales are hard. */ - static bool hard_LC_COLLATE; --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - static bool hard_LC_TIME; - #endif - - #define NONZERO(x) ((x) != 0) - -+/* get a multibyte character's byte length. */ -+#define GET_BYTELEN_OF_CHAR(LIM, PTR, MBLENGTH, STATE) \ -+ do \ -+ { \ -+ wchar_t wc; \ -+ mbstate_t state_bak; \ -+ \ -+ state_bak = STATE; \ -+ mblength = mbrtowc (&wc, PTR, LIM - PTR, &STATE); \ -+ \ -+ switch (MBLENGTH) \ -+ { \ -+ case (size_t)-1: \ -+ case (size_t)-2: \ -+ STATE = state_bak; \ -+ /* Fall through. */ \ -+ case 0: \ -+ MBLENGTH = 1; \ -+ } \ -+ } \ -+ while (0) -+ - /* The kind of blanks for '-b' to skip in various options. */ - enum blanktype { bl_start, bl_end, bl_both }; - -@@ -264,13 +312,11 @@ static bool reverse; - they were read if all keys compare equal. */ - static bool stable; - --/* If TAB has this value, blanks separate fields. */ --enum { TAB_DEFAULT = CHAR_MAX + 1 }; -- --/* Tab character separating fields. 
If TAB_DEFAULT, then fields are -- separated by the empty string between a non-blank character and a blank -+/* Tab character separating fields. If NULL, then fields are separated by -+ the empty string between a non-blank character and a blank - character. */ --static int tab = TAB_DEFAULT; -+static const char *tab; -+static size_t tab_length = 1; - - /* Flag to remove consecutive duplicate lines from the output. - Only the last of a sequence of equal lines will be output. */ -@@ -702,6 +748,43 @@ reap_some (void) - update_proc (pid); - } - -+/* Fucntion pointers. */ -+static char * -+(* begfield) (const struct line *line, const struct keyfield *key); -+ -+static char * -+(* limfield) (const struct line *line, const struct keyfield *key); -+ -+static int -+(*getmonth) (const char *s, size_t len); -+ -+static int -+(* keycompare) (const struct line *a, const struct line *b); -+ -+/* Test for white space multibyte character. -+ Set LENGTH the byte length of investigated multibyte character. */ -+#if HAVE_MBRTOWC -+static int -+ismbblank (const char *str, size_t *length) -+{ -+ size_t mblength; -+ wchar_t wc; -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ mblength = mbrtowc (&wc, str, MB_LEN_MAX, &state); -+ -+ if (mblength == (size_t)-1 || mblength == (size_t)-2) -+ { -+ *length = 1; -+ return 0; -+ } -+ -+ *length = (mblength < 1) ? 1 : mblength; -+ return (iswctype (wc, blank_type)); -+} -+#endif -+ - /* Clean up any remaining temporary files. */ - - static void -@@ -1042,7 +1125,7 @@ zaptemp (const char *name) - free (node); - } - --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - - static int - struct_month_cmp (const void *m1, const void *m2) -@@ -1069,7 +1152,7 @@ inittables (void) - fold_toupper[i] = toupper (i); - } - --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - /* If we're not in the "C" locale, read different names for months. 
*/ - if (hard_LC_TIME) - { -@@ -1151,6 +1234,71 @@ specify_nmerge (int oi, char c, char con - xstrtol_fatal (e, oi, c, long_options, s); - } - -+#if HAVE_MBRTOWC -+static void -+inittables_mb (void) -+{ -+ int i, j, k, l; -+ char *name, *s; -+ size_t s_len, mblength; -+ char mbc[MB_LEN_MAX]; -+ wchar_t wc, pwc; -+ mbstate_t state_mb, state_wc; -+ -+ for (i = 0; i < MONTHS_PER_YEAR; i++) -+ { -+ s = (char *) nl_langinfo (ABMON_1 + i); -+ s_len = strlen (s); -+ monthtab[i].name = name = (char *) xmalloc (s_len + 1); -+ monthtab[i].val = i + 1; -+ -+ memset (&state_mb, '\0', sizeof (mbstate_t)); -+ memset (&state_wc, '\0', sizeof (mbstate_t)); -+ -+ for (j = 0; j < s_len;) -+ { -+ if (!ismbblank (s + j, &mblength)) -+ break; -+ j += mblength; -+ } -+ -+ for (k = 0; j < s_len;) -+ { -+ mblength = mbrtowc (&wc, (s + j), (s_len - j), &state_mb); -+ /* If conversion is failed, fall back into single byte sorting. */ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ for (l = 0; l <= i; l++) -+ free ((void *) monthtab[l].name); -+ inittables(); -+ return; -+ } -+ else if (mblength == 0) -+ break; -+ -+ pwc = towupper (wc); -+ if (pwc == wc) -+ { -+ memcpy (mbc, s + j, mblength); -+ j += mblength; -+ } -+ else -+ { -+ j += mblength; -+ mblength = wcrtomb (mbc, wc, &state_wc); -+ assert (mblength != (size_t) 0 && mblength != (size_t) -1); -+ } -+ -+ for (l = 0; l < mblength; l++) -+ name[k++] = mbc[l]; -+ } -+ name[k] = '\0'; -+ } -+ qsort ((void *) monthtab, MONTHS_PER_YEAR, -+ sizeof *monthtab, struct_month_cmp); -+} -+#endif -+ - /* Specify the amount of main memory to use when sorting. */ - static void - specify_sort_size (int oi, char c, char const *s) -@@ -1361,7 +1509,7 @@ buffer_linelim (struct buffer const *buf - by KEY in LINE. 
*/ - - static char * --begfield (const struct line *line, const struct keyfield *key) -+begfield_uni (const struct line *line, const struct keyfield *key) - { - char *ptr = line->text, *lim = ptr + line->length - 1; - size_t sword = key->sword; -@@ -1371,10 +1519,10 @@ begfield (const struct line *line, const - /* The leading field separator itself is included in a field when -t - is absent. */ - -- if (tab != TAB_DEFAULT) -+ if (tab != NULL) - while (ptr < lim && sword--) - { -- while (ptr < lim && *ptr != tab) -+ while (ptr < lim && *ptr != tab[0]) - ++ptr; - if (ptr < lim) - ++ptr; -@@ -1402,11 +1550,70 @@ begfield (const struct line *line, const - return ptr; - } - -+#if HAVE_MBRTOWC -+static char * -+begfield_mb (const struct line *line, const struct keyfield *key) -+{ -+ int i; -+ char *ptr = line->text, *lim = ptr + line->length - 1; -+ size_t sword = key->sword; -+ size_t schar = key->schar; -+ size_t mblength; -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ if (tab != NULL) -+ while (ptr < lim && sword--) -+ { -+ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ } -+ else -+ while (ptr < lim && sword--) -+ { -+ while (ptr < lim && ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ while (ptr < lim && !ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ } -+ -+ if (key->skipsblanks) -+ while (ptr < lim && ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ -+ for (i = 0; i < schar; i++) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ -+ if (ptr + mblength > lim) -+ break; -+ else -+ ptr += mblength; -+ } -+ -+ return ptr; -+} -+#endif -+ - /* Return the limit of (a pointer to the first character after) the field - in LINE specified by 
KEY. */ - - static char * --limfield (const struct line *line, const struct keyfield *key) -+limfield_uni (const struct line *line, const struct keyfield *key) - { - char *ptr = line->text, *lim = ptr + line->length - 1; - size_t eword = key->eword, echar = key->echar; -@@ -1419,10 +1626,10 @@ limfield (const struct line *line, const - `beginning' is the first character following the delimiting TAB. - Otherwise, leave PTR pointing at the first `blank' character after - the preceding field. */ -- if (tab != TAB_DEFAULT) -+ if (tab != NULL) - while (ptr < lim && eword--) - { -- while (ptr < lim && *ptr != tab) -+ while (ptr < lim && *ptr != tab[0]) - ++ptr; - if (ptr < lim && (eword | echar)) - ++ptr; -@@ -1468,7 +1675,7 @@ limfield (const struct line *line, const - */ - - /* Make LIM point to the end of (one byte past) the current field. */ -- if (tab != TAB_DEFAULT) -+ if (tab != NULL) - { - char *newlim; - newlim = memchr (ptr, tab, lim - ptr); -@@ -1504,6 +1711,107 @@ limfield (const struct line *line, const - return ptr; - } - -+#if HAVE_MBRTOWC -+static char * -+limfield_mb (const struct line *line, const struct keyfield *key) -+{ -+ char *ptr = line->text, *lim = ptr + line->length - 1; -+ size_t eword = key->eword, echar = key->echar; -+ int i; -+ size_t mblength; -+ mbstate_t state; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ if (tab != NULL) -+ while (ptr < lim && eword--) -+ { -+ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ } -+ else -+ while (ptr < lim && eword--) -+ { -+ while (ptr < lim && ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ while (ptr < lim && !ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ } -+ -+# ifdef POSIX_UNSPECIFIED -+ -+ /* 
Make LIM point to the end of (one byte past) the current field. */ -+ if (tab != NULL) -+ { -+ char *newlim, *p; -+ -+ newlim = NULL; -+ for (p = ptr; p < lim;) -+ { -+ if (memcmp (p, tab, tab_length) == 0) -+ { -+ newlim = p; -+ break; -+ } -+ -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ p += mblength; -+ } -+ } -+ else -+ { -+ char *newlim; -+ newlim = ptr; -+ -+ while (newlim < lim && ismbblank (newlim, &mblength)) -+ newlim += mblength; -+ if (ptr < lim) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ ptr += mblength; -+ } -+ while (newlim < lim && !ismbblank (newlim, &mblength)) -+ newlim += mblength; -+ lim = newlim; -+ } -+# endif -+ -+ /* If we're skipping leading blanks, don't start counting characters -+ until after skipping past any leading blanks. */ -+ if (key->skipeblanks) -+ while (ptr < lim && ismbblank (ptr, &mblength)) -+ ptr += mblength; -+ -+ memset (&state, '\0', sizeof(mbstate_t)); -+ -+ /* Advance PTR by ECHAR (if possible), but no further than LIM. */ -+ for (i = 0; i < echar; i++) -+ { -+ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); -+ -+ if (ptr + mblength > lim) -+ break; -+ else -+ ptr += mblength; -+ } -+ -+ return ptr; -+} -+#endif -+ - /* Fill BUF reading from FP, moving buf->left bytes from the end - of buf->buf to the beginning first. If EOF is reached and the - file wasn't terminated by a newline, supply one. 
Set up BUF's line -@@ -1586,8 +1894,22 @@ fillbuf (struct buffer *buf, FILE *fp, c - else - { - if (key->skipsblanks) -- while (blanks[to_uchar (*line_start)]) -- line_start++; -+ { -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ size_t mblength; -+ -+ while (ismbblank (line_start, &mblength)) -+ line_start += mblength; -+ } -+ else -+#endif -+ { -+ while (blanks[to_uchar (*line_start)]) -+ line_start++; -+ } -+ } - line->keybeg = line_start; - } - } -@@ -1642,15 +1964,59 @@ general_numcompare (const char *sa, cons - /* FIXME: maybe add option to try expensive FP conversion - only if A and B can't be compared more cheaply/accurately. */ - -- char *ea; -- char *eb; -- double a = strtod (sa, &ea); -- double b = strtod (sb, &eb); -+ char *bufa, *ea; -+ char *bufb, *eb; -+ double a; -+ double b; -+ -+ char *p; -+ struct lconv *lconvp = localeconv (); -+ size_t thousands_sep_len = strlen (lconvp->thousands_sep); -+ -+ bufa = (char *) xmalloc (strlen (sa) + 1); -+ bufb = (char *) xmalloc (strlen (sb) + 1); -+ strcpy (bufa, sa); -+ strcpy (bufb, sb); -+ -+ if (force_general_numcompare) -+ { -+ while (1) -+ { -+ a = strtod (bufa, &ea); -+ if (memcmp (ea, lconvp->thousands_sep, thousands_sep_len) == 0) -+ { -+ for (p = ea; *(p + thousands_sep_len) != '\0'; p++) -+ *p = *(p + thousands_sep_len); -+ *p = '\0'; -+ continue; -+ } -+ break; -+ } -+ -+ while (1) -+ { -+ b = strtod (bufb, &eb); -+ if (memcmp (eb, lconvp->thousands_sep, thousands_sep_len) == 0) -+ { -+ for (p = eb; *(p + thousands_sep_len) != '\0'; p++) -+ *p = *(p + thousands_sep_len); -+ *p = '\0'; -+ continue; -+ } -+ break; -+ } -+ } -+ else -+ { -+ a = strtod (bufa, &ea); -+ b = strtod (bufb, &eb); -+ } -+ - - /* Put conversion errors at the start of the collating sequence. */ -- if (sa == ea) -- return sb == eb ? 0 : -1; -- if (sb == eb) -+ if (bufa == ea) -+ return bufb == eb ? 0 : -1; -+ if (bufb == eb) - return 1; - - /* Sort numbers in the usual way, where -0 == +0. 
Put NaNs after -@@ -1668,7 +2034,7 @@ general_numcompare (const char *sa, cons - Return 0 if the name in S is not recognized. */ - - static int --getmonth (char const *month, size_t len) -+getmonth_uni (char const *month, size_t len) - { - size_t lo = 0; - size_t hi = MONTHS_PER_YEAR; -@@ -1849,11 +2215,79 @@ compare_version (char *restrict texta, s - return diff; - } - -+#if HAVE_MBRTOWC -+static int -+getmonth_mb (char const *s, size_t len) -+{ -+ char *month; -+ register size_t i; -+ register int lo = 0, hi = MONTHS_PER_YEAR, result; -+ char *tmp; -+ size_t wclength, mblength; -+ const char **pp; -+ const wchar_t **wpp; -+ wchar_t *month_wcs; -+ mbstate_t state; -+ -+ while (len > 0 && ismbblank (s, &mblength)) -+ { -+ s += mblength; -+ len -= mblength; -+ } -+ -+ if (len == 0) -+ return 0; -+ -+ month = (char *) alloca (len + 1); -+ -+ tmp = (char *) alloca (len + 1); -+ memcpy (tmp, s, len); -+ tmp[len] = '\0'; -+ pp = (const char **) &tmp; -+ month_wcs = (wchar_t *) alloca ((len + 1) * sizeof (wchar_t)); -+ memset (&state, '\0', sizeof (mbstate_t)); -+ -+ wclength = mbsrtowcs (month_wcs, pp, len + 1, &state); -+ assert (wclength != 1 && *pp == NULL); -+ -+ for (i = 0; i < wclength; i++) -+ { -+ month_wcs[i] = towupper (month_wcs[i]); -+ if (iswctype (month_wcs[i], blank_type)) -+ { -+ month_wcs[i] = L'\0'; -+ break; -+ } -+ } -+ -+ wpp = (const wchar_t **) &month_wcs; -+ -+ mblength = wcsrtombs (month, wpp, len + 1, &state); -+ assert (mblength != (-1) && *wpp == NULL); -+ -+ do -+ { -+ int ix = (lo + hi) / 2; -+ -+ if (strncmp (month, monthtab[ix].name, strlen (monthtab[ix].name)) < 0) -+ hi = ix; -+ else -+ lo = ix; -+ } -+ while (hi - lo > 1); -+ -+ result = (!strncmp (month, monthtab[lo].name, strlen (monthtab[lo].name)) -+ ? monthtab[lo].val : 0); -+ -+ return result; -+} -+#endif -+ - /* Compare two lines A and B trying every key in sequence until there - are no more keys or a difference is found. 
*/ - - static int --keycompare (const struct line *a, const struct line *b) -+keycompare_uni (const struct line *a, const struct line *b) - { - struct keyfield const *key = keylist; - -@@ -2022,11 +2456,190 @@ keycompare (const struct line *a, const - - return 0; - -- greater: -+greater: -+ diff = 1; -+not_equal: -+ return key->reverse ? -diff : diff; -+} -+ -+#if HAVE_MBRTOWC -+static int -+keycompare_mb (const struct line *a, const struct line *b) -+{ -+ struct keyfield *key = keylist; -+ -+ /* For the first iteration only, the key positions have been -+ precomputed for us. */ -+ char *texta = a->keybeg; -+ char *textb = b->keybeg; -+ char *lima = a->keylim; -+ char *limb = b->keylim; -+ -+ size_t mblength_a, mblength_b; -+ wchar_t wc_a, wc_b; -+ mbstate_t state_a, state_b; -+ -+ int diff; -+ -+ memset (&state_a, '\0', sizeof (mbstate_t)); -+ memset (&state_b, '\0', sizeof (mbstate_t)); -+ -+ for (;;) -+ { -+ register char const *translate = key->translate; -+ register bool const *ignore = key->ignore; -+ -+ /* Find the lengths. */ -+ size_t lena = lima <= texta ? 0 : lima - texta; -+ size_t lenb = limb <= textb ? 0 : limb - textb; -+ -+ /* Actually compare the fields. */ -+ if (key->numeric | key->general_numeric) -+ { -+ char savea = *lima, saveb = *limb; -+ -+ *lima = *limb = '\0'; -+ if (force_general_numcompare) -+ diff = general_numcompare (texta, textb); -+ else -+ diff = ((key->numeric ? numcompare : general_numcompare) -+ (texta, textb)); -+ *lima = savea, *limb = saveb; -+ } -+ else if (key->version) -+ diff = compare_version (texta, lena, textb, lenb); -+ else if (key->month) -+ diff = getmonth (texta, lena) - getmonth (textb, lenb); -+ else -+ { -+ if (ignore || translate) -+ { -+ char buf[4000]; -+ size_t size = lena + 1 + lenb + 1; -+ char *copy_a = (size <= sizeof buf ? buf : xmalloc (size)); -+ char *copy_b = copy_a + lena + 1; -+ size_t new_len_a, new_len_b; -+ size_t i, j; -+ -+ /* Ignore and/or translate chars before comparing. 
*/ -+# define IGNORE_CHARS(NEW_LEN, LEN, TEXT, COPY, WC, MBLENGTH, STATE) \ -+ do \ -+ { \ -+ wchar_t uwc; \ -+ char mbc[MB_LEN_MAX]; \ -+ mbstate_t state_wc; \ -+ \ -+ for (NEW_LEN = i = 0; i < LEN;) \ -+ { \ -+ mbstate_t state_bak; \ -+ \ -+ state_bak = STATE; \ -+ MBLENGTH = mbrtowc (&WC, TEXT + i, LEN - i, &STATE); \ -+ \ -+ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1 \ -+ || MBLENGTH == 0) \ -+ { \ -+ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ -+ STATE = state_bak; \ -+ if (!ignore) \ -+ COPY[NEW_LEN++] = TEXT[i++]; \ -+ continue; \ -+ } \ -+ \ -+ if (ignore) \ -+ { \ -+ if ((ignore == nonprinting && !iswprint (WC)) \ -+ || (ignore == nondictionary \ -+ && !iswalnum (WC) && !iswctype (WC, blank_type))) \ -+ { \ -+ i += MBLENGTH; \ -+ continue; \ -+ } \ -+ } \ -+ \ -+ if (translate) \ -+ { \ -+ \ -+ uwc = toupper(WC); \ -+ if (WC == uwc) \ -+ { \ -+ memcpy (mbc, TEXT + i, MBLENGTH); \ -+ i += MBLENGTH; \ -+ } \ -+ else \ -+ { \ -+ i += MBLENGTH; \ -+ WC = uwc; \ -+ memset (&state_wc, '\0', sizeof (mbstate_t)); \ -+ \ -+ MBLENGTH = wcrtomb (mbc, WC, &state_wc); \ -+ assert (MBLENGTH != (size_t)-1 && MBLENGTH != 0); \ -+ } \ -+ \ -+ for (j = 0; j < MBLENGTH; j++) \ -+ COPY[NEW_LEN++] = mbc[j]; \ -+ } \ -+ else \ -+ for (j = 0; j < MBLENGTH; j++) \ -+ COPY[NEW_LEN++] = TEXT[i++]; \ -+ } \ -+ COPY[NEW_LEN] = '\0'; \ -+ } \ -+ while (0) -+ -+ IGNORE_CHARS (new_len_a, lena, texta, copy_a, -+ wc_a, mblength_a, state_a); -+ IGNORE_CHARS (new_len_b, lenb, textb, copy_b, -+ wc_b, mblength_b, state_b); -+ diff = xmemcoll (copy_a, new_len_a, copy_b, new_len_b); -+ -+ if (sizeof buf < size) -+ free (copy_a); -+ } -+ else if (lena == 0) -+ diff = - NONZERO (lenb); -+ else if (lenb == 0) -+ goto greater; -+ else -+ diff = xmemcoll (texta, lena, textb, lenb); -+ } -+ -+ if (diff) -+ goto not_equal; -+ -+ key = key->next; -+ if (! key) -+ break; -+ -+ /* Find the beginning and limit of the next field. 
*/ -+ if (key->eword != SIZE_MAX) -+ lima = limfield (a, key), limb = limfield (b, key); -+ else -+ lima = a->text + a->length - 1, limb = b->text + b->length - 1; -+ -+ if (key->sword != SIZE_MAX) -+ texta = begfield (a, key), textb = begfield (b, key); -+ else -+ { -+ texta = a->text, textb = b->text; -+ if (key->skipsblanks) -+ { -+ while (texta < lima && ismbblank (texta, &mblength_a)) -+ texta += mblength_a; -+ while (textb < limb && ismbblank (textb, &mblength_b)) -+ textb += mblength_b; -+ } -+ } -+ } -+ -+ return 0; -+ -+greater: - diff = 1; -- not_equal: -+not_equal: - return key->reverse ? -diff : diff; - } -+#endif - - /* Compare two lines A and B, returning negative, zero, or positive - depending on whether A compares less than, equal to, or greater than B. */ -@@ -2857,6 +3470,11 @@ set_ordering (const char *s, struct keyf - break; - case 'M': - key->month = true; -+#if HAVE_MBRTOWC -+ if (strcmp (setlocale (LC_CTYPE, NULL), setlocale (LC_TIME, NULL))) -+ error (0, 0, _("As LC_TIME differs from LC_CTYPE, the results may be strange.")); -+ inittables_mb (); -+#endif - break; - case 'n': - key->numeric = true; -@@ -2915,7 +3533,7 @@ main (int argc, char **argv) - initialize_exit_failure (SORT_FAILURE); - - hard_LC_COLLATE = hard_locale (LC_COLLATE); --#if HAVE_NL_LANGINFO -+#if HAVE_LANGINFO_CODESET - hard_LC_TIME = hard_locale (LC_TIME); - #endif - -@@ -2928,14 +3546,40 @@ main (int argc, char **argv) - add support for multibyte decimal points. */ - decimal_point = to_uchar (locale->decimal_point[0]); - if (! decimal_point || locale->decimal_point[1]) -- decimal_point = '.'; -+ { -+ decimal_point = '.'; -+ if (locale->decimal_point[0] && locale->decimal_point[1]) -+ force_general_numcompare = 1; -+ } - - /* FIXME: add support for multibyte thousands separators. */ - thousands_sep = to_uchar (*locale->thousands_sep); - if (! 
thousands_sep || locale->thousands_sep[1]) -- thousands_sep = -1; -+ { -+ thousands_sep = -1; -+ if (locale->thousands_sep[0] && locale->thousands_sep[1]) -+ force_general_numcompare = 1; -+ } - } - -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ blank_type = wctype ("blank"); -+ begfield = begfield_mb; -+ limfield = limfield_mb; -+ getmonth = getmonth_mb; -+ keycompare = keycompare_mb; -+ } -+ else -+#endif -+ { -+ begfield = begfield_uni; -+ limfield = limfield_uni; -+ keycompare = keycompare_uni; -+ getmonth = getmonth_uni; -+ } -+ - have_read_stdin = false; - inittables (); - -@@ -3196,13 +3840,32 @@ main (int argc, char **argv) - - case 't': - { -- char newtab = optarg[0]; -- if (! newtab) -+ const char *newtab = optarg; -+ size_t newtab_length; -+ if (! newtab[0]) - error (SORT_FAILURE, 0, _("empty tab")); -- if (optarg[1]) -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ mbstate_t state; -+ -+ memset (&state, 0, sizeof (mbstate_t)); -+ newtab_length = mbrtowc (NULL, newtab, strlen (newtab), &state); -+ if (newtab_length == (size_t) 0 -+ || newtab_length == (size_t) -1 -+ || newtab_length == (size_t) -2) -+ newtab_length = 1; -+ } -+ else -+#endif -+ newtab_length = 1; -+ if (optarg[newtab_length]) - { - if (STREQ (optarg, "\\0")) -- newtab = '\0'; -+ { -+ newtab = "\0"; -+ newtab_length = 1; -+ } - else - { - /* Provoke with `sort -txx'. 
Complain about -@@ -3213,9 +3876,12 @@ main (int argc, char **argv) - quote (optarg)); - } - } -- if (tab != TAB_DEFAULT && tab != newtab) -+ if (tab != NULL -+ && (tab_length != newtab_length -+ || memcmp (tab, newtab, tab_length) != 0)) - error (SORT_FAILURE, 0, _("incompatible tabs")); - tab = newtab; -+ tab_length = newtab_length; - } - break; - -Index: src/unexpand.c -=================================================================== ---- coreutils-7.1/src/unexpand.c.orig 2008-11-10 14:17:52.000000000 +0100 -+++ coreutils-7.1/src/unexpand.c 2010-06-29 18:49:31.975522293 +0200 -@@ -38,11 +38,34 @@ - #include - #include - #include -+ -+/* Get mbstate_t, mbrtowc(), wcwidth() */ -+#if HAVE_WCHAR_H -+# include -+#endif -+/* Get iswblank */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ -+ -+/* A sentinel value that's placed at the end of the list of tab stops. -+ * This value must be a large number, but not so large that adding the -+ * length of a line to it would cause the column variable to overflow. */ -+#define TAB_STOP_SENTINEL INT_MAX -+ - #include "system.h" - #include "error.h" - #include "quote.h" - #include "xstrndup.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "unexpand" - -@@ -449,6 +472,237 @@ unexpand (void) - } - } - -+#if HAVE_MBRTOWC && HAVE_WCTYPE_H -+static void -+unexpand_multibyte (void) -+{ -+ /* Input stream. */ -+ FILE *fp = next_file (NULL); -+ -+ mbstate_t i_state; /* Current shift state of the input stream. */ -+ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ -+ char *bufpos; /* Next read position of BUF. */ -+ size_t buflen = 0; /* The length of the byte sequence in buf. */ -+ -+ /* The array of pending blanks. 
In non-POSIX locales, blanks can -+ include characters other than spaces, so the blanks must be -+ stored, not merely counted. */ -+ char *pending_blank; -+ -+ if (!fp) -+ return; -+ -+ /* The worst case is a non-blank character, then one blank, then a -+ tab stop, then MAX_COLUMN_WIDTH - 1 blanks, then a non-blank; so -+ allocate MAX_COLUMN_WIDTH bytes to store the blanks. */ -+ pending_blank = xmalloc (max_column_width); -+ -+ memset (&i_state, '\0', sizeof(mbstate_t)); -+ -+ for (;;) -+ { -+ /* A gotten wide character. */ -+ wint_t wc; -+ -+ /* If true, perform translations. */ -+ bool convert = true; -+ -+ /* The following variables have valid values only when CONVERT -+ is true: */ -+ -+ /* Column of next input character. */ -+ uintmax_t column = 0; -+ -+ /* Column the next input tab stop is on. */ -+ uintmax_t next_tab_column = 0; -+ -+ /* Index in TAB_LIST of next tab stop to examine. */ -+ size_t tab_index = 0; -+ -+ /* If true, the first pending blank came just before a tab stop. */ -+ bool one_blank_before_tab_stop = false; -+ -+ /* If true, the previous input character was a blank. This is -+ initially true, since initial strings of blanks are treated -+ as if the line was preceded by a blank. */ -+ bool prev_blank = true; -+ -+ /* Number of pending columns of blanks. */ -+ size_t pending = 0; -+ -+ /* Convert a line of text. */ -+ do -+ { -+ wchar_t w; -+ size_t mblength; /* The byte size of a multibyte character -+ which shows as same character as WC. */ -+ mbstate_t i_state_bak; /* Back up the I_STATE. 
*/ -+ -+ /* Fill buffer */ -+ if (buflen < MB_LEN_MAX) -+ { -+ if (!feof (fp) && !ferror (fp)) -+ { -+ if (buflen > 0) -+ memmove (buf, bufpos, buflen); -+ buflen += fread (buf + buflen, sizeof (char), BUFSIZ, fp); -+ bufpos = buf; -+ } -+ } -+ -+ if (buflen < 1) -+ { -+ /* Move to the next file */ -+ if (feof (fp) || ferror (fp)) -+ fp = next_file (fp); -+ if (!fp) -+ { -+ if (pending) -+ { -+ if (fwrite (pending_blank, 1, pending, stdout) != pending) -+ error (EXIT_FAILURE, errno, _("write error")); -+ } -+ free (pending_blank); -+ return; -+ } -+ continue; -+ } -+ -+ i_state_bak = i_state; -+ mblength = mbrtowc (&w, bufpos, buflen, &i_state); -+ wc = w; -+ -+ if (mblength == (size_t) -1 || mblength == (size_t) -2) -+ { -+ i_state = i_state_bak; -+ wc = L'\0'; -+ column += convert; -+ mblength = 1; -+ } -+ -+ if (convert) -+ { -+ bool blank = iswblank (wc); -+ -+ if (blank) -+ { -+ if (next_tab_column <= column) -+ { -+ if (tab_size) -+ next_tab_column = -+ column + (tab_size - column % tab_size); -+ else -+ for (;;) -+ if (tab_index == first_free_tab) -+ { -+ convert = false; -+ break; -+ } -+ else -+ { -+ uintmax_t tab = tab_list[tab_index++]; -+ if (column < tab) -+ { -+ next_tab_column = tab; -+ break; -+ } -+ } -+ } -+ -+ if (convert) -+ { -+ if (next_tab_column < column) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ -+ if (wc == L'\t') -+ { -+ column = next_tab_column; -+ -+ /* Discard pending blanks, unless it was a single -+ blank just before the previous tab stop. */ -+ if (! (pending == 1 && one_blank_before_tab_stop)) -+ { -+ pending = 0; -+ one_blank_before_tab_stop = false; -+ } -+ } -+ else -+ { -+ column++; -+ -+ if (! (prev_blank && column == next_tab_column)) -+ { -+ /* It is not yet known whether the pending blanks -+ will be replaced by tabs. 
*/ -+ if (column == next_tab_column) -+ one_blank_before_tab_stop = true; -+ pending_blank[pending++] = ' '; -+ prev_blank = true; -+ buflen -= mblength; -+ bufpos += mblength; -+ continue; -+ } -+ -+ /* Replace the pending blanks by a tab or two. */ -+ pending_blank[0] = *bufpos = '\t'; -+ pending = one_blank_before_tab_stop; -+ } -+ } -+ } -+ else if (wc == L'\b') -+ { -+ /* Go back one column, and force recalculation of the -+ next tab stop. */ -+ column -= !!column; -+ next_tab_column = column; -+ tab_index -= !!tab_index; -+ } -+ else -+ { -+ if (!iswcntrl (wc)) -+ { -+ int width = wcwidth (wc); -+ if (width > 0) -+ { -+ if (column > (column + width)) -+ error (EXIT_FAILURE, 0, _("input line is too long")); -+ column += width; -+ } -+ } -+ } -+ -+ if (pending) -+ { -+ if (fwrite (pending_blank, 1, pending, stdout) != pending) -+ error (EXIT_FAILURE, errno, _("write error")); -+ pending = 0; -+ one_blank_before_tab_stop = false; -+ } -+ -+ prev_blank = blank; -+ convert &= convert_entire_line | blank; -+ } -+ -+ if (mblength) -+ { -+ if (fwrite (bufpos, sizeof (char), mblength, stdout) < mblength) -+ error (EXIT_FAILURE, errno, _("write error")); -+ } -+ else -+ { -+ if (putchar ('\0')) -+ error (EXIT_FAILURE, errno, _("write error")); -+ mblength = 1; -+ } -+ -+ buflen -= mblength; -+ bufpos += mblength; -+ } -+ while (wc != L'\n'); -+ } -+} -+#endif -+ - int - main (int argc, char **argv) - { -@@ -527,7 +781,12 @@ main (int argc, char **argv) - - file_list = (optind < argc ? 
&argv[optind] : stdin_argv); - -- unexpand (); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ unexpand_multibyte (); -+ else -+#endif -+ unexpand (); - - if (have_read_stdin && fclose (stdin) != 0) - error (EXIT_FAILURE, errno, "-"); -Index: src/uniq.c -=================================================================== ---- coreutils-7.1/src/uniq.c.orig 2008-11-10 14:17:52.000000000 +0100 -+++ coreutils-7.1/src/uniq.c 2010-06-29 18:49:32.040030047 +0200 -@@ -22,6 +22,16 @@ - #include - #include - -+/* Get mbstate_t, mbrtowc(), wcrtomb() */ -+#if HAVE_WCHAR_H -+# include -+#endif -+ -+/* Get iswctype(), wctype(), towupper)(. */ -+#if HAVE_WCTYPE_H -+# include -+#endif -+ - #include "system.h" - #include "argmatch.h" - #include "linebuffer.h" -@@ -32,6 +42,13 @@ - #include "xstrtol.h" - #include "memcasecmp.h" - -+/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC -+ installation; work around this configuration error. */ -+#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 -+# undef MB_LEN_MAX -+# define MB_LEN_MAX 16 -+#endif -+ - /* The official name of this program (e.g., no `g' prefix). */ - #define PROGRAM_NAME "uniq" - -@@ -106,6 +123,12 @@ static enum delimit_method const delimit - /* Select whether/how to delimit groups of duplicate lines. */ - static enum delimit_method delimit_groups; - -+/* Function pointers. */ -+static char * (*find_field) (struct linebuffer *line); -+ -+/* Show the blank character class. */ -+wctype_t blank_type; -+ - static struct option const longopts[] = - { - {"count", no_argument, NULL, 'c'}, -@@ -202,7 +225,7 @@ size_opt (char const *opt, char const *m - return a pointer to the beginning of the line's field to be compared. 
*/ - - static char * --find_field (struct linebuffer const *line) -+find_field_uni (struct linebuffer const *line) - { - size_t count; - char const *lp = line->buffer; -@@ -223,6 +246,83 @@ find_field (struct linebuffer const *lin - return line->buffer + i; - } - -+#if HAVE_MBRTOWC -+ -+# define MBCHAR_TO_WCHAR(WC, MBLENGTH, LP, POS, SIZE, STATEP, CONVFAIL) \ -+ do \ -+ { \ -+ mbstate_t state_bak; \ -+ \ -+ CONVFAIL = 0; \ -+ state_bak = *STATEP; \ -+ \ -+ MBLENGTH = mbrtowc (&WC, LP + POS, SIZE - POS, STATEP); \ -+ \ -+ switch (MBLENGTH) \ -+ { \ -+ case (size_t)-2: \ -+ case (size_t)-1: \ -+ *STATEP = state_bak; \ -+ CONVFAIL++; \ -+ /* Fall through */ \ -+ case 0: \ -+ MBLENGTH = 1; \ -+ } \ -+ } \ -+ while (0) -+ -+static char * -+find_field_multi (struct linebuffer const *line) -+{ -+ size_t count; -+ char *lp = line->buffer; -+ size_t size = line->length - 1; -+ size_t pos; -+ size_t mblength; -+ wchar_t wc; -+ mbstate_t *statep; -+ int convfail; -+ -+ pos = 0; -+ statep = &line->state; -+ -+ /* skip fields. */ -+ for (count = 0; count < skip_fields && pos < size; count++) -+ { -+ while (pos < size) -+ { -+ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); -+ -+ if (convfail || !iswctype (wc, blank_type)) -+ { -+ pos += mblength; -+ break; -+ } -+ pos += mblength; -+ } -+ -+ while (pos < size) -+ { -+ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); -+ -+ if (!convfail && iswctype (wc, blank_type)) -+ break; -+ -+ pos += mblength; -+ } -+ } -+ -+ /* skip fields. */ -+ for (count = 0; count < skip_chars && pos < size; count++) -+ { -+ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); -+ pos += mblength; -+ } -+ -+ return lp + pos; -+} -+#endif -+ - /* Return false if two strings OLD and NEW match, true if not. - OLD and NEW point not to the beginnings of the lines - but rather to the beginnings of the fields to compare. 
-@@ -247,6 +347,73 @@ different (char *old, char *new, size_t - return oldlen != newlen || memcmp (old, new, oldlen); - } - -+#if HAVE_MBRTOWC -+static int -+different_multi (const char *old, const char *new, size_t oldlen, size_t newlen, mbstate_t oldstate, mbstate_t newstate) -+{ -+ size_t i, j, chars; -+ const char *str[2]; -+ char *copy[2]; -+ size_t len[2]; -+ mbstate_t state[2]; -+ size_t mblength; -+ wchar_t wc, uwc; -+ mbstate_t state_bak; -+ -+ str[0] = old; -+ str[1] = new; -+ len[0] = oldlen; -+ len[1] = newlen; -+ state[0] = oldstate; -+ state[1] = newstate; -+ -+ for (i = 0; i < 2; i++) -+ { -+ copy[i] = alloca (len[i] + 1); -+ -+ for (j = 0, chars = 0; j < len[i] && chars < check_chars; chars++) -+ { -+ state_bak = state[i]; -+ mblength = mbrtowc (&wc, str[i] + j, len[i] - j, &state[i]); -+ -+ switch (mblength) -+ { -+ case (size_t)-1: -+ case (size_t)-2: -+ state[i] = state_bak; -+ /* Fall through */ -+ case 0: -+ mblength = 1; -+ break; -+ -+ default: -+ if (ignore_case) -+ { -+ uwc = towupper (wc); -+ -+ if (uwc != wc) -+ { -+ mbstate_t state_wc; -+ -+ memset (&state_wc, '\0', sizeof (mbstate_t)); -+ wcrtomb (copy[i] + j, uwc, &state_wc); -+ } -+ else -+ memcpy (copy[i] + j, str[i] + j, mblength); -+ } -+ else -+ memcpy (copy[i] + j, str[i] + j, mblength); -+ } -+ j += mblength; -+ } -+ copy[i][j] = '\0'; -+ len[i] = j; -+ } -+ -+ return xmemcoll (copy[0], len[0], copy[1], len[1]); -+} -+#endif -+ - /* Output the line in linebuffer LINE to standard output - provided that the switches say it should be output. - MATCH is true if the line matches the previous line. 
-@@ -299,15 +466,42 @@ check_file (const char *infile, const ch - { - char *prevfield IF_LINT (= NULL); - size_t prevlen IF_LINT (= 0); -+#if HAVE_MBRTOWC -+ mbstate_t prevstate; - -+ memset (&prevstate, '\0', sizeof (mbstate_t)); -+#endif - while (!feof (stdin)) - { - char *thisfield; - size_t thislen; -+#if HAVE_MBRTOWC -+ mbstate_t thisstate; -+#endif - if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) - break; - thisfield = find_field (thisline); - thislen = thisline->length - 1 - (thisfield - thisline->buffer); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ thisstate = thisline->state; -+ -+ if (prevline->length == 0 -+ || different_multi (thisfield, prevfield, thislen, prevlen, -+ thisstate, prevstate)) -+ { -+ fwrite (thisline->buffer, sizeof (char), -+ thisline->length, stdout); -+ -+ SWAP_LINES (prevline, thisline); -+ prevfield = thisfield; -+ prevlen = thislen; -+ prevstate = thisstate; -+ } -+ } -+ else -+#endif - if (prevline->length == 0 - || different (thisfield, prevfield, thislen, prevlen)) - { -@@ -326,17 +520,26 @@ check_file (const char *infile, const ch - size_t prevlen; - uintmax_t match_count = 0; - bool first_delimiter = true; -+#if HAVE_MBRTOWC -+ mbstate_t prevstate; -+#endif - - if (readlinebuffer_delim (prevline, stdin, delimiter) == 0) - goto closefiles; - prevfield = find_field (prevline); - prevlen = prevline->length - 1 - (prevfield - prevline->buffer); -+#if HAVE_MBRTOWC -+ prevstate = prevline->state; -+#endif - - while (!feof (stdin)) - { - bool match; - char *thisfield; - size_t thislen; -+#if HAVE_MBRTOWC -+ mbstate_t thisstate; -+#endif - if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) - { - if (ferror (stdin)) -@@ -345,6 +548,15 @@ check_file (const char *infile, const ch - } - thisfield = find_field (thisline); - thislen = thisline->length - 1 - (thisfield - thisline->buffer); -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ thisstate = thisline->state; -+ match = !different_multi (thisfield, 
prevfield, -+ thislen, prevlen, thisstate, prevstate); -+ } -+ else -+#endif - match = !different (thisfield, prevfield, thislen, prevlen); - match_count += match; - -@@ -377,6 +589,9 @@ check_file (const char *infile, const ch - SWAP_LINES (prevline, thisline); - prevfield = thisfield; - prevlen = thislen; -+#if HAVE_MBRTOWC -+ prevstate = thisstate; -+#endif - if (!match) - match_count = 0; - } -@@ -422,6 +637,18 @@ main (int argc, char **argv) - - atexit (close_stdout); - -+#if HAVE_MBRTOWC -+ if (MB_CUR_MAX > 1) -+ { -+ find_field = find_field_multi; -+ blank_type = wctype ("blank"); -+ } -+ else -+#endif -+ { -+ find_field = find_field_uni; -+ } -+ - skip_chars = 0; - skip_fields = 0; - check_chars = SIZE_MAX; -Index: tests/misc/cut -=================================================================== ---- coreutils-7.1/tests/misc/cut.orig 2008-09-18 09:06:57.000000000 +0200 -+++ coreutils-7.1/tests/misc/cut 2010-06-29 18:49:32.091533700 +0200 -@@ -26,7 +26,7 @@ use strict; - my $prog = 'cut'; - my $try = "Try \`$prog --help' for more information.\n"; - my $from_1 = "$prog: fields and positions are numbered from 1\n$try"; --my $inval = "$prog: invalid byte or field list\n$try"; -+my $inval = "$prog: invalid byte, character or field list\n$try"; - my $no_endpoint = "$prog: invalid range with no endpoint: -\n$try"; - - my @Tests = diff --git a/coreutils-5.3.0-sbin4su.diff b/coreutils-5.3.0-sbin4su.patch similarity index 90% rename from coreutils-5.3.0-sbin4su.diff rename to coreutils-5.3.0-sbin4su.patch index bf2cc6c..3af4168 100644 --- a/coreutils-5.3.0-sbin4su.diff +++ b/coreutils-5.3.0-sbin4su.patch @@ -1,8 +1,8 @@ Index: src/su.c =================================================================== ---- src/su.c.orig 2010-05-04 17:29:12.779359204 +0200 -+++ src/su.c 2010-05-04 17:29:12.939359620 +0200 -@@ -467,6 +467,117 @@ correct_password (const struct passwd *p +--- src/su.c.orig 2010-05-05 14:46:48.000000000 +0200 ++++ src/su.c 2010-05-05 14:48:55.023359308 
+0200 +@@ -454,6 +454,117 @@ correct_password (const struct passwd *p #endif /* !USE_PAM */ } @@ -120,7 +120,7 @@ Index: src/su.c /* Update `environ' for the new shell based on PW, with SHELL being the value for the SHELL environment variable. */ -@@ -506,6 +617,22 @@ modify_environment (const struct passwd +@@ -493,6 +604,22 @@ modify_environment (const struct passwd DEFAULT_LOGIN_PATH) : getdef_str ("SUPATH", DEFAULT_ROOT_LOGIN_PATH))); @@ -140,6 +140,6 @@ Index: src/su.c + free (new); + } + } - if (pw->pw_uid) - { - xsetenv ("USER", pw->pw_name); + if (pw->pw_uid) + { + xsetenv ("USER", pw->pw_name); diff --git a/coreutils-6.8-su.diff b/coreutils-6.8-su.patch similarity index 78% rename from coreutils-6.8-su.diff rename to coreutils-6.8-su.patch index 090b0f3..c8e3e05 100644 --- a/coreutils-6.8-su.diff +++ b/coreutils-6.8-su.patch @@ -1,6 +1,10 @@ ---- Makefile.in -+++ Makefile.in -@@ -732,6 +732,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Add pam support in su + +Index: Makefile.in +=================================================================== +--- Makefile.in.orig 2010-04-23 17:58:41.000000000 +0200 ++++ Makefile.in 2010-05-06 19:37:44.784359208 +0200 +@@ -961,6 +961,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -8,41 +12,35 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- configure -+++ configure -@@ -612,6 +612,7 @@ OPTIONAL_BIN_PROGS +Index: configure +=================================================================== +--- configure.orig 2010-05-06 19:37:44.688359301 +0200 ++++ configure 2010-05-06 19:37:44.816359169 +0200 +@@ -631,6 +631,7 @@ OPTIONAL_BIN_PROGS INSTALL_SU LIB_GMP LIB_CRYPT +PAM_LIBS + GNULIB_WARN_CFLAGS WERROR_CFLAGS SEQ_LIBM - LIB_CAP -@@ -1231,6 +1232,7 @@ with_included_regex - enable_xattr +@@ -1501,6 +1502,7 @@ enable_xattr enable_libcap + with_tty_group enable_gcc_warnings +enable_pam with_gmp 
enable_install_program enable_no_install_program -@@ -1877,6 +1879,7 @@ Optional Features: +@@ -2152,6 +2154,7 @@ Optional Features: --disable-xattr do not support extended attributes --disable-libcap disable libcap support - --enable-gcc-warnings turn on lots of GCC warnings (not recommended) -+ --disable-pam Enable PAM support in su (default=auto) + --enable-gcc-warnings turn on lots of GCC warnings (for developers) ++ --disable-pam Disable PAM support in su (default=auto) --enable-install-program=PROG_LIST install the programs in PROG_LIST (comma-separated, default: none) -@@ -26931,7 +26934,6 @@ fi - - - -- - XGETTEXT_EXTRA_OPTIONS="$XGETTEXT_EXTRA_OPTIONS --keyword='proper_name:1,\"This is a proper name. See the gettext manual, section Names.\"'" - - -@@ -39096,6 +39098,111 @@ $as_echo "#define HAVE_WORKING_FORK 1" > +@@ -51989,6 +51992,111 @@ $as_echo "#define HAVE_WORKING_FORK 1" > fi @@ -152,11 +150,13 @@ +$as_echo "$enable_pam" >&6; } + optional_bin_progs= - for ac_func in uname - do ---- configure.ac -+++ configure.ac -@@ -79,6 +79,20 @@ fi + for ac_func in chroot + do : +Index: configure.ac +=================================================================== +--- configure.ac.orig 2010-03-13 16:14:09.000000000 +0100 ++++ configure.ac 2010-05-06 19:37:44.843292013 +0200 +@@ -128,6 +128,20 @@ fi AC_FUNC_FORK @@ -175,11 +175,13 @@ +AC_MSG_RESULT([$enable_pam]) + optional_bin_progs= - AC_CHECK_FUNCS([uname], - gl_ADD_PROG([optional_bin_progs], [uname])) ---- doc/Makefile.in -+++ doc/Makefile.in -@@ -713,6 +713,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ + AC_CHECK_FUNCS([chroot], + gl_ADD_PROG([optional_bin_progs], [chroot])) +Index: doc/Makefile.in +=================================================================== +--- doc/Makefile.in.orig 2010-04-23 17:58:37.000000000 +0200 ++++ doc/Makefile.in 2010-05-06 19:37:44.868359246 +0200 +@@ -957,6 +957,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ 
PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -187,9 +189,11 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- gnulib-tests/Makefile.in -+++ gnulib-tests/Makefile.in -@@ -1421,6 +1421,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Index: gnulib-tests/Makefile.in +=================================================================== +--- gnulib-tests/Makefile.in.orig 2010-04-23 18:00:33.000000000 +0200 ++++ gnulib-tests/Makefile.in 2010-05-06 19:37:44.871374260 +0200 +@@ -2191,6 +2191,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -197,9 +201,11 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- lib/Makefile.in -+++ lib/Makefile.in -@@ -763,6 +763,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Index: lib/Makefile.in +=================================================================== +--- lib/Makefile.in.orig 2010-04-23 17:58:38.000000000 +0200 ++++ lib/Makefile.in 2010-05-06 19:37:59.594863753 +0200 +@@ -1006,6 +1006,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -207,9 +213,11 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- man/Makefile.in -+++ man/Makefile.in -@@ -703,6 +703,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Index: man/Makefile.in +=================================================================== +--- man/Makefile.in.orig 2010-05-06 19:37:44.618920753 +0200 ++++ man/Makefile.in 2010-05-06 19:37:44.934868934 +0200 +@@ -926,6 +926,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -217,24 +225,28 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ ---- src/Makefile.am -+++ src/Makefile.am -@@ -147,7 +147,8 @@ tail_LDADD = $(nanosec_libs) - # If necessary, add -lm to resolve 
use of pow in lib/strtod.c. - uptime_LDADD = $(LDADD) $(POW_LIB) $(GETLOADAVG_LIBS) +Index: src/Makefile.am +=================================================================== +--- src/Makefile.am.orig 2010-04-23 15:44:14.000000000 +0200 ++++ src/Makefile.am 2010-05-06 19:37:59.594863753 +0200 +@@ -364,7 +364,8 @@ factor_LDADD += $(LIB_GMP) + uptime_LDADD += $(GETLOADAVG_LIBS) --su_LDADD = $(LDADD) $(LIB_CRYPT) + # for crypt +-su_LDADD += $(LIB_CRYPT) +su_SOURCES = su.c getdef.c +su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) - dir_LDADD += $(LIB_ACL) - ls_LDADD += $(LIB_ACL) ---- src/Makefile.in -+++ src/Makefile.in -@@ -605,9 +605,10 @@ stty_OBJECTS = stty.$(OBJEXT) - stty_LDADD = $(LDADD) - stty_DEPENDENCIES = libver.a ../lib/libcoreutils.a \ - $(am__DEPENDENCIES_1) ../lib/libcoreutils.a + # for various ACL functions + copy_LDADD += $(LIB_ACL) +Index: src/Makefile.in +=================================================================== +--- src/Makefile.in.orig 2010-04-23 18:35:11.000000000 +0200 ++++ src/Makefile.in 2010-05-06 19:37:59.594863753 +0200 +@@ -553,9 +553,10 @@ stdbuf_DEPENDENCIES = $(am__DEPENDENCIES + stty_SOURCES = stty.c + stty_OBJECTS = stty.$(OBJEXT) + stty_DEPENDENCIES = $(am__DEPENDENCIES_2) -su_SOURCES = su.c -su_OBJECTS = su.$(OBJEXT) -su_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) @@ -244,40 +256,28 @@ + $(am__DEPENDENCIES_1) sum_SOURCES = sum.c sum_OBJECTS = sum.$(OBJEXT) - sum_LDADD = $(LDADD) -@@ -735,11 +736,11 @@ SOURCES = $(nodist_libver_a_SOURCES) $(_ - $(rm_SOURCES) $(rmdir_SOURCES) runcon.c seq.c setuidgid.c \ - $(sha1sum_SOURCES) $(sha224sum_SOURCES) $(sha256sum_SOURCES) \ - $(sha384sum_SOURCES) $(sha512sum_SOURCES) shred.c shuf.c \ -- sleep.c sort.c split.c stat.c stty.c su.c sum.c sync.c tac.c \ -- tail.c tee.c test.c $(timeout_SOURCES) touch.c tr.c true.c \ -- truncate.c tsort.c tty.c $(uname_SOURCES) unexpand.c uniq.c \ -- unlink.c uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c \ -- yes.c -+ sleep.c sort.c 
split.c stat.c stty.c $(su_SOURCES) sum.c \ -+ sync.c tac.c tail.c tee.c test.c $(timeout_SOURCES) touch.c \ -+ tr.c true.c truncate.c tsort.c tty.c $(uname_SOURCES) \ -+ unexpand.c uniq.c unlink.c uptime.c users.c $(vdir_SOURCES) \ -+ wc.c who.c whoami.c yes.c - DIST_SOURCES = $(__SOURCES) $(arch_SOURCES) base64.c basename.c cat.c \ - chcon.c $(chgrp_SOURCES) chmod.c $(chown_SOURCES) chroot.c \ - cksum.c comm.c $(cp_SOURCES) csplit.c cut.c date.c dd.c df.c \ -@@ -754,10 +755,10 @@ DIST_SOURCES = $(__SOURCES) $(arch_SOURC + sum_DEPENDENCIES = $(am__DEPENDENCIES_2) +@@ -665,8 +666,8 @@ SOURCES = $(nodist_libver_a_SOURCES) $(_ $(rmdir_SOURCES) runcon.c seq.c setuidgid.c $(sha1sum_SOURCES) \ $(sha224sum_SOURCES) $(sha256sum_SOURCES) $(sha384sum_SOURCES) \ $(sha512sum_SOURCES) shred.c shuf.c sleep.c sort.c split.c \ -- stat.c stty.c su.c sum.c sync.c tac.c tail.c tee.c test.c \ -- $(timeout_SOURCES) touch.c tr.c true.c truncate.c tsort.c \ -- tty.c $(uname_SOURCES) unexpand.c uniq.c unlink.c uptime.c \ -- users.c $(vdir_SOURCES) wc.c who.c whoami.c yes.c -+ stat.c stty.c $(su_SOURCES) sum.c sync.c tac.c tail.c tee.c \ -+ test.c $(timeout_SOURCES) touch.c tr.c true.c truncate.c \ -+ tsort.c tty.c $(uname_SOURCES) unexpand.c uniq.c unlink.c \ -+ uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c yes.c - HEADERS = $(noinst_HEADERS) - ETAGS = etags - CTAGS = ctags -@@ -1209,6 +1210,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +- stat.c stdbuf.c stty.c su.c sum.c sync.c tac.c tail.c tee.c \ +- test.c $(timeout_SOURCES) touch.c tr.c true.c truncate.c \ ++ stat.c stdbuf.c stty.c $(su_SOURCES) sum.c sync.c tac.c tail.c \ ++ tee.c test.c $(timeout_SOURCES) touch.c tr.c true.c truncate.c \ + tsort.c tty.c $(uname_SOURCES) unexpand.c uniq.c unlink.c \ + uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c yes.c + DIST_SOURCES = $(__SOURCES) $(arch_SOURCES) base64.c basename.c cat.c \ +@@ -683,7 +684,7 @@ DIST_SOURCES = $(__SOURCES) $(arch_SOURC + $(rm_SOURCES) $(rmdir_SOURCES) 
runcon.c seq.c setuidgid.c \ + $(sha1sum_SOURCES) $(sha224sum_SOURCES) $(sha256sum_SOURCES) \ + $(sha384sum_SOURCES) $(sha512sum_SOURCES) shred.c shuf.c \ +- sleep.c sort.c split.c stat.c stdbuf.c stty.c su.c sum.c \ ++ sleep.c sort.c split.c stat.c stdbuf.c stty.c $(su_SOURCES) sum.c \ + sync.c tac.c tail.c tee.c test.c $(timeout_SOURCES) touch.c \ + tr.c true.c truncate.c tsort.c tty.c $(uname_SOURCES) \ + unexpand.c uniq.c unlink.c uptime.c users.c $(vdir_SOURCES) \ +@@ -1338,6 +1339,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ @@ -285,17 +285,17 @@ PATH_SEPARATOR = @PATH_SEPARATOR@ PERL = @PERL@ POSIX_SHELL = @POSIX_SHELL@ -@@ -1511,7 +1513,8 @@ tail_LDADD = $(nanosec_libs) +@@ -1743,7 +1745,8 @@ stdbuf_LDADD = $(LDADD) $(LIBICONV) + stty_LDADD = $(LDADD) - # If necessary, add -lm to resolve use of pow in lib/strtod.c. - uptime_LDADD = $(LDADD) $(POW_LIB) $(GETLOADAVG_LIBS) + # for crypt -su_LDADD = $(LDADD) $(LIB_CRYPT) +su_SOURCES = su.c getdef.c +su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) - stat_LDADD = $(LDADD) $(LIB_SELINUX) - - # programs that use getaddrinfo (e.g., via canon_host) -@@ -2040,6 +2043,7 @@ distclean-compile: + sum_LDADD = $(LDADD) + sync_LDADD = $(LDADD) + tac_LDADD = $(LDADD) $(LIB_GETHRXTIME) +@@ -2386,6 +2389,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/false.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fmt.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fold.Po@am__quote@ @@ -303,8 +303,10 @@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/getlimits.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-copy.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-cp-hash.Po@am__quote@ ---- src/getdef.c -+++ src/getdef.c +Index: src/getdef.c +=================================================================== +--- /dev/null 1970-01-01 
00:00:00.000000000 +0000 ++++ src/getdef.c 2010-05-06 19:37:45.014990147 +0200 @@ -0,0 +1,259 @@ +/* Copyright (C) 2003, 2004, 2005 Thorsten Kukuk + Author: Thorsten Kukuk @@ -565,8 +567,10 @@ +} + +#endif ---- src/getdef.h -+++ src/getdef.h +Index: src/getdef.h +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ src/getdef.h 2010-05-06 19:37:45.054863903 +0200 @@ -0,0 +1,29 @@ +/* Copyright (C) 2003, 2005 Thorsten Kukuk + Author: Thorsten Kukuk @@ -597,8 +601,10 @@ +extern void free_getdef_data (void); + +#endif /* _GETDEF_H_ */ ---- src/su.c -+++ src/su.c +Index: src/su.c +=================================================================== +--- src/su.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/su.c 2010-05-06 19:37:59.538860383 +0200 @@ -37,6 +37,16 @@ restricts who can su to UID 0 accounts. RMS considers that to be fascist. @@ -616,7 +622,7 @@ Compile-time options: -DSYSLOG_SUCCESS Log successful su's (by default, to root) with syslog. -DSYSLOG_FAILURE Log failed su's (by default, to root) with syslog. -@@ -52,6 +62,13 @@ +@@ -52,12 +62,22 @@ #include #include #include @@ -628,9 +634,8 @@ +#include +#endif - /* Hide any system prototype for getusershell. - This is necessary because some Cray systems have a conflicting -@@ -65,6 +82,9 @@ + #include "system.h" + #include "getpass.h" #if HAVE_SYSLOG_H && HAVE_SYSLOG # include @@ -640,7 +645,7 @@ #else # undef SYSLOG_SUCCESS # undef SYSLOG_FAILURE -@@ -98,19 +118,13 @@ +@@ -91,19 +111,13 @@ # include #endif @@ -664,18 +669,20 @@ /* The shell to run if none is given in the user's passwd entry. */ #define DEFAULT_SHELL "/bin/sh" -@@ -118,13 +132,22 @@ +@@ -111,8 +125,9 @@ /* The user to become if none is specified. 
*/ #define DEFAULT_USER "root" +#ifndef USE_PAM char *crypt (char const *key, char const *salt); +- +#endif - char *getusershell (void); - void endusershell (void); - void setusershell (void); + static void run_shell (char const *, char const *, char **, size_t) + ATTRIBUTE_NORETURN; - extern char **environ; +@@ -125,6 +140,13 @@ static bool simulate_login; + /* If true, change some environment vars to indicate the user su'd to. */ + static bool change_environment; +#ifdef USE_PAM +static bool _pam_session_opened; @@ -684,10 +691,10 @@ +static void create_watching_parent (void); +#endif + - static void run_shell (char const *, char const *, char **, size_t) - ATTRIBUTE_NORETURN; - -@@ -212,7 +235,162 @@ log_su (struct passwd const *pw, bool su + static struct option const longopts[] = + { + {"command", required_argument, NULL, 'c'}, +@@ -200,7 +222,162 @@ log_su (struct passwd const *pw, bool su } #endif @@ -772,7 +779,7 @@ + /* the child proceeds to run the shell */ + if (child == 0) + return; -+ ++ + /* In the parent watch the child. */ + + /* su without pam support does not have a helper that keeps @@ -850,7 +857,7 @@ Return true if the user gives the correct password for entry PW, false if not. Return true without asking for a password if run by UID 0 or if PW has an empty password. 
*/ -@@ -220,10 +398,52 @@ log_su (struct passwd const *pw, bool su +@@ -208,10 +385,52 @@ log_su (struct passwd const *pw, bool su static bool correct_password (const struct passwd *pw) { @@ -904,7 +911,7 @@ endspent (); if (sp) -@@ -244,6 +464,7 @@ correct_password (const struct passwd *p +@@ -232,6 +451,7 @@ correct_password (const struct passwd *p encrypted = crypt (unencrypted, correct); memset (unencrypted, 0, strlen (unencrypted)); return STREQ (encrypted, correct); @@ -912,33 +919,33 @@ } /* Update `environ' for the new shell based on PW, with SHELL being -@@ -268,8 +489,8 @@ modify_environment (const struct passwd +@@ -256,8 +476,8 @@ modify_environment (const struct passwd xsetenv ("USER", pw->pw_name); xsetenv ("LOGNAME", pw->pw_name); xsetenv ("PATH", (pw->pw_uid -- ? DEFAULT_LOGIN_PATH -- : DEFAULT_ROOT_LOGIN_PATH)); +- ? DEFAULT_LOGIN_PATH +- : DEFAULT_ROOT_LOGIN_PATH)); + ? getdef_str ("PATH", DEFAULT_LOGIN_PATH) + : getdef_str ("SUPATH", DEFAULT_ROOT_LOGIN_PATH))); } else { -@@ -279,6 +500,12 @@ modify_environment (const struct passwd - { - xsetenv ("HOME", pw->pw_dir); - xsetenv ("SHELL", shell); +@@ -267,6 +487,12 @@ modify_environment (const struct passwd + { + xsetenv ("HOME", pw->pw_dir); + xsetenv ("SHELL", shell); + if (getdef_bool ("ALWAYS_SET_PATH", 0)) + xsetenv ("PATH", (pw->pw_uid + ? 
getdef_str ("PATH", + DEFAULT_LOGIN_PATH) + : getdef_str ("SUPATH", + DEFAULT_ROOT_LOGIN_PATH))); - if (pw->pw_uid) - { - xsetenv ("USER", pw->pw_name); -@@ -286,19 +513,41 @@ modify_environment (const struct passwd - } - } + if (pw->pw_uid) + { + xsetenv ("USER", pw->pw_name); +@@ -274,19 +500,41 @@ modify_environment (const struct passwd + } + } } + +#ifdef USE_PAM @@ -955,7 +962,7 @@ #ifdef HAVE_INITGROUPS errno = 0; if (initgroups (pw->pw_name, pw->pw_gid) == -1) -- error (EXIT_FAILURE, errno, _("cannot set groups")); +- error (EXIT_CANCELED, errno, _("cannot set groups")); + { +#ifdef USE_PAM + cleanup_pam (PAM_ABORT); @@ -978,17 +985,17 @@ +change_identity (const struct passwd *pw) +{ if (setgid (pw->pw_gid)) - error (EXIT_FAILURE, errno, _("cannot set group id")); + error (EXIT_CANCELED, errno, _("cannot set group id")); if (setuid (pw->pw_uid)) -@@ -491,6 +740,7 @@ main (int argc, char **argv) +@@ -479,6 +727,7 @@ main (int argc, char **argv) #ifdef SYSLOG_FAILURE log_su (pw, false); #endif + sleep (getdef_num ("FAIL_DELAY", 1)); - error (EXIT_FAILURE, 0, _("incorrect password")); + error (EXIT_CANCELED, 0, _("incorrect password")); } #ifdef SYSLOG_SUCCESS -@@ -512,9 +762,21 @@ main (int argc, char **argv) +@@ -500,9 +749,21 @@ main (int argc, char **argv) shell = NULL; } shell = xstrdup (shell ? 
shell : pw->pw_shell); @@ -1011,9 +1018,11 @@ if (simulate_login && chdir (pw->pw_dir) != 0) error (0, errno, _("warning: cannot change directory to %s"), pw->pw_dir); ---- tests/Makefile.in -+++ tests/Makefile.in -@@ -677,6 +677,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ +Index: tests/Makefile.in +=================================================================== +--- tests/Makefile.in.orig 2010-04-23 17:58:39.000000000 +0200 ++++ tests/Makefile.in 2010-05-06 19:37:45.091861849 +0200 +@@ -986,6 +986,7 @@ PACKAGE_STRING = @PACKAGE_STRING@ PACKAGE_TARNAME = @PACKAGE_TARNAME@ PACKAGE_URL = @PACKAGE_URL@ PACKAGE_VERSION = @PACKAGE_VERSION@ diff --git a/coreutils-6.8.0-pie.diff b/coreutils-6.8.0-pie.patch similarity index 76% rename from coreutils-6.8.0-pie.diff rename to coreutils-6.8.0-pie.patch index 36f565f..2a22116 100644 --- a/coreutils-6.8.0-pie.diff +++ b/coreutils-6.8.0-pie.patch @@ -1,28 +1,35 @@ ---- lib/Makefile.am -+++ lib/Makefile.am -@@ -18,6 +18,7 @@ +Index: lib/Makefile.am +=================================================================== +--- lib/Makefile.am.orig 2010-01-01 14:06:47.000000000 +0100 ++++ lib/Makefile.am 2010-05-05 14:38:03.083359277 +0200 +@@ -17,7 +17,7 @@ + include gnulib.mk - AM_CFLAGS = $(WARN_CFLAGS) # $(WERROR_CFLAGS) -+AM_CFLAGS += -fpie +-AM_CFLAGS += $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) ++AM_CFLAGS += $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) -fpie libcoreutils_a_SOURCES += \ buffer-lcm.c buffer-lcm.h \ ---- lib/Makefile.in -+++ lib/Makefile.in -@@ -1169,7 +1169,7 @@ GPERF = gperf - LINK_WARNING_H = $(top_srcdir)/build-aux/link-warning.h - charset_alias = $(DESTDIR)$(libdir)/charset.alias - charset_tmp = $(DESTDIR)$(libdir)/charset.tmp --AM_CFLAGS = $(WARN_CFLAGS) # $(WERROR_CFLAGS) -+AM_CFLAGS = $(WARN_CFLAGS) -fpie - all: $(BUILT_SOURCES) config.h - $(MAKE) $(AM_MAKEFLAGS) all-recursive - ---- src/Makefile.am -+++ src/Makefile.am -@@ -149,6 +149,10 @@ uptime_LDADD = $(LDADD) $(POW_LIB) $(GET - +Index: lib/Makefile.in 
+=================================================================== +--- lib/Makefile.in.orig 2010-05-05 14:37:08.000000000 +0200 ++++ lib/Makefile.in 2010-05-05 14:38:31.946859277 +0200 +@@ -1432,7 +1432,7 @@ DISTCLEANFILES = + MAINTAINERCLEANFILES = getdate.c iconv_open-aix.h iconv_open-hpux.h \ + iconv_open-irix.h iconv_open-osf.h iconv_open-solaris.h + AM_CPPFLAGS = +-AM_CFLAGS = $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) ++AM_CFLAGS = $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS) -fpie + libcoreutils_a_SOURCES = set-mode-acl.c copy-acl.c file-has-acl.c \ + areadlink.c areadlink-with-size.c areadlinkat.c argv-iter.c \ + argv-iter.h base64.h base64.c bitrotate.h c-ctype.h c-ctype.c \ +Index: src/Makefile.am +=================================================================== +--- src/Makefile.am.orig 2010-05-05 14:37:08.000000000 +0200 ++++ src/Makefile.am 2010-05-05 14:39:20.956359221 +0200 +@@ -366,6 +366,10 @@ uptime_LDADD += $(GETLOADAVG_LIBS) + # for crypt su_SOURCES = su.c getdef.c su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) +su_CFLAGS = -fpie @@ -30,14 +37,16 @@ +timeout_CFLAGS = -fpie +timeout_LDFLAGS = -pie - dir_LDADD += $(LIB_ACL) - ls_LDADD += $(LIB_ACL) ---- src/Makefile.in -+++ src/Makefile.in -@@ -605,10 +605,12 @@ stty_OBJECTS = stty.$(OBJEXT) - stty_LDADD = $(LDADD) - stty_DEPENDENCIES = libver.a ../lib/libcoreutils.a \ - $(am__DEPENDENCIES_1) ../lib/libcoreutils.a + # for various ACL functions + copy_LDADD += $(LIB_ACL) +Index: src/Makefile.in +=================================================================== +--- src/Makefile.in.orig 2010-05-05 14:37:08.000000000 +0200 ++++ src/Makefile.in 2010-05-05 14:46:02.318905172 +0200 +@@ -553,10 +553,12 @@ stdbuf_DEPENDENCIES = $(am__DEPENDENCIES + stty_SOURCES = stty.c + stty_OBJECTS = stty.$(OBJEXT) + stty_DEPENDENCIES = $(am__DEPENDENCIES_2) -am_su_OBJECTS = su.$(OBJEXT) getdef.$(OBJEXT) +am_su_OBJECTS = su-su.$(OBJEXT) su-getdef.$(OBJEXT) su_OBJECTS = $(am_su_OBJECTS) @@ -47,8 +56,8 @@ + $@ sum_SOURCES = 
sum.c sum_OBJECTS = sum.$(OBJEXT) - sum_LDADD = $(LDADD) -@@ -633,9 +635,12 @@ tee_DEPENDENCIES = libver.a ../lib/libco + sum_DEPENDENCIES = $(am__DEPENDENCIES_2) +@@ -576,9 +578,12 @@ tee_DEPENDENCIES = $(am__DEPENDENCIES_2) test_SOURCES = test.c test_OBJECTS = test.$(OBJEXT) test_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) @@ -62,36 +71,36 @@ touch_SOURCES = touch.c touch_OBJECTS = touch.$(OBJEXT) touch_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) -@@ -1515,6 +1520,10 @@ tail_LDADD = $(nanosec_libs) - uptime_LDADD = $(LDADD) $(POW_LIB) $(GETLOADAVG_LIBS) +@@ -1747,6 +1752,10 @@ stty_LDADD = $(LDADD) + # for crypt su_SOURCES = su.c getdef.c su_LDADD = $(LDADD) $(LIB_CRYPT) $(PAM_LIBS) +su_CFLAGS = -fpie +su_LDFLAGS = -pie +timeout_CFLAGS = -fpie +timeout_LDFLAGS = -pie - stat_LDADD = $(LDADD) $(LIB_SELINUX) - - # programs that use getaddrinfo (e.g., via canon_host) -@@ -1933,7 +1942,7 @@ stty$(EXEEXT): $(stty_OBJECTS) $(stty_DE - $(LINK) $(stty_OBJECTS) $(stty_LDADD) $(LIBS) + sum_LDADD = $(LDADD) + sync_LDADD = $(LDADD) + tac_LDADD = $(LDADD) $(LIB_GETHRXTIME) +@@ -2279,7 +2288,7 @@ stty$(EXEEXT): $(stty_OBJECTS) $(stty_DE + $(AM_V_CCLD)$(LINK) $(stty_OBJECTS) $(stty_LDADD) $(LIBS) su$(EXEEXT): $(su_OBJECTS) $(su_DEPENDENCIES) @rm -f su$(EXEEXT) -- $(LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) -+ $(su_LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) +- $(AM_V_CCLD)$(LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) ++ $(AM_V_CCLD)$(su_LINK) $(su_OBJECTS) $(su_LDADD) $(LIBS) sum$(EXEEXT): $(sum_OBJECTS) $(sum_DEPENDENCIES) @rm -f sum$(EXEEXT) - $(LINK) $(sum_OBJECTS) $(sum_LDADD) $(LIBS) -@@ -1954,7 +1963,7 @@ test$(EXEEXT): $(test_OBJECTS) $(test_DE - $(LINK) $(test_OBJECTS) $(test_LDADD) $(LIBS) + $(AM_V_CCLD)$(LINK) $(sum_OBJECTS) $(sum_LDADD) $(LIBS) +@@ -2300,7 +2309,7 @@ test$(EXEEXT): $(test_OBJECTS) $(test_DE + $(AM_V_CCLD)$(LINK) $(test_OBJECTS) $(test_LDADD) $(LIBS) timeout$(EXEEXT): $(timeout_OBJECTS) $(timeout_DEPENDENCIES) @rm -f 
timeout$(EXEEXT) -- $(LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) -+ $(timeout_LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) +- $(AM_V_CCLD)$(LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) ++ $(AM_V_CCLD)$(timeout_LINK) $(timeout_OBJECTS) $(timeout_LDADD) $(LIBS) touch$(EXEEXT): $(touch_OBJECTS) $(touch_DEPENDENCIES) @rm -f touch$(EXEEXT) - $(LINK) $(touch_OBJECTS) $(touch_LDADD) $(LIBS) -@@ -2043,7 +2052,6 @@ distclean-compile: + $(AM_V_CCLD)$(LINK) $(touch_OBJECTS) $(touch_LDADD) $(LIBS) +@@ -2389,7 +2398,6 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/false.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fmt.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fold.Po@am__quote@ @@ -99,9 +108,9 @@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/getlimits.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-copy.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginstall-cp-hash.Po@am__quote@ -@@ -2104,14 +2112,16 @@ distclean-compile: - @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/split.Po@am__quote@ +@@ -2453,14 +2461,16 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stat.Po@am__quote@ + @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stdbuf.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stty.Po@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/su.Po@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/su-getdef.Po@am__quote@ @@ -118,9 +127,9 @@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/touch.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tr.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/true.Po@am__quote@ -@@ -2286,6 +2296,62 @@ sha512sum-md5sum.obj: md5sum.c +@@ -2649,6 +2659,62 @@ sha512sum-md5sum.obj: md5sum.c @AMDEP_TRUE@@am__fastdepCC_FALSE@ DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@ - @am__fastdepCC_FALSE@ $(CC) $(DEFS) $(DEFAULT_INCLUDES) 
$(INCLUDES) $(sha512sum_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o sha512sum-md5sum.obj `if test -f 'md5sum.c'; then $(CYGPATH_W) 'md5sum.c'; else $(CYGPATH_W) '$(srcdir)/md5sum.c'; fi` + @am__fastdepCC_FALSE@ $(AM_V_CC@am__nodep@)$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(sha512sum_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o sha512sum-md5sum.obj `if test -f 'md5sum.c'; then $(CYGPATH_W) 'md5sum.c'; else $(CYGPATH_W) '$(srcdir)/md5sum.c'; fi` +su-su.o: su.c +@am__fastdepCC_TRUE@ $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(su_CFLAGS) $(CFLAGS) -MT su-su.o -MD -MP -MF $(DEPDIR)/su-su.Tpo -c -o su-su.o `test -f 'su.c' || echo '$(srcdir)/'`su.c diff --git a/coreutils-7.1.diff b/coreutils-7.1.diff deleted file mode 100644 index f755a19..0000000 --- a/coreutils-7.1.diff +++ /dev/null @@ -1,194 +0,0 @@ ---- configure -+++ configure -@@ -3029,7 +3029,6 @@ as_fn_append ac_func_list " fchmod" - as_fn_append ac_func_list " alarm" - as_fn_append ac_header_list " sys/statvfs.h" - as_fn_append ac_header_list " sys/select.h" --gl_printf_safe=yes - as_fn_append ac_func_list " readlink" - as_fn_append ac_header_list " utmp.h" - as_fn_append ac_header_list " utmpx.h" ---- doc/coreutils.texi -+++ doc/coreutils.texi -@@ -66,8 +66,6 @@ - * fold: (coreutils)fold invocation. Wrap long input lines. - * groups: (coreutils)groups invocation. Print group names a user is in. - * head: (coreutils)head invocation. Output the first part of files. --* hostid: (coreutils)hostid invocation. Print numeric host identifier. --* hostname: (coreutils)hostname invocation. Print or set system name. - * id: (coreutils)id invocation. Print user identity. - * install: (coreutils)install invocation. Copy and change attributes. - * join: (coreutils)join invocation. Join lines on a common field. -@@ -195,7 +193,7 @@ Free Documentation License''. 
- * File name manipulation:: dirname basename pathchk - * Working context:: pwd stty printenv tty - * User information:: id logname whoami groups users who --* System context:: date uname hostname hostid uptime -+* System context:: date uname uptime - * SELinux context:: chcon runcon - * Modified command invocation:: chroot env nice nohup su timeout - * Process control:: kill -@@ -409,8 +407,6 @@ System context - * arch invocation:: Print machine hardware name - * date invocation:: Print or set system date and time - * uname invocation:: Print system information --* hostname invocation:: Print or set system name --* hostid invocation:: Print numeric host identifier - * uptime invocation:: Print system uptime and load - - @command{date}: Print or set system date and time -@@ -12969,8 +12965,6 @@ information. - * arch invocation:: Print machine hardware name. - * date invocation:: Print or set system date and time. - * uname invocation:: Print system information. --* hostname invocation:: Print or set system name. --* hostid invocation:: Print numeric host identifier. - * uptime invocation:: Print system uptime and load - @end menu - -@@ -13928,54 +13922,6 @@ Print the kernel version. - @exitstatus - - --@node hostname invocation --@section @command{hostname}: Print or set system name -- --@pindex hostname --@cindex setting the hostname --@cindex printing the hostname --@cindex system name, printing --@cindex appropriate privileges -- --With no arguments, @command{hostname} prints the name of the current host --system. With one argument, it sets the current host name to the --specified string. You must have appropriate privileges to set the host --name. Synopsis: -- --@example --hostname [@var{name}] --@end example -- --The only options are @option{--help} and @option{--version}. @xref{Common --options}. -- --@exitstatus -- -- --@node hostid invocation --@section @command{hostid}: Print numeric host identifier. 
-- --@pindex hostid --@cindex printing the host identifier -- --@command{hostid} prints the numeric identifier of the current host --in hexadecimal. This command accepts no arguments. --The only options are @option{--help} and @option{--version}. --@xref{Common options}. -- --For example, here's what it prints on one system I use: -- --@example --$ hostid --1bac013d --@end example -- --On that system, the 32-bit quantity happens to be closely --related to the system's Internet address, but that isn't always --the case. -- --@exitstatus -- - @node uptime invocation - @section @command{uptime}: Print system uptime and load - ---- gnulib-tests/test-isnanl.h -+++ gnulib-tests/test-isnanl.h -@@ -75,7 +75,7 @@ main () - /* Quiet NaN. */ - ASSERT (isnanl (0.0L / 0.0L)); - --#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT -+#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT && 0 - /* A bit pattern that is different from a Quiet NaN. With a bit of luck, - it's a Signalling NaN. */ - { -@@ -117,6 +117,7 @@ main () - { LDBL80_WORDS (0xFFFF, 0x83333333, 0x00000000) }; - ASSERT (isnanl (x.value)); - } -+#if 0 - /* The isnanl function should recognize Pseudo-NaNs, Pseudo-Infinities, - Pseudo-Zeroes, Unnormalized Numbers, and Pseudo-Denormals, as defined in - Intel IA-64 Architecture Software Developer's Manual, Volume 1: -@@ -150,6 +151,7 @@ main () - ASSERT (isnanl (x.value)); - } - #endif -+#endif - - return 0; - } ---- m4/gnulib-comp.m4 -+++ m4/gnulib-comp.m4 -@@ -287,7 +287,6 @@ AC_DEFUN([gl_INIT], - gl_POSIXVER - gl_FUNC_PRINTF_FREXP - gl_FUNC_PRINTF_FREXPL -- m4_divert_text([INIT_PREPARE], [gl_printf_safe=yes]) - m4_ifdef([AM_XGETTEXT_OPTION], - [AM_XGETTEXT_OPTION([--keyword='proper_name:1,\"This is a proper name. See the gettext manual, section Names.\"']) - AM_XGETTEXT_OPTION([--keyword='proper_name_utf8:1,\"This is a proper name. 
See the gettext manual, section Names.\"'])]) ---- man/Makefile.am -+++ man/Makefile.am -@@ -184,7 +184,7 @@ check-x-vs-1: - PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ - t=ls-files.$$$$; \ - (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ -- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ -+ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ - | tr -s ' ' '\n' | sed 's/\.1$$//') \ - | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ - rm $$t ---- man/Makefile.in -+++ man/Makefile.in -@@ -1275,7 +1275,7 @@ check-x-vs-1: - PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \ - t=ls-files.$$$$; \ - (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\ -- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \ -+ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \ - | tr -s ' ' '\n' | sed 's/\.1$$//') \ - | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \ - rm $$t ---- src/system.h -+++ src/system.h -@@ -156,7 +156,7 @@ enum - # define DEV_BSIZE BBSIZE - #endif - #ifndef DEV_BSIZE --# define DEV_BSIZE 4096 -+# define DEV_BSIZE 512 - #endif - - /* Extract or fake data from a `struct stat'. ---- tests/misc/help-version -+++ tests/misc/help-version -@@ -182,6 +182,7 @@ lbracket_args=": ]" - for i in $built_programs; do - # Skip these. 
- case $i in chroot|stty|tty|false|chcon|runcon) continue;; esac -+ case $i in df) continue;; esac - - rm -rf $tmp_in $tmp_in2 $tmp_dir $tmp_out - echo > $tmp_in ---- tests/other-fs-tmpdir -+++ tests/other-fs-tmpdir -@@ -42,6 +42,8 @@ for d in $CANDIDATE_TMP_DIRS; do - fi - - done -+# Autobuild hack -+test -f /bin/uname.bin && other_partition_tmpdir= - - if test -z "$other_partition_tmpdir"; then - skip_test_ \ diff --git a/coreutils-7.1.tar.xz b/coreutils-7.1.tar.xz deleted file mode 100644 index 5f576f5..0000000 --- a/coreutils-7.1.tar.xz +++ /dev/null @@ -1,3 +0,0 @@ -version https://git-lfs.github.com/spec/v1 -oid sha256:a584c6ce92f390c684dac00032e5c790ecc15cb0fa3e61891ac62401832ae108 -size 3967824 diff --git a/coreutils-8.5-i18n.patch b/coreutils-8.5-i18n.patch new file mode 100644 index 0000000..b043447 --- /dev/null +++ b/coreutils-8.5-i18n.patch @@ -0,0 +1,4066 @@ +Index: lib/linebuffer.h +=================================================================== +--- lib/linebuffer.h.orig 2010-04-23 15:44:00.000000000 +0200 ++++ lib/linebuffer.h 2010-05-07 16:13:30.696492151 +0200 +@@ -21,6 +21,11 @@ + + # include + ++/* Get mbstate_t. */ ++# if HAVE_WCHAR_H ++# include ++# endif ++ + /* A `struct linebuffer' holds a line of text. */ + + struct linebuffer +@@ -28,6 +33,9 @@ struct linebuffer + size_t size; /* Allocated. */ + size_t length; /* Used. */ + char *buffer; ++# if HAVE_WCHAR_H ++ mbstate_t state; ++# endif + }; + + /* Initialize linebuffer LINEBUFFER for use. */ +Index: src/cut.c +=================================================================== +--- src/cut.c.orig 2010-04-20 21:52:04.000000000 +0200 ++++ src/cut.c 2010-05-07 16:40:46.225492013 +0200 +@@ -28,6 +28,11 @@ + #include + #include + #include ++ ++/* Get mbstate_t, mbrtowc(). 
*/ ++#if HAVE_WCHAR_H ++# include ++#endif + #include "system.h" + + #include "error.h" +@@ -36,6 +41,18 @@ + #include "quote.h" + #include "xstrndup.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# undef MB_LEN_MAX ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "cut" + +@@ -71,6 +88,52 @@ + } \ + while (0) + ++/* Refill the buffer BUF to get a multibyte character. */ ++#define REFILL_BUFFER(BUF, BUFPOS, BUFLEN, STREAM) \ ++ do \ ++ { \ ++ if (BUFLEN < MB_LEN_MAX && !feof (STREAM) && !ferror (STREAM)) \ ++ { \ ++ memmove (BUF, BUFPOS, BUFLEN); \ ++ BUFLEN += fread (BUF + BUFLEN, sizeof(char), BUFSIZ, STREAM); \ ++ BUFPOS = BUF; \ ++ } \ ++ } \ ++ while (0) ++ ++/* Get wide character on BUFPOS. BUFPOS is not included after that. ++ If byte sequence is not valid as a character, CONVFAIL is 1. Otherwise 0. */ ++#define GET_NEXT_WC_FROM_BUFFER(WC, BUFPOS, BUFLEN, MBLENGTH, STATE, CONVFAIL) \ ++ do \ ++ { \ ++ mbstate_t state_bak; \ ++ \ ++ if (BUFLEN < 1) \ ++ { \ ++ WC = WEOF; \ ++ break; \ ++ } \ ++ \ ++ /* Get a wide character. */ \ ++ CONVFAIL = 0; \ ++ state_bak = STATE; \ ++ MBLENGTH = mbrtowc ((wchar_t *)&WC, BUFPOS, BUFLEN, &STATE); \ ++ \ ++ switch (MBLENGTH) \ ++ { \ ++ case (size_t)-1: \ ++ case (size_t)-2: \ ++ CONVFAIL++; \ ++ STATE = state_bak; \ ++ /* Fall througn. */ \ ++ \ ++ case 0: \ ++ MBLENGTH = 1; \ ++ break; \ ++ } \ ++ } \ ++ while (0) ++ + struct range_pair + { + size_t lo; +@@ -89,7 +152,7 @@ static char *field_1_buffer; + /* The number of bytes allocated for FIELD_1_BUFFER. 
*/ + static size_t field_1_bufsize; + +-/* The largest field or byte index used as an endpoint of a closed ++/* The largest byte, character or field index used as an endpoint of a closed + or degenerate range specification; this doesn't include the starting + index of right-open-ended ranges. For example, with either range spec + `2-5,9-', `2-3,5,9-' this variable would be set to 5. */ +@@ -101,10 +164,11 @@ static size_t eol_range_start; + + /* This is a bit vector. + In byte mode, which bytes to output. ++ In character mode, which characters to output. + In field mode, which DELIM-separated fields to output. +- Both bytes and fields are numbered starting with 1, ++ Bytes, characters and fields are numbered starting with 1, + so the zeroth bit of this array is unused. +- A field or byte K has been selected if ++ A byte, character or field K has been selected if + (K <= MAX_RANGE_ENDPOINT and is_printable_field(K)) + || (EOL_RANGE_START > 0 && K >= EOL_RANGE_START). */ + static unsigned char *printable_field; +@@ -113,15 +177,25 @@ enum operating_mode + { + undefined_mode, + +- /* Output characters that are in the given bytes. */ ++ /* Output bytes that are at the given positions. */ + byte_mode, + ++ /* Output characters that are at the given positions. */ ++ character_mode, ++ + /* Output the given delimeter-separated fields. */ + field_mode + }; + + static enum operating_mode operating_mode; + ++/* If nonzero, when in byte mode, don't split multibyte characters. */ ++static int byte_mode_character_aware; ++ ++/* If nonzero, the function for single byte locale is work ++ if this program runs on multibyte locale. */ ++static int force_singlebyte_mode; ++ + /* If true do not output lines containing no delimeter characters. + Otherwise, all such lines are printed. This option is valid only + with field mode. */ +@@ -133,6 +207,9 @@ static bool complement; + + /* The delimeter character for field mode. 
*/ + static unsigned char delim; ++#if HAVE_WCHAR_H ++static wchar_t wcdelim; ++#endif + + /* True if the --output-delimiter=STRING option was specified. */ + static bool output_delimiter_specified; +@@ -206,7 +283,7 @@ Mandatory arguments to long options are + -f, --fields=LIST select only these fields; also print any line\n\ + that contains no delimiter character, unless\n\ + the -s option is specified\n\ +- -n (ignored)\n\ ++ -n with -b: don't split multibyte characters\n\ + "), stdout); + fputs (_("\ + --complement complement the set of selected bytes, characters\n\ +@@ -365,7 +442,7 @@ set_fields (const char *fieldstr) + in_digits = false; + /* Starting a range. */ + if (dash_found) +- FATAL_ERROR (_("invalid byte or field list")); ++ FATAL_ERROR (_("invalid byte, character or field list")); + dash_found = true; + fieldstr++; + +@@ -389,14 +466,16 @@ set_fields (const char *fieldstr) + if (!rhs_specified) + { + /* `n-'. From `initial' to end of line. */ +- eol_range_start = initial; ++ if (eol_range_start == 0 || ++ (eol_range_start != 0 && eol_range_start > initial)) ++ eol_range_start = initial; + field_found = true; + } + else + { + /* `m-n' or `-n' (1-n). */ + if (value < initial) +- FATAL_ERROR (_("invalid decreasing range")); ++ FATAL_ERROR (_("invalid byte, character or field list")); + + /* Is there already a range going to end of line? 
*/ + if (eol_range_start != 0) +@@ -476,6 +555,9 @@ set_fields (const char *fieldstr) + if (operating_mode == byte_mode) + error (0, 0, + _("byte offset %s is too large"), quote (bad_num)); ++ else if (operating_mode == character_mode) ++ error (0, 0, ++ _("character offset %s is too large"), quote (bad_num)); + else + error (0, 0, + _("field number %s is too large"), quote (bad_num)); +@@ -486,7 +568,7 @@ set_fields (const char *fieldstr) + fieldstr++; + } + else +- FATAL_ERROR (_("invalid byte or field list")); ++ FATAL_ERROR (_("invalid byte, character or field list")); + } + + max_range_endpoint = 0; +@@ -579,6 +661,63 @@ cut_bytes (FILE *stream) + } + } + ++#if HAVE_MBRTOWC ++/* This function is in use for the following case. ++ ++ 1. Read from the stream STREAM, printing to standard output any selected ++ characters. ++ ++ 2. Read from stream STREAM, printing to standard output any selected bytes, ++ without splitting multibyte characters. */ ++ ++static void ++cut_characters_or_cut_bytes_no_split (FILE *stream) ++{ ++ int idx; /* number of bytes or characters in the line so far. */ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen; /* The length of the byte sequence in buf. */ ++ wint_t wc; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character which shows ++ as same character as WC. */ ++ mbstate_t state; /* State of the stream. */ ++ int convfail; /* 1, when conversion is failed. Otherwise 0. */ ++ ++ idx = 0; ++ buflen = 0; ++ bufpos = buf; ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ while (1) ++ { ++ REFILL_BUFFER (buf, bufpos, buflen, stream); ++ ++ GET_NEXT_WC_FROM_BUFFER (wc, bufpos, buflen, mblength, state, convfail); ++ ++ if (wc == WEOF) ++ { ++ if (idx > 0) ++ putchar ('\n'); ++ break; ++ } ++ else if (wc == L'\n') ++ { ++ putchar ('\n'); ++ idx = 0; ++ } ++ else ++ { ++ idx += (operating_mode == byte_mode) ? 
mblength : 1; ++ if (print_kth (idx, NULL)) ++ fwrite (bufpos, mblength, sizeof(char), stdout); ++ } ++ ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++} ++#endif ++ + /* Read from stream STREAM, printing to standard output any selected fields. */ + + static void +@@ -701,13 +840,192 @@ cut_fields (FILE *stream) + } + } + ++#if HAVE_MBRTOWC ++static void ++cut_fields_mb (FILE *stream) ++{ ++ int c; ++ unsigned int field_idx; ++ int found_any_selected_field; ++ int buffer_first_field; ++ int empty_input; ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen; /* The length of the byte sequence in buf. */ ++ wint_t wc = 0; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character which shows ++ as same character as WC. */ ++ mbstate_t state; /* State of the stream. */ ++ int convfail; /* 1, when conversion is failed. Otherwise 0. */ ++ ++ found_any_selected_field = 0; ++ field_idx = 1; ++ bufpos = buf; ++ buflen = 0; ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ c = getc (stream); ++ empty_input = (c == EOF); ++ if (c != EOF) ++ ungetc (c, stream); ++ else ++ wc = WEOF; ++ ++ /* To support the semantics of the -s flag, we may have to buffer ++ all of the first field to determine whether it is `delimited.' ++ But that is unnecessary if all non-delimited lines must be printed ++ and the first field has been selected, or if non-delimited lines ++ must be suppressed and the first field has *not* been selected. ++ That is because a non-delimited line has exactly one field. 
*/ ++ buffer_first_field = (suppress_non_delimited ^ !print_kth (1, NULL)); ++ ++ while (1) ++ { ++ if (field_idx == 1 && buffer_first_field) ++ { ++ int len = 0; ++ ++ while (1) ++ { ++ REFILL_BUFFER (buf, bufpos, buflen, stream); ++ ++ GET_NEXT_WC_FROM_BUFFER ++ (wc, bufpos, buflen, mblength, state, convfail); ++ ++ if (wc == WEOF) ++ break; ++ ++ field_1_buffer = xrealloc (field_1_buffer, len + mblength); ++ memcpy (field_1_buffer + len, bufpos, mblength); ++ len += mblength; ++ buflen -= mblength; ++ bufpos += mblength; ++ ++ if (!convfail && (wc == L'\n' || wc == wcdelim)) ++ break; ++ } ++ ++ if (wc == WEOF) ++ break; ++ ++ /* If the first field extends to the end of line (it is not ++ delimited) and we are printing all non-delimited lines, ++ print this one. */ ++ if (convfail || (!convfail && wc != wcdelim)) ++ { ++ if (suppress_non_delimited) ++ { ++ /* Empty. */ ++ } ++ else ++ { ++ fwrite (field_1_buffer, sizeof (char), len, stdout); ++ /* Make sure the output line is newline terminated. */ ++ if (convfail || (!convfail && wc != L'\n')) ++ putchar ('\n'); ++ } ++ continue; ++ } ++ ++ if (print_kth (1, NULL)) ++ { ++ /* Print the field, but not the trailing delimiter. 
*/ ++ fwrite (field_1_buffer, sizeof (char), len - 1, stdout); ++ found_any_selected_field = 1; ++ } ++ ++field_idx; ++ } ++ ++ if (wc != WEOF) ++ { ++ if (print_kth (field_idx, NULL)) ++ { ++ if (found_any_selected_field) ++ { ++ fwrite (output_delimiter_string, sizeof (char), ++ output_delimiter_length, stdout); ++ } ++ found_any_selected_field = 1; ++ } ++ ++ while (1) ++ { ++ REFILL_BUFFER (buf, bufpos, buflen, stream); ++ ++ GET_NEXT_WC_FROM_BUFFER ++ (wc, bufpos, buflen, mblength, state, convfail); ++ ++ if (wc == WEOF) ++ break; ++ else if (!convfail && (wc == wcdelim || wc == L'\n')) ++ { ++ buflen -= mblength; ++ bufpos += mblength; ++ break; ++ } ++ ++ if (print_kth (field_idx, NULL)) ++ fwrite (bufpos, mblength, sizeof(char), stdout); ++ ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++ } ++ ++ if ((!convfail || wc == L'\n') && buflen < 1) ++ wc = WEOF; ++ ++ if (!convfail && wc == wcdelim) ++ ++field_idx; ++ else if (wc == WEOF || (!convfail && wc == L'\n')) ++ { ++ if (found_any_selected_field ++ || (!empty_input && !(suppress_non_delimited && field_idx == 1))) ++ putchar ('\n'); ++ if (wc == WEOF) ++ break; ++ field_idx = 1; ++ found_any_selected_field = 0; ++ } ++ } ++} ++#endif ++ + static void + cut_stream (FILE *stream) + { +- if (operating_mode == byte_mode) +- cut_bytes (stream); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) ++ { ++ switch (operating_mode) ++ { ++ case byte_mode: ++ if (byte_mode_character_aware) ++ cut_characters_or_cut_bytes_no_split (stream); ++ else ++ cut_bytes (stream); ++ break; ++ ++ case character_mode: ++ cut_characters_or_cut_bytes_no_split (stream); ++ break; ++ ++ case field_mode: ++ cut_fields_mb (stream); ++ break; ++ ++ default: ++ abort (); ++ } ++ } + else +- cut_fields (stream); ++#endif ++ { ++ if (operating_mode == field_mode) ++ cut_fields (stream); ++ else ++ cut_bytes (stream); ++ } + } + + /* Process file FILE to standard output. 
+@@ -757,6 +1075,8 @@ main (int argc, char **argv) + bool ok; + bool delim_specified = false; + char *spec_list_string IF_LINT (= NULL); ++ char mbdelim[MB_LEN_MAX + 1]; ++ size_t delimlen = 0; + + initialize_main (&argc, &argv); + set_program_name (argv[0]); +@@ -779,7 +1099,6 @@ main (int argc, char **argv) + switch (optc) + { + case 'b': +- case 'c': + /* Build the byte list. */ + if (operating_mode != undefined_mode) + FATAL_ERROR (_("only one type of list may be specified")); +@@ -787,6 +1106,14 @@ main (int argc, char **argv) + spec_list_string = optarg; + break; + ++ case 'c': ++ /* Build the character list. */ ++ if (operating_mode != undefined_mode) ++ FATAL_ERROR (_("only one type of list may be specified")); ++ operating_mode = character_mode; ++ spec_list_string = optarg; ++ break; ++ + case 'f': + /* Build the field list. */ + if (operating_mode != undefined_mode) +@@ -798,10 +1125,35 @@ main (int argc, char **argv) + case 'd': + /* New delimiter. */ + /* Interpret -d '' to mean `use the NUL byte as the delimiter.' */ +- if (optarg[0] != '\0' && optarg[1] != '\0') +- FATAL_ERROR (_("the delimiter must be a single character")); +- delim = optarg[0]; +- delim_specified = true; ++ { ++#if HAVE_MBRTOWC ++ if(MB_CUR_MAX > 1) ++ { ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ delimlen = mbrtowc (&wcdelim, optarg, strnlen(optarg, MB_LEN_MAX), &state); ++ ++ if (delimlen == (size_t)-1 || delimlen == (size_t)-2) ++ ++force_singlebyte_mode; ++ else ++ { ++ delimlen = (delimlen < 1) ? 
1 : delimlen; ++ if (wcdelim != L'\0' && *(optarg + delimlen) != '\0') ++ FATAL_ERROR (_("the delimiter must be a single character")); ++ memcpy (mbdelim, optarg, delimlen); ++ } ++ } ++ ++ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) ++#endif ++ { ++ if (optarg[0] != '\0' && optarg[1] != '\0') ++ FATAL_ERROR (_("the delimiter must be a single character")); ++ delim = (unsigned char) optarg[0]; ++ } ++ delim_specified = true; ++ } + break; + + case OUTPUT_DELIMITER_OPTION: +@@ -814,6 +1166,7 @@ main (int argc, char **argv) + break; + + case 'n': ++ byte_mode_character_aware = 1; + break; + + case 's': +@@ -836,7 +1189,7 @@ main (int argc, char **argv) + if (operating_mode == undefined_mode) + FATAL_ERROR (_("you must specify a list of bytes, characters, or fields")); + +- if (delim != '\0' && operating_mode != field_mode) ++ if (delim_specified && operating_mode != field_mode) + FATAL_ERROR (_("an input delimiter may be specified only\ + when operating on fields")); + +@@ -863,15 +1216,34 @@ main (int argc, char **argv) + } + + if (!delim_specified) +- delim = '\t'; ++ { ++ delim = '\t'; ++#ifdef HAVE_MBRTOWC ++ wcdelim = L'\t'; ++ mbdelim[0] = '\t'; ++ mbdelim[1] = '\0'; ++ delimlen = 1; ++#endif ++ } + + if (output_delimiter_string == NULL) + { +- static char dummy[2]; +- dummy[0] = delim; +- dummy[1] = '\0'; +- output_delimiter_string = dummy; +- output_delimiter_length = 1; ++#ifdef HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1 && !force_singlebyte_mode) ++ { ++ output_delimiter_string = xstrdup(mbdelim); ++ output_delimiter_length = delimlen; ++ } ++ ++ if (MB_CUR_MAX <= 1 || force_singlebyte_mode) ++#endif ++ { ++ static char dummy[2]; ++ dummy[0] = delim; ++ dummy[1] = '\0'; ++ output_delimiter_string = dummy; ++ output_delimiter_length = 1; ++ } + } + + if (optind == argc) +Index: src/expand.c +=================================================================== +--- src/expand.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/expand.c 2010-05-07 16:13:30.748169979 
+0200 +@@ -38,11 +38,28 @@ + #include + #include + #include ++ ++/* Get mbstate_t, mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ + #include "system.h" + #include "error.h" + #include "quote.h" + #include "xstrndup.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "expand" + +@@ -358,6 +375,142 @@ expand (void) + } + } + ++#if HAVE_MBRTOWC ++static void ++expand_multibyte (void) ++{ ++ FILE *fp; /* Input stream. */ ++ mbstate_t i_state; /* Current shift state of the input stream. */ ++ mbstate_t i_state_bak; /* Back up the I_STATE. */ ++ mbstate_t o_state; /* Current shift state of the output stream. */ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen = 0; /* The length of the byte sequence in buf. */ ++ wchar_t wc; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character ++ which shows as same character as WC. */ ++ int tab_index = 0; /* Index in `tab_list' of next tabstop. */ ++ int column = 0; /* Column on screen of the next char. */ ++ int next_tab_column; /* Column the next tab stop is on. */ ++ int convert = 1; /* If nonzero, perform translations. */ ++ ++ fp = next_file ((FILE *) NULL); ++ if (fp == NULL) ++ return; ++ ++ memset (&o_state, '\0', sizeof(mbstate_t)); ++ memset (&i_state, '\0', sizeof(mbstate_t)); ++ ++ for (;;) ++ { ++ /* Refill the buffer BUF.
*/ ++ if (buflen < MB_LEN_MAX && !feof(fp) && !ferror(fp)) ++ { ++ memmove (buf, bufpos, buflen); ++ buflen += fread (buf + buflen, sizeof(char), BUFSIZ, fp); ++ bufpos = buf; ++ } ++ ++ /* No character is left in BUF. */ ++ if (buflen < 1) ++ { ++ fp = next_file (fp); ++ ++ if (fp == NULL) ++ break; /* No more files. */ ++ else ++ { ++ memset (&i_state, '\0', sizeof(mbstate_t)); ++ continue; ++ } ++ } ++ ++ /* Get a wide character. */ ++ i_state_bak = i_state; ++ mblength = mbrtowc (&wc, bufpos, buflen, &i_state); ++ ++ switch (mblength) ++ { ++ case (size_t)-1: /* illegal byte sequence. */ ++ case (size_t)-2: ++ mblength = 1; ++ i_state = i_state_bak; ++ if (convert) ++ { ++ ++column; ++ if (convert_entire_line == 0) ++ convert = 0; ++ } ++ putchar (*bufpos); ++ break; ++ ++ case 0: /* null. */ ++ mblength = 1; ++ if (convert && convert_entire_line == 0) ++ convert = 0; ++ putchar ('\0'); ++ break; ++ ++ default: ++ if (wc == L'\n') /* LF. */ ++ { ++ tab_index = 0; ++ column = 0; ++ convert = 1; ++ putchar ('\n'); ++ } ++ else if (wc == L'\t' && convert) /* Tab. */ ++ { ++ if (tab_size == 0) ++ { ++ /* Do not let tab_index == first_free_tab; ++ stop when it is 1 less. */ ++ while (tab_index < first_free_tab - 1 ++ && column >= tab_list[tab_index]) ++ tab_index++; ++ next_tab_column = tab_list[tab_index]; ++ if (tab_index < first_free_tab - 1) ++ tab_index++; ++ if (column >= next_tab_column) ++ next_tab_column = column + 1; ++ } ++ else ++ next_tab_column = column + tab_size - column % tab_size; ++ ++ while (column < next_tab_column) ++ { ++ putchar (' '); ++ ++column; ++ } ++ } ++ else /* Others. */ ++ { ++ if (convert) ++ { ++ if (wc == L'\b') ++ { ++ if (column > 0) ++ --column; ++ } ++ else ++ { ++ int width; /* The width of WC. */ ++ ++ width = wcwidth (wc); ++ column += (width > 0) ? 
width : 0; ++ if (convert_entire_line == 0) ++ convert = 0; ++ } ++ } ++ fwrite (bufpos, sizeof(char), mblength, stdout); ++ } ++ } ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++} ++#endif ++ + int + main (int argc, char **argv) + { +@@ -422,7 +575,12 @@ main (int argc, char **argv) + + file_list = (optind < argc ? &argv[optind] : stdin_argv); + +- expand (); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ expand_multibyte (); ++ else ++#endif ++ expand (); + + if (have_read_stdin && fclose (stdin) != 0) + error (EXIT_FAILURE, errno, "-"); +Index: src/fold.c +=================================================================== +--- src/fold.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/fold.c 2010-05-07 16:39:03.220004781 +0200 +@@ -22,11 +22,33 @@ + #include + #include + ++/* Get mbstate_t, mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ ++/* Get iswprint(), iswblank(), wcwidth(). */ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++ + #include "system.h" + #include "error.h" + #include "quote.h" + #include "xstrtol.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# undef MB_LEN_MAX ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + #define TAB_WIDTH 8 + + /* The official name of this program (e.g., no `g' prefix). */ +@@ -34,20 +56,41 @@ + + #define AUTHORS proper_name ("David MacKenzie") + ++#define FATAL_ERROR(Message) \ ++ do \ ++ { \ ++ error (0, 0, (Message)); \ ++ usage (2); \ ++ } \ ++ while (0) ++ ++enum operating_mode ++{ ++ /* Fold texts by columns that are at the given positions. */ ++ column_mode, ++ ++ /* Fold texts by bytes that are at the given positions.
*/ ++ byte_mode, ++ ++ /* Fold texts by characters that are at the given positions. */ ++ character_mode, ++}; ++ ++/* The argument shows current mode. (Default: column_mode) */ ++static enum operating_mode operating_mode; ++ + /* If nonzero, try to break on whitespace. */ + static bool break_spaces; + +-/* If nonzero, count bytes, not column positions. */ +-static bool count_bytes; +- + /* If nonzero, at least one of the files we read was standard input. */ + static bool have_read_stdin; + +-static char const shortopts[] = "bsw:0::1::2::3::4::5::6::7::8::9::"; ++static char const shortopts[] = "bcsw:0::1::2::3::4::5::6::7::8::9::"; + + static struct option const longopts[] = + { + {"bytes", no_argument, NULL, 'b'}, ++ {"characters", no_argument, NULL, 'c'}, + {"spaces", no_argument, NULL, 's'}, + {"width", required_argument, NULL, 'w'}, + {GETOPT_HELP_OPTION_DECL}, +@@ -77,6 +120,7 @@ Mandatory arguments to long options are + "), stdout); + fputs (_("\ + -b, --bytes count bytes rather than columns\n\ ++ -c, --characters count characters rather than columns\n\ + -s, --spaces break at spaces\n\ + -w, --width=WIDTH use WIDTH columns instead of 80\n\ + "), stdout); +@@ -94,7 +138,7 @@ Mandatory arguments to long options are + static size_t + adjust_column (size_t column, char c) + { +- if (!count_bytes) ++ if (operating_mode != byte_mode) + { + if (c == '\b') + { +@@ -117,30 +161,14 @@ adjust_column (size_t column, char c) + to stdout, with maximum line length WIDTH. + Return true if successful. */ + +-static bool +-fold_file (char const *filename, size_t width) ++static void ++fold_text (FILE *istream, size_t width, int *saved_errno) + { +- FILE *istream; + int c; + size_t column = 0; /* Screen column where next char will go. */ + size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ + static char *line_out = NULL; + static size_t allocated_out = 0; +- int saved_errno; +- +- if (STREQ (filename, "-")) +- { +- istream = stdin; +- have_read_stdin = true; +- } +- else +- istream = fopen (filename, "r"); +- +- if (istream == NULL) +- { +- error (0, errno, "%s", filename); +- return false; +- } + + while ((c = getc (istream)) != EOF) + { +@@ -168,6 +196,15 @@ fold_file (char const *filename, size_t + bool found_blank = false; + size_t logical_end = offset_out; + ++ /* If LINE_OUT has no wide character, ++ put a new wide character in LINE_OUT ++ if column is bigger than width. */ ++ if (offset_out == 0) ++ { ++ line_out[offset_out++] = c; ++ continue; ++ } ++ + /* Look for the last blank. */ + while (logical_end) + { +@@ -214,11 +251,222 @@ fold_file (char const *filename, size_t + line_out[offset_out++] = c; + } + +- saved_errno = errno; ++ *saved_errno = errno; ++ ++ if (offset_out) ++ fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); ++ ++} ++ ++#if HAVE_MBRTOWC ++static void ++fold_multibyte_text (FILE *istream, size_t width, int *saved_errno) ++{ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ size_t buflen = 0; /* The length of the byte sequence in buf. */ ++ char *bufpos = NULL; /* Next read position of BUF. */ ++ wint_t wc; /* A gotten wide character. */ ++ size_t mblength; /* The byte size of a multibyte character which shows ++ as same character as WC. */ ++ mbstate_t state, state_bak; /* State of the stream. */ ++ int convfail; /* 1, when conversion is failed. Otherwise 0. */ ++ ++ static char *line_out = NULL; ++ size_t offset_out = 0; /* Index in `line_out' for next char. 
*/ ++ static size_t allocated_out = 0; ++ ++ int increment; ++ size_t column = 0; ++ ++ size_t last_blank_pos; ++ size_t last_blank_column; ++ int is_blank_seen; ++ int last_blank_increment = 0; ++ int is_bs_following_last_blank; ++ size_t bs_following_last_blank_num; ++ int is_cr_after_last_blank; ++ ++#define CLEAR_FLAGS \ ++ do \ ++ { \ ++ last_blank_pos = 0; \ ++ last_blank_column = 0; \ ++ is_blank_seen = 0; \ ++ is_bs_following_last_blank = 0; \ ++ bs_following_last_blank_num = 0; \ ++ is_cr_after_last_blank = 0; \ ++ } \ ++ while (0) ++ ++#define START_NEW_LINE \ ++ do \ ++ { \ ++ putchar ('\n'); \ ++ column = 0; \ ++ offset_out = 0; \ ++ CLEAR_FLAGS; \ ++ } \ ++ while (0) ++ ++ CLEAR_FLAGS; ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ for (;; bufpos += mblength, buflen -= mblength) ++ { ++ if (buflen < MB_LEN_MAX && !feof (istream) && !ferror (istream)) ++ { ++ memmove (buf, bufpos, buflen); ++ buflen += fread (buf + buflen, sizeof(char), BUFSIZ, istream); ++ bufpos = buf; ++ } ++ ++ if (buflen < 1) ++ break; ++ ++ /* Get a wide character. */ ++ convfail = 0; ++ state_bak = state; ++ mblength = mbrtowc ((wchar_t *)&wc, bufpos, buflen, &state); ++ ++ switch (mblength) ++ { ++ case (size_t)-1: ++ case (size_t)-2: ++ convfail++; ++ state = state_bak; ++ /* Fall through. */ ++ ++ case 0: ++ mblength = 1; ++ break; ++ } ++ ++rescan: ++ if (operating_mode == byte_mode) /* byte mode */ ++ increment = mblength; ++ else if (operating_mode == character_mode) /* character mode */ ++ increment = 1; ++ else /* column mode */ ++ { ++ if (convfail) ++ increment = 1; ++ else ++ { ++ switch (wc) ++ { ++ case L'\n': ++ fwrite (line_out, sizeof(char), offset_out, stdout); ++ START_NEW_LINE; ++ continue; ++ ++ case L'\b': ++ increment = (column > 0) ? -1 : 0; ++ break; ++ ++ case L'\r': ++ increment = -1 * column; ++ break; ++ ++ case L'\t': ++ increment = 8 - column % 8; ++ break; ++ ++ default: ++ increment = wcwidth (wc); ++ increment = (increment < 0) ? 
0 : increment; ++ } ++ } ++ } ++ ++ if (column + increment > width && break_spaces && last_blank_pos) ++ { ++ fwrite (line_out, sizeof(char), last_blank_pos, stdout); ++ putchar ('\n'); ++ ++ offset_out = offset_out - last_blank_pos; ++ column = column - last_blank_column + ((is_cr_after_last_blank) ++ ? last_blank_increment : bs_following_last_blank_num); ++ memmove (line_out, line_out + last_blank_pos, offset_out); ++ CLEAR_FLAGS; ++ goto rescan; ++ } ++ ++ if (column + increment > width && column != 0) ++ { ++ fwrite (line_out, sizeof(char), offset_out, stdout); ++ START_NEW_LINE; ++ goto rescan; ++ } ++ ++ if (allocated_out < offset_out + mblength) ++ { ++ line_out = X2REALLOC (line_out, &allocated_out); ++ } ++ ++ memcpy (line_out + offset_out, bufpos, mblength); ++ offset_out += mblength; ++ column += increment; ++ ++ if (is_blank_seen && !convfail && wc == L'\r') ++ is_cr_after_last_blank = 1; ++ ++ if (is_bs_following_last_blank && !convfail && wc == L'\b') ++ ++bs_following_last_blank_num; ++ else ++ is_bs_following_last_blank = 0; ++ ++ if (break_spaces && !convfail && iswblank (wc)) ++ { ++ last_blank_pos = offset_out; ++ last_blank_column = column; ++ is_blank_seen = 1; ++ last_blank_increment = increment; ++ is_bs_following_last_blank = 1; ++ bs_following_last_blank_num = 0; ++ is_cr_after_last_blank = 0; ++ } ++ } ++ ++ *saved_errno = errno; + + if (offset_out) + fwrite (line_out, sizeof (char), (size_t) offset_out, stdout); + ++} ++#endif ++ ++/* Fold file FILENAME, or standard input if FILENAME is "-", ++ to stdout, with maximum line length WIDTH. ++ Return 0 if successful, 1 if an error occurs. 
*/ ++ ++static bool ++fold_file (char *filename, size_t width) ++{ ++ FILE *istream; ++ int saved_errno; ++ ++ if (STREQ (filename, "-")) ++ { ++ istream = stdin; ++ have_read_stdin = 1; ++ } ++ else ++ istream = fopen (filename, "r"); ++ ++ if (istream == NULL) ++ { ++ error (0, errno, "%s", filename); ++ return 1; ++ } ++ ++ /* Define how ISTREAM is being folded. */ ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ fold_multibyte_text (istream, width, &saved_errno); ++ else ++#endif ++ fold_text (istream, width, &saved_errno); ++ + if (ferror (istream)) + { + error (0, saved_errno, "%s", filename); +@@ -251,7 +499,8 @@ main (int argc, char **argv) + + atexit (close_stdout); + +- break_spaces = count_bytes = have_read_stdin = false; ++ operating_mode = column_mode; ++ break_spaces = have_read_stdin = false; + + while ((optc = getopt_long (argc, argv, shortopts, longopts, NULL)) != -1) + { +@@ -260,7 +509,15 @@ main (int argc, char **argv) + switch (optc) + { + case 'b': /* Count bytes rather than columns. */ +- count_bytes = true; ++ if (operating_mode != column_mode) ++ FATAL_ERROR (_("only one way of folding may be specified")); ++ operating_mode = byte_mode; ++ break; ++ ++ case 'c': ++ if (operating_mode != column_mode) ++ FATAL_ERROR (_("only one way of folding may be specified")); ++ operating_mode = character_mode; + break; + + case 's': /* Break at word boundaries. */ +Index: src/join.c +=================================================================== +--- src/join.c.orig 2010-04-20 21:52:04.000000000 +0200 ++++ src/join.c 2010-05-07 16:41:17.564268573 +0200 +@@ -22,17 +22,31 @@ + #include + #include + ++/* Get mbstate_t, mbrtowc(), mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ ++/* Get iswblank(), towupper.
*/ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++ + #include "system.h" + #include "error.h" + #include "hard-locale.h" + #include "linebuffer.h" +-#include "memcasecmp.h" + #include "quote.h" + #include "stdio--.h" + #include "xmemcoll.h" + #include "xstrtol.h" + #include "argmatch.h" + ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "join" + +@@ -121,10 +135,12 @@ static struct outlist outlist_head; + /* Last element in `outlist', where a new element can be added. */ + static struct outlist *outlist_end = &outlist_head; + +-/* Tab character separating fields. If negative, fields are separated +- by any nonempty string of blanks, otherwise by exactly one +- tab character whose value (when cast to unsigned char) equals TAB. */ +-static int tab = -1; ++/* Tab character separating fields. If NULL, fields are separated ++ by any nonempty string of blanks. */ ++static char *tab = NULL; ++ ++/* The number of bytes used for tab. */ ++static size_t tablen = 0; + + /* If nonzero, check that the input is correctly ordered.
*/ + static enum +@@ -248,10 +264,11 @@ xfields (struct line *line) + if (ptr == lim) + return; + +- if (0 <= tab) ++ if (tab != NULL) + { ++ unsigned char t = tab[0]; + char *sep; +- for (; (sep = memchr (ptr, tab, lim - ptr)) != NULL; ptr = sep + 1) ++ for (; (sep = memchr (ptr, t, lim - ptr)) != NULL; ptr = sep + 1) + extract_field (line, ptr, sep - ptr); + } + else +@@ -278,6 +295,148 @@ xfields (struct line *line) + extract_field (line, ptr, lim - ptr); + } + ++#if HAVE_MBRTOWC ++static void ++xfields_multibyte (struct line *line) ++{ ++ char *ptr = line->buf.buffer; ++ char const *lim = ptr + line->buf.length - 1; ++ wchar_t wc = 0; ++ size_t mblength = 1; ++ mbstate_t state, state_bak; ++ ++ memset (&state, 0, sizeof (mbstate_t)); ++ ++ if (ptr >= lim) ++ return; ++ ++ if (tab != NULL) ++ { ++ unsigned char t = tab[0]; ++ char *sep = ptr; ++ for (; ptr < lim; ptr = sep + mblength) ++ { ++ sep = ptr; ++ while (sep < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, sep, lim - sep + 1, &state); ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ if (mblength == tablen && !memcmp (sep, tab, mblength)) ++ break; ++ else ++ { ++ sep += mblength; ++ continue; ++ } ++ } ++ ++ if (sep >= lim) ++ break; ++ ++ extract_field (line, ptr, sep - ptr); ++ } ++ } ++ else ++ { ++ /* Skip leading blanks before the first field. */ ++ while(ptr < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 
1 : mblength; ++ ++ if (!iswblank(wc)) ++ break; ++ ptr += mblength; ++ } ++ ++ do ++ { ++ char *sep; ++ state_bak = state; ++ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ sep = ptr + mblength; ++ while (sep < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, sep, lim - sep + 1, &state); ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ if (iswblank (wc)) ++ break; ++ ++ sep += mblength; ++ } ++ ++ extract_field (line, ptr, sep - ptr); ++ if (sep >= lim) ++ return; ++ ++ state_bak = state; ++ mblength = mbrtowc (&wc, sep, lim - sep + 1, &state); ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ ptr = sep + mblength; ++ while (ptr < lim) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, ptr, lim - ptr + 1, &state); ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ mblength = 1; ++ state = state_bak; ++ break; ++ } ++ mblength = (mblength < 1) ? 1 : mblength; ++ ++ if (!iswblank (wc)) ++ break; ++ ++ ptr += mblength; ++ } ++ } ++ while (ptr < lim); ++ } ++ ++ extract_field (line, ptr, lim - ptr); ++} ++#endif ++ + static void + freeline (struct line *line) + { +@@ -299,56 +458,115 @@ keycmp (struct line const *line1, struct + size_t jf_1, size_t jf_2) + { + /* Start of field to compare in each file. */ +- char *beg1; +- char *beg2; +- +- size_t len1; +- size_t len2; /* Length of fields to compare. */ ++ char *beg[2]; ++ char *copy[2]; ++ size_t len[2]; /* Length of fields to compare. 
*/ + int diff; ++ int i, j; + + if (jf_1 < line1->nfields) + { +- beg1 = line1->fields[jf_1].beg; +- len1 = line1->fields[jf_1].len; ++ beg[0] = line1->fields[jf_1].beg; ++ len[0] = line1->fields[jf_1].len; + } + else + { +- beg1 = NULL; +- len1 = 0; ++ beg[0] = NULL; ++ len[0] = 0; + } + + if (jf_2 < line2->nfields) + { +- beg2 = line2->fields[jf_2].beg; +- len2 = line2->fields[jf_2].len; ++ beg[1] = line2->fields[jf_2].beg; ++ len[1] = line2->fields[jf_2].len; + } + else + { +- beg2 = NULL; +- len2 = 0; ++ beg[1] = NULL; ++ len[1] = 0; + } + +- if (len1 == 0) +- return len2 == 0 ? 0 : -1; +- if (len2 == 0) ++ if (len[0] == 0) ++ return len[1] == 0 ? 0 : -1; ++ if (len[1] == 0) + return 1; + + if (ignore_case) + { +- /* FIXME: ignore_case does not work with NLS (in particular, +- with multibyte chars). */ +- diff = memcasecmp (beg1, beg2, MIN (len1, len2)); ++#ifdef HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ size_t mblength; ++ wchar_t wc, uwc; ++ mbstate_t state, state_bak; ++ ++ memset (&state, '\0', sizeof (mbstate_t)); ++ ++ for (i = 0; i < 2; i++) ++ { ++ copy[i] = alloca (len[i] + 1); ++ ++ for (j = 0; j < MIN (len[0], len[1]);) ++ { ++ state_bak = state; ++ mblength = mbrtowc (&wc, beg[i] + j, len[i] - j, &state); ++ ++ switch (mblength) ++ { ++ case (size_t) -1: ++ case (size_t) -2: ++ state = state_bak; ++ /* Fall through */ ++ case 0: ++ mblength = 1; ++ break; ++ ++ default: ++ uwc = towupper (wc); ++ ++ if (uwc != wc) ++ { ++ mbstate_t state_wc; ++ ++ memset (&state_wc, '\0', sizeof (mbstate_t)); ++ wcrtomb (copy[i] + j, uwc, &state_wc); ++ } ++ else ++ memcpy (copy[i] + j, beg[i] + j, mblength); ++ } ++ j += mblength; ++ } ++ copy[i][j] = '\0'; ++ } ++ } ++ else ++#endif ++ { ++ for (i = 0; i < 2; i++) ++ { ++ copy[i] = alloca (len[i] + 1); ++ ++ for (j = 0; j < MIN (len[0], len[1]); j++) ++ copy[i][j] = toupper (beg[i][j]); ++ ++ copy[i][j] = '\0'; ++ } ++ } + } + else + { +- if (hard_LC_COLLATE) +- return xmemcoll (beg1, len1, beg2, len2); +- diff 
= memcmp (beg1, beg2, MIN (len1, len2)); ++ copy[0] = (unsigned char *) beg[0]; ++ copy[1] = (unsigned char *) beg[1]; + } + ++ if (hard_LC_COLLATE) ++ return xmemcoll ((char *) copy[0], len[0], (char *) copy[1], len[1]); ++ diff = memcmp (copy[0], copy[1], MIN (len[0], len[1])); ++ ++ + if (diff) + return diff; +- return len1 < len2 ? -1 : len1 != len2; ++ return len[0] - len[1]; + } + + /* Check that successive input lines PREV and CURRENT from input file +@@ -429,6 +647,11 @@ get_line (FILE *fp, struct line **linep, + return false; + } + ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ xfields_multibyte (line); ++ else ++#endif + xfields (line); + + if (prevline[which - 1]) +@@ -528,11 +751,18 @@ prfield (size_t n, struct line const *li + + /* Print the join of LINE1 and LINE2. */ + ++#define PUT_TAB_CHAR \ ++ do \ ++ { \ ++ (tab != NULL) ? \ ++ fwrite(tab, sizeof(char), tablen, stdout) : putchar (' '); \ ++ } \ ++ while (0) ++ + static void + prjoin (struct line const *line1, struct line const *line2) + { + const struct outlist *outlist; +- char output_separator = tab < 0 ? 
' ' : tab; + + outlist = outlist_head.next; + if (outlist) +@@ -567,7 +797,7 @@ prjoin (struct line const *line1, struct + o = o->next; + if (o == NULL) + break; +- putchar (output_separator); ++ PUT_TAB_CHAR; + } + putchar ('\n'); + } +@@ -585,23 +815,23 @@ prjoin (struct line const *line1, struct + prfield (join_field_1, line1); + for (i = 0; i < join_field_1 && i < line1->nfields; ++i) + { +- putchar (output_separator); ++ PUT_TAB_CHAR; + prfield (i, line1); + } + for (i = join_field_1 + 1; i < line1->nfields; ++i) + { +- putchar (output_separator); ++ PUT_TAB_CHAR; + prfield (i, line1); + } + + for (i = 0; i < join_field_2 && i < line2->nfields; ++i) + { +- putchar (output_separator); ++ PUT_TAB_CHAR; + prfield (i, line2); + } + for (i = join_field_2 + 1; i < line2->nfields; ++i) + { +- putchar (output_separator); ++ PUT_TAB_CHAR; + prfield (i, line2); + } + putchar ('\n'); +@@ -1039,21 +1269,46 @@ main (int argc, char **argv) + + case 't': + { +- unsigned char newtab = optarg[0]; ++ char *newtab; ++ size_t newtablen; ++ newtab = xstrdup (optarg); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ mbstate_t state; ++ ++ memset (&state, 0, sizeof (mbstate_t)); ++ newtablen = mbrtowc (NULL, newtab, ++ strnlen (newtab, MB_LEN_MAX), ++ &state); ++ if (newtablen == (size_t) 0 ++ || newtablen == (size_t) -1 ++ || newtablen == (size_t) -2) ++ newtablen = 1; ++ } ++ else ++#endif ++ newtablen = 1; + if (! newtab) +- newtab = '\n'; /* '' => process the whole line. */ ++ { ++ newtab[0] = '\n'; /* '' => process the whole line. 
*/ ++ } + else if (optarg[1]) + { +- if (STREQ (optarg, "\\0")) +- newtab = '\0'; +- else +- error (EXIT_FAILURE, 0, _("multi-character tab %s"), +- quote (optarg)); ++ if (newtablen == 1 && newtab[1]) ++ { ++ if (STREQ (newtab, "\\0")) ++ newtab[0] = '\0'; ++ } ++ } ++ if (tab != NULL && strcmp (tab, newtab)) ++ { ++ free (newtab); ++ error (EXIT_FAILURE, 0, _("incompatible tabs")); + } +- if (0 <= tab && tab != newtab) +- error (EXIT_FAILURE, 0, _("incompatible tabs")); + tab = newtab; +- } ++ tablen = newtablen; ++ } + break; + + case NOCHECK_ORDER_OPTION: +Index: src/pr.c +=================================================================== +--- src/pr.c.orig 2010-03-13 16:14:09.000000000 +0100 ++++ src/pr.c 2010-05-07 16:13:30.836003733 +0200 +@@ -312,6 +312,32 @@ + + #include + #include ++ ++/* Get MB_LEN_MAX. */ ++#include <limits.h> ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX == 1 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Get MB_CUR_MAX. */ ++#include <stdlib.h> ++ ++/* Solaris 2.5 has a bug: <wchar.h> must be included before <wctype.h>. */ ++/* Get mbstate_t, mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++ ++/* Get iswprint(). -- for wcwidth(). */ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++#if !defined iswprint && !HAVE_ISWPRINT ++# define iswprint(wc) 1 ++#endif ++ + #include "system.h" + #include "error.h" + #include "hard-locale.h" +@@ -322,6 +348,18 @@ + #include "strftime.h" + #include "xstrtol.h" + ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ ++#ifndef HAVE_DECL_WCWIDTH ++"this configure-time declaration test was not run" ++#endif ++#if !HAVE_DECL_WCWIDTH ++extern int wcwidth (); ++#endif ++ + /* The official name of this program (e.g., no `g' prefix).
*/ + #define PROGRAM_NAME "pr" + +@@ -414,7 +452,20 @@ struct COLUMN + + typedef struct COLUMN COLUMN; + +-static int char_to_clump (char c); ++/* Function pointers to switch functions for single byte locale or for ++ multibyte locale. If multibyte functions do not exist in your system, ++ these pointers always point to the function for single byte locale. */ ++static void (*print_char) (char c); ++static int (*char_to_clump) (char c); ++ ++/* Functions for single byte locale. */ ++static void print_char_single (char c); ++static int char_to_clump_single (char c); ++ ++/* Functions for multibyte locale. */ ++static void print_char_multi (char c); ++static int char_to_clump_multi (char c); ++ + static bool read_line (COLUMN *p); + static bool print_page (void); + static bool print_stored (COLUMN *p); +@@ -424,6 +475,7 @@ static void print_header (void); + static void pad_across_to (int position); + static void add_line_number (COLUMN *p); + static void getoptarg (char *arg, char switch_char, char *character, ++ int *character_length, int *character_width, + int *number); + void usage (int status); + static void print_files (int number_of_files, char **av); +@@ -438,7 +490,6 @@ static void store_char (char c); + static void pad_down (int lines); + static void read_rest_of_line (COLUMN *p); + static void skip_read (COLUMN *p, int column_number); +-static void print_char (char c); + static void cleanup (void); + static void print_sep_string (void); + static void separator_string (const char *optarg_S); +@@ -450,7 +501,7 @@ static COLUMN *column_vector; + we store the leftmost columns contiguously in buff. + To print a line from buff, get the index of the first character + from line_vector[i], and print up to line_vector[i + 1]. */ +-static char *buff; ++static unsigned char *buff; + + /* Index of the position in buff where the next character + will be stored. 
*/ +@@ -554,7 +605,7 @@ static int chars_per_column; + static bool untabify_input = false; + + /* (-e) The input tab character. */ +-static char input_tab_char = '\t'; ++static char input_tab_char[MB_LEN_MAX] = "\t"; + + /* (-e) Tabstops are at chars_per_tab, 2*chars_per_tab, 3*chars_per_tab, ... + where the leftmost column is 1. */ +@@ -564,7 +615,10 @@ static int chars_per_input_tab = 8; + static bool tabify_output = false; + + /* (-i) The output tab character. */ +-static char output_tab_char = '\t'; ++static char output_tab_char[MB_LEN_MAX] = "\t"; ++ ++/* (-i) The byte length of output tab character. */ ++static int output_tab_char_length = 1; + + /* (-i) The width of the output tab. */ + static int chars_per_output_tab = 8; +@@ -638,7 +692,13 @@ static int power_10; + static bool numbered_lines = false; + + /* (-n) Character which follows each line number. */ +-static char number_separator = '\t'; ++static char number_separator[MB_LEN_MAX] = "\t"; ++ ++/* (-n) The byte length of the character which follows each line number. */ ++static int number_separator_length = 1; ++ ++/* (-n) The character width of the character which follows each line number. */ ++static int number_separator_width = 0; + + /* (-n) line counting starts with 1st line of input file (not with 1st + line of 1st page printed). */ +@@ -691,6 +751,7 @@ static bool use_col_separator = false; + -a|COLUMN|-m is a `space' and with the -J option a `tab'. 
*/ + static char *col_sep_string = (char *) ""; + static int col_sep_length = 0; ++static int col_sep_width = 0; + static char *column_separator = (char *) " "; + static char *line_separator = (char *) "\t"; + +@@ -847,6 +908,13 @@ separator_string (const char *optarg_S) + col_sep_length = (int) strlen (optarg_S); + col_sep_string = xmalloc (col_sep_length + 1); + strcpy (col_sep_string, optarg_S); ++ ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ col_sep_width = mbswidth (col_sep_string, 0); ++ else ++#endif ++ col_sep_width = col_sep_length; + } + + int +@@ -871,6 +939,21 @@ main (int argc, char **argv) + + atexit (close_stdout); + ++/* Define which functions are used, the ones for single byte locale or the ones ++ for multibyte locale. */ ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ print_char = print_char_multi; ++ char_to_clump = char_to_clump_multi; ++ } ++ else ++#endif ++ { ++ print_char = print_char_single; ++ char_to_clump = char_to_clump_single; ++ } ++ + n_files = 0; + file_names = (argc > 1 + ? xmalloc ((argc - 1) * sizeof (char *)) +@@ -947,8 +1030,12 @@ main (int argc, char **argv) + break; + case 'e': + if (optarg) +- getoptarg (optarg, 'e', &input_tab_char, +- &chars_per_input_tab); ++ { ++ int dummy_length, dummy_width; ++ ++ getoptarg (optarg, 'e', input_tab_char, &dummy_length, ++ &dummy_width, &chars_per_input_tab); ++ } + /* Could check tab width > 0. */ + untabify_input = true; + break; +@@ -961,8 +1048,12 @@ main (int argc, char **argv) + break; + case 'i': + if (optarg) +- getoptarg (optarg, 'i', &output_tab_char, +- &chars_per_output_tab); ++ { ++ int dummy_width; ++ ++ getoptarg (optarg, 'i', output_tab_char, &output_tab_char_length, ++ &dummy_width, &chars_per_output_tab); ++ } + /* Could check tab width > 0. 
*/ + tabify_output = true; + break; +@@ -989,8 +1080,8 @@ main (int argc, char **argv) + case 'n': + numbered_lines = true; + if (optarg) +- getoptarg (optarg, 'n', &number_separator, +- &chars_per_number); ++ getoptarg (optarg, 'n', number_separator, &number_separator_length, ++ &number_separator_width, &chars_per_number); + break; + case 'N': + skip_count = false; +@@ -1029,7 +1120,7 @@ main (int argc, char **argv) + old_s = false; + /* Reset an additional input of -s, -S dominates -s */ + col_sep_string = bad_cast (""); +- col_sep_length = 0; ++ col_sep_length = col_sep_width = 0; + use_col_separator = true; + if (optarg) + separator_string (optarg); +@@ -1186,10 +1277,45 @@ main (int argc, char **argv) + a number. */ + + static void +-getoptarg (char *arg, char switch_char, char *character, int *number) ++getoptarg (char *arg, char switch_char, char *character, int *character_length, ++ int *character_width, int *number) + { + if (!ISDIGIT (*arg)) +- *character = *arg++; ++ { ++#ifdef HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) /* for multibyte locale. */ ++ { ++ wchar_t wc; ++ size_t mblength; ++ int width; ++ mbstate_t state = {'\0'}; ++ ++ mblength = mbrtowc (&wc, arg, strnlen(arg, MB_LEN_MAX), &state); ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ *character_length = 1; ++ *character_width = 1; ++ } ++ else ++ { ++ *character_length = (mblength < 1) ? 1 : mblength; ++ width = wcwidth (wc); ++ *character_width = (width < 0) ? 0 : width; ++ } ++ ++ strncpy (character, arg, *character_length); ++ arg += *character_length; ++ } ++ else /* for single byte locale. 
*/ ++#endif ++ { ++ *character = *arg++; ++ *character_length = 1; ++ *character_width = 1; ++ } ++ } ++ + if (*arg) + { + long int tmp_long; +@@ -1248,7 +1374,7 @@ init_parameters (int number_of_files) + else + col_sep_string = column_separator; + +- col_sep_length = 1; ++ col_sep_length = col_sep_width = 1; + use_col_separator = true; + } + /* It's rather pointless to define a TAB separator with column +@@ -1279,11 +1405,11 @@ init_parameters (int number_of_files) + TAB_WIDTH (chars_per_input_tab, chars_per_number); */ + + /* Estimate chars_per_text without any margin and keep it constant. */ +- if (number_separator == '\t') ++ if (number_separator[0] == '\t') + number_width = chars_per_number + + TAB_WIDTH (chars_per_default_tab, chars_per_number); + else +- number_width = chars_per_number + 1; ++ number_width = chars_per_number + number_separator_width; + + /* The number is part of the column width unless we are + printing files in parallel. */ +@@ -1298,7 +1424,7 @@ init_parameters (int number_of_files) + } + + chars_per_column = (chars_per_line - chars_used_by_number - +- (columns - 1) * col_sep_length) / columns; ++ (columns - 1) * col_sep_width) / columns; + + if (chars_per_column < 1) + error (EXIT_FAILURE, 0, _("page width too narrow")); +@@ -1423,7 +1549,7 @@ init_funcs (void) + + /* Enlarge p->start_position of first column to use the same form of + padding_not_printed with all columns. */ +- h = h + col_sep_length; ++ h = h + col_sep_width; + + /* This loop takes care of all but the rightmost column. 
*/ + +@@ -1457,7 +1583,7 @@ init_funcs (void) + } + else + { +- h = h_next + col_sep_length; ++ h = h_next + col_sep_width; + h_next = h + chars_per_column; + } + } +@@ -1747,9 +1873,9 @@ static void + align_column (COLUMN *p) + { + padding_not_printed = p->start_position; +- if (padding_not_printed - col_sep_length > 0) ++ if (padding_not_printed - col_sep_width > 0) + { +- pad_across_to (padding_not_printed - col_sep_length); ++ pad_across_to (padding_not_printed - col_sep_width); + padding_not_printed = ANYWHERE; + } + +@@ -2020,13 +2146,13 @@ store_char (char c) + /* May be too generous. */ + buff = X2REALLOC (buff, &buff_allocated); + } +- buff[buff_current++] = c; ++ buff[buff_current++] = (unsigned char) c; + } + + static void + add_line_number (COLUMN *p) + { +- int i; ++ int i, j; + char *s; + int left_cut; + +@@ -2049,22 +2175,24 @@ add_line_number (COLUMN *p) + /* Tabification is assumed for multiple columns, also for n-separators, + but `default n-separator = TAB' hasn't been given priority over + equal column_width also specified by POSIX. */ +- if (number_separator == '\t') ++ if (number_separator[0] == '\t') + { + i = number_width - chars_per_number; + while (i-- > 0) + (p->char_func) (' '); + } + else +- (p->char_func) (number_separator); ++ for (j = 0; j < number_separator_length; j++) ++ (p->char_func) (number_separator[j]); + } + else + /* To comply with POSIX, we avoid any expansion of default TAB + separator with a single column output. No column_width requirement + has to be considered. 
*/ + { +- (p->char_func) (number_separator); +- if (number_separator == '\t') ++ for (j = 0; j < number_separator_length; j++) ++ (p->char_func) (number_separator[j]); ++ if (number_separator[0] == '\t') + output_position = POS_AFTER_TAB (chars_per_output_tab, + output_position); + } +@@ -2225,7 +2353,7 @@ print_white_space (void) + while (goal - h_old > 1 + && (h_new = POS_AFTER_TAB (chars_per_output_tab, h_old)) <= goal) + { +- putchar (output_tab_char); ++ fwrite (output_tab_char, sizeof(char), output_tab_char_length, stdout); + h_old = h_new; + } + while (++h_old <= goal) +@@ -2245,6 +2373,7 @@ print_sep_string (void) + { + char *s; + int l = col_sep_length; ++ int not_space_flag; + + s = col_sep_string; + +@@ -2258,6 +2387,7 @@ print_sep_string (void) + { + for (; separators_not_printed > 0; --separators_not_printed) + { ++ not_space_flag = 0; + while (l-- > 0) + { + /* 3 types of sep_strings: spaces only, spaces and chars, +@@ -2271,12 +2401,15 @@ print_sep_string (void) + } + else + { ++ not_space_flag = 1; + if (spaces_not_printed > 0) + print_white_space (); + putchar (*s++); +- ++output_position; + } + } ++ if (not_space_flag) ++ output_position += col_sep_width; ++ + /* sep_string ends with some spaces */ + if (spaces_not_printed > 0) + print_white_space (); +@@ -2304,7 +2437,7 @@ print_clump (COLUMN *p, int n, char *clu + required number of tabs and spaces. 
*/ + + static void +-print_char (char c) ++print_char_single (char c) + { + if (tabify_output) + { +@@ -2328,6 +2461,74 @@ print_char (char c) + putchar (c); + } + ++#ifdef HAVE_MBRTOWC ++static void ++print_char_multi (char c) ++{ ++ static size_t mbc_pos = 0; ++ static char mbc[MB_LEN_MAX] = {'\0'}; ++ static mbstate_t state = {'\0'}; ++ mbstate_t state_bak; ++ wchar_t wc; ++ size_t mblength; ++ int width; ++ ++ if (tabify_output) ++ { ++ state_bak = state; ++ mbc[mbc_pos++] = c; ++ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); ++ ++ while (mbc_pos > 0) ++ { ++ switch (mblength) ++ { ++ case (size_t)-2: ++ state = state_bak; ++ return; ++ ++ case (size_t)-1: ++ state = state_bak; ++ ++output_position; ++ putchar (mbc[0]); ++ memmove (mbc, mbc + 1, MB_CUR_MAX - 1); ++ --mbc_pos; ++ break; ++ ++ case 0: ++ mblength = 1; ++ ++ default: ++ if (wc == L' ') ++ { ++ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); ++ --mbc_pos; ++ ++spaces_not_printed; ++ return; ++ } ++ else if (spaces_not_printed > 0) ++ print_white_space (); ++ ++ /* Nonprintables are assumed to have width 0, except L'\b'. */ ++ if ((width = wcwidth (wc)) < 1) ++ { ++ if (wc == L'\b') ++ --output_position; ++ } ++ else ++ output_position += width; ++ ++ fwrite (mbc, sizeof(char), mblength, stdout); ++ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); ++ mbc_pos -= mblength; ++ } ++ } ++ return; ++ } ++ putchar (c); ++} ++#endif ++ + /* Skip to page PAGE before printing. + PAGE may be larger than total number of pages. 
*/ + +@@ -2507,9 +2708,9 @@ read_line (COLUMN *p) + align_empty_cols = false; + } + +- if (padding_not_printed - col_sep_length > 0) ++ if (padding_not_printed - col_sep_width > 0) + { +- pad_across_to (padding_not_printed - col_sep_length); ++ pad_across_to (padding_not_printed - col_sep_width); + padding_not_printed = ANYWHERE; + } + +@@ -2610,9 +2811,9 @@ print_stored (COLUMN *p) + } + } + +- if (padding_not_printed - col_sep_length > 0) ++ if (padding_not_printed - col_sep_width > 0) + { +- pad_across_to (padding_not_printed - col_sep_length); ++ pad_across_to (padding_not_printed - col_sep_width); + padding_not_printed = ANYWHERE; + } + +@@ -2625,8 +2826,8 @@ print_stored (COLUMN *p) + if (spaces_not_printed == 0) + { + output_position = p->start_position + end_vector[line]; +- if (p->start_position - col_sep_length == chars_per_margin) +- output_position -= col_sep_length; ++ if (p->start_position - col_sep_width == chars_per_margin) ++ output_position -= col_sep_width; + } + + return true; +@@ -2645,7 +2846,7 @@ print_stored (COLUMN *p) + number of characters is 1.) 
*/ + + static int +-char_to_clump (char c) ++char_to_clump_single (char c) + { + unsigned char uc = c; + char *s = clump_buff; +@@ -2655,10 +2856,10 @@ char_to_clump (char c) + int chars; + int chars_per_c = 8; + +- if (c == input_tab_char) ++ if (c == input_tab_char[0]) + chars_per_c = chars_per_input_tab; + +- if (c == input_tab_char || c == '\t') ++ if (c == input_tab_char[0] || c == '\t') + { + width = TAB_WIDTH (chars_per_c, input_position); + +@@ -2739,6 +2940,154 @@ char_to_clump (char c) + return chars; + } + ++#ifdef HAVE_MBRTOWC ++static int ++char_to_clump_multi (char c) ++{ ++ static size_t mbc_pos = 0; ++ static char mbc[MB_LEN_MAX] = {'\0'}; ++ static mbstate_t state = {'\0'}; ++ mbstate_t state_bak; ++ wchar_t wc; ++ size_t mblength; ++ int wc_width; ++ register char *s = clump_buff; ++ register int i, j; ++ char esc_buff[4]; ++ int width; ++ int chars; ++ int chars_per_c = 8; ++ ++ state_bak = state; ++ mbc[mbc_pos++] = c; ++ mblength = mbrtowc (&wc, mbc, mbc_pos, &state); ++ ++ width = 0; ++ chars = 0; ++ while (mbc_pos > 0) ++ { ++ switch (mblength) ++ { ++ case (size_t)-2: ++ state = state_bak; ++ return 0; ++ ++ case (size_t)-1: ++ state = state_bak; ++ mblength = 1; ++ ++ if (use_esc_sequence || use_cntrl_prefix) ++ { ++ width = +4; ++ chars = +4; ++ *s++ = '\\'; ++ sprintf (esc_buff, "%03o", mbc[0]); ++ for (i = 0; i <= 2; ++i) ++ *s++ = (int) esc_buff[i]; ++ } ++ else ++ { ++ width += 1; ++ chars += 1; ++ *s++ = mbc[0]; ++ } ++ break; ++ ++ case 0: ++ mblength = 1; ++ /* Fall through */ ++ ++ default: ++ if (memcmp (mbc, input_tab_char, mblength) == 0) ++ chars_per_c = chars_per_input_tab; ++ ++ if (memcmp (mbc, input_tab_char, mblength) == 0 || c == '\t') ++ { ++ int width_inc; ++ ++ width_inc = TAB_WIDTH (chars_per_c, input_position); ++ width += width_inc; ++ ++ if (untabify_input) ++ { ++ for (i = width_inc; i; --i) ++ *s++ = ' '; ++ chars += width_inc; ++ } ++ else ++ { ++ for (i = 0; i < mblength; i++) ++ *s++ = mbc[i]; ++ chars += 
mblength; ++ } ++ } ++ else if ((wc_width = wcwidth (wc)) < 1) ++ { ++ if (use_esc_sequence) ++ { ++ for (i = 0; i < mblength; i++) ++ { ++ width += 4; ++ chars += 4; ++ *s++ = '\\'; ++ sprintf (esc_buff, "%03o", c); ++ for (j = 0; j <= 2; ++j) ++ *s++ = (int) esc_buff[j]; ++ } ++ } ++ else if (use_cntrl_prefix) ++ { ++ if (wc < 0200) ++ { ++ width += 2; ++ chars += 2; ++ *s++ = '^'; ++ *s++ = wc ^ 0100; ++ } ++ else ++ { ++ for (i = 0; i < mblength; i++) ++ { ++ width += 4; ++ chars += 4; ++ *s++ = '\\'; ++ sprintf (esc_buff, "%03o", c); ++ for (j = 0; j <= 2; ++j) ++ *s++ = (int) esc_buff[j]; ++ } ++ } ++ } ++ else if (wc == L'\b') ++ { ++ width += -1; ++ chars += 1; ++ *s++ = c; ++ } ++ else ++ { ++ width += 0; ++ chars += mblength; ++ for (i = 0; i < mblength; i++) ++ *s++ = mbc[i]; ++ } ++ } ++ else ++ { ++ width += wc_width; ++ chars += mblength; ++ for (i = 0; i < mblength; i++) ++ *s++ = mbc[i]; ++ } ++ } ++ memmove (mbc, mbc + mblength, MB_CUR_MAX - mblength); ++ mbc_pos -= mblength; ++ } ++ ++ input_position += width; ++ return chars; ++} ++#endif ++ + /* We've just printed some files and need to clean up things before + looking for more options and printing the next batch of files. + +Index: src/sort.c +=================================================================== +--- src/sort.c.orig 2010-04-21 09:06:17.000000000 +0200 ++++ src/sort.c 2010-05-07 16:34:36.664210645 +0200 +@@ -22,10 +22,19 @@ + + #include <config.h> + ++#include <assert.h> + #include <getopt.h> + #include <sys/types.h> + #include <sys/wait.h> + #include <signal.h> ++#if HAVE_WCHAR_H ++# include <wchar.h> ++#endif ++/* Get isw* functions. */ ++#if HAVE_WCTYPE_H ++# include <wctype.h> ++#endif ++ + #include "system.h" + #include "argmatch.h" + #include "error.h" +@@ -124,14 +133,38 @@ static int decimal_point; + /* Thousands separator; if -1, then there isn't one. */ + static int thousands_sep; + ++static int force_general_numcompare = 0; ++ + /* Nonzero if the corresponding locales are hard. 
*/ + static bool hard_LC_COLLATE; +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + static bool hard_LC_TIME; + #endif + + #define NONZERO(x) ((x) != 0) + ++/* Get a multibyte character's byte length. */ ++#define GET_BYTELEN_OF_CHAR(LIM, PTR, MBLENGTH, STATE) \ ++ do \ ++ { \ ++ wchar_t wc; \ ++ mbstate_t state_bak; \ ++ \ ++ state_bak = STATE; \ ++ MBLENGTH = mbrtowc (&wc, PTR, LIM - PTR, &STATE); \ ++ \ ++ switch (MBLENGTH) \ ++ { \ ++ case (size_t)-1: \ ++ case (size_t)-2: \ ++ STATE = state_bak; \ ++ /* Fall through. */ \ ++ case 0: \ ++ MBLENGTH = 1; \ ++ } \ ++ } \ ++ while (0) ++ + /* The kind of blanks for '-b' to skip in various options. */ + enum blanktype { bl_start, bl_end, bl_both }; + +@@ -270,13 +303,11 @@ static bool reverse; + they were read if all keys compare equal. */ + static bool stable; + +-/* If TAB has this value, blanks separate fields. */ +-enum { TAB_DEFAULT = CHAR_MAX + 1 }; +- +-/* Tab character separating fields. If TAB_DEFAULT, then fields are ++/* Tab character separating fields. If tab_length is 0, then fields are + separated by the empty string between a non-blank character and a blank + character. */ +-static int tab = TAB_DEFAULT; ++static char tab[MB_LEN_MAX + 1]; ++static size_t tab_length = 0; + + /* Flag to remove consecutive duplicate lines from the output. + Only the last of a sequence of equal lines will be output. */ +@@ -714,6 +745,44 @@ reap_some (void) + update_proc (pid); + } + ++/* Function pointers. */ ++static void ++(*inittables) (void); ++static char * ++(*begfield) (const struct line*, const struct keyfield *); ++static char * ++(*limfield) (const struct line*, const struct keyfield *); ++static int ++(*getmonth) (char const *, size_t); ++static int ++(*keycompare) (const struct line *, const struct line *); ++static int ++(*numcompare) (const char *, const char *); ++ ++/* Test for white space multibyte character. ++ Set LENGTH to the byte length of the investigated multibyte character. 
*/ ++#if HAVE_MBRTOWC ++static int ++ismbblank (const char *str, size_t len, size_t *length) ++{ ++ size_t mblength; ++ wchar_t wc; ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ mblength = mbrtowc (&wc, str, len, &state); ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ *length = 1; ++ return 0; ++ } ++ ++ *length = (mblength < 1) ? 1 : mblength; ++ return iswblank (wc); ++} ++#endif ++ + /* Clean up any remaining temporary files. */ + + static void +@@ -1158,7 +1227,7 @@ zaptemp (const char *name) + free (node); + } + +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + + static int + struct_month_cmp (const void *m1, const void *m2) +@@ -1173,7 +1242,7 @@ struct_month_cmp (const void *m1, const + /* Initialize the character class tables. */ + + static void +-inittables (void) ++inittables_uni (void) + { + size_t i; + +@@ -1185,7 +1254,7 @@ inittables (void) + fold_toupper[i] = toupper (i); + } + +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + /* If we're not in the "C" locale, read different names for months. 
*/ + if (hard_LC_TIME) + { +@@ -1268,6 +1337,64 @@ specify_nmerge (int oi, char c, char con + xstrtol_fatal (e, oi, c, long_options, s); + } + ++#if HAVE_MBRTOWC ++static void ++inittables_mb (void) ++{ ++ int i, j, k, l; ++ char *name, *s; ++ size_t s_len, mblength; ++ char mbc[MB_LEN_MAX]; ++ wchar_t wc, pwc; ++ mbstate_t state_mb, state_wc; ++ ++ for (i = 0; i < MONTHS_PER_YEAR; i++) ++ { ++ s = (char *) nl_langinfo (ABMON_1 + i); ++ s_len = strlen (s); ++ monthtab[i].name = name = (char *) xmalloc (s_len + 1); ++ monthtab[i].val = i + 1; ++ ++ memset (&state_mb, '\0', sizeof (mbstate_t)); ++ memset (&state_wc, '\0', sizeof (mbstate_t)); ++ ++ for (j = 0; j < s_len;) ++ { ++ if (!ismbblank (s + j, s_len - j, &mblength)) ++ break; ++ j += mblength; ++ } ++ ++ for (k = 0; j < s_len;) ++ { ++ mblength = mbrtowc (&wc, (s + j), (s_len - j), &state_mb); ++ assert (mblength != (size_t)-1 && mblength != (size_t)-2); ++ if (mblength == 0) ++ break; ++ ++ pwc = towupper (wc); ++ if (pwc == wc) ++ { ++ memcpy (mbc, s + j, mblength); ++ j += mblength; ++ } ++ else ++ { ++ j += mblength; ++ mblength = wcrtomb (mbc, pwc, &state_wc); ++ assert (mblength != (size_t)0 && mblength != (size_t)-1); ++ } ++ ++ for (l = 0; l < mblength; l++) ++ name[k++] = mbc[l]; ++ } ++ name[k] = '\0'; ++ } ++ qsort ((void *) monthtab, MONTHS_PER_YEAR, ++ sizeof (struct month), struct_month_cmp); ++} ++#endif ++ + /* Specify the amount of main memory to use when sorting. */ + static void + specify_sort_size (int oi, char c, char const *s) +@@ -1478,7 +1605,7 @@ buffer_linelim (struct buffer const *buf + by KEY in LINE. 
*/ + + static char * +-begfield (const struct line *line, const struct keyfield *key) ++begfield_uni (const struct line *line, const struct keyfield *key) + { + char *ptr = line->text, *lim = ptr + line->length - 1; + size_t sword = key->sword; +@@ -1487,10 +1614,10 @@ begfield (const struct line *line, const + /* The leading field separator itself is included in a field when -t + is absent. */ + +- if (tab != TAB_DEFAULT) ++ if (tab_length) + while (ptr < lim && sword--) + { +- while (ptr < lim && *ptr != tab) ++ while (ptr < lim && *ptr != tab[0]) + ++ptr; + if (ptr < lim) + ++ptr; +@@ -1516,11 +1643,70 @@ begfield (const struct line *line, const + return ptr; + } + ++#if HAVE_MBRTOWC ++static char * ++begfield_mb (const struct line *line, const struct keyfield *key) ++{ ++ int i; ++ char *ptr = line->text, *lim = ptr + line->length - 1; ++ size_t sword = key->sword; ++ size_t schar = key->schar; ++ size_t mblength; ++ mbstate_t state; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ if (tab_length) ++ while (ptr < lim && sword--) ++ { ++ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ } ++ else ++ while (ptr < lim && sword--) ++ { ++ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ while (ptr < lim && !ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ } ++ ++ if (key->skipsblanks) ++ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ ++ for (i = 0; i < schar; i++) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ++ if (ptr + mblength > lim) ++ break; ++ else ++ ptr += mblength; ++ } ++ ++ return ptr; ++} ++#endif ++ + /* Return the limit of (a pointer to the first character after) the 
field + in LINE specified by KEY. */ + + static char * +-limfield (const struct line *line, const struct keyfield *key) ++limfield_uni (const struct line *line, const struct keyfield *key) + { + char *ptr = line->text, *lim = ptr + line->length - 1; + size_t eword = key->eword, echar = key->echar; +@@ -1535,10 +1721,10 @@ limfield (const struct line *line, const + `beginning' is the first character following the delimiting TAB. + Otherwise, leave PTR pointing at the first `blank' character after + the preceding field. */ +- if (tab != TAB_DEFAULT) ++ if (tab_length) + while (ptr < lim && eword--) + { +- while (ptr < lim && *ptr != tab) ++ while (ptr < lim && *ptr != tab[0]) + ++ptr; + if (ptr < lim && (eword || echar)) + ++ptr; +@@ -1584,10 +1770,10 @@ limfield (const struct line *line, const + */ + + /* Make LIM point to the end of (one byte past) the current field. */ +- if (tab != TAB_DEFAULT) ++ if (tab_length) + { + char *newlim; +- newlim = memchr (ptr, tab, lim - ptr); ++ newlim = memchr (ptr, tab[0], lim - ptr); + if (newlim) + lim = newlim; + } +@@ -1618,6 +1804,113 @@ limfield (const struct line *line, const + return ptr; + } + ++#if HAVE_MBRTOWC ++static char * ++limfield_mb (const struct line *line, const struct keyfield *key) ++{ ++ char *ptr = line->text, *lim = ptr + line->length - 1; ++ size_t eword = key->eword, echar = key->echar; ++ int i; ++ size_t mblength; ++ mbstate_t state; ++ ++ if (echar == 0) ++ eword++; /* skip all of end field. 
*/ ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ if (tab_length) ++ while (ptr < lim && eword--) ++ { ++ while (ptr < lim && memcmp (ptr, tab, tab_length) != 0) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ if (ptr < lim && (eword | echar)) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ } ++ else ++ while (ptr < lim && eword--) ++ { ++ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ while (ptr < lim && !ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ } ++ ++ ++# ifdef POSIX_UNSPECIFIED ++ /* Make LIM point to the end of (one byte past) the current field. */ ++ if (tab_length) ++ { ++ char *newlim, *p; ++ ++ newlim = NULL; ++ for (p = ptr; p < lim;) ++ { ++ if (memcmp (p, tab, tab_length) == 0) ++ { ++ newlim = p; ++ break; ++ } ++ ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ p += mblength; ++ } ++ } ++ else ++ { ++ char *newlim; ++ newlim = ptr; ++ ++ while (newlim < lim && ismbblank (newlim, lim - newlim, &mblength)) ++ newlim += mblength; ++ if (ptr < lim) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ptr += mblength; ++ } ++ while (newlim < lim && !ismbblank (newlim, lim - newlim, &mblength)) ++ newlim += mblength; ++ lim = newlim; ++ } ++# endif ++ ++ if (echar != 0) ++ { ++ /* If we're skipping leading blanks, don't start counting characters ++ * until after skipping past any leading blanks. */ ++ if (key->skipsblanks) ++ while (ptr < lim && ismbblank (ptr, lim - ptr, &mblength)) ++ ptr += mblength; ++ ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ /* Advance PTR by ECHAR (if possible), but no further than LIM. 
*/ ++ for (i = 0; i < echar; i++) ++ { ++ GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state); ++ ++ if (ptr + mblength > lim) ++ break; ++ else ++ ptr += mblength; ++ } ++ } ++ ++ return ptr; ++} ++#endif ++ + /* Fill BUF reading from FP, moving buf->left bytes from the end + of buf->buf to the beginning first. If EOF is reached and the + file wasn't terminated by a newline, supply one. Set up BUF's line +@@ -1700,8 +1993,24 @@ fillbuf (struct buffer *buf, FILE *fp, c + else + { + if (key->skipsblanks) +- while (blanks[to_uchar (*line_start)]) +- line_start++; ++ { ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ size_t mblength; ++ mbstate_t state; ++ memset (&state, '\0', sizeof(mbstate_t)); ++ while (line_start < line->keylim && ++ ismbblank (line_start, ++ line->keylim - line_start, ++ &mblength)) ++ line_start += mblength; ++ } ++ else ++#endif ++ while (blanks[to_uchar (*line_start)]) ++ line_start++; ++ } + line->keybeg = line_start; + } + } +@@ -1739,7 +2048,7 @@ fillbuf (struct buffer *buf, FILE *fp, c + hideously fast. */ + + static int +-numcompare (const char *a, const char *b) ++numcompare_uni (const char *a, const char *b) + { + while (blanks[to_uchar (*a)]) + a++; +@@ -1848,6 +2157,25 @@ human_numcompare (const char *a, const c + : strnumcmp (a, b, decimal_point, thousands_sep)); + } + ++#if HAVE_MBRTOWC ++static int ++numcompare_mb (const char *a, const char *b) ++{ ++ size_t mblength, len; ++ len = strlen (a); /* okay for UTF-8 */ ++ while (*a && ismbblank (a, len > MB_CUR_MAX ? MB_CUR_MAX : len, &mblength)) ++ { ++ a += mblength; ++ len -= mblength; ++ } ++ len = strlen (b); /* okay for UTF-8 */ ++ while (*b && ismbblank (b, len > MB_CUR_MAX ? 
MB_CUR_MAX : len, &mblength)) ++ b += mblength; ++ ++ return strnumcmp (a, b, decimal_point, thousands_sep); ++} ++#endif /* HAVE_MBRTOWC */ ++ + static int + general_numcompare (const char *sa, const char *sb) + { +@@ -1881,7 +2209,7 @@ general_numcompare (const char *sa, cons + Return 0 if the name in S is not recognized. */ + + static int +-getmonth (char const *month, size_t len) ++getmonth_uni (char const *month, size_t len) + { + size_t lo = 0; + size_t hi = MONTHS_PER_YEAR; +@@ -2062,11 +2390,79 @@ compare_version (char *restrict texta, s + return diff; + } + ++#if HAVE_MBRTOWC ++static int ++getmonth_mb (const char *s, size_t len) ++{ ++ char *month; ++ register size_t i; ++ register int lo = 0, hi = MONTHS_PER_YEAR, result; ++ char *tmp; ++ size_t wclength, mblength; ++ const char **pp; ++ const wchar_t **wpp; ++ wchar_t *month_wcs; ++ mbstate_t state; ++ ++ while (len > 0 && ismbblank (s, len, &mblength)) ++ { ++ s += mblength; ++ len -= mblength; ++ } ++ ++ if (len == 0) ++ return 0; ++ ++ month = (char *) alloca (len + 1); ++ ++ tmp = (char *) alloca (len + 1); ++ memcpy (tmp, s, len); ++ tmp[len] = '\0'; ++ pp = (const char **)&tmp; ++ month_wcs = (wchar_t *) alloca ((len + 1) * sizeof (wchar_t)); ++ memset (&state, '\0', sizeof(mbstate_t)); ++ ++ wclength = mbsrtowcs (month_wcs, pp, len + 1, &state); ++ assert (wclength != (size_t)-1 && *pp == NULL); ++ ++ for (i = 0; i < wclength; i++) ++ { ++ month_wcs[i] = towupper(month_wcs[i]); ++ if (iswblank (month_wcs[i])) ++ { ++ month_wcs[i] = L'\0'; ++ break; ++ } ++ } ++ ++ wpp = (const wchar_t **)&month_wcs; ++ ++ mblength = wcsrtombs (month, wpp, len + 1, &state); ++ assert (mblength != (-1) && *wpp == NULL); ++ ++ do ++ { ++ int ix = (lo + hi) / 2; ++ ++ if (strncmp (month, monthtab[ix].name, strlen (monthtab[ix].name)) < 0) ++ hi = ix; ++ else ++ lo = ix; ++ } ++ while (hi - lo > 1); ++ ++ result = (!strncmp (month, monthtab[lo].name, strlen (monthtab[lo].name)) ++ ? 
monthtab[lo].val : 0); ++ ++ return result; ++} ++#endif ++ + /* Compare two lines A and B trying every key in sequence until there + are no more keys or a difference is found. */ + + static int +-keycompare (const struct line *a, const struct line *b) ++keycompare_uni (const struct line *a, const struct line *b) + { + struct keyfield *key = keylist; + +@@ -2246,6 +2642,179 @@ keycompare (const struct line *a, const + return key->reverse ? -diff : diff; + } + ++#if HAVE_MBRTOWC ++static int ++keycompare_mb (const struct line *a, const struct line *b) ++{ ++ struct keyfield *key = keylist; ++ ++ /* For the first iteration only, the key positions have been ++ precomputed for us. */ ++ char *texta = a->keybeg; ++ char *textb = b->keybeg; ++ char *lima = a->keylim; ++ char *limb = b->keylim; ++ ++ size_t mblength_a, mblength_b; ++ wchar_t wc_a, wc_b; ++ mbstate_t state_a, state_b; ++ ++ int diff; ++ ++ memset (&state_a, '\0', sizeof(mbstate_t)); ++ memset (&state_b, '\0', sizeof(mbstate_t)); ++ ++ for (;;) ++ { ++ char const *translate = key->translate; ++ bool const *ignore = key->ignore; ++ ++ /* Find the lengths. */ ++ size_t lena = lima <= texta ? 0 : lima - texta; ++ size_t lenb = limb <= textb ? 0 : limb - textb; ++ ++ /* Actually compare the fields. */ ++ if (key->random) ++ diff = compare_random (texta, lena, textb, lenb); ++ else if (key->numeric | key->general_numeric | key->human_numeric) ++ { ++ char savea = *lima, saveb = *limb; ++ ++ *lima = *limb = '\0'; ++ diff = (key->numeric ? numcompare (texta, textb) ++ : key->general_numeric ? 
general_numcompare (texta, textb) ++ : human_numcompare (texta, textb, key)); ++ *lima = savea, *limb = saveb; ++ } ++ else if (key->version) ++ diff = compare_version (texta, lena, textb, lenb); ++ else if (key->month) ++ diff = getmonth (texta, lena) - getmonth (textb, lenb); ++ else ++ { ++ if (ignore || translate) ++ { ++ char *copy_a = (char *) alloca (lena + 1 + lenb + 1); ++ char *copy_b = copy_a + lena + 1; ++ size_t new_len_a, new_len_b; ++ size_t i, j; ++ ++ /* Ignore and/or translate chars before comparing. */ ++# define IGNORE_CHARS(NEW_LEN, LEN, TEXT, COPY, WC, MBLENGTH, STATE) \ ++ do \ ++ { \ ++ wchar_t uwc; \ ++ char mbc[MB_LEN_MAX]; \ ++ mbstate_t state_wc; \ ++ \ ++ for (NEW_LEN = i = 0; i < LEN;) \ ++ { \ ++ mbstate_t state_bak; \ ++ \ ++ state_bak = STATE; \ ++ MBLENGTH = mbrtowc (&WC, TEXT + i, LEN - i, &STATE); \ ++ \ ++ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1 \ ++ || MBLENGTH == 0) \ ++ { \ ++ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \ ++ STATE = state_bak; \ ++ if (!ignore) \ ++ COPY[NEW_LEN++] = TEXT[i++]; \ ++ continue; \ ++ } \ ++ \ ++ if (ignore) \ ++ { \ ++ if ((ignore == nonprinting && !iswprint (WC)) \ ++ || (ignore == nondictionary \ ++ && !iswalnum (WC) && !iswblank (WC))) \ ++ { \ ++ i += MBLENGTH; \ ++ continue; \ ++ } \ ++ } \ ++ \ ++ if (translate) \ ++ { \ ++ \ ++ uwc = towupper(WC); \ ++ if (WC == uwc) \ ++ { \ ++ memcpy (mbc, TEXT + i, MBLENGTH); \ ++ i += MBLENGTH; \ ++ } \ ++ else \ ++ { \ ++ i += MBLENGTH; \ ++ WC = uwc; \ ++ memset (&state_wc, '\0', sizeof (mbstate_t)); \ ++ \ ++ MBLENGTH = wcrtomb (mbc, WC, &state_wc); \ ++ assert (MBLENGTH != (size_t)-1 && MBLENGTH != 0); \ ++ } \ ++ \ ++ for (j = 0; j < MBLENGTH; j++) \ ++ COPY[NEW_LEN++] = mbc[j]; \ ++ } \ ++ else \ ++ for (j = 0; j < MBLENGTH; j++) \ ++ COPY[NEW_LEN++] = TEXT[i++]; \ ++ } \ ++ COPY[NEW_LEN] = '\0'; \ ++ } \ ++ while (0) ++ IGNORE_CHARS (new_len_a, lena, texta, copy_a, ++ wc_a, mblength_a, state_a); ++ IGNORE_CHARS 
(new_len_b, lenb, textb, copy_b, ++ wc_b, mblength_b, state_b); ++ diff = xmemcoll (copy_a, new_len_a, copy_b, new_len_b); ++ } ++ else if (lena == 0) ++ diff = - NONZERO (lenb); ++ else if (lenb == 0) ++ goto greater; ++ else ++ diff = xmemcoll (texta, lena, textb, lenb); ++ } ++ ++ if (diff) ++ goto not_equal; ++ ++ key = key->next; ++ if (! key) ++ break; ++ ++ /* Find the beginning and limit of the next field. */ ++ if (key->eword != -1) ++ lima = limfield (a, key), limb = limfield (b, key); ++ else ++ lima = a->text + a->length - 1, limb = b->text + b->length - 1; ++ ++ if (key->sword != -1) ++ texta = begfield (a, key), textb = begfield (b, key); ++ else ++ { ++ texta = a->text, textb = b->text; ++ if (key->skipsblanks) ++ { ++ while (texta < lima && ismbblank (texta, lima - texta, &mblength_a)) ++ texta += mblength_a; ++ while (textb < limb && ismbblank (textb, limb - textb, &mblength_b)) ++ textb += mblength_b; ++ } ++ } ++ } ++ ++ return 0; ++ ++greater: ++ diff = 1; ++not_equal: ++ return key->reverse ? -diff : diff; ++} ++#endif ++ + /* Compare two lines A and B, returning negative, zero, or positive + depending on whether A compares less than, equal to, or greater than B. 
*/ + +@@ -3244,7 +3813,7 @@ main (int argc, char **argv) + initialize_exit_failure (SORT_FAILURE); + + hard_LC_COLLATE = hard_locale (LC_COLLATE); +-#if HAVE_NL_LANGINFO ++#if HAVE_LANGINFO_CODESET + hard_LC_TIME = hard_locale (LC_TIME); + #endif + +@@ -3265,6 +3834,27 @@ main (int argc, char **argv) + thousands_sep = -1; + } + ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ inittables = inittables_mb; ++ begfield = begfield_mb; ++ limfield = limfield_mb; ++ getmonth = getmonth_mb; ++ keycompare = keycompare_mb; ++ numcompare = numcompare_mb; ++ } ++ else ++#endif ++ { ++ inittables = inittables_uni; ++ begfield = begfield_uni; ++ limfield = limfield_uni; ++ getmonth = getmonth_uni; ++ keycompare = keycompare_uni; ++ numcompare = numcompare_uni; ++ } ++ + have_read_stdin = false; + inittables (); + +@@ -3536,13 +4126,35 @@ main (int argc, char **argv) + + case 't': + { +- char newtab = optarg[0]; +- if (! newtab) ++ char newtab[MB_LEN_MAX + 1]; ++ size_t newtab_length = 1; ++ strncpy (newtab, optarg, MB_LEN_MAX); ++ if (! newtab[0]) + error (SORT_FAILURE, 0, _("empty tab")); +- if (optarg[1]) ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ wchar_t wc; ++ mbstate_t state; ++ size_t i; ++ ++ memset (&state, '\0', sizeof (mbstate_t)); ++ newtab_length = mbrtowc (&wc, newtab, strnlen (newtab, ++ MB_LEN_MAX), ++ &state); ++ switch (newtab_length) ++ { ++ case (size_t) -1: ++ case (size_t) -2: ++ case 0: ++ newtab_length = 1; ++ } ++ } ++#endif ++ if (newtab_length == 1 && optarg[1]) + { + if (STREQ (optarg, "\\0")) +- newtab = '\0'; ++ newtab[0] = '\0'; + else + { + /* Provoke with `sort -txx'. 
Complain about +@@ -3553,9 +4165,12 @@ main (int argc, char **argv) + quote (optarg)); + } + } +- if (tab != TAB_DEFAULT && tab != newtab) ++ if (tab_length ++ && (tab_length != newtab_length ++ || memcmp (tab, newtab, tab_length) != 0)) + error (SORT_FAILURE, 0, _("incompatible tabs")); +- tab = newtab; ++ memcpy (tab, newtab, newtab_length); ++ tab_length = newtab_length; + } + break; + +Index: src/unexpand.c +=================================================================== +--- src/unexpand.c.orig 2010-01-01 14:06:47.000000000 +0100 ++++ src/unexpand.c 2010-05-07 16:13:31.016492129 +0200 +@@ -39,11 +39,28 @@ + #include + #include + #include ++ ++/* Get mbstate_t, mbrtowc(), wcwidth(). */ ++#if HAVE_WCHAR_H ++# include ++#endif ++ + #include "system.h" + #include "error.h" + #include "quote.h" + #include "xstrndup.h" + ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "unexpand" + +@@ -103,6 +120,208 @@ static struct option const longopts[] = + {NULL, 0, NULL, 0} + }; + ++static FILE *next_file (FILE *fp); ++ ++#if HAVE_MBRTOWC ++static void ++unexpand_multibyte (void) ++{ ++ FILE *fp; /* Input stream. */ ++ mbstate_t i_state; /* Current shift state of the input stream. */ ++ mbstate_t i_state_bak; /* Back up the I_STATE. */ ++ mbstate_t o_state; /* Current shift state of the output stream. */ ++ char buf[MB_LEN_MAX + BUFSIZ]; /* For spooling a read byte sequence. */ ++ char *bufpos; /* Next read position of BUF. */ ++ size_t buflen = 0; /* The length of the byte sequence in buf. */ ++ wint_t wc; /* A gotten wide character. 
*/ ++ size_t mblength; /* The byte size of a multibyte character ++ which shows as same character as WC. */ ++ ++ /* Index in `tab_list' of next tabstop: */ ++ int tab_index = 0; /* For calculating width of pending tabs. */ ++ int print_tab_index = 0; /* For printing as many tabs as possible. */ ++ unsigned int column = 0; /* Column on screen of next char. */ ++ int next_tab_column; /* Column the next tab stop is on. */ ++ int convert = 1; /* If nonzero, perform translations. */ ++ unsigned int pending = 0; /* Pending columns of blanks. */ ++ ++ fp = next_file ((FILE *) NULL); ++ if (fp == NULL) ++ return; ++ ++ memset (&o_state, '\0', sizeof(mbstate_t)); ++ memset (&i_state, '\0', sizeof(mbstate_t)); ++ ++ for (;;) ++ { ++ if (buflen < MB_LEN_MAX && !feof(fp) && !ferror(fp)) ++ { ++ memmove (buf, bufpos, buflen); ++ buflen += fread (buf + buflen, sizeof(char), BUFSIZ, fp); ++ bufpos = buf; ++ } ++ ++ /* Get a wide character. */ ++ if (buflen < 1) ++ { ++ mblength = 1; ++ wc = WEOF; ++ } ++ else ++ { ++ i_state_bak = i_state; ++ mblength = mbrtowc ((wchar_t *)&wc, bufpos, buflen, &i_state); ++ } ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ i_state = i_state_bak; ++ wc = L'\0'; ++ } ++ ++ if (wc == L' ' && convert && column < INT_MAX) ++ { ++ ++pending; ++ ++column; ++ } ++ else if (wc == L'\t' && convert) ++ { ++ if (tab_size == 0) ++ { ++ /* Do not let tab_index == first_free_tab; ++ stop when it is 1 less. */ ++ while (tab_index < first_free_tab - 1 ++ && column >= tab_list[tab_index]) ++ tab_index++; ++ next_tab_column = tab_list[tab_index]; ++ if (tab_index < first_free_tab - 1) ++ tab_index++; ++ if (column >= next_tab_column) ++ { ++ convert = 0; /* Ran out of tab stops. */ ++ goto flush_pend_mb; ++ } ++ } ++ else ++ { ++ next_tab_column = column + tab_size - column % tab_size; ++ } ++ pending += next_tab_column - column; ++ column = next_tab_column; ++ } ++ else ++ { ++flush_pend_mb: ++ /* Flush pending spaces. 
Print as many tabs as possible, ++ then print the rest as spaces. */ ++ if (pending == 1) ++ { ++ putchar (' '); ++ pending = 0; ++ } ++ column -= pending; ++ while (pending > 0) ++ { ++ if (tab_size == 0) ++ { ++ /* Do not let print_tab_index == first_free_tab; ++ stop when it is 1 less. */ ++ while (print_tab_index < first_free_tab - 1 ++ && column >= tab_list[print_tab_index]) ++ print_tab_index++; ++ next_tab_column = tab_list[print_tab_index]; ++ if (print_tab_index < first_free_tab - 1) ++ print_tab_index++; ++ } ++ else ++ { ++ next_tab_column = ++ column + tab_size - column % tab_size; ++ } ++ if (next_tab_column - column <= pending) ++ { ++ putchar ('\t'); ++ pending -= next_tab_column - column; ++ column = next_tab_column; ++ } ++ else ++ { ++ --print_tab_index; ++ column += pending; ++ while (pending != 0) ++ { ++ putchar (' '); ++ pending--; ++ } ++ } ++ } ++ ++ if (wc == WEOF) ++ { ++ fp = next_file (fp); ++ if (fp == NULL) ++ break; /* No more files. */ ++ else ++ { ++ memset (&i_state, '\0', sizeof(mbstate_t)); ++ continue; ++ } ++ } ++ ++ if (mblength == (size_t)-1 || mblength == (size_t)-2) ++ { ++ if (convert) ++ { ++ ++column; ++ if (convert_entire_line == 0) ++ convert = 0; ++ } ++ mblength = 1; ++ putchar (buf[0]); ++ } ++ else if (mblength == 0) ++ { ++ if (convert && convert_entire_line == 0) ++ convert = 0; ++ mblength = 1; ++ putchar ('\0'); ++ } ++ else ++ { ++ if (convert) ++ { ++ if (wc == L'\b') ++ { ++ if (column > 0) ++ --column; ++ } ++ else ++ { ++ int width; /* The width of WC. */ ++ ++ width = wcwidth (wc); ++ column += (width > 0) ? 
width : 0; ++ if (convert_entire_line == 0) ++ convert = 0; ++ } ++ } ++ ++ if (wc == L'\n') ++ { ++ tab_index = print_tab_index = 0; ++ column = pending = 0; ++ convert = 1; ++ } ++ fwrite (bufpos, sizeof(char), mblength, stdout); ++ } ++ } ++ buflen -= mblength; ++ bufpos += mblength; ++ } ++} ++#endif ++ ++ + void + usage (int status) + { +@@ -524,7 +743,12 @@ main (int argc, char **argv) + + file_list = (optind < argc ? &argv[optind] : stdin_argv); + +- unexpand (); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ unexpand_multibyte (); ++ else ++#endif ++ unexpand (); + + if (have_read_stdin && fclose (stdin) != 0) + error (EXIT_FAILURE, errno, "-"); +Index: src/uniq.c +=================================================================== +--- src/uniq.c.orig 2010-03-13 16:14:09.000000000 +0100 ++++ src/uniq.c 2010-05-07 16:41:34.000063405 +0200 +@@ -21,6 +21,16 @@ + #include + #include + ++/* Get mbstate_t, mbrtowc(). */ ++#if HAVE_WCHAR_H ++# include ++#endif ++ ++/* Get isw* functions. */ ++#if HAVE_WCTYPE_H ++# include ++#endif ++ + #include "system.h" + #include "argmatch.h" + #include "linebuffer.h" +@@ -31,7 +41,19 @@ + #include "stdio--.h" + #include "xmemcoll.h" + #include "xstrtol.h" +-#include "memcasecmp.h" ++#include "xmemcoll.h" ++ ++/* MB_LEN_MAX is incorrectly defined to be 1 in at least one GCC ++ installation; work around this configuration error. */ ++#if !defined MB_LEN_MAX || MB_LEN_MAX < 2 ++# define MB_LEN_MAX 16 ++#endif ++ ++/* Some systems, like BeOS, have multibyte encodings but lack mbstate_t. */ ++#if HAVE_MBRTOWC && defined mbstate_t ++# define mbrtowc(pwc, s, n, ps) (mbrtowc) (pwc, s, n, 0) ++#endif ++ + + /* The official name of this program (e.g., no `g' prefix). */ + #define PROGRAM_NAME "uniq" +@@ -107,6 +129,10 @@ static enum delimit_method const delimit + /* Select whether/how to delimit groups of duplicate lines. */ + static enum delimit_method delimit_groups; + ++/* Function pointers. 
*/ ++static char * ++(*find_field) (struct linebuffer *line); ++ + static struct option const longopts[] = + { + {"count", no_argument, NULL, 'c'}, +@@ -206,7 +232,7 @@ size_opt (char const *opt, char const *m + return a pointer to the beginning of the line's field to be compared. */ + + static char * +-find_field (struct linebuffer const *line) ++find_field_uni (struct linebuffer *line) + { + size_t count; + char const *lp = line->buffer; +@@ -227,6 +253,83 @@ find_field (struct linebuffer const *lin + return line->buffer + i; + } + ++#if HAVE_MBRTOWC ++ ++# define MBCHAR_TO_WCHAR(WC, MBLENGTH, LP, POS, SIZE, STATEP, CONVFAIL) \ ++ do \ ++ { \ ++ mbstate_t state_bak; \ ++ \ ++ CONVFAIL = 0; \ ++ state_bak = *STATEP; \ ++ \ ++ MBLENGTH = mbrtowc (&WC, LP + POS, SIZE - POS, STATEP); \ ++ \ ++ switch (MBLENGTH) \ ++ { \ ++ case (size_t)-2: \ ++ case (size_t)-1: \ ++ *STATEP = state_bak; \ ++ CONVFAIL++; \ ++ /* Fall through */ \ ++ case 0: \ ++ MBLENGTH = 1; \ ++ } \ ++ } \ ++ while (0) ++ ++static char * ++find_field_multi (struct linebuffer *line) ++{ ++ size_t count; ++ char *lp = line->buffer; ++ size_t size = line->length - 1; ++ size_t pos; ++ size_t mblength; ++ wchar_t wc; ++ mbstate_t *statep; ++ int convfail; ++ ++ pos = 0; ++ statep = &(line->state); ++ ++ /* skip fields. */ ++ for (count = 0; count < skip_fields && pos < size; count++) ++ { ++ while (pos < size) ++ { ++ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); ++ ++ if (convfail || !iswblank (wc)) ++ { ++ pos += mblength; ++ break; ++ } ++ pos += mblength; ++ } ++ ++ while (pos < size) ++ { ++ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); ++ ++ if (!convfail && iswblank (wc)) ++ break; ++ ++ pos += mblength; ++ } ++ } ++ ++ /* skip chars.
*/ ++ for (count = 0; count < skip_chars && pos < size; count++) ++ { ++ MBCHAR_TO_WCHAR (wc, mblength, lp, pos, size, statep, convfail); ++ pos += mblength; ++ } ++ ++ return lp + pos; ++} ++#endif ++ + /* Return false if two strings OLD and NEW match, true if not. + OLD and NEW point not to the beginnings of the lines + but rather to the beginnings of the fields to compare. +@@ -235,6 +338,8 @@ find_field (struct linebuffer const *lin + static bool + different (char *old, char *new, size_t oldlen, size_t newlen) + { ++ char *copy_old, *copy_new; ++ + if (check_chars < oldlen) + oldlen = check_chars; + if (check_chars < newlen) +@@ -242,15 +347,93 @@ different (char *old, char *new, size_t + + if (ignore_case) + { +- /* FIXME: This should invoke strcoll somehow. */ +- return oldlen != newlen || memcasecmp (old, new, oldlen); ++ size_t i; ++ ++ copy_old = alloca (oldlen + 1); ++ copy_new = alloca (oldlen + 1); ++ ++ for (i = 0; i < oldlen; i++) ++ { ++ copy_old[i] = toupper (old[i]); ++ copy_new[i] = toupper (new[i]); ++ } + } +- else if (hard_LC_COLLATE) +- return xmemcoll (old, oldlen, new, newlen) != 0; + else +- return oldlen != newlen || memcmp (old, new, oldlen); ++ { ++ copy_old = (char *)old; ++ copy_new = (char *)new; ++ } ++ ++ return xmemcoll (copy_old, oldlen, copy_new, newlen); + } + ++#if HAVE_MBRTOWC ++static int ++different_multi (const char *old, const char *new, size_t oldlen, size_t newlen, mbstate_t oldstate, mbstate_t newstate) ++{ ++ size_t i, j, chars; ++ const char *str[2]; ++ char *copy[2]; ++ size_t len[2]; ++ mbstate_t state[2]; ++ size_t mblength; ++ wchar_t wc, uwc; ++ mbstate_t state_bak; ++ ++ str[0] = old; ++ str[1] = new; ++ len[0] = oldlen; ++ len[1] = newlen; ++ state[0] = oldstate; ++ state[1] = newstate; ++ ++ for (i = 0; i < 2; i++) ++ { ++ copy[i] = alloca (len[i] + 1); ++ ++ for (j = 0, chars = 0; j < len[i] && chars < check_chars; chars++) ++ { ++ state_bak = state[i]; ++ mblength = mbrtowc (&wc, str[i] + j, len[i] - j, 
&(state[i])); ++ ++ switch (mblength) ++ { ++ case (size_t)-1: ++ case (size_t)-2: ++ state[i] = state_bak; ++ /* Fall through */ ++ case 0: ++ mblength = 1; ++ break; ++ ++ default: ++ if (ignore_case) ++ { ++ uwc = towupper (wc); ++ ++ if (uwc != wc) ++ { ++ mbstate_t state_wc; ++ ++ memset (&state_wc, '\0', sizeof(mbstate_t)); ++ wcrtomb (copy[i] + j, uwc, &state_wc); ++ } ++ else ++ memcpy (copy[i] + j, str[i] + j, mblength); ++ } ++ else ++ memcpy (copy[i] + j, str[i] + j, mblength); ++ } ++ j += mblength; ++ } ++ copy[i][j] = '\0'; ++ len[i] = j; ++ } ++ ++ return xmemcoll (copy[0], len[0], copy[1], len[1]); ++} ++#endif ++ + /* Output the line in linebuffer LINE to standard output + provided that the switches say it should be output. + MATCH is true if the line matches the previous line. +@@ -303,15 +486,43 @@ check_file (const char *infile, const ch + { + char *prevfield IF_LINT (= NULL); + size_t prevlen IF_LINT (= 0); ++#if HAVE_MBRTOWC ++ mbstate_t prevstate; ++ ++ memset (&prevstate, '\0', sizeof (mbstate_t)); ++#endif + + while (!feof (stdin)) + { + char *thisfield; + size_t thislen; ++#if HAVE_MBRTOWC ++ mbstate_t thisstate; ++#endif ++ + if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) + break; + thisfield = find_field (thisline); + thislen = thisline->length - 1 - (thisfield - thisline->buffer); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ thisstate = thisline->state; ++ ++ if (prevline->length == 0 || different_multi ++ (thisfield, prevfield, thislen, prevlen, thisstate, prevstate)) ++ { ++ fwrite (thisline->buffer, sizeof (char), ++ thisline->length, stdout); ++ ++ SWAP_LINES (prevline, thisline); ++ prevfield = thisfield; ++ prevlen = thislen; ++ prevstate = thisstate; ++ } ++ } ++ else ++#endif + if (prevline->length == 0 + || different (thisfield, prevfield, thislen, prevlen)) + { +@@ -330,17 +541,26 @@ check_file (const char *infile, const ch + size_t prevlen; + uintmax_t match_count = 0; + bool first_delimiter = true; ++#if 
HAVE_MBRTOWC ++ mbstate_t prevstate; ++#endif + + if (readlinebuffer_delim (prevline, stdin, delimiter) == 0) + goto closefiles; + prevfield = find_field (prevline); + prevlen = prevline->length - 1 - (prevfield - prevline->buffer); ++#if HAVE_MBRTOWC ++ prevstate = prevline->state; ++#endif + + while (!feof (stdin)) + { + bool match; + char *thisfield; + size_t thislen; ++#if HAVE_MBRTOWC ++ mbstate_t thisstate; ++#endif + if (readlinebuffer_delim (thisline, stdin, delimiter) == 0) + { + if (ferror (stdin)) +@@ -349,6 +569,15 @@ check_file (const char *infile, const ch + } + thisfield = find_field (thisline); + thislen = thisline->length - 1 - (thisfield - thisline->buffer); ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ thisstate = thisline->state; ++ match = !different_multi (thisfield, prevfield, ++ thislen, prevlen, thisstate, prevstate); ++ } ++ else ++#endif + match = !different (thisfield, prevfield, thislen, prevlen); + match_count += match; + +@@ -381,6 +610,9 @@ check_file (const char *infile, const ch + SWAP_LINES (prevline, thisline); + prevfield = thisfield; + prevlen = thislen; ++#if HAVE_MBRTOWC ++ prevstate = thisstate; ++#endif + if (!match) + match_count = 0; + } +@@ -426,6 +658,19 @@ main (int argc, char **argv) + + atexit (close_stdout); + ++#if HAVE_MBRTOWC ++ if (MB_CUR_MAX > 1) ++ { ++ find_field = find_field_multi; ++ } ++ else ++#endif ++ { ++ find_field = find_field_uni; ++ } ++ ++ ++ + skip_chars = 0; + skip_fields = 0; + check_chars = SIZE_MAX; +Index: tests/Makefile.am +=================================================================== +--- tests/Makefile.am.orig 2010-04-20 21:52:05.000000000 +0200 ++++ tests/Makefile.am 2010-05-07 16:38:36.972072320 +0200 +@@ -224,6 +224,7 @@ TESTS = \ + misc/sort-compress \ + misc/sort-continue \ + misc/sort-files0-from \ ++ misc/sort-mb-tests \ + misc/sort-merge \ + misc/sort-merge-fdlimit \ + misc/sort-month \ +@@ -474,6 +475,10 @@ TESTS = \ + $(root_tests) + + pr_data = \ ++ misc/mb1.X \ ++ 
misc/mb1.I \ ++ misc/mb2.X \ ++ misc/mb2.I \ + pr/0F \ + pr/0FF \ + pr/0FFnt \ +Index: tests/misc/cut +=================================================================== +--- tests/misc/cut.orig 2010-01-01 14:06:47.000000000 +0100 ++++ tests/misc/cut 2010-05-07 16:13:31.144492080 +0200 +@@ -26,7 +26,7 @@ use strict; + my $prog = 'cut'; + my $try = "Try \`$prog --help' for more information.\n"; + my $from_1 = "$prog: fields and positions are numbered from 1\n$try"; +-my $inval = "$prog: invalid byte or field list\n$try"; ++my $inval = "$prog: invalid byte, character or field list\n$try"; + my $no_endpoint = "$prog: invalid range with no endpoint: -\n$try"; + + my @Tests = +@@ -141,7 +141,7 @@ my @Tests = + + # None of the following invalid ranges provoked an error up to coreutils-6.9. + ['inval1', qw(-f 2-0), {IN=>''}, {OUT=>''}, {EXIT=>1}, +- {ERR=>"$prog: invalid decreasing range\n$try"}], ++ {ERR=>"$prog: invalid byte, character or field list\n$try"}], + ['inval2', qw(-f -), {IN=>''}, {OUT=>''}, {EXIT=>1}, {ERR=>$no_endpoint}], + ['inval3', '-f', '4,-', {IN=>''}, {OUT=>''}, {EXIT=>1}, {ERR=>$no_endpoint}], + ['inval4', '-f', '1-2,-', {IN=>''}, {OUT=>''}, {EXIT=>1}, {ERR=>$no_endpoint}], +Index: tests/misc/mb1.I +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/mb1.I 2010-05-07 16:13:31.188492096 +0200 +@@ -0,0 +1,4 @@ ++Apple@10 ++Banana@5 ++Citrus@20 ++Cherry@30 +Index: tests/misc/mb1.X +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/mb1.X 2010-05-07 16:13:31.224492101 +0200 +@@ -0,0 +1,4 @@ ++Banana@5 ++Apple@10 ++Citrus@20 ++Cherry@30 +Index: tests/misc/mb2.I +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/mb2.I 2010-05-07 16:13:31.248492220 +0200 +@@ -0,0 +1,4 @@ ++Apple@AA10@@20 ++Banana@AA5@@30 
++Citrus@AA20@@5 ++Cherry@AA30@@10 +Index: tests/misc/mb2.X +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/mb2.X 2010-05-07 16:13:31.276492153 +0200 +@@ -0,0 +1,4 @@ ++Citrus@AA20@@5 ++Cherry@AA30@@10 ++Apple@AA10@@20 ++Banana@AA5@@30 +Index: tests/misc/sort-mb-tests +=================================================================== +--- /dev/null 1970-01-01 00:00:00.000000000 +0000 ++++ tests/misc/sort-mb-tests 2010-05-07 16:13:31.312492158 +0200 +@@ -0,0 +1,58 @@ ++#! /bin/sh ++case $# in ++ 0) xx='../src/sort';; ++ *) xx="$1";; ++esac ++test "$VERBOSE" && echo=echo || echo=: ++$echo testing program: $xx ++errors=0 ++test "$srcdir" || srcdir=. ++test "$VERBOSE" && $xx --version 2> /dev/null ++ ++export LC_ALL=en_US.UTF-8 ++locale -k LC_CTYPE 2>&1 | grep -q charmap.*UTF-8 || exit 77 ++errors=0 ++ ++$xx -t @ -k2 -n misc/mb1.I > misc/mb1.O ++code=$? ++if test $code != 0; then ++ $echo "Test mb1 failed: $xx return code $code differs from expected value 0" 1>&2 ++ errors=`expr $errors + 1` ++else ++ cmp misc/mb1.O $srcdir/misc/mb1.X > /dev/null 2>&1 ++ case $? in ++ 0) if test "$VERBOSE"; then $echo "passed mb1"; fi;; ++ 1) $echo "Test mb1 failed: files misc/mb1.O and $srcdir/misc/mb1.X differ" 1>&2 ++ (diff -c misc/mb1.O $srcdir/misc/mb1.X) 2> /dev/null ++ errors=`expr $errors + 1`;; ++ 2) $echo "Test mb1 may have failed." 1>&2 ++ $echo The command "cmp misc/mb1.O $srcdir/misc/mb1.X" failed. 1>&2 ++ errors=`expr $errors + 1`;; ++ esac ++fi ++ ++$xx -t @ -k4 -n misc/mb2.I > misc/mb2.O ++code=$? ++if test $code != 0; then ++ $echo "Test mb2 failed: $xx return code $code differs from expected value 0" 1>&2 ++ errors=`expr $errors + 1` ++else ++ cmp misc/mb2.O $srcdir/misc/mb2.X > /dev/null 2>&1 ++ case $? 
in ++ 0) if test "$VERBOSE"; then $echo "passed mb2"; fi;; ++ 1) $echo "Test mb2 failed: files misc/mb2.O and $srcdir/misc/mb2.X differ" 1>&2 ++ (diff -c misc/mb2.O $srcdir/misc/mb2.X) 2> /dev/null ++ errors=`expr $errors + 1`;; ++ 2) $echo "Test mb2 may have failed." 1>&2 ++ $echo The command "cmp misc/mb2.O $srcdir/misc/mb2.X" failed. 1>&2 ++ errors=`expr $errors + 1`;; ++ esac ++fi ++ ++if test $errors = 0; then ++ $echo Passed all 113 tests. 1>&2 ++else ++ $echo Failed $errors tests. 1>&2 ++fi ++test $errors = 0 || errors=1 ++exit $errors diff --git a/coreutils-8.5.patch b/coreutils-8.5.patch new file mode 100644 index 0000000..159f791 --- /dev/null +++ b/coreutils-8.5.patch @@ -0,0 +1,67 @@ +Index: gnulib-tests/test-isnanl.h +=================================================================== +--- gnulib-tests/test-isnanl.h.orig 2010-03-13 16:21:09.000000000 +0100 ++++ gnulib-tests/test-isnanl.h 2010-05-05 13:47:16.003024388 +0200 +@@ -63,7 +63,7 @@ main () + /* Quiet NaN. */ + ASSERT (isnanl (NaNl ())); + +-#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT ++#if defined LDBL_EXPBIT0_WORD && defined LDBL_EXPBIT0_BIT && 0 + /* A bit pattern that is different from a Quiet NaN. With a bit of luck, + it's a Signalling NaN. 
*/ + { +@@ -105,6 +105,7 @@ main () + { LDBL80_WORDS (0xFFFF, 0x83333333, 0x00000000) }; + ASSERT (isnanl (x.value)); + } ++#if 0 + /* The isnanl function should recognize Pseudo-NaNs, Pseudo-Infinities, + Pseudo-Zeroes, Unnormalized Numbers, and Pseudo-Denormals, as defined in + Intel IA-64 Architecture Software Developer's Manual, Volume 1: +@@ -138,6 +139,7 @@ main () + ASSERT (isnanl (x.value)); + } + #endif ++#endif + + return 0; + } +Index: src/system.h +=================================================================== +--- src/system.h.orig 2010-04-20 21:52:05.000000000 +0200 ++++ src/system.h 2010-05-05 13:38:20.923127872 +0200 +@@ -138,7 +138,7 @@ enum + # define DEV_BSIZE BBSIZE + #endif + #ifndef DEV_BSIZE +-# define DEV_BSIZE 4096 ++# define DEV_BSIZE 512 + #endif + + /* Extract or fake data from a `struct stat'. +Index: tests/misc/help-version +=================================================================== +--- tests/misc/help-version.orig 2010-04-20 21:52:05.000000000 +0200 ++++ tests/misc/help-version 2010-05-05 13:44:11.919859133 +0200 +@@ -239,6 +239,7 @@ lbracket_setup () { args=": ]"; } + for i in $built_programs; do + # Skip these. 
+ case $i in chroot|stty|tty|false|chcon|runcon) continue;; esac ++ case $i in df) continue;; esac + + rm -rf $tmp_in $tmp_in2 $tmp_dir $tmp_out $bigZ_in $zin $zin2 + echo z |gzip > $zin +Index: tests/other-fs-tmpdir +=================================================================== +--- tests/other-fs-tmpdir.orig 2010-01-01 14:06:47.000000000 +0100 ++++ tests/other-fs-tmpdir 2010-05-05 13:38:20.982872202 +0200 +@@ -43,6 +43,8 @@ for d in $CANDIDATE_TMP_DIRS; do + fi + + done ++# Autobuild hack ++test -f /bin/uname.bin && other_partition_tmpdir= + + if test -z "$other_partition_tmpdir"; then + skip_test_ \ diff --git a/coreutils-8.5.tar.xz b/coreutils-8.5.tar.xz new file mode 100644 index 0000000..cd6bae3 --- /dev/null +++ b/coreutils-8.5.tar.xz @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5aa855caa08b94ccd632510d9ab265646d2ee11498c7efff205b27c2437dec5a +size 4531488 diff --git a/coreutils-add_ogv.patch b/coreutils-add_ogv.patch index b6fbd40..9b43b11 100644 --- a/coreutils-add_ogv.patch +++ b/coreutils-add_ogv.patch @@ -1,6 +1,8 @@ ---- src/dircolors.hin -+++ src/dircolors.hin -@@ -151,6 +151,7 @@ +Index: src/dircolors.hin +=================================================================== +--- src/dircolors.hin.orig 2010-04-20 21:52:04.000000000 +0200 ++++ src/dircolors.hin 2010-05-05 16:22:16.375859309 +0200 +@@ -158,6 +158,7 @@ EXEC 01;32 .m2v 01;35 .mkv 01;35 .ogm 01;35 diff --git a/coreutils-cifs-afs.diff b/coreutils-cifs-afs.diff deleted file mode 100644 index 41cd49f..0000000 --- a/coreutils-cifs-afs.diff +++ /dev/null @@ -1,35 +0,0 @@ ---- src/fs.h -+++ src/fs.h -@@ -5,10 +5,12 @@ - #if defined __linux__ - # define S_MAGIC_ADFS 0xADF5 - # define S_MAGIC_AFFS 0xADFF -+# define S_MAGIC_AFS 0x6B414653 - # define S_MAGIC_AUTOFS 0x187 - # define S_MAGIC_BEFS 0x42465331 - # define S_MAGIC_BFS 0x1BADFACE - # define S_MAGIC_BINFMT_MISC 0x42494e4d -+# define S_MAGIC_CIFS 0xFF534D42 - # define S_MAGIC_CODA 0x73757245 - # define 
S_MAGIC_COH 0x012FF7B7 - # define S_MAGIC_CRAMFS 0x28CD3D45 ---- src/stat.c -+++ src/stat.c -@@ -219,6 +219,8 @@ human_fstype (STRUCT_STATVFS const *stat - return "adfs"; - case S_MAGIC_AFFS: /* 0xADFF */ - return "affs"; -+ case S_MAGIC_AFS: /* 0x6B414653 */ -+ return "afs"; - case S_MAGIC_AUTOFS: /* 0x187 */ - return "autofs"; - case S_MAGIC_BEFS: /* 0x42465331 */ -@@ -227,6 +229,8 @@ human_fstype (STRUCT_STATVFS const *stat - return "bfs"; - case S_MAGIC_BINFMT_MISC: /* 0x42494e4d */ - return "binfmt_misc"; -+ case S_MAGIC_CIFS: /* 0xFF534D42 */ -+ return "cifs"; - case S_MAGIC_CODA: /* 0x73757245 */ - return "coda"; - case S_MAGIC_COH: /* 0x012FF7B7 */ diff --git a/coreutils-fix_distcheck.patch b/coreutils-fix_distcheck.patch deleted file mode 100644 index 9fc3c8e..0000000 --- a/coreutils-fix_distcheck.patch +++ /dev/null @@ -1,80 +0,0 @@ -Index: maint.mk -=================================================================== ---- maint.mk.orig 2009-02-18 16:13:19.000000000 +0100 -+++ maint.mk 2010-05-04 17:45:14.515359143 +0200 -@@ -623,14 +623,14 @@ bin=bin-$$$$ - - write_loser = printf '\#!%s\necho $$0: bad path 1>&2; exit 1\n' '$(SHELL)' - --TMPDIR ?= /tmp --t=$(TMPDIR)/$(PACKAGE)/test -+tmpdir = $(abs_top_builddir)/tests/torture -+ - pfx=$(t)/i - - # More than once, tainted build and source directory names would - # have caused at least one "make check" test to apply "chmod 700" - # to all directories under $HOME. Make sure it doesn't happen again. 
--tp := $(shell echo "$(TMPDIR)/$(PACKAGE)-$$$$")
-+tp = $(tmpdir)/taint
- t_prefix = $(tp)/a
- t_taint = '$(t_prefix) b'
- fake_home = $(tp)/home
-@@ -648,10 +648,11 @@ taint-distcheck: $(DIST_ARCHIVES)
- touch $(fake_home)/f
- mkdir -p $(fake_home)/d/e
- ls -lR $(fake_home) $(t_prefix) > $(tp)/.ls-before
-+ HOME=$(fake_home); export HOME; \
- cd $(t_taint)/$(distdir) \
- && ./configure \
- && $(MAKE) \
-- && HOME=$(fake_home) $(MAKE) check \
-+ && $(MAKE) check \
- && ls -lR $(fake_home) $(t_prefix) > $(tp)/.ls-after \
- && diff $(tp)/.ls-before $(tp)/.ls-after \
- && test -d $(t_prefix)
-@@ -670,6 +671,7 @@ endef
- # Install, then verify that all binaries and man pages are in place.
- # Note that neither the binary, ginstall, nor the ].1 man page is installed.
- define my-instcheck
-+ echo running my-instcheck; \
- $(MAKE) prefix=$(pfx) install \
- && test ! -f $(pfx)/bin/ginstall \
- && { fail=0; \
-@@ -688,6 +690,7 @@ endef
-
- define coreutils-path-check
- { \
-+ echo running coreutils-path-check; \
- if test -f $(srcdir)/src/true.c; then \
- fail=1; \
- mkdir $(bin) \
-@@ -732,19 +735,20 @@ my-distcheck: $(DIST_ARCHIVES) $(local-c
- -rm -rf $(t)
- mkdir -p $(t)
- GZIP=$(GZIP_ENV) $(AMTAR) -C $(t) -zxf $(distdir).tar.gz
-- cd $(t)/$(distdir) \
-- && ./configure --disable-nls \
-- && $(MAKE) CFLAGS='$(warn_cflags)' \
-- AM_MAKEFLAGS='$(null_AM_MAKEFLAGS)' \
-- && $(MAKE) dvi \
-- && $(install-transform-check) \
-- && $(my-instcheck) \
-- && $(coreutils-path-check) \
-+ cd $(t)/$(distdir) \
-+ && ./configure --quiet --enable-gcc-warnings --disable-nls \
-+ && $(MAKE) CFLAGS='$(warn_cflags)' \
-+ AM_MAKEFLAGS='$(null_AM_MAKEFLAGS)' \
-+ && $(MAKE) dvi \
-+ && $(install-transform-check) \
-+ && $(my-instcheck) \
-+ && $(coreutils-path-check) \
- && $(MAKE) distclean
- (cd $(t) && mv $(distdir) $(distdir).old \
- && $(AMTAR) -zxf - ) < $(distdir).tar.gz
- diff -ur $(t)/$(distdir).old $(t)/$(distdir)
- -rm -rf $(t)
-+ rmdir $(tmpdir)/$(PACKAGE) $(tmpdir)
- @echo "========================"; \
- echo "$(distdir).tar.gz is ready for distribution"; \
- echo "========================"
diff --git a/coreutils-getaddrinfo.diff b/coreutils-getaddrinfo.diff
deleted file mode 100644
index 39a0f38..0000000
--- a/coreutils-getaddrinfo.diff
+++ /dev/null
@@ -1,16 +0,0 @@
-Index: coreutils-6.9.90/gnulib-tests/test-getaddrinfo.c
-================================================================================
---- coreutils-7.1/gnulib-tests/test-getaddrinfo.c
-+++ coreutils-7.1/gnulib-tests/test-getaddrinfo.c
-@@ -71,10 +71,7 @@ int simple (char *host, char *service)
- the test merely because someone is down the country on their
- in-law's farm. */
- if (res == EAI_AGAIN)
-- {
-- fprintf (stderr, "skipping getaddrinfo test: no network?\n");
-- return 77;
-- }
-+ return 0;
- /* IRIX reports EAI_NONAME for "https". Don't fail the test
- merely because of this. */
- if (res == EAI_NONAME)
diff --git a/coreutils-getaddrinfo.patch b/coreutils-getaddrinfo.patch
new file mode 100644
index 0000000..d5b0720
--- /dev/null
+++ b/coreutils-getaddrinfo.patch
@@ -0,0 +1,17 @@
+Index: gnulib-tests/test-getaddrinfo.c
+===================================================================
+--- gnulib-tests/test-getaddrinfo.c.orig 2010-03-13 16:21:08.000000000 +0100
++++ gnulib-tests/test-getaddrinfo.c 2010-05-05 14:51:40.343025353 +0200
+@@ -88,11 +88,7 @@ simple (char const *host, char const *se
+ the test merely because someone is down the country on their
+ in-law's farm. */
+ if (res == EAI_AGAIN)
+- {
+- skip++;
+- fprintf (stderr, "skipping getaddrinfo test: no network?\n");
+- return 77;
+- }
++ return 0;
+ /* IRIX reports EAI_NONAME for "https". Don't fail the test
+ merely because of this. */
+ if (res == EAI_NONAME)
diff --git a/coreutils-gl_printf_safe.patch b/coreutils-gl_printf_safe.patch
new file mode 100644
index 0000000..ed5cef0
--- /dev/null
+++ b/coreutils-gl_printf_safe.patch
@@ -0,0 +1,24 @@
+Index: configure
+===================================================================
+--- configure.orig 2010-04-23 18:06:40.000000000 +0200
++++ configure 2010-05-05 13:40:11.419859163 +0200
+@@ -3340,7 +3340,6 @@ as_fn_append ac_func_list " alarm"
+ as_fn_append ac_header_list " sys/statvfs.h"
+ as_fn_append ac_header_list " sys/select.h"
+ as_fn_append ac_func_list " nl_langinfo"
+-gl_printf_safe=yes
+ as_fn_append ac_header_list " utmp.h"
+ as_fn_append ac_header_list " utmpx.h"
+ as_fn_append ac_func_list " utmpname"
+Index: m4/gnulib-comp.m4
+===================================================================
+--- m4/gnulib-comp.m4.orig 2010-04-21 20:12:06.000000000 +0200
++++ m4/gnulib-comp.m4 2010-05-05 13:40:58.875859176 +0200
+@@ -1158,7 +1158,6 @@ AC_DEFUN([gl_INIT],
+ # Code from module printf-frexpl:
+ gl_FUNC_PRINTF_FREXPL
+ # Code from module printf-safe:
+- m4_divert_text([INIT_PREPARE], [gl_printf_safe=yes])
+ # Code from module priv-set:
+ gl_PRIV_SET
+ # Code from module progname:
diff --git a/coreutils-i18n-infloop.patch b/coreutils-i18n-infloop.patch
new file mode 100644
index 0000000..ede0365
--- /dev/null
+++ b/coreutils-i18n-infloop.patch
@@ -0,0 +1,14 @@
+Index: src/sort.c
+===================================================================
+--- src/sort.c.orig 2010-05-07 16:52:08.068491875 +0200
++++ src/sort.c 2010-05-07 16:53:44.704992155 +0200
+@@ -2720,7 +2720,8 @@ keycompare_mb (const struct line *a, con
+ if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \
+ STATE = state_bak; \
+ if (!ignore) \
+- COPY[NEW_LEN++] = TEXT[i++]; \
++ COPY[NEW_LEN++] = TEXT[i]; \
++ i++; \
+ continue; \
+ } \
+ \
diff --git a/coreutils-i18n-uninit.patch b/coreutils-i18n-uninit.patch
new file mode 100644
index 0000000..c3b8ebc
--- /dev/null
+++ b/coreutils-i18n-uninit.patch
@@ -0,0 +1,16 @@
+Index: src/cut.c
+===================================================================
+--- src/cut.c.orig 2010-05-06 15:16:26.851859241 +0200
++++ src/cut.c 2010-05-06 15:16:27.095859170 +0200
+@@ -878,7 +878,10 @@ cut_fields_mb (FILE *stream)
+ c = getc (stream);
+ empty_input = (c == EOF);
+ if (c != EOF)
+- ungetc (c, stream);
++ {
++ ungetc (c, stream);
++ wc = 0;
++ }
+ else
+ wc = WEOF;
+
diff --git a/coreutils-invalid-ids.patch b/coreutils-invalid-ids.patch
new file mode 100644
index 0000000..a7cdbb1
--- /dev/null
+++ b/coreutils-invalid-ids.patch
@@ -0,0 +1,26 @@
+While uid_t and gid_t are both unsigned, the values (uid_t) -1 and
+(gid_t) -1 are reserved. A uid or gid argument of -1 to the chown(2)
+system call means to leave the uid/gid unchanged. Catch this case
+so that trying to set a uid or gid to -1 will result in an error.
+
+Test cases:
+
+ chown 4294967295 file
+ chown :4294967295 file
+ chgrp 4294967295 file
+
+Andreas Gruenbacher
+
+Index: src/chgrp.c
+===================================================================
+--- src/chgrp.c.orig 2010-01-01 14:06:47.000000000 +0100
++++ src/chgrp.c 2010-05-05 14:03:28.279359192 +0200
+@@ -89,7 +89,7 @@ parse_group (const char *name)
+ {
+ unsigned long int tmp;
+ if (! (xstrtoul (name, NULL, 10, &tmp, "") == LONGINT_OK
+- && tmp <= GID_T_MAX))
++ && tmp <= GID_T_MAX && (gid_t) tmp != (gid_t) -1))
+ error (EXIT_FAILURE, 0, _("invalid group: %s"), quote (name));
+ gid = tmp;
+ }
diff --git a/coreutils-no_hostname_and_hostid.patch b/coreutils-no_hostname_and_hostid.patch
new file mode 100644
index 0000000..b3657e0
--- /dev/null
+++ b/coreutils-no_hostname_and_hostid.patch
@@ -0,0 +1,122 @@
+Index: doc/coreutils.texi
+===================================================================
+--- doc/coreutils.texi.orig 2010-05-06 15:17:48.132359317 +0200
++++ doc/coreutils.texi 2010-05-06 15:21:02.631693747 +0200
+@@ -65,8 +65,6 @@
+ * fold: (coreutils)fold invocation. Wrap long input lines.
+ * groups: (coreutils)groups invocation. Print group names a user is in.
+ * head: (coreutils)head invocation. Output the first part of files.
+-* hostid: (coreutils)hostid invocation. Print numeric host identifier.
+-* hostname: (coreutils)hostname invocation. Print or set system name.
+ * id: (coreutils)id invocation. Print user identity.
+ * install: (coreutils)install invocation. Copy and change attributes.
+ * join: (coreutils)join invocation. Join lines on a common field.
+@@ -197,7 +195,7 @@ Free Documentation License''.
+ * File name manipulation:: dirname basename pathchk mktemp
+ * Working context:: pwd stty printenv tty
+ * User information:: id logname whoami groups users who
+-* System context:: date arch nproc uname hostname hostid uptime
++* System context:: date arch nproc uname uptime
+ * SELinux context:: chcon runcon
+ * Modified command invocation:: chroot env nice nohup stdbuf su timeout
+ * Process control:: kill
+@@ -413,8 +411,6 @@ System context
+ * date invocation:: Print or set system date and time
+ * nproc invocation:: Print the number of processors
+ * uname invocation:: Print system information
+-* hostname invocation:: Print or set system name
+-* hostid invocation:: Print numeric host identifier
+ * uptime invocation:: Print system uptime and load
+
+ @command{date}: Print or set system date and time
+@@ -13449,8 +13445,6 @@ information.
+ * arch invocation:: Print machine hardware name.
+ * nproc invocation:: Print the number of processors.
+ * uname invocation:: Print system information.
+-* hostname invocation:: Print or set system name.
+-* hostid invocation:: Print numeric host identifier.
+ * uptime invocation:: Print system uptime and load.
+ @end menu
+
+@@ -14272,55 +14266,6 @@ Print the kernel version.
+
+ @exitstatus
+
+-
+-@node hostname invocation
+-@section @command{hostname}: Print or set system name
+-
+-@pindex hostname
+-@cindex setting the hostname
+-@cindex printing the hostname
+-@cindex system name, printing
+-@cindex appropriate privileges
+-
+-With no arguments, @command{hostname} prints the name of the current host
+-system. With one argument, it sets the current host name to the
+-specified string. You must have appropriate privileges to set the host
+-name. Synopsis:
+-
+-@example
+-hostname [@var{name}]
+-@end example
+-
+-The only options are @option{--help} and @option{--version}. @xref{Common
+-options}.
+-
+-@exitstatus
+-
+-
+-@node hostid invocation
+-@section @command{hostid}: Print numeric host identifier
+-
+-@pindex hostid
+-@cindex printing the host identifier
+-
+-@command{hostid} prints the numeric identifier of the current host
+-in hexadecimal. This command accepts no arguments.
+-The only options are @option{--help} and @option{--version}.
+-@xref{Common options}.
+-
+-For example, here's what it prints on one system I use:
+-
+-@example
+-$ hostid
+-1bac013d
+-@end example
+-
+-On that system, the 32-bit quantity happens to be closely
+-related to the system's Internet address, but that isn't always
+-the case.
+-
+-@exitstatus
+-
+ @node uptime invocation
+ @section @command{uptime}: Print system uptime and load
+
+Index: man/Makefile.am
+===================================================================
+--- man/Makefile.am.orig 2010-05-06 15:17:48.136359276 +0200
++++ man/Makefile.am 2010-05-06 15:18:44.844359168 +0200
+@@ -197,7 +197,7 @@ check-x-vs-1:
+ @PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \
+ t=$@-t; \
+ (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\
+- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \
++ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \
+ | tr -s ' ' '\n' | sed 's/\.1$$//') \
+ | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \
+ rm $$t
+Index: man/Makefile.in
+===================================================================
+--- man/Makefile.in.orig 2010-05-06 15:17:48.136359276 +0200
++++ man/Makefile.in 2010-05-06 15:18:44.875852631 +0200
+@@ -1574,7 +1574,7 @@ check-x-vs-1:
+ @PATH=../src$(PATH_SEPARATOR)$$PATH; export PATH; \
+ t=$@-t; \
+ (cd $(srcdir) && ls -1 *.x) | sed 's/\.x$$//' | $(ASSORT) > $$t;\
+- (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) \
++ (echo $(dist_man1_MANS) $(NO_INSTALL_PROGS_DEFAULT) hostid \
+ | tr -s ' ' '\n' | sed 's/\.1$$//') \
+ | $(ASSORT) -u | diff - $$t || { rm $$t; exit 1; }; \
+ rm $$t
diff --git a/coreutils-sysinfo.diff b/coreutils-sysinfo.patch
similarity index 86%
rename from coreutils-sysinfo.diff
rename to coreutils-sysinfo.patch
index 3096103..4e5b9c4 100644
--- a/coreutils-sysinfo.diff
+++ b/coreutils-sysinfo.patch
@@ -1,10 +1,10 @@
 Index: src/uname.c
 ===================================================================
---- src/uname.c.orig 2010-05-04 17:27:48.679359310 +0200
-+++ src/uname.c 2010-05-04 17:29:03.011859260 +0200
+--- src/uname.c.orig 2010-01-01 14:06:47.000000000 +0100
++++ src/uname.c 2010-05-05 13:58:03.471359120 +0200
 @@ -339,6 +339,36 @@ main (int argc, char **argv)
  # endif
- }
+ }
  #endif
 + if (element == unknown)
 + {
@@ -37,11 +37,11 @@ Index: src/uname.c
 +#endif
 + }
  if (! (toprint == UINT_MAX && element == unknown))
- print_element (element);
+ print_element (element);
  }
 @@ -364,6 +394,18 @@ main (int argc, char **argv)
- element = hardware_platform;
- }
+ element = hardware_platform;
+ }
  #endif
 + if (element == unknown)
 + {
@@ -56,5 +56,5 @@ Index: src/uname.c
 + element = hardware_platform;
 + }
  if (! (toprint == UINT_MAX && element == unknown))
- print_element (element);
+ print_element (element);
  }
diff --git a/coreutils.changes b/coreutils.changes
index cfcabee..d023ebf 100644
--- a/coreutils.changes
+++ b/coreutils.changes
@@ -1,9 +1,78 @@
+-------------------------------------------------------------------
+Thu Jul 1 21:23:40 UTC 2010 - jengelh@medozas.de
+
+- Use %_smp_mflags
+
 -------------------------------------------------------------------
 Tue Jun 29 20:18:04 CEST 2010 - pth@suse.de
 
 - Fix 'sort -V' not working because the i18n (mb handling) patch wasn't
 updated to handle the new option (bnc#615073).
 
+-------------------------------------------------------------------
+Mon Jun 28 12:52:15 CEST 2010 - pth@suse.de
+
+- Fix typo in spec file (% missing from version).
+
+-------------------------------------------------------------------
+Fri Jun 18 11:57:47 CEST 2010 - kukuk@suse.de
+
+- Last part of fix for [bnc#533249]: Don't run account part of
+ PAM stack for su as root. Requires pam > 1.1.1.
+
+-------------------------------------------------------------------
+Fri May 7 15:44:53 UTC 2010 - pth@novell.com
+
+- Update to 8.5:
+ Bug fixes
+ * cp and mv once again support preserving extended attributes.
+ * cp now preserves "capabilities" when also preserving file ownership.
+ * ls --color once again honors the 'NORMAL' dircolors directive.
+ [bug introduced in coreutils-6.11]
+ * sort -M now handles abbreviated months that are aligned using
+ blanks in the locale database. Also locales with 8 bit characters
+ are handled correctly, including multi byte locales with the caveat
+ that multi byte characters are matched case sensitively.
+ * sort again handles obsolescent key formats (+POS -POS) correctly.
+ Previously if -POS was specified, 1 field too many was used in the
+ sort. [bug introduced in coreutils-7.2]
+
+ New features
+
+ * join now accepts the --header option, to treat the first line of
+ each file as a header line to be joined and printed
+ unconditionally.
+
+ * timeout now accepts the --kill-after option which sends a kill
+ signal to the monitored command if it's still running the specified
+ duration after the initial signal was sent.
+
+ * who: the "+/-" --mesg (-T) indicator of whether a user/tty is
+ accepting messages could be incorrectly listed as "+", when in
+ fact, the user was not accepting messages (mesg no). Before, who
+ would examine only the permission bits, and not consider the group
+ of the TTY device file. Thus, if a login tty's group would change
+ somehow e.g., to "root", that would make it unwritable (via
+ write(1)) by normal users, in spite of whatever the permission bits
+ might imply. Now, when configured using the
+ --with-tty-group[=NAME] option, who also compares the group of the
+ TTY device with NAME (or "tty" if no group name is specified).
+
+ Changes in behavior
+
+ * ls --color no longer emits the final 3-byte color-resetting escape
+ sequence when it would be a no-op.
+
+ * join -t '' no longer emits an error and instead operates on each
+ line as a whole (even if they contain NUL characters).
+
+ For other changes since 7.1 see NEWS.
+- Split-up coreutils-%%{version}.diff as far as possible.
+- Prefix all patches with coreutils-.
+- All patches have the .patch suffix.
+- Use the i18n patch from Archlinux as it fixes at least one test
+ suite failure.
+
 -------------------------------------------------------------------
 Tue May 4 17:13:37 UTC 2010 - pth@novell.com
 
diff --git a/coreutils.spec b/coreutils.spec
index f3a1de5..93cfa3d 100644
--- a/coreutils.spec
+++ b/coreutils.spec
@@ -1,5 +1,5 @@
 #
-# spec file for package coreutils (Version 7.1)
+# spec file for package coreutils (Version 8.5)
 #
 # Copyright (c) 2010 SUSE LINUX Products GmbH, Nuernberg, Germany.
 #
@@ -23,34 +23,32 @@ BuildRequires: help2man libacl-devel libcap-devel libselinux-devel pam-devel xz
 Url: http://www.gnu.org/software/coreutils/
 License: GFDLv1.2 ; GPLv2+ ; GPLv3+
 Group: System/Base
-Version: 7.1
-Release: 6
-Provides: fileutils sh-utils stat textutils mktemp
-Obsoletes: fileutils sh-utils stat textutils mktemp
+Version: 8.5
+Release: 1
+Provides: fileutils = %{version}, sh-utils = %{version}, stat = %version}, textutils = %{version}, mktemp = %{version}
+Obsoletes: fileutils < %{version}, sh-utils < %{version}, stat < %version}, textutils < %{version}, mktemp < %{version}
 Obsoletes: libselinux <= 1.23.11-3 libselinux-32bit = 9 libselinux-64bit = 9 libselinux-x86 = 9
 AutoReqProv: on
 PreReq: %{install_info_prereq}
 Requires: %{name}-lang = %version
+Requires: pam >= 1.1.1.90
 Source: coreutils-%{version}.tar.xz
 Source1: su.pamd
 Source2: su.default
 Source3: baselibs.conf
-Patch: coreutils-%{version}.diff
-Patch4: coreutils-5.3.0-i18n-0.1.patch
-Patch5: i18n-uninit.diff
-Patch6: i18n-infloop.diff
-Patch8: coreutils-sysinfo.diff
-Patch11: i18n-monthsort.diff
-Patch12: i18n-random.diff
-Patch16: invalid-ids.diff
-Patch17: i18n-limfield.diff
-Patch20: coreutils-6.8-su.diff
-Patch21: coreutils-6.8.0-pie.diff
-Patch22: coreutils-5.3.0-sbin4su.diff
-Patch23: coreutils-getaddrinfo.diff
-Patch25: coreutils-cifs-afs.diff
+Patch0: coreutils-%{version}.patch
+Patch1: coreutils-no_hostname_and_hostid.patch
+Patch2: coreutils-gl_printf_safe.patch
+Patch4: coreutils-8.5-i18n.patch
+Patch5: coreutils-i18n-uninit.patch
+Patch6: coreutils-i18n-infloop.patch
+Patch8: coreutils-sysinfo.patch
+Patch16: coreutils-invalid-ids.patch
+Patch20: coreutils-6.8-su.patch
+Patch21: coreutils-6.8.0-pie.patch
+Patch22: coreutils-5.3.0-sbin4su.patch
+Patch23: coreutils-getaddrinfo.patch
 Patch26: coreutils-add_ogv.patch
-Patch27: coreutils-fix_distcheck.patch
 BuildRoot: %{_tmppath}/%{name}-%{version}-build
 
 %description
@@ -107,48 +105,44 @@ Authors:
 %lang_package
 %prep
 %setup -q
-%patch4 -p1
+%patch4
 %patch5
 %patch6
-%patch
+%patch0
+%patch1
+%patch2
 %patch8
-%patch11
-%patch12
 %patch16
-%patch17
 %patch20
 %patch21
 %patch22
-%patch23 -p1
-%patch25
+%patch23
 %patch26
-%patch27
 
 %build
-#AUTOPOINT=true autoreconf -fi
-./configure CFLAGS="$RPM_OPT_FLAGS -Wall" \
- --prefix=%{_prefix} --mandir=%{_mandir} \
- --infodir=%{_infodir} --without-included-regex \
+AUTOPOINT=true autoreconf -fi
+export CFLAGS="%optflags -Wall"
+%configure --without-included-regex \
 --enable-install-program=arch,su \
 gl_cv_func_printf_directive_n=yes \
 gl_cv_func_isnanl_works=yes \
 DEFAULT_POSIX2_VERSION=199209
-make %{?jobs:-j%jobs} PAMLIBS="-lpam -ldl"
+make %{?_smp_mflags} PAMLIBS="-lpam -ldl" V=1
 
 %check
 if test $EUID -eq 0; then
- su nobody -c make %{?jobs:-j%jobs} check VERBOSE=yes
- make %{?jobs:-j%jobs} check-root VERBOSE=yes
+ su nobody -c make %{?_smp_mflags} check VERBOSE=yes V=1
+ make %{?_smp_mflags} check-root VERBOSE=yes V=1
 else
 %ifarch %arm
- make -k %{?jobs:-j%jobs} check VERBOSE=yes || echo make check failed
+ make -k %{?_smp_mflags} check VERBOSE=yes V=1 || echo make check failed
 %else
- make %{?jobs:-j%jobs} check VERBOSE=yes
+ make %{?_smp_mflags} check VERBOSE=yes V=1
 %endif
 fi
 
 %install
-make DESTDIR="$RPM_BUILD_ROOT" install
+%makeinstall
 test -f $RPM_BUILD_ROOT%{_bindir}/su || \
 install src/su $RPM_BUILD_ROOT%{_bindir}/su
 install -d $RPM_BUILD_ROOT/bin
@@ -182,6 +176,7 @@ rm -rf $RPM_BUILD_ROOT
 %config /etc/pam.d/su-l
 %config(noreplace) /etc/default/su
 %{_bindir}/*
+%{_libdir}/%{name}
 %doc %{_infodir}/coreutils.info*.gz
 %doc %{_mandir}/man1/*.1.gz
 %dir %{_prefix}/share/locale/*/LC_TIME
diff --git a/i18n-infloop.diff b/i18n-infloop.diff
deleted file mode 100644
index dbfcc29..0000000
--- a/i18n-infloop.diff
+++ /dev/null
@@ -1,14 +0,0 @@
-Index: src/sort.c
-===================================================================
---- src/sort.c.orig 2010-05-04 17:27:49.103359264 +0200
-+++ src/sort.c 2010-05-04 17:28:43.820359291 +0200
-@@ -2540,7 +2540,8 @@ keycompare_mb (const struct line *a, con
- if (MBLENGTH == (size_t)-2 || MBLENGTH == (size_t)-1) \
- STATE = state_bak; \
- if (!ignore) \
-- COPY[NEW_LEN++] = TEXT[i++]; \
-+ COPY[NEW_LEN++] = TEXT[i]; \
-+ i++; \
- continue; \
- } \
- \
diff --git a/i18n-limfield.diff b/i18n-limfield.diff
deleted file mode 100644
index b27c3c9..0000000
--- a/i18n-limfield.diff
+++ /dev/null
@@ -1,100 +0,0 @@
-Index: src/sort.c
-===================================================================
---- src/sort.c.orig 2010-05-04 17:29:12.419359202 +0200
-+++ src/sort.c 2010-05-04 17:29:12.479359419 +0200
-@@ -1731,7 +1731,7 @@ limfield_mb (const struct line *line, co
- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state);
- ptr += mblength;
- }
-- if (ptr < lim)
-+ if (ptr < lim && (eword | echar))
- {
- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state);
- ptr += mblength;
-@@ -1742,11 +1742,6 @@ limfield_mb (const struct line *line, co
- {
- while (ptr < lim && ismbblank (ptr, &mblength))
- ptr += mblength;
-- if (ptr < lim)
-- {
-- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state);
-- ptr += mblength;
-- }
- while (ptr < lim && !ismbblank (ptr, &mblength))
- ptr += mblength;
- }
-@@ -1756,20 +1751,19 @@ limfield_mb (const struct line *line, co
- /* Make LIM point to the end of (one byte past) the current field. */
- if (tab != NULL)
- {
-- char *newlim, *p;
-+ char *newlim;
-
-- newlim = NULL;
-- for (p = ptr; p < lim;)
-- {
-- if (memcmp (p, tab, tab_length) == 0)
-- {
-- newlim = p;
-- break;
-- }
--
-- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state);
-- p += mblength;
-- }
-+ for (newlim = ptr; newlim < lim;)
-+ {
-+ if (memcmp (newlim, tab, tab_length) == 0)
-+ {
-+ lim = newlim;
-+ break;
-+ }
-+
-+ GET_BYTELEN_OF_CHAR (lim, newlim, mblength, state);
-+ newlim += mblength;
-+ }
- }
- else
- {
-@@ -1778,24 +1772,20 @@ limfield_mb (const struct line *line, co
-
- while (newlim < lim && ismbblank (newlim, &mblength))
- newlim += mblength;
-- if (ptr < lim)
-- {
-- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state);
-- ptr += mblength;
-- }
- while (newlim < lim && !ismbblank (newlim, &mblength))
-- newlim += mblength;
-+ newlim += mblength;
- lim = newlim;
- }
- # endif
-
-- /* If we're skipping leading blanks, don't start counting characters
-- until after skipping past any leading blanks. */
-+ /* If we're ignoring leading blanks when computing the End
-+ of the field, don't start counting bytes until after skipping
-+ past any leading blanks. */
- if (key->skipeblanks)
- while (ptr < lim && ismbblank (ptr, &mblength))
- ptr += mblength;
-
-- memset (&state, '\0', sizeof(mbstate_t));
-+ memset (&state, '\0', sizeof (mbstate_t));
-
- /* Advance PTR by ECHAR (if possible), but no further than LIM. */
- for (i = 0; i < echar; i++)
-@@ -1803,9 +1793,9 @@ limfield_mb (const struct line *line, co
- GET_BYTELEN_OF_CHAR (lim, ptr, mblength, state);
-
- if (ptr + mblength > lim)
-- break;
-+ break;
- else
-- ptr += mblength;
-+ ptr += mblength;
- }
-
- return ptr;
diff --git a/i18n-monthsort.diff b/i18n-monthsort.diff
deleted file mode 100644
index 58bf214..0000000
--- a/i18n-monthsort.diff
+++ /dev/null
@@ -1,13 +0,0 @@
-Index: src/sort.c
-===================================================================
---- src/sort.c.orig 2010-05-04 17:28:43.820359291 +0200
-+++ src/sort.c 2010-05-04 17:30:44.507859357 +0200
-@@ -1285,7 +1285,7 @@ inittables_mb (void)
- else
- {
- j += mblength;
-- mblength = wcrtomb (mbc, wc, &state_wc);
-+ mblength = wcrtomb (mbc, pwc, &state_wc);
- assert (mblength != (size_t) 0 && mblength != (size_t) -1);
- }
-
diff --git a/i18n-random.diff b/i18n-random.diff
deleted file mode 100644
index 566e2de..0000000
--- a/i18n-random.diff
+++ /dev/null
@@ -1,16 +0,0 @@
-Index: src/sort.c
-===================================================================
---- src/sort.c.orig 2010-05-04 17:29:12.395359111 +0200
-+++ src/sort.c 2010-05-04 17:29:59.979859336 +0200
-@@ -2494,7 +2494,10 @@ keycompare_mb (const struct line *a, con
- size_t lenb = limb <= textb ? 0 : limb - textb;
-
- /* Actually compare the fields. */
-- if (key->numeric | key->general_numeric)
-+
-+ if (key->random)
-+ diff = compare_random (texta, lena, textb, lenb);
-+ else if (key->numeric | key->general_numeric)
- {
- char savea = *lima, saveb = *limb;
-
diff --git a/i18n-uninit.diff b/i18n-uninit.diff
deleted file mode 100644
index 8952a0d..0000000
--- a/i18n-uninit.diff
+++ /dev/null
@@ -1,29 +0,0 @@
-Index: src/cut.c
-===================================================================
---- src/cut.c.orig 2010-05-04 17:27:29.879859350 +0200
-+++ src/cut.c 2010-05-04 17:27:30.131859395 +0200
-@@ -878,7 +878,10 @@ cut_fields_mb (FILE *stream)
- c = getc (stream);
- empty_input = (c == EOF);
- if (c != EOF)
-- ungetc (c, stream);
-+ {
-+ ungetc (c, stream);
-+ wc = 0;
-+ }
- else
- wc = WEOF;
-
-Index: src/expand.c
-===================================================================
---- src/expand.c.orig 2010-05-04 17:27:29.915859239 +0200
-+++ src/expand.c 2010-05-04 17:27:30.155859324 +0200
-@@ -404,7 +404,7 @@ expand_multibyte (void)
- for (;;)
- {
- /* Input character, or EOF. */
-- wint_t wc;
-+ wint_t wc = 0;
-
- /* If true, perform translations. */
- bool convert = true;
diff --git a/invalid-ids.diff b/invalid-ids.diff
deleted file mode 100644
index 35f435c..0000000
--- a/invalid-ids.diff
+++ /dev/null
@@ -1,49 +0,0 @@
-While uid_t and gid_t are both unsigned, the values (uid_t) -1 and
-(gid_t) -1 are reserved. A uid or gid argument of -1 to the chown(2)
-system call means to leave the uid/gid unchanged. Catch this case
-so that trying to set a uid or gid to -1 will result in an error.
-
-Test cases:
-
- chown 4294967295 file
- chown :4294967295 file
- chgrp 4294967295 file
-
-Andreas Gruenbacher
-
-Index: lib/userspec.c
-===================================================================
---- lib/userspec.c.orig 2010-05-04 17:27:48.479359439 +0200
-+++ lib/userspec.c 2010-05-04 17:29:12.439359267 +0200
-@@ -169,7 +169,7 @@ parse_with_separator (char const *spec,
- {
- unsigned long int tmp;
- if (xstrtoul (u, NULL, 10, &tmp, "") == LONGINT_OK
-- && tmp <= MAXUID)
-+ && tmp <= MAXUID && tmp != (uid_t) -1)
- unum = tmp;
- else
- error_msg = E_invalid_user;
-@@ -200,7 +200,8 @@ parse_with_separator (char const *spec,
- if (grp == NULL)
- {
- unsigned long int tmp;
-- if (xstrtoul (g, NULL, 10, &tmp, "") == LONGINT_OK && tmp <= MAXGID)
-+ if (xstrtoul (g, NULL, 10, &tmp, "") == LONGINT_OK && tmp <= MAXGID
-+ && tmp != (gid_t) -1)
- gnum = tmp;
- else
- error_msg = E_invalid_group;
-Index: src/chgrp.c
-===================================================================
---- src/chgrp.c.orig 2010-05-04 17:27:48.479359439 +0200
-+++ src/chgrp.c 2010-05-04 17:29:12.443359269 +0200
-@@ -89,7 +89,7 @@ parse_group (const char *name)
- {
- unsigned long int tmp;
- if (! (xstrtoul (name, NULL, 10, &tmp, "") == LONGINT_OK
-- && tmp <= GID_T_MAX))
-+ && tmp <= GID_T_MAX && tmp != (gid_t) -1))
- error (EXIT_FAILURE, 0, _("invalid group: %s"), quote (name));
- gid = tmp;
- }
diff --git a/su.pamd b/su.pamd
index b729046..88ddbaf 100644
--- a/su.pamd
+++ b/su.pamd
@@ -1,6 +1,7 @@
 #%PAM-1.0
 auth sufficient pam_rootok.so
 auth include common-auth
+account sufficient pam_rootok.so
 account include common-account
 password include common-password
 session include common-session