Now on revision 107781.
------------------------------------------------------------
revno: 107781
committer: Eli Zaretskii
branch nick: trunk
timestamp: Fri 2012-04-06 16:10:30 +0300
message:
  Warning comments about subtleties of fetching characters from buffers/strings.

   src/buffer.h (FETCH_CHAR, FETCH_MULTIBYTE_CHAR):
   src/character.h (STRING_CHAR, STRING_CHAR_AND_LENGTH): Add comments
   about subtle differences between FETCH_CHAR* and STRING_CHAR*
   macros related to unification of CJK characters. For the details,
   see the discussion following the message here:
   http://debbugs.gnu.org/cgi/bugreport.cgi?bug=11073#14.
diff:
=== modified file 'src/ChangeLog'
--- src/ChangeLog 2012-04-04 07:54:02 +0000
+++ src/ChangeLog 2012-04-06 13:10:30 +0000
@@ -1,3 +1,12 @@
+2012-04-06 Eli Zaretskii
+
+ * buffer.h (FETCH_CHAR, FETCH_MULTIBYTE_CHAR):
+ * character.h (STRING_CHAR, STRING_CHAR_AND_LENGTH): Add comments
+ about subtle differences between FETCH_CHAR* and STRING_CHAR*
+ macros related to unification of CJK characters. For the details,
+ see the discussion following the message here:
+ http://debbugs.gnu.org/cgi/bugreport.cgi?bug=11073#14.
+
 2012-04-04 Chong Yidong
 
  * keyboard.c (Vdelayed_warnings_list): Doc fix.

=== modified file 'src/buffer.h'
--- src/buffer.h 2012-01-19 07:21:25 +0000
+++ src/buffer.h 2012-04-06 13:10:30 +0000
@@ -343,7 +343,8 @@
  - (ptr - (current_buffer)->text->beg <= GPT_BYTE - BEG_BYTE ? 0 : GAP_SIZE) \
  + BEG_BYTE)
 
-/* Return character at byte position POS. */
+/* Return character at byte position POS. See the caveat WARNING for
+   FETCH_MULTIBYTE_CHAR below. */
 
 #define FETCH_CHAR(pos) \
   (!NILP (BVAR (current_buffer, enable_multibyte_characters)) \
@@ -359,7 +360,17 @@
 
 /* Return character code of multi-byte form at byte position POS. If POS
    doesn't point the head of valid multi-byte form, only the byte at
-   POS is returned. No range checking. */
+   POS is returned. No range checking.
+
+   WARNING: The character returned by this macro could be "unified"
+   inside STRING_CHAR, if the original character in the buffer belongs
+   to one of the Private Use Areas (PUAs) of codepoints that Emacs
+   uses to support non-unified CJK characters. If that happens,
+   CHAR_BYTES will return a value that is different from the length of
+   the original multibyte sequence stored in the buffer. Therefore,
+   do _not_ use FETCH_MULTIBYTE_CHAR if you need to advance through
+   the buffer to the next character after fetching this one. Instead,
+   use either FETCH_CHAR_ADVANCE or STRING_CHAR_AND_LENGTH. */
 
 #define FETCH_MULTIBYTE_CHAR(pos) \
   (_fetch_multibyte_char_p = (((pos) >= GPT_BYTE ? GAP_SIZE : 0) \

=== modified file 'src/character.h'
--- src/character.h 2011-11-20 02:29:42 +0000
+++ src/character.h 2012-04-06 13:10:30 +0000
@@ -292,7 +292,9 @@
   } while (0)
 
 /* Return the character code of character whose multibyte form is at
-   P. */
+   P. Note that this macro unifies CJK characters whose codepoints
+   are in the Private Use Areas (PUAs), so it might return a different
+   codepoint from the one actually stored at P. */
 
 #define STRING_CHAR(p) \
   (!((p)[0] & 0x80) \
@@ -309,7 +311,15 @@
 
 
 /* Like STRING_CHAR, but set ACTUAL_LEN to the length of multibyte
-   form. */
+   form.
+
+   Note: This macro returns the actual length of the character's
+   multibyte sequence as it is stored in a buffer or string. The
+   character it returns might have a different codepoint that has a
+   different multibyte sequence of a different legth, due to possible
+   unification of CJK characters inside string_char. Therefore do NOT
+   assume that the length returned by this macro is identical to the
+   length of the multibyte sequence of the character it returns. */
 
 #define STRING_CHAR_AND_LENGTH(p, actual_len) \
   (!((p)[0] & 0x80) \

------------------------------------------------------------
revno: 107780
committer: Chong Yidong
branch nick: trunk
timestamp: Fri 2012-04-06 14:39:35 +0800
message:
  * doc/lispref/minibuf.texi (Programmed Completion): Document metadata method.
  (Completion Variables): Document completion-category-overrides.
diff:
=== modified file 'doc/lispref/ChangeLog'
--- doc/lispref/ChangeLog 2012-04-05 14:47:41 +0000
+++ doc/lispref/ChangeLog 2012-04-06 06:39:35 +0000
@@ -1,9 +1,12 @@
+2012-04-06 Chong Yidong
+
+ * minibuf.texi (Programmed Completion): Document metadata method.
+ (Completion Variables): Document completion-category-overrides.
+
 2012-04-05 Chong Yidong
 
  * anti.texi (Antinews): Rewrite for Emacs 23.
 
- * minibuf.texi (Programmed Completion): Document metadata method.
-
 2012-04-04 Chong Yidong
 
  * minibuf.texi (Programmed Completion): Remove obsolete variable

=== modified file 'doc/lispref/minibuf.texi'
--- doc/lispref/minibuf.texi 2012-04-04 10:32:35 +0000
+++ doc/lispref/minibuf.texi 2012-04-06 06:39:35 +0000
@@ -1575,12 +1575,10 @@
 @cindex completion styles
 
 @defopt completion-styles
-The value of this variable is a list of completion styles to use for
-performing completion. A @dfn{completion style} is a set of rules for
-generating completions.
-
-Each style listed in this variable must be one of those defined in
-@code{completion-styles-alist}.
+The value of this variable is a list of completion style (symbols) to
+use for performing completion. A @dfn{completion style} is a set of
+rules for generating completions. Each symbol in occurring this list
+must have a corresponding entry in @code{completion-styles-alist}.
 @end defopt
 
 @defvar completion-styles-alist
@@ -1588,15 +1586,16 @@
 element in the list has the form
 
 @example
-(@var{name} @var{try-completion} @var{all-completions} @var{doc})
+(@var{style} @var{try-completion} @var{all-completions} @var{doc})
 @end example
 
 @noindent
-Here, @var{name} is the name of the completion style (a symbol), which
-may be used in @code{completion-styles-alist} to refer to this style;
-@var{try-completion} is the function that does the completion;
-@var{all-completions} is the function that lists the completions; and
-@var{doc} is a string describing the completion style.
+Here, @var{style} is the name of the completion style (a symbol),
+which may be used in the @code{completion-styles} variable to refer to
+this style; @var{try-completion} is the function that does the
+completion; @var{all-completions} is the function that lists the
+completions; and @var{doc} is a string describing the completion
+style.
 
 The @var{try-completion} and @var{all-completions} functions should
 each accept four arguments: @var{string}, @var{collection},
@@ -1622,6 +1621,31 @@
 description of the available completion styles.
 @end defvar
 
+@defopt completion-category-overrides
+This variable specifies special completion styles and other completion
+behaviors to use when completing certain types of text. Its value
+should be a list of the form @code{(@var{category} . @var{alist})}.
+@var{category} is a symbol describing what is being completed;
+currently, the @code{buffer} and @code{file} categories are defined,
+but others can be defined via specialized completion functions
+(@pxref{Programmed Completion}). @var{alist} is an association list
+describing how completion should behave for the corresponding
+category. The following alist keys are supported:
+
+@table @code
+@item styles
+The value should be a list of completion styles (symbols).
+
+@item cycle
+The value should be a value for @code{completion-cycle-threshold}
+(@pxref{Completion Options,,, emacs, The GNU Emacs Manual}) for this
+category.
+@end table
+
+@noindent
+Additional alist entries may be defined in the future.
+@end defopt
+
 @defvar completion-extra-properties
 This variable is used to specify extra properties of the current
 completion command. It is intended to be let-bound by specialized
@@ -1706,9 +1730,48 @@
 should return @code{(boundaries START . END)}, where START is the
 position of the beginning boundary in the specified string, and END is
 the position of the end boundary in SUFFIX.
+
+@item metadata
+This specifies a request for information about the state of the
+current completion. The function should return an alist, as described
+below. The alist may contain any number of elements.
 @end table
+
+@noindent
+If the flag has any other value, the completion function should return
+@code{nil}.
 @end itemize
 
+The following is a list of metadata entries that a completion function
+may return in response to a @code{metadata} flag argument:
+
+@table @code
+@item category
+The value should be a symbol describing what kind of text the
+completion function is trying to complete. If the symbol matches one
+of the keys in @code{completion-category-overrides}, the usual
+completion behavior is overridden. @xref{Completion Variables}.
+
+@item annotation-function
+The value should be a function for @dfn{annotating} completions. The
+function should take one argument, @var{string}, which is a possible
+completion. It should return a string, which is displayed after the
+completion @var{string} in the @samp{*Completions*} buffer.
+
+@item display-sort-function
+The value should be a function for sorting completions. The function
+should take one argument, a list of completion strings, and return a
+sorted list of completion strings. It is allowed to alter the input
+list destructively.
+
+@item cycle-sort-function
+The value should be a function for sorting completions, when
+@code{completion-cycle-threshold} is non-@code{nil} and the user is
+cycling through completion alternatives. @xref{Completion Options,,,
+emacs, The GNU Emacs Manual}. Its argument list and return value are
+the same as for @code{display-sort-function}.
+@end table
+
 @defun completion-table-dynamic function
 This function is a convenient way to write a function that can act as
 programmed completion function. The argument @var{function} should be