author | Guy Harris <guy@alum.mit.edu> | 2014-01-21 01:23:29 +0000 |
---|---|---|
committer | Guy Harris <guy@alum.mit.edu> | 2014-01-21 01:23:29 +0000 |
commit | 9cdf8dd5f5c1466c056d40f069cda28bdedfb1cc (patch) | |
tree | 93e48f4966e2585cae9756d2a360903d5a6526ff /epan/tvbuff.h | |
parent | 6517e3ba4b294931601f74112439eb827a3b5c85 (diff) |
Don't do the byte-with-8th-bit-set-to-REPLACEMENT-CHARACTER mapping for
UTF-8 strings.

Add that mapping for null-terminated ASCII strings.

Factor out some common parts of comments about string routines, and
clean up some other comments.

svn path=/trunk/; revision=54868
Diffstat (limited to 'epan/tvbuff.h')
-rw-r--r-- | epan/tvbuff.h | 17 |
1 files changed, 11 insertions, 6 deletions
diff --git a/epan/tvbuff.h b/epan/tvbuff.h
index 2ea913a3f3..e8b9049695 100644
--- a/epan/tvbuff.h
+++ b/epan/tvbuff.h
@@ -485,11 +485,13 @@ extern gchar *tvb_format_stringzpad_wsp(tvbuff_t *tvb, const gint offset,
  *
  * Throws an exception if the tvbuff ends before the string does.
  *
- * tvb_get_string() handles 7bit ASCII strings, 8bit characters are
- * converted into the Unicode Replacement Character.
+ * tvb_get_string() handles 7-bit ASCII strings, with characters
+ * with the 8th bit set are converted to the
+ * Unicode REPLACEMENT CHARACTER.
  *
  * tvb_get_string_enc() takes a string encoding as well, and converts to UTF-8
- * from the encoding.
+ * from the encoding, possibly mapping some characters
+ * to the REPLACEMENT CHARACTER.
  *
  * If scope is set to NULL it is the user's responsibility to g_free()
  * the memory allocated by tvb_memdup(). Otherwise memory is
@@ -522,10 +524,13 @@ WS_DLL_PUBLIC gchar *tvb_get_ts_23_038_7bits_string(wmem_allocator_t *scope,
  * and return a pointer to the string. Also return the length of the
  * string (including the terminating null) through a pointer.
  *
- * tvb_get_stringz() returns a string
+ * tvb_get_stringz() handles 7-bit ASCII strings, with characters
+ * with the 8th bit set are converted to the
+ * Unicode REPLACEMENT CHARACTER.
  *
- * tvb_get_stringz_enc() takes a string encoding as well, and converts to
- * UTF-8 from the encoding.
+ * tvb_get_stringz_enc() takes a string encoding as well, and converts to UTF-8
+ * from the encoding, possibly mapping some characters
+ * to the REPLACEMENT CHARACTER.
  *
  * tvb_get_const_stringz() returns a constant (unmodifiable) string that does
  * not need to be freed, instead it will automatically be
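The comments in this diff say that, for 7-bit ASCII strings, any byte with the 8th bit set is converted to the Unicode REPLACEMENT CHARACTER. The following is a minimal standalone sketch of that mapping, not the code from epan/tvbuff.c: the function name ascii_to_utf8_sketch is hypothetical, and it uses plain malloc() rather than the wmem scope allocator the real routines take. It assumes only that U+FFFD is encoded in UTF-8 as the three bytes EF BF BD.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (an illustration, not part of epan/tvbuff.h):
 * copy `len` bytes of presumed-ASCII input into a newly malloc()ed,
 * null-terminated UTF-8 string. Each byte with the 8th bit set is
 * replaced by U+FFFD REPLACEMENT CHARACTER (UTF-8: EF BF BD).
 * Returns NULL if allocation fails; the caller free()s the result.
 */
char *ascii_to_utf8_sketch(const unsigned char *src, size_t len)
{
    /* Worst case: every input byte expands to a 3-byte sequence. */
    char *out = malloc(len * 3 + 1);
    size_t i, j = 0;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        if (src[i] & 0x80) {
            /* Not 7-bit ASCII: emit the UTF-8 encoding of U+FFFD. */
            memcpy(out + j, "\xEF\xBF\xBD", 3);
            j += 3;
        } else {
            /* Plain 7-bit ASCII is already valid UTF-8. */
            out[j++] = (char)src[i];
        }
    }
    out[j] = '\0';
    return out;
}
```

With this sketch, the four input bytes 'a', 0x80, 'b', 0xFF would come back as an 8-byte UTF-8 string: "a", U+FFFD, "b", U+FFFD. A counted-string variant is shown here; a null-terminated variant would first find the terminator and then apply the same loop.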