Don't do the byte-with-8th-bit-set-to-REPLACEMENT-CHARACTER mapping for UTF-8 strings.

Add that mapping for null-terminated ASCII strings.

Factor out some common parts of comments about string routines, and clean up some other comments.

svn path=/trunk/; revision=54868
author: Guy Harris <guy@alum.mit.edu> 2014-01-21 01:23:29 +0000
committer: Guy Harris <guy@alum.mit.edu> 2014-01-21 01:23:29 +0000
commit: 9cdf8dd5f5c1466c056d40f069cda28bdedfb1cc (patch)
tree: 93e48f4966e2585cae9756d2a360903d5a6526ff /epan/tvbuff.h
parent: 6517e3ba4b294931601f74112439eb827a3b5c85 (diff)
1 files changed, 11 insertions, 6 deletions
diff --git a/epan/tvbuff.h b/epan/tvbuff.h
index 2ea913a3f3..e8b9049695 100644
--- a/epan/tvbuff.h
+++ b/epan/tvbuff.h
@@ -485,11 +485,13 @@ extern gchar *tvb_format_stringzpad_wsp(tvbuff_t *tvb, const gint offset,
  *
  * Throws an exception if the tvbuff ends before the string does.
  *
- * tvb_get_string() handles 7bit ASCII strings, 8bit characters are
- *                  converted into the Unicode Replacement Character.
+ * tvb_get_string() handles 7-bit ASCII strings, with characters
+ *                   with the 8th bit set are converted to the
+ *                   Unicode REPLACEMENT CHARACTER.
  *
  * tvb_get_string_enc() takes a string encoding as well, and converts to UTF-8
- *                   from the encoding.
+ *                   from the encoding, possibly mapping some characters
+ *                   to the REPLACEMENT CHARACTER.
  *
  * If scope is set to NULL it is the user's responsibility to g_free()
  * the memory allocated by tvb_memdup(). Otherwise memory is
@@ -522,10 +524,13 @@ WS_DLL_PUBLIC gchar *tvb_get_ts_23_038_7bits_string(wmem_allocator_t *scope,
  * and return a pointer to the string.  Also return the length of the
  * string (including the terminating null) through a pointer.
  *
- * tvb_get_stringz() returns a string
+ * tvb_get_stringz() handles 7-bit ASCII strings, with characters
+ *                   with the 8th bit set are converted to the
+ *                   Unicode REPLACEMENT CHARACTER.
  *
- * tvb_get_stringz_enc() takes a string encoding as well, and converts to
- *                   UTF-8 from the encoding.
+ * tvb_get_stringz_enc() takes a string encoding as well, and converts to UTF-8
+ *                   from the encoding, possibly mapping some characters
+ *                   to the REPLACEMENT CHARACTER.
  *
  * tvb_get_const_stringz() returns a constant (unmodifiable) string that does
  *                   not need to be freed, instead it will automatically be
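
The mapping described in the comments above for tvb_get_string() and tvb_get_stringz() can be illustrated with a small standalone sketch. This is not the tvbuff implementation: ascii_bytes_to_utf8() is a hypothetical helper that works on a plain byte array and allocates with malloc() rather than through a wmem scope. It only shows the rule itself, in which each byte with the 8th bit set becomes the UTF-8 encoding of U+FFFD REPLACEMENT CHARACTER and every other byte is copied through unchanged.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* UTF-8 encoding of U+FFFD REPLACEMENT CHARACTER (3 bytes). */
static const char REPLACEMENT_UTF8[] = "\xEF\xBF\xBD";

/*
 * Hypothetical helper, not part of the tvbuff API.
 *
 * Treat src[0..len-1] as 7-bit ASCII and return a newly allocated,
 * null-terminated UTF-8 string.  Any byte with the 8th bit set is
 * replaced by U+FFFD.  The caller must free() the result.
 * Returns NULL if the allocation fails.
 */
char *ascii_bytes_to_utf8(const unsigned char *src, size_t len)
{
    /* Worst case: every input byte expands to 3 output bytes. */
    char *out = malloc(len * 3 + 1);
    size_t o = 0;

    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        if (src[i] & 0x80) {
            /* Not valid 7-bit ASCII: emit the replacement character. */
            memcpy(out + o, REPLACEMENT_UTF8, 3);
            o += 3;
        } else {
            out[o++] = (char)src[i];
        }
    }
    out[o] = '\0';
    return out;
}
```

For example, the three input bytes 0x61 0xE9 0x62 would come back as "a", U+FFFD, "b". A zero byte in the input is copied as-is in this sketch, so a caller using the result as a C string would see it end there.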