author Gerald Combs <gerald@wireshark.org> 2020-09-03 18:40:36 -0700
committer AndersBroman <a.broman58@gmail.com> 2020-09-04 10:01:23 +0000
commit 188b4a655f792995abc68afb7a6f894d96e53f76
tree 0fbce3b7487734b018736341f1ac6a7743f0ab9c /doc
parent fd075df3f8cf8c99e598975c2a4dcd49e82147b3
README.developer: Note that sources can use UTF-8.
We started allowing source files to be encoded as UTF-8 in April 2019 in bd75f5af0a. Update README.developer to match. README.developer no longer has a "Code style" section, so update the Developer's Guide to point to the "Portability" section.
Diffstat (limited to 'doc')
-rw-r--r-- doc/README.developer 22
1 files changed, 12 insertions, 10 deletions
diff --git a/doc/README.developer b/doc/README.developer
index bf15d68c4f..8c283f1cf4 100644
--- a/doc/README.developer
+++ b/doc/README.developer
@@ -501,16 +501,18 @@ automatically free()d when the dissection of the current packet ends so you
don't have to worry about free()ing them explicitly in order to not leak memory.
Please read README.wmem.
-Don't use non-ASCII characters in source files; not all compiler
-environments will be using the same encoding for non-ASCII characters,
-and at least one compiler (Microsoft's Visual C) will, in environments
-with double-byte character encodings, such as many Asian environments,
-fail if it sees a byte sequence in a source file that doesn't correspond
-to a valid character. This causes source files using either an ISO
-8859/n single-byte character encoding or UTF-8 to fail to compile. Even
-if the compiler doesn't fail, there is no guarantee that the compiler,
-or a developer's text editor, will interpret the characters the way you
-intend them to be interpreted.
+Source files can use UTF-8 encoding, but characters outside the ASCII
+range should be used sparingly. It should be safe to use non-ASCII
+characters in comments and strings, but some compilers (such as GCC
+versions prior to 10) may not support extended identifiers very well.
+There is also no guarantee that a developer's text editor will interpret
+the characters the way you intend them to be interpreted.
+
+The majority of Wireshark encodes strings as UTF-8. The main exception
+is the code that uses the Qt API, which uses UTF-16. Console output is
+UTF-8, but as with the source code extended characters should be used
+sparingly since some consoles (most notably Windows' cmd.exe) have
+limited support for UTF-8.
3. Robustness.
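
A minimal sketch of what the updated guideline permits, assuming the file is saved as UTF-8 and the compiler keeps string literals as UTF-8 (the default for GCC and Clang): non-ASCII characters appear only in a comment and a string literal, while every identifier stays ASCII. The names greeting and utf8_code_points are hypothetical and are not part of the Wireshark sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Non-ASCII text in a comment is considered safe: Grüße. */

/* ASCII identifier, UTF-8 string contents. The two non-ASCII
 * characters each occupy two bytes in UTF-8. */
static const char greeting[] = "Grüße";

/*
 * Hypothetical helper: count the code points in a NUL-terminated
 * string by skipping UTF-8 continuation bytes (bit pattern 10xxxxxx).
 * Assumes the input is valid UTF-8; it performs no validation.
 */
static size_t utf8_code_points(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

Under those assumptions, strlen(greeting) reports the byte length and utf8_code_points(greeting) the character count; the two differ whenever a string holds non-ASCII characters, which is one reason to use such characters sparingly in console output.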