From ae4a912af04456a6fc4022150485be541f65d96c Mon Sep 17 00:00:00 2001
From: Gerald Combs
Date: Fri, 6 May 2016 10:25:02 -0700
Subject: TShark: Convert TTY output.

If we detect that we're writing to a TTY and that it doesn't support
UTF-8, convert our output to the current code page on UNIX/Linux or to
UTF-16LE on Windows. This helps to ensure that we don't fill users'
screens with mojibake, along with scrubbing invalid output.

Add a note about our output behavior to the TShark man page. Add a note
about the glyphs we should and shouldn't be using to utf8_entities.h.

Bug: 12393
Change-Id: I52b6dd240173b80ffb6d35b5950a46a565c97ce8
Reviewed-on: https://code.wireshark.org/review/15277
Reviewed-by: Gerald Combs
Petri-Dish: Gerald Combs
Tested-by: Petri Dish Buildbot
Reviewed-by: Graham Bloice
Reviewed-by: Anders Broman
---
 doc/tshark.pod | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/doc/tshark.pod b/doc/tshark.pod
index fb88d53d9a..77082c7a1e 100644
--- a/doc/tshark.pod
+++ b/doc/tshark.pod
@@ -1741,6 +1741,20 @@ personal preferences file.
 
 =back
 
+=head1 OUTPUT
+
+B<TShark> uses UTF-8 to represent strings internally. In some cases the
+output might not be valid. For example, a dissector might generate
+invalid UTF-8 character sequences. Programs reading B<TShark> output
+should expect UTF-8 and be prepared for invalid output.
+
+If B<TShark> detects that it is writing to a TTY on UNIX or Linux and
+the locale does not support UTF-8, output will be re-encoded to match the
+current locale.
+
+If B<TShark> detects that it is writing to a TTY on Windows, output will be
+encoded as UTF-16LE.
+
 =head1 ENVIRONMENT VARIABLES
 
 =over 4
--
cgit v1.2.3
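
Note for consumers of TShark output: the OUTPUT section added by this patch says that programs reading TShark output should expect UTF-8 and be prepared for invalid sequences. A minimal sketch of such a consumer in Python follows. The helper names `decode_tshark_output` and `read_tshark_lines` and the file name `capture.pcap` are hypothetical and not part of this change. The sketch also assumes that output read through a pipe is not a TTY, so the re-encoding described in the patch would not apply to it.

```python
import subprocess


def decode_tshark_output(raw: bytes) -> str:
    """Decode bytes as UTF-8, replacing invalid sequences with U+FFFD.

    Hypothetical helper: it never raises on malformed input, which is
    the property the OUTPUT section asks readers to be prepared for.
    """
    return raw.decode("utf-8", errors="replace")


def read_tshark_lines(args):
    """Yield decoded output lines from a tshark child process.

    Hypothetical helper. stdout is a pipe here, so this sketch assumes
    tshark emits plain UTF-8 rather than a TTY-specific encoding.
    """
    proc = subprocess.Popen(["tshark", *args], stdout=subprocess.PIPE)
    try:
        # Read raw bytes line by line and decode each one defensively.
        for raw_line in proc.stdout:
            yield decode_tshark_output(raw_line).rstrip("\r\n")
    finally:
        proc.stdout.close()
        proc.wait()
```

A caller might iterate over `read_tshark_lines(["-r", "capture.pcap"])` and treat any U+FFFD in a line as a sign that a dissector produced an invalid sequence. Replacing rather than raising keeps one bad field from aborting the whole read. A stricter consumer could decode with `errors="strict"` and handle `UnicodeDecodeError` per line instead.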