From 1c7269a6d19b62075b1786d812e9880d247f5967 Mon Sep 17 00:00:00 2001
From: Guy Harris
Date: Sat, 12 May 2012 20:10:18 +0000
Subject: Mention ENC_UCS_2 and ENC_UTF_16.

svn path=/trunk/; revision=42602
---
 doc/README.developer | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/doc/README.developer b/doc/README.developer
index f371783dd1..57aaed46f6 100644
--- a/doc/README.developer
+++ b/doc/README.developer
@@ -2377,15 +2377,21 @@ order.
 For string fields, the encoding specifies the character set used for the
 string and the way individual code points in that character set are
 encoded. For FT_UINT_STRING fields, the byte order of the count must be
-specified; when support for UTF-16 encoding is added, the byte order of
-the encoding will also have to be specified. In other cases, ENC_NA
-should be used. The character encodings that are currently
-supported are:
+specified; for UCS-2 and UTF-16, the byte order of the encoding must be
+specified (for counted UCS-2 and UTF-16 strings, the byte order of the
+count and the 16-bit values in the string must be the same). In other
+cases, ENC_NA should be used. The character encodings that are
+currently supported are:
 
-    ENC_UTF_8 - UTF-8
     ENC_ASCII - ASCII (currently treated as UTF-8; in the future,
         all bytes with the 8th bit set will be treated as
         errors)
+    ENC_UTF_8 - UTF-8
+    ENC_UCS_2 - UCS-2
+    ENC_UTF_16 - UTF-16 (currently treated as UCS-2; in the future,
+        surrogate pairs will be handled, and non-valid 16-bit
+        code points and surrogate pairs will be treated as
+        errors)
     ENC_EBCDIC - EBCDIC
 
 Other encodings will be added in the future.
-- 
cgit v1.2.3
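
Note on the UCS-2 / UTF-16 distinction mentioned in the patch text: the
following is a minimal, standalone sketch of why the byte order matters for
16-bit encodings and what "handling surrogate pairs" would involve. It is
not Wireshark's implementation and does not use Wireshark's API; every name
in it (sketch_byte_order, sketch_get_unit, sketch_next_code_point) is
hypothetical and exists only for illustration, and the choice to return
U+FFFD for a lone surrogate is an assumption, not something the patch
specifies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-order flag for this sketch only; these are NOT
 * Wireshark's encoding constants. */
enum sketch_byte_order { SKETCH_BIG_ENDIAN, SKETCH_LITTLE_ENDIAN };

/* Read one 16-bit code unit from two bytes in the given byte order. */
static uint16_t sketch_get_unit(const uint8_t *p, enum sketch_byte_order bo)
{
    return (bo == SKETCH_BIG_ENDIAN)
        ? (uint16_t)((p[0] << 8) | p[1])
        : (uint16_t)((p[1] << 8) | p[0]);
}

/*
 * Decode one code point starting at buf[*offset] and advance *offset.
 *
 * handle_surrogates == 0: every 16-bit unit is returned unchanged
 *                         (UCS-2 style).
 * handle_surrogates != 0: a high surrogate followed by a low surrogate is
 *                         combined into one code point (UTF-16 style); a
 *                         lone surrogate yields U+FFFD (an assumption made
 *                         for this sketch).
 *
 * Returns 0xFFFFFFFF if fewer than 2 bytes remain.
 */
static uint32_t sketch_next_code_point(const uint8_t *buf, size_t len,
                                       size_t *offset,
                                       enum sketch_byte_order bo,
                                       int handle_surrogates)
{
    uint16_t unit;

    if (*offset > len || len - *offset < 2)
        return 0xFFFFFFFFu;

    unit = sketch_get_unit(buf + *offset, bo);
    *offset += 2;

    /* Not a surrogate, or UCS-2 style decoding: return the unit as-is. */
    if (!handle_surrogates || unit < 0xD800 || unit > 0xDFFF)
        return unit;

    /* High surrogate: try to pair it with a following low surrogate. */
    if (unit <= 0xDBFF && len - *offset >= 2) {
        uint16_t low = sketch_get_unit(buf + *offset, bo);
        if (low >= 0xDC00 && low <= 0xDFFF) {
            *offset += 2;
            return 0x10000u + (((uint32_t)(unit - 0xD800) << 10)
                               | (uint32_t)(low - 0xDC00));
        }
    }

    /* Lone or misordered surrogate. */
    return 0xFFFDu;
}
```

With the big-endian bytes D8 3D DE 00, UTF-16 style decoding yields the
single code point U+1F600 and consumes four bytes, while UCS-2 style
decoding yields 0xD83D and consumes two. Reading the same bytes with the
wrong byte order produces different 16-bit values, which is why the byte
order has to be specified along with the encoding.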