author: Guy Harris <guy@alum.mit.edu> 2012-05-12 20:10:18 +0000
committer: Guy Harris <guy@alum.mit.edu> 2012-05-12 20:10:18 +0000
commit: 1c7269a6d19b62075b1786d812e9880d247f5967 (patch)
tree: e80e955b1f50627d6e7a19b1c3f34e3d56ad7399 /doc
parent: 3896fea6c080b1b88690bba43f5217f780c2a22b (diff)
Mention ENC_UCS_2 and ENC_UTF_16.
svn path=/trunk/; revision=42602
Diffstat (limited to 'doc')
1 files changed, 11 insertions, 5 deletions
diff --git a/doc/README.developer b/doc/README.developer
index f371783dd1..57aaed46f6 100644
--- a/doc/README.developer
+++ b/doc/README.developer
@@ -2377,15 +2377,21 @@ order.
For string fields, the encoding specifies the character set used for the
string and the way individual code points in that character set are
encoded. For FT_UINT_STRING fields, the byte order of the count must be
-specified; when support for UTF-16 encoding is added, the byte order of
-the encoding will also have to be specified. In other cases, ENC_NA
-should be used. The character encodings that are currently
-supported are:
+specified; for UCS-2 and UTF-16, the byte order of the encoding must be
+specified (for counted UCS-2 and UTF-16 strings, the byte order of the
+count and the 16-bit values in the string must be the same). In other
+cases, ENC_NA should be used. The character encodings that are
+currently supported are:
- ENC_UTF_8 - UTF-8
ENC_ASCII - ASCII (currently treated as UTF-8; in the future,
all bytes with the 8th bit set will be treated as
errors)
+ ENC_UTF_8 - UTF-8
+ ENC_UCS_2 - UCS-2
+ ENC_UTF_16 - UTF-16 (currently treated as UCS-2; in the future,
+ surrogate pairs will be handled, and non-valid 16-bit
+ code points and surrogate pairs will be treated as
+ errors)
Other encodings will be added in the future.
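
The rule added above, that a counted UCS-2 or UTF-16 string uses one byte order for both the count and the 16-bit values, can be illustrated with a small standalone sketch. This is not Wireshark code: decode_counted_ucs2(), read_u16() and enum byte_order are hypothetical names invented for the example, and the sketch assumes a 16-bit count that gives the string length in bytes. Like the current ENC_UTF_16 handling described above, it treats every 16-bit value as a UCS-2 code point and does not pair surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ENC_BIG_ENDIAN / ENC_LITTLE_ENDIAN choice. */
enum byte_order { ORDER_BIG_ENDIAN, ORDER_LITTLE_ENDIAN };

/* Read one 16-bit value from p in the given byte order. */
static uint16_t read_u16(const uint8_t *p, enum byte_order order)
{
    if (order == ORDER_BIG_ENDIAN)
        return (uint16_t)((p[0] << 8) | p[1]);
    return (uint16_t)((p[1] << 8) | p[0]);
}

/*
 * Decode a counted UCS-2 string into NUL-terminated UTF-8.
 *
 * Assumed layout (an assumption of this sketch): a 16-bit count of
 * bytes, followed by that many bytes of 16-bit code units. The SAME
 * byte order is used for the count and for the code units.
 *
 * Returns the number of UTF-8 bytes written (excluding the NUL), or -1
 * if the input is truncated, the count is odd, or out is too small.
 * Surrogates are not paired; they are encoded as-is, UCS-2 style.
 */
static int decode_counted_ucs2(const uint8_t *buf, size_t buf_len,
                               enum byte_order order,
                               char *out, size_t out_len)
{
    size_t count, i, o = 0;

    if (buf_len < 2 || out_len == 0)
        return -1;
    count = read_u16(buf, order);
    if (count % 2 != 0 || buf_len - 2 < count)
        return -1;

    for (i = 0; i < count; i += 2) {
        uint16_t cp = read_u16(buf + 2 + i, order);
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : 3;

        /* Leave room for the encoded bytes plus the trailing NUL. */
        if (o + need + 1 > out_len)
            return -1;
        if (need == 1) {
            out[o++] = (char)cp;
        } else if (need == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    return (int)o;
}
```

Reading a little-endian buffer with ORDER_BIG_ENDIAN makes the count itself come out wrong, so the sketch rejects the input instead of decoding garbage; that is one reason the count and the string data are expected to share a byte order.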