Given a UTF-16 string, test its endianness.
Rules:
1. As long as the input is in UTF-16 without BOM, its type doesn't matter. In C++ on Windows, it can be std::wstring or std::u16string. In Haskell, it can be [Int16] or [Word16].
2. The endianness is tested by noncharacters. If the input contains a noncharacter, it is in the wrong endianness.
3. If the original version and the endianness-swapped version both contain a noncharacter, or neither does, the test is inconclusive.
4. The type of the output must be well-ordered. Output 1 (or an equivalent enumerator) when the input is in the right endianness, -1 (or an equivalent enumerator) when the input must be endianness-swapped, or 0 (or an equivalent enumerator) when the test is inconclusive. For example, in Haskell, Ordering is a valid output type.
5. Wrong UTF-16 sequences are also treated as noncharacters. (So the noncharacters are: encodings of code points from U+FDD0 to U+FDEF, encodings of code points U+XXFFFE and U+XXFFFF with XX from 00 to 10, and unpaired surrogates.)
As this is a code golf, the shortest code in bytes wins.
Examples:
Assume we're in Haskell, the system is big-endian, and Ordering is the output type. When given the following byte sequence:
00 00 DB FF DF FE
Because it encodes the noncharacter U+10FFFE, the output must be:
LT
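To see why the code units DBFF DFFE encode U+10FFFE, the surrogate-pair arithmetic can be worked through explicitly. The helper name `decode_surrogate_pair` below is made up for this illustration.

```python
def decode_surrogate_pair(high: int, low: int) -> int:
    """Combine a high and a low surrogate into one supplementary code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


print(hex(decode_surrogate_pair(0xDBFF, 0xDFFE)))  # prints 0x10fffe
```

Here DBFF contributes 0x3FF shifted left by 10 bits (0xFFC00) and DFFE contributes 0x3FE, so the sum is 0x10000 + 0xFFC00 + 0x3FE = 0x10FFFE, the second-to-last code point of plane 16.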
For the following byte sequence:
00 00 00 00 00 00
The output must be:
EQ
For the following byte sequence:
FE FF
Since the endianness-swapped version encodes the noncharacter U+FFFE, the output must be:
GT
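The three examples can be checked at the byte level with a short sketch that leans on Python's built-in strict `utf-16-be` decoder to reject unpaired surrogates. Reading the bytes as big-endian mirrors the assumption stated for these examples. The names `has_noncharacter`, `swap_bytes` and `classify` are made up here, and the input is assumed to have an even number of bytes.

```python
def has_noncharacter(data: bytes) -> bool:
    """Decode as big-endian UTF-16; invalid sequences count as noncharacters."""
    try:
        text = data.decode("utf-16-be")  # strict mode rejects unpaired surrogates
    except UnicodeDecodeError:
        return True
    return any(
        0xFDD0 <= ord(c) <= 0xFDEF or (ord(c) & 0xFFFE) == 0xFFFE
        for c in text
    )


def swap_bytes(data: bytes) -> bytes:
    """Swap the two bytes of every 16-bit unit (assumes an even length)."""
    return bytes(b for pair in zip(data[1::2], data[0::2]) for b in pair)


def classify(data: bytes) -> int:
    """Return -1, 0 or 1, matching LT, EQ and GT in the examples."""
    return int(has_noncharacter(swap_bytes(data))) - int(has_noncharacter(data))


print(classify(bytes.fromhex("0000DBFFDFFE")))  # prints -1 (LT)
print(classify(bytes.fromhex("000000000000")))  # prints 0 (EQ)
print(classify(bytes.fromhex("FEFF")))          # prints 1 (GT)
```

The explicit `utf-16-be` codec is used instead of plain `utf-16` so that a leading FE FF is decoded as U+FEFF instead of being consumed as a BOM.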
@JoKing We output 0 when the test is inconclusive (see Rule #3). A non-UTF character is treated as a noncharacter also in the flipped version (see Rule #5). – Dannyu NDos – 2019-10-02T01:03:53.753
Some more test cases would be good – Jo King – 2019-10-02T01:14:11.067
You should include a description of noncharacters and invalid UTF-16 sequences in your challenge, or at least point to the specification. – nwellnhof – 2019-10-02T14:59:35.930
@nwellnhof I improved the question accordingly. Now vote for reopen, please? – Dannyu NDos – 2019-10-02T23:52:57.513