MBRTOC16(3) | Library Functions Manual | MBRTOC16(3) |
mbrtoc16 — restartable multibyte to UTF-16 conversion
#include <uchar.h>

size_t
mbrtoc16(char16_t * restrict pc16, const char * restrict s, size_t n,
    mbstate_t * restrict ps);
The mbrtoc16 function decodes multibyte characters in
the current locale and converts them to UTF-16, keeping state so it can
restart after incremental progress.
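Each call to mbrtoc16:

1. examines up to n bytes starting at s,
2. yields a UTF-16 code unit, if one is available, by storing it at *pc16,
3. saves its state in ps, and
4. returns either the number of bytes consumed or a special return value.

Specifically: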
• If the multibyte sequence at s is invalid, taking into account any
  previous input saved in ps, mbrtoc16 returns (size_t)-1 and sets
  errno(2) to indicate the error.
• If the multibyte sequence at s is still incomplete after n bytes,
  including any previous input saved in ps, mbrtoc16 saves its state
  in ps after all the input so far and returns (size_t)-2. All n bytes
  of input are consumed in this case.
• If mbrtoc16 had previously decoded a multibyte character but has not
  yet yielded all the code units of its UTF-16 encoding, it stores the
  next UTF-16 code unit at *pc16 and returns (size_t)-3. No bytes of
  input are consumed in this case.
• If mbrtoc16 decodes the null multibyte character, then it stores zero
  at *pc16 and returns zero.
• Otherwise, mbrtoc16 decodes a single multibyte character, stores the
  first (and possibly only) code unit in its UTF-16 encoding at *pc16,
  and returns the number of bytes consumed to decode the first
  multibyte character.

If pc16 is a null pointer, nothing is stored, but the effects on ps and the return value are unchanged.
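The cases above can be folded into a single conversion loop. The following is a minimal sketch rather than part of the library: the helper name mb_to_utf16 and its buffer convention are assumptions made for illustration. It converts a NUL-terminated multibyte string in the current locale, and it treats an incomplete sequence as an error because no further input is coming.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert the NUL-terminated multibyte string
 * src into at most cap UTF-16 code units at dst (the result is not
 * NUL-terminated).  Returns the number of code units stored, or
 * (size_t)-1 on an invalid or truncated sequence.
 */
static size_t
mb_to_utf16(char16_t *dst, size_t cap, const char *src)
{
	mbstate_t st;
	size_t n = strlen(src) + 1;	/* include the NUL terminator */
	size_t out = 0;

	memset(&st, 0, sizeof(st));	/* initial conversion state */
	while (out < cap) {
		char16_t c16;
		size_t len = mbrtoc16(&c16, src, n, &st);

		if (len == (size_t)-1 || len == (size_t)-2)
			return (size_t)-1;	/* invalid or incomplete */
		if (len == 0)
			break;			/* null character: done */
		dst[out++] = c16;
		if (len == (size_t)-3)
			continue;		/* pending unit, no input used */
		src += len;
		n -= len;
	}
	return out;
}
```

Because n includes the terminator, a pending low surrogate is always drained before the loop sees the null character. If cap is reached first, the output may end on a high surrogate; a caller that cares should size dst for at least as many code units as there are input bytes.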
If s is a null pointer, the mbrtoc16 call is equivalent to:

    mbrtoc16(NULL, "", 1, ps);

This always returns zero, and has the effect of resetting ps to the initial conversion state, without writing to pc16, even if it is nonnull.
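One use of the null-s form is to discard a partially decoded sequence before starting on unrelated input. The sketch below assumes a hypothetical wrapper named reset_utf16_state; it relies only on mbrtoc16 and the standard mbsinit function from <wchar.h>.

```c
#include <stddef.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: return *ps to the initial conversion state,
 * discarding any partial multibyte sequence or pending code unit.
 * Returns nonzero if the call returned zero and the state then
 * tests as initial.
 */
static int
reset_utf16_state(mbstate_t *ps)
{
	/* With s null, pc16 and n are ignored. */
	size_t r = mbrtoc16(NULL, NULL, 0, ps);

	return r == 0 && mbsinit(ps) != 0;
}
```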
If ps is a null pointer, mbrtoc16 uses an internal mbstate_t object
with static storage duration, distinct from all other mbstate_t objects
(including those used by mbrtoc8(3), mbrtoc32(3), c8rtomb(3),
c16rtomb(3), and c32rtomb(3)), which is initialized at program startup
to the initial conversion state.
The mbrtoc16 function yields
either a Unicode scalar value in the Basic Multilingual Plane (BMP), i.e., a
16-bit Unicode code point that is not a surrogate code point, or, over two
successive calls, yields the high and low surrogate code points (in that
order) of a Unicode scalar value outside the BMP.
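A caller that wants scalar values rather than code units has to recombine the two surrogates itself. The following sketch shows the standard UTF-16 arithmetic; the function names are hypothetical and nothing here is provided by <uchar.h> beyond the char16_t type.

```c
#include <stdint.h>
#include <uchar.h>

/* High (leading) surrogates occupy U+D800..U+DBFF. */
static int
is_high_surrogate(char16_t u)
{
	return u >= 0xD800 && u <= 0xDBFF;
}

/* Low (trailing) surrogates occupy U+DC00..U+DFFF. */
static int
is_low_surrogate(char16_t u)
{
	return u >= 0xDC00 && u <= 0xDFFF;
}

/*
 * Combine a high and a low surrogate into the scalar value they
 * encode.  The caller must have checked both with the predicates
 * above; the result lies in U+10000..U+10FFFF.
 */
static uint32_t
combine_surrogates(char16_t hi, char16_t lo)
{
	return 0x10000 + (((uint32_t)hi - 0xD800) << 10) +
	    ((uint32_t)lo - 0xDC00);
}
```

For example, the pair 0xD83D, 0xDE00 combines to U+1F600.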
The mbrtoc16 function returns:

0           if mbrtoc16 decoded a null multibyte character.

i           where 1 ≤ i ≤ n, if mbrtoc16 consumed i bytes of input to
            decode the next multibyte character, yielding a UTF-16 code
            unit.

(size_t)-3  if mbrtoc16 consumed no new bytes of input but yielded a
            UTF-16 code unit that was pending from previous input.

(size_t)-2  if mbrtoc16 found only an incomplete multibyte sequence
            after all n bytes of input and any previous input, and
            saved its state to restart in the next call with ps.

(size_t)-1  if an error occurred; errno(2) is set to indicate the
            error.
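Since the three special values are very large size_t values, they must be tested before the result is used as a byte count. One way to make that explicit is a small classifier; the enum and function names below are hypothetical and exist only for this sketch.

```c
#include <stddef.h>

/* Hypothetical labels for the five kinds of return value. */
enum mbr_kind {
	MBR_NULL,	/* 0: null character decoded */
	MBR_UNIT,	/* 1..n: bytes consumed, code unit stored */
	MBR_PENDING,	/* (size_t)-3: pending code unit, no input used */
	MBR_INCOMPLETE,	/* (size_t)-2: need more input */
	MBR_ERROR	/* (size_t)-1: errno is set */
};

/* Map a return value of mbrtoc16 to one of the labels above. */
static enum mbr_kind
mbr_classify(size_t r)
{
	switch (r) {
	case (size_t)-1:
		return MBR_ERROR;
	case (size_t)-2:
		return MBR_INCOMPLETE;
	case (size_t)-3:
		return MBR_PENDING;
	case 0:
		return MBR_NULL;
	default:
		return MBR_UNIT;
	}
}
```

Only MBR_UNIT means the return value may be added to s and subtracted from n.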
Print the UTF-16 code units of a multibyte string in hexadecimal text:

    #include <assert.h>
    #include <errno.h>
    #include <inttypes.h>
    #include <stdio.h>
    #include <uchar.h>

    char *s = ...;
    size_t n = ...;
    mbstate_t mbs = {0};    /* initial conversion state */

    while (n) {
            char16_t c16;
            size_t len;

            len = mbrtoc16(&c16, s, n, &mbs);
            switch (len) {
            case 0:         /* NUL terminator */
                    assert(c16 == 0);
                    goto out;
            default:        /* scalar value or high surrogate */
                    printf("U+%04"PRIx16"\n", (uint16_t)c16);
                    break;
            case (size_t)-3:        /* low surrogate */
                    printf("continue U+%04"PRIx16"\n", (uint16_t)c16);
                    continue;       /* no input consumed; do not advance s */
            case (size_t)-2:        /* incomplete */
                    printf("incomplete\n");
                    goto readmore;  /* label in the surrounding code */
            case (size_t)-1:        /* error */
                    printf("error: %d\n", errno);
                    goto out;       /* label in the surrounding code */
            }
            s += len;
            n -= len;
    }
The Unicode Standard, https://www.unicode.org/versions/Unicode15.0.0/UnicodeStandard-15.0.pdf, The Unicode Consortium, September 2022, Version 15.0 — Core Specification.
P. Hoffman and F. Yergeau, UTF-16, an encoding of ISO 10646, Internet Engineering Task Force, RFC 2781, https://datatracker.ietf.org/doc/html/rfc2781, February 2000.
The mbrtoc16 function conforms to
ISO/IEC 9899:2011 (“ISO C11”).
The mbrtoc16 function first appeared in
NetBSD 11.0.
August 14, 2024 | NetBSD 10.1 |