-
Notifications
You must be signed in to change notification settings - Fork 365
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for IDL wstring #2124
Support for IDL wstring #2124
Conversation
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
616d1f0
to
df06f23
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks mostly fine, though there are a few missing cases in the IDL compiler and the serializer, and (I believe) an incorrect unit for serialized length of a wstring
.
for (uint32_t i = 0; val[i] != L'\0'; i++) | ||
{ | ||
assert (di < n); | ||
if ((uint32_t) val[i] < 0x10000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The (XTypes) spec isn't completely clear on supporting non-BMP characters in wide strings:
The serialized length of an object of type String shall be the number of bytes in the CDR buffer taken by the String characters. This is twice the number of characters in the string because a single character (in the Basic Multilingual Plane) encoded using UTF-16 takes 2 bytes to serialize.
But section 7.2.2.2.1.2.7 on String<Char16>
does not mention this restriction, while the sections on Char16
and array/sequence of Char16
do (7.2.2.2.1.2.5 and 7.2.2.2.1.2.6).
And in the rationale in 7.4.1.1.2: ... could support sending UTF-16 encoded Unicode characters outside the BMP
So I'm fine with having no restrictions for writing CDR.
examples/dynsub/print_sample.c
Outdated
// must always call align for its side effects | ||
if (bound != 0) | ||
{ | ||
return align (sample, c, _Alignof (wchar_t), bound + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the size
parameter be bound * sizeof (wchar_t) + sizeof (L'\0')
?
examples/dynsub/type_cache.c
Outdated
*size = sizeof (wchar_t *); | ||
} else { | ||
*align = 1; | ||
*size = bound; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, should also use sizeof (wchar_t)
and take terminating \0
into account
src/core/ddsi/src/ddsi_typebuilder.c
Outdated
case DDS_XTypes_TK_STRING16: { | ||
bool bounded = (type->xt._u.str16.bound > 0); | ||
tb_type->type_code = bounded ? DDS_OP_VAL_BWSTR : DDS_OP_VAL_WSTR; | ||
tb_type->args.string_args.max_size = type->xt._u.str8.bound + 1; // +1 for terminating L'\0' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tb_type->args.string_args.max_size = type->xt._u.str8.bound + 1; // +1 for terminating L'\0' | |
tb_type->args.string_args.max_size = type->xt._u.str16.bound + 1; // +1 for terminating L'\0' |
src/core/ddsi/src/ddsi_typebuilder.c
Outdated
@@ -415,16 +422,29 @@ static dds_return_t typebuilder_add_type (struct typebuilder_data *tbd, uint32_t | |||
case DDS_XTypes_TK_STRING8: { | |||
bool bounded = (type->xt._u.str8.bound > 0); | |||
tb_type->type_code = bounded ? DDS_OP_VAL_BST : DDS_OP_VAL_STR; | |||
tb_type->args.string_args.max_size = type->xt._u.str8.bound + 1; // +1 for terminating \0 | |||
tb_type->args.string_args.max_size = type->xt._u.str8.bound + 1; // +1 for terminating '\0' | |||
tb_type->cdr_align = bounded ? 1 : 4; // unbounded string has 4 bytes length field in cdr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both bounded and unbounded strings have a 4 bytes length in the CDR, so this comment (which I wrote myself a few years ago) is wrong. It looks like the cdr_align
field is not used anywhere, and can be removed.
src/core/cdr/src/dds_cdrstream.c
Outdated
static void dds_stream_skip_wstring (dds_istream_t * __restrict is) | ||
{ | ||
const uint32_t length = dds_is_get4 (is); | ||
dds_stream_skip_forward (is, length, 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assumes length is in 2-byte units (see comment on write_wstring)
src/core/cdr/src/dds_cdrstream.c
Outdated
assert (maxsz > 0); | ||
if (!read_and_normalize_uint32 (&sz, data, off, size, bswap)) | ||
return false; | ||
if ((size - *off) / 2u < sz || maxsz - 1 < sz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assumes size is in 2-byte units
src/core/cdr/src/dds_cdrstream.c
Outdated
case DDS_OP_VAL_WSTR: case DDS_OP_VAL_BWSTR: { | ||
uint32_t sz = dds_is_get4 (is); | ||
dds_os_put4 (os, allocator, sz); | ||
dds_os_put_bytes (os, allocator, is->m_buffer + is->m_index, 2u * sz); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment on line 4653
@@ -4439,6 +4865,7 @@ static void dds_stream_read_key_impl (dds_istream_t * __restrict is, char * __re | |||
case DDS_OP_VAL_2BY: *((uint16_t *) dst) = dds_is_get2 (is); break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[lines 4723 and 4805] WCHAR
, WSTR
and BWSTR
cases missing (cast subtype
and type
to enum so that the compiler shows a warning?)
src/core/cdr/src/dds_cdrstream.c
Outdated
@@ -4754,7 +5228,7 @@ static bool prtf_simple_array (char * __restrict *buf, size_t * __restrict bufsi | |||
} | |||
break; | |||
case DDS_OP_VAL_BLN: case DDS_OP_VAL_2BY: case DDS_OP_VAL_4BY: case DDS_OP_VAL_8BY: | |||
case DDS_OP_VAL_STR: case DDS_OP_VAL_BST: | |||
case DDS_OP_VAL_STR: case DDS_OP_VAL_BST: case DDS_OP_VAL_WCHAR: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WSTR
and BWSTR
missing
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
…ither kind Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
It used to handle it the same as a DDS_OP_JEQ4 and tried to interpret the offset as a type code. This seemingly never caused a problem before, until the introduction of wstring. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
dds_stream_reuse_wstring_bound / wstring_from_utf16 expect a bound including a terminating 0, so should not subtract 1 from bound in instruction. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
FOREACH moved from dynlib.s, FOREACH_PAIR is analogous. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
@dpotman I believe the commits I just pushed address all the comments — and a few details extra that I discovered in the process. There are also now some tests for wstring support. Good thing I did that, because that's where 2fd2600 comes from. Now the problem with the tests is that I disabled them on Windows, or more precisely: when using Microsoft's compiler. I know the general issue, which is that you need an extra round of macro expansion when playing games with I would've done that trial-and-error if only my little old Windows PC would still let me do a remote desktop session ... That leaves the CI, but for this kind of trial-and-error going through the CI is practically infeasible. Maybe some kind soul can help me out here ... |
With some trial-and-error, I think I've fixed it! 229733d |
Signed-off-by: Dennis Potman <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks good now!
I really think no-one should be using
wstring
but ... ROS 2 does have a wide string type, and so ros2/rmw_cyclonedds#501 runs into a problem if Cyclone doesn't do wide strings.It probably needs more testing, I haven't yet checked how to also do this for Python, and I am still debating whether it wouldn't be better to check that the wide strings received over the network use the surrogate pairs correctly ...