Bytestrings for interfacing native (C) code divoplade 12 Sep 2020 18:13 UTC

Hello all,

I think bytestrings should be advertised as the preferred type for
interfacing with external native (C) libraries, because they do not
distinguish between strings intended to store text and strings
intended to store raw data.

For instance, consider cryptographic libraries. It might be a good idea
to re-use existing native C code, either for efficiency or simply to
avoid reimplementing it. A cryptographic library takes C-strings +
length, and outputs C-strings + length, whether it operates on textual
data or binary data. To interface with cryptographic libraries, I can
only use bytevectors.
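
To make the "C-strings + length" convention concrete, here is a minimal
sketch. The function toy_digest is hypothetical and is not taken from
any real library; it computes a 32-bit FNV-1a hash, which is NOT
cryptographic, and only stands in for a real digest function. What
matters is the calling convention: input and output are both a pointer
plus a length, so embedded NUL bytes are handled like any other byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a native digest function (FNV-1a, 32 bits,
   NOT cryptographic). The input is a pointer plus a length, so the
   function never searches for a terminating NUL and cannot tell
   whether the bytes are text or raw data.
   Writes 4 bytes (big-endian) to `out` and returns the number of bytes
   written, or 0 if `out_cap` is too small. */
size_t toy_digest(const unsigned char *in, size_t in_len,
                  unsigned char *out, size_t out_cap)
{
    uint32_t h = 2166136261u;              /* FNV-1a offset basis */

    assert(in != NULL || in_len == 0);
    assert(out != NULL);

    if (out_cap < 4)
        return 0;

    for (size_t i = 0; i < in_len; i++) {
        h ^= in[i];
        h *= 16777619u;                    /* FNV-1a prime */
    }

    /* The output is raw bytes too: it may contain NUL bytes, so the
       caller must rely on the returned length, not on strlen(). */
    out[0] = (unsigned char)(h >> 24);
    out[1] = (unsigned char)(h >> 16);
    out[2] = (unsigned char)(h >> 8);
    out[3] = (unsigned char)h;
    return 4;
}
```

On the Scheme side, both arguments and the result map naturally onto
bytevectors, which is why a bytestring interface seems to fit here.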

The same can be said for JSON. Emacs, for example, has recently moved
its JSON parsing functions to native code.

So, if I need to do minor text processing between a call to a native
JSON library and a native cryptographic library, or even recursively
pass my data back and forth between these libraries, it would be far
easier to keep bytestrings all the way. I am currently thinking of DPoP
(https://datatracker.ietf.org/doc/draft-ietf-oauth-dpop/), in which
the application may go back and forth between JSON encoding / decoding
and hashing or verifying a signature (for instance, for public key
confirmation), and in which processing individual fields will be
useful. If the JSON library and the cryptographic library agreed to
process bytestrings, we could use the bytestring library from the
start.

In conclusion, native code should try to provide a bytestring
interface first, and the bytestring library should point this out in
its rationale.

This is my opinion; what do you think?

Best regards,

divoplade