4.5 Byte Strings

Bytes and Byte Strings in The Racket Guide introduces byte strings.

A byte string is a fixed-length array of bytes. A byte is an exact integer between 0 and 255 inclusive.

A byte string can be mutable or immutable. When an immutable byte string is provided to a procedure like bytes-set!, the exn:fail:contract exception is raised. Byte-string constants generated by the default reader (see Reading Strings) are immutable, and they are interned in read-syntax mode. Use immutable? to check whether a byte string is immutable.
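As a minimal sketch of the mutability rules above (the names literal, fresh, and outcome are illustrative only and are not part of the library), the following module applies immutable? to a byte-string constant and to a freshly constructed byte string, then catches the exn:fail:contract exception raised when bytes-set! is applied to the constant:

```racket
#lang racket/base

;; A byte-string constant is immutable; (bytes ...) returns a fresh mutable one.
(define literal #"Apple")
(define fresh (bytes 65 112 112 108 101))

(immutable? literal) ; #t
(immutable? fresh)   ; #f

;; Mutating the constant raises exn:fail:contract; the handler
;; turns that exception into the symbol 'rejected.
(define outcome
  (with-handlers ([exn:fail:contract? (lambda (e) 'rejected)])
    (bytes-set! literal 0 97)
    'mutated))
```

Here outcome is 'rejected, and fresh can still be modified with bytes-set!.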

Two byte strings are equal? when they have the same length and contain the same sequence of bytes.

A byte string can be used as a single-valued sequence (see Sequences). The bytes of the string serve as elements of the sequence. See also in-bytes.
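A short sketch of sequence use, with hypothetical names as-list and total: a byte string can be given directly to a for form, or wrapped in in-bytes to make the sequence type explicit.

```racket
#lang racket/base

;; Iterating directly over a byte string yields its bytes in order.
(define as-list (for/list ([b #"Apple"]) b))          ; '(65 112 112 108 101)

;; in-bytes names the sequence type explicitly in the for clause.
(define total (for/sum ([b (in-bytes #"Apple")]) b))  ; 498
```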

See Reading Strings for information on reading byte strings and Printing Strings for information on printing byte strings.

See also: immutable?.

4.5.1 Byte String Constructors, Selectors, and Mutators

procedure

(bytes? v) → boolean?

  v : any/c
Returns #t if v is a byte string, #f otherwise.

See also immutable-bytes? and mutable-bytes?.

Examples:
> (bytes? #"Apple")

#t

> (bytes? "Apple")

#f

procedure

(make-bytes k [b]) → bytes?

  k : exact-nonnegative-integer?
  b : byte? = 0
Returns a new mutable byte string of length k where each position in the byte string is initialized with the byte b.

Example:
> (make-bytes 5 65)

#"AAAAA"

procedure

(bytes b ...) → bytes?

  b : byte?
Returns a new mutable byte string whose length is the number of provided bs, and whose positions are initialized with the given bs.

Example:
> (bytes 65 112 112 108 101)

#"Apple"

procedure

(bytes->immutable-bytes bstr) → (and/c bytes? immutable?)

  bstr : bytes?
Returns an immutable byte string with the same content as bstr, returning bstr itself if bstr is immutable.

Examples:
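The behavior described for bytes->immutable-bytes can be sketched as follows. The names m and i are illustrative; the sketch checks that a mutable input yields an immutable byte string with equal content, and that an already immutable input is returned as is.

```racket
#lang racket/base

(define m (bytes 65 65 65))             ; mutable
(define i (bytes->immutable-bytes m))   ; immutable, same content

(immutable? i)                          ; #t
(equal? i m)                            ; #t
(eq? (bytes->immutable-bytes i) i)      ; #t: immutable input is returned itself
```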

procedure

(byte? v) → boolean?

  v : any/c
Returns #t if v is a byte (i.e., an exact integer between 0 and 255 inclusive), #f otherwise.

Examples:
> (byte? 65)

#t

> (byte? 0)

#t

> (byte? 256)

#f

> (byte? -1)

#f

procedure

(bytes-length bstr) → exact-nonnegative-integer?

  bstr : bytes?
Returns the length of bstr.

Example:
> (bytes-length #"Apple")

5

procedure

(bytes-ref bstr k) → byte?

  bstr : bytes?
  k : exact-nonnegative-integer?
Returns the byte at position k in bstr. The first position in the byte string corresponds to 0, so the position k must be less than the length of the byte string, otherwise the exn:fail:contract exception is raised.

Example:
> (bytes-ref #"Apple" 0)

65

procedure

(bytes-set! bstr k b) → void?

  bstr : (and/c bytes? (not/c immutable?))
  k : exact-nonnegative-integer?
  b : byte?
Changes the byte at position k in bstr to b. The first position in the byte string corresponds to 0, so the position k must be less than the length of the byte string, otherwise the exn:fail:contract exception is raised.

Examples:
> (define s (bytes 65 112 112 108 101))
> (bytes-set! s 4 121)
> s

#"Apply"

procedure

(subbytes bstr start [end]) → bytes?

  bstr : bytes?
  start : exact-nonnegative-integer?
  end : exact-nonnegative-integer? = (bytes-length bstr)
Returns a new mutable byte string that is (- end start) bytes long, and that contains the same bytes as bstr from start inclusive to end exclusive. The start and end arguments must be less than or equal to the length of bstr, and end must be greater than or equal to start, otherwise the exn:fail:contract exception is raised.

Examples:
> (subbytes #"Apple" 1 3)

#"pp"

> (subbytes #"Apple" 1)

#"pple"

procedure

(bytes-copy bstr) → bytes?

  bstr : bytes?
Returns (subbytes bstr 0).

procedure

(bytes-copy! dest dest-start src [src-start src-end]) → void?

  dest : (and/c bytes? (not/c immutable?))
  dest-start : exact-nonnegative-integer?
  src : bytes?
  src-start : exact-nonnegative-integer? = 0
  src-end : exact-nonnegative-integer? = (bytes-length src)
Changes the bytes of dest starting at position dest-start to match the bytes in src from src-start (inclusive) to src-end (exclusive). The byte strings dest and src can be the same byte string, and in that case the destination region can overlap with the source region; the destination bytes after the copy match the source bytes from before the copy. If any of dest-start, src-start, or src-end are out of range (taking into account the sizes of the byte strings and the source and destination regions), the exn:fail:contract exception is raised.

Examples:
> (define s (bytes 65 112 112 108 101))
> (bytes-copy! s 4 #"y")
> (bytes-copy! s 0 s 3 4)
> s

#"lpply"
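The overlap guarantee of bytes-copy! can be illustrated with a sketch that shifts a buffer one position to the right within itself. The names buf and shifted are hypothetical; the point is that the destination receives the source bytes as they were before the copy began.

```racket
#lang racket/base

(define buf (bytes 1 2 3 4 5))

;; Copy positions 0 (inclusive) to 4 (exclusive) of buf onto
;; positions 1 through 4 of the same byte string.
(bytes-copy! buf 1 buf 0 4)

(define shifted (bytes->list buf)) ; '(1 1 2 3 4)
```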

procedure

(bytes-fill! dest b) → void?

  dest : (and/c bytes? (not/c immutable?))
  b : byte?
Changes dest so that every position in the byte string is filled with b.

Examples:
> (define s (bytes 65 112 112 108 101))
> (bytes-fill! s 113)
> s

#"qqqqq"

procedure

(bytes-append bstr ...) → bytes?

  bstr : bytes?
Returns a new mutable byte string that is as long as the sum of the given bstrs’ lengths, and that contains the concatenated bytes of the given bstrs. If no bstrs are provided, the result is a zero-length byte string.

Example:
> (bytes-append #"Apple" #"Banana")

#"AppleBanana"

procedure

(bytes->list bstr) → (listof byte?)

  bstr : bytes?
Returns a new list of bytes corresponding to the content of bstr. That is, the length of the list is (bytes-length bstr), and the sequence of bytes in bstr is the same sequence in the result list.

Example:
> (bytes->list #"Apple")

'(65 112 112 108 101)

procedure

(list->bytes lst) → bytes?

  lst : (listof byte?)
Returns a new mutable byte string whose content is the list of bytes in lst. That is, the length of the byte string is (length lst), and the sequence of bytes in lst is the same sequence in the result byte string.

Example:
> (list->bytes (list 65 112 112 108 101))

#"Apple"

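procedure

(make-shared-bytes k [b]) → bytes?

  k : exact-nonnegative-integer?
  b : byte? = 0
Returns a new mutable byte string of length k where each position in the byte string is initialized with the byte b. For communication among places, the new byte string is allocated in the shared memory space.

Example:
> (make-shared-bytes 5 65)

#"AAAAA"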

procedure

(shared-bytes b ...) → bytes?

  b : byte?
Returns a new mutable byte string whose length is the number of provided bs, and whose positions are initialized with the given bs. For communication among places, the new byte string is allocated in the shared memory space.

Example:
> (shared-bytes 65 112 112 108 101)

#"Apple"

4.5.2 Byte String Comparisons

procedure

(bytes=? bstr1 bstr2 ...) → boolean?

  bstr1 : bytes?
  bstr2 : bytes?
Returns #t if all of the arguments have the same length and contain the same sequence of bytes, #f otherwise.

Examples:
> (bytes=? #"Apple" #"apple")

#f

> (bytes=? #"a" #"as" #"a")

#f

Changed in version 7.0.0.13 of package base: Allow one argument, in addition to allowing two or more.

procedure

(bytes<? bstr1 bstr2 ...) → boolean?

  bstr1 : bytes?
  bstr2 : bytes?
Returns #t if the arguments are lexicographically sorted increasing, where individual bytes are ordered by <, #f otherwise.

Examples:
> (bytes<? #"Apple" #"apple")

#t

> (bytes<? #"apple" #"Apple")

#f

> (bytes<? #"a" #"b" #"c")

#t

Changed in version 7.0.0.13 of package base: Allow one argument, in addition to allowing two or more.
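Because bytes<? orders byte strings lexicographically by byte value, it can serve as the comparison for sort. The sketch below uses a hypothetical name sorted; it shows that uppercase ASCII letters (65 to 90) order before lowercase ones (97 to 122), and that a proper prefix orders before the longer byte string.

```racket
#lang racket/base

(define sorted
  (sort (list #"pear" #"apple" #"Apple" #"app") bytes<?))
;; sorted is '(#"Apple" #"app" #"apple" #"pear")
```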

procedure

(bytes>? bstr1 bstr2 ...) → boolean?

  bstr1 : bytes?
  bstr2 : bytes?
Like bytes<?, but checks whether the arguments are decreasing.

Examples:
> (bytes>? #"Apple" #"apple")

#f

> (bytes>? #"apple" #"Apple")

#t

> (bytes>? #"c" #"b" #"a")

#t

Changed in version 7.0.0.13 of package base: Allow one argument, in addition to allowing two or more.

4.5.3 Bytes to/from Characters, Decoding and Encoding

procedure

(bytes->string/utf-8 bstr [err-char start end]) → string?

  bstr : bytes?
  err-char : (or/c #f char?) = #f
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (bytes-length bstr)
Produces a string by decoding the start to end substring of bstr as a UTF-8 encoding of Unicode code points. If err-char is not #f, then it is used for bytes that fall in the range 128 to 255 but are not part of a valid encoding sequence. (This rule is consistent with reading characters from a port; see Encodings and Locales for more details.) If err-char is #f, and if the start to end substring of bstr is not a valid UTF-8 encoding overall, then the exn:fail:contract exception is raised.
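The effect of err-char and of the start and end arguments can be sketched as follows. The names bad, patched, strict, and middle are illustrative; the byte 255 is used because it never appears in a valid UTF-8 encoding.

```racket
#lang racket/base

(define bad (bytes 65 255 66)) ; "A", an invalid byte, "B"

;; With an err-char, the invalid byte is replaced.
(define patched (bytes->string/utf-8 bad #\?)) ; "A?B"

;; Without an err-char, decoding raises exn:fail:contract.
(define strict
  (with-handlers ([exn:fail:contract? (lambda (e) 'invalid)])
    (bytes->string/utf-8 bad)))

;; start and end select a sub-range of the byte string.
(define middle (bytes->string/utf-8 #"hello" #f 1 3)) ; "el"
```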

Example:
> (bytes->string/utf-8 (bytes 195 167 195 176 195 182 194 163))

"çðö£"

procedure

(bytes->string/locale bstr [err-char start end]) → string?

  bstr : bytes?
  err-char : (or/c #f char?) = #f
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (bytes-length bstr)
Produces a string by decoding the start to end substring of bstr using the current locale’s encoding (see also Encodings and Locales). If err-char is not #f, it is used for each byte in bstr that is not part of a valid encoding; if err-char is #f, and if the start to end substring of bstr is not a valid encoding overall, then the exn:fail:contract exception is raised.

procedure

(bytes->string/latin-1 bstr [err-char start end]) → string?

  bstr : bytes?
  err-char : (or/c #f char?) = #f
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (bytes-length bstr)
Produces a string by decoding the start to end substring of bstr as a Latin-1 encoding of Unicode code points; i.e., each byte is translated directly to a character using integer->char, so the decoding always succeeds. The err-char argument is ignored, but present for consistency with the other operations.

Example:
> (bytes->string/latin-1 (bytes 254 211 209 165))

"þÓÑ¥"

procedure

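(string->bytes/utf-8 str [err-byte start end]) → bytes?

  str : string?
  err-byte : (or/c #f byte?) = #f
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (string-length str)
Produces a byte string by encoding the start to end substring of str via UTF-8 (always succeeding). The err-byte argument is ignored, but included for consistency with the other operations.

Examples:
> (define b
    (bytes->string/utf-8
     (bytes 195 167 195 176 195 182 194 163)))
> (string->bytes/utf-8 b)

#"\303\247\303\260\303\266\302\243"

> (bytes->string/utf-8 (string->bytes/utf-8 b))

"çðö£"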

procedure

(string->bytes/locale str [err-byte start end]) → bytes?

  str : string?
  err-byte : (or/c #f byte?) = #f
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (string-length str)
Produces a byte string by encoding the start to end substring of str using the current locale’s encoding (see also Encodings and Locales). If err-byte is not #f, it is used for each character in str that cannot be encoded for the current locale; if err-byte is #f, and if the start to end substring of str cannot be encoded, then the exn:fail:contract exception is raised.

procedure

(string->bytes/latin-1 str [err-byte start end]) → bytes?

  str : string?
  err-byte : (or/c #f byte?) = #f
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (string-length str)
Produces a byte string by encoding the start to end substring of str using Latin-1; i.e., each character is translated directly to a byte using char->integer. If err-byte is not #f, it is used for each character in str whose value is greater than 255. If err-byte is #f, and if the start to end substring of str has a character with a value greater than 255, then the exn:fail:contract exception is raised.
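A minimal sketch of the err-byte rule, using illustrative names s, replaced, and strict. The string contains a Greek lambda (code point 955), which is greater than 255 and so has no Latin-1 encoding; 63 is the byte for the ASCII question mark.

```racket
#lang racket/base

(define s "a\u03BBb") ; "a", lambda, "b"

;; With an err-byte, the unencodable character becomes that byte.
(define replaced (string->bytes/latin-1 s 63)) ; #"a?b"

;; Without an err-byte, encoding raises exn:fail:contract.
(define strict
  (with-handlers ([exn:fail:contract? (lambda (e) 'unencodable)])
    (string->bytes/latin-1 s)))
```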

Examples:
> (define b
    (bytes->string/latin-1 (bytes 254 211 209 165)))
> (string->bytes/latin-1 b)

#"\376\323\321\245"

> (bytes->string/latin-1 (string->bytes/latin-1 b))

"þÓÑ¥"

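procedure

(string-utf-8-length str [start end]) → exact-nonnegative-integer?

  str : string?
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (string-length str)
Returns the length in bytes of the UTF-8 encoding of str’s substring from start to end, but without actually generating the encoded bytes.

Examples:
> (string-utf-8-length
   (bytes->string/utf-8 (bytes 195 167 195 176 195 182 194 163)))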

8

> (string-utf-8-length "hello")

5
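The relationship between string-utf-8-length and string->bytes/utf-8 can be sketched as below, with hypothetical names s, counted, encoded, and partial. Each of the four characters in s needs two bytes in UTF-8.

```racket
#lang racket/base

(define s "çðö£")

;; Counting the encoded length without building the byte string ...
(define counted (string-utf-8-length s))                ; 8

;; ... agrees with encoding first and measuring afterwards.
(define encoded (bytes-length (string->bytes/utf-8 s))) ; 8

;; start and end restrict the count to a substring (here two characters).
(define partial (string-utf-8-length s 1 3))            ; 4
```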

procedure

(bytes-utf-8-length bstr [err-char start end])
 → (or/c exact-nonnegative-integer? #f)

  bstr : bytes?
  err-char : (or/c #f char?) = #f
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (bytes-length bstr)
Returns the length in characters of the UTF-8 decoding of bstr’s substring from start to end, but without actually generating the decoded characters. If err-char is #f and the substring is not a UTF-8 encoding overall, the result is #f. Otherwise, err-char is used to resolve decoding errors as in bytes->string/utf-8.

Examples:
> (bytes-utf-8-length (bytes 195 167 195 176 195 182 194 163))

4

> (bytes-utf-8-length (make-bytes 5 65))

5

procedure

(bytes-utf-8-ref bstr [skip err-char start end]) → (or/c char? #f)

  bstr : bytes?
  skip : exact-nonnegative-integer? = 0
  err-char : (or/c #f char?) = #f
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (bytes-length bstr)
Returns the skipth character in the UTF-8 decoding of bstr’s substring from start to end, but without actually generating the other decoded characters. If the substring is not a UTF-8 encoding up to the skipth character (when err-char is #f), or if the substring decoding produces fewer than skip characters, the result is #f. If err-char is not #f, it is used to resolve decoding errors as in bytes->string/utf-8.

Examples:
> (bytes-utf-8-ref (bytes 195 167 195 176 195 182 194 163) 0)

#\ç

> (bytes-utf-8-ref (bytes 195 167 195 176 195 182 194 163) 1)

#\ð

> (bytes-utf-8-ref (bytes 195 167 195 176 195 182 194 163) 2)

#\ö

> (bytes-utf-8-ref (bytes 65 66 67 68) 0)

#\A

> (bytes-utf-8-ref (bytes 65 66 67 68) 1)

#\B

> (bytes-utf-8-ref (bytes 65 66 67 68) 2)

#\C

procedure

(bytes-utf-8-index bstr skip [err-char start end])
 → (or/c exact-nonnegative-integer? #f)

  bstr : bytes?
  skip : exact-nonnegative-integer?
  err-char : (or/c #f char?) = #f
  start : exact-nonnegative-integer? = 0
  end : exact-nonnegative-integer? = (bytes-length bstr)
Returns the offset in bytes into bstr at which the skipth character’s encoding starts in the UTF-8 decoding of bstr’s substring from start to end (but without actually generating the other decoded characters). The result is relative to the start of bstr, not to start. If the substring is not a UTF-8 encoding up to the skipth character (when err-char is #f), or if the substring decoding produces fewer than skip characters, the result is #f. If err-char is not #f, it is used to resolve decoding errors as in bytes->string/utf-8.

Examples:
> (bytes-utf-8-index (bytes 195 167 195 176 195 182 194 163) 0)

0

> (bytes-utf-8-index (bytes 195 167 195 176 195 182 194 163) 1)

2

> (bytes-utf-8-index (bytes 195 167 195 176 195 182 194 163) 2)

4

> (bytes-utf-8-index (bytes 65 66 67 68) 0)

0

> (bytes-utf-8-index (bytes 65 66 67 68) 1)

1

> (bytes-utf-8-index (bytes 65 66 67 68) 2)

2

4.5.4 Bytes to Bytes Encoding Conversion

procedure

(bytes-open-converter from-name to-name)
 → (or/c bytes-converter? #f)

  from-name : string?
  to-name : string?
Produces a byte converter to go from the encoding named by from-name to the encoding named by to-name. If the requested conversion pair is not available, #f is returned instead of a converter.

Certain encoding combinations are always available:

  • (bytes-open-converter "UTF-8" "UTF-8"): the identity conversion, except that encoding errors in the input lead to a decoding failure.

  • (bytes-open-converter "UTF-8-permissive" "UTF-8"): the identity conversion, except that any input byte that is not part of a valid encoding sequence is effectively replaced by the UTF-8 encoding sequence for #\uFFFD. (This handling of invalid sequences is consistent with the interpretation of port bytes streams into characters; see Ports.)

  • (bytes-open-converter "" "UTF-8"): converts from the current locale’s default encoding (see Encodings and Locales) to UTF-8.

  • (bytes-open-converter "UTF-8" ""): converts from UTF-8 to the current locale’s default encoding (see Encodings and Locales).

  • (bytes-open-converter "platform-UTF-8" "platform-UTF-16"): converts UTF-8 to UTF-16 on Unix and Mac OS, where each UTF-16 code unit is a sequence of two bytes ordered by the current platform’s endianness. On Windows, the conversion is the same as (bytes-open-converter "WTF-8" "WTF-16") to support unpaired surrogate code units.

  • (bytes-open-converter "platform-UTF-8-permissive" "platform-UTF-16"): like (bytes-open-converter "platform-UTF-8" "platform-UTF-16"), but an input byte that is not part of a valid UTF-8 encoding sequence (or valid for the unpaired-surrogate extension on Windows) is effectively replaced with #\uFFFD.

  • (bytes-open-converter "platform-UTF-16" "platform-UTF-8"): converts UTF-16 (bytes ordered by the current platform’s endianness) to UTF-8 on Unix and Mac OS. On Windows, the conversion is the same as (bytes-open-converter "WTF-16" "WTF-8") to support unpaired surrogates. On Unix and Mac OS, surrogates are assumed to be paired: a pair of bytes with the bits #xD800 starts a surrogate pair, and the #x03FF bits are used from the pair and following pair (independent of the value of the #xDC00 bits). On all platforms, performance may be poor when decoding from an odd offset within an input byte string.

  • (bytes-open-converter "WTF-8" "WTF-16"): converts the WTF-8 [Sapin18] superset of UTF-8 to a superset of UTF-16 to support unpaired surrogate code units, where each UTF-16 code unit is a sequence of two bytes ordered by the current platform’s endianness.

  • (bytes-open-converter "WTF-8-permissive" "WTF-16"): like (bytes-open-converter "WTF-8" "WTF-16"), but an input byte that is not part of a valid WTF-8 encoding sequence is effectively replaced with #\uFFFD.

  • (bytes-open-converter "WTF-16" "WTF-8"): converts the WTF-16 [Sapin18] superset of UTF-16 to the WTF-8 superset of UTF-8. The input can include UTF-16 code units that are unpaired surrogates, and the corresponding output includes an encoding of each surrogate in a natural extension of UTF-8.

A newly opened byte converter is registered with the current custodian (see Custodians), so that the converter is closed when the custodian is shut down. A converter is not registered with a custodian (and does not need to be closed) if it is one of the guaranteed combinations not involving "" on Unix, or if it is any of the guaranteed combinations (including "") on Windows and Mac OS.

In the Racket software distributions for Windows, a suitable "iconv.dll" is included with "libmzschVERS.dll".

The set of available encodings and combinations varies by platform, depending on the iconv library that is installed; the from-name and to-name arguments are passed on to iconv_open. On Windows, "iconv.dll" or "libiconv.dll" must be in the same directory as "libmzschVERS.dll" (where VERS is a version number), in the user’s path, in the system directory, or in the current executable’s directory at run time, and the DLL must either supply _errno or link to "msvcrt.dll" for _errno; otherwise, only the guaranteed combinations are available.

Use bytes-convert with the result to convert byte strings.

Changed in version 7.9.0.17 of package base: Added built-in converters for "WTF-8", "WTF-8-permissive", and "WTF-16".

procedure

(bytes-close-converter converter) → void

  converter : bytes-converter?
Closes the given converter, so that it can no longer be used with bytes-convert or bytes-convert-end.

procedure

(bytes-convert converter
               src-bstr
               [src-start-pos
               src-end-pos
               dest-bstr
               dest-start-pos
               dest-end-pos])
 → (or/c bytes? exact-nonnegative-integer?)
   exact-nonnegative-integer?
   (or/c 'complete 'continues 'aborts 'error)

  converter : bytes-converter?
  src-bstr : bytes?
  src-start-pos : exact-nonnegative-integer? = 0
  src-end-pos : exact-nonnegative-integer? = (bytes-length src-bstr)
  dest-bstr : (or/c bytes? #f) = #f
  dest-start-pos : exact-nonnegative-integer? = 0
  dest-end-pos : (or/c exact-nonnegative-integer? #f)
               = (and dest-bstr (bytes-length dest-bstr))
Converts the bytes from src-start-pos to src-end-pos in src-bstr.

If dest-bstr is not #f, the converted bytes are written into dest-bstr from dest-start-pos to dest-end-pos. If dest-bstr is #f, then a newly allocated byte string holds the conversion results, and if dest-end-pos is not #f, the size of the result byte string is no more than (- dest-end-pos dest-start-pos).

The result of bytes-convert is three values:

  • result-bstr or dest-wrote-amt: a byte string if dest-bstr is #f or not provided, or the number of bytes written into dest-bstr otherwise.

  • src-read-amt: the number of bytes successfully converted from src-bstr.

  • 'complete, 'continues, 'aborts, or 'error, indicating how conversion terminated:

    • 'complete: The entire input was processed, and src-read-amt will be equal to (- src-end-pos src-start-pos).

    • 'continues: Conversion stopped due to the limit on the result size or the space in dest-bstr; in this case, fewer than (- dest-end-pos dest-start-pos) bytes may be returned if more space is needed to process the next complete encoding sequence in src-bstr.

    • 'aborts: The input stopped part-way through an encoding sequence, and more input bytes are necessary to continue. For example, if the last byte of input is 195 for a "UTF-8-permissive" decoding, the result is 'aborts, because another byte is needed to determine how to use the 195 byte.

    • 'error: The bytes starting at (+ src-start-pos src-read-amt) bytes in src-bstr do not form a legal encoding sequence. This result is never produced for some encodings, where all byte sequences are valid encodings. For example, since "UTF-8-permissive" handles an invalid UTF-8 sequence by dropping characters or generating “?,” every byte sequence is effectively valid.

Applying a converter accumulates state in the converter (even when the third result of bytes-convert is 'complete). This state can affect both further processing of input and further generation of output, but only for conversions that involve “shift sequences” to change modes within a stream. To terminate an input sequence and reset the converter, use bytes-convert-end.

Examples:
> (define convert (bytes-open-converter "UTF-8" "UTF-16"))
> (bytes-convert convert (bytes 65 66 67 68))

#"\376\377\0A\0B\0C\0D"

4

'complete

> (bytes 195 167 195 176 195 182 194 163)

#"\303\247\303\260\303\266\302\243"

> (bytes-convert convert (bytes 195 167 195 176 195 182 194 163))

#"\0\347\0\360\0\366\0\243"

8

'complete
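The 'aborts case described above can be sketched with the guaranteed "UTF-8-permissive" to "UTF-8" converter, feeding input that ends part-way through a two-byte sequence. The names conv, chunk, pending, and the result variables are illustrative. The sketch assumes the caller is responsible for re-supplying the unconsumed bytes, as indicated by the second result, together with the input that arrives later.

```racket
#lang racket/base

(define conv (bytes-open-converter "UTF-8-permissive" "UTF-8"))

;; The chunk ends with 195, the first byte of a two-byte sequence.
(define chunk (bytes 65 195))
(define-values (out1 read1 status1)
  (bytes-convert conv chunk))
;; status1 is 'aborts; read1 counts only the bytes that were converted.

;; Retry from the first unconsumed byte once more input is available.
(define pending (subbytes chunk read1))
(define-values (out2 read2 status2)
  (bytes-convert conv (bytes-append pending (bytes 167))))
;; status2 is 'complete, and out2 holds the two-byte encoding of ç.

(bytes-close-converter conv)
```

Keeping the unconsumed tail in the caller, rather than relying on the converter to buffer it, matches the description of src-read-amt as the number of bytes successfully converted.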

procedure

(bytes-convert-end converter
                   [dest-bstr
                   dest-start-pos
                   dest-end-pos])
 → (or/c bytes? exact-nonnegative-integer?)
   (or/c 'complete 'continues)

  converter : bytes-converter?
  dest-bstr : (or/c bytes? #f) = #f
  dest-start-pos : exact-nonnegative-integer? = 0
  dest-end-pos : (or/c exact-nonnegative-integer? #f)
               = (and dest-bstr (bytes-length dest-bstr))
Like bytes-convert, but instead of converting bytes, this procedure generates an ending sequence for the conversion (sometimes called a “shift sequence”), if any. Few encodings use shift sequences, so this function will succeed with no output for most encodings. In any case, successful output of a (possibly empty) shift sequence resets the converter to its initial state.

The result of bytes-convert-end is two values:

  • result-bstr or dest-wrote-amt: a byte string if dest-bstr is #f or not provided, or the number of bytes written into dest-bstr otherwise.

  • 'complete or 'continues: indicates whether conversion completed. If 'complete, then an entire ending sequence was produced. If 'continues, then the conversion could not complete due to the limit on the result size or the space in dest-bstr, and the first result is either an empty byte string or 0.

procedure

(bytes-converter? v) → boolean?

  v : any/c
Returns #t if v is a byte converter produced by bytes-open-converter, #f otherwise.

Examples:
> (bytes-converter? (bytes-open-converter "UTF-8" "UTF-16"))

#t

> (bytes-converter? (bytes-open-converter "whacky" "not likely"))

#f

> (define b (bytes-open-converter "UTF-8" "UTF-16"))
> (bytes-close-converter b)
> (bytes-converter? b)

#t

procedure

(locale-string-encoding) → any

Returns a string for the current locale’s encoding (i.e., the encoding normally identified by ""). See also system-language+country.

4.5.5 Additional Byte String Functions

(require racket/bytes)    package: base
The bindings documented in this section are provided by the racket/bytes and racket libraries, but not racket/base.

procedure

(bytes-append* str ... strs) → bytes?

  str : bytes?
  strs : (listof bytes?)
Like bytes-append, but the last argument is used as a list of arguments for bytes-append, so (bytes-append* str ... strs) is the same as (apply bytes-append str ... strs). In other words, the relationship between bytes-append and bytes-append* is similar to the one between list and list*.

Examples:
> (bytes-append* #"a" #"b" '(#"c" #"d"))

#"abcd"

> (bytes-append* (cdr (append* (map (lambda (x) (list #", " x))
                                    '(#"Alpha" #"Beta" #"Gamma")))))

#"Alpha, Beta, Gamma"

procedure

(bytes-join strs sep) → bytes?

  strs : (listof bytes?)
  sep : bytes?
Appends the byte strings in strs, inserting sep between each pair of adjacent byte strings in strs. A new mutable byte string is returned.

Example:
> (bytes-join '(#"one" #"two" #"three" #"four") #" potato ")

#"one potato two potato three potato four"
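A brief sketch of how bytes-join treats short lists, using illustrative names none, one, and many. Since sep is placed only between adjacent elements, an empty list or a single-element list produces no separator at all.

```racket
#lang racket/base
(require racket/bytes)

(define none (bytes-join '() #", "))               ; no elements: empty result
(define one  (bytes-join '(#"solo") #", "))        ; one element: no separator
(define many (bytes-join '(#"a" #"b" #"c") #", ")) ; #"a, b, c"
```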
