Avoid per-value stack expansion when decoding RowBinary Tuple #369
Open
Description
Problem
Tuple decoding currently expands the tuple schema onto the decode stack for every tuple value:
{:tuple, tuple_types} -> decode_rows(tuple_types ++ [{:tuple_over, row} | types_rest], bin, [], rows, types)
Because `++` copies its left operand, this allocates memory proportional to the tuple width for every decoded tuple value. This is the same general shape as the array/map stack expansion work, except that the tuple width is fixed and known from the type.
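To make the shape of the problem concrete, here is a minimal toy sketch of a stack-driven decoder. It is not the Ch.RowBinary code: the module name `StackExpansionSketch` and the single one-byte `:u8` scalar type are assumptions made for illustration, and incomplete input is not handled. Only the tuple clause mirrors the line quoted above.

```elixir
defmodule StackExpansionSketch do
  # Toy decoder (hypothetical, NOT Ch.RowBinary): the only scalar is :u8,
  # one unsigned byte. Incomplete input raises FunctionClauseError.

  def decode_rows(<<>>, _types), do: []
  def decode_rows(bin, types), do: decode_rows(types, bin, [], [], types)

  # Scalar: consume one byte and push it onto the current row accumulator.
  defp decode_rows([:u8 | rest], <<v, bin::binary>>, row, rows, types),
    do: decode_rows(rest, bin, [v | row], rows, types)

  # Tuple: same shape as the line quoted in the issue. `++` copies
  # tuple_types, so each tuple value allocates one cons cell per element.
  defp decode_rows([{:tuple, tuple_types} | rest], bin, row, rows, types),
    do: decode_rows(tuple_types ++ [{:tuple_over, row} | rest], bin, [], rows, types)

  # End-of-tuple marker: build the tuple and resume the outer row.
  defp decode_rows([{:tuple_over, outer} | rest], bin, row, rows, types) do
    tuple = row |> :lists.reverse() |> List.to_tuple()
    decode_rows(rest, bin, [tuple | outer], rows, types)
  end

  # Type stack exhausted: the row is complete.
  defp decode_rows([], <<>>, row, rows, _types),
    do: :lists.reverse([:lists.reverse(row) | rows])

  defp decode_rows([], bin, row, rows, types),
    do: decode_rows(types, bin, [], [:lists.reverse(row) | rows], types)
end
```

In this sketch, decoding `<<1, 2, 3, 4, 5, 6>>` with `[:u8, {:tuple, [:u8, :u8]}]` rebuilds the three-element stack prefix twice, once per tuple value.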
Benchmark
Environment:
commit: 5c9244a
macOS, Apple M2, 8 GB RAM
Elixir 1.19.5, Erlang/OTP 28.3, JIT enabled
Benchmark code:
alias Ch.RowBinary

base = DateTime.from_naive!(~N[2026-01-01 00:00:00.000], "Etc/UTC")

flat_rows =
  Enum.map(1..100_000, fn i ->
    [i, "event", DateTime.add(base, i, :second)]
  end)

tuple_rows = Enum.map(flat_rows, fn row -> [List.to_tuple(row)] end)

flat_bin =
  flat_rows
  |> RowBinary.encode_rows(["UInt64", "String", "DateTime"])
  |> IO.iodata_to_binary()

tuple_bin =
  tuple_rows
  |> RowBinary.encode_rows(["Tuple(UInt64, String, DateTime)"])
  |> IO.iodata_to_binary()

Benchee.run(
  %{
    "decode flat UInt64/String/DateTime" => fn ->
      RowBinary.decode_rows(flat_bin, ["UInt64", "String", "DateTime"])
    end,
    "decode Tuple(UInt64,String,DateTime)" => fn ->
      RowBinary.decode_rows(tuple_bin, ["Tuple(UInt64, String, DateTime)"])
    end
  },
  warmup: 1,
  time: 2
)
Results:
Name                                     ips     average
decode flat UInt64/String/DateTime      6.99   143.10 ms
decode Tuple(UInt64,String,DateTime)    4.24   235.69 ms
Tuple decoding: 1.65x slower (+92.59 ms for 100k rows)
The comparison is not claiming tuples should be as cheap as flat rows, but it gives a baseline for the per-value tuple stack allocation.
Suggested direction
Avoid rebuilding tuple_types ++ marker for each tuple value. Possible approaches:
- represent tuple decoding as a small state frame with current tuple types and accumulated tuple row;
- precompute a reusable tuple decode frame during decoding_type/1;
- align this with the array/map non-stack improvement so nested containers use one consistent mechanism.
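One way to read the "small state frame" idea is sketched below on a toy decoder. This is an illustration under stated assumptions, not a proposed patch: the module name `TupleFrameSketch`, the one-byte `:u8` scalar, and the `{rest, outer_row}` frame layout are all hypothetical, and the continuation/error handling for incomplete input is omitted. Instead of concatenating `tuple_types` onto the type stack, the decoder walks the shared `tuple_types` list directly and saves the outer position in a frame, so entering a tuple costs a constant amount of allocation regardless of tuple width.

```elixir
defmodule TupleFrameSketch do
  # Toy decoder (hypothetical, NOT Ch.RowBinary): the only scalar is :u8,
  # one unsigned byte. Incomplete input raises FunctionClauseError.

  def decode_rows(<<>>, _types), do: []
  def decode_rows(bin, types), do: decode(types, bin, [], [], [], types)

  # Scalar: consume one byte and push it onto the current accumulator.
  defp decode([:u8 | rest], <<v, bin::binary>>, acc, frames, rows, types),
    do: decode(rest, bin, [v | acc], frames, rows, types)

  # Entering a tuple: save the outer type tail and outer accumulator in one
  # frame, then iterate tuple_types as-is. No `++`, so nothing is copied.
  defp decode([{:tuple, tuple_types} | rest], bin, acc, frames, rows, types),
    do: decode(tuple_types, bin, [], [{rest, acc} | frames], rows, types)

  # Tuple types exhausted with a frame pending: build the tuple and
  # resume the outer container where it left off.
  defp decode([], bin, acc, [{rest, outer} | frames], rows, types) do
    tuple = acc |> :lists.reverse() |> List.to_tuple()
    decode(rest, bin, [tuple | outer], frames, rows, types)
  end

  # No frames left: a top-level row is complete.
  defp decode([], <<>>, acc, [], rows, _types),
    do: :lists.reverse([:lists.reverse(acc) | rows])

  defp decode([], bin, acc, [], rows, types),
    do: decode(types, bin, [], [], [:lists.reverse(acc) | rows], types)
end
```

Because frames form their own stack, nested tuples fall out of the same three clauses; whether arrays and maps could share this frame shape is left open here.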
Tests to add:
- tuple decode round trips for scalar and nested types;
- incomplete tuple payload still returns the same continuation/error behavior;
- benchmark coverage for tuple-heavy rows.