-1

I have a Flink job with multiple downstream operators I want to route tuples to based on a condition. Side outputs are advertised for this use case in the Flink documentation.

However, when sending data to side outputs using a ProcessFunction, throughput in the Flink Web UI decreases proportionally with the number of output channels. This happens when

  • using the primary output for the first downstream operator and side outputs for the others
  • as well as when just using side outputs for all downstream operators, leaving the main output idle.

When connecting all downstream operators to the upstream operator or just sending data to the main output channel, the UI does not show a decrease in throughput

Minimal reproducible example

Sending rate 10K rec/s, Watermark issued every second (based on event time), all downstream operators receive all records (i.e., no filters), 3 downstream operators. To illustrate the issue, no filters are used for routing to different channels.

Case 1: Upper most downstream operator (Map) uses main output channel or a side output of TumblingEventTimeWindows -> Process with main output channel idle, other 2 downstream operators use side output channels of TumblingEventTimeWindows -> Process. Downstream operators received records are 1/3 of TumblingEventTimeWindows -> Process records.

enter image description here

Case 2: Just send data to main output channel in ProcessFunction and do not send data to side outputs. Connect upper most downstream operator (Map) to main output channel. TumblingEventTimeWindows -> Process sent records are approximately equal to received records of upper most downstream operator.

enter image description here

Case 3: Directly connect all downstream operators (Map) to main output channel of TumblingEventTimeWindows -> Process. TumblingEventTimeWindows -> Process sent records are equal to received records of all downstream operators.

enter image description here


Is this behavior expected?

asked Nov 6, 2025 at 16:40
0

1 Answer 1

0

Answering my own question:

When I look at the actual throughput metric (numRecordsInPerSecond) of the Map functions, it is similar for both case 1 and case 3. My mistake was to infer something about the throughput based on the metrics shown in the overview. However, the number of records that go out in the upstream operator increases with the number of side outputs. I wrongly assumed that each unique record is only counted once, even if it sent to multiple outputs (which actually does not make sense, because Flink cannot infer this easily).

answered Nov 8, 2025 at 16:48
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.