Configure queue length in remote chunking #4780
-
Currently I'm using the remote chunking pattern to process a large number of records. However, it looks like Spring Batch is not able to leverage the full number of workers: running 10 workers vs. 20 workers vs. 100 workers doesn't yield any noticeable performance increase.
It looks like the master node in the remote chunking setup only sends a limited number of chunk requests (ChunkRequest) at a time. Setting the throttleLimit doesn't seem to make any difference, and the method carries a deprecation warning.
Sample remote chunking manager setup:

public Step sampleManagerStep() {
    return this.managerStepBuilderFactory.get("managerStep")
            .chunk(20)
            .reader(sampleReader())
            .outputChannel(sampleManagerOutboundRequests())
            .inputChannel(sampleManagerInboundResponse())
            .faultTolerant().skip(Exception.class).skipLimit(200)
            .throttleLimit(100)
            .build();
}
Essentially I'm looking for a way to configure the manager to send all of the pending items to the queue at once. Currently it sends only a limited number of requests, waits for their responses, and only then sends the next batch of requests.
Replies: 1 comment 1 reply
-
The throttle limit is the key parameter here. Which warning are you referring to? The throttle limit here is different from the one that is deprecated in a local concurrent step.
Another parameter that can play a role, in my experience, is the prefetch size of the consumers (for RabbitMQ, for example, it is the spring.rabbitmq.listener.simple.prefetch property). Which broker are you using?
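As a side note on the prefetch point: the prefetch count caps how many unacknowledged chunk requests each consumer pulls ahead, so a large value can leave most requests parked on a few workers while others sit idle. A minimal sketch of setting it programmatically, assuming a RabbitMQ broker and a worker whose inbound adapter is backed by a SimpleMessageListenerContainer (the bean and queue names are hypothetical; with Spring Boot, spring.rabbitmq.listener.simple.prefetch achieves the same):

import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;
import org.springframework.context.annotation.Bean;

// Listener container feeding the worker's inbound channel adapter (hypothetical names).
@Bean
public SimpleMessageListenerContainer chunkRequestListenerContainer(ConnectionFactory connectionFactory) {
    SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(connectionFactory);
    container.setQueueNames("chunk-requests"); // hypothetical request queue name
    container.setConcurrentConsumers(10);      // consumer threads per worker instance
    container.setPrefetchCount(1);             // small prefetch spreads requests evenly across workers
    return container;
}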
-
Thanks @fmbenhassine. I'm using RemoteChunkingManagerStepBuilderFactory and setting the throttleLimit to, for example, 20. However, it looks like the master node only sends a maximum of 7 chunks to my JMS queue at a time (the exit message on BATCH_STEP_EXECUTION also states: Waited for 7 results).
As for the broker, I'm using JMS. How can the prefetch size have an impact in this case?
The deprecation warning I'm seeing is below:
In the case of an asynchronous taskExecutor(TaskExecutor) the number of concurrent tasklet executions can be throttled (beyond any throttling provided by a thread pool). The throttle limit should be less than the data source pool size used in the job repository for this step.
Deprecated with no replacement since 5.0, scheduled for removal in 6.0. Use a custom RepeatOperations implementation (based on a TaskExecutor with a bounded task queue) and set it on the step with stepOperations(RepeatOperations).
Params: throttleLimit – maximum number of concurrent tasklet executions allowed
Returns: this for fluent chaining
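For reference, my understanding is that this note targets the local multi-threaded step throttle, not the remote chunking one. A rough sketch of the replacement it suggests, assuming Spring's ThreadPoolTaskExecutor and TaskExecutorRepeatTemplate (the pool and queue sizes below are made up), which would then be set on the step via stepOperations(...):

import java.util.concurrent.ThreadPoolExecutor;

import org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate;
import org.springframework.context.annotation.Bean;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

// RepeatOperations backed by a TaskExecutor with a bounded queue, as the deprecation
// note suggests; the queue capacity is what bounds concurrent tasklet executions.
@Bean
public TaskExecutorRepeatTemplate stepOperations() {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    taskExecutor.setCorePoolSize(4);  // illustrative concurrency cap
    taskExecutor.setMaxPoolSize(4);
    taskExecutor.setQueueCapacity(4); // bounded queue replaces the old throttle limit
    // run the task on the caller when the queue is full instead of rejecting it
    taskExecutor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
    taskExecutor.afterPropertiesSet();

    TaskExecutorRepeatTemplate repeatTemplate = new TaskExecutorRepeatTemplate();
    repeatTemplate.setTaskExecutor(taskExecutor);
    return repeatTemplate;
}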
It looks like, when using RemoteChunkingManagerStepBuilderFactory and setting the throttleLimit, the limit is actually being set in public abstract class AbstractTaskletStepBuilder<B extends AbstractTaskletStepBuilder<B>> extends StepBuilderHelper<B>, whereas in theory it should be set in public class RemoteChunkingManagerStepBuilder<I, O> extends FaultTolerantStepBuilder<I, O>.
The above also explains why setting the throttleLimit on RemoteChunkingManagerStepBuilderFactory doesn't make any difference: RemoteChunkingManagerStepBuilder still uses DEFAULT_THROTTLE_LIMIT = 6.
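If the root cause is really the overload resolution described above, a possible workaround, assuming RemoteChunkingManagerStepBuilder declares its own throttleLimit(long) overload next to the inherited, deprecated throttleLimit(int), would be to pass a long literal and to call it while the builder type is still the remote chunking one (i.e. before faultTolerant()), reusing the reader and channel beans from the snippet above:

// Sketch only: the long literal should resolve to RemoteChunkingManagerStepBuilder.throttleLimit(long)
// rather than the deprecated AbstractTaskletStepBuilder.throttleLimit(int) (assumption noted above).
public Step sampleManagerStep() {
    return this.managerStepBuilderFactory.get("managerStep")
            .chunk(20)
            .reader(sampleReader())
            .outputChannel(sampleManagerOutboundRequests())
            .inputChannel(sampleManagerInboundResponse())
            .throttleLimit(100L) // note the long literal, called before faultTolerant()
            .faultTolerant().skip(Exception.class).skipLimit(200)
            .build();
}

This is only a sketch based on that assumption; the hard-to-override default of 6 still looks like something that should be addressable directly through the factory.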