feat(athena): add start_query_executions for parallel query execution #3190

Open
ggiallo28 wants to merge 3 commits into aws:main
from ggiallo28:feat/athena-start-query-executions

@ggiallo28 ggiallo28 commented Aug 28, 2025

Feature or Bugfix

  • Feature

Detail

  • Added start_query_executions to submit multiple Athena queries in one call.
  • Enabled parallel query submission and waiting, which shortens end-to-end execution time for batches of queries.
  • Introduced configurable concurrency so the thread count can be matched to available system resources.

Relates

  • Improves efficiency and responsiveness for workflows requiring multiple Athena queries.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Introduce `wr.athena.start_query_executions` as a parallelized variant of
`start_query_execution`. It allows submitting multiple queries in one call,
with support for:
- Sequential or threaded submission (`use_threads`)
- Lazy or eager consumption of results (`as_iterator`)
- Per-query `client_request_token` (string or list)
- Optional workgroup checks (`check_workgroup`, `enforce_workgroup`)
- Full Athena cache integration
This improves performance when dispatching batches of queries by reducing
workgroup lookups and enabling concurrent execution.
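
The batching behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `submit_all` and `fake_submit` are hypothetical names, and `fake_submit` stands in for the real Athena call so the example runs without AWS credentials. Only the `use_threads` and `as_iterator` switches are taken from the description; how they are handled here is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Optional, Sequence, Union


def submit_all(
    sqls: Sequence[str],
    submit: Callable[[str], str],
    use_threads: Union[bool, int] = True,
    as_iterator: bool = False,
) -> Union[List[str], Iterator[str]]:
    """Submit each query via `submit` and return the results in input order."""

    def _sequential() -> Iterator[str]:
        # Lazy: a query is submitted only when the caller asks for its result.
        for sql in sqls:
            yield submit(sql)

    def _threaded(max_workers: Optional[int]) -> Iterator[str]:
        # pool.map schedules every query up front and yields results in input order.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            yield from pool.map(submit, sqls)

    if not use_threads:
        ids = _sequential()
    else:
        # True means "let the executor choose"; an int is taken as the worker count.
        ids = _threaded(None if use_threads is True else int(use_threads))
    # Eager callers get a list; lazy callers get the iterator itself.
    return ids if as_iterator else list(ids)


def fake_submit(sql: str) -> str:
    # Hypothetical stand-in for a real submission; returns a fake execution ID.
    return f"qid::{sql}"


print(submit_all(["SELECT 1", "SELECT 2"], fake_submit, use_threads=2))
```

With `as_iterator=True` nothing is submitted until the iterator is first advanced, so a caller can start consuming IDs without holding the whole batch in memory.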
...nd parallel wait
- Simplified client_request_token handling:
 - Removed manual padding/truncation.
 - Let Athena enforce length constraints.
 - Tokens generated as `<base_token>-<index>` or provided as list.
- Improved wait logic:
 - Added optional wait handling directly inside _submit.
 - Queries can now be waited on in parallel with submission, which reduces overhead.
- Configurable default threads:
 - Replaced hardcoded defaults with os.cpu_count().
 - Added support for AWSWRANGLER_THREADS_DEFAULT env var override.
- Removed unused `reduce` import from Athena module.
- Applied ruff formatting to `start_query_executions`.
- Fixed static check issues to pass CI.
- Added ruff check on Athena tests file.
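
For reviewers, the token and thread-default rules listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `resolve_tokens` and `default_threads` are hypothetical helper names, the index is assumed to start at 0, and the fallback to 1 when `os.cpu_count()` returns `None` is an assumption as well.

```python
import os
from typing import List, Mapping, Optional, Sequence, Union


def resolve_tokens(
    n_queries: int,
    client_request_token: Union[None, str, Sequence[str]],
) -> List[Optional[str]]:
    """Return one token per query: derived from a base string, or taken from a list."""
    if client_request_token is None:
        return [None] * n_queries
    if isinstance(client_request_token, str):
        # <base_token>-<index>; no padding or truncation, length checks are left to Athena.
        return [f"{client_request_token}-{i}" for i in range(n_queries)]
    tokens = list(client_request_token)
    if len(tokens) != n_queries:
        # Assumption: a mismatched list is rejected rather than silently padded.
        raise ValueError("expected one client_request_token per query")
    return tokens


def default_threads(env: Optional[Mapping[str, str]] = None) -> int:
    """Thread count from AWSWRANGLER_THREADS_DEFAULT, else os.cpu_count()."""
    env = os.environ if env is None else env
    raw = env.get("AWSWRANGLER_THREADS_DEFAULT")
    if raw is not None:
        return int(raw)
    # os.cpu_count() may return None on some platforms; falling back to 1 is an assumption.
    return os.cpu_count() or 1


print(resolve_tokens(3, "batch"))
print(default_threads({"AWSWRANGLER_THREADS_DEFAULT": "4"}))
```

Passing the environment as a parameter keeps the helper easy to test without mutating `os.environ`.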