feat: HTTP SenderPool with asyncio support #66


Draft
amunra wants to merge 6 commits into main from async_pool

Conversation

@amunra (Collaborator) commented Mar 20, 2024 (edited)

Overview

A new API that makes it easier to work with the sender asynchronously, with true parallelism.

import asyncio

from questdb.ingress.pool import SenderPool

async def main():
    with SenderPool('http::addr=localhost:9000;') as pool:
        # Buffers can be safely constructed independently,
        # also on separate threads.
        buf1 = pool.transaction('tbl1')
        buf1.row(...)
        buf2 = pool.transaction('tbl2')
        buf2.dataframe(...)
        # parallelism: both commits are in flight before either is awaited
        fut1 = buf1.commit()
        fut2 = buf2.commit()
        await fut1
        await fut2

asyncio.run(main())

Details

  • The buffer can only accumulate rows for a given table.
  • Each flush represents an atomic database transaction.
  • Flush operations can happen in parallel (network operations release the GIL).
  • The ownership of the buffer is "moved" to the pool.
  • Introducing parallelism alleviates the main performance penalty of ILP over HTTP: network round-trip times.
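The parallel-flush claim rests on CPython releasing the GIL during blocking network I/O. A minimal stdlib sketch of that mechanism, using time.sleep as a stand-in for a blocking HTTP flush (this is not the pool code, just an illustration):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_flush(delay: float) -> float:
    # Stand-in for a blocking HTTP flush; like socket I/O,
    # time.sleep releases the GIL while it waits.
    time.sleep(delay)
    return delay

with ThreadPoolExecutor(max_workers=2) as executor:
    start = time.monotonic()
    # Two "flushes" submitted to separate threads run concurrently.
    futs = [executor.submit(blocking_flush, 0.2) for _ in range(2)]
    results = [f.result() for f in futs]
    elapsed = time.monotonic() - start

# Both complete in roughly 0.2s of wall time, not 0.4s,
# because neither thread holds the GIL while waiting.
```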

API downsides

  • This is a new "parallel" API for more advanced use cases. Creates an API split:
    • Server-style applications written in Python would use this new API.
    • Simpler "jupyter notebook" style stuff would continue using the existing API.
    • Both APIs would continue being supported (since this new one is just a wrapper around the other anyway).
  • Shoe-horning these features into the regular API would be a struggle.
  • This API drops auto-flushing completely, since auto-flushing
    creates silent network-blocking operations in the API.

Thread safety and Parallelism

  • Once a pool object is created, it can be shared between threads.
  • The pool.next_buffer() and pool.flush() methods are thread safe.
  • This allows for N:M concurrency
    • N buffer writer threads
    • M threads responsible for concurrently writing to the database (inside the pool).
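The N:M split above can be illustrated with stdlib primitives only: N writer threads build buffers and hand them over, M flusher threads drain them. The queue and list here are hypothetical stand-ins for the pool internals and the database, not the actual implementation:

```python
import queue
import threading

# N writer threads produce "buffers" (plain lists of rows);
# M flusher threads consume and "flush" them.
N_WRITERS, M_FLUSHERS, ROWS = 3, 2, 5
pending: queue.Queue = queue.Queue()
flushed = []
flushed_lock = threading.Lock()

def writer(worker_id: int) -> None:
    buf = [f"row-{worker_id}-{i}" for i in range(ROWS)]
    pending.put(buf)  # hand ownership of the buffer to the "pool"

def flusher() -> None:
    while True:
        buf = pending.get()
        if buf is None:  # sentinel: shut down this flusher
            return
        with flushed_lock:
            flushed.extend(buf)  # stand-in for the network flush

writers = [threading.Thread(target=writer, args=(i,)) for i in range(N_WRITERS)]
flushers = [threading.Thread(target=flusher) for _ in range(M_FLUSHERS)]
for t in writers + flushers:
    t.start()
for t in writers:
    t.join()
for _ in flushers:
    pending.put(None)  # one sentinel per flusher
for t in flushers:
    t.join()

# all 15 rows arrive exactly once, in some interleaved order
```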

Tasks

  • Review the API. Is this even a good idea?
  • Split out SenderPool into new questdb.ingress.pool module.
  • Improve test coverage, including TransactionalBuffer.dataframe(..).
  • Triple-check thread safety of the pool.next_buffer() and pool.flush() implementations.
  • Skip the indirection and call asyncio.wrap_future directly in def flush() (since that's how it's implemented anyway): https://github.com/python/cpython/blob/8ad88984200b2ccddc0a08229dd2f4c14d1a71fc/Lib/asyncio/base_events.py#L896 - this allows implementing .flush() in terms of .flush_to_future() and cuts code duplication.
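The asyncio.wrap_future bridge mentioned in the last task can be shown in isolation with stdlib code (no SenderPool involved): a blocking call runs on a worker thread, and its concurrent.futures.Future is wrapped into an awaitable asyncio future.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def blocking_flush() -> str:
    # Stand-in for the blocking HTTP flush done on a pool thread.
    return "flushed"

async def main() -> str:
    with ThreadPoolExecutor(max_workers=1) as executor:
        # submit() returns a concurrent.futures.Future;
        # asyncio.wrap_future turns it into an awaitable asyncio future.
        cf_future = executor.submit(blocking_flush)
        return await asyncio.wrap_future(cf_future)

result = asyncio.run(main())
# result == "flushed"
```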

Closes #64

Successfully merging this pull request may close these issues:
  • is there an async sender.flush()?