
HTTP timeouts not respected in flush #75

Open
@nicholas-a-guerra

Description

Using the questdb python client version 2.0.0, it seems the default HTTP timeouts are not respected when the network connection degrades. The minimal reproducible code is as follows:

from time import perf_counter

from loguru import logger
from questdb.ingress import Buffer, Protocol, Sender

# Setup:
self.sender = Sender(Protocol.Http, self.config['IP'], self.config['HTTP_Port'], auto_flush=False)
self.sender.establish()
self.buffer = Buffer()

# ... later, in the flush loop:
success = False
try:
    self.flush_start = perf_counter()
    with self.buffer_lock:
        after_lock = perf_counter()
        self.sender.flush(self.buffer)
    success = True
except Exception:
    logger.opt(exception=True).warning(f"QuestDB write error | {self.config['IP']}")
logger.debug(f"QuestDB flush | {self.config['IP']} | Success: {success} | Lock obtained: {after_lock-self.flush_start}s | Flush: {perf_counter()-after_lock}s")

That feeder loop is irrelevant to the problem I'm having, and I've printed the lock-acquire time to rule it out as an issue. The results show a growing flush time; for example, this is an exact output from my logs:

2024-03-30 12:26:52.741 EDT [DEBUG] | QuestDB_Logger:flush_loop:133 | QuestDB flush | kronus-nexus | Success: False | Lock obtained: 9.148032404482365e-06s | Flush: 822.764772413997s

You can see the lock acquisition takes almost no time, so that's not the problem; the flush time, however, keeps growing, reaching 822 seconds here. For reference, these are the ping stats during this period:

21 packets transmitted, 17 received, 19.0476% packet loss, time 20072ms
rtt min/avg/max/mdev = 59.412/318.782/639.467/145.486 ms

The expectation would be that, with the default HTTP configuration, a flush never takes longer than roughly 10 seconds, no matter what. Instead, this time keeps increasing. It may be growing in step with the growing buffer, but that's difficult to confirm without knowing how the internals of flush work. It also seems that once the connection degrades and the buffer starts growing, all consecutive flush attempts just fail. If the buffer gets small enough and the connection gets slightly better, the flush finally succeeds as normal.
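One guess: skimming the client configuration docs, the effective per-request timeout appears to be request_timeout plus a size-based allowance of roughly buffer_size / request_min_throughput. If that applies here, the documented defaults (request_timeout=10000 ms, request_min_throughput=102400 bytes/s) would turn an ~83 MB backlog into 10 + 83,000,000 / 102,400 ≈ 820 s, which is suspiciously close to the 822 s above. If so, something like the following should cap the wait regardless of buffer size (a sketch only; the host/port and values are illustrative, and I'm assuming request_min_throughput=0 disables the size-based allowance):

from questdb.ingress import Sender

# Sketch: pin the HTTP timeouts explicitly via the config string.
# Timeout values are in milliseconds; host:port is a placeholder.
conf = (
    'http::addr=192.168.1.10:9000;'
    'auto_flush=off;'
    'request_timeout=10000;'      # cap on each HTTP request
    'retry_timeout=10000;'        # total retry budget
    'request_min_throughput=0;'   # assumption: 0 disables the size-based allowance
)
with Sender.from_conf(conf) as sender:
    sender.flush(buffer)  # 'buffer' as in the repro above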
