
HTTP timeouts not respected in flush #75

Open
@nicholas-a-guerra

Description

Using the questdb python client version 2.0.0, it seems the default HTTP timeouts are not respected when the network connection degrades. The minimal reproducible code is as follows:

from time import perf_counter

from loguru import logger
from questdb.ingress import Buffer, Protocol, Sender

# Setup:
self.sender = Sender(Protocol.Http, self.config['IP'], self.config['HTTP_Port'], auto_flush=False)
self.sender.establish()
self.buffer = Buffer()

# ... later, in the flush loop:
success = False
try:
    self.flush_start = perf_counter()
    with self.buffer_lock:
        after_lock = perf_counter()
        self.sender.flush(self.buffer)
    success = True
except Exception:
    logger.opt(exception=True).warning(f"QuestDB write error | {self.config['IP']}")
logger.debug(f"QuestDB flush | {self.config['IP']} | Success: {success} | Lock obtained: {after_lock-self.flush_start}s | Flush: {perf_counter()-after_lock}s")

That feeder loop is irrelevant to the problem I'm having, and I've printed the lock-acquire time to rule it out as an issue. The results show a growing flush time; for example, this is an exact output from my logs:

2024-03-30 12:26:52.741 EDT [DEBUG] | QuestDB_Logger:flush_loop:133 | QuestDB flush | kronus-nexus | Success: False | Lock obtained: 9.148032404482365e-06s | Flush: 822.764772413997s

You can see the lock acquisition takes almost no time, so that's not the problem; the flush time, however, keeps growing, reaching 822 seconds here. For reference, these are the ping stats during this period:

21 packets transmitted, 17 received, 19.0476% packet loss, time 20072ms
rtt min/avg/max/mdev = 59.412/318.782/639.467/145.486 ms

The expectation would be that, with the default HTTP configuration, a flush never takes longer than roughly 10 seconds, no matter what. Instead, this time keeps increasing. It may be growing in step with the growing buffer, but that's difficult to confirm without knowing how the internals of flush work. It also seems that once the connection degrades and the buffer starts growing, all consecutive flush attempts just fail. If the buffer gets small enough and the connection gets slightly better, the flush finally succeeds as normal.
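One guess: skimming the client configuration docs, the effective per-request timeout appears to be request_timeout plus a size-based allowance of roughly buffer_size / request_min_throughput. If that applies here, the documented defaults (request_timeout=10000 ms, request_min_throughput=102400 bytes/s) would turn an ~83 MB backlog into 10 + 83,000,000 / 102,400 ≈ 820 s, which is suspiciously close to the 822 s above. If so, something like the following should cap the wait regardless of buffer size (a sketch only; the host/port and values are illustrative, and I'm assuming request_min_throughput=0 disables the size-based allowance):

from questdb.ingress import Sender

# Sketch: pin the HTTP timeouts explicitly via the config string.
# Timeout values are in milliseconds; host:port is a placeholder.
conf = (
    'http::addr=192.168.1.10:9000;'
    'auto_flush=off;'
    'request_timeout=10000;'      # cap on each HTTP request
    'retry_timeout=10000;'        # total retry budget
    'request_min_throughput=0;'   # assumption: 0 disables the size-based allowance
)
with Sender.from_conf(conf) as sender:
    sender.flush(buffer)  # 'buffer' as in the repro above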
