Is there any guide or documentation on how to use pgvecto.rs with Citus? · tensorchord/pgvecto.rs · Discussion #571

cho-thinkfree-com
Aug 23, 2024

First of all, thank you for developing such a great open-source project.

I am currently developing a service using pgvecto.rs.

In the documentation provided by the team managing this project, I found a guide on how to install and operate pgvecto.rs in a Kubernetes environment. Due to the size of the data, I am planning to use Citus for sharding.

Is there any guide or documentation on how to use pgvecto.rs with Citus?

As far as I know, pgvecto.rs has some separately managed files, unlike pgvector. (Please correct me if I am mistaken.) Even if Citus can be used, I am curious about how to handle backups in this setup.

Replies: 2 comments · 3 replies

gaocegege
Aug 23, 2024
Maintainer

cc @VoVAllen

0 replies

VoVAllen
Aug 23, 2024
Maintainer

We haven't tested with Citus yet. Have you tested pgvector with Citus? Could you also share the scale of your data: how many vectors do you plan to store, and with how many dimensions? Thanks.

3 replies

cho-thinkfree-com Aug 23, 2024
Author

Thank you for your prompt response.

I encountered an issue with incorrect behavior in the WHERE clause while using pgvector in SQL. Therefore, I am preparing to use pgvecto.rs.
Each row has two 1024-dimensional dense vector columns and two 25,000-dimensional sparse vector columns. The total number of rows is expected to exceed 100 million.
I believe this might be too large to store in a single database. (I lack expertise in databases, so please feel free to correct me if I’m wrong.)
Therefore, I am considering applying sharding.
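
As a rough illustration of the scale described above, here is a back-of-envelope estimate of the raw vector payload. It is only a sketch under stated assumptions: dense components are taken to be 32-bit floats, each sparse entry is assumed to cost 8 bytes (a 4-byte index plus a 4-byte value), and `AVG_NONZEROS` is a hypothetical placeholder, since the thread does not say how many non-zero entries the sparse vectors hold.

```python
# Back-of-envelope size of the raw vector data described in this comment.
# Assumptions (not stated in the thread): float32 dense components,
# sparse entries stored as a 4-byte index plus a 4-byte value, and a
# hypothetical average of 200 non-zero entries per sparse vector.

ROWS = 100_000_000            # "expected to exceed 100 million"
DENSE_COLUMNS = 2
DENSE_DIMS = 1024
SPARSE_COLUMNS = 2
AVG_NONZEROS = 200            # hypothetical; measure this on real data
BYTES_PER_FLOAT32 = 4
BYTES_PER_SPARSE_ENTRY = 8    # assumed: uint32 index + float32 value


def dense_bytes(rows: int) -> int:
    """Raw bytes for all dense vector columns."""
    return rows * DENSE_COLUMNS * DENSE_DIMS * BYTES_PER_FLOAT32


def sparse_bytes(rows: int, avg_nonzeros: int) -> int:
    """Raw bytes for all sparse vector columns, given an average non-zero count."""
    return rows * SPARSE_COLUMNS * avg_nonzeros * BYTES_PER_SPARSE_ENTRY


def to_gib(n_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    d = dense_bytes(ROWS)
    s = sparse_bytes(ROWS, AVG_NONZEROS)
    print(f"dense:  {to_gib(d):,.0f} GiB")
    print(f"sparse: {to_gib(s):,.0f} GiB")
    print(f"total:  {to_gib(d + s):,.0f} GiB (excluding indexes and row overhead)")
```

Under these assumptions the dense columns alone come to roughly 819 GB before any index or per-row overhead, which is the kind of number that makes sharding worth considering.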

For your reference, it seems that Citus supports pgvector.
(pgvector is mentioned on the Citus website - https://www.citusdata.com/product/community )

gaocegege Aug 30, 2024
Maintainer

Hi @cho-thinkfree-com, I'm a maintainer with @tensorchord/pgvecto-rs-maintainers. Would you be interested in a meeting with us? I’d love to learn more about your use case and see how we can assist you.

cho-thinkfree-com Sep 2, 2024
Author

Thank you for your response.

First of all, please note that I am using a translation tool to ask questions and provide answers as I am not good at English. :-) Even if we proceed with a meeting, real-time conversation might be very challenging due to the "language" barrier. :-)

We are building a document search service.
The basic search methods we have in mind are very similar to those described in the following links: Sparse Vector Use Case and Adaptive Retrieval Use Case.

The details of what we are using are documented below:
GitHub Issue Comment

The scale of data we are considering involves creating 160,000 tables. We are deliberating whether to create 160,000 separate tables or partition a single table into multiple parts. We are unsure which option would be better if we use pgvecto.rs. (The reason we inquired about Citus was also to handle large volumes of data.)
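
To make the "one table, many parts" option more concrete, here is a minimal sketch of hash-based routing: each of the 160,000 logical collections is mapped to one of a fixed number of shards by hashing its key. Citus distributes a table's rows across shards by hashing a distribution column, but the code below is not Citus's hash function. `shard_for`, `distribution`, `SHARD_COUNT`, and the `collection-N` key format are all hypothetical names used only for illustration.

```python
import hashlib
from collections import Counter

SHARD_COUNT = 32  # hypothetical; choose to match the cluster


def shard_for(collection_id: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a collection key to a shard with a stable hash (illustrative only)."""
    digest = hashlib.sha256(collection_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % shard_count


def distribution(n_collections: int, shard_count: int = SHARD_COUNT) -> Counter:
    """Count how many collections land on each shard."""
    return Counter(
        shard_for(f"collection-{i}", shard_count) for i in range(n_collections)
    )


if __name__ == "__main__":
    counts = distribution(160_000)
    print(
        f"shards used: {len(counts)}, "
        f"min={min(counts.values())}, max={max(counts.values())}"
    )
```

With this kind of layout the number of physical tables stays fixed at the shard count, however many logical collections exist. Whether pgvecto.rs indexes behave well under either layout is untested as far as this thread goes.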
