1. Home
2. Questions
3. AI Assist
4. Tags
5. Challenges
6. Chat
7. Articles
8. Users
9. Companies
11. Communities for your favorite technologies. Explore all Collectives
Stack Internal

Stack Overflow for Teams is now called Stack Internal. Bring the best of human thought and AI automation together at your work.
Try for free Learn more
Bring the best of human thought and AI automation together at your work. Learn more

Update materialized view with latest s3 file starrocks

Asked 11 months ago

Viewed 76 times

Part of AWS Collective

So with starrocks I have a pipe that looks for new files in my s3 bucket, now I would like to have this file in a temporary table which would get overwritten with the latest file that lands in my s3 bucket. But a starrocks pipe does not support INSERT OVERWRITE so how would one achieve something like this. My materialized view looks something like this.

CREATE MATERIALIZED VIEW mv_segmentation_manager REFRESH ASYNC AS
SELECT * 
FROM segmentation_manager
WHERE segment_name NOT IN (SELECT DISTINCT segment_name FROM temp_segments)
UNION ALL
SELECT * 
FROM temp_segments;

My pipe has the following configuration

CREATE PIPE segment_manager
PROPERTIES
(
 "AUTO_INGEST" = "TRUE",
 "POLL_INTERVAL" = "5"
)
AS
INSERT OVERWRITE temp_segments
SELECT * FROM FILES
(
 "path" = "s3://my-bucket/*.snappy.parquet",
 "format" = "parquet",
 "aws.s3.region" = "region",
 "aws.s3.use_instance_profile" = "true"
);

So basically what needs to happen is that my materialized view checks for the segment_name in the latest s3 file and excludes all the rows in the segmentation_manager table and unions the new s3 file this is just a upsert without checking. But currently I am inserting this s3 file into a temp_table which if not truncated will become very big depending on the amount of s3 files that land in the bucket so this should be empty whenever a s3 file lands in the bucket. Truncating on a schedule would not be good as this could overlap with an s3 file that lands in the bucket and potentially lose data. Any ideas how this could be achieved in starrocks? Looking forward to hearing you guys ideas!

Improve this question

edited Jan 14, 2025 at 11:23

Shadow's user avatar

Shadow

34.5k10 gold badges67 silver badges76 bronze badges

asked Jan 14, 2025 at 11:05

David Backx's user avatar

David Backx

551 silver badge5 bronze badges

I removed mysql tag as there is no reference in the question to mysql and mysql does not support the view creation syntax in the question

Shadow
– Shadow

2025年01月14日 11:24:10 +00:00
Commented Jan 14, 2025 at 11:24

Add a comment |

0

Sorted by: Reset to default

Know someone who can answer? Share a link to this question via email, Twitter, or Facebook.

Your Answer

Draft saved

Draft discarded

Sign up or log in

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.

CollectivesTM on Stack Overflow

Update materialized view with latest s3 file starrocks

0

Know someone who can answer? Share a link to this question via email, Twitter, or Facebook.

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions