Stack Overflow
0 votes
1 answer
121 views

I have a large table which I want to move to a partitioned model. I created the partitioned table with the same fields as the original, partitioning by range on a particular timestamp field. I then ...
0 votes
0 answers
65 views

I am trying to find updated information regarding AWS best practices when it comes to multi-tenant data partitioning in S3. From what I know and what I studied when I did my AWS Solutions ...
1 vote
1 answer
140 views

I have a BigQuery table where a PubSub subscription inserts new web events every second. This table is partitioned by: column: derived_tstamp type: timestamp granularity: daily To create a specific ...
2 votes
1 answer
374 views

I have a big database that represents a graph with a ton of data in it that is constantly growing. The database looks something like: CREATE TABLE node ( id BIGSERIAL PRIMARY KEY, created_at ...
0 votes
1 answer
445 views

In BigQuery, suppose I create a table and partition it by a date column "mydate" with a "DAY" granularity. Using dbt, this can be done with: partition_by = { "...
-1 votes
2 answers
219 views

I am about to create a DynamoDB table which has the columns below, and each row will have unique data: user id profile Id attribute1 1001 9001 x 1002 9002 x The table will have 1M records, which means unique ...
0 votes
2 answers
116 views

I have a dataset as below from which I would like to draw some inferences. Id Nbr Dt Status Cont1Sta1 DateLagInDays Recurrence 1 2 2023-10-01 1 1 2 2023-11-02 0 1 2 2023-12-13 0 1 3 2023-10-01 0 1 3 2023-...
0 votes
1 answer
260 views

We have a situation in the database where we have to partition the data in all tables of one schema based on the tenant id. Using create_table "billing_schedule_lines_old", id: :...
1 vote
1 answer
116 views

Problem: We have a table "test" that consists of the sections "test_202309", "test_202310" and "test_202311". The sections store data for September 2023, October 2023 and November 2023. I am using the command "...
1 vote
2 answers
2k views

Snowflake stores data using a hybrid-columnar storage method. I understand what columnar storage is and its benefits, but what does the hybrid mean? Is this simply referring to Snowflake accessing ...
0 votes
1 answer
391 views

A table contains xyz columns, with 3 years of data. Index = clustered column index Hash distribution column = product. Partition column = date. As the new year data arrive ...
0 votes
0 answers
23 views

I've tried several routes to getting the 10 records from each subset of a large dataset and the best I can do is querying each subgroup explicitly in the query. My first attempt from the (Teradata ...
0 votes
1 answer
386 views

I am trying to split my data into training, test and validation groups within my data. I have 2 groups: control and TP and within these groups I have a secondary variable called Bio with numbers in ...
0 votes
1 answer
183 views

I am faced with the following situation: among the BigQuery datasets which I am handling there is a rather large table - let us call it lt - that undergoes daily updates (more specifically, this table ...
0 votes
1 answer
2k views

I'm following the Apache Hudi documentation to write and read a Hudi table. Here's the code I'm using to create and save a PySpark DataFrame into Azure DataLake Gen2: tableName = "my_hudi_table" ...

