190 questions
1
vote
1
answer
279
views
Implement Slowly Changing Dimensions (SCD) - Type 2 Using DuckDB
I want to implement SCD Type 2 and keep track of historized data. I am using DuckDB for this task, but I found out that DuckDB does not support the MERGE statement.
The idea I have is to have two separate ...
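For readers skimming this excerpt: when MERGE is unavailable, SCD Type 2 is commonly split into two steps, closing the current version of each changed row and then inserting a new open version. The sketch below shows that logic on plain Python dicts rather than in DuckDB. It is only an illustration under assumptions: the function name `apply_scd2`, the row layout (`key`, `attrs`, `valid_from`, `valid_to`), and the use of `None` as the open end date are hypothetical and not taken from the question.

```python
from datetime import date

OPEN_END = None  # assumption: a current row has no end date


def apply_scd2(dimension, incoming, load_date):
    """Apply one batch of source rows to a Type 2 dimension.

    dimension: list of dicts with keys key, attrs, valid_from, valid_to
    incoming:  dict mapping business key -> attrs dict
    Returns a new list; the input rows are not modified.
    Keys missing from `incoming` are left untouched (no delete handling).
    """
    result = []
    still_current = set()
    for row in dimension:
        row = dict(row)  # copy so the caller's rows stay unchanged
        is_current = row["valid_to"] is OPEN_END
        changed = row["key"] in incoming and incoming[row["key"]] != row["attrs"]
        if is_current and changed:
            # Step 1 (the "update" half): close the current version.
            row["valid_to"] = load_date
            is_current = False
        if is_current:
            still_current.add(row["key"])
        result.append(row)
    # Step 2 (the "insert" half): open a new version for new or changed keys.
    for key, attrs in incoming.items():
        if key not in still_current:
            result.append({
                "key": key,
                "attrs": dict(attrs),
                "valid_from": load_date,
                "valid_to": OPEN_END,
            })
    return result
```

In SQL the same two steps would typically be an UPDATE that sets the end date on changed current rows, followed by an INSERT of the new versions, both run in one transaction.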
0
votes
0
answers
87
views
Merge query, Not able to perform insert and update the records
I am struggling to build a single query in Databricks using a MERGE statement. I have the following 3 scenarios:
My source table is a full load:
Target table load:
If source record does not exist in target table ...
0
votes
0
answers
42
views
SCD (Scalable Color Descriptor) and CLD (Color Layout Descriptor) image extraction code in Python
I can't find good code documentation or a tutorial on SCD and CLD image extraction on the internet. Where can I find the documentation? Preferably in Python, because I need it for my project. It'...
0
votes
1
answer
45
views
Tracking the history of data in BigQuery
I have a table (my_table) in BigQuery (where I am not an admin) whose values change on a daily basis,
i.e. my_table looks something like:
item: apple
quantity: 20
tomorrow it could be:
item: apple
quantity: 10
...
0
votes
0
answers
67
views
Comparing rows data in a table
I have a table with more than 80 columns. The table is based on an SCD, which means that if the value of any column changes, I insert a new row with the change only in that column, rest ...
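One way to compare two versions of such a wide row is to diff them column by column. This is a minimal sketch on plain Python dicts; the function name `changed_columns` and the `ignore` parameter (for surrogate keys or date columns that always differ) are hypothetical and not part of the question.

```python
def changed_columns(old_row, new_row, ignore=()):
    """Return {column: (old_value, new_value)} for columns whose values differ.

    Columns listed in `ignore` are skipped. A column present in only one
    row is reported with None for the missing side.
    """
    skipped = set(ignore)
    return {
        col: (old_row.get(col), new_row.get(col))
        for col in old_row.keys() | new_row.keys()
        if col not in skipped and old_row.get(col) != new_row.get(col)
    }
```

With 80+ columns this avoids writing one comparison per column by hand; the rows would be two consecutive versions of the same business key.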
0
votes
1
answer
368
views
SCD Type2 - Selecting latest row on the change date
I have the following table
CREATE TABLE #CustomerDimension (
row_id int,
customer_id INT NOT NULL,
customer_name VARCHAR(255),
address VARCHAR(255),
start_date DATE NOT NULL,
...
0
votes
0
answers
43
views
Data warehouse and SCD
Probably an odd question, but I have never encountered the date 31.12.9999 in a slowly changing dimension, or in tables with a column like EndDate, where it is supposed to mean that the record is currently valid or there ...
0
votes
1
answer
289
views
Databricks DLT CDC/SCD - Taking the latest ID per day
Hi, I'm creating a DLT pipeline that uses DLT CDC to implement SCD Type 1, taking the latest record based on a datetime column. This works with no issues:
@dlt.view
def users():
    return spark.readStream....
1
vote
2
answers
91
views
Postgres SCD Type 1 program updates all rows instead of only matching rows; unmatched rows are set to NULL
My Postgres SCD Type 1 program updates all rows instead of only the matching rows; the unmatched rows are updated to blank.
-- Create table tableToBeUpdated
CREATE TABLE "TEST"."...
0
votes
0
answers
76
views
If a parent entity is SCD type 4 with a history table, should the child entities (one-to-many) also have a history table?
Let's say we have an entity recipe (id, version, other_stuff) with a child entity ingredient (recipe_id, recipe_version, other_stuff). Any recipe can have multiple ingredients, and ingredients can only ...
0
votes
1
answer
611
views
Slowly changing dimension | SCD Type 1 deleting rows from prior date data
I have implemented SCD Type 1 using a MERGE INTO statement in Azure Databricks.
On the first load, it loads the expected data, say 5000 rows for ingest date 28 Nov, ...
-1
votes
2
answers
1k
views
Implement SCD Type 2 on periodic snapshot table
Currently I have a very big table that holds a snapshot of the data for each month.
| ID | other | Team | Period |
|----|-------|------|--------|
| 1 | ..... | A | 2020-04-30 |
| 1 | ..... | A | 2020-05-31 |
| 1 | ..... | A | 2020-06-30 |
| 1 | ..... | A | 2020-07-31 |
| 1 | ..... | B | 2020-... |
0
votes
1
answer
94
views
Create table with effective_from_date and effective_to_date from history table in BigQuery
I have a table in which data gets appended for the changes. No delete or update, only append is done by a cloud run job.
Base table
| Supplier_ID | Supplier_Name | Supplier_Contact | Last_Modified |
|-------------|---------------|------------------|---------------|
| 123 | ABC | 03 ...
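For an append-only history like this, the usual approach is to treat each row's Last_Modified as its effective-from date and the next Last_Modified for the same Supplier_ID as its effective-to date. In BigQuery this is typically written with the `LEAD` window function partitioned by the key and ordered by the timestamp. The sketch below shows the same logic in plain Python so the idea is easy to check; the function name `add_effective_ranges` and the use of `None` for the still-current row are assumptions, not something stated in the question.

```python
from collections import defaultdict


def add_effective_ranges(history, key="Supplier_ID", ts="Last_Modified"):
    """Add effective_from_date / effective_to_date to append-only history rows.

    Each row is effective from its own timestamp until the next timestamp
    for the same key. The latest row per key gets effective_to_date = None
    (assumption: None marks the current version).
    """
    by_key = defaultdict(list)
    for row in history:
        by_key[row[key]].append(row)

    out = []
    for rows in by_key.values():
        rows.sort(key=lambda r: r[ts])  # oldest version first
        for i, row in enumerate(rows):
            next_ts = rows[i + 1][ts] if i + 1 < len(rows) else None
            out.append({
                **row,
                "effective_from_date": row[ts],
                "effective_to_date": next_ts,
            })
    return out
```

The timestamps only need to be sortable (dates, datetimes, or ISO-formatted strings). Whether the end date should be exclusive or shifted back by one day depends on how the table is queried.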
0
votes
1
answer
1k
views
Capture BigQuery data changes
I have a BigQuery table for which I want to capture data changes. Let's say the table has a userid and a repeated string field for tag. Every time a tag is deleted or a new tag is created, it should save ...
0
votes
2
answers
776
views
Implementing SCD2 in Azure Data Factory: Duplicate Issue on Row Update
I'm attempting to implement SCD2 within Azure Data Factory, but I'm encountering issues with the update mechanism. Instead of updating rows, my process seems to insert all rows from the source data ...