Stack Overflow
Best practices
1 vote
2 replies
109 views

What recommendations might be offered for the most elegant and performant way in T-SQL to evaluate whether a source value should update a target value, as part of an ETL update process in which ...
504more's user avatar
  • 505
1 vote
1 answer
63 views

I have an AWS Glue job that processes thousands of small JSON files from S3 (historical data load for Adobe Experience Platform). The job is taking approximately 4 hours to complete, which is ...
Best practices
0 votes
0 replies
45 views

Question: I'm currently working on a dashboard prototype and storing data from a 22-page PDF document as 22 separate DataFrames. These DataFrames should undergo an ETL process (especially data type ...
0 votes
1 answer
87 views

I’m working on a data quality workflow where I validate incoming records for null or missing values. Even when a column clearly contains nulls, my rule doesn’t trigger and the record passes validation....
0 votes
1 answer
61 views

I’m working with IBM InfoSphere DataStage 11.7. I exported several jobs as XML files using istool export. Then, using a Python script, I modified the XML to add another database stage in parallel to ...
1 vote
1 answer
69 views

I’m trying to programmatically modify IBM DataStage jobs to add a new database connector stage in parallel to an existing Database stage. Here’s my workflow: Export a job from DataStage Designer as ...
2 votes
0 answers
100 views

I tried using Prefect with a FastAPI project. When I updated the logs and redeployed the repo as well as the Prefect deployments and flows, it runs and displays the logs (basically, Prefect is still ...
Needa's user avatar
  • 41
-4 votes
1 answer
83 views

I’m trying to programmatically add a new database stage in parallel to an existing DataStage job by modifying its exported XML. I export the job from DataStage Designer, modify the XML via a Python ...
0 votes
0 answers
52 views

I'm building ETL packages in SSIS. My data comes from an OLE DB Source that calls a stored procedure in SQL Server. I want to add a new Lookup (or a similar transformation) that uses some of the input ...
0 votes
0 answers
198 views

I have started a Prefect server on a remote desktop using prefect server start --host 0.0.0.0 --port 8080. After this I am able to access the UI from different computers on this network. I create a ...
Anzar's user avatar
  • 35
1 vote
2 answers
185 views

I have a table in Power Query like this (PO = Purchase Order, SID = Ship ID, QTY = Quantity):

PO    SID   QTY
1001  A001  2000
1001  A001  2000
1001  A001  -2000  (this line cancels the previous one)
1002  A002  3000
...
0 votes
1 answer
148 views

I am building an ETL pipeline that uses an LLM to extract some information. I have Ollama installed locally, and I am on a MacBook with an M4 Max. I don't understand why I get this error from my worker: ads-worker-1 | 2025-08-28 15:...
0 votes
0 answers
99 views

I have a Flink ETL job that reads from ~13 Kafka topics and writes data into HDFS using a FileSink with compaction enabled. Right now, we have around 40 different output paths (buckets), and roughly ...
0 votes
0 answers
56 views

I am trying to run a batch process using Apache Airflow. The Extract and Transform stages work fine, but the Load stage is giving an error. Here is my code: from airflow.decorators import dag, ...
0 votes
0 answers
92 views

I'm trying Apache Airflow for the first time and built a simple ETL. But after loading the data and proceeding to the transform phase, it throws an error saying pyarrow was not found. I'm ...
