Stack Overflow
0 votes
0 answers
81 views

I am running an Apache Spark job on Amazon EMR that needs to connect to an Amazon MSK cluster configured with IAM authentication. The EMR cluster has an IAM role with full MSK permissions, and I can ...
0 votes
0 answers
52 views

I have been using the Spark Streaming functionality in Spark v3.5 for the use case below. I am observing the issue below in one of the environments. I would appreciate some assistance with ...
0 votes
1 answer
143 views

I am trying to load data written to a Kafka topic into a Postgres table. I can see the topic is receiving new messages every second, and the data looks good. However, when I use the ...
0 votes
1 answer
125 views

ERROR SparkContext: Failed to add home/areaapache/software/spark-3.5.2-bin-hadoop3/jars/spark-streaming-kafka-0-10_2.13-3.5.2.jar \ to Spark environment import logging from pyspark.sql import ...
3 votes
0 answers
83 views

I have a Spark Streaming application running on Argo + K8S that reads Kafka topics by subscribe pattern, applies some transformations, and writes to a target. Several different producers may write ...
0 votes
1 answer
309 views

My goal is to run a Spark job using Databricks. My challenge is that I can't store files in the local filesystem, since the file is saved on the driver, but when my executors try to access the ...
1 vote
0 answers
400 views

I am working with Spark Streaming and reading data from a Kafka topic, but I am getting the error java.lang.NoClassDefFoundError: org/apache/spark/kafka010/KafkaConfigUpdater. I am running my code in K8s and provide ...
2 votes
0 answers
132 views

I have multiple topics in Kafka that I need to sink into their respective Delta tables. A) One streaming query for all topics: if I use one streaming query, then the RDD/DF should contain data from ...
2 votes
1 answer
3k views

When I try to run this .py: import logging from cassandra.cluster import Cluster from pyspark.sql import SparkSession from pyspark.sql.functions import from_json, col from pyspark.sql.types import ...
2 votes
1 answer
556 views

I am in a bind here. I am trying to implement a very basic pipeline that reads data from Kafka and processes it in Spark. The problem I am facing is that Apache Spark shuts down abruptly, giving the ...
1 vote
0 answers
99 views

I'm trying to read a stream from Kafka using PySpark. The stack I'm working with: Kubernetes; a standalone Spark cluster with 2 workers; spark-connect, connected to the cluster and has the dependencies ...
0 votes
1 answer
171 views

I can't write to Kafka from Spark. Spark is reading but not writing; if I write to the console, it doesn't give an error. Traceback (most recent call last): File "f:\Sistema de Informação\TCC\...
0 votes
1 answer
78 views

I have been trying to complete a project in which I need to send a data stream through Kafka to a local Spark instance to process the incoming data. However, I cannot show and use the data frame in the right ...
0 votes
0 answers
33 views

Hello, I am trying to use PySpark + Kafka. To do this, I execute this command to set up the Spark application. Spark version is 3.5.0 | spark-3.5.0-bin-hadoop3. Kafka version is - ...
0 votes
0 answers
273 views

I'm trying to read data from a Kafka topic using Spark Structured Streaming on an EC2 (Ubuntu) machine. If I try to read the data using Kafka only (kafka-console-consumer.sh), then there is no ...
