3,767 questions
0
votes
1
answer
39
views
flink on yarn, ContainerLocalizer download locally file failed
I submit a Flink job to Hadoop-Yarn, and use Flink application mode. Everything is normal on the client side, but the app master starts failing on the NodeManager, with the following logs.
...
0
votes
1
answer
52
views
How are ResourceManager and NodeManager deployed in relation to NameNode and DataNode in Hadoop?
I'm currently learning Hadoop and am a bit confused about how the Hadoop Distributed File System (HDFS) and YARN components interact, especially in terms of deployment across machines.
Here’s what I ...
0
votes
0
answers
38
views
Spark log files getting truncated in yarn cluster mode
In Spark running in YARN cluster mode, the log files (stderr and stdout) are getting truncated within a minute. I'm not sure which process or config is doing this. What could be the cause of the truncation?
...
0
votes
0
answers
25
views
How to filter only the application ID from Yarn
We need to extract only the application ID of the notebookApp application:
yarn application -list | grep notebookapp | awk '{print $1}' | grep -v INFO
25/04/24 08:30:31 INFO client.AHSProxy: Connecting to ...
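The filtering asked about above could also be done in a small script instead of a grep/awk chain. The following is only a minimal sketch: the function name `extract_app_ids`, the sample listing, and the choice to match the application name case-insensitively are all assumptions made for illustration, not part of the original question. It relies on the fact that YARN application IDs have the form `application_<timestamp>_<sequence>`, so log lines and header lines can be skipped without a separate `grep -v INFO`.

```python
import re

# YARN application IDs look like application_<clusterTimestamp>_<sequence>.
APP_ID = re.compile(r"^application_\d+_\d+$")


def extract_app_ids(listing, name):
    """Return the IDs of listed applications whose line mentions `name`.

    Matching on the name is case-insensitive (an assumption for this sketch).
    """
    ids = []
    for line in listing.splitlines():
        fields = line.split()
        # Log lines, headers, and blank lines never start with an application ID.
        if not fields or not APP_ID.match(fields[0]):
            continue
        if name.lower() in line.lower():
            ids.append(fields[0])
    return ids


# Hypothetical sample, loosely shaped like `yarn application -list` output.
SAMPLE = """\
25/04/24 08:30:31 INFO client.AHSProxy: Connecting to ...
Application-Id    Application-Name    Application-Type
application_1700000000000_0001    notebookApp    SPARK
application_1700000000000_0002    etl-job    SPARK
"""

if __name__ == "__main__":
    print(extract_app_ids(SAMPLE, "notebookapp"))
```

In practice the listing text would come from running the `yarn application -list` command and capturing its output; the sample string here just stands in for that.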
0
votes
0
answers
60
views
YARN framework issue for any application reporting setsid not found
(sorry, this is my first message here, so I have not used format features but just plain text)
I can't solve this issue that I have when trying to run applications using yarn.
I've checked the user ...
3
votes
2
answers
407
views
GoogleHadoopOutputStream: hflush(): No-op due to rate limit: Increase in class A operation for gcs bucket
We are running our spark ingestion jobs which process multiple files in batches.
We read csv or tsv files in batches and create a dataframe and do some transformations before loading it into big query ...
0
votes
0
answers
78
views
Hive Fails to Execute Spark Task: "Failed to create Spark client for Spark session"
I am trying to integrate Apache Spark with Hive in a multi-node cluster setup. My setup consists of the following machines:
192.XXX.01.01 → Hadoop node 1
192.XXX.01.02 → Hadoop node 2
192.XXX.01.03 → ...
0
votes
0
answers
21
views
Local Spark and Yarn Spark giving different results
I have a function called runSparkFlow() which takes a SparkSession object and a hashmap of Dataset<Row>. This hashmap of datasets is created by a function called getSourceDatasets(). I am running ...
2
votes
0
answers
73
views
Issue: Hive Metastore Not Resolvable in Spark Cluster Mode
Apache Spark Cluster Mode Error: NoSuchFieldError: METASTOREURIS
I am running Apache Spark with Hive Metastore on YARN. The setup consists of:
One Edge Node (running inside containers, using --...
0
votes
1
answer
65
views
Setting YARN application name for Flink Jobs
I'm currently running a Flink job using the command below.
flink run-application -t yarn-application -c com.app.FlinkJob flink-job.jar
This starts a YARN job with the name Flink Application Cluster. Is there a ...
0
votes
0
answers
71
views
How to avoid closed wait sessions that come from huge numbers of HTTP requests?
We run thousands of Python scripts on our RHEL machines that open and close socket connections on port 8088. As a result, we are facing a high volume of HTTP requests.
Here is a very simple example of ...
0
votes
0
answers
22
views
EMR ResourceManager UI Physical Mem Used %
I'm using AWS EMR with Hadoop and YARN, and when I go to the RM UI I can see information like "Physical Mem Used %" and "Physical VCores Used %". I cannot find anything online (...
1
vote
0
answers
37
views
Zeppelin Flink interpreter in application mode can't find YARN ResourceManager
I run the Flink interpreter (flink 0.17.2) on Zeppelin (0.11.0). In yarn-session mode it works, but in yarn-application mode it gets stuck with:
INFO org.apache.zeppelin.flink.FlinkScala212Interpreter ...
1
vote
0
answers
97
views
Spark executor does not see aws credentials env vars
I have a cluster made of 2 nodes (Kubernetes pods), a YARN resource manager and a YARN node manager.
Both pods have environment variables containing credentials to access a specific bucket in my AWS ...
-1
votes
1
answer
44
views
Spark on YARN(EKS) fails because of no credentials
I have a Spark application running on an EKS cluster with YARN.
The application starts and I can see it in the YARN UI, but it fails (see screenshot below) because of missing credentials, the defaultFS ...