2,058 questions
-1
votes
2
answers
54
views
How do I figure out which HBase versions are compatible with Hadoop 2.6.5?
I must install HBase on my virtual machines, which run Ubuntu 20.04 LTS. I was wondering if there is a way to determine which HBase version is compatible with Hadoop 2.6.5, because I saw the mention of ...
0
votes
0
answers
12
views
Hadoop 2.6.5 Cannot find NameNode
When I run start-dfs.sh, I don't see the NameNode specified in the command's output. I have written the following in the hdfs-site.xml document:
<?xml version="1.0" encoding="UTF-8"...
1
vote
0
answers
53
views
How do HDFS JournalNodes work internally?
I understand that JournalNodes act as the central repository for all edit logs (no matter which NameNode is currently active, all push to the JournalNodes). I suppose the QuorumJournalManager handles this ...
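As a rough illustration of the quorum idea this excerpt touches on, the sketch below models a writer that sends each edit to every journal node and treats the write as successful once a majority acknowledge it. The names `JournalNode` and `quorum_write` are hypothetical and are not Hadoop APIs; the real QuorumJournalManager is written in Java and also deals with epochs, recovery and RPC, all of which are left out here.

```python
from dataclasses import dataclass, field


@dataclass
class JournalNode:
    """Hypothetical stand-in for one journal node; not a Hadoop class."""
    name: str
    available: bool = True
    edits: list = field(default_factory=list)

    def journal(self, txid: int, edit: str) -> bool:
        # An unavailable node stores nothing and does not acknowledge.
        if not self.available:
            return False
        self.edits.append((txid, edit))
        return True


def quorum_write(nodes, txid, edit):
    """Send one edit to every node; succeed only on a majority of acks."""
    acks = sum(1 for node in nodes if node.journal(txid, edit))
    return acks >= len(nodes) // 2 + 1


if __name__ == "__main__":
    # Three nodes with one down: two acks still form a majority.
    nodes = [
        JournalNode("jn1"),
        JournalNode("jn2"),
        JournalNode("jn3", available=False),
    ]
    print(quorum_write(nodes, 1, "OP_MKDIR /data"))
```

Under this simplified model, a group of three nodes tolerates one failure and a group of five tolerates two, which is why such groups are usually sized with an odd number of members.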
0
votes
1
answer
91
views
Upgrading hadoop to 3.1.2 with hbase-testing-utility 2.2.3
The goal
I want to switch from HDFS to the s3a client. To do this, I need to upgrade from Hadoop 2.8.5 to at least 3.1.2, because I need to use the AssumedRoleCredentialProvider for AWS access. ...
1
vote
0
answers
82
views
How to set custom tmp directory for Hadoop
We are using Hadoop version 2.10.2 and are facing the error below while starting the server. Due to company security policy, no execute permission is set on the /tmp directory, hence the library libleveldbjni-64-1-...
1
vote
1
answer
2k
views
Trino: io.trino.spi.TrinoException error reading from HDFS at position caused by java.io.IOException 4 missing blocks, the stripe is: AlignedStripe
I use Trino to query HDFS with the Hive connector.
Not always, but sometimes, it fails with this error:
io.trino.spi.TrinoException error reading from HDFS at position caused by java.io.IOException 4 missing ...
2
votes
1
answer
423
views
Configure hadoop.service.shutdown.timeout property
I need to configure the value of hadoop.service.shutdown.timeout because the shutdown hooks time out when our MR jobs stop:
2023-08-25 08:44:39,566 [WARN] [Thread-0] [org.apache.hadoop.util....
0
votes
1
answer
155
views
Accumulo error: "ZOOKEEPER_HOME is not set or is not a directory" with separate ZooKeeper cluster
I'm trying to set up an Accumulo cluster that uses a separate ZooKeeper cluster. I've configured the accumulo-site.xml file to include the instance.zookeeper.host property with the hostname or IP ...
1
vote
0
answers
184
views
Test cases fail with permission denied error with hadoop-minicluster initialization for 3.2.2 version
I am trying to run JUnit tests for a Spark project in IntelliJ. The tests initialize a local Hadoop cluster using the hadoop-minicluster dependency. Tests run fine with Hadoop version 2.7.3.2.6.5.0-292. Since we ...
0
votes
1
answer
675
views
Module/Package resolution in Python
So I have a project directory "dataplatform" and its contents are as follows:
── dataplatform
├── __init__.py
├── commons
│ ├── __init__.py
│ ├── ...
0
votes
1
answer
185
views
How can I get job configuration in command line?
I can list running apps with yarn application -appStates RUNNING, then I take one application ID from the list.
Then I can get the status of the app with: yarn application -status
I want to get job ...
-1
votes
1
answer
182
views
How to get a specific key/value from HDFS via HTTP or the Java API?
How can I get the value of one or more keys in HDFS via HTTP or the Java API from a remote client? For example, the file below has a million keys and values. I just want to get the values of the 'phone' and ...
0
votes
0
answers
414
views
"ENOENT: No such file or directory" in hadoop while executing WordCount program
I am trying the wordcount example with the command "hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output" in a Linux shell, however it keeps reminding me ...
1
vote
2
answers
613
views
What is the difference between FileInputStream/FileOutputStream and FSDataInputStream/FSDataOutputStream, and where should each be used?
I am trying to understand the difference between FileInputStream and FSDataInputStream, and between FileOutputStream and FSDataOutputStream.
I am trying to read a file from an S3 bucket and apply some formatting ...
1
vote
1
answer
1k
views
Spark 3.2.1 fetch HBase data not working with NewAPIHadoopRDD
Below is the sample code snippet used to fetch data from HBase. This worked fine with Spark 3.1.2. However, after upgrading to Spark 3.2.1, it no longer works, i.e. the returned RDD doesn't contain ...