237 questions
0 votes, 0 answers, 22 views
Why doesn't the use of distinct in my window function take effect?
Version: HDP Hive 3.1.3.
data is:
name  day
a     01
a     01
a     01
b     01
c     01
sql is:
select day, name, count(distinct name) as cnt from tablea;
result is:
day  name  cnt
01   a     5
01   a     5
01   a     5
01   b     5
01   c     5
day
...
0 votes, 0 answers, 75 views
Downloaded HDP 2.6.5 using Docker Desktop ("docker pull hortonworks/sandbox-hdp:2.6.5"), but no container or image was created
Newbie here. I started downloading HDP 2.6.5 using "docker pull hortonworks/sandbox-hdp:2.6.5" on Docker Desktop, but the Wi-Fi router had to be restarted partway through the download (15 GB total, took 3 hrs),...
1 vote, 1 answer, 42 views
Merging Solr index stored in HDFS not working
I'm trying to merge two Solr core indexes into a new one using org/apache/lucene/misc/IndexMergeTool.
All indexes are stored on HDFS under the path /apps/solr/data/collection_name/data/index.
So I've created ...
0 votes, 1 answer, 183 views
Hive: how to disable the semantic check 'Schema of both sides of union should match'
I have two Hadoop clusters, both running Hive 2.1:
HDP 2.x with Hive 2.1.0 r6177e19d5af719688732bbffc2a7953295e62b0a (select version();)
CDH 6.x with Hive 2.1.1-cdh6.3.2 ...
1 vote, 0 answers, 131 views
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. : ...SparkException: Job aborted due to stage failure
I'm trying to do the "101 - Intro to Spark" lab in Python via Zeppelin using Hortonworks Sandbox HDP 2.6.5, and when I get to counting words with a DataFrame, an error occurs.
When I try to ...
0 votes, 1 answer, 401 views
Installing HDP 2.6.5 with Docker on Ubuntu 22.04 fails with error "failed to get D-BUS connection: No such file or directory"
I've downloaded and run HDP 2.6.5 in Docker,
but I've been struggling with the following error for this command:
docker exec -t sandbox-hdp sh -c rm -rf /var/run/postgresql/*; ...
-1 votes, 1 answer, 233 views
HDFS showing heartbeat lost for nodes
I am new to the HDP sandbox. I installed the HDP sandbox in a virtual machine. After installation, when I open http://localhost:1080 it connects to Ambari, but it shows error alerts. One such error ...
1 vote, 2 answers, 593 views
Ambari UI not showing versions in cluster installation
I've successfully installed Ambari server/agent 2.7.5 on my CentOS 7 machine. Now I am facing an issue while installing a cluster in the install wizard at the "Select version" step. I have ...
0 votes, 1 answer, 339 views
Migration from an HDP non-secure cluster to a CDP secure cluster
We are migrating HDFS data from an HDP non-secure cluster to a CDP secure cluster. The Cloudera documentation mentions "distcp" as a tool to handle the ...
0 votes, 1 answer, 65 views
Ambari DB is damaged without Ambari DB backup
We have an Ambari HDP cluster (HDP version 2.6.4) with 420 Linux worker machines (each worker includes the DataNode and NodeManager services).
Unfortunately, the Ambari DB is damaged, and we not ...
0 votes, 0 answers, 401 views
Tez session gets created every time a Spark job runs
I'm running a Spark (Scala) job on an HDP cluster. However, every time the job executes (in both client and cluster mode), a parallel Tez session is also created and an application is submitted to YARN.
As part of ...
1 vote, 0 answers, 251 views
Standalone Spark cluster unable to read HDFS from executor on kerberized cluster
Spark 3.x or 2.3
HDP 2.7
After enabling Kerberos on the HDP cluster, the Spark standalone cluster is unable to read data from HDFS and Hive external tables, although it is able to read the Hive metastore.
If I run this ...
0 votes, 0 answers, 200 views
How to modify HDFS configuration according to a dedicated config group
We have an HDP cluster with 528 data node machines.
In Ambari HDFS Configs, we configured 3 config groups for the following reason:
212 data node machines have 32G
119 data node machines have ...
0 votes, 1 answer, 170 views
How can I install pip3 with Python on HDP 3.0.1?
I've tried a few ways to install python3-pip on Hortonworks Sandbox HDP_3.0.1, but without success.
Could anyone explain how to do this correctly?
0 votes, 2 answers, 401 views
Count most repeated value per group in Hive?
I am using Hive 0.14.0 on Hortonworks Data Platform, on a big file similar to this input data:
tpep_pickup_datetime   pulocationid
2022-01-28 23:32:52.0  100
2022-02-28 23:02:40.0  202
2022-02-28 17:22:...