Commit 080c41e

Update README.md
1 parent bd37d03 commit 080c41e

File tree

1 file changed

+5
-6
lines changed


README.md

Lines changed: 5 additions & 6 deletions
@@ -3,7 +3,9 @@ This project shows how to use SPARK as Cloud-based SQL Engine and expose your bi
 
 ### Central Idea:
 Traditional relational database engines like SQL had scalability problems, and so a couple of SQL-on-Hadoop frameworks evolved, such as Hive, Cloudera Impala, Presto etc. These frameworks are essentially cloud-based solutions and they all come with their own advantages and limitations. This project will demo how SparkSQL comes across as one more SQL-on-Hadoop framework.
-To know more details on this please refer to [this](https://spoddutur.github.io/spark-notes/spark-as-cloud-based-sql-engine-via-thrift-server) blog.
+
+### Complete Guide
+To know more details about this, please refer to [this](https://spoddutur.github.io/spark-notes/spark-as-cloud-based-sql-engine-via-thrift-server) blog.
 
 ### What is the role of Spark Thrift Server in this?
 SparkSQL enables fast, in-memory integration of external data sources with Hadoop for BI access over JDBC/ODBC. Spark Thrift Server makes this data queryable as a JDBC/ODBC source. Spark Thrift Server is similar to HiveServer2 Thrift, but instead of submitting SQL queries as Hive MapReduce jobs, Spark Thrift Server uses the Spark SQL engine, which in turn uses full Spark capabilities.
@@ -13,7 +15,7 @@ Following picture depicts the same:
 
 ### How to connect to Spark Thrift Server?
 To connect to Spark Thrift Server, use a JDBC/ODBC driver just like with HiveServer2, and access Hive or Spark temp tables to run SQL queries on the Apache Spark framework. There are a couple of ways to connect to it.
-1. Beeline: Perhaps the simplest is to use the beeline command-line tool provided in Spark's bin folder.
+1. **Beeline:** Perhaps the simplest is to use the beeline command-line tool provided in Spark's bin folder.
 ```markdown
 `$> beeline`
 Beeline version 2.1.1-amzn-0 by Apache Hive
@@ -27,13 +29,10 @@ Enter password for jdbc:hive2://localhost:10000:
 // run your sql queries and access data..
 `jdbc:hive2://localhost:10000> show tables;`
 ```
-2. Java JDBC: Please refer to this project's test folder where I've shared a Java example to demo the same.
+2. **Java JDBC:** Please refer to this project's test folder, where I've shared a Java example - the `TestThriftClient` class - to demo the same.
 
 ### Requirements
 - Spark 2.1.0, Java 1.8 and Scala 2.11
 
-Guide:
-[Spark as cloud-based SQL Engine exposing data via ThriftServer](https://spoddutur.github.io/spark-notes/spark-as-cloud-based-sql-engine-via-thrift-server)
-
 References:
 [MapR Docs on SparkThriftServer](http://maprdocs.mapr.com/home/Spark/SparkSQLThriftServer.html)
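The Java JDBC route mentioned in the diff can be sketched roughly as below. This is a minimal illustration, not the project's actual `TestThriftClient`: the class name, credentials, and query are placeholders, and the host/port assume the Thrift server's default `localhost:10000`. It needs the Hive JDBC driver (e.g. `org.apache.hive:hive-jdbc`) on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ThriftClientSketch {

    // Build the HiveServer2-style JDBC URL that Spark Thrift Server listens on.
    static String jdbcUrl(String host, int port, String db) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) throws SQLException {
        // Connect just like to HiveServer2; queries run on the Spark SQL engine.
        try (Connection conn = DriverManager.getConnection(
                 jdbcUrl("localhost", 10000, "default"), "user", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("show tables")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```

The same URL works from beeline (`!connect jdbc:hive2://localhost:10000`), so both connection options in the README hit the identical endpoint.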
