This project shows how to use Spark as a cloud-based SQL engine and expose your big data as a JDBC/ODBC data source via the Spark Thrift Server.
### Central Idea:
Traditional relational database engines had scalability problems, and so a number of SQL-on-Hadoop frameworks evolved, like Hive, Cloudera Impala, Presto, etc. These frameworks are essentially cloud-based solutions, and they all come with their own advantages and limitations. This project demos how SparkSQL comes across as one more SQL-on-Hadoop framework.
### Complete Guide
For more details, please refer to [this](https://spoddutur.github.io/spark-notes/spark-as-cloud-based-sql-engine-via-thrift-server) blog.
### What is the role of Spark Thrift Server in this?
SparkSQL enables fast, in-memory integration of external data sources with Hadoop for BI access over JDBC/ODBC, and Spark Thrift Server makes this data queryable as a JDBC/ODBC source. Spark Thrift Server is similar to HiveServer2 Thrift, but instead of submitting SQL queries as Hive MapReduce jobs, it runs them on the Spark SQL engine, which in turn uses full Spark capabilities.
The following picture depicts this:
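
In code, this setup can be sketched roughly as follows. This is a minimal example, not code taken from this repo; the input path and table name are made-up placeholders:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

object SparkSqlEngine extends App {

  // Hive support is needed so that registered tables are visible
  // to JDBC/ODBC clients of the Thrift Server.
  val spark = SparkSession.builder()
    .appName("spark-as-cloud-based-sql-engine")
    .config("hive.server2.thrift.port", "10000")
    .enableHiveSupport()
    .getOrCreate()

  // Load some data and register it as a temp table
  // (the path and view name here are placeholders).
  spark.read.json("/tmp/records.json")
    .createOrReplaceTempView("records")

  // Start the Thrift Server within this session, making `records`
  // queryable at jdbc:hive2://localhost:10000.
  HiveThriftServer2.startWithContext(spark.sqlContext)
}
```

Alternatively, if you don't need to share a running application's session, Spark ships a standalone launcher, `sbin/start-thriftserver.sh`.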
### How to connect to Spark Thrift Server?
To connect to Spark Thrift Server, use a JDBC/ODBC driver just as you would for HiveServer2, and access Hive or Spark temp tables to run SQL queries on the Apache Spark framework. There are a couple of ways to connect to it:
1. **Beeline:** Perhaps the simplest way is to use the `beeline` command-line tool provided in Spark's `bin` folder.
```
$> beeline
Beeline version 2.1.1-amzn-0 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000:
Enter password for jdbc:hive2://localhost:10000:
// run your sql queries and access data..
jdbc:hive2://localhost:10000> show tables;
```
2. **Java JDBC:** Please refer to this project's test folder, where I've shared a Java example, the `TestThriftClient` class, to demo the same; a minimal sketch of such a client follows below.
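
For reference, here is a minimal sketch of the equivalent client logic, shown in Scala for brevity (the repo's actual example is the Java `TestThriftClient`, which may differ). The connection settings and query are assumptions for an unsecured local setup:

```scala
import java.sql.DriverManager

object ThriftClientSketch extends App {

  // Register the Hive JDBC driver (provided by the hive-jdbc dependency).
  Class.forName("org.apache.hive.jdbc.HiveDriver")

  // Connect to the Thrift Server on its default port; empty
  // username/password work for an unsecured local setup.
  val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000", "", "")
  try {
    val stmt = conn.createStatement()

    // Run a query against the tables exposed by the Thrift Server.
    val rs = stmt.executeQuery("show tables")
    while (rs.next()) {
      println(rs.getString(1))
    }
  } finally {
    conn.close()
  }
}
```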
### Requirements
- Spark 2.1.0, Java 1.8 and Scala 2.11
References:
[MapR Docs on SparkThriftServer](http://maprdocs.mapr.com/home/Spark/SparkSQLThriftServer.html)