Site navigation

< Back
ODBC drivers >
JDBC drivers >
Bridges, gateways >
Other >
All products

< Back

Support

Licensing

< Back

Company

ODBC drivers >

JDBC drivers >

Bridges, gateways >

Other >

All products

DBMS

Accounting and finance

CRM

NoSQL and data warehouse

Analytics

Legacy

Create custom ODBC drivers

Bespoke data access requirements

Consultancy

In the pipeline

Product roadmap

Support

Resources

Quick start

Getting Started Guides

Licensing

Knowledge Base

User Guides

Company

About Us

Careers & Partners

Apache Spark ODBC driver

Are you evaluating

an ODBC driver?

Yes I am!

The Apache Spark ODBC driver lets you work with Spark data in applications such as Perl, PHP, Excel, and Oracle).

Configuring an ODBC data source

Before the Apache Spark ODBC driver can be used to connect an application to Apache Spark, it's necessary to configure an ODBC data source. An ODBC data source stores the connection details for the target database (in this case, Apache Spark) and the ODBC driver that's required to connect to it (in this case, the Apache Spark ODBC driver).

ODBC data sources are configured in ODBC Data Source Administrator, which is included with Windows.

In ODBC Data Source Administrator:

Choose the System DSN tab, and then choose Add.
In the Create New Data Source dialog box, choose Easysoft ODBC-Apache Spark Driver, and then choose Finish.
Complete the fields in the Easysoft ODBC-Apache Spark Driver DSN Setup dialog box.
In your application, connect to your newly configured data source and run a sample query. For example:
```
SELECT * FROM MyTable
```

Work with Apache Spark data in Perl

Strawberry Perl is a Perl distribution for Windows that includes the necessary middleware layers (Perl DBI and Perl DBD::ODBC) to enable the Apache Spark ODBC driver to connect your Perl applications to Apache Spark.

Here's a Perl script that retrieves some Apache Spark data:

#!/usr/bin/perl -w
use strict;
use DBI;
my $dbh = DBI-> connect('dbi:ODBC:MyApacheSparkDataSource');
my $sql = "SELECT MyCol FROM MyTable LIMIT 10";
# Prepare the statement.
my $sth = $dbh->prepare($sql)
 or die "Can't prepare statement: $DBI::errstr";
# Execute the statement.
$sth->execute();
my($SparkCol);
# Fetch and display the result set value.
while(($SparkCol) = $sth->fetchrow()){
 print("$SparkCol\n"); 
}
$dbh->disconnect if ($dbh);

Work with Apache Spark data in PHP

Here's a PHP script that retrieves some Apache Spark data:

<?php
$con = odbc_connect("MyApacheSparkDataSource", "", "");
$err = odbc_errormsg();
if (strlen($err) <> 0) {
 echo odbc_errormsg();
} else {
 $rs2 = odbc_exec($con, "select MyCol from MyTable");
 odbc_result_all($rs2);
 odbc_close($con);
}
?>

Connecting Excel to Apache Spark

Follow these steps to return data from Apache Spark to Microsoft Excel by using Microsoft Query:

On the Data tab, choose New Query > From Other Source > From ODBC.
Choose the Apache Spark ODBC data source when prompted.
Choose a table from the available data sets.
Choose the Load to return the Apache Spark data to the worksheet.
Note that for large result sets, you may have to have to filter the data using Excel before the data can be returned to the worksheet.

Connect to Apache Spark from Oracle

Create a DG4ODBC init file on your Oracle machine. To do this, change to the %ORACLE_HOME%\hs\admin directory. Create a copy of the file initdg4odbc.ora. Name the new file initspark.ora.
Note In these instructions, replace %ORACLE_HOME% with the location of your Oracle HOME directory. For example, C:\oraclexe\app\oracle\product11円.2.0\server.
Ensure these parameters and values are present in your init file:
```
HS_FDS_CONNECT_INFO = MyApacheSparkDataSource
```
Comment out the line that enables DG4ODBC tracing. For example:
```
#HS_FDS_TRACE_LEVEL = <trace_level>
```

Add an entry to %ORACLE_HOME%\network\admin\listener.ora that creates a SID_NAME for DG4ODBC. For example:

SID_LIST_LISTENER =
 (SID_LIST =
 (SID_DESC=
 (SID_NAME=spark)
 (ORACLE_HOME=%ORACLE_HOME%)
 (PROGRAM=dg4odbc)
 )
 )

Add a DG4ODBC entry to %ORACLE_HOME%\network\admin\tnsnames.ora that specifies the SID_NAME created in the previous step. For example:
```
SPARK =
 (DESCRIPTION =
 (ADDRESS = (PROTOCOL = TCP)(HOST = oracle_host)(PORT = 1521))
 (CONNECT_DATA =
 (SID = spark)
 )
 (HS = OK)
 )
```
Replace oracle_host with the host name of your Oracle machine.

Start (or restart) the Oracle Listener:

cd %ORACLE_HOME%\bin
lsnrctl stop
lsnrctl start

Connect to your Oracle database in SQL*Plus.

In SQL*Plus, create a database link for the target Apache Spark instance. For example:

CREATE PUBLIC DATABASE LINK SPARKLINK CONNECT TO "mydummyuser" IDENTIFIED BY "mydummypassword" USING 'spark';

Try querying your Apache Spark data. For example:
```
SELECT * FROM "MyTable"@SPARKLINK;
```

Notes

If you have problems connecting to Apache Spark from Oracle, enable DG4ODBC tracing and check the trace files written to the %ORACLE_HOME%\hs\trace directory. To enable DG4ODBC tracing, add the line HS_FDS_TRACE_LEVEL = DEBUG to initspark.ora and then start or restart the Oracle listener. If the trace directory does not exist, create it.
If you enable ODBC Driver Manager tracing, but do not get a trace file or get an empty trace file (check for C:\SQL.log), change the trace file location to the Windows TEMP directory. For example, C:\Windows\Temp\SQL.log.

DBMS

Accounting and finance

CRM

NoSQL and data warehouse

Analytics

Legacy

Support

Licensing

Company

ODBC drivers >

JDBC drivers >

Bridges, gateways >

Other >

All products

DBMS

Accounting and finance

CRM

NoSQL and data warehouse

Analytics

Legacy

Connect JDBC to any ODBC driver

Connect JDBC to Microsoft Access

Cross-platform or cross-architecture access to any ODBC driver

ODBC access to any JDBC driver

Connect dbExpress applications to any ODBC driver

Connect XML applications to any ODBC driver

Access mutiple ODBC data sources from one SQL statement

Create custom ODBC drivers

Bespoke data access requirements

In the pipeline

Support

Resources

Quick start

Licensing

Knowledge Base

User Guides

Company

About Us

Careers & Partners

Apache Spark ODBC driver

Are you evaluating

Configuring an ODBC data source

Work with Apache Spark data in Perl

Work with Apache Spark data in PHP

Connecting Excel to Apache Spark

Connect to Apache Spark from Oracle

Notes