spark

A debian:stretch-based Spark container. Use it in a standalone cluster with the accompanying docker-compose.yml, or as a base for more complex recipes.

docker example

To run the SparkPi example, run the image with Docker (port 4040 is published so you can watch the application UI while the job runs):

docker run --rm -it -p 4040:4040 gettyimages/spark bin/run-example SparkPi 10

To start spark-shell with your AWS credentials:

docker run --rm -it -e "AWS_ACCESS_KEY_ID=YOURKEY" -e "AWS_SECRET_ACCESS_KEY=YOURSECRET" -p 4040:4040 gettyimages/spark bin/spark-shell
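
As a sketch of what those credentials enable (assuming the image's Hadoop libraries include the S3A connector, which picks the keys up from the environment; s3a://your-bucket/your-key.txt is a hypothetical object):

echo -e "import pyspark\n\nprint(pyspark.SparkContext().textFile('s3a://your-bucket/your-key.txt').count())" > s3_count.py
docker run --rm -it -e "AWS_ACCESS_KEY_ID=YOURKEY" -e "AWS_SECRET_ACCESS_KEY=YOURSECRET" -p 4040:4040 -v $(pwd)/s3_count.py:/s3_count.py gettyimages/spark bin/spark-submit /s3_count.py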

To run a short PySpark script with spark-submit:

echo -e "import pyspark\n\nprint(pyspark.SparkContext().parallelize(range(0, 10)).count())" > count.py
docker run --rm -it -p 4040:4040 -v $(pwd)/count.py:/count.py gettyimages/spark bin/spark-submit /count.py

docker-compose example

To create a simple standalone cluster with docker-compose:

docker-compose up
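
To run more than one worker, scale the service (a sketch that assumes the compose file names its worker service worker):

docker-compose up --scale worker=3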

The Spark master UI will be running at http://${YOUR_DOCKER_HOST}:8080 with one worker listed. To run pyspark, exec into the master container:

docker exec -it docker-spark_master_1 /bin/bash
bin/pyspark
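
You can also exec the REPL in one step, relying on the image's working directory being the Spark home (as the relative bin/ paths above already imply):

docker exec -it docker-spark_master_1 bin/pyspark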

To run SparkPi, exec into the master container:

docker exec -it docker-spark_master_1 /bin/bash
bin/run-example SparkPi 10
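
If the example does not attach to the standalone master on its own, you can point it there explicitly (a sketch assuming the master is reachable inside the compose network as spark://master:7077, the standalone default):

docker exec -it docker-spark_master_1 bin/run-example --master spark://master:7077 SparkPi 10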

license

MIT
