How'd you add Sparkmagic as an AWS SageMaker kernel? #874

Answered by dosubot bot
starskyreverie asked this question in Q&A

Hello! I was wondering how you were able to add sparkmagic as a kernel available here. I'm trying to add my own custom container and was wondering, how did you publish the kernel as an available SageMaker kernel? Is there some public ECR to push to?


Replies: 3 comments 1 reply


🤖

Hey @pq43! Nice to see you back around here. Hope you're doing well!

To add sparkmagic as an available kernel in AWS SageMaker and publish it as a custom container, you can follow these steps:

  1. Create a Dockerfile similar to the one provided in the context. This Dockerfile installs sparkmagic and its dependencies, copies the necessary files, and sets up the Jupyter notebook environment.

  2. Build the Docker image using the Dockerfile by running the following command in the directory containing it:

docker build -t sparkmagic:latest .

  3. After the Docker image is built, test it locally by running:

docker run -p 8888:8888 sparkmagic:latest

  4. If the Docker image works as expected, push it to a Docker registry. If you're using Amazon ECR, create a repository and push the image there:

aws ecr create-repository --repository-name sparkmagic
docker tag sparkmagic:latest <account-id>.dkr.ecr.<region>.amazonaws.com/sparkmagic:latest
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/sparkmagic:latest

  5. Finally, make the image available in SageMaker. For SageMaker Studio, that means registering it as a custom image (aws sagemaker create-image, create-image-version, and create-app-image-config) and attaching it to your Studio domain, referencing the image by its ECR URI, <account-id>.dkr.ecr.<region>.amazonaws.com/sparkmagic:latest. (A notebook instance's "Git repositories" setting is for source repositories, not container images.)

Please replace <account-id> and <region> with your AWS account ID and the region where your ECR repository is located, respectively.
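The build, local-test, and push steps above can be sketched as one shell script. The account ID, region, and repository name below are illustrative placeholders, and the docker/aws commands themselves are left as comments since they require Docker and AWS credentials:

```shell
#!/bin/sh
# Hypothetical values -- replace with your own account ID, region, and repo.
ACCOUNT_ID="123456789012"
REGION="us-east-1"
REPO="sparkmagic"
TAG="latest"

# Fully qualified ECR image URI used by the tag/push/attach steps.
IMAGE_URI="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/${REPO}:${TAG}"
echo "$IMAGE_URI"

# The actual build-and-push sequence (requires Docker and AWS credentials):
#   docker build -t "${REPO}:${TAG}" .
#   aws ecr create-repository --repository-name "$REPO"
#   aws ecr get-login-password --region "$REGION" \
#     | docker login --username AWS --password-stdin "${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"
#   docker tag "${REPO}:${TAG}" "$IMAGE_URI"
#   docker push "$IMAGE_URI"
```

Keeping the URI in one variable avoids the easiest mistake here: tagging and pushing with two slightly different registry strings.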

The Dockerfile in the context installs sparkmagic and its dependencies, copies the necessary files, and sets up the Jupyter notebook environment. It also installs the sparkmagic kernels for Spark, PySpark, and SparkR, and enables the sparkmagic server extension. The Docker image built from this Dockerfile can be used as a custom container in AWS SageMaker.
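For illustration, a minimal Dockerfile along those lines might look like the following. This is a hypothetical sketch, not the exact file from the context: the base image, versions, and paths are assumptions.

```dockerfile
# Hypothetical sketch -- base image and paths are illustrative assumptions,
# not the exact Dockerfile referenced above.
FROM jupyter/base-notebook:latest

USER root
RUN pip install --no-cache-dir sparkmagic

# Register the bundled sparkmagic kernelspecs and enable the server extension.
RUN SM_DIR="$(pip show sparkmagic | grep Location | cut -d' ' -f2)/sparkmagic" \
 && jupyter-kernelspec install --sys-prefix "$SM_DIR/kernels/sparkkernel" \
 && jupyter-kernelspec install --sys-prefix "$SM_DIR/kernels/pysparkkernel" \
 && jupyter-kernelspec install --sys-prefix "$SM_DIR/kernels/sparkrkernel" \
 && jupyter serverextension enable --py sparkmagic

USER ${NB_UID}
EXPOSE 8888
```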

As for your question about a public ECR to push to: AWS does not provide a public ECR repository for user-supplied SageMaker kernel images. You need to host the image in a registry SageMaker can pull from; for SageMaker Studio custom images in particular, that generally means a private Amazon ECR repository in your own account (or one shared with it), while for local testing any registry such as Docker Hub works. You then reference the image URI when registering the custom image.

I hope this helps! If you have any other questions, feel free to ask.

Best,
Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

Answer selected by starskyreverie

Hey @pq43! Check out the Dockerfile.Jupyter file, specifically these lines, for how to make the Sparkmagic kernel available:

RUN mkdir /home/$NB_USER/.sparkmagic
COPY sparkmagic/example_config.json /home/$NB_USER/.sparkmagic/config.json
RUN sed -i 's/localhost/spark/g' /home/$NB_USER/.sparkmagic/config.json
RUN jupyter nbextension enable --py --sys-prefix widgetsnbextension
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/sparkkernel
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/pysparkkernel
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/sparkrkernel
RUN jupyter serverextension enable --py sparkmagic

Then you can follow the instructions Dosu mentioned to make the image available in ECR.
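The `pip show sparkmagic | grep Location | cut -d" " -f2` pipeline in those RUN lines just extracts the site-packages directory from pip's output, so the bundled kernelspecs can be located. Here is a self-contained sketch of that parsing step, run against canned `pip show`-style output (illustrative values, since sparkmagic may not be installed where you read this):

```shell
#!/bin/sh
# Canned output in the shape `pip show <pkg>` produces (illustrative values).
PIP_SHOW_OUTPUT="Name: sparkmagic
Version: 0.21.0
Location: /opt/conda/lib/python3.10/site-packages"

# Same extraction the Dockerfile performs: keep the Location line, take field 2.
LOCATION="$(printf '%s\n' "$PIP_SHOW_OUTPUT" | grep Location | cut -d' ' -f2)"
KERNELSPEC="$LOCATION/sparkmagic/kernels/pysparkkernel"
echo "$KERNELSPEC"

# With sparkmagic actually installed, the Dockerfile then runs:
#   jupyter-kernelspec install --user "$KERNELSPEC"
```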

I hope this helps!


tysm! dosu's really cool btw :) appreciate the manual response as well


🤖

You're very welcome! I'm glad I could help. If you have any more questions or need further assistance, feel free to ask. I'm here to help!


