How'd you add Sparkmagic as an AWS SageMaker kernel? #874
Hello! I was wondering how you were able to add sparkmagic as a kernel available here. I'm trying to add my own custom container and was wondering, how did you publish the kernel as an available SageMaker kernel? Is there some public ECR to push to?
Replies: 3 comments 1 reply
🤖
Hey @pq43! Nice to see you back around here. Hope you're doing well!
To add sparkmagic as an available kernel in AWS SageMaker and publish it as a custom container, you can follow these steps:
- Create a Dockerfile similar to the one in the sparkmagic repository. That Dockerfile installs sparkmagic and its dependencies, copies the necessary files, and sets up the Jupyter notebook environment.
- Build the Docker image using the Dockerfile. You can do this by running the following command in the directory containing the Dockerfile:
docker build -t sparkmagic:latest .
- After the Docker image is built, you can test it locally by running:
docker run -p 8888:8888 sparkmagic:latest
- If the Docker image works as expected, you can push it to a Docker registry. If you're using Amazon ECR, you can create a repository and push the image there:
aws ecr create-repository --repository-name sparkmagic
docker tag sparkmagic:latest <account-id>.dkr.ecr.<region>.amazonaws.com/sparkmagic:latest
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/sparkmagic:latest
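- Finally, register the image with SageMaker. Note that the "Git repositories" section of a notebook instance is for associating Git repositories, not container images; custom images are instead attached to a SageMaker Studio domain, which makes their kernels selectable in Studio notebooks. The image URI should be in the format
<account-id>.dkr.ecr.<region>.amazonaws.com/sparkmagic:latest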
Please replace <account-id> and <region> with your AWS account ID and the region where your ECR repository is located, respectively.
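The tag, login, and push commands above can also be scripted. Here is a minimal Python sketch of that sequence, assuming the `aws` and `docker` CLIs are installed and already configured; the helper names (`ecr_registry`, `push_plan`, `run_push`) are made up for illustration and are not part of sparkmagic or any AWS SDK.

```python
import subprocess


def ecr_registry(account_id: str, region: str) -> str:
    """Return the private ECR registry host for an account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def push_plan(account_id: str, region: str, repo: str = "sparkmagic", tag: str = "latest"):
    """Build the non-interactive commands from the steps above as argument lists."""
    remote = f"{ecr_registry(account_id, region)}/{repo}:{tag}"
    return [
        ["aws", "ecr", "create-repository", "--repository-name", repo, "--region", region],
        ["docker", "tag", f"{repo}:{tag}", remote],
        ["docker", "push", remote],
    ]


def run_push(account_id: str, region: str, repo: str = "sparkmagic", tag: str = "latest") -> None:
    """Create the repository, log in to ECR, then tag and push the local image."""
    create, tag_cmd, push = push_plan(account_id, region, repo, tag)
    subprocess.run(create, check=True)
    # Equivalent of: aws ecr get-login-password | docker login --password-stdin
    password = subprocess.run(
        ["aws", "ecr", "get-login-password", "--region", region],
        check=True, capture_output=True, text=True,
    ).stdout
    subprocess.run(
        ["docker", "login", "--username", "AWS", "--password-stdin",
         ecr_registry(account_id, region)],
        input=password, check=True, text=True,
    )
    subprocess.run(tag_cmd, check=True)
    subprocess.run(push, check=True)
```

Calling `run_push("<account-id>", "<region>")` with your own values would run the same four steps; `create-repository` fails if the repository already exists, so you may want to skip that step on later pushes.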
In addition to installing sparkmagic and setting up the Jupyter environment, that Dockerfile installs the sparkmagic kernels for Spark, PySpark, and SparkR and enables the sparkmagic server extension, so the resulting image can be used as a custom container in AWS SageMaker.
As for your question about a public ECR to push to: there is no shared public ECR repository that you push to in order to publish a SageMaker kernel. You host the image in your own Amazon ECR repository (as far as I know, SageMaker Studio expects a custom image to live in ECR in the same region as the Studio domain, rather than on Docker Hub or another registry) and then reference its URI when you attach the image to SageMaker.
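For SageMaker Studio, attaching a custom image is done through the SageMaker API rather than by pushing anywhere special. The sketch below is a starting point under several assumptions: it targets a Studio domain (not a classic notebook instance), it uses the boto3 SageMaker client calls `create_image`, `create_image_version`, `create_app_image_config`, and `update_domain`, and the names `studio_image_requests`, `register`, `sparkmagic-config`, and the display name are placeholders chosen for illustration. The kernel name `pysparkkernel` matches the kernelspec directory that sparkmagic ships.

```python
def studio_image_requests(image_uri, role_arn, domain_id,
                          image_name="sparkmagic",
                          kernel_name="pysparkkernel",
                          display_name="PySpark (sparkmagic)"):
    """Build keyword arguments for the SageMaker calls, in the order they should run."""
    config_name = f"{image_name}-config"  # hypothetical naming convention
    return {
        "create_image": {"ImageName": image_name, "RoleArn": role_arn},
        "create_image_version": {"ImageName": image_name, "BaseImage": image_uri},
        "create_app_image_config": {
            "AppImageConfigName": config_name,
            "KernelGatewayImageConfig": {
                # Must match a kernelspec name installed inside the image.
                "KernelSpecs": [{"Name": kernel_name, "DisplayName": display_name}],
            },
        },
        "update_domain": {
            "DomainId": domain_id,
            "DefaultUserSettings": {
                "KernelGatewayAppSettings": {
                    # Assumption: this list replaces the domain's current custom
                    # images, so include any existing entries you want to keep.
                    "CustomImages": [
                        {"ImageName": image_name, "AppImageConfigName": config_name}
                    ]
                }
            },
        },
    }


def register(requests_by_call):
    """Send the requests with boto3 (imported lazily; needs AWS credentials)."""
    import boto3

    client = boto3.client("sagemaker")
    for call_name, kwargs in requests_by_call.items():
        getattr(client, call_name)(**kwargs)
```

Image versions are created asynchronously, so in practice you may need to wait for the version to finish creating before updating the domain. Treat this as an outline to check against the SageMaker documentation, not a tested deployment script.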
I hope this helps! If you have any other questions, feel free to ask.
Best,
Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
Hey @pq43, check out the Dockerfile.Jupyter file, specifically these lines, for how to make the Sparkmagic kernel available:
RUN mkdir /home/$NB_USER/.sparkmagic
COPY sparkmagic/example_config.json /home/$NB_USER/.sparkmagic/config.json
RUN sed -i 's/localhost/spark/g' /home/$NB_USER/.sparkmagic/config.json
RUN jupyter nbextension enable --py --sys-prefix widgetsnbextension
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/sparkkernel
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/pysparkkernel
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/sparkrkernel
RUN jupyter serverextension enable --py sparkmagic
Then you can follow the instructions Dosu mentioned to make the image available in ECR.
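In the `jupyter-kernelspec install` lines above, the `pip show sparkmagic | grep Location | cut` expression just resolves the directory where the sparkmagic package is installed, so that the bundled kernelspec folders can be found. A rough Python equivalent, in case it helps when debugging a custom image, could look like this; `kernel_dirs` and `install_kernels` are hypothetical helper names, and the three kernel folder names are taken from the Dockerfile lines.

```python
import importlib.util
import subprocess
from pathlib import Path

# Kernelspec folders referenced by the Dockerfile lines above.
SPARKMAGIC_KERNELS = ("sparkkernel", "pysparkkernel", "sparkrkernel")


def kernel_dirs(package="sparkmagic", kernels=SPARKMAGIC_KERNELS):
    """Return <package dir>/kernels/<name> for each kernel, without importing the package."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{package} is not installed as a package")
    root = Path(list(spec.submodule_search_locations)[0])
    return [root / "kernels" / name for name in kernels]


def install_kernels():
    """Run the same install command as the Dockerfile for each sparkmagic kernel."""
    for directory in kernel_dirs():
        subprocess.run(
            ["jupyter-kernelspec", "install", "--user", str(directory)],
            check=True,
        )
```

After running `install_kernels()` inside the container, `jupyter kernelspec list` should show the three kernels if the install worked.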
I hope this helps!
tysm! dosu's really cool btw :) appreciate the manual response as well
🤖
You're very welcome! I'm glad I could help. If you have any more questions or need further assistance, feel free to ask. I'm here to help!