

To have Jupyter running on your AWS cluster (EMR 4.x.x versions), add the following bootstrap action:

However, the stock Jupyter installation comes with only a Python kernel out of the box, so the installation is a two-step process:

Installing Jupyter
Installing the Jupyter notebook exposes a web service that is accessible from your web browser. You write code inside the browser, hit CTRL+Enter, and your snippets are executed on your cluster without leaving the notebook.

Who is this product good for? Give it a go for:

- Presentations and demos: you can add text in a markup language, plot images, and run your code live.
- Documentation of snippets you'd normally run on the spark-shell REPL.
- Testing, visualization, ad hoc querying and research.

You can embed widgets (LaTeX equations, for example) in it, write formatted text, generate charts dynamically, and more. Although not out of the box, it supports running Spark code on a cluster, so it becomes a really powerful tool for Spark practitioners as well, and its installation takes only a few simple steps. You can run your code without leaving your notebook.

This post will guide you through installing the open source Jupyter notebook to the point where you can execute distributed Spark code from within it.

Formerly known as IPython, the Jupyter project now supports running Python, Scala, R and more (about 40 languages, via kernels).

Of course, you can generate a "wiki" page for your project, but what would really be cool is if you could embed some code inside it and execute it on demand to get the results, seamlessly. Well, if you liked the idea then you should definitely try using a notebook.
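To make the idea of a kernel a little more concrete: Jupyter discovers each kernel through a small kernel.json file that names the kernel and gives the command used to start it. The sketch below only illustrates that file's shape and is not the setup this post installs. The helper functions, the kernel name pyspark_sketch, and the SPARK_HOME path are all hypothetical placeholders, and the file is written to a temporary directory rather than to a real Jupyter installation.

```python
import json
import tempfile
from pathlib import Path


def build_kernel_spec(display_name, language, argv, env=None):
    """Return a dict in the shape Jupyter expects for a kernel.json file."""
    spec = {
        "display_name": display_name,  # name shown in the notebook UI
        "language": language,          # language the kernel executes
        "argv": list(argv),            # command line used to launch the kernel
    }
    if env:
        # Optional environment variables set for the kernel process.
        spec["env"] = dict(env)
    return spec


def write_kernel_spec(kernels_dir, kernel_name, spec):
    """Write <kernels_dir>/<kernel_name>/kernel.json and return its path."""
    kernel_dir = Path(kernels_dir) / kernel_name
    kernel_dir.mkdir(parents=True, exist_ok=True)
    path = kernel_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=2))
    return path


# Hypothetical Spark-aware Python kernel. The SPARK_HOME value and the kernel
# name are placeholders for illustration, not values taken from this post.
pyspark_spec = build_kernel_spec(
    display_name="PySpark (sketch)",
    language="python",
    # Jupyter replaces {connection_file} with a real path at launch time.
    argv=["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
    env={"SPARK_HOME": "/path/to/spark"},
)

# Use a temporary directory so the sketch never touches a real installation.
spec_path = write_kernel_spec(tempfile.mkdtemp(), "pyspark_sketch", pyspark_spec)
print(spec_path.read_text())
```

On a machine with Jupyter installed, running jupyter kernelspec list shows the kernels that are currently registered, which is a quick way to confirm that only the Python kernel is present after a stock installation.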

In-code comments are not always sufficient if you want to maintain good documentation of your code. Sometimes you would like to add equations, images, complex text formatting and more.
