title | description | keywords | ms.service | ms.topic | ms.custom | ms.date |
---|---|---|---|---|---|---|
PySpark interactive environment with Azure HDInsight Tools | Learn how to use the Azure HDInsight Tools for Visual Studio Code to create and submit queries and scripts. | VScode,Azure HDInsight Tools,Hive,Python,PySpark,Spark,HDInsight,Hadoop,LLAP,Interactive Hive,Interactive Query | hdinsight | how-to | seoapr2020, devx-track-python | 03/30/2022 |
The following steps show how to set up the PySpark interactive environment in VSCode. These steps are only for non-Windows users.
The setup uses the default `python` and `pip` commands to build a virtual environment in your home path. If you want to use another version, you need to change the default version of the `python` and `pip` commands manually. For more details, see update-alternatives.
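Because the environment is built with whatever `python` and `pip` resolve to on your `PATH`, it can help to see which executables those are before changing any defaults. The following is a minimal sketch for a POSIX shell. The `show_default` helper is a hypothetical name introduced here for illustration and is not part of the HDInsight tooling.

```shell
#!/bin/sh
# show_default is a hypothetical helper (not part of the HDInsight tools).
# It prints the path that a command name resolves to on the current PATH,
# or a short notice when the command cannot be found.
show_default() {
    if resolved=$(command -v "$1" 2>/dev/null); then
        echo "$1 -> $resolved"
    else
        echo "$1 -> not found on PATH"
    fi
}

show_default python
show_default pip
```

If the reported paths are not the version you want, on Debian-based systems `update-alternatives --config python` is one way to switch the default, provided alternatives for `python` have already been registered.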
- Install Python from https://www.python.org/downloads/.
- Install pip from https://pip.pypa.io/en/stable/installing (if it's not installed from the Python installation).
- Optionally validate that Python and pip are installed successfully by using the commands `python --version` and `pip --version`, respectively.

> [!NOTE]
> It is recommended to manually install Python instead of using the macOS default version.
- Install virtualenv by running the following command: `pip install virtualenv`
- On Linux, if you come across the error message below, install the required packages by running the following two commands.

  :::image type="content" source="./media/set-up-pyspark-interactive-environment/install-libkrb5-package.png" alt-text="Install libkrb5 package for python" border="true":::

  `sudo apt-get install libkrb5-dev`

  `sudo apt-get install python-dev`
- Restart VSCode, and then go back to the VSCode editor and run the Spark: PySpark Interactive command.
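The steps above can be sketched as a small preflight check to run before restarting VSCode. This is only an illustration under stated assumptions: the `have` and `preflight` helpers are hypothetical names, not part of the HDInsight tooling, and the script reports what is missing rather than installing anything.

```shell
#!/bin/sh
# have: succeed when the named command can be found on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# preflight (hypothetical helper): print one line per tool name passed
# as an argument and return non-zero if any of them is missing.
preflight() {
    missing=0
    for tool in "$@"; do
        if have "$tool"; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the three commands that the steps above rely on.
if preflight python pip virtualenv; then
    echo "Prerequisites found. Restart VSCode and run the PySpark Interactive command."
else
    echo "Install the missing tools first, for example: pip install virtualenv"
fi
```

The check only confirms that the commands are on `PATH`. It does not detect the Linux-only `libkrb5-dev` and `python-dev` packages, which you would still install if the error shown above appears.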
- HDInsight for VS Code: Video