In the Hadoop installation articles, the steps to install OpenJDK are included. Run the following command to verify the Java environment:

```
$ java -version
```

```
OpenJDK 64-Bit Server VM (build 25.212-b03, mixed mode)
```

Now let's start to configure Apache Spark 3.0.1 in a UNIX-alike system.

Visit the Downloads page on the Spark website to find the download URL. Download the binary package using `wget` (remember to replace the URL with your closest download site).

Unpack the package using the following command:

```
mkdir ~/hadoop/spark-3.0.1
```

The Spark binaries are unzipped to folder ~/hadoop/spark-3.0.1.

Set up the SPARK_HOME environment variable and also add the bin subfolder to the PATH variable. Edit the bashrc file:

```
vi ~/.bashrc
```

Add the following lines to the end of the file:

```
export SPARK_HOME=~/hadoop/spark-3.0.1
```

We also need to configure the Spark environment variable SPARK_DIST_CLASSPATH to use the Hadoop Java class path:

```
# Configure Spark to use Hadoop classpath
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
```

If you also have Hive installed, change SPARK_DIST_CLASSPATH to:

```
export SPARK_DIST_CLASSPATH=$(hadoop classpath):$HIVE_HOME/lib/*
```

Source the modified file to make it effective:

```
source ~/.bashrc
```

Run the following command to create a Spark default config file:

```
cp $SPARK_HOME/conf/spark-defaults.conf.template $SPARK_HOME/conf/spark-defaults.conf
```

Edit the file to add some configurations using the following command:

```
vi $SPARK_HOME/conf/spark-defaults.conf
```

Make sure you add the following line:

```
localhost
```

Enable the Hive-related one if you have Hive installed. There are many other configurations you can do.

The first configuration is used to write event logs when a Spark application runs, while the second directory is used by the history server to read event logs. These two configurations can be the same or different.

Run the following command to start Spark shell:

```
spark-shell
```

Now let's do some verifications to ensure it is working.
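As one possible sketch of the event-log settings the paragraphs above refer to: the property names below are Spark's standard ones (`spark.eventLog.dir` is where a running application writes its event logs; `spark.history.fs.logDirectory` is where the history server reads them), but the `hdfs://` paths are assumptions for illustration and must exist before you start the history server:

```
# Assumed example paths -- adjust to your own HDFS (or local) layout.
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://localhost:9000/shared/spark-logs
spark.history.fs.logDirectory    hdfs://localhost:9000/shared/spark-logs
```

As noted above, the two directories may be the same (as here) or different.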
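The ~/.bashrc additions described above can be sketched as follows. The SPARK_HOME path follows this article's unpack directory; the SPARK_DIST_CLASSPATH lines are shown commented out because they require Hadoop (and optionally Hive) to be present on the machine:

```shell
# Environment variables for Spark, as described in the steps above.
# SPARK_HOME points to the unpack directory used in this article.
export SPARK_HOME=~/hadoop/spark-3.0.1

# Put Spark's bin subfolder on PATH so spark-shell and friends resolve.
export PATH=$SPARK_HOME/bin:$PATH

# Uncomment when Hadoop is installed:
# export SPARK_DIST_CLASSPATH=$(hadoop classpath)

# Or, if Hive is installed as well:
# export SPARK_DIST_CLASSPATH=$(hadoop classpath):$HIVE_HOME/lib/*
```

After adding these lines, remember to source ~/.bashrc (or open a new terminal) so the variables take effect in your shell.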