If you have problems with the getting started guide, note that there's a separate troubleshooting section for that.
Windows: missing winutils
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries
The winutils.exe executable cannot be found.
- Download the Hadoop winutils binaries (e.g. https://github.com/cdarlint/winutils/archive/refs/heads/master.zip)
- Extract the binaries for the desired Hadoop version into a folder (e.g. hadoop-3.2.2\bin)
- Set the HADOOP_HOME environment variable (e.g. HADOOP_HOME=...\hadoop-3.2.2). Note that the binary files need to be located at %HADOOP_HOME%\bin!
- Add %HADOOP_HOME%\bin to the PATH variable.
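The three conditions from the steps above (HADOOP_HOME set, winutils.exe in %HADOOP_HOME%\bin, that folder on PATH) can be checked with a small script. This is only a sketch: the function name check_winutils and the injectable file_exists parameter are hypothetical helpers introduced for illustration and are not part of SDLB or Hadoop.

```python
import os
from pathlib import PureWindowsPath


def check_winutils(env, file_exists=os.path.isfile):
    """Return a list of problems with the winutils setup (empty if none found)."""
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        return ["HADOOP_HOME is not set"]

    problems = []
    bin_dir = PureWindowsPath(hadoop_home) / "bin"

    # winutils.exe must be located directly in %HADOOP_HOME%\bin
    if not file_exists(str(bin_dir / "winutils.exe")):
        problems.append(f"winutils.exe not found in {bin_dir}")

    # PATH entries are separated by ';' on Windows; PureWindowsPath
    # compares case-insensitively and ignores trailing separators.
    path_entries = [PureWindowsPath(p) for p in env.get("PATH", "").split(";") if p]
    if bin_dir not in path_entries:
        problems.append(f"{bin_dir} is not on PATH")
    return problems


if __name__ == "__main__":
    for line in check_winutils(os.environ) or ["winutils setup looks fine"]:
        print(line)
```

Run it from the same shell or IDE run configuration that starts the pipeline, since environment variables set elsewhere may not be visible to that process.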
/tmp/hive is not writable
RuntimeException: Error while running command to get file permissions
Change to %HADOOP_HOME%\bin and execute
winutils chmod 777 /tmp/hive
Windows: winutils.exe is not working correctly
winutils.exe - System Error The code execution cannot proceed because MSVCR100.dll was not found. Reinstalling the program may fix this problem.
Other errors are also possible:
- Similar error message when double clicking on winutils.exe (Popup)
- Errors when providing a path to the configuration instead of a single configuration file
- ExitCodeException exitCode=-1073741515 when executing SDL even though everything ran without errors
Install the Visual C++ Redistributable Package from Microsoft (MSVCR100.dll belongs to the Visual C++ 2010 runtime).
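The exit code -1073741515 is easier to recognise in hexadecimal. Read as an unsigned 32-bit value it is 0xC0000135, the Windows status code STATUS_DLL_NOT_FOUND, which matches the missing MSVCR100.dll. A minimal sketch of the conversion follows; the helper name to_ntstatus and the one-entry lookup table are illustrative only.

```python
def to_ntstatus(exit_code: int) -> str:
    """Format a signed 32-bit process exit code as an unsigned hex status value."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Only the code discussed in this section; this is not a complete table.
KNOWN_STATUS = {
    "0xC0000135": "STATUS_DLL_NOT_FOUND (a required DLL could not be loaded)",
}

code = to_ntstatus(-1073741515)
print(code, KNOWN_STATUS.get(code, "unknown status"))
# prints: 0xC0000135 STATUS_DLL_NOT_FOUND (a required DLL could not be loaded)
```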
Java IllegalAccessError (Java 17)
Symptom: Starting an SDLB pipeline fails with the following exception:
java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x343570b7) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x343570b7
Java 17 is more restrictive about access to packages that a module does not export. Unfortunately, Spark uses classes from such unexported packages, so these packages have to be exported manually. To fix the above exception, add
--add-exports java.base/sun.nio.ch=ALL-UNNAMED to the java command line; see also Stack Overflow.
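If the pipeline is started from a launcher script, the option can be added only when it is needed. The sketch below assumes the Java version string is already known (for example from the java.version system property). The helper names and the threshold of 17 are assumptions taken from the description above, not SDLB behaviour.

```python
import re


def java_major_version(version: str) -> int:
    """Parse the major version from strings such as '1.8.0_292', '11.0.1' or '17-ea'."""
    parts = version.split(".")
    # Up to Java 8 the scheme was 1.<major>; since Java 9 the major version comes first.
    major = parts[1] if parts[0] == "1" and len(parts) > 1 else parts[0]
    match = re.match(r"\d+", major)
    if match is None:
        raise ValueError(f"unrecognised Java version: {version!r}")
    return int(match.group())


def java_command(version: str, app_args):
    """Build a java command line, adding the export option for Java 17 and newer."""
    cmd = ["java"]
    if java_major_version(version) >= 17:
        cmd += ["--add-exports", "java.base/sun.nio.ch=ALL-UNNAMED"]
    return cmd + list(app_args)


# 'my-app.jar' is a placeholder, not a real artifact name.
print(" ".join(java_command("17.0.2", ["-jar", "my-app.jar"])))
```

When the JVM is started by an IDE or by Maven instead, the same option goes into the respective VM options setting.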
Resources not copied
Tests fail due to missing or outdated resources, or the execution starts but cannot find the specified feeds. IntelliJ might not copy the resource files to the target directory.
Execute the following Maven goal manually after you change any resource file:
mvn resources:resources
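To see whether the copied resources are actually out of date, the source and target folders can be compared by modification time. The sketch below assumes Maven's default layout (src/main/resources copied to target/classes); the function name stale_resources is hypothetical, and projects that override the resource directories need different paths.

```python
from pathlib import Path


def stale_resources(src_dir, target_dir):
    """List resource files that are missing from, or newer than, their copy in target_dir."""
    src_dir, target_dir = Path(src_dir), Path(target_dir)
    if not src_dir.is_dir():
        return []

    stale = []
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_dir)
        copy = target_dir / rel
        # A resource is stale if it was never copied or was edited after the last copy.
        if not copy.exists() or copy.stat().st_mtime < src.stat().st_mtime:
            stale.append(rel.as_posix())
    return stale


if __name__ == "__main__":
    # Maven's default layout; adjust if your project overrides it.
    for name in stale_resources("src/main/resources", "target/classes"):
        print("stale:", name)
```

Any file listed here is a sign that the resources goal has to be run again before starting tests or a pipeline.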
Maven compile error: tools.jar
Could not find artifact jdk.tools:jdk.tools:jar:1.7 at specified path ...
Hadoop/Spark depends on the tools.jar file, which is installed as part of the JDK. Possible causes:
- Your system does not have a JDK installed (only a JRE).
  - Fix: Make sure a JDK is installed and that your PATH and JAVA_HOME environment variables point to the JDK installation.
- You are using a JDK of version 9 or higher. The tools.jar file was removed in JDK 9. See: https://openjdk.java.net/jeps/220
  - Fix: Downgrade your JDK to Java 8.
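Both causes come down to the same observable fact: there is no lib/tools.jar under the Java installation that Maven uses. A quick check, assuming JAVA_HOME is the installation Maven picks up (the helper name find_tools_jar is illustrative):

```python
import os
from pathlib import Path


def find_tools_jar(java_home):
    """Return the path to lib/tools.jar under java_home, or None if it does not exist."""
    if not java_home:
        return None
    candidate = Path(java_home) / "lib" / "tools.jar"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    java_home = os.environ.get("JAVA_HOME")
    if find_tools_jar(java_home):
        print("tools.jar found under JAVA_HOME")
    else:
        # Covers all cases from the list above, plus an unset JAVA_HOME.
        print("no tools.jar under JAVA_HOME: JRE only, JDK 9 or higher, or JAVA_HOME not set")
```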
How can I test Hadoop / HDFS locally?
For local:// URIs, file permissions on Windows, or certain actions, local Hadoop binaries are required.
- Download your desired Apache Hadoop binary release from https://hadoop.apache.org/releases.html.
- Extract the contents of the Hadoop distribution archive to a location of your choice, e.g.,
- Set the environment variable
- Windows only: Download a Hadoop winutils distribution corresponding to your Hadoop version from https://github.com/steveloughran/winutils (for newer Hadoop releases at: https://github.com/cdarlint/winutils) and extract the contents to