Framework to quickly build and maintain Smart Data Lakes


There are so many DataObjects / Actions, how do I get started?

Common use cases are included in the sdl-examples project. Start with its application.conf, which gives a good overview of how data objects and actions are used. Then check the Reference to see which types of data objects and actions are currently supported.

How can I test Hadoop / HDFS locally?

Local Hadoop binaries are required when you use local:// URIs, work with file permissions on Windows, or run certain actions.

  1. Download your desired Apache Hadoop binary release from
  2. Extract the contents of the Hadoop distribution archive to a location of your choice, e.g., /path/to/hadoop (Unix) or C:\path\to\hadoop (Windows).
  3. Set the environment variable HADOOP_HOME=/path/to/hadoop (Unix) or HADOOP_HOME=C:\path\to\hadoop (Windows).
  4. Windows only: Download a Hadoop winutils distribution corresponding to your Hadoop version from (for newer Hadoop releases at: and extract the contents to %HADOOP_HOME%\bin.
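To confirm that the steps above took effect, a small check script can help. The following is a minimal sketch, not part of the framework: the function name `check_hadoop_home` is hypothetical, and it assumes that the Hadoop distribution contains a `bin` directory and that the winutils distribution provides a file named `winutils.exe`.

```python
import os
import platform
from pathlib import Path


def check_hadoop_home(env=None, system=None):
    """Return a list of problems with the local Hadoop setup (empty if none found).

    Hypothetical helper for illustration; `env` and `system` can be passed
    explicitly so that the check is easy to try out without touching the
    real environment.
    """
    env = os.environ if env is None else env
    system = platform.system() if system is None else system

    # Step 3: HADOOP_HOME must be set.
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        return ["HADOOP_HOME is not set"]

    # Step 2: HADOOP_HOME must point to the extracted distribution.
    home = Path(hadoop_home)
    if not home.is_dir():
        return [f"HADOOP_HOME does not point to a directory: {home}"]

    problems = []
    bin_dir = home / "bin"
    if not bin_dir.is_dir():
        problems.append(f"missing bin directory: {bin_dir}")

    # Step 4 (Windows only): winutils must be extracted into %HADOOP_HOME%\bin.
    # Assumption: the winutils binary is named winutils.exe.
    if system == "Windows" and not (bin_dir / "winutils.exe").is_file():
        problems.append(f"winutils.exe not found in {bin_dir}")

    return problems


if __name__ == "__main__":
    issues = check_hadoop_home()
    for issue in issues:
        print("PROBLEM:", issue)
    if not issues:
        print("No problems found with HADOOP_HOME")
```

Run the script in the same shell or IDE session that will start your application, since HADOOP_HOME is only visible to processes started after the variable was set.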