Set up an environment

This section is for Data Engineers who will prepare and administer the whole environment. It describes the steps needed to prepare the DP tool so that an organization can use it to its full potential.

Create a Data Pipelines project template

The first step is to create a git repository containing a project template, which will later be used to create multiple projects. The template should contain the whole directory structure and all files used in your projects. It should also include the connection configuration for all components in your environment, CI/CD, and any other aspects specific to your company. You can find example templates to adjust to your needs here: https://github.com/getindata/data-pipelines-template-example . Based on the template, the Data Pipelines CLI asks the user a series of questions to build the final project.

Thanks to copier, you can use Jinja template syntax to create easily modifiable configuration templates. Just create a copier.yml file and configure the template questions (read more in the copier documentation).
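As a minimal sketch of what such a copier.yml could look like, the example below defines two questions. The question names (project_name, target_environment) and their choices are hypothetical and only illustrate the general shape of copier questions; adjust them to whatever your template actually needs.

```yaml
# copier.yml (illustrative only; question names are hypothetical)

# A free-text question; the answer is available in template files
# as the Jinja variable {{ project_name }}.
project_name:
  type: str
  help: What is the name of your project?

# A question with a fixed set of answers and a default value.
target_environment:
  type: str
  help: Which environment should the project use by default?
  choices:
    - dev
    - stage
    - prod
  default: dev
```

Each answer can then be referenced with Jinja syntax in the template's files, for example {{ project_name }} inside a configuration file.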

Create a template to set up a local environment

Working with Data Pipelines usually requires setting local variables so that you can run and test without interfering with shared environments (DEV, STAGE, PROD). To simplify preparing a working environment, we use templates here as well: they ask a series of questions and generate a local configuration in the home directory.

This requires a repository with a global configuration template file that you or your organization will use. The repository should contain a dp.yml.tmpl file similar to this:

templates:
  my-first-template:
    template_name: my-first-template
    template_path: https://github.com/<YOUR_USERNAME>/<YOUR_TEMPLATE>.git
vars:
  username: YOUR_USERNAME

The file must contain a list of available templates, which will be displayed and available for selection in the Data Pipelines CLI. The vars section contains variables that will be passed to the project whenever it runs in the configured environment. The same rules apply to creating this template as to creating project templates.
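Because the same templating rules apply, the configuration repository can carry its own copier.yml so that values such as the username are asked for instead of hard-coded. The sketch below is an assumption about how this might be set up, not a documented layout: the question name username is hypothetical, and whether the _templates_suffix setting is needed depends on the copier version in use.

```yaml
# copier.yml in the configuration repository (illustrative sketch)

# Assumption: tell copier to render files ending in .tmpl,
# so that dp.yml.tmpl is processed as a template.
_templates_suffix: .tmpl

# Hypothetical question; its answer becomes the Jinja variable {{ username }}.
username:
  type: str
  help: What is your username?
```

With a question like this in place, the vars section of dp.yml.tmpl could use username: "{{ username }}" instead of a fixed value.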