Deployment Options in Matillion ETL – Using Environments
This is the first post in a three-part series on Matillion ETL deployment options. It describes the first of three commonly used ways to manage and deploy your Matillion solution across multiple environments: for example development – test – production. Note that this series looks exclusively at options which are available through Matillion ETL’s web user interface. Additional options are available using Matillion’s REST API.
With this method, you don’t actually move any code at all!
Instead, you simply run a single version of the Matillion code against the environment of your choice.
Before getting into the details, a few definitions are needed. What do we actually mean by “Matillion code”, and what’s an “Environment”?
Orchestration and Transformation Jobs
Matillion is an ELT tool. It does two main things on your behalf:
- Data ingestion – in other words loading data from external sources into the database
- Data transformation – getting your data ready for visualization or analysis; for example, integration, reshaping, applying business rules, aggregation, and deriving calculated fields.
These two things are implemented by Orchestration Jobs and Transformation Jobs respectively.
Orchestration Jobs fulfill an additional command-and-control function: they define the overall sequence of events and can be scheduled. Orchestration Jobs can call Transformation Jobs as part of their work.
“Matillion Code” means the definition of the Orchestration and Transformation Jobs that you are using. So, at runtime, where does a Matillion Job actually execute? Against an “Environment”.
What’s a Matillion Environment?
A Matillion Environment defines a target data warehouse.
To manage your Environments you’ll need to expand the panel at the bottom left of the screen, which is minimized by default.
When you first launched Matillion, you went through an initial configuration screen. This asked for details of the target data warehouse plus a couple of other configuration items. For this reason, you’ll always have at least one Environment.
You can manage environments through the right-click context menu. The Edit Environment option will take you back to that initial configuration screen, where you can change the settings if necessary.
You can add, edit and remove Environments using these options.
- Matillion has an overall cap on the total number of environments that may exist within your installation. The cap depends upon the instance size. If you have used multiple Projects then each Project will have at least one Environment, and they all count towards the cap even if they are actually pointing at the same target. There’s more on this subject in the second post of this series.
- You are not permitted to delete the last Environment within a Project.
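The two rules above can be pictured with a small model. This is only an illustrative sketch in Python, not Matillion's API: the `Installation` class, its method names, and the cap value are all hypothetical, and the real cap depends on your instance size.

```python
from dataclasses import dataclass, field


@dataclass
class Installation:
    """Hypothetical model of one Matillion installation (not a real API)."""

    max_environments: int  # assumed cap; the real value depends on instance size
    projects: dict = field(default_factory=dict)  # project name -> list of Environment names

    def total_environments(self) -> int:
        # Every Environment counts towards the cap, even when two of them
        # point at the same target data warehouse.
        return sum(len(envs) for envs in self.projects.values())

    def add_environment(self, project: str, name: str) -> None:
        if self.total_environments() >= self.max_environments:
            raise RuntimeError("installation-wide Environment cap reached")
        self.projects.setdefault(project, []).append(name)

    def delete_environment(self, project: str, name: str) -> None:
        envs = self.projects[project]
        if len(envs) <= 1:
            raise RuntimeError("cannot delete the last Environment in a Project")
        envs.remove(name)
```

In this model, an installation capped at two Environments with two Projects holding one “Dev” Environment each has no room left for a “Production” Environment in either Project.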
With this simple code deployment option, you don’t move the Matillion code at all: you just run it against different Environments.
If you already had a “Dev” environment, you might add a “Production” environment.
Then choose the name, and fill in the details of your Production target data warehouse appropriately.
Whenever you create or edit an Environment, always press the Test button, and ensure that you see a ‘Success’ message!
Provided you’re still within the total number of permitted Environments, you’ll now have two entries listed in the Environments panel.
- One Environment is always marked [Selected], and you can change this by using the “Select Environment” context menu option.
- You can delete Environments if you have created more than one.
Now when you are in a Job editor, your context menu will have additional options:
- Run Job (Dev) – means to run the job against the Environment which is marked as [Selected]
- Run Job in Another Environment – opens another dialog which allows you to choose which Environment to run the job against
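The difference between the two run options can be sketched as a toy model. Everything here is hypothetical and for illustration only: the `Environment` and `Project` classes, the `warehouse` labels, and the job name are invented, and the code does not call Matillion in any way. The point it shows is that the job definition stays the same while the target changes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Environment:
    name: str
    warehouse: str  # hypothetical label for the target data warehouse


class Project:
    """Toy model: one set of jobs, several Environments, one marked [Selected]."""

    def __init__(self, environments, selected: str):
        self.environments = {env.name: env for env in environments}
        self.select(selected)

    def select(self, name: str) -> None:
        # Mirrors the "Select Environment" context menu option.
        if name not in self.environments:
            raise KeyError(f"unknown Environment: {name}")
        self.selected = name

    def run_job(self, job_name: str, environment: Optional[str] = None) -> str:
        # No environment given: behaves like "Run Job (<selected>)".
        # Environment given: behaves like "Run Job in Another Environment".
        env = self.environments[environment or self.selected]
        return f"ran {job_name} against {env.name} ({env.warehouse})"


project = Project(
    [Environment("Dev", "dev_wh"), Environment("Production", "prod_wh")],
    selected="Dev",
)
print(project.run_job("load_orders"))
print(project.run_job("load_orders", environment="Production"))
```

The same `load_orders` definition is used in both calls; only the Environment it runs against differs.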
You’ll also see the same drop-down menu in the Maintain Schedule editor.
There is no code movement at all with this method of “code deployment”. Instead, you take advantage of the fact that Matillion ETL can have more than one target environment defined – for example, development, test, and production. At runtime, you choose which one to use.
Normal backup mechanisms still work and are still recommended.
This method of code deployment is the simplest of the three but correspondingly offers the least governance.
Other Methods of Code Deployment