Migrating from a Matillion Hosted Git Repo to a BYOG repository

When setting up your first project in the Data Productivity Cloud, you'll be asked to choose which Git repository the project will use. The two options are Matillion Hosted Git and Bring Your Own Git (BYOG).

Matillion Hosted Git is the fastest option and takes the least effort to set up, so it's normal to use it for your initial work. Sooner or later, though, the question is likely to come up: how do I switch the project to my own Git repository?

In this blog, I will outline the process for migrating from a Matillion Hosted Git project to a Bring Your Own Git project.

Understanding your Data Productivity Cloud configuration

Before jumping into the migration steps, let's cover some basics of the Designer UI that help identify a project's current configuration.

After going into Designer from the Matillion Hub, we can see a list of projects. After clicking on one of the project names, there is a section in the UI to provide some details about the project:

Here, we can see the following attributes of a project:

  • Agent type: either Full SaaS or Hybrid SaaS
  • Data Platform: the cloud data warehouse the project is connected to
  • Git repository: Matillion Hosted Git or, if the project uses a Bring Your Own Git repository, the name of the provider (e.g. GitHub)

Step by step - migrating from Matillion hosted git to BYOG

Step 1 - Complete integration with the repository provider

The first step is to set up one of Matillion's supported git repository providers. While this integration is outside the scope of this article, you can find the full documentation in this Bring your own git repository overview article.

After the integration has been completed, create a new blank repository on the provider's side to use with the new project. The new repository should be empty: only a main branch should be present, with no existing files (although it's fine if a README file is created when the repository is initialized).
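If you prefer to confirm the "only a main branch" condition from a terminal rather than the provider's web page, the following is a minimal sketch. It assumes `git` is installed locally and that you are already authenticated against the repository; the function names are my own and are not part of Matillion or GitHub.

```python
import subprocess


def parse_remote_heads(ls_remote_output: str) -> list[str]:
    """Extract branch names from the text printed by `git ls-remote --heads`."""
    branches = []
    for line in ls_remote_output.splitlines():
        parts = line.split()
        # Each line looks like "<commit sha>\trefs/heads/<branch name>".
        if len(parts) == 2 and parts[1].startswith("refs/heads/"):
            branches.append(parts[1][len("refs/heads/"):])
    return branches


def has_only_main(branches: list[str]) -> bool:
    """True when `main` is the one and only branch on the remote."""
    return branches == ["main"]


def list_remote_branches(repo_url: str) -> list[str]:
    """Ask the remote for its branches (requires git and network access)."""
    result = subprocess.run(
        ["git", "ls-remote", "--heads", repo_url],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_remote_heads(result.stdout)
```

Calling `has_only_main(list_remote_branches(repo_url))` with the URL of your new repository should return `True` before you continue to Step 2. This only checks branches; whether the repository contains files other than a README still needs a quick look on the provider's side.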

In my example, I am migrating to a repository hosted on GitHub, and here is what my repository looks like after the initial creation:

Step 2 - Create a new project, and select bring your own git

Add a new project, provide the name and description, and select the data warehouse platform.

Select ‘Advanced settings’ on the project configurations page. This will enable you to choose a BYO git repository.

On the following page, select your git provider. Note that when the provider is selected, the page will advance automatically and prompt a flow to authenticate with the git provider.

Back in the Matillion Designer, search for the repository created on the git provider side previously and select it.

Select the agent type the project should be linked to. In most cases, a migration like this keeps the same agent as the original project. The main goal here is to start using a Bring Your Own Git repository, but feel free to select a different agent if needed. This article covers migrating to a project that uses the same Matillion Hosted Agent with a GitHub-based repository.

Define the environment for the project and select the default access.

Continue to provide the credentials for the environment as well as the default role, database, and schema to be used by the environment.

The new project has been created! Note the project attributes in the top right-hand corner, which show GitHub as the repository.

Step 3 - Review and prepare the new Project

As part of the migration process, we will be leveraging the export/import functionality within Designer to move the pipelines from the original project to the new project. However, there are a few things to consider when performing an export: 

The following items will be exported with the pipelines:

  • Pipeline variables
  • Components and configurations
  • Folder structure of pipelines (if applicable)

The following items will not be exported with the pipelines:

  • Project variables
  • Environments
  • Cloud provider credentials
  • OAuths
  • User access
  • Secret definitions

Review the non-exportable items in the original project and create them in the new project. This will ensure all resources are in place when the pipelines are imported.
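One way to keep track of this review is to write down what exists in each project and compare the two lists. The sketch below assumes you have collected the item names by hand from the Designer UI; the category names come from the list above, while the function name and the example item names in the usage note are hypothetical.

```python
# Categories that are not carried over by a pipeline export (from the list above).
NON_EXPORTABLE = (
    "project variables",
    "environments",
    "cloud provider credentials",
    "oauths",
    "user access",
    "secret definitions",
)


def missing_items(original: dict[str, set[str]], new: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per category, the item names present in the original project
    but not yet created in the new project."""
    missing = {}
    for category in NON_EXPORTABLE:
        gap = original.get(category, set()) - new.get(category, set())
        if gap:
            missing[category] = gap
    return missing
```

For example, if the original project has the project variables `load_date` and `region` and the new project only has `region`, `missing_items` returns `{"project variables": {"load_date"}}`. An empty result means every item you listed has a counterpart in the new project.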

Step 4 - Export pipelines from the original project

Begin by checking that all of the latest changes to the pipelines in the original project are committed and merged into the main branch. After this is completed, the next step is to export all pipelines from the original project.

To do this, open the main branch of the original project, click the three-dot icon next to the folder or individual pipelines that need to be migrated, and select ‘Export’.

This will download a .zip file to your computer containing the pipeline definitions (.yaml files). Keep the file zipped, exactly as it was downloaded.
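Since the file should stay zipped, it can be handy to check what the export contains without extracting it. The following is a small sketch using Python's standard `zipfile` module; the function name is my own, and the only assumption about the archive is the one stated above, namely that pipeline definitions are stored as .yaml files.

```python
import zipfile


def list_exported_pipelines(zip_path) -> list[str]:
    """Return the .yaml entries inside an export archive, without extracting it.

    `zip_path` can be a file path or an open binary file object.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist() if name.endswith(".yaml")
        )
```

Running this against the downloaded file gives you a list of pipeline definitions to compare against what you expected to export, before moving on to the import.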

Step 5 - Import pipelines into the new project

After the pipelines have been exported from the original project, go into the new project and select the main branch. From here, import the pipelines and select the zip file that was downloaded to your computer when they were exported. 

Repeat as necessary for all your other pipelines.

Step 6 - Commit work and push changes to the new BYO repository

Once the project has been set up, all non-exportable items are in place, and the pipelines have been imported, it's time to commit the changes on the main branch and push them to the remote repository. Navigate to the git menu and follow this sequence:

Use the git menu to commit changes. This will commit all of the imported pipelines to your new BYO git repository.

After filling out the commit message and committing the changes, navigate back to the git menu and select the ‘Push local changes’ option. This will push the commit made in the previous step to the remote repository. During this process, you can optionally choose to publish the pipelines, which will get them ready for scheduling if required.

As a final check, go back to the remote repository provider and confirm that the pipelines appear as expected. In my example, I am using a GitHub repository, and I can now see my ‘pipelines’ folder in the remote repository along with all of the corresponding YAML files that hold the pipeline code.
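For projects with many pipelines, the same check can be done against a local clone of the new repository. The sketch below assumes you have already cloned or pulled the repository yourself and that you have a list of the .yaml file names from the export. It matches on file name only, because the folder layout inside the export archive may differ from the layout in the repository; the function name is hypothetical.

```python
from pathlib import Path, PurePosixPath


def missing_from_clone(exported_names: list[str], clone_dir) -> list[str]:
    """Return exported .yaml entries whose file name is not found
    anywhere under the local clone of the repository."""
    present = {path.name for path in Path(clone_dir).rglob("*.yaml")}
    return sorted(
        name for name in exported_names
        if PurePosixPath(name).name not in present
    )
```

An empty list means every exported pipeline file has a file of the same name in the repository. It does not compare file contents, so it complements the visual check rather than replacing it.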

Further reading

Congratulations! You have successfully migrated a project from a Matillion Hosted Git repository to a repository hosted with the Git provider of your choosing. Regular branching, merging, etc. can now take place along with other standard pipeline development as needed. 

Here are some Matillion reference guides:

If you'd like to run your own trial of the Matillion Data Productivity Cloud, start by going to the Matillion Hub and creating an account.

Kevin Kirkpatrick

Associate Delivery Solution Architect
