Mastering Git at Matillion: An In-Depth Guide to Merging

Welcome back to our Git series, where we continue to explore the fascinating world of version control. In previous installments, we've covered the basics of Git, the power of branching, and common branching strategies. Now, let's dive into a critical aspect of Git: merging. In this blog, we'll examine what merging is, why it's essential for distributed teams, the different strategies used in merging, and some common pitfalls and solutions when things don't go as planned.

What is Merging in Git?

Merging in Git is the process of integrating changes from one branch into another. This could involve merging a feature branch into the main branch, a release branch back into development, or even integrating changes from a collaborator's branch (see our previous blog on branching strategies for more on these). Merging is the backbone of collaboration in Git, allowing teams to work on separate tasks and then combine their work into a cohesive whole.

Collaborating in Distributed Teams

For distributed teams, where developers might be working across different time zones or geographic locations, merging is indispensable. It allows team members to work independently in their own branches, developing features, fixing bugs, or experimenting with new ideas. When it's time to bring these changes together, merging makes it possible to integrate work from multiple sources into a single branch, ensuring that everyone's contributions are included without overwriting or losing valuable code.

Matillion's Git Merging Strategy 

There are several merging strategies in Git, each with advantages and best use cases. In an upcoming blog, we will discuss different merging strategies in depth.

Today, in the Data Productivity Cloud, we use a three-way merge strategy. A three-way merge applies when two branches have developed separately from a common ancestor and the changes from both need to be reconciled into a unified result. Git compares the tip of each branch against that common ancestor: a change made on only one branch is applied automatically, while differing changes to the same content on both branches are flagged as a conflict. Three-way merges are robust because the common ancestor provides the context for each change, allowing for more informed merging and easier conflict resolution when necessary.
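To make the idea concrete, here is a minimal sketch of a three-way merge at file granularity. It is an illustration only, not how Git or the Data Productivity Cloud implements merging: the function name three_way_merge, the dictionary layout (file path mapped to content) and the sample file names are all assumptions made for this example.

```python
def three_way_merge(base, ours, theirs):
    """Merge two {path: content} snapshots against their common ancestor.

    Hypothetical helper for illustration; a missing path means the file
    does not exist in that snapshot.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree (or both deleted the file)
        elif o == b:
            result = t          # only "theirs" changed since the ancestor
        elif t == b:
            result = o          # only "ours" changed since the ancestor
        else:
            conflicts.append(path)  # both sides changed it differently
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


# Illustrative snapshots: the common ancestor and the two branch tips.
base = {"a.sql": "v1", "b.sql": "v1", "c.sql": "v1"}
ours = {"a.sql": "v2", "b.sql": "v1", "c.sql": "ours"}
theirs = {"a.sql": "v1", "b.sql": "v2", "c.sql": "theirs"}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # files that merged cleanly
print(conflicts)  # files that need a manual decision
```

In this sketch a.sql and b.sql merge cleanly because each was changed on only one branch, while c.sql is reported as a conflict because both branches changed it in different ways. Without the common ancestor, all three files would simply look different and there would be no way to tell which side had made the change.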

What Can Go Wrong and How to Fix It

Merging can sometimes lead to conflicts, where changes in different branches are incompatible. This can happen when two developers edit the same part of a file or when changes from one branch impact the work in another. Here are some common problems and ways to resolve them:

Merge Conflicts: If Git detects conflicting changes, it pauses the merge and leaves the affected files for you to resolve manually. Git's built-in conflict markers and the git mergetool command, or third-party merge tools, can help. Once the conflicts are resolved, you commit the result to complete the merge. In the Data Productivity Cloud, we currently offer conflict resolution on a file-by-file basis, allowing users to choose whether to keep the changes from their branch or from the branch they are merging into.

Lost Commits: This can occur when a rebase is done incorrectly or a branch is accidentally deleted. To guard against it, keep backups, or use git reflog to find and recover the lost commits. This is highly unlikely in the Data Productivity Cloud because we use a three-way merge strategy.

Complex Merge Histories: Too many merges or frequent rebases can lead to a confusing commit history. Consider squashing commits or using simpler branching strategies to maintain clarity. This is currently not relevant in the Data Productivity Cloud due to the three-way merge strategy.
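The file-by-file conflict resolution described under Merge Conflicts can be sketched as follows. This is a hedged illustration of the general idea, not the Data Productivity Cloud's actual implementation: the function resolve_conflicts, the "ours" and "theirs" labels (borrowed from Git's own terminology) and the sample data are assumptions made for this example.

```python
def resolve_conflicts(merged, conflicts, ours, theirs, choices):
    """Finish a merge by taking one whole side for each conflicted file.

    Hypothetical helper: `choices` maps each conflicted path to either
    "ours" (keep my branch) or "theirs" (keep the other branch).
    """
    resolved = dict(merged)  # start from the files that merged cleanly
    for path in conflicts:
        side = choices.get(path)
        if side not in ("ours", "theirs"):
            raise ValueError(f"no resolution chosen for {path}")
        content = (ours if side == "ours" else theirs).get(path)
        if content is not None:  # None means that side deleted the file
            resolved[path] = content
    return resolved


# Illustrative state after a merge that left one file in conflict.
cleanly_merged = {"a.sql": "v2", "b.sql": "v2"}
conflicted = ["c.sql"]
ours = {"a.sql": "v2", "b.sql": "v1", "c.sql": "ours"}
theirs = {"a.sql": "v1", "b.sql": "v2", "c.sql": "theirs"}

final = resolve_conflicts(cleanly_merged, conflicted, ours, theirs,
                          {"c.sql": "theirs"})
print(final)
```

Choosing a whole file from one side is coarser than editing individual conflicting lines, but it keeps every decision explicit: the merge cannot complete until each conflicted file has a chosen side.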

In summary, merging is a fundamental process in Git that facilitates collaboration and teamwork. By understanding the different merging strategies and knowing how to handle common challenges, you can ensure smooth integration and a cohesive codebase.

Stay tuned for our next installment, where we'll be exploring Git's hard reset feature. Until then, happy merging and Git Good with the Data Productivity Cloud!

Catch up here for:

Part One on Commit, Push and Pull 

Part Two on Branching

Part Three on Exploring Common Branching Strategies 

Bryns Jones

Senior Engineering Manager

Bryns Jones is a Senior Engineering Manager working with the DataOps team at Matillion.