Mastering Git at Matillion - Merge Strategies

Welcome back to our Git series! In previous posts, we've covered essential Git concepts like branching, merging, and managing uncommitted changes. Today, we dive into a more nuanced topic: Git merge strategies. While the default merge strategy typically works well, understanding different strategies can be invaluable in specific scenarios. We’ll explore the reasons behind selecting a merge strategy, popular options available, and when you might want to deviate from the default.

Why Select a Merge Strategy?

When you merge branches in Git, the merge strategy determines how changes from different branches are combined. Most of the time, the default strategy, `ort` (which replaced `recursive` as the default in Git 2.34), is more than sufficient: it combines changes and resolves conflicts in the majority of cases you’ll encounter. However, there are specific scenarios where you might run into issues or need more control over the merge process. In those situations, an alternative merge strategy can provide a solution tailored to your workflow, so it pays to understand the available strategies and their use cases.

Selecting a merge strategy can help in:

  • Conflict Resolution: Different strategies provide various ways to handle conflicts, making it easier to resolve complex merges.
  • History Management: Some strategies influence how the commit history is structured, which can be important for maintaining a clear and understandable project history.
  • Specific Use Cases: Certain strategies are designed for particular scenarios, such as merging multiple branches simultaneously or incorporating changes from external repositories.

Handling Merge Conflicts

Merge conflicts are an inevitable part of collaborative development, occurring when changes in different branches overlap or contradict each other. Understanding how merge conflicts arise and how different merge strategies handle them is essential for maintaining a smooth workflow and ensuring the integrity of your codebase.

How Merge Conflicts Arise

Merge conflicts typically occur in the following situations:

  • Simultaneous Changes: Two or more developers change the same lines of the same file in different branches. Edits to different parts of a file usually merge cleanly.
  • Conflicting Modifications: Changes in one branch contradict changes in another, for example one branch edits a file that the other deletes, leaving it unclear which version should be retained.
  • Reordering: Moving or reordering lines that another branch has also edited can lead to conflicts, especially in closely related code.

When a conflict arises, Git cannot combine the changes automatically. It stops the merge and reports the affected files so that the discrepancies can be resolved manually. Different merge strategies can produce a different set of conflicts, with some strategies able to resolve certain scenarios more effectively than others.
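
As a minimal sketch of how such a conflict looks in practice, the commands below build a throwaway repository in which two branches change the same line. The file name `settings.txt`, the branch name `feature`, and the values are hypothetical, chosen only for illustration.

```shell
# Throwaway repository in a temporary directory (all names are hypothetical).
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch explicitly
git config user.name "Example User"
git config user.email "user@example.com"

echo "timeout = 30" > settings.txt
git add settings.txt
git commit -q -m "Add settings"

# Two branches change the same line in different ways.
git checkout -q -b feature
echo "timeout = 60" > settings.txt
git commit -q -am "Raise timeout"

git checkout -q main
echo "timeout = 10" > settings.txt
git commit -q -am "Lower timeout"

# The merge stops and reports a conflict instead of creating a merge commit.
git merge feature || echo "merge stopped: manual resolution needed"

# List the files with unresolved conflicts, then show the markers Git wrote.
git diff --name-only --diff-filter=U
cat settings.txt
```

After the merge stops, `settings.txt` contains both versions between `<<<<<<<`, `=======` and `>>>>>>>` markers. Editing the file to the intended content, then running `git add` and `git commit`, completes the merge.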

Popular Merge Strategies

ort (“Ostensibly Recursive’s Twin” - Formerly “Recursive”)

The `ort` (or `recursive`) strategy is the default in Git and is what is used in the Data Productivity Cloud’s merge command. It uses a 3-way merge algorithm: it finds a common base commit and creates a new commit that combines the changes made on both branches since that base. It is the default because it usually results in the fewest merge conflicts without causing mismerges, and it can detect and handle file renames. Options within this strategy, passed with `-X` (`--strategy-option`), can help avoid or resolve conflicts, for example by favouring one branch over the other or by ignoring whitespace differences:

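  • ours: This option resolves conflicts by favouring changes from the current branch. Non-conflicting changes from the other branch are still merged.
  • theirs: This option resolves conflicts by favouring changes from the branch being merged in. Non-conflicting changes from the current branch are still kept.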
  • ignore-all-space: Ignores whitespace differences when comparing lines, reducing conflicts caused by formatting changes.
  • ignore-space-change: This option treats sequences of one or more whitespace characters as equivalent, minimising conflicts due to minor spacing variations.
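
The `ours` and `theirs` options above can be tried out with a sketch like the following, which assumes a scratch repository and hypothetical names (`settings.txt`, `notes.txt`, `feature`). It merges the same pair of branches twice, once with each option, and records which version of the conflicting line survives.

```shell
# Throwaway repository; all file, branch, and value names are hypothetical.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "timeout = 30" > settings.txt
git add settings.txt
git commit -q -m "Add settings"

# feature changes the shared line and also adds a new, non-conflicting file.
git checkout -q -b feature
echo "timeout = 60" > settings.txt
echo "added on feature" > notes.txt
git add settings.txt notes.txt
git commit -q -m "Raise timeout, add notes"

# main changes the same line differently.
git checkout -q main
echo "timeout = 10" > settings.txt
git commit -q -am "Lower timeout"

# -X theirs: the conflicting line is taken from the branch being merged in.
git merge -q -X theirs --no-edit feature
theirs_result="$(cat settings.txt)"

# Undo that merge (HEAD~1 is main's tip before the merge), then try -X ours.
git reset -q --hard HEAD~1
git merge -q -X ours --no-edit feature
ours_result="$(cat settings.txt)"

echo "theirs: $theirs_result"
echo "ours:   $ours_result"
ls   # notes.txt is present: non-conflicting changes are merged either way
```

The whitespace options are passed the same way, for example `git merge -X ignore-space-change feature`.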

resolve

The `resolve` strategy is simpler than `ort`. Like `ort`, it merges exactly two heads (the current branch and one other) using a 3-way merge, but it does not handle renames and is less effective at handling complex conflicts. It is best suited to straightforward merges, even large ones, where the changes don’t overlap much.
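
A strategy is selected with `-s` (`--strategy`). The sketch below, using a scratch repository and the hypothetical names `notes.txt` and `feature`, merges two branches that edit different parts of the same file, a case `resolve` handles without conflicts.

```shell
# Throwaway repository; file and branch names are hypothetical.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line one\nline two\nline three\nline four\nline five\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# feature edits the first line only.
git checkout -q -b feature
printf 'line one, edited on feature\nline two\nline three\nline four\nline five\n' > notes.txt
git commit -q -am "Edit first line"

# main edits the last line only.
git checkout -q main
printf 'line one\nline two\nline three\nline four\nline five, edited on main\n' > notes.txt
git commit -q -am "Edit last line"

# Select the resolve strategy explicitly; the edits do not overlap.
git merge -s resolve --no-edit feature

cat notes.txt                   # contains both edits
git log --oneline --graph       # shows the two-parent merge commit
```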

octopus

The `octopus` strategy is used when merging more than two branches, and it is the default when more than one branch is passed to the merge command. It’s designed for simple cases where conflicts are unlikely: if the merge would need manual conflict resolution, `octopus` refuses to perform it rather than leaving conflicts for you to fix. It’s best used for integrating multiple topic branches with minimal overlap.
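
As an illustration, the following sketch merges three topic branches in one command. The branch names `topic-a`, `topic-b` and `topic-c` and the files they add are hypothetical; because each branch touches a different file, no conflicts arise.

```shell
# Throwaway repository; branch and file names are hypothetical.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

# Three topic branches, each adding its own file.
for topic in topic-a topic-b topic-c; do
  git checkout -q -b "$topic" main
  echo "$topic" > "$topic.txt"
  git add "$topic.txt"
  git commit -q -m "Add $topic.txt"
done

# main moves on as well, so it has diverged from every topic branch.
git checkout -q main
echo "base, updated" > base.txt
git commit -q -am "Update base"

# Naming several branches makes Git pick the octopus strategy by default.
git merge --no-edit topic-a topic-b topic-c

# One merge commit with four parents: main plus the three topic branches.
git log --oneline --graph
```

If any of the branches had conflicted, the command would have failed, and the usual fallback is to merge the branches one at a time with the default strategy.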

ours

The `ours` strategy is easy to confuse with the option of the same name. It records the merge, but discards all changes from the branches being merged in, so the resulting tree is always that of the current branch. This can be useful for superseding the old development history of side branches while still acknowledging their existence. It is distinctly different from the `ours` option of the `ort` strategy, which only affects conflicting changes.
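
The sketch below shows that behaviour in a scratch repository. The branch `old-experiment` and the files `design.txt` and `scratch.txt` are hypothetical. After the merge, nothing from the side branch appears in the working tree, yet Git considers the branch merged.

```shell
# Throwaway repository; branch and file names are hypothetical.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "current design" > design.txt
git add design.txt
git commit -q -m "Add design"

# A side branch that changes an existing file and adds a new one.
git checkout -q -b old-experiment
echo "abandoned design" > design.txt
echo "scratch work" > scratch.txt
git add design.txt scratch.txt
git commit -q -m "Old experiment"

git checkout -q main
echo "current design, revised" > design.txt
git commit -q -am "Revise design"

# Record the merge but keep main's tree exactly as it is.
git merge -s ours --no-edit old-experiment

cat design.txt              # still main's content
ls                          # scratch.txt was not brought in
git branch --merged main    # old-experiment is listed as merged
```

Compare this with `-X ours`, where the new file from the side branch would have been merged and only the conflicting line would have come from the current branch.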

subtree

The `subtree` strategy is used when you want to merge in a subproject and retain its history as a subdirectory of your project. This is useful for maintaining projects that include other projects as dependencies. It is a modified version of `ort`: before merging, Git adjusts the subproject’s tree so that it lines up with its designated subdirectory, which keeps the subproject’s changes within that subdirectory.

When to Deviate from the Default Strategy

While the `ort` strategy works well in most cases, there are scenarios where other strategies might be more appropriate:

  • Multi-Branch Merges: When merging several topic branches at once and conflicts are unlikely, the `octopus` strategy combines them in a single merge commit.
  • Maintaining Separate Histories: Use the `subtree` strategy when you need to maintain a clear history of a subproject within your main project.
  • Specialised Conflict Resolution: The `ours` or `theirs` options within the `ort` strategy can be useful when you need to resolve conflicts by choosing one branch’s changes over the other’s.

Merging within the Data Productivity Cloud

Within the Data Productivity Cloud, merges are performed using the `ort` (formerly `recursive`) strategy, which results in few conflicts. Where conflicts do occur, the user is advised of the files that have conflicts and may choose, for each file, either the ours or the theirs version, accepting all changes from that version of the file.
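
For readers working from the command line, plain Git offers a comparable whole-file choice through `git checkout --ours` and `git checkout --theirs`. The sketch below is only an analogue under assumed names (`pipeline-a.txt`, `pipeline-b.txt`, `feature`), not a description of how the Data Productivity Cloud implements this internally.

```shell
# Throwaway repository; file and branch names are hypothetical.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "source = table_v1" > pipeline-a.txt
echo "target = table_v1" > pipeline-b.txt
git add pipeline-a.txt pipeline-b.txt
git commit -q -m "Add pipelines"

# Both branches edit both files, so both files will conflict.
git checkout -q -b feature
echo "source = table_v2_feature" > pipeline-a.txt
echo "target = table_v2_feature" > pipeline-b.txt
git commit -q -am "Feature edits"

git checkout -q main
echo "source = table_v2_main" > pipeline-a.txt
echo "target = table_v2_main" > pipeline-b.txt
git commit -q -am "Main edits"

git merge feature || echo "conflicts reported"
git diff --name-only --diff-filter=U    # lists both conflicted files

# Whole-file resolution: keep our pipeline-a.txt, take their pipeline-b.txt.
git checkout --ours -- pipeline-a.txt
git checkout --theirs -- pipeline-b.txt
git add pipeline-a.txt pipeline-b.txt
git commit -q --no-edit                 # concludes the merge

cat pipeline-a.txt pipeline-b.txt
```

Here "ours" is the branch that was checked out when the merge started (`main`) and "theirs" is the branch being merged in (`feature`).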

In the future we will be extending this to allow users to select from the individual conflicts within the files.

Summary

Understanding and selecting the right merge strategy can help you manage your project's history and resolve conflicts effectively. While the default `ort` strategy is generally sufficient, knowing about other strategies like `resolve`, `octopus`, `ours`, and `subtree` can be invaluable in specific situations. By choosing the appropriate strategy, you can ensure a smoother, more efficient merge process.

Stay tuned for our next instalment, where we’ll explore Git’s revert feature in order to enhance your development workflow and make sure you get the most out of your Git experience. Until then, happy merging and Git Good with the Data Productivity Cloud!

Catch up on previous parts here!

Part One: Commit, Push, and Pull. 

Part Two: Branching 

Part Three: Exploring Common Branching Strategies 

Part Four: An In-Depth Guide To Merging

Part Five: Understanding Hard Reset 

Part Six: Unpushed Commits

Part Seven: Uncommitted Changes 

Part Nine: Git Revert: Undoing changes the smart way

Bryns Jones

Senior Engineering Manager

Bryns Jones is a Senior Engineering Manager working with the DataOps team at Matillion
