Embracing Low-Code Pipeline Tools: A Data Engineer's Perspective

The tools that data engineers use can make or break workflows. Among the many advancements, low-code and no-code tools have emerged as game-changers, sparking debates among data engineers. As someone deeply entrenched in this field, you might have heard about these tools and wondered if they could fit into your meticulously crafted processes. Let me share why low-code pipeline tools are not just viable but can be extremely beneficial.

The Misconception of Low-Code Tools

First, let's address a common misconception. Low-code, no-code, high-code, some code, little code, lots of code: these terms can sound like marketing ploys, promising to boost productivity and let less technical users build pipelines, AI workflows, data quality checks, and more. This perspective suggests that low-code equates to low capability, and that these tools are only suitable for those without deep technical expertise. However, this couldn't be further from the truth. Many low-code tools are designed with robust functionality that serves both novice users and seasoned data engineers.

What is Low-Code and Why Does it Matter?

Low-code platforms offer a visual approach to data and AI pipeline development, letting users build pipelines through graphical interfaces rather than hand-written code. This matters because it reduces the effort spent on repetitive tasks, lets less technical data professionals handle their own work, and streamlines development, enabling faster deployment and making complex tasks accessible to a broader range of people. Data engineers, in particular, can benefit from the speed and efficiency these tools provide.

The Data Engineer's Perspective

If you browse Reddit and other forums, you'll find that data engineers often view low-code and no-code tools negatively. This skepticism is understandable. There is a perception that these tools lock you in with a vendor, resulting in an inflexible environment where you can't easily adopt new technologies or make necessary changes. Vendors can also be slow to update their platforms for security fixes, new features, or other issues, which exacerbates the concern.

However, if you choose the right low-code platform (what I like to call a "code-optional" platform), you can mitigate these concerns. These tools excel at repetitive, non-value-add tasks like connecting to data sources and pulling data into a platform database or cloud data platform. Such tasks rarely require the latest state-of-the-art technology, making low-code solutions a perfect fit.

Speed and Efficiency

One of the primary advantages of low-code tools is the speed at which you can build and deploy pipelines. Traditional data engineering tasks, from data ingestion to transformation and integration, often involve writing and debugging extensive code. With low-code tools, much of this can be accomplished through intuitive visual interfaces. This not only accelerates the development process but also significantly reduces the time spent on maintenance.

For instance, setting up a data pipeline to integrate various data sources can be done in a fraction of the time. Instead of writing complex scripts to handle data transformations, you can use drag-and-drop functionalities to map out your processes. This rapid development cycle is particularly beneficial when working on projects with tight deadlines or when you need to quickly prototype and iterate on solutions.

Bridging Skill Gaps

Low-code tools also bridge the gap between technical and non-technical team members across the company. Data analysts, business intelligence professionals, and even marketing teams often need to interact with data. Low-code tools enable these stakeholders to contribute more effectively without requiring them to dive deep into technical details.

This collaborative environment can lead to more innovative solutions as different perspectives are brought into the data pipeline development process. Additionally, it allows data engineers to focus on more complex and high-value tasks, rather than getting bogged down by routine and repetitive coding chores.

Scalability and Flexibility

Contrary to the belief that low-code tools are inflexible, many modern low-code platforms are built to scale with your needs. They integrate with a wide array of services and databases, often expose robust APIs for customization beyond the visual interface, and plug easily into your DevOps or DataOps process. This means you can start with a low-code approach and still incorporate custom scripts and advanced configurations as needed.
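
As an illustration of that DataOps fit, here is a minimal sketch of a deployment script that triggers a pipeline run through a platform's REST API. The endpoint URL, payload fields, and response schema below are hypothetical placeholders, not any specific vendor's API; check your platform's documentation for the real calls.

import os
import requests

# Hypothetical endpoint and payload; real platforms document their own
# REST APIs for triggering and monitoring pipeline runs.
API_URL = "https://your-platform.example.com/api/v1/pipelines/run"
API_TOKEN = os.environ["PIPELINE_API_TOKEN"]  # keep credentials out of source control

def trigger_pipeline(pipeline_name: str, environment: str) -> str:
    """Kick off a pipeline run and return its run ID (hypothetical response schema)."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"pipeline": pipeline_name, "environment": environment},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["run_id"]

if __name__ == "__main__":
    run_id = trigger_pipeline("nightly_customer_load", "production")
    print(f"Started pipeline run {run_id}")

A script like this can live in your CI/CD configuration, so pipeline deployments and runs are versioned and automated alongside the rest of your code.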

Real-World Examples and Flexibility

When evaluating low-code, no-code, or code-optional platforms, look for those that let you inject code where necessary. For example, the ability to run external Python, dbt, or SQL scripts lets you combine the strengths of low-code tools with your own coding skills.

Here's an example of using Python to load data from a common system like MySQL into Snowflake by staging a CSV file and running COPY INTO (all connection details below are placeholders).

import os

import mysql.connector
import pandas as pd
import snowflake.connector

# MySQL connection details
mysql_config = {
    'user': 'your_mysql_user',
    'password': 'your_mysql_password',
    'host': 'your_mysql_host',
    'database': 'your_mysql_database'
}

# Snowflake connection details
snowflake_config = {
    'user': 'your_snowflake_user',
    'password': 'your_snowflake_password',
    'account': 'your_snowflake_account',
    'warehouse': 'your_snowflake_warehouse',
    'database': 'your_snowflake_database',
    'schema': 'your_snowflake_schema'
}

# Source MySQL table
source_table = 'your_mysql_table'

# Destination Snowflake table
destination_table = 'your_snowflake_table'

def fetch_mysql_data():
    # Connect to MySQL and pull the source table into a DataFrame.
    # pandas may warn that it prefers a SQLAlchemy connection, but a
    # DBAPI connection like this one works.
    mysql_conn = mysql.connector.connect(**mysql_config)
    query = f"SELECT * FROM {source_table}"
    df = pd.read_sql(query, mysql_conn)
    mysql_conn.close()
    return df

def load_data_to_snowflake(df):
    # Connect to Snowflake
    snowflake_conn = snowflake.connector.connect(**snowflake_config)
    cursor = snowflake_conn.cursor()

    # Stage the data with PUT, then load it with COPY INTO.
    # (The connector's write_pandas helper wraps these steps; see the
    # sketch after this example.)
    stage_path = '@your_stage_path'

    # Save the DataFrame to a local CSV file for staging
    df.to_csv('temp_file.csv', index=False)

    # PUT compresses the file by default, so it is staged as temp_file.csv.gz
    cursor.execute(f"PUT file://temp_file.csv {stage_path}")
    # SKIP_HEADER = 1 keeps the CSV header row from being loaded as data
    cursor.execute(f"COPY INTO {destination_table} FROM {stage_path}/temp_file.csv FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY='\"' SKIP_HEADER = 1)")

    # Clean up the staged file and the local temp file
    cursor.execute(f"REMOVE {stage_path}/temp_file.csv.gz")
    os.remove('temp_file.csv')
    snowflake_conn.commit()
    cursor.close()
    snowflake_conn.close()

if __name__ == '__main__':
    data_frame = fetch_mysql_data()
    load_data_to_snowflake(data_frame)
    print("Data copied successfully from MySQL to Snowflake")

While you could develop this entirely in Python, a low-code tool like Matillion can automate the data loading and initial transformations, saving you considerable time, especially if you need to manage connections to hundreds or thousands of sources.

Conclusion: Embrace the Change

As data engineers, our ultimate goal is to derive insights and value from data as efficiently as possible. Low-code pipeline tools are not a replacement for traditional coding skills but rather a complement to them. They can handle many routine tasks, free up our time for more strategic work, and foster a more collaborative and inclusive data culture within organizations.

In embracing low-code tools, we are not compromising on quality or capability. Instead, we are expanding our toolkit, enhancing our productivity, and driving innovation in our projects. So, if you haven't yet explored what low-code pipeline tools can offer, now is the perfect time to start. You might be surprised at how much they can enhance your workflow and contribute to your success as a data engineer.

If you want to see the low-code capabilities that Matillion offers, sign up for a free trial. 

Mark Balkenende

VP of Product Marketing

Mark Balkenende, VP of Product Marketing at Matillion, has spent the last 20 years in the data management space. He started his career in IT roles managing large enterprise data integration projects, systems, and teams for companies like Motorola, Abbott Laboratories, and Walgreens. Mark has applied his data management subject matter expertise to customer-centric, practitioner-focused product marketing at data management software companies like Talend.