What to Expect from Data Engineering in 2024

It is safe to say that the past 12 months have been transformative for the data engineering space. 

The impact of AI on data engineering, particularly since November 2022, has been seismic. A year on, we are only now starting to see GenAI’s true potential emerge from the hype.

As GenAI’s potential comes to fruition, what can data engineers expect for the year ahead? We spent time with a few key leaders from across Matillion to find out their thoughts on data engineering and AI for 2024 and beyond.

More with Less

Ed Thompson, CTO, Matillion

I think 2024 will be the year that teams are able to do more with less. Equally, they will be asked to do more. Of course, that means harnessing AI and the tools now available to boost productivity and manage that ever-growing workload.

AI will enable teams to do more with less and become increasingly helpful at analysing deep data, but in the immediate future, teams will inevitably bump against some of the limitations of the technology. The big limitation of GenAI and LLMs is the amount of data that they can interpret as a whole. It is almost like a goldfish reading a book - it can only remember 10 pages at a time, so it is missing the earlier context. That limits its understanding of the novel as it gets further into the book. So, if you’re thinking of putting a large data set into a model and asking it to analyse that data set, the model will be limited by its context. Until that is remedied, however, the use cases are still very much there, with AI taking on the boilerplate work and freeing users from more of the run-of-the-mill tasks day to day.
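To make the goldfish analogy concrete, here is a minimal Python sketch of one common workaround: splitting a large data set into batches that each fit within a model's context budget. It is an illustration under stated assumptions, not a description of any particular product. The function names `estimate_tokens` and `chunk_rows` are hypothetical, and the one-token-per-word estimate is a deliberate simplification; real tokenizers count differently.

```python
from typing import Iterable, Iterator, List


def estimate_tokens(text: str) -> int:
    """Rough token estimate: one token per whitespace-separated word.

    Assumption for illustration only; real tokenizers count differently.
    """
    return len(text.split())


def chunk_rows(rows: Iterable[str], max_tokens: int) -> Iterator[List[str]]:
    """Group rows into batches whose estimated token total stays within max_tokens.

    A single row that exceeds the budget on its own is still emitted,
    as a batch containing only that row.
    """
    batch: List[str] = []
    used = 0
    for row in rows:
        cost = estimate_tokens(row)
        # Start a new batch when adding this row would exceed the budget.
        if batch and used + cost > max_tokens:
            yield batch
            batch, used = [], 0
        batch.append(row)
        used += cost
    if batch:
        yield batch


# Hypothetical sample data: ten short text rows, six words each.
rows = [f"order {i} shipped to region {i % 3}" for i in range(10)]
batches = list(chunk_rows(rows, max_tokens=20))
```

Each batch could then be sent to a model separately. The trade-off is the one the analogy describes: the model sees each batch in isolation, so anything that depends on rows in a different batch is lost unless it is carried forward explicitly.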

It’s worth noting that the most successful AI systems will be the ones that keep the human in the loop, the co-pilots, the tools that help you write code, rather than those that seek to replace the human. Having a human analyse responses from the model to determine whether they’re accurate or suggesting the right path for client support, for example, is where the secret sauce lies in business AI’s immediate future.

Trial and Error

Laura Malins, VP Product, Matillion

In the next 12 months, I believe the industry will continue to generate more and more with AI in a trial-and-error approach. Of course, this means generating significant amounts of useless content that has the potential to crowd out content of real value.

I suspect that beyond the next 12 months or so, into 2025, we will move into a year of cleansing that content and refining the use cases. How can we gain real value from what is being generated and what we’ve learned on this journey?

For a technology with such a huge potential impact, GenAI is in its infancy. We have so much to learn about wider business use cases and beyond. Whatever the next year holds, it’s going to be a fascinating one!

A simpler life for data engineers

Ciaran Dynes, Chief of Product, Matillion

The role of the data engineer has radically expanded over the past decade. We used to call them ETL developers, then it was big data engineers. The skills required of the modern data engineer range from ETL designer to data modeller, SQL developer, site reliability engineer and security analyst, and now, with the advent of AI and machine learning, to Python programmer and data scientist.

There are new technologies to learn, from vector databases to large language models to train and tune, whether it’s ChatGPT, BERT, Claude, LaMDA or beyond. There are also new AI tools to use from AWS Bedrock, Azure OpenAI, Anthropic, and Databricks: it is all linguistic soup.

The next 12 months will be the year that tech companies make life simpler for data engineers.

Tools will come to market and be integrated into existing platforms, making it possible to add generative AI to existing data pipelines and to deploy these models internally, so that users can interact with them live, just as they already do with ChatGPT.

Regardless of the tools that come to market, the next year will also see huge demand for data engineers to retrain: to master prompt engineering, to learn how to fine-tune these models, and to increase their productivity massively. Next year, we will see data engineers’ lives get so much more interesting.

No one left behind

Naggi Asmar, Chief of Engineering, Matillion

There will be a pause in the hype of AI, while people start looking at real use cases and figuring out what is going to work and what is not going to work in the short term. With that comes more clarity around AI in its next phase.

There are two different camps for AI: the people who are really engaging with AI and are excited to see how it will fit into their businesses, and those who are afraid of the technology or simply don’t understand how to use it. For the second camp, it is incumbent on enterprises and organizations themselves to train their teams and bring them along with this modern workforce technology that we know will be transformative in the future. We don’t want to leave people behind.

To sum up

GenAI dominates the majority of predictions for tech in the year ahead, and Deloitte predicts that the share of enterprise AI spending dedicated to GenAI may grow by a significant 30% in 2024. When you consider the sheer quantity of tools already being brought to market - and the significant ROI of these in terms of data productivity - it’s no surprise that a jump in investment is expected! Whatever the maturation of the hype looks like, it’s certainly going to be a fascinating journey.

Find out more about Matillion’s product roadmap for 2024 and beyond, and dive deep into our GenAI possibilities by signing up for our AI preview to stay informed about the latest advancements and to get early access to AI functionality!