We are surrounded by ever more diverse sets of data. Much of this data is already distributed into the cloud, hosted by a rapidly expanding family of SaaS applications. Data sources underpin the flux of business opportunities, and are therefore in a constant dynamic state of change.
In order to compete effectively, we have to be able to make sense of all that data quickly and reliably in the cloud. Data processing requires software to run rules and logic. Which poses an interesting conundrum…
- Software is required, but having software is not actually a business objective. The business needs reliable and timely data from ETL/ELT with which to drive analytics and AI/ML.
- Creating software by hand is quite an esoteric and exclusive occupation. It can be a tremendous bottleneck in time and cost.
A Low-Code / No-Code (LCNC) data integration strategy has become a vital part of solving this puzzle. LCNC switches the focus back to where it should be: on the data, not on the process to prepare the data. In this way, two key objectives can be met:
- The ability to iterate data solutions rapidly to keep up with the pace of change.
- The ability to quickly bring a minimum viable data product (MVP) to production. This applies equally to operational reporting, joined-up analytics, right through to data science experiments.
In this article I will talk about the top five features that help make LCNC a legitimate alternative to hand coding data integration and transformation processes, starting, of course, with the fundamental fact that LCNC platforms generate code themselves.
1. Code generation
A good LCNC data integration platform presents a primarily graphical user interface (GUI). Rather than laboriously typing hundreds of lines of code, the user can build sophisticated logic using drag and drop operations that link simpler functions. For example, a Rank component plus a Filter component becomes a deduplicator. The LCNC platform generates the necessary code itself.
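To make the Rank plus Filter example concrete, here is a minimal Python sketch of the kind of logic such a pair of components might stand for. The function names (`rank`, `filter_rows`, `deduplicate`) and the sample customer rows are hypothetical illustrations, not the output of any particular LCNC platform, which would more likely generate SQL for the target warehouse.

```python
from itertools import groupby
from operator import itemgetter


def rank(rows, partition_by, order_by):
    """Rank component (hypothetical): number rows within each partition,
    with rank 1 going to the highest order_by value."""
    # Sort by the ordering column first, then by partition. Python's sort is
    # stable, so the ordering within each partition is preserved.
    ordered = sorted(rows, key=itemgetter(order_by), reverse=True)
    ordered = sorted(ordered, key=itemgetter(partition_by))
    ranked = []
    for _, group in groupby(ordered, key=itemgetter(partition_by)):
        for position, row in enumerate(group, start=1):
            ranked.append({**row, "rank": position})
    return ranked


def filter_rows(rows, predicate):
    """Filter component (hypothetical): keep rows matching the predicate."""
    return [row for row in rows if predicate(row)]


def deduplicate(rows, key, latest_by):
    """Rank + Filter = deduplicator: keep the latest row per key."""
    ranked = rank(rows, partition_by=key, order_by=latest_by)
    kept = filter_rows(ranked, lambda row: row["rank"] == 1)
    # Drop the helper column before returning the result.
    return [{k: v for k, v in row.items() if k != "rank"} for row in kept]


# Made-up sample data: customer 1 appears twice.
customers = [
    {"id": 1, "email": "a@old.example", "updated": "2023-01-01"},
    {"id": 1, "email": "a@new.example", "updated": "2023-06-01"},
    {"id": 2, "email": "b@example.com", "updated": "2023-03-15"},
]
unique_customers = deduplicate(customers, key="id", latest_by="updated")
```

In a graphical tool the user would only connect the two components and choose the key and ordering columns; everything above is the part the platform writes for them.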
This hugely reduces the barrier to entry for non-technical data users. Citizen developers from right across the business can quickly start to contribute. They are able to self-serve and build the business logic in which they themselves are the experts. Rather than being forced to choose between outsourcing or buying technical help, the “self-build” option becomes possible again.
LCNC platforms have a range of capabilities:
- Low Code (or “Code Optional”) – It is possible to add handwritten code, or to modify generated code. Professional software developers are able to add their own bespoke customizations.
- No Code – Fully visual environment/graphical platforms, with no hand coding option at all.
Every LCNC data tool has a range of features associated with a particular domain. Examples are “data loading” and “connecting to an API”. It is therefore common to deploy an LCNC platform made up of multiple LCNC tools. Common frameworks underpin all of the tools. This makes the tools easier to learn, and conveniently provides all the utility functions such as logging.
You should expect to find a variety of productivity enhancements built into an LCNC graphical interface, such as intelligent autocomplete. Furthermore, the generated code must be optimal for the target cloud data platform. These are both examples of AI/ML driven automation features.
2. AI/ML enhanced automation
ETL/ELT data processing software solutions do not exist in isolation. The code generated by a modern LCNC tool will run on a particular cloud data warehouse (CDW), which is itself hosted on a particular cloud infrastructure. Optimizing the interactions between these components is just as challenging as writing software. It is vital that the LCNC platform takes such context into account. The best possible outcome is when that happens without you even having to be aware of it!
Similarly, multiple small automations (such as autofill) can add up to great time savings. These should be driven by automatically harvested metadata – for example from API or database catalogs.
In order to gain confidence and reliability, LCNC data integration platforms should include features such as:
- Proactive alerting, when conditions indicate that a data job is likely to fail
- Cost and runtime forecasting
- Predictive scaling
- Auto detection of data categories, such as personally identifying information (PII)
Rapidly changing environments still require full governance and compliance. LCNC features such as source code control, collaboration, documentation generation and automated data lineage are vital enablers for this.
Schema drift is a fact of life in a constantly evolving data landscape. It can be handled with automation, and also by the ability to design in a declarative way.
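As a rough illustration of handling schema drift automatically, the sketch below widens a target column list when a new field shows up in incoming data, and pads older records so they still fit. The helper names `reconcile_schema` and `conform` are hypothetical, and a real platform would also have to deal with type changes and removed columns, which this sketch ignores.

```python
def reconcile_schema(target_columns, incoming_record):
    """Return the column list after absorbing any new fields in the record."""
    new_columns = [name for name in incoming_record if name not in target_columns]
    return target_columns + new_columns


def conform(record, columns):
    """Shape a record to the column list, filling missing fields with None."""
    return {name: record.get(name) for name in columns}


# Made-up example: the source starts sending a "country" field.
columns = ["id", "name"]
drifted_record = {"id": 7, "name": "Ada", "country": "UK"}

columns = reconcile_schema(columns, drifted_record)
older_record = conform({"id": 8, "name": "Bob"}, columns)
```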
3. Declarative design modes
One of the keys to productivity is to remove distractions. A data-centric platform should empower the user to say, “This is how I want it to end up”. They should not have to spend time bothering with the mechanics – the so-called “imperatives.”
Some examples of declarative interactions are:
- Create this table, in this way, and replace it if it exists already
- Ensure this table exists, with this model, and with these privileges
- Make this source data available to this target platform
In contrast, imperative solutions are full of distractions. For example, trying to drop a table that does not exist typically causes an error. Cue eye roll: I don’t want it to exist!
In declarative mode, the user does not need to remember to run the code at the appropriate moment, in the correct order. Those are problems that machines are good at solving. Declarative design modes make deploying data solutions simpler.
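The difference can be sketched with Python's built-in `sqlite3` module, used here only as a convenient stand-in for a cloud data warehouse. The helper `ensure_table_replaced` is a hypothetical name for the "create this table, and replace it if it exists already" interaction; the table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Imperative: the caller must know whether the table exists. In SQLite,
# dropping a table that is not there raises an OperationalError.
try:
    conn.execute("DROP TABLE sales")
    imperative_failed = False
except sqlite3.OperationalError:
    imperative_failed = True


def ensure_table_replaced(conn, name, columns_sql):
    """Declarative: state the desired end result and let the helper work
    out the steps. Safe to run any number of times.
    Assumes name and columns_sql come from trusted design metadata."""
    conn.execute(f"DROP TABLE IF EXISTS {name}")
    conn.execute(f"CREATE TABLE {name} ({columns_sql})")


# Running it twice gives the same outcome and no error.
ensure_table_replaced(conn, "sales", "id INTEGER, amount REAL")
ensure_table_replaced(conn, "sales", "id INTEGER, amount REAL")

tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
```

The user of the declarative helper never has to think about the current state of the database, which is the distraction the imperative version forces on them.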
4. Ease of deployment
Reporting and analytics teams work on data to make it understandable by humans, and their job is known as transformation and integration. Data scientists work on data to make it consumable by algorithms, and they call it feature engineering. Either way, the tasks are similar. The terms DataOps, MLOps and AIOps reflect the need for system and order in the face of constant change.
A low code or no code data integration platform is a key enabler helping data teams issue software changes and enhancements in short release cycles: a cornerstone of any of the aforementioned ‘Ops’. Considerations in this area include packaging code into transfer formats, and the ability to integrate into a wider DevOps scope. The differences between development and production environments must be parameterized rather than hardwired.
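One way to picture environment differences as parameters is the sketch below, where the same job definition is rendered against either environment. The `Environment` class, the warehouse and database names, and `build_load_statement` are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Everything that differs between environments lives here (hypothetical)."""
    warehouse: str
    database: str
    schema: str


# Made-up values: only this mapping changes between dev and prod.
ENVIRONMENTS = {
    "dev": Environment("wh_small", "analytics_dev", "staging"),
    "prod": Environment("wh_large", "analytics", "staging"),
}


def build_load_statement(env_name, table):
    """Render the same job logic for whichever environment is requested."""
    env = ENVIRONMENTS[env_name]
    return (
        f"INSERT INTO {env.database}.{env.schema}.{table} "
        f"SELECT * FROM {env.database}.raw.{table}"
    )


dev_sql = build_load_statement("dev", "orders")
prod_sql = build_load_statement("prod", "orders")
```

Because nothing environment-specific is hardwired into the job itself, promoting it from development to production becomes a matter of switching the parameter set.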
One simple thing that makes deployment easier is to deploy less. Look for an LCNC platform that provides the ability to invoke reusable modules – ideally with a good range of pre-built functionality, plus the option to parameterize extensively. This means much less copying and pasting, and less need to constantly redeploy: a massive time-saver for any data team.
The ability to create and invoke reusable functionality brings two great benefits:
- There is no need to create the same logic in multiple places, which is often a cause of fragility.
- Standardized definitions can be used, which creates reliable outputs and business confidence. All of the different data teams should be able to share the same definitions of core business concepts.
An LCNC platform should enable users to easily invoke pre-built modules. It should also be easy to create and share new modules, for example through a templating system or a formal marketplace. Every bespoke piece of data handling logic that is specific to the business domain is a great potential differentiator.
Pre-built modules should provide the ability to parameterize, optimize and extend. For example, REST API interactions follow widely shared conventions, so an LCNC user should not have to worry about paging and authentication: they should be handled by the platform.
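To show the kind of plumbing a pre-built API module can hide, here is a minimal sketch of cursor-based paging written once and reused. It assumes a hypothetical response shape with `items` and `next_cursor` keys, and uses a fake in-memory `fake_fetch_page` function in place of a real HTTP call; real APIs vary in both paging style and authentication.

```python
def fetch_all(fetch_page, auth_token):
    """Yield every item, following the cursor until the source is exhausted.
    The caller never sees individual pages or the token handling."""
    cursor = None
    while True:
        page = fetch_page(cursor=cursor, token=auth_token)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break


# Stand-in for a remote API (hypothetical data and response shape).
FAKE_PAGES = {
    None: {"items": [1, 2], "next_cursor": "p2"},
    "p2": {"items": [3], "next_cursor": None},
}


def fake_fetch_page(cursor, token):
    """Pretend HTTP call: checks the token and returns one page."""
    if token != "secret":
        raise PermissionError("invalid token")
    return FAKE_PAGES[cursor]


records = list(fetch_all(fake_fetch_page, auth_token="secret"))
```

From the user's point of view, the whole interaction collapses to "give me all the records from this source".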
LCNC is just the beginning
There is a misconception that LCNC means the end of software development. In fact quite the opposite! LCNC reflects the fact that software development is so important that there is a need to make it more widely accessible.
- For software developers, LCNC means having to spend less time on mundane activities, and more time delivering value with their unique skills
- For business users, LCNC means being able to directly deploy their unique domain knowledge as working software
Want to learn more about Low Code / No Code data integration?
Matillion provides a complete LCNC data integration and transformation platform that is built for the cloud.
Download Matillion’s example LCNC solutions onto your own Matillion ETL instance.