Matillion’s New AI Auto-Documentation Component

Richard McEnery, Senior Solution Architect at Amplify, a Matillion Partner, wrote this guest blog. Amplify Consulting Partners is a data-first consulting company that delivers high-impact professional services across the technology ecosystem - from data engineering and visual analytics to data-driven marketing and everything in between. 

Amplify clients are always interested in staying up to date on the latest feature releases of their technology stack. As a Matillion Gold Partner, it’s our responsibility to not only provide that information but also test it out and give our honest recommendation. We recently had a client who was interested in generating documentation for their data pipelines and incorporating AI. This article is focused on sharing our experience testing out Matillion’s AI Auto-Documentation tool in public preview.

Documentation: Ugh!

I completely understand the struggle with documentation. It’s tedious, time-consuming, and a common challenge for many developers, myself included. Late last year, Matillion unveiled a set of AI tools that can be incorporated into Data Productivity Cloud (DPC) pipelines, including AI Auto-Documentation, which uses artificial intelligence to generate documentation for data pipelines. As part of our AI Private Preview access, I was able to play around with AI Auto-Documentation and got some interesting results.

Documentation: Good

I started with an existing pipeline that iterates through a list of Salesforce objects and loads the object data into Snowflake tables. Since the list of objects is stored in a Snowflake table, adding another Salesforce object is as simple as adding a row to the table. Each object can be loaded either as a full replacement of its target table or as an update to an existing table.
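For readers who think in code, here is a minimal Python sketch of that metadata-driven pattern. It is not the Matillion pipeline itself: the `ObjectConfig` class, the load-mode names, the `SF_` table prefix, and the sample rows are all assumptions made up for illustration, and an in-memory list stands in for the Snowflake control table.

```python
from dataclasses import dataclass

# Hypothetical load modes; the article only says a load can be a full
# replacement or an update to an existing table.
FULL_REPLACE = "full_replace"
UPDATE = "update"


@dataclass(frozen=True)
class ObjectConfig:
    """One row of the (hypothetical) control table."""
    object_name: str
    load_mode: str


def plan_load(cfg: ObjectConfig) -> str:
    """Describe the load step for one Salesforce object."""
    target = f"SF_{cfg.object_name.upper()}"  # assumed naming convention
    if cfg.load_mode == FULL_REPLACE:
        return f"replace {target} with all rows from {cfg.object_name}"
    if cfg.load_mode == UPDATE:
        return f"merge changed rows from {cfg.object_name} into {target}"
    raise ValueError(f"unknown load mode: {cfg.load_mode!r}")


def plan_pipeline(control_rows: list[ObjectConfig]) -> list[str]:
    """Iterate over the control rows, producing one load step per object."""
    return [plan_load(cfg) for cfg in control_rows]


# Stand-in for the rows that would be read from the Snowflake control table.
control_table = [
    ObjectConfig("Account", FULL_REPLACE),
    ObjectConfig("Opportunity", UPDATE),
]

# Onboarding another object is just one more row; the loop does not change.
control_table.append(ObjectConfig("Contact", UPDATE))

for step in plan_pipeline(control_table):
    print(step)
```

The point of the design is that the iteration logic never changes; only the data driving it does.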

Starting with the high-level pipeline, I selected the components, right-clicked on the canvas and chose “Add note using AI.”

What I got back was a text box documenting the pipeline context, with two options: Regenerate or Add. ‘Regenerate’ does exactly what you would expect: it generates a new version of the documentation. ‘Add’ places the note on the pipeline canvas, where you can resize the text box or move it around as needed.

Documentation: Better

Not bad! I pressed ‘Add’ to save the text and then discovered that you can subsequently update the content by simply clicking on it, as you would any other note. I added some text explaining in more detail what the pipeline is doing. When editing the note, I noticed certain characters around words and realized that notes support Markdown syntax for emphasis. For example, a pair of asterisks (**) on both sides of a word or phrase renders it in bold, while backticks (`) add a gray background to the enclosed text. Other standard Markdown formatting applies, too.
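To make the two markers concrete, here is a tiny Python sketch that builds the raw text you would type into a note. The helper names and the sample sentence are invented for illustration; only the `**` and backtick markers come from what I saw in the editor.

```python
def bold(text: str) -> str:
    """Wrap text in double asterisks, the Markdown marker for bold."""
    return f"**{text}**"


def code(text: str) -> str:
    """Wrap text in backticks, which the note renders with a gray background."""
    return f"`{text}`"


# Hypothetical note content; this is the raw text before it is rendered.
note = f"{bold('Purpose:')} load each Salesforce object into {code('Snowflake')}."
print(note)
```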

Documentation: Wow!

My next experiment was to try it on the Orchestration pipeline called the Table Iterator. I dragged a selection box around all the components, right-clicked to select “Add note using AI” again, and let the AI do its thing. It generated a nice description of the pipeline’s function and logic, which I rearranged into the order I wanted and expanded with a bit more context. Here is the final annotation:

Takeaways

Alongside the unveiling of its AI tools at the Data Unlocked virtual summit in October, Matillion talked about its AI charter, a set of rules for how it will work in the AI space. I was incredibly impressed that the company went that far to help engineers and developers understand its roadmap and vision. It demonstrates a real commitment to both the technology and the ethics of AI. Tools like AI Auto-Documentation will make me more productive while staying true to the “human in the middle” aspect of the charter (making humans more productive, not replaceable). It enables teams to work together more effectively by allowing collaborators to quickly document what they’re doing and highlight key component information while still giving space for customization. At the end of the day, good documentation should be easy to digest without looking at the code, and this does just that.

Are you looking for innovative solutions to revolutionize your data engineering practices? Look no further! Watch Matillion's latest webinar: AI. The Future of Data Engineering. 

Richard McEnery

Sr. Solution Architect

Richard McEnery is a Sr. Solution Architect with five decades of experience across multiple data platforms and technologies, including SQL, Snowflake, and Matillion. He’s led many successful and complex client projects as an architect and developer and contributes to the design and implementation of new data ingestion and data modeling techniques. Richard is also a Certified Matillion Associate.