Matillion Patterns for Working with Complex JSON data from an API Source

In this blog post, we will explore and demonstrate how Matillion Data Productivity Cloud simplifies the ingestion of complex JSON data from API sources. Accompanied by a demonstration video, we’ll focus on a common integration scenario. 

The video showcases the usage of a straightforward API, specifically the AWS IP Ranges endpoint, for retrieving AWS IP address range details. The format of the JSON response is a simple representation of the characteristics one might find in a more complex API response, with different levels of granularity within the response data. The general approach outlined is to use Matillion Custom Connector to integrate with a custom API endpoint, shaping the schema of the response to extract nested JSON, arrays and other fields at a common level of granularity. Then, use Matillion Designer to load that API data into a Cloud Data Platform and further parse the nested JSON and arrays into relational tables and columns. The video outlines some best practices and patterns that can be applied towards many similar use cases that are common when working with semi-structured data from API sources.

 

Here are some key points that are highlighted in this video. If keeping these points in mind, you will hopefully find this as a sensible and repeatable approach for all of your custom API integration needs:

  1. API Understanding: Have an understanding of the API response format and the elements in the response that are of interest.
  2. Matillion Custom Connector: Matillion Custom Connector provides an easy-to-use, low code approach to integrating with almost any REST API.
  3. Shape the API Response: Using Matillion Custom Connector, shape the schema of the API response to extract nested semi-structured fields and common fields at a common level of granularity.
  4. Flattening Arrays: When turning an array of data into individual rows, use the Flatten Variant component.
  5. Flattening JSON Data: When parsing attributes and values from a JSON object, use the Extract Nested Data component.
  6. Data Structure and Parsing: Use the Rename component for removing fields from your Pipeline when they are no longer needed.

 

Quick Recap

In this short recap video, we focus on the above key takeaways in Matillion Data Productivity Cloud.

 

Takeaways 

Mastering API integration and response handling is crucial for effective data management. By understanding the API response format and the key elements of interest, you can make the most of the Matillion Custom Connector. This tool offers a user-friendly, low-code solution for integrating with nearly any REST API.

For converting arrays into individual rows, the Flatten Variant component is indispensable, while the Extract Nested Data component excels in parsing attributes and values from JSON objects. Additionally, the Rename component ensures a streamlined data structure by removing unnecessary fields from your pipeline. With these tools and techniques, you can efficiently manage and utilize API data in your workflows.

To experiment with these features, start a free trial today or book a demo

Arawan Gajajiva
Arawan Gajajiva

Principal Architect - Sales Engineering COE