Solving GraphQL Pagination in the Custom Connector Using Script Paging

Extracting data from REST and GraphQL APIs can require flexibility. Matillion's Custom Connector combines a low-code, dialog-based interface for common paging scenarios with a code-based Script Paging language for more complex ones.
This article outlines how to effectively handle GraphQL pagination in Matillion using Script Paging.
REST API vs. GraphQL Pagination
Typical REST APIs offer multiple endpoints, which accept relatively simple parameters. In contrast, typical GraphQL APIs offer a single endpoint which can accept a sophisticated query.
In the absence of formal standards, both kinds of APIs offer several commonly implemented pagination methods. These include:
- Page-based pagination (e.g., ?page=1)
- Offset-based pagination (e.g., ?offset=50&limit=50)
- Link headers (containing URLs for subsequent pages)
- Full/Relative-path pagination (URLs or paths for next pages within the response)
- Cursor-based pagination (using unique identifiers for results)
Both REST and GraphQL API endpoints can use any of the above approaches to pagination, or subtle variations of them. The more sophisticated query expressions in GraphQL APIs offer their own individual ways to implement pagination using formatted requests and responses.
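As a quick illustration of the most common of these approaches, here is a plain-Python sketch (not Matillion code) of a client consuming page-based pagination. It assumes a response shape like the Rick and Morty API used later in this article, and `fetch_page` is a hypothetical stand-in for the actual HTTP call:

```python
def fetch_all(fetch_page):
    """Collect results from a page-based API until no 'next' page remains."""
    page = 1
    results = []
    while page is not None:
        body = fetch_page(page)          # e.g. GET ...?page=<page>
        results.extend(body["results"])
        page = body["info"]["next"]      # None once the last page is reached
    return results
```

Cursor-based and offset-based variants follow the same loop shape; only the token carried between requests changes.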
For API users, GraphQL's more sophisticated approach to structuring queries means embedding pagination parameters alongside the query itself.
GraphQL and Matillion Custom Connector
Matillion's Custom Connector supports parameterizing the page value either in a JSON POST body or as a query parameter in a GET request.
For GraphQL queries that are sent via a POST with a JSON body, the challenge lies in dynamically updating the page value for each subsequent request. Similarly, with GET requests, the page value within the query parameters needs to be updated.
These scenarios are exactly what Matillion's Script Paging capabilities were designed for.
Example: Rick and Morty GraphQL API
This article uses the public Rick and Morty GraphQL API for demonstration. A sample GraphQL query looks like this:
query {
  characters(page: 1, filter: {name: "Morty"}) {
    info {
      count
      next
      pages
      prev
    }
    results {
      created
      gender
      image
      name
      species
      status
      type
      id
    }
  }
}
And a sample response (truncated for brevity):
{
  "data": {
    "characters": {
      "info": {
        "count": 107,
        "next": 2,
        "pages": 6,
        "prev": null
      },
      "results": [
        {
          "created": "2017-11-04T18:48:46.250Z",
          "gender": "Male",
          "id": "1",
          "image": "https://rickandmortyapi.com/api/character/avatar/1.jpeg",
          "name": "Rick Sanchez",
          "species": "Human",
          "status": "Alive",
          "type": ""
        },
        // ... more results
      ]
    }
  }
}
Note the pagination information nested inside the /data/characters/info object. This is page-based pagination, where the response contains the total number of pages, plus the current, next and previous page numbers.
Understanding Parameterisation in GraphQL
GraphQL queries allow variables to be passed separately from the query itself. This improves flexibility and reusability.
For example, instead of hardcoding values like this:
query {
  characters(page: 1, filter: {name: "rick"}) {
    info {
      count
      next
      pages
      prev
    }
    results {
      name
      id
    }
  }
}
... you can structure the query to accept dynamic input like this:
query ($page: Int!) {
  characters(page: $page, filter: {name: "rick"}) {
    info {
      count
      next
      pages
      prev
    }
    results {
      name
      id
    }
  }
}
Then, a variables object provides the actual value:
"variables": {"page": 1}
What Does Int! Mean?
In GraphQL, Int is a scalar type representing integer values.
The exclamation mark (!) marks the parameter as non-nullable: the query must receive a value for it, or the server will return an error.
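To make the query/variables split concrete, here is an illustrative Python sketch (not Matillion code) of how a client would assemble the JSON body that carries both the parameterised query and its variables:

```python
import json

# The parameterised query text stays fixed; only the variables change.
QUERY = """
query ($page: Int!) {
  characters(page: $page, filter: {name: "rick"}) {
    info { count next pages prev }
    results { name id }
  }
}
"""

def build_payload(page):
    # The server receives one JSON document holding query text plus variables
    return json.dumps({"query": QUERY, "variables": {"page": page}})
```

Each new page then only requires a new variables object; the query itself never needs rewriting.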
Custom GraphQL Paging with a POST Request - Solution 1
Start by setting up your custom connector to send a POST request to the Rick and Morty GraphQL API endpoint at https://rickandmortyapi.com/graphql.
First add a constant request header with:
Content-Type: application/json
Use the following JSON structure for the request body, with a placeholder parameter for the page value:
{
  "query": "query ($page: Int!) { characters(page: $page, filter: {name: \"rick\"}) { info { count next pages prev } results { created gender id image name species status type } } }",
  "variables": {
    "page": 1
  }
}
Choose script-based pagination and add the following Paging Script:
// Extract next page value and replace page value in variables
var current_page = @pager.pageCount();
var next_page = current_page + 1;
@request.body.put("/variables/page", next_page);
// Stop paging when current page is same as total pages
var total_pages = @response.body.get("/data/characters/info/pages");
@pager.stop(current_page == total_pages);
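For reference, the stopping logic above is equivalent to this plain-Python loop (a sketch, not Matillion's Script Paging runtime; `send_request` stands in for the POST call with the updated variables):

```python
def page_through(send_request):
    """Increment the page value until it equals /data/characters/info/pages."""
    page = 1
    all_results = []
    while True:
        response = send_request(page)          # POST with {"page": page}
        characters = response["data"]["characters"]
        all_results.extend(characters["results"])
        if page == characters["info"]["pages"]:  # the @pager.stop condition
            break
        page += 1                                # next /variables/page value
    return all_results
```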
Test the pagination, and then the request itself, to make sure the custom connector is working.
Custom GraphQL Paging with a GET Request - Solution 2
This custom connector will be very similar to solution 1, but this time starting with a GET request to this URL:
https://rickandmortyapi.com/graphql?query=query($page:Int!){characters(page:$page,filter:{name:"rick"}){info{count next pages prev }results{created gender id image name species status type}}}&variables={"page":1}
Choose script-based pagination and implement the Paging Script like this:
// Extract current page query parameter value (JSON object)
var currentVariables = @request.query.get("variables");
// Get the current page value from the variables JSON object
var currentPage = @json.get("page", currentVariables);
// Increment current page value for next request
var nextPage = currentPage + 1;
// Update the variables in the query parameter
var nextVariables = @json.put("/page", nextPage, currentVariables);
// Replace the variables query parameter value with updated JSON object
@request.query.put("variables", nextVariables);
// Get the total number of pages
var totalPages = @response.body.get("/data/characters/info/pages");
// Stop paging when the current page value is the same as total pages
@pager.stop(currentPage == totalPages);
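The subtlety in the GET case is that the `variables` query parameter is itself a JSON document that must be parsed, updated, and re-serialised on every request. This Python sketch (illustrative only, not Matillion code) shows that round trip:

```python
import json
from urllib.parse import urlencode

def next_request_params(current_params):
    """Rewrite the 'variables' query parameter for the next page."""
    variables = json.loads(current_params["variables"])   # parse the JSON value
    variables["page"] += 1                                # like @json.put("/page", ...)
    return {**current_params, "variables": json.dumps(variables)}

params = {
    "query": "query($page:Int!){characters(page:$page){info{pages}}}",
    "variables": '{"page": 1}',
}
params = next_request_params(params)
query_string = urlencode(params)  # query string for the next GET request
```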
Integrating semi-structured data
Once you have saved the custom connector, you can use it in an orchestration pipeline. For example, saving the data to a table named stg_rm_data.
The output from the GraphQL query is one semi-structured row per page: six pages at the time of writing.
To convert this into rows and columns requires an Extract Nested Data component in a Transformation pipeline.
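The Extract Nested Data component performs this flattening declaratively; as a sketch of the equivalent idea (illustrative Python, with a reduced set of columns), one semi-structured page becomes flat rows like this:

```python
def extract_nested(page_document):
    """Turn one semi-structured page into flat rows (dicts of columns)."""
    rows = []
    for character in page_document["data"]["characters"]["results"]:
        rows.append({
            "id": character["id"],
            "name": character["name"],
            "species": character["species"],
            "status": character["status"],
        })
    return rows
```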
Once the data has been converted to rows and columns, it's easy to work with further – for example to summarize, or to integrate with other data sets.
Quick and Simple Custom Data Source Connectivity
As you can see from the example documented here, Matillion's flexible design allows you to quickly and easily connect to any data source, whether it's structured, semi-structured, or unstructured. This means you spend less time building and maintaining connectors, and less time on manual coding. Matillion makes it faster to bring in more data sources, helping you innovate quickly and move swiftly to market.
In practice, using Matillion's Custom Connector feature, you can handle the tricky parts of GraphQL pagination. You can use POST or GET requests to change pagination settings on the fly and retrieve all the data you need from GraphQL APIs.
Matillion’s low-code interface makes it straightforward to build most Custom Connectors, but it also offers more advanced coding options if you need them. The upshot is that you can quickly start using newly accessed data sources to add more value to your business data, enhancing its overall utility.
For more detailed help on using Custom Connectors and Script Paging, you can refer to the Matillion documentation.
Don't want to build this yourself? Download the pre-built connector from the Matillion Exchange.
Darlington Moyo
Staff Software Engineer
Darlington Moyo is a Staff Software Engineer at Matillion with a decade of experience designing and building data integration tools for Matillion's platforms. Specialising in architecting scalable and efficient data extraction solutions, he develops software that enables seamless connectivity between diverse data sources and cloud data warehouses. When not coding or refining architectures, he enjoys exploring new technologies and sharing insights on software development best practices.