What is an API?

API stands for Application Programming Interface. In the context of APIs, the term "Application" refers to any software that serves a specific function. An "Interface" can be seen as a service agreement between two applications. This agreement specifies how the two applications communicate with each other through requests and responses.

Simply put, an API is an interface that developers can use to send requests to an application and receive responses from it. These requests can range from retrieving or updating data in the application to configuring the application itself. When working with APIs, the application hosting the API is called the server, while the application or user sending the request is called the client.

Because APIs are the front-end interface for applications, their functions vary widely. Fortunately, the way they are accessed is standardized: an API typically follows one of a small number of established styles, such as SOAP or REST. Every API should include documentation on the expected structure of requests and responses. This documentation is very important, as it defines the proper syntax to use when accessing the API.

Benefits of APIs

  • Access to applications 

APIs give external users access to an application from anywhere in the world as long as they have access to the Internet. This has also allowed companies to give users access to their data without exposing traditional databases or file storage services, expanding the availability of data across systems. 

  • Easy to develop

Documentation is a vital part of an API, so a well-maintained API comes with a documentation page that details the endpoints for requests and the syntax of requests and responses. Communication with APIs also follows standard syntax, so once the first request/response pair is working, the rest becomes an exercise in repetition. 

  • Low overhead

APIs can be a good source of data with little performance overhead. Simple code can access an API over the web, which avoids the issues that come with retrieving data from a database system that needs a driver for server-to-server communication. An API call typically carries less overhead than a driver-based connection. Note that this comes with limitations, which are highlighted in the next section.

Potential Drawbacks with APIs

Security: APIs give external users access to an application that would otherwise not be exposed to the outside world. This makes them an avenue that attackers can use to infiltrate systems. 

Limitations: API limits are specific to each system. Some APIs limit the number of records that can be pulled in a single request, while others limit the number of requests or records allowed per second (rate limiting). 

Performance: APIs are an external access point to an application. While most APIs that data developers use are for extracting data, the application behind the API is not necessarily optimized to transfer large amounts of data.

Types of API

REST APIs

These are the most popular and flexible APIs found on the web today. The client sends a request to the server over HTTP. The server uses this client input to run internal functions and returns output data to the client.

SOAP APIs

These APIs use Simple Object Access Protocol. Client and server exchange messages using XML. This is a less flexible form of API that was more popular in the past.

WebSocket APIs

The WebSocket API is another modern approach to web APIs that commonly uses JSON objects to pass data. A WebSocket API supports two-way communication between client apps and the server. The server can push messages to connected clients without waiting for a request, making it more efficient than a REST API for highly interactive use cases.

RPC APIs

RPC APIs enable one application to request and execute functions or procedures on another application over a network.

Pagination

Pagination is the method used to batch responses from a REST API. It reduces the size of each API response when there are more records than fit in a single response. APIs implement pagination in different ways, so check the API documentation to see whether pagination is available and how it works. If the option is available, it should be used. This usually means adding a pagination parameter to the API call, such as the page size or the page limit.

Pagination is not necessarily enabled in a basic API call, so refer to the API documentation for the syntax required to use it.

Key notes about pagination terminology:

  • "Page" is the word used to describe the batch
  • "Page Size" means the number of records per batch

When setting the number of pages, you are selecting how many batches of records you want to receive.

  • Ex: Requesting 2 pages with a page size of 10 returns 20 records, as 2 batches of 10 records each. Note that some APIs instead use a page parameter to select which page to return, so check the documentation.
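The batching described above can be sketched in Python. This is a minimal sketch rather than any particular API's behavior: `fetch_page` is a hypothetical stand-in for a real HTTP call, the in-memory `RECORDS` list stands in for data held by the server, and numbering pages from 1 is an assumption. Real APIs name and number their pagination parameters differently.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory data standing in for records held by the server.
RECORDS: List[Dict] = [{"id": i} for i in range(1, 26)]


def fetch_page(page: int, page_size: int) -> List[Dict]:
    """Stand-in for one API call: returns one batch (page) of records.

    Pages are numbered from 1 here, which is an assumption.
    """
    start = (page - 1) * page_size
    return RECORDS[start:start + page_size]


def fetch_pages(
    num_pages: int,
    page_size: int,
    fetch: Callable[[int, int], List[Dict]] = fetch_page,
) -> List[Dict]:
    """Collect up to num_pages batches, stopping early if the data runs out."""
    collected: List[Dict] = []
    for page in range(1, num_pages + 1):
        batch = fetch(page, page_size)
        if not batch:
            break  # no more records on the server
        collected.extend(batch)
        if len(batch) < page_size:
            break  # a short batch means this was the last page
    return collected


records = fetch_pages(num_pages=2, page_size=10)
print(len(records))  # 2 batches of 10 records each
```

Stopping on an empty or short batch is one common way to detect the last page; some APIs return an explicit "next page" token or total count instead.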

REST API methods

A GET is used to request data from a specified resource.

A POST is used to send data to a server to create/update a resource.

A DELETE is used to delete a specified resource.

Some notes on GET requests:
  • GET requests can be cached
  • GET requests remain in the browser history
  • GET requests can be bookmarked
  • GET requests should never be used when dealing with sensitive data
  • GET requests have length restrictions
  • GET requests are only used to request data (not modify)

Some notes on POST requests:
  • POST requests are not cached by default
  • POST requests do not remain in the browser history
  • POST requests cannot be bookmarked
  • POST requests have no restrictions on data length in HTTP itself (although servers may impose their own limits)

REST API GET Call Example

Based on this documented API: GET /userconfig/user

Base URL: http://<InstanceAddress>/rest/v1/

Authentication: The method used to authenticate your access to the API. Options can include:

  • Basic - username and password
  • API key or bearer token - A single string that represents the client's access to the server.
  • OAuth 2.0 - A process in which the client obtains an access token that grants limited-time access to the server. 
    • An access token can be requested from the server in exchange for a short-lived authorization code. 
    • An access token can be issued together with a refresh token, which is used to obtain new access tokens and so extend access. 
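As an illustration of the first two options, the sketch below builds the HTTP `Authorization` header for Basic and bearer-token authentication. The helper names and the sample credentials are hypothetical. Some APIs expect an API key in a custom header or a query parameter instead, so the API documentation remains the reference.

```python
import base64
from typing import Dict


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Basic authentication: base64-encode the string "username:password"."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_auth_header(token: str) -> Dict[str, str]:
    """Bearer authentication: send the token as-is after the word "Bearer"."""
    return {"Authorization": f"Bearer {token}"}


# Sample (hypothetical) credentials, for illustration only.
print(basic_auth_header("api-user", "secret"))
print(bearer_auth_header("example-token"))
```

Base64 is an encoding, not encryption, so Basic authentication should only be sent over HTTPS.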

Endpoint: The location you are trying to access in the API

  • Ex: userconfig/user
  • The base URL stays constant, while the endpoint changes based on the API resource you are trying to access.

URI: The full HTTP address of the API location you are trying to access

  • Ex: http://<InstanceAddress>/rest/v1/userconfig/user

Parameters:

  • Added after a “?” at the end of the URI, as name=value pairs separated by “&”. 
  • Ex:
    • http://<InstanceAddress>/rest/v1/userconfig/user/instance?userName=<username>
  • This can include the format being used, data filters, and pagination parameters
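The way the base URL, endpoint, and parameters combine into a URI can be sketched with Python's standard library. The host `instance.example.com` is a placeholder for `<InstanceAddress>`, `build_uri` is a hypothetical helper, and the `userName` value is made up for illustration.

```python
from typing import Dict, Optional
from urllib.parse import urlencode, urljoin

# Placeholder host standing in for <InstanceAddress>.
BASE_URL = "http://instance.example.com/rest/v1/"
ENDPOINT = "userconfig/user"


def build_uri(base_url: str, endpoint: str,
              params: Optional[Dict[str, str]] = None) -> str:
    """Join base URL and endpoint, then append URL-encoded parameters after "?"."""
    uri = urljoin(base_url, endpoint)  # base_url must end with "/" for this to append
    if params:
        uri = f"{uri}?{urlencode(params)}"
    return uri


print(build_uri(BASE_URL, ENDPOINT))
print(build_uri(BASE_URL, ENDPOINT, {"userName": "api-user"}))
```

Using `urlencode` rather than string concatenation ensures that spaces and special characters in parameter values are escaped correctly.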

Body: The information sent in a POST request. It should follow the syntax that the API expects for the request.
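A minimal sketch of attaching a JSON body to a POST request with Python's standard library follows. The payload fields, the placeholder host, and the assumption that this endpoint accepts a POST are all hypothetical; the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Dict, Optional


def build_post_request(uri: str, payload: Dict,
                       headers: Optional[Dict[str, str]] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")  # the body must be bytes
    request = urllib.request.Request(uri, data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    for name, value in (headers or {}).items():
        request.add_header(name, value)
    return request


# Hypothetical URI and payload, for illustration only.
request = build_post_request(
    "http://instance.example.com/rest/v1/userconfig/user",
    {"name": "new-user", "admin": False},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Passing the request to urllib.request.urlopen(request) would actually send it.
```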

Server Response: Once the request is sent, you can expect to receive a response like the following:

[
  {
    "name": "api-user",
    "admin": true,
    "api": true,
    "projectAdmin": false,
    "currentLoggedInUser": false
  }
]
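Once received, a response like the one above can be parsed into native data structures. The sketch below uses the sample response text directly rather than a live call; the variable names are illustrative.

```python
import json

# The sample server response from above, as raw text.
RESPONSE_TEXT = """
[
  {
    "name": "api-user",
    "admin": true,
    "api": true,
    "projectAdmin": false,
    "currentLoggedInUser": false
  }
]
"""

# json.loads turns the JSON array into a list of dicts;
# JSON true/false become Python True/False.
users = json.loads(RESPONSE_TEXT)

# Pick out the names of users flagged as admins.
admin_names = [user["name"] for user in users if user["admin"]]
print(admin_names)  # ['api-user']
```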

Conclusion

API connectivity empowers data engineers to build dynamic, interconnected, and efficient data pipelines. APIs help realize the full potential of automation, bring opportunities for better DataOps practices, and boost overall connectivity.

The Matillion Data Productivity Cloud offers several methods for adding API functionality, including built-in connectors, webhooks, and the ability to create bespoke connectors.

Carlos Calderon

Solutions Architect