Daylight Savings w/ Matillion ETL For Amazon Redshift
Every year at the end of March, Europe puts the clocks forward 1 hour, signalling the longer days of summer in the northern hemisphere. You may need to adjust for Daylight Savings with Matillion ETL for Amazon Redshift so that your data is accurate. At the point of changeover, everyone’s local time skips forwards one hour relative to UTC, creating longer daylight in the evenings - at the cost of one more hour of darkness in the morning! The rules guiding this changeover have been consolidated in Europe, and can easily be implemented in a Matillion ETL Transformation job. In this article we’ll create a reference table that could be used in any other ELT job to provide the start and end dates of Daylight Savings changeovers every year.
Date generationThe rules for finding the Daylight Savings changeover dates are simple:
- Start of Daylight Savings - last Sunday in March
- End of Daylight Savings - last Sunday in October
UTC and BST calculationsNext it’s easy to convert the integer sequence into a date sequence, using an Amazon Redshift SQL expression such as TO_DATE. Matillion ETL for Amazon Redshift has a Calculator component for this purpose, which guides you through adding derived columns. We’ll add a simple date first of all: Then, using some more date expressions, the following additional columns:
- Day Name
- Year ID
- Month ID
- Final Sunday - using an analytic function to find the final Sunday in every month
- Is Last Sunday in March - flagging if this is the final Sunday in March
- Is Last Sunday in October - flagging if this is the final Sunday in March
Filtering and Aggregation for Daylight SavingsBy this stage, we have a data set that includes the dates of interest (last Sunday in every March and last Sunday in every October), but also a whole lot of other dates which aren’t relevant. We need to filter them out. For this purpose we add a filter, in “OR” mode, which selects only the Daylight Savings start or end dates. The data is nearly at the correct granularity now, but just needs a final aggregation step to create one record per year. This is a Matillion ETL Aggregate component, which chooses the minimum and maximum of the filtered dates: the minimum is the Daylight Savings start, and the maximum is the Daylight Savings end.
Creating the OutputAfter the aggregation, the data is correct but the column names are still rather unfriendly. This is easily addressed using a Rename component, which provides a simple mapping interface: The output data is now ready to be materialized into a table for general use. The Rewrite Table component in a Transformation job is perfect for this purpose, and simply writes the data into a new Amazon Redshift table. The completed job looks like this, and is available for download as an attachment. We are now as prepared as we can be for losing an hour’s sleep!
Manager of Developer Relations
Data Mesh vs. Data Fabric: Which Approach Is Right for Your Organization? Part 3
In our recent exploration, we've thoroughly analyzed two key ...eBooks
10 Best Practices for Maintaining Data Pipelines
Mastering Data Pipeline Maintenance: A Comprehensive GuideBeyond ...News
Matillion Adds AI Power to Pipelines with Amazon Bedrock
Data Productivity Cloud adds Amazon Bedrock to no-code generative ...