What's New & Latest Features | Matillion ETL for Redshift

What’s New

  • 13-03-2017 [1.26] EMR Load component, YouTube Query, Connection Manager, Java 8 Compatibility and more


    • – A new YouTube Query Component.
    • – A new EMR Load Component to make it easier to natively load EMR data sets.
    • – Environment Explorer Tree shows UDFs, Primary Keys, Sort Keys, Distribution Keys.
    • – Validation of Orchestration tasks now runs in the background and appears in the Task panel (in the same way as Transformation tasks). This is more predictable, particularly for components that take longer to validate against third-party APIs.
    • – A new Connection Manager allows you to see and control connected sessions. This will also prevent users from being locked out when they hit their connection limit.
    • – All data-loading components now support a “Load Options” parameter to control:
      • – Keeping the objects in S3 after the load completes for archive purposes.
      • – Turning off automatic compression analysis.
      • – Turning off automatic statistics gathering.
    • – Users (using internal security) can be added/removed without requiring a restart.
    • – The If component now logs the decision taken to the task panel. This will help users diagnose decision logic problems.
    • – Matillion now runs on Java 8.
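The compression-analysis and statistics switches in “Load Options” resemble the `COMPUPDATE` and `STATUPDATE` parameters of Redshift's `COPY` command. The following is a minimal sketch of that idea, not Matillion's implementation: `build_copy_sql` is a hypothetical helper, and the table name, bucket and role ARN are placeholders. Keeping the staged objects in S3 is a separate clean-up decision made after the load, so it is not modelled here.

```python
def build_copy_sql(table, s3_path, iam_role, comp_update=True, stat_update=True):
    """Build a Redshift COPY statement (hypothetical helper, illustration only).

    comp_update=False turns off automatic compression analysis.
    stat_update=False turns off automatic statistics gathering.
    Inputs are interpolated directly, so only pass trusted values.
    """
    options = [
        "COMPUPDATE ON" if comp_update else "COMPUPDATE OFF",
        "STATUPDATE ON" if stat_update else "STATUPDATE OFF",
    ]
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        + "\n".join(options)
        + ";"
    )


# Placeholder values, chosen only for this example.
sql = build_copy_sql(
    "public.sales",
    "s3://example-bucket/staging/sales/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    comp_update=False,
    stat_update=False,
)
print(sql)
```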
  • 18-01-2017 [1.25] Enhanced Encryption, Text Output Component, New Data Connectors and more

    New Data Connectors


    New Components

    • – New Text Output orchestration component simplifies export to CSV and other text-based formats.
      • – Similar to S3 Unload, with support for headers


    Other Changes

    • – File Iterator now supports S3; you can loop over a list of files in an S3 bucket
    • – KMS Encryption option in password manager allows you to use AWS managed encryption keys to encrypt passwords in Matillion ETL.
    • – Run Transformation / Run Orchestration components now support variable overrides to make it easier to run jobs in a reusable manner
    • – Added support for the boolean data type
    • – The scheduling test will check your maintenance window and warn of possible overlaps
    • – Some orchestration components such as Create Table have an SQL Tab so it is easy to understand the generated SQL
    • – Additional methods available on Matillion date variables will simplify using dates in variables
    • – New cleaned-up and simplified Sample tab
    • – Hundreds of other tweaks, minor improvements and bug fixes
  • 22-11-2016 [1.24] S3 Load Generator, Project Sharing, Chat Features, New Data Connectors and more

    New Data Connectors

    • – SugarCRM Query
    • – Dynamics NAV Query
    • – Oracle Eloqua Query
    • – SAP Netweaver Query


    New Components

    • – Delete Tables – Remove tables, such as temp tables, as part of an Orchestration.
    • – S3 Get Object – Get S3 Objects and push them to SFTP, HDFS and Windows File Shares.
    • – File Iterator – Iterate over a list of pattern matched objects in an FTP, SFTP, HDFS or Windows Fileshare.


    S3 Load Generator

    • – This tool helps generate compatible “Create Table” and “S3 Load” components by sampling delimited data files on S3 and guessing the layout.
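The idea of sampling a delimited file and guessing its layout can be sketched as follows. This is an illustration under stated assumptions, not the S3 Load Generator's actual logic: the sample has a header row, the candidate delimiters and the three type names are choices made for this example, and all function names are hypothetical.

```python
import csv
import io

CANDIDATE_DELIMITERS = [",", "\t", "|", ";"]


def guess_delimiter(sample):
    """Pick the candidate that splits every sampled line into the same
    number of fields, preferring the one that yields the most fields."""
    lines = [line for line in sample.splitlines() if line]
    best, best_fields = ",", 1
    for delim in CANDIDATE_DELIMITERS:
        counts = {len(next(csv.reader([line], delimiter=delim))) for line in lines}
        if len(counts) == 1:
            fields = counts.pop()
            if fields > best_fields:
                best, best_fields = delim, fields
    return best


def guess_type(values):
    """Return the narrowest of INTEGER, FLOAT, VARCHAR that fits every value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR"
    for type_name, cast in (("INTEGER", int), ("FLOAT", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "VARCHAR"


def guess_layout(sample):
    """Return a list of (column_name, guessed_type) pairs for the sample."""
    delim = guess_delimiter(sample)
    rows = list(csv.reader(io.StringIO(sample), delimiter=delim))
    header, data = rows[0], rows[1:]
    columns = list(zip(*data))
    return [(name, guess_type(col)) for name, col in zip(header, columns)]


sample = "id|name|amount\n1|alice|3.50\n2|bob|4.25\n"
layout = guess_layout(sample)
print(layout)
```

A guess made from a small sample can be wrong for later rows, which is why the generated components are a starting point to review rather than a final definition.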


    Project Sharing

    • – Private projects can be created
    • – Projects have an owner who controls which other users can collaborate


    Automated Backups

    • – You may enable automated daily backups of the Matillion ETL for Redshift instance root volume


    New Chat and Presence Features

    • – You can see who else is collaborating with you, and chat to them. Chats are persisted to provide context on your project


    Other Improvements

    • – Create Table and Fixed Flow components support additional data types (Integer, Date). More to follow.
    • – S3 Put Object now supports S3 as a source (in case you have ZIP files on S3 that need unpacking before loading to Redshift).
    • – The SQL component can now be used at the beginning of a flow.
    • – We now include an API profile for Matillion’s API to copy the run history to Redshift. The API Query component can be used to query this data and import to Redshift.


    Plus hundreds of minor improvements and bug fixes.

  • 23-09-2016 [1.23] Real-time validation, improved editor windows, searchable task history, component improvements and more

    New features
    • – The Sample tab now allows filtering to assist debugging complex transformations.
    • – Real-time validation of expressions in the Expression Editor
      • – Your syntax is checked by Redshift as you type.
    • – Jobs and folders in the explorer can be moved and copied in bulk.
    • – Improved editor windows. You can see available variables and test your code without leaving the editor when writing Python and SQL Scripts.
    • – Notes can now include bold, underlined and italic text, as well as hyperlinks.
    • – The Task History is now searchable, and opens in a separate tab.
    • – In the environment navigation, browse the available tables, views and columns within each environment, and drag and drop them into a Transformation.


    Component improvements
    • – The S3 Load Component can specify an IAM Role ARN that is attached to your Redshift cluster.
    • – On the Table Output Component, “Analyze Compression” now supports an “If not compressed already” setting.
    • – Python modules can now be installed with ‘pip’, and the latest boto3 API is now included by default for interaction with AWS services.
    • – The RDS Bulk Output Component now supports output to PostgreSQL databases.
    • – The S3 Put Component can now read directly from HDFS.
  • 02-08-2016 [1.22] Non-blocking task queue, profile editor, preview API, new data load components and more

    Key features
    • – Non-blocking task queue allows users to collaborate more seamlessly without being blocked by each other's requests.
      • – Multiple runs of the same job will queue.
      • – All other runs may happen concurrently, regardless of the environment.
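The queueing rule above (runs of the same job wait for each other, everything else proceeds concurrently) can be illustrated with one lock per job name. This is a minimal sketch with a hypothetical `JobRunner` class and made-up job names; it says nothing about how Matillion implements its task queue.

```python
import threading
import time
from collections import defaultdict


class JobRunner:
    """Runs of the same job are serialised; different jobs run concurrently."""

    def __init__(self):
        self._locks = defaultdict(threading.Lock)  # one lock per job name
        self._guard = threading.Lock()             # protects the lock table and the log
        self.log = []

    def _lock_for(self, job_name):
        with self._guard:
            return self._locks[job_name]

    def _record(self, job_name, event):
        with self._guard:
            self.log.append((job_name, event))

    def run(self, job_name, work):
        # A second run of the same job blocks here until the first finishes.
        with self._lock_for(job_name):
            self._record(job_name, "start")
            work()
            self._record(job_name, "end")


runner = JobRunner()
threads = [
    threading.Thread(target=runner.run, args=(name, lambda: time.sleep(0.05)))
    for name in ("load_sales", "load_sales", "load_customers")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

for entry in runner.log:
    print(entry)
```

A plain lock does not guarantee that waiting runs start in the order they were submitted; a real queue would need an explicit ordering structure for that.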


    New components
    • – Load data from HubSpot with the HubSpot Query Component
    • – Load OData Sources with the OData Query Component
    • – Load Microsoft Excel spreadsheets with the Excel Query component
    • – Load Google AdWords data with the Google AdWords Query component
    • – SFTP Put Object component allows you to write from Redshift back to an SFTP server
    • – Retry Component allows automatic retrying and back-off, which is most useful for third-party APIs that are not 100% reliable
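Retrying with back-off, as offered by the Retry Component, generally means waiting longer after each failed attempt before trying again. A minimal sketch of the pattern follows; the `retry` function, its parameter names and the doubling schedule are assumptions made for this example and do not describe the component's actual settings.

```python
import time


def retry(call, attempts=3, initial_delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call `call()` until it succeeds or `attempts` is exhausted.

    The wait between attempts starts at `initial_delay` seconds and is
    multiplied by `backoff` after every failure. The last error is re-raised.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff


# A stand-in for an unreliable third-party API: fails twice, then succeeds.
calls = {"count": 0}


def flaky_api():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = retry(flaky_api, attempts=5, initial_delay=0.5, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the example fast and makes the back-off schedule easy to inspect.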


    • – S3 Put Object now supports copying a file from a Windows File Share
    • – You can now run an orchestration job from part way through
    • – Profile editor for building data profiles to describe how APIs map to tables and columns that can then be queried from the API Query Component
    • – Import/Export now includes Variables and Environments
    • – Notices/warnings/errors are now displayed on a new “Notices” tab
    • – Experimental API to import/export entire projects, run jobs, monitor running jobs
      • – Can be used for integration to 3rd party source control management systems
      • – Ask support for more details on how to get started with this


    Plus hundreds of bug fixes, performance improvements and minor features!

  • 07-06-2016 [1.21] Project selection and organisation, enhanced user management, new data load components and more

    Key Features
    • – Project selection and organisation improvements
    • – New components for Google Spreadsheets, Marketo, RDS Bulk Output, Bash Script and CloudWatch Publish


    Other Features
    • – Centrally Managed Passwords
    • – Manage Users, Software Upgrades and more through a new Admin screen
    • – Copy/Paste settings between spreadsheets/text files and the Grid Editor
    • – SNS/SQS/RDS Components will offer Topics, Queues and Endpoints to choose from (if the given credentials allow it)


    Plus dozens of other minor improvements and bug fixes.

  • 22-04-2016 [1.20] Multi-threaded orchestration, new data load components and lots more

    Key Features
    • – Concurrent execution of orchestration tasks (orchestration multi-threading).
    • – New load components for Salesforce, MongoDB, DynamoDB, Google BigQuery, NetSuite, Microsoft Dynamics and LDAP.


    Other Features
    • – ‘S3 Put’ ZIP unpacking (unpack the contents of a ZIP file from a remote FTP/HTTP server to an S3 bucket).
    • – Support for Views as data input in Transformation.
    • – Support for Views as data output in Transformation.
    • – Teradata support added to Database Query component.
    • – RDS certificate support added to keystore functionality.


    Over 90 other minor improvements and bug fixes.

  • 26-02-2016 [1.19] Social media and API support. Commitment control. Improvement for SCDs.

    Key Features
    • New load components for Facebook, Twitter and Google Analytics.
    • New API Query component, loads data from any REST, JSON or SOAP API.
    • Commitment control functionality.


    Other Features
    • Detect Changes component – for SCDs and real-time updates
    • Improved task cancellation
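Change detection for slowly changing dimensions usually compares two versions of a data set by key and classifies each row. The sketch below shows the general technique only; `detect_changes`, its arguments and the single-letter labels (N, D, C, I) are choices made for this example and are not taken from the component's documentation.

```python
def detect_changes(master, second, compare_columns):
    """Classify each key found in either data set.

    master, second: dicts mapping a business key to a row (a dict of columns).
    Returns a dict mapping key -> label:
      "N" new (only in second), "D" deleted (only in master),
      "C" changed (a compared column differs), "I" identical.
    """
    result = {}
    for key in master.keys() | second.keys():
        if key not in master:
            result[key] = "N"
        elif key not in second:
            result[key] = "D"
        elif any(master[key][c] != second[key][c] for c in compare_columns):
            result[key] = "C"
        else:
            result[key] = "I"
    return result


# Example rows, invented for this illustration.
current_dimension = {
    1: {"name": "Ann", "city": "Leeds"},
    2: {"name": "Bob", "city": "York"},
    3: {"name": "Cat", "city": "Hull"},
}
latest_extract = {
    1: {"name": "Ann", "city": "Leeds"},
    2: {"name": "Bob", "city": "Bath"},
    4: {"name": "Dan", "city": "Bury"},
}

changes = detect_changes(current_dimension, latest_extract, ["name", "city"])
for key in sorted(changes):
    print(key, changes[key])
```

For a type 2 SCD, rows labelled as changed would typically close the current dimension record and insert a new one, while new rows are simply inserted.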


    Over 100 other minor improvements

  • 15-01-2016 [1.18] Improved Orchestration flow control and component variables. Improved security support. Redshift UDF support.

    Key Features
    • New ‘If’ orchestration component
    • New component variable feature set (including row count, duration, error handling, performance information)
    • Added S3 Manifest Writer component
    • Added Transpose Rows component
    • Added support for SSL
    • Support for Redshift User Defined Functions


    Other Features
    • Support for interleaved sorts
    • Encryption on S3 load
    • AVRO support for S3 load
    • Improved script editors to add syntax highlighting and auto-complete
    • DB2 support in the Database Query component.


    Dozens more minor improvements

  • 30-11-2015 [1.17] Iteration, multi-schema support and FTP/HTTP data load functionality.
    Key Features
    • Multi-schema support
    • Iteration added to orchestration flows (table, list and fixed)
    • New S3 Put Component – load data from FTP/HTTP
    • New Python component


    Other Features
    • Manifest support in S3 loader added.


    Lots of other minor improvements

  • 21-10-2015 [1.16] SQS Integration, SQS Message, SNS Message and Load data from RDBMS
    Key Features
    • SQS Integration added
    • Added ability to nest orchestration jobs
    • New components added: SQS Message, SNS Message
    • Load data from RDBMS (Database Query component)


    Other Features
    • Manually or automatically set column encoding on Redshift tables
    • Cancel task feature added
    • New ‘Schema Copy’ component


    Dozens of other minor enhancements

  • 08-08-2015 [1.15] New scheduler. Load data directly from Amazon RDS. Improved UI.
    Key Features
    • New scheduler
    • RDS Query component added
    • UI enhancements (e.g. snap to grid)


    Other Features
    • Analyze/Vacuum/Truncate components added
    • New transformation components (Regex)


    Lots of other minor improvements and bug fixes.

  • 29-07-2015 [1.14] S3 load/unload. Improved performance and AWS integration.
    Key Features
    • New S3 Load/Unload components


    Other Features
    • Added ability to manually define AWS credentials after AMI Launch
    • Internal caching makes running the same jobs repeatedly much faster
    • New components can (i) update table rows, (ii) delete table rows, (iii) split a field on a delimiter


    72 other bug fixes and minor improvements