Welcome to the CloverDX Walk-through
In this guide we will learn how to build, deploy, run, and monitor a complete automated data processing pipeline using CloverDX. Note: this guide uses the terms “CloverETL” and “CloverDX” interchangeably; both refer to (different versions of) the same product.
This trail demonstrates:
- Reading a simple CSV file (Lesson 1)
- Filtering and removing unwanted records (Lesson 2)
- Loading records into a Microsoft SQL Server relational database (Lesson 3)
- Transforming different input data formats into a single consistent output format (Lesson 4)
- Detecting the arrival of new files to be processed (workflow, Lesson 5)
- Setting up additional workflow steps (archiving input files and maintaining a log file, Lesson 6)
Automated Data Pipeline
An automated data pipeline is about removing the laborious manual processes involved at every stage of data processing. Data integration is not only about connecting data sources with data targets; it is also about automating the process: triggering runs based on events, scheduling, monitoring, troubleshooting, and running workflows that handle errors, cleanup, file handling, and so on.
Reading Data From a File
First of all, let's get familiar with the CloverDX Designer environment and read a single file.
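For illustration only, the kind of delimited input this first lesson reads might look like the snippet below; the field names and values here are hypothetical, not the trail's actual sample data.

```
date;amount;account;type
01/03/2015;125.50;CZ-4455;debit
02/03/2015;80.00;CZ-4455;credit
```

In the Designer, a file like this is read by a flat-file reader component (UniversalDataReader in CloverETL, renamed FlatFileReader in recent CloverDX versions) configured with metadata describing the delimiter and field types.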
Filtering and Writing Data
As a second step, we will filter the input data and write it to a file.
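To give a flavour of what that looks like: the filtering step is driven by a boolean CTL2 expression in a Filter (ExtFilter) component. A minimal sketch, reusing the hypothetical fields above and assuming amount is typed as a decimal in the metadata:

```
// Keep debit transactions with a positive amount;
// records failing the test can be routed to a second (rejected) output port.
$in.0.type == "debit" && $in.0.amount > 0
```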
Cleaning Up Data
Here, we will unify different input formats, start reading multiple input files, and keep a log of the records we choose to ignore.
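Unification like this is typically done in a Reformat component with a CTL2 transform. Below is a minimal sketch, with hypothetical field names and an assumed input date pattern:

```
//#CTL2
// Hypothetical cleansing transform: trim stray whitespace,
// normalize the type field, and parse the string date into a date field.
function integer transform() {
    $out.0.account = trim($in.0.account);
    $out.0.type    = lowerCase($in.0.type);
    $out.0.date    = str2date($in.0.date, "dd/MM/yyyy"); // assumed input pattern
    $out.0.amount  = $in.0.amount;
    return ALL; // send the record to all connected output ports
}
```

Records that fail to parse are the ones we would divert and capture in the log of ignored records.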
Writing to a Relational Database
Then, we’ll write the transactions into a database, mapping field names and performing type conversions as necessary.
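The field-name mapping and type conversions can again be sketched in CTL2, for example in a Reformat placed just before the database writer. The target column names below are assumptions, and amount is assumed to still arrive as a string:

```
//#CTL2
// Hypothetical mapping to the target table's columns.
function integer transform() {
    $out.0.AccountNumber   = $in.0.account;
    $out.0.TransactionDate = $in.0.date;                // maps to a DATETIME column
    $out.0.Amount          = str2decimal($in.0.amount); // string -> DECIMAL
    return ALL;
}
```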
Deploying to CloverDX Server
Now, we’ll deploy our project into a production environment and schedule its execution, moving it from CloverDX Designer to CloverDX Server. This is the first step in automating our data pipeline.
Orchestration with Jobflows
Finally, we’ll fully automate the execution of the data pipeline in CloverDX Server. We will watch for the arrival of new input files, process them using our graph, back up the input files and log the entire process.