Threads and Execution Sequence

Clover is built around the concepts of scalability, parallelism and multi-threading. In most cases, you do not need to be aware of how it handles them. However, there are circumstances where a good knowledge of the execution model gives you a degree of control that can be essential.

 

Take a look at the graph below. You will see that there are two separate flows. At the top, an Orders file is read, sorted and then processed further. Below, a second flow (Customers) reads its input, maps fields and writes the results to another file.

 

Phases

 

Take a look at the top left corner of each component. You will notice a number there: this is the component's Phase number (our example uses 0, 1 and 2). Clover uses these numbers as follows:

 

Components with the lowest Phase number start executing first.
Each component within a Phase executes in its own thread, so components with the same Phase number run in parallel.
The next Phase only starts once each and every component with a lower Phase number has finished executing. This means that Phases execute sequentially.
If a Phase terminates with an error, later Phases will not run at all.
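
The rules above can be sketched in a few lines of Python. This is only a toy model for illustration, not Clover's actual engine: the run_graph function and the component names are hypothetical, and each "component" is reduced to a plain function that runs to completion.

```python
import threading
from collections import defaultdict


def run_graph(components):
    """Run (name, phase, action) tuples phase by phase.

    Returns (finished_names, failed_phase); failed_phase is None on success.
    Hypothetical helper for illustration only.
    """
    by_phase = defaultdict(list)
    for name, phase, action in components:
        by_phase[phase].append((name, action))

    finished = []
    lock = threading.Lock()

    for phase in sorted(by_phase):  # the lowest Phase number goes first
        errors = []

        def worker(name, action):
            try:
                action()
                with lock:
                    finished.append(name)
            except Exception as exc:
                with lock:
                    errors.append((name, exc))

        # One thread per component, so a Phase's components run in parallel.
        threads = [
            threading.Thread(target=worker, args=(name, action))
            for name, action in by_phase[phase]
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()  # the next Phase waits for every component here

        if errors:
            return finished, phase  # later Phases do not run at all
    return finished, None


def failing_copy():
    raise RuntimeError("simulated failure in Phase 1")


# Assumed layout, loosely modelled on the example graph.
example = [
    ("ReadOrders", 0, lambda: None),
    ("Sort", 0, lambda: None),
    ("ReadCustomers", 0, lambda: None),
    ("Copy", 1, failing_copy),
    ("Merge", 2, lambda: None),
]
finished, failed_phase = run_graph(example)
```

In this run the three Phase 0 components complete, the simulated error stops Phase 1, and the Merge in Phase 2 is never started.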

 

So, in the example graph:

 

The Orders and Customers flows start processing in parallel.
The Customers flow runs through to completion as fast as it can.
The Orders are read and fully sorted. The Copy does not start running until the Sort has finished sorting all records, and the Merge does not start until the Copy has completed.

 

Here are some other things to note:

 

If you change a component's Phase to a higher number, all components downstream of that component will also have their Phase numbers changed to the same number.
When we say that components run in parallel, what this means in practice is that as soon as a component has processed a record, the next component downstream starts processing it while the earlier component gets on with the next record. You can see how this pipelining allows genuine scalability.
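
The record-by-record hand-off described above can be illustrated with a small Python sketch. It assumes a simplified model in which each component is a thread and each connection between components is a small bounded queue; reader, stage and run_pipeline are hypothetical names, not part of Clover, and error handling is left out.

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of the record stream


def reader(records, outbox):
    """Source component: pushes records downstream one at a time."""
    for record in records:
        outbox.put(record)
    outbox.put(DONE)


def stage(transform, inbox, outbox):
    """Processing component: handles each record as soon as it arrives."""
    while True:
        record = inbox.get()
        if record is DONE:
            outbox.put(DONE)
            return
        outbox.put(transform(record))


def run_pipeline(records, transforms):
    # One bounded queue per connection, so a fast component cannot race
    # arbitrarily far ahead of a slow one.
    queues = [queue.Queue(maxsize=4) for _ in range(len(transforms) + 1)]
    threads = [threading.Thread(target=reader, args=(records, queues[0]))]
    threads += [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(transforms)
    ]
    for t in threads:
        t.start()

    # The caller plays the role of the final writer component.
    results = []
    while True:
        record = queues[-1].get()
        if record is DONE:
            break
        results.append(record)
    for t in threads:
        t.join()
    return results


doubled = run_pipeline(range(5), [lambda r: r + 1, lambda r: r * 2])
```

While the second stage is doubling one record, the first stage is already incrementing the next one and the reader is fetching the one after that. Because every stage is a single thread reading from a first-in, first-out queue, records leave the pipeline in the order they entered.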

 

By understanding these concepts, you will be able to recognize when your data processing needs can take advantage of this capability.