All data flow components can be categorized as either synchronous or asynchronous.
They can be further divided into
- non-blocking (synchronous)
- semi-blocking (asynchronous)
- fully-blocking (asynchronous).
Synchronous
- The number of rows input to a synchronous component is equal to the number of rows it outputs
- The output uses the same buffer as the input
- All destination components are synchronous
- Non-blocking transformations are synchronous
Asynchronous
- The number of rows output from an asynchronous component may be less or more than the number of rows input to it (for example, when rows are filtered or combined inside the component)
- Fully blocking components store all the rows in the memory buffer before they begin modifying the input data into the required output format.
- They block the data flow in the pipeline until all the input rows are read into memory, e.g. the Sort transformation, where the component has to process the complete set of rows in a single operation.
- The output uses a different buffer from the input
- All source components are asynchronous. They create two buffers: one for the success output and one for the error output.
- Semi-blocking and fully blocking transformations are asynchronous in nature
Synchronous components reuse buffers and are therefore generally faster than asynchronous components.
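The buffer behaviour described above can be illustrated with a small Python sketch. This is only an analogy, not SSIS code: the function names (`synchronous_transform`, `asynchronous_transform`) and the sample rows are hypothetical and chosen for illustration. The synchronous version edits rows in place and passes the same buffer on, while the asynchronous version copies the rows into a new buffer whose row count can differ.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Buffer = List[Row]


def synchronous_transform(buffer: Buffer, fn: Callable[[Row], None]) -> Buffer:
    """Modify each row in place and hand the same buffer downstream."""
    for row in buffer:
        fn(row)
    return buffer  # same object: no copy, row count unchanged


def asynchronous_transform(buffer: Buffer, fn: Callable[[Buffer], Buffer]) -> Buffer:
    """Copy the rows into a new buffer; the output row count may differ."""
    return fn([dict(row) for row in buffer])


def add_full_name(row: Row) -> None:
    # Derived Column style change: adds a column to the existing row.
    row["full_name"] = f"{row['first']} {row['last']}"


def count_by_last(rows: Buffer) -> Buffer:
    # Aggregate style change: collapses many rows into one row per group.
    counts: Dict[object, int] = {}
    for row in rows:
        counts[row["last"]] = counts.get(row["last"], 0) + 1
    return [{"last": key, "count": value} for key, value in counts.items()]


source = [
    {"first": "Ada", "last": "Lovelace"},
    {"first": "Alan", "last": "Turing"},
    {"first": "Dermot", "last": "Turing"},
]

sync_out = synchronous_transform(source, add_full_name)
async_out = asynchronous_transform(source, count_by_last)

print(sync_out is source, len(sync_out))    # same buffer, same row count
print(async_out is source, len(async_out))  # new buffer, fewer rows
```

The copy made in the asynchronous case is the extra work that synchronous components avoid, which is one way to picture why they tend to be faster.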
Non-Blocking transformations
- Audit
- Character Map
- Conditional Split
- Copy Column
- Data Conversion
- Derived Column
- Lookup
- Multicast
- Percentage Sampling
- Row Count
- Script Component
- Export Column
- Import Column
- Slowly Changing Dimension
- OLE DB Command
Semi-blocking transformations
- Data Mining Query
- Merge
- Merge Join
- Pivot
- Unpivot
- Term Lookup
- Union All
Blocking transformations
- Aggregate
- Fuzzy Grouping
- Fuzzy Lookup
- Row Sampling
- Sort
- Term Extraction
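To see how the three categories differ in when they release rows, here is a rough Python sketch using generators. It is an analogy only and does not use any SSIS API: `non_blocking`, `semi_blocking`, `blocking` and `reads_before_first_output` are hypothetical names, and the semi-blocking case assumes two already sorted inputs, similar in spirit to Merge. The sketch counts how many input rows are read before the first output row appears.

```python
import heapq
from typing import Callable, Iterable, Iterator, List

log: List[str] = []  # records every row read from a source


def source(name: str, values: Iterable[int]) -> Iterator[int]:
    for value in values:
        log.append(f"{name} read {value}")
        yield value


def non_blocking(rows: Iterator[int]) -> Iterator[int]:
    # Emits each row as soon as it arrives (like Derived Column).
    for value in rows:
        yield value * 10


def semi_blocking(left: Iterator[int], right: Iterator[int]) -> Iterator[int]:
    # Holds back only as many rows as it needs to pick the next one
    # from two sorted inputs (like Merge).
    yield from heapq.merge(left, right)


def blocking(rows: Iterator[int]) -> Iterator[int]:
    # Must read every input row before emitting anything (like Sort).
    yield from sorted(rows)


def reads_before_first_output(make_pipeline: Callable[[], Iterator[int]]) -> int:
    log.clear()
    next(make_pipeline())  # pull exactly one output row
    return len(log)


non_blocking_reads = reads_before_first_output(
    lambda: non_blocking(source("in", [3, 1, 2]))
)
semi_blocking_reads = reads_before_first_output(
    lambda: semi_blocking(source("left", [1, 4, 6]), source("right", [2, 3, 5]))
)
blocking_reads = reads_before_first_output(
    lambda: blocking(source("in", [3, 1, 2]))
)

print("non-blocking:", non_blocking_reads, "of 3 rows read")
print("semi-blocking:", semi_blocking_reads, "of 6 rows read")
print("blocking:", blocking_reads, "of 3 rows read")
```

The non-blocking stage needs one input row to produce one output row, the blocking stage needs all of them, and the semi-blocking stage sits in between.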