Given a sampling rate, this transform reads all of the data from the previous transform and generates a set of random indexes. The number of indexes is the rate multiplied by the total record count.
It then outputs the records at those indexes.
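The behavior described above can be sketched in Python. This is only an illustration, not the product's implementation: the function name `percentage_sample`, the rounding rule, the use of distinct indexes, and emitting records in input order are all assumptions.

```python
import random


def percentage_sample(records, percentage, seed=0):
    """Return about `percentage` percent of `records`, chosen at random.

    Hypothetical helper for illustration; not part of the product.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")

    # Assumption: a seed of 0 means "let the generator seed itself"
    # (system entropy or the current time), so each run differs.
    rng = random.Random(None if seed == 0 else seed)

    # Number of indexes = rate * total record count.
    # Rounding to the nearest integer is an assumption.
    sample_size = round(len(records) * percentage / 100)

    # Draw distinct random indexes, then emit the records in input order.
    indexes = sorted(rng.sample(range(len(records)), sample_size))
    return [records[i] for i in indexes]


if __name__ == "__main__":
    rows = list(range(200))
    sample = percentage_sample(rows, 10, seed=42)
    print(len(sample))  # 20 records, i.e. 10% of 200
```

Because the sample size depends on the total record count, the whole input has to be read before any record can be emitted, which matches the description above.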
The Percentage Sampling transform requires 1 input transform that has at least 1 column.
The input could be a SQL Select transform, or the result of another transform. For example, the input data is:
2. Add the Transform
Steps to add the transform:
- Select the connector link.
- Select the transform from the menu.
- To edit or configure the transform, select the newly added transform and click the Configure menu.
Steps to configure the Percentage Sampling transform:
- Sampling Percentage - the percentage of the input records that you want included in the output.
- Random Seed - the number used to initialize the pseudorandom number generator.
  - The value must be from 0 to 2,147,483,647.
  - If the value is 0, the random number generator is seeded from the current time, so the random indexes differ on every run. Any other value produces the same indexes each time for the same input.
- Select the columns to be included in the output.
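As a rough illustration of the Random Seed and column settings, the sketch below uses Python's standard `random` module. The helper names `make_rng` and `select_columns` are hypothetical, and treating seed 0 as "self-seeded" mirrors the description above rather than the product's actual code.

```python
import random

MAX_SEED = 2_147_483_647  # upper bound of the Random Seed setting (2**31 - 1)


def make_rng(seed):
    """Build a generator from a Random Seed value (hypothetical helper)."""
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be from 0 to {MAX_SEED}")
    # Seed 0: let Python seed the generator itself, so runs differ.
    # Any other seed: the same sequence of indexes on every run.
    return random.Random(None if seed == 0 else seed)


def select_columns(rows, columns):
    """Keep only the chosen columns of each row (rows are dicts here)."""
    return [{name: row[name] for name in columns} for row in rows]


if __name__ == "__main__":
    first = make_rng(7).sample(range(100), 5)
    second = make_rng(7).sample(range(100), 5)
    print(first == second)  # True: a fixed non-zero seed is repeatable

    rows = [{"id": 1, "name": "a", "city": "x"}]
    print(select_columns(rows, ["id", "city"]))
```

A fixed non-zero seed is useful when a sample has to be reproduced later, for example when comparing results across runs.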
The figure below illustrates the output from the Percentage Sampling transform.