Python Data Generator
The Python Data Generator transform lets you generate data by writing scripts using the Python programming language. You can use the Python Data Generator transform to generate data for prototyping or developing proof-of-concept dashboards.
To learn more about the Python language, see python.org.
Before you can use the Python Data Generator transform in Dundas BI, the Python programming environment must be installed on the server.
See Install Python for more details.
The Python Data Generator transform does not have any inputs. It generates output by running Python scripts.
3. Add the transform
When creating a new data cube, you can add the Python Data Generator transform to an empty canvas from the toolbar.
The Python Data Generator transform is added to the data cube and connected to a Process Result transform automatically.
You can also add the Python Data Generator transform from the toolbar to an existing data cube process. A typical example is to connect the Python Data Generator transform to a Union transform, which merges data from multiple inputs.
4. Configure the transform
Double-click the Python Data Generator transform or select the Configure option from its right-click menu.
In the configuration dialog for the transform, the key task is to enter a Python script that returns a result.
For example, a simple script for generating a column of numbers from 1 to 5 looks like this:
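The following is a minimal sketch of such a script, assuming the transform treats a returned Python list as a single column of values. The function name generate_numbers is hypothetical and is there only so the sketch can run outside Dundas BI; in the configuration dialog you would enter just the return line.

```python
def generate_numbers():
    # Hypothetical wrapper: in the transform, only the line below
    # would be entered as the script.
    return [1, 2, 3, 4, 5]


# Outside Dundas BI, call the wrapper to inspect the values.
print(generate_numbers())
```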
The output of the Python Data Generator depends on the script it is configured with. It can be a single value, a column of values, or multiple columns.
In the case of the simple script for generating numbers from 1 to 5, you can see an output column named 'f0' in the Data Preview tab.
6. Generate data from Twitter
One example of a data-generating script uses the Twitter API to connect to your Twitter account. In effect, this example uses the data cube as a Twitter data connector.
This particular example relies on the tweepy package in Python and an application on the Twitter developer's site:
- To install the tweepy package, open a command prompt as an administrator, navigate to the Python Scripts folder (for example, C:\Program Files\Python36\Scripts), and type:
pip install tweepy
- You can set up a new Twitter developer application on the Twitter developer site. After your application is created, create an access token and get the following information from the Keys and Access Tokens tab:
- Consumer Key (API Key)
- Consumer Secret (API Secret)
- Access Token
- Access Token Secret
6.2. Create the script
To generate the Twitter data, configure the Python Data Generator transform and add the following script:
import tweepy

# Authenticate with the values from the Keys and Access Tokens tab.
auth = tweepy.OAuthHandler("key", "secret")
auth.set_access_token("token", "secret")
client = tweepy.API(auth)

# Note: tweepy 4.0 and later rename this method to get_friends().
friends = client.friends()

# Build one row of seven values per friend.
myList = []
for m in friends:
    myList.append([
        m.screen_name,
        m.name,
        m.created_at,
        m.friends_count,
        m.listed_count,
        m.followers_count,
        m.favourites_count,
    ])

# Transpose the rows into seven columns for the transform output.
return list(zip(*myList))
This will create a table with seven columns based on your friend data on Twitter.
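The last line of the script, list(zip(*myList)), turns a list of rows into a list of columns. The sketch below shows that step in isolation with made-up sample rows (the names and numbers are illustrative only, not real Twitter data), so it can be run without tweepy or a Twitter account.

```python
# Hypothetical sample rows: screen name, display name, follower count.
rows = [
    ["alice", "Alice A.", 120],
    ["bob", "Bob B.", 45],
]

# zip(*rows) pairs up the first items of every row, then the second
# items, and so on, which transposes rows into columns.
columns = list(zip(*rows))

for column in columns:
    print(column)
```

Note that an empty list of rows transposes to an empty list, so a script written this way would return no columns at all if the account has no friends.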
6.3. Adjust the column names
Configure the transform again and click Edit output elements.
Edit each output element and provide a relevant column name.
7. Generate data from a JSON file
Another example of a data-generating script reads a JSON file. In effect, this example uses the data cube as a JSON data connector.
This example relies on four packages in Python. To install the packages, open a command prompt as an administrator, navigate to the Python Scripts folder (for example, C:\Program Files\Python36\Scripts), and type the following commands:
- pip install numpy
- pip install pandas
- pip install jsonlib-python3
- pip install requests
7.2. Create the script
To generate the JSON data, configure the Python Data Generator transform and add the following script:
import json
import requests
# Newer pandas versions expose this function as pandas.json_normalize;
# the import below matches older releases.
from pandas.io.json import json_normalize

# Download the JSON file and parse it.
url = "http://example.domain.com/data.json"
resp = requests.get(url)
data = json.loads(resp.text)

# Flatten the JSON records into a table.
return json_normalize(data)
This will create a table reflecting all of the data in the referenced JSON file, which is located at the example URL (http://example.domain.com/data.json).
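To illustrate what flattening nested JSON into columns means, the sketch below is a rough, standard-library-only approximation of how json_normalize handles nested objects: nested keys are joined with a dot to form column names. The flatten helper and the sample data are hypothetical and are not part of pandas or Dundas BI; the real function handles more cases, such as lists of records.

```python
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dictionaries into one level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their keys.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical JSON content standing in for the downloaded file.
sample = """
[
  {"id": 1, "user": {"name": "Ann", "city": "Oslo"}},
  {"id": 2, "user": {"name": "Bo", "city": "Rome"}}
]
"""

rows = [flatten(record) for record in json.loads(sample)]
for row in rows:
    print(row)
```

Each flattened dictionary corresponds to one table row, and its keys (id, user.name, user.city) correspond to the column names you would expect in the output.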