Python Data Generator

The Python Data Generator transform lets you generate data by writing scripts using the Python programming language. You can use the Python Data Generator transform to generate data for prototyping or developing proof-of-concept dashboards.

To learn more about the Python language, see python.org.

1. Setup

Before you can use the Python Data Generator transform in Dundas BI, the Python programming environment must be installed on the server. 

See Install Python for more details.

2. Input

The Python Data Generator transform does not have any inputs. It generates output by running Python scripts.

3. Add the transform

When creating a new data cube, you can add the Python Data Generator transform to an empty canvas from the toolbar.

Add the Python Data Generator transform from the toolbar

The Python Data Generator transform is added to the data cube and automatically connected to a Process Result transform.

The Python Data Generator transform is added

You can also add the Python Data Generator transform from the toolbar to an existing data cube process. A typical example is to connect the Python Data Generator transform to a Union transform, which merges data from multiple inputs.

Merging Python Data Generator output with other data using a Union transform

4. Configure the transform

Double-click the Python Data Generator transform or select the Configure option from its right-click menu.

In the configuration dialog for the transform, the key task is to enter a Python script that returns a result.

For example, a simple script for generating a column of numbers from 1 to 5 looks like this:

return (1,2,3,4,5)

Configure the transform by entering a Python script that sets the output variable

Important
Dundas BI cannot use Python outputs such as print or draw. The result must be in a format that can be represented as a table.
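
As a rough illustration of the note above, a script that would normally print its values can collect them and return them instead. The function name squares_column is hypothetical and exists only so that the sketch runs outside Dundas BI. The assumption is that, in the transform itself, the body of the function would be entered directly as the script, ending with the return statement.

```python
def squares_column(n):
    """Return the squares of 1..n as a tuple, that is, one column of values."""
    values = []
    for i in range(1, n + 1):
        # Instead of print(i * i), keep the value so it can be returned.
        values.append(i * i)
    return tuple(values)


column = squares_column(5)
```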

5. Output

The output of the Python Data Generator depends on the script it is configured with. It can be a single value, a column of values, or multiple columns.

In the case of the simple script for generating numbers from 1 to 5, you can see an output column named 'f0' in the Data Preview tab.

Data Preview for Python Data Generator output
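
For prototyping, a multi-column result could be sketched as follows. This assumes the transform treats a list of equal-length tuples as one column per tuple, which is the shape returned by the Twitter script later in this article. The function name sample_sales, the column contents, and the seed value are all made up for illustration, and the function wrapper is there only so that the sketch runs outside Dundas BI.

```python
import random


def sample_sales(row_count, seed=42):
    """Build three columns of made-up sales data for prototyping."""
    rng = random.Random(seed)  # fixed seed so each run gives the same data
    regions = ["North", "South", "East", "West"]

    region_col = tuple(rng.choice(regions) for _ in range(row_count))
    units_col = tuple(rng.randint(1, 100) for _ in range(row_count))
    price_col = tuple(round(rng.uniform(5.0, 50.0), 2) for _ in range(row_count))

    # One tuple per column, all of the same length.
    return [region_col, units_col, price_col]


columns = sample_sales(10)
```

Using a fixed seed keeps the generated values stable between runs, which can help when a dashboard is built against the sample data.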

6. Generate data from Twitter

One example of generating data with a Python script is to use the Twitter API to connect to your Twitter account. In effect, this example uses the data cube to create a Twitter data connector.

6.1. Setup

This particular example relies on the tweepy package in Python and an application on the Twitter developer's site:

  • To install the tweepy package, open a command prompt as an administrator, navigate to the Python scripts folder (for example, C:\Program Files\Python36\Scripts), and type:
    pip install tweepy
  • You can set up a new Twitter developer application on the Twitter developer's site. After your application is created, you will need to create an access token and get the following information from the Keys and Access Tokens tab:
    • Consumer Key (API Key)
    • Consumer Secret (API Secret)
    • Access Token
    • Access Token Secret

6.2. Create the script

To generate the Twitter data, configure the Python Data Generator transform and add the following script:

import tweepy

# Replace the placeholders with the values from the Keys and Access Tokens tab.
auth = tweepy.OAuthHandler("consumer key", "consumer secret")
auth.set_access_token("access token", "access token secret")
client = tweepy.API(auth)

# Get the accounts you follow. Newer tweepy releases (4.0 and later) name
# this method get_friends() instead of friends().
friends = client.friends()

# Build one row per friend.
myList = []
for m in friends:
    myList.append([
        m.screen_name,
        m.name,
        m.created_at,
        m.friends_count,
        m.listed_count,
        m.followers_count,
        m.favourites_count,
    ])

# Transpose the rows so that each field becomes one output column.
return list(zip(*myList))

This will create a table with seven columns based on your friend data on Twitter.

Twitter Data Generation output
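
The final line of the script, list(zip(*myList)), turns the per-friend rows into per-field columns. The sketch below shows that step on its own, without a Twitter connection. The Friend type and its sample values are made-up stand-ins for the user objects that tweepy returns, and only three of the seven fields are included to keep it short.

```python
from collections import namedtuple

# Hypothetical stand-in for the user objects that tweepy returns.
Friend = namedtuple("Friend", ["screen_name", "name", "followers_count"])

friends = [
    Friend("alice_dev", "Alice", 120),
    Friend("bob_data", "Bob", 45),
]

# One inner list per friend (row-oriented), as the loop in the script builds.
rows = [[f.screen_name, f.name, f.followers_count] for f in friends]

# zip(*rows) regroups the values by position, giving one tuple per field.
columns = list(zip(*rows))
```

With two friends and three fields, columns holds three tuples of two values each.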

6.3. Adjust the column names

Configure the transform again and click Edit output elements.

Edit each output element and provide a relevant column name.

Edit output elements

Result:

Twitter Data Generation result

7. Generate data from a JSON file

Another example of generating data with a Python script is to read a JSON file. In effect, this example uses the data cube to create a JSON data connector.

7.1. Setup

This example relies on four Python packages. To install them, open a command prompt as an administrator, navigate to the Python scripts folder (for example, C:\Program Files\Python36\Scripts), and type the following commands:

  • pip install numpy
  • pip install pandas
  • pip install jsonlib-python3
  • pip install requests

7.2. Create the script

To generate the JSON data, configure the Python Data Generator transform and add the following script:

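import json
import requests
# pandas 1.0 and later also expose this function as pandas.json_normalize.
from pandas.io.json import json_normalize

# Download the JSON file from the example URL.
url = "http://example.domain.com/data.json"
resp = requests.get(url)
# Parse the response text, then flatten it into a table.
data = json.loads(resp.text)
return json_normalize(data)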

This will create a table reflecting all of the data in the referenced JSON file, which is located at the example URL (http://example.domain.com/data.json).
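
To give a feel for what the flattening step does with nested objects, here is a simplified sketch that uses only the standard library. It is an approximation for illustration, not how pandas implements json_normalize. The flatten function and the sample JSON text are made up, and the dotted names follow the default separator that json_normalize uses for nested keys.

```python
import json


def flatten(record, prefix=""):
    """Flatten nested dictionaries into one level with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted prefix.
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


# Made-up JSON text standing in for the downloaded file.
text = (
    '[{"id": 1, "user": {"name": "Ann", "city": "Toronto"}},'
    ' {"id": 2, "user": {"name": "Raj", "city": "Ottawa"}}]'
)

rows = [flatten(item) for item in json.loads(text)]
names = list(rows[0])
```

Each nested key becomes its own entry, so every record ends up as a flat set of name and value pairs that maps naturally onto table columns.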
