Data cube storage types

1. Overview

After you have designed your data cube, you can choose a storage type to potentially improve data retrieval performance and reduce the load on your database server. There are two storage types available from the data cube designer toolbar:

  • The warehouse storage type lets you store or cache data in the Dundas BI Warehouse database.
  • The in-memory storage type stores data in server memory for even greater performance gain potential.

You can build either type of storage on demand, or schedule the build to run periodically. Scheduling is particularly useful if you have a line-of-business database that is not permitted to be queried during the day. In this case, you can schedule the storage operation to run only at night, so that daytime data retrieval queries are served from the warehouse or in-memory storage instead.

To further increase data retrieval performance, both storage types offer the option of incremental updates. Incrementally built data cubes will retrieve only data that was added to the database since the last build, instead of retrieving all of the data every time the data cube is rebuilt.

Note
A data cube that contains public parameters or user-specific filters cannot be built as a warehouse or in-memory cube.

2. Data cube preparation

Building in-memory storage can require a lot of server memory. Therefore, we strongly recommend removing or hiding any elements (columns, hierarchies, measures, and measure aggregators) that you do not need from your data cube's Process Result transform.

For example, edit a measure from your data cube and check the list of Supported Aggregators. For measures, these are Sum, Count, Average, Minimum, Maximum, and Distinct Count; for hierarchies, they are Count and Distinct Count. By default, only Sum, Count, and Average are selected to conserve memory. You can adjust this property as needed.

Supported aggregators for a measure
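
To make the aggregator names concrete, here is a minimal, illustrative Python sketch (using pandas and made-up sample data; the column names are not from your cube) of what each supported aggregator computes over a measure grouped by a hierarchy:

    # Illustrative only: sample data, not taken from any Dundas BI cube.
    import pandas as pd

    sales = pd.DataFrame({
        "Region": ["East", "East", "West", "West", "West"],
        "OrderTotal": [100.0, 250.0, 75.0, 75.0, 300.0],
    })

    # Measure aggregators: Sum, Count, Average, Minimum, Maximum, Distinct Count.
    summary = sales.groupby("Region")["OrderTotal"].agg(
        Sum="sum",
        Count="count",
        Average="mean",
        Minimum="min",
        Maximum="max",
        DistinctCount="nunique",
    )
    print(summary)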

Double-click the Process Result transform to open its configuration dialog, which lets you uncheck output columns that you don't need. This will completely remove the column from the data cube output.

As an alternative, you can hide a measure or hierarchy from the output by clicking the grey minus icon at the far right. The measure or hierarchy will be listed under a Hidden section, and you can easily unhide it by clicking its plus icon. A column hidden this way is excluded from in-memory storage processing (which helps with performance), but it remains available for subsequent linking in hierarchy keys (for example, if you need to use the data cube later to define a hierarchy by using the column as part of a composite key).

Hiding a measure or hierarchy

3. Choosing the storage type

In the Data Cube Designer screen, go to the toolbar and click Storage Type. Then select the storage type you want from the dropdown.

Storage types

4. Warehouse storage

The warehouse storage type lets you store or cache data in the Dundas BI Warehouse database.

4.1. Check in your data cube

Once you've chosen the warehouse storage type, you must check in your data cube. Otherwise, the warehouse options will be unavailable.

After checking in, you will be asked to Start Warehouse Processing. Click OK to begin building warehouse storage, or click Cancel if you want to build it later as shown in the next section.

Similarly, you will be asked to Setup schedule. Click OK to open the Schedule Rule dialog, or click Cancel if you want to set it up later.

4.2. Build warehouse storage

Go to the toolbar and click Warehouse. Select Build Warehouse from the dropdown.

Build warehouse

You'll see a message indicating the build has started.

Go to the Dundas BI Log in the Administration screen to see related log entries. You can also monitor the status of a warehouse build from the Jobs screen.
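
If you prefer to start builds from a script rather than the toolbar, Dundas BI also provides APIs that can trigger a data cube build. The sketch below is only a rough illustration in Python using the requests library: the endpoint path, parameter names, and identifiers are placeholders (assumptions), not the documented API, so consult the Dundas BI API reference for the actual calls.

    # Hypothetical sketch only: the endpoint path and parameters below are
    # placeholders, not the documented Dundas BI REST API.
    import requests

    BASE_URL = "https://bi.example.com"                   # assumed server address
    SESSION_ID = "<session id from a prior logon call>"   # assumed authentication token
    DATA_CUBE_ID = "<data cube id>"                       # assumed item identifier

    # Ask the server to start building warehouse storage for one data cube.
    response = requests.post(
        f"{BASE_URL}/api/datacube/{DATA_CUBE_ID}/build",  # placeholder path
        params={"sessionId": SESSION_ID},
    )
    response.raise_for_status()
    print("Build request accepted; check the Jobs screen or the log for progress.")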

4.3. Schedule warehouse storage

Go to the toolbar and click Warehouse. Then click Schedule Warehouse from the dropdown.

In the Schedule Rule dialog, enter the specifics of the periodic schedule you want to set up. Then click Submit.

Schedule warehouse build

5. In-memory storage

The in-memory storage type stores data in server memory for even greater performance gain potential. The cube is also persisted to the Dundas BI Warehouse database so that it can be reloaded back into memory in the event of a server restart.

If you make changes to your data cube, you should rebuild the in-memory storage manually by following the steps below.

5.1. Check in your data cube

Once you've chosen the in-memory storage type, you must check in your data cube. Otherwise, the in-memory options will be unavailable.

After checking in, you will be asked to Start In-Memory Processing. Click OK to begin building in-memory storage, or click Cancel if you want to build it later as shown in the next section.

Similarly, you will be asked to Setup schedule. Click OK to open the Schedule Rule dialog, or click Cancel if you want to set it up later.

5.2. Build in-memory storage

Go to the toolbar and click In-Memory. Then click Build In-Memory from the dropdown.

Build in-memory

You'll see a message indicating the build has started. The time required to build the in-memory storage depends on the columns and default measures in your data cube.

Go to the Dundas BI Log in the Administration screen to see related log entries. You can also monitor the status of an in-memory build from the Jobs screen.

5.3. Schedule in-memory storage

Go to the toolbar and click In-Memory. Then click Schedule In-Memory from the dropdown.

In the Schedule Rule dialog, enter the specifics of the periodic schedule you want to set up. Then click Submit.

Schedule in-memory build

5.4. Disable memory management

After the in-memory storage exceeds a certain number of records, it begins compressing the records to maintain a stable use of RAM. When records are accessed, they are decompressed, then recompressed after some time.
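
The following Python sketch is a conceptual illustration of this trade-off only (it is not Dundas BI's internal implementation): compressing blocks of records shrinks the RAM footprint, but each access pays a decompression cost.

    # Conceptual illustration only; not Dundas BI internals.
    import pickle
    import zlib

    # A block of records, e.g. rows of (id, value) pairs.
    block = [(i, float(i) * 1.5) for i in range(100_000)]

    raw = pickle.dumps(block)
    compressed = zlib.compress(raw)
    print(f"raw: {len(raw):,} bytes, compressed: {len(compressed):,} bytes")

    # Reading a compressed block means decompressing it first (extra CPU work),
    # which is the cost avoided when memory management is disabled.
    restored = pickle.loads(zlib.decompress(compressed))
    assert restored == block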

You can increase the performance of the in-memory data cube by preventing it from recompressing records.

Important
This setting allows for unlimited use of RAM. Using it with an insufficient amount of RAM can result in memory paging, which would likely lower the performance rather than improve it.

Click the Configure option on the toolbar.

Configure the data cube

In the Configure Data Cube dialog, tick the checkbox under Disable Memory Management. Then click Submit.

Disable memory management

The new setting will take effect the next time the data cube is built.

Note
As an alternative, you can change the number of records after which compression starts by going to Admin > Setup > Config and editing the Record Block Size configuration setting.

5.5. Limitations

The following limitations apply to the in-memory storage type:

  • Binary data types are not supported.

6. Incremental data update

Incrementally built data cubes retrieve only the data that was added to the database since the last build. This can substantially improve data retrieval performance; however, it will not update or remove existing data.

It is important to be cautious when using incremental updates on a data cube where changes to existing data are possible. Deleted entries will still be present, entries that have been updated to a smaller value will retain their original value, and entries that have been updated to a larger value will appear as a new entry after the data cube has been updated.
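
As a concrete illustration of these caveats, here is a small, hypothetical Python sketch (the keys, values, and high-water-mark logic are illustrative assumptions, not Dundas BI's exact implementation) of an incremental build that only appends rows whose value is greater than the largest value seen at the last build:

    # Toy illustration of the caveats above; not Dundas BI's actual logic.
    source = [("A", 100), ("B", 200), ("C", 300)]      # (key, value) rows in the database
    cube = list(source)                                # the first (full) build copies everything
    high_water_mark = max(value for _, value in cube)  # 300

    # Existing data then changes in the source...
    source.remove(("A", 100))                          # A is deleted
    source[source.index(("B", 200))] = ("B", 150)      # B is updated to a smaller value
    source[source.index(("C", 300))] = ("C", 350)      # C is updated to a larger value

    # ...but the incremental build only appends rows above the high-water mark.
    cube.extend(row for row in source if row[1] > high_water_mark)

    print(cube)  # [('A', 100), ('B', 200), ('C', 300), ('C', 350)]
    # A is still present, B keeps its old value, and the updated C appears as a new entry.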

Note
Incremental data updates are not available for data structures that do not allow parameters, such as an Excel file.

You can enable incremental data updates on either of the data storage types, and then build the data cubes manually or via a schedule, as described above.

Click the Configure Settings option on the toolbar.

Configure the data cube

Select Perform Incremental Build, and then select all the input columns you want to be rebuilt.

Enable Incremental Build

7. Notes

7.1. Build Warehouse does not clear data result cache

After you build the warehouse, a dashboard may still show old data from before the warehouse was last built. This is because the data result cache is not cleared when the warehouse is built.

You can change or disable result caching for the entire application by changing the Result Cache Absolute Expiration configuration setting in administration.

Set Result Cache Absolute Expiration to customize data result caching
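
To illustrate why this happens, here is a conceptual Python sketch (not Dundas BI code; the expiration value and function names are made up) of a result cache with an absolute expiration: cached results keep being served until their timestamp expires, even if the warehouse was rebuilt in the meantime.

    # Conceptual sketch only; not Dundas BI code.
    import time

    CACHE_ABSOLUTE_EXPIRATION = 10 * 60   # seconds; illustrative value only
    _cache = {}                           # query -> (result, cached_at)

    def get_result(query, run_query):
        """Return a cached result if it has not expired, otherwise run the query."""
        now = time.time()
        if query in _cache:
            result, cached_at = _cache[query]
            if now - cached_at < CACHE_ABSOLUTE_EXPIRATION:
                return result             # possibly stale data from before a rebuild
        result = run_query(query)
        _cache[query] = (result, now)
        return result

    print(get_result("total sales", lambda q: 100))   # runs the query and caches 100
    print(get_result("total sales", lambda q: 999))   # still 100 until the entry expires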

7.2. What storage type is this data cube using?

If your data cube is checked in, you can see what type of storage it is using from the Explore window, the administration File Explorer, or the Properties References dialog for a referencing item. This saves you from having to edit the data cube just to see its storage type.

When a data cube uses a storage type, it appears with a different icon in the Explore window (or File Explorer). The data cube's tooltip also indicates the type of storage it uses.

Data cube icon and tooltip indicating warehouse storage

Data cube icon and tooltip indicating in-memory storage

7.3. Performance of building in-memory storage

The total time required to build in-memory storage for a particular data cube, and the amount of RAM required, can vary greatly depending on a range of factors. Estimating build times is therefore not as simple as saying that building X records takes Y minutes.

For example, memory consumption generally depends on:

  • How memory is allocated by the .NET framework
  • Number of hierarchies and measures used in your data cube process (especially with distinct count)
  • How sparse your fact table is (the end result of your data cube process)

Ideally, if the in-memory storage result fits entirely in RAM, the total build time on a typical 8-core CPU machine will generally be divided as follows:

  • 50% of time spent retrieving/loading data from your data sources
  • 50% of time spent building the in-memory data structures

With a higher number of processing cores, the time to build the in-memory data structures can be reduced due to parallelization.
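
As a rough, purely illustrative Python sketch of why extra cores mainly shrink the structure-building half (this is not Dundas BI's implementation), partitions of rows can be aggregated in parallel by worker processes and then merged:

    # Rough illustration only; not Dundas BI's implementation.
    from collections import Counter
    from multiprocessing import Pool

    def aggregate_partition(rows):
        """Sum the measure per hierarchy member for one partition of rows."""
        totals = Counter()
        for member, value in rows:
            totals[member] += value
        return totals

    if __name__ == "__main__":
        rows = [("East", 1), ("West", 2), ("East", 3), ("West", 4)] * 250_000
        partitions = [rows[i::4] for i in range(4)]    # one slice per worker

        with Pool(processes=4) as pool:                # more cores allow more workers
            partials = pool.map(aggregate_partition, partitions)

        merged = sum(partials, Counter())              # combine the partial results
        print(merged)                                  # Counter({'West': 1500000, 'East': 1000000})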

Here are some general guidelines for optimizing in-memory builds:

  • De-select columns you don't need in your data cube as early in the process as possible.
  • Minimize the number of hierarchies, measures, and aggregators in your data cube.
  • Minimize the complexity of your data cube process (e.g., the type and number of transforms used).
  • Use the Filter transform to restrict data to required ranges.

7.4. Metric set data retrieval is set to raw

When the Data Retrieval Format on a metric set is set to Raw and <Row Number> is placed on Rows, the raw data will be retrieved directly from the original database, bypassing the data cube's storage.

For more information on raw data mode, see the article Metric set analysis tools.
