QlikView, by design, includes many new and innovative technologies such as the associative data model and highly effective data compression algorithms which make possible its state-of-the-art in-memory technology. QlikView allows us to load and keep all the data in memory for evaluation, analysis and reporting. If you’ve worked with QlikView you understand the value of this approach, but it sometimes comes with a price. Very large data sets can often take a long time to load bogging down the performance of your QlikView documents over time. This month we will take a look at a way to minimize the load time of very large data sets and increase your performance using incremental data loads.
What is an Increment Load?
Incremental load is term that describes loading only new or changed records from the database. It is a common task with databases and can greatly reduce the time needed to load data into your QlikView application. The bulk of the data needed will already be available within your application and only the new or changed data will be necessary to complete the picture. Incremental loads are possible through the use of .QVD files.
What is a QVD file?
A QVD file is a native QlikView file format optimized and compacted for speed when reading data from within a QlikView load script. Reading data from a QVD file can be 10-100 times faster than reading records directly from other data sources. A QVD file contains a single table of data from a QlikView application. While that may seem somewhat restricting remember that table can be the result of a concatenation or a join so the structure you create in the application can greatly increase its use. You can also include all calculations and manipulations in the script that creates you QVD file further increases load performance at run time.
How could you use a QVD file?
There are several uses for a QVD file and in many cases more than one will be applicable at the same time.
- Decreasing Load Time
- Decreasing Database Server Loads
- Joining Data from Different QlikView Applications
- Incremental Data Loading
Decreasing Load Time
By saving data to and loading large amounts of data from a QVD file you eliminate most of the time used during load by using an optimized and compressed data file. By scripting all of your concatenation, joining, calculations and data manipulations in the file used to create the QVD you will increase your performance even more.
Decreasing Database Server Loads
By isolating your large data volumes and loading them from QVD files you will reduce the processing on the database server at load time and dramatically increase the load time of your scripts as well. You only need to provide data since the last load of your QVD to your QlikView document during refresh. The fresher the data in your QVD the less data needed from the database server.
Joining Data from Different QlikView Applications
Once you’ve formatted and manipulated your data and get it working just the way you want, you can save that table to a QVD and use the same vetted structure in other QlikView documents. While it is true that you could copy and paste your load script into another QlikView document, by using a QVD file instead you have the added advantage of dramatically faster loading. As your scripts become more and more complex based on the business questions asked by the users you can standardized your scripts and maintain the logic in one place. This increases our ability to create a single version of the truth.
Incremental Data Loading
By adding business logic to the creation of you QVD files you can extend that logic to all of the QlikView Applications that use that data; to create a more dynamic loading scenario. Let’s say you schedule your QVD loads monthly, after the close of business for the previous month. You application now only needs to load data for the current period directly from the database and then load all previous periods from your QVD file.
Incremental Load Scenarios
The structure of your data, available evaluation fields and how you choose to store your data in QVD’s will determine your particular scenario but here are a few examples to get you started thinking.
Daily Incremental Reloads of a QVD
This scenario requires a date field in data that identifies the creation or update time of all records. We can retrieve the last modified/created date from the existing QVD, use that date to retrieve new records from the database and then concatenate the previously saved records from the QVD file to our current data and save the QVD file again.
- Load the latest (max) modified date from you previously saved QVD. If you have not yet created the QVD then use the current date.
- Select only the records from the database where the last modified date is between the last modified date you retrieved in step on and right now.
- Concatenate all the data from the QVD file where there is no match in new data table on the unique ID field. This allows QlikView to only add the records that do not exist and accounts for updated records as well as new records.
- Save the resulting data set by overwriting the QVD file with all of the records in the new data set.
This scenario will force QlikView into Fast mode instead of Super-Fast mode but will still be significantly faster than loading all data from the database. You may also need to extend this logic to your production QlikView Application if it needs to retrieve data since the last daily load.
Daily/Monthly/Yearly Stacked QVDs
At close of each Day, Month or Year (Month and/or Year being the most popular) you will create a QVD containing that periods data. Each QVD will be named so that the data in them is clearly identified by the name (I.E.: 3-1-2010.qvd or 3-2010.qvd or 2010.qvd). You may wish to use a combination approach such as saving data from previous year in a yearly QVD and data within the current year in a monthly QVD. This will give you the option of loading only the appropriate data into your QlikView Applications. Depending on the target audience for your application you may need different combinations of data. One application might require all available data while other may only require a specific number of years past. A more analytic application may only require yearly and/or monthly data while others will require up-to-the-minute data. This approach will give you flexibility for all of those scenarios.
Another advantage of this approach is that the daily, monthly or yearly data can be loaded in Super-Fast mode since no date evaluation is needed. Only the data needed to supplement the application since the last saved QVD file, if any, will be read directly from the database.