In order to streamline transformations in Power Query and tackle the issue of long refresh cycles, particularly for large volumes of data, there are some important and useful tips provided below that you may use:
1. Eliminate Step Redundancies
- Merge Transformations: Rather than performing a series of transformations one after another (for example, filtering, sorting, or merging), it’s better to perform such steps in a single transformation whenever appropriate. For instance, when row filtering is conducted, followed by column removal, both activities can be accomplished in a single operation with the use of a custom column or conditional formats.
Get Rid of Unwanted Queries: Assess your queries and delete irrelevant ones. Superfluous secondary queries and unwarranted intermediate tables, for example, can increase overhead during refresh.
2. Optimize the Query Folding
- Utilize Query Folding: Aim to construct your transformations so that they can be query-folded, or, in other words, the transformations are done at the data source level and not in Power BI. This greatly minimizes the amount of data imported and processed in Power Query even though data transformation has already taken place in Power BI. Whether or not query folding is effective can be seen in the applied steps. A step with ‘View Native Query’ implies that query folding is effective.
Limit the Number of Operations that Cause Folding to Break: Some operations and tasks, particularly those that add indexes, certain functions, or source data from different locations, can also break query folding. Try to refrain from using most of the Transformation steps until the last stages of the query.
3. Staging Tables Wisely
- Intermediate Staging Tables Completion: For operations that require complex modification, it is advisable to incorporate staging tables that capture intermediate results. In this context, intensive transformations can be carried out once, with the result stored in a staging table and then retrieved for other queries. This prevents the need to perform the same Transformation numerous times.
Limit Data Volume Early, Focus on the Objective: It is best to remove redundant data at the earliest possible stage, even if the operation does not concern overturning the entire dataset. The model reaches its refresh completion significantly quicker, depending on the total number of items loaded. For instance, if only a segment of current records is desired, data with actual values can be considered applicable to the rod and date restrictions implemented at the peak phase.
4. Optimize Data Types
- Appropriate Data Types: Make sure you set the exact value for data types. Sometimes, wrong data types can result in slow transformations. If necessary, changing the data types may be costly in terms of resources since they may need to be changed concurrently. Therefore, it is prudent to set data types very early within the transformation chain.
5. Use Parameters and Functions
- Use Parameters For Filtering Without Changing Queries: There are many advantages associated with using parameters, including their ability to allow filters on data without the need to change queries. This becomes necessary if you want to load a different volume of data each time.
Compose Functions That Can Be Used a Number of Times: If you have certain processes that you have done several times, you will want to do them in a function. This is because it will enable you to maintain order in your coding and allow you to easily make changes in future work without the need for reformats.
Implementing these measures can help reduce unnecessary transformation activities in Power Query, thus promoting an effective dip in the time taken to refresh the data in Power BI. One or the other amelioration assists in making the process more fluid, which, in the end, one or the other gives better results and usability to the user.