Here are the best practices for identifying and removing inefficient calculated columns in a Power BI dataset:
1. Identify inefficient calculated columns
Use Performance Analyzer: Open Performance Analyzer in Power BI Desktop, record a page refresh, and check whether the slowest visuals depend on calculated columns.
Check Storage Mode: In DirectQuery mode, a calculated column's expression is translated into the SQL sent to the source database and evaluated at query time, which can slow down every visual that uses the column.
Look for High Cardinality Columns: Columns with many unique values, such as Customer Email, compress poorly, so they have a large memory footprint and slow down queries.
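As a rough way to spot high-cardinality columns before they reach the model, the sketch below counts distinct values per column in a sample of rows exported from a table. It is only an illustration: the function names, the sample data, and the 0.9 ratio threshold are assumptions, not Power BI features or recommended settings.

```python
from collections import defaultdict


def column_cardinality(rows):
    """Return {column: (distinct_count, distinct_ratio)} for a list of dict rows."""
    distinct = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            distinct[column].add(value)
    total = len(rows)
    return {
        column: (len(values), len(values) / total if total else 0.0)
        for column, values in distinct.items()
    }


def high_cardinality_columns(rows, ratio_threshold=0.9):
    """List columns whose distinct/total ratio meets the (assumed) threshold."""
    stats = column_cardinality(rows)
    return sorted(c for c, (_, ratio) in stats.items() if ratio >= ratio_threshold)


# Hypothetical sample rows, e.g. exported from a customer table.
sample_rows = [
    {"CustomerEmail": "a@example.com", "Country": "US", "Segment": "Retail"},
    {"CustomerEmail": "b@example.com", "Country": "US", "Segment": "Retail"},
    {"CustomerEmail": "c@example.com", "Country": "DE", "Segment": "Online"},
    {"CustomerEmail": "d@example.com", "Country": "DE", "Segment": "Retail"},
]

flagged = high_cardinality_columns(sample_rows)
print(flagged)
```

Columns flagged this way are candidates for review, not automatic removal. A key column used in a relationship, for instance, is high-cardinality by design.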
2. Use Better Alternatives
Use a Measure Instead: If the calculated column is only used in aggregations (e.g., Sales Amount = Quantity * Price), create a DAX measure instead, for example one that uses SUMX to multiply Quantity by Price row by row at query time, so the result is not stored in the model.
Pre-Calculate in the Data Source: Wherever possible, do the computation in SQL or Power Query rather than in DAX. The column then arrives with the rest of the data and is compressed like any other imported column, and the model no longer has to evaluate the DAX expression during refresh.
Use Relationships Instead of Conditional Columns: Rather than encoding a mapping with IF or SWITCH in a calculated column, put the mapping in a lookup table and define a relationship between the lookup table and the table that needs it.
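To illustrate pre-calculating in the data source, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the real source. The table name Sales, the view name SalesForModel, and the sample values are all hypothetical. The point is only that Quantity * Price is computed in SQL, so the model imports a ready-made SalesAmount column.

```python
import sqlite3

# In-memory database standing in for the real source system (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sales (OrderId INTEGER PRIMARY KEY, Quantity INTEGER, Price REAL);
    INSERT INTO Sales (OrderId, Quantity, Price)
    VALUES (1, 2, 10.0), (2, 1, 25.5), (3, 4, 5.0);

    -- The view does the row-level arithmetic once, in the source.
    CREATE VIEW SalesForModel AS
    SELECT OrderId, Quantity, Price, Quantity * Price AS SalesAmount
    FROM Sales;
    """
)

# A model would import from the view instead of adding a calculated column.
rows = conn.execute(
    "SELECT OrderId, SalesAmount FROM SalesForModel ORDER BY OrderId"
).fetchall()
total = sum(amount for _, amount in rows)
conn.close()

print(rows)
print(total)
```

The same idea applies to a custom column in Power Query. In either case the calculation happens before the data reaches the model.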
3. Remove Unneeded Calculated Columns
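Check for Unused Columns: Remove any calculated column that is not used in any report. Before deleting, confirm that it is also not referenced by a measure, a relationship, or another calculated column.
Convert Text-Based Calculated Columns: Where a calculated column returns text, consider storing a numeric code instead and keeping the labels in a small lookup table, which can improve compression.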
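The numeric-encoding idea can be sketched as follows: each distinct text label gets an integer code, the large table keeps only the codes, and a small lookup table maps codes back to labels. The function name and sample values are hypothetical, and how much this helps compression depends on the actual model.

```python
def encode_with_lookup(labels):
    """Return (codes, lookup): one integer code per row plus a code-to-label table."""
    code_by_label = {}
    codes = []
    for label in labels:
        if label not in code_by_label:
            # Assign codes in order of first appearance, starting at 1.
            code_by_label[label] = len(code_by_label) + 1
        codes.append(code_by_label[label])
    lookup = [(code, label) for label, code in code_by_label.items()]
    return codes, lookup


# Hypothetical text column from a large fact table.
status_text = ["Shipped", "Pending", "Shipped", "Cancelled", "Pending"]

status_codes, status_lookup = encode_with_lookup(status_text)
print(status_codes)
print(status_lookup)
```

In a model, the codes would stay in the fact table and the lookup would become a small dimension table joined by a relationship.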