How can I extract a list of used and unused measures and columns from Power BI project folders using Python

Question

How can I extract a list of used and unused measures and columns from Power BI project folders using Python?

I need to analyze a Power BI project folder to identify which measures and columns are actively used in reports and which ones are unused. This is to optimize performance by removing unnecessary fields. Are there Python libraries or scripts that can parse Power BI metadata to extract this information? What techniques or best practices can help automate this process effectively?

score 0 · Answer 1 · 3 hours

The extracted measures and columns, which represent both the consumed and unused measures and columns, can be extracted from a Power BI project folder by analyzing the metadata that exists in the .pbit or .pbix file using Python.

Best Approach: Use Python to Parse Power BI Metadata
1. Use pbi-tools (Best recommended for pass-through to Extract Metadata)

PBI-tools export Power BI metadata into JSON files.

Run:

pbi-tools extract "path/to/your.pbix"

This generates a JSON model containing tables, columns, and measures.

2. Use Python to Analyze Metadata

Extract tables, columns, and measures from the JSON files.
Compare them against visuals in report pages.
Example script using json:

import json

with open("Model.bim", "r") as file:
    model = json.load(file)

# Extract all columns and measures
all_columns = {table["name"]: [col["name"] for col in table.get("columns", [])] for table in model["tables"]}
all_measures = {table["name"]: [measure["name"] for measure in table.get("measures", [])] for table in model["tables"]}

print("Columns:", all_columns)
print("Measures:", all_measures)

3. Identify Used vs. Unused Elements

Parse report visuals (Report\Layout.json) and check which columns/measures are referenced.
Use set operations in Python to find unused fields.

Automation Tip: