4 Techniques to Create a Primary Key in Power BI (and 2 to Avoid)
Introduction
Let’s explore that through a relatable example: a club enrollment table with the following fields:
Student NameMentor NameClub NameJoined DateLeaving Date
Key Scenario Rules:
- A student can be in multiple clubs under the same mentor.
- A student can join multiple clubs at the same time.
- A student may rejoin the same club under the same mentor at a different point in time.
- If
Leaving Dateis blank, we assume the student is still active in that club.
With overlapping periods and recurring combinations, a reliable row-level key is critical. Let’s explore four ways to build one, starting with a warning about a common pitfall.
Method 1: Indexing in Power Query (Why It’s the Wrong Tool for Primary Keys)
What it is:
In Power Query (Accessible from report view, Home > Transform Data). Using “Add Column > Index Column > From 1” in Power Query to assign a unique-looking number to each row.
Why does it fail?
This index is assigned based on the order in which Power Query loads the data. It’s not stable if:
- The source file structure changes,
- New rows are inserted,
- Filters or sorts are applied differently.
Verdict:
Only use it for temporary row counts or visual ordering—never as a primary key.
Method 2: Rank-Based Unique Key (Why It’s the Wrong Tool for Primary Keys)
This method creates a numeric key as a calculated column by ranking rows across the selected column for uniqueness based on their content.
In table view, click Add Column and use the following DAX query.
DAXRank = RANKX( ALL(club_list),
club_list[Student Name] & club_list[Mentor Name] & club_list[Club Name] & FORMAT(club_list[Joined Date], "YYYYMMDD") & FORMAT(IF(ISBLANK(club_list[Leaving Date]), TODAY(), club_list[Leaving Date]), "YYYYMMDD"), , ASC , Dense )
This treats the combined string as a ranking expression, assigning the same value only if all inputs are identical.
Pros:
- Completely numeric and clean.
- Reflects business logic implicitly.
Cons:
- Slightly performance-intensive on large data.
- Not always intuitive.
- It will re-evaluate every time new rows are added.
- Existing rows can shift in rank if new rows have earlier join or leave dates.
That means the “primary key” is not truly stable. It’s dependent on the entire dataset’s makeup at that moment.
Method 3: Combine All Columns into a Unique Key
If each row is uniquely defined by its full field values, we can concatenate them all into a single key.
Option A — Using Power Query
Steps:
- Go to Power Query Editor. (Accessible from report view, Home > Transform Data)
- Add Columns
- Select the following columns while holding Ctrl:
Student NameMentor NameClub NameJoined DateLeaving Date- Merge Columns.
- Choose a separator.
- Name the new column Mergedpqid.
Option B — Power Query M Code
Option C — DAX (Calculated Column)
In the Table view of the report, select New column and use this DAX Query
Merged_DAX = [Student Name] & "_" & [Mentor Name] & "_" & [Club Name] & "_" & FORMAT([Joined Date], "YYYYMMDD") & "_" & FORMAT(IF(ISBLANK([Leaving Date]), TODAY(), [Leaving Date]), "YYYYMMDD")
Pros:
- Easy to inspect and debug.
- Transparent, readable key logic.
- Works well for merges in Power Query.
Cons:
- Long string values may impact performance.
- Case and whitespace sensitive.
Method 4: Convert Concatenated Key to a Number
This creates a numeric key using the combined string, often better for joins and sorting.
- Go to
Add Column>Custom Column. - Use the following formula for basic numeric conversion of the merged columns:
= Table.AddColumn(Custom1, "Numberval", each List.Sum(List.Transform(Text.ToList(Text.Combine({ [Student Name], [Mentor Name], [Club Name], Text.From([Joined Date]), Text.From([Leaving Date]) }, "_")), each Character.ToNumber(_))))
This approach adds up character values to create a pseudo-unique number.
Pros:
- Numeric keys are preferred for indexing and storage.
- Cleaner column profile for modelling.
Cons:
- Not 100% collision-free. : This is because this simply sums the character codes of each letter in the concatenated key. So, if two rows have similar or rearranged content (like
John_ClubAandClubA_John.They can still produce the same total sum, leading to collisions—i.e., different inputs with the same output. - Less readable and harder to reverse-engineer.
Let's Fix this
1. Text.Combine(...)
This joins all your key columns into one string using underscores
e.g., "Courtney Turner_Dana French_Club 3_19/01/2025"
It guarantees a consistent structure for hashing-style logic.
2. Text.ToList(...)
Breaks the combined string into individual characters:
{ "C", "o", "u", "r", ..., "2", "5" }
3. List.Numbers(1, Text.Length(...))
Generates a list of numbers from 1 to the number of characters in the string. These serve as positional weights:
{ 1, 2, 3, ..., n }
4. List.Zip({CharList, PosList})
Pair each character with its respective position in the string:
{ {"C",1}, {"o",2}, {"u",3}, ... }
5. Character.ToNumber(_{0}) * _{1}
For each character–position pair:
_ {0}gets the character,_ {1}gets its position,- We multiply the Unicode value of the character by its position.
This boosts uniqueness because it’s sensitive to character content and order.
6. List.Sum(...)
Finally, we add up all the weighted values. The resulting number is a pseudo-hash:
- Compact
- Numeric
- Order-sensitive
- Less collision-prone than a basic character sum
Conclusion
When it comes to crafting a reliable primary key in Power BI, not all methods are created equal. While ranking-based and numeric transformation strategies can be clever, they can also introduce instability or collision risks as your dataset evolves.
If you're working within Power Query Desktop, the most dependable and straightforward solution remains:
Concatenate all relevant columns into a single, readable string key.
This method:
- Preserves contextual meaning,
- Ensures uniqueness when built thoughtfully,
- And plays nicely with both merges in Power Query and joins in the data model.
Although hashing offers compactness and obfuscation, Power Query Desktop does not currently support hashing functions like Binary.Hash, limiting your ability to implement true cryptographic hashes natively. For those scenarios, consider preprocessing your data externally or using services like Power Automate or Dataflows that support more advanced transformations.
In short: When in doubt, concat it out. It's human-readable, refresh-stable, and just works.
.jpg)













Comments
Post a Comment