Beyond Regex: How We Built Enterprise-Grade PII Redaction
5/6/2026
If you've ever used a data masking tool, you've probably run into this frustrating scenario: a random tracking number or database ID suddenly gets blurred out because the software thinks it's a credit card number.
This happens because most tools rely entirely on Regular Expressions (Regex).
Regex is powerful for finding text that matches a pattern, but it doesn't understand context or mathematical validity. At ScreenMask, we decided that wasn't good enough for our enterprise users.
Here is how we moved beyond Regex to achieve 99% accuracy with zero false positives.
1. The Luhn Algorithm for Credit Cards
A credit card number isn't just 16 random digits. The final digit is a checksum generated by the Luhn Algorithm.
When ScreenMask detects a 13- to 19-digit number on your screen, it doesn't blur it right away. It first runs the Luhn check in the background. If the checksum fails, the number cannot be a valid card number, so we treat it as a tracking number or a random ID and leave it visible.
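The check described above can be sketched in a few lines of Python. This is a minimal illustration, not ScreenMask's actual code: the names luhn_valid, CARD_CANDIDATE, and find_card_numbers are hypothetical, and the regex assumes the digits appear as one unbroken run with no spaces or dashes.

```python
import re

# Hypothetical candidate pattern: an unbroken run of 13 to 19 digits.
CARD_CANDIDATE = re.compile(r"\b[0-9]{13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9  # same as summing the two digits of the product
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Keep only the regex candidates that also pass the Luhn check."""
    return [m for m in CARD_CANDIDATE.findall(text) if luhn_valid(m)]
```

With this sketch, the widely used test number 4111111111111111 passes, while a sequential ID such as 1234567890123456 matches the regex but fails the checksum and would stay visible.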
2. The Verhoeff Algorithm for Aadhaar (India)
India's Aadhaar numbers end in a check digit computed with the Verhoeff Algorithm, a checksum scheme built on the dihedral group D5.
As with credit cards, when ScreenMask sees a 12-digit number, it runs the digits through the Verhoeff multiplication and permutation lookup tables (the inverse table is only needed when generating a check digit). If the resulting checksum isn't zero, it's not an Aadhaar number, and it stays unblurred.
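For readers who want to see the table lookups, here is a minimal Python sketch using the standard published Verhoeff tables. The function names are hypothetical, and the 12-digit length test is the only Aadhaar-specific rule applied; this is an illustration rather than ScreenMask's implementation.

```python
# Multiplication table of the dihedral group D5.
D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
    [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
    [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
    [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
# Position-dependent permutation table (repeats every 8 positions).
P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2],
    [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0],
    [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5],
    [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]
# Inverse table, used only when generating a check digit.
INV = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9]


def verhoeff_valid(digits: str) -> bool:
    """Return True if the string (check digit included) reduces to zero."""
    c = 0
    for i, char in enumerate(reversed(digits)):
        c = D[c][P[i % 8][int(char)]]
    return c == 0


def verhoeff_check_digit(body: str) -> int:
    """Compute the check digit to append to a digit string."""
    c = 0
    for i, char in enumerate(reversed(body)):
        c = D[c][P[(i + 1) % 8][int(char)]]
    return INV[c]


def looks_like_aadhaar(candidate: str) -> bool:
    """Hypothetical filter: exactly 12 ASCII digits plus a valid checksum."""
    is_12_digits = len(candidate) == 12 and all(ch in "0123456789" for ch in candidate)
    return is_12_digits and verhoeff_valid(candidate)
```

As a quick sanity check, the textbook example 236 gets check digit 3, so 2363 validates and 2364 does not.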
3. Structural Validation for PAN and GSTIN
Regex can easily find something that looks like an Indian PAN card (AAAAA1111A). But did you know that the 4th letter of a PAN card actually represents the entity type?
P = Person
C = Company
H = Hindu Undivided Family
If ScreenMask finds a string like ABCXD1234A, it recognizes that X, the 4th character, is not a valid entity type and ignores the string.
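A structural check of this kind could look like the Python sketch below. The names PAN_SHAPE, PAN_ENTITY_TYPES, and looks_like_pan are hypothetical. The article lists only P, C, and H; the other letters in the set are the commonly documented PAN holder-type codes and are included here as an assumption.

```python
import re

# Shape only: five letters, four digits, one letter.
PAN_SHAPE = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")

# P, C, and H come from the article. A, B, F, G, J, L, and T are an
# assumption based on the commonly documented holder-type codes.
PAN_ENTITY_TYPES = set("PCHABFGJLT")


def looks_like_pan(candidate: str) -> bool:
    """Return True if the shape matches and the 4th character is a known type."""
    if not PAN_SHAPE.fullmatch(candidate):
        return False
    return candidate[3] in PAN_ENTITY_TYPES
```

Keeping the entity codes in a plain set makes the rule easy to audit and to update if the list of holder types ever changes.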
Similarly, for GSTINs (which are 15 characters long), we extract the first two digits to verify they correspond to a real Indian State Code (01-38) before we ever apply a blur.
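The state-code check is simple enough to sketch in Python. The name looks_like_gstin is hypothetical, and the assumption that the remaining 13 characters are uppercase letters or digits goes beyond what is stated above; only the 15-character length and the 01-38 range come from the article.

```python
import re

# Assumption: two leading digits followed by 13 uppercase alphanumerics.
GSTIN_SHAPE = re.compile(r"[0-9]{2}[0-9A-Z]{13}")


def looks_like_gstin(candidate: str) -> bool:
    """Return True if the string is 15 characters and starts with a state code 01-38."""
    if not GSTIN_SHAPE.fullmatch(candidate):
        return False
    state_code = int(candidate[:2])
    return 1 <= state_code <= 38
```

A string that starts with 00 or 99 has the right length and character set, yet this sketch rejects it because the prefix falls outside the 01-38 range.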
4. Date Heuristics
Regex will happily match 99/99/9999 as a date. Our engine takes the matched string, attempts to parse it into a real calendar object, and verifies that the date actually exists, lies in the past, and doesn't make the person 150 years old.
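Here is a minimal Python sketch of that heuristic. The function name plausible_past_date and the MAX_AGE_YEARS constant are hypothetical, and the DD/MM/YYYY format is an assumption, since the text above does not say which day and month order the engine expects.

```python
from datetime import date, datetime
from typing import Optional

MAX_AGE_YEARS = 150  # ages at or above this are treated as implausible


def plausible_past_date(text: str, fmt: str = "%d/%m/%Y",
                        today: Optional[date] = None) -> bool:
    """Return True if text parses to a real past date implying an age under 150."""
    today = today or date.today()
    try:
        # strptime rejects impossible dates such as 99/99/9999 or 31/02/2020.
        parsed = datetime.strptime(text, fmt).date()
    except ValueError:
        return False
    if parsed > today:
        return False
    # Whole years elapsed, minus one if this year's anniversary hasn't arrived.
    before_anniversary = (today.month, today.day) < (parsed.month, parsed.day)
    age = today.year - parsed.year - int(before_anniversary)
    return age < MAX_AGE_YEARS
```

Passing today explicitly keeps the function deterministic in tests; in normal use it falls back to the current date.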
Why This Matters
When you are doing a live product demo or providing technical support, every blurred piece of text that shouldn't be blurred causes confusion.
By building these mathematical validations directly into our browser extension, ScreenMask Pro ensures that your sensitive data is protected without getting in the way of your actual work.