Help needed for new PII command

together we create
Standard

Update

The location of the JSON files has been changed to development branch

I’ll cut right to it, I need your help.

I’m developing a new command for dbatools to scan for PII.

I already have a wide variety of different patterns and ways to check on possible personal information but I want to be as thorough and complete as possible.

The command is called Invoke-DbaDbPiiScan and it does two things:

  1. it scans the columns in the tables and sees if it is named in such a way that it could contain personal information
  2. it retrieves a given amount of rows and goes through the rows to do pattern recognition

pii scan result

How does it work

The command uses two files:

  1. pii-knownnames.json; Used for the column name recognition
  2. pii-patterns.json; Used for the pattern recognition

You can find the files here in the GitHub repository.

The patterns and known names are setup using regex to make the scan really fast.
Also, using regex this way with the JSON files makes the solution modular and easy to extend.

pii-knownnames.json

An example of a known name regex is this:

What this does is, it tries to match anything with “name”.

pii-patterns.json

The pattern regexes tend to be more complex than the know names. This is because we have to deal with more complex data.

An example of a pattern:

This particular pattern is used to find any MasterCard credit card numbers.

How can you help

What I need from you is to see if you can come up with more patterns that could lead to a more exact result.

I opened an issue in the Github repository where you can leave a comment with the pattern.

If this pattern is only used in a certain country, make sure you include which country this applies to.

I want to thank beforehand for any input.

If you have any questions leave a comment here, contact me through SQL Community Slack Channel or Twitter both as @SQLStad.

 

Leave a Reply