I've been given the task of searching our entire file structure for SSNs (and other PII, so we can remove it). Fun, I know. So far this script searches through all .xlsx files in a given directory, but no matter what I try, I cannot for the life of me get the $SearchText variable to work. I have tried many variations of the regex shown below; the only search string that works is straight question marks, "???????????", but that returns entries I'm not looking for.
Any help would be very much appreciated.
Thanks!
$SourceLocation = "C:\Users\nick\Documents\ScriptingTest"
# Regex applied by PowerShell (not by Excel) to each cell's text; dashed SSNs only.
$SearchText2 = "^(?!(000|666|9))\d{3}-(?!00)\d{2}-(?!0000)\d{4}$"
# Excel's Find only understands wildcards, so "*" simply returns every non-empty cell.
$SearchText = "*"
$FileNames = Get-ChildItem -Path $SourceLocation -Recurse -Include *.xlsx
Function Search-Excel {
    Param ($File)
    $Excel = New-Object -ComObject Excel.Application
    $Workbook = $Excel.Workbooks.Open($File)
    ForEach ($Worksheet in @($Workbook.Sheets)) {
        $Found = $Worksheet.Cells.Find($SearchText)
        If ($Found) {
            # Remember the first hit so the loop can stop once FindNext wraps around.
            $BeginAddress = $Found.Address(0,0,1,1)
            Do {
                # Emit only the cells whose text matches the SSN regex.
                If ($Found.Text -match $SearchText2) {
                    [pscustomobject]@{
                        WorkSheet = $Worksheet.Name
                        Column = $Found.Column
                        Row = $Found.Row
                        Text = $Found.Text
                        Address = $File
                    }
                }
                $Found = $Worksheet.Cells.FindNext($Found)
            } Until ($Found.Address(0,0,1,1) -eq $BeginAddress)
        }
    }
    # Clean up inside the function so every workbook and Excel instance is released.
    $Workbook.Close($false)
    $Excel.Quit()
    [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Excel)
    [gc]::Collect()
    [gc]::WaitForPendingFinalizers()
    Remove-Variable Excel -ErrorAction SilentlyContinue
}
foreach ($File in $FileNames)
{
    Search-Excel -File $File.FullName
}
EDIT: Turns out Excel accepts only a very limited range of patterns in its own search (see "Acceptable Excel Regex"), so I changed the first $SearchText variable to just "*" and changed the first if statement to match the regex outside of Excel's search. Now I just need to come up with a crafty regex pattern to filter what I want. The next problem is filtering:
- No letters.
- Valid SSNs with dashes.
- Valid SSNs without dashes. (This part is stumping me: how do I search for something that can have dashes, but if it doesn't, can only be 9 characters long?)
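
One possible way to cover all three requirements is sketched here, not as a definitive answer. It captures the first separator as an optional dash and then requires the second separator to be identical through a backreference, so a value has either both dashes or none. The anchors rule out letters and extra digits. The names $SsnPattern and $Samples are made up for illustration, and the validity rules (no 000, 666, or 9xx area; no 00 group; no 0000 serial) are carried over from the dashed regex in the script.

```powershell
# Hypothetical pattern: (-?) captures an optional dash, \1 demands the same separator again.
$SsnPattern = '^(?!000|666|9)\d{3}(-?)(?!00)\d{2}\1(?!0000)\d{4}$'

# Illustrative sample values, not real data.
$Samples = '123-45-6789', '123456789', '123-456789', '12345-6789',
           '000-12-3456', '1234567890', '123-45-678A'

ForEach ($Sample in $Samples) {
    [pscustomobject]@{
        Text  = $Sample
        IsSsn = $Sample -match $SsnPattern
    }
}
```

Under these assumptions, only the first two samples should come back with IsSsn set to True. The mixed forms fail because the backreference has to repeat whatever the first separator was, including nothing at all.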
 
    