Here's what I'm trying to do:
I have a huge mess of files (around ten thousand) in various formats. Each file can be classified as a certain type (e.g. product sheet, business plan, offer, presentation, etc.). The files are in no particular order and might as well be treated as a single flat list. I'm interested in creating a catalogue by type.
The idea is that, for a given format and a given type, I know which keywords to look for in the file's contents. I would like a PowerShell script that runs a series of searches, each one finding all the files of a certain format that contain specific keywords, and writes each resulting list to a separate CSV. The crucial point is that the keyword will be in the content (body of a PDF, cell of an Excel sheet, etc.) and not in the filename. So far I've tried the following:
get-childitem -Recurse | where {!$_.PSIsContainer} |
select-object FullName, LastWriteTime, Length, Extension | export-csv -notypeinformation -delimiter '|' -path C:\Users\Uzer\Documents\file.csv -encoding default
That works nicely and gives me the complete list of files, including their size and extension. I'm looking for something similar, but filtered by content. Any ideas?
Edit: based on the solution below, here's the new code:
$searchstring = "foo"
$directory = Get-ChildItem -include ('*.pdf') -Path "C:\Users\Uzer\Searchfolder" -Recurse
foreach ($obj in $directory)
{
    Get-Content $obj.fullname |
        Where-Object {$_.Contains($searchstring)} |
        select-object FullName, LastWriteTime, Length, Extension |
        export-csv -notypeinformation -delimiter '|' -path C:\Users\Uzer\Documents\file2.csv -encoding default
}
However, I get a bunch of errors like this one:
An object at the specified path C:[blabla]\filename.pdf does not exist, or has been filtered by the -Include or -Exclude parameter.
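For reference, here is a rough sketch of the direction I think this needs to go, under several assumptions. The function names Find-FilesByKeyword and Export-Catalogue and the rule table layout are made up for illustration. It assumes PowerShell 3.0 or later (for -File and for -LiteralPath on Select-String). It also assumes the error comes from square brackets in file names being read as wildcards, which -LiteralPath would avoid. Finally, it only finds keywords that are stored as readable text in the file, so compressed PDF streams or .xlsx workbooks would probably need a separate text-extraction step first.

```powershell
# Hypothetical helper: returns the file objects under $Root that match $Filter
# and contain at least one of $Keywords as a plain substring.
function Find-FilesByKeyword {
    param(
        [string]$Root,
        [string]$Filter,
        [string[]]$Keywords
    )
    Get-ChildItem -LiteralPath $Root -Filter $Filter -Recurse -File |
        Where-Object {
            # -LiteralPath keeps [ and ] in file names from being treated as wildcards.
            # -SimpleMatch searches for the literal text; -Quiet yields $true on a hit.
            Select-String -LiteralPath $_.FullName -Pattern $Keywords -SimpleMatch -Quiet
        }
}

# Hypothetical helper: writes one CSV per rule, named after the rule's Type.
function Export-Catalogue {
    param(
        [string]$Root,
        [string]$OutDir,
        [hashtable[]]$Rules
    )
    foreach ($rule in $Rules) {
        $csv = Join-Path $OutDir ("{0}.csv" -f $rule.Type)
        # The file objects (not their text lines) are piped on, so FullName,
        # Length etc. are available, and the CSV is written once per type.
        Find-FilesByKeyword -Root $Root -Filter $rule.Filter -Keywords $rule.Keywords |
            Select-Object FullName, LastWriteTime, Length, Extension |
            Export-Csv -NoTypeInformation -Delimiter '|' -Path $csv -Encoding Default
    }
}
```

It would be called with something like the following, where the type names, filters and keywords are placeholders:

$rules = @( @{ Type = 'offer'; Filter = '*.txt'; Keywords = @('foo') } )
Export-Catalogue -Root 'C:\Users\Uzer\Searchfolder' -OutDir 'C:\Users\Uzer\Documents' -Rules $rules

Compared with my attempt above, the differences are that the file object is what gets piped to Select-Object (Get-Content only passes on strings, which have no FullName or LastWriteTime), and that Export-Csv runs once per type instead of once per file inside the loop.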