PowerShell - iterate over a collection of PDFs looking for a specific info.subject

Does anyone know how to iterate over a collection of PDFs using PowerShell, looking for documents with specific information stored in one of the info properties, e.g., info.subject?

As part of a document-workflow process I’m working on, I’m going to be storing values in some of the info properties, viz., Author, Subject, and Keywords. It would be very useful to later be able to find PDF documents that have specific information stored in those properties.

In Windows, if you look at the Properties for a PDF document, on the PDF tab you can see some of the info properties. Ideally, I’d be able to access those properties without having to actually open the PDF.

Christian Bahnsen

Out of the box, without a specialized function or module, PowerShell does not know much about PDF files. You can show what's available with:

Get-Item -Path '<path to your pdf file>\<your pdf file>.pdf' | Select-Object -Property *

😉

Thanks for your reply.

Yes, the info properties are metadata, so they don’t show up in the standard file properties.

I’m hoping someone has already cracked this nut. The Adobe Acrobat technical support community doesn’t seem to have the breadth and depth I find for MSFT products.

Christian

Check out this post showing how to use itextsharp.dll to parse PDF files:
https://social.technet.microsoft.com/Forums/scriptcenter/en-US/086b9a8c-7e47-49ed-8e94-8f5f43f408fe/search-a-pdf-and-return-specific-text?forum=winserverpowershell
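
For the info properties specifically, here is a minimal sketch of that approach. It assumes iTextSharp 5.x, where PdfReader exposes the document information entries through its Info property as a string dictionary. The DLL location, the folder, the subject value, and the function name Get-PdfInfoProperty are hypothetical placeholders, not anything taken from the linked post. It reads each file programmatically, so nothing has to be opened in Acrobat.

```powershell
# Assumption: itextsharp.dll (iTextSharp 5.x) was downloaded to this hypothetical path.
Add-Type -Path 'C:\Tools\itextsharp.dll'

# Hypothetical helper: returns selected info properties for one PDF file.
function Get-PdfInfoProperty {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    $reader = New-Object -TypeName iTextSharp.text.pdf.PdfReader -ArgumentList $Path -ErrorAction Stop
    try {
        # Info is a dictionary of the entries stored in the PDF's info dictionary.
        $info = $reader.Info
        $result = [ordered]@{ Path = $Path }
        foreach ($key in 'Author', 'Subject', 'Keywords') {
            # An entry is absent when the PDF never set it, so check before reading.
            $result[$key] = if ($info.ContainsKey($key)) { $info[$key] } else { $null }
        }
        [pscustomobject]$result
    }
    finally {
        # Release the file handle even if reading the properties fails.
        $reader.Close()
    }
}

# Hypothetical folder and subject value; adjust both to your workflow.
Get-ChildItem -Path 'C:\Docs' -Filter '*.pdf' -Recurse -File |
    ForEach-Object {
        $file = $_
        try {
            Get-PdfInfoProperty -Path $file.FullName
        }
        catch {
            # Damaged or password-protected PDFs may fail to load; skip them.
            Write-Warning "Skipped $($file.FullName): $($_.Exception.Message)"
        }
    } |
    Where-Object { $_.Subject -eq 'Invoice' }
```

Swapping the final Where-Object filter for a test on Author or Keywords (for example with -like and wildcards) should cover the other properties mentioned in the question.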