I was in the middle of setting up an Excel template based on the page count of each scanned PDF file in a particular folder. Filling the info in manually is fine if there are only a handful of scanned documents, but it would take too much time, with many errors along the way, if there are 50 or even 100 of them.
So what I need is a simple tool that counts the pages of each PDF file in a folder and exports the page counts to a CSV file that I can use later on.
There is an open source tool called PDF Page Count that I could use. It’s free and very easy to use. The export isn’t CSV-based, but it is easy to rework into the format my template needs.
But what I like more is a combination of PowerShell and PDFtk Free, a small free utility that comes with a command-line tool.
Install the free utility PDFtk and then prepare a PowerShell script like the one below:
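# Folder to scan, passed as the first command-line argument
$path = [string]$args[0]
$result = @()

dir $path\*.pdf | foreach-object {
    # pdftk prints the PDF metadata, including a "NumberOfPages: N" line
    $pdf = pdftk.exe $_.FullName dump_data
    $NumberOfPages = [regex]::match($pdf,'NumberOfPages: (\d+)').Groups[1].Value

    $details = @{
        NumberOfPages = $NumberOfPages
        Name = $_.Name
    }
    $result += New-Object PSObject -Property $details
}

# Show the results, then save them as CSV
# (-NoTypeInformation keeps the "#TYPE" header line out of the file)
$result
$result | export-csv -Path $path\pdf.csv -NoTypeInformation
echo "The result has been saved in $path\pdf.csv file"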
The script takes the directory as a command-line argument to identify which folder to scan for PDF files. For example, to scan the H: drive and find the page count of each PDF file in that directory, I run:
.\pdf.ps1 H:
Helped a lot! Thanks. It may be simple for you but saves a planet for me!
Thanks a lot for this Ken! I’m a complete noob to Powershell but I’m quickly realising how powerful it is.
I’m trying to apply the script to a directory with many sub folders to iterate through. How should I use the Get-ChildItem -recurse command with your script to achieve this?
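One possible way to handle sub folders, offered as a rough sketch rather than a tested drop-in: let Get-ChildItem do the walking with -Recurse and keep the pdftk call the same. It assumes pdftk.exe is on the PATH and PowerShell 3.0 or later (for -File and [PSCustomObject]). The function names Get-PageCountFromDump and Get-PdfPageCount and the extra Folder column are made up for this example and are not part of the original script.

```powershell
# Sketch only: recursive variant of the page-count script.
# Assumes pdftk.exe is on the PATH; function names are hypothetical.

# Pull the page count out of the text that "pdftk <file> dump_data" prints.
function Get-PageCountFromDump {
    param([string]$DumpText)
    $m = [regex]::Match($DumpText, 'NumberOfPages: (\d+)')
    if ($m.Success) { [int]$m.Groups[1].Value } else { $null }
}

# Walk $Path and every sub folder, emitting one object per PDF file.
function Get-PdfPageCount {
    param([string]$Path)
    Get-ChildItem -Path $Path -Filter *.pdf -File -Recurse | ForEach-Object {
        # pdftk returns an array of lines; join them into one string
        $dump = (pdftk.exe $_.FullName dump_data) -join "`n"
        [PSCustomObject]@{
            Folder        = $_.DirectoryName
            Name          = $_.Name
            NumberOfPages = Get-PageCountFromDump $dump
        }
    }
}

# Example call (adjust the paths to your own folder):
# Get-PdfPageCount 'H:\' | Export-Csv -Path 'H:\pdf.csv' -NoTypeInformation
```

The Folder column is there because files in different sub folders can share the same name, and without it the rows in the CSV would be hard to tell apart.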