Converting Word documents to PDF format in bulk can be a tedious task if done manually. Fortunately, PowerShell, a powerful scripting language built into Windows, offers a convenient way to automate this process. This article provides a detailed guide on how to use PowerShell to batch convert both .docx and .doc files to PDF, saving you time and effort.
The following script provides a basic framework for converting Word documents to PDF using PowerShell. It drives Word through COM automation, so Microsoft Word must be installed on the machine:
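# Folder containing the Word documents to convert
$documents_path = 'C:\doc2pdf'
# Start Word through COM automation
$word_app = New-Object -ComObject Word.Application
# Convert each matching document in the folder
Get-ChildItem -Path $documents_path -Filter *.doc? | ForEach-Object {
    $document = $word_app.Documents.Open($_.FullName)
    # Write the PDF next to the source file
    $pdf_filename = "$($_.DirectoryName)\$($_.BaseName).pdf"
    # 17 = wdFormatPDF
    $document.SaveAs([ref] $pdf_filename, [ref] 17)
    $document.Close()
}
$word_app.Quit()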
Explanation:
$documents_path = 'C:\doc2pdf': This line defines the path to the directory containing the Word documents you want to convert. Important: Replace 'C:\doc2pdf' with the actual path to your documents.
$word_app = New-Object -ComObject Word.Application: This line creates a new instance of the Microsoft Word application object, allowing PowerShell to interact with Word.
Get-ChildItem -Path $documents_path -Filter *.doc?: This command retrieves all files with the .doc or .docx extension from the specified directory. The ? wildcard ensures that both .doc and .docx files are included. Other extensions of the same shape, such as .docm, match as well.
ForEach-Object { ... }: This loop iterates through each Word document found in the directory.
$document = $word_app.Documents.Open($_.FullName): This line opens the current Word document using the Word application object. $_.FullName represents the full path to the current file.
$pdf_filename = "$($_.DirectoryName)\$($_.BaseName).pdf": This line constructs the name for the output PDF file. It uses the directory name and base name (filename without extension) of the original Word document and appends the .pdf extension. This ensures the PDF is created in the same directory as the original .doc/.docx file.
$document.SaveAs([ref] $pdf_filename, [ref] 17): This is the core conversion command. It saves the opened Word document as a PDF file.
    $pdf_filename is the name of the output PDF file.
    17 is a constant representing the wdFormatPDF format in Word.
    [ref] is used to pass the arguments by reference, which the SaveAs COM method can require.
$document.Close(): This line closes the current Word document.
$word_app.Quit(): This line closes the Word application object.
Running the script:
1. Save the script to a file with the .ps1 extension (e.g., convert_to_pdf.ps1).
2. Change the $documents_path variable to point to the directory containing your Word documents.
3. Open PowerShell and use the cd command to navigate to the directory where you saved the .ps1 file. For example: cd C:\scripts
4. Run the script by typing .\convert_to_pdf.ps1 and pressing Enter.
When dealing with a large number of files, errors can occur, especially due to memory limitations or Word crashing. Here's an enhanced script that addresses these issues:
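# Collect every file matching *.doc* under the source folder, recursively
$Files = Get-ChildItem -Path '.\path\to\docs' -Recurse -Include "*.doc*"
$counter = 0          # documents converted since Word was last started
$filesProcessed = 0   # documents converted in total
$Word = New-Object -ComObject Word.Application
foreach ($File in $Files) {
    # Same path as the source file, with the extension replaced by .pdf
    $Name = "$(($File.FullName).Substring(0, $File.FullName.LastIndexOf("."))).pdf"
    # Skip documents that already have a PDF larger than 3 KB
    if ((Test-Path $Name) -and ((Get-Item $Name).Length -gt 3kb)) {
        echo "skipping $($Name), already exists"
        continue
    }
    echo "$($filesProcessed): processing $($File.FullName)"
    $Doc = $Word.Documents.Open($File.FullName)
    $Doc.SaveAs($Name, 17)   # 17 = wdFormatPDF
    $Doc.Close()
    # Restart Word periodically to keep its memory use in check
    if ($counter -gt 100) {
        $counter = 0
        $Word.Quit()
        [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word) | Out-Null
        $Word = New-Object -ComObject Word.Application
    }
    $counter = $counter + 1
    $filesProcessed = $filesProcessed + 1
}
# Close the last Word instance once all files are done
$Word.Quit()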
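Enhancements:
Skips finished work: a document is skipped when a PDF with the same name already exists and is larger than 3 KB, so an interrupted run can be resumed.
Reports progress: the script prints each file as it is processed.
Restarts Word periodically: after about 100 documents, the script:
    Quits Word with $Word.Quit().
    Releases the COM object with [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word).
    Starts a new instance with $Word = New-Object -ComObject Word.Application.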
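The restart logic can be combined with basic error handling, so that one unreadable document does not stop the batch and Word is closed when the script finishes or fails. The following is a minimal sketch rather than a definitive implementation: the function name Convert-WordFolderToPdf and its parameters are hypothetical, and it assumes Word is installed on a Windows machine.

```powershell
# Hypothetical helper (not part of the original scripts): converts every
# .doc/.docx file under $Path to PDF and logs failures instead of stopping.
function Convert-WordFolderToPdf {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $RestartEvery = 100   # restart Word after this many documents
    )

    $wdFormatPDF = 17
    $word = New-Object -ComObject Word.Application
    $sinceRestart = 0
    try {
        foreach ($file in Get-ChildItem -Path $Path -Recurse -Include '*.doc', '*.docx') {
            $pdf = [System.IO.Path]::ChangeExtension($file.FullName, '.pdf')
            $doc = $null
            try {
                $doc = $word.Documents.Open($file.FullName)
                $doc.SaveAs($pdf, $wdFormatPDF)
            }
            catch {
                # Report the failure and move on to the next document
                Write-Warning "Failed to convert $($file.FullName): $_"
            }
            finally {
                # Close the document only if it was actually opened
                if ($doc) { $doc.Close() }
            }

            $sinceRestart++
            if ($sinceRestart -ge $RestartEvery) {
                # Same restart sequence as the enhanced script
                $word.Quit()
                [System.Runtime.InteropServices.Marshal]::ReleaseComObject($word) | Out-Null
                $word = New-Object -ComObject Word.Application
                $sinceRestart = 0
            }
        }
    }
    finally {
        # Runs whether the loop completed or threw, so Word is not left open
        $word.Quit()
        [System.Runtime.InteropServices.Marshal]::ReleaseComObject($word) | Out-Null
    }
}
```

It could be called as Convert-WordFolderToPdf -Path 'C:\doc2pdf'. If Word itself crashes, the cleanup calls can fail as well; this sketch does not try to recover from that case.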
Output Directory: To save the converted PDF files to a different directory, modify the $pdf_filename variable. For example:
$output_path = 'C:\PDF_Output'
$pdf_filename = "$output_path\$($_.BaseName).pdf"
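The example above assumes that the output directory already exists. As a hedged sketch, the path construction can be wrapped in a small helper that creates the directory on first use; the name Get-PdfOutputPath and its parameters are hypothetical and only serve as illustration.

```powershell
# Hypothetical helper: returns the PDF path for a source document inside
# $OutputDirectory, creating that directory if it does not exist yet.
function Get-PdfOutputPath {
    param(
        [Parameter(Mandatory)] [System.IO.FileInfo] $File,
        [Parameter(Mandatory)] [string] $OutputDirectory
    )

    if (-not (Test-Path -LiteralPath $OutputDirectory -PathType Container)) {
        New-Item -ItemType Directory -Path $OutputDirectory -Force | Out-Null
    }

    # Join-Path inserts the path separator, so no manual "\" is needed
    Join-Path -Path $OutputDirectory -ChildPath ($File.BaseName + '.pdf')
}
```

Inside the ForEach-Object loop it could be used as $pdf_filename = Get-PdfOutputPath -File $_ -OutputDirectory 'C:\PDF_Output'. An absolute output path is the safer choice, because Word resolves relative paths against its own working directory. When documents from several folders share a base name, their PDFs would overwrite each other in a single output directory.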
File Filtering: To convert only specific files, adjust the -Filter parameter in the Get-ChildItem command. For example, to convert only .docx files:
Get-ChildItem -Path $documents_path -Filter *.docx
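When a wildcard is too coarse, the file list can instead be narrowed by exact extension after Get-ChildItem returns it. The sketch below is one possible approach, not the only one: the function name Get-WordDocument and its parameters are hypothetical. It also drops names starting with ~$, which Word uses for its temporary owner files, in case they show up in the listing.

```powershell
# Hypothetical helper: lists files in $Path whose extension is exactly one of
# $Extension, ignoring Word's temporary "~$" owner files.
function Get-WordDocument {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string[]] $Extension = @('.doc', '.docx')
    )

    Get-ChildItem -Path $Path -File |
        Where-Object { $Extension -contains $_.Extension.ToLowerInvariant() } |
        Where-Object { -not $_.Name.StartsWith('~$') }
}
```

The result can be piped into the conversion loop in place of the Get-ChildItem call, for example Get-WordDocument -Path $documents_path | ForEach-Object { ... }.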
Word Version: The wdFormatPDF constant (value 17) is compatible with Word 2007 and later versions.
By leveraging PowerShell, you can efficiently batch convert Word documents to PDF format. The provided scripts offer a solid foundation, and the customization options allow you to tailor the process to your specific needs. Remember to handle errors, manage memory, and adjust the script according to your environment for optimal performance. Using automation tools like PowerShell will boost your productivity and help you avoid manual, repetitive tasks. This functionality could also be extended by integrating with the Microsoft Teams API to automate file conversions directly from Teams channels.