Edit and Read PDFs in PowerShell with iTextSharp

Posted on 9 April 2016

Apart from sending personalised emails using PowerShell, I have also needed to work with PDFs in PowerShell. My most common use case is reading in marks from a marksheet (a PDF form). I create a template PDF with fillable form fields and, from that, create individual copies to send to students or colleagues. Doing this by hand for 40 or 50 marksheets is tedious, so I have come to rely on PowerShell once again.

I have found iTextSharp to be the best tool to use in PowerShell.

The first step is to download the iTextSharp DLL and load the assembly into PowerShell.

[System.Reflection.Assembly]::LoadFrom("itextsharp.dll") 
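
One thing to watch: .NET resolves a relative path passed to LoadFrom against the process working directory, which PowerShell does not update when you change location. Below is a minimal sketch that checks the file exists and resolves it to a full path before loading. The function name Import-ITextSharp and the example path are my own placeholders, not part of iTextSharp.

```powershell
# Hypothetical helper: load the iTextSharp assembly from an explicit path.
function Import-ITextSharp {
    param(
        [Parameter(Mandatory)][string]$Path
    )

    # Fail early with a readable message if the DLL is not where we expect it.
    if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) {
        throw "iTextSharp DLL not found at '$Path'"
    }

    # Resolve to a full path so the load does not depend on the working directory.
    $FullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    Add-Type -Path $FullPath
}

# Usage (placeholder path):
# Import-ITextSharp -Path 'C:\Path\To\itextsharp.dll'
```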

iTextSharp can both read and write to PDFs.

Reading Form Fields from a PDF

To read from a PDF, you simply open a PDF Reader and read each field you require:

$PdfReader = New-Object iTextSharp.text.pdf.PdfReader("C:\Path\To\Original.pdf") # Open reader

$FieldValue = $PdfReader.AcroFields.GetField("FieldName") # Read form field

$PdfReader.Close() # Close the file
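
If you do not know the field names in advance, or want every value at once, you can loop over the reader's field collection. This is a sketch that assumes iTextSharp 5, where AcroFields.Fields is a dictionary keyed by field name. The function name Get-PdfFormValues is hypothetical.

```powershell
# Hypothetical helper: return all form field names and values from a PDF.
# Assumes the iTextSharp assembly has already been loaded.
function Get-PdfFormValues {
    param(
        [Parameter(Mandatory)][string]$Path
    )

    $Reader = New-Object iTextSharp.text.pdf.PdfReader($Path)
    try {
        $Values = [ordered]@{}
        # Fields is keyed by the fully qualified field name.
        foreach ($Name in $Reader.AcroFields.Fields.Keys) {
            $Values[$Name] = $Reader.AcroFields.GetField($Name)
        }
        $Values
    }
    finally {
        # Close the reader even if reading a field fails.
        $Reader.Close()
    }
}

# Usage (placeholder path):
# Get-PdfFormValues -Path 'C:\Path\To\Original.pdf'
```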

Writing Form Fields in a PDF

Should you want to write to form fields, you first open a PDF Reader on the template containing the form fields and then open a PDF Stamper for the PDF file you want to write to:

$PdfReader = New-Object iTextSharp.text.pdf.PdfReader("C:\Path\To\Original.pdf") # Open reader for Template PDF
$PdfStamper = New-Object iTextSharp.text.pdf.PdfStamper($PdfReader, [System.IO.File]::Create("C:\Path\To\New.pdf")) # Open stamper for the new PDF file

Now, for each field you want to write, specify the field’s name and the value you want to put into it.

$PdfStamper.AcroFields.SetField("FieldName", "Field Value")

There is no need to “Save” per se, but remember to close both the Stamper and Reader!

$PdfStamper.Close()
$PdfReader.Close()
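
Put together, the reader, stamper and SetField calls fit into one small function that fills a template from a hashtable of values. This is only a sketch: the function name New-FilledPdf, the CSV file and its Name and Mark columns are all assumptions for illustration. SetField returns $false when the field is not found, so the sketch warns about misspelled field names.

```powershell
# Hypothetical helper: copy a template PDF and fill in its form fields.
# Assumes the iTextSharp assembly has already been loaded.
function New-FilledPdf {
    param(
        [Parameter(Mandatory)][string]$TemplatePath,
        [Parameter(Mandatory)][string]$OutputPath,
        [Parameter(Mandatory)][System.Collections.IDictionary]$Values
    )

    $Reader = New-Object iTextSharp.text.pdf.PdfReader($TemplatePath)
    $Stamper = $null
    try {
        $Stamper = New-Object iTextSharp.text.pdf.PdfStamper($Reader, [System.IO.File]::Create($OutputPath))
        foreach ($Name in $Values.Keys) {
            # SetField returns $false if no field with this name exists.
            if (-not $Stamper.AcroFields.SetField($Name, [string]($Values[$Name]))) {
                Write-Warning "Field '$Name' was not found in '$TemplatePath'"
            }
        }
    }
    finally {
        # Closing the stamper writes the new file; then release the template.
        if ($Stamper) { $Stamper.Close() }
        $Reader.Close()
    }
}

# Usage with a hypothetical CSV that has Name and Mark columns:
# Import-Csv 'C:\Path\To\Marks.csv' | ForEach-Object {
#     New-FilledPdf -TemplatePath 'C:\Path\To\Original.pdf' `
#                   -OutputPath "C:\Path\To\$($_.Name).pdf" `
#                   -Values @{ StudentName = $_.Name; Mark = $_.Mark }
# }
```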

Flattening Form Fields or Making Them Read-Only

Once your students or colleagues have completed the forms, you might want to store or archive them in some way. What I normally do is either make the fields read-only or flatten them (turn each field into normal text rather than a form field).

If you want to flatten all fields in your PDF, you can simply enable the FormFlattening property in the Stamper:

$PdfStamper.FormFlattening = $true

This will flatten ALL fields. However, if you only want to flatten certain fields, you can specify each one you want to flatten and then enable the FormFlattening property:

$PdfStamper.PartialFormFlattening("FieldName")
$PdfStamper.FormFlattening = $true
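
The flattening lines above need an open Stamper around them. Here is a minimal sketch of the whole archive step. The function name Save-FlattenedPdf and its parameters are my own invention. If you pass no field names, every field is flattened; otherwise only the named ones are.

```powershell
# Hypothetical helper: write a flattened copy of a completed form.
# Assumes the iTextSharp assembly has already been loaded.
# $ArchivePath must be a different file from $SourcePath.
function Save-FlattenedPdf {
    param(
        [Parameter(Mandatory)][string]$SourcePath,
        [Parameter(Mandatory)][string]$ArchivePath,
        [string[]]$FieldName = @()
    )

    $Reader = New-Object iTextSharp.text.pdf.PdfReader($SourcePath)
    $Stamper = $null
    try {
        $Stamper = New-Object iTextSharp.text.pdf.PdfStamper($Reader, [System.IO.File]::Create($ArchivePath))
        foreach ($Name in $FieldName) {
            # Mark individual fields for flattening; discard the Boolean result.
            [void]$Stamper.PartialFormFlattening($Name)
        }
        # With no fields marked above, this flattens all of them.
        $Stamper.FormFlattening = $true
    }
    finally {
        if ($Stamper) { $Stamper.Close() }
        $Reader.Close()
    }
}

# Usage (placeholder paths):
# Save-FlattenedPdf -SourcePath 'C:\Path\To\Completed.pdf' -ArchivePath 'C:\Path\To\Archive.pdf'
```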

Otherwise, if you don’t want to flatten the field (for example, if it’s a multi-line textbox), you can just set it to read-only:

$PdfStamper.AcroFields.SetFieldProperty("FieldName", "setfflags", [iTextSharp.text.pdf.PdfFormField]::FF_READ_ONLY, $null) # $null applies the change to every instance of the field
$PdfStamper.AcroFields.RegenerateField("FieldName") 

After changing a field’s property, you must regenerate the field.

You can refer to the FormField API for more on what you can do with the FormFields.

What I also like to do is to make the field RED to indicate that it was added by the marker and not just part of the template:

$PdfStamper.AcroFields.SetFieldProperty("FieldName", "textcolor", [iTextSharp.text.BaseColor]::RED, $null)
$PdfStamper.AcroFields.RegenerateField("FieldName")

Remember to regenerate the field when you change something!

Creating New PDFs

Of course, you can also create new PDFs, and the process is fairly simple.

First, define the page size you need. Again, you can refer to the PageSize API to find all the available page sizes. Once you’ve defined the page size, create a new PDF Document and associate it with a FileStream through a PdfWriter. Set the margins you want and open the document.

$pagesize = New-Object iTextSharp.text.Rectangle([iTextSharp.text.PageSize]::A4) # Create a new A4 rectangle
$Document = New-Object iTextSharp.text.Document($pagesize)
$FileStream = [System.IO.File]::Create("C:\Path\To\Output.pdf")
$PdfWriter = [iTextSharp.text.pdf.PdfWriter]::GetInstance($Document, $FileStream)
$Document.setMargins(0, 0, 0, 0)
$Document.Open()

See? That wasn’t that hard…

Now you probably want to write something to the page. First, define the font you want to write in. You can use either the built-in PDF fonts or your own TrueType/OpenType font:

# Create a new font based on the built-in COURIER font. (iText does not embed the standard built-in fonts, so the EMBEDDED flag only matters for fonts loaded from a file.)
$CourierFont = [iTextSharp.text.pdf.BaseFont]::createFont([iTextSharp.text.pdf.BaseFont]::Courier, [iTextSharp.text.pdf.BaseFont]::CP1252, [iTextSharp.text.pdf.BaseFont]::EMBEDDED)

# Create a new font for "Arial" and embed it in your PDF.
$ArialFont = [iTextSharp.text.pdf.BaseFont]::createFont("ARIAL.TTF", [iTextSharp.text.pdf.BaseFont]::CP1252, [iTextSharp.text.pdf.BaseFont]::EMBEDDED)

The BaseFont API has all the built-in PDF fonts.

Now you need to access the content of the page and add your text to it:

# Access the Content
$DirectContent = $PdfWriter.directContent

$DirectContent.saveState()
$DirectContent.beginText()

# Place your text at a certain point
# iText starts (0,0) at the Bottom-Left of the page
$DirectContent.moveText($x, $y) 

# Set the Font (which we defined earlier) and the size in Points
$DirectContent.setFontAndSize($CourierFont, 18)

# Add the text:
$DirectContent.showText("Text to show on PDF")
$DirectContent.endText()
$DirectContent.restoreState()

# Close the Document and then the FileStream
$Document.Close()
$FileStream.Close()
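
Because iText measures in points (1/72 of an inch) from the bottom-left corner, I find it easier to think in millimetres from the top-left and convert. The two helpers below are a sketch of that arithmetic only. Their names are hypothetical, and 842 is the height of an A4 page in points (you could also read it from $Document.PageSize.Height).

```powershell
# Hypothetical helper: convert millimetres to PDF points (72 points per 25.4 mm).
function ConvertTo-PdfPoint {
    param([double]$Millimetres)
    $Millimetres * 72 / 25.4
}

# Hypothetical helper: convert a position measured in millimetres from the
# top-left corner into iText coordinates (points from the bottom-left corner).
function ConvertTo-PdfPosition {
    param(
        [double]$LeftMm,
        [double]$TopMm,
        [double]$PageHeightPoints
    )
    [pscustomobject]@{
        X = ConvertTo-PdfPoint $LeftMm
        Y = $PageHeightPoints - (ConvertTo-PdfPoint $TopMm)
    }
}

# Example: 20 mm from the left edge and 30 mm from the top of an A4 page.
$Position = ConvertTo-PdfPosition -LeftMm 20 -TopMm 30 -PageHeightPoints 842
# $DirectContent.MoveText($Position.X, $Position.Y)
```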

Add an Image to PDF

To add an image, you simply load the image, scale it and place it in your Document while the Document is still open:

# Load your image into iText
[iTextSharp.text.Image]$Image = [iTextSharp.text.Image]::getInstance("MyImage.png")
# Scale the image, in percentage
$Image.ScalePercent(75)
# Place your image at a certain point
# Remember, iText starts (0,0) at the Bottom-Left of the page
$Image.setAbsolutePosition($x, $y)
$Document.add($Image)

And that’s it!

Obviously, none of this is meant for high-intensity PDF creation and editing, but for capturing some form fields and distributing custom PDFs, this routine has served me well.