Canoo WebTest

PDF Step pdfToTextFilter


Extracts all text content from within the current PDF document.

In general, PDF documents can place text using a variety of mechanisms. A document may store its text as a stream of characters in the expected reading order; it may store the characters in some other order and rely on explicit positioning to put them in the right place; or it may contain only graphical representations of the characters. For these reasons, this filter may not always produce what you expect, and you will have to experiment to see what works for your documents.


description
    Required? no
    The description of this test step.
fragmentSep
    Required? no, default is a single space
    The fragment separator string to use, e.g. "" or " " or "," or " | ". Only used if mode is "groupByLines".
lineSep
    Required? no, default is the platform line separator
    The line separator string to use, e.g. " " or "\n".
mode
    Required? no, default is normal
    Deprecated: doesn't do anything anymore.
pageSep
    Required? no, default is [+++ NEW PAGE +++]\n
    The page separator string to use, e.g. "\n" or "------".
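
The separator settings described above could be combined as in the following sketch. It is an illustration only, not taken from the WebTest distribution: the attribute name pageSep follows the naming of lineSep and should be verified against your WebTest version, and the URL report.pdf is a placeholder. The deprecated mode attribute is left out because it no longer has any effect.

```xml
<!-- Hypothetical sketch: report.pdf is a placeholder document;
     ${expectedFile} must be defined elsewhere in your build. -->
<invoke url="report.pdf"/>
<compareToExpected saveFiltered="true" readFiltered="false" toFile="${expectedFile}">
    <!-- pageSep is an assumed attribute name, modelled on lineSep -->
    <pdfToTextFilter description="extract PDF text with explicit separators"
                     lineSep="\n"
                     pageSep="------"/>
    <lineSeparatorFilter description="normalise line separators"/>
</compareToExpected>
```

Setting the line separator explicitly, rather than relying on the platform default, should make the filtered text less dependent on the operating system that runs the test.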


Here is an example of using pdfToTextFilter:

pdfToTextFilter example
    <invoke url="testDocBookmarks.pdf"/>
    <compareToExpected saveFiltered="true" readFiltered="false" toFile="${expectedFile}">
        <pdfToTextFilter mode="groupByLines" lineSep="\n" description="extract PDF text"/>
        <lineSeparatorFilter description="normalise line separators"/>
    </compareToExpected>

Invoking the above steps would create a file containing something like the following:

pdfToTextFilter output
Heading One
[+++ NEW PAGE +++]
Heading Two
[+++ NEW PAGE +++]


Latest build: development
Posted: 19-Jul-2016 17:36
