Block ads using LaTeX

Posted by Fabrice on Tuesday, April 23, 2019. Updated on Tuesday, April 23, 2019. Translation: fr

I'm quite annoyed by ads. Like many people, I use an ad blocker on my computer, but one kind of ad annoys me the most: ads on printable tickets. Not only are they an eyesore, they also waste ink when printed.

I'm aware that you can just show the QR code or barcode on your smartphone, but still, wouldn't it be better to get rid of the ad directly?

A first obvious solution is to import the PDF into any image editing software and use a rectangle selection tool to erase the ad. However, this produces a new PDF (or image) file in which the text is no longer stored as text, and which may grow in size.

A simple workaround is to use a vector graphics editor, which keeps this information: for instance, open the PDF with Inkscape and delete the image corresponding to the ad. Although more elegant, this approach still has a serious drawback: it breaks the fonts. Some of them (such as Air France's “Excellence in Motion” font) are proprietary and cannot be found easily/legally for free.

But Inkscape can still help remove those ads: it lets you find the coordinates and dimensions of the ad, as illustrated in the following screenshot:

Inkscape ad dimensions

Explanations: after opening your PDF file, start by selecting the ad (purple); you may have to ungroup elements first (Ctrl+Shift+G). Then set the unit to cm or your favourite length unit (blue), and finally note the position and dimensions of the ad (red).

Then we just use LaTeX to draw a white rectangle over the ad (or one in any other background color; I'll let you work that out yourself, you can use RGB codes with xcolor). I already used the wallpaper package in another post, but it has some limitations: it doesn't allow importing multiple pages (such as a round-trip ticket), and TikZ doesn't interact well with the page geometry it induces.
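As an aside on the background color: if the ticket is not printed on pure white, a custom color can be declared with xcolor's \definecolor and used as the fill. The following is only a minimal sketch on a blank page; the color name ticketbg, its RGB values and the rectangle's position and size are all made up for illustration.

```latex
\documentclass[a4paper]{article}
\usepackage{tikz} % tikz loads xcolor, which provides \definecolor
\pagestyle{empty}

% Hypothetical color name and RGB values: sample the real background
% color of your ticket with a color picker.
\definecolor{ticketbg}{RGB}{240,240,235}

\begin{document}
\null % make sure the page has some content
\begin{tikzpicture}[remember picture,overlay]
  % 4cm x 2cm rectangle whose top-left corner sits 3cm to the right of
  % and 5cm below the top-left corner of the page
  \draw[color=ticketbg,fill=ticketbg]
    ([xshift=3cm,yshift=-5cm]current page.north west) rectangle ++(4cm,-2cm);
\end{tikzpicture}
\end{document}
```

As with any remember picture overlay, this needs two compilation runs before the rectangle lands in the right place.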

To get around these limitations, I used this answer on Stack Exchange. In short, we use the pdfpages package with the option pages={-} to include every page, and the option pagecommand to draw the rectangle overlay at position (X, Y) with width L and height H. That gives the following .tex file, which can be compiled with your favorite LaTeX engine (for instance by running pdflatex file.tex twice; the second run is needed for remember picture to position the overlay).

\documentclass[a4paper]{article}
% Tikz with pdfpages
\usepackage{tikz}
\usetikzlibrary{calc}
\usepackage{pdfpages}
% avoid page numbering
\pagestyle{empty}
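\begin{document}
% X, Y, L, H below are placeholders: replace them with the position and
% size of the ad as measured in Inkscape, units included (e.g. 1.5cm)
\includepdf[pages={-},% include all pages
  pagecommand={% is executed on every included page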
    \begin{tikzpicture}[remember picture,overlay]
      \draw[color=white,fill=white] ($(current page.north west) +%
         (X, -Y)$) rectangle ++ (L, -H);%
    \end{tikzpicture}%
}]%
{original file.pdf}
\end{document}

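Remark: you may have noticed the minus signs in front of Y and H. This is because TikZ's y axis points upwards, whereas Inkscape (and GIMP) measure from the top-left corner of the page with the y axis pointing downwards. Since the rectangle is anchored at current page.north west, moving down the page means moving in the negative y direction.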
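To make the sign convention concrete, the conversion can be written out. This is only a sketch: it assumes an A4 page (29.7 cm high) included without scaling, and uses the Rhônexpress values listed further down (X = 1.5, Y = 14.65, L = 18, H = 9, all in cm). The subscripted names are labels introduced here, and the results are measured from the bottom-left corner of the page.

```latex
\[
\begin{aligned}
(x_{\mathrm{left}},\; y_{\mathrm{top}})
  &= (X,\; 29.7 - Y) = (1.5,\; 15.05),\\
(x_{\mathrm{right}},\; y_{\mathrm{bottom}})
  &= (X + L,\; 29.7 - Y - H) = (19.5,\; 6.05).
\end{aligned}
\]
```

So the rectangle starts 15.05 cm above the bottom edge and extends 9 cm downwards, which is exactly what the offsets (X, -Y) and (L, -H) express relative to the top-left corner.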

Some examples of dimensions to copy-paste (mostly for myself):

  • Rhônexpress:
… + (1.5cm, -14.65cm)$) rectangle ++ (18cm, -9cm);
  • Air France/KLM foldable tickets:
… + (11cm, -18cm)$) rectangle ++ (9cm, -9cm);

You may have noticed that these dimensions are larger than in the picture above; this is because Air France sometimes uses square ads.
However, if you plan to use your smartphone, these companies also attach an ad-free PNG with minimal information.
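For the printed version, the values above slot into the template as follows. This is a sketch using the Rhônexpress numbers; the macros \adX, \adY, \adL, \adH and the file name ticket.pdf are hypothetical names introduced here so that the four numbers sit in one place, they are not part of the original template.

```latex
\documentclass[a4paper]{article}
\usepackage{tikz}
\usetikzlibrary{calc}
\usepackage{pdfpages}
\pagestyle{empty}

% Hypothetical helper macros: position and size of the ad as read in Inkscape
\newcommand{\adX}{1.5cm}   % distance from the left edge of the page
\newcommand{\adY}{14.65cm} % distance from the top edge of the page
\newcommand{\adL}{18cm}    % width of the ad
\newcommand{\adH}{9cm}     % height of the ad

\begin{document}
\includepdf[pages={-},% include all pages
  pagecommand={% executed on every included page
    \begin{tikzpicture}[remember picture,overlay]
      % white rectangle covering the ad; minus signs move down the page
      \draw[color=white,fill=white]
        ($(current page.north west) + (\adX, -\adY)$)
        rectangle ++ (\adL, -\adH);
    \end{tikzpicture}%
}]%
{ticket.pdf}% placeholder file name
\end{document}
```

Switching to another ticket layout then only means changing the four \newcommand lines.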

As I don't buy a plane ticket every day, I didn't feel the need to script it, and I don't have enough examples to build an interesting database of ad locations.
However, it is a great opportunity to collect more data, so you can git clone the script from here. As it is a self-hosted private repository, if you are eager to contribute, you may want to open a pull request on its GitHub repository or send me an email. In any case, feel free to contact me with any further remarks, comments or questions.

XKCD #1319 by Randall Munroe.

See also

  • pdf-adblock on GitHub.
    It's a script based on heuristics (for instance, an ad is assumed to be an image) that automatically removes single-page ads from a PDF (for instance a magazine PDF).

tags: LaTeX, inkscape, ads, git