
Starting Out With Python Pdf Download

DocRaptor Vs. WeasyPrint: Python PDF Generation Tools Showdown


Tyler Hawkins

Senior software engineer. Continuous learner. Educator.

I recently published an article comparing HTML-to-PDF export libraries. In it, I explored options like the native browser print functionality, open-source libraries jsPDF and pdfmake, and the paid service DocRaptor. Here's a quick recap of my findings:

If you want the simplest solution and don't need a professional-looking document, the native browser print functionality should be just fine. If you need more control over the PDF output, then you'll want to use a library.
jsPDF shines when it comes to single-page content generated based on HTML shown in the UI. pdfmake works best when generating PDF content from data rather than from HTML. DocRaptor is the most powerful of them all with its simple API and its beautiful PDF output. But again, unlike the others, it is a paid service. However, if your business depends on elegant, professional document generation, DocRaptor is well worth the cost.

In the comment section for my article on Dev.to, one person suggested I take a look at Paged.js and WeasyPrint as additional alternatives to consider. (This person is Andreas Zettl by the way, and he has an awesome demo site full of Print CSS examples.)

So today we'll explore the relative strengths and weaknesses of DocRaptor and WeasyPrint.

WeasyPrint Overview

Let's start with WeasyPrint, an open-source library developed by Kozea and supported by CourtBouillon. For starters, it's free, which is a plus. It's licensed under the BSD 3-Clause License, a relatively permissive and straightforward license. WeasyPrint can generate output as either a PDF or a PNG, which should adequately cover most use cases. It's built for Python 3.6+, which is great if you're a Python developer. If Python is not your forte or not part of your company's tech stack, then this may be a non-starter for you.

One of the biggest caveats to be aware of is that WeasyPrint does not support JavaScript-generated content! So when using this library, you'll need to be exporting content that is generated server-side. If you are relying on dynamically generated content or charts and tables powered by JavaScript, this library is not for you.
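To make "generated server-side" concrete, here is a minimal sketch that builds a complete static HTML document in Python and hands it to WeasyPrint. The function names, the report layout, and the sample data are hypothetical and only for illustration; HTML(string=...) and write_pdf() are WeasyPrint's documented Python API.

```python
from html import escape


def build_report_html(title, rows):
    """Return a complete static HTML document for a list of (label, value) rows."""
    body_rows = "\n".join(
        f"<tr><td>{escape(str(label))}</td><td>{escape(str(value))}</td></tr>"
        for label, value in rows
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><meta charset='utf-8'><title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n"
        f"<table>\n{body_rows}\n</table></body></html>"
    )


def write_report_pdf(title, rows, out_path):
    """Render the static HTML to a PDF file with WeasyPrint."""
    # Imported here so the HTML builder can be used without WeasyPrint installed.
    from weasyprint import HTML

    HTML(string=build_report_html(title, rows)).write_pdf(out_path)
```

Because all of the markup exists before WeasyPrint sees it, nothing depends on a browser running scripts. Calling write_report_pdf("Sales", [("North", 10)], "report.pdf") would write the PDF, assuming WeasyPrint is installed.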

Installing WeasyPrint

Getting up and running with WeasyPrint is fairly easy. They provide installation instructions on their website, but I use pyenv to install and manage Python rather than Homebrew, so my installation steps looked more like this:

Installing pyenv and Python:

    # install pyenv using Homebrew
    brew install pyenv
    # install Python 3.7.3 using pyenv
    pyenv install 3.7.3
    # specify that I'd like to use version 3.7.3 when I use Python
    pyenv global 3.7.3
    # quick sanity check
    pyenv version
    # add `pyenv init` to my shell to enable shims and autocompletion
    echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n  eval "$(pyenv init -)"\nfi' >> ~/.zshrc

Installing WeasyPrint and running it against the WeasyPrint website:

    pip install WeasyPrint
    weasyprint https://weasyprint.org/ weasyprint.pdf

As you can see, the simplest way to use WeasyPrint from your terminal is to run the weasyprint command with two arguments: the URL input and the filename output. This creates a file called weasyprint.pdf in the directory from which you run the command. Here's a screenshot of the PDF output when viewed in the Preview app on a Mac:

[Image: Sample PDF output from WeasyPrint]

Looks great! WeasyPrint also has a full page of examples you can check out which showcases reports, invoices, and even event tickets complete with a barcode.
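If you'd rather call WeasyPrint from Python than from the terminal, the same URL-in, file-out flow can be sketched as below. The helper names parse_args and url_to_pdf are hypothetical; HTML(url=...) and write_pdf() come from WeasyPrint's Python API.

```python
import argparse


def parse_args(argv):
    """Parse the same two positional arguments the weasyprint command takes."""
    parser = argparse.ArgumentParser(description="Render a URL to a PDF file.")
    parser.add_argument("url", help="page to render")
    parser.add_argument("output", help="PDF file to write")
    return parser.parse_args(argv)


def url_to_pdf(url, output):
    """Fetch the page at `url` and write it to `output` as a PDF."""
    # Imported lazily so argument parsing works without WeasyPrint installed.
    from weasyprint import HTML

    HTML(url=url).write_pdf(output)
```

With WeasyPrint installed, url_to_pdf("https://weasyprint.org/", "weasyprint.pdf") should produce the same kind of file as the terminal command.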

DocRaptor Overview

Now let's consider DocRaptor. DocRaptor is closed-source and is available through a paid license subscription (although you can generate test documents for free). It uses the PrinceXML HTML-to-PDF engine and is the only API powered by this technology.

Unlike WeasyPrint's Python-only usage, DocRaptor has SDKs for PHP, Python, Node, Ruby, Java, .NET, and JavaScript/jQuery. It can also be used directly via an HTTP request, so you can generate a PDF right from your terminal using cURL. This is great news if you're someone like me who doesn't have Python in their arsenal.

DocRaptor can export content as a PDF, XLS, or XLSX document. This can come in handy if your content is meant to be a table compatible with Excel. For the time being though, we'll just look at PDFs since that's something both WeasyPrint and DocRaptor support.

One relative strength of DocRaptor compared to WeasyPrint is that it can wait for JavaScript on the page to be executed, so it's perfect for use with dynamically generated content and charting libraries.

Getting Started with DocRaptor

DocRaptor has guides for each of their SDKs that are well worth reading when first trying out their service. Since we ran the WeasyPrint example from the command line, let's also run DocRaptor in our terminal by using cURL to make an HTTP request. DocRaptor is API-based, so there's no need to download or install anything.

Here's their example you can try:

    curl http://YOUR_API_KEY_HERE@docraptor.com/docs \
      --fail --silent --show-error \
      --header "Content-Type:application/json" \
      --data '{"test": true,
               "document_url": "http://docraptor.com/examples/invoice.html",
               "type": "pdf" }' > docraptor.pdf

And here's the output after running that code snippet in your terminal:

[Image: Sample PDF output from DocRaptor]

Voila: a nice and simple invoice. DocRaptor's example here isn't as complex as WeasyPrint's was, so let's try generating a PDF from one of DocRaptor's more advanced examples.

    curl http://YOUR_API_KEY_HERE@docraptor.com/docs \
      --fail --silent --show-error \
      --header "Content-Type:application/json" \
      --data '{"test": true,
               "document_url": "https://docraptor.com/samples/cookbook.html",
               "type": "pdf" }' > docraptor_cookbook.pdf

Here's the output for this cookbook recipe PDF:

[Image: Sample PDF output from DocRaptor using their Cookbook Recipe example]

Pretty neat! Just like WeasyPrint, DocRaptor can handle complex designs and full-bleed layouts that extend to the very edge of the page. One important callout here is that DocRaptor supports footnotes, as seen in this example. WeasyPrint, on the other hand, had not yet fully implemented the CSS paged media specifications at the time of writing, so it couldn't handle footnote generation.

You can view more DocRaptor examples on their site including a financial statement, a brochure, an invoice, and an e-book.
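Since this comparison is aimed at Python developers, here is a rough Python equivalent of the cURL calls above, using only the standard library. It is a sketch under a few assumptions: the endpoint mirrors the one in the cURL examples, the API key is sent as the HTTP Basic auth username with an empty password (which is what a key in the URL's user part amounts to with cURL), and the function names are hypothetical. DocRaptor's own Python SDK is likely the better choice for real use.

```python
import base64
import json
import urllib.request

# Assumption: same endpoint as the cURL examples; check DocRaptor's docs for the current URL.
DOCRAPTOR_URL = "http://docraptor.com/docs"


def build_docraptor_request(api_key, document_url, test=True):
    """Build (but do not send) a POST request equivalent to the cURL call."""
    payload = {"test": test, "document_url": document_url, "type": "pdf"}
    # Basic auth header: the API key as username, with an empty password.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        DOCRAPTOR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def save_pdf(request, out_path):
    """Send the request and write the response body (the PDF) to out_path."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

Building the request separately from sending it makes it easy to inspect the payload before spending an API call. Something like save_pdf(build_docraptor_request("YOUR_API_KEY_HERE", "http://docraptor.com/examples/invoice.html"), "docraptor.pdf") would then perform the actual download.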

JavaScript Execution

So far we've seen the powers and similarities of both DocRaptor and WeasyPrint. But one core difference we touched on above is that WeasyPrint does not wait for JavaScript to execute before generating the PDF. This is crucial for applications built with a framework like React. By default, React apps contain only a root container div in the HTML, and then JavaScript runs to inject the React components onto the page.

So if you try to generate a PDF from the command line for an app built with React, you won't get the actual app content! Instead, you'll likely see the content of the noscript tag, which typically contains a message stating something like "You need to enable JavaScript to run this app."

This is also the case for applications that rely on charting libraries like Google Charts, HighCharts, or Chart.js. Without the JavaScript running, no chart is created.
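One way to anticipate this problem before generating a PDF is to look at the raw HTML the way a renderer without JavaScript would. The sketch below is a rough heuristic built on Python's standard html.parser module; the class and function names are hypothetical, and it only flags the extreme case where a page loads scripts but has no visible text of its own.

```python
from html.parser import HTMLParser


class StaticContentProbe(HTMLParser):
    """Collect what a renderer without JavaScript would have to work with."""

    SKIPPED = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.script_count = 0
        self.noscript_text = []
        self.visible_text = []
        self._open = []  # currently open noscript/script/style/title tags

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_count += 1
        if tag == "noscript" or tag in self.SKIPPED:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if not self._open:
            self.visible_text.append(text)
        elif self._open[-1] == "noscript":
            self.noscript_text.append(text)


def looks_js_dependent(html):
    """Return True if the page loads scripts but has no static visible text."""
    probe = StaticContentProbe()
    probe.feed(html)
    probe.close()
    return probe.script_count > 0 and not probe.visible_text
```

A page that mixes static and script-inserted content would not be flagged by this check, so treat a False result as "probably fine" rather than a guarantee.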

As an example, consider this simple web page I've put together. It contains a page header, a paragraph included in the HTML source code, and a paragraph inserted into the DOM by JavaScript. You can find the code on GitHub. Here's what the page looks like:

[Image: DocRaptor JS demo web page]

Now, let's use WeasyPrint to generate a PDF from the web page by running the following command in the terminal:

    weasyprint http://tylerhawkins.info/docraptor-js-demo/ weasyprint_js_demo.pdf

Here's the output:

[Image: JS demo PDF output from WeasyPrint]

Oh no! Where's the second paragraph? It's not there, because the JavaScript was never executed.

Now let's try again, but this time with DocRaptor. In order to have JavaScript run on the page, we must provide DocRaptor with the "javascript": true argument in our options object. Here's the code:

    curl http://YOUR_API_KEY_HERE@docraptor.com/docs \
      --fail --silent --show-error \
      --header "Content-Type:application/json" \
      --data '{"test": true,
               "javascript": true,
               "document_url": "http://tylerhawkins.info/docraptor-js-demo/",
               "type": "pdf" }' > docraptor_js_demo.pdf

And the output:

[Image: JS demo PDF output from DocRaptor]

Tada! The JavaScript has been successfully executed, leading to the insertion of the second paragraph.
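If you assemble the options object in Python rather than typing JSON by hand, the JavaScript flag becomes a simple keyword argument. This is a minimal sketch with a hypothetical helper name; the keys are the ones used in the cURL example above.

```python
import json


def docraptor_payload(document_url, javascript=False, test=True):
    """Return the JSON body for a DocRaptor PDF request as a string."""
    payload = {"test": test, "document_url": document_url, "type": "pdf"}
    if javascript:
        # Only include the flag when JavaScript execution is wanted.
        payload["javascript"] = True
    return json.dumps(payload, indent=2)
```

Leaving the flag out by default mirrors the earlier examples, where no JavaScript execution was requested.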

Conclusion

So, which should you use, WeasyPrint or DocRaptor? It depends on your use case.

If your app contains static content that doesn't rely on JavaScript, if Python is part of your tech stack, or if you need PNG image output, then WeasyPrint is an excellent choice. It's open source, free, and flexible enough to handle visually complex output.

If you need to use a programming language other than Python, or you rely on the execution of JavaScript to render the content you need exported, DocRaptor is the right choice.

Table of Comparisons

As an added bonus, here's a comparison table for a quick summary of these two tools:

[Image: DocRaptor vs. WeasyPrint comparison table]

Happy coding!

Also published at https://dzone.com/articles/docraptor-vs-weasyprint-a-pdf-export-showdown


Posted by: crystadifabioe0199067.blogspot.com

Source: https://hackernoon.com/docraptor-vs-weasyprint-python-pdf-generation-tools-showdown-c52h31uv
