
Convert Web Pages and Office Documents to PDF with Gotenberg and Go

Published: 21. October 2024  •  go

In this blog post, we take a look at Gotenberg, a Docker-powered, stateless HTTP API that converts various document formats into PDF files and offers a few related features, such as taking screenshots and merging PDFs.

Gotenberg exposes an HTTP API, so any programming language with an HTTP client can interact with it. You can also call the API with command line tools like curl.

Gotenberg is a Docker image that incorporates all necessary dependencies (e.g., LibreOffice, Chromium, and more), and it is easy to set up and run.

Gotenberg is open-source and free to use. The source code can be found on GitHub.

Installation

To run Gotenberg, you need to have Docker installed on your machine and then run the following command:

docker run --rm -p 3000:3000 gotenberg/gotenberg:8

Gotenberg's internal port is set to 3000 by default, and you can map it to any port on your host machine. The command above maps the internal port to port 3000 on the host machine.

When you use Docker Compose, you can define the service as follows:

services:
  gotenberg:
    image: gotenberg/gotenberg:8
    ports:
      - 3000:3000

For more information about the installation, check out the official documentation.

Gotenberg provides many flags to customize the server's runtime behavior. You can find all available flags in the documentation. For this blog post, I use the default configuration.

To test if Gotenberg is running, open your browser and navigate to the health endpoint http://localhost:3000/health. You should see a JSON document with status information. You can also use curl to check the health status.

curl --request GET http://localhost:3000/health
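The same check can be done from Go. The following is a minimal sketch using only the standard library. It assumes that the health response is a JSON object with a top-level status field; parseHealth and healthStatus are hypothetical names chosen for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// healthStatus mirrors only the top-level "status" field of the health
// response. The exact response shape is an assumption.
type healthStatus struct {
	Status string `json:"status"`
}

// parseHealth extracts the top-level status from a health response body.
func parseHealth(body []byte) (string, error) {
	var h healthStatus
	if err := json.Unmarshal(body, &h); err != nil {
		return "", fmt.Errorf("decode health response: %w", err)
	}
	return h.Status, nil
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}

	resp, err := client.Get("http://localhost:3000/health")
	if err != nil {
		fmt.Println("Gotenberg is not reachable:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("reading the response failed:", err)
		return
	}

	status, err := parseHealth(body)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("HTTP status:", resp.StatusCode, "reported status:", status)
}
```

A check like this can be useful at application startup, before the first conversion request is sent.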

Usage with Go

In this section, I show you a few examples of how to access Gotenberg. I use Go for the examples, but you can use any programming language that supports sending HTTP requests. In Go, the HTTP client from the standard library is enough to communicate with Gotenberg, but a dedicated Go client library makes interacting with the API more convenient. The awesome-gotenberg GitHub repository lists client libraries for other programming languages.

To install the Go client library, execute the following command:

go get -u github.com/dcaraxes/gotenberg-go-client/v8

First of all, we need to create a new client instance. The client instance holds the hostname of the Gotenberg server and an HTTP client instance.

  httpClient := &http.Client{
    Timeout: 5 * time.Second,
  }
  client := &gotenberg.Client{Hostname: "http://localhost:3000", HTTPClient: httpClient}

main.go

You can omit the HTTPClient field; in that case, the library will instantiate a new HTTP client internally with &http.Client{}.

Conversions with Chromium

The following examples show you some features of Gotenberg that use the Chromium browser.

Convert URL to PDF

The following example exports a web page into a PDF.

First, the application creates a new request with the web page's URL. Then it sets the margins to zero, scales the content of the page to 90% and configures Chromium to render the page as a single PDF page.

  req := gotenberg.NewURLRequest("https://xkcd.com/")
  req.Margins(gotenberg.NoMargins)
  req.Scale(0.9)
  req.SinglePage()

main.go

Gotenberg supports more options for customizing the PDF output. You can find the list of available options in the documentation.

To convert the web page to a PDF, the application sends the request to the Gotenberg server with the Post method. The response is a standard http.Response object that contains the PDF in the body. This example writes the PDF to a file.

  response, err := client.Post(req)
  if err != nil {
    log.Fatal(err)
  }
  defer response.Body.Close()

  file, err := os.Create("xkcd.pdf")
  if err != nil {
    log.Fatal(err)
  }
  defer file.Close()

  _, err = io.Copy(file, response.Body)
  if err != nil {
    log.Fatal(err)
  }

main.go

Because this is such a common use case, the library provides a convenience method called Store that sends the request to the server and saves the response to a file. The following code does the same as the previous code snippet.

  err = client.Store(req, "xkcd.pdf")
  if err != nil {
    log.Fatal(err)
  }

All client methods, such as Post and Store, are also available as context-aware methods: PostContext and StoreContext. The context-aware methods allow you to pass a context to them. For example, if you run these methods in an HTTP handler, you can pass the request context to them.

  ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
  defer cancel()

  response, err := client.PostContext(ctx, req)
  if err != nil {
    log.Fatal(err)
  }

Convert HTML to PDF

Gotenberg can also convert any HTML we send in the request to a PDF. The following example shows how to convert a simple HTML page to a PDF. We can either read the HTML code from a file or any other location or dynamically create it in the application.

This example also shows how to handle assets like images, stylesheets, and fonts. The only requirement is that the HTML references the assets by plain file name, without any directory components, because Gotenberg stores all uploaded files on the same level.

In this demo application, the HTML and CSS code are hardcoded.

const html = `
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Gopher</title>
    <link rel="stylesheet" href="style.css">
  </head>
  <body>
    <h1>Gopher</h1>
    <img src="gopher.png" width="100" />
  </body>
</html>
`

const css = `
body {
  font-family: Arial, sans-serif;
  margin: 0;
  padding: 0;
  background-color: green;
}
h1 {
  color: black;
  font-size: 6em;
}
`

main.go

To show you that you can also incorporate data from external sources, the application loads a Gopher image from the internet.

  httpClient := &http.Client{
    Timeout: 5 * time.Second,
  }

  gopherURL := "https://raw.githubusercontent.com/golang-samples/gopher-vector/refs/heads/master/gopher.png"

  gopherResp, err := httpClient.Get(gopherURL)
  if err != nil {
    log.Fatal(err)
  }
  defer gopherResp.Body.Close()

  gopherBytes, err := io.ReadAll(gopherResp.Body)
  if err != nil {
    log.Fatal(err)
  }

main.go

Next, the application needs to wrap the HTML and the assets in a document object. The gotenberg-go-client library provides convenient methods to create new document objects from strings or byte slices.

  index, err := gotenberg.NewDocumentFromString("index.html", html)
  if err != nil {
    log.Fatal(err)
  }

  style, err := gotenberg.NewDocumentFromString("style.css", css)
  if err != nil {
    log.Fatal(err)
  }

  gopher, err := gotenberg.NewDocumentFromBytes("gopher.png", gopherBytes)
  if err != nil {
    log.Fatal(err)
  }

main.go

Note that the document names (style.css and gopher.png) must match the file names referenced in the HTML code.

The library also provides the NewDocumentFromPath method to read content from a file and the NewDocumentFromReader method to read content from an io.Reader.

Next, the application creates a new request object with NewHTMLRequest and passes the HTML page as a parameter. The assets are added to the request object with the Assets method. Additionally, the application sets the paper size to A4 and the margins to zero. And finally, it sends the request to the Gotenberg server and saves the response to a file with the Store method.

  req := gotenberg.NewHTMLRequest(index)
  req.Assets(style, gopher)
  req.PaperSize(gotenberg.A4)
  req.Margins(gotenberg.NoMargins)

  err = client.Store(req, "my.pdf")
  if err != nil {
    log.Fatal(err)
  }

main.go

Gotenberg can also convert Markdown files to PDF.


Take Screenshots

Gotenberg can also use Chromium to take screenshots of web pages. The following example takes a screenshot of a Wikipedia page. Instead of using the Store method, the code calls the StoreScreenshot method to send the request to Gotenberg and save the response to a file.

  req := gotenberg.NewURLRequest("https://en.wikipedia.org/wiki/2024_World_Jigsaw_Puzzle_Championship")
  req.Format(gotenberg.PNG)
  req.OmitBackground()

  err := client.StoreScreenshot(req, "puzzle.png")
  if err != nil {
    log.Fatal(err)
  }

main.go

This example sets the screenshot format to PNG and requests the API to hide the default white background and generate a picture with transparency.

If you don't want to save the screenshot to a file, you can call the Screenshot method and handle the response body in your code.

Gotenberg can also take screenshots of HTML and Markdown files.

Gotenberg with Chromium provides more features that we haven't covered here. You can set cookies, custom HTTP headers, auth headers, and more. Check out the documentation for more information.


Conversions with LibreOffice

The Gotenberg Docker image also includes LibreOffice, which allows it to convert different office document formats to PDF. You can find a list of all supported formats in the documentation.

The following example creates an Excel file programmatically with the excelize library and converts it to a PDF. You can find the code that creates the Excel file here. The application saves the generated Excel file under the name demo.xlsx.

With the NewDocumentFromPath method, the application creates a new document object from the Excel file. Then, it creates a new request object with NewOfficeRequest and passes the Excel file as a parameter. The application sends the request to the Gotenberg server and saves the generated PDF to a file with the Store method.

  xlsFile, err := gotenberg.NewDocumentFromPath("demo.xlsx", "demo.xlsx")
  if err != nil {
    log.Fatal(err)
  }

  req := gotenberg.NewOfficeRequest(xlsFile)
  err = client.Store(req, "demo.pdf")
  if err != nil {
    log.Fatal(err)
  }

main.go

You can find all the options for LibreOffice conversions in the official documentation.


We can send multiple files in one request.

  xlsFile2, err := gotenberg.NewDocumentFromPath("demo2.xlsx", "demo.xlsx")
  if err != nil {
    log.Fatal(err)
  }
  req = gotenberg.NewOfficeRequest(xlsFile, xlsFile2)
  err = client.Store(req, "demo.zip")
  if err != nil {
    log.Fatal(err)
  }

main.go

Without any additional configuration, Gotenberg converts all files individually to PDFs and sends them back in a ZIP file.

If you want to merge the files into one PDF instead, call the Merge method on the request.

  req = gotenberg.NewOfficeRequest(xlsFile, xlsFile2)
  req.Merge()
  err = client.Store(req, "demo_merged.pdf")
  if err != nil {
    log.Fatal(err)
  }

main.go

Additional Features

Gotenberg provides more features that I haven't covered in this blog post.

PDF/A and PDF/UA support
PDF/A is a standard for long-term archiving of electronic documents, and PDF/UA is a standard for accessible PDF files. All the routes can be configured to convert the output to a PDF/A or a PDF/UA file. Additionally, Gotenberg provides an endpoint that converts any PDF file to PDF/A or PDF/UA.

Metadata
Gotenberg can read metadata from and write metadata to PDF files. It provides a dedicated endpoint for each of the two operations.

Another feature is the merge PDFs route that merges multiple PDF files into one PDF file.


This concludes our journey through Gotenberg's features. Gotenberg is a powerful service that can help you dynamically generate PDF files and convert office document formats to PDF.

Check out the official documentation for more information and examples.