Recently, I use Chrome to run a spider locally with a golang library called chromedp as driver, then I want to use Heroku to deploy the spider. but I met several problems, luckily with help of Google and documents of Heroku, I solved these problems, this post is used to describe the problems and solutions.

Terminology

Chrome DevTools Protocol

The Chrome DevTools Protocol allows for tools to instrument, inspect, debug and profile Chromium, Chrome and other Blink-based browsers. Many existing projects currently use the protocol. The Chrome DevTools uses this protocol and the team maintains its API.

Instrumentation is divided into a number of domains (DOM, Debugger, Network etc.). Each domain defines a number of commands it supports and events it generates. Both commands and events are serialized JSON objects of a fixed structure.

The Chrome DevTools Protocol is a JSON-based remote debugging protocol. The protocol is used by the Chrome DevTools to communicate with the browser. The protocol is also used by the browser to communicate with the DevTools.

There are many drivers for the Chrome DevTools Protocol in different languages. Such as puppeteer of Node.js and chromedp of Go.

chromedp

Package chromedp is a faster, simpler way to drive browsers supporting the Chrome DevTools Protocol in Go without external dependencies.

We can use chromedp to manipulate Chrome using golang.

Heroku

Heroku is a cloud platform that lets companies build, deliver, monitor and scale apps — we're the fastest way to go from idea to URL, bypassing all those infrastructure headaches.

Heroku let us web developers focus on the code, not the infrastructure.

Problems

Go version problem

The first problem I met is I can't even deploy my app to Heroku. When I try to deploy my app using git push heroku master, I get the following error:

Syntax error: q.Interval.Milliseconds undefined

I checked my code several times and can't find the reason. Then I Googled it and found this problem usually caused by the Go version is too old in this github issues .

Then I changed go version in my go.mod file to latest version, and then I can deploy my app to Heroku. But same problem still exists. Appearantly, it didn't work, then I found this issues on how to change the go version in Heroku, which said we should specify the go version in environment variable.

Chrome buildpack

Another problem comes from how to install chrome in Heroku. When run it locally, we can use chrome app installed in our computer. But when we deploy it to Heroku, because we use container technology, like docker image, we build our app with buildpack based on official golang buildpack, like golang, there is also a chrome buildpack to deploy chrome based app.

We can use heroku buildpack:add command to add the chrome buildpack to our app.

heroku buildpacks:publish heroku/google-chrome master

So now, our app based on two buildpacks: golang and chrome.

Locate chrome binary

When we run our app, we can see the chrome binary is not in the path, by reading docs of heroku, we can get that chrome path stored in the environment variable GOOGLE_CHROME_SHIM. So we can specify it in go code using chromedp.ExecPath function as follows:

chromeBin := os.Getenv("GOOGLE_CHROME_SHIM")
log.Info("chrome path: %+v", chromeBin)

options := []chromedp.ExecAllocatorOption{
        chromedp.ExecPath(chromeBin),
        chromedp.Flag("headless", true),
        chromedp.Flag("blink-settings", "imageEnable=false"),
        chromedp.UserAgent(`Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko)`),
}

Thought

By using Google and Github and official documents, we can solve 90% of our daily programming problems. The only thing we need to do is to find the right way to describe the problem, and don't panic, keep calm to search for solution.