Ben Gorman

Life's a garden. Dig it.

Let's automate this Python script with Google Cloud Run.

main.py
def hello():
    print("Hello World")
    
    
# Start script
if __name__ == "__main__":
    hello()

Step 1: Create a new Google Cloud Project

Create a new Google Cloud Project ->

What's a Google Cloud Project?

A Google Cloud Project is a top-level container for your work. In other words, it's a high-level abstraction for "a thing that does stuff". To give you a better idea:

  • Users are created and managed within a project.
  • APIs and tools are enabled within a project.
  • I create and manage one project for each of my businesses.
  • It's common to have a single billing account for a single project, but occasionally you'll see one billing account cover the expenses for multiple projects.
  • If you're creating more than one project per day or fewer than two projects per year, you're probably doing it wrong.

There are no hard-and-fast rules to this stuff. Just get your hands dirty and improve over time.

Still lost? Follow the docs here ->

Step 2: Prep the source code on your local machine

Mimic this project structure:

my-gcrun-project/
  main.py
  Procfile

The two files (main.py and Procfile) should have the following contents 👇

main.py
def hello():
    print("Hello World")


# Start script
if __name__ == "__main__":
    hello()

Procfile
web: python3 main.py

What's Procfile?

Procfile is a configuration file that tells Google the command to execute when the application starts. In our example Procfile above,

web: python3 main.py

  • web tells Google to use a web process type. Web processes are triggered via HTTP requests.
  • python3 tells Google to use the python3 runtime
  • main.py tells Google to run main.py

Step 3: Deploy the code to create a cloud run job

There are multiple ways to do this, but the easiest is to use gcloud run jobs deploy. This assumes you've installed the gcloud CLI.

gcloud run jobs deploy my-job \
    --source . \
    --region us-central1 \
    --project=my-project
In this example, I've created a job named my-job under project my-project in the region us-central1. You'll want to tweak these parameters to suit your needs.

Running this command in your local terminal should produce some output like this 👇

This command is equivalent to running `gcloud builds submit --pack image=[IMAGE] .` and `gcloud run jobs deploy my-job --image [IMAGE]`
 
Building using Buildpacks and deploying container to Cloud Run job [my-job] in project [my-project] region [us-central1]
✓ Building and creating job... Done.                                                                                                                                
  ✓ Uploading sources...                                                                                                                                            
  ✓ Building Container... Logs are available at [https://console.cloud.google.com/cloud-build/builds/...].   
Done.                                                                                                                                                               
Job [my-job] has successfully been deployed.
 
To execute this job, use:
gcloud run jobs execute my-job

Step 4: Execute the newly created job

The first time you execute a cloud run job, I recommend doing it from the console.

  1. Go to https://console.cloud.google.com/run/jobs (Make sure you're in the correct project.)
  2. Click the job you want to run
  3. Click the EXECUTE button. You may also click Execute with overrides to see the run-time options you can modify like Number of tasks and Environment variables.

Once comfortable running jobs from the console, try running your job from the command line using gcloud run jobs execute.

gcloud run jobs execute my-job --region us-central1

Running this command should output something like this 👇

✓ Creating execution... Done.                                                                                                                                       
  ✓ Provisioning resources...                                                                                                                                       
Done.                                                                                                                                                               
Execution [my-job-4cmjb] has successfully started running.
 
View details about this execution by running:
gcloud run jobs executions describe my-job-4cmjb
 
Or visit https://console.cloud.google.com/run/jobs/executions/details/us-central1/my-job-4cmjb/tasks?...

Each execution of the job gets a unique Execution ID. You can see these in the job details page.

[Screenshot: cloud run job executions]

Click on an execution id to bring up the execution details page.

[Screenshot: cloud run job execution details]

The execution details include the status of the job as well as the job logs. Notice the "Hello World" line in the logs! 😀
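The status shown for an execution depends on how each task's process exits. Here's a minimal sketch of a main.py that reports failure explicitly, assuming Cloud Run treats exit code 0 as success and any other exit code as a failed task. The run() wrapper is a hypothetical helper of mine, not part of any Google library.

```python
import sys
import traceback


def hello():
    # Anything printed to stdout or stderr shows up in the job logs
    print("Hello World")


def run(job):
    """Call job() and return an exit code: 0 on success, 1 if it raised."""
    try:
        job()
    except Exception:
        # Send the traceback to stderr so it lands in the logs too
        traceback.print_exc()
        return 1
    return 0


# Start script
if __name__ == "__main__":
    exit_code = run(hello)
    if exit_code != 0:
        # A non-zero exit code signals that this run failed
        sys.exit(exit_code)
```

An uncaught exception already makes Python exit with a non-zero code, so the wrapper is mostly useful when you want to control what gets logged before the process exits.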

What's a task and why would I need more than one?

You may have noticed that Cloud Run has the concept of tasks. A single cloud run job can have one or more (up to 10,000) tasks. Tasks run in parallel, each on its own instance. They're a way of breaking up the work for a big job into smaller pieces. For example,

  • If your cloud run job emails a list of customers, you might dedicate one task per email address
  • If your cloud run job copies data from one database to another, you might dedicate one task per table
  • If your cloud run job fetches data from an API, you might dedicate one task per API call

The important thing to keep in mind is that tasks cannot talk to each other. So, the work a task does should (ideally) be independent of all other tasks.

You can use built-in environment variables to identify a task:

  • CLOUD_RUN_TASK_INDEX: the index of this task, starting at 0
  • CLOUD_RUN_TASK_COUNT: the total number of tasks for this job execution (set via the --tasks parameter)
  • CLOUD_RUN_TASK_ATTEMPT: the number of times this task has been retried, starting at 0 for the first attempt
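To make that concrete, here's a rough sketch of how a script might use the task index and task count to pick its own share of the work. The email list and the task_slice() helper are hypothetical, made up for illustration. The defaults of 0 and 1 are my own assumption so the script also runs on a local machine where these variables aren't set.

```python
import os


def task_slice(items, task_index, task_count):
    """Return the items this task should handle, taking every task_count-th item."""
    return items[task_index::task_count]


def main():
    # Cloud Run sets these for each task; the defaults let the script run locally
    task_index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    task_count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))

    # Hypothetical work list; a real job might load this from a file or database
    emails = ["a@example.com", "b@example.com", "c@example.com", "d@example.com"]

    for email in task_slice(emails, task_index, task_count):
        print(f"Task {task_index} of {task_count} handling {email}")


# Start script
if __name__ == "__main__":
    main()
```

This only works if every task sees the same list in the same order. Since tasks can't talk to each other, each one has to work out its slice on its own.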

How do I change the number of tasks?

Use the --tasks parameter in the deploy command.

gcloud run jobs deploy my-job \
    --source . \
    --tasks 50 \
    --region us-central1 \
    --project=my-project

How do I update an existing job?

If you make a change to an existing job's source code, you can update the job using gcloud run jobs update.

gcloud run jobs update JOB_NAME

How do I use a specific version of Python?

By default, Google will use the latest stable version of the Python interpreter.[1] You can specify a particular Python version by including a .python-version file in your application's root directory.

.python-version
3.9.9
Use Python version 3.9.9
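If you want to confirm which interpreter the job actually ran on, one option is to print the version at startup and look for it in the job logs. This is just a sketch; python_version() is a hypothetical helper name.

```python
import sys


def python_version():
    """Return the running interpreter's version as a 'major.minor.micro' string."""
    return ".".join(str(part) for part in sys.version_info[:3])


# Start script
if __name__ == "__main__":
    print(f"Running on Python {python_version()}")
```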

How do I specify dependent packages to be installed?

Use a requirements.txt file in your application's root directory to specify dependencies for your application. For example,

requirements.txt
numpy
pandas==2.2.1

The specified dependencies will be installed via pip.[2]

How do I incorporate environment variables?

You can use the --set-env-vars parameter in the deploy command.

gcloud run jobs deploy my-job \
    --source . \
    --set-env-vars SLEEP_MS=10000 \
    --region us-central1 \
    --project=my-project
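Inside the script, the variable can be read with os.environ. Here's a minimal sketch, assuming SLEEP_MS means "milliseconds to sleep" (that's my guess at its purpose) and falling back to 0 when it isn't set. The sleep_seconds() helper is hypothetical.

```python
import os
import time


def sleep_seconds(default_ms=0):
    """Read SLEEP_MS from the environment and convert it to seconds."""
    raw = os.environ.get("SLEEP_MS", str(default_ms))
    return int(raw) / 1000


# Start script
if __name__ == "__main__":
    seconds = sleep_seconds()
    print(f"Sleeping for {seconds} seconds")
    time.sleep(seconds)
    print("Done")
```

Environment variables always arrive as strings, so convert them to the type you need before using them.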

How do I schedule a job to run on a recurring basis?

You can schedule a cloud run job to execute on a recurring basis.

  1. In the Google Cloud Console, go to the job details page

  2. Click on the TRIGGERS tab

    [Screenshot: Cloud run job trigger]

  3. Click ADD SCHEDULER TRIGGER

  4. Fill out the schedule details like the name, region, frequency, and timezone.


Footnotes

  1. https://cloud.google.com/docs/buildpacks/python#specifying_the_python_version

  2. https://cloud.google.com/docs/buildpacks/python#specifying_dependencies_with_pip