How to Fix Ansible Certificate Verify Failed on Google Cloud Run


Troubleshooting Guide: Ansible “Certificate Verify Failed” on Google Cloud Run

As Senior DevOps Engineers, we often encounter the dreaded “Certificate Verify Failed” error. When Ansible, a tool designed for remote automation, hits this wall while running on a Google Cloud Run instance, it indicates a trust issue within your container’s environment. This guide will walk you through diagnosing and resolving it.


1. The Root Cause: Why This Happens on Google Cloud Run

Google Cloud Run executes your applications within stateless, containerized environments. The “Certificate Verify Failed” error arises when Ansible, attempting to connect to a remote server (e.g., a Git repository, a custom API, a cloud provider endpoint) over HTTPS, cannot validate the SSL/TLS certificate presented by that server.

The core reasons for this are typically:

  • Minimal Base Image: Your container’s base image (e.g., Alpine, a lean Python image) might lack a comprehensive or up-to-date collection of trusted root CA certificates. These are essential for validating certificates issued by common Certificate Authorities.
  • Custom or Internal CAs: If you’re connecting to internal services, private Git repositories, or APIs secured by certificates issued by an internal Certificate Authority (CA) that isn’t publicly trusted, your container’s CA store won’t recognize it.
  • SSL Interception (Corporate Proxies): In enterprise environments, corporate proxies often intercept SSL traffic, presenting their own certificate to clients. If your container doesn’t trust this proxy’s CA, verification will fail.
  • Outdated CA Certificates: Even if the ca-certificates package is installed, its bundle might be outdated and lack the root CA for the server you’re trying to connect to.

On Cloud Run, because instances are immutable containers, any changes to the trusted certificate store must be baked into the Docker image itself during the build process.
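
To see which of these causes applies, it can help to ask Python itself where it looks for trusted CAs. The following is a minimal diagnostic sketch using only the standard library; the function name describe_trust_store is hypothetical, and the output depends on the base image.

```python
import os
import ssl


def describe_trust_store() -> dict:
    """Report where Python's ssl module looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    context = ssl.create_default_context()
    return {
        # Bundle file actually in use; None when the file does not exist.
        "cafile": paths.cafile,
        # Directory of hashed certificates; None when it does not exist.
        "capath": paths.capath,
        # Location compiled into OpenSSL, whether or not it exists.
        "openssl_cafile": paths.openssl_cafile,
        "cafile_exists": bool(paths.cafile) and os.path.isfile(paths.cafile),
        # Counts only CA certificates already loaded into the context.
        "loaded_ca_count": context.cert_store_stats()["x509_ca"],
    }
```

Calling describe_trust_store() inside the container and printing the result shows at a glance whether a bundle file exists at all. A missing cafile or a count of zero usually points to the minimal-base-image case.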


2. Quick Fix (CLI - for Testing & Diagnostics)

While not a long-term solution due to security implications, you can temporarily disable certificate verification for diagnostic purposes or for quick testing on Cloud Run using environment variables.

WARNING: Disabling certificate verification exposes your connections to potential Man-in-the-Middle attacks. Do not use this in production.

Option A: Disable Python’s HTTPS Verification (Ansible’s underlying library)

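The PYTHONHTTPSVERIFY variable is honored only by the Python 2.7 standard library (see PEP 493). Python 3 interpreters and the requests library ignore it, so it has no effect on Python 3 images such as the ones used later in this guide. On Python 3, disable verification for diagnostics per task instead, by setting validate_certs: false on modules such as uri or get_url. On an image where the variable does apply, set it as follows: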

  1. Locally (for testing your image build):
    # When running your container locally for testing
    docker run -e PYTHONHTTPSVERIFY=0 your-ansible-image-name ansible-playbook your_playbook.yml
  2. On Google Cloud Run (via gcloud): When deploying or updating your Cloud Run service:
    gcloud run services update YOUR_SERVICE_NAME \
      --region YOUR_REGION \
      --update-env-vars PYTHONHTTPSVERIFY=0
    This sets PYTHONHTTPSVERIFY=0 as an environment variable for all instances of your Cloud Run service.

Option B: For Git Operations (if Ansible interacts with Git)

If Ansible is performing git clone or git fetch operations within the container and encountering SSL issues with your Git server:

  1. On Google Cloud Run (via gcloud):
    gcloud run services update YOUR_SERVICE_NAME \
      --region YOUR_REGION \
      --update-env-vars GIT_SSL_NO_VERIFY=true
    GIT_SSL_NO_VERIFY is the direct way to disable verification for Git. Setting GIT_SSL_CAINFO to an empty value or a non-existent file does not disable verification and is not a substitute.

3. Configuration Check (The Proper Solution: Dockerfile)

The correct way to resolve “Certificate Verify Failed” on Cloud Run is by ensuring your container image has the necessary trusted CA certificates. This involves modifying your Dockerfile.

3.1. Add/Update System CA Certificates

Ensure your container’s base image includes or updates its standard CA certificate store.

For Debian/Ubuntu-based images (e.g., python:3.9-slim-buster; Debian buster is end-of-life, so prefer a currently supported tag for new builds):

FROM python:3.9-slim-buster

# Install ca-certificates and update them
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && update-ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# (Rest of your Dockerfile, e.g., install Ansible, copy playbooks)
# COPY requirements.txt .
# RUN pip install -r requirements.txt
# COPY . /app
# WORKDIR /app
# CMD ["ansible-playbook", "your_playbook.yml"]

For Alpine-based images (e.g., python:3.9-alpine):

FROM python:3.9-alpine

# Install ca-certificates
RUN apk add --no-cache ca-certificates

# Ensure ca-certificates are symlinked correctly (often done by apk add)
# This step is usually not explicitly needed for apk, but good to know
# RUN update-ca-certificates # Alpine handles this automatically on install

# (Rest of your Dockerfile)
# COPY requirements.txt .
# RUN pip install -r requirements.txt
# COPY . /app
# WORKDIR /app
# CMD ["ansible-playbook", "your_playbook.yml"]
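
To check whether an installed bundle is stale, one option is to list the CA certificates in it that have already expired. This is a rough sketch using only the standard library; the function name expired_roots is hypothetical, and an expired root is only one symptom of an outdated bundle (a missing newer root will not show up here).

```python
import ssl
import time


def expired_roots(cafile=None, now=None):
    """Return the subjects of expired CA certificates in a bundle.

    With cafile=None the interpreter's default trust store is inspected.
    """
    context = ssl.create_default_context(cafile=cafile)
    now = time.time() if now is None else now
    expired = []
    for cert in context.get_ca_certs():
        # notAfter is a string such as "Nov 21 23:59:59 2030 GMT".
        not_after = ssl.cert_time_to_seconds(cert["notAfter"])
        if not_after < now:
            expired.append(cert.get("subject"))
    return expired
```

Running expired_roots("/etc/ssl/certs/ca-certificates.crt") inside the image and getting a long list back suggests the base image or the ca-certificates package needs updating.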

3.2. Add Custom/Internal CA Certificates

If you are connecting to endpoints secured by internal or custom CAs, you need to add their public certificates to your container’s trust store.

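  1. Place your .crt file: Put your custom CA certificate file (e.g., my-internal-ca.crt) in the same directory as your Dockerfile. The file must be PEM-encoded and keep the .crt extension, because update-ca-certificates only picks up files ending in .crt.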

  2. Modify your Dockerfile:

    For Debian/Ubuntu-based images:

    FROM python:3.9-slim-buster
    
    # Install ca-certificates
    RUN apt-get update \
        && apt-get install -y --no-install-recommends ca-certificates \
        && rm -rf /var/lib/apt/lists/*
    
    # Copy and add custom CA certificate
    COPY my-internal-ca.crt /usr/local/share/ca-certificates/my-internal-ca.crt
    RUN update-ca-certificates
    
    # (Rest of your Dockerfile)

    For Alpine-based images:

    FROM python:3.9-alpine
    
    # Install ca-certificates
    RUN apk add --no-cache ca-certificates
    
    # Copy the custom CA certificate and rebuild the trust store
    COPY my-internal-ca.crt /usr/local/share/ca-certificates/my-internal-ca.crt
    # Copying the file alone does not update the bundle; run update-ca-certificates explicitly.
    RUN update-ca-certificates
    # Fallback if update-ca-certificates is unavailable: append the certificate to the bundle directly.
    # RUN cat /usr/local/share/ca-certificates/my-internal-ca.crt >> /etc/ssl/certs/ca-certificates.crt
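
After the build, it is worth confirming that the custom CA actually landed in the merged bundle. The sketch below compares PEM blocks by their decoded bytes; the helper names pem_to_der_blocks and bundle_contains are hypothetical, and the file paths mentioned afterwards are the ones used in this guide.

```python
import base64
import re

# Matches the base64 body of each PEM-encoded certificate in a text.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.+?)-----END CERTIFICATE-----", re.DOTALL
)


def pem_to_der_blocks(pem_text: str) -> set:
    """Decode every PEM certificate block in the text to its raw bytes."""
    return {
        base64.b64decode("".join(body.split()))
        for body in PEM_BLOCK.findall(pem_text)
    }


def bundle_contains(bundle_text: str, ca_text: str) -> bool:
    """True if every certificate in ca_text also appears in bundle_text."""
    wanted = pem_to_der_blocks(ca_text)
    # An empty ca_text counts as "not found" rather than trivially present.
    return bool(wanted) and wanted <= pem_to_der_blocks(bundle_text)
```

Inside the container, read /etc/ssl/certs/ca-certificates.crt and /usr/local/share/ca-certificates/my-internal-ca.crt as text and pass them to bundle_contains. A False result means update-ca-certificates did not run or did not pick up the file.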

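3.3. Point Tools at the System CA Bundle

Not every tool reads the system trust store by default. Python’s requests library, for example, uses the certifi bundle it ships with unless REQUESTS_CA_BUNDLE is set, so a custom CA added with update-ca-certificates is not picked up automatically. If Ansible uses requests or git for its operations, set environment variables that point them to the system’s CA bundle: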

FROM python:3.9-slim-buster
# ... (install ca-certificates and custom CAs as above) ...

# Explicitly tell Python's requests library and Git where to find trusted CAs
ENV REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
ENV GIT_SSL_CAINFO=/etc/ssl/certs/ca-certificates.crt

# (Rest of your Dockerfile)
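
To confirm the variables are visible at runtime, a small helper can summarize which CA source each tool would consult. This is an illustrative sketch, not an official API: the function name effective_ca_sources and the fallback labels are made up for this example. The lookup order shown for requests (REQUESTS_CA_BUNDLE, then CURL_CA_BUNDLE, then certifi) reflects how requests resolves its bundle when it trusts the environment.

```python
import os


def effective_ca_sources(env=None) -> dict:
    """Summarize which CA bundle each tool would use for a given environment."""
    env = os.environ if env is None else env
    return {
        # requests checks REQUESTS_CA_BUNDLE first, then CURL_CA_BUNDLE,
        # and otherwise falls back to the certifi bundle it ships with.
        "requests": env.get("REQUESTS_CA_BUNDLE")
        or env.get("CURL_CA_BUNDLE")
        or "certifi default",
        # Git prefers GIT_SSL_CAINFO over its http.sslCAInfo setting.
        "git": env.get("GIT_SSL_CAINFO") or "http.sslCAInfo or libcurl default",
        # OpenSSL (and so Python's ssl module) honors SSL_CERT_FILE.
        "python_ssl": env.get("SSL_CERT_FILE") or "OpenSSL default",
    }
```

Printing effective_ca_sources() inside the container should show /etc/ssl/certs/ca-certificates.crt for requests and git once the ENV lines above are in place.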

3.4. Ansible Module Configuration (If Applicable)

ansible.cfg has no setting that controls the CA bundle for HTTP modules such as uri. Certificate options are set per task instead (or once per play through module_defaults). This is rarely needed when the container’s trust store is correct, but it can be relevant for specific modules.

For instance, for the uri module (the ca_path parameter requires ansible-core 2.11 or later):

- name: Call an internal API using the system CA bundle
  ansible.builtin.uri:
    url: https://your-problematic-endpoint.com
    # This assumes your custom CA is merged into the bundle.
    # This path should exist within the container.
    ca_path: /etc/ssl/certs/ca-certificates.crt
    validate_certs: true  # default, but good to be explicit

4. Verification

After building your new Docker image with the CA certificate updates and deploying it to Cloud Run, verify the fix.

4.1. Run the Ansible Playbook

The most straightforward verification is to simply run the Ansible playbook or task that was previously failing. If the issue is resolved, it should complete successfully.

4.2. Exec into the Container (Local Debugging)

Cloud Run has no equivalent of kubectl exec for opening a shell inside a running instance, so debug the image locally instead:

  1. Test locally: Run your Docker image locally and exec into it to perform checks.
    # /bin/sh exists on both Debian and Alpine images; Alpine does not ship bash by default
    docker run -it --entrypoint /bin/sh your-ansible-image-name
  2. Inside the local container, check:
    • List CA certificates:
      ls -l /etc/ssl/certs/
      # You should see ca-certificates.crt (or similar) and potentially your custom CA.
      # On Alpine, it might be in /etc/ssl/certs/ or symlinked from /usr/share/ca-certificates.
    • Use curl to test the problematic endpoint:
      curl -v https://your-problematic-endpoint.com
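      Look for “SSL certificate verify ok.” in the verbose output. If verification fails, curl reports which part of the certificate chain was rejected. Slim and Alpine base images may not include curl; install it in a debug build if needed.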
    • Use a simple Python script:
      python -c "import requests; print(requests.get('https://your-problematic-endpoint.com').status_code)"
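      If this prints 200 (or another HTTP status code) without raising an SSLError, requests is validating the certificate successfully. This assumes requests is installed in the image. Because requests defaults to its bundled certifi store, a custom CA is only trusted here when REQUESTS_CA_BUNDLE is set.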
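
When neither curl nor requests is available in the image, the same check can be done with the standard library alone. The following is a minimal sketch; the function name check_tls is hypothetical, and the host you pass in is whatever endpoint was failing.

```python
import socket
import ssl


def check_tls(host: str, port: int = 443, cafile=None, timeout: float = 5.0):
    """Attempt a verified TLS handshake and return (ok, detail)."""
    # With cafile=None the system default trust store is used.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, f"verified, negotiated {tls.version()}"
    except ssl.SSLCertVerificationError as exc:
        # Raised when the chain or hostname cannot be validated.
        return False, f"certificate verify failed: {exc.verify_message}"
    except OSError as exc:
        # Covers DNS failures, refused connections, timeouts, other TLS errors.
        return False, f"connection error: {exc}"
```

For example, check_tls("your-problematic-endpoint.com") returns a tuple whose second element names the verification problem, such as an unknown issuer. Passing cafile="/etc/ssl/certs/ca-certificates.crt" tests the bundle built in the Dockerfile explicitly.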

By systematically addressing the container’s trust store, you can ensure your Ansible automation runs reliably and securely on Google Cloud Run.