How to Fix the Terraform “Too Many Open Files” Error on a DigitalOcean Droplet


Troubleshooting “Terraform Too Many Open Files” on DigitalOcean Droplets

For a Senior DevOps Engineer, few errors are as frustratingly vague yet as critically impactful as “Too Many Open Files.” When Terraform, your infrastructure-as-code workhorse, throws this error on a DigitalOcean Droplet, it means the process has hit a fundamental operating system limit. This guide walks you through diagnosing the problem, applying a quick fix, and making the change permanent.


1. The Root Cause: Understanding File Descriptor Limits

The “Too Many Open Files” error in a Linux environment, including DigitalOcean Droplets, indicates that a process (in this case, Terraform or one of its provider plugins) has attempted to open more file descriptors than it is allowed by the operating system.

What are file descriptors? File descriptors (FDs) are abstract handles used by the kernel to identify various I/O resources. These include:

  • Actual files (e.g., Terraform .tf configurations, state files, log files)
  • Network sockets (e.g., HTTPS connections to the DigitalOcean API, connections to other cloud providers)
  • Pipes
  • Devices

Why Terraform hits this limit: Terraform, especially when managing large infrastructures, performing complex operations, or interacting with numerous resources and providers, can quickly consume FDs. Each provider plugin runs as a separate process, and each connection it makes to a cloud API (like DigitalOcean’s) consumes a socket, which is a file descriptor. When you have many resources, multiple providers, and potentially concurrent Terraform runs, these numbers add up fast, exceeding the default ulimit settings on your Droplet.
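
To see this in action while a plan or apply is running, you can count the descriptors held by Terraform and its provider plugins directly from /proc. A minimal diagnostic sketch (run it as the user that launched Terraform, or with sudo; exact process names vary with your providers):

    # Count open file descriptors for every terraform / provider plugin process
    for pid in $(pgrep -f terraform); do
        echo "$pid $(readlink /proc/$pid/exe): $(ls /proc/$pid/fd | wc -l) open FDs"
    done

    # Inspect one process in detail -- files, sockets, and pipes all appear here
    ls -l /proc/<PID>/fd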

Default ulimit on Droplets: Most Linux distributions commonly used on DigitalOcean Droplets (Ubuntu, Debian, CentOS) ship with conservative default ulimit values for nofile (number of open files), often a soft limit of just 1024 per process. These defaults are fine for desktop use or basic server applications but are frequently insufficient for a dedicated automation host running intensive tooling like Terraform.


2. Quick Fix (CLI): Temporary Session Adjustment

For immediate relief and to confirm that increasing the limit resolves the issue, you can temporarily adjust the nofile limit for your current shell session.

  1. Check Current Limits: First, see what your current soft and hard limits are:

    ulimit -Sn # Soft limit for current user
    ulimit -Hn # Hard limit for current user

    The soft limit (ulimit -Sn) is usually the one you’re hitting.

  2. Increase the Soft Limit (Temporarily): Choose a higher number. A common starting point is 4096 or 8192.

    ulimit -n 8192
    • Note: You cannot set the soft limit higher than the hard limit (ulimit -Hn). If your current hard limit is too low, you’ll need to adjust that first (which typically requires sudo or root privileges, and often a more permanent configuration change as detailed below).
  3. Verify New Limit:

    ulimit -Sn

    Confirm it reflects the new value.

  4. Rerun Terraform: Execute your Terraform command within the same shell session.
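
Steps 2 through 4 can also be combined in a small wrapper script so the raised limit and the Terraform run always share the same shell session (handy for ad-hoc or CI invocations). A minimal sketch; the script name, the 8192 value, and the plan command are illustrative placeholders:

    #!/usr/bin/env bash
    # run-terraform.sh -- illustrative wrapper, not an official tool
    set -euo pipefail

    ulimit -n 8192              # applies to this shell and everything it launches
    echo "Soft nofile limit: $(ulimit -Sn)"

    terraform plan "$@"         # substitute apply/destroy as needed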

Important: This ulimit -n command only affects the current shell session and any processes launched from it. If you open a new terminal, log out and back in, or reboot the Droplet, the limit will revert to its default. For a permanent solution, proceed to the “Configuration Check” section.


3. Configuration Check: Persistent Limit Adjustments

To ensure your Terraform operations consistently have sufficient file descriptor limits, you need to make permanent changes to your Droplet’s configuration.

Option A: Per-User/Group Limits (/etc/security/limits.conf)

This is the standard and recommended method for setting limits for specific users or groups.

  1. Edit limits.conf: Open the /etc/security/limits.conf file with sudo:

    sudo nano /etc/security/limits.conf
  2. Add or Modify Entries: Add lines at the end of the file. You can target a specific user (e.g., terraform_user) or all users (*). It’s best practice to set both soft and hard limits.

    # <domain>      <type>    <item>      <value>
    terraform_user  soft      nofile      8192
    terraform_user  hard      nofile      16384
    # OR for all users:
    # *             soft      nofile      8192
    # *             hard      nofile      16384
    • terraform_user: Replace with the actual username that executes Terraform. If Terraform is run by a CI/CD agent (e.g., jenkins), use that user.
    • soft: The limit that is enforced by the kernel. Processes can increase their soft limit up to the hard limit.
    • hard: The maximum value that the soft limit can be set to. Only root can increase the hard limit.
    • nofile: Specifies the maximum number of open file descriptors.
    • Values: Start with 8192 for soft and 16384 for hard, and adjust as needed.
  3. Ensure PAM is Configured: For limits.conf to take effect, the Pluggable Authentication Modules (PAM) system must load the pam_limits.so module. This is enabled by default on most distributions. You can quickly check by looking for a line similar to session required pam_limits.so in files like:

    • /etc/pam.d/common-session
    • /etc/pam.d/sshd (if connecting via SSH)
    • /etc/pam.d/sudo (if using sudo to run Terraform)
  4. Apply Changes: For these changes to take effect, the user must log out and log back in (or start a new SSH session). A full Droplet reboot is the surest way to apply everything at once.
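
As an alternative to editing /etc/security/limits.conf in place, most distributions also read drop-in files from /etc/security/limits.d/, which keeps your change separate from the stock file. A sketch, assuming the terraform_user account from step 2 and an arbitrary drop-in filename:

    # Write a drop-in (the name 90-terraform.conf is just a convention)
    printf '%s\n' \
        'terraform_user  soft  nofile  8192' \
        'terraform_user  hard  nofile  16384' \
        | sudo tee /etc/security/limits.d/90-terraform.conf

    # Confirm pam_limits.so is referenced somewhere under /etc/pam.d (step 3)
    grep -R pam_limits /etc/pam.d/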

Option B: System-Wide File Descriptor Limit (/etc/sysctl.conf)

While limits.conf sets per-process limits, fs.file-max in sysctl.conf sets the total maximum number of file descriptors that the entire kernel can allocate across all processes. You might need to adjust this if you have many processes, each using many FDs.
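
Before raising it, check how close the system actually is to that ceiling; /proc/sys/fs/file-nr reports the current allocation alongside the maximum:

    # Columns: allocated file handles, allocated-but-unused, system-wide maximum
    cat /proc/sys/fs/file-nr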

  1. Edit sysctl.conf: Open the /etc/sysctl.conf file:

    sudo nano /etc/sysctl.conf
  2. Add or Modify Entry: Add or modify the fs.file-max line to a significantly higher value.

    # Raise the system-wide FD ceiling (example: roughly 2 million)
    fs.file-max = 2097152
  3. Apply Changes: Load the new sysctl settings without rebooting:

    sudo sysctl -p

    Alternatively, a Droplet reboot will also apply these changes.
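
As with the per-user limits, you can keep this setting in a drop-in file under /etc/sysctl.d/ instead of editing /etc/sysctl.conf itself; the filename below is illustrative:

    # Write the setting to a drop-in and reload every sysctl configuration file
    echo 'fs.file-max = 2097152' | sudo tee /etc/sysctl.d/99-terraform.conf
    sudo sysctl --system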

Option C: For Systemd Services (if Terraform is run via a service)

If your Terraform operations are part of a CI/CD pipeline or automation script that runs as a systemd service (e.g., a Jenkins agent, a custom Terraform runner service), limits.conf might not directly apply to the service’s environment. In this case, configure the limit directly within the service’s unit file.

  1. Locate the Service File: Find the .service file for your application (e.g., /etc/systemd/system/jenkins.service or /etc/systemd/system/terraform-runner.service).

  2. Add LimitNOFILE: Edit the service file (sudo nano /etc/systemd/system/<your-service>.service) and add LimitNOFILE under the [Service] section:

    [Service]
    # ... other service configurations ...
    # Sets both the soft and hard nofile limits for this service
    LimitNOFILE=8192
    • LimitNOFILE sets both the soft and hard nofile limits for the service process to the specified value.
  3. Reload Systemd and Restart Service:

    sudo systemctl daemon-reload
    sudo systemctl restart <your-service>
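
If you’d rather not edit the unit file in place (changes to packaged unit files can be lost on upgrade), systemd’s drop-in mechanism achieves the same result. A sketch; <your-service> is a placeholder:

    # Opens an editor and writes /etc/systemd/system/<your-service>.service.d/override.conf
    sudo systemctl edit <your-service>

    # Add these two lines in the editor, then save and exit:
    # [Service]
    # LimitNOFILE=8192

    sudo systemctl daemon-reload          # harmless even if systemctl edit already reloaded
    sudo systemctl restart <your-service>

    # Confirm the limit systemd will apply to the service
    systemctl show <your-service> -p LimitNOFILE
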

4. Verification: Confirming the Fix

After implementing any of the persistent configuration changes, it’s crucial to verify that the new limits are in effect.

  1. Check Per-User Limit (after logout/login or reboot): Log in as the user that runs Terraform and execute:

    ulimit -Sn
    ulimit -Hn

    These should now reflect your configured values.

  2. Check System-Wide Limit (if applicable):

    cat /proc/sys/fs/file-max

    This should show the value you set in /etc/sysctl.conf.

  3. Check a Running Process (optional): If Terraform is actively running (or a service that runs Terraform), you can inspect its limits:

    pgrep -af terraform # Find the PID of the terraform process (also matches provider plugins)
    cat /proc/<PID>/limits | grep "Max open files"

    Replace <PID> with the actual process ID.

  4. Rerun Terraform: Execute your Terraform commands (e.g., terraform plan, terraform apply) to confirm that the “Too Many Open Files” error no longer occurs.
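
If you want your automation to fail fast whenever the limit is still too low, a short preflight check at the start of the pipeline can save a long plan that dies halfway through. A sketch; the 8192 threshold is arbitrary and the script name is illustrative:

    #!/usr/bin/env bash
    # preflight-fd-check.sh -- abort early if the soft nofile limit looks too low
    set -euo pipefail

    required=8192
    current=$(ulimit -Sn)               # assumes a numeric value rather than "unlimited"

    if [ "$current" -lt "$required" ]; then
        echo "Soft nofile limit is $current; need at least $required" >&2
        exit 1
    fi
    echo "Soft nofile limit OK ($current)"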

By systematically addressing the root cause and implementing the appropriate persistent configuration, you can effectively resolve the “Terraform Too Many Open Files” error on your DigitalOcean Droplet, ensuring your infrastructure automation runs smoothly.