How to Retry a Task Until Condition Met in Ansible

Ansible Retry Examples

Ansible allows you to retry a task until a specified condition is met. This feature is useful when working with tasks that depend on external events or conditions, such as waiting for a service to become available or checking if a specific file exists. The until directive in Ansible is used in conjunction with the retries and delay parameters to define the retry logic.

This guide will walk you through practical examples of how to use until in Ansible to retry a task until a condition is met.

Example 1: Retry Until a File Exists

In this example, we’ll retry a stat check until a specific file (/tmp/example.txt) exists on the target host (localhost in this play).

- name: Retry Until a File Exists
  hosts: localhost
  tasks:
    - name: Check if file exists
      stat:
        path: /tmp/example.txt
      register: file_check
      until: file_check.stat.exists
      retries: 5
      delay: 10

Here, Ansible connects to the target host and the registered result of the stat task (file_check.stat.exists) tells us whether the file exists. Because of the until condition, if the file does not exist, Ansible waits 10 seconds and runs the check again, up to 5 retries. If the file appears before the retries are exhausted, the task succeeds and is reported as ok (stat never reports changed); otherwise the task fails.
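
When a task uses until, the registered result also records how many attempts were made, in an attempts key. As a minimal sketch that reuses the task above (the second task and its message are illustrative additions, not part of the original example), you can report the count once the file has appeared:

- name: Report how many attempts the file check needed
  hosts: localhost
  tasks:
    - name: Check if file exists
      stat:
        path: /tmp/example.txt
      register: file_check
      until: file_check.stat.exists
      retries: 5
      delay: 10

    # Runs only if the check above eventually succeeded
    - name: Show the attempt count
      debug:
        msg: "File found after {{ file_check.attempts }} attempt(s)."

If the file never appears, the first task fails and the debug task is skipped for that host.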

Example 2: Retry Until a Service is Active

In this example, we’ll check that a service (nginx) is running. If it’s not active, the task will retry until it becomes active.

- name: Retry Until Nginx is Running
  hosts: all
  tasks:
    - name: Check Nginx status
      command: systemctl is-active nginx
      register: nginx_status
      until: nginx_status.stdout == "active"
      retries: 5
      delay: 5

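systemctl is-active prints the state of the unit and exits with a non-zero code when the unit is not active. If the output is not exactly active, Ansible waits 5 seconds and runs the command again, up to 5 retries. If the service never becomes active, the task fails and the play stops for that host.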

Example 3: Retry Until a Port is Open

In this example, we will wait for a specific port (8080) to become open on the remote host.

- name: Retry Until Port 8080 is Open
  hosts: all
  tasks:
    - name: Check if port 8080 is open
      shell: netstat -tuln | grep :8080
      register: port_check
      failed_when: port_check.rc != 0 and port_check.stdout == ""
      until: port_check.rc == 0
      retries: 10
      delay: 3

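This playbook waits for port 8080 to open on all hosts, retrying the check up to 10 times with a 3-second interval between attempts. grep exits with code 0 only when it finds a matching line, so the until condition is met as soon as a listening socket on port 8080 shows up in the netstat output. The failed_when line specifies that an attempt counts as failed only if the return code is not 0 and the output (stdout) is empty, which is exactly what happens when grep finds no match.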

Example 4: Retry Until Output Matches a Pattern

In this example, we will verify if the output of a command matches a specific pattern.

- name: Retry Until Command Output Matches
  hosts: all
  tasks:
    - name: Run command and check output
      command: echo "ready"
      register: command_output
      until: command_output.stdout.find("ready") != -1
      retries: 5
      delay: 2

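This playbook retries a command until its output contains the word “ready”, retrying up to 5 times with a 2-second delay between attempts. The echo command is only a placeholder and always satisfies the condition on the first attempt; in practice, replace it with the command whose output you need to poll.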
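
For a real pattern rather than a fixed substring, the search test that Ansible provides can be used in the until condition. The following is a sketch under assumptions: the script path /usr/local/bin/check_status.sh is hypothetical, and the pattern ready|ok is only an illustration.

- name: Retry Until Output Matches a Regular Expression
  hosts: all
  tasks:
    - name: Poll a status script
      # Hypothetical script; replace with the command you need to poll
      command: /usr/local/bin/check_status.sh
      register: status_output
      # A read-only check should not be reported as a change
      changed_when: false
      # search matches the regular expression anywhere in stdout
      until: status_output.stdout is search("ready|ok")
      retries: 5
      delay: 2

Use the match test instead of search if the pattern must match from the beginning of the output.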

Example 5: Use Retry and wait_for Together

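You can combine wait_for and until when a task needs two kinds of waiting: wait_for blocks until a low-level condition holds (here, a port accepting connections), and until re-runs an application-level check until it passes.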

- name: Ensure the service is running and accessible
  hosts: localhost
  tasks:
    - name: Wait for port 8080 to become available
      ansible.builtin.wait_for:
        port: 8080
        host: 127.0.0.1
        timeout: 30
      register: wait_result

    - name: Verify if port 8080 became available
      debug:
        msg: "Port 8080 is available. Proceeding to health check."
      when: wait_result.state == "started"

    - name: Retry checking the service health endpoint
      # -f makes curl exit non-zero on HTTP error responses (400 and above)
      command: curl -sf http://127.0.0.1:8080/health
      register: health_check
      retries: 5
      delay: 10
      until: health_check.rc == 0

    - name: Display health check success message
      debug:
        msg: "Service health check passed successfully!"

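In this case, wait_for blocks until port 8080 accepts connections and fails the task if that takes longer than 30 seconds. The play then runs the curl command against the health endpoint and retries it, up to 5 times at 10-second intervals, until curl exits with code 0. If curl still exits with a non-zero code after the last retry, the task fails and the success message is never displayed.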
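
As an alternative to calling curl, the built-in uri module can check the HTTP status code directly. This is a minimal sketch that assumes the same /health endpoint as above and assumes that a healthy service answers with HTTP 200:

- name: Retry a health check with the uri module
  hosts: localhost
  tasks:
    - name: Poll the health endpoint until it returns HTTP 200
      ansible.builtin.uri:
        url: http://127.0.0.1:8080/health
        status_code: 200
      register: health_check
      # Retried on connection errors and on any status other than 200
      until: health_check.status == 200
      retries: 5
      delay: 10

This avoids depending on curl being installed on the host and makes the success criterion (the status code) explicit.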

Conclusion

Retrying tasks until a condition is met is an essential feature in Ansible, especially when dealing with dynamic and unpredictable environments. The until directive, combined with retries and delay, provides a structured way to handle such scenarios, ensuring your tasks succeed only when specific criteria are fulfilled.

FAQs

1. How do I debug why an until condition is not met?

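Run the playbook with increased verbosity (for example, -vv) to see the result of each failed attempt, or use the debug module to print the registered variable and analyze why the condition is failing. Keep in mind that a task whose until condition is never met fails, so a following debug task only runs if the retried task sets ignore_errors or sits inside a block with a rescue section.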

2. Can I ignore errors for tasks using until?

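Yes, you can add ignore_errors: yes to tasks using until so that the play continues after the final retry fails. The failure is still reported in the output, so ensure this aligns with your automation logic and that later tasks do not assume the condition was met.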
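
The two answers above can be combined. As a sketch that reuses the Nginx check from Example 2 (the second task is an illustrative addition), the play below keeps going after the retries are exhausted and prints the last registered result:

- name: Inspect a task whose until condition was never met
  hosts: all
  tasks:
    - name: Check Nginx status
      command: systemctl is-active nginx
      register: nginx_status
      until: nginx_status.stdout == "active"
      retries: 5
      delay: 5
      changed_when: false
      # Let the play continue so the result can be inspected
      ignore_errors: true

    - name: Print the last registered result
      debug:
        var: nginx_status
      when: nginx_status is failed

The printed variable includes stdout, rc, and attempts, which usually shows why the condition did not match.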

3. How does Ansible determine when to stop retrying a task?

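Ansible stops retrying once the until condition is met or the maximum number of retries (retries) is reached. In the second case, the task is marked as failed. If you set until without retries or delay, Ansible uses its defaults of 3 retries and a 5-second delay.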

About Hitesh Jethva

I am Hitesh Jethva, Founder and Author at Code2DevOps.com. With over 15 years of experience in DevOps and open source technologies, I am passionate about empowering teams through automation, continuous integration, and scalable solutions.
