State Management

YAML Workflow persists state for each run so that it can track workflow execution, handle failures, and resume interrupted workflows.

State Storage

State File Structure

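The workflow state is stored in the workspace directory, in .workflow_state.json or a similarly named file (often inside .workflow_metadata.json). The exact structure evolves, but conceptually it includes the fields below. The // comments are annotations for the reader; standard JSON does not allow comments.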

{
    "workflow_id": "unique-workflow-id",
    "name": "workflow-name",
    "start_time": "2025-04-14T10:00:00Z",
    "last_updated": "2025-04-14T10:05:00Z",
    "status": "running", // Overall workflow status
    // Information about the execution progress and failures:
    "execution_state": {
        "status": "running", // Current execution status (running, failed, completed)
        "current_step": "step2",
        "failed_step": null, // Populated on failure
        "step_outputs": {   // Stores results of successfully completed steps
            "step1": {       // Key is the step name
                "result": "Output from Step 1" // The actual return value of the task is here
            }
            // step2 output would appear here upon completion
        },
        "retry_state": { // Information about retries
            "step_1_retry": {
                "attempt": 1,
                "max_attempts": 3
            }
        }
        // Other execution details like error messages might be stored here too
    },
    "variables": { // Snapshot of context variables (args, env etc. might be here)
        "user_input": "value",
        "computed_value": 42
    }
}
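
As an illustration of reading this structure, the sketch below loads a state file and lists the steps that have recorded outputs. It assumes the file is plain JSON that uses the field names from the example above. `completed_steps` is a hypothetical helper, not part of YAML Workflow.

```python
import json
import tempfile
from pathlib import Path


def completed_steps(state_path: Path) -> list[str]:
    """Return the names of steps that have an entry in step_outputs."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    # Use .get() throughout because the exact structure may change.
    outputs = state.get("execution_state", {}).get("step_outputs", {})
    return list(outputs)


# Demo against a throwaway file that mimics the example structure.
sample = {
    "workflow_id": "unique-workflow-id",
    "execution_state": {
        "status": "running",
        "current_step": "step2",
        "failed_step": None,
        "step_outputs": {"step1": {"result": "Output from Step 1"}},
    },
}
with tempfile.TemporaryDirectory() as workspace:
    path = Path(workspace) / ".workflow_state.json"
    path.write_text(json.dumps(sample), encoding="utf-8")
    done = completed_steps(path)
    print(done)
```

Reading defensively with `.get()` keeps such a script working if optional fields are absent.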

State Lifecycle

  1. Initialization
     • Created when workflow starts
     • Records start time and initial parameters
     • Initializes empty step tracking

  2. Step Execution
     • Updates current step
     • Records step outputs
     • Tracks execution time

  3. Completion/Failure
     • Records final status
     • Preserves outputs for resume
     • Logs error information if failed
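
The lifecycle above can be sketched as a few small functions over a dictionary shaped like the example state. This is only an illustration under that assumption: the function names and the `error` field are hypothetical and do not reflect YAML Workflow's internals.

```python
from datetime import datetime, timezone


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def init_state(name: str, params: dict) -> dict:
    """Initialization: record start time, parameters, and empty step tracking."""
    return {
        "name": name,
        "start_time": _now(),
        "last_updated": _now(),
        "status": "running",
        "execution_state": {
            "status": "running",
            "current_step": None,
            "failed_step": None,
            "step_outputs": {},
        },
        "variables": dict(params),
    }


def start_step(state: dict, step: str) -> None:
    """Step execution: update the current step."""
    state["execution_state"]["current_step"] = step
    state["last_updated"] = _now()


def record_output(state: dict, step: str, result) -> None:
    """Step execution: store the output of a completed step."""
    state["execution_state"]["step_outputs"][step] = {"result": result}
    state["last_updated"] = _now()


def finish(state: dict, failed_step=None, error=None) -> None:
    """Completion/failure: record final status, keeping outputs for resume."""
    status = "failed" if failed_step else "completed"
    state["status"] = status
    exec_state = state["execution_state"]
    exec_state["status"] = status
    exec_state["failed_step"] = failed_step
    if error is not None:
        exec_state["error"] = error  # hypothetical field name
    state["last_updated"] = _now()


state = init_state("demo", {"user_input": "value"})
start_step(state, "step1")
record_output(state, "step1", "ok")
start_step(state, "step2")
finish(state, failed_step="step2", error="boom")
print(state["status"], state["execution_state"]["failed_step"])
```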

Resume Capability

Resuming Failed Workflows

# Resume from last successful step
yaml-workflow run --resume workflow.yaml

# Resume from specific step
yaml-workflow run --resume --from-step step2 workflow.yaml

# Resume with modified parameters
yaml-workflow run --resume --override-params workflow.yaml

Resume Behavior

  1. State Loading
     • Reads previous state file
     • Validates step consistency
     • Merges new parameters

  2. Execution Strategy
     • Skips completed steps
     • Restarts from failure point
     • Preserves previous outputs

  3. State Updates
     • Maintains execution history
     • Updates modified steps
     • Tracks resume attempts
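
One way to picture the execution strategy is a function that decides which steps still need to run. The sketch below assumes the `execution_state` layout from the state file example. `plan_resume` is a hypothetical name, and the real engine may decide differently.

```python
def plan_resume(step_names, execution_state, from_step=None):
    """Return the steps to run on resume, in workflow order."""
    done = set(execution_state.get("step_outputs", {}))

    # Validate step consistency: saved steps must still exist in the workflow.
    unknown = done - set(step_names)
    if unknown:
        raise ValueError(f"state refers to unknown steps: {sorted(unknown)}")

    # An explicit starting step wins; otherwise restart at the failure point.
    start = from_step or execution_state.get("failed_step")
    if start is not None:
        if start not in step_names:
            raise ValueError(f"unknown step: {start}")
        return step_names[step_names.index(start):]

    # No failure recorded: skip every step that already has an output.
    return [name for name in step_names if name not in done]


steps = ["step1", "step2", "step3"]
saved = {
    "failed_step": "step2",
    "step_outputs": {"step1": {"result": "Output from Step 1"}},
}
to_run = plan_resume(steps, saved)
rerun_all = plan_resume(steps, saved, from_step="step1")
print(to_run, rerun_all)
```

Outputs of skipped steps stay in `step_outputs`, so later steps can still read them.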

Progress Tracking

Step Progress

steps:
  - name: long_running_step
    task: batch
    params:
      progress_update: true  # Enable progress tracking
      progress_interval: 5   # Update every 5 seconds

Progress Information

  • Step completion percentage
  • Estimated time remaining
  • Resource usage
  • Error counts
  • Retry attempts
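
The documentation does not say how completion percentage and estimated time remaining are computed. As a rough sketch, assuming a simple linear extrapolation from the items processed so far (the function name `progress_snapshot` is hypothetical):

```python
def progress_snapshot(done_items: int, total_items: int, elapsed_seconds: float) -> dict:
    """Compute completion percentage and a linear estimate of time remaining."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    percent = 100.0 * done_items / total_items
    if done_items == 0:
        eta = None  # no throughput observed yet, so no estimate
    else:
        # Remaining time = average time per item * items left.
        eta = elapsed_seconds * (total_items - done_items) / done_items
    return {"percent": percent, "eta_seconds": eta}


snapshot = progress_snapshot(done_items=25, total_items=100, elapsed_seconds=10.0)
print(snapshot)
```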

Error Handling

Retry Mechanism

steps:
  - name: api_call
    task: http_request
    retry:
      max_attempts: 3
      delay: 5
      backoff: 2
      on_error:
        - ConnectionError
        - TimeoutError
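
To make the retry settings concrete, here is a minimal sketch of retrying with exponential backoff. It assumes that `delay` is the initial wait in seconds and that each later wait is multiplied by `backoff`; the engine's actual formula may differ. `call_with_retry` is a hypothetical helper.

```python
import time


def call_with_retry(func, max_attempts=3, delay=5, backoff=2,
                    on_error=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func, retrying the listed errors with exponentially growing waits."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except on_error:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Assumed formula: wait = delay * backoff ** (attempt - 1)
            sleep(delay * backoff ** (attempt - 1))


calls = {"count": 0}


def flaky():
    """Fail twice with a retryable error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


waits = []
# Record the waits instead of sleeping so the demo finishes immediately.
result = call_with_retry(flaky, sleep=waits.append)
print(result, waits)  # ok [5, 10]
```

Errors that are not listed in `on_error` propagate immediately without a retry.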

Error Recovery

  1. Automatic Recovery
     • Configurable retry policies
     • Exponential backoff
     • Error-specific handling

  2. Manual Intervention
     • State inspection tools
     • Manual retry capability
     • Step skip options

State Management API

Reading State

from yaml_workflow.state import WorkflowState

# Load current state
state = WorkflowState.load("workflow_id")

# Access state information
current_step = state.current_step
outputs = state.step_outputs
variables = state.variables

Modifying State

# Update state (`state` is the WorkflowState object loaded under "Reading State")
state.update_step("step1", status="completed", output="result")
state.set_variable("key", "value")
state.mark_completed()

# Save changes
state.save()

State Cleanup

Automatic Cleanup

  • Successful workflows: Optional state retention
  • Failed workflows: State preserved for resume
  • Configurable retention policy

Manual Cleanup

# Clean old state files
yaml-workflow clean --older-than 30d

# Remove specific workflow state
yaml-workflow clean --workflow-id workflow-123

# Clean all successful states
yaml-workflow clean --status completed
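
For scripted cleanup outside the CLI, an age-based filter like `--older-than 30d` can be approximated with the standard library. The sketch below assumes that state lives in files named .workflow_state.json and that file modification time is a fair proxy for age. `stale_state_files` is a hypothetical helper and only lists candidates rather than deleting them.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def stale_state_files(root, max_age_days, now=None):
    """Return state files under root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        path for path in Path(root).rglob(".workflow_state.json")
        if path.stat().st_mtime < cutoff
    )


# Demo: one workspace backdated by 40 days, one freshly written.
with tempfile.TemporaryDirectory() as root:
    for run_name, age_days in [("old_run", 40), ("new_run", 0)]:
        state_file = Path(root) / run_name / ".workflow_state.json"
        state_file.parent.mkdir()
        state_file.write_text("{}", encoding="utf-8")
        stamp = time.time() - age_days * SECONDS_PER_DAY
        os.utime(state_file, (stamp, stamp))
    stale_names = [p.parent.name for p in stale_state_files(root, 30)]
    print(stale_names)
```

Listing before deleting makes it easy to review what would be removed.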

Best Practices

  1. State File Management
     • Use version control ignore rules
     • Implement backup strategies
     • Clean up old state files

  2. Resume Strategy
     • Design idempotent steps
     • Handle partial completions
     • Validate state consistency

  3. Error Handling
     • Define retry policies
     • Log sufficient context
     • Plan for recovery

  4. Progress Monitoring
     • Enable appropriate tracking
     • Set reasonable intervals
     • Monitor resource usage