Azure Container Apps Jobs – Terraform

Azure Container Apps handle long-running services well — APIs, web apps, background workers. Short-lived tasks are a different story. A PDF export that finishes in 40 seconds. A nightly report that aggregates data and uploads a file. A Service Bus consumer that drains a queue and terminates. Running these as always-on Container Apps means paying for idle replicas.

Azure Container Apps Jobs fix this: spin up, execute, terminate. You pay only for actual compute time.

This article covers all three job trigger types — manual, scheduled, and event-driven — provisioned with Terraform using azurerm_container_app_job. Full source code: container_apps_jobs/.


What Are Container App Jobs?

Container App Jobs are short-lived container executions inside an Azure Container App Environment. Unlike Container Apps (which run indefinitely and serve traffic), Jobs run to completion and exit.

| | Container Apps | Container App Jobs |
|---|---|---|
| Lifecycle | Long-running | Runs to completion, then terminates |
| Use case | APIs, web apps, services | Batch processing, report generation, queue draining |
| Billing | Per-second while replicas run | Per-execution (vCPU-s + GiB-s consumed) |
| Triggers | HTTP, TCP, KEDA scaling | Manual, Scheduled (cron), Event-driven (KEDA) |
| Terraform resource | azurerm_container_app | azurerm_container_app_job |

Rule of thumb: if your container exits after completing its task, use a Job. If it runs indefinitely and serves requests, use a Container App.


Why I Moved DocWriter's Finalize Stage to a Job

In DocWriter Studio, the finalize stage renders PlantUML/Mermaid diagrams and exports documents as PDF and DOCX. Each execution takes 30–60 seconds and exits.

Previously this ran as a Container App with minReplicas = 1. The replica sat idle 95% of the time. Moving it to an event-driven Job triggered by Service Bus messages reduced compute cost to near-zero during quiet periods — the Job scales from 0, processes the request, and terminates.


Architecture

All three job types share the same Container App Environment, pull images from a private Azure Container Registry, and connect to Azure Service Bus:

Resources provisioned by Terraform:

  • Resource Group, VNet, Log Analytics Workspace
  • Container App Environment (Consumption workload profile, VNet-integrated)
  • Azure Container Registry + User Assigned Managed Identity with AcrPull role
  • Service Bus Namespace + Queue (job-tasks)
  • Three azurerm_container_app_job resources (manual, scheduled, event-driven)

Three Trigger Types in Terraform

All three jobs share the same container image — a .NET console app that connects to Service Bus, drains messages, and exits. The difference is the trigger configuration block.

Manual Job — On-Demand Execution

Triggered via Azure Portal, CLI, or REST API. Ideal for ad-hoc PDF regeneration, one-off data migrations, or admin tasks.

resource "azurerm_container_app_job" "manual_job" {
  name                         = "${local.prefix}-manual-job"
  location                     = var.location
  resource_group_name          = azurerm_resource_group.rg.name
  container_app_environment_id = azurerm_container_app_environment.app_env.id
  replica_timeout_in_seconds   = 300
  replica_retry_limit          = 1

  manual_trigger_config {
    parallelism              = 1
    replica_completion_count = 1
  }

  # template, secret, identity, registry blocks omitted — identical across all three jobs
}
  • manual_trigger_config — no cron, no KEDA. Fire-and-forget.
  • replica_timeout_in_seconds = 300 — kill if running longer than 5 minutes.
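The omitted blocks are identical across all three jobs. A rough sketch of what they contain — the container name, image tag, and identity resource names below are placeholders from this sample layout, not canonical values:

```hcl
template {
  container {
    name   = "job-consumer"
    image  = "${azurerm_container_registry.acr.login_server}/job-consumer:latest"
    cpu    = 0.25
    memory = "0.5Gi"

    # Injected from the secret block below
    env {
      name        = "serviceBusConnectionString"
      secret_name = "service-bus-connection-string"
    }
    env {
      name  = "serviceBusQueue"
      value = azurerm_servicebus_queue.job_queue.name
    }
  }
}

secret {
  name  = "service-bus-connection-string"
  value = azurerm_servicebus_namespace.sb.default_primary_connection_string
}

identity {
  type         = "UserAssigned"
  identity_ids = [azurerm_user_assigned_identity.job_identity.id]
}

# Pull from the private ACR with the managed identity (AcrPull role)
registry {
  server   = azurerm_container_registry.acr.login_server
  identity = azurerm_user_assigned_identity.job_identity.id
}
```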

Trigger after deployment:

az containerapp job start --name <job_name> --resource-group <rg>

Scheduled Job — Cron-Based Execution

Runs on a recurring schedule. Suitable for nightly batch reports, periodic data aggregation, or cache warming.

resource "azurerm_container_app_job" "scheduled_job" {
  name                         = "${local.prefix}-scheduled-job"
  location                     = var.location
  resource_group_name          = azurerm_resource_group.rg.name
  container_app_environment_id = azurerm_container_app_environment.app_env.id
  replica_timeout_in_seconds   = 300
  replica_retry_limit          = 1

  schedule_trigger_config {
    cron_expression          = "0 */6 * * *"
    parallelism              = 1
    replica_completion_count = 1
  }

  # template, secret, identity, registry blocks omitted
}
  • cron_expression = "0 */6 * * *" — fires at minute 0 of hours 0, 6, 12, 18.
  • Common alternatives: 0 2 * * * (daily at 2 AM), */15 * * * * (every 15 minutes), 0 0 * * 0 (Sundays at midnight).
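As a quick sanity check, the hours matched by the */6 step in the hour field can be enumerated:

```python
# "*/6" in the hour field matches every hour divisible by the step value
fire_hours = [h for h in range(24) if h % 6 == 0]
print(fire_hours)  # [0, 6, 12, 18]
```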

Event-Driven Job — KEDA Service Bus Scaler

The most powerful trigger. Uses KEDA to watch an Azure Service Bus queue and spin up executions when messages arrive. Scales to zero when the queue is empty.

resource "azurerm_container_app_job" "event_driven_job" {
  name                         = "${local.prefix}-event-job"
  location                     = var.location
  resource_group_name          = azurerm_resource_group.rg.name
  container_app_environment_id = azurerm_container_app_environment.app_env.id
  replica_timeout_in_seconds   = 600
  replica_retry_limit          = 2

  event_trigger_config {
    parallelism              = 1
    replica_completion_count = 1

    scale {
      min_executions              = 0
      max_executions              = 10
      polling_interval_in_seconds = 30

      rules {
        name             = "servicebus-queue-rule"
        custom_rule_type = "azure-servicebus"
        metadata = {
          queueName    = azurerm_servicebus_queue.job_queue.name
          namespace    = azurerm_servicebus_namespace.sb.name
          messageCount = "20"
        }
        authentication {
          secret_name       = "service-bus-connection-string"
          trigger_parameter = "connection"
        }
      }
    }
  }

  # template, secret, identity, registry blocks omitted
}

How KEDA scaling works with messageCount = "20":

KEDA polls the Service Bus queue every 30 seconds and divides the current queue depth by the messageCount threshold to determine how many job instances to launch. Each instance handles its batch independently:

| Queue depth | Calculation | Instances launched |
|---|---|---|
| 0 messages | 0 ÷ 20 = 0 | 0 (scale to zero) |
| 15 messages | 15 ÷ 20 = 1 (rounded up) | 1 |
| 20 messages | 20 ÷ 20 = 1 | 1 |
| 45 messages | 45 ÷ 20 = 3 (rounded up) | 3 |
| 100 messages | 100 ÷ 20 = 5 | 5 |
| 250 messages | 250 ÷ 20 = 13 → capped | 10 (max_executions) |

So with messageCount = "20" and max_executions = 10, the system handles up to 200 messages in parallel (10 instances × 20 messages each). Anything beyond 200 waits for the next polling cycle.

Full runtime flow:

  1. Messages land in the job-tasks queue.
  2. KEDA polls every 30 seconds and reads the queue depth.
  3. KEDA divides queue depth by messageCount (20) → number of instances to launch.
  4. Each instance starts, drains up to 20 messages, exits.
  5. Next poll: if messages remain, KEDA launches more instances.
  6. Queue empty → zero executions, zero cost.
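The scaling arithmetic in steps 3 and 6 can be sketched in a few lines of Python — a model of KEDA's behavior, not its actual implementation:

```python
import math

def instances_to_launch(queue_depth: int, message_count: int = 20,
                        max_executions: int = 10) -> int:
    """Model of KEDA queue-depth scaling: ceil(depth / threshold),
    capped at max_executions."""
    return min(math.ceil(queue_depth / message_count), max_executions)

print(instances_to_launch(0))    # 0  -> scale to zero
print(instances_to_launch(45))   # 3
print(instances_to_launch(250))  # 10 -> capped by max_executions
```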

The Consumer Application

All three jobs run the same .NET console app. It connects to Service Bus, receives messages in a loop, and exits when done:

using Azure.Messaging.ServiceBus;

string connectionString = Environment.GetEnvironmentVariable("serviceBusConnectionString")
    ?? throw new ArgumentNullException("serviceBusConnectionString");
string queueName = Environment.GetEnvironmentVariable("serviceBusQueue")
    ?? throw new ArgumentNullException("serviceBusQueue");
int maxMessagesPerRun = int.TryParse(Environment.GetEnvironmentVariable("MAX_MESSAGES_PER_RUN"), out int parsedMaxMessages)
    ? parsedMaxMessages
    : 20;
int maxIdleWaitSeconds = int.TryParse(Environment.GetEnvironmentVariable("MAX_IDLE_WAIT_SECONDS"), out int parsedIdleSeconds)
    ? parsedIdleSeconds
    : 15;

await using var client = new ServiceBusClient(connectionString);
await using var receiver = client.CreateReceiver(queueName);

int processedCount = 0;
while (processedCount < maxMessagesPerRun)
{
    var message = await receiver.ReceiveMessageAsync(TimeSpan.FromSeconds(maxIdleWaitSeconds));
    if (message is null) break;

    Console.WriteLine(message.Body.ToString());
    await receiver.CompleteMessageAsync(message);
    processedCount++;
}

Console.WriteLine($"Processed {processedCount} message(s). Exiting.");

Two exit conditions — processed MAX_MESSAGES_PER_RUN messages, or no message arrived within MAX_IDLE_WAIT_SECONDS. Both result in exit code 0 → successful execution.
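The drain-loop logic is language-agnostic. A minimal Python model of the same two exit conditions, with an in-memory deque standing in for the Service Bus receiver:

```python
from collections import deque

def drain(queue: deque, max_messages_per_run: int = 20) -> int:
    """Model of the consumer loop: exit after the batch cap, or when
    no message is available (stand-in for the idle-wait timeout)."""
    processed = 0
    while processed < max_messages_per_run:
        if not queue:        # no message within the idle window -> exit
            break
        queue.popleft()      # receive + complete the message
        processed += 1
    return processed

# 15 queued messages: idle-timeout path, drains all 15 then exits
print(drain(deque(range(15))))  # 15
# 45 queued messages: batch-cap path, exits after 20
print(drain(deque(range(45))))  # 20
```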


Cost: Always-On Container App vs. Job

A workload running 4× per day, each run taking 60 seconds at 0.25 vCPU / 0.5 GiB:

| Metric | Always-On (minReplicas=1) | Container App Job |
|---|---|---|
| vCPU-seconds/day | 21,600 | 60 |
| GiB-seconds/day | 43,200 | 120 |
| Utilization | 0.28% | 100% |
| Monthly cost | ~$15.55 | ~$0.04 |
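The table's figures are reproducible with simple arithmetic. The per-second rate below is an assumption based on published pay-as-you-go Consumption pricing (roughly $0.000024 per vCPU-second); check current Azure pricing for your region:

```python
VCPU_RATE = 0.000024  # USD per vCPU-second (assumed pay-as-you-go rate)

# Always-on: 0.25 vCPU running 24h/day
always_on_vcpu_s_per_day = 24 * 3600 * 0.25   # 21,600
# Job: 4 runs/day x 60 s x 0.25 vCPU
job_vcpu_s_per_day = 4 * 60 * 0.25            # 60

always_on_monthly = always_on_vcpu_s_per_day * 30 * VCPU_RATE
job_monthly = job_vcpu_s_per_day * 30 * VCPU_RATE

print(f"always-on: ${always_on_monthly:.2f}/month")  # ~$15.55
print(f"job:       ${job_monthly:.2f}/month")        # ~$0.04
```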

360× fewer compute-seconds. Event-driven jobs with min_executions = 0 cost nothing when the queue is empty.


Deployment

git clone https://github.com/azure-way/terraform-container-apps.git
cd terraform-container-apps/container_apps_jobs

terraform init
terraform apply

Provide service principal credentials via terraform.tfvars or at the prompt. Terraform creates all infrastructure, builds the container image via az acr build, and deploys all three jobs.


Which Trigger Type to Use?

| Trigger | Best for | Example |
|---|---|---|
| Manual | Ad-hoc tasks from humans or CI/CD | On-demand PDF export, data migration |
| Scheduled | Recurring work on a fixed cadence | Nightly reports, daily data exports |
| Event-driven | Reactive processing from external events | Service Bus consumer, Blob trigger, Kafka |

Common Questions

What happens if a job execution fails?

The replica_retry_limit controls automatic retries. Set to 1, the platform retries once. Set to 2 (as in the event-driven example), it retries twice. After exhausting retries the execution is marked as failed. Failed executions are visible in the Azure Portal and Log Analytics — they do not block subsequent executions.

Can I use a different KEDA scaler instead of Service Bus?

Yes. Replace the rules block with any KEDA-supported scaler — Azure Storage Queue, Kafka, PostgreSQL, RabbitMQ, Cron, and dozens more. The structure stays the same: custom_rule_type, metadata, and authentication.

How does messageCount actually drive scaling?

KEDA divides the current queue depth by messageCount to calculate the desired number of job instances. With messageCount = "20": 45 queued messages → 3 instances. 100 messages → 5 instances. The result is capped by max_executions. Each instance independently connects to the queue and drains its share of messages.

Does the event-driven job process exactly 20 messages per instance?

Not exactly. messageCount is a KEDA scaling threshold, not a delivery guarantee. KEDA uses it to decide how many instances to start. The actual message consumption depends on your application logic — in this sample, MAX_MESSAGES_PER_RUN = 20 and MAX_IDLE_WAIT_SECONDS = 15 control how many messages each instance drains before exiting. Multiple instances compete for the same queue, so each one may process fewer than 20 if messages are spread across instances.

What is the maximum execution time for a job?

Controlled by replica_timeout_in_seconds. Maximum allowed value is 86,400 (24 hours). In this sample: manual and scheduled jobs use 300s (5 min), the event-driven job uses 600s (10 min). Set this based on realistic worst-case processing time plus margin.

Can I run Container App Jobs on a Dedicated workload profile instead of Consumption?

Yes. Replace the Consumption workload profile with a Dedicated profile (D4, D8, etc.) on the Container App Environment and add workload_profile_name to the job resource. Dedicated profiles provide predictable performance and can support higher CPU/memory limits, but you lose scale-to-zero — you pay for the profile nodes whether jobs are running or not.

How do I monitor job executions?

Job execution history is available in the Azure Portal under the Container App Job resource. Logs from Console.WriteLine are streamed to the Log Analytics workspace attached to the Container App Environment. Query them with:

ContainerAppConsoleLogs_CL
| where ContainerGroupName_s contains "event-job"
| order by TimeGenerated desc

Can I pass different parameters to each manual job execution?

Not directly through the Terraform trigger config. However, you can override environment variables at execution time using the Azure CLI:

az containerapp job start \
    --name <job_name> \
    --resource-group <rg> \
    --container-name manual-job \
    --env-vars "MAX_MESSAGES_PER_RUN=50"

This lets you reuse the same job definition with different parameters per run.


Summary

Azure Container Apps Jobs are the right tool for workloads that run to completion and exit:

  • Jobs vs. Apps — container exits after work? Use a Job.
  • Three triggers — manual_trigger_config, schedule_trigger_config, event_trigger_config.
  • Scale to zero — event-driven jobs with min_executions = 0 cost nothing when idle.
  • KEDA scaling — messageCount divides queue depth into parallel instances, capped by max_executions.
  • Cost — up to 360× cheaper than always-on replicas for intermittent workloads.

Full source: azure-way/terraform-container-apps/container_apps_jobs.


Part of the Azure Container Apps with Terraform series — see the series page for the full list of articles.
