PlantUML Server in Azure Container Apps for an Agentic AI App

In the first article of this series, I introduced DocWriter Studio — my multi-agent, AI-powered document generation platform running on Azure. I showed how specialized agents (planner, writer, reviewer, verifier, rewriter) collaborate through Azure Service Bus queues to produce executive-ready, 60+ page technical documents with embedded diagrams.

One of the capabilities I’m most proud of is diagram-rich storytelling: diagrams are authored automatically by AI agents as PlantUML source, rendered to PNG/SVG by a PlantUML server in Azure Container Apps, and embedded in Markdown, PDF, and DOCX outputs, without a single human touching a diagramming tool.

But there’s a critical piece of infrastructure that makes this possible: a PlantUML rendering server deployed as a microservice in Azure Container Apps. In this article, I’ll walk you through how I deploy, integrate, and harden PlantUML as a first-class service in my agentic AI workflow.


Why PlantUML as a Microservice

When my AI agents generate diagrams, they produce PlantUML source code (text) that needs to be rendered into visual assets (PNG or SVG). Running PlantUML locally works for development, but in production I needed:

  • Always-on availability — multiple diagram-render workers fire concurrently as documents are processed in parallel
  • Horizontal scalability — large documents can contain 10–20+ diagrams; burst rendering must not become a bottleneck
  • Network isolation — the server should be reachable only from within the Container Apps environment
  • Zero-ops management — no VM patching or JVM tuning; Azure Container Apps handles scaling and restarts

I solved all of this by deploying PlantUML as a containerized microservice alongside my API, UI, and Function workers in the same Container Apps environment.


Where PlantUML Fits in the Pipeline

PlantUML sits at the Diagram Render stage — stage 9 of 10 in my pipeline. I split diagram processing into two stages that work as a pair:

  1. Diagram Prep (functions_diagram_prep) — extracts PlantUML blocks from the draft Markdown, sanitizes them (strips Markdown fences, validates syntax, rejects accidental Mermaid), uploads clean .puml sources to Blob Storage, and enqueues render requests onto the docwriter-diagram-render queue.

  2. Diagram Render (functions_diagram_render) — picks up those requests, calls the PlantUML server’s HTTP API, stores rendered PNG/SVG in Blob Storage, and signals the Finalize stage that diagrams are ready.
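To make the hand-off concrete, a render request on the docwriter-diagram-render queue can be sketched as a small JSON payload. The field names below are illustrative assumptions, not the production schema:

```python
import json

# Hypothetical render-request message enqueued by the Diagram Prep stage.
# Field names are illustrative assumptions, not the production schema.
render_request = {
    "document_id": "doc-123",
    "diagram_id": "diagram-007",
    "source_blob": "diagrams/doc-123/diagram-007.puml",  # sanitized source in Blob Storage
    "output_formats": ["png", "svg"],
}

message = json.dumps(render_request)
print(message)
```

The render worker deserializes a message like this, fetches the .puml blob, and posts its contents to the PlantUML server.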

The PlantUML server itself is a stateless HTTP service: it receives PlantUML text in a POST request and returns the rendered image. No state, no database, trivial scaling. That’s exactly the kind of simplicity I was looking for.


The Container: Deliberately Simple

I don’t fork or customize PlantUML. I use a three-line Dockerfile:

FROM plantuml/plantuml-server:jetty

ENV PLANTUML_PORT=8080

EXPOSE 8080

The official plantuml/plantuml-server:jetty image is production-ready out of the box. I initially over-engineered this with custom health checks and JVM tuning, but stripped everything back — the upstream image just works. It honours JAVA_OPTS if I ever need custom memory settings at deploy time.

To build and test locally:

docker build -t plantuml-server ./plantuml-server
docker run -p 8080:8080 plantuml-server

# Test rendering
curl --data-binary '@diagram.puml' http://localhost:8080/plantuml/png > diagram.png

The server responds with PNG by default, or SVG if you hit /plantuml/svg.


Deploying with Terraform

I provision my entire infrastructure with Terraform, and the PlantUML server is deployed as a first-class Container App alongside the API and UI.

In my root main.tf, the PlantUML image sits alongside the API in the api_images map:

  api_images = {
    api = {
      image        = "${module.container_registry.url}/docwriter-api:${var.docker_image_version}"
      min_replicas = 1
      max_replicas = 1
    }
    plantuml = {
      image        = "${module.container_registry.url}/plantuml-server:${var.docker_image_version}"
      min_replicas = 1
      max_replicas = 1
    }
  }

  api_ports = {
    api      = 8000
    plantuml = 8080
  }

The integration I’m most pleased with is automatic service discovery. Every Container App in the same environment gets a deterministic FQDN, and I inject it into all other containers:

      env {
        name  = "PLANTUML_SERVER_URL"
        value = "https://aidocwriter-plantuml.${azurerm_container_app_environment.main.default_domain}"
      }

My diagram-render workers simply read PLANTUML_SERVER_URL from their environment — no hardcoded IPs, no service mesh, no DNS management. The same injection happens in the Functions containers:

      env {
        name  = "PLANTUML_SERVER_URL"
        value = "https://${var.plantuml_server_name}.${azurerm_container_app_environment.main.default_domain}"
      }

The server name is configurable via a Terraform variable (default: aidocwriter-plantuml) in case I ever need multiple instances.
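On the consuming side, resolving the endpoint reduces to reading that injected variable. Here’s a minimal sketch; the localhost fallback (matching the local curl test above) is my assumption for development, not the production behavior:

```python
import os

def plantuml_endpoint(fmt: str = "png") -> str:
    """Build the render endpoint from the injected PLANTUML_SERVER_URL.

    Falls back to a local server for development — an assumption,
    not the production behavior.
    """
    base = os.getenv("PLANTUML_SERVER_URL", "http://localhost:8080/plantuml")
    return f"{base.rstrip('/')}/{fmt}"

print(plantuml_endpoint("svg"))
```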


CI/CD: Built and Deployed Atomically

The PlantUML image is just another entry in my GitHub Actions build matrix — built alongside all 12 DocWriter containers:

- dockerfile: plantuml-server/Dockerfile
  image_name: plantuml-server

The workflow builds all images, pushes them to Azure Container Registry with :latest and :v<git-describe> tags, then triggers the Terraform workflow with the resolved version. The PlantUML server is redeployed atomically with everything else — no version skew, no manual steps.


Helping the AI Write Valid PlantUML

Before any diagram reaches the PlantUML server, I’ve put guardrails in place to maximize rendering success.

The PlantUML Encyclopedia

I built a structured reference catalog in plantuml_reference.py covering 20+ diagram types — class, sequence, component, deployment, ERD, mind map, Gantt, WBS, and more — each with a description and example syntax:

PLANTUML_FEATURES: Dict[str, Dict[str, Dict[str, Dict[str, str]]]] = {
    "plantuml_diagram_types": {
        "uml": {
            "sequence_diagram": {
                "description": "Shows interactions between objects over time.",
                "syntax": "@startuml\nAlice -> Bob : Hello\nBob --> Alice : Response\n@enduml",
            },
            "component_diagram": {
                "description": "Shows components and their interfaces.",
                "syntax": "@startuml\ncomponent API\ncomponent DB\nAPI --> DB\n@enduml",
            },
            # ... class, state, activity, deployment, etc.
        },
        "data_and_structure": {
            "entity_relationship_diagram": {
                "description": "ER-style data modeling.",
                "syntax": "@startuml\nentity User {\n  *id\n  name\n}\n...",
            },
        },
        # ... 20+ types total
    }
}

I feed this reference text to the Writer agent as part of its system prompt, so it knows exactly which diagram families and syntax are available.
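Turning the nested catalog into prompt text is a small flattening step. Here’s a sketch over a one-entry slice of the structure above; the function name and formatting are mine, not the production code:

```python
from typing import Dict

# A tiny slice of the catalog, mirroring the structure shown above (illustrative only).
sample_features: Dict[str, Dict[str, Dict[str, Dict[str, str]]]] = {
    "plantuml_diagram_types": {
        "uml": {
            "sequence_diagram": {
                "description": "Shows interactions between objects over time.",
                "syntax": "@startuml\nAlice -> Bob : Hello\n@enduml",
            },
        },
    },
}

def reference_as_prompt_text(features: Dict) -> str:
    """Flatten the nested catalog into plain text for the Writer agent's system prompt."""
    lines = []
    for family in features["plantuml_diagram_types"].values():
        for name, info in family.items():
            lines.append(f"## {name}")
            lines.append(info["description"])
            lines.append(info["syntax"])
            lines.append("")
    return "\n".join(lines)

print(reference_as_prompt_text(sample_features))
```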

Diagram Prep: Sanitize Before Rendering

My Diagram Prep stage performs several critical cleanup operations before anything goes near the PlantUML server. I learned the hard way that LLMs are creative with whitespace and fencing:

import re
from typing import List

def _validate_plantuml_source(source: str) -> List[str]:
    issues: List[str] = []
    lower = source.lower()
    if "@startuml" not in lower:
        issues.append("missing @startuml")
    if "@enduml" not in lower:
        issues.append("missing @enduml")
    if "```" in source:
        issues.append("contains markdown code fences inside PlantUML")
    if "@startmermaid" in lower or "```mermaid" in lower:
        issues.append("contains Mermaid instead of PlantUML")
    stripped = re.sub(r"@startuml", "", source, flags=re.IGNORECASE)
    stripped = re.sub(r"@enduml", "", stripped, flags=re.IGNORECASE).strip()
    if not stripped:
        issues.append("empty diagram body")
    return issues

The sanitizer strips Markdown fences, normalizes line breaks, ensures @startuml/@enduml guards are present, and rejects accidental Mermaid. Only clean, validated .puml sources get uploaded to Blob Storage and queued for rendering.


The Rendering Code: Retries, LLM Fixes, and Fallbacks

Here’s the core logic that calls my PlantUML Container App:

    server_url = os.getenv("PLANTUML_SERVER_URL")
    if not server_url:
        raise DiagramRenderError("PLANTUML_SERVER_URL not configured")

    last_exc: Exception | None = None
    last_source = source
    for attempt in range(5):
        try:
            uml_source = _reformat_plantuml_text(last_source)
            last_source = uml_source
            endpoint = f"{server_url.rstrip('/')}/{fmt}"

            response = requests.post(
                endpoint,
                data=uml_source.encode("utf-8"),
                headers={"Content-Type": "text/plain; charset=utf-8"},
                timeout=30,
            )

            response.raise_for_status()
            return response.content
        except Exception as exc:
            last_exc = exc
            track_exception(exc, {"stage": "DIAGRAM_RENDER", "attempt": str(attempt + 1)})
            # Only fall back to full regeneration after two consecutive failures
            if regen_after_second_failure and attempt >= 1:
                try:
                    regenerated = regen_after_second_failure()
                    if regenerated:
                        last_source = regenerated
                        continue
                except Exception as regen_exc:
                    track_exception(
                        regen_exc,
                        {"stage": "DIAGRAM_RENDER", "attempt": "regen_after_second_failure"},
                    )
    raise DiagramRenderError(f"PlantUML rendering failed after 5 attempts: {last_exc}") from last_exc

The key design decisions I made here:

  • Up to 5 attempts with 30-second timeouts for resilience against transient failures
  • LLM-powered reformatting before each attempt — I use an Azure OpenAI model to fix line breaks, indentation, and labels. AI-generated PlantUML often has subtle formatting issues that crash the renderer
  • Regeneration fallback — after two consecutive failures, I rebuild the PlantUML from scratch using the original diagram description, entities, and relationships from the plan. This is my nuclear option, and it saves a surprising number of diagrams
  • Telemetry on every failure — stage, diagram ID, and attempt number all flow into Application Insights

I’m essentially using one AI model to fix the output of another. It sounds recursive, but it works remarkably well because the reformatting task is much simpler than the original diagram generation.

Graceful Degradation

If all render attempts fail for a given diagram, the pipeline doesn’t crash. My render worker sends error metadata to the Finalize stage, which preserves the original code block rather than leaving a gap. The status timeline emits a DIAGRAM → FAILED event with the error message, so users see exactly what happened.
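That preserve-the-source behavior can be sketched as a small decision in the Finalize stage; the function name and result shape here are illustrative assumptions:

```python
def finalize_diagram(render_result: dict) -> str:
    """Decide what Finalize embeds for a diagram: the image on success,
    or the preserved source block on failure.

    A sketch of the graceful-degradation behavior described above;
    the input/output shapes are assumptions, not the production code.
    """
    if render_result.get("status") == "ok":
        return f"![diagram]({render_result['image_url']})"
    # Preserve the original PlantUML code block rather than leaving a gap.
    fence = "`" * 3
    return f"{fence}plantuml\n{render_result['source']}\n{fence}"

print(finalize_diagram({"status": "failed", "source": "@startuml\nA -> B\n@enduml"}))
```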


End-to-End: From terraform apply to Rendered Diagrams

Here’s the practical deployment flow:

# 1. Build the image
docker build -t plantuml-server ./plantuml-server

# 2. Push to ACR (or let GitHub Actions handle it)
docker tag plantuml-server aidocwriteracr.azurecr.io/plantuml-server:v1
docker push aidocwriteracr.azurecr.io/plantuml-server:v1

# 3. Deploy with Terraform
terraform -chdir=infra/terraform apply \
  -var "docker_image_version=v1" \
  -var "plantuml_server_name=aidocwriter-plantuml"

# 4. Verify (the ~h prefix hex-encodes the diagram text directly in the URL)
curl "https://aidocwriter-plantuml.<env-domain>/plantuml/png/~h$(printf '@startuml\nBob -> Alice : hi\n@enduml' | xxd -p | tr -d '\n')" --output test.png

Terraform creates the Container App on port 8080, injects PLANTUML_SERVER_URL into every other container, and configures managed identity and registry auth automatically.
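The ~h in that verification URL is PlantUML’s hex-encoding prefix: the raw diagram text, hex-encoded, rides directly in a GET URL. A quick sketch of building such a URL (the domain is a placeholder):

```python
def plantuml_hex_url(server: str, source: str, fmt: str = "png") -> str:
    """Build a GET URL using PlantUML's ~h hex encoding of the raw diagram text."""
    encoded = source.encode("utf-8").hex()
    return f"{server.rstrip('/')}/{fmt}/~h{encoded}"

url = plantuml_hex_url(
    "https://aidocwriter-plantuml.example.net/plantuml",  # placeholder domain
    "@startuml\nBob -> Alice : hello\n@enduml",
)
print(url)
```

Handy for smoke tests, though the render workers stick to POST so diagram size is never constrained by URL length.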


What I Learned

  1. PlantUML as a sidecar microservice is the right pattern for agentic AI systems that generate diagrams. It decouples rendering from generation and scales independently. I tried running PlantUML in-process early on — it was a nightmare of Java dependency management.

  2. A three-line Dockerfile is enough. The upstream plantuml/plantuml-server:jetty image is production-ready. Don’t over-engineer the wrapper.

  3. Azure Container Apps service discovery is powerful. A deterministic FQDN injected via Terraform replaces the need for a service mesh or manual DNS.

  4. AI-generated diagram code needs multiple layers of defense. Sanitization → validation → LLM reformatting → retry → regeneration. Each layer catches issues the previous ones missed.

  5. Atomic deployment via IaC pays for itself. Updating the PlantUML version is a one-line change to the base image tag, deployed alongside everything else with zero coordination overhead.


📦 Source code: azure-way/aidocwriter

🌐 Live demo: docwriter-studio.azureway.cloud

📄 Series overview: DocWriter Studio Multi-Agent: AI-Powered Document Generation on Azure
