Solving Docker BuildKit Compatibility Issues with Amazon ECS
A deep dive into resolving Docker BuildKit attestation manifest incompatibilities that prevent ECS deployments, including investigation steps and solutions.

Docker containers that build successfully on local development machines can fail when deployed to Amazon ECS through CDK. This post documents the investigation and resolution of a deployment failure caused by Docker BuildKit's attestation manifests, which are incompatible with Amazon ECS.
The root cause stems from Docker BuildKit creating OCI image manifests with additional security attestation layers that ECS cannot parse. While these attestations provide supply chain security benefits, ECS expects simple, single-platform Docker manifests.
The Problem
Our CDK deployments of containerized services were failing with cryptic errors. The services would deploy successfully to ECR, but ECS tasks refused to start. The most puzzling aspect? The exact same Docker builds worked flawlessly on our local development machines.
The deployment failures manifested in several ways. CDK reported build failures despite local builds working perfectly. Images appeared in ECR but showed 0 MB size. ECS tasks immediately stopped with pull errors, and the error messages referenced missing linux/amd64 descriptors.
The ECS console displayed this error:
Stopped reason:
CannotPullContainerError: failed to resolve reference
"xxx.dkr.ecr.region.amazonaws.com/repo:tag":
pulling from host xxx.dkr.ecr.region.amazonaws.com
failed with status code [manifests tag]: 400 Bad Request
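The same stopped reason can also be pulled from the CLI once you have the ARN of a stopped task; a minimal sketch, using the placeholder cluster name that appears later in this post:
# List stopped tasks, then read the stopped reason for one of them
aws ecs list-tasks --cluster my-cluster --desired-status STOPPED
aws ecs describe-tasks \
--cluster my-cluster \
--tasks <stopped-task-arn> \
--query 'tasks[].stoppedReason' \
--output text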
The Investigation
Step 1: Verifying Local Builds
Our first instinct was to verify that the Dockerfiles were correct. We ran local builds:
# Local build test
docker build -t test-image .
docker run test-image
# Check image details
docker images | grep test-image
# Output: test-image latest abc123 2 minutes ago 292MB
Everything worked perfectly. The containers started, ran their applications, and had the expected file sizes. This eliminated Dockerfile issues as the cause.
Step 2: Pursuing Red Herrings
Since local builds worked but CDK builds failed, we suspected issues with our monorepo structure:
# Original Dockerfile attempting to handle pnpm workspaces
FROM node:18-alpine
WORKDIR /app
# Enable pnpm via corepack (bundled with Node 18 but disabled by default)
RUN corepack enable
# We thought this was the problem
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
COPY containers/my-app/package.json ./containers/my-app/
# Hours spent converting this to standalone npm...
RUN pnpm install --frozen-lockfile --filter my-app...
We spent significant time converting pnpm workspaces to standalone npm packages, reorganizing .dockerignore files, and restructuring the monorepo layout. None of these changes had any effect on the actual problem.
Step 3: Analyzing ECR Manifests
The breakthrough came when we inspected the image manifests in ECR:
# Check manifest type
aws ecr batch-get-image \
--repository-name my-repo \
--image-ids imageTag=latest \
--query 'images[0].imageManifest' \
--output text | jq '.mediaType'
# Output: "application/vnd.oci.image.index.v1+json"
This revealed that our images were using OCI index manifests rather than simple Docker manifests. Further investigation showed attestation layers:
# Check for attestation layers
aws ecr batch-get-image \
--repository-name my-repo \
--image-ids imageTag=latest \
--query 'images[0].imageManifest' \
--output text | jq '.manifests[]'
Understanding the Root Cause
Docker BuildKit vs Legacy Builder
Docker BuildKit, enabled by default in recent Docker versions, produces richer image manifests than the legacy builder. Alongside the platform image, it can attach security attestations (provenance and SBOM data) that provide supply chain security benefits, wrapping everything in an OCI image index. ECS, however, expects a simple, single-platform manifest and cannot parse these additional entries.
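If you want to keep BuildKit but drop the attestations, newer Buildx releases expose flags for that; a minimal sketch (not the route we took, and the exact flags depend on your Buildx version):
# Build with BuildKit but without provenance or SBOM attestations
docker buildx build --provenance=false --sbom=false -t test-image .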
Manifest Structure Comparison
BuildKit Manifest (Incompatible with ECS):
{
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.in-toto+json",
      "annotations": {
        "in-toto.io/predicate-type": "https://slsa.dev/provenance/v0.2"
      }
    }
  ]
}
Legacy Builder Manifest (Compatible with ECS):
{
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json"
  }
}
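To see which of these two shapes a pushed image actually has, buildx's imagetools command can dump the raw top-level manifest straight from the registry; a quick sketch against the placeholder ECR image from earlier:
# Fetch the raw top-level manifest for the pushed image
docker buildx imagetools inspect xxx.dkr.ecr.region.amazonaws.com/repo:tag --raw | jq '.mediaType'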
The Solution
Immediate Fix
The solution was surprisingly simple once we understood the root cause. We disabled BuildKit in our deployment script:
#!/bin/bash
set -e
# Disable Docker BuildKit to avoid attestation manifest issues with ECS
# ECS expects simple linux/amd64 images, not manifest lists
export DOCKER_BUILDKIT=0
# Continue with CDK deployment
cdk deploy --all
CDK Configuration
For CDK projects, you can also ensure platform specification:
import * as path from "path";
import { Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import { DockerImageAsset, Platform } from "aws-cdk-lib/aws-ecr-assets";
import { ContainerImage, FargateTaskDefinition, LogDrivers } from "aws-cdk-lib/aws-ecs";

export class MyStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Create Docker image asset with explicit platform
    const asset = new DockerImageAsset(this, "MyImage", {
      directory: path.join(__dirname, "../containers/my-app"),
      platform: Platform.LINUX_AMD64,
      // BuildKit will be disabled by environment variable
    });

    // Use in ECS task definition
    const taskDefinition = new FargateTaskDefinition(this, "TaskDef");
    taskDefinition.addContainer("Container", {
      image: ContainerImage.fromDockerImageAsset(asset),
      memoryLimitMiB: 512,
      logging: LogDrivers.awsLogs({
        streamPrefix: "my-app",
      }),
    });
  }
}
Validation Steps
After implementing the fix, we validated our deployments:
# 1. Build with BuildKit disabled
export DOCKER_BUILDKIT=0
docker build -t test-image .
# 2. Deploy to ECS (CDK rebuilds the image and pushes it to ECR)
cdk deploy
# 3. Check the manifest type of the pushed image
# (docker manifest inspect resolves against the registry, so point it at the ECR image)
docker manifest inspect xxx.dkr.ecr.region.amazonaws.com/repo:tag | jq '.mediaType'
# Should show: "application/vnd.docker.distribution.manifest.v2+json"
# 4. Verify ECS tasks are running
aws ecs list-tasks \
--cluster my-cluster \
--service-name my-service \
--desired-status RUNNING
Lessons Learned
This investigation revealed several important insights about containerized deployments. Not all AWS services support the latest container specifications. While BuildKit's attestations provide valuable security benefits, compatibility with your deployment target takes precedence.
Local Docker builds can behave differently from CDK's build process. Our local Docker daemon handled BuildKit manifests perfectly, but ECS has different requirements. Testing in an environment that closely matches production would have revealed this issue earlier.
Error messages don't always point to the root cause. The error about a missing linux/amd64 descriptor suggested an architecture incompatibility, when the real problem was manifest format parsing. Understanding the complete deployment pipeline helps identify where issues actually originate.
Before refactoring application structure, investigate deployment tool differences. We spent hours restructuring our monorepo when the solution was a simple environment variable change.
Best Practices
Creating a deployment checklist helps prevent similar issues. Key items to verify include disabling BuildKit in deployment scripts, confirming base images use linux/amd64 architecture, validating CDK platform specifications, and testing ECS task launches in staging environments. Ensuring local builds match expected sizes and reviewing .dockerignore configurations can catch issues early in the deployment process.
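One quick check from that list, confirming the locally built image targets linux/amd64, assuming the test-image tag from earlier:
# Confirm the local image's OS/architecture
docker image inspect test-image --format '{{.Os}}/{{.Architecture}}'
# Expected output: linux/amd64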
Diagnostic commands prove invaluable for troubleshooting container deployment issues:
#!/bin/bash
# diagnose.sh <repository-name> <image-tag>
echo "Checking Docker BuildKit status..."
echo "DOCKER_BUILDKIT=${DOCKER_BUILDKIT}"
echo "Checking image manifest type..."
aws ecr batch-get-image \
--repository-name "$1" \
--image-ids imageTag="$2" \
--query 'images[0].imageManifest' \
--output text | jq '.mediaType'
echo "Checking for attestation layers..."
aws ecr batch-get-image \
--repository-name "$1" \
--image-ids imageTag="$2" \
--query 'images[0].imageManifest' \
--output text | jq '.manifests[]? | select(.mediaType | contains("in-toto"))'
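The script takes the repository name and image tag as positional arguments, for example:
# Example invocation using the placeholder repository and tag from earlier
./diagnose.sh my-repo latest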
Conclusion
Docker BuildKit's advanced features create a compatibility gap with Amazon ECS's current manifest parsing capabilities. While this caused significant debugging time, the solution—disabling BuildKit—is straightforward once you understand the root cause.
The key takeaway? When containerized services work locally but fail in cloud deployments, investigate the build and deployment toolchain differences before diving into application code changes. Sometimes the simplest solution is the right one.
Remember to keep this workaround documented and revisit it periodically as both Docker and AWS ECS continue to evolve. What's incompatible today might become the standard tomorrow.