Om Shree

Posted on • Originally published at glama.ai

Serverless Scaling: Deploying Strands + MCP on AWS

In this article, we'll explore how to deploy a Strands Agent connected to an MCP server using serverless AWS services. We'll cover three deployment models (native Lambda, Lambda with the Web Adapter, and Fargate) and compare their benefits, limitations, and recommended scenarios.

1. Introduction

The Strands Agents SDK provides a model-driven agent loop, while MCP enables dynamic tool invocation. Deploying them on AWS serverless platforms allows you to build scalable, maintainable agents without managing servers [1].

2. Deployment Options Overview

| Option | Benefits | Limitations |
| --- | --- | --- |
| AWS Lambda (Native) | Fast startup, easy CI/CD, unified observability | Max 15-minute execution, no streaming support [2] |
| Lambda with Web Adapter | Preserves web frameworks, serverless pay-per-use | Slower cold start (1–3 s), added complexity [3] |
| AWS Fargate (ECS/EKS) | Long-running containers, streaming support | Higher cost, container lifecycle management [4] |

3. Native AWS Lambda (Stateless MCP)

Approach: Package your MCP server as a Lambda function using FastMCP with HTTP transport [3].

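# lambda_mcp.py
from mcp.server.fastmcp import FastMCP

# stateless_http=True: no session state is kept between requests.
mcp = FastMCP("lambda-mcp", stateless_http=True)

@mcp.tool()
def echo(message: str) -> str:
    """Return the message unchanged."""
    return message

def lambda_handler(event, context):
    # NOTE: handle_lambda_event is not a documented FastMCP method in the MCP Python SDK.
    # Treat it as a placeholder for whatever adapter translates the Lambda event
    # into an HTTP request for the MCP app.
    return mcp.handle_lambda_event(event, context)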
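To make "stateless" concrete, the sketch below shows roughly what has to happen for a single `tools/call` request arriving as an API Gateway proxy event. It uses only the standard library and is not the MCP SDK: the `TOOLS` registry and the `handle_event` name are hypothetical, and a real server also has to answer `initialize`, `tools/list`, and notifications.

```python
import json

# Hypothetical tool registry for illustration; the SDK builds this from @mcp.tool().
TOOLS = {"echo": lambda message: message}


def handle_event(event, context=None):
    """Answer one JSON-RPC `tools/call` request from an API Gateway proxy event."""
    request = json.loads(event["body"])
    if request.get("method") != "tools/call":
        payload = {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32601, "message": "Method not found"},
        }
    else:
        params = request["params"]
        text = TOOLS[params["name"]](**params.get("arguments", {}))
        payload = {
            "jsonrpc": "2.0",
            "id": request["id"],
            "result": {"content": [{"type": "text", "text": text}], "isError": False},
        }
    # Nothing is kept between invocations: every request carries all it needs.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```

Because the function reads everything from `event` and writes nothing to module state, any Lambda execution environment can serve any request.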

How to Deploy:

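# Bundle the handler with its dependencies; the mcp package is not part of the Lambda runtime.
# Build on Linux (or pass pip's --platform flags) so compiled wheels match Lambda.
pip install --target ./package mcp
(cd package && zip -r ../function.zip .)
zip function.zip lambda_mcp.py

# The mcp package requires Python 3.10 or later, so use a newer runtime than python3.9.
aws lambda create-function \
  --function-name lambdaMcp \
  --runtime python3.12 \
  --handler lambda_mcp.lambda_handler \
  --zip-file fileb://function.zip \
  --role <LAMBDA_IAM_ROLE_ARN> \
  --timeout 900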

Optionally, expose it via API Gateway:

aws apigateway create-rest-api --name mcpAPI
# Configure /mcp POST integration with the Lambda function
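Once the /mcp route exists, a client reaches a tool with a JSON-RPC POST. The sketch below only builds that request with the standard library so the shape is visible. The URL is a placeholder and `build_tool_call` is a hypothetical helper. A full MCP client normally sends `initialize` first, which client libraries handle for you.

```python
import json
import urllib.request


def build_tool_call(url, tool_name, arguments, request_id=1):
    """Build (but do not send) a JSON-RPC tools/call POST for an MCP endpoint."""
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients advertise both JSON and SSE responses.
            "Accept": "application/json, text/event-stream",
        },
    )


# Placeholder URL; substitute the invoke URL API Gateway gives you.
request = build_tool_call("https://example.invalid/prod/mcp", "echo", {"message": "hi"})
```

Passing `request` to `urllib.request.urlopen` would send it. That call is left out here so the example does not depend on a deployed endpoint.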

Benefits:

  • Fast cold starts
  • Simplified deployment for stateless tools
  • Integrated with AWS native monitoring

Limitations:

  • No streaming support
  • 15-minute execution timeout
  • No persistent state between invocations
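One way to live with the 15-minute limit is to stop early and hand back unfinished work. The sketch below assumes the tool processes a list of items. `process_until_deadline` and the 5-second safety margin are illustrative choices, while `get_remaining_time_in_millis()` is the method the Lambda context object provides.

```python
def process_until_deadline(items, context, safety_margin_ms=5_000):
    """Process items until the invocation is close to its timeout.

    Returns (results, remaining_items) so the caller can re-queue what is left.
    """
    results = []
    for index, item in enumerate(items):
        # Ask Lambda how much time this invocation has left before each item.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return results, items[index:]
        results.append(item.upper())  # stand-in for real work
    return results, []
```

Since no state persists between invocations, the remaining items have to travel somewhere external, such as a queue or the response itself.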

4. Lambda + Web Adapter (Containerized MCP)

Approach: Package MCP within a web framework (FastAPI, Flask, or Express) inside a Lambda Web Adapter container. This enables web-like behavior within Lambda.

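Dockerfile:

FROM public.ecr.aws/docker/library/python:3.12-slim
# Lambda Web Adapter runs as a Lambda extension and forwards each invocation as HTTP
# to the server on port 8080 (its default). The tag is an example; pin one you tested.
COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.9.1 /lambda-adapter /opt/extensions/lambda-adapter
WORKDIR /var/task
COPY app.py requirements.txt ./
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

app.py Example:

from mcp.server.fastmcp import FastMCP

# Listen on port 8080, the port the adapter forwards to by default.
mcp = FastMCP("web-mcp", stateless_http=True, port=8080)

@mcp.tool()
def echo(message: str) -> str:
    """Return the message unchanged."""
    return message

if __name__ == "__main__":
    # No lambda_handler is needed: the adapter speaks plain HTTP to this server.
    mcp.run(transport="streamable-http")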

Deploy via AWS CDK Example:

from aws_cdk import (
    aws_lambda as _lambda,
    aws_apigateway as apigw,
    Stack
)
from constructs import Construct

class WebAdapterStack(Stack):
    def __init__(self, scope, id, **kwargs):
        super().__init__(scope, id, **kwargs)

        fn = _lambda.DockerImageFunction(self, "WebMCPFn",
            code=_lambda.DockerImageCode.from_image_asset("path/to/dockerfile")
        )

        apigw.LambdaRestApi(self, "ApiGateway", handler=fn)

Benefits:

  • Allows existing web frameworks
  • Flexible HTTP routing via API Gateway
  • Serverless, pay-per-use

Limitations:

  • Added container and adapter complexity
  • Cold start delays (1–3 seconds)
  • Still no native streaming support
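To see how often the cold-start delay actually hits, a module-level flag is a common trick: module code runs once per execution environment, so the first call after import is the cold one. This is a minimal sketch; `record_invocation` is a hypothetical helper you would call from a request hook or handler.

```python
import time

# Module-level code runs once per execution environment, i.e. on a cold start.
_INIT_STARTED = time.monotonic()
_is_cold = True


def record_invocation():
    """Report whether this call is the first one in this execution environment."""
    global _is_cold
    was_cold = _is_cold
    _is_cold = False
    return {
        "cold_start": was_cold,
        "seconds_since_init": round(time.monotonic() - _INIT_STARTED, 3),
    }
```

Logging the returned dictionary on each request gives a rough cold-start rate without extra tooling.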

5. AWS Fargate (Containerized MCP)

Approach: Fully containerize the MCP server and deploy on AWS Fargate via ECS or EKS. Suitable for agents requiring persistent sessions and streaming [2].

Dockerfile:

# The mcp package requires Python 3.10 or later.
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY mcp_server.py ./
CMD ["python", "mcp_server.py"]

mcp_server.py Example:

from mcp.server.fastmcp import FastMCP

# Bind to 0.0.0.0 so the load balancer can reach the server inside the container.
mcp = FastMCP("fargate-mcp", stateless_http=True, host="0.0.0.0", port=8080)

@mcp.tool()
def echo(message: str) -> str:
    return message

if __name__ == "__main__":
    mcp.run(transport="streamable-http")
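With the streamable HTTP transport, a response may arrive as a `text/event-stream` body rather than a single JSON document. The sketch below shows how a client could split such a body into JSON-RPC messages. `parse_sse_messages` is a hypothetical helper that ignores the `event:` and `id:` fields for brevity, and MCP client libraries do this for you.

```python
import json


def parse_sse_messages(raw):
    """Yield the JSON payload of each complete SSE event in an event-stream body."""
    data_lines = []
    for line in raw.splitlines():
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        elif line == "" and data_lines:
            # A blank line ends the event; multi-line data is joined with newlines.
            yield json.loads("\n".join(data_lines))
            data_lines = []
```

A long-lived container can keep such a stream open for as long as the tool runs, which is the main reason to pick Fargate for streaming workloads.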

CDK Deployment Example:

from aws_cdk import (
    aws_ecs as ecs,
    aws_ecs_patterns as patterns,
    aws_ecr_assets as assets,
    Stack
)
from constructs import Construct

class FargateStack(Stack):
    def __init__(self, scope, id, **kwargs):
        super().__init__(scope, id, **kwargs)

        docker_image = assets.DockerImageAsset(self, "McpImage",
            directory="path/to/dockerfile"
        )

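        service = patterns.ApplicationLoadBalancedFargateService(
            self, "FargateMCPService",
            task_image_options=patterns.ApplicationLoadBalancedTaskImageOptions(
                image=ecs.ContainerImage.from_docker_image_asset(docker_image),
                container_port=8080,  # must match the port in mcp_server.py
            ),
            desired_count=2,
            public_load_balancer=True
        )

        # The MCP server has no route at "/", so point the health check at /mcp
        # and accept non-5xx responses. Adjust to whatever your server returns.
        service.target_group.configure_health_check(
            path="/mcp", healthy_http_codes="200-499"
        )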

Benefits:

  • Full streaming and persistent workloads supported
  • Scalability with ECS or EKS
  • Suitable for production-grade deployments

Limitations:

  • More costly than Lambda for low-usage patterns
  • Slightly longer deploy cycles
  • Requires container orchestration setup

6. Choosing the Right Model

  • Use Native Lambda for testing, short-lived tasks, and low traffic.
  • Add the Web Adapter when integrating with web apps or frameworks.
  • Choose Fargate for streaming, persistent workloads, or higher performance needs [3, 4].
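The rules above can be sketched as a small decision helper. The function name, parameters, and return labels are hypothetical. The only hard number is the 15-minute Lambda limit already mentioned in this article.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # Lambda's maximum execution time


def choose_deployment(needs_streaming, max_task_seconds, has_web_framework):
    """Map the selection criteria from this section onto a deployment model."""
    if needs_streaming or max_task_seconds > LAMBDA_MAX_SECONDS:
        return "fargate"
    if has_web_framework:
        return "lambda-web-adapter"
    return "lambda-native"
```

For example, `choose_deployment(False, 30, True)` selects the Web Adapter, while any task longer than 900 seconds falls through to Fargate regardless of the other inputs.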

7. Key Considerations

  • Security & Observability: Lambda and Fargate integrate with X-Ray, CloudWatch, IAM, and OpenTelemetry [2, 3].
  • Cost & Scaling: Lambda is cost-effective for burst workloads; Fargate favors steady or stream-heavy usage [4].
  • Developer Experience: Native Lambda offers the fastest dev loop; Fargate supports production parity and long-lived workflows [3].
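The cost trade-off can be estimated with a rough model: Lambda bills per request and per GB-second of execution, while Fargate bills per vCPU-hour and GB-hour for as long as tasks run. The sketch below takes every rate as an argument rather than hard-coding prices, so look up current rates for your region. The function names are hypothetical, and data transfer, load balancer, and free-tier effects are ignored.

```python
def lambda_monthly_cost(requests, avg_seconds, memory_gb,
                        price_per_gb_second, price_per_request):
    """Estimate Lambda cost: compute (GB-seconds) plus a per-request charge."""
    compute = requests * avg_seconds * memory_gb * price_per_gb_second
    return compute + requests * price_per_request


def fargate_monthly_cost(tasks, vcpu, memory_gb,
                         price_per_vcpu_hour, price_per_gb_hour, hours=730):
    """Estimate Fargate cost for tasks that run continuously for `hours`."""
    hourly = vcpu * price_per_vcpu_hour + memory_gb * price_per_gb_hour
    return tasks * hourly * hours
```

Evaluating both functions at a few request volumes shows where the curves cross for a given workload: Lambda's cost grows with traffic, while Fargate's stays flat until you add tasks.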

8. Next Steps

  1. Start with a proof-of-concept using native Lambda + FastMCP.
  2. Expand to include frameworks via Web Adapter for structured web API support.
  3. Move to a containerized MCP + agent deployment on Fargate via Strands’ sample projects [1].

References


  1. AWS, “Open Protocols for Agent Interoperability Part 3: Strands Agents & MCP”
  2. Heeki Park, “Building an MCP server as an API developer”
  3. Ran Isenberg, “Serverless MCP on AWS: Lambda vs. Fargate for Agentic AI Workloads”
  4. Vivek V, “Implementing Nova Act MCP Server on ECS Fargate”
