When I joined Pixelpk Technologies as a Senior Software Engineer and Team Lead, one of my primary objectives was to reduce our cloud infrastructure costs while maintaining—and ideally improving—performance. Over the course of the year, we achieved a 40% reduction in operational costs by migrating to a serverless architecture.
Here’s exactly how we did it, with real code, actual numbers, and lessons learned from production.
The Starting Point
Our legacy architecture:
- 3 EC2 t3.large instances (24/7 uptime): ~$180/month
- RDS PostgreSQL db.t3.medium: ~$120/month
- Load balancer: ~$20/month
- Data transfer: ~$80/month
- CloudWatch logs: ~$30/month
- Total: ~$430/month (not counting development environments)
For our traffic pattern (50-100 requests/minute with spikes to 500, roughly 2-4 million requests a month), we were drastically over-provisioned.
The Serverless Architecture
Lambda Functions + API Gateway
We broke our monolithic application into focused Lambda functions:
// Before: Single Express app handling everything
app.get("/users/:id", getUserHandler);
app.post("/users", createUserHandler);
app.get("/posts/:id", getPostHandler);
// ... 50+ routes
// After: Individual Lambda functions
// users/get.ts
import { APIGatewayProxyHandler } from "aws-lambda";
import { DynamoDB } from "aws-sdk";

const dynamodb = new DynamoDB.DocumentClient();

export const handler: APIGatewayProxyHandler = async (event) => {
  const { id } = event.pathParameters || {};

  if (!id) {
    return {
      statusCode: 400,
      body: JSON.stringify({ error: "User ID required" }),
    };
  }

  try {
    const user = await dynamodb
      .get({
        TableName: "Users",
        Key: { id },
      })
      .promise();

    return {
      statusCode: 200,
      headers: {
        "Content-Type": "application/json",
        "Cache-Control": "public, max-age=300", // 5 min cache
      },
      body: JSON.stringify(user.Item),
    };
  } catch (error) {
    console.error("Error fetching user:", error);
    return {
      statusCode: 500,
      body: JSON.stringify({ error: "Internal server error" }),
    };
  }
};
Why This Saves Money
- Pay Per Request: No idle time costs
- Auto-scaling: No over-provisioning
- Regional Edge Caching: API Gateway caching reduces invocations
- Optimized Memory: Right-sized functions (we found 512MB was perfect)
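The memory right-sizing mentioned above is just a per-function setting. Here is a minimal serverless.yml sketch; the function name follows the examples later in this post, and 512MB is the value we landed on:

# serverless.yml (sketch)
functions:
  getUser:
    handler: users/get.handler
    memorySize: 512 # down from the 1024MB we started with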
Cost Breakdown After Migration
- Lambda invocations (2M/month at $0.20 per 1M): ~$0.40
- Lambda duration (150,000 GB-seconds): ~$2.50
- API Gateway (2M requests): ~$7.00
- DynamoDB (on-demand, 1M reads, 200K writes): ~$1.50
- S3 (storage + requests): ~$5.00
- CloudWatch: ~$10.00
- Data transfer: ~$15.00
- Total: ~$41.40/month
Savings: $388.60/month (90% reduction)
But that’s not the full picture. Let’s talk about the hidden costs and optimizations.
Challenge #1: Cold Starts
Cold starts were killing our user experience. Initial response times:
- Cold start: 2-3 seconds
- Warm start: 50-100ms
Solution: Provisioned Concurrency
# serverless.yml
functions:
  getUser:
    handler: users/get.handler
    provisionedConcurrency: 2 # Keep 2 instances warm
    reservedConcurrency: 100 # Max concurrent executions
This added ~$30/month but reduced P95 latency from 2.1s to 120ms.
Solution: Lambda Layers for Dependencies
# Before: 50MB deployment package
# After: 5MB function + 45MB shared layer

# serverless.yml
layers:
  commonDependencies:
    path: layers/common
    name: ${self:service}-common-deps
    description: Shared dependencies
    retain: true

functions:
  getUser:
    handler: users/get.handler
    layers:
      - { Ref: CommonDependenciesLambdaLayer }
Result: Cold start time reduced from 3s to 800ms.
Solution: Webpack Bundling & Tree Shaking
// webpack.config.js
module.exports = {
  entry: "./src/handler.ts",
  target: "node",
  mode: "production",
  optimization: {
    minimize: true,
    usedExports: true, // Tree shaking
  },
  externals: {
    "aws-sdk": "aws-sdk", // Don't bundle AWS SDK
  },
};
Result: Package size reduced by 65%.
Challenge #2: DynamoDB Single-Table Design
Moving from PostgreSQL’s relational model to DynamoDB required rethinking our data model.
The Single-Table Pattern
-- Before (PostgreSQL): Multiple tables with joins
SELECT u.*, p.* FROM users u
LEFT JOIN posts p ON u.id = p.user_id
WHERE u.id = ?

// After (DynamoDB): Single table with composite keys
interface TableSchema {
  PK: string // Partition Key: USER#123 or POST#456
  SK: string // Sort Key: PROFILE or META#2024-01
  GSI1PK: string // For alternate access patterns
  GSI1SK: string
  Type: string // USER | POST | COMMENT
  // ... entity-specific attributes
}
// Query pattern examples:

// Get user profile
const user = await dynamodb
  .get({
    TableName: "AppData",
    Key: {
      PK: "USER#123",
      SK: "PROFILE",
    },
  })
  .promise();

// Get user's posts
const posts = await dynamodb
  .query({
    TableName: "AppData",
    KeyConditionExpression: "PK = :pk AND begins_with(SK, :sk)",
    ExpressionAttributeValues: {
      ":pk": "USER#123",
      ":sk": "POST#",
    },
  })
  .promise();
Access Pattern Planning
// Define all access patterns upfront
const AccessPatterns = {
  getUserById: "PK = USER#{id}, SK = PROFILE",
  getUserPosts: "PK = USER#{id}, SK begins_with POST#",
  getPostById: "PK = POST#{id}, SK = META",
  getPostsByDate: "GSI1PK = POSTS, GSI1SK = DATE#{date}",
  getCommentsByPost: "PK = POST#{id}, SK begins_with COMMENT#",
};
// Build helper functions
class DynamoDBHelper {
  private dynamodb = new DynamoDB.DocumentClient();

  async getUserWithPosts(userId: string) {
    const result = await this.dynamodb
      .query({
        TableName: "AppData",
        KeyConditionExpression: "PK = :pk",
        ExpressionAttributeValues: {
          ":pk": `USER#${userId}`,
        },
      })
      .promise();

    // Separate items by type
    const user = result.Items?.find((i) => i.SK === "PROFILE");
    const posts = result.Items?.filter((i) => i.SK.startsWith("POST#"));
    return { user, posts };
  }
}
DynamoDB Cost Optimization
// Use batch operations whenever possible
const batchGet = async (userIds: string[]) => {
  const chunks = chunk(userIds, 100); // chunk() from lodash; BatchGetItem allows at most 100 keys per request
  const results = await Promise.all(
    chunks.map((batch) =>
      dynamodb
        .batchGet({
          RequestItems: {
            AppData: {
              Keys: batch.map((id) => ({
                PK: `USER#${id}`,
                SK: "PROFILE",
              })),
            },
          },
        })
        .promise()
    )
  );
  return results.flatMap((r) => r.Responses?.AppData || []);
};
// Use projection expressions to reduce data transfer
const getUser = async (id: string) => {
  return dynamodb
    .get({
      TableName: "AppData",
      Key: { PK: `USER#${id}`, SK: "PROFILE" },
      ProjectionExpression: "id, #name, email, createdAt",
      ExpressionAttributeNames: {
        "#name": "name", // 'name' is a reserved word
      },
    })
    .promise();
};
Challenge #3: API Gateway Optimization
Enable Caching
# serverless.yml
provider:
  apiGateway:
    caching:
      enabled: true
      ttlInSeconds: 300 # 5 minutes

functions:
  getUser:
    handler: users/get.handler
    events:
      - http:
          path: users/{id}
          method: get
          caching:
            enabled: true
            ttlInSeconds: 300
            cacheKeyParameters:
              - name: request.path.id
Result: 70% of requests were served from the cache, reducing Lambda invocations by the same proportion.
Request Validation
functions:
  createUser:
    handler: users/create.handler
    events:
      - http:
          path: users
          method: post
          request:
            schemas:
              application/json: ${file(schemas/create-user.json)}
This prevents Lambda invocations for invalid requests, saving costs and improving security.
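The schemas/create-user.json file itself isn't shown here; as a rough illustration only (the field names are hypothetical, not our actual model), a request schema for this endpoint could look like:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "CreateUserRequest",
  "type": "object",
  "required": ["email", "name"],
  "properties": {
    "email": { "type": "string" },
    "name": { "type": "string", "minLength": 1 }
  },
  "additionalProperties": false
}

API Gateway evaluates the schema before invoking the function, so malformed requests are rejected with a 400 without costing an invocation.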
Response Compression
// Return a gzip-compressed response through API Gateway
import { gzipSync } from "zlib";

export const handler: APIGatewayProxyHandler = async (event) => {
  const data = await getLargeDataset();
  const compressed = gzipSync(JSON.stringify(data));

  return {
    statusCode: 200,
    headers: {
      "Content-Type": "application/json",
      "Content-Encoding": "gzip",
    },
    // API Gateway decodes this back to binary (binaryMediaTypes must allow it)
    body: compressed.toString("base64"),
    isBase64Encoded: true,
  };
};
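If you'd rather not handle compression in code, API Gateway can also compress responses itself once a minimum payload size is configured; roughly:

# serverless.yml
provider:
  apiGateway:
    minimumCompressionSize: 1024 # compress responses over 1KB when the client sends Accept-Encoding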
Challenge #4: Observability & Debugging
Serverless makes debugging harder. Here’s our solution:
Structured Logging
import { Logger } from "@aws-lambda-powertools/logger";

const logger = new Logger({
  serviceName: "user-service",
  logLevel: "INFO",
});

export const handler: APIGatewayProxyHandler = async (event) => {
  logger.appendKeys({ requestId: event.requestContext.requestId });
  logger.info("Processing user request", {
    userId: event.pathParameters?.id,
    method: event.httpMethod,
  });

  const startTime = Date.now();

  try {
    const result = await processRequest(event);
    logger.info("Request completed successfully", {
      duration: Date.now() - startTime,
    });
    return result;
  } catch (error) {
    logger.error("Request failed", {
      error: (error as Error).message,
      stack: (error as Error).stack,
    });
    throw error;
  }
};
Custom CloudWatch Metrics
import { MetricUnits, Metrics } from "@aws-lambda-powertools/metrics";

const metrics = new Metrics({
  namespace: "UserService",
  serviceName: "user-api",
});

export const handler = async (event: APIGatewayProxyEvent) => {
  metrics.addMetric("UserRequests", MetricUnits.Count, 1);
  const startTime = Date.now();

  try {
    const result = await getUserById(event.pathParameters?.id);
    metrics.addMetric(
      "UserRequestDuration",
      MetricUnits.Milliseconds,
      Date.now() - startTime
    );
    metrics.addMetric("UserRequestSuccess", MetricUnits.Count, 1);
    return result;
  } catch (error) {
    metrics.addMetric("UserRequestFailure", MetricUnits.Count, 1);
    throw error;
  } finally {
    metrics.publishStoredMetrics();
  }
};
X-Ray Tracing
import AWSXRay from "aws-xray-sdk-core";
import AWSSDK from "aws-sdk";

// captureAWS instruments the whole SDK (a DocumentClient can't be wrapped directly)
const AWS = AWSXRay.captureAWS(AWSSDK);
const dynamodb = new AWS.DynamoDB.DocumentClient();
// Now all DynamoDB calls are automatically traced
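Beyond AWS calls, you can open custom subsegments around your own code paths. A minimal sketch, where enrichUser is a hypothetical helper of ours, not an X-Ray API:

// Trace a custom (non-AWS) step as its own subsegment
const subsegment = AWSXRay.getSegment()?.addNewSubsegment("enrichUser");
try {
  return await enrichUser(user); // hypothetical application code
} catch (err) {
  subsegment?.addError(err as Error);
  throw err;
} finally {
  subsegment?.close();
}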
Challenge #5: Local Development
Using Serverless Offline
# serverless.yml
plugins:
  - serverless-offline
  - serverless-dynamodb-local

custom:
  serverless-offline:
    httpPort: 3000
  dynamodb:
    stages:
      - dev
    start:
      port: 8000
      inMemory: true
      migrate: true

# Start local environment
$ npm run dev
# Serverless runs on http://localhost:3000
# DynamoDB runs on http://localhost:8000
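One piece the config above doesn't show is pointing the Lambda code at the local DynamoDB when it runs offline; a small sketch (serverless-offline sets IS_OFFLINE=true in its environment):

// db.ts: pick the local DynamoDB endpoint when running under serverless-offline
import { DynamoDB } from "aws-sdk";

export const dynamodb = process.env.IS_OFFLINE
  ? new DynamoDB.DocumentClient({
      region: "localhost",
      endpoint: "http://localhost:8000",
    })
  : new DynamoDB.DocumentClient();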
The Complete Cost Comparison
Before (Traditional EC2)
EC2 (3x t3.large): $180/month
RDS (db.t3.medium): $120/month
Load Balancer: $20/month
Data Transfer: $80/month
CloudWatch: $30/month
-----------------------------------
Total: $430/month
Annual: $5,160
After (Serverless)
Lambda: $2.90/month
API Gateway: $7.00/month
DynamoDB: $1.50/month
S3: $5.00/month
CloudWatch: $10.00/month
Data Transfer: $15.00/month
-----------------------------------
Total: $41.40/month
Annual: $496.80
Annual Savings: $4,663.20 (90% reduction)
But remember:
- Added $30/month for provisioned concurrency
- Development effort: ~2 months
- ROI: Break-even in 3 months
Key Takeaways
- Right-Size Everything: We started with 1024MB of Lambda memory and found 512MB was perfect (a 50% cost reduction)
- Use Caching Aggressively: API Gateway caching reduced our Lambda invocations by 70%
- Single-Table Design: Takes time to learn but dramatically reduces DynamoDB costs
- Monitor Cold Starts: Use provisioned concurrency strategically, not everywhere
- Batch Operations: DynamoDB batch operations are your friend
- Infrastructure as Code: The Serverless Framework made deployments consistent and repeatable
Common Pitfalls to Avoid
- Over-Provisioning: Don’t set provisioned concurrency on every function
- Ignoring Cold Starts: Profile first, optimize selectively
- Not Using Layers: Shared dependencies save deployment time and cold starts
- Ignoring DynamoDB Design: Single-table design is worth learning
- Insufficient Monitoring: CloudWatch metrics are essential
When NOT to Use Serverless
Serverless isn’t always the answer:
- Long-running tasks (>15 minutes Lambda limit)
- Consistent high throughput (EC2 might be cheaper)
- Websocket-heavy applications (consider Fargate)
- Complex state management (consider containers)
For us, with sporadic traffic and clear separation of concerns, serverless was perfect.
Considering serverless for your project? I’m happy to discuss your specific use case. Feel free to reach out.