Every engineer working in serverless has felt the sting of a cold start at least once. There you are, full of anticipation, hitting an API endpoint only to stare at your screen, twiddling your thumbs, wondering if AWS Lambda has gone on holiday. When the response finally arrives, it’s often paired with frustration, confusion, and a twinge of embarrassment if a team member was watching over your shoulder. For us .NET and Java developers, this problem can be particularly pronounced.
I know of entire organizations that have rewritten their whole stack because of this challenge.
Cold starts aren’t just a curiosity—they’re an architectural bottleneck, a design constraint, and a UX reality all at once. But like most “impossible” problems in distributed systems, once you peel away the mystery, a set of actionable strategies emerges. Let’s dig into why .NET Lambda cold starts loom so large, and what you can do—today—to whittle those waits away.
Memory Allocation: The Undervalued Lever
There’s a certain badge of honor in minimizing your Lambda function’s memory allocation. It feels frugal, architectural, green. But here’s the kicker: your memory setting isn’t just about RAM. AWS Lambda ties your CPU allocation directly to how much memory you request. Pick the smallest memory footprint, and you’ll get a low-powered CPU as well.
That 128 MB Lambda function? It might be cheap, but it’s also starved for compute. That translates, under the covers, to lethargic cold starts—exactly the pain you see when the clock ticks and nothing seems to happen.
Start with a gigabyte. I know, I know, it feels excessive, but you’ve got to think in Lambda’s logic. Jumping from 128 MB to 1 GB can transform a 9.2 second cold start into a 1.2 second response for .NET Lambda. Even at a glance, it’s clear that frugality can cost you more—both in user patience and, depending on your workload, in real money.
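If you define your functions with AWS SAM, bumping the memory is a one-line change in your template. A minimal sketch (the function name and handler below are placeholders, not from any real project):

```yaml
# template.yaml (AWS SAM) -- function name and handler are placeholders
Resources:
  MyDotnetFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: dotnet8
      Handler: MyLambda::MyLambda.Function::Handler
      MemorySize: 1024   # 1 GB: more RAM also means proportionally more vCPU
      Timeout: 30
```

The `MemorySize` property is the lever—everything else stays the same.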
Still skeptical or want to optimize further? Try the open-source AWS Lambda Power Tuning tool. Let it hammer your function with various memory allocations. More often than not, you’ll see a sweet spot emerge where performance improves and costs drop further than intuition suggests.
Initialization: The Secret to Fast Successive Invocations
The first invoke pays for all your mistakes. SDK initialization, secrets fetching, client construction—pack those all into your handler and you’ll relive cold start pain every single time. Systems thinking means designing for the lifecycle Lambda gives you: exploiting the “execution environments” that stick around for future requests.
In plain speak: anything that can be set up once and reused across invocations should be. In .NET, that means initializing heavy objects (like AWS SDK clients or configuration data) in the constructor or a startup class, rather than every single time the handler is called. .NET’s Lambda Annotations Framework is brilliant for this pattern. As well as simplifying your handler code, it also brings in familiar dependency injection principles.
The mechanics are simple. Suppose you’re using the modern Lambda Annotations-based model:
public class MyLambdaFunction(IAmazonDynamoDB ddbClient, HttpClient httpClient, AppConfig config)
{
    [LambdaFunction]
    public async Task<string> Handler(APIGatewayProxyRequest request)
    {
        // Handler code here
    }
}
You’d have a Startup.cs file that looks something like this:
[LambdaStartup]
public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Singletons are constructed once per execution environment
        // and reused across invocations.
        services.AddSingleton<IAmazonDynamoDB>(new AmazonDynamoDBClient());
        services.AddHttpClient();

        // Fetch the secret once, during init, not on every request.
        var apiKey = FetchSecretSync();
        services.AddSingleton(new AppConfig
        {
            ApiKey = apiKey,
            TableName = Environment.GetEnvironmentVariable("TABLE_NAME") ?? "Products"
        });
    }

    private string FetchSecretSync()
    {
        try
        {
            using var secretsClient = new AmazonSecretsManagerClient();
            var secretResponse = secretsClient.GetSecretValueAsync(new GetSecretValueRequest
            {
                SecretId = Environment.GetEnvironmentVariable("SECRET_NAME") ?? "my-api-key"
            }).GetAwaiter().GetResult();
            return secretResponse.SecretString ?? "default-key";
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Could not fetch secret: {ex.Message}. Using default.");
            return "default-key";
        }
    }
}
Objects get initialized once per execution environment, rather than on every invocation. That means after the cold start, follow-up requests can be lightning fast—single-digit millisecond response times aren’t uncommon.
What’s so powerful here is that Lambda’s initialization phase (the init duration) gets a CPU boost behind the scenes: if your memory setting entitles you to, say, 1.5 vCPUs, Lambda rounds up and gives you 2 during init. That’s a “free” burst you can harness for heavy startup tasks.
The knock-on effect is that you also end up with more testable handlers.
Native AOT: Henry Ford for Your Lambdas
For greenfield APIs and microservices, there’s a tool in the .NET arsenal that’s no longer new: Native Ahead-Of-Time (AOT) compilation. .NET typically relies on a just-in-time (JIT) compiler, converting IL code into machine instructions while the Lambda spins up. Native AOT flips that on its head, compiling down to a native binary ahead of time. The cold start penalty from the JIT almost vanishes.
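Opting in is mostly a project-file change. A minimal sketch (these are the standard .NET 8 MSBuild properties; the optional ones are assumptions about what most Lambda projects will want, and trimming settings vary by project):

```xml
<!-- MyLambda.csproj: opt in to Native AOT publishing -->
<PropertyGroup>
  <TargetFramework>net8.0</TargetFramework>
  <OutputType>Exe</OutputType>
  <PublishAot>true</PublishAot>
  <!-- Optional: smaller native binaries -->
  <StripSymbols>true</StripSymbols>
  <InvariantGlobalization>true</InvariantGlobalization>
</PropertyGroup>
```

Note that the native build must happen on (or in a container matching) Amazon Linux, since AOT compiles for a specific OS and architecture.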
Caveats loom. Reflection-based features don’t play well with AOT, so you’ll need to retool serialization and similar patterns. For example, JSON serialization/deserialization must use source generators. Here’s a rough sketch of what that looks like in code:
[JsonSerializable(typeof(APIResponse))]
[JsonSerializable(typeof(APIGatewayProxyRequest))]
[JsonSerializable(typeof(APIGatewayProxyResponse))]
public partial class LambdaJsonContext : JsonSerializerContext
{
}
Switching to source-generated serialization means you must explicitly list the types you want handled—no wildcards. But the performance benefits speak for themselves: cold starts drop to as little as 300 milliseconds, total. For a modern API, that rivals the fastest interpreted runtimes out there—go ahead, compare it to Node.js or Go. Alright, alright, Rust developers, calm down: you’re still the most performant Lambda runtime.
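To actually use that context, you point the Lambda runtime at a source-generator-aware serializer. A sketch of the executable-assembly bootstrap pattern, assuming the `Amazon.Lambda.RuntimeSupport` and `Amazon.Lambda.Serialization.SystemTextJson` packages (the handler body here is a trivial placeholder):

```csharp
using Amazon.Lambda.APIGatewayEvents;
using Amazon.Lambda.Core;
using Amazon.Lambda.RuntimeSupport;
using Amazon.Lambda.Serialization.SystemTextJson;

public class Function
{
    private static async Task Main()
    {
        // SourceGeneratorLambdaJsonSerializer uses the compile-time
        // LambdaJsonContext instead of reflection, which is what makes
        // it compatible with Native AOT.
        await LambdaBootstrapBuilder
            .Create<APIGatewayProxyRequest, APIGatewayProxyResponse>(
                Handler,
                new SourceGeneratorLambdaJsonSerializer<LambdaJsonContext>())
            .Build()
            .RunAsync();
    }

    private static Task<APIGatewayProxyResponse> Handler(
        APIGatewayProxyRequest request, ILambdaContext context)
    {
        return Task.FromResult(new APIGatewayProxyResponse
        {
            StatusCode = 200,
            Body = "ok"
        });
    }
}
```

The `Main` entry point replaces the class-library handler string you’d use with the managed runtime—AOT Lambdas ship as self-contained executables.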
It All Adds Up: A Systems View
Serverless isn’t “magic.” It’s a set of trade-offs, many of them masked by abstraction. But as you can see, the cold start challenge is best tackled not by a single fix, but by layering solutions: appropriate resource allocation, smart initialization, and native compilation.
It’s systems thinking in action—a holistic approach that sees cold start latency as the emergent property of resource allocation, object lifetimes, and compilation mode.
Want to See This in Action?
Seeing is believing. If you’re hungry for benchmarks, concrete configuration steps, and even more technical deep dives, check out the video version of this post here.