Streaming Results from Entity Framework Core and Web API Core

Full source code here. The code provided will not compile until you make a change in seeder.cs, the way it’s written it generates 500,000 rows in a local db. Set this to whatever value you want.

In this post I’m going to show you how to return an unlimited number of results from a database via a Web API application while keeping your memory usage low and constant. In effect, you are going to stream results from the database with Entity Framework.

To achieve this you need to do two things, disable tracking and return a not materialize your data inside the action method.

Turn off tracking

By default Entity Framework tracks entities that you read from a database. Tracking allows EF to determine what, if any, properties have changed since being loaded, then EF can save just the relevant changes. But tracking takes up memory and CPU.

If you have no intention of changing these entities there is no point in tracking them.

There are two ways of doing this, at point where you make the request to the database, or globally for the whole context.

To use AsNoTracking for a single request it looks like this - _salesContext.Products.AsNoTracking().Where(..)

To use it for all requests to that context set AsNoTracking in the constructor of the context.

1public SalesContext(DbContextOptions<SalesContext> options) : base(options)
2{
3	ChangeTracker.QueryTrackingBehavior = QueryTrackingBehavior.NoTracking;
4}

That’s the first step taken care of, now we make sure not to turn our data into objects inside the controller.

Do not materialize

If you follow tutorials on Entity Framework Core and Web API you will see examples like this -

1[HttpGet]
2public async Task<ActionResult> Get()
3{
4    List<Product> products = await _salesContext.Products.ToListAsync();
5
6    return Ok(products);
7}

In this example, all the products in the database are read and put into a list of products, the return does not execute until after all the data has been read, so you are waiting for this to complete before getting any results. You are also storing the whole list of products in memory (you are probably also tracking the entities).

Instead of that you can do the following -

1[HttpGet]
2public ActionResult GetStreaming()
3{
4	IQueryable<Product> products = _salesContext.Products.AsNoTracking();
5
6	return Ok(products);
7}

Now, the products do not materialize, there is no list to store in memory, tracking is turned off and the action method begins returning data almost immediately.

Your memory profile will be almost same whether you are returning a few hundred or a few million rows of data.

Example of Memory Usage

Below are screenshots from Visual Studio 2017 of the application running showing the amount of memory consumed for a variety of requests. The first shows the memory usage when I requested a single row from the database

The following images show the memory usage when streaming and not streaming results sets of 100,000, 200,000, 300,000 and 500,000.

[caption id=“attachment_1370” align=“aligncenter” width=“928”]

106 MB used for a single row[/caption]

 

[caption id=“attachment_1372” align=“aligncenter” width=“640”]

100,000 rows returned. On the left is with streaming, on the right without.[/caption]

 

[caption id=“attachment_1373” align=“aligncenter” width=“877”]

200,000 rows returned. On the left is with streaming, on the right without.[/caption]

 

[caption id=“attachment_1374” align=“aligncenter” width=“911”]

300,000 rows returned. On the left is with streaming, on the right without.[/caption]

 

[caption id=“attachment_1375” align=“aligncenter” width=“944”]

500,000 rows returned. On the left is with streaming, on the right without.[/caption]

 

[caption id=“attachment_1379” align=“aligncenter” width=“539”]

Summary of Results[/caption]

It’s obvious that streaming maintains a uniform memory footprint while the memory consumed for non-streaming grows with the number of rows returned.

What you don’t see here is that streaming requests complete more quickly. If you download the attached code and use a local mdf file, you probably won’t see much of a difference in speed, but if your application connects to remote database you will see approximately a 20% improvement in speed.

Summary

For streaming to work you need to turn of tracking of entities and return an IQueryable from the action method.

Full source code here.  
The code provided will not compile until you make a change in seeder.cs, the way it’s written it generates 500,000 rows in a local db. Set this to whatever value you want.

comments powered by Disqus

Related