Streaming Results from Entity Framework Core and Web API Core – Part 2

Full source code available here.

Some time ago I wrote a post showing how to stream results from Entity Framework over Web API. This approach a few benefits – the results would not be materialized in the API code, a small amount of memory would be used irrespective of the size of the data returned, the results would being steaming as soon as possible and the speed of the request was faster than doing something like .ToList() or .ToListAsync().

In the example code I directly accessed the DbContext from the API controller, a reader of the blog got in touch to ask the database access code could be kept away from the API controller. This blog post shows a simple way of achieving this.

In the related post I had a method like this –

[HttpGet("streaming/")]
public IActionResult GetStreaming()
{
    IQueryable<Product> products = _salesContext.Products.AsNoTracking();
    return Ok(products);
}

The _salesContext was passed into the constructor of the ProductsController class using dependency injection and access directly.

In this post I’m going to pass a ProductsService into the ProductsController and use it to make the request to the database.

Here is how the controller now looks –

    [Route("api/[controller]")]
    [ApiController]
    public class ProductsController : ControllerBase
    {
        private readonly ProductsService _productsService;

        public ProductsController(ProductsService productsService)
        {
            _productsService = productsService;
        }

        [HttpGet("streamingFromService/{count}")]
        public ActionResult GetStreamingFromService(int count)
        {
            return Ok(_productsService.GetStreaming(count));
        }

        [HttpGet("streamingFromServiceWithProjection/{count}")]
        public ActionResult GetStreamingFromServiceWithProjection(int count)
        {
            return Ok(_productsService.GetStreamingWithProjection(count));
        }

        [HttpGet("nonStreamingFromService/{count}")]
        public async Task<ActionResult> GetNonStreamingFromService(int count)
        {
            return Ok(await _productsService.GetNonStreaming(count));
        }
    }

The ProductService is very simple, it makes the relevant calls to the database and returns the Products or ProductModels. For good measure I’ve added Automapper to project the Product to ProductModel, separating the underlying data type from the one presented to the caller.

    public class ProductsService
    {
        private readonly SalesContext _salesContext;
        private readonly IMapper _mapper;

        public ProductsService(SalesContext salesContext, IMapper mapper)
        {
            _salesContext = salesContext;
            _mapper = mapper;
        }

        public IQueryable<Product> GetStreaming(int count)
        {
            IQueryable<Product> products = _salesContext.Products.OrderBy(o => o.ProductId).Take(count).AsNoTracking();
            return products;
        }

        public IQueryable<ProductModel> GetStreamingWithProjection(int count)
        {
            var productModels = _mapper.ProjectTo<ProductModel>(_salesContext.Products.OrderBy(o => o.ProductId).Take(count).AsNoTracking());
            return productModels;
        }

        public async Task<IList<Product>> GetNonStreaming(int count)
        {
            IList<Product> products = await _salesContext.Products.OrderBy(o => o.ProductId).Take(count).ToListAsync();
            return products;
        }
    }

Now, some people might be concerned about returning an IQueryable, if you are see this post by Mark Seemann https://blog.ploeh.dk/2012/03/26/IQueryableTisTightCoupling/ to start digging into the topic.

While writing this I came across what seems like a severe performance bug in Entity Framework Core 3.x, I’m working on another post which will cover this is detail.

Full source code available here.

Streaming Results from Entity Framework Core and Web API Core

Full source code here.
The code provided will not compile until you make a change in seeder.cs, the way it’s written it generates 500,000 rows in a local db. Set this to whatever value you want.

In this post I’m going to show you how to return an unlimited number of results from a database via a Web API application while keeping your memory usage low and constant. In effect, you are going to stream results from the database with Entity Framework.

To achieve this you need to do two things, disable tracking and return a not materialize your data inside the action method.

Turn off tracking
By default Entity Framework tracks entities that you read from a database. Tracking allows EF to determine what, if any, properties have changed since being loaded, then EF can save just the relevant changes. But tracking takes up memory and CPU.

If you have no intention of changing these entities there is no point in tracking them.

There are two ways of doing this, at point where you make the request to the database, or globally for the whole context.

To use AsNoTracking for a single request it looks like this –
_salesContext.Products.AsNoTracking().Where(..)

To use it for all requests to that context set AsNoTracking in the constructor of the context.

public SalesContext(DbContextOptions<SalesContext> options) : base(options)
{
	ChangeTracker.QueryTrackingBehavior = QueryTrackingBehavior.NoTracking;
}

That’s the first step taken care of, now we make sure not to turn our data into objects inside the controller.

Do not materialize
If you follow tutorials on Entity Framework Core and Web API you will see examples like this –

[HttpGet]
public async Task<ActionResult> Get()
{
    List<Product> products = await _salesContext.Products.ToListAsync();

    return Ok(products);
}

In this example, all the products in the database are read and put into a list of products, the return does not execute until after all the data has been read, so you are waiting for this to complete before getting any results. You are also storing the whole list of products in memory (you are probably also tracking the entities).

Instead of that you can do the following –

[HttpGet]
public ActionResult GetStreaming()
{
	IQueryable<Product> products = _salesContext.Products.AsNoTracking();

	return Ok(products);
}

Now, the products do not materialize, there is no list to store in memory, tracking is turned off and the action method begins returning data almost immediately.

Your memory profile will be almost same whether you are returning a few hundred or a few million rows of data.

Example of Memory Usage
Below are screenshots from Visual Studio 2017 of the application running showing the amount of memory consumed for a variety of requests. The first shows the memory usage when I requested a single row from the database

The following images show the memory usage when streaming and not streaming results sets of 100,000, 200,000, 300,000 and 500,000.

106 MB used for a single row

 

100,000 rows returned. On the left is with streaming, on the right without.

 

200,000 rows returned. On the left is with streaming, on the right without.

 

300,000 rows returned. On the left is with streaming, on the right without.

 

500,000 rows returned. On the left is with streaming, on the right without.

 

Summary of Results

It’s obvious that streaming maintains a uniform memory footprint while the memory consumed for non-streaming grows with the number of rows returned.

What you don’t see here is that streaming requests complete more quickly. If you download the attached code and use a local mdf file, you probably won’t see much of a difference in speed, but if your application connects to remote database you will see approximately a 20% improvement in speed.

Summary
For streaming to work you need to turn of tracking of entities and return an IQueryable from the action method.

Full source code here.
The code provided will not compile until you make a change in seeder.cs, the way it’s written it generates 500,000 rows in a local db. Set this to whatever value you want.