[Solved] Entity Framework Core - storing/querying multilingual records in the database efficiently

EverSQL Database Performance Knowledge Base


I'm building an application that must support more than one language.

Therefore, some of the records in my database need a separate version for each language.

I will first explain how I currently achieve this: consider an entity called Region which represents a geographical location and simply has a name.

I would design my entities like this:

public class Region
{
    public int Id { get; set; }

    public List<RegionLanguage> Languages { get; set; }
}

public class RegionLanguage
{
    public Region Region { get; set; } // Parent record this language applies to
    public string CultureCode { get; set; } // Will store culture code such as en-US or fr-CA

    // This column/property will be in the language specified by CultureCode
    public string Name { get; set; }
}

From a database perspective, this works great because it's infinitely scalable to any number of records. However, because of the way Entity Framework Core loads related data, it becomes less scalable.

Using the above structure, I can query a Region and generate a view model based on specific culture information:

var region = _context.Regions.Where(e => e.Id == 34)
                     .Include(e => e.Languages)
                     .Single();

var viewModel = new RegionViewModel
{
    // "en-US" would be dynamic based on the user's current language preference
    Name = region.Languages.FirstOrDefault(e => e.CultureCode == "en-US")?.Name
};

As you can see, this is inefficient: I have to include ALL language records for the entity I'm fetching and then search for the correct language in memory, when I actually need only one. It gets even worse when I fetch a list of Regions, because the query then returns a large amount of unnecessary data.

Fetching only the matching language is possible in SQL directly, simply by adding an extra condition to the join:

select * 
from Regions 
left join RegionLanguage on (RegionLanguage.Region = Regions.Id and RegionLanguage.CultureCode = 'en-US')

However, to my understanding, this is not possible to do natively from Entity Framework Core without using a RawQuery (EF: Include with where clause)

So that begs the question: is there a better way to achieve multilingual records in the database using EF Core? Or should I just continue with my approach and hope that EF Core implements Include filtering by the time my application actually needs it (I'll admit I might be optimizing slightly prematurely, but I'm genuinely curious if there is a better way to achieve this).
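
For comparison, one approach that may avoid loading every language row is to project straight into the view model instead of using Include. The sketch below is only an illustration under assumptions: the RegionQueries helper, its ToViewModels method, and the Id property on RegionViewModel are hypothetical names that do not appear in the question. It is written against IQueryable so it also runs over an in-memory list; with EF Core you would pass _context.Regions, and the provider would normally be expected to turn the inner Where into SQL rather than filtering in memory, though the generated SQL is worth checking.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Region
{
    public int Id { get; set; }
    public List<RegionLanguage> Languages { get; set; } = new List<RegionLanguage>();
}

public class RegionLanguage
{
    public Region Region { get; set; }
    public string CultureCode { get; set; }
    public string Name { get; set; }
}

// Assumption: the question only shows a Name property; Id is added for illustration.
public class RegionViewModel
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper, not part of EF Core.
public static class RegionQueries
{
    // Projects each region to a view model, selecting only the name for the
    // requested culture. Name is null when no row exists for that culture.
    public static IQueryable<RegionViewModel> ToViewModels(
        IQueryable<Region> regions, string cultureCode)
    {
        return regions.Select(r => new RegionViewModel
        {
            Id = r.Id,
            Name = r.Languages
                    .Where(l => l.CultureCode == cultureCode)
                    .Select(l => l.Name)
                    .FirstOrDefault()
        });
    }
}
```

A call might look like RegionQueries.ToViewModels(_context.Regions.Where(e => e.Id == 34), "en-US").FirstOrDefault(). Separately, EF Core 5.0 and later accept a Where call inside Include, for example Include(e => e.Languages.Where(l => l.CultureCode == "en-US")), which addresses the Include filtering mentioned above.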

How to optimize this SQL query?

The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:

  1. Description of the steps you can take to speed up the query.
  2. The optimal indexes for this query, which you can copy and create in your database.
  3. An automatically re-written query you can copy and execute in your database.
The optimization process and recommendations:
  1. Avoid Selecting Unnecessary Columns (query line: 2): Avoid selecting all columns with the '*' wildcard, unless you intend to use them all. Selecting redundant columns may result in unnecessary performance degradation.
  2. Create Optimal Indexes (modified query below): The recommended indexes are an integral part of this optimization effort and should be created before testing the execution duration of the optimized query.
Optimal indexes for this query:
ALTER TABLE `RegionLanguage` ADD INDEX `regionlanguage_idx_culturecode_region` (`CultureCode`,`Region`);
The optimized query:
        LEFT JOIN
            RegionLanguage 
            ON (
                RegionLanguage.Region = Regions.Id 
                AND RegionLanguage.CultureCode = 'en-US'
            )


* original question posted on StackOverflow here.