I have the following database structure (EDMX models):
public class Company
{
public Guid Id { get; set; }
public virtual ICollection<Location> Locations { get; set; }
// ... 15 more columns
}
public class Location
{
public Guid Id { get; set; }
public virtual Company Company { get; set; }
public Guid CompanyId { get; set; }
public virtual ICollection<ReportA> ReportsA { get; set; }
public virtual ICollection<ReportB> ReportsB { get; set; }
public virtual ICollection<ReportC> ReportsC { get; set; }
// ... 15 more columns with information - name, description etc.
}
public class ReportA
{
public virtual Location Location { get; set; }
public Guid LocationId { get; set; }
// 30 more columns of type "int?"
}
public class ReportB
{
public virtual Location Location { get; set; }
public Guid LocationId { get; set; }
// 30 more columns of type "int?"
}
public class ReportC
{
public virtual Location Location { get; set; }
public Guid LocationId { get; set; }
// 30 more columns of type "int?"
}
One company can have many locations. Each location have many reports from A, B and C.
The columns from ReportA, ReportB, ReportC tables are different.
Each one of the Report tables have аpproximately 40 000 rows.
The Company and Location tables have ~ 5000 rows
I need to get all of the data and make a summary report.
The code is:
using (ComapnyEntities dataBaseContext = new ComapnyEntities())
{
IQueryable<Locations> query = dataBaseContext.Locations
.AsNoTracking()
.Where(location => companyIds.Contains(location.CompanyId))
.Include(location => location.Company)
.Include(location => location.ReportsA)
.Include(location => location.ReportsB)
.Include(location => location.ReportsC);
// more filtation
return query.ToList();
// assume that companyIds have all company Ids
}
In most cases I can use Skip() and Take() methods which makes execution faster (i.e. .Take(10)), but in one special case I need to pull all locations along with company information, ReportsA, ReportsB, ReportsC.
based on comment in Stackoverflow, Include() makes Cartesian product by rows. https://stackoverflow.com/a/22625208/6142097
// so if that is true:
4 000 * 4 000 * 40 000 * 40 000 * 40 000 = ?? (1.024e+21)
The "good part" is that the page takes 15 sec to load (locally), but in production is not the same and takes longer.
So is there a way to improve the performance of the query?
- I Tried with Entity-framework code is slow when using Include() many times - it's slower
- I have added Non-Clustured index of
Location.CompanyId,ReportA.LocationId,ReportB.LocationId,ReportC.LocationId,LocationId- no difference (I am not sure indexes are correctly added) - Cannot construct native SQL query, that joins all the tables, which can be executed for less than 10 sec.
- I Tried to replace
.Inlude()with.IncludeOptimized()but no real difference.
My machine is using SQL Server 2014 with windows 8.1
What are your suggestions? Is there a way to improve my code?