Problem
I have a pretty simple method that grabs data from an sql query and then puts it inside of a list and finally iterates through two for each loops onto an excel file. It works fine with small amounts of data, but not at all with larger amounts. What would be the best way to iterate and also write to my workbook the quickest? I am using OpenXML.
Here is the code
private UInt32 AddSheets(SpreadsheetDocument document, WorkbookPart workbookPart, Sheets sheets, string sheetName, UInt32 sheetNumber, List<RentalFeeReportingRecord> currentRecordList)
{
Dictionary<int, string> listofCoreBankingIds = new Dictionary<int, string>();
listofCoreBankingIds = currentRecordList.Select(x => new { id = x.CoreBankingId, regionName = x.RegionName }).Distinct().ToDictionary(y => y.id, y => y.regionName);
foreach (var coreBankingId in listofCoreBankingIds.OrderBy(x => x.Value))
{
WorksheetPart worksheetPart1 = workbookPart.AddNewPart<WorksheetPart>();
string currentCurrency = null;
string regionName = null;
Worksheet worksheet1 = new Worksheet();
SheetData sheetData1 = new SheetData();
Row rowInSheet1 = new Row();
Row firstRow = new Row();
Row lastRow = new Row();
rowInSheet1.Append(
excelController.ConstructCell("Region", CellValues.String, 3), excelController.ConstructCell("Merchant - Terminal", CellValues.String, 3),
excelController.ConstructCell("Fee Charged", CellValues.String, 3),
excelController.ConstructCell("Currency", CellValues.String, 3),
excelController.ConstructCell("Processing Date", CellValues.String, 3),
excelController.ConstructCell("Description", CellValues.String, 3)
);
sheetData1.Append(rowInSheet1);
foreach (var currentRecord in currentRecordList.OrderBy(x => x.RegionName).ThenBy(z => z.Currency).ThenBy(y => y.DateProcessed))
{
if (coreBankingId.Key == currentRecord.CoreBankingId)
{
regionName = currentRecord.RegionName;
currentCurrency = currentRecord.Currency;
double currentTotal = currentRecord.TotalFeeAmount;
string merchant = currentRecord.MerchantRecord.MerchantID + " - " + currentRecord.MerchantTerminalRecord.terminalID;
if (currentRecord.IsPinPad)
{
merchant = merchant + " - PIN Pad";
}
string description = currentRecord.TerminalRecord.description;
if (currentRecord.IsProRated)
{
int days = DateTime.DaysInMonth(currentRecord.DeployedDate.Year, currentRecord.DeployedDate.Month);
description = currentRecord.TerminalRecord.description + " (" + currentRecord.MonthlyFee + " Fee \ " + days.ToString() + " Days) * " + currentRecord.DaysActive.ToString() + " Days Active";
}
rowInSheet1 = new Row();
firstRow = new Row();
lastRow = new Row();
rowInSheet1.Append(
excelController.ConstructCell(regionName, CellValues.String, 2),
excelController.ConstructCell(merchant, CellValues.String, 2),
excelController.ConstructCell(currentRecord.TotalFee.ToString("N2"), CellValues.String, 1),
excelController.ConstructCell(currentRecord.Currency, CellValues.String, 2),
excelController.ConstructCell(currentRecord.DateProcessed.ToString("MMM dd, yyyy"), CellValues.String, 2),
excelController.ConstructCell(description, CellValues.String, 2)
);
sheetData1.Append(rowInSheet1);
}
}
Columns columns = excelController.AutoSize(sheetData1);
worksheet1.AppendChild(columns);
worksheet1.AppendChild(sheetData1);
worksheetPart1.Worksheet = worksheet1;
Sheet sheet1 = new Sheet()
{
Id = document.WorkbookPart.GetIdOfPart(worksheetPart1),
SheetId = sheetNumber,
Name = regionName
};
sheets.Append(sheet1);
sheetNumber++;
}
return sheetNumber;
}
Any help would be appreciated.
Solution
It would be worth doing a high-level analysis of your algorithm’s complexity. I won’t use big o to keep things simple.
// ...
foreach (var coreBankingId in listofCoreBankingIds.OrderBy(x => x.Value))
{
// ...
foreach (var currentRecord in currentRecordList.OrderBy(x => x.RegionName).ThenBy(z => z.Currency).ThenBy(y => y.DateProcessed))
{
if (coreBankingId.Key == currentRecord.CoreBankingId)
{
// ...
}
}
}
So:
- for every iteration of
coreBankingId
, you are reordering thecurrentRecordList
in exactly the same way - the inner loop only cares about 1 set of values (when the
CoreBankingId
is equal to the outer loop’s key)
With those two pieces of knowledge, I suggest you look at using GroupBy
instead. This code is untested straight into CR and I’m a bit rusty on C# but something like:
var groups = currentRecordList
.GroupBy(r => new { Id = r.CoreBankingId, RegionName = r.RegionName });
foreach (var group in groups.OrderBy(g => g.Key.RegionName)
{
var coreBankingId = group.Key.Id;
foreach (var currentRecord in group.OrderBy(x => x.RegionName).ThenBy(z => z.Currency).ThenBy(y => y.DateProcessed))
{
// Your loop
}
}
That should have scale much better to bigger datasets.
Original suggestion (which I think is wrong):
var data = currentRecordList
.GroupBy(r => new { Id = r.CoreBankingId, RegionName = r.RegionName })
.OrderBy(g => g.Key.RegionName)
// I can't remember if ToLookup preserves order. If it doesn't, you'll have to tweak this to order later.
.ToLookup(
g => g.Key.Id,
g => g.OrderBy(x => x.RegionName)
.ThenBy(z => z.Currency)
.ThenBy(y => y.DateProcessed)
.ToList());
foreach (var datum in data)
{
var coreBankingId = datum.Key;
foreach (var activity in datum)
{
// Your loop body without the if.
}
}
Quick remarks:
-
I don’t see the point of
listofCoreBankingIds
, especially since you only use thekey
, never thevalue
. Just loop through thecurrentRecordList
. I’m guessing you might need toGroupBy
theCoreBankingId
and loop through that, but your logic is obscured by the creation of that dictionary. -
currentRecordList
is a bad name IMHO. The “List” part is pointless. Why not call itrentalFeeReportingRecords
orcurrentRecords
? -
Six parameters, almost a hundred lines: I’d turn this method into a class of its own and split it into several descriptively named methods. For instance, the compilation of the
description
could be a method. -
Your logic jumps all over the place. For instance,
worksheetPart1
gets defined early on, but doesn’t come into play until almost at the end. Same forworksheet1
. -
currentCurrency
is never used.regionName
only exists because you do the pointless “create dictionary and loop through its items” logic. -
Why the “1” suffix in
sheetData1
andworksheet1
androwInSheet1
? -
Use string interpolation instead of concatenating strings.