Using C# I'm trying to insert records from a CSV file (~55,000 rows) into a PostgreSQL table with 350 columns. I'm building the INSERT statements directly from the header and each comma-separated record, like this:
INSERT INTO public.table(field1,field2,field3, ..... ,field350) VALUES(?,?,?,?,?,.......,?);
INSERT INTO public.table(field1,field2,field3, ..... ,field350) VALUES(?,?,?,?,?,.......,?);
INSERT INTO public.table(field1,field2,field3, ..... ,field350) VALUES(?,?,?,?,?,.......,?);
INSERT INTO public.table(field1,field2,field3, ..... ,field350) VALUES(?,?,?,?,?,.......,?);
...etc.
I tried batching, i.e. gathering 1000 statements and executing them in one transaction, but that takes up to ~3 seconds per 1000 records.
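For reference, the batching attempt looked roughly like the sketch below. This is not my exact code: it assumes Npgsql 6+ (for NpgsqlBatch), and BuildInsertStatement / chunkOf1000Lines / connectionString are just placeholders for the string building and chunking described above.

// Sketch only: NpgsqlBatch sends all queued statements to the server in a single round trip.
await using var connection = new NpgsqlConnection(connectionString);
await connection.OpenAsync();

await using var transaction = await connection.BeginTransactionAsync();
var batch = new NpgsqlBatch(connection, transaction);

foreach (var line in chunkOf1000Lines)
{
    // BuildInsertStatement builds the "INSERT INTO public.table(...) VALUES (...)" text shown above
    batch.BatchCommands.Add(new NpgsqlBatchCommand(BuildInsertStatement(header, line)));
}

await batch.ExecuteNonQueryAsync(); // executes the ~1000 INSERTs
await transaction.CommitAsync();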
I also tried to follow the example from Microsoft here, where I call ExecuteNonQuery() for each line but commit the transaction only after every 1000 records and then begin a new transaction; this takes up to ~3 seconds per 1000 records as well. Something like this:
foreach (var line in ReadCsvFileToList(dataFile))
{
    try
    {
        if (firstLine)
        {
            header = line;
            firstLine = false;
            continue;
        }
        else
        {
            var formattedLine = line.Replace("\"\"", "NULL").Replace("'", "''").Replace("\"", "'");
            var commandText = $"INSERT INTO public.table({header.Replace('/', '_')}) VALUES ({formattedLine})";
            command.CommandText = commandText;
            await command.ExecuteNonQueryAsync();
            round++;
        }
        if (round == 1000) // batch size
        {
            await transaction.CommitAsync();
            Console.WriteLine("batch committed to DB at: " + DateTime.Now.ToString("ddMMyyyy HH:mm:ss"));
            round = 0;
            transaction = await connection.BeginTransactionAsync();
        }
    }
    catch (Exception)
    {
        // Roll back before tearing the connection down; rolling back after
        // closing/disposing the connection would fail.
        if (transaction != null)
        {
            await transaction.RollbackAsync();
        }
        await connection.CloseAsync();
        await connection.DisposeAsync();
        throw;
    }
}
await transaction.CommitAsync(); // commit the remaining rows of the last partial batch
Any ideas on how to optimize this further?