I am trying to parse a 17G XML file. I need to read all attributes of each tag () and save them in a CSV. I am using the following code to parse XML. It is taking too much time to parse. For example, 10000 rows take more than 5 minutes to be exported. I have a Windows machine (Intel i5 2.4GHz/8G RAM).
Sample XML
<?xml version="1.0" encoding="utf-8"?>
<comments>
  <row Id="2" PostId="35314" Score="8" Text="Yeah, I didn't believe it until I created a console app - but good lord!  Why would they give you the rope to hang yourself!  I hated that about VB.NET - the OrElse and AndAlso keywords!" CreationDate="2008-09-06T08:09:52.330" UserId="3" /> 
</comments>
Python Code
import xml.etree.ElementTree as etree
import pandas as pd
df = pd.DataFrame(columns=('Id','PostId','Score','Text','CreationDate','UserId'))
def process_params(elem):
    global df
    row = [elem.attrib.get('Id'),elem.attrib.get('PostId'),elem.attrib.get('Score'),elem.attrib.get('Text'),elem.attrib.get('CreationDate'),elem.attrib.get('UserId')]
    df.loc[len(df)] = row
    if (len(df) % 10000 == 0):
        with open('xaa.csv', 'a', encoding="utf-8") as f:
            df.to_csv(f, header=False)
        print("{} rows exported".format(len(df)))
for event, elem in etree.iterparse("xaa.xml", events=('start', 'end')):
    if event == 'start':
        if elem.tag == 'row':
            process_params(elem)
Tried another variant of code without Pandas which fails due to memory outage
def process_params(elem):
    row = [elem.attrib.get('Id'),elem.attrib.get('PostId'),elem.attrib.get('Score'),elem.attrib.get('Text'),elem.attrib.get('CreationDate'),elem.attrib.get('UserId')]
    with open("comments1.csv", "a", encoding="utf-8") as f:
        wr = csv.writer(f)
        wr.writerow(row)
for event, elem in etree.iterparse("comments.xml", events=('start', 'end')):
    if event == 'start':
        if elem.tag == 'row':
            process_params(elem)
Tried a few more variants, but nothing worked or takes forever. Please suggest.