I have big XML files (between 500MB and 1GB) and I'm trying to filter them in order to keep only nodes with some specified attributes, in this case Prod_id. I have about 10k Prod_id that I need to filter and currently XML contains about 60k items.
Currently I'm using XSL with node.js (https://github.com/fiduswriter/xslt-processor) but it's really slow (I never saw one of them finished in 30-40 minutes).
Is there a way to increase the speed of this process? XSL is not a requirement, I can use everything.
XML Example:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<products>
    <Product Quality="approved" Name="WL6A6" Title="BeBikes comfort WL6A6" Prod_id="BBKBECOMFORTWL6A6">
        <CategoryFeatureGroup ID="10030">
            <FeatureGroup>
                <Name Value="Dettagli tecnici" langid="5"/>
            </FeatureGroup>
        </CategoryFeatureGroup>
        <Gallery />
    </Product>
    ...
    <Product Quality="approved" Name="WL6A6" Title="BeBikes comfort WL6A6" Prod_id="LAL733">
        <CategoryFeatureGroup ID="10030">
            <FeatureGroup>
                <Name Value="Dettagli tecnici" langid="5"/>
            </FeatureGroup>
        </CategoryFeatureGroup>
        <Gallery />
    </Product>
</products>
XSL I'm using
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>  
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="
         products/Product
         [not(@Prod_id='CEESPPRIVAIPHONE4')]
         ...
         [not(@Prod_id='LAL733')]"
   />
</xsl:stylesheet>
Thanks
