I have the following XML:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:o="urn:schemas-microsoft-com:office:office"
          xmlns:x="urn:schemas-microsoft-com:office:excel"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:html="http://www.w3.org/TR/REC-html40">
  <Names>
    <NamedRange ss:Name="SomeNamedRange" ss:RefersTo="=Control!R1C1:R51C4"/>
  </Names>
  <Worksheet ss:Name="Control" ss:Protected="1">
    <Table ss:ExpandedColumnCount="4" ss:ExpandedRowCount="51">
      <Row>
        <Cell ss:StyleID="s145">          
          <Comment ss:Author="Some comment here">
            <ss:Data xmlns="http://www.w3.org/TR/REC-html40"></ss:Data>
          </Comment>          
        </Cell>
      </Row>      
    </Table>
  </Worksheet>
</Workbook>
I would like to select the Names element with XPath, so I tried:
//Names
but this doesn't work. So far, I have found a number of expressions that do work:
//ss:Names
//*:Names
//*[local-name()='Names']
Alternatively, I can keep //Names and delete the following element:
<ss:Data xmlns="http://www.w3.org/TR/REC-html40"></ss:Data>
So clearly, this has something to do with namespaces, but I still don't really understand what's going on. I have three questions:
- Why does deleting the ss:Data element affect being able to read the Names element?
- Given that there are 5 namespaces declared at the top, why is the Names element considered to be in the ss namespace (when the ss:Data element exists)?
- What is the correct general approach here? I feel like there is some general piece of information I'm missing about either XML or XPath.
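To check whether the behaviour comes from the XML itself or from the tool, here is a minimal reproduction sketch using Python's xml.etree.ElementTree. This is my own choice of parser for comparison, not necessarily what any of the XPath websites use, and it needs Python 3.8+ for the {*} wildcard. The names SS, DOC, DATA_LINE and count_names are made up for this sketch, and the document is a trimmed copy of the workbook above. ElementTree has no local-name(), so that variant is emulated by stripping the namespace part of each tag.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to both the default namespace and the ss prefix.
SS = "urn:schemas-microsoft-com:office:spreadsheet"

# Trimmed copy of the workbook: the XML declaration, the processing
# instruction, the unused namespace declarations and a few attributes
# are left out to keep the sketch short.
DOC = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Names>
    <NamedRange ss:Name="SomeNamedRange" ss:RefersTo="=Control!R1C1:R51C4"/>
  </Names>
  <Worksheet ss:Name="Control" ss:Protected="1">
    <Table>
      <Row>
        <Cell>
          <Comment ss:Author="Some comment here">
            <ss:Data xmlns="http://www.w3.org/TR/REC-html40"></ss:Data>
          </Comment>
        </Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>"""

DATA_LINE = '<ss:Data xmlns="http://www.w3.org/TR/REC-html40"></ss:Data>'


def count_names(xml_text):
    """Count the Names matches for each query style tried in the question."""
    root = ET.fromstring(xml_text)
    return {
        # //Names: unprefixed, so it only matches elements in no namespace
        "unprefixed": len(root.findall(".//Names")),
        # //ss:Names: prefix mapped explicitly to the spreadsheet namespace
        "prefixed": len(root.findall(".//ss:Names", {"ss": SS})),
        # //*:Names: ElementTree spells the namespace wildcard as {*}
        "wildcard": len(root.findall(".//{*}Names")),
        # //*[local-name()='Names']: emulated by dropping the {uri} part
        "local_name": sum(
            1 for e in root.iter() if e.tag.rsplit("}", 1)[-1] == "Names"
        ),
    }


with_data = count_names(DOC)
without_data = count_names(DOC.replace(DATA_LINE, ""))

print(with_data)     # {'unprefixed': 0, 'prefixed': 1, 'wildcard': 1, 'local_name': 1}
print(without_data)  # same counts as with_data
print(ET.fromstring(DOC).find(".//{*}Names").tag)
```

In this parser, Names ends up in the spreadsheet namespace whether or not ss:Data is present, and the unprefixed query matches nothing in either case. That makes me suspect the difference I see when deleting ss:Data is specific to the tool rather than to the XML, but I would still like to understand why.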
EDIT:
This issue is not limited to http://xpather.com/. I have had various results with different XPath websites, and have summarised the results here.