Hi there. I've run into some interesting behavior when searching nodes. I'll show it with an example.
The problem: I have a collection of tr nodes and go through it using "foreach". I expect that //td[@class='name'] will select the name cell inside the current tr node, but I always get only the first name cell. So the result string looks like "Volcano Name,Volcano Name,Volcano Name,Volcano Name,". Is this correct? Or do I not fully understand the logic of //?
Using td[@class='name'] instead collects exactly what is needed.
This is the HTML used for searching the names:
<html>
<body>
<table>
<tr>
<td class="name">Volcano Name</td>
<td>Location</td>
<td>Last Major Eruption</td>
<td>Type of Eruption</td>
</tr>
<tr>
<td class="name">Mt. Lassen</td>
<td>California</td>
<td>1914-17</td>
<td>Explosive Eruption</td>
</tr>
<tr>
<td class="name">Mt. Hood</td>
<td>Oregon</td>
<td>1790s</td>
<td>Pyroclastic flows and Mudflows</td>
</tr>
<tr>
<td class="name">Mt. St. Helens</td>
<td>Washington</td>
<td>1980</td>
<td>Explosive Eruption</td>
</tr>
</table>
</body>
</html>
And here is the C# example:
class HtmlParser
{
private HtmlDocument Document = new HtmlDocument();
public HtmlParser(String file)
{
Document.Load(file);
}
public String Parse()
{
var table = Document.DocumentNode.SelectSingleNode("//table");
var trCollection = table.SelectNodes("tr");
var result = String.Empty;
foreach (var trNode in trCollection)
{
var nameNode = trNode.SelectSingleNode("//td[@class='name']");
var name = nameNode.InnerText;
result += name + ",";
}
return result;
}
}
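For comparison, here is a minimal sketch of the same loop using a context-relative path. In XPath 1.0 an expression that starts with // is an absolute path evaluated from the document root, whatever node it is called on, while .//td (or plain td) is evaluated relative to the current node. The class name RelativeXPathDemo and the shortened inline HTML are only for illustration and are not part of my real code; it assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical demo class; the HTML is a trimmed copy of the table above.
class RelativeXPathDemo
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<table>" +
            "<tr><td class=\"name\">Volcano Name</td><td>Location</td></tr>" +
            "<tr><td class=\"name\">Mt. Lassen</td><td>California</td></tr>" +
            "</table>");

        var names = new List<string>();

        // SelectNodes returns null when nothing matches, so guard before iterating.
        var rows = doc.DocumentNode.SelectNodes("//table/tr");
        if (rows != null)
        {
            foreach (var tr in rows)
            {
                // The leading "." anchors the search at the current tr node
                // instead of restarting from the document root.
                var nameNode = tr.SelectSingleNode(".//td[@class='name']");
                if (nameNode != null)
                {
                    names.Add(nameNode.InnerText);
                }
            }
        }

        // Join avoids the trailing comma produced by result += name + ",".
        Console.WriteLine(string.Join(",", names));
    }
}
```

With the relative path each iteration should pick up the name cell of its own row, which matches what I see with td[@class='name'].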