I'm trying to parse an nginx autoindex page to get the links from a download directory, along with their timestamps.
I have successfully retrieved the links and their "names", so to speak, but I am struggling with the timestamp.
I have the following code:
return doc.DocumentNode.SelectNodes("//a").Select(anchor => new IndexPageLink
{
    Link = new Uri(root, anchor.InnerText),
    Name = anchor.InnerText
});
Which is parsing the following HTML structure:
<pre><a href="../">../</a>
<a href="file.txt">file.txt</a> 24-Jan-2014 01:50 5M
</pre>
I've tried looking at the anchor's next sibling, which correctly shows up as a text node, but it only contains newline characters. I can definitely see the text when I look at the document from the pre node, but it would be nice to process it relative to the anchors that I find with the SelectNodes search.
Any ideas?
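To make the sibling-based idea above concrete, here is a minimal sketch, assuming the page is loaded with HtmlAgilityPack and that each file anchor is immediately followed by a text node in nginx's default "dd-MMM-yyyy HH:mm size" layout. The `Modified` property, the `IndexPageParser` class, and the `ParseTimestamp` helper are hypothetical names introduced for illustration. Note that in the sample above the `../` anchor is followed only by a newline, which may be why the first text node inspected looked empty.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using HtmlAgilityPack;

public class IndexPageLink
{
    public Uri Link { get; set; }
    public string Name { get; set; }
    public DateTime? Modified { get; set; } // hypothetical property, not in the original class
}

public static class IndexPageParser
{
    // Hypothetical helper: parses text such as " 24-Jan-2014 01:50 5M" that follows an anchor.
    public static DateTime? ParseTimestamp(string trailingText)
    {
        // Split on any whitespace; the expected tokens are date, time, size.
        var parts = (trailingText ?? string.Empty)
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length < 2) return null; // e.g. the "../" entry has no date

        DateTime parsed;
        return DateTime.TryParseExact(parts[0] + " " + parts[1], "dd-MMM-yyyy HH:mm",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed)
            ? parsed
            : (DateTime?)null;
    }

    public static List<IndexPageLink> Parse(string html, Uri root)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so guard against that.
        var anchors = doc.DocumentNode.SelectNodes("//pre/a");
        if (anchors == null) return new List<IndexPageLink>();

        return anchors.Select(anchor =>
        {
            // The date and size live in the text node right after the anchor.
            var next = anchor.NextSibling;
            var trailing = next != null && next.NodeType == HtmlNodeType.Text
                ? next.InnerText
                : string.Empty;

            return new IndexPageLink
            {
                // Assumption: prefer the href over the display text for the link target.
                Link = new Uri(root, anchor.GetAttributeValue("href", anchor.InnerText)),
                Name = anchor.InnerText,
                Modified = ParseTimestamp(trailing)
            };
        }).ToList();
    }
}
```

Entries without a parsable date (such as `../`) end up with a null `Modified`, so they can be filtered out afterwards if only files are wanted.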