
New Post: Why can't I reliably extract links from nodes?

I'm looking at nodes that definitely contain links. Sometimes I get the proper link from Node.Attributes["href"], but sometimes that's null, and I can't see why it works in some cases and not in others. A brute-force approach of searching for href=" and then taking everything up to the next quote is working, but I would like to understand what's going on here.

On the real page the links do work, and the pages display correctly as my routine crawls them.
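
For reference, the brute-force fallback described above could be sketched roughly as follows. The post does not say which language or parser is in use, so this is written in Python purely for illustration, and the function name `extract_href_values` is hypothetical rather than part of any library.

```python
from typing import List


def extract_href_values(html: str) -> List[str]:
    """Collect the text between each literal href=" and the next double quote.

    Illustrative sketch only: this is a plain string scan, not an HTML parser.
    """
    marker = 'href="'
    links = []
    pos = 0
    while True:
        start = html.find(marker, pos)
        if start == -1:
            break  # no further href=" in the markup
        start += len(marker)
        end = html.find('"', start)
        if end == -1:
            break  # unterminated attribute value; stop scanning
        links.append(html[start:end])
        pos = end + 1
    return links
```

A scan like this only matches lowercase `href` with a double-quoted value, so it skips single-quoted or unquoted values, and it returns the raw text without decoding entities such as `&amp;`.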

