I've been asked why a certain URL is being crawled. It is publicly accessible, within the allowed domain, etc., so it is correctly being indexed. But the URL shouldn't have been discoverable.
So we're wondering: how did the crawler find it? I can see the URL in crawl.log, and none of the URLs immediately preceding it seem to link to it. But there are a lot of earlier URLs, and I'm aware the link could have come from any of them.
So is there a way to report on the referrer of a particular URL? Or a log of where each URL was first found?
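
For context, the manual check described above could be scripted roughly like this. It is only a sketch: it assumes crawl.log is plain text with one entry per line and that the URL appears verbatim in its entry. The function name and the `context` size are made up for illustration. It also only shows what was logged shortly before the URL first appears, which is not necessarily the page that linked to it.

```python
from collections import deque
from typing import Iterable, List, Tuple


def lines_before_first_hit(
    lines: Iterable[str], target_url: str, context: int = 50
) -> Tuple[int, List[str]]:
    """Find the first log line containing target_url.

    Returns (line_number, preceding_lines), where line_number is 1-based and
    preceding_lines holds up to `context` lines logged just before the hit.
    Returns (-1, []) if the URL never appears.
    """
    # Rolling window of the most recent lines seen so far.
    recent: deque = deque(maxlen=context)
    for number, line in enumerate(lines, start=1):
        # Plain substring match; assumes the URL is logged verbatim.
        if target_url in line:
            return number, list(recent)
        recent.append(line.rstrip("\n"))
    return -1, []


# Example use (the path and URL are placeholders):
#
# with open("crawl.log", encoding="utf-8", errors="replace") as log:
#     hit, before = lines_before_first_hit(log, "https://example.com/hidden-page")
# print(hit)
# print("\n".join(before))
```

That still leaves the candidates to be checked by hand, which is why a proper referrer report would be much better.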