Hi there, I’m wondering if there is a way to exclude all subdomains from a Funnelback crawl/index. Our main site sits on the top-level domain, but spin-off sites semi-regularly get launched on subdomains of it without anyone letting us know. Since they’re subdomains, they are included in the crawl and therefore in the site search results. I know how to exclude specific subdomains, but since subdomains can get spun up at any time, I’d like a catch-all.
E.g.
We currently include all of ‘economicdevelopment.vic.gov.au’. We want to exclude ‘[anything].economicdevelopment.vic.gov.au’.
Usually I would leave out the protocol, but in this case including it in the include pattern seems to make the crawler ignore all the subdomains, which is what I was after. Nice to have the regex option up my sleeve though.
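For anyone comparing the two approaches, here is a rough Python sketch of the matching logic only. It is not Funnelback configuration. It assumes an include pattern is treated as a plain substring match against the full URL (worth confirming in the Funnelback docs), and the pattern, regex, and function names below are made up for illustration.

```python
import re

# Assumed include pattern WITH the protocol (hypothetical value).
INCLUDE_PATTERN = "https://economicdevelopment.vic.gov.au"

# Hypothetical catch-all regex: any host with at least one extra label
# in front of economicdevelopment.vic.gov.au.
SUBDOMAIN_RE = re.compile(
    r"^https?://[^/]+\.economicdevelopment\.vic\.gov\.au(?:[:/]|$)",
    re.IGNORECASE,
)


def included_by_pattern(url: str) -> bool:
    """Substring check, assuming that is how include patterns are applied."""
    return INCLUDE_PATTERN in url


def is_subdomain(url: str) -> bool:
    """True when the URL's host is a subdomain of the main site."""
    return SUBDOMAIN_RE.search(url) is not None


if __name__ == "__main__":
    for url in (
        "https://economicdevelopment.vic.gov.au/about",
        "https://invest.economicdevelopment.vic.gov.au/",
    ):
        print(url, included_by_pattern(url), is_subdomain(url))
```

With the protocol in the include pattern, a subdomain URL no longer contains the pattern as a substring, which would explain why the subdomains drop out. Note that in this sketch an `http://` URL would not match the `https://` pattern, and `www.` counts as a subdomain for the regex.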
Nice one. Include patterns certainly support protocols, and things can get pretty gnarly if you’re wrangling regex exclude patterns for dozens of TLDs over time.
I think you’ve picked the appropriate solution to your problem there, @quimby.