There isn't a lot you can do to control the crawler's redirect behaviour.
There is a server_alias.cfg you can use to set a preferred server name when a server has aliases, but it seems to me that the easiest solution here might be to apply an external metadata disclaimer field to the pages on your internal site and have your template check whether that field exists. If it doesn't exist, print out your disclaimer message.
So you might have something like this in your external metadata:
internalsite.com disclaimer:"this is your internal website"
Then in your FreeMarker template you can do something like:
Disclaimer: This is a third party site
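
As a rough sketch, the check around that message could look like the following. It assumes the external metadata field is exposed to the result template as s.result.metaData["disclaimer"]; that path is an assumption on my part, so confirm it against the data model for your version before relying on it.

```
<#-- Sketch only: s.result.metaData["disclaimer"] is an assumed path, not confirmed for your version -->
<#if (s.result.metaData["disclaimer"])??>
  <#-- Field exists, so this result is from the internal site: print nothing -->
<#else>
  <#-- Field is missing, so treat the result as external and show the message -->
  <p>Disclaimer: This is a third party site</p>
</#if>
```

The parentheses before ?? make the test return false when any part of the path is missing, instead of raising a template error.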