For #1, please confirm by trying the page(s) that are not being crawled in the following APIs. The API UI can be accessed via the "View API UI" option in the "System" menu of the Funnelback Administration dashboard.
This API will test the include/exclude patterns and check whether the URL is in the index, including redirects.
The debug API will check if the crawler has any issues reaching the provided URL, including redirects.
Do the links to the pages that you'd like to be crawled have the
For the last question, the logs that may have the answer are
crawl.log.*.gz, where the
Another possibility, though unlikely, is that the crawl is timing out (because of the configured time limit, 24 hours by default) and those page(s) were still in the frontier when the crawl finished. Those would show in the collection
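
If it helps with checking the crawl.log.*.gz files mentioned above, here is a rough Python sketch that scans every matching gzipped log in a directory for lines mentioning a given URL. The function name and the log directory are placeholders of mine, not Funnelback names, and the script assumes nothing about the log line format beyond the URL appearing as plain text.

```python
import glob
import gzip
import os


def find_url_in_crawl_logs(log_dir, url, pattern="crawl.log.*.gz"):
    """Return (file name, line number, line) for every log line containing url.

    log_dir is a placeholder: point it at wherever your collection keeps
    its crawl logs.
    """
    matches = []
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        # Read the gzipped log as text; errors="replace" keeps stray bytes
        # from aborting the scan.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if url in line:
                    matches.append(
                        (os.path.basename(path), line_number, line.rstrip("\n"))
                    )
    return matches


if __name__ == "__main__":
    # Both values below are hypothetical; substitute your own.
    for name, number, text in find_url_in_crawl_logs(
        "/path/to/collection/log", "https://example.com/missing-page"
    ):
        print(f"{name}:{number}: {text}")
```

An empty result only means the URL string does not appear in those files; it is worth also searching for any URL that redirects to the page.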