401 Unauthorised when crawl starts

heroneast · June 13, 2022, 10:33am

Hi there,

I’m having a little trouble working out what the issue could be. I’ve got a collection for our intranet site, which requires Windows authentication.

If I log on to the server with the account set in the collection’s http_user and http_password parameters, I can view the site directly in the browser.

However, when I run the crawl on the collection I see the following in the ‘url_error.log’ - E http://intranet/Services/ [401 Unauthorized] [2022:06:13:11:23:45].

I’ve turned off Windows authentication on the application to check, and the crawl runs fine without authentication enabled. Has anyone experienced something similar? Are there any parameters in the collection I need to ensure are set?

Thanks!

plevan · June 13, 2022, 11:06pm

Hi,

http_user and http_password are for sites that use HTTP Basic authentication. If you’re using Windows authentication, you can probably use the crawler’s NTLM authentication to access the site as a specific user.
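
If it helps to confirm which scheme the site is actually asking for, here is a minimal sketch (not part of the crawler, just a standalone check) that requests a page without credentials and lists the schemes in the 401 response's WWW-Authenticate headers. The function names offered_schemes and probe are made up for this example, and it assumes the server sends each challenge in its own header line, which is what IIS typically does.

```python
import urllib.error
import urllib.request


def offered_schemes(header_values):
    """Return the lower-cased scheme name from each WWW-Authenticate value.

    Simplification: only the first token of each header value is read, so
    several challenges packed into one comma-separated header are not split.
    """
    schemes = []
    for value in header_values:
        parts = value.strip().split(None, 1)
        if parts:
            schemes.append(parts[0].lower())
    return schemes


def probe(url):
    """Request url with no credentials; return the schemes a 401 offers."""
    try:
        urllib.request.urlopen(url, timeout=10)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            # get_all returns None when the header is absent
            return offered_schemes(err.headers.get_all("WWW-Authenticate") or [])
        raise  # some other HTTP error, let the caller see it
    return []  # page was served without any authentication challenge
```

Running probe("http://intranet/Services/") from a machine that can reach the site should give a list of scheme names. If it includes ntlm or negotiate, the site is most likely using Windows authentication; if it only shows basic, then http_user and http_password would be the settings to look at.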

(15.24 instructions are below)

heroneast · June 14, 2022, 1:16pm

Hi,

Thanks very much for this, this is the answer I was looking for.

Kind regards!