I'm seeing what I think is strange behaviour, but maybe I'm misunderstanding how things work.
I have two XML dumps that I'm indexing into separate collections and combining into one meta collection. Both feeds are for a staff directory, but the XML format differs between them.
In one of the collections I've set up the xml.cfg like this:
PADRE XML Mapping Version: 2
document,/html/body/results/person
docurl,/html/body/results/person/EmailInternal
a,1,,//FirstName
b,1,,//Surname
...etc...
In the second one it's like this:
PADRE XML Mapping Version: 2
document,/results/person
docurl,/results/person/dn
a,1,,//givenname
b,1,,//sn
...etc...
So the fields I'm interested in are mapped to the same classes.
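To show what I mean by "mapped to the same classes", here's a rough Python sketch of the idea. It is not Funnelback code: the sample records, the `extract` helper and the mapping dictionaries are made up for illustration, and only the element names come from the two xml.cfg files above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample records; only the element names come from my xml.cfg files.
FEED_ONE = (
    "<html><body><results><person>"
    "<EmailInternal>jane@example.com</EmailInternal>"
    "<FirstName>Jane</FirstName><Surname>Smith</Surname>"
    "</person></results></body></html>"
)
FEED_TWO = (
    "<results><person>"
    "<dn>cn=jsmith</dn><givenname>John</givenname><sn>Smith</sn>"
    "</person></results>"
)

# Metadata class -> element name, one mapping per feed.
MAPPING_ONE = {"a": "FirstName", "b": "Surname"}
MAPPING_TWO = {"a": "givenname", "b": "sn"}


def extract(xml_text, person_path, mapping):
    """Return one dict per person, keyed by metadata class."""
    root = ET.fromstring(xml_text)
    records = []
    for person in root.findall(person_path):
        # ".//tag" looks anywhere below the person element, like //tag in xml.cfg.
        records.append(
            {cls: person.findtext(".//" + tag, default="") for cls, tag in mapping.items()}
        )
    return records


# Paths are relative to each feed's root element.
people = extract(FEED_ONE, "body/results/person", MAPPING_ONE) + extract(
    FEED_TWO, "person", MAPPING_TWO
)
```

After this, both feeds give records with the same keys, `a` for the first name and `b` for the surname, even though the source element names differ.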
I've then set up a hook script that converts every search term into a wildcard search (basically it just adds '*' to the end of every search term). In the collection config file I've set -fmo=true, so only results that match all terms are returned.
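The transformation the hook script does is roughly equivalent to the Python sketch below. This is only an illustration of the string rewrite, not the actual hook script; the function name `add_wildcards` is made up, and skipping terms that already end in '*' is an assumption I've added for the example.

```python
def add_wildcards(query):
    """Append '*' to every whitespace-separated term in the query."""
    terms = query.split()
    # Assumption for this sketch: leave terms that already end in '*' alone.
    return " ".join(t if t.endswith("*") else t + "*" for t in terms)


rewritten = add_wildcards("john smi")
```

So a search for `john smi` is sent to the index as `john* smi*`.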
I'm trying to create a staff directory where it returns records that match all terms in either the first or last name fields, i.e. all search terms are matched, but may not be matched in every field.
Unfortunately what's happening is it's returning results from matches outside of those 2 fields. In particular it's returning matches on manager names, and in some cases street names like 'Smith' and 'Collins'.
It seems the entire XML record has been indexed as the document because of the 'document' line in the xml.cfg file. This makes sense, but it's not what I'm trying to do.
I've toyed with metadata queries such as 'a:query', but I don't know how to get them to work the way I want. Basically:
- all terms must be matched
- terms matched can be distributed across multiple metadata fields
- matches should not be on the remainder of the document
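To make those three rules concrete, here's a small Python sketch of the matching behaviour I'm after. It only describes the semantics and says nothing about how Funnelback would implement it; the `matches` function, the `manager` key and the sample record are hypothetical.

```python
NAME_FIELDS = ("a", "b")  # metadata classes for first name and surname


def matches(record, query):
    """True if every query term prefix-matches a word in a name field."""
    terms = [t.rstrip("*").lower() for t in query.split()]
    if not terms:
        return False
    # Only words from the name fields are candidates; other fields are ignored.
    name_words = [
        word.lower() for field in NAME_FIELDS for word in record.get(field, "").split()
    ]
    return all(any(word.startswith(term) for word in name_words) for term in terms)


# Hypothetical record: the manager's name must not cause a match.
record = {"a": "Jane", "b": "Doe", "manager": "John Smith"}
```

With this record, `ja* do*` should match because the terms are spread across the two name fields, while `smi*` should not, since 'Smith' only appears in the manager field.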
Is this possible? Have I attempted to do too much?
Note that this is just the default search. I have other searches that will utilise other metadata fields, so the other fields still need to be indexed and available.