Morning!
Having a bit of a weird issue with indexing some large XML files. All has been well for over a year, but today we noticed that the indexer isn't indexing all fields within the XML records when there are a lot of fields attached. What's even weirder is that only a couple of fields are not being indexed fully: all other fields at the same "level" are being indexed!
For example, there are 350 'tuples' attached to the record below (I've only shown the first two!). It indexes the first 6 fields in all 350 tuples perfectly, until it gets to MulTitle, where it only gets to around tuple number 175. Then for MulIdentifier, it only indexes 9 of the 350 fields!! There are no errors, and the text within the fields where it "stops" is exactly the same as in the previous ones, so it's not an encoding issue. The indexing also just "stops" halfway through a field, so it indexes half of the 175th MulTitle and half of the 9th MulIdentifier but not the rest of the text (as seen here).
<id>1</id>
<MulMultiMediaRef_tab>
<tuple>
<AdmImportIdentifier>ms mm ingest 12</AdmImportIdentifier>
<irn>405479</irn>
<MulMimeFormat>jpeg</MulMimeFormat>
<MulMimeType>image</MulMimeType>
<ChaImageHeight>5888</ChaImageHeight>
<ChaImageWidth>7283</ChaImageWidth>
<AdmPublishWebNoPassword>Yes</AdmPublishWebNoPassword>
<MulTitle>001 Sermons by Samuel Rutherford_ms30386-front and inserted pages_Page_1</MulTitle>
<ChaAudioLength/>
<ChaVideoFilmLength/>
<MulIdentifier>001_ms30386-front and inserted pages_Page_1.jpg</MulIdentifier>
</tuple>
<tuple>
<AdmImportIdentifier>ms mm ingest 12</AdmImportIdentifier>
<irn>405480</irn>
<MulMimeFormat>jpeg</MulMimeFormat>
<MulMimeType>image</MulMimeType>
<ChaImageHeight>5201</ChaImageHeight>
<ChaImageWidth>7304</ChaImageWidth>
<AdmPublishWebNoPassword>Yes</AdmPublishWebNoPassword>
<MulTitle>002 Sermons by Samuel Rutherford_ms30386-front and inserted pages_Page_2</MulTitle>
<ChaAudioLength/>
<ChaVideoFilmLength/>
<MulIdentifier>002_ms30386-front and inserted pages_Page_2.jpg</MulIdentifier>
</tuple>
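To rule out the source data itself, the per-field counts can be checked outside the indexer. This is only a rough sketch: the `<record>` wrapper element and the `field_stats` helper are made up for illustration (the excerpt above has no root element), and the sample is trimmed to a few fields per tuple. It counts how many tuples carry non-empty text for each field and how many characters that adds up to, which can then be compared with what the index reports.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the two tuples shown above, wrapped in a hypothetical
# <record> root so that the fragment parses as well-formed XML.
SAMPLE = """<record>
<id>1</id>
<MulMultiMediaRef_tab>
<tuple>
<irn>405479</irn>
<MulTitle>001 Sermons by Samuel Rutherford_ms30386-front and inserted pages_Page_1</MulTitle>
<ChaAudioLength/>
<MulIdentifier>001_ms30386-front and inserted pages_Page_1.jpg</MulIdentifier>
</tuple>
<tuple>
<irn>405480</irn>
<MulTitle>002 Sermons by Samuel Rutherford_ms30386-front and inserted pages_Page_2</MulTitle>
<ChaAudioLength/>
<MulIdentifier>002_ms30386-front and inserted pages_Page_2.jpg</MulIdentifier>
</tuple>
</MulMultiMediaRef_tab>
</record>"""


def field_stats(xml_text):
    """Return (occurrences, total characters) per field across all tuples."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    chars = Counter()
    for tup in root.iter("tuple"):
        for child in tup:
            text = (child.text or "").strip()
            if text:  # skip empty elements such as <ChaAudioLength/>
                counts[child.tag] += 1
                chars[child.tag] += len(text)
    return counts, chars


counts, chars = field_stats(SAMPLE)
for tag in sorted(counts):
    print(tag, counts[tag], chars[tag])
```

Run against the real file, every populated field should report 350 occurrences; the character totals give a rough idea of how much text each field contributes overall.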
We thought it might be the chamber issue outlined here, so I increased the chamber from 1000 to 3000, but then received the following error:
Error: Malloc failed (chamberpot) of -1149239295 bytes
Error exit(code 101)
msg:An error has occurred in the search system.
detail: femalloc: Malloc failed
Context: chamberpot
Command finished with exit code: 101
I decreased the chamber to 2000 and the indexing ran without the error, but it had no impact on the problem.
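One thing that does stand out in the error: the negative byte count is exactly what you get if 3000 MiB plus one byte is squeezed into a signed 32-bit integer. That is an inference from the arithmetic only, not something taken from the indexer's documentation or source; the megabyte unit and the extra byte are assumptions that happen to reproduce the number in the log. A quick check, with hypothetical helper names:

```python
MIB = 1024 * 1024
INT32_MAX = 2**31 - 1


def as_int32(n):
    """Wrap an integer the way a signed 32-bit variable would."""
    return (n + 2**31) % 2**32 - 2**31


def requested_bytes(chamber_mb):
    # Assumption: the chamber value is in MiB and one extra byte is requested.
    return chamber_mb * MIB + 1


for mb in (1000, 2000, 2047, 2048, 3000):
    wrapped = as_int32(requested_bytes(mb))
    status = "ok" if requested_bytes(mb) <= INT32_MAX else "overflows"
    print(mb, wrapped, status)
```

With 3000 this gives -1149239295, the same figure as in the Malloc error, while 2000 stays positive. If that reading is right, any value from 2048 upwards would wrap in the same way, which would explain why 2000 ran and 3000 did not, independently of how much memory the machine actually has.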
I'm out of ideas! Any suggestions? Is it the chamber? If so - how do I increase it enough without receiving an error? Is there another setting I need to adjust (heap space or similar)? The only thing I can see is that those two fields are the "longest" at that level, but we've got loads of fields with way more text in them elsewhere!
Thanks