Push collection auto-commit

I’ve just discovered these two settings for auto-commits, which seem to contradict each other.
If it commits after every change, why does it need to commit again?

push.scheduler.changes-before-auto-commit=1

By default a Push collection will trigger a commit after every change, making a document searchable as soon as it is added to a push collection.

(https://docs.funnelback.com/collections/collection-types/push/push-collection-options/push_scheduler_changes_before_auto_commit_collection_cfg.html)

push.scheduler.auto-commit-timeout-seconds=300

By default, a commit is triggered every 5 minutes.

(https://docs.funnelback.com/collections/collection-types/push/push-collection-options/push_scheduler_auto_commit_timeout_seconds_collection_cfg.html)

The reason I’m looking is that our collection doesn’t seem to commit after every change, so I’m going to try setting these to help…

Oh, these are both ‘scheduler’ settings, so perhaps they work in conjunction, i.e.:

Every x minutes, the scheduler checks whether y or more changes have been made and, if so, commits.

Also worth asking: does the collection need to be restarted for these to take effect?

Either way, our collection seems to be taking up to 25 minutes to commit, which I’d really like some help to explain!

Hi

By default the push collection will schedule a commit to run after each change, and also once five minutes have passed since the last commit. The two settings do not contradict each other: the timeout does not prevent commits from being triggered after some number of changes.
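
As a rough mental model of the behaviour described above, either trigger on its own is enough to schedule a commit. The sketch below is only an illustration: the function and parameter names are hypothetical (they mirror the two settings), and it is not Funnelback's actual scheduler code.

```python
def should_commit(pending_changes, seconds_since_last_commit,
                  changes_before_auto_commit=1,
                  auto_commit_timeout_seconds=300):
    """Return True if either trigger would schedule a commit.

    Hypothetical model only. The defaults mirror the documented defaults of
    push.scheduler.changes-before-auto-commit and
    push.scheduler.auto-commit-timeout-seconds.
    """
    # Trigger 1: enough changes have accumulated since the last commit.
    enough_changes = pending_changes >= changes_before_auto_commit
    # Trigger 2: the timeout since the last commit has elapsed.
    # Assumption: this fires regardless of how many changes are pending.
    timed_out = seconds_since_last_commit >= auto_commit_timeout_seconds
    return enough_changes or timed_out


# With the defaults, a single change is enough on its own.
print(should_commit(pending_changes=1, seconds_since_last_commit=0))
# With a higher change threshold, the timeout still forces a commit.
print(should_commit(3, 400, changes_before_auto_commit=5))
```

Under this model, raising changes-before-auto-commit only delays commits until the timeout at most, so neither setting should account for a 25 minute delay.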

I suspect that some other issue exists as to why commits are slow.

Have you confirmed that commits are slow? Go to the API UI, open the Push API, and run a commit with ‘wait-for-completion’ set to true.

If that took a long time then we should find out which step is taking a long time.

Go to https://funnelback-server:admin-port/push-api/monitor/metrics?pretty=true e.g.
https://example.com:8443/push-api/monitor/metrics?pretty=true

Now look for the metrics that have the form:

push-core.collection.your-collection.update-steps.Step-Something

e.g.

push-core.collection.some-push-collection.update-steps.Step-QueryIndependentEvidenceCollectionLevel

and look for ones which have a high ‘max’ or ‘mean’ value. What steps are they?
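
If the metrics page is long, a small script can do the sorting. This is a minimal sketch that assumes the metrics have already been loaded into a dict mapping each metric name to an object with ‘max’ and ‘mean’ fields; the real JSON layout of the metrics page may differ, and the function name and sample numbers are made up for illustration.

```python
def slowest_update_steps(timers, collection, top=5):
    """Rank a collection's update-step timers by their 'max' value.

    `timers` is assumed to map metric names to dicts with 'max' and 'mean'
    keys; adjust this to whatever structure the metrics page really returns.
    """
    prefix = f"push-core.collection.{collection}.update-steps."
    steps = [
        (name[len(prefix):], stats.get("max", 0.0), stats.get("mean", 0.0))
        for name, stats in timers.items()
        if name.startswith(prefix)
    ]
    # Largest 'max' first, so the slowest steps appear at the top.
    return sorted(steps, key=lambda step: step[1], reverse=True)[:top]


# Illustrative values only, not real output from a server.
sample = {
    "push-core.collection.some-push-collection.update-steps."
    "Step-QueryIndependentEvidenceCollectionLevel": {"max": 0.5, "mean": 0.4},
    "push-core.collection.some-push-collection.update-steps."
    "Step-BuildAutoCompletion": {"max": 2.9, "mean": 1.7},
    "push-core.collection.some-push-collection.commit-wait-timer":
        {"max": 2.1, "mean": 1.0},
}

for step, max_value, mean_value in slowest_update_steps(sample, "some-push-collection"):
    print(step, max_value, mean_value)
```

Only the update-steps metrics are ranked here; other timers for the collection are filtered out by the prefix.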

It could be that the commit itself is fast, but the time from a document being added until push completes a commit is long. That could be because push is taking a long time before it is able to start a commit for your collection. Check this metric for your collection:

push-core.collection.some-push-collection.commit-wait-timer

If that is really high, it might help explain what is going on. It should report the amount of time from when push requested a commit to run on that collection until the commit actually started. If a single Funnelback instance is hosting many Push collections, this could be high. You can allow push to use more resources for these types of jobs with push.worker-thread-count. This can be set in global.cfg, and the default is in global.cfg.default.

Thanks Luke, some responses:

The manual commits were pretty quick, maybe up to a couple of seconds.

[quote=“LukeButters, post:5, topic:10040”]look for ones which have a high ‘max’ or ‘mean’ value. What steps are they?
[/quote]

push-core.collection.monash-push-news.update-steps.Step-AnnieAMetaHardLinkFilesToComponents:
max: 1.9666713210000002

push-core.collection.monash-push-news.update-steps.Step-BuildAutoCompletion:
max: 2.9463206250000002

push-core.collection.monash-push-events.commit-wait-timer:
count: 3
max: 2.164869292

push-core.collection.monash-push-events.merge-running-timer:
count: 1
max: 2.14296818
mean: 2.14296818

push-core.collection.monash-push-events.commit-running-timer:
count: 3
max: 2.2238981460000002
mean: 1.5402782444428902

Looks like there are no push settings at all in global.cfg. What would be a good value to try for push.worker-thread-count? I see from the docs that it defaults to the number of CPUs.

cd /opt/funnelback/conf
grep 'push' global.cfg*
# (no results)

Out of curiosity, I wonder if the number of folders in the collection log directory (/opt/funnelback/data/collection-name/live/log) relates to the number of threads? Currently there are 17 folders.