We have an option to exclude content or HTML elements from the index (noindex), like the example at the URL below.
I would like to know whether there is an option to crawl or index only the content between specific HTML tags or within a class.
For example:
<div data-swiftype-index='true'>
</div>
Hi gnana,
Unfortunately, we don’t have a really easy way to index only the content inside a given CSS class.
It is possible to achieve what you want by using Groovy filters and keeping only the markup you want to index.
Here’s an example of a filter that manipulates HTML documents before they are indexed: HTML document filtering (Jsoup filters) - Funnelback Documentation - Version 15.24.0
However, it would be a pretty involved process and require a non-trivial amount of code to be written.
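To give a rough idea of the logic such a filter would need, here is a minimal sketch. It is not Funnelback filter code and not Groovy/Jsoup: it is plain Python using only the standard library, and the names `IndexOnlyExtractor` and `extract_indexable_text` are made up for illustration. It assumes the marker is the `data-swiftype-index='true'` attribute from the question and that the HTML is reasonably well formed (every non-void tag is closed).

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class IndexOnlyExtractor(HTMLParser):
    """Collects text that sits inside elements carrying the marker attribute.

    Hypothetical helper for illustration; the attribute name and value
    default to the ones used in the question.
    """

    def __init__(self, attr="data-swiftype-index", value="true"):
        super().__init__(convert_charrefs=True)
        self.attr = attr
        self.value = value
        self.depth = 0      # 0 means we are outside any marked element
        self.chunks = []    # text fragments found inside marked elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            # Already inside a marked element: track nesting.
            self.depth += 1
        elif dict(attrs).get(self.attr) == self.value:
            # Entering a marked element.
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only while inside a marked element.
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def extract_indexable_text(html):
    """Return the text found inside marked elements, joined by spaces."""
    parser = IndexOnlyExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    page = ("<html><body><nav>Menu</nav>"
            "<div data-swiftype-index='true'><h1>Title</h1>"
            "<p>Body <b>text</b><br>more</p></div>"
            "<footer>Foot</footer></body></html>")
    print(extract_indexable_text(page))
```

A real filter would do the same thing on the parsed document before indexing: keep the marked elements and discard everything else. The depth counter here is the simplest way to handle nested tags, but it will miscount on pages with unclosed tags, which is part of why a proper HTML parser such as Jsoup is the better tool in practice.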
What is the goal you are trying to achieve? Depending on the requirements, it might be possible to use a combination of the metadata scraper and adjusting the behavior of metadata classes and relevancy.
Hope this helps.
Thanks,
~Gioan
Hi Gtran,
Thank you for your response.
As mentioned above, I would like to crawl only the web page content between the HTML tags shown in my example.
Our existing search engine uses these HTML tags to crawl content from each page. If we can use the same tags in Funnelback, it will be easy to implement; otherwise we would need to add <!--noindex--> <!--endnoindex--> to 100+ websites.
Thank you again,
Gnana