Funnelback to index CSV

Hi,

 

I'm trying to get Funnelback to index data as CSV and cannot get it to make sense of the content.

 

Here is my current config:

 

The content is in Matrix, as text formatted as CSV, at a normal URL (headers set to text/html; charset=utf-8). My CSV looks like:

 

email@email.com,Mr,first name,last name,1234,123,S234,job title,,,,department

 

FB is looking at a standard page in Matrix (blank design and paint layout) of 

 

I've tried using a text file in Matrix, as opposed to a standard page, so it gets the correct header (text/csv) but couldn't get FB to actually index that file at all.

 

My collection.cfg has a few options set there, including:

 

indexer_options=-csv=,n -csv_fields=T1,A1,F1,L1,N1,A1,R1,J1,-X,-X,-X,D1
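One way to sanity-check a mapping like this outside Funnelback is to pair each column of the sample row with the entry at the same position in -csv_fields. The sketch below is plain Python and makes no claim about how the indexer interprets the codes; the helper name pair_columns is made up for illustration, and ignoring -X when looking for repeated codes is an assumption.

```python
import csv
import io
from collections import Counter

# Sample row and field list copied from the post above.
SAMPLE_ROW = "email@email.com,Mr,first name,last name,1234,123,S234,job title,,,,department"
CSV_FIELDS = "T1,A1,F1,L1,N1,A1,R1,J1,-X,-X,-X,D1"


def pair_columns(row_text, fields_text):
    """Pair each CSV column value with the field code at the same position."""
    values = next(csv.reader(io.StringIO(row_text)))
    codes = fields_text.split(",")
    if len(values) != len(codes):
        raise ValueError(f"{len(values)} columns but {len(codes)} field codes")
    return list(zip(codes, values))


pairs = pair_columns(SAMPLE_ROW, CSV_FIELDS)

# Codes listed more than once, ignoring the -X entries (an assumption).
counts = Counter(code for code, _ in pairs)
repeated = [code for code, n in counts.items() if n > 1 and code != "-X"]

for code, value in pairs:
    print(f"{code:>3} <- {value!r}")
print("repeated codes:", repeated)
```

For the sample row this gives 12 pairs and reports A1 as listed twice (second and sixth columns), which may or may not be intended.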

 

I have two main issues:

 

1. Getting FB to create multiple documents in that collection if I have multiple lines in my CSV. That doesn't work at the moment: if I have more than one line, it simply creates one record in the index, containing the content from both CSV lines.

2. Getting FB to make sense of my metadata mapping. At most I could get it to create one metadata field (as "t") for the record it creates, and I'm not sure how it managed that. In most cases it simply doesn't translate the CSV columns to metadata classes.

 

I'm using a web collection. My metamap.cfg and xml.cfg files are empty. And the collection besides this is pretty standard. It was initially set up to index the data as XML and was working, so I've just tweaked it to grab the data as CSV.

 

Is indexing CSV with FB even possible? There's very little documentation about this, so I'm not entirely sure it's been used or tested, or is even a feature.

 

Gilles

Might be easier to convert the CSV into XML and index that with FB.
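A rough sketch of what such a conversion could look like, in Python using only the standard library. The element names in COLUMN_NAMES are assumptions (the CSV in the question has no header row), and the records/record wrapper is just one possible layout, not a format Funnelback requires.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical element names, one per CSV column; adjust to the real data.
COLUMN_NAMES = ["email", "title", "first_name", "last_name", "col5", "col6",
                "col7", "job_title", "col9", "col10", "col11", "department"]


def csv_to_xml(csv_text, column_names=COLUMN_NAMES):
    """Return an XML string with one <record> element per CSV row."""
    root = ET.Element("records")
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        record = ET.SubElement(root, "record")
        for name, value in zip(column_names, row):
            if value:  # leave out empty columns
                ET.SubElement(record, name).text = value
    return ET.tostring(root, encoding="unicode")


sample = (
    "a@b.com,Mr,Ann,Lee,1,2,S3,Dev,,,,IT\n"
    "c@d.com,Ms,Bo,Ray,4,5,S6,Ops,,,,HR\n"
)
print(csv_to_xml(sample))
```

Each CSV line becomes its own record element, so the rows stay separate instead of being merged into one document.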

 

You should probably use a local collection and have a pre-gather command download the remote XML to a local directory (which would be your data root).
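As an illustration of the download step only, here is a minimal Python sketch that a pre-gather command could run. The function name fetch_to_data_root and its arguments are hypothetical, and how the script is wired into the collection is not shown.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_to_data_root(source_url, data_root, filename):
    """Download source_url and save it as data_root/filename."""
    target_dir = Path(data_root)
    target_dir.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    target = target_dir / filename
    # Stream the response body straight to disk.
    with urllib.request.urlopen(source_url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

It would be called with something like fetch_to_data_root("https://example.com/staff.csv", "/path/to/data-root", "staff.csv"), where both the URL and the path are placeholders.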

It would be, but I have no control over the data, and I can't get it in another format.

 

Might try the local collection though, that might do what I need!

 

But yes, if the CSV stuff doesn't work, I'll have to find another way to pass the data to FB, somehow...

You could parse the CSV with Groovy and create new XML documents from the source.

There’s a recent addition to the Funnelback Showcase that might assist further:
http://showcase.funnelback.com/s/search.html?collection=showcase-csv

Version 15.10 has a built-in filter for this now:

The filter used for the showcase is available on GitHub as funnelback/custom-gatherer-csv (a custom gatherer for indexing CSV data sources, synced with github.com/funnelback) and is a solution for pre-15.10 versions of Funnelback.