I am trying to create a GUI for searching through a large number of huge configuration files (approximately 60,000 files, each between 20 KB and 50 MB in size). These files are also updated frequently (about 3 times a day).

So far I have found Solr and Sphinx, but I have found no way to make either of them return, for each matching document, the list of matching lines together with their line numbers.

Currently, we convert each text file to XML:

<xml>
   <line number="1">foobar</line>
   <line number="2">barfoo</line>
   ...
</xml>

and store the result in eXist-db. However, storing the documents is far too slow, so we need an alternative.
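For illustration only, the conversion step described above might look roughly like the following Python sketch. The function name `text_to_line_xml` is hypothetical and not taken from our actual tooling; it only shows the shape of the output, one `<line>` element per input line with a 1-based `number` attribute.

```python
import xml.etree.ElementTree as ET


def text_to_line_xml(text: str) -> str:
    """Wrap each line of `text` in a <line number="N"> element (1-based).

    Hypothetical helper: it mirrors the XML layout shown above, not any
    particular production converter.
    """
    root = ET.Element("xml")
    for number, line in enumerate(text.splitlines(), start=1):
        element = ET.SubElement(root, "line", number=str(number))
        element.text = line  # special characters such as < and & are escaped on output
    return ET.tostring(root, encoding="unicode")


print(text_to_line_xml("foobar\nbarfoo"))
```

Every line of a 50 MB file becomes its own element here, which gives a sense of how much larger and more expensive to store the XML form is than the plain text.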

Any better ideas?

Oliver Salzburg
knipknap

1 Answer

Opinion: if you have a large amount of volatile text data that you need fast access to, converting it to XML will make your problem much harder to solve.

Any better ideas?

Leave the files as text and use Lucene?

(I'm assuming that grep doesn't cut it.)
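To make the "leave the files as text" idea concrete, here is a minimal sketch of a line-level index. It is a pure-Python stand-in and not Lucene's API: the class `LineIndex`, its methods, and the file name `a.conf` are all hypothetical. The point it illustrates is that if each line is indexed as its own unit, with the file path and line number stored alongside it, a search can return matching lines with their numbers directly. A real full-text engine could presumably be fed in the same way, one small document per line.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"\w+")


class LineIndex:
    """Toy inverted index: token -> set of (path, line_number) postings."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.lines = {}  # (path, line_number) -> original line text
        self.keys_by_path = defaultdict(list)

    def remove_file(self, path):
        """Drop all postings for a file, e.g. before re-indexing an updated copy."""
        for key in self.keys_by_path.pop(path, []):
            for token in TOKEN.findall(self.lines.pop(key).lower()):
                self.postings[token].discard(key)

    def add_file(self, path, text):
        """Index every line of `text` as its own unit, numbered from 1."""
        self.remove_file(path)
        for number, line in enumerate(text.splitlines(), start=1):
            key = (path, number)
            self.lines[key] = line
            self.keys_by_path[path].append(key)
            for token in TOKEN.findall(line.lower()):
                self.postings[token].add(key)

    def search(self, query):
        """Return sorted (path, line_number, line) for lines containing all query tokens."""
        tokens = TOKEN.findall(query.lower())
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return [(p, n, self.lines[(p, n)]) for p, n in sorted(hits)]


index = LineIndex()
index.add_file("a.conf", "foobar\nbarfoo")  # hypothetical file name and content
print(index.search("barfoo"))
```

Because `add_file` first removes the old postings for that path, a file that changes a few times a day only needs its own lines re-indexed, not the whole collection. This sketch keeps everything in memory and matches whole tokens only, so it shows the data layout rather than something that would scale to 60,000 large files as written.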