Fastest way to find text in file

So I’m looking for a way to efficiently search for text in a file. Right now I’m using this:

    using (FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read, 1024 * 1024, FileOptions.SequentialScan))
    using (StreamReader streamReader = new StreamReader(fileStream))
    {
        string line;
        while ((line = streamReader.ReadLine()) != null)
        {
            // Step through every non-overlapping occurrence of searchText in this line.
            int index = 0;
            while ((index = line.IndexOf(searchText, index, StringComparison.Ordinal)) != -1)
            {
                // Resume the search just past the match that was found.
                index += searchText.Length;
            }
        }
    }

However, I was wondering if there is a more efficient way to search the file. I was thinking of searching for the text in buffers instead of line by line, but I’m not sure how. Thanks.

EDIT: Without calling IndexOf, I get around 1600ms. With IndexOf, it’s around 7400ms.

EDIT: I have a basic implementation of chunk reading (no line reading), and it got the time down to 740ms. It still needs a lot of work, but I basically read a chunk at a time and call IndexOf on it.
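
A minimal sketch of what such chunked reading could look like, since the tricky part is a match that straddles two chunks. This is not the code that was timed above: the names `ChunkSearch`, `CountOccurrences` and `chunkSize` are made up for illustration, and it assumes a runtime with span support (.NET Core 2.1 or later). The last `searchText.Length - 1` characters of each chunk are carried over to the front of the next one so that a boundary match is still seen.

```csharp
using System;
using System.IO;

static class ChunkSearch
{
    // Counts non-overlapping ordinal occurrences of searchText in the text
    // supplied by reader, reading it in fixed-size chunks instead of lines.
    public static long CountOccurrences(TextReader reader, string searchText, int chunkSize = 64 * 1024)
    {
        if (reader == null) throw new ArgumentNullException(nameof(reader));
        if (string.IsNullOrEmpty(searchText))
            throw new ArgumentException("Search text must not be empty.", nameof(searchText));

        // A match that straddles a chunk boundary can have at most this many
        // characters in the earlier chunk.
        int overlap = searchText.Length - 1;
        char[] buffer = new char[Math.Max(chunkSize, searchText.Length) + overlap];

        int carried = 0;   // characters kept from the end of the previous chunk
        long count = 0;
        int read;
        while ((read = reader.Read(buffer, carried, buffer.Length - carried)) > 0)
        {
            int length = carried + read;
            ReadOnlySpan<char> window = buffer.AsSpan(0, length);

            int pos = 0;   // end of the last match found in this window
            while (pos <= length - searchText.Length)
            {
                int idx = window.Slice(pos).IndexOf(searchText.AsSpan(), StringComparison.Ordinal);
                if (idx < 0) break;
                count++;
                pos += idx + searchText.Length;
            }

            // Keep only the tail that could still start a match, and never
            // characters that already belong to a counted match.
            int start = Math.Max(pos, length - overlap);
            carried = length - start;
            Array.Copy(buffer, start, buffer, 0, carried);
        }
        return count;
    }
}
```

It could be called as `ChunkSearch.CountOccurrences(streamReader, searchText)` with the same `StreamReader` as in the question. Unlike the `ReadLine` version, this sketch would also find a match that spans a line break if `searchText` itself contains one.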

1 Answer(s)

From a performance point of view, a naive search like yours takes O(x·l) time in the worst case, where x is the length of the text being searched and l is the length of the string you are trying to find. There are a few general algorithms that you can apply instead:

  • Boyer-Moore
  • Morris-Pratt
  • Knuth-Morris-Pratt

I recommend Boyer-Moore; here are examples of how to implement it:
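
As a rough illustration of the idea (not one of the examples this answer refers to), here is a minimal sketch of Boyer-Moore-Horspool, a simplified variant of Boyer-Moore that uses only the bad-character shift table. The names `HorspoolSearch` and `IndexOf` are hypothetical, and comparison is ordinal, character by character.

```csharp
using System;
using System.Collections.Generic;

static class HorspoolSearch
{
    // Returns the index of the first ordinal occurrence of pattern in text
    // at or after startIndex, or -1 if there is none.
    public static int IndexOf(string text, string pattern, int startIndex = 0)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (string.IsNullOrEmpty(pattern))
            throw new ArgumentException("Pattern must not be empty.", nameof(pattern));
        if (startIndex < 0) throw new ArgumentOutOfRangeException(nameof(startIndex));

        int n = text.Length;
        int m = pattern.Length;

        // Bad-character table: for every pattern character except the last,
        // the distance from its last occurrence to the end of the pattern.
        var shift = new Dictionary<char, int>();
        for (int i = 0; i < m - 1; i++)
            shift[pattern[i]] = m - 1 - i;

        int pos = startIndex;
        while (pos <= n - m)
        {
            // Compare right to left, as Boyer-Moore does.
            int j = m - 1;
            while (j >= 0 && text[pos + j] == pattern[j]) j--;
            if (j < 0) return pos;

            // Shift by the table entry for the text character aligned with
            // the pattern's last position; characters that do not occur in
            // the pattern allow a jump of the full pattern length.
            char last = text[pos + m - 1];
            pos += shift.TryGetValue(last, out int s) ? s : m;
        }
        return -1;
    }
}
```

The longer the pattern and the fewer of its characters appear in the text, the more positions each shift skips. For short patterns the built-in `IndexOf` with `StringComparison.Ordinal` may well be faster, so it is worth measuring both on your data.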
