The answer probably exists somewhere, but I can't find it. I came to this question from an algorithm I am writing: essentially a contains(String s1, String s2) that returns true if s1 contains s2, ignoring the Greek/English character difference. For example, the string 'nai, of course' contains the string 'ναι'. However, this is mostly irrelevant to my question.
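For context only, the core idea is a character-level equivalence check before comparing. A minimal sketch of that idea (the charsMatch and normalize helpers and the mapping table are made up here for illustration; my real logic covers far more than these three letters):

class GreekLatinMatch {
    // Two chars match if they normalize to the same Latin letter.
    static boolean charsMatch(char a, char b) {
        return normalize(a) == normalize(b);
    }

    // Hypothetical mapping of a few Greek letters to Latin look-alikes,
    // just enough for the 'nai' / 'ναι' example; the real table is larger.
    static char normalize(char c) {
        switch (c) {
            case 'ν': return 'n';
            case 'α': return 'a';
            case 'ι': return 'i';
            default:  return c;
        }
    }
}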
The contains() method of String uses the naive approach, and I use the same one in my algorithm. What contains() essentially does is call the static indexOf(char[] source, int sourceOffset, int sourceCount, char[] target, int targetOffset, int targetCount, int fromIndex) that lives in java.lang.String, with the correct parameters.
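Roughly, the call chain in the JDK 8 sources looks like this (paraphrased from java.lang.String, not a verbatim copy):

public boolean contains(CharSequence s) {
    return indexOf(s.toString()) > -1;
}

public int indexOf(String str) {
    return indexOf(str, 0);
}

public int indexOf(String str, int fromIndex) {
    return indexOf(value, 0, value.length,
            str.value, 0, str.value.length, fromIndex);
}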
While doing various benchmarks and tests on my algorithm, I removed all the Greek/English logic to see how fast it behaves with English-only strings. It was slower: about 2 times slower than an s1.contains(s2) straight from the JDK.
So I took the time to copy and paste this indexOf method into my class and call it 15 million times, in the same way the JDK calls it for a String.
The class is the following:
public class TestIndex {
    public static int fromJdk;
    public static int fromMe;
    public static void main(String[] args) {
        String s1 = "papapatita";
        String s2 = "patita";
        final char[] s1Arr = s1.toCharArray();
        int s1Length = s1Arr.length;
        final char[] s2Arr = s2.toCharArray();
        int s2Length = s2Arr.length;
        long be4 = System.nanoTime();
        for (int i = 0; i < 15_000_000; i++) {
            // Toggle between the next two lines to time the JDK call vs. the copied method.
            fromJdk = s1.indexOf(s2);
            // fromMe = indexOf(s1Arr, 0, s1Length, s2Arr, 0, s2Length, 0);
        }
        long after = System.nanoTime();
        System.out.println((after - be4) / 1000000.0); // elapsed time in ms
    }
    static int indexOf(char[] source, int sourceOffset, int sourceCount, char[] target, int targetOffset,
            int targetCount, int fromIndex) {
        if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }
        char first = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);
        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                while (++i <= max && source[i] != first)
                    ;
            }
            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j] == target[k]; j++, k++)
                    ;
                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
    }
}
With the line fromJdk = s1.indexOf(s2); uncommented, the result (on my machine) is about ~150ms. If I comment that line out and uncomment fromMe = indexOf(s1Arr, 0, s1Length, s2Arr, 0, s2Length, 0); instead, I should get the same result. Instead, it takes almost double the time: about ~300ms.
So, the question is: why? The indexOf method is exactly as it exists in the JDK. I know that measuring performance in Java is a hard thing to do properly (probably a job for JMH?), but the difference is big. Does indexOf from the JDK get special treatment under the hood?
If the answer already exists somewhere, please point me to it.
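In the meantime, here is a rough sketch of how the same comparison could be done with JMH instead of a hand-rolled loop (assuming the JMH dependency is set up; the benchmark class and method names are mine, and TestIndex.indexOf refers to the copied method above):

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5)
@Measurement(iterations = 5)
@Fork(1)
@State(Scope.Benchmark)
public class IndexOfBenchmark {

    String s1 = "papapatita";
    String s2 = "patita";
    char[] s1Arr = s1.toCharArray();
    char[] s2Arr = s2.toCharArray();

    @Benchmark
    public int jdkIndexOf() {
        // Returning the result stops the JIT from dead-code-eliminating the call.
        return s1.indexOf(s2);
    }

    @Benchmark
    public int copiedIndexOf() {
        // Assumes this class sits in the same package as TestIndex above.
        return TestIndex.indexOf(s1Arr, 0, s1Arr.length, s2Arr, 0, s2Arr.length, 0);
    }
}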