For a simple bare array, the for loop tends to produce slightly smaller IL (fewer instructions and fewer locals). Compare:
    static int[] array = new int[100];
    static void UseForLoop () {
        for (int i = 0; i < array.Length; ++i) {
            Console.WriteLine(array[i]);
        }
    }
    static void UseForeachLoop () {
        foreach (int i in array) {
            Console.WriteLine(i);
        }
    }
which compiles to the following IL with VS 2010 in the default Release configuration:
.method private hidebysig static void UseForLoop() cil managed
{
        .maxstack 2
        .locals init (
                [0] int32 i)
        L_0000: ldc.i4.0 
        L_0001: stloc.0 
        L_0002: br.s L_0014
        L_0004: ldsfld int32[] ConsoleApplication5.Program::array
        L_0009: ldloc.0 
        L_000a: ldelem.i4 
        L_000b: call void [mscorlib]System.Console::WriteLine(int32)
        L_0010: ldloc.0 
        L_0011: ldc.i4.1 
        L_0012: add 
        L_0013: stloc.0 
        L_0014: ldloc.0 
        L_0015: ldsfld int32[] ConsoleApplication5.Program::array
        L_001a: ldlen 
        L_001b: conv.i4 
        L_001c: blt.s L_0004
        L_001e: ret 
}
.method private hidebysig static void UseForeachLoop() cil managed
{
        .maxstack 2
        .locals init (
                [0] int32 i,
                [1] int32[] CS$6$0000,
                [2] int32 CS$7$0001)
        L_0000: ldsfld int32[] ConsoleApplication5.Program::array
        L_0005: stloc.1 
        L_0006: ldc.i4.0 
        L_0007: stloc.2 
        L_0008: br.s L_0018
        L_000a: ldloc.1 
        L_000b: ldloc.2 
        L_000c: ldelem.i4 
        L_000d: stloc.0 
        L_000e: ldloc.0 
        L_000f: call void [mscorlib]System.Console::WriteLine(int32)
        L_0014: ldloc.2 
        L_0015: ldc.i4.1 
        L_0016: add 
        L_0017: stloc.2 
        L_0018: ldloc.2 
        L_0019: ldloc.1 
        L_001a: ldlen 
        L_001b: conv.i4 
        L_001c: blt.s L_000a
        L_001e: ret 
}
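...but the key parts there, the loops, are basically the same: the foreach version just caches the array reference and a hidden index in compiler-generated locals (CS$6$0000 and CS$7$0001), while the for version re-reads the static field on each pass. As others have said, this is kind of a micro-optimization, too. The JIT'd x86 from these two methods is probably going to be the same, and unless you're iterating over a complex collection with a complicated enumerator, the difference is not likely to be huge even in a practical example.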
I'd use whichever one is more readable. If speed really is that much of a concern, favor a for loop, but you'd likely get better results from algorithmic optimizations.
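If you'd rather measure than guess, something like the following rough sketch could be used to time the two loop styles on your own machine. The class and method names (LoopTiming, SumWithFor, SumWithForeach, Time) are made up for illustration and are not part of the code above. It also passes the array as a parameter instead of reading a static field, so the IL will differ a little from the listings. Treat any numbers it prints as indicative only, and run it as a Release build without a debugger attached.

```csharp
using System;
using System.Diagnostics;

static class LoopTiming
{
    // Hypothetical helper: sums the array with an indexed for loop.
    // Returning the sum keeps the JIT from discarding the loop body.
    public static long SumWithFor(int[] data)
    {
        long sum = 0;
        for (int i = 0; i < data.Length; ++i)
        {
            sum += data[i];
        }
        return sum;
    }

    // Hypothetical helper: same work, written with foreach.
    public static long SumWithForeach(int[] data)
    {
        long sum = 0;
        foreach (int value in data)
        {
            sum += value;
        }
        return sum;
    }

    // Runs the given loop `repetitions` times and returns the elapsed time.
    public static TimeSpan Time(Func<int[], long> loop, int[] data, int repetitions)
    {
        loop(data); // warm-up call so JIT compilation is not part of the timing

        long total = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int r = 0; r < repetitions; ++r)
        {
            total += loop(data);
        }
        watch.Stop();

        GC.KeepAlive(total); // make sure the accumulated result is observed
        return watch.Elapsed;
    }

    static void Main()
    {
        int[] data = new int[1000000];
        for (int i = 0; i < data.Length; ++i)
        {
            data[i] = i;
        }

        Console.WriteLine("for:     {0}", Time(SumWithFor, data, 100));
        Console.WriteLine("foreach: {0}", Time(SumWithForeach, data, 100));
    }
}
```

If the two timings come out within noise of each other, which is what I'd expect for a plain array, that's a good sign that readability should decide.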