Since the requirement was added that the lines selected from the file have a statistically uniform distribution, I offer this simple approach.
"""randsamp - extract a random subset of n lines from a large file"""
import random

def scan_linepos(path):
    """return a list of seek offsets of the beginning of each line"""
    linepos = []
    offset = 0
    with open(path) as inf:
        # WARNING: CPython 2.7 file.tell() is not accurate on file.next()
        for line in inf:
            linepos.append(offset)
            offset += len(line)
    return linepos

def sample_lines(path, linepos, nsamp):
    """return nsamp lines from path where line offsets are in linepos"""
    offsets = random.sample(linepos, nsamp)
    offsets.sort()  # this may make file reads more efficient
    lines = []
    with open(path) as inf:
        for offset in offsets:
            inf.seek(offset)
            lines.append(inf.readline())
    return lines

dataset = 'big_data.txt'
nsamp = 5
linepos = scan_linepos(dataset) # the scan only need be done once
lines = sample_lines(dataset, linepos, nsamp)
print 'selecting %d lines from a file of %d' % (nsamp, len(linepos))
print ''.join(lines)
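Since the offset scan only needs to happen once per file, it may be worth saving the index to disk so later runs can skip it. A minimal sketch using pickle (the load_linepos helper and the cache filename are my own invention, not part of the code above):

import os
import pickle

def load_linepos(path, cache='big_data.linepos.pkl'):
    """return the offset index, reusing a pickled cache when one exists"""
    if os.path.exists(cache):
        with open(cache, 'rb') as f:
            return pickle.load(f)
    linepos = scan_linepos(path)        # the expensive full scan
    with open(cache, 'wb') as f:
        pickle.dump(linepos, f)
    return linepos

Any serialization would do here; pickle is just the path of least resistance.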
I tested it on a mock data file of 3 million lines comprising 1.7GB on disk. scan_linepos dominated the runtime, taking about 20 seconds on my not-so-hot desktop.
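For reference, a mock file of roughly that shape (3 million lines of about 570 bytes each works out to ~1.7GB) can be generated with a sketch along these lines; the exact contents are unimportant and this is only an illustration, not the file I measured:

import string

def make_mock_file(path, nlines=3 * 10**6, linelen=570):
    """write nlines lines of linelen characters each"""
    pad = (string.ascii_lowercase + ' ') * 30   # filler text, truncated below
    with open(path, 'w') as outf:
        for i in xrange(nlines):                # use range() on Python 3
            outf.write(('%09d %s' % (i, pad))[:linelen] + '\n')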
Just to check the performance of sample_lines, I used the timeit module like so:
import timeit
t = timeit.Timer('sample_lines(dataset, linepos, nsamp)', 
        'from __main__ import sample_lines, dataset, linepos, nsamp')
trials = 10 ** 4
elapsed = t.timeit(number=trials)
print u'%dk trials in %.2f seconds, %.2fµs per trial' % (trials/1000,
        elapsed, (elapsed/trials) * (10 ** 6))
I ran this for various values of nsamp: when nsamp was 100, a single sample_lines call completed in 460µs, and the cost scaled linearly with nsamp up to 47ms per call at 10k samples.
The natural next question is "Random is barely random at all?", and the answer is "sub-cryptographic, but certainly fine for bioinformatics".
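If stronger randomness were ever wanted, random.SystemRandom draws on os.urandom and exposes the same sample() method, so swapping it in is a one-line change (shown only as an illustration; the timings above used the plain random module):

import random

sysrand = random.SystemRandom()             # OS entropy via os.urandom
offsets = sysrand.sample(linepos, nsamp)    # drop-in replacement for random.sample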