Out of curiosity, is there a way in Python to count in thirds without using range, using slices and indices instead? For example, suppose you had a sequence of codons like 'CAGCAGCAT'. Could Python divide that string into groups of three like this: CAG CAG CAT? I tried, but I failed. If there's a way, please show me how.
        - 
                    http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python is related. – rlms Feb 26 '14 at 21:47
- 
                    It's funny how he specifically said "no range" and 4 out of 6 below use range. :) – Russia Must Remove Putin Feb 26 '14 at 21:51
- 
                    Oh funny, I didn’t even read that. Wonder where that requirement comes from… – poke Feb 26 '14 at 21:54
- 
                    Even though you didn't ask for it, you might be interested in [Biopython](http://biopython.org/DIST/docs/tutorial/Tutorial.html): Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. – wolfrevo Feb 26 '14 at 21:57
6 Answers
3
            
            
import textwrap
textwrap.wrap('CAGCAGCAT', 3)
returns
['CAG', 'CAG', 'CAT']
 
    
    
        Russia Must Remove Putin
        
2
            
            
        You could use the grouper recipe, zip(*[iterator]*n), to collect items without using range. 
In [96]: data = 'CAGCAGCAT'
In [97]: [''.join(grp) for grp in zip(*[iter(data)]*3)]
Out[97]: ['CAG', 'CAG', 'CAT']
If len(data) is not a multiple of 3, then the above chops off the remainder. To prevent that, use itertools.zip_longest (named itertools.izip_longest in Python 2):
In [102]: import itertools as IT
In [108]: [''.join(grp) for grp in IT.zip_longest(*[iter('CAGCAGCATCA')]*3, fillvalue='')]
Out[108]: ['CAG', 'CAG', 'CAT', 'CA']
By the way, the grouper recipe works with any iterable, whereas textwrap.wrap works only with strings. Moreover, the grouper recipe is faster:
In [100]: %timeit textwrap.wrap(data, 3)
10000 loops, best of 3: 17.7 µs per loop
In [101]: %timeit [''.join(grp) for grp in zip(*[iter(data)]*3)]
100000 loops, best of 3: 1.78 µs per loop
Also note that textwrap.wrap may not group your string into groups of 3 characters if the string contains spaces:
In [42]: textwrap.wrap('I am a hat', 3)
Out[42]: ['I', 'am', 'a', 'hat']
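To see why the zip(*[iter(data)]*3) idiom groups items, it may help to spell it out. The following is only a sketch, assuming Python 3 (where izip_longest is spelled zip_longest); the variable names are illustrative.

```python
from itertools import zip_longest

data = 'CAGCAGCATCA'

# One iterator object, referenced three times: zip pulls from the same
# iterator for each slot, so every tuple consumes three consecutive items.
it = iter(data)
triples = list(zip(it, it, it))

# zip stops as soon as one slot runs dry, so the trailing 'CA' is dropped.
truncated = [''.join(t) for t in triples]

# zip_longest pads the last group instead; fillvalue='' keeps join working.
padded = [''.join(t) for t in zip_longest(*[iter(data)] * 3, fillvalue='')]

print(truncated)
print(padded)
```

The key point is that [iter(data)] * 3 repeats a reference to a single iterator rather than creating three independent ones.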
 
    
    
        unutbu
        
- 
                    This is my favorite way to do it (although it can leave off some end items if the length is not `0 (mod n)`). – Joran Beasley Feb 26 '14 at 21:51
- 
                    +1 lol, I never thought of that. I usually just tack on `+ data[-len(data)%n:]` – Joran Beasley Feb 26 '14 at 21:55
1
            
            
        >>> s = 'CAGCAGCAT'
>>> [''.join(g) for g in zip(s[::3], s[1::3], s[2::3])]
['CAG', 'CAG', 'CAT']
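This approach uses only extended slices, so it matches the "no range" request. One caveat: zip stops at the shortest slice, so leftover characters are dropped when the length is not a multiple of 3. A possible variant, sketched here under the assumption of Python 3 and itertools.zip_longest, keeps the tail:

```python
from itertools import zip_longest

s = 'CAGCAGCATCA'
n = 3  # group size (illustrative)

# Each extended slice picks every third character from a different offset.
columns = (s[0::n], s[1::n], s[2::n])

# zip_longest pads the shorter columns with '' so the tail 'CA' survives.
codons = [''.join(g) for g in zip_longest(*columns, fillvalue='')]
print(codons)
```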
 
    
    
        ndpu
        
1
            
            
        You can use a list comprehension; the third parameter of range is the step:
>>> s = "CAGCAGCAT"
>>> [s[i:i+3] for i in range(0, len(s), 3)]
['CAG', 'CAG', 'CAT']
>>> 
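If range really has to be avoided, the same slicing can be driven by a plain index and a while loop. This is only a sketch; split_every is a hypothetical helper name, not something from the standard library.

```python
def split_every(seq, size):
    """Split seq into consecutive slices of length size (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    chunks = []
    pos = 0
    # Advance the start index by hand instead of asking range for it.
    while pos < len(seq):
        chunks.append(seq[pos:pos + size])
        pos += size
    return chunks


codons = split_every('CAGCAGCATCA', 3)
print(codons)
```

Because slicing past the end of a string is safe in Python, the final short group needs no special handling.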
 
    
    
        pkacprzak
        
1
            
            
        You can use the grouper recipe from the itertools documentation (it is a recipe you define yourself, not a function exported by the module):
>>> s = 'CAGCAGCAT'
>>> list(grouper(s, 3))
[('C', 'A', 'G'), ('C', 'A', 'G'), ('C', 'A', 'T')]
Or in your case, you can also use simple slices:
>>> [s[i:i+3] for i in range(0, len(s), 3)]
['CAG', 'CAG', 'CAT']
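Note that grouper has to be defined before the first snippet will run. A minimal sketch in the spirit of the itertools documentation recipe, assuming Python 3, could look like this:

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length tuples, padding the last one."""
    # The same iterator repeated n times, so each tuple takes n items in a row.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


s = 'CAGCAGCAT'
groups = list(grouper(s, 3))
codons = [''.join(g) for g in groups]
print(groups)
print(codons)
```

Joining each tuple turns the result back into strings, which is usually what you want for sequence data.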
 
    
    
        poke
        
1
            
            
def chunker(seq, size):
    # Lazily yield consecutive slices of length `size` (use xrange on Python 2).
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))
Stolen from the Stack Overflow question 'What is the most "pythonic" way to iterate over a list in chunks?'
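A short usage sketch for chunker, with the function repeated so the snippet runs on its own. It assumes Python 3, where range takes the place of xrange.

```python
def chunker(seq, size):
    # Generator expression: slices are produced one at a time, on demand.
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))


# Works on strings...
codons = list(chunker('CAGCAGCAT', 3))

# ...and on any other sliceable sequence; the last chunk may be shorter.
pieces = list(chunker([1, 2, 3, 4, 5], 2))

print(codons)
print(pieces)
```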
 
     
    