For a project, I need a method of creating thousands of random strings while keeping collisions low. I'm looking for them to be only 12 characters long and uppercase only. Any suggestions?
- 
                    You mean you don't want any lowercase digits? – martineau Aug 19 '13 at 17:02
- 
                    Hmm, yeah, that should be clarified :) – Maarten Bodewes Aug 19 '13 at 17:02
- 
                    Don't forget to read this page about [the default random number generator in python](http://docs.python.org/2/library/random.html). The chance of collisions seems to be fully dependent on the size of the "random strings", but that does not mean that an attacker cannot re-create the random numbers; the random numbers generated are *not cryptographically secure*. – Maarten Bodewes Aug 19 '13 at 17:10
- 
                    Hah, right. I meant alphanumeric. – Brandon Aug 20 '13 at 15:08
7 Answers
CODE:
from random import choice
from string import ascii_uppercase
print(''.join(choice(ascii_uppercase) for i in range(12)))
OUTPUT:
5 examples:
QPUPZVVHUNSN
EFJACZEBYQEB
QBQJJEEOYTZY
EOJUSUEAJEEK
QWRWLIWDTDBD
EDIT:
If you need only digits, use the digits constant instead of the ascii_uppercase one from the string module.
3 examples:
229945986931
867348810313
618228923380
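The question was later clarified to mean uppercase alphanumeric, and a comment under the question points out that the default `random` generator is not cryptographically secure. If that matters for your use case, here is a minimal sketch using the standard library `secrets` module (Python 3.6+). The names `ALPHABET` and `random_id` are made up for illustration.

```python
import secrets
from string import ascii_uppercase, digits

# Hypothetical alphabet: 36 symbols, A-Z followed by 0-9.
ALPHABET = ascii_uppercase + digits

def random_id(length=12, alphabet=ALPHABET):
    """Return a random string drawn from the OS's secure random source."""
    return ''.join(secrets.choice(alphabet) for _ in range(length))

print(random_id())
```

This does not guarantee uniqueness by itself; it only makes the strings hard to predict.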
 
    
- 
                    Yeah, well, this is misleading: *"12 digits long and uppercase"* -- since digits can't be uppercased – Peter Varo Aug 19 '13 at 17:01
- 
                    And if you need alphanumeric, i.e. ASCII uppercase plus digits, then add `from string import digits` and use `print(''.join(choice(ascii_uppercase + digits) for i in range(12)))` – Sandeep Kanabar Jan 05 '17 at 12:45
- 
                    Does this give a unique ID each time? What if I call this function from multiple threads (e.g. 2 of them) 10000 times? What is the probability of a collision, i.e. getting the same ID at some point? – AnilJ Sep 06 '17 at 22:43
- 
                    @AnilJ for further info on how the `random` module is working, please read the official documentation on it: https://docs.python.org/3/library/random.html – Peter Varo Sep 07 '17 at 07:44
- 
                    Well, `import digits` does not work in Python 3, since `digits` lives in the `string` module. You can use `string.hexdigits` to get a mix of '0123456789abcdefABCDEF', or just `string.digits + string.ascii_letters` for digits plus all letters. – goetz Oct 31 '17 at 01:20
- 
                    @PeterVaro Few years late, but can you elaborate on that ? I do not understand how a digit can be uppercased. – Itération 122442 Jul 26 '21 at 15:09
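On the collision question raised in the comments above: a rough estimate comes from the birthday approximation, p ≈ 1 - exp(-n(n-1)/(2N)), where N is the number of possible strings and n is how many you draw. This is only a sketch, and it assumes every string is drawn uniformly and independently; the function name `collision_probability` is made up for illustration.

```python
import math

def collision_probability(n, alphabet_size=26, length=12):
    """Approximate chance that at least two of n uniformly random strings match."""
    space = alphabet_size ** length  # N, the number of possible strings
    # Birthday approximation; expm1 keeps precision when the result is tiny.
    return -math.expm1(-n * (n - 1) / (2 * space))

for n in (10_000, 1_000_000, 100_000_000):
    print(n, collision_probability(n))
```

For 10,000 strings of 12 uppercase letters this comes out on the order of 5e-10, so collisions are very unlikely at that scale, though never impossible.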
With Django, you can use the get_random_string function in the django.utils.crypto module.
get_random_string(length=12,
    allowed_chars=u'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
    Returns a securely generated random string.
    The default length of 12 with the a-z, A-Z, 0-9 character set returns
    a 71-bit value. log_2((26+26+10)^12) =~ 71 bits
Example:
get_random_string()
u'ngccjtxvvmr9'
get_random_string(4, allowed_chars='bqDE56')
u'DDD6'
If you don't want to depend on Django, here is a standalone version of it:
Code:
import random
import hashlib
import time
SECRET_KEY = 'PUT A RANDOM KEY WITH 50 CHARACTERS LENGTH HERE !!'
try:
    random = random.SystemRandom()
    using_sysrandom = True
except NotImplementedError:
    import warnings
    warnings.warn('A secure pseudo-random number generator is not available '
                  'on your system. Falling back to Mersenne Twister.')
    using_sysrandom = False
def get_random_string(length=12,
                      allowed_chars='abcdefghijklmnopqrstuvwxyz'
                                    'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'):
    """
    Returns a securely generated random string.
    The default length of 12 with the a-z, A-Z, 0-9 character set returns
    a 71-bit value. log_2((26+26+10)^12) =~ 71 bits
    """
    if not using_sysrandom:
        # This is ugly, and a hack, but it makes things better than
        # the alternative of predictability. This re-seeds the PRNG
        # using a value that is hard for an attacker to predict, every
        # time a random string is required. This may change the
        # properties of the chosen random sequence slightly, but this
        # is better than absolute predictability.
        random.seed(
            hashlib.sha256(
                ("%s%s%s" % (
                    random.getstate(),
                    time.time(),
                    SECRET_KEY)).encode('utf-8')
            ).digest())
    return ''.join(random.choice(allowed_chars) for i in range(length))
 
    
Could make a generator:
from string import ascii_uppercase
import random
from itertools import islice
def random_chars(size, chars=ascii_uppercase):
    selection = iter(lambda: random.choice(chars), object())
    while True:
        yield ''.join(islice(selection, size))
random_gen = random_chars(12)
print(next(random_gen))
# LEQIITOSJZOQ
print(next(random_gen))
# PXUYJTOTHWPJ
Then just pull from the generator when they're needed... Either using next(random_gen) when you need them, or use random_200 = list(islice(random_gen, 200)) for instance...
 
    
- 
                    @martineau can take one at a time, set up ones with different variables, can slice off to take n many at a time etc... The main difference is that it's in effect an iterable itself, instead of repeatedly calling a function... – Jon Clements Aug 19 '13 at 17:12
- 
                    `functools.partial` can fix parameters, and `list(itertools.islice(gen, n))` isn't any better than `[func() for _ in xrange(n)]` – user2357112 Aug 19 '13 at 17:58
- 
                    @user2357112 by building a generator, there's an advantage in resuming its state rather than setting up and calling a function repeatedly... Also the `list` and `islice` will work at the implementation level instead of as a list-comp that could leak its `_` (in Py 2.x) variable and has to build an unnecessary range constraint that's otherwise handled... It's also harder to build on top of functions than on streams... – Jon Clements Aug 19 '13 at 18:05
- 
                    Resuming a generator's state vs calling a function repeatedly isn't an advantage, and if you want to set up fixed parameters, `functools.partial` can do that. The fact that `list` and `islice` are in C would be an advantage if there weren't a Python-level generator and several Python-level function calls in the inner loop. Leaking the loop variable is annoying, but no reason to avoid using list comprehensions. – user2357112 Aug 19 '13 at 18:14
- 
                    If you use a generator, getting a single random string is `next(random_chars(n))`, whereas with a regular function it's just `random_chars(n)`. Looping over `k` random strings is `for s in islice(random_chars(n), k):`, whereas with a regular function, it's `for i in xrange(k): s = random_chars(n)`. I find the `islice` and `next` calls to be warning signs that you don't actually want a generator here. – user2357112 Aug 19 '13 at 18:19
- 
                    @user2357112 depends on the use-case... I was just offering another option... If it's to associate a userid in a file (for instance) with a random password, then `dict(zip(fileobj, random_gen))` is perhaps better than using a dict comp with a call() as the value). If it's going to be arbitrarily used then I'd go for the approach already suggested, but what's the point of offering a duplicate answer ;) – Jon Clements Aug 19 '13 at 18:26
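To make the `dict(zip(...))` use case from the last comment concrete, here is a small sketch. It uses a simplified Python 3 variant of the generator above, and the `user_ids` list is a hypothetical stand-in for the lines of a file. As noted in the comments under the question, the `random` module is not cryptographically secure, so treat this as an illustration of the streaming pattern rather than a way to make real passwords.

```python
import random
from itertools import islice
from string import ascii_uppercase

def random_chars(size, chars=ascii_uppercase):
    """Yield an endless stream of random strings of the given size."""
    while True:
        yield ''.join(random.choice(chars) for _ in range(size))

# Hypothetical input; in the comment this would be a file object.
user_ids = ['alice', 'bob', 'carol']

# zip stops as soon as user_ids is exhausted, so the endless generator is safe here.
passwords = dict(zip(user_ids, random_chars(12)))
print(passwords)

# Taking a fixed-size batch from the same kind of stream.
batch = list(islice(random_chars(12), 5))
print(batch)
```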
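#!/usr/bin/env python3
import random
import string
def f(n: int) -> str:
    # Pick n byte values from the uppercase ASCII alphabet, then decode them in one step.
    return bytes(random.choices(string.ascii_uppercase.encode('ascii'), k=n)).decode('ascii')
This runs faster for very large n, since it avoids str concatenation.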
 
    
For cryptographically strong pseudo-random bytes you might use the pyOpenSSL wrapper around OpenSSL.
It provides the bytes function to gather a pseudo-random sequence of bytes.
from OpenSSL import rand
b = rand.bytes(7)
BTW, 12 uppercase letters carry a little more than 56 bits of entropy (12 × log2(26) ≈ 56.4). 7 bytes give you 56 bits, which nearly covers it; 8 bytes cover it fully.
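The bit counts quoted in this thread can be checked with a few lines of arithmetic. This is just a sketch; `entropy_bits` is a made-up helper name, and it assumes each character is chosen uniformly and independently.

```python
import math

def entropy_bits(alphabet_size, length=12):
    """Bits of entropy in a uniformly random string: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)

for name, size in (("A-Z", 26), ("A-Z0-9", 36), ("a-zA-Z0-9", 62)):
    bits = entropy_bits(size)
    # Whole bytes needed to hold at least that many bits.
    print(f"{name}: {bits:.1f} bits, {math.ceil(bits / 8)} bytes")
```

The 26-letter case gives about 56.4 bits, and the 62-character case gives about 71 bits, matching the figure in the Django docstring quoted in another answer.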
 
    
- 
                    Wouldn't 12 randomly selected uppercase letters correspond to ~56.4 bits worth of entropy? – DSM Aug 19 '13 at 17:40
This function generates a random string of uppercase letters with the specified length.
For example, length = 6 produces a string such as:
YLNYVQ
    import random as r
    def generate_random_string(length):
        # Alphabet to draw from: the 26 uppercase ASCII letters.
        random_str_seq = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        # Pick one letter per position and join them into a single string.
        return ''.join(r.choice(random_str_seq) for _ in range(length))
 
    
- 
                    With above code `random_str_seq = "ABC@#$%^!&_+|*()OPQRSTUVWXYZ"` can give you even more complex results. – Iqra. Jan 18 '19 at 11:39
A generator function that avoids duplicates by storing previously generated values in a set. Note that this costs memory when generating very long strings or very many of them, and it will probably slow down somewhat as the set grows. The generator stops after the requested amount or once every possible combination has been produced.
Code:
#!/usr/bin/env python
from typing import Generator
from random import SystemRandom as RND
from string import ascii_uppercase, digits
def string_generator(size: int = 1, amount: int = 1) -> Generator[str, None, None]:
    """
    Return x random strings of a fixed length.
    :param size: string length, defaults to 1
    :type size: int, optional
    :param amount: amount of random strings to generate, defaults to 1
    :type amount: int, optional
    :yield: Yield composed random string if unique
    :rtype: Generator[str, None, None]
    """
    CHARS = list(ascii_uppercase + digits)
    LIMIT = len(CHARS) ** size
    count, check, string = 0, set(), ''
    while LIMIT > count < amount:
        string = ''.join(RND().choices(CHARS, k=size))
        if string not in check:
            check.add(string)
            yield string
            count += 1
for my_count, my_string in enumerate(string_generator(12, 20)):
    print(my_count, my_string)
Output:
0 IESUASWBRHPD
1 JGGO1THKLC9K
2 BW04A5GWBA7K
3 KDQTY72BV1S9
4 FAOL5L28VVMN
5 NLDNNBGHTRTI
6 2RV6TE6BCQ8K
7 B79B8FBPUD07
8 89VXXRHPUN41
9 DFC8QJUY6HRB
10 FXYYDKVQHC5Z
11 57KTZE67RSCU
12 389H1UT7N6CI
13 AKZMN9XITAVB
14 6T9ACH3GDAYG
15 CH8RJUQMTMBE
16 SPQ7E02ZLFD3
17 YD6JFXGIF3YF
18 ZUSA2X6OVNCN
19 JQRH6LR229Y4
 
    
