The Problem
I have a Python script that uses TensorFlow to create a multilayer perceptron net (with dropout) in order to do binary classification. Even though I've been careful to set both the Python and TensorFlow seeds, I get non-repeatable results. If I run once and then run again, I get different results. I can even run once, quit Python, restart Python, run again and get different results.
What I've Tried
I know some people posted questions about getting non-repeatable results in TensorFlow (e.g., "How to get stable results...", "set_random_seed not working...", "How to get reproducible result in TensorFlow"), and the answers usually turn out to be an incorrect use/understanding of tf.set_random_seed(). I've made sure to implement the solutions given but that has not solved my problem.
A common mistake is not realizing that tf.set_random_seed() is only a graph-level seed and that running the script multiple times will alter the graph, explaining the non-repeatable results. I used the following statement to print out the entire graph and verified (via diff) that the graph is the same even when the results are different.
print [n.name for n in tf.get_default_graph().as_graph_def().node]
I've also used function calls like tf.reset_default_graph() and tf.get_default_graph().finalize() to avoid any changes to the graph even though this is probably overkill.
The (Relevant) Code
My script is ~360 lines long so here are the relevant lines (with snipped code indicated). Any items that are in ALL_CAPS are constants that are defined in my Parameters block below.
import numpy as np
import tensorflow as tf
from copy import deepcopy
from tqdm import tqdm  # Progress bar
# --------------------------------- Parameters ---------------------------------
(snip)
# --------------------------------- Functions ---------------------------------
(snip)
# ------------------------------ Obtain Train Data -----------------------------
(snip)
# ------------------------------ Obtain Test Data -----------------------------
(snip)
random.seed(12345)
tf.set_random_seed(12345)
(snip)
# ------------------------- Build the TensorFlow Graph -------------------------
tf.reset_default_graph()
with tf.Graph().as_default():
    
    x = tf.placeholder("float", shape=[None, N_INPUT])
    y_ = tf.placeholder("float", shape=[None, N_CLASSES])
    
    # Store layers weight & bias
    weights = {
        'h1': tf.Variable(tf.random_normal([N_INPUT, N_HIDDEN_1])),
        'h2': tf.Variable(tf.random_normal([N_HIDDEN_1, N_HIDDEN_2])),
        'h3': tf.Variable(tf.random_normal([N_HIDDEN_2, N_HIDDEN_3])),
        'out': tf.Variable(tf.random_normal([N_HIDDEN_3, N_CLASSES]))
    }
    
    biases = {
        'b1': tf.Variable(tf.random_normal([N_HIDDEN_1])),
        'b2': tf.Variable(tf.random_normal([N_HIDDEN_2])),
        'b3': tf.Variable(tf.random_normal([N_HIDDEN_3])),
        'out': tf.Variable(tf.random_normal([N_CLASSES]))
    }
# Construct model
    pred = multilayer_perceptron(x, weights, biases, USE_DROP_LAYERS, DROP_KEEP_PROB)
    
    mean1 = tf.reduce_mean(weights['h1'])
    mean2 = tf.reduce_mean(weights['h2'])
    mean3 = tf.reduce_mean(weights['h3'])
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y_))
    
    regularizers = (tf.nn.l2_loss(weights['h1']) + tf.nn.l2_loss(biases['b1']) +
                    tf.nn.l2_loss(weights['h2']) + tf.nn.l2_loss(biases['b2']) +
                    tf.nn.l2_loss(weights['h3']) + tf.nn.l2_loss(biases['b3']))
    
    cost += COEFF_REGULAR * regularizers
    
    optimizer = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cost)
    
    out_labels = tf.nn.softmax(pred)
    
    sess = tf.InteractiveSession()
    sess.run(tf.initialize_all_variables())
    
    tf.get_default_graph().finalize()  # Lock the graph as read-only
    
    #Print the default graph in text form    
    print [n.name for n in tf.get_default_graph().as_graph_def().node]
    
    # --------------------------------- Training ----------------------------------
    
    print "Start Training"
    pbar = tqdm(total = TRAINING_EPOCHS)
    for epoch in range(TRAINING_EPOCHS):
        avg_cost = 0.0
        batch_iter = 0
        
        train_outfile.write(str(epoch))
        
        while batch_iter < BATCH_SIZE:
            train_features = []
            train_labels = []
            batch_segments = random.sample(train_segments, 20)
            for segment in batch_segments:
                train_features.append(segment[0])
                train_labels.append(segment[1])
            sess.run(optimizer, feed_dict={x: train_features, y_: train_labels})
            line_out = "," + str(batch_iter) + "\n"
            train_outfile.write(line_out)
            line_out = ",," + str(sess.run(mean1, feed_dict={x: train_features, y_: train_labels}))
            line_out += "," + str(sess.run(mean2, feed_dict={x: train_features, y_: train_labels}))
            line_out += "," + str(sess.run(mean3, feed_dict={x: train_features, y_: train_labels})) + "\n"
            train_outfile.write(line_out)
            avg_cost += sess.run(cost, feed_dict={x: train_features, y_: train_labels})/BATCH_SIZE
            batch_iter += 1
    
        line_out = ",,,,," + str(avg_cost) + "\n"
        train_outfile.write(line_out)
        pbar.update(1)  # Increment the progress bar by one
    
    train_outfile.close()
    print "Completed training"
# ------------------------------ Testing & Output ------------------------------
keep_prob = 1.0  # Do not use dropout when testing
print "now reducing mean"
print(sess.run(mean1, feed_dict={x: test_features, y_: test_labels}))
print "TRUE LABELS"
print(test_labels)
print "PREDICTED LABELS"
pred_labels = sess.run(out_labels, feed_dict={x: test_features})
print(pred_labels)
output_accuracy_results(pred_labels, test_labels)
sess.close()
What's not repeatable
As you can see, I'm outputting results during each epoch to a file and also printing out accuracy numbers at the end. None of these match from run to run, even though I believe I've set the seed(s) correctly. I've used both random.seed(12345) and tf.set_random_seed(12345)
Set-up details
TensorFlow version 0.8.0 (CPU only)
Enthought Canopy version 1.7.2  (Python 2.7, not 3.+)
Mac OS X version 10.11.3