I have 140K sentences that I want to get embeddings for. I am using the TF Hub Universal Sentence Encoder and iterating over the sentences one at a time (I know it's not the best way, but when I try to feed more than 500 sentences into the model at once, it crashes). My environment is: Ubuntu 18.04, Python 3.7.4, TF 1.14, RAM: 16 GB, processor: i5.
My code is:
Version 1: I iterate inside the tf.Session context manager.
    embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder-large/3")
    df = pandas_repository.get_dataframe_from_table('sentences')
    with tf.compat.v1.Session() as session:
        session.run(tf.global_variables_initializer())
        session.run(tf.tables_initializer())
        sentence_embedding = None
        for i, row in df.iterrows():
            sentence = row['content']
            embeddings = embed([sentence])
            sentence_embedding = session.run(embeddings)
            df.at[i, 'embedding'] = sentence_embedding
            print('processed index:', i)
Version 2: I open and close a session within each iteration.
    embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder-large/3")
    df = pandas_repository.get_dataframe_from_table('sentences')
    for i, row in df.iterrows():
        sentence = row['content']
        embeddings = embed([sentence])
        sentence_embedding = None
        with tf.compat.v1.Session() as session:
            session.run(tf.global_variables_initializer())
            session.run(tf.tables_initializer())
            sentence_embedding = session.run(embeddings)
            df.at[i, 'embedding'] = sentence_embedding
            print('processed index:', i)
Version 2 does seem to trigger some sort of GC, and memory is cleared a bit, but it still blows up after a little over 50 items.
Version 1 just keeps gobbling memory.
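As far as I understand it, both versions call `embed([sentence])` inside the loop, and in TF 1.x graph mode each such call adds new ops to the default graph, so the graph keeps growing no matter how the sessions are handled. Here is a minimal sketch of the alternative: build the graph once around a string placeholder and only feed data inside the loop. The helper names `iter_batches` and `embed_sentences` and the batch size of 256 are my own assumptions, not part of any library, and the function assumes at least one sentence is passed in.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_sentences(sentences, batch_size=256):
    """Embed a non-empty list of sentences with a graph built exactly once (TF 1.x)."""
    # Imported lazily so the batching helper above works without TensorFlow.
    import numpy as np
    import tensorflow as tf
    import tensorflow_hub as hub

    graph = tf.Graph()
    with graph.as_default():
        # Build the graph ONCE: a placeholder for a batch of strings feeds the module.
        text_input = tf.compat.v1.placeholder(tf.string, shape=[None])
        embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder-large/3")
        embeddings = embed(text_input)
        init_ops = [tf.compat.v1.global_variables_initializer(),
                    tf.compat.v1.tables_initializer()]
    # After finalize(), accidentally adding ops in the loop raises an error
    # instead of silently growing the graph.
    graph.finalize()

    results = []
    with tf.compat.v1.Session(graph=graph) as session:
        session.run(init_ops)
        for batch in iter_batches(list(sentences), batch_size):
            # Only data is fed here; no new ops are created per iteration.
            results.append(session.run(embeddings, feed_dict={text_input: batch}))
    return np.concatenate(results, axis=0)
```

Calling `embed_sentences(df['content'].tolist())` would then return one row of embeddings per sentence. The batch size is a knob to tune against available RAM.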
The correct solution, as given by arnoegw:
    def calculate_embeddings(dataframe, table_name):
        # Fetch up to 1500 sentences that do not have an embedding yet.
        sql_get_sentences = "SELECT * FROM semantic_similarity.sentences WHERE embedding IS NULL LIMIT 1500"
        sql_update = 'UPDATE {} SET embedding = data.embedding FROM (VALUES %s) AS data(id, embedding) WHERE {}.id = data.id'.format(table_name, table_name)
        df = pandas_repository.get_dataframe_from_sql(sql_get_sentences)
        # eval_function_for_module builds the graph and session once and reuses them for every call.
        with hub.eval_function_for_module("https://tfhub.dev/google/universal-sentence-encoder-large/3") as embed:
            # Loop until no sentences without an embedding remain.
            while len(df) > 0:
                sentence_array = df['content'].values
                sentence_embeddings = embed(sentence_array)
                df['embedding'] = sentence_embeddings.tolist()
                values = [tuple(x) for x in df[['id', 'embedding']].values]
                pandas_repository.update_db_from_df('semantic_similarity.sentences', sql_update, values)
                # Fetch the next batch; rows just updated no longer match the query.
                df = pandas_repository.get_dataframe_from_sql(sql_get_sentences)
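The loop above follows a fetch, embed, store, refetch pattern that only ends when a fetch comes back empty. Below is a small database-free sketch of that pattern so the termination condition is easy to check. Everything in it (`drain_in_batches`, the in-memory `pending`/`done` dictionaries, and the fake `embed_batch`) is a hypothetical stand-in, not my real repository code or the real model.

```python
def drain_in_batches(fetch_batch, embed_batch, store_batch):
    """Fetch unprocessed rows, embed and store them, and repeat until none remain.

    Returns the number of rows processed. The loop only terminates if
    store_batch makes the stored rows stop showing up in fetch_batch.
    """
    total = 0
    rows = fetch_batch()
    while len(rows) > 0:
        ids = [row_id for row_id, _ in rows]
        texts = [text for _, text in rows]
        vectors = embed_batch(texts)
        store_batch(list(zip(ids, vectors)))
        total += len(rows)
        rows = fetch_batch()
    return total


# In-memory stand-ins for the table and the model (illustration only).
pending = {1: "first sentence", 2: "second sentence", 3: "third sentence"}
done = {}

def fetch_batch(limit=2):
    # Plays the role of "WHERE embedding IS NULL LIMIT ...".
    return list(pending.items())[:limit]

def embed_batch(texts):
    # Fake one-dimensional "embedding": the length of each sentence.
    return [[float(len(text))] for text in texts]

def store_batch(pairs):
    # Plays the role of the UPDATE: stored rows leave the pending set.
    for row_id, vector in pairs:
        done[row_id] = vector
        pending.pop(row_id)

processed = drain_in_batches(fetch_batch, embed_batch, store_batch)
```

If the store step ever fails to mark rows as processed, the same rows are fetched again and the loop never ends, so that is the part worth double-checking.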
I am a newbie to TF and would appreciate any help I can get.