I have a table with hundreds of millions of rows, where I want to get a single list of the unique values from 2 indexed columns from the same table (with no unique row ID).
To illustrate this, let's say we have a table with a fruits column and a veggies column, and I want to build a healthy_foods list with the  unique values in both columns.
I have tried the following queries:
with UNION
WITH cte as (
    SELECT fruit, veggie
    FROM recipes
)
SELECT fruit as healthy_food
         FROM cte
         UNION --  <--- 
         SELECT veggie as healthy_food
         FROM cte;
with UNION ALL then DISTINCT ON
WITH cte as (...)
SELECT DISTINCT ON (healthy_food) healthy_food FROM  --  <--- 
(SELECT fruit as healthy_food
     FROM cte
     UNION ALL --  <--- 
     SELECT veggie as healthy_food
     FROM cte) tb;
with UNION ALL then GROUP BY
WITH cte as (...)
SELECT fruit as healthy_food
         FROM cte
         UNION ALL --  <--- 
         SELECT veggie as healthy_food
         FROM cte
GROUP BY healthy_food; --  <--- 
(and adding HAVING COUNT(*) = 1 and GROUP BY on each SELECT from the UNION)
The UNION ALL gets executed super fast, but all the duplicate-removing combinations I've tried take +15mins.
How could I optimize this query, taking into account that the 2 fields/columns come from the same table and are indexed?
(or alternatively, what would be the least expensive way to keep track of all unique values? maybe a trigger inserting on a UNIQUE table, or a view?)
 
     
     
    