Here the column amenity_category and parent_path is JSONB column with value like ["Tv","Air Condition"] and ["20000","20100","203"] respectively. Apart from that other columns are normal varchar and numeric type. I've around 2.5M rows with primary key on id and it is indexed. Basically the initial CTE part is taking time when rp.parent_path match multiple rows.
Sample dataset:
Current query:
WITH CTE AS
(
  SELECT id,
  property_name,
  property_type_category,
  review_score, 
  amenity_category.name, 
  count(*) AS cnt FROM table_name rp, 
  jsonb_array_elements_text(rp.amenity_categories) amenity_category(name)
  WHERE rp.parent_path ? '203' AND number_of_review >= 1
  GROUP BY amenity_category.name,id 
),
CTE2 as
(
  SELECT id, property_name,property_type_category,name,
  ROW_NUMBER() OVER (PARTITION BY property_type_category,
  name ORDER BY review_score DESC),
  COUNT(id) OVER (PARTITION BY property_type_category,
  name ORDER BY name DESC) 
  FROM CTE
)
SELECT id, property_name, property_type_category, name, COUNT 
FROM CTE2
where row_number = 1
Current Output:
So my basic question is is there any other way I can re-write this query or optimize the current query?


 
     
    