I looked up for any reference for pyspark equivalent of pandas df.groupby(upc)['store'].unique() where df is any dataframe in pandas.
Please use this piece of code for data frame creation in Pyspark
from pyspark.sql.types import StructType,StructField, StringType, IntegerType
from pyspark.sql import *
from datetime import date
import pyspark.sql.functions as F
spark = SparkSession.builder.appName('SparkByExamples.com').getOrCreate()
data2 = [("36636","M",3000),
    ("40288","M",4000),
    ("42114","M",3000),
    ("39192","F",4000),
    ("39192","F",2000)
  ]
schema = StructType([ \
    StructField("upc", StringType(), True), \
    StructField("store", StringType(), True), \
    StructField("sale", IntegerType(), True) \
  ])
 
df = spark.createDataFrame(data=data2,schema=schema)
I know pyspark groupby unique_count, but need help with unique_values