We have a spring boot application build using Kafka streams which needs to integrate with an external application. So we have defined two topologies:
1. Request topology:
- A (
complex-type-key,string-value) record arrives in theinput-topic. - From the 
complex-type-keywe extract an uniquestring-keyand store thestring-keyand thestring-valuein a state store (local and persistent) - We transform the 
string-valuein astring-requestand write the (string-key,string-request) record to the external application we are integrating with 
2. Response topology:
- We receive the (
string-key,string-response) record from the external application in our response topic - We look up 
string-keyin the state store and want to retrieve the originalstring-value - Based on the original 
string-valueandstring-requestwe do some processing and forward the original (complex-type-key,string-value) tooutbound-topic-1oroutbound-topic-2 
All works fine when all topics have only one partition but at soon as we define our topics with multiple partitions the lookup in the state store does not find the string-value for the given string-key. We suspected that was because complex-type-key and string-key are different types and such will be processed by different stream processors. Having the sate store local would make not possible to retrieve the string-value as in fact there would be one storage to save and a different one to look up. However we tried to convert the complex-type-key int the string-key and pass it through an internal topic with a key of type String  before saving to the state store.
This make me think our understanding about how all these state stores work is not clear and we are using it in a wrong way.
Thank you in advance for your inputs.
NOTE:
We solved the issue by remapping complex-type-key into the string-key and storing it together with the string-value into an internal topic. Then when the response comes back we window leftJoin it with this internal topic. This way we gave up the usage of the state store entirely. However I still believe our initial approach was correct, providing it worked fine with one partition. It would still be good to understand what we done wrong. My feeling is that window left join solution has the two disadvantages of consuming more storage (costly in AWS) as well as an inaccurate window length can lead to loosing responses which could mean support calls