Say, I am building a camera app. Every time the user clicks a photo, the image is stored on the cloud. Because I want to restrict how many images are stored on the cloud, the app gets 10 URLs in an array called listURLs when it is initialised.
The first 10 clicks get PUT into the cloud, exhausting listURLs. Then, every time a click happens, a coin toss determines whether the latest click replaces an existing click on the cloud. Typical numbers would be 50 clicks, first 10 clicks get assigned a URL, and of the remaining 40 clicks, 20 of them overwrite an existing URL.
I store records of each app session in a Postgres DB. Each session will have an ID and all instances of clicks (which may or may not have a corresponding url). I also need to know the url corresponding to each click, if one exists. So, if there are 30 clicks, I will need to know which 10 of these have a corresponding url.
I can think of two ways of storing this data.
- tblClicksURLsas a Table that has- click_id,- urland- url_activeas its fields. Every time a- click_idand non-null- urlneed to be inserted, update all other records with the same- urlto have- url_activeas- false.
- Two tables - tblClicksand- tblURLs.- tblURLshas a- click_idforeign key. Every time a- click_idand non-null- urlneed to be inserted, the- click_idgets inserted into- tblClicksand- click_idand- urlget upserted into- tblURLs. The upsert is based on whether the- urlalready exists in- tblURLs. So, for a given- url, there will only be one- click_idin- tblURLs
So, in Case 1, I will have an UPDATE of url_active followed by  INSERT on the same table. In Case 2, I will have an INSERT into one table and an UPSERT into another. I will need indexing on click_id, but not on url.
If you are looking at writes of > 10k rows per second, maybe even more, which of these two would be more efficient? Assume that the numbers per session are similar to the one quoted above (50 clicks, etc.)
I could also register a created_at datetime for each record in Case 1, and just use the first non-null url ordered reverse-chronologically. But, I am trying to avoid this, unless the performance benefits are enormous.
