Situation
My goal is to have a yearly cronjob that deletes certain data from a database based on age. To my disposal I have the powers of Bash and MySQL. I started with writing a bash script but then it struck me that maybe, I could do everything with just a single SQL query.
I'm more a programmer by nature and I haven't had much experience with data structures so that's why I would like some help.
Tables / data structure
The relevant tables and columns for this query are as follows:
Registration:
+-----+-------------------+
| Id  | Registration_date |
+-----+-------------------+
|   2 | 2011-10-03        | 
|   3 | 2011-10-06        | 
|   4 | 2011-10-07        | 
|   5 | 2011-10-07        | 
|   6 | 2011-10-10        | 
|   7 | 2011-10-13        | 
|   8 | 2011-10-14        | 
|   9 | 2011-10-14        | 
|  10 | 2011-10-17        |
+-------------------------+ 
AssociatedClient:
+-----------+-----------------+
| Client_id | Registration_id |
+-----------+-----------------+
|         2 |               2 | 
|         3 |               2 | 
|         3 |               4 | 
|         4 |               5 | 
|         3 |               6 | 
|         5 |               6 | 
|         3 |               8 | 
|         8 |               9 | 
|         7 |              10 | 
+-----------------------------+
Client: only Id is relevant here.
As you can see, this is a simple many-to-many relationship. A client can have multiple registrations to his name, and a registration can have multiple clients.
The goal
I need to delete all registrations and client data for clients who have not had a new registration in 5 years. Sounds simple, right?
The tricky part
The data should be kept if any other client on any registration from a specific client has a new registration within 5 years.
So imagine client A having 4 registrations with just him in them, and 1 registration with himself and client B. All 5 registrations are older than 5 years. If client B did not have a new registration in 5 years, everything should be deleted: client A registrations and record. If B did have a new registration within 5 years, all client A data should be kept, including his own old registrations.
What I've tried
Building my query, I got about this far:
DELETE * FROM `Registration` AS Reg
WHERE TIMESTAMPDIFF(YEAR, Reg.`Registration_date`, NOW()) >= 5
AND 
    (COUNT(`Id`) FROM `Registration` AS Reg2
     WHERE Reg2.`Id` IN (SELECT `Registration_id` FROM `AssociatedClient` AS Clients
                         WHERE Clients.`Client_id` IN (SELECT `Client_id` FROM `AssociatedClient` AS Clients2
                                                       WHERE Clients2.`Registration_id` IN -- stuck
               #I need all the registrations from the clients associated with the first
               # (outer) registration here, that are newer than 5 years.
    ) = 0 -- No newer registrations from any associated clients
Please understand that I have very limited experience with SQL. I realise that even what I got so far can be heavily optimised (with joins etc) and may not even be correct.
The reason I got stuck is that the solution I had in mind would work if I could use some kind of loop, and I only just realised that this is not something you easily do in an SQL query of this kind.
Any help
Is much appreciated.
 
     
     
     
     
     
     
    