I currently have 3 tables roughly described as the following SQLAlchemy mapping:
class Task(BASE):
    __tablename__ = 'tasks'
    id = Column(Integer, primary_key=True)
    service_id = Column(Integer, ForeignKey('services.id'))
    service = relationship('Service', back_populates="tasks")
    updates = relationship("TaskUpdate")
class TaskUpdate(BASE):
    __tablename__ = 'task_updates'
    id = Column(Integer, primary_key=True)
    external_status = Column(String(32))
    external_updated_at = Column(DateTime(timezone=True))
    task_id = Column(Integer, ForeignKey('tasks.id'))
    task = relationship('Task', back_populates="updates")
class Service(BASE):
    __tablename__ = 'services'
    id = Column(Integer, primary_key=True)
    client_id = Column(Integer, ForeignKey('clients.id'))
    client = relationship('Client', back_populates='services')
So I have one-to-many relationship from Task to TaskUpdates and a many-to-one from Task to Service.
I'm trying to create a query to get all Tasks where their latest TaskUpdate (by timestamp) has an external_status that is "New", or "Open."
Here's what I got:
sub = SESSION.query(
        TaskUpdate.task_id,
        TaskUpdate.external_status.label('last_status'),
        func.max(TaskUpdate.external_updated_at).label('last_update')
        ).group_by(TaskUpdate.task_id
        ).subquery()
tasks = SESSION.query(Task
        ).join(Service
        ).filter(Service.client_id == client_id
        ).join((sub, sub.c.task_id == Task.id)
        ).filter(sub.c.last_status.in_(['New', 'Open']))
When I run this, I get this error:
ProgrammingError: (psycopg2.ProgrammingError) column "task_updates.external_status" must appear in the GROUP BY clause or be used in an aggregate function
I'd appreciate any help you can give. This is important.
Update 1 (this is the SQL that ended up working (as far as I can tell, I can't test the frontend until I get this working in SQLAlchemy though:
SELECT t.* FROM ( 
  SELECT DISTINCT ON (task_id) task_id, external_status 
  FROM task_updates 
  ORDER BY task_id, external_updated_at DESC NULLS LAST) tu 
JOIN tasks t ON t.id = tu.task_id 
JOIN services s ON s.id = t.service_id 
WHERE s.client_id = '" + str(client_id) + "' 
AND tu.external_status IN ('New', 'Open');
Here's my conversion attempt, still not working:
sub = SESSION.query(TaskUpdate).distinct(TaskUpdate.task_id).order_by(TaskUpdate.task_id.desc().nullslast(), TaskUpdate.external_updated_at.desc().nullslast()).subquery()
tasks = SESSION.query(Task).join(Service).join(sub.c.task_id==Task.id).filter(TaskUpdate.external_status.in_(['New', 'Open']))
Update 2: The query I have below works, but when I do .count() it returns the total number of TaskUpdates, not tasks and I suspect the query will need to be redone a different way, unless someone knows a way to handle this?
 
     
     
    