I have a regular ETL job that runs on an AWS EC2 instance.
The workflow looks like the following:
- Bring up the EC2 instance using the EC2StartInstanceOperator operator.
- Find out the public IP using a boto3 function wrapped inside a PythonOperator. This operator pushes the IP to XCom.
- Establish an SSH hook using the public IP and run a remote command using SSHOperator.
- Stop the EC2 instance upon completion using EC2StopInstanceOperator.
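
To make the second step concrete, here is a minimal sketch of what the IP lookup could look like. It is an illustration under assumptions, not the exact code from the job: the function name get_public_ip is hypothetical, and the EC2 client is passed in as an argument so the function itself does not depend on how the client is created.

```python
from typing import Any, Optional


def get_public_ip(ec2_client: Any, instance_id: str) -> Optional[str]:
    """Return the instance's current public IPv4 address, or None if it has none.

    `ec2_client` is assumed to be a boto3 EC2 client (boto3.client("ec2")).
    """
    # describe_instances groups matching instances under "Reservations".
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance.get("InstanceId") == instance_id:
                # "PublicIpAddress" is absent while the instance is stopped
                # or when no public address is assigned.
                return instance.get("PublicIpAddress")
    return None


# Hypothetical usage inside a PythonOperator callable (placeholder values):
#   import boto3
#   client = boto3.client("ec2", region_name="us-east-1")
#   return get_public_ip(client, "i-0123456789abcdef0")
# A value returned from the callable is pushed to XCom by PythonOperator.
```

Passing the client in keeps the lookup easy to exercise with a stub, and a None result makes a missing public address explicit instead of raising a KeyError.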
The issues with the above are:
- The SSH hook (airflow.providers.ssh.hooks.ssh.SSHHook in Airflow 2.0) cannot access XCom; only operators can.
- AWS EC2 instances do not get reassigned the same public IP between runs, so I have to run the PythonOperator to find out the public IP during every run.
Thanks!