GCSToGCSOperator ignores replace parameter when there is no wildcard #23162

@Yao-ATG

Description

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

Latest

Apache Airflow version

2.2.5 (latest released)

Operating System

MacOS 12.2.1

Deployment

Composer

Deployment details

No response

What happened

I ran the same DAG twice with 'replace = False'; on the second run the destination files were overwritten anyway.
The source_object does not include a wildcard.

I am not sure whether this incorrect behavior also occurs in the "with wildcard" scenario, but in the source code
https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/transfers/gcs_to_gcs.py
at line 346 (inside _copy_source_with_wildcard) we have
if not self.replace:
whereas _copy_source_without_wildcard does not check self.replace at all.
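
To make the missing check concrete, here is a minimal sketch of the kind of guard the no-wildcard path could apply. It is not the operator's actual code: InMemoryHook and copy_single_object are hypothetical names introduced for illustration, and the fake hook only mimics the exists() and rewrite() methods of GCSHook so the example runs without Airflow or GCS.

```python
class InMemoryHook:
    """Hypothetical stand-in for GCSHook; only mimics exists() and rewrite()."""

    def __init__(self, objects):
        # Maps (bucket_name, object_name) -> content.
        self.objects = dict(objects)

    def exists(self, bucket_name, object_name):
        return (bucket_name, object_name) in self.objects

    def rewrite(self, source_bucket, source_object,
                destination_bucket, destination_object):
        # Copy the source content over whatever is at the destination.
        self.objects[(destination_bucket, destination_object)] = \
            self.objects[(source_bucket, source_object)]


def copy_single_object(hook, source_bucket, source_object,
                       destination_bucket, destination_object, replace):
    """Copy one object, honoring replace=False (assumed desired behavior).

    Returns True if the object was copied, False if it was skipped.
    """
    if not replace and hook.exists(destination_bucket, destination_object):
        # Destination already present and replace is disabled: skip.
        return False
    hook.rewrite(source_bucket, source_object,
                 destination_bucket, destination_object)
    return True


hook = InMemoryHook({("src", "a.txt"): "new", ("dst", "a.txt"): "old"})
skipped = copy_single_object(hook, "src", "a.txt", "dst", "a.txt", replace=False)
print(skipped, hook.objects[("dst", "a.txt")])
```

With replace=False the existing destination object is left untouched; passing replace=True would overwrite it.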

What you think should happen instead

With 'replace = False', the second run should skip any file that already exists at the destination.

How to reproduce

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
