Commit 502ba30

1 parent 75065ac commit 502ba30

12 files changed: +74 -3 lines changed

.markdownlint.yml

Lines changed: 0 additions & 3 deletions
@@ -34,9 +34,6 @@ MD013: false
 # MD014/commands-show-output
 MD014: false

-# MD022/blanks-around-headings/blanks-around-headers
-MD022: false
-
 # MD024/no-duplicate-heading/no-duplicate-header
 MD024: false

README.md

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@
 specific language governing permissions and limitations
 under the License.
 -->
+
 # Apache Airflow

 [![PyPI version](https://badge.fury.io/py/apache-airflow.svg)](https://badge.fury.io/py/apache-airflow)

UPDATING.md

Lines changed: 51 additions & 0 deletions
@@ -16,6 +16,7 @@
 specific language governing permissions and limitations
 under the License.
 -->
+
 # Updating Airflow

 This file documents any backwards-incompatible changes in Airflow and
@@ -166,6 +167,7 @@ More tips can be found in the guide:
 https://developers.google.com/style/inclusive-documentation

 -->
+
 ### Major changes

 This section describes the major changes that have been made in this release.
@@ -238,6 +240,7 @@ You should update the import paths if you are setting log configurations with th
 The old import paths still works but can be abandoned.

 #### SendGrid emailer has been moved
+
 Formerly the core code was maintained by the original creators - Airbnb. The code that was in the contrib
 package was supported by the community. The project was passed to the Apache community and currently the
 entire code is maintained by the community, so now the division has no justification, and it is only due
@@ -411,6 +414,7 @@ has also been changed to `Running Slots`.
 The Mesos Executor is removed from the code base as it was not widely used and not maintained. [Mailing List Discussion on deleting it](https://lists.apache.org/thread.html/daa9500026b820c6aaadeffd66166eae558282778091ebbc68819fb7@%3Cdev.airflow.apache.org%3E).

 #### Change dag loading duration metric name
+
 Change DAG file loading duration metric from
 `dag.loading-duration.<dag_id>` to `dag.loading-duration.<dag_file>`. This is to
 better handle the case when a DAG file has multiple DAGs.
@@ -503,6 +507,7 @@ To maintain consistent behavior, both successful or skipped downstream task can
 `wait_for_downstream=True` flag.

 #### `airflow.utils.helpers.cross_downstream`
+
 #### `airflow.utils.helpers.chain`

 The `chain` and `cross_downstream` methods are now moved to airflow.models.baseoperator module from
@@ -532,6 +537,7 @@ from airflow.models.baseoperator import cross_downstream
 ```

 #### `airflow.operators.python.BranchPythonOperator`
+
 `BranchPythonOperator` will now return a value equal to the `task_id` of the chosen branch,
 where previously it returned None. Since it inherits from BaseOperator it will do an
 `xcom_push` of this value if `do_xcom_push=True`. This is useful for downstream decision-making.
@@ -602,13 +608,21 @@ in `SubDagOperator`.


 #### `airflow.providers.google.cloud.operators.datastore.CloudDatastoreExportEntitiesOperator`
+
 #### `airflow.providers.google.cloud.operators.datastore.CloudDatastoreImportEntitiesOperator`
+
 #### `airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator`
+
 #### `airflow.providers.ssh.operators.ssh.SSHOperator`
+
 #### `airflow.providers.microsoft.winrm.operators.winrm.WinRMOperator`
+
 #### `airflow.operators.bash.BashOperator`
+
 #### `airflow.providers.docker.operators.docker.DockerOperator`
+
 #### `airflow.providers.http.operators.http.SimpleHttpOperator`
+
 #### `airflow.providers.http.operators.http.SimpleHttpOperator`

 The `do_xcom_push` flag (a switch to push the result of an operator to xcom or not) was appearing in different incarnations in different operators. It's function has been unified under a common name (`do_xcom_push`) on `BaseOperator`. This way it is also easy to globally disable pushing results to xcom.
@@ -665,6 +679,7 @@ replaced with its corresponding new path.
 | ``airflow.LoggingMixin`` | ``airflow.utils.log.logging_mixin.LoggingMixin`` |
 | ``airflow.conf`` | ``airflow.configuration.conf`` |
 | ``airflow.AirflowException`` | ``airflow.exceptions.AirflowException`` |
+
 #### Variables removed from the task instance context

 The following variables were removed from the task instance context:
@@ -711,6 +726,7 @@ The old method is still works but can be abandoned at any time. The changes are
 that are rarely used.

 #### `airflow.models.dag.DAG.create_dagrun`
+
 DAG.create_dagrun accepts run_type and does not require run_id
 This change is caused by adding `run_type` column to `DagRun`.

@@ -799,6 +815,7 @@ untangle cyclic imports between DAG, BaseOperator, SerializedDAG, SerializedBase
 part of AIRFLOW-6010.

 #### `airflow.utils.log.logging_mixin.redirect_stderr`
+
 #### `airflow.utils.log.logging_mixin.redirect_stdout`

 Function `redirect_stderr` and `redirect_stdout` from `airflow.utils.log.logging_mixin` module has
@@ -885,6 +902,7 @@ This section describes the changes that have been made, and what you need to do
 you use operators or hooks which integrate with Google services (including Google Cloud - GCP).

 #### Direct impersonation added to operators communicating with Google services
+
 [Directly impersonating a service account](https://cloud.google.com/iam/docs/understanding-service-accounts#directly_impersonating_a_service_account)
 has been made possible for operators communicating with Google services via new argument called `impersonation_chain`
 (`google_impersonation_chain` in case of operators that also communicate with services of other cloud providers).
@@ -1123,8 +1141,11 @@ operators/hooks. Otherwise, ``google_cloud_default`` will be used as GCP's conn_
 by default.

 #### `airflow.providers.google.cloud.hooks.dataflow.DataflowHook`
+
 #### `airflow.providers.google.cloud.operators.dataflow.DataflowCreateJavaJobOperator`
+
 #### `airflow.providers.google.cloud.operators.dataflow.DataflowTemplatedJobStartOperator`
+
 #### `airflow.providers.google.cloud.operators.dataflow.DataflowCreatePythonJobOperator`

 To use project_id argument consistently across GCP hooks and operators, we did the following changes:
@@ -1165,6 +1186,7 @@ Will now call:
 Where '.keep' is a single file at your prefix that the sensor should not consider new.

 #### `airflow.providers.google.cloud.hooks.bigquery.BigQueryBaseCursor`
+
 #### `airflow.providers.google.cloud.hooks.bigquery.BigQueryHook`

 To simplify BigQuery operators (no need of `Cursor`) and standardize usage of hooks within all GCP integration methods from `BiqQueryBaseCursor`
@@ -1209,14 +1231,17 @@ exceptions raised by the following methods:
 * `airflow.providers.google.cloud.hooks.bigquery.BigQueryBaseCursor.get_dataset` raises `AirflowException` instead of `ValueError`.

 #### `airflow.providers.google.cloud.operators.bigquery.BigQueryCreateEmptyTableOperator`
+
 #### `airflow.providers.google.cloud.operators.bigquery.BigQueryCreateEmptyDatasetOperator`

 Idempotency was added to `BigQueryCreateEmptyTableOperator` and `BigQueryCreateEmptyDatasetOperator`.
 But to achieve that try / except clause was removed from `create_empty_dataset` and `create_empty_table`
 methods of `BigQueryHook`.

 #### `airflow.providers.google.cloud.hooks.dataflow.DataflowHook`
+
 #### `airflow.providers.google.cloud.hooks.mlengine.MLEngineHook`
+
 #### `airflow.providers.google.cloud.hooks.pubsub.PubSubHook`

 The change in GCP operators implies that GCP Hooks for those operators require now keyword parameters rather
@@ -1226,11 +1251,17 @@ in case they are called using positional parameters.
 Other GCP hooks are unaffected.

 #### `airflow.providers.google.cloud.hooks.pubsub.PubSubHook`
+
 #### `airflow.providers.google.cloud.operators.pubsub.PubSubTopicCreateOperator`
+
 #### `airflow.providers.google.cloud.operators.pubsub.PubSubSubscriptionCreateOperator`
+
 #### `airflow.providers.google.cloud.operators.pubsub.PubSubTopicDeleteOperator`
+
 #### `airflow.providers.google.cloud.operators.pubsub.PubSubSubscriptionDeleteOperator`
+
 #### `airflow.providers.google.cloud.operators.pubsub.PubSubPublishOperator`
+
 #### `airflow.providers.google.cloud.sensors.pubsub.PubSubPullSensor`

 In the `PubSubPublishOperator` and `PubSubHook.publsh` method the data field in a message should be bytestring (utf-8 encoded) rather than base64 encoded string.
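
Reviewer note: the Pub/Sub data-field change in this hunk's last context line is easier to see side by side. The following is a minimal sketch using only the Python standard library. The `{"data": ...}` dictionary shape and the variable names are assumptions made for illustration, and no Airflow or Pub/Sub call is made.

```python
import base64

text = "Hello, World!"

# Before: the data field carried a base64-encoded string.
old_message = {"data": base64.b64encode(text.encode("utf-8")).decode("ascii")}

# After: the data field carries the raw UTF-8 encoded bytestring.
new_message = {"data": text.encode("utf-8")}

# Decoding the old payload recovers the bytes that are now passed directly.
assert base64.b64decode(old_message["data"]) == new_message["data"]
```

Code that still base64-encodes the payload before publishing would, under the new behaviour, presumably send the encoded text itself as the message body, so the encoding step should be dropped rather than kept alongside the change.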
@@ -1267,10 +1298,15 @@ Detailed information about connection management is available:
 * The `maxResults` parameter in `GoogleCloudStorageHook.list` has been renamed to `max_results` for consistency.

 #### `airflow.providers.google.cloud.operators.dataproc.DataprocSubmitPigJobOperator`
+
 #### `airflow.providers.google.cloud.operators.dataproc.DataprocSubmitHiveJobOperator`
+
 #### `airflow.providers.google.cloud.operators.dataproc.DataprocSubmitSparkSqlJobOperator`
+
 #### `airflow.providers.google.cloud.operators.dataproc.DataprocSubmitSparkJobOperator`
+
 #### `airflow.providers.google.cloud.operators.dataproc.DataprocSubmitHadoopJobOperator`
+
 #### `airflow.providers.google.cloud.operators.dataproc.DataprocSubmitPySparkJobOperator`

 The 'properties' and 'jars' properties for the Dataproc related operators (`DataprocXXXOperator`) have been renamed from
@@ -1306,7 +1342,9 @@ previous one was (project_id, dataset_id, ...) (breaking change)
 favor of `list_rows`. (breaking change)

 #### `airflow.providers.google.cloud.hooks.dataflow.DataflowHook.start_python_dataflow`
+
 #### `airflow.providers.google.cloud.hooks.dataflow.DataflowHook.start_python_dataflow`
+
 #### `airflow.providers.google.cloud.operators.dataflow.DataflowCreatePythonJobOperator`

 Change python3 as Dataflow Hooks/Operators default interpreter
@@ -1379,9 +1417,13 @@ Migrated are:
 | airflow.contrib.sensors.aws_sqs_sensor.SQSSensor | airflow.providers.amazon.aws.sensors.sqs.SQSSensor |

 #### `airflow.providers.amazon.aws.hooks.emr.EmrHook`
+
 #### `airflow.providers.amazon.aws.operators.emr_add_steps.EmrAddStepsOperator`
+
 #### `airflow.providers.amazon.aws.operators.emr_create_job_flow.EmrCreateJobFlowOperator`
+
 #### `airflow.providers.amazon.aws.operators.emr_terminate_job_flow.EmrTerminateJobFlowOperator`
+
 The default value for the [aws_conn_id](https://airflow.apache.org/howto/manage-connections.html#amazon-web-services) was accidently set to 's3_default' instead of 'aws_default' in some of the emr operators in previous
 versions. This was leading to EmrStepSensor not being able to find their corresponding emr cluster. With the new
 changes in the EmrAddStepsOperator, EmrTerminateJobFlowOperator and EmrCreateJobFlowOperator this issue is
@@ -1468,6 +1510,7 @@ Remove unnecessary parameter ``open`` in PostgresHook function ``copy_expert`` f
 Change parameter name from ``visibleTo`` to ``visible_to`` in OpsgenieAlertOperator for pylint compatible

 #### `airflow.providers.imap.hooks.imap.ImapHook`
+
 #### `airflow.providers.imap.sensors.imap_attachment.ImapAttachmentSensor`

 ImapHook:
@@ -1697,11 +1740,13 @@ Example:
 The above code returned `None` previously, now it will return `''`.

 ### Make behavior of `none_failed` trigger rule consistent with documentation
+
 The behavior of the `none_failed` trigger rule is documented as "all parents have not failed (`failed` or
 `upstream_failed`) i.e. all parents have succeeded or been skipped." As previously implemented, the actual behavior
 would skip if all parents of a task had also skipped.

 ### Add new trigger rule `none_failed_or_skipped`
+
 The fix to `none_failed` trigger rule breaks workflows that depend on the previous behavior.
 If you need the old behavior, you should change the tasks with `none_failed` trigger rule to `none_failed_or_skipped`.
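
Reviewer note: the difference between the two trigger rules described in this hunk can be sketched in plain Python. This is an illustrative model of the semantics as worded in the context lines, not Airflow's implementation; the function names and the state strings are assumptions made for the example.

```python
FAILED_STATES = {"failed", "upstream_failed"}


def none_failed(parent_states):
    """Run when no parent failed, even if every parent was skipped."""
    return not any(state in FAILED_STATES for state in parent_states)


def none_failed_or_skipped(parent_states):
    """Previous behaviour: additionally skip when all parents were skipped."""
    all_skipped = bool(parent_states) and all(
        state == "skipped" for state in parent_states
    )
    return none_failed(parent_states) and not all_skipped


all_skipped_parents = ["skipped", "skipped"]
print(none_failed(all_skipped_parents))             # True
print(none_failed_or_skipped(all_skipped_parents))  # False
```

The two rules only disagree when every parent was skipped, which is the case the migration advice in the hunk is about.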

@@ -1716,6 +1761,7 @@ No breaking changes.
 ## Airflow 1.10.8

 ### Failure callback will be called when task is marked failed
+
 When task is marked failed by user or task fails due to system failures - on failure call back will be called as part of clean up

 See [AIRFLOW-5621](https://jira.apache.org/jira/browse/AIRFLOW-5621) for details
@@ -1887,6 +1933,7 @@ they contain the strings "airflow" and "DAG". For backwards
 compatibility, this option is enabled by default.

 ### RedisPy dependency updated to v3 series
+
 If you are using the Redis Sensor or Hook you may have to update your code. See
 [redis-py porting instructions] to check if your code might be affected (MSET,
 MSETNX, ZADD, and ZINCRBY all were, but read the full doc).
@@ -1961,6 +2008,7 @@ Hooks involved:
 Other GCP hooks are unaffected.

 ### Changed behaviour of using default value when accessing variables
+
 It's now possible to use `None` as a default value with the `default_var` parameter when getting a variable, e.g.

 ```python
@@ -2076,6 +2124,7 @@ that he has permissions on. If a new role wants to access all the dags, the admi
 We also provide a new cli command(``sync_perm``) to allow admin to auto sync permissions.

 ### Modification to `ts_nodash` macro
+
 `ts_nodash` previously contained TimeZone information along with execution date. For Example: `20150101T000000+0000`. This is not user-friendly for file or folder names which was a popular use case for `ts_nodash`. Hence this behavior has been changed and using `ts_nodash` will no longer contain TimeZone information, restoring the pre-1.10 behavior of this macro. And a new macro `ts_nodash_with_tz` has been added which can be used to get a string with execution date and timezone info without dashes.

 Examples:
@@ -2088,6 +2137,7 @@ Examples:
 next_ds/prev_ds now map to execution_date instead of the next/previous schedule-aligned execution date for DAGs triggered in the UI.

 ### User model changes
+
 This patch changes the `User.superuser` field from a hardcoded boolean to a `Boolean()` database column. `User.superuser` will default to `False`, which means that this privilege will have to be granted manually to any users that may require it.

 For example, open a Python shell and
@@ -2590,6 +2640,7 @@ indefinitely. This is only available on the command line.
 After how much time should an updated DAG be picked up from the filesystem.

 #### min_file_parsing_loop_time
+
 CURRENTLY DISABLED DUE TO A BUG
 How many seconds to wait between file-parsing loops to prevent the logs from being spammed.

UPGRADING_TO_2.0.md

Lines changed: 12 additions & 0 deletions
@@ -16,6 +16,7 @@
 specific language governing permissions and limitations
 under the License.
 -->
+
 # Upgrading to Airflow 2.0+

 This file documents any backwards-incompatible changes in Airflow and
@@ -92,6 +93,7 @@ goal is that any Airflow setup that can pass these tests will be able to upgrade


 ## Step 3: Set Operators to Backport Providers
+
 Now that you are set up in airflow 1.10.13 with python a 3.6+ environment, you are ready to start porting your DAGs to Airfow 2.0 compliance!

 The most important step in this transition is also the easiest step to do in pieces. All Airflow 2.0 operators are backwards compatible with Airflow 1.10
@@ -443,9 +445,11 @@ For Airflow 2.0, the traditional `executor_config` will continue operation with
 but will be removed in a future version.

 ## Appendix
+
 ### Changed Parameters for the KubernetesPodOperator

 #### port has migrated from a List[Port] to a List[V1ContainerPort]
+
 Before:
 ```python
 from airflow.kubernetes.pod import Port
475479
```
476480

477481
#### volume_mounts has migrated from a List[VolumeMount] to a List[V1VolumeMount]
482+
478483
Before:
479484
```python
480485
from airflow.kubernetes.volume_mount import VolumeMount
@@ -509,6 +514,7 @@ k = KubernetesPodOperator(
509514
```
510515

511516
#### volumes has migrated from a List[Volume] to a List[V1Volume]
517+
512518
Before:
513519
```python
514520
from airflow.kubernetes.volume import Volume
@@ -545,7 +551,9 @@ k = KubernetesPodOperator(
545551
task_id="task",
546552
)
547553
```
554+
548555
#### env_vars has migrated from a Dict to a List[V1EnvVar]
556+
549557
Before:
550558
```python
551559
k = KubernetesPodOperator(
@@ -720,6 +728,7 @@ k = KubernetesPodOperator(
720728
resources=resources,
721729
)
722730
```
731+
723732
#### image_pull_secrets has migrated from a String to a List[k8s.V1LocalObjectReference]
724733

725734
Before:
@@ -749,6 +758,7 @@ quay_k8s = KubernetesPodOperator(
749758
```
750759

751760
### Migration Guide from Experimental API to Stable API v1
761+
752762
In Airflow 2.0, we added the new REST API. Experimental API still works, but support may be dropped in the future.
753763
If your application is still using the experimental API, you should consider migrating to the stable API.
754764

@@ -757,6 +767,7 @@ differences between the two endpoints that will help you migrate from the
 experimental REST API to the stable REST API.

 #### Base Endpoint
+
 The base endpoint for the stable API v1 is ``/api/v1/``. You must change the
 experimental base endpoint from ``/api/experimental/`` to ``/api/v1/``.
 The table below shows the differences:
@@ -777,6 +788,7 @@ The table below shows the differences:
 | DAG Lineage(GET) | /api/experimental/lineage/<DAG_ID>/<string:execution_date>/ | /api/v1/dags/{dag_id}/dagRuns/{dag_run_id}/taskInstances/{task_id}/xcomEntries |

 #### Note
+
 This endpoint ``/api/v1/dags/{dag_id}/dagRuns`` also allows you to filter dag_runs with parameters such as ``start_date``, ``end_date``, ``execution_date`` etc in the query string.
 Therefore the operation previously performed by this endpoint

airflow/providers/apache/hive/example_dags/example_twitter_README.md

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@
 specific language governing permissions and limitations
 under the License.
 -->
+
 # Example Twitter DAG

 ***Introduction:*** This example dag depicts a typical ETL process and is a perfect use case automation scenario for Airflow. Please note that the main scripts associated with the tasks are returning None. The purpose of this DAG is to demonstrate how to write a functional DAG within Airflow.

airflow/providers/google/cloud/ADDITIONAL_INFO.md

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@
 ## Additional info

 ### Breaking change in `AutoMLBatchPredictOperator`
+
 Class `AutoMLBatchPredictOperator` property `params` is renamed to `prediction_params`.
 To keep old behaviour, please rename `params` to `prediction_params` when initializing an instance of `AutoMLBatchPredictOperator`.
