Skip to content

query_and_wait use in Cursor.execute breaks default_query_job_config in SQLAlchemy BQ Dialect #1867

@wgunther

Description

@wgunther

Environment details

  • OS type and version: Ubuntu 20.04.5 LTS
  • Python version: python --version 3.8
  • pip version: pip --version 20.0.2
  • google-cloud-bigquery version: pip show google-cloud-bigquery 3.19.0

Steps to reproduce

  1. Run the code in the example
  2. Check job from query_and_wait -- it is interactive
  3. Check job from query -- it is batch

Code example

import google.cloud.bigquery

job_config = google.cloud.bigquery.QueryJobConfig(
    priority=google.cloud.bigquery.QueryPriority.BATCH,
    labels={"test": "test2"}
)
bq_client = google.cloud.bigquery.Client()
bq_client.query_and_wait("SELECT \"query_and_wait\"", job_config=job_config)
bq_client.query("SELECT \"query\"", job_config=job_config)

Details

query_and_wait does not respect all of the QueryJobConfig options. This is because it seems to go through a different API (jobs.query rather than jobs.insert), and jobs.query doesn't even seem to respect some of these.

The query_and_wait call above ultimately performs a jobs.query call with this payload: {"labels": {"test": "test2"}, "priority": "BATCH", "useLegacySql": false, "formatOptions": {"useInt64Timestamp": true}, "query": "SELECT \\"query_and_wait\\"", "requestId": "..."}

The query call performs this: '{"jobReference": {"jobId": "...", "projectId": "..."}, "configuration": {"query": {"priority": "BATCH", "useLegacySql": false, "query": "SELECT \\"query\\""}, "labels": {"test": "test2"}}}'

Note: priority is not in the jobs.query API, but impact goes beyond this one particular option in QueryJobConfig. Labels happens to work because labels happens to be part of the jobs.query API.

Impact

Although a caller could just migrate from query_and_wait to query if they were making the call, this is not an option with SQLAlchemy.

The Cursor.execute function, from the dbapi, as an optimization, will call query_and_wait without a job ID. This breaks the ability in SQLAlchemy interactions with BigQuery to supply default query job configs, which is explicitly a goal of the library. "For situations like these, or for situations where you want the Client to have a default_query_job_config you can pass many arguments in the query of the connection string."

Metadata

Metadata

Assignees

Labels

api: bigqueryIssues related to the googleapis/python-bigquery API.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions