-
Notifications
You must be signed in to change notification settings - Fork 322
Description
Environment details
- OS type and version: Ubuntu 20.04.5 LTS
- Python version:
python --version
3.8 - pip version:
pip --version
20.0.2 google-cloud-bigquery
version:pip show google-cloud-bigquery
3.19.0
Steps to reproduce
- Run the code in the example
- Check job from
query_and_wait
-- it is interactive - Check job from
query
-- it is batch
Code example
import google.cloud.bigquery
job_config = google.cloud.bigquery.QueryJobConfig(
priority=google.cloud.bigquery.QueryPriority.BATCH,
labels={"test": "test2"}
)
bq_client = google.cloud.bigquery.Client()
bq_client.query_and_wait("SELECT \"query_and_wait\"", job_config=job_config)
bq_client.query("SELECT \"query\"", job_config=job_config)
Details
query_and_wait
does not respect all of the QueryJobConfig
options. This is because it seems to go through a different API (jobs.query
rather than jobs.insert
), and jobs.query
doesn't even seem to respect some of these.
The query_and_wait
call above ultimately performs a jobs.query
call with this payload: {"labels": {"test": "test2"}, "priority": "BATCH", "useLegacySql": false, "formatOptions": {"useInt64Timestamp": true}, "query": "SELECT \\"query_and_wait\\"", "requestId": "..."}
The query
call performs this: '{"jobReference": {"jobId": "...", "projectId": "..."}, "configuration": {"query": {"priority": "BATCH", "useLegacySql": false, "query": "SELECT \\"query\\""}, "labels": {"test": "test2"}}}'
Note: priority
is not in the jobs.query
API, but impact goes beyond this one particular option in QueryJobConfig
. Labels happens to work because labels happens to be part of the jobs.query
API.
Impact
Although a caller could just migrate from query_and_wait
to query
if they were making the call, this is not an option with SQLAlchemy.
The Cursor.execute
function, from the dbapi, as an optimization, will call query_and_wait
without a job ID. This breaks the ability in SQLAlchemy interactions with BigQuery to supply default query job configs, which is explicitly a goal of the library. "For situations like these, or for situations where you want the Client to have a default_query_job_config you can pass many arguments in the query of the connection string."