The purpose of this package is to provide a small subset of BigQuery functionality that maps well to pandas.read_gbq and pandas.DataFrame.to_gbq. Those methods in the pandas library are thin wrappers around the equivalent functions in this package.
Considerations when adding new features to pandas-gbq:
- New method? Consider an alternative, as the core focus of this library is
  read_gbq and to_gbq.
- Breaking change to an existing parameter? Consider an alternative, as folks
  could be using an older version of pandas that doesn't account for the
  change when a newer version of pandas-gbq is installed. If you must, please
  follow a 1+ year deprecation timeline.
- New parameter? Go for it! Be sure to also send a PR to pandas after the
  feature is released so that folks using the pandas wrapper can take
  advantage of it.
- New data type? OK. If there's not a good mapping to an existing pandas
  dtype, consider adding one to the db-dtypes package.
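As a rough illustration of the deprecation guidance above, the sketch below shows one common Python pattern for phasing out a parameter: keep accepting the old name, emit a FutureWarning, and forward its value to the replacement. The function `read_table` and the parameters `use_cache` and `old_cache_flag` are hypothetical names invented for this example; they are not part of pandas-gbq.

```python
import warnings


def read_table(query, *, use_cache=None, old_cache_flag=None):
    """Hypothetical function used only to illustrate a deprecation path."""
    if old_cache_flag is not None:
        # Keep honoring the old parameter, but tell callers it is going away.
        warnings.warn(
            "old_cache_flag is deprecated and will be removed in a future "
            "release; use use_cache instead.",
            FutureWarning,
            stacklevel=2,
        )
        # The new parameter wins if both are supplied.
        if use_cache is None:
            use_cache = old_cache_flag
    if use_cache is None:
        use_cache = True  # default behavior stays the same as before
    return {"query": query, "use_cache": use_cache}


# Existing callers keep working and see a warning instead of an error.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = read_table("SELECT 1", old_cache_flag=False)
```

Because the old keyword still works throughout the deprecation window, users whose installed pandas passes it through are not broken by upgrading pandas-gbq.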
The pandas-gbq package should do the "right thing" by default. This means you
should carefully choose dtypes for maximum compatibility with BigQuery and
avoid data loss. As new data types are added to BigQuery that don't have good
equivalents yet in the pandas ecosystem, equivalent dtypes should be added to
the db-dtypes package.
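One illustration of the kind of data loss a dtype choice can cause, using only pandas: a 64-bit integer column that contains NULLs cannot be stored in a plain int64 Series, and falling back to float64 silently rounds large values. The nullable Int64 extension dtype keeps the values exact. This is a sketch of the general concern, not a description of how pandas-gbq selects dtypes.

```python
import pandas as pd

# 2**53 + 1 is the smallest positive integer that float64 cannot represent.
big = 2**53 + 1

# Representing NULL as NaN forces float64, which rounds the large value.
as_float = pd.Series([big, None], dtype="float64")

# The nullable Int64 extension dtype keeps the integer exact and uses pd.NA.
as_int = pd.Series([big, None], dtype="Int64")

float_is_exact = int(as_float.iloc[0]) == big
int_is_exact = int(as_int.iloc[0]) == big
null_is_na = as_int.iloc[1] is pd.NA
```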
As new features are added that might improve performance, pandas-gbq should
offer easy ways to use them without sacrificing usability. For example, one
might consider using the api_method parameter of to_gbq to support the
BigQuery Storage Write API.
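The idea of exposing a faster backend through an existing keyword can be sketched as a simple dispatch on the parameter value. Everything below is hypothetical: the `upload` function, the helper functions, and the `"storage_write"` value are placeholders for illustration, and none of it reflects the actual to_gbq implementation.

```python
def _load_parquet(rows):
    return f"loaded {len(rows)} rows via a Parquet load job"


def _load_csv(rows):
    return f"loaded {len(rows)} rows via a CSV load job"


def _storage_write(rows):
    # Hypothetical new backend added without changing the function signature.
    return f"wrote {len(rows)} rows via the Storage Write API"


_STRATEGIES = {
    "load_parquet": _load_parquet,
    "load_csv": _load_csv,
    "storage_write": _storage_write,
}


def upload(rows, *, api_method="load_parquet"):
    """Hypothetical uploader: one keyword selects the backend."""
    try:
        strategy = _STRATEGIES[api_method]
    except KeyError:
        raise ValueError(f"unknown api_method: {api_method!r}") from None
    return strategy(rows)
```

With this shape, callers who never pass the keyword keep the existing default, while callers who want the newer backend opt in with a single argument.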
A note on pandas.read_sql: we'd like to be compatible with this too, for folks
that need better performance compared to the SQLAlchemy connector.
Unlike the more object-oriented client libraries, it's natural to have a method
with many parameters in the Python data science ecosystem. That said, the
configuration argument takes the REST representation of the job configuration,
so power users can use new features without waiting for an explicit parameter
to be added.
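A minimal sketch of how the configuration argument might be used: the dictionary mirrors the JSON shape of the BigQuery REST job configuration, here with the query fields useQueryCache and maximumBytesBilled. The helper names `build_query_configuration` and `run_query` are invented for this example, and the read_gbq call is wrapped in a function because it needs the pandas-gbq package, credentials, and a real project ID.

```python
def build_query_configuration(use_query_cache=False, maximum_bytes_billed=None):
    """Build a dict shaped like the REST job configuration for a query job."""
    query = {"useQueryCache": use_query_cache}
    if maximum_bytes_billed is not None:
        # The REST API represents int64 values as strings.
        query["maximumBytesBilled"] = str(maximum_bytes_billed)
    return {"query": query}


configuration = build_query_configuration(
    use_query_cache=False,
    maximum_bytes_billed=10**9,
)


def run_query(sql, project_id):
    """Not called here: requires pandas-gbq, credentials, and a project."""
    import pandas_gbq

    return pandas_gbq.read_gbq(
        sql,
        project_id=project_id,
        configuration=configuration,
    )
```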
Keep it simple.
Don't break existing users.
Do the right thing by default.