PostgresOSM.read_sql_query
PostgresOSM.read_sql_query(sql_query, method='tempfile', tempfile_mode='w+b', max_size_spooled=1, delimiter=',', dtype=None, tempfile_kwargs=None, stringio_kwargs=None, **kwargs)
Read table data by executing a SQL query (recommended for large tables).
See also [SQL-P-RSQ-1], [SQL-P-RSQ-2] and [SQL-P-RSQ-3].
- Parameters
  - sql_query (str) – SQL query to be executed
  - method (str) – method to be used for buffering temporary data:
    - 'tempfile' (default): use tempfile.TemporaryFile
    - 'stringio': use io.StringIO
    - 'spooled': use tempfile.SpooledTemporaryFile
  - tempfile_mode (str) – mode of the specified method, defaults to 'w+b'
  - max_size_spooled (int or float) – max_size of tempfile.SpooledTemporaryFile (in gigabytes), defaults to 1
  - delimiter (str) – delimiter used in the data, defaults to ','
  - dtype (dict or None) – data type for specified data columns, dtype used by pandas.read_csv, defaults to None
  - tempfile_kwargs – optional parameters of tempfile.TemporaryFile or tempfile.SpooledTemporaryFile
  - stringio_kwargs – optional parameters of io.StringIO, e.g. initial_value (default: '')
  - kwargs – optional parameters of pandas.read_csv
- Returns
  data frame as queried by the statement sql_query
- Return type
  pandas.DataFrame
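The three buffering options above can be sketched without a database. This is a minimal illustration, not the method's actual implementation: `make_buffer` and `csv_payload` are hypothetical names, and the canned CSV string stands in for the data that the real method streams out of PostgreSQL into the buffer before handing it to pandas.read_csv.

```python
import io
import tempfile

import pandas

# Canned CSV standing in for the query result streamed out of the database.
csv_payload = 'city,easting,northing\nLondon,530034,180381\nLeeds,582044,152953\n'


def make_buffer(method='tempfile', tempfile_mode='w+b', max_size_spooled=1):
    """Hypothetical helper mirroring the three documented buffering options."""
    if method == 'tempfile':
        return tempfile.TemporaryFile(mode=tempfile_mode)
    if method == 'stringio':
        return io.StringIO()
    if method == 'spooled':
        # tempfile.SpooledTemporaryFile takes max_size in bytes;
        # the docstring above counts it in gigabytes.
        return tempfile.SpooledTemporaryFile(
            max_size=int(max_size_spooled * 10 ** 9), mode=tempfile_mode)
    raise ValueError("method must be 'tempfile', 'stringio' or 'spooled'")


buf = make_buffer('stringio')
buf.write(csv_payload)  # the real method would COPY query results into the buffer
buf.seek(0)             # rewind before handing the buffer to pandas
dat = pandas.read_csv(buf, delimiter=',')
```

Whichever option is chosen, the buffer is written once, rewound, and parsed; 'spooled' keeps small results in memory and spills to disk only past max_size.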
Examples:
>>> import pandas
>>> from pyhelpers.sql import PostgreSQL

>>> testdb = PostgreSQL('localhost', 5432, username='postgres', database_name='testdb')
Password (postgres@localhost:5432): ***
Connecting postgres:***@localhost:5432/testdb ... Successfully.

>>> # Create a pandas.DataFrame
>>> xy_array = [(530034, 180381),
...             (406689, 286822),
...             (383819, 398052),
...             (582044, 152953)]
>>> idx_labels = ['London', 'Birmingham', 'Manchester', 'Leeds']
>>> col_names = ['Easting', 'Northing']
>>> dat = pandas.DataFrame(xy_array, index=idx_labels, columns=col_names)
>>> print(dat)
            Easting  Northing
London       530034    180381
Birmingham   406689    286822
Manchester   383819    398052
Leeds        582044    152953

>>> table = 'England'
>>> schema = 'points'
>>> testdb.import_data(dat, table, schema, if_exists='replace', index=True,
...                    chunk_size=None, force_replace=False, col_type=None, verbose=2)
To import data into "points"."England" at postgres:***@localhost:5432/testdb
? [No]|Yes: yes
Creating a schema: "points" ... Done.
Importing the data into the table "points"."England" ... Done.

>>> res = testdb.table_exists(table, schema)
>>> print('The table "{}"."{}" exists? {}.'.format(schema, table, res))
The table "points"."England" exists? True.

>>> dat_retrieval = testdb.read_table(table, schema, index_col='index')
>>> dat_retrieval.index.name = None
>>> print(dat_retrieval)
            Easting  Northing
London       530034    180381
Birmingham   406689    286822
Manchester   383819    398052
Leeds        582044    152953

>>> # Alternatively
>>> sql_qry = 'SELECT * FROM "{}"."{}"'.format(schema, table)
>>> dat_retrieval_alt = testdb.read_sql_query(sql_qry, index_col='index')
>>> dat_retrieval_alt.index.name = None
>>> print(dat_retrieval_alt)
            Easting  Northing
London       530034    180381
Birmingham   406689    286822
Manchester   383819    398052
Leeds        582044    152953

>>> # Delete the table "England"
>>> testdb.drop_table(table_name=table, schema_name=schema, verbose=True)
To drop the table "points"."England" from postgres:***@localhost:5432/testdb
? [No]|Yes: yes
Dropping "points"."England" ... Done.

>>> # Delete the schema "points"
>>> testdb.drop_schema(schema, verbose=True)
To drop the schema "points" from postgres:***@localhost:5432/testdb
? [No]|Yes: yes
Dropping "points" ... Done.

>>> # Delete the database "testdb"
>>> testdb.drop_database(verbose=True)
To drop the database "testdb" from postgres:***@localhost:5432
? [No]|Yes: yes
Dropping "testdb" ... Done.
Aside: a brief example of using the parameter params for pandas.read_sql (note that the keys of the params dict must match the %(name)s placeholders in the query):

    import datetime

    sql_qry = ('SELECT * FROM "table_name" '
               'WHERE "timestamp_column_name" BETWEEN %(ts_start)s AND %(ts_end)s')
    params = {'ts_start': datetime.datetime.today(), 'ts_end': datetime.datetime.today()}
    data_frame = pandas.read_sql(sql_qry, testdb.engine, params=params)
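The same params mechanism can be tried without a PostgreSQL server. The sketch below is an assumption-laden stand-in: it uses an in-memory SQLite database (stdlib sqlite3) in place of testdb.engine, so the placeholder style changes from psycopg2's %(name)s to SQLite's :name; the table and rows are made up for illustration.

```python
import sqlite3

import pandas

# In-memory SQLite database standing in for the PostgreSQL engine.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE events (name TEXT, ts TEXT)')
con.execute("INSERT INTO events VALUES ('a', '2020-01-01 10:00:00')")
con.execute("INSERT INTO events VALUES ('b', '2020-06-01 10:00:00')")

# SQLite uses ':name' placeholders; psycopg2 (PostgreSQL) uses '%(name)s'.
sql_qry = 'SELECT * FROM events WHERE ts BETWEEN :ts_start AND :ts_end'
params = {'ts_start': '2020-01-01 00:00:00', 'ts_end': '2020-03-01 00:00:00'}

# pandas forwards the dict to the DB driver, which substitutes the values safely.
data_frame = pandas.read_sql(sql_qry, con, params=params)
```

Passing values via params (rather than formatting them into the SQL string) lets the driver handle quoting and typing, which also guards against SQL injection.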