GeofabrikReader.read_osm_pbf
- GeofabrikReader.read_osm_pbf(subregion_name, data_dir=None, readable=False, expand=False, parse_geometry=False, parse_properties=False, parse_other_tags=False, update=False, download=True, pickle_it=False, ret_pickle_path=False, rm_pbf_file=False, chunk_size_limit=50, verbose=False, **kwargs)
Read a PBF (.osm.pbf) data file of a geographic (sub)region.
- Parameters:
subregion_name (str) – name of a geographic (sub)region (case-insensitive) that is available on the Geofabrik free download server
data_dir (str | None) – directory where the .osm.pbf data file is located/saved; if None, the default local directory is used
readable (bool) – whether to parse each feature in the raw data, defaults to False
expand (bool) – whether to expand dict-like data into separate columns, defaults to False
parse_geometry (bool) – whether to represent the 'geometry' field in a shapely.geometry format, defaults to False
parse_properties (bool) – whether to represent the 'properties' field in a tabular format, defaults to False
parse_other_tags (bool) – whether to represent 'other_tags' (of 'properties') in a dict format, defaults to False
download (bool) – whether to download/update the PBF data file of the given subregion if it is not available at the specified path, defaults to True
update (bool) – whether to check for an update to the pickle backup (if available), defaults to False
pickle_it (bool) – whether to save the .pbf data as a pickle file, defaults to False
ret_pickle_path (bool) – (when pickle_it=True) whether to return a path to the saved pickle file, defaults to False
rm_pbf_file (bool) – whether to delete the downloaded .osm.pbf file, defaults to False
chunk_size_limit (int | None) – threshold (in MB) that triggers the use of the chunk parser, defaults to 50; if the size of the .osm.pbf file (in MB) is greater than chunk_size_limit, it will be parsed in a chunk-wise way
verbose (bool | int) – whether to print relevant information in the console as the function runs, defaults to False
kwargs – [optional] parameters of the method PBFReadParse.read_pbf()
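The chunk_size_limit rule in the parameter list above (parse chunk-wise only when the file size in MB is greater than the threshold) can be illustrated with a minimal sketch. This is not pydriosm's own code: the helper names, the use of binary megabytes (1024 ** 2 bytes) and the treatment of None as "never parse chunk-wise" are all assumptions made for illustration.

```python
import os

BYTES_PER_MB = 1024 ** 2  # assumption: the threshold is measured in binary megabytes


def needs_chunk_parser(size_in_bytes, chunk_size_limit=50):
    """Return True if a file of this size would be parsed chunk-wise (hypothetical helper)."""
    if not chunk_size_limit:
        # Assumption: None (or 0) means the chunk parser is never used.
        return False
    # The documented rule is a strict "greater than" comparison in MB.
    return size_in_bytes / BYTES_PER_MB > chunk_size_limit


def file_needs_chunk_parser(path_to_pbf, chunk_size_limit=50):
    """Apply the same check to a file on disk (hypothetical helper)."""
    return needs_chunk_parser(os.path.getsize(path_to_pbf), chunk_size_limit)
```

Under these assumptions, a file of exactly 50 MB would not trigger chunk-wise parsing with the default threshold, while a 60 MB file would.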
- Returns:
dictionary of the .osm.pbf data; when pickle_it=True, a tuple of the dictionary and a path to the pickle file
- Return type:
dict | tuple | None
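Because the documented return type is dict | tuple | None, calling code may want to normalise the result before using it. The following is a minimal sketch of one way to do that; split_result is a hypothetical helper, not part of pydriosm, and the source does not state under which conditions None is returned.

```python
def split_result(result):
    """Normalise the documented return value into a (data, pickle_path) pair.

    Hypothetical helper: `result` is whatever read_osm_pbf() returned.
    """
    if result is None:
        # No data was returned; the reason is not specified in the documentation.
        return None, None
    if isinstance(result, tuple):
        # Documented tuple form: (dictionary of the data, path to the pickle file).
        data, pickle_path = result
        return data, pickle_path
    # Plain dictionary of the .osm.pbf data, no pickle path.
    return result, None
```

With this helper, the caller can always unpack two values, for example `data, path = split_result(gfr.read_osm_pbf(...))`, and check `data is None` once.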
Examples:
>>> from pydriosm.reader import GeofabrikReader
>>> from pyhelpers.dirs import delete_dir

>>> gfr = GeofabrikReader()

>>> subrgn_name = 'rutland'
>>> dat_dir = "tests\osm_data"

>>> # If the PBF data of Rutland is not available at the specified data directory,
>>> # the function can download the latest data by setting `download=True` (default)
>>> pbf_raw = gfr.read_osm_pbf(subrgn_name, data_dir=dat_dir, verbose=True)
Downloading "rutland-latest.osm.pbf" to "tests\osm_data\rutland\" ... Done.
Reading "tests\osm_data\rutland\rutland-latest.osm.pbf" ... Done.
>>> type(pbf_raw)
dict
>>> list(pbf_raw.keys())
['points', 'lines', 'multilinestrings', 'multipolygons', 'other_relations']

>>> pbf_raw_points = pbf_raw['points']
>>> type(pbf_raw_points)
list
>>> type(pbf_raw_points[0])
osgeo.ogr.Feature

>>> # Set `readable=True`
>>> pbf_parsed = gfr.read_osm_pbf(subrgn_name, dat_dir, readable=True, verbose=True)
Parsing "tests\osm_data\rutland\rutland-latest.osm.pbf" ... Done.
>>> pbf_parsed_points = pbf_parsed['points']
>>> pbf_parsed_points.head()
0    {'type': 'Feature', 'geometry': {'type': 'Poin...
1    {'type': 'Feature', 'geometry': {'type': 'Poin...
2    {'type': 'Feature', 'geometry': {'type': 'Poin...
3    {'type': 'Feature', 'geometry': {'type': 'Poin...
4    {'type': 'Feature', 'geometry': {'type': 'Poin...
Name: points, dtype: object

>>> # Set `expand=True`, which would force `readable=True`
>>> pbf_parsed_ = gfr.read_osm_pbf(subrgn_name, dat_dir, expand=True, verbose=True)
Parsing "tests\osm_data\rutland\rutland-latest.osm.pbf" ... Done.
>>> pbf_parsed_points_ = pbf_parsed_['points']
>>> pbf_parsed_points_.head()
         id  ...                                         properties
0    488432  ...  {'osm_id': '488432', 'name': None, 'barrier': ...
1    488658  ...  {'osm_id': '488658', 'name': 'Tickencote Inter...
2  13883868  ...  {'osm_id': '13883868', 'name': None, 'barrier'...
3  14049101  ...  {'osm_id': '14049101', 'name': None, 'barrier'...
4  14558402  ...  {'osm_id': '14558402', 'name': None, 'barrier'...

[5 rows x 3 columns]

>>> # Set `readable` and `parse_geometry` to be `True`
>>> pbf_parsed_1 = gfr.read_osm_pbf(subrgn_name, dat_dir, readable=True,
...                                 parse_geometry=True)
>>> pbf_parsed_1_point = pbf_parsed_1['points'][0]
>>> pbf_parsed_1_point['geometry']
'POINT (-0.5134241 52.6555853)'
>>> pbf_parsed_1_point['properties']['other_tags']
'"odbl"=>"clean"'

>>> # Set `readable` and `parse_other_tags` to be `True`
>>> pbf_parsed_2 = gfr.read_osm_pbf(subrgn_name, dat_dir, readable=True,
...                                 parse_other_tags=True)
>>> pbf_parsed_2_point = pbf_parsed_2['points'][0]
>>> pbf_parsed_2_point['geometry']
{'type': 'Point', 'coordinates': [-0.5134241, 52.6555853]}
>>> pbf_parsed_2_point['properties']['other_tags']
{'odbl': 'clean'}

>>> # Set `readable`, `parse_geometry` and `parse_other_tags` to be `True`
>>> pbf_parsed_3 = gfr.read_osm_pbf(subrgn_name, dat_dir, readable=True,
...                                 parse_geometry=True, parse_other_tags=True)
>>> pbf_parsed_3_point = pbf_parsed_3['points'][0]
>>> pbf_parsed_3_point['geometry']
'POINT (-0.5134241 52.6555853)'
>>> pbf_parsed_3_point['properties']['other_tags']
{'odbl': 'clean'}

>>> # Delete the example data and the test data directory
>>> delete_dir(dat_dir, verbose=True)
To delete the directory "tests\osm_data\" (Not empty)
? [No]|Yes: yes
Deleting "tests\osm_data\" ... Done.
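The examples show 'other_tags' changing from the string '"odbl"=>"clean"' to the dict {'odbl': 'clean'} when parse_other_tags=True. The following is a minimal sketch of that kind of conversion, assuming the string is a comma-separated list of "key"=>"value" pairs; other_tags_to_dict is a hypothetical helper and may differ from how pydriosm actually parses the field.

```python
import re

# One "key"=>"value" pair; backslash-escaped characters are allowed inside the quotes.
_PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"=>"((?:[^"\\]|\\.)*)"')


def other_tags_to_dict(other_tags):
    """Convert an 'other_tags' string into a dict (hypothetical helper).

    Escaped characters are kept as they appear; this sketch does not unescape them.
    """
    if other_tags is None:
        return None
    return {key: value for key, value in _PAIR.findall(other_tags)}
```

For the string from the examples, `other_tags_to_dict('"odbl"=>"clean"')` gives `{'odbl': 'clean'}`. Commas inside a quoted value are preserved because the pattern matches whole quoted pairs instead of splitting on commas.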