PBF.read_pbf
- classmethod PBF.read_pbf(path_to_file, readable=True, expand=False, parse_geometry=False, parse_properties=False, parse_other_tags=False, number_of_chunks=None, max_tmpfile_size=5000, **kwargs)
Parse a PBF data file (using GDAL).
- Parameters:
path_to_file (str) – pathname of a PBF data file
readable (bool) – whether to parse each feature in the raw data, defaults to True
expand (bool) – whether to expand dict-like data into separate columns, defaults to False
parse_geometry (bool) – whether to represent the 'geometry' field in a shapely.geometry format, defaults to False
parse_properties (bool) – whether to represent the 'properties' field in a tabular format, defaults to False
parse_other_tags (bool) – whether to represent 'other_tags' (of 'properties') in a dict format, defaults to False
number_of_chunks (int | None) – number of chunks, defaults to None
max_tmpfile_size (int | None) – maximum size of the temporary file, defaults to 5000; when max_tmpfile_size=None, it falls back to 5000
kwargs – [optional] parameters of the function pyhelpers.settings.gdal_configurations()
- Returns:
parsed OSM PBF data
- Return type:
dict
Note
The GDAL/OGR driver categorizes the features of OSM PBF data into five layers:
0: ‘points’ - “node” features having significant tags attached
1: ‘lines’ - “way” features being recognized as non-area
2: ‘multilinestrings’ - “relation” features forming a multilinestring (type=’multilinestring’ / type=’route’)
3: ‘multipolygons’ - “relation” features forming a multipolygon (type=’multipolygon’ / type=’boundary’), and “way” features being recognized as area
4: ‘other_relations’ - “relation” features not belonging to the above 2 layers
For more information, please refer to OpenStreetMap XML and PBF.
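The layer indices and names listed in this note can be kept in a small lookup table. The following is a minimal sketch for illustration only; `OSM_PBF_LAYERS` and `layer_name` are hypothetical names and are not part of pydriosm or GDAL.

```python
# Layer index -> layer name, as listed in the note above.
# NOTE: `OSM_PBF_LAYERS` and `layer_name` are hypothetical helpers,
# not part of pydriosm or GDAL.
OSM_PBF_LAYERS = {
    0: "points",
    1: "lines",
    2: "multilinestrings",
    3: "multipolygons",
    4: "other_relations",
}


def layer_name(index):
    """Return the layer name for a layer index, or raise ValueError."""
    try:
        return OSM_PBF_LAYERS[index]
    except KeyError:
        raise ValueError(f"unknown layer index: {index!r}") from None
```

The five names are the same as the keys of the dictionary shown in the examples for this method.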
Warning
Parsing large PBF data files (e.g. > 50MB) can be time-consuming!
The function read_osm_pbf() may require a fairly large amount of physical memory to parse large files, in which case it is recommended to set number_of_chunks to a reasonable value.
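To illustrate the general idea behind `number_of_chunks`, the sketch below splits a sequence of features into a given number of consecutive, roughly equal parts so that each part can be processed separately. This is an assumption about the general technique, not pydriosm's actual implementation; `split_into_chunks` is a hypothetical helper.

```python
def split_into_chunks(items, number_of_chunks):
    """Split `items` into `number_of_chunks` consecutive, near-equal lists.

    Hypothetical helper for illustration only; not pydriosm's code.
    """
    if number_of_chunks < 1:
        raise ValueError("number_of_chunks must be at least 1")
    items = list(items)
    size, remainder = divmod(len(items), number_of_chunks)
    chunks, start = [], 0
    for i in range(number_of_chunks):
        # The first `remainder` chunks get one extra item each.
        end = start + size + (1 if i < remainder else 0)
        chunks.append(items[start:end])
        start = end
    return chunks
```

With more chunks, each part is smaller, so less data has to be held in memory at any one time, at the cost of more passes.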
Examples:
>>> from pydriosm.reader._pbf import PBF
>>> from pydriosm.downloader import Downloader
>>> from pyhelpers.dirs import delete_dir
>>> import os

>>> # Download the PBF data file of 'Rutland' as an example
>>> subregion_name = 'rutland'
>>> osm_file_format = ".pbf"
>>> download_dir = "tests/osm_data"
>>> dl = Downloader()
>>> dl.download_data(subregion_name, osm_file_format, download_dir, verbose=True)
To download data in the format '.osm.pbf' for the following geographic (sub)region(s):
    "Rutland"
  to "./tests/osm_data/rutland/"
? [No]|Yes: >? yes
Downloading "rutland-latest.osm.pbf"
    100%|██████████| 1.83M/1.83M | 5.74MB/s ...
Saving "rutland-latest.osm.pbf" to "./tests/osm_data/rutland/" ... Done.

>>> path_to_file = dl.data_paths[0]
>>> os.path.relpath(path_to_file)
'tests\osm_data\rutland\rutland-latest.osm.pbf'

>>> # Read the downloaded PBF data
>>> rutland_pbf = PBF.read_pbf(path_to_file)
>>> type(rutland_pbf)
dict
>>> list(rutland_pbf.keys())
['points', 'lines', 'multilinestrings', 'multipolygons', 'other_relations']
>>> rutland_pbf_points = rutland_pbf['points']
>>> rutland_pbf_points.head()
0    {'type': 'Feature', 'geometry': {'type': 'Poin...
1    {'type': 'Feature', 'geometry': {'type': 'Poin...
2    {'type': 'Feature', 'geometry': {'type': 'Poin...
3    {'type': 'Feature', 'geometry': {'type': 'Poin...
4    {'type': 'Feature', 'geometry': {'type': 'Poin...
Name: points, dtype: object

>>> # Set `expand` to be `True`
>>> pbf_0 = PBF.read_pbf(path_to_file, expand=True)
>>> type(pbf_0)
dict
>>> list(pbf_0.keys())
['points', 'lines', 'multilinestrings', 'multipolygons', 'other_relations']
>>> pbf_0_points = pbf_0['points']
>>> pbf_0_points.head()
         id  ...                                         properties
0    488432  ...  {'osm_id': '488432', 'name': None, 'barrier': ...
1    488658  ...  {'osm_id': '488658', 'name': 'Tickencote Inter...
2  13883868  ...  {'osm_id': '13883868', 'name': None, 'barrier'...
3  14049101  ...  {'osm_id': '14049101', 'name': None, 'barrier'...
4  14558402  ...  {'osm_id': '14558402', 'name': None, 'barrier'...
[5 rows x 3 columns]
>>> pbf_0_points['geometry'].head()
0    {'type': 'Point', 'coordinates': [-0.5134241, ...
1    {'type': 'Point', 'coordinates': [-0.5313354, ...
2    {'type': 'Point', 'coordinates': [-0.7229332, ...
3    {'type': 'Point', 'coordinates': [-0.7249816, ...
4    {'type': 'Point', 'coordinates': [-0.7266581, ...
Name: geometry, dtype: object

>>> # Set both `expand` and `parse_geometry` to be `True`
>>> pbf_1 = PBF.read_pbf(path_to_file, expand=True, parse_geometry=True)
>>> pbf_1_points = pbf_1['points']
>>> # Check the difference in 'geometry' column, compared to `pbf_0_points`
>>> pbf_1_points['geometry'].head()
0    POINT (-0.5134241 52.6555853)
1    POINT (-0.5313354 52.6737716)
2    POINT (-0.7229332 52.5889864)
3    POINT (-0.7249816 52.6748426)
4     POINT (-0.7266543 52.669517)
Name: geometry, dtype: object

>>> # Set both `expand` and `parse_properties` to be `True`
>>> pbf_2 = PBF.read_pbf(path_to_file, expand=True, parse_properties=True)
>>> pbf_2_points = pbf_2['points']
>>> pbf_2_points['other_tags'].head()
0                 "odbl"=>"clean"
1                            None
2                            None
3    "traffic_calming"=>"cushion"
4        "direction"=>"clockwise"
Name: other_tags, dtype: object

>>> # Set `expand`, `parse_properties` and `parse_other_tags` to be `True`
>>> pbf_3 = PBF.read_pbf(path_to_file, expand=True, parse_properties=True,
...                      parse_other_tags=True)
>>> pbf_3_points = pbf_3['points']
>>> # Check the difference in 'other_tags', compared to `pbf_2_points`
>>> pbf_3_points['other_tags'].head()
0                 {'odbl': 'clean'}
1                              None
2                              None
3    {'traffic_calming': 'cushion'}
4        {'direction': 'clockwise'}
Name: other_tags, dtype: object

>>> # Delete the downloaded PBF data file
>>> delete_dir(dl.download_dir, verbose=True)
To delete the directory "./tests/osm_data/" (Not empty)
? [No]|Yes: yes
Deleting "./tests/osm_data/" ... Done.
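The difference between the 'other_tags' values of `pbf_2_points` and `pbf_3_points` is a conversion from strings of the form `"key"=>"value"` to dictionaries. A minimal sketch of such a conversion follows, assuming double-quoted keys and values with no escaped quotes inside them; `parse_other_tags_string` is a hypothetical helper and not pydriosm's implementation.

```python
import re

# Matches one "key"=>"value" pair; assumes no escaped quotes inside.
_PAIR = re.compile(r'"([^"]*)"=>"([^"]*)"')


def parse_other_tags_string(text):
    """Convert '"k1"=>"v1","k2"=>"v2"' into a dict; pass None through.

    Hypothetical helper for illustration only; not pydriosm's code.
    """
    if text is None:
        return None
    return dict(_PAIR.findall(text))
```

Passing `None` through unchanged mirrors the rows above where no other tags are present.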
See also
Examples for the methods: GeofabrikReader.read_pbf() and BBBikeReader.read_pbf().