parse_osm_pbf
pydriosm.reader.parse_osm_pbf(path_to_osm_pbf, parse_raw_feat=False, transform_geom=False, transform_other_tags=False, number_of_chunks=None, max_tmpfile_size=None)
Parse a PBF data file.
- Parameters
path_to_osm_pbf (str) – path to a PBF data file
parse_raw_feat (bool) – whether to parse each feature in the raw data, defaults to False
transform_geom (bool) – whether to transform a single coordinate (or a collection of coordinates) into a geometric object, defaults to False
transform_other_tags (bool) – whether to transform 'other_tags' into a dictionary, defaults to False
number_of_chunks (int or None) – number of chunks, defaults to None
max_tmpfile_size (int or None) – defaults to None; see also gdal_configurations()
- Returns
parsed OSM PBF data
- Return type
dict
Note
The driver categorises features into 5 layers:
0: ‘points’ - “node” features having significant tags attached
1: ‘lines’ - “way” features being recognized as non-area
2: ‘multilinestrings’ - “relation” features forming a multilinestring (type='multilinestring' / type='route')
3: ‘multipolygons’ - “relation” features forming a multipolygon (type='multipolygon' / type='boundary'), and “way” features being recognized as area
4: ‘other_relations’ - “relation” features not belonging to the above 2 layers
See also [POP-1].
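The five layers listed above can be written down as a small lookup table. The following is only an illustrative sketch: OSM_PBF_LAYERS and layer_name are hypothetical names and are not part of pydriosm. The layer names themselves are the ones used as keys of the dictionary returned by parse_osm_pbf.

```python
# Hypothetical lookup table (not part of pydriosm): layer index -> layer name,
# in the order given in the note above.
OSM_PBF_LAYERS = {
    0: 'points',
    1: 'lines',
    2: 'multilinestrings',
    3: 'multipolygons',
    4: 'other_relations',
}


def layer_name(index):
    """Return the layer name for a layer index, or raise a clear error."""
    try:
        return OSM_PBF_LAYERS[index]
    except KeyError:
        raise ValueError(f"unknown layer index: {index!r}") from None


print(layer_name(3))
```

With the parsed data from this function, a name obtained this way could be used to select one layer, for example parsed_data[layer_name(0)] for the points layer.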
This function may require a fairly large amount of physical memory to parse large files (e.g. > 200MB), in which case it is recommended to set number_of_chunks to a reasonable value.
Example:
>>> import os
>>> from pydriosm.reader import GeofabrikDownloader, parse_osm_pbf

>>> # Download the PBF data file of Rutland as an example
>>> geofabrik_downloader = GeofabrikDownloader()

>>> path_to_rutland_pbf = geofabrik_downloader.download_osm_data(
...     subregion_names='Rutland', osm_file_format=".pbf", download_dir="tests",
...     verbose=True, ret_download_path=True)
To download .osm.pbf data of the following geographic region(s):
    Rutland
? [No]|Yes: yes
Downloading "rutland-latest.osm.pbf" to "tests\" ... Done.

>>> print(os.path.relpath(path_to_rutland_pbf))
tests\rutland-latest.osm.pbf

>>> # Parse the downloaded PBF data
>>> rutland_pbf_raw = parse_osm_pbf(path_to_rutland_pbf)

>>> type(rutland_pbf_raw)
dict
>>> list(rutland_pbf_raw.keys())
['points', 'lines', 'multilinestrings', 'multipolygons', 'other_relations']

>>> rutland_pbf_raw_points = rutland_pbf_raw['points']
>>> rutland_pbf_raw_points.head()
                                              points
0  {"type": "Feature", "geometry": {"type": "Poin...
1  {"type": "Feature", "geometry": {"type": "Poin...
2  {"type": "Feature", "geometry": {"type": "Poin...
3  {"type": "Feature", "geometry": {"type": "Poin...
4  {"type": "Feature", "geometry": {"type": "Poin...

>>> # Set ``parse_raw_feat`` to be ``True``
>>> rutland_pbf_parsed_0 = parse_osm_pbf(path_to_rutland_pbf, parse_raw_feat=True)

>>> rutland_pbf_parsed_points_0 = rutland_pbf_parsed_0['points']
>>> rutland_pbf_parsed_points_0.head()
         id               coordinates  ... man_made                    other_tags
0    488432  [-0.5134241, 52.6555853]  ...     None               "odbl"=>"clean"
1    488658  [-0.5313354, 52.6737716]  ...     None                          None
2  13883868  [-0.7229332, 52.5889864]  ...     None                          None
3  14049101  [-0.7249922, 52.6748223]  ...     None  "traffic_calming"=>"cushion"
4  14558402  [-0.7266686, 52.6695051]  ...     None      "direction"=>"clockwise"

[5 rows x 12 columns]

>>> # Set both ``parse_raw_feat`` and ``transform_geom`` to be ``True``
>>> rutland_pbf_parsed_1 = parse_osm_pbf(path_to_rutland_pbf, parse_raw_feat=True,
...                                      transform_geom=True)

>>> rutland_pbf_parsed_points_1 = rutland_pbf_parsed_1['points']
>>> # Check the difference in 'coordinates', compared to ``rutland_pbf_parsed_points_0``
>>> rutland_pbf_parsed_points_1[['coordinates']].head()
                              coordinates
0           POINT (-0.5134241 52.6555853)
1           POINT (-0.5313354 52.6737716)
2  POINT (-0.7229332000000001 52.5889864)
3           POINT (-0.7249922 52.6748223)
4           POINT (-0.7266686 52.6695051)

>>> # Further, set ``transform_other_tags`` to be ``True``
>>> rutland_pbf_parsed_2 = parse_osm_pbf(path_to_rutland_pbf, parse_raw_feat=True,
...                                      transform_other_tags=True)

>>> rutland_pbf_parsed_points_2 = rutland_pbf_parsed_2['points']
>>> # Check the difference in 'other_tags', compared to ``rutland_pbf_parsed_points_0``
>>> rutland_pbf_parsed_points_2[['other_tags']].head()
                       other_tags
0               {'odbl': 'clean'}
1                            None
2                            None
3  {'traffic_calming': 'cushion'}
4      {'direction': 'clockwise'}

>>> # Delete the downloaded PBF data file
>>> os.remove(path_to_rutland_pbf)
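The 'other_tags' values in the output above are strings of the form "key"=>"value", which transform_other_tags=True turns into dictionaries. The sketch below shows one way such a conversion could be done with a regular expression. It is a stand-in for illustration, not pydriosm's own implementation; other_tags_to_dict is a hypothetical name, and the handling of comma-separated pairs is an assumption about the string format.

```python
import re

# One "key"=>"value" pair; backslash-escaped characters may appear inside the quotes.
_PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=>\s*"((?:[^"\\]|\\.)*)"')


def other_tags_to_dict(other_tags):
    """Convert an 'other_tags' string into a dict; pass None through unchanged.

    Hypothetical helper for illustration only. Escape sequences inside keys
    and values are kept as they are (they are not unescaped).
    """
    if other_tags is None:
        return None
    return {key: value for key, value in _PAIR.findall(other_tags)}


print(other_tags_to_dict('"traffic_calming"=>"cushion"'))
```

Applied to the first five rows of the points layer shown above, this would give the same dictionaries as the transform_other_tags=True output, with None left as None.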
See also
For more examples, see the method GeofabrikReader.read_osm_pbf().
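The note on memory use recommends setting number_of_chunks for large files but does not say how to pick a value. The sketch below is one possible heuristic, not something pydriosm provides: chunks_for_size and suggest_number_of_chunks are hypothetical helpers, the 200 MB threshold is taken from the note, and the 50 MB per chunk figure is an arbitrary assumption that should be tuned to the available memory.

```python
import math
import os


def chunks_for_size(size_mb, chunk_size_mb=50, threshold_mb=200):
    """Hypothetical heuristic: None for small files, else about one chunk per chunk_size_mb."""
    if size_mb <= threshold_mb:
        return None  # small enough to parse in one go
    return math.ceil(size_mb / chunk_size_mb)


def suggest_number_of_chunks(path_to_osm_pbf, chunk_size_mb=50, threshold_mb=200):
    """Apply the heuristic above to a file on disk (size measured in MB)."""
    size_mb = os.path.getsize(path_to_osm_pbf) / (1024 ** 2)
    return chunks_for_size(size_mb, chunk_size_mb, threshold_mb)


# Possible usage (assumes a PBF file exists at this path):
# n = suggest_number_of_chunks("tests/rutland-latest.osm.pbf")
# data = parse_osm_pbf("tests/rutland-latest.osm.pbf", number_of_chunks=n)

print(chunks_for_size(500))
```

Returning None for small files keeps the default behaviour of parse_osm_pbf, so the helper only changes anything once a file passes the threshold.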