parse_osm_pbf

pydriosm.reader.parse_osm_pbf(path_to_osm_pbf, number_of_chunks, parse_raw_feat, transform_geom, transform_other_tags, max_tmpfile_size=None)

Parse a PBF data file.

Parameters
  • path_to_osm_pbf (str) – absolute path to a PBF data file

  • number_of_chunks (int or None) – number of chunks

  • parse_raw_feat (bool) – whether to parse each feature in the raw data

  • transform_geom (bool) – whether to transform a single coordinate (or a collection of coordinates) into a geometric object

  • transform_other_tags (bool) – whether to transform the 'other_tags' value into a dictionary

  • max_tmpfile_size (int or None) – maximum size of the temporary file; defaults to None (see also pydriosm.settings.gdal_configurations())

Returns

parsed OSM PBF data

Return type

dict

Note

This function can require a fairly large amount of physical memory to read large files (e.g. > 200 MB).
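As a rough illustration of working around the memory note, the sketch below picks a value for number_of_chunks from the file size. The helper names, the 10 MB-per-chunk figure and the idea that chunk-wise parsing eases memory pressure are assumptions made for illustration; they are not part of pydriosm. Only the 200 MB threshold is taken from the note.

```python
import math
import os


def chunks_for_size(size_bytes, threshold_mb=200, mb_per_chunk=10):
    """Hypothetical heuristic: None for small files, else one chunk per `mb_per_chunk` MB."""
    size_mb = size_bytes / (1024 ** 2)
    if size_mb <= threshold_mb:
        # Small enough to parse in one go
        return None
    return math.ceil(size_mb / mb_per_chunk)


def suggest_number_of_chunks(path_to_osm_pbf, **kwargs):
    """Apply the heuristic above to a file on disk (hypothetical helper)."""
    return chunks_for_size(os.path.getsize(path_to_osm_pbf), **kwargs)
```

The returned value could then be passed as number_of_chunks; the right figure depends on the machine and on the data, so it is best treated as a starting point.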

The driver categorises features into 5 layers:

  • 0: ‘points’ - “node” features having significant tags attached

  • 1: ‘lines’ - “way” features being recognized as non-area

  • 2: ‘multilinestrings’ - “relation” features forming a multilinestring (type='multilinestring' / type='route')

  • 3: ‘multipolygons’ - “relation” features forming a multipolygon (type='multipolygon' / type='boundary'), and “way” features being recognized as area

  • 4: ‘other_relations’ - “relation” features not belonging to the two layers above

See also [POP-1].
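The layer list above can be kept as a small lookup table, which is handy for checking that a parsed result contains every expected key. This is a minimal sketch; the constant and helper names are hypothetical and not part of pydriosm.

```python
# Layer index -> layer name, as listed above
OSM_PBF_LAYERS = {
    0: "points",
    1: "lines",
    2: "multilinestrings",
    3: "multipolygons",
    4: "other_relations",
}


def layer_name(index):
    """Return the layer name for a layer index (hypothetical helper)."""
    return OSM_PBF_LAYERS[index]


def layer_index(name):
    """Return the layer index for a layer name (hypothetical helper)."""
    for index, layer in OSM_PBF_LAYERS.items():
        if layer == name:
            return index
    raise KeyError(name)


def missing_layers(parsed):
    """List the layer names that are absent from a dict keyed by layer name."""
    return [layer for layer in OSM_PBF_LAYERS.values() if layer not in parsed]
```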

Example:

>>> import os
>>> from pydriosm.reader import GeofabrikDownloader, parse_osm_pbf

>>> geofabrik_downloader = GeofabrikDownloader()

>>> sr_name = 'Rutland'
>>> file_fmt = ".pbf"
>>> dwnld_dir = "tests"

>>> path_to_rutland_pbf = geofabrik_downloader.download_osm_data(
...     sr_name, file_fmt, dwnld_dir, verbose=True, ret_download_path=True)
Confirmed to download .osm.pbf data of the following geographic region(s):
    Rutland
? [No]|Yes: yes
Downloading "rutland-latest.osm.pbf" to "\tests" ...
Done.

>>> rutland_pbf_raw = parse_osm_pbf(path_to_rutland_pbf, number_of_chunks=50,
...                                 parse_raw_feat=False, transform_geom=False,
...                                 transform_other_tags=False)

>>> print(list(rutland_pbf_raw.keys()))
['points', 'lines', 'multilinestrings', 'multipolygons', 'other_relations']

>>> rutland_pbf_raw_points = rutland_pbf_raw['points']
>>> print(rutland_pbf_raw_points.head())
                                              points
0  {"type": "Feature", "geometry": {"type": "Poin...
1  {"type": "Feature", "geometry": {"type": "Poin...
2  {"type": "Feature", "geometry": {"type": "Poin...
3  {"type": "Feature", "geometry": {"type": "Poin...
4  {"type": "Feature", "geometry": {"type": "Poin...

>>> rutland_pbf_parsed = parse_osm_pbf(path_to_rutland_pbf, number_of_chunks=50,
...                                    parse_raw_feat=True, transform_geom=False,
...                                    transform_other_tags=False)

>>> rutland_pbf_parsed_points = rutland_pbf_parsed['points']
>>> print(rutland_pbf_parsed_points.head())
         id               coordinates  ... man_made                    other_tags
0    488432  [-0.5134241, 52.6555853]  ...     None               "odbl"=>"clean"
1    488658  [-0.5313354, 52.6737716]  ...     None                          None
2  13883868  [-0.7229332, 52.5889864]  ...     None                          None
3  14049101  [-0.7249922, 52.6748223]  ...     None  "traffic_calming"=>"cushion"
4  14558402  [-0.7266686, 52.6695051]  ...     None      "direction"=>"clockwise"
[5 rows x 12 columns]

>>> rutland_pbf_parsed_1 = parse_osm_pbf(path_to_rutland_pbf, number_of_chunks=50,
...                                      parse_raw_feat=True, transform_geom=True,
...                                      transform_other_tags=False)

>>> rutland_pbf_parsed_points_1 = rutland_pbf_parsed_1['points']
>>> print(rutland_pbf_parsed_points_1[['coordinates']].head())
                                coordinates
0             POINT (-0.5134241 52.6555853)
1             POINT (-0.5313354 52.6737716)
2    POINT (-0.7229332000000001 52.5889864)
3             POINT (-0.7249922 52.6748223)
4             POINT (-0.7266686 52.6695051)

>>> rutland_pbf_parsed_2 = parse_osm_pbf(path_to_rutland_pbf, number_of_chunks=50,
...                                      parse_raw_feat=True, transform_geom=True,
...                                      transform_other_tags=True)

>>> rutland_pbf_parsed_points_2 = rutland_pbf_parsed_2['points']
>>> print(rutland_pbf_parsed_points_2[['coordinates', 'other_tags']].head())
                              coordinates                      other_tags
0           POINT (-0.5134241 52.6555853)               {'odbl': 'clean'}
1           POINT (-0.5313354 52.6737716)                            None
2  POINT (-0.7229332000000001 52.5889864)                            None
3           POINT (-0.7249922 52.6748223)  {'traffic_calming': 'cushion'}
4           POINT (-0.7266686 52.6695051)      {'direction': 'clockwise'}

>>> # Delete the downloaded PBF data file
>>> os.remove(path_to_rutland_pbf)
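To make the effect of transform_other_tags=True more concrete, the sketch below converts an 'other_tags' string of the form shown in the output above into a dictionary. It is an illustrative approximation written for this page, not pydriosm's own implementation; the function name and the regular expression are assumptions.

```python
import re

# One "key"=>"value" pair; backslash-escaped characters are allowed inside the quotes.
_PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"=>"((?:[^"\\]|\\.)*)"')


def other_tags_to_dict(other_tags):
    """Convert an 'other_tags' string into a dict; None is passed through unchanged.

    Escape sequences inside keys and values are kept as-is (no unescaping).
    """
    if other_tags is None:
        return None
    return {key: value for key, value in _PAIR.findall(other_tags)}


print(other_tags_to_dict('"traffic_calming"=>"cushion"'))
# {'traffic_calming': 'cushion'}
```

Rows whose 'other_tags' value is None stay None, matching the output shown for rutland_pbf_parsed_points_2.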

See also

The examples for the method GeofabrikReader.read_osm_pbf().