BBBikeReader.read_pbf

BBBikeReader.read_pbf(subregion_name, data_dir=None, readable=False, expand=False, parse_geometry=False, parse_properties=False, parse_other_tags=False, update=False, download=False, pickle_it=False, ret_pickle_path=False, rm_pbf_file=False, chunk_size_limit=50, verbose=False, **kwargs)

Read a PBF (.osm.pbf) data file of a geographic (sub)region.

Parameters:
  • subregion_name (str) – name of a geographic (sub)region (case-insensitive) that is available on the BBBike free download server

  • data_dir (str | None) – directory where the .osm.pbf data file is located/saved; if None, the default local directory is used

  • readable (bool) – whether to parse each feature in the raw data, defaults to False

  • expand (bool) – whether to expand dict-like data into separate columns, defaults to False

  • parse_geometry (bool) – whether to represent the 'geometry' field in a shapely.geometry format, defaults to False

  • parse_properties (bool) – whether to represent the 'properties' field in a tabular format, defaults to False

  • parse_other_tags (bool) – whether to represent the 'other_tags' field (of 'properties') in a dict format, defaults to False

  • download (bool) – whether to download the PBF data file of the given subregion if it is not available at the specified path, defaults to False

  • update (bool) – whether to check for an update to the pickle backup (if available), defaults to False

  • pickle_it (bool) – whether to save the .pbf data as a pickle file, defaults to False

  • ret_pickle_path (bool) – (when pickle_it=True) whether to return a path to the saved pickle file, defaults to False

  • rm_pbf_file (bool) – whether to delete the downloaded .osm.pbf file, defaults to False

  • chunk_size_limit (int | None) – threshold (in MB) that triggers the use of chunk parser, defaults to 50; if the size of the .osm.pbf file (in MB) is greater than chunk_size_limit, it will be parsed in a chunk-wise way

  • verbose (bool | int) – whether to print relevant information in console as the function runs, defaults to False

  • kwargs – [optional] parameters of the method BaseReader.read_pbf()
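The chunk_size_limit rule described above can be illustrated with a minimal sketch. The helper name exceeds_chunk_size_limit is hypothetical and is not part of pydriosm; the sketch also assumes that 1 MB means 1024 ** 2 bytes and that chunk_size_limit=None disables chunk-wise parsing, neither of which is confirmed by this page.

```python
import os
import tempfile

BYTES_PER_MB = 1024 ** 2  # assumption: binary megabytes


def exceeds_chunk_size_limit(path_to_pbf, chunk_size_limit=50):
    """Hypothetical helper: is the file larger than the threshold (in MB)?"""
    if chunk_size_limit is None:
        # Assumption: None means "never parse chunk-wise"
        return False
    size_in_mb = os.path.getsize(path_to_pbf) / BYTES_PER_MB
    return size_in_mb > chunk_size_limit


# Use a 2 MB dummy file instead of a real .osm.pbf download
with tempfile.NamedTemporaryFile(suffix=".osm.pbf", delete=False) as f:
    f.write(b"\0" * (2 * BYTES_PER_MB))
    dummy_path = f.name

small_limit = exceeds_chunk_size_limit(dummy_path, chunk_size_limit=1)
default_limit = exceeds_chunk_size_limit(dummy_path)
no_limit = exceeds_chunk_size_limit(dummy_path, chunk_size_limit=None)

os.remove(dummy_path)  # clean up the dummy file
```

With the default threshold of 50, the 38.1 MB Leeds file used in the examples would fall below the limit under this reading.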

Returns:

dictionary of the .osm.pbf data; when pickle_it=True and ret_pickle_path=True, a tuple of the dictionary and a path to the pickle file; None if the data file is not found

Return type:

dict | tuple | None
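Because the return type varies, calling code may want to normalise it before use. The following is a minimal sketch, assuming the tuple form is ordered (data, path) as described under Returns; the helper split_read_pbf_result is hypothetical and not part of pydriosm, and the values passed to it are stand-ins rather than real results.

```python
def split_read_pbf_result(result):
    """Hypothetical helper: always return a (data, path_to_pickle) pair."""
    if result is None:
        # The .osm.pbf file was not found and was not downloaded
        return None, None
    if isinstance(result, tuple):
        # Assumed order: (dictionary of layers, path to the pickle file)
        data, path_to_pickle = result
        return data, path_to_pickle
    # Plain dictionary of layers, no pickle path requested
    return result, None


# Stand-in values that mimic the three documented return shapes
layers = {'points': [], 'lines': []}
from_none = split_read_pbf_result(None)
from_dict = split_read_pbf_result(layers)
from_tuple = split_read_pbf_result((layers, "tests/osm_data/leeds.pickle"))
```

The pickle path in the last call is an illustrative placeholder only; the actual file name chosen by the library is not stated here.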

Examples:

>>> from pydriosm.reader import BBBikeReader
>>> from pyhelpers.dirs import delete_dir
>>> bbr = BBBikeReader()
>>> subregion_name = 'Leeds'
>>> data_dir = "tests/osm_data"
>>> leeds_pbf_raw = bbr.read_pbf(subregion_name, data_dir=data_dir, verbose=True)
The .osm.pbf file for "Leeds" is not found.
>>> leeds_pbf_raw is None
True
>>> # Set `download=True`
>>> leeds_pbf_raw = bbr.read_pbf(
...     subregion_name, data_dir=data_dir, download=True, verbose=True)
Downloading "Leeds.osm.pbf" 100%|██████████| 38.1M/38.1M | 18.9MB/s | ETA: 00:00
  Saving "Leeds.osm.pbf" to "./tests/osm_data/leeds/" ... Done.
Reading "./tests/osm_data/leeds/Leeds.osm.pbf" ... Done.
>>> type(leeds_pbf_raw)
dict
>>> list(leeds_pbf_raw.keys())
['points', 'lines', 'multilinestrings', 'multipolygons', 'other_relations']
>>> pbf_raw_points = leeds_pbf_raw['points']
>>> type(pbf_raw_points)
list
>>> type(pbf_raw_points[0])
osgeo.ogr.Feature
>>> # (Parsing the data in this example might take up to a few minutes.)
>>> leeds_pbf_parsed = bbr.read_pbf(
...     subregion_name, data_dir=data_dir, readable=True, expand=True,
...     parse_geometry=True, parse_other_tags=True, parse_properties=True,
...     verbose=True)
Parsing "./tests/osm_data/leeds/Leeds.osm.pbf" ... Done.
>>> list(leeds_pbf_parsed.keys())
['points', 'lines', 'multilinestrings', 'multipolygons', 'other_relations']
>>> # Data of the 'multipolygons' layer
>>> leeds_pbf_parsed_multipolygons = leeds_pbf_parsed['multipolygons']
>>> leeds_pbf_parsed_multipolygons.shape
(481516, 26)
>>> leeds_pbf_parsed_multipolygons.head()
      id  ...                      other_tags
0  10595  ...                            None
1  10600  ...                            None
2  10601  ...                            None
3  10612  ...  {'ref:GB:uprn': '10025043089'}
4  10776  ...                            None
[5 rows x 26 columns]
>>> # Delete the example data and the test data directory
>>> delete_dir(data_dir, verbose=True)
To delete the directory "./tests/osm_data/" (Not empty)
? [No]|Yes: yes
Deleting "./tests/osm_data/" ... Done.
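As the examples show, the result is keyed by layer name, with each value being either a list of features (raw) or a table (parsed). A minimal sketch for summarising such a dictionary follows; count_features_per_layer is a hypothetical helper, and the data below are stand-ins rather than real OSM features.

```python
def count_features_per_layer(pbf_data):
    """Hypothetical helper: number of items in each layer.

    Relies only on len(), so it accepts lists as well as table-like
    objects that define __len__.
    """
    return {layer: len(features) for layer, features in pbf_data.items()}


# Stand-in data using the five layer names shown in the examples
fake_pbf_data = {
    'points': ['p1', 'p2'],
    'lines': ['l1'],
    'multilinestrings': [],
    'multipolygons': ['m1', 'm2', 'm3'],
    'other_relations': [],
}
layer_counts = count_features_per_layer(fake_pbf_data)
```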

See also