GeofabrikDownloader.get_raw_directory_index

static GeofabrikDownloader.get_raw_directory_index(url, verbose=False)

Get a raw directory index.

This includes logs of older files and their download URLs.

Parameters
  • url (str) – URL of the Geofabrik homepage or of any subregion's web page

  • verbose (bool or int) – whether to print relevant information to the console; defaults to False

Returns

data of the raw directory index, or None if the web page does not have one

Return type

pandas.DataFrame or None

Examples:

>>> from pydriosm.downloader import GeofabrikDownloader

>>> geofabrik_downloader = GeofabrikDownloader()

>>> gb_url = 'https://download.geofabrik.de/europe/great-britain.html'

>>> raw_dir_idx = geofabrik_downloader.get_raw_directory_index(gb_url)

>>> type(raw_dir_idx)
pandas.core.frame.DataFrame
>>> raw_dir_idx.head()
                               File  ...                                     FileURL
0             great-britain-updates  ...  https://download.geofabrik.de/europe/gr...
1  great-britain-210412.osm.pbf.md5  ...  https://download.geofabrik.de/europe/gr...
2  great-britain-latest.osm.pbf.md5  ...  https://download.geofabrik.de/europe/gr...
3                 great-britain.kml  ...  https://download.geofabrik.de/europe/gr...
4      great-britain-latest.osm.pbf  ...  https://download.geofabrik.de/europe/gr...
[5 rows x 4 columns]

>>> gf_url = 'https://download.geofabrik.de/'

>>> raw_dir_idx = geofabrik_downloader.get_raw_directory_index(gf_url, verbose=True)
Collecting the raw directory index for the page 'https://download.ge...' ... Failed.
The web page does not have any raw directory index.
>>> type(raw_dir_idx)
NoneType
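
Because the method returns None for a page without a raw directory index, calling code may want to guard against that case before using the result. The following is a minimal sketch, not part of pydriosm: the helper name pbf_file_urls is hypothetical, and it relies only on the 'File' and 'FileURL' columns visible in the example output above (the two columns elided by "..." are not used).

```python
def pbf_file_urls(raw_dir_idx):
    """Return download URLs of the .osm.pbf files in a raw directory index.

    Hypothetical helper (not part of pydriosm). `raw_dir_idx` is expected to
    be the value returned by get_raw_directory_index(): a pandas.DataFrame,
    or None when the web page has no raw directory index.
    """
    # No raw directory index on the page: nothing to list.
    if raw_dir_idx is None:
        return []

    # Assumption: the frame has 'File' and 'FileURL' columns, as shown in
    # the example output; no other column is relied upon.
    files = raw_dir_idx['File']
    urls = raw_dir_idx['FileURL']

    # Keep only the data files themselves, skipping entries such as
    # '*.osm.pbf.md5' checksums or '*.kml' boundary files.
    return [url for name, url in zip(files, urls)
            if str(name).endswith('.osm.pbf')]
```

With the raw_dir_idx obtained for gb_url above, pbf_file_urls(raw_dir_idx) would list the URLs of entries such as great-britain-latest.osm.pbf; for gf_url, where the method returns None, it returns an empty list.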