GeofabrikDownloader.get_raw_directory_index

classmethod GeofabrikDownloader.get_raw_directory_index(url, verbose=False)

Get a raw directory index (including download information of older file logs).

Parameters:
  • url (str) – URL of a web page of a data resource (e.g. a subregion)

  • verbose (bool | int) – whether to print relevant information to the console; defaults to False

Returns:

the raw directory index, or None if the web page provides no such index

Return type:

pandas.DataFrame | None

Examples:

>>> from pydriosm.downloader import GeofabrikDownloader

>>> gfd = GeofabrikDownloader()

>>> homepage_url = gfd.URL
>>> homepage_url
'https://download.geofabrik.de/'
>>> raw_index = gfd.get_raw_directory_index(homepage_url, verbose=True)
Collecting the raw directory index on 'https://download.geofabrik.de/' ... Failed.
No raw directory index is available on the web page.
>>> raw_index is None
True

>>> great_britain_url = 'https://download.geofabrik.de/europe/great-britain.html'
>>> raw_index = gfd.get_raw_directory_index(great_britain_url)
>>> type(raw_index)
pandas.core.frame.DataFrame
>>> raw_index.columns.tolist()
['file', 'date', 'size', 'metric_file_size', 'url']
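
Because the method returns None for pages without a raw directory index (such as the homepage above), callers may want to guard against that case. The following is a minimal sketch, not part of pydriosm: first_available_index is a hypothetical helper that accepts any fetch callable (for example, gfd.get_raw_directory_index) and returns the first URL whose index is available.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def first_available_index(
    fetch: Callable[[str], Optional[Any]],
    urls: Iterable[str],
) -> Tuple[Optional[str], Optional[Any]]:
    """Return (url, index) for the first URL whose index is not None.

    Hypothetical helper for illustration; `fetch` is assumed to behave like
    GeofabrikDownloader.get_raw_directory_index, i.e. it returns either an
    index object (a pandas.DataFrame) or None.
    """
    for url in urls:
        index = fetch(url)
        # Compare with None explicitly: the truth value of a
        # pandas.DataFrame is ambiguous and raises ValueError.
        if index is not None:
            return url, index
    # No URL in the sequence provided a raw directory index.
    return None, None


# Possible usage with the objects from the examples above (requires
# pydriosm and network access, so it is left commented out here):
#
# gfd = GeofabrikDownloader()
# url, raw_index = first_available_index(
#     gfd.get_raw_directory_index,
#     [gfd.URL, 'https://download.geofabrik.de/europe/great-britain.html'],
# )
```

Passing the fetch function as an argument keeps the helper independent of the downloader class, so the same check can be exercised offline with a stub.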