DoU_classify_grid {flexurba} | R Documentation |
Create the DEGURBA grid cell classification
Description
The function reconstructs the grid cell classification of the Degree of Urbanisation. Its arguments make it possible to adapt the standard specifications of the Degree of Urbanisation in order to construct an alternative version (see section "Custom specifications" below).
For more information about the Degree of Urbanisation methodology, see the methodological manual, GHSL Data Package 2022 and GHSL Data Package 2023.
Usage
DoU_classify_grid(
  data,
  level1 = TRUE,
  parameters = NULL,
  values = NULL,
  regions = FALSE,
  filename = NULL
)
Arguments
data |
path to the directory with the data, or named list with the data as returned by function |
level1 |
logical. Whether to classify the grid according to first hierarchical level ( |
parameters |
named list with the parameters to adapt the standard specifications in the Degree of Urbanisation classification. For more details, see section "Custom specifications" below. |
values |
vector with the values assigned to the different classes in the resulting classification:
|
regions |
logical. Whether to execute the classification in the memory-efficient pre-defined regions (note that this still requires a large amount of memory). For more details, see section "Regions" below. |
filename |
character. Output filename (with extension |
Value
SpatRaster with the grid cell classification
Classification rules
The Degree of Urbanisation consists of two hierarchical levels. In level 1, the cells of a 1 km² grid are classified into urban centres, urban clusters and rural cells (and water cells). In level 2, urban clusters are further divided into dense urban clusters, semi-dense urban clusters and suburban or peri-urban cells. Rural cells are further divided into rural clusters, low density rural cells and very low density rural cells.
The detailed classification rules are as follows:
LEVEL 1:
- Urban centres are identified as clusters of contiguous grid cells (based on rook contiguity) with a minimum density of 1500 inhabitants per km² (or with a minimum built-up area; see section "Built-up area criterium" below), and a minimum total population of 50 000 inhabitants. Gaps smaller than 15 km² in the urban centres are filled and edges are smoothed by a 3x3-majority rule (see section "Edge smoothing" below).
- Urban clusters are identified as clusters of contiguous grid cells (based on queen contiguity) with a minimum density of 300 inhabitants per km², and a minimum total population of 5000 inhabitants.
- Water cells contain no built-up area, no population, and less than 50% permanent land. All other cells not belonging to an urban centre or urban cluster are considered rural cells.
LEVEL 2:
- Urban centres are identified as clusters of contiguous grid cells (based on rook contiguity) with a minimum density of 1500 inhabitants per km² (or with a minimum built-up area; see section "Built-up area criterium" below), and a minimum total population of 50 000 inhabitants. Gaps smaller than 15 km² in the urban centres are filled and edges are smoothed by a 3x3-majority rule (see section "Edge smoothing" below).
- Dense urban clusters are identified as clusters of contiguous grid cells (based on rook contiguity) with a minimum density of 1500 inhabitants per km² (or with a minimum built-up area; see section "Built-up area criterium" below), and a minimum total population of 5000 inhabitants.
- Semi-dense urban clusters are identified as clusters of contiguous grid cells (based on rook contiguity) with a minimum density of 900 inhabitants per km² and a minimum total population of 2500 inhabitants, that are not within 2 km of urban centres or dense urban clusters. Clusters that are within 2 km are classified as suburban and peri-urban cells.
- Rural clusters are clusters of contiguous grid cells (based on queen contiguity) with a minimum density of 300 inhabitants per km², and a minimum total population of 500 inhabitants.
- Low density rural cells are remaining rural cells with a population density of at least 50 inhabitants per km².
- Water cells contain no built-up area, no population, and less than 50% permanent land. All cells not belonging to another class are considered very low density rural cells.
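The cluster-based rules above combine a per-cell density threshold, a contiguity rule and a minimum total population. The base R sketch below illustrates that logic on a toy matrix. It is not the flexurba implementation: the helper label_clusters(), the toy population values and the toy size threshold are all hypothetical and only serve to show how rook and queen contiguity can lead to different clusters.

```r
# Hypothetical helper (not part of flexurba): label connected groups of TRUE
# cells with a flood fill, using rook (4) or queen (8) contiguity.
label_clusters <- function(mask, directions = 4) {
  stopifnot(is.logical(mask), is.matrix(mask), directions %in% c(4, 8))
  offsets <- rbind(c(-1, 0), c(1, 0), c(0, -1), c(0, 1))
  if (directions == 8) {
    offsets <- rbind(offsets, c(-1, -1), c(-1, 1), c(1, -1), c(1, 1))
  }
  nr <- nrow(mask)
  nc <- ncol(mask)
  labels <- matrix(0L, nr, nc)
  current <- 0L
  for (start in which(mask)) {
    if (labels[start] != 0L) next          # already part of a cluster
    current <- current + 1L
    labels[start] <- current
    queue <- start
    while (length(queue) > 0) {
      cell <- queue[1]
      queue <- queue[-1]
      row_i <- (cell - 1) %% nr + 1        # linear index -> row
      col_i <- (cell - 1) %/% nr + 1       # linear index -> column
      for (k in seq_len(nrow(offsets))) {
        rr <- row_i + offsets[k, 1]
        cc <- col_i + offsets[k, 2]
        if (rr < 1 || rr > nr || cc < 1 || cc > nc) next
        if (mask[rr, cc] && labels[rr, cc] == 0L) {
          labels[rr, cc] <- current
          queue <- c(queue, (cc - 1) * nr + rr)
        }
      }
    }
  }
  labels
}

# Toy population grid (inhabitants per 1 km² cell); values are made up
pop <- matrix(c(
  1600, 1700,    0,    0,
     0,    0, 1800,    0,
     0,    0, 1900, 2000
), nrow = 3, byrow = TRUE)

dense <- pop >= 1500                        # per-cell density threshold
rook  <- label_clusters(dense, directions = 4)
queen <- label_clusters(dense, directions = 8)

# Total population per cluster under each contiguity rule
cluster_pop_rook  <- tapply(pop[dense], rook[dense], sum)
cluster_pop_queen <- tapply(pop[dense], queen[dense], sum)

# Keep only clusters that reach a (toy) minimum total population
size_threshold <- 5000
kept_rook <- as.integer(names(cluster_pop_rook)[cluster_pop_rook >= size_threshold])
```

In this toy grid the two groups of dense cells touch only diagonally, so rook contiguity yields two clusters while queen contiguity merges them into one.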
Custom specifications
The function makes it possible to change the standard specifications of the Degree of Urbanisation in order to construct an alternative version of the grid classification. Custom specifications can be passed as a named list through the argument parameters. The supported parameters with their default values are returned by the function DoU_get_grid_parameters() and are as follows:
LEVEL 1
- UC_density_threshold
  numeric (default: 1500). Minimum population density per permanent land of a cell required to belong to an urban centre
- UC_size_threshold
  numeric (default: 50000). Minimum total population size required for an urban centre
- UC_contiguity_rule
  integer (default: 4). Which cells are considered adjacent in urban centres: 4 for rooks case (horizontal and vertical neighbours) or 8 for queens case (horizontal, vertical and diagonal neighbours)
- UC_built_criterium
  logical (default: TRUE). Whether to use the additional built-up area criterium (see section "Built-up area criterium" below). If TRUE, not only cells that meet the population density requirement will be considered when delineating urban centres, but also cells with a built-up area per permanent land above the UC_built_threshold
- UC_built_threshold
  numeric or character (default: 0.2). Additional built-up area threshold. Can be a value between 0 and 1, representing the minimum built-up area per permanent land, or "optimal" (see section "Built-up area criterium" below). Ignored when UC_built_criterium is FALSE.
- built_optimal_data
  character / list (default: NULL). Path to the directory with the data, or named list with the data as returned by function DoU_preprocess_grid(), used to determine the optimal built threshold (see section "Built-up area criterium" below). Ignored when UC_built_criterium is FALSE or when UC_built_threshold is not "optimal".
- UC_smooth_pop
  logical (default: FALSE). Whether to smooth the population grid before delineating urban centres. If TRUE, the population grid will be smoothed with a moving average of window size UC_smooth_pop_window.
- UC_smooth_pop_window
  integer (default: 5). Size of the moving window used to smooth the population grid before delineating urban centres. Ignored when UC_smooth_pop is FALSE.
- UC_gap_fill
  logical (default: TRUE). Whether to perform gap filling. If TRUE, gaps in urban centres smaller than UC_max_gap are filled.
- UC_max_gap
  integer (default: 15). Gaps with an area smaller than this threshold in urban centres will be filled (unit is km²). Ignored when UC_gap_fill is FALSE.
- UC_smooth_edge
  logical (default: TRUE). Whether to perform edge smoothing. If TRUE, edges of urban centres are smoothed with the function UC_smooth_edge_fun.
- UC_smooth_edge_fun
  character / function (default: "majority_rule_R2023A"). Function used to smooth the edges of urban centres. Ignored when UC_smooth_edge is FALSE. Possible values are:
  - "majority_rule_R2022A" to use the edge smoothing algorithm in GHSL Data Package 2022 (see section "Edge smoothing" below)
  - "majority_rule_R2023A" to use the edge smoothing algorithm in GHSL Data Package 2023 (see section "Edge smoothing" below)
  - a custom function with a signature similar to apply_majority_rule().
- UCL_density_threshold
  numeric (default: 300). Minimum population density per permanent land of a cell required to belong to an urban cluster
- UCL_size_threshold
  numeric (default: 5000). Minimum total population size required for an urban cluster
- UCL_contiguity_rule
  integer (default: 8). Which cells are considered adjacent in urban clusters: 4 for rooks case (horizontal and vertical neighbours) or 8 for queens case (horizontal, vertical and diagonal neighbours)
- UCL_smooth_pop
  logical (default: FALSE). Whether to smooth the population grid before delineating urban clusters. If TRUE, the population grid will be smoothed with a moving average of window size UCL_smooth_pop_window.
- UCL_smooth_pop_window
  integer (default: 5). Size of the moving window used to smooth the population grid before delineating urban clusters. Ignored when UCL_smooth_pop is FALSE.
- water_land_threshold
  numeric (default: 0.5). Maximum proportion of permanent land allowed in a water cell
- water_pop_threshold
  numeric (default: 0). Maximum population size allowed in a water cell
- water_built_threshold
  numeric (default: 0). Maximum built-up area allowed in a water cell
LEVEL 2
- UC_density_threshold
  numeric (default: 1500). Minimum population density per permanent land of a cell required to belong to an urban centre
- UC_size_threshold
  numeric (default: 50000). Minimum total population size required for an urban centre
- UC_contiguity_rule
  integer (default: 4). Which cells are considered adjacent in urban centres: 4 for rooks case (horizontal and vertical neighbours) or 8 for queens case (horizontal, vertical and diagonal neighbours)
- UC_built_criterium
  logical (default: TRUE). Whether to use the additional built-up area criterium (see section "Built-up area criterium" below). If TRUE, not only cells that meet the population density requirement will be considered when delineating urban centres, but also cells with a built-up area per permanent land above the UC_built_threshold
- UC_built_threshold
  numeric or character (default: 0.2). Additional built-up area threshold. Can be a value between 0 and 1, representing the minimum built-up area per permanent land, or "optimal" (see section "Built-up area criterium" below). Ignored when UC_built_criterium is FALSE.
- built_optimal_data
  character / list (default: NULL). Path to the directory with the data, or named list with the data as returned by function DoU_preprocess_grid(), used to determine the optimal built threshold (see section "Built-up area criterium" below). Ignored when UC_built_criterium is FALSE or when UC_built_threshold is not "optimal".
- UC_smooth_pop
  logical (default: FALSE). Whether to smooth the population grid before delineating urban centres. If TRUE, the population grid will be smoothed with a moving average of window size UC_smooth_pop_window.
- UC_smooth_pop_window
  integer (default: 5). Size of the moving window used to smooth the population grid before delineating urban centres. Ignored when UC_smooth_pop is FALSE.
- UC_gap_fill
  logical (default: TRUE). Whether to perform gap filling. If TRUE, gaps in urban centres smaller than UC_max_gap are filled.
- UC_max_gap
  integer (default: 15). Gaps with an area smaller than this threshold in urban centres will be filled (unit is km²). Ignored when UC_gap_fill is FALSE.
- UC_smooth_edge
  logical (default: TRUE). Whether to perform edge smoothing. If TRUE, edges of urban centres are smoothed with the function UC_smooth_edge_fun.
- UC_smooth_edge_fun
  character / function (default: "majority_rule_R2023A"). Function used to smooth the edges of urban centres. Ignored when UC_smooth_edge is FALSE. Possible values are:
  - "majority_rule_R2022A" to use the edge smoothing algorithm in GHSL Data Package 2022 (see section "Edge smoothing" below)
  - "majority_rule_R2023A" to use the edge smoothing algorithm in GHSL Data Package 2023 (see section "Edge smoothing" below)
  - a custom function with a signature similar to apply_majority_rule().
- DUC_density_threshold
  numeric (default: 1500). Minimum population density required for a dense urban cluster
- DUC_size_threshold
  numeric (default: 5000). Minimum total population size required for a dense urban cluster
- DUC_built_criterium
  logical (default: TRUE). Whether to use the additional built-up area criterium (see section "Built-up area criterium" below). If TRUE, not only cells that meet the population density requirement will be considered when delineating dense urban clusters, but also cells with a built-up area per permanent land above the DUC_built_threshold
- DUC_built_threshold
  numeric or character (default: 0.2). Additional built-up area threshold. Can be a value between 0 and 1, representing the minimum built-up area per permanent land, or "optimal" (see section "Built-up area criterium" below). Ignored when DUC_built_criterium is FALSE.
- DUC_contiguity_rule
  integer (default: 4). Which cells are considered adjacent in dense urban clusters: 4 for rooks case (horizontal and vertical neighbours) or 8 for queens case (horizontal, vertical and diagonal neighbours)
- SDUC_density_threshold
  numeric (default: 900). Minimum population density per permanent land of a cell required to belong to a semi-dense urban cluster
- SDUC_size_threshold
  numeric (default: 2500). Minimum total population size required for a semi-dense urban cluster
- SDUC_contiguity_rule
  integer (default: 4). Which cells are considered adjacent in semi-dense urban clusters: 4 for rooks case (horizontal and vertical neighbours) or 8 for queens case (horizontal, vertical and diagonal neighbours)
- SDUC_buffer_size
  integer (default: 2). The distance to urban centres and dense urban clusters required for a semi-dense urban cluster
- SUrb_density_threshold
  numeric (default: 300). Minimum population density per permanent land of a cell required to belong to a suburban or peri-urban area
- SUrb_size_threshold
  numeric (default: 5000). Minimum total population size required for a suburban or peri-urban area
- SUrb_contiguity_rule
  integer (default: 8). Which cells are considered adjacent in suburban or peri-urban areas: 4 for rooks case (horizontal and vertical neighbours) or 8 for queens case (horizontal, vertical and diagonal neighbours)
- RC_density_threshold
  numeric (default: 300). Minimum population density per permanent land of a cell required to belong to a rural cluster
- RC_size_threshold
  numeric (default: 500). Minimum total population size required for a rural cluster
- RC_contiguity_rule
  integer (default: 8). Which cells are considered adjacent in rural clusters: 4 for rooks case (horizontal and vertical neighbours) or 8 for queens case (horizontal, vertical and diagonal neighbours)
- LDR_density_threshold
  numeric (default: 50). Minimum population density per permanent land of a low density rural grid cell
- water_land_threshold
  numeric (default: 0.5). Maximum proportion of permanent land allowed in a water cell
- water_pop_threshold
  numeric (default: 0). Maximum population size allowed in a water cell
- water_built_threshold
  numeric (default: 0). Maximum built-up area allowed in a water cell
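Only the entries that deviate from the defaults need to be supplied in the parameters list. One way to picture how a partial list combines with a full set of defaults is base R's modifyList(). This is a sketch of the idea and not necessarily how flexurba merges the list internally. The defaults object below is a hand-copied subset of the values listed above, not the output of DoU_get_grid_parameters().

```r
# Hand-copied subset of the level 1 defaults listed above (illustrative only)
defaults <- list(
  UC_density_threshold = 1500,
  UC_size_threshold = 50000,
  UC_contiguity_rule = 4,
  UC_gap_fill = TRUE,
  UCL_contiguity_rule = 8
)

# Custom specifications: only the entries that deviate from the defaults
custom <- list(
  UC_density_threshold = 3000,
  UC_gap_fill = FALSE
)

# modifyList() overwrites entries with matching names and keeps all others
effective <- modifyList(defaults, custom)
str(effective)
```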
Built-up area criterium
In Data Package 2022, the Degree of Urbanisation includes an optional built-up area criterium to account for the presence of office parks, shopping malls, factories and transport infrastructure. When the setting is enabled, urban centres (and dense urban clusters) are created using both cells with a population density of at least 1500 inhabitants per km² and cells that have at least 50% built-up area on permanent land. For more information: see GHSL Data Package 2022, footnote 25. The parameter settings UC_built_criterium=TRUE and UC_built_threshold=0.5 (level 1 & 2) and DUC_built_criterium=TRUE and DUC_built_threshold=0.5 (level 2) reproduce this built-up area criterium in urban centres and dense urban clusters respectively.
In Data Package 2023, the built-up area criterium is slightly adapted and renamed to the "Reduce Fragmentation Option". Instead of using a fixed threshold of built-up area per permanent land of 50%, an "optimal" threshold is employed. The optimal threshold is dynamically identified as the global average built-up area proportion in clusters with a density of at least 1500 inhabitants per km² of permanent land and a minimum population of 5000 people. We determined empirically that this optimal threshold is 20% for the data of 2020. For more information: see GHSL Data Package 2023, footnote 30. The "Reduce Fragmentation Option" can be reproduced with the parameter settings UC_built_criterium=TRUE and UC_built_threshold="optimal" (level 1 & 2) and DUC_built_criterium=TRUE and DUC_built_threshold="optimal" (level 2). In addition, the parameter built_optimal_data must contain the path to the directory with the (global) data to compute the optimal built-up area threshold.
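The cell selection described in this section can be sketched in base R as a logical OR between the population density condition and the built-up share condition. The function select_cells() and the toy values are hypothetical and not part of flexurba. The sketch assumes that both quantities are expressed per permanent land and that the built-up threshold is inclusive ("at least").

```r
# Toy cells (made-up values): population, built-up area (km²) and
# permanent land area (km²) per 1 km² grid cell
pop   <- c(2000,  100,  100, 0)
built <- c(0.30, 0.60, 0.15, 0)
land  <- c(1.00, 1.00, 0.50, 0)

# Density and built-up share per permanent land; cells without land get 0
pop_density <- ifelse(land > 0, pop / land, 0)
built_share <- ifelse(land > 0, built / land, 0)

# Hypothetical helper: which cells are candidates for urban centres?
select_cells <- function(pop_density, built_share,
                         density_threshold = 1500,
                         built_threshold = 0.5,
                         built_criterium = TRUE) {
  dense <- pop_density >= density_threshold
  if (!built_criterium) {
    return(dense)                      # population density only
  }
  dense | (built_share >= built_threshold)
}

# Fixed 50% threshold, as described for Data Package 2022
sel_fixed <- select_cells(pop_density, built_share, built_threshold = 0.5)
# 20% threshold, the value reported above for the 2020 data
sel_20 <- select_cells(pop_density, built_share, built_threshold = 0.2)
# Criterium disabled
sel_pop_only <- select_cells(pop_density, built_share, built_criterium = FALSE)
```

With these toy values, lowering the threshold from 0.5 to 0.2 adds the third cell, which has a low population density but a built-up share of 0.3 on its permanent land.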
Edge smoothing
In Data Package 2022, edges of urban centres are smoothed by an iterative majority rule. The majority rule works as follows: if at least five of the eight cells surrounding a cell belong to a unique urban centre, then the cell is added to that urban centre. The process is repeated iteratively until no more cells are added. The parameter settings UC_smooth_edge=TRUE and UC_smooth_edge_fun="majority_rule_R2022A" reproduce this edge smoothing rule.
In Data Package 2023, the majority rule is slightly adapted. A cell is added to an urban centre if the majority of the surrounding cells belong to a unique urban centre, with the majority only computed among populated or land cells (proportion of permanent land > 0.5). In addition, cells with permanent water are never added to urban centres. The process is repeated iteratively until no more cells are added. For more information: see GHSL Data Package 2023, footnote 29. The parameter settings UC_smooth_edge=TRUE and UC_smooth_edge_fun="majority_rule_R2023A" reproduce this edge smoothing rule.
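The iterative majority rule of Data Package 2022 can be illustrated with a small base R sketch. It is a simplification that handles a single urban centre stored as a logical matrix, so the "unique urban centre" condition is trivially met. It does not include the land and water conditions of the 2023 variant. The helpers count_neighbours() and smooth_edges() are hypothetical and are not the functions used by flexurba.

```r
# Hypothetical helper: for every cell, count how many of its eight
# neighbours are TRUE. Cells outside the grid count as FALSE.
count_neighbours <- function(mask) {
  nr <- nrow(mask)
  nc <- ncol(mask)
  padded <- matrix(0L, nr + 2, nc + 2)
  padded[2:(nr + 1), 2:(nc + 1)] <- mask
  total <- matrix(0L, nr, nc)
  for (dr in -1:1) {
    for (dc in -1:1) {
      if (dr == 0 && dc == 0) next       # skip the cell itself
      total <- total +
        padded[(2:(nr + 1)) + dr, (2:(nc + 1)) + dc, drop = FALSE]
    }
  }
  total
}

# Hypothetical helper: add cells with at least min_neighbours neighbours in
# the centre, and repeat until no more cells are added.
smooth_edges <- function(centre, min_neighbours = 5) {
  repeat {
    add <- (!centre) & (count_neighbours(centre) >= min_neighbours)
    if (!any(add)) break
    centre <- centre | add
  }
  centre
}

# Toy urban centre with a notch in its lower edge (row 4, column 3)
centre <- matrix(c(
  0, 0, 0, 0, 0,
  0, 1, 1, 1, 0,
  0, 1, 1, 1, 0,
  0, 1, 0, 1, 0,
  0, 0, 0, 0, 0
), nrow = 5, byrow = TRUE) == 1

smoothed <- smooth_edges(centre)
```

In this toy example the notch cell has five neighbours inside the centre and is therefore added, while the cells outside the resulting 3x3 block have at most three such neighbours and stay outside.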
Regions
Because of the large amount of data at a global scale, the grid classification procedure is quite memory-consuming. To optimise the procedure, we divided the world into 9 pre-defined regions. These regions are the smallest grouping of GHSL tiles that ensures that no continuous land mass is split into two different regions (for more information, see the figure below and GHSL_tiles_per_region).
If regions=TRUE, a global grid classification is created by (1) executing the grid classification procedure separately in the 9 pre-defined regions, and (2) afterwards merging these classifications together. The argument data should contain the path to a directory with the data of all pre-defined regions (for example as created by download_GHSLdata(... extent="regions")). Note that although the grid classification is optimised, it still takes approximately 145 minutes and requires 116 GB of RAM to execute the grid classification with the standard parameters (performed on a Kubernetes server with 32 cores and 256 GB RAM). For a concrete example of how to construct the grid classification on a global scale, see vignette("vig3-DoU-global-scale").
Examples
# load the data
data_belgium <- DoU_load_grid_data_belgium()

# classify with standard parameters:
classification1 <- DoU_classify_grid(data = data_belgium)

# classify with custom parameters:
classification2 <- DoU_classify_grid(
  data = data_belgium,
  parameters = list(
    UC_density_threshold = 3000,
    UC_size_threshold = 75000,
    UC_gap_fill = FALSE,
    UC_smooth_edge = FALSE,
    UCL_contiguity_rule = 4
  )
)