library(reticulate)
if (Sys.getenv("GITHUB_ACTIONS") == "true") {
  # for running in GitHub Actions (otherwise reticulate can't find diverse_web_env)
  reticulate::use_python(Sys.getenv("RETICULATE_PYTHON"), required = TRUE)
} else {
  # for running locally
  reticulate::use_condaenv("diverse_web_env", required = TRUE)
}
Historical Alberta Wildfire Data
Use Binder to explore and run this data set and analysis interactively in your browser with R. Ideal for students and instructors to run, modify, and test the analysis in a live JupyterLab environment, with no setup needed.
About the Data
This data set contains information on wildfires in Alberta, Canada, compiled from official government sources under the Open Government Licence – Alberta.
The data was gathered to monitor, assess, and respond to wildfire risks across different regions. Wildfires have far-reaching environmental, social, and economic consequences. From an equity and inclusion perspective, analyzing wildfire data can reveal geographic and resource-based disparities in detection and containment efforts, and highlight how certain populations face greater risks due to climate change and limited infrastructure.
In particular, Alberta experiences some of the most severe and frequent wildfires in Canada due to its vast forested areas, dry climate, and increasing temperatures linked to climate change. Wildfires in Alberta can lead to widespread evacuations, destroy homes and livelihoods, and disproportionately affect rural and Indigenous communities, who may lack access to adequate emergency services and infrastructure. Understanding the patterns of wildfire occurrence and spread helps policymakers, environmental planners, and emergency services allocate resources more equitably and implement effective mitigation strategies. This data set enables data-driven approaches to reduce the impact of wildfires and support more resilient and inclusive disaster management practices across Alberta and beyond.
Download
Metadata
Variables
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
year | ID | Integer | Year in which the wildfire incident was first detected | Year | No |
fire_number | ID | String | Identifier for the wildfire | - | No |
current_size | Feature | Numeric | Final estimated area burned by the wildfire | Hectares | No |
size_class | Feature | Categorical | Size classification based on final area burned | - | No |
latitude | Feature | Numeric | Latitude coordinate of the wildfire origin | Degrees | No |
longitude | Feature | Numeric | Longitude coordinate of the wildfire origin | Degrees | No |
fire_origin | Metadata | Categorical | Who owns or administers the land the wildfire ignited on | - | No |
general_cause | Feature | Categorical | Classification of the wildfire cause | - | No |
responsible_group | Metadata | Categorical | Recreational group responsible for causing the wildfire | - | No |
activity_class | Feature | Categorical | Activity that was going on when the wildfire started | - | No |
true_cause | Feature | Categorical | Specific reason why the wildfire started (e.g., “Arson Known”, “Hot Exhaust”, “Line Impact”, “Unattended Fire”, etc.) | - | No |
fire_start_date | Time | Date | Datetime the wildfire started | YYYY-MM-DD | Yes |
detection_agent_type | Feature | Categorical | Type of detection agent that discovered the wildfire (e.g., lookout (“LKT”), aircraft (“AIR”)) | - | No |
detection_agent | Feature | Categorical | Specific type of detection agent that discovered the wildfire | - | No |
assessment_hectares | Feature | Numeric | Size of the wildfire at the time of assessment | Hectares | No |
fire_spread_rate | Feature | Numeric | Rate at which the wildfire spread at the time of initial assessment | Metres/minute | No |
fire_type | Feature | Categorical | Predominant wildfire behavior classification at the time of initial assessment (e.g., “Surface”, “Ground”, “Crown”) | - | No |
fire_position_on_slope | Feature | Categorical | Position of the wildfire relative to the slope it is travelling on at the time of initial assessment (e.g., “Bottom”, “Middle 1/3”, “Unknown”) | - | No |
weather_conditions_over_fire | Feature | Categorical | Weather conditions over the wildfire at the time of initial assessment | - | No |
temperature | Feature | Numeric | Temperature at the wildfire location at the time of initial assessment | °C | Yes |
relative_humidity | Feature | Numeric | Relative humidity at the wildfire location at the time of initial assessment | % | Yes |
wind_direction | Feature | Categorical | Wind direction at the wildfire location at the time of initial assessment | - | No |
wind_speed | Feature | Numeric | Wind speed at the wildfire location in km/h at the time of initial assessment | km/h | Yes |
fuel_type | Feature | Categorical | Dominant fuel type (vegetation cover) in which the wildfire is burning at the wildfire location at the time of initial assessment | - | No |
initial_action_by | Metadata | Categorical | Group that initiated suppression efforts | - | No |
ia_arrival_at_fire_date | Time | DateTime | Datetime when the initial action group arrived at the wildfire | YYYY-MM-DD | Yes |
ia_access | Feature | Categorical | Method of access that the initial action group used | - | No |
fire_fighting_start_date | Time | DateTime | Datetime when the initial action group began firefighting activities | YYYY-MM-DD | Yes |
fire_fighting_start_size | Feature | Numeric | Wildfire size at the time firefighting began | Hectares | No |
bucketing_on_fire | Feature | Binary | Whether aerial bucketing was used on the wildfire | Yes/No | No |
first_bh_date | Time | DateTime | Datetime when wildfire was first declared being held | YYYY-MM-DD | No |
first_bh_size | Feature | Numeric | Wildfire size when wildfire was first declared being held | Hectares | No |
first_uc_date | Time | DateTime | Datetime when wildfire was first declared under control | YYYY-MM-DD | No |
first_uc_size | Feature | Numeric | Wildfire size when first declared under control | Hectares | No |
first_ex_size_perimeter | Feature | Numeric | Wildfire size when first declared extinguished | Hectares | No |
Key Features of the Data Set
Each row represents a single wildfire incident and includes information such as:
- Environmental conditions (temperature, wind_speed, relative_humidity) – Air temperature (°C), wind speed (km/h) and relative humidity (%) near the fire; these influence fire intensity and spread.
- Fire behavior metrics (fire_spread_rate, fire_type, fuel_type) – Rate of spread (metres/minute), behavior class (e.g., surface, crown) and dominant fuel; these determine fire growth and management.
- Access difficulty (ia_access) – How the initial action crew reached the fire; difficult access delays response.
- Location coordinates (latitude, longitude) – Decimal degrees of the fire origin; used for mapping and spatial analysis.
Purpose and Use Cases
This data set is designed to support analysis of:
- Factors contributing to the spread, intensity, and size of wildfires
- The impact of weather conditions and fuel types on fire behavior
- Geographic and seasonal patterns in wildfire occurrence
- The effectiveness and timeliness of initial suppression efforts
- Relationships between fire causes, detection methods, and responsible parties
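As a rough illustration of the timeliness question, the delay between ignition and the arrival of the initial action crew can be derived from the fire_start_date and ia_arrival_at_fire_date columns. The sketch below uses a small invented data frame (toy_fires) and base R only; the timestamp format string is an assumption and may need adjusting to match how the columns are actually stored.

```r
# Toy stand-in for the real data; all values are invented for illustration
toy_fires <- data.frame(
  fire_number = c("T001", "T002", "T003"),
  fire_start_date = c("2020-05-01 12:00:00", "2020-05-02 08:30:00", NA),
  ia_arrival_at_fire_date = c("2020-05-01 14:30:00", "2020-05-02 09:00:00",
                              "2020-05-03 10:00:00")
)

# Assumed timestamp layout: "YYYY-MM-DD HH:MM:SS"
fmt <- "%Y-%m-%d %H:%M:%S"
started <- as.POSIXct(toy_fires$fire_start_date, format = fmt, tz = "UTC")
arrived <- as.POSIXct(toy_fires$ia_arrival_at_fire_date, format = fmt, tz = "UTC")

# Hours between ignition and crew arrival; NA where a timestamp is missing
toy_fires$response_hours <- as.numeric(difftime(arrived, started, units = "hours"))

# Summarise while ignoring records with a missing start or arrival time
median_response <- median(toy_fires$response_hours, na.rm = TRUE)
```

Because several of the Time variables are flagged as having missing values, any summary of response delay should state how many records were dropped.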
Case Study
Objective
Large wildfires pose serious environmental, social, and economic challenges, especially as climate conditions become more extreme. Identifying the key environmental and human factors linked to these fires can help guide more effective prevention and response strategies.
So, our main question is:
Can we identify the environmental and human factors most associated with large wildfires?
According to Natural Resources Canada, wildfires exceeding 200 hectares in final size are classified as “large fires.” While these fires represent a small percentage of all wildfires, they account for the majority of the total area burned annually.
The goal is to explore potential predictors of fire size, such as weather, fire cause, and detection method, and provide insights that could inform early interventions and resource planning.
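To make the "few fires, most of the area" point concrete, here is a minimal base R sketch on invented fire sizes. The numbers are not taken from the data set; only the 200-hectare threshold comes from the definition above.

```r
# Invented final fire sizes in hectares, for illustration only
current_size <- c(0.1, 0.3, 0.5, 1, 2, 5, 15, 40, 250, 1200)

# A fire counts as "large" when it strictly exceeds 200 ha
large_fire <- current_size > 200

# Fraction of incidents that are large fires
share_of_fires <- mean(large_fire)

# Fraction of total burned area contributed by large fires
share_of_area <- sum(current_size[large_fire]) / sum(current_size)
```

In this toy example two of ten fires are large, yet they account for well over 90% of the hectares burned. The same two summaries can be computed on the real current_size column to check how strongly the pattern holds in Alberta.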
Analysis
Loading Libraries
# Data
library(diversedata) # Diverse Data Hub data sets
# Core libraries
library(tidyverse)
library(lubridate)
# Spatial & mapping
library(sf)
library(terra)
library(ggmap)
library(ggspatial)
library(maptiles)
library(leaflet)
library(leaflet.extras)
# Visualization & color
library(viridis)
# Tables & reporting
library(gt)
library(kableExtra)
# Modeling & interpretation
library(marginaleffects)
library(broom)
1. Data Cleaning & Processing
- Converted temperature and wind speed to numeric, and fire cause and detection agent type to factors
- Created a binary variable large_fire (TRUE if final size > 200 ha)
- Filtered out incomplete records
# Reading Data
wildfire_data <- wildfire

# Clean and prepare base data
wildfire_clean <- wildfire_data |>
  filter(!is.na(assessment_hectares), assessment_hectares > 0) |>
  mutate(
    large_fire = current_size > 200,
    true_cause = as.factor(true_cause),
    detection_agent_type = as.factor(detection_agent_type),
    temperature = as.numeric(temperature),
    wind_speed = as.numeric(wind_speed)
  )

# Drop unused levels for modeling
wildfire_clean <- wildfire_clean |>
  filter(!is.na(true_cause), !is.na(detection_agent_type)) |>
  mutate(
    true_cause = droplevels(true_cause),
    detection_agent_type = droplevels(detection_agent_type)
  )
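The reason for dropping unused levels may not be obvious: filtering rows out of a factor leaves its level set unchanged, so a category with no remaining observations can still appear in tables or model matrices. A small base R sketch with invented cause labels shows the effect.

```r
# Invented causes for illustration
cause <- factor(c("Arson Known", "Hot Exhaust", "Unattended Fire", "Hot Exhaust"))

# Removing rows does not remove the now-empty level
kept <- cause[cause != "Unattended Fire"]
levels_before <- nlevels(kept)   # still counts the empty "Unattended Fire" level

# droplevels() keeps only the levels that actually occur
kept <- droplevels(kept)
levels_after <- nlevels(kept)
```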
2. Exploratory Data Analysis
Map of Wildfire Size and Location in Alberta
This interactive map displays the geographic distribution and relative size of wildfires across Alberta, using red circles sized by fire area. Each point represents a wildfire event, with larger circles indicating more extensive burns. The map reveals regions with concentrated wildfire activity and visually emphasizes differences in fire magnitude across the province.
Note
To provide geographic context for our wildfire data, we added a shapefile representing Alberta’s boundaries.
This shapefile was sourced from the Alberta Government Open Data Portal and specifically corresponds to the Electoral Division Shapefile (Bill 33, 2017).
The data was processed and transformed to the appropriate geographic coordinate system to enable mapping alongside our wildfire data set.
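The map code relies on two spatial objects, wildfire_sf and alberta_shape, whose construction is not shown here. The following is a hedged sketch of how such objects could be built with sf. The toy records, the rectangular stand-in for the boundary, and the assumed source CRS (NAD83, EPSG:4269) are all illustrative assumptions, not the processing actually used for this page.

```r
library(sf)

# Toy wildfire records; in the analysis this role would be played by wildfire_clean
toy_fires <- data.frame(
  fire_number = c("T001", "T002", "T003"),
  current_size = c(0.5, 40, 1200),
  longitude = c(-114.1, -116.5, -111.4),
  latitude = c(53.5, 56.2, 57.0)
)

# Point layer in WGS 84 (EPSG:4326), the longitude/latitude system leaflet works with
wildfire_sf <- st_as_sf(toy_fires, coords = c("longitude", "latitude"), crs = 4326)

# Stand-in for the boundary shapefile: a rough rectangle around the province.
# With the real file, this step would instead read the shapefile with st_read().
box <- rbind(c(-120, 49), c(-110, 49), c(-110, 60), c(-120, 60), c(-120, 49))
alberta_shape <- st_sf(geometry = st_sfc(st_polygon(list(box)), crs = 4269))

# Reproject the boundary so it shares a CRS with the wildfire points
alberta_shape <- st_transform(alberta_shape, 4326)
```

Whatever CRS the downloaded shapefile uses, the key step is the final st_transform() call, which puts the boundary and the fire locations in the same coordinate system before plotting.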
# map
leaflet() |>
addProviderTiles("CartoDB.Positron") |>
setView(lng = -115, lat = 55, zoom = 5.5) |>
addPolygons(data = alberta_shape,
color = "#CCCCCC",
weight = 0.5,
fillOpacity = 0.02,
group = "Alberta Boundaries") |>
addCircles(data = wildfire_sf,
radius = ~sqrt(current_size) * 30,
fillOpacity = 0.6,
color = "red",
stroke = FALSE,
group = "Wildfires") |>
addLayersControl(overlayGroups = c("Alberta Boundaries", "Wildfires"),
options = layersControlOptions(collapsed = FALSE)) |>
addLegend(position = "bottomright",
title = "Wildfire Size (approx.)",
colors = "red",
labels = "Larger = Bigger fire")
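The radius formula in the map, sqrt(current_size) * 30, is worth a short check. Since addCircles() takes its radius in metres and a circle's area grows with the square of its radius, taking the square root makes the drawn area proportional to fire size rather than to its square. The sketch below verifies this on invented sizes; the multiplier 30 is simply the visual scaling factor used in the map.

```r
# Invented fire sizes in hectares
current_size <- c(1, 100, 10000)

# Same scaling as the map: radius in metres passed to addCircles()
radius_m <- sqrt(current_size) * 30

# Area of each drawn circle, and its ratio to the smallest circle
circle_area <- pi * radius_m^2
area_ratio <- circle_area / circle_area[1]
```

Here a fire 100 times larger is drawn with a radius 10 times larger and an area 100 times larger, so visual comparisons between circles reflect relative burned area.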