This function downloads and processes data from the Australian Electoral Commission (AEC) based on user-specified criteria such as file name, date range, election type, and data category. It retrieves raw data files from the AEC, optionally applies standardisation processes (e.g., column name consistency), and returns a combined data frame for analysis. The function is designed to handle various types of election-related datasets, including federal elections, referendums, and by-elections.
Arguments
- file_name
A character string specifying the name of the AEC dataset to retrieve (e.g., "National list of candidates"). This name must match entries in the internal index datasets.
- date_range
A list with two elements, "from" and "to", specifying the start and end dates (in "YYYY-MM-DD" format) for the election events to include. Defaults to list(from = "2022-01-01", to = "2025-01-01").
- type
A character string specifying the type of election or event. Must be one of: "Federal Election", "Referendum", "By-Election", or "Disclosure". Defaults to the first option.
- category
A character string specifying the category of the data. Must be one of: "House", "Senate", "Referendum", "General", or "Statistics". Defaults to the first option.
- process
A logical value indicating whether to apply additional processing to the downloaded data, such as standardizing column names. Defaults to TRUE.
- cache
Logical. If TRUE (default), caches the downloaded and processed data for the session, making subsequent identical requests instant. Set to FALSE to always download fresh data.
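The "must be one of ... defaults to the first option" behaviour described for type and category is what base R's match.arg() provides. The following is only a sketch of how such argument handling could look; validate_args is a hypothetical name and is not taken from the package.

```r
# Hypothetical stand-in for the argument handling; not the package's own code.
validate_args <- function(
    date_range = list(from = "2022-01-01", to = "2025-01-01"),
    type = c("Federal Election", "Referendum", "By-Election", "Disclosure"),
    category = c("House", "Senate", "Referendum", "General", "Statistics")) {
  # match.arg() returns the first choice when the argument is not supplied
  # and raises an error for values outside the allowed set.
  type <- match.arg(type)
  category <- match.arg(category)

  # date_range must be a list holding "from" and "to" elements.
  stopifnot(is.list(date_range), all(c("from", "to") %in% names(date_range)))
  from <- as.Date(date_range$from, format = "%Y-%m-%d")
  to <- as.Date(date_range$to, format = "%Y-%m-%d")
  if (is.na(from) || is.na(to) || from > to) {
    stop("`date_range` must hold valid YYYY-MM-DD dates with from <= to.")
  }

  list(from = from, to = to, type = type, category = category)
}

# Only category is supplied, so type falls back to its first option.
args <- validate_args(category = "Senate")
```

Note that match.arg() also accepts unambiguous partial matches, so a stricter check would be needed if exact strings are required.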
Value
A data frame containing the combined AEC data for the specified criteria. The data frame
includes metadata columns (e.g., date, event) and is optionally processed
for consistency if process = TRUE. If no data is available for the given parameters,
the function stops with an informative error message.
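Because the function stops with an error when no data is available, calling code that loops over several requests may want to trap that error. A minimal sketch using base R's tryCatch() follows; fetch_stub and its error text are invented for illustration and stand in for a real get_election_data() call.

```r
# Hypothetical stand-in that fails the way the documentation describes;
# the error text is made up. Swap in a real get_election_data() call in practice.
fetch_stub <- function(...) stop("No data available for the given parameters.")

result <- tryCatch(
  fetch_stub(file_name = "National list of candidates"),
  error = function(e) {
    # Report the problem and carry on instead of aborting the script.
    message("Retrieval failed: ", conditionMessage(e))
    NULL  # fall back to NULL so downstream code can test for it
  }
)

if (is.null(result)) {
  message("Skipping this request.")
}
```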
Details
The get_election_data function automates the retrieval and processing of AEC datasets by:
Validating input parameters to ensure correctness.
Checking if the data is already cached (if cache = TRUE).
Retrieving internal metadata about election events within the specified date_range and matching the type.
Checking the availability of the requested file_name and category in the internal index datasets.
Constructing download URLs and retrieving the raw data files from the AEC website.
Optionally preprocessing postal vote data and standardizing column names.
Combining data from multiple election events into a single data frame.
Caching the result for future identical requests (if cache = TRUE).
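The event-selection and combining steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the events table, its column names, and fake_download are all hypothetical, and the real internal index datasets may be structured differently.

```r
# Made-up event index; the real internal datasets may look different.
events <- data.frame(
  event = c("2019 Federal Election", "2022 Federal Election", "2023 Referendum"),
  date = as.Date(c("2019-05-18", "2022-05-21", "2023-10-14")),
  type = c("Federal Election", "Federal Election", "Referendum"),
  stringsAsFactors = FALSE
)

# Keep only events inside the date range that match the requested type.
date_range <- list(from = "2019-01-01", to = "2025-01-01")
selected <- events[
  events$date >= as.Date(date_range$from) &
    events$date <= as.Date(date_range$to) &
    events$type == "Federal Election",
]

# Stand-in for the download step: returns a tiny data frame per event
# and attaches the metadata columns (event, date).
fake_download <- function(event, date) {
  data.frame(
    Candidate = c("A", "B"),
    Votes = c(10L, 20L),
    event = event,
    date = date,
    stringsAsFactors = FALSE
  )
}

# One data frame per selected event, then stack them into a single result.
pieces <- lapply(seq_len(nrow(selected)), function(i) {
  fake_download(selected$event[i], selected$date[i])
})
combined <- do.call(rbind, pieces)
rownames(combined) <- NULL
```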
The function relies on internal helper functions (e.g., check_params, construct_url,
preprocess_pva) and datasets (e.g., info, aec_elections_index) within the
scgElectionsAU package. It also uses scgUtils::get_file for downloading files.
The function is designed to be robust, providing clear messages and errors to guide users through
the data retrieval process.
Use clear_cache to remove cached data when needed.
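For readers curious how session-level caching of this kind can work, one common base R pattern is an environment keyed by the request parameters. The sketch below assumes that approach; cache_env, cached_fetch, clear_cache_sketch, and slow_fetch are hypothetical names and do not describe how scgElectionsAU stores its cache.

```r
# Session-level cache held in an environment; hypothetical, not the package's code.
cache_env <- new.env(parent = emptyenv())

cached_fetch <- function(key_parts, fetch_fun, cache = TRUE) {
  # Build a key from the request parameters so identical requests share an entry.
  key <- paste(unlist(key_parts), collapse = "|")
  if (cache && exists(key, envir = cache_env, inherits = FALSE)) {
    return(get(key, envir = cache_env))
  }
  value <- fetch_fun()
  if (cache) assign(key, value, envir = cache_env)
  value
}

# Remove every cached entry.
clear_cache_sketch <- function() {
  rm(list = ls(envir = cache_env, all.names = TRUE), envir = cache_env)
  invisible(NULL)
}

# Count how often the (pretend) expensive download actually runs.
calls <- 0
slow_fetch <- function() {
  calls <<- calls + 1
  data.frame(x = 1:3)
}

params <- list("National list of candidates", "2022-01-01", "2023-01-01",
               "Federal Election", "House", FALSE)
first <- cached_fetch(params, slow_fetch)   # runs slow_fetch()
second <- cached_fetch(params, slow_fetch)  # served from the cache
```

Because the environment lives only in the running R session, its contents disappear when the session ends, which matches the behaviour described for the cache argument.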
Examples
if (FALSE) { # \dontrun{
# Retrieve and process the national list of candidates for House elections in 2022
# First call downloads from AEC
data <- get_election_data(
file_name = "National list of candidates",
date_range = list(from = "2022-01-01", to = "2023-01-01"),
type = "Federal Election",
category = "House",
process = FALSE
)
# Second identical call uses cache - instant!
data2 <- get_election_data(
file_name = "National list of candidates",
date_range = list(from = "2022-01-01", to = "2023-01-01"),
type = "Federal Election",
category = "House",
process = FALSE
)
# Clear cache when done (optional - clears automatically when session ends)
clear_cache()
} # }
