Standardises and transforms Postal Vote Application (PVA) data for a single Australian federal election event into a consistent long-format structure based on application receipt dates. This function aligns column names across election years (2010, 2013, 2016, 2019), pivots date-specific vote counts into a long format, and converts receipt dates to Date objects. For unrecognised election years, the data is returned unprocessed with a message.
Arguments
- data
A data frame containing PVA data for a single election event. Must include:
date
(the date of the election or data snapshot)event
(the election event, matching theevent
argument)
Additional required columns vary by year:
2010:
Enrolment
, and date-specific columns (e.g., "02 Aug 10").2013:
Enrolment Division
, and date-specific columns (e.g., "20-Aug-13").2016:
State_Cd
,PVA_Web_2_Date_Div
, and date-specific columns (e.g., "20160614").2019:
State_Cd
,PVA_Web_2_Date_V2_Div
, and date-specific columns (e.g., "20190411").
- event
A character string specifying the election event to process. Recognised values are "2010 Federal Election", "2013 Federal Election", "2016 Federal Election", or "2019 Federal Election". Other values result in the data being returned unprocessed.
Value
A data frame with standardised columns for recognised election years:
date
(the date of the election or data snapshot)event
(the election event)StateAb
(state abbreviation, upper case; "ZZZ" for NA in 2013)DivisionNm
(division name)DateReceived
(date the PVA was received, as a Date object)TotalPVAs
(total PVA applications received on the corresponding date)
For unrecognised years, the original data frame is returned unchanged.
Details
This function processes PVA data by:
Standardising column names across recognised election years using
rename_cols()
:2010:
Enrolment
toDivisionNm
.2013:
Enrolment Division
toDivisionNm
.2016:
State_Cd
toStateAb
,PVA_Web_2_Date_Div
toDivisionNm
.2019:
State_Cd
toStateAb
,PVA_Web_2_Date_V2_Div
toDivisionNm
.
Handling missing states: For 2013, NA in
StateAb
is replaced with "ZZZ".Filtering rows: For 2010 and 2013, rows with NA in
DivisionNm
(e.g., notes or totals) are removed.Removing unnecessary columns: Drops columns like "TOTAL to date (Inc GPV)" (2010, 2013), "<>", and "Date out of range" (2019).
Pivoting data: Uses
pivot_event()
to transform date-specific columns (e.g., "20-Aug-13") into long format withDateReceived
andTotalPVAs
.Converting dates: Formats
DateReceived
as a Date object using year-specific formats:2019: "%Y%m%d" (e.g., "20190411")
2016: "%Y%m%d" (e.g., "20160614")
2013: "%d-%b-%y" (e.g., "20-Aug-13")
2010: "%d %b %y" (e.g., "02 Aug 10")
Formatting: Converts
StateAb
to uppercase.Unrecognised years: Returns the data unprocessed with an informative message.
The function assumes the input data frame contains the required columns (date
, event
, and
year-specific columns) from the AEC past results datasets and that the event
column matches
the event
argument. The date-specific columns represent daily PVA totals and are pivoted into
the DateReceived
and TotalPVAs
columns.
Examples
# Sample 2010 data
data_2010 <- data.frame(
date = "2010-08-21",
event = "2010 Federal Election",
StateAb = "VIC",
Enrolment = "Melbourne",
`02 Aug 10` = 50,
`03 Aug 10` = 60
)
process_pva_date(data_2010, "2010 Federal Election")
#> Processing `2010 Federal Election` data to ensure all columns align across all elections.
#> [1] date event StateAb DivisionNm DateReceived
#> [6] TotalPVAs
#> <0 rows> (or 0-length row.names)
# Sample invalid year
data_2022 <- data.frame(
date = "2022-05-21",
event = "2022 Federal Election",
StateAb = "QLD",
Votes = 90
)
process_pva_date(data_2022, "2022 Federal Election")
#> No processing required for `2022 Federal Election`. Data returned unprocessed.
#> date event StateAb Votes
#> 1 2022-05-21 2022 Federal Election QLD 90