R/clean_labour_force_survey.R
clean_labour_force_survey.Rd
This function automatically detects Labour Force Survey data in the specified directory and reads it into R. It then extracts age, sex, UK birth status, and country, and uses them to build a tidy dataset. The data can be downloaded here.
clean_labour_force_survey(
  data_path = "~/data/tb_data/LFS",
  years = 2000:2016,
  years_var = list(
    `2000` = c("age", "sex", "cry", "govtof", "pwt07"),
    `2001` = c("age", "sex", "cry01", "country", "pwt07"),
    `2002` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2003` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2004` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2005` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2006` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2007` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2008` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2009` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2010` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2011` = c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14"),
    `2012` = c("AGE", "SEX", "CRY12", "COUNTRY", "PWT14"),
    `2013` = c("AGE", "SEX", "CRY12", "COUNTRY", "PWT16"),
    `2014` = c("AGE", "SEX", "CRY12", "COUNTRY", "PWT16"),
    `2015` = c("AGE", "SEX", "CRY12", "COUNTRY", "PWT16"),
    `2016` = c("AGE", "SEX", "CRY12", "COUNTRY", "PWT16")
  ),
  return = TRUE,
  save = TRUE,
  save_name = "formatted_LFS_2000_2016",
  save_path = "~/data/tb_data/tbinenglanddataclean",
  save_format = "rds",
  verbose = TRUE,
  theme_set = theme_minimal
)
Argument | Description |
---|---|
data_path | A character string containing the file path to the demographic data. |
years | A numeric vector specifying which years of data to clean. |
years_var | A named list of character vectors. Each vector should contain the variables to extract for a given year, and each list element should be named with the year of data it applies to. |
return | Logical, defaults to TRUE. |
save | Logical, defaults to TRUE. |
save_name | A character string containing the file name for the data to be saved under. |
save_path | The file path for the data to be saved in. |
save_format | A character vector specifying the format or formats to save the data in, defaults to rds. Currently csv is also supported. See |
verbose | A logical indicating whether summary information should be provided. |
theme_set | The ggplot theme to apply to the summary graphs, defaults to theme_minimal. |
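Because most years share the same variable names, the years_var list does not have to be typed out year by year. The following is a minimal sketch in base R that rebuilds a list equivalent to the default shown in the usage above; the helper object names (vars_cry01, vars_cry12_pwt16) are made up for illustration and are not part of the package.

```r
# Variable names by year, copied from the default years_var in the usage above.
# The two helper vectors below are hypothetical names used only in this sketch.
vars_cry01 <- c("AGE", "SEX", "CRY01", "COUNTRY", "PWT14")        # 2002 to 2011
vars_cry12_pwt16 <- c("AGE", "SEX", "CRY12", "COUNTRY", "PWT16")  # 2013 to 2016

years_var <- c(
  # Early years use lower-case names and differ from each other
  list(
    `2000` = c("age", "sex", "cry", "govtof", "pwt07"),
    `2001` = c("age", "sex", "cry01", "country", "pwt07")
  ),
  # Repeat the shared vector once per year and name each element by year
  setNames(rep(list(vars_cry01), 10), 2002:2011),
  # 2012 pairs the newer country-of-birth variable with the older weight
  list(`2012` = c("AGE", "SEX", "CRY12", "COUNTRY", "PWT14")),
  setNames(rep(list(vars_cry12_pwt16), 4), 2013:2016)
)

# One entry per year, named "2000" through "2016"
names(years_var)
```

The resulting list can then be passed as the years_var argument, with years set to the matching range.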
A tidy data frame of population broken down by country, age, sex and UK birth status for the requested years (2000 to 2016 with the default years argument).
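When save = TRUE, the cleaned data can be read back in a later session. The sketch below illustrates this with base R only. It assumes the file is written as save_name plus the save_format extension inside save_path, which this page does not confirm, and it uses a toy data frame with hypothetical column names (Year, Age, Sex, CoB, Country, Weight) in place of real output.

```r
# Assumption: the saved file is "<save_name>.<save_format>" inside save_path.
save_path <- tempdir()  # stand-in for the real save_path
save_name <- "formatted_LFS_2000_2016"
lfs_file <- file.path(save_path, paste0(save_name, ".rds"))

# Toy stand-in for the cleaned output; column names and values are
# hypothetical and only illustrate the tidy, one-row-per-group shape.
toy_lfs <- data.frame(
  Year = c(2000, 2000),
  Age = c(30, 45),
  Sex = c("Male", "Female"),
  CoB = c("UK born", "Non-UK born"),
  Country = c("England", "England"),
  Weight = c(350, 410)
)
saveRDS(toy_lfs, lfs_file)

# Read the saved data back and total the weights by UK birth status
lfs <- readRDS(lfs_file)
pop_by_cob <- aggregate(Weight ~ CoB, data = lfs, FUN = sum)
pop_by_cob
```

With real output, the column names should be checked with names() before aggregating, since the ones used here are placeholders.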