In my last post, I showed how to extend the functions I developed for NJASK data to other state assessments. In this post, I'll tie everything together and write some general functions that bring a wide variety of NJ state assessment data into R.
Roughly speaking, we're trying to write a function that will return data given a year and a grade. Here are the big things that need to happen:
- Check that the call we made is a valid grade/year combination (raising an informative error if not).
- Map the grade/year call to the correct get_blank_data function (NJASK? HSPA? GEPA?).
- Fetch, clean, and return the data frame.
Let's tackle each of these in turn.
valid calls
Before we do anything, let's source in all of the functions created in the previous two posts, including fetch_hspa() and fetch_gepa().
knitr::knit('05_njask-data-2.rmd', tangle=TRUE)
## [1] "05_njask-data-2.R"
source('05_njask-data-2.R')
## Parsed with column specification:
## cols(
## .default = col_character(),
## TOTAL_POPULATION_Number_Enrolled_ELA = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Number_Not_Present = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_Not_Present = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Scale_Score_Mean = col_integer(),
## TOTAL_POPULATION_SCIENCE_Number_Enrolled_Science = col_integer(),
## TOTAL_POPULATION_SCIENCE_Number_Not_Present = col_integer(),
## TOTAL_POPULATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_SCIENCE_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_Number_Enrolled_ELA = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Number_Not_Present = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Advanced_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Not_Present = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Advanced_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Number_Enrolled_Science = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Number_Not_Present = col_integer()
## # ... with 240 more columns
## )
## See spec(...) for full column specifications.
## Warning: 2578 parsing failures.
## row col expected actual
## 1 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 0
## 1 NA 551 columns 549 columns
## 2 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 0
## 2 NA 551 columns 549 columns
## 3 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 2
## ... .............................................................. ........... ...........
## See problems(...) for more details.
## Parsed with column specification:
## cols(
## .default = col_character(),
## TOTAL_POPULATION_Number_Enrolled_ELA = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## TOTAL_POPULATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_SCIENCE_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_Number_Enrolled_ELA = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Number_Not_Present = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Advanced_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Not_Present = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Scale_Score_Mean = col_integer(),
## SPECIAL_EDUCATION_Number_Enrolled_ELA = col_integer(),
## SPECIAL_EDUCATION_LANGUAGE_ARTS_Advanced_Proficient_Percentage = col_integer(),
## SPECIAL_EDUCATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## SPECIAL_EDUCATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## SPECIAL_EDUCATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## SPECIAL_EDUCATION_SCIENCE_Scale_Score_Mean = col_integer(),
## LIMITED_ENGLISH_PROFICIENT_current_and_former_Number_Enrolled_ELA = col_integer()
## # ... with 147 more columns
## )
## See spec(...) for full column specifications.
## Warning: 3010 parsing failures.
## row col expected actual
## 1 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 0
## 1 NA 551 columns 549 columns
## 2 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 0
## 2 NA 551 columns 549 columns
## 3 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 2
## ... .............................................................. ........... ...........
## See problems(...) for more details.
## Error: NA column indexes not supported
knitr::knit('06_njask-data-3.rmd', tangle=TRUE)
## [1] "06_njask-data-3.R"
source('06_njask-data-3.R')
## Parsed with column specification:
## cols(
## .default = col_character(),
## DFG = col_integer(),
## Special_Needs = col_integer(),
## TOTAL_POPULATION_Number_Enrolled_LAL = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Partially_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Advanced_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_of_Valid_Scale_Scores = col_integer(),
## TOTAL_POPULATION_SCIENCE_Number_of_Valid_Scale_Scores = col_integer(),
## TOTAL_POPULATION_SCIENCE_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_SCIENCE_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_Number_Enrolled_LAL = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_of_Valid_Scale_Scores = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Number_of_Valid_Scale_Scores = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Partially_Proficient_Percentage = col_integer()
## # ... with 202 more columns
## )
## See spec(...) for full column specifications.
## Warning: 1484 parsing failures.
## row col expected actual
## 1 SCIENCE_SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_Number_of_Valid_Scale_Scores 6 chars 0
## 1 NA 559 columns 555 columns
## 2 SCIENCE_SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_Number_of_Valid_Scale_Scores 6 chars 0
## 2 NA 559 columns 555 columns
## 3 SCIENCE_SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_Number_of_Valid_Scale_Scores 6 chars 0
## ... .......................................................................... ........... ...........
## See problems(...) for more details.
## Parsed with column specification:
## cols(
## .default = col_character(),
## DFG = col_integer(),
## Special_Needs = col_integer(),
## TOTAL_POPULATION_Number_Enrolled_LAL = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Partially_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_of_Valid_Scale_Scores = col_integer(),
## TOTAL_POPULATION_SCIENCE_Number_of_Valid_Scale_Scores = col_integer(),
## TOTAL_POPULATION_SCIENCE_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_SCIENCE_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_Number_Enrolled_LAL = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Advanced_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_of_Valid_Scale_Scores = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Number_of_Valid_Scale_Scores = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Partially_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Proficient_Percentage = col_integer()
## # ... with 189 more columns
## )
## See spec(...) for full column specifications.
## Parsed with column specification:
## cols(
## .default = col_character(),
## `Special_Needs_(Abbott)_district_flag` = col_integer(),
## MALE_SCIENCE_Scale_Score_Mean = col_integer(),
## MIGRANT_MATHEMATICS_Number_of_Valid_Scale_Scores = col_integer(),
## MIGRANT_SCIENCE_Number_of_Valid_Scale_Scores = col_integer(),
## PACIFIC_ISLANDER_LANGUAGE_ARTS_Proficient_Percentage = col_integer(),
## `NON-ECONOMICALLY_DISADVANTAGED_Non-Econ_SCIENCE_Scale_Score_Mean` = col_integer()
## )
## See spec(...) for full column specifications.
## Warning: 2448 parsing failures.
## row col expected actual
## 1 NON-ECONOMICALLY_DISADVANTAGED_Non-Econ_SCIENCE_Scale_Score_Mean 4 chars 1
## 1 NA 486 columns 484 columns
## 2 NON-ECONOMICALLY_DISADVANTAGED_Non-Econ_SCIENCE_Scale_Score_Mean 4 chars 1
## 2 NA 486 columns 484 columns
## 3 NON-ECONOMICALLY_DISADVANTAGED_Non-Econ_SCIENCE_Scale_Score_Mean 4 chars 1
## ... ................................................................ ........... ...........
## See problems(...) for more details.
## Parsed with column specification:
## cols(
## .default = col_character(),
## `Special_Needs_(Abbott)_district_flag` = col_integer(),
## MALE_SCIENCE_Scale_Score_Mean = col_integer(),
## MIGRANT_MATHEMATICS_Number_of_Valid_Scale_Scores = col_integer(),
## MIGRANT_SCIENCE_Number_of_Valid_Scale_Scores = col_integer(),
## PACIFIC_ISLANDER_LANGUAGE_ARTS_Proficient_Percentage = col_integer(),
## `NON-ECONOMICALLY_DISADVANTAGED_Non-Econ_SCIENCE_Scale_Score_Mean` = col_integer()
## )
## See spec(...) for full column specifications.
## Warning: 2448 parsing failures.
## row col expected actual
## 1 NON-ECONOMICALLY_DISADVANTAGED_Non-Econ_SCIENCE_Scale_Score_Mean 4 chars 1
## 1 NA 486 columns 484 columns
## 2 NON-ECONOMICALLY_DISADVANTAGED_Non-Econ_SCIENCE_Scale_Score_Mean 4 chars 1
## 2 NA 486 columns 484 columns
## 3 NON-ECONOMICALLY_DISADVANTAGED_Non-Econ_SCIENCE_Scale_Score_Mean 4 chars 1
## ... ................................................................ ........... ...........
## See problems(...) for more details.
This function will test if a year/grade call is valid.
valid_call <- function(year, grade) {
  # data for the 2015 school year doesn't exist yet;
  # the common core transition started in 2015 (NJASK is no more)
  if (year > 2014) {
    is_valid <- FALSE
  # assessment coverage is grades 3:8 (plus 11) from 2006 on.
  # NJASK fully implemented in 2008
  } else if (year >= 2006) {
    is_valid <- grade %in% c(3:8, 11)
  } else if (year >= 2004) {
    is_valid <- grade %in% c(3, 4, 8, 11)
  } else if (year < 2004) {
    is_valid <- FALSE
  }
  return(is_valid)
}
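The same rules can also be written as data rather than as branching logic. The sketch below is an alternative for illustration, not the function used in the rest of this post: it builds a lookup table of valid year/grade pairs with expand.grid() and checks membership. The names valid_combos and valid_call_tbl are made up here.

```r
# Hypothetical table-driven alternative to valid_call().
# Encodes the same rules: 2004-2005 cover grades 3, 4, 8 and 11;
# 2006-2014 cover grades 3-8 and 11; nothing else is valid.
valid_combos <- rbind(
  expand.grid(year = 2004:2005, grade = c(3, 4, 8, 11)),
  expand.grid(year = 2006:2014, grade = c(3:8, 11))
)

valid_call_tbl <- function(year, grade) {
  # build a "year grade" key and test whether it appears in the table
  paste(year, grade) %in% paste(valid_combos$year, valid_combos$grade)
}

valid_call_tbl(2014, 6)   # TRUE
valid_call_tbl(2005, 5)   # FALSE
valid_call_tbl(2015, 3)   # FALSE
```

A side benefit of this shape is that the table doubles as the list of every combination worth fetching.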
map for retrieval
This function does normal retrieval (NJASK for 3-8; HSPA for 11).
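standard_assess <- function(year, grade) {
  if (grade %in% c(3:8)) {
    assess_data <- fetch_njask(year, grade)
  } else if (grade == 11) {
    assess_data <- fetch_hspa(year)
  } else {
    # without this branch, assess_data would be undefined below
    stop("standard_assess() only handles grades 3-8 and 11.")
  }
  return(assess_data)
}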
Here is a mapping function that calls the correct retrieval method, given grade and year.
fetch_nj_assess <- function(year, grade) {
  require(ensurer)
  # only allow valid calls
  valid_call(year, grade) %>%
    ensure_that(
      all(.) ~ "invalid grade/year parameter passed")
  # everything from 2008 on has the same grade coverage
  if (year >= 2008) {
    assess_data <- standard_assess(year, grade)
  # 2006 and 2007: NJASK 3rd-7th, GEPA 8th, HSPA 11th
  } else if (year %in% c(2006, 2007)) {
    if (grade %in% c(3:7)) {
      assess_data <- standard_assess(year, grade)
    } else if (grade == 8) {
      assess_data <- fetch_gepa(year)
    } else if (grade == 11) {
      assess_data <- fetch_hspa(year)
    }
  # 2004 and 2005: NJASK 3rd & 4th, GEPA 8th, HSPA 11th
  } else if (year %in% c(2004, 2005)) {
    if (grade %in% c(3:4)) {
      assess_data <- standard_assess(year, grade)
    } else if (grade == 8) {
      assess_data <- fetch_gepa(year)
    } else if (grade == 11) {
      assess_data <- fetch_hspa(year)
    }
  } else {
    # if we ever reach this block, there's a problem with our `valid_call()` function
    stop("unable to match your grade/year parameters to the appropriate function.")
  }
  return(assess_data)
}
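Because fetch_nj_assess() downloads data, its routing logic is hard to check in isolation. One option, sketched here under a hypothetical helper name which_assessment(), is to separate "which test applies?" from "go get it". The function below only returns a label. It follows the coverage rules stated in the comments of fetch_nj_assess() and assumes the call has already passed valid_call().

```r
# Hypothetical helper: name the assessment for a valid year/grade pair.
# Rules taken from the comments in fetch_nj_assess():
#   grade 11               -> HSPA (all years)
#   grade 8, before 2008   -> GEPA
#   everything else        -> NJASK
which_assessment <- function(year, grade) {
  if (grade == 11) {
    "HSPA"
  } else if (grade == 8 && year < 2008) {
    "GEPA"
  } else {
    "NJASK"
  }
}

which_assessment(2014, 6)    # "NJASK"
which_assessment(2007, 8)    # "GEPA"
which_assessment(2008, 8)    # "NJASK"
which_assessment(2005, 11)   # "HSPA"
```

With the decision isolated like this, the mapping can be checked quickly for every valid combination without touching the network.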
try it out:
fetch_nj_assess(2014, 6) %>% select(CDS_Code:TOTAL_POPULATION_LANGUAGE_ARTS_Scale_Score_Mean) %>% head()
## Loading required package: ensurer
## Parsed with column specification:
## cols(
## .default = col_character(),
## TOTAL_POPULATION_Number_Enrolled_ELA = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## TOTAL_POPULATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_SCIENCE_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_Number_Enrolled_ELA = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Number_Not_Present = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Advanced_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Not_Present = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Scale_Score_Mean = col_integer(),
## SPECIAL_EDUCATION_Number_Enrolled_ELA = col_integer(),
## SPECIAL_EDUCATION_LANGUAGE_ARTS_Advanced_Proficient_Percentage = col_integer(),
## SPECIAL_EDUCATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## SPECIAL_EDUCATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## SPECIAL_EDUCATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## SPECIAL_EDUCATION_SCIENCE_Scale_Score_Mean = col_integer(),
## LIMITED_ENGLISH_PROFICIENT_current_and_former_Number_Enrolled_ELA = col_integer()
## # ... with 147 more columns
## )
## See spec(...) for full column specifications.
## Warning: 3010 parsing failures.
## row col expected actual
## 1 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 0
## 1 NA 551 columns 549 columns
## 2 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 0
## 2 NA 551 columns 549 columns
## 3 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 2
## ... .............................................................. ........... ...........
## See problems(...) for more details.
## Error: NA column indexes not supported
all together
Finally, as a convenience, let's write a function that brings down all of the NJ assessment data (NJASK, GEPA, and HSPA) for every valid year and grade.
fetch_all_nj <- function() {
  # make the df of years and grades to iterate over
  post2006_years <- c(2006:2014)
  post2006_grades <- c(3:8, 11)
  pre2006_years <- c(2004, 2005)
  pre2006_grades <- c(3, 4, 8, 11)
  # subset just for testing
  # post2006_years <- c(2006)
  # post2006_grades <- c(8, 11)
  # pre2006_grades <- c(4)
  df <- data.frame(
    year = vector(mode = "numeric", length = 0),
    grade = vector(mode = "numeric", length = 0)
  )
  for (i in post2006_years) {
    # use R recycling to make df
    int_df <- data.frame(
      year = i,
      grade = post2006_grades
    )
    df <- rbind(df, int_df)
  }
  for (j in pre2006_years) {
    # use R recycling to make df
    int_df <- data.frame(
      year = j,
      grade = pre2006_grades
    )
    df <- rbind(df, int_df)
  }
  # sort the df
  df <- df %>% dplyr::arrange(
    desc(year), grade
  )
  # to hold the results
  results <- list()
  # iterate over the df and get the data
  for (i in seq_len(nrow(df))) {
    this_row <- df[i, ]
    # be verbose
    row_key <- paste0('nj', this_row$year, 'gr', this_row$grade)
    print(row_key)
    # call this grade/year and attach to results list
    results[[row_key]] <- fetch_nj_assess(this_row$year, this_row$grade)
  }
  return(results)
}
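If any single call errors, the loop above stops and nothing is returned. A minimal sketch of one way around that, assuming it's acceptable to record the failure and move on, is to wrap each call in tryCatch(). The function name fetch_all_safely() and its fetch_fun argument are illustrative, not part of the earlier posts. In practice fetch_fun would be fetch_nj_assess; the demo uses a stub so it runs without a network connection.

```r
# Hypothetical wrapper: fetch every year/grade pair, carrying on after errors.
# `combos` is a data frame with `year` and `grade` columns;
# `fetch_fun` is any function(year, grade) that returns a data frame.
fetch_all_safely <- function(combos, fetch_fun) {
  keys <- paste0("nj", combos$year, "gr", combos$grade)
  results <- lapply(seq_len(nrow(combos)), function(i) {
    tryCatch(
      fetch_fun(combos$year[i], combos$grade[i]),
      # on failure, keep the error object instead of aborting the run
      error = function(e) e
    )
  })
  names(results) <- keys
  results
}

# Stub standing in for fetch_nj_assess(); fails for one combination.
stub_fetch <- function(year, grade) {
  if (year == 2014 && grade == 6) stop("simulated download failure")
  data.frame(year = year, grade = grade)
}

combos <- expand.grid(year = 2013:2014, grade = c(6, 11))
out <- fetch_all_safely(combos, stub_fetch)

# which calls failed?
failed <- vapply(out, inherits, logical(1), what = "error")
names(out)[failed]   # "nj2014gr6"
```

Storing the error object rather than NULL keeps the list the same length as the input and preserves the message for later inspection.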
test it:
all_nj <- fetch_all_nj()
## [1] "nj2014gr3"
## Parsed with column specification:
## cols(
## .default = col_character(),
## TOTAL_POPULATION_Number_Enrolled_ELA = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Number_Not_Present = col_integer(),
## TOTAL_POPULATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## TOTAL_POPULATION_MATHEMATICS_Number_Not_Present = col_integer(),
## TOTAL_POPULATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## TOTAL_POPULATION_SCIENCE_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_Number_Enrolled_ELA = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Number_Not_Present = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Advanced_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Enrolled_Math = col_integer(),
## GENERAL_EDUCATION_MATHEMATICS_Number_Not_Present = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Advanced_Proficient_Percentage = col_integer(),
## GENERAL_EDUCATION_SCIENCE_Scale_Score_Mean = col_integer(),
## SPECIAL_EDUCATION_Number_Enrolled_ELA = col_integer(),
## SPECIAL_EDUCATION_LANGUAGE_ARTS_Number_Not_Present = col_integer(),
## SPECIAL_EDUCATION_LANGUAGE_ARTS_Advanced_Proficient_Percentage = col_integer(),
## SPECIAL_EDUCATION_LANGUAGE_ARTS_Scale_Score_Mean = col_integer(),
## SPECIAL_EDUCATION_MATHEMATICS_Number_Enrolled_Math = col_integer()
## # ... with 156 more columns
## )
## See spec(...) for full column specifications.
## Warning: 3924 parsing failures.
## row col expected actual
## 1 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 0
## 1 NA 551 columns 549 columns
## 2 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 0
## 2 NA 551 columns 549 columns
## 3 SPECIAL_EDUCATION_WITH_ACCOMMODATIONS_SCIENCE_Scale_Score_Mean 4 chars 2
## ... .............................................................. ........... ...........
## See problems(...) for more details.
## Error: NA column indexes not supported
length(all_nj)
## Error in eval(expr, envir, enclos): object 'all_nj' not found