Data Sources and Methodology

Overview

This vignette describes the data sources and processing methods used to prepare the data in the edfinr package. Understanding these details will help you interpret the data appropriately and make informed analytical decisions.

Full data processing methods and scripts are available on GitHub via bellwetherorg/edfinr_data_cleaning.

Data Sources

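This package provides access to education finance data and supporting district-level data from the following sources, each described in detail below:

  • NCES F-33 Survey.
  • NCES CCD Directory.
  • Census Bureau SAIPE poverty estimates.
  • American Community Survey (ACS) 5-Year Estimates.
  • Bureau of Labor Statistics Consumer Price Index (CPI-U).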

Data Processing Methods

Data Processing Detail

NCES F-33 Survey Data

Data source: NCES Common Core of Data text files of F-33 data from 2011-12 through 2021-22.

Raw variables selected:

  • Basic information: state, leaid, name, yrdata, V33.
  • Revenue data: totalrev, tlocrev, tstrev, tfedrev.
  • Expenditure data: c11, u11, v91, v92, c24, l12, m12, d11, q11.
  • Current expenditure data: ce1, ce2, and ce3.
  • Detailed expenditure data: z32, z34, v93, v95, v02, k14, e13, z33, v10, e17, v11, v12, e07, v13, v14, e08, v15, v16, e09, v17, v18, v40, v21, v22, v45, v23, v24, v90, v37, v38, e11, v29, v30, v60, v32, v65, ae1, ae2, ae3, ae4, ae5, ae6, ae7, ae8.

Adjustments:

  • Rename variables.
  • Convert district names to title case.
  • Ensure enrollment is a numeric variable.
  • Replace -1 and -2 codes with NA values.
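
As a rough illustration of the adjustments listed above, the sketch below cleans a single record in plain Python. It is not the package's processing code: the function name `clean_f33_row`, the choice of which fields to keep as text, and the sample record (placeholder values, not real data) are all assumptions made for the example. The renaming step is left out because the target names are not listed here.

```python
# Hypothetical sketch of the F-33 adjustments; not the package's actual code.
MISSING_CODES = {-1, -2}  # codes the text above says are replaced with NA

# Assumption: these fields are identifiers and should stay as text.
TEXT_FIELDS = ("state", "leaid", "yrdata")


def clean_f33_row(row):
    """Return a cleaned copy of one raw F-33 record (a dict of strings)."""
    cleaned = {}
    for key, value in row.items():
        if key == "name":
            # Convert district names to title case.
            cleaned[key] = value.title()
        elif key in TEXT_FIELDS:
            # Keep identifiers as text so leading zeros survive.
            cleaned[key] = value
        else:
            # Coerce numeric fields (including enrollment, v33) and
            # replace the -1 and -2 codes with None, Python's NA analogue.
            number = float(value)
            cleaned[key] = None if number in MISSING_CODES else number
    return cleaned


# Placeholder record for illustration only.
raw = {
    "state": "01",
    "leaid": "0100000",
    "name": "EXAMPLE CITY SCHOOL DISTRICT",
    "yrdata": "22",
    "v33": "5800",
    "totalrev": "-1",
    "tlocrev": "-2",
}
print(clean_f33_row(raw))
```

Keeping identifiers as text is a deliberate choice in this sketch: converting `leaid` to a number would silently drop leading zeros and break later joins.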

CCD Directory Data

Data source: NCES CCD Directory data obtained via the educationdata package.

Raw variables selected:

  • Core district identifiers and location: state, ncesid, county, dist_name, state_leaid.
  • Institutional details: lea_type, lea_type_id, urbanicity, congressional_dist.

Adjustments:

  • Rename variables to more intuitive names.

SAIPE Poverty Estimates

Data source: Census Bureau SAIPE Estimates.

Raw variables selected:

  • Basic geographic and demographic fields: State Postal Code, State FIPS Code, District ID, Name.
  • Population estimates: Estimated Total Population, Estimated Population ages 5-17, and the estimated number of relevant children ages 5 to 17 living in poverty.

Adjustments:

  • Convert population fields to numeric.
  • Construct a combined NCES district identifier by concatenating state FIPS and District ID.
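
The identifier construction above can be sketched as follows. The helper names `build_ncesid` and `to_number` are hypothetical, and the zero-padding widths (2 characters for state FIPS, 5 for District ID, giving a 7-character ID) are an assumption about the raw file layout rather than something stated in this vignette.

```python
def build_ncesid(state_fips, district_id):
    """Concatenate state FIPS and District ID into one identifier.

    Assumption: FIPS is padded to 2 characters and District ID to 5.
    """
    fips = str(state_fips).strip().zfill(2)
    district = str(district_id).strip().zfill(5)
    return fips + district


def to_number(text):
    """Convert a population field to int, tolerating thousands separators."""
    return int(str(text).replace(",", "").strip())


# Placeholder values for illustration only.
print(build_ncesid("1", "190"))
print(to_number("21,568"))
```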

ACS 5-Year Estimates

Data source: American Community Survey 5-Year Estimates accessed via the tidycensus package.

Raw variables selected:

  • Economic indicators: Median household income (B19013_001) and median property value (B25077_001).
  • Educational attainment: Total population 25 years or older (B15003_001) and subsets of that population holding bachelor’s degrees (B15003_022), master’s degrees (B15003_023), professional degrees (B15003_024), and doctoral degrees (B15003_025).
  • Data are pulled for different geographic breakdowns (unified, elementary, and secondary school districts).

Adjustments:

  • Reshape data from long to wide format.
  • Rename "GEOID" to a standard ncesid and ensure proper formatting of district identifiers.
  • Convert estimates to numeric as needed.
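
A minimal sketch of the long-to-wide reshape, assuming input rows shaped like the long output of tidycensus (one row per district and variable, with `GEOID`, `variable`, and `estimate` fields). The function name `pivot_wider_rows`, the 7-character zero-padding of the identifier, and the sample values are assumptions for illustration, not the package's code or real estimates.

```python
from collections import defaultdict


def pivot_wider_rows(long_rows):
    """Reshape long rows (one per GEOID and variable) to one row per district."""
    wide = defaultdict(dict)
    for row in long_rows:
        # Assumption: district identifiers are 7 characters, zero-padded.
        ncesid = str(row["GEOID"]).zfill(7)
        # Convert each estimate to numeric while spreading it into a column.
        wide[ncesid][row["variable"]] = float(row["estimate"])
    return [{"ncesid": ncesid, **values} for ncesid, values in wide.items()]


# Placeholder rows for illustration only.
long_rows = [
    {"GEOID": "100190", "variable": "B19013_001", "estimate": "50000"},
    {"GEOID": "100190", "variable": "B25077_001", "estimate": "150000"},
    {"GEOID": "0100210", "variable": "B19013_001", "estimate": "62000"},
]
for record in pivot_wider_rows(long_rows):
    print(record)
```

A district that lacks an estimate for some variable simply has no key for it in this sketch; a fuller version would fill those cells with an explicit missing value.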

CPI

Data source: U.S. Bureau of Labor Statistics, specifically the Consumer Price Index for All Urban Consumers (CPI-U).

Raw variables selected:

  • CPI time series data (specific variable names as provided in the raw file).

Adjustments:

  • Calculate a school-year CPI value by averaging the second half of one calendar year with the first half of the following year, which aligns the index with the academic calendar. The 2011-12 school year serves as the baseline year.
  • Clean and reformat CPI data for consistency across processing scripts.
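
The school-year averaging can be illustrated with a short sketch. It assumes the inputs are half-year CPI averages keyed by `(calendar_year, half)`; the function names, that input layout, and the CPI values are placeholders chosen for easy arithmetic, not BLS figures or the package's code. How the resulting index is applied to dollar amounts is not covered here.

```python
def school_year_cpi(half_year_cpi, start_year):
    """Average CPI for the school year that begins in the fall of start_year."""
    second_half = half_year_cpi[(start_year, 2)]     # July-December of start_year
    first_half = half_year_cpi[(start_year + 1, 1)]  # January-June of the next year
    return (second_half + first_half) / 2


def cpi_index(half_year_cpi, start_year, baseline_start_year=2011):
    """School-year CPI relative to the baseline school year (2011-12)."""
    baseline = school_year_cpi(half_year_cpi, baseline_start_year)
    return school_year_cpi(half_year_cpi, start_year) / baseline


# Placeholder values for illustration only; not actual CPI-U figures.
cpi = {
    (2011, 2): 100.0,
    (2012, 1): 102.0,
    (2012, 2): 104.0,
    (2013, 1): 106.0,
}
print(school_year_cpi(cpi, 2011))  # 2011-12 school year
print(cpi_index(cpi, 2012))        # 2012-13 relative to 2011-12
```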

Joining Data

Revenue Adjustments

Additional transformations are applied after the join:

  • Capital expenditures and debt service (C11) are subtracted from state revenues.
  • Property sales (U11) are subtracted from local revenues.
  • For Texas local education agencies (LEAs) in school year 2012-13 and earlier, payments to state governments (L12) are subtracted from local revenues.
  • Payments to other school systems (V91, V92, and Q11) are proportionally subtracted from local, state, and federal revenues.
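
One possible reading of the revenue adjustments above is sketched below. Several details are assumptions, since the text does not specify them: the order of operations, that "proportionally" means each source's share of the already-adjusted revenue total, that `year` is the ending year of the school year (so 2012-13 is 2013), and that Texas is coded `"TX"`. The function name and sample values are hypothetical.

```python
def adjust_revenues(lea):
    """Apply the post-join revenue adjustments to one LEA record (a dict)."""
    local = lea["tlocrev"] - lea["u11"]   # remove property sales
    state = lea["tstrev"] - lea["c11"]    # remove capital and debt service
    federal = lea["tfedrev"]

    # Assumption: year is the ending year of the school year.
    if lea["state"] == "TX" and lea["year"] <= 2013:
        local -= lea["l12"]               # payments to state governments

    # Payments to other school systems, split by each source's share.
    transfers = lea["v91"] + lea["v92"] + lea["q11"]
    total = local + state + federal
    if total > 0:
        # Compute all three deductions before changing any revenue value.
        local_cut = transfers * local / total
        state_cut = transfers * state / total
        federal_cut = transfers * federal / total
        local -= local_cut
        state -= state_cut
        federal -= federal_cut

    return {"local": local, "state": state, "federal": federal}


# Placeholder record for illustration only.
example = {
    "state": "TX", "year": 2013,
    "tlocrev": 520, "tstrev": 430, "tfedrev": 100,
    "u11": 20, "c11": 30, "l12": 50,
    "v91": 10, "v92": 5, "q11": 4,
}
print(adjust_revenues(example))
```

Computing the three deductions before applying any of them keeps the shares consistent; updating `local` first would change the denominator's meaning for the state and federal shares.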

Exclusions

Data Notes and Cautions

Users should note the following when working with the edfinr datasets:
