Exploring California Public School District Data

California publishes a wealth of data on its public school districts, such as the data available from the California Department of Education. I explored this data, focusing on employee wages and Academic Performance Index (API) scores.

Here are the top ten school districts in California by total wages paid to school district employees. Los Angeles Unified pays $4.8 billion in total to its employees.
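As a rough sketch of how a ranking like this could be computed (this is not the analysis code itself), the snippet below sums wages per district and keeps the largest totals. The records, district names, and the function `top_districts_by_total_wages` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (district, employee wage in dollars).
# These numbers are invented for illustration, not taken from the real data.
records = [
    ("District A", 52_000),
    ("District A", 48_000),
    ("District B", 75_000),
    ("District C", 30_000),
    ("District C", 31_000),
    ("District C", 45_000),
]


def top_districts_by_total_wages(rows, n=10):
    """Return the n districts with the largest total wages, largest first."""
    totals = defaultdict(int)
    for district, wage in rows:
        totals[district] += wage
    # Sort by total wages, descending, and keep the first n entries.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


top = top_districts_by_total_wages(records, n=2)
for district, total in top:
    print(f"{district}: ${total:,}")
```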

The box plot illustrates the distribution of wages: the box represents the interquartile range (25th percentile to 75th percentile), and the line in the middle is the median. The whiskers and the dots show the extremes of the distribution. For example, the maximum pay in Los Angeles Unified is close to $500k. Ignoring those extremes, though, the range seems reasonable, with the bulk of wages falling under $100k and the median around $50k.
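To make the box plot ingredients concrete, here is a minimal sketch that computes the quartiles, the interquartile range, and the points that would be drawn as dots. It assumes the common convention of flagging values more than 1.5 times the interquartile range beyond the box as outliers; I am not claiming the plot above used exactly that rule. The wage values and the function `box_plot_stats` are hypothetical.

```python
from statistics import quantiles

# Hypothetical wages in dollars for one district (invented for illustration).
wages = [30_000, 40_000, 45_000, 50_000, 55_000, 60_000, 70_000, 90_000, 480_000]


def box_plot_stats(values, whisker=1.5):
    """Compute the summary numbers behind a box plot.

    Assumes outliers are values more than `whisker` * IQR outside the box.
    """
    # n=4 yields the three cut points: 25th, 50th, and 75th percentiles.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


stats = box_plot_stats(wages)
for name, value in stats.items():
    print(name, value)
```

With sample data like this, a single very high salary shows up as an outlier dot while the box itself stays well below $100k, which is the pattern described above.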

Do these wages make a difference? I wanted to see the correlation between wages paid per student (total wages paid / total number of students) and the API score of the district. The following scatter plot with regression line shows essentially no correlation between wages and API scores.
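For readers who want to try the same kind of comparison, here is a small sketch of the normalization and the correlation. It divides total wages by enrollment, then computes the Pearson correlation coefficient and a least-squares regression line by hand. The district figures and the function `pearson_and_fit` are assumptions made up for this example, so the resulting numbers say nothing about the real data.

```python
from math import sqrt

# Hypothetical districts: (total wages in dollars, enrolled students, API score).
# All values are invented for illustration.
districts = [
    (10_000_000, 2_000, 800),
    (24_000_000, 4_000, 760),
    (7_000_000, 1_000, 790),
    (16_000_000, 2_000, 770),
]


def pearson_and_fit(xs, ys):
    """Return (r, slope, intercept) for the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Normalize wages by student body: total wages paid / total number of students.
wages_per_student = [total / students for total, students, _ in districts]
api_scores = [score for _, _, score in districts]

r, slope, intercept = pearson_and_fit(wages_per_student, api_scores)
print(f"r = {r:.3f}, slope = {slope:.5f}, intercept = {intercept:.1f}")
```

A value of r near zero corresponds to a nearly flat regression line, which is what "no correlation" looks like in a scatter plot.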

One variable I found does have a very strong correlation with API score: the education level of the parents. On this scale, 1 means high school dropout and 5 means graduate-level education. So, would-be parents out there, start sharpening your pencils and get an advanced degree, because your education is the best predictor of your child's education.

The code for this analysis can be found on our GitHub page.