## Identifying the Over and Under Achieving Michigan Schools

The Michigan Department of Education (MDE) ranks schools based on test scores and then reports the percentile each school is in.  This tells us the absolute performance, but leaves out context.  For example, the percentage of students getting free lunches greatly impacts performance.  A school with a lot of free lunch students that beats expectations could be adding more value than a school with high absolute performance.  I’ve created a map that shows under and over performing schools relative to what their location, demographics, and socio-economic status would predict.  The model flags the top 10% and the bottom 10%.

Indicates a school that over performed on one or more of math, reading, science

Indicates a school that under performed on one or more of math, reading, science.

Be sure to click on the icon to get the details about the school.

The icky math details are below.

One way of looking for instances of over or under performing is to look at the distribution, average, and standard deviation of the results.  The MEAP test scores in the following chart follow a normal distribution (aka bell curve).
The average is 527.  The black lines represent two standard deviations above and below the average.  95% of the schools in Michigan have an average MEAP Math score between 511 and 544.  The schools outside that range are out or under performing the 95% of Michigan schools that fall inside it.  The problem with this model is that it doesn’t take anything other than performance into account.  A college town could be outperforming not because its schools are effective, but because many of its students are children of professors.
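The two standard deviation check described above can be sketched in a few lines of Python. This is only an illustration: the school names and scores are invented, and the function names are my own, not part of the original analysis (which was done in R).

```python
from statistics import mean, stdev

def two_sigma_band(scores):
    """Return (low, high): the mean minus/plus two sample standard deviations."""
    center = mean(scores)
    spread = stdev(scores)
    return center - 2 * spread, center + 2 * spread

def flag_outliers(school_scores):
    """Label each school 'over', 'under', or 'typical' relative to the band."""
    low, high = two_sigma_band(list(school_scores.values()))
    labels = {}
    for name, score in school_scores.items():
        if score > high:
            labels[name] = "over"
        elif score < low:
            labels[name] = "under"
        else:
            labels[name] = "typical"
    return labels

# Hypothetical average scores, invented for illustration only.
example_scores = {
    "School A": 527, "School B": 530, "School C": 524,
    "School D": 528, "School E": 526, "School F": 529,
    "School G": 525, "School H": 527, "School I": 560,
}
labels = flag_outliers(example_scores)
```

Note that this check only looks at the scores themselves, which is exactly the weakness described above: nothing about the students a school serves enters the calculation.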

A regression model allows us to take a variable, such as percentage of free lunch students, and make a prediction about each school based on its data.  In the following chart, the small circles represent the schools’ actual average MEAP math scores and the line represents the regression model’s prediction.

The chart makes clear that in this one variable regression model most of the results are either higher or lower than the prediction.  The actual output from the R statistical package’s regression function (lm) is:

| Variable | Coefficient | Std Error | t Value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 548.836 | 0.5716 | 960.21 | <2e-16 *** |
| Free Lunch Ratio | −44.15 | 1.0296 | −42.88 | <2e-16 *** |

Residual standard error: 10.72 on 1473 degrees of freedom
Multiple R-squared:  0.5552,    Adjusted R-squared:  0.5549
F-statistic:  1839 on 1 and 1473 DF,  p-value: < 2.2e-16

The coefficient column holds the numbers that turn into a formula for predicting the Math score based on the free lunch ratio.  The −44.15 coefficient tells us that a school with 100% free lunch students is predicted to score 44.15 points below a school with no free lunch students.  Another point to understand in the output is the Adjusted R-squared of 0.5549.  That means that the model accounts for about 55% of the variation in Math scores.  The other 45% is left to variables not in the model, like effective teaching.
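For readers who want to see where an intercept, a slope, and an R-squared come from, here is a minimal least-squares sketch in plain Python. The output above came from R’s `lm`; this is not that code, and the five data points are invented so that the fitted line lands close to the real one.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared: share of the variation in y that the line accounts for.
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_resid / ss_total
    return intercept, slope, r_squared

# Hypothetical (free lunch ratio, average math score) pairs, for illustration.
ratios = [0.1, 0.3, 0.5, 0.7, 0.9]
scores = [545.0, 534.0, 528.0, 520.0, 508.0]
intercept, slope, r_squared = fit_line(ratios, scores)
```

With these made-up points the slope comes out at about −44 and the intercept at about 549. The R-squared is much higher than the real 0.555 because five tidy points are far less noisy than 1,475 actual schools.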

The formal algebraic definition of the model is:

$\hat{y} = \hat{\beta}_{0} + \hat{\beta}_{1}x_{1}$

Where:

$\hat{y}$ is the estimate of the school’s average MEAP score.

$\hat{\beta}_{0}$ is the intercept.  In the regression output above this number is 548.836.

$\hat{\beta}_{1}$ is the coefficient for the free lunch ratio.  In the regression output above it is −44.15.

$x_{1}$ is the school’s ratio of free lunch students.

Assuming a school has 80% free lunch students (a ratio of 0.8), the formula looks like:

$\hat{y} = 548.836 + (-44.15)(0.8) \approx 513.5$

In the chart above, if the school’s actual MEAP average was above the estimate, it would have a circle above the line and vice versa.  While the formula outputs a specific answer, the standard error should be included to provide a range.  In this case, the range would be between 500 and 527.
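The point estimate and its range can be reproduced with a short Python sketch. One assumption needs flagging: the post does not say how the range was computed. Using the residual standard error (10.72) with a normal 80% band, which leaves 10% in each tail and matches the top and bottom 10% the map flags, reproduces the 500 to 527 range, so that is what the sketch uses.

```python
from statistics import NormalDist

INTERCEPT = 548.836        # (Intercept) from the regression output
FREE_LUNCH_COEF = -44.15   # Free Lunch Ratio coefficient
RESIDUAL_SE = 10.72        # residual standard error from the output

# Assumption (not stated in the post): an 80% band, 10% in each tail.
Z_80 = NormalDist().inv_cdf(0.90)   # roughly 1.28

def predict_math(free_lunch_ratio):
    """Predicted average MEAP math score for a given free lunch ratio."""
    return INTERCEPT + FREE_LUNCH_COEF * free_lunch_ratio

def expected_range(estimate, residual_se=RESIDUAL_SE, z=Z_80):
    """Band around the estimate: estimate -/+ z * residual standard error."""
    margin = z * residual_se
    return estimate - margin, estimate + margin

estimate = predict_math(0.8)
low, high = expected_range(estimate)
print(f"{estimate:.1f} ({low:.0f} to {high:.0f})")  # prints 513.5 (500 to 527)
```

A proper prediction interval from R’s `predict` would be slightly wider, since it also accounts for uncertainty in the coefficients; with nearly 1,500 schools the difference is small.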

By including more variables in the regression model, we can improve the accuracy of the output.

| Variable | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 523.265 | 6.69923 | 78.108 | <2E-16 *** |
| Free Lunch Ratio | −34.595 | 1.33537 | −25.907 | <2E-16 *** |
| Reduced Price Ratio | −16.108 | 5.67084 | −2.84 | 0.00457 ** |
| Asian Ratio | 48.0848 | 4.23635 | 11.351 | <2E-16 *** |
| African American Ratio | −9.3022 | 1.39094 | −6.688 | 3.2E−11 *** |
| For Profit Charter | −1.1188 | 0.83357 | −1.342 | 0.17973 |
| Not For Profit Charter | 3.84668 | 1.95715 | 1.965 | 0.04955 * |
| City:Midsize | 2.69329 | 1.43163 | 1.881 | 0.06013 . |
| City:Small | 3.96461 | 1.31282 | 3.02 | 0.00257 ** |
| Rural:Distant | 0.80392 | 1.58978 | 0.506 | 0.61316 |
| Rural:Fringe | 2.26883 | 1.524 | 1.489 | 0.13677 |
| Rural:Remote | 2.31241 | 1.888 | 1.225 | 0.22085 |
| Suburb:Large | 3.12571 | 1.28801 | 2.427 | 0.01535 * |
| Suburb:Midsize | 2.05215 | 1.94642 | 1.054 | 0.29191 |
| Suburb:Small | 0.08622 | 1.80301 | 0.048 | 0.96186 |
| Town:Distant | 1.02831 | 1.82438 | 0.564 | 0.57308 |
| Town:Fringe | 0.36864 | 2.3092 | 0.16 | 0.87319 |
| Town:Remote | 0.85657 | 1.94081 | 0.441 | 0.65903 |
| Ratio of MEAP Science Takers | 20.4892 | 6.65813 | 3.077 | 0.00213 ** |
| College Town | 4.77183 | 0.92805 | 5.142 | 3.1E−07 *** |

Note:  A school was counted as being in a college town if it was within 5 km of a state university (e.g., Central Michigan University).

Residual standard error: 9.769 on 1455 degrees of freedom
Multiple R-squared:  0.6351,    Adjusted R-squared:  0.6304
F-statistic: 133.3 on 19 and 1455 DF,  p-value: < 2.2e-16

The above model was shaped from a larger set of variables using R’s step function.  The first thing to notice is that the adjusted R-squared has improved from .55 to .63, meaning that the larger model captures 63% of the variation as opposed to the 55% of the single variable model.  This will allow for better predictions.  Another point to notice is that the coefficient for the ratio of free lunch students has shrunk in magnitude (from −44.15 to −34.595) with the addition of other explanatory variables.
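The adjusted R-squared values can be checked from the other numbers in the two outputs. The sketch below uses the standard adjustment formula; the school count of 1,475 is inferred from the residual degrees of freedom (1473 + 1 predictor + 1 intercept) rather than stated in the post. As a side note, R’s `step` picks variables by AIC by default, not by adjusted R-squared, so this is a check on the reported figures and not a reimplementation of the selection.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Inferred, not stated in the post: residual df + predictors + intercept.
N_SCHOOLS = 1473 + 1 + 1

single_variable = adjusted_r_squared(0.5552, N_SCHOOLS, 1)   # about 0.5549
larger_model = adjusted_r_squared(0.6351, N_SCHOOLS, 19)     # about 0.630
```

The penalty is what keeps the comparison fair: plain R-squared can only go up as variables are added, while the adjusted version rises only if the new variables earn their keep.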

For an example, let’s use the following school data:

| Variable | Value |
|---|---|
| Free Lunch Ratio | 0.89286 |
| Reduced Price Ratio | 0 |
| Asian Students Ratio | 0.0119 |
| African American Ratio | 0.72619 |
| MEAP Science Takers Ratio | 0.95238 |
| type | Not College |
| locale | City:Small |
| school | For Profit Charter |

The example data produces an estimated MEAP math score of 508.6, with an upper bound of 521 and a lower bound of 496.  The school’s actual average MEAP Math (2013) score is 497.  Since that is above the lower bound of the model, for the purposes of this post, the school would be said to meet expectations.
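The example can be worked through by hand: multiply each coefficient by the school’s value and add them up. The Python sketch below does that with the coefficients from the table above, listing only the ones that apply to this school. The dictionary keys and function names are mine, and the 80% band (10% in each tail) is an assumption that happens to reproduce the 496 and 521 bounds.

```python
from statistics import NormalDist

# Coefficients copied from the regression output (only those that apply here).
COEFFICIENTS = {
    "intercept": 523.265,
    "free_lunch_ratio": -34.595,
    "reduced_price_ratio": -16.108,
    "asian_ratio": 48.0848,
    "african_american_ratio": -9.3022,
    "for_profit_charter": -1.1188,
    "locale_city_small": 3.96461,
    "science_takers_ratio": 20.4892,
    "college_town": 4.77183,
}
RESIDUAL_SE = 9.769  # residual standard error of the larger model

# The example school; categories become 1 (applies) or 0 (does not).
school = {
    "intercept": 1.0,
    "free_lunch_ratio": 0.89286,
    "reduced_price_ratio": 0.0,
    "asian_ratio": 0.0119,
    "african_american_ratio": 0.72619,
    "for_profit_charter": 1.0,
    "locale_city_small": 1.0,
    "science_takers_ratio": 0.95238,
    "college_town": 0.0,
}

def predict(features):
    """Sum of coefficient * value over the school's features."""
    return sum(COEFFICIENTS[name] * value for name, value in features.items())

def classify(actual, estimate, residual_se, tail=0.10):
    """Compare the actual score with an assumed band of (1 - 2 * tail) width."""
    margin = NormalDist().inv_cdf(1 - tail) * residual_se
    low, high = estimate - margin, estimate + margin
    if actual > high:
        return "over performing", low, high
    if actual < low:
        return "under performing", low, high
    return "meets expectations", low, high

estimate = predict(school)
label, low, high = classify(497, estimate, RESIDUAL_SE)
```

Here `estimate` comes out at about 508.6, and the actual score of 497 sits just inside the lower edge of the band, which is why the school lands in the middle group.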

The following graph charts actual vs predicted average MEAP Math scores.

With this diagram it is possible to see schools that are performing much worse or much better than the prediction.  For example, the upper left most school had a predicted value of about 525 and an actual value of about 485.
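What the eye picks out in the diagram is the residual: actual score minus predicted score. A small Python sketch of ranking schools that way follows. The first row uses the approximate values read off the chart above; the other two schools are invented for contrast.

```python
def rank_by_residual(schools):
    """Return (name, residual) pairs, most under-performing first.

    Each input row is (name, predicted, actual); residual = actual - predicted.
    """
    gaps = [(name, actual - predicted) for name, predicted, actual in schools]
    return sorted(gaps, key=lambda pair: pair[1])

schools = [
    ("Chart example", 525.0, 485.0),  # approximate values read off the chart
    ("School X", 510.0, 512.0),       # hypothetical
    ("School Y", 530.0, 548.0),       # hypothetical
]
ranked = rank_by_residual(schools)
worst, best = ranked[0], ranked[-1]
```

A residual of −40 is about four times the model’s residual standard error, which is why that school stands out so clearly in the chart.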

| Variable | Coefficient | Std Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 554.441 | 4.36671 | 126.97 | <2E-16 *** |
| # of 5th Graders | −0.0078 | 0.00346 | −2.26 | 0.02398 * |
| Free Lunch Ratio | −29.427 | 1.07034 | −27.493 | <2E-16 *** |
| Reduced Price Ratio | −7.4848 | 3.7588 | −1.991 | 0.04664 * |
| Native Am. Ratio | −17.479 | 4.88014 | −3.582 | 0.00035 *** |
| African American Ratio | −23.921 | 2.51283 | −9.52 | <2E-16 *** |
| White Ratio | −16.093 | 2.45714 | −6.549 | 8E−11 *** |
| Hispanic Ratio | −19.978 | 2.84602 | −7.02 | 3.4E−12 *** |
| Female Ratio | 10.1187 | 2.35092 | 4.304 | 1.8E−05 *** |
| For Profit Charter | −1.0801 | 0.55915 | −1.932 | 0.05359 . |
| Not for Profit Charter | 2.97008 | 1.29856 | 2.287 | 0.02233 * |
| Bachelor Degree Ratio | 12.923 | 4.15019 | 3.114 | 0.00188 ** |
| localeCity:Midsize | 2.28848 | 0.99259 | 2.306 | 0.02128 * |
| localeCity:Small | 2.94806 | 0.90178 | 3.269 | 0.0011 ** |
| localeRural:Distant | 3.62574 | 1.08462 | 3.343 | 0.00085 *** |
| localeRural:Fringe | 2.90248 | 1.03398 | 2.807 | 0.00507 ** |
| localeRural:Remote | 5.32231 | 1.31004 | 4.063 | 5.1E−05 *** |
| localeSuburb:Large | 2.36062 | 0.88678 | 2.662 | 0.00785 ** |
| localeSuburb:Midsize | 3.68453 | 1.3099 | 2.813 | 0.00498 ** |
| localeSuburb:Small | 2.56385 | 1.20122 | 2.134 | 0.03298 * |
| localeTown:Distant | 4.57153 | 1.22524 | 3.731 | 0.0002 *** |
| localeTown:Fringe | 3.22099 | 1.5523 | 2.075 | 0.03816 * |
| localeTown:Remote | 4.73134 | 1.33169 | 3.553 | 0.00039 *** |
| Math Taker Ratio | 5.58488 | 3.41062 | 1.637 | 0.10174 |
| College Town | 1.61416 | 0.61767 | 2.613 | 0.00906 ** |

Residual standard error: 6.447 on 1450 degrees of freedom
Multiple R-squared:  0.7412,    Adjusted R-squared:  0.737
F-statistic: 173.1 on 24 and 1450 DF,  p-value: < 2.2e-16

### MEAP Science

| Variable | Coefficient | Std Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 539.593 | 5.57255 | 96.831 | <2E-16 *** |
| # of 5th Graders | −0.0103 | 0.00376 | −2.732 | 0.00638 ** |
| Free Lunch Ratio | −28.746 | 1.04997 | −27.378 | <2E-16 *** |
| Native American Ratio | −15.146 | 5.15437 | −2.938 | 0.00335 ** |
| African American Ratio | −30.101 | 2.66717 | −11.286 | <2E-16 *** |
| White Ratio | −15.418 | 2.54702 | −6.053 | 1.8E−09 *** |
| Hispanic Ratio | −21.475 | 3.11505 | −6.894 | 8.0E−12 *** |
| Female Ratio | 6.26657 | 2.63182 | 2.381 | 0.01739 * |
| For Profit Charter | −1.2335 | 0.62003 | −1.99 | 0.04683 * |
| Not for Profit Charter | 6.24743 | 1.44764 | 4.316 | 1.7E−05 *** |
| Ratio of Reading Takers | 32.1586 | 4.76553 | 6.748 | 2.2E−11 *** |
| Ratio of Science Takers | −17.44 | 6.53255 | −2.67 | 0.00767 ** |
| College Town | 2.22155 | 0.66502 | 3.341 | 0.00086 *** |

Residual standard error: 7.252 on 1462 degrees of freedom
Multiple R-squared:  0.7376,    Adjusted R-squared:  0.7354
F-statistic: 342.4 on 12 and 1462 DF,  p-value: < 2.2e-16

The school at the far right, where the model predicts a score of 510 but the school delivered a score of 600, is Martin Luther King, Jr. Education Center Academy, which has a fine arts and technology focus.  The school is a not for profit charter authorized by Detroit Public Schools.

### MME Math

| Variable | Coefficient | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 1073.98 | 3.72834 | 288.058 | <2E-16 *** |
| Size of School | 0.00776 | 0.00086 | 8.975 | <2E-16 *** |
| Asian Ratio | 61.981 | 13.2901 | 4.664 | 3.6E−06 *** |
| African American Ratio | −14.629 | 2.0402 | −7.17 | 1.7E−12 *** |
| Female Ratio | 44.0469 | 4.96361 | 8.874 | <2E-16 *** |
| localeCity:Midsize | 3.98957 | 2.48769 | 1.604 | 0.10916 |
| localeCity:Small | 2.43685 | 2.30079 | 1.059 | 0.28985 |
| localeRural:Distant | 8.77303 | 2.53842 | 3.456 | 0.00058 *** |
| localeRural:Fringe | 8.4771 | 2.4487 | 3.462 | 0.00056 *** |
| localeRural:Remote | 12.1052 | 2.71233 | 4.463 | 9.2E−06 *** |
| localeSuburb:Large | 2.16915 | 2.10979 | 1.028 | 0.30419 |
| localeSuburb:Midsize | 2.46964 | 2.92114 | 0.845 | 0.39812 |
| localeSuburb:Small | 8.58269 | 3.2536 | 2.638 | 0.0085 ** |
| localeTown:Distant | 5.73102 | 2.73797 | 2.093 | 0.03665 * |
| localeTown:Fringe | 6.1321 | 3.13205 | 1.958 | 0.05059 . |
| localeTown:Remote | 8.34679 | 2.84976 | 2.929 | 0.0035 ** |
| Free Lunch Ratio | −37.221 | 2.22288 | −16.744 | <2E-16 *** |
| Reduced Price Ratio | 16.8552 | 7.6838 | 2.194 | 0.02855 * |
| Math Takers Ratio | 0.65569 | 0.12969 | 5.056 | 5.3E−07 *** |
| College Town | 3.95644 | 1.42811 | 2.77 | 0.00573 ** |

Residual standard error: 10.4 on 808 degrees of freedom
Multiple R-squared:  0.7077,    Adjusted R-squared:  0.7008
F-statistic:   103 on 19 and 808 DF,  p-value: < 2.2e-16

| Variable | Coefficient | Std Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 1091.75 | 3.26469 | 334.411 | <2E-16 *** |
| Size of School | 0.00816 | 0.00178 | 4.573 | 5.6E−06 *** |
| Asian Ratio | 43.622 | 11.689 | 3.732 | 0.0002 *** |
| African American Ratio | −5.0073 | 1.77509 | −2.821 | 0.00491 ** |
| Female Ratio | 38.1605 | 4.32416 | 8.825 | <2E-16 *** |
| localeCity:Midsize | 4.37229 | 2.1875 | 1.999 | 0.04597 * |
| localeCity:Small | 2.09615 | 2.00479 | 1.046 | 0.29607 |
| localeRural:Distant | 7.39967 | 2.2129 | 3.344 | 0.00086 *** |
| localeRural:Fringe | 6.44298 | 2.13373 | 3.02 | 0.00261 ** |
| localeRural:Remote | 9.40315 | 2.36797 | 3.971 | 7.8E−05 *** |
| localeSuburb:Large | 2.07382 | 1.83594 | 1.13 | 0.25899 |
| localeSuburb:Midsize | 1.97515 | 2.54214 | 0.777 | 0.43741 |
| localeSuburb:Small | 7.96792 | 2.83124 | 2.814 | 0.00501 ** |
| localeTown:Distant | 4.76523 | 2.38237 | 2 | 0.04581 * |
| localeTown:Fringe | 5.45857 | 2.7256 | 2.003 | 0.04554 * |
| localeTown:Remote | 7.67696 | 2.48055 | 3.095 | 0.00204 ** |
| # of 11th Graders | −0.0117 | 0.00745 | −1.573 | 0.11609 |
| Free Lunch Ratio | −38.372 | 1.95897 | −19.588 | <2E-16 *** |
| Reduced Price Ratio | 14.8411 | 6.70322 | 2.214 | 0.02711 * |
| MME Reading Takers Ratio | 0.53293 | 0.11307 | 4.713 | 2.9E−06 ** |

Residual standard error: 9.045 on 807 degrees of freedom
Multiple R-squared:  0.6821,    Adjusted R-squared:  0.6742
F-statistic: 86.58 on 20 and 807 DF,  p-value: < 2.2e-16

### MME Science

| Variable | Coefficient | Std. Error | t Value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 1061.41 | 4.56015 | 232.758 | <2E-16 *** |
| Size of School | 0.01367 | 0.00245 | 5.574 | 3.4E−08 *** |
| Asian Ratio | 70.7227 | 15.9399 | 4.437 | 1.0E−05 *** |
| White Ratio | 11.9754 | 2.48344 | 4.822 | 1.7E−06 *** |
| Hispanic Ratio | 19.083 | 5.15876 | 3.699 | 0.00023 *** |
| Female Ratio | 55.0491 | 5.93665 | 9.273 | <2E-16 *** |
| localeCity:Midsize | 1.14377 | 2.97853 | 0.384 | 0.70108 |
| localeCity:Small | 2.18534 | 2.72054 | 0.803 | 0.42205 |
| localeRural:Distant | 12.299 | 3.02354 | 4.068 | 5.2E−05 *** |
| localeRural:Fringe | 10.3105 | 2.90957 | 3.544 | 0.00042 *** |
| localeRural:Remote | 17.1508 | 3.20763 | 5.347 | 1.2E−07 *** |
| localeSuburb:Large | 1.9264 | 2.50386 | 0.769 | 0.4419 |
| localeSuburb:Midsize | 4.26122 | 3.46655 | 1.229 | 0.21934 |
| localeSuburb:Small | 11.8973 | 3.87674 | 3.069 | 0.00222 ** |
| localeTown:Distant | 7.55268 | 3.23907 | 2.332 | 0.01996 * |
| localeTown:Fringe | 7.96267 | 3.72716 | 2.136 | 0.03295 * |
| localeTown:Remote | 11.8308 | 3.36197 | 3.519 | 0.00046 *** |
| # of 11th Graders | −0.0163 | 0.01023 | −1.596 | 0.11087 |
| Free Lunch Ratio | −46.604 | 2.77734 | −16.78 | <2E-16 *** |
| Reduced Lunch Ratio | 29.4086 | 9.1703 | 3.207 | 0.00139 ** |
| MME Science Takers Ratio | 0.69039 | 0.15516 | 4.45 | 9.8E−06 *** |
| College Town | 3.55201 | 1.71701 | 2.069 | 0.03889 * |

Residual standard error: 12.41 on 806 degrees of freedom
Multiple R-squared:  0.7027,    Adjusted R-squared:  0.6949
F-statistic: 90.71 on 21 and 806 DF,  p-value: < 2.2e-16