Identifying the Over and Under Achieving Michigan Schools

The Michigan Department of Education (MDE) ranks schools based on test scores and then reports the percentile each school falls in.  This tells us the absolute performance, but leaves out context.  For example, the percentage of students getting free lunches greatly affects performance.  A school with a lot of free lunch students that beats expectations could be adding more value than a school with high absolute performance.  I’ve created a map that shows under and over performing schools based on location, demographics, and socio-economic status.  The model flags the top 10% and the bottom 10%.

Over Performing Schools: indicates a school that over performed on one or more of math, reading, or science.

 

Under Performing Schools: indicates a school that under performed on one or more of math, reading, or science.

 

Be sure to click on the icon to get the details about the school.

The icky math details are below.


One way of looking for instances of over or under performing is to look at the distribution, average, and standard deviation of the results.  The MEAP test scores in the following chart follow a normal distribution (aka bell curve).
Histogram of Schools’ Math MEAP Average Scores

The average is 527.  The black lines represent two standard deviations above and below the average.  About 95% of the schools in Michigan have an average MEAP Math score between 511 and 544.  The schools beyond either line fall outside the range that holds 95% of Michigan’s schools, so they are out or under performing it.  The problem with this model is that it doesn’t take anything other than performance into account.  A college town could be outperforming not because its schools are effective, but because many of its students are children of professors.
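To make the two-standard-deviation screen concrete, here is a minimal Python sketch.  The school names and scores are made up for illustration (they are not MDE data), and the helper names `two_sd_band` and `flag_outside_band` are my own.

```python
from statistics import mean, pstdev

def two_sd_band(scores):
    """Return (lower, upper): the mean minus/plus two standard deviations."""
    avg = mean(scores)
    sd = pstdev(scores)
    return avg - 2 * sd, avg + 2 * sd

def flag_outside_band(scores_by_school):
    """Label schools above the band as 'over' and below it as 'under'."""
    lower, upper = two_sd_band(list(scores_by_school.values()))
    flags = {}
    for name, score in scores_by_school.items():
        if score > upper:
            flags[name] = "over"
        elif score < lower:
            flags[name] = "under"
    return flags

# Hypothetical average scores for ten schools (illustration only).
toy_scores = {
    "A": 525, "B": 526, "C": 527, "D": 528, "E": 529,
    "F": 526, "G": 528, "H": 527, "I": 527, "J": 560,
}
flags = flag_outside_band(toy_scores)
print(flags)
```

Note that this screen looks only at raw scores, which is exactly the weakness described above.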

A regression model allows us to take a variable, such as percentage of free lunch students, and make a prediction about the schools based on the data for each school.  In the following chart, the small circles represent actual schools’ average MEAP math scores and the line represents the regression model’s prediction.

Regression Output for Math ~ Free Lunch Students

The chart makes clear that in this one variable regression model most of the results are either higher or lower than the prediction.  The actual output from the R statistical package’s regression function (lm) is:

Coefficient Std Error t Value Pr(>|t|)
(Intercept) 548.836 0.5716 960.21 <2e-16 ***
Free Lunch Ratio −44.15 1.0296 −42.88 <2e-16 ***

Residual standard error: 10.72 on 1473 degrees of freedom
Multiple R-squared:  0.5552,    Adjusted R-squared:  0.5549
F-statistic:  1839 on 1 and 1473 DF,  p-value: < 2.2e-16

The coefficient column holds the numbers that we will turn into a formula for predicting the Math score based on the free lunch ratio.  The -44.15 coefficient tells us that a school with 100% free lunch students is predicted to score 44.15 points below a school with no free lunch students.  Another point to understand in the output is the Adjusted R-squared of 0.5549.  That means that the model accounts for about 55% of the variation in Math scores.  The other 45% is left to variables not in the model, like effective teaching, and to random variation.
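For readers who want to see where an intercept, a slope, and an R-squared come from, here is a minimal pure-Python sketch of a one-variable least squares fit.  It is not the R code used for this post; the five data points are invented so that the slope lands near the -44 reported above, and `fit_simple_ols` is a hypothetical helper name.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; also return R-squared."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared: share of the variation in y that the fitted line explains.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Invented data: free lunch ratio vs average math score (illustration only).
free_lunch_ratio = [0.0, 0.25, 0.5, 0.75, 1.0]
math_score = [550, 537, 528, 515, 506]

intercept, slope, r_squared = fit_simple_ols(free_lunch_ratio, math_score)
print(round(intercept, 1), round(slope, 1), round(r_squared, 3))
```

The slope plays the role of the Free Lunch Ratio coefficient: it is the predicted change in score when the ratio moves from 0 to 1.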

The formal algebraic definition of the model is:

\hat{y} = \hat{\beta}_{0} + \hat{\beta}_{1}x_{1}

Where:

\hat{y} is the estimate of the school’s average MEAP score

\hat{\beta}_{0} is the intercept.  In the regression output above this number is 548.836.

\hat{\beta}_{1} is the coefficient for the free lunch ratio.  In the regression output above it is -44.15.

x_{1} is the school’s ratio of free lunch students

Assuming a school has 80% free lunch students (a ratio of .8), the formula looks like:

\hat{y} = 548.836 + (-44.15)(0.8) \approx 513.5

In the chart above, a school whose actual MEAP average is above the estimate has its circle above the line, and vice versa.  While the formula outputs a single number, the standard error should be used to provide a range around it.  In this case, the range would be between 500 and 527.
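The post does not spell out how the range is built, so treat the following as an inference rather than the stated method.  The quoted bounds are reproduced by taking the estimate plus or minus about 1.28 times the residual standard error of 10.72.  A multiplier of 1.28 is the normal cut-off that leaves 10% in each tail, which would match the top 10% and bottom 10% mentioned at the start.

```latex
\hat{y} = 548.836 + (-44.15)(0.8) \approx 513.5
\hat{y} \pm 1.28\,s \approx 513.5 \pm 1.28 \times 10.72 \approx 513.5 \pm 13.7
\Rightarrow [499.8,\ 527.2] \approx [500,\ 527]
```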

By including more variables in the regression model, we can improve the accuracy of the output.

Estimate Std. Error t value Pr(>|t|)
(Intercept) 523.265 6.69923 78.108 <2E-16 ***
Free Lunch Ratio −34.595 1.33537 −25.907 <2E-16 ***
Reduced Price Ratio −16.108 5.67084 −2.84 0.00457 **
Asian Ratio 48.0848 4.23635 11.351 <2E-16 ***
African American Ratio −9.3022 1.39094 −6.688 3.2E−11 ***
For Profit Charter −1.1188 0.83357 −1.342 0.17973
Not For Profit Charter 3.84668 1.95715 1.965 0.04955 *
City:Midsize 2.69329 1.43163 1.881 0.06013 .
City:Small 3.96461 1.31282 3.02 0.00257 **
Rural:Distant 0.80392 1.58978 0.506 0.61316
Rural:Fringe 2.26883 1.524 1.489 0.13677
Rural:Remote 2.31241 1.888 1.225 0.22085
Suburb:Large 3.12571 1.28801 2.427 0.01535 *
Suburb:Midsize 2.05215 1.94642 1.054 0.29191
Suburb:Small 0.08622 1.80301 0.048 0.96186
Town:Distant 1.02831 1.82438 0.564 0.57308
Town:Fringe 0.36864 2.3092 0.16 0.87319
Town:Remote 0.85657 1.94081 0.441 0.65903
Ratio of MEAP Science Takers 20.4892 6.65813 3.077 0.00213 **
College Town 4.77183 0.92805 5.142 3.1E−07 ***

Note:  A school was counted as being in a college town if it was within 5 km of a state university (e.g., Central Michigan University).

Residual standard error: 9.769 on 1455 degrees of freedom
Multiple R-squared:  0.6351,    Adjusted R-squared:  0.6304
F-statistic: 133.3 on 19 and 1455 DF,  p-value: < 2.2e-16
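The College Town flag in the note above depends on a distance check.  The post does not say how the 5 km test was computed, so this is only one plausible sketch using the haversine great-circle formula.  The coordinates are placeholders rather than real school or campus locations, and `is_college_town` is a hypothetical helper.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def is_college_town(school, campuses, radius_km=5.0):
    """True if the school lies within radius_km of any campus."""
    return any(haversine_km(*school, *campus) <= radius_km for campus in campuses)

# Placeholder coordinates (latitude, longitude), for illustration only.
campuses = [(43.0, -84.0)]
near_school = (43.03, -84.0)  # roughly 3.3 km north of the campus
far_school = (43.1, -84.0)    # roughly 11 km north of the campus

print(is_college_town(near_school, campuses), is_college_town(far_school, campuses))
```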

The multi-variable model above was shaped from a larger set of variables using R’s step function.  The first thing to notice is that the adjusted R-squared has improved from .55 to .63, meaning that the larger model captures 63% of the variation as opposed to the 55% of the single variable model.  This should allow for better predictions.  Another point to notice is that the coefficient for the ratio of free lunch students has shrunk in magnitude (from -44.15 to about -34.6) with the addition of other explanatory variables.
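The adjusted R-squared values quoted here can be checked from the multiple R-squared, the number of schools, and the number of predictors.  The sketch below uses the standard adjustment formula.  The school count of 1475 is not stated in the post; I am deriving it as residual degrees of freedom plus predictors plus one (1473 + 1 + 1, and 1455 + 19 + 1).

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalise R-squared for the number of predictors in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Inputs are taken from the two regression outputs above.
single = adjusted_r_squared(0.5552, n_obs=1475, n_predictors=1)
multi = adjusted_r_squared(0.6351, n_obs=1475, n_predictors=19)
print(round(single, 4), round(multi, 4))
```

Because the adjustment penalises every extra variable, the rise from .55 to .63 is not just a side effect of adding 18 more terms.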

 

For an example, let’s use the following school data:

Free Lunch Ratio 0.89286
Reduced Price Ratio 0
Asian Students Ratio 0.0119
African American Ratio 0.72619
MEAP Science Takers Ratio 0.95238
type Not College
locale City:Small
school For Profit Charter

The example data produces an estimated MEAP math score of 508.6, with an upper bound of 521 and a lower bound of 496.  The school’s actual average MEAP Math (2013) score is 497.  Since that is above the model’s lower bound, for the purposes of this post the school is said to meet expectations.
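The estimate for this school can be reproduced by multiplying each coefficient from the math model by the school’s value and adding the intercept.  The sketch below does that in Python.  The bounds are an inference on my part: a band of about 1.28 residual standard errors (the 10% normal tail cut-off) happens to reproduce 496 and 521, but the post does not state that this is how they were computed.  Categorical terms are coded as 1 when they apply, and locale or charter terms that do not apply to this school are left out.

```python
from statistics import NormalDist

# Coefficients copied from the multi-variable MEAP Math output above.
COEFFICIENTS = {
    "(Intercept)": 523.265,
    "Free Lunch Ratio": -34.595,
    "Reduced Price Ratio": -16.108,
    "Asian Ratio": 48.0848,
    "African American Ratio": -9.3022,
    "For Profit Charter": -1.1188,
    "City:Small": 3.96461,
    "Ratio of MEAP Science Takers": 20.4892,
    "College Town": 4.77183,
}
RESIDUAL_SE = 9.769

school = {
    "Free Lunch Ratio": 0.89286,
    "Reduced Price Ratio": 0.0,
    "Asian Ratio": 0.0119,
    "African American Ratio": 0.72619,
    "For Profit Charter": 1.0,  # dummy variable: applies
    "City:Small": 1.0,          # dummy variable: applies
    "Ratio of MEAP Science Takers": 0.95238,
    "College Town": 0.0,        # dummy variable: does not apply
}

def predict(coefficients, features):
    """Intercept plus the sum of coefficient * value over all features."""
    return coefficients["(Intercept)"] + sum(
        coefficients[name] * value for name, value in features.items()
    )

def classify(actual, lower, upper):
    """Compare an actual score with the expected band."""
    if actual > upper:
        return "over performing"
    if actual < lower:
        return "under performing"
    return "meets expectations"

predicted = predict(COEFFICIENTS, school)
# Assumed band: leave 10% in each tail of a normal distribution.
margin = NormalDist().inv_cdf(0.90) * RESIDUAL_SE
lower, upper = predicted - margin, predicted + margin
print(round(predicted, 1), round(lower), round(upper), classify(497, lower, upper))
```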

The following graph charts actual vs predicted average MEAP Math scores.

Actual vs Predicted MEAP Math Scores

With this diagram it is possible to see schools that are performing much worse or much better than the prediction.  For example, the upper left most school had a predicted value of about 525 and an actual value of about 485.
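To put that gap in perspective, it helps to express it in units of the math model’s residual standard error (9.769).  The two scores are the approximate values read off the chart, so the result is only a rough figure.

```latex
e = y - \hat{y} \approx 485 - 525 = -40
\frac{e}{s} \approx \frac{-40}{9.769} \approx -4.1
```

A shortfall of roughly four residual standard errors is far beyond what the model would treat as ordinary scatter.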

 

MEAP Reading

Coefficient Std Error t value Pr(>|t|)
(Intercept) 554.441 4.36671 126.97 <2E-16 ***
# of 5th Graders −0.0078 0.00346 −2.26 0.02398 *
Free Lunch Ratio −29.427 1.07034 −27.493 <2E-16 ***
Reduced Price Ratio −7.4848 3.7588 −1.991 0.04664 *
Native Am. Ratio −17.479 4.88014 −3.582 0.00035 ***
African American Ratio −23.921 2.51283 −9.52 <2E-16 ***
White Ratio −16.093 2.45714 −6.549 8E−11 ***
Hispanic Ratio −19.978 2.84602 −7.02 3.4E−12 ***
Female Ratio 10.1187 2.35092 4.304 1.8E−05 ***
For Profit Charter −1.0801 0.55915 −1.932 0.05359 .
Not for Profit Charter 2.97008 1.29856 2.287 0.02233 *
Bachelor Degree Ratio 12.923 4.15019 3.114 0.00188 **
localeCity:Midsize 2.28848 0.99259 2.306 0.02128 *
localeCity:Small 2.94806 0.90178 3.269 0.0011 **
localeRural:Distant 3.62574 1.08462 3.343 0.00085 ***
localeRural:Fringe 2.90248 1.03398 2.807 0.00507 **
localeRural:Remote 5.32231 1.31004 4.063 5.1E−05 ***
localeSuburb:Large 2.36062 0.88678 2.662 0.00785 **
localeSuburb:Midsize 3.68453 1.3099 2.813 0.00498 **
localeSuburb:Small 2.56385 1.20122 2.134 0.03298 *
localeTown:Distant 4.57153 1.22524 3.731 0.0002 ***
localeTown:Fringe 3.22099 1.5523 2.075 0.03816 *
localeTown:Remote 4.73134 1.33169 3.553 0.00039 ***
Math Taker Ratio 5.58488 3.41062 1.637 0.10174
College Town 1.61416 0.61767 2.613 0.00906 **

Residual standard error: 6.447 on 1450 degrees of freedom
Multiple R-squared:  0.7412,    Adjusted R-squared:  0.737
F-statistic: 173.1 on 24 and 1450 DF,  p-value: < 2.2e-16

MEAP Reading Average Scores Predicted vs Actual

 MEAP Science

Coefficient Std Error t value Pr(>|t|)
(Intercept) 539.593 5.57255 96.831 <2E-16 ***
# of 5th Graders −0.0103 0.00376 −2.732 0.00638 **
Free Lunch Ratio −28.746 1.04997 −27.378 <2E-16 ***
Native American Ratio −15.146 5.15437 −2.938 0.00335 **
African American Ratio −30.101 2.66717 −11.286 <2E-16 ***
White Ratio −15.418 2.54702 −6.053 1.8E−09 ***
Hispanic Ratio −21.475 3.11505 −6.894 8.0E−12 ***
Female Ratio 6.26657 2.63182 2.381 0.01739 *
For Profit Charter −1.2335 0.62003 −1.99 0.04683 *
Not for Profit Charter 6.24743 1.44764 4.316 1.7E−05 ***
Ratio of Reading Takers 32.1586 4.76553 6.748 2.2E−11 ***
Ratio of Science Takers −17.44 6.53255 −2.67 0.00767 **
College Town 2.22155 0.66502 3.341 0.00086 ***

Residual standard error: 7.252 on 1462 degrees of freedom
Multiple R-squared:  0.7376,    Adjusted R-squared:  0.7354
F-statistic: 342.4 on 12 and 1462 DF,  p-value: < 2.2e-16

Average MEAP Scores Predicted vs Actual

The school at the far right, where the model predicts a score of 510 and the school delivered a score of 600, is Martin Luther King, Jr. Education Center Academy, which has a fine arts and technology focus.  The school is a not-for-profit charter authorized by Detroit Public Schools.

 MME Math

Coefficient Std. Error t value Pr(>|t|)
(Intercept) 1073.98 3.72834 288.058 <2E-16 ***
Size of School 0.00776 0.00086 8.975 <2E-16 ***
Asian Ratio 61.981 13.2901 4.664 3.6E−06 ***
African American Ratio −14.629 2.0402 −7.17 1.7E−12 ***
Female Ratio 44.0469 4.96361 8.874 <2E-16 ***
localeCity:Midsize 3.98957 2.48769 1.604 0.10916
localeCity:Small 2.43685 2.30079 1.059 0.28985
localeRural:Distant 8.77303 2.53842 3.456 0.00058 ***
localeRural:Fringe 8.4771 2.4487 3.462 0.00056 ***
localeRural:Remote 12.1052 2.71233 4.463 9.2E−06 ***
localeSuburb:Large 2.16915 2.10979 1.028 0.30419
localeSuburb:Midsize 2.46964 2.92114 0.845 0.39812
localeSuburb:Small 8.58269 3.2536 2.638 0.0085 **
localeTown:Distant 5.73102 2.73797 2.093 0.03665 *
localeTown:Fringe 6.1321 3.13205 1.958 0.05059 .
localeTown:Remote 8.34679 2.84976 2.929 0.0035 **
Free Lunch Ratio −37.221 2.22288 −16.744 <2E-16 ***
Reduced Price Ratio 16.8552 7.6838 2.194 0.02855 *
Math Takers Ratio 0.65569 0.12969 5.056 5.3E−07 ***
College Town 3.95644 1.42811 2.77 0.00573 **

Residual standard error: 10.4 on 808 degrees of freedom
Multiple R-squared:  0.7077,    Adjusted R-squared:  0.7008
F-statistic:   103 on 19 and 808 DF,  p-value: < 2.2e-16

MME Math Actual vs Predicted

MME Reading

Coefficient Std Error t value Pr(>|t|)
(Intercept) 1091.75 3.26469 334.411 <2E-16 ***
Size of School 0.00816 0.00178 4.573 5.6E−06 ***
Asian Ratio 43.622 11.689 3.732 0.0002 ***
African American Ratio −5.0073 1.77509 −2.821 0.00491 **
Female Ratio 38.1605 4.32416 8.825 <2E-16 ***
localeCity:Midsize 4.37229 2.1875 1.999 0.04597 *
localeCity:Small 2.09615 2.00479 1.046 0.29607
localeRural:Distant 7.39967 2.2129 3.344 0.00086 ***
localeRural:Fringe 6.44298 2.13373 3.02 0.00261 **
localeRural:Remote 9.40315 2.36797 3.971 7.8E−05 ***
localeSuburb:Large 2.07382 1.83594 1.13 0.25899
localeSuburb:Midsize 1.97515 2.54214 0.777 0.43741
localeSuburb:Small 7.96792 2.83124 2.814 0.00501 **
localeTown:Distant 4.76523 2.38237 2 0.04581 *
localeTown:Fringe 5.45857 2.7256 2.003 0.04554 *
localeTown:Remote 7.67696 2.48055 3.095 0.00204 **
# of 11th Graders −0.0117 0.00745 −1.573 0.11609
Free Lunch Ratio −38.372 1.95897 −19.588 <2E-16 ***
Reduced Price Ratio 14.8411 6.70322 2.214 0.02711 *
MME Reading Takers Ratio 0.53293 0.11307 4.713 2.9E−06 ***

Residual standard error: 9.045 on 807 degrees of freedom
Multiple R-squared:  0.6821,    Adjusted R-squared:  0.6742
F-statistic: 86.58 on 20 and 807 DF,  p-value: < 2.2e-16
MME Reading Actual vs Predicted

MME Science

Coefficient Std. Error t Value Pr(>|t|)
(Intercept) 1061.41 4.56015 232.758 <2E-16 ***
Size of School 0.01367 0.00245 5.574 3.4E−08 ***
Asian Ratio 70.7227 15.9399 4.437 1.0E−05 ***
White Ratio 11.9754 2.48344 4.822 1.7E−06 ***
Hispanic Ratio 19.083 5.15876 3.699 0.00023 ***
Female Ratio 55.0491 5.93665 9.273 <2E-16 ***
localeCity:Midsize 1.14377 2.97853 0.384 0.70108
localeCity:Small 2.18534 2.72054 0.803 0.42205
localeRural:Distant 12.299 3.02354 4.068 5.2E−05 ***
localeRural:Fringe 10.3105 2.90957 3.544 0.00042 ***
localeRural:Remote 17.1508 3.20763 5.347 1.2E−07 ***
localeSuburb:Large 1.9264 2.50386 0.769 0.4419
localeSuburb:Midsize 4.26122 3.46655 1.229 0.21934
localeSuburb:Small 11.8973 3.87674 3.069 0.00222 **
localeTown:Distant 7.55268 3.23907 2.332 0.01996 *
localeTown:Fringe 7.96267 3.72716 2.136 0.03295 *
localeTown:Remote 11.8308 3.36197 3.519 0.00046 ***
# of 11th Graders −0.0163 0.01023 −1.596 0.11087
Free Lunch Ratio −46.604 2.77734 −16.78 <2E-16 ***
Reduced Price Ratio 29.4086 9.1703 3.207 0.00139 **
MME Science Takers Ratio 0.69039 0.15516 4.45 9.8E−06 ***
College Town 3.55201 1.71701 2.069 0.03889 *

Residual standard error: 12.41 on 806 degrees of freedom
Multiple R-squared:  0.7027,    Adjusted R-squared:  0.6949
F-statistic: 90.71 on 21 and 806 DF,  p-value: < 2.2e-16
MME Science Actual vs Predicted

 

 

 
