Welcome to College Value

Where's This Come From?

All of the data you see here comes from the Department of Education's wonderful College Scorecard dataset. Specifically, it's mashed up from two huge CSV files: Most-Recent-Cohorts-Institution.csv and Most-Recent-Cohorts-Field-of-Study.csv. This data covers only the subset of students who receive federal student loans or grants, so it's by no means a complete picture. There are also significant gaps in the data where costs or earnings are unknown or, as listed in the data, "PrivacySuppressed".
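As a rough illustration of the mash-up, here is a minimal Python sketch that reads two CSV sources, treats blank and suppressed cells as missing, and joins field-of-study rows to institutions by UNITID. The tiny in-memory tables stand in for the real files, the values are made up, and the column names INSTNM, CIPDESC and EARNINGS_1YR are assumptions for illustration rather than a description of how this site actually does it.

```python
import csv
import io

# Cell values treated as "no data" (an assumed list, not an official one).
MISSING = {"", "NULL", "PrivacySuppressed"}

def clean(value):
    """Return None for blank or suppressed cells, else the stripped text."""
    value = (value or "").strip()
    return None if value in MISSING else value

def read_rows(handle):
    """Read an open CSV handle into a list of dicts with cleaned values."""
    return [
        {key: clean(val) for key, val in row.items() if key is not None}
        for row in csv.DictReader(handle)
    ]

def join_on_unitid(institutions, fields):
    """Attach each field-of-study row to its institution row by UNITID."""
    by_id = {row["UNITID"]: row for row in institutions if row["UNITID"]}
    joined = []
    for field in fields:
        school = by_id.get(field["UNITID"])
        if school is not None:  # drop programs with no matching school
            joined.append({**school, **field})
    return joined

# In-memory stand-ins for the two real files (made-up values).
institution_csv = io.StringIO(
    "UNITID,INSTNM\n"
    "1001,Example College\n"
    "1002,Sample University\n"
)
field_csv = io.StringIO(
    "UNITID,CIPDESC,EARNINGS_1YR\n"
    "1001,Nursing,52000\n"
    "1001,History,PrivacySuppressed\n"
    "9999,Orphan Program,41000\n"
)

rows = join_on_unitid(read_rows(institution_csv), read_rows(field_csv))
for row in rows:
    print(row["INSTNM"], row["CIPDESC"], row["EARNINGS_1YR"])
```

With the real files you would pass open file handles instead of the StringIO objects. The suppressed History row is kept but its earnings come through as None.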

To be honest, some of the data is kind of a pain. For instance, multiple colleges share the same name, which is especially common among cosmetology schools. Some are different branches in different areas and some are unrelated. In other cases the OPEID or UNITID fields are either blank or incorrect (for example, they refer to the main campus rather than the satellite campus that the data is for), or the primary URL for a school is empty or just plain wrong. So there are issues here and there. As we find these, we'll try to get them corrected.
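A quick way to surface these problems is to group rows by name and flag blank identifiers. The following is only a sketch under assumptions: it expects rows as dicts with UNITID, OPEID and INSTNM keys (INSTNM is an assumed column name), and the sample rows are invented.

```python
from collections import defaultdict

def find_shared_names(rows):
    """Return {lowercased name: sorted UNITIDs} for names used by 2+ UNITIDs."""
    ids_by_name = defaultdict(set)
    for row in rows:
        name = (row.get("INSTNM") or "").strip().lower()
        unitid = (row.get("UNITID") or "").strip()
        if name and unitid:
            ids_by_name[name].add(unitid)
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}

def find_missing_ids(rows):
    """Return rows whose UNITID or OPEID is blank."""
    return [
        row for row in rows
        if not (row.get("UNITID") or "").strip()
        or not (row.get("OPEID") or "").strip()
    ]

# Invented sample rows for illustration only.
sample = [
    {"UNITID": "1", "OPEID": "00100100", "INSTNM": "Example Beauty Academy"},
    {"UNITID": "2", "OPEID": "00200200", "INSTNM": "Example Beauty Academy"},
    {"UNITID": "3", "OPEID": "", "INSTNM": "Sample College"},
]
shared = find_shared_names(sample)
missing = find_missing_ids(sample)
print(shared)
print(missing)
```

This only finds candidates. Deciding whether two same-named schools are branches or unrelated, or whether an ID points at the wrong campus, still takes a manual look.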

The data is only based on federal student loan and grant programs, so it does not include those who take out private loans, receive scholarships, or have the good fortune to be able to cover the cost of college themselves.

Graduation Rates

I was surprised to learn that the Dept. of Education measures completion rates at the 8 year mark. So when you look up a specific school on College Scorecard and see its graduation rate (for example, Granite State College shows a rate of 42%), this is the 8 year rate, which they mention in the small-print infobox if you hover over it. There are probably good reasons for this. Granite State is an online college, so its students are far more likely to already have full-time jobs and families. But it seems a bit disingenuous, especially when the 6 year graduation rate is only 14% and the 4 year rate is 3%!


Incomes used in both the college and field rankings are based on earnings 1 year post-graduation. Median earnings measured at the 6-10 year levels are also listed for colleges, but there are some open questions about those numbers, detailed here.

DOE has started breaking this down by family income, but this isn't represented on College Value yet. They use three brackets: low-income: $30,000 or less; middle-income: $30,001-$75,000; and high-income: $75,001+. How payback rate affects what is reported is not specified. It seems like the only income statistics received are for those still paying loans. If some students are able to pay off their loans much more quickly, their (possibly higher) income is not accounted for. This could affect 6-10 year median incomes much more than the 1 year post-graduation incomes used here to compile rankings.
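To make both points concrete, here is a small Python sketch. The bracket function uses the three family-income cutoffs listed above. The second part uses entirely made-up numbers to show how a median could shift if earnings were only observed for people still repaying loans; it illustrates the concern and is not a claim about how DOE actually computes anything.

```python
from statistics import median

def income_bracket(family_income):
    """Classify annual family income (USD) into the three brackets above."""
    if family_income < 0:
        raise ValueError("family income cannot be negative")
    if family_income <= 30_000:
        return "low-income"
    if family_income <= 75_000:
        return "middle-income"
    return "high-income"

# Made-up graduates: (annual earnings, still repaying federal loans?)
graduates = [
    (40_000, True),
    (45_000, True),
    (50_000, True),
    (90_000, False),   # paid off early
    (120_000, False),  # paid off early
]

median_all = median(earnings for earnings, _ in graduates)
median_still_repaying = median(
    earnings for earnings, repaying in graduates if repaying
)
print(income_bracket(52_000), median_all, median_still_repaying)
```

In this toy case the median drops from 50,000 to 45,000 once the early payers disappear from the sample, which is the kind of skew that would grow over a 6-10 year window.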

One of the reasons for building this site in the first place was to test the hypothesis that the combination of major and college often matters more than either variable alone. This appears to be the case. For example, expected earnings across nursing degrees vary wildly, especially when accounting for debt loads and graduation rates.
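One simple way to look at this is to take a single major and compare its earnings across colleges. The sketch below does that with invented (college, major, earnings) records. It is a hypothetical helper for exploring the idea, not the ranking method used on this site, and it ignores debt and graduation rates.

```python
from collections import defaultdict

def earnings_spread_by_major(records):
    """Return {major: (lowest, highest)} earnings seen across colleges.

    Records with missing earnings (None) are skipped.
    """
    by_major = defaultdict(list)
    for _college, major, earnings in records:
        if earnings is not None:
            by_major[major].append(earnings)
    return {major: (min(vals), max(vals)) for major, vals in by_major.items()}

# Invented records for illustration only.
records = [
    ("College A", "Nursing", 48_000),
    ("College B", "Nursing", 71_000),
    ("College A", "History", 31_000),
    ("College B", "History", None),  # suppressed or unknown
]
spread = earnings_spread_by_major(records)
print(spread)
```

A wide gap between the lowest and highest value for one major is the pattern the hypothesis predicts.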

It's important to note that this is still a very limited and possibly skewed view. Some details noted by the DOE:

"One of the most common reasons students cite in choosing to go to college is the expansion of employment opportunities. To that end, data on the earnings and employment prospects of former students can provide key information. To measure the labor market outcomes of individuals attending institutions of higher education, data on cohorts of federally aided undergraduate students were linked with earnings data from de-identified tax records and reported back at the aggregate, institutional level. Mean earnings data elements at the institution-level were last updated in the fall of 2018."
"There are two notable limitations that researchers should keep in mind for all of these metrics. First, research suggests that the variation across programs within an institution may be even greater than aggregate earnings across institutions. For information related to more recent earnings calculations by field of study, please see the technical documentation for field of study data files. Second, the data include only Title IV-receiving students, so figures may not be representative of institutions with a low proportion of Title IV-eligible students. Additionally, the data are restricted to students who are not enrolled (enrolled means having an in-school deferment status for at least 30 days of the measurement so students who are currently enrolled in, for example, graduate school at the time of measurement are excluded."

Debt Loads

One key insight is that the amount of debt incurred is independent of completion: students who leave without a degree still owe what they borrowed! As Bryan Caplan and others have pointed out, most of the value of an undergraduate degree comes from the last year and actually receiving the diploma, rather than being spread evenly over the 4 years.

The DOE says:

"At institutions where large numbers of students withdraw before completion, a lower median debt level could simply reflect the lack of time that a typical student spends at the institution. Therefore, the Department uses the typical debt level for students who complete (GRAD_DEBT_MDN_SUPP or GRAD_DEBT_MDN10YR_SUPP for the debt level expressed in monthly payments) on the consumer website. Additionally, this measure can be placed in context by looking at the borrowing rate of students at the institution (FTFTPCTFLOAN; see above); at institutions where few students borrow, the numbers may represent outliers."

For colleges, we break this down and show median debt for graduates, for students who withdrew, and for both combined. For individual majors, the debt is based on the loans of only those students who completed the program.
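The DOE quote above mentions debt "expressed in monthly payments" over 10 years. As a rough sketch of what that conversion looks like, here is the standard fixed-rate loan amortization formula in Python. The interest rate is an assumption for illustration; the text here does not say which rate DOE uses.

```python
def monthly_payment(principal, annual_rate, years=10):
    """Fixed monthly payment that pays off `principal` over `years`.

    Standard amortization: M = P * r / (1 - (1 + r) ** -n),
    where r is the monthly rate and n is the number of payments.
    """
    if principal < 0 or years <= 0 or annual_rate < 0:
        raise ValueError("principal and rate must be >= 0 and years must be > 0")
    n = years * 12
    r = annual_rate / 12
    if r == 0:
        return principal / n  # no interest: just split the balance evenly
    return principal * r / (1 - (1 + r) ** -n)

# Assumed example: $20,000 median debt at a hypothetical 5% annual rate.
payment = monthly_payment(20_000, 0.05)
print(round(payment, 2))
```

Expressing debt this way makes it easier to compare against monthly take-home pay than a single lump-sum figure does.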


This is a sparse matrix. A lot of the data is empty to maintain student privacy. We can still see the broader trends in many cases, but smaller institutions or fields will not have much data.
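To get a feel for how sparse the data is, you can measure what fraction of rows has a usable value in each column. This is a minimal sketch with invented rows. The column names EARNINGS_1YR and DEBT_MDN are placeholders, and the set of markers treated as missing is an assumption.

```python
# Cell values treated as "no data" (an assumed list).
MISSING = {"", "NULL", "PrivacySuppressed"}

def fill_rates(rows, columns):
    """Return {column: fraction of rows holding a usable value}."""
    if not rows:
        return {column: 0.0 for column in columns}
    rates = {}
    for column in columns:
        filled = sum(
            1 for row in rows
            if (row.get(column) or "").strip() not in MISSING
        )
        rates[column] = filled / len(rows)
    return rates

# Invented rows for illustration only.
sample_rows = [
    {"EARNINGS_1YR": "52000", "DEBT_MDN": "18000"},
    {"EARNINGS_1YR": "PrivacySuppressed", "DEBT_MDN": "21000"},
    {"EARNINGS_1YR": "", "DEBT_MDN": "PrivacySuppressed"},
    {"EARNINGS_1YR": "39000", "DEBT_MDN": "9500"},
]
rates = fill_rates(sample_rows, ["EARNINGS_1YR", "DEBT_MDN"])
print(rates)
```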

From the DOE:

"..Those data that do not meet reporting standards are shown as PrivacySuppressed. Note that for many elements, we have also taken additional steps to ensure data are stable from year to year and representative of a certain number of students. For many elements, data are pooled across two years of data to reduce year-over-year variability in figures (i.e. repayment rate, debt figures, earnings). Moreover, for elements that are highlighted on the consumer-facing College Scorecard, a separate version of the element is available that suppresses data for institutions with fewer than 30 students in the denominator to ensure data are as representative as possible."

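The pooling and the 30-student cutoff described in the quote above can be sketched roughly as follows. This is only an illustration under assumptions: it pools per-year (numerator, denominator) pairs and applies the cutoff to the pooled denominator, which may not match exactly how DOE applies it.

```python
PRIVACY_SUPPRESSED = "PrivacySuppressed"
MIN_DENOMINATOR = 30  # cutoff mentioned in the DOE quote

def pooled_rate(cohorts, min_denominator=MIN_DENOMINATOR):
    """Pool per-year (numerator, denominator) pairs into one rate.

    Returns the suppression marker when the pooled denominator is too small.
    Applying the cutoff to the pooled total is an assumption of this sketch.
    """
    total_num = sum(num for num, _ in cohorts)
    total_den = sum(den for _, den in cohorts)
    if total_den < min_denominator:
        return PRIVACY_SUPPRESSED
    return total_num / total_den

# Invented two-year cohorts for illustration only.
large_enough = pooled_rate([(10, 20), (12, 25)])  # 45 students pooled
too_small = pooled_rate([(5, 10), (6, 12)])       # 22 students pooled
print(large_enough, too_small)
```

Pooling two years lets some programs clear the cutoff that would be suppressed on a single year alone, which is part of why small fields can still show up with data.
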
Naming and Franchises

Some schools, especially for-profit organizations, have many branches spread across different cities, but they often report only one set of statistics for the entire institution. Over time, we plan to decompose these into their proper groupings. Examples include Strayer, University of Phoenix, and Cortiva Institute.

Just Remember..

"All models are wrong, but some are useful." -George Box