It’s time to reassess the use of GREs in the STEM doctoral student admissions process

Sandra L. Petersen, Ph.D.
Institute for Applied Life Sciences
Department of Veterinary and Animal Sciences, University of Massachusetts Amherst

A rapidly growing number of STEM doctoral programs, particularly in biomedical disciplines, are dropping the requirement for doctoral program applicants to report GRE scores, and fewer students are opting to take the exam [1].  The Educational Testing Service (ETS), the organization that administers the GRE, and other GRE proponents argue that this is a mistake. They suggest that omitting the GRE will negatively impact the selection process by reducing the amount of information considered, and may impede diversity efforts by making the process more susceptible to implicit biases [2].  They also argue that a holistic process that includes GRE scores is the best way to select students. Below we present counterarguments based on recent data from studies specifically investigating the efficacy of the GRE in predicting success in STEM doctoral programs.

Assertion 1: GRE scores provide important information about candidates

The ETS states that the GRE provides information important for identifying candidates prepared for graduate school.  In fact, there is little evidence that the tests predict any key indices of success in STEM doctoral programs.

The most common evidence cited to support the idea that GREs are useful predictors comes from a meta-analysis conducted over 18 years ago using 50 years of data from multiple institutions [3, 4].  Not only are the primary data quite old, but the analyses do not focus specifically on STEM fields and do not differentiate between master's and doctoral students. The latter is a key issue because the relationship between GRE scores and indices of performance appears to differ between these two degree programs [5].  Similarly, the data were not disaggregated by gender, and the relationship between GRE scores and PhD completion differs between genders [6].  Finally, the population aspiring to attain advanced degrees is more diverse than it was 25-50 years ago when the data used in these two studies were collected [3, 4].

For these reasons, and because earlier studies did not clearly assess the efficacy of the GRE in predicting outcomes in STEM doctoral programs, several groups recently investigated this issue.  Overall, their findings fail to support the idea that GRE scores are useful for selecting STEM PhD students. The findings are described below:

  • GRE Scores do not predict PhD completion or other measures of success in STEM doctoral programs:

Two small studies, one at Ponce Health Sciences University [7] and one at the University of California, San Francisco [8], examined outcomes for PhD students in biomedical fields. Results of these studies showed that GRE scores are not useful in identifying the students most likely to complete their programs.

Two larger studies published in 2017, by Hall and colleagues [9] and Moneta-Koehler and colleagues [10], led the way for GRExit, the discontinuation of the use of GREs in the admissions process (www.beyondthegre.org).  These studies also failed to detect a relationship between GRE scores and PhD completion at the University of North Carolina (UNC) Medical School (280 subjects) [9] and at Vanderbilt University School of Medicine (495 subjects) [10].  Although the mean scores for the GRE Verbal (GRE V) and GRE Quantitative (GRE Q) tests were above the 70th percentile at UNC, Hall et al. noted that some of the most successful students in the UNC study scored below the 40th percentile on the GRE Q and GRE V tests. In line with this observation, in a study of students from underrepresented groups in the Vanderbilt University School of Medicine biomedical program, Sealy et al. [13] found that 28 of 32 students earned PhD degrees despite having GRE scores ranging from the 1st to the 91st percentile.

The inability of GREs to predict PhD completion is not limited to biomedical graduate programs or to highly selective institutions whose entering students have mean GRE scores higher than the national average.  A study of 1,805 students in all fields classified as STEM at four state flagship institutions in the Northeast Alliance for Graduate Education and the Professoriate (NEAGEP) included students with scores ranging between the 1st and 94th percentiles of all test takers [6].  In each of the four institutions, the GRE Q scores of females were lower than those of males [6], mirroring ETS data showing that men score better than women on the test [11, 12].  Despite gender differences in scores, the rate of PhD completion was the same for men and women.

The large sample size of the NEAGEP study allowed for analyses of PhD completion disaggregated by quartile ranking and gender.  For women, there were no differences in STEM PhD completion among quartiles even though the lowest GRE Q quartile had a mean at the 27th percentile (n=109) and the highest had a mean at the 86th percentile (n=113).

The most important finding in the NEAGEP study was that the STEM PhD completion rate was 74% for men with scores in the lowest quartile for GRE Q (mean 34th percentile; n=160), but only 56.2% for those in the highest quartile (mean 91st percentile; n=143). This pattern was replicated across each of the four institutions in the study and in a study specifically focused on engineering students [6].  GRE V scores showed a similar pattern.  Interestingly, the meta-analysis of data from 25-50 years ago also found a negative correlation between PhD completion and GRE Q scores in the life sciences [4].

ETS suggests that the reason studies fail to find a relationship between GRE scores and success in graduate school is that students with low scores are not admitted to graduate school, so the range of scores is truncated [1].  Clearly, the results of our NEAGEP study and the study of Sealy et al. [13] refute this argument.

PhD completion is arguably the most critical marker of success, but other measures have also been considered.  Early work found some evidence that GRE scores were correlated with faculty assessments of students in math and the physical sciences, but no such correlation was seen in the life sciences [4].  Others found a correlation when they related GRE scores to faculty ratings of students they knew well [3].  More recent work also reported somewhat higher faculty assessments of high-scoring students' ability to handle classwork, keep up with the literature and write creatively [10].  But faculty ratings as a measure of graduate student success are problematic because ratings may be affected by implicit bias and circular reasoning.  If faculty members believe GRE scores are meaningful and they know the scores of their students, it would be surprising if GRE scores did not correlate with faculty ratings.  Indeed, a major concern with GRE scores is that faculty expectations may change based on the scores of their students and result in self-fulfilling prophecies.

To obtain a more objective measure of success, recent studies investigated the relationship between GRE scores and indices of productivity.  Neither Moneta-Koehler et al. nor Hall et al. found a relationship between GRE scores and the number of first-author publications in graduate school [9, 10].  Moreover, neither Moneta-Koehler et al. nor Pacheco et al. found a relationship between GRE scores and the ability to obtain individual grants or fellowships [7, 10].  To some extent, this may be due to the fact that GRE scores are no longer reported on NIH and NSF fellowship applications.

  • Evidence does not support the idea that GRE scores predict readiness for STEM doctoral studies

The ETS now acknowledges that the GRE does not predict PhD completion; instead, they argue that it predicts readiness for doctoral studies and, presumably, for succeeding in advanced course work [14].  This statement may be based on the results of early studies that detected a weak correlation between GRE scores and first-year and overall graduate GPA [4, 5, 15].  But around the same time, it was also reported that second-year grades were not correlated with GRE scores [15].  More recently, a Vanderbilt study [10] found a moderate correlation between GRE V and GRE Q scores and first-semester GPA; however, overall graduate GPA was not correlated with GRE Q and only slightly correlated with GRE V.  It is important to note that the mean overall graduate GPA in the Vanderbilt study of 488 students was 3.66, with a standard deviation of only 0.27.  The small variability in GPA likely reflects the fact that grade ranges are compressed by the requirement that students maintain at least a B average in most programs.  Thus, the correlation between GRE scores and GPA is of questionable significance.

If the weak correlation between GRE scores and GPA were an important marker of readiness for doctoral studies in STEM fields, there should be a relationship between GRE scores and the number of students leaving during the first three years.  This is the period when STEM doctoral students are completing their coursework, and if they fail to attain grades above a “C”, they are generally asked to leave. In fact, recent work shows that GRE scores do not predict who will leave during the first three years.  The NEAGEP study found no differences between the GRE scores of men or women who completed STEM PhD degrees and those who left after the first year [6].  Similarly, Moneta-Koehler et al. [10] found that GRE scores were not related to passing the qualifying exam after the second year.  GRE scores were also unrelated to leaving STEM doctoral programs around the third year in the NEAGEP [6] and Ponce Health Sciences University [7] studies.

Readiness for graduate school might also be reflected in the time it takes for students to complete PhD degrees.  Again, studies of several biomedical doctoral programs failed to find a significant correlation between GRE scores and time to program completion [7, 9, 10].  The NEAGEP study also found no differences in time to degree based on GRE V or GRE Q scores in either male or female students in STEM doctoral programs [6].

In a study of 28 underrepresented minority students at Vanderbilt University School of Medicine, Sealy et al. [13] found a minor positive correlation between either GRE Q or GRE V scores and time to degree.  The correlation was sufficiently low that the difference in time to degree between students with GRE Q scores at the 40th percentile and those at the 60th percentile was only one month.

For these reasons, we agree with ETS researchers who stated, “a measure that predicts first-year grades but is unrelated to later success would not be a desirable admission measure” [5].  

Assertion 2:  Omitting GRE scores may negatively impact diversity initiatives

The use of GRE scores in the graduate admissions process has long been challenged [11, 15], but the debate intensified recently as more STEM faculty have become principal investigators (PIs) on NIH- and NSF-funded grants focused on diversifying the STEM workforce.  These PIs found that using GRE scores to select students for STEM PhD programs interfered with diversity initiatives because women and those from underrepresented minority (URM) groups generally score lower than students from other groups [16].  Thus, the pool of women and URM students viewed as “acceptable” is disproportionately limited, particularly in STEM programs wherein diversity is severely lacking [17].  For this reason, several groups recently addressed the issue of whether GRE scores predict sufficiently important indices of success to justify their continued use [6, 9, 10, 13, 17, 18].  As described above, the overwhelming answer was, “No”.

It is, therefore, puzzling that GRE proponents assert that dropping the GRE requirement will negatively impact diversity initiatives.  The argument seems to be that implicit bias would play a greater role in the admissions process if GRE scores were not considered. This idea likely dates back 70 years, to when the tests were introduced after WWII to “level the playing field” by reducing the impact of institutional reputation on admission decisions.  But this was at a time when graduate school applicants were largely White men.  The current applicant pool is much more diverse in terms of race/ethnicity, gender and socioeconomic status, so the same strategy for “equalizing” opportunity may not be as useful as it once was.

There may indeed be implicit bias against applicants coming from historically Black colleges and universities, Hispanic-serving institutions and women’s colleges.  But this possible implicit bias pales in comparison to the explicit bias of the GREs against women and persons from most underrepresented minority groups, who continue to score lower than White male and Asian test takers [16].  Unfortunately, rather than prompting questions about what it is in the test that produces these disparities, lower GRE scores contribute to the perception that women and students from underrepresented groups are less prepared for doctoral studies than other groups. This perception frustrates inclusion efforts because non-minority students suspect that all minority students have low GRE scores but are accepted because of their race/ethnicity. Consequently, rather than leveling the playing field by reducing bias, the GRE legitimizes existing biases and limits the size of the pool of women and minorities deemed acceptable for admission to STEM doctoral programs [17].

Assertion 3:  Holistic admissions procedures that include the GRE constitute the best practice for selecting students for doctoral programs

STEM admissions committee members generally indicate that they use a holistic approach to admissions and do not put undue weight on GRE scores. In practice, however, double-blind studies show that if the scores are reported, they do sway reviewers and are often used to “screen” applicants [19].  This would not be an unreasonable strategy if the test scores measured anything relevant to success in STEM doctoral programs, but as demonstrated above, they do not.  Moreover, inclusion of GRE scores does not enhance the predictability of success above that achieved by considering a composite of the other factors typically used in a holistic review process [7].

Even if the test scores are not used as a primary factor in choosing students, requiring students to take the test inflicts collateral damage.  First, the test creates an unnecessary burden for those who cannot afford to take time off work and spend the 3-6 months recommended for study, or pay the costs of preparation courses and materials that may easily exceed $1,500.  Retaking the test to improve one’s score adds another financial burden that many cannot afford, again favoring those of means.  Second, regardless of whether admissions committees ignore the GRE and admit students who have lower scores, those students know their scores and question their own abilities. This increases the burden of imposter syndrome often carried by students not in the majority.  The problem is exacerbated if faculty know the scores of students and have different expectations for high- and low-scoring students.  Third, students with potential may decide not to go to graduate school because stereotype threat and test anxiety cause them to doubt that they will do well enough on the GRE to get in.

Thus, a test that fails to measure characteristics important for success in graduate school is negatively impacting the lives of individuals.  It is also reducing the ability of our STEM graduate programs to meet the national need for scientists, mathematicians and engineers who represent all sectors of the U.S. population.

If the GRE is not predictive of success in STEM programs, why is it still used?

In agreement with the recent findings described above, the ETS concurs that the GRE is not meant to predict degree completion [14].  Recent data described above show that the GRE also fails to predict readiness for advanced study or any other important measure of success in STEM doctoral programs.

Ironically, some STEM faculty members, people whose careers are based on evaluating data critically, still resist discontinuing the use of GREs in the admissions process. The reasons for this resistance are likely complex.  For one thing, if programs only accept students with relatively high GRE scores, there are few exemplars of successful students with low scores.  Another reason is that the test has been around for 70 years and an attendant industry focused on preparing students for the GRE has emerged, lending legitimacy to the test. In addition, scores are viewed as objective measures simply because they are numerical; therefore, faculty worry that ignoring or omitting GRE scores is tantamount to lowering standards.  But, if GRE scores are not providing useful information, what relevant standards are we lowering by dropping them?

GRE scores may also seem valuable to faculty because they likely scored well themselves. However, many do not know the data regarding the test, the test content, or how the test is currently administered.  Simply put, the GRE V tests whether students know a large number of vocabulary words, which can be memorized from widely available flash cards to improve one’s score (see stories of student test takers at https://beyondthegre.org/gre-stories).  The GRE Q tests high school level mathematics that can be learned with free and purchased materials.  Perhaps more importantly, expensive courses can teach strategies for taking the test. Faculty also may not understand the current scoring system, which has a more compressed range than previously; scores on each section run from 130 to 170, with 130 being the score achieved when no questions are answered.  For those who focus on percentiles, a score of 145 on the GRE Q corresponds to approximately the 20th percentile and 165 to the 90th percentile.

So, what the GRE seems to discriminate among test takers is the ability to perform well on a long, timed test.  For those who worry that they will not do well on this high-stakes test (perhaps women and minorities, who have historically scored less well than others), the way the test is administered increases the likelihood that their scores will indeed be lower.  This is because the test is adaptive by section, so performance on the first section (when anxiety is likely highest) determines the difficulty of the questions in the subsequent section and the ultimate score.  In addition, doing well on the test requires focus that may be hard for anxious students to maintain during the 3.75-hour test.  The test has six sections with only one-minute breaks after each section, except the third, when a 10-minute break is allowed.

If not the GRE, then what?

Modern STEM PhD programs emphasize deep thinking, problem-solving, and teamwork, with less focus on rote learning, a skill that likely improves GRE scores.  In addition, changing demographics and more diverse academic settings bring other challenges that affect student retention.  Thus, completing a STEM PhD program in today’s world requires students to deal effectively with family issues, interpersonal conflicts in diverse working groups, experimental failures and financial challenges.  The GRE does not assess whether students are equipped with the personality characteristics, experiences and skills relevant to dealing with these issues.  It is time to develop better assessment tools!

Few STEM faculty members rely on research tools that were used decades ago, but they do exactly that when they use GRE scores to select graduate students. To get faculty to reconsider their continued reliance on the GRE requirement, Scott Barolo (University of Michigan) pioneered a data-driven approach.  He engaged 14 biomedical doctoral programs in an intensive review of the literature and an academic debate based on its findings.  One group of faculty developed a white paper on the reasons the GRE should be retained, and another group on the reasons it should no longer be used.  The overwhelming weight of evidence was on the side of rejecting the use of the exam.  The next step was having admissions committees review applications without seeing the GRE scores, but with the option of viewing the scores after choosing the students and changing their decisions.  Only two programs asked to see the scores, and no admission decisions were changed (personal communication, Scott Barolo).

Other programs across the country are trying to compromise by making the submission of the GRE optional, but this strategy is confusing to applicants and may present problems for admissions committees.  For example, if the admissions committee has two candidates who have equally compelling applications, will the candidate who chose not to submit GREs be viewed less favorably?  Will it be assumed that the reason the student did not choose to submit scores was that they were low?  Those of us who advise students know that excellent candidates for STEM PhD programs are choosing not to invest the time and money required to maximize their GRE scores.  Of course, students are not required to take the expensive preparatory courses, but the purveyors of the courses and the students who take them broadly advertise that they can dramatically increase scores.  Thus, if students do not take the exam, they may be at a competitive disadvantage.  For these reasons, using a “test optional” approach does not seem to be the best strategy.

A holistic admissions process that considers grade point average, research experience, and letters of recommendation is just as predictive of success when GREs are disregarded as when they are included [7].  Nevertheless, it seems likely that the process could be improved if it assessed characteristics and experiences that are more predictive.  Direct evidence that such characteristics exist and are more important than GRE scores was provided by the NEAGEP study [6].  In that large, multi-institutional study, men in the lowest quartile of GRE Q scores completed STEM PhDs at a rate significantly higher than those in the highest quartile.  This pattern was apparent in each of the four participating institutions.  What is striking about these findings is that faculty were able to identify characteristics sufficiently compelling that they ignored very low GRE scores and their confidence paid off—men with the lowest scores finished at a rate averaging 78% compared with 58% for those with the highest scores [6].  Thus, faculty can make more predictive admissions decisions for men when they evaluate characteristics other than GRE scores. These findings are consistent with the outcome of the University of Michigan Programs in Biomedical Sciences project wherein faculty became confident in their ability to choose students without the aid of GRE test scores.

Armed with evidence that GRE scores are not useful predictors and that faculty are able to identify the characteristics of students likely to be successful without considering the scores, we convened a workshop at the University of Massachusetts Amherst in September 2017. The “If Not GREs, Then What” workshop brought together a group of 60 faculty and administrators from 25 institutions and organizations with various areas of expertise.  At the workshop, participants worked in small groups to: 1) develop a list of characteristics of successful STEM doctoral students; 2) design ways of assessing these characteristics in a new admissions model; and 3) implement new admissions procedures. A white paper describing the results of the meeting can be found at https://beyondthegre.org/identifying-promising-stem-students/.

Taking this approach will no doubt require more time and resources than screening applicants based on a number.  But, it is also likely to save resources in the long run as shown by results of the NEAGEP study.  Those findings show that by ignoring GRE scores, faculty identified a group of students who completed STEM PhDs at a rate more than 20% above those selected with high GRE scores.  This translates into a sizable financial savings, considering that the cost of training a doctoral student generally exceeds $50,000/year and students who leave usually do so after three years. A more detailed discussion of financial costs can be found in [6].

Summary

Available data from recent studies specifically focused on STEM PhD programs show that GRE scores are not useful in the admissions process.  Moreover, the data show that faculty who ignored GRE scores were able to make more predictive admissions decisions than if they had considered the scores.  Finally, in view of the inequities inherent in using the GRE and its negative impact on STEM diversity initiatives, it is time to put our time and financial resources into developing a more efficacious admissions strategy.  Such an effort is likely to improve the inclusion of women and those from groups underrepresented in the STEM workforce, an accomplishment necessary for the U.S. to maintain its international leadership in scientific innovation.  It’s time to move beyond the GRE!

 

REFERENCES

  1. Langin K. Ph.D. programs drop standardized exam. Science. 2019;364(6443):816. doi: 10.1126/science.364.6443.816.
  2. Payne DG. The Value of Testing in Graduate Admissions. Inside Higher Ed. 2018.
  3. Kuncel NR, Hezlett SA. Standardized tests predict graduate students’ success. Science. 2007;315:1080-1.
  4. Kuncel NR, Hezlett SA, Ones DS. A comprehensive meta-analysis of the predictive validity of the Graduate Record Examinations: Implications for graduate student selection and performance. Psychological Bulletin. 2001;127(1):162-81. doi: 10.1037/0033-2909.127.1.162.
  5. Burton NW, Wang M. Predicting long-term success in graduate school: A collaborative validity study. GRE Board Report. April 2005. Report No.: 99-14R; ETS RR-05-03.
  6. Petersen SL, Erenrich ES, Levine DL, Vigoreaux J, Gile K. Multi-institutional study of GRE scores as predictors of STEM PhD degree completion: GRE gets a low mark. PLoS One. 2018;13(10):e0206570. doi: 10.1371/journal.pone.0206570. PubMed PMID: 30372469; PubMed Central PMCID: PMCPMC6205626.
  7. Pacheco WI, Noel RJ Jr., Porter JT, Appleyard CB. Beyond the GRE: Using a composite score to predict the success of Puerto Rican students in a biomedical PhD program. CBE–Life Sciences Education. 2015;14(Summer 2015):1-7.
  8. Weiner OD. How should we be selecting our graduate students? Molecular Biology of the Cell. 2014;25(4):429-30. doi: 10.1091/mbc.E13-11-0646. PubMed PMID: PMC3923635.
  9. Hall JD, O’Connell AB, Cook JG. Predictors of Student Productivity in Biomedical Graduate School Applications. PLoS One. 2017;12(1):e0169121. doi: 10.1371/journal.pone.0169121. PubMed PMID: 28076439; PubMed Central PMCID: PMCPMC5226343.
  10. Moneta-Koehler L, Brown AM, Petrie KA, Evans BJ, Chalkley R. The Limitations of the GRE in Predicting Success in Biomedical Graduate School. PLOS ONE. 2017;12(1):e0166742. doi: 10.1371/journal.pone.0166742.
  11. FairTest. Examining the GRE: Myths, Misuses, and Alternatives. 2007 [cited 2017 July 15]. Available from: https://www.fairtest.org/examining-gre-myths-misuses-and-alternatives.
  12. ETS. A Snapshot of the Individuals Who Took the GRE General Test. 2017. Available from: https://www.ets.org/s/gre/pdf/snapshot.pdf.
  13. Sealy L, Saunders C, Blume J, Chalkley R. The GRE over the entire range of scores lacks predictive ability for PhD outcomes in the biomedical sciences. PLoS One. 2019;14(3):e0201634. doi: 10.1371/journal.pone.0201634. PubMed PMID: 30897086; PubMed Central PMCID: PMCPMC6428323.
  14. Jaschik S. New Criticisms of GRE. Inside Higher Ed. 2018;November 5, 2018.
  15. Sternberg RJ, Williams WM. Does the graduate record examination predict meaningful success in the graduate training of psychologists? American Psychologist. 1997;52(6):630-41.
  16. ETS. A Snapshot of the Individuals Who Took the GRE General Test. July 2013-June 2018.
  17. Miller C, Stassun K. A test that fails. Nature. 2014;510:303-4.
  18. Stassun KG, Sturm S, Holley-Bockelmann K, Burger A, Ernst DJ, Webb D. The Fisk-Vanderbilt Master’s-to-Ph.D. Bridge Program: Recognizing, enlisting, and cultivating unrealized or unrecognized potential in underrepresented minority students. American Journal of Physics. 2011;79(4):374-9. doi: 10.1119/1.3546069.
  19. Posselt JR. Inside Graduate Admissions: Merit, Diversity, and Faculty Gatekeeping. Cambridge, MA: Harvard University Press; 2016. 250 p.