Grade Definitions and Chart

Strength of the Evidence for a Conclusion

The interactive pie chart (on the left) divides the Conclusion Statements currently in the Academy Evidence Analysis Library by grade, based on the strength of the evidence. To see which conclusion statements are Grade I, Grade III, or any other grade, click the section of the chart corresponding to that grade. (On some systems, it is necessary to click twice.)

In addition to the interactive pie chart, you will find the following resources on this page:

  • Narrative Explanation of Grades
  • Table of Grading Criteria

Grades are assigned based on the strength of the evidence found through systematic reviews of published literature. For example, a determination that there is "Good" (Grade I) evidence that an intervention is effective means that there is good-quality research to support the conclusion. However, a determination that there is "insufficient evidence" (Grade V) to determine the effectiveness of a particular intervention does NOT mean that the intervention does NOT work; rather, it indicates that additional research is needed to determine whether the intervention is effective.

Narrative Explanation of Grades

Grade I: Good—The evidence consists of results from studies of strong design for answering the question addressed. The results are both clinically important and consistent with minor exceptions at most. The results are free of serious doubts about generalizability, bias, and flaws in research design. Studies with negative results have sufficiently large sample sizes to have adequate statistical power.

Grade II: Fair—The evidence consists of results from studies of strong design answering the question addressed, but there is uncertainty attached to the conclusion because of inconsistencies among the results from different studies or because of doubts about generalizability, bias, research design flaws, or adequacy of sample size. Alternatively, the evidence consists solely of results from weaker designs for the questions addressed, but the results have been confirmed in separate studies and are consistent with minor exceptions at most.

Grade III: Limited—The evidence consists of results from a limited number of studies of weak design for answering the questions addressed. Evidence from studies of strong design is either unavailable because no studies of strong design have been done or because the studies that have been done are inconclusive due to lack of generalizability, bias, design flaws, or inadequate sample sizes.

Grade IV: Expert Opinion Only—The support of the conclusion consists solely of the statement of informed medical commentators based on their clinical experience, unsubstantiated by the results of any research studies.

Grade V: Not Assignable—There is no evidence available that directly supports or refutes the conclusion.

In September 2004, the ADA Research Committee adapted this grading system from: Greer N, Mosser G, Logan G, Wagstrom Halaas G. A practical approach to evidence grading. Jt Comm J Qual Improv. 2000;26:700-712.
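For readers who record grades in software, the five grades can be held in a small enumeration. The following is a minimal sketch, not part of the EAL itself: the names EvidenceGrade, GRADE_NAMES, describe and is_research_based are assumptions made for illustration, and only the numerals and labels come from the definitions above.

```python
from enum import IntEnum


class EvidenceGrade(IntEnum):
    """Conclusion grades from the narrative above; lower numbers mean stronger evidence."""
    GOOD = 1
    FAIR = 2
    LIMITED = 3
    EXPERT_OPINION_ONLY = 4
    NOT_ASSIGNABLE = 5


# Roman numeral and label for each grade, as printed in the grade definitions.
GRADE_NAMES = {
    EvidenceGrade.GOOD: ("I", "Good"),
    EvidenceGrade.FAIR: ("II", "Fair"),
    EvidenceGrade.LIMITED: ("III", "Limited"),
    EvidenceGrade.EXPERT_OPINION_ONLY: ("IV", "Expert Opinion Only"),
    EvidenceGrade.NOT_ASSIGNABLE: ("V", "Not Assignable"),
}


def describe(grade: EvidenceGrade) -> str:
    """Return the grade in the 'Grade <numeral>: <label>' form used above."""
    numeral, label = GRADE_NAMES[grade]
    return f"Grade {numeral}: {label}"


def is_research_based(grade: EvidenceGrade) -> bool:
    """True for Grades I to III, which rest on results from research studies.

    Grade IV is expert opinion only, and Grade V has no direct evidence.
    """
    return grade <= EvidenceGrade.LIMITED


print(describe(EvidenceGrade.LIMITED))  # Grade III: Limited
```

Note that a Grade V result is not a negative finding. It only signals that more research is needed, so code built on such a structure should not treat it as "ineffective".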

Table of Grading Criteria

The EAL Expert Workgroup members use the following predefined criteria to grade the strength of the evidence supporting each conclusion statement. These criteria guide members to carefully evaluate the following, as reported in the body of literature supporting each conclusion:

  • quality of studies (both strength of design and execution),
  • quantity of studies and subjects,
  • consistency of findings across studies,
  • magnitude of effect,
  • generalizability of findings.

The chart below defines the criteria used to determine each grade.


Conclusion Grading Table

| Strength of Evidence Elements | Grade I: Good | Grade II: Fair | Grade III: Limited | Grade IV: Expert Opinion Only | Grade V: Not Assignable |
|---|---|---|---|---|---|
| Quality: scientific rigor/validity; considers design and execution | Studies of strong design for question; free from design flaws, bias and execution problems | Studies of strong design for question with minor methodological concerns, OR only studies of weaker study design for question | Studies of weak design for answering the question; inconclusive findings due to design flaws, bias or execution problems | No studies available; conclusion based on usual practice, expert consensus, clinical experience, opinion, or extrapolation from basic research | No evidence that pertains to question being addressed |
| Consistency: of findings across studies | Findings generally consistent in direction and size of effect or degree of association, and statistical significance with minor exceptions at most | Inconsistency among results of studies with strong design, OR consistency with minor exceptions across studies of weaker design | Unexplained inconsistency among results from different studies, OR single study unconfirmed by other studies | Conclusion supported solely by statements of informed nutrition or medical commentators | |
| Quantity: number of studies; number of subjects in studies | One to several good quality studies; large number of subjects studied; studies with negative results have sufficiently large sample size for adequate statistical power | Several studies by independent investigators; doubts about adequacy of sample size to avoid Type I and Type II error | Limited number of studies; low number of subjects studied and/or inadequate sample size within studies | Unsubstantiated by published research studies | Relevant studies have not been done |
| Clinical Impact: importance of studied outcomes; magnitude of effect | Studied outcome relates directly to the question; size of effect is clinically meaningful; significant (statistical) difference is large | Some doubt about the statistical or clinical significance of the effect | Studied outcome is an intermediate outcome or surrogate for the true outcome of interest; size of effect is small or lacks statistical and/or clinical significance | Objective data unavailable | Indicates area for future research |
| Generalizability: to population of interest | Studied population, intervention and outcomes are free from serious doubts about generalizability | Minor doubts about generalizability | Serious doubts about generalizability due to narrow or different study population, intervention or outcomes studied | Generalizability limited to scope of experience | |
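The grading table is in effect a lookup from an evidence element and a grade to the criteria that apply. The following is a minimal sketch of that lookup for two of the elements, with the criteria text copied from the table. The dictionary layout and the names CRITERIA and criteria_for are assumptions for illustration, and the sketch does not attempt to derive an overall grade, since this page does not state a rule for combining the elements.

```python
# Criteria copied from the Quantity and Generalizability entries of the table.
# The nesting (element -> grade numeral -> list of criteria) is an assumption
# chosen for illustration; it is not an EAL data format.
CRITERIA = {
    "Quantity": {
        "I": [
            "One to several good quality studies",
            "Large number of subjects studied",
            "Studies with negative results have sufficiently large "
            "sample size for adequate statistical power",
        ],
        "II": [
            "Several studies by independent investigators",
            "Doubts about adequacy of sample size to avoid Type I and Type II error",
        ],
        "III": [
            "Limited number of studies",
            "Low number of subjects studied and/or inadequate sample size within studies",
        ],
        "IV": ["Unsubstantiated by published research studies"],
        "V": ["Relevant studies have not been done"],
    },
    "Generalizability": {
        "I": [
            "Studied population, intervention and outcomes are free from "
            "serious doubts about generalizability"
        ],
        "II": ["Minor doubts about generalizability"],
        "III": [
            "Serious doubts about generalizability due to narrow or different "
            "study population, intervention or outcomes studied"
        ],
        "IV": ["Generalizability limited to scope of experience"],
        # No Grade V criterion is listed for this element on this page.
    },
}


def criteria_for(element: str, grade: str) -> list[str]:
    """Return the criteria for an element at a grade, or [] if none are listed."""
    return CRITERIA[element].get(grade, [])


for line in criteria_for("Quantity", "II"):
    print(line)
```

Returning an empty list for a missing entry keeps the Grade V gap for Generalizability explicit instead of inventing a criterion for it.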