Pediatrics and Physical Activity

Study Design:
Quality Rating:
Research Purpose:
This study evaluated a health-related physical education program, Sports, Play, and Active Recreation for Kids (SPARK), for fourth- and fifth-grade students, designed to increase physical activity during physical education classes and outside of school.
Inclusion Criteria:

Complete or near complete data for the survey, fitness measures, and physical activity monitor at baseline and final timepoints.

Exclusion Criteria:
None provided.
Description of Study Protocol:


Of 16 elementary schools in the school district, 12 principals agreed to participate. Recruitment of 4th grade students occurred over a 2-year period (1990-91).


The 7 smallest schools were stratified into 2 groups by % ethnic minority students. Within each stratum, 1 school was randomly assigned to each of 3 experimental conditions. The remaining school was assigned to the control condition in anticipation of loss of controls.

Intervention (if applicable)

  • Group 3 – 3 certified PE specialists implemented program with ongoing training and supervision from investigators (2 schools)
  • Group 2 – 28 trained classroom teachers implemented program; trained 32 hrs over 7 sessions in first year, decreased training in subsequent years with follow-up support by PE specialists (2 schools)
  • Group 1 – usual PE (control; 3 schools)
Data Collection Summary:

Timing of Measurements

Baseline and end of year 2 (for most measures)

Dependent Variables

  • Physical fitness (adaptation of FITNESSGRAM protocol – mile run test for cardiovascular endurance, number of bent knee sit-ups in 60 seconds for muscular strength and endurance, number of pull-ups for upper body strength, sit-and-reach test for hamstring flexibility)
  • Self-reported physical activity outside of school (1-day recall of previously validated checklist of 20 activities)
  • Measured physical activity outside of school (Caltrac accelerometer worn for 1 weekday per semester and 1 weekend per school year; no baseline measure; for weekday measures 82% valid data [5% of missing data due to absence from school, 6% due to forgetting accelerometer])
  • Weight & height (measured)
  • Sum of skinfolds (measured calf and triceps)
  • PE class activity (previously validated System for Observing Fitness Instruction Time [SOFIT] instrument; 4 randomly chosen children observed every 20 seconds during rotating 4-min blocks during class for 2 weeks/year)

Independent Variables

SPARK Curriculum 

PE – to promote PA in school

  • Three 30-min classes per week (half for each component)
  • Two components – (1) 10 health-fitness activities to promote cardiovascular endurance (plus develop abdominal and upper body strength) (e.g., walk/run, aerobic dance/games, jump rope); (2) 9 skill-fitness activities to develop sports skills as well as promote cardiovascular fitness (e.g., basketball, soccer, Frisbee games)
  • Motivation stimulated by self-assessment of fitness

Self-management – to teach behavior change skills to promote PA outside of school

  • One 30-min class per week
  • Skills included – self-monitoring, goal-setting, stimulus control, self-reinforcement, self-instruction, and problem solving
  • Parental involvement stimulated through monthly newsletters and homework
  • Motivation stimulated by awarding prizes (e.g., pencils, sports water bottles) for meeting weekly activity goals and self-reward.

Equipment - All schools, including controls, were provided with sufficient PE equipment to carry out the SPARK program.

Control Variables

Gender (separate analyses for data collected on individuals), baseline age & baseline value (adjusted for by regressing raw score on baseline age and baseline measure [except accelerometer data] and adjusting score on the basis of the sum of the overall mean and the student’s residual score)
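The adjustment described above can be illustrated with a minimal sketch, assuming an ordinary least-squares regression of the raw posttest score on baseline age and the baseline measure. The function name and the sample values are hypothetical and are not taken from the study.

```python
import numpy as np

def residual_adjust(post, baseline_age, baseline_value):
    """Return posttest scores adjusted for baseline age and baseline value."""
    post = np.asarray(post, dtype=float)
    # Design matrix: intercept, baseline age, baseline measure.
    X = np.column_stack([
        np.ones_like(post),
        np.asarray(baseline_age, dtype=float),
        np.asarray(baseline_value, dtype=float),
    ])
    # Ordinary least-squares fit of the raw score on the covariates (assumed model).
    coef, *_ = np.linalg.lstsq(X, post, rcond=None)
    residuals = post - X @ coef
    # Adjusted score = overall mean + each student's residual.
    return post.mean() + residuals

# Made-up illustrative values, not study data.
age = [9.4, 9.5, 9.6, 9.7, 9.5, 9.6]
base = [20, 25, 30, 22, 28, 26]
post = [24, 27, 35, 25, 30, 31]
adjusted = residual_adjust(post, age, base)
```

Because the residuals of a regression with an intercept average to zero, the adjusted scores keep the same overall mean as the raw scores while no longer varying linearly with the baseline covariates.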

Statistical Analysis:

  • One-way ANOVA modified to account for clustering of values with schools (because school was the unit of assignment) – to compare outcome differences by group
  • Effect sizes (difference between 2 group means, divided by underlying SD) – to assess practical significance of intervention in relation to control condition; effect sizes > 0.4 considered large, 0.3 moderate, 0.1 small
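The effect size calculation can be sketched as follows. The worksheet does not say which SD was used as the "underlying SD", so this example assumes a pooled within-group SD. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def effect_size(treatment, control):
    """Difference between two group means divided by a pooled SD (assumed)."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    # Pooled SD weights each group's variance by its degrees of freedom.
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd

# Made-up illustrative values, not study data.
treatment = [12, 14, 16, 18, 20]
control = [10, 12, 14, 16, 18]
d = effect_size(treatment, control)
print(f"effect size = {d:.2f}")
```

A different choice of denominator, such as the control group SD alone, would give a somewhat different value for the same data.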
Description of Actual Data Sample:

Initial N: All 4th grade students from 7 elementary schools invited to participate.
Approximately 98% of students provided written parental consent (note: the 1993 paper reported passive consent). 1538 children completed surveys at baseline.

Withdrawals/Drop-Outs: 583 children; characteristics not described

Attrition (final N): 955 children (487 boys, 468 girls; 264 in Gp 1, 331 in Gp 2, 360 in Gp 3; 62.1% of original sample) provided complete or near complete data.

Ethnicity: 82% White, 12% Asian and Pacific Islander, 4% Latino, 2% African American

Location: Poway, suburb of San Diego, CA

Duration: 2 school years (~18 mo), Fall 1990 (4th grade) – Spring 1993 (5th grade)

Summary of Results:


Significant difference in age by condition (p<.01), but range of means small (9.49-9.62 years)


No differences by experimental group or gender

  • Significant difference by age (p<.01), but retained students were only 0.1 years older than dropouts. Minority students were more likely to be retained than Whites (p<.05)

Physical Activity in PE:

Moderate to vigorous PA by students (min/wk) – 

  • Gp 3 (mean = 40.2, 95% CI 36.8-43.7) > Gp 2 (mean = 32.7, 95% CI 29.1-36.2) > Gp 1 (mean = 17.8, 95% CI 13.2-22.3)
  • ANOVA p<.001

Student energy expenditure (kcal/kg/wk) – 

  • Gp 3 (mean = 7.2, 95% CI 6.8-7.6) > Gp 2 (mean = 5.8, 95% CI 5.3-6.3) > Gp 1 (mean = 3.3, 95% CI 2.4-4.1)
  • ANOVA p<.001

Lessons (no/wk) – 

  • Gp 3 (mean = 2.9, 95% CI 2.8-2.9) & Gp 2 (mean = 2.6, 95% CI 2.4-2.9) > Gp 1 (mean = 1.8, 95% CI 1.4-2.3)
  • ANOVA p<.001

Amount of PE (min/wk) –

  • Gp 3 (mean = 79.7; 95% CI 76.3-83.1) > Gp 2 (mean = 64.6, 95% CI 59.0-70.2) > Gp 1 (mean = 38.0, 95% CI 27.9-48.1)
  • ANOVA p<.001

Physical Activity Outside of School:

Weekday or weekend activity (accelerometer counts/h – posttest data only)

  • NS differences for girls (effect sizes vs. control = .05-.13)
  • NS differences for boys (effect sizes vs. control = .05-.18)

Self-report PA (mean of 1-day, weekday and weekend)

  • NS differences for girls (effect sizes vs. control = .16-.23)
  • NS differences for boys (effect sizes vs. control = .04-.12)

Physical Fitness:

Mile run (sec)

  • Gp 3 < Gp 1 for girls (effect size for Gp 3 vs. control = .32, p=.03)
  • NS differences for boys (effect sizes vs. control = .02-.14)

Sit-ups (no/min)

  • Gp 1 < Gp 3 for girls (effect size for Gp 3 vs. control = .31, p=.03)
  • NS differences for boys (effect sizes vs. control = .07-.18)

Pull-ups (no)

  • NS differences for girls (effect sizes vs. control = .03-.08)
  • NS differences for boys (effect sizes vs. control = .09-.13)

Sit-and-reach (inches)

  • NS differences for girls (effect sizes vs. control = .03-.13)
  • NS differences for boys (effect sizes vs. control = .02-.14)

Anthropometry:

BMI, ht or wt – results not reported

Sum of skinfolds (mm)

  • NS differences for girls (effect sizes vs. control = .01-.20)
  • NS differences for boys (effect sizes vs. control = .08-.12)


Author Conclusion:

A health-related physical education curriculum can provide students with substantially more physical activity during physical education classes. Improved physical education classes can potentially benefit 97% of elementary school students.

The SPARK health-related physical education program increased PA during PE classes but not out of school. 

It is estimated that during a 36-week school year, students in specialist-led classes spent about 13 more hours in moderate to vigorous physical activity than students in control classes. Data from the control condition suggest that PE is supplying only 18 minutes (12%) of the recommended 150 minutes of PA per school week. The teacher-led condition supplies 22%, and the specialist-led condition supplies 27%.
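These estimates can be checked against the group means for moderate to vigorous PA reported in the Summary of Results (17.8, 32.7, and 40.2 min/wk). The short sketch below reproduces the arithmetic; the variable and function names are illustrative only.

```python
RECOMMENDED_MIN_PER_WEEK = 150  # recommended PA per school week
SCHOOL_WEEKS = 36               # length of the school year used in the estimate

# Mean moderate to vigorous PA during PE (min/wk), from the Summary of Results.
mvpa_min_per_week = {
    "control": 17.8,
    "teacher-led": 32.7,
    "specialist-led": 40.2,
}

def share_of_recommendation(minutes):
    """Percentage of the weekly recommendation supplied by PE."""
    return 100 * minutes / RECOMMENDED_MIN_PER_WEEK

def extra_hours_per_year(group, reference="control"):
    """Additional hours of MVPA per school year relative to the reference group."""
    diff = mvpa_min_per_week[group] - mvpa_min_per_week[reference]
    return diff * SCHOOL_WEEKS / 60

for group, minutes in mvpa_min_per_week.items():
    print(f"{group}: {share_of_recommendation(minutes):.0f}% of recommended PA")
print(f"specialist-led vs. control: {extra_hours_per_year('specialist-led'):.1f} h/year")
```

The percentages come out at roughly 12%, 22%, and 27%, and the specialist-led difference at about 13.4 hours per year, consistent with the authors' figures.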

This increase in PA was sufficient to improve two components of health-related fitness in girls significantly.  The stronger intervention effect in girls may be explained in part by their lower levels of fitness at baseline.

Consistent with observed physical activity during PE classes, the largest fitness gains were found in specialist-led students. These results support position statements calling for certified PE specialists at all grade levels. Present results also support the conclusion that elementary classroom teachers, with adequate training and support, can improve their PE teaching.

Funding Source:
Government: NIH, Instituto Mexicano del Seguro Social
University/Hospital: Harvard School of Public Health, Brigham and Women's Hospital and Harvard Medical School
Reviewer Comments:


Described specialist and teacher training program, 2 y duration.


Schools, rather than individuals, were randomized, no blinding of measurers, no provision of wt or BMI data, no comparison of groups at baseline on most factors, no power calculations provided, lack of baseline accelerometer measures.

Other Comments:

SPARK was not designed to be an obesity prevention program.

Quality Criteria Checklist: Primary Research
Relevance Questions
  1. Would implementing the studied intervention or procedure (if found successful) result in improved outcomes for the patients/clients/population group? (Not Applicable for some epidemiological studies) Yes
  2. Did the authors study an outcome (dependent variable) or topic that the patients/clients/population group would care about? Yes
  3. Is the focus of the intervention or procedure (independent variable) or topic of study a common issue of concern to dietetics practice? Yes
  4. Is the intervention or procedure feasible? (NA for some epidemiological studies) Yes
Validity Questions
1. Was the research question clearly stated? Yes
  1.1. Was (were) the specific intervention(s) or procedure(s) [independent variable(s)] identified? Yes
  1.2. Was (were) the outcome(s) [dependent variable(s)] clearly indicated? Yes
  1.3. Were the target population and setting specified? Yes
2. Was the selection of study subjects/patients free from bias? No
  2.1. Were inclusion/exclusion criteria specified (e.g., risk, point in disease progression, diagnostic or prognosis criteria), and with sufficient detail and without omitting criteria critical to the study? ???
  2.2. Were criteria applied equally to all study groups? Yes
  2.3. Were health, demographics, and other characteristics of subjects described? No
  2.4. Were the subjects/patients a representative sample of the relevant population? ???
3. Were study groups comparable? No
  3.1. Was the method of assigning subjects/patients to groups described and unbiased? (Method of randomization identified if RCT) ???
  3.2. Were distribution of disease status, prognostic factors, and other factors (e.g., demographics) similar across study groups at baseline? ???
  3.3. Were concurrent controls or comparisons used? (Concurrent preferred over historical control or comparison groups.) Yes
  3.4. If cohort study or cross-sectional study, were groups comparable on important confounding factors and/or were preexisting differences accounted for by using appropriate adjustments in statistical analysis? N/A
  3.5. If case control study, were potential confounding factors comparable for cases and controls? (If case series or trial with subjects serving as own control, this criterion is not applicable.) N/A
  3.6. If diagnostic test, was there an independent blind comparison with an appropriate reference standard (e.g., "gold standard")? N/A
4. Was method of handling withdrawals described? No
  4.1. Were follow-up methods described and the same for all groups? Yes
  4.2. Was the number, characteristics of withdrawals (i.e., dropouts, lost to follow up, attrition rate) and/or response rate (cross-sectional studies) described for each group? (Follow up goal for a strong study is 80%.) Yes
  4.3. Were all enrolled subjects/patients (in the original sample) accounted for? Yes
  4.4. Were reasons for withdrawals similar across groups? ???
  4.5. If diagnostic test, was decision to perform reference test not dependent on results of test under study? N/A
5. Was blinding used to prevent introduction of bias? No
  5.1. In intervention study, were subjects, clinicians/practitioners, and investigators blinded to treatment group, as appropriate? No
  5.2. Were data collectors blinded for outcomes assessment? (If outcome is measured using an objective test, such as a lab value, this criterion is assumed to be met.) No
  5.3. In cohort study or cross-sectional study, were measurements of outcomes and risk factors blinded? N/A
  5.4. In case control study, was case definition explicit and case ascertainment not influenced by exposure status? N/A
  5.5. In diagnostic study, were test results blinded to patient history and other test results? N/A
6. Were intervention/therapeutic regimens/exposure factor or procedure and any comparison(s) described in detail? Were intervening factors described? Yes
  6.1. In RCT or other intervention trial, were protocols described for all regimens studied? Yes
  6.2. In observational study, were interventions, study settings, and clinicians/provider described? N/A
  6.3. Was the intensity and duration of the intervention or exposure factor sufficient to produce a meaningful effect? Yes
  6.4. Was the amount of exposure and, if relevant, subject/patient compliance measured? Yes
  6.5. Were co-interventions (e.g., ancillary treatments, other therapies) described? N/A
  6.6. Were extra or unplanned treatments described? N/A
  6.7. Was the information for 6.4, 6.5, and 6.6 assessed the same way for all groups? N/A
  6.8. In diagnostic study, were details of test administration and replication sufficient? N/A
7. Were outcomes clearly defined and the measurements valid and reliable? Yes
  7.1. Were primary and secondary endpoints described and relevant to the question? ???
  7.2. Were nutrition measures appropriate to question and outcomes of concern? N/A
  7.3. Was the period of follow-up long enough for important outcome(s) to occur? Yes
  7.4. Were the observations and measurements based on standard, valid, and reliable data collection instruments/tests/procedures? Yes
  7.5. Was the measurement of effect at an appropriate level of precision? Yes
  7.6. Were other factors accounted for (measured) that could affect outcomes? Yes
  7.7. Were the measurements conducted consistently across groups? Yes
8. Was the statistical analysis appropriate for the study design and type of outcome indicators? Yes
  8.1. Were statistical analyses adequately described and the results reported appropriately? Yes
  8.2. Were correct statistical tests used and assumptions of test not violated? Yes
  8.3. Were statistics reported with levels of significance and/or confidence intervals? Yes
  8.4. Was "intent to treat" analysis of outcomes done (and as appropriate, was there an analysis of outcomes for those maximally exposed or a dose-response analysis)? No
  8.5. Were adequate adjustments made for effects of confounding factors that might have affected the outcomes (e.g., multivariate analyses)? Yes
  8.6. Was clinical significance as well as statistical significance reported? Yes
  8.7. If negative findings, was a power calculation reported to address type 2 error? No
9. Are conclusions supported by results with biases and limitations taken into consideration? Yes
  9.1. Is there a discussion of findings? Yes
  9.2. Are biases and study limitations identified and discussed? Yes
10. Is bias due to study's funding or sponsorship unlikely? Yes
  10.1. Were sources of funding and investigators' affiliations described? Yes
  10.2. Was the study free from apparent conflict of interest? Yes