EE: Measurement Interval (2005)
Delikanaki-Skaribas E. The role of sampling duration on basal metabolic rate measurement error. Thesis dissertation. 2001 (1a).
 Estimate the reliability and measurement error associated with measuring BMR in elderly men
 Examine the effects of sampling duration within a day and day-to-day variance on the accuracy of measuring BMR of elderly patients
 Define the BMR measurement error associated with sampling duration that varies in time and can be generalized to include day-to-day measurement error.

More than 55 years

Male

Medications allowed

Sampling duration of BMR for more than 12 minutes of continuous data

Perform both BMR tests within seven days for each patient

BMR performed early in the morning under standardized conditions

Sampling duration was recorded within 30-second intervals

Signed a consent form.
 Life expectancy less than 30 days (per admitting MD)
 Anemia (HCT less than 25%), documented renal (serum creatinine higher than 3.0mg per dL) or liver failure
 Malignancy, immunodeficiency (e.g., HIV infection, steroid treatment, plasma cell dyscrasia)
 Known chronic infection requiring antibiotic treatment
 Estimated length of stay less than six weeks or predicted noncompliance with study measures
 Advanced CHF or malabsorption syndromes directly affecting nutritional state
 Refusal to sign a consent form.
Recruitment
Selected from database.
Design
Nonconcurrent cohort study (database study).
Statistical Analysis
 Reliability and standard error of measurement (SEM) were calculated using generalizability (G) theory
 G-theory differs from the classical theory because it partitions the undifferentiated error and identifies multiple sources of variability of an unlimited number of factors. Dimensions associated with the sources of error are called facets. In the present design, there are seven sources of variation: Among subjects, among intervals, among days, subjects X intervals, subjects X days, intervals X days, subjects X intervals X days. Variance components in the G-study were calculated from expected mean squares (EMS).
 EMS = the value of the mean square that would be obtained, on average, by repeatedly analyzing samples from the same population and universe with the same design
 Repeated-measures ANOVA was used to examine significant differences between days, among intervals and the day-by-interval interaction; multivariate tests (MANOVA) were used to examine significant differences among intervals and the interaction
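The way variance components feed into a G-coefficient and SEM can be sketched with the standard G-theory formulas for a fully crossed subjects X intervals X days design. This is a minimal illustration, not the thesis author's code: the function name, the dictionary keys and the numeric variance components are hypothetical placeholders, and the worksheet does not state whether the thesis reported the relative or the absolute SEM, so both are returned.

```python
import math

def d_study(var, n_intervals, n_days):
    """Project reliability for a crossed subjects x intervals x days design.

    `var` maps each of the seven sources of variation to its estimated
    variance component (key names are hypothetical, chosen for this sketch).
    """
    # Relative error: interactions involving subjects, each divided by the
    # number of conditions that are averaged over.
    rel_error = (var["s_i"] / n_intervals
                 + var["s_d"] / n_days
                 + var["s_i_d"] / (n_intervals * n_days))
    # Absolute error adds the facet main effects and their interaction.
    abs_error = (rel_error
                 + var["i"] / n_intervals
                 + var["d"] / n_days
                 + var["i_d"] / (n_intervals * n_days))
    g_coefficient = var["s"] / (var["s"] + rel_error)
    return g_coefficient, math.sqrt(rel_error), math.sqrt(abs_error)

# Placeholder variance components, NOT taken from the thesis.
example = {"s": 30.0, "i": 1.0, "d": 2.0,
           "s_i": 4.0, "s_d": 10.0, "i_d": 1.0, "s_i_d": 50.0}

g_one_day, sem_rel, sem_abs = d_study(example, n_intervals=40, n_days=1)
g_two_days, _, _ = d_study(example, n_intervals=40, n_days=2)
```

With any non-negative components, adding days or intervals shrinks the error terms, so the projected G-coefficient rises and the SEM falls.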
Timing of Measurements
 The measurement protocol was performed on two different days within a seven-day period
 Body composition assessment was performed with dual energy X-ray absorptiometry (DEXA) within a week of BMR measurement.
Dependent Variables/Outcomes
 Reliability, measurement error and variation in measured REE [VO_{2} (L per minute STPD), VCO_{2} (L per minute STPD; ml per kg per minute), respiratory exchange ratio (RER; VCO_{2}/VO_{2}), ventilation VE (L per minute STPD)]
 BMR (kcal per day) and percent predicted BMR (kcal per day)
 Resting energy expenditure:
 IC type: Medical Graphics CardiO2 System; Breeze Ex Software system with three different mask sizes with use of sealing gel
 Equipment of Calibration: Yes, with three-point technique
 Coefficient of variation using std gases: Yes
 Rest before measure (state length of time rested if available): “Relax, minimize movements and breathe normally from mouth”
 Measurement length: 30 to 40 minutes total test duration; a 12-minute interval (24 intervals of 30 seconds) was selected from each patient and analyzed
 Machine measured length: 30-second interval
 Steady state: Not specified; however, reports “sampling duration of the gas exchange was a 30-second interval to reduce large breath-by-breath variability due to tidal volume”
 “Patients asked to minimize movements”
 Fasting length: 12 hours
 Exercise restrictions XX hr prior to test? Not applicable
 Room temp: Equipment adjusted for temperature
 No. of measures within the measurement period:
 Were some measures eliminated?
 Was a set of measurements averaged?
 If average, identify length of each measure and number of measurements?
 Coefficient of variation in subjects measures?
 Training of measurer? “Two testers collected BMR using the same standardized testing procedures.”
 Subject training of measuring process? Each volunteer received an explanation of the time commitment and procedures involved in the study, and these were explained again the day before the testing
 Monitored heart rate? Not specified
 Body temperature? Not specified
 Medications administered? Not specified, but most likely yes, given the population.
Independent Variables
Seven sources of variation: Among subjects, among intervals, among days, subjects X intervals, subjects X days, intervals X days, subjects X intervals X days.
 Final Sample: N=35 male long-term rehabilitation inpatients
 Age: Mean±SD 74.66±6.89 years (range 56 to 87 years).
Other Relevant Demographics
 Admitting diagnoses included: BKA, DM, HTN, CVA, Dementia, CHF, CAD, Parkinson's, PVD, pressure ulcer, peptic ulcer, COPD, malnutrition, both BKA, depression, deconditioning, IDDM, fracture and osteoporosis
 History of alcohol abuse was obtained in 31 of 35 subjects.
Anthropometrics (men):
Variable  Mean±SD  Range
Weight, kg  78.95±18.1  45.6 to 128.6
Height, m  1.77±0.07  1.60 to 1.91
BMI  25.18±5.36  15.75 to 39.15
Fat-free body mass, percent  69.89±9.13  43.88 to 85.83
Fat mass  32.87±12.3  14.17 to 63.9
Measurement Process
 The multivariate test showed no significant interaction between days and intervals (F=0.644; P=0.824) and the variation among intervals was within chance variance (F=0.928; P=0.579)
 The estimated variance components from a G-study reflect the magnitude of error in generalizing from a person’s average score on a single day or interval to his interval score. The variance component for subjects accounted for 31% of the total variance, showing that patients differed in the VO_{2} uptake during rest. About 2% of the total variance was associated with the random error variance from the interaction of subjects by days.
Number of Measurements
 Increasing the measurement schedule to more than one day increases the reliability estimates and decreases the measurement error
 A sampling duration of 12 minutes on a single day yields a G-coefficient of approximately 0.59; the same level of reliability would be obtained with a sampling duration of 0.5 minutes in a three-day measurement schedule
 The 20-minute sampling duration on a single day has an expected measurement error of around 20ml per minute; about the same expected measurement error will occur with a sampling duration of about one minute in a two-day schedule or a sampling duration of 30 seconds in a three-day schedule
 Two days of 20 minutes duration increases reliability to 0.75 and reduces the SEM to 150kcal per day; three days of 20 minutes sampling increases reliability to 0.82 with a SEM of 122kcal per day.
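The multi-day figures above can be roughly checked with a Spearman-Brown style projection. This is a simplified sketch, not the thesis D-study: it assumes that all error variance averages out in proportion to the number of days (so the SEM shrinks with the square root of the number of days), and it takes the single-day, 20-minute values reported in this worksheet (G-coefficient of about 0.60, SEM of about 211kcal per day) as inputs. The function name is hypothetical.

```python
import math

def project_over_days(g_single_day, sem_single_day, n_days):
    """Spearman-Brown style projection; assumes error scales as 1/n_days."""
    g = n_days * g_single_day / (1 + (n_days - 1) * g_single_day)
    sem = sem_single_day / math.sqrt(n_days)
    return g, sem

for n_days in (1, 2, 3):
    g, sem = project_over_days(0.60, 211.0, n_days)
    print(f"{n_days} day(s): G = {g:.2f}, SEM = {sem:.0f} kcal per day")
```

Under these assumptions the projection gives about 0.75 and 149kcal per day for two days and about 0.82 and 122kcal per day for three days, close to the reported 0.75 with 150kcal per day and 0.82 with 122kcal per day.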
Length of Measurements
 The reliability estimate for a single 30-second interval was 0.32 with standard error of 53ml per minute (i.e., approximately 373kcals per day error). As the sampling intervals increased, the G-coefficients also increased and the standard error decreased. The slope of change in the SEM is very sharp from 30 seconds up to about five minutes and then levels off.
 The measurement error for a two-minute sampling duration on a single day has an expected measurement error around 30ml per minute
 A 20-minute sampling duration is nearly as accurate as 40 to 60 minutes (i.e., SEM, 211kcal per day vs. 207kcal per day, respectively). There is no substantial further reduction in measurement error for sampling durations longer than 20 minutes.
 RQ was measured but not reported.
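The conversion from an oxygen-uptake error in ml per minute to an energy error in kcal per day, as in the 53ml per minute figure above, can be sketched as follows. The caloric equivalent of oxygen is not stated in this worksheet; the value of 4.89kcal per litre used here is an assumption picked from the usual range (roughly 4.7 to 5.0kcal per litre, depending on the respiratory quotient), and the function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60
KCAL_PER_LITRE_O2 = 4.89  # assumed caloric equivalent; not stated in the source

def vo2_error_to_kcal_per_day(sem_ml_per_min, kcal_per_litre=KCAL_PER_LITRE_O2):
    """Convert a VO2 measurement error (ml per minute) to kcal per day."""
    litres_per_day = sem_ml_per_min / 1000 * MINUTES_PER_DAY
    return litres_per_day * kcal_per_litre

# Error reported for a single 30-second interval.
single_interval_error = vo2_error_to_kcal_per_day(53)
```

With this assumed caloric equivalent, 53ml per minute corresponds to about 373kcal per day, in line with the figure reported above; a different caloric equivalent shifts the result by a few percent.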
Measurement Timing
 Sleep or rest: The mean BMR for day one was 1,376.7kcal per day (±311.7) and for day two was 1,415.9kcal per day (±360.1)
 Physical activity: Not discussed
 Food intake: Not discussed
 Various times in the day: Not discussed.
Individual Characteristics
 There is a large random variation of the patients’ BMR (kcal per day) from one day to another; approximately half of the variation (47%) was associated with the three-way interaction among subjects, days and intervals, together with random sources of variation
 The sharp decrease in SEM for a single day slows down after five minutes and reaches a plateau at about a 20-minute sampling duration. There is no additional benefit for measuring BMR longer than 20 minutes.
 A sampling duration of 20 minutes for a single-day measurement yields a G-coefficient of about 0.60 and a SEM of about 211kcal per day. The same G-coefficient and SEM are expected with a sampling duration of one minute with a two-day BMR measurement and a sampling duration of 30 seconds with a three-day BMR measurement.
 Circulatory hormones: Not discussed
 Breathing ability: Not discussed
 Medical tests/procedures: None
 Chemicals (medications/drugs/herbs, caffeine, nicotine, alcohol): Identified if alcohol was used.
 The standard error of measurement and G-coefficient should guide decision-making on the optimal sampling duration and number of test days
 Previous findings by Atkinson et al, 1998 noted that standard error of measurement is an absolute reliability statistic that can be directly applied to future individuals to estimate the measurement error under similar conditions; the benefit of SEM statistic is that, unlike a reliability coefficient, it is unaffected by the range of measurements
 The minimum measurement error associated with a single-day BMR measurement in this population is about 200kcal per day
 Our results show that day-to-day variation was the major source of measurement error
 In the present study, between-subjects variance is 33% and the rest is due to intra-individual variance
 Indirect calorimetry measurement error depends on the sampling duration and the number of days measured. To decrease this measurement error, BMR should be measured with a minimum sampling duration of 20 minutes for single-day testing. Greater accuracy can be achieved by measuring BMR over several days. Increasing sampling duration up to 20 minutes decreases measurement error substantially and produces a more accurate estimate of the subject’s true BMR.
Other: 
Strengths
 Had a good sample size and age range; included patients with multiple diseases
 Strong knowledge and use of statistics producing important and applicable nutrition-focused (i.e., errors in kcal amounts) data
 Multiethnic population.
Generalizability/Weaknesses
 Generalizable to male population residing in Veterans Subacute/Transitional Care Setting.
 Did not describe subject dropouts (i.e., reasons for noncompliance, with demographic characteristics of age, disease, etc.) or whether they were related to tolerance of the mask with use of a sealing gel
 Included all weight categories (i.e., underweight, normal, overweight and obese) in sample but did not discuss separately
 Steady state was not directly defined, rather the issue was measured over time
 Smoking in subjects was not identified
 Statistical note: G-theory defines a two-facet design and identifies the intervals and the days as the two facets; used to estimate variance components that were associated with the various sources of variation
 D-study estimated the reliability, given by the G-coefficient and the measurement error; determined how much the reliability of one-day BMR measurement was improved by forecasting G-coefficient or measurement error for different sampling intervals.
Further Review Comments
A sentence in the Discussion section, page 55:
"The daily amount of calories that BMR measurement is likely to differ due to chance with a 95% confidence interval in an elderly individual, following similar testing procedures and equipment is 465kcal for a sampling duration of five minutes and decreases to about 421kcal for a 20-minute sampling duration."
cannot be verified using Table 10 or Figure 6. Therefore, this comment and pertinent thesis information were submitted to an expert panel member. It was anticipated that the 465 is a typo and should read 455, and that the researcher doubled the kcal error to represent a 95% confidence interval. Additional comments were: “Applicability of results is limited to measurement realities, which include use of face masks with various weight classifications (i.e., inter-individual error variance related to air leaks occurring with a subject who has an emaciated face vs. not), inclusion of the first five minutes of acclimation, use of BMR conditions (i.e., 12-hour fast, six-hour rest, no recent movement) and a large age range considered to be elderly (i.e., 55 to 87 years).” The quality rating worksheet was reviewed and these limitations were reflected.
“We cannot assume that her findings are generally applicable to RMR measures but limit our conclusions to her measurement realities:
 Face masks ALWAYS leak, even with sealing gel, and thus are rarely used for research or clinical measures. The error variance will also not be equal across individuals, since an emaciated face may have a totally different "fit" than an obese one.
 I believe that she includes the first five minutes of acclimation (often ignored by others as unreliable) in all her measures higher than five minutes. Extra minutes of measurement will "dilute" out this error over time.
 She uses BMR conditions, 12-hour fast, six-hour rest, no recent movement
 She has quite a range of body comp realities in these patients (BMI of 15.75 = severe malnutrition, BMI of 39.15 almost severe obesity). Since she measured LBM by DXA, most investigators would "normalize" the BMR to LBM to reduce the impact of metabolically active tissue on the measure.
 She also has quite the range in age, considering 55 to be elderly up to 87 years. This alone may add diversity to the measures."
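The doubling interpretation mentioned in the review comments above can be checked with simple arithmetic. This is a sketch of the reviewers' suggested reading, not a confirmed account of the thesis calculation: the function name is hypothetical, the multiplier of 2 follows the "doubled" wording, and the conventional normal multiplier of 1.96 is shown only for comparison. The 211kcal per day SEM is the single-day, 20-minute value reported in this worksheet.

```python
def chance_difference_95(sem_kcal_per_day, multiplier=2.0):
    """Approximate 95% bound on chance variation as a multiple of the SEM."""
    return multiplier * sem_kcal_per_day

# 20-minute sampling duration, SEM of about 211 kcal per day.
doubled_20_min = chance_difference_95(211)        # 422.0
normal_20_min = chance_difference_95(211, 1.96)   # about 413.6

# SEM implied for five minutes if 455 kcal is a doubled SEM, as anticipated.
implied_sem_five_min = 455 / 2                    # 227.5
```

Doubling 211kcal per day gives 422kcal, within rounding of the 421kcal quoted from the thesis, which is consistent with the reviewers' reading but does not confirm it.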
Quality Criteria Checklist: Primary Research


Relevance Questions  
1.  Would implementing the studied intervention or procedure (if found successful) result in improved outcomes for the patients/clients/population group? (Not Applicable for some epidemiological studies)  Yes  
2.  Did the authors study an outcome (dependent variable) or topic that the patients/clients/population group would care about?  Yes  
3.  Is the focus of the intervention or procedure (independent variable) or topic of study a common issue of concern to dietetics practice?  Yes  
4.  Is the intervention or procedure feasible? (NA for some epidemiological studies)  Yes  
Validity Questions  
1.  Was the research question clearly stated?  Yes  
1.1.  Was (were) the specific intervention(s) or procedure(s) [independent variable(s)] identified?  N/A  
1.2.  Was (were) the outcome(s) [dependent variable(s)] clearly indicated?  N/A  
1.3.  Were the target population and setting specified?  N/A  
2.  Was the selection of study subjects/patients free from bias?  Yes  
2.1.  Were inclusion/exclusion criteria specified (e.g., risk, point in disease progression, diagnostic or prognosis criteria), and with sufficient detail and without omitting criteria critical to the study?  N/A  
2.2.  Were criteria applied equally to all study groups?  N/A  
2.3.  Were health, demographics, and other characteristics of subjects described?  N/A  
2.4.  Were the subjects/patients a representative sample of the relevant population?  N/A  
3.  Were study groups comparable?  Yes  
3.1.  Was the method of assigning subjects/patients to groups described and unbiased? (Method of randomization identified if RCT)  N/A  
3.2.  Were distribution of disease status, prognostic factors, and other factors (e.g., demographics) similar across study groups at baseline?  N/A  
3.3.  Were concurrent controls or comparisons used? (Concurrent preferred over historical control or comparison groups.)  N/A  
3.4.  If cohort study or cross-sectional study, were groups comparable on important confounding factors and/or were pre-existing differences accounted for by using appropriate adjustments in statistical analysis?  N/A  
3.5.  If case control study, were potential confounding factors comparable for cases and controls? (If case series or trial with subjects serving as own control, this criterion is not applicable.)  N/A  
3.6.  If diagnostic test, was there an independent blind comparison with an appropriate reference standard (e.g., "gold standard")?  N/A  
4.  Was method of handling withdrawals described?  Yes  
4.1.  Were follow-up methods described and the same for all groups?  N/A  
4.2.  Was the number, characteristics of withdrawals (i.e., dropouts, lost to follow up, attrition rate) and/or response rate (cross-sectional studies) described for each group? (Follow up goal for a strong study is 80%.)  N/A  
4.3.  Were all enrolled subjects/patients (in the original sample) accounted for?  N/A  
4.4.  Were reasons for withdrawals similar across groups?  N/A  
4.5.  If diagnostic test, was decision to perform reference test not dependent on results of test under study?  N/A  
5.  Was blinding used to prevent introduction of bias?  Yes  
5.1.  In intervention study, were subjects, clinicians/practitioners, and investigators blinded to treatment group, as appropriate?  N/A  
5.2.  Were data collectors blinded for outcomes assessment? (If outcome is measured using an objective test, such as a lab value, this criterion is assumed to be met.)  N/A  
5.3.  In cohort study or cross-sectional study, were measurements of outcomes and risk factors blinded?  N/A  
5.4.  In case control study, was case definition explicit and case ascertainment not influenced by exposure status?  N/A  
5.5.  In diagnostic study, were test results blinded to patient history and other test results?  N/A  
6.  Were intervention/therapeutic regimens/exposure factor or procedure and any comparison(s) described in detail? Were intervening factors described?  No  
6.1.  In RCT or other intervention trial, were protocols described for all regimens studied?  N/A  
6.2.  In observational study, were interventions, study settings, and clinicians/provider described?  N/A  
6.3.  Was the intensity and duration of the intervention or exposure factor sufficient to produce a meaningful effect?  N/A  
6.4.  Was the amount of exposure and, if relevant, subject/patient compliance measured?  N/A  
6.5.  Were co-interventions (e.g., ancillary treatments, other therapies) described?  N/A  
6.6.  Were extra or unplanned treatments described?  N/A  
6.7.  Was the information for 6.4, 6.5, and 6.6 assessed the same way for all groups?  N/A  
6.8.  In diagnostic study, were details of test administration and replication sufficient?  N/A  
7.  Were outcomes clearly defined and the measurements valid and reliable?  Yes  
7.1.  Were primary and secondary endpoints described and relevant to the question?  N/A  
7.2.  Were nutrition measures appropriate to question and outcomes of concern?  N/A  
7.3.  Was the period of follow-up long enough for important outcome(s) to occur?  N/A  
7.4.  Were the observations and measurements based on standard, valid, and reliable data collection instruments/tests/procedures?  N/A  
7.5.  Was the measurement of effect at an appropriate level of precision?  N/A  
7.6.  Were other factors accounted for (measured) that could affect outcomes?  N/A  
7.7.  Were the measurements conducted consistently across groups?  N/A  
8.  Was the statistical analysis appropriate for the study design and type of outcome indicators?  Yes  
8.1.  Were statistical analyses adequately described and the results reported appropriately?  N/A  
8.2.  Were correct statistical tests used and assumptions of test not violated?  N/A  
8.3.  Were statistics reported with levels of significance and/or confidence intervals?  N/A  
8.4.  Was "intent to treat" analysis of outcomes done (and as appropriate, was there an analysis of outcomes for those maximally exposed or a dose-response analysis)?  N/A  
8.5.  Were adequate adjustments made for effects of confounding factors that might have affected the outcomes (e.g., multivariate analyses)?  N/A  
8.6.  Was clinical significance as well as statistical significance reported?  N/A  
8.7.  If negative findings, was a power calculation reported to address type 2 error?  N/A  
9.  Are conclusions supported by results with biases and limitations taken into consideration?  No  
9.1.  Is there a discussion of findings?  N/A  
9.2.  Are biases and study limitations identified and discussed?  N/A  
10.  Is bias due to study's funding or sponsorship unlikely?  Yes  
10.1.  Were sources of funding and investigators' affiliations described?  N/A  
10.2.  Was the study free from apparent conflict of interest?  N/A  