Techniques such as behavior coding, respondent debriefing, interviewer debriefing, cognitive interviewing, and nonresponse analysis all provide information to help the questionnaire designer assess whether respondents understand questions as intended and whether they are able to provide adequate answers to them. However, with the possible exception of some types of respondent debriefing questions, these techniques do not actually measure question reliability. How well, then, do question evaluation techniques predict reliability and validity? Data reported by Belli and Lepkowski suggest that interviewer behaviors have little predictive value for response accuracy, though the evidence is somewhat more suggestive for respondent behaviors.

Recently, the U.S. Department of Agriculture's Food and Consumer Service fielded a new survey designed to measure the subjective experience of hunger in the United States. The Census Bureau helped develop the questionnaire using some of the evaluation methods listed above. In addition, we conducted a reinterview with a sample of households following the survey. In this paper, we compare the questionnaire evaluation results with the reinterview data to assess how well behavior coding predicts test-retest reliability.