Another critical element of designing an excellent experience sampling study, alongside the others we’ve discussed in this series, is questionnaire design. An excellent sampling design streamlines the process, reducing participant burden while increasing motivation. The questionnaire itself should reduce the cognitive load required for participation and promote respondent understanding. In short, a well-designed questionnaire ensures that researchers are studying the construct(s) they are aiming for, in the desired timeframe and location.
There are two iterative stages of questionnaire design. The first is designing the questionnaire elements, such as question types, response options, length, and complexity. The second is testing and collecting feedback from pilot participants.
Ultimately, a well-designed questionnaire tailors questions to both capture real-time experiences and model the constructs of interest, while minimally disrupting participants’ daily lives.

Elements of Sampling Design Relevant to Participant Engagement
While more traditional survey questions can provide a good starting point for ESM data gathering, some adaptation is typically required to accommodate the methodological differences and practical constraints of Experience Sampling. For example, excessively wordy questions, complex text input, and overly long lists of response options all create a poor user experience, which increases burden and diminishes participant motivation, a key consideration for ESM studies (Väätäjä & Roto, 2010). Other elements related to participant engagement include relevance, wording, order, and length.
Evaluating Relevance
KU Leuven’s Open Handbook of Experience Sampling Methodology recommends these best practices for optimal formulation of ESM questions to minimize burden:
- Keep questions and response options short and direct. This improves the participant experience: the user interface is less cluttered, supporting quicker and more accurate participant understanding.
- Questions should be clear and intelligible to the target population. Question wording should not include jargon or language that is not accessible to all participants. The more accessible the language, the lower the cognitive burden.
- Where possible, avoid reflective questions. Reflective questions require participants to complete complex self-evaluations of connections between psychological states and behaviors. Instead, researchers should ask separate questions about the specific dimensions of the constructs they are interested in, and model associations through statistical testing rather than in the questions themselves.
- Question wording should remain relevant in all or most contexts. Due to the variable nature of ESM studies, questions should not be so specific to a particular context that a change in context would render the question irrelevant.
Note: researchers using qualitative methods that rely on free text should be mindful of the added burden that free-text questions can create and adjust their study design accordingly (Granfeldt & Gullberg, 2023). For example, to limit burden, researchers might reduce the number of notifications per day and/or the duration of the study.
Ensure that response options…
- Are clearly differentiable and do not overlap. Overlapping options create confusion and burden for participants.
- Cover all potential response cases to avoid participant frustration with the absence of relevant options. For example, researchers can provide “other,” “don’t know,” or “not applicable” options when pertinent.
Other key considerations for question wording may include time references and direct versus indirect language.
Time References
Another element to consider in sampling design is the use of time references. For more information on this issue, see the Open Handbook of Experience Sampling Methodology, particularly pages 74–77. To better capture the fluctuations of daily life experiences, researchers should be cognizant of the interval that participants are being asked to assess. Asking participants to recall information from many hours or days earlier, for example, imposes a greater cognitive load, which increases participant burden, and can interfere with the collection of momentary assessments by introducing retrospective bias. Studies interested in true momentary assessment should use present-oriented language, for example: “How impulsive are you feeling right now?” However, time interval references are often used in questions, such as:
- “In the last 30 minutes…”
- “In the last hour…”
- “Since the last ping…”
- “Since this morning…”
The reference interval is largely question-dependent. For example, if asking about sleep quality, it makes more sense to ask participants to recall the night before and rate their quality of sleep once per day. In contrast, questions related to fluctuations in sleepiness throughout the day will likely need to be asked more frequently and use present-oriented language. When evaluating time references in questions, recall bias is an important consideration.
Direct vs. Indirect Language
Researchers must also evaluate when it is more appropriate to assess a construct indirectly rather than directly. For example, when trying to assess frequency of intrusive thoughts, it may be better to inquire about recent thoughts rather than asking about intrusive thoughts directly, as this kind of direct question may trigger intrusive thoughts that could increase participant burden or potentially cause harm to participant wellbeing (Open Handbook of Experience Sampling Methodology, p. 78).
Evaluating Question Order
Question order is another critical component when drafting an ESM questionnaire. Predictable, consistent questionnaires, such as those that use fixed-order question delivery, can lower participant burden by allowing for quicker completion.
Generally, if a questionnaire includes a variety of items, randomization should be undertaken cautiously. Randomizing questions with different scales (e.g., unipolar vs. bipolar) or time references (e.g., “right now” vs. “an hour ago”) can increase participant confusion and error, leading to higher burden and lower data quality. Randomized questions, though they introduce higher burden, can also reduce bias in cases where earlier questions may bias subsequent answers (Open Handbook of Experience Sampling Methodology, pp. 81–82).
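For concreteness, here is a minimal sketch of block-constrained randomization, assuming hypothetical item IDs and groupings: item order is shuffled within each block, while blocks with different scales or time references are never interleaved.

```python
import random

# Hypothetical item blocks, grouped by response format and time reference.
# Only within-block order is shuffled; blocks themselves stay in a fixed
# order, so unipolar, bipolar, and retrospective items are never interleaved.
BLOCKS = [
    ["mood_happy_now", "mood_anxious_now", "mood_irritable_now"],  # unipolar, "right now"
    ["valence_now", "energy_now"],                                 # bipolar, "right now"
    ["social_contact_last_hour", "main_activity_last_hour"],       # retrospective
]

def randomized_order(blocks):
    """Shuffle item order within each block, keeping block order fixed."""
    order = []
    for block in blocks:
        shuffled = list(block)
        random.shuffle(shuffled)
        order.extend(shuffled)
    return order

print(randomized_order(BLOCKS))
```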
Evaluating Questionnaire Length
Another important consideration, questionnaire length, varies widely in the literature: meta-analyses report averages of 22.5 items (SD 18.6) and 24.6 items (SD 18.93) per questionnaire. It is generally agreed that assessments should be kept short, around 1–3 minutes in duration at most, to minimize disruption to participants’ daily lives and increase participant compliance (Kimhy et al., 2012; Morren et al., 2009). This is especially true when assessment frequency is high. Further, longer questionnaires are generally associated with lower compliance and lower data quality due to increased participant burden (Eisele et al., 2020).
Questionnaire length can be minimized by handling any conditional questions with display logic, branching, or triggered follow-ups. In the case of triggered follow-ups (e.g., a participant shares that they recently felt anxious, and a series of questions about that event follows), researchers should evaluate in pre-testing and piloting whether the triggered questions introduce higher burden by making survey length too great or too variable.
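As a rough sketch of this logic (the item IDs, the 1–7 scale, and the trigger threshold are all invented for illustration; in practice, most ESM platforms express this as display or skip logic in their survey builder):

```python
# Hypothetical base items shown at every ping.
BASE_ITEMS = ["mood_now", "stress_now", "anxiety_intensity"]

# Hypothetical follow-up items, shown only when a trigger condition is met.
ANXIETY_FOLLOWUPS = ["anxiety_event_context", "anxiety_event_coping"]

def triggered_items(responses):
    """Return any follow-up items to display, given responses to base items.

    Assumes anxiety_intensity is rated on a 1-7 scale; the threshold of 5
    is an invented example of a trigger condition.
    """
    followups = []
    if responses.get("anxiety_intensity", 0) >= 5:
        followups.extend(ANXIETY_FOLLOWUPS)
    return followups

# A ping with high reported anxiety triggers two extra questions;
# most pings stay short because the condition is rarely met.
print(triggered_items({"mood_now": 4, "stress_now": 3, "anxiety_intensity": 6}))
```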
Another method of reducing questionnaire length is to administer only 2–4 of the top-loading questions per assessment, or per domain within an assessment (Gabriel et al., 2019). This method can maintain reliability and validity so long as the top-loading questions adequately cover the desired domain(s). For example, see how Conner et al. (2017) implemented a shortened version of the Flourishing Scale in their daily survey.
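A minimal sketch of how such a shortlist might be derived, assuming invented factor loadings standing in for results from a prior validation study of the full scale:

```python
# Invented factor loadings, standing in for results from a prior
# validation study of the full scale.
LOADINGS = {
    "flourish_purpose":   0.82,
    "flourish_relations": 0.79,
    "flourish_engaged":   0.74,
    "flourish_optimism":  0.68,
    "flourish_respect":   0.61,
    "flourish_competent": 0.58,
}

def top_loading_items(loadings, k=3):
    """Select the k items with the highest loadings for the short ESM form."""
    return sorted(loadings, key=loadings.get, reverse=True)[:k]

print(top_loading_items(LOADINGS))
# -> ['flourish_purpose', 'flourish_relations', 'flourish_engaged']
```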
A third method to consider is to randomly display a varying subset of items at each assessment instance of a questionnaire, as opposed to displaying every item every time. While this slightly inflates standard errors, it is considered a valid way to reduce participant burden by cutting questionnaire length (Open Handbook of Experience Sampling Methodology, p. 83; Silvia et al., 2014).
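A minimal sketch of this planned-missingness approach, with a hypothetical 12-item pool and 6-item subsets:

```python
import random

# Hypothetical 12-item pool; each ping displays a random 6-item subset.
ITEM_POOL = [f"affect_item_{i}" for i in range(1, 13)]

def sample_items(pool, n_shown=6):
    """Randomly select a subset of items to display at a single ping."""
    return random.sample(pool, n_shown)

# Over many pings every item is measured, at the cost of slightly
# larger standard errors per item, while each individual ping stays short.
print(sample_items(ITEM_POOL))
```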
How Pilot Testing and Participant Feedback Improve Questionnaire Design
Piloting and pre-testing are essential for acquiring valid participant feedback. Even with a validated ESM instrument, participant feedback is critical for catching unexpected sources of participant burden. For example, a pilot test may uncover that the UI/UX of the ESM platform selected to administer the questionnaires is confusing or of poor quality. Pilot testing should ideally involve participants from the target population and incorporate a means for participants to provide feedback on aspects of questionnaire design. Some key questions to ask include:
- Was any wording within the questions unclear or confusing?
- Were the scales, categories, and classifications understandable?
- Were any response options insufficient, or were certain cases omitted?
- Were questions consistently relevant across participants and across the varied contexts of daily life?
Implementing a rigorous pilot testing process, along with a strong participant feedback loop, is vital for addressing potential issues and points of confusion. It is also a prime opportunity to test the user interface of candidate software, learn about available support services, and develop troubleshooting documentation and processes for study administrative staff. For more information and direction on pilot testing, see the Open Handbook of Experience Sampling Methodology, with particular attention to Chapter 3 and page 84.
Overall, a well-drafted sampling design will reduce participant burden while promoting adherence and comprehension. By attending to relevance, wording, order, and length, and by implementing a pilot test, researchers can draft excellent questionnaires that collect high-fidelity data with minimal disruption to participants’ daily lives.