6 ways to validate your survey responses and improve data quality

Did you know that SurveyCTO makes it possible to automatically validate survey responses at the point of data collection? You can do so by using constraints, a powerful tool for ensuring high-quality data.  

A constraint is a condition that is applied to a field that must be satisfied before the value entered into that field is accepted. So if a user’s entry or selection doesn’t meet the constraint, the user will have to correct it before continuing to the next field. This article presents a few examples of how you can use constraints to enhance data quality. For more examples, as well as information on how to build and test constraints on the SurveyCTO platform, read this multi-part series of Support Center articles.


1. Set ranges of possible responses

The most common use of a constraint is to restrict an integer or decimal field to a known range of possible answers. To build this or any constraint, you can use the constraint wizard in SurveyCTO’s online form designer for step-by-step guidance. You can also include an exception for cases where the respondent doesn’t know the value or refuses to answer. With a field capturing age, for example, you can set a credible range of 18 to 100, as well as an optional code (-99) that the user can enter to indicate that the age wasn’t provided. The constraint expression is as follows:

(. >= 18 and . <= 100) or . = -99

This expression requires that the answer be greater than or equal to 18 and less than or equal to 100, or it can be equal to -99. If the answer doesn’t meet this constraint, a default message will appear, alerting the user that the response is invalid. You can also create specific messages that provide more information to help the user understand and correct the error.
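If you want to reason about the boundary cases before building the form, you can mirror the rule in a few lines of Python. This is only a sketch of the logic described above (ages 18 through 100 inclusive, or the code -99); it is not SurveyCTO syntax, and the names NOT_PROVIDED and age_is_valid are hypothetical.

```python
# Hypothetical stand-in for the age rule described in the article.
NOT_PROVIDED = -99  # code the enumerator enters when the age wasn't provided


def age_is_valid(age: int) -> bool:
    """Accept ages from 18 through 100 inclusive, or the not-provided code."""
    return 18 <= age <= 100 or age == NOT_PROVIDED


# Check the edges of the range and the exception code.
for candidate in (17, 18, 100, 101, NOT_PROVIDED):
    print(candidate, age_is_valid(candidate))
```

Checking the values on either side of each boundary is a quick way to confirm that the rule treats 18 and 100 as valid and 17 and 101 as invalid.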

2. Confirm double entries

When a survey involves pre-assigned unique IDs, a common practice is to require that enumerators enter the ID twice to help avoid input errors. In the second field, you can add a constraint that requires that the entered ID is equal to the value in the first field. So if id_1 is the first ID field, the second field’s constraint would look like this:

. = ${id_1}

You can program any constraint message, such as “Does not match ID entered on the previous screen.” The same approach works for any other unique value that you want to make sure is entered correctly.

3. Set response lengths

Let’s say that you want to understand why people decided not to participate in your survey, so you’ve included a text field for enumerators to input the given reasons. If you want to ensure that your enumerators collect substantive answers, you can also require that the answers are at least a certain number of characters long. To do so, use the string-length function:

string-length(.) >= 20

This requires the captured value to be at least 20 characters long. Applying this function can help ensure that enumerators collect the information you need.
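To get a feel for how short 20 characters is, you can try the same length check in Python. This is a sketch, not SurveyCTO code: it assumes that Python's len() counts characters (including spaces) the same way the form does, and MIN_LENGTH and reason_is_long_enough are hypothetical names.

```python
# Hypothetical stand-in for a minimum-length rule on a text field.
MIN_LENGTH = 20


def reason_is_long_enough(text: str) -> bool:
    """Return True when the text has at least MIN_LENGTH characters."""
    return len(text) >= MIN_LENGTH


print(reason_is_long_enough("Too busy"))              # 8 characters
print(reason_is_long_enough("Respondent was busy."))  # exactly 20 characters
```

A length rule only checks the character count, so a short sentence can already satisfy it. Choose the threshold with that in mind.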

4. Refer to earlier numeric values

A great way to use constraints to improve the quality of your data is to logically constrain responses based on previous answers. For example, in the beginning of your survey you might ask for a respondent’s total income (using a field called total_income). Later in the survey, you could constrain questions about individual income streams (e.g. “How much did you earn just from X?”) using the answer given at the beginning. The amount earned from an individual income stream should always be less than or equal to their total income, so the constraint expression on questions about individual income streams would be the following:

. <= ${total_income}

You can accompany this constraint with the message, “Answer cannot be greater than the respondent’s total income.”  You can also create a hint that assists the enumerator in understanding the error, such as, “The respondent’s total income is ${total_income}.” Applying this type of constraint assists enumerators in flagging inconsistencies and inaccuracies.
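The same comparison can be sketched in Python to show how the check and its feedback fit together. The function name check_income_stream is hypothetical, and combining the message and the total into one string is an illustration only; in a form, these would be the constraint message and the hint.

```python
from typing import Optional


def check_income_stream(amount: float, total_income: float) -> Optional[str]:
    """Return None when the amount is acceptable, otherwise an error message."""
    if amount <= total_income:
        return None
    return (
        "Answer cannot be greater than the respondent's total income. "
        f"The respondent's total income is {total_income}."
    )


print(check_income_stream(500, 1200))   # within the total, so no message
print(check_income_stream(1500, 1200))  # exceeds the total, so a message
```

An amount equal to the total passes, which matches the "less than or equal to" rule: a respondent may earn all of their income from a single stream.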

5. Refer to earlier date values

Suppose you want to do the same with date fields. If you collected a respondent’s date of birth, any life event for this person should fall on or after that date. So if you ask for the date on which the person was inoculated, for example, that date should fall between their date of birth and the current date. The constraint expression to validate this is as follows:

. >= ${date_of_birth} and . <= today()

If you’re collecting information that should correspond to earlier date values in any way, this is a useful tool for vetting the accuracy of those responses.
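As a rough illustration of the date window, here is the same check in Python using the standard datetime module. It is a sketch under assumptions: event_date_is_valid is a hypothetical name, and the optional today argument exists only so that the check can be tried against a fixed date.

```python
from datetime import date
from typing import Optional


def event_date_is_valid(
    event: date, date_of_birth: date, today: Optional[date] = None
) -> bool:
    """Accept events on or after the date of birth and on or before today."""
    if today is None:
        today = date.today()
    return date_of_birth <= event <= today


dob = date(1990, 5, 17)
fixed_today = date(2019, 6, 1)  # fixed date so the example is repeatable
print(event_date_is_valid(date(1995, 3, 2), dob, fixed_today))   # inside the window
print(event_date_is_valid(date(1989, 1, 1), dob, fixed_today))   # before birth
print(event_date_is_valid(date(2019, 6, 2), dob, fixed_today))   # in the future
```

Both ends of the window are inclusive, so an event on the date of birth itself or on the current date is accepted.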

6. Refer to static values

In some cases, specific values that are static and not dependent on other values might be important logical constraints. Let’s assume that you want to know about your respondent’s average monthly crop yields only if they are over 50 kg. You can create a constraint expression so that any entered crop yield that violates this condition will be indicated as invalid, immediately alerting the enumerator. Such a constraint might look like this:

. >= 50

You can also do this with dates. Let’s assume that you ask respondents if they have been accepted for post-secondary study and you are only interested in cases where they have been accepted for enrollment before the end of 2019. You can create a constraint expression so that only enrollment dates equal to or earlier than December 31, 2019 and equal to or later than the current date are valid. Such a constraint might look like this:

. >= today() and . <= date('2019-12-31')

Keep in mind that you won’t be limited to whatever the current date is. You can modify today() by adding or subtracting any number of days. For example, to represent 30 days from today, you can use the following expression:

today() + 30
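If you want to see which calendar date a 30-day offset lands on, the arithmetic can be sketched in Python with timedelta. This illustrates the date math only and is not SurveyCTO syntax; days_from is a hypothetical helper, and the starting date is fixed so the result is repeatable.

```python
from datetime import date, timedelta


def days_from(start: date, offset_days: int) -> date:
    """Return the date that falls offset_days after start (negative goes back)."""
    return start + timedelta(days=offset_days)


start = date(2019, 12, 15)
print(days_from(start, 30))   # 2020-01-14
print(days_from(start, -30))  # 2019-11-15
```

Offsets count calendar days, so a 30-day window can cross a month or year boundary, as it does in this example.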

These are only a few of the many ways that you can use constraints to automatically verify survey responses. Constraints in SurveyCTO can make use of values stored in any field type, as well as any of the 50+ functions. Experiment with all the possibilities, and reach out to our stellar support team with any questions about using them in your project.

Chris Robert

Founder

Chris is the founder of SurveyCTO. He now serves as Director and Founder Emeritus, supporting Dobility in a variety of part-time capacities. Over the course of Dobility’s first 10 years, he held several positions, including CEO, CTO, and Head of Product.

Before founding Dobility, he was involved in a long-term project to evaluate the impacts of microfinance in South India; developed online curriculum for a program to promote the use of evidence in policy-making in Pakistan and India; and taught statistics and policy analysis at the Harvard Kennedy School. Before that, he co-founded and helped grow an internet technology consultancy and led technology efforts for the top provider of software and hardware for multi-user bulletin board systems (the online systems most prominent before the Internet).