A well-known benefit of electronic data collection is improved efficiency and data quality. But how exactly does that happen? This case study highlights some common challenges to collecting quality data, and describes how the CASCADE team, operating in rural Central America without external IT support, overcame them using a SurveyCTO electronic template.
The CASCADE project aims to help smallholder farmers adapt to climate change by evaluating adaptation strategies, including ecosystem-based adaptations, and by supporting the adoption of successful strategies. The project focuses on coffee and subsistence (maize and bean) farmers in three Central American countries: Costa Rica, Honduras, and Guatemala.
In order to fulfill the requirements of one objective of the project – the identification of practices implemented in the field – the CASCADE team developed a questionnaire to learn about farmers’ lives both on and off the farm. It included questions on household characteristics, perception of changes in the weather, and the impact of weather-related extreme events, among others.
Ultimately, the CASCADE project's goal is to disseminate information about farmers and their performance, and to provide training-for-trainers to support the implementation of best practices for a given context. Every aspect of the project therefore depended on reliable data, so it was important to collect high-quality data from the beginning.
Data collection spanned all three of the project’s target countries. At the start, the team identified a few major challenges:
- Large size: a sample of 900 smallholder farmers to locate and interview in rural areas.
- Complexity: a lengthy questionnaire (over 1,500 variables), with complicated logic (skip patterns, adaptive sections, consistency checks) that created interdependencies between different parts of the survey. In particular, the questionnaire featured many different questions depending on whether the household produced coffee beans or basic grains. The final questionnaire included ten modules on various topics, including the socio-economic status of the household, perceptions of climate change, and agricultural production.
- Need for flexibility: a sample covering three countries and two types of landscape within each (coffee vs. grains). Even after extensive piloting, the team expected to change the survey instrument as fieldwork rolled out across the different settings (for example, by tweaking translations and by adapting or extending questions).
Considering the challenges, the project team did some careful research and ultimately chose to use electronic data collection, with SurveyCTO as their platform. This ended up being a good choice for several reasons:
Enumerators can focus on the respondents
The team knew electronic data collection would be the best method for a complex questionnaire. Had the team chosen paper, enumerators would have needed to spend much of the survey time flipping through the questionnaire to follow the complex skip patterns. Not only did the electronic template reduce errors by automating this process, it also allowed the enumerators to stay engaged with the respondent for the duration of the interview. It freed enumerators to focus on the conversation, not on how to ask questions or which question to ask next. As a result, the interview was more like a conversation, and it was possible to collect more (and more accurate) information before either the enumerator or the respondent felt tired. For the CASCADE team, this factor alone justified the use of electronic methods.
The team retains full control
The data collection team chose SurveyCTO because the spreadsheet-based survey-design format was easy enough to learn quickly, but also powerful and flexible enough to meet the project’s needs. Rather than hiring an external programmer, the team used SurveyCTO to fully control both the development of the questionnaire itself and the implementation of the questionnaire on the device. This resulted in a template, and ultimately a final dataset, in the exact format envisioned by the team. It also empowered the team to make needed changes to the questionnaire in real time – such as tweaks to the language of questions as the team entered a new area – even after the survey had launched in the field.
Additionally, field validation and logical checks programmed into the survey helped to minimize enumerator error. The team was able to avoid the kinds of data problems that often plague data-collection efforts, such as:
- missing values because someone accidentally skipped a question (prevented using field requirements and built-in skip patterns that are automatically enforced at the time of interview);
- answers out of any reasonable range (e.g., household members reported to be 348 years old; prevented using on-the-spot field validation); or
- answers that do not make logical sense (e.g., selecting both "none" and another option in a multiple-choice question; again prevented using on-the-spot field validation).
The team found that the most common mistakes made during interviews could be easily detected and avoided in the field simply by implementing and refining field validation rules and logical consistency checks.
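To make the three kinds of checks above concrete, here is a minimal sketch of the logic in Python. It is not SurveyCTO form syntax and not the CASCADE team's actual rules: the function names, the age limit of 120, and the "none" option code are all assumptions chosen for illustration.

```python
from typing import Optional, Set

# Assumed upper bound for a plausible age; not taken from the CASCADE form.
MAX_PLAUSIBLE_AGE = 120


def check_required(value: Optional[str]) -> Optional[str]:
    """Return an error message if a required answer is missing or blank."""
    if value is None or value.strip() == "":
        return "This question requires an answer."
    return None


def check_age(age: int) -> Optional[str]:
    """Return an error message if the age is outside the plausible range."""
    if 0 <= age <= MAX_PLAUSIBLE_AGE:
        return None
    return f"Age {age} is outside the range 0-{MAX_PLAUSIBLE_AGE}."


def check_none_exclusive(selected: Set[str], none_code: str = "none") -> Optional[str]:
    """Return an error message if 'none' is selected alongside another option."""
    if none_code in selected and len(selected) > 1:
        return "'none' cannot be combined with other options."
    return None
```

Each function returns `None` when the answer passes and a message otherwise, which mirrors how an on-the-spot check would stop the enumerator before moving to the next question. For instance, `check_age(348)` returns a message, while `check_age(35)` returns `None`.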
Support staff can monitor and quality-check in real time
The data collection team developed a protocol to ensure data quality: at the end of each day in the field, there was a very quick meeting in which enumerators uploaded their completed surveys to the server. Then, in under five minutes, the field supervisor ran a process (a Stata do-file) that automatically created two separate backups of the data and a set of Excel tables with reports.
Some reports were intended to verify that the sampling strategy was working properly, using the geographic distribution of surveys. Others ensured that enumerators’ work was consistent (average number of surveys, statistics on duration, etc.). Finally, staff at CATIE headquarters used some reports to monitor the distributions of key variables.
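The enumerator-consistency reports described above can be sketched in a few lines. The team's actual process was a Stata do-file that is not reproduced here; the Python below is only an illustration, and the record fields (`survey_id`, `enumerator`, `duration_min`) and the 30-minute threshold are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def enumerator_summary(records):
    """Group submissions by enumerator; report survey count and mean duration."""
    durations = defaultdict(list)
    for rec in records:
        durations[rec["enumerator"]].append(rec["duration_min"])
    return {
        name: {"surveys": len(vals), "mean_duration_min": mean(vals)}
        for name, vals in durations.items()
    }


def flag_short_interviews(records, min_minutes=30):
    """List the IDs of surveys completed faster than an assumed minimum time."""
    return [r["survey_id"] for r in records if r["duration_min"] < min_minutes]


# Hypothetical submissions for one day of fieldwork.
day_records = [
    {"survey_id": "A1", "enumerator": "ana", "duration_min": 60},
    {"survey_id": "A2", "enumerator": "ana", "duration_min": 80},
    {"survey_id": "B1", "enumerator": "luis", "duration_min": 20},
]

summary = enumerator_summary(day_records)
too_short = flag_short_interviews(day_records)
```

A supervisor could scan `summary` for enumerators whose counts or durations stand out, and follow up on the surveys listed in `too_short` before the team returns to the field.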
As the field team was generally exhausted after a day in the (often hot and sunny) field, it was critical for the daily review process to be quick. Because of the well-designed export and reporting system, enumerators could usually be released within ten minutes. If any issue was detected, the supervisor dealt with it in the moment, with technical support from CASCADE’s office in Costa Rica. When the issue required a change in the survey, technical staff at the office (a member of the research team, not an external programmer) worked overnight to have an updated version ready for the next day.
The process is streamlined end-to-end
Simply removing the need for data entry (typing paper surveys into a template after data collection) reduced human error significantly. As an added bonus, the electronic data collection system produced data that was not only of higher quality but also ready to use much faster.
In sum, electronic data collection allowed the project team to meet the data quality challenges of a complex questionnaire administered to samples from different populations, with different requirements in language and content. Pleased with the success of their data collection effort, the CASCADE team has gone back to the field to collect more detailed information from some of the farmers interviewed in this round, again using SurveyCTO. The success of CASCADE’s data collection has spurred several other researchers to use SurveyCTO in their projects in Latin America, including doctoral students completing fieldwork for their theses.
Tips from the Field
Finally, we leave you with some tips from Tabaré Capitán, a member of the CASCADE team, who worked extensively with SurveyCTO. Tabaré says:
- "Even with all the control you get by using SurveyCTO, do not underestimate the back-end work after completing data collection. Our survey ended up having 10 different versions, the result of variations in language, context (different units or options), and adjustments based on lessons from the field. The work of combining everything can be a challenging task that mostly relies on the researchers’ expert opinion on what can be compared between versions. Given that, it’s critical to keep an exhaustive record of every single change between versions and every decision made to create a combined dataset with all the versions of your instrument." (SurveyCTO v1.40 has since made revisions and version tracking systematically easier, but the role of good record-keeping and expert opinion in interpreting data from multiple versions cannot be overstated.)
- "As in everything in life, balance is important for success. It’s great to be able to control every possible answer; however, it’s necessary to recognize that real-life people are not 100% accurate. For example, we asked for the total area of the farms in one section, and we also asked for the area of each piece (different crops) of the farm. The temptation to control that the sum of the pieces should be equal to the total farm area was there all the time; however, people sometimes think about the area of a piece of the farm in a different unit than another piece, or the total area (which is only one potential factor for the sum to be different than the reported total). If your survey form is 100% strict, you might not be able to capture real-life people’s information! However, there’s a difference between no control and 100% strict. For example, our compromise in this case was that the sum of the pieces must be close to the reported total area."
- "It’s true that great organizations like IPA and the World Bank are leading amazing research projects with SurveyCTO, but do not be intimidated by that. SurveyCTO is easy enough to collect data anywhere in the world with the same quality standards of those great research institutions."
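The compromise in the second tip, accepting plot areas whose sum is close to the reported total rather than exactly equal to it, can be expressed as a tolerance check. The sketch below is an illustration only: the function name and the 10% relative tolerance are assumptions, since the source does not say how close the team required the two figures to be.

```python
def areas_consistent(total_area, piece_areas, rel_tolerance=0.10):
    """Return True if the plot areas sum to within a tolerance of the total.

    rel_tolerance is an assumed value (10% of the reported total).
    """
    if total_area <= 0:
        return False  # a farm must report a positive total area
    difference = abs(sum(piece_areas) - total_area)
    return difference <= rel_tolerance * total_area
```

With these assumed settings, a farm reporting 10 units in total and plots of 4 and 5.5 units would pass, while plots of 2 and 3 units would prompt the enumerator to double-check the figures with the respondent.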
* The CASCADE project is part of the International Climate Initiative (IKI). The German Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety (BMUB) supports this initiative on the basis of a decision adopted by the German Bundestag.
** Photo credit: MAN, EfD.
- User: Conservation International (CI) and the Tropical Agricultural Research and Higher Education Center (CATIE)
- Project: Ecosystem-based Adaptation for Smallholder Subsistence and Coffee Farming Communities in Central America (CASCADE)*
- Scope: 900 rural households in Central America (Costa Rica, Honduras, and Guatemala)