Work with SurveyCTO data in R using the rsurveycto package

Two SurveyCTO users demonstrate how to use the new rsurveycto package as a prelude to statistical analysis and data visualization.

This post was written by Jake Hughey and Rob On, core team members of the Agency Fund, a philanthropic venture working to expand human agency. Jake and Rob are SurveyCTO users, R enthusiasts, and developers of the rsurveycto package.

Why combine R and SurveyCTO? 

If you’re reading this blog, you might already know SurveyCTO is an incredibly effective tool for mobile data collection. At the Agency Fund, a number of our partner organizations use SurveyCTO to collect and manage their program data. This data is vital for making evidence-based decisions on what’s working (and should be scaled) and what needs more attention. Importantly, SurveyCTO’s REST API makes this data accessible for automated processing and analysis pipelines.

For Python users, our colleagues at IDinsight previously developed a wrapper around the REST API in the form of the pysurveycto package. But what about R users? The R programming environment and its rich ecosystem of packages are well-suited for all sorts of data-related tasks, from data cleaning to statistical analyses to visualization.

What has been missing is a simple way to interact with SurveyCTO data in R, which is why we developed the rsurveycto package.

New to SurveyCTO? Explore the platform for free with a 15-day trial or request a demo.

How does the rsurveycto package work?

The rsurveycto package allows R users to easily pull data from, and even push data to, a SurveyCTO server. The rsurveycto package relies on SurveyCTO’s REST API, but abstracts away the dreary details. To get a sense of what’s possible with R and SurveyCTO, let’s see the package in action.

What can you do with rsurveycto?

First, we load the package and authenticate to a SurveyCTO server. We recommend creating a text file containing the server name, user name, and password. For our example, let’s name this text file “scto_auth.txt”. (Note: make sure the user is assigned a role that has permission to download data and for which “Allow server API access” is enabled.) We also load the awesome data.table package, since rsurveycto makes heavy use of `data.table`s.

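```r
# Load data.table first, since rsurveycto returns data.tables.
library('data.table')
library('rsurveycto')

# Read the credentials from the text file and authenticate to the server.
auth = scto_auth('scto_auth.txt')
```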
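For reference, here is a minimal sketch of how the auth file might be created from R. We assume here that `scto_auth()` expects the server name on the first line, the user name on the second, and the password on the third; please check this against the reference documentation for your installed version. All values are placeholders, and the sketch writes to a temporary directory without contacting any server.

```r
# Placeholder credentials: replace with your own server name, user name,
# and password.
auth_lines = c('myserver', 'me@example.org', 'not-a-real-password')

# Write one value per line (assumed order: server name, user name, password).
auth_file = file.path(tempdir(), 'scto_auth.txt')
writeLines(auth_lines, auth_file)

# Restrict permissions so other users on the machine cannot read the file
# (this has limited effect on Windows).
Sys.chmod(auth_file, mode = '0600')

# Confirm the file contains the expected three lines.
print(length(readLines(auth_file)))

# With real credentials in place, authentication would then be:
# auth = rsurveycto::scto_auth(auth_file)
```

Whatever location you choose for the real file, it is worth keeping it out of version control, since it contains a password in plain text.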

Next, let’s read data from a form.

```r
form_submissions = scto_read(auth, 'my_form_id')
```

The `scto_read()` function understands the same options as the API, allowing you to specify a start date, a review status, and a private key for encrypted fields (more on this in our reference documentation). You can retrieve server datasets in the same way, e.g., `scto_read(auth, 'cases')`.
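As a rough sketch of those options in use, the example below requests submissions from a given date onward with particular review statuses. The argument names `start_date`, `review_status`, and `private_key`, and the accepted values, are our assumptions about the function's signature, so confirm them in the reference documentation. The form ID, date, and file names are hypothetical, and the server is only contacted if the package and the auth file are both present.

```r
# Hypothetical values; adjust to your own form and needs.
form_id = 'my_form_id'
start_date = '2023-01-01'
review_status = c('approved', 'pending')

# Only contact the server if the package and the auth file are available.
can_run = requireNamespace('rsurveycto', quietly = TRUE) &&
  file.exists('scto_auth.txt')

if (can_run) {
  auth = rsurveycto::scto_auth('scto_auth.txt')

  # Argument names below are assumed; see ?rsurveycto::scto_read.
  recent_submissions = rsurveycto::scto_read(
    auth, form_id,
    start_date = start_date,
    review_status = review_status)

  # For a form with encrypted fields, one would additionally pass the path
  # to the private key file, e.g., private_key = 'my_private_key.pem'.
  print(dim(recent_submissions))
}
```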

What if you want to know all the forms and datasets on the server? The rsurveycto package has you covered.

```r
catalog = scto_catalog(auth)
```

In fact, you can read in all forms and datasets in one go.

```r
forms_datasets = scto_read(auth)
```
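To give a feel for working with the result, here is a small sketch that assumes reading everything returns a named list of `data.table`s, one per form or dataset. The list below is a mock stand-in built by hand (its names and columns are hypothetical), so the code runs without a server.

```r
library('data.table')

# Mock stand-in for the result of scto_read(auth); names and columns are
# hypothetical.
forms_datasets = list(
  my_form_id = data.table(
    KEY = c('uuid:1', 'uuid:2', 'uuid:3'), age = c(31, 45, 28)),
  cases = data.table(
    id = c('c1', 'c2'), label = c('Household 1', 'Household 2')))

# Summarize each element: one row per form or dataset.
overview = data.table(
  id = names(forms_datasets),
  n_rows = vapply(forms_datasets, nrow, integer(1), USE.NAMES = FALSE),
  n_cols = vapply(forms_datasets, ncol, integer(1), USE.NAMES = FALSE))
print(overview)

# Extract a single form by its ID.
form_submissions = forms_datasets[['my_form_id']]
```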

Now we’re cooking with gas. From here, you have sundry options, including wrangling the data, merging the data with data from other sources, fitting statistical or predictive models, and making elegant and informative plots.
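As an illustration of a few of those options, the sketch below summarizes, merges, and models a small mock table. The data, column names, and village lookup are all hypothetical stand-ins for real form submissions, and the regression is only meant to show the mechanics.

```r
library('data.table')

# Mock stand-in for form submissions; all columns and values are hypothetical.
form_submissions = data.table(
  KEY = paste0('uuid:', 1:6),
  village_id = c('v1', 'v1', 'v2', 'v2', 'v3', 'v3'),
  household_size = c(4, 6, 3, 5, 7, 5),
  monthly_income = c(120, 150, 90, 130, 200, 160))

# Mock data from another source, with one row per village.
villages = data.table(
  village_id = c('v1', 'v2', 'v3'),
  treatment = c('control', 'treated', 'treated'))

# Wrangle: mean income and number of submissions per village.
village_summary = form_submissions[
  , .(mean_income = mean(monthly_income), n_submissions = .N),
  by = village_id]

# Merge: attach village-level information to each submission.
merged = merge(form_submissions, villages, by = 'village_id')

# Model: a simple linear regression of income on household size.
fit = lm(monthly_income ~ household_size, data = merged)

print(village_summary)
print(coef(fit))
```

From a table like `merged`, a plotting package of your choice could take over for visualization.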

Want to know more?

But wait, there’s more! The rsurveycto package can also fetch detailed metadata and form definitions, download file attachments, and even write to an existing server dataset. Best of all, in our opinion, the package is free, open source, and available on CRAN.

We hope the package is useful in R-based data processing, analysis, and visualization pipelines involving SurveyCTO data. Please try it. If you do, we’d welcome your feedback, especially suggestions for how to make it better, on the package’s GitHub repository.

Want to take advantage of the rsurveycto package for your survey data? Start your SurveyCTO journey with a free trial today or request a demo from our team.