This past weekend I was invited to run a two-and-a-half-day workshop for the Community College Undergraduate Research Initiative. CCURI (pronounced “curry”) is an NSF-funded program involving 50 community colleges, and the focus of this workshop was to engage a dozen CCURI professors in developing activities for their students to work with authentic scientific data.
I was joined in my efforts by my close friend and colleague Bill Finzer from the Concord Consortium. As the creator of Fathom and CODAP (the Common Online Data Analysis Platform), Bill would be the perfect partner in crime – for if we were going to be working with data, we’d need a tool for it, and CODAP is the best interface I’ve seen for taking a user from raw data to usable visualizations, quickly and with minimal training.
In the weeks leading up to the workshop, I reached out to the participating professors and asked them to think about which datasets they wanted to work with. When the workshop started, we had a menagerie of datasets – some big, some small; some collected by single individuals, and some provided by massive federal programs. We had data taken from fish tanks, soil, Superfund cleanup sites, and campgrounds in remote mountains, measuring the lengths and widths of goldfish, heavy metal concentrations in human hair, and the spread of the West Nile virus.
The objectives of the workshop were simple (if a bit ambitious). First, we wanted to get the various datasets into CODAP and have the professors explore the data. We wanted them to discover the stories the datasets had to tell and share those stories with the broader group. And then we asked them to think about how to build a learning module that would allow students to discover and tell those same data stories.
It is amazing how much you can learn from something that sounds, on the surface, so straightforward.
First, I was reminded that working with data can be really hard. Simple decisions like how to label columns, or whether a certain variable is continuous or categorical, can radically affect the kinds of visualizations and analyses you can carry out – and you often don’t realize that until it is too late.
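CODAP itself isn’t scripted, but the continuous-versus-categorical decision bites in any data tool. As a sketch in pandas, with hypothetical fish-tank measurements (the column names and values are made up for illustration), the same numeric column supports very different analyses depending on how you declare it:

```python
import pandas as pd

# Hypothetical goldfish measurements; "tank" is an ID number, not a quantity.
df = pd.DataFrame({
    "tank": [1, 1, 2, 2, 3, 3],
    "length_cm": [4.1, 4.3, 5.0, 5.2, 3.8, 4.0],
})

# Treated as continuous, "tank" invites meaningless statistics:
print(df["tank"].mean())  # the "average tank" is 2.0, which means nothing

# Declared categorical, it becomes a grouping variable instead:
df["tank"] = df["tank"].astype("category")
print(df.groupby("tank", observed=True)["length_cm"].mean())
```

The second print gives the per-tank mean lengths – the kind of comparison a scatter plot or box plot would actually be built on. Making that call late, after plots and summaries exist, is exactly the “until it is too late” problem.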
Second, some data stories are a lot more interesting than others. In some cases we realized that the most interesting stories were only implied by the data we had, and that we really needed some other kinds of data to flesh things out. In other cases, the datasets offered so many stories that we knew we needed to pick one or two good ones, and just focus on those – even if it meant dropping some data out.
Finally, and most importantly, I was reminded that in the “teaching with data” business, it’s all about the interface. When a bug made it difficult to upload files into CODAP, work ground to a halt. When someone realized that CODAP could automatically average daily data into monthly data, a task that could have taken hours was reduced to the work of a few mouse clicks. And when someone discovered that they could click from variable to variable, and watch data points in a scatter plot instantly flow from place to place, the data story they were trying to tell nearly told itself.
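The daily-to-monthly averaging that saved us hours in CODAP is, under the hood, a standard resampling operation. A minimal sketch in pandas, using made-up daily readings (the dates and values are invented for illustration), shows how small the underlying step is:

```python
import pandas as pd

# Made-up daily readings spanning two months.
days = pd.date_range("2016-06-01", "2016-07-31", freq="D")
daily = pd.DataFrame({"reading": range(len(days))}, index=days)

# Collapse daily rows into monthly means -- one line instead of hours of hand work.
monthly = daily["reading"].resample("MS").mean()
print(monthly)
```

The point of a good interface is that this one-liner becomes a couple of mouse clicks, with the result immediately visible in a plot.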
After two and a half days, we had several teachers who were ready to try their exercises with their students. We had others who were designing the protocols to collect more data, or searching the Web for other datasets that would help them tell their stories. People were excited. And tired. And I am so grateful to all of them for working so hard, and trusting Bill and me to lead them to create something worthwhile. I’m grateful to Heather Bock and Jim Hewlett for giving us the opportunity to present, and to Bill for joining me. I think we all learned a great deal.