Doing data projects on your own*

NICAR, Jacksonville, Fla.


Brent Jones, Data Visual Specialist

St. Louis Public Radio


*or mostly

The Plan

The Pitch

The Execution

I want to talk about the process: Planning, pitching and executing on data stories when you're by yourself. There are going to be a few tips and tools along the way, but this is going to be mostly conceptual — the process.

What data is available?

First, for planning out data projects, you have to learn what's available on your beat, or learn what's available in general if you're more of a generalist.

If sources use numbers, if reporters use numbers in their stories, ask where they got those. Somebody's collecting them. Find out who's regulating your beat, who's regulating the organizations in that story. Who's watching them to make sure they're doing what they say they're doing? Who's giving them money, and who are they giving money to? Those things are usually tracked, so figure out who's tracking them.

One example of this came up recently for us: there's a proposition coming up in St. Louis to regulate payday lending. One of the things it will regulate is where lenders can be located, so we wanted to see where they're already located.

I found out that the payday lenders have to register with the state, and so the state has a database with information on each one. We wound up with a nicely formatted CSV file with each business including their address.

Knowing that stuff exists and how to find it is pretty important.
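
As a rough illustration of what a first pass at a file like that could look like, here's a minimal Python sketch that counts businesses by ZIP code. The column names and the sample rows are made up for the example. They are not the state's actual format or data.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for a state licensing export.
# The column names and the businesses are invented for illustration.
SAMPLE_CSV = """name,address,city,zip
Example Loans A,100 Main St,St. Louis,63101
Example Loans B,200 Oak Ave,St. Louis,63101
Example Loans C,300 Elm St,St. Louis,63118
"""

def count_by_zip(csv_file):
    """Count how many rows fall in each ZIP code."""
    reader = csv.DictReader(csv_file)
    return Counter(row["zip"].strip() for row in reader)

counts = count_by_zip(io.StringIO(SAMPLE_CSV))
for zip_code, n in counts.most_common():
    print(zip_code, n)
```

With a real file you would pass an open file handle instead of the sample string, for example `open("lenders.csv", newline="")`, where the file name is again just a placeholder.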

What does your story really need?

Next you want to find out what your story needs as far as data goes.

Sometimes you don't need a chart. Sometimes you don't need a map. We've all seen the jokes about maps that are just population maps. Sometimes just using data in context in text is a great way of using it. That counts — that's using data.

You don't necessarily have to make a big fancy interactive to feel like you're being a data journalist. Just find the data, put it into the story. Give the readers context.

We want to present the data in the best way possible, and sometimes that's not hiding it behind a graphic, it's just presenting it clearly. Sometimes the interactive is the best way to go, but not always.

What can you do?

Finally, for planning, make an honest assessment of where your skills lie.

"Can you do what you're proposing to do?" is the first big question to ask. If you can't do it — if you have this great idea but it seems like it lies way beyond your skill level — that's fine, that's great that you had that idea.

There are still possibilities: Can you do part of it? Maybe it's worth taking the time to learn how to do it. You'll probably do it again sometime. And sometimes there are tools available to help.

For example, there are a lot of chart-building tools out there, so even if you haven't learned how to use d3 or another charting library, you can still use charts in your story.

The Pitch

Now we'll move on to pitching data stories. All bosses are different and I recognize that. If you're a freelancer, you have a whole different set of challenges too. Some of these tips may be more effective than others.

How long will it take?

What will you produce?

Underpromise and overdeliver. Do not confuse those two or you'll get in trouble real quick.

You want to figure out how long it takes and what you'll have at the end of it. I would much rather tell my boss that something will take a week and get it done in two days, than the other way around.

If you're trying something new, which I do frequently, I say to my boss and anyone else who's involved, "I haven't done anything like this before." I want to make that clear to them. I don't want to pretend like I know everything. I don't know a lot of things. Making that clear to them helps, because if we hit any bumps in the road, they come as less of a surprise.

Deadlines help me figure out how much I need to get done and what I need to focus on. If there's a deadline involved in your project it can be helpful in saying "Ok, I know I can have this part of it done by this time."

What's your backup plan?

Have a backup plan. This can involve working the data into the story instead of that nice interactive you were planning in case that falls apart in the middle.

That way you can still use the data. If you did some analysis you can still use that. Maybe instead of an interactive, you fall back to a static visual.

Whatever it is, you want to have a way to fail gracefully.

If you're in the middle of a project and it's not working out, it's far, far better to be able to go to your boss and say "Here's what we can do instead," rather than just say "I'm sorry, I can't do that."

Being able to at least do something is important, so plan for that. You obviously always want to go for your ideal, but if you can't get there, have a backup plan.

Who can help?

Finally, team up. There are other people who can help you.

One way to reduce the load is by sharing it. So if you're a reporter and you're also trying to handle data, maybe you find a story that fits with another reporter's beat and have them handle the interview process and writing the story, and you do the data analysis and the visual inside of it.

Reach out to the members of the NICAR community for help.

People enjoy helping out other people in this community. Find the people who can help you and use that in your pitch. Say, "I've talked to these people and here's how they're willing to help me on this project."

The Execution

What are you re-thinking?

Step one in execution, I think, is use templates, use stylesheets, use checklists. Don't re-think the same inconsequential things over and over again.

Save your thinking for the hard questions about an individual story, the things that differ about an individual story. Don't use it to decide what color the bar chart should be. You should already know that.

This is a palette that was developed for the website I worked for and these are the colors I use. I don't have to think about what color I'm going to use. These are the colors I use.

Even though there are a bunch of them here, I've narrowed the set of possible colors down to 14. And the templates that I use pick one for me, so I don't even have to pick one out of the 14.

Develop templates for commonly used graphics, if you do things over and over again. Once you do something, save your code so you can reuse it later.

Sometimes you can just swap out the data that is underlying the code, point it at a new CSV file for example, and you have a brand new functioning graphic. But even if you can't do that, if you have to modify the code a little bit, you're still way ahead of where you'd be if you were just starting from scratch.
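
To make that idea concrete, here's a minimal sketch of what a reusable template could look like in Python: one function that turns any CSV with a label column and a value column into a simple bar chart, and picks the color itself. The function name, the three hex colors, and the sample data are all placeholders I've made up for the example. This is not the palette or the template code from my own projects. It also assumes at least one row with a positive value.

```python
import csv
import io
from html import escape
from itertools import cycle

# Placeholder hex values, not an actual newsroom palette.
PALETTE = ["#1b6ca8", "#e07b39", "#4a9c5d"]
_next_color = cycle(PALETTE)

def bar_chart_svg(csv_file, label_col, value_col, bar_height=20, width=400):
    """Return a simple horizontal bar chart as an SVG string."""
    rows = [(r[label_col], float(r[value_col])) for r in csv.DictReader(csv_file)]
    color = next(_next_color)  # the template picks the color, not you
    biggest = max(value for _, value in rows)  # assumes a positive maximum
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{bar_height * len(rows)}">'
    ]
    for i, (label, value) in enumerate(rows):
        bar_width = width * value / biggest  # scale bars to the largest value
        parts.append(
            f'<rect x="0" y="{i * bar_height}" width="{bar_width:.0f}" '
            f'height="{bar_height - 2}" fill="{color}">'
            f'<title>{escape(label)}: {value:g}</title></rect>'
        )
    parts.append("</svg>")
    return "\n".join(parts)

# Swapping in new data means pointing the same function at a different CSV.
svg = bar_chart_svg(io.StringIO("label,value\nA,10\nB,5\n"), "label", "value")
print(svg)
```

The design choice here is that everything you'd otherwise re-decide each time (size, spacing, color) lives in the template, so the only thing that changes from story to story is the data.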

How will you do it?

Step two is learn your tools. That means two things: learn the universe of tools that are out there and what might work best in any situation, and learn how to use the ones that you choose.

Learn Excel. If you have Excel, learn how that works if you don't already know. There are a lot of things you can do with it. There are people who have spent their careers learning Excel and they're still learning new things.

One of the biggest benefits that I get from our community in NICAR is that all of you are finding, or in some cases making, tools that help me do my job better, and I wouldn't know about them if not for this community. So thank you for that, and please keep doing it. Post on Twitter when you find a cool thing that helps you.

Try new things with the tools that you're using. Figure out how to break them, why they broke and how to fix them. Figure out how to do things more efficiently using them. Master the tools that you're using.

What's next?

Finally, learn from what you've just done.

Whenever you use data, think about what went well with this process, what didn't work well, what could make you better the next time.

It's really easy for us to just finish a project and move on to the next thing, because that's kind of the business that we all work in, but it's really worth your time to stop and think. Particularly if it's a routine process, how long does it take and how much time could you save?

It's worth thinking about those things instead of just doing the same thing over and over because that's the way that you've done it. Stop and think: "Could I improve this? Could I make this better?"

It's almost always worth the time to stop and document a process. That's separate from improving it.

To document a process means that the next time you get that dataset, when it's updated next year or next month, you don't have to start from scratch. You don't have to remember "How did I do that thing," or "what was that SQL query I used" or "what steps did I take in Excel to get this final output". If you've documented it, if you've written it down, you can get to it much more quickly.

Even if you don't do the exact thing again, you might run into something similar.
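
One way to write a process down is to keep the steps and the query together in a small script, so the documentation is also the thing you run. Here's a minimal sketch using Python's built-in sqlite3 module. The table name, the columns, and the sample rows are hypothetical and only there to make the example run.

```python
import sqlite3

# Documented steps (hypothetical example):
# 1. Load the raw rows into a table called "licenses".
# 2. Keep only active licenses.
# 3. Count them by city, largest first.
QUERY = """
SELECT city, COUNT(*) AS n
FROM licenses
WHERE status = 'active'
GROUP BY city
ORDER BY n DESC, city
"""

def run_analysis(rows):
    """Load rows into an in-memory database and run the saved query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE licenses (name TEXT, city TEXT, status TEXT)")
    conn.executemany("INSERT INTO licenses VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

# Made-up sample rows standing in for this year's version of the dataset.
sample_rows = [
    ("Business A", "St. Louis", "active"),
    ("Business B", "St. Louis", "active"),
    ("Business C", "Columbia", "active"),
    ("Business D", "Columbia", "expired"),
]
result = run_analysis(sample_rows)
print(result)
```

When the updated dataset arrives, you load the new rows and rerun the script instead of trying to remember what you did last time.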

Who can help?

In closing I want to jump back a few slides to "Who can help?"

For me that's a really important question because even if you're learning on your own, you're most likely not learning on your own. You're reading tutorials that someone else made, you're finding answers to questions that other folks have already asked and answered, you're using tools that somebody else built.

One of the best things about NICAR has been learning from everybody here.

Sometimes it's asking questions on the listserv, emailing someone directly about a project that they did and asking how they did it, or talking with them for a few minutes here. Most people here, most of the time, are very, very generous with their time and their knowledge, so ask them. Take advantage of that.

Also, you can be one of those people, even if you're new here. Each one of you knows something that I don't know. Each one of you could teach me something.

That's important. Everybody can help share their knowledge. Please be one of those people because that makes all of us better at our jobs and it makes our profession healthier too.

It also makes us feel a little less alone.