How to edit data as seriously as we edit words
Session facilitator(s): Hannah Recht, Kate Rabinowitz
Day & Time: Thursday, 9:30-10:45am
KATE: Hello? I’ll start at a far distance from the mic. So you might have seen we have a tiny anonymous survey, three yes/no questions, that you can take at https://tinyurl.com/srccon-data. We also have paper copies if you don’t want to look at your screen at all. If you could just take a couple minutes to do that.
HANNAH: We’ll get started in a couple minutes. Just take a minute to fill that out. And if you have a paper copy, just return it to us. You can keep talking. It’ll be a few minutes. Don’t worry.
Hello, everybody! Hello! Good morning! For those of you just arriving, we have a really short survey that you can take at some point, as you get settled. We also have – if you don’t have any sort of device, we have some paper copies over there, and just return it when you’re done. It’s anonymous. It’s just three yes/no questions. So if you haven’t yet, take a minute to fill that out.
So welcome to our session! Thank you for coming out bright and early for the first SRCCON session. This is actually my first SRCCON ever. So welcome! And we hope you enjoy the session. We’re gonna talk today about data editing, and we’re gonna be doing a lot of sharing at our tables, so it’s exciting to see so many of you. I don’t think anyone is alone at a table. But definitely make sure you have a couple of people to talk with at your table.
My name is Hannah Recht, a data journalist at Bloomberg News, on the graphics team, but I come from a research background and worked in data analysis. And Kate comes from a very similar background as well. And one of the things that I think for both of us was a little shocking and eye opening coming into journalism was the difference around editing words and around editing data, especially when we compare it to how things go in a much slower-paced but very formalized data process, in a research world.
So today we’re gonna be talking about: What is our data process in a newsroom? I’m sure there are people here in giant teams who have really formalized processes for data editing. Some of you might be the only person in your newsroom who understands data, or working solo, a lot of people in between, and some people might be newsroom adjacent or working in other roles than newsrooms.
So I want to give a quick example. Think about… You’re a traditional reporter, in a regular reporting job, working on a story about, say, a state government. The transportation agency is in turmoil. Everyone thinks the boss is terrible. They’re gonna revolt.
I don’t know. The pay discrepancies. Whatever.
So you write your story. Your editor is working with you. All the way along. And before publication, you’re gonna do a pretty rigorous fact checking, right? You might have – in the comments, this is the source of this fact, this is the source of this fact. If you can’t verify where something comes from, your newsroom isn’t gonna let you publish it, and we have decades-old, even centuries-old processes for that. What does the equivalent look like if you’re doing a data story on that government agency? You’re gonna say… Okay. There’s a massive pay gap or they’re doing 45% more overtime hours than a similar agency. You’re ready to publish that story. What does the fact checking look like for data analysis? Maybe you did it in Excel, maybe you did it in R. You’re analyzing massive data or some small data that you collected yourself. Do you have a process for that? One that has the same level of rigor, a requirement that you can’t opt out of? With words, you can’t say… I don’t need an editor for my words today, and your editor doesn’t say… Yeah, I trust you, that it’s well written. That your facts are accurate. But sometimes that’s kind of where we get to in data journalism. Someone is definitely reading the words you write about data analysis, and maybe they’re sort of looking at… Oh, this data comes from this state agency, but are they actually looking into how you did the analysis? And making sure that they can replicate it themselves? So we’re gonna talk about some of those issues, and I’ll hand it over to Kate.
KATE: I’m gonna try to go micless. Hi. I’m Kate. I’m a graphics reporter at the Washington Post. And as Hannah mentioned, I also don’t really come from a journalism background. I previously did kind of research and data analysis. I’ve also worked in journalism, in a small newsroom, in a larger newsroom, and have done freelancing, and in all of those situations I’ve encountered totally different processes. Some places, it was the process where a person would entirely replicate the analysis that you did. In other places, it’s like… Mmm… Those numbers, they seem… Yes.
And when you’re working on your own, it’s your own challenge, because you don’t have someone. Like, looking over your shoulder for that. So we want to dive into all of that today. As Hannah said, a part of this – and really like a key to making this a successful gathering for us all – is that we’re able to talk openly and frankly about kind of what we see and what our experiences are. So there is a transcriber, by the way. This session is being transcribed. Which is fantastic. You can always say that what you are saying… Like, publicly to the room… Is off the record. If you prefer it that way. We are also going to use Chatham House rules for any group discussions that we have amongst tables. And so what that means is: After this session, you can go and you can say, like, generally… What experiences were. And what you kind of talked about, but you can’t attribute those things to a person or an organization.
So you can’t say, like, wow. Kate who works at the Washington Post said that she has the best editor in the world. You can’t say that. My editor is here.
But what you can say is that… Someone said they had a really great editor. Which… Is maybe not as interesting, but I think when we talk about data processes, there will be a lot more of that. We are now going to flip over to our results, to start. And I’m gonna try and make this split screen work. Is it gonna work? Okay. We’re just gonna use a not fancy… Sorry about this data viz, everyone. We didn’t have any time to put it together.
So the first question is: Do you think your organization has an established process for editing numbers in stories? That is an 80% no. When you analyze data, does someone review your work? For editors in the room, do you review numbers in the stories you’re editing? That is a 50% yes. So it’s not bad. It is not great, but it is not bad. And if somebody reviews your work, do they look over or recreate the steps you took to get to the end product? For editors, do you do this? And that is a 73% no. So there’s like… Some real gray space between reviewing work and kind of looking at the underlying process.
I’ll pause for a minute and just generally… Is this shocking to anyone? Cannot say it was shocking to me.
Okay. For our first exercise…
HANNAH: Yeah. So we’re gonna be doing a lot of talking in this session. So first we’re gonna talk at our tables.
KATE: I can do this. You can just talk.
HANNAH: So we’re gonna talk for ten minutes at our tables with whoever you’re seated next to. And we’re gonna talk about some of the challenges we’re facing in data journalism. Sorry. As technical journalists, we have problems with technology.
KATE: Split screen is very hard. But you’ll be using, amongst your groups, as a guide, the questions that we started with in the intro. And then some additional questions, which we’ll have up here in a second.
HANNAH: Yeah, so some of the guiding questions – and you’ll take these questions in whatever direction you want, but some of the things to think about is: What does that editing process look like? If you have a process, if you’re one of the 20% of people who said you had a process for editing numbers in your newsroom for data editing, what does that process look like? If you’re an editor, how do you approach a story that is really data-heavy, where the point of the story is: We did this data analysis. Here’s what we found. How do you approach a story like that? Do you have any concerns about it? Any fears? If you’re a reporter, do you have someone to turn to, with data skills? Is that your editor? Is it a peer? Maybe you’re in a local newsroom that’s within a network who has somebody else who maybe is also here, who you could meet, who you could turn to. Could you be that peer for someone else? Maybe you’re not an editor. Neither of us are. But maybe you could do that review of somebody else’s work. And do you see any challenges? So we’re just gonna take ten minutes to talk. So we’ll finish at 9:52. So get to know the people at your table. And when we’re done, we’re gonna ask each table to share something that they learned. So if you want to have somebody figure out one or two or a couple of the things you said, we’re gonna write that down just in general terms, to see as a group where are we with data editing. Okay.
Hello. Sorry, again. This is the two-minute warning. So in two minutes, we’re gonna go around and ask for each table to give a little baby synopsis on the state of data editing based on what you’ve talked about, and any challenges. Two minutes.
HANNAH: Hello! We’re wrapping up our time now. Hello, room. Yes, thank you. We’re gonna go around and ask each table to share something. And I’m sure there’s gonna be some overlap. So just say briefly what are some of the situations you discussed at your tables. We’ll start from that back corner. If you want to have somebody – please stand up. And it’s a big room, so speak as loudly as you can, please. Just for a second.
AUDIENCE: So they pointed at me and said “you’re reporting out!” So I think the highlight was Jennifer LaFleur, who used to work at Reveal, talked about their process – it’s by far the most rigorous process I have ever heard. When they get story pitches from the outside or stories from the outside, they assign somebody internally to essentially redo the analysis and backcheck the analysis. They don’t publish anything unless they can do that and verify the provenance of the data and the work and the calculations. Internally, they assign someone on the data team to fact check every project, and essentially redo the work and verify the work for the numbers that are gonna end up in the project. Have I accurately more or less…
AUDIENCE: Yeah. Is Mike in here? No. Okay. That’s my experience. I’m not there anymore, but I think they’re continuing to do that.
AUDIENCE: That’s 100% more backstopping and checking than any team I’ve ever worked on.
HANNAH: Yeah, that’s amazing. So we’ll go to the middle back table now.
AUDIENCE: Yeah. So we talked about what it’s kind of like. Well, we also have… I lost my train of thought mid-sentence. No, we talked about what it’s like to fact check these stories on your own. And kind of not having a set process on how to do that. Turning to outside experts for a lot of help, and whether or not writing up your code is actually really helpful for your colleagues who have no idea what it means, versus just kind of writing up your analysis, just in a regular document and sharing it with them. And then: How can you work on verifying your reporting when there’s no one there who could do it for you? Yeah.
HANNAH: Great. We’ll go to that table.
AUDIENCE: Okay. So we observed there was a real inconsistency in the amount of transparency in data editing at our organizations. Some of our places, it’s sort of a black box. So we do a story. We have a lot of numbers in it. It goes to the editor. The editor might have some questions, but otherwise it’ll get a stamp of approval, and we don’t know what level of scrutiny it went through. Some of our other organizations go a lot – many steps further. Not quite Jennifer LaFleur area, but it was definitely – we assign somebody who has the skills to, if there’s time, replicate the analysis independently and compare the conclusions, other times, at least do a line edit of the code done to do the analysis. But it’s interesting, the discrepancies between some places where it’s just a magical stamp of approval, and you hope for the best. Which doesn’t seem right. And others… More scrutiny is applied.
HANNAH: I really like that magical stamp of approval analogy. I think that’s a pretty common experience. Yeah, this looks good, and you’re smart. It’s probably right. So we’ll continue just snaking through the room. Next table?
AUDIENCE: I took some notes, so I can summarize, I think. So two things stood out to me from the conversation at this table. One was: Kai mentioned the interns come to him, but he doesn’t have anybody to go to, with sort of data editing questions. And I think that’s frequently a problem. Especially in smaller newsrooms. And Megan from Associated Press talked about the way that they sort of approach the editing process – is kind of a peer relationship with two analysts kind of backchecking each other, and with defined check points in the process. One being in the very, very beginning, which I think seems very smart.
HANNAH: The next table?
AUDIENCE: We talked mainly about how… The ideal world would be that an assignment editor and copy editor would be reviewing the findings in the story at the same time as they’re reviewing the words, and how could we better do that process? And I talked a little bit about… I’m trying to work on that in my newsroom. And I’ve come up with a series of questions that – and I realize as I wrote the questions out – that I’m treating it a lot like I would an anonymous source. Like, an editor is talking to the reporter about: Is this source reliable? What does this source know and not know? And what questions did you ask? What questions could you not ask? So things like that. That might help the editor suss out any potential problems in the findings. Some editors have talked about a data team backstopping the reporter and their findings, and we need that, but we also need editors questioning what they’re actually saying in the story and how they’re saying it and things like that.
AUDIENCE: None of us at this table had ever been part of a team with a rigorous data editing process, and pretty much none of us had a coworker or editor capable of doing that. Just to focus on one thing – because I think a lot of repetition is gonna come up – we had differing responses, two very opposite responses, in terms of how the editors handled a data-based or code-based story you provide them. Some editors will assume you did it right and say great. This looks good. And just copy edit your prose and send it in. And then there’s another group of editors who are extremely skeptical of it, because they don’t understand it, and they don’t trust it. Laura talked about how one editor killed one of her stories because it was too close to deadline and they didn’t have time to satisfy their suspicions about the numbers.
HANNAH: Oh, that’s sad. Hopefully coming out of today we’ll have strategies to make sure that doesn’t happen again.
AUDIENCE: Most of us at least have never been part of a newsroom that had a prose fact checking bureau either. Editors will sometimes ask questions about the facts in your story, but not a rigorous check-every-fact-in-your-story setup like the magazine-style fact checking.
HANNAH: Okay. This table here?
AUDIENCE: Okay. We talked a lot about the fact that I don’t think any of our newsrooms have really a rigorous data fact checking process in place. That we have to deal a lot with people coming to us and sometimes turning them away, saying… No, we can’t do this, because it goes against the… We can’t do this map for you, because it bends the rules of what is actually possible for this data visualization in the first place. Let alone having someone come behind us and actually check our analysis. So it’s sort of this balance between service desk and actually having someone to check our own work.
HANNAH: Yeah. That’s a big one. The service desk mentality.
AUDIENCE: There’s also a factor of triage. Bigger stories get more attention than smaller stories. And I think that’s the case with… No matter whether it’s a story that has numbers in it or a story that doesn’t.
HANNAH: For sure. The middle table?
AUDIENCE: So we talked a lot about the challenges of trying to implement any kind of data fact checking, starting with: Where do you actually start the fact checking? Do you start it before the data is sanitized, or at what point does it make sense to have someone check in? Because if you’re sanitizing it wrong, that could introduce a number of errors. And then also we talked about skill sets required, so if someone is doing the analysis in SQL or R and you don’t know SQL or R, and the editors don’t know SQL, who is responsible for that? So there just were a lot of complications beyond not even having defined roles for who should – like, the defined process for how that should go. How do you even get the skill sets required to do it, and how do you have the time to do it, if a story is on a deadline? And those kinds of questions.
HANNAH: The next table?
AUDIENCE: Yeah, definitely going along with that, we talked about how a lot of times there’s not someone above you that can edit you. That has those same skill sets. And we have this idea that, like, does there need to be a screw-up, a big screw-up, to make people realize that there needs to be these processes in place? But then, on the other hand, there is someone who works on a tech team that collaborates with the journalism team, and they were just talking about how it’s so important to have open, constant communication between teams, even if the other team might not understand what one team is doing. So, for example, putting all of the code and discussion about that in one Slack channel, and giving the journalism non-data team access to that, even if they’re not gonna understand all of it. They’ll get the general gist. And sort of still be involved in the process, to have that transparency.
HANNAH: This first table.
AUDIENCE: We talked about different skill sets also, in the newsroom, and maybe editors not being able to… Because they don’t know those tools. Not being able to reproduce all your work. We also talked about them maybe trusting sources without questioning them, because of that. And we talked about how if you’re able to collaborate, how early in the process as well you collaborate. And maybe some people in the team know these sources better than others. They can give you an opinion on it immediately. And we always try to document. So that’s a way to reproduce it. But we also talked about tools, like… What do you use? We have different skill sets on teams. Yeah.
HANNAH: Next table.
AUDIENCE: Yeah. We talked about kind of the different sizes of our organizations. Small, medium, to large. And also our different roles in that organization. Either on the product side or as a journalist. Or working with editors or journalists. There’s kind of different ways to approach things, based on the team size. Skill sets are the big one. I made the metaphor of journalists being like detectives and data journalists being like forensics. There’s kind of a spectrum of skills. Sometimes people are in the middle. Sometimes people are on the end. And it really depends on the organization.
HANNAH: Yeah. And this last table?
AUDIENCE: We talked about distributing the data edit amongst different members of the newsroom. We discussed how documentation is really important, using a data memo or data diary, and also the different platforms or technologies that people may be using, and how that’s a double edged sword. Both a challenge and a bit more rigorous, when someone can reproduce someone else’s work, using a different methodology. That that actually improves the bullet proofing process.
HANNAH: Great, yeah. There’s a lot of themes in common there, and a lot of the ones that, in planning this session, we had discussed. There’s technical issues. Is there a shared language between people? Is there even another person who maybe doesn’t even know the language that well, who could maybe understand the code? Or is there really nobody in the newsroom? And how do you move forward if your editor doesn’t understand data analysis at all? If you don’t have a peer to do that work with?
So we’re gonna go into another exercise that Kate is gonna introduce. And I want you to keep those themes in mind and think about: What are some of those challenges? And we’re gonna talk about some of the solutions for them.
KATE: Okay. So we talked about some challenges. Now we’re going to go into, like, a near future world. You do not work at the place you currently work. You’re going to be working somewhere else. And you’re also going to be moving tables, because we’ve all been sitting for quite some time now. So I will introduce the places and their challenges. In this near future world, we have also colonized outer space.
So you’ll see… The first newspaper is the Deimos Daily. And they have data editing. But it is largely ad hoc, and it relies on informal connections. So I know Stephen – Stephen is a friend of mine. And I know if I have a data story, even though he’s swamped, he’s like… Really into data editing. And so he will, like, look at my story. But maybe Stephen is sick. Or is on vacation. Or maybe Stephen moved to Mars. So yeah. So how do you take something that is ad hoc and kind of purely relies on the connections that you have, and turn it into a process? That will be these first three tables. You do not all have to talk across tables. But if you want to move to Deimos Daily, you will be in this row.
There is the Slim Sentinel. Because also in this future world, more billionaires own newspapers. So the Slim Sentinel has a data team, and the data team all use different languages and styles. Some people use Python. Some people use R. Some people are fanatic about Go. So how do you check data when everyone has their own process? Everyone has their own style? And let’s say they’re not super great about documenting it, because it’s code and you’re a coder. So Slim Sentinel, right along here.
So some billionaires’ dreams have come true. And this row is the Musk Mars Mailer. In this one, they like to move fast. Management is not interested in allocating time or resources to data editing. How would you convince management that this is an important part of the process? Or maybe you would like to go rogue and set up your own thing? TBD. That is this row. Okay. We have five. So we’re actually gonna split this. So this table area is the Kardashian Kaller. Because they’ve moved into journalism. And there are… There’s a small data team. But there’s no data expertise in management. So editors look at stories. And they see the words. The words are good. They see the numbers. And they know that you’re very smart. So that is good enough. What do we do at the Kardashian Kaller?
And then we go to the Ganymede Gazette, where there is only one data reporter on all of Ganymede. What do you do if you’re the one reporter? So we have our new future, our newsrooms. We have a lone data reporter. We have no data expertise in management. We have… Management does not care for your data editing. We have… There’s a data team. They do all the things in all the different ways. And we have data editing that is ad hoc and just relies on the friends that you have. That may or may not stick around. So now we’re gonna get up. We’re actually gonna move our bodies. And you’re gonna go to any given spot. And talk through some of the challenges.
KATE: This is a five minute warning! In five minutes, we will be asking for your team to share a little strategy or two or a little frustration or two. We ask that if you were the speaker last time, you are not the speaker this time. Even though you’re in different groups. Five minutes.
KATE: Okay, everyone. We’re gonna do a roundup. Hello. Yes, okay. So we’re gonna start in the front this time. Deimos Daily number one. How are you feeling about your non-data editing process?
AUDIENCE: You had a great idea. Okay. Fine. He talked about unit testing. So I’m not an expert in unit testing, but I will say… One of the things that can be done is learning more about that. It’s more of a software developer approach to code. Or data analysis. And I think a lot of us who learned ad hoc on the job – which I think a lot of newsroom developers have – don’t necessarily know some of the best practices. So starting somewhere like that, where you can – before having to bounce off somebody else, edit your own work. Now, it’s not foolproof. So also comment religiously. If we’re talking about something where we’re transforming raw data into something else, using some sort of code, or you talked about – we just got there. But do you want to say what you were just talking about, in terms of creating a big spreadsheet, with lots of steps?
AUDIENCE: Yeah. I probably should use (inaudible). But I find everyone understands Google spreadsheets, so I’ll make a spreadsheet with 30 tabs. And start with step one.
AUDIENCE: Sometimes I find if you’re doing something in Excel or Google spreadsheets or whatever, Google Sheets, the formulas will be pretty opaque. There’ll be formulas and formulas and formulas and formulas, which is one of the weaknesses of a review. But if you can do something like that, you can break it out and see the steps along the process. Just taking the time up front to comment, make things clear, whatever, allows for people at the backend to have more clarity. Does that make sense?
KATE: Yes. That is totally true. The more you comment, the better.
AUDIENCE: It’s a pain in the ass, but you’ve got to do it. Your future self will thank you.
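A minimal sketch of what the unit testing idea from this table could look like for a small analysis step. The function name and the numbers are hypothetical, chosen only to echo the "45% more overtime hours" example from the introduction; nothing here comes from a speaker's actual code.

```python
def percent_more(agency_hours, peer_hours):
    """Return how much larger agency_hours is than peer_hours, as a percent."""
    if peer_hours <= 0:
        # A zero or negative baseline makes the comparison meaningless.
        raise ValueError("peer_hours must be positive")
    return (agency_hours - peer_hours) / peer_hours * 100


def test_percent_more():
    # Hand-checked case (illustrative numbers): 145 vs. 100 hours is 45% more.
    assert round(percent_more(145, 100), 6) == 45.0
    # Equal hours means no difference.
    assert percent_more(80, 80) == 0.0
    # A zero baseline should fail loudly instead of returning nonsense.
    try:
        percent_more(10, 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a zero baseline")


test_percent_more()
```

The point is the one the speaker made: small, commented checks like these let you edit your own work before anyone else has to, though they are not foolproof.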
KATE: Yes. That’s what they always say. Number two? You have a speaker?
AUDIENCE: I’m not equipped to speak on this.
AUDIENCE: We talked a little bit about how… It sounds like in this newsroom there are people with these skills. It’s happening informally. So what would it look like to formalize that and have steps in the process where people could ask for help and make an effort to formalize that being a part of those people’s job, and so that it is understood that… Okay, these people can spend 5% or 10% of their time checking data, and there’s some way of signing up for them to check or some way of pairing people, so that it’s not reliant on relationships, and it’s being something on the side for them to do, but it is built into their job and anybody can access it, rather than just through friendships.
KATE: This is also very good, because a lot of times, data editing will be a side thing that you don’t necessarily account for in time. So if that’s considered part of your role and accounted for, I think that can help. Third table?
AUDIENCE: Is there any way you could repeat what the planet was?
KATE: Yes, important question. Can everybody in the back hear?
AUDIENCE: Sort of.
KATE: We have… We have… Wait. Okay. We have traveling mics. But so… For this table, it was about unit testing, and using more up-front ways to check your own code, as well as documentation. And if you’re working in something like Excel, where there are these – lots of nested formulas, all in a single row, that you can’t necessarily see, breaking that up through tabs and using things like Google Spreadsheets, which tend to be more non-data journalist friendly, than something like R.
And the second table spoke about formalizing the time spent on data editing. So a lot of times it can be a very informal process, where you do it for a friend, you do it for a coworker, but that’s not really a part of your job. So what if that was something that was formalized in terms of something like 5%, 10% of your time went towards checking someone else’s work. And now I’m gonna hand this off to the speaker at this table. You spoke last time? No! I’m sorry. You can’t talk. One of you three. Rosie, you have no choice.
AUDIENCE: Oh, one thing that we talked about – we also talked about formalizing time and getting the team to… Knowing when to put multiple people on a story and get them to pair. Was contracting things out. Like, if you’re asking somebody outside the newsroom to do work for you, your newsroom should be paying for it. I know that that involves management, and might not always go well. But it’s apparently a thing that has been done successfully and has developed some data editorial skills. And resulted in better processes.
KATE: Convincing management to spend money is always very good. And pair programming is a great thing to do generally, but also has a lot of editing and code editing benefits. Hello. Speaker?
AUDIENCE: I spoke last time.
AUDIENCE: Hi. So we talked… Our planet was… We were the people… The data teams that were doing every single thing. And that didn’t faze us too, too much. We decided that people could do analysis in whatever they chose, and sometimes there are benefits from that. If someone was going to remake their analysis in a different language, that actually can be beneficial for big projects. So we were cool with that, as long as there was significant documentation and annotation of what steps an analyst had taken to get to the final points. And then we talked a little bit about news apps, and how in that world, working in a bunch of different languages actually can be very problematic, in the sense that someone writes an app in Ruby and leaves and no one on the team knows Ruby and you can have an app that can die and is totally unsupported. So we decided on the news app side, there should be kind of a common language, and there should be some kind of central plan around working in a shared set of languages.
AUDIENCE: The set project structure idea?
AUDIENCE: And we talked a little bit about set project structures, which is something our team has worked on a lot in creating projects that look the same. So we have a tool that is gonna be Open Source soon, that creates the same set of folder structures for your projects and syncs to GitLab, syncs to S3, so everything looks the same from project to project, which makes it much easier to dive in and out of projects.
KATE: That sounds beautiful.
AUDIENCE: Soon to be released to the world.
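The set project structure idea can be sketched in a few lines. This is not the tool the speaker described, which had not been released at the time of the session; the folder names and the data diary template are assumptions, and the GitLab and S3 syncing is left out entirely.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: these folder names are assumptions for illustration.
STANDARD_FOLDERS = ["data/raw", "data/processed", "analysis", "docs", "output"]


def scaffold_project(root):
    """Create the same folder layout under root for every project."""
    root = Path(root)
    created = []
    for folder in STANDARD_FOLDERS:
        path = root / folder
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    # Start every project with an empty data diary so documentation has a home.
    diary = root / "docs" / "data-diary.md"
    if not diary.exists():
        diary.write_text("# Data diary\n\n- Source:\n- Steps taken:\n- Open questions:\n")
    return created


# Usage: build the layout in a throwaway directory and list what was made.
with tempfile.TemporaryDirectory() as tmp:
    for path in scaffold_project(tmp):
        print(path.relative_to(tmp))
```

When every project looks the same, a reviewer knows where to find the raw data, the analysis, and the notes without asking.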
KATE: This was a trick question, in a way. It didn’t have to be a bad thing that all of your people use different code. So it’s okay if you subverted it.
KATE: Thank you. I should note that we’ll be turning all of this into a blog post. So we’re taking notes right now. They may be tiny… And may be not as clean as they could be. So this will all be a blog post later. Any final thoughts?
AUDIENCE: Other than documenting and commenting code, we discussed, early on in a project, talking through your technology choices with the rest of the team. Making a set of questions, like, general questions that everyone can answer, no matter what language they’re using. So if you’re coming to the same result, that means you’re good to go. And we talked about doing graphics and data review the same way that we do code review. So not just applying it to code, but also for graphics as well.
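One way to picture the "set of questions everyone can answer" idea: each analyst records their answers to the agreed questions, whatever language they worked in, and a small script flags any disagreement. This is a sketch under assumptions; the function name, the question names, and the numbers are all invented for illustration.

```python
import math


def compare_findings(original, replication, rel_tol=1e-6):
    """Return (question, original_value, replication_value) for every mismatch."""
    mismatches = []
    for question in sorted(set(original) | set(replication)):
        a = original.get(question)
        b = replication.get(question)
        # A question answered by only one analyst counts as a mismatch too.
        if a is None or b is None or not math.isclose(a, b, rel_tol=rel_tol):
            mismatches.append((question, a, b))
    return mismatches


# Usage with made-up numbers: one analyst worked in R, the other in Python,
# and each typed their answers to the shared questions into a dictionary.
original = {"median_pay_gap": 12500.0, "pct_more_overtime": 45.0}
replication = {"median_pay_gap": 12500.0, "pct_more_overtime": 44.2}
for question, a, b in compare_findings(original, replication):
    print(f"Check {question}: original={a}, replication={b}")
```

An empty result does not prove the analysis is right, only that two independent passes agree, which is the "good to go" signal this table described.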
KATE: Thank you. I feel like a common theme here is that a common language amongst us all is English, so if we write more of it about our code, that would probably be helpful. Musk Mars Mailer. Musk Mars Mailer. Your management – not a fan of allocating time or resources to data editing. Is there a speaker amongst you?
AUDIENCE: So one of the things that we talked about is sort of, in a different way, kind of speaking in the language of management, which might be business goals, and often is. We did have an interesting discussion – I’m not sure if it was resolved – about whether or not that was selling out. I think from a lot of our real world experiences, bundling requests to management with business goals has been historically successful. We’ve talked about things like creating reusable platforms for interactives and doing data journalism and selling those tools as a way to formalize a process, but as we sort of started talking about it more, we kind of revealed that to some degree, we would all have to go rogue. Because it’s just sort of like… If you want something that management isn’t gonna give you, without trying to convince them, you’re gonna have to, to some degree, take it under control and figure out a process that you later have to sell to them. And then it sort of brought up questions like a lot of us have experienced, where some genius comes in, builds something amazing, in a way that nobody really understands, and doesn’t really make that knowledge accessible to other people, and so… There has to be, I think, a transition from going rogue and doing your own thing to then formalizing that into an infrastructure that people can later use and you can onboard people with.
KATE: Thank you. I think that’s right. That a lot of times to convince management, you kind of have to build something out, at least a little bit, before presenting. Speakers? Who spoke last time?
AUDIENCE: I’ll do it. So this was kind of tough for us. Mostly because management has clearly already put some sort of value into having data journalists in the first place, so trying to convince Elon Musk or otherwise that we also need data editing might be kind of difficult. So instead of pitching it as our desperate need, pitching it as their brilliant idea is probably gonna work better. So they’re probably already patting themselves on the back for making the investment into data journalism, so they should pat themselves on the back for creating an innovative new way to edit data journalism, even if it’s the exact same thing that everybody else is doing. So they can have focus groups and they can give it a fancy name. They can package it as a product. They can do all of this stuff to kind of take credit for something that you need on a daily basis.
KATE: Thank you. We truly love to innovate here at Musk Mars Mailer. Ganymede. The only data reporter on Ganymede. Who speaks for you?
AUDIENCE: Am I the spokesperson? The brilliant idea that Aaron had actually is for a new tool, which will be AR and machine learning and probably use… On the Blockchain… Called BS.gy. Because we’re on that planet. And it will be a tool so that you can actually, in a virtual Blockchain setting, walk through your analysis with someone in another place. Maybe on Mars if you guys are available. Because we don’t have someone readily available in our own newsroom. What else? What did I miss?
AUDIENCE: Training editors.
AUDIENCE: We talked a little bit too about how we can train otherwise non-technical editors to ask the right questions about data analysis from data reporters. Which is good editing in general, but it sounds like there are some resources out there to help with that. That don’t necessarily involve learning how to code on the editor’s side of things. And we talked a little bit about – walking through analysis with someone else from another newsroom, how do you connect otherwise isolated, lonely coders with each other? To the point where they can feel comfortable asking somebody for a couple of hours of time to keep them from being wrong with their story.
AUDIENCE: Did we get everything?
KATE: Thank you. And then finally, Kardashian Kaller. No expertise in management. What are you doing?
AUDIENCE: We actually had a pretty solid transition point. We spoke about conveying the value of data editing to management, and sort of getting at the fact… Well, if they’re good editors, they’re gonna know how to question facts, right? So David had the awesome idea of just sitting down and being like… Rip my story apart for 15 minutes. Be skeptical. Ask me questions. Be an adversary, for just a few minutes, to sort of prove this out, even if we can’t have someone with actual data expertise. Because everyone knows how to ask questions. Everyone knows how to interrogate a story. It’s kind of the same thing that Mary Jo said earlier about treating it as a source. And we also talked about that loop of getting them to interrogate your data, your presentation, all that kind of stuff – will theoretically lead to it being a better story, because that’s the same interrogation that you would expect the audience to do. The audience is asking questions like: What is the data? What does it do? All that kind of stuff, which is the same basic questions you would want your editor without data expertise to be asking before the story goes out there. So that’s kind of what we spent our time talking about. Even if you can’t… Originally we were like… Just hire someone with data expertise. But as was helpfully pointed out… There’s no money for that. So we can do it without money. So that was one of the ways.
AUDIENCE: The Kardashians have money.
KATE: Do they want to spend it on the newspaper, though? Thank you. These are all really great strategies. I think one more thing that I’ll add is really for any coder, but particularly lonely coders… There are lots of great resources, in terms of the ProPublica data…
HANNAH: On the next slide.
KATE: It’s on the next slide. The Quartz Bad Data Guide. If you’re not familiar with that, it’s fantastic. The ProPublica data bulletproofing, and the Times came out with a data training session that’s on GitHub, which might be helpful to you in terms of learning, and this will all be in a SRCCON blog post after the fact. But it also has a lot of tip sheets and things you should look out for. For instance, the Quartz Guide to Bad Data is just a lot of checks you can do. If you’re in Excel and you have 255 columns, you probably have more columns than that. But Excel cut it off. We have a few moments left. What I would like to do is… There were some people that answered yes, yes, yes on all the questions at the beginning. They have great processes. Everyone looks over it with a fine-toothed comb. If anyone is interested in sharing things that they feel that their newsroom does particularly well with data editing, maybe? Yes?
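The column cutoff check Kate describes could be automated along these lines. This is a minimal sketch under stated assumptions, not a recipe from the Quartz guide: the `truncation_warnings` helper is hypothetical, and the limit values are illustrative. They are the 255 columns mentioned in the session, plus the documented Excel worksheet limits of 256 columns and 65,536 rows for the legacy .xls format, and 16,384 columns and 1,048,576 rows for .xlsx.

```python
import csv
import io

# Illustrative limits (assumptions): 255 is the figure mentioned in the
# session; the others are Excel's documented worksheet limits.
SUSPICIOUS_COLUMN_COUNTS = {255, 256, 16384}
SUSPICIOUS_ROW_COUNTS = {65536, 1048576}


def truncation_warnings(csv_file):
    """Return warnings if a CSV's size lands exactly on a known software limit."""
    row_count = 0
    max_columns = 0
    for row in csv.reader(csv_file):
        row_count += 1  # counts every row, including any header row
        max_columns = max(max_columns, len(row))

    warnings = []
    if max_columns in SUSPICIOUS_COLUMN_COUNTS:
        warnings.append(f"{max_columns} columns: the export may have been cut off")
    if row_count in SUSPICIOUS_ROW_COUNTS:
        warnings.append(f"{row_count} rows: the export may have been cut off")
    return warnings


# Hypothetical in-memory files standing in for real exports.
wide_line = ",".join(f"c{i}" for i in range(255))
wide_warnings = truncation_warnings(io.StringIO("\n".join([wide_line] * 3)))
narrow_warnings = truncation_warnings(io.StringIO("a,b\n1,2\n"))

print(wide_warnings)
print(narrow_warnings)
```

A warning here does not prove anything was lost; it only says the file is exactly as big as a tool would allow, which is worth a question to whoever exported it.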
AUDIENCE: Ben Welsh, who is the data editor at the LA Times, does a weekly show and tell meeting, where he just goes down to the local brewery, and brings back a ton of beer, and they have a bunch of drinks and go over… And each of the data journalists goes around and picks apart every single story. I see it every week, and it works really well.
KATE: That sounds like a lot of fun, and fantastic.
AUDIENCE: It’s both fun and cool.
KATE: Yes. I would also add that not everybody drinks. So I hope it’s also a friendly environment for people. Okay. Cool. Any other tips and tricks? It can involve beer. It doesn’t have to.
AUDIENCE: It doesn’t involve beer. But something that worked for us was having one person… It was me… Who kept bothering people every single time it came up, when there was any sort of data. Because before I was there, there was no process. If the government came out with numbers, they were accepted, and they were not interrogated or otherwise. So it really took, over time, just showing up every single time to be present in that editorial process, to say: Hey, we have questions about this. We should answer them. And so it was time spent being present to make that change. My newsroom is not there yet, but it’s 80%.
KATE: Bothering people can be wildly effective. Are there any other tips or tricks?
AUDIENCE: So when you asked for that, I started to hear my CTO’s voice in my head, and he would be like… Iterate, iterate, iterate, iterate. You’re not just handing them – you get this data, you’re analyzing it, and then you’re not handing them a final package. Go through iterations. Be like… Okay. So what’s the first thing we want to do with the data? Do that. Give it to them quickly. Take a look at it. What do you think of it? What is good about it, what is not good about it? In doing that, you’re keeping them involved in that process of analyzing that data. And you’re actually building the literacy with them. You’re building communication. And you’re not just pretending like you know exactly what needs to come out of the data.
KATE: Thank you. I think we are at… We’re past time. So there is food now! I think! I think? Oh, but there’s a 30 minute break. Okay. There’s a 30-minute break. So please enjoy your break. Thank you for coming. If you have questions, comments, you want to come cry to us, or give us really good tips, we would enjoy those things. Thank you.