This handout accompanies slides -- Developing a data mindset to improve stories every day -- taught by Brant Houston at Illinois NewsTrain on April 1, 2022. Houston is the Knight Chair in Investigative Reporting at the University of Illinois, where he oversees an online newsroom, CU-CitizenAccess.org. For more info on the News Leaders Association's NewsTrain, see https://www.newsleaders.org/newstrain.
Bringing a data mindset to your reporting - Brant Houston - Illinois NewsTrain 4.01.22
Bringing a data mindset to your reporting
Brant Houston| @branthouston| houstonb@illinois.edu
(573) 529-3581
Handout adapted from versions created by WBUR’s Todd Wallack and KPCC’s Aaron Mendelson.
Years of slides and handouts from NewsTrain sessions are available at the following link: bit.ly/2oGAOF2
Data journalism is now an integral part of all reporting
• Story tips. With simple data analysis (using a pivot table in a spreadsheet), you can see
patterns, trends and outliers that steer you into stories you would not otherwise see.
• Context. Using data analysis, you can see what is the most, what is the least and what is not reported.
• More relevant examples for stories: Using data analysis, you can pick the anecdote (the
particular person, place, agency) that reflects the pattern or trend.
• Visualization: It is now much easier to make charts, maps and social network analyses
(connecting the dots).
• Your own data library: Datasets (downloaded, obtained through a FOIA request or created) can
be used many times. They can serve as a reference for breaking news or other stories. Also, you
can see and produce stories by simply updating them routinely.
• Your career. As a data journalist, your opportunities for better jobs and better stories
dramatically increase.
The basics: Use the right tool for the right job
• Spreadsheets. (Excel is best, but Google Sheets is free.) Most data stories are done
with spreadsheets. They allow you to quickly filter, sort and summarize data, in addition to
doing and automating calculations. They also allow you to quickly make charts.
Learn the pivot table if you don’t know it.
• Database manager software: A database manager not only does all that a spreadsheet does
in filtering, sorting and summarizing, but also allows you to join (compare) different datasets
based on a common identification number or words.
• PDF conversion software: Make sure you can convert PDFs quickly into a format that can
be used by a spreadsheet or database manager.
• Visualization tools. Use easy ones for maps (Tableau) or overall visualization (Datawrapper),
or more advanced software for social network analysis.
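For readers who also script, the "join" that a database manager performs can be sketched in a few lines of Python. This is only an illustration: the two tables, the license_id column and the join_on helper are made-up names, not part of any particular tool.

```python
# Hypothetical data: two small tables that share an ID column.
licenses = [
    {"license_id": "A100", "business": "Main Street Diner"},
    {"license_id": "A200", "business": "Lakeside Cafe"},
    {"license_id": "A300", "business": "Corner Bakery Co."},
]
inspections = [
    {"license_id": "A100", "result": "Pass"},
    {"license_id": "A300", "result": "Fail"},
    {"license_id": "A300", "result": "Pass"},
]


def join_on(left, right, key):
    """Return one merged row for every pair of rows that share the same key value."""
    # Index the right-hand table by the shared key so lookups are fast.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


joined_rows = join_on(licenses, inspections, "license_id")
for row in joined_rows:
    print(row["license_id"], row["business"], row["result"])
```

Rows without a match (license A200 here) drop out of the result. That is often worth a second look, because a business with no inspections on record may be a story in itself.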
The basic tools: Use the right tool for the right job
• Keep it simple at first. Start with a dataset in a spreadsheet of a few columns and a few
hundred rows, such as salaries, campaign finance or other datasets that are mostly words.
• Relevant data. Find data that helps with what you cover. Get ideas for stories and data
from tip sheets at IRE and NICAR and other data tutorials on topics and beats. Download
data from the web and/or file records requests to get data.
• Join networks of data reporters. NICAR-L is a good place. It’s an email listserv of
journalists using data. If you get stuck, someone on NICAR-L can probably help you within
hours. bit.ly/subscribeNICAR-L
• Trial and error and practice. Use it or lose it. Figure out a way to use a spreadsheet every
day.
• Take classes given by journalism training organizations. Take a class and apply what
you learned to data that matters to you.
• And google it. The answer to 99.9% of technical questions is a quick search away. Google it,
click on the first couple links, take a deep breath, and dive in.
Finding data
• Watch for stories that use data.
• Use a government data portal. Most states have gathered a portion of their data sets in
one place.
• Search agency web sites: Use key words such as data, dataset, etc. to find downloadable
data. Use their search tool and Google search.
• Ask people/sources. Ask officials, researchers, experts, watchdogs, government agencies,
think tanks – anyone – to point you to good data sources.
• Check IRE tip sheets. IRE has a library of hundreds of tip sheets, many of which include
suggestions on data. The international journalism organization, Global Investigative
Journalism Network, has a great resource center on data.
• Examine the retention schedule for your city/state. It’s supposed to be a guide to how
long agencies must hold on to records. But it can also be a tip sheet for records that agencies
have.
• Use local, state, national and international annual reports and audits. See a stat?
See a table? That means the agency probably has a database that generated it. Ask for the data.
• Look at forms. Most agencies enter every box of a form into a database. That means you can
probably obtain the database with a FOIA request.
• FOIA. File a public-records request for data. There are many templates for specifically asking
for government data.
• Build your own database. Sometimes, you just can’t find the data you need. Or it’s only in
paper form. In that case, it often is worth the effort to enter the data into a spreadsheet so you
can analyze it. (See film “Spotlight”!)
Open data portals
The federal government and many states operate websites where they feature some of their databases:
• U.S.: data.gov
• Illinois Data Portal: https://data.illinois.gov/
• Illinois maps: https://www.ilsos.gov/departments/library/maps/home.html
More Google tricks
• Google’s Dataset Search: https://toolbox.google.com/datasetsearch
• Use the advanced-search page: google.com/advanced_search.
• Search by file type. Example: filetype:XLS. Some common ones are: XLS (old Excel), XLSX
(new Excel), CSV (comma-separated values – Excel can open it) and TSV (tab-separated values –
you can import it into Excel).
• Search a particular website or domain. Examples: site:gov or site:boston.gov. Even if a
website has a search box, sometimes Google works better. Try it both ways.
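If you run the same kind of search often, the filetype: and site: operators above can be combined in a small script. The sketch below is one possible way to do it in Python; build_search_url is a hypothetical helper name, and the search terms are placeholders.

```python
from urllib.parse import urlencode


def build_search_url(terms, filetype=None, site=None):
    """Combine search terms with optional filetype: and site: operators."""
    parts = [terms]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    query = " ".join(parts)
    # urlencode escapes spaces and colons so the query is safe inside a URL.
    return query, "https://www.google.com/search?" + urlencode({"q": query})


query, url = build_search_url("payroll", filetype="xlsx", site="boston.gov")
print(query)
print(url)
```

The first printed line is what you would type into the search box; the second is a link you can paste into a browser or save in your notes.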
Examples of databases to ask for
• Payroll/salary data
• Budgets
• Parking tickets
• Business/occupation licenses
• Census
• School test scores
• Crime reports
• Jail inmates
• Purchase data
• Campaign finance
Tips on finding databases on non-governmental beats: bit.ly/otherbeats
Public-records tips
• Learn public records law. Make sure to counter any denial you get; it’ll carry more weight
if you know the ins and outs of the law.
• Assume it’s public. If you don’t ask, you won’t get it.
• Ask for the documentation. The technical documentation for databases can be called many
things: a record layout, field list or data dictionary. But it’s helpful to ask for it. That way, you’ll
know what data the agency keeps. And you’ll notice if something is missing.
• Ask for the data in Excel, CSV or “machine-readable format” (not PDFs). PDFs are
designed to print out or look at – not analyze. You want the data in a format a database can
use.
• Ask for more than one year of data. You want to see trends. I typically ask for five years.
• Talk to the data people. Sometimes, the PR people are friendly but don’t know anything
about the data and what is possible.
• Appeal if rejected. Go up the chain of command. Or follow the appeals process (if one
exists in your state).
• Be polite but be persistent. Sometimes agencies have a routine of ignore, deny, delay.
Some will simply hope you forget about the request. But be persistent and use the phone and
even go to the offices in person.
Caveats
• Perfect databases are few and far between. Watch out for dirty data. Typos.
Mistakes. Missing data. If something in the data seems crazy, it just might be an error. So,
verify it with the original documents or with sources. Remember that if a dataset is packed with
errors and omissions and is being used to set policy, then that is a story by itself.
• Save a copy of your data. Set aside the original, and do your work on the copy. That way
you preserve a record of the data, to go back and check later.
• Double-check your calculations. It’s usually a good idea to run them by the agency or
another trusted source before publication. Or ask a colleague to check your math.
• Datasets are often a good start. You still need to interview people and get out into the
field. That will tell you how good or bad the dataset is.
• Look for the key number or a few key numbers for the story. Visualize all you can.
Your stories will be stronger if you use only the numbers that matter most. Instead of piling on
more figures, tell the data stories through people, anecdotes, quotes and traditional storytelling
that reinforce your findings.
• Beware of working with new data on deadline. Every database has quirks. Sometimes
codes don’t mean what you think they mean. Sometimes they’re incomplete. Try to avoid
working with databases for the first time on a tight deadline.
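A quick scripted pass can surface some of the dirty-data problems listed above before you start reporting. The following is a rough Python sketch, not a complete audit. The sample rows are invented, and the $1 million cutoff for a "crazy" salary is an arbitrary assumption you would adjust for your own data.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a downloaded CSV file.
raw = """name,department,salary
Ann Lee,Police,72000
Bob Ray,police ,68000
Cy Dunn,Fire,
Dee Fox,Fire,6500000
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Missing data: count blank cells in each column.
blanks = Counter()
for row in rows:
    for column, value in row.items():
        if value.strip() == "":
            blanks[column] += 1

# Typos and variants: compare raw labels with trimmed, lowercased labels.
raw_departments = {row["department"] for row in rows}
clean_departments = {row["department"].strip().lower() for row in rows}

# Possible errors: flag values above an assumed, arbitrary threshold.
suspicious = [
    row["name"] for row in rows
    if row["salary"] and float(row["salary"]) > 1_000_000
]

print(dict(blanks), len(raw_departments), len(clean_departments), suspicious)
```

Here the script finds one blank salary, three department spellings that collapse to two once cleaned, and one salary worth checking against the original records. It only flags candidates; verifying them is still a reporting job.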
Where you can learn more
IRE/NICAR conferences/workshops/tip sheets. IRE costs $70 ($25 for students) a year:
bit.ly/joinire. But membership gives you access to thousands of tip sheets and stories. Plus, you
can listen to recordings of past conferences. IRE’s trainings and conferences are also a great
resource. ire.org/events-and-training/
Check other data training sites, such as the National Press Foundation or GIJN.org.
Data For Journalists: A Practical Guide For Computer-Assisted Reporting, 5th Edition, by
Brant Houston. $35 for IRE members.
This tutorial from Berkeley Advanced Media Institute is for those who’ve never opened a
spreadsheet before: bit.ly/othersheetbasics
The Data Journalism Handbook, volumes 1 and 2, from the European Journalism Centre and
Open Knowledge Foundation is free to read online: bit.ly/datajbook
ICIJ reporter Kate Willson has four short videos on how to use Excel to sort and filter,
concatenate (link together), auto fill and make pivot tables: bit.ly/ICIJexcel
Whether you cover education or anything else, this online guide to Excel from the Education
Writers Association will teach you everything you need to know:
ewa.org/reporter-guide/reporters-guide-excel
OpenNews: The website of the OpenNews project features helpful tutorials and is a great way
to follow what’s going on in the “news nerd” community: https://source.opennews.org/
Knight Science Journalism at MIT has a fantastic resource for data work that covers basics to
programming: ksj.mit.edu/data-journalism-tools/
Spreadsheet basics
SAVE INITIAL FILE
Save the initial file somewhere safe, and make a new copy to work with. That way, no matter what you
do, you can go back to the original source of the data.
Google Sheets: Click on File in upper left corner, choose “Make a Copy” option.
(Google also saves a “Version History” that you can refer back to, but this is best done only in
emergencies. Find it in the File menu.)
SAVE AS YOU GO
Always be saving.
Excel Windows 2007/2016: Ctrl-S or hit the disk icon in upper left-hand corner
Google Sheets: Saves automatically.
CHECK OUT DATA FIRST
Try to find the “four corners” of the spreadsheet.
Excel Windows 2007/2016: Use the CTRL + Arrow keys to go up, down, left and right.
Google Sheets: Use the CTRL + Arrow keys to go up, down, left and right.
UNDO
Sometimes, we all hit the wrong button. Here’s how to fix it.
Excel Windows 2007/2016: CTRL-Z (can hit more than once)
Google Sheets: CTRL-Z (can hit more than once). OR go back to an earlier version by using the
“version history” (under the File menu, or hit CTRL-ALT-SHIFT-H). Then click on the version you want
on the right, then click “restore this version.” To cancel, click on the left arrow in the upper
left-hand corner.
MULTIPLE SHEETS?
Check to see whether the workbook contains multiple “sheets” or “tabs.” Look at the bottom left-hand
corner. To create a new one:
Excel Windows 2007: Click on the curled piece of paper on the lower left-hand corner, next to the
existing tabs. Excel 2016: Click the plus-sign-in-a-circle icon on the lower left-hand corner, right of
the existing tabs.
Google Sheets: Click on the plus sign on the lower left-hand corner, next to the existing tabs.
FREEZE HEADERS
This is a handy command that lets you scroll through the data while still seeing the headers/labels at
the top.
Excel Windows 2007/2016: Hit the View tab at the top, select freeze panes (middle right of the
tool bar).
Google Sheets: Go to the View menu, select freeze.
WIDEN COLUMNS
Sometimes, columns are too narrow to read. (You will sometimes see ####s when columns are too
narrow to show a string of numbers.)
Excel Windows 2007/2016: Hover cursor between the two letters marking the columns until the
cursor changes to a cross. Press and hold down the left mouse key and drag the mouse left and right
until it is the right width. Release.
Google Sheets: Hover cursor between the two letters marking the columns until the cursor changes
to a cross. Press and hold down the left mouse key and drag the mouse left and right until it is the
right width. Release.
SORT COLUMNS
Use this command when you want to sort from high to low (or in alphabetical order).
Excel Windows 2007/2016: Click on any cell within the column you want to sort. (Note: Do NOT
highlight the entire column.) Click the Data tab at the top, then click on the Sort tool icon in the
middle of toolbar. Make sure the right column is selected, Sort by Values, and then pick either A to Z
(low to high) or Z to A (high to low.) Note: Be sure headers box is checked correctly.
Short cut: Instead of using the Sort tool, you can also just click on the A-Z or Z-A buttons in
the toolbar after hitting the data tab. This will usually work, but sometimes Excel gets confused
and sorts the headers along with the rest of the data. To fix this, click on the Sort tool under the
Data menu, then make sure the headers box is checked. (Yet another option: Highlight the area
you want to sort first.)
Google Sheets: Click on the Data menu, select Sort Sheet by Column _, A→Z or Z→A. (The blank is
for the letter of the Column.)
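The same sort can be sketched in Python if your data is already in a script. The rows below are made up for illustration. Note that each row travels as a unit, which is the scripted equivalent of not highlighting a single column before sorting.

```python
# Hypothetical payroll rows.
rows = [
    {"name": "Ann Lee", "salary": 72000},
    {"name": "Bob Ray", "salary": 68000},
    {"name": "Cy Dunn", "salary": 91000},
]

# High to low by salary (Z to A in spreadsheet terms).
high_to_low = sorted(rows, key=lambda row: row["salary"], reverse=True)

# Alphabetical by name (A to Z).
a_to_z = sorted(rows, key=lambda row: row["name"])

print([row["name"] for row in high_to_low])
print([row["name"] for row in a_to_z])
```

sorted() returns a new list and leaves the original rows untouched, which fits the advice to keep your source data as it was.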
FILTER COLUMNS
Use this command when you want to select rows that meet certain criteria, such as all salaries from a
certain department or all voters in a certain ZIP code.
Excel Windows 2007/2016: Make sure you click on a cell somewhere in the data you are using.
Click the Data menu button, hit the funnel button on the Tools ribbon.
Little arrows should appear next to the columns. Click the arrow next to the column you want to filter.
Then select the criteria you want to use.
Google Sheets: Click the funnel on the upper right of the tool bar.
Mini funnels should appear next to the columns. Click the funnel next to the column you want to
filter. Then select the criteria you want to use.
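In a script, filtering means keeping only the rows that meet your criteria. A minimal Python sketch with invented rows, assuming columns named department and salary:

```python
# Hypothetical payroll rows.
rows = [
    {"name": "Ann Lee", "department": "Police", "salary": 72000},
    {"name": "Bob Ray", "department": "Fire", "salary": 68000},
    {"name": "Cy Dunn", "department": "Police", "salary": 91000},
    {"name": "Dee Fox", "department": "Parks", "salary": 45000},
]

# One criterion: everyone in a certain department.
police = [row for row in rows if row["department"] == "Police"]

# Two criteria combined: department and a salary cutoff.
police_over_80k = [
    row for row in rows
    if row["department"] == "Police" and row["salary"] > 80000
]

print(len(police), [row["name"] for row in police_over_80k])
```

As in the spreadsheet, the filter hides nothing permanently: the full list in rows is still there for the next question.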
PIVOT TABLES
The basic way to efficiently summarize data. The procedure varies by whether you are using a PC, Mac
or Google Sheets. Click on one cell inside your dataset. On PCs and Macs, look for the pivot table icon
under Insert or Data. From there, in this example, with just a few clicks, you could see the top seller
of guns in Illinois.
[Screenshot of the resulting pivot table]
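What a pivot table does under the hood is group rows by one column and then count or total another. The Python sketch below shows that idea with invented payroll rows (not the Illinois data from the slides); the column names are assumptions.

```python
from collections import defaultdict

# Hypothetical rows: one per employee.
rows = [
    {"department": "Police", "salary": 72000},
    {"department": "Fire", "salary": 68000},
    {"department": "Police", "salary": 91000},
    {"department": "Parks", "salary": 45000},
]

# Group by department, keeping a count and a running total for each group.
summary = defaultdict(lambda: {"count": 0, "total": 0})
for row in rows:
    group = summary[row["department"]]
    group["count"] += 1
    group["total"] += row["salary"]

# Rank the groups by total, largest first, as you would sort a pivot table.
ranked = sorted(summary.items(), key=lambda item: item[1]["total"], reverse=True)
for department, stats in ranked:
    print(department, stats["count"], stats["total"])
```

The first line printed is the group with the largest total, which is the same "who is on top?" question the pivot table answers.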
INSERT COLUMNS, ROWS
It’s easy to add another column or row.
Excel Windows 2007/2016: Highlight row/column you want by clicking on the letter or number
that marks each row/column. Then right click, and then click on insert.
Google Sheets: Highlight row/column you want by clicking on the letter or number that marks each
row/column. Then right click, and then click on insert. OR
Hit the Insert menu option at the top, then choose either the column/row above or below.
BASIC MATH
Formulas generally start with an = sign.
Addition: =SUM(cell range)
Example: =SUM(B2:B9)
Subtraction (change/difference): = New - Old
Example: =B2-C2
Percentage change: =(New - Old)/Old (Way to remember: NOO!)
Example: =(C2-B2)/B2
Then highlight the cell or column and hit the % button on the left-hand side of the tool bar to convert
to percent.
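To check a percentage change outside the spreadsheet, the NOO formula translates directly into Python. The function name and the numbers are illustrative only.

```python
def percent_change(old, new):
    """Percentage change from old to new: (New - Old) / Old, times 100."""
    return (new - old) / old * 100


increase = percent_change(200, 250)  # going from 200 to 250
decrease = percent_change(250, 200)  # going from 250 back to 200
print(round(increase, 1), round(decrease, 1))
```

Going from 200 to 250 is a 25 percent increase, but going from 250 back to 200 is only a 20 percent decrease, because the "Old" value in the denominator is different. If Old is zero, the calculation is undefined and Python raises a ZeroDivisionError.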
Percent of a total: = Part/Total
Example: =B2/$B$11
Note: Use the dollar signs to keep the second part of the formula from changing when you copy the
formula.
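The dollar signs pin the formula to one total cell. In a script, the equivalent is to compute the total once and reuse it for every row. A small Python sketch with made-up budget figures:

```python
# Hypothetical budget amounts by department.
budgets = {"Police": 500, "Fire": 300, "Parks": 200}

# Compute the total once, like the fixed $B$11 cell.
total = sum(budgets.values())

# Each department's share of the total, as a percentage.
shares = {dept: amount * 100 / total for dept, amount in budgets.items()}
print(shares)
```

A useful check on your work: the shares should add up to 100.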
Average: =AVERAGE(cell range)
Example: =AVERAGE(B2:B10)
Median: =MEDIAN(cell range)
Example: =MEDIAN(B2:B10)
Maximum: =MAX(cell range)
Example: =MAX(B2:B10)
Minimum: =MIN(cell range)
Example: =MIN(B2:B10)
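These summary formulas have direct counterparts in Python's standard library. The salaries below are invented to show why it pays to look at both the average and the median.

```python
import statistics

# Hypothetical salaries, including one very large value.
salaries = [40000, 45000, 50000, 55000, 310000]

total = sum(salaries)                    # like =SUM
average = statistics.mean(salaries)      # like =AVERAGE
median = statistics.median(salaries)     # like =MEDIAN
highest = max(salaries)                  # like =MAX
lowest = min(salaries)                   # like =MIN

print(total, average, median, highest, lowest)
```

One outlier pulls the average up to 100,000 while the median stays at 50,000. When the two differ this much, the median is usually the fairer description of a "typical" value.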
MORE ADVANCED FORMULAS
If/Then
=IF(comparison,"print this if true","print this if false")
Example: =IF(B2>100000,"High Earner","Low/Medium earner")
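The same If/Then test can be written as a short Python function. The function name and the default threshold are illustrative; the labels match the example formula.

```python
def label_earner(salary, threshold=100000):
    """Return one label if the comparison is true, another if it is false."""
    return "High Earner" if salary > threshold else "Low/Medium earner"


labels = [label_earner(salary) for salary in (85000, 100000, 125000)]
print(labels)
```

Note that a salary of exactly 100,000 is not greater than the threshold, so it gets the second label. Decide up front whether your cutoff should be "greater than" or "greater than or equal to."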
Dates
=YEAR(CELL)
=WEEKDAY(CELL)
=CHOOSE(WEEKDAY(CELL), "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
=MONTH(CELL)
Some formulas are slightly different in Excel and Google Sheets.
For instance, to find the difference in dates:
Google: =CELL - CELL
Excel: =DATEDIF(A1,A2,"d") (for days, with the earlier date in A1). Use "m" for months or "y" for years.
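Date arithmetic works much the same way in Python's datetime module. In this sketch the two dates are arbitrary examples. One difference to keep in mind: Python numbers weekdays starting with Monday as 0, while the CHOOSE formula above starts its list with Sunday.

```python
from datetime import date

start = date(2022, 1, 1)
end = date(2022, 4, 1)

year = end.year      # like =YEAR(CELL)
month = end.month    # like =MONTH(CELL)

# date.weekday() returns 0 for Monday through 6 for Sunday.
day_names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
day_name = day_names[end.weekday()]

# Subtracting two dates gives the difference; .days is the count of days.
days_between = (end - start).days

print(year, month, day_name, days_between)
```

Here April 1, 2022 comes out as a Friday, 90 days after Jan. 1, 2022.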
COPY A FORMULA DOWN AN ENTIRE COLUMN
Click on the cell with the formula, position the mouse over the lower right-hand corner of the cell
until you see the cursor change to a plus sign, and double-click. This will copy the formula down
until it hits a blank row.
ANOTHER WAY TO COPY A FORMULA TO AN ENTIRE COLUMN
The above method will only work if the formula is next to a column with all the rows filled out.
Otherwise, it will only copy formulas down until the data next to the column stops. If that is a
problem, you can scroll to the bottom of the column where you want to stick the formulas, enter some
text - anything will do. Then go back to the formula you want to copy, click on that cell, hit ctrl-c to
copy, then hold down the shift key, then hit ctrl-down-arrow to highlight the column (up until the
point where you typed in your random text), then hit ctrl-V to paste.