Data journalism: Empowering public

NO-SWEAT CHART A bar chart created by Datawrapper, an open-source online tool, from the data on road accidents in Metro Manila by city

A total of P34,067,467,968.85. Countries, institutions and individuals pledged this amount for the areas devastated by Supertyphoon “Yolanda” (international name: Haiyan) last year, according to the Foreign Aid Transparency Hub (FAiTH). But beyond the huge amount, there’s more that the public can look into.

FAiTH (www.gov.ph/faith) posts data on Yolanda-related donations as part of the Aquino administration’s efforts toward transparency. The hub categorizes the donations into cash and noncash, both valued in pesos.

A piece of information sticks out: The country has received only 44 percent (just under P15 billion) of the total amount pledged. The why is not readily answered by the infographics (www.gov.ph/faith/full-report).

Scroll down and you find a forum criticizing the available data. One user called FAiTH “partial transparency,” saying it would be full only when the expenditures were shown. Another described it as a “fraud” for tracking only what came in and not what reached ground zero.

Two people suggested that the critics pore through the links, which could reveal some answers. The links automatically download the specifications in comma-separated values (readable by Microsoft Excel) or in JavaScript.
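For readers who download the comma-separated values file, here is a minimal sketch in Python of how the share of pledges actually received could be checked. The column names (pledged_php, received_php) and the sample rows are hypothetical and invented for illustration; the actual FAiTH files may be laid out differently.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded FAiTH CSV file.
# Column names and figures are invented for illustration only.
SAMPLE_CSV = """donor,pledged_php,received_php
Donor A,1000000.00,400000.00
Donor B,2500000.00,1250000.00
Donor C,500000.00,0.00
"""

def share_received(csv_text):
    """Return (total pledged, total received, percent received)."""
    pledged = received = 0.0
    for row in csv.DictReader(io.StringIO(csv_text)):
        pledged += float(row["pledged_php"])
        received += float(row["received_php"])
    # Guard against an empty file or a zero pledge total.
    percent = 100.0 * received / pledged if pledged else 0.0
    return pledged, received, percent

pledged, received, percent = share_received(SAMPLE_CSV)
print(f"Pledged: P{pledged:,.2f}")
print(f"Received: P{received:,.2f} ({percent:.1f} percent)")
```

With a real download, the sample text would be replaced by the contents of the file, and the column names adjusted to match its header row.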

Exciting

The interactions on the government-hosted website reflect the “exciting” data ecology that Open Knowledge Foundation (OKF) sees unfolding across the globe in a “momentum,” with governments, “closed for hundreds of years,” finally releasing data for public access.

“Information is power,” said Anders Pedersen, a member of OKF’s Knowledge Team, when he visited the Philippines for a data skills workshop. “When governments share information, that’s a sign that power is opening up. Governments used to keep information, keep power and now it’s been different.”

“It’s very exciting,” Pedersen added. “It means that … the information that’s becoming available offers us a lot of new ways to cover society and to help citizens find out what’s happening, to hold the governments to account.”

He cited the Philippine Government Electronic Procurement System (philgeps.gov.ph) as a compelling example. It is a centralized database of the government’s bidding opportunities for goods, civil works and consultation services.

Open knowledge, visualization

Open data suggest that certain information should be accessible for public consumption. The data then become open knowledge when “useful, usable and used,” OKF (okfn.org) said, explaining the organization’s purpose.

Visualizations like the charts and tables available on FAiTH aim to make the data “useful” for netizens, turning them into information that is more attractive and comprehensible than the spreadsheets they originated in.

The formats of the downloadable data make the information “usable” as they are machine-readable and reproducible by netizens.

The third criterion (“used”) is met when Filipinos know that FAiTH exists and actively look at the data sets, and identify loopholes, according to Pedersen.

“Demanding better data will help inform government about what they need to change,” he said. “As citizens and journalists, we need to tell them we’re using the data. We want the data and this is the next data set we would like them to release.”

Demanding better data comes with the times.

“Asking information from government used to be a physical transaction—bring the papers to someone, you send them the papers—and [the papers] were very complicated to analyze,” Pedersen said. The then-arduous efforts at data hunting can now be directed at doing better analysis and presentation.

Technology has come a long way. “Your smartphone has the data capacity of the Apollo [that landed on the moon],” Pedersen said, adding that 1 terabyte, or 1 trillion bytes, is now worth less than $100, from $450,000 two decades ago.

Computers store data in bytes, and if a minute of MP3 music is 1 megabyte, 1 terabyte holds about two years of music.
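The two-year figure can be checked with a few lines of arithmetic. This sketch assumes that 1 megabyte is 1 million bytes and that a year has 365 days; both are simplifications made for illustration.

```python
# Assumptions: 1 terabyte = 1 trillion bytes (as stated in the article),
# 1 megabyte = 1 million bytes, and 1 minute of MP3 audio = 1 megabyte.
BYTES_PER_TERABYTE = 10**12
BYTES_PER_MEGABYTE = 10**6
MINUTES_PER_DAY = 60 * 24

# At 1 megabyte per minute, megabytes and minutes are the same number.
minutes_of_music = BYTES_PER_TERABYTE // BYTES_PER_MEGABYTE
days_of_music = minutes_of_music / MINUTES_PER_DAY
years_of_music = days_of_music / 365

print(f"{minutes_of_music:,} minutes = {days_of_music:.0f} days "
      f"= {years_of_music:.1f} years")
```

Under these assumptions, 1 terabyte works out to 1 million minutes, or roughly 694 days of continuous music.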

The Internet has also become a potent partner in open data. It hosts interesting infographics from how fast Los Angeles fire marshals respond to 911 calls (Los Angeles Times, “How fast is LAFD where you live?”) to which Tanzanian secondary schools have performed well or horribly in the period 2003-2013 (Shule.info, similar to Kenya’s Findmyschool.co.ke).

MAP created by Datawrapper showing the cities (in dots) and the number of deaths from road accidents

Crowdsourcing

“We are able to do new things that we weren’t able to do before,” said Pedersen, also citing citizen journalism as a good example. It involves crowdsourcing or user-generated content, which encourages ordinary citizens to contribute to the data pool. With a smartphone, for example, one can snap hazards on the road and forward the photo to media and, in some countries, to government.

Crowdsourcing helps WikiLeaks produce a steady stream of previously confidential data sets. In a novel feat, Norway’s Verdens Gang identified the people in a photo taken during an event of the royal family there by posting the photo on the Web and letting people comment.

The Data Journalism Handbook (datajournalismhandbook.org) provides a wealth of examples. Apart from the definitions and basic how-to’s usually found in books for data-journalism tyros, it has behind-the-scenes details for some stellar examples of data-driven journalism.

Granular data

In the United Kingdom, OKF would tell a regular worker earning 25,000 pounds monthly that 44 pence of his daily tax goes to culture projects and 28 pence is allocated for the environment (wheredoesmymoneygo.org/dailybread.html). The project is based on OpenSpending, for which Pedersen is a community coordinator. The group tracks data on government spending and tax allocations.

“Granularity is king,” Pedersen said, suggesting that data sets gain impact when they are made more specific.

In 2010, WikiLeaks released the “Iraq War Logs,” a compilation of more than 390,000 military reports documenting operations in Iraq from 2004 to 2009. It showed that civilians accounted for the largest share of deaths (over 60 percent); “31 civilians dying every day during the six-year period,” WikiLeaks said.

The Guardian went further. It mapped the data on Google Maps, with each assault represented by a red dot, showing where the fighting was concentrated (“Wikileaks Iraq war logs: every death mapped”).

“[We] can tell very local stories and the big picture at the same time,” Pedersen said. “You can have the larger story and also offer a drop-down menu of, like, how education is in each LGU (local government unit).”

Injustice, online courses

Data are also about injustice and inequality, he said. Access to data plays an important role in fixing society. “If you don’t have good data, the government might not want to allocate your money because there’s no accountability,” Pedersen said.

OKF is behind the School of Data (schoolofdata.org), which offers free online courses on how to “play” with data. It also conducts on-site seminars like the data skills workshop held recently in the World Bank office in Taguig City.

“It’s very active,” Pedersen said of OKF. It trains citizens to identify gaps in available data and urges governments to provide these. “We [citizens] need to check that, look at that, make stories about it and tell agencies when they’re not releasing good enough data, if they’re still keeping it, if they’re still keeping it under commercial licenses, and so forth.”

Ammunition for change

OKF, founded in 2004, operates in more than 40 countries to teach journalists and civil society groups how data sets can be powerful ammunition for change, through top-notch analysis and visualizations.

The weeklong training dedicated two days to some members of the media and civil society groups. Inquirer, GMA, ABS-CBN, Rappler, the Asia Foundation and Affiliated Network for Social Accountability-East Asia and the Pacific, among others, sent representatives.

Data-scraping tools

Pedersen cited some issues where data reporting was effective—government subsidies, government suppliers, gun ownership, company ownership and demographics.

Sergio Araiza of Mexico’s Escuela de Datos (School of Data) introduced useful tricks in data scraping, in which data are extracted from sources such as websites and converted into machine-readable or open formats such as .csv and .txt. Data-scraping tools available online include Table Capture and Tabula.
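The idea behind scraping a table from a web page can be sketched in Python using only the standard library. This is an illustration of the general technique, not a description of how Table Capture or Tabula work internally; the TableScraper class, the html_table_to_csv function and the sample page are all hypothetical.

```python
import csv
import io
from html.parser import HTMLParser

class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, if any
        self._cell = None   # text fragments of the current cell, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

def html_table_to_csv(html):
    """Convert the table rows found in an HTML string to CSV text."""
    scraper = TableScraper()
    scraper.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(scraper.rows)
    return out.getvalue()

# Invented sample page; the cities and figures are placeholders.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Accidents</th></tr>
  <tr><td>City A</td><td>120</td></tr>
  <tr><td>City B</td><td>85</td></tr>
</table>
"""

print(html_table_to_csv(SAMPLE_HTML))
```

Real pages are usually messier, with nested tables and merged cells, which is one reason dedicated tools exist.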

The duo also taught data cleaning, or filtering and arranging data according to an individual’s needs. With a few steps, a downloadable list of public secondary schools in the Philippines—21 columns, 38,662 rows on Microsoft Excel—can be sorted to find that there are 112 in Cagayan province, with 2 in Sanchez Mira town and 3 in Pamplona.
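The same kind of filtering and counting can be done in a few lines of Python instead of a spreadsheet. The sketch below uses a tiny invented extract; the column names (school_name, province, municipality) and the rows are hypothetical and do not reproduce the actual list of schools.

```python
import csv
import io
from collections import Counter

# Hypothetical extract; column names and rows are invented for illustration.
SAMPLE_CSV = """school_name,province,municipality
School 1,Cagayan,Sanchez Mira
School 2,Cagayan,Sanchez Mira
School 3,Cagayan,Pamplona
School 4,Isabela,Ilagan
"""

def schools_by_town(csv_text, province):
    """Count the schools in each town of the given province."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Normalize spacing and case so "cagayan " still matches "Cagayan".
        if row["province"].strip().lower() == province.lower():
            counts[row["municipality"].strip()] += 1
    return counts

counts = schools_by_town(SAMPLE_CSV, "Cagayan")
print(sum(counts.values()), "schools in Cagayan")
for town, n in sorted(counts.items()):
    print(f"{town}: {n}")
```

Trimming spaces and ignoring letter case before comparing is a small example of the cleaning step: spreadsheets typed by hand often spell the same place in slightly different ways.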

A Bureau of Customs data set was mined on Day 2 in “Data Expedition,” where attendees worked in groups to condense the data and produce stories and visualizations using free online tools, such as Datawrapper and Fusion Tables.

Open government

“There’s an international momentum where the Philippines is playing a very active role, formally in the Open Government Partnership (OGP),” said Pedersen.

The training was an initiative of the World Bank and Open Data Philippines (ODP), an interagency task force affirming the government’s commitment to the project. ODP is composed of the Office of the Presidential Spokesperson, the Presidential Communications Development and Strategic Planning Office and the Department of Budget and Management.

OGP (opengovpartnership.org) is a multilateral initiative, which the Philippines helped put up in 2011, together with seven other countries—Brazil, Indonesia, Mexico, Norway, South Africa, the United Kingdom and the United States.

It seeks to cultivate transparency, empower citizens, battle corruption and develop new technology for better governance through concrete commitments from member countries. From eight founding countries, it has grown to 64 members, which have made more than 1,000 commitments to make their governments more transparent.

The launch of the ODP website (data.gov.ph) earlier this year is a commitment to the OGP. The website contains data sets in open formats as well as maps and infographics, which the public can view, download and share on social media, blogs or through e-mail.

Public discourse

“As we continue to publish government data, it can be a basis for a data-driven journalism where we elevate the public discourse, elevate it where it is evidence-based, data-based. In so doing, the public can also help us in policy formation,” said Edwin Lacierda, a spokesperson of President Aquino.

The move to an open-data environment, according to Lacierda, will be successful if the bureaucracy embraces this mind-set: It is better to disclose data in open formats than to release them in files that are difficult to process, or not show them to the public at all.

However, problems still lie ahead. In a progress report on the OGP for 2011-2013, Malou Mangahas of the Philippine Center for Investigative Journalism found that the Philippines still had trouble engaging civil society. Only 30 percent of the population has Internet access, with 70 percent of them active, the report said.

FOI bill

The report also found that the absence of a freedom of information law would “retard full implementation.” While the open data initiative is gaining momentum, the freedom of information (FOI) bill, which is crucial in promoting transparency and accountability in government, is languishing in the House of Representatives.

Lacierda was nevertheless unfazed. “From the previous administration of opaqueness, we are now moving toward transparency to accountability and that’s an irreversible commitment of this government,” he said. The Senate passed its version of the bill and Malacañang has committed to push for the passage of the FOI bill before President Aquino steps down in 2016.

“I do want to emphasize that this is a start. Governments have been closed for hundreds of years and that is changing now,” said Pedersen. “It’s important [to point out that] we don’t want to be complacent.”

Links

Data scraping and data cleaning tools:

Table Capture https://chrome.google.com/webstore/detail/table-capture/iebpjdmgckacbodjpijphcplhebcmeop?hl=en

Google Refine https://code.google.com/p/google-refine

Tabula https://tabula.nerdpower.org

Online OCR https://www.onlineocr.net

ILovePDF https://www.ilovepdf.com

Data Visualization

Datawrapper https://datawrapper.de

Tableau https://www.tableausoftware.com

TimelineJS https://timeline.knightlab.com
