What I have written about

A little tour of aleph, a data search tool for reporters

29.06.2016 Over the past six months, I've been working for OCCRP to productise Aleph, a powerful search tool for investigative reporters. This is a little tour of its key features, and a little view into the future development agenda.

A Poor Journalist's Text Mining Toolkit

08.06.2016 How can journalists search and analyze collections of documents on their own computers with simple tools? At last weekend's DataHarvest, we ran a workshop trying to answer that question. This write-up covers using Apache Tika for content extraction and regular expressions in Sublime Text as an advanced search tool.

Against Decentralization

04.11.2015 In the free software/open web community, the notion that the web should be decentralized is more than a shared ideal; it is a piece of dogma. But are we really promoting a progressive vision of the web, or fighting a losing battle to avoid political engagement?

Keeping stock: investigative data warehouses

03.08.2015 Data warehouses are used in industry to manage the many datasets accrued inside a company that might be relevant to reporting and analysis. I want to propose a similar pattern for investigative journalism.

SpenDB, a data analysis tool for government finance, looking for testers!

30.06.2015 The first beta version of SpenDB offers a small set of well-designed features for data import and analysis. Now the platform is ready to be adopted by anyone interested in exploring financial data, from budgets to procurement.

On Hacks/Hackers, Google and community building

04.06.2015 A few weeks ago, the US team of Hacks/Hackers announced their plans to turn the network of journalism innovators into a collaboration with Google News Labs, starting with an event in Berlin. I tweeted about this, and Phillip Smith wrote a thoughtful reaction. Given this invitation to debate, I wanted to outline my criticism in more detail.

SpenDB, a light-weight tool for government financial data

03.06.2015 Over the past few months, I have spent my weekends simplifying and modernizing the OpenSpending codebase to create SpenDB - a prototype-stage, light-weight data loading tool and analytical API for government financial data.

Who’s got dirt? - What if robots could do cross-border investigations?

30.05.2015 If we want to make open data relevant to investigative journalism, we have to simplify the way people access it. We must create a way for our data tools to talk to each other and trade information about the companies and people we are researching.

8 things you probably believe about your data standard

21.05.2015 Developing open data standards is all the rage. In fact, chances are that you're drawing one up right now (I am). In that case, here's a list of things you may believe about your data standard, but that are probably not true.

A Tale of Two Networks

19.05.2015 I've had the chance to contribute to two influence mapping projects in South Africa and Mozambique. While both projects focus on finding possible conflicts of interest within a small group of politically exposed persons, their approach has been very different.

Data doesn't grow in tables: harvesting journalistic insight from documents

15.05.2015 When we discuss data journalism, we often tend to think of nicely formatted spreadsheets full of financial data or crime stats. Yet most journalistic source material does not take the form of tables, but comes in messy collections of documents, whether on paper or scraped off a web site.

Why influence mapping matters to journalism

13.12.2014 Building Grano started with a desire to map political and economic influences. Developing it further has made us re-examine our motivations: why would journalists want software to help map out the connections between people in politics and industry?

What if journalists had story writing tools as powerful as those used by coders?

03.12.2014 When software developers write code, they often use tools called IDEs, integrated development environments, that provide contextual information needed to manage the complexity of modern software. What would such a workspace look like, if it were designed for journalism?

Oil Rush on Edgar Creek

12.11.2014 I've worked with OpenOil in an effort to find oil contracts which have been published as part of filings to the US stock exchange regulator, the SEC. While these contracts are not usually public, companies are required to file full contract documents as part of their annual reports under certain conditions.

Grano advanced queries, and Linked Data

01.09.2014 The sad story of how the need for complex queries led me to try and migrate grano to use linked data.

Civic Patterns, a language for news and citizen engagement design

25.06.2014 We're launching a catalogue of design patterns for civic engagement that is aimed not just at digital NGOs, but also media organisations.

Opening up Europe's procurement data

15.05.2014 After three years of advocacy, collaboration and data scraping, the public procurement data of the EU is now easily accessible to journalists.

OffenesParlament: open data project seeks a new team

20.04.2014 After three years of work on the legislation tracker OffenesParlament, I now have to hand this exciting project over to a new team.

Why I'm building Grano, and why you should help

30.03.2014 Grano will be a modular, configurable and collaboratively built platform for influence mapping in advocacy and news organisations.

Charting Social Network Analysis Tools

21.12.2013 A wide range of projects attempt to use the metaphor of social networks to analyze and visualize connections between people in business and government.

Where to learn about data journalism

13.11.2013 A collection of reading materials and inspiring examples of data journalism.

Aktion Datenpaten