client toolkit for the Twitter streaming API, storing and analysing statuses.
easy-to-use Python library for data processing using relational databases.
legislative activity tracking system for both chambers of the German parliament.
Features plenary speeches, legislative drafts and personal profiles.
web-based data cleaning tool to help match aliases of an entity in source data
and map them to the entity's correct form.
a pastebin site for tabular data that enables easy sharing of small data snippets.
ReGENESIS, data extraction from the repository of the German statistical office, data cube reconstruction.
Data warehousing system to support the analysis and visualization of budgetary and spending data
from across the globe. Set up budget sites for Cameroon,
and Minas Gerais (Brazil).
Contributed to Where Does My Money Go? and the
data.gov.uk spend browser.
messytables, extracting structured
data from badly-formatted tabular data.
LobbyFacts API, an API to provide clean
and current data on lobby activity in the European Union, in cooperation with Corporate Europe, LobbyControl
and Friends of the Earth Europe.
SQL-based data store and API generator service.
YourTopia, contributor, voting on a custom index
for human development criteria. Third prize in the World Bank's Apps
for Development competition.
scraping of large web sites, e.g. the German companies
DataHub, light-weight, experimental,
activity-centric clone of CKAN.
OffenerHaushalt, visualization of
scraped German budget data.
CKAN, worked on
customizations for PublicData.eu, IATI
Registry, Helsinki Regional Infoshare, the
Data Hub and various community instances, including OffeneDaten.de.
collaborative policy drafting platform for distributed groups. Deployments for the
Commission Internet and
Digital Society of the German Parliament, the City of Munich and various other organisations.
Many of my projects require that I keep up-to-date copies of various public
datasets. These are often scraped, or additional information has been added
to facilitate analysis. Below is a collection of resources I'm re-publishing
because they are significantly enhanced compared to their source form.
For context, also see the OpenSpending briefing on the finance data sources of the European Union.
Training and materials
World Bank Institute Data Bootcamps, series of multi-day data literacy workshops for
journalists, activists and coders in developing nations. Data trainer in
Grahamstown, South Africa,
Chisinau, Moldova and
School of Data Journalism, series of training events with the OKFN and European Journalism Centre. Contributed to workshops in Warsaw, Utrecht,
Perugia, Vienna and Konjic (BiH).
Spending Data Handbook, book sprint to create guidance for civil society organisations in their use and advocacy of government financial data.
Data Journalism Handbook, contributed chapters on data acquisition and produced a web version of the book.
Open Data FAQ, frequently asked questions on the progress of open data initiatives in Germany and the relevant political and technical background.
DataPatterns, a collection of design patterns for data processing. Deprecated in favor of the School of Data Handbook.