An open database of persons and entities of interest

This dataset combines names and details for persons and entities of interest. These include political leaders, entities sanctioned by various governments (e.g. for connections to criminal and terrorist activity), and those barred from public procurement.

Why would this be useful?

Let's say you are working as a journalist and you have received a leaked database, a document stash or a fresh open data release that you'd like to dig through. The next question is: what are you going to search for? This dataset gives a possible answer: by cross-referencing with sanctions lists and names of political leaders you can produce leads, which you can then check for political and criminal relevance.
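As a rough illustration of that cross-referencing step, here is a minimal Python sketch. It assumes you have a CSV export of the list with a `name` column; that column name, the `source` column and the sample rows are invented for this example and are not the real file layout. It only does exact matching on normalized names, so treat hits as leads to verify, not as confirmed identities.

```python
import csv
import io
import unicodedata


def normalize(name):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(cleaned.split())


def build_index(rows, name_field="name"):
    """Map each normalized name to the dataset rows that carry it."""
    index = {}
    for row in rows:
        key = normalize(row[name_field])
        if key:
            index.setdefault(key, []).append(row)
    return index


def find_leads(candidates, index):
    """Yield (candidate, matching rows) for every exact normalized match."""
    for candidate in candidates:
        matches = index.get(normalize(candidate))
        if matches:
            yield candidate, matches


# Stand-in for a CSV export of the dataset. The column names and rows
# are hypothetical and exist only to make the sketch runnable.
SAMPLE = """name,source
Jane Q. Example,hypothetical-sanctions-list
José Ejemplo,hypothetical-leaders-list
"""

index = build_index(csv.DictReader(io.StringIO(SAMPLE)))

# Names pulled from your own leak or data release (also invented here).
leaked_names = ["jose ejemplo", "JANE Q EXAMPLE", "John Nobody"]
leads = list(find_leads(leaked_names, index))

for candidate, matches in leads:
    print(candidate, "->", [m["source"] for m in matches])
```

Exact matching will miss transliteration variants and reordered names ("Example, Jane"), so a real workflow would probably also check the alternate names and add some fuzzy matching on top.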

What are your sources?

Glad you asked. As of 8 December 2015, these sources are included:

If you know of further sources of relevant, openly available data, particularly regarding the families of politically exposed persons, please suggest them on GitHub or email me.

Can this dataset be used freely?

Hopefully. While care is taken not to violate commercial copyrights, databases included in this dataset might carry unexpected claims of copyright by the institutions which provide them. YMMV.

Downloads

Currently, the database is distributed as a set of downloads which are updated regularly. In the future, I hope to also provide additional services, such as an OpenRefine Reconciliation API.

full.xlsx [13MB] - main list

This file contains entities from all sources, as well as sheets listing alternate names (AKAs), identification documents and known addresses. The individual sheets are also available as CSV files:

opennames-latest.tgz [45MB] - full bundle

This package contains the database as Excel and CSV files, individual files for each data source and the original source data as downloaded before processing.

It's open source, y'all.

The extraction, transformation and cleaning code for this project is available on GitHub and can be replicated using Docker. Please consider contributing additional lists and ideas for how to make this project more useful.