Some data is publicly available in bits and pieces, but is aggregated by companies that want to charge for it. Other data may be free in aggregate form, but governments and well-funded institutions act as its custodians, which excludes smaller institutions, local community groups, and individuals from contributing to open data initiatives.
Using open source tools, the Data Anywhere project aims to solve these problems, one data set at a time. The solution is to set up a simple database, which will replicate itself, and simple scrapers on various virtual machines. These machines are cheap (from about $5/month on DigitalOcean), and many go unused or underutilized.
The immediate goal is for the servers to aggregate any type of data and make it accessible to the public. The longer-term vision of this project will appeal to any data geek. We’d like to use the data to examine unexpected relationships, chronologically at first, although data sets could be compared along any index.
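As a rough illustration of what comparing data sets chronologically could look like, here is a minimal sketch that lines up two data sets on the dates they share. The function name `align_by_date` and both sample data sets are hypothetical, and the values are made up purely for the example; nothing here is taken from the project's own code.

```python
from datetime import date


def align_by_date(series_a, series_b):
    """Return (date, a_value, b_value) rows for dates present in both series."""
    # Dict key views support set intersection, so this keeps only shared dates.
    shared_dates = sorted(series_a.keys() & series_b.keys())
    return [(d, series_a[d], series_b[d]) for d in shared_dates]


# Hypothetical, made-up observations keyed by date.
rainfall_mm = {
    date(2013, 3, 1): 12.0,
    date(2013, 3, 2): 0.0,
    date(2013, 3, 3): 4.5,
}
transit_riders = {
    date(2013, 3, 2): 1800,
    date(2013, 3, 3): 1650,
    date(2013, 3, 4): 1700,
}

rows = align_by_date(rainfall_mm, transit_riders)
for day, rain, riders in rows:
    print(day.isoformat(), rain, riders)
```

The date is just one possible index. The same intersection approach would work for any key the two data sets have in common, such as a postal code or a neighborhood name.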
Although it is just taking off, the Data Anywhere project has the potential to help many organizations. It is built on a persistent data model: if one machine is shut down, the data set suffers no permanent loss, since it has replicated itself to several other machines. These servers can be used to aggregate any type of data and make it accessible to the public at large through a simple RESTful web interface.
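The idea of replicated storage behind a read-only web interface can be sketched in a few lines. This is a toy model under stated assumptions, not the project's implementation: the `Replica` and `Cluster` classes, the `make_app` helper, the machine names, and the sample record are all hypothetical. A real deployment would rely on the database's own replication, and the sketch uses a plain standard-library WSGI callable only so that it runs without extra dependencies.

```python
import json


class Replica:
    """One hypothetical machine holding a full copy of the data set."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.records = {}


class Cluster:
    """Toy replication: every write is copied to every live replica."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        for replica in self.replicas:
            if replica.alive:
                replica.records[key] = value

    def read(self, key):
        # Any surviving replica that holds the key can answer.
        for replica in self.replicas:
            if replica.alive and key in replica.records:
                return replica.records[key]
        return None


def make_app(cluster):
    """Build a WSGI app that serves GET /<key> as JSON (illustrative only)."""

    def app(environ, start_response):
        key = environ.get("PATH_INFO", "/").lstrip("/")
        is_get = environ.get("REQUEST_METHOD") == "GET"
        value = cluster.read(key) if is_get else None
        if value is None:
            status, body = "404 Not Found", {"error": "not found"}
        else:
            status, body = "200 OK", {"key": key, "value": value}
        payload = json.dumps(body).encode("utf-8")
        headers = [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(payload))),
        ]
        start_response(status, headers)
        return [payload]

    return app


# Three hypothetical machines; the sample values are made up.
cluster = Cluster([Replica("vm-a"), Replica("vm-b"), Replica("vm-c")])
cluster.write("air-quality", [41, 38, 47])
cluster.replicas[0].alive = False  # simulate one machine being shut down
app = make_app(cluster)
```

With the first machine switched off, a `GET /air-quality` request is still answered from one of the remaining copies. To try the app locally, it could be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. One simplification to note: in this toy model a replica that is down during a write never catches up, whereas a real replicating database resynchronizes a machine when it rejoins.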
We are actively looking for more individuals and community partners to grow the Data Anywhere community. Our very first workshop was at the March Occupy Data hackathon, where two groups initiated projects, and we’re planning our next workshop for a summer Occupy Data hackathon. At these events, participants are given simple instructions on how to set up and secure a server, along with databases that maintain and replicate themselves. Knowledge of Linux or Python is helpful but not necessary. Patience and a willingness to learn are MUCH more important.
More Info: Our next workshop is being planned for this summer and will be led by an incredible software developer and Linux admin, who will teach basic Linux system administration, MongoDB setup and usage, and building a web API with Flask. For Data Anywhere announcements, subscribe to the Occupy Data discussion list, follow @occupydata on Twitter, or join us on Meetup.com.