Our tale begins a month and a half ago, in a lab at the University of Southampton…
I was at the beginning of my internship, and we had decided one of the key jobs was fleshing out our data. The Open Data Service previously gathered data using pen and whiteboard. The issue with these archaic tools is time. Not just the time spent gathering data, but processing it all by hand at the other end.
As such, I set out on a journey. A journey to create one tool that would facilitate the easy gathering of data. It’s still far from complete, but here’s what it does so far.
DISCLAIMER: None of this tool is considered “stable” at this point in time. Data formats can and will change; use at your own risk!
The aims of the tool are:
- To speed up the gathering and processing of data. More specifically, data gathered on-location.
- To enable and encourage non-technical people to contribute open data.
The result is a responsive website, written in PHP. PHP was chosen as it’s trivially integrated into the Open Data Service. The code is on GitHub: https://github.com/Spoffy/OpenGather
The current interface is extremely simple, but it gets the job done. It shows a set of object types (schemas) people can submit. Changing the schema changes the form fields shown. These can then be filled in and submitted. Any required fields that weren’t filled in are highlighted in red.
The tool currently supports text fields, dropdown fields and geolocation fields. For geolocation fields, the initial values of longitude and latitude come from the phone’s GPS. It’s possible to click on the map to select a more precise location. This is especially useful when recording location-sensitive objects such as doors.
The tool uses MySQL as its default backend. The details are configurable in config.php. There’s a single central table that records each data item entered. It stores an id, the time and the schema id.
Each schema has its own table of details. Each entry’s id acts as a foreign key, relating an entry to the details about it in the schema’s table.
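To make that layout concrete, here is a minimal sketch of the two-table idea. It is written in Python with an in-memory SQLite database rather than the tool's PHP and MySQL, and every table name, column name and sample value in it is a made-up placeholder, not taken from the tool itself.

```python
import sqlite3

# In-memory database, purely for illustration (the tool itself uses MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical names: one central table of entries, plus one details
# table per schema (here, an imagined "door" schema).
conn.executescript("""
CREATE TABLE entries (
    id           INTEGER PRIMARY KEY,
    submitted_at TEXT NOT NULL,
    schema_id    TEXT NOT NULL
);
CREATE TABLE door_details (
    entry_id  INTEGER PRIMARY KEY REFERENCES entries(id),
    label     TEXT,
    latitude  REAL,
    longitude REAL
);
""")


def add_door(conn, submitted_at, label, latitude, longitude):
    """Record the entry centrally, then store its details under the same id."""
    cur = conn.execute(
        "INSERT INTO entries (submitted_at, schema_id) VALUES (?, ?)",
        (submitted_at, "door"),
    )
    entry_id = cur.lastrowid
    conn.execute(
        "INSERT INTO door_details (entry_id, label, latitude, longitude) "
        "VALUES (?, ?, ?, ?)",
        (entry_id, label, latitude, longitude),
    )
    return entry_id


# Made-up sample values.
entry_id = add_door(conn, "2014-08-01T10:30:00", "Main entrance", 50.93, -1.39)

# The entry id ties the central record to its schema-specific details.
row = conn.execute(
    "SELECT e.schema_id, d.label FROM entries e "
    "JOIN door_details d ON d.entry_id = e.id WHERE e.id = ?",
    (entry_id,),
).fetchone()
```

Keeping the central table schema-agnostic means a new schema only ever adds a new details table; nothing already stored has to change.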
Currently, the data is exportable as JSON. This format allows several schemas to exist together seamlessly. JSON is also human editable, making it easy to correct long-term data. The tool makes the export publicly available at http://yourwebsite/path/to/tool/dumpjson.php. There’s no issue with making data public, as it’s designed to gather open data.
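As a rough illustration of why JSON copes well with mixed schemas, here is a small Python sketch. The shape of the records (the `schema` and `details` keys, and the sample entries) is an assumption made for the example; the actual output of dumpjson.php may look quite different.

```python
import json


def export_entries(entries):
    """Serialise entries from any mix of schemas into one JSON document."""
    return json.dumps(entries, indent=2, sort_keys=True)


# Hypothetical entries from two different schemas. Each carries its own
# set of detail fields, and JSON needs no shared column layout for that.
entries = [
    {
        "id": 1,
        "schema": "door",
        "details": {"label": "Main entrance", "latitude": 50.93, "longitude": -1.39},
    },
    {
        "id": 2,
        "schema": "bench",
        "details": {"material": "wood"},
    },
]

dump = export_entries(entries)
```

Because the output is indented plain text, a mistyped label can be fixed in any text editor and the file loaded back without special tooling.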
The data is also available through the MySQL instance. This method of access isn’t recommended.
The Schema Generator
Personally, I think this is the coolest part of the system. It allows you to quickly specify a schema using PHP. This schema is then transformed into HTML for the web forms and SQL for the database. The web interface updates the schema list when the page loads. Dynamically loading these schemas allows submissions to go straight to the database.
The upshot is that defining a new schema is incredibly easy. There’s no need to mess around with HTML or SQL. Just a few PHP objects get the job done!
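The general idea of one definition feeding both the form and the database can be sketched in a few lines. This is a Python illustration of the technique only, not the tool's PHP API: the `Field` class, the `to_sql` and `to_html` helpers and the example "door" fields are all hypothetical, and geolocation fields are left out to keep it short.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Field:
    """One field in a schema (hypothetical structure for illustration)."""
    name: str
    kind: str            # "text" or "dropdown"
    required: bool = False
    options: tuple = ()  # choices, used only by dropdown fields


# Both field kinds are stored as text in this sketch.
SQL_TYPES = {"text": "TEXT", "dropdown": "TEXT"}


def to_sql(schema_name, fields):
    """Build a CREATE TABLE statement for the schema's details table."""
    columns = ["entry_id INTEGER PRIMARY KEY"]
    for f in fields:
        not_null = " NOT NULL" if f.required else ""
        columns.append(f"{f.name} {SQL_TYPES[f.kind]}{not_null}")
    return f"CREATE TABLE {schema_name}_details ({', '.join(columns)});"


def to_html(fields):
    """Build the form controls for the same fields."""
    parts = []
    for f in fields:
        required = " required" if f.required else ""
        if f.kind == "dropdown":
            opts = "".join(f"<option>{escape(o)}</option>" for o in f.options)
            parts.append(f'<select name="{escape(f.name)}"{required}>{opts}</select>')
        else:
            parts.append(f'<input type="text" name="{escape(f.name)}"{required}>')
    return "\n".join(parts)


# A made-up schema: a required label plus a dropdown of access levels.
door = [
    Field("label", "text", required=True),
    Field("access", "dropdown", options=("public", "staff")),
]

sql = to_sql("door", door)
form = to_html(door)
```

Since both outputs are derived from the same list of fields, the form and the table cannot drift apart when a field is added or renamed.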
The following is a sample schema I use to gather data from around the University:
The tool is still very much in the early stages of development. Feel free to use it, but be wary that things may break between versions! If you’re feeling particularly adventurous, merge requests are more than welcome…
Improvements currently on the roadmap include:
- An updated, friendlier user interface.
- Versioning for the schema, including a tool to move data between schema versions. This should help with the long-term preservation of data.
- Support for image uploading. One major weakness is the need to take images separately and link them later.
- A README explaining installation and usage.
The source code is available at https://github.com/Spoffy/OpenGather
Eventually, I’ll be talking about taming QGIS, building tilesets and designing GeoJSON maps!