I am trying to decide between database management systems for smallish ecology projects. Generally, only one person will be accessing them at a time and only a few people who work on the project will ever be entering data or performing queries. I would like to get used to a system that will serve me well in the long run, regardless of future projects.
With that in mind, I would like a system that is cheap, works across multiple platforms, and plays well with other programs (i.e. can use SQL, python, R to interact). Ideally, it would have a decent GUI so collaborators less familiar with programming could work with in easily. Also, having a system where data entry forms could be used on an iPad or other tablet would be ideal for field data collection.
My first database experience was with MS Access, which I liked quite a bit. However, it’s only available for Windows machines and I frequently go back and forth between Mac OS and Windows as do collaborators (and I imagine setting up Linux servers in the distant future). I could use a parallels program or emulator to run MS Access on a Mac machine but that’s more of a headache than it’s worth to me. I did use Pendragon software on a Palm Pilot for field data collection. It made for fewer errors during data collection (due to built in checks), no transcription errors copying paper data to the database, and saved time on data entry. For a project with ~100,000 lines of data in the main table and a half dozen related tables collected by a dozen researchers over 5+ years, this was a great system (except that the Palm Pilot batteries back in 2005 didn’t last long enough).
I have very briefly used OpenOffice Base, which is very much like MS Access but it doesn’t look as pretty and lacks a few of the functions. I like the fact that it’s open source, but don’t know much about the pros and cons of Base. At least I can use it on both Mac OS and Windows, but I’ve found at least one critic suggesting that Base isn’t a sufficient replacement for Access (here)
More recently, I’ve used FileMaker Pro 12. It works on both Mac and Windows OS (not sure about Linux). It isn’t cheap but it’s not prohibitively expensive. It comes with a free download of the FileMaker Go app for the iPad, which is nice for setting up field data collection systems. My problem with FileMaker Pro (FMP) is that not as many colleagues have access to it and I just don’t find the work flow as easy or intuitive as MS Access. Maybe if I give it more time, I will get a more efficient workflow. The system of forms and tables is still a bit awkward for me and I haven’t been thrilled with the scripting. Previously, FMP couldn’t use SQL queries, but I believe that it was introduced with enhanced capabilities in version 12 (but I haven’t tried it).
Most recently, I’ve been learning a little about MySQL and trying out MySQL Workbench on my Mac OS. MySQL is a bit more complex and might be harder for collaborators, but it seems like it might be a better system in the long run. I’ve only just tried it one afternoon but the Workbench seems like a decent GUI interface. MySQL is very powerful and free but has a steeper learning curve than MS Access, OO Base, or FMP. It is often used in websites, so could be adapted for use on a tablet, but I am not sure if there is an easy way to do with without programming skills.
There are also other options like PostgreSQL or SQLite but I will probably stick to one of the solutions above (unless someone convinces me that PostgreSQL is the best solution for my purposes). Most comparisons of Relational Database Management Systems (RDBMS) focus on encryption, database sizes, and things that I’m not generally concerned with for my applications.
If you’re still reading, what do you use? What system do you find most useful for day-to-day handling of small research lab data (e.g. no genomics or massive web-based data collection)? For systems other than FMP, do you have a relatively easy collection system for field work on an iPad or other tablet? Is it worth learning SQL/MySQL/Workbench (or PostgreSQL)? What about for working with non-tech-savy collaborators or switch between computers and operating systems? I’d love to hear your thoughts and suggestions. I haven’t found a great overall introduction for practicing scientists (there are good intros to each system individually). In the end, any of these solutions might work for my purposes, but it would be nice to know a bit more before I commit to one.
Practical Computing for Biologists offers a great intro to MySQL and getting it set up on a Mac OS.
Ethan White has a nice intro to databases and MS Access on his programming for biologists website
Comparison of FMP, Access, MySQL, PostgreSQL (unfortunately no OpenOffice Base)
This is a nice, brief overview of what a science research lab database should do but not how to get there