# Category Archives: Academia

## Markdown for Manuscripts: Enhancements

Today was my first time using markdown for a manuscript methods section (see previous post on getting set up in markdown). It had lots of equations so using LaTeX to write the equations was quite nice. Here’s an example of the markdown code:

$$
\mu_{s,h,d,y} = \left\{
\begin{array}{l l}
\omega_{s,h,d,y} + \delta_{s}(t_{s,h,d-1,y} - \omega_{s,h,d-1,y}) & \quad \text{for } t_{s,h,d-1,y} \text{ is real} \\
\omega_{s,h,d,y} & \quad \text{for } t_{s,h,d-1,y} \text{ is not real}
\end{array} \right.
$$

where $\delta_{s}$ is an autoregressive [AR(1)] coefficient
that varies randomly by site and $\omega_{s,h,d,y}$
is the expected temperature before accounting for temporal
autocorrelation in the error structure.


which nicely renders in HTML (or PDF) as seen in this screenshot. This was quite refreshing as I always have headaches with MS Word Equation Editor. Mostly, I just wanted to write a quick post to share some of the tools I found useful while doing this. In the screenshot above I was editing in Sublime Text and doing the automatic preview in Marked 2. It’s the first time I’ve moved away from Mou, which was a nice editor with built-in preview but I wanted to take advantage of some of the higher end features of Sublime Text and Marked 2. I’m glad I did. I’m using trial versions but will definitely buy them this week. Great software!

## Automate pandoc

pandoc-watch

Makefile to convert all markdown files in a directory to PDF
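As a rough alternative to a Makefile, the same batch conversion can be sketched in Python. This is only an illustration: it assumes pandoc is installed and on the PATH (plus a LaTeX engine for PDF output), and the function names `pandoc_commands` and `convert_all` are made up for this example rather than taken from pandoc-watch or the linked Makefile.

```python
from pathlib import Path
import shutil
import subprocess


def pandoc_commands(directory="."):
    """Build one pandoc command per Markdown file found in `directory`."""
    commands = []
    for md_file in sorted(Path(directory).glob("*.md")):
        pdf_file = md_file.with_suffix(".pdf")
        # Basic pandoc invocation: pandoc input.md -o output.pdf
        commands.append(["pandoc", str(md_file), "-o", str(pdf_file)])
    return commands


def convert_all(directory="."):
    """Run pandoc on every Markdown file in `directory`."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    for command in pandoc_commands(directory):
        # check=True stops at the first file that fails to convert.
        subprocess.run(command, check=True)
```

Calling `convert_all("manuscript")` would then write a PDF next to each `.md` file in a (hypothetical) `manuscript` directory. Building the commands separately from running them makes it easy to print them first and see what would happen.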

## LaTeX Symbols and Math

http://martinkeefe.com/math/mathjax3 (fantastic resource)

http://en.wikibooks.org/wiki/LaTeX/Mathematics

http://www.artofproblemsolving.com/Wiki/index.php/LaTeX:Symbols

## Academic Social Networking

Online social networking was basically unheard of just a decade ago; now it’s integrated into the fabric of American society. Facebook pages are advertised on the national news and Twitter hashtags pop up everywhere. And it’s not just American society either, as evidenced by the use of Twitter for social organization during the Arab Spring.

From what I’ve observed, the use of social networking appears to be extremely varied in academia. This isn’t surprising given the high demands already placed on academics and the slow turnover rate among faculty (older faculty are less likely to adopt new tools of questionable utility, but there are numerous exceptions, of course).

I have found blogging to be useful for improving my quick writing skills, thinking through new ideas, getting feedback on ideas and computer code, and making new contacts. My blog is networked through r-bloggers and the International Network of Next Generation Ecologists (INNGE) supported Ecobloggers, which both help create a community of users. I have found a tremendous amount of useful R code and statistical advice on other people’s blogs.

However, I’ve found social networks like Facebook and LinkedIn to be of limited use so far. I’m sure there is a place for work-related Facebook pages, but I’ve just found that I have been outlets. Google+ is similar, but I do have an account and use it on occasion for science-related posts and reading. The only mainstream social network that I find really useful is Twitter. I get feedback on computing and statistics questions quickly, find links to articles and ideas I wouldn’t otherwise come across, meet and interact with new people (even in person at the ESA meeting tweetup), and share my thoughts and research with a larger audience.

I’ve signed up but don’t regularly use a number of other social networks and sites where I can post my academic persona. I think these could be useful but haven’t made great use of them yet (listed below). Even things like Mendeley and Stack Exchange have social components and user rating/badge systems. I think it’s important to manage one’s own online image both personally and professionally. I don’t know if I do the best job, but I’ve at least had some fun exploring different options. I am generally careful not to post anything online that I wouldn’t want a hiring committee or my grandmother to find.

Here’s what I’ve used for managing my online professional presence:

## Teaching Scientific Computing: Peer Review

This post is going out on a bit of a limb because I am not familiar with the pedagogical literature relating to teaching scientific computing. As such, I can only speak from my very limited experience. I’ve taken a couple of short courses on scientific computing, but the only formal full-semester course I’ve taken was Introduction to C Programming for Engineers 15 years ago. In that course, the instructor spent 50 minutes 3 days a week writing code on the chalkboard in front of us and we were expected to learn. Homework was to write increasingly large programs throughout the semester. If they didn’t work we got a 0%, if they produced the wrong output we got a 50%, and if they worked properly we got a 100%. Obviously it was a terrible course (although a number of my statistics courses that involved programming were not very different, so this might be more common than I’d like to believe). Besides some of the conspicuous instructional problems, I was just thinking that scientific programming courses could learn from pedagogy in the humanities. The University of New Hampshire requires undergraduates to take a number of writing intensive courses. To qualify as writing intensive, a course must meet 3 criteria:

1. Students in the course should do substantial writing that enhances learning and demonstrates knowledge of the subject or the discipline. Writing should be an integral part of the course and should account for a significant part (approximately 50% or more) of the final grade.
2. Writing should be assigned in such a manner as to require students to write regularly throughout the course. Major assignments should integrate the process of writing (prewriting, drafting, revision, editing). Students should be able to receive constructive feedback of some kind (peer response, workshop, professor, TA, etc.) during the drafting/revising process to help improve their writing.
3. The course should include both formal (graded) and informal (heuristic) writing.  There should be papers written outside of class which are handed in for formal evaluation as well as informal assignments designed to promote learning, such as invention activities, in-class essays, reaction papers, journals, reading summaries, or other appropriate exercises.

I think these criteria could be applied or at least adapted for scientific computing courses. The 1st one is easy. The 2nd and 3rd are what I think computing courses could really take advantage of. From what I’ve seen, there is often not a lot of time spent on informal feedback from instructors and peers to help with revision. In programming, especially with flexible languages like R, there are often many solutions to the same problems. Useful assignments could be to critique the programs of peers, find ways to improve code efficiency, and provide alternative solutions to sections of code. This could include critiques of the commenting and README files.

In introductory courses there is often an emphasis on covering content. Some people will balk at the idea of spending time on learning alternatives to simple options when there is clearly 1 best solution and so much material to cover to get students writing even simple scripts. However, it’s better in my opinion to learn a few things well than many things superficially. By evaluating, revising, and developing alternatives to code written by peers, students will learn how to program better. There is a reason that informal assessment, peer review, and revision are required parts of writing intensive courses. Those same reasons apply to scientific computing courses. Just as review and revision make us better writers, they will make us better programmers.

## Open Access Publishing

As those who know me IRL or follow me on Twitter (@djhocking) know, I am an advocate for open science. This includes data sharing, open source software, open access to analysis code, and open access publishing. This is my first post on the subject. I actually started this post back on 02 April 2013, but then my daughter surprised us by arriving a couple of weeks early, so I am just reviving the post now. My thoughts were stimulated by an article in Nature: Cost of Publishing. In the article the author notes that,

an average revenue per article of roughly $5,000. Analysts estimate profit margins at 20–30% for the industry, so the average cost to the publisher of producing an article is likely to be around $3,500–4,000.
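As a rough check of the quoted figures, assuming cost is simply revenue minus profit (a simplification for illustration, not a formula given in the article):

$$
\text{cost} \approx \text{revenue} \times (1 - \text{margin}), \qquad
5000 \times (1 - 0.20) = 4000, \qquad
5000 \times (1 - 0.30) = 3500
$$

which reproduces the quoted range of roughly $3,500–4,000 per article.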

The author notes that costs vary widely and are difficult to estimate. “Diane Sullenberger, executive editor for Proceedings of the National Academy of Sciences in Washington DC, says that the journal would need to charge about $3,700 per paper to cover costs if it went open-access.” These values align well with the typical $3000 cost of electing for open-access in traditional journals. Nature publishing suggests that it would be much more expensive to publish open-access. These traditional journals provide copy-editing and sometimes promotional activities.
However, newer journals without a tradition of paper printing, copy-editing, and typesetting are able to publish open-access articles with less expense. One complaint of scientists is that they provide the reviewing, primary editing, and formatting for free and don’t see where the expense comes from.

For example, most of PLoS ONE’s editors are working scientists, and the journal does not perform functions such as copy-editing. Some journals, including Nature, also generate additional content for readers, such as editorials, commentary articles and journalism.

Some of the expense is reliable, long-term server space. Publishers such as PLOS require considerable initial capital investment through grants and venture capitalists. Then high-volume publishing helps maintain finances in the black. PLOS ONE charges $1350 per article but is generally very good about reducing or waiving fees if grants are not available to pay for publishing.

In addition to the wave of new open access journals, there is an increasing interest in preprint servers (more here about preprints). The Nature article points out that,

Many researchers in fields such as mathematics, high-energy physics and computer science … post pre- and post-reviewed versions of their work on servers such as arXiv — an operation that costs some $800,000 a year to keep going, or about $10 per article. Under a scheme of free open-access ‘Episciences’ journals proposed by some mathematicians this January, researchers would organize their own system of community peer review and host research on arXiv, making it open for all at minimal cost (see Nature http://doi.org/kwg; 2013).

One other major benefit of open access publishing is that even if per-article costs remained the same, there would be value in the time researchers save in accessing and reading papers that are not behind paywalls. Despite many of the benefits of open-access publishing, the Nature article points out that,

a total conversion will be slow in coming, because scientists still have every economic incentive to submit their papers to high-prestige subscription journals. The subscriptions tend to be paid for by campus libraries, and few individual scientists see the costs directly. From their perspective, publication is effectively free.

### Open Access is Coming Though

The US Federal Government will be requiring open access of articles from publicly-funded research.

White House Open Access

A review cascade can also greatly help facilitate publishing and review times as well as encouraging open access publishing. Nature now has a review cascade.

### OA Journals for Ecology and the Environment

Here are some open access journals for research on ecology, conservation biology, and the environment. Most of my focus is on English-language journals for ecology, but even for that discipline this is in no way an exhaustive list. New OA journals seem to be popping up everywhere these days. It will certainly be interesting to see the future of scientific publishing. More information on OA journals is available through the Directory of Open Access Journals (DOAJ). Let me know if you have experience with any of these journals/publishers or if you know of other good options for ecology and conservation. [UPDATE: This list is now being updated at https://danieljhocking.wordpress.com/links/oa-journals/]

BMC Ecology

• Publisher: BioMed Central
• Indexed: Yes
• Year Established: 2001
• Eigenfactor:
• OA Cost: $USD 1955

Elementa

• Publisher: BioOne?
• Indexed: Not yet?
• Year Established: 2012
• Eigenfactor:
• OA Cost: $USD 1,450
• Judge Importance: No
• Acceptance Rate: Yes
• Publish Reviews: No
• License: CC-BY 3.0 Unported

Ecosphere

• Publisher: Ecological Society of America
• Indexed: Not as of 02 March 2013
• Year Established: 2010
• Eigenfactor:
• Impact Factor: Not calculated yet
• OA Cost: $USD 1250/1500 (members/non-members)
• Judge Importance: Yes
• Acceptance Rate:
• Publish Reviews: No

Herpetological Conservation and Biology

• Publisher:
• Indexed: Yes
• Year Established: 2006
• Eigenfactor:
• Impact Factor: 0.76 (2012: 5yr)
• OA Cost: No
• Judge Importance: Yes
• Acceptance Rate: 60%
• Publish Reviews: Yes

Ideas in Ecology and Evolution

• Publisher: Queen’s University
• Indexed: Partly (not by ISI)
• Year Established: 2008
• Eigenfactor:
• Impact Factor:
• OA Cost: $50 – 200 (Canadian)
• Judge Importance: Yes
• Acceptance Rate:
• Publish Reviews: Yes

International Journal of Ecology

• Publisher: Hindawi Publishing Company
• Indexed: Yes
• Year Established: 2007
• Eigenfactor:
• Impact Factor:
• OA Cost: $USD 600
• Judge Importance: Yes
• Acceptance Rate:
• Publish Reviews: Yes

Journal of Biodiversity and Ecological Sciences

• Publisher:
• Indexed:
• Year Established: 2011
• Eigenfactor:
• Impact Factor:
• OA Cost:
• Judge Importance:
• Acceptance Rate:
• Publish Reviews: Yes

Natural Resources

The Open Ecology Journal

• Publisher: Bentham open
• Indexed: Yes
• Year Established: 2008
• Eigenfactor:
• Impact Factor: ~1.86
• OA Cost: $600-900
• Judge Importance:
• Acceptance Rate:
• Publish Reviews: Yes
• License: Creative Commons Attribution non-commercial License 3.0

Open Journal of Ecology

• Publisher: Scientific Research Publishing
• Indexed: Yes
• Year Established: 2011
• Eigenfactor:
• Impact Factor:
• OA Cost: $USD 500 + 50 per page over 10
• Judge Importance:
• Acceptance Rate:
• Publish Reviews: ?
• License: Creative Commons Attribution License

PeerJ

• Publisher: PeerJ
• Indexed: Yes
• Year Established: 2012
• Eigenfactor:
• Impact Factor:
• OA Cost: Lifetime Membership ($USD 99 per author)
• Judge Importance: Yes
• Acceptance Rate:
• Publish Reviews: No (can upload as non-peer reviewed PrePrint)

PLoS Biology

• Publisher: Public Library of Science
• Indexed: Yes
• Year Established: 2003
• Eigenfactor:
• Impact Factor:
• OA Cost: $USD 2900 (reduced for many countries or at request)
• Judge Importance: Yes
• Acceptance Rate:
• Publish Reviews: Yes

PLoS ONE

• Publisher: Public Library of Science
• Indexed: Yes
• Year Established: 2006
• Eigenfactor:
• Impact Factor:
• OA Cost: $USD 1350 (reduced for many countries or at request)
• Judge Importance: No
• Acceptance Rate: 69%
• Publish Reviews: No

### OA Options of more traditional journals

Acta Oecologica – Published by Elsevier, which has been controversial in its relationship to the OA movement (here, here, here, current info). OA Option: $USD 2500

## Regular Expressions to Increase MS Word Efficiency


Just a quick post today. I was formatting an article for submission to the journal Biological Conservation. In the instructions for the authors, I came across the line “Use decimal points (not commas); use a space for thousands (10 000 and above).”

For me that means numbers like 1,565 need to become 1565 (smaller than 10,000) and 136,000 becomes 136 000.

Without using regular expressions, the options are to search the document for commas (hundreds in the document) or go through the entire manuscript line by line and hope you don’t miss anything. Regular expressions allow you to match patterns in documents/files/code. They can help you to find files on your computer, scrape web sites, or in this case find and replace strings in a Microsoft Word document.

For my example above, I used the find and replace feature in MS Word (you may need to go into the advanced options and check “use wildcards”). To replace the comma with a space for values over 10,000, I searched to find

([0-9])([0-9]),([0-9])([0-9])([0-9])

which means find a digit between 0 and 9, followed by another digit, then a comma, followed by three more digits. This will work for numbers above 10,000 including 100,000. You may need a different search for numbers over 1 million but I knew I didn’t have any in this document.

I then replaced each string that matched that with

 \1\2 \3\4\5

which means replace with the string that was found: the first character (digit 0-9), then the second character, then a space, then the next three digits.
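The same idea can be sketched outside of Word, for example on a plain-text or markdown draft. The snippet below is a minimal Python illustration, not the Word wildcard syntax used above: the names `THOUSANDS` and `journal_style` are invented for this example, and it assumes the only comma-grouped numbers in the text are below 1 million (larger numbers are deliberately left untouched).

```python
import re

# One to three leading digits, a comma, then exactly three digits.
# The lookarounds stop a match from starting or ending inside a longer
# run of digits or another comma group, so 1,234,567 is left unchanged.
THOUSANDS = re.compile(r"(?<![\d,])(\d{1,3}),(\d{3})(?![\d,])")


def journal_style(text):
    """Rewrite comma-separated thousands: 1,565 -> 1565, 136,000 -> 136 000."""
    def fix(match):
        lead, tail = match.groups()
        # 10 000 and above get a space; smaller numbers get no separator.
        separator = " " if len(lead) >= 2 else ""
        return lead + separator + tail
    return THOUSANDS.sub(fix, text)


print(journal_style("We marked 1,565 animals across 136,000 trap nights."))
```

Using a replacement function rather than a fixed replacement string lets one pass handle both cases (numbers below and above 10,000), which in Word took separate searches.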

With wildcards like * and nearly unlimited combinations, once you get comfortable with regular expressions, you can locate and modify documents with ease. See here or here for more of the basics of regular expressions.

Now to get that manuscript submitted. . .