Just a quick post today. I was formatting an article for submission to the journal Biological Conservation. In the instructions for the authors, I came across the line “Use decimal points (not commas); use a space for thousands (10 000 and above).”
For me that means numbers like 1,565 need to become 1565 (smaller than 10,000) and 136,000 becomes 136 000.
Without regular expressions, the options are to search the document for every comma (there were hundreds in mine) or go through the entire manuscript line by line and hope you don’t miss anything. Regular expressions let you match patterns in documents, files, and code. They can help you find files on your computer, scrape websites, or, in this case, find and replace strings in a Microsoft Word document.
For my example above, I used the find and replace feature in MS Word (you may need to go into the advanced options and check “use wildcards”). To replace the comma with a space for values over 10,000, I searched to find
which means: find a digit between 0 and 9, followed by another digit, then a comma, followed by three more digits. This works for numbers from 10,000 up to 999,999, including 100,000. You would need a different search for numbers of 1 million or more, but I knew I didn’t have any in this document.
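For readers who would rather test the idea outside Word, here is a minimal sketch of the same search in Python. It is an analogue, not the exact string typed into Word: Word's wildcard syntax differs from Python's `re` syntax, and the sample sentence and the name `PATTERN` are made up for illustration.

```python
import re

# Python analogue of the search described above (an assumption; this is
# not Word's wildcard syntax): a digit, another digit, a comma, then
# exactly three more digits.
PATTERN = re.compile(r"(\d)(\d),(\d{3})")

# Hypothetical sample text, not taken from the manuscript.
sample = "We surveyed 136,000 ha, 12,500 plots and 1,565 nests."

matches = [m.group(0) for m in PATTERN.finditer(sample)]
print(matches)  # ['36,000', '12,500']
```

Note that 1,565 is not matched, because only one digit precedes its comma, and that for 136,000 the match covers just the two digits next to the comma, which is all the replacement needs.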
I then replaced each string that matched that with
which means: replace the match with its first character (a digit 0-9), then the second character, then a space, then the three digits that followed the comma.
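The find-and-replace step can be sketched in Python as well. This is an illustration under assumptions, not the Word procedure itself: the function names `space_thousands` and `drop_small_commas` are hypothetical, and the second helper (removing the comma from four-digit numbers such as 1,565) is one possible way to handle the journal's other requirement rather than something done with the Word search above.

```python
import re

def space_thousands(text: str) -> str:
    """Replace the comma with a space in numbers of 10,000 and above."""
    # \1 and \2 are the two digits before the comma, \3 the three after it.
    return re.sub(r"(\d)(\d),(\d{3})", r"\1\2 \3", text)

def drop_small_commas(text: str) -> str:
    """Remove the comma from four-digit numbers (hypothetical extra step)."""
    # The lookbehind and lookahead make sure the single leading digit and the
    # three trailing digits are not part of a longer comma-grouped number.
    return re.sub(r"(?<![\d,])(\d),(\d{3})(?![\d,])", r"\1\2", text)

# Hypothetical sample text, not taken from the manuscript.
sample = "136,000 ha and 12,500 plots, but 1,565 nests"
result = drop_small_commas(space_thousands(sample))
print(result)  # 136 000 ha and 12 500 plots, but 1565 nests
```

As with the Word search, numbers of 1 million or more are not fully handled by this sketch and would need a different pattern.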
With wildcards like * and nearly unlimited ways to combine them, once you get comfortable with regular expressions you can find and modify text in documents with ease. See here or here for more of the basics of regular expressions.
Now to get that manuscript submitted. . .