Regular Expressions to Increase MS Word Efficiency


Regular Expressions (Photo credit: Jeff Kubina)

Just a quick post today. I was formatting an article for submission to the journal Biological Conservation. In the instructions for the authors, I came across the line “Use decimal points (not commas); use a space for thousands (10 000 and above).”

For me, that means a number below 10,000, such as 1,565, loses its comma and becomes 1565, while 136,000 becomes 136 000.
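
As a rough illustration of that rule, here is a minimal Python sketch of the target format for a whole number. Python is not part of the workflow in this post, and `format_count` is a hypothetical helper name invented for the example.

```python
def format_count(n: int) -> str:
    """Format a non-negative whole number following the journal rule quoted above."""
    if n < 10_000:
        # Below 10 000: no thousands separator at all.
        return str(n)
    # 10 000 and above: group the digits in threes, separated by spaces.
    return f"{n:,}".replace(",", " ")


print(format_count(1565))    # 1565
print(format_count(136000))  # 136 000
```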

Without regular expressions, the options are to search the document for every comma (there are hundreds) or to go through the entire manuscript line by line and hope you don’t miss anything. Regular expressions let you match patterns in documents, files, and code. They can help you find files on your computer, scrape websites, or, in this case, find and replace strings in a Microsoft Word document.

For my example above, I used the find and replace feature in MS Word (you may need to go into the advanced options and check “use wildcards”). To replace the comma with a space for values of 10,000 and above, I searched for

([0-9])([0-9]),([0-9])([0-9])([0-9])

which means: find a digit between 0 and 9, followed by another digit, then a comma, followed by three more digits. Because only the two digits before the comma are part of the match, this works for any number from 10,000 up to 999,999, including 100,000. Numbers of 1 million and above contain a second comma and would need a different search, but I knew I didn’t have any in this document.
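
To see what this search does and does not catch, the same pattern can be tried in Python's `re` module. This is only a sketch: Python's regex engine stands in for Word's wildcard search here, and the sample sentence is made up for the example.

```python
import re

# Same pattern as the Word search above; [0-9] and ( ) mean the same thing here.
PATTERN = re.compile(r"([0-9])([0-9]),([0-9])([0-9])([0-9])")

# Made-up sample text covering the cases discussed in this post.
sample = "Counts were 1,565 and 12,345 and 136,000, but never 1,234,567."

matches = [m.group(0) for m in PATTERN.finditer(sample)]
print(matches)  # ['12,345', '36,000', '34,567']
```

Note that 1,565 is skipped because the pattern requires two digits before the comma, and that only the tail of 1,234,567 is matched, which is why numbers in the millions would need a different search.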

I then replaced each string that matched that with

\1\2 \3\4\5

which means: replace the match with the first digit that was found, then the second digit, then a space, then the last three digits.
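
The search and replace pair can be sketched in Python as well, since `re.sub` accepts the same `\1` style backreferences. Again, this is an illustration rather than what Word runs internally; `space_thousands` is a hypothetical function name and the sample sentence is invented.

```python
import re

# Search pattern and replacement string, copied from the Word dialog fields.
PATTERN = re.compile(r"([0-9])([0-9]),([0-9])([0-9])([0-9])")
REPLACEMENT = r"\1\2 \3\4\5"


def space_thousands(text: str) -> str:
    """Swap the thousands comma for a space in numbers of 10,000 and above."""
    return PATTERN.sub(REPLACEMENT, text)


print(space_thousands("We counted 12,345 stems across 136,000 ha in 1,565 plots."))
# We counted 12 345 stems across 136 000 ha in 1,565 plots.
```

Numbers below 10,000 such as 1,565 are left alone by this pattern, so their commas still need a separate pass.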

Once you get comfortable with regular expressions, wildcards like * and their nearly unlimited combinations let you locate and modify text in documents with ease. See here or here for more of the basics of regular expressions.
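
As one example of combining these pieces, here is a hedged Python sketch that handles both cases from the journal rule in a single pass, including numbers in the millions. It is not something Word's find and replace can do directly, since it relies on a replacement function; `journal_numbers` and `_reformat` are hypothetical names, and the pattern assumes commas inside numbers are always thousands separators.

```python
import re

# A lead group of 1-3 digits followed by one or more ",ddd" groups.
# The lookbehind stops a match from starting in the middle of a longer number;
# the lookahead rejects a final group that runs on into more digits.
NUMBER = re.compile(r"(?<![0-9,])[0-9]{1,3}(?:,[0-9]{3})+(?![0-9])")


def _reformat(match):
    """Drop the commas, then add spaces back only for 10 000 and above."""
    digits = match.group(0).replace(",", "")
    if int(digits) < 10_000:
        return digits
    return f"{int(digits):,}".replace(",", " ")


def journal_numbers(text: str) -> str:
    """Rewrite every comma-grouped number in text to the journal style."""
    return NUMBER.sub(_reformat, text)


print(journal_numbers("1,565 and 136,000 and 1,234,567"))
# 1565 and 136 000 and 1 234 567
```

Text such as “In 2010, 300 birds” is left untouched because the comma is followed by a space rather than three digits. Other comma uses, such as lists of numbers written without spaces, would still need checking by eye.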

Now to get that manuscript submitted. . .
