Regular Expressions: Pattern and String Matching in Text
What is a RegEx?
A “Find” option, but fancier
Match a “pattern” to a “string”
Cheat sheet:
Different coding languages have different RegEx syntax, but you can usually set them to read any of the other syntaxes
My favorite RegEx tester is:
Find any word that starts with a capital letter:
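The pattern shown on the original slide is not preserved here, so the following is only a minimal sketch of one way to write it in Python. The sample sentence is made up for illustration.

```python
import re

# Hypothetical sample text (not from the slides).
text = "Alice met Bob in Damascus on a rainy Tuesday."

# \b     word boundary, so we start at the beginning of a word
# [A-Z]  exactly one capital letter
# \w*    zero or more further word characters
capitalized = re.findall(r"\b[A-Z]\w*", text)
print(capitalized)  # ['Alice', 'Bob', 'Damascus', 'Tuesday']
```

Note that `[A-Z]` only covers unaccented ASCII capitals; text with other alphabets would need a broader character class.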
Find any word that starts with a capital letter and then a lower case letter:
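As a rough sketch (the slide's own pattern is not preserved), requiring a lowercase letter right after the capital is one way to skip all-caps acronyms. The sample sentence is invented for illustration.

```python
import re

# Hypothetical sample text (not from the slides).
text = "NASA and the FBI briefed Senator Smith."

# [A-Z][a-z] requires a capital letter immediately followed by a lowercase one,
# so acronyms such as NASA and FBI do not match.
title_case = re.findall(r"\b[A-Z][a-z]\w*", text)
print(title_case)  # ['Senator', 'Smith']
```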
Match a phrase that might be arranged differently:
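The phrase used on the original slide is not preserved, so this sketch assumes a hypothetical one: "militant network" versus "network of militants". Alternation with `|` lets a single pattern cover both word orders.

```python
import re

# Two arrangements of the same idea, joined with | inside a non-capturing group.
# s? makes the plural optional; \s+ tolerates any amount of whitespace.
pattern = re.compile(
    r"\b(?:militant\s+networks?|networks?\s+of\s+militants?)\b",
    re.IGNORECASE,
)

# Hypothetical sentences (not from the slides).
sentences = [
    "Militant networks spread quickly.",
    "A network of militants formed.",
    "The network was small.",
]

matches = []
for sentence in sentences:
    m = pattern.search(sentence)
    matches.append(m.group(0) if m else None)

print(matches)  # ['Militant networks', 'network of militants', None]
```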
Or dates…
You can get about as fancy with this as you’d like.
Useful for parsing larger chunks of text entered by date or number.
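A minimal sketch of date matching, assuming dates are written as month/day/year with a four-digit year (the format on the original slide is not preserved). Capture groups pull the three parts out separately.

```python
import re

# Hypothetical sample text (not from the slides).
text = "Reports filed 3/15/2008 and 11/02/2008; the budget was 450 million."

# Each (...) is a capture group: 1-2 digit month, 1-2 digit day, 4-digit year.
date_pattern = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

dates = date_pattern.findall(text)
print(dates)  # [('3', '15', '2008'), ('11', '02', '2008')]
```

This only checks the shape of the string; it would also accept impossible dates such as 13/45/2008, so a real pipeline would likely validate the captured values afterwards.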
Example Uses: Syria
Militant Networks and Violence in Syria
What sorts of networks are likely to result in infighting as opposed to alliances?
Example Uses: .gov
Searching terabytes of this…
To get counts:
To say something about government attention to different issues
Figure: 2008 Agency Relative Emphasis
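One plausible way to turn matches into counts is sketched below. The agency names, page text, and issue patterns are all hypothetical, and "relative emphasis" is interpreted here simply as each issue's share of an agency's total matches, which may not be how the original figure was computed.

```python
import re

# Hypothetical page text keyed by agency (not real .gov data).
pages = {
    "AgencyA": "Climate rules cut emissions. New emission limits protect health.",
    "AgencyB": "Health clinics expand. Public health funding grows despite climate risks.",
}

# Hypothetical issue patterns: one regex per issue of interest.
patterns = {
    "climate": re.compile(r"\b(?:climate|emissions?)\b", re.IGNORECASE),
    "health": re.compile(r"\bhealth\b", re.IGNORECASE),
}

# Raw counts: number of pattern matches per issue, per agency.
counts = {
    agency: {issue: len(p.findall(text)) for issue, p in patterns.items()}
    for agency, text in pages.items()
}

# Relative emphasis (assumed definition): share of an agency's matches per issue.
shares = {}
for agency, issue_counts in counts.items():
    total = sum(issue_counts.values())
    shares[agency] = {
        issue: (n / total if total else 0.0) for issue, n in issue_counts.items()
    }

print(counts)
print(shares)
```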
RegEx: what it isn’t good for
Time consuming to do pattern matching over large data
When you have a lot of variation in spelling or phrasing
Fuzzy Sets!
Useful reference for fuzzy matches: bloggers.com/fuzzy-string-matching-a-survival-skill-to-tackle-unstructured-information/
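To illustrate why spelling variation calls for something other than an exact pattern, here is a small sketch using the similarity ratio from Python's standard `difflib` module. This is a stand-in chosen for illustration, not the method from the linked reference, and the candidate spellings and the 0.7 cutoff are assumptions.

```python
import difflib

# The spelling we are looking for, and hypothetical spellings found in text.
target = "al-Nusra"
candidates = ["al Nusra", "an-Nusrah", "al-Qaeda", "Hezbollah"]

# get_close_matches keeps candidates whose similarity ratio to the target
# is at least `cutoff`, ordered from most to least similar.
close = difflib.get_close_matches(target, candidates, n=3, cutoff=0.7)
print(close)

# The underlying score for a single pair, between 0.0 and 1.0.
score = difflib.SequenceMatcher(None, target, "al Nusra").ratio()
print(round(score, 3))
```

An exact RegEx for "al-Nusra" would miss both variant spellings, while the similarity threshold keeps them and drops the unrelated names. The right cutoff depends on the data and usually needs tuning by hand.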