Published by Καλλίστρατος Ἰωάννης Στεφανόπουλος. Modified over 6 years ago.
Practice PRX Matching Using This Program And Learn How to Do This
Learn SAS’s Perl Regular Expression (PRX) Matching to Catch All 384,000 Ways to Misspell “Afghanistan”
Paul C. Genovesi, MSBA

Background

Although the SAS programming language has been around since 1976, SAS Enterprise Guide is a much more recent product. SAS Enterprise Guide is simply a way to talk to SAS: it is the connection between you and SAS. With SAS Enterprise Guide, SAS becomes more graphical, more point-and-click, and more user-friendly. The process flow window, a SAS Enterprise Guide feature, lets the user compartmentalize and “flow chart” the operation of a SAS program with icons, making it more understandable from an “above-the-code,” non-programmer viewpoint.

PRX functions and call routines were added to SAS in 2004, making them relatively new. Compared to all other SAS functions and calls, they are unsurpassed for string identification and manipulation through their use of pattern matching. At the heart of pattern matching is the use of PRX metacharacters, which can be thought of as “wildcards on steroids.” Learning how to use the PRX metacharacters leads to learning pattern matching, which leads to mastery of string identification and manipulation.

The Program: A SAS Enterprise Guide Project

The regex_learning_tool, consisting of a SAS Enterprise Guide project and a Microsoft Excel file, lets users practice pattern matching efficiently and effectively while keeping track of their “practice trail” for future reference, expansion, and avoidance of wheel-reinventing. Each row in the Excel file contains a user-entered match record: a unique combination of a string (in one column) and the Perl regular expression (in another column) being matched against it.
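A match record of this kind might be processed row by row as in the minimal sketch below. It is an illustration under stated assumptions, not the project's actual code: the dataset name match_records, the column names string and regex, and the sample rows are hypothetical stand-ins for the imported Excel file. PRXPARSE, PRXMATCH, and CALL PRXFREE are standard SAS routines.

```sas
/* Hypothetical stand-in for the imported Excel match records:   */
/* one description, one string, and one Perl regular expression  */
/* per row, separated by '~' because the patterns contain '|'.   */
data match_records;
    length match_description $40 string $40 regex $60;
    infile datalines dlm='~' truncover;
    input match_description string regex;
    datalines;
optional letter~Afganistan~/afg?h?an/i
alternation~Aphghanistan~/a(?:h?ff?|ph)g/i
quantifier {1,2}~Afghanisstan~/ni[sz]{1,2}tan/i
word boundary~Kabul, Afghanistan~/\bkabul\b/i
no match expected~Pakistan~/^afg/i
;
run;

data match_results;
    set match_records;
    /* The pattern differs on every row, so it is compiled per row. */
    re = prxparse(strip(regex));
    if not missing(re) then do;
        /* PRXMATCH returns the start position of the match, or 0. */
        match_position = prxmatch(re, strip(string));
        /* Release the memory held by the compiled pattern. */
        call prxfree(re);
    end;
    drop re;
run;

proc print data=match_results noobs;
run;
```

Because each row carries its own pattern, the sketch compiles and frees the pattern on every row. A program that applies one fixed pattern to all rows would normally compile it once instead.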
The regex_learning_tool project imports and processes each match record and returns near-instant results of the match as a SAS dataset and a SAS report. In addition, the project lets the user selectively process match records based on a PRX metacharacter occurring in the Perl regular expression column or a string contained in the match_description column. This speeds up processing by zeroing in on only the desired match records. For example, the current Excel file contains over 400 match records demonstrating nearly all PRX metacharacters, but the user might be interested only in match records containing a certain PRX metacharacter. Most importantly, the regex_learning_tool supports repeated cycles of input modification and output inspection, so the user can quickly learn pattern matching through trial and error.

PRX Functions in Use: SAS Code That Catches All 384,000 Misspellings

    /* Fragment: the statements after the %let lines run inside a DATA step. */
    %let td=travel_destination;      /* variable holding the text to search  */
    %let ws='(?: |-)*';              /* optional spaces or hyphens           */

    re_Afghanistan = prxparse('/\b' ||
        '(' || 'a(?:h?ff?|ph)'    || ')' || &ws ||  /* 5 variants                  */
        '(' || 'gg?h?[aeou]h?nn?' || ')' || &ws ||  /* x 2*2*4*2*2 = 64 variants   */
        '(' || '[iaeou][sz]{1,2}' || ')' || &ws ||  /* x 5*6 = 30 variants         */
        '(' || 'tt?[aeiou]h?nn?'  || ')' ||         /* x 2*5*2*2 = 40 variants     */
        '\b/i');                  /* Total = 5*64*30*40 = 384,000 misspellings     */

    if prxmatch(re_Afghanistan, &td) then do;
        matched_expr = 'Afghanistan';
        /* Copy each of the four capture buffers for inspection. */
        cbuff1 = prxposn(re_Afghanistan, 1, &td);
        cbuff2 = prxposn(re_Afghanistan, 2, &td);
        cbuff3 = prxposn(re_Afghanistan, 3, &td);
        cbuff4 = prxposn(re_Afghanistan, 4, &td);
        output Travel.RESULTS_travel;
        /* Reset the buffers after the row is written. */
        cbuff1 = ' '; cbuff2 = ' '; cbuff3 = ' '; cbuff4 = ' ';
    end;

The per-group counts in the comments multiply to the stated total, 5 × 64 × 30 × 40 = 384,000. The optional space or hyphen separators matched by &ws are not counted.

Poster panels (images not reproduced): Input: An Excel File; Example: Caught Misspellings; autoexec; Process Flow; Output: A SAS Dataset and SAS Report; regex field; Search-and-Match Programs; Process Flow Ordered Lists – Import, Search, and Match.

Conclusion

Learning SAS Enterprise Guide has little to do with your level of SAS experience and more to do with learning how to organize the SAS programming you already know more effectively.
It lets the user separate and isolate complex SAS programming within its own labelled program icon, thus separating what it does from how it does it. At the heart of learning how to use SAS’s PRX functions and call routines is learning PRX pattern matching. Like steel breaking sod, the regex_learning_tool lets the user quickly learn PRX pattern matching and unleash the power of string identification and manipulation using SAS’s PRX functions and call routines.

Acknowledgments: The author would like to thank Lt Col Melinda Eaton for her overall editing of this poster and Lt Col Monica Selent for her editing advice.

The views expressed are those of the author and do not necessarily reflect the official policy or position of the Air Force, the Department of Defense, or the U.S. Government.

Distribution A: Approved for public release; distribution is unlimited. Case Number: 88ABW-2014-xxxx, xx Oct 2014.
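As a supplementary illustration, the Afghanistan pattern from the PRX code above might be exercised end to end as in the following minimal sketch. The dataset name travel_results, the variable lengths, and the five test values are assumptions chosen for illustration. The pattern itself is the poster's, with the separator written inline rather than through the &ws macro variable, and it is compiled once on the first iteration rather than on every row.

```sas
data travel_results;
    length travel_destination $40 matched_expr $20 cbuff1-cbuff4 $10;
    retain re_Afghanistan;

    /* Compile the pattern once and retain its ID across rows. */
    if _n_ = 1 then re_Afghanistan = prxparse(
        '/\b' ||
        '(a(?:h?ff?|ph))'    || '(?: |-)*' ||
        '(gg?h?[aeou]h?nn?)' || '(?: |-)*' ||
        '([iaeou][sz]{1,2})' || '(?: |-)*' ||
        '(tt?[aeiou]h?nn?)'  ||
        '\b/i');

    infile datalines truncover;
    input travel_destination $char40.;

    /* On a match, copy the four capture buffers for inspection. */
    if prxmatch(re_Afghanistan, travel_destination) then do;
        matched_expr = 'Afghanistan';
        cbuff1 = prxposn(re_Afghanistan, 1, travel_destination);
        cbuff2 = prxposn(re_Afghanistan, 2, travel_destination);
        cbuff3 = prxposn(re_Afghanistan, 3, travel_destination);
        cbuff4 = prxposn(re_Afghanistan, 4, travel_destination);
    end;

    drop re_Afghanistan;
    /* Hypothetical test values: four spellings and one non-match. */
    datalines;
Afghanistan
Afganistan
Aphghanistan
Af-ghanistan
Pakistan
;
run;

proc print data=travel_results noobs;
run;
```

Unlike the poster's fragment, this sketch writes every row, so rows that do not match keep a blank matched_expr and blank capture buffers, which makes misses as easy to inspect as hits.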