How to: check for, identify and remove non-alphanumeric characters when working with strings in SAS base code.
The SAS code below is one way to remove non-alphanumeric characters from a string. It uses the prxchange() function to remove any characters not specified in the Perl regular expression argument.
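As a minimal sketch of this idea (not the original listing), the step below strips everything except letters, digits and spaces from a single value. The variable names `raw` and `clean` and the sample text are illustrative assumptions.

```sas
data _null_;
    length raw clean $50;
    raw = 'Hello, World! #2024';

    /* s/pattern// substitutes each match with nothing.              */
    /* [^A-Z 0-9] matches any character that is NOT a letter, a      */
    /* digit or a space; the trailing i makes it case-insensitive.   */
    /* The second argument, -1, means "replace every match".         */
    clean = prxchange('s/[^A-Z 0-9]//i', -1, raw);

    put raw= clean=;   /* clean=Hello World 2024 */
run;
```

The result is written to the SAS log, so no output dataset is created.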
Below is the same SAS code, showing how to use the prxmatch() function in a DATA step.
At line 18 we use prxmatch() to check whether our string contains any special characters according to the regular expression passed as the first argument, /[^A-Z 0-9]/i. prxmatch() returns the position of the first match, or 0 if there is none, so SAS can treat the result as true or false; based on this result we update a column accordingly.
Then, if the prxmatch() condition is true, we use prxchange() twice: first to extract all special characters (populating special_chars_list) and second to create a new string without any special characters (clean_string). Note that the difference between the two calls is the ^ symbol in the regular expression: a leading ^ inside the square brackets negates the character class, so it matches every character that is not listed.
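The steps described above can be sketched as follows. This is a stand-in rather than the original listing, so its line numbers will not match the ones mentioned in the text. The dataset name `work.string_check`, the input column `my_string`, the flag column `has_special_chars` and the sample rows are all assumptions made for illustration; only `special_chars_list` and `clean_string` come from the text. The `else` branch is also an assumption about how rows without special characters might be handled.

```sas
data work.string_check;
    infile datalines truncover;
    length my_string special_chars_list clean_string $50
           has_special_chars $1;
    input my_string $char50.;

    /* prxmatch returns the position of the first match, or 0 if  */
    /* nothing matches, so it can be used directly as a condition */
    if prxmatch('/[^A-Z 0-9]/i', my_string) then do;
        has_special_chars = 'Y';

        /* No ^: delete the letters, digits and spaces, which     */
        /* leaves only the special characters behind              */
        special_chars_list = prxchange('s/[A-Z 0-9]//i', -1, my_string);

        /* With ^: delete everything except letters, digits and   */
        /* spaces, which leaves the cleaned string                */
        clean_string = prxchange('s/[^A-Z 0-9]//i', -1, my_string);
    end;
    else do;
        has_special_chars = 'N';
        clean_string = my_string;
    end;
    datalines;
Hello World 123
Price: $45.00 (approx)
user@example.com
;
run;
```

Because trailing blanks are spaces, and spaces are in the allowed set, the padding SAS adds to fixed-length character variables does not trigger a false match.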
Additionally, if you wanted to add a character to ignore, for example an underscore, the code from line 30 onwards shows you how. Likewise, if you wanted to remove spaces from the string and keep only underscores, you could update the regular expression to look like this:
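One plausible way to write both variations is sketched below. The exact expressions are assumptions derived from the /[^A-Z 0-9]/i pattern used earlier, and the variable names and sample text are illustrative.

```sas
data _null_;
    length raw keep_underscore no_spaces $50;
    raw = 'first_name: John Smith!';

    /* Underscore added to the allowed set:                  */
    /* letters, digits, spaces and _ are kept                */
    keep_underscore = prxchange('s/[^A-Z _0-9]//i', -1, raw);

    /* Space removed from the allowed set:                   */
    /* only letters, digits and _ are kept                   */
    no_spaces = prxchange('s/[^A-Z_0-9]//i', -1, raw);

    put keep_underscore= no_spaces=;
run;
```

With this input, the first call should keep `first_name John Smith` and the second should produce `first_nameJohnSmith`.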