Ranjan Sitaula
Standard clinical datasets often contain character variables that hold free text, such as adverse event terms, disposition reasons, and comment fields. These variables are often required to follow a prespecified capitalization. However, the raw datasets that serve as the primary source for these datasets often contain text strings with typographical errors and inconsistent capitalization. Standard SAS functions such as propcase and lowcase do not always give the desired capitalization. Similarly, using character-replacement functions such as tranwrd and translate to fix capitalization and typographical errors becomes cumbersome when many words are involved. To address these shortcomings, a single macro program called "Fontcase Macro" has been created that performs the following tasks simultaneously with minimal user input: (i) converts text of any prior capitalization to sentence case or title case as required, (ii) converts or maintains abbreviations and standard terminology in their standard capitalization, and (iii) corrects typos. The macro uses the Perl regular expression (regex) function prxchange to identify text patterns and change them to the desired capitalization. Prespecified abbreviations and standard terminology are stored in a spreadsheet file that the macro reads. The macro has been extremely useful in generating standard clinical datasets such as SDTM (Study Data Tabulation Model) and ADaM (Analysis Data Model) datasets for variables involving text strings, and in generating TLFs (tables, listings, and figures) from raw datasets.
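The three tasks listed in the abstract can be illustrated with a minimal, language-neutral sketch. It is written in Python rather than SAS and is not the Fontcase Macro's code: the function name `fontcase`, its `style` parameter, and the two in-memory lookup tables are hypothetical stand-ins (the macro itself reads its terms from a spreadsheet and applies them with prxchange). The sketch only shows the general idea of lowercasing, correcting known typos, applying sentence or title case with regular expressions, and then restoring standard terms.

```python
import re

# Hypothetical lookup tables; the macro described above keeps such
# entries in a spreadsheet file rather than in code.
STANDARD_TERMS = {"ecg": "ECG", "alt": "ALT", "mri": "MRI"}
TYPO_FIXES = {"headahce": "headache", "nausae": "nausea"}


def fontcase(text, style="sentence"):
    """Normalize capitalization, fix known typos, restore standard terms."""
    # Start from a known state: everything in lowercase.
    result = text.lower()

    # Correct known typos as whole words only.
    for wrong, right in TYPO_FIXES.items():
        result = re.sub(rf"\b{re.escape(wrong)}\b", right, result)

    if style == "title":
        # Capitalize the first letter of each word; the lookbehind keeps
        # the "s" in "patient's" from being treated as a new word.
        result = re.sub(r"(?<![a-z'])[a-z]",
                        lambda m: m.group(0).upper(), result)
    else:
        # Capitalize the first letter of the string and of each sentence.
        result = re.sub(r"(^|[.!?]\s+)([a-z])",
                        lambda m: m.group(1) + m.group(2).upper(), result)

    # Restore abbreviations and standard terms to their fixed capitalization.
    for key, standard in STANDARD_TERMS.items():
        result = re.sub(rf"\b{re.escape(key)}\b", standard, result,
                        flags=re.IGNORECASE)
    return result


print(fontcase("SEVERE HEADAHCE after mri"))
print(fontcase("abnormal ecg result", style="title"))
```

Restoring the standard terms last means the case conversion never has to know about them, which is one plausible reason to keep the term list outside the code, as the abstract describes.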