SESUG 2012 Conference Abstracts

Beyond the Basics

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole
Stephen Overton, Overton Technologies, LLC
Paper BB-01

Business information can be consumed in many ways using the SAS Enterprise Business Intelligence Platform.   SAS BI Dashboard can be used to report high-level key performance indicators at-a-glance through the SAS® Information Delivery Portal.  Detailed web-based reports can also be surfaced from the SAS Information Delivery Portal through SAS Web Report Studio.  This paper will present an information system that integrates the functionality of these tools to answer business questions faster and with a greater understanding of the key drivers of business-critical data.  This paper will also present the data infrastructure needed to support this type of information system through the use of OLAP technology, effective data architecture, report linking, and information maps.

SAS Stored Processes: The Swiss Army Knife of the SAS BI Toolset
Patricia Aanderud, And Data, Inc
Angela Hall, The SAS Institute

Paper BB-02

One of the major benefits of using SAS Stored Processes is extensibility.  SAS Stored Processes are among the most customizable SAS products; there are several advantages, such as the ability to set up reports that can run in various locations, enhance out-of-the-box functionality with custom widgets, and leverage all of the stored process server options.  In this discussion, you will learn tips and tricks for using stored processes within SAS BI clients.

Leveraging SQL Return Codes When Querying Relational Databases
John Bentley, Wells Fargo Bank
Paper BB-03

When querying a relational database, each PROC SQL query automatically produces four macro variables that contain return codes and messages, and the PROC SQL itself produces two more.  This presentation will explain how and when these macro variables are generated, what they represent, and how to use them to make your programs more dynamic, robust, and error-proof.  For real-world examples we will look at using SQL return codes to (1) cleanly halt execution when a query fails and send an email with the error message, and (2) get the number of records written to a table without running SELECT COUNT(*).  The presentation will be useful to all levels of SAS users not familiar with SQL return codes.
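The return-code checks the paper describes can be sketched as follows (the library, table, and variable names are hypothetical; this is not the author's code):

```sas
proc sql noprint;
   create table work.new_accts as
   select * from dblib.accounts          /* hypothetical DBMS library */
   where open_date >= '01JAN2012'd;
quit;

/* SQLOBS holds the number of rows the last statement processed,
   so no separate SELECT COUNT(*) is needed. */
%put NOTE: &sqlobs rows written (return code &sqlrc).;

/* Halt cleanly when the query fails (SQLRC: 0=OK, 4=warning, 8=error).
   For explicit pass-through queries, &sqlxmsg carries the DBMS message. */
%macro halt_on_error;
   %if &sqlrc > 4 %then %do;
      %put ERROR: Query failed with return code &sqlrc..;
      %abort cancel;     /* an email step could be added here */
   %end;
%mend halt_on_error;
%halt_on_error
```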

A SAS Programming Framework for Data Extraction Using Perl Regular Expression: The First Wave
Jiangtang Hu, d-Wise Technologies
Paper BB-04

Perl regular expressions (PRX) are often easier to write than to read.  This paper introduces PRX in small pieces, encouraging SAS users new to the field to fill in programming building blocks step by step rather than confront the daunting regular expression (regex) syntax all at once.

Basically, it presents a simple yet robust programming framework that SAS users can follow to retrieve data from multiple sources, structured or unstructured: 1) sketch the pattern using meta-characters, 2) validate the expression, 3) locate the pattern, and 4) extract the needed data with an explicit OUTPUT statement.  This framework lets programmers concentrate on the pattern to be matched, drawing on their own domain knowledge (or a Google search, of course), while leaving the rest to nearly identical programming blocks.  It is also easily extended: to extract additional data with different patterns, simply repeat the programming blocks with different meta-characters in step 1); even a far more complicated pattern requires no change to the building blocks, only a correspondingly more elaborate set of meta-characters.

Three real-life examples are demonstrated using the same programming logic, following a gradual design process: reading titles and footnotes from Statistical Analysis Plan (SAP) shells, extracting data from an XML file, and retrieving information from websites.  Some useful supporting tools and websites are also introduced.
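The four-step framework can be sketched in one DATA step (the phone-number pattern and the data set and variable names are illustrative):

```sas
data extracted;
   if _n_ = 1 then do;
      retain re;
      /* step 1: sketch the pattern with meta-characters */
      re = prxparse('/(\d{3})-(\d{4})/');
      /* step 2: validate that the expression compiled   */
      if missing(re) then put 'ERROR: invalid regular expression';
   end;
   set raw_lines;                       /* hypothetical input data set */
   if prxmatch(re, line) then do;       /* step 3: locate the pattern  */
      exchange = prxposn(re, 1, line);  /* step 4: extract via capture */
      number   = prxposn(re, 2, line);  /* buffers...                  */
      output;                           /* ...with an explicit OUTPUT  */
   end;
run;
```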

Using Dictionary Tables to Profile SAS Datasets
Phillip Julian, Bank of America
Paper BB-06

Data profiling is an essential task for data management, data warehousing, and exploring SAS® datasets.  TDWI (The Data Warehousing Institute) extends the usual definition of data profiling to include data exploration.  This paper presents two SAS programs, Data_Explorer and Data_Profiler, that implement the TDWI definition.

These SAS programs are low-cost, free solutions for data exploration and data profiling.  Data_Explorer searches for all SAS datasets, and gathers essential dataset and file attributes into a single report.  Data_Profiler summarizes the values of any SAS dataset in a generic manner, which eliminates the need for custom SQL queries and custom programs to summarize what a dataset contains.

These programs have been used in banking and state government.  They should also be useful in the pharmaceutical industry for validating SAS datasets and managing data repositories.
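The DICTIONARY tables that power such programs can be queried directly in PROC SQL; a minimal sketch (the paper's Data_Explorer and Data_Profiler do considerably more):

```sas
proc sql;
   /* data-set-level attributes */
   create table work.ds_attrs as
   select libname, memname, nobs, nvar, crdate
   from dictionary.tables
   where libname = 'WORK' and memtype = 'DATA';

   /* column-level attributes */
   create table work.col_attrs as
   select libname, memname, name, type, length, format, label
   from dictionary.columns
   where libname = 'WORK';
quit;
```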

Through the DoW-Loop, One Statement at a Time
Paul Dorfman, Dorfman Consulting
Paper BB-07

The DOW-loop is a nested, repetitive DATA step structure enabling you to isolate instructions related to a certain break event before, after, and during a DO loop cycle in a naturally logical manner.  Readily recognizable in its most ubiquitous form by the DO UNTIL(LAST.ID) construct, which lends itself to control-break processing of BY-group data, the DOW-loop's nature is more morphologically diverse and generic.  In some industries, such as Pharma, where flagging BY-group observations based on in-group conditions is standard fare, the DOW-loop is an ideal vehicle, greatly simplifying the alignment of business logic and SAS code.  In this presentation, the DOW-loop's logic is examined via the power of example to reveal its aesthetic beauty and pragmatic utility, using the SAS DATA step debugger facility to explore its workings one instruction at a time.
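In its most ubiquitous form, the DOW-loop looks like this (the data set and variable names are illustrative):

```sas
data totals;
   do until (last.id);        /* the DOW signature: SET inside the loop */
      set trans;
      by id;
      total = sum(total, amount);
   end;
   output;                    /* executes once per ID, after the group
                                 ends; TOTAL resets for each new group */
run;
```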

Make Macros Safe for Others to Use: Eliminate Unexpected Side Effects
David Abbott, Durham Veteran Affairs Medical Center
Rebecca McNeil, Durham Veteran Affairs Medical Center

Paper BB-08

The SAS® macro language can, in theory, be used to produce components of SAS software that can be safely used by a broad audience.   However, in practice, SAS macros can be problematic for users because they may have unintended and unadvertised side effects.  These side effects, e.g., resetting the value of a macro variable already in use, are not just minor nuisances; they can cause the invoking SAS software to fail in ways that are hard to debug, and worse, introduce incorrect behavior that may go undetected until the validity of the results is challenged.  Although the potential for unexpected side effects is usually not realized (e.g., the macro variable written to is usually not previously in use), the potential is neither rare nor isolated.  We examined mid-length macros publicized to the SAS community and found the potential for one or more unexpected side effects pervasive.

For macros to achieve their potential as reusable software components, it is necessary to construct them according to practices that reliably prevent unintended side effects.  Aside from the %LOCAL statement, SAS Base® provides only indirect and obscure support for eliminating unintended side effects in macros.  The most difficult problems arise when macros inadvertently redefine symbols already defined in the invoking environment.  Inventing names that are thought to be unique does reduce the likelihood of unexpected overwrite but it does not reliably prevent such occurrences.  This paper shows how to consistently prevent this dangerous overwriting in the most important classes of symbols used by macros: global macro variable names, macro names, dataset names, variable names, and format names.
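The %LOCAL discipline the paper advocates can be illustrated with a small sketch (the macro is hypothetical):

```sas
%macro list_vars(ds);
   %local dsid nvars i rc;            /* confine working symbols here */
   %let dsid  = %sysfunc(open(&ds));
   %let nvars = %sysfunc(attrn(&dsid, nvars));
   %do i = 1 %to &nvars;
      %put %sysfunc(varname(&dsid, &i));
   %end;
   %let rc = %sysfunc(close(&dsid));
%mend list_vars;

%let i = 99;                  /* a caller symbol that must survive */
%list_vars(sashelp.class)
%put i is still &i;           /* 99 -- without %LOCAL, the macro's
                                 %DO loop would have overwritten it */
```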

Automating SAS/Graph Axis Ranges: Using a macro to produce easily read major tick mark increments based on the data to be graphed
Rick Edwards, PPD, Inc.
Paper BB-09

This paper describes a macro that determines the minimum, maximum, and increment for the axis order definition based on that axis' data, in a manner that provides 8 to 12 increments in multiples of 1, 2, 2.5, or 5 for easy extrapolation.  Flexibility is provided by parameters to ensure a default minimum, maximum, or both are contained in the range and to ensure that zero is the minimum or maximum if desired.
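The increment selection might be sketched like this (the data range and the target of roughly ten increments are illustrative; the paper's macro adds the parameter handling described above):

```sas
data _null_;
   min = 3;  max = 187;                  /* hypothetical data range */
   range = max - min;
   raw   = range / 10;                   /* aim for about 10 ticks  */
   mag   = 10 ** floor(log10(raw));      /* order of magnitude      */
   array nice[5] _temporary_ (1 2 2.5 5 10);
   do i = 1 to 5;                        /* smallest "nice" step    */
      inc = nice[i] * mag;               /* giving <= 12 increments */
      if range / inc <= 12 then leave;
   end;
   lo = floor(min / inc) * inc;          /* snap the endpoints      */
   hi = ceil (max / inc) * inc;
   put inc= lo= hi=;                     /* inc=20 lo=0 hi=200      */
run;
```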

ODS Output Datasets that Work for You
Stuart Long, Westat
Dan Scharf, Westat
Ed Heaton, Data and Analytic Solutions, Inc.

Paper BB-11

Although ODS Tables, combined with the ODS TRACE and ODS Output statements, provide a consistent method for retrieving statistics generated by SAS® Procedures, the design of these tables is inconsistent.

Many of the ODS Output data sets created with ODS Tables revolve around statistics describing one specific variable or groups of variables.  We commonly use only a subset of the data found in a SAS Listing when reporting or continuing an analysis.  Each of these data sets could be streamlined to contain these values plus additional descriptive information concerning these variables of interest -- such as the variable name, value, label, and formatted value.

This paper proposes a consistent data structure for these data sets that better facilitates reporting and analysis.  It demonstrates a method to translate the data in the ODS Tables into this format.

After showing how to design data sets with this common convention, this paper demonstrates how to (1) automate their creation and (2) combine the summarized data from multiple ODS Tables within a procedure or between different procedures.
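The two statements mentioned above work together as follows:

```sas
/* Step 1: discover the ODS table name behind the printed output. */
ods trace on;
proc means data=sashelp.class;
   var height weight;
run;
ods trace off;        /* the log reports the table name: Summary */

/* Step 2: capture that table as a data set. */
ods output Summary=work.means_summary;
proc means data=sashelp.class;
   var height weight;
run;
```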

Proc Summary Options Beyond The Basics
Susmita Pattnaik, PPD, Inc.
Paper BB-12

We all know about PROC SUMMARY, the Base SAS procedure for summarizing data.  This paper will illustrate some of the options that most programmers do not use, options that add value and provide the functionality of a ‘cube’, i.e., multi-dimensional summarization.
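The ‘cube’ behavior comes from the CLASS statement: by default, PROC SUMMARY produces every combination of the CLASS variables, distinguished by the automatic _TYPE_ variable.  A minimal sketch:

```sas
proc summary data=sashelp.cars;
   class origin type;
   var msrp;
   output out=work.cube sum=total_msrp;
run;

/* _TYPE_ flags each CLASS combination:
   0 = grand total, 1 = TYPE only, 2 = ORIGIN only, 3 = ORIGIN*TYPE.
   Adding the NWAY option would keep only the highest _TYPE_. */
```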

Inventory Your Files Using SAS
Brian Varney, Experis
Paper BB-13

Whether you are attempting to figure out what you have when preparing for a migration or you just want to find out which files or directories are taking up all of your space, SAS is a great tool to inventory and report on the files on your desktop or server.  This paper presents SAS code that produces such inventories and reports.
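One way to build such an inventory is with the SAS directory and file functions; a minimal single-directory sketch (the path is hypothetical, and the FINFO item name shown is the Windows one):

```sas
data work.inventory;
   length fname $256 fsize $32;
   keep fname fsize;
   rc  = filename('dir', 'C:\projects');       /* hypothetical folder */
   did = dopen('dir');
   do i = 1 to dnum(did);
      fname = dread(did, i);
      fid   = mopen(did, fname);               /* open member to read */
      if fid > 0 then do;                      /* its attributes      */
         fsize = finfo(fid, 'File Size (Bytes)');
         rc    = fclose(fid);
      end;
      output;
   end;
   rc = dclose(did);
run;
```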

“There’s an App for That”: It’s Called SAS® ODS! Mobile Data Entry and Reporting via SAS ODS
Michael Drutar, The SAS Institute
Paper BB-14

Most mobile reporting allows users only to receive reports; however, it is often necessary to INPUT data on a mobile device.  This can be accomplished by leveraging Google spreadsheets and SAS ODS.  This paper’s example explains how a Cross Country Coach records a runner’s race times (in real time) on a Google spreadsheet using his mobile device.  His office computer has a SAS® job which, via the filename statement, reads the updated Google spreadsheet into SAS.  SAS processes this data and sends an email containing an embedded HTML dashboard of Runner KPI dials, charts and/or tables.  The Coach receives this report almost instantaneously.  Because it is an embedded email, this solution works on nearly any mobile device (for example, iPhone or Android).

The Elephant in the Room: Running Hadoop on SAS & Greenplum
Christopher Stevens, Greenplum
Paper BB-15

Recent computing and business trends have triggered an explosion in the amount of unstructured data companies generate each day.  That's why Big Data analytics can give you a competitive advantage.  By extracting the knowledge wrapped within unstructured and machine-generated data, your enterprise can make better decisions that drive revenue and reduce costs.

Hadoop has rapidly emerged as the preferred solution for Big Data analytics across unstructured data.  But the fast-changing Hadoop ecosystem can present challenges to any company that wants to standardize on core functionality and build repeatable processes.  It is often up to you to determine which packages to use, provide version control, test for compatibility, and deploy functionality into production.  This can lead to many time-consuming failed attempts and, ultimately, a dead end.

During this session you'll learn about Greenplum Hadoop solutions, leveraging a product called Greenplum HD, and how it can enable you to take advantage of Big Data analytics without the overhead and complexity of a project built from scratch.  You'll also learn about how Greenplum HD and SAS software can work together to store and retrieve data as if they were native SAS datasets to create a unified analytical platform.

Coder’s Tips and Tricks

SAS® Macros and the SAS® DATASETS Procedure - An Automated Approach to Dataset Management and Manipulation
Christopher Alexander, RTI International
Paper CT-01

The combination of SAS Macro functionality and the SAS DATASETS procedure provides a simple, straightforward approach to automating dataset manipulation and management.  For users who wish to perform common tasks across groups of datasets, the DATASETS procedure is the tool to programmatically identify which dataset(s) to work with.  When it is combined with the Macro facility, users can write compact, maintenance-friendly (or even maintenance-free) code to manipulate the dataset(s) targeted by the DATASETS procedure in any number of ways.  This paper assumes a basic knowledge of the DATA step and SAS Macro language.

An email macro: Exploring metadata in EG and user credentials in Linux to automate email notifications.
Jason Baucom, Ateb, Inc.
Paper CT-02

Enterprise Guide (EG) provides useful metadata variables to identify user credentials.  Metadata can be exploited to send program notifications, but a more generic macro is required for usage in batch mode where credentials may not be required.  A flexible macro is presented that can extract email addresses by exploring metadata in an EG session or by identifying email addresses in user credentials on a Linux system.

PRXCHANGE: Accept No Substitutions
Kenneth Borowiak, PPD, Inc.
Paper CT-03

SAS provides a variety of functions for removing and replacing text, such as COMPRESS, TRANSLATE & TRANWRD.  However, when the replacement is conditional upon the text around the string the logic can become long and difficult to follow.  The PRXCHANGE function is ideal for complicated text replacements, as it leverages the power of regular expressions.  The PRXCHANGE function not only encapsulates the functionality of traditional character string functions, but exceeds them because of the tremendous flexibility afforded by concepts such as predefined and user-defined character classes, capture buffers, and positive and negative look-arounds.
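A small illustration of a context-sensitive replacement that would be awkward with TRANWRD:

```sas
data _null_;
   s = 'The cat sat on the catalog.';
   /* replace "cat" only when it is a whole word (\b = word boundary) */
   s2 = prxchange('s/\bcat\b/dog/', -1, s);
   put s2=;        /* s2=The dog sat on the catalog. */
run;
```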

Labels: What They Are and How To Use Them
David Chapman, Chapman Analytics, LLC
Paper CT-04

Labels for datasets, variables, and values of variables are useful and valuable to a SAS programmer.  This paper discusses how to assign labels both in code and interactively with the SAS Explorer.  In addition to showing how to assign a label to a dataset and to each variable in the dataset, the paper discusses why it is useful to do so.  The paper also discusses how to assign labels to the values of variables in the dataset.  Labels make management of the dataset easier for the programmer and use of the dataset easier for the user.

A Closer Look at PROC SQL's FEEDBACK Option
Ken Borowiak, PPD, Inc.
Paper CT-05

The FEEDBACK option on the PROC SQL statement controls whether an expanded or transformed version of a query using terse notations is written to the SAS log.  This paper will review some of the documented features of this option and provide additional programming conventions that are explicitly stated when the option is enabled.  It will be shown that the FEEDBACK option is an invaluable tool for understanding how PROC SQL processes a query and how it can be used as a code generator.
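A quick illustration (the log wording is approximate):

```sas
proc sql feedback;
   select *
   from sashelp.class
   where age between 13 and 15;
quit;

/* The log shows the expanded form of the terse query, roughly:
   NOTE: Statement transforms to:
         select CLASS.Name, CLASS.Sex, CLASS.Age,
                CLASS.Height, CLASS.Weight
           from SASHELP.CLASS CLASS
          where CLASS.Age between 13 and 15;            */
```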

A three-piece suite to address the worth and girth of expanding a data set
Phil d'Almada, Duke Clinical Research Institute
Brian Tinga, Duke Clinical Research Institute
Daniel Wojdyla, Duke Clinical Research Institute

Paper CT-07

For a medical therapy analysis involving more than one medication of interest within a defined analysis time period, data sets containing records of time intervals of medication use were to be expanded into data sets consisting of daily-use records.  For each medication, the data expansion required some means of resolving overlapping intervals across multiple time-interval records.  This paper illustrates a suite of three approaches, using the SAS® System, that were independently developed to accomplish the same objective: array processing, the SQL procedure, and DO-loop processing.  In each method, either vertical or horizontal data processing formed the core of the data set expansion.

Fitting Bayesian hierarchical multinomial logit models in PROC MCMC
Jacob Fisher, Duke University
Paper CT-09

The paper illustrates how to use the MCMC procedure to fit a hierarchical, multinomial logit model for a nominal response variable with correlated responses in a Bayesian framework.  In particular, the paper illustrates how to perform three important parts of Bayesian model fitting.  First, to make sure appropriate prior distributions are selected, the paper shows how to simulate draws directly from the prior distribution.  Second, since the reference category and random effects may require special attention, the paper shows how to code the sampling model into PROC MCMC using the RANDOM statement, new to SAS® 9.3.  Finally, the paper demonstrates how to run two chains simultaneously on a multi-core processor, and how to use those two chains to check convergence of the MCMC chain using the Gelman-Rubin diagnostic test.  By following these steps, many common pitfalls associated with fitting complicated models in PROC MCMC may be avoided.  The target audience for this paper is people with some knowledge of Bayesian methods and a moderate level of SAS experience, but who may not be familiar with PROC MCMC or multinomial logit models.

Fatal Witlessness: Appending Datasets! WARNING! This may cause truncation of data!
Arunim Gupta, Genpact, India
Paper CT-10

Datasets appended with the SET statement in a DATA step or with the APPEND procedure may or may not have a common source.  When two data sets are appended, the programmer often assumes that the character variables in both datasets have the same lengths.  If that is not true, SAS® commonly puts this message in the log: “WARNING: Multiple lengths were specified for the variable NAME by input data set(s).  This may cause truncation of data.”, where NAME is a character variable with a different length in each of the datasets being appended.  Programmers who ignore this warning unknowingly leave themselves with truncated variables, which will later produce erroneous, and possibly fatal, results.
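The standard defense is to declare the longest length before the SET statement reads either data set; a sketch with hypothetical names:

```sas
data combined;
   length name $40;           /* the larger of the two source lengths  */
   set visits_a visits_b;     /* hypothetical data sets being appended */
run;                          /* no truncation, and no WARNING         */
```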

An Introduction to Criteria-based Deduplication of Records
Elizabeth Heath, RTI International
Priya Suresh, RTI International

Paper CT-11

When survey respondents are allowed to select whether they will complete a paper or electronic version of a survey, a few respondents will inadvertently submit two versions of the survey.  Because the survey needs only one data submission from each respondent, multiple submissions per respondent are first identified, reviewed for completeness and other criteria provided by the survey, and then the criteria are applied for keeping only one record per respondent.  We will show how SAS® can be used to apply selection criteria to identify and remove duplicate records for a respondent.
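The keep-one-record step often reduces to a sort by the survey's criteria followed by FIRST. processing (the variable names are hypothetical):

```sas
proc sort data=submissions;
   by respondent_id keep_rank;   /* e.g., 1 = most complete version */
run;

data deduped;
   set submissions;
   by respondent_id;
   if first.respondent_id;       /* keep one record per respondent  */
run;
```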

Hidden Biases Using SAS Dates
Steve James, CDC
Paper CT-12

Long-time SAS users are well aware that SAS stores dates as the number of days since January 1, 1960.  What they may not realize, however, is that there are some hidden biases that can be introduced as a result of using SAS dates.  When measuring the interval between two dates in months or years, how you do so can introduce a bias that would be difficult to identify.  This paper discusses three biases that can be introduced when using SAS dates.
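One such bias comes from INTCK's default counting of interval boundaries rather than elapsed time:

```sas
data _null_;
   d1 = '31JAN2012'd;
   d2 = '01FEB2012'd;
   m_default = intck('month', d1, d2);        /* a month boundary was
                                                 crossed, so 1        */
   m_cont    = intck('month', d1, d2, 'c');   /* continuous method:
                                                 one day < a month    */
   put m_default= m_cont=;                    /* m_default=1 m_cont=0 */
run;
```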

Programmatic Automation of Categorizing and Listing Specific Clinical Terms
Ravi Kankipati, Pinnacle Technical Resources
Abhilash Chimbirithy, Accenture LLP

Paper CT-13

Typical clinical trial counts tables sometimes require listing specific terms and their respective groupings in the table footnotes.   Manually hard coding this list of terms into a SAS® program is time-consuming and prone to typographical error.  Thus, it is important to automate this process.  The macro presented in this paper uses an Adverse Events (AE) data set to demonstrate an automated process that will increase efficiency and accuracy.  This paper will also provide a brief introduction to the use of SAS automatic macro variable SQLOBS.  The example presented may be expanded to include other types of clinical data such as Concomitant Medication (CM) or Medical History (MH).

A PROC MEANS Primer
David Kerman, Bank of America
Paper CT-14

A PROC MEANS Primer gives an introduction to PROC MEANS (included in Base SAS), describing the syntax and key options, and providing examples of how and when to use the procedure.  Special focus is given to two important options, NWAY and COMPLETETYPES, which are very powerful but can lead to confusion and errors if not used properly.  The paper concludes with an example of these types of errors and provides recommendations on how to avoid them when using PROC MEANS.

A Winning Combination: Generation of Listing and Descriptive Statistics Table in One Report
Chenille Lloyd, PharmaNet/i3
Paper CT-15

Presenting clinical trials data clearly, but from more than one perspective can be very helpful to understanding the safety and efficacy of a new drug.  This paper shows how to generate a report combining an individual subject listing with a descriptive statistics table, using only SAS®/MACRO language and other Base SAS features.  Such a presentation is particularly useful for pharmacokinetic analysis studies, which require easy identification of outliers and trends in drug concentration data.  The macro features options that allow the programmer to specify subject exclusions and the number of decimal places used to report results.

Manage Hierarchical or Associated Data With The RETAIN Statement
Alan Mann, Blue Ridge Analytics, Inc.
Paper CT-16

For most of the history of computing machinery, hierarchical data has existed, and will undoubtedly persist through the next several decades.  Lining up parent and child relationships by several key fields can be a challenge, and in most cases, could be served by joins or merges.  This short presentation will show SAS programmers and analysts of all levels a trick to line up parent and child data using an associated key and ultimately arranging them via a surrogate key using the RETAIN statement in a data step.  What perplexes many programmers is where to place the RETAIN and the logic to make use of it.  This short paper clears up this issue, demonstrating a real-world application the audience can take away and start using immediately.
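The placement the paper discusses can be sketched as follows (the record-type flag and variable names are hypothetical):

```sas
data children;
   set family_file;
   retain parent_id;                        /* carry value across rows */
   if rectype = 'P' then parent_id = id;    /* capture from parent row */
   else if rectype = 'C' then output;       /* child leaves with key   */
run;
```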

What's in a FILENAME?
Heidi Markovitz, Federal Reserve Board of Governors
Paper CT-17

The FILENAME statement is an old standard that assigns a nickname or file handle to a non-SAS data location.  It is most commonly used to point to a single flat file.  However, for many years it has been able to act on an explicit list of files or a location that contains an unknown batch of files.  This presentation will show how to use these capabilities, along with the INFILE statement in a single DATA step, to combine a list of files, explore all the files in a directory (folder), or build a list of files of a certain type.
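A sketch of the wildcard form (the path and record layout are hypothetical):

```sas
filename raw 'C:\incoming\*.csv';       /* every CSV in the folder */

data all_files;
   length source $256;
   infile raw dlm=',' dsd truncover filename=fn;
   input id $ amount;
   source = fn;         /* which physical file this record came from */
run;
```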

Let's Play a Game: A SAS Program for Creating a Word Search Matrix
Robert Matthews, University of Alabama at Birmingham
Paper CT-18

This paper describes a process for inserting a list of words into a matrix of random letters.  The output produces two tables.  The first table is a display of the matrix after the words in the list are inserted, but before random letters are inserted into the remaining cells in the matrix.  The second table is the final matrix after empty cells are filled with random letters.  The program has two levels of difficulty and highlights a number of techniques for working with two-dimensional arrays.  It is applicable to all versions and platforms of The SAS® System.

Minimum Level of Documentation for Ad Hoc Report Programming
Robert McCurdy, Southern Company
Paper CT-19

In an ad hoc reporting group, documentation unfortunately tends to slide.  When under pressure to get a new report out, it is advantageous to be able to quickly identify and locate previously written programs which have similar features.  This paper explores minimal documentation which facilitates this process.  A work sheet for each program specifying databases, tables, variables, algorithms and called routines not only helps in program development but also provides the basis for the development of an external matrix showing features versus programs.  This matrix can help quickly identify which programs have the needed features when beginning a new project.  A minimum level of documentation inside the program, including the flower box, is also explored.

Learning PROC SQL the DATA Step Way
Meghal Parikh, University of Central Florida
Elayne Reiss, University of Central Florida

Paper CT-20

As evidenced by some dissenting opinions within our own office, the use of PROC SQL for dataset manipulations may be considered dreadful by those who learned their way around Base SAS® with only the DATA step.  For many tasks, there are few better routes to successful completion than a well-designed DATA step.  However, PROC SQL can serve as an efficient replacement to many tasks.  This paper illustrates where replacing DATA step code with PROC SQL-based code might be a smart decision for any base SAS programmer.  Additionally, it draws simple analogies to DATA step syntax, reducing the possible intimidation associated with learning the complexities of PROC SQL.  With knowledge of both techniques, programmers will always be able to select the best technique for each data manipulation scenario.
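A typical analogy of the kind the paper draws (both steps produce the same table):

```sas
/* DATA step version */
data tall;
   set sashelp.class;
   where height > 60;
   keep name height;
run;

/* PROC SQL equivalent */
proc sql;
   create table tall as
   select name, height
   from sashelp.class
   where height > 60;
quit;
```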

An In-Line View to a SQL
Darryl Putnam, CACI, Inc.
Paper CT-21

PROC SQL® is a powerful yet still overlooked tool within our SAS® arsenal.  PROC SQL can create tables, sort and summarize data, and join/merge data from multiple tables and in-line views.  The SELECT statement with the CASE-WHEN clause can conditionally process the data like the IF-THEN-ELSE statement in the DATA step.  An advantage specific to PROC SQL is that with careful coding, the SQL code can be ported to 3rd party Relational Database Management Systems (RDBMS) such as Oracle® and SQL Server® with virtually no changes.  This paper will show some techniques to QA and reshape data the way you want to see it, with a focus on in-line views.

Schedule Impossible: Using ODS and PROC REPORT to Create a Schedule Visualization
Jeffrey Reiss, University of Central Florida
Paper CT-22

When preparing reports for mass user consumption, those with unique requirements beyond an organization’s expected norm may surface periodically.  While some of these reports may appear to be impossible to create at first glance, the current abilities of PROC REPORT and SAS® ODS are sufficiently developed to make these reporting anomalies possible.  In this paper, the author will discuss how a report that was once deemed “impossible” to produce in the SAS environment can indeed be made possible.

Introducing the FINDIT Macro: An Efficient Tool for Simultaneously Searching Several Free-Text Fields Using Multiple Keywords
LaTonia Richardson, CDC
Paper CT-23

Have you ever faced the daunting task of searching for information within a free-text field?  Even with a list of specific keywords, it is tedious to write multiple "where like" statements or to apply the SAS® INDEX function for each keyword.  This paper presents a new FINDIT macro that enables users to quickly find information based on many keywords in multiple free-text fields.  Using DATA step and PROC SQL commands within an iterative “do loop” procedure, the macro simultaneously searches all the free-text fields and keywords specified.  The best part is that the results are presented in a user-friendly summary report listing the specific keyword found, the variable that contained it, and the exact text found by the search.  This macro considerably reduces the need for manually reading information stored in free-text fields.  It is the next-best thing to perusing with the human eye!

A Handy New SAS® Tool for Comparing Dynamic Datasets
LaTonia Richardson, CDC
Paper CT-25

A new data comparison tool provides an efficient way for users to compare two dynamic datasets created at different times.  The tool can compare datasets of various formats, including SAS, Access, Excel, and XML.  Using a Microsoft Access “front-end” user interface to execute SAS Data Step and Proc SQL code, users are prompted to import two files of the same file format (an “old file” and a “new file”) and then thorough comparisons are made to identify differences in variables, records (i.e., records found in only one dataset), and data values.  Once all comparisons are made, users can view user-friendly summary reports of all differences identified by the tool.  The tool is most useful for tracking changes to dynamic datasets and for detecting, investigating, and resolving data discrepancies.  It provides a convenient, efficient method for identifying key differences between two datasets.  This paper reviews the features of this new tool, common uses for it, and the programming techniques used to create it.

Securing Your SAS Systems - A Simple Step to Identify Users
Leanne Tang, NASS/USDA
Paper CT-26

Securing our applications and data has become an integral part of our SAS systems.  Identifying users and what they are entitled to is the first step in this effort.  In our organization, Active Directory Domain Services Server maintains the user information.  DSQuery is a command-line utility from Microsoft which is part of Windows Server 2003.  DSQuery can be executed from a SAS program to query Active Directory for user information.  This information can then be used in SAS to authenticate users.  This paper will discuss some of the common features in DSQuery utility and how to use query result in SAS to validate users.

Display, Group or Order: Using Proc Report to Create Clinical Trials Outputs
Sally Walczak, Quintiles
Paper CT-27

PROC REPORT is a Base SAS® procedure that simplifies the creation of custom reports.  Sometimes it can add a level of frustration to creating the tables and listings that you desire.  This paper will explore the different variable definitions: DISPLAY, GROUP and ORDER and associated options with each.  This will allow you to confidently use the correct DEFINE statement the first time to get the output you really want.
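The three usages differ as follows (sketched against SASHELP.CLASS):

```sas
/* ORDER keeps detail rows, sorted, with repeated values suppressed. */
proc report data=sashelp.class nowd;
   column sex name age;
   define sex  / order   'Sex';
   define name / display 'Name';    /* DISPLAY prints each row as-is */
   define age  / display 'Age';
run;

/* GROUP collapses detail rows: one row per Sex. */
proc report data=sashelp.class nowd;
   column sex age;
   define sex / group 'Sex';
   define age / mean  'Mean Age';   /* numeric column as a statistic */
run;
```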

Encoding the Password - A low maintenance way to secure your data access
Leanne Tang, NASS/USDA
Paper CT-28

When a user accesses data in remote databases through a SAS application, a valid user account and password for the database is required.  One method for granting access is to create an account for each user in each database and pass the user ID and password through the SAS program when accessing the data.  Doing so requires maintaining an account for each user and risks exposing user passwords in the SAS programs.  This paper presents a simple solution that eliminates these security concerns.  By using an “Application Account” we can get away from individual user accounts, and by incorporating PROC PWENCODE we can keep the passwords from being revealed as clear text.
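A sketch of the two steps (the account, path, and encoded string are all illustrative, not real values):

```sas
/* Step 1: run once; the encoded string appears in the log. */
proc pwencode in='MyS3cret' method=sas002;
run;

/* Step 2: store the encoded string and use it wherever a password is
   expected; SAS decodes it only at connect time. */
libname appdb oracle
        user=app_account
        password="{SAS002}1A2B3C4D5E6F"    /* illustrative encoding */
        path=proddb;
```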

We Can Import It For You Wholesale: How to Use SAS Macros to Import Hundreds of Excel Files
Matthew Gyory, DevTech Systems, Inc
Paper CT-29

Importing a large number of Excel worksheets into SAS can be a time-consuming and frustrating process.  Repeating LIBNAME or PROC IMPORT statements for each Excel file can quickly become overwhelming.  However, with SAS macro %DO loops, PROC SQL, and the metadata SAS can extract from files, any SAS user can quickly and easily import dozens or hundreds of Excel files with some or all of the associated sheets.  This paper shows how.
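One possible shape of such a macro, assuming a Windows host (the directory, file-listing command, and output dataset names are hypothetical, not the author's code):

```sas
%macro import_all(dir=);
  %local i nfiles;

  /* List the .xlsx files in the directory via a pipe */
  filename flist pipe "dir /b ""&dir\*.xlsx""";
  data xlfiles;
    infile flist truncover;
    input fname $256.;
  run;

  /* Load the file names into macro variables */
  proc sql noprint;
    select count(*) into :nfiles trimmed from xlfiles;
    select fname into :f1 - :f&nfiles from xlfiles;
  quit;

  /* Import each workbook in a %DO loop */
  %do i = 1 %to &nfiles;
    proc import datafile="&dir\&&f&i" out=xl&i
         dbms=xlsx replace;
    run;
  %end;
%mend import_all;

%import_all(dir=C:\data\surveys)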

Beyond “If then” - Three Techniques for Cleaning Character Variables from Write-in Questions
Yusheng Zhai, American Cancer Society
Ryan Diver, American Cancer Society
Xia Lin, American Cancer Society

Paper CT-30

In survey studies, cleaning answers to write-in questions can be difficult and time consuming, especially when the same response may be written in multiple ways.  Misunderstanding of the survey question, unrecognizable handwriting, and negligence in data entry are major factors leading to data inaccuracies that are almost impossible to avoid.  Writing a series of “if then” statements is a classic solution for cleaning data.  However, as datasets grow, the number of conditions to be tested grows too, until thousands of “if then” statements may be required.  This paper presents three techniques that we used to clean the country-of-birth question in the Cancer Prevention Study-3 (CPS-3).  Combining data merging with Excel spreadsheets, using the LIKE and SOUNDS LIKE operators, and implementing join tables with compare functions in the SQL procedure not only reduced the workload and eased the stress of cleaning character variables but also added some flavor to this tedious task.
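As a rough illustration of the second technique (table and column names here are invented), the SOUNDS LIKE operator `=*` catches spelling variants that an exact comparison misses:

```sas
proc sql;
  create table matched as
  select r.id, r.country_raw, s.country_std
  from responses as r
       left join standards as s
       on upcase(r.country_raw) = upcase(s.country_std)   /* exact match */
          or r.country_raw =* s.country_std;              /* SOUNDS LIKE match */
quit;
```

An entry such as "Columbia" would thus still pair with the standard value "Colombia" without a dedicated IF-THEN rule.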

Discover JMP

Run JMP as a virtual application – Changing How the Game Is Played
Hui Di, The SAS Institute
Paper DJ-01

The reign of the personal computer as the sole corporate access device is coming to a close, and by 2014, the personal cloud will replace the personal computer at the center of users’ digital lives, according to Gartner, Inc.

Desktop application virtualization is a top megatrend in this new era.  Virtualization has, to some extent, freed applications from the peculiarities of individual devices, operating systems, and even processor architectures.  The specifics of devices will become less important; users will use a collection of devices, with the PC remaining one of many options.  Access to the cloud, and to the content stored or shared there, will be managed and secured.  A virtual application has no conflict with any application on your desktop, independent of language, version, or environment.  Deployment is centralized; sometimes deploying to each desktop individually is almost impossible (Wal-Mart, for example).

Can JMP run a desktop virtualization application?  The answer is YES.

This paper shows you how to run virtualized JMP in a Citrix XenApp environment and the benefits of doing so.

It demonstrates that in a Citrix environment, JMP is accessible from different operating systems (Windows, Mac, and Linux).  Virtualized JMP can run different language editions of the same version concurrently on a desktop.  JMP in this environment can print to a local printer, copy and paste to third-party applications, and access the individual's local file system.

Last but not least, this paper shows you how to run JMP on a mobile device, such as an iPhone.

SPC Data Visualization of Seasonal and Financial Data Using JMP®
Annie Dudley Zangi, The SAS Institute
Diane K. Michelson, The SAS Institute

Paper DJ-02

JMP® software offers many types of Statistical Process Control (SPC) charts, including Shewhart, CUSUM, and moving average charts.  SPC chart features in JMP can be accessed through the menus and dialogs, through scripting, and now, in version 10.0, through a drag-and-drop interface.  The periodic nature of some financial data makes it unsuitable in its original form for detecting anomalies with an SPC chart.  One viable method for presenting such data on an SPC chart is to apply time series techniques first and then chart the output.  This paper investigates two case studies applying these techniques.

An equipment company that had successfully used SPC charts in its manufacturing department was interested in monitoring monthly revenue.  Its revenue figures were cyclic in nature, dropping off at the beginning of each year and peaking in December.  While the trend was plainly visible, months with unusual patterns were masked because nearly every month triggered alarms; traditional control charting methods failed to work.

In another company, control charts were being used on water quality data in a water purification scheme.  City water is pumped through a series of filters.  At the end of the filtering process, the quality of the water is measured twice a second.  The frequency of measurement is important to catching dirty water before it arrives at processing equipment, but the resulting positive serial correlation leads to an increase in the false alarm rate of the control chart.

SPC charts work very well under the ideal conditions of data independence and normality.  SPC has traditionally been used in manufacturing, where these conditions are often satisfied.  However, SPC is beginning to be used more outside of manufacturing, in areas like insurance claims processing, banking, health care, and survey research.  In many of these environments, the desired mean may be shifting up or down, or the responses may be cyclic in nature.  In this paper, we examine some of the problems with plotting time series data on control charts and suggest remedies.

Getting to the Good Part of Data Analysis: Data Access, Manipulation, and Customization Using JMP®
Audrey Ventura, The SAS Institute
Paper DJ-03

Effective data analysis requires easy access to your data no matter what format it comes in.  JMP can handle a wide variety of formats.  Once the data is in JMP, you can choose from a variety of options to reshape the data with just a few clicks.  Finally, customize your data with labels, colors, and data roles so that graphs and charts automatically look the way you want them to.  This paper walks through two or three story lines that demonstrate how JMP can easily import, reshape, and customize data (even large datasets) in ways that allow your data to be displayed in vibrant visualizations that will wow your audience.

Hands-On Workshops

Getting Up to Speed with PROC REPORT
Kimberly LeBouton, KJL Computing
Paper HW-01

Learning the basics of PROC REPORT can help the new SAS® user avoid hours of headaches.  PROC REPORT can often be used in lieu of PROC TABULATE or DATA _NULL_ reporting - two areas that have driven the new SAS user crazy!  With the added capabilities of ODS, PROC REPORT can look as sharp as an Excel report.  This paper will show how to use PROC REPORT in both windowing and non-windowing environments using SAS Version 9.

Quick Results with ODS Graphics Designer
Sanjay Matange, The SAS Institute
Paper HW-02

You just got the study results and want to get some quick graphical views of the data before you begin the analysis.  Do you need a crash course in the SG procedures (also known as ODS Graphics Procedures) just to get a simple histogram?  What to do?

The ODS Graphics Designer is the answer.  With this application, you can create many graphs including histograms, box plots, scatter plot matrices, classification panels, and more using an interactive "drag-and-drop" process.  You can render your graph in batch with new data and output the results to any open destination.  You can view the generated GTL code as a leg up to GTL programming.  You can do all this without cracking the book or breaking a sweat.

This hands-on-workshop takes you step-by-step through the application's features.

The Armchair Quarterback: Writing SAS® Code for the Perfect Pivot (Table, That Is)
Peter Eberhardt, Fernwood Consulting Group Inc.
Paper HW-03

Can I have that in Excel?  This is a request that makes many of us shudder.  Now your boss has discovered Excel pivot tables.  Unfortunately, he has not discovered how to make them.  So you get to extract the data, massage the data, put the data into Excel, and then spend hours rebuilding pivot tables every time the corporate data are refreshed.  In this workshop, you learn to be the armchair quarterback and build pivot tables without leaving the comfort of your SAS® environment.  In this workshop, you learn the basics of Excel pivot tables and, through a series of exercises, you learn how to augment basic pivot tables first in Excel, and then using SAS.  No prior knowledge of Excel pivot tables is required.

FREQ Out – Exploring Your Data the Old School Way
Stephanie Thompson, Datamum
Paper HW-04

This tried-and-true procedure just doesn’t get the attention it deserves.  But, as they say, it is an oldie but a goodie.  Sometimes you just need a quick look at your data and a few simple statistics.  PROC FREQ is a great way to get an overview of your data with a limited amount of code.  We will explore everything from the basic framework of the procedure to how to customize the output.  There will also be an overview of the statistical options that are available.
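For example, a single PROC FREQ step gives both one-way and two-way overviews (this sketch uses the shipped SASHELP.HEART data, not an example from the paper):

```sas
proc freq data=sashelp.heart order=freq;
  tables status                  /* one-way frequency, most common value first */
         chol_status*bp_status   /* two-way crosstab */
         / missing nocum;        /* count missing values; suppress cumulatives */
run;
```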

How to Perform and Interpret Chi-Square and T-Tests
Jennifer Waller, Georgia Health Sciences University
Paper HW-05

For both statisticians and non-statisticians, knowing what data look like before more rigorous analyses is key to understanding what analyses can and should be performed.  After all data have been cleaned up, descriptive statistics have been calculated and before more rigorous statistical analysis begins, it is a good idea to perform some basic inferential statistical tests such as chi-square and t-tests.  This workshop concentrates on how to perform and interpret basic chi-square, and one- and two-sample t-tests.  Additionally, how to plot your data using some of the statistical graphics options in SAS® 9.2 will be introduced.
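In outline, the two tests look like this (the dataset and variable choices are illustrative, again using SASHELP.HEART):

```sas
/* Chi-square test of association between two categorical variables */
proc freq data=sashelp.heart;
  tables sex*status / chisq;
run;

/* Two-sample t-test comparing a continuous measure between two groups */
proc ttest data=sashelp.heart;
  class sex;
  var cholesterol;
run;
```

PROC TTEST reports both the pooled and Satterthwaite results, so the equality-of-variances test in the same output guides which line to interpret.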

Queries, Joins, and WHERE Clauses. Oh My!! Demystifying PROC SQL
Christianna Williams
Paper HW-06

Subqueries, inner joins, outer joins, HAVING expressions, set operators…just the terminology of PROC SQL might intimidate SAS® programmers accustomed to getting the DATA step to do our bidding for data manipulation.  Nonetheless, even DATA step die-hards must grudgingly acknowledge that there are some tasks, such as the many-to-many merge or the "not-quite-equi-join," requiring Herculean effort to achieve with DATA steps, that SQL can accomplish amazingly concisely, even elegantly.  Through increasingly complex examples, this workshop illustrates each of PROC SQL’s clauses, with particular focus on problems difficult to solve with “traditional” SAS code.  After all, PROC SQL is part of Base SAS® so, although you might need to learn a few new keywords to become an SQL wizard, no special license is required!
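A small invented example of the kind of query the workshop covers, combining an inner join, GROUP BY, and a HAVING clause on a calculated column:

```sas
proc sql;
  create table big_spenders as
  select c.custid, c.name, sum(o.amount) as total
  from customers as c
       inner join orders as o
       on c.custid = o.custid          /* one customer row matches many orders */
  group by c.custid, c.name
  having calculated total > 1000       /* filter on the summary, not the rows */
  order by total desc;
quit;
```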

Launching Off: Intro Tutorials

Using SAS® Enterprise Guide® to Coax Your Excel Data In To SAS®
Mira Shapiro, Analytic Designers LLC
Kirk Paul Lafler, Software Intelligence Corporation

Paper IT-01

Importing Microsoft Excel files into SAS can often be a challenge.  Perfectly formatted Excel files, with labels in the first row and idiosyncrasy-free, clean data, are not the norm.  We will show how to overcome many of the obstacles associated with creating SAS data sets from Excel workbooks by using various combinations of SAS Enterprise Guide 4.3's features.  The import wizard, generated code, code suggestion mechanism, options, and the ability to preview the first section of a CSV file will all be shown as mechanisms for creating analytic data sets from Excel input.

Reducing Big Data to Manageable Proportions
Sigurd Hermansen, Westat
Paper IT-02

Billions and billions of observations now fall within the scope of SAS data libraries; that is, should you happen to have an HP blade server or equivalent at your disposal.  Those of us getting along with more ordinary workstations and servers have to resort to various strategies for reducing the scale of datasets to manageable proportions.  In this presentation we explore methods for reducing the scale of datasets without losing significant information: specifically, blocking, indexing, summarization, streaming, views, restructuring, and normalization.  Base SAS happily supports all of these methods.
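Two of the named methods, sketched against a hypothetical WORK.BIGTABLE (not the presenter's code):

```sas
/* Indexing: build an index once so WHERE subsetting can use it */
proc datasets library=work nolist;
  modify bigtable;
  index create region;
quit;

/* Views: defer the work until the data are actually read */
data east_v / view=east_v;
  set work.bigtable;
  where region = 'EAST';   /* the index above can service this subset */
run;
```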

Quick Hits - My favorite SAS tricks
Marje Fecht, Prowerk Consulting
Paper IT-03

Are you time-poor and code-heavy?

It's easy to get into a rut with your SAS code and it can be time-consuming to spend your time learning and implementing improved techniques.

This presentation is designed to share quick improvements that take 5 minutes to learn and about the same time to implement.  The quick hits are applicable across versions of SAS and require only BASE SAS knowledge.

Included are:

Building the Better Macro: Best Practices for the Design of Reliable, Effective Tools
Frank DiIorio, CodeCrafters, Inc.
(presented by Paul Dorfman)
Paper IT-04

The SAS® macro language has power and flexibility.  When badly implemented, however, it demonstrates a chaos-inducing capacity unrivalled by other components of the SAS System.  It can generate or supplement code for practically any type of SAS application, and is an essential part of the serious programmer's tool box.

Collections of macro applications and utilities can prove invaluable to an organization wanting to routinize work flow and quickly react to new programming challenges.  But the language's flexibility is also one of its implementation hazards.  The syntax, while sometimes rather baroque, is reasonably straightforward and imposes relatively few spacing, documentation, and similar requirements on the programmer.  In the absence of many rules imposed by the language, the result is often awkward and ineffective coding.  Some amount of self-imposed structure must be used during the program design process, particularly when writing systems of interconnected applications.  This paper presents a collection of macro design guidelines and coding best practices.  It is written primarily for programmers who create systems of macro-based applications and utilities, but will also be useful to programmers just starting to become familiar with the language.

Why Did SAS® Say That? What Common DATA Step and Macro Messages Are Trying to Tell You
Kevin Russell, The SAS Institute
Paper IT-05

SAS notes, warnings, and errors are written to the log to help SAS programmers understand what SAS is expecting to find.  Some messages are for information, some signal potential problems, some require you to make changes in your SAS code, and some might seem obscure.  This paper explores some of these notes, warnings, and errors that come from DATA step and macro programs.  This paper deciphers them into easily understood explanations that enable you to answer many of your questions.
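A classic example of the kind of message the paper deciphers: a misspelled variable name triggers the note that a variable is uninitialized (the dataset and variables here are illustrative):

```sas
data bmi;
  set sashelp.class;
  bmi = (weight / height**2) * 703;
  if bmmi > 25 then flag = 1;  /* typo: the log notes "bmmi is uninitialized" */
run;
```

The note is informational rather than an error, which is exactly why such messages are easy to overlook and worth understanding.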

Review That You Can Do: A Guide for Systematic Review of Complex Data
Lesa Caves, RTI International
Nicole Williams, RTI International

Paper IT-06

Quality control is a critical step in the process of creating and reviewing composite variables.  Review of a single composite variable typically requires several iterations of multi-way crosstabs and case-level review in order to verify that the variable is programmed according to the analyst’s specifications.  This approach is suitable when working with simple data structures (e.g., a single dataset or multiple datasets with the same number of records per file) or when the variable is simple to program.  However, when a composite variable is created from complex, multi-level data structures, it requires special care in review and quality control procedures.  Analysts with content expertise but basic SAS® programming skills may find it difficult to adequately review the variable.  In this paper, we describe a process for effectively and systematically reviewing a composite variable created from several multi-level datasets.  Through this process, a programmer creates a composite variable in a few data steps for efficiency, while an analyst methodically breaks the code down into multiple small data steps to create a local version of the same variable.  The programmer’s and analyst’s versions of the variable are then compared and discrepancies are investigated.

HELP, My SAS® Program isn't Working: Where to Turn When You Need Help
Kimberly LeBouton, KJL Computing
Paper IT-07

Instead of offering a quick code fix, I often assist SAS users by troubleshooting their issues with my knowledge of SAS and the SAS community.  For over 20 years, I have provided SAS technical support, and this paper presents strategies I have used to work through technical problems ranging from simple to complex, including access to my “cheat sheets”.

Pharma and Healthcare

The SDTM Programming Toolkit
David Scocca, Rho, Inc.
Paper PH-01

Data standards make programmers' lives simpler but more repetitive.  The similarity across studies of SDTM domain structures and relationships presents opportunities for code standardization and re-use.  This paper discusses the process of building your own programming toolkit for SDTM work, with examples of common tasks and the code to implement those tasks.  Examples will include mapping study visits, parsing dates, standardizing test codes, and transposing horizontal clinical data.

A GUI-based utility macro for creating a version controlled project directory structure and copying in standard tools and template files
Hisham Madi, INC Research
Matt Psioda, University of North Carolina at Chapel Hill

Paper PH-02

There are considerable advantages to standardizing a directory structure for all projects/studies.  A well-defined project directory structure enhances the organization of study files such as data, SAS programs, output, and study documentation.  This paper illustrates an approach to creating a standardized directory using a utility macro, a default list of tools and files to be copied into the newly created study directory, and an Excel spreadsheet that defines the directory structure.  The process is driven by a SAS pop-up window interface that collects user-specified options, uses a management-controlled sponsor list and root-directory naming convention, and calls system commands to create the necessary subdirectories and copy template files, programs, specs, etc. into the newly created study directory.  Lastly, since each component of the setup process is version controlled, the macro dynamically selects the most recently approved version of the sponsor, tool, and directory/subdirectory lists upon each execution of the setup macro.
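The directory-creation step might look something like this sketch (paths and folder names are invented; the authors' actual macro is GUI-driven and version controlled):

```sas
%macro makedirs(root=);
  options noxwait;                 /* return immediately from system commands */
  %sysexec mkdir "&root";
  %sysexec mkdir "&root\data";
  %sysexec mkdir "&root\programs";
  %sysexec mkdir "&root\output";
%mend makedirs;

%makedirs(root=C:\studies\ABC123)
```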

Knowing When To Start, Where You Are, and How Far You Need To Go: Customized Software Tracks Project Workflow, Deliverables, and Communication
Eric Vandervort, Rho, Inc.
Paper PH-03

In a clinical trials environment, projects can have multiple statisticians and statistical programmers working on tables, listings and figures, or "displays", for project deliverables.  Communication between the various team members regarding when to program, validate, review or revise these displays is vital to the success of a project.  This paper describes a custom web-based application that stores relevant data about displays, tracks programming and reviewing workflow and provides a tool for project-level management overview.

A CareerView Mirror: Another Perspective on Your Work and Career Planning
Bill Donovan, OckhamSource™
Paper PH-04

Career planning in today’s tumultuous job marketplace requires a more rigorous and disciplined approach, which must begin with each individual tracking his or her particular skills and experiences.

The ability to organize and inventory all of your career-related experiences is the foundation of a solid career plan.  The catalog of your work assignments and functional responsibilities creates a reflection of your career efforts to date.

All of this helps to build your CareerView Mirror.

An Introduction to the Clinical Standards Toolkit
Mike Molter, d-Wise Technologies
Paper PH-05

Since the dawn of CDISC, pharmaceutical and biotech companies as well as their vendors have tried to inject CDISC processes such as standards compliance checking and define.xml generation into their clinical and statistical programming flows.  Such efforts are not without their challenges, both from technical as well as process standpoints.  With the production of data sets and tabular results taking place inside of SAS programs, it’s tempting to add code to this flow that performs these tasks.  While the required code can be relatively straightforward for SAS programmers with even modest programming and industry experience, all too often the management of such code and the processes around its use is where the difficulties occur.  Without proper management, seemingly simple tasks such as selecting which checks to execute or changing process parameters become more complicated than necessary.

The Clinical Standards Toolkit (CST) is an attempt by SAS to build a stable framework for the consistent use of BASE SAS around clinical data standards processes by striking the proper balance between the flexibility of BASE SAS and the needed discipline of process parameter management.  In this paper we will take a tour of the CST components and demonstrate the execution of these processes.  In the end, users should know not only how to set up programs to achieve these tasks, but also how to manipulate files to make these processes work for their own needs.

A SAS Macro Approach to Assign CTCAE Grades to Laboratory Adverse Experiences
Mei Dey, Accenture
Lisa Pyle, Accenture

Paper PH-06

The Common Terminology Criteria for Adverse Events (CTCAE) published by the National Cancer Institute (NCI) is widely used in the oncology therapeutic area and provides a severity grading scale for adverse experiences.  This guideline describes the severity grading for clinical adverse experiences as well as certain laboratory results.  Typically, CTCAE grading is collected directly from the site on the adverse experience case report form.  However, this may not be the case for laboratory results.  Oftentimes, only the lab result, unit, and normal ranges are collected, without any indication of toxicity severity.

This paper discusses a SAS utility macro designed to specifically apply toxicity grading to selected laboratory results based on the CTCAE guidance and the challenges surrounding the implementation.  This macro can be utilized on laboratory results collected from either local or central laboratories once all units are converted to standardized measures.  Lastly, this utility macro was designed using Version 3.0 CTCAE and populates the LBTOXGR column in the LB SDTM domain per CDISC standards and could easily be modified to reflect CTCAE Version 4.0.

Developing a Complete Picture of Patient Safety in Clinical Trials
Richard Zink, JMP Life Sciences, The SAS Institute
Russell Wolfinger, JMP Life Sciences, The SAS Institute

Paper PH-07

There are numerous dimensions to consider in the analysis and review of safety endpoints in clinical trials.  First, a multitude of tests are regularly performed to monitor the well-being of the patients under investigation.  This may include physical examinations, monitoring of vital signs and electrocardiograms, and frequent laboratory assessments.  Spontaneously-occurring events of significance include deaths, study discontinuations, hospitalizations for disease progression, or other adverse events.  Second, the temporal relationship of the various outcomes to one another may provide insight as to the circumstances leading to safety issues, or may highlight individuals requiring intervention.  Finally, demographic characteristics, medical history and knowledge of concomitant therapies and substance use are needed to appropriately intervene without causing additional harm.  Ideally, experience in the therapeutic area or within a particular drug class should inform the clinical team of safety concerns likely to arise.  However, individuals studying orphan diseases or novel compounds may have little information to limit the scope of their investigation.

Summarizing this data deluge to highlight important safety concerns has traditionally been a difficult task.  To gain a complete picture of the subject, data from several domains need to be combined in a clear and meaningful way to highlight any irregularities.  These attempts are often cumbersome and difficult to customize, and typically require additional programming resources to manipulate and present the data.  Rarely would such tools be available early in the lifetime of the trial.

During our presentation, we will demonstrate the Patient Profile and AE Narrative analytical processes of JMP Clinical.  These interactive point-and-click tools summarize cross-domain safety information directly from CDISC-formatted data sets so that a) no additional programming is required; b) they are available early in the course of the trial; and c) they can be used by anyone, even those individuals less comfortable with software.  Various options enable you to create Profiles or Narratives as drill-downs from a range of statistical analyses and to tailor them to your specific needs.  Data from a clinical trial of aneurysmal subarachnoid hemorrhage will provide illustration.

A Standard SAS Program for Corroborating OpenCDISC Error Messages
John R Gerlach, Independent Consultant
Paper PH-09

The freeware application OpenCDISC does a thorough job of assessing the compliance of SDTM domains with the CDISC standard.  However, the application generates error and warning messages that are often vague, even confusing.  Thus, it is beneficial to corroborate the CDISC compliance issues using SAS.  Moreover, because a given validation check generates similar messages across CDISC data libraries, the SAS code can be similar as well.  This paper explains a comprehensive standard SAS program that contains concise, reusable code to facilitate the process of corroborating OpenCDISC (OC) reports.

Generating SUPPQUAL Domains from SDTM-Plus Domains
John R Gerlach, Independent Consultant
Paper PH-11

Generating Supplemental (SUPPQUAL) domains is a staple component of any CDISC conversion project.  In fact, many of the SDTM domains often have a respective SUPPQUAL domain, which can be just as challenging to produce, as well as being considerably larger in size.  Even if there is an SDTM-Plus domain that contains the variables intended for a respective SUPPQUAL domain, the task of creating the respective SUPPQUAL domain can still be quite tedious, as well as error prone.

Obviously, it would be better to automate this process in a way that guarantees compliance and accuracy.  This paper explains a SAS utility for generating SUPPQUAL domains from SDTM-Plus domains.

Planning and Administration

Serving SAS®: A Visual Guide to SAS Servers
Greg Nelson, ThotWave
Paper PA-01

SAS® has been running on servers since the late 1960s.  Despite the emergence of PCs and workstation-class machines, SAS still reigns supreme on the server.  With the introduction of the SAS 9 platform in 2004, the number and types of servers have grown exponentially.

As any good student of the DATA step will attest, knowing what SAS is doing is a critically important step in debugging and authoring efficient programs.

In this paper, you will experience SAS through a visual tour - you will see what SAS is doing, how it works, which server is doing what, when the operating system plays a role, how security functions, and what happens to your data through the entire process.

SAS Enterprise Business Intelligence (EBI) Deployment Projects in the Federal Sector: Best Practices
Jennifer Parks, CSC Inc.
Paper PA-02

Systems engineering life cycles in the federal sector embody a high level of complexity due to legislative mandates, agency policies, and contract specifications layered over industry best practices – all of which must be taken into consideration when designing and deploying a system release.  Additional complexity stems from the unique nature of EBI systems that draw programmers and analysts as power end users engaged in ad-hoc predictive analytics and are at odds with traditional, unidirectional (read-only) federal production software deployments to which many federal sector project managers have grown accustomed.  This paper provides a high-level roadmap for successful SAS EBI design and deployment projects within the federal sector.  It is addressed primarily to project managers, SAS administrators, and SAS architects engaged in the systems engineering life cycle (SELC) for a SAS EBI system release.

Getting to Know an Undocumented SAS Environment
Brian Varney, Experis
Paper PA-03

For many companies, SAS has been around for decades, and there may or may not be people still around who know the details of the SAS deployment.  For a more complex SAS environment (such as SAS Enterprise Business Intelligence), it is harder to decipher how it was deployed.  This paper intends to help a new SAS administrator or user become familiar with the details of a SAS environment for which there is little or no documentation, whether it is a basic SAS Foundation install or a multi-server SAS Enterprise Business Intelligence environment.

Best Practices for Managing and Monitoring SAS® Data Management Solutions
Greg Nelson, ThotWave
Paper PA-04

SAS® and DataFlux® technologies combine to create a powerful platform for data warehousing and master data management.  Whether you are a professional SAS administrator who is responsible for the care and feeding of your SAS architecture, or you find yourself playing that role on nights and weekends, this paper is a primer on SAS Data Management solutions from the perspective of the administrator.  Here, we will review some typical implementations in terms of logical architecture so that you can see where all of the moving parts are, and we will provide some best practices for monitoring system and job performance, managing metadata (including promotion and replication of content), setting up version control, managing the job scheduler, and handling various security topics.

Gotcha – Hidden Workplace and Career Traps to Avoid
Steve Noga, Rho
Bill Donovan, Ockham

Paper PA-06

Being successful at your job takes more than just completing your tasks accurately and on time.  There are hidden holes everywhere, some deeper than others, that must be navigated, yet no map exists for you to follow.  Most companies have a set of stated policies or rules that their employees are expected to follow, but what about the unstated ones that may have an effect on how fast or how far you advance within the company?  Hidden traps also exist along the way of your career path.  This panel discussion will highlight some “gotchas” of which you should be aware and ways to keep from falling into the holes.

Running SAS on the Grid
Margaret Crevar, The SAS Institute
Paper PA-07

Have you ever wondered if you are really prepared to start the installation process of SAS software on your hardware?  Perhaps you have read the System Requirements sheets for your SAS release and version, and the appropriate SAS platform administration guide for your operating system.  Are there other crucial items to consider before the installation process that are targeted toward your company’s expected performance of SAS?


A Visual Approach to Monitoring Case Report Form Submission During Clinical Trials
Rebecca Horney, Dept. of Veterans Affairs Cooperative Studies Program Coordinating Center
Karen Jones, Dept. of Veterans Affairs Cooperative Studies Program Coordinating Center
Annette Wiseman, Dept. of Veterans Affairs Cooperative Studies Program Coordinating Center

Paper PO-02

Clinical Trials data management requires close monitoring of Case Report Form (CRF) submission, particularly for trials involving a complex mixture of hundreds or thousands of subjects, multiple recruitment sites and many different rating periods.  Monitoring must be performed on a routine, almost “real-time”, basis to assure timeliness and accuracy of data submission as well as the overall integrity and validity of the trial.

Using common SAS code and procedures, we have developed a tabular method of presenting Clinical Trial data submission on a subject-by-subject basis over the course of the trial.  This report, run in tandem with the usual monitoring and progress reports, allows us to quickly scan and visually detect several common types of errors and inconsistencies, including:
  1. Adherence to the visit schedule or, alternatively, the emergence of what we’ve termed “visit creep”
  2. Patterns of missed visits
  3. Erroneous dates
  4. Incorrect subject numbers or identifiers
  5. Incorrect visit numbers
  6. Duplicate CRF data received for multiple visits (potential fraud detection)
Aside from its usefulness to the data coordinating center, this report can be used for monitoring and review by other stakeholders, such as the study sponsor or chairperson; scientific and data review bodies; management groups and on-site clinical monitors.  Though this report was created in a Clinical Trials context, we believe its applicability could extend to other fields and disciplines such as epidemiology, banking and educational settings.

A Corporate SAS® Community of Support
Barbara Okerson, WellPoint
Paper PO-03

Many SAS users are not aware of an abundance of resources available to them from a variety of sources.  The available resources range from those internal to their own organization to SAS itself.  In order for these resources to be utilized they need to be available to the users in an accessible way.  This paper shows how one large company with SAS users at many locations throughout the United States has built a highly successful collaborative community for SAS support.  Modeled in the style of, the online corporate SAS community includes discussion forums, surveys, interactive training, places to upload code, tips, techniques, and links to documentation and other relevant resources that help users get their jobs done.

Do You Have Too Much Class?
Janet Willis, Rho, Inc.
Paper PO-04

Do you ever use Proc Means with a CLASS statement to calculate frequencies for several variables at the same time without having to sort your dataset first?  Have you ever stopped to check whether any of the variables you are calculating frequencies on are missing?  If your answer to the first question is yes and your answer to the second question is no, then you need to take a closer look at your code.  Did you know that using the TYPES statement “TYPES category*visit*(var1 var2 var3 var4);” in your Proc Means when all of these variables are CLASS variables could result in errors?  What’s MISSING?  Learn what could possibly go wrong, why, and how to avoid errors in your frequencies.  Alternative solutions that provide different output structures will be included to meet your programming needs.  Examples will include using Proc Means with CLASS and TYPES statements, with and without the MISSING option, as well as Proc Means with CLASS and VAR statements.
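One of the alternative forms the abstract mentions (CLASS with VAR) can be sketched in a few lines; the dataset and variable names below are illustrative, not taken from the paper.

```sas
/* Hypothetical sketch of the situation the paper examines: counting  */
/* non-missing values by category and visit without sorting first.    */
/* Without the MISSING option, any observation with a missing CLASS   */
/* value is silently excluded, which can understate the frequencies.  */
proc means data=demo noprint missing;
   class category visit;
   types category*visit;
   var var1 var2 var3 var4;
   output out=freqs n=n1 n2 n3 n4;
run;
```

Dropping the MISSING option from the PROC MEANS statement reproduces the silent exclusion of missing CLASS values that the abstract warns about.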

Using a Macro to Simplify the Calculation of Multi-Rater Observation Agreement
Abbas Tavakoli, USC/Nursing
Richard Walker, USC/Computer Science

Paper PO-05

This paper describes using several macro programs to calculate multi-rater observation agreement using the SAS® Kappa statistic.  In the paper, we show an example in which four raters observed a video to select certain tasks.  Each rater could select up to ten tasks, and raters could select different numbers of tasks.  Inter-rater reliability (IRR) between the four raters is examined using the Kappa statistic, calculated using the SAS® FREQ, MEANS, and PRINT procedures.  The Kappa statistic and its 95% CI were calculated for the observers, and the overall IRR was calculated by averaging the pairwise Kappa agreements.

This paper provides an example of how to use a macro to calculate percentage agreement and the Kappa statistic with a 95% CI using the SAS® FREQ, MEANS, and PRINT procedures for multiple raters with multiple observation categories.  The program can be used for more raters and tasks.  This paper extends the current functionality of SAS® PROC FREQ to support application of the Kappa statistic to more than two raters and several categories.

Mastering the Basics: Preventing Problems by Understanding How SAS® Works
Imelda Go, SC Department of Education
Paper PO-06

There are times when SAS programmers might be tempted to blame undesirable results on a SAS error when the problem actually occurred because they did not understand how SAS works.  This paper provides a few examples of how misunderstanding SAS data processing can produce unexpected results.  Examples include those involving the program data vector, syntax, and behavior of PROCs.  These examples emphasize the need for programmers to have a solid understanding of what their SAS code produces.  Making the assumption that one’s code is perfect before testing can lead to inadequate testing and costly but preventable mistakes.  A safer approach is to assume that one’s code might result in mistakes until testing proves otherwise.

A SAS User’s Guide to Regular Expressions When the Data Resides in Oracle
Kunal Agnihotri, PPD
Kenneth Borowiak, PPD

Paper PO-07

The popularity of the PRX functions and call routines has grown since they were introduced in SAS Version 9 because of the tremendous power they provide for matching patterns of text.  Because the regular expressions within these functions are rooted in Perl-style syntax, they are portable outside of a SAS environment.  It is not uncommon for SAS users to access data residing in an Oracle-based environment.  This paper explores the Oracle 10g implementation of regular expressions by highlighting similarities and differences to the PRX implementation in a series of queries using PROC SQL’s Pass-Through facility against Oracle system tables.
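To give a flavor of the comparison, here is a hypothetical pair of queries applying the same pattern on each side; the table, column, and macro variable names are illustrative, not from the paper.

```sas
/* SAS side: PRXMATCH applied in a WHERE clause */
proc sql;
   select subjid
   from work.ae
   where prxmatch('/^HEAD ?ACHE/i', aeterm) > 0;
quit;

/* Oracle side: the equivalent REGEXP_LIKE sent through Pass-Through */
proc sql;
   connect to oracle (user=&user password=&pw path=&path);
   select * from connection to oracle
      (select subjid from ae
       where regexp_like(aeterm, '^HEAD ?ACHE', 'i'));
   disconnect from oracle;
quit;
```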

Using Windows Batch Files to Sequentially Execute Sets of SAS Programs Efficiently
Matthew Psioda, UNC Chapel Hill
Paper PO-08

SAS users commonly need to repeatedly execute large sets of SAS programs and to efficiently scan the SAS logs created during the process.  We discuss one simple method that uses a utility SAS program to create Windows batch files (*.bat) that can then be used to sequentially submit a set of programs and scan the resulting SAS logs.  Additionally, the SAS programs are often stored in a standard folder structure which compartmentalizes the programs into subfolders according to purpose.  In this setting, users often want to execute the SAS programs in one or more of the subfolders in some consistent sequence.  We describe a method based on an easy-to-construct Windows batch file, called a global batch file, which gives the user simple prompts to determine what sets of programs need to be executed.  Based on user input, the SAS programs are executed in sequence, logs are scanned for each set of programs, and a batch report is generated.  When an organization uses a consistent folder structure from project to project, these tools become completely portable, allowing efficient batch processing with virtually no modification.
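A minimal sketch of the kind of batch file the paper describes might look like the following; the program names, SAS executable path, and log-scan strings are assumptions, not the authors’ actual utility.

```bat
:: Hypothetical batch file sketch: run a set of SAS programs in
:: sequence, then scan each log for ERROR and WARNING lines.
@echo off
set SASEXE="C:\Program Files\SASHome\SASFoundation\9.3\sas.exe"

for %%P in (import.sas derive.sas report.sas) do (
    %SASEXE% -sysin %%P -log "%%~nP.log" -nosplash -icon
    findstr /i /c:"ERROR" /c:"WARNING" "%%~nP.log" >> batch_report.txt
)
```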

SAS Programming Tips and Techniques for Data Mapping
Sheetal Nisal, Sterling Healthstat, Inc.
Paper PO-09

Data mapping is a very common process for getting data into homogeneous standards and making it ready for analysis.  In the healthcare and pharmaceutical industries, mapping is a very common way to prepare data for analysis and for informed decisions based on it.  SAS is powerful software and can be used with ease to map data from legacy data standards to target data standards.  If you are a novice user of SAS, or if you have to do data mapping for the first time, you may be curious to know what types of SAS coding techniques you may need while mapping the data.  Typically, data mapping involves a set of well-defined, inter-dependent processes.  To complete each process programmatically using SAS, a data mapping specialist needs to know some basic but powerful features of SAS programming and should be able to use those features effectively by understanding the data.

This paper illustrates such basic SAS techniques with the necessary illustrations of data mapping and transformations.  For ease of understanding, a typical clinical domain in the CDISC submission data standards is used as the target data standard.  Application of the SAS techniques is explained with reference to the steps in a typical data mapping process.  Transformation of each variable and its attributes requires careful use of SAS data steps and procedures.  Quality checking is a very important step in the data mapping process; it involves verifying data records, ensuring proper transformation of variables wherever applicable, and keeping metadata aligned with the standards’ requirements.  This paper illustrates SAS tips and techniques that are recommended when mapping data.

PROC TTEST® (Old Friend), What Are You Trying to Tell Us?
Jeffrey Kromrey, University of South Florida
Diep Nguyen, University of South Florida
Patricia Rodriguez de Gil, University of South Florida
Eun Sook Kim, University of South Florida
Aarti Bellara, University of South Florida
Anh Kellerman, University of South Florida
Yi-hsin Chen, University of South Florida

Paper PO-10

The SAS procedure for Student’s t-test (PROC TTEST) has been a part of the SAS system of statistical procedures since its mainframe computer days.  The procedure provides hypothesis testing and confidence interval estimation for the difference between two population means.  By default, the procedure provides two estimates of standard errors, two hypothesis tests, and two interval estimates: one that assumes homogeneity of variance and the other that avoids this assumption.  In addition, PROC TTEST provides a test of variance homogeneity (the Folded F test) that ostensibly provides guidance in the choice between the two estimation methods.  This paper describes past research on the accuracy of this conditional testing procedure, provides new simulation research results, and suggests guidelines for the use of the Folded F test in selecting between the two t-test approaches.
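The default behavior described above comes from a call as simple as the following sketch; the dataset and variable names are illustrative.

```sas
/* Minimal call producing the default output described above: pooled and */
/* Satterthwaite t-tests, both interval estimates, and the Folded F test */
/* of equal variances.                                                   */
proc ttest data=scores;
   class group;     /* two-level grouping variable */
   var testscore;   /* outcome being compared      */
run;
```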

Integration of Scientific Writing into an Applied Biostatistics and SAS Programming Course for Pharmaceutical Sciences Graduate Students
Daniel Hertz, University of North Carolina at Chapel Hill
Dan Crona, University of North Carolina at Chapel Hill
Jasmine Talameh, University of North Carolina at Chapel Hill
Scott Brantley, University of North Carolina at Chapel Hill
Luke Roode, University of North Carolina at Chapel Hill
Katie Theken, University of North Carolina at Chapel Hill
J. Heyward Hull, University of North Carolina at Chapel Hill

Paper PO-11

Background: Successful training of graduate students and young investigators requires a mixture of didactic and practice-based learning.   Students routinely indicate that they would benefit from more training in scientific writing; however, this is often difficult to teach effectively in isolation from research-associated activities to a group of students with a diverse background and research focus.  The core elements of a scientific report are the presentation and interpretation of findings from the study statistical analyses.  Thus, in order to provide students with an opportunity to develop their skills in the written reporting of scientific findings, we have integrated a scientific-writing component into a previously existing course in applied biostatistics and SAS programming.

Methods: Graduate students in the UNC Eshelman School of Pharmacy are required to take a second-level 3-credit course in the application of biostatistics (DPET 831).  Each week students attend a 2-hour lecture discussing the most common statistical procedures used in biomedical research.  Subsequently, students attend a 2-hour recitation at which they are provided a related case assignment (background, hypothesis, raw data, and analysis plan) and are expected, individually, to carry out the required analyses using SAS Software (Cary, NC) in a supervised computer lab.  After recitation, students independently write a brief but formal report (<1,000 words) on a customized Microsoft Word (Redmond, WA) template, including introduction, methods, results, and discussion sections, similar to the standard format for a scientific manuscript.  The report is submitted and returned electronically for critique and grading.

Discussion: The key course objectives for students enrolling in DPET 831 remain instruction in statistical methods, SAS programming, and data interpretation.  Nevertheless, with minor modifications to the assignment template and instructions, students can simultaneously learn biostatistics and efficiently develop scientific writing skills in a controlled environment with a consistent mechanism to monitor student progression.  This combined training could also be expanded to other research-related activities (e.g., grant writing) expected of the young investigator.  Finally, we believe this integrated model could be implemented at other academic institutions with courses in applied biostatistics that are training young investigators for careers in the biomedical sciences.

A SAS Macro to Obtain Reference Values Based on Estimation of the Lower and Upper Percentiles via Quantile Regression
Neeta Shenvi, Emory University
Amita Manatunga, Emory University
Andrew Taylor, Emory University School of Medicine

Paper PO-12

Reference or normative values for diseased or healthy populations are often required for medical studies.  For example, in order to differentiate diseased subjects from healthy subjects, the percentiles of the distribution of the disease marker among healthy subjects are frequently used.  In addition, the disease marker may depend on certain covariates, and there is a need to adjust for these covariates when establishing the reference values.  Quantile regression (QR) methods have numerous advantages over existing least squares methods; for example, QR can deal with skewed data without the usual distributional assumptions and is flexible enough to allow different regression coefficients for different percentiles.  SAS code is available to run the QR and adjust for covariates for a given dataset; however, the procedure is not automated to provide organized reports in the presence of many outcome variables.  We describe a SAS macro that combines PROC QUANTREG, PROC REPORT, and the Graph Template Language to provide estimates of the quantiles and the regression equations, with and without adjustment for possible covariates.  In addition, the program provides plots of the marker versus the covariates at the specified quantiles.  We illustrate our program using a kidney study where normal values of renal area and length are determined for males and females from 99mTc-MAG3 renal scintigraphy.
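The core step the macro automates can be sketched with a single PROC QUANTREG call; the dataset, marker, and covariate names here are hypothetical.

```sas
/* Illustrative sketch: estimate the 5th and 95th percentiles of a    */
/* disease marker among healthy subjects, adjusted for covariates.    */
proc quantreg data=healthy;
   model marker = age sex / quantile=0.05 0.95;
run;
```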

A Randomization Test SAS Program for Making Treatment Effects Inferences for Extensions and Variations of ABAB Single-Case Experimental Designs
Patricia Rodriguez de Gil, University of South Florida
John M. Ferron, University of South Florida

Paper PO-13

While the evaluation of intervention effects in single-case research has relied on visual inspection of the data (Kazdin, 1980), the description of graphical forms is not considered an adequate substitute for statistical tests (Edgington, 1980).  Moreover, there are cases when graphical displays of data tend to be quite ambiguous and treatment effects are not easily appreciated (Ferron & Sentovich, 2002); in these cases, inferential statistics are often necessary to determine whether a treatment effect exists.  Randomization tests are considered valid statistical tests for determining the presence of a treatment effect in single-case experimental data (Edgington, 1980).  In addition, significance tests lead to a more informed and reflective statistical analysis (Thompson & Snyder, 1997).  Although the statistical validity of randomization tests has been established, randomization tests for single-case data are not incorporated into readily available statistical software like SAS and SPSS, making it difficult for researchers to implement randomization tests in their statistical analysis of data.  The example provided by Onghena (1992) is used to illustrate a worked example of a randomization test, in which the random assignment of treatments to treatment times and the incorporation of randomization into single-case reversal designs are explained and applied to statistical testing.  SAS/IML code for randomization tests for extensions and variations of ABAB single-case experimental designs is provided and discussed.

Spatial Analysis of Gastric Cancer in Costa Rica using SAS
So Young Park, North Carolina State University
Marcela Alfaro-Cordoba, North Carolina State University

Paper PO-14

Stomach cancer, or gastric cancer, refers to cancer arising from any part of the stomach.  It causes about 800,000 deaths worldwide per year.  Gastric cancer (GC) is the leading cause of cancer-related mortality in Costa Rican males.  After breast cancer, it is the second highest cause of cancer mortality in women in Costa Rica.  Most predictor variables have been based on epidemiological and social factors, yet spatial factors have not commonly been accounted for in the analysis.

In epidemiology, the prevalence of a health-related state (in this case, GC) in a statistical population is defined as the total number of cases in the population divided by the number of individuals in the population.  It is used as an estimate of how common a disease is within a population over a certain period of time.  It helps health professionals understand the probability of certain diagnoses and is routinely used by epidemiologists, health care providers, government agencies, and insurers.  The objective of this analysis is to identify spatial variation in GC in Costa Rica after accounting for social factors, and to establish possible spatial patterns with which future public health efforts can be effectively targeted to decrease prevalence.  Using SAS, we construct conditional autoregressive (CAR) and simultaneous autoregressive (SAR) models to show the presence of spatial autocorrelation in the areal data (i.e., by-county data).

Extend the Power of SAS® to Use Callable VBS and VBA Code Files Stored in External Libraries to Control Excel Formatting Routines
William E Benjamin Jr., Owl Computer Consultancy LLC
Paper PO-17

Did you ever wish you could use the power of SAS® to take control of EXCEL and make EXCEL do what you wanted “WHEN YOU WANTED”?  Well, one letter is the key to doing just that: the letter “X”, as in the SAS “X” command that opens the door to all operating system commands from SAS.  The Windows operating system comes with a facility to write a series of commands called scripts.  These scripts have the ability to open and reach into the internals of EXCEL.  Scripts can load, execute, and remove VBA macro code and control EXCEL.  This level of control allows you to make EXCEL do what you want, without leaving any traces of a macro behind.  This is power.

Array, Hurray, Array; Consolidate or Expand Your Input Data Stream Using Arrays
William E Benjamin Jr., Owl Computer Consultancy LLC
Paper PO-18

You have an input file with one record per month but need an output file with one record per year, and you cannot use PROC TRANSPOSE because other fields need to be retained or the input file is sparsely populated.  The techniques shown in this paper will enable you to either consolidate or expand your output stream of data by using arrays.  Sorted files of data records can be processed as a unit using "BY variable" groups and building an array of records to process.  This technique allows access to all of the data records for a "BY variable" group and gives the programmer access to the first, the last, and all records in between at the same time.  This allows the selection of any data value for the final output record.
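A minimal sketch of the consolidation idea, assuming a sorted input with one record per account, year, and month; all dataset and variable names are illustrative.

```sas
/* Collapse 12 monthly records per account-year into one annual record */
/* by reading an entire BY group in a single DATA step iteration.      */
data annual;
   array amt{12} amt1-amt12;
   do until (last.year);
      set monthly;          /* one record per account, year, month */
      by account year;
      amt{month} = amount;  /* park each month's value in its slot */
   end;
   output;                  /* one record per account-year */
   keep account year amt1-amt12;
run;
```

Because the loop consumes a whole BY group within one iteration, the amt1-amt12 variables reset to missing automatically at the top of the next iteration, so sparsely populated months simply stay missing.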

Evaluating effectiveness of management interventions in a hospital using SAS® Text Miner
Anil Kumar Pantangi, Oklahoma State University
Musthan Mohideen, Oklahoma State University
Goutam Chakraborty, Oklahoma State University
Gary Gaeth, University of Iowa

Paper PO-19

Businesses often implement changes as part of their business strategy to improve outcomes and enhance customer satisfaction.  The best situation occurs when a business can measure the impact of the change before and after the intervention.  Healthcare and hospital management are no exception.  In this paper we analyze patient survey data obtained from a large Midwestern university hospital.  The hospital management introduced a key intervention to improve patient satisfaction.  As part of this project we used data from a third party with a standard set of questions about the following areas: “access to care,” “during your visit,” “your care provider,” “personal issues,” and “overall assessment.”  We used SAS Enterprise Guide® and Enterprise Miner® to analyze the pre and post effects of a key intervention – the introduction of an online portal for access to patient medical information, including test results.  The data were in two forms—quantitative and qualitative (comments).  As often occurs, the survey was not specifically designed to measure this intervention.  We analyzed both the quantitative data and the text data to gauge the valence of customers’ comments about the intervention.  We find a significant increase in the mean outcomes when patients commented about test results in the survey section that asks for the one thing you “wish were different with the clinic.”  This kind of pre and post analysis, where quantitative and qualitative data are used in tandem, helps management measure the effectiveness and significance of intervention strategies.

MIXED_FIT: A SAS Macro to Assess Model Fit and Adequacy for Two-Level Linear Models
Mihaela Ene, University of South Carolina
Whitney Smiley, University of South Carolina
Bethany Bell, University of South Carolina

Paper PO-20

As multilevel models (MLMs) are useful in understanding relationships existent in hierarchical data structures, these models have started to be used more frequently in research in the social and health sciences.  In order to draw meaningful conclusions from MLMs, researchers need to make sure that the model fits the data.  Model fit, and thus ultimately model selection, can be assessed by examining changes in several fit indices across different nested and non-nested models [e.g., -2 log likelihood, Akaike Information Criterion (AIC), and Schwarz’s Bayesian Information Criterion (BIC)].  In addition, the difference in pseudo-R2 is often used to examine the practical significance between two nested models.  Considering the importance of using all of these measures when determining model selection, the accessibility of this information is of major interest to researchers using this analytic technique.  Whereas SAS PROC MIXED produces the -2 log likelihood, AIC, and BIC, it does not provide the actual change in these fit indices or the change in pseudo-R2 between different nested and non-nested models.  In order to make this information more attainable, Bardenheier (2009) developed a macro that allowed researchers using PROC MIXED to obtain the test statistic for the difference in -2 log likelihood along with the p-value of the Likelihood Ratio Test (LRT).  As an extension of Bardenheier’s work, this paper provides a comprehensive SAS macro that incorporates changes in model fit statistics (-2 log likelihood, AIC, and BIC) as well as change in pseudo-R2.  By utilizing data from PROC MIXED ODS tables, the macro produces a comprehensive table of changes in model fit measures.  Thus, this expanded macro allows SAS users to examine model fit in both nested and non-nested models, in terms of both statistical and practical significance.
This paper provides a review of the different methods used to assess model fit in multilevel analysis, the macro programming language, and an executed example of the macro.
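The LRT piece of the computation reduces to a chi-square difference test; in the sketch below, the -2 log likelihood values and degrees of freedom are made up for illustration.

```sas
/* Likelihood ratio test between two nested models fit with PROC MIXED */
data lrt;
   neg2ll_reduced = 4821.6;  /* -2LL, reduced model (illustrative) */
   neg2ll_full    = 4803.2;  /* -2LL, full model (illustrative)    */
   df             = 2;       /* number of added parameters         */
   chisq = neg2ll_reduced - neg2ll_full;
   p     = 1 - probchi(chisq, df);
run;
```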

Using Dictionary Tables to Profile SAS Datasets
Phillip Julian, Bank of America
Paper PO-21

Data profiling is an essential task for data management, data warehousing, and exploring SAS® datasets.  TDWI extends the usual definition of data profiling to include data exploration.  This paper presents two SAS programs, Data_Explorer and Data_Profiler, that implement the TDWI definition.

These SAS programs are low-cost, free solutions for data exploration and data profiling.  Data_Explorer searches for all SAS datasets, and gathers essential dataset and file attributes into a single report.  Data_Profiler summarizes the values of any SAS dataset in a generic manner, which eliminates the need for custom SQL queries and custom programs to summarize what a dataset contains.

These programs have been used in banking and state government.  They should also be useful in the pharmaceutical industry for validating SAS datasets and managing data repositories.
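The heart of such a profiler is a query against the SQL dictionary tables; here is a minimal sketch, with the library name as an assumption.

```sas
/* Gather dataset-level attributes for every table in a library */
proc sql;
   create table profile as
   select libname, memname, nobs, nvar, crdate, modate
   from dictionary.tables
   where libname = 'MYLIB' and memtype = 'DATA';
quit;
```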

Reporting and Information Visualization

Get Your "Fast Pass" to Building Business Intelligence with SAS® and Google Analytics
Patricia Aanderud, And Data Inc
Paper RI-01

See the magic of Google Analytics with SAS® business intelligence tools!  Use your "Fast Pass" to learn SAS® Information Map Studio, SAS® Web Report Studio, and SAS® Information Delivery Portal with the authors of the newly released "Building Business Intelligence with SAS®".  We will guide you step-by-step through creation of information maps, custom web reports, and portal pages using data from Google Analytics.  You will learn some tips and tricks for creating custom data items, designing and linking reports, and adding external content to your portal.

Let the Data Paint the Picture: Data-Driven, Interactive and Animated Visualizations Using SAS®, Java and the Processing Graphics Library
Patrick Hall, NCSU
Ryan Snyder, NCSU

Paper RI-03

This paper introduces a scalable technique that combines the data manipulation capabilities of Base SAS® 9.2 with the Java Processing graphics library to generate customizable visualizations.  With an instructive example, the reader is guided through connecting to the sashelp.iris sample data using a SAS/SHARE® server to build a visualization applet.  The explanation includes details for adding animation and simple mouse and keyboard interactions into the visualization.  A basic understanding of Object Oriented (OO) programming is assumed.  The included example was designed for the Windows platform; however, Java and Processing are designed to support many other operating systems.  Additional reference material, including an online code repository, and sample visualizations are also included.

Converting from SAS/GRAPH(R) to ODS Graphics
Jim Horne, Lowe's Companies, Inc.
Paper RI-04

As of SAS 9.3, SAS has moved ODS Graphics and the Statistical Graphics procedures from SAS/GRAPH® to Base SAS®.  This provides an opportunity to eliminate SAS/GRAPH by converting our SAS/GRAPH procedures to ODS Graphics procedures.  This paper gives an overview of some ways to convert basic graphs from SAS/GRAPH to ODS Graphics and helps determine whether ODS Graphics can replace SAS/GRAPH.
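A typical conversion of a basic scatter plot might look like the following sketch; SGPLOT’s SCATTER statement is the standard ODS Graphics counterpart of PLOT in GPLOT.

```sas
/* SAS/GRAPH version */
proc gplot data=sashelp.class;
   plot weight*height;
run;
quit;

/* ODS Graphics equivalent */
proc sgplot data=sashelp.class;
   scatter x=height y=weight;
run;
```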

Pulling Data from Ellucian-Banner ODS with SAS-EG: Not only fast but fun as well!
Claudia McCann, East Carolina University
Paper RI-05

Assessment of learning, and of services, in higher education is crucial for continued improvement.  Administrators and faculty are demanding, and using, data more and more in their decision-making processes.  There are many data input experts on campus and, unfortunately, far fewer who can easily extract the data in the aggregate form required by administrators, accreditors, and other institutional stakeholders.  The SAS Enterprise Guide (SAS-EG) interface with the Banner Operational Data Store (ODS) is a very powerful combination of software that enables the end user to quickly access the institution's data and produce descriptive reports.  More powerful still is the ability to bring other data sources, such as Excel spreadsheets, into the SAS-EG environment, thereby allowing variables not available in ODS to be used in the analyses.  This presentation/demonstration will explore how to load ODS views into SAS-EG and how to produce simple descriptive statistics such as frequencies and tables.  The process of including data external to Banner/ODS, via merging tables in SAS-EG, will also be demonstrated.

Don’t Avoid It, Exploit It: Using Annotate to Enhance Graphical Output
Sarah Mikol, Rho, Inc.
Paper RI-06

SAS/GRAPH® is widely used for displaying the results of an analysis or simply creating graphical summaries of the data.  Often it is necessary to customize these graphics to meet client needs or clarify the output.  However, many users are intimidated by the intricate syntax of SAS/GRAPH®’s customization tool, ANNOTATE, and may settle for graphics that do not provide a clear, comprehensive presentation of the results.

This paper will first introduce the cornerstone of the ANNOTATE facility, the ANNOTATE dataset, and will explain the details of its construction.  A summary of the annotation coordinate system as well as the ANNOTATE macros, which offer a more compact way to produce the annotation dataset, will be given.  Lastly, two examples considering both data driven and predetermined modifications will demonstrate how to incorporate custom text, symbols, and line segments into graphics produced by both the GPLOT and GCHART procedures.

Enhance your SAS/Intrnet application with jQuery and Google Earth
David Mintz, EPA
Paper RI-07

Sometimes the hardest part of developing a data delivery system is designing the interface.  How will users select what they need?  How will the output be displayed?  What’s the most efficient design when menu options need to reflect database updates?  This paper will demonstrate how to enhance your SAS/Intrnet application with jQuery and a Google Earth API to satisfy common user expectations.  Examples will show how to implement data driven menu options that change as your data change, return output to the same window (with no page refresh!), display a “processing data” icon while your program is running, and provide access to data-driven kml files via a map API.

Do SAS® users read books? Using SAS graphics to enhance survey research
Barbara Okerson, WellPoint
Paper RI-08

In survey research, graphics play two important but distinctly different roles.  Visualization graphics enable analysts to view respondent segments, trends and outliers that may not be readily obvious from a simple examination of the data.  Presentation graphics are designed to quickly illustrate key points or conclusions to a defined audience from the analysis of the survey responses.  SAS provides the tools for both these graphics roles through SAS/GRAPH and ODS Graphics procedures.  Using a survey of the Virginia SAS Users Group (VASUG) as the data source, this paper answers the above question and more while illustrating several SAS techniques for survey response visualization and presentation.  The techniques presented here include correspondence analysis, spatial analysis, heat maps and others.

Results included in this paper were created with version 9.2 of SAS on a Windows 64-bit server platform and use Base SAS, SAS/STAT and SAS/GRAPH.  SAS Version 9.1 or later and a SAS/GRAPH license are required for ODS graphics extensions.  The techniques represented in this paper are not platform-specific and can be adapted by beginning through advanced SAS users.

Mobile Business Applications: Delivering SAS Dashboards To Mobile Devices via MMS
Ben Robbins, Eaton
Michael Drutar, The SAS Institute

Paper RI-09

Today’s fast-paced business environment requires leveraging mobile technology more than ever.  In order to conduct business within the mobile space, several delivery platforms have been developed; however, one optimal method of delivery is the Multimedia Messaging Service (or picture message).  MMS is a cost-effective and platform-agnostic solution, simplifying the financial, IT, and logistical challenges that are common with mobile integration.  Without the added cost of development, MMS empowers businesses to push a single graphical representation of data to all necessary recipients, regardless of manufacturer (iPhone, Android) or the technological level of the phone (smartphone, basic phone).

This paper explains how a sales representative of a company learns that her territory has changed while out of the office.  From her mobile device she sends a text message to her company’s main email address.  At the office, a SAS job receives the text message; the job is then triggered and uses PROC GMAP to generate a map showing the employee’s current sales territories and distance to goal.  The SAS job then immediately sends a picture message back to the sales representative’s mobile device.  The content of the message is the output image from the GMAP procedure and any accompanying text.  Using this method, the company’s employees can receive SAS graphical output on their mobile devices on demand.  The SAS job also logs how many MMS messages it sends out, to which users, and the messages’ contents.  This data is used for the company’s internal reporting.  Sending several varieties of SAS graphs is discussed in this paper.

Data Merging and Exploration to Identify Associations of Epidemiological Outbreaks with Environmental Factors
Neeta Shenvi, Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA
Xin Zhang, Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA
Azhar Nizam, Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA

Paper RI-10

This paper describes data merging and visualization techniques for epidemiological and environmental surveillance data.  The ultimate goal is to learn the influence of specific environmental factors associated with disease epidemics.  Results included in this paper were created with SAS 9.3 on a Windows XP platform, using Base SAS, SAS/STAT software, and SAS/GRAPH.  SAS 9.1 or later is required for ODS graphics extensions.

Using Design Principles to Make ODS Template Decisions
Helen Smith, RTI International
Susan Myers, RTI International

Paper RI-12

With the introduction of the Output Delivery System (ODS) in SAS® 9.1 and subsequent enhancements in SAS® 9.2, SAS has provided programmers with many style templates for developing reports.  These default templates or style definitions are often able to present the data in a clear and attractive manner with no further thought needed.  However, for more complicated reports and their requirements, basic design principles or perceptual concepts can be helpful in making choices between one template and another, or one custom graphic feature over another, in order to make the report comprehensible for users.  This paper will discuss and present the code of two reports: one a redesign of a 10+ year old SAS program originally designed with PUT statements, and a second highly customized SAS program with an Excel format.  Each of the reports is produced with PROC REPORT, but the two take very different approaches to setting the style applied to the output.

Diverse Report Generation With PROC REPORT
Chris Speck, PAREXEL International
Paper RI-13

Automation is often the goal of SAS programming.  If we could just hit “Submit” and watch our program generate all our tables and listings while making the right decisions at run time we could get a lot more accomplished.  Of course, with the SAS macro facility, we can already do this…up to a point.  We can equip our macros with macro logic, we can feed them different parameters, and then watch as they produce one output after another.  This works great when all your outputs are based on the same data set and require the same number of columns.

But what if they don’t?  What if you need to automate report generation from a large number of different data sets?  What if you must allow for any number of columns?  Developing such a program using macro logic would be cumbersome indeed.  You would need %IF %THEN blocks for every contingency, and your program would get so bogged down in logic that you’d be better off without automation at all.

This paper will demonstrate how SAS programmers can easily and gracefully automate Diverse Report Generation.  The methods discussed in this paper use a patient profile program as a primary example and will make use of the REPORT procedure, the SQL procedure, and SASHELP views.

Creating a Heatmap Visualization of 150 Million GPS Points on Roadway Maps via SAS®
Shih-Ching Wu, Virginia Tech Transportation Institute
Shane McLaughlin, Virginia Tech Transportation Institute

Paper RI-14

This paper introduces an approach that uses SAS integrated with common geospatial tools to reduce a large number of GPS points to heatmaps describing the frequency with which road segments are traversed.  This task involves three steps: first, matching the large number of GPS points to known road networks, a process known as map-matching; second, calculating the number of trips on each road segment based on the map-matching results; and third, coloring road segments based on the number of trips.  An example is described which maps one hundred and fifty million GPS points using SAS with SAS Bridge for ESRI, ArcGIS, and two free and open-source tools, PostgreSQL and PostGIS.
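The core of the first two steps can be illustrated with a toy nearest-segment matcher.  This is a Python sketch of the general idea only, with invented coordinates; real map-matching, as in the workflow above, also uses heading, network topology, and spatial indexes.

```python
import math

def point_to_segment_dist(p, a, b):
    """Distance from point p to segment a-b (all 2-tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment, clamping to the endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy)

def map_match(points, segments):
    """Assign each GPS point to its nearest road segment; return hit counts."""
    counts = {name: 0 for name in segments}
    for p in points:
        nearest = min(segments, key=lambda s: point_to_segment_dist(p, *segments[s]))
        counts[nearest] += 1
    return counts

# Hypothetical two-segment road network and four GPS fixes.
segments = {"Main St": ((0, 0), (10, 0)), "Oak Ave": ((0, 0), (0, 10))}
points = [(1, 0.2), (5, -0.1), (0.3, 7), (9, 0.4)]
print(map_match(points, segments))   # {'Main St': 3, 'Oak Ave': 1}
```

The resulting per-segment counts are exactly what the third step would color by.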

Using SAS/GRAPH® to Create Visualizations That Also Support Tactile and Auditory Interaction
Ed Summers, The SAS Institute
Paper RI-15

Concepts, ideas, and analyses are commonly presented as graphics that are optimized for visual consumption.  Does that mean that blind students and professionals are out of luck?  Not anymore.

This presentation demonstrates best practices for multimodal visualizations for blind users of the iPad and other touchscreen mobile devices.  Multimodal visualizations allow blind users to interactively explore visualizations through touch, discover details through sound, and comprehend the essence of the visualizations without vision.  You learn best practices for a variety of common charts, plots, and maps.  We demonstrate how to create multimodal visualizations using SAS macros that encapsulate the best practices.  Lastly, we explore how an auditory channel can improve the usability of visualizations for sighted users.

Using Axes Options to Stretch the Limits of SAS® Graph Template Language
Perry Watts, Stakana Analytics
Paper RI-16

While Graph Template Language (GTL) in ODS statistical graphics has made it possible to produce a wide variety of high quality graphs with relative ease, problems remain that defy a simple solution.  Two that are discussed in this paper are resolved by applying axes options available in GTL.   Ironically, the first problem addressed is an axis problem involving the placement of minor ticks along a continuous axis.  Minor ticks have always been available in SAS/GRAPH to facilitate value tracking and to signal the precision of the underlying data that are being graphed.  Unfortunately, though, they are not available in GTL.  This paper shows how they can be implemented in SAS 9.3 with the application of several embedded user-defined macros.

The second problem addressed is the inability to generate true n-bin endpoint or midpoint histograms in GTL.  When the NBINS option is used directly in a HISTOGRAM statement, zero-frequency bins outside the data range often make their way into the output graph by mistake.  This problem is solved by exercising user-defined NBINHISTO macros that combine options from the HISTOGRAM and XAXISOPTS statements to get more reliable results.

Since GTL is significantly different from SAS/GRAPH when it comes to axes definitions, comparisons are made and the new linear and discrete axes are fully described.   The SAS user who is comfortable with macros and has some experience creating and interpreting graphs will get the most out of this paper.  Source code is available upon request.

Together at Last: Spatial Analysis and SAS® Mapping
Darrell Massengill, The SAS Institute
Paper RI-17

Spatial analysis and maps are a perfect match. Spatial analysis adds intelligence to your maps; maps provide context for your spatial analysis.  The geostatistical tools in SAS/STAT® software can model and predict a variety of spatial data.  SAS mapping tools enable you to create rich visualizations from that material.  This presentation introduces a new framework that combines SAS spatial analytics with SAS mapping.  Examples demonstrate how you can use the SAS/GRAPH® ANNOTATE facility with the transparency specification (new in SAS® 9.3) to combine a predicted spatial surface with traditional SAS/GRAPH® maps, and show how to tap into the additional mapping resources of ESRI software through the SAS® Bridge for ESRI.  These tools empower you to make more intelligent maps and more informative spatial analyses.

Statistics and Data Analysis

Compare MIXED and GLIMMIX to Analyze a Breast Cancer Longitudinal Study
Abbas Tavakoli, University of South Carolina
Sue Heiney, University of South Carolina, College of Nursing

Paper SD-01

The choice of statistical procedure used to analyze a longitudinal study has grown in importance as computerized data analysis has become the basis for scientific research.  There are many procedures in SAS that can be used to analyze longitudinal studies.  The purpose of this paper is to compare the MIXED and GLIMMIX procedures in SAS for analyzing a longitudinal study.  A randomized trial design was used in which 185 participants were assigned to a therapeutic group (n=92), who received therapy by teleconference with participants interacting in real time with each other, or a control group (n=93), who received usual psychosocial care (any support used by the patient in the course of cancer treatment).  The randomization was stratified by treatment type.  Data were collected at baseline, the end of the intervention, and 16 weeks from baseline.  A mixed-effects repeated measures model was used to assess the outcome variable of social well-being (social connection) by group over time.  The effects of group, time, and the group-by-time interaction were examined after controlling for several confounding factors.  SAS provides powerful procedures for the data analysis of longitudinal studies.

Random Effects Simulation for Sample Size Calculations Using SAS
Matthew Psioda, UNC Chapel Hill
Paper SD-02

Sample size calculations are a critical step in the planning of any experiment.  In all but the simplest of experimental designs, closed-form equations are not readily available, and statisticians are required to use simulations to estimate an appropriate sample size for the experiment.   Specifically, when multiple explanatory variables are thought to be predictive of the response or when missing data is likely to occur, simulation is a valuable approach for sample size calculation.

In this paper we consider a simulation study for a continuous response measured at a set of fixed time points.  We consider a randomized study comparing an experimental treatment plus standard of care (ET+SOC) to the standard of care (SOC).  We assume that, based on data from a previous study, the response trajectory for the SOC subjects varies with respect to gender and that the response curves are reasonably well modeled by a quadratic polynomial in time.

Using PROC IML, we simulate data from a linear mixed effects model including a gender main effect, linear and quadratic time, and treatment by time interactions.  For this scenario, the primary (null) hypothesis is that there is no treatment effect.  Model fitting is performed using PROC MIXED.  Our technique for simulation is easily generalizable and efficient.
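As a rough illustration of the simulation design described above (not the paper's PROC IML code), the following Python sketch draws a per-subject random intercept and builds the response from hypothetical gender, quadratic-time, and treatment-by-time effects; every coefficient here is invented.

```python
import random

def simulate_trial(n_per_arm=50, times=(0, 1, 2, 3), seed=2012):
    """Simulate a quadratic-trend response with a per-subject random intercept."""
    rng = random.Random(seed)
    rows = []
    for arm in ("SOC", "ET+SOC"):
        for subj in range(n_per_arm):
            gender = rng.choice(("F", "M"))
            b0i = rng.gauss(0, 1.5)  # subject-level random intercept
            for t in times:
                mean = (10.0
                        + (1.0 if gender == "M" else 0.0)         # gender main effect
                        + 2.0 * t - 0.3 * t * t                   # linear + quadratic time
                        + (0.8 * t if arm == "ET+SOC" else 0.0))  # treatment-by-time
                rows.append((arm, subj, gender, t, mean + b0i + rng.gauss(0, 1.0)))
    return rows

data = simulate_trial()
print(len(data))   # 2 arms x 50 subjects x 4 time points = 400 rows
```

In the actual study each simulated data set would then be fit (with PROC MIXED) and the null hypothesis of no treatment effect tested; power is the rejection rate across replications.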

The Effects of Q-Matrix Mis-Specification when Employing Proc NLMIXED: A Simulation Study
George MacDonald, University of South Florida
Jeffrey Kromrey, University of South Florida

Paper SD-03

In the report, Adding it Up: Helping Children Learn Mathematics, the Mathematics Learning Committee from the National Research Council suggested that student learning and performance could be enhanced if conceptual understanding was taught at the same time as procedural fluency.  Fischer (1973) introduced a model called the linear logistic test model (LLTM) that is capable of bridging cognitive processing models and psychometric models.  Fischer found that differentiating calculus items could be explained by cognitive operations that the examinee must implement.  The specification of the cognitive operations is quantified in a Q-Matrix.  Most model fit analyses do not address the correctness of the constructed Q-Matrix; therefore, the amount of misspecification that the Q-Matrix can tolerate and still function adequately is unknown.  This simulation study employed Proc IML and Proc NLMIXED in SAS 9.2 to examine the extent to which the LLTM and the 2-PL constrained model, an extension of the LLTM, function well when the Q-Matrix is properly specified, under-specified, balanced misspecified, and over-specified.  The results of the simulation will be interpreted and the implications for educational assessment will be discussed.

Decision-Making using the Analytic Hierarchy Process (AHP) and SAS/IML®
Melvin Alexander, Social Security Administration
Paper SD-04

SAS/IML can be used to implement the Analytic Hierarchy Process (AHP).  AHP helps decision-makers choose the best solution from several options and selection criteria.  Thomas Saaty developed AHP as a decision-making method in the 1970s.  AHP has broad applications in operations research, quality engineering, and design-for-six-sigma (DFSS) situations.

AHP builds a hierarchy (ranking) of decision items using comparisons between each pair of items expressed as a matrix.  Paired comparisons produce weighting scores that measure how much importance items have relative to one another.

This presentation will demonstrate AHP using personal, business, and medical decision-making examples.  A SAS/IML subroutine will generate output that includes measures of criteria and selection importance and data consistency.
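As a loose illustration of the weighting step (not the SAS/IML subroutine itself), the sketch below derives priority weights from a pairwise-comparison matrix using the common geometric-mean approximation to Saaty's eigenvector method; the matrix values are made up.

```python
import math

def ahp_weights(M):
    """Priority weights from a pairwise-comparison matrix via the
    geometric-mean (logarithmic least-squares) approximation."""
    n = len(M)
    gm = [math.prod(row) ** (1.0 / n) for row in M]
    total = sum(gm)
    return [g / total for g in gm]

# Hypothetical 3-criteria matrix: A judged 3x as important as B, 5x as C, etc.
M = [[1,   3,   5],
     [1/3, 1,   2],
     [1/5, 1/2, 1]]
w = ahp_weights(M)
print([round(x, 3) for x in w])   # weights sum to 1, criterion A dominates
```

A full AHP implementation would also compute Saaty's consistency ratio from the principal eigenvalue to flag contradictory judgments.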

SAS Procedures for Analyzing Survey Data
Pushpal Mukhopadhyay, The SAS Institute
Paper SD-05

The design of probability-based sample surveys involves specialized elements such as stratification, clustering, and unequal weighting.   In order to make statistically valid inferences, correspondingly specialized software is required that takes these elements into account in variance estimation.  This tutorial provides an overview of the basic functionality of the SAS/STAT procedures that are specifically designed for selecting and analyzing probability samples of survey data.  The tutorial also discusses the characteristics of different variance estimation techniques, including both the Taylor series method and replication methods.  The course is intended for a broad audience of statisticians who are interested in analyzing sample survey data.  Familiarity with basic statistics, including regression analysis, is strongly recommended.

A SAS macro to compute effect size (Cohen’s d) and its confidence interval from raw survey data
Rajendra Kadel, University of South Florida, College of Public Health
Kevin Kip, University of South Florida

Paper SD-06

Effect size estimates are used to measure the magnitude of a treatment effect or the association between two or more variables.  When comparing two conditions, the standardized mean difference (Cohen’s d-statistic) is one of the most frequently used measures of effect size in the social and biomedical sciences.  In this case, the response (dependent) variable is continuous and the predictor (independent) variable is categorical.  Even though some web-based and Microsoft Excel-based software is available to calculate effect size from summary statistics, software to calculate both the effect size and its confidence interval is very limited.  To the author’s knowledge, there is no standard SAS® procedure that directly produces the d-statistic and its confidence interval.  In this paper, I describe how to calculate the d-statistic and its confidence interval directly from raw survey data following methods from Morris & DeShon (2002), Nakagawa & Cuthill (2007), and Fritz, Morris, & Richler (2012).  The SAS Output Delivery System (ODS) is used to collect the summary statistics needed to compute the d-statistic and its confidence interval.  Three types of survey designs are considered: (a) independent-groups post-test (completely randomized) design; (b) within-subject (one-group pretest–posttest) design; and (c) independent-groups pretest–posttest (IGPP) design.  Data from the IGPP design can also be analyzed using analysis of covariance (ANCOVA), with pretest scores as a covariate.  Finally, the author presents a SAS macro that implements the methodology for the d-statistic and its confidence interval for all three designs mentioned above.  A very basic understanding of SAS is sufficient to use this macro; no skill in SAS macro programming is needed.  When the macro is invoked, a table with summary statistics, the effect size, and its confidence interval is produced, and results can be saved in either Microsoft Excel or HTML format.
This macro has been tested on Windows SAS 9.2.
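For the independent-groups post-test case, the underlying arithmetic can be sketched as follows.  This is a Python illustration with invented numbers, not the macro itself, and the interval uses a common large-sample normal approximation rather than the noncentral t.

```python
import math

def cohens_d_ci(m1, s1, n1, m2, s2, n2, z=1.96):
    """Cohen's d for two independent groups, with a large-sample
    (normal-approximation) 95% confidence interval."""
    # Pooled standard deviation across the two groups.
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    # Approximate standard error of d (large-sample form).
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2 - 2)))
    return d, d - z * se, d + z * se

# Hypothetical summary statistics: means 105 vs 100, common SD 10, n=40 per arm.
d, lo, hi = cohens_d_ci(m1=105, s1=10, n1=40, m2=100, s2=10, n2=40)
print(round(d, 3), round(lo, 3), round(hi, 3))   # 0.5 0.055 0.945
```

The within-subject and IGPP designs use the same pattern with design-specific standardizers, as detailed in the references cited above.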

Sample Size Determination for a Nonparametric Upper Tolerance Limit for any Order Statistic
Dennis Beal, SAIC
Paper SD-07

A nonparametric upper tolerance limit (UTL) bounds a given percentage of the population distribution with specified confidence.  The most common UTL is based on the largest order statistic (the maximum), where the number of samples required for a given confidence and coverage is easily derived for an infinitely large population.  However, for other order statistics such as the second largest, third largest, etc., the equations used to determine the number of samples to achieve a specified confidence and coverage become more complex as the order statistic decreases from the maximum.  This paper uses the theory of order statistics to derive the equations necessary for calculating the sample size for a one-sided nonparametric UTL using any order statistic.  SAS® code is shown that performs these calculations in a single macro.  Examples of calculations using the SAS code are shown for various order statistics, confidence levels, and coverages.  This paper is for intermediate SAS users of Base SAS who understand statistical intervals and SAS macros.
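The search such a macro performs can be sketched in a few lines (a Python illustration of the order-statistic result, not the paper's SAS macro): the confidence that the m-th largest observation bounds the stated coverage equals the probability that at least m of the n observations exceed the coverage quantile, a binomial tail.

```python
from math import comb

def utl_sample_size(coverage, confidence, m=1, n_max=100000):
    """Smallest n such that the m-th largest order statistic bounds
    `coverage` of the population with at least `confidence`."""
    for n in range(m, n_max + 1):
        # P(at least m of n observations exceed the coverage quantile).
        conf = 1.0 - sum(comb(n, i) * (1 - coverage)**i * coverage**(n - i)
                         for i in range(m))
        if conf >= confidence:
            return n
    raise ValueError("n_max too small")

print(utl_sample_size(0.95, 0.95, m=1))  # 59: the classic 95/95 UTL on the maximum
print(utl_sample_size(0.95, 0.95, m=2))  # 93: the second largest needs more samples
```

For m=1 this reduces to the familiar closed form n >= log(1-confidence)/log(coverage).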

Difference Estimation versus Mean per Unit Methods for Skewed Populations: A Simulation Study
John Chantis, DoD IG
Paper SD-08

In most financial statement audits, Difference Estimation and Mean per Unit (MPU) methods are often used to estimate the error or the value of the population.  In this research we compare the Mean per Unit and Difference Estimation methods and study the characteristics of the estimates.  For each method we focus our study on Simple Random Sample (SRS) and Stratified Sample designs.  Most financial data are strongly positively skewed, and our primary interest is in these types of data.  For each case, our simulations consisted of 1,000 replications.

For a fixed sample size, we estimated the total audited value of the population using the Difference Estimation and MPU methods.  We then calculated the coverage probability and precision for each of these cases.  The coverage probability is the probability that the true value is within the confidence interval; ideally, we want the coverage probability to be close to the stated (1 - alpha) level.  An estimator with poor precision (a wide confidence interval) is not useful for decision makers or for booking the adjustment in financial statements.  In addition to the coverage probability and precision, we also examine the convergence rate of the estimator to the true value in our simulation study.
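The two point estimators themselves are simple.  A minimal Python sketch with invented numbers (not from the study) shows how difference estimation anchors on the known book total while MPU projects only the audited values:

```python
def mpu_estimate(audit_sample, N):
    """Mean-per-unit: project the mean audited value to all N population units."""
    return N * sum(audit_sample) / len(audit_sample)

def difference_estimate(book_sample, audit_sample, N, book_total):
    """Difference estimation: the known book total plus the projected total error."""
    n = len(book_sample)
    mean_diff = sum(a - b for a, b in zip(audit_sample, book_sample)) / n
    return book_total + N * mean_diff

# Hypothetical population: N = 1,000 invoices, known book total of 500,000;
# a sample of 5 invoices is audited.
book  = [300, 600, 500, 450, 550]
audit = [310, 600, 490, 450, 560]
print(mpu_estimate(audit, N=1000))                                   # 482000.0
print(difference_estimate(book, audit, N=1000, book_total=500000))   # 502000.0
```

Because audit-minus-book differences usually vary far less than the audited values themselves, the difference estimator tends to be more precise, which is the behavior the simulations examine.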

Based on the simulation results, we concluded that the relative precision of the Difference Estimation method is superior to that of the MPU method in the SRS case.  As for coverage probability, the Difference Estimation method produces a very low coverage probability.  The same pattern holds for the stratified design.  When we compared the precision and the coverage probability, we noticed that both significantly improved in the stratified design.

The views expressed are attributable to the authors and do not necessarily reflect the views of the Department of Defense Office of Inspector General.

K-Nearest Neighbor Classification and Regression using SAS
Liang Xie, Travelers Insurance
Paper SD-09

K-Nearest Neighbor (KNN) classification and regression are two widely used analytic methods in predictive modeling and data mining fields.  They provide a way to model highly nonlinear decision boundaries, and to fulfill many other analytical tasks such as missing value imputation, local smoothing, etc.

In this paper, we discuss ways in SAS to conduct KNN classification and KNN regression.  Specifically, PROC DISCRIM is used to build multi-class KNN classification and PROC KRIGE2D is used for KNN regression tasks.  Technical details such as tuning parameter selection are discussed as well.  We also discuss tips and tricks in using these two procedures for KNN classification and regression, and examples are presented to demonstrate the full process flow in applying KNN classification and regression in real-world business projects.
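For readers new to the method, here is a minimal from-scratch sketch of both tasks (in Python for illustration; the paper itself works through PROC DISCRIM and PROC KRIGE2D), with toy data invented for the example:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    `train` is a list of (feature_tuple, label) pairs."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Predict a continuous target as the mean over the k nearest points."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    return sum(y for _, y in nearest) / k

train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((8, 8), "B"), ((8, 9), "B"), ((9, 8), "B")]
print(knn_predict(train, (2, 2)))   # A: the three nearest points are all class A

train_r = [((0,), 0.0), ((1,), 1.0), ((2,), 4.0), ((3,), 9.0)]
print(knn_regress(train_r, (1.2,), k=2))   # 2.5: mean of the targets at x=1 and x=2
```

The choice of k is the tuning parameter the paper discusses: small k yields flexible, noisy boundaries; large k smooths them out.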

Where Should I Dig? What to do Before Mining Your Data
Stephanie Thompson, Datamum
Paper SD-10

Data mining involves large amounts of data from many sources.  In order to successfully extract knowledge from data, you need to do a bit of work before running models.  This paper covers selecting your target and data preparation.  You want to make sure you find golden nuggets and not pyrite.  The work done up front will make sure your panning yields results and is not just a trip down an empty shaft.

The Keokuk County CAFO Study: A Complementary Analysis Using Classification Trees in SAS® Enterprise Miner™
Leonard Gordon, University of Kentucky
Brian Pavilonis, University of Iowa

Paper SD-11

There is an increase in larger, specialized operations in the livestock production sector in the United States (US).  The number of hogs raised in the US has been relatively constant, but the number of producers has declined, resulting in concentrated animal feeding operations (CAFOs), which have not been without consequences.  The amount of waste generated has had adverse effects on the environment, and studies have shown an association between CAFO exposure and adverse respiratory outcomes.  Classification trees, a non-parametric methodology, in SAS® Enterprise Miner™ are used for the analysis.   They are underused in the public health literature and have the ability to divide populations into meaningful subgroups, which allows the identification of vulnerable groups and enhances the provision of products and services.

Multiple Imputation for Ordinal Variables: A Comparison of SUDAAN’s PROC IMPUTE vs. PROC MI
Kimberly Ault, RTI International
Paper SD-12

Imputation techniques are typically used to allow standard analysis techniques to be performed while, if assumptions hold true, reducing nonresponse bias in parameter estimates.  The naïve use of conventional single imputation methods, such as regression imputation and hot-deck methods, has been shown to underestimate standard errors, which affects confidence intervals and statistical tests.  Treating imputed values as if they were true values in variance estimation, as is often done in practice, does not reflect the additional uncertainty due to imputing for missing data, so multiple imputation methods have been suggested to help measure the additional variance due to imputation.  SUDAAN's PROC IMPUTE and SAS's PROC MI both produce multiply imputed data, using different methods.  PROC IMPUTE performs imputation on missing data using weighted sequential hot-deck imputation, which takes the probabilities of selection into account by using the sampling weight to specify the expected number of times a respondent's answer is used to replace a missing item.  This procedure is a non-model-based method that can be used for all types of variables (binary, ordinal, nominal, and continuous) without imposing restrictions on the missing data patterns.  PROC IMPUTE can be used for both multiple imputation (several imputed versions of the same variable) and multivariate imputation (several variables imputed at the same time).  For ordinal variables with more than two categories, PROC MI performs logistic regression imputation for monotone missing data patterns.  However, it does not use sample weights when estimating the parameters of the regression function.  For data with monotone missing patterns, the variables with missing values can be imputed sequentially with variables constructed from their corresponding sets of preceding variables.
Multiple imputation will be performed on ordinal variables using both procedures, using a sequential approach in which prior imputed variables are used to impute subsequent variables in a monotone missing pattern.  Additionally, a multivariate imputation will be performed using PROC IMPUTE to demonstrate the ease of performing imputation without the monotone missing pattern.  A comparison between the point and variance estimates will be examined, and a summary of the advantages and disadvantages of the two procedures will be provided.
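The basic idea of weighted hot-deck imputation can be sketched as follows (a bare-bones Python illustration with invented data; PROC IMPUTE's weighted sequential hot deck additionally controls how often each donor record may be used):

```python
import random

def weighted_hot_deck(values, weights, seed=42):
    """Fill missing entries (None) with donor values drawn with
    probability proportional to the donors' sampling weights."""
    rng = random.Random(seed)
    donors = [(v, w) for v, w in zip(values, weights) if v is not None]
    donor_vals = [v for v, _ in donors]
    donor_wts = [w for _, w in donors]
    return [v if v is not None else rng.choices(donor_vals, weights=donor_wts)[0]
            for v in values]

# Hypothetical ordinal survey item (1-4) with two nonrespondents.
responses = [3, None, 1, 4, None, 2]
weights = [1.0, 2.5, 1.5, 1.0, 3.0, 2.0]
print(weighted_hot_deck(responses, weights))
```

Repeating the draw with different seeds yields the multiply imputed data sets whose between-imputation spread feeds the variance estimates compared above.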

Tips, Tricks, and Strategies for Mixed Modeling with SAS/STAT® Procedures
Kathleen Kiernan, The SAS Institute
Paper SD-13

Inherently, mixed modeling with SAS/STAT procedures, such as GLIMMIX, MIXED, and NLMIXED, is computationally intensive.  Therefore, considerable memory and CPU time can be required.  As a result, the default algorithms in these procedures might fail to converge for some data sets and models.  This paper provides recommendations for circumventing memory problems and reducing execution times for your mixed modeling analyses.  This paper also shows how the new HPMIXED procedure can be beneficial for certain situations, as with large sparse mixed models.  Lastly, the discussion focuses on the best way to interpret and address common notes, warnings, and error messages that can occur with mixed models.

Transporter Room

Linking Medical Records to Medics in Cyberspace
Sigurd Hermansen
Paper TR-01

In the brave new world of Web registries, the National Provider Identifier (NPI) supposedly links providers of medical services to a registry of medics.  When electronic health records (EHRs) include an accurate NPI, using it to look up a medic's personal data in the NPI registry works seamlessly.  Without an accurate NPI, searches become more complex and less reliable.  The downloadable NPI registry database turns out to be an exceedingly complex and difficult target for searches.  Fortunately, the SAS System provides all of the tools that we need to uncompress, restructure, index, and search the NPI registry.

SAS Server Pages, <?sas and <?sas=
Richard DeVenezia, High Impact Technologies
Paper TR-02

Have you ever wanted to plunk down some SAS output smack dab in the middle of a web page?  How about controlling page generation on the server side from within your HTML source code?  You can do both, and more, using a SAS Server Page (.ssp).  This concept aligns closely with other server page technologies such as ASP, PHP, and JSP.  The page processor, a Base SAS program, will let you transport your web development to a new creative plateau.  This paper will discuss the processor and demonstrate the <?sas and <?sas= tags in a rich user interface web application served up by the SAS Stored Process Web Application.

The ADDR-PEEK-POKE Capsule: Transporting Data Within Memory and Between Memory and the PDV
Paul Dorfman, Dorfman Consulting
Paper TR-03

“APP” is an unofficial collective abbreviation for the SAS® functions ADDR, PEEK, PEEKC, the CALL POKE routine, and their so-called “LONG” 64-bit counterparts - the SAS tools designed to directly read from, and write to, the physical memory in the DATA step and the SQL Procedure.

APP functions have long been a SAS user’s dark horse.  Firstly, the examples of APP usage in the SAS documentation boil down to a few tidbits in a technical report, all intended for mainframe system programming tasks, with nary a hint of how the functions could be used for data management SAS programming.  Secondly, the note about the CALL POKE routine in the SAS documentation is so intimidating in tone that many a potentially receptive reader may have decided to avoid the allegedly precarious route altogether.

However, nothing can stand in the way of a curious SAS programmer daring to take a closer look; and it turns out that APP functions are very simple and useful tools!   They can be used to explore how things “really work”, make code more concise, implement “en masse” (group) data moves, and, oftentimes, significantly improve execution efficiency.
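For readers unfamiliar with the address/read/write pattern, a loose analogue of ADDR, PEEKC, and CALL POKE can be shown in Python's ctypes module.  This is illustration only, not SAS code, and not how the paper's examples are written:

```python
import ctypes

buf = ctypes.create_string_buffer(b"Hello, SAS!")   # a writable memory region
addr = ctypes.addressof(buf)        # analogue of ADDR: the buffer's address
peeked = ctypes.string_at(addr, 5)  # analogue of PEEKC: read 5 bytes from an address
ctypes.memmove(addr, b"Howdy", 5)   # analogue of CALL POKE: overwrite bytes in place

print(peeked)      # b'Hello'
print(buf.value)   # b'Howdy, SAS!'
```

The SAS functions do the same kind of thing inside the DATA step: obtain a variable's address, then read or write raw bytes at that address in bulk.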

The authors and many other SAS experts (notably Peter Crawford, Koen Vyverman, Richard DeVenezia, Toby Dunn, and the fellow masked by his Puddin’ Man sobriquet) have been poking around the SAS APP realm on SAS-L and in their own practices since 1998, occasionally letting the SAS community at large peek at their findings.  This opus is an effort to describe the results in a systematic manner.  Welcome to the APP world!  You are in for a few glorious surprises.