SESUG 2023 Conference Proceedings

Track
Development and Support
Industry Applications
Leadership, Careers and Planning
Learning SAS I
Learning SAS II
Showcasing SAS
Statistics, Analytics and Reporting



Development and Support

Paper Authors Title Key Takeaways
Paper 103 Ronald Fehd List Processing using SQL Select Into to Replace Call Symputx Creating Indexed Arrays of Macro Variables arrays of macro variables are constructed with call symput, which allocates macro variables var1, ... varN
      macro array can then be read with macro loop: %do i = 1 %to &N; %let value = &&var&i;
      list processing with sql eliminates allocation of macro variables and macro loop by using dictionary.columns
Paper 111 David Ward Speeding up Your SAS Code Using Parallel Processing Parallel processing is an important technique and available in the SAS system
      Parallel processing can massively speed up various kinds of operations in SAS
Paper 133 Kirk Paul Lafler and Clark Roberts Modernizing Legacy SAS® Applications and Program Code Modernizing legacy applications and program code involves an incremental and structured approach. It consists of identifying the target application and program code; selecting potential solutions to use; and finally, implementing structured and scalable solutions to replace varied coding styles and conventions made over its functional life. So, how does an organization know when an application needs modernizing? Green shares five signs to answer this question. 1. Operation and maintenance costs are high. 2. It's clunky or uses outdated technologies. 3. Your business processes have changed. 4. There's no tight integration with future applications. 5. It's not mobile-ready.
      How should an organization proceed with the modernization of a legacy application and program code project? The best place to start is to get all stakeholders on board and in agreement with the objectives and changes to be made. Next, obtain the necessary funding for performing the project work. Modernizing a legacy application raises concern among everyone involved. All too often, stakeholders develop an "If it's not broke then leave it alone!" attitude. To help alleviate the issues associated with modernizing legacy applications and program code, we recommend a five-step modernization approach. 1. Identify mission-critical applications and program code that are indispensable to the organization. 2. Review and understand the code associated with the user interface, the data sources being accessed, the processing requirements, and finally the output and results. 3. Identify older technologies; hard-to-modify and inflexible code; and inefficient statements, functions, options and their settings, code constructs, algorithms, and programming techniques, and modernize them with newer and more efficient methods and techniques. 4. Test, train, and deploy the modernized application and program code to bring all stakeholders on board. 5. Maintain and support the modernized applications and program code to ensure their flexibility and adaptability to changing requirements, environments, and technologies.
      Discovering the number of occurrences of individual values in a data set is useful information, particularly when constructing data-driven approaches. SAS provides several ways to count and determine the number of occurrences of a value in a data set. 1. Discovering the Number of Occurrences of a Value in a DATA Step. 2. Discovering the Number of Occurrences of a Value with the PROC FREQ NLEVELS Option. 3. Discovering the Number of Occurrences of a Value with PROC SQL.
Paper 145 Amita Patil Data Management and Analysis of Adverse Events Data using SAS ARRAY, IF-THEN, DO LOOP, and MACRO Statements Total Adverse Event Data by Grades
      Total of Each AE Term
      Categorizing Adverse Events to Body Systems
Paper 150 Brian Varney Inventory your OS for Programming Information Your file system and OS contains valuable information for your programming.
Paper 155 Ronald Fehd A Batch Processing Companion, how to write Windows *.bat and *.cmd files for my-program.sas Windows batch files can allocate an environment variable: set job=my-program echo %job%
      Windows provides the system environment variable date: echo %date%
      SAS software startup-only options log and print can contain a date-stamp: sas %job% -log %job%-%date%.log -print %job%-%date%.lst
Paper 166 Troy Martin Hughes Sorting a Bajillion Variables: When SORTC and SORTN Subroutines Have Stopped Satisfying, User-Defined PROC FCMP Subroutines Can Leverage the Hash Object to Reorder Limitless Arrays The OF and IN operators do not work inside of PROC FCMP.
      SAS built-in functions called inside of PROC FCMP are arbitrarily limited to 800-element arrays.
      The hash object can be utilized inside of PROC FCMP as an alternative data structure to SAS arrays.
Paper 168 Troy Martin Hughes What's black and white and sheds all over? The Python Pandas DataFrame, the Open-Source Data Structure Supplanting the SAS® Data Set Python Pandas is the most widely utilized open-source data analytic package.
      The DataFrame is the Pandas equivalent of the SAS data set data structure.
      Python methods and functions typically enable comparable functionality (as SAS) with fewer lines of code.
Paper 182 Kirk Paul Lafler Benefits, Challenges, and Opportunities with Open-source Software (OSS) Integration Software is created using source code which tells a program or application how to function. For this paper, two distinct software types (or categories) will be illustrated: 1) Proprietary (or commercial) software and 2) Open-source software. A major decision confronting a software developer is whether the source code related to the software release will be made publicly available on Github for anyone to inspect, modify, enhance, and share – referred to as open-source, versus software where the developer maintains exclusive control over the source code preventing the public availability to inspect, modify, enhance, and share it – referred to as closed source or proprietary software.
      Open-source software (OSS) has become increasingly popular among enthusiasts, particularly in the IT industry. For example, a popular open-source alternative to Microsoft Office is LibreOffice. An open-source alternative to Microsoft Windows is the Linux operating system. Another popular open-source alternative to Google's or Microsoft's web browser software is the Mozilla Firefox web browser.
      Open-source software integration promotes free access to inspect, modify, enhance, and share source code. The redistribution of software is not only permitted but encouraged to sustain innovation. According to a recent study by Gartner, open-source tools provide flexibility and cost-effectiveness for data integration tasks and projects such as connectivity, data routing, and transformation.
Paper 190 David Horvath Zen and the Art of Problem Solving Zen Pirsig Problem Solving
Paper 222 Randy Betancourt Effective APIs for SAS Language Applications Implementing REST APIs for SAS programs makes the programs more broadly consumable.
Paper 238 Bobbie Frye Using SAS® to Prepare Postsecondary Data Partnership (PDP) Data Submission Files The SAS® code can be utilized by beginners or intermediate programmers to prepare files for the PDP
      The PDP equips institutions with accessible reports and visualizations and SAS® software provides a flexible roadmap to successful data submissions.



Industry Applications

Paper Authors Title Key Takeaways
Paper 108 Linping Li and Shunbing Zhao How to Prepare Precise and Intact Software Program for Submission How to use MPRINT/MFILE to incorporate the pre-processed code with the standard macro program code to create a completed macro-free SAS program.
      Ensure the macro-free SAS program reproduces the same results as the original program.
Paper 110 Keli Sorrentino and Julie Plano Geocoding in SAS®: The Basics Geocoding in SAS can be simple
Paper 128 Stephen Sloan and Lindsey Puryear Advanced Project Management beyond Microsoft Project, Using PROC CPM, PROC GANTT, and Advanced Graphics SAS has PROCs for Project Management that allow multi-level prioritization when different departments need the same resources in a shared services environment.
      SAS can produce a project plan and the associated Gantt charts
      SAS has project management facilities that are not in Microsoft Project.
Paper 129 Stephen Sloan and Kevin Gillette A unique and innovative end-to-end demand planning and forecasting process using a collection of SAS products SAS has a variety of products that can be strung together to create an integrated product.
      SAS can support an integrated end-to-end demand planning system.
Paper 140 Lydia LI and Shunbing Zhao An Innovative Approach to Generate a CONSORT Diagram in Clinical Trials What a CONSORT diagram is
      How a CONSORT diagram can be used with clinical trial data
      How code is developed to create a CONSORT diagram
Paper 157 Dishant Banga Response Analysis for marketing campaigns using SAS How to measure the performance of a marketing campaign using SAS tools
Paper 160 Imelda Go and Abbas Tavakoli Generating Mock Data in SAS® You can control the probability of simulated values for discrete univariate and multivariate distributions.
Paper 161 Yeats Ye, Jessie Parker, Cindy Zhang and Cordell Golden Ensuring Accurate Data Linkages with Metadata Tables Metadata tables are a critical tool for data quality assurance when maintaining complex databases.
Paper 186 Peter Styliadis The Best of Both Worlds: SAS and Open Source Software Understand the massively parallel processing capabilities in SAS Viya
      Integrate Python and SAS Viya to process data throughout the analytics life cycle
      Create a dashboard using SAS Visual Analytics
Paper 199 Xiaoying Liu and Hennadii Balashov Forecasting Goals for Student Success Metrics in Accountability Performance Reporting using SAS Visual Analytics Participants will understand basic model techniques for time series forecasting.
      Participants will learn how to use the forecasting template within SAS Visual Analytics to build forecast models.
      Participants will understand the value of using analytics to assist institution-wide planning and creating a data driven decision making culture at the institution.
Paper 201 Bruce Nawrocki, Kathy Dail and Julie A. Walker A Central Data Storage and Reporting System for Statewide Local Health Dept Visits in SAS Even without a SAS server, you can create a production-like system in SAS EG to manage files
      SAS can work as an alternative to full-featured database systems
Paper 213 jalenderreddy musku and Srinivas Tiyyagura Blinded Studies and Challenges The audience will understand why blinded studies are conducted and the challenges involved in designing them.
Paper 226 Star Nze For Clinical Trials: A Faster and Smoother Approach to Create your SDTM and ADAM Define Specifications for Define.xml with SAS® SDTM and ADAM Define Specifications documents can be completed using SAS.
      SAS is a great tool to use for creating and analyzing tabular-based documents.
Paper 227 Lawrence Ogbeifun Two illustrations of the Quantity Theory of Money In testing Robert Lucas' assertion about the Quantity Theory of Money, how the data used for the empirical analysis are measured matters.
      The filtering techniques used for the data are essential.
Paper 230 Abdullah Khan Using SAS In Generating Multi-Jurisdictional Reports for U.S. Retail Industries' Employment Trends Gain hands-on experience using SAS to analyze industrial employment trends
      How to generate customized reports of industrial data at various jurisdictional levels, such as the county, state, census division, and census region levels.
      How to generate custom maps and multivariate scatterplots using SAS code.



Leadership, Careers and Planning

Paper Authors Title Key Takeaways
Paper 131 Stephen Sloan Developing and running an in-house SAS Users Group In-house SAS Users Groups can help an organization achieve its g
      SAS Institute will provide help when setting up or running an in-house SAS Users Group
      Many people will want to participate
Paper 134 Kirk Paul Lafler Exploring the Skills Needed by the Data Scientist / Analytics Professional Employment of Data Scientists is projected to grow 36 percent from 2021 to 2031, much faster than the average for all occupations. The most in-demand technical skills for data science careers are Python and SQL. The average data scientist salary in the U.S. is $125,242 / year.
      Aleksandra Yosifova's April 6th, 2023 report, The Data Scientist Job Outlook in 2023 – Research on 1,000 LinkedIn Job Postings, examined 1,000 LinkedIn job postings to answer questions about the Data Scientist job outlook in 2023. The report attempts to answer the following questions: Is data science still the sexiest job of the 21st century? Where do data scientists work? How much do they earn? What skills are required for a successful data science career? What are the next in-demand tools and techniques in the field?
      Data Science / Analytics skills are all-encompassing and require in-depth knowledge and experience in many technical and non-technical areas.
      SAS Software: SAS is a statistical software suite of products developed by SAS Institute Inc. for advanced analytics, multivariate analysis, business intelligence, data management, predictive analytics, and criminal investigation. SAS runs on all important platforms and supports object-oriented and structured programming along with other programming paradigms. It was developed by Dr. James Goodnight, Anthony Barr, John Sall and Jane T. Helwig. The SAS software suite has more than 200 components including Base SAS, SAS/STAT, SAS/GRAPH, SAS/OR, SAS/ETS, SAS/IML, SAS/AF, SAS/QC, SAS/INSIGHT, SAS/PH, Enterprise Miner, Enterprise Guide, SAS/EBI, and SAS Grid Manager.
      SQL: Structured Query Language (SQL) is a relational database language that is used in programming relational database management systems (RDBMS). It is specifically useful in handling structured data. SQL comprises many types of statements including a data query language (DQL), a data definition language (DDL), a data control language (DCL), and a data manipulation language (DML). There are several types of SQL implementations including SAS' PROC SQL, Microsoft's SQL-Server, Oracle, and IBM. SQL was originally developed by Edgar Frank "Ted" Codd, Donald D. Chamberlin, and Raymond F. Boyce in the early 1970s.
      Python: Python is an open-source programming language that is available under a free software license. It supports object-oriented and structured programming along with other programming paradigms. Developed by Guido van Rossum in the late 1980s, Python is designed to be an "easy to read language" with numerous third-party modules to interact with other languages; extensive support libraries such as web service tools; text processing; string operations; internet protocols; a powerful scripting language; an extensive user community; and many other features.
      R: R is a powerful open-source programming language and is used for statistical computing, graphics and data analysis. Available under a free software license, R runs on all important platforms and is used by statisticians, data miners and thousands of major corporations and institutions worldwide. Developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, their initial version of R was released in 1995 with a stable beta version in 2000. R boasts an extensive array of packages including data wrangling; data analysis; plotting; graphing; reporting; statistics; an extensive user community; and many other features.
      Excel: Microsoft Excel is widely used spreadsheet software operating under Windows, macOS, Android and iOS platforms to allow users to format, organize, manipulate, and calculate data in spreadsheets. Common Excel uses include the collection and storage of data, business analysis, data analysis, statistical analysis, accounting and budgeting, account management, project management, performance reporting, administrative and managerial management, operations management, and office administration. Users can arrange data in a spreadsheet using graphical tools, formulas, and pivot tables to work with large quantities of data to identify sums, averages, percentages, unique values, minimum and maximum values, ranges, outliers, and other needs.
      Cloud Computing Services: Cloud computing is the delivery of computing services (aka, SaaS) including software, databases, servers, storage, networking, analytics, and intelligence over the Internet to offer users improved and affordable computing speed, flexibility, and scale. From my own experience using a few cloud services and from reviewing an article on cloud service providers (Peterson, Richard. July 26, 2022), cloud services are offered by SAS Institute Inc., Amazon Web Services (AWS), Microsoft, IBM, Google, ServerSpace, Adobe, Kamatera, VMware, Rackspace, Red Hat, Salesforce, Oracle, SAP, Verizon, Linode, HostPapa, DigitalOcean, ScalaHosting, OVHcloud, LiquidWeb, Vultr, CloudSigma, LimeStone, Navisite, and Dropbox.
Paper 214 Jim Blum and Jonathan Duggins Methods and Tools for Publishing at SESUG and Beyond—Basic Techniques Writing SESUG papers in LaTeX
      Publishing technical documents
Paper 215 Jonathan Duggins and Jim Blum Methods and Tools for Publishing at SESUG and Beyond-Advanced Techniques Generating SAS results in LaTeX
      Automatically inserting SAS code into a LaTeX document
      Constructing reproducible workflows



Learning SAS I

Paper Authors Title Key Takeaways
Paper 008 Kim Wilson Debugging Tips for PROC HTTP Several great papers have been written about how to get started with PROC HTTP, which includes accessing Microsoft 365 applications, modifying various options for desired results, and more. As a SAS Technical Support Engineer, I often assist SAS customers who are not receiving the expected resource, or they are seeing a return code that is not a 200 OK. This paper describes common errors that you might encounter regarding certificates, authentication, and general errors, as well as overall debugging techniques and suggestions. This paper also helps you gather pertinent information that SAS Technical Support will need when helping to solve the problems occurring with or around PROC HTTP.
Paper 101 Richann Watson and Louise Hadden Going Command(o): Power(Shell)ing Through Your Workload Underlying SAS on different platforms are powerful "command" shells which include operating system commands. These commands can be harnessed from within SAS and without SAS and can be a valuable and efficient toolset.
Paper 102 Richann Watson and Lisa Mendez 10 Quick Tips for Getting Tipsy with SAS There are different types of shortcuts that can be used within SAS. The ':' acts like a wildcard to indicate that anything after the initial prefix is fair game. The ':' can be used both as shorthand for setting a group of data sets with the same prefix and as shorthand for selecting all the variables with the same prefix.
      You no longer need to save that favorite piece of code or commenting style in a separate file. By creating an abbreviation or keyboard macro, you can save it directly within your SAS application for quick access next time you need it.
Paper 107 Abbas Tavakoli, Ashley Howard, Kayla Everhart, Jessica Bradshaw and Robin Dail Using Macro in SAS ® to Read, Combine Datasets, and Analysis NICU Study Application of SAS in the clinical field
Paper 109 Julie Plano and Keli Sorrentino How to Merge SAS® Datasets with PII Merging safely should be of utmost importance. First and foremost, know your data.
      Cleaning and standardizing key variables before a merge or join can greatly improve match success.
      There are a multitude of options that can be used to complete the same task in SAS; use the option that best fits your programming style and data question.
Paper 119 Kirk Paul Lafler, Shaonan Wang, Nuoer Lu, Zheyuan Walter Yu and Daniel Qian Data Access Made Easy Using SAS® Studio SAS Studio's built-in point-and-click interface helps make working with SAS data sets, text-delimited data files, CSV data files, Excel data files, JSON data files, and program code easier with a powerful toolkit of predefined tasks that enable users to list table attributes, characterize data, describe missing data, access data sources, perform data analytics, and carry out several other tasks.
      SAS Studio offers users the ability to create new SAS libraries; establish library references (LIBREFs); upload SAS data sets, tab-delimited, CSV, and Excel data files to the cloud; import tab-delimited, CSV, and Excel data files to SAS data sets using tasks and utilities; and produce results using the Navigation pane.
      SAS Studio's point-and-click approach uses the Navigation pane as a relatively easy and flexible way to access SAS data sets and data files, automatically generate program code, and run (or execute) program code using SAS ODA software. Our paper guides you through the steps to access permanent and temporary SAS data sets and data files residing in the cloud; create new SAS data sets; and produce results including reports, tables, statistics, and charts using SAS Studio's point-and-click approach. We'll explore the data access steps for four different types of data files: SAS (SAS7BDAT) Data Sets; Tab-delimited Text Data Files; Comma-separated Values (CSV) Data Files; and Excel (XLSX) Data Files.
Paper 121 Kirk Paul Lafler, Zheyuan Walter Yu, Nuoer Lu, Daniel Qian, Kai Kang, Yanzhang Gavin Chen, Nicklas (Rebel) Yee and Yixuan (Jason) Xiang Regression Analysis Made Easy Using SAS® Studio The extract, transform, and load (ETL) process involves moving / migrating data from various sources into a data warehouse. The objective of the data extraction / retrieval phase is to access and extract the desired data from the various sources into a consistent and usable format enabling successful data transformation. Data transformation is the process of converting, cleaning, and structuring data from one format to a more usable format to enable processing and analysis tasks.
      Regression analysis is a statistical method used for investigating the relationship between one or more independent variables and a dependent variable. The dependent variable is the object we are trying to predict, whereas the independent variables are the factors that might have an impact on the dependent variable. The primary objective is to model and quantify the relationship between the dependent and independent variables and sort out which of these variables have a significant impact.
      Simple linear regression makes certain assumptions about the data, presented below: 1. Linearity: The relationship between the independent variable X and the dependent variable Y is linear. We can visually observe a straight line through the data points. 2. Homoscedasticity: The variance of the residuals is the same for any value of the independent variable X. This means that the spread of the residuals should be roughly the same throughout the range of the independent variable. 3. Independence: Observations in the data set are independent of each other, meaning there is no relationship among observations. 4. Normality: The residuals are approximately normally distributed.
Paper 125 Stephen Sloan Getting a Handle on All of Your SAS® 9.4 Usage Tracking down all of an organization's SAS usage is possible with sufficient planning.
      You need to have a general idea about which SAS products are being used.
      You need to determine your strategy and then follow it.
Paper 127 Stephen Sloan Reducing the space requirements of SAS® data sets without sacrificing any variables or observations You can achieve considerable reductions in the space used by your program.
      This can be done in an automated process.
      Reductions in space can be structured so that all observations and variables will still be available.
Paper 135 Brooke Ellen Delgoffe Coding for the Future: Smart Commenting SAS comments can be written in many styles
      Comment Styles can provide structure, meaning, and help future programmers (and your future self)
      Comments can be leveraged for parsing if they remain in standard form
Paper 136 Bruce Gilsen SAS ® Program Efficiency For Beginners Efficiency techniques were presented for a variety of tasks
      Avoiding divide by zero can provide a huge efficiency gain
Paper 137 Osmel Brito Bigott, Angye Rivero Rodriguez and Stephanie Inostroza Arratia IMPLEMENTING GIT FOR CHANGE CONTROL OF DELEGATED TABLES OWNED BY END USERS Use of Git for managing delegated tables
      Delegated tables as an unavoidable issue with data work
Paper 141 Brooke Ellen Delgoffe So Close Yet How Far Away: Closest City Macro Distance calculations have many applications
      Use ZIPCITYDISTANCE() or GEODIST() to calculate distance
      Use SASHELP.ZIPCODE with PROC SQL to dynamically calculate distances for a large list of locations.
Paper 151 Jayanth Iyengar Best Practices for Efficiency and Code Optimization in SAS programming CPU, Input/Output, Memory, and Storage Space are all Efficiency metrics.
      Efficiency can refer to processing of data, or number of lines of code or steps in a program.
      Finding the most efficient technique can depend on your data, and operating environment
Paper 154 William Smith Utilizing SAS to create HTML Codebooks How to integrate HTML code in SAS
Paper 158 Dane Korver A Gentle Introduction to Creating SAS Graphs Part 2 Creating SAS graphs
Paper 159 Mark McLean Creating and using code snippets with SAS Enterprise Guide Setting up code snippets in SAS EG is easy!
      Using code snippets in your programs will save you time and help reduce syntax errors.
      Using code snippets can help all programmers, especially newer ones, and even teams.
Paper 167 Troy Martin Hughes Make You Holla' Tikka Masala: Creating User-Defined Informats Using the PROC FORMAT OTHER Option To Call User-Defined FCMP Functions That Facilitate Data Ingestion Data Quality PROC FCMP can be used to build user-defined functions called via PROC FORMAT OTHER option.
      Both user-defined formats and user-defined informats can call user-defined functions dynamically.
      Limitations exist when using the PROC FORMAT OTHER option, such as the inability to initialize the _ERROR_ automatic variable.
Paper 188 David Horvath Pol-y-mor-phism in SAS Or Good Programmers are Lazy Polymorphism techniques in SAS
Paper 189 David Horvath Wildcarding in Where Clauses WHERE clause
Paper 193 Jinson Erinjeri MISSING Mysteries for SAS Beginners
Paper 195 Kirk Paul Lafler, Richann Watson, Joshua Horstman and Charu Shankar The Battle of the Titans: DATA Step versus PROC SQL Both the DATA step and PROC SQL provide us with the ability to apply logic scenarios in our programs so they can conditionally do or perform the operations we desire – "if one condition is true, then do X but if another condition is true, then do Y."
      Comparison operators are used in the DATA step and PROC SQL to compare one character or numeric value to another. Logical operators like AND, OR, and NOT are used to connect two or more expressions together.
      SAS users often need the ability to identify the first (beginning) and last (ending) observation as well as the between observation(s) in a by-group. The DATA step is the "go-to" approach used by many but PROC SQL can also be used to emulate this stalwart DATA step approach.
Paper 198 Emily Morin The Variety of Email Notifications Using SAS How to write a basic email
      How to attach a file
      Use macros and zip files
Paper 203 Paul Dorfman and Richard DeVenezia Hashes From the Ashes Hash tables are keyed data stored in memory and accessed in O(1) time for such operations as Insert, Search, Retrieve, Update, Delete. In SAS, they are available as (a) the canned hash object and (b) arrays paired with an explicitly coded hash algorithm.
      The hash object is like a Swiss Army knife: it is more universal and lets you do more things in one package. The arrays with custom hash code around them are like individual tools better suited to specific tasks and data situations, examples of which are discussed in the paper.
      The notion that array-hashing code is overly complex is an odd myth (demystified in the paper), as in reality it is easy to comprehend. However, to take advantage of array hashing, even that is unnecessary since it can be macro-encapsulated and used much in the same way the hash object is used. The corresponding macro package is presented in the paper as well.
Paper 212 Nat Wooding Reading Comma Separated Files containing CR LF embedded in quoted strings Recognize when a carriage return and/or line feed embedded in a file is breaking input lines
      Have a means of removing the unwanted CR and LF characters
Paper 233 Xiaofeng Liu How to use SAS POWER procedure to plan a study Power
      SAS PROC POWER
      sufficient sample size for confidence interval
Paper 234 Paul Newsom Getting to Know the SAS Data Access Engine: SASEFRED The SASEFRED Data Access Engine makes it easy to download Federal Reserve Economic Data (FRED) and load it into SAS
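
The Paper 212 takeaways listed above (recognizing and removing CR and LF characters embedded in quoted strings) can be illustrated with a short sketch. This is a Python illustration written for this index, not the SAS code from the paper. The function name `strip_embedded_newlines` and the sample text are hypothetical, and the sketch assumes fields are quoted with double quotes and that embedded quotes are escaped by doubling them.

```python
def strip_embedded_newlines(text: str, replacement: str = " ") -> str:
    """Replace CR/LF characters that occur inside double-quoted fields.

    Line terminators outside quotes are kept, so each record ends up
    on exactly one physical line.
    """
    out = []
    in_quotes = False
    prev = ""
    for ch in text:
        if ch == '"':
            # A doubled quote ("") toggles the state twice, so it is harmless.
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch in "\r\n":
            # Treat a CR LF pair as one break: emit the replacement only once.
            if not (ch == "\n" and prev == "\r"):
                out.append(replacement)
        else:
            out.append(ch)
        prev = ch
    return "".join(out)


# Hypothetical sample: the second record has a CR LF inside a quoted field.
sample = 'id,note\r\n1,"first line\r\nsecond line"\r\n2,"plain"\r\n'
cleaned = strip_embedded_newlines(sample)
print(cleaned.splitlines())
```

After cleaning, a line-oriented reader sees three physical lines, one per record, instead of four.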



Learning SAS II

Paper Authors Title Key Takeaways
Paper 120 Kirk Paul Lafler and Stephen B. Sloan Application of Fuzzy Matching Programming Techniques Using SAS® Software SAS software provides users with four techniques to help make fuzzy matching easier and more effective: the Soundex (phonetic matching) algorithm and function, and the SPEDIS, COMPLEV, and COMPGED functions (Lafler and Sloan, 2023, 2022).
      The COMPLEV function is best used when comparing simple strings where data sizes and/or the speed of comparison is important, such as when working with large datasets.
      The generalized edit distance computations performed by the COMPGED function require more processing time to complete due to their more exhaustive and thorough capabilities.
Paper 208 Jayanth Iyengar Understanding Administrative Healthcare Datasets using SAS programming tools. There are many different sources and types of Administrative healthcare data.
      Administrative healthcare data sets are very nuanced and complex.
      SAS has many effective tools and constructs to manipulate and report on healthcare data.
Paper 211 Jason Brinkley Introduction to Principal Components and Factor Analysis Principal Components
      Dimension Reduction
      Composite Scoring
Paper 218 Jim Blum and Jonathan Duggins Getting Started with PROC DS2
Paper 219 Jonathan Duggins and Jim Blum Working in SGPLOT: Understanding the General Logic of Attributes Adjusting attributes in SGPLOT
      Determining the logic of attribute modification keywords in SGPLOT
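
The Paper 120 takeaways above mention the COMPLEV function, which returns the Levenshtein edit distance between two strings. As a rough illustration of what that distance measures, here is a minimal Python sketch of the standard dynamic-programming computation. It is not SAS code and does not reproduce COMPLEV's optional arguments; the function name `levenshtein` and the sample strings are chosen for this example only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(levenshtein("SMITH", "SMYTHE"))  # → 2
```

A small distance relative to the string length suggests a likely match. Only two rows of the table are kept at a time, which is one reason this kind of comparison stays cheap on large data sets.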



Showcasing SAS

Paper Authors Title Key Takeaways
Paper 117 Ronald Fehd List Processing Macro Call-Macro SAS Component Language (SCL) provides functions to read dataset attributes: nobs and nvars
      %sysfunc is used within a macro to access these dataset attributes which are upper bounds of two loops: do row = 1 to n_obs; do varnum = 1 to nvars
      within the loops, assemble the macro call with values of each row and column: %subroutine(var1=value(r1,c1),...,varN=value(r1,cN)) ... %subroutine(var1=value(rM,c1),...,varN=value(rM,cN))
Paper 171 Louise Hadden With a View to Make Your Metadata Function(al): Exploring SAS® Sources of Information on SAS Formats SAS maintains a dictionary table for formats (and catalogs) that contains information on all formats in currently assigned libraries with format catalogs and all SAS provided formats. There are similar SAS views available for currently assigned libraries with format catalogs.
      SAS provides functions that return selected pieces of information about the SAS formats available in a session: VFORMAT, which returns the format associated with a named variable, and FMTINFO, which returns information about a named format.
      Additional information about SAS formats can be obtained via PROC FORMAT FMTLIB output, converting a SAS catalog to a SAS data set via PROC FORMAT CNTLOUT, and PROC CATALOG.
Paper 185 Alissa Wise How to Mentor the SAS Bee Reading
      Writing
      Math
Paper 196 Alex Mason Is Rushing in the Modern NFL a Viable Option? The NFL is constantly evolving and data analytics can be used to stay on the forefront of change.
Paper 200 Bruce Nawrocki, Scott Proescholdbell and Shana Geary ORION – A Non-Server-Based Interactive SAS Report Builder Build a customized report generator in a SAS Windows environment, even without a SAS Server
      Collect report requirements from users, and use SAS to build its own custom SAS code
Paper 225 Zhong Zheng Navigating First-Year STEM Student Persistence: Insights from Student Experience and Expectations Understanding Persistence Patterns: Gain insights into the persistence patterns of STEM students and the factors linked with retention.
      Learning Logistic Regression Analysis: Acquire knowledge on performing logistic regression analysis through the PROC LOGISTIC procedure in SAS.



Statistics, Analytics and Reporting

Paper Authors Title Key Takeaways
Paper 105 Yueming Wu and Steven Li Calculate Physical Length of String in RTF file with 'Times New Roman' by SAS Set Right Column Width for PROC REPORT with ODS RTF
      There are Some Rules on String's Physical Length Calculation
Paper 112 David Corliss Designing Against Bias: Identifying and Mitigating Bias in Machine Learning and AI Bias in machine learning and AI can result from a number of sources, including bias in selecting the training set and inclusion of biased predictors
      Bias can be quantified by the disparate impact of algorithm use on population subsets and is best expressed using odds ratios
      New open-source statistical packages such as Fairlearn facilitate quantifying and mitigating bias
Paper 114 David Corliss Unobserved Components Models: Applications in Post-COVID Analysis Unobserved Components models decompose time series data into level, slope, periodic, and irregular components
      Through the use of a binary dummy variable, SAS PROC UCM can estimate changes in baseline levels
      While the medical impacts have changed from pandemic to endemic, the non-medical effects of COVID continue to evolve
Paper 122 David Dickey Wald Tests in Mixed Models - a Warning In mixed models the well known "large sample test" description for Wald tests does NOT refer to the overall sample size being large but rather the number of random effect levels.
      If a PROC GLM type of random effect is available, it has far more power than the Wald test.
Paper 123 Stephen Sloan and Kevin Gillette Assigning agents to districts under multiple constraints using PROC CLP PROC CLP provides a satisfactory solution to assigning agents to districts under a set of constraints.
      SAS OR has a variety of products to help in this area.
Paper 132 Stephen Sloan and Kirk Paul Lafler A Quick Look at Fuzzy Matching Programming Techniques Using SAS® Software Data can be matched even if the items don't match exactly by using fuzzy matching.
      There are a number of different fuzzy matching techniques supported by SAS including the SOUNDEX, SPEDIS, COMPLEV, and COMPGED functions.
      As the programmer, you have control over how much inconsistency can be allowed.
Paper 138 Osmel Brito Bigott and Yenireth Gil SAS Viya 3.5: SAS Job Execution with Guest user SAS Viya
      Guest user use
Paper 139 Elaine Kowalewski, Ann Marie K. Weideman and Gary G. Koch SAS Macro for Randomization-Based Methods for Covariance and Stratified Adjustment of Win Ratios and Win Odds for Ordinal Outcomes SAS macro implements randomization-based methodology for covariance and stratified adjustment of the win ratio and win odds for ordinal outcomes
Paper 142 Bruce Lund Binning Predictors for Logistic Regression Binning is a step in preparing classification predictors for Logistic Models
      Monotonic Binning can be accomplished by %ORDINAL_BIN
      Problem with zero-cells is handled by %NOD_BIN
Paper 143 William Smith %modelselect: A macro for automatically selecting statistical models based on fit statistics In analysis of experimental data, it is useful to develop and test multiple competing hypotheses. Coding of these analyses and transcription of the resulting fit statistics for model selection can become cumbersome.
      In order to combat the tediousness of coding and computing each candidate model, transcribing the information criteria, and manually comparing criteria to select the best-fit model, a macro was developed to automate the process.
      It should be noted that this macro does not choose the best information criterion; that is still at the discretion of the researcher and must be specified in the macro. The macro is simply a tool to aid in coding for model comparison and selection.
Paper 147 Jayanth Iyengar The Everytown Research database: Using SAS analytic procedures to analyze mass shootings Charts and reports produced by SAS show that mass shootings are on the rise in the U.S.
      Using SAS Analytic Procedures, insights into mass shootings in the US can be gained.
      Through the use of SAS, relevant data can shape and guide public policy and programs
Paper 148 Kannan Deivasigamani and Douglas Lunsford Systemic Quantitative check to identify if a variable is a confounder in a dataset using SAS® Macro code A quantitative way to identify a confounder
      How SAS macros can be used for this purpose
      How a certain level of automation can be achieved to perform this function
Paper 174 Louise Hadden Looking for the Missing(ness) Piece PROC FREQ NLEVELS allows the SAS user to evaluate the number of levels for both missingness and presence, and produces a complete summary of all variables within a single procedural call.
      True missingness, i.e. whether a variable is missing or not, can be determined with a number of SAS procedures and SAS functions.
      Combining a specialized macro routine to determine true missingness and the PROC FREQ NLEVELS procedure provides a complete missingness summary on all, or selected, variables in a data set and allows for the calculation of cardinality ratios.
Paper 178 Joe DeMaio, Jonathan Bishop, LaToya Bond and Nate Jones An Efficacy Rating for March Madness Tournament Seeding Establish a metric to rate March Madness seeding of teams against results for men and women.
      Seeding men's teams is more difficult than seeding women's teams.
      Different economic realities in NBA vs. WNBA likely impact the accuracy of seeding teams vs. results.
Paper 184 Phil Moore, Marla Mamrick and Sara Reinhardt Performing Higher Education Enrollment Management Predictions Using SAS When building models in higher education, the most accurate model is seldom the most useful model.
      Researchers must ensure that the models built do not overfit the training dataset and, of equal importance, that they generalize to future cohorts.
      Researchers must do a better job of supporting higher education experts, which includes more than just building models.
Paper 187 Vindhya Hegde Empowering Decision-Making: Predictive Analytics Integration with SAS® and Python™ in a Higher Education Setup The findings presented in the paper demonstrate the value of leveraging the powerful tools of SAS and Python in higher education to anticipate and plan for the future effectively.
Paper 194 Livingstone Gadzanku Predicting employees' preference for remote or on-site jobs Employee preference
      Remote or on-site job
Paper 197 Jeetender Chauhan, Sarad Nepal and Madhusudhan Ginnaram Generate and Customize eDISH Plot to Identify Hy's Law Cases in Simple Steps SAS macro, eDISH plot
Paper 204 Yaa Awuah and Mostafa Zahed Longitudinal Analysis of Marital Therapy's Impact on Intimacy in Heterosexual Couples Longitudinal Data, Mixed-Models, Conditional Models, Time Plot, Spaghetti Plot, Variogram, Marital Therapy, Intimacy, Akaike Information Criterion (AIC), Residual Diagnostics, Random Intercept Model, Random Time Slope Model.
Paper 205 Janet Kireta and Mostafa Zahed Unsupervised Dimension Reduction Techniques for Lung Cancer Diagnosis Based on Radiomics Understand what radiomics is and how it influences the development of safe and effective methods to prevent, detect, diagnose, treat, and, ultimately, cure the collection of diseases we call cancer.
      Understand the dimension reduction techniques that can be used in radiomics.
      Understand the use of SAS software in implementing dimension-reduction techniques in radiomics.
Paper 206 Cheng Lee Sampling by Reversing The Landmarking Process landmarking
      random sampling
      backward landmarking
Paper 207 Brian Varney CAMIS: Comparing Analysis Method Implementations in Software This is a key time for multi-lingual collaboration in software.
Paper 210 Austin Brown Translating Common Data Visualizations from PROC SGPLOT to ggplot2 Understand how common visualizations generated by PROC SGPLOT can be generated by ggplot2 in R software
Paper 220 Natalie A. Jordan, A. Nicole Ferguson, Jessica G. Woo, Kevin Gittner and Reese H. Clark Visualizing Chronic Lung Disease Incidence in SAS; An Educational Journey to Data Visualization The beginner to intermediate SAS user as well as the novice SAS user with a strong interest in neonatal and maternal health should walk away with a firm understanding of how relationships between more than two medically relevant categorical factors may be examined descriptively using both SAS and Excel.
Paper 235 Shanzhen Gao Teaching Business Statistics with JMP Software Statistics topics covered in a business statistics course taught with JMP
      Active learning
      How to engage students in active learning