Day of Python and R
SESUG 2022 is bring your own device for any tutorials and/or hands-on workshops in which you want to follow along.
October 22, 2022
Programmers generally need more than one tool in their toolkit. As a programmer in SAS® or another language, have you ever wondered how Python and/or R could add value to your arsenal? SESUG 2022 is presenting a day of Python and R to let you discover how these free, open-source programming languages can help you find solutions to today’s problems.
Python - The Power Tool - 8:30 AM-12:00 PM
Instructor: Troy Martin Hughes, Datmesis Analytics, LLC
Python is a general-purpose, object-oriented programming language that is especially useful for big data, artificial intelligence, and deep learning algorithms. This session will get you started writing code with variables and operators under Python's core syntax rules. Curious about exploring the most popular freely downloadable, open-source data analytic software? This course will introduce the Python “Pandas” library, the predominant data manipulation module within the Python language, and its “DataFrame” data structure. Pandas is an open-source library and a core component of the popular Anaconda software, the most widely utilized Python distribution supporting data analytics and data science.
All examples will be demonstrated in the latest software releases, including Python 3.10.0 and Pandas 1.4.1. No previous Python experience is required to attend!
Attendees will learn how to manipulate the Pandas DataFrame, including:
- data manipulation with basic mathematical operators
- sorting by columns and by values
- evaluating categorical frequency
- performing various mathematical transformations via functions
- performing various character transformations via functions
- creating a user-defined function
- masking data to clean, categorize, or bin them
- data validation through lookup tables (using hash objects)
- importing a SAS data set into a Pandas DataFrame
- exporting a Pandas DataFrame to a SAS data set
Books by the instructor:
- SAS® Data-Driven Development: From Abstract Design to Dynamic Functionality, Second Edition (2022)
- SAS® Data Analytic Development: Dimensions of Software Quality (2016)
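As a rough preview of the DataFrame topics listed for the Python session, the sketch below walks through several of them on a tiny made-up table. It is not taken from the course materials: the column names, the sample values, the `letter_grade` function, and the `valid_states` lookup are all hypothetical, and a plain Python dictionary stands in for the hash-object style lookup table.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "name": ["ann", "bob", "cara", "dev"],
    "state": ["NC", "GA", "NC", "XX"],
    "score": [82.0, 67.5, 91.0, 74.0],
})

# Basic mathematical operators apply to a whole column at once
df["score_pct"] = df["score"] / 100

# Sort rows by the values in one column, highest score first
df_sorted = df.sort_values("score", ascending=False)

# Categorical frequency: how many rows fall in each state
state_counts = df["state"].value_counts()

# Character transformation via a built-in string function
df["name"] = df["name"].str.title()


# A user-defined function (hypothetical grading rule), applied row by row
def letter_grade(score):
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    return "F"


df["grade"] = df["score"].apply(letter_grade)

# Masking: a boolean condition selects the rows to recategorize
df["passed"] = "no"
df.loc[df["score"] >= 70, "passed"] = "yes"

# Validation through a lookup table: the dictionary keys are the valid codes
valid_states = {"NC": "North Carolina", "GA": "Georgia"}
df["state_name"] = df["state"].map(valid_states)
df["state_valid"] = df["state"].isin(list(valid_states))

print(df)
print(state_counts)
```

Importing a SAS data set is left out because it needs a file on disk; pandas provides `read_sas` for reading SAS files.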
R - The Multi-Tool - 1:00 PM-4:30 PM
Instructor: Edgar Ruiz, RStudio
Description: A brief introduction and examples of R with SAS are followed by a three-hour general introduction to R as a programming language. R is an open source software environment that’s optimized for statistical calculation and data visualization. This session will introduce the R working environment, arithmetic and logical operators, important functions and R scripting while introducing the packages available through R for visualization, reporting, data manipulation, and statistical analysis.
Instructor Bio: Edgar Ruiz is a Software Engineer at RStudio PBC. He loves helping data analysts succeed in their journey. He contributes by developing tools, documenting solutions, giving talks, and teaching classes. He had the privilege of co-authoring the book "Mastering Spark with R" and the R tool (package) that translates R code into SQL queries.