
Introduction
This lecture outlines the course and lays out the notions it covers. It should help you decide which parts are most relevant to you. In the Course Materials, the file Codes_for_Building_Course_Source_datasets contains all the code for building the data sets needed throughout this course. Run this code to create them. Create a library and reference it in each DATA STEP/DATALINES block so that the created data sets are saved in a permanent location.
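As a minimal sketch of that last instruction, the code below assigns a library and writes a small data set into it. The libref `course`, the folder path, and the data set `scores` are hypothetical placeholders, not names taken from the Course Materials; point the path at a folder that already exists on your machine.

```sas
/* Assign a libref to a permanent folder (hypothetical path; the folder must exist). */
libname course "C:\sas_course\data";

/* The two-level name course.scores saves the data set in that folder */
/* instead of the temporary WORK library.                             */
data course.scores;
    input id name $ score;
    datalines;
1 Ann 85
2 Bob 72
3 Cy 90
;
run;
```

A data set written with a one-level name (for example `data scores;`) goes to WORK and is deleted when the SAS session ends, which is why the libref is repeated in each step.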
What you need to start coding
Overview of all the elements that make up the SAS programming language:
- DATA STEP
- PROC STEP
- PROC SQL
- MACROS
- GLOBAL statements and OPTIONS
- IML and SCL (not covered in this course)
Overview of the main tasks accomplished with DATA STEP:
Data manipulation including:
- Reading, filtering, merging, concatenating, exporting
- Data transformation: variables creation, recoding...
Overview of the use of PROC STEPS (Procedures):
- Reporting
- Analyses
- Graphics
- Data transformation
- Utility procedures
Overview of PROC SQL use:
- Manipulating data
- Producing reports
- Designed for relational database systems
Overview of the use of MACROS:
- Avoid repetition in coding
- Make code more generic
- Expand the capabilities of SAS elements
- Make logical decisions based on the content of the data
Simplest ways for creating a DATA STEP:
- the output statement for creating output DATA SET
- the RUN statement for instructing SAS to complete processing for each record
- Saving output data sets in a permanent location
- The use of libraries associated to permanent folders
Ways for reading input data sets in:
- Using: SET, MERGE, INFILE & INPUT, or INPUT & DATALINES
Navigating between DATA STEP and PROC STEPS for keeping track of data manipulation
- Describe variables: LABEL statement
- Describe values: FORMAT statement
How do we select data columns of interest:
- KEEP= and DROP= options
- KEEP and DROP statements
How do we filter data:
- Filter on the way in: WHERE STATEMENT
- Filter on the way out: IF CONDITION
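The two filtering styles above can be sketched as follows. This is an illustrative example, assuming the `sashelp.class` sample data set that ships with SAS; the output data set names and the derived variable `height_cm` are made up for the sketch.

```sas
/* WHERE filters rows on the way in: rows that fail the condition */
/* are never loaded into the step.                                */
data teens_where;
    set sashelp.class;
    where age >= 13;
run;

/* A subsetting IF reads every row and keeps only those that pass, */
/* so it can test variables created inside the step.               */
data tall_if;
    set sashelp.class;
    height_cm = height * 2.54;   /* height is stored in inches */
    if height_cm > 160;
run;
```

A WHERE statement could not be used in the second step, because `height_cm` does not exist in the input data set.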
Combining data by a common (set of) column(s): Merging
- The need to sort, index, or use NOTSORTED for merging
- Explore ONE TO ONE, ONE TO MANY, and MANY TO MANY merges
- Explore INNER JOIN, LEFT JOIN, FULL OUTER JOIN
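The join types listed above can be sketched with a match-merge and the IN= data set option. The data sets `patients` and `visits` and their values are hypothetical, invented only for this illustration.

```sas
data patients;
    input id name $;
    datalines;
1 Ann
2 Bob
3 Cy
;
run;

data visits;
    input id visit_date :date9.;
    format visit_date date9.;
    datalines;
1 05JAN2024
1 19FEB2024
3 02MAR2024
4 11APR2024
;
run;

/* Both inputs must be sorted (or indexed) by the BY variable. */
proc sort data=patients; by id; run;
proc sort data=visits;   by id; run;

/* IN= flags tell which input contributed to the current row. */
data inner_join left_join full_join;
    merge patients(in=in_p) visits(in=in_v);
    by id;
    if in_p and in_v then output inner_join;  /* ids found in both     */
    if in_p          then output left_join;   /* every patient         */
    output full_join;                         /* ids found in either   */
run;
```

Because patient 1 has two visits, this is also a one-to-many merge: the patient's name is carried onto both visit rows.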
Combining data sets of similar columns by stacking: CONCATENATE
- Using one SET for all declared data sets
- Using one SET per declared data set
Use of variable assignment statement for creating and modifying variables
- Data types in SAS
- Setting length for character variables
Summarizing data:
- Use of addition and subtraction
- how missing values are treated
Using functions:
- Dates functions
Summarizing data:
- Use of summary functions as alternative to addition and subtraction
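The difference in how missing values are treated can be sketched as follows; the variable names `q1` to `q3` and the values are hypothetical.

```sas
data totals;
    input q1 q2 q3;
    total_plus = q1 + q2 + q3;     /* missing if any operand is missing */
    total_sum  = sum(q1, q2, q3);  /* ignores missing arguments         */
    total_of   = sum(of q1-q3);    /* same result, using a variable list */
    datalines;
10 20 30
10 . 30
;
run;

proc print data=totals;
run;
```

On the second row, `total_plus` is missing while `total_sum` and `total_of` are 40, and the log carries a note that missing values were generated by the arithmetic.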
Use of functions: SUBSTR and SCAN
Use of functions: Concatenation functions
Use of functions: Remove space
Use of functions: conversion functions: INPUT and PUT
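A minimal sketch of the two conversion functions, with made-up variable names and values: INPUT reads a character value with an informat and returns a number, while PUT writes a value with a format and returns a character string.

```sas
data converted;
    char_num  = "1234";
    char_date = "15MAR2024";
    num_id    = 42;

    num_value  = input(char_num, 8.);         /* character -> numeric 1234   */
    date_value = input(char_date, date9.);    /* character -> SAS date value */
    id_text    = put(num_id, z5.);            /* numeric -> "00042"          */
    date_text  = put(date_value, yymmdd10.);  /* date -> "2024-03-15"        */

    format date_value date9.;
run;
```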
Use of conditional statements:
- IF THEN
- IF THEN DO
Learn the SAS data step and proc step, counting statements by semicolons. Create and transform datasets using set, merge, input, and assignment statements, with filters, formats, and library management.
PROC PRINT, PROC FREQ, and PROC MEANS for:
- Reviewing and cleaning up data
- Producing descriptive statistics and reports
Outputting STATISTICS to data sets
- Chi square for assessing the association between categorical variables
Calculating crude associations:
- Odds ratio (OR)
- Relative risk (RR)
PROC TABULATE is a combination of PROC FREQ and PROC MEANS
- A PURE PROC FREQ
- A PURE PROC MEANS
- Or a combination of PROC FREQ and PROC MEANS
Utility PROCEDURE: PROC CONTENTS
- learn about variables
- organize variables
Using PROC FORMAT to create user defined formats:
- using VALUE STATEMENT
- using the CNTLIN= option
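Both approaches can be sketched as follows. The format names `agegrp` and `sexf` and the control data set `sex_fmt` are hypothetical; the example assumes the `sashelp.class` sample data set that ships with SAS.

```sas
/* VALUE statement: define the ranges and labels directly. */
proc format;
    value agegrp
        low -< 13 = "Child"
        13 - high = "Teen";
run;

/* CNTLIN=: build a format from a control data set holding */
/* FMTNAME, START, LABEL and TYPE (C = character format).   */
data sex_fmt;
    length fmtname $8 start $1 label $6 type $1;
    fmtname = "sexf";
    type    = "C";
    start = "F"; label = "Female"; output;
    start = "M"; label = "Male";   output;
run;

proc format cntlin=sex_fmt;
run;

/* Apply both user-defined formats in a report. */
proc freq data=sashelp.class;
    tables sex age;
    format sex $sexf. age agegrp.;
run;
```

The CNTLIN= route is useful when the value-to-label mapping already lives in a data set, for example a lookup table, and typing it out in a VALUE statement would be error-prone.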
Manage SAS formats by storing them in the work library or permanently in a named library using proc format, libname, and proc catalog.
Overview of REGRESSION
- PROC REG
- PROC GLM
Learn how to handle categorical variables in regression by creating dummy variables with a reference category, then interpret coefficients for education levels and compare groups.
Transposing data using PROC TRANSPOSE
Explore proc sql's basic select form, including from clause, labels and format options, comma-separated columns or star, and saving results with create table.
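A minimal sketch of that basic SELECT form, assuming the `sashelp.class` sample data set that ships with SAS; the output table name and the label text are made up for the illustration.

```sas
proc sql;
    /* CREATE TABLE saves the query result instead of printing it. */
    create table work.class_subset as
    select name,
           age,
           height label="Height (in)" format=5.1
    from sashelp.class;

    /* The star selects every column. */
    select *
    from work.class_subset;
quit;
```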
Explore how SAS macros encapsulate reusable code with %macro...%mend, using global and local macro variables, and define positional and keyword parameters to enable conditional execution across datasets.
Learn to concatenate macro variables with static characters to form names by year, using ampersands to denote macro variables and dots when needed.
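The role of the ampersand and the dot can be sketched as follows; the macro variables `year` and `lib` and the data set names are hypothetical.

```sas
%let year = 2024;
%let lib  = work;

/* No dot needed: the name ends where the macro variable reference ends. */
data sales_&year;                /* resolves to sales_2024 */
    amount = 100;
run;

/* A dot marks the end of the macro variable name so that text can follow. */
data sales_&year._final;         /* resolves to sales_2024_final */
    set sales_&year;
run;

/* Two dots: the first ends &lib, the second is the libref.dataset separator. */
proc print data=&lib..sales_&year._final;   /* work.sales_2024_final */
run;
```

Without the dot, `&year_final` would be read as a reference to a macro variable named `year_final`, which does not exist here.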
Explore SAS metadata and data step fundamentals, focusing on input, output, and the body, library handling, and using contents to inspect variables and formats.
Learn to craft the SAS data step output statement and input statement, manage libraries and datasets, including temporary and permanent ones, and apply conditional routing, duplication, and external file output.
Explore SAS arrays to transpose data, turning columns into rows (and vice versa) while preserving values, with examples on temperature conversion to Celsius and student test data.
Explore using arrays to transpose temperature data across multiple rows and columns. Build input and output steps with temporary arrays to transpose by month, capturing columns and days.
In designing this course, we tried to take into account the current state of affairs (in the INTERNET AGE) when it comes to learning a programming language, where the sheer amount of resources at our disposal can be both a curse and a blessing. This is no less true for SAS programming. Anyone setting out to learn SAS or upgrade their SAS skills has more than enough resources to rely on: (1) SAS's particularly extensive documentation on all of its products; (2) the active community of programmers and users and their contributions; (3) the no less important array of courses (paid or not) on offer; and (4) even the ability to learn from the people next to us. We believe that none of these options is either the definitive solution or worthless.
However, for learners, especially those with no prior knowledge of programming, making the right choices can be stressful, costly, or both.
So we designed this course to let you see the forest from a distance, or from above, so that you build contextual awareness before you step into it. Once you have a sense of what your achievement will look like when you get there, we are convinced that (1) the journey will be less stressful, and (2) any investment you have already made, or will make, on that front will add value to this one. If you take up our offer and give it a real try, many of the resources you found confusing before will likely start to seem valuable. To balance the need to give you a complete picture of SAS programming against the risk of making your journey endless, we put serious emphasis on the WHY: why SAS does something the way it does. We go to the essential elements of each SAS component, so that you learn more than just the mechanics of making it work when you need it.
For example, in preparing the lesson about MACROS, we made sure to answer the following questions, among other things:
- Why do we need macros?
- How do we create macros (the different methods)?
- Why do we need different methods to create macros? How do the different methods compare?
- And when it comes to these tools, we tell you upfront that there are limitations (we are not sure why we feel the need to repeat this, since everybody knows there is no such thing as a perfect, bug-free tool in PROGRAMMING). So expect errors and work to fix them; every error you manage to fix yourself is one more step on your stairway to professional experience. Don't be afraid to read your log carefully, and then your code again, before you throw in the towel; the most elementary errors (simple syntax errors, typos...) are the ones that will stick with you the longest in your journey. We highlighted some of those limitations (e.g., macro quoting).
At a high level, here is what we cover:
Part I: SAS Programming components
Lesson 1: DATA STEP
Lesson 2: PROC STEPS
Lesson 3: PROC SQL
Lesson 4: MACROS
Part II: SAS Programming Structure
Lesson 1: STEPS and STATEMENTS
Lesson 2: More on DATA STEP
Part III: Reading data into SAS
Lesson 1: Reading from various sources
Lesson 2: Reading using INPUT STYLES (raw data)
Part IV: SAS Program Data Vector (PDV)
Lesson: PDV
As you can see, we cover most SAS components/programming elements. For each lesson, we start with a few basic notions and expand in level of difficulty throughout the lesson.
Whether you intend to prepare for a SAS certification (without going through the usual painful memorization exercise) or to list this language on your resume, we are convinced that this course will help you achieve your goals. Many practice exercises are included; they are aimed mostly at people taking this course with the Base SAS certification in mind.