Acknowledgments

As hard as we have worked on this book, we could never have done it alone. Many people at SAS helped make this book what it is. To our many hard-working reviewers, Carole Beam, Janice Bloom, Brent Cohen, Vicki Leary, Elizabeth Maldonado, Allison McMahill, Sandy McNeill, Randy Poindexter, Morris Vaughan, and Deanna Warner, we say, “Thanks for hanging in there with us.” To our copyeditor, Mary Beth Steinbach, and our designer, Kris Rinne, “Thanks for making us look good.” To our production specialist, Karen Perkins, “Thanks for rectifying all those wayward footnotes, mysterious font errors, and uncooperative images.” And last but not least we would like to thank our editor, Stephenie Joyner, who is faster than a speeding deadline, stronger than Microsoft Word, and able to leap tall drafts in a single bound.

Outside the walls of SAS many other people also contributed to this book. In particular we would like to thank our readers. We love meeting you at conferences even if we seem a little shy. Without you, of course, there would be no reason to keep writing. To her co-workers, Tim Allis, Dana Drennan, Paul Grant, and Steve Nichols, Lora would like to say, “Thanks for being so flexible when I needed to take time off to write.” Most of all we would like to thank our families for their understanding and support.
The Little SAS Book

Introducing SAS Software

SAS software is used by people all over the world: in 118 countries, at over 40,000 sites, by more than 3.5 million users. SAS (pronounced sass) is both a company and software. When people say SAS, they sometimes mean the software running on their computers and sometimes mean the company.

People often ask what SAS stands for. Originally the letters S-A-S stood for Statistical Analysis System (not to be confused with Scandinavian Airlines System, San Antonio Shoemakers, or the Society for Applied Spectroscopy). SAS products have become so diverse that a few years back SAS officially dropped the name Statistical Analysis System, now outgrown, and became simply SAS.

SAS products

The roots of SAS software reach back to the 1970s when it started out as a software package for statistical analysis, but SAS didn’t stop there. By the mid-1980s SAS had already branched out into graphics, online data entry, and compilers for the C programming language. In the 1990s the SAS family tree grew to include tools for visualizing data, administering data warehouses, and building interfaces to the World Wide Web. In the new century, SAS has continued to grow with products for cleansing messy data and analyzing genetic data. Appendix C, “An Overview of SAS Products,” lists the products available at the time this book was written. Just as AT&T is now more than telephones and telegraphs, SAS is more than statistics.

While SAS has a diverse family of products, most of these products are integrated; that is, they can be put together like building blocks to construct a seamless system. For example, you might use SAS/ACCESS software to read data stored in an external database such as Oracle, analyze it using SAS/ETS software (business planning, forecasting, and decision support), and then forward the results in e-mail messages to your colleagues, all in a single computer program.

Operating environments

SAS software runs in a wide range of operating environments. You can take a program written on a personal computer and run it on a mainframe after changing only the file-handling statements specific to each operating environment. And because SAS programs are as portable as possible, SAS programmers are as portable as possible too. If you know SAS in one operating environment, you can switch to another operating environment without having to relearn SAS.

Licensing SAS products

Most SAS software is licensed. Licensing software is like leasing it; once a year you pay your rent. Licensing has one important advantage when compared with buying: you automatically get each new release without an extra charge. Since SAS software is continually being improved and new versions released, licensing is helpful.

SAS Learning Edition

This modestly priced edition of SAS can be purchased (not licensed). Designed for students and business professionals who are new to SAS, this is a full-featured edition of SAS with some limitations. SAS Learning Edition is limited to 1,000 observations, expires on a specific date, and does not include live technical support.

SASware Ballot

SAS puts a high percentage of its revenue into research and development, and each year SAS users help determine how that money will be spent by voting on the SASware Ballot. The ballot is a list of suggestions for new features and enhancements. All SAS users are eligible to vote and thereby influence the future development of SAS software. You can even make your own suggestions for the SASware Ballot by mailing them to SAS or by sending e-mail to suggest@sas.com. For further information about the SASware Ballot, see support.sas.com/techsup/news/sasware.html.
About This Book

Who needs this book

This book is for all new SAS users in business, government, and academia, or for anyone who will be conducting data analysis using SAS. You need no prior experience with SAS software, but if you have some experience you may still find this book useful for learning techniques you missed or for reference.

What this book covers

This book introduces you to the SAS language with lots of practical examples, clear and concise explanations, and as little technical jargon as possible. Most of the features covered here come from Base SAS software, which contains the core of features used by all SAS programmers. One exception is Chapter 8, which includes some procedures from SAS/STAT software. Other exceptions appear in Chapters 2 and 9, which cover importing and exporting data from other types of software; some methods require SAS/ACCESS for PC File Formats software.

We have tried to include every feature of Base SAS software that a beginner is likely to need. Some of you will be surprised that certain topics, such as macros, are included because macros are normally considered advanced. But they appear here because sometimes new users need them. However, that doesn’t mean that you need to know everything in this book. On the contrary, this book is designed so you can read just those sections you need to solve your problems. Even if you read this book from cover to cover, you may find yourself returning to refresh your memory as new programming challenges arise.

What this book does not cover

To use this book you need no prior knowledge of SAS, but you must know something about your local computer and operating environment. The SAS language is virtually the same from one operating environment to another, but some differences are unavoidable. For example, every operating environment has a different way of storing and accessing files. Also, some operating environments have more of a capacity for interactive computing than others. Your employer may have rules limiting the size of files you can print. This book addresses operating environments as much as possible, but no book can answer every question about your local system. You must have either a working knowledge of your operating environment or someone you can turn to with questions.

This book is not a replacement for the SAS Help and Documentation, or the many SAS manuals. Sooner or later you’ll need to go to these sources to learn details not covered in this book. The exact documentation available to you depends on which version of SAS you use. Starting with SAS 9, the SAS OnlineDoc has been combined with the system help accessed via the Help menu, giving you more detailed documentation at your fingertips. You can also purchase SAS OnlineDoc on a separate CD.

We cover only a few of the many SAS statistical procedures. Fortunately, the statistical procedures share many of the same statements, options, and output, so these few can serve as an introduction to the others. Once you have read Chapter 8, we think that other statistical procedures will feel familiar.

Unfortunately, a book of this type cannot provide a thorough introduction to statistical concepts such as degrees of freedom, or crossed and nested effects. There are underlying assumptions about your data that must be met for the tests to be valid. Experimental design and careful
selection of the models are critical. Interpretation of the results can often be difficult and subjective. We assume that readers who are interested in statistical computing already know something about statistics. People who want to use statistical procedures but are unfamiliar with these concepts should consult a statistician, seek out an introductory statistics text, or, better yet, take a course in statistics.

Modular sections

Our goal in writing this book is to make learning SAS as easy and enjoyable as possible. Let’s face it: SAS is a big topic. You may have already spent some time scratching your head in front of a shelf full of SAS manuals, or staring at a screen full of online documentation until your eyes become blurry. We can’t condense all of SAS into this little book, but we can condense topics into short, readable sections.

This entire book is composed of two-page sections, each section a complete topic. This way, you can easily skip over topics which do not apply to you. Of course, we think every section is important, or we would not have included it. You probably don’t need to know everything in this book, however, to complete your job. By presenting topics in short digestible sections, we believe that learning SAS will be easier and more fun, like eating three meals a day instead of one giant meal a week.

Graphics

Wherever possible, graphic illustrations either identify the contents of the section or help explain the topic. A box with rough edges indicates a raw data file, and a box with nice smooth edges indicates a SAS data set. The squiggles inside the box indicate data (any old data), and a period indicates a missing value. The arrow between boxes of these types means that the section explains how to get from data that look like the one box to data that look like the other. Some sections have graphics which depict printed output. These graphics look like a stack of papers with headers printed at the top of the page.

[Figure: a raw data file box with an arrow to a SAS data set box, and a stack of SAS output pages with the column headers Obs, Lions, Tigers, Bears]
Typographical conventions

SAS doesn’t care whether your programs are written in uppercase or lowercase, so you can write your programs any way you want. In this book, we have used uppercase and lowercase to tell you something. The statements on the left below show the syntax, or general form, while the statements on the right show an example of actual statements as they might appear in a SAS program.

   Syntax                               Example
   PROC PRINT DATA = data-set-name;     PROC PRINT DATA = bigcats;
      VAR variable-list;                   VAR Lions Tigers;

Notice that the keywords PROC PRINT, DATA, and VAR are the same on both sides and that the descriptive terms data-set-name and variable-list on the syntax side have been replaced with an actual data set name and variable names in the example.

In this book, all SAS keywords appear in uppercase letters. A keyword is an instruction to SAS and must be spelled correctly. Anything written in lowercase italics is a description of what goes in that spot in the statement, not what you actually type. Anything in lowercase or mixed case letters (and not in italics) is something that the programmer has made up such as a variable name, a name for a SAS data set, a comment, or a title. See section 1.2 for further discussion of the significance of case in SAS names.

Indention

This book contains many SAS programs, each complete and executable. Programs are formatted in a way which makes them easy for you to read and understand. You do not have to format your programs this way, as SAS is very flexible, but attention to some of these details will make your programs easier to read. Easy-to-read programs are time-savers for you, or the consultant you hire at $100 per hour, when you need to go back and decipher the program months or years later.

The structure of programs is shown by indenting all statements after the first in a step. This is a simple way to make your programs more readable, and it’s a good habit to form. SAS doesn’t really care where statements start or even if they are all on one line. In the following program, the INFILE and INPUT statements are indented, indicating that they belong with the DATA statement:

   * Read animals' weights from file. Print the results.;
   DATA animals;
      INFILE 'c:\MyRawData\Zoo.dat';
      INPUT Lions Tigers;
   PROC PRINT DATA = animals;
   RUN;

Last, we have tried to make this book as readable as possible and, we hope, even enjoyable. Once you master the contents of this small book you will no longer be a beginning SAS programmer.
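As a small illustration of the two conventions described above (case and indention), the sketch below rewrites the animals program with lowercase keywords and every statement on a single line. It is only meant to show that SAS accepts this layout just as it accepts the indented version; it assumes the same raw data file path used in the book's example, and the comment text is our own.

```sas
* Same steps as the indented program, in lowercase and on one line.;
* Assumes the raw data file from the example exists at this path.;
data animals; infile 'c:\MyRawData\Zoo.dat'; input Lions Tigers; proc print data = animals; run;
```

Both layouts describe the same DATA step and PROC PRINT step. The indented, uppercase-keyword form is simply easier to read and maintain.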