This blog is a journal of the tips and tricks I use on a routine basis.
Wednesday, February 22, 2012
PROC SQL Explicit Pass thru
Use the PROC SQL Pass-Through Facility when you want to interact with DBMS
data by using SQL syntax that is specific to your DBMS.
In this example, SAS/ACCESS connects to an ORACLE database by using the alias
ora2, selects all rows in the STAFF table, and displays the first 15 rows of data by
using PROC SQL.
proc sql outobs=15;
connect to oracle as ora2 (user=user-id password=password);
select * from connection to ora2 (select lname, fname, state from staff);
disconnect from ora2;
quit;
Saturday, February 18, 2012
Date Time Functions
Nice paper on date time functions
http://www.sierrainformation.com/presentationPDF/presentation_63.pdf
http://www.sierrainformation.com/presentationPDF/presentation_63.pdf
Friday, February 17, 2012
Macro Functions/Commands
List of frequently used Macro Functions
_ALL_ describes all currently defined macro variables, regardless of scope.
This output includes user-defined global and local variables as well as automatic macro variables. Scopes are listed in the order of innermost to outermost.
_AUTOMATIC_ describes all automatic macro variables.
The scope is listed asAUTOMATIC. All automatic macro variables are global except SYSPBUFF.
_GLOBAL_ describes all global macro variables that were not created by the
macro processor. The scope is listed as GLOBAL. Automatic macro variables are not listed.
_LOCAL_ describes user-defined local macro variables defined within the
currently executing macro. The scope is listed as the name of the macro in which the macro variable is defined.
_USER_ describes all user-defined macro variables, regardless of scope.
The scope is either GLOBAL, for global macro variables, or the name of the macro in which the macro variable is defined.
Create Global macros by using the command %Global
Two rules control where CALL SYMPUT creates its variables:
Table 6.1 Macro Language Elements that Evaluate Arithmetic and Logical
Expressions
%DOmacro-variable=expression %TO expression<%BY expression>;
%DO %UNTIL(expression);
%DO %WHILE(expression);
%EVAL (expression);
%IF expression %THEN statement;
%QSCAN(argument,expression<,delimiters>)
%QSUBSTR(argument,expression<,expression>)
%SCAN(argument,expression,)
%SUBSTR(argument,expression<,expression>)
%SYSEVALF(expression,conversion-type)
%SYSEVALF function to evaluate logical expressions containing floating-point or missing values.
The following macro quoting functions are most commonly used:
For the paired macro quoting functions, the function beginning with NR affects the same category of special characters that are masked by the plain macro quoting function as well as ampersands and percent signs. In effect, the NR functions prevent macro and macro variable resolution. To help you remember which does which, try associating the NR in the macro quoting function names with the words “not resolved”
— that is, macros and macro variables are not resolved when you use these functions. The macro quoting functions with B in their names are useful for macro quoting unmatched quotation marks and parentheses. To help you remember the B, try associating B with “by itself”.
Here is an example that uses %STR to mask a string that contains an unmatched single quotation mark. Note the use of the % sign before the quotation mark:
%let innocent=%str(I didn%’t do it!);
Examples Using %NRSTR
Suppose you want the name (not the value) of a macro variable to be printed by the %PUT statement. To do so, you must use the %NRSTR function to mask the & and prevent the resolution of the macro variable, as in the following example:
%macro example;
%local myvar;
%let myvar=abc;
%put %nrstr(The string &myvar appears in log output,);
%put instead of the variable value.;
%mend example;
%example
This code writes the following text to the SAS log:
The string &myvar appears in log output,
instead of the variable value.
If you did not use the %NRSTR function or if you used %STR, the following undesired output would appear in the SAS log:
The string abc appears in log output,
instead of the variable value.
_ALL_ describes all currently defined macro variables, regardless of scope.
This output includes user-defined global and local variables as well as automatic macro variables. Scopes are listed in the order of innermost to outermost.
_AUTOMATIC_ describes all automatic macro variables.
The scope is listed asAUTOMATIC. All automatic macro variables are global except SYSPBUFF.
_GLOBAL_ describes all global macro variables that were not created by the
macro processor. The scope is listed as GLOBAL. Automatic macro variables are not listed.
_LOCAL_ describes user-defined local macro variables defined within the
currently executing macro. The scope is listed as the name of the macro in which the macro variable is defined.
_USER_ describes all user-defined macro variables, regardless of scope.
The scope is either GLOBAL, for global macro variables, or the name of the macro in which the macro variable is defined.
Create Global macros by using the command %Global
Two rules control where CALL SYMPUT creates its variables:
- CALL SYMPUT creates the macro variable in the current symbol table available while the DATA step is executing, provided that symbol table is not empty. If it is empty (contains no local macro variables), usually CALL SYMPUT creates the variable in the closest nonempty symbol table.
- However, there are three cases where CALL SYMPUT creates the variable in the local symbol table, even if that symbol table is empty:
- Beginning with SAS Version 8, if CALL SYMPUT is used after a PROC SQL, the variable will be created in a local symbol table.
- If the macro variable SYSPBUFF is created at macro invocation time, the variable will be created in the local symbol table.
- If the executing macro contains a computed %GOTO statement, the variable will be created in the local symbol table. A computed %GOTO statement is one that uses a label that contains an & or a % in it. That is, a computed %GOTO statement contains a macro variable reference or a macro call that produces a text expression. Here is an example of a computed %GOTO statement:
Table 6.1 Macro Language Elements that Evaluate Arithmetic and Logical
Expressions
%DOmacro-variable=expression %TO expression<%BY expression>;
%DO %UNTIL(expression);
%DO %WHILE(expression);
%EVAL (expression);
%IF expression %THEN statement;
%QSCAN(argument,expression<,delimiters>)
%QSUBSTR(argument,expression<,expression>)
%SCAN(argument,expression,
%SUBSTR(argument,expression<,expression>)
%SYSEVALF(expression,conversion-type)
%SYSEVALF function to evaluate logical expressions containing floating-point or missing values.
The following macro quoting functions are most commonly used:
- %STR and %NRSTR
- %BQUOTE and %NRBQUOTE
- %SUPERQ
For the paired macro quoting functions, the function beginning with NR affects the same category of special characters that are masked by the plain macro quoting function as well as ampersands and percent signs. In effect, the NR functions prevent macro and macro variable resolution. To help you remember which does which, try associating the NR in the macro quoting function names with the words “not resolved”
— that is, macros and macro variables are not resolved when you use these functions. The macro quoting functions with B in their names are useful for macro quoting unmatched quotation marks and parentheses. To help you remember the B, try associating B with “by itself”.
Here is an example that uses %STR to mask a string that contains an unmatched single quotation mark. Note the use of the % sign before the quotation mark:
%let innocent=%str(I didn%’t do it!);
Examples Using %NRSTR
Suppose you want the name (not the value) of a macro variable to be printed by the %PUT statement. To do so, you must use the %NRSTR function to mask the & and prevent the resolution of the macro variable, as in the following example:
%macro example;
%local myvar;
%let myvar=abc;
%put %nrstr(The string &myvar appears in log output,);
%put instead of the variable value.;
%mend example;
%example
This code writes the following text to the SAS log:
The string &myvar appears in log output,
instead of the variable value.
If you did not use the %NRSTR function or if you used %STR, the following undesired output would appear in the SAS log:
The string abc appears in log output,
instead of the variable value.
Tuesday, February 7, 2012
Improving Query Performance
There are several ways to improve query performance, including the following:
-
using indexes and composite indexes
-
using the keyword ALL in set operations when you know that there are no
duplicate rows, or when it does not matter if you have duplicate rows in the
result table
-
omitting the ORDER BY clause when you create tables and views
-
using in-line views instead of temporary tables (or vice versa)
-
using joins instead of subqueries
-
using WHERE expressions to limit the size of result tables that are created
with joins
-
using either PROC SQL options, SAS system options, or both to replace a PUT
function in a query with a logically equivalent expression
-
replacing references to the DATE, TIME, DATETIME, and TODAY functions in a
query with their equivalent constant values before the query executes
- disabling the remerging of data when summary functions are used in a query
Using PROC SQL Options to Create and Debug Queries
PROC SQL supports options that can give you greater control over PROC SQL while you are developing a query:
-
The INOBS=, OUTOBS=, and LOOPS= options reduce query execution time by
limiting the number of rows and the number of iterations that PROC SQL
processes.
-
The EXEC and VALIDATE statements enable you to quickly check the syntax of a
query.
-
The FEEDBACK option displays the columns that are represented by a SELECT *
statement.
- The PROC SQL STIMER option records and displays query execution time.
Using the PROC SQL Automatic Macro Variables
PROC SQL sets up macro variables with certain values after it executes each statement. These macro variables can be tested inside a macro to determine whether to continue executing the PROC SQL step.
After each PROC SQL statement has executed, the following macro variables are updated with these values:
- SQLEXITCODE
-
contains the highest return code that occurred from some types of SQL insert
failures. This return code is written to the SYSERR macro variable when PROC SQL
terminates.
- SQLOBS
-
contains the number of rows that were processed by an SQL procedure
statement. For example, the SQLOBS macro variable contains the number of rows
that were formatted and displayed in SAS output by a SELECT statement or the
number of rows that were deleted by a DELETE statement.
When the NOPRINT option is specified, the value of the SQLOBS macro variable depends on whether an output table, single macro variable, macro variable list, or macro variable range is created:
-
If no output table, macro variable list, or macro variable range is created,
then SQLOBS contains the value 1.
-
If an output table is created, then SQLOBS contains the number of rows in the
output table.
-
If a single macro variable is created, then SQLOBS contains the value 1.
- If a macro variable list or macro variable range is created, then SQLOBS contains the number of rows that are processed to create the macro variable list or range.
Note: The SQLOBS automatic macro variable is assigned a value after the SQL SELECT statement executes.
-
If no output table, macro variable list, or macro variable range is created,
then SQLOBS contains the value 1.
- SQLOOPS
-
contains the number of iterations that the inner loop of PROC SQL processes.
The number of iterations increases proportionally with the complexity of the
query. For more information, see Limiting
Iterations with the LOOPS= Option and LOOPS= in the Base SAS Procedures
Guide.
- SQLRC
-
contains the following status values that indicate the success of the SQL
procedure statement:
- 0
-
PROC SQL statement completed successfully with no errors.
- 4
-
PROC SQL statement encountered a situation for which it issued a warning. The
statement continued to execute.
- 8
-
PROC SQL statement encountered an error. The statement stopped execution at
this point.
- 12
-
PROC SQL statement encountered an internal error, indicating a bug in PROC
SQL that should be reported to SAS Technical Support. These errors can occur
only during compile time.
- 16
-
PROC SQL statement encountered a user error. For example, this error code is
used, when a subquery (that can return only a single value) evaluates to more
than one row. These errors can be detected only during run time.
- 24
-
PROC SQL statement encountered a system error. For example, this error is
used, if the system cannot write to a PROC SQL table because the disk is full.
These errors can occur only during run time.
- 28
- PROC SQL statement encountered an internal error, indicating a bug in PROC SQL that should be reported to SAS Technical Support. These errors can occur only during run time.
For example, the values for the SQLRC return code differ based on the value of the UNDO_POLICY= option or the SQLUNDOPOLICY system option if you attempt to insert duplicate values into an index that is defined using the CREATE UNIQUE INDEX statement:
-
If you set the UNDO_POLICY= option or the SQLUNDOPOLICY system option to
either REQUIRED or OPTIONAL, and you attempt to insert a duplicate index value,
SAS creates and tries to maintain a copy of the table before and after updates
are applied. SAS detects an error condition and supplies a return code to PROC
SQL, which stops execution as soon as the error condition is received. SQLRC
contains the value 24.
- If you set the UNDO_POLICY= option or the SQLUNDOPOLICY system option to NONE and you attempt to insert a duplicate index value, SAS does not create a before-and-after copy of the table. SAS does not detect an error condition and does not supply a return code to PROC SQL, which attempts to continue to process the updates. SQLRC contains the value 8.
- SQLXMSG
-
contains descriptive information and the DBMS-specific return code for the
error that is returned by the Pass-Through Facility.
Note: Because the value of the SQLXMSG macro variable can contain special characters (such as &, %, /, *, and ;), use the %SUPERQ macro function when printing the following value:
%put %superq(sqlxmsg);
For information about the %SUPERQ function, see SAS Macro Language: Reference. - SQLXRC
- contains the DBMS-specific return code that is returned by the Pass-Through Facility.
Users of SAS/AF software can access these automatic macro variables in SAS Component Language (SCL) programs by using the SYMGET function. The following example uses the VALIDATE statement in a SAS/AF software application to check the syntax of a block of code. Before it issues the CREATE VIEW statement, the application checks that the view is accessible.
submit sql immediate; validate &viewdef; end submit; if symget('SQLRC') gt 4 then do; ... the view is not valid ... end; else do; submit sql immediate; create view &viewname as &viewdef; end submit; end;
The following example retrieves the data from the COUNTRIES table, but does not display the table because the NOPRINT option is specified in the PROC SQL statement. The %PUT macro language statement displays the three automatic macro variable values in the SAS log. For more information about the %PUT statement and the SAS macro facility, see SAS Macro Language: Reference.
proc sql noprint; select * from sql.countries; %put SQLOBS=*&sqlobs* SQLOOPS=*&sqloops* SQLRC=*&sqlrc*;
Using the PROC SQL Automatic Macro Variables
SQLOBS=*1* SQLOOPS=*11* SQLRC=*0*Notice that the value of SQLOBS is 1. When the NOPRINT option is used and no table or macro variables are created, SQLOBS returns a value of 1 because only one row is processed.
Note: You can use the _AUTOMATIC_ option in the %PUT statement to list the values of all automatic macro variables. The list depends on the SAS products that are installed at your site
Macros in Proc SQL
Creating Macro Variables in PROC SQL |
Other software vendors' SQL products allow the embedding of SQL into another language. References to variables (columns) of that language are termed host-variable references. They are differentiated from references to columns in tables by names that are prefixed with a colon. The host-variable stores the values of the object-items that are listed in the SELECT clause.
The only host language that is currently available in SAS is the macro language, which is part of Base SAS software. When a calculation is performed on a column's value, its result can be stored, using :macro-variable, in the macro facility. The result can then be referenced by that name in another PROC SQL query or SAS procedure. Host-variable can be used only in the outer query of a SELECT statement, not in a subquery. Host-variable cannot be used in a CREATE statement.
If the query produces more than one row of output, then the macro variable will contain only the value from the first row. If the query has no rows in its output, then the macro variable is not modified. If the macro variable does not exist yet, it will not be created. The PROC SQL macro variable SQLOBS contains the number of rows that are produced by the query.
Note: The SQLOBS automatic macro variable is assigned a value after the SQL SELECT statement executes.
Creating Macro Variables from the First Row of a Query Result
If you specify a single macro variable in the INTO clause, then PROC SQL assigns the variable the value from the first row only of the appropriate column in the SELECT list. In this example, &country1 is assigned the value from the first row of the Country column, and &barrels1 is assigned the value from the first row of the Barrels column. The NOPRINT option prevents PROC SQL from displaying the results of the query. The %PUT statement writes the contents of the macro variables to the SAS log.proc sql noprint; select country, barrels into :country1, :barrels1 from sql.oilrsrvs; %put &country1 &barrels1;
Creating Macro Variables from the First Row of a Query Result
4 proc sql noprint; 5 select country, barrels 6 into :country1, :barrels1 7 from sql.oilrsrvs; 8 9 %put &country1 &barrels1; Algeria 9,200,000,000 NOTE: PROCEDURE SQL used: real time 0.12 seconds
Thursday, February 2, 2012
SAS Handling Extremely Large Data Sets and Options
A very good document with tips on how to handle extremely large data sets in SAS
http://www.nesug.org/proceedings/nesug08/ap/ap03.pdf
A document about setting various options up in SAS
http://www2.sas.com/proceedings/sugi28/067-28.pdf
Another blog with lot of tips and useful SAS information
http://studysas.blogspot.com/2009/03/sas-datetime-and-datetime-functions.html#
http://www.nesug.org/proceedings/nesug08/ap/ap03.pdf
A document about setting various options up in SAS
http://www2.sas.com/proceedings/sugi28/067-28.pdf
Another blog with lot of tips and useful SAS information
http://studysas.blogspot.com/2009/03/sas-datetime-and-datetime-functions.html#
Subscribe to:
Posts (Atom)