Tuesday, January 19, 2016

Formats

It has been a really long time since I posted an entry on this blog, I am working on using existing formats and create new custom formats in SAS and thought this would be a good topic to write on.

In SAS, values can be displayed using either the generic formats available or create custom formats. The FORMAT procedure is used to create custom formats. The custom formats can be stored either temporarily (active only in the current SAS program) or stored permanently so they can reused as needed.

Some of the generic formats available in SAS

  • MMDDYYs.                                 01/16/2016
  • dollar11.2                                     $243,678.25


PROC FORMAT
General Form
        PROC FORMAT ;

where options include
  •  LIBRARY= libref  specifies the libref for a SAS data library to contain a permanent catalog of user-defined formats
  • FMTLIB  displays a list of all of the formats in your catalog, along with descriptions of their values.
Use the Library= option to store the format in a permanent format catalog.

PROC FORMAT LIBRARY=libref;
 Remember to associate the libref to the SAS data library where format catalog is stored

libname custom 'c:/sas/formats/lib';


Now formats created as below will be stored in the permanent format catalog called custom.formats

libname custom 'c:/sas/formats/lib';

proc format library=custom;
.............;
run;

 

Wednesday, February 22, 2012

PROC SQL Explicit Pass thru


Use the PROC SQL Pass-Through Facility when you want to interact with DBMS
data by using SQL syntax that is specific to your DBMS.
In this example, SAS/ACCESS connects to an ORACLE database by using the alias
ora2, selects all rows in the STAFF table, and displays the first 15 rows of data by
using PROC SQL.

proc sql outobs=15;
connect to oracle as ora2 (user=user-id password=password);
select * from connection to ora2 (select lname, fname, state from staff);
disconnect from ora2;
quit;

Friday, February 17, 2012

Macro Functions/Commands

List of frequently used Macro Functions


_ALL_          describes all currently defined macro variables, regardless of scope.
This output includes user-defined global and local variables as well as automatic macro variables. Scopes are listed in the order of innermost to outermost.

_AUTOMATIC_          describes all automatic macro variables.
The scope is listed asAUTOMATIC. All automatic macro variables are global except SYSPBUFF.

_GLOBAL_ describes all global macro variables that were not created by the
macro processor. The scope is listed as GLOBAL. Automatic macro variables are not listed.

_LOCAL_ describes user-defined local macro variables defined within the
currently executing macro. The scope is listed as the name of the macro in which the macro variable is defined.

_USER_ describes all user-defined macro variables, regardless of scope.
The scope is either GLOBAL, for global macro variables, or the name of the macro in which the macro variable is defined.

Create Global macros by using the command %Global


Two rules control where CALL SYMPUT creates its variables:


  • CALL SYMPUT creates the macro variable in the current symbol table available while the DATA step is executing, provided that symbol table is not empty. If it is empty (contains no local macro variables), usually CALL SYMPUT creates the variable in the closest nonempty symbol table.
  • However, there are three cases where CALL SYMPUT creates the variable in the local symbol table, even if that symbol table is empty:
    • Beginning with SAS Version 8, if CALL SYMPUT is used after a PROC SQL, the variable will be created in a local symbol table.
    • If the macro variable SYSPBUFF is created at macro invocation time, the variable will be created in the local symbol table.
    • If the executing macro contains a computed %GOTO statement, the variable will be created in the local symbol table. A computed %GOTO statement is one that uses a label that contains an & or a % in it. That is, a computed %GOTO statement contains a macro variable reference or a macro call that produces a text expression. Here is an example of a computed %GOTO statement:
                                  %goto &home;



Table 6.1 Macro Language Elements that Evaluate Arithmetic and Logical
Expressions
%DOmacro-variable=expression %TO expression<%BY expression>;
%DO %UNTIL(expression);
%DO %WHILE(expression);
%EVAL (expression);
%IF expression %THEN statement;
%QSCAN(argument,expression<,delimiters>)
%QSUBSTR(argument,expression<,expression>)
%SCAN(argument,expression,)
%SUBSTR(argument,expression<,expression>)
%SYSEVALF(expression,conversion-type)


%SYSEVALF function to evaluate logical expressions containing floating-point or missing values.


The following macro quoting functions are most commonly used:

  1. %STR and %NRSTR
  2. %BQUOTE and %NRBQUOTE
  3. %SUPERQ





For the paired macro quoting functions, the function beginning with NR affects the same category of special characters that are masked by the plain macro quoting function as well as ampersands and percent signs. In effect, the NR functions prevent macro and macro variable resolution. To help you remember which does which, try associating the NR in the macro quoting function names with the words “not resolved”
— that is, macros and macro variables are not resolved when you use these functions. The macro quoting functions with B in their names are useful for macro quoting unmatched quotation marks and parentheses. To help you remember the B, try associating B with “by itself”.



Here is an example that uses %STR to mask a string that contains an unmatched single quotation mark. Note the use of the % sign before the quotation mark:

%let innocent=%str(I didn%’t do it!);


Examples Using %NRSTR
Suppose you want the name (not the value) of a macro variable to be printed by the %PUT statement. To do so, you must use the %NRSTR function to mask the & and prevent the resolution of the macro variable, as in the following example:


%macro example;
               %local myvar;
               %let myvar=abc;
               %put %nrstr(The string &myvar appears in log output,);
              %put instead of the variable value.;
%mend example;

%example
This code writes the following text to the SAS log:

The string &myvar appears in log output,
instead of the variable value.

If you did not use the %NRSTR function or if you used %STR, the following undesired output would appear in the SAS log:
The string abc appears in log output,
instead of the variable value.





Tuesday, February 7, 2012

Improving Query Performance


There are several ways to improve query performance, including the following:
  • using indexes and composite indexes
  • using the keyword ALL in set operations when you know that there are no duplicate rows, or when it does not matter if you have duplicate rows in the result table
  • omitting the ORDER BY clause when you create tables and views
  • using in-line views instead of temporary tables (or vice versa)
  • using joins instead of subqueries
  • using WHERE expressions to limit the size of result tables that are created with joins
  • using either PROC SQL options, SAS system options, or both to replace a PUT function in a query with a logically equivalent expression
  • replacing references to the DATE, TIME, DATETIME, and TODAY functions in a query with their equivalent constant values before the query executes
  • disabling the remerging of data when summary functions are used in a query

Using PROC SQL Options to Create and Debug Queries


PROC SQL supports options that can give you greater control over PROC SQL while you are developing a query:
  • The INOBS=, OUTOBS=, and LOOPS= options reduce query execution time by limiting the number of rows and the number of iterations that PROC SQL processes.
  • The EXEC and VALIDATE statements enable you to quickly check the syntax of a query.
  • The FEEDBACK option displays the columns that are represented by a SELECT * statement.
  • The PROC SQL STIMER option records and displays query execution time.
You can set an option initially in the PROC SQL statement, and then use the RESET statement to change the same option's setting without ending the current PROC SQL step.

Using the PROC SQL Automatic Macro Variables



PROC SQL sets up macro variables with certain values after it executes each statement. These macro variables can be tested inside a macro to determine whether to continue executing the PROC SQL step.
After each PROC SQL statement has executed, the following macro variables are updated with these values:
SQLEXITCODE
contains the highest return code that occurred from some types of SQL insert failures. This return code is written to the SYSERR macro variable when PROC SQL terminates.
SQLOBS
contains the number of rows that were processed by an SQL procedure statement. For example, the SQLOBS macro variable contains the number of rows that were formatted and displayed in SAS output by a SELECT statement or the number of rows that were deleted by a DELETE statement.
When the NOPRINT option is specified, the value of the SQLOBS macro variable depends on whether an output table, single macro variable, macro variable list, or macro variable range is created:
  • If no output table, macro variable list, or macro variable range is created, then SQLOBS contains the value 1.
  • If an output table is created, then SQLOBS contains the number of rows in the output table.
  • If a single macro variable is created, then SQLOBS contains the value 1.
  • If a macro variable list or macro variable range is created, then SQLOBS contains the number of rows that are processed to create the macro variable list or range.
If an SQL view is created, then SQLOBS contains the value 0.
Note:   The SQLOBS automatic macro variable is assigned a value after the SQL SELECT statement executes.  [cautionend]
SQLOOPS
contains the number of iterations that the inner loop of PROC SQL processes. The number of iterations increases proportionally with the complexity of the query. For more information, see Limiting Iterations with the LOOPS= Option and LOOPS= in the Base SAS Procedures Guide.
SQLRC
contains the following status values that indicate the success of the SQL procedure statement:
0
PROC SQL statement completed successfully with no errors.
4
PROC SQL statement encountered a situation for which it issued a warning. The statement continued to execute.
8
PROC SQL statement encountered an error. The statement stopped execution at this point.
12
PROC SQL statement encountered an internal error, indicating a bug in PROC SQL that should be reported to SAS Technical Support. These errors can occur only during compile time.
16
PROC SQL statement encountered a user error. For example, this error code is used, when a subquery (that can return only a single value) evaluates to more than one row. These errors can be detected only during run time.
24
PROC SQL statement encountered a system error. For example, this error is used, if the system cannot write to a PROC SQL table because the disk is full. These errors can occur only during run time.
28
PROC SQL statement encountered an internal error, indicating a bug in PROC SQL that should be reported to SAS Technical Support. These errors can occur only during run time.
The value of SQLRC can vary based on the value of the PROC SQL statement UNDO_POLICY= option or the SQLUNDOPOLICY system option.
For example, the values for the SQLRC return code differ based on the value of the UNDO_POLICY= option or the SQLUNDOPOLICY system option if you attempt to insert duplicate values into an index that is defined using the CREATE UNIQUE INDEX statement:
  • If you set the UNDO_POLICY= option or the SQLUNDOPOLICY system option to either REQUIRED or OPTIONAL, and you attempt to insert a duplicate index value, SAS creates and tries to maintain a copy of the table before and after updates are applied. SAS detects an error condition and supplies a return code to PROC SQL, which stops execution as soon as the error condition is received. SQLRC contains the value 24.
  • If you set the UNDO_POLICY= option or the SQLUNDOPOLICY system option to NONE and you attempt to insert a duplicate index value, SAS does not create a before-and-after copy of the table. SAS does not detect an error condition and does not supply a return code to PROC SQL, which attempts to continue to process the updates. SQLRC contains the value 8.
SQLXMSG
contains descriptive information and the DBMS-specific return code for the error that is returned by the Pass-Through Facility.
Note:   Because the value of the SQLXMSG macro variable can contain special characters (such as &, %, /, *, and ;), use the %SUPERQ macro function when printing the following value:
%put %superq(sqlxmsg);
For information about the %SUPERQ function, see SAS Macro Language: Reference.  [cautionend]
SQLXRC
contains the DBMS-specific return code that is returned by the Pass-Through Facility.
Macro variables that are generated by PROC SQL follow the scoping rules for %LET. For more information about macro variable scoping, see SAS Macro Language: Reference.
Users of SAS/AF software can access these automatic macro variables in SAS Component Language (SCL) programs by using the SYMGET function. The following example uses the VALIDATE statement in a SAS/AF software application to check the syntax of a block of code. Before it issues the CREATE VIEW statement, the application checks that the view is accessible.
submit sql immediate;
   validate &viewdef;
end submit;

if symget('SQLRC') gt 4 then 
   do;
      ... the view is not valid ...
   end;
else do;
   submit sql immediate;
      create view &viewname as &viewdef;
   end submit;
end;

The following example retrieves the data from the COUNTRIES table, but does not display the table because the NOPRINT option is specified in the PROC SQL statement. The %PUT macro language statement displays the three automatic macro variable values in the SAS log. For more information about the %PUT statement and the SAS macro facility, see SAS Macro Language: Reference.
proc sql noprint;
   select * from sql.countries;
%put SQLOBS=*&sqlobs* SQLOOPS=*&sqloops* SQLRC=*&sqlrc*;


Using the PROC SQL Automatic Macro Variables
SQLOBS=*1* SQLOOPS=*11* SQLRC=*0*
Notice that the value of SQLOBS is 1. When the NOPRINT option is used and no table or macro variables are created, SQLOBS returns a value of 1 because only one row is processed.
Note:   You can use the _AUTOMATIC_ option in the %PUT statement to list the values of all automatic macro variables. The list depends on the SAS products that are installed at your site