use
    use a data set, e.g.: use anes92
save
    save a data set: save anes92, or save anes92,replace (if the data set already exists), or save anes92,old (for STATA 5 format). Or press Control+S.
clear
    clear existing data in memory
set mem 80m
    allocate 80 megs of memory to STATA (default depends on machine). NOTE THAT STATA has a maximum limit of 2047 variables no matter how much memory you have.
exit
    exit STATA
help ____
    help for STATA command ____; help contents gives a list of STATA commands for which help is available.
search ____
    search the on-line manual for all references to ____ (e.g., search regress gives all references to regress in STATA; it’s a lot!)
log
    set log file for output, e.g., log using c:\log\mylog will produce a file called mylog.log, which you can edit in any ASCII word processor. Variations:
    log using c:\log\mylog,append will add to existing file
    log using c:\log\mylog,replace will replace existing file
    log close will close log file (can reopen with append)
if
    restricts commands to those cases that fulfill a condition: e.g., sum var1 if partyid==1 (note two equal signs needed) will produce a summary (see below) of var1 only for partyid==1 (e.g., Democrats)
in
    restricts commands to ranges you specify: sum var1 in 1/20 will summarize only the first 20 cases of var1
describe (des)
    produces a list of variables with format and labels
ds
    produces a list of variables without format or labels
codebook
    will produce codebook of variable information (takes time!)
inspect
    will provide mini-histograms of data
summarize (sum)
    will provide summaries of data (means, sds, numbers of cases, max, min); sum varlist,detail will provide breakdowns by quartiles
lsum
    (If installed): lsum varlist will give you summary only for cases that are not missing for any variable.
list varlist
    will print out all values of variables in varlist
sort
    sorts the cases in ascending order of the variables you name
gsort
    (If installed): sorts the cases in ascending or descending order of the variables you name
order
    changes the order of variables to your preferences
aorder
    (If installed): orders variables alphabetically
recode
    recode variables, as in: recode var1 3=2 2=1 1=0 4=. (the final period is the missing value code). Note that here the value 4 of var1 is recoded to missing (see mvdecode). Also note that you can only recode one variable at a time in a recode command (but see “for” below).
mvencode
    changes all occurrences of missing to # in the specified varlist (see below).
mvdecode
    changes all occurrences of # to missing in the specified varlist, e.g., mvdecode var1-var99,mv(999) changes all instances of 999 in var1 through var99 to system missing.
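
A minimal sketch of recode, mvdecode and mvencode on a toy data set. The variable names and values here are made up for illustration, and the recode line follows the unparenthesized rule syntax used in this sheet; newer releases document the rules in parentheses, e.g. (3=2).

```stata
* Toy data (hypothetical): var1 takes the values 1-4, var2 uses 999 as a missing code
clear
input var1 var2
1 10
2 999
3 30
4 40
end

* Recode var1: 3 becomes 2, 2 becomes 1, 1 becomes 0, and 4 becomes missing.
* Each original value is matched against the rules once, so the changes do not cascade.
recode var1 3=2 2=1 1=0 4=.

* Turn the 999 placeholder in var2 into system missing
mvdecode var2, mv(999)

* Reverse direction: turn system missing in var2 back into the numeric code 999
mvencode var2, mv(999)

list
```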
generate (gen)
    create new variables, the equivalent of compute in SPSS. E.g., gen pcincome=income/pop where pop=population. A neat feature is: gen byte democrat=(partyid==1) creates a dummy variable “democrat” that is equal to 1 if partyid==1 and 0 otherwise. Cases where partyid is missing are also coded 0, so you generally need to set them to missing in the new variable, which is best done with the replace command. Note that byte is a storage type rather than a command: it makes the new variable a “byte” variable, occupying less space in the data set. Functions such as mean, combined with the by option, are handled by egen (see below). E.g., egen incstate=mean(income),by(state) will give you a contextual variable of the mean income level in each state, computed from income figures and the state code (in variable state).
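
The dummy-variable idiom above can be tried on a small made-up data set. This is only a sketch: the four partyid values are hypothetical, chosen so that one case is missing.

```stata
* Toy data (hypothetical): partyid with one missing case
clear
input partyid
1
2
3
.
end

* (partyid==1) evaluates to 1 when true and 0 otherwise,
* including the case where partyid is missing
gen byte democrat = (partyid==1)

* Set the dummy to missing wherever partyid itself is missing,
* so that nonrespondents are not counted as non-Democrats
replace democrat = . if partyid==.

list
```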
egen
    like gen (it stands for extensions to generate), egen handles some more complex functions. While egen x=mean(var1) will produce a constant containing the mean for var1, egen z=rmean(var2 var3 var4 var7) will produce a variable that has the row means for variables 2, 3, 4, and 7. rsum, e.g., gives the row sum of the variables in the list and rmiss gives the number of missing values among the variables in the list.
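
A short sketch of these egen functions, assuming a made-up data set with a state code, an income figure, and two items var2 and var3. The function names follow this sheet; newer STATA releases document them as rowmean(), rowtotal() and rowmiss(), and group means there are usually written with a by prefix (bysort state: egen ...) rather than the by() option.

```stata
* Toy data (hypothetical): two states, with some missing item values
clear
input state income var2 var3
1 100 1 2
1 200 3 .
2 300 5 6
2 500 . .
end

* Constant: the overall mean of income, repeated on every case
egen avginc = mean(income)

* Contextual variable: the mean of income within each state
egen incstate = mean(income), by(state)

* Row functions across var2 and var3: mean, sum, and number of missing values
egen rowavg = rmean(var2 var3)
egen rowtot = rsum(var2 var3)
egen nmiss = rmiss(var2 var3)

list
```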
replace
    like generate, but the variable already exists. E.g.: replace democrat=. if partyid==. (both periods are the missing value code). More generally: gen...