******************************************************************************************** This macro accompanies our manuscript (Desai RJ et al. Am J Epidemiol. 2018 Nov 01, 187(11):2439-2448) and is intended to calculate disease risk scores (DRS) predicting 1-year risk for an outcome of interest based on pre-specified risk factors in a Cox proportional hazards model. The macro allows for DRS calculation based on one of three development approaches, 1) unexposed only- model is constructed only in the unexposed (reference) group of the study population and then applied to the whole study population for DRS calculation, 2) whole cohort- model is constructed in the whole study population, a treatment indicator is included when constructing the model but treatment is set to 0 (refernece level) when applying the model to the study popuation for DRS calculation 3) an external cohort- model is constructed in an external cohort of unexposed (reference treated) patients (eg a historical cohort, cohort from a second database) and then applied to the whole study population for DRS calculation, For further details on these approaches and pros/cons of each, the reader is referred to Glynn et al. Pharmacoepidemiol Drug Saf. 2012 May, 21(Suppl 2): 138–147. The following is a summary of required parameters of this macro -------------------------------------------------------------------------------- method = options include 1= unexposed only, 2= whole cohort, 3= external cohort -------------------------------------------------------------------------------- study_cohort= patient level analytic dataset containing risk factor information for DRS calculation -------------------------------------------------------------------------------- external_cohort = Only relavent if method= 3, where an external cohort must be specified in which DRS model should be developed. Will only calculate score if sas names are consistent between the study cohort and development cohort -------------------------------------------------------------------------------- model_var= A full list of variables in the DRS model- including class variables -------------------------------------------------------------------------------- class_var= List of class variables in the DRS model -------------------------------------------------------------------------------- fu_time= Follow-up time- must be provided in days -------------------------------------------------------------------------------- event= outcome indicator -------------------------------------------------------------------------------- treatment = treatment variable, reference must be coded as 0 -------------------------------------------------------------------------------- out_data= output dataset name with calculated DRS -------------------------------------------------------------------------------- NOTE 1- This macro cannot handle categorical predictors with >2 categories, dummy variables for (n-1) classes need to be provided for accrate risk score estimation NOTE 2- All binary risk factors must be provided as numerically coded variables (0/1) for best performance ********************************************************************************************; options mprint mlogic; %macro drs_calculation ( method = , study_cohort= , external_cohort = , model_var= , class_var= , fu_time= , event= , treatment = , out_data= ); *** Depending upon the method option above, we determine the data in which the model will be developed ***; %if &method=1 %then %do; data development_cohort; set &study_cohort; where &treatment=0; run; %end; %if &method=2 %then %do; data development_cohort; set &study_cohort; run; %end; %if &method=3 %then %do; data development_cohort; set &external_cohort; run; %end; *** Next, we set all the covariate values to zero for estimating the baseline risk/hazard in the baseline statement of the following phreg procedures; data covars; retain &model_var 0; %if %bquote(&method)='full cohort' %then retain &treatment 0;; ** treatment needs to be modeled in the whole cohort development appraoch; run; ** construct the DRS model in the development cohort; ods output ParameterEstimates=betas_development_cohort ; proc phreg data=development_cohort namelen=32; class &class_var/ref=first param=ref; model &fu_time*&event(0)= %if (&method)=2 %then &treatment; &model_var; baseline out=phout_development_cohort survival=surv covariates=covars; run; ods output close; *Create a macro variable containing the survival probability at the specified event time when the covariate vector is zero; %let time_point=365; ** to provide 1 year risk estiamtes; data phout_development_cohort; set phout_development_cohort; proxy_time= abs(&time_point-&fu_time); ** to extract out survival probability at the closest time to the 1 year mark; run; proc sort data=phout_development_cohort out=forsurv; by proxy_time; run; data forsurv1; set forsurv (obs=1); run; proc sql noprint; select (surv) into:survival from forsurv1; quit; ** drop the treatment variable, ie set it to 0 when calculating the DRS- only relavent for the full cohort approach because that is the only appraoch that estimates coefficients for included variables along with the treatment coefficient; data betas_development_cohort; set betas_development_cohort; if parameter="&treatment" then delete; run; ** enumerate the remaining variables; data betas_development_cohort; set betas_development_cohort; covar=_n_; run; proc sql noprint; select max(covar) into:total_Covar from betas_development_cohort; quit; ** save the parameter estiamtes and the corresponding variable names into macro variables; %do c = 1 %to &total_covar; proc sql noprint; select Estimate into :a1_&c from betas_development_cohort where covar=&c; select Parameter into:c1_&c from betas_development_cohort where covar=&c; quit; %end; quit; ** Calculate the DRS in the application cohort; data &out_data; set &study_cohort ; %do c = 1 %to &total_covar; product1_&c = &&a1_&c.*&&c1_&c.; %end; **output drs on the probability scale; drs=1-(&survival**exp(sum(OF product1_:))); lin_drs= sum(OF product1_:); drop product1_:; run; ** check distribution; proc univariate data=&out_data; var drs lin_drs; histogram; run; %mend;