A predictive risk probability approach for microarray data with survival as an endpoint

Chen Dung Tsa<sup>*</sup>; Schell Michael J; Chen James J; Fulp William J; Eschrich Steven; Yeatman Timothy

doi:10.1080/10543400802277967

Abstract

Gene expression profiling has played an important role in cancer risk classification and has shown promising results. Because gene expression profiling often involves selecting a set of top-ranked genes for analysis, it is important to evaluate how modeling performance varies with the number of top-ranked genes incorporated in the model. We used a colon cancer data set collected at Moffitt Cancer Center as an example, and ranked genes based on the univariate Cox proportional hazards model. Sets of top-ranked genes were selected for evaluation by choosing the top k ranked genes for k = 1 to 12,500. The analysis indicated considerable variation in classification outcomes as the number of top-ranked genes changed. We developed a predictive risk probability approach to accommodate this variation by identifying a range of numbers of top-ranked genes. For each number of top-ranked genes, the procedure classifies each patient as high risk (score = 1) or low risk (score = 0). The categorizations are then averaged, giving a risk score between 0 and 1 and thus a ranking of each patient's need for further treatment. We applied this approach to the colon cancer data set and demonstrated its strength by three criteria. First, a univariate Cox proportional hazards model showed a highly statistically significant result (log-rank χ² statistic = 110, p-value < 10⁻¹⁶) for the predictive risk probability classification. Second, a survival tree model used the risk probability to partition patients into five risk groups with good separation of survival curves (log-rank χ² statistic = 215). In addition, use of the risk group status identified a small set of risk genes that may be practical for biological validation. Third, a resampling analysis of the risk probability suggested that the variation pattern of the log-rank χ² statistic in the colon cancer data set was unlikely to be caused by chance.
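The averaging step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how each k-gene model assigns a patient to the high- or low-risk class, so the classifier below (a signed mean of standardized expression over the top k genes, split at the median) is only a placeholder assumption. The function name `predictive_risk_probability`, its parameters, and the simulated data are likewise hypothetical; in practice the gene ranking and hazard directions would come from the univariate Cox models.

```python
import numpy as np

def predictive_risk_probability(expr, ranked_genes, signs, k_values):
    """Average binary risk calls over several choices of k.

    expr         : (n_patients, n_genes) expression matrix
    ranked_genes : gene column indices, best-ranked first (assumed given)
    signs        : +1/-1 per ranked gene, direction of the hazard (assumed given)
    k_values     : numbers of top-ranked genes to evaluate
    """
    expr = np.asarray(expr, dtype=float)
    # Standardize each gene so genes contribute on a comparable scale.
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    votes = []
    for k in k_values:
        top = ranked_genes[:k]
        # Placeholder classifier (an assumption, not the paper's model):
        # signed mean of the top-k genes, with a median split.
        score = (z[:, top] * signs[:k]).mean(axis=1)
        votes.append((score > np.median(score)).astype(int))  # 1 = high risk
    # Fraction of k values for which each patient was called high risk.
    return np.mean(votes, axis=0)

# Simulated data only; no real expression values are used here.
rng = np.random.default_rng(0)
expr = rng.normal(size=(20, 50))
ranked_genes = np.arange(50)   # pretend the columns are already ranked
signs = np.ones(50)            # pretend every gene increases the hazard
prob = predictive_risk_probability(expr, ranked_genes, signs, range(1, 11))
```

Each entry of `prob` is the proportion of the evaluated k values at which that patient was classified as high risk, so patients can be ordered by it regardless of which single k would otherwise have been chosen.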

Publication date: 2008
Institution: China Medical University


Last updated: 2018-08-02 14:52
