How to resolve a discrepancy in 95% confidence intervals from the survival function in R
I'm in the process of writing functions to extract information from the results of a survival analysis, and I ran into a discrepancy between my extraction of the lower and upper survival times for a specified 95% confidence interval and what is reported by the package summary.

I'm using the survival package (v 2.37-7) in R (v 3.1.2).

The problem is that my extraction of the lower and/or upper boundary of the 95% CI for the median survival time does not always match what is returned when I evaluate the results of survfit. When I inspect the data, I believe the results of survfit are wrong, as it appears to be returning the boundary+1 value (again, only sometimes). Here is some data to illustrate the problem.
    # Fit the data stratified by gender of the subject
    survfit30sex <- survfit(Surv(thirtydaysuicides$daysfrominvestigation) ~ thirtydaysuicides$sex)

    # Display the median survival and confidence interval
    survfit30sex
    Call: survfit(formula = Surv(thirtydaysuicides$daysfrominvestigation) ~ thirtydaysuicides$sex)

                            records n.max n.start events median 0.95LCL 0.95UCL
    thirtydaysuicides$sex=1      35    35      35     35     15       9      20
    thirtydaysuicides$sex=2      93    93      93     93      9       6      13
survfit determines the lower and upper boundaries for sex = 1 to be 9 days and 20 days respectively. When I inspect the data, it seems the upper boundary should be 19, not 20.

Here is the actual data; I'm only showing sex=1, which is where the discrepancy is. I've cut out the values before and after the critical part to make the data easier to read.
    Call: survfit(formula = Surv(thirtydaysuicides$daysfrominvestigation) ~ thirtydaysuicides$sex)

    summary( thirtydaysuicides$sex=1 )
     time n.risk n.event survival std.err lower 95% CI upper 95% CI
        9     24       2   0.6286  0.0817      0.48725        0.811
       10     22       1   0.6000  0.0828      0.45780        0.786
       11     21       1   0.5714  0.0836      0.42890        0.761
       13     20       1   0.5429  0.0842      0.40055        0.736
       14     19       1   0.5143  0.0845      0.37272        0.710
       15     18       1   0.4857  0.0845      0.34541        0.683
       16     17       1   0.4571  0.0842      0.31861        0.656
       17     16       3   0.3714  0.0817      0.24138        0.572
       19     13       1   0.3429  0.0802      0.21673        0.542
       20     12       2   0.2857  0.0764      0.16921        0.482
       21     10       2   0.2286  0.0710      0.12437        0.420
       22      8       1   0.2000  0.0676      0.10310        0.388
As I understand it, the lower 95% CI for the median survival time is 0.34541. Searching downwards in the survival column for the first value < 0.34541, I find it in the row associated with a survival time of 19 (survival = 0.3429). Isn't that the upper bound? Why does survfit return an upper survival time of 20? I've automated this algorithm, and most of the time it matches the output of survfit, but not always.

This leads me to think that either there is an unusual error in the survival package (which I doubt), or I'm finding the boundary incorrectly (most likely).
--------- Update

Unfortunately I don't know how to link a data file to the question, but the data is pretty short so I can put it here. Note that I eliminated the stratification by sex to simplify; this is the data for females, which has the discrepancy.

It occurs to me that I may be approaching this incorrectly, and that perhaps the 95% CI is being computed from the standard error, not looked up the way I'm thinking of it, though I'm having similar problems either way. So the question is, more generally: how does one pull out the Xth percentile survival time and its corresponding 95% CI, in units of time, from a survfit object?

Here is the survival input data via dput, and an unstructured copy below that.
    structure(list(daysfrominvestigation = c(27L, 27L, 10L, 20L, 15L, 21L,
    27L, 1L, 9L, 22L, 29L, 14L, 4L, 19L, 7L, 3L, 2L, 7L, 21L, 4L, 17L,
    20L, 16L, 2L, 9L, 7L, 17L, 2L, 17L, 26L, 25L, 11L, 3L, 13L, 27L),
    censored = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1)), class = "data.frame", row.names = c(NA, -35L),
    .names = c("daysfrominvestigation", "censored"))

       daysfrominvestigation censored
    1                     27        1
    2                     27        1
    3                     10        1
    4                     20        1
    5                     15        1
    6                     21        1
    7                     27        1
    8                      1        1
    9                      9        1
    10                    22        1
    11                    29        1
    12                    14        1
    13                     4        1
    14                    19        1
    15                     7        1
    16                     3        1
    17                     2        1
    18                     7        1
    19                    21        1
    20                     4        1
    21                    17        1
    22                    20        1
    23                    16        1
    24                     2        1
    25                     9        1
    26                     7        1
    27                    17        1
    28                     2        1
    29                    17        1
    30                    26        1
    31                    25        1
    32                    11        1
    33                     3        1
    34                    13        1
    35                    27        1
I have an answer to my own question, or at least an approximate answer if not the best answer.

The main problem I was having was failing to use a weighted average. In my question, I was interested in the median survival time, where survival = 0.5. The data didn't produce events at the precise median time: I have a survival probability of 0.5143 at 14 days and 0.4857 at 15 days, so the weighted average of these rounds to 15 days.

The second problem was misunderstanding how to use the confidence intervals. To match what the survival package reports for the lower bound of the median survival interval, one searches the lower bound vector for the first value less than the median survival probability (0.5), and computes the weighted average of the time values just above and just below that point. Likewise, for the upper bound one searches the upper bound vector to find the target interval and computes the weighted average. For example, the upper bound of the median survival falls between 19 days and 20 days, and the weighted average rounds to 20 days.
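A minimal sketch of that lookup in base R, using the data from the question. It rests on two assumptions that are not confirmed against the package source: that the confidence limits are the log-transformed Greenwood limits, and that each reported value is simply the first event time at which the corresponding curve drops to 0.5 or below. That replaces the weighted-average step with a plain threshold lookup, which gives the same numbers for this data. The computed limits agree with the lower and upper 95% CI columns in the summary output at times 9, 19 and 20. The names first_at_or_below, median_est, lcl and ucl are made up for illustration.

```r
# Data from the question; every observation is an event (no censoring).
days <- c(27, 27, 10, 20, 15, 21, 27, 1, 9, 22, 29, 14, 4, 19, 7, 3, 2, 7,
          21, 4, 17, 20, 16, 2, 9, 7, 17, 2, 17, 26, 25, 11, 3, 13, 27)

# Kaplan-Meier estimate computed by hand.
event_times <- sort(unique(days))
n_event     <- as.vector(table(days))   # events per distinct time, in sorted order
n_risk      <- length(days) - c(0, head(cumsum(n_event), -1))
surv        <- cumprod(1 - n_event / n_risk)

# Greenwood standard error of log(S), then log-transformed 95% limits
# (assumed here to be what the package does by default).
# At the last event time S = 0, so the limits there are degenerate (0 / NaN);
# which() below ignores NaN, and the lookup stops long before that row anyway.
se_log <- sqrt(cumsum(n_event / (n_risk * (n_risk - n_event))))
z      <- qnorm(0.975)
lower  <- surv * exp(-z * se_log)
upper  <- pmin(1, surv * exp(z * se_log))

# Hypothetical helper: first event time at which a curve is at or below p.
# (Does not handle the special case where the curve equals p exactly.)
first_at_or_below <- function(curve, p = 0.5) {
  event_times[which(curve <= p)[1]]
}

median_est <- first_at_or_below(surv)    # 15
lcl        <- first_at_or_below(lower)   # 9
ucl        <- first_at_or_below(upper)   # 20
print(c(median = median_est, lcl = lcl, ucl = ucl))
```

Note that the lookup is done on the lower and upper curves themselves, not on the survival column using the CI value from the median's row, which is what produced 19 instead of 20 in the question.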
I haven't tracked down the survival code to confirm how this is done properly, but in my case I've got 50 specific combinations of survival fits looking at different time periods and different moderators, and I am matching the median output provided by the survival package 100%.
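For the more general question of pulling an Xth percentile and its interval out of a fitted object, here is a hedged sketch that assumes the installed survival version provides a quantile method for survfit objects (recent versions do; I have not checked 2.37-7 specifically). The variable names fit and q are illustrative only.

```r
library(survival)

# Data from the question; Surv() with no status argument treats all as events.
days <- c(27, 27, 10, 20, 15, 21, 27, 1, 9, 22, 29, 14, 4, 19, 7, 3, 2, 7,
          21, 4, 17, 20, 16, 2, 9, 7, 17, 2, 17, 26, 25, 11, 3, 13, 27)

fit <- survfit(Surv(days) ~ 1)

# Ask for the median (50th percentile) together with its confidence limits.
# The result is a list with components quantile, lower and upper, in time units.
q <- quantile(fit, probs = 0.5, conf.int = TRUE)
print(q)
```

For this data the three components should reproduce the 15, 9 and 20 that survfit prints for the median, 0.95LCL and 0.95UCL. Other percentiles can be requested by changing probs, for example probs = c(0.25, 0.5, 0.75).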
I hope anyone who runs into the same question is helped by this summary, and if anyone wants to help correct or refine my understanding, that's welcome.
r survival-analysis