diff --git a/.doctrees/auto_examples/plot_01_survival_analysis.doctree b/.doctrees/auto_examples/plot_01_survival_analysis.doctree
index ca5a4d7..2035684 100644
Binary files a/.doctrees/auto_examples/plot_01_survival_analysis.doctree and b/.doctrees/auto_examples/plot_01_survival_analysis.doctree differ
diff --git a/.doctrees/auto_examples/plot_02_marginal_cumulative_incidence_estimation.doctree b/.doctrees/auto_examples/plot_02_marginal_cumulative_incidence_estimation.doctree
index 386a142..f1fd397 100644
Binary files a/.doctrees/auto_examples/plot_02_marginal_cumulative_incidence_estimation.doctree and b/.doctrees/auto_examples/plot_02_marginal_cumulative_incidence_estimation.doctree differ
diff --git a/.doctrees/environment.pickle b/.doctrees/environment.pickle
index f604871..4463f38 100644
Binary files a/.doctrees/environment.pickle and b/.doctrees/environment.pickle differ
diff --git a/_downloads/07fcc19ba03226cd3d83d4e40ec44385/auto_examples_python.zip b/_downloads/07fcc19ba03226cd3d83d4e40ec44385/auto_examples_python.zip
index ac170d1..1abc9a3 100644
Binary files a/_downloads/07fcc19ba03226cd3d83d4e40ec44385/auto_examples_python.zip and b/_downloads/07fcc19ba03226cd3d83d4e40ec44385/auto_examples_python.zip differ
diff --git a/_downloads/28dc0b4663d142f699c8b1869d27757f/plot_02_marginal_cumulative_incidence_estimation.zip b/_downloads/28dc0b4663d142f699c8b1869d27757f/plot_02_marginal_cumulative_incidence_estimation.zip
index 8b275c4..e83ecd1 100644
Binary files a/_downloads/28dc0b4663d142f699c8b1869d27757f/plot_02_marginal_cumulative_incidence_estimation.zip and b/_downloads/28dc0b4663d142f699c8b1869d27757f/plot_02_marginal_cumulative_incidence_estimation.zip differ
diff --git a/_downloads/6f1e7a639e0699d6164445b55e6c116d/auto_examples_jupyter.zip b/_downloads/6f1e7a639e0699d6164445b55e6c116d/auto_examples_jupyter.zip
index 873d7b0..4d5752a 100644
Binary files a/_downloads/6f1e7a639e0699d6164445b55e6c116d/auto_examples_jupyter.zip and b/_downloads/6f1e7a639e0699d6164445b55e6c116d/auto_examples_jupyter.zip differ
diff --git a/_downloads/a079d3dcf2a2ab50cd0cea38604de27d/plot_01_survival_analysis.zip b/_downloads/a079d3dcf2a2ab50cd0cea38604de27d/plot_01_survival_analysis.zip
index 9ec5db9..2a69b5f 100644
Binary files a/_downloads/a079d3dcf2a2ab50cd0cea38604de27d/plot_01_survival_analysis.zip and b/_downloads/a079d3dcf2a2ab50cd0cea38604de27d/plot_01_survival_analysis.zip differ
diff --git a/_downloads/a6916f06450964ef8d10eb5f311100d1/plot_01_survival_analysis.ipynb b/_downloads/a6916f06450964ef8d10eb5f311100d1/plot_01_survival_analysis.ipynb
index 83571c4..af20935 100644
--- a/_downloads/a6916f06450964ef8d10eb5f311100d1/plot_01_survival_analysis.ipynb
+++ b/_downloads/a6916f06450964ef8d10eb5f311100d1/plot_01_survival_analysis.ipynb
@@ -44,7 +44,7 @@
       },
       "outputs": [],
       "source": [
-        "from sklearn.model_selection import train_test_split\n\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\nX_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2)"
+        "from sklearn.model_selection import train_test_split\n\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)"
       ]
     },
     {
diff --git a/_downloads/accdde75b290e8d01376db9bd16d592c/plot_01_survival_analysis.py b/_downloads/accdde75b290e8d01376db9bd16d592c/plot_01_survival_analysis.py
index 969eb75..316bbfd 100644
--- a/_downloads/accdde75b290e8d01376db9bd16d592c/plot_01_survival_analysis.py
+++ b/_downloads/accdde75b290e8d01376db9bd16d592c/plot_01_survival_analysis.py
@@ -48,8 +48,7 @@
 # %%
 from sklearn.model_selection import train_test_split
 
-X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
-X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2)
+X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
 
 # %%
 #
diff --git a/_images/sphx_glr_plot_01_survival_analysis_001.png b/_images/sphx_glr_plot_01_survival_analysis_001.png
index 2a05188..c5b989a 100644 Binary files a/_images/sphx_glr_plot_01_survival_analysis_001.png and b/_images/sphx_glr_plot_01_survival_analysis_001.png differ diff --git a/_images/sphx_glr_plot_01_survival_analysis_002.png b/_images/sphx_glr_plot_01_survival_analysis_002.png index b3df926..c54d88c 100644 Binary files a/_images/sphx_glr_plot_01_survival_analysis_002.png and b/_images/sphx_glr_plot_01_survival_analysis_002.png differ diff --git a/_images/sphx_glr_plot_01_survival_analysis_003.png b/_images/sphx_glr_plot_01_survival_analysis_003.png index c819ad0..5554557 100644 Binary files a/_images/sphx_glr_plot_01_survival_analysis_003.png and b/_images/sphx_glr_plot_01_survival_analysis_003.png differ diff --git a/_images/sphx_glr_plot_01_survival_analysis_thumb.png b/_images/sphx_glr_plot_01_survival_analysis_thumb.png index 4045d7a..f788927 100644 Binary files a/_images/sphx_glr_plot_01_survival_analysis_thumb.png and b/_images/sphx_glr_plot_01_survival_analysis_thumb.png differ diff --git a/_sources/auto_examples/plot_01_survival_analysis.rst.txt b/_sources/auto_examples/plot_01_survival_analysis.rst.txt index ba10d8f..f38fa35 100644 --- a/_sources/auto_examples/plot_01_survival_analysis.rst.txt +++ b/_sources/auto_examples/plot_01_survival_analysis.rst.txt @@ -187,14 +187,13 @@ In this dataset, approximately 42% of the data is censored.. -.. GENERATED FROM PYTHON SOURCE LINES 49-54 +.. GENERATED FROM PYTHON SOURCE LINES 49-53 .. code-block:: Python from sklearn.model_selection import train_test_split - X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2) - X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2) + X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3) @@ -203,7 +202,7 @@ In this dataset, approximately 42% of the data is censored.. -.. GENERATED FROM PYTHON SOURCE LINES 55-71 +.. 
GENERATED FROM PYTHON SOURCE LINES 54-70 Using SurvivalBoost to estimate the survival function ----------------------------------------------------- @@ -222,7 +221,7 @@ SurvivalBoost is a scikit-learn compatible model which expects a covariates data (or array-like) ``X``, and a target dataframe ``y`` with columns "event" and "duration". This allows SurvivalBoost to estimate the survival function :math:`S`. -.. GENERATED FROM PYTHON SOURCE LINES 72-78 +.. GENERATED FROM PYTHON SOURCE LINES 71-77 .. code-block:: Python @@ -649,7 +648,7 @@ SurvivalBoost is a scikit-learn compatible model which expects a covariates data

-.. GENERATED FROM PYTHON SOURCE LINES 79-84 +.. GENERATED FROM PYTHON SOURCE LINES 78-83 SurvivalBoost can then predict the survival function for each patient, according to some time grid of horizons. @@ -657,7 +656,7 @@ according to some time grid of horizons. with the parameter ``times``. When ``times`` is set to ``None``, the model will used the learned time grid. -.. GENERATED FROM PYTHON SOURCE LINES 85-94 +.. GENERATED FROM PYTHON SOURCE LINES 84-93 .. code-block:: Python @@ -677,11 +676,11 @@ When ``times`` is set to ``None``, the model will used the learned time grid. -.. GENERATED FROM PYTHON SOURCE LINES 95-96 +.. GENERATED FROM PYTHON SOURCE LINES 94-95 Let's plot the estimated survival function for some patients. -.. GENERATED FROM PYTHON SOURCE LINES 96-127 +.. GENERATED FROM PYTHON SOURCE LINES 95-126 .. code-block:: Python @@ -728,7 +727,7 @@ Let's plot the estimated survival function for some patients. -.. GENERATED FROM PYTHON SOURCE LINES 128-138 +.. GENERATED FROM PYTHON SOURCE LINES 127-137 Measuring features impact on predictions ---------------------------------------- @@ -741,7 +740,7 @@ features to eliminate correlations. We create a synthetic dataset where age (``x8``) is resampled to reduce confounder bias. -.. GENERATED FROM PYTHON SOURCE LINES 139-187 +.. GENERATED FROM PYTHON SOURCE LINES 138-186 .. code-block:: Python @@ -805,7 +804,7 @@ confounder bias. -.. GENERATED FROM PYTHON SOURCE LINES 188-193 +.. GENERATED FROM PYTHON SOURCE LINES 187-192 Unsurprisingly, the cumulative incidence of death mostly increases with age. We can do the same thing with chemotherapy treatement. @@ -813,7 +812,7 @@ We can do the same thing with chemotherapy treatement. Let's create a synthetic dataset where chemotherapy (``x6``) alternates between 0 and 1. -.. GENERATED FROM PYTHON SOURCE LINES 194-235 +.. GENERATED FROM PYTHON SOURCE LINES 193-234 .. code-block:: Python @@ -870,7 +869,7 @@ alternates between 0 and 1. -.. 
GENERATED FROM PYTHON SOURCE LINES 236-304 +.. GENERATED FROM PYTHON SOURCE LINES 235-303 People treated with chemotherapy likely have more advanced stages of cancer, which is reflected by the lower estimated survival function. This serves as a reminder that @@ -941,7 +940,7 @@ summarize the Brier score in time: \mathrm{BS(t)} dt -.. GENERATED FROM PYTHON SOURCE LINES 305-315 +.. GENERATED FROM PYTHON SOURCE LINES 304-314 .. code-block:: Python @@ -963,17 +962,17 @@ summarize the Brier score in time: .. code-block:: none - IBS for SurvivalBoost: 0.1382 + IBS for SurvivalBoost: 0.1439 -.. GENERATED FROM PYTHON SOURCE LINES 316-318 +.. GENERATED FROM PYTHON SOURCE LINES 315-317 We can compare this to the Integrated Brier score of a simple Kaplan-Meier estimator, which doesn't take the patient features into account. -.. GENERATED FROM PYTHON SOURCE LINES 319-339 +.. GENERATED FROM PYTHON SOURCE LINES 318-338 .. code-block:: Python @@ -1005,16 +1004,16 @@ which doesn't take the patient features into account. .. code-block:: none - IBS for Kaplan-Meier: 0.1566 + IBS for Kaplan-Meier: 0.1653 -.. GENERATED FROM PYTHON SOURCE LINES 340-341 +.. GENERATED FROM PYTHON SOURCE LINES 339-340 Let's also compute the concordance index for both the Kaplan-Meier and SurvivalBoost. -.. GENERATED FROM PYTHON SOURCE LINES 344-353 +.. GENERATED FROM PYTHON SOURCE LINES 343-352 .. code-block:: Python @@ -1040,13 +1039,13 @@ Let's also compute the concordance index for both the Kaplan-Meier and SurvivalB -.. GENERATED FROM PYTHON SOURCE LINES 354-357 +.. GENERATED FROM PYTHON SOURCE LINES 353-356 0.5 corresponds to random chance, which makes sense as the Kaplan-Meier estimator doesn't depend on the patient features. -.. GENERATED FROM PYTHON SOURCE LINES 358-365 +.. GENERATED FROM PYTHON SOURCE LINES 357-364 .. code-block:: Python @@ -1073,7 +1072,7 @@ doesn't depend on the patient features. .. 
rst-class:: sphx-glr-timing - **Total running time of the script:** (0 minutes 6.993 seconds) + **Total running time of the script:** (0 minutes 7.376 seconds) .. _sphx_glr_download_auto_examples_plot_01_survival_analysis.py: diff --git a/_sources/auto_examples/plot_02_marginal_cumulative_incidence_estimation.rst.txt b/_sources/auto_examples/plot_02_marginal_cumulative_incidence_estimation.rst.txt index 995ef01..73d7d18 100644 --- a/_sources/auto_examples/plot_02_marginal_cumulative_incidence_estimation.rst.txt +++ b/_sources/auto_examples/plot_02_marginal_cumulative_incidence_estimation.rst.txt @@ -277,15 +277,15 @@ theoretical CIFs: .. code-block:: none - Integrated theoretical any event survival curve in 0.662 s - SurvivalBoost fit: 2.690 s - SurvivalBoost prediction: 2.927 s - Integrated theoretical cumulative incidence curve for event 1 in 2.988 s - Aalen-Johansen for event 1 fit in 4.937 s - Integrated theoretical cumulative incidence curve for event 2 in 5.032 s - Aalen-Johansen for event 2 fit in 5.018 s - Integrated theoretical cumulative incidence curve for event 3 in 5.096 s - Aalen-Johansen for event 3 fit in 4.976 s + Integrated theoretical any event survival curve in 0.614 s + SurvivalBoost fit: 2.766 s + SurvivalBoost prediction: 2.911 s + Integrated theoretical cumulative incidence curve for event 1 in 2.971 s + Aalen-Johansen for event 1 fit in 5.112 s + Integrated theoretical cumulative incidence curve for event 2 in 5.210 s + Aalen-Johansen for event 2 fit in 5.024 s + Integrated theoretical cumulative incidence curve for event 3 in 5.102 s + Aalen-Johansen for event 3 fit in 4.967 s @@ -328,15 +328,15 @@ of censoring. .. 
code-block:: none - Integrated theoretical any event survival curve in 0.591 s - SurvivalBoost fit: 2.705 s - SurvivalBoost prediction: 2.940 s - Integrated theoretical cumulative incidence curve for event 1 in 3.000 s - Aalen-Johansen for event 1 fit in 4.967 s - Integrated theoretical cumulative incidence curve for event 2 in 5.058 s - Aalen-Johansen for event 2 fit in 4.967 s - Integrated theoretical cumulative incidence curve for event 3 in 5.045 s - Aalen-Johansen for event 3 fit in 5.035 s + Integrated theoretical any event survival curve in 0.576 s + SurvivalBoost fit: 2.708 s + SurvivalBoost prediction: 2.914 s + Integrated theoretical cumulative incidence curve for event 1 in 2.974 s + Aalen-Johansen for event 1 fit in 4.936 s + Integrated theoretical cumulative incidence curve for event 2 in 5.027 s + Aalen-Johansen for event 2 fit in 4.917 s + Integrated theoretical cumulative incidence curve for event 3 in 4.995 s + Aalen-Johansen for event 3 fit in 4.988 s @@ -360,7 +360,7 @@ the large time horizons: .. rst-class:: sphx-glr-timing - **Total running time of the script:** (0 minutes 43.187 seconds) + **Total running time of the script:** (0 minutes 43.202 seconds) .. _sphx_glr_download_auto_examples_plot_02_marginal_cumulative_incidence_estimation.py: diff --git a/auto_examples/plot_01_survival_analysis.html b/auto_examples/plot_01_survival_analysis.html index 40e5788..37f407f 100644 --- a/auto_examples/plot_01_survival_analysis.html +++ b/auto_examples/plot_01_survival_analysis.html @@ -506,8 +506,7 @@
from sklearn.model_selection import train_test_split
 
-X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
-X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2)
+X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
 
@@ -1157,7 +1156,7 @@

Survival model evaluationprint(f"IBS for SurvivalBoost: {ibs_survboost:.4f}") - -
-

Total running time of the script: (0 minutes 6.993 seconds)

+

Total running time of the script: (0 minutes 7.376 seconds)

-Cause-specific cumulative incidence functions (0.0% censoring), Event 1, Event 2, Event 3
Integrated theoretical any event survival curve in 0.662 s
-SurvivalBoost fit: 2.690 s
-SurvivalBoost prediction: 2.927 s
-Integrated theoretical cumulative incidence curve for event 1 in 2.988 s
-Aalen-Johansen for event 1 fit in 4.937 s
-Integrated theoretical cumulative incidence curve for event 2 in 5.032 s
-Aalen-Johansen for event 2 fit in 5.018 s
-Integrated theoretical cumulative incidence curve for event 3 in 5.096 s
-Aalen-Johansen for event 3 fit in 4.976 s
+Cause-specific cumulative incidence functions (0.0% censoring), Event 1, Event 2, Event 3
Integrated theoretical any event survival curve in 0.614 s
+SurvivalBoost fit: 2.766 s
+SurvivalBoost prediction: 2.911 s
+Integrated theoretical cumulative incidence curve for event 1 in 2.971 s
+Aalen-Johansen for event 1 fit in 5.112 s
+Integrated theoretical cumulative incidence curve for event 2 in 5.210 s
+Aalen-Johansen for event 2 fit in 5.024 s
+Integrated theoretical cumulative incidence curve for event 3 in 5.102 s
+Aalen-Johansen for event 3 fit in 4.967 s
 

@@ -590,15 +590,15 @@

CIFs estimated on censored dataplot_cumulative_incidence_functions(survival_boost=survival_boost, aj=aj, y=y_censored) -Cause-specific cumulative incidence functions (40.4% censoring), Event 1, Event 2, Event 3
Integrated theoretical any event survival curve in 0.591 s
-SurvivalBoost fit: 2.705 s
-SurvivalBoost prediction: 2.940 s
-Integrated theoretical cumulative incidence curve for event 1 in 3.000 s
-Aalen-Johansen for event 1 fit in 4.967 s
-Integrated theoretical cumulative incidence curve for event 2 in 5.058 s
-Aalen-Johansen for event 2 fit in 4.967 s
-Integrated theoretical cumulative incidence curve for event 3 in 5.045 s
-Aalen-Johansen for event 3 fit in 5.035 s
+Cause-specific cumulative incidence functions (40.4% censoring), Event 1, Event 2, Event 3
Integrated theoretical any event survival curve in 0.576 s
+SurvivalBoost fit: 2.708 s
+SurvivalBoost prediction: 2.914 s
+Integrated theoretical cumulative incidence curve for event 1 in 2.974 s
+Aalen-Johansen for event 1 fit in 4.936 s
+Integrated theoretical cumulative incidence curve for event 2 in 5.027 s
+Aalen-Johansen for event 2 fit in 4.917 s
+Integrated theoretical cumulative incidence curve for event 3 in 4.995 s
+Aalen-Johansen for event 3 fit in 4.988 s
 

Note that the Aalen-Johansen estimator is unbiased and empirically recovers @@ -613,7 +613,7 @@

CIFs estimated on censored dataTotal running time of the script: (0 minutes 43.187 seconds)

+

Total running time of the script: (0 minutes 43.202 seconds)