Codebook for small-cell lung cancer dataset
Codebook: Small-Cell Lung Cancer Dataset
Where to find it
- Used in: Lab 4b: logistic regression
-
File name:
Stats4- more.csv - Location: Available online at https://media.githubusercontent.com/media/CUNY-epibios/PUBH614/tree/main/datasets
- Download the dataset now
Code to load this dataset into R, and ensure all columns are read in as factor type:
Dataset Overview
This dataset contains information about small-cell lung cancer cases, including smoking status, cancer status, and patient sex. The dataset includes 209 observations with 3 variables.
Variables
smoke
- Description: Smoking status of the patient
- Type: Binary (0/1)
-
Values:
- 0 = Non-smoker
- 1 = Smoker
- Missing values: None
lungca
- Description: Lung cancer status
- Type: Binary (0/1)
-
Values:
- 0 = No lung cancer
- 1 = Has lung cancer
- Missing values: None
sex
- Description: Biological sex of the patient
- Type: Binary (M/F)
-
Values:
- M = Male
- F = Female
- Missing values: None
Summary Statistics
Data Quality Notes
- The dataset is complete with no missing values
- All variables are coded consistently
-
smokeandlungcause 0/1 coding -
sexis coded using single-letter values (M/F)
Acknowledgements
This codebook was drafted by Microsoft Copilot and edited by Levi Waldron.