Study population
Detailed information about the China Kadoorie Biobank (CKB) study design, survey methods and participants’ characteristics has been described elsewhere [16,17,18]. The data used in the current study were obtained from the Wuzhong District of Suzhou city, one of the 10 regions included in the CKB study. In brief, 53,269 participants aged 30–79 years were recruited for the baseline survey between June 2004 and July 2008.
In this study, we excluded participants who had been diagnosed with malignant cancer (excluding nonmelanoma skin cancer) before baseline (n = 331). After this exclusion, a total of 52,938 (22,234 men, 30,704 women) participants remained for inclusion in the final analyses.
Assessment of physical activity
Details of the methods used to assess physical activity have been previously reported [18, 19]. At the baseline survey, participants were asked about the intensity, frequency and duration of physical activities (including occupation, commuting, housework and leisure-time activity) during the past 12 months. Metabolic equivalent of task (MET) values for the different types of activity were adopted from the 2011 Compendium of Physical Activities [20]. The MET value of each activity was multiplied by its frequency and duration to calculate the physical activity from that activity in MET-hours per day (MET-h/day). Occupational physical activity included all physical activity performed during paid employment, and nonoccupational physical activity included all physical activity performed during travel to and from work, household activity and leisure-time exercise. Total physical activity was the sum of occupational and nonoccupational physical activity.
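The calculation described above can be sketched in Python as follows. This is a minimal illustration only: the MET values, the activity records and the input format (days per week and hours per active day) are hypothetical assumptions, not the CKB questionnaire coding or values taken from the Compendium.

```python
from dataclasses import dataclass

@dataclass
class Activity:
    domain: str           # "occupational" or "nonoccupational"
    met: float            # MET value (hypothetical, for illustration)
    days_per_week: float  # reported frequency (assumed input format)
    hours_per_day: float  # reported duration on active days (assumed)

def met_hours_per_day(a: Activity) -> float:
    """MET x duration x frequency, averaged over a 7-day week."""
    return a.met * a.hours_per_day * a.days_per_week / 7.0

def summarise(activities):
    """Sum MET-h/day by domain; total = occupational + nonoccupational."""
    totals = {"occupational": 0.0, "nonoccupational": 0.0}
    for a in activities:
        totals[a.domain] += met_hours_per_day(a)
    totals["total"] = totals["occupational"] + totals["nonoccupational"]
    return totals

# Hypothetical participant: one job, daily housework, occasional exercise.
example = [
    Activity("occupational", 4.0, 5, 7.0),
    Activity("nonoccupational", 3.5, 7, 2.0),
    Activity("nonoccupational", 4.0, 3.5, 1.0),
]
totals = summarise(example)
print(totals)
```

Averaging over seven days is one plausible way to express weekly frequency as a daily quantity; other conventions would change the scale but not the structure of the calculation.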
Assessment of covariates
Covariate information was collected in the baseline questionnaire, including sociodemographic characteristics (age, sex, level of education, and marital status) and lifestyle behaviours (alcohol consumption, smoking status, and consumption of fresh fruit and red meat). Alcohol consumption was recorded as drinking frequency (‘Never regular drinker’, ‘Ex regular drinker’, ‘Occasional or seasonal drinker’, ‘Monthly drinker’, ‘Reduced intake drinker’, ‘Weekly drinker’). Smoking status was recorded as ‘Never smoker’, ‘Occasional smoker’, ‘Ex regular smoker’ or ‘Smoker’. Consumption of fresh fruit and of red meat was each recorded as frequency of intake (‘Daily’, ‘4–6 days per week’, ‘1–3 days per week’, ‘Monthly’, ‘Never/rarely’). Body weight and height were measured at baseline by trained staff using calibrated instruments. Body mass index (BMI) was calculated as weight in kilograms divided by height in metres squared.
Ascertainment of outcomes
Incident cases occurring after participants’ enrolment into the cohort were identified through linkage with local disease and death registries and the national health insurance system, and through active follow-up [21]. Approximately 98% of participants were covered by the health insurance system, which recorded details of all episodes of hospitalization and coded examination and treatment procedures. Incident events were coded according to the 10th revision of the International Classification of Diseases (ICD-10) by trained staff “blinded” to baseline information. In this study, we selected cancers with 200 or more incident cases: total cancer [C00-C99], oesophageal cancer [C15], stomach cancer [C16], colorectal cancer [C18-C20], liver cancer [C22], lung cancer [C33-C34] and breast cancer [C50].
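As an illustration of how the ICD-10 code ranges listed above map to outcome groups, a minimal Python sketch is given below. The function name, the site labels and the assumed code format (the letter C followed by two digits and an optional subcode) are illustrative assumptions and do not describe the CKB coding pipeline.

```python
import re

# Site-specific ICD-10 ranges as listed in the text (two-digit C categories).
SITE_RANGES = {
    "oesophagus": (15, 15),
    "stomach": (16, 16),
    "colorectum": (18, 20),
    "liver": (22, 22),
    "lung": (33, 34),
    "breast": (50, 50),
}

def classify(code: str) -> list:
    """Return the outcome groups that an ICD-10 code falls into.

    Any C00-C99 code counts towards total cancer; site labels are
    appended when the category lies in one of the listed ranges.
    """
    match = re.match(r"^C(\d{2})", code.strip().upper())
    if not match:
        return []  # not in the C00-C99 block
    category = int(match.group(1))
    groups = ["total cancer"]
    for site, (low, high) in SITE_RANGES.items():
        if low <= category <= high:
            groups.append(site)
    return groups

print(classify("C18.7"))  # colorectal subcode
print(classify("C61"))    # counted in total cancer only
print(classify("I21"))    # not a cancer code
```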
Statistical analysis
Total daily physical activity was categorized into four groups based on quartiles among 52,938 participants. Mean values and prevalence of baseline characteristics were calculated for categories of total physical activity at baseline. Continuous variables were described as means (standard deviations, SDs), and categorical variables were described as proportions (%).
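The quartile grouping can be sketched as follows. The MET-h/day values are hypothetical, and both the quantile method (the default of Python's statistics.quantiles) and the treatment of values that fall exactly on a cut point are assumptions; the text does not specify either.

```python
from bisect import bisect_left
from statistics import quantiles

def quartile_groups(values):
    """Assign each value to quartile group 1-4 using sample quartiles.

    Values equal to a cut point are placed in the lower group
    (an assumed convention).
    """
    cuts = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    return [bisect_left(cuts, v) + 1 for v in values], cuts

# Hypothetical total physical activity values in MET-h/day.
activity = [5.0, 12.0, 20.0, 8.0, 30.0, 16.0, 25.0, 40.0]
groups, cuts = quartile_groups(activity)
print(cuts)
print(groups)
```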
Hazard ratios (HRs) and 95% confidence intervals (CIs) for the incidence of common types of cancer associated with total physical activity levels were estimated using Cox proportional hazards regression models. Tests for trend were performed by including physical activity as a continuous variable. Physical activity was also modelled as a continuous variable to estimate the risk associated with a one standard deviation (SD) higher level of physical activity. The Cox regression analyses were stratified by age in 5-year intervals to satisfy the proportional hazards assumption, and adjusted for sex (female, male), level of education (no formal schooling, middle school and below, or high school and above), marital status (married, widowed, separated or divorced, or never married), alcohol consumption (never regular drinker, former regular drinker, occasional drinker, or regular drinker), smoking status (never regular smoker, former regular smoker, occasional smoker, or regular smoker), consumption of fresh fruit (daily, 4–6 days per week, 1–3 days per week, monthly, never or rarely) and red meat (≥ 4 days per week, 1–3 days per week, monthly or less), and BMI (continuous). The linearity of the associations between physical activity and cancer was evaluated with restricted cubic splines.
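One way to express a continuous association per SD is to rescale a log-hazard coefficient estimated per 1 MET-h/day, as sketched below. The coefficient, standard error and SD are hypothetical numbers, and the approach is an assumption: an equivalent alternative would be to divide the exposure by its SD before fitting the model.

```python
import math
from statistics import NormalDist

def hr_per_sd(beta, se, sd, level=0.95):
    """Convert a log-hazard coefficient per unit of exposure into a
    hazard ratio (with Wald CI) per one-SD higher exposure.

    beta, se: coefficient and standard error per 1 MET-h/day
    sd:       standard deviation of the exposure in the cohort
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    log_hr = beta * sd
    half_width = z * se * sd
    return (
        math.exp(log_hr),
        math.exp(log_hr - half_width),
        math.exp(log_hr + half_width),
    )

# Hypothetical inputs, not results from this study.
hr, lower, upper = hr_per_sd(beta=-0.005, se=0.002, sd=14.0)
print(f"HR per SD: {hr:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```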
We also examined the associations of total physical activity with the incidence of total cancer, lung cancer and colorectal cancer among prespecified baseline subgroups based on age (< 60, ≥ 60 years), sex, BMI (< 25, ≥ 25 kg/m²) and smoking status (never, ever). To investigate potential interaction effects, we used a likelihood ratio test comparing the models with and without a cross-product term between total physical activity levels and each of the stratification variables.
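The likelihood ratio test for interaction can be sketched as follows, given the maximized log (partial) likelihoods of the two nested models. The log-likelihood values are hypothetical, and the degrees of freedom are assumed to equal the number of cross-product terms added (for example, three when four activity groups are crossed with a binary subgroup variable). The chi-square tail probability is computed in closed form for integer degrees of freedom so that the sketch needs only the standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (integer df)."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # exp(-h) * sum_{k=0}^{df/2 - 1} h^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= h / k
            total += term
        return math.exp(-h) * total
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_k h^(k - 1/2) / Gamma(k + 1/2)
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(h) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * term
        term *= h / (k + 0.5)
    return total

def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Compare nested models without and with the cross-product terms."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df)

# Hypothetical log partial likelihoods, not values from this study.
stat, p_value = likelihood_ratio_test(-1000.0, -997.0, df=3)
print(f"LR statistic = {stat:.2f}, P for interaction = {p_value:.4f}")
```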
Furthermore, sensitivity analyses were conducted to test the robustness of the results by excluding cases of cancer diagnosed during the first two years of follow-up or by excluding individuals with poor self-rated health at baseline. All P values were two-sided, and all analyses were conducted using R version 4.1.3.