Publication-Ready Tables in R

A Practical Guide Using gtsummary and flextable

Author

Muntasir Masum, PhD

Published

March 10, 2026

Overview

Producing clean, publication-ready tables is one of the most important — and most under-taught — skills in applied social science and public health research. This tutorial walks through a complete workflow using two complementary R packages:

Package      Strength                                        Best for
gtsummary    Automatic table building with survey support    Table 1, regression tables
flextable    Fine-grained formatting control                 Word / .docx export

All examples use a 20,000-observation random sample of the National Health Interview Survey (NHIS), linked to the National Death Index through the NHIS Linked Mortality Files (LMF). The analysis illustrates a common demography and public health workflow: describing a sample, then modeling a binary health outcome with survey-weighted logistic regression.

Tip: What you will learn
  • How to build a complete Table 1 (descriptive statistics), with and without survey weights
  • How to control variable type summaries (continuous vs. categorical)
  • How to produce and format regression tables from svyglm() output
  • Advanced gtsummary techniques: removing reference rows, significance stars, managing footnotes, tbl_merge(), and tbl_stack()
  • How to export polished tables to Word via flextable

1 Data and Variables

The analytic sample contains 20,000 respondents drawn from NHIS waves collected between 1997 and 2018, with variables spanning demographics, socioeconomic status, health behaviors, and health outcomes.

Table 1: Variables in the Teaching Dataset
Variable Label
us_born U.S.-born vs. Foreign-born
genx_mil Birth cohort (Generation X / Millennial)
age_cat Age group
male Sex
race_eth15 Race/ethnicity (6 categories)
edu4 Educational attainment (4 levels)
famincome Family income category
marital_stat2 Marital status
region2 U.S. Census region
smoker Current smoker
alcohol Heavy alcohol use
obese Obesity status
hyperten Hypertension diagnosis
healthstatus Self-rated health (Healthy / Poor)
poor_health Poor self-rated health (0 = Healthy, 1 = Poor)
wgt Mortality analysis weight (mortwtsa)
strata Survey stratum
psu Primary sampling unit

Survey design

NHIS uses a complex multistage probability sample. Correct inference requires declaring the survey design before any analysis:

Code
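library(survey)
library(dplyr)       # select(), group_by(), summarise()
library(gtsummary)   # tbl_summary(), tbl_svysummary(), tbl_regression()
library(gt)          # tab_footnote(), cells_title()
library(flextable)   # flextable formatting and save_as_docx()
options(survey.lonely.psu = "adjust")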

des <- svydesign(
  id      = ~psu,       # primary sampling unit
  strata  = ~strata,    # stratification variable
  weights = ~wgt,       # mortality analysis weight (mortwtsa)
  data    = nhis,
  nest    = TRUE        # PSUs nested within strata
)
Important

Always use nest = TRUE when PSU IDs are not unique across strata (which is the case with NHIS). Setting options(survey.lonely.psu = "adjust") prevents errors in strata with only one PSU.
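
To see why nest = TRUE matters, the sketch below builds a hypothetical eight-row data frame (toy, with made-up values, not NHIS) in which PSU labels 1 and 2 are reused in both strata. It assumes only that the survey package is installed.

```r
library(survey)

# Hypothetical toy data: PSU labels 1 and 2 appear in both strata
toy <- data.frame(
  strata = rep(c("A", "B"), each = 4),
  psu    = rep(c(1, 2, 1, 2), each = 2),
  wgt    = rep(c(100, 150), times = 4),
  y      = c(0, 1, 0, 0, 1, 1, 0, 1)
)

# Without nest = TRUE, svydesign() stops because PSU ids repeat across strata
no_nest <- try(
  svydesign(id = ~psu, strata = ~strata, weights = ~wgt, data = toy),
  silent = TRUE
)
inherits(no_nest, "try-error")

# With nest = TRUE, PSUs are treated as nested within strata and the design builds
toy_des <- svydesign(id = ~psu, strata = ~strata, weights = ~wgt,
                     data = toy, nest = TRUE)
svymean(~y, toy_des)
```

The same pattern applies to the NHIS design above, where PSU codes restart within each stratum.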

Global gtsummary options

Two global options should be set once per session, before any table is built. They apply automatically to every subsequent tbl_summary(), tbl_svysummary(), and tbl_regression() call.

Code
options(gtsummary.use_ftExtra = TRUE)
set_gtsummary_theme(theme_gtsummary_compact(set_theme = TRUE))
Tip: Why these two options matter

options(gtsummary.use_ftExtra = TRUE)

Tells as_flex_table() to format cell content with the ftExtra package, which must be installed. With the option enabled, markdown in cell content (bold labels, italic levels, superscripts, footnote symbols) is preserved faithfully in the Word export. Without it, formatted text can appear as raw markdown syntax in the .docx file.

set_gtsummary_theme(theme_gtsummary_compact(set_theme = TRUE))

Applies the compact theme globally. Compared to the default theme, compact reduces row padding and font size, producing tables that fit comfortably on a journal page. theme_gtsummary_compact() registers the theme itself when set_theme = TRUE (its default), so the outer set_gtsummary_theme() call is redundant but harmless; calling theme_gtsummary_compact() alone, or set_gtsummary_theme(theme_gtsummary_compact(set_theme = FALSE)), has the same effect. Either way, the theme persists for all tables in the session until reset_gtsummary_theme() is called.

Both lines are already active in this tutorial’s hidden setup chunk.
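
A minimal sketch of a session setup that guards the ftExtra option and undoes the theme afterwards. The requireNamespace() check and the reset_gtsummary_theme() step are additions for illustration and are not part of the original setup chunk.

```r
library(gtsummary)

# Only switch the option on when ftExtra is actually installed
if (requireNamespace("ftExtra", quietly = TRUE)) {
  options(gtsummary.use_ftExtra = TRUE)
}

# theme_gtsummary_compact() sets the theme itself (set_theme = TRUE by default)
theme_gtsummary_compact()

# ... build tables here ...

# Return to the package defaults once the compact theme is no longer wanted
reset_gtsummary_theme()
```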

2 Part 1: Descriptive Tables (Table 1)

Table 1 in a manuscript reports sample characteristics. The tabs below progress from the simplest approach to a fully stratified, weighted table ready for publication.

tbl_summary() works directly on a data frame — no survey design needed. Appropriate for convenience samples, or as a first pass before adding weights.

Code
nhis |>
  select(us_born, genx_mil, age_cat, male, race_eth15,
         edu4, famincome, healthstatus) |>
  tbl_summary(
    label = list(
      us_born      ~ "Nativity",
      genx_mil     ~ "Birth cohort",
      age_cat      ~ "Age group",
      male         ~ "Sex",
      race_eth15   ~ "Race/ethnicity",
      edu4         ~ "Education",
      famincome    ~ "Family income",
      healthstatus ~ "Self-rated health"
    ),
    missing = "no"
  ) |>
  bold_labels() |>
  modify_caption("**Table 1. Sample Characteristics (Unweighted)**") |>
  as_gt()
Table 1. Sample Characteristics (Unweighted)
Characteristic N = 20,000¹
Nativity
    U.S. born 15,664 (78%)
    Foreign born 4,336 (22%)
Birth cohort
    Generation X 13,117 (66%)
    Millennial 6,883 (34%)
Age group
    18-24 4,745 (24%)
    25-34 8,430 (42%)
    35-44 5,304 (27%)
    45-55 1,521 (7.6%)
Sex
    Female 10,956 (55%)
    Male 9,044 (45%)
Race/ethnicity
    Non-Hispanic white 11,119 (56%)
    Hispanic 4,519 (23%)
    Non-Hispanic Black 2,906 (15%)
    Non-Hispanic ANAI 172 (0.9%)
    Non-Hispanic Asian 790 (4.0%)
    Non-Hispanic other 494 (2.5%)
Education
    Less than HS 2,482 (12%)
    High school 5,340 (27%)
    Some college 6,635 (33%)
    BA or higher 5,497 (28%)
Family income
    Less than $35,000 8,338 (45%)
    $35,000 - $75,000 6,666 (36%)
    $75,000 - $99,999 1,350 (7.3%)
    $100,000 and over 2,184 (12%)
Self-rated health
    Healthy 18,580 (93%)
    Poor 1,420 (7.1%)
1 n (%)

tbl_svysummary() accepts the svydesign object and weights all statistics automatically. Proportions and means can differ substantially from the unweighted version when survey weights correct for unequal selection probabilities.

Code
des |>
  tbl_svysummary(
    include = c(us_born, genx_mil, age_cat, male, race_eth15,
                edu4, famincome, healthstatus),
    label = list(
      us_born      ~ "Nativity",
      genx_mil     ~ "Birth cohort",
      age_cat      ~ "Age group",
      male         ~ "Sex",
      race_eth15   ~ "Race/ethnicity",
      edu4         ~ "Education",
      famincome    ~ "Family income",
      healthstatus ~ "Self-rated health"
    ),
    missing = "no"
  ) |>
  bold_labels() |>
  modify_caption("**Table 1. Sample Characteristics (Survey-Weighted)**") |>
  as_gt()
Table 1. Sample Characteristics (Survey-Weighted)
Characteristic N = 20,726¹
Nativity
    U.S. born 16,683 (80%)
    Foreign born 4,042 (20%)
Birth cohort
    Generation X 12,794 (62%)
    Millennial 7,931 (38%)
Age group
    18-24 5,647 (27%)
    25-34 7,962 (38%)
    35-44 5,418 (26%)
    45-55 1,698 (8.2%)
Sex
    Female 10,499 (51%)
    Male 10,226 (49%)
Race/ethnicity
    Non-Hispanic white 12,701 (61%)
    Hispanic 3,896 (19%)
    Non-Hispanic Black 2,682 (13%)
    Non-Hispanic ANAI 185 (0.9%)
    Non-Hispanic Asian 787 (3.8%)
    Non-Hispanic other 473 (2.3%)
Education
    Less than HS 2,350 (11%)
    High school 5,702 (28%)
    Some college 6,894 (33%)
    BA or higher 5,725 (28%)
Family income
    Less than $35,000 6,654 (35%)
    $35,000 - $75,000 7,447 (39%)
    $75,000 - $99,999 1,693 (8.9%)
    $100,000 and over 3,164 (17%)
Self-rated health
    Healthy 19,386 (94%)
    Poor 1,339 (6.5%)
1 n (%)

Add by = to compare groups side by side. Chain add_overall() for a total column and add_p() for group-comparison p-values. Use modify_header() with {n_unweighted} (the unweighted count; in a survey table {n} is the sum of weights) to show interpretable sample sizes in column headers.

Code
tbl1 <- des |>
  tbl_svysummary(
    by = us_born,
    include = c(genx_mil, age_cat, male, race_eth15,
                edu4, famincome, smoker, alcohol, obese,
                hyperten, healthstatus),
    label = list(
      genx_mil     ~ "Birth cohort",
      age_cat      ~ "Age group",
      male         ~ "Sex",
      race_eth15   ~ "Race/ethnicity",
      edu4         ~ "Education",
      famincome    ~ "Family income",
      smoker       ~ "Current smoker",
      alcohol      ~ "Heavy alcohol use",
      obese        ~ "Obese",
      hyperten     ~ "Hypertension",
      healthstatus ~ "Self-rated health"
    ),
    missing = "no"
  ) |>
  add_overall(last = FALSE) |>
  add_p() |>  # all variables here are categorical: default is the Rao-Scott adjusted chi-squared test
  bold_labels() |>
  italicize_levels() |>
  bold_p(t = 0.05) |>
  modify_header(
    stat_0 ~ "**Overall**\nn = {n_unweighted}",
    stat_1 ~ "**U.S.-Born**\nn = {n_unweighted}",
    stat_2 ~ "**Foreign-Born**\nn = {n_unweighted}"
  ) |>
  modify_caption("**Table 1. Sample Characteristics by Nativity**")

tbl1 |> as_gt()
Table 1. Sample Characteristics by Nativity
Characteristic            Overall         U.S.-Born       Foreign-Born    p-value²
                          n = 20000¹      n = 15664¹      n = 4336¹
Birth cohort                                                              <0.001
    Generation X          12,794 (62%)    10,058 (60%)    2,736 (68%)
    Millennial            7,931 (38%)     6,626 (40%)     1,306 (32%)
Age group                                                                 <0.001
    18-24                 5,647 (27%)     4,921 (29%)     726 (18%)
    25-34                 7,962 (38%)     6,304 (38%)     1,659 (41%)
    35-44                 5,418 (26%)     4,136 (25%)     1,282 (32%)
    45-55                 1,698 (8.2%)    1,323 (7.9%)    375 (9.3%)
Sex                                                                       0.10
    Female                10,499 (51%)    8,510 (51%)     1,990 (49%)
    Male                  10,226 (49%)    8,174 (49%)     2,052 (51%)
Race/ethnicity                                                            <0.001
    Non-Hispanic white    12,701 (61%)    12,080 (72%)    622 (15%)
    Hispanic              3,896 (19%)     1,717 (10%)     2,179 (54%)
    Non-Hispanic Black    2,682 (13%)     2,332 (14%)     350 (8.7%)
    Non-Hispanic ANAI     185 (0.9%)      179 (1.1%)      6 (0.2%)
    Non-Hispanic Asian    787 (3.8%)      189 (1.1%)      599 (15%)
    Non-Hispanic other    473 (2.3%)      187 (1.1%)      286 (7.1%)
Education                                                                 <0.001
    Less than HS          2,350 (11%)     1,335 (8.0%)    1,015 (25%)
    High school           5,702 (28%)     4,725 (28%)     977 (24%)
    Some college          6,894 (33%)     6,005 (36%)     889 (22%)
    BA or higher          5,725 (28%)     4,599 (28%)     1,126 (28%)
Family income                                                             <0.001
    Less than $35,000     6,654 (35%)     5,098 (33%)     1,556 (43%)
    $35,000 - $75,000     7,447 (39%)     6,093 (40%)     1,354 (37%)
    $75,000 - $99,999     1,693 (8.9%)    1,451 (9.5%)    242 (6.6%)
    $100,000 and over     3,164 (17%)     2,668 (17%)     496 (14%)
Current smoker                                                            <0.001
    Never smoked          13,761 (66%)    10,518 (63%)    3,243 (80%)
    Former smoker         2,553 (12%)     2,190 (13%)     363 (9.0%)
    Current smoker        4,383 (21%)     3,950 (24%)     433 (11%)
Heavy alcohol use                                                         <0.001
    Current drinker       14,162 (68%)    11,972 (72%)    2,190 (54%)
    Lifetime abstainer    4,659 (22%)     3,132 (19%)     1,527 (38%)
    Infrequent drinker    1,054 (5.1%)    866 (5.2%)      188 (4.6%)
    Former drinker        851 (4.1%)      713 (4.3%)      137 (3.4%)
Obese                                                                     <0.001
    Not obese             14,391 (76%)    11,374 (75%)    3,016 (82%)
    Obese                 4,527 (24%)     3,865 (25%)     662 (18%)
Hypertension              2,437 (12%)     2,105 (13%)     332 (8.2%)      <0.001
Self-rated health                                                         0.3
    Healthy               19,386 (94%)    15,585 (93%)    3,801 (94%)
    Poor                  1,339 (6.5%)    1,099 (6.6%)    241 (6.0%)
¹ n (%)
² Pearson’s X^2: Rao & Scott adjustment
Tip: Header placeholder guide for tbl_svysummary
Placeholder       Returns                       Use when
{n_unweighted}    Unweighted count per group    Column headers (most readable)
{N_unweighted}    Total unweighted N            Overall column header
{n}               Sum of weights per group      Reporting weighted N
{N}               Total sum of weights          Rarely useful in headers

Both {n} and {N} return decimals in weighted surveys — use {n_unweighted} for clean integer counts in column headers.
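
The placeholders can be tried on any survey design. The sketch below uses the apiclus1 example data shipped with the survey package as a stand-in for NHIS (an assumption made purely for illustration) and prints the resulting headers with show_header_names() instead of rendering the table.

```r
library(survey)
library(gtsummary)

# Stand-in data from the survey package (California schools), not NHIS
data(api, package = "survey")
api_des <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

tbl_hdr <- api_des |>
  tbl_svysummary(by = stype, include = awards) |>
  # {level} is the by-group label; {n_unweighted} is the raw row count per group
  modify_header(all_stat_cols() ~ "**{level}**\nn = {n_unweighted}")

# Inspect the headers gtsummary will print
show_header_names(tbl_hdr)
```

Swapping {n_unweighted} for {n} in the same call shows the weighted totals instead, which makes the difference between the two placeholders easy to see.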

2.1 Controlling Variable Type Summaries

By default, gtsummary guesses whether each variable is categorical or continuous. You can override this with the type = argument, and customize the displayed statistics with statistic =.

Code
nhis |>
  select(age_cat, edu4, poor_health) |>
  tbl_summary(missing = "no") |>
  bold_labels() |>
  as_gt()
Table 2
Characteristic N = 20,000¹
Age group
    18-24 4,745 (24%)
    25-34 8,430 (42%)
    35-44 5,304 (27%)
    45-55 1,521 (7.6%)
Education
    Less than HS 2,482 (12%)
    High school 5,340 (27%)
    Some college 6,635 (33%)
    BA or higher 5,497 (28%)
Poor health 1,420 (7.1%)
1 n (%)

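Use type = list(var ~ "continuous") to display mean (SD) instead of counts. This is useful for numeric variables that gtsummary would otherwise summarize as categorical or dichotomous because they take only a few distinct values (e.g., a 0/1 indicator or a short integer index). Note that continuous statistics are rounded by default: for a 0/1 variable this can produce an uninformative "0 (0)", as in the output below, and the digits = argument controls the number of decimals.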

Code
nhis |>
  select(age_cat, edu4, poor_health) |>
  tbl_summary(
    type    = list(poor_health ~ "continuous"),
    statistic = list(
      all_continuous()  ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    missing = "no"
  ) |>
  bold_labels() |>
  as_gt()
Table 3
Characteristic N = 20,000¹
Age group
    18-24 4,745 (24%)
    25-34 8,430 (42%)
    35-44 5,304 (27%)
    45-55 1,521 (7.6%)
Education
    Less than HS 2,482 (12%)
    High school 5,340 (27%)
    Some college 6,635 (33%)
    BA or higher 5,497 (28%)
Poor health 0 (0)
1 n (%); Mean (SD)
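
The "0 (0)" for Poor health above is a rounding effect: the mean of a 0/1 variable is a proportion below 1, and the default formatting shows no decimals. A minimal sketch of the digits = argument follows, using simulated data (toy, with an assumed prevalence of about 7%) rather than the NHIS sample.

```r
library(gtsummary)

set.seed(1)
# Simulated 0/1 outcome, not NHIS data
toy <- data.frame(poor_health = rbinom(500, size = 1, prob = 0.07))

tbl_digits <- toy |>
  tbl_summary(
    type      = list(poor_health ~ "continuous"),
    statistic = list(all_continuous() ~ "{mean} ({sd})"),
    digits    = list(poor_health ~ 3)   # three decimals for both mean and SD
  )

# Print as a plain tibble to check the formatted statistic
as_tibble(tbl_digits)
```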

Use type = list(var ~ "categorical") to show counts and percentages for a numeric variable (e.g., an integer 0/1 outcome where you want both values displayed explicitly).

Code
nhis |>
  select(age_cat, edu4, poor_health) |>
  tbl_summary(
    type    = list(poor_health ~ "categorical"),
    missing = "no"
  ) |>
  bold_labels() |>
  as_gt()
Table 4
Characteristic N = 20,000¹
Age group
    18-24 4,745 (24%)
    25-34 8,430 (42%)
    35-44 5,304 (27%)
    45-55 1,521 (7.6%)
Education
    Less than HS 2,482 (12%)
    High school 5,340 (27%)
    Some college 6,635 (33%)
    BA or higher 5,497 (28%)
Poor health
    0 18,580 (93%)
    1 1,420 (7.1%)
1 n (%)
Tip: Key gtsummary verbs for Table 1
Function Purpose
tbl_svysummary(by = ...) Stratify columns by a grouping variable
add_overall() Append an overall (unstratified) column
add_p() Add p-values for group comparisons
type = list(var ~ "continuous") Force a variable to display mean/SD
type = list(var ~ "categorical") Force a variable to display counts/%
statistic = list(...) Customize the displayed summary statistic
bold_labels() Bold the variable name rows
italicize_levels() Italicize the category rows
bold_p(t = 0.05) Bold significant p-values
modify_header() Rewrite any column header (use {n_unweighted} for counts in survey tables)
modify_caption() Add or change table caption

3 Part 2: Regression Tables

We model poor self-rated health (binary: 1 = Poor, 0 = Healthy) using survey-weighted logistic regression via svyglm() with a quasibinomial() family.

Code
# Model A: Demographic only
m_A <- svyglm(
  poor_health ~ us_born + genx_mil + male + race_eth15,
  design = des_cc,
  family  = quasibinomial()
)

# Model B: + Socioeconomic status
m_B <- svyglm(
  poor_health ~ us_born + genx_mil + male + race_eth15 +
                edu4 + famincome,
  design = des_cc,
  family  = quasibinomial()
)

# Model C: Full model
m_C <- svyglm(
  poor_health ~ us_born + genx_mil + male + race_eth15 +
                edu4 + famincome + marital_stat2 +
                smoker + alcohol + obese + hyperten + region2,
  design = des_cc,
  family  = quasibinomial()
)
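
The models above use des_cc (and a later block uses nhis_cc), objects that are not defined in the visible code and are presumably complete-case versions created in the hidden setup chunk. One common pattern, shown here as an assumption rather than the author's actual code, is to subset the design object instead of filtering the data frame before calling svydesign(). The sketch uses the apiclus1 example data from the survey package with a few artificially injected missing values.

```r
library(survey)

# Stand-in data from the survey package, not NHIS
data(api, package = "survey")
apiclus1$enroll[1:5] <- NA   # inject missing values for illustration only

api_des <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Subset the design, not the data frame, so the original design
# information is still available for variance estimation
api_des_cc <- subset(api_des, !is.na(enroll) & !is.na(api00))

nrow(api_des)      # rows in the full design
nrow(api_des_cc)   # rows remaining after the complete-case restriction
```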

The four tabs below demonstrate progressively more refined formatting, starting from the default output and building toward a publication-ready table.

tbl_regression() converts any model object into a formatted table. Set exponentiate = TRUE to display odds ratios instead of log-odds.

Code
tbl_regression(m_A, exponentiate = TRUE) |>
  bold_labels() |>
  as_gt()
Table 5
Characteristic OR 95% CI p-value
Nativity


    U.S. born
    Foreign born 0.73 0.57, 0.93 0.012
Birth cohort


    Generation X
    Millennial 0.60 0.50, 0.73 <0.001
Sex


    Female
    Male 0.84 0.71, 0.98 0.027
Race/ethnicity


    Non-Hispanic white
    Hispanic 1.80 1.45, 2.23 <0.001
    Non-Hispanic Black 1.49 1.21, 1.84 <0.001
    Non-Hispanic ANAI 2.69 1.53, 4.71 <0.001
    Non-Hispanic Asian 0.72 0.44, 1.19 0.2
    Non-Hispanic other 0.97 0.56, 1.67 >0.9
Abbreviations: CI = Confidence Interval, OR = Odds Ratio

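By default, gtsummary shows a reference row for each categorical predictor. These rows add visual clutter. A single call to remove_row_type(type = "reference") cleans them up. The first block below shows the default output with reference rows; the second adds remove_row_type().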

Code
tbl_regression(m_A, exponentiate = TRUE) |>
  bold_labels() |>
  italicize_levels() |>
  as_gt()
Table 6
Characteristic OR 95% CI p-value
Nativity


    U.S. born
    Foreign born 0.73 0.57, 0.93 0.012
Birth cohort


    Generation X
    Millennial 0.60 0.50, 0.73 <0.001
Sex


    Female
    Male 0.84 0.71, 0.98 0.027
Race/ethnicity


    Non-Hispanic white
    Hispanic 1.80 1.45, 2.23 <0.001
    Non-Hispanic Black 1.49 1.21, 1.84 <0.001
    Non-Hispanic ANAI 2.69 1.53, 4.71 <0.001
    Non-Hispanic Asian 0.72 0.44, 1.19 0.2
    Non-Hispanic other 0.97 0.56, 1.67 >0.9
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
Code
tbl_regression(m_A, exponentiate = TRUE) |>
  remove_row_type(type = "reference") |>
  bold_labels() |>
  italicize_levels() |>
  as_gt()
Table 7
Characteristic OR 95% CI p-value
Nativity


    Foreign born 0.73 0.57, 0.93 0.012
Birth cohort


    Millennial 0.60 0.50, 0.73 <0.001
Sex


    Male 0.84 0.71, 0.98 0.027
Race/ethnicity


    Hispanic 1.80 1.45, 2.23 <0.001
    Non-Hispanic Black 1.49 1.21, 1.84 <0.001
    Non-Hispanic ANAI 2.69 1.53, 4.71 <0.001
    Non-Hispanic Asian 0.72 0.44, 1.19 0.2
    Non-Hispanic other 0.97 0.56, 1.67 >0.9
Abbreviations: CI = Confidence Interval, OR = Odds Ratio

Some journals prefer the European style: coefficient (SE) with significance stars rather than a confidence interval column. Set conf.int = FALSE in tbl_regression() and chain add_significance_stars().

Code
tbl_regression(
  m_A,
  exponentiate = TRUE,
  conf.int     = FALSE
) |>
  add_significance_stars(
    hide_ci  = TRUE,
    hide_se  = FALSE,
    pattern  = "{estimate}{stars}"
  ) |>
  remove_row_type(type = "reference") |>
  bold_labels() |>
  italicize_levels() |>
  modify_header(estimate ~ "**OR**", std.error ~ "**SE**") |>
  modify_caption("**Table 2. Model A: Demographic Predictors of Poor Health**") |>
  as_gt()
Table 8: Table 2. Model A: Demographic Predictors of Poor Health
Characteristic OR¹ SE
Nativity

    Foreign born 0.73* 0.126
Birth cohort

    Millennial 0.60*** 0.095
Sex

    Male 0.84* 0.081
Race/ethnicity

    Hispanic 1.80*** 0.111
    Non-Hispanic Black 1.49*** 0.107
    Non-Hispanic ANAI 2.69*** 0.287
    Non-Hispanic Asian 0.72 0.255
    Non-Hispanic other 0.97 0.277
¹ *p<0.05; **p<0.01; ***p<0.001
Abbreviations: OR = Odds Ratio, SE = Standard Error
Note

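pattern = "{estimate}{stars}" places the stars immediately after the estimate. Use "{estimate} {stars}" to add a space. hide_ci = TRUE drops the CI columns entirely. With exponentiate = TRUE only the estimate is exponentiated, so the SE column appears to remain on the log-odds scale.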
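
A note on reading the SE column next to odds ratios: the values reported above are consistent with standard errors on the log-odds scale. For Foreign born, the earlier interval 0.57 to 0.93 implies roughly log(0.93 / 0.57) / (2 × 1.96) ≈ 0.125, close to the reported 0.126. The base-R sketch below, which uses the built-in mtcars data as a stand-in, shows both the log-odds SE and a delta-method approximation on the odds-ratio scale so the two are not confused.

```r
# Built-in mtcars data as a stand-in; am (0/1) plays the role of the binary outcome
fit <- glm(am ~ wt, data = mtcars, family = binomial())
est <- coef(summary(fit))

se_check <- data.frame(
  term        = rownames(est),
  OR          = exp(est[, "Estimate"]),
  SE_log_odds = est[, "Std. Error"],   # SE of the coefficient (log-odds scale)
  SE_OR_delta = exp(est[, "Estimate"]) * est[, "Std. Error"],   # delta-method SE of the OR
  row.names   = NULL
)
se_check
```

If a journal expects the SE on the odds-ratio scale, stating which of the two is reported in the table note avoids ambiguity.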

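When you drop the CI columns, the default abbreviation note defining “CI” can become orphaned. Remove it by filtering the internal table_styling$abbreviation tibble directly. (Table 8 above already omits the CI abbreviation without this step, so the step may be redundant in the gtsummary version used here.)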

Code
tbl_no_ci <- tbl_regression(
  m_B,
  exponentiate = TRUE,
  conf.int     = FALSE
) |>
  add_significance_stars(
    hide_ci  = TRUE,
    hide_se  = FALSE,
    pattern  = "{estimate}{stars}"
  ) |>
  remove_row_type(type = "reference") |>
  bold_labels() |>
  italicize_levels() |>
  modify_header(estimate ~ "**OR**", std.error ~ "**SE**")

# Surgically remove the orphaned CI footnote
tbl_no_ci$table_styling$abbreviation <-
  tbl_no_ci$table_styling$abbreviation |>
  dplyr::filter(column != "conf.low")

tbl_no_ci |>
  modify_caption("**Table 2. Model B: Demographic + SES Predictors**") |>
  as_gt()
Table 9: Table 2. Model B: Demographic + SES Predictors
Characteristic OR¹ SE
Nativity

    Foreign born 0.60*** 0.136
Birth cohort

    Millennial 0.54*** 0.096
Sex

    Male 0.84* 0.082
Race/ethnicity

    Hispanic 1.31* 0.129
    Non-Hispanic Black 1.07 0.108
    Non-Hispanic ANAI 1.79 0.303
    Non-Hispanic Asian 1.24 0.258
    Non-Hispanic other 1.15 0.281
Education

    High school 0.71** 0.119
    Some college 0.58*** 0.125
    BA or higher 0.28*** 0.166
Family income

    $35,000 - $75,000 0.43*** 0.103
    $75,000 - $99,999 0.44*** 0.204
    $100,000 and over 0.26*** 0.235
¹ *p<0.05; **p<0.01; ***p<0.001
Abbreviations: OR = Odds Ratio, SE = Standard Error

Before customizing column headers with modify_header(), use show_header_names() to print the exact internal column names gtsummary assigns. This eliminates guesswork.

Code
tbl_regression(m_A, exponentiate = TRUE) |>
  show_header_names()
#> Column Name   Header                 N*             N_event*       
#> label         "**Characteristic**"   17,295 <dbl>   1,040 <dbl>    
#> estimate      "**OR**"               17,295 <dbl>   1,040 <dbl>    
#> conf.low      "**95% CI**"           17,295 <dbl>   1,040 <dbl>    
#> p.value       "**p-value**"          17,295 <dbl>   1,040 <dbl>

"estimate" = OR column, "conf.low" / "conf.high" = CI bounds, "std.error" = SE. Pass these strings to modify_header().

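As a short illustration of passing those internal names to modify_header(), the sketch below fits a small logistic model on trial, an example dataset shipped with gtsummary, used here as a stand-in for the NHIS models. The header texts are arbitrary choices for the example.

```r
library(gtsummary)

# trial is example data from gtsummary, not the NHIS sample
fit <- glm(response ~ age + grade, data = trial, family = binomial())

tbl_named <- tbl_regression(fit, exponentiate = TRUE) |>
  modify_header(
    label    ~ "**Predictor**",
    estimate ~ "**Odds ratio**",
    p.value  ~ "**P**"
  )

# Confirm the new headers before rendering
show_header_names(tbl_named)
```
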
4 Part 3: Combining Tables

tbl_merge() places multiple model tables side by side under optional spanning headers — the standard multi-model format for journal manuscripts.

Code
# Helper: builds a consistently formatted model table
make_tbl <- function(model) {
  tbl_regression(
    model,
    exponentiate = TRUE,
    conf.int     = FALSE
  ) |>
    add_significance_stars(
      hide_ci  = TRUE,
      hide_se  = FALSE,
      pattern  = "{estimate}{stars}"
    ) |>
    remove_row_type(type = "reference") |>
    bold_labels() |>
    italicize_levels() |>
    modify_header(estimate ~ "**OR**", std.error ~ "**SE**")
}

t_A <- make_tbl(m_A)
t_B <- make_tbl(m_B)
t_C <- make_tbl(m_C)

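# Remove orphaned CI footnotes.
# A for loop over list(t_A, t_B, t_C) would only modify copies,
# so apply a helper and reassign each table.
drop_ci_abbrev <- function(tbl) {
  tbl$table_styling$abbreviation <-
    tbl$table_styling$abbreviation |>
    dplyr::filter(column != "conf.low")
  tbl
}
t_A <- drop_ci_abbrev(t_A)
t_B <- drop_ci_abbrev(t_B)
t_C <- drop_ci_abbrev(t_C)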

tbl_merge(
  tbls      = list(t_A, t_B, t_C),
  tab_spanner = c(
    "**Model A: Demographic**",
    "**Model B: + SES**",
    "**Model C: Full**"
  )
) |>
  bold_labels() |>
  modify_caption("**Table 3. Logistic Regression: Predictors of Poor Self-Rated Health**") |>
  as_gt() |>
  tab_footnote(
    footnote = "OR = odds ratio; SE = standard error. Survey-weighted quasibinomial logistic regression. ***p < 0.001; **p < 0.01; *p < 0.05.",
    locations = cells_title(groups = "title")  # attach to the title only, not title and subtitle
  )
Table 10: Table 3. Logistic Regression: Predictors of Poor Self-Rated Health
                          Model A: Demographic   Model B: + SES         Model C: Full
Characteristic            OR¹        SE          OR¹        SE          OR¹        SE
Nativity
    Foreign born          0.73*      0.126       0.60***    0.136       0.78       0.140
Birth cohort
    Millennial            0.60***    0.095       0.54***    0.096       0.73**     0.104
Sex
    Male                  0.84*      0.081       0.84*      0.082       0.86       0.083
Race/ethnicity
    Hispanic              1.80***    0.111       1.31*      0.129       1.41**     0.131
    Non-Hispanic Black    1.49***    0.107       1.07       0.108       0.98       0.124
    Non-Hispanic ANAI     2.69***    0.287       1.79       0.303       1.73       0.287
    Non-Hispanic Asian    0.72       0.255       1.24       0.258       1.21       0.267
    Non-Hispanic other    0.97       0.277       1.15       0.281       1.27       0.288
Education
    High school                                  0.71**     0.119       0.74*      0.123
    Some college                                 0.58***    0.125       0.71**     0.133
    BA or higher                                 0.28***    0.166       0.42***    0.180
Family income
    $35,000 - $75,000                            0.43***    0.103       0.47***    0.105
    $75,000 - $99,999                            0.44***    0.204       0.46***    0.210
    $100,000 and over                            0.26***    0.235       0.28***    0.247
Marital status
    Marital dissolution                                                 1.39**     0.119
    Never married                                                       1.03       0.111
Smoking status
    Former smoker                                                       1.22       0.147
    Current smoker                                                      1.97***    0.100
Alcohol use
    Lifetime abstainer                                                  1.44**     0.125
    Infrequent drinker                                                  2.10***    0.151
    Former drinker                                                      2.23***    0.157
Obesity
    Obese                                                               1.82***    0.096
Hypertension
    Yes                                                                 3.19***    0.100
Census region
    Midwest                                                             1.07       0.151
    West                                                                1.03       0.153
    South                                                               1.01       0.141
¹ *p<0.05; **p<0.01; ***p<0.001
Abbreviations: OR = Odds Ratio, SE = Standard Error
Tip

Spanning headers in tab_spanner support markdown (**bold**). All subsequent calls such as bold_labels() apply to the merged object.

tbl_stack() stacks tables vertically under group headers — ideal for comparing an unweighted model to a survey-weighted model. The difference in confidence interval widths illustrates the design effect directly.

Code
# Unweighted model (plain glm)
m_unwt <- glm(
  poor_health ~ us_born + genx_mil + male + race_eth15 + edu4 + famincome,
  data   = nhis_cc,
  family = binomial()
)

tbl_unwt <- tbl_regression(m_unwt, exponentiate = TRUE) |>
  remove_row_type(type = "reference") |>
  bold_labels() |>
  italicize_levels()

# Survey-weighted model (already fitted as m_B)
tbl_wt <- tbl_regression(m_B, exponentiate = TRUE) |>
  remove_row_type(type = "reference") |>
  bold_labels() |>
  italicize_levels()

tbl_stack(
  tbls         = list(tbl_unwt, tbl_wt),
  group_header = c(
    "Unweighted (plain glm)",
    "Survey-Weighted (svyglm + quasibinomial)"
  )
) |>
  modify_caption("**Table 4. Unweighted vs. Survey-Weighted Logistic Regression**") |>
  as_gt() |>
  tab_footnote(
    footnote = "Models include: nativity, birth cohort, sex, race/ethnicity, education, and family income. OR = odds ratio; 95% CI in brackets.",
    locations = cells_title(groups = "title")  # attach to the title only, not title and subtitle
  )
Table 11: Table 4. Unweighted vs. Survey-Weighted Logistic Regression
Characteristic OR 95% CI p-value
Unweighted (plain glm)
Nativity


    Foreign born 0.52 0.42, 0.63 <0.001
Birth cohort


    Millennial 0.56 0.49, 0.65 <0.001
Sex


    Male 0.84 0.74, 0.96 0.008
Race/ethnicity


    Hispanic 1.20 0.99, 1.43 0.057
    Non-Hispanic Black 1.18 0.99, 1.40 0.056
    Non-Hispanic ANAI 2.39 1.50, 3.67 <0.001
    Non-Hispanic Asian 1.65 1.05, 2.48 0.022
    Non-Hispanic other 1.38 0.84, 2.15 0.2
Education


    High school 0.65 0.54, 0.78 <0.001
    Some college 0.48 0.40, 0.58 <0.001
    BA or higher 0.25 0.20, 0.32 <0.001
Family income


    $35,000 - $75,000 0.42 0.36, 0.49 <0.001
    $75,000 - $99,999 0.42 0.30, 0.57 <0.001
    $100,000 and over 0.27 0.20, 0.37 <0.001
Survey-Weighted (svyglm + quasibinomial)
Nativity


    Foreign born 0.60 0.46, 0.79 <0.001
Birth cohort


    Millennial 0.54 0.44, 0.65 <0.001
Sex


    Male 0.84 0.71, 0.98 0.030
Race/ethnicity


    Hispanic 1.31 1.01, 1.69 0.038
    Non-Hispanic Black 1.07 0.86, 1.32 0.5
    Non-Hispanic ANAI 1.79 0.99, 3.25 0.054
    Non-Hispanic Asian 1.24 0.75, 2.06 0.4
    Non-Hispanic other 1.15 0.66, 2.00 0.6
Education


    High school 0.71 0.56, 0.90 0.004
    Some college 0.58 0.45, 0.74 <0.001
    BA or higher 0.28 0.20, 0.38 <0.001
Family income


    $35,000 - $75,000 0.43 0.35, 0.52 <0.001
    $75,000 - $99,999 0.44 0.30, 0.66 <0.001
    $100,000 and over 0.26 0.16, 0.41 <0.001
1 Models include: nativity, birth cohort, sex, race/ethnicity, education, and family income. OR = odds ratio; 95% CI in brackets.
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
Note

Notice how confidence intervals widen in the survey-weighted model. This reflects the design effect: clustering and unequal weights typically reduce the effective sample size, increasing standard errors.
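
The design effect can also be estimated directly. The sketch below uses the apiclus1 cluster sample shipped with the survey package as a stand-in for NHIS (an assumption for illustration) and asks svymean() for the design effect of a mean; values above 1 indicate that the complex design yields larger variances than a simple random sample of the same size.

```r
library(survey)

# Stand-in data from the survey package (a one-stage cluster sample), not NHIS
data(api, package = "survey")
clus_des <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# deff = TRUE adds the estimated design effect to the result
m_api <- svymean(~api00, clus_des, deff = TRUE)
m_api
deff(m_api)
```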

5 Part 4: Exporting to Word with flextable

For manuscript submission, most journals require .docx files. as_flex_table() converts any gtsummary object to a flextable, which is then exported with save_as_docx().

Code
ft <- tbl_merge(
  tbls      = list(t_A, t_B, t_C),
  tab_spanner = c(
    "**Model A: Demographic**",
    "**Model B: + SES**",
    "**Model C: Full**"
  )
) |>
  bold_labels() |>
  modify_caption("Table 3. Logistic Regression: Predictors of Poor Self-Rated Health") |>
  as_flex_table() |>
  set_table_properties(width = 1, layout = "autofit") |>
  fontsize(size = 10, part = "all") |>
  font(fontname = "Times New Roman", part = "all") |>
  add_footer_lines(
    "Note. OR = odds ratio; SE = standard error. Survey-weighted quasibinomial logistic regression. ***p < 0.001; **p < 0.01; *p < 0.05."
  )

ft

                          Model A: Demographic   Model B: + SES         Model C: Full
Characteristic            OR¹        SE          OR¹        SE          OR¹        SE
Nativity
    Foreign born          0.73*      0.126       0.60***    0.136       0.78       0.140
Birth cohort
    Millennial            0.60***    0.095       0.54***    0.096       0.73**     0.104
Sex
    Male                  0.84*      0.081       0.84*      0.082       0.86       0.083
Race/ethnicity
    Hispanic              1.80***    0.111       1.31*      0.129       1.41**     0.131
    Non-Hispanic Black    1.49***    0.107       1.07       0.108       0.98       0.124
    Non-Hispanic ANAI     2.69***    0.287       1.79       0.303       1.73       0.287
    Non-Hispanic Asian    0.72       0.255       1.24       0.258       1.21       0.267
    Non-Hispanic other    0.97       0.277       1.15       0.281       1.27       0.288
Education
    High school                                  0.71**     0.119       0.74*      0.123
    Some college                                 0.58***    0.125       0.71**     0.133
    BA or higher                                 0.28***    0.166       0.42***    0.180
Family income
    $35,000 - $75,000                            0.43***    0.103       0.47***    0.105
    $75,000 - $99,999                            0.44***    0.204       0.46***    0.210
    $100,000 and over                            0.26***    0.235       0.28***    0.247
Marital status
    Marital dissolution                                                 1.39**     0.119
    Never married                                                       1.03       0.111
Smoking status
    Former smoker                                                       1.22       0.147
    Current smoker                                                      1.97***    0.100
Alcohol use
    Lifetime abstainer                                                  1.44**     0.125
    Infrequent drinker                                                  2.10***    0.151
    Former drinker                                                      2.23***    0.157
Obesity
    Obese                                                               1.82***    0.096
Hypertension
    Yes                                                                 3.19***    0.100
Census region
    Midwest                                                             1.07       0.151
    West                                                                1.03       0.153
    South                                                               1.01       0.141
¹ *p<0.05; **p<0.01; ***p<0.001
Abbreviations: OR = Odds Ratio, SE = Standard Error
Note. OR = odds ratio; SE = standard error. Survey-weighted quasibinomial logistic regression. ***p < 0.001; **p < 0.01; *p < 0.05.

Code
# Always save in its own separate chunk
save_as_docx(ft, path = "output/Table3_regression.docx")
Important: save_as_docx() rule

Always call save_as_docx() in its own separate chunk. Sharing a chunk with table-building code causes Quarto to render the table inline and write the file simultaneously, producing duplicate output.
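
save_as_docx() writes to the path it is given, so the target folder has to exist first. The sketch below is a self-contained export under stated assumptions: it uses trial, the example dataset shipped with gtsummary, and writes to a temporary folder (out_dir and the file name are hypothetical) instead of the project's output/ directory.

```r
library(gtsummary)
library(flextable)

# Small stand-in table built from gtsummary's example data, not NHIS
ft_demo <- trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  as_flex_table()

# Hypothetical output location; create it if it does not exist yet
out_dir  <- file.path(tempdir(), "output")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
out_file <- file.path(out_dir, "Table1_demo.docx")

save_as_docx(ft_demo, path = out_file)
file.exists(out_file)
```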

You can also build a flextable directly from any data frame — without going through gtsummary — when you need full control over layout.

Code
summary_tbl <- nhis_cc |>
  group_by(us_born) |>
  summarise(
    N        = n(),
    Pct_poor = round(mean(poor_health) * 100, 1),
    .groups  = "drop"
  )

flextable(summary_tbl) |>
  set_header_labels(
    us_born  = "Nativity",
    N        = "N (unweighted)",
    Pct_poor = "% Poor Health"
  ) |>
  bold(part = "header") |>
  bg(bg = "#f0f0f0", part = "header") |>
  add_footer_lines("Source: NHIS teaching sample, n = 16,880 complete cases.") |>
  autofit() |>
  set_caption("Summary of Poor Health by Nativity")

Nativity        N (unweighted)    % Poor Health
U.S. born       13,292            6.9
Foreign born    3,588             5.8
Source: NHIS teaching sample, n = 16,880 complete cases.

6 Quick Reference

Function Purpose
tbl_summary() Unweighted descriptive statistics
tbl_svysummary() Survey-weighted descriptive statistics
tbl_regression() Regression table from model object
tbl_merge() Place tables side by side
tbl_stack() Stack tables vertically
add_overall() Add total/overall column
add_p() Append p-values for group comparisons
type = list(var ~ 'continuous') Display variable as mean (SD)
type = list(var ~ 'categorical') Display variable as counts (%)
statistic = list(...) Customize the displayed statistic
bold_labels() Bold variable name rows
italicize_levels() Italicize category rows
bold_p(t = 0.05) Bold significant p-values
remove_row_type('reference') Drop reference category rows
add_significance_stars() Add * ** *** to estimates
modify_header() Rewrite column headers (use `{n_unweighted}` for counts in survey tables)
modify_caption() Add or change table caption
show_header_names() Print internal column name reference
as_gt() Convert to gt for HTML rendering
as_flex_table() Convert to flextable (for Word export)
options(gtsummary.use_ftExtra = TRUE) Preserve markdown formatting in Word export
set_gtsummary_theme(...) Apply a theme globally for the session

Task Package Function
Descriptive Table 1 gtsummary tbl_summary() / tbl_svysummary()
Single regression table gtsummary tbl_regression()
Multiple models side by side gtsummary tbl_merge()
Before/after comparison (stacked) gtsummary tbl_stack()
Word / .docx export flextable as_flex_table() + save_as_docx()
Custom standalone table flextable flextable()