How to Find Function Origins & Namespace Discipline in RStudio

Mayta
Dec 8, 2025

1. Base R Functions Require No Library

Some functions—such as paste0(), mean(), round(), factor()—belong to base R, which loads automatically.

To verify:

getAnywhere("paste0")

Typical output (abridged):

A single object matching ‘paste0’ was found
It was found in the following places
  package:base
  namespace:base

Therefore: paste0() → base package → no library() call required.
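The same check can be scripted. The following is a minimal sketch, assuming a standard R session in which the utils package is available; it reads the where component of the object that getAnywhere() returns.

```r
# Look up paste0 and inspect where it was found
hit <- utils::getAnywhere("paste0")

# 'where' holds every search-path entry and namespace containing the object
print(hit$where)

# TRUE when the function is supplied by base R, so no library() call is needed
in_base <- "package:base" %in% hit$where
print(in_base)
```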

2. Determine the Origin of Any Function

R exposes two powerful tools for function lookup.

Method 1 — getAnywhere()

Also finds objects that are not exported, including registered S3 methods, but it only searches packages that are already loaded.

getAnywhere("tidy")

Method 2 — find()

Lists every entry on the search path (attached packages and the global environment) that contains an object with that name.

find("tidy")

If several attached packages contain a function of the same name, you may see:

[1] "package:broom.mixed" "package:broom"

The entries appear in search-path order, so this reveals where masking can occur during analysis.
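Masking can be reproduced without installing anything. This sketch assumes a default session in which stats is attached, and it defines a hypothetical global function named filter purely for illustration.

```r
# Hypothetical user-defined function that reuses the name of stats::filter()
assign("filter", function(x, ...) x, envir = globalenv())

# find() returns matches in search-path order; the first entry wins
hits <- find("filter")
print(hits)

# Clean up so later code sees stats::filter() again
rm("filter", envir = globalenv())
```

The global environment comes first on the search path, so the user-defined function would be the one called by an unqualified filter().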

3. Explicit Namespace Calling (Best Practice in CECS Research)

To ensure reproducibility and avoid masks:

❌ Avoid

tidy(model)

✔ Prefer

broom::tidy(model)

Advantages

Zero ambiguity in clinical audit scripts
Avoids masked functions producing silent, inconsistent results
Supports reproducible research and PRISMA-compliant data-extraction code
Prevents accidentally calling the wrong function (e.g., a mixed-model tidier instead of a fixed-effects one)
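To illustrate how a masked function can change a result silently, here is a small sketch. The redefined sd() is a hypothetical helper invented for this example (it divides by n rather than n - 1); the data vector is also made up.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Hypothetical helper that reuses the name sd but divides by n instead of n - 1
sd <- function(x) sqrt(mean((x - mean(x))^2))

unqualified <- sd(x)        # resolves to the helper defined above
qualified   <- stats::sd(x) # always the sample standard deviation from stats

print(c(unqualified = unqualified, qualified = qualified))
```

Both calls run without any warning, yet they return different numbers. Only the qualified call is immune to whatever happens to be defined or attached earlier in the script.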

4. Detect Masks When Loading Packages

When you attach a package with library(), R reports any name conflicts. For example, library(dplyr) prints:

The following objects are masked from 'package:stats':
    filter, lag

Meaning:

An unqualified filter() now calls dplyr::filter(), which masks stats::filter()
You can always call either version explicitly:

stats::filter()
dplyr::filter()

Namespace discipline is essential in multistage modeling pipelines (GLM → GEE → mixed-effects → survival).
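Conflicts can also be listed after the fact with base R's conflicts(). The sketch below assumes a default session with stats attached and uses a hypothetical global function named lag to create a conflict on purpose.

```r
# Hypothetical user-defined function that reuses the name of stats::lag()
assign("lag", function(x, ...) x, envir = globalenv())

# Named list: one element per search-path entry, holding its conflicting names
masked <- conflicts(detail = TRUE)

in_global <- "lag" %in% masked[[".GlobalEnv"]]
in_stats  <- "lag" %in% masked[["package:stats"]]
print(c(in_global = in_global, in_stats = in_stats))

# Clean up so later code sees stats::lag() again
rm("lag", envir = globalenv())
```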

5. CECS Cheat Sheet: Where Common Functions Come From

Function	Package
paste0, sprintf, factor, round	base
mutate, filter, select, case_when	dplyr
%>%	magrittr (re-exported by dplyr)
pivot_wider, pivot_longer	tidyr
tidy, glance, augment	broom
tidy (mixed models)	broom.mixed
emmeans, contrast	emmeans
glmmTMB	glmmTMB
gt, gtsave	gt
read_dta	haven
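A table like this can be checked against your own session. The helper origin_of() below is a hypothetical name, not part of any package; it is a sketch that reports the namespace in which a function was defined, and it is only demonstrated on base functions here.

```r
# Hypothetical helper: name of the namespace that defines a function
origin_of <- function(fn_name) {
  fn <- get(fn_name, mode = "function")
  # Primitives such as round() have no environment; they all live in base
  if (is.primitive(fn)) return("base")
  environmentName(environment(fn))
}

fns <- c("paste0", "sprintf", "factor", "round")
origins <- vapply(fns, origin_of, character(1))
print(origins)
```

For a re-exported function this reports the package that defines it, which may differ from the package you attached. Functions created inside other functions have no namespace as their environment, so the helper may return an empty string for them.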

6. How to Know Which tidy() Was Used?

Since both broom and broom.mixed contain tidy():

find("tidy")

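Returns something like:

[1] "package:broom.mixed" "package:broom"

An unqualified call resolves to the first entry on the search path, usually the package attached most recently. In current releases both packages re-export the same tidy() generic from the generics package, so the method that runs is chosen by the class of the model object. Qualifying the call still matters: it guarantees that the package supplying the method you need is loaded, and it protects you when two packages define genuinely different functions under one name.

Therefore: ALWAYS specify namespaces.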

broom::tidy(model)
broom.mixed::tidy(model)
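Because tidy() is an S3 generic, it also helps to check which method handles a given model class. The sketch below shows the idea with summary() and an lm fit on the built-in mtcars data, so it runs without broom installed; the same pattern should apply to tidy(), though that is not demonstrated here.

```r
# Fit a simple model on a built-in dataset
fit <- stats::lm(mpg ~ wt, data = datasets::mtcars)

# Where does the generic that the bare name resolves to live?
generic <- get("summary", mode = "function")
generic_pkg <- environmentName(environment(generic))

# Which S3 method handles this model class, and which package supplies it?
method <- utils::getS3method("summary", class(fit)[1])
method_pkg <- environmentName(environment(method))

print(c(generic = generic_pkg, method = method_pkg))
```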

7. Why This Matters in Clinical Epidemiology & Statistics

Ambiguous function calls can produce:

Incorrect standard errors (a masked function written for a different model class)
Unintended method dispatch (e.g., an S3 method where an S4 method was expected)
Inconsistent meta-analysis extraction
Audit failures due to non-reproducible scripts

Explicit namespaces (package::function) remove the ambiguity about which function is called.

8. Ready-to-Use CECS Code Snippets

Identify function origin

getAnywhere("FUNCTION")
find("FUNCTION")

Audit all attached packages

sessionInfo()
search()
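For an audit log it can be convenient to keep just the attached package names and their versions. This is a minimal sketch using base and utils functions only; the variable names are illustrative.

```r
# Names of all currently attached packages
attached <- .packages()

# Version string for each attached package, named by package
versions <- vapply(
  attached,
  function(p) as.character(utils::packageVersion(p)),
  character(1)
)
print(versions)
```

Saving this vector alongside the analysis output records which package versions were attached when the script ran.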

Use explicit namespaces throughout a script

broom::tidy(model)
dplyr::mutate(df, ...)
tidyr::pivot_wider(df, ...)

Conclusion

Function origin identification is not cosmetic; it is a core reproducibility requirement in CECS-level clinical analytics. By using getAnywhere(), find(), and explicit package::function notation, you gain:

Transparency
Reproducibility
Mask-safe modeling
Audit-ready code