How to Find Function Origins & Namespace Discipline in RStudio
- Mayta

- 13 hours ago
1. Base R Functions Require No Library
Some functions, such as paste0(), mean(), round(), and factor(), belong to the base package, which is always attached when R starts.
To verify:
getAnywhere("paste0")
Typical output:
A single object matching ‘paste0’ was found
It was found in the following places
  package:base
  namespace:base
Therefore: paste0() → base package → no library() required.
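The same check can be run over several names at once. The sketch below is one possible way to do it, using the `where` component of the object that getAnywhere() returns; the variable names `fns` and `in_base` are illustrative only.

```r
# Names to check; taken from the examples above
fns <- c("paste0", "mean", "round", "factor")

# For each name, ask whether getAnywhere() finds it in the attached base package
in_base <- vapply(
  fns,
  function(f) "package:base" %in% utils::getAnywhere(f)$where,
  logical(1)
)

print(in_base)
```

A name that is not found in base would show FALSE here, which is a hint that some package has to be attached or referenced with `::` before the function can be used.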
2. Determine the Origin of Any Function
R provides two tools for function lookup, both in the utils package.
Method 1 — getAnywhere()
Also finds objects that are not exported, such as registered S3 methods, provided the package's namespace is loaded.
getAnywhere("tidy")
Method 2 — find()
Lists every entry on the search path (attached packages and the global environment) that contains an object with that name.
find("tidy")
If multiple packages contain a function of the same name, you may see:
"package:broom.mixed"
"package:broom"
This reveals where masking can occur during analysis.
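The difference between the two methods shows up with objects that a package does not export. As a small illustration that needs only the packages shipped with R, the sketch below looks up print.lm, an S3 method that stats registers without exporting; the variable names are illustrative.

```r
# find() only looks at the search path, so an unexported method is invisible to it
on_path <- utils::find("print.lm")

# getAnywhere() also searches registered S3 methods and loaded namespaces
anywhere <- utils::getAnywhere("print.lm")$where

print(on_path)
print(anywhere)
```

In a default session find() should come back empty here, while getAnywhere() reports the stats namespace.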
3. Explicit Namespace Calling (Best Practice in CECS Research)
To ensure reproducibility and avoid masking:
❌ Avoid
tidy(model)
✔ Prefer
broom::tidy(model)
Advantages
No ambiguity about which function runs in clinical audit scripts
Avoids a masked function silently producing different results on different machines
Supports reproducible research, including PRISMA-compliant data extraction code
Prevents accidental use of the wrong method (e.g., mixed models vs fixed models)
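A minimal sketch of why the qualified form is safer, using only the stats package: inside local(), a hypothetical helper reuses the name `sd`, so the unqualified call picks up the helper, while `stats::sd()` still reaches the intended function.

```r
res <- local({
  # Hypothetical helper that happens to reuse the name `sd`
  sd <- function(x) "not a standard deviation"

  x <- c(1, 2, 3)
  list(
    unqualified = sd(x),        # resolves to the local definition above
    qualified   = stats::sd(x)  # always the function exported by stats
  )
})

print(res)
```

The same reasoning applies when the competing definition comes from another attached package rather than from a local helper.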
4. Detect Masks When Loading Packages
Whenever you attach a package with library(), R reports any name conflicts, e.g.:
The following objects are masked from 'package:stats':
filter, lag
Meaning:
After library(dplyr), an unqualified filter() or lag() calls the dplyr version; stats::filter() and stats::lag() are masked.
You can always call the original explicitly:
stats::filter()
dplyr::filter()
Namespace discipline is essential in multistage modeling pipelines (GLM → GEE → mixed-effects → survival).
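Masks can also be listed after the fact with base R's conflicts(). The sketch below does not assume dplyr is installed: it attaches a stand-in environment named `demo_masks` (a hypothetical name) containing a dummy lag(), then inspects the conflicts that result.

```r
# Attach a stand-in "package" whose lag() masks stats::lag().
# demo_masks and its lag() are hypothetical and exist only for this demo.
attach(list(lag = function(x, n = 1L) x), name = "demo_masks",
       warn.conflicts = FALSE)

# Conflicting names, grouped by the search path entry that holds them
masked <- conflicts(detail = TRUE)
print(masked[["demo_masks"]])
print(masked[["package:stats"]])

# Restore the search path
detach("demo_masks", character.only = TRUE)
```

Running `conflicts(detail = TRUE)` right after the library() calls in a real script gives the same information as the startup messages, in a form that can be logged.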
5. CECS Cheat Sheet: Where Common Functions Come From
Function | Package |
paste0, sprintf, factor, round | base |
mutate, filter, select, case_when | dplyr |
%>% | magrittr (re-exported by dplyr) |
pivot_wider, pivot_longer | tidyr |
tidy, glance, augment | broom |
tidy (mixed models) | broom.mixed |
emmeans, contrast | emmeans |
glmmTMB | glmmTMB |
gt, gtsave | gt |
read_dta | haven |
6. How to Know Which tidy() Was Used?
Since both broom and broom.mixed contain tidy():
find("tidy")
Returns something like:
"package:broom.mixed"
"package:broom"
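The first entry on the search path is the one an unqualified tidy() call resolves to, so the result depends on the order in which the packages were attached.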
Therefore: ALWAYS specify namespaces.
broom::tidy(model)
broom.mixed::tidy(model)
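The search-order rule can be checked without broom installed. In the sketch below, `demo_pkg` is a hypothetical stand-in for a package that defines its own filter(); it is attached ahead of stats, and the first hit from find() matches what the unqualified call actually runs.

```r
# Stand-in for a package that defines its own filter(); names are hypothetical
attach(list(filter = function(...) "demo version"), name = "demo_pkg",
       warn.conflicts = FALSE)

hits <- utils::find("filter")  # search path entries, earliest first
winner <- hits[1]              # the entry an unqualified filter() resolves to
unqualified <- filter(1:3)     # runs the demo_pkg version, not stats::filter()

print(hits)
print(unqualified)

# Restore the search path
detach("demo_pkg", character.only = TRUE)
```

Swapping the attach order in a real script changes `winner`, which is why qualified calls are more robust than relying on the order of library() lines.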
7. Why This Matters in Clinical Epidemiology & Statistics
Ambiguous function calls can lead to:
Incorrect standard errors (a function written for a different model class is called)
Unintended method dispatch (e.g., an S3 method where an S4 method was expected)
Inconsistent meta-analysis extraction
Audit failures due to non-reproducible scripts
Explicit namespaces (package::function) remove the ambiguity about which function is called.
8. Ready-to-Use CECS Code Snippets
Identify function origin
getAnywhere("FUNCTION")
find("FUNCTION")
Audit all attached packages
sessionInfo()
search()
Use explicit namespaces throughout a script
broom::tidy(model)
dplyr::mutate(df, ...)
tidyr::pivot_wider(df)
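For audit logs it can help to record the attached packages together with their versions. This is one possible sketch built on search() and utils::packageVersion(); it assumes every "package:" entry on the search path is a real installed package, and the names `attached` and `versions` are illustrative.

```r
# Names of attached packages, taken from the search path
attached <- sub("^package:", "", grep("^package:", search(), value = TRUE))

# Version of each attached package, e.g. for logging at the top of a script
versions <- vapply(
  attached,
  function(p) as.character(utils::packageVersion(p)),
  character(1)
)

print(versions)
```

sessionInfo() reports the same information in a printed form; the named vector here is simply easier to write to a log file or compare between runs.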
Conclusion
Function origin identification is not cosmetic: it is a core reproducibility requirement in CECS-level clinical analytics. Using getAnywhere(), find(), and explicit package::function notation gives you:
Transparency
Reproducibility
Mask-safe modeling
Audit-ready code




