What Is a Systematic Search? Keyword Validation and Noise Checks
🔍 Definition: What Is a Systematic Search?
A systematic search is a methodologically rigorous, comprehensive, and transparent process of identifying all relevant literature on a specific research question. It is a cornerstone of evidence-based reviews, like systematic reviews or meta-analyses, and is designed to be reproducible, exhaustive, and bias-minimized. This approach contrasts sharply with informal or narrative literature searches.
📌 Key Features
- Focused Research Question: Structured via frameworks like PICO (clinical) or DDO (determinant-driven).
- Protocol-Based: Search strategy is predefined and registered or documented before execution.
- Comprehensive: Searches multiple databases with both free-text keywords and controlled vocabulary (e.g., MeSH, Emtree), and includes synonyms and variant terms.
- Boolean Logic: Uses OR, AND, NOT to build structured search queries.
- Reproducible: Must be fully documented for replication.
- Bias-Reduction: Prevents selection bias by predefining inclusion/exclusion criteria and using exhaustive strategies.
📊 Pattern of a Query Table: Systematic Search Format
In systematic reviews, the search query documentation typically includes three structured components:
✅ 1. Search Term List (Concept Table)
| Component | Core Concepts & Variants |
| Domain | Clinical trial, Kidney, Nephrology |
| Determinant | Normality test, Skewness, Shapiro-Wilk, QQ plot |
| Outcome | Parametric test, ANOVA, Wilcoxon, Regression |
| Study Design | Systematic review, Meta-analysis |
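The concept table maps onto a Boolean query fairly directly: variants within one component are joined with OR, and the components are then joined with AND. A minimal Python sketch, assuming the table is held in a plain dictionary; the `build_block` and `build_query` helpers and the blanket `[tw]` tagging are illustrative choices, not part of any database's API:

```python
def build_block(terms, tag="tw"):
    """OR together the variants of one concept, quoting each as a phrase."""
    return "(" + " OR ".join(f'"{t}"[{tag}]' for t in terms) + ")"


def build_query(concepts, tag="tw"):
    """AND together one OR-block per concept."""
    return " AND ".join(build_block(terms, tag) for terms in concepts.values())


# Terms taken from the concept table above (Study Design left out for brevity).
concepts = {
    "Domain": ["clinical trial", "kidney", "nephrology"],
    "Determinant": ["normality test", "skewness", "Shapiro-Wilk", "QQ plot"],
    "Outcome": ["parametric test", "ANOVA", "Wilcoxon", "regression"],
}

query = build_query(concepts)
print(query)
```

In a real strategy each block would also carry controlled-vocabulary terms, which this sketch leaves out.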
✅ 2. Detailed Database Search Histories (Syntax Tables)
Each database (PubMed, EMBASE, Scopus) gets a structured table:
Example: PubMed Search History Table
| No | Query | Result |
| #1 | "observational study"[tw] OR "Observational Studies as Topic"[Mesh] | |
| #2 | "clinical trial"[tw] OR "Clinical Trial"[Publication Type] | |
| ... | ... | ... |
| #20 | #4 AND #9 AND #19 | |
Each row builds up Boolean logic blocks and uses database-specific syntax:
- "[tw]" for text word in PubMed
- "/exp" and ":ti,ab,kw,de" for Embase
- TITLE-ABS-KEY() for Scopus
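Because each database expects its own syntax, the same free-text term ends up written three ways. The sketch below renders one term using only the field codes listed above. The `format_term` helper is hypothetical, and the quoting style (double quotes for PubMed and Scopus, single quotes for Embase) is an assumption that should be checked against each database's help pages:

```python
def format_term(term, database):
    """Render one free-text phrase in the syntax of the given database."""
    if database == "pubmed":
        return f'"{term}"[tw]'
    if database == "embase":
        # Assumed quoting: Embase phrases in single quotes.
        return f"'{term}':ti,ab,kw,de"
    if database == "scopus":
        return f'TITLE-ABS-KEY("{term}")'
    raise ValueError(f"unknown database: {database}")


for db in ("pubmed", "embase", "scopus"):
    print(db, "->", format_term("normality test", db))
```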
✅ 3. Summary Table
A final table is used to summarize how many hits each search yields:
| Database | Result |
| EMBASE | |
| SCOPUS | |
| PUBMED | |
| Total (before deduplication) | |
| Total (after deduplication) | |
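The two totals can be computed once the exported records are loaded. A rough sketch, assuming each record is a dictionary with hypothetical `doi` and `title` keys, and that a matching DOI (or, when no DOI is present, a matching normalized title) counts as a duplicate. Reference managers apply more elaborate rules, and the records below are toy data invented for illustration:

```python
import re


def record_key(record):
    """Prefer the DOI as identity; fall back to a normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return "title:" + title


def summarize(exports):
    """exports maps a database name to its list of exported records."""
    per_db = {db: len(records) for db, records in exports.items()}
    before = sum(per_db.values())
    unique = {record_key(r) for records in exports.values() for r in records}
    return per_db, before, len(unique)


# Toy records with made-up DOIs, for illustration only.
exports = {
    "PUBMED": [{"doi": "10.1000/a", "title": "Paper A"}, {"doi": "", "title": "Paper B"}],
    "EMBASE": [{"doi": "10.1000/A", "title": "Paper A."}],
    "SCOPUS": [{"doi": None, "title": "paper b"}, {"doi": "10.1000/c", "title": "Paper C"}],
}

per_db, before, after = summarize(exports)
print(per_db, "before:", before, "after:", after)
```

One limitation of this rule: a record exported with a DOI from one database and without a DOI from another will not be matched, so the "after" count can be slightly too high.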
Keyword Validation
✅ 1. Check if the Search Strategy Captures the Key Papers
(Does the current search string retrieve known relevant studies?)
This is called "backward validation" or inclusion verification.
🔎 How to do it:
- Take a few relevant papers you already know are important.
- Run your full search string in each target database (e.g., PubMed).
- Check if those known papers appear in the results.
- To pinpoint a specific paper, append its author name and publication year to the search with AND (in PubMed, for example: AND "Author name"[au] AND Year[dp]).
- If a known paper is missing, examine the terms it uses:
  - Does it use synonyms or acronyms that your search missed?
  - Are you missing a MeSH or Emtree term?
🛠 Tools:
- Use the article's PubMed ID (PMID) or title to test if it gets retrieved.
- Use "Search within results" or filter by date or author to pinpoint it.
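Inclusion verification boils down to a set comparison between the PMIDs of the known papers and the PMIDs the search returned. A minimal sketch, assuming both lists have already been exported from PubMed; the `check_known_papers` helper is hypothetical and the PMIDs are placeholders, not real articles:

```python
def check_known_papers(known_pmids, retrieved_pmids):
    """Return the share of known papers retrieved and the list of misses."""
    known = set(known_pmids)
    found = known & set(retrieved_pmids)
    missing = sorted(known - found)
    recall = len(found) / len(known) if known else 0.0
    return recall, missing


# Placeholder PMIDs, for illustration only.
known = ["11111111", "22222222", "33333333", "44444444"]
retrieved = ["22222222", "55555555", "11111111", "44444444"]

recall, missing = check_known_papers(known, retrieved)
print(f"captured {recall:.0%} of known papers; missing: {missing}")
```

Each PMID in `missing` is a paper whose title, abstract, and indexing terms are worth inspecting for vocabulary the search string lacks.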
⚠️ 2. Check for Noisy or Overbroad Keywords
(Are some terms too generic and pulling in irrelevant results?)
This is part of precision testing — avoiding a flood of irrelevant results.
🔎 How to do it:
- Run individual keyword blocks (one at a time).
- Check the volume of results.
- Skim the first few pages:
  - Are many results irrelevant to your topic?
  - Do results come from domains (e.g., engineering, veterinary, finance) outside your scope?
📌 Tip: Problematic terms often include:
- Abbreviations: e.g., "SD", "AI"
- Overbroad terms: e.g., "mean", "regression", "normality"
- Truncated roots: e.g., nephro* retrieves "nephrology" but also every other word that starts with the same root, including ones outside your scope
✅ Refinement Strategies:
- Use phrase searching (quotes) to avoid word scattering.
- Combine with field tags: mean[tw] instead of just mean
- Add contextual anchors: pair broad terms with another concept using AND
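The three refinements can be applied mechanically to a raw term. A small sketch using PubMed-style syntax; the `refine` helper and its parameters are hypothetical, and the anchor is assumed to be an already formatted query fragment:

```python
def refine(term, tag="tw", anchor=None):
    """Quote multi-word phrases, attach a field tag, optionally AND an anchor."""
    # Phrase searching: quote anything containing a space.
    refined = f'"{term}"[{tag}]' if " " in term else f"{term}[{tag}]"
    # Contextual anchor: require a second concept alongside the broad term.
    if anchor:
        refined = f"({refined} AND {anchor})"
    return refined


print(refine("normality test"))
print(refine("mean", anchor='"kidney"[tw]'))
```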
🧠 Advanced Tip: Log and Score Terms
- Create a tracking table for each term:
- Include: result count, relevancy, noise ratio, and inclusion of known papers.
- Example:
| Term | Hits | Captures Known Papers | Noise | Keep? |
| "normality test"[tw] | 45 | Yes | Low | ✅ |
| mean[tw] | 50,000 | No | High | ❌ |
| "descriptive statistics"[tw] | 500 | Yes | Medium | ✅ |
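A log like this can also be kept in code. The sketch below estimates a noise ratio from a manually screened sample of hits and applies one possible keep/drop rule. The hit counts come from the example table; the sample counts, the field names, and the 0.5 threshold are illustrative assumptions, not established cut-offs:

```python
from dataclasses import dataclass


@dataclass
class TermLog:
    term: str
    hits: int
    sample_screened: int    # hits skimmed by hand
    sample_irrelevant: int  # of those, how many were off-topic
    captures_known: bool    # does the term retrieve any known paper?

    @property
    def noise_ratio(self):
        if self.sample_screened == 0:
            return 0.0
        return self.sample_irrelevant / self.sample_screened

    def keep(self, max_noise=0.5):
        # One possible rule: keep terms that find known papers without too much noise.
        return self.captures_known and self.noise_ratio <= max_noise


# Sample counts are invented for illustration.
log = [
    TermLog('"normality test"[tw]', 45, 20, 3, True),
    TermLog("mean[tw]", 50_000, 50, 47, False),
    TermLog('"descriptive statistics"[tw]', 500, 50, 20, True),
]

for entry in log:
    print(f"{entry.term}: hits={entry.hits}, noise={entry.noise_ratio:.2f}, keep={entry.keep()}")
```

A term that captures no known paper may still deserve a place in the strategy, so a rule like this is better treated as a prompt for review than as an automatic filter.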