Barie executes systematic literature reviews by querying PubMed, Cochrane, Embase, Web of Science, and clinical trial registries simultaneously. It screens against your inclusion/exclusion criteria, assesses study methodology and bias risk, and delivers the final synthesis organised by key finding, with a complete PRISMA flow diagram and a structured list of every included study with its DOI.
The problem with asking any AI tool about medical evidence
A clinical researcher asks a general-purpose LLM to summarise the evidence base for GLP-1 agonists in weight management. The output lists five clinical trials, describes them accurately, and synthesises the conclusions. But it is not a systematic review. It is a highly sophisticated literature search. It is working with information retrieved from the open web.
The researcher does not know if these are the *best* five trials, the *only* five trials, or just the five trials most heavily cited on the internet. She does not know what search terms the model used, which databases it queried, or what inclusion criteria it applied. The output is a black box. If she submits a literature review based on that output to a peer-reviewed journal, it will likely be desk-rejected, because a PRISMA flow diagram has to account for every record identified, screened, and excluded, and this output documents none of that.
Medical research demands an auditable trail of evidence. If a study is excluded, the researcher needs to know why. If a claim is made, the researcher needs to trace it back to the exact paragraph in the primary literature. A language model chatting with a search engine cannot do this. A dedicated systematic review workflow executing API queries directly against medical databases can.
💡
Barie executes literature searches against the databases directly: PubMed, Embase, and Cochrane Library are queried via API using structured boolean logic. The retrieval step relies on the database’s own indexing, not an LLM’s semantic interpretation of the open web. The LLM is used where it excels: reading the retrieved full texts at scale against your inclusion criteria to automate the screening phase. The retrieval is deterministic; the screening is algorithmic.
Your prompt
Task prompt
“Conduct a systematic literature review on GLP-1 agonists for weight management, last 5 years.”
One sentence. Barie activates the database connectors, executes boolean queries across all of them, deduplicates the results, and applies PRISMA screening methodology. Here is the full execution chain.
1
Searching Medical Databases
Step 1: Deep Research searches seven live medical databases simultaneously
The Deep Research agent converts the natural language prompt into structured boolean queries optimized for the syntax of each database. It queries all seven databases at the same time. A manual search across these databases sequentially — translating the terms, executing the queries, and exporting the results — typically takes 4-8 hours of highly repetitive work. Barie completes the retrieval phase in 45 seconds.
📘 PubMed / MEDLINE
Executes query string utilizing MeSH terms: (“Glucagon-Like Peptide-1”[Mesh] OR “GLP-1 receptor agonists”) AND (“Obesity”[Mesh] OR “Weight Loss”[Mesh])
146 hits
📕 Cochrane Library
Searches Cochrane Database of Systematic Reviews (CDSR) and Cochrane Central Register of Controlled Trials (CENTRAL).
63 hits
📙 Embase
Executes Emtree query: ‘glucagon like peptide 1 receptor agonist’/exp AND (‘obesity’/exp OR ‘weight reduction’/exp). Captures European literature often missed by PubMed.
204 hits
🌐 Web of Science
Executes broad topic search across Core Collection. Cross-references citation indices to ensure highly-cited cornerstone papers are not missed.
188 hits
📋 ClinicalTrials.gov
Searches registry for completed Phase 3 trials with posted results. Identifies unpublished data and trial protocols critical for assessing publication bias.
41 hits
📊 Scopus
Retrieves broad interdisciplinary literature. High overlap with Web of Science but captures additional conference proceedings and journals.
172 hits
🎓 Google Scholar
Keyword search across full text. Used primarily for grey literature (reports, white papers) and catching pre-prints on medRxiv that have not yet been indexed.
Top 100 hits
⚡
Deep Research retrieves the exact records: Every hit in this list corresponds to an actual PMID, DOI, or clinical trial registry ID retrieved directly from the database API. It is not generating a list of papers from its “training data memory.” It is executing a standard database search and downloading the metadata and abstracts. It is 100% deterministic and reproducible.
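As an illustration of what a direct database query looks like, the sketch below builds a request URL for the public NCBI E-utilities ESearch endpoint using the PubMed query string from Step 1. It is a minimal example under stated assumptions, not Barie's actual connector: the function name `pubmed_search_url`, the `retmax` value, and the choice to approximate "last 5 years" as 5 × 365 days are all illustrative.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string taken from the PubMed card above (straight quotes for the API).
TERM = (
    '("Glucagon-Like Peptide-1"[Mesh] OR "GLP-1 receptor agonists") '
    'AND ("Obesity"[Mesh] OR "Weight Loss"[Mesh])'
)


def pubmed_search_url(term: str, years: int = 5, retmax: int = 200) -> str:
    """Build an ESearch URL limited to records published in the last `years` years.

    Hypothetical helper for illustration; the defaults are assumptions.
    """
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",      # filter on publication date
        "reldate": years * 365,  # look back this many days from today
        "retmax": retmax,        # maximum number of PMIDs returned
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


url = pubmed_search_url(TERM)
print(url)
```

Logging the exact URL alongside the run date is what makes a search strategy reportable in a methods section: anyone can rerun the same query and compare the returned PMIDs.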
2
Screening and PRISMA Criteria
Step 2: Systematic screening — inclusion criteria applied, duplicates removed, studies classified
The combined retrieval yields 914 total records. Barie executes the PRISMA workflow automatically. It deduplicates across databases. It reads every abstract (and full text where available) against the inclusion criteria inferred from the prompt. It classifies the study type. The result is a clean, deduplicated, classified list of included studies ready for synthesis. This step typically takes a human researcher days to weeks.
✅
Inclusion criteria applied
Focus on weight management. Primary endpoint is weight loss. Adult populations. GLP-1 receptor agonists used as primary intervention. Published within the last 5 years. Clinical trials or systematic reviews.
❌
Exclusion criteria applied
Studies focusing primarily on glycemic control in T2D without weight endpoints. Animal studies. In vitro studies. Non-GLP-1 interventions. Opinion pieces, editorials, or case reports (n<10).
🔄
Deduplication across sources
The same pivotal trials appear in multiple databases. Barie normalizes DOIs and titles, identifying 412 duplicates across the 914 raw records. The unique record count is 502.
📊
Study type classification
Barie reads the methodology section of every abstract and labels the study: RCT, Observational, Meta-Analysis, Systematic Review, or Trial Protocol. Used for quality weighting in the synthesis step.
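The deduplication card above describes normalizing DOIs and titles. One way such a step could work is sketched below. This is an assumption about the general technique, not Barie's implementation: the helper names, the normalization rules, and the placeholder DOIs (`10.0000/example.N`) are all hypothetical.

```python
import re


def normalize_doi(doi):
    """Lowercase a DOI and strip resolver prefixes such as https://doi.org/."""
    if not doi:
        return None
    doi = doi.strip().lower()
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)
    return doi or None


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each normalized DOI or normalized title."""
    seen, unique = set(), []
    for rec in records:
        keys = {("title", normalize_title(rec["title"]))}
        doi = normalize_doi(rec.get("doi"))
        if doi:
            keys.add(("doi", doi))
        is_duplicate = bool(keys & seen)
        seen |= keys  # remember every key, even those contributed by duplicates
        if not is_duplicate:
            unique.append(rec)
    return unique


# Placeholder DOIs; the titles come from the included-studies list in this article.
records = [
    {"source": "PubMed", "doi": "10.0000/EXAMPLE.1",
     "title": "Tirzepatide Once Weekly for the Treatment of Obesity"},
    {"source": "Embase", "doi": "https://doi.org/10.0000/example.1",
     "title": "Tirzepatide once weekly for the treatment of obesity."},
    {"source": "Scopus", "doi": None,
     "title": "Tirzepatide Once-Weekly for the Treatment of Obesity"},
    {"source": "PubMed", "doi": "10.0000/example.2",
     "title": "Semaglutide 2.4 mg for Weight Management in Adults with Overweight or Obesity"},
]

unique = deduplicate(records)
print(f"{len(records)} raw records, {len(unique)} unique")
```

Matching on either key catches the case where one database supplies a DOI and another supplies only a title. Exact title matching is deliberately conservative; a production pipeline would likely add fuzzy matching and a manual review queue for near-misses.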
📄
Barie logs the reason for every excluded paper: When the 502 unique records are screened, 418 are excluded. If you ask Barie why a specific paper was excluded, it provides the exact reason (e.g., “Excluded because the primary endpoint was HbA1c reduction, not weight loss”). This log is exported as part of the PRISMA diagram data, providing total transparency for your methodology section.
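The PRISMA flow numbers quoted in Steps 1 and 2 can be checked with simple arithmetic. The sketch below only reproduces the figures stated in this article (counting Google Scholar's "Top 100 hits" as 100 records); the variable names are illustrative.

```python
# Hit counts per database, as reported in Step 1.
hits = {
    "PubMed / MEDLINE": 146,
    "Cochrane Library": 63,
    "Embase": 204,
    "Web of Science": 188,
    "ClinicalTrials.gov": 41,
    "Scopus": 172,
    "Google Scholar": 100,
}

DUPLICATES_REMOVED = 412     # from the deduplication card
EXCLUDED_AT_SCREENING = 418  # from the exclusion log

identified = sum(hits.values())
screened = identified - DUPLICATES_REMOVED
included = screened - EXCLUDED_AT_SCREENING

print(f"Records identified: {identified}")  # 914
print(f"Records screened:   {screened}")    # 502
print(f"Studies included:   {included}")    # 84
```

These three totals, together with the per-reason exclusion counts from the log, are the values a PRISMA flow diagram needs.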
3
The Structured Review Output
Step 3: The structured review — every study extracted, classified, and source-linked
84
Studies included in final review
648K+
Total participants across included studies
7
Core findings synthesised
100%
Findings sourced to primary literature
The PRISMA screening pipeline yields an included dataset of 84 studies. This list is not a bibliography. Barie extracts the key methodology, endpoints, and conclusions for every single study. The full dataset is formatted as a sortable, exportable matrix. Below is a sample of the highly-cited pivotal RCTs and meta-analyses that form the backbone of the evidence base in the final synthesis.
📄
Semaglutide 2.4 mg for Weight Management in Adults with Overweight or Obesity
NEJM · Wilding et al., 2021 · n=1,961 participants
RCT
doi.org
📄
Tirzepatide Once Weekly for the Treatment of Obesity (SURMOUNT-1)
NEJM · Jastreboff et al., 2022 · n=2,539 participants
RCT
doi.org
📄
Cardiovascular Outcomes with Semaglutide in Patients with Overweight or Obesity (SELECT trial)
NEJM · Lincoff et al., 2023 · n=17,604 participants
RCT
doi.org
📄
GLP-1 receptor agonists for weight management: a systematic review and network meta-analysis
The Lancet · Shi et al., 2022 · n=23,456 (pooled)
Meta-analysis
doi.org
📄
Real-world weight loss with semaglutide and liraglutide: an electronic health records study
Obesity · Ghusn et al., 2022 · n=1,244 participants
Observational
doi.org
🔗
Every paper links to the primary record, not a summary: The source link beside each included study points directly to the DOI, the PubMed ID record, or the full-text publisher page. The entire list is formatted as structured data with authors, year, journal, study design, and n-value so you can immediately drop it into a literature matrix or methodology section.
4
Key Findings and Synthesis
Step 4: Key findings synthesized from the evidence base — strength of evidence classified
For the final review, Barie reads all 84 included abstracts and full texts and synthesises the findings into core themes. Unlike an LLM chatting with the internet, Barie classifies the evidence strength behind each finding based on the study type and sample size underlying the claim. A finding supported by three large RCTs is flagged as Strong Evidence. A finding supported by small observational data is flagged as Weak Evidence. You do not just get a summary; you get a critical appraisal.
Comparative trials and network meta-analyses indicate tirzepatide (GIP/GLP-1 dual agonist) yields greater weight reduction compared to semaglutide across 68-72 week timeframes. The SURMOUNT trials show mean weight reductions up to 22.5% with high-dose tirzepatide, compared to approximately 15% with high-dose semaglutide in the STEP trials. Network meta-analyses confirm statistical superiority for tirzepatide across pooled endpoints. This finding is robust and supported by high-quality RCTs.
The 2023 SELECT trial demonstrated a 20% reduction in major adverse cardiovascular events (MACE) among patients with overweight or obesity and established cardiovascular disease, without diabetes. This represents a paradigm shift from viewing GLP-1s solely as weight loss drugs to viewing them as primary cardiovascular interventions. The magnitude of this finding is exceptional and heavily cited in recent policy literature. The evidence is strong, but driven primarily by a single massive, well-powered trial.
Multiple extension studies and post-trial observational data (e.g. the STEP 1 extension) indicate that patients regain approximately two-thirds of their lost weight within one year of discontinuing semaglutide or tirzepatide. This supports the clinical consensus that GLP-1 therapy for obesity requires chronic administration, similar to hypertension or lipid-lowering therapies. The evidence is consistent but relies heavily on extension phases rather than primary blinded endpoints.
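The grading idea described in Step 4, where evidence strength depends on study design and sample size, can be sketched as a simple rule. This is not Barie's rubric, which the article does not specify. The 1,000-participant cutoff for a "large" study, the intermediate "Moderate Evidence" tier, and the function name are assumptions made for illustration; only the "three large RCTs is Strong" and "small observational data is Weak" endpoints come from the text.

```python
from dataclasses import dataclass

LARGE_N = 1000  # hypothetical cutoff for a "large" study


@dataclass
class Study:
    label: str
    design: str  # e.g. "RCT", "Observational", "Meta-Analysis"
    n: int


def grade_finding(studies):
    """Assign an evidence label to one finding from its supporting studies."""
    large_rcts = [s for s in studies if s.design == "RCT" and s.n >= LARGE_N]
    if len(large_rcts) >= 3:
        return "Strong Evidence"
    if any(s.design in {"RCT", "Meta-Analysis"} for s in studies):
        return "Moderate Evidence"  # assumed middle tier, not named in the article
    return "Weak Evidence"


# Designs and sample sizes as listed in Step 3; grouping them under one
# finding here is purely illustrative.
rcts = [
    Study("Wilding et al., 2021", "RCT", 1961),
    Study("Jastreboff et al., 2022", "RCT", 2539),
    Study("Lincoff et al., 2023", "RCT", 17604),
]
observational = [Study("Ghusn et al., 2022", "Observational", 1244)]

print(grade_finding(rcts))           # Strong Evidence
print(grade_finding(observational))  # Weak Evidence
```

A rule this simple ignores risk of bias, heterogeneity, and directness, which formal frameworks such as GRADE take into account, so it should be read as a starting point for discussion rather than a validated appraisal method.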
Step 5: The review delivered in formats your research team actually uses
The full systematic review is useless as text in a chat window. It needs to be integrated into your drafting workflow. The structured review lands in Notion as a database for ongoing team collaboration. The full text, PRISMA flow data, and synthesis narrative export directly to a formatted Google Doc or Word document ready for editing. The complete matrix of 84 included studies exports to Google Sheets for sub-analysis or inclusion in systematic review appendices. The key findings and search strategy push to Slack or Teams for immediate team review.
For research teams using reference management software, the list of 84 included studies is exported to standard format files compatible with Zotero, Mendeley, and EndNote. Barie pulls the exact metadata directly from the database APIs (PubMed, Embase, etc.) to ensure the citations are clean, complete, and perfectly formatted. You do not spend three hours downloading RIS files from different databases and merging them manually. The clean citation file is attached to the output.
📚
Direct export to RIS format with full metadata mapping: Barie generates a raw `.ris` text file alongside the review text. You download the file and import it directly into Zotero or Mendeley. Every title, author, journal, volume, issue, page number, and DOI maps perfectly. This saves an estimated two hours of manual citation management per systematic review.
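For readers unfamiliar with the format, an RIS file is plain text with one tagged field per line (a two-letter tag, two spaces, a hyphen, a space, then the value). The sketch below writes one record using standard RIS tags. It is a minimal illustration, not Barie's exporter: the `to_ris` helper is hypothetical, the DOI is a placeholder, and the author list is abbreviated because this article gives only "Wilding et al.".

```python
def to_ris(record):
    """Serialize one journal-article record as an RIS entry."""
    lines = ["TY  - JOUR"]                       # record type: journal article
    for author in record["authors"]:
        lines.append(f"AU  - {author}")          # one AU line per author
    lines.append(f"TI  - {record['title']}")
    lines.append(f"JO  - {record['journal']}")
    lines.append(f"PY  - {record['year']}")
    if record.get("doi"):
        lines.append(f"DO  - {record['doi']}")
    lines.append("ER  - ")                       # end-of-record marker
    return "\n".join(lines) + "\n"


record = {
    "authors": ["Wilding"],        # full author list would come from the database record
    "title": "Semaglutide 2.4 mg for Weight Management in Adults with Overweight or Obesity",
    "journal": "NEJM",
    "year": 2021,
    "doi": "10.0000/example.1",    # placeholder, not the real DOI
}

ris = to_ris(record)
print(ris)
```

Writing several such entries back to back into a single `.ris` file gives a library that reference managers can import in one step.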
What you get
A systematic review is not just a summary of medical studies. It is a formal, reproducible method of identifying all available evidence for a specific clinical question, filtering it transparently, and synthesizing the findings based on methodological strength. A general-purpose LLM chatting with the internet cannot perform this task. It guesses. It hallucinates citations. It cannot reproduce its own work. It lacks the Boolean query syntax needed to retrieve comprehensively, the strict inclusion screening logic needed to filter objectively, and the study design classification capability needed to assess quality.
Barie’s Deep Research mode is built for researchers. It queries the actual medical databases. It retrieves the actual DOIs. It executes the PRISMA workflow step-by-step. Every claim is sourced. Every exclusion is logged. Every finding is graded by evidence quality. It turns a process that takes a human researcher three weeks into a process that takes forty-five minutes.
The Verdict
Medical evidence does not hold still. The SELECT trial changed the clinical and insurer calculus for semaglutide almost overnight when it was published in 2023. A generic AI tool producing a GLP-1 literature review from training data has a reasonable chance of not knowing it exists. Barie Deep Research retrieves from PubMed, ClinicalTrials.gov, Cochrane, Embase, and the primary journal pages at query time. The analysis reflects the evidence base as it stands today. That is not a feature. In a clinical and academic context, it is the minimum standard that makes the output usable.
Barie features used in this task
| Feature | ChatGPT | Perplexity | Barie |
| --- | --- | --- | --- |
| Medical Database API Direct Queries: structured Boolean queries sent to PubMed, Embase, Cochrane | ✗ | ✗ | ✓ |
| Automated PRISMA Screening: transparent inclusion/exclusion applied at scale to raw retrieval set, logging exclusions | ✗ | ✗ | ✓ |
| Evidence Quality Grading: findings grouped by evidence strength (RCT vs observational) rather than citation frequency | ✗ | ✗ | ✓ |
| Citation File Export: outputs structured RIS files mapping title, author, journal, and DOI perfectly to reference software | ✗ | ✗ | ✓ |