How Barie conducts a systematic review of published research on AI in K-12 education — themes, gaps, and trends from the last 3 years

Use case walkthrough | Education & Research | 9 min read

Barie searches ERIC, Scopus, Google Scholar, IEEE Xplore, and ACM Digital Library simultaneously. It screens papers against inclusion criteria, extracts themes and methodologies, identifies where the literature is concentrated and where it is absent, and delivers a structured review with a source link and quality assessment for every included paper.

Why manual systematic reviews in fast-moving fields are out of date before they are submitted

A doctoral researcher spent four months conducting a systematic review of AI in K-12 education for their literature review chapter. They searched ERIC and Google Scholar manually, applied inclusion criteria, and coded 44 papers by hand. By the time the draft reached their supervisor, 23 new papers had been published in journals they had already searched. Three of these papers directly addressed the gap the researcher had identified as their original contribution. The review was methodologically sound and empirically outdated at the same time.

The problem is not the effort. Manual systematic reviews at the required level of rigour take months because database searching, deduplication, screening, and extraction are all labour-intensive at scale. The field moves faster than the methodology: AI in K-12 education has produced over 400 peer-reviewed publications in the 36 months preceding April 2026. A researcher who queries five databases manually over four months is always working from an incomplete snapshot.

💡

Barie queries five academic databases simultaneously and extracts themes from the full text, not just the abstract: Most systematic review tools work from abstracts. Barie retrieves full-text access where available and scans methodology sections, findings paragraphs, and limitation notes to extract themes that do not appear in the title or abstract. A paper whose abstract describes “AI-assisted writing feedback” but whose methodology section describes “automated grading at scale” is coded correctly because the full text was read.
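To make the abstract-versus-full-text point concrete, here is a minimal sketch of section-aware theme coding. It is not Barie's extraction method, which this walkthrough does not document; plain keyword matching stands in for it, and the theme names, keyword lists, and sample sentences are all hypothetical.

```python
# Hypothetical theme dictionary: theme label -> phrases that signal it.
THEME_KEYWORDS = {
    "writing feedback": ["writing feedback"],
    "automated grading": ["automated grading", "automated scoring"],
}

def code_themes(sections):
    """Map each matched theme to the list of sections where it was found.

    `sections` is a dict of section name -> section text.
    """
    hits = {}
    for theme, keywords in THEME_KEYWORDS.items():
        matched = [
            name
            for name, text in sections.items()
            if any(keyword in text.lower() for keyword in keywords)
        ]
        if matched:
            hits[theme] = matched
    return hits

# Invented sample text that mirrors the example in the paragraph above.
paper = {
    "abstract": "We study AI-assisted writing feedback for students.",
    "methodology": "The tool performs automated grading at scale.",
}

abstract_only = code_themes({"abstract": paper["abstract"]})
full_text = code_themes(paper)
print(abstract_only)
print(full_text)
```

Coding from the abstract alone finds only the writing-feedback theme; passing the methodology section as well surfaces the grading theme and records which section it came from.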

Your prompt

Task prompt

“Conduct a systematic review of published research on AI in K-12 education: themes, gaps, and trends from the last 3 years.”

One sentence. Five academic databases activated simultaneously. Inclusion criteria applied at the connector level. Full-text extraction where accessible. Here is the complete workflow.

Academic Database Stack Activated

Step 1: Five academic connectors activated — each covering a database the others do not fully index

AI in K-12 education research is published across education journals (predominantly indexed in ERIC), computer science and engineering venues (IEEE Xplore, ACM Digital Library), interdisciplinary outlets (Scopus, Web of Science), and grey literature including conference proceedings and policy reports. No single database indexes all of these. ERIC misses most IEEE publications. Scopus misses many education policy reports. Google Scholar covers the broadest range but with the weakest metadata quality for systematic filtering. Barie activates five connectors and assigns each to the database it retrieves most effectively.

Barie Academic Research Stack · AI in K-12 · 3-Year Window

5 connectors · parallel

🔬 Deep Research

Queries ERIC (Education Resources Information Center), the primary index for education research, and Scopus for peer-reviewed journal articles published between April 2023 and April 2026. Search query: (“artificial intelligence” OR “machine learning” OR “generative AI”) AND (“K-12” OR “primary school” OR “secondary school” OR “elementary” OR “high school”). Inclusion criteria applied: peer-reviewed, English language, empirical or review study type.
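The Boolean query above can be assembled from two term lists, which keeps the search string reproducible when a term is added. This is a minimal sketch: the helper names are hypothetical, straight quotes replace the typographic ones, and each database may need its own field codes or syntax on top of this.

```python
def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"

def build_query(*groups):
    """Join several OR-groups with AND."""
    return " AND ".join(or_group(group) for group in groups)

# Term lists copied from the search query described above.
AI_TERMS = ["artificial intelligence", "machine learning", "generative AI"]
K12_TERMS = ["K-12", "primary school", "secondary school", "elementary", "high school"]

query = build_query(AI_TERMS, K12_TERMS)
print(query)
```

Keeping the terms in lists rather than a hand-typed string also makes it easy to log the exact query used for each search run, which a systematic review needs for its methods section.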

ERIC · Scopus

🕷️ Firecrawl

Retrieves search results from IEEE Xplore and ACM Digital Library for conference papers and journal articles from the same period. Computer science conferences (AIED, EDM, LAK) are often the primary venue for technical AI in education research, which is systematically underrepresented in ERIC. Firecrawl also retrieves abstracts, and full text where accessible, from the three highest-impact journals in the field: Computers and Education, Journal of Learning Analytics, and British Journal of Educational Technology.

IEEE · ACM · High-impact journals

🌐 Web Research

Retrieves grey literature: UNESCO, OECD, and national ministry of education policy reports published in the review period; conference proceedings from the AERA, CERA, and BERA annual meetings; and education preprints and working papers that have not yet completed peer review but reflect the current state of the field. Grey literature captures the policy and practitioner evidence base that peer-reviewed databases exclude.

Grey literature · Policy reports

⚡

Inclusion criteria and deduplication run automatically before any paper is coded: All connectors retrieve results simultaneously. Barie deduplicates across databases (the same paper often appears in ERIC and Scopus), applies the inclusion criteria to each record, and builds the included set before any theme extraction begins. The deduplication step alone typically removes 16 to 24% of raw search results as cross-database duplicates.
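Cross-database deduplication can be sketched as follows. This is one common approach, not a description of Barie's internals: records are matched on DOI when present and otherwise on a normalised title plus year. The field names, titles, and DOIs below are placeholders.

```python
import re

def normalise_title(title):
    """Lowercase and collapse punctuation so formatting differences do not block a match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def dedup_key(record):
    """Prefer the DOI; fall back to normalised title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalise_title(record["title"]), record.get("year"))

def deduplicate(records):
    """Keep the first copy of each paper and remember every database that returned it."""
    seen = {}
    for record in records:
        key = dedup_key(record)
        if key in seen:
            seen[key]["sources"].append(record["source"])
        else:
            seen[key] = {**record, "sources": [record["source"]]}
    return list(seen.values())

# Placeholder records, not real papers.
raw = [
    {"title": "AI Tutors in Grade 5 Mathematics", "year": 2024, "doi": "10.1000/example.1", "source": "ERIC"},
    {"title": "AI tutors in grade 5 mathematics.", "year": 2024, "doi": "10.1000/EXAMPLE.1", "source": "Scopus"},
    {"title": "Teacher Views on Generative AI", "year": 2025, "doi": None, "source": "ERIC"},
    {"title": "Teacher views on generative AI", "year": 2025, "doi": None, "source": "Google Scholar"},
    {"title": "Learning Analytics Dashboards", "year": 2025, "doi": "10.1000/example.2", "source": "ACM"},
]

unique = deduplicate(raw)
removed_share = (len(raw) - len(unique)) / len(raw)
print(len(unique), f"{removed_share:.0%}")
```

One limitation of this sketch: a paper that carries a DOI in one database and none in another will not match, so a production pipeline would need a second pass on titles.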

Search Results and Screening

Step 2: Results screened, included set confirmed, papers assessed for quality before theme extraction

847

Total records retrieved across all databases

186

Papers included after screening

Distinct research themes identified

Critical research gaps documented

After deduplication, 847 unique records are screened against the inclusion criteria. 186 papers pass all criteria and enter the included set. For each included paper, Barie extracts title, authors, publication venue, year, methodology type (experimental, quasi-experimental, qualitative, review, mixed methods), grade level studied, AI technology studied, key findings, and limitations. Papers are assessed for quality using a modified Mixed Methods Appraisal Tool (MMAT) rubric applied consistently across all 186 included studies.
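The screening step can be pictured as a set of explicit checks applied to every record, with the reasons for exclusion logged. The sketch below is an assumption about how such a check might look, not Barie's implementation; the `Record` fields and the exact day boundaries of the April 2023 to April 2026 window are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Record:
    title: str
    published: date
    language: str
    peer_reviewed: bool
    study_type: str  # e.g. "empirical", "review", "opinion"

# Assumed day-level boundaries for the review window.
WINDOW_START = date(2023, 4, 1)
WINDOW_END = date(2026, 4, 30)
ALLOWED_STUDY_TYPES = {"empirical", "review"}

def screen(record):
    """Return (included, reasons); reasons lists every criterion the record fails."""
    reasons = []
    if not (WINDOW_START <= record.published <= WINDOW_END):
        reasons.append("outside review window")
    if record.language != "English":
        reasons.append("not English")
    if not record.peer_reviewed:
        reasons.append("not peer-reviewed")
    if record.study_type not in ALLOWED_STUDY_TYPES:
        reasons.append("study type excluded")
    return (not reasons, reasons)

# Placeholder records, not real papers.
passing = Record("Placeholder study", date(2024, 5, 1), "English", True, "empirical")
failing = Record("Placeholder commentary", date(2022, 1, 15), "English", False, "opinion")

print(screen(passing))
print(screen(failing))
```

Recording every failed criterion, rather than stopping at the first, gives the exclusion counts that a PRISMA-style flow diagram reports.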

Themes, Gaps, and Emerging Trends

Step 3: Major themes, research gaps, and emerging trends — each documented with the papers that constitute the evidence

📚

Major theme: Intelligent tutoring systems dominate the empirical literature — 38% of included papers

Major theme: 71 papers

Seventy-one of the 186 included papers examine intelligent tutoring systems or AI-assisted adaptive practice platforms in K-12 settings. The majority study mathematics and reading domains with students in grades 3 to 8. Studies predominantly report positive effects on immediate assessment performance with limited follow-up beyond four weeks. The dominance of ITS research reflects the availability of commercial platforms (Khan Academy, Carnegie Learning, ALEKS) that provide research access, creating a commercially mediated selection bias in the empirical base.

71 papers
Grades 3-8 (elementary)
Mathematics and reading focus
📄 71 individual papers in ERIC and Scopus
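The 38% figure follows directly from the counts: 71 of 186 included papers. A short sketch of the frequency calculation is below; it assumes each paper carries one primary theme label, and "other" is a placeholder bucket for the remaining 115 papers, not a theme from the review.

```python
from collections import Counter

def theme_shares(theme_labels):
    """Return theme -> (paper count, share of all papers as a whole percent)."""
    counts = Counter(theme_labels)
    total = len(theme_labels)
    return {theme: (n, round(100 * n / total)) for theme, n in counts.items()}

# Counts taken from the review; "other" is a placeholder for everything else.
labels = ["intelligent tutoring systems"] * 71 + ["other"] * 115

shares = theme_shares(labels)
print(shares["intelligent tutoring systems"])  # (71, 38)
```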

⚠️

Research gap: Teacher professional development for AI integration, with only 8 empirical studies

Critical gap: 8 papers

Only eight empirical studies in the included set examine how teachers are trained to integrate AI tools into instructional practice. The policy literature consistently identifies teacher readiness as the primary implementation barrier, yet the empirical research base on effective professional development models is almost entirely absent. This is the most significant gap between practitioner need and research supply in the current literature. Three of the eight studies were published in the last six months of the review period, suggesting this gap is beginning to receive attention.

8 papers only
Highest policy-research gap
Emerging in 2025-2026
📄 UNESCO teacher readiness report, 2025

📈

Emerging trend: Generative AI in classroom writing — 41 papers in 18 months, up from 3 in prior 18 months

Fast-emerging: 41 papers

Research on generative AI tools (ChatGPT, Gemini, Claude) in K-12 writing instruction has grown from three papers in the first half of the review period to forty-one in the second half. The trend line is steep enough to suggest this will be the dominant theme in the next systematic review cycle. Current papers are disproportionately qualitative and exploratory, reflecting the early stage of both the technology and the research field. Longitudinal and experimental studies examining learning outcomes are almost entirely absent from the current body of work — a predictable gap given the recency of the phenomenon.

3 papers first 18 months -> 41 second 18 months
Mostly qualitative currently
📄 BJET special issue 2025; ACM L@S proceedings
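The trend figure is a comparison of paper counts in the two 18-month halves of the review period. The sketch below shows the bucketing and the resulting growth factor; the day-level boundaries are assumptions, and the three sample dates are illustrative only.

```python
from datetime import date

# Assumed boundaries: a 36-month window split at its midpoint.
PERIOD_START = date(2023, 4, 1)
MIDPOINT = date(2024, 10, 1)
PERIOD_END = date(2026, 4, 30)

def split_by_half(publication_dates):
    """Count papers published in the first and second half of the review period."""
    first = sum(1 for d in publication_dates if PERIOD_START <= d < MIDPOINT)
    second = sum(1 for d in publication_dates if MIDPOINT <= d <= PERIOD_END)
    return first, second

def growth_factor(first, second):
    """How many times larger the second-half count is than the first-half count."""
    if first == 0:
        raise ValueError("growth factor is undefined when the first half has no papers")
    return second / first

# Illustrative dates, not drawn from the review.
sample = [date(2023, 9, 1), date(2024, 11, 15), date(2025, 6, 1)]
halves = split_by_half(sample)

# Counts reported in the review: 3 papers, then 41.
factor = growth_factor(3, 41)
print(halves, round(factor, 1))
```

With the reported counts, the second half holds roughly 13.7 times as many papers as the first.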

Delivered to Research Tools

Step 4: The systematic review delivered to your research workflow tools

The full systematic review, with all 186 included papers, their extracted data, quality assessments, and theme classifications, exports to eight destinations. Notion holds the structured review document with all findings, gap analysis, and trend narrative. Airtable receives every included paper as a structured record (author, year, journal, methodology, grade level, AI technology, findings, quality score, and theme tags), sortable and filterable by any field. Google Sheets receives the complete paper matrix for further quantitative analysis, co-citation analysis input, and chart generation. A Zotero-compatible RIS citation file is generated for direct library import. A Word document is formatted as a publishable literature review section. Gmail drafts a summary briefing for a doctoral supervisor or research team. ClickUp creates a six-month monitoring task to re-run the search and flag new papers. Slack posts the review summary to the research team channel.

📓 Notion

Full structured review with themes, gaps, trends, and evidence narrative for every finding.

📋 Airtable

186-paper database with all extracted fields accessible, filterable by methodology and theme.

📊 Google Sheets

Paper matrix for quantitative analysis, frequency counts, and trend visualisation.

📄 Word (.docx)

Publication-ready literature review section formatted for journal submission or thesis chapter.

📚 RIS Export

Zotero and Mendeley-compatible citation file for all 186 included papers for reference management.

📧 Gmail

Supervisor or collaborator briefing email drafted with key findings, gaps, and recommended next steps.

✅ ClickUp

Six-month search update task — re-queries all five databases and flags new papers matching inclusion criteria.

💬 Slack

Research team channel digest with the three major themes, top six gaps, and emerging trend summary.
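For readers unfamiliar with the RIS export listed above, a record in that format is a sequence of two-letter tags, each followed by two spaces, a hyphen, and a space, ending with an `ER` line. The sketch below writes one such record. It assumes a simple dictionary per paper with hypothetical field names, and it is not Barie's exporter; the sample paper is a placeholder.

```python
def to_ris(paper):
    """Serialise one paper as an RIS record using standard RIS tags."""
    lines = [f"TY  - {paper.get('ris_type', 'JOUR')}"]  # JOUR = journal article
    for author in paper["authors"]:
        lines.append(f"AU  - {author}")
    lines.append(f"TI  - {paper['title']}")
    lines.append(f"JO  - {paper['journal']}")
    lines.append(f"PY  - {paper['year']}")
    if paper.get("doi"):
        lines.append(f"DO  - {paper['doi']}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)

# Placeholder paper, not a real reference.
example = {
    "authors": ["Doe, J.", "Roe, A."],
    "title": "Placeholder title",
    "journal": "Placeholder Journal",
    "year": 2025,
    "doi": "10.1000/example.1",
}

ris_record = to_ris(example)
print(ris_record)
```

Conference papers would use `CONF` as the type instead of `JOUR`; reference managers such as Zotero and Mendeley read the type from the `TY` line on import.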

The Verdict

A manual systematic review that takes four months to complete is already partially outdated when it is submitted. The AI in K-12 field published 41 papers on generative AI in writing instruction in eighteen months. A review completed before month 10 of that period missed more than half of the literature that now defines the most active area of the field. Barie queries ERIC, Scopus, IEEE Xplore, ACM, and grey literature simultaneously, deduplicates automatically, applies inclusion criteria consistently, and extracts themes from full text where available. The literature is current to the day the search is run. The methodology is applied identically across every paper. The result is a systematic review that takes a session rather than a semester and is current rather than historical.

Barie features used in this task

Feature

ChatGPT

Perplexity

Barie

Five Academic Database Connectors — ERIC, Scopus, IEEE, ACM, and grey literature queried simultaneously

✗

✓

Full-Text Theme Extraction — methodology and findings sections read, not just abstracts

✗

✓

Automated Deduplication — cross-database duplicates removed before any paper is coded

✗

✓

Eight Output Connectors — Notion, Airtable, Google Sheets, Word, RIS, Gmail, ClickUp, and Slack

✗

✓