ForIntel Frontier Concept Brief · Sample Report

Sample ForIntel Frontier Concept Brief — Four Frontier Concepts, One That Carries Signal on Every Leg

A public sample of a ForIntel Frontier Concept Brief — a signal-strength read of the AI-agents research field. It ranks four candidate frontier concepts on three legs of the scholarly record — recency, citation magnitude where the window allows, and publication-growth slope — and finds that tool-use / function-calling agents is the only one carrying signal on every leg (recent, already cited, growing ~4.8×), while agent evaluation and multi-agent orchestration rest on recency alone and agentic retrieval is deferred for re-instrumentation.

14 min read · Published 2026-06-20 · ai-agents-research vertical

What this sample shows

This is a public sample of a ForIntel Frontier Concept Brief, published by Foragentis to demonstrate the method. It is a field-level read — it has no private buyer and names no client. Every concept, dataset and cited work in it is public and is preserved in full; the only thing reframed for this sample is the deliverable's confidential cover note, which becomes the public-method framing you are reading now.

It demonstrates what the tier reads: a signal-strength ranking of research attention across four candidate frontier concepts in the AI-agents field, scored on the three legs the scholarly publication record actually supports — recency, citation magnitude (where the frontier window allows it to be read), and publication-growth slope — with the leading concept's growth verified on an independent slice of the same record. Each finding carries an explicit confidence label, and the read is deliberately hedged where the evidence is thin: the citation leg is treated as an early / partial read in a frontier window, three of the four concepts are ranked on recency and growth alone because their recent work has not yet accrued citations, and verification is named as internal-consistency rather than cross-corpus corroboration. This brief is a research-attention read; it makes no technical, benchmark or investment claim about agentic systems.

Field: AI agents / agentic-systems research domain
Scope: Frontier-concept signal-strength ranking · research-attention read (four candidate concepts)
Window: Trailing ~18-month frontier window · 36-month growth lookback · Jun 2026
Method: Three-leg signal read of the scholarly publication record (recency, citation magnitude where readable, and publication-growth slope), with the top concept's growth verified on an independent slice
Prepared by: ForIntel by Foragentis

The verdict

Of four candidate frontier concepts, one is rising on every leg the record lets us measure — and the ranking says exactly which leg each rests on.

This is a signal-strength read of the AI-agents research field, scored on three legs of the scholarly record. The headline is that the candidate set does not rank close. Tool-use / function-calling agents is the only concept where the field's attention is visible on every measure we can read — recent, already cited, and growing — so it earns a deep-dive. The other three rest on recency alone: recent activity is present, but their recent work has not yet accrued enough citations to read a magnitude leg, so they are ranked on what is genuinely there and the rest is withheld rather than guessed. And because the read is built on a single scholarly record, the verification step re-pulls a different slice of that same record — an internal-consistency check, with cross-corpus corroboration named as the next enhancement rather than claimed here.

  1. Tool-use / function-calling agents is the clear frontier — the only concept carrying signal on all three legs. Recency: the most-recent slice of the record returns an all-recent set, the great majority of it genuinely on-topic — far cleaner than any other candidate. Citation magnitude: its most-cited recent works have already accrued citations — the field anchor "A survey on large-language-model-based autonomous agents" at 1,148 citations, alongside "Can Open LLMs Catch Vulnerabilities?" (488) and "ChatGPT for Robotics" (436) — the one concept whose citation leg is readable at all. Publication growth: recent-period output runs roughly 4.8× the early period, and that slope holds when re-computed on an independent slice of the record. It ranks first on a strong, multi-leg signal — not on any single number.
  2. The other three rank on recency alone — their citation leg is not yet readable. Agent evaluation / benchmarks / safety shows strong recent activity (a healthy share of the most-recent slice is on-topic, with governance- and benchmark-themed titles), but its recent papers have not accrued citations yet, so the citation leg is withheld rather than guessed — a watch. Multi-agent orchestration / collaboration is recent-active too, but its most-cited works are an older control-theory / robotics literature the broad term also catches — recency-only signal, a watch. Agentic retrieval / long-horizon memory reads weakest — but that is at least partly a measurement gap: the sub-topic is too new for the field's controlled vocabulary, so the free-text term under-catches it and the slice fills with off-topic and not-yet-dated records. We defer it and re-instrument rather than declare it quiet.
  3. Citation velocity is the weak leg for everything in a frontier window — we read it only where it is real, and label it early. Papers published in the trailing window have had too little time to accrue citations, so a raw citation count under-represents a genuinely hot recent concept. We therefore never manufacture a velocity number: where accrual is too thin to read (three of the four concepts), the rank rests on recency and growth and we say so; where it is readable (tool-use), we report it as a partial / early read, not a settled velocity. And because this read is built on a single scholarly record, the verification step re-pulls a different slice of that same record rather than a second independent index — an internal-consistency check, with cross-corpus corroboration named as the next enhancement.

In one line: Tool-use / function-calling agents leads (deep-dive) on a strong three-leg signal — recent, citation-accruing, and growing ~4.8× with the slope verified on an independent slice. Agent evaluation and multi-agent orchestration follow on recency alone (watch); agentic retrieval is deferred — its low rank is as likely a measurement gap as a quiet field, and it should be re-instrumented before it is judged.

How to read this report. This is a signal-strength ranking of research attention, not a technical assessment of agentic systems. It reports which sub-topics the scholarly record is moving toward fastest, on three legs: recency (share of activity in the most-recent periods), citation magnitude (how much recent work has already been cited — readable only where the accrual window allows), and the publication-growth slope (recent vs earlier output). Each concept's rank names which legs it rests on. It does not adjudicate which agentic approach works best, benchmark any system, or make a build, architecture or investment recommendation — those judgements are yours. Every figure is recomputed from the underlying scholarly records, partial reads are labelled partial, and the citation leg is treated as an early read in a frontier window, never a settled one.
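
The three legs defined above can be sketched in a few lines. This is a minimal illustration, not the brief's actual pipeline: the `Work` record, the function names, and the choice to express citation magnitude as a simple sum are all assumptions made for the example, and the sample records are hypothetical placeholders rather than data from the pulls.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Work:
    """One record from a pull (hypothetical shape, not the real schema)."""
    year: Optional[int]  # None when the record is not yet dated
    citations: int
    on_topic: bool


def _recent_on_topic(works: List[Work], recent_from: int) -> List[Work]:
    return [w for w in works
            if w.on_topic and w.year is not None and w.year >= recent_from]


def recency_share(works: List[Work], recent_from: int) -> float:
    """Share of the slice that is both on-topic and recent."""
    if not works:
        return 0.0
    return len(_recent_on_topic(works, recent_from)) / len(works)


def citation_magnitude(works: List[Work], recent_from: int) -> Optional[int]:
    """Citations accrued by recent on-topic work; None means 'unreadable'."""
    total = sum(w.citations for w in _recent_on_topic(works, recent_from))
    return total if total > 0 else None


def growth_slope(works: List[Work], split_year: int) -> Optional[float]:
    """Recent-period output divided by early-period output."""
    dated = [w for w in works if w.on_topic and w.year is not None]
    early = sum(1 for w in dated if w.year < split_year)
    recent = sum(1 for w in dated if w.year >= split_year)
    return recent / early if early else None


# Hypothetical placeholder slice, for illustration only.
sample = [
    Work(year=2025, citations=10, on_topic=True),
    Work(year=2021, citations=50, on_topic=True),
    Work(year=None, citations=0, on_topic=True),    # not yet dated
    Work(year=2025, citations=0, on_topic=False),   # off-topic over-catch
]
```

Returning `None` rather than zero for an unaccrued citation leg mirrors the brief's stance that an unreadable leg is withheld, not scored as weak.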

01 · The ranking — one concept carries signal on all three legs

(Confidence: High.) We held the candidate set fixed at four concepts so the read is comparable across them, then scored each on the three legs the scholarly record actually supports. (Source: raw scholarly-record pulls, recomputed per concept across recency, citation magnitude and publication-growth slope.) The result is not close. Tool-use / function-calling agents is the only candidate that reads strong on every leg; the other three rest on recency — recent activity is present, but their recent work has not yet accrued enough citations to read a magnitude leg, so we rank them on what is genuinely there and withhold what is not.

| Candidate concept | Recency leg | Citation-magnitude leg | Publication-growth leg | Posture |
| --- | --- | --- | --- | --- |
| Tool-use / function-calling agents | Strong | Strong (early read) | Strong | Deep-dive |
| Agent evaluation / benchmarks / safety | Strong | Unreadable (not yet accrued) | Thin | Watch |
| Multi-agent orchestration / collaboration | Strong | Unreadable (legacy work only) | Thin | Watch |
| Agentic retrieval / long-horizon memory | Thin (under-caught) | Unreadable | Thin | Defer & re-instrument |

Figure — The three-leg signal matrix (each candidate concept × each signal leg, recomputed from the raw scholarly-record pulls). Strong = the leg reads clearly in the concept's favour; thin / partial = present but caveated; unreadable = the leg cannot be read for this concept (e.g. recent papers have not accrued citations). Only tool-use / function-calling agents reads strong across the row.
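
One way to read the matrix is as a small decision rule from leg readings to posture. The sketch below encodes the four rows as published; the rule itself (all legs strong gives a deep-dive, strong recency alone gives a watch, anything else is deferred) is an inference that reproduces the table, not a rule the brief states, and `posture` is a hypothetical helper name.

```python
# Leg readings per concept, in the order (recency, citation magnitude, growth),
# transcribed from the three-leg signal matrix.
MATRIX = {
    "Tool-use / function-calling agents": ("strong", "strong", "strong"),
    "Agent evaluation / benchmarks / safety": ("strong", "unreadable", "thin"),
    "Multi-agent orchestration / collaboration": ("strong", "unreadable", "thin"),
    "Agentic retrieval / long-horizon memory": ("thin", "unreadable", "thin"),
}


def posture(recency: str, citation: str, growth: str) -> str:
    """Map leg readings to a posture (assumed rule, consistent with the table)."""
    if all(leg == "strong" for leg in (recency, citation, growth)):
        return "Deep-dive"
    if recency == "strong":
        return "Watch"
    return "Defer & re-instrument"


for concept, legs in MATRIX.items():
    print(f"{concept}: {posture(*legs)}")
```

Under this assumed rule, a watch concept moves to a deep-dive only when both of its missing legs start to read strong, which is stricter than a rule that promotes on a single additional leg.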

The reason the citation leg is unreadable for three of four concepts is structural, not an oversight: in a frontier window, recent papers have had too little time to accrue citations. So for those three we rank on recency and the growth read, and we say so — we do not invent a velocity number to fill the gap. The next two sections read the two legs that are measurable for the leader: its citation magnitude, and its verified growth slope.

02 · Citation magnitude — the leader's recent work is already being cited

(Confidence: Medium.) For tool-use / function-calling agents, the most-cited in-window works are recent (22 of the 25 are dated 2023–2026) and they have measurably accrued citations. (Source: most-cited in-window pull for the leading concept, restricted to recent on-topic works.) The on-topic anchor, "A survey on large-language-model-based autonomous agents", has drawn 1,148 citations; "Can Open LLMs Catch Vulnerabilities?" has 488; "ChatGPT for Robotics" has 436. For every other candidate, the same most-cited pull returns only pre-2023 legacy work: their recent papers have not yet accrued citations, so there is no magnitude leg to read.

| Recent, on-topic tool-use work | Citations |
| --- | --- |
| A survey on large-language-model-based autonomous agents | 1,148 |
| Can Open LLMs Catch Vulnerabilities? | 488 |
| ChatGPT for Robotics | 436 |

Figure: Recent, on-topic tool-use work that has already accrued citations (most-cited in-window works for the leading concept). Only recent, on-topic works are shown. The broad search term also catches off-topic biomedical and general-AI papers, a known limit of carrying a too-new sub-topic as free text; those are excluded here and disclosed in the boundaries. Even read favourably, this is an early / partial citation read: a frontier window under-counts by construction.

Two honest caveats travel with this leg. First, it is an early read — in a frontier window citations are still arriving, so these magnitudes will be larger later and are a floor, not a verdict. Second, the broad free-text term that locates the concept also catches off-topic work (general-AI and biomedical papers using the words "tool use"); the figures above are the on-topic subset, and the over-catch is named as a boundary so the magnitude is not overstated.
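
The second caveat, reading magnitude only on the on-topic subset, can be sketched as a filter applied before any count is reported. The three titles and citation counts below are the ones named in this section; the `CitedWork` type, the hand-assigned `on_topic` flag, and the off-topic row (a hypothetical placeholder, not a real record) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CitedWork:
    title: str
    citations: int
    on_topic: bool  # assumed to be a hand-assigned label


pull = [
    CitedWork("A survey on large-language-model-based autonomous agents", 1148, True),
    CitedWork("Can Open LLMs Catch Vulnerabilities?", 488, True),
    CitedWork("ChatGPT for Robotics", 436, True),
    # Hypothetical placeholder standing in for the off-topic over-catch.
    CitedWork("<off-topic paper that happens to use the words 'tool use'>", 0, False),
]


def on_topic_magnitude(works: List[CitedWork]) -> Tuple[List[CitedWork], int]:
    """Drop off-topic records, rank the rest by citations, and sum them."""
    kept = sorted((w for w in works if w.on_topic),
                  key=lambda w: w.citations, reverse=True)
    return kept, sum(w.citations for w in kept)


kept, combined = on_topic_magnitude(pull)
print(combined)  # 2072: combined citations of the three named works only
```

The combined figure covers just the three works named here, so like each individual count it is a floor for the leg, not a total for the concept.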

03 · Publication growth — the leader's slope holds under an independent re-pull

(Confidence: High.) The third leg is the publication-growth slope: is the concept's output accelerating? (Source: an independently-phrased re-pull over a 2022–2024 window, 200 works, of the same scholarly record.) For the leader, the answer is yes — and it survives an independent check. Re-computed on a different, tightened slice of the same scholarly record, in-window output for tool-use / function-calling agents rose from 18 works in the early half to 87 in the recent half — roughly a 4.8× lift. Because the slope reproduces on a slice other than the one that first surfaced it, the growth is not a single-query artifact.

| Slice of the independent re-pull (200 works) | In-window works |
| --- | --- |
| Early half | 18 |
| Recent half | 87 |
| Recent ÷ early | ~4.8× |

Figure — Recent-period output runs ~4.8× the early period — and the slope reproduces (independent re-pull of the leading concept's growth; 200 works, tightened 2022–2024 window). This is the verification leg: the growth claim is re-computed on a second, independently-phrased slice of the same record and the ~4.8× lift holds. One limit travels with it — this is internal-consistency verification (a second slice of the same scholarly record), not corroboration against a second independent index.
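
The lift in the table is a plain ratio of the two half-window counts. A minimal sketch of the arithmetic, using the 18 and 87 reported above (the function name `growth_lift` is hypothetical):

```python
def growth_lift(early_count: int, recent_count: int) -> float:
    """Recent-half output divided by early-half output."""
    if early_count <= 0:
        raise ValueError("early-period count must be positive to form a ratio")
    return recent_count / early_count


lift = growth_lift(early_count=18, recent_count=87)
print(f"{lift:.2f}x")  # prints 4.83x, reported in the brief as ~4.8x
```

The guard on a zero early count matters for the newest sub-topics: a concept with no early-period output has no defined slope, which is different from a flat one.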

This is the strongest single piece of evidence behind the rank-1 placement, and the reason the leader earns a deep-dive while the others earn a watch or a deferral: recency tells you a topic is busy now; a verified growth slope tells you the busyness is accelerating. The honest boundary is the nature of the check: it confirms the slope is real within this record, not that a second, independent scholarly index would reproduce it. That cross-index corroboration is the named next enhancement (see Boundaries).

04 · What to do — deep-dive one, watch two, defer one

(Confidence: Medium.) The ranking converts into a research-attention posture per concept. Each posture ties to the legs that survived, and to the caveat where one applies. These are recommendations about where to place reading, hiring and early scouting attention — not technical or investment verdicts.

  1. Deep-dive — tool-use / function-calling agents. The only concept with a strong, verified, multi-leg signal: clean recency, a readable (if early) citation leg, and a ~4.8× growth slope that holds on an independent slice. It warrants dedicated reading and scouting attention now — this is where the field's measurable attention is concentrating and accelerating.
  2. Watch — agent evaluation / benchmarks / safety. Strong recent activity and on-topic governance/benchmark work, but its recent papers have not yet accrued citations, so the rank rests on a single readable leg (recency) plus a thin growth read. Re-check next window: if the citation leg starts to read, it moves toward a deep-dive.
  3. Watch — multi-agent orchestration / collaboration. Recent activity is present, but the term catches a sizeable older control-theory / robotics literature, so the signal is recency-only and partly diluted by term over-catch. Worth watching with a tightened, agent-specific query next cycle to separate the new orchestration work from the legacy multi-agent-systems base.
  4. Defer (and re-instrument) — agentic retrieval / long-horizon memory. Reads weakest, but the low rank is as likely a measurement gap as a quiet field: the sub-topic is too new for the controlled vocabulary, the free-text term under-catches it (most of its slice is off-topic or not-yet-dated), so no confident judgement is warranted this cycle. The action is to re-instrument with better-targeted terms before ranking it — not to conclude it is cold.

05 · Boundaries — what this read does and does not establish

(Confidence: High.) This brief is a signal-strength ranking of research attention, built on a single scholarly publication record in a fast-moving frontier window. The honest limits below bound what the ranking can claim; none of them changes the order at the top (tool-use is the only concept that reads on all three legs, and its growth slope reproduces on an independent slice of the same record), but each is a real constraint a reader should carry.

  • Citation velocity is an early read in a frontier window. Papers published in the trailing window have had too little time to accrue citations, so a raw citation count under-represents a genuinely hot recent concept. For three of the four concepts the recent-paper citation leg is too thin to rank on — they are ranked on recency and growth, and the thinness is declared, not papered over with a fabricated velocity number. Even for the leader, the citation magnitude is a partial / early read (a floor), not a settled velocity. Closing it: re-read the citation leg in a later window once the frontier papers have had time to accrue.
  • Single scholarly record — verification is internal-consistency, not cross-corpus. Every leg is read from one scholarly publication record, and the verification step re-pulls a different slice / sort / window of that same record rather than joining to an independent second index. So a claim that survives the re-pull is internally consistent, not externally corroborated. Independent cross-corpus verification against a second scholarly index is a pending enhancement not available this cycle (it requires second-source access not yet provisioned). Closing it: a cross-index corroboration pass once that access is in place.
  • The newest sub-topics are too new for the field's controlled vocabulary. The agentic sub-topics are mostly not yet recognised concepts in the scholarly record's controlled index, so they are located by free-text terms — which both over-catch (the leader's broad term also returns off-topic biomedical and general-AI work, excluded from the magnitude read above) and under-catch (agentic retrieval's term misses most on-topic work, which is why it is deferred). This is the single largest source of measurement noise in the read. Closing it: hand-curated, concept-specific query sets and a re-instrumentation pass for the under-caught concepts.
  • Indexing lag on the most-recent periods. The very latest months are under-indexed, and the record also carries some forward-dated entries (works stamped with future publication dates). A low most-recent-month count can therefore be a lag artifact rather than real cooling, and a most-recent slice can fill with not-yet-cited or future-dated records — we read recency on the share that is genuinely recent and on-topic, and flag where a slice was lag-distorted (part of why agentic retrieval is deferred). Closing it: re-pull the recency leg after the indexing settles.
  • The pulls are capped — counts are floors, not field totals. Each concept's slice is capped at a fixed number of records per pull; the brief uses these as floors and a comparative read across the equally-capped concepts, not as a complete census of the field. The ordering holds steady under the cap (it rests on the relative strength of the legs, read identically across concepts); the absolute counts are not field totals. Closing it: an uncapped, paginated count per concept to convert the comparative read into absolute volumes.
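
The indexing-lag boundary above suggests triaging each slice before its recency is read. The sketch below is one possible shape for that step, under stated assumptions: the function name, the 0.5 distortion threshold, and the example dates are all hypothetical, and the window start is only an approximation of the trailing ~18-month window.

```python
from datetime import date
from typing import Dict, Optional, Sequence, Tuple


def triage_slice(
    pub_dates: Sequence[Optional[date]],
    window_start: date,
    as_of: date,
    distortion_threshold: float = 0.5,  # assumed cut-off, not from the brief
) -> Tuple[Dict[str, int], float, bool]:
    """Bucket records by date quality, then report the genuinely recent share."""
    counts = {"recent": 0, "older": 0, "forward_dated": 0, "undated": 0}
    for d in pub_dates:
        if d is None:
            counts["undated"] += 1
        elif d > as_of:
            counts["forward_dated"] += 1  # stamped with a future date
        elif d >= window_start:
            counts["recent"] += 1
        else:
            counts["older"] += 1

    total = len(pub_dates)
    recent_share = counts["recent"] / total if total else 0.0
    unusable = counts["forward_dated"] + counts["undated"]
    lag_distorted = total > 0 and unusable / total > distortion_threshold
    return counts, recent_share, lag_distorted


# Hypothetical slice: one usable recent record, three that cannot be read.
dates = [date(2025, 3, 1), date(2027, 1, 1), None, None]
counts, share, distorted = triage_slice(
    dates, window_start=date(2024, 12, 20), as_of=date(2026, 6, 20)
)
print(counts, share, distorted)
```

A slice flagged as lag-distorted is a candidate for a later re-pull rather than a low recency score, which matches how the deferred concept is handled.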

This is a frontier-concept signal-strength read at the Frontier Concept Brief tier. The natural next step is a deeper engagement that hardens the ranking: (1) a cross-index corroboration pass once second-source access is provisioned, so the leader's growth and citation reads are confirmed against an independent scholarly record, not just a second slice of one; (2) hand-curated, concept-specific query sets that close the controlled-vocabulary gap — especially to re-instrument the deferred concept before it is judged; and (3) a later-window re-read of the citation leg once the frontier papers have had time to accrue, to convert the early citation read into a settled velocity. To commission it, reach the ForIntel desk directly at forintel@foragentis.com.

This is a public sample of a ForIntel Frontier Concept Brief, published by Foragentis to demonstrate the method. It is a field-level read with no private buyer; every concept and cited work is public and preserved in full. The recency and publication-growth legs are directly observed and the leader's growth slope is verified on an independent slice of the same record; the citation-magnitude leg is an early / partial read in a frontier window and is readable for only one of the four concepts. Verification is internal-consistency (a second slice of the same record), not corroboration against a second independent scholarly index, which is a named pending enhancement. The brief reports research-attention signal only and states no technical, benchmark or investment conclusion about agentic systems.

© 2026 Foragentis. This report may be cited with attribution. Redistribution requires permission.