
In the rapidly evolving landscape of AI, the "three-layer stack" framework helps explain where value is created and where it is being commoditized. As Evan Armstrong explores, this stack consists of Systems of Record (the data), the Context Layer (the intelligence bridge), and Endpoint Applications (the interface).

As the first and third layers become increasingly commoditized, a critical shift is underway in which the value of AI migrates almost entirely to the Context Layer. For financial firms, where precision is non-negotiable, this is where the battle for high-rigor output is won or lost.

The Three-Layer Stack and the Commoditization Trap

According to Armstrong, the new AI stack is being rapidly hollowed out at its extremes.

Systems of Record are the databases and repositories where information lives, such as CRMs or document stores. While necessary, they are becoming commoditized as integration becomes easier. On the other end of the spectrum are Endpoint Applications, the user interfaces where humans interact with AI. As UI libraries and LLM wrappers proliferate, the interface itself is no longer a sustainable competitive advantage.

The Context Layer represents the connective tissue between these two extremes. It handles selecting, refining, and injecting the correct data into a model at the right time. As the industry matures, the Context Layer is where the actual intellectual property resides. In Financial AI, it represents the difference between a generic chatbot and a sophisticated AI analyst that understands the nuances of an investment committee memorandum or a complex credit agreement.

Bullpen, a Context Engine: The Right Data at the Right Time

At Bullpen, we recognized early on that the most significant bottleneck in financial services AI is not the raw intelligence of the models but their ability to handle context with surgical precision. That is why we define our product not just as an AI application but as a Context Engine.

The Fallacy of the Infinite Window

Our focus on context begins with a hard constraint: while leading models like Gemini and Claude boast context windows of up to 1 million tokens, most enterprise implementations remain capped at roughly 200,000 tokens because the computational cost of attention scales quadratically with context length.
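The cost asymmetry is easy to quantify. The following is an illustrative back-of-the-envelope sketch of quadratic scaling, not a claim about any specific model's pricing:

```python
def relative_attention_cost(n_tokens: int, baseline: int = 200_000) -> float:
    """Relative O(n^2) self-attention cost versus a baseline window size."""
    return (n_tokens / baseline) ** 2

# A 1M-token window costs roughly 25x a 200k-token window per attention pass.
print(relative_attention_cost(1_000_000))  # 25.0
```

Under this scaling, filling a window five times larger costs twenty-five times as much, which is why effective deployments stay well below the advertised maximum.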

When financial professionals attempt to feed large volumes of data from hundreds of pages across multiple documents into these models, they encounter a critical failure mode. Exceeding the effective context window leads to missed content, hallucinations, and diluted analysis. For high-rigor financial tasks such as private equity memoranda or complex credit agreement analysis, these errors are unacceptable.

While the market often focuses on the size of a model's context window, more data is rarely the answer for financial professionals. Because attention cost scales quadratically, processing massive inputs is expensive, and the added volume typically yields diminishing returns in quality.

Bullpen’s Differentiator Lies in Context Engineering

Instead of trying to shove an entire data room into a 200k token window, our engineering focus is on selecting the most relevant information. We spend significant resources narrowing down billions of tokens worth of documents to the specific sentences and paragraphs required for a high-rigor task. We have engineered our systems so that by the time the AI begins its work, Bullpen has already filtered out the noise, ensuring that every token used contributes directly to the final insight.

Rather than relying on brute-force data injection, Bullpen has engineered a sophisticated context-pruning system that addresses these limitations at the architectural level. The approach consists of three core technical components.

First, Context Pruning. Bullpen does not dump entire data rooms into the model. Instead, it prunes and selects only the most relevant sections, paragraphs, sentences, and data points from potentially billions of tokens of source documents.

Second, Targeted Input Selection. By narrowing the input to what is most relevant for each section of a report, Bullpen maximizes the value of the limited context window and ensures that every token contributes directly to the final output.

Finally, Sectionalization. Large documents are broken into discrete sections, with each section's context tightly aligned to the model's token window.
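The three components can be sketched in miniature. This is a hypothetical illustration, not Bullpen's actual implementation: a production system would use embeddings or a trained reranker rather than the keyword-overlap score below, and the names `sectionalize`, `relevance`, and `select_context` are invented for this sketch.

```python
def sectionalize(text: str, max_words: int = 120) -> list[str]:
    """Sectionalization: break a large document into discrete chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def relevance(section: str, query: str) -> float:
    """Stand-in relevance score: fraction of query words present in the section."""
    q = set(query.lower().split())
    return len(q & set(section.lower().split())) / max(len(q), 1)

def select_context(sections: list[str], query: str, token_budget: int) -> list[str]:
    """Context pruning + targeted selection: rank sections by relevance and
    keep the best ones that fit within the model's token budget."""
    ranked = sorted(sections, key=lambda s: relevance(s, query), reverse=True)
    chosen, used = [], 0
    for sec in ranked:
        cost = len(sec.split())  # word count as a rough token proxy
        if used + cost <= token_budget:
            chosen.append(sec)
            used += cost
    return chosen
```

The key design point is that filtering happens before the model is ever invoked: the budget is spent on the highest-relevance material, so every token in the final prompt is there for a reason.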

The Context Engine’s Foundation: Knowledge

While Context Engineering handles pruning, the Bullpen Knowledge framework serves as the bedrock of the context layer. Knowledge is a dynamic web of aggregated domain-specific intelligence and the firm's proprietary institutional know-how. By centralizing these institutional nuances, it ensures the AI is specifically smart for your firm rather than just generally intelligent: Knowledge captures the methodology behind a firm's decision-making process and serves as the foundation for selecting context.