October 2025 | Joshua New, Marina Meyjes, and Austin Carson
I. Overview
AI has the potential to radically transform the U.S. economy, and indeed may already be doing so.¹ But information on industry adoption of generative AI, the type of AI behind this rapid wave of change, remains frustratingly shallow. Detailed adoption data is essential for understanding the potential impacts of generative AI on the workforce, shifts in productivity, and the pace and direction of sectoral transformation. Yet existing measurement tools do not provide a sufficiently clear picture of generative AI adoption to understand the potential scale of economic transformation underway. Failure to adequately understand and respond to economic transformation could have significant consequences for the U.S. economy and the livelihoods of people across industries and regions.
The U.S. needs a Generative AI Intensity Index to help overcome this challenge. This Index would measure the digital processes created during real-world use of generative AI systems, offering a direct proxy for their adoption across the economy. By tracking the volume of model outputs generated during use, the Index would convert these raw processing indicators into a standardized metric of generative AI activity, called Normalized Token Equivalents (NTEs), across firms and sectors. This approach would enable consistent comparisons across providers and model types. When paired with NAICS (North American Industry Classification System) codes, which identify and classify business establishments by their primary business activity, the Index would give policymakers a granular and timely view of where and to what extent generative AI is being adopted across the economy.
Generative AI has attracted extraordinary hype and billions in spending, yet its integration throughout the economy remains poorly understood.² Without better data, policymakers risk misreading both the scale of opportunity and the nature of the transformation underway. A Generative AI Intensity Index would help close this critical gap.
II. Defining Generative AI
Generative AI – a subset of AI that uses generative architectures and can create new content based on patterns learned from data – is a distinct class of technology with unique economic significance.³ Generative architectures include transformers, generative adversarial networks (GANs), variational autoencoders (VAEs), diffusion models, flow-based models, and autoregressive approaches, and enable large language models (LLMs); image, audio, and video generation models; and a wide range of other applications.
Generative AI models can generate novel content on demand, across many domains, and can often be used by non-specialists through simple, intuitive prompts. These general-purpose technologies have garnered remarkable public attention and a wave of early experimentation, from record consumer sign-ups to swift bundling into widely used software.⁴ Enthusiasts and critics alike predict transformative economic effects, with forecasts ranging from utopian promises of the end of poverty to warnings of mass job loss and impoverishment.⁵ Yet we lack the tools to understand fundamental aspects of the technology's integration into the economy.
This stems from a twofold, interconnected problem. First, existing measures of AI adoption often conflate generative AI with other technologies under the broad “AI” umbrella, muddying the picture of uptake. AI is an extensive category of technologies, legally defined as any “machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations or decisions influencing real or virtual environments.”⁶ This definition spans everything from recent generative AI systems like ChatGPT, to classic AI applications such as credit card fraud detection tools that banks have used for automated decision-making since the 1980s. For those seeking to understand the economic effects of generative AI, this is not especially useful.
Second, existing data on generative AI adoption fails to meaningfully address the intensity of use. Generative AI usage can range from ad hoc, lightweight uses, such as drafting short communications, to transformative, agentic AI workflows in which systems plan and execute multi-step tasks with limited human oversight.⁷ This lack of precision generates false signals: overstating adoption in some areas, understating it in others, and ultimately distorting our understanding of how, or indeed whether, generative AI is reshaping the economy.
III. The Policy Imperative
Policymakers cannot manage what they cannot measure. Without sharper visibility into how actors across the economy are actually adopting generative AI, strategies to support the AI economy and respond to potential disruptions rest on shaky ground.
This distinction is critical as generative AI could reshape the economy along several different trajectories. Recent debate reflects this uncertainty, with forecasts varying from systemic worker displacement to the creation of entirely new industries.⁸ Early trends even suggest that generative AI adoption is reducing demand for early-career workers while increasing demand for more senior ones.⁹ Each of these trajectories may warrant a very different policy response.
Understanding if and how these trajectories play out requires data that fully captures the depth of generative AI adoption. Policymakers need metrics that reflect not just whether adoption is happening, but how intensely U.S. firms are implementing this technology.
Past economic transitions serve as cautionary examples, and trade liberalization is perhaps the starkest. Leading economists have since acknowledged that the scale of disruption “came as something of a surprise” – a blindness they attribute in part to inadequate data for monitoring free trade's impacts as they unfolded.¹⁰ Free trade spurred significant macroeconomic growth, but its benefits were unevenly distributed, with devastating consequences. It shattered manufacturing hubs and caused mass job loss from which parts of the American economy have never recovered.¹¹
The effects of this myopia extended far beyond factory closures. Trade disruption devastated social cohesion in the hardest-hit regions, driving down marriage and fertility rates while fueling deteriorating health and higher overdose mortality.¹² By the mid-2000s, these social and economic strains had reshaped the political landscape in communities most affected by import competition, eroding the political will for major alliances and free trade frameworks.¹³
As technological change now comes into focus as the source of potential disruption, the lesson remains. Timely, detailed data is essential for ensuring that the integration of generative AI supports workers across all industries and regions, rather than leaving them behind.
IV. Current Approaches to Measuring AI Adoption
Improving measurement must begin with a clear understanding of the current landscape. Across public and private initiatives, three main approaches stand out:
Direct Firm Survey Approaches
The most common method for gathering data about AI adoption involves directly surveying businesses about their AI use. The U.S. Census Bureau's Business Trends and Outlook Survey (BTOS) is the most extensive effort of this kind.¹⁴ The survey collects biweekly AI adoption data from a representative sample of 1.2 million businesses by asking two AI-specific questions introduced in September 2023: “Between MMM DD – MMM DD, did this business use Artificial Intelligence in producing goods or services? (e.g., machine learning, natural language processing, virtual agents, voice recognition, etc.)”¹⁵ and “During the next six months, do you think this business will be using Artificial Intelligence in producing goods or services? (e.g., machine learning, natural language processing, virtual agents, voice recognition, etc.).” A 13-question AI-specific supplement was also added to the questionnaire in late 2023 to provide more detailed information about AI usage, asking, for example, “In the last six months, what types of applications of artificial intelligence (AI) did this business use in producing goods or services?”¹⁶
Although it provides valuable baseline data, the BTOS faces several critical limitations. First, while the survey supplement does provide more detailed information about the types of AI used by businesses, this was a one-time integration (although a request was made for the supplement to be repeated in March 2025).¹⁷ The typical biweekly survey lumps all types of AI, ranging from generative AI to tools such as spam filters, into a single category. Though the supplement includes generative AI-specific options such as “large language models,” it lists them alongside broad and overlapping categories including “machine learning,” “text analytics,” and “data analytics,” without offering definitions or clarifying distinctions. In the absence of clearer guidance or examples, such as ChatGPT or other common generative AI use cases, respondents using generative AI tools may misidentify them or fail to recognize them at all.
Further, many firms are unsure what qualifies as AI, or how to distinguish generative AI from other types of automation.¹⁸ This ambiguity limits the utility of the data, blurring the picture of where generative AI is actually being adopted and how it might be reshaping economic activity.
BTOS also provides limited visibility into the intensity of adoption. Its core questions capture whether firms report using AI, but not whether the technology is central to production or simply being tested on a small scale. While the AI-specific supplement adds some granularity by asking about the types of AI used and whether they replaced tasks previously done by employees, it stops short of measuring frequency of use, impact on a firm’s output, or how central these tools are to day-to-day operations. This means light uses, such as drafting short emails, are counted the same as complex or sustained applications like code generation or entire workflow automation.
The survey’s structure is another limitation. BTOS reports data at the firm level and categorizes businesses by a limited number of size groupings: 1 to 4, 5 to 9, 10 to 19, 20 to 49, 50 to 99, 100 to 249, and 250+. A multinational corporation with thousands of employees actively using AI registers the same as a moderately sized business with 250 staff.¹⁹ This approach risks obscuring the extent to which AI tools are already integrated across the American business landscape.
At the same time, reputational pressures may lead firms to overstate generative AI adoption. In an environment where AI is a buzzword linked to innovation, competitiveness, and investment potential, some businesses may exaggerate reported use through “AI washing,” labeling pilot projects or limited automation tools as enterprise-level deployments to appeal to investors.²⁰ This introduces a different form of distortion, inflating adoption statistics among firms seeking to position themselves as technologically advanced.
In addition to government-led efforts like BTOS, private-sector surveys offer another source of insight. A high-profile example is McKinsey’s Global Survey on AI.²¹ While this survey can surface useful trends, its sampling bias toward large enterprises may misrepresent the reality of AI usage across the economy. McKinsey draws 42% of respondents from organizations with over $500 million in annual revenue, despite small and medium enterprises comprising 99.9% of U.S. businesses.²² Compounding this bias, annual survey cycles lag behind the pace of AI development, creating measurement gaps during critical adoption periods.²³
To complement firm-level reporting, some surveys instead track adoption at the individual level. The Gallup Practice Panel is one such effort: a probability-based, longitudinal survey of roughly 10,000 U.S. workers conducted annually.²⁴ As of 2023, it asks respondents about AI use in their jobs, with response options ranging from “never” to “daily.”²⁵ The Gallup Panel thus provides data on how many workers across the U.S. are likely using generative AI and how frequently they do so, offering useful insights for researchers.²⁶
By focusing on individual workers rather than firms, the survey provides a different lens on adoption: useful for understanding uptake of AI among employees, but less suited to assessing how widely or deeply generative AI is being integrated at the firm level.
Model Usage Data from AI Platforms
The newest and most promising approach leverages internal data from AI providers to observe how users rely on generative AI systems. Anthropic's Economic Index exemplifies this method, analyzing anonymized interactions from both Claude.ai consumer conversations and enterprise API usage, mapping them onto the Department of Labor's O*NET occupational framework, which breaks down occupations into the tasks and skills that workers perform.²⁷ The Index has evolved to include geographic analysis across 150+ countries and all U.S. states, as well as how businesses are using Claude through its API. This enterprise data comes from companies that connect Claude directly into their own tools and systems, offering a detailed look at how AI is being adopted inside firms.
OpenAI has developed a similar effort, described in a recent NBER paper, using anonymized, internal data from ChatGPT to analyze real-world usage patterns at scale.²⁸ Its economic research team examined more than 1.5 million ChatGPT conversations from its launch in November 2022 through July 2025, classifying messages by task type and mapping them to detailed work activities. Like Anthropic’s Economic Index, this method leverages first-party platform data to reveal how generative AI is being incorporated into users’ workflows and to develop a more nuanced picture of adoption across the economy.
These new approaches provide the first large-scale, empirical accounts of generative AI integration across the economy. Rather than relying on self-reported use or predictive modeling, they capture what users do with the technology based on recent usage patterns, offering more immediate and behaviorally grounded insights into adoption landscapes. This method also provides insight into which occupations are using AI to augment tasks, and which occupations are using the technology to automate tasks, adding granularity to the nature of AI’s adoption across the economy.
These are very welcome, first-of-their-kind efforts to measure generative AI adoption specifically, though limitations remain. First, usage data can only show interactions, not how outputs are ultimately incorporated into workflows, making it difficult to draw firm conclusions about how deeply AI is being adopted in practice (a challenge noted by the researchers behind Anthropic’s Index).²⁹ Second, these are provider-specific datasets. Because each provider may define and categorize usage differently, the figures are not directly comparable, making it difficult to build a comprehensive view of economy-wide adoption. Finally, even where geographic analysis has been added, as in the case of Anthropic’s Index, it applies solely to consumer usage on Claude.ai, offering limited insight into how enterprise API adoption varies across regions.
Occupation-Based Exposure Metrics
An indirect method for measuring AI adoption relies on estimating which jobs are most exposed to the technology, highlighting where generative AI could, in principle, be applied. To make these estimates, economists typically draw on the U.S. Department of Labor’s O*NET database, which breaks down each occupation into the tasks and skills that workers perform.³⁰ This information is linked to the Bureau of Labor Statistics’ Standard Occupational Classification system – the framework federal agencies use to categorize jobs – so exposure can be compared consistently across hundreds of occupational categories.³¹ For example, economists have used this approach to predict how different occupations are exposed to machine learning, generative AI, and LLMs specifically.³² This approach has been used for years to extrapolate the potential economic disruption of AI, with one 2013 study audaciously claiming that AI puts 47 percent of U.S. employment “at risk.”³³
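To illustrate the mechanics of this approach, the sketch below shows how occupation-level exposure scores are typically derived: task-level ratings are aggregated across each occupation’s task profile. The SOC codes are real, but the task lists and 0-to-1 scores are hypothetical placeholders, not figures from any published study.

```python
# A minimal sketch of task-based exposure scoring. The SOC codes are real;
# the task lists and 0-1 exposure scores are hypothetical illustrations.
from statistics import mean

occupation_tasks = {
    "23-2011": [  # Paralegals and Legal Assistants
        ("Prepare legal documents", 0.9),
        ("File pleadings with the court clerk", 0.3),
        ("Summarize depositions and testimony", 0.8),
    ],
    "49-9071": [  # Maintenance and Repair Workers, General
        ("Diagnose mechanical problems", 0.2),
        ("Order parts and supplies", 0.4),
        ("Repair plumbing or electrical systems", 0.05),
    ],
}

# Occupation-level exposure here is simply the mean of task-level scores;
# published studies use richer rubrics, but the aggregation logic is similar.
for soc_code, tasks in occupation_tasks.items():
    exposure = mean(score for _, score in tasks)
    print(f"{soc_code}: estimated exposure = {exposure:.2f}")
```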
While this approach can be a useful exercise in anticipating how generative AI may diffuse throughout the economy, particularly when combined with other economic indicators, it captures only theoretical exposure, not actual deployment. A job may appear highly exposed to generative AI based on its task profile, yet in practice see minimal integration due to barriers like cost, technical readiness, or regulatory uncertainty. As a result, task-based exposure metrics are not a reliable indicator of AI’s actual presence in the economy, and they may miss key details about how uptake is unfolding.
V. Why is Generative AI Adoption So Difficult to Measure?
Each of these three approaches offers pieces of the puzzle, but none gives policymakers a clear picture of exactly how and to what extent generative AI is spreading through the economy. The result is adoption data that is fragmented, inconsistent, and at times misleading.
These measurement gaps arise from both traditional challenges in collecting economic data and the unique nature of generative AI itself. The technology changes rapidly; its boundaries within the broader AI world can be blurry; and measuring how deeply it is actually integrated into work processes is complex. Addressing these factors is crucial for policymakers who want a clear picture of generative AI's economic impact. There are three primary considerations:
1. Definition Ambiguity
The measurement challenge starts with definitions. Ambiguous and overly expansive categories of “AI” produce adoption statistics that blur the line between long-standing automation and emerging generative systems.
The BTOS is a clear example. It broadly defines AI as “computer systems and software that are able to perform tasks normally requiring human intelligence, such as decision-making, visual perception, speech recognition, and language processing” and lists applications including “machine learning, natural language processing, virtual agents, predictive analytics, machine vision, voice recognition, decision making systems, data analytics, text analytics, image processing, etc.”³⁴ By this definition, an email spam filter and an LLM capable of generating software code are both counted as “AI.” Adoption figures drawn from such a sweeping categorization blur the line between embedded automation and generative models, leaving a muddied picture of the AI economic landscape.
Private-sector surveys reflect similar issues. McKinsey’s State of AI reports have begun to track “generative AI” separately from traditional, non-generative forms, distinguishing between “gen AI” and “analytical AI” in their data.³⁵ Yet these distinctions often collapse in the interpretation. For example, their early 2024 State of AI review highlights that overall AI adoption jumped to 72 percent from the previous year, a figure that merges generative and analytical AI, and their latest 2025 report even frames “the use of AI—that is, gen AI as well as analytical AI—[as] continuing to build momentum.”³⁶ These aggregations mix fundamentally different technologies, limiting the reports’ usefulness for policymakers seeking to understand where generative AI is specifically driving change.
Such problems extend across the broader measurement ecosystem. Firms, researchers, and statistical agencies often adopt definitions that diverge in subtle but consequential ways. In surveys, respondents may not share a common understanding of what qualifies as “AI.”³⁷ In task-exposure metrics, results depend heavily on design choices about which capabilities are included and how thresholds are applied.
Taken together, definitional ambiguity clouds the evidence base, makes datasets difficult to compare, and produces adoption figures that are inconsistent or misleading.
2. The Speed of Change
These problems are compounded by the fact that generative AI is evolving at a pace that consistently outstrips the tools meant to track it. New model releases and product integrations roll out every few months, reshaping the landscape of adoption.³⁸ A firm that reported using “AI for customer service” earlier in the year may now be running on entirely different systems with new features, workflows, and economic implications.
Even surveys designed to keep up with frequent updates struggle to capture this churn. The BTOS, for example, collects data every two weeks, yet its static question design cannot keep pace with the rapid turnover of generative AI models and applications. What counts as “AI use” in March may look very different by September.
This speed also complicates other measurement frameworks. Job classifications and task-exposure metrics often rely on fixed assumptions about what AI can or cannot do. But as generative AI expands into new areas such as agentic systems that can autonomously plan and execute complex workflows, tasks once assumed resistant to automation can suddenly become vulnerable.
To stay accurate, current measurement tools would need to be continuously updated, not only in frequency but also in how precisely they distinguish among different models, deployment methods, and use cases. Doing so would be costly and difficult to sustain, and would risk leaving policymakers with data that becomes outdated just as they need it for decision-making.
3. Intensity of Generative AI Use
Finally, and critically, generative AI adoption exists along a spectrum that current measurement tools struggle to capture. The same survey question that asks “Does your firm use AI?” will receive identical “yes” responses from a company whose employees occasionally use ChatGPT and a law firm that has systematically integrated generative AI into legal workflows.
This measurement challenge is not unique to AI: much of the digital economy, from cloud adoption to software-as-a-service deployment, resists traditional quantification methods. Economists have identified this broader difficulty in measuring digital technologies, where identical adoption rates can mask dramatically different levels of organizational integration and economic impact.³⁹
Even the most sophisticated current approaches to measuring generative AI adoption come up against this limitation. While efforts like Anthropic’s Economic Index offer detailed insights into usage patterns, they still offer limited visibility into how AI is integrated into day-to-day operations or embedded within organizational routines. As a result, it remains difficult to assess not just whether firms are using AI, but how deeply it is shaping the way they work.
Lacking the means to comprehensively measure adoption depth, policymakers have a critical blind spot: they cannot distinguish between sectors experiencing genuine AI-driven transformation and those merely experimenting with the technology. This lack of insight hinders their ability to understand and guide AI’s role in reshaping the American economy.
VI. Building a Generative AI Intensity Index
To overcome these fundamental challenges, policymakers need better measurement tools. These tools must be specifically designed to track generative AI adoption across the economy and move beyond simple usage metrics to capture the intensity and depth of its application within firms and industries.
We propose a new framework: a Generative AI Intensity Index, which would continuously measure the digital processes that occur during actual AI system use as a direct proxy for generative AI adoption across the economy. This represents a fundamental shift in measurement: by tracking the metered workloads of generative AI systems, the Index would offer a more immediate and grounded measure of AI adoption than ever before.
This Index is predicated on a clear principle: the “work” done by generative AI systems can be standardized and tracked over time to reveal the intensity of use. For many kinds of generative AI, this work is measured in tokens: units of data that models process and generate, such as when drafting an email, summarizing a report, or writing code.⁴⁰ Tokens represent words or word fragments, snippets of code, small pieces of images, or other discrete base units of information that models functionally treat as building blocks to analyze larger works or to create entirely new content.⁴¹
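As a concrete illustration, the snippet below uses OpenAI’s open-source tiktoken library – one tokenizer among many, offered here only as a sketch – to show how a short sentence decomposes into these building blocks.

```python
# Tokenizing a sentence with tiktoken, OpenAI's open-source tokenizer library.
# Other providers and models split text differently; this is one scheme of many.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("Generative AI adoption is hard to measure.")
print(len(tokens))                        # token count for the sentence
print([enc.decode([t]) for t in tokens])  # the word fragments behind each token
```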
This measurement approach builds on infrastructure that already exists. Major AI providers track computational work and present it to customers through standardized systems. OpenAI includes detailed token counts in every API response and provides customers with dashboards to monitor usage.⁴² Anthropic's Messages API returns precise input and output token counts for each request, while Google provides token counting tools and usage monitoring through its Cloud Billing system.⁴³ Every time a business uses AI to draft an email, analyze data, or generate code, the system automatically captures precise measurements, creating a digital record of AI's contribution to economically relevant activity.
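To make this concrete, the sketch below reads the token counts that two major providers’ Python SDKs return with every response (accurate to the SDKs as of this writing; the model names are placeholders to swap for whatever is current).

```python
# A minimal sketch of existing metering: both OpenAI's and Anthropic's APIs
# attach exact token counts to every response. Model names are placeholders.
from openai import OpenAI
from anthropic import Anthropic

openai_client = OpenAI()        # reads OPENAI_API_KEY from the environment
anthropic_client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# OpenAI: usage is included in every chat completion response.
completion = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Draft a two-sentence status update."}],
)
print(completion.usage.prompt_tokens, completion.usage.completion_tokens)

# Anthropic: the Messages API returns input and output token counts the same way.
message = anthropic_client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=256,
    messages=[{"role": "user", "content": "Draft a two-sentence status update."}],
)
print(message.usage.input_tokens, message.usage.output_tokens)
```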
However, different kinds of generative models use tokens differently. For example, a text model processes and generates tokens that correspond to words or code fragments, while a video model operates over frames and audio segments that act as token-like building blocks.⁴⁴ Some models may not even use “tokens” in a traditional sense.⁴⁵ To enable comprehensive economic measurement across these varied systems, a Generative AI Intensity Index should convert these disparate measurement approaches into a single standardized measure called Normalized Token Equivalents (NTEs). This provides a common yardstick for comparing AI use across different providers and content types, converting each into token-equivalents. By summing NTEs across all AI workloads, the Index produces a snapshot of an organization's total generative AI intensity over a given period. This figure can then be normalized by workforce size or user base, enabling meaningful comparisons across sectors.
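A minimal sketch of how this aggregation might work appears below. The conversion factors are placeholder assumptions – calibrating them rigorously is precisely the standardization work the proposed pilot would need to do.

```python
# A sketch of rolling disparate usage units up into Normalized Token
# Equivalents (NTEs). All conversion factors below are illustrative placeholders.

NTE_PER_UNIT = {
    "text_tokens": 1.0,     # text tokens are the base unit by definition
    "images": 1_000.0,      # hypothetical NTEs per generated image
    "audio_seconds": 50.0,  # hypothetical NTEs per second of generated audio
    "video_frames": 200.0,  # hypothetical NTEs per generated video frame
}

def total_ntes(workloads: dict[str, float]) -> float:
    """Sum a firm's metered usage across modalities into a single NTE figure."""
    return sum(NTE_PER_UNIT[unit] * amount for unit, amount in workloads.items())

# Example: one firm's metered usage over a reporting period.
firm_usage = {"text_tokens": 42_000_000, "images": 1_200, "audio_seconds": 3_600}
ntes = total_ntes(firm_usage)

# Normalizing by workforce size enables intensity comparisons across firms.
employees = 850
print(f"Total NTEs: {ntes:,.0f}; NTEs per employee: {ntes / employees:,.0f}")
```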
VI.A. Connecting AI Usage Data to the Structure of the Economy
A critical component would be linking this data to the North American Industry Classification System (NAICS) codes of enterprise AI customers to enable sectoral analysis.⁴⁶ NAICS is the federal standard for classifying business establishments by their primary economic activity, developed by the U.S. Office of Management and Budget’s Economic Classification Policy Committee (ECPC). Each NAICS code has six digits: the first two identify the broad sector, the third the subsector, the fourth the industry group, the fifth the NAICS industry, and the sixth the U.S. national industry.⁴⁷ Codes are assigned at the establishment level based on whichever activity generates the largest share of that establishment’s economic value, typically using Economic Census and survey responses.⁴⁸
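The sketch below walks through this hierarchy for one real code, 541511 (Custom Computer Programming Services), showing how a six-digit NAICS code nests from sector down to national industry.

```python
# Splitting a six-digit NAICS code into its nested classification levels,
# using 541511 (Custom Computer Programming Services) as a worked example.

def naics_levels(code: str) -> dict[str, str]:
    """Return the classification level implied by each prefix of a NAICS code."""
    assert len(code) == 6 and code.isdigit(), "NAICS codes are six digits"
    return {
        "sector": code[:2],          # 54 = Professional, Scientific, and Technical Services
        "subsector": code[:3],       # 541
        "industry_group": code[:4],  # 5415 = Computer Systems Design and Related Services
        "naics_industry": code[:5],  # 54151 = Computer Systems Design Services
        "national_industry": code,   # 541511 = Custom Computer Programming Services
    }

print(naics_levels("541511"))
```

Aggregating NTE usage at any of these prefixes would let the Index report adoption at whatever level of sectoral detail confidentiality protections allow.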
Crucially, linking NTE usage data to NAICS codes would allow industry and government to track sectoral trends in AI adoption. Since NAICS codes are already embedded throughout U.S. economic statistics, this approach would provide detailed industry analysis without revealing individual company data, maintaining the confidentiality protections essential for provider and business participation.
Given the regional concentration of different sectors, this on its own would provide a rough picture of where in the country generative AI adoption is happening. This data could also be linked with public information about the location of enterprise AI customers, providing a more granular, regional snapshot of generative AI intensity.
Once NTE usage is linked to sectoral and geographic information, economists, policymakers, and others could leverage this tool in a wide variety of ways to uncover valuable insights about the impact of generative AI on the economy. For example, NTE usage could be mapped to changes in firm size over time, total factor productivity, income levels, or other key economic metrics to help inform decisions about how to help American businesses and workers adapt to AI-driven economic transformation.
VI.B. Closing the Gaps in Current Adoption Metrics
By grounding measurement in the metered activity of AI systems, the Generative AI Intensity Index would help to close several of the blind spots that limit existing adoption metrics.
First and foremost (and as its name suggests), it would capture adoption intensity in ways that distinguish between surface-level experimentation and transformative use, an area where existing measurement approaches consistently come up short. A company with a few employees prompting a chatbot for help with simple tasks will generate a relatively small footprint of token usage, while a law firm integrating generative AI across thousands of contracts will generate orders of magnitude more. By capturing the volume of this activity, the Index highlights not just the presence of generative AI but the scale and depth at which it is embedded in workflows.
NTE tracking also addresses problems of definitional ambiguity. Because the framework relies on the metered outputs unique to generative systems – tokens generated, images rendered, audio minutes processed, video frames produced – it inherently distinguishes generative AI from older forms of automation. It also eliminates the respondent confusion and self-reporting errors that plague surveys, since usage is recorded directly rather than inferred from whether firms or individuals classify their activities as “AI.”
The Index also overcomes the challenge of keeping pace with generative AI’s development. When firms shift from one model to another, or start using new features like image or audio generation, those changes show up automatically in the usage data that providers already meter for billing. This means the Index updates with each reporting cycle (say, monthly) without the need to redesign surveys or rewrite task taxonomies every time the technology evolves. Because it tracks actual system outputs rather than broad self-reports, the framework adjusts as models and modalities change, providing a picture of adoption that keeps pace with the technology. It cannot fully eliminate time lags, but it shortens them significantly compared with today’s surveys and static classification schemes, and does so without significant administrative burdens.
It is worth noting that this kind of automatic tracking mostly applies to companies using generative AI through provider-hosted or API-based systems like those from OpenAI, Anthropic, or Google, where usage is measured as part of the service. Organizations that run models “in-house” on their own infrastructure may not have that same built-in metering and may require different methods to generate comparable data. That said, most generative AI adoption across the broader economy occurs through hosted platforms, or hybrid deployments that rely heavily on provider-hosted systems, meaning the proposed framework would still capture a significant majority of economically relevant activity.⁴⁹ And as use of locally-hosted and open-source models expands, future work may explore complementary measurement strategies to ensure broader coverage.
VII. Putting It Into Practice
While implementing NTE tracking as a national economic measurement tool would be a substantial undertaking, the core infrastructure is already in place.
As discussed, major AI providers — both model providers and the cloud service providers (CSPs) that distribute and meter their models — already meter tokens processed by generative AI through standardized systems. And, crucially, a small number of major providers control the vast majority of generative AI model usage across the economy. In enterprise markets, just four providers, Anthropic (32%), OpenAI (25%), Google (20%), and Meta (9%), account for roughly 86% of usage.⁵⁰ A share of activity reaches end users through indirect channels, such as cloud platforms (Azure OpenAI, AWS Bedrock, Vertex AI) or software with built-in AI capabilities (Microsoft 365 Copilot, Salesforce). This market concentration means that securing cooperation from just a critical mass of major providers would provide a statistically robust picture of economic activity. And because these providers already meter usage, their participation would make NTE reporting feasible without requiring new data collection systems.
This market structure provides the foundation for a public-private partnership that integrates privacy-protecting industry information with existing statistical infrastructure.
First, major AI providers should develop a Generative AI Intensity Index pilot to validate it and inform further efforts. Proactive industry participation would advance key public objectives as well as deliver meaningful benefits to the providers themselves. Enhanced visibility into actual enterprise user behavior patterns could inform model development and deployment, creating natural alignment between public policy objectives and private sector interests. This pilot should identify best practices for NTE tracking, reporting intervals, customer privacy preservation, and linkage of NTE usage with geographic data.
After a minimum viable product is developed, policymakers should prioritize integrating the Index into federal statistical efforts around economic measurement. To start, the Bureau of Economic Analysis (BEA) and the Office of Science and Technology Policy (OSTP) should launch a task force to identify opportunities to maximize the utility of the Index. The task force should include federal statistical and standards agencies, including the Census Bureau, which manages the NAICS system, and the National Institute of Standards and Technology (NIST), which could establish best practices and technical standards for NTE measurement and associated metadata, including preserving privacy and sensitive enterprise use. Together, these agencies bring the essential expertise in economic measurement, robust statistical operations, and technical standardization required for this cross-cutting initiative. As a unit, they can create a frequent, scalable, national statistical reporting system that measures token usage across the economy while adhering to the highest privacy and security standards.
This task force should work closely with private sector contributors to identify best practices to incentivize industry participation and coordination, as well as align the Index with broader policy considerations and federal needs. As the Index proves useful for advancing these priorities, the task force should also identify whether additional administrative structures or Congressional support would be required to ensure that Index data can be made maximally available and valuable to all stakeholders.
VIII. Challenges
Maximizing the utility of a Generative AI Intensity Index will require navigating several key technical and operational hurdles. The industry-led pilot should help identify these challenges early and experiment with viable solutions.
First, complex delivery chains complicate mapping the full extent of generative AI usage throughout the economy. Not all AI usage flows through direct customer relationships with major providers.⁵¹ Similarly, new types of AI providers, such as integrated development tools like Cursor, may not be easily captured in the initial grouping of major platforms.⁵² In these cases, usage may pass through multiple providers before reaching the end customer, creating a risk of double-counting or misattributing activity to the wrong sector.
The Index will need a method to consistently track generative AI usage across direct and indirect delivery pathways, recognizing that the same underlying AI service may be consumed and resold through multiple intermediaries before reaching end users. This would require defining attribution standards that can be applied systematically across different types of delivery arrangements.
Second, not all generative AI models process tokens in the same way, particularly in non-text modalities where measurement units vary significantly across model types. Even for text modalities, which dominate enterprise adoption, different providers may rely on architectures that process tokens differently, making it difficult to treat tokens as fungible.⁵³ Even so, the Index should incorporate normalization adjustments to account for differences in how various providers or model types generate tokens, helping prevent inflated token counts from being mistaken for meaningful growth in adoption.
Third, distinguishing between corporate and personal generative AI usage presents a further challenge, as employees may use these tools across both professional and personal contexts. For instance, a marketing professional using ChatGPT on a personal account for work tasks would fall outside corporate usage tracking, potentially understating AI adoption in the marketing sector. These boundary cases highlight the difficulty of cleanly separating business-relevant AI adoption from personal use, though the Index should still capture the majority of enterprise-scale AI deployment.
Finally, the Index would not capture locally run models that operate outside major providers' metered services, such as when firms deploy open-source systems entirely on their own infrastructure. Open source AI usage represents a minority, but still significant, share of enterprise AI workloads.⁵⁴ Future work may need to develop complementary approaches such as lightweight usage telemetry, compute-based estimates, or voluntary reporting to bring this activity into view.
IX. Future-Proofing
As AI technologies rapidly evolve, so too must the metrics used to track them. While token generation may serve as the most immediate and scalable proxy for generative AI adoption, it is essential to recognize that technological and economic developments could skew measurement over time.
On the economic front, declining compute costs are enabling more intensive use of generative AI, inflating token usage even when firm-level adoption remains flat.⁵⁵ As tokens get cheaper, users are less constrained by length or frequency, and the same business processes that previously generated thousands of tokens might now generate millions simply because longer prompts, outputs, or multi-step agentic reasoning have become less expensive. While this may indeed sometimes signal more comprehensive integration of AI into workflows, it may also simply represent a kind of “token inflation” in which firms use “more” AI for the same amount of work. Recent data suggests this dynamic may already be underway: Google DeepMind processed almost a quadrillion tokens in April 2025, more than double the previous month’s total.⁵⁶ This surge likely reflects declining compute costs enabling more expansive use, rather than a proportional increase in new generative AI deployments.
Similarly, increased reliance on different modalities like video generation may result in significant increases in token processing compared to text generation, potentially distorting this metric. The Index should be designed to account for such exponential increases or other structural changes, and to delineate token usage across different modalities, so that it still provides meaningful insight into generative AI intensity.
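One hypothetical adjustment, sketched below, would deflate raw NTE growth by the growth in tokens consumed per comparable task, so that cheaper, longer outputs do not masquerade as broader adoption. Both the method and the figures are illustrative assumptions, not a settled design.

```python
# A toy "token inflation" deflator: scale raw NTEs by how much token use per
# comparable task has grown since a base period. All figures are illustrative.

def deflated_nte(raw_nte: float, tokens_per_task: float,
                 base_tokens_per_task: float) -> float:
    """Discount raw NTEs for growth in tokens consumed per unit of work."""
    return raw_nte * (base_tokens_per_task / tokens_per_task)

# Raw NTEs tripled, but tokens per comparable task doubled (longer prompts,
# multi-step agentic chains), so adjusted intensity grew only 1.5x.
base = deflated_nte(raw_nte=1_000_000, tokens_per_task=500, base_tokens_per_task=500)
now = deflated_nte(raw_nte=3_000_000, tokens_per_task=1_000, base_tokens_per_task=500)
print(now / base)  # 1.5
```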
Technological developments pose additional measurement challenges. The concept of tokens itself may evolve as generative AI architectures advance: emerging research suggests that tokenization schemes could become dynamic rather than static, with models learning to adjust their text-splitting strategies during training.⁵⁷ In practice, this means that instead of using fixed rules to divide text into smaller parts, as current models do, future systems could adapt how they process language based on the content or context they encounter. This shift, known as “dynamic tokenization,” could help models represent information more efficiently and flexibly.⁵⁸ But it would also complicate the task of measurement. Today’s token-based metrics rely on consistent, comparable units across models and providers. If different systems begin using different tokenization strategies, the underlying unit of measurement may vary, making it harder to standardize or compare generative AI activity. Measurement frameworks would need to evolve accordingly to ensure they continue capturing meaningful and comparable data across increasingly diverse architectures.
These obstacles underscore the importance of robust public-private cooperation to help develop consensus best practices to overcome them. The BEA-spearheaded task force should work closely with industry to address how to account for cost declines, technological shifts, and evolving AI architectures to ensure that measurements continue to reflect meaningful economic integration rather than simply computational volume increases.
X. Conclusion
As generative AI reshapes the economy, policymakers face a critical imperative: develop the measurement capabilities needed to guide this transformation, or risk the missteps that have marked previous periods of rapid technological and economic change. Successfully steering the U.S. economy through this period of potential upheaval requires unprecedented visibility into how extensively and deeply generative AI is being embedded across economic activity.
The proposed Generative AI Intensity Index offers a direct and novel solution for comprehensively monitoring generative AI adoption across the economy. While existing measurement efforts provide valuable insights, they struggle to clearly capture the full scope and intensity of generative AI's integration into industries, leaving policymakers to navigate this transformation with incomplete information at precisely the moment when comprehensive data is most critical.
The foundation for such a system already exists. With the necessary infrastructure in place and a clear implementation pathway through coordinated federal agencies, this public-private collaboration could equip businesses and policymakers alike to act with precision as generative AI’s economic footprint expands. A Generative AI Intensity Index would provide the foresight to chart generative AI’s economic path and prepare for its impacts.
###
SeedAI would like to thank the following for reviewing this paper:
Addie Cooke, Andre Barbe, Jeffrey Ding, Harrison Durland, Conor Griffin, Christos Makridis, Aalok Mehta, Trisha Ray, and Kellee Wicker.
XI. Endnotes
https://digitaleconomy.stanford.edu/wp-content/uploads/2025/08/Canaries_BrynjolfssonChandarChen.pdf
https://hai.stanford.edu/ai-index/2025-ai-index-report
https://arxiv.org/abs/2309.07930
https://news.harvard.edu/gazette/story/2024/10/generative-ai-embraced-faster-than-internet-pcs/; https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
https://economictimes.indiatimes.com/magazines/panache/ai-generated-wealth-could-be-the-future-predicts-openais-sam-altman-as-society-gets-richer-/articleshow/123306824.cms; https://research.aimultiple.com/ai-job-loss/
https://uscode.house.gov/view.xhtml?req=granuleid:USC-prelim-title15-section9401&num=0&edition=prelim
https://safe.security/resources/blog/what-is-agentic-ai-in-cybersecurity/
https://www.axios.com/2025/07/14/ai-jobs-nvidia-jensen-huang-dario-amodei; https://www.nber.org/system/files/working_papers/w30172/w30172.pdf; https://sloanreview.mit.edu/article/how-ai-will-define-new-industries/
https://digitaleconomy.stanford.edu/wp-content/uploads/2025/08/Canaries_BrynjolfssonChandarChen.pdf
https://www.npr.org/2025/02/11/g-s1-47352/why-economists-got-free-trade-with-china-so-wrong
https://www.nber.org/digest/dec14/import-competition-and-great-us-employment-sag
https://www.aeaweb.org/articles?id=10.1257%2Faeri.20180010; https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3534472; https://www.sciencedirect.com/science/article/pii/S2352827319300096
https://2017-2021.state.gov/us-TPP-withdrawal/
https://www.census.gov/programs-surveys/btos.html
https://www.census.gov/hfp/btos/downloads/CES-WP-24-16.pdf
Ibid.
https://www.census.gov/newsroom/press-releases/2024/business-trends-outlook-survey-artificial-intelligence-supplement.html; https://www.govinfo.gov/content/pkg/FR-2025-03-31/html/2025-05461.htm
https://www.census.gov/hfp/btos/downloads/CES-WP-24-16.pdf
https://bipartisanpolicy.org/blog/taking-stock-of-ai-adoption-across-the-u-s-economy/
https://www.axios.com/2017/12/15/report-many-firms-are-ai-washing-claims-of-intelligent-products-1513304292
https://www.axios.com/2017/12/15/report-many-firms-are-ai-washing-claims-of-intelligent-products-1513304292
https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai; https://www.uschamber.com/small-business/state-of-small-business-now
https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-2024
https://papers.ssrn.com/abstract=5401053
https://www.gallup.com/workplace/651203/workplace-answering-big-questions.aspx
https://papers.ssrn.com/abstract=5401053
https://www.anthropic.com/research/anthropic-economic-index-september-2025-report
https://www.nber.org/papers/w34255
https://arxiv.org/abs/2503.04761
https://www.aeaweb.org/content/file?id=22228; https://www.dol.gov/agencies/eta/onet
https://www.bls.gov/soc/
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4414065; https://www.aeaweb.org/articles?id=10.1257/pandp.20181019; https://arxiv.org/abs/2303.10130
https://oms-www.files.svdcdn.com/production/downloads/academic/future-of-employment.pdf
https://www2.census.gov/library/working-papers/2024/adrm/ces/CES-WP-24-16R.pdf
https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-2024; https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
https://www.census.gov/hfp/btos/downloads/CES-WP-24-16.pdf
https://lsvp.com/stories/remarkably-rapid-rollout-of-foundational-ai-models-at-the-enterprise-level-a-survey/
https://www.nber.org/papers/w25695
https://arxiv.org/pdf/2407.11606
Ibid.
https://openai.com/api/pricing/
https://docs.anthropic.com/en/docs/about-claude/pricing; https://ai.google.dev/gemini-api/docs/billing
https://arxiv.org/html/2312.14125v2
Ibid.
https://www.census.gov/naics/
https://www.census.gov/programs-surveys/economic-census/year/2022/guidance/understanding-naics.html
https://www.census.gov/programs-surveys/economic-census/year/2022/technical-documentation/methodology.html
https://menlovc.com/perspective/2025-mid-year-llm-market-update/; https://www.bentoml.com/blog/2024-ai-infra-survey-highlights
https://menlovc.com/perspective/2025-mid-year-llm-market-update/
https://assets.publishing.service.gov.uk/media/65081d3aa41cc300145612c0/Full_report_.pdf
https://cursor.com/en
https://www.deloitte.com/hu/en/services/consulting/research/state-of-generative-ai-in-enterprise.html
https://menlovc.com/perspective/2025-mid-year-llm-market-update/#18aaeef7-0c05-404c-b36f-01edbc154d0f
https://hai.stanford.edu/assets/files/hai_ai_index_report_2025.pdf
https://x.com/demishassabis/status/1948579654790774931
https://arxiv.org/abs/2506.14761
https://arxiv.org/abs/2411.18553