May 2026 | Marina Meyjes, Policy Analyst

Exposure Scores Tell Us Less Than the Headlines Suggest
Headlines warning of large-scale job displacement from AI are becoming more frequent and urgent. Much of this coverage centers on a particular method of analysis that seeks to quantify how exposed a given job is to AI by estimating what share of an occupation’s tasks an AI could conceivably perform. The more tasks AI can do, the greater the exposure, and thus the greater the likelihood that the job could be automated.
But while task-based exposure to AI is a useful signal, it is a far more complex and limited one than the headlines suggest. Exposure can indicate that change is likely to occur, but it tells us little about how that change will actually play out.
Much sharper approaches already exist. Researchers have proposed several alternatives that go beyond exposure scores as a single, headline-grabbing metric, taking into account the direction of change within occupations, task interactions, market dynamics, and how AI is evolving and actually being adopted across the economy. Policymakers should support these approaches and build the institutional infrastructure for coordinated, ongoing measurement that captures the full complexity of how AI is reshaping work.
Measuring Exposure to AI
Most AI exposure research starts with the U.S. Department of Labor’s (DOL) Occupational Information Network (O*NET) database. O*NET catalogues the tasks performed in hundreds of occupations across the economy. For example, according to O*NET, some of the tasks performed by “Paralegals and Legal Assistants” include preparing affidavits and other documents, filing pleadings with court clerks, and calling upon witnesses to testify at hearings. Researchers use a variety of techniques to estimate AI exposure from this information, from scoring each of a job’s tasks against a structured rubric of machine learning criteria, to matching AI patent language to occupational task descriptions, to asking large language models to evaluate their own performance on specific tasks.
These methods all share a common logic: they decompose a job into units (tasks or abilities), assess each unit individually, and combine those assessments into an average score. The greater the overlap between a job's components and what AI can do, the higher the exposure score.
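To make that logic concrete, here is a minimal sketch of the averaging approach in Python. The task names and automatability ratings are invented purely for illustration; none of the figures come from O*NET or any published study.

```python
# Minimal sketch of the average-score logic described above.
# Task names and ratings are invented for illustration, not drawn from O*NET or any study.

paralegal_tasks = {
    "prepare affidavits and other legal documents": 0.8,  # hypothetical 0-1 automatability ratings
    "file pleadings with the court clerk": 0.7,
    "call upon witnesses to testify at hearings": 0.2,
}

def exposure_score(task_ratings: dict[str, float]) -> float:
    """Combine per-task ratings into a single occupation-level average."""
    return sum(task_ratings.values()) / len(task_ratings)

print(f"Exposure score: {exposure_score(paralegal_tasks):.2f}")  # -> 0.57
```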
What Exposure Misses
There are several major critiques of using exposure assessments to predict labor market disruption. First, interpretations of exposure scores often conflate the fact of change with the direction of change. As MIT economist David Autor emphasizes, exposure doesn’t predict job loss; it signals that a job will change, but not how.
When AI capabilities overlap with tasks, outcomes can vary widely depending on which tasks are automated. For example, if automation primarily affects the routine components of work, the remaining roles may become more specialized and employment may contract as firms need fewer workers. If, instead, automation affects the more specialized tasks, barriers to entry may fall and employment in the occupation can actually expand, even as wages come under downward pressure because those skills lose their scarcity value. The latter is still a meaningful economic shift, but it reflects a change in the structure of expertise within the occupation rather than straightforward job loss.
Second, exposure scores rest on an inaccurate assumption baked into the methodology: that the tasks that make up a job are independent of each other. But as economists Joshua S. Gans and Avi Goldfarb argue, this isn't the reality in practice. Most exposure scoring methods treat jobs as collections of discrete tasks, scoring each for automatability and then averaging the results. However, tasks within a job aren't independent; they interact, and the effects of automating one ripple through the others. When some tasks within a job get automated, workers tend to spend more time and effort on the ones that remain, often improving their performance on those tasks in turn and raising the bar for further automation. The same logic implies that a job's automatability is defined by its least-automatable task – the bottleneck. As surrounding tasks get automated, that bottleneck becomes more valuable and more difficult to automate, not less.
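A simple illustration of why this matters: the same set of hypothetical task ratings yields very different conclusions depending on whether exposure is computed as an average or anchored to the bottleneck task. The min() aggregation below is only a crude stand-in for the bottleneck intuition, not a formula from Gans and Goldfarb.

```python
# Illustrative contrast: the same hypothetical ratings, aggregated two ways.
# min() is a crude stand-in for the bottleneck intuition, not a published formula.

task_ratings = {"routine drafting": 0.9, "document filing": 0.8, "witness preparation": 0.2}

average_exposure = sum(task_ratings.values()) / len(task_ratings)  # treats tasks as independent
bottleneck_exposure = min(task_ratings.values())                   # exposure capped by the hardest task

print(f"Average-based exposure:    {average_exposure:.2f}")   # -> 0.63
print(f"Bottleneck-based exposure: {bottleneck_exposure:.2f}")  # -> 0.20
```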
Third, measurement frameworks don’t factor in how the market will respond. University of Chicago economist Alex Imas recently argued that even with a clearer picture of how tasks are exposed and how they interact, exposure still can't tell us how the broader economics will play out. Take productivity, for instance. When AI makes workers in an occupation more productive, that gain doesn't translate directly into job losses or job gains. It has to travel through the market first. Higher productivity can lower costs, which can lower prices, which can shift demand. It's that final step – how demand responds – that actually determines employment outcomes. If lower prices drive a significant increase in demand, firms may hire more workers to meet it; if demand barely shifts, fewer workers will be needed to do the same work. Exposure doesn’t speak to this dynamic at all.
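A back-of-the-envelope sketch illustrates the point. Assuming full cost pass-through and a constant-elasticity demand curve (simplifications of my own, not a model from Imas), the same productivity gain can raise or lower employment depending entirely on how demand responds.

```python
# Back-of-the-envelope sketch: the demand response, not the productivity gain
# itself, determines the employment outcome. All numbers are illustrative
# assumptions (full cost pass-through, constant-elasticity demand).

productivity_gain = 1.5              # AI makes each worker 50% more productive
price_ratio = 1 / productivity_gain  # assume lower costs are fully passed through to prices

def new_employment(old_employment: float, demand_elasticity: float) -> float:
    """Employment = quantity demanded / output per worker."""
    quantity_ratio = price_ratio ** (-demand_elasticity)  # demand response to the lower price
    return old_employment * quantity_ratio / productivity_gain

print(round(new_employment(100, demand_elasticity=2.0)))  # elastic demand:   ~150 workers
print(round(new_employment(100, demand_elasticity=0.5)))  # inelastic demand: ~82 workers
```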
Finally, the dominant exposure scoring methods may be too static for the pace at which AI systems are evolving. AI models continue to develop rapidly, with each new model release tending to improve on the last across a widening range of capability benchmarks. Exposure scores, calibrated to the tasks AI can perform today, therefore run the risk of becoming outdated as those capabilities shift, missing categories of work that may face automation pressure as AI develops. Researchers Philip Tomei and Bouke Klein Teeselink illustrate this in a recent paper, arguing that the tasks AI can be trained to do via reinforcement learning aren't always the same as the tasks current models handle well – meaning the boundaries of what counts as “exposed” can shift in unexpected directions.
Building a Better Measurement Toolkit
Exposure scores are a useful starting point, but, as detailed above, far more refined approaches already exist. What's missing is the institutional infrastructure to implement them: expanding what federal agencies already collect, building partnerships with private-sector firms and academic researchers, developing novel approaches to capture AI's real-world adoption, and using legislative frameworks to turn isolated efforts into coordinated measurement.
First, we need better data to enable more effective predictive modeling. O*NET is a useful resource for detailing the component tasks that make up hundreds of occupations across the economy, but we also need improved data on how those tasks interact with one another, which ones bottleneck automation, and what types of tasks AI is being used for. One opportunity is to start with O*NET itself: a major source of its data is worker surveys. Expanding these surveys to include questions about task interactions – such as which tasks depend on which, where workers feel time pressure shifting as automation changes their workload, and which specific tasks AI is being used for – could give researchers a more holistic source of task data than currently exists, without requiring entirely new infrastructure.
We also need data on how AI-driven changes inside occupations interact with the wider economy. Critically, we need much more data on the elasticity of consumer demand across occupations – how much the quantity people buy changes when the price changes – without which we can't predict whether productivity gains in a given occupation will lead firms to hire more or fewer workers.
Useful working models exist. The University of Chicago's Kilts Center provides academic researchers with anonymized consumer purchase data through partnerships with market intelligence firms like NielsenIQ and Numerator, making elasticity research tractable for goods like groceries. The harder challenge, as has been noted, is extending this model across the economy, including to the more amorphous, AI-exposed services, such as tutoring or web development. Doing so would require a massive, coordinated effort to gather fragmented transaction data across freelance platforms, payment processors, and individual firms.
But there is already some legislation in the works that could help to tackle this. The AI Workforce PREPARE Act would convene experts to identify the highest-value datasets, metrics, and analyses for understanding AI's labor market impacts – an avenue through which policymakers could elevate the price elasticity gap as a top priority. It would also establish a framework for voluntary data-sharing partnerships with private firms on AI use and direct federal statistical agencies to assess how researchers could securely access detailed labor market data.
Second, beyond improved predictive modeling, we need evidence on how AI is actually being used in practice. Predictive frameworks can estimate how AI might reshape jobs, but they need to be grounded in real-time data on adoption across the economy to test and validate those forecasts.
Existing efforts like Anthropic's Economic Index and OpenAI's analysis of ChatGPT usage are valuable first steps, capturing what AI is actually being used for day-to-day, rather than what it could do to jobs in theory. But these approaches still rely on the same task taxonomies that underpin exposure scores, offer only provider-specific snapshots, and tell us little about the intensity or depth of adoption – for example, whether a firm occasionally uses AI to draft emails or has integrated the technology deeply into core workflows, scenarios with very different outcomes for workers.
To close that gap, we need new approaches that add another dimension to the data. One example is a Generative AI Intensity Index, which would link the measurable output of generative AI use – tokens – with business and economic data to serve as a near-real-time metric for AI adoption across regions and sectors. This would give policymakers and researchers a sectoral heat map of where generative AI is actually being embedded, and serve as a complementary dataset that grounds the predictive frameworks above in what's happening in practice.
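As a purely hypothetical sketch of how such an index might be computed, one could divide aggregated token consumption by a sector's economic output. The sectors and figures below are invented for illustration, and a real index would depend on which business data could actually be linked.

```python
# Purely hypothetical sketch of an intensity-style metric: tokens of generative
# AI use per dollar of sector output. Sectors and figures are invented.

sector_data = {
    # sector:             (monthly tokens consumed, monthly output in dollars)
    "legal services":     (4.0e11, 2.5e10),
    "software":           (9.0e11, 5.0e10),
    "food manufacturing": (1.0e10, 8.0e10),
}

def intensity(tokens: float, output_dollars: float) -> float:
    """Tokens consumed per dollar of sector output."""
    return tokens / output_dollars

for sector, (tokens, output) in sector_data.items():
    print(f"{sector:>18}: {intensity(tokens, output):.1f} tokens per dollar of output")
```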
Fortunately, potential avenues for this type of novel data collection might already be taking shape. The Department of Labor's forthcoming AI Workforce Hub aims to bring private-sector data on AI adoption into the federal statistical system, surfacing information that companies haven't previously shared with the government. The AI Workforce PREPARE Act would establish a framework for voluntary data-sharing partnerships with AI developers, deployers, and other private entities. Both build the institutional architecture for novel measurement efforts that could ground predictive frameworks in real-world adoption patterns, sector by sector, and in real time.
To ensure truly robust measurement, the toolkit will need to grow further still. Researchers, international organizations, and analytics firms have put forth various approaches, ranging from measures that capture what parts of jobs AI will overlap with as it develops, to comparative analyses of how AI is affecting work across countries, to surveys that ask workers about their own AI use.
AI's labor market impact deserves the public attention it's getting. Turning that attention into rigorous policy is the harder part. Researchers have already developed sharper alternatives to exposure scoring, and lawmakers have begun drafting legislation that could build the institutional infrastructure for more rigorous measurement. The challenge now is for policymakers to coordinate and extend that work – pushing the research forward, deepening the data partnerships already underway, and using legislative momentum to turn isolated efforts into ongoing federal measurement frameworks. Without this, we risk navigating one of the most consequential labor market shifts in decades with blurry vision.
Marina Meyjes is a policy analyst at SeedAI examining how AI is transforming the economy, including labor markets, productivity, and institutions. Her recent co-authored work, "The U.S. Needs a Generative AI Intensity Index," calls for the creation of a new economic index tying generative AI usage – measured in tokens – to economic data to better capture AI adoption intensity at the firm and sector level. Prior to SeedAI, Meyjes worked at the Atlantic Council's GeoTech Center at the intersection of emerging technologies and geopolitics. She holds an MPhil in Politics and International Studies from the University of Cambridge, where she focused on political economy and feminist technoscience, and a BA in History from UCLA.