This page contains press release content distributed by XPR Media. Members of the editorial and news staff of the USA TODAY Network were not involved in the creation of this content.

Quesma Releases OTelBench: Independent Benchmark Reveals Frontier LLMs Struggle with Real-World SRE Tasks

New benchmark shows top LLMs achieve only 29% pass rate on OpenTelemetry instrumentation, exposing the gap between coding ability and real-world SRE work.

“OTelBench shows that while LLMs are impressive at generating code snippets, they’re not yet capable of the cross-cutting reasoning required for production engineering.”

— Jacek Migdał, founder of Quesma

WARSAW, POLAND, January 20, 2026 /EINPresswire.com/ — Quesma, Inc. announced the release of OTelBench, the first comprehensive benchmark for evaluating LLMs on OpenTelemetry instrumentation tasks. The open-source dataset tests 14 state-of-the-art models across 23 real-world tasks in 11 programming languages, revealing significant gaps in AI’s ability to handle production-grade Site Reliability Engineering (SRE) work.

While frontier LLMs have demonstrated impressive coding capabilities, the benchmark reveals a stark reality: the best-performing model, Claude Opus 4.5, achieved only a 29% pass rate on OpenTelemetry instrumentation tasks, compared with an 80.9% pass rate on SWE-Bench. This gap highlights a critical distinction between writing code and performing the complex, cross-cutting engineering work required for production systems.

The $1.4 Million Per Hour Problem
Enterprise outages cost an average of $1.4 million per hour, making production visibility mission-critical. Distributed tracing, the gold standard for debugging complex microservices, allows teams to link user actions to every underlying service call. However, implementing this visibility remains difficult, with 39% of organizations citing complexity as their top observability obstacle. OpenTelemetry has emerged as the industry standard with backing from 1,100+ organizations, yet configuring it correctly remains a major source of toil for SRE teams.

Fundamental Limitations Exposed
The benchmark tested models on agentic coding tasks where they were given source code from realistic applications, an interactive Linux terminal, and clear instrumentation objectives. The results revealed several critical failure modes:

Context propagation, passing trace context between services to maintain parent-child span relationships, proved to be an insurmountable barrier for most models. This is particularly concerning because context propagation is fundamental to distributed tracing.
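To make the idea concrete, here is a minimal sketch of context propagation, assuming the W3C Trace Context `traceparent` header format (version, trace ID, parent span ID, flags). It is an illustration only: it is not taken from OTelBench and does not use the OpenTelemetry SDK, and the helper names `new_span` and `inject` are hypothetical.

```python
import re
import secrets

# W3C Trace Context header: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_span(parent_header=None):
    """Start a span, continuing the caller's trace if a valid header is given.

    Hypothetical helper for illustration; real SDKs do considerably more.
    """
    match = TRACEPARENT_RE.match(parent_header or "")
    if match:
        trace_id, parent_span_id = match.group(2), match.group(3)
    else:
        # No (or malformed) incoming context: this span starts a new trace.
        trace_id, parent_span_id = secrets.token_hex(16), None
    return {
        "trace_id": trace_id,
        "span_id": secrets.token_hex(8),
        "parent_span_id": parent_span_id,
    }


def inject(span, headers):
    """Write the span's context into outgoing request headers (hypothetical helper)."""
    headers["traceparent"] = f"00-{span['trace_id']}-{span['span_id']}-01"
    return headers


# Service A handles a user request and calls service B.
span_a = new_span()
outgoing = inject(span_a, {})

# Service B extracts the header, so its span becomes a child of span_a.
span_b = new_span(outgoing.get("traceparent"))

# If the header is dropped along the way, service B starts an unrelated trace.
span_orphan = new_span(None)
```

In this sketch, `span_b` shares `span_a`'s trace ID and records `span_a` as its parent, while `span_orphan` has no parent and a fresh trace ID. That second case is the kind of broken parent-child relationship described above: each service still emits spans, but they can no longer be linked into a single trace.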

“The backbone of the software industry consists of complex, high-scale production systems with mission-critical reliability, and seasoned engineers are architecting, evolving, and troubleshooting them,” said Jacek Migdał, founder of Quesma. “OTelBench shows that while LLMs are impressive at generating code snippets, they’re not yet capable of the cross-cutting reasoning and sustained problem-solving required for production engineering. This gap matters because many vendors are marketing AI SRE solutions with bold claims but no independent verification. We need benchmarks like this to separate reality from hype.”

Language Ecosystems Matter
Success rates varied dramatically across programming languages, revealing that AI generalization is far weaker than that of human engineers. Models had moderate success with Go and, quite surprisingly, C++. A few tasks were completed for JavaScript, PHP, .NET, and Python. Just a single model solved a single task in Rust. None of the models solved a single task in Swift, Ruby, or (to our biggest surprise, due to a build issue) Java.

Why This Matters for AI Development
OTelBench reveals several reasons why OpenTelemetry instrumentation challenges current LLMs:
– Reliability-critical applications reside in private repositories at companies like Apple, Airbnb, and Netflix, limiting training data.
– Instrumentation requires cross-cutting changes across codebases, rather than sequential additions.
– Some tasks required 50+ commands over 10+ minutes. Models consistently performed worse as tasks lengthened.

Migdał added, “AI SRE in 2026 is what DevOps Anomaly Detection was in 2016—lots of marketing, huge budgets, but lacking independent benchmarks. Just as SWE-Bench became the standard for coding evaluation, we need SRE-style benchmarks to determine what actually works. That’s why we’re releasing OTelBench as open-source: to create a North Star for navigating the AI hype and to enable the community to track real progress.”

A Path Forward
Despite the challenges, the benchmark reveals promising signals. Claude Opus 4.5, GPT-5.2, and Gemini 3 models show capability on specific tasks, with the go-otel-microservices-traces task reaching a 52% pass rate. With more environments for Reinforcement Learning with Verifiable Rewards, OpenTelemetry instrumentation appears to be a solvable problem for future AI systems.

Until then, organizations requiring distributed tracing across services should expect to write that code themselves—or work alongside AI assistants that understand their limitations.

OTelBench is available today as an open-source project at https://quesma.com/benchmarks/otel/, enabling researchers and practitioners to reproduce results and contribute additional test cases.

Lucie Šimečková
Quesma
press@quesma.com

Legal Disclaimer:

EIN Presswire provides this news content “as is” without warranty of any kind. We do not accept any responsibility or liability for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this article. If you have any complaints or copyright issues related to this article, kindly contact the author above.

Information contained on this page is provided by an independent third-party content provider. XPRMedia and this Site make no warranties or representations in connection therewith. If you are affiliated with this page and would like it removed please contact pressreleases@xpr.media
