Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"
Exploring and improving the quality of ChatGPT-generated code for LeetCode programming tasks.
Repository-level automated code repair agent using SWE-Bench dataset
AI agent-based automated repair system for web services: the agent reads error logs, locates the bug, generates a patch, runs tests, opens a pull request, and notifies developers to review it.
Trusted autonomy T&E runtime that links mission needs, hazards, scenarios, telemetry, evidence, verification reports, and hash-chained ledgers so AI/autonomous decisions can be reviewed instead of merely trusted.
Controlled AI repair loop. Audit → Reproduce → Patch → Test → Report. Safety boundaries most AI agents skip.
AI proposes. Humans decide. Source-available AI engineering control plane with policy gates, model comparison, budget-aware routing, PR/CI evidence binding, chained receipts, and human review.
Autonomous AI code repair agent. Finds crashes, compiles code, and fixes bugs in real time for Python, Rust, Go, and C++.
Broken code indentation repair tool with automatic language detection
Autonomous GitHub issue → validated patch pipeline: LangGraph agents research, plan, and generate search-replace patches, validated in a Docker sandbox (pytest · ruff · mypy), with structured retry feedback and auto PR creation.
Production-grade autonomous AI platform using LangGraph, Docker sandboxing, GitHub APIs, and multi-agent orchestration to generate, validate, and iteratively repair code patches from GitHub issues.
Library-aware TypeScript error recovery for LLM-generated code. Deterministic VS Code Quick Fix + single-file LLM mend. 98.6% on a real-world bench at <$0.005 per fix.
Benchmark pipeline for evaluating file-level localization in repository-level LLM repair on SWE-bench Verified tasks.
Evidence-grounded AI agent for Java code repair with LLM-guided patches and human-in-the-loop review.