How the GitHub Copilot agentic harness achieves task completion on par with model-vendor harnesses while using fewer tokens, and supports over 20 models.
The GitHub Copilot agentic harness resolves tasks on par with model-vendor harnesses while consuming fewer tokens across benchmarks.
The harness supports over 20 models, including GPT, Claude, and Gemini, allowing flexible model selection.
A senior director at GitHub shares how she uses 40 automations to manage her workflow, freeing up mental space for what really matters. She explains how automations help with meeting prep, follow-ups, and team alignment, and how, as someone with AuDHD, she relies on them as an essential accessibility tool.
Automations are not about replacing human connection but about enabling leaders to show up fully for their teams.
Start with the single biggest friction point (e.g., meeting prep) and build from there.
Qubot, our internal Copilot-powered analytics agent, allows any GitHub employee to ask questions about our data in plain language. Here's what we learned as we built it.
Qubot offers multiple interfaces (Slack, VS Code, Copilot CLI) for low-barrier access to data analytics.
A federated context layer with structured knowledge is key to improving accuracy and speed (3x faster).
GitHub Copilot is improving efficiency by reducing redundant context through prompt caching and deferred tool loading, and by introducing Auto model selection that routes tasks to the best-fit model based on intent and real-time health, saving credits without sacrificing quality.
Prompt caching and tool search reduce repeated context across turns.
Auto model selection uses task intent and model health to choose the right model.
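As a rough illustration of routing on task intent and model health, here is a minimal Python sketch. The model names, intent labels, cost figures, and health flags are all hypothetical, and the selection rule (cheapest healthy model that suits the intent) is an assumption for illustration, not a description of how Copilot's Auto selection is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str            # hypothetical model identifier
    intents: frozenset   # task intents this model is assumed to suit
    cost: float          # hypothetical relative credit cost per request
    healthy: bool        # assumed real-time health signal


def pick_model(intent, models):
    """Return the cheapest healthy model that suits the intent.

    Falls back to the cheapest healthy model overall when none match.
    """
    healthy = [m for m in models if m.healthy]
    if not healthy:
        raise RuntimeError("no healthy model available")
    suited = [m for m in healthy if intent in m.intents]
    return min(suited or healthy, key=lambda m: m.cost)


# Hypothetical catalog used only to exercise the routing rule.
MODELS = [
    Model("small-fast", frozenset({"chat", "rename"}), 1.0, True),
    Model("large-reasoning", frozenset({"refactor", "debug"}), 5.0, True),
    Model("large-backup", frozenset({"refactor", "debug"}), 6.0, True),
]
```

Calling `pick_model("debug", MODELS)` selects the cheaper of the two suited models; marking that model unhealthy would shift the same request to the backup.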
Learn how to use slash commands in GitHub Copilot CLI to switch models, manage context, resume sessions, inspect changes, navigate directories, and reset permissions, giving you efficient control over AI from the terminal.
Slash commands provide control over model selection, context management, and session handling.
Use /model to choose the right model based on capabilities, availability, and cost.
GitHub releases the GitHub Multilingual Repositories Dataset (CC0-1.0), a metadata dataset covering over 80 million classification rows across more than 40 million repositories, helping researchers discover non-English developer content and build more inclusive AI tools.
Dataset provides language classifications for READMEs, issues, and pull requests from three classifiers (fastText, gcld3, lingua-py) with confidence scores.
Covers over 40 million repositories and 80 million classification rows. Korean is the most common non-English language in issues; Portuguese tops READMEs.
GitHub Copilot CLI now uses smarter subagent delegation to reduce unnecessary handoffs and wait times. Production A/B testing shows a 23% reduction in tool failures and a 5% improvement in user wait time. The article details how the team identified delegation bottlenecks, refined the orchestration policy, and validated improvements.
Copilot CLI now delegates more selectively, using subagents only when they create real leverage.
Production A/B test results: tool failures down 23%, P95 wait time reduced by 5%.
Install and configure LSP servers for GitHub Copilot CLI, replacing brute-force grep/decompile with real code intelligence. The LSP Setup skill automates the process, supporting 14 languages. This post explains how it works and how to get started.
GitHub Copilot CLI previously relied on text search and binary extraction to understand code, which was inefficient and inaccurate.
The LSP Setup skill automates installation and configuration of LSP servers for 14 languages.
Custom agents let GitHub Copilot CLI understand your stack and team workflows, turning one-off terminal prompts into repeatable, reviewable processes. This article covers the concept, creation, and usage of custom agents with three practical workflow examples: security audit, IaC compliance, and release documentation.
Custom agents are defined using Markdown files with YAML frontmatter, specifying role, tools, guardrails, and output format.
Agent profiles are stored in the .github/agents directory of a repository, enabling version control and team review.
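To make the "Markdown file with YAML frontmatter" layout concrete, the sketch below splits such a file into its metadata and its instruction body. The field names (`name`, `description`) and the sample profile text are hypothetical, and the parser only handles flat `key: value` pairs; a real profile would call for a proper YAML parser.

```python
def split_frontmatter(text):
    """Split a Markdown document into (frontmatter dict, body).

    Only flat `key: value` pairs are handled; nested YAML is out of scope.
    """
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    if not stripped or stripped[0] != "---":
        return {}, text
    try:
        end = stripped.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


# Hypothetical profile; field names are illustrative assumptions.
PROFILE = """---
name: security-audit
description: Reviews changes for common security issues
---
You are a security reviewer. Report findings as a checklist.
"""

meta, body = split_frontmatter(PROFILE)
```

Because the profile is an ordinary text file in the repository, changes to either the metadata or the instructions show up in a normal diff during review.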
GitHub Copilot now serves 140,000 organizations, with over 100% year-over-year growth. Gartner positions GitHub as a Leader for the third consecutive year, highest in ability to execute.
GitHub is recognized as a Leader in the Gartner Magic Quadrant for the third consecutive year.
GitHub Copilot serves 140,000 organizations, more than double last year's count.
Remote control for GitHub Copilot CLI sessions is now generally available on github.com and GitHub Mobile. Developers can start a session in VS Code or the CLI, then monitor and adjust it from another device. Features include real-time monitoring, mid-flight instruction changes, permission approvals, and a seamless cross-device workflow, with privacy by default.
Remote control for GitHub Copilot CLI sessions is now GA on github.com and GitHub Mobile.
Support for remote control in VS Code and JetBrains IDEs enables multi-surface workflows.
GitHub is piloting an experimental general-purpose accessibility agent to provide engineers with just-in-time accessibility answers and automatically catch and remediate simple issues before production. The agent has reviewed 3,535 pull requests with a 68% resolution rate, focusing on structure, control naming, announcements, text alternatives, and keyboard focus order. The article shares lessons on mindset, leveraging past issues, sub-agent architecture, linear instruction execution, templated content passing, and handling complexity and risk patterns.
GitHub pilots a general-purpose accessibility agent to assist engineers and auto-fix common accessibility issues.
Agent reviewed 3,535 PRs with 68% resolution; top issues include structure, naming, announcements, text alternatives, and keyboard navigation.
GitHub systematically optimized token usage in its agentic workflows by logging via API proxy, identifying inefficiencies like unused MCP tools, replacing MCP calls with CLI commands, and building automated auditor/optimizer workflows, achieving up to 62% cost savings.
GitHub used an API proxy to normalize token logging across agent frameworks and built daily auditor and optimizer workflows to detect inefficiencies.
Removing unused MCP tools reduced per-call context by 8–12 KB, saving thousands of tokens per run.
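The step from kilobytes to tokens can be checked with quick arithmetic. The sketch below assumes roughly 4 bytes per token and 1 KB = 1024 bytes; both are rough rules of thumb, not figures from the article, and the calls-per-run value is hypothetical.

```python
BYTES_PER_TOKEN = 4  # assumption: rough average for English-like text


def tokens_saved(kb_removed, calls_per_run=1):
    """Estimate tokens saved when kb_removed KB of context is dropped per call."""
    per_call = int(kb_removed * 1024 / BYTES_PER_TOKEN)
    return per_call * calls_per_run


low = tokens_saved(8)    # lower bound of the 8-12 KB range, single call
high = tokens_saved(12)  # upper bound of the 8-12 KB range, single call
```

Under these assumptions a single call saves on the order of two to three thousand tokens, and the total scales linearly with the number of model calls in a run.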
Learn how to validate autonomous AI agents using dominator analysis to focus on essential outcomes instead of rigid scripts, reducing false negatives in CI pipelines.
Current testing tools assume deterministic behavior, causing false negatives in agent-driven workflows.
The Trust Layer framework uses Prefix Tree Acceptors and dominator analysis to extract essential states.
Learn the difference between the interactive and non-interactive modes of GitHub Copilot CLI. Interactive mode offers a chat-like experience for deep collaboration, while non-interactive mode provides quick one-off answers.
Interactive mode is the default and allows back-and-forth conversation and iteration.
Non-interactive mode uses the -p flag for quick, single prompts without entering a session.
GitHub engineer Brittany Ellich built a personal organization command center in one day with the support of AI, unifying digital fragmentation into a single central space. She shares her experience using Copilot for planning and implementation, along with her tool stack and advice for builders.
Brittany Ellich built a personal organization command center to solve digital fragmentation.
Using AI for planning and GitHub Copilot for implementation, she completed v1 in a single day.
Learn to find and exploit real-world agentic AI vulnerabilities through five progressive challenges in this free, open source game that over 10,000 developers have already used to sharpen their security skills.
Season 4 focuses on agentic AI security vulnerabilities.
Players use natural language to exploit a deliberately vulnerable AI assistant.
This tutorial covers the basics of GitHub Copilot CLI, including installation, authentication, folder permissions, and common use cases, enabling developers to leverage AI coding assistance directly from the terminal.
GitHub Copilot CLI brings agentic AI to the terminal, supporting autonomous code building and testing.
Install via npm, WinGet, or Homebrew; first-time use requires GitHub login.
GitHub Copilot CLI introduces Rubber Duck, an experimental feature that uses a second model from a different AI family to act as an independent reviewer. Evaluations show Claude Sonnet + Rubber Duck closes 74.7% of the performance gap to Opus alone, improving results on complex multi-file tasks. Rubber Duck activates at critical checkpoints or on demand.
Rubber Duck provides a second opinion from a different model family to catch errors single models might miss.
Sonnet + Rubber Duck closes 74.7% of the performance gap between Sonnet alone and Opus alone, with larger gains on the hardest tasks.
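The 74.7% figure is a share of the gap between the two models, not a share of Opus's absolute score. The sketch below shows that calculation; the three benchmark scores are hypothetical values chosen only so the arithmetic lands on the reported share, and are not results from the evaluation.

```python
def gap_closed(base, combined, target):
    """Fraction of the base-to-target gap recovered by the combined setup."""
    gap = target - base
    if gap <= 0:
        raise ValueError("target must score higher than base")
    return (combined - base) / gap


# Hypothetical benchmark scores, chosen only to illustrate the arithmetic.
sonnet, sonnet_plus_duck, opus = 60.0, 67.47, 70.0
share = gap_closed(sonnet, sonnet_plus_duck, opus)
```

Read this way, a share of 0.747 means the paired setup recovers about three quarters of the difference between the smaller model and the larger one.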
GitHub Copilot CLI introduces /fleet, a slash command that orchestrates multiple AI sub-agents to work in parallel on different files. Learn how to write effective prompts, declare dependencies, and avoid common pitfalls.
/fleet decomposes tasks into parallel work items and dispatches sub-agents.
Write specific prompts with explicit file boundaries and dependencies.
An AI researcher used GitHub Copilot to build coding agents that automated repetitive analysis tasks, and shares key strategies: prompting conversationally and verbosely, refactoring and documenting frequently, and blaming the process, not the agents. The team created 11 agents, 4 skills, and new workflows in under 3 days.
Built coding agents using Copilot CLI and Claude Opus 4.6 to automate trajectory analysis.
Key strategies: conversational prompts, regular refactoring and doc updates, and a blameless process philosophy.
Learn how to integrate the Copilot SDK into a React Native app to generate AI-powered issue summaries, with production patterns for graceful degradation and caching.
The Copilot SDK requires server-side execution because it depends on the Copilot CLI binary and Node.js runtime.
Prompt structure with metadata (labels, author, etc.) significantly improves summary quality over raw text.
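The three patterns named here (metadata-rich prompts, caching, and graceful degradation) can be sketched together. The example is in Python rather than the article's Node.js setting, and it does not call the Copilot SDK: `generate` is a hypothetical stand-in for whatever server-side function produces a summary, and the issue fields and the 140-character excerpt length are assumptions.

```python
def build_prompt(issue):
    """Put structured metadata ahead of the raw issue body."""
    labels = ", ".join(issue.get("labels", [])) or "none"
    return (
        f"Title: {issue['title']}\n"
        f"Author: {issue.get('author', 'unknown')}\n"
        f"Labels: {labels}\n\n"
        f"Body:\n{issue['body']}\n\n"
        "Summarize this issue in two sentences."
    )


def summarize(issue, generate, cache):
    """Return a cached or generated summary, degrading to a body excerpt."""
    key = issue["id"]
    if key in cache:
        return cache[key]
    try:
        summary = generate(build_prompt(issue))
    except Exception:
        # Graceful degradation: show an excerpt and do not cache it,
        # so a later call can retry the generator.
        return issue["body"][:140]
    cache[key] = summary
    return summary
```

Leaving failed attempts out of the cache is a deliberate choice in this sketch: the app keeps showing something useful while the backend is down, and picks up real summaries once it recovers.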