AI News Hub

Today's must-reads

Agents

AI Agent on Android

RikkaHub Agent is an open-source Android app that turns a local LLM chat client into an on-device AI agent with more than 80 native device tools, workflow automation, a Telegram bot, SSH, voice transcription, and more, all running locally with a privacy-first design.

  • Fork of RikkaHub that turns an Android LLM chat client into a full on-device agent with 80+ tools.
  • Supports automated workflows, scheduled tasks, a Telegram bot, an in-app browser, SSH, and voice transcription.

Try AI Operators on PostgreSQL

samtSQL enables you to run SQL queries enhanced with AI operators on your existing PostgreSQL database, supporting multimodal data including text, images, and audio.

  • Runs SQL with AI operators on an existing PostgreSQL database.
  • Supports multimodal data: text, images, and audio.

AI makes us more of who we are

AI doesn't make bad engineers good; it makes them fast. It amplifies existing traits rather than improving skills. For lazy or sloppy coders, AI accelerates the output of low-quality code, and because AI itself tends to copy existing patterns without questioning them, it perpetuates and scales tech debt.

  • AI amplifies existing traits, not abilities.
  • Bad engineers ship more code with the same judgment and blind spots.

Some Thoughts on AI Safety

A cautious, nuanced case for AI optimism: why safety, interpretability, bias, and alignment matter as much as raw capability.

  • AI risks include misuse, misalignment, and systemic issues; each requires different responses.
  • Interpretability is crucial for understanding models, but current methods lag behind capability.

A better way to model the behavior of metal alloys

MIT researchers have developed a machine-learning approach that improves the accuracy of material simulations by constructing training datasets that capture the diversity of atomic environments in chemically disordered metal alloys, potentially accelerating materials discovery.

  • New method uses information theory to build diverse training datasets for machine-learning models, capturing subtle atomic patterns in disordered alloys.
  • Outperforms brute-force methods and large-scale models from Google and Microsoft in predicting material properties.
Chips

Running local AI on AMD RX 580 (2017 GPU) using Vulkan – no CUDA, no ROCm

This article demonstrates how to run local AI inference on the 2017 AMD RX 580 GPU using the Vulkan backend of llama.cpp and stable-diffusion.cpp, without requiring CUDA or ROCm. It covers hardware setup, compilation steps, and performance results.

  • AMD RX 580 can run local AI via Vulkan, no CUDA or ROCm needed
  • Vulkan backends of llama.cpp and stable-diffusion.cpp enable GPU acceleration
Tools

400B Parameter Model: Consortium "Europa" Wins AI Competition

The EU Commission announced the winner of its “Frontier AI Grand Challenge” on Friday. The consortium “Europa”, led by the Italian company Domyn, prevailed in the competition. With the award, the alliance is to receive the resources needed to develop a state-of-the-art open-source AI model. The prestige project is intended to cover all 24 official EU languages and make a statement about the continent's technological ambitions.

  • EU Commission announced the winner of the Frontier AI Grand Challenge
  • The 'Europa' consortium, led by Italy's Domyn, won

Show HN: Multiplayer Usage Tracking for Claude Code, Codex and OpenCode

Summer is an open-source local tool by Autumn for tracking AI coding tool usage and spend. It supports Claude Code, Codex, and OpenCode, requires no hosting, and provides a local dashboard that aggregates usage across a team by developer, model, and cost.

  • Summer is a local, open-source tool with no hosting required.
  • Supports Claude Code, Codex, and OpenCode for usage tracking.
Research

Five Chinese AI Labs Cut Token Prices Up to 99%

ByteDance, Tencent, MiniMax, Alibaba, and Xiaomi all cut AI token prices by 50% to 99% within the same competitive window. Bank of America Securities analysts attribute the pricing race to narrowing capability differences between China's major AI models. Alibaba's 50% discount on Qwen3.7-Max was tied to the 618 shopping event, blending AI competition with consumer promotions.

  • Five Chinese AI labs slashed token prices by 50-99% in a short period
  • Bank of America Securities cites narrowing capability gaps among major providers as the cause
Models

IEEE Rolls Out Large Language Models Virtual Training Course

Large language models have moved from research labs into engineers' daily workflows. To help technical professionals stay ahead, IEEE offers a five-course online program that covers the engineering behind generative AI, from transformer architectures to deployment.

  • The LLM market is expected to grow by 33% annually through 2030, making proficiency a core requirement for technologists.
  • Engineers must understand transformer architecture and internal logic, not just treat LLMs as chatbots.
Other updates (7)
Agents

Huawei chips refine DeepSeek model in major leap for China's AI self-reliance

A research team including Huawei has successfully used Ascend 910C chips to complete post-training of the DeepSeek-V4-Pro model, marking a milestone in China's ability to perform complex AI training with domestic hardware. The project, involving over 1,000 chips and a 1.6 trillion parameter model, demonstrates a shift from inference-only capabilities to full training, bolstering China's AI self-sufficiency amid US sanctions.

  • Huawei and partners used Ascend 910C chips to post-train DeepSeek-V4-Pro.
  • The cluster of at least 1,000 chips performed full-parameter tuning on a 1.6 trillion parameter model.

PhD_fleet – Manage a virtual research lab of AI PhD students via Slack

PhD_fleet is a Python toolkit that lets a single researcher (advisor) spawn and converse with a fleet of Claude Code agents through Slack. Each agent has its own workspace directory, with Slack messages driving turns and the filesystem serving as long-term memory. A separate coach agent watches the advisor's mentoring and provides evidence-based feedback.

  • Advisor can spawn multiple AI student agents via Slack commands, each with independent workspace and long-term memory.
  • Coach agent analyzes mentoring interactions and offers improvement suggestions based on pedagogical frameworks.

Open-source AI skills that make Claude/ChatGPT produce real work, eval-scored

pm-claude-skills is an open-source library of 174 professional SKILL.md files for AI assistants, covering 18 professions. Each skill is eval-verified to produce professional-grade output usable as a first draft. The library also includes workflow recipes, skill memory, and cross-tool compatibility.

  • 174 professional skills covering product management, engineering, customer success, and more
  • Eval-verified quality with scores on structure, completeness, usefulness, and grounding

How we built an internal data analytics agent

Qubot, our internal Copilot-powered analytics agent, allows any GitHub employee to ask questions about our data in plain language. Here's what we learned as we built it.

  • Qubot offers multiple interfaces (Slack, VS Code, Copilot CLI) for low-barrier access to data analytics.
  • A federated context layer with structured knowledge is key to improving accuracy and speed (3x faster).
Models

MiniMax M3 vs. GLM 5.2: Codegen comparison across autonomous coding tasks

In the Thinkbench benchmark, GLM 5.2 led in correctness (92% full-pass) while MiniMax M3 was cheaper and faster. The two performed similarly on code modification tasks, but GLM was steadier on greenfield builds. MiniMax tended to build more complete systems when given ambiguous prompts.

  • GLM 5.2 scored 92% full-pass vs MiniMax M3's 84%
  • MiniMax cost $6.67 vs GLM's $18.47; average latency was 45s vs 80s

Checkmarx’s new SAST engine isn’t about the LLM. It’s about what happens after.

Checkmarx has released a new SAST engine that combines a deterministic rules-based scanner, an LLM trained on security data, and a third engine that classifies findings as true or false positives before they reach developers. The company claims an F1 score of 0.499, far above the category average of 0.20, and says the engine found 327 true positives missed by a leading frontier model. The architecture emphasizes orchestration, automatically running the three engines together to provide determinism, language coverage, and noise filtering.

  • Checkmarx's new SAST engine combines three engines: a deterministic rules scanner, an LLM, and a Findings Analysis Engine (FAE) that filters false positives before results reach developers.
  • The company claims an F1 score of 0.499, more than double the category average of 0.20, and says the engine found 327 true positives missed by a leading frontier model in tests.
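
For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The short Python sketch below shows how it is computed from raw counts. The counts are hypothetical and chosen only for illustration; they are not Checkmarx's data, and the function name is an assumption made for this example.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    if true_pos == 0:
        # No correct findings: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)  # share of reported findings that are real
    recall = true_pos / (true_pos + false_neg)     # share of real issues that were reported
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (not from the article): half the reports are real,
# and half the real issues are found.
example = f1_score(true_pos=50, false_pos=50, false_neg=50)

# Ratio of the claimed score to the claimed category average.
claimed_ratio = 0.499 / 0.20
```

Under these made-up counts, precision and recall are both 0.5, which gives an F1 close to the claimed 0.499. The claimed score is roughly 2.5 times the stated category average.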
Tools

Find the right stack for your AI use case

Inferlay is a platform that helps developers choose the appropriate AI technology stack for their projects by comparing various tools and frameworks.

  • Inferlay simplifies AI stack selection.
  • The platform offers tool comparisons and recommendations.