The Gradient AI News Source

Public articles: 10 | Collected articles: 10 | Trust: 80 | Refresh: 120 min

Health: Healthy | Source type: Research | Full-text rights: In-site rewrite | Last ingested: 2026-05-05 | ID: the-gradient | Status: Enabled

Independent publication; summary-only unless authorization is obtained.

Latest public articles

After Orthogonality: Virtue-Ethical Agency and AI Alignment

2026-02-18 23:25 UTC

This essay argues that rational people don’t have goals, and that rational AIs shouldn’t have goals. Human actions are rational not because we direct them at some final ‘goals,’ but because we align actions to practices. It proposes ‘eudaimonic rationality’ as a framework for AI alignment that shares a ‘type signature’ with human practice-based logic, addressing safety properties like transparency, corrigibility, and more.

Rational agency arises from participation in practices, not goal pursuit
Eudaimonic rationality is a natural and effective form of reasoning

AGI Is Not Multimodal

2025-06-04 14:00 UTC

This article argues that the multimodal scaling approach to AGI is fundamentally flawed. True general intelligence requires embodied understanding and interaction with the physical world, which current LLMs and multimodal models lack. The author critiques the assumption that language models learn world models, and proposes that genuine AGI must be built on interactive, embodied processes rather than glued-together modality modules.

LLMs do not learn true world models; they learn heuristics for next-token prediction. Their performance is achieved through brute-force memorization of syntactic patterns, not genuine understanding.
Multimodal approaches artificially stitch together separate modalities, failing to form coherent concepts and ignoring the embodied nature of intelligence.

Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research

2024-11-16 16:46 UTC

What is the Role of Mathematics in Modern Machine Learning? The past decade has witnessed a shift in how progress is made in machine learning. Research involving carefully designed and mathematically principled architectures results in only marginal improvements, while compute-intensive and engineering-first efforts that scale to ever larger training sets

Math shifts from theoretical guarantees to post-hoc explanation.
Tools such as intrinsic dimension, curvature, and topology are used to analyze trained models.

What's Missing From LLM Chatbots: A Sense of Purpose

2024-09-09 17:28 UTC

The article argues that while LLM chatbots have advanced on benchmarks, they lack a sense of purpose in dialogues. Current systems are trained to predict next tokens and fine-tuned with RLHF, leading to persona drift and poor long-horizon goal achievement. The author proposes Dialogue Action Tokens (DAT) to steer models toward purposeful interactions, and discusses future directions for monitoring and reward utilization.

LLM chatbots excel on benchmarks like MMLU but user experience hasn't improved proportionally.
Purposeful dialogue (multi-round, goal-oriented) is essential for human-AI collaboration.

We Need Positive Visions for AI Grounded in Wellbeing

2024-08-03 17:00 UTC

This article argues that ensuring AI benefits humanity requires grounding AI development in individual and societal wellbeing. Despite the lack of a definitive definition of wellbeing, common factors like relationships, meaningful work, and positive emotions provide a foundation. It calls for positive visions, measurement of AI's impact on wellbeing, training models to support wellbeing, and deploying AI in wellbeing-aligned ways.

Beneficial AI must be grounded in human wellbeing, both individual and societal.
We need positive, actionable visions for AI in society, not just harm mitigation.

Financial Market Applications of LLMs

2024-04-20 17:57 UTC

This article explores the potential and challenges of applying Large Language Models (LLMs) to financial markets. While LLMs excel in natural language, their application to financial time series faces issues like data scarcity, noise, and adversarial environments. The article examines possibilities such as multimodal learning, residualization, and long-context windows, and suggests synthetic data generation and fundamental analysis support as more feasible directions. Overall, a cautiously optimistic view is presented.

LLMs' autoregressive nature parallels quantitative trading's autoregressive structure, but financial data is noisy and signals are weak.
The volume of financial data is far smaller than that of language data, and market participants actively eliminate predictability.
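
The summary mentions residualization without defining it. The sketch below assumes it refers to the common quantitative-finance step of regressing an asset's returns on a shared factor (here, a market return series) and keeping what is left over. The function name `residualize` and the sample numbers are hypothetical and are not taken from the article.

```python
def residualize(asset_returns, market_returns):
    """Remove the market-explained part of an asset's returns.

    Fits asset = alpha + beta * market by ordinary least squares and
    returns (beta, residuals). Assumption: a single-factor model; real
    pipelines typically regress on several factors at once.
    """
    n = len(asset_returns)
    if n != len(market_returns) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n

    # Covariance and variance (the 1/n factors cancel in the ratio).
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    if var == 0:
        raise ValueError("market series has zero variance")

    beta = cov / var
    alpha = mean_a - beta * mean_m

    # Residual = the part of each return the market factor does not explain.
    residuals = [a - (alpha + beta * m)
                 for a, m in zip(asset_returns, market_returns)]
    return beta, residuals


# Hypothetical daily returns, for illustration only.
market = [0.01, -0.02, 0.03, 0.0]
asset = [0.03, -0.01, 0.02, 0.01]
beta, resid = residualize(asset, market)
```

By construction, the residuals average to zero and are uncorrelated with the market series, which is why a model trained on them cannot simply relearn the market move.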

A Brief Overview of Gender Bias in AI

2024-04-08 15:54 UTC

This article surveys research on gender bias in AI, covering word embeddings, facial recognition, coreference resolution, LLMs, and image generation. It discusses gaps, other bias types, and the philosophical question of how to 'fix' bias.

AI models reflect and amplify real-world gender biases; quantification is key to mitigation.
Research spans word embeddings to LLMs and image generation models.

Mamba Explained

2024-03-28 01:24 UTC

Is Attention all you need? Mamba, a novel AI model based on State Space Models (SSMs), emerges as a formidable alternative to the widely used Transformer models, addressing their inefficiency in processing long sequences.

Mamba replaces attention with state space models, eliminating the quadratic bottleneck: time grows as O(n) in sequence length, and inference keeps a fixed-size state (O(1) memory) instead of a growing cache.
Its selection mechanism allows context-dependent compression, balancing effectiveness and efficiency.
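
The two points above can be illustrated with a toy recurrence. This is a minimal sketch under simplifying assumptions (a scalar state and a hand-picked sigmoid gate), not Mamba's actual parameterization; the names `ssm_scan`, `selective_scan`, and `w_gate` are hypothetical.

```python
import math


def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Scalar linear state space recurrence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    One pass over the input (O(n) time) with a single running state,
    so the memory carried between steps does not grow with n.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def selective_scan(xs, w_gate=1.0):
    """Toy 'selective' variant: how much to keep depends on the input.

    Assumption: a sigmoid gate stands in for the input-dependent
    parameters; the real model learns these projections.
    """
    h = 0.0
    ys = []
    for x in xs:
        gate = 1.0 / (1.0 + math.exp(-w_gate * x))  # value in (0, 1)
        h = (1.0 - gate) * h + gate * x  # keep part of old state, admit part of x
        ys.append(h)
    return ys


print(ssm_scan([1.0, 0.0, 0.0], a=0.5))  # an impulse decays step by step
```

In the fixed version, every input is compressed into the state the same way. In the gated version, the current input decides how much of the old state survives, which is the sense in which the compression is context-dependent.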

Car-GPT: Could LLMs finally make self-driving cars happen?

2024-03-08 16:55 UTC

Exploring the utility of large language models in autonomous driving: Can they be trusted for self-driving cars, and what are the key challenges?

LLMs work via tokenization, transformers, and next-word prediction, and can be applied to perception, planning, and generation in autonomous driving.
In perception, LLMs describe scenes and detect objects; in planning, they combine with bird's-eye views for decision-making; in generation, they create training data or simulated scenarios.

Do text embeddings perfectly encode text?

2024-03-05 20:15 UTC

The article introduces vec2text, a method that can often reconstruct the original text from its embedding exactly, revealing significant security vulnerabilities in current RAG systems and vector databases.

Text embeddings are used for similarity search but can be inverted to recover original text.
Vec2text achieves up to 92% exact match on 32-token sequences through iterative optimization.
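
For context on the first point, similarity search over embeddings usually reduces to comparing vectors by cosine similarity. The sketch below uses hand-written three-dimensional vectors as stand-ins for real embeddings. The `index` contents and the function names are hypothetical, and it does not implement vec2text itself.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(query, index):
    """Return the id of the stored vector most similar to the query.

    index maps a document id to its embedding vector.
    """
    return max(index, key=lambda doc_id: cosine(query, index[doc_id]))


# Hypothetical stored embeddings; a real system would get these from a model.
index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}

best = nearest([0.9, 0.1, 0.0], index)
```

The vector store in such a system holds only the vectors, not the text. The article's point is that those vectors can still leak the text, so they warrant the same protection as the documents themselves.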
