Published: 2026-05-26 04:00 UTC. Updated: 2026-06-30 13:03 UTC

Truthful Online Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing

This paper studies truthful online preference aggregation for LLM fine-tuning in mobile crowdsourcing, where workers may strategically misreport their feedback. It formulates a dynamic Bayesian game and proposes an online weighted aggregation mechanism that adjusts each worker's weight according to feedback accuracy. The mechanism is proven to elicit truthful feedback and to achieve sublinear regret O(√T) over T time slots. Experiments on LLM fine-tuning with real-world datasets show significant performance gains over benchmark schemes.

Source: arXiv Machine Learning. Authors: Shugang Hao, Lingjie Duan

[Submitted on 22 May 2026]

Abstract: To better serve users' demands in mobile applications (e.g., navigation), mobile crowdsourcing platforms can iteratively align large language model (LLM)-generated content (e.g., AI-generated traffic condition predictions) with human feedback collected from crowdsourcing workers (e.g., mobile users). However, workers may strategically misreport their online preference feedback to maximize their influence or payment. Existing pipelines in mobile crowdsourcing (e.g., EM-based weight estimation) fail to identify the most accurate worker in this online setting, resulting in a linear regret $\mathcal{O}(T)$ over $T$ time slots. In this paper, we study truthful online preference aggregation for LLM fine-tuning in mobile crowdsourcing. We formulate a new dynamic Bayesian game to model the multi-agent online learning process between the platform and strategic mobile workers. We propose a novel online weighted aggregation mechanism that dynamically adjusts each worker's weight in the preference aggregation according to their feedback accuracy. We prove that our mechanism ensures truthful feedback from strategic workers and achieves a sublinear regret $\mathcal{O}(\sqrt{T})$ over $T$ time slots. We further extend our mechanism to a challenging scenario with limited worker feedback per time slot, still guaranteeing a sublinear regret $\mathcal{O}(\sqrt{T})$. Experiments on LLM fine-tuning with real-world datasets further demonstrate significant performance gains of our mechanisms over benchmark schemes.
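To make the idea of accuracy-driven weight adjustment concrete, the sketch below uses the classical Hedge (multiplicative weights) rule. This is a stand-in, not the mechanism proposed in the paper. It assumes binary preference reports, a ground-truth outcome revealed after each slot, and non-strategic workers with fixed accuracies, so it illustrates only the $\mathcal{O}(\sqrt{T})$ regret against the best single worker and says nothing about truthfulness. The names `hedge_aggregate` and `simulate`, and the accuracy values, are hypothetical.

```python
import math
import random


def hedge_aggregate(reports, truths, eta):
    """Run Hedge over binary worker reports.

    reports[t][i] is worker i's 0/1 report in slot t; truths[t] is the
    outcome revealed after slot t (an assumption of this sketch).
    Returns the final weights and the regret against the best worker.
    """
    n = len(reports[0])
    weights = [1.0] * n
    platform_loss = 0.0
    worker_loss = [0.0] * n
    for report, truth in zip(reports, truths):
        total = sum(weights)
        # Weighted vote: share of total weight backing the report "1".
        p_one = sum(w for w, r in zip(weights, report) if r == 1) / total
        # Expected 0/1 loss of the weighted aggregate in this slot.
        platform_loss += (1.0 - p_one) if truth == 1 else p_one
        for i, r in enumerate(report):
            loss = 0.0 if r == truth else 1.0
            worker_loss[i] += loss
            # Shrink the weight of every worker whose report was wrong.
            weights[i] *= math.exp(-eta * loss)
    regret = platform_loss - min(worker_loss)
    return weights, regret


def simulate(T=2000, accuracies=(0.9, 0.7, 0.55), seed=0):
    """Simulate workers who report the truth with fixed probabilities."""
    rng = random.Random(seed)
    truths = [rng.randint(0, 1) for _ in range(T)]
    reports = [[t if rng.random() < a else 1 - t for a in accuracies]
               for t in truths]
    # Standard Hedge learning rate for losses in [0, 1] and known horizon T.
    eta = math.sqrt(8.0 * math.log(len(accuracies)) / T)
    return hedge_aggregate(reports, truths, eta)


if __name__ == "__main__":
    weights, regret = simulate()
    print("final weights:", weights)
    print("regret vs best worker:", regret)
```

With this learning rate, the textbook Hedge analysis bounds the regret by $\sqrt{(T \ln N)/2}$ for $N$ workers, which is sublinear in $T$. The paper's mechanism additionally has to cope with strategic misreporting and limited feedback per slot, neither of which this sketch models.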

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Cite as: arXiv:2605.24052 [cs.LG]

(or arXiv:2605.24052v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2605.24052

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Shugang Hao

[v1] Fri, 22 May 2026 00:26:12 UTC (7,042 KB)
