Truthful Online Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing
This paper studies truthful online preference aggregation for LLM fine-tuning in mobile crowdsourcing, addressing strategic misreporting by workers. It proposes a dynamic Bayesian game model and an online weighted aggregation mechanism that dynamically adjusts worker weights based on feedback accuracy, ensuring truthful feedback and achieving sublinear regret O(√T). Experiments on real-world datasets show significant performance gains.
Article intelligence
Key points
- Dynamic Bayesian game model formulated for multi-agent online learning between platform and strategic workers.
- Online weighted aggregation mechanism adjusts weights to incentivize truthful feedback.
- Proven to achieve sublinear regret O(√T), even with limited worker feedback per time slot.
- Experiments on LLM fine-tuning with real datasets demonstrate significant improvements over baselines.
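To make the regret claim concrete, here is one standard way to write regret against the best single worker in hindsight. The notation is an assumption for illustration ($N$ workers, $\ell_t$ the loss in slot $t$, $\hat{y}_t$ the platform's aggregated output, $y_{i,t}$ worker $i$'s feedback); the paper's exact definition may differ.

$$
R(T) \;=\; \sum_{t=1}^{T} \ell_t(\hat{y}_t) \;-\; \min_{i \in \{1,\dots,N\}} \sum_{t=1}^{T} \ell_t(y_{i,t}),
\qquad
R(T) = \mathcal{O}(\sqrt{T}) \;\Longrightarrow\; \frac{R(T)}{T} = \mathcal{O}\!\left(\frac{1}{\sqrt{T}}\right) \to 0 .
$$

Under this reading, sublinear regret means the platform's average per-slot loss approaches that of the most accurate worker, whereas linear regret $\mathcal{O}(T)$ leaves a per-slot gap that never closes.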
Why it matters
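This matters because workers who supply the preference feedback used to fine-tune LLMs may misreport it strategically. The paper models the interaction between the platform and these workers as a dynamic Bayesian game and proposes a mechanism that keeps feedback truthful while achieving sublinear regret.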
Technical impact
May affect how crowdsourcing platforms collect, weight, and aggregate human preference feedback for LLM fine-tuning, and how such pipelines are evaluated.
[2605.24052] Truthful Online Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing
[Submitted on 22 May 2026]
Title: Truthful Online Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing
Authors: Shugang Hao and Lingjie Duan
Abstract: To better serve users' demands in mobile applications (e.g., navigation), mobile crowdsourcing platforms can iteratively align large language model (LLM)-generated content (e.g., AI-generated traffic condition predictions) with human feedback collected from crowdsourcing workers (e.g., mobile users). However, workers may strategically misreport their online preference feedback to maximize their influence or payment. Existing pipelines in mobile crowdsourcing (e.g., EM-based weight estimation) fail to identify the most accurate worker in this online setting, resulting in a linear regret $\mathcal{O}(T)$ over $T$ time slots. In this paper, we study truthful online preference aggregation for LLM fine-tuning in mobile crowdsourcing. We formulate a new dynamic Bayesian game to model the multi-agent online learning process between the platform and strategic mobile workers. We propose a novel online weighted aggregation mechanism that dynamically adjusts each worker's weight in the preference aggregation according to their feedback accuracy. We prove that our mechanism ensures truthful feedback from strategic workers and achieves a sublinear regret $\mathcal{O}(\sqrt{T})$ over $T$ time slots. We further extend our mechanism to a challenging scenario with limited worker feedback per time slot, still guaranteeing a sublinear regret $\mathcal{O}(\sqrt{T})$. Experiments on LLM fine-tuning with real-world datasets further demonstrate significant performance gains of our mechanisms over benchmark schemes.
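As a rough illustration of accuracy-driven weight adjustment, the sketch below uses a textbook multiplicative-weights (Hedge) update. It is not the authors' mechanism and does not model incentives or strategic misreporting. It assumes binary preferences, a ground-truth label revealed after every slot, and workers with fixed accuracies; all function names and parameters are hypothetical.

```python
import math
import random


def aggregate(weights, reports):
    """Weighted majority vote over binary preference reports (0 or 1)."""
    mass_for_one = sum(w for w, r in zip(weights, reports) if r == 1)
    return 1 if mass_for_one >= sum(weights) / 2 else 0


def update_weights(weights, reports, truth, eta):
    """Hedge update: multiply each wrong worker's weight by exp(-eta), then renormalize."""
    new = [w * math.exp(-eta * (r != truth)) for w, r in zip(weights, reports)]
    total = sum(new)
    return [w / total for w in new]


def simulate(accuracies, T, seed=0):
    """Run T slots with non-strategic workers (an assumption of this sketch)."""
    rng = random.Random(seed)
    n = len(accuracies)
    # Standard Hedge step size when the horizon T is known in advance.
    eta = math.sqrt(8 * math.log(n) / T)
    weights = [1.0 / n] * n
    platform_mistakes = 0
    worker_mistakes = [0] * n
    for _ in range(T):
        truth = rng.randint(0, 1)
        # Each worker reports the true label with probability equal to its accuracy.
        reports = [truth if rng.random() < a else 1 - truth for a in accuracies]
        platform_mistakes += aggregate(weights, reports) != truth
        for i, r in enumerate(reports):
            worker_mistakes[i] += r != truth
        weights = update_weights(weights, reports, truth, eta)
    return weights, platform_mistakes, worker_mistakes


if __name__ == "__main__":
    w, pm, wm = simulate([0.6, 0.7, 0.95], T=2000)
    print("final weights:", [round(x, 4) for x in w])
    print("platform mistakes:", pm, "worker mistakes:", wm)
```

In this toy setting the weight mass concentrates on the most accurate worker over time, which is the intuition behind tracking the best worker online. The truthfulness guarantee claimed in the abstract requires the paper's own game-theoretic design and is not captured here.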
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2605.24052 [cs.LG]
(or arXiv:2605.24052v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2605.24052
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Shugang Hao [view email] [v1] Fri, 22 May 2026 00:26:12 UTC (7,042 KB)