2025-06-05 |
Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets |
Lei Hsiung et.al. |
2506.05346 |
null |
2025-06-05 |
SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs |
Jiahui Wang et.al. |
2506.05344 |
null |
2025-06-05 |
ContentV: Efficient Training of Video Generation Models with Limited Compute |
Wenfeng Lin et.al. |
2506.05343 |
null |
2025-06-05 |
Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning |
Xingjian Ran et.al. |
2506.05341 |
null |
2025-06-05 |
VideoMolmo: Spatio-Temporal Grounding Meets Pointing |
Ghazi Shazan Ahmad et.al. |
2506.05336 |
null |
2025-06-05 |
Search Arena: Analyzing Search-Augmented LLMs |
Mihran Miroyan et.al. |
2506.05334 |
null |
2025-06-05 |
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning |
Xinyan Chen et.al. |
2506.05331 |
null |
2025-06-05 |
LSM-2: Learning from Incomplete Wearable Sensor Data |
Maxwell A. Xu et.al. |
2506.05321 |
null |
2025-06-05 |
Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay |
Yifan Sun et.al. |
2506.05316 |
null |
2025-06-05 |
Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models |
Taha Entesari et.al. |
2506.05314 |
null |
2025-06-05 |
Learning normalized image densities via dual score matching |
Florentin Guth et.al. |
2506.05310 |
null |
2025-06-05 |
ProRefine: Inference-time Prompt Refinement with Textual Feedback |
Deepak Pandita et.al. |
2506.05305 |
null |
2025-06-05 |
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos |
Weifeng Lin et.al. |
2506.05302 |
null |
2025-06-05 |
Power Law Guided Dynamic Sifting for Efficient Attention |
Nirav Koley et.al. |
2506.05300 |
null |
2025-06-05 |
Sample Complexity and Representation Ability of Test-time Scaling Paradigms |
Baihe Huang et.al. |
2506.05295 |
null |
2025-06-05 |
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World? |
Yuqian Yuan et.al. |
2506.05287 |
null |
2025-06-05 |
Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning |
Nan Huo et.al. |
2506.05278 |
null |
2025-06-05 |
How to Unlock Time Series Editing? Diffusion-Driven Approach with Multi-Grained Control |
Hao Yu et.al. |
2506.05276 |
null |
2025-06-05 |
Teaming in the AI Era: AI-Augmented Frameworks for Forming, Simulating, and Optimizing Human Teams |
Mohammed Almutairi et.al. |
2506.05265 |
null |
2025-06-05 |
Can Foundation Models Generalise the Presentation Attack Detection Capabilities on ID Cards? |
Juan E. Tapia et.al. |
2506.05263 |
null |
2025-06-05 |
LeanPO: Lean Preference Optimization for Likelihood Alignment in Video-LLMs |
Xiaodong Wang et.al. |
2506.05260 |
null |
2025-06-05 |
SECNEURON: Reliable and Flexible Abuse Control in Local LLMs via Hybrid Neuron Encryption |
Zhiqiang Wang et.al. |
2506.05242 |
null |
2025-06-05 |
Aligning Latent Spaces with Flow Priors |
Yizhuo Li et.al. |
2506.05240 |
null |
2025-06-05 |
DSG-World: Learning a 3D Gaussian World Model from Dual State Videos |
Wenhao Hu et.al. |
2506.05217 |
null |
2025-06-05 |
LLM-First Search: Self-Guided Exploration of the Solution Space |
Nathan Herr et.al. |
2506.05213 |
null |
2025-06-05 |
Towards Vision-Language-Garment Models For Web Knowledge Garment Understanding and Generation |
Jan Ackermann et.al. |
2506.05210 |
null |
2025-06-05 |
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text |
Nikhil Kandpal et.al. |
2506.05209 |
null |
2025-06-05 |
RELIC: Evaluating Compositional Instruction Following via Language Recognition |
Jackson Petty et.al. |
2506.05205 |
null |
2025-06-05 |
Transformers Meet In-Context Learning: A Universal Approximation Theory |
Gen Li et.al. |
2506.05200 |
null |
2025-06-05 |
Quantifying Cross-Modality Memorization in Vision-Language Models |
Yuxin Wen et.al. |
2506.05198 |
null |
2025-06-05 |
Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis |
Neeraj Kumar et.al. |
2506.05184 |
null |
2025-06-05 |
TreeRPO: Tree Relative Policy Optimization |
Zhicheng Yang et.al. |
2506.05183 |
null |
2025-06-05 |
On the Comprehensibility of Multi-structured Financial Documents using LLMs and Pre-processing Tools |
Shivani Upadhyay et.al. |
2506.05182 |
null |
2025-06-05 |
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models |
Yanzhao Zhang et.al. |
2506.05176 |
null |
2025-06-05 |
ECoRAG: Evidentiality-guided Compression for Long Context RAG |
Yeonseok Jeong et.al. |
2506.05167 |
null |
2025-06-05 |
Dissecting Bias in LLMs: A Mechanistic Interpretability Perspective |
Bhavik Chandna et.al. |
2506.05166 |
null |
2025-06-05 |
Do Large Language Models Judge Error Severity Like Humans? |
Diege Sun et.al. |
2506.05142 |
null |
2025-06-05 |
DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning |
Tanmay Parekh et.al. |
2506.05128 |
null |
2025-06-05 |
PixCell: A generative foundation model for digital histopathology images |
Srikar Yellapragada et.al. |
2506.05127 |
null |
2025-06-05 |
Membership Inference Attacks on Sequence Models |
Lorenzo Rossi et.al. |
2506.05126 |
null |
2025-06-05 |
The NTNU System at the S&I Challenge 2025 SLA Open Track |
Hong-Yun Lin et.al. |
2506.05121 |
null |
2025-06-05 |
DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models |
Revant Teotia et.al. |
2506.05108 |
null |
2025-06-05 |
Survey on the Evaluation of Generative Models in Music |
Alexander Lerch et.al. |
2506.05104 |
null |
2025-06-05 |
Privacy Amplification Through Synthetic Data: Insights from Linear Regression |
Clément Pierquin et.al. |
2506.05101 |
null |
2025-06-05 |
Interpretable Multimodal Framework for Human-Centered Street Assessment: Integrating Visual-Language Models for Perceptual Urban Diagnostics |
HaoTian Lan et.al. |
2506.05087 |
null |
2025-06-05 |
Parking, Perception, and Retail: Street-Level Determinants of Community Vitality in Harbin |
HaoTian Lan et.al. |
2506.05080 |
null |
2025-06-05 |
Just a Scratch: Enhancing LLM Capabilities for Self-harm Detection through Intent Differentiation and Emoji Interpretation |
Soumitra Ghosh et.al. |
2506.05073 |
null |
2025-06-05 |
RIVAL: Reinforcement Learning with Iterative and Adversarial Optimization for Machine Translation |
Tianjiao Li et.al. |
2506.05070 |
null |
2025-06-05 |
Reason-to-Recommend: Using Interaction-of-Thought Reasoning to Enhance LLM Recommendation |
Keyu Zhao et.al. |
2506.05069 |
null |
2025-06-05 |
Does It Make Sense to Speak of Introspection in Large Language Models? |
Iulia Comşa et.al. |
2506.05068 |
null |
2025-06-05 |
A Survey on Vietnamese Document Analysis and Recognition: Challenges and Future Directions |
Anh Le et.al. |
2506.05061 |
null |
2025-06-05 |
TALL – A Trainable Architecture for Enhancing LLM Performance in Low-Resource Languages |
Moshe Ofer et.al. |
2506.05057 |
null |
2025-06-05 |
Automatic Robustness Stress Testing of LLMs as Mathematical Problem Solvers |
Yutao Hou et.al. |
2506.05038 |
null |
2025-06-05 |
Tuning the Right Foundation Models is What you Need for Partial Label Learning |
Kuang He et.al. |
2506.05027 |
null |
2025-06-05 |
Hierarchical Language Models for Semantic Navigation and Manipulation in an Aerial-Ground Robotic System |
Haokun Liu et.al. |
2506.05020 |
null |
2025-06-05 |
UAV4D: Dynamic Neural Rendering of Human-Centric UAV Imagery using Gaussian Splatting |
Jaehoon Choi et.al. |
2506.05011 |
null |
2025-06-05 |
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development |
Zhenran Xu et.al. |
2506.05010 |
null |
2025-06-05 |
Point Cloud Segmentation of Agricultural Vehicles using 3D Gaussian Splatting |
Alfred T. Christiansen et.al. |
2506.05009 |
null |
2025-06-05 |
QiMeng: Fully Automated Hardware and Software Design for Processor Chip |
Rui Zhang et.al. |
2506.05007 |
null |
2025-06-05 |
SCOP: Evaluating the Comprehension Process of Large Language Models from a Cognitive View |
Yongjie Xiao et.al. |
2506.05000 |
null |
2025-06-05 |
Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning |
Mehdi Azarafza et.al. |
2506.04998 |
null |
2025-06-05 |
BacPrep: An Experimental Platform for Evaluating LLM-Based Bacalaureat Assessment |
Dumitran Adrian Marius et.al. |
2506.04989 |
null |
2025-06-05 |
FPTQuant: Function-Preserving Transforms for LLM Quantization |
Boris van Breugel et.al. |
2506.04985 |
null |
2025-06-05 |
TextVidBench: A Benchmark for Long Video Scene Text Understanding |
Yangyang Zhong et.al. |
2506.04983 |
null |
2025-06-05 |
Agentic AI for Intent-Based Industrial Automation |
Marcos Lima Romero et.al. |
2506.04980 |
null |
2025-06-05 |
Evaluating Prompt-Driven Chinese Large Language Models: The Influence of Persona Assignment on Stereotypes and Safeguards |
Geng Liu et.al. |
2506.04975 |
null |
2025-06-05 |
From Struggle (06-2024) to Mastery (02-2025) LLMs Conquer Advanced Algorithm Exams and Pave the Way for Editorial Generation |
Adrian Marius Dumitran et.al. |
2506.04965 |
null |
2025-06-05 |
PoCGen: Generating Proof-of-Concept Exploits for Vulnerabilities in Npm Packages |
Deniz Simsek et.al. |
2506.04962 |
null |
2025-06-05 |
APVR: Hour-Level Long Video Understanding with Adaptive Pivot Visual Information Retrieval |
Hong Gao et.al. |
2506.04953 |
null |
2025-06-05 |
Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback |
Junior Cedric Tonga et.al. |
2506.04920 |
null |
2025-06-05 |
When Thinking LLMs Lie: Unveiling the Strategic Deception in Representations of Reasoning Models |
Kai Wang et.al. |
2506.04909 |
null |
2025-06-05 |
Verbose ListOps (VLO): Beyond Long Context – Unmasking LLM’s Reasoning Blind Spots |
Alex Pan et.al. |
2506.04907 |
null |
2025-06-05 |
From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes |
Tianxu Wang et.al. |
2506.04897 |
null |
2025-06-05 |
ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests |
Shiyi Xu et.al. |
2506.04894 |
null |
2025-06-05 |
Evaluating the Effectiveness of Linguistic Knowledge in Pretrained Language Models: A Case Study of Universal Dependencies |
Wenxi Li et.al. |
2506.04887 |
null |
2025-06-05 |
Consciousness via MIPT? |
Alexander Gorsky et.al. |
2506.04875 |
null |
2025-06-05 |
LLMs for sensory-motor control: Combining in-context and iterative learning |
Jônata Tyska Carvalho et.al. |
2506.04867 |
null |
2025-06-05 |
Adapting Online Customer Reviews for Blind Users: A Case Study of Restaurant Reviews |
Mohan Sunkara et.al. |
2506.04865 |
null |
2025-06-05 |
Sparse Autoencoders, Again? |
Yin Lu et.al. |
2506.04859 |
null |
2025-06-05 |
Prompting LLMs: Length Control for Isometric Machine Translation |
Dávid Javorský et.al. |
2506.04855 |
null |
2025-06-05 |
Multiple-Choice Question Generation Using Large Language Models: Methodology and Educator Insights |
Giorgio Biancini et.al. |
2506.04851 |
null |
2025-06-05 |
On Automating Security Policies with Contemporary LLMs |
Pablo Fernández Saura et.al. |
2506.04838 |
null |
2025-06-05 |
OpenMaskDINO3D: Reasoning 3D Segmentation via Large Language Model |
Kunshen Zhang et.al. |
2506.04837 |
null |
2025-06-05 |
Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models |
Changyue Wang et.al. |
2506.04832 |
null |
2025-06-05 |
DualX-VSR: Dual Axial Spatial $\times$ Temporal Transformer for Real-World Video Super-Resolution without Motion Compensation |
Shuo Cao et.al. |
2506.04830 |
null |
2025-06-05 |
Evaluating Vision-Language and Large Language Models for Automated Student Assessment in Indonesian Classrooms |
Nurul Aisyah et.al. |
2506.04822 |
null |
2025-06-05 |
LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning |
Zhen Hao Wong et.al. |
2506.04821 |
null |
2025-06-05 |
Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision Study |
Yujun Zhou et.al. |
2506.04810 |
null |
2025-06-05 |
Towards LLM-Centric Multimodal Fusion: A Survey on Integration Strategies and Techniques |
Jisu An et.al. |
2506.04788 |
null |
2025-06-05 |
MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark |
Dingdong Wang et.al. |
2506.04779 |
null |
2025-06-05 |
Fine-Grained Interpretation of Political Opinions in Large Language Models |
Jingyu Hu et.al. |
2506.04774 |
null |
2025-06-05 |
GOLFer: Smaller LM-Generated Documents Hallucination Filter & Combiner for Query Expansion in Information Retrieval |
Lingyuan Liu et.al. |
2506.04762 |
link |
2025-06-05 |
Exp4Fuse: A Rank Fusion Framework for Enhanced Sparse Retrieval using Large Language Model-based Query Expansion |
Lingyuan Liu et.al. |
2506.04760 |
link |
2025-06-05 |
Truth in the Few: High-Value Data Selection for Efficient Multi-Modal Reasoning |
Shenshen Li et.al. |
2506.04755 |
null |
2025-06-05 |
Multi-Layer GRPO: Enhancing Reasoning and Self-Correction in Large Language Models |
Fei Ding et.al. |
2506.04746 |
null |
2025-06-05 |
SRD: Reinforcement-Learned Semantic Perturbation for Backdoor Defense in VLMs |
Shuhan Xu et.al. |
2506.04743 |
null |
2025-06-05 |
Lifelong Evolution: Collaborative Learning between Large and Small Language Models for Continuous Emergent Fake News Detection |
Ziyi Zhou et.al. |
2506.04739 |
null |
2025-06-05 |
Towards Holistic Visual Quality Assessment of AI-Generated Videos: A LLM-Based Multi-Dimensional Evaluation Model |
Zelu Qi et.al. |
2506.04715 |
null |
2025-06-05 |
UNO: Unlearning via Orthogonalization in Generative models |
Pinak Mandal et.al. |
2506.04712 |
null |
2025-06-05 |
LLM-based phoneme-to-grapheme for phoneme-based speech recognition |
Te Ma et.al. |
2506.04711 |
null |
2025-06-04 |
OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis |
Junting Chen et.al. |
2506.04217 |
null |
2025-06-04 |
Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector |
Boyong He et.al. |
2506.04211 |
link |
2025-06-04 |
Language-Image Alignment with Fixed Text Encoders |
Jingfeng Yang et.al. |
2506.04209 |
null |
2025-06-04 |
Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning |
Shuang Chen et.al. |
2506.04207 |
null |
2025-06-04 |
EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation |
Jinghan Jia et.al. |
2506.04205 |
null |
2025-06-04 |
Cascadia: A Cascade Serving System for Large Language Models |
Youhe Jiang et.al. |
2506.04203 |
null |
2025-06-04 |
TracLLM: A Generic Framework for Attributing Long Context LLMs |
Yanting Wang et.al. |
2506.04202 |
null |
2025-06-04 |
R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning |
Qingfei Zhao et.al. |
2506.04185 |
null |
2025-06-04 |
SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models |
Yuhao Wu et.al. |
2506.04180 |
null |
2025-06-04 |
SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling |
Anhao Zhao et.al. |
2506.04179 |
null |
2025-06-04 |
Does Prompt Design Impact Quality of Data Imputation by LLMs? |
Shreenidhi Srinivasan et.al. |
2506.04172 |
null |
2025-06-04 |
Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints |
Utkarsh Utkarsh et.al. |
2506.04171 |
null |
2025-06-04 |
VISCA: Inferring Component Abstractions for Automated End-to-End Testing |
Parsa Alian et.al. |
2506.04161 |
null |
2025-06-04 |
A Dataset for Addressing Patient’s Information Needs related to Clinical Course of Hospitalization |
Sarvesh Soni et.al. |
2506.04156 |
null |
2025-06-04 |
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis |
Kejian Zhu et.al. |
2506.04142 |
null |
2025-06-04 |
MMR-V: What’s Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos |
Kejian Zhu et.al. |
2506.04141 |
null |
2025-06-04 |
TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems |
Shaina Raza et.al. |
2506.04133 |
null |
2025-06-04 |
Guided Speculative Inference for Efficient Test-Time Alignment of LLMs |
Jonathan Geuter et.al. |
2506.04118 |
null |
2025-06-05 |
Rectified Sparse Attention |
Yutao Sun et.al. |
2506.04108 |
null |
2025-06-04 |
TextAtari: 100K Frames Game Playing with Language Agents |
Wenhao Li et.al. |
2506.04098 |
link |
2025-06-04 |
AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment |
Anastasiia Ivanova et.al. |
2506.04089 |
null |
2025-06-04 |
Multimodal Tabular Reasoning with Privileged Structured Information |
Jun-Peng Jiang et.al. |
2506.04088 |
null |
2025-06-04 |
EuroLLM-9B: Technical Report |
Pedro Henrique Martins et.al. |
2506.04079 |
null |
2025-06-04 |
LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation |
Ming Zhang et.al. |
2506.04078 |
null |
2025-06-04 |
A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions |
Chung-Chun Wang et.al. |
2506.04077 |
null |
2025-06-04 |
A Statistics-Driven Differentiable Approach for Sound Texture Synthesis and Analysis |
Esteban Gutiérrez et.al. |
2506.04073 |
null |
2025-06-04 |
Controlling Difficulty of Generated Text for AI-Assisted Language Learning |
Meiqing Jin et.al. |
2506.04072 |
null |
2025-06-04 |
Progressive Mastery: Customized Curriculum Learning with Guided Prompting for Mathematical Reasoning |
Muling Wu et.al. |
2506.04065 |
null |
2025-06-04 |
Crowd-SFT: Crowdsourcing for LLM Alignment |
Alex Sotiropoulos et.al. |
2506.04063 |
null |
2025-06-04 |
Towards generating more interpretable counterfactuals via concept vectors: a preliminary study on chest X-rays |
Bulat Maksudov et.al. |
2506.04058 |
null |
2025-06-04 |
High Accuracy, Less Talk (HALT): Reliable LLMs through Capability-Aligned Finetuning |
Tim Franzmeyer et.al. |
2506.04051 |
null |
2025-06-04 |
Explainability-Based Token Replacement on LLM-Generated Text |
Hadi Mohammadi et.al. |
2506.04050 |
null |
2025-06-04 |
Lacuna Inc. at SemEval-2025 Task 4: LoRA-Enhanced Influence-Based Unlearning for LLMs |
Aleksey Kudelya et.al. |
2506.04044 |
null |
2025-06-04 |
Think Like a Person Before Responding: A Multi-Faceted Evaluation of Persona-Guided LLMs for Countering Hate |
Mikel K. Ngueajio et.al. |
2506.04043 |
null |
2025-06-04 |
Unveiling and Eliminating the Shortcut Learning for Locate-Then-Edit Knowledge Editing via Both Subject and Relation Awareness |
Xiyu Liu et.al. |
2506.04042 |
null |
2025-06-04 |
Mitigating Hallucinations in Large Vision-Language Models via Entity-Centric Multimodal Preference Optimization |
Jiulong Wu et.al. |
2506.04039 |
null |
2025-06-04 |
Generating Automotive Code: Large Language Models for Software Development and Verification in Safety-Critical Systems |
Sven Kirchner et.al. |
2506.04038 |
null |
2025-06-04 |
Privacy and Security Threat for OpenAI GPTs |
Wei Wenying et.al. |
2506.04036 |
null |
2025-06-04 |
AgentMisalignment: Measuring the Propensity for Misaligned Behaviour in LLM-Based Agents |
Akshat Naik et.al. |
2506.04018 |
null |
2025-06-04 |
Dreaming up scale invariance via inverse renormalization group |
Adam Rançon et.al. |
2506.04016 |
null |
2025-06-04 |
GORACS: Group-level Optimal Transport-guided Coreset Selection for LLM-based Recommender Systems |
Tiehua Mei et.al. |
2506.04015 |
null |
2025-06-04 |
Large deviations for scaled families of Schrödinger bridges with reflection |
Viktor Nilsson et.al. |
2506.03999 |
null |
2025-06-04 |
Seeing What Tastes Good: Revisiting Multimodal Distributional Semantics in the Billion Parameter Era |
Dan Oneata et.al. |
2506.03994 |
null |
2025-06-04 |
From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding |
Chiwei Zhu et.al. |
2506.03968 |
null |
2025-06-04 |
Lower Ricci Curvature for Hypergraphs |
Shiyi Yang et.al. |
2506.03943 |
null |
2025-06-04 |
Graph Counselor: Adaptive Graph Exploration via Multi-Agent Synergy to Enhance LLM Reasoning |
Junqi Gao et.al. |
2506.03939 |
null |
2025-06-04 |
VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation |
Yuansheng Ni et.al. |
2506.03930 |
null |
2025-06-04 |
Vision Remember: Alleviating Visual Forgetting in Efficient MLLM with Vision Feature Resample |
Ze Feng et.al. |
2506.03928 |
null |
2025-06-04 |
More or Less Wrong: A Benchmark for Directional Bias in LLM Comparative Reasoning |
Mohammadamin Shafiei et.al. |
2506.03923 |
null |
2025-06-04 |
HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models |
Zhaolu Kang et.al. |
2506.03922 |
null |
2025-06-05 |
Magic Mushroom: A Customizable Benchmark for Fine-grained Analysis of Retrieval Noise Erosion in RAG Systems |
Yuxin Zhang et.al. |
2506.03901 |
null |
2025-06-04 |
RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing |
Ruihan Jin et.al. |
2506.03880 |
null |
2025-06-04 |
Evaluating Apple Intelligence’s Writing Tools for Privacy Against Large Language Model-Based Inference Attacks: Insights from Early Datasets |
Mohd. Farhan Israk Soumik et.al. |
2506.03870 |
null |
2025-06-04 |
EuroGEST: Investigating gender stereotypes in multilingual language models |
Jacqueline Rowe et.al. |
2506.03867 |
null |
2025-06-04 |
PulseReddit: A Novel Reddit Dataset for Benchmarking MAS in High-Frequency Cryptocurrency Trading |
Qiuhan Han et.al. |
2506.03861 |
null |
2025-06-04 |
Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation |
Mingxuan Xia et.al. |
2506.03857 |
null |
2025-06-04 |
Algorithm- and Data-Dependent Generalization Bounds for Score-Based Generative Models |
Benjamin Dupuis et.al. |
2506.03849 |
null |
2025-06-04 |
Enhancing Safety of Foundation Models for Visual Navigation through Collision Avoidance via Repulsive Estimation |
Joonkyung Kim et.al. |
2506.03834 |
null |
2025-06-04 |
AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance |
Dhaval Patel et.al. |
2506.03828 |
null |
2025-06-04 |
Multi-objective Aligned Bidword Generation Model for E-commerce Search Advertising |
Zhenhui Liu et.al. |
2506.03827 |
null |
2025-06-04 |
From Theory to Practice: Real-World Use Cases on Trustworthy LLM-Driven Process Modeling, Prediction and Automation |
Peter Pfeiffer et.al. |
2506.03801 |
null |
2025-06-04 |
STELLA: Towards Protein Function Prediction with Multimodal LLMs Integrating Sequence-Structure Representations |
Hongwang Xiao et.al. |
2506.03800 |
null |
2025-06-04 |
Mark My Words: A Robust Multilingual Model for Punctuation in Text and Speech Transcripts |
Sidharth Pulipaka et.al. |
2506.03793 |
null |
2025-06-05 |
Knockout LLM Assessment: Using Large Language Models for Evaluations through Iterative Pairwise Comparisons |
Isik Baran Sandan et.al. |
2506.03785 |
null |
2025-06-04 |
Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models |
Seungcheol Park et.al. |
2506.03781 |
null |
2025-06-04 |
ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations |
Quang Hieu Pham et.al. |
2506.03763 |
null |
2025-06-04 |
AhaKV: Adaptive Holistic Attention-Driven KV Cache Eviction for Efficient Inference of Large Language Models |
Yifeng Gu et.al. |
2506.03762 |
null |
2025-06-04 |
Act-as-Pet: Benchmarking the Abilities of Large Language Models as E-Pets in Social Network Services |
Hongcheng Guo et.al. |
2506.03761 |
null |
2025-06-04 |
Understanding Physical Properties of Unseen Deformable Objects by Leveraging Large Language Models and Robot Actions |
Changmin Park et.al. |
2506.03760 |
null |
2025-06-04 |
Frame-Level Real-Time Assessment of Stroke Rehabilitation Exercises from Video-Level Labeled Data: Task-Specific vs. Foundation Models |
Gonçalo Mesquita et.al. |
2506.03752 |
null |
2025-06-04 |
Spatiotemporal Prediction of Electric Vehicle Charging Load Based on Large Language Models |
Hang Fan et.al. |
2506.03728 |
null |
2025-06-04 |
Sign-SGD is the Golden Gate between Multi-Node to Single-Node Learning: Significant Boost via Parameter-Free Optimization |
Daniil Medyakov et.al. |
2506.03725 |
null |
2025-06-04 |
Verbalized Confidence Triggers Self-Verification: Emergent Behavior Without Explicit Reasoning Supervision |
Chaeyun Jang et.al. |
2506.03723 |
null |
2025-06-04 |
On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity |
Quentin Bertrand et.al. |
2506.03719 |
null |
2025-06-04 |
AetherVision-Bench: An Open-Vocabulary RGB-Infrared Benchmark for Multi-Angle Segmentation across Aerial and Ground Perspectives |
Aniruddh Sikdar et.al. |
2506.03709 |
null |
2025-06-04 |
ScoreRAG: A Retrieval-Augmented Generation Framework with Consistency-Relevance Scoring and Structured Summarization for News Generation |
Pei-Yun Lin et.al. |
2506.03704 |
null |
2025-06-04 |
Learning-at-Criticality in Large Language Models for Quantum Field Theory and Beyond |
Xiansheng Cai et.al. |
2506.03703 |
null |
2025-06-04 |
AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism |
Zhepei Wei et.al. |
2506.03700 |
null |
2025-06-04 |
Scaling Transformers for Discriminative Recommendation via Generative Pretraining |
Chunqi Wang et.al. |
2506.03699 |
null |
2025-06-04 |
Robust Preference Optimization via Dynamic Target Margins |
Jie Sun et.al. |
2506.03690 |
null |
2025-06-04 |
Out-of-Distribution Graph Models Merging |
Yidi Wang et.al. |
2506.03674 |
null |
2025-06-04 |
Trustworthy Medical Question Answering: An Evaluation-Centric Survey |
Yinuo Wang et.al. |
2506.03659 |
null |
2025-06-04 |
Client-Side Zero-Shot LLM Inference for Comprehensive In-Browser URL Analysis |
Avihay Cohen et.al. |
2506.03656 |
null |
2025-06-04 |
Facts are Harder Than Opinions – A Multilingual, Comparative Analysis of LLM-Based Fact-Checking Reliability |
Lorraine Saju et.al. |
2506.03655 |
null |
2025-06-04 |
RewardAnything: Generalizable Principle-Following Reward Models |
Zhuohao Yu et.al. |
2506.03637 |
null |
2025-06-04 |
Robustness of Prompting: Enhancing Robustness of Large Language Models Against Prompting Attacks |
Lin Mu et.al. |
2506.03627 |
null |
2025-06-04 |
Do Large Language Models Know Folktales? A Case Study of Yokai in Japanese Folktales |
Ayuto Tsutsumi et.al. |
2506.03619 |
null |
2025-06-04 |
Learning to Insert [PAUSE] Tokens for Better Reasoning |
Eunki Kim et.al. |
2506.03616 |
null |
2025-06-04 |
Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games |
Dongmin Park et.al. |
2506.03610 |
null |
2025-06-04 |
Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision |
Tomoya Yoshida et.al. |
2506.03605 |
null |
2025-06-04 |
Auto prompt sql: a resource-efficient architecture for text-to-sql translation in constrained environments |
Zetong Tang et.al. |
2506.03598 |
null |
2025-06-04 |
Resolving Task Objective Conflicts in Unified Multimodal Understanding and Generation via Task-Aware Mixture-of-Experts |
Jiaxing Zhang et.al. |
2506.03591 |
null |
2025-06-04 |
Preface to the Special Issue of the TAL Journal on Scholarly Document Processing |
Florian Boudin et.al. |
2506.03587 |
null |
2025-06-04 |
Improving LLM-Based Fault Localization with External Memory and Project Context |
Inseok Yeo et.al. |
2506.03585 |
null |
2025-06-04 |
Exchange of Perspective Prompting Enhances Reasoning in Large Language Models |
Lin Sun et.al. |
2506.03573 |
null |
2025-06-04 |
FreePRM: Training Process Reward Models Without Ground Truth Process Labels |
Lin Sun et.al. |
2506.03570 |
null |
2025-06-04 |
POSS: Position Specialist Generates Better Draft for Speculative Decoding |
Langlin Huang et.al. |
2506.03566 |
null |
2025-06-04 |
ConsistentChat: Building Skeleton-Guided Consistent Dialogues for Large Language Models from Scratch |
Jiawei Chen et.al. |
2506.03558 |
null |
2025-06-04 |
BPO: Revisiting Preference Modeling in Direct Preference Optimization |
Lin Sun et.al. |
2506.03557 |
null |
2025-06-04 |
From Virtual Agents to Robot Teams: A Multi-Robot Framework Evaluation in High-Stakes Healthcare Context |
Yuanchen Bai et.al. |
2506.03546 |
null |
2025-06-03 |
Entity-Augmented Neuroscience Knowledge Retrieval Using Ontology and Semantic Understanding Capability of LLM |
Pralaypati Ta et.al. |
2506.03145 |
null |
2025-06-03 |
Not All Tokens Are Meant to Be Forgotten |
Xiangyu Zhou et.al. |
2506.03142 |
null |
2025-06-03 |
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation |
Siqi Chen et.al. |
2506.03139 |
null |
2025-06-03 |
Native-Resolution Image Synthesis |
Zidong Wang et.al. |
2506.03131 |
null |
2025-06-03 |
AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation |
Lu Qiu et.al. |
2506.03126 |
null |
2025-06-03 |
AUTOCIRCUIT-RL: Reinforcement Learning-Driven LLM for Automated Circuit Topology Generation |
Prashanth Vijayaraghavan et.al. |
2506.03122 |
null |
2025-06-03 |
Targeted Forgetting of Image Subgroups in CLIP Models |
Zeliang Zhang et.al. |
2506.03117 |
null |
2025-06-03 |
Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback |
Xiaoying Zhang et.al. |
2506.03106 |
null |
2025-06-03 |
TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models |
Chetwin Low et.al. |
2506.03099 |
null |
2025-06-03 |
SG2VID: Scene Graphs Enable Fine-Grained Control for Video Synthesis |
Ssharvien Kumar Sivakumar et.al. |
2506.03082 |
null |
2025-06-03 |
ORV: 4D Occupancy-centric Robot Video Generation |
Xiuyu Yang et.al. |
2506.03079 |
null |
2025-06-03 |
EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models |
Mingzhe Li et.al. |
2506.03067 |
null |
2025-06-03 |
Corrigibility as a Singular Target: A Vision for Inherently Reliable Foundation Models |
Ram Potham et.al. |
2506.03056 |
null |
2025-06-03 |
Facts Do Care About Your Language: Assessing Answer Quality of Multilingual LLMs |
Yuval Kansal et.al. |
2506.03051 |
null |
2025-06-03 |
Sample complexity of Schrödinger potential estimation |
Nikita Puchkin et.al. |
2506.03043 |
null |
2025-06-03 |
Towards Analyzing and Understanding the Limitations of VAPO: A Theoretical Perspective |
Jintian Shao et.al. |
2506.03038 |
null |
2025-06-03 |
Leveraging Information Retrieval to Enhance Spoken Language Understanding Prompts in Few-Shot Learning |
Pierre Lepagnol et.al. |
2506.03035 |
null |
2025-06-03 |
TestAgent: An Adaptive and Intelligent Expert for Human Assessment |
Junhao Yu et.al. |
2506.03032 |
null |
2025-06-03 |
GenFair: Systematic Test Generation for Fairness Fault Detection in Large Language Models |
Madhusudan Srinivasan et.al. |
2506.03024 |
null |
2025-06-03 |
Conditioning Large Language Models on Legal Systems? Detecting Punishable Hate Speech |
Florian Ludwig et.al. |
2506.03009 |
null |
2025-06-03 |
DFBench: Benchmarking Deepfake Image Detection Capability of Large Multimodal Models |
Jiarui Wang et.al. |
2506.03007 |
null |
2025-06-03 |
A Preference-Driven Methodology for High-Quality Solidity Code Generation |
Zhiyuan Peng et.al. |
2506.03006 |
null |
2025-06-03 |
Linear Spatial World Models Emerge in Large Language Models |
Matthieu Tehenan et.al. |
2506.02996 |
null |
2025-06-03 |
It’s Not a Walk in the Park! Challenges of Idiom Translation in Speech-to-text Systems |
Iuliia Zaitova et.al. |
2506.02995 |
null |
2025-06-03 |
Mitigating Manipulation and Enhancing Persuasion: A Reflective Multi-Agent Approach for Legal Argument Generation |
Li Zhang et.al. |
2506.02992 |
null |
2025-06-03 |
Performance of leading large language models in May 2025 in Membership of the Royal College of General Practitioners-style examination questions: a cross-sectional analysis |
Richard Armitage et.al. |
2506.02987 |
null |
2025-06-03 |
Astrophotography turbulence mitigation via generative models |
Joonyeoup Kim et.al. |
2506.02981 |
null |
2025-06-03 |
On the Robustness of Tabular Foundation Models: Test-Time Attacks and In-Context Defenses |
Mohamed Djilani et.al. |
2506.02978 |
null |
2025-06-03 |
Expanding before Inferring: Enhancing Factuality in Large Language Models through Premature Layers Interpolation |
Dingwei Chen et.al. |
2506.02973 |
null |
2025-06-04 |
PC-MoE: Memory-Efficient and Privacy-Preserving Collaborative Training for Mixture-of-Experts LLMs |
Ze Yu Zhang et.al. |
2506.02965 |
null |
2025-06-03 |
FORLA: Federated Object-centric Representation Learning with Slot Attention |
Guiqiu Liao et.al. |
2506.02964 |
null |
2025-06-03 |
FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models |
Yan Gao et.al. |
2506.02961 |
null |
2025-06-03 |
HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring |
Zhixiong Su et.al. |
2506.02959 |
null |
2025-06-03 |
UniConFlow: A Unified Constrained Generalization Framework for Certified Motion Planning with Flow Matching Models |
Zewen Yang et.al. |
2506.02955 |
null |
2025-06-03 |
Towards More Effective Fault Detection in LLM-Based Unit Test Generation |
Guancheng Wang et.al. |
2506.02954 |
null |
2025-06-03 |
Adaptive Graph Pruning for Multi-Agent Communication |
Boyi Li et.al. |
2506.02951 |
null |
2025-06-03 |
Quantitative LLM Judges |
Aishwarya Sahoo et.al. |
2506.02945 |
null |
2025-06-03 |
A Multi-agent LLM-based JUnit Test Generation with Strong Oracles |
Qinghua Xu et.al. |
2506.02943 |
null |
2025-06-03 |
Memory-Efficient Split Federated Learning for LLM Fine-Tuning on Heterogeneous Mobile Devices |
Xiaopei Chen et.al. |
2506.02940 |
null |
2025-06-03 |
Elasticity of substitution and general model of economic growth |
Constantin Chilarescu et.al. |
2506.02936 |
null |
2025-06-03 |
Large Processor Chip Model |
Kaiyan Chang et.al. |
2506.02929 |
null |
2025-06-03 |
INESC-ID @ eRisk 2025: Exploring Fine-Tuned, Similarity-Based, and Prompt-Based Approaches to Depression Symptom Identification |
Diogo A. P. Nunes et.al. |
2506.02924 |
null |
2025-06-03 |
Sample, Predict, then Proceed: Self-Verification Sampling for Tool Use of LLMs |
Shangmin Guo et.al. |
2506.02918 |
null |
2025-06-03 |
Towards Auto-Annotation from Annotation Guidelines: A Benchmark through 3D LiDAR Detection |
Yechi Ma et.al. |
2506.02914 |
null |
2025-06-03 |
Cell-o1: Training LLMs to Solve Single-Cell Reasoning Puzzles with Reinforcement Learning |
Yin Fang et.al. |
2506.02911 |
null |
2025-06-03 |
Diffusion Buffer: Online Diffusion-based Speech Enhancement with Sub-Second Latency |
Bunlong Lay et.al. |
2506.02908 |
null |
2025-06-03 |
Scaling Fine-Grained MoE Beyond 50B Parameters: Empirical Evaluation and Practical Insights |
Jakub Krajewski et.al. |
2506.02890 |
null |
2025-06-03 |
CoT is Not True Reasoning, It Is Just a Tight Constraint to Imitate: A Theory Perspective |
Jintian Shao et.al. |
2506.02878 |
null |
2025-06-03 |
It’s the Thought that Counts: Evaluating the Attempts of Frontier LLMs to Persuade on Harmful Topics |
Matthew Kowal et.al. |
2506.02873 |
null |
2025-06-03 |
Token and Span Classification for Entity Recognition in French Historical Encyclopedias |
Ludovic Moncla et.al. |
2506.02872 |
null |
2025-06-03 |
Pan-Arctic Permafrost Landform and Human-built Infrastructure Feature Detection with Vision Transformers and Location Embeddings |
Amal S. Perera et.al. |
2506.02868 |
null |
2025-06-03 |
BNPO: Beta Normalization Policy Optimization |
Changyi Xiao et.al. |
2506.02864 |
null |
2025-06-03 |
Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs |
Wenjing Tang et.al. |
2506.02860 |
null |
2025-06-03 |
ATAG: AI-Agent Application Threat Assessment with Attack Graphs |
Parth Atulbhai Gandhi et.al. |
2506.02859 |
null |
2025-06-03 |
Enhancing Abnormality Identification: Robust Out-of-Distribution Strategies for Deepfake Detection |
Luca Maiano et.al. |
2506.02857 |
null |
2025-06-03 |
METok: Multi-Stage Event-based Token Compression for Efficient Long Video Understanding |
Mengyue Wang et.al. |
2506.02850 |
null |
2025-06-03 |
CLONE: Customizing LLMs for Efficient Latency-Aware Inference at the Edge |
Chunlin Tian et.al. |
2506.02847 |
null |
2025-06-03 |
TaxAgent: How Large Language Model Designs Fiscal Policy |
Jizhou Wang et.al. |
2506.02838 |
null |
2025-06-03 |
High-speed control and navigation for quadrupedal robots on complex and discrete terrain |
Hyeongjun Kim et.al. |
2506.02835 |
null |
2025-06-03 |
TO-GATE: Clarifying Questions and Summarizing Responses with Trajectory Optimization for Eliciting Human Preference |
Yulin Dou et.al. |
2506.02827 |
null |
2025-06-03 |
ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations |
Ekaterina Grishina et.al. |
2506.02818 |
null |
2025-06-03 |
CART-based Synthetic Tabular Data Generation for Imbalanced Regression |
António Pedro Pinheiro et.al. |
2506.02811 |
null |
2025-06-03 |
Rethinking the effects of data contamination in Code Intelligence |
Zhen Yang et.al. |
2506.02791 |
null |
2025-06-03 |
Rethinking Dynamic Networks and Heterogeneous Computing with Automatic Parallelization |
Ruilong Wu et.al. |
2506.02787 |
null |
2025-06-03 |
Reuse or Generate? Accelerating Code Editing via Edit-Oriented Speculative Decoding |
Peiding Wang et.al. |
2506.02780 |
null |
2025-06-03 |
Rethinking Machine Unlearning in Image Generation Models |
Renyang Liu et.al. |
2506.02761 |
null |
2025-06-03 |
Exploiting the English Vocabulary Profile for L2 word-level vocabulary assessment with LLMs |
Stefano Bannò et.al. |
2506.02758 |
null |
2025-06-03 |
Enriching Location Representation with Detailed Semantic Information |
Junyuan Liu et.al. |
2506.02744 |
null |
2025-06-03 |
Why do AI agents communicate in human language? |
Pengcheng Zhou et.al. |
2506.02739 |
null |
2025-06-03 |
RACE-Align: Retrieval-Augmented and Chain-of-Thought Enhanced Preference Alignment for Large Language Models |
Qihang Yan et.al. |
2506.02726 |
null |
2025-06-03 |
Benchmarking and Advancing Large Language Models for Local Life Services |
Xiaochong Lan et.al. |
2506.02720 |
null |
2025-06-03 |
Expansion-contraction duality breaking in a Planck-scale sensitive cosmological quantum simulator |
S. Mahesh Chandran et.al. |
2506.02719 |
null |
2025-06-03 |
Heterogeneous Group-Based Reinforcement Learning for LLM-based Multi-Agent Systems |
Guanzhong Chen et.al. |
2506.02718 |
null |
2025-06-03 |
Open-Set Living Need Prediction with Large Language Models |
Xiaochong Lan et.al. |
2506.02713 |
null |
2025-06-03 |
Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences |
Yunhong Lu et.al. |
2506.02698 |
null |
2025-06-03 |
Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations |
Jinyuan Luo et.al. |
2506.02696 |
null |
2025-06-03 |
Large-scale Self-supervised Video Foundation Model for Intelligent Surgery |
Shu Yang et.al. |
2506.02692 |
null |
2025-06-04 |
MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching |
Liang Yue et.al. |
2506.02689 |
null |
2025-06-03 |
Decompose, Plan in Parallel, and Merge: A Novel Paradigm for Large Language Models based Planning with Multiple Constraints |
Zhengdong Lu et.al. |
2506.02683 |
null |
2025-06-03 |
Solving Inverse Problems with FLAIR |
Julius Erbach et.al. |
2506.02680 |
null |
2025-06-03 |
TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression |
Zhong-Zhi Li et.al. |
2506.02678 |
null |
2025-06-03 |
EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving |
Shihan Dou et.al. |
2506.02672 |
null |
2025-06-03 |
Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMs |
Manon Reusens et.al. |
2506.02659 |
null |
2025-06-04 |
Computational Thinking Reasoning in Large Language Models |
Kechi Zhang et.al. |
2506.02658 |
null |
2025-06-03 |
From Prompts to Protection: Large Language Model-Enabled In-Context Learning for Smart Public Safety UAV |
Yousef Emami et.al. |
2506.02649 |
null |
2025-06-03 |
Truly Assessing Fluid Intelligence of Large Language Models through Dynamic Reasoning Evaluation |
Yue Yang et.al. |
2506.02648 |
null |
2025-06-03 |
KVCache Cache in the Wild: Characterizing and Optimizing KVCache Cache at a Large Cloud Provider |
Jiahao Wang et.al. |
2506.02634 |
null |
2025-06-03 |
HAM: A Hyperbolic Step to Regulate Implicit Bias |
Tom Jacobs et.al. |
2506.02630 |
null |
2025-06-03 |
Hyperspectral Image Generation with Unmixing Guided Diffusion Model |
Shiyu Shen et.al. |
2506.02601 |
null |
2025-06-03 |
EssayBench: Evaluating Large Language Models in Multi-Genre Chinese Essay Writing |
Fan Gao et.al. |
2506.02596 |
null |
2025-06-03 |
EALG: Evolutionary Adversarial Generation of Language Model-Guided Generators for Combinatorial Optimization |
Ruibo Duan et.al. |
2506.02594 |
null |
2025-06-03 |
Beyond the Surface: Measuring Self-Preference in LLM Judgments |
Zhi-Yuan Chen et.al. |
2506.02592 |
null |
2025-06-03 |
On Generalization across Measurement Systems: LLMs Entail More Test-Time Compute for Underrepresented Cultures |
Minh Duc Bui et.al. |
2506.02591 |
null |
2025-06-03 |
Evaluating Named Entity Recognition Models for Russian Cultural News Texts: From BERT to LLM |
Maria Levchenko et.al. |
2506.02589 |
null |
2025-06-03 |
IndoSafety: Culturally Grounded Safety for LLMs in Indonesian Languages |
Muhammad Falensi Azmi et.al. |
2506.02573 |
null |
2025-06-03 |
HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference |
Ping Gong et.al. |
2506.02572 |
null |
2025-06-03 |
MLaGA: Multimodal Large Language and Graph Assistant |
Dongzhe Fan et.al. |
2506.02568 |
null |
2025-06-03 |
Pruning General Large Language Models into Customized Expert Models |
Yirao Zhao et.al. |
2506.02561 |
null |
2025-06-03 |
Kernel-based Unsupervised Embedding Alignment for Enhanced Visual Representation in Vision-language Models |
Shizhan Gong et.al. |
2506.02557 |
null |
2025-06-03 |
SurgVLM: A Large Vision-Language Model and Systematic Evaluation Benchmark for Surgical Intelligence |
Zhitao Zeng et.al. |
2506.02555 |
null |
2025-05-30 |
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL |
Yu Zhang et.al. |
2505.24875 |
null |
2025-05-30 |
The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models |
Adam Stein et.al. |
2505.24874 |
null |
2025-05-30 |
MiniMax-Remover: Taming Bad Noise Helps Video Object Removal |
Bojia Zi et.al. |
2505.24873 |
null |
2025-05-30 |
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning |
Yiqing Liang et.al. |
2505.24871 |
null |
2025-05-30 |
GenSpace: Benchmarking Spatially-Aware Image Generation |
Zehan Wang et.al. |
2505.24870 |
null |
2025-05-30 |
SiLVR: A Simple Language-based Video Reasoning Framework |
Ce Zhang et.al. |
2505.24869 |
null |
2025-05-30 |
TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection |
Xinqi Xiong et.al. |
2505.24866 |
null |
2025-05-30 |
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models |
Mingjie Liu et.al. |
2505.24864 |
null |
2025-05-30 |
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization |
Cailin Zhuang et.al. |
2505.24862 |
null |
2025-05-30 |
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning |
Jingyan Shen et.al. |
2505.24846 |
null |
2025-05-30 |
Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning |
Wanyun Xie et.al. |
2505.24844 |
null |
2025-05-30 |
Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck |
Yuwen Tan et.al. |
2505.24840 |
null |
2025-05-30 |
VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software |
Brandon Man et.al. |
2505.24838 |
null |
2025-05-30 |
Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs |
Juraj Vladika et.al. |
2505.24830 |
null |
2025-05-30 |
LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text |
Li Yunhan et.al. |
2505.24826 |
null |
2025-05-30 |
PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models |
Yinggan Xu et.al. |
2505.24823 |
null |
2025-05-30 |
Bi-Manual Joint Camera Calibration and Scene Representation |
Haozhan Tang et.al. |
2505.24819 |
null |
2025-06-02 |
Guiding Generative Storytelling with Knowledge Graphs |
Zhijun Pan et.al. |
2505.24803 |
null |
2025-05-30 |
Inference Acceleration of Autoregressive Normalizing Flows by Selective Jacobi Decoding |
Jiaru Zhang et.al. |
2505.24791 |
null |
2025-05-30 |
Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation |
Yucheng Zhou et.al. |
2505.24787 |
null |
2025-05-30 |
AXIOM: Learning to Play Games in Minutes with Expanding Object-Centric Models |
Conor Heins et.al. |
2505.24784 |
null |
2025-06-03 |
EVA-MILP: Towards Standardized Evaluation of MILP Instance Generation |
Yidong Luo et.al. |
2505.24779 |
null |
2025-05-30 |
Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models’ Uncertainty? |
Jiayu Liu et.al. |
2505.24778 |
null |
2025-05-30 |
Diffusion-Based Symbolic Regression |
Zachary Bastiani et.al. |
2505.24776 |
null |
2025-05-30 |
AFLoRA: Adaptive Federated Fine-Tuning of Large Language Models with Resource-Aware Low-Rank Adaption |
Yajie Zhou et.al. |
2505.24773 |
null |
2025-05-30 |
Generalization Dynamics of Linear Diffusion Models |
Claudia Merger et.al. |
2505.24769 |
null |
2025-05-30 |
From Macro to Micro: Probing Dataset Diversity in Language Model Fine-Tuning |
Haoyu Li et.al. |
2505.24768 |
null |
2025-05-30 |
A survey of using EHR as real-world evidence for discovering and validating new drug indications |
Nabasmita Talukdar et.al. |
2505.24767 |
null |
2025-05-30 |
LGAR: Zero-Shot LLM-Guided Neural Ranking for Abstract Screening in Systematic Literature Reviews |
Christian Jaumann et.al. |
2505.24757 |
null |
2025-05-30 |
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training |
Yehonathan Refael et.al. |
2505.24749 |
null |
2025-05-30 |
DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds |
Jiaxu Zhang et.al. |
2505.24733 |
null |
2025-05-30 |
Circuit Stability Characterizes Language Model Generalization |
Alan Sun et.al. |
2505.24731 |
null |
2025-05-30 |
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning |
Shelly Bensal et.al. |
2505.24726 |
null |
2025-05-30 |
HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts |
Neil He et.al. |
2505.24722 |
null |
2025-06-03 |
Reinforcing Video Reasoning with Focused Thinking |
Jisheng Dang et.al. |
2505.24718 |
null |
2025-05-30 |
PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations |
Benjamin Holzschuh et.al. |
2505.24717 |
null |
2025-05-30 |
Towards Scalable Schema Mapping using Large Language Models |
Christopher Buss et.al. |
2505.24716 |
null |
2025-05-30 |
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation |
Junyu Luo et.al. |
2505.24714 |
null |
2025-05-30 |
HESEIA: A community-based dataset for evaluating social biases in large language models, co-designed in real school settings in Latin America |
Guido Ivetta et.al. |
2505.24712 |
null |
2025-05-30 |
Causal-aware Large Language Models: Enhancing Decision-Making Through Learning, Adapting and Acting |
Wei Chen et.al. |
2505.24710 |
null |
2025-05-30 |
Multi-Domain ABSA Conversation Dataset Generation via LLMs for Real-World Evaluation and Model Comparison |
Tejul Pandit et.al. |
2505.24701 |
null |
2025-05-30 |
Conformal Prediction for Zero-Shot Models |
Julio Silva-Rodríguez et.al. |
2505.24693 |
null |
2025-05-30 |
BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization |
Sander Land et.al. |
2505.24689 |
null |
2025-05-30 |
Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration |
Qinglin Zhu et.al. |
2505.24688 |
null |
2025-05-30 |
A Simple Linear Patch Revives Layer-Pruned Large Language Models |
Xinrui Chen et.al. |
2505.24680 |
null |
2025-05-30 |
TRIDENT: Enhancing Large Language Model Safety with Tri-Dimensional Diversified Red-Teaming Data Synthesis |
Xiaorui Wu et.al. |
2505.24672 |
null |
2025-05-30 |
Multiple LLM Agents Debate for Equitable Cultural Alignment |
Dayeon Ki et.al. |
2505.24671 |
null |
2025-05-30 |
Can LLMs and humans be friends? Uncovering factors affecting human-AI intimacy formation |
Yeseon Hong et.al. |
2505.24658 |
null |
2025-05-30 |
Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models |
Frederike Lübeck et.al. |
2505.24655 |
null |
2025-05-30 |
Efficient Text Encoders for Labor Market Analysis |
Jens-Joris Decorte et.al. |
2505.24640 |
null |
2025-05-30 |
Disentangling Language and Culture for Evaluating Multilingual Large Language Models |
Jiahao Ying et.al. |
2505.24635 |
null |
2025-05-30 |
The Hallucination Dilemma: Factuality-Aware Reinforcement Learning for Large Reasoning Models |
Junyi Li et.al. |
2505.24630 |
null |
2025-05-30 |
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors |
Duo Zheng et.al. |
2505.24625 |
null |
2025-05-30 |
Random Rule Forest (RRF): Interpretable Ensembles of LLM-Generated Questions for Predicting Startup Success |
Ben Griffin et.al. |
2505.24622 |
null |
2025-05-30 |
Benchmarking Large Language Models for Cryptanalysis and Mismatched-Generalization |
Utsav Maskey et.al. |
2505.24621 |
null |
2025-05-30 |
Eye of Judgement: Dissecting the Evaluation of Russian-speaking LLMs with POLLUX |
Nikita Martynov et.al. |
2505.24616 |
null |
2025-05-30 |
Harnessing Large Language Models for Scientific Novelty Detection |
Yan Liu et.al. |
2505.24615 |
null |
2025-05-30 |
Mixture-of-Experts for Personalized and Semantic-Aware Next Location Prediction |
Shuai Liu et.al. |
2505.24597 |
null |
2025-05-30 |
A Composite Predictive-Generative Approach to Monaural Universal Speech Enhancement |
Jie Zhang et.al. |
2505.24576 |
null |
2025-05-30 |
Bench4KE: Benchmarking Automated Competency Question Generation |
Anna Sofia Lippolis et.al. |
2505.24554 |
null |
2025-05-30 |
CREFT: Sequential Multi-Agent LLM for Character Relation Extraction |
Ye Eun Chun et.al. |
2505.24553 |
null |
2025-05-30 |
Cross-Attention Speculative Decoding |
Wei Zhong et.al. |
2505.24544 |
null |
2025-05-30 |
Mixpert: Mitigating Multimodal Learning Conflicts with Efficient Mixture-of-Vision-Experts |
Xin He et.al. |
2505.24541 |
null |
2025-06-03 |
Localizing Persona Representations in LLMs |
Celia Cintas et.al. |
2505.24539 |
null |
2025-05-30 |
Don’t Erase, Inform! Detecting and Contextualizing Harmful Language in Cultural Heritage Collections |
Orfeas Menis Mastromichalakis et.al. |
2505.24538 |
null |
2025-05-30 |
Beyond Linear Steering: Unified Multi-Attribute Control for Language Models |
Narmeen Oozeer et.al. |
2505.24535 |
null |
2025-05-30 |
Transformers Are Universally Consistent |
Sagar Ghosh et.al. |
2505.24531 |
null |
2025-05-30 |
Geospatial Foundation Models to Enable Progress on Sustainable Development Goals |
Pedram Ghamisi et.al. |
2505.24528 |
null |
2025-05-30 |
Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors |
Andrea Pedrotti et.al. |
2505.24523 |
null |
2025-05-30 |
UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation |
Yang-Tian Sun et.al. |
2505.24521 |
null |
2025-05-30 |
un$^2$CLIP: Improving CLIP’s Visual Detail Capturing Ability via Inverting unCLIP |
Yinqi Li et.al. |
2505.24517 |
null |
2025-05-30 |
TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs’ Social Intelligence |
Guiyang Hou et.al. |
2505.24500 |
null |
2025-05-30 |
Reason-SVG: Hybrid Reward RL for Aha-Moments in Vector Graphics Generation |
Ximing Xing et.al. |
2505.24499 |
null |
2025-05-30 |
MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge |
Xin Jing et.al. |
2505.24493 |
null |
2025-05-30 |
Object Centric Concept Bottlenecks |
David Steinmann et.al. |
2505.24492 |
null |
2025-05-30 |
Leveraging Knowledge Graphs and LLMs for Structured Generation of Misinformation |
Sania Nayab et.al. |
2505.24479 |
null |
2025-05-30 |
Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning |
Vasilije Markovic et.al. |
2505.24478 |
null |
2025-05-30 |
Period-LLM: Extending the Periodic Capability of Multimodal Large Language Model |
Yuting Zhang et.al. |
2505.24476 |
null |
2025-05-30 |
SA-Person: Text-Based Person Retrieval with Scene-aware Re-ranking |
Yingjia Xu et.al. |
2505.24466 |
null |
2025-05-30 |
SEAR: A Multimodal Dataset for Analyzing AR-LLM-Driven Social Engineering Behaviors |
Tianlong Yu et.al. |
2505.24458 |
null |
2025-05-30 |
LPASS: Linear Probes as Stepping Stones for vulnerability detection using compressed LLMs |
Luis Ibanez-Lissen et.al. |
2505.24451 |
null |
2025-05-30 |
Exploring the Impact of Occupational Personas on Domain-Specific QA |
Eojin Kang et.al. |
2505.24448 |
null |
2025-05-30 |
Learning Safety Constraints for Large Language Models |
Xin Chen et.al. |
2505.24445 |
null |
2025-05-30 |
RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation |
Zhentao Xie et.al. |
2505.24442 |
null |
2025-05-30 |
SORCE: Small Object Retrieval in Complex Environments |
Chunxu Liu et.al. |
2505.24441 |
null |
2025-05-30 |
Model Unlearning via Sparse Autoencoder Subspace Guided Projections |
Xu Wang et.al. |
2505.24428 |
null |
2025-05-30 |
MMAFFBen: A Multilingual and Multimodal Affective Analysis Benchmark for Evaluating LLMs and VLMs |
Zhiwei Liu et.al. |
2505.24423 |
null |
2025-05-30 |
LLMs Are Globally Multilingual Yet Locally Monolingual: Exploring Knowledge Transfer via Language and Thought Theory |
Eojin Kang et.al. |
2505.24409 |
null |
2025-05-30 |
IRBridge: Solving Image Restoration Bridge with Pre-trained Generative Diffusion Models |
Hanting Wang et.al. |
2505.24406 |
null |
2025-05-30 |
ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented Generation |
Hao Chen et.al. |
2505.24388 |
null |
2025-05-30 |
Breaking the Gold Standard: Extracting Forgotten Data under Exact Unlearning in Large Language Models |
Xiaoyu Wu et.al. |
2505.24379 |
null |
2025-05-30 |
LLM Inference Enhanced by External Knowledge: A Survey |
Yu-Hsuan Lin et.al. |
2505.24377 |
null |
2025-05-30 |
Grid-LOGAT: Grid Based Local and Global Area Transcription for Video Question Answering |
Md Intisar Chowdhury et.al. |
2505.24371 |
null |
2025-05-30 |
ReCalKV: Low-Rank KV Cache Compression via Head Reordering and Offline Calibration |
Xianglong Yan et.al. |
2505.24357 |
null |
2025-05-30 |
Multilingual Gloss-free Sign Language Translation: Towards Building a Sign Language Foundation Model |
Sihan Tan et.al. |
2505.24355 |
null |
2025-05-30 |
Unifying Language Agent Algorithms with Graph-based Orchestration Engine for Reproducible Agent Research |
Qianqian Zhang et.al. |
2505.24354 |
null |
2025-05-30 |
Exploring Multimodal Challenges in Toxic Chinese Detection: Taxonomy, Benchmark, and Findings |
Shujian Yang et.al. |
2505.24341 |
null |
2025-05-30 |
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models |
Gilles Quentin Hacheme et.al. |
2505.24340 |
null |
2025-05-30 |
Pangu DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning |
Wenxuan Shi et.al. |
2505.24332 |
null |
2025-05-30 |
DisTime: Distribution-based Time Representation for Video Large Language Models |
Yingsen Zeng et.al. |
2505.24329 |
null |
2025-05-29 |
Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought |
Yunze Man et.al. |
2505.23766 |
null |
2025-05-29 |
From Chat Logs to Collective Insights: Aggregative Question Answering |
Wentao Zhang et.al. |
2505.23765 |
null |
2025-05-29 |
MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence |
Sihan Yang et.al. |
2505.23764 |
null |
2025-05-29 |
DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning |
Ziyin Zhang et.al. |
2505.23754 |
link |
2025-05-29 |
ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks |
Akashah Shabbir et.al. |
2505.23752 |
link |
2025-05-29 |
Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences? |
Paul Gölz et.al. |
2505.23749 |
null |
2025-05-29 |
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence |
Diankun Wu et.al. |
2505.23747 |
null |
2025-05-29 |
MAGREF: Masked Guidance for Any-Reference Video Generation |
Yufan Deng et.al. |
2505.23742 |
link |
2025-05-29 |
Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time |
Mohamad Chehade et.al. |
2505.23729 |
null |
2025-05-29 |
PixelThink: Towards Efficient Chain-of-Pixel Reasoning |
Song Wang et.al. |
2505.23727 |
null |
2025-05-29 |
FMG-Det: Foundation Model Guided Robust Object Detection |
Darryl Hannan et.al. |
2505.23726 |
null |
2025-05-29 |
MuLoCo: Muon is a practical inner optimizer for DiLoCo |
Benjamin Thérien et.al. |
2505.23725 |
null |
2025-05-29 |
SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA |
Minrui Luo et.al. |
2505.23724 |
null |
2025-05-29 |
ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering |
Zexi Liu et.al. |
2505.23723 |
link |
2025-05-29 |
Label-Guided In-Context Learning for Named Entity Recognition |
Fan Bai et.al. |
2505.23722 |
link |
2025-05-29 |
TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning |
Andreas Auer et.al. |
2505.23719 |
link |
2025-05-29 |
Don’t Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models |
Jinzhe Li et.al. |
2505.23715 |
link |
2025-05-29 |
SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models |
Zixiang Xu et.al. |
2505.23713 |
link |
2025-05-29 |
Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation |
Ziling Cheng et.al. |
2505.23701 |
null |
2025-05-29 |
Fortune: Formula-Driven Reinforcement Learning for Symbolic Table Reasoning in Language Models |
Lang Cao et.al. |
2505.23667 |
null |
2025-05-29 |
LoLA: Low-Rank Linear Attention With Sparse Caching |
Luke McDermott et.al. |
2505.23666 |
null |
2025-05-29 |
ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions |
Beong-woo Kwak et.al. |
2505.23662 |
null |
2025-05-29 |
OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation |
Size Wu et.al. |
2505.23661 |
link |
2025-05-29 |
D-AR: Diffusion via Autoregressive Models |
Ziteng Gao et.al. |
2505.23660 |
link |
2025-05-29 |
Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation |
Hongxiang Zhang et.al. |
2505.23657 |
null |
2025-05-29 |
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models |
Xiangdong Zhang et.al. |
2505.23656 |
null |
2025-05-29 |
ARC: Argument Representation and Coverage Analysis for Zero-Shot Long Document Summarization with Instruction Following LLMs |
Mohamed Elaraby et.al. |
2505.23654 |
null |
2025-05-29 |
How does Transformer Learn Implicit Reasoning? |
Jiaran Ye et.al. |
2505.23653 |
link |
2025-05-29 |
Optimization-Free Diffusion Model – A Perturbation Theory Approach |
Yuehaw Khoo et.al. |
2505.23652 |
null |
2025-05-29 |
MCP Safety Training: Learning to Refuse Falsely Benign MCP Exploits using Improved Preference Alignment |
John Halloran et.al. |
2505.23634 |
null |
2025-05-29 |
AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora |
Jiaxin Bai et.al. |
2505.23628 |
null |
2025-05-29 |
ZeroSep: Separate Anything in Audio with Zero Training |
Chao Huang et.al. |
2505.23625 |
null |
2025-05-29 |
Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes |
Neta Glazer et.al. |
2505.23619 |
null |
2025-05-29 |
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model |
Qingyu Shi et.al. |
2505.23606 |
link |
2025-05-29 |
A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis |
Shengyuan Liu et.al. |
2505.23601 |
null |
2025-05-29 |
LLM Performance for Code Generation on Noisy Tasks |
Radzim Sendyka et.al. |
2505.23598 |
null |
2025-05-29 |
MAPLE: A Mobile Assistant with Persistent Finite State Machines for Recovery Reasoning |
Linqiang Guo et.al. |
2505.23596 |
null |
2025-05-29 |
Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles |
Zifu Wang et.al. |
2505.23590 |
link |
2025-05-29 |
On-Policy RL with Optimal Reward Baseline |
Yaru Hao et.al. |
2505.23585 |
link |
2025-05-29 |
BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model |
Adibvafa Fallahpour et.al. |
2505.23579 |
null |
2025-05-29 |
Cognitive Guardrails for Open-World Decision Making in Autonomous Drone Swarms |
Jane Cleland-Huang et.al. |
2505.23576 |
null |
2025-05-29 |
Evaluating AI capabilities in detecting conspiracy theories on YouTube |
Leonardo La Rocca et.al. |
2505.23570 |
link |
2025-05-29 |
Maximum Likelihood Learning of Latent Dynamics Without Reconstruction |
Samo Hromadka et.al. |
2505.23569 |
null |
2025-05-29 |
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models |
Yiran Guo et.al. |
2505.23564 |
link |
2025-05-29 |
Merge Hijacking: Backdoor Attacks to Model Merging of Large Language Models |
Zenghui Yuan et.al. |
2505.23561 |
null |
2025-05-29 |
SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents |
Kunlun Zhu et.al. |
2505.23559 |
link |
2025-05-29 |
Adaptive Federated LoRA in Heterogeneous Wireless Networks with Independent Sampling |
Yanzhao Hou et.al. |
2505.23555 |
null |
2025-05-29 |
Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters |
Hayden Moore et.al. |
2505.23554 |
null |
2025-05-29 |
LLM-based Property-based Test Generation for Guardrailing Cyber-Physical Systems |
Khashayar Etemadi et.al. |
2505.23549 |
null |
2025-05-29 |
Translation in the Wild |
Yuri Balashov et.al. |
2505.23548 |
null |
2025-05-29 |
Position Paper: Metadata Enrichment Model: Integrating Neural Networks and Semantic Knowledge Graphs for Cultural Heritage Applications |
Jan Ignatowicz et.al. |
2505.23543 |
null |
2025-05-29 |
Probability-Consistent Preference Optimization for Enhanced LLM Reasoning |
Yunqiao Yang et.al. |
2505.23540 |
link |
2025-05-29 |
Domain-Aware Tensor Network Structure Search |
Giorgos Iacovides et.al. |
2505.23537 |
null |
2025-05-29 |
AnchorAttention: Difference-Aware Sparse Attention with Stripe Granularity |
Yu Zhang et.al. |
2505.23520 |
link |
2025-05-29 |
DeepFilterGAN: A Full-band Real-time Speech Enhancement System with GAN-based Stochastic Regeneration |
Sanberk Serbest et.al. |
2505.23515 |
null |
2025-05-29 |
VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning |
Liyun Zhu et.al. |
2505.23504 |
link |
2025-05-29 |
Can Large Language Models Challenge CNNs in Medical Image Analysis? |
Shibbir Ahmed et.al. |
2505.23503 |
null |
2025-05-29 |
Identity resolution of software metadata using Large Language Models |
Eva Martín del Pico et.al. |
2505.23500 |
null |
2025-05-29 |
R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation |
Kaijie Chen et.al. |
2505.23493 |
null |
2025-05-29 |
Autoformalization in the Era of Large Language Models: A Survey |
Ke Weng et.al. |
2505.23486 |
null |
2025-05-29 |
VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation |
Shi-Xue Zhang et.al. |
2505.23484 |
link |
2025-05-29 |
Revisiting Overthinking in Long Chain-of-Thought from the Perspective of Self-Doubt |
Keqin Peng et.al. |
2505.23480 |
null |
2025-05-29 |
Evaluating the performance and fragility of large language models on the self-assessment for neurological surgeons |
Krithik Vishwanath et.al. |
2505.23477 |
null |
2025-05-29 |
EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions |
Xiaorui Wu et.al. |
2505.23473 |
null |
2025-05-29 |
Synthesizing Performance Constraints for Evaluating and Improving Code Efficiency |
Jun Yang et.al. |
2505.23471 |
null |
2025-05-29 |
Diffusion Guidance Is a Controllable Policy Improvement Operator |
Kevin Frans et.al. |
2505.23458 |
link |
2025-05-29 |
What About Emotions? Guiding Fine-Grained Emotion Extraction from Mobile App Reviews |
Quim Motger et.al. |
2505.23452 |
link |
2025-05-30 |
CMIE: Combining MLLM Insights with External Evidence for Explainable Out-of-Context Misinformation Detection |
Fanxiao Li et.al. |
2505.23449 |
null |
2025-05-29 |
Diversity-Aware Policy Optimization for Large Language Model Reasoning |
Jian Yao et.al. |
2505.23433 |
null |
2025-05-29 |
SWE-bench Goes Live! |
Linghao Zhang et.al. |
2505.23419 |
link |
2025-05-29 |
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction |
Jang-Hyun Kim et.al. |
2505.23416 |
link |
2025-05-29 |
Bidirectional predictive coding |
Gaspard Oliviers et.al. |
2505.23415 |
null |
2025-05-29 |
From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs |
Xuan Gong et.al. |
2505.23410 |
null |
2025-05-29 |
A Practical Guide for Supporting Formative Assessment and Feedback Using Generative AI |
Sapolnach Prompiengchai et.al. |
2505.23405 |
null |
2025-05-29 |
Adaptive Jailbreaking Strategies Based on the Semantic Understanding Capabilities of Large Language Models |
Mingyu Yu et.al. |
2505.23404 |
null |
2025-05-29 |
Bridging Geometric and Semantic Foundation Models for Generalized Monocular Depth Estimation |
Sanggyun Ma et.al. |
2505.23400 |
null |
2025-05-29 |
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization |
Mingzhe Du et.al. |
2505.23387 |
null |
2025-05-29 |
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning |
Weijia Mao et.al. |
2505.23380 |
link |
2025-05-29 |
Threading the Needle: Reweaving Chain-of-Thought Reasoning to Explain Human Label Variation |
Beiduo Chen et.al. |
2505.23368 |
link |
2025-05-29 |
Discriminative Policy Optimization for Token-Level Reward Models |
Hongzhan Chen et.al. |
2505.23363 |
link |
2025-05-29 |
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning? |
Yuanxin Liu et.al. |
2505.23359 |
link |
2025-05-29 |
Representing local protein environments with atomistic foundation models |
Meital Bojan et.al. |
2505.23354 |
null |
2025-05-29 |
Understanding the Information Propagation Effects of Communication Topologies in LLM-based Multi-Agent Systems |
Xu Shen et.al. |
2505.23352 |
link |
2025-05-29 |
Towards Reward Fairness in RLHF: From a Resource Allocation Perspective |
Sheng Ouyang et.al. |
2505.23349 |
link |
2025-05-29 |
Fine-Tuning Next-Scale Visual Autoregressive Models with Group Relative Policy Optimization |
Matteo Gallici et.al. |
2505.23331 |
null |
2025-05-29 |
Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis |
Hengyuan Cao et.al. |
2505.23325 |
null |
2025-05-29 |
Proximalized Preference Optimization for Diverse Feedback Types: A Decomposed Perspective on DPO |
Kaiyang Guo et.al. |
2505.23316 |
null |
2025-05-29 |
TRACE: Trajectory-Constrained Concept Erasure in Diffusion Models |
Finn Carter et.al. |
2505.23312 |
null |
2025-05-29 |
Towards LLM-based Generation of Human-Readable Proofs in Polynomial Formal Verification |
Rolf Drechsler et.al. |
2505.23311 |
null |
2025-05-29 |
Score-based Generative Modeling for Conditional Independence Testing |
Yixin Ren et.al. |
2505.23309 |
link |
2025-05-29 |
Data-efficient Meta-models for Evaluation of Context-based Questions and Answers in LLMs |
Julia Belikova et.al. |
2505.23299 |
null |
2025-05-29 |
EmoBench-UA: A Benchmark Dataset for Emotion Detection in Ukrainian |
Daryna Dementieva et.al. |
2505.23297 |
null |
2025-05-29 |
How Does Response Length Affect Long-Form Factuality |
James Xu Zhao et.al. |
2505.23295 |
link |
2025-05-29 |
Federated Unsupervised Semantic Segmentation |
Evangelos Charalampakis et.al. |
2505.23292 |
null |
2025-05-29 |
GenCAD-Self-Repairing: Feasibility Enhancement for 3D CAD Generation |
Chikaha Tsuji et.al. |
2505.23287 |
null |
2025-05-29 |
MathArena: Evaluating LLMs on Uncontaminated Math Competitions |
Mislav Balunović et.al. |
2505.23281 |
link |
2025-05-29 |
Sentinel: Attention Probing of Proxy Models for LLM Context Compression with an Understanding Perspective |
Yong Zhang et.al. |
2505.23277 |
link |
2025-05-29 |
The Arabic AI Fingerprint: Stylometric Analysis and Detection of Large Language Models Text |
Maged S. Al-Shaibani et.al. |
2505.23276 |
link |
2025-05-29 |
Wireless Agentic AI with Retrieval-Augmented Multimodal Semantic Perception |
Guangyuan Liu et.al. |
2505.23275 |
null |
2025-05-29 |
Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs |
Haokun Chen et.al. |
2505.23270 |
null |
2025-05-28 |
Zero-Shot Vision Encoder Grafting via LLM Surrogates |
Kaiyu Yue et.al. |
2505.22664 |
link |
2025-05-28 |
AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models |
Feng Luo et.al. |
2505.22662 |
null |
2025-05-28 |
GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning |
Qingchen Yu et.al. |
2505.22661 |
null |
2025-05-28 |
3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model |
Wenbo Hu et.al. |
2505.22657 |
null |
2025-05-28 |
Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents |
Michael Kirchhof et.al. |
2505.22655 |
null |
2025-05-28 |
The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason |
Ang Lv et.al. |
2505.22653 |
null |
2025-05-28 |
Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese |
Hanjia Lyu et.al. |
2505.22645 |
link |
2025-05-28 |
Understanding (Un)Reliability of Steering Vectors in Language Models |
Joschka Braun et.al. |
2505.22637 |
null |
2025-05-28 |
Learning Composable Chains-of-Thought |
Fangcong Yin et.al. |
2505.22635 |
null |
2025-05-28 |
Spatial Knowledge Graph-Guided Multimodal Synthesis |
Yida Xue et.al. |
2505.22633 |
null |
2025-05-28 |
Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs |
Ziling Cheng et.al. |
2505.22630 |
null |
2025-05-28 |
Principled Out-of-Distribution Generalization via Simplicity |
Jiawei Ge et.al. |
2505.22622 |
null |
2025-05-28 |
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding |
Chengyue Wu et.al. |
2505.22618 |
null |
2025-05-28 |
RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction |
Yuchi Wang et.al. |
2505.22613 |
null |
2025-05-28 |
Effective and Efficient One-pass Compression of Speech Foundation Models Using Sparsity-aware Self-pinching Gates |
Haoning Xu et.al. |
2505.22608 |
null |
2025-05-28 |
Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning |
Erxin Yu et.al. |
2505.22591 |
null |
2025-05-28 |
Precise In-Parameter Concept Erasure in Large Language Models |
Yoav Gur-Arieh et.al. |
2505.22586 |
null |
2025-05-28 |
Less, but Better: Efficient Multilingual Expansion for LLMs via Layer-wise Mixture-of-Experts |
Xue Zhang et.al. |
2505.22582 |
null |
2025-05-28 |
Fusion Steering: Prompt-Specific Activation Control |
Waldemar Chang et.al. |
2505.22572 |
null |
2025-05-29 |
Agent-UniRAG: A Trainable Open-Source LLM Agent Framework for Unified Retrieval-Augmented Generation Systems |
Hoang Pham et.al. |
2505.22571 |
null |
2025-05-28 |
Universal Visuo-Tactile Video Understanding for Embodied Interaction |
Yifan Xie et.al. |
2505.22566 |
null |
2025-05-28 |
Do Large Language Models Think Like the Brain? Sentence-Level Evidence from fMRI and Hierarchical Embeddings |
Yu Lei et.al. |
2505.22563 |
null |
2025-05-28 |
ClaimPKG: Enhancing Claim Verification via Pseudo-Subgraph Generation with Lightweight Specialized LLM |
Hoang Pham et.al. |
2505.22552 |
null |
2025-05-28 |
DES-LOC: Desynced Low Communication Adaptive Optimizers for Training Foundation Models |
Alex Iacob et.al. |
2505.22549 |
null |
2025-05-28 |
TabularQGAN: A Quantum Generative Model for Tabular Data |
Pallavi Bhardwaj et.al. |
2505.22533 |
null |
2025-05-28 |
Symplectic Generative Networks (SGNs): A Hamiltonian Framework for Invertible Deep Generative Modeling |
Agnideep Aich et.al. |
2505.22527 |
null |
2025-05-28 |
PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models |
Junwen Chen et.al. |
2505.22523 |
null |
2025-05-28 |
Multi-MLLM Knowledge Distillation for Out-of-Context News Detection |
Yimeng Gu et.al. |
2505.22517 |
null |
2025-05-28 |
EvolveSearch: An Iterative Self-Evolving Search Agent |
Dingchu Zhang et.al. |
2505.22501 |
null |
2025-05-28 |
ProSpero: Active Learning for Robust Protein Design Beyond Wild-Type Neighborhoods |
Michal Kmicikiewicz et.al. |
2505.22494 |
null |
2025-05-28 |
Understanding Adversarial Training with Energy-based Models |
Mujtaba Hussain Mirza et.al. |
2505.22486 |
null |
2025-05-29 |
Topological Structure Learning Should Be A Research Priority for LLM-Based Multi-Agent Systems |
Jiaxi Yang et.al. |
2505.22467 |
null |
2025-05-28 |
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO |
Lai Wei et.al. |
2505.22453 |
link |
2025-05-28 |
Position: All Current Generative Fidelity and Diversity Metrics are Flawed |
Ossi Räisä et.al. |
2505.22450 |
null |
2025-05-28 |
Privacy-preserving Prompt Personalization in Federated Learning for Multimodal Large Language Models |
Sizai Hou et.al. |
2505.22447 |
null |
2025-05-28 |
Does Johnny Get the Message? Evaluating Cybersecurity Notifications for Everyday Users |
Victor Jüttner et.al. |
2505.22435 |
null |
2025-05-28 |
Scaling Reasoning without Attention |
Xueliang Zhao et.al. |
2505.22425 |
null |
2025-05-28 |
Frugal Incremental Generative Modeling using Variational Autoencoders |
Victor Enescu et.al. |
2505.22408 |
null |
2025-05-28 |
Zooming from Context to Cue: Hierarchical Preference Optimization for Multi-Image MLLMs |
Xudong Li et.al. |
2505.22396 |
null |
2025-05-28 |
PacTure: Efficient PBR Texture Generation on Packed Views with Visual Autoregressive Models |
Fan Fei et.al. |
2505.22394 |
null |
2025-05-28 |
Physics-Informed Distillation of Diffusion Models for PDE-Constrained Generation |
Yi Zhang et.al. |
2505.22391 |
null |
2025-05-29 |
Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition |
Hanting Chen et.al. |
2505.22375 |
null |
2025-05-28 |
AgentDNS: A Root Domain Naming System for LLM Agents |
Enfang Cui et.al. |
2505.22368 |
null |
2025-05-28 |
Identity-Preserving Text-to-Image Generation via Dual-Level Feature Decoupling and Expert-Guided Fusion |
Kewen Chen et.al. |
2505.22360 |
null |
2025-05-28 |
Budget-Adaptive Adapter Tuning in Orthogonal Subspaces for Continual Learning in LLMs |
Zhiyi Wan et.al. |
2505.22358 |
null |
2025-05-28 |
ChatPD: An LLM-driven Paper-Dataset Networking System |
Anjie Xu et.al. |
2505.22349 |
null |
2025-05-28 |
Task-Driven Implicit Representations for Automated Design of LiDAR Systems |
Nikhil Behari et.al. |
2505.22344 |
null |
2025-05-28 |
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start |
Lai Wei et.al. |
2505.22334 |
link |
2025-05-28 |
NLP for Social Good: A Survey of Challenges, Opportunities, and Responsible Deployment |
Antonia Karamolegkou et.al. |
2505.22327 |
null |
2025-05-28 |
Advancing Expert Specialization for Better MoE |
Hongcan Guo et.al. |
2505.22323 |
null |
2025-05-28 |
Chain-of-Thought for Large Language Model-empowered Wireless Communications |
Xudong Wang et.al. |
2505.22320 |
null |
2025-05-28 |
If Pigs Could Fly… Can LLMs Logically Reason Through Counterfactuals? |
Ishwar B Balappanawar et.al. |
2505.22318 |
null |
2025-05-29 |
Skywork Open Reasoner 1 Technical Report |
Jujie He et.al. |
2505.22312 |
link |
2025-05-28 |
From Large AI Models to Agentic AI: A Tutorial on Future Intelligent Communications |
Feibo Jiang et.al. |
2505.22311 |
null |
2025-05-28 |
CADReview: Automatically Reviewing CAD Programs with Error Detection and Correction |
Jiali Chen et.al. |
2505.22304 |
null |
2025-05-28 |
Adaptive Detoxification: Safeguarding General Capabilities of LLMs through Toxicity-Aware Knowledge Editing |
Yifan Lu et.al. |
2505.22298 |
null |
2025-05-28 |
Compensating for Data with Reasoning: Low-Resource Machine Translation with LLMs |
Samuel Frontull et.al. |
2505.22293 |
null |
2025-05-28 |
Rethinking the Unsolvable: When In-Context Search Meets Test-Time Scaling |
Fanzeng Xia et.al. |
2505.22290 |
null |
2025-05-28 |
New Tools are Needed for Tracking Adherence to AI Model Behavioral Use Clauses |
Daniel McDuff et.al. |
2505.22287 |
null |
2025-05-28 |
Test-Time Immunization: A Universal Defense Framework Against Jailbreaks for (Multimodal) Large Language Models |
Yongcan Yu et.al. |
2505.22271 |
null |
2025-05-28 |
Evaluation of LLMs in Speech is Often Flawed: Test Set Contamination in Large Language Models for Speech Recognition |
Yuan Tseng et.al. |
2505.22251 |
null |
2025-05-28 |
BioHopR: A Benchmark for Multi-Hop, Multi-Answer Reasoning in Biomedical Domain |
Yunsoo Kim et.al. |
2505.22240 |
null |
2025-05-28 |
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models |
Mehdi Ali et.al. |
2505.22232 |
null |
2025-05-28 |
Look & Mark: Leveraging Radiologist Eye Fixations and Bounding boxes in Multimodal Large Language Models for Chest X-ray Report Generation |
Yunsoo Kim et.al. |
2505.22222 |
null |
2025-05-28 |
A Survey on Training-free Open-Vocabulary Semantic Segmentation |
Naomi Kombol et.al. |
2505.22209 |
null |
2025-05-28 |
Efficient Leave-one-out Approximation in LLM Multi-agent Debate Based on Introspection |
Yue Cui et.al. |
2505.22192 |
null |
2025-05-29 |
Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design |
Yudi Zhang et.al. |
2505.22179 |
link |
2025-05-28 |
Reverse Preference Optimization for Complex Instruction Following |
Xiang Huang et.al. |
2505.22172 |
null |
2025-05-28 |
Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers |
Weilun Feng et.al. |
2505.22167 |
null |
2025-05-28 |
InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing |
Shuaiyi Li et.al. |
2505.22156 |
null |
2025-05-28 |
What Makes a Good Reasoning Chain? Uncovering Structural Patterns in Long Chain-of-Thought Reasoning |
Gangwei Jiang et.al. |
2505.22148 |
null |
2025-05-28 |
Developing a Top-tier Framework in Naturalistic Conditions Challenge for Categorized Emotion Prediction: From Speech Foundation Models and Learning Objective to Data Augmentation and Engineering Choices |
Tiantian Feng et.al. |
2505.22133 |
link |
2025-05-28 |
EULER: Enhancing the Reasoning Ability of Large Language Models through Error-Induced Learning |
Zhuoyang Wu et.al. |
2505.22131 |
null |
2025-05-28 |
SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model |
Yifan Chang et.al. |
2505.22126 |
null |
2025-05-28 |
LoKI: Low-damage Knowledge Implanting of Large Language Models |
Runyu Wang et.al. |
2505.22120 |
link |
2025-05-28 |
THINK-Bench: Evaluating Thinking Efficiency and Chain-of-Thought Quality of Large Reasoning Models |
Zhiyuan Li et.al. |
2505.22113 |
null |
2025-05-28 |
Visual Large Language Models Exhibit Human-Level Cognitive Flexibility in the Wisconsin Card Sorting Test |
Guangfu Hao et.al. |
2505.22112 |
null |
2025-05-28 |
Curse of High Dimensionality Issue in Transformer for Long-context Modeling |
Shuhai Zhang et.al. |
2505.22107 |
null |
2025-05-28 |
MemOS: An Operating System for Memory-Augmented Generation (MAG) in Large Language Models |
Zhiyu Li et.al. |
2505.22101 |
null |
2025-05-28 |
Knowledge Base Construction for Knowledge-Augmented Text-to-SQL |
Jinheon Baek et.al. |
2505.22096 |
null |
2025-05-28 |
Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning |
Chunyi Peng et.al. |
2505.22095 |
null |
2025-05-28 |
VIRAL: Vision-grounded Integration for Reward design And Learning |
Valentin Cuzin-Rambaud et.al. |
2505.22092 |
null |
2025-05-28 |
ArgInstruct: Specialized Instruction Fine-Tuning for Computational Argumentation |
Maja Stahl et.al. |
2505.22076 |
null |
2025-05-28 |
On-the-fly Routing for Zero-shot MoE Speaker Adaptation of Speech Foundation Models for Dysarthric Speech Recognition |
Shujie HU et.al. |
2505.22072 |
null |
2025-05-28 |
Beyond path selection: Better LLMs for Scientific Information Extraction with MimicSFT and Relevance and Rule-induced(R$^2$)GRPO |
Ran Li et.al. |
2505.22068 |
null |
2025-05-28 |
Weakly Supervised Data Refinement and Flexible Sequence Compression for Efficient Thai LLM-based ASR |
Mingchen Shao et.al. |
2505.22063 |
null |
2025-05-28 |
Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home? |
Yujin Choi et.al. |
2505.22061 |
null |
2025-05-28 |
Estimating the Effects of Sample Training Orders for Large Language Models without Retraining |
Hao Yang et.al. |
2505.22042 |
null |
2025-05-28 |
Detecting Undesired Process Behavior by Means of Retrieval Augmented Generation |
Michael Grohs et.al. |
2505.22041 |
null |
2025-05-28 |
Jailbreak Distillation: Renewable Safety Benchmarking |
Jingyu Zhang et.al. |
2505.22037 |
null |
2025-05-28 |
CoThink: Token-Efficient Reasoning via Instruct Models Guiding Reasoning Models |
Siqi Fan et.al. |
2505.22017 |
null |
2025-05-28 |
PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms |
Yifei Xia et.al. |
2505.22016 |
null |
2025-05-28 |
VulBinLLM: LLM-powered Vulnerability Detection for Stripped Binaries |
Nasir Hussain et.al. |
2505.22010 |
null |
2025-05-28 |
Efficiently Enhancing General Agents With Hierarchical-categorical Memory |
Changze Qiao et.al. |
2505.22006 |
null |
2025-05-28 |
Legal Assist AI: Leveraging Transformer-Based Model for Effective Legal Assistance |
Jatin Gupta et.al. |
2505.22003 |
null |
2025-05-28 |
Found in Translation: Measuring Multilingual LLM Consistency as Simple as Translate then Evaluate |
Ashim Gupta et.al. |
2505.21999 |
null |
2025-05-28 |
Leveraging Interview-Informed LLMs to Model Survey Responses: Comparative Insights from AI-Generated and Human Data |
Jihong Zhang et.al. |
2505.21997 |
null |
2025-05-28 |
Learning World Models for Interactive Video Generation |
Taiye Chen et.al. |
2505.21996 |
null |
2025-05-28 |
ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning |
Zhendong Mi et.al. |
2505.21987 |
null |
2025-05-28 |
Learning Compositional Behaviors from Demonstration and Language |
Weiyu Liu et.al. |
2505.21981 |
null |
2025-05-27 |
Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making |
Yihan Wang et.al. |
2505.21503 |
null |
2025-05-27 |
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers |
Wei Pang et.al. |
2505.21497 |
link |
2025-05-27 |
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment |
Xiaojun Jia et.al. |
2505.21494 |
null |
2025-05-27 |
Reinforcing General Reasoning without Verifiers |
Xiangxin Zhou et.al. |
2505.21493 |
null |
2025-05-27 |
Robust Hypothesis Generation: LLM-Automated Language Bias for Inductive Logic Programming |
Yang Yang et.al. |
2505.21486 |
null |
2025-05-27 |
Are Language Models Consequentialist or Deontological Moral Reasoners? |
Keenan Samway et.al. |
2505.21479 |
null |
2025-05-27 |
Policy Optimized Text-to-Image Pipeline Design |
Uri Gadot et.al. |
2505.21478 |
null |
2025-05-27 |
Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration |
Zijun Liu et.al. |
2505.21471 |
link |
2025-05-27 |
PropMolFlow: Property-guided Molecule Generation with Geometry-Complete Flow Matching |
Cheng Zeng et.al. |
2505.21469 |
null |
2025-05-27 |
Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance |
Shintaro Ozaki et.al. |
2505.21458 |
null |
2025-05-27 |
Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO |
Muzhi Zhu et.al. |
2505.21457 |
null |
2025-05-27 |
Designing Cyclic Peptides via Harmonic SDE with Atom-Bond Modeling |
Xiangxin Zhou et.al. |
2505.21452 |
null |
2025-05-27 |
Can Large Reasoning Models Self-Train? |
Sheikh Shafayat et.al. |
2505.21444 |
null |
2025-05-27 |
Hume: Introducing System-2 Thinking in Visual-Language-Action Model |
Haoming Song et.al. |
2505.21432 |
null |
2025-05-27 |
Policy Induction: Predicting Startup Success via Explainable Memory-Augmented In-Context Learning |
Xianling Mu et.al. |
2505.21427 |
null |
2025-05-27 |
GUARD:Dual-Agent based Backdoor Defense on Chain-of-Thought in Neural Code Generation |
Naizhu Jin et.al. |
2505.21425 |
null |
2025-05-27 |
Autonomous Multi-Modal LLM Agents for Treatment Planning in Focused Ultrasound Ablation Surgery |
Lina Zhao et.al. |
2505.21418 |
null |
2025-05-27 |
RefTool: Enhancing Model Reasoning with Reference-Guided Tool Creation |
Xiao Liu et.al. |
2505.21413 |
null |
2025-05-28 |
Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity |
Yehui Tang et.al. |
2505.21411 |
null |
2025-05-27 |
RelationalFactQA: A Benchmark for Evaluating Tabular Fact Retrieval from Large Language Models |
Dario Satriani et.al. |
2505.21409 |
null |
2025-05-27 |
A Convergence Theory for Diffusion Language Models: An Information-Theoretic Perspective |
Gen Li et.al. |
2505.21400 |
null |
2025-05-27 |
Factual Self-Awareness in Language Models: Representation, Robustness, and Scaling |
Hovhannes Tamoyan et.al. |
2505.21399 |
null |
2025-05-27 |
DecisionFlow: Advancing Large Language Model as Principled Decision Maker |
Xiusi Chen et.al. |
2505.21397 |
null |
2025-05-27 |
Improving Research Idea Generation Through Data: An Empirical Investigation in Social Science |
Xiao Liu et.al. |
2505.21396 |
null |
2025-05-27 |
AutoJudger: An Agent-Driven Framework for Efficient Benchmarking of MLLMs |
Xuanwen Ding et.al. |
2505.21389 |
null |
2025-05-27 |
DeCAF: Decentralized Consensus-And-Factorization for Low-Rank Adaptation of Foundation Models |
Nastaran Saadati et.al. |
2505.21382 |
null |
2025-05-27 |
GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution |
Fengxiang Wang et.al. |
2505.21375 |
link |
2025-05-27 |
Improving LLM-based Global Optimization with Search Space Partitioning |
Andrej Schwanke et.al. |
2505.21372 |
null |
2025-05-27 |
When Experimental Economics Meets Large Language Models: Tactics with Evidence |
Shu Wang et.al. |
2505.21371 |
null |
2025-05-27 |
Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders |
James Oldfield et.al. |
2505.21364 |
null |
2025-05-27 |
Evaluating LLM Adaptation to Sociodemographic Factors: User Profile vs. Dialogue History |
Qishuai Zhong et.al. |
2505.21362 |
null |
2025-05-28 |
AgriFM: A Multi-source Temporal Remote Sensing Foundation Model for Crop Mapping |
Wenyuan Li et.al. |
2505.21357 |
null |
2025-05-28 |
Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models |
Whenty Ariyanti et.al. |
2505.21356 |
null |
2025-05-27 |
Leveraging Large Language Models for Bengali Math Word Problem Solving with Chain of Thought Reasoning |
Bidyarthi Paul et.al. |
2505.21354 |
null |
2025-05-27 |
Out of the Past: An AI-Enabled Pipeline for Traffic Simulation from Noisy, Multimodal Detector Data and Stakeholder Feedback |
Rex Chen et.al. |
2505.21349 |
null |
2025-05-27 |
The Multilingual Divide and Its Impact on Global AI Safety |
Aidan Peppin et.al. |
2505.21344 |
null |
2025-05-28 |
PEDANTIC: A Dataset for the Automatic Examination of Definiteness in Patent Claims |
Valentin Knappich et.al. |
2505.21342 |
null |
2025-05-28 |
HoliTom: Holistic Token Merging for Fast Video Large Language Models |
Kele Shao et.al. |
2505.21334 |
null |
2025-05-27 |
MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios |
Yang Shi et.al. |
2505.21333 |
null |
2025-05-27 |
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs |
Jiakang Yuan et.al. |
2505.21327 |
null |
2025-05-27 |
Leveraging large language models and traditional machine learning ensembles for ADHD detection from narrative transcripts |
Yuxin Zhu et.al. |
2505.21324 |
null |
2025-05-27 |
Assured Autonomy with Neuro-Symbolic Perception |
R. Spencer Hallyburton et.al. |
2505.21322 |
null |
2025-05-27 |
Beyond Chemical QA: Evaluating LLM’s Chemical Reasoning with Modular Chemical Operations |
Hao Li et.al. |
2505.21318 |
null |
2025-05-27 |
A Cross Modal Knowledge Distillation & Data Augmentation Recipe for Improving Transcriptomics Representations through Morphological Features |
Ihab Bendidi et.al. |
2505.21317 |
null |
2025-05-27 |
Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead |
Jesujoba O. Alabi et.al. |
2505.21315 |
null |
2025-05-27 |
Large Language Models Miss the Multi-Agent Mark |
Emanuele La Malfa et.al. |
2505.21298 |
null |
2025-05-27 |
rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset |
Yifei Liu et.al. |
2505.21297 |
null |
2025-05-27 |
Complex System Diagnostics Using a Knowledge Graph-Informed and Large Language Model-Enhanced Framework |
Saman Marandi et.al. |
2505.21291 |
null |
2025-05-27 |
PACT: A Contract-Theoretic Framework for Pricing Agentic AI Services Powered by Large Language Models |
Ya-Ting Yang et.al. |
2505.21286 |
null |
2025-05-27 |
RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models |
Yue Zhang et.al. |
2505.21281 |
null |
2025-05-27 |
Breaking the Ceiling: Exploring the Potential of Jailbreak Attacks through Expanding Strategy Space |
Yao Huang et.al. |
2505.21277 |
link |
2025-05-27 |
JavaSith: A Client-Side Framework for Analyzing Potentially Malicious Extensions in Browsers, VS Code, and NPM Packages |
Avihay Cohen et.al. |
2505.21263 |
null |
2025-05-27 |
ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision |
Dosung Lee et.al. |
2505.21250 |
null |
2025-05-27 |
Evaluation of LLMs in Medical Text Summarization: The Role of Vocabulary Adaptation in High OOV Settings |
Gunjan Balde et.al. |
2505.21242 |
null |
2025-05-27 |
LMCD: Language Models are Zeroshot Cognitive Diagnosis Learners |
Yu He et.al. |
2505.21239 |
null |
2025-05-27 |
Unfolding A Few Structures for The Many: Memory-Efficient Compression of Conformer and Speech Foundation Models |
Zhaoqing Li et.al. |
2505.21237 |
null |
2025-05-27 |
Pretrained LLMs Learn Multiple Types of Uncertainty |
Roi Cohen et.al. |
2505.21218 |
null |
2025-05-27 |
Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM’s Instruction-Following Capabilities |
Junyan Zhang et.al. |
2505.21191 |
null |
2025-05-27 |
Exploring the Latent Capacity of LLMs for One-Step Text Generation |
Gleb Mezentsev et.al. |
2505.21189 |
null |
2025-05-27 |
PoisonSwarm: Universal Harmful Information Synthesis via Model Crowdsourcing |
Yu Yan et.al. |
2505.21184 |
null |
2025-05-27 |
Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning |
Mingyang Song et.al. |
2505.21178 |
null |
2025-05-27 |
SOLIDGEO: Measuring Multimodal Spatial Math Reasoning in Solid Geometry |
Peijie Wang et.al. |
2505.21177 |
null |
2025-05-27 |
TAT-R1: Terminology-Aware Translation with Reinforcement Learning and Word Alignment |
Zheng Li et.al. |
2505.21172 |
null |
2025-05-27 |
STEB: In Search of the Best Evaluation Approach for Synthetic Time Series |
Michael Stenger et.al. |
2505.21160 |
null |
2025-05-27 |
Assessment of L2 Oral Proficiency using Speech Large Language Models |
Rao Ma et.al. |
2505.21148 |
null |
2025-05-27 |
IKMo: Image-Keyframed Motion Generation with Trajectory-Pose Conditioned Motion Diffusion Model |
Yang Zhao et.al. |
2505.21146 |
null |
2025-05-27 |
Leveraging LLM and Self-Supervised Training Models for Speech Recognition in Chinese Dialects: A Comparative Analysis |
Tianyi Xu et.al. |
2505.21138 |
null |
2025-05-27 |
Scaling and Prompting for Improved End-to-End Spoken Grammatical Error Correction |
Mengjie Qian et.al. |
2505.21137 |
null |
2025-05-27 |
Named Entity Swapping for Metadata Anonymization in a Text Corpus |
Jan Greve et.al. |
2505.21128 |
null |
2025-05-27 |
Creativity in LLM-based Multi-Agent Systems: A Survey |
Yi-Cheng Lin et.al. |
2505.21116 |
null |
2025-05-27 |
Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA |
Sergey Pletenev et.al. |
2505.21115 |
null |
2025-05-27 |
Simulating Ethics: Using LLM Debate Panels to Model Deliberation on Medical Dilemmas |
Hazem Zohny et.al. |
2505.21112 |
null |
2025-05-27 |
A Lightweight Multi-Expert Generative Language Model System for Engineering Information and Knowledge Extraction |
Bogdan Bogachov et.al. |
2505.21109 |
null |
2025-05-27 |
Thinker: Learning to Think Fast and Slow |
Stephen Chung et.al. |
2505.21097 |
null |
2025-05-27 |
BLUCK: A Benchmark Dataset for Bengali Linguistic Understanding and Cultural Knowledge |
Daeen Kabir et.al. |
2505.21092 |
null |
2025-05-27 |
Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs) |
Anna Neumann et.al. |
2505.21091 |
null |
2025-05-27 |
LLMs Think, But Not In Your Flow: Reasoning-Level Personalization for Black-Box Large Language Models |
Jieyong Kim et.al. |
2505.21082 |
null |
2025-05-27 |
Uni3D-MoE: Scalable Multimodal 3D Scene Understanding via Mixture of Experts |
Yue Zhang et.al. |
2505.21079 |
null |
2025-05-27 |
Efficient Large Language Model Inference with Neural Block Linearization |
Mete Erdogan et.al. |
2505.21077 |
null |
2025-05-27 |
DynamicVL: Benchmarking Multimodal Large Language Models for Dynamic City Understanding |
Weihao Xuan et.al. |
2505.21076 |
null |
2025-05-27 |
Faithfulness-Aware Uncertainty Quantification for Fact-Checking the Output of Retrieval Augmented Generation |
Ekaterina Fadeeva et.al. |
2505.21072 |
null |
2025-05-27 |
CXXCrafter: An LLM-Based Agent for Automated C/C++ Open Source Software Building |
Zhengmin Yu et.al. |
2505.21069 |
null |
2025-05-27 |
Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning |
Xiao Hu et.al. |
2505.21067 |
null |
2025-05-27 |
Agent-Environment Alignment via Automated Interface Generation |
Kaiming Liu et.al. |
2505.21055 |
null |
2025-05-27 |
SHE-LoRA: Selective Homomorphic Encryption for Federated Tuning with Heterogeneous LoRA |
Jianmin Liu et.al. |
2505.21051 |
null |
2025-05-28 |
Advancing high-fidelity 3D and Texture Generation with 2.5D latents |
Xin Yang et.al. |
2505.21050 |
null |
2025-05-27 |
Large Language Model-enhanced Reinforcement Learning for Low-Altitude Economy Networking |
Lingyi Cai et.al. |
2505.21045 |
null |
2025-05-28 |
FCKT: Fine-Grained Cross-Task Knowledge Transfer with Semantic Contrastive Learning for Targeted Sentiment Analysis |
Wei Chen et.al. |
2505.21040 |
null |
2025-05-27 |
RainFusion: Adaptive Video Generation Acceleration via Multi-Dimensional Visual Redundancy |
Aiyue Chen et.al. |
2505.21036 |
null |
2025-05-27 |
LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms |
Wenhu Li et.al. |
2505.21034 |
null |
2025-05-27 |
Def-DTS: Deductive Reasoning for Open-domain Dialogue Topic Segmentation |
Seungmin Lee et.al. |
2505.21033 |
null |
2025-05-27 |
Uncertainty Unveiled: Can Exposure to More In-context Examples Mitigate Uncertainty for Large Language Models? |
Yifei Wang et.al. |
2505.21003 |
null |
2025-05-27 |
Who Reasons in the Large Language Models? |
Jie Shao et.al. |
2505.20993 |
null |
2025-05-27 |
LifeIR at the NTCIR-18 Lifelog-6 Task |
Jiahan Chen et.al. |
2505.20987 |
null |
2025-05-27 |
Generative Image Compression by Estimating Gradients of the Rate-variable Feature Distribution |
Minghao Han et.al. |
2505.20984 |
null |
2025-05-27 |
Evaluating and Steering Modality Preferences in Multimodal Large Language Model |
Yu Zhang et.al. |
2505.20977 |
null |
2025-05-27 |
Contrastive Learning on LLM Back Generation Treebank for Cross-domain Constituency Parsing |
Peiming Guo et.al. |
2505.20976 |
null |
2025-05-28 |
Towards Conversational Development Environments: Using Theory-of-Mind and Multi-Agent Architectures for Requirements Refinement |
Keheliya Gallaba et.al. |
2505.20973 |
null |
2025-05-27 |
Research Community Perspectives on “Intelligence” and Large Language Models |
Bertram Højer et.al. |
2505.20959 |
null |
2025-05-27 |
IRCopilot: Automated Incident Response with Large Language Models |
Xihuan Lin et.al. |
2505.20945 |
null |
2025-05-26 |
Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs |
Hanting Chen et.al. |
2505.20155 |
null |
2025-05-26 |
UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models |
Xueyan Zhang et.al. |
2505.20154 |
null |
2025-05-26 |
MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents |
Ziming Wei et.al. |
2505.20148 |
link |
2025-05-26 |
FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities |
Jin Wang et.al. |
2505.20147 |
null |
2025-05-26 |
StructEval: Benchmarking LLMs’ Capabilities to Generate Structural Outputs |
Jialin Yang et.al. |
2505.20139 |
null |
2025-05-26 |
Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers |
Zhengliang Shi et.al. |
2505.20128 |
link |
2025-05-26 |
Agentic AI Process Observability: Discovering Behavioral Variability |
Fabiana Fournier et.al. |
2505.20127 |
null |
2025-05-26 |
Understanding Generalization in Diffusion Models via Probability Flow Distance |
Huijie Zhang et.al. |
2505.20123 |
null |
2025-05-27 |
TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent |
Dominik Meier et.al. |
2505.20118 |
link |
2025-05-26 |
Named Entity Recognition in Historical Italian: The Case of Giacomo Leopardi’s Zibaldone |
Cristian Santini et.al. |
2505.20113 |
null |
2025-05-26 |
ResSVD: Residual Compensated SVD for Large Language Model Compression |
Haolei Bai et.al. |
2505.20112 |
null |
2025-05-26 |
Proxy-Free GFlowNet |
Ruishuo Chen et.al. |
2505.20110 |
null |
2025-05-26 |
Language-Agnostic Suicidal Risk Detection Using Large Language Models |
June-Woo Kim et.al. |
2505.20109 |
null |
2025-05-26 |
Adaptive Deep Reasoning: Triggering Deep Thinking When Needed |
Yunhao Wang et.al. |
2505.20101 |
null |
2025-05-26 |
AdaTP: Attention-Debiased Token Pruning for Video Large Language Models |
Fengyuan Sun et.al. |
2505.20100 |
null |
2025-05-26 |
Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities |
Chuangtao Ma et.al. |
2505.20099 |
link |
2025-05-26 |
S2LPP: Small-to-Large Prompt Prediction across LLMs |
Liang Cheng et.al. |
2505.20097 |
null |
2025-05-26 |
Multi-Domain Explainability of Preferences |
Nitay Calderon et.al. |
2505.20088 |
null |
2025-05-26 |
Inference-time Alignment in Continuous Space |
Yige Yuan et.al. |
2505.20081 |
link |
2025-05-26 |
Incentivizing Reasoning from Weak Supervision |
Yige Yuan et.al. |
2505.20072 |
link |
2025-05-26 |
On the Same Page: Dimensions of Perceived Shared Understanding in Human-AI Interaction |
Qingyu Liang et.al. |
2505.20068 |
null |
2025-05-26 |
SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety |
Geon-Hyeong Kim et.al. |
2505.20065 |
null |
2025-05-26 |
Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion |
Zheqi Lv et.al. |
2505.20053 |
link |
2025-05-26 |
Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks |
Debargha Ganguly et.al. |
2505.20047 |
null |
2025-05-26 |
REARANK: Reasoning Re-ranking Agent via Reinforcement Learning |
Le Zhang et.al. |
2505.20046 |
link |
2025-05-26 |
Uncertainty-Aware Attention Heads: Efficient Unsupervised Uncertainty Quantification for LLMs |
Artem Vazhentsev et.al. |
2505.20045 |
null |
2025-05-26 |
ReasonPlan: Unified Scene Prediction and Decision Reasoning for Closed-loop Autonomous Driving |
Xueyi Liu et.al. |
2505.20024 |
link |
2025-05-26 |
Training LLM-Based Agents with Synthetic Self-Reflected Trajectories and Partial Masking |
Yihan Chen et.al. |
2505.20023 |
null |
2025-05-26 |
Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare |
Natallia Kokash et.al. |
2505.20020 |
null |
2025-05-26 |
Does Rationale Quality Matter? Enhancing Mental Disorder Detection via Selective Reasoning Distillation |
Hoyun Song et.al. |
2505.20014 |
link |
2025-05-26 |
WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback |
Minda Hu et.al. |
2505.20013 |
null |
2025-05-26 |
TabPFN: One Model to Rule Them All? |
Qiong Zhang et.al. |
2505.20003 |
link |
2025-05-26 |
NEXT: Multi-Grained Mixture of Experts via Text-Modulation for Multi-Modal Object Re-ID |
Shihao Li et.al. |
2505.20001 |
null |
2025-05-26 |
Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents |
Tao Wu et.al. |
2505.19997 |
null |
2025-05-26 |
Automatic Metadata Extraction for Text-to-SQL |
Vladislav Shkapenyuk et.al. |
2505.19988 |
null |
2025-05-26 |
How Well Do Large Reasoning Models Translate? A Comprehensive Evaluation for Multi-Domain Machine Translation |
Yongshi Ye et.al. |
2505.19987 |
link |
2025-05-26 |
Rethinking Probabilistic Circuit Parameter Learning |
Anji Liu et.al. |
2505.19982 |
null |
2025-05-26 |
DFIR-Metric: A Benchmark Dataset for Evaluating Large Language Models in Digital Forensics and Incident Response |
Bilel Cherif et.al. |
2505.19973 |
null |
2025-05-26 |
CP-Router: An Uncertainty-Aware Router Between LLM and LRM |
Jiayuan Su et.al. |
2505.19970 |
null |
2025-05-26 |
Learning to Select In-Context Demonstration Preferred by Large Language Model |
Zheng Zhang et.al. |
2505.19966 |
null |
2025-05-26 |
Adaptive Location Hierarchy Learning for Long-Tailed Mobility Prediction |
Yu Wang et.al. |
2505.19965 |
null |
2025-05-26 |
The Limits of Preference Data for Post-Training |
Eric Zhao et.al. |
2505.19964 |
null |
2025-05-26 |
MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models |
Zhongzhan Huang et.al. |
2505.19959 |
link |
2025-05-26 |
DCG-SQL: Enhancing In-Context Learning for Text-to-SQL with Deep Contextual Schema Link Graph |
Jihyung Lee et.al. |
2505.19956 |
null |
2025-05-26 |
An Explainable Diagnostic Framework for Neurodegenerative Dementias via Reinforcement-Optimized LLM Reasoning |
Andrew Zamai et.al. |
2505.19954 |
null |
2025-05-26 |
Multimodal Reasoning Agent for Zero-Shot Composed Image Retrieval |
Rong-Cheng Tu et.al. |
2505.19952 |
null |
2025-05-26 |
Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions |
Siqi Kou et.al. |
2505.19949 |
null |
2025-05-26 |
ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs |
Pooneh Mousavi et.al. |
2505.19937 |
null |
2025-05-26 |
Subtle Risks, Critical Failures: A Framework for Diagnosing Physical Safety of LLMs for Embodied Decision Making |
Yejin Son et.al. |
2505.19933 |
null |
2025-05-26 |
TCP: a Benchmark for Temporal Constraint-Based Planning |
Zifeng Ding et.al. |
2505.19927 |
null |
2025-05-26 |
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles |
Jiangjie Chen et.al. |
2505.19914 |
null |
2025-05-26 |
APE: A Data-Centric Benchmark for Efficient LLM Adaptation in Text Summarization |
Javier Marín et.al. |
2505.19912 |
link |
2025-05-27 |
Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM |
Peng Liu et.al. |
2505.19901 |
null |
2025-05-26 |
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows |
Qiushi Sun et.al. |
2505.19897 |
null |
2025-05-26 |
Large Language Models as Autonomous Spacecraft Operators in Kerbal Space Program |
Alejandro Carrasco et.al. |
2505.19896 |
link |
2025-05-26 |
ESLM: Risk-Averse Selective Language Modeling for Efficient Pretraining |
Melis Ilayda Bal et.al. |
2505.19893 |
null |
2025-05-26 |
Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging |
Yongxian Wei et.al. |
2505.19892 |
link |
2025-05-27 |
Generalized and Personalized Federated Learning with Foundation Models via Orthogonal Transformations |
Eun Gyung Kong et.al. |
2505.19888 |
null |
2025-05-26 |
Deconstructing Obfuscation: A four-dimensional framework for evaluating Large Language Models assembly code deobfuscation capabilities |
Anton Tkachenko et.al. |
2505.19887 |
null |
2025-05-26 |
Vad-R1: Towards Video Anomaly Reasoning via Perception-to-Cognition Chain-of-Thought |
Chao Huang et.al. |
2505.19877 |
link |
2025-05-26 |
A fully automated urban PV parameterization framework for improved estimation of energy production profiles |
Bowen Tian et.al. |
2505.19876 |
null |
2025-05-26 |
StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation |
Yi Wu et.al. |
2505.19874 |
null |
2025-05-26 |
Deep Active Inference Agents for Delayed and Long-Horizon Environments |
Yavar Taheri Yeganeh et.al. |
2505.19867 |
link |
2025-05-26 |
HS-STAR: Hierarchical Sampling for Self-Taught Reasoners via Difficulty Estimation and Budget Reallocation |
Feng Xiong et.al. |
2505.19866 |
null |
2025-05-26 |
CPA-RAG: Covert Poisoning Attacks on Retrieval-Augmented Generation in Large Language Models |
Chunyang Li et.al. |
2505.19864 |
null |
2025-05-26 |
FruitNeRF++: A Generalized Multi-Fruit Counting Method Utilizing Contrastive Learning and Neural Radiance Fields |
Lukas Meyer et.al. |
2505.19863 |
link |
2025-05-26 |
Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning? |
Zexi Li et.al. |
2505.19855 |
null |
2025-05-26 |
Beyond Specialization: Benchmarking LLMs for Transliteration of Indian Languages |
Gulfarogh Azam et.al. |
2505.19851 |
null |
2025-05-26 |
Improving Multilingual Math Reasoning for African Languages |
Odunayo Ogundepo et.al. |
2505.19848 |
null |
2025-05-26 |
FoodTaxo: Generating Food Taxonomies with Large Language Models |
Pascal Wullschleger et.al. |
2505.19838 |
link |
2025-05-26 |
SecVulEval: Benchmarking LLMs for Real-World C/C++ Vulnerability Detection |
Md Basim Uddin Ahmed et.al. |
2505.19828 |
link |
2025-05-26 |
Foundation Models for Tabular Data within Systemic Contexts Need Grounding |
Tassilo Klein et.al. |
2505.19825 |
null |
2025-05-26 |
FinLoRA: Benchmarking LoRA Methods for Fine-Tuning LLMs on Financial Datasets |
Dannong Wang et.al. |
2505.19819 |
link |
2025-05-26 |
Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective |
Junnan Liu et.al. |
2505.19815 |
link |
2025-05-26 |
Efficient Multi-modal Long Context Learning for Training-free Adaptation |
Zehong Ma et.al. |
2505.19812 |
link |
2025-05-26 |
Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks |
Sirui Chen et.al. |
2505.19806 |
link |
2025-05-26 |
Compliance-to-Code: Enhancing Financial Compliance Checking via Code Generation |
Siyuan Li et.al. |
2505.19804 |
null |
2025-05-26 |
Integrating emotional intelligence, memory architecture, and gestures to achieve empathetic humanoid robot interaction in an educational setting |
Fuze Sun et.al. |
2505.19803 |
null |
2025-05-26 |
MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs |
Zaid Alyafeai et.al. |
2505.19800 |
link |
2025-05-26 |
Advancements in Medical Image Classification through Fine-Tuning Natural Domain Foundation Models |
Mobina Mansoori et.al. |
2505.19779 |
link |
2025-05-26 |
DuRep: Dual-Mode Speech Representation Learning via ASR-Aware Distillation |
Prabash Reddy Male et.al. |
2505.19774 |
null |
2025-05-26 |
What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs |
Sangyeop Kim et.al. |
2505.19773 |
null |
2025-05-26 |
SGM: A Framework for Building Specification-Guided Moderation Filters |
Masoomali Fatehkia et.al. |
2505.19766 |
null |
2025-05-26 |
Agentic Predictor: Performance Prediction for Agentic Workflows via Multi-View Encoding |
Patara Trirat et.al. |
2505.19764 |
null |
2025-05-26 |
Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning |
Zican Hu et.al. |
2505.19761 |
link |
2025-05-26 |
NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering |
Ruisheng Cao et.al. |
2505.19754 |
null |
2025-05-27 |
Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models |
Yang Zhang et.al. |
2505.19743 |
link |
2025-05-26 |
ReChisel: Effective Automatic Chisel Code Generation by LLM with Reflection |
Juxin Niu et.al. |
2505.19734 |
link |
2025-05-26 |
Accelerating Nash Learning from Human Feedback via Mirror Prox |
Daniil Tiapkin et.al. |
2505.19731 |
null |
2025-05-26 |
Distilling Closed-Source LLM’s Knowledge for Locally Stable and Economic Biomedical Entity Linking |
Yihao Ai et.al. |
2505.19722 |
null |
2025-05-26 |
Extremum Flow Matching for Offline Goal Conditioned Reinforcement Learning |
Quentin Rouxel et.al. |
2505.19717 |
null |
2025-05-26 |
MT$^{3}$: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning |
Zhaopeng Feng et.al. |
2505.19714 |
null |
2025-05-26 |
On the Relation between Rectified Flows and Optimal Transport |
Johannes Hertrich et.al. |
2505.19712 |
null |
2025-05-26 |
MLLM-Guided VLM Fine-Tuning with Joint Inference for Zero-Shot Composed Image Retrieval |
Rong-Cheng Tu et.al. |
2505.19707 |
null |
2025-05-26 |
Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision |
Tej Deep Pala et.al. |
2505.19706 |
link |
2025-05-26 |
Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning |
Minheng Ni et.al. |
2505.19702 |
null |
2025-05-26 |
Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models |
Yi Liu et.al. |
2505.19700 |
null |
2025-05-26 |
Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments |
Junming Liu et.al. |
2505.19699 |
null |
2025-05-26 |
DriveCamSim: Generalizable Camera Simulation via Explicit Camera Modeling for Autonomous Driving |
Wenchao Sun et.al. |
2505.19692 |
link |
2025-05-26 |
Graph Guided Diffusion: Unified Guidance for Conditional Graph Generation |
Victor M. Tenorio et.al. |
2505.19685 |
null |
2025-05-26 |
VisCRA: A Visual Chain Reasoning Attack for Jailbreaking Multimodal Large Language Models |
Bingrui Sima et.al. |
2505.19684 |
null |
2025-05-26 |
Large Language Models for Planning: A Comprehensive and Systematic Survey |
Pengfei Cao et.al. |
2505.19683 |
link |
2025-05-23 |
Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs |
Wafa Alghallabi et.al. |
2505.18152 |
link |
2025-05-23 |
Generative Distribution Embeddings |
Nic Fishman et.al. |
2505.18150 |
null |
2025-05-23 |
First Finish Search: Efficient Test-Time Scaling in Large Language Models |
Aradhye Agarwal et.al. |
2505.18149 |
null |
2025-05-23 |
Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find |
Owen Bianchi et.al. |
2505.18148 |
null |
2025-05-23 |
Gaming Tool Preferences in Agentic LLMs |
Kazem Faghih et.al. |
2505.18135 |
link |
2025-05-23 |
Reward Model Overoptimisation in Iterated RLHF |
Lorenz Wolf et.al. |
2505.18126 |
null |
2025-05-23 |
TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations |
Alan Arazi et.al. |
2505.18125 |
null |
2025-05-23 |
UNJOIN: Enhancing Multi-Table Text-to-SQL Generation via Schema Simplification |
Poojah Ganesan et.al. |
2505.18122 |
null |
2025-05-23 |
ProgRM: Build Better GUI Agents with Progress Rewards |
Danyang Zhang et.al. |
2505.18121 |
null |
2025-05-23 |
Bidirectional Knowledge Distillation for Enhancing Sequential Recommendation with Large Language Models |
Jiongran Wu et.al. |
2505.18120 |
null |
2025-05-23 |
Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM |
Zinuo Li et.al. |
2505.18110 |
null |
2025-05-23 |
ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework |
Lisheng Huang et.al. |
2505.18105 |
null |
2025-05-23 |
How Can I Publish My LLM Benchmark Without Giving the True Answers Away? |
Takashi Ishida et.al. |
2505.18102 |
null |
2025-05-23 |
Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL |
Joey Hong et.al. |
2505.18098 |
null |
2025-05-23 |
DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations |
Ziqiao Peng et.al. |
2505.18096 |
null |
2025-05-23 |
QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization |
Weizhou Shen et.al. |
2505.18092 |
null |
2025-05-23 |
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition |
Xinran Gu et.al. |
2505.18091 |
null |
2025-05-23 |
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding |
Xiaoyi Zhang et.al. |
2505.18079 |
null |
2025-05-23 |
Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals |
Jia-Nan Li et.al. |
2505.18071 |
null |
2025-05-23 |
Emergence of Hebbian Dynamics in Regularized Non-Local Learners |
David Koplow et.al. |
2505.18069 |
null |
2025-05-23 |
Reward Model Generalization for Compute-Aware Test-Time Reasoning |
Zeen Song et.al. |
2505.18065 |
null |
2025-05-23 |
A Foundation Model Framework for Multi-View MRI Classification of Extramural Vascular Invasion and Mesorectal Fascia Invasion in Rectal Cancer |
Yumeng Zhang et.al. |
2505.18058 |
null |
2025-05-23 |
MathEDU: Towards Adaptive Feedback for Student Mathematical Problem-Solving |
Wei-Ling Hsu et.al. |
2505.18056 |
null |
2025-05-23 |
SpikeGen: Generative Framework for Visual Spike Stream Processing |
Gaole Dai et.al. |
2505.18049 |
null |
2025-05-23 |
Contrastive Distillation of Emotion Knowledge from LLMs for Zero-Shot Emotion Recognition |
Minxue Niu et.al. |
2505.18040 |
link |
2025-05-23 |
Clip4Retrofit: Enabling Real-Time Image Labeling on Edge Devices via Cross-Architecture CLIP Distillation |
Li Zhong et.al. |
2505.18039 |
null |
2025-05-23 |
RemoteSAM: Towards Segment Anything for Earth Observation |
Liang Yao et.al. |
2505.18022 |
link |
2025-05-23 |
LLM assisted web application functional requirements generation: A case study of four popular LLMs over a Mess Management System |
Rashmi Gupta et.al. |
2505.18019 |
null |
2025-05-23 |
Strictly Constrained Generative Modeling via Split Augmented Langevin Sampling |
Matthieu Blanke et.al. |
2505.18017 |
link |
2025-05-23 |
Training with Pseudo-Code for Instruction Following |
Prince Kumar et.al. |
2505.18011 |
null |
2025-05-23 |
Towards Analyzing and Understanding the Limitations of VAPO: A Theoretical Perspective |
Jintian Shao et.al. |
2505.17997 |
null |
2025-05-23 |
Outcome-based Reinforcement Learning to Predict the Future |
Benjamin Turtel et.al. |
2505.17989 |
null |
2025-05-23 |
Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning |
Yutong Chen et.al. |
2505.17988 |
link |
2025-05-23 |
ADLGen: Synthesizing Symbolic, Event-Triggered Sensor Sequences for Human Activity Modeling |
Weihang You et.al. |
2505.17987 |
null |
2025-05-23 |
SmartNote: An LLM-Powered, Personalised Release Note Generator That Just Works |
Farbod Daneshyan et.al. |
2505.17977 |
link |
2025-05-23 |
Generalized Fisher-Weighted SVD: Scalable Kronecker-Factored Fisher Approximation for Compressing Large Language Models |
Viktoriia Chekalina et.al. |
2505.17974 |
null |
2025-05-23 |
Explainable Anatomy-Guided AI for Prostate MRI: Foundation Models and In Silico Clinical Trials for Virtual Biopsy-based Risk Assessment |
Danial Khan et.al. |
2505.17971 |
null |
2025-05-23 |
Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems |
Jiayi Geng et.al. |
2505.17968 |
null |
2025-05-23 |
SVD-Free Low-Rank Adaptive Gradient Optimization for Large Language Models |
Ionut-Vlad Modoranu et.al. |
2505.17967 |
null |
2025-05-23 |
Beyond Distillation: Pushing the Limits of Medical LLM Reasoning with Minimalist Rule-Based RL |
Che Liu et.al. |
2505.17952 |
null |
2025-05-23 |
Survival Games: Human-LLM Strategic Showdowns under Severe Resource Scarcity |
Zhihong Chen et.al. |
2505.17937 |
link |
2025-05-23 |
AutoMiSeg: Automatic Medical Image Segmentation via Test-Time Adaptation of Foundation Models |
Xingjian Li et.al. |
2505.17931 |
null |
2025-05-23 |
LLM Meeting Decision Trees on Tabular Data |
Hangting Ye et.al. |
2505.17918 |
null |
2025-05-23 |
Flexible MOF Generation with Torsion-Aware Flow Matching |
Nayoung Kim et.al. |
2505.17914 |
null |
2025-05-23 |
ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback |
Litao Guo et.al. |
2505.17908 |
link |
2025-05-23 |
T2I-Eval-R1: Reinforcement Learning-Driven Reasoning for Interpretable Text-to-Image Evaluation |
Zi-Ao Ma et.al. |
2505.17897 |
null |
2025-05-23 |
DataRater: Meta-Learned Dataset Curation |
Dan A. Calian et.al. |
2505.17895 |
null |
2025-05-23 |
Pixels to Prognosis: Harmonized Multi-Region CT-Radiomics and Foundation-Model Signatures Across Multicentre NSCLC Data |
Shruti Atul Mali et.al. |
2505.17893 |
null |
2025-05-23 |
LLM4SP: Large Language Models for Scatterer Prediction via Synesthesia of Machines |
Zengrui Han et.al. |
2505.17879 |
null |
2025-05-23 |
MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback |
Wanhao Liu et.al. |
2505.17873 |
link |
2025-05-23 |
Mixture of Low Rank Adaptation with Partial Parameter Sharing for Time Series Forecasting |
Licheng Pan et.al. |
2505.17872 |
null |
2025-05-23 |
The emergence of sparse attention: impact of data distribution and benefits of repetition |
Nicolas Zucchet et.al. |
2505.17863 |
null |
2025-05-23 |
Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities |
Ziwei Zhou et.al. |
2505.17862 |
link |
2025-05-23 |
Superplatforms Have to Attack AI Agents |
Jianghao Lin et.al. |
2505.17861 |
null |
2025-05-23 |
Automated Testing of the GUI of a Real-Life Engineering Software using Large Language Models |
Tim Rosenbach et.al. |
2505.17839 |
null |
2025-05-23 |
Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs’ Reasoning |
Zezhong Wang et.al. |
2505.17829 |
null |
2025-05-23 |
Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models |
Xuchen Pan et.al. |
2505.17826 |
link |
2025-05-23 |
Evaluation Faking: Unveiling Observer Effects in Safety Evaluation of Frontier AI Systems |
Yihe Fan et.al. |
2505.17815 |
null |
2025-05-23 |
Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning |
Michael Hassid et.al. |
2505.17813 |
null |
2025-05-23 |
A Coreset Selection of Coreset Selection Literature: Introduction and Recent Advances |
Brian B. Moser et.al. |
2505.17799 |
null |
2025-05-23 |
DialogXpert: Driving Intelligent and Emotion-Aware Conversations through Online Value-Based Reinforcement Learning with LLM Priors |
Tazeek Bin Abdur Rakib et.al. |
2505.17795 |
null |
2025-05-23 |
Titanus: Enabling KV Cache Pruning and Quantization On-the-Fly for LLM Acceleration |
Peilin Chen et.al. |
2505.17787 |
link |
2025-05-23 |
Generative Data Augmentation for Object Point Cloud Segmentation |
Dekai Zhu et.al. |
2505.17783 |
null |
2025-05-23 |
C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models |
Amir Hossein Rahmati et.al. |
2505.17773 |
null |
2025-05-23 |
Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models |
Patrick Leask et.al. |
2505.17769 |
null |
2025-05-23 |
R-Genie: Reasoning-Guided Generative Image Editing |
Dong Zhang et.al. |
2505.17768 |
null |
2025-05-23 |
The Real Barrier to LLM Agent Usability is Agentic ROI |
Weiwen Liu et.al. |
2505.17767 |
null |
2025-05-23 |
Resolving Conflicting Evidence in Automated Fact-Checking: A Study on Retrieval-Augmented LLMs |
Ziyu Ge et.al. |
2505.17762 |
null |
2025-05-23 |
But what is your honest answer? Aiding LLM-judges with honest alternatives using steering vectors |
Leon Eshuijs et.al. |
2505.17760 |
null |
2025-05-23 |
Fast Quiet-STaR: Thinking Without Thought Tokens |
Wei Huang et.al. |
2505.17746 |
null |
2025-05-23 |
Automating Safety Enhancement for LLM-based Agents with Synthetic Risk Scenarios |
Xueyang Zhou et.al. |
2505.17735 |
null |
2025-05-23 |
Slot-MLLM: Object-Centric Visual Tokenization for Multimodal LLM |
Donghwan Chi et.al. |
2505.17726 |
null |
2025-05-23 |
SeaLion: Semantic Part-Aware Latent Point Diffusion Models for 3D Generation |
Dekai Zhu et.al. |
2505.17721 |
null |
2025-05-23 |
Get Experience from Practice: LLM Agents with Record & Replay |
Erhu Feng et.al. |
2505.17716 |
null |
2025-05-23 |
Understanding How Value Neurons Shape the Generation of Specified Values in LLMs |
Yi Su et.al. |
2505.17712 |
null |
2025-05-23 |
LLM Contribution Summarization in Software Projects |
Rafael Corsi Ferrao et.al. |
2505.17710 |
null |
2025-05-23 |
CIKT: A Collaborative and Iterative Knowledge Tracing Framework with Large Language Models |
Runze Li et.al. |
2505.17705 |
null |
2025-05-23 |
Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via DeepSeek |
Xueyang Li et.al. |
2505.17702 |
null |
2025-05-23 |
COUNTDOWN: Contextually Sparse Activation Filtering Out Unnecessary Weights in Down Projection |
Jaewon Cheon et.al. |
2505.17701 |
null |
2025-05-23 |
Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models |
Zekai Zhao et.al. |
2505.17697 |
null |
2025-05-23 |
ELSPR: Evaluator LLM Training Data Self-Purification on Non-Transitive Preferences via Tournament Graph Reconstruction |
Yan Yu et.al. |
2505.17691 |
null |
2025-05-23 |
Tuning Language Models for Robust Prediction of Diverse User Behaviors |
Fanjin Meng et.al. |
2505.17682 |
null |
2025-05-23 |
Patterns with long and short-range order in monolayers of binary mixtures with competing interactions |
M. Litniewski et.al. |
2505.17675 |
null |
2025-05-23 |
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States |
Yang Xiao et.al. |
2505.17663 |
link |
2025-05-23 |
Automated scientific minimization of regret |
Marcel Binz et.al. |
2505.17661 |
null |
2025-05-23 |
Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling |
Xiaolong Tang et.al. |
2505.17659 |
null |
2025-05-23 |
Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs |
Hexiang Tan et.al. |
2505.17656 |
null |
2025-05-23 |
EVADE: Multimodal Benchmark for Evasive Content Detection in E-Commerce Applications |
Ancheng Xu et.al. |
2505.17654 |
null |
2025-05-23 |
GeoGramBench: Benchmarking the Geometric Program Reasoning in Modern LLMs |
Shixian Luo et.al. |
2505.17653 |
null |
2025-05-23 |
Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective |
Deyang Kong et.al. |
2505.17652 |
null |
2025-05-23 |
Simulating Macroeconomic Expectations using LLM Agents |
Jianhao Lin et.al. |
2505.17648 |
null |
2025-05-23 |
Understanding Pre-training and Fine-tuning from Loss Landscape Perspectives |
Huanran Chen et.al. |
2505.17646 |
null |
2025-05-23 |
HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning |
Chuhao Zhou et.al. |
2505.17645 |
null |
2025-05-23 |
PreMoe: Lightening MoEs on Constrained Memory by Expert Pruning and Retrieval |
Zehua Pei et.al. |
2505.17639 |
null |
2025-05-23 |
ReqBrain: Task-Specific Instruction Tuning of LLMs for AI-Assisted Requirements Generation |
Mohammad Kasra Habib et.al. |
2505.17632 |
null |
2025-05-23 |
BehaveGPT: A Foundation Model for Large-scale User Behavior Modeling |
Jiahui Gong et.al. |
2505.17631 |
null |
2025-05-23 |
GIM: Improved Interpretability for Large Language Models |
Joakim Edin et.al. |
2505.17630 |
null |
2025-05-23 |
Enhancing Large Vision-Language Models with Layout Modality for Table Question Answering on Japanese Annual Securities Reports |
Hayato Aida et.al. |
2505.17625 |
null |
2025-05-23 |
Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration |
Jingtong Gao et.al. |
2505.17621 |
null |
2025-05-23 |
CAS-IQA: Teaching Vision-Language Models for Synthetic Angiography Quality Assessment |
Bo Wang et.al. |
2505.17619 |
null |
2025-05-22 |
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning |
Chengqi Duan et.al. |
2505.17022 |
link |
2025-05-22 |
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms |
Shilin Yan et.al. |
2505.17020 |
link |
2025-05-22 |
Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework |
Chenhao Zhang et.al. |
2505.17019 |
link |
2025-05-22 |
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward |
Kaixuan Fan et.al. |
2505.17018 |
link |
2025-05-22 |
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO |
Chengzhuo Tong et.al. |
2505.17017 |
link |
2025-05-22 |
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models |
Runsen Xu et.al. |
2505.17015 |
null |
2025-05-22 |
SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding |
Haoning Wu et.al. |
2505.17012 |
link |
2025-05-22 |
R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning |
Huatong Song et.al. |
2505.17005 |
link |
2025-05-22 |
Do Large Language Models Excel in Complex Logical Reasoning with Formal Language? |
Jin Jiang et.al. |
2505.16998 |
link |
2025-05-22 |
DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization |
Chao Zhang et.al. |
2505.16995 |
null |
2025-05-22 |
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding |
Runpeng Yu et.al. |
2505.16990 |
link |
2025-05-22 |
T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning |
Amartya Chakraborty et.al. |
2505.16986 |
null |
2025-05-22 |
UFT: Unifying Supervised and Reinforcement Fine-Tuning |
Mingyang Liu et.al. |
2505.16984 |
link |
2025-05-22 |
LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding |
Junlong Tong et.al. |
2505.16983 |
link |
2025-05-22 |
Beyond Correlation: Towards Causal Large Language Model Agents in Biomedicine |
Adib Bazgir et.al. |
2505.16982 |
null |
2025-05-22 |
HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation |
Weizhi Tang et.al. |
2505.16978 |
link |
2025-05-22 |
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development |
Yaxin Du et.al. |
2505.16975 |
link |
2025-05-22 |
Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models |
Junjie Xiong et.al. |
2505.16957 |
null |
2025-05-22 |
A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization |
Shengyu Feng et.al. |
2505.16952 |
null |
2025-05-22 |
From Reality to Virtual Worlds: The Role of Photogrammetry in Game Development |
Santiago Berrezueta-Guzman et.al. |
2505.16951 |
null |
2025-05-22 |
Bottlenecked Transformers: Periodic KV Cache Abstraction for Generalised Reasoning |
Adnan Oomerjee et.al. |
2505.16950 |
null |
2025-05-22 |
MixAT: Combining Continuous and Discrete Adversarial Training for LLMs |
Csaba Dékány et.al. |
2505.16947 |
null |
2025-05-22 |
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios |
Yunjia Qi et.al. |
2505.16944 |
link |
2025-05-23 |
FoMoH: A clinically meaningful foundation model evaluation for structured electronic health records |
Chao Pang et.al. |
2505.16941 |
link |
2025-05-22 |
In-Context Watermarks for Large Language Models |
Yepeng Liu et.al. |
2505.16934 |
null |
2025-05-22 |
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning |
Zebin You et.al. |
2505.16933 |
null |
2025-05-22 |
UNCLE: Uncertainty Expressions in Long-Form Generation |
Ruihan Yang et.al. |
2505.16922 |
null |
2025-05-22 |
Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype |
Nikola Tankovic et.al. |
2505.16918 |
null |
2025-05-22 |
Backdoor Cleaning without External Guidance in MLLM Fine-tuning |
Xuankun Rong et.al. |
2505.16916 |
link |
2025-05-22 |
Unsupervised Prompting for Graph Neural Networks |
Peyman Baghershahi et.al. |
2505.16903 |
null |
2025-05-22 |
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks |
Hongyuan Tao et.al. |
2505.16901 |
null |
2025-05-23 |
Power-Law Decay Loss for Large Language Model Finetuning: Focusing on Information Sparsity to Enhance Generation Quality |
Jintian Shao et.al. |
2505.16900 |
link |
2025-05-22 |
Shadows in the Attention: Contextual Perturbation and Representation Drift in the Dynamics of Hallucination in LLMs |
Zeyu Wei et.al. |
2505.16894 |
null |
2025-05-22 |
CAIN: Hijacking LLM-Humans Conversations via a Two-Stage Malicious System Prompt Generation and Refining Framework |
Viet Pham et.al. |
2505.16888 |
null |
2025-05-22 |
Don’t “Overthink” Passage Reranking: Is Reasoning Truly Necessary? |
Nour Jedidi et.al. |
2505.16886 |
null |
2025-05-22 |
CASTILLO: Characterizing Response Length Distributions of Large Language Models |
Daniel F. Perez-Ramirez et.al. |
2505.16881 |
link |
2025-05-22 |
MPO: Multilingual Safety Alignment via Reward Gap Optimization |
Weixiang Zhao et.al. |
2505.16869 |
link |
2025-05-22 |
Conditional Panoramic Image Generation via Masked Autoregressive Modeling |
Chaoyang Wang et.al. |
2505.16862 |
null |
2025-05-22 |
Walk&Retrieve: Simple Yet Effective Zero-shot Retrieval-Augmented Generation via Knowledge Graph Walks |
Martin Böckling et.al. |
2505.16849 |
link |
2025-05-22 |
R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search |
Yibo Wang et.al. |
2505.16838 |
link |
2025-05-22 |
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis |
Shuang Sun et.al. |
2505.16834 |
link |
2025-05-22 |
From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Pedagogical Visualization |
Haonian Ji et.al. |
2505.16832 |
link |
2025-05-22 |
Unlearning Isn’t Deletion: Investigating Reversibility of Machine Unlearning in LLMs |
Xiaoyu Xu et.al. |
2505.16831 |
link |
2025-05-22 |
KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning |
Wei Sun et.al. |
2505.16826 |
link |
2025-05-22 |
Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts |
Taewon Kang et.al. |
2505.16819 |
null |
2025-05-22 |
DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation |
Bowen Zheng et.al. |
2505.16810 |
null |
2025-05-22 |
Two-way Evidence self-Alignment based Dual-Gated Reasoning Enhancement |
Kexin Zhang et.al. |
2505.16806 |
null |
2025-05-22 |
Learning Beyond Limits: Multitask Learning and Synthetic Data for Low-Resource Canonical Morpheme Segmentation |
Changbing Yang et.al. |
2505.16800 |
null |
2025-05-22 |
REOBench: Benchmarking Robustness of Earth Observation Foundation Models |
Xiang Li et.al. |
2505.16793 |
link |
2025-05-22 |
Accidental Misalignment: Fine-Tuning Language Models Induces Unexpected Vulnerability |
Punya Syon Pandey et.al. |
2505.16789 |
link |
2025-05-22 |
CoTSRF: Utilize Chain of Thought as Stealthy and Robust Fingerprint of Large Language Models |
Zhenzhen Ren et.al. |
2505.16785 |
null |
2025-05-22 |
Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning |
Xinghao Chen et.al. |
2505.16782 |
link |
2025-05-22 |
IFEval-Audio: Benchmarking Instruction-Following Capability in Audio-based Large Language Models |
Yiming Gao et.al. |
2505.16774 |
link |
2025-05-22 |
When Safety Detectors Aren’t Enough: A Stealthy and Effective Jailbreak Attack on LLMs via Steganographic Techniques |
Jianing Geng et.al. |
2505.16765 |
null |
2025-05-22 |
TRIM: Achieving Extreme Sparsity with Targeted Row-wise Iterative Metric-driven Pruning |
Florentin Beck et.al. |
2505.16743 |
link |
2025-05-22 |
Mitigating Fine-tuning Risks in LLMs via Safety-Aware Probing Optimization |
Chengcan Wu et.al. |
2505.16737 |
null |
2025-05-22 |
Forward-only Diffusion Probabilistic Models |
Ziwei Luo et.al. |
2505.16733 |
link |
2025-05-22 |
Masked Conditioning for Deep Generative Models |
Phillip Mueller et.al. |
2505.16725 |
null |
2025-05-22 |
Advancing Brainwave Modeling with a Codebook-Based Foundation Model |
Konstantinos Barmpas et.al. |
2505.16724 |
null |
2025-05-22 |
Breaking mBad! Supervised Fine-tuning for Cross-Lingual Detoxification |
Himanshu Beniwal et.al. |
2505.16722 |
link |
2025-05-22 |
Training Long-Context LLMs Efficiently via Chunk-wise Optimization |
Wenhao Li et.al. |
2505.16710 |
link |
2025-05-22 |
A Novel Generative Model with Causality Constraint for Mitigating Biases in Recommender Systems |
Jianfeng Deng et.al. |
2505.16708 |
null |
2025-05-22 |
KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models |
Yongliang Wu et.al. |
2505.16707 |
null |
2025-05-22 |
Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs |
Zeping Yu et.al. |
2505.16703 |
null |
2025-05-22 |
MCP-RADAR: A Multi-Dimensional Benchmark for Evaluating Tool Use Capabilities in Large Language Models |
Xuanqi Gao et.al. |
2505.16700 |
null |
2025-05-22 |
Software Architecture Meets LLMs: A Systematic Literature Review |
Larissa Schmid et.al. |
2505.16697 |
null |
2025-05-22 |
Sensitivity of ECG QRS Complexes to His-Purkinje Structure in Computational Heart Models |
Preetam V. Tanikella et.al. |
2505.16696 |
null |
2025-05-22 |
Beyond Induction Heads: In-Context Meta Learning Induces Multi-Phase Circuit Emergence |
Gouki Minegishi et.al. |
2505.16694 |
null |
2025-05-22 |
Your Pre-trained LLM is Secretly an Unsupervised Confidence Calibrator |
Beier Luo et.al. |
2505.16690 |
null |
2025-05-22 |
Semantic Compression of 3D Objects for Open and Collaborative Virtual Worlds |
Jordan Dotzel et.al. |
2505.16679 |
null |
2025-05-22 |
Hybrid Parameterized Quantum States for Variational Quantum Learning |
Chen-Yu Liu et.al. |
2505.16676 |
null |
2025-05-22 |
R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO |
Huanjin Yao et.al. |
2505.16673 |
link |
2025-05-22 |
BitHydra: Towards Bit-flip Inference Cost Attack against Large Language Models |
Xiaobei Yan et.al. |
2505.16670 |
null |
2025-05-22 |
SD-MAD: Sign-Driven Few-shot Multi-Anomaly Detection in Medical Images |
Kaiyu Guo et.al. |
2505.16659 |
null |
2025-05-22 |
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding |
Feilong Tang et.al. |
2505.16652 |
null |
2025-05-22 |
Collaboration among Multiple Large Language Models for Medical Question Answering |
Kexin Shang et.al. |
2505.16648 |
null |
2025-05-23 |
SMART: Self-Generating and Self-Validating Multi-Dimensional Assessment for LLMs’ Mathematical Problem Solving |
Yujie Hou et.al. |
2505.16646 |
null |
2025-05-22 |
From Evaluation to Defense: Advancing Safety in Video Large Language Models |
Yiwei Sun et.al. |
2505.16643 |
null |
2025-05-23 |
SSR-Zero: Simple Self-Rewarding Reinforcement Learning for Machine Translation |
Wenjie Yang et.al. |
2505.16637 |
link |
2025-05-22 |
WikiDBGraph: Large-Scale Database Graph of Wikidata for Collaborative Learning |
Zhaomin Wu et.al. |
2505.16635 |
null |
2025-05-22 |
Steering Large Language Models for Machine Translation Personalization |
Daniel Scalena et.al. |
2505.16612 |
link |
2025-05-22 |
From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment |
Jing Ye et.al. |
2505.16610 |
null |
2025-05-22 |
Evaluating Large Language Model with Knowledge Oriented Language Specific Simple Question Answering |
Bowen Jiang et.al. |
2505.16591 |
null |
2025-05-22 |
Beyond LLMs: An Exploration of Small Open-source Language Models in Logging Statement Generation |
Renyi Zhong et.al. |
2505.16590 |
null |
2025-05-22 |
A Survey on the Application of Large Language Models in Scenario-Based Testing of Automated Driving Systems |
Yongqi Zhao et.al. |
2505.16587 |
link |
2025-05-22 |
O$^2$-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering |
Jianbiao Mei et.al. |
2505.16582 |
link |
2025-05-22 |
Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning |
Siqu Ou et.al. |
2505.16579 |
link |
2025-05-22 |
Large Language Model-Empowered Interactive Load Forecasting |
Yu Zuo et.al. |
2505.16577 |
null |
2025-05-22 |
EMULATE: A Multi-Agent Framework for Determining the Veracity of Atomic Claims by Emulating Human Actions |
Spencer Hong et.al. |
2505.16576 |
link |
2025-05-22 |
URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training |
Dongyang Fan et.al. |
2505.16570 |
null |
2025-05-23 |
Finetuning-Activated Backdoors in LLMs |
Thibaud Gloaguen et.al. |
2505.16567 |
link |
2025-05-22 |
ScholarBench: A Bilingual Benchmark for Abstraction, Comprehension, and Reasoning Evaluation in Academic Contexts |
Dongwon Noh et.al. |
2505.16566 |
null |
2025-05-22 |
CTRAP: Embedding Collapse Trap to Safeguard Large Language Models from Harmful Fine-Tuning |
Biao Yi et.al. |
2505.16559 |
null |
2025-05-22 |
Is Your LLM-Based Multi-Agent a Reliable Real-World Planner? Exploring Fraud Detection in Travel Planning |
Junchi Yao et.al. |
2505.16557 |
null |
2025-05-23 |
Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains |
Wenhui Tan et.al. |
2505.16552 |
null |
2025-05-22 |
Incremental Sequence Classification with Temporal Consistency |
Lucas Maystre et.al. |
2505.16548 |
null |
2025-05-22 |
TextureSAM: Towards a Texture Aware Foundation Model for Segmentation |
Inbal Cohen et.al. |
2505.16540 |
null |
2025-05-22 |
Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models |
Ercong Nie et.al. |
2505.16538 |
null |
2025-05-22 |
HOFT: Householder Orthogonal Fine-tuning |
Alejandro Moreno Arcas et.al. |
2505.16531 |
null |
2025-05-22 |
DuFFin: A Dual-Level Fingerprinting Framework for LLMs IP Protection |
Yuliang Yan et.al. |
2505.16530 |
link |
2025-05-21 |
On the creation of narrow AI: hierarchy and nonlocality of neural network skills |
Eric J. Michaud et.al. |
2505.15811 |
link |
2025-05-21 |
MMaDA: Multimodal Large Diffusion Language Models |
Ling Yang et.al. |
2505.15809 |
link |
2025-05-21 |
Neural Conditional Transport Maps |
Carlos Rodriguez-Pardo et.al. |
2505.15808 |
null |
2025-05-21 |
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation |
Patrick Kahardipraja et.al. |
2505.15807 |
link |
2025-05-21 |
Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering |
Hwan Chang et.al. |
2505.15805 |
link |
2025-05-21 |
STAR-R1: Spacial TrAnsformation Reasoning by Reinforcing Multimodal LLMs |
Zongzhao Li et.al. |
2505.15804 |
link |
2025-05-21 |
VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models |
Yuchen Yan et.al. |
2505.15801 |
null |
2025-05-21 |
Interspatial Attention for Efficient 4D Human Video Generation |
Ruizhi Shao et.al. |
2505.15800 |
null |
2025-05-21 |
Reverse Engineering Human Preferences with Reinforcement Learning |
Lisa Alazraki et.al. |
2505.15795 |
null |
2025-05-21 |
HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving |
Zhiwen Chen et.al. |
2505.15793 |
null |
2025-05-21 |
Large Language Models as Computable Approximations to Solomonoff Induction |
Jun Wan et.al. |
2505.15784 |
null |
2025-05-21 |
IA-T2I: Internet-Augmented Text-to-Image Generation |
Chuanhao Li et.al. |
2505.15779 |
null |
2025-05-21 |
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning |
Changtai Zhu et.al. |
2505.15776 |
link |
2025-05-21 |
Beyond Hard and Soft: Hybrid Context Compression for Balancing Local and Global Information Retention |
Huanxuan Liao et.al. |
2505.15774 |
link |
2025-05-21 |
MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling |
Cheng Yifan et.al. |
2505.15772 |
null |
2025-05-21 |
Constructing a 3D Town from a Single Image |
Kaizhi Zheng et.al. |
2505.15765 |
null |
2025-05-21 |
An Empirical Analysis of Vulnerability Detection Tools for Solidity Smart Contracts Using Line Level Manually Annotated Vulnerabilities |
Francesco Salzano et.al. |
2505.15756 |
null |
2025-05-21 |
Exploring The Visual Feature Space for Multimodal Neural Decoding |
Weihao Xia et.al. |
2505.15755 |
null |
2025-05-21 |
Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval |
Taiye Chen et.al. |
2505.15753 |
null |
2025-05-21 |
Multi-modal Integration Analysis of Alzheimer’s Disease Using Large Language Models and Knowledge Graphs |
Kanan Kiguchi et.al. |
2505.15747 |
null |
2025-05-21 |
Evolutionary Computation and Large Language Models: A Survey of Methods, Synergies, and Applications |
Dikshit Chauhan et.al. |
2505.15741 |
null |
2025-05-21 |
HybridProver: Augmenting Theorem Proving with LLM-Driven Proof Synthesis and Refinement |
Jilin Hu et.al. |
2505.15740 |
null |
2025-05-21 |
Alignment Under Pressure: The Case for Informed Adversaries When Evaluating LLM Defenses |
Xiaoxue Yang et.al. |
2505.15738 |
link |
2025-05-21 |
DEBATE, TRAIN, EVOLVE: Self Evolution of Language Model Reasoning |
Gaurav Srivastava et.al. |
2505.15734 |
null |
2025-05-21 |
VocalBench: Benchmarking the Vocal Conversational Abilities for Speech Interaction Models |
Heyang Liu et.al. |
2505.15727 |
link |
2025-05-21 |
Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities |
Xiaoyu Luo et.al. |
2505.15722 |
null |
2025-05-21 |
Privacy-Preserving Conformal Prediction Under Local Differential Privacy |
Coby Penso et.al. |
2505.15721 |
link |
2025-05-21 |
Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling |
He Hu et.al. |
2505.15715 |
null |
2025-05-21 |
TurnaboutLLM: A Deductive Reasoning Benchmark from Detective Games |
Yuan Yuan et.al. |
2505.15712 |
null |
2025-05-21 |
Advancing LLM Safe Alignment with Safety Representation Ranking |
Tianqi Du et.al. |
2505.15710 |
null |
2025-05-21 |
LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing |
Peng Wang et.al. |
2505.15702 |
link |
2025-05-21 |
HDLxGraph: Bridging Large Language Models and HDL Repositories via HDL Graph Databases |
Pingqing Zheng et.al. |
2505.15701 |
link |
2025-05-21 |
Can Large Language Models be Effective Online Opinion Miners? |
Ryang Heo et.al. |
2505.15695 |
null |
2025-05-21 |
Toward Open Earth Science as Fast and Accessible as Natural Language |
Marquita Ellis et.al. |
2505.15690 |
null |
2025-05-21 |
From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems |
Xiuchao Sui et.al. |
2505.15685 |
link |
2025-05-21 |
ThinkLess: A Training-Free Inference-Efficient Method for Reducing Reasoning Redundancy |
Gengyang Li et.al. |
2505.15684 |
null |
2025-05-21 |
UniErase: Unlearning Token as a Universal Erasure Primitive for Language Models |
Miao Yu et.al. |
2505.15674 |
link |
2025-05-21 |
Graph Conditional Flow Matching for Relational Data Generation |
Davide Scassola et.al. |
2505.15668 |
link |
2025-05-21 |
Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization |
Jiaming Zhou et.al. |
2505.15660 |
link |
2025-05-21 |
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen! |
Zhexin Zhang et.al. |
2505.15656 |
link |
2025-05-21 |
Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models |
Zihao Li et.al. |
2505.15634 |
null |
2025-05-21 |
Listen to the Context: Towards Faithful Large Language Models for Retrieval Augmented Generation on Climate Questions |
David Thulke et.al. |
2505.15633 |
null |
2025-05-21 |
Can LLMs $\textit{understand}$ Math? – Exploring the Pitfalls in Mathematical Reasoning |
Tiasa Singha Roy et.al. |
2505.15623 |
null |
2025-05-21 |
DS-Bench: A Realistic Benchmark for Data Science Code Generation |
Shuyin Ouyang et.al. |
2505.15621 |
link |
2025-05-21 |
LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models |
Ruilin Yao et.al. |
2505.15616 |
null |
2025-05-21 |
From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning |
David Dinucu-Jianu et.al. |
2505.15607 |
link |
2025-05-21 |
Beyond Classification: Evaluating Diffusion Denoised Smoothing for Security-Utility Trade off |
Yury Belousov et.al. |
2505.15594 |
null |
2025-05-21 |
Federated Learning with Unlabeled Clients: Personalization Can Happen in Low Dimensions |
Hossein Zakerinia et.al. |
2505.15579 |
null |
2025-05-21 |
Bridging the Domain Gap in Equation Distillation with Reinforcement Feedback |
Wangyang Ying et.al. |
2505.15572 |
null |
2025-05-21 |
Moonbeam: A MIDI Foundation Model Using Both Absolute and Relative Music Attributes |
Zixun Guo et.al. |
2505.15559 |
null |
2025-05-21 |
DayDreamer at CQs-Gen 2025: Generating Critical Questions through Argument Scheme Completion |
Wendi Zhou et.al. |
2505.15554 |
null |
2025-05-21 |
Social Bias in Popular Question-Answering Benchmarks |
Angelie Kraft et.al. |
2505.15553 |
null |
2025-05-21 |
Evaluate Bias without Manual Test Sets: A Concept Representation Perspective for LLMs |
Lang Gao et.al. |
2505.15524 |
null |
2025-05-21 |
Prompt Tuning Vision Language Models with Margin Regularizer for Few-Shot Learning under Distribution Shifts |
Debarshi Brahma et.al. |
2505.15506 |
link |
2025-05-21 |
Protoknowledge Shapes Behaviour of LLMs in Downstream Tasks: Memorization and Generalization with Knowledge Graphs |
Federico Ranaldi et.al. |
2505.15501 |
null |
2025-05-21 |
KaFT: Knowledge-aware Fine-tuning for Boosting LLMs’ Domain-specific Question-Answering Performance |
Qihuang Zhong et.al. |
2505.15480 |
null |
2025-05-21 |
LFTF: Locating First and Then Fine-Tuning for Mitigating Gender Bias in Large Language Models |
Zhanyue Qin et.al. |
2505.15475 |
null |
2025-05-21 |
PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions |
Song Dai et.al. |
2505.15472 |
null |
2025-05-21 |
CoLA: Collaborative Low-Rank Adaptation |
Yiyun Zhou et.al. |
2505.15471 |
link |
2025-05-21 |
A Qualitative Investigation into LLM-Generated Multilingual Code Comments and Automatic Evaluation Metrics |
Jonathan Katzy et.al. |
2505.15469 |
null |
2025-05-21 |
Joint Flashback Adaptation for Forgetting-Resistant Instruction Tuning |
Yukun Zhao et.al. |
2505.15467 |
null |
2025-05-21 |
Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment |
Weixiang Zhao et.al. |
2505.15456 |
null |
2025-05-21 |
ViaRL: Adaptive Temporal Grounding via Visual Iterated Amplification Reinforcement Learning |
Ziqiang Xu et.al. |
2505.15447 |
null |
2025-05-21 |
On the Generalization vs Fidelity Paradox in Knowledge Distillation |
Suhas Kamasetty Ramesh et.al. |
2505.15442 |
link |
2025-05-21 |
Bridging Sign and Spoken Languages: Pseudo Gloss Generation for Sign Language Translation |
Jianyuan Guo et.al. |
2505.15438 |
null |
2025-05-21 |
Set-LLM: A Permutation-Invariant LLM |
Beni Egressy et.al. |
2505.15433 |
null |
2025-05-21 |
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought |
Ao Liu et.al. |
2505.15431 |
null |
2025-05-21 |
Silent Leaks: Implicit Knowledge Extraction Attack on RAG Systems through Benign Queries |
Yuhao Wang et.al. |
2505.15420 |
null |
2025-05-21 |
ClickSight: Interpreting Student Clickstreams to Reveal Insights on Learning Strategies via LLMs |
Bahar Radmehr et.al. |
2505.15410 |
link |
2025-05-21 |
Reranking with Compressed Document Representation |
Hervé Déjean et.al. |
2505.15394 |
null |
2025-05-21 |
An Empirical Study of the Anchoring Effect in LLMs: Existence, Mechanism, and Potential Mitigations |
Yiming Huang et.al. |
2505.15392 |
null |
2025-05-21 |
RePPL: Recalibrating Perplexity by Uncertainty in Semantic Propagation and Language Generation for Explainable QA Hallucination Detection |
Yiming Huang et.al. |
2505.15386 |
null |
2025-05-21 |
X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System |
Peng Wang et.al. |
2505.15372 |
link |
2025-05-21 |
AI vs. Human Judgment of Content Moderation: LLM-as-a-Judge and Ethics-Based Response Refusals |
Stefan Pasch et.al. |
2505.15365 |
null |
2025-05-21 |
NL-Debugging: Exploiting Natural Language as an Intermediate Representation for Code Debugging |
Weiming Zhang et.al. |
2505.15356 |
null |
2025-05-21 |
FlowKV: Enhancing Multi-Turn Conversational Coherence in LLMs via Isolated Key-Value Cache Management |
Xiang Liu et.al. |
2505.15347 |
null |
2025-05-21 |
SSR: Speculative Parallel Scaling Reasoning in Test-time |
Yuanlin Chu et.al. |
2505.15340 |
null |
2025-05-21 |
Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors |
Hao Fang et.al. |
2505.15337 |
null |
2025-05-21 |
Parameter-Efficient Fine-Tuning of Multispectral Foundation Models for Hyperspectral Image Classification |
Bernardin Ligan et.al. |
2505.15334 |
null |
2025-05-21 |
Towards Zero-Shot Differential Morphing Attack Detection with Multimodal Large Language Models |
Ria Shekhawat et.al. |
2505.15332 |
null |
2025-05-21 |
Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack |
Silvia Cappelletti et.al. |
2505.15323 |
null |
2025-05-21 |
Emotional Supporters often Use Multiple Strategies in a Single Turn |
Xin Bai et.al. |
2505.15316 |
null |
2025-05-21 |
Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning |
Yurun Yuan et.al. |
2505.15311 |
null |
2025-05-21 |
Towards Pre-training an Effective Respiratory Audio Foundation Model |
Daisuke Niizumi et.al. |
2505.15307 |
link |
2025-05-21 |
Multiple Weaks Win Single Strong: Large Language Models Ensemble Weak Reinforcement Learning Agents into a Supreme One |
Yiwen Song et.al. |
2505.15306 |
null |
2025-05-21 |
Chinese Toxic Language Mitigation via Sentiment Polarity Consistent Rewrites |
Xintong Wang et.al. |
2505.15297 |
null |
2025-05-21 |
LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models |
Qianyue Hao et.al. |
2505.15293 |
null |
2025-05-21 |
Hallucinate at the Last in Long Response Generation: A Case Study on Long Document Summarization |
Joonho Yang et.al. |
2505.15291 |
null |
2025-05-21 |
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents |
Hyungjoo Chae et.al. |
2505.15277 |
link |
2025-05-21 |
Scaling Diffusion Transformers Efficiently via $\mu$P |
Chenyu Zheng et.al. |
2505.15270 |
link |
2025-05-21 |
LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval |
Zhenyu Ning et.al. |
2505.15269 |
null |
2025-05-21 |
Blind Spot Navigation: Evolutionary Discovery of Sensitive Semantic Concepts for LVLMs |
Zihao Pan et.al. |
2505.15265 |
null |
2025-05-21 |
gen2seg: Generative Models Enable Generalizable Instance Segmentation |
Om Khangaonkar et.al. |
2505.15263 |
null |
2025-05-21 |
ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search |
Hyunseok Lee et.al. |
2505.15259 |
null |
2025-05-21 |
When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners |
Weixiang Zhao et.al. |
2505.15257 |
null |
2025-05-21 |
MentalMAC: Enhancing Large Language Models for Detecting Mental Manipulation via Multi-Task Anti-Curriculum Distillation |
Yuansheng Gao et.al. |
2505.15255 |
null |
2025-05-21 |
Towards Explainable Temporal Reasoning in Large Language Models: A Structure-Aware Generative Framework |
Zihao Jiang et.al. |
2505.15245 |
link |
2025-05-21 |
Adaptive Plan-Execute Framework for Smart Contract Security Auditing |
Zhiyuan Wei et.al. |
2505.15242 |
null |
2025-05-21 |
Multilingual Prompting for Improving LLM Generation Diversity |
Qihan Wang et.al. |
2505.15229 |
null |
2025-05-21 |
Multimodal Conditional Information Bottleneck for Generalizable AI-Generated Image Detection |
Haotian Qin et.al. |
2505.15217 |
link |
2025-05-20 |
Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning |
Haolei Xu et.al. |
2505.14684 |
null |
2025-05-20 |
Emerging Properties in Unified Multimodal Pretraining |
Chaorui Deng et.al. |
2505.14683 |
null |
2025-05-20 |
UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation |
Rui Tian et.al. |
2505.14682 |
null |
2025-05-20 |
NExT-Search: Rebuilding User Feedback Ecosystem for Generative AI Search |
Sunhao Dai et.al. |
2505.14680 |
null |
2025-05-20 |
UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models |
Xiaojie Gu et.al. |
2505.14679 |
link |
2025-05-20 |
Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning |
Jiaer Xia et.al. |
2505.14677 |
null |
2025-05-20 |
Reward Reasoning Model |
Jiaxin Guo et.al. |
2505.14674 |
null |
2025-05-20 |
Training-Free Watermarking for Autoregressive Image Generation |
Yu Tong et.al. |
2505.14673 |
null |
2025-05-20 |
Quartet: Native FP4 Training Can Be Optimal for Large Language Models |
Roberto L. Castro et.al. |
2505.14669 |
link |
2025-05-20 |
ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions |
Bufang Yang et.al. |
2505.14668 |
null |
2025-05-20 |
Beyond Words: Multimodal LLM Knows When to Speak |
Zikai Liao et.al. |
2505.14654 |
null |
2025-05-20 |
General-Reasoner: Advancing LLM Reasoning Across All Domains |
Xueguang Ma et.al. |
2505.14652 |
null |
2025-05-20 |
Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits |
Tiantian Feng et.al. |
2505.14648 |
link |
2025-05-21 |
Think Only When You Need with Large Hybrid-Reasoning Models |
Lingjie Jiang et.al. |
2505.14631 |
null |
2025-05-20 |
KERL: Knowledge-Enhanced Personalized Recipe Recommendation using Large Language Models |
Fnu Mohbat et.al. |
2505.14629 |
link |
2025-05-20 |
Debating for Better Reasoning: An Unsupervised Multimodal Approach |
Ashutosh Adhikari et.al. |
2505.14627 |
null |
2025-05-20 |
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning |
Zhangchen Xu et.al. |
2505.14625 |
link |
2025-05-20 |
Enhancing Learned Knowledge in LoRA Adapters Through Efficient Contrastive Decoding on Ascend NPUs |
Morgan Lindsay Heisler et.al. |
2505.14620 |
null |
2025-05-20 |
Linear Control of Test Awareness Reveals Differential Compliance in Reasoning Models |
Sahar Abdelnabi et.al. |
2505.14617 |
null |
2025-05-20 |
SATBench: Benchmarking LLMs’ Logical Reasoning via Automated Puzzle Generation from SAT Formulas |
Anjiang Wei et.al. |
2505.14615 |
null |
2025-05-20 |
sudoLLM: On Multi-role Alignment of Language Models |
Soumadeep Saha et.al. |
2505.14607 |
null |
2025-05-20 |
Towards a Foundation Model for Communication Systems |
Davide Buffelli et.al. |
2505.14603 |
null |
2025-05-20 |
Toward Reliable Biomedical Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models |
Guangzhi Xiong et.al. |
2505.14599 |
link |
2025-05-20 |
Context Reasoner: Incentivizing Reasoning Capability for Contextualized Privacy and Safety Compliance via Reinforcement Learning |
Wenbin Hu et.al. |
2505.14585 |
null |
2025-05-20 |
TRATES: Trait-Specific Rubric-Assisted Cross-Prompt Essay Scoring |
Sohaila Eltanbouly et.al. |
2505.14577 |
null |
2025-05-20 |
Neural Inverse Scattering with Score-based Regularization |
Yuan Gao et.al. |
2505.14560 |
null |
2025-05-21 |
KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation |
Jiajun Shi et.al. |
2505.14552 |
link |
2025-05-20 |
Can Large Language Models Really Recognize Your Name? |
Dzung Pham et.al. |
2505.14549 |
link |
2025-05-20 |
Time to Embed: Unlocking Foundation Models for Time Series with Channel Descriptions |
Utsav Dutta et.al. |
2505.14543 |
null |
2025-05-20 |
Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders |
Agam Goyal et.al. |
2505.14536 |
null |
2025-05-20 |
Internal Chain-of-Thought: Empirical Evidence for Layer-wise Subtask Scheduling in LLMs |
Zhipeng Yang et.al. |
2505.14530 |
link |
2025-05-20 |
BugRepro: Enhancing Android Bug Reproduction with Domain-Specific Knowledge Integration |
Hongrong Yin et.al. |
2505.14528 |
null |
2025-05-20 |
Guarded Query Routing for Large Language Models |
Richard Šléher et.al. |
2505.14524 |
link |
2025-05-21 |
Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling |
Zhihao Li et.al. |
2505.14521 |
null |
2025-05-20 |
Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples |
Chun-Yi Kuan et.al. |
2505.14518 |
null |
2025-05-20 |
Latent Flow Transformer |
Yen-Chen Wu et.al. |
2505.14513 |
link |
2025-05-20 |
ModRWKV: Transformer Multimodality in Linear Time |
Jiale Kang et.al. |
2505.14505 |
link |
2025-05-20 |
Enhanced Multimodal Aspect-Based Sentiment Analysis by LLM-Generated Rationales |
Jun Cao et.al. |
2505.14499 |
null |
2025-05-20 |
Reasoning Models Better Express Their Confidence |
Dongkeun Yoon et.al. |
2505.14489 |
link |
2025-05-20 |
MoMoE: Mixture of Moderation Experts Framework for AI-Assisted Online Governance |
Agam Goyal et.al. |
2505.14483 |
null |
2025-05-20 |
Towards Reliable Proof Generation with LLMs: A Neuro-Symbolic Approach |
Oren Sultan et.al. |
2505.14479 |
null |
2025-05-20 |
Enhancing Interpretability of Sparse Latent Representations with Class Information |
Farshad Sangari Abiz et.al. |
2505.14476 |
null |
2025-05-20 |
Attributional Safety Failures in Large Language Models under Code-Mixed Perturbations |
Somnath Banerjee et.al. |
2505.14469 |
null |
2025-05-20 |
ServerlessLoRA: Minimizing Latency and Cost in Serverless Inference for LoRA-Based LLMs |
Yifan Sui et.al. |
2505.14468 |
null |
2025-05-20 |
VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank |
Tianhe Wu et.al. |
2505.14460 |
link |
2025-05-20 |
Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models |
Xuyang Liu et.al. |
2505.14454 |
link |
2025-05-20 |
Creative Preference Optimization |
Mete Ismayilzada et.al. |
2505.14442 |
null |
2025-05-20 |
S2SBench: A Benchmark for Quantifying Intelligence Degradation in Speech-to-Speech Large Language Models |
Yuanbo Fang et.al. |
2505.14438 |
link |
2025-05-20 |
Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in Large Language Models |
Yuqiao Tan et.al. |
2505.14436 |
link |
2025-05-20 |
Choosing a Model, Shaping a Future: Comparing LLM Perspectives on Sustainability and its Relationship with AI |
Annika Bush et.al. |
2505.14435 |
null |
2025-05-20 |
Rank-K: Test-Time Reasoning for Listwise Reranking |
Eugene Yang et.al. |
2505.14432 |
link |
2025-05-20 |
From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning |
Chalamalasetti Kranti et.al. |
2505.14425 |
null |
2025-05-20 |
MindVote: How LLMs Predict Human Decision-Making in Social Media Polls |
Xutao Mao et.al. |
2505.14422 |
null |
2025-05-20 |
Hidden Ghost Hand: Unveiling Backdoor Vulnerabilities in MLLM-Powered Mobile GUI Agents |
Pengzhou Cheng et.al. |
2505.14418 |
null |
2025-05-20 |
Towards Non-Euclidean Foundation Models: Advancing AI Beyond Euclidean Frameworks |
Menglin Yang et.al. |
2505.14417 |
null |
2025-05-20 |
Table Foundation Models: on knowledge pre-training for tabular learning |
Myung Jun Kim et.al. |
2505.14415 |
null |
2025-05-20 |
Diving into the Fusion of Monocular Priors for Generalized Stereo Matching |
Chengtang Yao et.al. |
2505.14414 |
link |
2025-05-20 |
Byte Pair Encoding for Efficient Time Series Forecasting |
Leon Götz et.al. |
2505.14411 |
null |
2025-05-21 |
Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis |
Haoming Huang et.al. |
2505.14406 |
null |
2025-05-20 |
OmniGenBench: A Modular Platform for Reproducible Genomic Foundation Models Benchmarking |
Heng Yang et.al. |
2505.14402 |
link |
2025-05-20 |
Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation |
Peter Baile Chen et.al. |
2505.14398 |
null |
2025-05-20 |
Causal Cartographer: From Mapping to Reasoning Over Counterfactual Worlds |
Gaël Gendron et.al. |
2505.14396 |
link |
2025-05-20 |
MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language |
Seyoung Song et.al. |
2505.14395 |
link |
2025-05-20 |
Knowledge Graph Based Repository-Level Code Generation |
Mihir Athale et.al. |
2505.14394 |
null |
2025-05-20 |
SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation |
Yuyang Dong et.al. |
2505.14381 |
null |
2025-05-20 |
AutoRev: Automatic Peer Review System for Academic Research Papers |
Maitreya Prafulla Chitale et.al. |
2505.14376 |
null |
2025-05-20 |
Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs |
Jiawen Wang et.al. |
2505.14368 |
null |
2025-05-21 |
Dual Decomposition of Weights and Singular Value Low Rank Adaptation |
Jialong Han et.al. |
2505.14367 |
null |
2025-05-20 |
Vision-Language Modeling Meets Remote Sensing: Models, Datasets and Perspectives |
Xingxing Weng et.al. |
2505.14361 |
null |
2025-05-20 |
Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable |
Ruoxin Chen et.al. |
2505.14359 |
null |
2025-05-20 |
PersonaTAB: Predicting Personality Traits using Textual, Acoustic, and Behavioral Cues in Fully-Duplex Speech Dialogs |
Sho Inoue et.al. |
2505.14356 |
link |
2025-05-20 |
WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications |
Xin Li et.al. |
2505.14354 |
null |
2025-05-21 |
OSoRA: Output-Dimension and Singular-Value Initialized Low-Rank Adaptation |
Jialong Han et.al. |
2505.14350 |
null |
2025-05-20 |
QA-prompting: Improving Summarization with Large Language Models using Question-Answering |
Neelabh Sinha et.al. |
2505.14347 |
link |
2025-05-20 |
Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach |
Umberto Cappellazzo et.al. |
2505.14336 |
null |
2025-05-20 |
Handloom Design Generation Using Generative Networks |
Rajat Kanti Bhattacharjee et.al. |
2505.14330 |
null |
2025-05-20 |
RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection |
Wenjun Hou et.al. |
2505.14318 |
link |
2025-05-20 |
Exploring Jailbreak Attacks on LLMs through Intent Concealment and Diversion |
Tiehan Cui et.al. |
2505.14316 |
null |
2025-05-20 |
Low-Cost FlashAttention with Fused Exponential and Multiplication Hardware Operators |
Kosmas Alexandridis et.al. |
2505.14314 |
null |
2025-05-20 |
A MIND for Reasoning: Meta-learning for In-context Deduction |
Leonardo Bertolazzi et.al. |
2505.14313 |
link |
2025-05-20 |
HausaNLP: Current Status, Challenges and Future Directions for Hausa Natural Language Processing |
Shamsuddeen Hassan Muhammad et.al. |
2505.14311 |
null |
2025-05-20 |
JOLT-SQL: Joint Loss Tuning of Text-to-SQL with Confusion-aware Noisy Schema Sampling |
Jinwang Song et.al. |
2505.14305 |
link |
2025-05-20 |
Scaling Law for Quantization-Aware Training |
Mengzhao Chen et.al. |
2505.14302 |
null |
2025-05-20 |
SafetyNet: Detecting Harmful Outputs in LLMs by Modeling and Monitoring Deceptive Behaviors |
Maheep Chaudhary et.al. |
2505.14300 |
null |
2025-05-20 |
Empowering LLMs in Task-Oriented Dialogues: A Domain-Independent Multi-Agent Framework and Fine-Tuning Strategy |
Zihao Feng et.al. |
2505.14299 |
null |
2025-05-20 |
Cross-Lingual Optimization for Language Transfer in Large Language Models |
Jungseob Lee et.al. |
2505.14297 |
null |
2025-05-20 |
Universal Acoustic Adversarial Attacks for Flexible Control of Speech-LLMs |
Rao Ma et.al. |
2505.14286 |
null |
2025-05-20 |
YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering |
Jennifer D’Souza et.al. |
2505.14279 |
null |
2025-05-20 |
Think-J: Learning to Think for Generative LLM-as-a-Judge |
Hui Huang et.al. |
2505.14268 |
link |
2025-05-20 |
AAPO: Enhance the Reasoning Capabilities of LLMs with Advantage Momentum |
Jian Xiong et.al. |
2505.14264 |
null |
2025-05-20 |
Speculative Decoding Reimagined for Multimodal Large Language Models |
Luxi Lin et.al. |
2505.14260 |
link |
2025-05-20 |
FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation |
Shaolin Zhu et.al. |
2505.14256 |
null |
2025-05-20 |
TransBench: Benchmarking Machine Translation for Industrial-Scale Applications |
Haijun Li et.al. |
2505.14244 |
null |
2025-05-20 |
ABBA: Highly Expressive Hadamard Product Adaptation for Large Language Models |
Raghav Singhal et.al. |
2505.14238 |
link |
2025-05-20 |
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning |
Sule Bai et.al. |
2505.14231 |
null |
2025-05-20 |
“Haet Bhasha aur Diskrimineshun”: Phonetic Perturbations in Code-Mixed Hinglish to Red-Team LLMs |
Darpan Aswal et.al. |
2505.14226 |
null |
2025-05-20 |
Automatic Dataset Generation for Knowledge Intensive Question Answering Tasks |
Sizhe Yuen et.al. |
2505.14212 |
null |
2025-05-20 |
Challenges and Limitations in the Synthetic Generation of mHealth Sensor Data |
Flavio Di Martino et.al. |
2505.14206 |
null |
2025-05-20 |
Capturing the Effects of Quantization on Trojans in Code LLMs |
Aftab Hussain et.al. |
2505.14200 |
null |
2025-05-20 |
Towards Omnidirectional Reasoning with 360-R1: A Dataset, Benchmark, and GRPO-based Method |
Xinshen Zhang et.al. |
2505.14197 |
null |
2025-05-19 |
Mean Flows for One-step Generative Modeling |
Zhengyang Geng et.al. |
2505.13447 |
null |
2025-05-19 |
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards |
Xiaoyuan Liu et.al. |
2505.13445 |
link |
2025-05-19 |
Optimizing Anytime Reasoning via Budget Relative Policy Optimization |
Penghui Qi et.al. |
2505.13438 |
link |
2025-05-19 |
SMOTExT: SMOTE meets Large Language Models |
Mateusz Bystroński et.al. |
2505.13434 |
null |
2025-05-19 |
Synthetic-Powered Predictive Inference |
Meshi Bashari et.al. |
2505.13432 |
link |
2025-05-19 |
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization |
Sifeng Shang et.al. |
2505.13430 |
null |
2025-05-19 |
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision |
Lingxiao Du et.al. |
2505.13427 |
link |
2025-05-19 |
Learnware of Language Models: Specialized Small Language Models Can Do Big |
Zhi-Hao Tan et.al. |
2505.13425 |
link |
2025-05-19 |
Make Still Further Progress: Chain of Thoughts for Tabular Data Leaderboard |
Si-Yang Liu et.al. |
2505.13421 |
null |
2025-05-19 |
FEALLM: Advancing Facial Emotion Analysis in Multimodal Large Language Models with Emotional Synergy and Reasoning |
Zhuozhao Hu et.al. |
2505.13419 |
link |
2025-05-19 |
CoT-Kinetics: A Theoretical Modeling Assessing LRM Reasoning Process |
Jinhe Bi et.al. |
2505.13408 |
null |
2025-05-19 |
AutoMathKG: The automated mathematical knowledge graph based on LLM and vector database |
Rong Bian et.al. |
2505.13406 |
null |
2025-05-19 |
MR. Judge: Multimodal Reasoner as a Judge |
Renjie Pi et.al. |
2505.13403 |
null |
2025-05-19 |
CompeteSMoE – Statistically Guaranteed Mixture of Experts Training via Competition |
Nam V. Nguyen et.al. |
2505.13380 |
link |
2025-05-19 |
Restoration Score Distillation: From Corrupted Diffusion Pretraining to One-Step High-Quality Generation |
Yasi Zhang et.al. |
2505.13377 |
null |
2025-05-19 |
Seeing, Saying, Solving: An LLM-to-TL Framework for Cooperative Robots |
Dan BW Choe et.al. |
2505.13376 |
null |
2025-05-19 |
Minimum-Excess-Work Guidance |
Christopher Kolloff et.al. |
2505.13375 |
null |
2025-05-19 |
One-Step Offline Distillation of Diffusion-based Models via Koopman Modeling |
Nimrod Berman et.al. |
2505.13358 |
link |
2025-05-19 |
Multi-Armed Bandits Meet Large Language Models |
Djallel Bouneffouf et.al. |
2505.13355 |
null |
2025-05-20 |
Sense and Sensitivity: Examining the Influence of Semantic Recall on Long Context Code Reasoning |
Adam Štorek et.al. |
2505.13353 |
null |
2025-05-19 |
Investigating the Vulnerability of LLM-as-a-Judge Architectures to Prompt-Injection Attacks |
Narek Maloyan et.al. |
2505.13348 |
null |
2025-05-19 |
J4R: Learning to Judge with Equivalent Initial State Group Relative Preference Optimization |
Austin Xu et.al. |
2505.13346 |
null |
2025-05-19 |
Thinking Short and Right Over Thinking Long: Serving LLM Reasoning Efficiently and Accurately |
Yuhang Wang et.al. |
2505.13326 |
null |
2025-05-19 |
VesselGPT: Autoregressive Modeling of Vascular Geometry |
Paula Feldman et.al. |
2505.13318 |
null |
2025-05-19 |
GUARD: Generation-time LLM Unlearning via Adaptive Restriction and Detection |
Zhijie Deng et.al. |
2505.13312 |
null |
2025-05-19 |
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space |
Hengli Li et.al. |
2505.13308 |
link |
2025-05-19 |
RBF++: Quantifying and Optimizing Reasoning Boundaries across Measurable and Unmeasurable Capabilities for Chain-of-Thought Reasoning |
Qiguang Chen et.al. |
2505.13307 |
link |
2025-05-19 |
I’ll believe it when I see it: Images increase misinformation sharing in Vision-Language Models |
Alice Plebe et.al. |
2505.13302 |
link |
2025-05-19 |
TimeSeriesGym: A Scalable Benchmark for (Time Series) Machine Learning Engineering Agents |
Yifu Cai et.al. |
2505.13291 |
link |
2025-05-19 |
Hybrid Voting-Based Task Assignment in Modular Construction Scenarios |
Daniel Weiner et.al. |
2505.13278 |
null |
2025-05-19 |
CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning |
Lei Sheng et.al. |
2505.13271 |
link |
2025-05-19 |
Distilling a speech and music encoder with task arithmetic |
Fabian Ritter-Gutierrez et.al. |
2505.13270 |
null |
2025-05-19 |
Are requirements really all you need? A case study of LLM-driven configuration code generation for automotive simulations |
Krzysztof Lebioda et.al. |
2505.13263 |
null |
2025-05-19 |
From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery |
Tianshi Zheng et.al. |
2505.13259 |
link |
2025-05-19 |
Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability |
Jingyi Ren et.al. |
2505.13258 |
link |
2025-05-19 |
Policy Contrastive Decoding for Robotic Foundation Models |
Shihan Wu et.al. |
2505.13255 |
link |
2025-05-19 |
HeteroSpec: Leveraging Contextual Heterogeneity for Efficient Speculative Decoding |
Siran Liu et.al. |
2505.13254 |
null |
2025-05-19 |
RN-F: A Novel Approach for Mitigating Contaminated Data in Large Language Models |
Le Vu Anh et.al. |
2505.13249 |
link |
2025-05-19 |
JNLP at SemEval-2025 Task 11: Cross-Lingual Multi-Label Emotion Detection Using Generative Models |
Jieying Xue et.al. |
2505.13244 |
link |
2025-05-19 |
Conformalized Decision Risk Assessment |
Wenbin Zhou et.al. |
2505.13243 |
null |
2025-05-19 |
SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information |
Chih-Kai Yang et.al. |
2505.13237 |
link |
2025-05-19 |
From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection |
Lincan Cai et.al. |
2505.13233 |
link |
2025-05-19 |
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis |
Tianbao Xie et.al. |
2505.13227 |
null |
2025-05-19 |
SeedBench: A Multi-task Benchmark for Evaluating Large Language Models in Seed Science |
Jie Ying et.al. |
2505.13220 |
link |
2025-05-19 |
Diffusion Models with Double Guidance: Generate with aggregated datasets |
Yanfeng Yang et.al. |
2505.13213 |
null |
2025-05-19 |
Quantum Knowledge Distillation for Large Language Models |
Lingxiao Li et.al. |
2505.13205 |
null |
2025-05-19 |
Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification |
Jikai Wang et.al. |
2505.13204 |
null |
2025-05-19 |
A Physics-Inspired Optimizer: Velocity Regularized Adam |
Pranav Vaidhyanathan et.al. |
2505.13196 |
null |
2025-05-19 |
Adversarial Testing in LLMs: Insights into Decision-Making Vulnerabilities |
Lili Zhang et.al. |
2505.13195 |
null |
2025-05-19 |
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics |
Christoph Jürgen Hemmer et.al. |
2505.13192 |
null |
2025-05-19 |
A Malliavin-Gamma calculus approach to Score Based Diffusion Generative models for random fields |
Giacomo Greco et.al. |
2505.13189 |
null |
2025-05-19 |
ViPlan: A Benchmark for Visual Planning with Symbolic Predicates and Vision-Language Models |
Matteo Merler et.al. |
2505.13180 |
link |
2025-05-19 |
ToolSpectrum: Towards Personalized Tool Utilization for Large Language Models |
Zihao Cheng et.al. |
2505.13176 |
null |
2025-05-19 |
Enhancing LLMs for Time Series Forecasting via Structure-Guided Cross-Modal Alignment |
Siming Sun et.al. |
2505.13175 |
null |
2025-05-19 |
A Case Study of Cross-Lingual Zero-Shot Generalization for Classical Languages in LLMs |
V. S. D. S. Mahesh Akavarapu et.al. |
2505.13173 |
link |
2025-05-19 |
Positional Fragility in LLMs: How Offset Effects Reshape Our Understanding of Memorization Risks |
Yixuan Xu et.al. |
2505.13171 |
null |
2025-05-19 |
Role-Playing Evaluation for Large Language Models |
Yassine El Boudouri et.al. |
2505.13157 |
link |
2025-05-19 |
Tianyi: A Traditional Chinese Medicine all-rounder language model and its Real-World Clinical Practice |
Zhi Liu et.al. |
2505.13156 |
null |
2025-05-19 |
Zero-Shot Adaptation of Behavioral Foundation Models to Unseen Dynamics |
Maksim Bobrin et.al. |
2505.13150 |
null |
2025-05-20 |
What if Deception Cannot be Detected? A Cross-Linguistic Study on the Limits of Deception Detection from Text |
Aswathy Velutharambath et.al. |
2505.13147 |
null |
2025-05-19 |
Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models |
Haolang Lu et.al. |
2505.13143 |
null |
2025-05-19 |
Understanding Cross-Lingual Inconsistency in Large Language Models |
Zheng Wei Lim et.al. |
2505.13141 |
null |
2025-05-19 |
CacheFlow: Fast Human Motion Prediction by Cached Normalizing Flow |
Takahiro Maeda et.al. |
2505.13140 |
null |
2025-05-19 |
Optimizing Retrieval Augmented Generation for Object Constraint Language |
Kevin Chenhao Li et.al. |
2505.13129 |
null |
2025-05-19 |
Benchmarking and Confidence Evaluation of LALMs For Temporal Reasoning |
Debarpan Bhattacharya et.al. |
2505.13115 |
link |
2025-05-19 |
Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation |
Sungmin Cha et.al. |
2505.13111 |
null |
2025-05-19 |
FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference |
Guangda Liu et.al. |
2505.13109 |
null |
2025-05-19 |
Fixing 7,400 Bugs for 1$: Cheap Crash-Site Program Repair |
Han Zheng et.al. |
2505.13103 |
null |
2025-05-20 |
Industrial Synthetic Segment Pre-training |
Shinichi Mae et.al. |
2505.13099 |
null |
2025-05-19 |
LLM-KG-Bench 3.0: A Compass for SemanticTechnology Capabilities in the Ocean of LLMs |
Lars-Peter Meyer et.al. |
2505.13098 |
link |
2025-05-19 |
The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation |
David Stap et.al. |
2505.13090 |
null |
2025-05-19 |
Walking the Tightrope: Disentangling Beneficial and Detrimental Drifts in Non-Stationary Custom-Tuning |
Xiaoyu Yang et.al. |
2505.13081 |
null |
2025-05-19 |
The Hidden Dangers of Browsing AI Agents |
Mykyta Mudryi et.al. |
2505.13076 |
null |
2025-05-19 |
Structure-Aware Corpus Construction and User-Perception-Aligned Metrics for Large-Language-Model Code Completion |
Dengfeng Liu et.al. |
2505.13073 |
null |
2025-05-19 |
Hearing from Silence: Reasoning Audio Descriptions from Silent Videos via Vision-Language Model |
Yong Ren et.al. |
2505.13062 |
null |
2025-05-19 |
Automatic mixed precision for optimizing gained time with constrained loss mean-squared-error based on model partition to sequential sub-graphs |
Shmulik Markovich-Golan et.al. |
2505.13060 |
null |
2025-05-19 |
CAIM: Development and Evaluation of a Cognitive AI Memory Framework for Long-Term Interaction with Intelligent Agents |
Rebecca Westhäußer et.al. |
2505.13044 |
null |
2025-05-19 |
KIT’s Offline Speech Translation and Instruction Following Submission for IWSLT 2025 |
Sai Koneru et.al. |
2505.13036 |
null |
2025-05-19 |
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix |
Ziyang Ma et.al. |
2505.13032 |
link |
2025-05-19 |
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO |
Yicheng Xiao et.al. |
2505.13031 |
link |
2025-05-19 |
MDDM: A Multi-view Discriminative Enhanced Diffusion-based Model for Speech Enhancement |
Nan Xu et.al. |
2505.13029 |
null |
2025-05-20 |
Evaluating the efficacy of LLM Safety Solutions: The Palit Benchmark Dataset |
Sayon Palit et.al. |
2505.13028 |
null |
2025-05-19 |
Step-wise Adaptive Integration of Supervised Fine-tuning and Reinforcement Learning for Task-Specific LLMs |
Jack Chen et.al. |
2505.13026 |
null |
2025-05-19 |
Unveiling and Steering Connectome Organization with Interpretable Latent Variables |
Yubin Li et.al. |
2505.13011 |
null |
2025-05-19 |
Generative Modeling of Random Fields from Limited Data via Constrained Latent Flow Matching |
James E. Warner et.al. |
2505.13007 |
link |
2025-05-19 |
Fractured Chain-of-Thought Reasoning |
Baohao Liao et.al. |
2505.12992 |
null |
2025-05-19 |
An Empirical Study of Many-to-Many Summarization with Large Language Models |
Jiaan Wang et.al. |
2505.12983 |
null |
2025-05-20 |
From Assistants to Adversaries: Exploring the Security Risks of Mobile LLM Agents |
Liangxuan Wu et.al. |
2505.12981 |
null |
2025-05-19 |
A Structured Literature Review on Traditional Approaches in Current Natural Language Processing |
Robin Jegan et.al. |
2505.12970 |
null |
2025-05-19 |
MA-COIR: Leveraging Semantic Search Index and Generative Models for Ontology-Driven Biomedical Concept Recognition |
Shanshan Liu et.al. |
2505.12964 |
link |
2025-05-19 |
DGRO: Enhancing LLM Reasoning via Exploration-Exploitation Control and Reward Variance Management |
Xuerui Su et.al. |
2505.12951 |
null |
2025-05-19 |
GuRE: Generative Query REwriter for Legal Passage Retrieval |
Daehee Kim et.al. |
2505.12950 |
link |
2025-05-19 |
A3: an Analytical Low-Rank Approximation Framework for Attention |
Jeffrey T. H. Wong et.al. |
2505.12942 |
null |
2025-05-19 |
Leveraging LLM Inconsistency to Boost Pass@k Performance |
Uri Dalal et.al. |
2505.12938 |
null |
2025-05-19 |
Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs |
Zhihe Yang et.al. |
2505.12929 |
link |
2025-05-19 |
CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming |
Han Deng et.al. |
2505.12925 |
link |
2025-05-19 |
The Traitors: Deception and Trust in Multi-Agent Language Model Simulations |
Pedro M. P. Curvo et.al. |
2505.12923 |
link |
2025-05-19 |
Sinusoidal Initialization, Time for a New Start |
Alberto Fernández-Hernández et.al. |
2505.12909 |
null |
2025-05-19 |
AutoGEEval: A Multimodal and Automated Framework for Geospatial Code Generation on GEE with Large Language Models |
Shuyang Hou et.al. |
2505.12900 |
null |
2025-05-19 |
On the Thinking-Language Modeling Gap in Large Language Models |
Chenxi Liu et.al. |
2505.12896 |
null |
2025-05-16 |
QVGen: Pushing the Limit of Quantized Video Generative Models |
Yushi Huang et.al. |
2505.11497 |
null |
2025-05-16 |
msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML |
Zhaolan Huang et.al. |
2505.11483 |
link |
2025-05-16 |
Improving Assembly Code Performance with Large Language Models via Reinforcement Learning |
Anjiang Wei et.al. |
2505.11480 |
null |
2025-05-16 |
HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages |
Zhilin Wang et.al. |
2505.11475 |
null |
2025-05-16 |
Disentangling Reasoning and Knowledge in Medical Large Language Models |
Rahul Thapa et.al. |
2505.11462 |
null |
2025-05-16 |
ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks |
Zhixiong Zhuang et.al. |
2505.11459 |
null |
2025-05-16 |
LLMs unlock new paths to monetizing exploits |
Nicholas Carlini et.al. |
2505.11449 |
null |
2025-05-16 |
Is Compression Really Linear with Code Intelligence? |
Xianzhen Luo et.al. |
2505.11441 |
null |
2025-05-16 |
GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art |
Chenkai Zhang et.al. |
2505.11436 |
link |
2025-05-16 |
MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production |
Chao Jin et.al. |
2505.11432 |
null |
2025-05-16 |
When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs |
Xiaomin Li et.al. |
2505.11423 |
null |
2025-05-16 |
EdgeWisePersona: A Dataset for On-Device User Profiling from Natural Language Interactions |
Patryk Bartkowiak et.al. |
2505.11417 |
link |
2025-05-16 |
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems |
Yinsicheng Jiang et.al. |
2505.11415 |
null |
2025-05-16 |
CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs |
Sijia Chen et.al. |
2505.11413 |
null |
2025-05-16 |
Visual Planning: Let’s Think Only with Images |
Yi Xu et.al. |
2505.11409 |
link |
2025-05-16 |
Large Language Model Use Impact Locus of Control |
Jenny Xiyu Fu et.al. |
2505.11406 |
null |
2025-05-16 |
EmotionHallucer: Evaluating Emotion Hallucinations in Multimodal Large Language Models |
Bohao Xing et.al. |
2505.11405 |
link |
2025-05-16 |
GuideBench: Benchmarking Domain-Oriented Guideline Following for LLM Agents |
Lingxiao Diao et.al. |
2505.11368 |
null |
2025-05-16 |
Phare: A Safety Probe for Large Language Models |
Pierre Le Jeune et.al. |
2505.11365 |
link |
2025-05-16 |
LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors |
Rao Ma et.al. |
2505.11352 |
null |
2025-05-16 |
Context parroting: A simple but tough-to-beat baseline for foundation models in scientific machine learning |
Yuanzhao Zhang et.al. |
2505.11349 |
null |
2025-05-16 |
Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models |
Banca Calvo Figueras et.al. |
2505.11341 |
null |
2025-05-16 |
XtraGPT: LLMs for Human-AI Collaboration on Controllable Academic Paper Revision |
Nuo Chen et.al. |
2505.11336 |
null |
2025-05-16 |
TokenWeave: Efficient Compute-Communication Overlap for Distributed LLM Inference |
Raja Gond et.al. |
2505.11329 |
null |
2025-05-16 |
Uncertainty Quantification for Prior-Data Fitted Networks using Martingale Posteriors |
Thomas Nagler et.al. |
2505.11325 |
null |
2025-05-16 |
A Fourier Space Perspective on Diffusion Models |
Fabian Falck et.al. |
2505.11278 |
null |
2025-05-16 |
Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs |
Yaorui Shi et.al. |
2505.11277 |
link |
2025-05-16 |
TCC-Bench: Benchmarking the Traditional Chinese Culture Understanding Capabilities of MLLMs |
Pengju Xu et.al. |
2505.11275 |
link |
2025-05-16 |
Semantic Caching of Contextual Summaries for Efficient Question-Answering with Language Models |
Camille Couturier et.al. |
2505.11271 |
null |
2025-05-16 |
TAIJI: MCP-based Multi-Modal Data Analytics on Data Lakes |
Chao Zhang et.al. |
2505.11270 |
null |
2025-05-16 |
DRAGON: A Large-Scale Dataset of Realistic Images Generated by Diffusion Models |
Giulia Bertazzini et.al. |
2505.11257 |
null |
2025-05-16 |
LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios |
Mingxing Peng et.al. |
2505.11247 |
null |
2025-05-16 |
Concept Drift Guided LayerNorm Tuning for Efficient Multimodal Metaphor Identification |
Wenhao Qian et.al. |
2505.11237 |
link |
2025-05-16 |
Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs |
Zhangying Feng et.al. |
2505.11227 |
null |
2025-05-16 |
HAPO: Training Language Models to Reason Concisely via History-Aware Policy Optimization |
Chengyu Huang et.al. |
2505.11225 |
link |
2025-05-16 |
Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation |
Donghoon Lee et.al. |
2505.11221 |
link |
2025-05-16 |
Unveiling the Potential of Vision-Language-Action Models with Open-Ended Multimodal Instructions |
Wei Zhao et.al. |
2505.11214 |
null |
2025-05-16 |
Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese |
Xihuai Wang et.al. |
2505.11200 |
null |
2025-05-16 |
Multi-Modal Multi-Task (M3T) Federated Foundation Models for Embodied AI: Potentials and Challenges for Edge Integration |
Kasra Borazjani et.al. |
2505.11191 |
null |
2025-05-16 |
Can Global XAI Methods Reveal Injected Bias in LLMs? SHAP vs Rule Extraction vs RuleSHAP |
Francesco Sovrano et.al. |
2505.11189 |
link |
2025-05-16 |
On Next-Token Prediction in LLMs: How End Goals Determine the Consistency of Decoding Algorithms |
Jacob Trauger et.al. |
2505.11183 |
null |
2025-05-16 |
Feasibility with Language Models for Open-World Compositional Zero-Shot Learning |
Jae Myung Kim et.al. |
2505.11181 |
null |
2025-05-16 |
mmRAG: A Modular Benchmark for Retrieval-Augmented Generation over Text, Tables, and Knowledge Graphs |
Chuan Xu et.al. |
2505.11180 |
link |
2025-05-16 |
Low-Resource Language Processing: An OCR-Driven Summarization and Translation Pipeline |
Hrishit Madhavi et.al. |
2505.11177 |
link |
2025-05-16 |
Gaussian Weight Sampling for Scalable, Efficient and Stable Pseudo-Quantization Training |
Myeonghwan Ahn et.al. |
2505.11170 |
null |
2025-05-16 |
SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization |
Huashan Sun et.al. |
2505.11166 |
null |
2025-05-16 |
Foundation Time-Series AI Model for Realized Volatility Forecasting |
Anubha Goel et.al. |
2505.11163 |
null |
2025-05-16 |
Diffusion Model in Hyperspectral Image Processing and Analysis: A Review |
Xing Hu et.al. |
2505.11158 |
null |
2025-05-16 |
MPMA: Preference Manipulation Attack Against Model Context Protocol |
Zihan Wang et.al. |
2505.11154 |
null |
2025-05-16 |
Human-Aligned Bench: Fine-Grained Assessment of Reasoning Ability in MLLMs vs. Humans |
Yansheng Qiu et.al. |
2505.11141 |
null |
2025-05-16 |
Scaling Reasoning can Improve Factuality in Large Language Models |
Mike Zhang et.al. |
2505.11140 |
link |
2025-05-16 |
PhiNet v2: A Mask-Free Brain-Inspired Vision Foundation Model from Video |
Makoto Yamada et.al. |
2505.11129 |
link |
2025-05-16 |
Risk theory in a finite customer-pool setting |
Michel Mandjes et.al. |
2505.11127 |
link |
2025-05-16 |
GraphOracle: A Foundation Model for Knowledge Graph Reasoning |
Enjun Du et.al. |
2505.11125 |
null |
2025-05-16 |
Navigating the Alpha Jungle: An LLM-Powered MCTS Framework for Formulaic Factor Mining |
Yu Shi et.al. |
2505.11122 |
null |
2025-05-16 |
Redundancy-Aware Pretraining of Vision-Language Foundation Models in Remote Sensing |
Mathis Jürgen Adler et.al. |
2505.11121 |
null |
2025-05-16 |
Deepfake Forensic Analysis: Source Dataset Attribution and Legal Implications of Synthetic Media Manipulation |
Massimiliano Cassia et.al. |
2505.11110 |
null |
2025-05-16 |
MAVOS-DD: Multilingual Audio-Video Open-Set Deepfake Detection Benchmark |
Florinel-Alin Croitoru et.al. |
2505.11109 |
null |
2025-05-16 |
Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity |
Chan-Jan Hsu et.al. |
2505.11107 |
null |
2025-05-16 |
Towards Better Evaluation for Generated Patent Claims |
Lekang Jiang et.al. |
2505.11095 |
link |
2025-05-16 |
ShiQ: Bringing back Bellman to LLMs |
Pierre Clavier et.al. |
2505.11081 |
null |
2025-05-16 |
$\mathcal{A}LLM4ADD$: Unlocking the Capabilities of Audio Large Language Models for Audio Deepfake Detection |
Hao Gu et.al. |
2505.11079 |
null |
2025-05-16 |
LLM-Enhanced Symbolic Control for Safety-Critical Applications |
Amir Bayat et.al. |
2505.11077 |
null |
2025-05-16 |
Addition is almost all you need: Compressing neural networks with double binary factorization |
Vladimír Boža et.al. |
2505.11076 |
link |
2025-05-16 |
Time Travel is Cheating: Going Live with DeepFund for Real-Time Fund Investment Benchmarking |
Changlun Li et.al. |
2505.11065 |
link |
2025-05-16 |
Conceptual framework for the application of deep neural networks to surface composition reconstruction from Mercury’s exospheric data |
Adrian Kazakov et.al. |
2505.11053 |
null |
2025-05-16 |
OntoURL: A Benchmark for Evaluating Large Language Models on Symbolic Ontological Understanding, Reasoning and Learning |
Xiao Zhang et.al. |
2505.11031 |
link |
2025-05-16 |
Exploiting the Asymmetric Uncertainty Structure of Pre-trained VLMs on the Unit Hypersphere |
Li Ju et.al. |
2505.11029 |
null |
2025-05-16 |
Logo-LLM: Local and Global Modeling with Large Language Models for Time Series Forecasting |
Wenjie Ou et.al. |
2505.11017 |
link |
2025-05-16 |
WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild? |
An-Lan Wang et.al. |
2505.11015 |
null |
2025-05-16 |
Humans expect rationality and cooperation from LLM opponents in strategic games |
Darija Barak et.al. |
2505.11011 |
null |
2025-05-16 |
Review-Instruct: A Review-Driven Multi-Turn Conversations Generation Method for Large Language Models |
Jiangxu Wu et.al. |
2505.11010 |
null |
2025-05-16 |
Space Group Equivariant Crystal Diffusion |
Rees Chang et.al. |
2505.10994 |
null |
2025-05-16 |
Generative Models in Computational Pathology: A Comprehensive Survey on Methods, Applications, and Challenges |
Yuan Zhang et.al. |
2505.10993 |
null |
2025-05-16 |
ReaCritic: Large Reasoning Transformer-based DRL Critic-model Scaling For Heterogeneous Networks |
Feiran You et.al. |
2505.10992 |
null |
2025-05-16 |
GenoArmory: A Unified Evaluation Framework for Adversarial Attacks on Genomic Foundation Models |
Haozheng Luo et.al. |
2505.10983 |
link |
2025-05-16 |
Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory |
Yexiang Liu et.al. |
2505.10981 |
link |
2025-05-16 |
Group-in-Group Policy Optimization for LLM Agent Training |
Lang Feng et.al. |
2505.10978 |
link |
2025-05-16 |
Can Large Language Models Correctly Interpret Equations with Errors? |
Lachlan McGinness et.al. |
2505.10966 |
null |
2025-05-16 |
MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation |
Zhenwen Liang et.al. |
2505.10962 |
null |
2025-05-16 |
SubGCache: Accelerating Graph-based RAG with Subgraph-level KV Cache |
Qiuyu Zhu et.al. |
2505.10951 |
null |
2025-05-16 |
Shackled Dancing: A Bit-Locked Diffusion Algorithm for Lossless and Controllable Image Steganography |
Tianshuo Zhang et.al. |
2505.10950 |
null |
2025-05-16 |
The Way We Prompt: Conceptual Blending, Neural Dynamics, and Prompt-Induced Transitions in LLMs |
Makoto Sato et.al. |
2505.10948 |
null |
2025-05-16 |
ToDMA: Large Model-Driven Token-Domain Multiple Access for Semantic Communications |
Li Qiao et.al. |
2505.10946 |
null |
2025-05-16 |
Semantic Aware Linear Transfer by Recycling Pre-trained Language Models for Cross-lingual Transfer |
Seungyoon Lee et.al. |
2505.10945 |
null |
2025-05-16 |
Who You Are Matters: Bridging Topics and Social Roles via LLM-Enhanced Logical Recommendation |
Qing Yu et.al. |
2505.10940 |
null |
2025-05-16 |
GenKnowSub: Improving Modularity and Reusability of LLMs through General Knowledge Subtraction |
Mohammadtaha Bagherifard et.al. |
2505.10939 |
link |
2025-05-16 |
Accurate KV Cache Quantization with Outlier Tokens Tracing |
Yi Su et.al. |
2505.10938 |
link |
2025-05-16 |
Connecting the Dots: A Chain-of-Collaboration Prompting Framework for LLM Agents |
Jiaxing Zhao et.al. |
2505.10936 |
null |
2025-05-16 |
Physics-informed Temporal Alignment for Auto-regressive PDE Foundation Models |
Congcong Zhu et.al. |
2505.10930 |
link |
2025-05-16 |
Vaiage: A Multi-Agent Solution to Personalized Travel Planning |
Binwen Liu et.al. |
2505.10922 |
null |
2025-05-16 |
A Physics-Informed Convolutional Long Short Term Memory Statistical Model for Fluid Thermodynamics Simulations |
Luca Menicali et.al. |
2505.10919 |
link |
2025-05-16 |
VISTA: Enhancing Vision-Text Alignment in MLLMs via Cross-Modal Mutual Information Maximization |
Mingxiao Li et.al. |
2505.10917 |
null |
2025-05-16 |
Explain What You Mean: Intent Augmented Knowledge Graph Recommender Built With LLM |
Wenqing Zheng et.al. |
2505.10900 |
null |
2025-05-16 |
Multi-Objective Preference Optimization: Improving Human Alignment of Generative Models |
Akhil Agnihotri et.al. |
2505.10892 |
null |
2025-05-16 |
Approximation and Generalization Abilities of Score-based Neural Network Generative Models for Sub-Gaussian Distributions |
Guoji Fu et.al. |
2505.10880 |
null |
2025-05-16 |
A Light and Smart Wearable Platform with Multimodal Foundation Model for Enhanced Spatial Reasoning in People with Blindness and Low Vision |
Alexey Magay et.al. |
2505.10875 |
null |
2025-05-16 |
REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning? |
Chenxi Jiang et.al. |
2505.10872 |
null |
2025-05-16 |
Improve Rule Retrieval and Reasoning with Self-Induction and Relevance ReEstimate |
Ziyang Huang et.al. |
2505.10870 |
null |
2025-05-16 |
Have Multimodal Large Language Models (MLLMs) Really Learned to Tell the Time on Analog Clocks? |
Tairan Fu et.al. |
2505.10862 |
null |
2025-05-15 |
End-to-End Vision Tokenizer Tuning |
Wenxuan Wang et.al. |
2505.10562 |
null |
2025-05-15 |
T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback |
Zehan Wang et.al. |
2505.10561 |
null |
2025-05-15 |
Neural Thermodynamic Laws for Large Language Model Training |
Ziming Liu et.al. |
2505.10559 |
null |
2025-05-15 |
Flowing Through Hilbert Space: Quantum-Enhanced Generative Models for Lattice Field Theory |
Jehu Martinez et.al. |
2505.10553 |
null |
2025-05-15 |
Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data |
Yiwen Liu et.al. |
2505.10551 |
link |
2025-05-15 |
Real-Time Out-of-Distribution Failure Prevention via Multi-Modal Reasoning |
Milan Ganai et.al. |
2505.10547 |
null |
2025-05-15 |
Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models |
Annie Wong et.al. |
2505.10543 |
link |
2025-05-15 |
Exploring Implicit Visual Misunderstandings in Multimodal Large Language Models through Attention Analysis |
Pengfei Wang et.al. |
2505.10541 |
link |
2025-05-15 |
S3C2 Summit 2024-09: Industry Secure Software Supply Chain Summit |
Imranur Rahman et.al. |
2505.10538 |
null |
2025-05-15 |
CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs |
Raman Dutt et.al. |
2505.10496 |
link |
2025-05-15 |
RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs |
Vibha Belavadi et.al. |
2505.10495 |
null |
2025-05-15 |
Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective |
Yutao Mou et.al. |
2505.10494 |
link |
2025-05-15 |
CL-RAG: Bridging the Gap in Retrieval-Augmented Generation with Curriculum Learning |
Shaohan Wang et.al. |
2505.10493 |
null |
2025-05-15 |
Campus AI vs Commercial AI: A Late-Breaking Study on How LLM As-A-Service Customizations Shape Trust and Usage Patterns |
Leon Hannig et.al. |
2505.10490 |
null |
2025-05-15 |
UniEval: Unified Holistic Evaluation for Unified Multimodal Understanding and Generation |
Yi Li et.al. |
2505.10483 |
null |
2025-05-15 |
Large Language Models for Cancer Communication: Evaluating Linguistic Quality, Safety, and Accessibility in Generative AI |
Agnik Saha et.al. |
2505.10472 |
null |
2025-05-15 |
AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenge |
Ranjan Sapkota et.al. |
2505.10468 |
null |
2025-05-15 |
Superposition Yields Robust Neural Scaling |
Yizhou Liu et.al. |
2505.10465 |
link |
2025-05-15 |
Are Large Language Models Robust in Understanding Code Against Semantics-Preserving Mutations? |
Pedro Orvalho et.al. |
2505.10443 |
null |
2025-05-15 |
Learning to Think: Information-Theoretic Reinforcement Fine-Tuning for LLMs |
Jingyao Wang et.al. |
2505.10425 |
null |
2025-05-15 |
Hierarchical Document Refinement for Long-context Retrieval-augmented Generation |
Jiajie Jin et.al. |
2505.10413 |
link |
2025-05-15 |
Are LLM-generated plain language summaries truly understandable? A large-scale crowdsourced evaluation |
Yue Guo et.al. |
2505.10409 |
null |
2025-05-15 |
Two-Stage Generative Model for Intracranial Aneurysm Meshes with Morphological Marker Conditioning |
Wenhao Ding et.al. |
2505.10407 |
link |
2025-05-15 |
Visual Fidelity Index for Generative Semantic Communications with Critical Information Embedding |
Jianhao Huang et.al. |
2505.10405 |
null |
2025-05-15 |
Multi-domain Multilingual Sentiment Analysis in Industry: Predicting Aspect-based Opinion Quadruples |
Benjamin White et.al. |
2505.10389 |
null |
2025-05-15 |
Are Sparse Autoencoders Useful for Java Function Bug Detection? |
Rui Melo et.al. |
2505.10375 |
null |
2025-05-15 |
FactsR: A Safer Method for Producing High Quality Healthcare Documentation |
Victor Petrén Bach Hansen et.al. |
2505.10360 |
null |
2025-05-15 |
NVSPolicy: Adaptive Novel-View Synthesis for Generalizable Language-Conditioned Policy Learning |
Le Shi et.al. |
2505.10359 |
null |
2025-05-16 |
LDIR: Low-Dimensional Dense and Interpretable Text Embeddings with Relative Representations |
Yile Wang et.al. |
2505.10354 |
link |
2025-05-15 |
Non-Markovian dynamics with a driven three-level giant atom in a semi-infinite photonic waveguide |
S. J. Sun et.al. |
2505.10340 |
null |
2025-05-15 |
AutoPentest: Enhancing Vulnerability Management With Autonomous LLM Agents |
Julius Henke et.al. |
2505.10321 |
link |
2025-05-15 |
One For All: Formally Verifying Protocols which use Aggregate Signatures (extended version) |
Xenia Hofmeier et.al. |
2505.10316 |
null |
2025-05-15 |
Empirically evaluating commonsense intelligence in large language models with large-scale human judgments |
Tuan Dung Nguyen et.al. |
2505.10309 |
null |
2025-05-15 |
MIPHEI-ViT: Multiplex Immunofluorescence Prediction from H&E Images using ViT Foundation Models |
Guillaume Balezo et.al. |
2505.10294 |
link |
2025-05-15 |
From Questions to Clinical Recommendations: Large Language Models Driving Evidence-Based Clinical Decision Making |
Dubai Li et.al. |
2505.10282 |
link |
2025-05-15 |
The Evolving Landscape of Generative Large Language Models and Traditional Natural Language Processing in Medicine |
Rui Yang et.al. |
2505.10261 |
null |
2025-05-15 |
Comparing LLM Text Annotation Skills: A Study on Human Rights Violations in Social Media Data |
Poli Apollinaire Nemkova et.al. |
2505.10260 |
link |
2025-05-15 |
Towards Safe Robot Foundation Models Using Inductive Biases |
Maximilian Tölle et.al. |
2505.10219 |
null |
2025-05-15 |
Informed Forecasting: Leveraging Auxiliary Knowledge to Boost LLM Performance on Time Series Forecasting |
Mohammadmahdi Ghasemloo et.al. |
2505.10213 |
null |
2025-05-15 |
Do LLMs Memorize Recommendation Datasets? A Preliminary Study on MovieLens-1M |
Dario Di Palma et.al. |
2505.10212 |
link |
2025-05-15 |
VQ-Logits: Compressing the Output Bottleneck of Large Language Models via Vector Quantized Logits |
Jintian Shao et.al. |
2505.10202 |
null |
2025-05-15 |
The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think |
Seongyun Lee et.al. |
2505.10185 |
null |
2025-05-15 |
Mining Hidden Thoughts from Texts: Evaluating Continual Pretraining with Synthetic Data for LLM Reasoning |
Yoichi Ishibashi et.al. |
2505.10182 |
null |
2025-05-15 |
GE-Chat: A Graph Enhanced RAG Framework for Evidential Response Generation of LLMs |
Longchao Da et.al. |
2505.10143 |
null |
2025-05-15 |
Large Wireless Localization Model (LWLM): A Foundation Model for Positioning in 6G Networks |
Guangjin Pan et.al. |
2505.10134 |
link |
2025-05-15 |
Learning Virtual Machine Scheduling in Cloud Computing through Language Agents |
JieHao Wu et.al. |
2505.10117 |
null |
2025-05-15 |
What Does Neuro Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs |
Xinlan Yan et.al. |
2505.10113 |
null |
2025-05-15 |
EmbodiedMAE: A Unified 3D Multi-Modal Representation for Robot Manipulation |
Zibin Dong et.al. |
2505.10105 |
null |
2025-05-15 |
From Text to Network: Constructing a Knowledge Graph of Taiwan-Based China Studies Using Generative AI |
Hsuan-Lei Shao et.al. |
2505.10093 |
null |
2025-05-15 |
ChronoSteer: Bridging Large Language Model and Time Series Foundation Model via Synthetic Data |
Chengsen Wang et.al. |
2505.10083 |
null |
2025-05-16 |
Leveraging Graph Retrieval-Augmented Generation to Support Learners’ Understanding of Knowledge Concepts in MOOCs |
Mohamed Abdelmagied et.al. |
2505.10074 |
null |
2025-05-15 |
Dark LLMs: The Growing Threat of Unaligned AI Models |
Michael Fire et.al. |
2505.10066 |
null |
2025-05-15 |
CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability |
Han Peng et.al. |
2505.10063 |
null |
2025-05-15 |
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis |
Bingda Tang et.al. |
2505.10046 |
link |
2025-05-15 |
DIF: A Framework for Benchmarking and Verifying Implicit Bias in LLMs |
Lake Yin et.al. |
2505.10013 |
null |
2025-05-15 |
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts |
Jing-Cheng Pang et.al. |
2505.10010 |
link |
2025-05-15 |
SVA-ICL: Improving LLM-based Software Vulnerability Assessment via In-Context Learning and Information Fusion |
Chaoyang Gao et.al. |
2505.10008 |
link |
2025-05-15 |
AI2MMUM: AI-AI Oriented Multi-Modal Universal Model Leveraging Telecom Domain Large Model |
Tianyu Jiao et.al. |
2505.10003 |
null |
2025-05-15 |
ServeGen: Workload Characterization and Generation of Large Language Model Serving in Production |
Yuxing Xiang et.al. |
2505.09999 |
null |
2025-05-15 |
Physical regularized Hierarchical Generative Model for Metallic Glass Structural Generation and Energy Prediction |
Qiyuan Chen et.al. |
2505.09977 |
null |
2025-05-15 |
Analysing Safety Risks in LLMs Fine-Tuned with Pseudo-Malicious Cyber Security Data |
Adel ElZemity et.al. |
2505.09974 |
null |
2025-05-15 |
Pre-Act: Multi-Step Planning and Reasoning Improves Acting in LLM Agents |
Mrinal Rawat et.al. |
2505.09970 |
null |
2025-05-15 |
Advanced Crash Causation Analysis for Freeway Safety: A Large Language Model Approach to Identifying Key Contributing Factors |
Ahmed S. Abdelrahman et.al. |
2505.09949 |
null |
2025-05-15 |
Personalizing Large Language Models using Retrieval Augmented Generation and Knowledge Graph |
Deeksha Prahlad et.al. |
2505.09945 |
link |
2025-05-15 |
Design and Evaluation of Generative Agent-based Platform for Human-Assistant Interaction Research: A Tale of 10 User Studies |
Ziyi Xuan et.al. |
2505.09938 |
null |
2025-05-15 |
CartoAgent: a multimodal large language model-powered multi-agent cartographic framework for map style transfer and evaluation |
Chenglong Wang et.al. |
2505.09936 |
null |
2025-05-15 |
Rethinking Prompt Optimizers: From Prompt Merits to Optimization |
Zixiao Zhu et.al. |
2505.09930 |
link |
2025-05-15 |
Reinforced Interactive Continual Learning via Real-time Noisy Human Feedback |
Yutao Yang et.al. |
2505.09925 |
null |
2025-05-16 |
From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models |
Yidan Wang et.al. |
2505.09924 |
link |
2025-05-15 |
Improving the Euclidean Diffusion Generation of Manifold Data by Mitigating Score Function Singularity |
Zichen Liu et.al. |
2505.09922 |
null |
2025-05-16 |
PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization |
Yidan Wang et.al. |
2505.09921 |
link |
2025-05-15 |
UICopilot: Automating UI Synthesis via Hierarchical Code Generation from Webpage Designs |
Yi Gui et.al. |
2505.09904 |
link |
2025-05-15 |
Crossing Borders Without Crossing Boundaries: How Sociolinguistic Awareness Can Optimize User Engagement with Localized Spanish AI Models Across Hispanophone Countries |
Martin Capdevila et.al. |
2505.09902 |
null |
2025-05-15 |
Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Tasks |
Ziyuan Zhang et.al. |
2505.09901 |
link |
2025-05-16 |
Characterizing Unintended Consequences in Human-GUI Agent Collaboration for Web Browsing |
Shuning Zhang et.al. |
2505.09875 |
null |
2025-05-14 |
Do Large Language Models Know Conflict? Investigating Parametric vs. Non-Parametric Knowledge of LLMs for Conflict Forecasting |
Apollinaire Poli Nemkova et.al. |
2505.09852 |
null |
2025-05-14 |
Evaluating Large Language Models for the Generation of Unit Tests with Equivalence Partitions and Boundary Values |
Martín Rodríguez et.al. |
2505.09830 |
null |
2025-05-14 |
KRISTEVA: Close Reading as a Novel Task for Benchmarking Interpretive Reasoning |
Peiqi Sui et.al. |
2505.09825 |
null |
2025-05-14 |
Adversarial Attack on Large Language Models using Exponentiated Gradient Descent |
Sajib Biswas et.al. |
2505.09820 |
link |
2025-05-14 |
Lossless Compression for LLM Tensor Incremental Snapshots |
Daniel Waddington et.al. |
2505.09810 |
null |
2025-05-14 |
Contextual Phenotyping of Pediatric Sepsis Cohort Using Large Language Models |
Aditya Nagori et.al. |
2505.09805 |
null |
2025-05-14 |
A Multimodal Multi-Agent Framework for Radiology Report Generation |
Ziruo Yi et.al. |
2505.09787 |
null |
2025-05-14 |
Regularized Operator Extrapolation Method For Stochastic Bilevel Variational Inequality Problems |
Mohammad Khalafi et.al. |
2505.09778 |
null |
2025-05-14 |
A Survey on Large Language Models in Multimodal Recommender Systems |
Alejo Lopez-Avila et.al. |
2505.09777 |
null |
2025-05-14 |
Self-Consuming Generative Models with Adversarially Curated Data |
Xiukun Wei et.al. |
2505.09768 |
null |
2025-05-14 |
Trustless Autonomy: Understanding Motivations, Benefits and Governance Dilemma in Self-Sovereign Decentralized AI Agents |
Botao Amber Hu et.al. |
2505.09757 |
null |
2025-05-14 |
FAS-LLM: Large Language Model-Based Channel Prediction for OTFS-Enabled Satellite-FAS Links |
Halvin Yang et.al. |
2505.09751 |
null |
2025-05-14 |
VeriFact: Enhancing Long-Form Factuality Evaluation with Refined Fact Extraction and Reference Facts |
Xin Liu et.al. |
2505.09701 |
null |
2025-05-14 |
EWMBench: Evaluating Scene, Motion, and Semantic Quality in Embodied World Models |
Hu Yue et.al. |
2505.09694 |
link |
2025-05-14 |
System Prompt Optimization with Meta-Learning |
Yumin Choi et.al. |
2505.09666 |
null |
2025-05-16 |
Tales of the 2025 Los Angeles Fire: Hotwash for Public Health Concerns in Reddit via LLM-Enhanced Topic Modeling |
Sulong Zhou et.al. |
2505.09665 |
null |
2025-05-14 |
Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors |
Nicolas Dupuis et.al. |
2505.09610 |
null |
2025-05-14 |
Adversarial Suffix Filtering: a Defense Pipeline for LLMs |
David Khachaturov et.al. |
2505.09602 |
null |
2025-05-15 |
How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference |
Nidhal Jegham et.al. |
2505.09598 |
null |
2025-05-14 |
WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models |
Abdullah Mushtaq et.al. |
2505.09595 |
null |
2025-05-15 |
Beyond Likes: How Normative Feedback Complements Engagement Signals on Social Media |
Yuchen Wu et.al. |
2505.09583 |
null |
2025-05-14 |
Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach |
Shannon Lodoen et.al. |
2505.09576 |
null |
2025-05-14 |
MIGRATION-BENCH: Repository-Level Code Migration Benchmark from Java 8 |
Linbo Liu et.al. |
2505.09569 |
link |
2025-05-14 |
Using Foundation Models as Pseudo-Label Generators for Pre-Clinical 4D Cardiac CT Segmentation |
Anne-Marie Rickmann et.al. |
2505.09564 |
null |
2025-05-14 |
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning |
Zongqian Li et.al. |
2505.09519 |
link |
2025-05-15 |
Towards Fair In-Context Learning with Tabular Foundation Models |
Patrik Kenfack et.al. |
2505.09503 |
null |
2025-05-14 |
Layered Unlearning for Adversarial Relearning |
Timothy Qian et.al. |
2505.09500 |
link |
2025-05-14 |
Card Sorting Simulator: Augmenting Design of Logical Information Architectures with Large Language Models |
Eduard Kuric et.al. |
2505.09478 |
null |
2025-05-14 |
Deploying Foundation Model-Enabled Air and Ground Robots in the Field: Challenges and Opportunities |
Zachary Ravichandran et.al. |
2505.09477 |
null |
2025-05-14 |
Evaluating GPT- and Reasoning-based Large Language Models on Physics Olympiad Problems: Surpassing Human Performance and Implications for Educational Assessment |
Paul Tschisgale et.al. |
2505.09438 |
null |
2025-05-14 |
CXMArena: Unified Dataset to benchmark performance in realistic CXM Scenarios |
Raghav Garg et.al. |
2505.09436 |
link |
2025-05-14 |
Endo-CLIP: Progressive Self-Supervised Pre-training on Raw Colonoscopy Records |
Yili He et.al. |
2505.09435 |
null |
2025-05-15 |
SafePath: Conformal Prediction for Safe LLM-Based Autonomous Navigation |
Achref Doula et.al. |
2505.09427 |
null |
2025-05-14 |
FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models |
Hongyang Wang et.al. |
2505.09415 |
null |
2025-05-14 |
The Influence of Human-inspired Agentic Sophistication in LLM-driven Strategic Reasoners |
Vince Trencsenyi et.al. |
2505.09396 |
null |
2025-05-14 |
Quantum-Enhanced Parameter-Efficient Learning for Typhoon Trajectory Forecasting |
Chen-Yu Liu et.al. |
2505.09395 |
null |
2025-05-14 |
Qwen3 Technical Report |
An Yang et.al. |
2505.09388 |
link |
2025-05-14 |
MAKE: Multi-Aspect Knowledge-Enhanced Vision-Language Pretraining for Zero-shot Dermatological Assessment |
Siyuan Yan et.al. |
2505.09372 |
link |
2025-05-14 |
Efficient Modelling of Lyman-α opacity fluctuations during late EoR |
Barun Maity et.al. |
2505.09369 |
null |
2025-05-14 |
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis |
Bingxin Ke et.al. |
2505.09358 |
link |
2025-05-14 |
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures |
Chenggang Zhao et.al. |
2505.09343 |
null |
2025-05-14 |
Access Controls Will Solve the Dual-Use Dilemma |
Evžen Wybitul et.al. |
2505.09341 |
null |
2025-05-14 |
RAG-Enabled Intent Reasoning for Application-Network Interaction |
Salwa Mostafa et.al. |
2505.09339 |
link |
2025-05-14 |
BioVFM-21M: Benchmarking and Scaling Self-Supervised Vision Foundation Models for Biomedical Image Analysis |
Jiarun Liu et.al. |
2505.09329 |
null |
2025-05-14 |
Statistical Modeling and Uncertainty Estimation of LLM Inference Systems |
Kaustabha Ray et.al. |
2505.09319 |
null |
2025-05-14 |
Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging |
Hongjin Qian et.al. |
2505.09316 |
null |
2025-05-14 |
Reproducibility Study of “Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents” |
Pedro M. P. Curvo et.al. |
2505.09289 |
link |
2025-05-14 |
A Scalable Unsupervised Framework for multi-aspect labeling of Multilingual and Multi-Domain Review Data |
Jiin Park et.al. |
2505.09286 |
null |
2025-05-14 |
Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations |
Panqi Chen et.al. |
2505.09284 |
null |
2025-05-14 |
A Note on Semantic Diffusion |
Alexander P. Ryjov et.al. |
2505.09283 |
null |
2025-05-14 |
Recent Advances in Medical Imaging Segmentation: A Survey |
Fares Bougourzi et.al. |
2505.09274 |
link |
2025-05-14 |
MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning |
Bin-Bin Gao et.al. |
2505.09265 |
null |
2025-05-14 |
Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation |
Guan Gui et.al. |
2505.09263 |
link |
2025-05-14 |
Instantiating Standards: Enabling Standard-Driven Text TTP Extraction with Evolvable Memory |
Cheng Meng et.al. |
2505.09261 |
null |
2025-05-14 |
Zero-Shot Multi-modal Large Language Model v.s. Supervised Deep Learning: A Comparative Study on CT-Based Intracranial Hemorrhage Subtyping |
Yinuo Wang et.al. |
2505.09252 |
link |
2025-05-14 |
Focus, Merge, Rank: Improved Question Answering Based on Semi-structured Knowledge Bases |
Derian Boer et.al. |
2505.09246 |
link |
2025-05-14 |
InvDesFlow-AL: Active Learning-based Workflow for Inverse Design of Functional Materials |
Xiao-Qi Han et.al. |
2505.09203 |
link |
2025-05-15 |
UniCAD: Efficient and Extendable Architecture for Multi-Task Computer-Aided Diagnosis System |
Yitao Zhu et.al. |
2505.09178 |
null |
2025-05-14 |
A Multi-Task Foundation Model for Wireless Channel Representation Using Contrastive and Masked Autoencoder Learning |
Berkay Guler et.al. |
2505.09160 |
null |
2025-05-14 |
AMSnet 2.0: A Large AMS Database with AI Segmentation for Net Detection |
Yichen Shi et.al. |
2505.09155 |
null |
2025-05-14 |
ELIS: Efficient LLM Iterative Scheduling System with Response Length Predictor |
Seungbeom Choi et.al. |
2505.09142 |
null |
2025-05-14 |
Sensing-Assisted Channel Prediction in Complex Wireless Environments: An LLM-Based Approach |
Junjie He et.al. |
2505.09141 |
null |
2025-05-14 |
Beyond General Prompts: Automated Prompt Refinement using Contrastive Class Alignment Scores for Disambiguating Objects in Vision-Language Models |
Lucas Choi et.al. |
2505.09139 |
null |
2025-05-14 |
FoldNet: Learning Generalizable Closed-Loop Policy for Garment Folding via Keypoint-Driven Asset and Demonstration Synthesis |
Yuxing Chen et.al. |
2505.09109 |
null |
2025-05-14 |
Air-Ground Collaboration for Language-Specified Missions in Unknown Environments |
Fernando Cladera et.al. |
2505.09108 |
null |
2025-05-14 |
Ornithologist: Towards Trustworthy “Reasoning” about Central Bank Communications |
Dominic Zaun Eu Jones et.al. |
2505.09083 |
null |
2025-05-14 |
CEC-Zero: Chinese Error Correction Solution Based on LLM |
Sophie Zhang et.al. |
2505.09082 |
null |
2025-05-14 |
S-DAT: A Multilingual, GenAI-Driven Framework for Automated Divergent Thinking Assessment |
Jennifer Haase et.al. |
2505.09068 |
null |
2025-05-14 |
Variational Prefix Tuning for Diverse and Accurate Code Summarization Using Pre-trained Language Models |
Junda Zhao et.al. |
2505.09062 |
link |
2025-05-14 |
A Comprehensive Analysis of Large Language Model Outputs: Similarity, Diversity, and Bias |
Brandon Smith et.al. |
2505.09056 |
null |
2025-05-14 |
Atomic Consistency Preference Optimization for Long-Form Question Answering |
Jingfeng Chen et.al. |
2505.09039 |
link |
2025-05-13 |
Improving the Reliability of LLMs: Combining CoT, RAG, Self-Consistency, and Self-Verification |
Adarsh Kumar et.al. |
2505.09031 |
null |
2025-05-13 |
Tests as Prompt: A Test-Driven-Development Benchmark for LLM Code Generation |
Yi Cui et.al. |
2505.09027 |
null |
2025-05-13 |
Automated Meta Prompt Engineering for Alignment with the Theory of Mind |
Aaron Baughman et.al. |
2505.09024 |
null |
2025-05-13 |
Block-Biased Mamba for Long-Range Sequence Processing |
Annan Yu et.al. |
2505.09022 |
null |
2025-05-13 |
AI-Mediated Code Comment Improvement |
Maria Dhakal et.al. |
2505.09021 |
null |
2025-05-13 |
A suite of LMs comprehend puzzle statements as well as humans |
Adele E Goldberg et.al. |
2505.08996 |
null |
2025-05-13 |
ITERA-LLM: Boosting Sub-8-Bit Large Language Model Inference via Iterative Tensor Decomposition |
Keran Zheng et.al. |
2505.08981 |
null |
2025-05-13 |
Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training |
Yangyi Chen et.al. |
2505.08971 |
link |
2025-05-13 |
Parameter-Efficient Fine-Tuning of Vision Foundation Model for Forest Floor Segmentation from UAV Imagery |
Mohammad Wasil et.al. |
2505.08932 |
link |
2025-05-13 |
Assessing and Advancing Benchmarks for Evaluating Large Language Models in Software Engineering Tasks |
Xing Hu et.al. |
2505.08903 |
null |
2025-05-13 |
Optimized Couplings for Watermarking Large Language Models |
Dor Tsur et.al. |
2505.08878 |
link |
2025-05-13 |
Generative AI for Autonomous Driving: Frontiers and Opportunities |
Yuping Wang et.al. |
2505.08854 |
link |
2025-05-13 |
Improved Algorithms for Differentially Private Language Model Alignment |
Keyu Chen et.al. |
2505.08849 |
null |
2025-05-13 |
CellTypeAgent: Trustworthy cell type annotation with Large Language Models |
Jiawen Chen et.al. |
2505.08844 |
link |
2025-05-13 |
PCS-UQ: Uncertainty Quantification via the Predictability-Computability-Stability Framework |
Abhineet Agarwal et.al. |
2505.08784 |
null |
2025-05-13 |
CodePDE: An Inference Framework for LLM-driven PDE Solver Generation |
Shanda Li et.al. |
2505.08783 |
link |
2025-05-13 |
HealthBench: Evaluating Large Language Models Towards Improved Human Health |
Rahul K. Arora et.al. |
2505.08775 |
link |
2025-05-13 |
Generative Molecular Design with Steerable and Granular Synthesizability Control |
Jeff Guo et.al. |
2505.08774 |
link |
2025-05-14 |
Towards Autonomous UAV Visual Object Search in City Space: Benchmark and Agentic Methodology |
Yatai Ji et.al. |
2505.08765 |
null |
2025-05-13 |
AC-Reason: Towards Theory-Guided Actual Causality Reasoning with Large Language Models |
Yanxi Zhang et.al. |
2505.08750 |
link |
2025-05-13 |
DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models |
Xiaoyang Chen et.al. |
2505.08744 |
link |
2025-05-13 |
Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies |
Xiaoliang Luo et.al. |
2505.08739 |
link |
2025-05-13 |
Towards Foundation Models for Experimental Readout Systems Combining Discrete and Continuous Data |
James Giroux et.al. |
2505.08736 |
link |
2025-05-13 |
NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context |
Ben Yao et.al. |
2505.08734 |
null |
2025-05-13 |
Securing RAG: A Risk Assessment and Mitigation Framework |
Lukas Ammann et.al. |
2505.08728 |
null |
2025-05-13 |
TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series |
Xiaolei Qin et.al. |
2505.08723 |
link |
2025-05-13 |
PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts |
Yang Su et.al. |
2505.08719 |
null |
2025-05-13 |
LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs |
K M Sajjadul Islam et.al. |
2505.08704 |
null |
2025-05-13 |
A Survey of Deep Learning for Complex Speech Spectrograms |
Yuying Xie et.al. |
2505.08694 |
null |
2025-05-13 |
VizCV: AI-assisted visualization of researchers’ publications tracks |
Vladimír Lazárik et.al. |
2505.08691 |
null |
2025-05-13 |
Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation |
Sheng Liang et.al. |
2505.08690 |
null |
2025-05-13 |
A Social Robot with Inner Speech for Dietary Guidance |
Valerio Belcamino et.al. |
2505.08664 |
link |
2025-05-13 |
Revealing economic facts: LLMs know more than they say |
Marcus Buckmann et.al. |
2505.08662 |
null |
2025-05-13 |
Enhancing Software Development with Context-Aware Conversational Agents: A User Study on Developer Interactions with Chatbots |
Glaucia Melo et.al. |
2505.08648 |
null |
2025-05-13 |
Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models |
Donghoon Kim et.al. |
2505.08622 |
null |
2025-05-13 |
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference |
Tollef Emil Jørgensen et.al. |
2505.08620 |
null |
2025-05-13 |
Boosting Zero-shot Stereo Matching using Large-scale Mixed Images Sources in the Real World |
Yuran Wang et.al. |
2505.08607 |
null |
2025-05-13 |
Automatic Task Detection and Heterogeneous LLM Speculative Decoding |
Danying Ge et.al. |
2505.08600 |
null |
2025-05-13 |
Enhancing Thyroid Cytology Diagnosis with RAG-Optimized LLMs and Pathology Foundation Models |
Hussien Al-Asi et.al. |
2505.08590 |
null |
2025-05-13 |
Small but Significant: On the Promise of Small Language Models for Accessible AIED |
Yumou Wei et.al. |
2505.08588 |
null |
2025-05-13 |
Reinforcement Learning meets Masked Video Modeling: Trajectory-Guided Adaptive Token Selection |
Ayush K. Rai et.al. |
2505.08561 |
null |
2025-05-13 |
DFA-CON: A Contrastive Learning Approach for Detecting Copyright Infringement in DeepFake Art |
Haroon Wahab et.al. |
2505.08552 |
null |
2025-05-13 |
Guiding LLM-based Smart Contract Generation with Finite State Machine |
Hao Luo et.al. |
2505.08542 |
null |
2025-05-13 |
The Truth Becomes Clearer Through Debate! Multi-Agent Systems with Large Language Models Unmask Fake News |
Yuhan Liu et.al. |
2505.08532 |
null |
2025-05-13 |
Building-Block Aware Generative Modeling for 3D Crystals of Metal Organic Frameworks |
Chenru Duan et.al. |
2505.08531 |
link |
2025-05-13 |
ExEBench: Benchmarking Foundation Models on Extreme Earth Events |
Shan Zhao et.al. |
2505.08529 |
null |
2025-05-13 |
A Deep Learning-Driven Framework for Inhalation Injury Grading Using Bronchoscopy Images |
Yifan Li et.al. |
2505.08517 |
null |
2025-05-13 |
TrialMatchAI: An End-to-End AI-powered Clinical Trial Recommendation System to Streamline Patient-to-Trial Matching |
Majd Abdallah et.al. |
2505.08508 |
null |
2025-05-13 |
InfoPO: On Mutual Information Maximization for Large Language Model Alignment |
Teng Xiao et.al. |
2505.08507 |
null |
2025-05-13 |
LCES: Zero-shot Automated Essay Scoring via Pairwise Comparisons Using Large Language Models |
Takumi Shibata et.al. |
2505.08498 |
null |
2025-05-13 |
BizChat: Scaffolding AI-Powered Business Planning for Small Business Owners Across Digital Skill Levels |
Quentin Romero Lauro et.al. |
2505.08493 |
null |
2025-05-13 |
Large Language Models Meet Stance Detection: A Survey of Tasks, Methods, Applications, Challenges and Future Directions |
Lata Pangtey et.al. |
2505.08464 |
null |
2025-05-13 |
Strategy-Augmented Planning for Large Language Models via Opponent Exploitation |
Shuai Xu et.al. |
2505.08459 |
link |
2025-05-13 |
IterKey: Iterative Keyword Generation with LLMs for Enhanced Retrieval Augmented Generation |
Kazuki Hayashi et.al. |
2505.08450 |
null |
2025-05-13 |
Scalable UAV Multi-Hop Networking via Multi-Agent Reinforcement Learning with Large Language Models |
Yanggang Xu et.al. |
2505.08448 |
null |
2025-05-13 |
Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency |
Adel Ammar et.al. |
2505.08445 |
null |
2025-05-13 |
Symbolically-Guided Visual Plan Inference from Uncurated Video Data |
Wenyan Yang et.al. |
2505.08444 |
null |
2025-05-13 |
A document processing pipeline for the construction of a dataset for topic modeling based on the judgments of the Italian Supreme Court |
Matteo Marulli et.al. |
2505.08439 |
null |
2025-05-13 |
Visual Image Reconstruction from Brain Activity via Latent Representation |
Yukiyasu Kamitani et.al. |
2505.08429 |
null |
2025-05-13 |
An integrated language-vision foundation model for conversational diagnostics and triaging in primary eye care |
Zhi Da Soh et.al. |
2505.08414 |
null |
2025-05-13 |
TUMS: Enhancing Tool-use Abilities of LLMs with Multi-structure Handlers |
Aiyao He et.al. |
2505.08402 |
null |
2025-05-13 |
Accelerating Chain-of-Thought Reasoning: When Goal-Gradient Importance Meets Dynamic Skipping |
Ren Zhuang et.al. |
2505.08392 |
null |
2025-05-13 |
Towards Contamination Resistant Benchmarks |
Rahmatullah Musawi et.al. |
2505.08389 |
null |
2025-05-13 |
Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation |
Enci Zhang et.al. |
2505.08364 |
null |
2025-05-13 |
Hamiltonian replica exchange augmented with diffusion-based generative models and importance sampling to assess biomolecular conformational basins and barriers |
Zakarya Benayad et.al. |
2505.08357 |
null |
2025-05-13 |
Alignment Drift in CEFR-prompted LLMs for Interactive Spanish Tutoring |
Mina Almasi et.al. |
2505.08351 |
null |
2025-05-13 |
Benchmarking AI scientists in omics data-driven biological research |
Erpai Luo et.al. |
2505.08341 |
link |
2025-05-13 |
Evaluating the Effectiveness of Black-Box Prompt Optimization as the Scale of LLMs Continues to Grow |
Ziyu Zhou et.al. |
2505.08303 |
null |
2025-05-13 |
A Practical Introduction to Deep Reinforcement Learning |
Yinghan Sun et.al. |
2505.08295 |
null |
2025-05-13 |
Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion |
Anle Ke et.al. |
2505.08281 |
null |
2025-05-13 |
LLM Enhancers for GNNs: An Analysis from the Perspective of Causal Mechanism Identification |
Hang Gao et.al. |
2505.08265 |
null |
2025-05-13 |
LLM-Based Detection of Tangled Code Changes for Higher-Quality Method-Level Bug Datasets |
Md Nahidul Islam Opu et.al. |
2505.08263 |
null |
2025-05-13 |
Enhancing Cache-Augmented Generation (CAG) with Adaptive Contextual Compression for Scalable Knowledge Integration |
Rishabh Agrawal et.al. |
2505.08261 |
null |
2025-05-13 |
Evaluating LLM Metrics Through Real-World Capabilities |
Justin K Miller et.al. |
2505.08253 |
null |
2025-05-13 |
Identifying Memorization of Diffusion Models through p-Laplace Analysis |
Jonathan Brokman et.al. |
2505.08246 |
link |
2025-05-13 |
Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement |
Haoran Ye et.al. |
2505.08245 |
link |
2025-05-13 |
Unveiling the Best Practices for Applying Speech Foundation Models to Speech Intelligibility Prediction for Hearing-Impaired People |
Haoshuai Zhou et.al. |
2505.08215 |
null |
2025-05-13 |
A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs |
Artem Shelmanov et.al. |
2505.08200 |
null |
2025-05-13 |
Aitomia: Your Intelligent Assistant for AI-Driven Atomistic and Quantum Chemical Simulations |
Jinming Hu et.al. |
2505.08195 |
null |
2025-05-13 |
CLTP: Contrastive Language-Tactile Pre-training for 3D Contact Geometry Understanding |
Wenxuan Ma et.al. |
2505.08194 |
null |
2025-05-13 |
DSADF: Thinking Fast and Slow for Decision Making |
Alex Zhihao Dou et.al. |
2505.08189 |
null |
2025-05-14 |
Fusing Bidirectional Chains of Thought and Reward Mechanisms: A Method for Enhancing Question-Answering Capabilities of Large Language Models for Chinese Intangible Cultural Heritage |
Ruilin Liu et.al. |
2505.08167 |
null |
2025-05-13 |
Decoding Neighborhood Environments with Large Language Models |
Andrew Cart et.al. |
2505.08163 |
null |
2025-05-13 |
Foundation Models Knowledge Distillation For Battery Capacity Degradation Forecast |
Joey Chan et.al. |
2505.08151 |
link |
2025-05-13 |
A Large-Scale Empirical Analysis of Custom GPTs’ Vulnerabilities in the OpenAI Ecosystem |
Sunday Oyinlola Ogundoyin et.al. |
2505.08148 |
link |
2025-05-13 |
Communication Styles and Reader Preferences of LLM and Human Experts in Explaining Health Information |
Jiawei Zhou et.al. |
2505.08143 |
null |
2025-05-13 |
Lost in Transmission: When and Why LLMs Fail to Reason Globally |
Tobias Schnabel et.al. |
2505.08140 |
null |
2025-05-13 |
Large Language Models for Computer-Aided Design: A Survey |
Licheng Zhang et.al. |
2505.08137 |
link |
2025-05-13 |
Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions |
Keita Teranishi et.al. |
2505.08135 |
null |
2025-05-13 |
ALOHA: Empowering Multilingual Agent for University Orientation with Hierarchical Retrieval |
Mingxu Tao et.al. |
2505.08130 |
null |
2025-05-12 |
Will Your Next Pair Programming Partner Be Human? An Empirical Evaluation of Generative AI as a Collaborative Teammate in a Semester-Long Classroom Setting |
Wenhan Lyu et.al. |
2505.08119 |
null |
2025-05-12 |
Are LLMs complicated ethical dilemma analyzers? |
Jiashen et.al. |
2505.08106 |
link |
2025-05-12 |
Visually Interpretable Subtask Reasoning for Visual Question Answering |
Yu Cheng et.al. |
2505.08084 |
null |
2025-05-12 |
LLMs to Support K-12 Teachers in Culturally Relevant Pedagogy: An AI Literacy Example |
Jiayi Wang et.al. |
2505.08083 |
null |
2025-05-12 |
Fréchet Power-Scenario Distance: A Metric for Evaluating Generative AI Models across Multiple Time-Scales in Smart Grids |
Yuting Cai et.al. |
2505.08082 |
null |
2025-05-12 |
Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders |
Dong Shu et.al. |
2505.08080 |
null |
2025-05-12 |
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning |
Zhehao Zhang et.al. |
2505.08054 |
null |
2025-05-12 |
Protein FID: Improved Evaluation of Protein Structure Generative Models |
Felix Faltings et.al. |
2505.08041 |
null |
2025-05-12 |
Opportunities and Applications of GenAI in Smart Cities: A User-Centric Survey |
Ankit Shetgaonkar et.al. |
2505.08034 |
null |
2025-05-12 |
Large Language Models and Arabic Content: A Review |
Haneh Rhel et.al. |
2505.08004 |
null |
2025-05-12 |
Vision Foundation Model Embedding-Based Semantic Anomaly Detection |
Max Peter Ronecker et.al. |
2505.07998 |
null |
2025-05-12 |
Spec2Assertion: Automatic Pre-RTL Assertion Generation using Large Language Models with Progressive Regularization |
Fenghua Wu et.al. |
2505.07995 |
null |
2025-05-12 |
MilChat: Introducing Chain of Thought Reasoning and GRPO to a Multimodal Small Language Model for Remote Sensing |
Aybora Koksal et.al. |
2505.07984 |
null |
2025-05-12 |
Assessing and Mitigating Medical Knowledge Drift and Conflicts in Large Language Models |
Weiyi Wu et.al. |
2505.07968 |
null |
2025-05-12 |
Symbolic Regression with Multimodal Large Language Models and Kolmogorov Arnold Networks |
Thomas R. Harvey et.al. |
2505.07956 |
null |
2025-05-12 |
H$^{\mathbf{3}}$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning |
Yiyang Lu et.al. |
2505.07819 |
null |
2025-05-12 |
DanceGRPO: Unleashing GRPO on Visual Generation |
Zeyue Xue et.al. |
2505.07818 |
null |
2025-05-12 |
Continuous Visual Autoregressive Generation via Score Maximization |
Chenze Shao et.al. |
2505.07812 |
link |
2025-05-12 |
Improving Trajectory Stitching with Flow Models |
Reece O’Mahoney et.al. |
2505.07802 |
null |
2025-05-12 |
Learning Dynamics in Continual Pre-Training for Large Language Models |
Xingjin Wang et.al. |
2505.07796 |
null |
2025-05-12 |
Domain Regeneration: How well do LLMs match syntactic properties of text domains? |
Da Ju et.al. |
2505.07784 |
null |
2025-05-12 |
Relative Overfitting and Accept-Reject Framework |
Yanxin Liu et.al. |
2505.07783 |
null |
2025-05-12 |
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering |
Rushi Qiang et.al. |
2505.07782 |
link |
2025-05-12 |
Synthesizing Diverse Network Flow Datasets with Scalable Dynamic Multigraph Generation |
Arya Grayeli et.al. |
2505.07777 |
null |
2025-05-12 |
Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving |
Xinji Mai et.al. |
2505.07773 |
link |
2025-05-12 |
Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding |
Yifeng Di et.al. |
2505.07768 |
link |
2025-05-12 |
BodyGPS: Anatomical Positioning System |
Halid Ziya Yerebakan et.al. |
2505.07744 |
null |
2025-05-12 |
Assessing the Chemical Intelligence of Large Language Models |
Nicholas T. Runcie et.al. |
2505.07735 |
link |
2025-05-12 |
LAMM-ViT: AI Face Detection via Layer-Aware Modulation of Region-Guided Attention |
Jiangling Zhang et.al. |
2505.07734 |
null |
2025-05-12 |
Spoken Language Understanding on Unseen Tasks With In-Context Learning |
Neeraj Agrawal et.al. |
2505.07731 |
null |
2025-05-12 |
Circuit Partitioning Using Large Language Models for Quantum Compilation and Simulations |
Pranav Sinha et.al. |
2505.07711 |
null |
2025-05-12 |
PatchTrack: A Comprehensive Analysis of ChatGPT’s Influence on Pull Request Outcomes |
Daniel Ogenrwot et.al. |
2505.07700 |
null |
2025-05-12 |
S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models |
Muzhi Dai et.al. |
2505.07686 |
null |
2025-05-12 |
Multimodal Survival Modeling in the Age of Foundation Models |
Steven Song et.al. |
2505.07683 |
link |
2025-05-12 |
SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models |
Hang Wu et.al. |
2505.07680 |
null |
2025-05-13 |
OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit |
Arun S. Maiya et.al. |
2505.07672 |
link |
2025-05-12 |
Benchmarking Retrieval-Augmented Generation for Chemistry |
Xianrui Zhong et.al. |
2505.07671 |
null |
2025-05-12 |
A Case Study Investigating the Role of Generative AI in Quality Evaluations of Epics in Agile Software Development |
Werner Geyer et.al. |
2505.07664 |
null |
2025-05-12 |
JobHop: A Large-Scale Dataset of Career Trajectories |
Iman Johary et.al. |
2505.07653 |
null |
2025-05-12 |
Neural Brain: A Neuroscience-inspired Framework for Embodied Agents |
Jian Liu et.al. |
2505.07634 |
link |
2025-05-12 |
Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models |
Riccardo Passoni et.al. |
2505.07615 |
null |
2025-05-12 |
Concept-Level Explainability for Auditing & Steering LLM Responses |
Kenza Amara et.al. |
2505.07610 |
link |
2025-05-12 |
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining |
Paul Primus et.al. |
2505.07609 |
null |
2025-05-12 |
MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining |
Xiaomi LLM-Core Team et.al. |
2505.07608 |
link |
2025-05-12 |
Characterizing the Investigative Methods of Fictional Detectives with Large Language Models |
Edirlei Soares de Lima et.al. |
2505.07601 |
null |
2025-05-12 |
Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent |
Ziyang Huang et.al. |
2505.07596 |
null |
2025-05-12 |
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models |
Junjie Ye et.al. |
2505.07591 |
link |
2025-05-12 |
SecReEvalBench: A Multi-turned Security Resilience Evaluation Benchmark for Large Language Models |
Huining Cui et.al. |
2505.07584 |
null |
2025-05-12 |
YuLan-OneSim: Towards the Next Generation of Social Simulator with Large Language Models |
Lei Wang et.al. |
2505.07581 |
link |
2025-05-12 |
Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models |
Rei Higuchi et.al. |
2505.07558 |
null |
2025-05-12 |
Injecting Knowledge Graphs into Large Language Models |
Erica Coppolillo et.al. |
2505.07554 |
null |
2025-05-12 |
Towards Requirements Engineering for RAG Systems |
Tor Sporsem et.al. |
2505.07553 |
null |
2025-05-12 |
GRADA: Graph-based Reranker against Adversarial Documents Attack |
Jingjie Zheng et.al. |
2505.07546 |
link |
2025-05-12 |
RAI: Flexible Agent Framework for Embodied AI |
Kajetan Rachwał et.al. |
2505.07532 |
link |
2025-05-12 |
Byam: Fixing Breaking Dependency Updates with Large Language Models |
Frank Reyes et.al. |
2505.07522 |
link |
2025-05-12 |
ToolACE-DEV: Self-Improving Tool Learning via Decomposition and EVolution |
Xu Huang et.al. |
2505.07512 |
null |
2025-05-12 |
Learning to Reason and Navigate: Parameter Efficient Action Planning with Large Language Models |
Bahram Mohammadi et.al. |
2505.07500 |
null |
2025-05-12 |
Web-Bench: A LLM Code Benchmark Based on Web Standards and Frameworks |
Kai Xu et.al. |
2505.07473 |
link |
2025-05-12 |
A Survey on Collaborative Mechanisms Between Large and Small Language Models |
Yi Chen et.al. |
2505.07460 |
null |
2025-05-12 |
Why Uncertainty Estimation Methods Fall Short in RAG: An Axiomatic Analysis |
Heydar Soudani et.al. |
2505.07459 |
null |
2025-05-12 |
Can Generative AI agents behave like humans? Evidence from laboratory market experiments |
R. Maria del Rio-Chanona et.al. |
2505.07457 |
null |
2025-05-12 |
How well do LLMs reason over tabular data, really? |
Cornelius Wolff et.al. |
2505.07453 |
null |
2025-05-13 |
Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model |
Wei Li et.al. |
2505.07449 |
link |
2025-05-12 |
Unified Continuous Generative Models |
Peng Sun et.al. |
2505.07447 |
link |
2025-05-12 |
DiffCrysGen: A Score-Based Diffusion Model for Design of Diverse Inorganic Crystalline Materials |
Sourav Mal et.al. |
2505.07442 |
null |
2025-05-12 |
LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning |
Xiaotian Lin et.al. |
2505.07437 |
link |
2025-05-12 |
A Systematic Literature Review on Neural Code Translation |
Xiang Chen et.al. |
2505.07425 |
null |
2025-05-12 |
AI in Money Matters |
Nadine Sandjo Tchatchoua et.al. |
2505.07393 |
null |
2025-05-12 |
Examining the Role of LLM-Driven Interactions on Attention and Cognitive Engagement in Virtual Classrooms |
Suleyman Ozdel et.al. |
2505.07377 |
null |
2025-05-12 |
A Preliminary Study of Large Language Models for Multilingual Vulnerability Detection |
Junji Yu et.al. |
2505.07376 |
null |
2025-05-12 |
Synthetic Code Surgery: Repairing Bugs and Vulnerabilities with LLMs and Synthetic Data |
David de-Fitero-Dominguez et.al. |
2505.07372 |
null |
2025-05-12 |
GAN-based synthetic FDG PET images from T1 brain MRI can serve to improve performance of deep unsupervised anomaly detection models |
Daria Zotova et.al. |
2505.07364 |
null |
2025-05-12 |
BinMetric: A Comprehensive Binary Analysis Benchmark for Large Language Models |
Xiuwei Shang et.al. |
2505.07360 |
null |
2025-05-12 |
From Search To Sampling: Generative Models For Robust Algorithmic Recourse |
Prateek Garg et.al. |
2505.07351 |
link |
2025-05-12 |
QUPID: Quantified Understanding for Enhanced Performance, Insights, and Decisions in Korean Search Engines |
Ohjoon Kwon et.al. |
2505.07345 |
null |
2025-05-12 |
Private LoRA Fine-tuning of Open-Source LLMs with Homomorphic Encryption |
Jordan Frery et.al. |
2505.07329 |
null |
2025-05-12 |
Uncertainty Profiles for LLMs: Uncertainty Source Decomposition and Adaptive Model-Metric Selection |
Pei-Fu Guo et.al. |
2505.07309 |
null |
2025-05-12 |
L-SWAG: Layer-Sample Wise Activation with Gradients information for Zero-Shot NAS on Vision Transformers |
Sofia Casarin et.al. |
2505.07300 |
null |
2025-05-12 |
Semantic Retention and Extreme Compression in LLMs: Can We Have Both? |
Stanislas Laborde et.al. |
2505.07289 |
null |
2025-05-12 |
Piloting Structure-Based Drug Design via Modality-Specific Optimal Schedule |
Keyue Qiu et.al. |
2505.07286 |
null |
2025-05-12 |
Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains |
Ibne Farabi Shihab et.al. |
2505.07274 |
link |
2025-05-12 |
Automated Repair of Ambiguous Natural Language Requirements |
Haoxiang Jia et.al. |
2505.07270 |
null |
2025-05-12 |
No Query, No Access |
Wenqiang Wang et.al. |
2505.07258 |
null |
2025-05-12 |
Synthetic Similarity Search in Automotive Production |
Christoph Huber et.al. |
2505.07256 |
null |
2025-05-12 |
SAS-Bench: A Fine-Grained Benchmark for Evaluating Short Answer Scoring with Large Language Models |
Peichao Lai et.al. |
2505.07247 |
link |
2025-05-12 |
Comet: Accelerating Private Inference for Large Language Model by Predicting Activation Sparsity |
Guang Yan et.al. |
2505.07239 |
null |
2025-05-12 |
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation |
Jiashuo Sun et.al. |
2505.07233 |
link |
2025-05-12 |
Spatial Confounding in Multivariate Areal Data Analysis |
Kyle Lin Wu et.al. |
2505.07232 |
link |
2025-05-12 |
Measuring General Intelligence with Generated Games |
Vivek Verma et.al. |
2505.07215 |
link |
2025-05-12 |
Towards user-centered interactive medical image segmentation in VR with an assistive AI agent |
Pascal Spiegler et.al. |
2505.07214 |
null |
2025-05-12 |
Benchmarking Ethical and Safety Risks of Healthcare LLMs in China-Toward Systemic Governance under Healthy China 2030 |
Mouxiao Bian et.al. |
2505.07205 |
null |
2025-05-12 |
PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications |
Kuntai Du et.al. |
2505.07203 |
null |
2025-05-12 |
Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs |
Yifan Wei et.al. |
2505.07184 |
link |
2025-05-12 |
Internet of Agents: Fundamentals, Applications, and Challenges |
Yuntao Wang et.al. |
2505.07176 |
null |
2025-05-12 |
Metrics that matter: Evaluating image quality metrics for medical image generation |
Yash Deo et.al. |
2505.07175 |
link |
2025-05-12 |
One Trigger Token Is Enough: A Defense Strategy for Balancing Safety and Usability in Large Language Models |
Haoran Gu et.al. |
2505.07167 |
null |
2025-05-12 |
KDH-MLTC: Knowledge Distillation for Healthcare Multi-Label Text Classification |
Hajar Sakai et.al. |
2505.07162 |
null |
2025-05-12 |
HAMLET: Healthcare-focused Adaptive Multilingual Learning Embedding-based Topic Modeling |
Hajar Sakai et.al. |
2505.07157 |
null |
2025-05-12 |
Reassessing Large Language Model Boolean Query Generation for Systematic Reviews |
Shuai Wang et.al. |
2505.07155 |
null |
2025-05-13 |
Exploring Anthropomorphism in Conversational Agents for Environmental Sustainability |
Mathyas Giudici et.al. |
2505.07142 |
null |
2025-05-11 |
KOKKAI DOC: An LLM-driven framework for scaling parliamentary representatives |
Ken Kato et.al. |
2505.07118 |
null |
2025-05-11 |
Knowledge Distillation for Enhancing Walmart E-commerce Search Relevance Using Large Language Models |
Hongwei Shang et.al. |
2505.07105 |
null |
2025-05-11 |
RefPentester: A Knowledge-Informed Self-Reflective Penetration Testing Framework Based on Large Language Models |
Hanzheng Dai et.al. |
2505.07089 |
null |
2025-05-11 |
Architectural Precedents for General Agents using Large Language Models |
Robert E. Wray et.al. |
2505.07087 |
null |
2025-05-11 |
Multi-Objective-Guided Discrete Flow Matching for Controllable Biological Sequence Design |
Tong Chen et.al. |
2505.07086 |
null |
2025-05-11 |
DriveSOTIF: Advancing Perception SOTIF Through Multimodal Large Language Models |
Shucheng Huang et.al. |
2505.07084 |
link |
2025-05-11 |
Can LLM-based Financial Investing Strategies Outperform the Market in Long Run? |
Weixian Waylon Li et.al. |
2505.07078 |
link |
2025-05-11 |
ParaView-MCP: An Autonomous Visualization Agent with Direct Tool Use |
Shusen Liu et.al. |
2505.07064 |
null |
2025-05-11 |
Seed1.5-VL Technical Report |
Dong Guo et.al. |
2505.07062 |
null |
2025-05-11 |
LLM-Augmented Chemical Synthesis and Design Decision Programs |
Haorui Wang et.al. |
2505.07027 |
null |
2025-05-11 |
A Vision-Language Foundation Model for Leaf Disease Identification |
Khang Nguyen Quoc et.al. |
2505.07019 |
link |
2025-05-11 |
MELLM: Exploring LLM-Powered Micro-Expression Understanding Enhanced by Subtle Motion Perception |
Zhengye Zhang et.al. |
2505.07007 |
link |
2025-05-11 |
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance |
Jinuk Kim et.al. |
2505.07004 |
link |
2025-05-11 |
Convert Language Model into a Value-based Strategic Planner |
Xiaoyu Wang et.al. |
2505.06987 |
null |
2025-05-11 |
Web Page Classification using LLMs for Crawling Support |
Yuichi Sasazawa et.al. |
2505.06972 |
link |
2025-05-09 |
Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks |
Christos Plachouras et.al. |
2505.06224 |
link |
2025-05-09 |
Adapting a Segmentation Foundation Model for Medical Image Classification |
Pengfei Gu et.al. |
2505.06217 |
null |
2025-05-09 |
From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling |
Vahid Rahimzadeh et.al. |
2505.06184 |
null |
2025-05-09 |
A Large Language Model-Enhanced Q-learning for Capacitated Vehicle Routing Problem with Time Windows |
Linjiang Cao et.al. |
2505.06178 |
null |
2025-05-09 |
MonetGPT: Solving Puzzles Enhances MLLMs’ Image Retouching Skills |
Niladri Shekhar Dutt et.al. |
2505.06176 |
null |
2025-05-09 |
Turbo-ICL: In-Context Learning-Based Turbo Equalization |
Zihang Song et.al. |
2505.06175 |
null |
2025-05-09 |
A Scaling Law for Token Efficiency in LLM Fine-Tuning Under Fixed Compute Budgets |
Ryan Lagasse et.al. |
2505.06150 |
null |
2025-05-09 |
Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study |
Faeze Ghorbanpour et.al. |
2505.06149 |
null |
2025-05-09 |
Constraints to Lorentz violation and ultrahigh-energy electrons in D-foamy space-times |
Chengyi Li et.al. |
2505.06121 |
null |
2025-05-09 |
LLMs Get Lost In Multi-Turn Conversation |
Philippe Laban et.al. |
2505.06120 |
link |
2025-05-09 |
Photovoltaic Defect Image Generator with Boundary Alignment Smoothing Constraint for Domain Shift Mitigation |
Dongying Li et.al. |
2505.06117 |
null |
2025-05-09 |
LLMs Outperform Experts on Challenging Biology Benchmarks |
Lennart Justen et.al. |
2505.06108 |
null |
2025-05-09 |
Free and Fair Hardware: A Pathway to Copyright Infringement-Free Verilog Generation using LLMs |
Sam Bush et.al. |
2505.06096 |
null |
2025-05-09 |
Assessing Tenstorrent’s RISC-V MatMul Acceleration Capabilities |
Hiari Pizzini Cavagna et.al. |
2505.06085 |
null |
2025-05-09 |
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information |
Joshua Harris et.al. |
2505.06046 |
null |
2025-05-09 |
Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation |
Stefan Vasilev et.al. |
2505.06027 |
null |
2025-05-09 |
ArtRAG: Retrieval-Augmented Generation with Structured Context for Visual Art Understanding |
Shuai Wang et.al. |
2505.06020 |
null |
2025-05-09 |
Task-Adapter++: Task-specific Adaptation with Order-aware Alignment for Few-shot Action Recognition |
Congqi Cao et.al. |
2505.06002 |
link |
2025-05-09 |
Offline Multi-agent Reinforcement Learning via Score Decomposition |
Dan Qiao et.al. |
2505.05968 |
null |
2025-05-09 |
GEORCE: A Fast New Control Algorithm for Computing Geodesics |
Frederik Möbius Rygaard et.al. |
2505.05961 |
link |
2025-05-09 |
NeoQA: Evidence-based Question Answering with Generated News Events |
Max Glockner et.al. |
2505.05949 |
link |
2025-05-09 |
Summarisation of German Judgments in conjunction with a Class-based Evaluation |
Bianca Steffes et.al. |
2505.05947 |
link |
2025-05-09 |
Elastic Weight Consolidation for Full-Parameter Continual Pre-Training of Gemma2 |
Vytenis Šliogeris et.al. |
2505.05946 |
null |
2025-05-09 |
Autoencoder-Based Hybrid Replay for Class-Incremental Learning |
Milad Khademi Nori et.al. |
2505.05926 |
null |
2025-05-09 |
CAPE: Context-Aware Prompt Perturbation Mechanism with Differential Privacy |
Haoqi Wu et.al. |
2505.05922 |
null |
2025-05-09 |
Generative Discovery of Partial Differential Equations by Learning from Math Handbooks |
Hao Xu et.al. |
2505.05869 |
null |
2025-05-09 |
Evolutionary ecology of words |
Reiji Suzuki et.al. |
2505.05863 |
null |
2025-05-09 |
AgentXploit: End-to-End Redteaming of Black-Box AI Agents |
Zhun Wang et.al. |
2505.05849 |
null |
2025-05-09 |
Augmented Body Communicator: Enhancing daily body expression for people with upper limb limitations through LLM and a robotic arm |
Songchen Zhou et.al. |
2505.05832 |
null |
2025-05-09 |
Tell Me Who Your Students Are: GPT Can Generate Valid Multiple-Choice Questions When Students’ (Mis)Understanding Is Hinted |
Machi Shimmei et.al. |
2505.05815 |
null |
2025-05-09 |
What Is Next for LLMs? Next-Generation AI Computing Hardware Using Photonic Chips |
Renjie Li et.al. |
2505.05794 |
null |
2025-05-09 |
A Day in Their Shoes: Using LLM-Based Perspective-Taking Interactive Fiction to Reduce Stigma Toward Dirty Work |
Xiangzhe Yuan et.al. |
2505.05786 |
null |
2025-05-09 |
Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM |
Zehao Fan et.al. |
2505.05772 |
null |
2025-05-09 |
Multi-Agent Systems for Robotic Autonomy with LLMs |
Junhong Chen et.al. |
2505.05762 |
null |
2025-05-09 |
APOLLO: Automated LLM and Lean Collaboration for Advanced Formal Reasoning |
Azim Ospanov et.al. |
2505.05758 |
null |
2025-05-09 |
Evolutionary thoughts: integration of large language models and evolutionary algorithms |
Antonio Jimeno Yepes et.al. |
2505.05756 |
link |
2025-05-09 |
Harnessing LLMs Explanations to Boost Surrogate Models in Tabular Data Classification |
Ruxue Shi et.al. |
2505.05744 |
null |
2025-05-09 |
Multimodal Integrated Knowledge Transfer to Large Language Models through Preference Optimization with Biomedical Applications |
Da Wu et.al. |
2505.05736 |
link |
2025-05-09 |
Automated Learning of Semantic Embedding Representations for Diffusion Models |
Limai Jiang et.al. |
2505.05732 |
null |
2025-05-09 |
Understanding Stragglers in Large Model Training Using What-if Analysis |
Jinkun Lin et.al. |
2505.05713 |
link |
2025-05-09 |
LLM-Text Watermarking based on Lagrange Interpolation |
Jarosław Janas et.al. |
2505.05712 |
null |
2025-05-09 |
HyperspectralMAE: The Hyperspectral Imagery Classification Model using Fourier-Encoded Dual-Branch Masked Autoencoder |
Wooyoung Jeong et.al. |
2505.05710 |
null |
2025-05-09 |
Assessing Robustness to Spurious Correlations in Post-Training Language Models |
Julia Shuieh et.al. |
2505.05704 |
null |
2025-05-08 |
Fine-Tuning Video-Text Contrastive Model for Primate Behavior Retrieval from Unlabeled Raw Videos |
Giulio Cesare Mastrocinque Santo et.al. |
2505.05681 |
null |
2025-05-08 |
From Bias To Improved Prompts: A Case Study of Bias Mitigation of Clone Detection Models |
QiHong Chen et.al. |
2505.05679 |
null |
2025-05-08 |
InstanceGen: Image Generation with Instance-level Instructions |
Etai Sella et.al. |
2505.05678 |
link |
2025-05-08 |
Lost in OCR Translation? Vision-Based Approaches to Robust Document Retrieval |
Alexander Most et.al. |
2505.05666 |
null |
2025-05-08 |
Adaptive Stress Testing Black-Box LLM Planners |
Neeloy Chakraborty et.al. |
2505.05665 |
null |
2025-05-08 |
Not Like Us, Hunty: Measuring Perceptions and Behavioral Effects of Minoritized Anthropomorphic Cues in LLMs |
Jeffrey Basoah et.al. |
2505.05660 |
null |
2025-05-08 |
The Moon’s Many Faces: A Single Unified Transformer for Multimodal Lunar Reconstruction |
Tom Sander et.al. |
2505.05644 |
null |
2025-05-08 |
Looking Beyond Language Priors: Enhancing Visual Comprehension and Attention in Multimodal Models |
Aarti Ghatkesar et.al. |
2505.05626 |
null |
2025-05-08 |
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory |
Weichen Zhang et.al. |
2505.05622 |
link |
2025-05-08 |
LiteLMGuard: Seamless and Lightweight On-Device Prompt Filtering for Safeguarding Small Language Models against Quantization-induced Risks and Vulnerabilities |
Kalyan Nakka et.al. |
2505.05619 |
link |
2025-05-08 |
Leveraging Large Language Models for enzymatic reaction prediction and characterization |
Lorenzo Di Fruscia et.al. |
2505.05616 |
link |
2025-05-08 |
scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction |
Qing Wang et.al. |
2505.05612 |
link |
2025-05-08 |
HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics |
Lennart Luettgau et.al. |
2505.05602 |
link |
2025-05-08 |
Enhancing Large Language Models with Faster Code Preprocessing for Vulnerability Detection |
José Gonçalves et.al. |
2505.05600 |
link |
2025-05-08 |
PRIMG: Efficient LLM-driven Test Generation Using Mutant Prioritization |
Mohamed Salah Bouafif et.al. |
2505.05584 |
link |
2025-05-08 |
KG-HTC: Integrating Knowledge Graphs into LLMs for Effective Zero-shot Hierarchical Text Classification |
Qianbo Zang et.al. |
2505.05583 |
link |
2025-05-08 |
PyTDC: A multimodal machine learning training, evaluation, and inference platform for biomedical foundation models |
Alejandro Velez-Arce et.al. |
2505.05577 |
link |
2025-05-08 |
Griffin: Towards a Graph-Centric Relational Database Foundation Model |
Yanbo Wang et.al. |
2505.05568 |
link |
2025-05-08 |
3D Scene Generation: A Survey |
Beichen Wen et.al. |
2505.05474 |
link |
2025-05-08 |
Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation |
Chao Liao et.al. |
2505.05472 |
null |
2025-05-08 |
Generating Physically Stable and Buildable LEGO Designs from Text |
Ava Pun et.al. |
2505.05469 |
link |
2025-05-08 |
StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant |
Haibo Wang et.al. |
2505.05467 |
null |
2025-05-08 |
ComPO: Preference Alignment via Comparison Oracles |
Peter Chen et.al. |
2505.05465 |
null |
2025-05-08 |
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging |
Shiqi Chen et.al. |
2505.05464 |
link |
2025-05-08 |
UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections |
Fatima Haouari et.al. |
2505.05459 |
null |
2025-05-08 |
Conversational Process Model Redesign |
Nataliia Klievtsova et.al. |
2505.05453 |
null |
2025-05-08 |
clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations |
Chalamalasetti Kranti et.al. |
2505.05445 |
null |
2025-05-08 |
GesPrompt: Leveraging Co-Speech Gestures to Augment LLM-Based Interaction in Virtual Reality |
Xiyun Hu et.al. |
2505.05441 |
null |
2025-05-09 |
EcoAgent: An Efficient Edge-Cloud Collaborative Multi-Agent Framework for Mobile Automation |
Biao Yi et.al. |
2505.05440 |
null |
2025-05-08 |
Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data |
Yudong Wang et.al. |
2505.05427 |
null |
2025-05-09 |
LiTransProQA: an LLM-based Literary Translation evaluation metric with Professional Question Answering |
Ran Zhang et.al. |
2505.05423 |
link |
2025-05-08 |
Crosslingual Reasoning through Test-Time Scaling |
Zheng-Xin Yong et.al. |
2505.05408 |
link |
2025-05-08 |
Frame In, Frame Out: Do LLMs Generate More Biased News Headlines than Humans? |
Valeria Pastorino et.al. |
2505.05406 |
null |
2025-05-08 |
A Pain Assessment Framework based on multimodal data and Deep Machine Learning methods |
Stefanos Gkikas et.al. |
2505.05396 |
null |
2025-05-08 |
Modelling and Verifying Neuronal Archetypes in Coq |
Abdorrahim Bahrami et.al. |
2505.05362 |
link |
2025-05-08 |
DSDrive: Distilling Large Language Model for Lightweight End-to-End Autonomous Driving with Unified Reasoning and Planning |
Wenru Liu et.al. |
2505.05360 |
null |
2025-05-08 |
Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization |
Sooyoung Park et.al. |
2505.05343 |
link |
2025-05-08 |
ICon: In-Context Contribution for Automatic Data Selection |
Yixin Yang et.al. |
2505.05327 |
null |
2025-05-08 |
Toward Reasonable Parrots: Why Large Language Models Should Argue with Us by Design |
Elena Musi et.al. |
2505.05298 |
null |
2025-05-08 |
Benchmarking Ophthalmology Foundation Models for Clinically Significant Age Macular Degeneration Detection |
Benjamin A. Cohen et.al. |
2505.05291 |
null |
2025-05-08 |
HEXGEN-TEXT2SQL: Optimizing LLM Inference Request Scheduling for Agentic Text-to-SQL Workflow |
You Peng et.al. |
2505.05286 |
link |
2025-05-09 |
Software Development Life Cycle Perspective: A Survey of Benchmarks for Code Large Language Models and Agents |
Kaixin Wang et.al. |
2505.05283 |
null |
2025-05-08 |
MTL-UE: Learning to Learn Nothing for Multi-Task Learning |
Yi Yu et.al. |
2505.05279 |
null |
2025-05-08 |
PADriver: Towards Personalized Autonomous Driving |
Genghua Kou et.al. |
2505.05240 |
null |
2025-05-08 |
Latte: Transfering LLMs' Latent-level Knowledge for Few-shot Tabular Learning |
Ruxue Shi et.al. |
2505.05237 |
null |
2025-05-08 |
ChemRxivQuest: A Curated Chemistry Question-Answer Database Extracted from ChemRxiv Preprints |
Mahmoud Amiri et.al. |
2505.05232 |
null |
2025-05-08 |
QualBench: Benchmarking Chinese LLMs with Localized Professional Qualifications for Vertical Domain Evaluation |
Mengze Hong et.al. |
2505.05225 |
null |
2025-05-08 |
Diffusion Model Quantization: A Review |
Qian Zeng et.al. |
2505.05215 |
link |
2025-05-08 |
Stealthy LLM-Driven Data Poisoning Attacks Against Embedding-Based Retrieval-Augmented Recommender Systems |
Fatemeh Nazary et.al. |
2505.05196 |
null |
2025-05-08 |
Revealing Weaknesses in Text Watermarking Through Self-Information Rewrite Attacks |
Yixin Cheng et.al. |
2505.05190 |
link |
2025-05-08 |
Biomed-DPT: Dual Modality Prompt Tuning for Biomedical Vision-Language Models |
Wei Peng et.al. |
2505.05189 |
link |
2025-05-08 |
MARK: Memory Augmented Refinement of Knowledge |
Anish Ganguli et.al. |
2505.05177 |
null |
2025-05-08 |
FedTDP: A Privacy-Preserving and Unified Framework for Trajectory Data Preparation via Federated Learning |
Zhihao Zeng et.al. |
2505.05155 |
null |
2025-05-08 |
Overcoming Dimensional Factorization Limits in Discrete Diffusion Models through Quantum Joint Distribution Learning |
Chuangtao Chen et.al. |
2505.05151 |
link |
2025-05-08 |
Text2Cypher: Data Pruning using Hard Example Selection |
Makbule Gulcin Ozsoy et.al. |
2505.05122 |
null |
2025-05-08 |
Enhancing Text2Cypher with Schema Filtering |
Makbule Gulcin Ozsoy et.al. |
2505.05118 |
null |
2025-05-08 |
Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders |
Boyi Deng et.al. |
2505.05111 |
null |
2025-05-08 |
Multi-agent Embodied AI: Advances and Future Directions |
Zhaohan Feng et.al. |
2505.05108 |
null |
2025-05-08 |
A Weighted Byzantine Fault Tolerance Consensus Driven Trusted Multiple Large Language Models Network |
Haoxiang Luo et.al. |
2505.05103 |
null |
2025-05-08 |
X-Driver: Explainable Autonomous Driving with Vision-Language Models |
Wei Liu et.al. |
2505.05098 |
null |
2025-05-08 |
Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction |
Xiaowei Zhu et.al. |
2505.05084 |
null |
2025-05-08 |
ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model |
Sagnik Bhattacharya et.al. |
2505.05082 |
null |
2025-05-08 |
Performance Evaluation of Large Language Models in Bangla Consumer Health Query Summarization |
Ajwad Abrar et.al. |
2505.05070 |
null |
2025-05-08 |
WaterDrum: Watermarking for Data-centric Unlearning Metric |
Xinyang Lu et.al. |
2505.05064 |
link |
2025-05-08 |
CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts |
Manik Sheokand et.al. |
2505.05063 |
null |
2025-05-08 |
ULFine: Unbiased Lightweight Fine-tuning for Foundation-Model-Assisted Long-Tailed Semi-Supervised Learning |
Enhao Zhang et.al. |
2505.05062 |
null |
2025-05-08 |
Towards Mitigating API Hallucination in Code Generated by LLMs with Hierarchical Dependency Aware |
Yujia Chen et.al. |
2505.05057 |
link |
2025-05-08 |
Statistical method for A-RNA and B-DNA |
Marco Zoli et.al. |
2505.05053 |
null |
2025-05-09 |
UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model |
Timo Kaiser et.al. |
2505.05049 |
link |
2025-05-08 |
LSRP: A Leader-Subordinate Retrieval Framework for Privacy-Preserving Cloud-Device Collaboration |
Yingyi Zhang et.al. |
2505.05031 |
link |
2025-05-08 |
A Reputation System for Large Language Model-based Multi-agent Systems to Avoid the Tragedy of the Commons |
Siyue Ren et.al. |
2505.05029 |
null |
2025-05-08 |
Generative Models for Long Time Series: Approximately Equivariant Recurrent Network Structures for an Adjusted Training Scheme |
Ruwen Fulek et.al. |
2505.05020 |
link |
2025-05-08 |
Generating Reliable Synthetic Clinical Trial Data: The Role of Hyperparameter Optimization and Domain Constraints |
Waldemar Hahn et.al. |
2505.05019 |
null |
2025-05-08 |
Scalable Multi-Stage Influence Function for Large Language Models via Eigenvalue-Corrected Kronecker-Factored Parameterization |
Yuntai Bao et.al. |
2505.05017 |
link |
2025-05-08 |
The Pitfalls of Growing Group Complexity: LLMs and Social Choice-Based Aggregation for Group Recommendations |
Cedric Waterschoot et.al. |
2505.05016 |
null |
2025-05-08 |
Inter-Diffusion Generation Model of Speakers and Listeners for Effective Communication |
Jinhe Huang et.al. |
2505.04996 |
null |
2025-05-08 |
Rethinking Invariance in In-context Learning |
Lizhe Fang et.al. |
2505.04994 |
null |
2025-05-08 |
Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes |
Zhuocheng Gong et.al. |
2505.04993 |
null |
2025-05-08 |
Boosting Statistic Learning with Synthetic Data from Pretrained Large Models |
Jialong Jiang et.al. |
2505.04992 |
null |
2025-05-08 |
LVLM-MPC Collaboration for Autonomous Driving: A Safety-Aware and Task-Scalable Control Architecture |
Kazuki Atsuta et.al. |
2505.04980 |
null |
2025-05-08 |
ReAlign: Bilingual Text-to-Motion Generation via Step-Aware Reward-Guided Alignment |
Wanjiang Weng et.al. |
2505.04974 |
null |
2025-05-08 |
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding |
Henry Zheng et.al. |
2505.04965 |
null |
2025-05-08 |
Learning Item Representations Directly from Multimodal Features for Effective Recommendation |
Xin Zhou et.al. |
2505.04960 |
link |
2025-05-08 |
Graffe: Graph Representation Learning via Diffusion Probabilistic Models |
Dingshuo Chen et.al. |
2505.04956 |
null |
2025-05-08 |
Chain-of-Thought Tokens are Computer Program Variables |
Fangwei Zhu et.al. |
2505.04955 |
link |
2025-05-08 |
Position: Epistemic Artificial Intelligence is Essential for Machine Learning Models to Know When They Do Not Know |
Shireen Kudukkil Manchingal et.al. |
2505.04950 |
null |
2025-05-08 |
Prompt-Based LLMs for Position Bias-Aware Reranking in Personalized Recommendations |
Md Aminul Islam et.al. |
2505.04948 |
link |
2025-05-08 |
T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models |
Xuyang Guo et.al. |
2505.04946 |
null |
2025-05-08 |
An Open-Source Dual-Loss Embedding Model for Semantic Retrieval in Higher Education |
Ramteja Sajja et.al. |
2505.04916 |
null |
2025-05-08 |
SpatialPrompting: Keyframe-driven Zero-Shot Spatial Reasoning with Off-the-Shelf Multimodal Large Language Models |
Shun Taguchi et.al. |
2505.04911 |
null |
2025-05-08 |
Clustering with Communication: A Variational Framework for Single Cell Representation Learning |
Cong Qi et.al. |
2505.04891 |
null |
2025-05-08 |
A Multi-Agent AI Framework for Immersive Audiobook Production through Spatial Audio and Neural Narration |
Shaja Arul Selvamani et.al. |
2505.04885 |
null |
2025-05-08 |
GroverGPT-2: Simulating Grover’s Algorithm via Chain-of-Thought Reasoning and Quantum-Native Tokenization |
Min Chen et.al. |
2505.04880 |
null |
2025-05-08 |
From First Draft to Final Insight: A Multi-Agent Approach for Feedback Generation |
Jie Cao et.al. |
2505.04869 |
null |
2025-05-08 |
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model |
Navin Ranjan et.al. |
2505.04861 |
null |
2025-05-07 |
CRAFT: Cultural Russian-Oriented Dataset Adaptation for Focused Text-to-Image Generation |
Viacheslav Vasilev et.al. |
2505.04851 |
null |
2025-05-07 |
HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights |
Ozan Gokdemir et.al. |
2505.04846 |
null |
2025-05-07 |
Comparative Study of Generative Models for Early Detection of Failures in Medical Devices |
Binesh Sadanandan et.al. |
2505.04845 |
link |
2025-05-07 |
Osiris: A Lightweight Open-Source Hallucination Detection System |
Alex Shan et.al. |
2505.04844 |
null |
2025-05-07 |
Large Language Models are Autonomous Cyber Defenders |
Sebastián R. Castro et.al. |
2505.04843 |
link |
2025-05-07 |
Steerable Scene Generation with Post Training and Inference-Time Search |
Nicholas Pfaff et.al. |
2505.04831 |
link |
2025-05-07 |
Guide your favorite protein sequence generative model |
Junhao Xiong et.al. |
2505.04823 |
null |
2025-05-07 |
WIR3D: Visually-Informed and Geometry-Aware 3D Shape Abstraction |
Richard Liu et.al. |
2505.04813 |
null |
2025-05-07 |
Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs |
Chetan Pathade et.al. |
2505.04806 |
null |
2025-05-07 |
ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling |
Xiao Wang et.al. |
2505.04802 |
null |
2025-05-07 |
Safeguard-by-Development: A Privacy-Enhanced Development Paradigm for Multi-Agent Collaboration Systems |
Jian Cui et.al. |
2505.04799 |
null |
2025-05-07 |
A Proposal for Evaluating the Operational Risk for ChatBots based on Large Language Models |
Pedro Pinacho-Davidson et.al. |
2505.04784 |
null |
2025-05-07 |
When Bad Data Leads to Good Models |
Kenneth Li et.al. |
2505.04741 |
null |
2025-05-07 |
The Promise and Limits of LLMs in Constructing Proofs and Hints for Logic Problems in Intelligent Tutoring Systems |
Sutapa Dey Tithi et.al. |
2505.04736 |
null |
2025-05-07 |
QBD-RankedDataGen: Generating Custom Ranked Datasets for Improving Query-By-Document Search Using LLM-Reranking with Reduced Human Effort |
Sriram Gopalakrishnan et.al. |
2505.04732 |
null |
2025-05-07 |
SOAEsV2-7B/72B: Full-Pipeline Optimization for State-Owned Enterprise LLMs via Continual Pre-Training, Domain-Progressive SFT and Distillation-Enhanced Speculative Decoding |
Jingyang Deng et.al. |
2505.04723 |
null |
2025-05-07 |
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers |
Divyansh Srivastava et.al. |
2505.04718 |
null |
2025-05-07 |
EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning |
Zhenghao Xing et.al. |
2505.04623 |
link |
2025-05-07 |
On Path to Multimodal Generalist: General-Level and General-Bench |
Hao Fei et.al. |
2505.04620 |
null |
2025-05-07 |
OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution |
Lianghong Guo et.al. |
2505.04606 |
link |
2025-05-07 |
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning |
Xianhang Li et.al. |
2505.04601 |
null |
2025-05-08 |
MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection |
Zhihao Zhang et.al. |
2505.04594 |
null |
2025-05-07 |
ZeroSearch: Incentivize the Search Capability of LLMs without Searching |
Hao Sun et.al. |
2505.04588 |
link |
2025-05-07 |
SlideItRight: Using AI to Find Relevant Slides and Provide Feedback for Open-Ended Questions |
Chloe Qianhui Zhao et.al. |
2505.04584 |
link |
2025-05-07 |
Fight Fire with Fire: Defending Against Malicious RL Fine-Tuning via Reward Neutralization |
Wenjun Cao et.al. |
2505.04578 |
null |
2025-05-07 |
Comparative Analysis of Carbon Footprint in Manual vs. LLM-Assisted Code Development |
Kuen Sum Cheung et.al. |
2505.04521 |
null |
2025-05-07 |
Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs |
Yehui Tang et.al. |
2505.04519 |
null |
2025-05-07 |
Detecting Spelling and Grammatical Anomalies in Russian Poetry Texts |
Ilya Koziev et.al. |
2505.04507 |
null |
2025-05-08 |
Defining and Quantifying Creative Behavior in Popular Image Generators |
Aditi Ramaswamy et.al. |
2505.04497 |
null |
2025-05-07 |
Efficient Flow Matching using Latent Variables |
Anirban Samaddar et.al. |
2505.04486 |
null |
2025-05-07 |
CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation |
Jiahao Li et.al. |
2505.04481 |
null |
2025-05-07 |
TrajEvo: Designing Trajectory Prediction Heuristics via LLM-driven Evolution |
Zhikai Zhao et.al. |
2505.04480 |
link |
2025-05-07 |
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration |
Shigeki Karita et.al. |
2505.04457 |
link |
2025-05-07 |
M2Rec: Multi-scale Mamba for Efficient Sequential Recommendation |
Qianru Zhang et.al. |
2505.04445 |
null |
2025-05-07 |
Towards Effectively Leveraging Execution Traces for Program Repair with Code LLMs |
Mirazul Haque et.al. |
2505.04441 |
null |
2025-05-07 |
OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models |
Xiaoyu Xu et.al. |
2505.04416 |
null |
2025-05-07 |
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception |
Junjie Wang et.al. |
2505.04410 |
link |
2025-05-07 |
YABLoCo: Yet Another Benchmark for Long Context Code Generation |
Aidar Valeev et.al. |
2505.04406 |
null |
2025-05-07 |
Large Means Left: Political Bias in Large Language Models Increases with Their Number of Parameters |
David Exler et.al. |
2505.04393 |
null |
2025-05-07 |
The Aloe Family Recipe for Open and Specialized Healthcare LLMs |
Dario Garcia-Gasulla et.al. |
2505.04388 |
null |
2025-05-07 |
CDE-Mapper: Using Retrieval-Augmented Language Models for Linking Clinical Data Elements to Controlled Vocabularies |
Komal Gilani et.al. |
2505.04365 |
null |
2025-05-07 |
Benchmarking LLMs’ Swarm intelligence |
Kai Ruan et.al. |
2505.04364 |
link |
2025-05-07 |
Optimization Problem Solving Can Transition to Evolutionary Agentic Workflows |
Wenhao Li et.al. |
2505.04354 |
null |
2025-05-07 |
CountDiffusion: Text-to-Image Synthesis with Training-Free Counting-Guidance Diffusion |
Yanyu Li et.al. |
2505.04347 |
null |
2025-05-07 |
Riemannian Denoising Diffusion Probabilistic Models |
Zichen Liu et.al. |
2505.04338 |
null |
2025-05-07 |
GASCADE: Grouped Summarization of Adverse Drug Event for Enhanced Cancer Pharmacovigilance |
Sofia Jamil et.al. |
2505.04284 |
link |
2025-05-07 |
Non-stationary Diffusion For Probabilistic Time Series Forecasting |
Weiwei Ye et.al. |
2505.04278 |
link |
2025-05-07 |
Weaponizing Language Models for Cybersecurity Offensive Operations: Automating Vulnerability Assessment Report Validation; A Review Paper |
Abdulrahman S Almuhaidib et.al. |
2505.04265 |
null |
2025-05-07 |
Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering |
Jessica Y. Bo et.al. |
2505.04260 |
null |
2025-05-07 |
LLM-Independent Adaptive RAG: Let the Question Speak for Itself |
Maria Marina et.al. |
2505.04253 |
null |
2025-05-07 |
A Large Language Model for Feasible and Diverse Population Synthesis |
Sung Yoo Lim et.al. |
2505.04196 |
null |
2025-05-07 |
AutoPatch: Multi-Agent Framework for Patching Real-World CVE Vulnerabilities |
Minjae Seo et.al. |
2505.04195 |
link |
2025-05-07 |
On-Device LLM for Context-Aware Wi-Fi Roaming |
Ju-Hyung Lee et.al. |
2505.04174 |
link |
2025-05-07 |
DiffPattern-Flex: Efficient Layout Pattern Generation via Discrete Diffusion |
Zixiao Wang et.al. |
2505.04173 |
null |
2025-05-07 |
Large Language Models are often politically extreme, usually ideologically inconsistent, and persuasive even in informational contexts |
Nouar Aldahoul et.al. |
2505.04171 |
null |
2025-05-07 |
Can Language Models Understand Social Behavior in Clinical Conversations? |
Manas Satish Bedmutha et.al. |
2505.04152 |
null |
2025-05-07 |
Unmasking the Canvas: A Dynamic Benchmark for Image Generation Jailbreaking and LLM Content Safety |
Variath Madhupal Gautham Nair et.al. |
2505.04146 |
null |
2025-05-07 |
NAMO-LLM: Efficient Navigation Among Movable Obstacles with Large Language Model Guidance |
Yuqing Zhang et.al. |
2505.04141 |
link |
2025-05-07 |
Enhancing Granular Sentiment Classification with Chain-of-Thought Prompting in Large Language Models |
Vihaan Miriyala et.al. |
2505.04135 |
null |
2025-05-07 |
RFNNS: Robust Fixed Neural Network Steganography with Popular Deep Generative Models |
Yu Cheng et.al. |
2505.04116 |
null |
2025-05-07 |
Alpha Excel Benchmark |
David Noever et.al. |
2505.04110 |
null |
2025-05-08 |
MAISY: Motion-Aware Image SYnthesis for Medical Image Motion Correction |
Andrew Zhang et.al. |
2505.04105 |
null |
2025-05-07 |
LLMs’ Suitability for Network Security: A Case Study of STRIDE Threat Modeling |
AbdulAziz AbdulGhaffar et.al. |
2505.04101 |
null |
2025-05-07 |
An Empirical Study of OpenAI API Discussions on Stack Overflow |
Xiang Chen et.al. |
2505.04084 |
null |
2025-05-07 |
QStore: Quantization-Aware Compressed Model Storage |
Raunak Shah et.al. |
2505.04081 |
link |
2025-05-07 |
LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress? |
Teddy Foley et.al. |
2505.04075 |
link |
2025-05-07 |
Natural Language Generation in Healthcare: A Review of Methods and Applications |
Mengxian Lyu et.al. |
2505.04073 |
null |
2025-05-07 |
Advancing and Benchmarking Personalized Tool Invocation for LLMs |
Xu Huang et.al. |
2505.04072 |
link |
2025-05-07 |
Shadow Wireless Intelligence: Large Language Model-Driven Reasoning in Covert Communications |
Yuanai Xie et.al. |
2505.04068 |
null |
2025-05-07 |
BuildingBlock: A Hybrid Approach for Structured Building Generation |
Junming Huang et.al. |
2505.04051 |
null |
2025-05-07 |
Identification and Optimization of Redundant Code Using Large Language Models |
Shamse Tasnim Cynthia et.al. |
2505.04040 |
null |
2025-05-06 |
Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving |
Shan Yu et.al. |
2505.04021 |
null |
2025-05-06 |
SLOT: Structuring the Output of Large Language Models |
Darren Yow-Bang Wang et.al. |
2505.04016 |
null |
2025-05-06 |
Can Large Language Models Predict Parallel Code Performance? |
Gregory Bolet et.al. |
2505.03988 |
null |
2025-05-06 |
LogiDebrief: A Signal-Temporal Logic based Automated Debriefing Approach with Large Language Models Integration |
Zirong Chen et.al. |
2505.03985 |
null |
2025-05-06 |
Diffusion Models are Secretly Exchangeable: Parallelizing DDPMs via Autospeculation |
Hengyuan Hu et.al. |
2505.03983 |
null |
2025-05-06 |
A Reasoning-Focused Legal Retrieval Benchmark |
Lucia Zheng et.al. |
2505.03970 |
null |
2025-05-06 |
nuGAN: Generative Adversarial Emulator for Cosmic Web with Neutrinos |
Neerav Kaushal et.al. |
2505.03936 |
null |
2025-05-06 |
MARCO: A Multi-Agent System for Optimizing HPC Code Generation Using Large Language Models |
Asif Rahman et.al. |
2505.03906 |
null |
2025-05-06 |
Unveiling the Role of ChatGPT in Software Development: Insights from Developer-ChatGPT Interactions on GitHub |
Ruiyin Li et.al. |
2505.03901 |
null |
2025-05-06 |
Machine Learning: a Lecture Note |
Kyunghyun Cho et.al. |
2505.03861 |
null |
2025-05-06 |
VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model |
Zuwei Long et.al. |
2505.03739 |
link |
2025-05-06 |
Meta-Optimization and Program Search using Language Models for Task and Motion Planning |
Denis Shcherba et.al. |
2505.03725 |
null |
2025-05-06 |
Fairness of Automatic Speech Recognition in Cleft Lip and Palate Speech |
Susmita Bhattacharjee et.al. |
2505.03697 |
null |
2025-05-06 |
Graph Drawing for LLMs: An Empirical Evaluation |
Walter Didimo et.al. |
2505.03678 |
null |
2025-05-06 |
Binding threshold units with artificial oscillatory neurons |
Vladimir Fanaskov et.al. |
2505.03648 |
link |
2025-05-06 |
PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing |
Yiping Xie et.al. |
2505.03621 |
null |
2025-05-06 |
From Pixels to Polygons: A Survey of Deep Learning Approaches for Medical Image-to-Mesh Reconstruction |
Fengming Lin et.al. |
2505.03599 |
null |
2025-05-06 |
DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes |
Sergey Linok et.al. |
2505.03581 |
link |
2025-05-06 |
LlamaFirewall: An open source guardrail system for building secure AI agents |
Sahana Chennabasappa et.al. |
2505.03574 |
null |
2025-05-06 |
Say It Another Way: A Framework for User-Grounded Paraphrasing |
Cléa Chataigner et.al. |
2505.03563 |
null |
2025-05-06 |
Real-Time Person Image Synthesis Using a Flow Matching Model |
Jiwoo Jeong et.al. |
2505.03562 |
null |
2025-05-06 |
A Comprehensive Survey of Large AI Models for Future Communications: Foundations, Applications and Challenges |
Feibo Jiang et.al. |
2505.03556 |
link |
2025-05-06 |
A Hashgraph-Inspired Consensus Mechanism for Reliable Multi-Model Reasoning |
Kolawole E. Ogunsina et.al. |
2505.03553 |
null |
2025-05-06 |
STORY2GAME: Generating (Almost) Everything in an Interactive Fiction Game |
Eric Zhou et.al. |
2505.03547 |
null |
2025-05-06 |
Faster MoE LLM Inference for Extremely Large Models |
Haoqi Yang et.al. |
2505.03531 |
null |
2025-05-06 |
Causal Intervention Framework for Variational Auto Encoder Mechanistic Interpretability |
Dip Roy et.al. |
2505.03530 |
null |
2025-05-06 |
Ruled by the Representation Space: On the University’s Embrace of Large Language Models |
Katia Schwerzmann et.al. |
2505.03513 |
null |
2025-05-06 |
Modality-Guided Dynamic Graph Fusion and Temporal Diffusion for Self-Supervised RGB-T Tracking |
Shenglan Li et.al. |
2505.03507 |
link |
2025-05-06 |
BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models |
Zihan Wang et.al. |
2505.03501 |
null |
2025-05-06 |
A new membership inference attack that spots memorization in generative and predictive models: Loss-Based with Reference Model algorithm (LBRM) |
Faiz Taleb et.al. |
2505.03490 |
null |
2025-05-06 |
am-ELO: A Stable Framework for Arena-based LLM Evaluation |
Zirui Liu et.al. |
2505.03475 |
null |
2025-05-06 |
Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning Eliciting Efficient Reasoning in Large Language Models |
Bin Yu et.al. |
2505.03469 |
link |
2025-05-06 |
Uncertainty-Aware Large Language Models for Explainable Disease Diagnosis |
Shuang Zhou et.al. |
2505.03467 |
null |
2025-05-06 |
LogisticsVLN: Vision-Language Navigation For Low-Altitude Terminal Delivery Based on Agentic UAVs |
Xinyuan Zhang et.al. |
2505.03460 |
null |
2025-05-06 |
The Steganographic Potentials of Language Models |
Artem Karpov et.al. |
2505.03439 |
null |
2025-05-06 |
Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents |
Schaun Wheeler et.al. |
2505.03434 |
null |
2025-05-06 |
Wasserstein Convergence of Score-based Generative Models under Semiconvexity and Discontinuous Gradients |
Stefano Bruno et.al. |
2505.03432 |
null |
2025-05-06 |
MedArabiQ: Benchmarking Large Language Models on Arabic Medical Tasks |
Mouath Abu Daoud et.al. |
2505.03427 |
link |
2025-05-06 |
Phenotype-Guided Generative Model for High-Fidelity Cardiac MRI Synthesis: Advancing Pretraining and Clinical Applications |
Ziyu Li et.al. |
2505.03426 |
null |
2025-05-06 |
Directed Greybox Fuzzing via Large Language Model |
Hanxiang Xu et.al. |
2505.03425 |
null |
2025-05-06 |
Knowledge Augmented Complex Problem Solving with Large Language Models: A Survey |
Da Zheng et.al. |
2505.03418 |
null |
2025-05-06 |
Lightweight Clinical Decision Support System using QLoRA-Fine-Tuned LLMs and Retrieval-Augmented Generation |
Mohammad Shoaib Ansari et.al. |
2505.03406 |
null |
2025-05-06 |
Automatic Calibration for Membership Inference Attack on Large Language Models |
Saleh Zare Zade et.al. |
2505.03392 |
link |
2025-05-06 |
SPAP: Structured Pruning via Alternating Optimization and Penalty Methods |
Hanyu Hu et.al. |
2505.03373 |
null |
2025-05-06 |
Validating the Effectiveness of a Large Language Model-based Approach for Identifying Children’s Development across Various Free Play Settings in Kindergarten |
Yuanyuan Yang et.al. |
2505.03369 |
null |
2025-05-06 |
Geospatial Mechanistic Interpretability of Large Language Models |
Stef De Sabbata et.al. |
2505.03368 |
link |
2025-05-06 |
Domain Adversarial Training for Mitigating Gender Bias in Speech-based Mental Health Detection |
June-Woo Kim et.al. |
2505.03359 |
null |
2025-05-06 |
Elevating Cyber Threat Intelligence against Disinformation Campaigns with LLM-based Concept Extraction and the FakeCTI Dataset |
Domenico Cotroneo et.al. |
2505.03345 |
link |
2025-05-06 |
Avoid Recommending Out-of-Domain Items: Constrained Generative Recommendation with LLMs |
Hao Liao et.al. |
2505.03336 |
link |
2025-05-07 |
Absolute Zero: Reinforced Self-play Reasoning with Zero Data |
Andrew Zhao et.al. |
2505.03335 |
link |
2025-05-06 |
AI-Driven Scholarly Peer Review via Persistent Workflow Prompting, Meta-Prompting, and Meta-Reasoning |
Evgeny Markhasin et.al. |
2505.03332 |
null |
2025-05-06 |
Artificial Behavior Intelligence: Technology, Challenges, and Future Directions |
Kanghyun Jo et.al. |
2505.03315 |
null |
2025-05-06 |
Towards Efficient Benchmarking of Foundation Models in Remote Sensing: A Capabilities Encoding Approach |
Pierre Adorni et.al. |
2505.03299 |
link |
2025-05-06 |
Capability-Driven Skill Generation with LLMs: A RAG-Based Approach for Reusing Existing Libraries and Interfaces |
Luis Miguel Vieira da Silva et.al. |
2505.03295 |
null |
2025-05-06 |
Ψ-Arena: Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback |
Shijing Zhu et.al. |
2505.03293 |
null |
2025-05-06 |
RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval-Augmented Generation |
Tiantian Gan et.al. |
2505.03275 |
null |
2025-05-06 |
SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation |
Zhaoxi Mu et.al. |
2505.03273 |
null |
2025-05-06 |
Synthline: A Product Line Approach for Synthetic Requirements Engineering Data Generation using Large Language Models |
Abdelkarim El-Hajjami et.al. |
2505.03265 |
link |
2025-05-06 |
SonicRAG: High Fidelity Sound Effects Synthesis Based on Retrival Augmented Generation |
Yu-Ren Guo et.al. |
2505.03244 |
null |
2025-05-06 |
RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning |
Liam Boyle et.al. |
2505.03238 |
null |
2025-05-06 |
GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data |
Shengliang Deng et.al. |
2505.03233 |
null |
2025-05-06 |
DocSpiral: A Platform for Integrated Assistive Document Annotation through Human-in-the-Spiral |
Qiang Sun et.al. |
2505.03214 |
null |
2025-05-06 |
DYSTIL: Dynamic Strategy Induction with Large Language Models for Reinforcement Learning |
Borui Wang et.al. |
2505.03209 |
null |
2025-05-06 |
Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights |
Zhaiming Shen et.al. |
2505.03205 |
null |
2025-05-06 |
A Trustworthy Multi-LLM Network: Challenges, Solutions, and A Use Case |
Haoxiang Luo et.al. |
2505.03196 |
null |
2025-05-06 |
Patterns and Mechanisms of Contrastive Activation Engineering |
Yixiong Hao et.al. |
2505.03189 |
null |
2025-05-06 |
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making |
Jake Grigsby et.al. |
2505.03181 |
null |
2025-05-06 |
Bridging Expertise Gaps: The Role of LLMs in Human-AI Collaboration for Cybersecurity |
Shahroz Tariq et.al. |
2505.03179 |
null |
2025-05-06 |
CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics |
Junqi Liu et.al. |
2505.03171 |
link |
2025-05-06 |
The Impact of Large Language Models on K-12 Education in Rural India: A Thematic Analysis of Student Volunteer’s Perspectives |
Harshita Goyal et.al. |
2505.03163 |
null |
2025-05-06 |
An LLM-based Self-Evolving Security Framework for 6G Space-Air-Ground Integrated Networks |
Qi Qin et.al. |
2505.03161 |
null |
2025-05-06 |
StableMotion: Training Motion Cleanup Models with Unpaired Corrupted Data |
Yuxuan Mu et.al. |
2505.03154 |
null |
2025-05-06 |
Towards Effective Identification of Attack Techniques in Cyber Threat Intelligence Reports using Large Language Models |
Hoang Cuong Nguyen et.al. |
2505.03147 |
link |
2025-05-06 |
Holmes: Automated Fact Check with Large Language Models |
Haoran Ou et.al. |
2505.03135 |
null |
2025-05-06 |
VISLIX: An XAI Framework for Validating Vision Models with Slice Discovery and Analysis |
Xinyuan Yan et.al. |
2505.03132 |
null |
2025-05-06 |
Plug-and-Play AMC: Context Is King in Training-Free, Open-Set Modulation with LLMs |
Mohammad Rostami et.al. |
2505.03112 |
link |
2025-05-06 |
Towards a standardized methodology and dataset for evaluating LLM-based digital forensic timeline analysis |
Hudan Studiawan et.al. |
2505.03100 |
null |
2025-05-06 |
Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering |
Joshua Owotogbe et.al. |
2505.03096 |
null |
2025-05-05 |
Direct Retrieval-augmented Optimization: Synergizing Knowledge Selection and Language Models |
Zhengliang Shi et.al. |
2505.03075 |
link |
2025-05-05 |
Variational diffusion transformers for conditional sampling of supernovae spectra |
Yunyi Shen et.al. |
2505.03063 |
null |
2025-05-05 |
Improving Model Alignment Through Collective Intelligence of Open-Source LLMS |
Junlin Wang et.al. |
2505.03059 |
null |
2025-05-05 |
34 Examples of LLM Applications in Materials Science and Chemistry: Towards Automation, Assistants, Agents, and Accelerated Scientific Discovery |
Yoel Zimmermann et.al. |
2505.03049 |
null |
2025-05-05 |
MORE: Mobile Manipulation Rearrangement Through Grounded Language Reasoning |
Mohammad Mohammadi et.al. |
2505.03035 |
null |
2025-05-05 |
Evaluating the Impact of AI-Powered Audiovisual Personalization on Learner Emotion, Focus, and Learning Outcomes |
George Xi Wang et.al. |
2505.03033 |
null |
2025-05-05 |
Radio: Rate-Distortion Optimization for Large Language Model Compression |
Sean I. Young et.al. |
2505.03031 |
null |
2025-05-05 |
UCSC at SemEval-2025 Task 3: Context, Models and Prompt Optimization for Automated Hallucination Detection in LLM Output |
Sicong Huang et.al. |
2505.03030 |
null |
2025-05-05 |
Memorization or Interpolation? Detecting LLM Memorization through Input Perturbation Analysis |
Albérick Euraste Djiré et.al. |
2505.03019 |
null |
2025-05-05 |
Lesion-Aware Generative Artificial Intelligence for Virtual Contrast-Enhanced Mammography in Breast Cancer |
Aurora Rofena et.al. |
2505.03018 |
null |
2025-05-05 |
GIF: Generative Inspiration for Face Recognition at Scale |
Saeed Ebrahimi et.al. |
2505.03012 |
null |
2025-05-05 |
Modeling the Impact of Group Interactions on Climate-related Opinion Change in Reddit |
Alessia Antelmi et.al. |
2505.02989 |
link |
2025-05-05 |
Generative modelling of multivariate geometric extremes using normalising flows |
Lambert De Monte et.al. |
2505.02957 |
null |
2025-05-05 |
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference |
Yaoqi Chen et.al. |
2505.02922 |
null |
2025-05-05 |
When Your Own Output Becomes Your Training Data: Noise-to-Meaning Loops and a Formal RSI Trigger |
Rintaro Ando et.al. |
2505.02888 |
link |
2025-05-05 |
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation |
Lu Ling et.al. |
2505.02836 |
null |
2025-05-05 |
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning |
Yi-Fan Zhang et.al. |
2505.02835 |
link |
2025-05-07 |
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves |
Dengyang Jiang et.al. |
2505.02831 |
link |
2025-05-05 |
LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery |
Jerome Quenum et.al. |
2505.02829 |
null |
2025-05-05 |
ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations |
Dmitriy Shopkhoev et.al. |
2505.02819 |
link |
2025-05-05 |
Towards Quantifying the Hessian Structure of Neural Networks |
Zhaorui Dong et.al. |
2505.02809 |
link |
2025-05-05 |
Generating HomeAssistant Automations Using an LLM-based Chatbot |
Mathyas Giudici et.al. |
2505.02802 |
null |
2025-05-05 |
HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models |
Zheng Lin et.al. |
2505.02795 |
null |
2025-05-05 |
Giving Simulated Cells a Voice: Evolving Prompt-to-Intervention Models for Cellular Control |
Nam H. Le et.al. |
2505.02766 |
null |
2025-05-05 |
Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models |
Matthew Dahl et.al. |
2505.02763 |
null |
2025-05-05 |
Using Knowledge Graphs to harvest datasets for efficient CLIP model training |
Simon Ging et.al. |
2505.02746 |
link |
2025-05-06 |
Knowledge Graphs for Enhancing Large Language Models in Entity Disambiguation |
Gerard Pons et.al. |
2505.02737 |
null |
2025-05-05 |
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models |
Zhouliang Yu et.al. |
2505.02735 |
link |
2025-05-05 |
Enhancing LLMs’ Clinical Reasoning with Real-World Data from a Nationwide Sepsis Registry |
Junu Kim et.al. |
2505.02722 |
link |
2025-05-05 |
Less is More: Efficient Weight Farcasting with 1-Layer Neural Network |
Xiao Shou et.al. |
2505.02714 |
null |
2025-05-05 |
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play |
Yemin Shi et.al. |
2505.02707 |
link |
2025-05-05 |
AI Standardized Patient Improves Human Conversations in Advanced Cancer Care |
Kurtis Haut et.al. |
2505.02694 |
link |
2025-05-05 |
Predicting Movie Hits Before They Happen with LLMs |
Shaghayegh Agah et.al. |
2505.02693 |
null |
2025-05-05 |
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models |
Xiaobao Wu et.al. |
2505.02686 |
link |
2025-05-05 |
Multimodal Deep Learning for Stroke Prediction and Detection using Retinal Imaging and Clinical Data |
Saeed Shurrab et.al. |
2505.02677 |
null |
2025-05-05 |
A Survey on Progress in LLM Alignment from the Perspective of Reward Design |
Miaomiao Ji et.al. |
2505.02666 |
null |
2025-05-05 |
A Survey of Slow Thinking-based Reasoning LLMs using Reinforced Learning and Inference-time Scaling Law |
Qianjun Pan et.al. |
2505.02665 |
null |
2025-05-06 |
A Note on Statistically Accurate Tabular Data Generation Using Large Language Models |
Andrey Sidorenko et.al. |
2505.02659 |
link |
2025-05-05 |
Hierarchical random measures without tables |
Marta Catalano et.al. |
2505.02653 |
null |
2025-05-05 |
Enhancing Chemical Reaction and Retrosynthesis Prediction with Large Language Model and Dual-task Learning |
Xuan Lin et.al. |
2505.02639 |
null |
2025-05-05 |
Parameter-Efficient Fine-Tuning with Attributed Patch Semantic Graph for Automated Patch Correctness Assessment |
Zhenyu Yang et.al. |
2505.02629 |
link |
2025-05-05 |
DeepSparse: A Foundation Model for Sparse-View CBCT Reconstruction |
Yiqun Lin et.al. |
2505.02628 |
null |
2025-05-05 |
Detect, Classify, Act: Categorizing Industrial Anomalies with Multi-Modal Large Language Models |
Sassan Mokhtar et.al. |
2505.02626 |
link |
2025-05-05 |
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis |
Qingkai Fang et.al. |
2505.02625 |
link |
2025-05-05 |
Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era |
Chenxi Liu et.al. |
2505.02583 |
link |
2025-05-06 |
EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning |
Lingxiao Kong et.al. |
2505.02579 |
link |
2025-05-05 |
Recursive Decomposition with Dependencies for Generic Divide-and-Conquer Reasoning |
Sergio Hernández-Gutiérrez et.al. |
2505.02576 |
null |
2025-05-05 |
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities |
Xinjie Zhang et.al. |
2505.02567 |
link |
2025-05-06 |
Evaluating Contrastive Feedback for Effective User Simulations |
Andreas Konstantin Kruff et.al. |
2505.02560 |
link |
2025-05-05 |
The Turing Test Is More Relevant Than Ever |
Avraham Rahimov et.al. |
2505.02558 |
null |
2025-05-05 |
Large Language Model Partitioning for Low-Latency Inference at the Edge |
Dimitrios Kafetzis et.al. |
2505.02533 |
null |
2025-05-05 |
Text to Image Generation and Editing: A Survey |
Pengfei Yang et.al. |
2505.02527 |
null |
2025-05-05 |
Unveiling the Landscape of LLM Deployment in the Wild: An Empirical Study |
Xinyi Hou et.al. |
2505.02502 |
null |
2025-05-05 |
Automating Automotive Software Development: A Synergy of Generative AI and Formal Methods |
Fengjunjie Pan et.al. |
2505.02500 |
null |
2025-05-05 |
Beyond the model: Key differentiators in large language models and multi-agent services |
Muskaan Goyal et.al. |
2505.02489 |
null |
2025-05-05 |
Hypothesis testing and Stein’s lemma in general probability theories with Euclidean Jordan algebra and its quantum realization |
Kanta Sonoda et.al. |
2505.02487 |
null |
2025-05-05 |
SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning |
Jinpeng Chen et.al. |
2505.02486 |
link |
2025-05-05 |
Automated Hybrid Reward Scheduling via Large Language Models for Robotic Skill Learning |
Changxin Huang et.al. |
2505.02483 |
null |
2025-05-05 |
Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality |
Xueguang Ma et.al. |
2505.02466 |
link |
2025-05-05 |
Incentivizing Inclusive Contributions in Model Sharing Markets |
Enpei Zhang et.al. |
2505.02462 |
null |
2025-05-05 |
Colombian Waitresses y Jueces canadienses: Gender and Country Biases in Occupation Recommendations from LLMs |
Elisa Forcada Rodríguez et.al. |
2505.02456 |
null |
2025-05-05 |
Can LLM-Simulated Practice and Feedback Upskill Human Counselors? A Randomized Study with 90+ Novice Counselors |
Ryan Louie et.al. |
2505.02428 |
null |
2025-05-05 |
Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks |
Baoxia Du et.al. |
2505.02413 |
null |
2025-05-05 |
Estimating Commonsense Scene Composition on Belief Scene Graphs |
Mario A. V. Saucedo et.al. |
2505.02405 |
null |
2025-05-05 |
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL |
Jiarui Yao et.al. |
2505.02391 |
link |
2025-05-05 |
RM-R1: Reward Modeling as Reasoning |
Xiusi Chen et.al. |
2505.02387 |
link |
2025-05-06 |
EntroLLM: Entropy Encoded Weight Compression for Efficient Large Language Model Inference on Edge Devices |
Arnab Sanyal et.al. |
2505.02380 |
null |
2025-05-05 |
LAMeD: LLM-generated Annotations for Memory Leak Detection |
Ekaterina Shemetova et.al. |
2505.02376 |
null |
2025-05-05 |
Advancing Email Spam Detection: Leveraging Zero-Shot Learning and Large Language Models |
Ghazaleh SHirvani et.al. |
2505.02362 |
link |
2025-05-05 |
An End-to-End Model For Logits Based Large Language Models Watermarking |
Kahim Wong et.al. |
2505.02344 |
link |
2025-05-05 |
VAEmo: Efficient Representation Learning for Visual-Audio Emotion with Knowledge Injection |
Hao Cheng et.al. |
2505.02331 |
link |
2025-05-05 |
From Course to Skill: Evaluating LLM Performance in Curricular Analytics |
Zhen Xu et.al. |
2505.02324 |
link |
2025-05-05 |
HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking |
Runquan Gui et.al. |
2505.02322 |
null |
2025-05-05 |
Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering |
Jihao Zhao et.al. |
2505.02311 |
link |
2025-05-05 |
Bayesian inference for cluster-randomized trials with multivariate outcomes subject to both truncation by death and missingness |
Guangyu Tong et.al. |
2505.02310 |
null |
2025-05-05 |
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques |
Sanjay Surendranath Girija et.al. |
2505.02309 |
null |
2025-05-05 |
Generative Sign-description Prompts with Multi-positive Contrastive Learning for Sign Language Recognition |
Siyu Liang et.al. |
2505.02304 |
null |
2025-05-05 |
Entropy-Guided Sampling of Flat Modes in Discrete Spaces |
Pinaki Mohanty et.al. |
2505.02296 |
link |
2025-05-04 |
A survey of agent interoperability protocols: Model Context Protocol (MCP), Agent Communication Protocol (ACP), Agent-to-Agent Protocol (A2A), and Agent Network Protocol (ANP) |
Abul Ehtesham et.al. |
2505.02279 |
null |
2025-05-04 |
Real-time Spatial Retrieval Augmented Generation for Urban Environments |
David Nazareno Campo et.al. |
2505.02271 |
null |
2025-05-04 |
Enhancing AI Face Realism: Cost-Efficient Quality Improvement in Distilled Diffusion Models with a Fully Synthetic Dataset |
Jakub Wąsala et.al. |
2505.02255 |
null |
2025-05-04 |
Personalisation or Prejudice? Addressing Geographic Bias in Hate Speech Detection using Debias Tuning in Large Language Models |
Paloma Piot et.al. |
2505.02252 |
null |
2025-05-04 |
Improving Physical Object State Representation in Text-to-Image Generative Systems |
Tianle Chen et.al. |
2505.02236 |
link |
2025-05-04 |
Prompt-responsive Object Retrieval with Memory-augmented Student-Teacher Learning |
Malte Mosbach et.al. |
2505.02232 |
null |
2025-05-04 |
An Empirical Study of Qwen3 Quantization |
Xingyu Zheng et.al. |
2505.02214 |
link |
2025-05-04 |
Leveraging LLMs to Automate Energy-Aware Refactoring of Parallel Scientific Codes |
Matthew T. Dearing et.al. |
2505.02184 |
null |
2025-05-04 |
Robust AI-Generated Face Detection with Imbalanced Data |
Yamini Sri Krubha et.al. |
2505.02182 |
link |
2025-05-04 |
Sparfels: Fast Reconstruction from Sparse Unposed Imagery |
Shubhendu Jena et.al. |
2505.02178 |
null |
2025-05-04 |
Measuring Hong Kong Massive Multi-Task Language Understanding |
Chuxue Cao et.al. |
2505.02177 |
null |
2025-05-04 |
Identifying Legal Holdings with LLMs: A Systematic Study of Performance, Scale, and Memorization |
Chuck Arvin et.al. |
2505.02172 |
link |
2025-05-04 |
A New HOPE: Domain-agnostic Automatic Evaluation of Text Chunking |
Henrik Brådland et.al. |
2505.02171 |
null |
2025-05-04 |
Interleave-VLA: Enhancing Robot Manipulation with Interleaved Image-Text Instructions |
Cunxin Fan et.al. |
2505.02152 |
null |
2025-05-04 |
Large Language Models are overconfident and amplify human bias |
Fengfei Sun et.al. |
2505.02151 |
null |
2025-05-04 |
QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach |
Shouyang Dong et.al. |
2505.02146 |
null |
2025-05-04 |
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study |
Xiaoyu Tian et.al. |
2505.02142 |
null |
2025-05-06 |
Efficient Multivariate Time Series Forecasting via Calibrated Language Models with Privileged Knowledge Distillation |
Chenxi Liu et.al. |
2505.02138 |
link |
2025-05-04 |
Enhancing LLM Code Generation: A Systematic Evaluation of Multi-Agent Collaboration and Runtime Debugging for Improved Accuracy, Reliability, and Latency |
Nazmus Ashrafi et.al. |
2505.02133 |
link |
2025-05-04 |
Attention Mechanisms Perspective: Exploring LLM Processing of Graph-Structured Data |
Zhong Guan et.al. |
2505.02130 |
link |
2025-05-04 |
GRAIL: Graph Edit Distance and Node Alignment Using LLM-Generated Code |
Samidha Verma et.al. |
2505.02124 |
link |
2025-05-04 |
DriveAgent: Multi-Agent Structured Reasoning with LLM and Multimodal Sensor Fusion for Autonomous Driving |
Xinmeng Hou et.al. |
2505.02123 |
link |
2025-05-04 |
MemEngine: A Unified and Modular Library for Developing Advanced Memory of LLM-based Agents |
Zeyu Zhang et.al. |
2505.02099 |
link |
2025-05-04 |
LLM-OptiRA: LLM-Driven Optimization of Resource Allocation for Non-Convex Problems in Wireless Communications |
Xinyue Peng et.al. |
2505.02091 |
link |
2025-05-04 |
Retrieval-augmented in-context learning for multimodal large language models in disease classification |
Zaifu Zhan et.al. |
2505.02087 |
null |
2025-05-04 |
LecEval: An Automated Metric for Multimodal Knowledge Acquisition in Multimedia Learning |
Joy Lim Jia Yin et.al. |
2505.02078 |
link |
2025-05-04 |
Leveraging LLM Agents and Digital Twins for Fault Handling in Process Plants |
Milapji Singh Gill et.al. |
2505.02076 |
link |
2025-05-04 |
Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation |
Volodymyr Havrylov et.al. |
2505.02075 |
link |
2025-05-04 |
Lightweight Defense Against Adversarial Attacks in Time Series Classification |
Yi Han et.al. |
2505.02073 |
link |
2025-05-06 |
RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video |
Shuhang Xun et.al. |
2505.02064 |
link |
2025-05-04 |
TxP: Reciprocal Generation of Ground Pressure Dynamics and Activity Descriptions for Improving Human Activity Recognition |
Lala Shakti Swarup Ray et.al. |
2505.02052 |
link |
2025-05-04 |
Secrets of GFlowNets’ Learning Behavior: A Theoretical Study |
Tianshu Yu et.al. |
2505.02035 |
null |
2025-05-04 |
From Mind to Machine: The Rise of Manus AI as a Fully Autonomous Digital Agent |
Minjie Shen et.al. |
2505.02024 |
null |
2025-05-04 |
Wide & Deep Learning for Node Classification |
Yancheng Chen et.al. |
2505.02020 |
link |
2025-05-04 |
Learning the Simplest Neural ODE |
Yuji Okamoto et.al. |
2505.02019 |
null |
2025-05-04 |
MLLM-Enhanced Face Forgery Detection: A Vision-Language Fusion Solution |
Siran Peng et.al. |
2505.02013 |
null |
2025-05-04 |
Testing Database Systems with Large Language Model Synthesized Fragments |
Suyang Zhong et.al. |
2505.02012 |
null |
2025-05-02 |
GENMO: A GENeralist Model for Human MOtion |
Jiefeng Li et.al. |
2505.01425 |
null |
2025-05-02 |
How Effective are Large Time Series Models in Hydrology? A Study on Water Level Forecasting in Everglades |
Rahuul Rangaraj et.al. |
2505.01415 |
null |
2025-05-02 |
Provable Efficiency of Guidance in Diffusion Models for General Data Distribution |
Gen Li et.al. |
2505.01382 |
null |
2025-05-02 |
FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors |
Chenxi Li et.al. |
2505.01322 |
null |
2025-05-02 |
Helping Big Language Models Protect Themselves: An Enhanced Filtering and Summarization System |
Sheikh Samit Muhaimin et.al. |
2505.01315 |
null |
2025-05-02 |
Enhancing SPARQL Query Rewriting for Complex Ontology Alignments |
Anicet Lepetit Ondo et.al. |
2505.01309 |
null |
2025-05-02 |
Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments |
Regan Bolton et.al. |
2505.01307 |
null |
2025-05-02 |
ViSA-Flow: Accelerating Robot Skill Learning via Large-Scale Video Semantic Action Flow |
Changhe Chen et.al. |
2505.01288 |
null |
2025-05-02 |
Scoring-Assisted Generative Exploration for Proteins (SAGE-Prot): A Framework for Multi-Objective Protein Optimization via Iterative Sequence Generation and Evaluation |
Hocheol Lim et.al. |
2505.01277 |
link |
2025-05-02 |
FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing |
Gaoxiang Cong et.al. |
2505.01263 |
null |
2025-05-02 |
Enhancing Obsolescence Forecasting with Deep Generative Data Augmentation: A Semi-Supervised Framework for Low-Data Industrial Applications |
Elie Saad et.al. |
2505.01261 |
null |
2025-05-02 |
Digital Pathway Curation (DPC): a comparative pipeline to assess the reproducibility, consensus and accuracy across Gemini, PubMed, and scientific reviewers in biomedical research |
Flavio Lichtenstein et.al. |
2505.01259 |
null |
2025-05-02 |
Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging |
Elena Mulero Ayllón et.al. |
2505.01239 |
null |
2025-05-02 |
CaReAQA: A Cardiac and Respiratory Audio Question Answering Model for Open-Ended Diagnostic Reasoning |
Tsai-Ning Wang et.al. |
2505.01199 |
null |
2025-05-02 |
A Combinatorial Proof of Universal Optimality for Computing a Planar Convex Hull |
Ivor van der Hoog et.al. |
2505.01194 |
null |
2025-05-02 |
TSTMotion: Training-free Scene-aware Text-to-motion Generation |
Ziyan Guo et.al. |
2505.01182 |
null |
2025-05-02 |
LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures |
Francisco Aguilera-Martínez et.al. |
2505.01177 |
null |
2025-05-02 |
Methodological Foundations for AI-Driven Survey Question Generation |
Ted K. Mburu et.al. |
2505.01150 |
null |
2025-05-02 |
Retrieval-Augmented Generation in Biomedicine: A Survey of Technologies, Datasets, and Clinical Applications |
Jiawei He et.al. |
2505.01146 |
null |
2025-05-02 |
Evaluating the Impact of Data Cleaning on the Quality of Generated Pull Request Descriptions |
Kutay Tire et.al. |
2505.01120 |
null |
2025-05-02 |
Incorporating Inductive Biases to Energy-based Generative Models |
Yukun Li et.al. |
2505.01111 |
null |
2025-05-02 |
MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning |
Murtadha Ahmed et.al. |
2505.01110 |
null |
2025-05-02 |
Self-Supervision Enhances Instance-based Multiple Instance Learning Methods in Digital Pathology: A Benchmark Study |
Ali Mammadov et.al. |
2505.01109 |
link |
2025-05-02 |
Any-to-Any Vision-Language Model for Multimodal X-ray Imaging and Radiological Report Generation |
Daniele Molino et.al. |
2505.01091 |
null |
2025-05-02 |
MADIL: An MDL-based Framework for Efficient Program Synthesis in the ARC Benchmark |
Sébastien Ferré et.al. |
2505.01081 |
null |
2025-05-02 |
Zero-Shot Document-Level Biomedical Relation Extraction via Scenario-based Prompt Design in Two-Stage with LLM |
Lei Zhao et.al. |
2505.01077 |
null |
2025-05-02 |
Federated Adapter on Foundation Models: An Out-Of-Distribution Approach |
Yiyuan Yang et.al. |
2505.01075 |
null |
2025-05-02 |
WirelessAgent: Large Language Model Agents for Intelligent Wireless Networks |
Jingwen Tong et.al. |
2505.01074 |
link |
2025-05-02 |
Retrieval Augmented Learning: A Retrial-based Large Language Model Self-Supervised Learning and Autonomous Knowledge Generation |
Zongyuan Li et.al. |
2505.01073 |
null |
2025-05-02 |
A Rusty Link in the AI Supply Chain: Detecting Evil Configurations in Model Repositories |
Ziqi Ding et.al. |
2505.01067 |
null |
2025-05-02 |
Good News for Script Kiddies? Evaluating Large Language Models for Automated Exploit Generation |
David Jin et.al. |
2505.01065 |
null |
2025-05-02 |
Efficient Vocabulary-Free Fine-Grained Visual Recognition in the Age of Multimodal LLMs |
Hari Chandana Kuchibhotla et.al. |
2505.01064 |
null |
2025-05-02 |
Transferable Adversarial Attacks on Black-Box Vision-Language Models |
Kai Hu et.al. |
2505.01050 |
null |
2025-05-02 |
Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities |
Zhiwei Hao et.al. |
2505.01043 |
null |
2025-05-02 |
Do We Need a Detailed Rubric for Automated Essay Scoring using Large Language Models? |
Lui Yoshida et.al. |
2505.01035 |
null |
2025-05-02 |
Improving Large Language Model Planning with Action Sequence Similarity |
Xinran Zhao et.al. |
2505.01009 |
null |
2025-05-02 |
Where’s the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content |
Haoyue Bai et.al. |
2505.01008 |
null |
2025-05-02 |
Togedule: Scheduling Meetings with Large Language Models and Adaptive Representations of Group Availability |
Jaeyoon Song et.al. |
2505.01000 |
link |
2025-05-02 |
Deterministic-to-Stochastic Diverse Latent Feature Mapping for Human Motion Synthesis |
Yu Hua et.al. |
2505.00998 |
null |
2025-05-02 |
Position: Enough of Scaling LLMs! Lets Focus on Downscaling |
Ayan Sengupta et.al. |
2505.00985 |
link |
2025-05-02 |
Multi-agents based User Values Mining for Recommendation |
Lijian Chen et.al. |
2505.00981 |
null |
2025-05-02 |
Synthesize-on-Graph: Knowledgeable Synthetic Data Generation for Continue Pre-training of Large Language Models |
Xuhui Jiang et.al. |
2505.00979 |
null |
2025-05-02 |
Attack and defense techniques in large language models: A survey and new perspectives |
Zhiyu Liao et.al. |
2505.00976 |
null |
2025-05-02 |
Seeking to Collide: Online Safety-Critical Scenario Generation for Autonomous Driving with Retrieval Augmented Large Language Models |
Yuewen Mei et.al. |
2505.00972 |
null |
2025-05-02 |
Tree-Sliced Wasserstein Distance with Nonlinear Projection |
Thanh Tran et.al. |
2505.00968 |
null |
2025-05-02 |
Preserving Privacy and Utility in LLM-Based Product Recommendations |
Tina Khezresmaeilzadeh et.al. |
2505.00951 |
null |
2025-05-02 |
SSRLBot: Designing and Developing an LLM-based Agent using Socially Shared Regulated Learning |
Xiaoshan Huang et.al. |
2505.00945 |
null |
2025-05-02 |
Large Language Model-Driven Dynamic Assessment of Grammatical Accuracy in English Language Learner Writing |
Timur Jaganov et.al. |
2505.00931 |
null |
2025-05-02 |
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias |
Ruiquan Huang et.al. |
2505.00926 |
null |
2025-05-01 |
Multivariate Conformal Selection |
Tian Bai et.al. |
2505.00917 |
null |
2025-05-01 |
NeMo-Inspector: A Visualization Tool for LLM Generation Analysis |
Daria Gitman et.al. |
2505.00903 |
link |
2025-05-01 |
Towards Explainable Temporal User Profiling with LLMs |
Milad Sabouri et.al. |
2505.00886 |
link |
2025-05-01 |
Protocol-agnostic and Data-free Backdoor Attacks on Pre-trained Models in RF Fingerprinting |
Tianya Zhao et.al. |
2505.00881 |
link |
2025-05-01 |
LLM Ethics Benchmark: A Three-Dimensional Assessment System for Evaluating Moral Reasoning in Large Language Models |
Junfeng Jiao et.al. |
2505.00853 |
link |
2025-05-01 |
ICQuant: Index Coding enables Low-bit LLM Quantization |
Xinlin Li et.al. |
2505.00850 |
null |
2025-05-01 |
OET: Optimization-based prompt injection Evaluation Toolkit |
Jinsheng Pan et.al. |
2505.00843 |
link |
2025-05-01 |
From Texts to Shields: Convergence of Large Language Models and Cybersecurity |
Tao Li et.al. |
2505.00841 |
null |
2025-05-01 |
Multi-site modelling and reconstruction of past extreme skew surges along the French Atlantic coast |
Nathan Huet et.al. |
2505.00835 |
link |
2025-05-01 |
SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation |
Quang P. M. Pham et.al. |
2505.00831 |
link |
2025-05-01 |
Data-Driven Optical To Thermal Inference in Pool Boiling Using Generative Adversarial Networks |
Qianxi Fu et.al. |
2505.00823 |
null |
2025-05-01 |
Should AI Mimic People? Understanding AI-Supported Writing Technology Among Black Users |
Jeffrey Basoah et.al. |
2505.00821 |
null |
2025-05-01 |
HMCF: A Human-in-the-loop Multi-Robot Collaboration Framework Based on Large Language Models |
Zhaoxing Li et.al. |
2505.00820 |
null |
2025-05-01 |
Spill The Beans: Exploiting CPU Cache Side-Channels to Leak Tokens from Large Language Models |
Andrew Adiletta et.al. |
2505.00817 |
null |
2025-05-01 |
Reasoning Capabilities and Invariability of Large Language Models |
Alessandro Raganato et.al. |
2505.00776 |
link |
2025-05-01 |
Multi-Modal Language Models as Text-to-Image Model Evaluators |
Jiahui Chen et.al. |
2505.00759 |
null |
2025-05-01 |
A Survey on Large Language Model based Human-Agent Systems |
Henry Peng Zou et.al. |
2505.00753 |
link |
2025-05-01 |
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT |
Dongzhi Jiang et.al. |
2505.00703 |
link |
2025-05-01 |
GuideSR: Rethinking Guidance for One-Step High-Fidelity Diffusion-Based Super-Resolution |
Aditya Arora et.al. |
2505.00687 |
null |
2025-05-01 |
Steering Large Language Models with Register Analysis for Arbitrary Style Transfer |
Xinchen Yang et.al. |
2505.00679 |
null |
2025-05-01 |
Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions |
Yiming Du et.al. |
2505.00675 |
link |
2025-05-01 |
DeepCritic: Deliberate Critique with Large Language Models |
Wenkai Yang et.al. |
2505.00662 |
link |
2025-05-01 |
On the generalization of language models from in-context learning and finetuning: a controlled study |
Andrew K. Lampinen et.al. |
2505.00661 |
null |
2025-05-01 |
Large Language Models Understanding: an Inherent Ambiguity Barrier |
Daniel N. Nissani et.al. |
2505.00654 |
null |
2025-05-01 |
Open-Source LLM-Driven Federated Transformer for Predictive IoV Management |
Yazan Otoum et.al. |
2505.00651 |
null |
2025-05-01 |
Investigating Task Arithmetic for Zero-Shot Information Retrieval |
Marco Braga et.al. |
2505.00649 |
link |
2025-05-01 |
Brain Foundation Models with Hypergraph Dynamic Adapter for Brain Disease Analysis |
Zhongying Deng et.al. |
2505.00627 |
null |
2025-05-01 |
The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them) |
Zihao Wang et.al. |
2505.00626 |
null |
2025-05-02 |
SA-GAT-SR: Self-Adaptable Graph Attention Networks with Symbolic Regression for high-fidelity material property prediction |
Liu Junchi et.al. |
2505.00625 |
link |
2025-05-01 |
FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation |
Chaitali Bhattacharyya et.al. |
2505.00624 |
null |
2025-05-01 |
Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction |
Simon Giebenhain et.al. |
2505.00615 |
null |
2025-05-01 |
Combining LLMs with Logic-Based Framework to Explain MCTS |
Ziyan An et.al. |
2505.00610 |
null |
2025-05-01 |
Can LLMs Help Improve Analogical Reasoning For Strategic Decisions? Experimental Evidence from Humans and GPT-4 |
Phanish Puranam et.al. |
2505.00603 |
null |
2025-05-02 |
Fast and Low-Cost Genomic Foundation Models via Outlier Removal |
Haozheng Luo et.al. |
2505.00598 |
link |
2025-05-01 |
Block Circulant Adapter for Large Language Models |
Xinyu Ding et.al. |
2505.00582 |
null |
2025-05-01 |
Parameter-Efficient Fine-Tuning with Circulant and Diagonal Vectors |
Xinyu Ding et.al. |
2505.00580 |
null |
2025-05-01 |
FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension |
Jushi Kai et.al. |
2505.00570 |
null |
2025-05-01 |
Triggering Hallucinations in LLMs: A Quantitative Study of Prompt-Induced Hallucination in Large Language Models |
Makoto Sato et.al. |
2505.00557 |
null |
2025-05-02 |
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models |
Chong Zhang et.al. |
2505.00551 |
null |
2025-05-01 |
Leveraging Partial SMILES Validation Scheme for Enhanced Drug Design in Reinforcement Learning Frameworks |
Xinyu Wang et.al. |
2505.00530 |
null |
2025-05-01 |
HalluMix: A Task-Agnostic, Multi-Domain Benchmark for Real-World Hallucination Detection |
Deanna Emery et.al. |
2505.00506 |
null |
2025-05-01 |
UserCentrix: An Agentic Memory-augmented AI Framework for Smart Spaces |
Alaa Saleh et.al. |
2505.00472 |
null |
2025-05-01 |
A General Model for Linearly Polarized Optical Vector Beams |
Jonathan Nichols et.al. |
2505.00471 |
null |
2025-05-01 |
Red Teaming Large Language Models for Healthcare |
Vahid Balazadeh et.al. |
2505.00467 |
null |
2025-05-01 |
Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models |
Sungbok Shin et.al. |
2505.00455 |
null |
2025-05-01 |
Distributed Retrieval-Augmented Generation |
Chenhao Xu et.al. |
2505.00443 |
link |
2025-05-01 |
CSE-SFP: Enabling Unsupervised Sentence Representation Learning via a Single Forward Pass |
Bowen Zhang et.al. |
2505.00389 |
link |
2025-05-01 |
Urban Air Mobility as a System of Systems: An LLM-Enhanced Holonic Approach |
Ahmed R. Sadik et.al. |
2505.00368 |
null |
2025-05-01 |
KoACD: The First Korean Adolescent Dataset for Cognitive Distortion Analysis |
JunSeo Kim et.al. |
2505.00367 |
null |
2025-05-01 |
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training |
Albert Ge et.al. |
2505.00358 |
null |
2025-05-01 |
LLMPrism: Black-box Performance Diagnosis for Production LLM Training Platforms |
Zhihan Jiang et.al. |
2505.00342 |
null |
2025-05-01 |
T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation |
Xuyang Guo et.al. |
2505.00337 |
null |
2025-05-01 |
Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution |
Luigi Sigillo et.al. |
2505.00334 |
null |
2025-05-01 |
Communication-Efficient Wireless Federated Fine-Tuning for Large-Scale AI Models |
Bumjun Kim et.al. |
2505.00333 |
null |
2025-05-01 |
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing |
Piotr Piękos et.al. |
2505.00315 |
link |
2025-05-01 |
Large Language Models as AI Agents for Digital Atoms and Molecules: Catalyzing a New Era in Computational Biophysics |
Yijie Xia et.al. |
2505.00270 |
null |
2025-05-01 |
EnronQA: Towards Personalized RAG over Private Documents |
Michael J. Ryan et.al. |
2505.00263 |
null |
2025-05-01 |
LLM-Based Threat Detection and Prevention Framework for IoT Ecosystems |
Yazan Otoum et.al. |
2505.00240 |
null |
2025-05-02 |
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks |
Vishnu Sarukkai et.al. |
2505.00234 |
null |
2025-05-01 |
Scaling On-Device GPU Inference for Large Generative Models |
Jiuqiang Tang et.al. |
2505.00232 |
null |
2025-05-01 |
ReXGradient-160K: A Large-Scale Publicly Available Dataset of Chest Radiographs with Free-text Reports |
Xiaoman Zhang et.al. |
2505.00228 |
null |
2025-04-30 |
RAIL in the Wild: Operationalizing Responsible AI Evaluation Using Anthropic’s Value Dataset |
Sumit Verma et.al. |
2505.00204 |
null |
2025-04-30 |
Generative Multimodal Multiscale Data Fusion for Digital Twins in Aerosol Jet Electronics Printing |
Fatemeh Elhambakhsh et.al. |
2505.00176 |
null |
2025-04-30 |
GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation |
Filipp Nikitin et.al. |
2505.00169 |
link |
2025-04-30 |
V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving |
Jannik Lübberstedt et.al. |
2505.00156 |
null |
2025-04-30 |
Audo-Sight: Enabling Ambient Interaction For Blind And Visually Impaired Individuals |
Bhanuja Ainary et.al. |
2505.00153 |
null |
2025-04-30 |
AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models |
Yinghui He et.al. |
2505.00147 |
null |
2025-04-30 |
When Deep Learning Meets Information Retrieval-based Bug Localization: A Survey |
Feifei Niu et.al. |
2505.00144 |
null |
2025-04-30 |
Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and correctness in LLMs |
Jinyan Su et.al. |
2505.00127 |
null |
2025-04-30 |
Fine-Tuning LLMs for Low-Resource Dialect Translation: The Case of Lebanese |
Silvana Yakhni et.al. |
2505.00114 |
link |
2025-04-30 |
CoordField: Coordination Field for Agentic UAV Task Allocation In Low-altitude Urban Scenarios |
Tengchao Zhang et.al. |
2505.00091 |
null |
2025-04-30 |
Materials discovery acceleration by using condition generative methodology |
Caiyuan Ye et.al. |
2505.00076 |
link |
2025-04-30 |
ConSens: Assessing context grounding in open-book question answering |
Ivan Vankov et.al. |
2505.00065 |
null |
2025-04-30 |
GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling |
Siqi Li et.al. |
2505.00063 |
null |
2025-04-30 |
Enhancing Security and Strengthening Defenses in Automated Short-Answer Grading Systems |
Sahar Yarmohammadtoosky et.al. |
2505.00061 |
null |
2025-04-30 |
Fact-Consistency Evaluation of Text-to-SQL Generation for Business Intelligence Using Exaone 3.5 |
Jeho Choi et.al. |
2505.00060 |
null |
2025-04-30 |
A Report on the llms evaluating the high school questions |
Zhu Jiawei et.al. |
2505.00057 |
null |
2025-04-30 |
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction |
Qihao Liu et.al. |
2504.21855 |
null |
2025-04-30 |
TRUST: An LLM-Based Dialogue System for Trauma Understanding and Structured Assessments |
Sichang Tu et.al. |
2504.21851 |
null |
2025-04-30 |
COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning |
Xindi Wu et.al. |
2504.21850 |
null |
2025-04-30 |
3D Stylization via Large Reconstruction Model |
Ipek Oztas et.al. |
2504.21836 |
null |
2025-04-30 |
From Aesthetics to Human Preferences: Comparative Perspectives of Evaluating Text-to-Music Systems |
Huan Zhang et.al. |
2504.21815 |
null |
2025-04-30 |
Why Compress What You Can Generate? When GPT-4o Generation Ushers in Image Compression Fields |
Yixin Gao et.al. |
2504.21814 |
null |
2025-04-30 |
A simple and effective approach for body part recognition on CT scans based on projection estimation |
Franko Hrzic et.al. |
2504.21810 |
null |
2025-04-30 |
An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding |
Xiuwei Shang et.al. |
2504.21803 |
null |
2025-04-30 |
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition |
Z. Z. Ren et.al. |
2504.21801 |
link |
2025-04-30 |
MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness |
Junsheng Huang et.al. |
2504.21773 |
null |
2025-04-30 |
Anatomical Similarity as a New Metric to Evaluate Brain Generative Models |
Bahram Jafrasteh et.al. |
2504.21771 |
null |
2025-04-30 |
LASHED: LLMs And Static Hardware Analysis for Early Detection of RTL Bugs |
Baleegh Ahmad et.al. |
2504.21770 |
null |
2025-04-30 |
LLM-based Interactive Imitation Learning for Robotic Manipulation |
Jonas Werner et.al. |
2504.21769 |
link |
2025-04-30 |
Investigating Literary Motifs in Ancient and Medieval Novels with Large Language Models |
Emelie Hallenberg et.al. |
2504.21742 |
null |
2025-04-30 |
TheraQuest: A Gamified, LLM-Powered Simulation for Massage Therapy Training |
Shengqian Wang et.al. |
2504.21735 |
null |
2025-04-30 |
XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs |
Marco Arazzi et.al. |
2504.21700 |
null |
2025-04-30 |
Visual Text Processing: A Comprehensive Review and Unified Evaluation |
Yan Shu et.al. |
2504.21682 |
link |
2025-04-30 |
Hoist with His Own Petard: Inducing Guardrails to Facilitate Denial-of-Service Attacks on Retrieval-Augmented Generation of LLMs |
Pan Suo et.al. |
2504.21680 |
null |
2025-04-30 |
Traceback of Poisoning Attacks to Retrieval-Augmented Generation |
Baolei Zhang et.al. |
2504.21668 |
null |
2025-04-30 |
From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising |
Jingwen Cai et.al. |
2504.21667 |
null |
2025-04-30 |
AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization |
Haotian Luo et.al. |
2504.21659 |
link |
2025-04-30 |
Sadeed: Advancing Arabic Diacritization Through Small Language Model |
Zeina Aldallal et.al. |
2504.21635 |
null |
2025-04-30 |
Meeseeks: An Iterative Benchmark Evaluating LLMs Multi-Turn Instruction-Following Ability |
Jiaming Wang et.al. |
2504.21625 |
null |
2025-04-30 |
RDF-Based Structured Quality Assessment Representation of Multilingual LLM Evaluations |
Jonas Gwozdz et.al. |
2504.21605 |
null |
2025-04-30 |
Leveraging Pre-trained Large Language Models with Refined Prompting for Online Task and Motion Planning |
Huihui Guo et.al. |
2504.21596 |
null |
2025-04-30 |
MF-LLM: Simulating Collective Decision Dynamics via a Mean-Field Large Language Model Framework |
Qirui Mi et.al. |
2504.21582 |
null |
2025-04-30 |
Generative AI in Financial Institution: A Global Survey of Opportunities, Threats, and Regulation |
Bikash Saha et.al. |
2504.21574 |
null |
2025-04-29 |
A Systematic Literature Review of Parameter-Efficient Fine-Tuning for Large Code Models |
Md Zahidul Haque et.al. |
2504.21569 |
link |
2025-04-30 |
eNCApsulate: NCA for Precision Diagnosis on Capsule Endoscopes |
Henry John Krumb et.al. |
2504.21562 |
null |
2025-04-30 |
Iterative Trajectory Exploration for Multimodal Agents |
Pengxiang Li et.al. |
2504.21561 |
null |
2025-04-30 |
Precision Where It Matters: A Novel Spike Aware Mixed-Precision Quantization Strategy for LLaMA-based Language Models |
Lucas Maisonnave et.al. |
2504.21553 |
null |
2025-04-30 |
Consistency-aware Fake Videos Detection on Short Video Platforms |
Junxi Wang et.al. |
2504.21495 |
link |
2025-04-30 |
GarmentDiffusion: 3D Garment Sewing Pattern Generation with Multimodal Diffusion Transformers |
Xinyu Li et.al. |
2504.21476 |
null |
2025-04-30 |
Rethinking Visual Layer Selection in Multimodal LLMs |
Haoran Chen et.al. |
2504.21447 |
null |
2025-04-30 |
SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding |
Chenkai Zhang et.al. |
2504.21435 |
link |
2025-04-30 |
UAV-VLN: End-to-End Vision Language guided Navigation for UAVs |
Pranav Saxena et.al. |
2504.21432 |
null |
2025-04-30 |
Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision |
Weicai Yan et.al. |
2504.21423 |
null |
2025-04-30 |
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training |
Xinyi Liu et.al. |
2504.21411 |
link |
2025-04-30 |
Who Gets the Callback? Generative AI and Gender Bias |
Sugat Chaturvedi et.al. |
2504.21400 |
null |
2025-04-30 |
Sparse-to-Sparse Training of Diffusion Models |
Inês Cardoso Oliveira et.al. |
2504.21380 |
null |
2025-04-30 |
Retrieval-Enhanced Few-Shot Prompting for Speech Event Extraction |
Máté Gedeon et.al. |
2504.21372 |
null |
2025-04-30 |
Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing |
Hong Zhang et.al. |
2504.21356 |
link |
2025-04-30 |
Generative QoE Modeling: A Lightweight Approach for Telecom Networks |
Vinti Nayar et.al. |
2504.21353 |
null |
2025-04-30 |
UniBiomed: A Universal Foundation Model for Grounded Biomedical Image Interpretation |
Linshan Wu et.al. |
2504.21336 |
link |
2025-04-30 |
Simple Visual Artifact Detection in Sora-Generated Videos |
Misora Sugiyama et.al. |
2504.21334 |
null |
2025-04-30 |
Does the Prompt-based Large Language Model Recognize Students’ Demographics and Introduce Bias in Essay Scoring? |
Kaixun Yang et.al. |
2504.21330 |
null |
2025-04-30 |
Covert Prompt Transmission for Secure Large Language Model Services |
Ruichen Zhang et.al. |
2504.21311 |
null |
2025-04-30 |
An Evaluation of a Visual Question Answering Strategy for Zero-shot Facial Expression Recognition in Still Images |
Modesto Castrillón-Santana et.al. |
2504.21309 |
null |
2025-04-30 |
Confidence in Large Language Model Evaluation: A Bayesian Approach to Limited-Sample Challenges |
Xiao Xiao et.al. |
2504.21303 |
null |
2025-04-30 |
BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models |
Zhiting Fan et.al. |
2504.21299 |
null |
2025-04-30 |
NEP89: Universal neuroevolution potential for inorganic and organic materials across 89 elements |
Ting Liang et.al. |
2504.21286 |
link |
2025-04-30 |
Birdie: Natural Language-Driven Table Discovery Using Differentiable Search Index |
Yuxiang Guo et.al. |
2504.21282 |
null |
2025-04-30 |
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models |
Guanghao Zhou et.al. |
2504.21277 |
null |
2025-04-30 |
CoCoDiff: Diversifying Skeleton Action Features via Coarse-Fine Text-Co-Guided Latent Diffusion |
Zhifu Zhao et.al. |
2504.21266 |
null |
2025-04-30 |
Talk Before You Retrieve: Agent-Led Discussions for Better RAG in Medical QA |
Xuanzhao Dong et.al. |
2504.21252 |
link |
2025-04-30 |
Memorization and Knowledge Injection in Gated LLMs |
Xu Pan et.al. |
2504.21239 |
null |
2025-04-30 |
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math |
Haoran Xu et.al. |
2504.21233 |
null |
2025-04-29 |
CachePrune: Neural-Based Attribution Defense Against Indirect Prompt Injection Attacks |
Rui Wang et.al. |
2504.21228 |
null |
2025-04-29 |
Theoretical Foundations for Semantic Cognition in Artificial Intelligence |
Sebastian Dumbrava et.al. |
2504.21218 |
null |
2025-04-29 |
A Cost-Effective LLM-based Approach to Identify Wildlife Trafficking in Online Marketplaces |
Juliana Barbosa et.al. |
2504.21211 |
null |
2025-04-29 |
Automatic Legal Writing Evaluation of LLMs |
Ramon Pires et.al. |
2504.21202 |
link |
2025-04-29 |
Graph Synthetic Out-of-Distribution Exposure with Large Language Models |
Haoyan Xu et.al. |
2504.21198 |
null |
2025-04-29 |
Small or Large? Zero-Shot or Finetuned? Guiding Language Model Choice for Specialized Applications in Healthcare |
Lovedeep Gondara et.al. |
2504.21191 |
null |
2025-04-29 |
LIFT: LLM-Based Pragma Insertion for HLS via GNN Supervised Fine-Tuning |
Neha Prakriya et.al. |
2504.21187 |
null |
2025-04-29 |
GLIP-OOD: Zero-Shot Graph OOD Detection with Foundation Model |
Haoyan Xu et.al. |
2504.21186 |
null |
2025-05-01 |
AI-in-the-Loop Planning for Transportation Electrification: Case Studies from Austin, Texas |
Seung Jun Choi et.al. |
2504.21185 |
null |
2025-04-29 |
Efficient LLMs with AMP: Attention Heads and MLP Pruning |
Leandro Giusti Mugnaini et.al. |
2504.21174 |
null |
2025-04-29 |
Detecting Manipulated Contents Using Knowledge-Grounded Inference |
Mark Huasong Meng et.al. |
2504.21165 |
link |
2025-04-29 |
LLM Enhancer: Merged Approach using Vector Embedding for Reducing Large Language Model Hallucinations with External Knowledge |
Naheed Rayhan et.al. |
2504.21132 |
null |
2025-04-29 |
Optimized Quantum Embedding: A Universal Minor-Embedding Framework for Large Complete Bipartite Graph |
Salvatore Sinno et.al. |
2504.21112 |
null |
2025-04-29 |
A Survey on Parameter-Efficient Fine-Tuning for Foundation Models in Federated Learning |
Jieming Bian et.al. |
2504.21099 |
null |
2025-04-29 |
ProT-GFDM: A Generative Fractional Diffusion Model for Protein Generation |
Xiao Liang et.al. |
2504.21092 |
null |
2025-04-29 |
On the Potential of Large Language Models to Solve Semantics-Aware Process Mining Tasks |
Adrian Rebmann et.al. |
2504.21074 |
null |
2025-04-29 |
YoChameleon: Personalized Vision and Language Generation |
Thao Nguyen et.al. |
2504.20998 |
null |
2025-04-29 |
Toward Efficient Exploration by Large Language Model Agents |
Dilip Arumugam et.al. |
2504.20997 |
null |
2025-04-29 |
X-Fusion: Introducing New Modality to Frozen Large Language Models |
Sicheng Mo et.al. |
2504.20996 |
null |
2025-04-29 |
TesserAct: Learning 4D Embodied World Models |
Haoyu Zhen et.al. |
2504.20995 |
null |
2025-04-29 |
ACE: A Security Architecture for LLM-Integrated App Systems |
Evan Li et.al. |
2504.20984 |
null |
2025-04-29 |
Real-Time Wayfinding Assistant for Blind and Low-Vision Users |
Dabbrata Das et.al. |
2504.20976 |
null |
2025-04-29 |
SetKE: Knowledge Editing for Knowledge Elements Overlap |
Yifan Wei et.al. |
2504.20972 |
null |
2025-04-29 |
OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification |
Shangyu Li et.al. |
2504.20964 |
link |
2025-04-29 |
Information Gravity: A Field-Theoretic Model for Token Selection in Large Language Models |
Maryna Vyshnyvetska et.al. |
2504.20951 |
null |
2025-04-30 |
Trace-of-Thought Prompting: Investigating Prompt-Based Knowledge Distillation Through Question Decomposition |
Tyler McDonald et.al. |
2504.20946 |
null |
2025-04-29 |
ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification |
Ziqing Fan et.al. |
2504.20930 |
link |
2025-04-30 |
End-to-end Audio Deepfake Detection from RAW Waveforms: a RawNet-Based Approach with Cross-Dataset Evaluation |
Andrea Di Pierno et.al. |
2504.20923 |
link |
2025-04-29 |
An Empirical Study on the Capability of LLMs in Decomposing Bug Reports |
Zhiyuan Chen et.al. |
2504.20911 |
null |
2025-04-29 |
Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers |
Quentin Guimard et.al. |
2504.20902 |
null |
2025-04-29 |
Evaluating Generative Models for Tabular Data: Novel Metrics and Benchmarking |
Dayananda Herurkar et.al. |
2504.20900 |
null |
2025-04-29 |
LELANTE: LEveraging LLM for Automated ANdroid TEsting |
Shamit Fatin et.al. |
2504.20896 |
null |
2025-04-29 |
The Leaderboard Illusion |
Shivalika Singh et.al. |
2504.20879 |
null |
2025-04-29 |
AI-GenBench: A New Ongoing Benchmark for AI-Generated Image Detection |
Lorenzo Pellegrini et.al. |
2504.20865 |
null |
2025-04-29 |
Universal language model with the intervention of quantum theory |
D.-F. Qin et.al. |
2504.20839 |
null |
2025-04-29 |
Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning |
Hongfei Xue et.al. |
2504.20835 |
null |
2025-04-29 |
Reinforcement Learning for LLM Reasoning Under Memory Constraints |
Alan Lee et.al. |
2504.20834 |
null |
2025-04-30 |
Ascendra: Dynamic Request Prioritization for Efficient LLM Serving |
Azam Ikram et.al. |
2504.20828 |
null |
2025-04-29 |
Secure Coding with AI, From Creation to Inspection |
Vladislav Belozerov et.al. |
2504.20814 |
null |
2025-04-30 |
Unlocking User-oriented Pages: Intention-driven Black-box Scanner for Real-world Web Applications |
Weizhe Wang et.al. |
2504.20801 |
null |
2025-04-29 |
Hallucination by Code Generation LLMs: Taxonomy, Benchmarks, Mitigation, and Challenges |
Yunseo Lee et.al. |
2504.20799 |
null |
2025-04-29 |
Q-Fusion: Diffusing Quantum Circuits |
Collin Beaudoin et.al. |
2504.20794 |
null |
2025-04-29 |
Using LLMs in Generating Design Rationale for Software Architecture Decisions |
Xiyu Zhou et.al. |
2504.20781 |
link |
2025-04-29 |
Turing Machine Evaluation for Large Language Model |
Haitao Wu et.al. |
2504.20771 |
link |
2025-04-29 |
Chain-of-Defensive-Thought: Structured Reasoning Elicits Robustness in Large Language Models against Reference Corruption |
Wenxiao Wang et.al. |
2504.20769 |
null |
2025-04-29 |
Understanding Large Language Model Supply Chain: Structure, Domain, and Vulnerabilities |
Yanzhe Hu et.al. |
2504.20763 |
null |
2025-04-29 |
DDPS: Discrete Diffusion Posterior Sampling for Paths in Layered Graphs |
Hao Luan et.al. |
2504.20754 |
null |
2025-04-29 |
Learning a General Model: Folding Clothing with Topological Dynamics |
Yiming Liu et.al. |
2504.20720 |
null |
2025-04-29 |
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think |
Hasan Abed Al Kader Hammoud et.al. |
2504.20708 |
null |
2025-04-29 |
What’s Wrong with Your Synthetic Tabular Data? Using Explainable AI to Evaluate Generative Models |
Jan Kapar et.al. |
2504.20687 |
link |
2025-04-29 |
Identifying Uncertainty in Self-Adaptive Robotics with Large Language Models |
Hassan Sartaj et.al. |
2504.20684 |
null |
2025-04-29 |
CoCo-Bench: A Comprehensive Code Benchmark For Multi-task Large Language Model Evaluation |
Wenjing Yin et.al. |
2504.20673 |
null |
2025-04-29 |
A Generative-AI-Driven Claim Retrieval System Capable of Detecting and Retrieving Claims from Social Media Platforms in Multiple Languages |
Ivan Vykopal et.al. |
2504.20668 |
link |
2025-04-29 |
ComplexVCoder: An LLM-Driven Framework for Systematic Generation of Complex Verilog Code |
Jian Zuo et.al. |
2504.20653 |
null |
2025-04-29 |
Combatting Dimensional Collapse in LLM Pre-Training Data via Diversified File Selection |
Ziqing Fan et.al. |
2504.20644 |
null |
2025-04-29 |
Cooking Up Creativity: A Cognitively-Inspired Approach for Enhancing LLM Creativity through Structured Representations |
Moran Mizrahi et.al. |
2504.20643 |
link |
2025-04-29 |
Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation |
Bradley Segal et.al. |
2504.20635 |
link |
2025-04-29 |
ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting |
Yu Zhang et.al. |
2504.20630 |
null |
2025-04-29 |
Cognitive maps are generative programs |
Marta Kryven et.al. |
2504.20628 |
null |
2025-04-29 |
DiffusionRIR: Room Impulse Response Interpolation using Diffusion Models |
Sagi Della Torre et.al. |
2504.20625 |
null |
2025-04-29 |
PaRT: Enhancing Proactive Social Chatbots with Personalized Real-Time Retrieval |
Zihan Niu et.al. |
2504.20624 |
null |
2025-04-29 |
The Hidden Risks of LLM-Generated Web Application Code: A Security-Centric Evaluation of Code Generation Capabilities in Large Language Models |
Swaroop Dora et.al. |
2504.20612 |
null |
2025-04-29 |
Information Retrieval in the Age of Generative AI: The RGB Model |
Michele Garetto et.al. |
2504.20610 |
link |
2025-04-29 |
WenyanGPT: A Large Language Model for Classical Chinese Tasks |
Xinyu Yao et.al. |
2504.20609 |
null |
2025-04-29 |
Reinforcement Learning for Reasoning in Large Language Models with One Training Example |
Yiping Wang et.al. |
2504.20571 |
link |
2025-04-29 |
ReCIT: Reconstructing Full Private Data from Gradient in Parameter-Efficient Fine-Tuning of Large Language Models |
Jin Xie et.al. |
2504.20570 |
null |
2025-04-29 |
BrAIcht, a theatrical agent that speaks like Bertolt Brecht’s characters |
Baz Roland et.al. |
2504.20552 |
null |
2025-04-29 |
TriniMark: A Robust Generative Speech Watermarking Method for Trinity-Level Attribution |
Yue Li et.al. |
2504.20532 |
null |
2025-04-30 |
Conversations with AI Chatbots Increase Short-Term Vaccine Intentions But Do Not Outperform Standard Public Health Messaging |
Neil K. R. Sehgal et.al. |
2504.20519 |
null |
2025-04-29 |
MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities of Daily Living |
Xi Chen et.al. |
2504.20505 |
null |
2025-04-29 |
SAM-Guided Robust Representation Learning for One-Shot 3D Medical Image Segmentation |
Jia Wang et.al. |
2504.20501 |
null |
2025-04-29 |
UniDetox: Universal Detoxification of Large Language Models via Dataset Distillation |
Huimin Lu et.al. |
2504.20500 |
link |
2025-04-29 |
Token-Efficient Prompt Injection Attack: Provoking Cessation in LLM Reasoning via Adaptive Token Compression |
Yu Cui et.al. |
2504.20493 |
null |
2025-04-29 |
Enhancing LLM Language Adaption through Cross-lingual In-Context Pre-training |
Linjuan Wu et.al. |
2504.20484 |
null |
2025-04-29 |
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction |
Yulin Chen et.al. |
2504.20472 |
null |
2025-04-29 |
Fane at SemEval-2025 Task 10: Zero-Shot Entity Framing with Large Language Models |
Enfa Fane et.al. |
2504.20469 |
link |
2025-04-29 |
A Summary on GUI Agents with Foundation Models Enhanced by Reinforcement Learning |
Jiahao Li et.al. |
2504.20464 |
null |
2025-04-30 |
TAMO: Fine-Grained Root Cause Analysis via Tool-Assisted LLM Agent with Multi-Modality Observation Data |
Qi Wang et.al. |
2504.20462 |
null |
2025-04-29 |
SAS-Prompt: Large Language Models as Numerical Optimizers for Robot Self-Improvement |
Heni Ben Amor et.al. |
2504.20459 |
null |
2025-04-29 |
Enhancing News Recommendation with Hierarchical LLM Prompting |
Hai-Dang Kieu et.al. |
2504.20452 |
null |
2025-04-29 |
GaLore 2: Large-Scale LLM Pre-Training by Gradient Low-Rank Projection |
DiJia Su et.al. |
2504.20437 |
null |
2025-04-29 |
RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library |
Jiapeng Wang et.al. |
2504.20426 |
null |
2025-04-29 |
Plant Disease Detection through Multimodal Large Language Models and Convolutional Neural Networks |
Konstantinos I. Roumeliotis et.al. |
2504.20419 |
null |
2025-04-29 |
Enhancing Leakage Attacks on Searchable Symmetric Encryption Using LLM-Based Synthetic Data Generation |
Joshua Chiu et.al. |
2504.20414 |
link |
2025-04-29 |
CrashFixer: A crash resolution agent for the Linux kernel |
Alex Mathai et.al. |
2504.20412 |
null |
2025-04-29 |
Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs |
Paiheng Xu et.al. |
2504.20406 |
null |
2025-04-29 |
FiLA-Video: Spatio-Temporal Compression for Fine-Grained Long Video Understanding |
Yanan Guo et.al. |
2504.20384 |
null |
2025-04-29 |
Generative Learning for Slow Manifolds and Bifurcation Diagrams |
Ellis R. Crabtree et.al. |
2504.20375 |
null |
2025-04-29 |
DMDTEval: An Evaluation and Analysis of LLMs on Disambiguation in Multi-domain Translation |
Zhibo Man et.al. |
2504.20371 |
null |
2025-04-29 |
Thoughtful, Confused, or Untrustworthy: How Text Presentation Influences Perceptions of AI Writing Tools |
David Zhou et.al. |
2504.20365 |
null |
2025-04-29 |
PRISM-DP: Spatial Pose-based Observations for Diffusion-Policies via Segmentation, Mesh Generation, and Pose Tracking |
Xiatao Sun et.al. |
2504.20359 |
null |
2025-04-29 |
Local Prompt Optimization |
Yash Jain et.al. |
2504.20355 |
null |
2025-04-29 |
CarbonCall: Sustainability-Aware Function Calling for Large Language Models on Edge Devices |
Varatheepan Paramanayakam et.al. |
2504.20348 |
null |
2025-04-29 |
“I’ve talked to ChatGPT about my issues last night.”: Examining Mental Health Conversations with Large Language Models through Reddit Analysis |
Kyuha Jung et.al. |
2504.20320 |
null |
2025-04-28 |
DeepAndes: A Self-Supervised Vision Foundation Model for Multi-Spectral Remote Sensing Imagery of the Andes |
Junlin Guo et.al. |
2504.20303 |
null |
2025-04-28 |
FALCO: a Foundation model of Astronomical Light Curves for time dOmain astronomy |
Xiaoxiong Zuo et.al. |
2504.20290 |
null |
2025-04-28 |
Image Interpolation with Score-based Riemannian Metrics of Diffusion Models |
Shinnosuke Saito et.al. |
2504.20288 |
null |
2025-04-28 |
Enhancing Systematic Reviews with Large Language Models: Using GPT-4 and Kimi |
Dandan Chen Kaptur et.al. |
2504.20276 |
null |
2025-04-28 |
Can Large Language Models Learn Formal Logic? A Data-Driven Training and Evaluation Framework |
Yuan Xia et.al. |
2504.20213 |
null |
2025-04-28 |
Prompting LLMs for Code Editing: Struggles and Remedies |
Daye Nam et.al. |
2504.20196 |
null |
2025-04-28 |
BLADE: Benchmark suite for LLM-driven Automated Design and Evolution of iterative optimisation heuristics |
Niki van Stein et.al. |
2504.20183 |
null |
2025-04-28 |
Integration Flow Models |
Jingjing Wang et.al. |
2504.20179 |
null |
2025-04-28 |
Toward Evaluative Thinking: Meta Policy Optimization with Evolving Reward Models |
Zae Myung Kim et.al. |
2504.20157 |
link |
2025-04-28 |
AutoJudge: Judge Decoding Without Manual Annotation |
Roman Garipov et.al. |
2504.20039 |
null |
2025-04-28 |
SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning |
Wufei Ma et.al. |
2504.20024 |
null |
2025-04-28 |
Better To Ask in English? Evaluating Factual Accuracy of Multilingual LLMs in English and Low-Resource Languages |
Pritika Rohera et.al. |
2504.20022 |
null |
2025-04-28 |
Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models |
Xin Wang et.al. |
2504.20020 |
null |
2025-04-29 |
LLM-Generated Fake News Induces Truth Decay in News Ecosystem: A Case Study on Neural News Recommendation |
Beizhe Hu et.al. |
2504.20013 |
null |
2025-04-28 |
Towards Automated Scoping of AI for Social Good Projects |
Jacob Emmerson et.al. |
2504.20010 |
null |
2025-04-28 |
Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom |
Rishika Sen et.al. |
2504.20000 |
null |
2025-04-28 |
HJRNO: Hamilton-Jacobi Reachability with Neural Operators |
Yankai Li et.al. |
2504.19989 |
null |
2025-04-28 |
TD-EVAL: Revisiting Task-Oriented Dialogue Evaluation by Combining Turn-Level Precision with Dialogue-Level Comparisons |
Emre Can Acikgoz et.al. |
2504.19982 |
null |
2025-04-28 |
Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets |
Adam Younsi et.al. |
2504.19981 |
null |
2025-04-29 |
From Concept to Practice: an Automated LLM-aided UVM Machine for RTL Verification |
Junhao Ye et.al. |
2504.19959 |
null |
2025-04-28 |
Enhancing Surgical Documentation through Multimodal Visual-Temporal Transformers and Generative AI |
Hugo Georgenthum et.al. |
2504.19918 |
null |
2025-04-28 |
Can AI Agents Design and Implement Drug Discovery Pipelines? |
Khachik Smbatyan et.al. |
2504.19912 |
null |
2025-04-28 |
GenCLS++: Pushing the Boundaries of Generative Classification in LLMs Through Comprehensive SFT and RL Studies Across Diverse Datasets |
Mingqian He et.al. |
2504.19898 |
null |
2025-04-28 |
CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition |
Quynh Phung et.al. |
2504.19894 |
null |
2025-04-28 |
DeeCLIP: A Robust and Generalizable Transformer-Based Framework for Detecting AI-Generated Images |
Mamadou Keita et.al. |
2504.19876 |
link |
2025-04-28 |
semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage |
Ke Hong et.al. |
2504.19867 |
null |
2025-04-28 |
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback |
Chenhan Jiang et.al. |
2504.19860 |
null |
2025-04-29 |
The Automation Advantage in AI Red Teaming |
Rob Mulla et.al. |
2504.19855 |
null |
2025-04-28 |
Do You Know the Way? Human-in-the-Loop Understanding for Fast Traversability Estimation in Mobile Robotics |
Andre Schreiber et.al. |
2504.19851 |
link |
2025-04-28 |
Foundation Model-Driven Framework for Human-Object Interaction Prediction with Segmentation Mask Integration |
Juhan Park et.al. |
2504.19847 |
null |
2025-04-28 |
LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects |
Guangyi Liu et.al. |
2504.19838 |
link |
2025-04-28 |
PhenoAssistant: A Conversational Multi-Agent AI System for Automated Plant Phenotyping |
Feng Chen et.al. |
2504.19818 |
link |
2025-04-28 |
Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance |
Takuya Tamura et.al. |
2504.19811 |
null |
2025-04-28 |
Contextures: The Mechanism of Representation Learning |
Runtian Zhai et.al. |
2504.19792 |
null |
2025-04-28 |
Heterophily-informed Message Passing |
Haishan Wang et.al. |
2504.19785 |
null |
2025-04-29 |
If Concept Bottlenecks are the Question, are Foundation Models the Answer? |
Nicola Debole et.al. |
2504.19774 |
link |
2025-04-28 |
Moral Reasoning Across Languages: The Critical Role of Low-Resource Languages in LLMs |
Huichi Zhou et.al. |
2504.19759 |
null |
2025-04-28 |
Reconstructing Context: Evaluating Advanced Chunking Strategies for Retrieval-Augmented Generation |
Carlo Merola et.al. |
2504.19754 |
link |
2025-04-28 |
FineQ: Software-Hardware Co-Design for Low-Bit Fine-Grained Mixed-Precision Quantization of LLMs |
Xilong Xie et.al. |
2504.19746 |
null |
2025-04-28 |
LLM-Assisted Automated Deductive Coding of Dialogue Data: Leveraging Dialogue-Specific Characteristics to Enhance Contextual Understanding |
Ying Na et.al. |
2504.19734 |
null |
2025-04-28 |
RepText: Rendering Visual Text via Replicating |
Haofan Wang et.al. |
2504.19724 |
null |
2025-04-28 |
Taming the Titans: A Survey of Efficient LLM Inference Serving |
Ranran Zhen et.al. |
2504.19720 |
link |
2025-04-28 |
Pixels2Points: Fusing 2D and 3D Features for Facial Skin Segmentation |
Victoria Yue Chen et.al. |
2504.19718 |
null |
2025-04-28 |
Guided Tensor Lifting |
Yixuan Li et.al. |
2504.19705 |
null |
2025-04-28 |
From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review |
Mohamed Amine Ferrag et.al. |
2504.19678 |
null |
2025-04-28 |
Annif at SemEval-2025 Task 5: Traditional XMTC augmented by LLMs |
Osma Suominen et.al. |
2504.19675 |
link |
2025-04-28 |
SAGE: A Generic Framework for LLM Safety Evaluation |
Madhur Jindal et.al. |
2504.19674 |
link |
2025-04-28 |
A Tripartite Perspective on GraphRAG |
Michael Banf et.al. |
2504.19667 |
null |
2025-04-28 |
Decentralization of Generative AI via Mixture of Experts for Wireless Networks: A Comprehensive Survey |
Yunting Xu et.al. |
2504.19660 |
null |
2025-04-28 |
Intelligent4DSE: Optimizing High-Level Synthesis Design Space Exploration with Graph Neural Networks and Large Language Models |
Lei Xu et.al. |
2504.19649 |
null |
2025-04-28 |
Fitness Landscape of Large Language Model-Assisted Automated Algorithm Search |
Fei Liu et.al. |
2504.19636 |
null |
2025-04-28 |
DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer |
Junpeng Jiang et.al. |
2504.19614 |
null |
2025-04-28 |
Scene2Hap: Combining LLMs and Physical Modeling for Automatically Generating Vibrotactile Signals for Full VR Scenes |
Arata Jingu et.al. |
2504.19611 |
null |
2025-04-28 |
Coreference Resolution for Vietnamese Narrative Texts |
Hieu-Dai Tran et.al. |
2504.19606 |
null |
2025-04-28 |
GVPO: Group Variance Policy Optimization for Large Language Model Post-Training |
Kaichen Zhang et.al. |
2504.19599 |
null |
2025-04-28 |
Towards Robust Multimodal Physiological Foundation Models: Handling Arbitrary Missing Modalities |
Xi Fu et.al. |
2504.19596 |
null |
2025-04-28 |
Mapping the Italian Telegram Ecosystem |
Lorenzo Alvisi et.al. |
2504.19594 |
null |
2025-04-28 |
Graph-Based Spectral Decomposition for Parameter Coordination in Language Model Fine-Tuning |
Hanlu Zhang et.al. |
2504.19583 |
null |
2025-04-28 |
m-KAILIN: Knowledge-Driven Agentic Scientific Corpus Distillation Framework for Biomedical Large Language Models Training |
Meng Xiao et.al. |
2504.19565 |
null |
2025-04-28 |
Quantifying Memory Utilization with Effective State-Size |
Rom N. Parnichkun et.al. |
2504.19561 |
null |
2025-04-28 |
Detecting Effects of AI-Mediated Communication on Language Complexity and Sentiment |
Kristen Sussman et.al. |
2504.19556 |
null |
2025-04-28 |
DEEMO: De-identity Multimodal Emotion Recognition and Reasoning |
Deng Li et.al. |
2504.19549 |
null |
2025-04-28 |
Towards Faster and More Compact Foundation Models for Molecular Property Prediction |
Yasir Ghunaim et.al. |
2504.19538 |
link |
2025-04-28 |
LR-IAD: Mask-Free Industrial Anomaly Detection with Logical Reasoning |
Peijian Zeng et.al. |
2504.19524 |
null |
2025-04-28 |
FlashOverlap: A Lightweight Design for Efficiently Overlapping Communication and Computation |
Ke Hong et.al. |
2504.19519 |
null |
2025-04-28 |
Evolution of Cooperation in LLM-Agent Societies: A Preliminary Study Using Different Punishment Strategies |
Kavindu Warnakulasuriya et.al. |
2504.19487 |
null |
2025-04-28 |
Improving Reasoning Performance in Large Language Models via Representation Engineering |
Bertram Højer et.al. |
2504.19483 |
null |
2025-04-28 |
An Automated Reinforcement Learning Reward Design Framework with Large Language Model for Cooperative Platoon Coordination |
Dixiao Wei et.al. |
2504.19480 |
null |
2025-04-28 |
BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text |
Jiageng Wu et.al. |
2504.19467 |
link |
2025-04-28 |
Do Automatic Comment Generation Techniques Fall Short? Exploring the Influence of Method Dependencies on Code Understanding |
Md Mustakim Billah et.al. |
2504.19459 |
null |
2025-04-28 |
Towards Long Context Hallucination Detection |
Siyi Liu et.al. |
2504.19457 |
null |
2025-04-28 |
Masked Language Prompting for Generative Data Augmentation in Few-shot Fashion Style Recognition |
Yuki Hirakawa et.al. |
2504.19455 |
null |
2025-04-28 |
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference |
Zhenyu Zhang et.al. |
2504.19449 |
null |
2025-04-28 |
Systematic Bias in Large Language Models: Discrepant Response Patterns in Binary vs. Continuous Judgment Tasks |
Yi-Long Lu et.al. |
2504.19445 |
null |
2025-04-28 |
Large Language Models are Qualified Benchmark Builders: Rebuilding Pre-Training Datasets for Advancing Code Intelligence Tasks |
Kang Yang et.al. |
2504.19444 |
null |
2025-04-28 |
Context-Guided Dynamic Retrieval for Improving Generation Quality in RAG Models |
Jacky He et.al. |
2504.19436 |
null |
2025-04-29 |
MER 2025: When Affective Computing Meets Large Language Models |
Zheng Lian et.al. |
2504.19423 |
null |
2025-04-28 |
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory |
Prateek Chhikara et.al. |
2504.19413 |
null |
2025-04-29 |
Context Selection and Rewriting for Video-based Educational Question Generation |
Mengxia Yu et.al. |
2504.19406 |
link |
2025-04-27 |
LLMs for Engineering: Teaching Models to Design High Powered Rockets |
Toby Simonds et.al. |
2504.19394 |
null |
2025-04-27 |
From Inductive to Deductive: LLMs-Based Qualitative Data Analysis in Requirements Engineering |
Syed Tauhid Ullah Shah et.al. |
2504.19384 |
link |
2025-04-27 |
Flow Along the K-Amplitude for Generative Modeling |
Weitao Du et.al. |
2504.19353 |
null |
2025-04-27 |
Contextual Online Uncertainty-Aware Preference Learning for Human Feedback |
Nan Lu et.al. |
2504.19342 |
null |
2025-04-27 |
OpenFOAMGPT 2.0: end-to-end, trustworthy automation for computational fluid dynamics |
Jingsen Feng et.al. |
2504.19338 |
null |
2025-04-29 |
Unified Multi-Task Learning & Model Fusion for Efficient Language Model Guardrailing |
James O’Neill et.al. |
2504.19333 |
null |
2025-04-27 |
BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese |
Peilin Zhou et.al. |
2504.19314 |
link |
2025-04-27 |
AndroidGen: Building an Android Language Agent under Data Scarcity |
Hanyu Lai et.al. |
2504.19298 |
null |
2025-04-27 |
Multiscale Roughness of Upper Mantle Discontinuities Inferred from the USArray: Dependence on Tomography Models |
Yinzhi Wang et.al. |
2504.19290 |
null |
2025-04-27 |
Generalized Score Matching: Bridging $f$-Divergence and Statistical Estimation Under Correlated Noise |
Yirong Shen et.al. |
2504.19288 |
null |
2025-04-27 |
Small Models, Big Tasks: An Exploratory Empirical Study on Small Language Models for Function Calling |
Ishan Kavathekar et.al. |
2504.19277 |
link |
2025-04-27 |
Anyprefer: An Agentic Framework for Preference Data Synthesis |
Yiyang Zhou et.al. |
2504.19276 |
null |
2025-04-27 |
OpenFusion++: An Open-vocabulary Real-time Scene Understanding System |
Xiaofeng Jin et.al. |
2504.19266 |
null |
2025-04-27 |
The Convergent Ethics of AI? Analyzing Moral Foundation Priorities in Large Language Models with a Multi-Framework Approach |
Chad Coleman et.al. |
2504.19255 |
null |
2025-04-27 |
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers |
Dylan Bouchard et.al. |
2504.19254 |
link |
2025-04-27 |
CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis |
Alexander Baumann et.al. |
2504.19223 |
null |
2025-04-27 |
AlphaFuse: Learn ID Embeddings for Sequential Recommendation in Null Space of Language Embeddings |
Guoqing Hu et.al. |
2504.19218 |
link |
2025-04-27 |
WuNeng: Hybrid State with Attention |
Liu Xiao et.al. |
2504.19191 |
null |
2025-04-27 |
Different behaviors of diffusing diffusivity dynamics based on three different definitions of fractional Brownian motion |
Wei Wang et.al. |
2504.19190 |
null |
2025-04-27 |
Hierarchical Attention Generates Better Proofs |
Jianlong Chen et.al. |
2504.19188 |
link |
2025-04-27 |
Segmenting Objectiveness and Task-awareness Unknown Region for Autonomous Driving |
Mi Zheng et.al. |
2504.19183 |
null |
2025-04-27 |
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning |
Jiaqi Chen et.al. |
2504.19162 |
null |
2025-04-27 |
Muyan-TTS: A Trainable Text-to-Speech Model Optimized for Podcast Scenarios with a $50K Budget |
Xin Li et.al. |
2504.19146 |
link |
2025-04-27 |
ChiseLLM: Unleashing the Power of Reasoning LLMs for Chisel Agile Hardware Development |
Bowei Wang et.al. |
2504.19144 |
link |
2025-04-27 |
APE-Bench I: Towards File-level Automated Proof Engineering of Formal Math Libraries |
Huajian Xin et.al. |
2504.19110 |
null |
2025-04-27 |
A Multi-Language Perspective on the Robustness of LLM Code Generation |
Fazle Rabbi et.al. |
2504.19108 |
link |
2025-04-27 |
Harmonizing Generalization and Personalization in Ring-topology Decentralized Federated Learning |
Shunxin Guo et.al. |
2504.19103 |
null |
2025-04-27 |
Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation |
Qianren Mao et.al. |
2504.19101 |
null |
2025-04-27 |
VeriDebug: A Unified LLM for Verilog Debugging via Contrastive Embedding and Guided Correction |
Ning Wang et.al. |
2504.19099 |
null |
2025-04-27 |
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges |
Yu Li et.al. |
2504.19093 |
null |
2025-04-25 |
Generalization Capability for Imitation Learning |
Yixiao Wang et.al. |
2504.18538 |
null |
2025-04-25 |
TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation |
Gwen Yidou Weng et.al. |
2504.18535 |
null |
2025-04-25 |
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation |
Shivam Duggal et.al. |
2504.18509 |
null |
2025-04-25 |
Action-Minimization Meets Generative Modeling: Efficient Transition Path Sampling with the Onsager-Machlup Functional |
Sanjeev Raja et.al. |
2504.18506 |
null |
2025-04-25 |
Investigating Co-Constructive Behavior of Large Language Models in Explanation Dialogues |
Leandra Fichtel et.al. |
2504.18483 |
null |
2025-04-25 |
Reason Like a Radiologist: Chain-of-Thought and Reinforcement Learning for Verifiable Report Generation |
Peiyuan Jing et.al. |
2504.18453 |
null |
2025-04-25 |
Kimi-Audio Technical Report |
KimiTeam et.al. |
2504.18425 |
link |
2025-04-25 |
LaRI: Layered Ray Intersections for Single-view 3D Geometric Reasoning |
Rui Li et.al. |
2504.18424 |
null |
2025-04-25 |
LLMpatronous: Harnessing the Power of LLMs For Vulnerability Detection |
Rajesh Yarra et.al. |
2504.18423 |
null |
2025-04-25 |
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs |
Hongyu Wang et.al. |
2504.18415 |
null |
2025-04-25 |
An Empirical Study of Evaluating Long-form Question Answering |
Ning Xian et.al. |
2504.18413 |
link |
2025-04-25 |
Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers |
Jared Moore et.al. |
2504.18412 |
link |
2025-04-25 |
HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding? |
Yusen Zhang et.al. |
2504.18406 |
null |
2025-04-25 |
HepatoGEN: Generating Hepatobiliary Phase MRI with Perceptual and Adversarial Models |
Jens Hooge et.al. |
2504.18405 |
null |
2025-04-25 |
Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization |
Kesen Zhao et.al. |
2504.18397 |
null |
2025-04-25 |
Bridge the Domains: Large Language Models Enhanced Cross-domain Sequential Recommendation |
Qidong Liu et.al. |
2504.18383 |
null |
2025-04-25 |
Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal Assistant |
Lei Shen et.al. |
2504.18373 |
null |
2025-04-25 |
ThreMoLIA: Threat Modeling of Large Language Model-Integrated Applications |
Felix Viktor Jedrzejewski et.al. |
2504.18369 |
null |
2025-04-25 |
Enhanced Sampling, Public Dataset and Generative Model for Drug-Protein Dissociation Dynamics |
Maodong Li et.al. |
2504.18367 |
null |
2025-04-25 |
Testing Individual Fairness in Graph Neural Networks |
Roya Nasiri et.al. |
2504.18353 |
null |
2025-04-25 |
Revisiting Data Auditing in Large Vision-Language Models |
Hongyu Zhu et.al. |
2504.18349 |
null |
2025-04-25 |
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review |
Toghrul Abbasli et.al. |
2504.18346 |
null |
2025-04-25 |
Large Language Models to Accelerate Organic Chemistry Synthesis |
Yu Zhang et.al. |
2504.18340 |
null |
2025-04-25 |
SSD-Poser: Avatar Pose Estimation with State Space Duality from Sparse Observations |
Shuting Zhao et.al. |
2504.18332 |
null |
2025-04-25 |
Towards Adaptive Software Agents for Debugging |
Yacine Majdoub et.al. |
2504.18316 |
null |
2025-04-25 |
Artificial Intelligence health advice accuracy varies across languages and contexts |
Prashant Garg et.al. |
2504.18310 |
null |
2025-04-25 |
Seeing Soundscapes: Audio-Visual Generation and Separation from Soundscapes Using Audio-Visual Separator |
Minjae Kang et.al. |
2504.18283 |
null |
2025-04-25 |
LEAM: A Prompt-only Large Language Model-enabled Antenna Modeling Method |
Tao Wu et.al. |
2504.18271 |
link |
2025-04-25 |
TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation |
Shintaro Ozaki et.al. |
2504.18269 |
null |
2025-04-25 |
MAGI: Multi-Agent Guided Interview for Psychiatric Assessment |
Guanqun Bi et.al. |
2504.18260 |
null |
2025-04-25 |
SSL4Eco: A Global Seasonal Dataset for Geospatial Foundation Models in Ecology |
Elena Plekhanova et.al. |
2504.18256 |
null |
2025-04-25 |
Efficient Single-Pass Training for Multi-Turn Reasoning |
Ritesh Goru et.al. |
2504.18246 |
null |
2025-04-25 |
What is the Added Value of UDA in the VFM Era? |
Brunó B. Englert et.al. |
2504.18190 |
null |
2025-04-25 |
Offline Learning of Controllable Diverse Behaviors |
Mathieu Petitbois et.al. |
2504.18160 |
null |
2025-04-25 |
Leveraging Decoder Architectures for Learned Sparse Retrieval |
Jingfen Qiao et.al. |
2504.18151 |
null |
2025-04-25 |
NoEsis: Differentially Private Knowledge Transfer in Modular LLM Adaptation |
Rob Romijnders et.al. |
2504.18147 |
null |
2025-04-25 |
Score-Based Deterministic Density Sampling |
Vasily Ilin et.al. |
2504.18130 |
null |
2025-04-25 |
Think, Prune, Train, Improve: Scaling Reasoning without Scaling Models |
Caia Costello et.al. |
2504.18116 |
null |
2025-04-25 |
Comparative Study on the Discourse Meaning of Chinese and English Media in the Paris Olympics Based on LDA Topic Modeling Technology and LLM Prompt Engineering |
Yinglong Yu et.al. |
2504.18106 |
null |
2025-04-25 |
Application and Optimization of Large Models Based on Prompt Tuning for Fact-Check-Worthiness Estimation |
Yinglong Yu et.al. |
2504.18104 |
null |
2025-04-25 |
Random-Set Large Language Models |
Muhammad Mubashar et.al. |
2504.18085 |
null |
2025-04-25 |
Automating Function-Level TARA for Automotive Full-Lifecycle Security |
Yuqiao Yang et.al. |
2504.18083 |
null |
2025-04-25 |
Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization |
Wataru Kawakami et.al. |
2504.18080 |
null |
2025-04-25 |
PropRAG: Guiding Retrieval with Beam Search over Proposition Paths |
Jingjin Wang et.al. |
2504.18070 |
null |
2025-04-25 |
LLM-Guided Open RAN: Empowering Hierarchical RAN Intelligent Control |
Lingyan Bao et.al. |
2504.18062 |
null |
2025-04-25 |
DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models |
Jianyu Liu et.al. |
2504.18053 |
link |
2025-04-25 |
Validating Network Protocol Parsers with Traceable RFC Document Interpretation |
Mingwei Zheng et.al. |
2504.18050 |
null |
2025-04-25 |
RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models |
Bang An et.al. |
2504.18041 |
null |
2025-04-25 |
MultiMind: Enhancing Werewolf Agents with Multimodal Reasoning and Theory of Mind |
Zheng Zhang et.al. |
2504.18039 |
null |
2025-04-25 |
Federated Client-tailored Adapter for Medical Image Segmentation |
Guyue Hu et.al. |
2504.18020 |
null |
2025-04-25 |
Diffusion-Driven Universal Model Inversion Attack for Face Recognition |
Hanrui Wang et.al. |
2504.18015 |
null |
2025-04-25 |
Sky-Drive: A Distributed Multi-Agent Simulation Platform for Socially-Aware and Human-AI Collaborative Future Transportation |
Zilin Huang et.al. |
2504.18010 |
null |
2025-04-25 |
Assessing the Utility of Audio Foundation Models for Heart and Respiratory Sound Analysis |
Daisuke Niizumi et.al. |
2504.18004 |
null |
2025-04-25 |
Self-Balancing, Memory Efficient, Dynamic Metric Space Data Maintenance, for Rapid Multi-Kernel Estimation |
Aditya S Ellendula et.al. |
2504.18003 |
null |
2025-04-25 |
Streaming, Fast and Slow: Cognitive Load-Aware Streaming for Efficient LLM Serving |
Chang Xiao et.al. |
2504.17999 |
null |
2025-04-24 |
Optimism, Expectation, or Sarcasm? Multi-Class Hope Speech Detection in Spanish and English |
Sabur Butt et.al. |
2504.17974 |
null |
2025-04-24 |
LLM Agent Swarm for Hypothesis-Driven Drug Discovery |
Kevin Song et.al. |
2504.17967 |
null |
2025-04-24 |
Evaluating Machine Expertise: How Graduate Students Develop Frameworks for Assessing GenAI Content |
Celia Chen et.al. |
2504.17964 |
null |
2025-04-24 |
Toward a Human-Centered Evaluation Framework for Trustworthy LLM-Powered GUI Agents |
Chaoran Chen et.al. |
2504.17934 |
null |
2025-04-24 |
The Role of Open-Source LLMs in Shaping the Future of GeoAI |
Xiao Huang et.al. |
2504.17833 |
null |
2025-04-24 |
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models |
Xu Ma et.al. |
2504.17789 |
null |
2025-04-24 |
Replay to Remember: Retaining Domain Knowledge in Streaming Language Models |
Sneh Pillai et.al. |
2504.17780 |
null |
2025-04-24 |
Conversational Assistants to support Heart Failure Patients: comparing a Neurosymbolic Architecture with ChatGPT |
Anuja Tayal et.al. |
2504.17753 |
null |
2025-04-24 |
Towards Robust LLMs: an Adversarial Robustness Measurement Framework |
Natan Levy et.al. |
2504.17723 |
null |
2025-04-24 |
Multilingual Performance Biases of Large Language Models in Education |
Vansh Gupta et.al. |
2504.17720 |
null |
2025-04-24 |
PICO: Reconstructing 3D People In Contact with Objects |
Alpár Cseke et.al. |
2504.17695 |
null |
2025-04-24 |
Ensemble Bayesian Inference: Leveraging Small Language Models to Achieve LLM-level Accuracy in Profile Matching Tasks |
Haru-Tada Sato et.al. |
2504.17685 |
null |
2025-04-24 |
INSIGHT: Bridging the Student-Teacher Gap in Times of Large Language Models |
Jarne Thys et.al. |
2504.17677 |
null |
2025-04-24 |
Energy Considerations of Large Language Model Inference and Efficiency Optimizations |
Jared Fernandez et.al. |
2504.17674 |
null |
2025-04-24 |
Cross-region Model Training with Communication-Computation Overlapping and Delay Compensation |
Ying Zhu et.al. |
2504.17672 |
null |
2025-04-24 |
DiMeR: Disentangled Mesh Reconstruction Model |
Lutao Jiang et.al. |
2504.17670 |
link |
2025-04-24 |
Towards a HIPAA Compliant Agentic AI System in Healthcare |
Subash Neupane et.al. |
2504.17669 |
null |
2025-04-24 |
Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics |
Zena Al-Khalili et.al. |
2504.17665 |
null |
2025-04-24 |
Effortless, Simulation-Efficient Bayesian Inference using Tabular Foundation Models |
Julius Vetter et.al. |
2504.17660 |
null |
2025-04-24 |
Likelihood-Free Variational Autoencoders |
Chen Xu et.al. |
2504.17622 |
null |
2025-04-24 |
L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference |
Qingyuan Liu et.al. |
2504.17584 |
null |
2025-04-25 |
DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training |
Xiaoyu Tian et.al. |
2504.17565 |
null |
2025-04-24 |
HalluLens: LLM Hallucination Benchmark |
Yejin Bang et.al. |
2504.17550 |
null |
2025-04-24 |
A Comprehensive Survey of Knowledge-Based Vision Question Answering Systems: The Lifecycle of Knowledge in Visual Reasoning Task |
Jiaqi Deng et.al. |
2504.17547 |
null |
2025-04-24 |
Auditing the Ethical Logic of Generative AI Models |
W. Russell Neuman et.al. |
2504.17544 |
null |
2025-04-24 |
Large Language Model-Driven Concolic Execution for Highly Structured Test Input Generation |
Haoxin Tu et.al. |
2504.17542 |
null |
2025-04-24 |
Towards Machine-Generated Code for the Resolution of User Intentions |
Justus Flerlage et.al. |
2504.17531 |
link |
2025-04-26 |
Combining GCN Structural Learning with LLM Chemical Knowledge for Enhanced Virtual Screening |
Radia Berreziga et.al. |
2504.17497 |
null |
2025-04-24 |
Unified Attacks to Large Language Model Watermarks: Spoofing and Scrubbing in Unauthorized Knowledge Distillation |
Xin Yi et.al. |
2504.17480 |
null |
2025-04-24 |
Unveiling Hidden Vulnerabilities in Digital Human Generation via Adversarial Attacks |
Zhiying Li et.al. |
2504.17457 |
null |
2025-04-24 |
Adaptive Orchestration of Modular Generative Information Access Systems |
Mohanna Hoveyda et.al. |
2504.17454 |
link |
2025-04-24 |
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs |
Tiancheng Gu et.al. |
2504.17432 |
null |
2025-04-24 |
Beyond Whole Dialogue Modeling: Contextual Disentanglement for Conversational Recommendation |
Guojia An et.al. |
2504.17427 |
null |
2025-04-24 |
Towards Leveraging Large Language Model Summaries for Topic Modeling in Source Code |
Michele Carissimi et.al. |
2504.17426 |
null |
2025-04-24 |
Towards Harnessing the Collaborative Power of Large and Small Models for Domain Tasks |
Yang Liu et.al. |
2504.17421 |
null |
2025-04-24 |
Assessing the Capability of Large Language Models for Domain-Specific Ontology Generation |
Anna Sofia Lippolis et.al. |
2504.17402 |
null |
2025-04-24 |
Fine-tune Smarter, Not Harder: Parameter-Efficient Fine-Tuning for Geospatial Foundation Models |
Francesc Marti-Escofet et.al. |
2504.17397 |
null |
2025-04-25 |
On the workflow, opportunities and challenges of developing foundation model in geophysics |
Hanlin Sheng et.al. |
2504.17384 |
null |
2025-04-24 |
On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration |
Maoyang Xiang et.al. |
2504.17376 |
null |
2025-04-24 |
LiveLongBench: Tackling Long-Context Understanding for Spoken Texts from Live Streams |
Yongxuan Wu et.al. |
2504.17366 |
link |
2025-04-25 |
TimeSoccer: An End-to-End Multimodal Large Language Model for Soccer Commentary Generation |
Ling You et.al. |
2504.17365 |
null |
2025-04-24 |
PatientDx: Merging Large Language Models for Protecting Data-Privacy in Healthcare |
Jose G. Moreno et.al. |
2504.17360 |
null |
2025-04-24 |
Comprehend, Divide, and Conquer: Feature Subspace Exploration via Multi-Agent Hierarchical Reinforcement Learning |
Weiliang Zhang et.al. |
2504.17356 |
null |
2025-04-24 |
DRC: Enhancing Personalized Image Generation via Disentangled Representation Composition |
Yiyan Xu et.al. |
2504.17349 |
null |
2025-04-24 |
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos |
Linli Yao et.al. |
2504.17343 |
null |
2025-04-24 |
Bridging Cognition and Emotion: Empathy-Driven Multimodal Misinformation Detection |
Zihan Wang et.al. |
2504.17332 |
null |
2025-04-24 |
Exploring Context-aware and LLM-driven Locomotion for Immersive Virtual Reality |
Süleyman Özdel et.al. |
2504.17331 |
null |
2025-04-24 |
Dargana: fine-tuning EarthPT for dynamic tree canopy mapping from space |
Michael J. Smith et.al. |
2504.17321 |
null |
2025-04-25 |
Class-Conditional Distribution Balancing for Group Robust Classification |
Miaoyun Zhao et.al. |
2504.17314 |
null |
2025-04-24 |
FLUKE: A Linguistically-Driven and Task-Agnostic Framework for Robustness Evaluation |
Yulia Otmakhova et.al. |
2504.17311 |
null |
2025-04-24 |
CoheMark: A Novel Sentence-Level Watermark for Enhanced Text Quality |
Junyan Zhang et.al. |
2504.17309 |
null |
2025-04-24 |
AI-Enhanced Business Process Automation: A Case Study in the Insurance Domain Using Object-Centric Process Mining |
Shahrzad Khayatbashi et.al. |
2504.17295 |
null |
2025-04-24 |
Combining Static and Dynamic Approaches for Mining and Testing Constraints for RESTful API Testing |
Hieu Huynh et.al. |
2504.17287 |
null |
2025-04-24 |
MV-Crafter: An Intelligent System for Music-guided Video Generation |
Chuer Chen et.al. |
2504.17267 |
null |
2025-04-24 |
JurisCTC: Enhancing Legal Judgment Prediction via Cross-Domain Transfer and Contrastive Learning |
Zhaolu Kang et.al. |
2504.17264 |
null |
2025-04-24 |
Symbolic Representation for Any-to-Any Generative Tasks |
Jiaqi Chen et.al. |
2504.17261 |
null |
2025-04-24 |
Targeted AMP generation through controlled diffusion with efficient embeddings |
Diogo Soares et.al. |
2504.17247 |
null |
2025-04-24 |
FLAG: Formal and LLM-assisted SVA Generation for Formal Specifications of On-Chip Communication Protocols |
Yu-An Shih et.al. |
2504.17226 |
null |
2025-04-24 |
Visual and textual prompts for enhancing emotion recognition in video |
Zhifeng Wang et.al. |
2504.17224 |
null |
2025-04-24 |
Towards Generalizable Deepfake Detection with Spatial-Frequency Collaborative Learning and Hierarchical Cross-Modal Fusion |
Mengyu Qiao et.al. |
2504.17223 |
null |
2025-04-24 |
Does Knowledge Distillation Matter for Large Language Model based Bundle Generation? |
Kaidong Feng et.al. |
2504.17220 |
null |
2025-04-24 |
Enhancing Variational Autoencoders with Smooth Robust Latent Encoding |
Hyomin Lee et.al. |
2504.17219 |
null |
2025-04-24 |
Synthetic Power Flow Data Generation Using Physics-Informed Denoising Diffusion Probabilistic Models |
Junfei Wang et.al. |
2504.17210 |
null |
2025-04-24 |
Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation |
Phillip Y. Lee et.al. |
2504.17207 |
null |
2025-04-24 |
High-Fidelity And Complex Test Data Generation For Real-World SQL Code Generation Services |
Shivasankari Kannan et.al. |
2504.17203 |
null |
2025-04-24 |
A RAG-Based Multi-Agent LLM System for Natural Hazard Resilience and Adaptation |
Yangxinyu Xie et.al. |
2504.17200 |
null |
2025-04-24 |
Automatically Generating Rules of Malicious Software Packages via Large Language Model |
XiangRui Zhang et.al. |
2504.17198 |
null |
2025-04-24 |
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning |
Minju Seo et.al. |
2504.17192 |
link |
2025-04-25 |
We’ll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback |
Minkyu Choi et.al. |
2504.17180 |
null |
2025-04-24 |
A Genealogy of Multi-Sensor Foundation Models in Remote Sensing |
Kevin Lane et.al. |
2504.17177 |
null |
2025-04-23 |
MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation |
Chanhee Park et.al. |
2504.17137 |
null |
2025-04-23 |
Steering the CensorShip: Uncovering Representation Vectors for LLM “Thought” Control |
Hannah Cyberey et.al. |
2504.17130 |
link |
2025-04-23 |
Physiological neural representation for personalised tracer kinetic parameter estimation from dynamic PET |
Kartikay Tehlan et.al. |
2504.17122 |
link |
2025-04-25 |
The Rise of Small Language Models in Healthcare: A Comprehensive Survey |
Muskan Garg et.al. |
2504.17119 |
null |
2025-04-23 |
Leveraging LLMs as Meta-Judges: A Multi-Agent Framework for Evaluating LLM Judgments |
Yuran Li et.al. |
2504.17087 |
null |
2025-04-23 |
Scene-Aware Location Modeling for Data Augmentation in Automotive Object Detection |
Jens Petersen et.al. |
2504.17076 |
null |
2025-04-23 |
Robo-Troj: Attacking LLM-based Task Planners |
Mohaiminul Al Nahian et.al. |
2504.17070 |
null |
2025-04-23 |
Distilling semantically aware orders for autoregressive image generation |
Rishav Pramanik et.al. |
2504.17069 |
null |
2025-04-23 |
Statistical Guarantees in Synthetic Data through Conformal Adversarial Generation |
Rahul Vishwakarma et.al. |
2504.17058 |
null |
2025-04-23 |
Do Words Reflect Beliefs? Evaluating Belief Depth in Large Language Models |
Shariar Kabir et.al. |
2504.17052 |
null |
2025-04-23 |
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs |
Zhenhailong Wang et.al. |
2504.17040 |
null |
2025-04-23 |
Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation |
Luca Moroni et.al. |
2504.17025 |
null |
2025-04-23 |
LLM impact on BLV programming |
Prashant Chandrasekar et.al. |
2504.17018 |
null |
2025-04-23 |
(Im)possibility of Automated Hallucination Detection in Large Language Models |
Amin Karbasi et.al. |
2504.17004 |
null |
2025-04-23 |
Safety Pretraining: Toward the Next Generation of Safe AI |
Pratyush Maini et.al. |
2504.16980 |
null |
2025-04-23 |
Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light |
Ali Hassani et.al. |
2504.16922 |
null |
2025-04-23 |
IberBench: LLM Evaluation on Iberian Languages |
José Ángel González et.al. |
2504.16921 |
null |
2025-04-23 |
DreamO: A Unified Framework for Image Customization |
Chong Mou et.al. |
2504.16915 |
null |
2025-04-23 |
BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation |
Ruotong Wang et.al. |
2504.16907 |
null |
2025-04-23 |
Practical approaches for crystal structure predictions with inpainting generation and universal interatomic potentials |
Peichen Zhong et.al. |
2504.16893 |
null |
2025-04-23 |
Do Large Language Models know who did what to whom? |
Joseph M. Denning et.al. |
2504.16884 |
null |
2025-04-23 |
Enhancing Critical Thinking with AI: A Tailored Warning System for RAG Models |
Xuyang Zhu et.al. |
2504.16883 |
null |
2025-04-23 |
Context-Enhanced Vulnerability Detection Based on Large Language Model |
Yixin Yang et.al. |
2504.16877 |
null |
2025-04-24 |
Exploring How LLMs Capture and Represent Domain-Specific Knowledge |
Mirian Hipolito Garcia et.al. |
2504.16871 |
null |
2025-04-23 |
Common Functional Decompositions Can Mis-attribute Differences in Outcomes Between Populations |
Manuel Quintero et.al. |
2504.16864 |
null |
2025-04-23 |
Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification |
Alexander Shvets et.al. |
2504.16856 |
null |
2025-04-23 |
Monte Carlo Planning with Large Language Model for Text-Based Game Agents |
Zijing Shi et.al. |
2504.16855 |
null |
2025-04-25 |
Improving Significant Wave Height Prediction Using Chronos Models |
Yilin Zhai et.al. |
2504.16834 |
null |
2025-04-23 |
LRASGen: LLM-based RESTful API Specification Generation |
Sida Deng et.al. |
2504.16833 |
null |
2025-04-23 |
GreenMind: A Next-Generation Vietnamese Large Language Model for Structured and Logical Reasoning |
Luu Quy Tung et.al. |
2504.16832 |
null |
2025-04-23 |
Decoupled Global-Local Alignment for Improving Compositional Understanding |
Xiaoxing Hu et.al. |
2504.16801 |
null |
2025-04-23 |
MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores |
Fengwei Zhou et.al. |
2504.16786 |
null |
2025-04-23 |
Graph2Nav: 3D Object-Relation Graph Generation to Robot Navigation |
Tixiao Shan et.al. |
2504.16782 |
null |
2025-04-23 |
Advanced Chest X-Ray Analysis via Transformer-Based Image Descriptors and Cross-Model Attention Mechanism |
Lakshita Agarwal et.al. |
2504.16774 |
null |
2025-04-23 |
How Effective are Generative Large Language Models in Performing Requirements Classification? |
Waad Alhoshan et.al. |
2504.16768 |
null |
2025-04-23 |
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism |
Lakshita Agarwal et.al. |
2504.16761 |
null |
2025-04-23 |
Lightweight Latent Verifiers for Efficient Meta-Generation Strategies |
Bartosz Piotrowski et.al. |
2504.16760 |
null |
2025-04-23 |
HEMA: A Hippocampus-Inspired Extended Memory Architecture for Long-Context AI Conversations |
Kwangseob Ahn et.al. |
2504.16754 |
null |
2025-04-23 |
Feature Mixing Approach for Detecting Intraoperative Adverse Events in Laparoscopic Roux-en-Y Gastric Bypass Surgery |
Rupak Bose et.al. |
2504.16749 |
null |
2025-04-23 |
A Survey of AI Agent Protocols |
Yingxuan Yang et.al. |
2504.16736 |
null |
2025-04-23 |
IRIS: Interactive Research Ideation System for Accelerating Scientific Discovery |
Aniketh Garikaparthi et.al. |
2504.16728 |
link |
2025-04-23 |
Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator |
Chenhao Li et.al. |
2504.16680 |
null |
2025-04-23 |
A Post-trainer’s Guide to Multilingual Training Data: Uncovering Cross-lingual Transfer Dynamics |
Luisa Shimabucoro et.al. |
2504.16677 |
null |
2025-04-23 |
LLMCode: Evaluating and Enhancing Researcher-AI Alignment in Qualitative Analysis |
Joel Oksanen et.al. |
2504.16671 |
null |
2025-04-23 |
MAYA: Addressing Inconsistencies in Generative Password Guessing through a Unified Benchmark |
William Corrias et.al. |
2504.16651 |
link |
2025-04-23 |
ParetoHqD: Fast Offline Multiobjective Alignment of Large Language Models using Pareto High-quality Data |
Haoran Gu et.al. |
2504.16628 |
null |
2025-04-23 |
Federated EndoViT: Pretraining Vision Transformers via Federated Learning on Endoscopic Image Collections |
Max Kirchner et.al. |
2504.16612 |
null |
2025-04-23 |
Debunking with Dialogue? Exploring AI-Generated Counterspeech to Challenge Conspiracy Theories |
Mareike Lisker et.al. |
2504.16604 |
null |
2025-04-23 |
Comparing Large Language Models and Traditional Machine Translation Tools for Translating Medical Consultation Summaries: A Pilot Study |
Andy Li et.al. |
2504.16601 |
null |
2025-04-23 |
Case Study: Fine-tuning Small Language Models for Accurate and Private CWE Detection in Python Code |
Md. Azizul Hakim Bappy et.al. |
2504.16584 |
null |
2025-04-24 |
Hyper-Transforming Latent Diffusion Models |
Ignacio Peis et.al. |
2504.16580 |
null |
2025-04-23 |
PIS: Linking Importance Sampling and Attention Mechanisms for Efficient Prompt Compression |
Lizhe Chen et.al. |
2504.16574 |
null |
2025-04-23 |
PsyCounAssist: A Full-Cycle AI-Powered Psychological Counseling Assistant System |
Xianghe Liu et.al. |
2504.16573 |
null |
2025-04-23 |
Enhancing LLM-Based Agents via Global Planning and Hierarchical Execution |
Junjie Chen et.al. |
2504.16563 |
link |
2025-04-23 |
Exploring human-SAV interaction using large language models: The impact of psychological ownership and anthropomorphism on user experience |
Lirui Guo et.al. |
2504.16548 |
null |
2025-04-23 |
Tinkering Against Scaling |
Bolun Zhang et.al. |
2504.16546 |
null |
2025-04-23 |
6G EdgeAI: Performance Evaluation and Analysis |
Chien-Sheng Yang et.al. |
2504.16529 |
null |
2025-04-23 |
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining |
Fengze Liu et.al. |
2504.16511 |
null |
2025-04-23 |
A Comprehensive Survey of Synthetic Tabular Data Generation |
Ruxue Shi et.al. |
2504.16506 |
link |
2025-04-23 |
TraveLLaMA: Facilitating Multi-modal Large Language Models to Understand Urban Scenes and Provide Travel Assistance |
Meng Chu et.al. |
2504.16505 |
null |
2025-04-23 |
Intelligent Depression Prevention via LLM-Based Dialogue Analysis: Overcoming the Limitations of Scale-Dependent Diagnosis through Precise Emotional Pattern Recognition |
Zhenguang Zhong et.al. |
2504.16504 |
null |
2025-04-23 |
Amplified Vulnerabilities: Structured Jailbreak Attacks on LLM-based Multi-Agent Debate |
Senmao Qi et.al. |
2504.16489 |
null |
2025-04-23 |
Harden and Catch for Just-in-Time Assured LLM-Based Software Testing: Open Research Challenges |
Mark Harman et.al. |
2504.16472 |
null |
2025-04-23 |
Killing Two Birds with One Stone: Unifying Retrieval and Ranking with a Single Generative Recommendation Model |
Luankang Zhang et.al. |
2504.16454 |
null |
2025-04-23 |
EMRModel: A Large Language Model for Extracting Medical Consultation Dialogues into Structured Medical Records |
Shuguang Zhao et.al. |
2504.16448 |
null |
2025-04-23 |
Give LLMs a Security Course: Securing Retrieval-Augmented Code Generation via Knowledge Injection |
Bo Lin et.al. |
2504.16429 |
null |
2025-04-24 |
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark |
Hanlei Zhang et.al. |
2504.16427 |
link |
2025-04-23 |
Advancing Radar Hand Gesture Recognition: A Hybrid Spectrum Synthetic Framework Merging Simulation with Neural Networks |
Jiaqi Tang et.al. |
2504.16423 |
null |
2025-04-23 |
A Survey of Foundation Model-Powered Recommender Systems: From Feature-Based, Generative to Agentic Paradigms |
Chengkai Huang et.al. |
2504.16420 |
null |
2025-04-23 |
Evaluating Multi-Hop Reasoning in Large Language Models: A Chemistry-Centric Case Study |
Mohammad Khodadad et.al. |
2504.16414 |
null |
2025-04-23 |
Out-of-the-Box Conditional Text Embeddings from Large Language Models |
Kosuke Yamada et.al. |
2504.16411 |
null |
2025-04-23 |
EEmo-Bench: A Benchmark for Multi-modal Large Language Models on Image Evoked Emotion Assessment |
Lancheng Gao et.al. |
2504.16405 |
null |
2025-04-23 |
Study of Auto-igniting Spray Flame in Vitiated Swirling Hot Coflow using flamelet generated model |
Zafar Alam et.al. |
2504.16384 |
null |
2025-04-23 |
SplitReason: Learning To Offload Reasoning |
Yash Akhauri et.al. |
2504.16379 |
null |
2025-04-23 |
Text-to-TrajVis: Enabling Trajectory Data Visualizations from Natural Language Questions |
Tian Bai et.al. |
2504.16358 |
null |
2025-04-23 |
DP2FL: Dual Prompt Personalized Federated Learning in Foundation Models |
Ying Chang et.al. |
2504.16357 |
null |
2025-04-23 |
Transitive Array: An Efficient GEMM Accelerator with Result Reuse |
Cong Guo et.al. |
2504.16339 |
null |
2025-04-23 |
ClarifyCoder: Clarification-Aware Fine-Tuning for Programmatic Problem Solving |
Jie JW Wu et.al. |
2504.16331 |
null |
2025-04-22 |
Media Content Atlas: A Pipeline to Explore and Investigate Multidimensional Media Space using Multimodal LLMs |
Merve Cerit et.al. |
2504.16323 |
link |
2025-04-22 |
SignX: The Foundation Model for Sign Recognition |
Sen Fang et.al. |
2504.16315 |
null |
2025-04-22 |
Capturing Symmetry and Antisymmetry in Language Models through Symmetry-Aware Training Objectives |
Zhangdie Yuan et.al. |
2504.16312 |
null |
2025-04-22 |
Improving Automated Secure Code Reviews: A Synthetic Dataset for Code Vulnerability Flaws |
Leonardo Centellas-Claros et.al. |
2504.16310 |
null |
2025-04-22 |
The Paradox of Poetic Intent in Back-Translation: Evaluating the Quality of Large Language Models in Chinese Translation |
Li Weigang et.al. |
2504.16286 |
null |
2025-04-22 |
Investigating LLMs in Clinical Triage: Promising Capabilities, Persistent Intersectional Biases |
Joseph Lee et.al. |
2504.16273 |
null |
2025-04-22 |
Learning Explainable Dense Reward Shapes via Bayesian Optimization |
Ryan Koo et.al. |
2504.16272 |
null |
2025-04-22 |
TeLLMe: An Energy-Efficient Ternary LLM Accelerator for Prefilling and Decoding on Edge FPGAs |
Ye Qiao et.al. |
2504.16266 |
null |
2025-04-22 |
Learning Energy-Based Generative Models via Potential Flow: A Variational Principle Approach to Probability Density Homotopy Matching |
Junn Yong Loo et.al. |
2504.16262 |
null |
2025-04-22 |
FinNLI: Novel Dataset for Multi-Genre Financial Natural Language Inference Benchmarking |
Jabez Magomere et.al. |
2504.16188 |
null |
2025-04-22 |
DATETIME: A new benchmark to measure LLM translation and reasoning capabilities |
Edward Gaere et.al. |
2504.16155 |
null |
2025-04-22 |
Towards responsible AI for education: Hybrid human-AI to confront the Elephant in the room |
Danial Hooshyar et.al. |
2504.16148 |
null |
2025-04-22 |
TTRL: Test-Time Reinforcement Learning |
Yuxin Zuo et.al. |
2504.16084 |
link |
2025-04-22 |
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning |
Le Zhuo et.al. |
2504.16080 |
null |
2025-04-22 |
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities |
Thomas Schmied et.al. |
2504.16078 |
null |
2025-04-22 |
PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models |
Shi Qiu et.al. |
2504.16074 |
null |
2025-04-22 |
Boosting Generative Image Modeling via Joint Image-Feature Synthesis |
Theodoros Kouzelis et.al. |
2504.16064 |
null |
2025-04-23 |
Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach |
Penghui Li et.al. |
2504.16057 |
null |
2025-04-22 |
Honey, I Shrunk the Language Model: Impact of Knowledge Distillation Methods on Performance and Explainability |
Daniel Hendriks et.al. |
2504.16056 |
null |
2025-04-22 |
Evaluating Vision Language Models (VLMs) for Radiology: A Comprehensive Analysis |
Frank Li et.al. |
2504.16047 |
null |
2025-04-23 |
Certified Mitigation of Worst-Case LLM Copyright Infringement |
Jingyu Zhang et.al. |
2504.16046 |
null |
2025-04-22 |
LLMs meet Federated Learning for Scalable and Secure IoT Management |
Yazan Otoum et.al. |
2504.16032 |
null |
2025-04-22 |
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale |
Joya Chen et.al. |
2504.16030 |
null |
2025-04-22 |
Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3 |
Ahmed R. Sadik et.al. |
2504.16027 |
null |
2025-04-23 |
CAPO: Cost-Aware Prompt Optimization |
Tom Zehle et.al. |
2504.16005 |
link |
2025-04-23 |
From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs |
Yaxiong Wu et.al. |
2504.15965 |
null |
2025-04-22 |
Deep learning of point processes for modeling high-frequency data |
Yoshihiro Gyotoku et.al. |
2504.15944 |
null |
2025-04-22 |
FairTranslate: An English-French Dataset for Gender Bias Evaluation in Machine Translation by Overcoming Gender Binarity |
Fanny Jourdan et.al. |
2504.15941 |
link |
2025-04-22 |
Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning |
Wang Lin et.al. |
2504.15932 |
null |
2025-04-22 |
StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation |
Yinmin Zhong et.al. |
2504.15930 |
null |
2025-04-22 |
Towards Test Generation from Task Description for Mobile Testing with Multi-modal Reasoning |
Hieu Huynh et.al. |
2504.15917 |
null |
2025-04-22 |
Automated Bug Report Prioritization in Large Open-Source Projects |
Riley Pierson et.al. |
2504.15912 |
null |
2025-04-24 |
Synergizing RAG and Reasoning: A Systematic Review |
Yunfan Gao et.al. |
2504.15909 |
null |
2025-04-23 |
Impact of Noise on LLM-Models Performance in Abstraction and Reasoning Corpus (ARC) Tasks with Model Temperature Considerations |
Nikhil Khandalkar et.al. |
2504.15903 |
null |
2025-04-22 |
SARI: Structured Audio Reasoning via Curriculum-Guided Reinforcement Learning |
Cheng Wen et.al. |
2504.15900 |
null |
2025-04-22 |
Exploring Cognitive and Aesthetic Causality for Multimodal Aspect-Based Sentiment Analysis |
Luwei Xiao et.al. |
2504.15848 |
null |
2025-04-22 |
Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model |
Junshu Pan et.al. |
2504.15843 |
null |
2025-04-22 |
DualOptim: Enhancing Efficacy and Stability in Machine Unlearning with Dual Optimizers |
Xuyang Zhong et.al. |
2504.15827 |
null |
2025-04-22 |
What’s the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns |
Michael A. Hedderich et.al. |
2504.15815 |
null |
2025-04-22 |
Insights from Verification: Training a Verilog Generation LLM with Reinforcement Learning with Testbench Feedback |
Ning Wang et.al. |
2504.15804 |
null |
2025-04-22 |
A closer look at how large language models trust humans: patterns and biases |
Valeria Lerman et.al. |
2504.15801 |
null |
2025-04-23 |
FinDER: Financial Dataset for Question Answering and Evaluating Retrieval-Augmented Generation |
Chanyeol Choi et.al. |
2504.15800 |
null |
2025-04-22 |
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents |
Siyu Zhou et.al. |
2504.15785 |
link |
2025-04-22 |
Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach |
Ruizhe Li et.al. |
2504.15784 |
null |
2025-04-22 |
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving |
Daocheng Fu et.al. |
2504.15780 |
null |
2025-04-22 |
Clifford Group Equivariant Diffusion Models for 3D Molecular Generation |
Cong Liu et.al. |
2504.15773 |
null |
2025-04-22 |
Grounded in Context: Retrieval-Based Method for Hallucination Detection |
Assaf Gerner et.al. |
2504.15771 |
null |
2025-04-22 |
Riemannian Neural Geodesic Interpolant |
Jiawen Wu et.al. |
2504.15736 |
null |
2025-04-22 |
BBAL: A Bidirectional Block Floating Point-Based Quantisation Accelerator for Large Language Models |
Xiaomeng Han et.al. |
2504.15721 |
null |
2025-04-22 |
SeaLLM: Service-Aware and Latency-Optimized Resource Sharing for Large Language Model Inference |
Yihao Zhao et.al. |
2504.15720 |
null |
2025-04-22 |
Implementing Rational Choice Functions with LLMs and Measuring their Alignment with User Preferences |
Anna Karnysheva et.al. |
2504.15719 |
null |
2025-04-22 |
DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models |
Jie Zhu et.al. |
2504.15716 |
null |
2025-04-22 |
Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation |
Ning Wang et.al. |
2504.15699 |
null |
2025-04-22 |
DINOv2-powered Few-Shot Semantic Segmentation: A Unified Framework via Cross-Model Distillation and 4D Correlation Mining |
Wei Zhuo et.al. |
2504.15669 |
null |
2025-04-22 |
FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning |
Ju Yeon Kang et.al. |
2504.15663 |
null |
2025-04-22 |
VeriCoder: Enhancing LLM-Based RTL Code Generation through Functional Correctness Validation |
Anjiang Wei et.al. |
2504.15659 |
null |
2025-04-22 |
Cost-Effective Text Clustering with Large Language Models |
Hongtao Wang et.al. |
2504.15640 |
null |
2025-04-22 |
DR.FIX: Automatically Fixing Data Races at Industry Scale |
Farnaz Behrang et.al. |
2504.15637 |
null |
2025-04-22 |
Exploiting Contextual Knowledge in LLMs through V-usable Information based Layer Enhancement |
Xiaowei Yuan et.al. |
2504.15630 |
null |
2025-04-22 |
CiteFix: Enhancing RAG Accuracy Through Post-Processing Citation Correction |
Harsh Maheshwari et.al. |
2504.15629 |
null |
2025-04-22 |
ZeroSlide: Is Zero-Shot Classification Adequate for Lifelong Learning in Whole-Slide Image Analysis in the Era of Pathology Vision-Language Foundation Models? |
Doanh C. Bui et.al. |
2504.15627 |
null |
2025-04-22 |
FaceInsight: A Multimodal Large Language Model for Face Perception |
Jingzhi Li et.al. |
2504.15624 |
null |
2025-04-22 |
Exploring the Role of Large Language Models in Cybersecurity: A Systematic Survey |
Shuang Tian et.al. |
2504.15622 |
null |
2025-04-22 |
AdaViP: Aligning Multi-modal LLMs via Adaptive Vision-enhanced Preference Optimization |
Jinda Lu et.al. |
2504.15619 |
null |
2025-04-23 |
A LoRA-Based Approach to Fine-Tuning LLMs for Educational Guidance in Resource-Constrained Settings |
Md Millat Hosen et.al. |
2504.15610 |
link |
2025-04-22 |
Research on Navigation Methods Based on LLMs |
Anlong Zhang et.al. |
2504.15600 |
null |
2025-04-22 |
MetaMolGen: A Neural Graph Motif Generation Model for De Novo Molecular Design |
Zimo Yan et.al. |
2504.15587 |
null |
2025-04-22 |
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment |
Kun Wang et.al. |
2504.15585 |
null |
2025-04-22 |
A Large-scale Class-level Benchmark Dataset for Code Generation with LLMs |
Musfiqur Rahman et.al. |
2504.15564 |
null |
2025-04-22 |
A Multi-Agent Framework for Automated Qinqiang Opera Script Generation Using Large Language Models |
Gengxian Cao et.al. |
2504.15552 |
null |
2025-04-22 |
Do It For Me vs. Do It With Me: Investigating User Perceptions of Different Paradigms of Automation in Copilots for Feature-Rich Software |
Anjali Khurana et.al. |
2504.15549 |
null |
2025-04-22 |
LLM-based Semantic Augmentation for Harmful Content Detection |
Elyas Meguellati et.al. |
2504.15548 |
null |
2025-04-22 |
A Framework for Testing and Adapting REST APIs as LLM Tools |
Jayachandu Bandlamudi et.al. |
2504.15546 |
null |
2025-04-22 |
IPBench: Benchmarking the Knowledge of Large Language Models in Intellectual Property |
Qiyao Wang et.al. |
2504.15524 |
null |
2025-04-22 |
The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks |
Minghao Wu et.al. |
2504.15521 |
null |
2025-04-23 |
Transport f divergences |
Wuchen Li et.al. |
2504.15515 |
null |
2025-04-22 |
SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation |
Keqi Deng et.al. |
2504.15509 |
null |
2025-04-21 |
Application of Deep Generative Models for Anomaly Detection in Complex Financial Transactions |
Tengda Tang et.al. |
2504.15491 |
null |
2025-04-21 |
Unifying Image Counterfactuals and Feature Attributions with Latent-Space Adversarial Attacks |
Jeremy Goldwasser et.al. |
2504.15479 |
null |
2025-04-21 |
In-context Ranking Preference Optimization |
Junda Wu et.al. |
2504.15477 |
null |
2025-04-21 |
From Reviews to Dialogues: Active Synthesis for Zero-Shot LLM-based Conversational Recommender System |
Rohan Surana et.al. |
2504.15476 |
null |
2025-04-21 |
Speculative Sampling via Exponential Races |
Szymon Kobus et.al. |
2504.15475 |
null |
2025-04-21 |
Agent for User: Testing Multi-User Interactive Features in TikTok |
Sidong Feng et.al. |
2504.15474 |
null |
2025-04-21 |
Emergence and Evolution of Interpretable Concepts in Diffusion Models |
Berk Tinaz et.al. |
2504.15473 |
null |
2025-04-21 |
LAPP: Large Language Model Feedback for Preference-Driven Reinforcement Learning |
Pingcheng Jian et.al. |
2504.15472 |
null |
2025-04-21 |
Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images |
Jonathan Brokman et.al. |
2504.15470 |
null |
2025-04-21 |
Improving Human-AI Coordination through Adversarial Training and Generative Models |
Paresh Chaudhary et.al. |
2504.15457 |
null |
2025-04-21 |
Prize-Collecting Forest with Submodular Penalties: Improved Approximation |
Ali Ahmadi et.al. |
2504.15445 |
null |
2025-04-21 |
Demand for LLMs: Descriptive Evidence on Substitution, Market Expansion, and Multihoming |
Andrey Fradkin et.al. |
2504.15440 |
null |
2025-04-21 |
Combating Toxic Language: A Review of LLM-Based Strategies for Software Engineering |
Hao Zhuo et.al. |
2504.15439 |
null |
2025-04-21 |
TVR: Automotive System Requirement Traceability Validation and Recovery Through Retrieval-Augmented Generation |
Feifei Niu et.al. |
2504.15427 |
null |
2025-04-21 |
LLM-Assisted Translation of Legacy FORTRAN Codes to C++: A Cross-Platform Study |
Nishath Rajiv Ranasinghe et.al. |
2504.15424 |
null |
2025-04-21 |
IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs |
David Ma et.al. |
2504.15415 |
link |
2025-04-21 |
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World |
Ankit Dhiman et.al. |
2504.15397 |
null |
2025-04-21 |
Tell Me What You Know About Sexism: Expert-LLM Interaction Strategies and Co-Created Definitions for Zero-Shot Sexism Detection |
Myrthe Reuver et.al. |
2504.15392 |
null |
2025-04-21 |
Solving New Tasks by Adapting Internet Video Knowledge |
Calvin Luo et.al. |
2504.15369 |
null |
2025-04-21 |
Measuring Interest Group Positions on Legislation: An AI-Driven Analysis of Lobbying Reports |
Jiseon Kim et.al. |
2504.15333 |
null |
2025-04-21 |
Med-CoDE: Medical Critique based Disagreement Evaluation Framework |
Mohit Gupta et.al. |
2504.15330 |
null |
2025-04-21 |
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs |
Chun-Hsiao Yeh et.al. |
2504.15280 |
link |
2025-04-21 |
VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models |
Weiye Xu et.al. |
2504.15279 |
null |
2025-04-21 |
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning |
Jie Cheng et.al. |
2504.15275 |
link |
2025-04-21 |
Interpretable Locomotion Prediction in Construction Using a Memory-Driven LLM Agent With Chain-of-Thought Reasoning |
Ehsan Ahmadi et.al. |
2504.15263 |
null |
2025-04-21 |
CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation |
Anirudh Khatry et.al. |
2504.15254 |
link |
2025-04-21 |
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators |
Yilun Zhou et.al. |
2504.15253 |
link |
2025-04-21 |
MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning |
Yahan Yang et.al. |
2504.15241 |
null |
2025-04-21 |
A Self-Improving Coding Agent |
Maxime Robeyns et.al. |
2504.15228 |
null |
2025-04-21 |
EvalAgent: Discovering Implicit Evaluation Criteria from the Web |
Manya Wadhwa et.al. |
2504.15219 |
null |
2025-04-21 |
DRAGON: Distributional Rewards Optimize Diffusion Generative Models |
Yatong Bai et.al. |
2504.15217 |
null |
2025-04-21 |
Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs |
Marina Sakharova et.al. |
2504.15210 |
null |
2025-04-21 |
Compute-Optimal LLMs Provably Generalize Better With Scale |
Marc Finzi et.al. |
2504.15208 |
null |
2025-04-21 |
Support Evaluation for the TREC 2024 RAG Track: Comparing Human versus LLM Judges |
Nandan Thakur et.al. |
2504.15205 |
null |
2025-04-22 |
Synergistic Weak-Strong Collaboration by Aligning Preferences |
Yizhu Jiao et.al. |
2504.15188 |
null |
2025-04-21 |
Tiger200K: Manually Curated High Visual Quality Video Dataset from UGC Platform |
Xianpan Zhou et.al. |
2504.15182 |
null |
2025-04-21 |
DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution |
Miaomiao Cai et.al. |
2504.15176 |
null |
2025-04-21 |
The Synthetic Imputation Approach: Generating Optimal Synthetic Texts For Underrepresented Categories In Supervised Classification Tasks |
Joan C. Timoneda et.al. |
2504.15160 |
null |
2025-04-21 |
KGMEL: Knowledge Graph-Enhanced Multimodal Entity Linking |
Juyeon Kim et.al. |
2504.15135 |
link |
2025-04-21 |
EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models |
Ziwen Xu et.al. |
2504.15133 |
link |
2025-04-21 |
Kuwain 1.5B: An Arabic SLM via Language Injection |
Khalil Hennara et.al. |
2504.15120 |
null |
2025-04-21 |
Rethinking the Potential of Multimodality in Collaborative Problem Solving Diagnosis with Large Language Models |
K. Wong et.al. |
2504.15093 |
null |
2025-04-21 |
Safety Co-Option and Compromised National Security: The Self-Fulfilling Prophecy of Weakened AI Risk Thresholds |
Heidy Khlaaf et.al. |
2504.15088 |
null |
2025-04-21 |
Empowering AI to Generate Better AI Code: Guided Generation of Deep Learning Projects with LLMs |
Chen Xie et.al. |
2504.15080 |
null |
2025-04-21 |
Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL |
Simone Papicchio et.al. |
2504.15077 |
null |
2025-04-21 |
The Great Nugget Recall: Automating Fact Extraction and RAG Evaluation with Large Language Models |
Ronak Pradeep et.al. |
2504.15068 |
null |
2025-04-21 |
Testing LLMs’ Capabilities in Annotating Translations Based on an Error Typology Designed for LSP Translation: First Experiments with ChatGPT |
Joachim Minder et.al. |
2504.15052 |
null |
2025-04-21 |
ScanEdit: Hierarchically-Guided Functional 3D Scan Editing |
Mohamed el amine Boudjoghra et.al. |
2504.15049 |
null |
2025-04-21 |
RainbowPlus: Enhancing Adversarial Prompt Generation via Evolutionary Quality-Diversity Search |
Quy-Anh Dang et.al. |
2504.15047 |
link |
2025-04-21 |
A Call for New Recipes to Enhance Spatial Reasoning in MLLMs |
Huanyu Zhang et.al. |
2504.15037 |
null |
2025-04-21 |
SOLIDO: A Robust Watermarking Method for Speech Synthesis via Low-Rank Adaptation |
Yue Li et.al. |
2504.15035 |
null |
2025-04-21 |
DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation |
Weijie He et.al. |
2504.15032 |
null |
2025-04-21 |
DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models |
Chengyu Wang et.al. |
2504.15027 |
null |
2025-04-21 |
Stay Hungry, Stay Foolish: On the Extended Reading Articles Generation with LLMs |
Yow-Fu Liou et.al. |
2504.15013 |
null |
2025-04-21 |
Efficient Pretraining Length Scaling |
Bohong Wu et.al. |
2504.14992 |
null |
2025-04-21 |
aiXamine: LLM Safety and Security Simplified |
Fatih Deniz et.al. |
2504.14985 |
null |
2025-04-21 |
RealisDance-DiT: Simple yet Strong Baseline towards Controllable Character Animation in the Wild |
Jingkai Zhou et.al. |
2504.14977 |
null |
2025-04-21 |
Evaluating LLMs on Chinese Topic Constructions: A Research Proposal Inspired by Tian et al. (2024) |
Xiaodong Yang et.al. |
2504.14969 |
null |
2025-04-21 |
SLO-Aware Scheduling for Large Language Model Inferences |
Jinqi Huang et.al. |
2504.14966 |
null |
2025-04-21 |
Evaluating Code Generation of LLMs in Advanced Computer Science Problems |
Emir Catir et.al. |
2504.14964 |
null |
2025-04-21 |
Efficient Document Retrieval with G-Retriever |
Manthankumar Solanki et.al. |
2504.14955 |
link |
2025-04-21 |
Generative Semantic Communications: Principles and Practices |
Xiaojun Yuan et.al. |
2504.14947 |
null |
2025-04-22 |
WindVE: Collaborative CPU-NPU Vector Embedding |
Jinqi Huang et.al. |
2504.14941 |
null |
2025-04-21 |
TWIG: Two-Step Image Generation using Segmentation Masks in Diffusion Models |
Mazharul Islam Rakib et.al. |
2504.14933 |
null |
2025-04-21 |
EducationQ: Evaluating LLMs’ Teaching Capabilities Through Multi-Agent Dialogue Framework |
Yao Shi et.al. |
2504.14928 |
null |
2025-04-21 |
POLYRAG: Integrating Polyviews into Retrieval-Augmented Generation for Medical Applications |
Chunjing Gan et.al. |
2504.14917 |
null |
2025-04-21 |
StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models |
Yeona Hong et.al. |
2504.14915 |
null |
2025-04-21 |
CRAVE: A Conflicting Reasoning Approach for Explainable Claim Verification Using LLMs |
Yingming Zheng et.al. |
2504.14905 |
link |
2025-04-21 |
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation |
Chenjie Cao et.al. |
2504.14899 |
link |
2025-04-21 |
Expected Free Energy-based Planning as Variational Inference |
Bert de Vries et.al. |
2504.14898 |
null |
2025-04-21 |
Hardware-based Heterogeneous Memory Management for Large Language Model Inference |
Soojin Hwang et.al. |
2504.14893 |
null |
2025-04-21 |
Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey |
Aoran Gan et.al. |
2504.14891 |
null |
2025-04-21 |
Latent Bayesian Optimization via Autoregressive Normalizing Flows |
Seunghun Lee et.al. |
2504.14889 |
null |
2025-04-21 |
Efficient Function Orchestration for Large Language Models |
Xiaoxia Liu et.al. |
2504.14872 |
null |
2025-04-21 |
Natural Fingerprints of Large Language Models |
Teppei Suzuki et.al. |
2504.14871 |
null |
2025-04-21 |
OTC: Optimal Tool Calls via Reinforcement Learning |
Hongru Wang et.al. |
2504.14870 |
null |
2025-04-21 |
Transparentize the Internal and External Knowledge Utilization in LLMs with Trustworthy Citation |
Jiajun Shen et.al. |
2504.14856 |
null |
2025-04-21 |
Uncertainty quantification of neural network models of evolving processes via Langevin sampling |
Cosmin Safta et.al. |
2504.14854 |
null |
2025-04-21 |
APIRAT: Integrating Multi-source API Knowledge for Enhanced Code Translation with LLMs |
Chaofan Wang et.al. |
2504.14852 |
null |
2025-04-21 |
Language Models for Materials Discovery and Sustainability: Progress, Challenges, and Opportunities |
Zongrui Pei et.al. |
2504.14849 |
null |
2025-04-21 |
Enhancing the Patent Matching Capability of Large Language Models via the Memory Graph |
Qiushi Xiong et.al. |
2504.14845 |
link |
2025-04-21 |
Establishing Reliability Metrics for Reward Models in Large Language Models |
Yizhou Chen et.al. |
2504.14838 |
null |
2025-04-21 |
SQL-Factory: A Multi-Agent Framework for High-Quality and Large-Scale SQL Generation |
Jiahui Li et.al. |
2504.14837 |
null |
2025-04-21 |
Protecting Your Voice: Temporal-aware Robust Watermarking |
Yue Li et.al. |
2504.14832 |
null |
2025-04-21 |
Completing A Systematic Review in Hours instead of Months with Interactive AI Agents |
Rui Qiu et.al. |
2504.14822 |
null |
2025-04-21 |
DONOD: Robust and Generalizable Instruction Fine-Tuning for LLMs via Model-Intrinsic Dataset Pruning |
Jucheng Hu et.al. |
2504.14810 |
null |
2025-04-21 |
On Self-improving Token Embeddings |
Mario M. Kubek et.al. |
2504.14808 |
null |
2025-04-21 |
Automatic Evaluation Metrics for Document-level Translation: Overview, Challenges and Trends |
Jiaxin GUO et.al. |
2504.14804 |
null |
2025-04-21 |
Automated Duplicate Bug Report Detection in Large Open Bug Repositories |
Clare E. Laney et.al. |
2504.14797 |
null |
2025-04-21 |
Enhanced Data-driven Topology Design Methodology with Multi-level Mesh and Correlation-based Mutation for Stress-related Multi-objective Optimization |
Jun Yang et.al. |
2504.14790 |
null |
2025-04-21 |
The 1st EReL@MIR Workshop on Efficient Representation Learning for Multimodal Information Retrieval |
Junchen Fu et.al. |
2504.14788 |
null |
2025-04-21 |
gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling |
Tianyu Guo et.al. |
2504.14775 |
link |
2025-04-20 |
Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions |
Luyang Fang et.al. |
2504.14772 |
null |
2025-04-20 |
The Memorization Problem: Can We Trust LLMs’ Economic Forecasts? |
Alejandro Lopez-Lira et.al. |
2504.14765 |
null |
2025-04-20 |
Steering Semantic Data Processing With DocWrangler |
Shreya Shankar et.al. |
2504.14764 |
null |
2025-04-20 |
SWE-Synth: Synthesizing Verifiable Bug-Fix Data to Enable Large Language Models in Resolving Real-World Bugs |
Minh V. T. Pham et.al. |
2504.14757 |
null |
2025-04-20 |
PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines |
Reya Vir et.al. |
2504.14738 |
null |
2025-04-20 |
Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation |
Tuhina Tripathi et.al. |
2504.14716 |
null |
2025-04-22 |
AI with Emotions: Exploring Emotional Expressions in Large Language Models |
Shin-nosuke Ishikawa et.al. |
2504.14706 |
null |
2025-04-20 |
Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark |
Enxin Song et.al. |
2504.14693 |
link |
2025-04-20 |
FarsEval-PKBETS: A new diverse benchmark for evaluating Persian large language models |
Mehrnoush Shamsfard et.al. |
2504.14690 |
null |
2025-04-20 |
Evaluating Temporal Plasticity in Foundation Time Series Models for Incremental Fine-tuning |
Jia Liu et.al. |
2504.14677 |
null |
2025-04-20 |
Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data |
Wei Zou et.al. |
2504.14669 |
link |
2025-04-20 |
Efficient Federated Split Learning for Large Language Models over Communication Networks |
Kai Zhao et.al. |
2504.14667 |
null |
2025-04-20 |
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens |
Kaihang Pan et.al. |
2504.14666 |
null |
2025-04-20 |
A Case Study Exploring the Current Landscape of Synthetic Medical Record Generation with Commercial LLMs |
Yihan Lin et.al. |
2504.14657 |
null |
2025-04-20 |
LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs |
Yunhui Xia et.al. |
2504.14655 |
null |
2025-04-20 |
A Framework for Benchmarking and Aligning Task-Planning Safety in LLM-Based Embodied Agents |
Yuting Huang et.al. |
2504.14650 |
null |
2025-04-20 |
Relation-R1: Cognitive Chain-of-Thought Guided Reinforcement Learning for Unified Relational Comprehension |
Lin Li et.al. |
2504.14642 |
null |
2025-04-20 |
HLSTester: Efficient Testing of Behavioral Discrepancies with LLMs for High-Level Synthesis |
Kangwei Xu et.al. |
2504.14641 |
null |
2025-04-20 |
Risk Assessment Framework for Code LLMs via Leveraging Internal States |
Yuheng Huang et.al. |
2504.14640 |
null |
2025-04-20 |
Harnessing Generative LLMs for Enhanced Financial Event Entity Extraction Performance |
Soo-joon Choi et.al. |
2504.14633 |
null |
2025-04-20 |
Towards Optimal Circuit Generation: Multi-Agent Collaboration Meets Collective Intelligence |
Haiyan Qin et.al. |
2504.14625 |
link |
2025-04-20 |
A Hierarchical Framework for Measuring Scientific Paper Innovation via Large Language Models |
Hongming Tan et.al. |
2504.14620 |
null |
2025-04-20 |
Translation Analytics for Freelancers: I. Introduction, Data Preparation, Baseline Evaluations |
Yuri Balashov et.al. |
2504.14619 |
null |
2025-04-20 |
UFO2: The Desktop AgentOS |
Chaoyun Zhang et.al. |
2504.14603 |
link |
2025-04-20 |
a1: Steep Test-time Scaling Law via Environment Augmented Generation |
Lingrui Mei et.al. |
2504.14597 |
null |
2025-04-20 |
HealthGenie: Empowering Users with Healthy Dietary Guidance through Knowledge Graph and Large Language Models |
Fan Gao et.al. |
2504.14594 |
null |
2025-04-20 |
Phoenix: A Motion-based Self-Reflection Framework for Fine-grained Robotic Action Correction |
Wenke Xia et.al. |
2504.14588 |
link |
2025-04-20 |
Using street view imagery and deep generative modeling for estimating the health of urban forests |
Akshit Gupta et.al. |
2504.14583 |
null |
2025-04-20 |
Prompt-Hacking: The New p-Hacking? |
Thomas Kosch et.al. |
2504.14571 |
null |
2025-04-18 |
Generative AI Act II: Test Time Scaling Drives Cognition Engineering |
Shijie Xia et.al. |
2504.13828 |
link |
2025-04-18 |
Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models |
Junjie Yang et.al. |
2504.13825 |
null |
2025-04-18 |
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning |
Yang Yue et.al. |
2504.13820 |
link |
2025-04-18 |
Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning |
Yixuan Even Xu et.al. |
2504.13818 |
null |
2025-04-18 |
BadApex: Backdoor Attack Based on Adaptive Optimization Mechanism of Black-box Large Language Models |
Zhengxian Wu et.al. |
2504.13775 |
null |
2025-04-18 |
DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs |
Tamim Al Mahmud et.al. |
2504.13774 |
link |
2025-04-18 |
Detecting Malicious Source Code in PyPI Packages with LLMs: Does RAG Come in Handy? |
Motunrayo Ibiyo et.al. |
2504.13769 |
null |
2025-04-18 |
Scaling sparse feature circuit finding for in-context learning |
Dmitrii Kharlapenko et.al. |
2504.13756 |
null |
2025-04-18 |
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis |
Andrea Rigo et.al. |
2504.13745 |
null |
2025-04-18 |
Controlled Territory and Conflict Tracking (CONTACT): (Geo-)Mapping Occupied Territory from Open Source Intelligence |
Paul K. Mandal et.al. |
2504.13730 |
link |
2025-04-18 |
MLEP: Multi-granularity Local Entropy Patterns for Universal AI-generated Image Detection |
Lin Yuan et.al. |
2504.13726 |
null |
2025-04-18 |
OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation |
Yichen Wu et.al. |
2504.13707 |
null |
2025-04-18 |
Exploring Multimodal Prompt for Visualization Authoring with Large Language Models |
Zhen Wen et.al. |
2504.13700 |
null |
2025-04-17 |
Deep literature reviews: an application of fine-tuned language models to migration research |
Stefano M. Iacus et.al. |
2504.13685 |
null |
2025-04-18 |
Intelligent Interaction Strategies for Context-Aware Cognitive Augmentation |
Xiangrong et.al. |
2504.13684 |
null |
2025-04-18 |
Large Language Models Will Change The Way Children Think About Technology And Impact Every Interaction Paradigm |
Russell Beale et.al. |
2504.13667 |
null |
2025-04-18 |
Do Prompt Patterns Affect Code Quality? A First Empirical Assessment of ChatGPT-Generated Code |
Antonio Della Porta et.al. |
2504.13656 |
null |
2025-04-18 |
Exploring the Potential for Large Language Models to Demonstrate Rational Probabilistic Beliefs |
Gabriel Freedman et.al. |
2504.13644 |
link |
2025-04-18 |
Divergent LLM Adoption and Heterogeneous Convergence Paths in Research Writing |
Cong William Lin et.al. |
2504.13629 |
null |
2025-04-18 |
PV-VLM: A Multimodal Vision-Language Approach Incorporating Sky Images for Intra-Hour Photovoltaic Power Forecasting |
Huapeng Lin et.al. |
2504.13624 |
null |
2025-04-18 |
Compile Scene Graphs with Reinforcement Learning |
Zuyao Chen et.al. |
2504.13617 |
link |
2025-04-18 |
Long-context Non-factoid Question Answering in Indic Languages |
Ritwik Mishra et.al. |
2504.13615 |
link |
2025-04-18 |
Continual Pre-Training is (not) What You Need in Domain Adaption |
Pin-Er Chen et.al. |
2504.13603 |
null |
2025-04-18 |
HAECcity: Open-Vocabulary Scene Understanding of City-Scale Point Clouds with Superpoint Graph Clustering |
Alexander Rusnak et.al. |
2504.13590 |
null |
2025-04-18 |
Towards End-to-End Network Intent Management with Large Language Models |
Lam Dinh et.al. |
2504.13589 |
null |
2025-04-18 |
RAG Without the Lag: Interactive Debugging for Retrieval-Augmented Generation Pipelines |
Quentin Romero Lauro et.al. |
2504.13587 |
null |
2025-04-18 |
Contextualizing Spotify’s Audiobook List Recommendations with Descriptive Shelves |
Gustavo Penha et.al. |
2504.13572 |
null |
2025-04-18 |
DETAM: Defending LLMs Against Jailbreak Attacks via Targeted Attention Modification |
Yu Li et.al. |
2504.13562 |
null |
2025-04-18 |
Zero-Shot Industrial Anomaly Segmentation with Image-Aware Prompt Generation |
SoYoung Park et.al. |
2504.13560 |
link |
2025-04-18 |
Integrating LLMs for Grading and Appeal Resolution in Computer Science Education |
I. Aytutuldu et.al. |
2504.13557 |
null |
2025-04-18 |
MusFlow: Multimodal Music Generation via Conditional Flow Matching |
Jiahao Song et.al. |
2504.13535 |
null |
2025-04-18 |
CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models |
Feiyang Li et.al. |
2504.13534 |
null |
2025-04-18 |
Designing a reliable lateral movement detector using a graph foundation model |
Corentin Larroche et.al. |
2504.13527 |
null |
2025-04-18 |
Large Language Models for Validating Network Protocol Parsers |
Mingwei Zheng et.al. |
2504.13515 |
link |
2025-04-18 |
Prejudge-Before-Think: Enhancing Large Language Models at Test-Time by Process Prejudge Reasoning |
Jianing Wang et.al. |
2504.13500 |
link |
2025-04-18 |
U-Shape Mamba: State Space Model for faster diffusion |
Alex Ergasti et.al. |
2504.13499 |
link |
2025-04-18 |
Early Timestep Zero-Shot Candidate Selection for Instruction-Guided Image Editing |
Joowon Kim et.al. |
2504.13490 |
null |
2025-04-18 |
LLM Sensitivity Evaluation Framework for Clinical Diagnosis |
Chenwei Yan et.al. |
2504.13475 |
null |
2025-04-18 |
Everything You Wanted to Know About LLM-based Vulnerability Detection But Were Afraid to Ask |
Yue Li et.al. |
2504.13474 |
null |
2025-04-18 |
CodeVisionary: An Agent-based Framework for Evaluating Large Language Models in Code Generation |
Xinchen Wang et.al. |
2504.13472 |
null |
2025-04-18 |
From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs |
Jiliang Ni et.al. |
2504.13471 |
null |
2025-04-18 |
Chain-of-Thought Textual Reasoning for Few-shot Temporal Action Localization |
Hongwei Ji et.al. |
2504.13460 |
null |
2025-04-18 |
SatelliteCalculator: A Multi-Task Vision Foundation Model for Quantitative Remote Sensing Inversion |
Zhenyu Yu et.al. |
2504.13442 |
null |
2025-04-18 |
D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Model |
Grace Byun et.al. |
2504.13439 |
null |
2025-04-18 |
Secure Multifaceted-RAG for Enterprise: Hybrid Knowledge Retrieval with Security Filtering |
Grace Byun et.al. |
2504.13425 |
null |
2025-04-18 |
Mono3R: Exploiting Monocular Cues for Geometric 3D Reconstruction |
Wenyu Li et.al. |
2504.13419 |
null |
2025-04-18 |
STAMP Your Content: Proving Dataset Membership via Watermarked Rephrasings |
Saksham Rastogi et.al. |
2504.13416 |
null |
2025-04-18 |
CytoFM: The first cytology foundation model |
Vedrana Ivezić et.al. |
2504.13402 |
null |
2025-04-18 |
Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety |
Shashank Shriram et.al. |
2504.13399 |
link |
2025-04-18 |
POET: Supporting Prompting Creativity and Personalization with Automated Expansion of Text-to-Image Generation |
Evans Xu Han et.al. |
2504.13392 |
null |
2025-04-17 |
SMPL-GPTexture: Dual-View 3D Human Texture Estimation using Text-to-Image Generation Models |
Mingxiao Tu et.al. |
2504.13378 |
null |
2025-04-17 |
On the minimax optimality of Flow Matching through the connection to kernel density estimation |
Lea Kunkel et.al. |
2504.13336 |
null |
2025-04-17 |
Predicting Forced Responses of Probability Distributions via the Fluctuation-Dissipation Theorem and Generative Modeling |
Ludovico T. Giorgini et.al. |
2504.13333 |
null |
2025-04-17 |
Weak Cube R-CNN: Weakly Supervised 3D Detection using only 2D Bounding Boxes |
Andreas Lau Hansen et.al. |
2504.13297 |
null |
2025-04-17 |
LIFT+: Lightweight Fine-Tuning for Long-Tail Learning |
Jiang-Xin Shi et.al. |
2504.13282 |
link |
2025-04-17 |
Using LLMs for Library Migration |
Md Mohayeminul Islam et.al. |
2504.13272 |
null |
2025-04-17 |
Causal-Copilot: An Autonomous Causal Analysis Agent |
Xinyue Wang et.al. |
2504.13263 |
null |
2025-04-17 |
CPG-EVAL: A Multi-Tiered Benchmark for Evaluating the Chinese Pedagogical Grammar Competence of Large Language Models |
Dong Wang et.al. |
2504.13261 |
null |
2025-04-17 |
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs |
Yan Yang et.al. |
2504.13237 |
null |
2025-04-17 |
NNTile: a machine learning framework capable of training extremely large GPT language models on a single node |
Aleksandr Mikhalev et.al. |
2504.13236 |
null |
2025-04-17 |
Auto-FEDUS: Autoregressive Generative Modeling of Doppler Ultrasound Signals from Fetal Electrocardiograms |
Alireza Rafiei et.al. |
2504.13233 |
null |
2025-04-17 |
Aligning Constraint Generation with Design Intent in Parametric CAD |
Evan Casey et.al. |
2504.13178 |
null |
2025-04-17 |
It’s All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization |
Ali Behrouz et.al. |
2504.13173 |
null |
2025-04-17 |
SemCORE: A Semantic-Enhanced Generative Cross-Modal Retrieval Framework with MLLMs |
Haoxuan Li et.al. |
2504.13172 |
null |
2025-04-17 |
Sleep-time Compute: Beyond Inference Scaling at Test-time |
Kevin Lin et.al. |
2504.13171 |
link |
2025-04-17 |
Digital Twin Generation from Visual Data: A Survey |
Andrew Melnik et.al. |
2504.13159 |
link |
2025-04-18 |
Exploring Expert Failures Improves LLM Agent Tuning |
Li-Cheng Lan et.al. |
2504.13145 |
null |
2025-04-18 |
Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo |
João Loula et.al. |
2504.13139 |
null |
2025-04-17 |
Energy-Based Reward Models for Robust Language Model Alignment |
Anamika Lochab et.al. |
2504.13134 |
link |
2025-04-17 |
Science-T2I: Addressing Scientific Illusions in Image Synthesis |
Jialuo Li et.al. |
2504.13129 |
null |
2025-04-17 |
LLMs Meet Finance: Fine-Tuning Foundation Models for the Open FinLLM Leaderboard |
Varun Rao et.al. |
2504.13125 |
null |
2025-04-17 |
Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training |
Xinsong Zhang et.al. |
2504.13123 |
null |
2025-04-17 |
VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models |
Haojian Huang et.al. |
2504.13122 |
link |
2025-04-17 |
Uncertainty-Aware Trajectory Prediction via Rule-Regularized Heteroscedastic Deep Classification |
Kumar Manas et.al. |
2504.13111 |
null |
2025-04-17 |
UniEdit-Flow: Unleashing Inversion and Editing in the Era of Flow Models |
Guanlong Jiao et.al. |
2504.13109 |
null |
2025-04-17 |
EventVAD: Training-Free Event-Aware Video Anomaly Detection |
Yihua Shao et.al. |
2504.13092 |
null |
2025-04-17 |
Retrieval-Augmented Generation with Conflicting Evidence |
Han Wang et.al. |
2504.13079 |
link |
2025-04-17 |
An All-Atom Generative Model for Designing Protein Complexes |
Ruizhe Chen et.al. |
2504.13075 |
link |
2025-04-18 |
SkyReels-V2: Infinite-length Film Generative Model |
Guibin Chen et.al. |
2504.13074 |
link |
2025-04-17 |
Early Accessibility: Automating Alt-Text Generation for UI Icons During App Development |
Sabrina Haque et.al. |
2504.13069 |
null |
2025-04-17 |
Accuracy is Not Agreement: Expert-Aligned Evaluation of Crash Narrative Classification Models |
Sudesh Ramesh Bhagat et.al. |
2504.13068 |
null |
2025-04-17 |
ArtistAuditor: Auditing Artist Style Pirate in Text-to-Image Generation Models |
Linkang Du et.al. |
2504.13061 |
link |
2025-04-17 |
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins |
Yao Mu et.al. |
2504.13059 |
null |
2025-04-17 |
Aspect-Based Summarization with Self-Aspect Retrieval Enhanced Generation |
Yichao Feng et.al. |
2504.13054 |
null |
2025-04-17 |
GraphAttack: Exploiting Representational Blindspots in LLM Safety Mechanisms |
Sinan He et.al. |
2504.13052 |
null |
2025-04-17 |
Multi-modal single-cell foundation models via dynamic token adaptation |
Wenmin Zhao et.al. |
2504.13049 |
null |
2025-04-17 |
Design Topological Materials by Reinforcement Fine-Tuned Generative Model |
Haosheng Xu et.al. |
2504.13048 |
null |
2025-04-17 |
How Large Language Models Are Changing MOOC Essay Answers: A Comparison of Pre- and Post-LLM Responses |
Leo Leppänen et.al. |
2504.13038 |
null |
2025-04-18 |
Towards Cardiac MRI Foundation Models: Comprehensive Visual-Tabular Representations for Whole-Heart Assessment and Beyond |
Yundi Zhang et.al. |
2504.13037 |
link |
2025-04-17 |
InstructRAG: Leveraging Retrieval-Augmented Generation on Instruction Graphs for LLM-Based Task Planning |
Zheng Wang et.al. |
2504.13032 |
null |
2025-04-17 |
ChatEXAONEPath: An Expert-level Multimodal Large Language Model for Histopathology Using Whole Slide Images |
Sangwook Kim et.al. |
2504.13023 |
null |
2025-04-17 |
SHA256 at SemEval-2025 Task 4: Selective Amnesia – Constrained Unlearning for Large Language Models via Knowledge Isolation |
Saransh Agrawal et.al. |
2504.12996 |
link |
2025-04-17 |
Chain-of-Thought Prompting for Out-of-Distribution Samples: A Latent-Variable Study |
Yu Wang et.al. |
2504.12991 |
link |
2025-04-17 |
A Virtual Machine for Arbitrary Low-Precision GPGPU Computation in LLM Serving |
Yaoyao Ding et.al. |
2504.12984 |
null |
2025-04-17 |
Accommodate Knowledge Conflicts in Retrieval-augmented LLMs: Towards Reliable Response Generation in the Wild |
Jiatai Wang et.al. |
2504.12982 |
null |
2025-04-17 |
Sparks of Science: Hypothesis Generation Using Structured Paper Data |
Charles O’Neill et.al. |
2504.12976 |
null |
2025-04-17 |
QLLM: Do We Really Need a Mixing Network for Credit Assignment in Multi-Agent Reinforcement Learning? |
Zhouyang Jiang et.al. |
2504.12961 |
null |
2025-04-17 |
Are Retrials All You Need? Enhancing Large Language Model Reasoning Without Verbalized Feedback |
Nearchos Potamitis et.al. |
2504.12951 |
null |
2025-04-18 |
Customizing Emotional Support: How Do Individuals Construct and Interact With LLM-Powered Chatbots |
Xi Zheng et.al. |
2504.12943 |
null |
2025-04-17 |
Explainable AI in Usable Privacy and Security: Challenges and Opportunities |
Vincent Freiberger et.al. |
2504.12931 |
null |
2025-04-17 |
ConExion: Concept Extraction with Large Language Models |
Ebrahim Norouzi et.al. |
2504.12915 |
link |
2025-04-17 |
MAIN: Mutual Alignment Is Necessary for instruction tuning |
Fanyi Yang et.al. |
2504.12913 |
null |
2025-04-17 |
Benchmarking Multi-National Value Alignment for Large Language Models |
Chengyi Ju et.al. |
2504.12911 |
null |
2025-04-17 |
FashionDPO: Fine-tune Fashion Outfit Generation Model using Direct Preference Optimization |
Mingzhe Yu et.al. |
2504.12900 |
link |
2025-04-17 |
Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models |
Zhouhao Sun et.al. |
2504.12898 |
null |
2025-04-18 |
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting |
Guanrou Yang et.al. |
2504.12867 |
null |
2025-04-17 |
DashChat: Interactive Authoring of Industrial Dashboard Design Prototypes through Conversation with LLM-Powered Agents |
S. Shen et.al. |
2504.12865 |
null |
2025-04-17 |
Enhancing Decentralization in Blockchain Decision-Making Through Quadratic Voting and Its Generalization |
Lyudmila Kovalchuk et.al. |
2504.12859 |
null |
2025-04-17 |
3D-PNAS: 3D Industrial Surface Anomaly Synthesis with Perlin Noise |
Yifeng Cheng et.al. |
2504.12856 |
null |
2025-04-17 |
Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval and haystacks |
Amey Hengle et.al. |
2504.12845 |
link |
2025-04-17 |
TwoSquared: 4D Generation from 2D Image Pairs |
Lu Sang et.al. |
2504.12825 |
null |
2025-04-17 |
Assessing LLMs in Art Contexts: Critique Generation and Theory of Mind Evaluation |
Takaya Arita et.al. |
2504.12805 |
null |
2025-04-17 |
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery |
Wei Zhang et.al. |
2504.12795 |
null |
2025-04-17 |
Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration |
Yicheng Pan et.al. |
2504.12773 |
link |
2025-04-17 |
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks |
Hao Xu et.al. |
2504.12764 |
link |
2025-04-17 |
Trajectory Adaptation using Large Language Models |
Anurag Maurya et.al. |
2504.12755 |
null |
2025-04-17 |
Stronger, Steadier & Superior: Geometric Consistency in Depth VFM Forges Domain Generalized Semantic Segmentation |
Siyu Chen et.al. |
2504.12753 |
link |
2025-04-17 |
Pandora: A Code-Driven Large Language Model Agent for Unified Reasoning Across Diverse Structured Knowledge |
Yongrui Chen et.al. |
2504.12734 |
null |
2025-04-17 |
Validating LLM-Generated Relevance Labels for Educational Resource Search |
Ratan J. Sebastian et.al. |
2504.12732 |
null |
2025-04-17 |
SimUSER: Simulating User Behavior with Large Language Models for Recommender System Evaluation |
Nicolas Bougie et.al. |
2504.12722 |
null |
2025-04-17 |
Post-pre-training for Modality Alignment in Vision-Language Foundation Models |
Shin’ya Yamaguchi et.al. |
2504.12717 |
link |
2025-04-17 |
SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding |
Qianqian Sun et.al. |
2504.12704 |
null |
2025-04-17 |
Collaborative Perception Datasets for Autonomous Driving: A Review |
Naibang Wang et.al. |
2504.12696 |
link |
2025-04-17 |
Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations |
Yiyou Sun et.al. |
2504.12691 |
link |
2025-04-17 |
Data-efficient LLM Fine-tuning for Code Generation |
Weijie Lv et.al. |
2504.12687 |
link |
2025-04-17 |
SOPHY: Generating Simulation-Ready Objects with Physical Materials |
Junyi Cao et.al. |
2504.12684 |
null |
2025-04-17 |
GRAIL: Gradient-Based Adaptive Unlearning for Privacy and Copyright in LLMs |
Kun-Woo Kim et.al. |
2504.12681 |
null |
2025-04-17 |
Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning |
Baining Zhao et.al. |
2504.12680 |
link |
2025-04-17 |
Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment |
Xiaotian Zhang et.al. |
2504.12663 |
null |
2025-04-17 |
Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation |
Linda He et.al. |
2504.12637 |
null |
2025-04-17 |
Towards Characterizing Subjectivity of Individuals through Modeling Value Conflicts and Trade-offs |
Younghun Lee et.al. |
2504.12633 |
null |
2025-04-17 |
SAM-Based Building Change Detection with Distribution-Aware Fourier Adaptation and Edge-Constrained Warping |
Yun-Cheng Li et.al. |
2504.12619 |
null |
2025-04-17 |
Code Copycat Conundrum: Demystifying Repetition in LLM-based Code Generation |
Mingwei Liu et.al. |
2504.12608 |
null |
2025-04-17 |
GeoSense: Evaluating Identification and Application of Geometric Principles in Multimodal Reasoning |
Liangyu Xu et.al. |
2504.12597 |
null |
2025-04-17 |
Identifying and Mitigating the Influence of the Prior Distribution in Large Language Models |
Liyi Zhang et.al. |
2504.12585 |
link |
2025-04-17 |
Provable Secure Steganography Based on Adaptive Dynamic Sampling |
Kaiyi Pang et.al. |
2504.12579 |
null |
2025-04-17 |
Prompt-Driven and Training-Free Forgetting Approach and Dataset for Large Language Models |
Zhenyu Yu et.al. |
2504.12574 |
null |
2025-04-17 |
ZeroSumEval: Scaling LLM Evaluation with Inter-Model Competition |
Haidar Khan et.al. |
2504.12562 |
link |
2025-04-17 |
CDF-RAG: Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation |
Elahe Khatibi et.al. |
2504.12560 |
link |
2025-04-17 |
Benchmarking LLM-based Relevance Judgment Methods |
Negar Arabzadeh et.al. |
2504.12558 |
link |
2025-04-17 |
ELAB: Extensive LLM Alignment Benchmark in Persian Language |
Zahra Pourbahman et.al. |
2504.12553 |
null |
2025-04-17 |
Privacy-Preserving Operating Room Workflow Analysis using Digital Twins |
Alejandra Perez et.al. |
2504.12552 |
null |
2025-04-17 |
Knowledge Acquisition on Mass-shooting Events via LLMs for AI-Driven Justice |
Benign John Ihugba et.al. |
2504.12545 |
null |
2025-04-16 |
Memorization vs. Reasoning: Updating LLMs with New Knowledge |
Aochong Oliver Li et.al. |
2504.12523 |
null |
2025-04-16 |
Evaluating the Diversity and Quality of LLM Generated Content |
Alexander Shypula et.al. |
2504.12522 |
null |
2025-04-16 |
Multimodal LLM Augmented Reasoning for Interpretable Visual Perception Analysis |
Shravan Chaudhari et.al. |
2504.12511 |
null |
2025-04-16 |
Towards Conversational AI for Human-Machine Collaborative MLOps |
George Fatouros et.al. |
2504.12477 |
null |
2025-04-16 |
Integrating Structural and Semantic Signals in Text-Attributed Graphs with BiGTex |
Azadeh Beiranvand et.al. |
2504.12474 |
link |
2025-04-16 |
You Don’t Need All Attentions: Distributed Dynamic Fine-Tuning for Foundation Models |
Shiwei Ding et.al. |
2504.12471 |
null |
2025-04-16 |
SLURG: Investigating the Feasibility of Generating Synthetic Online Fallacious Discourse |
Cal Blanco et.al. |
2504.12466 |
null |
2025-04-16 |
PlanGlow: Personalized Study Planning with an Explainable and Controllable LLM-Driven System |
Jiwon Chun et.al. |
2504.12452 |
link |
2025-04-16 |
Position: The Most Expensive Part of an LLM should be its Training Data |
Nikhil Kandpal et.al. |
2504.12427 |
null |
2025-04-16 |
Don’t Just Translate, Agitate: Using Large Language Models as Devil’s Advocates for AI Explanations |
Ashley Suh et.al. |
2504.12424 |
null |
2025-04-16 |
Mitigating LLM Hallucinations with Knowledge Graphs: A Case Study |
Harry Li et.al. |
2504.12422 |
null |
2025-04-16 |
A Human-AI Comparative Analysis of Prompt Sensitivity in LLM-Based Relevance Judgment |
Negar Arabzadeh et.al. |
2504.12408 |
link |
2025-04-16 |
Activated LoRA: Fine-tuned LLMs for Intrinsics |
Kristjan Greenewald et.al. |
2504.12397 |
link |
2025-04-16 |
BitNet b1.58 2B4T Technical Report |
Shuming Ma et.al. |
2504.12285 |
null |
2025-04-16 |
HLS-Eval: A Benchmark and Framework for Evaluating LLMs on High-Level Synthesis Design Tasks |
Stefan Abi-Karam et.al. |
2504.12268 |
link |
2025-04-16 |
VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate |
Zhihang Yuan et.al. |
2504.12259 |
link |
2025-04-16 |
FLIP Reasoning Challenge |
Andreas Plesner et.al. |
2504.12256 |
link |
2025-04-16 |
AnomalyGen: An Automated Semantic Log Sequence Generation Framework with LLM for Anomaly Detection |
Xinyu Li et.al. |
2504.12250 |
null |
2025-04-16 |
MOS: Towards Effective Smart Contract Vulnerability Detection through Mixture-of-Experts Tuning of Large Language Models |
Hang Yuan et.al. |
2504.12234 |
null |
2025-04-16 |
Watermarking Needs Input Repetition Masking |
David Khachaturov et.al. |
2504.12229 |
null |
2025-04-16 |
Coding-Prior Guided Diffusion Network for Video Deblurring |
Yike Liu et.al. |
2504.12222 |
null |
2025-04-16 |
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning |
Siyan Zhao et.al. |
2504.12216 |
null |
2025-04-16 |
What Do Large Language Models Know? Tacit Knowledge as a Potential Causal-Explanatory Structure |
Céline Budding et.al. |
2504.12187 |
null |
2025-04-16 |
SALAD: Improving Robustness and Generalization through Contrastive Learning with Structure-Aware and LLM-Driven Augmented Data |
Suyoung Bae et.al. |
2504.12185 |
null |
2025-04-16 |
Trusting CHATGPT: how minor tweaks in the prompts lead to major differences in sentiment classification |
Jaime E. Cuellar et.al. |
2504.12180 |
null |
2025-04-16 |
Deep Generative Models for Bayesian Inference on High-Rate Sensor Data: Applications in Automotive Radar and Medical Imaging |
Tristan S. W. Stevens et.al. |
2504.12154 |
null |
2025-04-16 |
Multilingual Contextualization of Large Language Models for Document-Level Machine Translation |
Miguel Moura Ramos et.al. |
2504.12140 |
null |
2025-04-16 |
Clarifying Ambiguities: on the Role of Ambiguity Types in Prompting Methods for Clarification Generation |
Anfu Tang et.al. |
2504.12113 |
null |
2025-04-16 |
Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation |
Shizhan Cai et.al. |
2504.12108 |
null |
2025-04-16 |
Gauging Overprecision in LLMs: An Empirical Study |
Adil Bahaj et.al. |
2504.12098 |
null |
2025-04-16 |
Reasoning-Based AI for Startup Evaluation (R.A.I.S.E.): A Memory-Augmented, Multi-Step Decision Framework |
Jack Preuveneers et.al. |
2504.12090 |
null |
2025-04-16 |
Selective Demonstration Retrieval for Improved Implicit Hate Speech Detection |
Yumin Kim et.al. |
2504.12082 |
null |
2025-04-16 |
Subitizing-Inspired Large Language Models for Floorplanning |
Shao-Chien Lu et.al. |
2504.12076 |
null |
2025-04-16 |
Generative Deep Learning Framework for Inverse Design of Fuels |
Kiran K. Yalamanchi et.al. |
2504.12075 |
null |
2025-04-16 |
Optimizing Compound Retrieval Systems |
Harrie Oosterhuis et.al. |
2504.12063 |
null |
2025-04-16 |
Modular-Cam: Modular Dynamic Camera-view Video Generation with LLM |
Zirui Pan et.al. |
2504.12048 |
null |
2025-04-16 |
Instruction-augmented Multimodal Alignment for Image-Text and Element Matching |
Xinli Yue et.al. |
2504.12018 |
null |
2025-04-16 |
Purposefully Induced Psychosis (PIP): Embracing Hallucination as Imagination in Large Language Models |
Kris Pilcher et.al. |
2504.12012 |
null |
2025-04-16 |
Generative Recommendation with Continuous-Token Diffusion |
Haohao Qu et.al. |
2504.12007 |
null |
2025-04-16 |
A Complex-valued SAR Foundation Model Based on Physically Inspired Representation Learning |
Mengyu Wang et.al. |
2504.11999 |
null |
2025-04-16 |
Language Models as Quasi-Crystalline Thought: Structure, Constraint, and Emergence in Generative Systems |
Jose Manuel Guevara-Vela et.al. |
2504.11986 |
null |
2025-04-16 |
SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared Task on Hallucinations and Related Observable Overgeneration Mistakes |
Raúl Vázquez et.al. |
2504.11975 |
null |
2025-04-16 |
LLM-as-a-Judge: Reassessing the Performance of LLMs in Extractive QA |
Xanh Ho et.al. |
2504.11972 |
link |
2025-04-16 |
Mind2Matter: Creating 3D Models from EEG Signals |
Xia Deng et.al. |
2504.11936 |
link |
2025-04-16 |
An LLM-as-a-judge Approach for Scalable Gender-Neutral Translation Evaluation |
Andrea Piergentili et.al. |
2504.11934 |
null |
2025-04-16 |
Rethinking the Generation of High-Quality CoT Data from the Perspective of LLM-Adaptive Question Difficulty Grading |
Qianjin Yu et.al. |
2504.11919 |
null |
2025-04-16 |
AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection |
Yuhao Chao et.al. |
2504.11914 |
null |
2025-04-16 |
Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection |
Kabir Ahuja et.al. |
2504.11900 |
null |
2025-04-16 |
Search is All You Need for Few-shot Anomaly Detection |
Qishan Wang et.al. |
2504.11895 |
link |
2025-04-16 |
Rethinking LLM-Based Recommendations: A Query Generation-Based, Training-Free Approach |
Donghee Han et.al. |
2504.11889 |
null |
2025-04-16 |
Boosting Multi-View Stereo with Depth Foundation Model in the Absence of Real-World Labels |
Jie Zhu et.al. |
2504.11845 |
null |
2025-04-16 |
Evaluating the Goal-Directedness of Large Language Models |
Tom Everitt et.al. |
2504.11844 |
link |
2025-04-16 |
FiSMiness: A Finite State Machine Based Paradigm for Emotional Support Conversations |
Yue Zhao et.al. |
2504.11837 |
null |
2025-04-16 |
Could Thinking Multilingually Empower LLM Reasoning? |
Changjiang Gao et.al. |
2504.11833 |
link |
2025-04-16 |
Déjà Vu: Multilingual LLM Evaluation through the Lens of Machine Translation Evaluation |
Julia Kreutzer et.al. |
2504.11829 |
null |
2025-04-16 |
Towards Forceful Robotic Foundation Models: a Literature Survey |
William Xie et.al. |
2504.11827 |
null |
2025-04-16 |
Real-World Depth Recovery via Structure Uncertainty Modeling and Inaccurate GT Depth Fitting |
Delong Suzhang et.al. |
2504.11820 |
null |
2025-04-16 |
Efficient and Adaptive Simultaneous Speech Translation with Fully Unidirectional Architecture |
Biao Fu et.al. |
2504.11809 |
null |
2025-04-16 |
Résumé abstractif à partir d’une transcription audio |
Ilia Derkach et.al. |
2504.11803 |
null |
2025-04-17 |
Selective Attention Federated Learning: Improving Privacy and Efficiency for Clinical Text Classification |
Yue Li et.al. |
2504.11793 |
null |
2025-04-16 |
Large Language Models for Drug Overdose Prediction from Longitudinal Medical Records |
Md Sultan Al Nahian et.al. |
2504.11792 |
null |
2025-04-16 |
Enhancing Web Agents with Explicit Rollback Mechanisms |
Zhisong Zhang et.al. |
2504.11788 |
null |
2025-04-16 |
The Digital Cybersecurity Expert: How Far Have We Come? |
Dawei Wang et.al. |
2504.11783 |
link |
2025-04-16 |
Bridging the Semantic Gaps: Improving Medical VQA Consistency with LLM-Augmented Question Sets |
Yongpei Ma et.al. |
2504.11777 |
null |
2025-04-16 |
Shared Disk KV Cache Management for Efficient Multi-Instance Inference in RAG-Powered LLMs |
Hyungwoo Lee et.al. |
2504.11765 |
null |
2025-04-16 |
Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures |
Prabhu Vellaisamy et.al. |
2504.11750 |
null |
2025-04-16 |
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation |
Bingjie Gao et.al. |
2504.11739 |
null |
2025-04-16 |
Recent Advance in 3D Object and Scene Generation: A Survey |
Xiang Tang et.al. |
2504.11734 |
null |
2025-04-16 |
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos |
Jilan Xu et.al. |
2504.11732 |
null |
2025-04-16 |
EdgePrompt: A Distributed Key-Value Inference Framework for LLMs in 6G Networks |
Jiahong Ning et.al. |
2504.11729 |
null |
2025-04-16 |
Probing the Unknown: Exploring Student Interactions with Probeable Problems at Scale in Introductory Programming |
Paul Denny et.al. |
2504.11723 |
null |
2025-04-17 |
The Hitchhiker’s Guide to Program Analysis, Part II: Deep Thoughts by LLMs |
Haonan Li et.al. |
2504.11711 |
null |
2025-04-16 |
Learning What NOT to Count |
Adriano D’Alessandro et.al. |
2504.11705 |
null |
2025-04-16 |
A Library of LLM Intrinsics for Retrieval-Augmented Generation |
Marina Danilevsky et.al. |
2504.11704 |
null |
2025-04-16 |
Progent: Programmable Privilege Control for LLM Agents |
Tianneng Shi et.al. |
2504.11703 |
link |
2025-04-16 |
A New Paradigm of User-Centric Wireless Communication Driven by Large Language Models |
Kuiyuan Ding et.al. |
2504.11696 |
null |
2025-04-16 |
Can GPT tell us why these images are synthesized? Empowering Multimodal Large Language Models for Forensics |
Yiran He et.al. |
2504.11686 |
null |
2025-04-16 |
Higher-Order Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions |
Minwoo Kang et.al. |
2504.11673 |
null |
2025-04-16 |
Steering Prosocial AI Agents: Computational Basis of LLM’s Decision Making in Social Simulation |
Ji Ma et.al. |
2504.11671 |
null |
2025-04-15 |
Improving LLM Interpretability and Performance via Guided Embedding Refinement for Sequential Recommendation |
Nanshan Jia et.al. |
2504.11658 |
null |
2025-04-15 |
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float |
Tianyi Zhang et.al. |
2504.11651 |
link |
2025-04-15 |
Making Acoustic Side-Channel Attacks on Noisy Keyboards Viable with LLM-Assisted Spectrograms’ “Typo” Correction |
Seyyed Ali Ayati et.al. |
2504.11622 |
link |
2025-04-15 |
Towards Interpretable Deep Generative Models via Causal Representation Learning |
Gemma E. Moran et.al. |
2504.11609 |
null |
2025-04-15 |
GraphicBench: A Planning Benchmark for Graphic Design with Language Agents |
Dayeon Ki et.al. |
2504.11571 |
null |
2025-04-15 |
Probabilistic causal graphs as categorical data synthesizers: Do they do better than Gaussian Copulas and Conditional Tabular GANs? |
Olha Shaposhnyk et.al. |
2504.11547 |
null |
2025-04-15 |
NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes |
Tianyang Xu et.al. |
2504.11544 |
null |
2025-04-15 |
HypoBench: Towards Systematic and Principled Benchmarking for Hypothesis Generation |
Haokun Liu et.al. |
2504.11524 |
null |
2025-04-15 |
FACT: Foundation Model for Assessing Cancer Tissue Margins with Mass Spectrometry |
Mohammad Farahmand et.al. |
2504.11519 |
link |
2025-04-15 |
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception |
Ziqi Pang et.al. |
2504.11457 |
link |
2025-04-16 |
Elucidating the Design Space of Multimodal Protein Language Models |
Cheng-Yen Hsieh et.al. |
2504.11454 |
null |
2025-04-15 |
TextArena |
Leon Guertler et.al. |
2504.11442 |
link |
2025-04-15 |
Masculine Defaults via Gendered Discourse in Podcasts and Large Language Models |
Maria Teleki et.al. |
2504.11431 |
link |
2025-04-15 |
A Dual-Space Framework for General Knowledge Distillation of Large Language Models |
Xue Zhang et.al. |
2504.11426 |
null |
2025-04-15 |
Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts |
Quanyu Long et.al. |
2504.11420 |
null |
2025-04-15 |
DataDecide: How to Predict Best Pretraining Data with Small Experiments |
Ian Magnusson et.al. |
2504.11393 |
null |
2025-04-15 |
RankAlign: A Ranking View of the Generator-Validator Gap in Large Language Models |
Juan Diego Rodriguez et.al. |
2504.11381 |
link |
2025-04-15 |
Ring Artifacts Correction Based on Global-Local Features Interaction Guidance in the Projection Domain |
Yunze Liu et.al. |
2504.11375 |
null |
2025-04-15 |
Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions |
Wang Bill Zhu et.al. |
2504.11373 |
link |
2025-04-15 |
OpenTuringBench: An Open-Model-based Benchmark and Framework for Machine-Generated Text Detection and Attribution |
Lucio La Cava et.al. |
2504.11369 |
null |
2025-04-15 |
Teaching Large Language Models to Reason through Learning and Forgetting |
Tianwei Ni et.al. |
2504.11364 |
link |
2025-04-15 |
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning |
Haiming Wang et.al. |
2504.11354 |
link |
2025-04-16 |
Seedream 3.0 Technical Report |
Yu Gao et.al. |
2504.11346 |
null |
2025-04-15 |
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce |
Wei Xiong et.al. |
2504.11343 |
link |
2025-04-15 |
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints |
Ruicheng Ao et.al. |
2504.11320 |
link |
2025-04-15 |
Learning to Be A Doctor: Searching for Effective Medical Agent Architectures |
Yangyang Zhuang et.al. |
2504.11301 |
null |
2025-04-16 |
Automated Python Translation |
Joshua Otten et.al. |
2504.11290 |
null |
2025-04-15 |
The Obvious Invisible Threat: LLM-Powered GUI Agents’ Vulnerability to Fine-Print Injections |
Chaoran Chen et.al. |
2504.11281 |
null |
2025-04-15 |
From Misleading Queries to Accurate Answers: A Three-Stage Fine-Tuning Method for LLMs |
Guocong Li et.al. |
2504.11277 |
null |
2025-04-15 |
Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution |
Xinning Chai et.al. |
2504.11271 |
link |
2025-04-15 |
Single-Input Multi-Output Model Merging: Leveraging Foundation Models for Dense Multi-Task Learning |
Juan Garcia Giraldo et.al. |
2504.11268 |
null |
2025-04-15 |
Nondeterministic Polynomial-time Problem Challenge: An Ever-Scaling Reasoning Benchmark for LLMs |
Chang Yang et.al. |
2504.11239 |
link |
2025-04-15 |
Video Summarization with Large Language Models |
Min Jung Lee et.al. |
2504.11199 |
null |
2025-04-15 |
R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning |
Lijun Sheng et.al. |
2504.11195 |
link |
2025-04-15 |
Enhancing multimodal analogical reasoning with Logic Augmented Generation |
Anna Sofia Lippolis et.al. |
2504.11190 |
link |
2025-04-15 |
Benchmarking Next-Generation Reasoning-Focused Large Language Models in Ophthalmology: A Head-to-Head Evaluation on 5,888 Items |
Minjie Zou et.al. |
2504.11186 |
null |
2025-04-15 |
Exploring Backdoor Attack and Defense for LLM-empowered Recommendations |
Liangbo Ning et.al. |
2504.11182 |
null |
2025-04-15 |
TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data |
Benedikt Blumenstiel et.al. |
2504.11172 |
null |
2025-04-15 |
TerraMind: Large-Scale Generative Multimodality for Earth Observation |
Johannes Jakubik et.al. |
2504.11171 |
null |
2025-04-15 |
MuSeD: A Multimodal Spanish Dataset for Sexism Detection in Social Media Videos |
Laura De Grazia et.al. |
2504.11169 |
link |
2025-04-15 |
Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails |
William Hackett et.al. |
2504.11168 |
null |
2025-04-15 |
Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation |
Linus Jern et.al. |
2504.11109 |
link |
2025-04-15 |
Using LLMs as prompt modifier to avoid biases in AI image generators |
René Peinl et.al. |
2504.11104 |
null |
2025-04-15 |
AI-guided Antibiotic Discovery Pipeline from Target Selection to Compound Identification |
Maximilian G. Schuh et.al. |
2504.11091 |
null |
2025-04-15 |
TD-Suite: All Batteries Included Framework for Technical Debt Classification |
Karthik Shivashankar et.al. |
2504.11085 |
link |
2025-04-15 |
QAMA: Quantum annealing multi-head attention operator with classical deep learning framework |
Peng Du et.al. |
2504.11083 |
null |
2025-04-15 |
DPS: Design Pattern Summarisation Using Code Features |
Najam Nazar et.al. |
2504.11081 |
link |
2025-04-15 |
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models |
Andrea Tirinzoni et.al. |
2504.11054 |
link |
2025-04-15 |
Leveraging LLMs and attention-mechanism for automatic annotation of historical maps |
Yunshuang Yuan et.al. |
2504.11050 |
null |
2025-04-15 |
LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews |
Sukannya Purkayastha et.al. |
2504.11042 |
link |
2025-04-15 |
Defending Against Frequency-Based Attacks with Diffusion Models |
Fatemeh Amerehi et.al. |
2504.11034 |
null |
2025-04-16 |
GATE3D: Generalized Attention-based Task-synergized Estimation in 3D |
Eunsoo Im et.al. |
2504.11014 |
null |
2025-04-15 |
MMC: Iterative Refinement of VLM Reasoning via MCTS-based Multimodal Critique |
Shuhang Liu et.al. |
2504.11009 |
null |
2025-04-15 |
Dynamic Compressing Prompts for Efficient Inference of Large Language Models |
Jinwu Hu et.al. |
2504.11004 |
null |
2025-04-15 |
Dopamine Audiobook: A Training-free MLLM Agent for Emotional and Human-like Audiobook Generation |
Yan Rong et.al. |
2504.11002 |
null |
2025-04-15 |
ReZero: Enhancing LLM search ability by trying one-more-time |
Alan Dao et.al. |
2504.11001 |
null |
2025-04-16 |
Exploring the Role of Knowledge Graph-Based RAG in Japanese Medical Question Answering with Small-Scale LLMs |
Yingjian Chen et.al. |
2504.10982 |
null |
2025-04-15 |
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers |
Hongkang Li et.al. |
2504.10957 |
null |
2025-04-15 |
Unveiling Challenges for LLMs in Enterprise Data Engineering |
Jan-Micha Bodensohn et.al. |
2504.10950 |
link |
2025-04-15 |
Can LLMs Leverage Observational Data? Towards Data-Driven Causal Discovery with LLMs |
Yuni Susanti et.al. |
2504.10936 |
null |
2025-04-15 |
Transfer Learning for Temporal Link Prediction |
Ayan Chatterjee et.al. |
2504.10925 |
link |
2025-04-15 |
MSCRS: Multi-modal Semantic Graph Prompt Learning Framework for Conversational Recommender Systems |
Yibiao Wei et.al. |
2504.10921 |
link |
2025-04-15 |
Adaptive Human-Agent Teaming: A Review of Empirical Studies from the Process Dynamics Perspective |
Mengyao Wang et.al. |
2504.10918 |
null |
2025-04-15 |
Towards A Universal Graph Structural Encoder |
Jialin Chen et.al. |
2504.10917 |
null |
2025-04-15 |
Understanding LLMs’ Cross-Lingual Context Retrieval: How Good It Is And Where It Comes From |
Changjiang Gao et.al. |
2504.10906 |
null |
2025-04-15 |
Bridging Distribution Gaps in Time Series Foundation Model Pretraining with Prototype-Guided Normalization |
Peiliang Gong et.al. |
2504.10900 |
null |
2025-04-15 |
ARise: Towards Knowledge-Augmented Reasoning via Risk-Adaptive Search |
Yize Zhang et.al. |
2504.10893 |
null |
2025-04-15 |
Exploring Persona-dependent LLM Alignment for the Moral Machine Experiment |
Jiseon Kim et.al. |
2504.10886 |
null |
2025-04-15 |
Large Language Model-Informed Feature Discovery Improves Prediction and Interpretation of Credibility Perceptions of Visual Content |
Yilang Peng et.al. |
2504.10878 |
null |
2025-04-15 |
LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation |
Hanning Chen et.al. |
2504.10854 |
null |
2025-04-15 |
Enhancing Features in Long-tailed Data Using Large Vision Model |
Pengxiao Han et.al. |
2504.10852 |
null |
2025-04-15 |
How to Enhance Downstream Adversarial Robustness (almost) without Touching the Pre-Trained Foundation Model? |
Meiqi Liu et.al. |
2504.10850 |
null |
2025-04-15 |
Moving Beyond Next-Token Prediction: Transformers are Context-Sensitive Language Generators |
Phill Kyu Rhee et.al. |
2504.10845 |
null |
2025-04-15 |
LayoutCoT: Unleashing the Deep Reasoning Potential of Large Language Models for Layout Generation |
Hengyu Shi et.al. |
2504.10829 |
null |
2025-04-15 |
CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives |
Ayoung Lee et.al. |
2504.10823 |
null |
2025-04-15 |
IlluSign: Illustrating Sign Language Videos by Leveraging the Attention Mechanism |
Janna Bruner et.al. |
2504.10822 |
null |
2025-04-15 |
CSPLADE: Learned Sparse Retrieval with Causal Language Models |
Zhichao Xu et.al. |
2504.10816 |
null |
2025-04-15 |
Tabular foundation model to detect empathy from visual cues |
Md Rakibul Hasan et.al. |
2504.10808 |
null |
2025-04-15 |
Can Large Language Models Trade? Testing Financial Theories with LLM Agents in Market Simulations |
Alejandro Lopez-Lira et.al. |
2504.10789 |
null |
2025-04-15 |
The Art of Audience Engagement: LLM-Based Thin-Slicing of Scientific Talks |
Ralf Schmälzle et.al. |
2504.10768 |
null |
2025-04-14 |
How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients |
Ming Li et.al. |
2504.10766 |
link |
2025-04-14 |
CleanMAP: Distilling Multimodal LLMs for Confidence-Driven Crowdsourced HD Map Updates |
Ankit Kumar Shaw et.al. |
2504.10738 |
null |
2025-04-14 |
Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization |
Darryl Hannan et.al. |
2504.10727 |
null |
2025-04-14 |
HELIOS: Adaptive Model And Early-Exit Selection for Efficient LLM Inference Serving |
Avinash Kumar et.al. |
2504.10724 |
null |
2025-04-14 |
Can LLMs Classify CVEs? Investigating LLMs Capabilities in Computing CVSS Vectors |
Francesco Marchiori et.al. |
2504.10713 |
link |
2025-04-14 |
Distinct hydrologic response patterns and trends worldwide revealed by physics-embedded learning |
Haoyu Ji et.al. |
2504.10707 |
null |
2025-04-14 |
Optimizing Data Distribution and Kernel Performance for Efficient Training of Chemistry Foundation Models: A Case Study with MACE |
Jesun Firoz et.al. |
2504.10700 |
null |
2025-04-14 |
The Jailbreak Tax: How Useful are Your Jailbreak Outputs? |
Kristina Nikolić et.al. |
2504.10694 |
link |
2025-04-14 |
Load Balancing with Network Latencies via Distributed Gradient Descent |
Santiago R. Balseiro et.al. |
2504.10693 |
null |
2025-04-14 |
Introducing Large Language Models as the Next Challenging Internet Traffic Source |
Nataliia Koneva et.al. |
2504.10688 |
link |
2025-04-14 |
EMAFusion: A Self-Optimizing System for Seamless LLM Selection and Integration |
Soham Shah et.al. |
2504.10681 |
null |
2025-04-14 |
Relation-Rich Visual Document Generator for Visual Information Extraction |
Zi-Han Jiang et.al. |
2504.10659 |
link |
2025-04-14 |
MatterTune: An Integrated, User-Friendly Platform for Fine-Tuning Atomistic Foundation Models to Accelerate Materials Simulation and Discovery |
Lingyu Kong et.al. |
2504.10655 |
null |
2025-04-14 |
Un marco conceptual para la generación de requerimientos de software de calidad |
Mauro José Pacchiotti et.al. |
2504.10654 |
null |
2025-04-14 |
Weight-of-Thought Reasoning: Exploring Neural Network Weights for Enhanced LLM Reasoning |
Saif Punjwani et.al. |
2504.10646 |
link |
2025-04-14 |
Who is More Bayesian: Humans or ChatGPT? |
Tianshi Mu et.al. |
2504.10636 |
null |
2025-04-14 |
Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models |
Thilo Hagendorff et.al. |
2504.10615 |
null |
2025-04-14 |
Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling |
Michal Balcerak et.al. |
2504.10612 |
null |
2025-04-15 |
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models |
Jinguo Zhu et.al. |
2504.10479 |
link |
2025-04-14 |
MIEB: Massive Image Embedding Benchmark |
Chenghao Xiao et.al. |
2504.10471 |
link |
2025-04-14 |
Art3D: Training-Free 3D Generation from Flat-Colored Illustration |
Xiaoyan Cong et.al. |
2504.10466 |
null |
2025-04-14 |
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding |
Tao Zhang et.al. |
2504.10465 |
link |
2025-04-14 |
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer |
Weixian Lei et.al. |
2504.10462 |
link |
2025-04-15 |
GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents |
Xiaobo Xia et.al. |
2504.10458 |
null |
2025-04-14 |
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models |
Junxiong Wang et.al. |
2504.10449 |
link |
2025-04-14 |
Multimodal Long Video Modeling Based on Temporal Dynamic Context |
Haoran Hao et.al. |
2504.10443 |
link |
2025-04-14 |
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing |
Taihang Hu et.al. |
2504.10434 |
link |
2025-04-14 |
LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language Models |
Minqian Liu et.al. |
2504.10430 |
null |
2025-04-14 |
Foundation models for electronic health records: representation dynamics and transferability |
Michael C. Burkhart et.al. |
2504.10422 |
link |
2025-04-14 |
Can We Edit LLMs for Long-Tail Biomedical Knowledge? |
Xinhao Yi et.al. |
2504.10421 |
link |
2025-04-14 |
Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA |
Michał Turski et.al. |
2504.10419 |
link |
2025-04-14 |
CliniChat: A Multi-Source Knowledge-Driven Framework for Clinical Interview Dialogue Reconstruction and Evaluation |
Jing Chen et.al. |
2504.10418 |
null |
2025-04-14 |
LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models |
Parshin Shojaee et.al. |
2504.10415 |
link |
2025-04-14 |
Performance of Large Language Models in Supporting Medical Diagnosis and Treatment |
Diogo Sousa et.al. |
2504.10405 |
null |
2025-04-14 |
Satellite Federated Fine-Tuning for Foundation Models in Space Computing Power Networks |
Yan Zhu et.al. |
2504.10403 |
null |
2025-04-14 |
Can LLMs Assist Expert Elicitation for Probabilistic Causal Modeling? |
Olha Shaposhnyk et.al. |
2504.10397 |
null |
2025-04-14 |
SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning |
Yiting Wang et.al. |
2504.10369 |
null |
2025-04-14 |
Multimodal Representation Learning Techniques for Comprehensive Facial State Analysis |
Kaiwen Zheng et.al. |
2504.10351 |
null |
2025-04-14 |
VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge |
Yueqi Song et.al. |
2504.10342 |
null |
2025-04-14 |
Forecasting from Clinical Textual Time Series: Adaptations of the Encoder and Decoder Language Model Families |
Shahriar Noroozizadeh et.al. |
2504.10340 |
null |
2025-04-14 |
MorphTok: Morphologically Grounded Tokenization for Indian Languages |
Maharaj Brahma et.al. |
2504.10335 |
null |
2025-04-14 |
AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference |
Yangshen Deng et.al. |
2504.10326 |
null |
2025-04-14 |
CROSSAN: Towards Efficient and Effective Adaptation of Multiple Multimodal Foundation Models for Sequential Recommendation |
Junchen Fu et.al. |
2504.10307 |
link |
2025-04-14 |
Characterizing LLM-driven Social Network: The Chirper.ai Case |
Yiming Zhu et.al. |
2504.10286 |
null |
2025-04-14 |
$\alpha$-Flow: A Unified Framework for Continuous-State Discrete Flow Matching Models |
Chaoran Cheng et.al. |
2504.10283 |
null |
2025-04-14 |
Zero-shot Autonomous Microscopy for Scalable and Intelligent Characterization of 2D Materials |
Jingyun Yang et.al. |
2504.10281 |
null |
2025-04-14 |
XY-Cut++: Advanced Layout Ordering via Hierarchical Mask Mechanism on a Novel Benchmark |
Shuai Liu et.al. |
2504.10258 |
link |
2025-04-14 |
GNN-ACLP: Graph Neural Networks based Analog Circuit Link Prediction |
Guanyuan Pan et.al. |
2504.10240 |
null |
2025-04-14 |
A Model Zoo of Vision Transformers |
Damian Falk et.al. |
2504.10231 |
link |
2025-04-14 |
Probing then Editing Response Personality of Large Language Models |
Tianjie Ju et.al. |
2504.10227 |
link |
2025-04-14 |
PRM-BAS: Enhancing Multimodal Reasoning through PRM-guided Beam Annealing Search |
Pengfei Hu et.al. |
2504.10222 |
null |
2025-04-14 |
Can Competition Enhance the Proficiency of Agents Powered by Large Language Models in the Realm of News-driven Time Series Forecasting? |
Yuxuan Zhang et.al. |
2504.10210 |
null |
2025-04-14 |
DioR: Adaptive Cognitive Detection and Contextual Retrieval Optimization for Dynamic Retrieval-Augmented Generation |
Hanghui Guo et.al. |
2504.10198 |
null |
2025-04-14 |
Localized Cultural Knowledge is Conserved and Controllable in Large Language Models |
Veniamin Veselovsky et.al. |
2504.10191 |
null |
2025-04-14 |
Efficient Generative Model Training via Embedded Representation Warmup |
Deyuan Liu et.al. |
2504.10188 |
link |
2025-04-14 |
LLM Unlearning Reveals a Stronger-Than-Expected Coreset Effect in Current Benchmarks |
Soumyadeep Pal et.al. |
2504.10185 |
link |
2025-04-14 |
A New Paradigm in IBR Modeling for Power Flow and Short Circuit Analysis |
Zahid Javid et.al. |
2504.10181 |
null |
2025-04-14 |
The Future of MLLM Prompting is Adaptive: A Comprehensive Experimental Evaluation of Prompt Engineering Methods for Robust Multimodal Performance |
Anwesha Mohanty et.al. |
2504.10179 |
null |
2025-04-14 |
MSCoT: Structured Chain-of-Thought Generation for Multiple Programming Languages |
Naizhu Jin et.al. |
2504.10178 |
link |
2025-04-14 |
HalluSearch at SemEval-2025 Task 3: A Search-Enhanced RAG Pipeline for Hallucination Detection |
Mohamed A. Abdallah et.al. |
2504.10168 |
null |
2025-04-14 |
C-FAITH: A Chinese Fine-Grained Benchmark for Automated Hallucination Evaluation |
Xu Zhang et.al. |
2504.10167 |
null |
2025-04-14 |
Fact-Checking with Contextual Narratives: Leveraging Retrieval-Augmented LLMs for Social Media Analysis |
Arka Ujjal Dey et.al. |
2504.10166 |
null |
2025-04-14 |
MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning |
Zhaopeng Feng et.al. |
2504.10160 |
link |
2025-04-14 |
COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts |
Jiansheng Li et.al. |
2504.10158 |
null |
2025-04-14 |
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users |
Xinnong Zhang et.al. |
2504.10157 |
link |
2025-04-14 |
HistLLM: A Unified Framework for LLM-Based Multimodal Recommendation with User History Encoding and Compression |
Chen Zhang et.al. |
2504.10150 |
null |
2025-04-14 |
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers |
Chunyang Zhang et.al. |
2504.10148 |
null |
2025-04-14 |
Benchmarking Practices in LLM-driven Offensive Security: Testbeds, Metrics, and Experiment Design |
Andreas Happe et.al. |
2504.10112 |
null |
2025-04-14 |
Enhancing LLM-based Recommendation through Semantic-Aligned Collaborative Knowledge |
Zihan Wang et.al. |
2504.10107 |
null |
2025-04-14 |
CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography |
I-Sheng Fang et.al. |
2504.10090 |
null |
2025-04-14 |
RealSafe-R1: Safety-Aligned DeepSeek-R1 without Compromising Reasoning Capability |
Yichi Zhang et.al. |
2504.10081 |
null |
2025-04-15 |
MMKB-RAG: A Multi-Modal Knowledge-Based Retrieval-Augmented Generation Framework |
Zihan Ling et.al. |
2504.10074 |
null |
2025-04-14 |
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model |
Yang Shi et.al. |
2504.10068 |
null |
2025-04-14 |
Hallucination Detection in LLMs via Topological Divergence on Attention Graphs |
Alexandra Bazarova et.al. |
2504.10063 |
null |
2025-04-15 |
Emotional Strain and Frustration in LLM Interactions in Software Engineering |
Cristina Martinez Montes et.al. |
2504.10050 |
null |
2025-04-14 |
CodeRAG: Supportive Code Retrieval on Bigraph for Real-World Code Generation |
Jia Li et.al. |
2504.10046 |
null |
2025-04-14 |
CHARM: Calibrating Reward Models With Chatbot Arena Scores |
Xiao Zhu et.al. |
2504.10045 |
link |
2025-04-14 |
DataMosaic: Explainable and Verifiable Multi-Modal Data Analytics through Extract-Reason-Verify |
Zhengxuan Zhang et.al. |
2504.10036 |
null |
2025-04-14 |
The Mirage of Performance Gains: Why Contrastive Decoding Fails to Address Multimodal Hallucination |
Hao Yin et.al. |
2504.10020 |
null |
2025-04-14 |
Training LLMs on HPC Systems: Best Practices from the OpenGPT-X Project |
Carolin Penke et.al. |
2504.10013 |
null |
2025-04-15 |
GaussVideoDreamer: 3D Scene Generation with Video Diffusion and Inconsistency-Aware Gaussian Splatting |
Junlin Hao et.al. |
2504.10001 |
null |
2025-04-14 |
Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models? |
Yanbo Wang et.al. |
2504.10000 |
null |
2025-04-14 |
Enhancing Multi-task Learning Capability of Medical Generalist Foundation Model via Image-centric Multi-annotation Data |
Xun Zhu et.al. |
2504.09967 |
null |
2025-04-14 |
Privacy Meets Explainability: Managing Confidential Data and Transparency Policies in LLM-Empowered Science |
Yashothara Shanmugarasa et.al. |
2504.09961 |
null |
2025-04-14 |
C-MTCSD: A Chinese Multi-Turn Conversational Stance Detection Dataset |
Fuqiang Niu et.al. |
2504.09958 |
null |
2025-04-14 |
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes |
Huijie Liu et.al. |
2504.09948 |
null |
2025-04-14 |
KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference |
Yuxuan Tian et.al. |
2504.09936 |
null |
2025-04-14 |
Constrained Auto-Regressive Decoding Constrains Generative Retrieval |
Shiguang Wu et.al. |
2504.09935 |
null |
2025-04-14 |
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding |
Zheng Liu et.al. |
2504.09925 |
link |
2025-04-14 |
Guiding Reasoning in Small Language Models with LLM Assistance |
Yujin Kim et.al. |
2504.09923 |
null |
2025-04-14 |
Learning to Erase Private Knowledge from Multi-Documents for Retrieval-Augmented Large Language Models |
Yujing Wang et.al. |
2504.09910 |
null |
2025-04-14 |
Refining Financial Consumer Complaints through Multi-Scale Model Interaction |
Bo-Wei Chen et.al. |
2504.09903 |
null |
2025-04-14 |
TAMP: Token-Adaptive Layerwise Pruning in Multimodal Large Language Models |
Jaewoo Lee et.al. |
2504.09897 |
link |
2025-04-14 |
Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data |
Shuai Zhao et.al. |
2504.09895 |
null |
2025-04-14 |
LangPert: Detecting and Handling Task-level Perturbations for Robust Object Rearrangement |
Xu Yin et.al. |
2504.09893 |
null |
2025-04-14 |
Ember: A Compiler for Efficient Embedding Operations on Decoupled Access-Execute Architectures |
Marco Siracusa et.al. |
2504.09870 |
null |
2025-04-14 |
RadarLLM: Empowering Large Language Models to Understand Human Motion from Millimeter-wave Point Cloud Sequence |
Zengyuan Lai et.al. |
2504.09862 |
null |
2025-04-14 |
EthosGPT: Mapping Human Value Diversity to Advance Sustainable Development Goals (SDGs) |
Luyao Zhang et.al. |
2504.09861 |
link |
2025-04-14 |
SUMART: SUMmARizing Translation from Wordy to Concise Expression |
Naoto Nishida et.al. |
2504.09860 |
null |
2025-04-14 |
Working with Large Language Models to Enhance Messaging Effectiveness for Vaccine Confidence |
Lucinda Gullison et.al. |
2504.09857 |
null |
2025-04-14 |
PestMA: LLM-based Multi-Agent System for Informed Pest Management |
Hongrui Shi et.al. |
2504.09855 |
null |
2025-04-14 |
A Survey of Large Language Model-Powered Spatial Intelligence Across Scales: Advances in Embodied Agents, Smart Cities, and Earth Science |
Jie Feng et.al. |
2504.09848 |
null |
2025-04-14 |
$\mathbb{Z}_N$ generalizations of three-dimensional stabilizer codes |
Chanbeen Lee et.al. |
2504.09847 |
null |
2025-04-14 |
OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training |
Juntao Zhao et.al. |
2504.09844 |
null |
2025-04-14 |
StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models |
Yang Feng et.al. |
2504.09841 |
null |
2025-04-14 |
Score Matching Diffusion Based Feedback Control and Planning of Nonlinear Systems |
Karthik Elamvazhuthi et.al. |
2504.09836 |
null |
2025-04-14 |
RAKG: Document-level Retrieval Augmented Knowledge Graph Construction |
Hairong Zhang et.al. |
2504.09823 |
link |
2025-04-14 |
Transferable text data distillation by trajectory matching |
Rong Yao et.al. |
2504.09818 |
null |
2025-04-14 |
Augmented Relevance Datasets with Fine-Tuned Small LLMs |
Quentin Fitte-Rey et.al. |
2504.09816 |
null |
2025-04-14 |
See or Recall: A Sanity Check for the Role of Vision in Solving Visualization Question Answer Tasks with Multimodal LLMs |
Zhimin Li et.al. |
2504.09809 |
null |
2025-04-14 |
Training Small Reasoning LLMs with Cognitive Preference Alignment |
Wenrui Cai et.al. |
2504.09802 |
null |
2025-04-14 |
ReadMe.LLM: A Framework to Help LLMs Understand Your Library |
Sandya Wijaya et.al. |
2504.09798 |
null |
2025-04-14 |
Reasoning Court: Combining Reasoning, Action, and Judgment for Multi-Hop Reasoning |
Jingtian Wu et.al. |
2504.09781 |
null |
2025-04-14 |
Reasoning without Regret |
Tarun Chitra et.al. |
2504.09777 |
null |
2025-04-14 |
An Investigation of Large Language Models and Their Vulnerabilities in Spam Detection |
Qiyao Tang et.al. |
2504.09776 |
null |
2025-04-14 |
Understanding and Optimizing Multi-Stage AI Inference Pipelines |
Abhimanyu Rajeshkumar Bambhaniya et.al. |
2504.09775 |
null |
2025-04-14 |
Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning |
Can Jin et.al. |
2504.09772 |
link |
2025-04-14 |
Socratic Chart: Cooperating Multiple Agents for Robust SVG Chart Understanding |
Yuyang Ji et.al. |
2504.09764 |
null |
2025-04-11 |
Quantum Large Language Model Fine-Tuning |
Sang Hyub Kim et.al. |
2504.08732 |
null |
2025-04-11 |
DocAgent: A Multi-Agent System for Automated Code Documentation Generation |
Dayu Yang et.al. |
2504.08725 |
link |
2025-04-11 |
SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents |
Muhammad Shihab Rashid et.al. |
2504.08703 |
link |
2025-04-11 |
Large Language Models as Span Annotators |
Zdeněk Kasner et.al. |
2504.08697 |
null |
2025-04-11 |
TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning |
Hang Ni et.al. |
2504.08694 |
null |
2025-04-11 |
Fast-Slow-Thinking: Complex Task Solving with Large Language Models |
Yiliu Sun et.al. |
2504.08690 |
null |
2025-04-11 |
Voice Interaction With Conversational AI Could Facilitate Thoughtful Reflection and Substantive Revision in Writing |
Jiho Kim et.al. |
2504.08687 |
null |
2025-04-11 |
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model |
Team Seawead et.al. |
2504.08685 |
null |
2025-04-11 |
Variability-Driven User-Story Generation using LLM and Triadic Concept Analysis |
Alexandre Bazin et.al. |
2504.08666 |
null |
2025-04-11 |
Safe Flow Matching: Robot Motion Planning with Control Barrier Functions |
Xiaobing Dai et.al. |
2504.08661 |
null |
2025-04-11 |
Quality evaluation of Tabby coding assistant using real source code snippets |
Marta Borek et.al. |
2504.08650 |
link |
2025-04-11 |
Do LLMs trust AI regulation? Emerging behaviour of game-theoretic LLM agents |
Alessio Buscemi et.al. |
2504.08640 |
null |
2025-04-11 |
Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging |
Gabriele Lozupone et.al. |
2504.08635 |
link |
2025-04-11 |
Analyzing 16,193 LLM Papers for Fun and Profits |
Zhiqiu Xia et.al. |
2504.08619 |
null |
2025-04-11 |
ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration |
Yongsheng Yu et.al. |
2504.08591 |
null |
2025-04-11 |
Playpen: An Environment for Exploring Learning Through Conversational Interaction |
Nicola Horst et.al. |
2504.08590 |
link |
2025-04-11 |
COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails |
Miguel Espinosa et.al. |
2504.08548 |
null |
2025-04-11 |
Slicing the Gaussian Mixture Wasserstein Distance |
Moritz Piening et.al. |
2504.08544 |
link |
2025-04-11 |
UoB-NLP at SemEval-2025 Task 11: Leveraging Adapters for Multilingual and Cross-Lingual Emotion Detection |
Frances Laureano De Leon et.al. |
2504.08543 |
null |
2025-04-11 |
Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions |
Tommaso Galliena et.al. |
2504.08531 |
null |
2025-04-11 |
Task Memory Engine (TME): Enhancing State Awareness for Multi-Step LLM Agent Tasks |
Ye Ye et.al. |
2504.08525 |
link |
2025-04-11 |
Adopting Large Language Models to Automated System Integration |
Robin D. Pesl et.al. |
2504.08490 |
null |
2025-04-11 |
TickIt: Leveraging Large Language Models for Automated Ticket Escalation |
Fengrui Liu et.al. |
2504.08475 |
null |
2025-04-11 |
On the Design of Diffusion-based Neural Speech Codecs |
Pietro Foti et.al. |
2504.08470 |
null |
2025-04-11 |
Diffusion Models for Robotic Manipulation: A Survey |
Rosa Wolf et.al. |
2504.08438 |
null |
2025-04-11 |
Customizing Spider Silk: Generative Models with Mechanical Property Conditioning for Protein Engineering |
Neeru Dubey et.al. |
2504.08437 |
null |
2025-04-11 |
A Reproducibility Study of Graph-Based Legal Case Retrieval |
Gregor Donabauer et.al. |
2504.08400 |
null |
2025-04-11 |
Beyond Self-Reports: Multi-Observer Agents for Personality Assessment in Large Language Models |
Yin Jou Huang et.al. |
2504.08399 |
null |
2025-04-11 |
PCA-RAG: Principal Component Analysis for Efficient Retrieval-Augmented Generation |
Arman Khaledian et.al. |
2504.08386 |
null |
2025-04-11 |
Scaling Up On-Device LLMs via Active-Weight Swapping Between DRAM and Flash |
Fucheng Jia et.al. |
2504.08378 |
null |
2025-04-11 |
MedRep: Medical Concept Representation for General Electronic Health Record Foundation Models |
Junmo Kim et.al. |
2504.08329 |
link |
2025-04-11 |
SortBench: Benchmarking LLMs based on their ability to sort lists |
Steffen Herbold et.al. |
2504.08312 |
null |
2025-04-11 |
DSM: Building A Diverse Semantic Map for 3D Visual Grounding |
Qinghongbing Xie et.al. |
2504.08307 |
null |
2025-04-11 |
Large language models could be rote learners |
Yuyang Xu et.al. |
2504.08300 |
null |
2025-04-11 |
ELSA: A Style Aligned Dataset for Emotionally Intelligent Language Generation |
Vishal Gandhi et.al. |
2504.08281 |
null |
2025-04-11 |
To See or Not to See – Fingerprinting Devices in Adversarial Environments Amid Advanced Machine Learning |
Justin Feng et.al. |
2504.08264 |
null |
2025-04-11 |
Evaluating the Bias in LLMs for Surveying Opinion and Decision Making in Healthcare |
Yonchanok Khaokaew et.al. |
2504.08260 |
null |
2025-04-11 |
CoProSketch: Controllable and Progressive Sketch Generation with Diffusion Model |
Ruohao Zhan et.al. |
2504.08259 |
null |
2025-04-11 |
RAG-VR: Leveraging Retrieval-Augmented Generation for 3D Question Answering in VR Environments |
Shiyi Ding et.al. |
2504.08256 |
link |
2025-04-11 |
Understanding the Impact of Data Domain Extraction on Synthetic Data Privacy |
Georgi Ganev et.al. |
2504.08254 |
null |
2025-04-11 |
Jupiter: Fast and Resource-Efficient Collaborative Inference of Generative LLMs on Edge Devices |
Shengyuan Ye et.al. |
2504.08242 |
null |
2025-04-11 |
Optimal Transport-Based Generative Models for Bayesian Posterior Sampling |
Ke Li et.al. |
2504.08214 |
null |
2025-04-11 |
How Good Are Large Language Models for Course Recommendation in MOOCs? |
Boxuan Ma et.al. |
2504.08208 |
null |
2025-04-11 |
DRAFT-ing Architectural Design Decisions using LLMs |
Rudra Dhar et.al. |
2504.08207 |
link |
2025-04-11 |
Harnessing the Unseen: The Hidden Influence of Intrinsic Knowledge in Long-Context Language Models |
Yu Fu et.al. |
2504.08202 |
null |
2025-04-11 |
Neural Encoding and Decoding at Scale |
Yizi Zhang et.al. |
2504.08201 |
null |
2025-04-11 |
A Vulnerability Code Intent Summary Dataset |
Yifan Huang et.al. |
2504.08180 |
null |
2025-04-11 |
SynthFM: Training Modality-agnostic Foundation Models for Medical Image Segmentation without Real Medical Data |
Sourya Sengupta et.al. |
2504.08177 |
null |
2025-04-11 |
GenXSS: an AI-Driven Framework for Automated Detection of XSS Attacks in WAFs |
Vahid Babaey et.al. |
2504.08176 |
null |
2025-04-10 |
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora |
Alex Warstadt et.al. |
2504.08165 |
link |
2025-04-10 |
Information bounds on the accuracy of cell polarization |
Tau-Mu Yi et.al. |
2504.08164 |
null |
2025-04-10 |
Investigating Vision-Language Model for Point Cloud-based Vehicle Classification |
Yiqiao Li et.al. |
2504.08154 |
null |
2025-04-10 |
LoRAX: LoRA eXpandable Networks for Continual Synthetic Image Attribution |
Danielle Sullivan-Pao et.al. |
2504.08149 |
link |
2025-04-10 |
Orchestrating Agents and Data for Enterprise: A Blueprint Architecture for Compound AI |
Eser Kandogan et.al. |
2504.08148 |
null |
2025-04-10 |
Empowering Vector Architectures for ML: The CAMP Architecture for Matrix Multiplication |
Mohammadreza Esmali Nojehdeh et.al. |
2504.08137 |
null |
2025-04-10 |
Gen3DEval: Using vLLMs for Automatic Evaluation of Generated 3D Objects |
Shalini Maiti et.al. |
2504.08125 |
null |
2025-04-10 |
DeepSeek vs. o3-mini: How Well can Reasoning LLMs Evaluate MT and Summarization? |
Daniil Larionov et.al. |
2504.08120 |
null |
2025-04-10 |
Test Amplification for REST APIs via Single and Multi-Agent LLM Systems |
Robbe Nooyens et.al. |
2504.08113 |
null |
2025-04-10 |
Scaling Laws of Graph Neural Networks for Atomistic Materials Modeling |
Chaojian Li et.al. |
2504.08112 |
null |
2025-04-10 |
POEM: Precise Object-level Editing via MLLM control |
Marco Schouten et.al. |
2504.08111 |
null |
2025-04-10 |
Optimal Investment in Equity and Credit Default Swaps in the Presence of Default |
Zhe Fei et.al. |
2504.08085 |
null |
2025-04-10 |
Teaching Humans Subtle Differences with DIFFusion |
Mia Chiquier et.al. |
2504.08046 |
null |
2025-04-10 |
Can Reasoning LLMs Enhance Clinical Document Classification? |
Akram Mustafa et.al. |
2504.08040 |
null |
2025-04-10 |
Emergence of psychopathological computations in large language models |
Soo Yong Lee et.al. |
2504.08016 |
link |
2025-04-10 |
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing |
Zhongyang Li et.al. |
2504.07964 |
link |
2025-04-10 |
PixelFlow: Pixel-Space Generative Models with Flow |
Shoufa Chen et.al. |
2504.07963 |
link |
2025-04-10 |
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation |
Lang Lin et.al. |
2504.07962 |
null |
2025-04-10 |
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning |
Zhong-Yu Li et.al. |
2504.07960 |
null |
2025-04-10 |
Detect Anything 3D in the Wild |
Hanxue Zhang et.al. |
2504.07958 |
null |
2025-04-10 |
MM-IFEngine: Towards Multimodal Instruction Following |
Shengyuan Ding et.al. |
2504.07957 |
link |
2025-04-10 |
VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning |
Yukun Qi et.al. |
2504.07956 |
null |
2025-04-11 |
Pushing the Accuracy Limit of Foundation Neural Network Models with Quantum Monte Carlo Forces and Path Integrals |
Anouar Benali et.al. |
2504.07948 |
null |
2025-04-10 |
We Are All Creators: Generative AI, Collective Knowledge, and the Path Towards Human-AI Synergy |
Jordi Linares-Pellicer et.al. |
2504.07936 |
null |
2025-04-10 |
Porting an LLM based Application from ChatGPT to an On-Premise Environment |
Teemu Paloniemi et.al. |
2504.07907 |
null |
2025-04-10 |
Redefining Machine Translation on Social Network Services with Large Language Models |
Hongcheng Guo et.al. |
2504.07901 |
link |
2025-04-10 |
How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective |
Qi Liu et.al. |
2504.07898 |
link |
2025-04-10 |
Fast Adaptation with Behavioral Foundation Models |
Harshit Sikchi et.al. |
2504.07896 |
null |
2025-04-10 |
DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows |
Mashrur M. Morshed et.al. |
2504.07894 |
null |
2025-04-10 |
Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge |
Riccardo Cantini et.al. |
2504.07887 |
link |
2025-04-10 |
Token Level Routing Inference System for Edge Devices |
Jianshu She et.al. |
2504.07878 |
null |
2025-04-10 |
SAMJAM: Zero-Shot Video Scene Graph Generation for Egocentric Kitchen Videos |
Joshua Li et.al. |
2504.07867 |
null |
2025-04-11 |
Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs |
Yichun Yin et.al. |
2504.07866 |
null |
2025-04-10 |
Robust Hallucination Detection in LLMs via Adaptive Token Selection |
Mengjia Niu et.al. |
2504.07863 |
null |
2025-04-10 |
Horizons, throats and bounces in hybrid metric-Palatini gravity with a non-zero potential |
Gabriel I. Róis et.al. |
2504.07861 |
null |
2025-04-10 |
2D-Curri-DPO: Two-Dimensional Curriculum Learning for Direct Preference Optimization |
Mengyang Li et.al. |
2504.07856 |
null |
2025-04-10 |
The KL3M Data Project: Copyright-Clean Training Resources for Large Language Models |
Michael J Bommarito II et.al. |
2504.07854 |
link |
2025-04-10 |
Understanding Learner-LLM Chatbot Interactions and the Impact of Prompting Guidelines |
Cansu Koyuturk et.al. |
2504.07840 |
null |
2025-04-10 |
Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models |
Hongcheng Guo et.al. |
2504.07807 |
link |
2025-04-10 |
A System for Comprehensive Assessment of RAG Frameworks |
Mattia Rengo et.al. |
2504.07803 |
link |
2025-04-10 |
FairEval: Evaluating Fairness in LLM-Based Recommendations with Personality Awareness |
Chandan Kumar Sah et.al. |
2504.07801 |
null |
2025-04-10 |
Plan-and-Refine: Diverse and Comprehensive Retrieval-Augmented Generation |
Alireza Salemi et.al. |
2504.07794 |
link |
2025-04-10 |
Revisiting Likelihood-Based Out-of-Distribution Detection by Modeling Representations |
Yifan Ding et.al. |
2504.07793 |
link |
2025-04-10 |
Fairness Mediator: Neutralize Stereotype Associations to Mitigate Bias in Large Language Models |
Yisong Xiao et.al. |
2504.07787 |
null |
2025-04-10 |
Exploring a Patch-Wise Approach for Privacy-Preserving Fake ID Detection |
Javier Muñoz-Haro et.al. |
2504.07761 |
null |
2025-04-10 |
Efficient Tuning of Large Language Models for Knowledge-Grounded Dialogue Generation |
Bo Zhang et.al. |
2504.07754 |
link |
2025-04-10 |
SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding |
Yangliu Hu et.al. |
2504.07745 |
null |
2025-04-10 |
Zero-Shot Cross-Domain Code Search without Fine-Tuning |
Keyu Liang et.al. |
2504.07740 |
link |
2025-04-10 |
Automated Construction of a Knowledge Graph of Nuclear Fusion Energy for Effective Elicitation and Retrieval of Information |
A. Loreti et.al. |
2504.07738 |
null |
2025-04-10 |
DeepGreen: Effective LLM-Driven Green-washing Monitoring System Designed for Empirical Testing – Evidence from China |
Congluo Xu et.al. |
2504.07733 |
null |
2025-04-10 |
MRD-RAG: Enhancing Medical Diagnosis with Multi-Round Retrieval-Augmented Generation |
Yixiang Chen et.al. |
2504.07724 |
link |
2025-04-10 |
PR-Attack: Coordinated Prompt-RAG Attacks on Retrieval-Augmented Generation in Large Language Models via Bilevel Optimization |
Yang Jiao et.al. |
2504.07717 |
null |
2025-04-10 |
Proactive User Information Acquisition via Chats on User-Favored Topics |
Shiki Sato et.al. |
2504.07698 |
null |
2025-04-10 |
Conformalized Generative Bayesian Imaging: An Uncertainty Quantification Framework for Computational Imaging |
Canberk Ekmekci et.al. |
2504.07696 |
null |
2025-04-10 |
FMNV: A Dataset of Media-Published News Videos for Fake News Detection |
Yihao Wang et.al. |
2504.07687 |
null |
2025-04-10 |
Synthetic Fluency: Hallucinations, Confabulations, and the Creation of Irish Words in LLM-Generated Translations |
Sheila Castilho et.al. |
2504.07680 |
null |
2025-04-10 |
Data Requirement Goal Modeling for Machine Learning Systems |
Asma Yamani et.al. |
2504.07664 |
null |
2025-04-10 |
Unveiling the Impact of Multimodal Features on Chinese Spelling Correction: From Analysis to Design |
Xiaowu Zhang et.al. |
2504.07661 |
link |
2025-04-10 |
Synthesizing High-Quality Programming Tasks with LLM-based Expert and Student Agents |
Manh Hung Nguyen et.al. |
2504.07655 |
null |
2025-04-10 |
On the Temporal Question-Answering Capabilities of Large Language Models Over Anonymized Data |
Alfredo Garrachón Ruiz et.al. |
2504.07646 |
null |
2025-04-10 |
Enhancing Large Language Models through Neuro-Symbolic Integration and Ontological Reasoning |
Ruslan Idelfonso Magana Vsevolodovna et.al. |
2504.07640 |
link |
2025-04-10 |
Agent That Debugs: Dynamic State-Guided Vulnerability Repair |
Zhengyao Liu et.al. |
2504.07634 |
null |
2025-04-10 |
ConceptFormer: Towards Efficient Use of Knowledge-Graph Embeddings in Large Language Models |
Joel Barmettler et.al. |
2504.07624 |
null |
2025-04-10 |
Beating Transformers using Synthetic Cognition |
Alfredo Ibias et.al. |
2504.07619 |
null |
2025-04-10 |
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model |
Haozhan Shen et.al. |
2504.07615 |
link |
2025-04-11 |
Boosting Universal LLM Reward Design through Heuristic Reward Observation Space Evolution |
Zen Kit Heng et.al. |
2504.07596 |
null |
2025-04-10 |
REANIMATOR: Reanimate Retrieval Test Collections with Extracted and Synthetic Resources |
Björn Engelmann et.al. |
2504.07584 |
link |
2025-04-10 |
Exploring Human-Like Thinking in Search Simulations with Large Language Models |
Erhan Zhang et.al. |
2504.07570 |
link |
2025-04-10 |
Benchmarking Image Embeddings for E-Commerce: Evaluating Off-the Shelf Foundation Models, Fine-Tuning Strategies and Practical Trade-offs |
Urszula Czerwinska et.al. |
2504.07567 |
null |
2025-04-11 |
Using LLMs for Analyzing AIS Data |
Gaspard Merten et.al. |
2504.07557 |
null |
2025-04-10 |
TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs |
Zijian Zhang et.al. |
2504.07556 |
null |
2025-04-10 |
A taxonomy of epistemic injustice in the context of AI and the case for generative hermeneutical erasure |
Warmhold Jan Thomas Mollema et.al. |
2504.07531 |
null |
2025-04-10 |
Automating the Path: An R&D Agenda for Human-Centered AI and Visualization |
Niklas Elmqvist et.al. |
2504.07529 |
null |
2025-04-10 |
Supervised Optimism Correction: Be Confident When LLMs Are Sure |
Junjie Zhang et.al. |
2504.07527 |
null |
2025-04-10 |
Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models |
Yuxiang Lin et.al. |
2504.07521 |
link |
2025-04-10 |
VideoExpert: Augmented LLM for Temporal-Sensitive Video Understanding |
Henghao Zhao et.al. |
2504.07519 |
null |
2025-04-10 |
Enhancements for Developing a Comprehensive AI Fairness Assessment Standard |
Avinash Agarwal et.al. |
2504.07516 |
null |
2025-04-10 |
GPT Carry-On: Training Foundation Model for Customization Could Be Simple, Scalable and Affordable |
Jianqiao Wangni et.al. |
2504.07513 |
null |
2025-04-10 |
Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving |
Shihong Gao et.al. |
2504.07494 |
link |
2025-04-10 |
UniCAIM: A Unified CAM/CIM Architecture with Static-Dynamic KV Cache Pruning for Efficient Long-Context LLM Inference |
Weikai Xu et.al. |
2504.07479 |
null |
2025-04-10 |
Defense against Prompt Injection Attacks via Mixture of Encodings |
Ruiyi Zhang et.al. |
2504.07467 |
link |
2025-04-10 |
Learning Universal Features for Generalizable Image Forgery Localization |
Hengrun Zhao et.al. |
2504.07462 |
link |
2025-04-10 |
Achilles Heel of Distributed Multi-Agent Systems |
Yiting Zhang et.al. |
2504.07461 |
null |
2025-04-10 |
Beyond LLMs: A Linguistic Approach to Causal Graph Generation from Narrative Texts |
Zehan Li et.al. |
2504.07459 |
null |
2025-04-10 |
How Can Objects Help Video-Language Understanding? |
Zitian Tang et.al. |
2504.07454 |
null |
2025-04-10 |
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation |
Juzheng Zhang et.al. |
2504.07448 |
link |
2025-04-10 |
Revisiting LLM Evaluation through Mechanism Interpretability: a New Metric and Model Utility Law |
Yixin Cao et.al. |
2504.07440 |
link |
2025-04-10 |
LLM4Ranking: An Easy-to-use Framework of Utilizing Large Language Models for Document Reranking |
Qi Liu et.al. |
2504.07439 |
link |
2025-04-10 |
From Token to Line: Enhancing Code Generation with a Long-Term Perspective |
Tingwei Lu et.al. |
2504.07433 |
null |
2025-04-10 |
LLM-Enabled Data Transmission in End-to-End Semantic Communication |
Shavbo Salehi et.al. |
2504.07431 |
null |
2025-04-10 |
Task-oriented Age of Information for Remote Inference with Hybrid Language Models |
Shuying Gan et.al. |
2504.07428 |
null |
2025-04-10 |
Conditional Data Synthesis Augmentation |
Xinyu Tian et.al. |
2504.07426 |
null |
2025-04-10 |
Enhancing Player Enjoyment with a Two-Tier DRL and LLM-Based Agent System for Fighting Games |
Shouren Wang et.al. |
2504.07425 |
null |
2025-04-10 |
Routing to the Right Expertise: A Trustworthy Judge for Instruction-based Image Editing |
Chenxi Sun et.al. |
2504.07424 |
null |
2025-04-10 |
RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Radiology with Zero-Shot Multi-Task Capability |
Jonggwon Park et.al. |
2504.07416 |
null |
2025-04-10 |
Leveraging LLMs for Multimodal Retrieval-Augmented Radiology Report Generation via Key Phrase Extraction |
Kyoyun Choi et.al. |
2504.07415 |
null |
2025-04-10 |
AI Coding with Few-Shot Prompting for Thematic Analysis |
Samuel Flanders et.al. |
2504.07408 |
null |
2025-04-10 |
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation |
Linyan Huang et.al. |
2504.07405 |
null |
2025-04-10 |
Automating quantum feature map design via large language models |
Kenya Sakka et.al. |
2504.07396 |
link |
2025-04-10 |
ID-Booth: Identity-consistent Face Generation with Diffusion Models |
Darian Tomašević et.al. |
2504.07392 |
link |
2025-04-10 |
TALE: A Tool-Augmented Framework for Reference-Free Evaluation of Large Language Models |
Sher Badshah et.al. |
2504.07385 |
null |
2025-04-10 |
Model Discrepancy Learning: Synthetic Faces Detection Based on Multi-Reconstruction |
Qingchao Jiang et.al. |
2504.07382 |
link |
2025-04-10 |
Structure-Property Relationship in Disordered Hyperuniform Materials: Microstructure Representation, Field Fluctuations and Effective Properties |
Liyu Zhong et.al. |
2504.07380 |
null |
2025-04-10 |
Towards Distribution Matching between Collaborative and Language Spaces for Generative Recommendation |
Yi Zhang et.al. |
2504.07363 |
link |
2025-04-10 |
Enhancing Time Series Forecasting via Multi-Level Text Alignment with LLMs |
Taibiao Zhao et.al. |
2504.07360 |
link |
2025-04-10 |
Revisiting Prompt Optimization with Large Reasoning Models-A Case Study on Event Extraction |
Saurabh Srivastava et.al. |
2504.07357 |
null |
2025-04-10 |
Throughput-Optimal Scheduling Algorithms for LLM Inference and AI Agents |
Yueying Li et.al. |
2504.07347 |
null |
2025-04-09 |
Code Generation with Small Language Models: A Deep Evaluation on Codeforces |
Débora Souza et.al. |
2504.07343 |
null |
2025-04-09 |
Leveraging deep learning for plant disease identification: a bibliometric analysis in SCOPUS from 2018 to 2024 |
Enow Takang Achuo Albert et.al. |
2504.07342 |
null |
2025-04-09 |
Zeus: Zero-shot LLM Instruction for Union Segmentation in Multimodal Medical Imaging |
Siyuan Dai et.al. |
2504.07336 |
null |
2025-04-09 |
Objaverse++: Curated 3D Object Dataset with Quality Annotations |
Chendi Lin et.al. |
2504.07334 |
link |
2025-04-09 |
Alice: Proactive Learning with Teacher’s Demonstrations for Weak-to-Strong Generalization |
Shujin Wu et.al. |
2504.07316 |
link |
2025-04-09 |
PAYADOR: A Minimalist Approach to Grounding Language Models on Structured Data for Interactive Storytelling and Role-playing Games |
Santiago Góngora et.al. |
2504.07304 |
link |
2025-04-09 |
Modeling Response Consistency in Multi-Agent LLM Systems: A Comparative Analysis of Shared and Separate Context Approaches |
Tooraj Helmi et.al. |
2504.07303 |
null |
2025-04-09 |
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning |
Yangning Li et.al. |
2504.07288 |
null |
2025-04-09 |
RAISE: Reinforenced Adaptive Instruction Selection For Large Language Models |
Lv Qingsong et.al. |
2504.07282 |
null |
2025-04-09 |
Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning |
Nikhil Shivakumar Nayak et.al. |
2504.07097 |
link |
2025-04-09 |
Are We Done with Object-Centric Learning? |
Alexander Rubinstein et.al. |
2504.07092 |
link |
2025-04-09 |
KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs |
Elan Markowitz et.al. |
2504.07087 |
null |
2025-04-09 |
Identifying Unknown Stochastic Dynamics via Finite expression methods |
Senwei Liang et.al. |
2504.07085 |
null |
2025-04-09 |
DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning |
Atharva Pandey et.al. |
2504.07080 |
null |
2025-04-09 |
A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models |
Zhouhang Xie et.al. |
2504.07070 |
null |
2025-04-09 |
HalluciNot: Hallucination Detection Through Context and Common Knowledge Verification |
Bibek Paudel et.al. |
2504.07069 |
null |
2025-04-09 |
Teaching pathology foundation models to accurately predict gene expression with parameter efficient knowledge transfer |
Shi Pan et.al. |
2504.07061 |
null |
2025-04-09 |
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling |
Liang-Hsuan Tseng et.al. |
2504.07053 |
link |
2025-04-09 |
To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning |
Tian Qin et.al. |
2504.07052 |
null |
2025-04-09 |
Evaluating Retrieval Augmented Generative Models for Document Queries in Transportation Safety |
Chad Melton et.al. |
2504.07022 |
null |
2025-04-09 |
LLM-IFT: LLM-Powered Information Flow Tracking for Secure Hardware |
Nowfel Mashnoor et.al. |
2504.07015 |
null |
2025-04-09 |
Latent Diffusion U-Net Representations Contain Positional Embeddings and Anomalies |
Jonas Loos et.al. |
2504.07008 |
link |
2025-04-09 |
Towards LLMs Robustness to Changes in Prompt Format Styles |
Lilian Ngweta et.al. |
2504.06969 |
null |
2025-04-09 |
Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation |
Thomas Kerdreux et.al. |
2504.06962 |
null |
2025-04-09 |
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning |
Xinhao Li et.al. |
2504.06958 |
null |
2025-04-09 |
RuOpinionNE-2024: Extraction of Opinion Tuples from Russian News Texts |
Natalia Loukachevitch et.al. |
2504.06947 |
link |
2025-04-09 |
Review of Case-Based Reasoning for LLM Agents: Theoretical Foundations, Architectural Components, and Cognitive Integration |
Kostas Hatalis et.al. |
2504.06943 |
null |
2025-04-09 |
FeedbackEval: A Benchmark for Evaluating Large Language Models in Feedback-Driven Code Repair Tasks |
Dekun Dai et.al. |
2504.06939 |
link |
2025-04-09 |
The Importance of Being Discrete: Measuring the Impact of Discretization in End-to-End Differentially Private Synthetic Data |
Georgi Ganev et.al. |
2504.06923 |
null |
2025-04-09 |
Data Augmentation for Fake Reviews Detection in Multiple Languages and Multiple Domains |
Ming Liu et.al. |
2504.06917 |
null |
2025-04-09 |
UKBOB: One Billion MRI Labeled Masks for Generalizable 3D Medical Image Segmentation |
Emmanuelle Bourigault et.al. |
2504.06908 |
null |
2025-04-09 |
MovSAM: A Single-image Moving Object Segmentation Framework Based on Deep Thinking |
Chang Nie et.al. |
2504.06863 |
null |
2025-04-09 |
EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation |
Diljeet Jagpal et.al. |
2504.06861 |
null |
2025-04-09 |
Integrating Cognitive Processing Signals into Language Models: A Review of Advances, Applications and Future Directions |
Angela Lopez-Cardona et.al. |
2504.06843 |
null |
2025-04-09 |
LVC: A Lightweight Compression Framework for Enhancing VLMs in Long Video Understanding |
Ziyi Wang et.al. |
2504.06835 |
null |
2025-04-09 |
IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments |
Can Zhang et.al. |
2504.06827 |
null |
2025-04-09 |
Open Problems and a Hypothetical Path Forward in LLM Knowledge Paradigms |
Xiaotian Ye et.al. |
2504.06823 |
null |
2025-04-09 |
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation |
Wangbo Zhao et.al. |
2504.06803 |
link |
2025-04-09 |
A Meaningful Perturbation Metric for Evaluating Explainability Methods |
Danielle Cohen et.al. |
2504.06800 |
null |
2025-04-09 |
Zero-Shot Image-Based Large Language Model Approach to Road Pavement Monitoring |
Shuoshuo Xu et.al. |
2504.06785 |
null |
2025-04-09 |
CHIME: A Compressive Framework for Holistic Interest Modeling |
Yong Bai et.al. |
2504.06780 |
null |
2025-04-09 |
FamilyTool: A Multi-hop Personalized Tool Use Benchmark |
Yuxin Wang et.al. |
2504.06766 |
link |
2025-04-09 |
Robust Capacity Expansion Modelling for Renewable Energy Systems under Weather and Demand Uncertainty |
Sebastian Kebrich et.al. |
2504.06750 |
link |
2025-04-09 |
Plastic tensor networks for interpretable generative modeling |
Katsuya O. Akamatsu et.al. |
2504.06722 |
null |
2025-04-09 |
Toward Holistic Evaluation of Recommender Systems Powered by Generative Models |
Yashar Deldjoo et.al. |
2504.06667 |
null |
2025-04-09 |
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception |
Ruotian Peng et.al. |
2504.06666 |
null |
2025-04-09 |
SEE: Continual Fine-tuning with Sequential Ensemble of Experts |
Zhilin Wang et.al. |
2504.06664 |
link |
2025-04-09 |
Bridging the Gap Between Preference Alignment and Machine Unlearning |
Xiaohua Feng et.al. |
2504.06659 |
null |
2025-04-09 |
A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty |
Xiaohua Feng et.al. |
2504.06658 |
null |
2025-04-09 |
ThoughtProbe: Classifier-Guided Thought Space Exploration Leveraging LLM Intrinsic Reasoning |
Zijian Wang et.al. |
2504.06650 |
null |
2025-04-09 |
SCI-Reason: A Dataset with Chain-of-Thought Rationales for Complex Multimodal Reasoning in Academic Areas |
Chenghao Ma et.al. |
2504.06637 |
null |
2025-04-09 |
BBQRec: Behavior-Bind Quantization for Multi-Modal Sequential Recommendation |
Kaiyuan Li et.al. |
2504.06636 |
null |
2025-04-09 |
The Method for Storing Patterns in Neural Networks-Memorization and Recall of QR code Patterns- |
Hiroshi Inazawa et.al. |
2504.06631 |
null |
2025-04-09 |
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program |
Minghe Gao et.al. |
2504.06606 |
null |
2025-04-09 |
Automated Business Process Analysis: An LLM-Based Approach to Value Assessment |
William De Michele et.al. |
2504.06600 |
link |
2025-04-09 |
A Multi-Modal Interaction Framework for Efficient Human-Robot Collaborative Shelf Picking |
Abhinav Pathak et.al. |
2504.06593 |
null |
2025-04-09 |
Right Prediction, Wrong Reasoning: Uncovering LLM Misalignment in RA Disease Diagnosis |
Umakanta Maharana et.al. |
2504.06581 |
link |
2025-04-09 |
Bypassing Safety Guardrails in LLMs Using Humor |
Pedro Cisneros-Velarde et.al. |
2504.06577 |
null |
2025-04-09 |
NeedleInATable: Exploring Long-Context Capability of Large Language Models towards Long-Structured Tables |
Lanrui Wang et.al. |
2504.06560 |
null |
2025-04-09 |
Societal Impacts Research Requires Benchmarks for Creative Composition Tasks |
Judy Hanwen Shen et.al. |
2504.06549 |
null |
2025-04-09 |
DiffusionCom: Structure-Aware Multimodal Diffusion Model for Multimodal Knowledge Graph Completion |
Wei Huang et.al. |
2504.06543 |
null |
2025-04-09 |
Lugha-Llama: Adapting Large Language Models for African Languages |
Happy Buzaaba et.al. |
2504.06536 |
null |
2025-04-08 |
Towards Holistic Prompt Craft |
Joseph Lindley et.al. |
2504.06496 |
null |
2025-04-08 |
Mind the Gap: Evaluating Vision Systems in Small Data Applications |
Samuel Stevens et.al. |
2504.06486 |
link |
2025-04-08 |
Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following |
Sai Adith Senthil Kumar et.al. |
2504.06460 |
null |
2025-04-08 |
Can you Finetune your Binoculars? Embedding Text Watermarks into the Weights of Large Language Models |
Fay Elhassan et.al. |
2504.06446 |
null |
2025-04-08 |
Don’t Let It Hallucinate: Premise Verification via Retrieval-Augmented Logical Reasoning |
Yuehan Qin et.al. |
2504.06438 |
null |
2025-04-08 |
Language-Dependent Political Bias in AI: A Study of ChatGPT and Gemini |
Dogus Yuksel et.al. |
2504.06436 |
null |
2025-04-08 |
Human Trust in AI Search: A Large-Scale Experiment |
Haiwen Li et.al. |
2504.06435 |
null |
2025-04-08 |
S’MoRE: Structural Mixture of Residual Experts for LLM Fine-tuning |
Hanqing Zeng et.al. |
2504.06426 |
null |
2025-04-08 |
Releasing Differentially Private Event Logs Using Generative Models |
Frederik Wangelik et.al. |
2504.06418 |
link |
2025-04-08 |
Unifying Autoregressive and Diffusion-Based Sequence Generation |
Nima Fathi et.al. |
2504.06416 |
null |
2025-04-08 |
Comparing Self-Disclosure Themes and Semantics to a Human, a Robot, and a Disembodied Agent |
Sophie Chiang et.al. |
2504.06374 |
null |
2025-04-08 |
Query Understanding in LLM-based Conversational Information Seeking |
Yifei Yuan et.al. |
2504.06356 |
null |
2025-04-08 |
A Geometric-Aware Perspective and Beyond: Hybrid Quantum-Classical Machine Learning Methods |
Azadeh Alavia et.al. |
2504.06328 |
null |
2025-04-08 |
From Stability to Inconsistency: A Study of Moral Preferences in LLMs |
Monika Jotautaite et.al. |
2504.06324 |
null |
2025-04-08 |
Mosaic: Composite Projection Pruning for Resource-efficient LLMs |
Bailey J. Eccles et.al. |
2504.06323 |
null |
2025-04-09 |
GOLLuM: Gaussian Process Optimized LLMs – Reframing LLM Finetuning through Bayesian Optimization |
Bojana Ranković et.al. |
2504.06265 |
link |
2025-04-08 |
OmniSVG: A Unified Scalable Vector Graphics Generation Model |
Yiying Yang et.al. |
2504.06263 |
null |
2025-04-09 |
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention |
Gleb Rodionov et.al. |
2504.06261 |
null |
2025-04-08 |
FEABench: Evaluating Language Models on Multiphysics Reasoning Ability |
Nayantara Mudur et.al. |
2504.06260 |
link |
2025-04-08 |
Electronic Structure Guided Inverse Design Using Generative Models |
Shuyi Jia et.al. |
2504.06249 |
link |
2025-04-08 |
Orb-v3: atomistic simulation at scale |
Benjamin Rhodes et.al. |
2504.06231 |
link |
2025-04-08 |
LExT: Towards Evaluating Trustworthiness of Natural Language Explanations |
Krithi Shailya et.al. |
2504.06227 |
null |
2025-04-08 |
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation |
Biao Zhang et.al. |
2504.06225 |
null |
2025-04-09 |
Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation |
Xiaoxing Hu et.al. |
2504.06220 |
link |
2025-04-08 |
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs |
Dongyang Fan et.al. |
2504.06219 |
null |
2025-04-08 |
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models |
Chejian Xu et.al. |
2504.06214 |
null |
2025-04-08 |
TxGemma: Efficient and Agentic LLMs for Therapeutics |
Eric Wang et.al. |
2504.06196 |
null |
2025-04-08 |
A Self-Supervised Framework for Space Object Behaviour Characterisation |
Ian Groves et.al. |
2504.06176 |
null |
2025-04-08 |
Assessing how hyperparameters impact Large Language Models’ sarcasm detection performance |
Montgomery Gole et.al. |
2504.06166 |
null |
2025-04-09 |
Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups |
Rijul Magu et.al. |
2504.06160 |
null |
2025-04-08 |
A Large-Scale Analysis on Contextual Self-Supervised Video Representation Learning |
Akash Kumar et.al. |
2504.06153 |
null |
2025-04-08 |
V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models |
Xiangxi Zheng et.al. |
2504.06148 |
link |
2025-04-08 |
ARLO: A Tailorable Approach for Transforming Natural Language Software Requirements into Architecture using LLMs |
Tooraj Helmi et.al. |
2504.06143 |
null |
2025-04-10 |
A Multimedia Analytics Model for the Foundation Model Era |
Marcel Worring et.al. |
2504.06138 |
null |
2025-04-08 |
QGen Studio: An Adaptive Question-Answer Generation, Training and Evaluation Platform |
Movina Moses et.al. |
2504.06136 |
null |
2025-04-08 |
FaceCloak: Learning to Protect Face Templates |
Sudipta Banerjee et.al. |
2504.06131 |
null |
2025-04-08 |
Knowledge Graph Completion with Relation-Aware Anchor Enhancement |
Duanyang Yuan et.al. |
2504.06129 |
link |
2025-04-08 |
Multi-Sense Embeddings for Language Models and Knowledge Distillation |
Qitong Wang et.al. |
2504.06036 |
null |
2025-04-08 |
Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi |
Monojit Choudhury et.al. |
2504.06011 |
null |
2025-04-08 |
Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning? |
Roman Kochnev et.al. |
2504.06006 |
null |
2025-04-08 |
Note on the Universality of Parameterized IQP Circuits with Hidden Units for Generating Probability Distributions |
Andrii Kurkin et.al. |
2504.05997 |
null |
2025-04-08 |
NativQA Framework: Enabling LLMs with Native, Local, and Everyday Knowledge |
Firoj Alam et.al. |
2504.05995 |
null |
2025-04-08 |
An Empirical Study of GPT-4o Image Generation Capabilities |
Sixiang Chen et.al. |
2504.05979 |
link |
2025-04-08 |
AVP-AP: Self-supervised Automatic View Positioning in 3D cardiac CT via Atlas Prompting |
Xiaolin Fan et.al. |
2504.05966 |
null |
2025-04-08 |
Unsupervised Location Mapping for Narrative Corpora |
Eitan Wagner et.al. |
2504.05954 |
null |
2025-04-08 |
InstructMPC: A Human-LLM-in-the-Loop Framework for Context-Aware Control |
Ruixiang Wu et.al. |
2504.05946 |
null |
2025-04-08 |
Assessing Thai Dialect Performance in LLMs with Automatic Benchmarks and Human Evaluation |
Peerat Limkonchotiwat et.al. |
2504.05898 |
null |
2025-04-08 |
KAN-SAM: Kolmogorov-Arnold Network Guided Segment Anything Model for RGB-T Salient Object Detection |
Xingyuan Li et.al. |
2504.05878 |
null |
2025-04-08 |
Agent Guide: A Simple Agent Behavioral Watermarking Framework |
Kaibo Huang et.al. |
2504.05871 |
null |
2025-04-08 |
CTI-HAL: A Human-Annotated Dataset for Cyber Threat Intelligence Analysis |
Sofia Della Penna et.al. |
2504.05866 |
null |
2025-04-08 |
Are Generative AI Agents Effective Personalized Financial Advisors? |
Takehiro Takayanagi et.al. |
2504.05862 |
link |
2025-04-08 |
Enhancing Coreference Resolution with Pretrained Language Models: Bridging the Gap Between Syntax and Semantics |
Xingzu Liu et.al. |
2504.05855 |
null |
2025-04-08 |
Physics-aware generative models for turbulent fluid flows through energy-consistent stochastic interpolants |
Nikolaj T. Mücke et.al. |
2504.05852 |
link |
2025-04-08 |
PathGPT: Leveraging Large Language Models for Personalized Route Generation |
Steeve Cuthbert Marcelyn et.al. |
2504.05846 |
null |
2025-04-08 |
Leveraging Robust Optimization for LLM Alignment under Distribution Shifts |
Mingye Zhu et.al. |
2504.05831 |
null |
2025-04-08 |
Parasite: A Steganography-based Backdoor Attack Framework for Diffusion Models |
Jiahao Chen et.al. |
2504.05815 |
null |
2025-04-08 |
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization |
Qingyang Zhang et.al. |
2504.05812 |
link |
2025-04-08 |
PaMi-VDPO: Mitigating Video Hallucinations by Prompt-Aware Multi-Instance Video Preference Learning |
Xinpeng Ding et.al. |
2504.05810 |
null |
2025-04-08 |
StealthRank: LLM Ranking Manipulation via Stealthy Prompt Optimization |
Yiming Tang et.al. |
2504.05804 |
link |
2025-04-08 |
From Superficial to Deep: Integrating External Knowledge for Follow-up Question Generation Using Knowledge Graph and LLM |
Jianyu Liu et.al. |
2504.05801 |
null |
2025-04-08 |
DefMamba: Deformable Visual State Space Model |
Leiye Liu et.al. |
2504.05794 |
null |
2025-04-08 |
ViralQC: A Tool for Assessing Completeness and Contamination of Predicted Viral Contigs |
Cheng Peng et.al. |
2504.05790 |
link |
2025-04-08 |
How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM |
Jirong Zha et.al. |
2504.05786 |
null |
2025-04-08 |
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models |
Pengfei Zhou et.al. |
2504.05782 |
link |
2025-04-08 |
SEA-LION: Southeast Asian Languages in One Network |
Raymond Ng et.al. |
2504.05747 |
null |
2025-04-08 |
LLM-assisted Mutation for Whitebox API Testing |
Jia Li et.al. |
2504.05738 |
null |
2025-04-08 |
Rank-Then-Score: Enhancing Large Language Models for Automated Essay Scoring |
Yida Cai et.al. |
2504.05736 |
null |
2025-04-08 |
LLM × MapReduce-V2: Entropy-Driven Convolutional Test-Time Scaling for Generating Long-Form Articles from Extremely Long Resources |
Haoyu Wang et.al. |
2504.05732 |
link |
2025-04-08 |
Retrieval Augmented Generation with Collaborative Filtering for Personalized Text Generation |
Teng Shi et.al. |
2504.05731 |
link |
2025-04-08 |
Unified Generative Search and Recommendation |
Teng Shi et.al. |
2504.05730 |
null |
2025-04-08 |
Single-Agent vs. Multi-Agent LLM Strategies for Automated Student Reflection Assessment |
Gen Li et.al. |
2504.05716 |
null |
2025-04-08 |
Automated Archival Descriptions with Federated Intelligence of LLMs |
Jinghua Groppe et.al. |
2504.05711 |
null |
2025-04-08 |
Large Language Models Enhanced Hyperbolic Space Recommender Systems |
Wentao Cheng et.al. |
2504.05694 |
null |
2025-04-08 |
STRIVE: A Think & Improve Approach with Iterative Refinement for Enhancing Question Quality Estimation |
Aniket Deroy et.al. |
2504.05693 |
null |
2025-04-08 |
StayLTC: A Cost-Effective Multimodal Framework for Hospital Length of Stay Forecasting |
Sudeshna Jana et.al. |
2504.05691 |
null |
2025-04-09 |
STAGE: Stemmed Accompaniment Generation through Prefix-Based Conditioning |
Giorgio Strano et.al. |
2504.05690 |
null |
2025-04-08 |
Separator Injection Attack: Uncovering Dialogue Biases in Large Language Models Caused by Role Separators |
Xitao Li et.al. |
2504.05689 |
null |
2025-04-08 |
Towards Smarter Hiring: Are Zero-Shot and Few-Shot Pre-trained LLMs Ready for HR Spoken Interview Transcript Analysis? |
Subhankar Maity et.al. |
2504.05683 |
null |
2025-04-08 |
VC-LLM: Automated Advertisement Video Creation from Raw Footage using Multi-modal LLMs |
Dongjun Qian et.al. |
2504.05673 |
null |
2025-04-08 |
Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing |
Tianchi Liu et.al. |
2504.05657 |
link |
2025-04-08 |
Sugar-Coated Poison: Benign Generation Unlocks LLM Jailbreaking |
Yu-Hang Wu et.al. |
2504.05652 |
link |
2025-04-08 |
iEBAKER: Improved Remote Sensing Image-Text Retrieval Framework via Eliminate Before Align and Keyword Explicit Reasoning |
Yan Zhang et.al. |
2504.05644 |
null |
2025-04-08 |
Leveraging Prompt-Tuning for Bengali Grammatical Error Explanation Using Large Language Models |
Subhankar Maity et.al. |
2504.05642 |
null |
2025-04-08 |
TAGC: Optimizing Gradient Communication in Distributed Transformer Training |
Igor Polyakov et.al. |
2504.05638 |
link |
2025-04-08 |
Model-Agnostic Policy Explanations with Large Language Models |
Zhang Xi-Jia et.al. |
2504.05625 |
null |
2025-04-08 |
Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement |
Yichen Dong et.al. |
2504.05614 |
null |
2025-04-08 |
Falcon: Fractional Alternating Cut with Overcoming Minima in Unsupervised Segmentation |
Xiao Zhang et.al. |
2504.05613 |
null |
2025-04-08 |
FactGuard: Leveraging Multi-Agent Systems to Generate Answerable and Unanswerable Questions for Enhanced Long-Context LLM Extraction |
Qian-Wen Zhang et.al. |
2504.05607 |
null |
2025-04-08 |
On the Impact of Language Nuances on Sentiment Analysis with Large Language Models: Paraphrasing, Sarcasm, and Emojis |
Naman Bhargava et.al. |
2504.05603 |
null |
2025-04-08 |
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought |
Yi Peng et.al. |
2504.05599 |
null |
2025-04-08 |
DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding |
Hossein Entezari Zarch et.al. |
2504.05598 |
null |
2025-04-08 |
Knowledge-Instruct: Effective Continual Pre-training from Limited Data using Instructions |
Oded Ovadia et.al. |
2504.05571 |
null |
2025-04-07 |
Can Large Language Models Match Tutoring System Adaptivity? A Benchmarking Study |
Conrad Borchers et.al. |
2504.05570 |
null |
2025-04-07 |
From Fairness to Truthfulness: Rethinking Data Valuation Design |
Dongyang Fan et.al. |
2504.05563 |
null |
2025-04-07 |
SciSciGPT: Advancing Human-AI Collaboration in the Science of Science |
Erzhuo Shao et.al. |
2504.05559 |
null |
2025-04-07 |
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values |
M-A-P Team et.al. |
2504.05535 |
null |
2025-04-07 |
Bridging Industrial Expertise and XR with LLM-Powered Conversational Agents |
Despina Tomkou et.al. |
2504.05527 |
null |
2025-04-07 |
Pretraining Language Models for Diachronic Linguistic Change Discovery |
Elisabeth Fittschen et.al. |
2504.05523 |
null |
2025-04-07 |
User Feedback Alignment for LLM-powered Exploration in Large-scale Recommendation Systems |
Jianling Wang et.al. |
2504.05522 |
null |
2025-04-07 |
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning |
Taiwei Shi et.al. |
2504.05520 |
link |
2025-04-07 |
Evaluating the Generalization Capabilities of Large Language Models on Code Reasoning |
Rem Yang et.al. |
2504.05518 |
null |
2025-04-07 |
SelfMAD: Enhancing Generalization and Robustness in Morphing Attack Detection via Self-Supervised Learning |
Marija Ivanovska et.al. |
2504.05504 |
null |
2025-04-07 |
Prism: Dynamic and Flexible Benchmarking of LLMs Code Generation with Monte Carlo Tree Search |
Vahid Majdinasab et.al. |
2504.05500 |
null |
2025-04-07 |
A Survey on Hypothesis Generation for Scientific Discovery in the Era of Large Language Models |
Atilla Kaan Alkan et.al. |
2504.05496 |
null |
2025-04-07 |
REEF: Relevance-Aware and Efficient LLM Adapter for Video Understanding |
Sakib Reza et.al. |
2504.05491 |
null |
2025-04-07 |
GraphRAFT: Retrieval Augmented Fine-Tuning for Knowledge Graphs on Graph Databases |
Alfred Clemedtson et.al. |
2504.05478 |
link |
2025-04-07 |
Generative Adversarial Networks with Limited Data: A Survey and Benchmarking |
Omar De Mitri et.al. |
2504.05456 |
null |
2025-04-07 |
Connecting Feedback to Choice: Understanding Educator Preferences in GenAI vs. Human-Created Lesson Plans in K-12 Education – A Comparative Analysis |
Shawon Sarkar et.al. |
2504.05449 |
null |
2025-04-07 |
EP-Diffuser: An Efficient Diffusion Model for Traffic Scene Generation and Prediction via Polynomial Representations |
Yue Yao et.al. |
2504.05422 |
null |
2025-04-07 |
Less but Better: Parameter-Efficient Fine-Tuning of Large Language Models for Personality Detection |
Lingzhi Shen et.al. |
2504.05411 |
null |
2025-04-07 |
EduPlanner: LLM-Based Multi-Agent Systems for Customized and Intelligent Instructional Design |
Xueqiao Zhang et.al. |
2504.05370 |
null |
2025-04-07 |
URECA: Unique Region Caption Anything |
Sangbeom Lim et.al. |
2504.05305 |
null |
2025-04-07 |
InteractVLM: 3D Interaction Reasoning from 2D Foundational Models |
Sai Kumar Dwivedi et.al. |
2504.05303 |
link |
2025-04-07 |
Truthful or Fabricated? Using Causal Attribution to Mitigate Reward Hacking in Explanations |
Pedro Ferreira et.al. |
2504.05294 |
null |
2025-04-07 |
The challenge of uncertainty quantification of large language models in medicine |
Zahra Atf et.al. |
2504.05278 |
null |
2025-04-07 |
Enhancing LLM-Based Short Answer Grading with Retrieval-Augmented Generation |
Yucheng Chu et.al. |
2504.05276 |
null |
2025-04-07 |
Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models |
Yang Yan et.al. |
2504.05262 |
null |
2025-04-07 |
Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models |
Adrián Bazaga et.al. |
2504.05258 |
null |
2025-04-07 |
Explaining Low Perception Model Competency with High-Competency Counterfactuals |
Sara Pohland et.al. |
2504.05254 |
null |
2025-04-07 |
LLM-based Automated Grading with Human-in-the-Loop |
Hang Li et.al. |
2504.05239 |
null |
2025-04-08 |
Leveraging LLMs for Utility-Focused Annotation: Reducing Manual Effort for Retrieval and RAG |
Hengran Zhang et.al. |
2504.05220 |
null |
2025-04-07 |
Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling |
Hengran Zhang et.al. |
2504.05216 |
null |
2025-04-07 |
Post-Training Language Models for Continual Relation Extraction |
Sefika Efeoglu et.al. |
2504.05214 |
null |
2025-04-07 |
Quantum Program Linting with LLMs: Emerging Results from a Comparative Study |
Seung Yeob Shin et.al. |
2504.05204 |
null |
2025-04-07 |
P2Mark: Plug-and-play Parameter-intrinsic Watermarking for Neural Speech Generation |
Yong Ren et.al. |
2504.05197 |
null |
2025-04-07 |
Training state-of-the-art pathology foundation models with orders of magnitude less data |
Mikhail Karasikov et.al. |
2504.05186 |
null |
2025-04-07 |
Concise Reasoning via Reinforcement Learning |
Mehdi Fatemi et.al. |
2504.05185 |
link |
2025-04-07 |
BRIDGES: Bridging Graph Modality and Large Language Models within EDA Tasks |
Wei Li et.al. |
2504.05180 |
null |
2025-04-07 |
Learning symmetries in datasets |
Veronica Sanz et.al. |
2504.05174 |
null |
2025-04-07 |
Evaluating Knowledge Graph Based Retrieval Augmented Generation Methods under Knowledge Incompleteness |
Dongzhuoran Zhou et.al. |
2504.05163 |
null |
2025-04-07 |
DDPM Score Matching and Distribution Learning |
Sinho Chewi et.al. |
2504.05161 |
null |
2025-04-07 |
PanoDreamer: Consistent Text to 360-Degree Scene Generation |
Zhexiao Xiong et.al. |
2504.05152 |
null |
2025-04-07 |
Pr$εε$mpt: Sanitizing Sensitive Prompts for LLMs |
Amrita Roy Chowdhury et.al. |
2504.05147 |
link |
2025-04-07 |
Query Smarter, Trust Better? Exploring Search Behaviours for Verifying News Accuracy |
David Elsweiler et.al. |
2504.05146 |
null |
2025-04-07 |
DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation |
Xinglin Lyu et.al. |
2504.05122 |
null |
2025-04-07 |
Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning |
Anja Surina et.al. |
2504.05108 |
null |
2025-04-07 |
Speech-to-Trajectory: Learning Human-Like Verbal Guidance for Robot Motion |
Eran Beeri Bamani et.al. |
2504.05084 |
null |
2025-04-07 |
The Curse of CoT: On the Limitations of Chain-of-Thought in In-Context Learning |
Tianshi Zheng et.al. |
2504.05081 |
null |
2025-04-07 |
On the Performance of an Explainable Language Model on PubMedQA |
Venkat Srinivasan et.al. |
2504.05074 |
null |
2025-04-08 |
Not All Data Are Unlearned Equally |
Aravind Krishnan et.al. |
2504.05058 |
link |
2025-04-07 |
Revealing the Intrinsic Ethical Vulnerability of Aligned Large Language Models |
Jiawei Lian et.al. |
2504.05050 |
null |
2025-04-07 |
Debate Only When Necessary: Adaptive Multiagent Collaboration for Efficient LLM Reasoning |
Sugyeong Eo et.al. |
2504.05047 |
null |
2025-04-07 |
InstructionBench: An Instructional Video Understanding Benchmark |
Haiwan Wei et.al. |
2504.05040 |
null |
2025-04-07 |
Mixture-of-Personas Language Models for Population Simulation |
Ngoc Bui et.al. |
2504.05019 |
null |
2025-04-07 |
Surveying Professional Writers on AI: Limitations, Expectations, and Fears |
Anastasiia Ivanova et.al. |
2504.05008 |
link |
2025-04-07 |
Enhancing Smart Contract Vulnerability Detection in DApps Leveraging Fine-Tuned LLM |
Jiuyang Bu et.al. |
2504.05006 |
null |
2025-04-07 |
Following the Whispers of Values: Unraveling Neural Mechanisms Behind Value-Oriented Behaviors in LLMs |
Ling Hu et.al. |
2504.04994 |
null |
2025-04-07 |
RS-RAG: Bridging Remote Sensing Imagery and Comprehensive Knowledge with a Multi-Modal Dataset and Retrieval-Augmented Generation Model |
Congcong Wen et.al. |
2504.04988 |
null |
2025-04-07 |
Low-Rate Semantic Communication with Codebook-based Conditional Generative Models |
Kailang Ye et.al. |
2504.04977 |
null |
2025-04-07 |
A Domain-Based Taxonomy of Jailbreak Vulnerabilities in Large Language Models |
Carlos Peláez-González et.al. |
2504.04976 |
null |
2025-04-07 |
Towards Visual Text Grounding of Multimodal Large Language Model |
Ming Li et.al. |
2504.04974 |
null |
2025-04-07 |
The Dream Within Huang Long Cave: AI-Driven Interactive Narrative for Family Storytelling and Emotional Reflection |
Jiayang Huang et.al. |
2504.04968 |
null |
2025-04-07 |
A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization |
Wenyuan Xu et.al. |
2504.04950 |
null |
2025-04-07 |
One Quantizer is Enough: Toward a Lightweight Audio Codec |
Linwei Zhai et.al. |
2504.04949 |
link |
2025-04-07 |
A Llama walks into the ‘Bar’: Efficient Supervised Fine-Tuning for Legal Reasoning in the Multi-state Bar Exam |
Rean Fernandes et.al. |
2504.04945 |
null |
2025-04-07 |
Lemmanaid: Neuro-Symbolic Lemma Conjecturing |
Yousef Alhessi et.al. |
2504.04942 |
null |
2025-04-07 |
Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration |
Ran Xu et.al. |
2504.04915 |
link |
2025-04-07 |
Video-Bench: Human-Aligned Video Generation Benchmark |
Hui Han et.al. |
2504.04907 |
null |
2025-04-07 |
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models |
Justus Westerhoff et.al. |
2504.04893 |
link |
2025-04-07 |
Leveraging Large Language Models for Cost-Effective, Multilingual Depression Detection and Severity Assessment |
Longdi Xian et.al. |
2504.04891 |
null |
2025-04-07 |
SoK: LLM-based Log Parsing |
Viktor Beck et.al. |
2504.04877 |
link |
2025-04-07 |
Simulating Persuasive Dialogues on Meat Reduction with Generative Agents |
Georg Ahnert et.al. |
2504.04872 |
link |
2025-04-07 |
BIASINSPECTOR: Detecting Bias in Structured Data through LLM Agents |
Haoxuan Li et.al. |
2504.04855 |
null |
2025-04-07 |
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models |
Ruikang Liu et.al. |
2504.04823 |
link |
2025-04-07 |
Beyond Answers: How LLMs Can Pursue Strategic Thinking in Education |
Eleonora Grassucci et.al. |
2504.04815 |
null |
2025-04-07 |
Select Me! When You Need a Tool: A Black-box Text Attack on Tool Selection |
Liuji Chen et.al. |
2504.04809 |
null |
2025-04-07 |
ELT-Bench: An End-to-End Benchmark for Evaluating AI Agents on ELT Pipelines |
Tengjun Jin et.al. |
2504.04808 |
link |
2025-04-07 |
OrderChain: A General Prompting Paradigm to Improve Ordinal Understanding Ability of MLLM |
Jinhong Wang et.al. |
2504.04801 |
null |
2025-04-07 |
Topological Schrödinger Bridge Matching |
Maosheng Yang et.al. |
2504.04799 |
link |
2025-04-07 |
TabRep: Training Tabular Diffusion Models with a Simple and Effective Continuous Representation |
Jacob Si et.al. |
2504.04798 |
link |
2025-04-07 |
Addressing the Curse of Scenario and Task Generalization in AI-6G: A Multi-Modal Paradigm |
Tianyu Jiao et.al. |
2504.04797 |
null |
2025-04-07 |
Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors |
Fan Nie et.al. |
2504.04785 |
link |
2025-04-07 |
OCC-MLLM-CoT-Alpha: Towards Multi-stage Occlusion Recognition Based on Large Language Models via 3D-Aware Supervision and Chain-of-Thoughts Guidance |
Chaoyi Wang et.al. |
2504.04781 |
null |
2025-04-07 |
Improving Multilingual Retrieval-Augmented Language Models through Dialectic Reasoning Argumentations |
Leonardo Ranaldi et.al. |
2504.04771 |
null |
2025-04-07 |
Unsupervised Estimation of Nonlinear Audio Effects: Comparing Diffusion-Based and Adversarial approaches |
Eloi Moliner et.al. |
2504.04751 |
null |
2025-04-07 |
Can LLMs Interpret and Leverage Structured Linguistic Representations? A Case Study with AMRs |
Ankush Raut et.al. |
2504.04745 |
null |
2025-04-07 |
AnyArtisticGlyph: Multilingual Controllable Artistic Glyph Generation |
Xiongbo Lu et.al. |
2504.04743 |
null |
2025-04-07 |
Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data |
Samarth Mishra et.al. |
2504.04740 |
link |
2025-04-07 |
TathyaNyaya and FactLegalLlama: Advancing Factual Judgment Prediction and Explanation in the Indian Legal Context |
Shubham Kumar Nigam et.al. |
2504.04737 |
null |
2025-04-07 |
Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use |
Anna Goldie et.al. |
2504.04736 |
null |
2025-04-07 |
Can LLM-Driven Hard Negative Sampling Empower Collaborative Filtering? Findings and Potentials |
Chu Zhao et.al. |
2504.04726 |
link |
2025-04-08 |
Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language Models |
Yubo Li et.al. |
2504.04717 |
link |
2025-04-07 |
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs |
Will Cai et.al. |
2504.04715 |
link |
2025-04-07 |
Sequential-NIAH: A Needle-In-A-Haystack Benchmark for Extracting Sequential Needles from Long Contexts |
Yifei Yu et.al. |
2504.04713 |
null |
2025-04-07 |
Generalising from Self-Produced Data: Model Training Beyond Human Constraints |
Alfath Daryl Alhajir et.al. |
2504.04711 |
null |
2025-04-07 |
LagKV: Lag-Relative Information of the KV Cache Tells Which Tokens Are Important |
Manlai Liang et.al. |
2504.04704 |
link |
2025-04-07 |
Causal Retrieval with Semantic Consideration |
Hyunseo Shin et.al. |
2504.04700 |
null |
2025-04-07 |
R2Vul: Learning to Reason about Software Vulnerabilities with Reinforcement Learning and Structured Reasoning Distillation |
Martin Weyssow et.al. |
2504.04699 |
link |
2025-04-07 |
scAgent: Universal Single-Cell Annotation via a LLM Agent |
Yuren Mao et.al. |
2504.04698 |
null |
2025-04-07 |
Generative Large Language Model usage in Smart Contract Vulnerability Detection |
Peter Ince et.al. |
2504.04685 |
null |
2025-04-07 |
ACE-RLHF: Automated Code Evaluation and Socratic Feedback Generation Tool using Large Language Models and Reinforcement Learning with Human Feedback |
Tasnia Rahman et.al. |
2504.04657 |
null |
2025-04-07 |
LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts |
Yimu Wang et.al. |
2504.04653 |
null |
2025-04-06 |
Splits! A Flexible Dataset for Evaluating a Model’s Demographic Social Inference |
Eylon Caplan et.al. |
2504.04640 |
link |
2025-04-06 |
Foundation Models for Software Engineering of Cyber-Physical Systems: the Road Ahead |
Chengjie Lu et.al. |
2504.04630 |
null |
2025-04-06 |
SECQUE: A Benchmark for Evaluating Real-World Financial Analysis Capabilities |
Noga Ben Yoash et.al. |
2504.04596 |
null |
2025-04-08 |
Your Image Generator Is Your New Private Dataset |
Nicolo Resmini et.al. |
2504.04582 |
null |
2025-04-06 |
Hierarchical Planning for Complex Tasks with Knowledge Graph-RAG and Symbolic Verification |
Cristina Cornelio et.al. |
2504.04578 |
null |
2025-04-06 |
DexTOG: Learning Task-Oriented Dexterous Grasp with Language |
Jieyi Zhang et.al. |
2504.04573 |
null |
2025-04-06 |
Planning Safety Trajectories with Dual-Phase, Physics-Informed, and Transportation Knowledge-Driven Large Language Models |
Rui Gan et.al. |
2504.04562 |
link |
2025-04-06 |
Chain of Understanding: Supporting Code Understanding with Large Language Models |
Jie Gao et.al. |
2504.04553 |
null |
2025-04-06 |
Advancing Egocentric Video Question Answering with Multimodal Large Language Models |
Alkesh Patel et.al. |
2504.04550 |
null |
2025-04-06 |
Opening the black box of deep learning: Validating the statistical association between explainable artificial intelligence (XAI) and clinical domain knowledge in fundus image-based glaucoma diagnosis |
Han Yuan et.al. |
2504.04549 |
null |
2025-04-06 |
The Point, the Vision and the Text: Does Point Cloud Boost Spatial Reasoning of Large Language Models? |
Weichen Zhang et.al. |
2504.04540 |
null |
2025-04-06 |
An Empirical Comparison of Text Summarization: A Multi-Dimensional Evaluation of Large Language Models |
Anantharaman Janakiraman et.al. |
2504.04534 |
null |
2025-04-06 |
Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning |
Xuerui Su et.al. |
2504.04524 |
null |
2025-04-06 |
Hessian of Perplexity for Large Language Models by PyTorch autograd (Open Source) |
Ivan Ilin et.al. |
2504.04520 |
link |
2025-04-06 |
Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection |
Jiancheng Pan et.al. |
2504.04517 |
link |
2025-04-06 |
Saliency-driven Dynamic Token Pruning for Large Language Models |
Yao Tao et.al. |
2504.04514 |
null |
2025-04-06 |
Attributed Synthetic Data Generation for Zero-shot Domain-specific Image Classification |
Shijian Wang et.al. |
2504.04510 |
null |
2025-04-06 |
VideoAgent2: Enhancing the LLM-Based Agent System for Long-Form Video Understanding by Uncertainty-Aware CoT |
Zhuo Zhi et.al. |
2504.04471 |
null |
2025-04-06 |
Domain Generalization for Face Anti-spoofing via Content-aware Composite Prompt Engineering |
Jiabao Guo et.al. |
2504.04470 |
null |
2025-04-04 |
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models |
Wulin Xie et.al. |
2504.03641 |
null |
2025-04-04 |
Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning |
Xinyi Wang et.al. |
2504.03635 |
null |
2025-04-04 |
Enhancing Causal Effect Estimation with Diffusion-Generated Data |
Li Chen et.al. |
2504.03630 |
null |
2025-04-04 |
Align to Structure: Aligning Large Language Models with Structural Information |
Zae Myung Kim et.al. |
2504.03622 |
null |
2025-04-04 |
VISTA-OCR: Towards generative and interactive end to end OCR models |
Laziz Hamdi et.al. |
2504.03621 |
null |
2025-04-04 |
Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task |
Leonardo Ranaldi et.al. |
2504.03616 |
null |
2025-04-04 |
Autonomous and Self-Adapting System for Synthetic Media Detection and Attribution |
Aref Azizpour et.al. |
2504.03615 |
null |
2025-04-04 |
AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset |
Bingxiang He et.al. |
2504.03612 |
null |
2025-04-04 |
MedSAM2: Segment Anything in 3D Medical Images and Videos |
Jun Ma et.al. |
2504.03600 |
link |
2025-04-04 |
EnrichIndex: Using LLMs to Enrich Retrieval Indices Offline |
Peter Baile Chen et.al. |
2504.03598 |
null |
2025-04-04 |
PF3Det: A Prompted Foundation Feature Assisted Visual LiDAR 3D Detector |
Kaidong Li et.al. |
2504.03563 |
null |
2025-04-04 |
Agentic Knowledgeable Self-awareness |
Shuofei Qiao et.al. |
2504.03553 |
link |
2025-04-04 |
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration |
Boyuan Wang et.al. |
2504.03536 |
null |
2025-04-04 |
RANa: Retrieval-Augmented Navigation |
Gianluca Monaci et.al. |
2504.03524 |
null |
2025-04-04 |
Neutralizing the Narrative: AI-Powered Debiasing of Online News Articles |
Chen Wei Kuo et.al. |
2504.03520 |
null |
2025-04-04 |
Structured Legal Document Generation in India: A Model-Agnostic Wrapper Approach with VidhikDastaavej |
Shubham Kumar Nigam et.al. |
2504.03486 |
null |
2025-04-04 |
D-Garment: Physics-Conditioned Latent Diffusion for Dynamic Garment Deformations |
Antoine Dumoulin et.al. |
2504.03468 |
null |
2025-04-04 |
Generating ensembles of spatially-coherent in-situ forecasts using flow matching |
David Landry et.al. |
2504.03463 |
null |
2025-04-04 |
Conditioning Diffusions Using Malliavin Calculus |
Jakiw Pidstrigach et.al. |
2504.03461 |
null |
2025-04-04 |
Optimizing Specific and Shared Parameters for Efficient Parameter Tuning |
Van-Anh Nguyen et.al. |
2504.03450 |
null |
2025-04-04 |
LLMSched: Uncertainty-Aware Workload Scheduling for Compound LLM Applications |
Botao Zhu et.al. |
2504.03444 |
null |
2025-04-04 |
Know What You do Not Know: Verbalized Uncertainty Estimation Robustness on Corrupted Images in Vision-Language Models |
Mirko Borszukovszki et.al. |
2504.03440 |
null |
2025-04-04 |
Locations of Characters in Narratives: Andersen and Persuasion Datasets |
Batuhan Ozyurt et.al. |
2504.03434 |
link |
2025-04-04 |
BitHEP – The Limits of Low-Precision ML in HEP |
Claudius Krause et.al. |
2504.03387 |
link |
2025-04-04 |
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning |
Sanghwan Bae et.al. |
2504.03380 |
null |
2025-04-04 |
MultiClear: Multimodal Soft Exoskeleton Glove for Transparent Object Grasping Assistance |
Chen Hu et.al. |
2504.03379 |
null |
2025-04-04 |
Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency |
Erik Johannes Husom et.al. |
2504.03360 |
null |
2025-04-04 |
BabyLM’s First Words: Word Segmentation as a Phonological Probing Task |
Zébulon Goriely et.al. |
2504.03338 |
null |
2025-04-04 |
Steerable Anatomical Shape Synthesis with Implicit Neural Representations |
Bram de Wilde et.al. |
2504.03313 |
link |
2025-04-04 |
Evaluating Compact LLMs for Zero-Shot Iberian Language Tasks on End-User Devices |
Luís Couto Seller et.al. |
2504.03312 |
null |
2025-04-04 |
Noise Augmented Fine Tuning for Mitigating Hallucinations in Large Language Models |
Afshin Khadangi et.al. |
2504.03302 |
link |
2025-04-04 |
Stance-Driven Multimodal Controlled Statement Generation: New Dataset and Task |
Bingqian Wang et.al. |
2504.03295 |
null |
2025-04-04 |
Towards Effective EU E-Participation: The Development of AskThePublic |
Kilian Sprenkamp et.al. |
2504.03287 |
null |
2025-04-04 |
Do Large Language Models Solve the Problems of Agent-Based Modeling? A Critical Review of Generative Social Simulations |
Maik Larooij et.al. |
2504.03274 |
null |
2025-04-04 |
Inherent and emergent liability issues in LLM-based agentic systems: a principal-agent perspective |
Garry A. Gabison et.al. |
2504.03255 |
null |
2025-04-04 |
Seeing is Believing: Belief-Space Planning with Foundation Models as Uncertainty Estimators |
Linfeng Zhao et.al. |
2504.03245 |
null |
2025-04-04 |
Endo3R: Unified Online Reconstruction from Dynamic Monocular Endoscopic Video |
Jiaxin Guo et.al. |
2504.03198 |
null |
2025-04-04 |
Explain with Visual Keypoints Like a Real Mentor! A Benchmark for Multimodal Solution Explanation |
Jaewoo Park et.al. |
2504.03197 |
null |
2025-04-04 |
Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation |
Xin Zhang et.al. |
2504.03193 |
link |
2025-04-04 |
Learning Natural Language Constraints for Safe Reinforcement Learning of Language Agents |
Jaymari Chua et.al. |
2504.03185 |
null |
2025-04-04 |
RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation |
Hanbo Bi et.al. |
2504.03166 |
null |
2025-04-04 |
Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation |
Weitao Li et.al. |
2504.03165 |
link |
2025-04-04 |
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments |
Yuxiang Zheng et.al. |
2504.03160 |
link |
2025-04-04 |
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction |
Junlang Qian et.al. |
2504.03159 |
link |
2025-04-04 |
TokenFLEX: Unified VLM Training for Flexible Visual Tokens Inference |
Junshan Hu et.al. |
2504.03154 |
null |
2025-04-04 |
Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1) |
Jing Bi et.al. |
2504.03151 |
null |
2025-04-04 |
A Human Digital Twin Architecture for Knowledge-based Interactions and Context-Aware Conversations |
Abdul Mannan Mohammed et.al. |
2504.03147 |
null |
2025-04-04 |
LightPROF: A Lightweight Reasoning Framework for Large Language Model on Knowledge Graph |
Tu Ao et.al. |
2504.03137 |
null |
2025-04-04 |
Les Dissonances: Cross-Tool Harvesting and Polluting in Multi-Tool Empowered LLM Agents |
Zichuan Li et.al. |
2504.03111 |
null |
2025-04-04 |
Single-Pass Document Scanning for Question Answering |
Weili Cao et.al. |
2504.03101 |
link |
2025-04-03 |
Unlocking the AMD Neural Processing Unit for ML Training on the Client Using Bare-Metal-Programming Tools |
André Rösti et.al. |
2504.03083 |
null |
2025-04-03 |
AD-GPT: Large Language Models in Alzheimer’s Disease |
Ziyu Liu et.al. |
2504.03071 |
null |
2025-04-03 |
Design of AI-Powered Tool for Self-Regulation Support in Programming Education |
Huiyong Li et.al. |
2504.03068 |
null |
2025-04-03 |
Task as Context Prompting for Accurate Medical Symptom Coding Using Large Language Models |
Chengyang He et.al. |
2504.03051 |
null |
2025-04-03 |
Extending CREAMT: Leveraging Large Language Models for Literary Translation Post-Editing |
Antonio Castaldo et.al. |
2504.03045 |
null |
2025-04-03 |
Ontologies in Design: How Imagining a Tree Reveals Possibilites and Assumptions in Large Language Models |
Nava Haghighi et.al. |
2504.03029 |
null |
2025-04-03 |
AuDeRe: Automated Strategy Decision and Realization in Robot Planning and Control via LLMs |
Yue Meng et.al. |
2504.03015 |
link |
2025-04-03 |
What People Share With a Robot When Feeling Lonely and Stressed and How It Helps Over Time |
Guy Laban et.al. |
2504.02991 |
null |
2025-04-03 |
Language Models Guidance with Multi-Aspect-Cueing: A Case Study for Competitor Analysis |
Amir Hadifar et.al. |
2504.02984 |
null |
2025-04-03 |
Hummus: A Dataset of Humorous Multimodal Metaphor Use |
Xiaoyu Tong et.al. |
2504.02983 |
link |
2025-04-03 |
Digital Forensics in the Age of Large Language Models |
Zhipeng Yin et.al. |
2504.02963 |
null |
2025-04-03 |
Cultural Learning-Based Culture Adaptation of Language Models |
Chen Cecilia Liu et.al. |
2504.02953 |
null |
2025-04-03 |
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning |
Xianwei Zhuang et.al. |
2504.02949 |
link |
2025-04-03 |
HyperRAG: Enhancing Quality-Efficiency Tradeoffs in Retrieval-Augmented Generation with Reranker KV-Cache Reuse |
Yuwei An et.al. |
2504.02921 |
null |
2025-04-03 |
Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments |
Chenyu Zhang et.al. |
2504.02918 |
null |
2025-04-03 |
Bias in Large Language Models Across Clinical Applications: A Systematic Review |
Thanathip Suenghataiphorn et.al. |
2504.02917 |
null |
2025-04-03 |
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models |
Mateusz Pach et.al. |
2504.02821 |
link |
2025-04-03 |
Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization |
Kangle Deng et.al. |
2504.02817 |
null |
2025-04-03 |
Generative Evaluation of Complex Reasoning in Large Language Models |
Haowei Lin et.al. |
2504.02810 |
link |
2025-04-03 |
MegaMath: Pushing the Limits of Open Math Corpora |
Fan Zhou et.al. |
2504.02807 |
link |
2025-04-03 |
F-ViTA: Foundation Model Guided Visible to Thermal Translation |
Jay N. Paranjape et.al. |
2504.02801 |
link |
2025-04-04 |
A Survey of Large Language Models in Mental Health Disorder Detection on Social Media |
Zhuohan Ge et.al. |
2504.02800 |
null |
2025-04-03 |
A Framework for Situating Innovations, Opportunities, and Challenges in Advancing Vertical Systems with Large AI Models |
Gaurav Verma et.al. |
2504.02793 |
null |
2025-04-03 |
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets |
Chuning Zhu et.al. |
2504.02792 |
null |
2025-04-03 |
A Framework for Robust Cognitive Evaluation of LLMs |
Karin de Langis et.al. |
2504.02789 |
null |
2025-04-03 |
From Consumption to Collaboration: Measuring Interaction Patterns to Augment Human Cognition in Open-Ended Tasks |
Joshua Holstein et.al. |
2504.02780 |
null |
2025-04-03 |
BT-ACTION: A Test-Driven Approach for Modular Understanding of User Instruction Leveraging Behaviour Trees and LLMs |
Alexander Leszczynski et.al. |
2504.02779 |
link |
2025-04-03 |
How Deep Do Large Language Models Internalize Scientific Literature and Citation Practices? |
Andres Algaba et.al. |
2504.02767 |
link |
2025-04-03 |
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model |
Shengjun Zhang et.al. |
2504.02764 |
null |
2025-04-03 |
Echoes of the hidden: Uncovering coordination beyond network structure |
Shahar Somin et.al. |
2504.02757 |
null |
2025-04-04 |
RBT4DNN: Requirements-based Testing of Neural Networks |
Nusrat Jahan Mozumder et.al. |
2504.02737 |
link |
2025-04-03 |
Enhancing LLM Robustness to Perturbed Instructions: An Empirical Study |
Aryan Agrawal et.al. |
2504.02733 |
link |
2025-04-04 |
Why do LLMs attend to the first token? |
Federico Barbero et.al. |
2504.02732 |
null |
2025-04-03 |
ERPO: Advancing Safety Alignment via Ex-Ante Reasoning Preference Optimization |
Kehua Feng et.al. |
2504.02725 |
null |
2025-04-03 |
TeleMoM: Consensus-Driven Telecom Intelligence via Mixture of Models |
Xinquan Wang et.al. |
2504.02712 |
null |
2025-04-03 |
The Hidden Space of Safety: Understanding Preference-Tuned LLMs in Multilingual context |
Nikhil Verma et.al. |
2504.02708 |
null |
2025-04-03 |
LLM for Complex Reasoning Task: An Exploratory Study in Fermi Problems |
Zishuo Liu et.al. |
2504.02671 |
null |
2025-04-03 |
Affordable AI Assistants with Knowledge Graph of Thoughts |
Maciej Besta et.al. |
2504.02670 |
null |
2025-04-03 |
Prompt Optimization with Logged Bandit Data |
Haruka Kiyohara et.al. |
2504.02646 |
null |
2025-04-03 |
Towards Computation- and Communication-efficient Computational Pathology |
Chu Han et.al. |
2504.02628 |
null |
2025-04-03 |
Multi-Mission Tool Bench: Assessing the Robustness of LLM based Agents through Related and Dynamic Missions |
PeiJie Yu et.al. |
2504.02623 |
link |
2025-04-03 |
Exploring undercurrents of learning tensions in an LLM-enhanced landscape: A student-centered qualitative perspective on LLM vs Search |
Rahul R. Divekar et.al. |
2504.02622 |
null |
2025-04-03 |
Efficient Model Editing with Task-Localized Sparse Fine-tuning |
Leonardo Iurada et.al. |
2504.02620 |
link |
2025-04-03 |
Variational Online Mirror Descent for Robust Learning in Schrödinger Bridge |
Dong-Sig Han et.al. |
2504.02618 |
null |
2025-04-03 |
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation |
Jiwoo Chung et.al. |
2504.02612 |
null |
2025-04-03 |
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving |
Daoguang Zan et.al. |
2504.02605 |
link |
2025-04-04 |
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme |
Yan Ma et.al. |
2504.02587 |
link |
2025-04-03 |
Language Models reach higher Agreement than Humans in Historical Interpretation |
Fabio Celli et.al. |
2504.02572 |
null |
2025-04-04 |
Leveraging LLM For Synchronizing Information Across Multilingual Tables |
Siddharth Khincha et.al. |
2504.02559 |
null |
2025-04-03 |
Exploring Individual Factors in the Adoption of LLMs for Specific Software Engineering Tasks |
Stefano Lambiase et.al. |
2504.02553 |
null |
2025-04-03 |
GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning |
Xiangxiang Chu et.al. |
2504.02546 |
link |
2025-04-03 |
UNDO: Understanding Distillation as Optimization |
Kushal Jain et.al. |
2504.02521 |
null |
2025-04-03 |
A Memory-Augmented LLM-Driven Method for Autonomous Merging of 3D Printing Work Orders |
Yuhao Liu et.al. |
2504.02509 |
null |
2025-04-03 |
ZClip: Adaptive Spike Mitigation for LLM Pre-Training |
Abhay Kumar et.al. |
2504.02507 |
link |
2025-04-03 |
Inference-Time Scaling for Generalist Reward Modeling |
Zijun Liu et.al. |
2504.02495 |
null |
2025-04-03 |
MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities |
Bizhu Wu et.al. |
2504.02478 |
link |
2025-04-03 |
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision |
Xiaofeng Han et.al. |
2504.02477 |
null |
2025-04-03 |
Retrieval-Augmented Purifier for Robust LLM-Empowered Recommendation |
Liangbo Ning et.al. |
2504.02458 |
null |
2025-04-03 |
Cognitive Memory in Large Language Models |
Lianlei Shan et.al. |
2504.02441 |
null |
2025-04-03 |
A Multi-Level Sentiment Analysis Framework for Financial Texts |
Yiwei Liu et.al. |
2504.02429 |
link |
2025-04-03 |
Adapting Large Language Models for Multi-Domain Retrieval-Augmented-Generation |
Alexandre Misrahi et.al. |
2504.02411 |
null |
2025-04-03 |
AnesBench: Multi-Dimensional Evaluation of LLM Reasoning in Anesthesiology |
Xiang Feng et.al. |
2504.02404 |
link |
2025-04-03 |
DaKultur: Evaluating the Cultural Awareness of Language Models for Danish with Native Speakers |
Max Müller-Eberstein et.al. |
2504.02403 |
null |
2025-04-03 |
CrystalFormer-RL: Reinforcement Fine-Tuning for Materials Design |
Zhendong Cao et.al. |
2504.02367 |
link |
2025-04-03 |
ReuseDroid: A VLM-empowered Android UI Test Migrator Boosted by Active Feedback |
Xiaolei Li et.al. |
2504.02357 |
null |
2025-04-03 |
All-day Depth Completion via Thermal-LiDAR Fusion |
Janghyun Kim et.al. |
2504.02356 |
null |
2025-04-03 |
Agglomerating Large Vision Encoders via Distillation for VFSS Segmentation |
Chengxi Zeng et.al. |
2504.02351 |
null |
2025-04-03 |
Toward General and Robust LLM-enhanced Text-attributed Graph Learning |
Zihao Zhang et.al. |
2504.02343 |
null |
2025-04-03 |
LearNAT: Learning NL2SQL with AST-guided Task Decomposition for Large Language Models |
Weibin Liao et.al. |
2504.02327 |
null |
2025-04-03 |
CoTAL: Human-in-the-Loop Prompt Engineering, Chain-of-Thought Reasoning, and Active Learning for Generalizable Formative Assessment Scoring |
Clayton Cohn et.al. |
2504.02323 |
null |
2025-04-03 |
OmniCam: Unified Multimodal Video Generation via Camera Control |
Xiaoda Yang et.al. |
2504.02312 |
null |
2025-04-03 |
Relativistic compact object in Generalised Tolman-Kuchowicz spacetime with quadratic equation of state |
Hemani R. Acharya et.al. |
2504.02311 |
null |
2025-04-03 |
Improving Harmful Text Detection with Joint Retrieval and External Knowledge |
Zidong Yu et.al. |
2504.02310 |
null |
2025-04-03 |
Measurement of LLM’s Philosophies of Human Nature |
Minheng Ni et.al. |
2504.02304 |
link |
2025-04-03 |
Parallel Market Environments for FinRL Contests |
Keyi Wang et.al. |
2504.02281 |
null |
2025-04-03 |
LLM-Guided Evolution: An Autonomous Model Optimization for Object Detection |
YiMing Yu et.al. |
2504.02280 |
null |
2025-04-03 |
Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models |
Hung Le et.al. |
2504.02273 |
null |
2025-04-03 |
MinkOcc: Towards real-time label-efficient semantic occupancy prediction |
Samuel Sze et.al. |
2504.02270 |
null |
2025-04-03 |
MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism |
Ruidong Zhu et.al. |
2504.02263 |
null |
2025-04-03 |
LLMs as Deceptive Agents: How Role-Based Prompting Induces Semantic Ambiguity in Puzzle Tasks |
Seunghyun Yoo et.al. |
2504.02254 |
null |
2025-04-03 |
LLM Social Simulations Are a Promising Research Method |
Jacy Reese Anthis et.al. |
2504.02234 |
null |
2025-04-03 |
The Plot Thickens: Quantitative Part-by-Part Exploration of MLLM Visualization Literacy |
Matheus Valentim et.al. |
2504.02217 |
null |
2025-04-03 |
LLM-Augmented Graph Neural Recommenders: Integrating User Reviews |
Hiroki Kanezashi et.al. |
2504.02195 |
null |
2025-04-03 |
More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment |
Yifan Wang et.al. |
2504.02193 |
null |
2025-04-02 |
A Survey of Scaling in Large Language Model Reasoning |
Zihan Chen et.al. |
2504.02181 |
null |
2025-04-02 |
Subasa – Adapting Language Models for Low-resourced Offensive Language Detection in Sinhala |
Shanilka Haturusinghe et.al. |
2504.02178 |
null |
2025-04-02 |
Responsible Innovation: A Strategic Framework for Financial LLM Integration |
Ahmadreza Tavasoli et.al. |
2504.02165 |
null |
2025-04-02 |
OmniCellTOSG: The First Cell Text-Omic Signaling Graphs Dataset for Joint LLM and GNN Modeling |
Heming Zhang et.al. |
2504.02148 |
link |
2025-04-02 |
LL4G: Self-Supervised Dynamic Optimization for Graph-Based Personality Detection |
Lingzhi Shen et.al. |
2504.02146 |
null |
2025-04-02 |
On Simulation-Guided LLM-based Code Generation for Safe Autonomous Driving Software |
Ali Nouri et.al. |
2504.02141 |
null |
2025-04-02 |
One Pic is All it Takes: Poisoning Visual Document Retrieval Augmented Generation with a Single Image |
Ezzeldin Shereen et.al. |
2504.02132 |
null |
2025-04-02 |
Achieving Unanimous Consensus in Decision Making Using Multi-Agents |
Apurba Pokharel et.al. |
2504.02128 |
null |
2025-04-02 |
Efficient Model Selection for Time Series Forecasting via LLMs |
Wang Wei et.al. |
2504.02119 |
null |
2025-04-02 |
LLMPi: Optimizing LLMs for High-Throughput on Raspberry Pi |
Mahsa Ardakani et.al. |
2504.02118 |
null |
2025-04-02 |
PolyG: Effective and Efficient GraphRAG with Adaptive Graph Traversal |
Renjie Liu et.al. |
2504.02112 |
null |
2025-04-02 |
Exploring LLM Reasoning Through Controlled Prompt Variations |
Giannis Chatziveroglou et.al. |
2504.02111 |
link |
2025-04-02 |
ScreenAudit: Detecting Screen Reader Accessibility Errors in Mobile Apps Using Large Language Models |
Mingyuan Zhong et.al. |
2504.02110 |
null |
2025-04-02 |
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining |
Jeffrey Li et.al. |
2504.02107 |
link |
2025-04-02 |
ContrastScore: Towards Higher Quality, Less Biased, More Efficient Evaluation Metrics with Contrastive Evaluation |
Xiao Wang et.al. |
2504.02106 |
null |
2025-04-02 |
FlowDistill: Scalable Traffic Flow Prediction via Distillation from LLMs |
Chenyang Yu et.al. |
2504.02094 |
link |
2025-04-02 |
Increasing happiness through conversations with artificial intelligence |
Joseph Heffner et.al. |
2504.02091 |
null |
2025-04-02 |
Evolving Security in LLMs: A Study of Jailbreak Attacks and Defenses |
Zhengchun Shang et.al. |
2504.02080 |
null |
2025-04-02 |
Trapped by Expectations: Functional Fixedness in LLM-Enabled Chat Search |
Jiqun Liu et.al. |
2504.02074 |
null |
2025-04-02 |
From Text to Graph: Leveraging Graph Neural Networks for Enhanced Explainability in NLP |
Fabio Yáñez-Romero et.al. |
2504.02064 |
null |
2025-04-02 |
Aligned Better, Listen Better for Audio-Visual Large Language Models |
Yuxin Guo et.al. |
2504.02061 |
null |
2025-04-02 |
Towards Operationalizing Heterogeneous Data Discovery |
Jin Wang et.al. |
2504.02059 |
null |
2025-04-02 |
MageSQL: Enhancing In-context Learning for Text-to-SQL Applications with Large Language Models |
Chen Shen et.al. |
2504.02055 |
null |
2025-04-02 |
From Prompts to Templates: A Systematic Prompt Template Analysis for Real-world LLMapps |
Yuetian Mao et.al. |
2504.02052 |
null |
2025-04-02 |
WorldPrompter: Traversable Text-to-Scene Generation |
Zhaoyang Zhang et.al. |
2504.02045 |
null |
2025-04-02 |
Slot-Level Robotic Placement via Visual Imitation from Single Human Video |
Dandan Shan et.al. |
2504.01959 |
null |
2025-04-03 |
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step |
Hanyang Wang et.al. |
2504.01956 |
null |
2025-04-02 |
Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities |
Jing Liu et.al. |
2504.01954 |
null |
2025-04-02 |
The LLM Wears Prada: Analysing Gender Bias and Stereotypes through Online Shopping Data |
Massimiliano Luca et.al. |
2504.01951 |
null |
2025-04-02 |
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding |
Wasi Uddin Ahmad et.al. |
2504.01943 |
null |
2025-04-02 |
A Unified Approach to Analysis and Design of Denoising Markov Models |
Yinuo Ren et.al. |
2504.01938 |
null |
2025-04-02 |
Critical Thinking: Which Kinds of Complexity Govern Optimal Reasoning Length? |
Celine Lee et.al. |
2504.01935 |
link |
2025-04-02 |
A thorough benchmark of automatic text classification: From traditional approaches to large language models |
Washington Cunha et.al. |
2504.01930 |
link |
2025-04-02 |
Gen-C: Populating Virtual Worlds with Generative Crowds |
Andreas Panayiotou et.al. |
2504.01924 |
null |
2025-04-02 |
Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation |
Baban Gain et.al. |
2504.01919 |
null |
2025-04-02 |
Advancing AI-Scientist Understanding: Making LLM Think Like a Physicist with Interpretable Reasoning |
Yinggan Xu et.al. |
2504.01911 |
null |
2025-04-02 |
Multi-fidelity Parameter Estimation Using Conditional Diffusion Models |
Caroline Tatsuoka et.al. |
2504.01894 |
null |
2025-04-02 |
TransientTables: Evaluating LLMs’ Reasoning on Temporally Evolving Semi-structured Tables |
Abhilash Shankarampeta et.al. |
2504.01879 |
null |
2025-04-02 |
Interpreting Emergent Planning in Model-Free Reinforcement Learning |
Thomas Bush et.al. |
2504.01871 |
null |
2025-04-02 |
From Code Generation to Software Testing: AI Copilot with Context-Based RAG |
Yuchen Wang et.al. |
2504.01866 |
null |
2025-04-02 |
Cross-Lingual Consistency: A Novel Inference Framework for Advancing Reasoning in Large Language Models |
Zhiwei Yu et.al. |
2504.01857 |
null |
2025-04-02 |
Code Red! On the Harmfulness of Applying Off-the-shelf Large Language Models to Programming Tasks |
Ali Al-Kaswan et.al. |
2504.01850 |
null |
2025-04-02 |
BOGausS: Better Optimized Gaussian Splatting |
Stéphane Pateux et.al. |
2504.01844 |
null |
2025-04-02 |
LARGE: Legal Retrieval Augmented Generation Evaluation Tool |
Minhu Park et.al. |
2504.01840 |
link |
2025-04-02 |
YourBench: Easy Custom Evaluation Sets for Everyone |
Sumuk Shashidhar et.al. |
2504.01833 |
link |
2025-04-02 |
Spatial-R1: Enhancing MLLMs in Video Spatial Reasoning |
Kun Ouyang et.al. |
2504.01805 |
link |
2025-04-02 |
Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training |
Zhijun Wang et.al. |
2504.01801 |
link |
2025-04-02 |
UniViTAR: Unified Vision Transformer with Native Resolution |
Limeng Qiao et.al. |
2504.01792 |
null |
2025-04-02 |
OpenThaiGPT 1.6 and R1: Thai-Centric Open Source and Reasoning Large Language Models |
Sumeth Yuenyong et.al. |
2504.01789 |
null |
2025-04-02 |
BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing |
Yunqi Gu et.al. |
2504.01786 |
link |
2025-04-02 |
Leveraging Embedding Techniques in Multimodal Machine Learning for Mental Illness Assessment |
Abdelrahaman A. Hassan et.al. |
2504.01767 |
null |
2025-04-02 |
AdPO: Enhancing the Adversarial Robustness of Large Vision-Language Models with Preference Optimization |
Chaohu Liu et.al. |
2504.01735 |
null |
2025-04-03 |
InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation |
Bowen Cao et.al. |
2504.01707 |
null |
2025-04-02 |
ToM-RL: Reinforcement Learning Unlocks Theory of Mind in Small LLMs |
Yi-Long Lu et.al. |
2504.01698 |
link |
2025-04-02 |
System Level Synthesis for Affine Control Policies: Model Based and Data-Driven Settings |
Lukas Schüepp et.al. |
2504.01677 |
link |
2025-04-03 |
Testing Low-Resource Language Support in LLMs Using Language Proficiency Exams: the Case of Luxembourgish |
Cedric Lothritz et.al. |
2504.01667 |
null |
2025-04-02 |
Q-Adapt: Adapting LMM for Visual Quality Assessment with Progressive Instruction Tuning |
Yiting Lu et.al. |
2504.01655 |
link |
2025-04-02 |
FlowR: Flowing from Sparse to Dense 3D Reconstructions |
Tobias Fischer et.al. |
2504.01647 |
null |
2025-04-02 |
Proposition of Affordance-Driven Environment Recognition Framework Using Symbol Networks in Large Language Models |
Kazuma Arii et.al. |
2504.01644 |
null |
2025-04-02 |
LLM-mediated Dynamic Plan Generation with a Multi-Agent Approach |
Reo Abe et.al. |
2504.01637 |
null |
2025-04-02 |
Horizon Scans can be accelerated using novel information retrieval and artificial intelligence tools |
Lena Schmidt et.al. |
2504.01627 |
null |
2025-04-02 |
Comment Staytime Prediction with LLM-enhanced Comment Understanding |
Changshuo Zhang et.al. |
2504.01602 |
link |
2025-04-02 |
Integrating experimental feedback improves generative models for biological sequences |
Francesco Calvanese et.al. |
2504.01593 |
null |
2025-04-03 |
Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval |
Adriano Fragomeni et.al. |
2504.01591 |
null |
2025-04-02 |
Building Knowledge from Interactions: An LLM-Based Architecture for Adaptive Tutoring and Social Reasoning |
Luca Garello et.al. |
2504.01588 |
null |
2025-04-02 |
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation |
Aleksander Plocharski et.al. |
2504.01571 |
null |
2025-04-02 |
GPT Adoption and the Impact of Disclosure Policies |
Cathy Yang et.al. |
2504.01566 |
null |
2025-04-02 |
Bhakti: A Lightweight Vector Database Management System for Endowing Large Language Models with Semantic Search Capabilities and Memory |
Zihao Wu et.al. |
2504.01553 |
link |
2025-04-02 |
Representation Bending for Large Language Model Safety |
Ashkan Yousefpour et.al. |
2504.01550 |
link |
2025-04-02 |
Semi-Supervised Biomedical Image Segmentation via Diffusion Models and Teacher-Student Co-Training |
Luca Ciampi et.al. |
2504.01547 |
link |
2025-04-02 |
Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation |
Amanda Myntti et.al. |
2504.01542 |
null |
2025-04-02 |
Hyperbolic Diffusion Recommender Model |
Meng Yuan et.al. |
2504.01541 |
null |
2025-04-02 |
LightDefense: A Lightweight Uncertainty-Driven Defense against Jailbreaks via Shifted Token Distribution |
Zhuoran Yang et.al. |
2504.01533 |
null |
2025-04-02 |
Adapting Knowledge Prompt Tuning for Enhanced Automated Program Repair |
Xuemeng Cai et.al. |
2504.01523 |
link |
2025-04-02 |
Redefining technology for indigenous languages |
Silvia Fernandez-Sabido et.al. |
2504.01522 |
null |
2025-04-02 |
Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model |
Jincheng Zhong et.al. |
2504.01521 |
link |
2025-04-02 |
Chain of Correction for Full-text Speech Recognition with Large Language Models |
Zhiyuan Tang et.al. |
2504.01519 |
null |
2025-04-02 |
PROPHET: An Inferable Future Forecasting Benchmark with Causal Intervened Likelihood Estimation |
Zhengwei Tao et.al. |
2504.01509 |
link |
2025-04-02 |
Are Autonomous Web Agents Good Testers? |
Antoine Chevrot et.al. |
2504.01495 |
null |
2025-04-02 |
ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction |
Yuejiao Su et.al. |
2504.01472 |
null |
2025-04-02 |
A Prefixed Patch Time Series Transformer for Two-Point Boundary Value Problems in Three-Body Problems |
Akira Hatakeyama et.al. |
2504.01464 |
null |
2025-04-03 |
GeoRAG: A Question-Answering Approach from a Geographical Perspective |
Jian Wang et.al. |
2504.01458 |
null |
2025-04-02 |
LLM-VPRF: Large Language Model Based Vector Pseudo Relevance Feedback |
Hang Li et.al. |
2504.01448 |
null |
2025-04-02 |
Enabling Systematic Generalization in Abstract Spatial Reasoning through Meta-Learning for Compositionality |
Philipp Mondorf et.al. |
2504.01445 |
link |
2025-04-02 |
PiCo: Jailbreaking Multimodal Large Language Models via $\textbf{Pi}$ctorial $\textbf{Co}$de Contextualization |
Aofan Liu et.al. |
2504.01444 |
null |
2025-04-02 |
Refining Interactions: Enhancing Anisotropy in Graph Neural Networks with Language Semantics |
Zhaoxing Li et.al. |
2504.01429 |
null |
2025-04-02 |
Dynamic Incentive Strategies for Smart EV Charging Stations: An LLM-Driven User Digital Twin Approach |
Yichen Sun et.al. |
2504.01423 |
null |
2025-04-02 |
FAIRE: Assessing Racial and Gender Bias in AI-Driven Resume Evaluations |
Athena Wen et.al. |
2504.01420 |
link |
2025-04-02 |
LLM4SZZ: Enhancing SZZ Algorithm with Context-Enhanced Assessment on Large Language Models |
Lingxiao Tang et.al. |
2504.01404 |
null |
2025-04-02 |
Generative Retrieval and Alignment Model: A New Paradigm for E-commerce Retrieval |
Ming Pang et.al. |
2504.01403 |
null |
2025-04-02 |
ToolACE-R: Tool Learning with Adaptive Self-Refinement |
Xingshan Zeng et.al. |
2504.01400 |
null |
2025-04-02 |
An Illusion of Progress? Assessing the Current State of Web Agents |
Tianci Xue et.al. |
2504.01382 |
link |
2025-04-02 |
UniFault: A Fault Diagnosis Foundation Model from Bearing Data |
Emadeldeen Eldele et.al. |
2504.01373 |
null |
2025-04-02 |
Slow-Fast Architecture for Video Multi-Modal Large Language Models |
Min Shi et.al. |
2504.01328 |
link |
2025-04-02 |
Adaptive Rectification Sampling for Test-Time Compute Scaling |
Zhendong Tan et.al. |
2504.01317 |
link |
2025-04-02 |
Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks |
Jiawei Wang et.al. |
2504.01308 |
link |
2025-04-02 |
Real-time Ad retrieval via LLM-generative Commercial Intention for Sponsored Search Advertising |
Tongtong Liu et.al. |
2504.01304 |
null |
2025-04-02 |
Extracting Formal Specifications from Documents Using LLMs for Automated Testing |
Hui Li et.al. |
2504.01294 |
link |
2025-04-02 |
Prompt-Reverse Inconsistency: LLM Self-Inconsistency Beyond Generative Randomness and Prompt Paraphrasing |
Jihyun Janice Ahn et.al. |
2504.01282 |
null |
2025-04-03 |
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding |
Sakhinana Sagar Srinivas et.al. |
2504.01281 |
null |
2025-04-02 |
Strategize Globally, Adapt Locally: A Multi-Turn Red Teaming Agent with Dual-Level Learning |
Si Chen et.al. |
2504.01278 |
null |
2025-04-02 |
Facilitating Instructors-LLM Collaboration for Problem Design in Introductory Programming Classrooms |
Muntasir Hoq et.al. |
2504.01259 |
null |
2025-04-01 |
Grade Guard: A Smart System for Short Answer Automated Grading |
Niharika Dadu et.al. |
2504.01253 |
null |
2025-04-01 |
Plan-and-Act using Large Language Models for Interactive Agreement |
Kazuhiro Sasabuchi et.al. |
2504.01252 |
null |
2025-04-01 |
Automated Factual Benchmarking for In-Car Conversational Systems using Large Language Models |
Rafael Giebisch et.al. |
2504.01248 |
null |
2025-04-01 |
Catastrophic Forgetting in LLMs: A Comparative Analysis Across Language Tasks |
Naimul Haque et.al. |
2504.01241 |
null |
2025-04-01 |
Towards Resilient Federated Learning in CyberEdge Networks: Recent Advances and Future Trends |
Kai Li et.al. |
2504.01240 |
null |
2025-04-01 |
Prompting Forgetting: Unlearning in GANs via Textual Guidance |
Piyush Nagasubramaniam et.al. |
2504.01218 |
null |
2025-04-01 |
Detecting PTSD in Clinical Interviews: A Comparative Analysis of NLP Methods and Large Language Models |
Feng Chen et.al. |
2504.01216 |
null |
2025-04-01 |
Articulated Kinematics Distillation from Video Diffusion Models |
Xuan Li et.al. |
2504.01204 |
null |
2025-04-01 |
Medical large language models are easily distracted |
Krithik Vishwanath et.al. |
2504.01201 |
link |
2025-04-01 |
$μ$KE: Matryoshka Unstructured Knowledge Editing of Large Language Models |
Zian Su et.al. |
2504.01196 |
null |
2025-04-01 |
Predicting Field Experiments with Large Language Models |
Yaoyu Chen et.al. |
2504.01167 |
null |
2025-04-01 |
Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB |
Anas Dorbani et.al. |
2504.01157 |
null |
2025-04-01 |
Catch Me if You Search: When Contextual Web Search Results Affect the Detection of Hallucinations |
Mahjabin Nahar et.al. |
2504.01153 |
link |
2025-04-01 |
MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs) |
Bikash Saha et.al. |
2504.01145 |
link |
2025-04-01 |
Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs’ Metacognitive Cultural Intelligence with CQ-Bench |
Ziyi Liu et.al. |
2504.01127 |
link |
2025-04-01 |
ShieldGemma 2: Robust and Tractable Image Content Moderation |
Wenjun Zeng et.al. |
2504.01081 |
null |
2025-04-01 |
MixerMDM: Learnable Composition of Human Motion Diffusion Models |
Pablo Ruiz-Ponce et.al. |
2504.01019 |
null |
2025-04-01 |
Self-Routing RAG: Binding Selective Retrieval with Knowledge Verbalization |
Di Wu et.al. |
2504.01018 |
null |
2025-04-01 |
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction |
Junhao Cheng et.al. |
2504.01014 |
link |
2025-04-01 |
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning |
Nishad Singhi et.al. |
2504.01005 |
null |
2025-04-01 |
Token embeddings violate the manifold hypothesis |
Michael Robinson et.al. |
2504.01002 |
null |
2025-04-01 |
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization |
Siyuan Li et.al. |
2504.00999 |
null |
2025-04-01 |
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs |
Juncheng Wu et.al. |
2504.00993 |
link |
2025-03-31 |
Consistent Subject Generation via Contrastive Instantiated Concepts |
Lee Hsin-Ying et.al. |
2503.24387 |
null |
2025-03-31 |
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation |
Shengqiong Wu et.al. |
2503.24379 |
null |
2025-03-31 |
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models |
Rui Wang et.al. |
2503.24377 |
link |
2025-03-31 |
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 |
Yi Chen et.al. |
2503.24376 |
link |
2025-03-31 |
Effectively Controlling Reasoning Models through Thinking Intervention |
Tong Wu et.al. |
2503.24370 |
null |
2025-03-31 |
Adapting Vision Foundation Models for Real-time Ultrasound Image Segmentation |
Xiaoran Zhang et.al. |
2503.24368 |
null |
2025-03-31 |
ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion |
Rana Muhammad Shahroz Khan et.al. |
2503.24354 |
null |
2025-03-31 |
PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks |
Fang Yan et.al. |
2503.24345 |
null |
2025-03-31 |
Can Test-Time Scaling Improve World Foundation Model? |
Wenyan Cong et.al. |
2503.24320 |
link |
2025-03-31 |
BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models |
Alok Abhishek et.al. |
2503.24310 |
null |
2025-03-31 |
A Systematic Evaluation of LLM Strategies for Mental Health Text Analysis: Fine-tuning vs. Prompt Engineering vs. RAG |
Arshia Kermani et.al. |
2503.24307 |
null |
2025-03-31 |
Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning |
Jiacheng Lin et.al. |
2503.24289 |
link |
2025-03-31 |
Style Quantization for Data-Efficient GAN Training |
Jian Wang et.al. |
2503.24282 |
null |
2025-03-31 |
Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality |
Sewoong Lee et.al. |
2503.24277 |
link |
2025-03-31 |
Enhancing Large Language Models (LLMs) for Telecommunications using Knowledge Graphs and Retrieval-Augmented Generation |
Dun Yuan et.al. |
2503.24245 |
null |
2025-03-31 |
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models |
Qiyuan Zhang et.al. |
2503.24235 |
link |
2025-03-31 |
Pre-training with 3D Synthetic Data: Learning 3D Point Cloud Instance Segmentation from 3D Synthetic Scenes |
Daichi Otsuka et.al. |
2503.24229 |
null |
2025-03-31 |
Synthetic News Generation for Fake News Classification |
Abdul Sittar et.al. |
2503.24206 |
null |
2025-03-31 |
TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers’ Guidance |
Jingxian Xu et.al. |
2503.24198 |
null |
2025-03-31 |
Text2Tracks: Prompt-based Music Recommendation via Generative Retrieval |
Enrico Palumbo et.al. |
2503.24193 |
null |
2025-03-31 |
Output Constraints as Attack Surface: Exploiting Structured Generation to Bypass LLM Safety Mechanisms |
Shuoming Zhang et.al. |
2503.24191 |
null |
2025-03-31 |
Foundation Models For Seismic Data Processing: An Extensive Review |
Fabian Fuchs et.al. |
2503.24166 |
link |
2025-03-31 |
LLM4FS: Leveraging Large Language Models for Feature Selection and How to Improve It |
Jianhao Li et.al. |
2503.24157 |
null |
2025-03-31 |
AI-Assisted Colonoscopy: Polyp Detection and Segmentation using Foundation Models |
Uxue Delaquintana-Aramendi et.al. |
2503.24138 |
link |
2025-03-31 |
It’s a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data |
Dominik Schnaus et.al. |
2503.24129 |
link |
2025-04-01 |
TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection |
Zhiming Ma et.al. |
2503.24115 |
link |
2025-03-31 |
PolypSegTrack: Unified Foundation Model for Colonoscopy Video Analysis |
Anwesa Choudhuri et.al. |
2503.24108 |
null |
2025-03-31 |
Is LLM the Silver Bullet to Low-Resource Languages Machine Translation? |
Yewei Song et.al. |
2503.24102 |
null |
2025-03-31 |
TransMamba: Flexibly Switching between Transformer and Mamba |
Yixing Li et.al. |
2503.24067 |
null |
2025-03-31 |
Artificial Conversations, Real Results: Fostering Language Detection with Synthetic Data |
Fatemeh Mohammadi et.al. |
2503.24062 |
null |
2025-03-31 |
ReaLM: Reliable and Efficient Large Language Model Inference with Statistical Algorithm-Based Fault Tolerance |
Tong Xie et.al. |
2503.24053 |
link |
2025-04-01 |
A Deep Learning Framework for the Electronic Structure of Water: Towards a Universal Model |
Xinyuan Liang et.al. |
2503.24050 |
null |
2025-03-31 |
Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents |
Shuo Ren et.al. |
2503.24047 |
null |
2025-03-31 |
IntelliCircos: A Data-driven and AI-powered Authoring Tool for Circos Plots |
Mingyang Gu et.al. |
2503.24021 |
null |
2025-03-31 |
H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding |
Qi Wu et.al. |
2503.24008 |
null |
2025-03-31 |
Rethinking Key-Value Cache Compression Techniques for Large Language Model Serving |
Wei Gao et.al. |
2503.24000 |
link |
2025-03-31 |
SALT: A Flexible Semi-Automatic Labeling Tool for General LiDAR Point Clouds with Cross-Scene Adaptability and 4D Consistency |
Yanbo Wang et.al. |
2503.23980 |
link |
2025-04-01 |
Local Information Matters: Inference Acceleration For Grounded Conversation Generation Models Through Adaptive Local-Aware Token Pruning |
Bizhe Bai et.al. |
2503.23959 |
null |
2025-03-31 |
Green MLOps to Green GenOps: An Empirical Study of Energy Consumption in Discriminative and Generative AI Operations |
Adrián Sánchez-Mompó et.al. |
2503.23934 |
null |
2025-03-31 |
Model Hemorrhage and the Robustness Limits of Large Language Models |
Ziyang Ma et.al. |
2503.23924 |
null |
2025-03-31 |
Entropy-Based Adaptive Weighting for Self-Training |
Xiaoxuan Wang et.al. |
2503.23913 |
link |
2025-03-31 |
HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment |
Zhichao Liao et.al. |
2503.23907 |
null |
2025-03-31 |
Rubrik’s Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset |
Diana Galvan-Sosa et.al. |
2503.23899 |
null |
2025-03-31 |
Better wit than wealth: Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement |
Yuqiao Tan et.al. |
2503.23895 |
link |
2025-03-31 |
SchemaAgent: A Multi-Agents Framework for Generating Relational Database Schema |
Qin Wang et.al. |
2503.23886 |
link |
2025-03-31 |
GenSwarm: Scalable Multi-Robot Code-Policy Generation and Deployment via Language Models |
Wenkang Ji et.al. |
2503.23875 |
link |
2025-03-31 |
Exploring In-Context Learning Capabilities of ChatGPT for Pathological Speech Detection |
Mahdi Amiri et.al. |
2503.23873 |
null |
2025-03-31 |
Communication-Efficient and Personalized Federated Foundation Model Fine-Tuning via Tri-Matrix Adaptation |
Yongle Li et.al. |
2503.23869 |
null |
2025-04-01 |
Evaluating small vision-language models as AI assistants for radio astronomical source analysis tasks |
S. Riggi et.al. |
2503.23859 |
link |
2025-03-31 |
FlexiMo: A Flexible Remote Sensing Foundation Model |
Xuyang Li et.al. |
2503.23844 |
null |
2025-03-31 |
OrchMLLM: Orchestrate Multimodal Data with Batch Post-Balancing to Accelerate Multimodal Large Language Model Training |
Yijie Zheng et.al. |
2503.23830 |
null |
2025-04-01 |
Crossing the Reward Bridge: Expanding RL with Verifiable Rewards Across Diverse Domains |
Yi Su et.al. |
2503.23829 |
null |
2025-03-31 |
Aud-Sur: An Audio Analyzer Assistant for Audio Surveillance Applications |
Phat Lam et.al. |
2503.23827 |
null |
2025-03-31 |
Conformal uncertainty quantification to evaluate predictive fairness of foundation AI model for skin lesion classes across patient demographics |
Swarnava Bhattacharyya et.al. |
2503.23819 |
null |
2025-03-31 |
MVDRAM: Enabling GeMV Execution in Unmodified DRAM for Low-Bit LLM Acceleration |
Tatsuya Kubo et.al. |
2503.23817 |
null |
2025-04-01 |
Did ChatGPT or Copilot use alter the style of internet news headlines? A time series regression analysis |
Chris Brogly et.al. |
2503.23811 |
null |
2025-03-31 |
Adaptive Attention-Based Model for 5G Radio-based Outdoor Localization |
Ilayda Yaman et.al. |
2503.23810 |
null |
2025-03-31 |
Get the Agents Drunk: Memory Perturbations in Autonomous Agent-based Recommender Systems |
Shiyi Yang et.al. |
2503.23804 |
null |
2025-03-31 |
Adaptive Layer-skipping in Pre-trained LLMs |
Xuan Luo et.al. |
2503.23798 |
null |
2025-04-01 |
On-device Sora: Enabling Training-Free Diffusion-based Text-to-Video Generation for Mobile Devices |
Bosung Kim et.al. |
2503.23796 |
link |
2025-03-31 |
LLMigrate: Transforming “Lazy” Large Language Models into Efficient Source Code Migrators |
Yuchen Liu et.al. |
2503.23791 |
null |
2025-03-31 |
MGD-SAM2: Multi-view Guided Detail-enhanced Segment Anything Model 2 for High-Resolution Class-agnostic Segmentation |
Haoran Shen et.al. |
2503.23786 |
link |
2025-03-31 |
ObfusQate: Unveiling the First Quantum Program Obfuscation Framework |
Nilhil Bartake et.al. |
2503.23785 |
null |
2025-03-31 |
DebFlow: Automating Agent Creation via Agent Debate |
Jinwei Su et.al. |
2503.23781 |
null |
2025-03-31 |
WinoWhat: A Parallel Corpus of Paraphrased WinoGrande Sentences with Common Sense Categorization |
Ine Gevers et.al. |
2503.23779 |
null |
2025-03-31 |
CONGRAD: Conflicting Gradient Filtering for Multilingual Preference Alignment |
Jiangnan Li et.al. |
2503.23777 |
null |
2025-03-31 |
XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery? |
Fengxiang Wang et.al. |
2503.23771 |
null |
2025-03-31 |
Biologically Inspired Spiking Diffusion Model with Adaptive Lateral Selection Mechanism |
Linghao Feng et.al. |
2503.23767 |
null |
2025-03-31 |
Accelerating High-Efficiency Organic Photovoltaic Discovery via Pretrained Graph Neural Networks and Generative Reinforcement Learning |
Jiangjie Qiu et.al. |
2503.23766 |
null |
2025-03-31 |
STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding? |
Yun Li et.al. |
2503.23765 |
null |
2025-03-31 |
UniSep: Universal Target Audio Separation with Language Models at Scale |
Yuanyuan Wang et.al. |
2503.23762 |
null |
2025-03-31 |
Short-video Propagation Influence Rating: A New Real-world Dataset and A New Large Graph Model |
Dizhan Xue et.al. |
2503.23746 |
link |
2025-03-31 |
LANID: LLM-assisted New Intent Discovery |
Lu Fan et.al. |
2503.23740 |
link |
2025-03-31 |
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization |
Yiyang Du et.al. |
2503.23733 |
link |
2025-03-31 |
Detecting Functional Bugs in Smart Contracts through LLM-Powered and Bug-Oriented Composite Analysis |
Binbin Zhao et.al. |
2503.23718 |
null |
2025-03-31 |
HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation |
Kun Liu et.al. |
2503.23715 |
null |
2025-03-31 |
Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models |
Youmi Ma et.al. |
2503.23714 |
null |
2025-03-31 |
A Conceptual Framework for Human-AI Collaborative Genome Annotation |
Xiaomei Li et.al. |
2503.23691 |
null |
2025-03-31 |
Mapping Geopolitical Bias in 11 Large Language Models: A Bilingual, Dual-Framing Analysis of U.S.-China Tensions |
William Guey et.al. |
2503.23688 |
null |
2025-03-31 |
Large Language Models Pass the Turing Test |
Cameron R. Jones et.al. |
2503.23674 |
null |
2025-03-31 |
WHERE and WHICH: Iterative Debate for Biomedical Synthetic Data Augmentation |
Zhengyi Zhao et.al. |
2503.23673 |
null |
2025-03-31 |
Context-Independent OCR with Multimodal LLMs: Effects of Image Resolution and Visual Complexity |
Kotaro Inoue et.al. |
2503.23667 |
null |
2025-03-31 |
DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning Guidance |
Junjie Zheng et.al. |
2503.23660 |
null |
2025-04-01 |
GIScience in the Era of Artificial Intelligence: A Research Agenda Towards Autonomous GIS |
Zhenlong Li et.al. |
2503.23633 |
null |
2025-03-30 |
Language-Guided Trajectory Traversal in Disentangled Stable Diffusion Latent Space for Factorized Medical Image Generation |
Zahra TehraniNasab et.al. |
2503.23623 |
null |
2025-03-30 |
Leveraging Vision-Language Foundation Models to Reveal Hidden Image-Attribute Relationships in Medical Imaging |
Amar Kumar et.al. |
2503.23618 |
null |
2025-03-30 |
Graph-Eq: Discovering Mathematical Equations using Graph Generative Models |
Nisal Ranasinghe et.al. |
2503.23617 |
null |
2025-03-30 |
Make Autoregressive Great Again: Diffusion-Free Graph Generation with Next-Scale Prediction |
Samuel Belkadi et.al. |
2503.23612 |
null |
2025-03-30 |
Exploring GPT-4 for Robotic Agent Strategy with Real-Time State Feedback and a Reactive Behaviour Framework |
Thomas O’Brien et.al. |
2503.23601 |
null |
2025-03-30 |
When LLM Therapists Become Salespeople: Evaluating Large Language Models for Ethical Motivational Interviewing |
Haein Kong et.al. |
2503.23566 |
null |
2025-03-30 |
Modelling the impact of phenotypic heterogeneity on cell migration: a continuum framework derived from individual-based principles |
Rebecca M. Crossley et.al. |
2503.23545 |
link |
2025-03-30 |
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages |
Xabier de Zuazo et.al. |
2503.23542 |
link |
2025-03-30 |
Enhancing Creative Generation on Stable Diffusion-based Models |
Jiyeon Han et.al. |
2503.23538 |
link |
2025-03-30 |
Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models |
Haochen Liu et.al. |
2503.23523 |
link |
2025-03-30 |
If an LLM Were a Character, Would It Know Its Own Story? Evaluating Lifelong Learning in LLMs |
Siqi Fan et.al. |
2503.23514 |
null |
2025-03-30 |
RARE: Retrieval-Augmented Reasoning Modeling |
Zhengren Wang et.al. |
2503.23513 |
link |
2025-03-30 |
SCORE: Story Coherence and Retrieval Enhancement for AI Narratives |
Qiang Yi et.al. |
2503.23512 |
null |
2025-03-30 |
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model |
Jannik Endres et.al. |
2503.23502 |
link |
2025-03-30 |
DNA and Human Language: Epigenetic Memory and Redundancy in Linear Sequence |
Li Yang et.al. |
2503.23494 |
null |
2025-03-30 |
POINT$^{2}$: A Polymer Informatics Training and Testing Database |
Jiaxin Xu et.al. |
2503.23491 |
link |
2025-03-28 |
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning |
Weiqi Li et.al. |
2503.22679 |
link |
2025-03-28 |
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness |
Ruining Li et.al. |
2503.22677 |
null |
2025-03-28 |
QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks? |
Belinda Z. Li et.al. |
2503.22674 |
link |
2025-03-28 |
Exploring the Effectiveness of Multi-stage Fine-tuning for Cross-encoder Re-rankers |
Francesca Pezzuti et.al. |
2503.22672 |
link |
2025-03-28 |
Unicorn: Text-Only Data Synthesis for Vision Language Model Training |
Xiaomin Yu et.al. |
2503.22655 |
link |
2025-03-28 |
Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users |
Antonia Karamolegkou et.al. |
2503.22610 |
null |
2025-03-28 |
On the Alignment of Post-Publication Reviews & Bibliometric and Altmetric Impact – A Case Study on Expert Statements from the Science Media Center Germany |
Dirk Tunger et.al. |
2503.22594 |
null |
2025-03-28 |
LLM-enabled Instance Model Generation |
Fengjunjie Pan et.al. |
2503.22587 |
null |
2025-03-28 |
Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish |
Kevin Cohen et.al. |
2503.22585 |
link |
2025-03-28 |
Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation |
Sarubi Thillainathan et.al. |
2503.22582 |
null |
2025-03-28 |
RELD: Regularization by Latent Diffusion Models for Image Restoration |
Pasquale Cascarano et.al. |
2503.22563 |
null |
2025-03-28 |
Niyama: Breaking the Silos of LLM Inference Serving |
Kanishk Goel et.al. |
2503.22562 |
null |
2025-03-28 |
Bridging the Dimensional Chasm: Uncover Layer-wise Dimensional Reduction in Transformers through Token Correlation |
Zhuo-Yang Song et.al. |
2503.22547 |
null |
2025-03-28 |
Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities |
Raman Dutt et.al. |
2503.22517 |
null |
2025-03-28 |
Assessing Foundation Models for Sea Ice Type Segmentation in Sentinel-1 SAR Imagery |
Samira Alkaee Taleghan et.al. |
2503.22516 |
null |
2025-03-28 |
Probabilistic Uncertain Reward Model: A Natural Generalization of Bradley-Terry Reward Model |
Wangtao Sun et.al. |
2503.22480 |
null |
2025-03-28 |
WorkTeam: Constructing Workflows from Natural Language with Multi-Agents |
Hanchao Liu et.al. |
2503.22473 |
null |
2025-03-28 |
Evaluating LLM-based Agents for Multi-Turn Conversations: A Survey |
Shengyue Guan et.al. |
2503.22458 |
null |
2025-03-28 |
Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning |
Abdullah Vanlioglu et.al. |
2503.22456 |
null |
2025-03-28 |
STADE: Standard Deviation as a Pruning Metric |
Diego Coello de Portugal Mecke et.al. |
2503.22451 |
link |
2025-03-28 |
CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching |
Zhonghao Jiang et.al. |
2503.22424 |
link |
2025-03-28 |
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis |
Jiangyong Huang et.al. |
2503.22420 |
link |
2025-03-28 |
Training Large Language Models for Advanced Typosquatting Detection |
Jackson Welch et.al. |
2503.22406 |
null |
2025-03-28 |
Generative Reliability-Based Design Optimization Using In-Context Learning Capabilities of Large Language Models |
Zhonglin Jiang et.al. |
2503.22401 |
null |
2025-03-28 |
GAITGen: Disentangled Motion-Pathology Impaired Gait Generative Model – Bringing Motion Generation to the Clinical Domain |
Vida Adeli et.al. |
2503.22397 |
null |
2025-03-28 |
Negation: A Pink Elephant in the Large Language Models’ Room? |
Tereza Vrabcová et.al. |
2503.22395 |
null |
2025-03-28 |
Supposedly Equivalent Facts That Aren’t? Entity Frequency in Pre-training Induces Asymmetry in LLMs |
Yuan He et.al. |
2503.22362 |
link |
2025-03-28 |
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation |
Hadrien Reynaud et.al. |
2503.22357 |
null |
2025-03-28 |
Firm or Fickle? Evaluating Large Language Models Consistency in Sequential Interactions |
Yubo Li et.al. |
2503.22353 |
null |
2025-03-28 |
Meta-LoRA: Meta-Learning LoRA Components for Domain-Aware ID Personalization |
Barış Batuhan Topal et.al. |
2503.22352 |
null |
2025-03-28 |
Using a Large Language Model as Design Material for an Interactive Museum Installation |
Maria Padilla Engstrøm et.al. |
2503.22345 |
null |
2025-03-28 |
SKDU at De-Factify 4.0: Natural Language Features for AI-Generated Text-Detection |
Shrikant Malviya et.al. |
2503.22338 |
link |
2025-03-28 |
A Refined Analysis of Massive Activations in LLMs |
Louis Owen et.al. |
2503.22329 |
link |
2025-03-28 |
Large Language Models Are Democracy Coders with Attitudes |
Nils B. Weidmann et.al. |
2503.22315 |
null |
2025-03-28 |
BanglAssist: A Bengali-English Generative AI Chatbot for Code-Switching and Dialect-Handling in Customer Service |
Francesco Kruk et.al. |
2503.22283 |
null |
2025-03-28 |
MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters |
Rrubaa Panchendrarajan et.al. |
2503.22280 |
null |
2025-03-28 |
Make Some Noise: Towards LLM audio reasoning and generation using sound tokens |
Shivam Mehta et.al. |
2503.22275 |
null |
2025-03-28 |
Beyond the Script: Testing LLMs for Authentic Patient Communication Styles in Healthcare |
Anna Bodonhelyi et.al. |
2503.22250 |
null |
2025-03-28 |
FLAM: Foundation Model-Based Body Stabilization for Humanoid Locomotion and Manipulation |
Xianqi Zhang et.al. |
2503.22249 |
null |
2025-03-28 |
Agent-Centric Personalized Multiple Clustering with Multi-Modal LLMs |
Ziye Chen et.al. |
2503.22241 |
null |
2025-03-28 |
Integrating LLMs in Software Engineering Education: Motivators, Demotivators, and a Roadmap Towards a Framework for Finnish Higher Education Institutes |
Maryam Khan et.al. |
2503.22238 |
null |
2025-03-28 |
SCHNet: SAM Marries CLIP for Human Parsing |
Kunliang Liu et.al. |
2503.22237 |
null |
2025-03-28 |
CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving |
Yishen Ji et.al. |
2503.22231 |
null |
2025-03-28 |
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback |
Wei Shen et.al. |
2503.22230 |
null |
2025-03-28 |
DeepSound-V1: Start to Think Step-by-Step in the Audio Generation from Videos |
Yunming Liang et.al. |
2503.22208 |
null |
2025-03-28 |
Enhance Generation Quality of Flow Matching V2A Model via Multi-Step CoT-Like Guidance and Combined Preference Optimization |
Haomin Zhang et.al. |
2503.22200 |
null |
2025-03-28 |
EdgeInfinite: A Memory-Efficient Infinite-Context Transformer for Edge Devices |
Jiyu Chen et.al. |
2503.22196 |
null |
2025-03-28 |
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation |
Minho Park et.al. |
2503.22172 |
null |
2025-03-28 |
Reasoning of Large Language Models over Knowledge Graphs with Super-Relations |
Song Wang et.al. |
2503.22166 |
null |
2025-03-28 |
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models |
Zhanke Zhou et.al. |
2503.22165 |
link |
2025-03-28 |
PharmAgents: Building a Virtual Pharma with Large Language Model Agents |
Bowen Gao et.al. |
2503.22164 |
null |
2025-03-28 |
EgoToM: Benchmarking Theory of Mind Reasoning from Egocentric Videos |
Yuxuan Li et.al. |
2503.22152 |
link |
2025-03-28 |
Tokenization of Gaze Data |
Tim Rolff et.al. |
2503.22145 |
null |
2025-03-28 |
FRASE: Structured Representations for Generalizable SPARQL Query Generation |
Papa Abdou Karim Karou Diallo et.al. |
2503.22144 |
null |
2025-03-28 |
A Self-Supervised Learning of a Foundation Model for Analog Layout Design Automation |
Sungyu Jeong et.al. |
2503.22143 |
null |
2025-03-28 |
Score-Based Turbo Message Passing for Plug-and-Play Compressive Image Recovery |
Chang Cai et.al. |
2503.22140 |
null |
2025-03-28 |
Enhancing Dance-to-Music Generation via Negative Conditioning Latent Diffusion Model |
Changchang Sun et.al. |
2503.22138 |
null |
2025-03-28 |
Sharpe Ratio-Guided Active Learning for Preference Optimization in RLHF |
Syrine Belakaria et.al. |
2503.22137 |
null |
2025-03-28 |
Detecting Localized Deepfake Manipulations Using Action Unit-Guided Video Representations |
Tharun Anand et.al. |
2503.22121 |
null |
2025-03-28 |
Beyond Single-Sentence Prompts: Upgrading Value Alignment Benchmarks with Dialogues and Stories |
Yazhou Zhang et.al. |
2503.22115 |
null |
2025-03-28 |
Few-Shot Graph Out-of-Distribution Detection with LLMs |
Haoyan Xu et.al. |
2503.22097 |
null |
2025-03-28 |
Leveraging LLMs for Predicting Unknown Diagnoses from Clinical Notes |
Dina Albassam et.al. |
2503.22092 |
null |
2025-03-28 |
A Survey on Remote Sensing Foundation Models: From Vision to Multimodality |
Ziyue Huang et.al. |
2503.22081 |
link |
2025-03-28 |
Penrose Tiled Low-Rank Compression and Section-Wise Q&A Fine-Tuning: A General Framework for Domain-Specific Large Language Model Adaptation |
Chuan-Wei Kuo et.al. |
2503.22074 |
null |
2025-03-28 |
Arch-LLM: Taming LLMs for Neural Architecture Generation via Unsupervised Discrete Representation Learning |
Deshani Geethika Poddenige et.al. |
2503.22063 |
null |
2025-03-27 |
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models |
Chung-En Sun et.al. |
2503.22048 |
link |
2025-03-27 |
The Risks of Using Large Language Models for Text Annotation in Social Science Research |
Hao Lin et.al. |
2503.22040 |
null |
2025-03-27 |
Debate-Driven Multi-Agent LLMs for Phishing Email Detection |
Ngoc Tuong Vy Nguyen et.al. |
2503.22038 |
null |
2025-03-27 |
Cognitive Prompts Using Guilford’s Structure of Intellect Model |
Oliver Kramer et.al. |
2503.22036 |
null |
2025-03-27 |
AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification |
Earl Ranario et.al. |
2503.22019 |
link |
2025-03-27 |
Tune It Up: Music Genre Transfer and Prediction |
Fidan Samet et.al. |
2503.22008 |
link |
2025-03-27 |
BOOTPLACE: Bootstrapped Object Placement with Detection Transformers |
Hang Zhou et.al. |
2503.21991 |
link |
2025-03-27 |
Socially Constructed Treatment Plans: Analyzing Online Peer Interactions to Understand How Patients Navigate Complex Medical Conditions |
Madhusudan Basak et.al. |
2503.21986 |
null |
2025-03-27 |
Improving Equivariant Networks with Probabilistic Symmetry Breaking |
Hannah Lawrence et.al. |
2503.21985 |
null |
2025-03-27 |
RocketPPA: Ultra-Fast LLM-Based PPA Estimator at Code-Level Abstraction |
Armin Abdollahi et.al. |
2503.21971 |
null |
2025-03-27 |
Data-Agnostic Robotic Long-Horizon Manipulation with Vision-Language-Guided Closed-Loop Feedback |
Yuan Meng et.al. |
2503.21969 |
link |
2025-03-27 |
Benchmarking Deep Learning-Based Methods for Irradiance Nowcasting with Sky Images |
Lorenzo F. C. Varaschin et.al. |
2503.21966 |
null |
2025-03-27 |
Entropy-Aware Branching for Improved Mathematical Reasoning |
Xianzhi Li et.al. |
2503.21961 |
null |
2025-03-27 |
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad |
Ivo Petrov et.al. |
2503.21934 |
null |
2025-03-27 |
Multimodal Data Integration for Sustainable Indoor Gardening: Tracking Anyplant with Time Series Foundation Model |
Seyed Hamidreza Nabaei et.al. |
2503.21932 |
null |
2025-03-27 |
Local Normalization Distortion and the Thermodynamic Formalism of Decoding Strategies for Large Language Models |
Tom Kempton et.al. |
2503.21929 |
null |
2025-03-27 |
Hybrid Emotion Recognition: Enhancing Customer Interactions Through Acoustic and Textual Analysis |
Sahan Hewage Wewelwala et.al. |
2503.21927 |
null |
2025-03-27 |
AutoPsyC: Automatic Recognition of Psychodynamic Conflicts from Semi-structured Interviews with Large Language Models |
Sayed Muddashir Hossain et.al. |
2503.21911 |
null |
2025-03-27 |
AssistPDA: An Online Video Surveillance Assistant for Video Anomaly Prediction, Detection, and Analysis |
Zhiwei Yang et.al. |
2503.21904 |
null |
2025-03-27 |
OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment |
Hamed Babaei Giglou et.al. |
2503.21902 |
link |
2025-03-27 |
StarFlow: Generating Structured Workflow Outputs From Sketch Images |
Patrice Bechard et.al. |
2503.21889 |
null |
2025-03-27 |
RedditESS: A Mental Health Social Support Interaction Dataset – Understanding Effective Social Support to Refine AI-Driven Support Tools |
Zeyad Alghamdi et.al. |
2503.21888 |
null |
2025-03-27 |
Video-R1: Reinforcing Video Reasoning in MLLMs |
Kaituo Feng et.al. |
2503.21776 |
link |
2025-03-27 |
A Unified Image-Dense Annotation Generation Model for Underwater Scenes |
Hongkai Lin et.al. |
2503.21771 |
link |
2025-03-27 |
Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence |
Haolin Liu et.al. |
2503.21766 |
null |
2025-03-27 |
Exploring the Evolution of Physics Cognition in Video Generation: A Survey |
Minghui Lin et.al. |
2503.21765 |
link |
2025-03-27 |
Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video |
David Yifan Yao et.al. |
2503.21761 |
link |
2025-03-27 |
MemInsight: Autonomous Memory Augmentation for LLM Agents |
Rana Salama et.al. |
2503.21760 |
null |
2025-03-27 |
A Unified Framework for Diffusion Bridge Problems: Flow Matching and Schrödinger Matching into One |
Minyoung Kim et.al. |
2503.21756 |
null |
2025-03-27 |
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness |
Dian Zheng et.al. |
2503.21755 |
link |
2025-03-27 |
3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models |
Yuhan Zhang et.al. |
2503.21745 |
null |
2025-03-27 |
GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics |
Arsham Gholamzadeh Khoee et.al. |
2503.21735 |
null |
2025-03-29 |
Effective Skill Unlearning through Intervention and Abstention |
Yongce Li et.al. |
2503.21730 |
link |
2025-03-27 |
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment |
Souradip Chakraborty et.al. |
2503.21720 |
null |
2025-03-27 |
Enhancing Repository-Level Software Repair via Repository-Aware Knowledge Graphs |
Boyang Yang et.al. |
2503.21710 |
null |
2025-03-27 |
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data |
Zhiyuan Ma et.al. |
2503.21694 |
link |
2025-03-27 |
LLM-Gomoku: A Large Language Model-Based System for Strategic Gomoku with Self-Play and Reinforcement Learning |
Hui Wang et.al. |
2503.21683 |
null |
2025-03-27 |
JiraiBench: A Bilingual Benchmark for Evaluating Large Language Models’ Detection of Human Self-Destructive Behavior Content in Jirai Community |
Yunze Xiao et.al. |
2503.21679 |
null |
2025-03-27 |
How do language models learn facts? Dynamics, curricula and hallucinations |
Nicolas Zucchet et.al. |
2503.21676 |
null |
2025-03-27 |
Intelligent IoT Attack Detection Design via ODLLM with Feature Ranking-based Knowledge Base |
Satvik Verma et.al. |
2503.21674 |
link |
2025-03-27 |
A friendly introduction to triangular transport |
Maximilian Ramgraber et.al. |
2503.21673 |
null |
2025-03-27 |
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning |
Zhengxi Lu et.al. |
2503.21620 |
link |
2025-03-27 |
Evaluating book summaries from internal knowledge in Large Language Models: a cross-model and semantic consistency approach |
Javier Coronado-Blázquez et.al. |
2503.21613 |
null |
2025-03-27 |
GenEdit: Compounding Operators and Continuous Improvement to Tackle Text-to-SQL in the Enterprise |
Karime Maamari et.al. |
2503.21602 |
null |
2025-03-27 |
Prompt, Divide, and Conquer: Bypassing Large Language Model Safety Filters via Segmented and Distributed Prompt Processing |
Johan Wahréus et.al. |
2503.21598 |
null |
2025-03-27 |
Critical Iterative Denoising: A Discrete Generative Model Applied to Graphs |
Yoann Boget et.al. |
2503.21592 |
null |
2025-03-27 |
Cooking Task Planning using LLM and Verified by Graph Network |
Ryunosuke Takebayashi et.al. |
2503.21564 |
null |
2025-03-27 |
debug-gym: A Text-Based Environment for Interactive Debugging |
Xingdi Yuan et.al. |
2503.21557 |
null |
2025-03-27 |
SWI: Speaking with Intent in Large Language Models |
Yuwei Yin et.al. |
2503.21544 |
link |
2025-03-27 |
Combining Artificial Users and Psychotherapist Assessment to Evaluate Large Language Model-based Mental Health Chatbots |
Florian Onur Kuhlmeier et.al. |
2503.21540 |
null |
2025-03-27 |
Exploring the Energy Landscape of RBMs: Reciprocal Space Insights into Bosons, Hierarchical Learning and Symmetry Breaking |
J. Quetzalcóatl Toledo-Marin et.al. |
2503.21536 |
null |
2025-03-27 |
Uncertainty-aware Bayesian machine learning modelling of land cover classification |
Samuel Bilson et.al. |
2503.21510 |
null |
2025-03-27 |
Keyword-Oriented Multimodal Modeling for Euphemism Identification |
Yuxue Hu et.al. |
2503.21504 |
link |
2025-03-27 |
Double Blind Imaging with Generative Modeling |
Brett Levac et.al. |
2503.21501 |
null |
2025-03-27 |
OpenHuEval: Evaluating Large Language Model on Hungarian Specifics |
Haote Yang et.al. |
2503.21500 |
link |
2025-03-28 |
OmniVox: Zero-Shot Emotion Recognition with Omni-LLMs |
John Murzaku et.al. |
2503.21480 |
null |
2025-03-27 |
DeepRV: pre-trained spatial priors for accelerated disease mapping |
Jhonathan Navott et.al. |
2503.21473 |
null |
2025-03-27 |
Harnessing Chain-of-Thought Metadata for Task Routing and Adversarial Prompt Detection |
Ryan Marinelli et.al. |
2503.21464 |
link |
2025-03-27 |
Large Language Model Agent: A Survey on Methodology, Applications and Challenges |
Junyu Luo et.al. |
2503.21460 |
link |
2025-03-27 |
FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs |
Xiaoqin Wang et.al. |
2503.21457 |
link |
2025-03-27 |
CMADiff: Cross-Modal Aligned Diffusion for Controllable Protein Generation |
Changjian Zhou et.al. |
2503.21450 |
null |
2025-03-27 |
Towards Generating Realistic 3D Semantic Training Data for Autonomous Driving |
Lucas Nunes et.al. |
2503.21449 |
link |
2025-03-27 |
From Deep Learning to LLMs: A survey of AI in Quantitative Investment |
Bokai Cao et.al. |
2503.21422 |
null |
2025-03-28 |
Neuroplasticity in Artificial Intelligence – An Overview and Inspirations on Drop In & Out Learning |
Yupei Li et.al. |
2503.21419 |
null |
2025-03-27 |
Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap |
Tong Nie et.al. |
2503.21411 |
link |
2025-03-27 |
VALLR: Visual ASR Language Model for Lip Reading |
Marshall Thomas et.al. |
2503.21408 |
null |
2025-03-27 |
An evaluation of LLMs and Google Translate for translation of selected Indian languages via sentiment and semantic analyses |
Rohitash Chandra et.al. |
2503.21393 |
null |
2025-03-27 |
Controlling Large Language Model with Latent Actions |
Chengxing Jia et.al. |
2503.21383 |
link |
2025-03-27 |
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models |
Haoxiang Sun et.al. |
2503.21380 |
link |
2025-03-27 |
Generative Decoding for Quantum Error-correcting Codes |
Hanyan Cao et.al. |
2503.21374 |
null |
2025-03-27 |
From User Preferences to Optimization Constraints Using Large Language Models |
Manuela Sanguinetti et.al. |
2503.21360 |
null |
2025-03-27 |
Using large language models to produce literature reviews: Usages and systematic biases of microphysics parametrizations in 2699 publications |
Tianhang Zhang et.al. |
2503.21352 |
null |
2025-03-27 |
Fine-Tuning LLMs on Small Medical Datasets: Text Classification and Normalization Effectiveness on Cardiology reports and Discharge records |
Noah Losch et.al. |
2503.21349 |
null |
2025-03-27 |
Scalable Expectation Estimation with Subtractive Mixture Models |
Lena Zellinger et.al. |
2503.21346 |
null |
2025-03-27 |
Large Language Models for Traffic and Transportation Research: Methodologies, State of the Art, and Future Opportunities |
Yimo Yan et.al. |
2503.21330 |
null |
2025-03-27 |
Structural bias in three-dimensional autoregressive generative machine learning of organic molecules |
Zsuzsanna Koczor-Benda et.al. |
2503.21328 |
null |
2025-03-27 |
Tricking Retrievers with Influential Tokens: An Efficient Black-Box Corpus Poisoning Attack |
Cheng Wang et.al. |
2503.21315 |
null |
2025-03-27 |
InternVL-X: Advancing and Accelerating InternVL Series with Efficient Visual Token Compression |
Dongchen Lu et.al. |
2503.21307 |
link |
2025-03-27 |
R-PRM: Reasoning-Driven Process Reward Modeling |
Shuaijie She et.al. |
2503.21295 |
link |
2025-03-27 |
Reinforced Model Merging |
Jiaqi Han et.al. |
2503.21272 |
link |
2025-03-27 |
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition |
Yujie Liu et.al. |
2503.21248 |
null |
2025-03-27 |
DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation |
Haoyu Zhao et.al. |
2503.21246 |
link |
2025-03-27 |
Exploring the Rastall Gravity Cosmological Model using Gong-Zhang parameterization with Latest Observational Data and Deep Learning Techniques |
Vinod Kumar Bhardwaj et.al. |
2503.21243 |
null |
2025-03-27 |
Bias-Aware Agent: Enhancing Fairness in AI-Driven Knowledge Retrieval |
Karanbir Singh et.al. |
2503.21237 |
link |
2025-03-27 |
LLaVA-CMoE: Towards Continual Mixture of Experts for Large Vision-Language Models |
Hengyuan Zhao et.al. |
2503.21227 |
null |
2025-03-27 |
Efficient Learning for Entropy-regularized Markov Decision Processes via Multilevel Monte Carlo |
Matthieu Meunier et.al. |
2503.21224 |
null |
2025-03-27 |
Rethinking Graph Structure Learning in the Era of LLMs |
Zhihan Zhang et.al. |
2503.21223 |
null |
2025-03-27 |
GenFusion: Closing the Loop between Reconstruction and Generation via Videos |
Sibo Wu et.al. |
2503.21219 |
null |
2025-03-27 |
Resource-Efficient Federated Fine-Tuning Large Language Models for Heterogeneous Data |
Jun Liu et.al. |
2503.21213 |
null |
2025-03-27 |
FakeReasoning: Towards Generalizable Forgery Detection and Reasoning |
Yueying Gao et.al. |
2503.21210 |
null |
2025-03-27 |
PilotANN: Memory-Bounded GPU Acceleration for Vector Search |
Yuntao Gui et.al. |
2503.21206 |
link |
2025-03-27 |
Leveraging LLMs with Iterative Loop Structure for Enhanced Social Intelligence in Video Question Answering |
Erika Mori et.al. |
2503.21190 |
null |
2025-03-27 |
DGSUnet: An Improved Unet Model with DINO-Guided SAM2 for Multi-Scale Feature Collaboration |
Yimin Xu et.al. |
2503.21187 |
link |
2025-03-27 |
Integrating Large Language Models For Monte Carlo Simulation of Chemical Reaction Networks |
Sadikshya Gyawali et.al. |
2503.21178 |
null |
2025-03-27 |
Model as a Game: On Numerical and Spatial Consistency for Generative Games |
Jingye Chen et.al. |
2503.21172 |
null |
2025-03-27 |
Integrating Travel Behavior Forecasting and Generative Modeling for Predicting Future Urban Mobility and Spatial Transformations |
Eugene Denteh et.al. |
2503.21158 |
null |
2025-03-27 |
Embedding Domain-Specific Knowledge from LLMs into the Feature Engineering Pipeline |
João Eduardo Batista et.al. |
2503.21155 |
null |
2025-03-27 |
Expressive Timing in Hindustani Vocal Music |
Yash Bhake et.al. |
2503.21142 |
null |
2025-03-27 |
MoQa: Rethinking MoE Quantization with Multi-stage Data-model Distribution Awareness |
Zihao Zheng et.al. |
2503.21135 |
null |
2025-03-27 |
Collaborative Evolution: Multi-Round Learning Between Large and Small Language Models for Emergent Fake News Detection |
Ziyi Zhou et.al. |
2503.21127 |
null |
2025-03-27 |
De Novo Functional Protein Sequence Generation: Overcoming Data Scarcity through Regeneration and Large Models |
Chenyu Ren et.al. |
2503.21123 |
null |
2025-03-27 |
Leveraging Large Language Models for Risk Assessment in Hyperconnected Logistic Hub Network Deployment |
Yinzhu Quan et.al. |
2503.21115 |
null |
2025-03-27 |
Alleviating LLM-based Generative Retrieval Hallucination in Alipay Search |
Yedan Shen et.al. |
2503.21098 |
null |
2025-03-27 |
ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging |
Haoming Xu et.al. |
2503.21088 |
link |
2025-03-28 |
EQ-Negotiator: An Emotion-Reasoning LLM Agent in Credit Dialogues |
Yuhan Liu et.al. |
2503.21080 |
null |
2025-03-27 |
Online Reasoning Video Segmentation with Just-in-Time Digital Twins |
Yiqing Shen et.al. |
2503.21056 |
null |
2025-03-27 |
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning |
Chi-Hsi Kung et.al. |
2503.21055 |
null |
2025-03-26 |
Operating Room Workflow Analysis via Reasoning Segmentation over Digital Twins |
Yiqing Shen et.al. |
2503.21054 |
null |
2025-03-26 |
Scalability Evaluation of HPC Multi-GPU Training for ECG-based LLMs |
Dimitar Mileski et.al. |
2503.21033 |
null |
2025-03-26 |
Two for the Price of One: Integrating Large Language Models to Learn Biophysical Interactions |
Joseph D. Clark et.al. |
2503.21017 |
null |
2025-03-26 |
Can Large Language Models Predict Associations Among Human Attitudes? |
Ana Ma et.al. |
2503.21011 |
null |
2025-03-26 |
Evaluating Large Language Models for Automated Clinical Abstraction in Pulmonary Embolism Registries: Performance Across Model Sizes, Versions, and Parameters |
Mahmoud Alwakeel et.al. |
2503.21004 |
null |
2025-03-26 |
Multi-head Reward Aggregation Guided by Entropy |
Xiaomin Li et.al. |
2503.20995 |
null |
2025-03-26 |
FinAudio: A Benchmark for Audio Large Language Models in Financial Applications |
Yupeng Cao et.al. |
2503.20990 |
null |
2025-03-26 |
Patients Speak, AI Listens: LLM-based Analysis of Online Reviews Uncovers Key Drivers for Urgent Care Satisfaction |
Xiaoran Xu et.al. |
2503.20981 |
null |
2025-03-26 |
ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction |
Yiqiao Jin et.al. |
2503.20978 |
null |
2025-03-26 |
Sociotechnical Effects of Machine Translation |
Joss Moorkens et.al. |
2503.20959 |
null |
2025-03-26 |
DEMENTIA-PLAN: An Agent-Based Framework for Multi-Knowledge Graph Retrieval-Augmented Generation in Dementia Care |
Yutong Song et.al. |
2503.20950 |
null |
2025-03-26 |
Hacia la interpretabilidad de la detección anticipada de riesgos de depresión utilizando grandes modelos de lenguaje [Towards Interpretability of Early Detection of Depression Risk Using Large Language Models] |
Horacio Thompson et.al. |
2503.20939 |
null |
2025-03-26 |
Leveraging LLMs, IDEs, and Semantic Embeddings for Automated Move Method Refactoring |
Fraol Batole et.al. |
2503.20934 |
null |
2025-03-26 |
D4R – Exploring and Querying Relational Graphs Using Natural Language and Large Language Models – the Case of Historical Documents |
Michel Boeglin et.al. |
2503.20914 |
null |
2025-03-26 |
Assessing Generative Models for Structured Data |
Reilly Cannon et.al. |
2503.20903 |
null |
2025-03-26 |
Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark |
Sondos Mahmoud Bsharat et.al. |
2503.20786 |
link |
2025-03-26 |
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency |
Tianqi Liu et.al. |
2503.20785 |
link |
2025-03-26 |
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields |
Shijie Zhou et.al. |
2503.20776 |
null |
2025-03-26 |
Reliable algorithm selection for machine learning-guided design |
Clara Fannjiang et.al. |
2503.20767 |
null |
2025-03-26 |
UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines |
Chen Tang et.al. |
2503.20748 |
null |
2025-03-26 |
MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams |
Yanpeng Sun et.al. |
2503.20745 |
null |
2025-03-26 |
Continual learning via probabilistic exchangeable sequence modelling |
Hanwen Xing et.al. |
2503.20725 |
null |
2025-03-26 |
Dynamic Motion Blending for Versatile Motion Editing |
Nan Jiang et.al. |
2503.20724 |
null |
2025-03-26 |
From Annotation to Adaptation: Metrics, Synthetic Data, and Aspect Extraction for Aspect-Based Sentiment Analysis with Large Language Models |
Nikita Neveditsin et.al. |
2503.20715 |
null |
2025-03-26 |
Graph-Enhanced Model-Free Reinforcement Learning Agents for Efficient Power Grid Topological Control |
Eloy Anguiano Batanero et.al. |
2503.20688 |
null |
2025-03-27 |
Flip Learning: Weakly Supervised Erase to Segment Nodules in Breast Ultrasound |
Yuhao Huang et.al. |
2503.20685 |
null |
2025-03-27 |
Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy |
Yinan Sun et.al. |
2503.20673 |
null |
2025-03-26 |
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation |
Yuyang Peng et.al. |
2503.20672 |
null |
2025-03-26 |
TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews |
Huimin Xu et.al. |
2503.20666 |
null |
2025-03-26 |
ARMO: Autoregressive Rigging for Multi-Category Objects |
Mingze Sun et.al. |
2503.20663 |
null |
2025-03-26 |
Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging |
Han Wu et.al. |
2503.20641 |
link |
2025-03-26 |
Collaborative Storytelling and LLM: A Linguistic Analysis of Automatically-Generated Role-Playing Game Sessions |
Alessandro Maisto et.al. |
2503.20623 |
null |
2025-03-26 |
Diffusion Counterfactuals for Image Regressors |
Trung Duc Ha et.al. |
2503.20595 |
link |
2025-03-26 |
Supply chain network rewiring dynamics at the firm-level |
Tobias Reisch et.al. |
2503.20594 |
link |
2025-03-26 |
What to Retrieve for Effective Retrieval-Augmented Code Generation? An Empirical Study and Beyond |
Wenchao Gu et.al. |
2503.20589 |
null |
2025-03-26 |
LLPut: Investigating Large Language Models for Bug Report-Based Input Generation |
Alif Al Hasan et.al. |
2503.20578 |
null |
2025-03-26 |
Optimizing Case-Based Reasoning System for Functional Test Script Generation with Large Language Models |
Siyuan Guo et.al. |
2503.20576 |
null |
2025-03-26 |
Stochastic Transport Maps in Diffusion Models and Sampling |
Xicheng Zhang et.al. |
2503.20573 |
null |
2025-03-26 |
Low-resource Information Extraction with the European Clinical Case Corpus |
Soumitra Ghosh et.al. |
2503.20568 |
null |
2025-03-26 |
TerraTorch: The Geospatial Foundation Models Toolkit |
Carlos Gomes et.al. |
2503.20563 |
link |
2025-03-26 |
A Theoretical Framework for Prompt Engineering: Approximating Smooth Functions with Transformer Prompts |
Ryumei Nakada et.al. |
2503.20561 |
null |
2025-03-26 |
Injecting Adrenaline into LLM Serving: Boosting Resource Utilization and Throughput via Attention Disaggregation |
Yunkai Liang et.al. |
2503.20552 |
link |
2025-03-26 |
Knowledge-Based Multi-Agent Framework for Automated Software Architecture Design |
Yiran Zhang et.al. |
2503.20536 |
null |
2025-03-26 |
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs |
Zhicheng Guo et.al. |
2503.20527 |
link |
2025-03-26 |
GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving |
Lloyd Russell et.al. |
2503.20523 |
null |
2025-03-26 |
MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation |
Jinnan Chen et.al. |
2503.20519 |
null |
2025-03-26 |
Exploring the Effect of Robotic Embodiment and Empathetic Tone of LLMs on Empathy Elicitation |
Liza Darwesh et.al. |
2503.20518 |
null |
2025-03-26 |
Explainable ICD Coding via Entity Linking |
Leonor Barreiros et.al. |
2503.20508 |
null |
2025-03-26 |
Vision-Amplified Semantic Entropy for Hallucination Detection in Medical Visual Question Answering |
Zehui Liao et.al. |
2503.20504 |
null |
2025-03-26 |
MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning |
Yiwei Ma et.al. |
2503.20502 |
null |
2025-03-26 |
FireRedTTS-1S: An Upgraded Streamable Foundation Text-to-Speech System |
Hao-Han Guo et.al. |
2503.20499 |
null |
2025-03-26 |
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization |
Jiale Cheng et.al. |
2503.20491 |
link |
2025-03-26 |
Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability |
Yingdong Shi et.al. |
2503.20483 |
null |
2025-03-26 |
From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment |
Yucheng Suo et.al. |
2503.20472 |
null |
2025-03-26 |
Data-driven Seasonal Climate Predictions via Variational Inference and Transformers |
Lluís Palma et.al. |
2503.20466 |
null |
2025-03-26 |
Attention Xception UNet (AXUNet): A Novel Combination of CNN and Self-Attention for Brain Tumor Segmentation |
Farzan Moodi et.al. |
2503.20446 |
null |
2025-03-26 |
RALLRec+: Retrieval Augmented Large Language Model Recommendation with Reasoning |
Sichun Luo et.al. |
2503.20430 |
link |
2025-03-26 |
CFunModel: A “Funny” Language Model Capable of Chinese Humor Generation and Processing |
Zhenghan Yu et.al. |
2503.20417 |
null |
2025-03-26 |
MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation |
Rongyu Zhang et.al. |
2503.20384 |
null |
2025-03-26 |
Dewey Long Context Embedding Model: A Technical Report |
Dun Zhang et.al. |
2503.20376 |
null |
2025-03-26 |
VideoGEM: Training-free Action Grounding in Videos |
Felix Vogel et.al. |
2503.20348 |
null |
2025-03-26 |
Dynamic Pyramid Network for Efficient Multimodal Large Language Model |
Hao Ai et.al. |
2503.20322 |
null |
2025-03-26 |
Iterative Prompting with Persuasion Skills in Jailbreaking Large Language Models |
Shih-Wen Ke et.al. |
2503.20320 |
null |
2025-03-26 |
Wan: Open and Advanced Large-Scale Video Generative Models |
WanTeam et.al. |
2503.20314 |
link |
2025-03-26 |
Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs |
Zitian Wang et.al. |
2503.20309 |
null |
2025-03-27 |
Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics |
Lee Chae-Yeon et.al. |
2503.20308 |
null |
2025-03-26 |
A Multilingual, Culture-First Approach to Addressing Misgendering in LLM Applications |
Sunayana Sitaram et.al. |
2503.20302 |
link |
2025-03-26 |
Traversing Distortion-Perception Tradeoff using a Single Score-Based Generative Model |
Yuhan Wang et.al. |
2503.20297 |
null |
2025-03-26 |
QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions |
Siyin Wang et.al. |
2503.20290 |
null |
2025-03-26 |
Faster Parameter-Efficient Tuning with Token Redundancy Reduction |
Kwonyoung Kim et.al. |
2503.20282 |
link |
2025-03-26 |
sudo rm -rf agentic_security |
Sejin Lee et.al. |
2503.20279 |
link |
2025-03-26 |
The cell as a token: high-dimensional geometry in language models and cell embeddings |
William Gilpin et.al. |
2503.20278 |
null |
2025-03-26 |
ViLBench: A Suite for Vision-Language Process Reward Modeling |
Haoqin Tu et.al. |
2503.20271 |
null |
2025-03-26 |
L4: Diagnosing Large-scale LLM Training Failures via Automated Log Analysis |
Zhihan Jiang et.al. |
2503.20263 |
null |
2025-03-26 |
LGR: LLM-Guided Ranking of Frontiers for Object Goal Navigation |
Mitsuaki Uno et.al. |
2503.20241 |
null |
2025-03-26 |
Automated UI Interface Generation via Diffusion Models: Enhancing Personalization and Efficiency |
Yifei Duan et.al. |
2503.20229 |
null |
2025-03-26 |
TeleLoRA: Teleporting Model-Specific Alignment Across LLMs |
Xiao Lin et.al. |
2503.20228 |
null |
2025-03-26 |
DINeMo: Learning Neural Mesh Models with no 3D Annotations |
Weijie Guo et.al. |
2503.20220 |
null |
2025-03-26 |
Qwen2.5-Omni Technical Report |
Jin Xu et.al. |
2503.20215 |
null |
2025-03-26 |
SARGes: Semantically Aligned Reliable Gesture Generation via Intent Chain |
Nan Gao et.al. |
2503.20202 |
null |
2025-03-26 |
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models |
Alex Jinpeng Wang et.al. |
2503.20198 |
null |
2025-03-26 |
Enhancing the Robustness of LLM-Generated Code: Empirical Study and Framework |
ZiKe Li et.al. |
2503.20197 |
link |
2025-03-26 |
GAPO: Learning Preferential Prompt through Generative Adversarial Policy Optimization |
Zhouhong Gu et.al. |
2503.20194 |
link |
2025-03-26 |
Maya: Optimizing Deep Learning Training Workloads using Emulated Virtual Accelerators |
Srihas Yarlagadda et.al. |
2503.20191 |
null |
2025-03-26 |
Cross-Modal Prototype Allocation: Unsupervised Slide Representation Learning via Patch-Text Contrast in Computational Pathology |
Yuxuan Chen et.al. |
2503.20190 |
null |
2025-03-26 |
Rethinking Vision-Language Model in Face Forensics: Multi-Modal Interpretable Forged Face Detector |
Xiao Guo et.al. |
2503.20188 |
link |
2025-03-26 |
Leveraging Implicit Sentiments: Enhancing Reliability and Validity in Psychological Trait Evaluation of LLMs |
Huanhuan Ma et.al. |
2503.20182 |
link |
2025-03-26 |
Can We Make Code Green? Understanding Trade-Offs in LLMs vs. Human Code Optimizations |
Pooja Rani et.al. |
2503.20126 |
null |
2025-03-26 |
Synthesizing world models for bilevel planning |
Zergham Ahmed et.al. |
2503.20124 |
null |
2025-03-25 |
Zero-Shot Human-Object Interaction Synthesis with Multimodal Priors |
Yuke Lou et.al. |
2503.20118 |
null |
2025-03-25 |
VibE: A Visual Analytics Workflow for Semantic Error Analysis of CVML Models at Subgroup Level |
Jun Yuan et.al. |
2503.20112 |
null |
2025-03-25 |
Federated Learning: A new frontier in the exploration of multi-institutional medical imaging data |
Dominika Ciupek et.al. |
2503.20107 |
null |
2025-03-25 |
Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Models Using Implicit Feedback from Pre-training Demonstrations |
Ran Tian et.al. |
2503.20105 |
null |
2025-03-25 |
Bigger But Not Better: Small Neural Language Models Outperform Large Language Models in Detection of Thought Disorder |
Changye Li et.al. |
2503.20103 |
link |
2025-03-25 |
Generative Linguistics, Large Language Models, and the Social Nature of Scientific Success |
Sophie Hao et.al. |
2503.20088 |
null |
2025-03-25 |
Can Multi-modal (reasoning) LLMs work as deepfake detectors? |
Simiao Ren et.al. |
2503.20084 |
null |
2025-03-27 |
Cross-Tokenizer Distillation via Approximate Likelihood Matching |
Benjamin Minixhofer et.al. |
2503.20083 |
link |
2025-03-25 |
Poor Alignment and Steerability of Large Language Models: Evidence from College Admission Essays |
Jinsook Lee et.al. |
2503.20062 |
null |
2025-03-25 |
Deep Learning Approaches for Blood Disease Diagnosis Across Hematopoietic Lineages |
Gabriel Bo et.al. |
2503.20049 |
link |
2025-03-25 |
Warm Start Adaptive-Bias Quantum Approximate Optimization Algorithm |
Yunlong Yu et.al. |
2503.20048 |
null |
2025-03-25 |
Unlocking Multi-Task Electric Energy System Intelligence: Data Scaling Laws and Performance with Limited Fine-Tuning |
Shaohuai Liu et.al. |
2503.20040 |
null |
2025-03-25 |
OmniNova: A General Multimodal Agent Framework |
Pengfei Du et.al. |
2503.20028 |
null |
2025-03-25 |
Gemini Robotics: Bringing AI into the Physical World |
Gemini Robotics Team et.al. |
2503.20020 |
null |
2025-03-25 |
LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning? |
Kexian Tang et.al. |
2503.19990 |
null |
2025-03-25 |
ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback |
Bohan Zhai et.al. |
2503.19988 |
link |
2025-03-25 |
Conditional Deep Generative Models for Simultaneous Simulation and Reconstruction of Entire Events |
Etienne Dreyer et.al. |
2503.19981 |
link |
2025-03-25 |
SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining |
Xiang Xu et.al. |
2503.19912 |
link |
2025-03-25 |
CoLLM: A Large Language Model for Composed Image Retrieval |
Chuong Huynh et.al. |
2503.19910 |
link |
2025-03-25 |
FullDiT: Multi-Task Video Generative Foundation Model with Full Attention |
Xuan Ju et.al. |
2503.19907 |
null |
2025-03-25 |
ICE: Intrinsic Concept Extraction from a Single Image via Diffusion Models |
Fernando Julio Cendra et.al. |
2503.19902 |
null |
2025-03-25 |
A Multi-Agent Framework Integrating Large Language Models and Generative AI for Accelerated Metamaterial Design |
Jie Tian et.al. |
2503.19889 |
null |
2025-03-25 |
CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation |
Nengbo Wang et.al. |
2503.19878 |
null |
2025-03-25 |
SLA-Awareness for AI-assisted coding |
Kishanthan Thangarajah et.al. |
2503.19876 |
null |
2025-03-25 |
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking |
Xiaoyu Tian et.al. |
2503.19855 |
null |
2025-03-25 |
Towards Online Multi-Modal Social Interaction Understanding |
Xinpeng Li et.al. |
2503.19851 |
link |
2025-03-25 |
FALCONEye: Finding Answers and Localizing Content in ONE-hour-long videos with multi-modal LLMs |
Carlos Plou et.al. |
2503.19850 |
null |
2025-03-25 |
A Comparative Analysis of Word Segmentation, Part-of-Speech Tagging, and Named Entity Recognition for Historical Chinese Sources, 1900-1950 |
Zhao Fang et.al. |
2503.19844 |
null |
2025-03-25 |
TopoGEN: topology-driven microstructure generation for in silico modeling of fiber network mechanics |
Sara Cardona et.al. |
2503.19832 |
null |
2025-03-25 |
IgCraft: A versatile sequence generation framework for antibody discovery and engineering |
Matthew Greenig et.al. |
2503.19821 |
link |
2025-03-25 |
Domain-incremental White Blood Cell Classification with Privacy-aware Continual Learning |
Pratibha Kumari et.al. |
2503.19819 |
null |
2025-03-25 |
SeLIP: Similarity Enhanced Contrastive Language Image Pretraining for Multi-modal Head MRI |
Zhiyang Liu et.al. |
2503.19801 |
null |
2025-03-25 |
SemEval-2025 Task 9: The Food Hazard Detection Challenge |
Korbinian Randl et.al. |
2503.19800 |
null |
2025-03-25 |
PAVE: Patching and Adapting Video Large Language Models |
Zhuoming Liu et.al. |
2503.19794 |
link |
2025-03-25 |
Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models |
Kartik Thakral et.al. |
2503.19783 |
null |
2025-03-25 |
ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation |
Haoyu Fu et.al. |
2503.19755 |
null |
2025-03-25 |
Inducing Personality in LLM-Based Honeypot Agents: Measuring the Effect on Human-Like Agenda Generation |
Lewis Newsham et.al. |
2503.19752 |
null |
2025-03-25 |
Optimizing Photonic Structures with Large Language Model Driven Algorithm Discovery |
Haoran Yin et.al. |
2503.19742 |
null |
2025-03-25 |
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings |
Chengan Che et.al. |
2503.19740 |
link |
2025-03-26 |
FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion |
Pihai Sun et.al. |
2503.19739 |
link |
2025-03-25 |
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation |
Itay Nakash et.al. |
2503.19693 |
link |
2025-03-25 |
CoSimGen: Controllable Diffusion Model for Simultaneous Image and Mask Generation |
Rupak Bose et.al. |
2503.19661 |
null |
2025-03-25 |
BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction |
Jan Kohút et.al. |
2503.19658 |
null |
2025-03-25 |
OpenSDI: Spotting Diffusion-Generated Images in the Open World |
Yabin Wang et.al. |
2503.19653 |
link |
2025-03-25 |
HausaNLP at SemEval-2025 Task 3: Towards a Fine-Grained Model-Aware Hallucination Detection |
Maryam Bala et.al. |
2503.19650 |
null |
2025-03-25 |
Show or Tell? Effectively prompting Vision-Language Models for semantic segmentation |
Niccolo Avogaro et.al. |
2503.19647 |
null |
2025-03-25 |
1.4 Million Open-Source Distilled Reasoning Dataset to Empower Large Language Model Training |
Han Zhao et.al. |
2503.19633 |
null |
2025-03-25 |
Optimization through In-Context Learning and Iterative LLM Prompting for Nuclear Engineering Design Problems |
M. Rizki Oktavian et.al. |
2503.19620 |
null |
2025-03-25 |
Exploring Next Token Prediction For Optimizing Databases |
Yeasir Rayhan et.al. |
2503.19619 |
null |
2025-03-25 |
RL-finetuning LLMs from on- and off-policy data with a single algorithm |
Yunhao Tang et.al. |
2503.19612 |
null |
2025-03-25 |
Analyzable Chain-of-Musical-Thought Prompting for High-Fidelity Music Generation |
Max W. Y. Lam et.al. |
2503.19611 |
null |
2025-03-25 |
Innate Reasoning is Not Enough: In-Context Learning Enhances Reasoning Large Language Models with Less Overthinking |
Yuyao Ge et.al. |
2503.19602 |
null |
2025-03-25 |
HoarePrompt: Structural Reasoning About Program Correctness in Natural Language |
Dimitrios Stamatios Bouras et.al. |
2503.19599 |
link |
2025-03-25 |
Context-Efficient Retrieval with Factual Decomposition |
Yanhong Li et.al. |
2503.19574 |
null |
2025-03-25 |
Motif Counting in Complex Networks: A Comprehensive Survey |
Haozhe Yin et.al. |
2503.19573 |
null |
2025-03-25 |
Dance Like a Chicken: Low-Rank Stylization for Human Motion Diffusion |
Haim Sawdayee et.al. |
2503.19557 |
null |
2025-03-26 |
Scaling Laws of Synthetic Data for Language Models |
Zeyu Qin et.al. |
2503.19551 |
null |
2025-03-25 |
FLEX: A Benchmark for Evaluating Robustness of Fairness in Large Language Models |
Dahyun Jung et.al. |
2503.19540 |
link |
2025-03-25 |
VectorFit: Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models |
Suhas G Hegde et.al. |
2503.19530 |
null |
2025-03-25 |
Conditional Autoencoder for Generating BNS Waveforms with Tidal and Precession Effects |
Mengfei Sun et.al. |
2503.19512 |
null |
2025-03-25 |
SparSamp: Efficient Provably Secure Steganography Based on Sparse Sampling |
Yaofei Wang et.al. |
2503.19499 |
null |
2025-03-25 |
DomainCQA: Crafting Expert-Level QA from Domain-Specific Charts |
Ling Zhong et.al. |
2503.19498 |
null |
2025-03-25 |
Exploring Disentangled and Controllable Human Image Synthesis: From End-to-End to Stage-by-Stage |
Zhengwentai Sun et.al. |
2503.19486 |
null |
2025-03-25 |
KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models |
Zhiwei Wang et.al. |
2503.19482 |
null |
2025-03-25 |
GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers |
Shijie Ma et.al. |
2503.19480 |
null |
2025-03-25 |
A-MESS: Anchor based Multimodal Embedding with Semantic Synchronization for Multimodal Intent Recognition |
Yaomin Shen et.al. |
2503.19474 |
null |
2025-03-25 |
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning |
Mingyang Chen et.al. |
2503.19470 |
null |
2025-03-25 |
G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation |
Juntao Jian et.al. |
2503.19457 |
null |
2025-03-25 |
Data-centric Federated Graph Learning with Large Language Models |
Bo Yan et.al. |
2503.19455 |
null |
2025-03-25 |
VecTrans: LLM Transformation Framework for Better Auto-vectorization on High-performance CPU |
Zhongchun Zheng et.al. |
2503.19449 |
null |
2025-03-25 |
Enhanced Bloom’s Educational Taxonomy for Fostering Information Literacy in the Era of Large Language Models |
Yiming Luo et.al. |
2503.19434 |
null |
2025-03-25 |
DeCAP: Context-Adaptive Prompt Generation for Debiasing Zero-shot Question Answering in Large Language Models |
Suyoung Bae et.al. |
2503.19426 |
null |
2025-03-25 |
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing |
Jaihoon Kim et.al. |
2503.19385 |
null |
2025-03-25 |
MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation |
Yukang Lin et.al. |
2503.19383 |
null |
2025-03-25 |
Interpretable Generative Models through Post-hoc Concept Bottlenecks |
Akshay Kulkarni et.al. |
2503.19377 |
link |
2025-03-26 |
EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models |
Yufei Cai et.al. |
2503.19369 |
link |
2025-03-25 |
ImageSet2Text: Describing Sets of Images through Text |
Piera Riccio et.al. |
2503.19361 |
null |
2025-03-25 |
QUAD: Quantization and Parameter-Efficient Tuning of LLM with Activation Decomposition |
Yuxuan Hu et.al. |
2503.19353 |
link |
2025-03-25 |
Membership Inference Attacks on Large-Scale Models: A Survey |
Hengyu Wu et.al. |
2503.19338 |
null |
2025-03-25 |
Process or Result? Manipulated Ending Tokens Can Mislead Reasoning LLMs to Ignore the Correct Reasoning Steps |
Yu Cui et.al. |
2503.19326 |
null |
2025-03-25 |
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text |
Weizhi Chen et.al. |
2503.19311 |
link |
2025-03-25 |
Iterative Hypothesis Generation for Scientific Discovery with Monte Carlo Nash Equilibrium Self-Refining Trees |
Gollam Rabby et.al. |
2503.19309 |
null |
2025-03-25 |
UniMoMo: Unified Generative Modeling of 3D Molecules for De Novo Binder Design |
Xiangzhe Kong et.al. |
2503.19300 |
link |
2025-03-25 |
Context-Aware Semantic Segmentation: Enhancing Pixel-Level Understanding with Large Language Models for Advanced Vision Applications |
Ben Rahman et.al. |
2503.19276 |
null |
2025-03-25 |
MARS: Memory-Enhanced Agents with Reflective Self-improvement |
Xuechen Liang et.al. |
2503.19271 |
null |
2025-03-25 |
PHEONA: An Evaluation Framework for Large Language Model-based Approaches to Computational Phenotyping |
Sarah Pungitore et.al. |
2503.19265 |
null |
2025-03-25 |
DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning |
Fucai Ke et.al. |
2503.19263 |
null |
2025-03-25 |
Linguistic Blind Spots of Large Language Models |
Jiali Cheng et.al. |
2503.19260 |
null |
2025-03-25 |
SCI-IDEA: Context-Aware Scientific Ideation Using Token and Sentence Embeddings |
Farhana Keya et.al. |
2503.19257 |
null |
2025-03-24 |
LLM Benchmarking with LLaMA2: Evaluating Code Development Performance Across Multiple Programming Languages |
Patrick Diehl et.al. |
2503.19217 |
null |
2025-03-24 |
A Survey of Large Language Model Agents for Question Answering |
Murong Yue et.al. |
2503.19213 |
null |
2025-03-24 |
Overtrained Language Models Are Harder to Fine-Tune |
Jacob Mitchell Springer et.al. |
2503.19206 |
null |
2025-03-24 |
A Shared Low-Rank Adaptation Approach to Personalized RLHF |
Renpu Liu et.al. |
2503.19201 |
null |
2025-03-24 |
Open-Vocabulary Functional 3D Scene Graphs for Real-World Indoor Spaces |
Chenyangguang Zhang et.al. |
2503.19199 |
null |
2025-03-24 |
Evaluating Bias in LLMs for Job-Resume Matching: Gender, Race, and Education |
Hayate Iso et.al. |
2503.19182 |
null |
2025-03-24 |
Language Model Uncertainty Quantification with Attention Chain |
Yinghao Li et.al. |
2503.19168 |
link |
2025-03-24 |
Reconstructing hadronically decaying tau leptons with a jet foundation model |
Laurits Tani et.al. |
2503.19165 |
null |
2025-03-24 |
HOIGPT: Learning Long Sequence Hand-Object Interaction with Language Models |
Mingzhen Huang et.al. |
2503.19157 |
null |
2025-03-24 |
Risk-Based Thresholding for Reliable Anomaly Detection in Concentrated Solar Power Plants |
Yorick Estievenart et.al. |
2503.19146 |
null |
2025-03-24 |
Compositional Caching for Training-free Open-vocabulary Attribute Detection |
Marco Garosi et.al. |
2503.19145 |
null |
2025-03-24 |
MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks |
Wenhao You et.al. |
2503.19134 |
null |
2025-03-24 |
Understanding and Improving Information Preservation in Prompt Compression for LLMs |
Weronika Łajewska et.al. |
2503.19114 |
null |
2025-03-24 |
Masks and Mimicry: Strategic Obfuscation and Impersonation Attacks on Authorship Verification |
Kenneth Alperin et.al. |
2503.19099 |
null |
2025-03-24 |
Rankers, Judges, and Assistants: Towards Understanding the Interplay of LLMs in Information Retrieval Evaluation |
Krisztian Balog et.al. |
2503.19092 |
null |
2025-03-24 |
LLM-Based Insight Extraction for Contact Center Analytics and Cost-Efficient Deployment |
Varsha Embar et.al. |
2503.19090 |
null |
2025-03-24 |
Paving the way for scientific foundation models: enhancing generalization and robustness in PDEs with constraint-aware pre-training |
Amin Totounferoush et.al. |
2503.19081 |
null |
2025-03-24 |
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization |
Zhanda Zhu et.al. |
2503.19050 |
link |
2025-03-24 |
LookAhead Tuning: Safer Language Models via Partial Answer Previews |
Kangwei Liu et.al. |
2503.19041 |
link |
2025-03-24 |
Equivariant Image Modeling |
Ruixiao Dong et.al. |
2503.18948 |
link |
2025-03-25 |
Aether: Geometric-Aware Unified World Modeling |
Aether Team et.al. |
2503.18945 |
null |
2025-03-24 |
DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation |
Karim Abou Zeid et.al. |
2503.18944 |
link |
2025-03-24 |
SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding |
Mingze Xu et.al. |
2503.18943 |
null |
2025-03-24 |
Video-T1: Test-Time Scaling for Video Generation |
Fangfu Liu et.al. |
2503.18942 |
null |
2025-03-24 |
Exploring Training and Inference Scaling Laws in Generative Retrieval |
Hongru Cai et.al. |
2503.18941 |
link |
2025-03-24 |
CoMP: Continual Multimodal Pre-training for Vision Foundation Models |
Yitong Chen et.al. |
2503.18931 |
link |
2025-03-24 |
Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training |
Brian R. Bartoldson et.al. |
2503.18929 |
null |
2025-03-24 |
FFN Fusion: Rethinking Sequential Computation in Large Language Models |
Akhiad Bercovich et.al. |
2503.18908 |
null |
2025-03-24 |
xKV: Cross-Layer SVD for KV-Cache Compression |
Chi-Chih Chang et.al. |
2503.18893 |
link |
2025-03-24 |
AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration |
Zhexuan Wang et.al. |
2503.18891 |
link |
2025-03-24 |
Toward building next-generation Geocoding systems: a systematic review |
Zhengcong Yin et.al. |
2503.18888 |
null |
2025-03-24 |
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders |
Andrey Galichin et.al. |
2503.18878 |
link |
2025-03-24 |
Efficient Self-Supervised Adaptation for Medical Image Analysis |
Moein Sorkhei et.al. |
2503.18873 |
link |
2025-03-24 |
Reimagining Memory Access for LLM Inference: Compression-Aware Memory Controller Design |
Rui Xie et.al. |
2503.18869 |
null |
2025-03-24 |
Structuring Scientific Innovation: A Framework for Modeling and Discovering Impactful Knowledge Combinations |
Junlan Chen et.al. |
2503.18865 |
null |
2025-03-24 |
3DSwapping: Texture Swapping For 3D Object From Single Reference Image |
Xiao Cao et.al. |
2503.18853 |
null |
2025-03-24 |
Defeating Prompt Injections by Design |
Edoardo Debenedetti et.al. |
2503.18813 |
null |
2025-03-24 |
Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code |
Augusto B. Corrêa et.al. |
2503.18809 |
null |
2025-03-24 |
REALM: A Dataset of Real-World LLM Use Cases |
Jingwen Cheng et.al. |
2503.18792 |
null |
2025-03-24 |
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache |
Dayou Du et.al. |
2503.18773 |
link |
2025-03-24 |
AlphaSpace: Enabling Robotic Actions through Semantic Tokenization and Symbolic Reasoning |
Alan Dao et.al. |
2503.18769 |
null |
2025-03-24 |
RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation |
Chengbo Yuan et.al. |
2503.18738 |
null |
2025-03-24 |
Predicting the Road Ahead: A Knowledge Graph based Foundation Model for Scene Understanding in Autonomous Driving |
Hongkuan Zhou et.al. |
2503.18730 |
null |
2025-03-24 |
LLaVAction: evaluating and training multi-modal large language models for action recognition |
Shaokai Ye et.al. |
2503.18712 |
link |
2025-03-24 |
Revisiting Automatic Data Curation for Vision Foundation Models in Digital Pathology |
Boqi Chen et.al. |
2503.18709 |
null |
2025-03-24 |
OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad |
Luyao Tang et.al. |
2503.18695 |
link |
2025-03-25 |
Commander-GPT: Fully Unleashing the Sarcasm Detection Capability of Multi-Modal Large Language Models |
Yazhou Zhang et.al. |
2503.18681 |
null |
2025-03-24 |
NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping |
Tianyi Wang et.al. |
2503.18678 |
null |
2025-03-24 |
Boosting Virtual Agent Learning and Reasoning: A Step-wise, Multi-dimensional, and Generalist Reward Model with Benchmark |
Bingchen Miao et.al. |
2503.18665 |
link |
2025-03-24 |
From Fragment to One Piece: A Survey on AI-Driven Graphic Design |
Xingxing Zou et.al. |
2503.18641 |
null |
2025-03-24 |
Adaptive Machine Learning for Resource-Constrained Environments |
Sebastián A. Cajas Ordóñez et.al. |
2503.18634 |
link |
2025-03-24 |
Generative Dataset Distillation using Min-Max Diffusion Model |
Junqiao Fan et.al. |
2503.18626 |
null |
2025-03-24 |
Scaling Laws for Emulation of Stellar Spectra |
Tomasz Różański et.al. |
2503.18617 |
null |
2025-03-24 |
LANGALIGN: Enhancing Non-English Language Models via Cross-Lingual Embedding Alignment |
Jong Myoung Kim et.al. |
2503.18603 |
null |
2025-03-24 |
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization |
Minsu Kim et.al. |
2503.18599 |
null |
2025-03-24 |
A Universal Model Combining Differential Equations and Neural Networks for Ball Trajectory Prediction |
Zhiwei Shi et.al. |
2503.18584 |
null |
2025-03-24 |
Anchor-based oversampling for imbalanced tabular data via contrastive and adversarial learning |
Hadi Mohammadi et.al. |
2503.18569 |
null |
2025-03-24 |
Distil-xLSTM: Learning Attention Mechanisms through Recurrent Structures |
Abdoul Majid O. Thiombiano et.al. |
2503.18565 |
null |
2025-03-24 |
Power-fractional distributions and branching processes |
Gerold Alsmeyer et.al. |
2503.18563 |
null |
2025-03-24 |
Self-Reported Confidence of Large Language Models in Gastroenterology: Analysis of Commercial, Open-Source, and Quantized Models |
Nariman Naderi et.al. |
2503.18562 |
null |
2025-03-25 |
AMD-Hummingbird: Towards an Efficient Text-to-Video Model |
Takashi Isobe et.al. |
2503.18559 |
link |
2025-03-24 |
HiRes-FusedMIM: A High-Resolution RGB-DSM Pre-trained Model for Building-Level Remote Sensing Applications |
Guneet Mutreja et.al. |
2503.18540 |
null |
2025-03-24 |
SciClaims: An End-to-End Generative System for Biomedical Claim Analysis |
Raúl Ortega et.al. |
2503.18526 |
null |
2025-03-24 |
P3Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction |
Yufeng Zhong et.al. |
2503.18525 |
null |
2025-03-24 |
Can Text-to-Video Generation help Video-Language Alignment? |
Luca Zanella et.al. |
2503.18507 |
null |
2025-03-24 |
Autoregressive Language Models for Knowledge Base Population: A case study in the space mission domain |
Andrés García-Silva et.al. |
2503.18502 |
null |
2025-03-24 |
Verbal Process Supervision Elicits Better Coding Agents |
Hao-Yuan Chen et.al. |
2503.18494 |
null |
2025-03-24 |
Safeguarding Mobile GUI Agent via Logic-based Action Verification |
Jungjae Lee et.al. |
2503.18492 |
null |
2025-03-24 |
Large Language Models powered Network Attack Detection: Architecture, Opportunities and Case Study |
Xinggong Zhang et.al. |
2503.18487 |
null |
2025-03-24 |
Explaining Domain Shifts in Language: Concept erasing for Interpretable Image Classification |
Zequn Zeng et.al. |
2503.18483 |
link |
2025-03-24 |
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding |
Xiangrui Liu et.al. |
2503.18478 |
null |
2025-03-24 |
PALATE: Peculiar Application of the Law of Total Expectation to Enhance the Evaluation of Deep Generative Models |
Tadeusz Dziarmaga et.al. |
2503.18462 |
link |
2025-03-24 |
MuMA: 3D PBR Texturing via Multi-Channel Multi-View Generation and Agentic Post-Processing |
Lingting Zhu et.al. |
2503.18461 |
null |
2025-03-24 |
ModiGen: A Large Language Model-Based Workflow for Multi-Task Modelica Code Generation |
Jiahui Xiang et.al. |
2503.18460 |
null |
2025-03-24 |
SEAlign: Alignment Training for Software Engineering Agent |
Kechi Zhang et.al. |
2503.18455 |
null |
2025-03-24 |
InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment |
Yunhong Lu et.al. |
2503.18454 |
link |
2025-03-24 |
ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation |
Guosheng Zhao et.al. |
2503.18438 |
null |
2025-03-24 |
A Simple yet Effective Layout Token in Large Language Models for Document Understanding |
Zhaoqing Zhu et.al. |
2503.18434 |
null |
2025-03-24 |
Teaching LLMs for Step-Level Automatic Math Correction via Reinforcement Learning |
Junsong Li et.al. |
2503.18432 |
null |
2025-03-24 |
Breaking the Encoder Barrier for Seamless Video-Language Understanding |
Handong Li et.al. |
2503.18422 |
null |
2025-03-25 |
Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning |
Sherry X. Chen et.al. |
2503.18406 |
link |
2025-03-24 |
Solving Situation Puzzles with Large Language Model and External Reformulation |
Kun Li et.al. |
2503.18394 |
null |
2025-03-24 |
Manipulation and the AI Act: Large Language Model Chatbots and the Danger of Mirrors |
Joshua Krook et.al. |
2503.18387 |
null |
2025-03-24 |
Resource-Efficient Motion Control for Video Generation via Dynamic Mask Guidance |
Sicong Feng et.al. |
2503.18386 |
null |
2025-03-24 |
Maximum Redundancy Pruning: A Principle-Driven Layerwise Sparsity Allocation for LLMs |
Chang Gao et.al. |
2503.18377 |
null |
2025-03-24 |
J&H: Evaluating the Robustness of Large Language Models Under Knowledge-Injection Attacks in Legal Domain |
Yiran Hu et.al. |
2503.18360 |
link |
2025-03-24 |
Mitigating Cache Noise in Test-Time Adaptation for Large Vision-Language Models |
Haotian Zhai et.al. |
2503.18334 |
null |
2025-03-24 |
Optimizing Influence Campaigns: Nudging under Bounded Confidence |
Yen-Shao Chen et.al. |
2503.18331 |
null |
2025-03-24 |
Towards Training-free Anomaly Detection with Vision and Language Foundation Models |
Jinjin Zhang et.al. |
2503.18325 |
link |
2025-03-24 |
Bridging Writing Manner Gap in Visual Instruction Tuning by Creating LLM-aligned Instructions |
Dong Jing et.al. |
2503.18320 |
null |
2025-03-24 |
Knowledge Transfer from LLMs to Provenance Analysis: A Semantic-Augmented Method for APT Detection |
Fei Zuo et.al. |
2503.18316 |
null |
2025-03-24 |
DeepFund: Will LLM be Professional at Fund Investment? A Live Arena Perspective |
Changlun Li et.al. |
2503.18313 |
null |
2025-03-24 |
Enhancing LLM-based Code Translation in Repository Context via Triple Knowledge-Augmented |
Guangsheng Ou et.al. |
2503.18305 |
null |
2025-03-24 |
How to Capture and Study Conversations Between Research Participants and ChatGPT: GPT for Researchers (g4r.org) |
Jin Kim et.al. |
2503.18303 |
null |
2025-03-24 |
Image-to-Text for Medical Reports Using Adaptive Co-Attention and Triple-LSTM Module |
Yishen Liu et.al. |
2503.18297 |
null |
2025-03-24 |
Surgical Action Planning with Large Language Models |
Mengya Xu et.al. |
2503.18296 |
null |
2025-03-24 |
Fact-checking AI-generated news reports: Can LLMs catch their own lies? |
Jiayi Yao et.al. |
2503.18293 |
null |
2025-03-24 |
Jenga: Effective Memory Management for Serving LLM with Heterogeneity |
Chen Zhang et.al. |
2503.18292 |
null |
2025-03-24 |
Sun-Shine: A Large Language Model for Tibetan Culture |
Cheng Huang et.al. |
2503.18288 |
link |
2025-03-24 |
CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI |
Siyuan Cheng et.al. |
2503.18286 |
link |
2025-03-24 |
Analyzing Islamophobic Discourse Using Semi-Coded Terms and LLMs |
Raza Ul Mustafa et.al. |
2503.18273 |
null |
2025-03-24 |
Efficient Inference for Covariate-adjusted Bradley-Terry Model with Covariate Shift |
Xiudi Li et.al. |
2503.18256 |
null |
2025-03-24 |
Surface-Aware Distilled 3D Semantic Features |
Lukas Uzolas et.al. |
2503.18254 |
null |
2025-03-24 |
Enhancing Multi-Label Emotion Analysis and Corresponding Intensities for Ethiopian Languages |
Tadesse Destaw Belay et.al. |
2503.18253 |
null |
2025-03-23 |
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation |
Jungsoo Lee et.al. |
2503.18244 |
null |
2025-03-23 |
ShED-HD: A Shannon Entropy Distribution Framework for Lightweight Hallucination Detection on Edge Devices |
Aneesh Vathul et.al. |
2503.18242 |
null |
2025-03-23 |
Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters |
Roberto Garcia et.al. |
2503.18216 |
link |
2025-03-23 |
LakotaBERT: A Transformer-based Model for Low Resource Lakota Language |
Kanishka Parankusham et.al. |
2503.18212 |
null |
2025-03-23 |
The Power of Small LLMs in Geometry Generation for Physical Simulations |
Ossama Shafiq et.al. |
2503.18178 |
null |
2025-03-23 |
Unmasking Deceptive Visuals: Benchmarking Multimodal Large Language Models on Misleading Chart Question Answering |
Zixin Chen et.al. |
2503.18172 |
null |
2025-03-23 |
Decorum: A Language-Based Approach For Style-Conditioned Synthesis of Indoor 3D Scenes |
Kelly O. Marshall et.al. |
2503.18155 |
null |
2025-03-23 |
LocDiffusion: Identifying Locations on Earth by Diffusing in the Hilbert Space |
Zhangyu Wang et.al. |
2503.18142 |
null |
2025-03-23 |
AGIR: Assessing 3D Gait Impairment with Reasoning based on LLMs |
Diwei Wang et.al. |
2503.18141 |
null |
2025-03-23 |
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation |
Jiaxin Huang et.al. |
2503.18135 |
null |
2025-03-23 |
MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection |
Yibo Yan et.al. |
2503.18132 |
null |
2025-03-23 |
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization |
Juntao Dai et.al. |
2503.18130 |
null |
2025-03-23 |
GeoBenchX: Benchmarking LLMs for Multistep Geospatial Tasks |
Varvara Krechetova et.al. |
2503.18129 |
link |
2025-03-23 |
$D^2LoRA$: Data-Driven LoRA Initialization for Low Resource Tasks |
Javad SeraJ et.al. |
2503.18089 |
null |
2025-03-23 |
Vehicular Road Crack Detection with Deep Learning: A New Online Benchmark for Comprehensive Evaluation of Existing Algorithms |
Nachuan Ma et.al. |
2503.18082 |
null |
2025-03-21 |
Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique |
Yansi Li et.al. |
2503.17363 |
null |
2025-03-21 |
Position: Interactive Generative Video as Next-Generation Game Engine |
Jiwen Yu et.al. |
2503.17359 |
null |
2025-03-21 |
HCAST: Human-Calibrated Autonomy Software Tasks |
David Rein et.al. |
2503.17354 |
link |
2025-03-21 |
NdLinear Is All You Need for Representation Learning |
Alex Reneau et.al. |
2503.17353 |
link |
2025-03-21 |
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement |
Yihe Deng et.al. |
2503.17352 |
link |
2025-03-21 |
Capturing Individual Human Preferences with Reward Features |
André Barreto et.al. |
2503.17338 |
null |
2025-03-21 |
Efficient Intent-Based Filtering for Multi-Party Conversations Using Knowledge Distillation from LLMs |
Reem Gody et.al. |
2503.17336 |
null |
2025-03-21 |
CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities |
Yuxuan Zhu et.al. |
2503.17332 |
link |
2025-03-21 |
LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language |
Kun Chu et.al. |
2503.17309 |
link |
2025-03-21 |
Bugdar: AI-Augmented Secure Code Review for GitHub Pull Requests |
John Naulty et.al. |
2503.17302 |
null |
2025-03-21 |
Offline Model-Based Optimization: Comprehensive Review |
Minsu Kim et.al. |
2503.17286 |
link |
2025-03-21 |
CASE – Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement |
Gaifan Zhang et.al. |
2503.17279 |
null |
2025-03-21 |
Unsupervised Joint Learning of Optical Flow and Intensity with Event Cameras |
Shuang Guo et.al. |
2503.17262 |
link |
2025-03-21 |
SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging |
Aladin Djuhera et.al. |
2503.17239 |
link |
2025-03-21 |
FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs |
Albert Sawczyn et.al. |
2503.17229 |
null |
2025-03-21 |
Neuro-Symbolic Scene Graph Conditioning for Synthetic Image Dataset Generation |
Giacomo Savazzi et.al. |
2503.17224 |
null |
2025-03-21 |
Automating Adjudication of Cardiovascular Events Using Large Language Models |
Sonish Sivarajkumar et.al. |
2503.17222 |
null |
2025-03-21 |
TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning |
Sheng Wang et.al. |
2503.17195 |
null |
2025-03-21 |
LLMs Love Python: A Study of LLMs’ Bias for Programming Languages and Libraries |
Lukas Twist et.al. |
2503.17181 |
link |
2025-03-21 |
D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens |
Panpan Wang et.al. |
2503.17155 |
null |
2025-03-21 |
Modifying Large Language Model Post-Training for Diverse Creative Writing |
John Joon Young Chung et.al. |
2503.17126 |
link |
2025-03-21 |
Large Language Model Compression via the Nested Activation-Aware Decomposition |
Jun Lu et.al. |
2503.17101 |
null |
2025-03-21 |
Deterministic AI Agent Personality Expression through Standard Psychological Diagnostics |
J. M. Diederik Kruijssen et.al. |
2503.17085 |
null |
2025-03-21 |
A Study into Investigating Temporal Robustness of LLMs |
Jonas Wallat et.al. |
2503.17073 |
null |
2025-03-21 |
PVChat: Personalized Video Chat with One-Shot Learning |
Yufei Shi et.al. |
2503.17069 |
null |
2025-03-21 |
Problem Framing in the AI era: a new model |
Matteo Tuveri et.al. |
2503.17040 |
null |
2025-03-21 |
AnimatePainter: A Self-Supervised Rendering Framework for Reconstructing Painting Process |
Junjie Hu et.al. |
2503.17029 |
null |
2025-03-21 |
RiboFlow: Conditional De Novo RNA Sequence-Structure Co-Design via Synergistic Flow Matching |
Runze Ma et.al. |
2503.17007 |
null |
2025-03-21 |
Text2Model: Generating dynamic chemical reactor models using large language models (LLMs) |
Sophia Rupprecht et.al. |
2503.17004 |
null |
2025-03-21 |
A Survey on Personalized Alignment – The Missing Piece for Large Language Models in Real-World Applications |
Jian Guan et.al. |
2503.17003 |
null |
2025-03-21 |
Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation |
Qinghe Ma et.al. |
2503.16997 |
link |
2025-03-21 |
TRACE: Time SeRies PArameter EffiCient FinE-tuning |
Yuze Li et.al. |
2503.16991 |
null |
2025-03-21 |
Token Dynamics: Towards Efficient and Dynamic Video Token Representation for Video Large Language Models |
Haichao Zhang et.al. |
2503.16980 |
null |
2025-03-21 |
Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks |
Julian Junyan Wang et.al. |
2503.16974 |
null |
2025-03-21 |
Distilling Monocular Foundation Model for Fine-grained Depth Completion |
Yingping Liang et.al. |
2503.16970 |
null |
2025-03-21 |
HyperLoRA: Parameter-Efficient Adaptive Generation for Portrait Synthesis |
Mengtian Li et.al. |
2503.16944 |
null |
2025-03-21 |
TEMPO: Temporal Preference Optimization of Video LLMs via Difficulty Scheduling and Pre-SFT Alignment |
Shicheng Li et.al. |
2503.16929 |
link |
2025-03-21 |
RustEvo^2: An Evolving Benchmark for API Evolution in LLM-based Rust Code Generation |
Linxi Liang et.al. |
2503.16922 |
link |
2025-03-21 |
Malliavin-Bismut Score-based Diffusion Models |
Ehsan Mirafzali et.al. |
2503.16917 |
null |
2025-03-21 |
FAIT: Fault-Aware Fine-Tuning for Better Code Generation |
Lishui Fan et.al. |
2503.16913 |
null |
2025-03-21 |
Improving the End-to-End Efficiency of Offline Inference for Multi-LLM Applications Based on Sampling and Simulation |
Jingzhi Fang et.al. |
2503.16893 |
null |
2025-03-21 |
Federated Cross-Domain Click-Through Rate Prediction With Large Language Model Augmentation |
Jiangcheng Qin et.al. |
2503.16875 |
null |
2025-03-21 |
MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization |
Jian Zhang et.al. |
2503.16874 |
null |
2025-03-21 |
Lie Detector: Unified Backdoor Detection via Cross-Examination Framework |
Xuan Wang et.al. |
2503.16872 |
null |
2025-03-21 |
Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs |
Anshumann et.al. |
2503.16870 |
null |
2025-03-21 |
Nonparametric Factor Analysis and Beyond |
Yujia Zheng et.al. |
2503.16865 |
null |
2025-03-21 |
MTBench: A Multimodal Time Series Benchmark for Temporal Reasoning and Question Answering |
Jialin Chen et.al. |
2503.16858 |
link |
2025-03-21 |
Generative Compositor for Few-Shot Visual Information Extraction |
Zhibo Yang et.al. |
2503.16854 |
null |
2025-03-21 |
Imagine to Hear: Auditory Knowledge Generation can be an Effective Assistant for Language Models |
Suho Yoo et.al. |
2503.16853 |
null |
2025-03-21 |
Towards LLM Guardrails via Sparse Representation Steering |
Zeqing He et.al. |
2503.16851 |
null |
2025-03-21 |
LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models |
Jian Liang et.al. |
2503.16843 |
null |
2025-03-21 |
Downstream Analysis of Foundational Medical Vision Models for Disease Progression |
Basar Demir et.al. |
2503.16842 |
null |
2025-03-21 |
When Tom Eats Kimchi: Evaluating Cultural Bias of Multimodal Large Language Models in Cultural Mixture Contexts |
Jun Seong Kim et.al. |
2503.16826 |
null |
2025-03-21 |
When Debate Fails: Bias Reinforcement in Large Language Models |
Jihwan Oh et.al. |
2503.16814 |
null |
2025-03-21 |
Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models |
Mengsong Wu et.al. |
2503.16779 |
link |
2025-03-21 |
Current and Future Use of Large Language Models for Knowledge Work |
Michelle Brachman et.al. |
2503.16774 |
null |
2025-03-21 |
On Explaining (Large) Language Models For Code Using Global Code-Based Explanations |
David N. Palacio et.al. |
2503.16771 |
null |
2025-03-20 |
Automated Harmfulness Testing for Code Large Language Models |
Honghao Tan et.al. |
2503.16740 |
null |
2025-03-20 |
Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models |
Chengkai Huang et.al. |
2503.16734 |
null |
2025-03-20 |
Natural Language Generation |
Emiel van Miltenburg et.al. |
2503.16728 |
null |
2025-03-20 |
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding |
Jinlong Li et.al. |
2503.16707 |
link |
2025-03-20 |
APPA: Agentic Preformulation Pathway Assistant |
Julius Lange et.al. |
2503.16698 |
null |
2025-03-20 |
GAIR: Improving Multimodal Geo-Foundation Model with Geo-Aligned Implicit Representations |
Zeping Liu et.al. |
2503.16683 |
null |
2025-03-20 |
Echoes of Power: Investigating Geopolitical Bias in US and China Large Language Models |
Andre G. C. Pacheco et.al. |
2503.16679 |
null |
2025-03-20 |
Accelerating Transformer Inference and Training with 2:4 Activation Sparsity |
Daniel Haziza et.al. |
2503.16672 |
null |
2025-03-20 |
Code Evolution Graphs: Understanding Large Language Model Driven Design of Algorithms |
Niki van Stein et.al. |
2503.16668 |
null |
2025-03-20 |
A preliminary data fusion study to assess the feasibility of Foundation Process-Property Models in Laser Powder Bed Fusion |
Oriol Vendrell-Gallart et.al. |
2503.16667 |
null |
2025-03-20 |
Accelerating Antibiotic Discovery with Large Language Models and Knowledge Graphs |
Maxime Delmas et.al. |
2503.16655 |
null |
2025-03-20 |
Leveraging Large Language Models for Explainable Activity Recognition in Smart Homes: A Critical Evaluation |
Michele Fiori et.al. |
2503.16622 |
null |
2025-03-20 |
A Recipe for Generating 3D Worlds From a Single Image |
Katja Schwarz et.al. |
2503.16611 |
null |
2025-03-20 |
Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions |
Hadi Amini et.al. |
2503.16585 |
link |
2025-03-22 |
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation |
Yuqing Wang et.al. |
2503.16430 |
null |
2025-03-20 |
DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding |
Keyan Chen et.al. |
2503.16426 |
link |
2025-03-20 |
SynCity: Training-Free Generation of 3D Worlds |
Paul Engstler et.al. |
2503.16420 |
null |
2025-03-20 |
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models |
Yang Sui et.al. |
2503.16419 |
link |
2025-03-20 |
M3: 3D-Spatial MultiModal Memory |
Xueyan Zou et.al. |
2503.16413 |
link |
2025-03-20 |
DreamTexture: Shape from Virtual Texture with Analysis by Augmentation |
Ananta R. Bhattarai et.al. |
2503.16412 |
null |
2025-03-20 |
VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness |
SeungJu Cha et.al. |
2503.16406 |
link |
2025-03-20 |
The Emperor’s New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination |
Yifan Sun et.al. |
2503.16402 |
link |
2025-03-20 |
Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them |
Guanyu Chen et.al. |
2503.16401 |
null |
2025-03-20 |
Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation |
Yijia Luo et.al. |
2503.16385 |
link |
2025-03-20 |
LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images |
Leyang Wang et.al. |
2503.16376 |
null |
2025-03-20 |
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse |
Muyao Li et.al. |
2503.16365 |
null |
2025-03-20 |
CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners |
Yunzhi Yao et.al. |
2503.16356 |
link |
2025-03-20 |
Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences |
Krithik Ramesh et.al. |
2503.16351 |
null |
2025-03-20 |
LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates |
Ying Shen et.al. |
2503.16334 |
null |
2025-03-20 |
OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence |
Long Yuan et.al. |
2503.16326 |
null |
2025-03-20 |
Issue2Test: Generating Reproducing Test Cases from Issue Reports |
Noor Nashid et.al. |
2503.16320 |
null |
2025-03-21 |
Bridging Technology and Humanities: Evaluating the Impact of Large Language Models on Social Sciences Research with DeepSeek-R1 |
Peiran Gu et.al. |
2503.16304 |
null |
2025-03-20 |
SceneMI: Motion In-betweening for Modeling Human-Scene Interactions |
Inwoo Hwang et.al. |
2503.16289 |
null |
2025-03-21 |
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens |
Shuqi Lu et.al. |
2503.16278 |
link |
2025-03-20 |
Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data |
Zijian Li et.al. |
2503.16260 |
null |
2025-03-20 |
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models |
Keda Tao et.al. |
2503.16257 |
null |
2025-03-21 |
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning |
Zhaowei Liu et.al. |
2503.16252 |
link |
2025-03-20 |
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn’t |
Quy-Anh Dang et.al. |
2503.16219 |
link |
2025-03-20 |
MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion |
Qizhi Pei et.al. |
2503.16212 |
link |
2025-03-20 |
VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data Synthesis |
Chia-Yi Hsu et.al. |
2503.16195 |
null |
2025-03-21 |
Affective Polarization Amongst Swedish Politicians |
François t’Serstevens et.al. |
2503.16193 |
link |
2025-03-20 |
Large Language Models for Water Distribution Systems Modeling and Decision-Making |
Yinon Goldshtein et.al. |
2503.16191 |
null |
2025-03-20 |
CLS-RL: Image Classification with Rule-Based Reinforcement Learning |
Ming Li et.al. |
2503.16188 |
link |
2025-03-20 |
Narrowing Class-Wise Robustness Gaps in Adversarial Training |
Fatemeh Amerehi et.al. |
2503.16179 |
null |
2025-03-20 |
CodeReviewQA: The Code Review Comprehension Assessment for Large Language Models |
Hong Yi Lin et.al. |
2503.16167 |
null |
2025-03-20 |
SpeCache: Speculative Key-Value Caching for Efficient Generation of LLMs |
Shibo Jie et.al. |
2503.16163 |
null |
2025-03-20 |
Towards Lighter and Robust Evaluation for Retrieval Augmented Generation |
Alex-Razvan Ispas et.al. |
2503.16161 |
link |
2025-03-20 |
Automatically Generating Chinese Homophone Words to Probe Machine Translation Estimation Systems |
Shenbin Qian et.al. |
2503.16158 |
link |
2025-03-20 |
Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models |
Mats Faulborn et.al. |
2503.16148 |
null |
2025-03-20 |
Unify and Triumph: Polyglot, Diverse, and Self-Consistent Generation of Unit Tests with LLMs |
Djamel Eddine Khelladi et.al. |
2503.16144 |
null |
2025-03-21 |
MKG-Rank: Enhancing Large Language Models with Knowledge Graph for Multilingual Medical Question Answering |
Feiyang Li et.al. |
2503.16131 |
null |
2025-03-20 |
The Impact of Revealing Large Language Model Stochasticity on Trust, Reliability, and Anthropomorphization |
Chelse Swoopes et.al. |
2503.16114 |
null |
2025-03-20 |
OSLoPrompt: Bridging Low-Supervision Challenges and Open-Set Domain Generalization in CLIP |
Mohamad Hassan N C et.al. |
2503.16106 |
link |
2025-03-20 |
Cultural Alignment in Large Language Models Using Soft Prompt Tuning |
Reem I. Masoud et.al. |
2503.16094 |
null |
2025-03-20 |
Quantum Chebyshev Probabilistic Models for Fragmentation Functions |
Jorge J. Martínez de Lejarza et.al. |
2503.16073 |
null |
2025-03-20 |
Tuning LLMs by RAG Principles: Towards LLM-native Memory |
Jiale Wei et.al. |
2503.16071 |
link |
2025-03-20 |
SALT: Singular Value Adaptation with Low-Rank Transformation |
Abdelrahman Elsayed et.al. |
2503.16055 |
link |
2025-03-20 |
Meta-Learning Neural Mechanisms rather than Bayesian Priors |
Michael Goodale et.al. |
2503.16048 |
null |
2025-03-20 |
Incomplete Utterance Rewriting with Editing Operation Guidance and Utterance Augmentation |
Zhiyu Cao et.al. |
2503.16043 |
null |
2025-03-20 |
GreenIQ: A Deep Search Platform for Comprehensive Carbon Market Analysis and Automated Report Generation |
Bisola Faith Kayode et.al. |
2503.16041 |
null |
2025-03-20 |
Evaluating Test-Time Scaling LLMs for Legal Reasoning: OpenAI o1, DeepSeek-R1, and Beyond |
Yaoyao Yu et.al. |
2503.16040 |
null |
2025-03-20 |
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models |
Zhihang Liu et.al. |
2503.16036 |
link |
2025-03-20 |
The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement |
Ruihan Yang et.al. |
2503.16024 |
null |
2025-03-20 |
BadToken: Token-level Backdoor Attacks to Multi-modal Large Language Models |
Zenghui Yuan et.al. |
2503.16023 |
null |
2025-03-20 |
Corrective In-Context Learning: Evaluating Self-Correction in Large Language Models |
Mario Sanz-Guerrero et.al. |
2503.16022 |
link |
2025-03-21 |
Autonomous AI imitators increase diversity in homogeneous information ecosystems |
Emil Bakkensen Johansen et.al. |
2503.16021 |
null |
2025-03-20 |
GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions |
Xiaomeng Chu et.al. |
2503.16013 |
null |
2025-03-20 |
“This could save us months of work” – Use Cases of AI and Automation Support in Investigative Journalism |
Besjon Cifliku et.al. |
2503.16011 |
null |
2025-03-20 |
ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph |
Langming Liu et.al. |
2503.15990 |
null |
2025-03-20 |
A Survey on fMRI-based Brain Decoding for Reconstructing Multimodal Stimuli |
Pengyu Liu et.al. |
2503.15978 |
null |
2025-03-20 |
Stability of Schrödinger bridges and Sinkhorn semigroups for log-concave models |
Pierre Del Moral et.al. |
2503.15963 |
null |
2025-03-20 |
GAN-enhanced Simulation-driven DNN Testing in Absence of Ground Truth |
Mohammed Attaoui et.al. |
2503.15953 |
null |
2025-03-20 |
From Chaos to Order: The Atomic Reasoner Framework for Fine-grained Reasoning in Large Language Models |
Jinyi Liu et.al. |
2503.15944 |
null |
2025-03-21 |
Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment |
Gaole Dai et.al. |
2503.15937 |
null |
2025-03-20 |
Towards Automatic Continual Learning: A Self-Adaptive Framework for Continual Instruction Tuning |
Peiyi Lin et.al. |
2503.15924 |
null |
2025-03-20 |
SPIN: Accelerating Large Language Model Inference with Heterogeneous Speculative Models |
Fahao Chen et.al. |
2503.15921 |
null |
2025-03-20 |
Learning to Efficiently Adapt Foundation Models for Self-Supervised Endoscopic 3D Scene Reconstruction from Any Cameras |
Beilei Cui et.al. |
2503.15917 |
null |
2025-03-20 |
From Structured Prompts to Open Narratives: Measuring Gender Bias in LLMs Through Open-Ended Storytelling |
Evan Chen et.al. |
2503.15904 |
null |
2025-03-20 |
Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models |
Baolong Bi et.al. |
2503.15888 |
link |
2025-03-21 |
Enhancing Zero-Shot Image Recognition in Vision-Language Models through Human-like Concept Guidance |
Hui Liu et.al. |
2503.15886 |
null |
2025-03-20 |
DeepPsy-Agent: A Stage-Aware and Deep-Thinking Emotional Support Agent System |
Kai Chen et.al. |
2503.15876 |
null |
2025-03-20 |
MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations |
Kyungho Bae et.al. |
2503.15871 |
null |
2025-03-20 |
TruthLens: Explainable DeepFake Detection for Face Manipulated and Fully Synthetic Data |
Rohit Kundu et.al. |
2503.15867 |
null |
2025-03-20 |
DroidTTP: Mapping Android Applications with TTP for Cyber Threat Intelligence |
Dincy R Arikkat et.al. |
2503.15866 |
link |
2025-03-20 |
VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling |
Hyojun Go et.al. |
2503.15855 |
null |
2025-03-20 |
Uncertainty Quantification and Confidence Calibration in Large Language Models: A Survey |
Xiaoou Liu et.al. |
2503.15850 |
null |
2025-03-20 |
Entropy-based Exploration Conduction for Multi-step Reasoning |
Jinghan Zhang et.al. |
2503.15848 |
null |
2025-03-20 |
Automatic Generation of Safety-compliant Linear Temporal Logic via Large Language Model: A Self-supervised Framework |
Junle Li et.al. |
2503.15840 |
null |
2025-03-20 |
Enhancing LLM Code Generation with Ensembles: A Similarity-Based Selection Approach |
Tarek Mahmud et.al. |
2503.15838 |
null |
2025-03-20 |
Fùxì: A Benchmark for Evaluating Language Models on Ancient Chinese Text Understanding and Generation |
Shangqing Zhao et.al. |
2503.15837 |
link |
2025-03-20 |
Computation-Efficient and Recognition-Friendly 3D Point Cloud Privacy Protection |
Haotian Ma et.al. |
2503.15818 |
null |
2025-03-20 |
A Vision Centric Remote Sensing Benchmark |
Abduljaleel Adejumo et.al. |
2503.15816 |
null |
2025-03-20 |
Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing |
Vishnu Asutosh Dasu et.al. |
2503.15815 |
null |
2025-03-20 |
ChatGPT and U(X): A Rapid Review on Measuring the User Experience |
Katie Seaborn et.al. |
2503.15808 |
null |
2025-03-20 |
Video-VoT-R1: An efficient video inference model integrating image packing and AoE architecture |
Cheng Li et.al. |
2503.15807 |
null |
2025-03-20 |
DNA Bench: When Silence is Smarter – Benchmarking Over-Reasoning in Reasoning LLMs |
Masoud Hashemi et.al. |
2503.15793 |
null |
2025-03-20 |
RL4Med-DDPO: Reinforcement Learning for Controlled Guidance Towards Diverse Medical Image Generation using Vision-Language Foundation Models |
Parham Saremi et.al. |
2503.15784 |
null |
2025-03-20 |
Grammar and Gameplay-aligned RL for Game Description Generation with LLMs |
Tsunehiko Tanaka et.al. |
2503.15783 |
null |
2025-03-20 |
AutoDrive-QA- Automated Generation of Multiple-Choice Questions for Autonomous Driving Datasets Using Large Vision-Language Models |
Boshra Khalili et.al. |
2503.15778 |
null |
2025-03-20 |
Detecting LLM-Written Peer Reviews |
Vishisht Rao et.al. |
2503.15772 |
link |
2025-03-20 |
Towards Agentic AI Networking in 6G: A Generative Foundation Model-as-Agent Approach |
Yong Xiao et.al. |
2503.15764 |
null |
2025-03-20 |
Dialogic Learning in Child-Robot Interaction: A Hybrid Approach to Personalized Educational Content Generation |
Elena Malnatsky et.al. |
2503.15762 |
null |
2025-03-20 |
GraPLUS: Graph-based Placement Using Semantics for Image Composition |
Mir Mohammad Khaleghi et.al. |
2503.15761 |
null |
2025-03-20 |
AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration |
Andy Zhou et.al. |
2503.15754 |
null |
2025-03-20 |
Using Language Models to Decipher the Motivation Behind Human Behaviors |
Yutong Xie et.al. |
2503.15752 |
null |
2025-03-19 |
Reinforcement Learning Environment with LLM-Controlled Adversary in D&D 5th Edition Combat |
Joseph Emmanuel DL Dayo et.al. |
2503.15726 |
null |
2025-03-21 |
Leveraging MoE-based Large Language Model for Zero-Shot Multi-Task Semantic Communication |
Sin-Yu Huang et.al. |
2503.15722 |
null |
2025-03-19 |
Am I eligible? Natural Language Inference for Clinical Trial Patient Recruitment: the Patient’s Point of View |
Mathilde Aguiar et.al. |
2503.15718 |
link |
2025-03-19 |
Safety Aware Task Planning via Large Language Models in Robotics |
Azal Ahmad Khan et.al. |
2503.15707 |
null |
2025-03-19 |
GASP: Unifying Geometric and Semantic Self-Supervised Pre-training for Autonomous Driving |
William Ljungbergh et.al. |
2503.15672 |
null |
2025-03-19 |
Enhancing Pancreatic Cancer Staging with Large Language Models: The Role of Retrieval-Augmented Generation |
Hisashi Johno et.al. |
2503.15664 |
null |
2025-03-19 |
R$^2$: A LLM Based Novel-to-Screenplay Generation Framework with Causal Plot Graphs |
Zefeng Lin et.al. |
2503.15655 |
null |
2025-03-19 |
LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning |
Federico Cocchi et.al. |
2503.15621 |
link |
2025-03-19 |
Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings |
Austin Xu et.al. |
2503.15620 |
link |
2025-03-19 |
SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks |
Yifei Zhou et.al. |
2503.15478 |
link |
2025-03-19 |
Cube: A Roblox View of 3D Intelligence |
Foundation AI Team et.al. |
2503.15475 |
link |
2025-03-19 |
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining |
Boshen Xu et.al. |
2503.15470 |
link |
2025-03-19 |
From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment |
Jia-Nan Li et.al. |
2503.15463 |
link |
2025-03-19 |
Di$\mathtt{[M]}$O: Distilling Masked Diffusion Models into One-step Generator |
Yuanzhi Zhu et.al. |
2503.15457 |
null |
2025-03-19 |
SkyLadder: Better and Faster Pretraining via Context Window Scheduling |
Tongyao Zhu et.al. |
2503.15450 |
link |
2025-03-19 |
Visual Position Prompt for MLLM based Visual Grounding |
Wei Tang et.al. |
2503.15426 |
link |
2025-03-19 |
Probing the topology of the space of tokens with structured prompts |
Michael Robinson et.al. |
2503.15421 |
null |
2025-03-19 |
LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding |
Amirhossein Kazerouni et.al. |
2503.15420 |
null |
2025-03-19 |
Temporal Regularization Makes Your Video Generator Stronger |
Harold Haodong Chen et.al. |
2503.15417 |
null |
2025-03-19 |
Visual Persona: Foundation Model for Full-Body Human Customization |
Jisu Nam et.al. |
2503.15406 |
null |
2025-03-19 |
FedSCA: Federated Tuning with Similarity-guided Collaborative Aggregation for Heterogeneous Medical Image Segmentation |
Yumin Zhang et.al. |
2503.15390 |
null |
2025-03-19 |
Material Decomposition in Photon-Counting Computed Tomography with Diffusion Models: Comparative Study and Hybridization with Variational Regularizers |
Corentin Vazia et.al. |
2503.15383 |
null |
2025-03-19 |
EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models |
Yinan Liang et.al. |
2503.15369 |
null |
2025-03-19 |
SemEval-2025 Task 1: AdMIRe – Advancing Multimodal Idiomaticity Representation |
Thomas Pickard et.al. |
2503.15358 |
null |
2025-03-19 |
SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models |
I-Fan Lin et.al. |
2503.15351 |
null |
2025-03-19 |
TruthLens: A Training-Free Paradigm for DeepFake Detection |
Ritabrata Chakraborty et.al. |
2503.15342 |
null |
2025-03-19 |
Uncertainty-Guided Chain-of-Thought for Code Generation with LLMs |
Yuqi Zhu et.al. |
2503.15341 |
null |
2025-03-19 |
Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context |
Junyi Ao et.al. |
2503.15338 |
link |
2025-03-19 |
Euclid Quick Data Release (Q1) Exploring galaxy properties with a multi-modal foundation model |
Euclid Collaboration et.al. |
2503.15312 |
link |
2025-03-19 |
Euclid Quick Data Release (Q1): First visual morphology catalogue |
Euclid Collaboration et.al. |
2503.15310 |
link |
2025-03-19 |
aiXcoder-7B-v2: Training LLMs to Fully Utilize the Long Context in Repository-level Code Completion |
Jia Li et.al. |
2503.15301 |
null |
2025-03-19 |
Inside-Out: Hidden Factual Knowledge in LLMs |
Zorik Gekhman et.al. |
2503.15299 |
null |
2025-03-19 |
SENAI: Towards Software Engineering Native Generative Artificial Intelligence |
Mootez Saad et.al. |
2503.15282 |
null |
2025-03-19 |
MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration |
David Wan et.al. |
2503.15272 |
null |
2025-03-19 |
Do Chains-of-Thoughts of Large Language Models Suffer from Hallucinations, Cognitive Biases, or Phobias in Bayesian Reasoning? |
Roberto Araya et.al. |
2503.15268 |
null |
2025-03-19 |
LEGION: Learning to Ground and Explain for Synthetic Image Detection |
Hengrui Kang et.al. |
2503.15264 |
null |
2025-03-19 |
Efficient allocation of image recognition and LLM tasks on multi-GPU system |
Marcin Lawenda et.al. |
2503.15252 |
null |
2025-03-19 |
Automated Non-Functional Requirements Generation in Software Engineering with Large Language Models: A Comparative Study |
Jomar Thomas Almonte et.al. |
2503.15248 |
null |
2025-03-19 |
Exploring Large Language Models for Word Games: Who is the Spy? |
Chentian Wei et.al. |
2503.15235 |
link |
2025-03-19 |
When LLMs Meet API Documentation: Can Retrieval Augmentation Aid Code Generation Just as It Helps Developers? |
Jingyi Chen et.al. |
2503.15231 |
null |
2025-03-19 |
A Personalized Data-Driven Generative Model of Human Motion |
Angelo Di Porzio et.al. |
2503.15225 |
null |
2025-03-19 |
A Foundation Model for Patient Behavior Monitoring and Suicide Detection |
Rodrigo Oliver et.al. |
2503.15221 |
null |
2025-03-19 |
Context-Aware Vision Language Foundation Models for Ocular Disease Screening in Retinal Images |
Lucie Berger et.al. |
2503.15212 |
null |
2025-03-19 |
DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation |
Jiazhe Guo et.al. |
2503.15208 |
null |
2025-03-19 |
Benchmarking Large Language Models for Handwritten Text Recognition |
Giorgia Crosilla et.al. |
2503.15195 |
null |
2025-03-19 |
Optimizing Retrieval Strategies for Financial Question Answering Documents in Retrieval-Augmented Generation Systems |
Sejong Kim et.al. |
2503.15191 |
link |
2025-03-19 |
Foundation models may exhibit staged progression in novel CBRN threat disclosure |
Kevin M Esvelt et.al. |
2503.15182 |
null |
2025-03-19 |
A Review on Large Language Models for Visual Analytics |
Navya Sonal Agarwal et.al. |
2503.15176 |
null |
2025-03-19 |
Comparing Llama3 and DeepSeekR1 on Biomedical Text Classification Tasks |
Yuting Guo et.al. |
2503.15169 |
null |
2025-03-19 |
Object-Centric Pretraining via Target Encoder Bootstrapping |
Nikola Đukić et.al. |
2503.15141 |
null |
2025-03-19 |
VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention |
Mingzhe Zheng et.al. |
2503.15138 |
null |
2025-03-19 |
Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models |
Man Fai Wong et.al. |
2503.15129 |
null |
2025-03-19 |
Text-Derived Relational Graph-Enhanced Network for Skeleton-Based Action Segmentation |
Haoyu Ji et.al. |
2503.15126 |
null |
2025-03-19 |
Exploring Model Editing for LLM-based Aspect-Based Sentiment Classification |
Shichen Li et.al. |
2503.15117 |
null |
2025-03-19 |
DeCaFlow: A Deconfounding Causal Generative Model |
Alejandro Almodóvar et.al. |
2503.15114 |
link |
2025-03-19 |
Reasoning Effort and Problem Complexity: A Scaling Analysis in LLMs |
Benjamin Estermann et.al. |
2503.15113 |
null |
2025-03-19 |
OpenLLM-RTL: Open Dataset and Benchmark for LLM-Aided Design RTL Generation |
Shang Liu et.al. |
2503.15112 |
link |
2025-03-19 |
VIPER: Visual Perception and Explainable Reasoning for Sequential Decision-Making |
Mohamed Salim Aissi et.al. |
2503.15108 |
null |
2025-03-19 |
Towards Understanding the Safety Boundaries of DeepSeek Models: Evaluation and Findings |
Zonghao Ying et.al. |
2503.15092 |
link |
2025-03-19 |
Intelligent Spatial Perception by Building Hierarchical 3D Scene Graphs for Indoor Scenarios with the Help of LLMs |
Yao Cheng et.al. |
2503.15091 |
null |
2025-03-19 |
LogiAgent: Automated Logical Testing for REST Systems with LLM-Based Multi-Agents |
Ke Zhang et.al. |
2503.15079 |
null |
2025-03-19 |
Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis |
Imanol G. Estepa et.al. |
2503.15060 |
null |
2025-03-19 |
ELTEX: A Framework for Domain-Driven Synthetic Data Generation |
Arina Razmyslovich et.al. |
2503.15055 |
link |
2025-03-19 |
Studying and Understanding the Effectiveness and Failures of Conversational LLM-Based Repair |
Aolin Chen et.al. |
2503.15050 |
null |
2025-03-19 |
SPADE: Systematic Prompt Framework for Automated Dialogue Expansion in Machine-Generated Text Detection |
Haoyi Li et.al. |
2503.15044 |
null |
2025-03-19 |
DRoPE: Directional Rotary Position Embedding for Efficient Agent Interaction Modeling |
Jianbo Zhao et.al. |
2503.15029 |
null |
2025-03-19 |
Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene |
Shengqiong Wu et.al. |
2503.15019 |
null |
2025-03-19 |
LLM Alignment for the Arabs: A Homogenous Culture or Diverse Ones? |
Amr Keleg et.al. |
2503.15003 |
null |
2025-03-19 |
Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering |
Francesco Maria Molfese et.al. |
2503.14996 |
null |
2025-03-19 |
ChatStitch: Visualizing Through Structures via Surround-View Unsupervised Deep Image Stitching with Collaborative LLM-Agents |
Hao Liang et.al. |
2503.14948 |
null |
2025-03-19 |
Generating Multimodal Driving Scenes via Next-Scene Prediction |
Yanhao Wu et.al. |
2503.14945 |
null |
2025-03-19 |
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation |
Qihui Zhang et.al. |
2503.14941 |
null |
2025-03-19 |
VisNumBench: Evaluating Number Sense of Multimodal Large Language Models |
Tengjin Weng et.al. |
2503.14939 |
null |
2025-03-19 |
Proceedings of the 3rd Italian Conference on Big Data and Data Science (ITADATA2024) |
Nicola Bena et.al. |
2503.14937 |
null |
2025-03-19 |
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding |
Chongjun Tu et.al. |
2503.14935 |
null |
2025-03-19 |
Prada: Black-Box LLM Adaptation with Private Data on Resource-Constrained Devices |
Ziyao Wang et.al. |
2503.14932 |
null |
2025-03-19 |
GenM$^3$: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation |
Junyu Shi et.al. |
2503.14919 |
null |
2025-03-19 |
MASS: Mathematical Data Selection via Skill Graphs for Pretraining Large Language Models |
Jiazheng Li et.al. |
2503.14917 |
null |
2025-03-19 |
Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology |
Siyuan Yan et.al. |
2503.14911 |
link |
2025-03-19 |
POSTA: A Go-to Framework for Customized Artistic Poster Generation |
Haoyu Chen et.al. |
2503.14908 |
null |
2025-03-19 |
Deep Contrastive Unlearning for Language Models |
Estrid He et.al. |
2503.14900 |
null |
2025-03-19 |
When Domain Generalization meets Generalized Category Discovery: An Adaptive Task-Arithmetic Driven Approach |
Vaibhav Rathore et.al. |
2503.14897 |
null |
2025-03-19 |
Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations |
Shuo Li et.al. |
2503.14895 |
null |
2025-03-19 |
MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer |
Honglin Lin et.al. |
2503.14891 |
link |
2025-03-19 |
Pseudo-Relevance Feedback Can Improve Zero-Shot LLM-Based Dense Retrieval |
Hang Li et.al. |
2503.14887 |
null |
2025-03-19 |
Envisioning an AI-Enhanced Mental Health Ecosystem |
Kellie Yu Hui Sim et.al. |
2503.14883 |
null |
2025-03-19 |
Communication-Efficient Distributed On-Device LLM Inference Over Wireless Networks |
Kai Zhang et.al. |
2503.14882 |
null |
2025-03-19 |
Chemical Foundation Model Guided Design of High Ionic Conductivity Electrolyte Formulations |
Murtaza Zohair et.al. |
2503.14878 |
null |
2025-03-19 |
Unlocking the Capabilities of Vision-Language Models for Generalizable and Explainable Deepfake Detection |
Peipeng Yu et.al. |
2503.14853 |
null |
2025-03-19 |
LogLLaMA: Transformer-based log anomaly detection with LLaMA |
Zhuoyi Yang et.al. |
2503.14849 |
null |
2025-03-19 |
Think Like Human Developers: Harnessing Community Knowledge for Structured Code Reasoning |
Chengran Yang et.al. |
2503.14838 |
null |
2025-03-19 |
Robust Transmission of Punctured Text with Large Language Model-based Recovery |
Sojeong Park et.al. |
2503.14831 |
null |
2025-03-19 |
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models |
Chejian Xu et.al. |
2503.14827 |
null |
2025-03-18 |
Bayesian Modeling of Zero-Shot Classifications for Urban Flood Detection |
Matt Franchi et.al. |
2503.14754 |
link |
2025-03-18 |
Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence |
Sophia Hager et.al. |
2503.14749 |
null |
2025-03-18 |
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots |
NVIDIA et.al. |
2503.14734 |
null |
2025-03-18 |
CodingGenie: A Proactive LLM-Powered Programming Assistant |
Sebastian Zhao et.al. |
2503.14724 |
link |
2025-03-18 |
Generating Medically-Informed Explanations for Depression Detection using LLMs |
Xiangyong Chen et.al. |
2503.14671 |
null |
2025-03-18 |
RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving |
Wenqi Jiang et.al. |
2503.14649 |
null |
2025-03-18 |
Towards More Economical Context-Augmented LLM Generation by Reusing Stored KV Cache |
Hanchen Li et.al. |
2503.14647 |
null |
2025-03-18 |
Reinforcement learning-based motion imitation for physiologically plausible musculoskeletal motor control |
Merkourios Simos et.al. |
2503.14637 |
link |
2025-03-18 |
Assessing Large Language Models for Automated Feedback Generation in Learning Programming Problem Solving |
Priscylla Silva et.al. |
2503.14630 |
link |
2025-03-18 |
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives |
Sara Sarto et.al. |
2503.14604 |
null |
2025-03-18 |
Command R7B Arabic: A Small, Enterprise Focused, Multilingual, and Culturally Aware Arabic LLM |
Yazeed Alnumay et.al. |
2503.14603 |
null |
2025-03-18 |
Aligning Multimodal LLM with Human Preference: A Survey |
Tao Yu et.al. |
2503.14504 |
link |
2025-03-18 |
Deeply Supervised Flow-Based Generative Models |
Inkyu Shin et.al. |
2503.14494 |
null |
2025-03-18 |
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control |
NVIDIA et.al. |
2503.14492 |
link |
2025-03-18 |
Engineering Scientific Assistants using Interactive Structured Induction of Programs |
Shraddha Surana et.al. |
2503.14488 |
null |
2025-03-18 |
Gricean Norms as a Basis for Effective Collaboration |
Fardin Saad et.al. |
2503.14484 |
link |
2025-03-18 |
ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing |
Yulin Pan et.al. |
2503.14482 |
null |
2025-03-18 |
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM |
Xinyu Fang et.al. |
2503.14478 |
link |
2025-03-18 |
The Atacama Cosmology Telescope: DR6 Constraints on Extended Cosmological Models |
Erminia Calabrese et.al. |
2503.14454 |
null |
2025-03-18 |
Bolt3D: Generating 3D Scenes in Seconds |
Stanislaw Szymanowicz et.al. |
2503.14445 |
null |
2025-03-18 |
EnvBench: A Benchmark for Automated Environment Setup |
Aleksandra Eliseeva et.al. |
2503.14443 |
link |
2025-03-18 |
LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers |
Nikhil Abhyankar et.al. |
2503.14434 |
link |
2025-03-18 |
PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play |
Wei Fang et.al. |
2503.14432 |
null |
2025-03-18 |
Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models |
Siwei Zhang et.al. |
2503.14411 |
null |
2025-03-18 |
Large Language Models for Virtual Human Gesture Selection |
Parisa Ghanad Torshizi et.al. |
2503.14408 |
null |
2025-03-18 |
DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers |
Mert Bulent Sariyildiz et.al. |
2503.14405 |
null |
2025-03-18 |
Diffusion-based Facial Aesthetics Enhancement with 3D Structure Guidance |
Lisha Li et.al. |
2503.14402 |
null |
2025-03-18 |
From “Hallucination” to “Suture”: Insights from Language Philosophy to Enhance Large Language Models |
Qiantong Wang et.al. |
2503.14392 |
null |
2025-03-18 |
How much do LLMs learn from negative examples? |
Shadi Hamdan et.al. |
2503.14391 |
null |
2025-03-18 |
Good/Evil Reputation Judgment of Celebrities by LLMs via Retrieval Augmented Generation |
Rikuto Tsuchida et.al. |
2503.14382 |
null |
2025-03-18 |
On the Standard Performance Criteria for Applied Control Design: PID, MPC or Machine Learning Controller? |
Pouria Sarhadi et.al. |
2503.14379 |
link |
2025-03-18 |
Impossible Videos |
Zechen Bai et.al. |
2503.14378 |
null |
2025-03-18 |
RFMI: Estimating Mutual Information on Rectified Flow for Text-to-Image Alignment |
Chao Wang et.al. |
2503.14358 |
null |
2025-03-18 |
MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts |
Runqi Meng et.al. |
2503.14355 |
null |
2025-03-18 |
MANTRA: Enhancing Automated Method-Level Refactoring with Contextual RAG and Multi-Agent LLM Collaboration |
Yisen Xu et.al. |
2503.14340 |
null |
2025-03-18 |
DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies |
Wei Song et.al. |
2503.14324 |
link |
2025-03-18 |
COPA: Comparing the Incomparable to Explore the Pareto Front |
Adrián Javaloy et.al. |
2503.14321 |
null |
2025-03-18 |
RoMedFormer: A Rotary-Embedding Transformer Foundation Model for 3D Genito-Pelvic Structure Segmentation in MRI and CT |
Yuheng Li et.al. |
2503.14304 |
null |
2025-03-18 |
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs |
Nicolas Le Roux et.al. |
2503.14286 |
null |
2025-03-18 |
DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal |
Vaibhav Aggarwal et.al. |
2503.14269 |
link |
2025-03-18 |
Quantization-Free Autoregressive Action Transformer |
Ziyad Sheebaelhamd et.al. |
2503.14259 |
link |
2025-03-18 |
InnerSelf: Designing Self-Deepfaked Voice for Emotional Well-being |
Guang Dai et.al. |
2503.14257 |
null |
2025-03-18 |
Towards a Barrier-free GeoQA Portal: Natural Language Interaction with Geospatial Data Using Multi-Agent LLMs and Semantic Search |
Yu Feng et.al. |
2503.14251 |
null |
2025-03-19 |
KG-IRAG: A Knowledge Graph-Based Iterative Retrieval-Augmented Generation Framework for Temporal Reasoning |
Ruiyi Yang et.al. |
2503.14234 |
null |
2025-03-18 |
CRCE: Coreference-Retention Concept Erasure in Text-to-Image Diffusion Models |
Yuyang Xue et.al. |
2503.14232 |
null |
2025-03-18 |
Decision Tree Induction Through LLMs via Semantically-Aware Evolution |
Tennison Liu et.al. |
2503.14217 |
null |
2025-03-18 |
Inferring Event Descriptions from Time Series with Language Models |
Mingtian Tan et.al. |
2503.14190 |
link |
2025-03-18 |
Towards Harmless Multimodal Assistants with Blind Preference Optimization |
Yongqi Li et.al. |
2503.14189 |
null |
2025-03-18 |
Can LLMs Enable Verification in Mainstream Programming? |
Aleksandr Shefer et.al. |
2503.14183 |
null |
2025-03-18 |
EIAD: Explainable Industrial Anomaly Detection Via Multi-Modal Large Language Models |
Zongyun Zhang et.al. |
2503.14162 |
null |
2025-03-18 |
Speculative Decoding for Verilog: Speed and Quality, All in One |
Changran Xu et.al. |
2503.14153 |
null |
2025-03-18 |
Marten: Visual Question Answering with Mask Generation for Multi-modal Document Understanding |
Zining Wang et.al. |
2503.14140 |
null |
2025-03-18 |
CARE: A QLoRA-Fine Tuned Multi-Domain Chatbot With Fast Learning On Minimal Hardware |
Ankit Dutta et.al. |
2503.14136 |
null |
2025-03-18 |
Inference-Time Intervention in Large Language Models for Reliable Requirement Verification |
Paul Darm et.al. |
2503.14130 |
null |
2025-03-18 |
SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models |
Subhadeep Koley et.al. |
2503.14129 |
null |
2025-03-18 |
PET-MAD, a universal interatomic potential for advanced materials modeling |
Arslan Mazitov et.al. |
2503.14118 |
link |
2025-03-18 |
DangerMaps: Personalized Safety Advice for Travel in Urban Environments using a Retrieval-Augmented Language Model |
Jonas Oppenlaender et.al. |
2503.14103 |
null |
2025-03-18 |
Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency |
Jiangxuan Long et.al. |
2503.14076 |
null |
2025-03-18 |
Fast Autoregressive Video Generation with Diagonal Decoding |
Yang Ye et.al. |
2503.14070 |
null |
2025-03-18 |
AIGVE-Tool: AI-Generated Video Evaluation Toolkit with Multifaceted Benchmark |
Xinhao Xiang et.al. |
2503.14064 |
link |
2025-03-18 |
Foundation Feature-Driven Online End-Effector Pose Estimation: A Marker-Free and Learning-Free Approach |
Tianshu Wu et.al. |
2503.14051 |
null |
2025-03-18 |
Learning on LLM Output Signatures for gray-box LLM Behavior Analysis |
Guy Bar-Shalom et.al. |
2503.14043 |
link |
2025-03-18 |
Intra and Inter Parser-Prompted Transformers for Effective Image Restoration |
Cong Wang et.al. |
2503.14037 |
link |
2025-03-18 |
Synthetic Data Generation Using Large Language Models: Advances in Text and Code |
Mihai Nadas et.al. |
2503.14023 |
null |
2025-03-18 |
MP-GUI: Modality Perception with MLLMs for GUI Understanding |
Ziwei Wang et.al. |
2503.14021 |
link |
2025-03-18 |
Predicting Human Choice Between Textually Described Lotteries |
Eyal Marantz et.al. |
2503.14004 |
null |
2025-03-18 |
MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling |
Damian Boborzi et.al. |
2503.14002 |
link |
2025-03-18 |
The KoLMogorov Test: Compression by Code Generation |
Ori Yoran et.al. |
2503.13992 |
null |
2025-03-18 |
Empowering Smaller Models: Tuning LLaMA and Gemma with Chain-of-Thought for Ukrainian Exam Tasks |
Mykyta Syromiatnikov et.al. |
2503.13988 |
link |
2025-03-18 |
DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection |
Jaewoo Song et.al. |
2503.13985 |
null |
2025-03-18 |
SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability |
Jiankang Wang et.al. |
2503.13983 |
null |
2025-03-18 |
Empowering LLMs in Decision Games through Algorithmic Data Synthesis |
Haolin Wang et.al. |
2503.13980 |
null |
2025-03-18 |
FlexVLN: Flexible Adaptation for Diverse Vision-and-Language Navigation Tasks |
Siqi Zhang et.al. |
2503.13966 |
null |
2025-03-18 |
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding |
Siwei Han et.al. |
2503.13964 |
link |
2025-03-18 |
Survey of Adversarial Robustness in Multimodal Large Language Models |
Chengze Jiang et.al. |
2503.13962 |
null |
2025-03-18 |
Improving LLM Video Understanding with 16 Frames Per Second |
Yixuan Li et.al. |
2503.13956 |
null |
2025-03-18 |
ConSCompF: Consistency-focused Similarity Comparison Framework for Generative Large Language Models |
Alexey Karev et.al. |
2503.13923 |
null |
2025-03-18 |
MoK-RAG: Mixture of Knowledge Paths Enhanced Retrieval-Augmented Generation for Embodied AI Environments |
Zhengsheng Guo et.al. |
2503.13882 |
null |
2025-03-18 |
MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation |
Donggon Jang et.al. |
2503.13881 |
link |
2025-03-18 |
Bridging Social Psychology and LLM Reasoning: Conflict-Aware Meta-Review Generation via Cognitive Alignment |
Wei Chen et.al. |
2503.13879 |
null |
2025-03-18 |
Enabling Inclusive Systematic Reviews: Incorporating Preprint Articles with Large Language Model-Driven Evaluations |
Rui Yang et.al. |
2503.13857 |
null |
2025-03-18 |
MDTeamGPT: A Self-Evolving LLM-based Multi-Agent Framework for Multi-Disciplinary Team Medical Consultation |
Kai Chen et.al. |
2503.13856 |
null |
2025-03-18 |
Causal Discovery from Data Assisted by Large Language Models |
Kamyar Barakati et.al. |
2503.13833 |
null |
2025-03-18 |
Scale-Aware Contrastive Reverse Distillation for Unsupervised Medical Anomaly Detection |
Chunlei Li et.al. |
2503.13828 |
link |
2025-03-18 |
LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions |
Xiaopei Chen et.al. |
2503.13819 |
null |
2025-03-18 |
Automatic MILP Model Construction for Multi-Robot Task Allocation and Scheduling Based on Large Language Models |
Mingming Peng et.al. |
2503.13813 |
null |
2025-03-18 |
The Empty Chair: Using LLMs to Raise Missing Perspectives in Policy Deliberations |
Suyash Fulay et.al. |
2503.13812 |
null |
2025-03-18 |
Empowering GraphRAG with Knowledge Filtering and Integration |
Kai Guo et.al. |
2503.13804 |
null |
2025-03-18 |
LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation |
Yang Zhou et.al. |
2503.13794 |
null |
2025-03-18 |
Mapping the Trust Terrain: LLMs in Software Engineering – Insights and Perspectives |
Dipin Khati et.al. |
2503.13793 |
null |
2025-03-17 |
Mitigating KV Cache Competition to Enhance User Experience in LLM Inference |
Haiying Shen et.al. |
2503.13773 |
null |
2025-03-17 |
Do Large Language Models Understand Performance Optimization? |
Bowen Cui et.al. |
2503.13772 |
null |
2025-03-17 |
Continual Unlearning for Foundational Text-to-Image Models without Generalization Erosion |
Kartik Thakral et.al. |
2503.13769 |
null |
2025-03-17 |
AccelGen: Heterogeneous SLO-Guaranteed High-Throughput LLM Inference Serving for Diverse Applications |
Haiying Shen et.al. |
2503.13737 |
null |
2025-03-17 |
CoDet-M4: Detecting Machine-Generated Code in Multi-Lingual, Multi-Generator and Multi-Domain Settings |
Daniil Orel et.al. |
2503.13733 |
null |
2025-03-17 |
FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models |
Minghan Li et.al. |
2503.13684 |
null |
2025-03-17 |
Pensez: Less Data, Better Reasoning – Rethinking French LLM |
Huy Hoang Ha et.al. |
2503.13661 |
null |
2025-03-17 |
INPROVF: Leveraging Large Language Models to Repair High-level Robot Controllers from Assumption Violations |
Qian Meng et.al. |
2503.13660 |
null |
2025-03-17 |
SOSecure: Safer Code Generation with RAG and StackOverflow Discussions |
Manisha Mukherjee et.al. |
2503.13654 |
null |
2025-03-17 |
Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos |
Chiara Plizzari et.al. |
2503.13646 |
link |
2025-03-17 |
Plasmon-Plasmon Interaction in Nanoparticle Assemblies: Role of the Dipole-Quadrupole Coupling |
Olivier Masset et.al. |
2503.13645 |
null |
2025-03-17 |
Evaluating Programming Language Confusion |
Micheline Bénédicte Moumoula et.al. |
2503.13620 |
null |
2025-03-17 |
MetaScale: Test-Time Scaling with Evolving Meta-Thoughts |
Qin Liu et.al. |
2503.13447 |
null |
2025-03-17 |
MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation |
Zhenyu Wu et.al. |
2503.13446 |
null |
2025-03-17 |
Faithfulness of LLM Self-Explanations for Commonsense Tasks: Larger Is Better, and Instruction-Tuning Allows Trade-Offs but Not Pareto Dominance |
Noah Y. Siegel et.al. |
2503.13445 |
null |
2025-03-17 |
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning |
Ye Liu et.al. |
2503.13444 |
link |
2025-03-17 |
Amodal3R: Amodal 3D Reconstruction from Occluded 2D Images |
Tianhao Wu et.al. |
2503.13439 |
null |
2025-03-17 |
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference |
Maximilian Beck et.al. |
2503.13427 |
link |
2025-03-17 |
Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation |
Xinyu Lian et.al. |
2503.13424 |
null |
2025-03-17 |
A Comprehensive Survey on Multi-Agent Cooperative Decision-Making: Scenarios, Approaches, Challenges and Perspectives |
Weiqiang Jin et.al. |
2503.13415 |
null |
2025-03-18 |
DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective |
Dengyun Peng et.al. |
2503.13413 |
link |
2025-03-17 |
Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis |
Alexander Ku et.al. |
2503.13401 |
null |
2025-03-17 |
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research |
James Burgess et.al. |
2503.13399 |
link |
2025-03-17 |
Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning |
Mengyao Lyu et.al. |
2503.13383 |
null |
2025-03-17 |
Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning |
Hai-Long Sun et.al. |
2503.13360 |
null |
2025-03-17 |
Agents Play Thousands of 3D Video Games |
Zhongwen Xu et.al. |
2503.13356 |
null |
2025-03-17 |
Valid Text-to-SQL Generation with Unification-based DeepStochLog |
Ying Jiao et.al. |
2503.13342 |
link |
2025-03-17 |
LearnMate: Enhancing Online Education with LLM-Powered Personalized Learning Plans and Support |
Xinyu Jessica Wang et.al. |
2503.13340 |
null |
2025-03-17 |
LEAVS: An LLM-based Labeler for Abdominal CT Supervision |
Ricardo Bigolin Lanfredi et.al. |
2503.13330 |
link |
2025-03-17 |
Edit Transfer: Learning Image Editing via Vision In-Context Relations |
Lan Chen et.al. |
2503.13327 |
null |
2025-03-17 |
Computation Mechanism Behind LLM Position Generalization |
Chi Han et.al. |
2503.13305 |
null |
2025-03-17 |
LIMCA: LLM for Automating Analog In-Memory Computing Architecture Design Exploration |
Deepak Vungarala et.al. |
2503.13301 |
null |
2025-03-17 |
A Survey on Transformer Context Extension: Approaches and Evaluation |
Yijun Liu et.al. |
2503.13299 |
null |
2025-03-17 |
LLM-Match: An Open-Sourced Patient Matching Model Based on Large Language Models and Retrieval-Augmented Generation |
Xiaodi Li et.al. |
2503.13281 |
null |
2025-03-17 |
Knowledge-Aware Iterative Retrieval for Multi-Agent Systems |
Seyoung Song et.al. |
2503.13275 |
null |
2025-03-17 |
Graph Generative Models Evaluation with Masked Autoencoder |
Chengen Wang et.al. |
2503.13271 |
null |
2025-03-17 |
TablePilot: Recommending Human-Preferred Tabular Data Analysis with Large Language Models |
Deyin Yi et.al. |
2503.13262 |
null |
2025-03-17 |
MindEye-OmniAssist: A Gaze-Driven LLM-Enhanced Assistive Robot System for Implicit Intention Recognition and Task Execution |
Zejia Zhang et.al. |
2503.13250 |
null |
2025-03-17 |
Can Language Models Follow Multiple Turns of Entangled Instructions? |
Chi Han et.al. |
2503.13222 |
link |
2025-03-17 |
Dense Policy: Bidirectional Autoregressive Learning of Actions |
Yue Su et.al. |
2503.13217 |
null |
2025-03-17 |
MedLoRD: A Medical Low-Resource Diffusion Model for High-Resolution 3D CT Image Synthesis |
Marvin Seyfarth et.al. |
2503.13211 |
null |
2025-03-17 |
Improving Complex Reasoning with Dynamic Prompt Corruption: A soft prompt Optimization Approach |
Sinan Fan et.al. |
2503.13208 |
null |
2025-03-17 |
MAP: Evaluation and Multi-Agent Enhancement of Large Language Models for Inpatient Pathways |
Zhen Chen et.al. |
2503.13205 |
null |
2025-03-17 |
3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o |
Dingning Liu et.al. |
2503.13185 |
null |
2025-03-17 |
Are LLMs (Really) Ideological? An IRT-based Analysis and Alignment Tool for Perceived Socio-Economic Bias in LLMs |
Jasmin Wachter et.al. |
2503.13149 |
null |
2025-03-17 |
Patient-specific radiomic feature selection with reconstructed healthy persona of knee MR images |
Yaxi Chen et.al. |
2503.13131 |
null |
2025-03-17 |
3D Human Interaction Generation: A Survey |
Siyuan Fan et.al. |
2503.13120 |
null |
2025-03-17 |
VeriLeaky: Navigating IP Protection vs Utility in Fine-Tuning for LLM-Driven Verilog Coding |
Zeng Wang et.al. |
2503.13116 |
null |
2025-03-17 |
MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs |
Erik Daxberger et.al. |
2503.13111 |
null |
2025-03-17 |
DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry |
Jing Li et.al. |
2503.13110 |
link |
2025-03-17 |
Code-Driven Inductive Synthesis: Enhancing Reasoning Abilities of Large Language Models with Sequences |
Kedi Chen et.al. |
2503.13109 |
null |
2025-03-17 |
Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference |
Hao Yin et.al. |
2503.13108 |
link |
2025-03-17 |
ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large language Models |
Hao Yin et.al. |
2503.13107 |
link |
2025-03-17 |
Managing Hybrid Solid-State Drives Using Large Language Models |
Qian Wei et.al. |
2503.13105 |
null |
2025-03-17 |
REPA: Russian Error Types Annotation for Evaluating Text Generation and Judgment Capabilities |
Alexander Pugachev et.al. |
2503.13102 |
null |
2025-03-17 |
Who Wrote This? Identifying Machine vs Human-Generated Text in Hausa |
Babangida Sani et.al. |
2503.13101 |
link |
2025-03-17 |
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning |
Baohao Liao et.al. |
2503.13089 |
null |
2025-03-17 |
A Framework to Assess Multilingual Vulnerabilities of LLMs |
Likai Tang et.al. |
2503.13081 |
null |
2025-03-17 |
Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation |
Yihong Luo et.al. |
2503.13070 |
null |
2025-03-17 |
Do Vision Models Develop Human-Like Progressive Difficulty Understanding? |
Zeyi Huang et.al. |
2503.13058 |
null |
2025-03-17 |
MaskSDM with Shapley values to improve flexibility, robustness, and explainability in species distribution modeling |
Robin Zbinden et.al. |
2503.13057 |
null |
2025-03-17 |
Mitigating Cross-Modal Distraction and Ensuring Geometric Feasibility via Affordance-Guided, Self-Consistent MLLMs for Food Preparation Task Planning |
Yu-Hong Shen et.al. |
2503.13055 |
null |
2025-03-17 |
InsightDrive: Insight Scene Representation for End-to-End Autonomous Driving |
Ruiqi Song et.al. |
2503.13047 |
null |
2025-03-17 |
Overview of the NTCIR-18 Automatic Evaluation of LLMs (AEOLLM) Task |
Junjie Chen et.al. |
2503.13038 |
null |
2025-03-17 |
How Good is my Histopathology Vision-Language Foundation Model? A Holistic Benchmark |
Roba Al Majzoub et.al. |
2503.12990 |
link |
2025-03-17 |
A Multi-Stage Framework with Taxonomy-Guided Reasoning for Occupation Classification Using Large Language Models |
Palakorn Achananuparp et.al. |
2503.12989 |
null |
2025-03-17 |
ROMA: a Read-Only-Memory-based Accelerator for QLoRA-based On-Device LLM |
Wenqiang Wang et.al. |
2503.12988 |
null |
2025-03-17 |
Aligning Vision to Language: Text-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning |
Junming Liu et.al. |
2503.12972 |
null |
2025-03-17 |
Optimal Denoising in Score-Based Generative Models: The Role of Data Regularity |
Eliot Beyler et.al. |
2503.12966 |
null |
2025-03-17 |
Training Video Foundation Models with NVIDIA NeMo |
Zeeshan Patel et.al. |
2503.12964 |
null |
2025-03-17 |
HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding |
Jiahe Zhao et.al. |
2503.12955 |
null |
2025-03-17 |
HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model |
Haiyang Guo et.al. |
2503.12941 |
null |
2025-03-17 |
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization |
Jingyi Zhang et.al. |
2503.12937 |
link |
2025-03-17 |
Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs |
Wei Hung et.al. |
2503.12932 |
null |
2025-03-17 |
MirrorGuard: Adaptive Defense Against Jailbreaks via Entropy-Guided Mirror Crafting |
Rui Pu et.al. |
2503.12931 |
null |
2025-03-17 |
Lifelong Reinforcement Learning with Similarity-Driven Weighting by Large Models |
Zhiyi Huang et.al. |
2503.12923 |
null |
2025-03-17 |
ThinkPatterns-21k: A Systematic Study on the Impact of Thinking Patterns in LLMs |
Pengcheng Wen et.al. |
2503.12918 |
null |
2025-03-17 |
HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models |
Xinyan Jiang et.al. |
2503.12908 |
link |
2025-03-17 |
Optimizing Ansatz Design in Quantum Generative Adversarial Networks Using Large Language Models |
Kento Ueda et.al. |
2503.12884 |
null |
2025-03-17 |
nvBench 2.0: A Benchmark for Natural Language to Visualization under Ambiguity |
Tianqi Luo et.al. |
2503.12880 |
null |
2025-03-17 |
An interpretable approach to automating the assessment of biofouling in video footage |
Evelyn J. Mannix et.al. |
2503.12875 |
link |
2025-03-17 |
UniReg: Foundation Model for Controllable Medical Image Registration |
Zi Li et.al. |
2503.12868 |
null |
2025-03-17 |
Harnessing Test-time Adaptation for NLU tasks Involving Dialects of English |
Duke Nguyen et.al. |
2503.12858 |
null |
2025-03-17 |
Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation |
Songjun Tu et.al. |
2503.12854 |
link |
2025-03-17 |
ACT360: An Efficient 360-Degree Action Detection and Summarization Framework for Mission-Critical Training and Debriefing |
Aditi Tiwari et.al. |
2503.12852 |
null |
2025-03-17 |
GuideDog: A Real-World Egocentric Multimodal Dataset for Blind and Low-Vision Accessibility-Aware Guidance |
Junhyeok Kim et.al. |
2503.12844 |
null |
2025-03-18 |
Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data |
Haozhe Si et.al. |
2503.12843 |
null |
2025-03-17 |
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules |
Kairong Luo et.al. |
2503.12811 |
link |
2025-03-17 |
Grounded Chain-of-Thought for Multimodal Large Language Models |
Qiong Wu et.al. |
2503.12799 |
link |
2025-03-18 |
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding |
Xinyu Ma et.al. |
2503.12797 |
link |
2025-03-17 |
Quantum-Enhanced LLM Efficient Fine Tuning |
Xiaofei Kong et.al. |
2503.12790 |
null |
2025-03-17 |
SAM2 for Image and Video Segmentation: A Comprehensive Survey |
Zhang Jiaxing et.al. |
2503.12781 |
null |
2025-03-17 |
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models |
Sung-Yeon Park et.al. |
2503.12772 |
null |
2025-03-17 |
A Survey on Human Interaction Motion Generation |
Kewei Sui et.al. |
2503.12763 |
link |
2025-03-17 |
RAG-RL: Advancing Retrieval-Augmented Generation via RL and Curriculum Learning |
Jerry Huang et.al. |
2503.12759 |
null |
2025-03-17 |
VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis |
Zhifeng Wang et.al. |
2503.12758 |
null |
2025-03-17 |
MAP: Multi-user Personalization with Collaborative LLM-powered Agents |
Christine Lee et.al. |
2503.12757 |
link |
2025-03-17 |
Identifying Cooperative Personalities in Multi-agent Contexts through Personality Steering with Representation Engineering |
Kenneth J. K. Ong et.al. |
2503.12722 |
null |
2025-03-17 |
Can Reasoning Models Reason about Hardware? An Agentic HLS Perspective |
Luca Collini et.al. |
2503.12721 |
null |
2025-03-16 |
AnyCalib: On-Manifold Learning for Model-Agnostic Single-View Camera Calibration |
Javier Tirado-Garín et.al. |
2503.12701 |
null |
2025-03-16 |
A Continual Learning-driven Model for Accurate and Generalizable Segmentation of Clinically Comprehensive and Fine-grained Whole-body Anatomies in CT |
Dazhou Guo et.al. |
2503.12698 |
null |
2025-03-16 |
AI Agents: Evolution, Architecture, and Real-World Applications |
Naveen Krishnan et.al. |
2503.12687 |
null |
2025-03-16 |
ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory |
Liangyu Wang et.al. |
2503.12668 |
link |
2025-03-16 |
Plausibility Vaccine: Injecting LLM Knowledge for Event Plausibility |
Jacob Chmura et.al. |
2503.12667 |
null |
2025-03-16 |
Quantum Chemistry Driven Molecular Inverse Design with Data-free Reinforcement Learning |
Francesco Calcagno et.al. |
2503.12653 |
null |
2025-03-16 |
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing |
Tsu-Jui Fu et.al. |
2503.12652 |
null |
2025-03-16 |
VeriLA: A Human-Centered Evaluation Framework for Interpretable Verification of LLM Agent Failures |
Yoo Yeon Sung et.al. |
2503.12651 |
null |
2025-03-16 |
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization |
Hao Mark Chen et.al. |
2503.12649 |
link |
2025-03-16 |
LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization |
Alessio Spagnoletti et.al. |
2503.12615 |
null |
2025-03-16 |
VISO-Grasp: Vision-Language Informed Spatial Object-centric 6-DoF Active View Planning and Grasping in Clutter and Invisibility |
Yitian Shi et.al. |
2503.12609 |
null |
2025-03-16 |
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey |
Yaoting Wang et.al. |
2503.12605 |
link |
2025-03-16 |
SynLlama: Generating Synthesizable Molecules and Their Analogs with Large Language Models |
Kunyang Sun et.al. |
2503.12602 |
link |
2025-03-14 |
From few to many maps: A fast map-level emulator for extreme augmentation of CMB systematics datasets |
P. Campeti et.al. |
2503.11643 |
link |
2025-03-14 |
Gradient-bridged Posterior: Bayesian Inference for Models with Implicit Functions |
Cheng Zeng et.al. |
2503.11637 |
null |
2025-03-14 |
ASMA-Tune: Unlocking LLMs’ Assembly Code Comprehension via Structural-Semantic Instruction Tuning |
Xinyi Wang et.al. |
2503.11617 |
link |
2025-03-14 |
Pathology Image Compression with Pre-trained Autoencoders |
Srikar Yellapragada et.al. |
2503.11591 |
null |
2025-03-14 |
Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs using Semantic Space |
Zhiliang Chen et.al. |
2503.11586 |
link |
2025-03-14 |
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion |
Ahmed Nassar et.al. |
2503.11576 |
null |
2025-03-14 |
Synthesizing Access Control Policies using Large Language Models |
Adarsh Vatsa et.al. |
2503.11573 |
null |
2025-03-14 |
Implicit Bias-Like Patterns in Reasoning Models |
Messi H. J. Lee et.al. |
2503.11572 |
null |
2025-03-14 |
VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity |
Jing Bi et.al. |
2503.11557 |
null |
2025-03-14 |
AugGen: Synthetic Augmentation Can Improve Discriminative Models |
Parsa Rahimi et.al. |
2503.11544 |
null |
2025-03-14 |
Potential of large language model-powered nudges for promoting daily water and energy conservation |
Zonghan Li et.al. |
2503.11531 |
null |
2025-03-14 |
Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models |
Hao Cheng et.al. |
2503.11519 |
null |
2025-03-14 |
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models |
Ziqin Zhou et.al. |
2503.11513 |
null |
2025-03-14 |
Perfect Stabilization of Biomolecular Adhesions under Load |
Anton F. Burnet et.al. |
2503.11510 |
null |
2025-03-14 |
V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning |
Zixu Cheng et.al. |
2503.11495 |
null |
2025-03-14 |
A Review of DeepSeek Models’ Key Innovative Techniques |
Chengen Wang et.al. |
2503.11486 |
null |
2025-03-14 |
Exponential Quantum Advantage for Simulating Open Classical Systems |
Agi Villanyi et.al. |
2503.11483 |
null |
2025-03-14 |
T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation |
Seyed Mohammad Hadi Hosseini et.al. |
2503.11481 |
null |
2025-03-14 |
Integrating LLMs in Gamified Systems |
Carlos J. Costa et.al. |
2503.11458 |
null |
2025-03-14 |
D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning |
Jia Zhang et.al. |
2503.11441 |
null |
2025-03-14 |
Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models |
Xu Liu et.al. |
2503.11411 |
null |
2025-03-14 |
A Framework for a Capability-driven Evaluation of Scenario Understanding for Multimodal Large Language Models in Autonomous Driving |
Tin Stribor Sohn et.al. |
2503.11400 |
null |
2025-03-14 |
Optimizing Large Language Models for Detecting Symptoms of Comorbid Depression or Anxiety in Chronic Diseases: Insights from Patient Messages |
Jiyeong Kim et.al. |
2503.11384 |
null |
2025-03-14 |
Modeling Subjectivity in Cognitive Appraisal with Language Models |
Yuxiang Zhou et.al. |
2503.11381 |
null |
2025-03-14 |
Annotating Scientific Uncertainty: A comprehensive model using linguistic patterns and comparison with existing approaches |
Panggih Kusuma Ningrum et.al. |
2503.11376 |
null |
2025-03-14 |
Cornstarch: Distributed Multimodal Training Must Be Multimodality-Aware |
Insu Jang et.al. |
2503.11367 |
link |
2025-03-14 |
PARIC: Probabilistic Attention Regularization for Language Guided Image Classification from Pre-trained Vision Language Models |
Mayank Nautiyal et.al. |
2503.11360 |
null |
2025-03-14 |
Integrating Dynamical Systems Modeling with Spatiotemporal scRNA-seq Data Analysis |
Zhenyi Zhang et.al. |
2503.11347 |
null |
2025-03-14 |
AIstorian lets AI be a historian: A KG-powered multi-agent system for accurate biography generation |
Fengyu Li et.al. |
2503.11346 |
link |
2025-03-14 |
Rule-Guided Feedback: Enhancing Reasoning by Enforcing Rule Adherence in Large Language Models |
Aissatou Diallo et.al. |
2503.11336 |
null |
2025-03-14 |
Safe-VAR: Safe Visual Autoregressive Model for Text-to-Image Generative Watermarking |
Ziyi Wang et.al. |
2503.11324 |
null |
2025-03-14 |
MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens |
Jeong Hun Yeo et.al. |
2503.11315 |
link |
2025-03-14 |
Unlocking General Long Chain-of-Thought Reasoning Capabilities of Large Language Models via Representation Engineering |
Xinyu Tang et.al. |
2503.11314 |
link |
2025-03-14 |
Are formal and functional linguistic mechanisms dissociated? |
Michael Hanna et.al. |
2503.11302 |
link |
2025-03-14 |
GNNs as Predictors of Agentic Workflow Performances |
Yuanshuo Zhang et.al. |
2503.11301 |
link |
2025-03-14 |
BriLLM: Brain-inspired Large Language Model |
Hai Zhao et.al. |
2503.11299 |
null |
2025-03-14 |
High-Dimensional Interlingual Representations of Large Language Models |
Bryan Wilie et.al. |
2503.11280 |
null |
2025-03-14 |
When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective |
Alireza Mousavi-Hosseini et.al. |
2503.11272 |
link |
2025-03-14 |
CyclePose – Leveraging Cycle-Consistency for Annotation-Free Nuclei Segmentation in Fluorescence Microscopy |
Jonas Utz et.al. |
2503.11266 |
null |
2025-03-14 |
Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model |
Haoyang Huang et.al. |
2503.11251 |
link |
2025-03-14 |
Reasoning-Grounded Natural Language Explanations for Language Models |
Vojtech Cahlik et.al. |
2503.11248 |
link |
2025-03-14 |
LLMPerf: GPU Performance Modeling meets Large Language Models |
Khoi N. M. Nguyen et.al. |
2503.11244 |
link |
2025-03-14 |
PrivacyScalpel: Enhancing LLM Privacy via Interpretable Feature Intervention with Sparse Autoencoders |
Ahmed Frikha et.al. |
2503.11232 |
null |
2025-03-14 |
Exploring the Potential of Large Multimodal Models as Effective Alternatives for Pronunciation Assessment |
Ke Wang et.al. |
2503.11229 |
null |
2025-03-14 |
GKG-LLM: A Unified Framework for Generalized Knowledge Graph Construction |
Jian Zhang et.al. |
2503.11227 |
null |
2025-03-14 |
Heterogeneously structured compartmental models of epidemiological systems: from individual-level processes to population-scale dynamics |
Emanuele Bernardi et.al. |
2503.11225 |
null |
2025-03-14 |
Can Large Reasoning Models do Analogical Reasoning under Perceptual Uncertainty? |
Giacomo Camposampiero et.al. |
2503.11207 |
link |
2025-03-14 |
LLaVA-MLB: Mitigating and Leveraging Attention Bias for Training-Free Video LLMs |
Leqi Shen et.al. |
2503.11205 |
null |
2025-03-14 |
Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering |
Gang Li et.al. |
2503.11197 |
link |
2025-03-14 |
FastVID: Dynamic Density Pruning for Fast Video Large Language Models |
Leqi Shen et.al. |
2503.11187 |
link |
2025-03-14 |
Align in Depth: Defending Jailbreak Attacks via Progressive Answer Detoxification |
Yingjie Zhang et.al. |
2503.11185 |
null |
2025-03-14 |
Palette of Language Models: A Solver for Controlled Text Generation |
Zhe Yang et.al. |
2503.11182 |
null |
2025-03-14 |
Towards Extreme Pruning of LLMs with Plug-and-Play Mixed Sparsity |
Chi Xu et.al. |
2503.11164 |
null |
2025-03-14 |
Don’t Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models |
Shaotian Yan et.al. |
2503.11154 |
null |
2025-03-14 |
SpaceSeg: A High-Precision Intelligent Perception Segmentation Method for Multi-Spacecraft On-Orbit Targets |
Hao Liu et.al. |
2503.11133 |
null |
2025-03-14 |
Don’t Forget It! Conditional Sparse Autoencoder Clamping Works for Unlearning |
Matthew Khoriaty et.al. |
2503.11127 |
null |
2025-03-14 |
Limits of KV Cache Compression for Tensor Attention based Autoregressive Transformers |
Yifang Chen et.al. |
2503.11108 |
null |
2025-03-14 |
Quantifying Interpretability in CLIP Models with Concept Consistency |
Avinash Madasu et.al. |
2503.11103 |
null |
2025-03-14 |
Open3DVQA: A Benchmark for Comprehensive Spatial Reasoning with Multimodal Large Language Model in Open Space |
Weichen Zhan et.al. |
2503.11094 |
link |
2025-03-14 |
OmniDiff: A Comprehensive Benchmark for Fine-grained Image Difference Captioning |
Yuan Liu et.al. |
2503.11093 |
null |
2025-03-14 |
EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks |
Yi Zhang et.al. |
2503.11089 |
null |
2025-03-14 |
A Survey of Cross-domain Graph Learning: Progress and Future Directions |
Haihong Zhao et.al. |
2503.11086 |
link |
2025-03-14 |
Prompt Alchemy: Automatic Prompt Refinement for Enhancing Code Generation |
Sixiang Ye et.al. |
2503.11085 |
link |
2025-03-14 |
LLMs are Bug Replicators: An Empirical Study on LLMs’ Capability in Completing Bug-prone Code |
Liwei Guo et.al. |
2503.11082 |
link |
2025-03-14 |
Understanding Flatness in Generative Models: Its Role and Benefits |
Taehwan Lee et.al. |
2503.11078 |
null |
2025-03-14 |
Large Reasoning Models in Agent Scenarios: Exploring the Necessity of Reasoning Capabilities |
Xueyang Zhou et.al. |
2503.11074 |
null |
2025-03-14 |
Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models |
Hongyang Wei et.al. |
2503.11073 |
link |
2025-03-14 |
Falcon: A Remote Sensing Vision-Language Foundation Model |
Kelu Yao et.al. |
2503.11070 |
link |
2025-03-14 |
API Agents vs. GUI Agents: Divergence and Convergence |
Chaoyun Zhang et.al. |
2503.11069 |
null |
2025-03-14 |
DeepSeek Powered Solid Dosage Formulation Design and Development |
Leqi Lin et.al. |
2503.11068 |
null |
2025-03-14 |
Generative Modelling for Mathematical Discovery |
Jordan S. Ellenberg et.al. |
2503.11061 |
link |
2025-03-14 |
BannerAgency: Advertising Banner Design with Multimodal LLM Agents |
Heng Wang et.al. |
2503.11060 |
null |
2025-03-14 |
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization |
Kyle Sargent et.al. |
2503.11056 |
null |
2025-03-14 |
Towards Privacy-preserved Pre-training of Remote Sensing Foundation Models with Federated Mutual-guidance Learning |
Jieyi Tan et.al. |
2503.11051 |
null |
2025-03-14 |
PSF-4D: A Progressive Sampling Framework for View Consistent 4D Editing |
Hasan Iqbal et.al. |
2503.11044 |
null |
2025-03-14 |
Beyond A Single AI Cluster: A Survey of Decentralized LLM Training |
Haotian Dong et.al. |
2503.11023 |
null |
2025-03-14 |
An LLM’s Attempts to Adapt to Diverse Software Engineers’ Problem-Solving Styles: More Inclusive & Equitable? |
Andrew Anderson et.al. |
2503.11018 |
null |
2025-03-14 |
RONA: Pragmatically Diverse Image Captioning with Coherence Relations |
Aashish Anantha Ramakrishnan et.al. |
2503.10997 |
link |
2025-03-14 |
TigerLLM – A Family of Bangla Large Language Models |
Nishat Raihan et.al. |
2503.10995 |
link |
2025-03-14 |
Statistical Impossibility and Possibility of Aligning LLMs with Human Preferences: From Condorcet Paradox to Nash Equilibrium |
Kaizhao Liu et.al. |
2503.10990 |
link |
2025-03-14 |
From Dionysius Emerges Apollo – Learning Patterns and Abstractions from Perceptual Sequences |
Shuchen Wu et.al. |
2503.10973 |
null |
2025-03-14 |
Combinatorial Optimization for All: Using LLMs to Aid Non-Experts in Improving Optimization Algorithms |
Camilo Chacón Sartori et.al. |
2503.10968 |
null |
2025-03-13 |
Empirical Computation |
Eric Tang et.al. |
2503.10954 |
null |
2025-03-13 |
Graph-Grounded LLMs: Leveraging Graphical Function Calling to Minimize LLM Hallucinations |
Piyush Gupta et.al. |
2503.10941 |
null |
2025-03-13 |
ChatGPT Encounters Morphing Attack Detection: Zero-Shot MAD with Multi-Modal Large Language Models and General Vision Models |
Haoyu Zhang et.al. |
2503.10937 |
null |
2025-03-13 |
OASST-ETC Dataset: Alignment Signals from Eye-tracking Analysis of LLM Responses |
Angela Lopez-Cardona et.al. |
2503.10927 |
link |
2025-03-13 |
Learning to Inference Adaptively for Multimodal Large Language Models |
Zhuoyan Xu et.al. |
2503.10905 |
null |
2025-03-13 |
Taxonomic Reasoning for Rare Arthropods: Combining Dense Image Captioning and RAG for Interpretable Classification |
Nathaniel Lesperance et.al. |
2503.10886 |
null |
2025-03-13 |
Chat-TS: Enhancing Multi-Modal Reasoning Over Time-Series and Natural Language Data |
Paul Quinlan et.al. |
2503.10883 |
null |
2025-03-13 |
SCE: Scalable Consistency Ensembles Make Blackbox Large Language Model Generation More Reliable |
Jiaxin Zhang et.al. |
2503.10881 |
null |
2025-03-13 |
Teamwork makes the dream work: LLMs-Based Agents for GitHub README.MD Summarization |
Duc S. H. Nguyen et.al. |
2503.10876 |
null |
2025-03-13 |
Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation |
Leonard Waldmann et.al. |
2503.10845 |
link |
2025-03-13 |
Who Relies More on World Knowledge and Bias for Syntactic Ambiguity Resolution: Humans or LLMs? |
So Young Lee et.al. |
2503.10838 |
link |
2025-03-13 |
Exploiting Concavity Information in Gaussian Process Contextual Bandit Optimization |
Kevin Li et.al. |
2503.10836 |
null |
2025-03-13 |
Thinking Machines: A Survey of LLM based Reasoning Strategies |
Dibyanayan Bandyopadhyay et.al. |
2503.10814 |
null |
2025-03-13 |
HALURust: Exploiting Hallucinations of Large Language Models to Detect Vulnerabilities in Rust |
Yu Luo et.al. |
2503.10793 |
null |
2025-03-13 |
Vulnerability Detection: From Formal Verification to Large Language Models and Hybrid Approaches: A Comprehensive Overview |
Norbert Tihanyi et.al. |
2503.10784 |
null |
2025-03-13 |
Large-scale Pre-training for Grounded Video Caption Generation |
Evangelos Kazakos et.al. |
2503.10781 |
link |
2025-03-13 |
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing |
Rongyao Fang et.al. |
2503.10639 |
link |
2025-03-13 |
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model |
Jiaming Liu et.al. |
2503.10631 |
null |
2025-03-13 |
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation |
Hang Yin et.al. |
2503.10630 |
null |
2025-03-13 |
From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM |
Kshitij Ambilduke et.al. |
2503.10620 |
link |
2025-03-13 |
Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search |
Andy Zhou et.al. |
2503.10619 |
null |
2025-03-13 |
Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models |
Andy Zhou et.al. |
2503.10617 |
null |
2025-03-13 |
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization |
Yi Yang et.al. |
2503.10615 |
link |
2025-03-13 |
CoSTA$\ast$: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing |
Advait Gupta et.al. |
2503.10613 |
link |
2025-03-13 |
TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention |
Jinhao Duan et.al. |
2503.10602 |
link |
2025-03-13 |
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models |
Hao He et.al. |
2503.10592 |
null |
2025-03-13 |
Unlock the Power of Unlabeled Data in Language Driving Model |
Chaoqun Wang et.al. |
2503.10586 |
null |
2025-03-13 |
Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures |
Nina Vesseron et.al. |
2503.10576 |
null |
2025-03-13 |
Unveiling the Mathematical Reasoning in DeepSeek Models: A Comparative Study of Large Language Models |
Afrar Jahin et.al. |
2503.10573 |
null |
2025-03-13 |
ASIDE: Architectural Separation of Instructions and Data in Language Models |
Egor Zverev et.al. |
2503.10566 |
null |
2025-03-13 |
Short-term AI literacy intervention does not reduce over-reliance on incorrect ChatGPT recommendations |
Brett Puppart et.al. |
2503.10556 |
null |
2025-03-13 |
KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation |
Zixian Liu et.al. |
2503.10546 |
null |
2025-03-13 |
DP-GPL: Differentially Private Graph Prompt Learning |
Jing Xu et.al. |
2503.10544 |
null |
2025-03-13 |
Foundation Models for Atomistic Simulation of Chemistry and Materials |
Eric C.-Y. Yuan et.al. |
2503.10538 |
null |
2025-03-13 |
PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models |
Zilu Guo et.al. |
2503.10529 |
null |
2025-03-13 |
Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set |
Florian Eichin et.al. |
2503.10515 |
link |
2025-03-13 |
Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression |
Hooman Shahrokhi et.al. |
2503.10512 |
null |
2025-03-13 |
SySLLM: Generating Synthesized Policy Summaries for Reinforcement Learning Agents Using Large Language Models |
Sahar Admoni et.al. |
2503.10509 |
null |
2025-03-13 |
TokenCarve: Information-Preserving Visual Token Compression in Multimodal Large Language Models |
Xudong Tan et.al. |
2503.10501 |
link |
2025-03-13 |
MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation |
Weihao Xuan et.al. |
2503.10497 |
null |
2025-03-13 |
Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents |
Hanxu Hu et.al. |
2503.10494 |
link |
2025-03-13 |
Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion |
Evgeniia Vu et.al. |
2503.10488 |
null |
2025-03-13 |
LLMs in Disease Diagnosis: A Comparative Study of DeepSeek-R1 and O3 Mini Across Chronic Health Conditions |
Gaurav Kumar Gupta et.al. |
2503.10486 |
null |
2025-03-13 |
Siamese Foundation Models for Crystal Structure Prediction |
Liming Wu et.al. |
2503.10471 |
null |
2025-03-13 |
DynaCode: A Dynamic Complexity-Aware Code Benchmark for Evaluating Large Language Models in Code Generation |
Wenhao Hu et.al. |
2503.10452 |
null |
2025-03-13 |
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models |
Wanhua Li et.al. |
2503.10437 |
link |
2025-03-13 |
Finetuning Generative Trajectory Model with Reinforcement Learning from Human Feedback |
Derun Li et.al. |
2503.10434 |
null |
2025-03-13 |
BeamLLM: Vision-Empowered mmWave Beam Prediction with Large Language Models |
Can Zheng et.al. |
2503.10432 |
null |
2025-03-13 |
Understanding the Logical Capabilities of Large Language Models via Out-of-Context Representation Learning |
Jonathan Shaki et.al. |
2503.10408 |
null |
2025-03-13 |
RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models |
Yijing Lin et.al. |
2503.10406 |
null |
2025-03-13 |
RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing |
Fengxiang Wang et.al. |
2503.10392 |
link |
2025-03-13 |
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance |
Yufan Deng et.al. |
2503.10391 |
null |
2025-03-13 |
SPPO: Efficient Long-sequence LLM Training via Adaptive Sequence Pipeline Parallel Offloading |
Qiaoling Chen et.al. |
2503.10377 |
null |
2025-03-13 |
Probabilistic Forecasting via Autoregressive Flow Matching |
Ahmed El-Gazzar et.al. |
2503.10375 |
null |
2025-03-13 |
G-Boost: Boosting Private SLMs with General LLMs |
Yijiang Fan et.al. |
2503.10367 |
null |
2025-03-13 |
Piece it Together: Part-Based Concepting with IP-Priors |
Elad Richardson et.al. |
2503.10365 |
null |
2025-03-13 |
BioSerenity-E1: a self-supervised EEG model for medical applications |
Ruggero G. Bettinardi et.al. |
2503.10362 |
null |
2025-03-13 |
Collaborative Speculative Inference for Efficient LLM Inference Serving |
Luyao Gao et.al. |
2503.10325 |
null |
2025-03-13 |
IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification |
Yuhao Wang et.al. |
2503.10324 |
null |
2025-03-13 |
Towards Fast, Memory-based and Data-Efficient Vision-Language Policy |
Haoxuan Li et.al. |
2503.10322 |
null |
2025-03-13 |
Capturing Semantic Flow of ML-based Systems |
Shin Yoo et.al. |
2503.10310 |
null |
2025-03-13 |
Test Amplification for REST APIs Using “Out-of-the-box” Large Language Models |
Tolgahan Bardakci et.al. |
2503.10306 |
null |
2025-03-13 |
CoDiPhy: A General Framework for Applying Denoising Diffusion Models to the Physical Layer of Wireless Communication Systems |
Peyman Neshaastegaran et.al. |
2503.10297 |
null |
2025-03-13 |
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning |
Weiyun Wang et.al. |
2503.10291 |
null |
2025-03-13 |
MACS: Multi-source Audio-to-image Generation with Contextual Significance and Semantic Alignment |
Hao Zhou et.al. |
2503.10287 |
null |
2025-03-13 |
An Expanded Massive Multilingual Dataset for High-Performance Language Technologies |
Laurie Burchell et.al. |
2503.10267 |
link |
2025-03-13 |
Numerical Error Analysis of Large Language Models |
Stanislav Budzinskiy et.al. |
2503.10251 |
null |
2025-03-13 |
LLM Agents Display Human Biases but Exhibit Distinct Learning Patterns |
Idan Horowitz et.al. |
2503.10248 |
null |
2025-03-13 |
MinorBench: A hand-built benchmark for content-based risks for children |
Shaun Khoo et.al. |
2503.10242 |
null |
2025-03-13 |
SCOOP: A Framework for Proactive Collaboration and Social Continual Learning through Natural Language Interaction and Causal Reasoning |
Dimitri Ognibene et.al. |
2503.10241 |
null |
2025-03-13 |
Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA |
Zhixuan Li et.al. |
2503.10225 |
null |
2025-03-13 |
Efficient Federated Fine-Tuning of Large Language Models with Layer Dropout |
Shilong Wang et.al. |
2503.10217 |
null |
2025-03-13 |
Adaptive Preference Aggregation |
Benjamin Heymann et.al. |
2503.10215 |
null |
2025-03-13 |
Singular Value Fine-tuning for Few-Shot Class-Incremental Learning |
Zhiwu Wang et.al. |
2503.10214 |
null |
2025-03-13 |
Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation |
Henglyu Liu et.al. |
2503.10211 |
null |
2025-03-13 |
LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents |
Boyu Chen et.al. |
2503.10200 |
null |
2025-03-13 |
Robustness Tokens: Towards Adversarial Robustness of Transformers |
Brian Pulfer et.al. |
2503.10191 |
link |
2025-03-13 |
“Well, Keep Thinking”: Enhancing LLM Reasoning with Adaptive Injection Decoding |
Hyunbin Jin et.al. |
2503.10167 |
null |
2025-03-13 |
Retrieval-Augmented Generation with Hierarchical Knowledge |
Haoyu Huang et.al. |
2503.10150 |
link |
2025-03-13 |
Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding |
Jinze Li et.al. |
2503.10135 |
null |
2025-03-13 |
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models |
Runze He et.al. |
2503.10127 |
null |
2025-03-13 |
Hybrid Agents for Image Restoration |
Bingchen Li et.al. |
2503.10120 |
null |
2025-03-13 |
StepMathAgent: A Step-Wise Agent for Evaluating Mathematical Processes through Tree-of-Error |
Shu-Xun Yang et.al. |
2503.10105 |
link |
2025-03-13 |
AgentDAO: Synthesis of Proposal Transactions Via Abstract DAO Semantics |
Lin Ao et.al. |
2503.10099 |
null |
2025-03-13 |
Semantic Latent Motion for Portrait Video Generation |
Qiyuan Zhang et.al. |
2503.10096 |
null |
2025-03-13 |
Cognitive-Mental-LLM: Leveraging Reasoning in Large Language Models for Mental Health Prediction via Online Text |
Avinash Patil et.al. |
2503.10095 |
link |
2025-03-13 |
Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model |
Qiyuan Deng et.al. |
2503.10093 |
null |
2025-03-13 |
Light-weighted foundation model for seismic data processing based on representative and non-redundant pre-training dataset |
Xintong Dong et.al. |
2503.10092 |
null |
2025-03-13 |
Why Does Your CoT Prompt (Not) Work? Theoretical Analysis of Prompt Space Complexity, its Interaction with Answer Space During CoT Reasoning with LLMs: A Recurrent Perspective |
Xiang Zhang et.al. |
2503.10084 |
null |
2025-03-13 |
AdvPaint: Protecting Images from Inpainting Manipulation via Adversarial Attention Disruption |
Joonsung Jeon et.al. |
2503.10081 |
link |
2025-03-13 |
Information Density Principle for MLLM Benchmarks |
Chunyi Li et.al. |
2503.10079 |
link |
2025-03-13 |
VMBench: A Benchmark for Perception-Aligned Video Motion Generation |
Xinrang Ling et.al. |
2503.10076 |
link |
2025-03-13 |
SmartWay: Enhanced Waypoint Prediction and Backtracking for Zero-Shot Vision-and-Language Navigation |
Xiangyu Shi et.al. |
2503.10069 |
null |
2025-03-13 |
Multi-Modal Mamba Modeling for Survival Prediction (M4Survive): Adapting Joint Foundation Model Representations |
Ho Hin Lee et.al. |
2503.10057 |
link |
2025-03-13 |
Enhancing Multi-Agent Systems via Reinforcement Learning with LLM-based Planner and Graph-based Policy |
Ziqi Jia et.al. |
2503.10049 |
null |
2025-03-13 |
How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game |
Ziyue Wang et.al. |
2503.10042 |
link |
2025-03-13 |
NumScout: Unveiling Numerical Defects in Smart Contracts using LLM-Pruning Symbolic Execution |
Jiachi Chen et.al. |
2503.10041 |
link |
2025-03-13 |
OR-LLM-Agent: Automating Modeling and Solving of Operations Research Optimization Problem with Reasoning Large Language Model |
Bowen Zhang et.al. |
2503.10009 |
link |
2025-03-13 |
TIME: Temporal-sensitive Multi-dimensional Instruction Tuning and Benchmarking for Video-LLMs |
Yunxiao Wang et.al. |
2503.09994 |
null |
2025-03-13 |
Channel-wise Noise Scheduled Diffusion for Inverse Rendering in Indoor Scenes |
JunYong Choi et.al. |
2503.09993 |
null |
2025-03-13 |
From Equations to Insights: Unraveling Symbolic Structures in PDEs with LLMs |
Rohan Bhatnagar et.al. |
2503.09986 |
null |
2025-03-13 |
ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content |
Bhavik Chandna et.al. |
2503.09964 |
null |
2025-03-13 |
Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification |
Jiayu Jiang et.al. |
2503.09962 |
link |
2025-03-13 |
RMG: Real-Time Expressive Motion Generation with Self-collision Avoidance for 6-DOF Companion Robotic Arms |
Jiansheng Li et.al. |
2503.09959 |
null |
2025-03-13 |
Exploring Mutual Empowerment Between Wireless Networks and RL-based LLMs: A Survey |
Yu Qiao et.al. |
2503.09956 |
null |
2025-03-13 |
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos? |
Yuanxin Liu et.al. |
2503.09949 |
link |
2025-03-13 |
PluralLLM: Pluralistic Alignment in LLMs via Federated Learning |
Mahmoud Srewa et.al. |
2503.09925 |
null |
2025-03-13 |
Inter-environmental world modeling for continuous and compositional dynamics |
Kohei Hayashi et.al. |
2503.09911 |
null |
2025-03-12 |
Conversational Gold: Evaluating Personalized Conversational Search System using Gold Nuggets |
Zahra Abbasiantaeb et.al. |
2503.09902 |
link |
2025-03-12 |
Improving the Reusability of Conversational Search Test Collections |
Zahra Abbasiantaeb et.al. |
2503.09899 |
link |
2025-03-12 |
What’s In Your Field? Mapping Scientific Research with Knowledge Graphs and Large Language Models |
Abhipsha Das et.al. |
2503.09894 |
link |
2025-03-12 |
On the contraction properties of Sinkhorn semigroups |
O. Deniz Akyildiz et.al. |
2503.09887 |
null |
2025-03-12 |
CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation |
Hariprasath Govindarajan et.al. |
2503.09878 |
null |
2025-03-12 |
LuciBot: Automated Robot Policy Learning from Generated Videos |
Xiaowen Qiu et.al. |
2503.09871 |
null |
2025-03-12 |
Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis |
Nahid Ul Islam et.al. |
2503.09860 |
link |
2025-03-12 |
Media and responsible AI governance: a game-theoretic and LLM analysis |
Nataliya Balabanova et.al. |
2503.09858 |
null |
2025-03-12 |
MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System |
Jihao Zhao et.al. |
2503.09600 |
link |
2025-03-12 |
How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation |
Ruohao Guo et.al. |
2503.09598 |
link |
2025-03-12 |
PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop |
Chenyu Li et.al. |
2503.09595 |
link |
2025-03-12 |
SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment |
Katrin Renz et.al. |
2503.09594 |
null |
2025-03-12 |
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering |
Md Mohaiminul Islam et.al. |
2503.09590 |
link |
2025-03-12 |
Minimax Optimality of the Probability Flow ODE for Diffusion Models |
Changxiao Cai et.al. |
2503.09583 |
null |
2025-03-12 |
Cost-Optimal Grouped-Query Attention for Long-Context LLMs |
Yingfa Chen et.al. |
2503.09579 |
link |
2025-03-12 |
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks |
Lutfi Eren Erdogan et.al. |
2503.09572 |
null |
2025-03-13 |
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models |
Qiguang Chen et.al. |
2503.09567 |
null |
2025-03-12 |
GenHPE: Generative Counterfactuals for 3D Human Pose Estimation with Radio Frequency Signals |
Shuokang Huang et.al. |
2503.09537 |
null |
2025-03-13 |
Large Language Models for Multi-Facility Location Mechanism Design |
Nguyen Thach et.al. |
2503.09533 |
null |
2025-03-12 |
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning |
Bowen Jin et.al. |
2503.09516 |
link |
2025-03-12 |
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning |
Ziyu Wan et.al. |
2503.09501 |
link |
2025-03-12 |
Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection |
Romain Thoreau et.al. |
2503.09493 |
null |
2025-03-12 |
DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction |
Junjie Zhou et.al. |
2503.09491 |
link |
2025-03-12 |
Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness |
Beier Zhu et.al. |
2503.09487 |
null |
2025-03-12 |
BAMBI: Developing Baby Language Models for Italian |
Alice Suozzi et.al. |
2503.09481 |
null |
2025-03-12 |
Explicit Learning and the LLM in Machine Translation |
Malik Marmonier et.al. |
2503.09454 |
link |
2025-03-12 |
How Well Does Your Tabular Generator Learn the Structure of Tabular Data? |
Xiangjian Jiang et.al. |
2503.09453 |
link |
2025-03-12 |
Florenz: Scaling Laws for Systematic Generalization in Vision-Language Models |
Julian Spravil et.al. |
2503.09443 |
null |
2025-03-12 |
CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection |
Richard A. Dubniczky et.al. |
2503.09433 |
link |
2025-03-12 |
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter |
Kechun Xu et.al. |
2503.09423 |
null |
2025-03-12 |
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary |
Kevin Qinghong Lin et.al. |
2503.09402 |
link |
2025-03-12 |
ForAug: Recombining Foregrounds and Backgrounds to Improve Vision Transformer Training with Bias Mitigation |
Tobias Christian Nauen et.al. |
2503.09399 |
link |
2025-03-12 |
Close-up-GS: Enhancing Close-Up View Synthesis in 3D Gaussian Splatting with Progressive Self-Training |
Jiatong Xia et.al. |
2503.09396 |
null |
2025-03-12 |
Towards Next-Generation Recommender Systems: A Benchmark for Personalized Recommendation Assistant with LLMs |
Jiani Huang et.al. |
2503.09382 |
link |
2025-03-12 |
Towards Graph Foundation Models: A Transferability Perspective |
Yuxiang Wang et.al. |
2503.09363 |
null |
2025-03-12 |
Deep Learning for Climate Action: Computer Vision Analysis of Visual Narratives on X |
Katharina Prasse et.al. |
2503.09361 |
null |
2025-03-12 |
RetSTA: An LLM-Based Approach for Standardizing Clinical Fundus Image Reports |
Jiushen Cai et.al. |
2503.09358 |
null |
2025-03-12 |
Safer or Luckier? LLMs as Safety Evaluators Are Not Robust to Artifacts |
Hongyu Chen et.al. |
2503.09347 |
null |
2025-03-12 |
NVP-HRI: Zero Shot Natural Voice and Posture-based Human-Robot Interaction via Large Language Model |
Yuzhi Lai et.al. |
2503.09335 |
link |
2025-03-12 |
CyberLLMInstruct: A New Dataset for Analysing Safety of Fine-Tuned LLMs Using Cyber Security Data |
Adel ElZemity et.al. |
2503.09334 |
link |
2025-03-12 |
A Survey on Enhancing Causal Reasoning Ability of Large Language Models |
Xin Li et.al. |
2503.09326 |
null |
2025-03-12 |
Revealing the Implicit Noise-based Imprint of Generative Models |
Xinghan Li et.al. |
2503.09314 |
null |
2025-03-12 |
xVLM2Vec: Adapting LVLM-based embedding models to multilinguality using Self-Knowledge Distillation |
Elio Musacchio et.al. |
2503.09313 |
null |
2025-03-12 |
Adaptive political surveys and GPT-4: Tackling the cold start problem with simulated user interactions |
Fynn Bachmann et.al. |
2503.09311 |
link |
2025-03-12 |
Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference |
Mohammad Siavashi et.al. |
2503.09304 |
null |
2025-03-12 |
Prompt Inference Attack on Distributed Large Language Model Inference Frameworks |
Xinjian Luo et.al. |
2503.09291 |
null |
2025-03-12 |
Crowdsourced Homophily Ties Based Graph Annotation Via Large Language Model |
Yu Bu et.al. |
2503.09281 |
null |
2025-03-12 |
Fine-Tuning Large Language Models for Educational Support: Leveraging Gagne’s Nine Events of Instruction for Lesson Planning |
Linzhao Jia et.al. |
2503.09276 |
null |
2025-03-12 |
COLA: A Scalable Multi-Agent Framework For Windows UI Task Automation |
Di Zhao et.al. |
2503.09263 |
link |
2025-03-13 |
DeepInnovation AI: A Global Dataset Mapping the AI innovation from Academic Research to Industrial Patents |
Haixing Gong et.al. |
2503.09257 |
null |
2025-03-12 |
City Models: Past, Present and Future Prospects |
Helge Ritter et.al. |
2503.09237 |
null |
2025-03-12 |
LREF: A Novel LLM-based Relevance Framework for E-commerce |
Tian Tang et.al. |
2503.09223 |
null |
2025-03-12 |
Rethinking Prompt-based Debiasing in Large Language Models |
Xinyi Yang et.al. |
2503.09219 |
null |
2025-03-12 |
Why LLMs Cannot Think and How to Fix It |
Marius Jahrens et.al. |
2503.09211 |
null |
2025-03-12 |
Quality Over Quantity? LLM-Based Curation for a Data-Efficient Audio-Video Foundation Model |
Ali Vosoughi et.al. |
2503.09205 |
null |
2025-03-12 |
Token Weighting for Long-Range Language Modeling |
Falko Helm et.al. |
2503.09202 |
link |
2025-03-12 |
WonderVerse: Extendable 3D Scene Generation with Video Generative Models |
Hao Feng et.al. |
2503.09160 |
null |
2025-03-12 |
FaVChat: Unlocking Fine-Grained Facial Video Understanding with Multimodal Large Language Models |
Fufangchen Zhao et.al. |
2503.09158 |
null |
2025-03-12 |
AdaptAI: A Personalized Solution to Sense Your Stress, Fix Your Mess, and Boost Productivity |
Rushiraj Gadhvi et.al. |
2503.09150 |
link |
2025-03-12 |
Generative Frame Sampler for Long Video Understanding |
Linli Yao et.al. |
2503.09146 |
null |
2025-03-12 |
Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding |
Haoyu Zhang et.al. |
2503.09143 |
null |
2025-03-12 |
AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks |
Jin Li et.al. |
2503.09124 |
null |
2025-03-12 |
Training Data Provenance Verification: Did Your Model Use Synthetic Data from My Generative Model for Training? |
Yuechen Xie et.al. |
2503.09122 |
link |
2025-03-12 |
GRU: Mitigating the Trade-off between Unlearning and Retention for Large Language Models |
Yue Wang et.al. |
2503.09117 |
null |
2025-03-12 |
VaxGuard: A Multi-Generator, Multi-Type, and Multi-Role Dataset for Detecting LLM-Generated Vaccine Misinformation |
Syed Talal Ahmad et.al. |
2503.09103 |
null |
2025-03-12 |
Multi-Modal Foundation Models for Computational Pathology: A Survey |
Dong Li et.al. |
2503.09091 |
null |
2025-03-12 |
Theoretical Guarantees for High Order Trajectory Refinement in Generative Flows |
Chengyue Gong et.al. |
2503.09069 |
null |
2025-03-12 |
Probing Latent Subspaces in LLM for AI Security: Identifying and Manipulating Adversarial States |
Xin Wei Chia et.al. |
2503.09066 |
null |
2025-03-12 |
Discovering Influential Neuron Path in Vision Transformers |
Yifan Wang et.al. |
2503.09046 |
null |
2025-03-12 |
ManeuverGPT Agentic Control for Safe Autonomous Stunt Maneuvers |
Shawn Azdam et.al. |
2503.09035 |
link |
2025-03-12 |
Teaching LLMs How to Learn with Contextual Fine-Tuning |
Younwoo Choi et.al. |
2503.09032 |
null |
2025-03-12 |
DAST: Difficulty-Aware Self-Training on Large Language Models |
Boyang Xue et.al. |
2503.09029 |
link |
2025-03-12 |
Aligning to What? Limits to RLHF Based Alignment |
Logan Barnhart et.al. |
2503.09025 |
null |
2025-03-13 |
Prompt Inversion Attack against Collaborative Inference of Large Language Models |
Wenjie Qu et.al. |
2503.09022 |
null |
2025-03-12 |
Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning |
Yuan Jiang et.al. |
2503.09020 |
link |
2025-03-12 |
Natural Humanoid Robot Locomotion with Generative Motion Prior |
Haodong Zhang et.al. |
2503.09015 |
null |
2025-03-12 |
Leveraging Retrieval Augmented Generative LLMs For Automated Metadata Description Generation to Enhance Data Catalogs |
Mayank Singh et.al. |
2503.09003 |
null |
2025-03-12 |
KNighter: Transforming Static Analysis with LLM-Synthesized Checkers |
Chenyuan Yang et.al. |
2503.09002 |
link |
2025-03-12 |
JBFuzz: Jailbreaking LLMs Efficiently and Effectively Using Fuzzing |
Vasudev Gohil et.al. |
2503.08990 |
null |
2025-03-12 |
I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data? |
Yuhang Liu et.al. |
2503.08980 |
null |
2025-03-12 |
Large Language Models-Aided Program Debloating |
Bo Lin et.al. |
2503.08969 |
null |
2025-03-11 |
Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation |
Yu Wang et.al. |
2503.08963 |
null |
2025-03-11 |
FP3: A 3D Foundation Policy for Robotic Manipulation |
Rujia Yang et.al. |
2503.08950 |
null |
2025-03-11 |
Near-Optimal Sample Complexity for Iterated CVaR Reinforcement Learning with a Generative Model |
Zilong Deng et.al. |
2503.08934 |
null |
2025-03-11 |
ARCHED: A Human-Centered Framework for Transparent, Responsible, and Collaborative AI-Assisted Instructional Design |
Hongming Li et.al. |
2503.08931 |
null |
2025-03-11 |
Enhancing Large Language Models for Hardware Verification: A Novel SystemVerilog Assertion Dataset |
Anand Menon et.al. |
2503.08923 |
null |
2025-03-11 |
Backtracking for Safety |
Bilgehan Sel et.al. |
2503.08919 |
null |
2025-03-11 |
Multilevel Generative Samplers for Investigating Critical Phenomena |
Ankur Singha et.al. |
2503.08918 |
link |
2025-03-11 |
Reconstruct Anything Model: a lightweight foundation model for computational imaging |
Matthieu Terris et.al. |
2503.08915 |
null |
2025-03-11 |
Interpreting the Repeated Token Phenomenon in Large Language Models |
Itay Yona et.al. |
2503.08908 |
link |
2025-03-11 |
A Deep Bayesian Nonparametric Framework for Robust Mutual Information Estimation |
Forough Fazeliasl et.al. |
2503.08902 |
null |
2025-03-11 |
Seeing What’s Not There: Spurious Correlation in Multimodal LLMs |
Parsa Hosseini et.al. |
2503.08884 |
null |
2025-03-11 |
LLMs Know What to Drop: Self-Attention Guided KV Cache Eviction for Efficient Long-Context Inference |
Guangtao Wang et.al. |
2503.08879 |
null |
2025-03-11 |
Interpretable and Robust Dialogue State Tracking via Natural Language Summarization with LLMs |
Rafael Carranza et.al. |
2503.08857 |
null |
2025-03-11 |
Contrastive Speaker-Aware Learning for Multi-party Dialogue Generation with LLMs |
Tianyu Sun et.al. |
2503.08842 |
null |
2025-03-11 |
ResBench: Benchmarking LLM-Generated FPGA Designs with Resource Awareness |
Ce Guo et.al. |
2503.08823 |
null |
2025-03-11 |
Cross-Examiner: Evaluating Consistency of Large Language Model-Generated Explanations |
Danielle Villa et.al. |
2503.08815 |
null |
2025-03-11 |
Robust Multi-Objective Controlled Decoding of Large Language Models |
Seongho Son et.al. |
2503.08796 |
link |
2025-03-11 |
Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs |
Ariba Khan et.al. |
2503.08688 |
link |
2025-03-11 |
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models |
Jialv Zou et.al. |
2503.08686 |
link |
2025-03-11 |
Self-Taught Self-Correction for Small Language Models |
Viktor Moskvoretskii et.al. |
2503.08681 |
null |
2025-03-11 |
GarmentCrafter: Progressive Novel View Synthesis for Single-View 3D Garment Reconstruction and Editing |
Yuanhao Wang et.al. |
2503.08678 |
null |
2025-03-12 |
OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting |
Yongsheng Yu et.al. |
2503.08677 |
null |
2025-03-11 |
Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields |
Tobias Kreiman et.al. |
2503.08674 |
null |
2025-03-11 |
REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder |
Yitian Zhang et.al. |
2503.08665 |
null |
2025-03-11 |
Generating Robot Constitutions & Benchmarks for Semantic Safety |
Pierre Sermanet et.al. |
2503.08663 |
null |
2025-03-11 |
Exploring the Word Sense Disambiguation Capabilities of Large Language Models |
Pierpaolo Basile et.al. |
2503.08662 |
null |
2025-03-11 |
YuE: Scaling Open Foundation Models for Long-Form Music Generation |
Ruibin Yuan et.al. |
2503.08638 |
link |
2025-03-11 |
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization |
Xianfeng Wu et.al. |
2503.08619 |
link |
2025-03-11 |
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments |
Dongping Li et.al. |
2503.08604 |
link |
2025-03-11 |
NSF-SciFy: Mining the NSF Awards Database for Scientific Claims |
Delip Rao et.al. |
2503.08600 |
null |
2025-03-11 |
3D Point Cloud Generation via Autoregressive Up-sampling |
Ziqiao Meng et.al. |
2503.08594 |
null |
2025-03-11 |
Proc4Gem: Foundation models for physical agency through procedural generation |
Yixin Lin et.al. |
2503.08593 |
null |
2025-03-11 |
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding |
Shehreen Azad et.al. |
2503.08585 |
null |
2025-03-11 |
RAG-Adapter: A Plug-and-Play RAG-enhanced Framework for Long Video Understanding |
Xichen Tan et.al. |
2503.08576 |
null |
2025-03-11 |
DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process |
Minjun Zhu et.al. |
2503.08569 |
null |
2025-03-11 |
Can We Detect Failures Without Failure Data? Uncertainty-Aware Runtime Failure Detection for Imitation Learning Policies |
Chen Xu et.al. |
2503.08558 |
null |
2025-03-11 |
Reasoning and Sampling-Augmented MCQ Difficulty Prediction via LLMs |
Wanyong Feng et.al. |
2503.08551 |
null |
2025-03-11 |
Transferring Extreme Subword Style Using Ngram Model-Based Logit Scaling |
Craig Messner et.al. |
2503.08550 |
null |
2025-03-11 |
Graph of AI Ideas: Leveraging Knowledge Graphs and LLMs for AI Research Idea Generation |
Xian Gao et.al. |
2503.08549 |
null |
2025-03-11 |
DAFE: LLM-Based Evaluation Through Dynamic Arbitration for Free-Form Question-Answering |
Sher Badshah et.al. |
2503.08542 |
null |
2025-03-11 |
Mellow: a small audio language model for reasoning |
Soham Deshmukh et.al. |
2503.08540 |
link |
2025-03-11 |
Chemical reasoning in LLMs unlocks steerable synthesis planning and reaction mechanism elucidation |
Andres M Bran et.al. |
2503.08537 |
link |
2025-03-11 |
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems |
Siddhant Arora et.al. |
2503.08533 |
null |
2025-03-11 |
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training |
Tong Wei et.al. |
2503.08525 |
null |
2025-03-11 |
Position-Aware Depth Decay Decoding ($D^3$): Boosting Large Language Model Inference Efficiency |
Siqi Fan et.al. |
2503.08524 |
null |
2025-03-11 |
High-Quality 3D Head Reconstruction from Any Single Portrait Image |
Jianfu Zhang et.al. |
2503.08516 |
null |
2025-03-11 |
LightPlanner: Unleashing the Reasoning Capabilities of Lightweight Large Language Models in Task Planning |
Weijie Zhou et.al. |
2503.08508 |
link |
2025-03-11 |
Referring to Any Person |
Qing Jiang et.al. |
2503.08507 |
link |
2025-03-11 |
ReviewAgents: Bridging the Gap Between Human and AI-Generated Paper Reviews |
Xian Gao et.al. |
2503.08506 |
null |
2025-03-11 |
Enhancing Multi-Hop Fact Verification with Structured Knowledge-Augmented Large Language Models |
Han Cao et.al. |
2503.08495 |
null |
2025-03-11 |
TT-GaussOcc: Test-Time Compute for Self-Supervised Occupancy Prediction via Spatio-Temporal Gaussian Splatting |
Fengyi Zhang et.al. |
2503.08485 |
null |
2025-03-11 |
Generalizable AI-Generated Image Detection Based on Fractal Self-Similarity in the Spectrum |
Shengpeng Xiao et.al. |
2503.08484 |
null |
2025-03-11 |
PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability |
Weijie Zhou et.al. |
2503.08481 |
link |
2025-03-11 |
FastCache: Optimizing Multimodal LLM Serving through Lightweight KV-Cache Compression Framework |
Jianian Zhu et.al. |
2503.08461 |
null |
2025-03-11 |
KAP: MLLM-assisted OCR Text Enhancement for Hybrid Retrieval in Chinese Non-Narrative Documents |
Hsin-Ling Hsu et.al. |
2503.08452 |
link |
2025-03-11 |
LLM-Pack: Intuitive Grocery Handling for Logistics Applications |
Yannik Blei et.al. |
2503.08445 |
null |
2025-03-11 |
TokenSim: Enabling Hardware and Software Exploration for Large Language Model Inference Systems |
Feiyang Wu et.al. |
2503.08415 |
link |
2025-03-11 |
Fact-checking with Generative AI: A Systematic Cross-Topic Examination of LLMs Capacity to Detect Veracity of Political Information |
Elizaveta Kuznetsova et.al. |
2503.08404 |
null |
2025-03-11 |
OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning |
Jiawei Zhou et.al. |
2503.08398 |
null |
2025-03-11 |
Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens |
Qingsong Xie et.al. |
2503.08377 |
null |
2025-03-11 |
nnInteractive: Redefining 3D Promptable Segmentation |
Fabian Isensee et.al. |
2503.08373 |
link |
2025-03-11 |
MetaFold: Language-Guided Multi-Category Garment Folding Framework via Trajectory Generation and Foundation Model |
Haonan Chen et.al. |
2503.08372 |
null |
2025-03-11 |
Robust Latent Matters: Boosting Image Generation with Sampling Error |
Kai Qiu et.al. |
2503.08354 |
link |
2025-03-12 |
Attention Reallocation: Towards Zero-cost and Controllable Hallucination Mitigation of MLLMs |
Chongjun Tu et.al. |
2503.08342 |
null |
2025-03-11 |
Trinity: A Modular Humanoid Robot AI System |
Jingkai Sun et.al. |
2503.08338 |
null |
2025-03-11 |
Prompt2LVideos: Exploring Prompts for Understanding Long-Form Multimodal Videos |
Soumya Shamarao Jahagirdar et.al. |
2503.08335 |
null |
2025-03-11 |
KiteRunner: Language-Driven Cooperative Local-Global Navigation Policy with UAV Mapping in Outdoor Environments |
Shibo Huang et.al. |
2503.08330 |
null |
2025-03-11 |
Towards Scalable and Cross-Lingual Specialist Language Models for Oncology |
Morteza Rohanian et.al. |
2503.08323 |
null |
2025-03-11 |
Mind the Memory Gap: Unveiling GPU Bottlenecks in Large-Batch LLM Inference |
Pol G. Recasens et.al. |
2503.08311 |
null |
2025-03-11 |
Seeing and Reasoning with Confidence: Supercharging Multimodal LLMs with an Uncertainty-Aware Agentic Framework |
Zhuo Zhi et.al. |
2503.08308 |
null |
2025-03-11 |
General-Purpose Aerial Intelligent Agents Empowered by Large Language Models |
Ji Zhao et.al. |
2503.08302 |
null |
2025-03-12 |
Large Language Model as Meta-Surrogate for Data-Driven Many-Task Optimization: A Proof-of-Principle Study |
Xian-Rong Zhang et.al. |
2503.08301 |
null |
2025-03-11 |
Large Language Models for Outpatient Referral: Problem Definition, Benchmarking and Challenges |
Xiaoxiao Liu et.al. |
2503.08292 |
link |
2025-03-11 |
PromptLNet: Region-Adaptive Aesthetic Enhancement via Prompt Guidance in Low-Light Enhancement Net |
Jun Yin et.al. |
2503.08276 |
null |
2025-03-11 |
LangTime: A Language-Guided Unified Model for Time Series Forecasting with Proximal Policy Optimization |
Wenzhe Niu et.al. |
2503.08271 |
null |
2025-03-11 |
DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness |
Yiming Zhong et.al. |
2503.08257 |
link |
2025-03-11 |
Aligning Text to Image in Diffusion Models is Easier Than You Think |
Jaa-Yeon Lee et.al. |
2503.08250 |
null |
2025-03-11 |
Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices |
Tao Shen et.al. |
2503.08223 |
null |
2025-03-11 |
EgoBlind: Towards Egocentric Visual Assistance for the Blind People |
Junbin Xiao et.al. |
2503.08221 |
null |
2025-03-11 |
S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction |
Guangting Zheng et.al. |
2503.08217 |
null |
2025-03-11 |
To Use or Not to Use a Universal Force Field |
Denan Li et.al. |
2503.08207 |
null |
2025-03-11 |
Route Sparse Autoencoder to Interpret Large Language Models |
Wei Shi et.al. |
2503.08200 |
link |
2025-03-11 |
A Cascading Cooperative Multi-agent Framework for On-ramp Merging Control Integrating Large Language Models |
Miao Zhang et.al. |
2503.08199 |
null |
2025-03-11 |
Dialogue Injection Attack: Jailbreaking LLMs through Context Manipulation |
Wenlong Meng et.al. |
2503.08195 |
link |
2025-03-11 |
Automating Violence Detection and Categorization from Ancient Texts |
Alhassan Abdelhalim et.al. |
2503.08192 |
null |
2025-03-11 |
RigoChat 2: an adapted language model to Spanish using a bounded dataset and reduced hardware |
Gonzalo Santamaría Gómez et.al. |
2503.08188 |
null |
2025-03-11 |
Mutation Testing via Iterative Large Language Model-Driven Scientific Debugging |
Philipp Straubinger et.al. |
2503.08182 |
null |
2025-03-12 |
ProtTeX: Structure-In-Context Reasoning and Editing of Proteins with Large Language Models |
Zicheng Ma et.al. |
2503.08179 |
null |
2025-03-11 |
Investigating the Effectiveness of a Socratic Chain-of-Thoughts Reasoning Method for Task Planning in Robotics, A Case Study |
Veronica Bot et.al. |
2503.08174 |
null |
2025-03-11 |
Towards All-in-One Medical Image Re-Identification |
Yuan Tian et.al. |
2503.08173 |
link |
2025-03-11 |
TSCnet: A Text-driven Semantic-level Controllable Framework for Customized Low-Light Image Enhancement |
Miao Zhang et.al. |
2503.08168 |
null |
2025-03-11 |
FASIONAD++: Integrating High-Level Instruction and Information Bottleneck in FAst-Slow fusION Systems for Enhanced Safety in Autonomous Driving with Adaptive Feedback |
Kangan Qian et.al. |
2503.08162 |
null |
2025-03-12 |
OASIS: Order-Augmented Strategy for Improved Code Search |
Zuchen Gao et.al. |
2503.08161 |
null |
2025-03-11 |
Towards Large-scale Chemical Reaction Image Parsing via a Multimodal Large Language Model |
Yufan Chen et.al. |
2503.08156 |
null |
2025-03-11 |
WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation |
Jing Wang et.al. |
2503.08153 |
null |
2025-03-11 |
Few-Shot Class-Incremental Model Attribution Using Learnable Representation From CLIP-ViT Features |
Hanbyul Lee et.al. |
2503.08148 |
null |
2025-03-11 |
FilmComposer: LLM-Driven Music Production for Silent Film Clips |
Zhifeng Xie et.al. |
2503.08147 |
null |
2025-03-11 |
Bring Remote Sensing Object Detect Into Nature Language Model: Using SFT Method |
Fei Wang et.al. |
2503.08144 |
null |
2025-03-11 |
FlowDPS: Flow-Driven Posterior Sampling for Inverse Problems |
Jeongsol Kim et.al. |
2503.08136 |
null |
2025-03-11 |
Large Scale Multi-Task Bayesian Optimization with Large Language Models |
Yimeng Zeng et.al. |
2503.08131 |
null |
2025-03-11 |
LLM4MAC: An LLM-Driven Reinforcement Learning Framework for MAC Protocol Emergence |
Renxuan Tan et.al. |
2503.08123 |
null |
2025-03-11 |
Toward Stable World Models: Measuring and Addressing World Instability in Generative Environments |
Soonwoo Kwon et.al. |
2503.08122 |
null |
2025-03-11 |
Uni$\textbf{F}^2$ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models |
Junzhe Li et.al. |
2503.08120 |
null |
2025-03-11 |
Convergence Dynamics and Stabilization Strategies of Co-Evolving Generative Models |
Weiguo Gao et.al. |
2503.08117 |
null |
2025-03-11 |
AI-native Memory 2.0: Second Me |
Jiale Wei et.al. |
2503.08102 |
null |
2025-03-12 |
PRISM: Privacy-Preserving Improved Stochastic Masking for Federated Generative Models |
Kyeongkook Seo et.al. |
2503.08085 |
link |
2025-03-11 |
Instruction-Augmented Long-Horizon Planning: Embedding Grounding Mechanisms in Embodied Mobile Manipulation |
Fangyuan Wang et.al. |
2503.08084 |
null |
2025-03-11 |
Seeing Beyond Haze: Generative Nighttime Image Dehazing |
Beibei Lin et.al. |
2503.08073 |
null |
2025-03-11 |
Flow Matching for Discrete Systems: Efficient Free Energy Sampling Across Lattice Sizes and Temperatures |
Ping Tuo et.al. |
2503.08063 |
null |
2025-03-11 |
Odysseus Navigates the Sirens’ Song: Dynamic Focus Decoding for Factual and Diverse Open-Ended Text Generation |
Wen Luo et.al. |
2503.08057 |
null |
2025-03-11 |
Counterfactual Language Reasoning for Explainable Recommendation Systems |
Guanrong Li et.al. |
2503.08051 |
null |
2025-03-11 |
SphOR: A Representation Learning Perspective on Open-set Recognition for Identifying Unknown Classes in Deep Learning Models |
Nadarasar Bahavan et.al. |
2503.08049 |
link |
2025-03-11 |
LongProLIP: A Probabilistic Vision-Language Model with Long Context Text |
Sanghyuk Chun et.al. |
2503.08048 |
link |
2025-03-11 |
Adapting Large Language Models for Parameter-Efficient Log Anomaly Detection |
Ying Fu Lim et.al. |
2503.08045 |
null |
2025-03-11 |
ObjectMover: Generative Object Movement with Video Prior |
Xin Yu et.al. |
2503.08037 |
null |
2025-03-11 |
Learning to Search Effective Example Sequences for In-Context Learning |
Xiang Gao et.al. |
2503.08030 |
null |
2025-03-11 |
In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents |
Zhen Tan et.al. |
2503.08026 |
null |
2025-03-10 |
V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation |
Guiwei Zhang et.al. |
2503.07493 |
link |
2025-03-10 |
LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition? |
Bangyan Li et.al. |
2503.07487 |
null |
2025-03-10 |
Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction |
Zongzheng Zhang et.al. |
2503.07485 |
link |
2025-03-10 |
GenAIReading: Augmenting Human Cognition with Interactive Digital Textbooks Using Large Language Models and Image Generation Models |
Ryugo Morita et.al. |
2503.07463 |
null |
2025-03-10 |
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning |
Xiangru Tang et.al. |
2503.07459 |
link |
2025-03-10 |
LLMs syntactically adapt their language use to their conversational partner |
Florian Kandra et.al. |
2503.07457 |
null |
2025-03-10 |
Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration |
Dylan J. Foster et.al. |
2503.07453 |
null |
2025-03-10 |
From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development – An Opinion Paper |
Sargam Yadav et.al. |
2503.07450 |
null |
2025-03-10 |
From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics |
Jaewook Lee et.al. |
2503.07429 |
null |
2025-03-10 |
RePO: ReLU-based Preference Optimization |
Junkang Wu et.al. |
2503.07426 |
link |
2025-03-10 |
REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding |
Yan Tai et.al. |
2503.07413 |
link |
2025-03-10 |
Towards Safe Robot Foundation Models |
Maximilian Tölle et.al. |
2503.07404 |
null |
2025-03-10 |
Keeping Representation Similarity in Finetuning for Medical Image Analysis |
Wenqiang Zu et.al. |
2503.07399 |
null |
2025-03-10 |
Revisiting Noise in Natural Language Processing for Computational Social Science |
Nadav Borenstein et.al. |
2503.07395 |
null |
2025-03-10 |
Process-Supervised LLM Recommenders via Flow-guided Tuning |
Chongming Gao et.al. |
2503.07377 |
link |
2025-03-10 |
Artificial Utopia: Simulation and Intelligent Agents for a Democratised Future |
Yannick Oswald et.al. |
2503.07364 |
null |
2025-03-10 |
RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing |
Yiqing Xie et.al. |
2503.07358 |
link |
2025-03-10 |
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment |
Xing Xie et.al. |
2503.07334 |
link |
2025-03-10 |
Assessing the Macro and Micro Effects of Random Seeds on Fine-Tuning Large Language Models |
Hao Zhou et.al. |
2503.07329 |
null |
2025-03-10 |
Dynamic Path Navigation for Motion Agents with LLM Reasoning |
Yubo Zhao et.al. |
2503.07323 |
null |
2025-03-10 |
Experimental Exploration: Investigating Cooperative Interaction Behavior Between Humans and Large Language Model Agents |
Guanxuan Jiang et.al. |
2503.07320 |
null |
2025-03-10 |
Self-Corrective Task Planning by Inverse Prompting with Large Language Models |
Jiho Lee et.al. |
2503.07317 |
null |
2025-03-10 |
Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies |
Luyi Jiang et.al. |
2503.07306 |
null |
2025-03-10 |
A Graph-based Verification Framework for Fact-Checking |
Yani Huang et.al. |
2503.07282 |
null |
2025-03-10 |
COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition |
Baiyu Chen et.al. |
2503.07259 |
link |
2025-03-10 |
CoT-Drive: Efficient Motion Forecasting for Autonomous Driving with LLMs and Chain-of-Thought Prompting |
Haicheng Liao et.al. |
2503.07234 |
null |
2025-03-10 |
Control Flow-Augmented Decompiler based on Large Language Model |
Peipei Liu et.al. |
2503.07215 |
null |
2025-03-10 |
Endo-FASt3r: Endoscopic Foundation model Adaptation for Structure from motion |
Mona Sheikh Zeinoddin et.al. |
2503.07204 |
null |
2025-03-10 |
A Zero-shot Learning Method Based on Large Language Models for Multi-modal Knowledge Graph Embedding |
Bingchen Liu et.al. |
2503.07202 |
null |
2025-03-10 |
Effective and Efficient Masked Image Generation Models |
Zebin You et.al. |
2503.07197 |
link |
2025-03-10 |
Contextual Cues in Machine Translation: Investigating the Potential of Multi-Source Input Strategies in LLMs and NMT Systems |
Lia Shahnazaryan et.al. |
2503.07195 |
null |
2025-03-10 |
Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms |
Jiaming Song et.al. |
2503.07154 |
null |
2025-03-10 |
MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark |
Shengkun Ma et.al. |
2503.07144 |
link |
2025-03-10 |
Application of Multiple Chain-of-Thought in Contrastive Reasoning for Implicit Sentiment Analysis |
Liwei Yang et.al. |
2503.07140 |
null |
2025-03-10 |
VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation |
Hanzhi Chen et.al. |
2503.07135 |
null |
2025-03-10 |
Learning A Zero-shot Occupancy Network from Vision Foundation Models via Self-supervised Adaptation |
Sihao Lin et.al. |
2503.07125 |
null |
2025-03-10 |
Quantizing Large Language Models for Code Generation: A Differentiated Replication |
Alessandro Giagnorio et.al. |
2503.07103 |
null |
2025-03-10 |
A Novel Ophthalmic Benchmark for Evaluating Multimodal Large Language Models with Fundus Photographs and OCT Images |
Xiaoyi Liang et.al. |
2503.07094 |
null |
2025-03-10 |
Linguistic Knowledge Transfer Learning for Speech Enhancement |
Kuo-Hsuan Hung et.al. |
2503.07078 |
null |
2025-03-10 |
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs |
Jongwoo Ko et.al. |
2503.07067 |
null |
2025-03-10 |
Boosting the Generalization and Reasoning of Vision Language Models with Curriculum Reinforcement Learning |
Huilin Deng et.al. |
2503.07065 |
link |
2025-03-10 |
TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation |
Victor Shea-Jay Huang et.al. |
2503.07050 |
null |
2025-03-10 |
Recovering Partially Corrupted Major Objects through Tri-modality Based Image Completion |
Yongle Zhang et.al. |
2503.07047 |
null |
2025-03-10 |
Conditional Generative Modeling for Amorphous Multi-Element Materials |
Honglin Li et.al. |
2503.07043 |
link |
2025-03-10 |
TCM-3CEval: A Triaxial Benchmark for Assessing Responses from Large Language Models in Traditional Chinese Medicine |
Tianai Huang et.al. |
2503.07041 |
null |
2025-03-10 |
Bot Wars Evolved: Orchestrating Competing LLMs in a Counterstrike Against Phone Scams |
Nardine Basta et.al. |
2503.07036 |
null |
2025-03-10 |
Multimodal Human-AI Synergy for Medical Imaging Quality Control: A Hybrid Intelligence Framework with Adaptive Dataset Curation and Closed-Loop Evaluation |
Zhi Qin et.al. |
2503.07032 |
null |
2025-03-10 |
Combating Partial Perception Deficit in Autonomous Driving with Multimodal LLM Commonsense |
Yuting Hu et.al. |
2503.07020 |
null |
2025-03-10 |
Toward Multi-Session Personalized Conversation: A Large-Scale Dataset and Hierarchical Tree Framework for Implicit Reasoning |
Xintong Li et.al. |
2503.07018 |
link |
2025-03-10 |
HELM: Human-Preferred Exploration with Language Models |
Shuhao Liao et.al. |
2503.07006 |
null |
2025-03-10 |
Large Language Models Often Say One Thing and Do Another |
Ruoxi Xu et.al. |
2503.07003 |
link |
2025-03-10 |
Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning |
Jiazheng Liu et.al. |
2503.07002 |
null |
2025-03-10 |
Utilizing Jailbreak Probability to Attack and Safeguard Multimodal LLMs |
Wenzhuo Xu et.al. |
2503.06989 |
null |
2025-03-10 |
Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations |
Jiho Jin et.al. |
2503.06987 |
null |
2025-03-10 |
Learning Decision Trees as Amortized Structure Inference |
Mohammed Mahfoud et.al. |
2503.06985 |
link |
2025-03-10 |
Exploring Multimodal Perception in Large Language Models Through Perceptual Strength Ratings |
Jonghyun Lee et.al. |
2503.06980 |
null |
2025-03-10 |
Lightweight Multimodal Artificial Intelligence Framework for Maritime Multi-Scene Recognition |
Xinyu Xi et.al. |
2503.06978 |
null |
2025-03-10 |
Task-Specific Knowledge Distillation from the Vision Foundation Model for Enhanced Medical Image Segmentation |
Pengchen Liang et.al. |
2503.06976 |
null |
2025-03-10 |
ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA |
Zhao Xinjie et.al. |
2503.06951 |
null |
2025-03-10 |
CtrlRAG: Black-box Adversarial Attacks Based on Masked Language Models in Retrieval-Augmented Language Generation |
Runqi Sui et.al. |
2503.06950 |
null |
2025-03-11 |
LexPro-1.0 Technical Report |
Haotian Chen et.al. |
2503.06949 |
link |
2025-03-10 |
Large Language Model Guided Progressive Feature Alignment for Multimodal UAV Object Detection |
Wentao Wu et.al. |
2503.06948 |
null |
2025-03-10 |
Handle Object Navigation as Weighted Traveling Repairman Problem |
Ruimeng Liu et.al. |
2503.06937 |
link |
2025-03-10 |
Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping |
Ning Ding et.al. |
2503.06930 |
null |
2025-03-10 |
Effect of Selection Format on LLM Performance |
Yuchen Han et.al. |
2503.06926 |
null |
2025-03-10 |
Combinatorial Optimization via LLM-driven Iterated Fine-tuning |
Pranjal Awasthi et.al. |
2503.06917 |
null |
2025-03-10 |
Beyond Code Generation: LLM-supported Exploration of the Program Design Space |
J. D. Zamfirescu-Pereira et.al. |
2503.06911 |
null |
2025-03-10 |
A Query Optimization Method Utilizing Large Language Models |
Zhiming Yao et.al. |
2503.06902 |
null |
2025-03-10 |
DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation |
Xiaoliang Ju et.al. |
2503.06900 |
null |
2025-03-10 |
SafePlan: Leveraging Formal Logic and Chain-of-Thought Reasoning for Enhanced Safety in LLM-based Robotic Task Planning |
Ike Obi et.al. |
2503.06892 |
null |
2025-03-10 |
ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks |
Yan Yang et.al. |
2503.06885 |
null |
2025-03-10 |
Text-to-Image Diffusion Models Cannot Count, and Prompt Refinement Cannot Help |
Yuefan Cao et.al. |
2503.06884 |
null |
2025-03-10 |
ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration |
Mengting Ai et.al. |
2503.06881 |
link |
2025-03-10 |
Graphormer-Guided Task Planning: Beyond Static Rules with LLM Safety Perception |
Wanjing Huang et.al. |
2503.06866 |
link |
2025-03-10 |
FIGLUT: An Energy-Efficient Accelerator Design for FP-INT GEMM Using Look-Up Tables |
Gunho Park et.al. |
2503.06862 |
null |
2025-03-10 |
Enhanced Multi-Tuple Extraction for Alloys: Integrating Pointer Networks and Augmented Attention |
Mengzhe Hei et.al. |
2503.06861 |
null |
2025-03-10 |
MADS: Multi-Attribute Document Supervision for Zero-Shot Image Classification |
Xiangyan Qu et.al. |
2503.06847 |
null |
2025-03-10 |
GUIDE-CoT: Goal-driven and User-Informed Dynamic Estimation for Pedestrian Trajectory using Chain-of-Thought |
Sungsik Kim et.al. |
2503.06832 |
link |
2025-03-10 |
Towards a Multimodal MRI-Based Foundation Model for Multi-Level Feature Exploration in Segmentation, Molecular Subtyping, and Grading of Glioma |
Somayeh Farahani et.al. |
2503.06828 |
null |
2025-03-10 |
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference |
Suraiya Tairin et.al. |
2503.06823 |
null |
2025-03-10 |
HierDAMap: Towards Universal Domain Adaptive BEV Mapping via Hierarchical Perspective Priors |
Siyu Li et.al. |
2503.06821 |
link |
2025-03-10 |
Towards Fine-Grained Video Question Answering |
Wei Dai et.al. |
2503.06820 |
null |
2025-03-09 |
Privacy Auditing of Large Language Models |
Ashwinee Panda et.al. |
2503.06808 |
null |
2025-03-09 |
VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation |
Hritik Bansal et.al. |
2503.06800 |
null |
2025-03-09 |
Multimodal AI-driven Biomarker for Early Detection of Cancer Cachexia |
Sabeen Ahmed et.al. |
2503.06797 |
null |
2025-03-09 |
RoboDesign1M: A Large-scale Dataset for Robot Design Understanding |
Tri Le et.al. |
2503.06796 |
null |
2025-03-09 |
AutoMisty: A Multi-Agent LLM Framework for Automated Code Generation in the Misty Social Robot |
Xiao Wang et.al. |
2503.06791 |
null |
2025-03-09 |
Infinite Leagues Under the Sea: Photorealistic 3D Underwater Terrain Generation by Latent Fractal Diffusion Models |
Tianyi Zhang et.al. |
2503.06784 |
null |
2025-03-09 |
Dr Genre: Reinforcement Learning from Decoupled LLM Feedback for Generic Text Rewriting |
Yufei Li et.al. |
2503.06781 |
null |
2025-03-09 |
Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators |
Feng Gu et.al. |
2503.06778 |
null |
2025-03-09 |
Primal-Dual Sample Complexity Bounds for Constrained Markov Decision Processes with Multiple Constraints |
Max Buckley et.al. |
2503.06751 |
null |
2025-03-09 |
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models |
Wenxuan Huang et.al. |
2503.06749 |
link |
2025-03-09 |
CoDa-4DGS: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving |
Rui Song et.al. |
2503.06744 |
null |
2025-03-09 |
Delusions of Large Language Models |
Hongshen Xu et.al. |
2503.06709 |
null |
2025-03-09 |
Alignment for Efficient Tool Calling of Large Language Models |
Hongshen Xu et.al. |
2503.06708 |
null |
2025-03-09 |
PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts |
Ming Zhang et.al. |
2503.06706 |
link |
2025-03-09 |
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models |
Yuchen Yan et.al. |
2503.06692 |
null |
2025-03-09 |
DependEval: Benchmarking LLMs for Repository Dependency Understanding |
Junjia Du et.al. |
2503.06689 |
link |
2025-03-09 |
UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion |
Gongbo Zhang et.al. |
2503.06687 |
null |
2025-03-09 |
FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation |
Wei Li et.al. |
2503.06680 |
null |
2025-03-09 |
Exploring LLM Agents for Cleaning Tabular Machine Learning Datasets |
Tommaso Bendinelli et.al. |
2503.06664 |
null |
2025-03-07 |
Fairness-Aware Low-Rank Adaptation Under Demographic Privacy Constraints |
Parameswaran Kamalaruban et.al. |
2503.05684 |
null |
2025-03-07 |
Understanding the Limits of Lifelong Knowledge Editing in LLMs |
Lukas Thede et.al. |
2503.05683 |
null |
2025-03-07 |
AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data |
Zengqun Zhao et.al. |
2503.05665 |
link |
2025-03-07 |
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval |
Yu Zhang et.al. |
2503.05659 |
link |
2025-03-07 |
A functional approach for curve alignment and shape analysis |
Issam-Ali Moindjié et.al. |
2503.05632 |
null |
2025-03-07 |
Learning LLM Preference over Intra-Dialogue Pairs: A Framework for Utterance-level Understandings |
Xuanqing Liu et.al. |
2503.05620 |
null |
2025-03-07 |
A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models |
Dong Shu et.al. |
2503.05613 |
null |
2025-03-07 |
From Theory to Application: A Practical Introduction to Neural Operators in Scientific Computing |
Prashant K. Jha et.al. |
2503.05598 |
link |
2025-03-07 |
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning |
Huatong Song et.al. |
2503.05592 |
null |
2025-03-07 |
Evaluating open-source Large Language Models for automated fact-checking |
Nicolo’ Fontana et.al. |
2503.05565 |
null |
2025-03-07 |
Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance |
Bryan Etzine et.al. |
2503.05551 |
null |
2025-03-07 |
Leveraging Approximate Caching for Faster Retrieval-Augmented Generation |
Shai Bergman et.al. |
2503.05530 |
null |
2025-03-07 |
PoSSUM: A Protocol for Surveying Social-media Users with Multimodal LLMs |
Roberto Cerina et.al. |
2503.05529 |
null |
2025-03-07 |
Post-Hoc Concept Disentanglement: From Correlated to Isolated Concept Representations |
Eren Erogullari et.al. |
2503.05522 |
link |
2025-03-07 |
Cognitive Bias Detection Using Advanced Prompt Engineering |
Frederic Lemieux et.al. |
2503.05516 |
null |
2025-03-07 |
Statistical Guarantees of Correctness Coverage for Medical Multiple-Choice Question Answering |
Yusong Ke et.al. |
2503.05505 |
null |
2025-03-07 |
Benchmarking LLMs in Recommendation Tasks: A Comparative Evaluation with Conventional Recommenders |
Qijiong Liu et.al. |
2503.05493 |
null |
2025-03-07 |
Statistical Deficiency for Task Inclusion Estimation |
Loïc Fosse et.al. |
2503.05491 |
null |
2025-03-07 |
Maximum Hallucination Standards for Domain-Specific Large Language Models |
Tingmingke Lu et.al. |
2503.05481 |
null |
2025-03-07 |
The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence |
Noah Mamie et.al. |
2503.05473 |
null |
2025-03-07 |
De Novo Design of Protein-Binding Peptides by Quantum Computing |
Lars Meuser et.al. |
2503.05458 |
null |
2025-03-07 |
LLM-based Iterative Approach to Metamodeling in Automotive |
Nenad Petrovic et.al. |
2503.05449 |
null |
2025-03-07 |
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts |
Weigao Sun et.al. |
2503.05447 |
link |
2025-03-07 |
Are Your LLM-based Text-to-SQL Models Secure? Exploring SQL Injection via Backdoor Attacks |
Meiyu Lin et.al. |
2503.05445 |
null |
2025-03-07 |
Static Program Analysis Guided LLM Based Unit Test Generation |
Sujoy Roychowdhury et.al. |
2503.05394 |
null |
2025-03-07 |
Ontology Generation using Large Language Models |
Anna Sofia Lippolis et.al. |
2503.05388 |
link |
2025-03-07 |
VLMs Play StarCraft II: A Benchmark and Multimodal Decision Method |
Weiyu Ma et.al. |
2503.05383 |
link |
2025-03-07 |
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning |
Jiaxing Zhao et.al. |
2503.05379 |
null |
2025-03-07 |
Shifting Perspectives: Steering Vector Ensembles for Robust Bias Mitigation in LLMs |
Zara Siddique et.al. |
2503.05371 |
null |
2025-03-07 |
Chain of Strategy Optimization Makes Large Language Models Better Emotional Supporter |
Weixiang Zhao et.al. |
2503.05362 |
null |
2025-03-07 |
GEMA-Score: Granular Explainable Multi-Agent Score for Radiology Report Evaluation |
Zhenxuan Zhang et.al. |
2503.05347 |
link |
2025-03-07 |
AutoIOT: LLM-Driven Automated Natural Language Programming for AIoT Applications |
Leming Shen et.al. |
2503.05346 |
link |
2025-03-07 |
PhysicsGen: Can Generative Models Learn from Images to Predict Complex Physical Relations? |
Martin Spitznagel et.al. |
2503.05333 |
null |
2025-03-07 |
Dynamic Knowledge Integration for Evidence-Driven Counter-Argument Generation with Large Language Models |
Anar Yeginbergen et.al. |
2503.05328 |
null |
2025-03-07 |
Routing for Large ML Models |
Ofir Cohen et.al. |
2503.05324 |
link |
2025-03-07 |
Riemannian Metric Learning: Closer to You than You Imagine |
Samuel Gruffaz et.al. |
2503.05321 |
null |
2025-03-07 |
Disentangling Task Interference within Neurons: Model Merging in Alignment with Neuronal Mechanisms |
Zitao Fang et.al. |
2503.05320 |
null |
2025-03-07 |
Escaping Plato’s Cave: Towards the Alignment of 3D and Text Latent Spaces |
Souhail Hadgi et.al. |
2503.05283 |
null |
2025-03-07 |
Similarity-Based Domain Adaptation with LLMs |
Jie He et.al. |
2503.05281 |
null |
2025-03-07 |
Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching |
Bowen Pang et.al. |
2503.05248 |
link |
2025-03-07 |
L-FUSION: Laplacian Fetal Ultrasound Segmentation & Uncertainty Estimation |
Johanna P. Müller et.al. |
2503.05245 |
null |
2025-03-07 |
WritingBench: A Comprehensive Benchmark for Generative Writing |
Yuning Wu et.al. |
2503.05244 |
link |
2025-03-07 |
MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio |
Xuenan Xu et.al. |
2503.05242 |
link |
2025-03-07 |
Unveiling Biases in AI: ChatGPT’s Political Economy Perspectives and Human Comparisons |
Leonardo Becchetti et.al. |
2503.05234 |
null |
2025-03-07 |
Kaiwu: A Multimodal Manipulation Dataset and Framework for Robot Learning and Human-Robot Interaction |
Shuo Jiang et.al. |
2503.05231 |
null |
2025-03-07 |
ARbiter: Generating Dialogue Options and Communication Support in Augmented Reality |
Julián Méndez et.al. |
2503.05220 |
null |
2025-03-07 |
Knowledge Updating? No More Model Editing! Just Selective Contextual Reasoning |
Guoxiu He et.al. |
2503.05212 |
null |
2025-03-07 |
Path Pooling: Train-Free Structure Enhancement for Efficient Knowledge Graph Retrieval-Augmented Generation |
Hairu Wang et.al. |
2503.05203 |
null |
2025-03-07 |
ORANSight-2.0: Foundational LLMs for O-RAN |
Pranshav Gajjar et.al. |
2503.05200 |
null |
2025-03-07 |
Memory-augmented Query Reconstruction for LLM-based Knowledge Graph Reasoning |
Mufan Xu et.al. |
2503.05193 |
null |
2025-03-07 |
Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions |
Chan Hur et.al. |
2503.05186 |
null |
2025-03-07 |
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching |
Simon A. Aytes et.al. |
2503.05179 |
link |
2025-03-07 |
Development and Enhancement of Text-to-Image Diffusion Models |
Rajdeep Roshan Sahu et.al. |
2503.05149 |
null |
2025-03-07 |
RocketEval: Efficient Automated LLM Evaluation via Grading Checklist |
Tianjun Wei et.al. |
2503.05142 |
link |
2025-03-07 |
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs |
Ling Team et.al. |
2503.05139 |
null |
2025-03-07 |
R1-Zero’s “Aha Moment” in Visual Reasoning on a 2B Non-SFT Model |
Hengguang Zhou et.al. |
2503.05132 |
link |
2025-03-07 |
Dilu: Enabling GPU Resourcing-on-Demand for Serverless DL Serving via Introspective Elasticity |
Cunchi Lv et.al. |
2503.05130 |
null |
2025-03-07 |
Can Large Language Models Grasp Concepts in Visual Content? A Case Study on YouTube Shorts about Depression |
Jiaying “Lizzy” Liu et.al. |
2503.05109 |
null |
2025-03-07 |
AutoTestForge: A Multidimensional Automated Testing Framework for Natural Language Processing Models |
Hengrui Xing et.al. |
2503.05102 |
null |
2025-03-07 |
SpecServe: Efficient and SLO-Aware Large Language Model Serving with Adaptive Speculative Decoding |
Kaiyu Huang et.al. |
2503.05096 |
null |
2025-03-07 |
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information |
Feng Jiang et.al. |
2503.05085 |
null |
2025-03-07 |
On a Connection Between Imitation Learning and RLHF |
Teng Xiao et.al. |
2503.05079 |
link |
2025-03-07 |
PromptPex: Automatic Test Generation for Language Model Prompts |
Reshabh K Sharma et.al. |
2503.05070 |
link |
2025-03-07 |
Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts |
Shwai He et.al. |
2503.05066 |
null |
2025-03-07 |
No Free Labels: Limitations of LLM-as-a-Judge Without Human Grounding |
Michael Krumdick et.al. |
2503.05061 |
null |
2025-03-06 |
Dynamic-KGQA: A Scalable Framework for Generating Adaptive Question Answering Datasets |
Preetam Prabhu Srikar Dammu et.al. |
2503.05049 |
null |
2025-03-06 |
Biases in Large Language Model-Elicited Text: A Case Study in Natural Language Inference |
Grace Proebsting et.al. |
2503.05047 |
null |
2025-03-06 |
Continual Pre-training of MoEs: How robust is your router? |
Benjamin Thérien et.al. |
2503.05029 |
null |
2025-03-06 |
ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids |
Hannes Stark et.al. |
2503.05025 |
link |
2025-03-06 |
Safety is Not Only About Refusal: Reasoning-Enhanced Fine-tuning for Interpretable LLM Safety |
Yuyou Zhang et.al. |
2503.05021 |
null |
2025-03-06 |
LLMs’ Reshaping of People, Processes, Products, and Society in Software Development: A Comprehensive Exploration with Early Adopters |
Benyamin Tabarsi et.al. |
2503.05012 |
null |
2025-03-06 |
Leveraging Domain Knowledge at Inference Time for LLM Translation: Retrieval versus Generation |
Bryan Li et.al. |
2503.05010 |
null |
2025-03-06 |
Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models |
Benyamin Jamialahmadi et.al. |
2503.05005 |
link |
2025-03-06 |
Wanda++: Pruning Large Language Models via Regional Gradients |
Yifan Yang et.al. |
2503.04992 |
null |
2025-03-06 |
DP-GTR: Differentially Private Prompt Protection via Group Text Rewriting |
Mingchen Li et.al. |
2503.04990 |
null |
2025-03-06 |
Leveraging Large Language Models For Scalable Vector Graphics Processing: A Review |
Boris Malashenko et.al. |
2503.04983 |
null |
2025-03-06 |
LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression |
Souvik Kundu et.al. |
2503.04982 |
null |
2025-03-06 |
Quantifying the Relevance of Youth Research Cited in the US Policy Documents |
Miftahul Jannat Mokarrama et.al. |
2503.04977 |
link |
2025-03-06 |
Energy-Weighted Flow Matching for Offline Reinforcement Learning |
Shiyuan Zhang et.al. |
2503.04975 |
null |
2025-03-06 |
Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning |
Giulio Corallo et.al. |
2503.04973 |
null |
2025-03-06 |
Incentivizing Multi-Tenant Split Federated Learning for Foundation Models at the Network Edge |
Songyuan Li et.al. |
2503.04971 |
null |
2025-03-06 |
DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL |
Haoyuan Ma et.al. |
2503.04959 |
null |
2025-03-06 |
Collaborative Evaluation of Deepfake Text with Deliberation-Enhancing Dialogue Systems |
Jooyoung Lee et.al. |
2503.04945 |
null |
2025-03-06 |
HILGEN: Hierarchically-Informed Data Generation for Biomedical NER Using Knowledgebases and Large Language Models |
Yao Ge et.al. |
2503.04930 |
null |
2025-03-06 |
Metadata-free Georegistration of Ground and Airborne Imagery |
Adam Bredvik et.al. |
2503.04927 |
null |
2025-03-06 |
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement |
Ian Huang et.al. |
2503.04919 |
null |
2025-03-06 |
L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling |
Zhuo Chen et.al. |
2503.04725 |
link |
2025-03-07 |
Shifting Long-Context LLMs Research from Input to Output |
Yuhao Wu et.al. |
2503.04723 |
null |
2025-03-06 |
Enough Coin Flips Can Make LLMs Act Bayesian |
Ritwik Gupta et.al. |
2503.04722 |
null |
2025-03-06 |
Predictable Scale: Part I – Optimal Hyperparameter Scaling Law in Large Language Model Pretraining |
Houyi Li et.al. |
2503.04715 |
null |
2025-03-07 |
Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size |
Alireza Behtash et.al. |
2503.04704 |
null |
2025-03-06 |
UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets |
Wenyu Wang et.al. |
2503.04693 |
null |
2025-03-06 |
Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases |
Pengcheng Qiu et.al. |
2503.04691 |
null |
2025-03-06 |
LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue |
Sangyeop Kim et.al. |
2503.04675 |
null |
2025-03-06 |
What Are You Doing? A Closer Look at Controllable Human Video Generation |
Emanuele Bugliarello et.al. |
2503.04666 |
null |
2025-03-06 |
CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models |
Shengzhuang Chen et.al. |
2503.04655 |
link |
2025-03-06 |
Transferable Foundation Models for Geometric Tasks on Point Cloud Representations: Geometric Neural Operators |
Blaine Quackenbush et.al. |
2503.04649 |
link |
2025-03-06 |
Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment |
Wen Yang et.al. |
2503.04647 |
null |
2025-03-06 |
Simulating the Real World: A Unified Survey of Multimodal Generative Models |
Yuqi Hu et.al. |
2503.04641 |
link |
2025-03-06 |
Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation |
Aishik Konwer et.al. |
2503.04639 |
null |
2025-03-06 |
Mark Your LLM: Detecting the Misuse of Open-Source Large Language Models via Watermarking |
Yijie Xu et.al. |
2503.04636 |
null |
2025-03-06 |
3HANDS Dataset: Learning from Humans for Generating Naturalistic Handovers with Supernumerary Robotic Limbs |
Artin Saberpour Abadian et.al. |
2503.04635 |
null |
2025-03-06 |
Better Process Supervision with Bi-directional Rewarding Signals |
Wenxiang Chen et.al. |
2503.04618 |
null |
2025-03-06 |
Towards Data-Efficient Language Models: A Child-Inspired Approach to Language Learning |
Mohammad Amin Ghanizadeh et.al. |
2503.04611 |
null |
2025-03-06 |
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization |
Zhijian Zhuo et.al. |
2503.04598 |
link |
2025-03-06 |
The Next Frontier of LLM Applications: Open Ecosystems and Hardware Synergy |
Xinyi Hou et.al. |
2503.04596 |
null |
2025-03-06 |
Learning Generalizable Language-Conditioned Cloth Manipulation from Long Demonstrations |
Hanyi Zhao et.al. |
2503.04557 |
null |
2025-03-06 |
Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation |
Armel Zebaze et.al. |
2503.04554 |
null |
2025-03-06 |
Benchmarking Reasoning Robustness in Large Language Models |
Tong Yu et.al. |
2503.04550 |
null |
2025-03-06 |
Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model |
Wenke Huang et.al. |
2503.04543 |
link |
2025-03-06 |
SOLAR: Scalable Optimization of Large-scale Architecture for Reasoning |
Chen Li et.al. |
2503.04530 |
null |
2025-03-06 |
Multi-modal Summarization in Model-Based Engineering: Automotive Software Development Case Study |
Nenad Petrovic et.al. |
2503.04506 |
null |
2025-03-06 |
Learning Object Placement Programs for Indoor Scene Synthesis with Iterative Self Training |
Adrian Chang et.al. |
2503.04496 |
null |
2025-03-06 |
Large Language Models in Bioinformatics: A Survey |
Zhenyu Wang et.al. |
2503.04490 |
null |
2025-03-06 |
InfoSEM: A Deep Generative Model with Informative Priors for Gene Regulatory Network Inference |
Tianyu Cui et.al. |
2503.04483 |
null |
2025-03-06 |
ToolFuzz – Automated Agent Tool Testing |
Ivan Milev et.al. |
2503.04479 |
null |
2025-03-06 |
Semantic Alignment of Unimodal Medical Text and Vision Representations |
Maxime Di Folco et.al. |
2503.04478 |
null |
2025-03-06 |
Know Thy Judge: On the Robustness Meta-Evaluation of LLM Safety Judges |
Francisco Eiras et.al. |
2503.04474 |
null |
2025-03-06 |
Guiding LLMs to Generate High-Fidelity and High-Quality Counterfactual Explanations for Text Classification |
Van Bach Nguyen et.al. |
2503.04463 |
null |
2025-03-06 |
TPC: Cross-Temporal Prediction Connection for Vision-Language Model Hallucination Reduction |
Chao Wang et.al. |
2503.04457 |
null |
2025-03-06 |
Activation Space Interventions Can Be Transferred Between Large Language Models |
Narmeen Oozeer et.al. |
2503.04429 |
link |
2025-03-06 |
AOLO: Analysis and Optimization For Low-Carbon Oriented Wireless Large Language Model Services |
Xiaoqi Wang et.al. |
2503.04418 |
null |
2025-03-06 |
Can Large Language Models Predict Antimicrobial Resistance Gene? |
Hyunwoo Yoo et.al. |
2503.04413 |
null |
2025-03-06 |
Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search |
Kou Misaki et.al. |
2503.04412 |
null |
2025-03-06 |
Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling |
Yan Li et.al. |
2503.04398 |
null |
2025-03-06 |
TableLoRA: Low-rank Adaptation on Table Structure Understanding for Large Language Models |
Xinyi He et.al. |
2503.04396 |
null |
2025-03-06 |
Shaping Shared Languages: Human and Large Language Models’ Inductive Biases in Emergent Communication |
Tom Kouwenhoven et.al. |
2503.04395 |
null |
2025-03-06 |
AgentSafe: Safeguarding Large Language Model-based Multi-agent Systems via Hierarchical Data Management |
Junyuan Mao et.al. |
2503.04392 |
null |
2025-03-06 |
TRACT: Regression-Aware Fine-tuning Meets Chain-of-Thought Reasoning for LLM-as-a-Judge |
Cheng-Han Chiang et.al. |
2503.04381 |
link |
2025-03-06 |
Lost in Literalism: How Supervised Training Shapes Translationese in LLMs |
Yafu Li et.al. |
2503.04369 |
null |
2025-03-06 |
A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery |
Yiheng Zhu et.al. |
2503.04362 |
null |
2025-03-06 |
LONGCODEU: Benchmarking Long-Context Language Models on Long Code Understanding |
Jia Li et.al. |
2503.04359 |
null |
2025-03-06 |
scDD: Latent Codes Based scRNA-seq Dataset Distillation with Foundation Model Knowledge |
Zhen Yu et.al. |
2503.04357 |
null |
2025-03-06 |
Layer-Specific Scaling of Positional Encodings for Superior Long-Context Modeling |
Zhenghua Wang et.al. |
2503.04355 |
null |
2025-03-06 |
Large Language Models for Zero-shot Inference of Causal Structures in Biology |
Izzy Newsham et.al. |
2503.04347 |
null |
2025-03-06 |
TRANSIT your events into a new mass: Fast background interpolation for weakly-supervised anomaly searches |
Ivan Oleksiyuk et.al. |
2503.04342 |
link |
2025-03-06 |
In-depth Analysis of Graph-based RAG in a Unified Framework |
Yingli Zhou et.al. |
2503.04338 |
null |
2025-03-06 |
The Challenge of Identifying the Origin of Black-Box Large Language Models |
Ziqing Yang et.al. |
2503.04332 |
null |
2025-03-06 |
Solving Word-Sense Disambiguation and Word-Sense Induction with Dictionary Examples |
Tadej Škvorc et.al. |
2503.04328 |
null |
2025-03-06 |
Malware Detection at the Edge with Lightweight LLMs: A Performance Evaluation |
Christian Rondanini et.al. |
2503.04302 |
null |
2025-03-06 |
Mapping AI Benchmark Data to Quantitative Risk Estimates Through Expert Elicitation |
Malcolm Murray et.al. |
2503.04299 |
null |
2025-03-06 |
MathMistake Checker: A Comprehensive Demonstration for Step-by-Step Math Problem Mistake Finding by Prompt-Guided LLMs |
Tianyang Zhang et.al. |
2503.04291 |
null |
2025-03-06 |
How Do Hackathons Foster Creativity? Towards AI Collaborative Evaluation of Creativity at Scale |
Jeanette Falk et.al. |
2503.04290 |
null |
2025-03-06 |
Towards Autonomous Reinforcement Learning for Real-World Robotic Manipulation with Large Language Models |
Niccolò Turcato et.al. |
2503.04280 |
null |
2025-03-06 |
VirtualXAI: A User-Centric Framework for Explainability Assessment Leveraging GPT-Generated Personas |
Georgios Makridis et.al. |
2503.04261 |
null |
2025-03-06 |
Knowledge Retention for Continual Model-Based Reinforcement Learning |
Yixiang Sun et.al. |
2503.04256 |
null |
2025-03-06 |
ADOR: A Design Exploration Framework for LLM Serving with Enhanced Latency and Throughput |
Junsoo Kim et.al. |
2503.04253 |
null |
2025-03-06 |
An Egocentric Vision-Language Model based Portable Real-time Smart Assistant |
Yifei Huang et.al. |
2503.04250 |
link |
2025-03-06 |
How to Mitigate Overfitting in Weak-to-strong Generalization? |
Junhao Shi et.al. |
2503.04249 |
null |
2025-03-06 |
ThrowBench: Benchmarking LLMs by Predicting Runtime Exceptions |
Julian Aron Prenner et.al. |
2503.04241 |
link |
2025-03-06 |
DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models |
Ruizhe Chen et.al. |
2503.04240 |
null |
2025-03-06 |
SemaSK: Answering Semantics-aware Spatial Keyword Queries with Large Language Models |
Zesong Zhang et.al. |
2503.04234 |
null |
2025-03-06 |
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion |
Ziyi Yang et.al. |
2503.04222 |
link |
2025-03-06 |
Knowledge-Decoupled Synergetic Learning: An MLLM based Collaborative Approach to Few-shot Multimodal Dialogue Intention Recognition |
Bin Chen et.al. |
2503.04201 |
null |
2025-03-06 |
MASTER: Multimodal Segmentation with Text Prompts |
Fuyang Liu et.al. |
2503.04199 |
null |
2025-03-06 |
Measuring temporal effects of agent knowledge by date-controlled tool use |
R. Patrick Xian et.al. |
2503.04188 |
null |
2025-03-06 |
TIMER: Temporal Instruction Modeling and Evaluation for Longitudinal Clinical Records |
Hejie Cui et.al. |
2503.04176 |
null |
2025-03-06 |
DuCos: Duality Constrained Depth Super-Resolution via Foundation Model |
Zhiqiang Yan et.al. |
2503.04171 |
null |
2025-03-06 |
CoFinDiff: Controllable Financial Diffusion Model for Time Series Generation |
Yuki Tanaka et.al. |
2503.04164 |
null |
2025-03-06 |
VLA Model-Expert Collaboration for Bi-directional Manipulation Learning |
Tian-Yu Xiang et.al. |
2503.04163 |
null |
2025-03-06 |
Semantic Retrieval Augmented Contrastive Learning for Sequential Recommendation |
Ziqiang Cui et.al. |
2503.04162 |
null |
2025-03-06 |
KidneyTalk-open: No-code Deployment of a Private Large Language Model with Medical Documentation-Enhanced Knowledge Database for Kidney Disease |
Yongchao Long et.al. |
2503.04153 |
link |
2025-03-06 |
Ticktack : Long Span Temporal Alignment of Large Language Models Leveraging Sexagenary Cycle Time Expression |
Xue Han et.al. |
2503.04150 |
null |
2025-03-06 |
Dynamic Benchmarking of Reasoning Capabilities in Code Large Language Models Under Data Contamination |
Simin Chen et.al. |
2503.04149 |
null |
2025-03-06 |
Biological Sequence with Language Model Prompting: A Survey |
Jiyue Jiang et.al. |
2503.04135 |
null |
2025-03-06 |
Token-Efficient Long Video Understanding for Multimodal LLMs |
Jindong Jiang et.al. |
2503.04130 |
null |
2025-03-06 |
TimeFound: A Foundation Model for Time Series Forecasting |
Congxi Xiao et.al. |
2503.04118 |
null |
2025-03-06 |
InterChat: Enhancing Generative Visual Analytics using Multimodal Interactions |
Juntong Chen et.al. |
2503.04110 |
null |
2025-03-06 |
WeakMedSAM: Weakly-Supervised Medical Image Segmentation via SAM with Sub-Class Exploration and Prompt Affinity Mining |
Haoran Wang et.al. |
2503.04106 |
link |
2025-03-06 |
LLMs Can Generate a Better Answer by Aggregating Their Own Responses |
Zichong Li et.al. |
2503.04104 |
null |
2025-03-06 |
Disparities in LLM Reasoning Accuracy and Explanations: A Case Study on African American English |
Runtao Zhou et.al. |
2503.04099 |
null |
2025-03-07 |
Chart-HQA: A Benchmark for Hypothetical Question Answering in Charts |
Xiangnan Chen et.al. |
2503.04095 |
null |
2025-03-06 |
PokéChamp: an Expert-level Minimax Language Agent |
Seth Karten et.al. |
2503.04094 |
null |
2025-03-06 |
Beyond Memorization: Evaluating the True Type Inference Capabilities of LLMs for Java Code Snippets |
Yiwen Dong et.al. |
2503.04076 |
null |
2025-03-06 |
PP-DocBee: Improving Multimodal Document Understanding Through a Bag of Tricks |
Feng Ni et.al. |
2503.04065 |
link |
2025-03-06 |
Uncovering inequalities in new knowledge learning by large language models across different languages |
Chenglong Wang et.al. |
2503.04064 |
link |
2025-03-06 |
EVE: Towards End-to-End Video Subtitle Extraction with Vision-Language Models |
Haiyang Yu et.al. |
2503.04058 |
null |
2025-03-06 |
Insights from Rights and Wrongs: A Large Language Model for Solving Assertion Failures in RTL Design |
Jie Zhou et.al. |
2503.04057 |
link |
2025-03-06 |
GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding |
Xihan Wang et.al. |
2503.04034 |
null |
2025-03-06 |
Benchmarking Large Language Models on Multiple Tasks in Bioinformatics NLP with Prompting |
Jiyue Jiang et.al. |
2503.04013 |
null |
2025-03-06 |
DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation |
Amin Karimi et.al. |
2503.04006 |
null |
2025-03-06 |
Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows |
Xiangxin Zhou et.al. |
2503.03989 |
null |
2025-03-06 |
RetinalGPT: A Retinal Clinical Preference Conversational Assistant Powered by Large Vision-Language Models |
Wenhui Zhu et.al. |
2503.03987 |
null |
2025-03-06 |
ReasonGraph: Visualisation of Reasoning Paths |
Zongqian Li et.al. |
2503.03979 |
link |
2025-03-05 |
Towards Universal Learning-based Model for Cardiac Image Reconstruction: Summary of the CMRxRecon2024 Challenge |
Fanwen Wang et.al. |
2503.03971 |
link |
2025-03-05 |
Model Behavior Specification by Leveraging LLM Self-Playing and Self-Improving |
Soya Park et.al. |
2503.03967 |
null |
2025-03-05 |
The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems |
Richard Ren et.al. |
2503.03750 |
null |
2025-03-05 |
Process-based Self-Rewarding Language Models |
Shimao Zhang et.al. |
2503.03746 |
link |
2025-03-05 |
Towards Understanding Distilled Reasoning Models: A Representational Approach |
David D. Baek et.al. |
2503.03730 |
null |
2025-03-05 |
Improving LLM Safety Alignment with Dual-Objective Optimization |
Xuandong Zhao et.al. |
2503.03710 |
link |
2025-03-05 |
Rethinking Video Tokenization: A Conditioned Diffusion-based Approach |
Nianzu Yang et.al. |
2503.03708 |
link |
2025-03-05 |
Effective LLM Knowledge Learning via Model Generalization |
Mingkang Zhu et.al. |
2503.03705 |
null |
2025-03-05 |
A Practical Memory Injection Attack against LLM Agents |
Shen Dong et.al. |
2503.03704 |
null |
2025-03-05 |
Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models |
Jiyue Jiang et.al. |
2503.03702 |
null |
2025-03-05 |
Addressing Overprescribing Challenges: Fine-Tuning Large Language Models for Medication Recommendation Tasks |
Zihao Zhao et.al. |
2503.03687 |
link |
2025-03-05 |
Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models |
Bar Karov et.al. |
2503.03669 |
link |
2025-03-05 |
Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction |
Gustaw Opiełka et.al. |
2503.03666 |
link |
2025-03-05 |
A Generative Approach to High Fidelity 3D Reconstruction from Text Data |
Venkat Kumar R et.al. |
2503.03664 |
null |
2025-03-05 |
Improving Neutral Point of View Text Generation through Parameter-Efficient Reinforcement Learning and a Small-Scale High-Quality Dataset |
Jessica Hoffmann et.al. |
2503.03654 |
null |
2025-03-05 |
Token-Level Privacy in Large Language Models |
Re’em Harel et.al. |
2503.03652 |
null |
2025-03-05 |
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles |
Rui Zhao et.al. |
2503.03651 |
link |
2025-03-05 |
Psy-Copilot: Visual Chain of Thought for Counseling |
Keqi Chen et.al. |
2503.03645 |
null |
2025-03-05 |
Large language models in finance: estimating financial sentiment for stock prediction |
Kemal Kirtac et.al. |
2503.03612 |
null |
2025-03-05 |
Enhancing the Accuracy and Comprehensibility in Architectural Tactics Detection via Small Model-Augmented Prompt Engineering |
Lingli Cao et.al. |
2503.03609 |
link |
2025-03-05 |
Psy-Insight: Explainable Multi-turn Bilingual Dataset for Mental Health Counseling |
Keqi Chen et.al. |
2503.03607 |
null |
2025-03-05 |
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders |
Kristian Kuznetsov et.al. |
2503.03601 |
null |
2025-03-05 |
PowerAttention: Exponentially Scaling of Receptive Fields for Effective Sparse Attention |
Lida Chen et.al. |
2503.03588 |
null |
2025-03-05 |
“You don’t need a university degree to comprehend data protection this way”: LLM-Powered Interactive Privacy Policy Assessment |
Vincent Freiberger et.al. |
2503.03587 |
null |
2025-03-05 |
Benchmarking LLMs and LLM-based Agents in Practical Vulnerability Detection for Code Repositories |
Alperen Yildiz et.al. |
2503.03586 |
null |
2025-03-05 |
Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection |
Wenqiao Li et.al. |
2503.03562 |
null |
2025-03-05 |
Afford-X: Generalizable and Slim Affordance Reasoning for Task-oriented Manipulation |
Xiaomeng Zhu et.al. |
2503.03556 |
null |
2025-03-05 |
Parallelized Planning-Acting for Efficient LLM-based Multi-Agent Systems |
Yaoru Li et.al. |
2503.03505 |
link |
2025-03-05 |
Collaborative Expert LLMs Guided Multi-Objective Molecular Optimization |
Jiajun Yu et.al. |
2503.03503 |
link |
2025-03-05 |
CURVALID: Geometrically-guided Adversarial Prompt Detection |
Canaan Yung et.al. |
2503.03502 |
link |
2025-03-05 |
TEDDY: A Family Of Foundation Models For Understanding Single Cell Biology |
Alexis Chevalier et.al. |
2503.03485 |
null |
2025-03-05 |
Generative Artificial Intelligence in Robotic Manipulation: A Survey |
Kun Zhang et.al. |
2503.03464 |
null |
2025-03-05 |
Open-Source Large Language Models as Multilingual Crowdworkers: Synthesizing Open-Domain Dialogues in Several Languages With No Examples in Targets and No Machine Translation |
Ahmed Njifenjou et.al. |
2503.03462 |
null |
2025-03-05 |
Visualising Policy-Reward Interplay to Inform Zeroth-Order Preference Optimisation of Large Language Models |
Alessio Galatolo et.al. |
2503.03460 |
link |
2025-03-05 |
Unified Mind Model: Reimagining Autonomous Agents in the LLM Era |
Pengbo Hu et.al. |
2503.03459 |
null |
2025-03-05 |
Taxation Perspectives from Large Language Models: A Case Study on Additional Tax Penalties |
Eunkyung Choi et.al. |
2503.03444 |
null |
2025-03-05 |
RASD: Retrieval-Augmented Speculative Decoding |
Guofeng Quan et.al. |
2503.03434 |
null |
2025-03-05 |
Video Super-Resolution: All You Need is a Video Diffusion Model |
Zhihao Zhan et.al. |
2503.03355 |
null |
2025-03-05 |
Leveraging Large Language Models to Develop Heuristics for Emerging Optimization Problems |
Thomas Bömer et.al. |
2503.03350 |
null |
2025-03-05 |
EnigmaToM: Improve LLMs’ Theory-of-Mind Reasoning Capabilities with Neural Knowledge Base of Entity States |
Hainiu Xu et.al. |
2503.03340 |
link |
2025-03-05 |
LLM as GNN: Graph Vocabulary Learning for Text-Attributed Graph Foundation Models |
Xi Zhu et.al. |
2503.03313 |
link |
2025-03-05 |
SEOE: A Scalable and Reliable Semantic Evaluation Framework for Open Domain Event Detection |
Yi-Fan Lu et.al. |
2503.03303 |
null |
2025-03-05 |
Label-Efficient LiDAR Semantic Segmentation with 2D-3D Vision Transformer Adapters |
Julia Hindel et.al. |
2503.03299 |
null |
2025-03-05 |
A 262 TOPS Hyperdimensional Photonic AI Accelerator powered by a Si3N4 microcomb laser |
Christos Pappas et.al. |
2503.03263 |
null |
2025-03-05 |
Can Frontier LLMs Replace Annotators in Biomedical Text Mining? Analyzing Challenges and Exploring Solutions |
Yichong Zhao et.al. |
2503.03261 |
link |
2025-03-05 |
Exploring the Potential of Large Language Models as Predictors in Dynamic Text-Attributed Graphs |
Runlin Lei et.al. |
2503.03258 |
null |
2025-03-05 |
PAIR: A Novel Large Language Model-Guided Selection Strategy for Evolutionary Algorithms |
Shady Ali et.al. |
2503.03239 |
link |
2025-03-05 |
FANS – Formal Answer Selection for Natural Language Math Reasoning Using Lean4 |
Jiarui Yao et.al. |
2503.03238 |
null |
2025-03-05 |
Targeted Distillation for Sentiment Analysis |
Yice Zhang et.al. |
2503.03225 |
null |
2025-03-05 |
Mocap-2-to-3: Lifting 2D Diffusion-Based Pretrained Models for 3D Motion Capture |
Zhumei Wang et.al. |
2503.03222 |
null |
2025-03-05 |
COSINT-Agent: A Knowledge-Driven Multimodal Agent for Chinese Open Source Intelligence |
Wentao Li et.al. |
2503.03215 |
null |
2025-03-05 |
PolyVer: A Compositional Approach for Polyglot System Modeling and Verification |
Pei-Wei Chen et.al. |
2503.03207 |
null |
2025-03-05 |
An Analytical Theory of Power Law Spectral Bias in the Learning Dynamics of Diffusion Models |
Binxu Wang et.al. |
2503.03206 |
null |
2025-03-05 |
MA-LoT: Multi-Agent Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving |
Ruida Wang et.al. |
2503.03205 |
link |
2025-03-05 |
Find Matching Faces Based On Face Parameters |
Setu A. Bhatt et.al. |
2503.03204 |
null |
2025-03-05 |
Towards Robust Universal Information Extraction: Benchmark, Evaluation, and Solution |
Jizhao Zhu et.al. |
2503.03201 |
null |
2025-03-05 |
Structured Outputs Enable General-Purpose LLMs to be Medical Experts |
Guangfu Guo et.al. |
2503.03194 |
null |
2025-03-05 |
Enhancing Memory Efficiency in Large Language Model Training Through Chronos-aware Pipeline Parallelism |
Xinyuan Lin et.al. |
2503.03182 |
null |
2025-03-05 |
Enhancing Cybersecurity in Critical Infrastructure with LLM-Assisted Explainable IoT Systems |
Ashutosh Ghimire et.al. |
2503.03180 |
null |
2025-03-05 |
AttackSeqBench: Benchmarking Large Language Models’ Understanding of Sequential Patterns in Cyber Attacks |
Javier Yong et.al. |
2503.03170 |
link |
2025-03-05 |
Dango: A Mixed-Initiative Data Wrangling System using Large Language Model |
Wei-Hao Chen et.al. |
2503.03154 |
null |
2025-03-05 |
Position: Model Collapse Does Not Mean What You Think |
Rylan Schaeffer et.al. |
2503.03150 |
null |
2025-03-05 |
DSVD: Dynamic Self-Verify Decoding for Faithful Generation in Large Language Models |
YiQiu Guo et.al. |
2503.03149 |
null |
2025-03-05 |
PriFFT: Privacy-preserving Federated Fine-tuning of Large Language Models via Function Secret Sharing |
Zhichao You et.al. |
2503.03146 |
null |
2025-03-05 |
A Survey of Foundation Models for Environmental Science |
Runlong Yu et.al. |
2503.03142 |
null |
2025-03-05 |
StarFlow: Leveraging Normalizing Flows for Stellar Age Estimation in SDSS-V DR19 |
Alexander Stone-Martinez et.al. |
2503.03138 |
null |
2025-03-05 |
Bridging Molecular Graphs and Large Language Models |
Runze Wang et.al. |
2503.03135 |
link |
2025-03-05 |
Towards Understanding Multi-Round Large Language Model Reasoning: Approximability, Learnability and Generalizability |
Chenhui Xu et.al. |
2503.03128 |
null |
2025-03-05 |
The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models |
Zichao Li et.al. |
2503.03122 |
link |
2025-03-05 |
PromAssistant: Leveraging Large Language Models for Text-to-PromQL |
Chenxi Zhang et.al. |
2503.03114 |
null |
2025-03-05 |
SoK: Knowledge is All You Need: Last Mile Delivery for Automated Provenance-based Intrusion Detection with LLMs |
Wenrui Cheng et.al. |
2503.03108 |
null |
2025-03-05 |
Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation |
Yurui Chang et.al. |
2503.03106 |
null |
2025-03-05 |
BEVDriver: Leveraging BEV Maps in LLMs for Robust Closed-Loop Driving |
Katharina Winter et.al. |
2503.03074 |
link |
2025-03-04 |
Unification of Stochastic and Quantum Thermodynamics in Scalar Field Theory via a Model with Brownian Thermostat |
T. Koide et.al. |
2503.03059 |
null |
2025-03-04 |
SAGE: Steering and Refining Dialog Generation with State-Action Augmentation |
Yizhe Zhang et.al. |
2503.03040 |
link |
2025-03-04 |
SAFE: A Sparse Autoencoder-Based Framework for Robust Query Enrichment and Hallucination Mitigation in LLMs |
Samir Abdaljalil et.al. |
2503.03032 |
null |
2025-03-04 |
Generative Active Adaptation for Drifting and Imbalanced Network Intrusion Detection |
Ragini Gupta et.al. |
2503.03022 |
null |
2025-03-04 |
Can Diffusion Models Provide Rigorous Uncertainty Quantification for Bayesian Inverse Problems? |
Evan Scope Crafts et.al. |
2503.03007 |
link |
2025-03-04 |
Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment |
Matthew DosSantos DiSorbo et.al. |
2503.02976 |
null |
2025-03-04 |
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation |
Jude Khouja et.al. |
2503.02972 |
null |
2025-03-04 |
Multilingual Relative Clause Attachment Ambiguity Resolution in Large Language Models |
So Young Lee et.al. |
2503.02971 |
link |
2025-03-04 |
InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model |
Siqi Ouyang et.al. |
2503.02969 |
link |
2025-03-04 |
Privacy-Preserving Fair Synthetic Tabular Data |
Fatima J. Sarmin et.al. |
2503.02968 |
null |
2025-03-04 |
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding |
Zhangchen Xu et.al. |
2503.02951 |
link |
2025-03-04 |
Train on classical, deploy on quantum: scaling generative quantum machine learning to a thousand qubits |
Erik Recio-Armengol et.al. |
2503.02934 |
link |
2025-03-04 |
Optimizing open-domain question answering with graph-based retrieval augmented generation |
Joyce Cahoon et.al. |
2503.02922 |
null |
2025-03-04 |
ARINAR: Bi-Level Autoregressive Feature-by-Feature Generative Models |
Qinyu Zhao et.al. |
2503.02883 |
link |
2025-03-04 |
Wikipedia in the Era of LLMs: Evolution and Risks |
Siming Huang et.al. |
2503.02879 |
link |
2025-03-04 |
SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models |
Dmitry Nechaev et.al. |
2503.02876 |
link |
2025-03-04 |
The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models |
Ke Ji et.al. |
2503.02875 |
null |
2025-03-04 |
Prompting Generative AI with Interaction-Augmented Instructions |
Leixian Shen et.al. |
2503.02874 |
null |
2025-03-05 |
FairSense-AI: Responsible AI Meets Sustainability |
Shaina Raza et.al. |
2503.02865 |
null |
2025-03-04 |
Calibrating LLM Confidence with Semantic Steering: A Multi-Prompt Aggregation Framework |
Ziang Zhou et.al. |
2503.02863 |
null |
2025-03-04 |
Privacy and Accuracy-Aware AI/ML Model Deduplication |
Hong Guan et.al. |
2503.02862 |
null |
2025-03-04 |
Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs’ Decoding Layers |
Zicong He et.al. |
2503.02851 |
link |
2025-03-04 |
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs |
Yuzhe Gu et.al. |
2503.02846 |
link |
2025-03-04 |
SeqFusion: Sequential Fusion of Pre-Trained Models for Zero-Shot Time-Series Forecasting |
Ting-Ji Huang et.al. |
2503.02836 |
link |
2025-03-04 |
AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation |
Songming Zhang et.al. |
2503.02832 |
null |
2025-03-04 |
Developing a PET/CT Foundation Model for Cross-Modal Anatomical and Functional Imaging |
Yujin Oh et.al. |
2503.02824 |
null |
2025-03-04 |
A Multimodal Symphony: Integrating Taste and Sound through Generative AI |
Matteo Spanio et.al. |
2503.02823 |
link |
2025-03-04 |
Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts |
Marta Skreta et.al. |
2503.02819 |
link |
2025-03-04 |
RAAD-LLM: Adaptive Anomaly Detection Using LLMs and RAG Integration |
Alicia Russell-Gilbert et.al. |
2503.02800 |
null |
2025-03-04 |
Multimodal AI predicts clinical outcomes of drug combinations from preclinical data |
Yepeng Huang et.al. |
2503.02781 |
link |
2025-03-04 |
Implicit Bias in LLMs: A Survey |
Xinru Lin et.al. |
2503.02776 |
null |
2025-03-04 |
InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training |
Dingdong Wang et.al. |
2503.02769 |
null |
2025-03-04 |
BatchGEMBA: Token-Efficient Machine Translation Evaluation with Batched Prompting and Prompt Compression |
Daniil Larionov et.al. |
2503.02756 |
null |
2025-03-04 |
Large Language Models for Multilingual Previously Fact-Checked Claim Detection |
Ivan Vykopal et.al. |
2503.02737 |
link |
2025-03-04 |
RedChronos: A Large Language Model-Based Log Analysis System for Insider Threat Detection in Enterprises |
Chenyu Li et.al. |
2503.02702 |
null |
2025-03-04 |
MindBridge: Scalable and Cross-Model Knowledge Editing via Memory-Augmented Modality |
Shuaike Li et.al. |
2503.02701 |
link |
2025-03-04 |
Zero-Shot Complex Question-Answering on Long Scientific Documents |
Wanting Wang et.al. |
2503.02695 |
link |
2025-03-04 |
FinArena: A Human-Agent Collaboration Framework for Financial Market Analysis and Forecasting |
Congluo Xu et.al. |
2503.02692 |
null |
2025-03-04 |
Generative Modeling of Microweather Wind Velocities for Urban Air Mobility |
Tristan A. Shah et.al. |
2503.02690 |
link |
2025-03-04 |
MPO: Boosting LLM Agents with Meta Plan Optimization |
Weimin Xiong et.al. |
2503.02682 |
link |
2025-03-04 |
Multidimensional Consistency Improves Reasoning in Language Models |
Huiyuan Lai et.al. |
2503.02670 |
null |
2025-03-04 |
LoRA-Null: Low-Rank Adaptation via Null Space for Large Language Models |
Pengwei Tang et.al. |
2503.02659 |
null |
2025-03-04 |
The Effectiveness of Large Language Models in Transforming Unstructured Text to Standardized Formats |
William Brach et.al. |
2503.02650 |
link |
2025-03-04 |
YARE-GAN: Yet Another Resting State EEG-GAN |
Yeganeh Farahzadi et.al. |
2503.02636 |
link |
2025-03-04 |
Reflection on Data Storytelling Tools in the Generative AI Era from the Human-AI Collaboration Perspective |
Haotian Li et.al. |
2503.02631 |
null |
2025-03-04 |
Towards Event Extraction with Massive Types: LLM-based Collaborative Annotation and Partitioning Extraction |
Wenxuan Liu et.al. |
2503.02628 |
null |
2025-03-04 |
Rewarding Doubt: A Reinforcement Learning Approach to Confidence Calibration of Large Language Models |
Paul Stangel et.al. |
2503.02623 |
null |
2025-03-04 |
OkraLong: A Flexible Retrieval-Augmented Framework for Long-Text Query Processing |
Yulong Hui et.al. |
2503.02603 |
null |
2025-03-04 |
Seeing is Understanding: Unlocking Causal Attention into Modality-Mutual Attention for Multimodal LLMs |
Wei-Yao Wang et.al. |
2503.02597 |
link |
2025-03-04 |
StageDesigner: Artistic Stage Generation for Scenography via Theater Scripts |
Zhaoxing Gan et.al. |
2503.02595 |
null |
2025-03-04 |
MciteBench: A Benchmark for Multimodal Citation Text Generation in MLLMs |
Caiyu Hu et.al. |
2503.02589 |
link |
2025-03-04 |
Playing games with Large language models: Randomness and strategy |
Alicia Vidler et.al. |
2503.02582 |
null |
2025-03-04 |
LLM-Safety Evaluations Lack Robustness |
Tim Beyer et.al. |
2503.02574 |
null |
2025-03-04 |
SpecInF: Exploiting Idle GPU Resources in Distributed DL Training via Speculative Inference Filling |
Cunchi Lv et.al. |
2503.02550 |
null |
2025-03-04 |
PVTree: Realistic and Controllable Palm Vein Generation for Recognition Tasks |
Sheng Shang et.al. |
2503.02547 |
null |
2025-03-04 |
SAGE-Amine: Generative Amine Design with Multi-Property Optimization for Efficient CO2 Capture |
Hocheol Lim et.al. |
2503.02534 |
link |
2025-03-04 |
Use Me Wisely: AI-Driven Assessment for LLM Prompting Skills Development |
Dimitri Ognibene et.al. |
2503.02532 |
null |
2025-03-04 |
Generator-Assistant Stepwise Rollback Framework for Large Language Model Agent |
Xingzuo Li et.al. |
2503.02519 |
link |
2025-03-04 |
Deepfake Detection via Knowledge Injection |
Tonghui Li et.al. |
2503.02503 |
null |
2025-03-04 |
LADM: Long-context Training Data Selection with Attention-based Dependency Measurement for LLMs |
Jianghao Chen et.al. |
2503.02502 |
null |
2025-03-04 |
PennyLang: Pioneering LLM-Based Quantum Code Generation with a Novel PennyLane-Centric Dataset |
Haider Asif et.al. |
2503.02497 |
null |
2025-03-04 |
BioD2C: A Dual-level Semantic Consistency Constraint Framework for Biomedical VQA |
Zhengyang Ji et.al. |
2503.02476 |
link |
2025-03-04 |
It Helps to Take a Second Opinion: Teaching Smaller LLMs to Deliberate Mutually via Selective Rationale Optimisation |
Sohan Patnaik et.al. |
2503.02463 |
null |
2025-03-04 |
Don’t Get Too Excited – Eliciting Emotions in LLMs |
Gino Franco Fazzi et.al. |
2503.02457 |
null |
2025-03-04 |
Sparse Meets Dense: Unified Generative Recommendations with Cascaded Sparse-Dense Representations |
Yuhao Yang et.al. |
2503.02453 |
null |
2025-03-04 |
Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization |
Yilun Qiu et.al. |
2503.02450 |
link |
2025-03-04 |
AILS-NTUA at SemEval-2025 Task 4: Parameter-Efficient Unlearning for Large Language Models using Data Chunking |
Iraklis Premptis et.al. |
2503.02443 |
null |
2025-03-04 |
AILS-NTUA at SemEval-2025 Task 3: Leveraging Large Language Models and Translation Strategies for Multilingual Hallucination Detection |
Dimitra Karkani et.al. |
2503.02442 |
null |
2025-03-04 |
Artificial Intelligence in Reactor Physics: Current Status and Future Prospects |
Ruizhi Zhang et.al. |
2503.02440 |
null |
2025-03-04 |
Beyond the Leland strategies |
Emmanuel Lepinette et.al. |
2503.02419 |
null |
2025-03-04 |
Building 3D In-Context Learning Universal Model in Neuroimaging |
Jiesi Hu et.al. |
2503.02410 |
link |
2025-03-04 |
Wyckoff Transformer: Generation of Symmetric Crystals |
Nikita Kazeev et.al. |
2503.02407 |
link |
2025-03-04 |
Hierarchical Re-ranker Retriever (HRR) |
Ashish Singh et.al. |
2503.02401 |
null |
2025-03-04 |
Promptware Engineering: Software Engineering for LLM Prompt Development |
Zhenpeng Chen et.al. |
2503.02400 |
null |
2025-03-04 |
PersonaX: A Recommendation Agent Oriented User Modeling Framework for Long Behavior Sequence |
Yunxiao Shi et.al. |
2503.02398 |
link |
2025-03-04 |
ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks |
Heng Zhou et.al. |
2503.02390 |
link |
2025-03-04 |
RGBSQGrasp: Inferring Local Superquadric Primitives from Single RGB Image for Graspability-Aware Bin Picking |
Yifeng Xu et.al. |
2503.02387 |
null |
2025-03-04 |
An Efficient and Precise Training Data Construction Framework for Process-supervised Reward Model in Mathematical Reasoning |
Wei Sun et.al. |
2503.02382 |
link |
2025-03-04 |
Teaching Metric Distance to Autoregressive Multimodal Foundational Models |
Jiwan Chung et.al. |
2503.02379 |
null |
2025-03-04 |
MedEthicEval: Evaluating Large Language Models Based on Chinese Medical Ethics |
Haoan Jin et.al. |
2503.02374 |
null |
2025-03-04 |
EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports |
Lama Moukheiber et.al. |
2503.02365 |
null |
2025-03-04 |
Add-One-In: Incremental Sample Selection for Large Language Models via a Choice-Based Greedy Paradigm |
Zhuo Li et.al. |
2503.02359 |
null |
2025-03-04 |
Efficient Long Context Fine-tuning with Chunk Flow |
Xiulong Yuan et.al. |
2503.02356 |
null |
2025-03-04 |
CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory |
Jiashun Suo et.al. |
2503.02354 |
null |
2025-03-04 |
DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability |
Yunzhen He et.al. |
2503.02343 |
link |
2025-03-04 |
GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning |
Zhun Mou et.al. |
2503.02341 |
null |
2025-03-04 |
Limited Effectiveness of LLM-based Data Augmentation for COVID-19 Misinformation Stance Detection |
Eun Cheol Choi et.al. |
2503.02328 |
null |
2025-03-04 |
PromptCoT: Synthesizing Olympiad-level Problems for Mathematical Reasoning in Large Language Models |
Xueliang Zhao et.al. |
2503.02324 |
link |
2025-03-04 |
Generative Model-Assisted Demosaicing for Cross-multispectral Cameras |
Jiahui Luo et.al. |
2503.02322 |
null |
2025-03-04 |
Semantic Prior Distillation with Vision Foundation Model for Enhanced Rapid Bone Scintigraphy Image Restoration |
Pengchen Liang et.al. |
2503.02321 |
null |
2025-03-04 |
A Token-level Text Image Foundation Model for Document Understanding |
Tongkun Guan et.al. |
2503.02304 |
null |
2025-03-04 |
Towards Large Language Model Guided Kernel Direct Fuzzing |
Xie Li et.al. |
2503.02301 |
null |
2025-03-04 |
Towards Explainable Doctor Recommendation with Large Language Models |
Ziyang Zeng et.al. |
2503.02298 |
null |
2025-03-04 |
Memorize or Generalize? Evaluating LLM Code Generation with Evolved Questions |
Wentao Chen et.al. |
2503.02296 |
null |
2025-03-04 |
spike: A tool to drizzle HST, JWST, and Roman PSFs for improved analyses |
Ava Polzin et.al. |
2503.02288 |
link |
2025-03-04 |
AppAgentX: Evolving GUI Agents as Proficient Smartphone Users |
Wenjia Jiang et.al. |
2503.02268 |
null |
2025-03-04 |
Large Language Models as Natural Selector for Embodied Soft Robot Design |
Changhe Chen et.al. |
2503.02249 |
null |
2025-03-04 |
Making Better Mistakes in CLIP-Based Zero-Shot Classification with Hierarchy-Aware Language Prompts |
Tong Liang et.al. |
2503.02248 |
null |
2025-03-04 |
From Code to Courtroom: LLMs as the New Software Judges |
Junda He et.al. |
2503.02246 |
null |
2025-03-04 |
OmniSQL: Synthesizing High-quality Text-to-SQL Data at Scale |
Haoyang Li et.al. |
2503.02240 |
link |
2025-03-04 |
V2X-LLM: Enhancing V2X Integration and Understanding in Connected Vehicle Corridors |
Keshu Wu et.al. |
2503.02239 |
null |
2025-03-04 |
Haste Makes Waste: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between Actions |
Zirui Wu et.al. |
2503.02238 |
link |
2025-03-04 |
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling |
Hang Zheng et.al. |
2503.02233 |
null |
2025-03-04 |
ATLaS: Agent Tuning via Learning Critical Steps |
Zhixun Chen et.al. |
2503.02197 |
null |
2025-03-04 |
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models |
Saeed Ranjbar Alvar et.al. |
2503.02175 |
link |
2025-03-04 |
Leveraging Large Language Models for Enhanced Digital Twin Modeling: Trends, Methods, and Challenges |
Linyao Yang et.al. |
2503.02167 |
null |
2025-03-04 |
X2CT-CLIP: Enable Multi-Abnormality Detection in Computed Tomography from Chest Radiography via Tri-Modal Contrastive Learning |
Jianzhong You et.al. |
2503.02162 |
null |
2025-03-04 |
LLM-TabFlow: Synthetic Tabular Data Generation with Inter-column Logical Relationship Preservation |
Yunbo Long et.al. |
2503.02161 |
null |
2025-03-04 |
Tabby: Tabular Data Synthesis with Language Models |
Sonia Cromp et.al. |
2503.02152 |
null |
2025-03-04 |
Malware Classification from Memory Dumps Using Machine Learning, Transformers, and Large Language Models |
Areej Dweib et.al. |
2503.02144 |
null |
2025-03-04 |
Measuring Intrinsic Dimension of Token Embeddings |
Takuya Kataiwa et.al. |
2503.02142 |
null |
2025-03-04 |
Network Traffic Classification Using Machine Learning, Transformer, and Large Language Models |
Ahmad Antari et.al. |
2503.02141 |
null |
2025-03-03 |
TMIQ: Quantifying Test and Measurement Domain Intelligence in Large Language Models |
Emmanuel A. Olowe et.al. |
2503.02123 |
null |
2025-02-28 |
LLM Post-Training: A Deep Dive into Reasoning Large Language Models |
Komal Kumar et.al. |
2502.21321 |
link |
2025-02-28 |
How far can we go with ImageNet for Text-to-Image generation? |
L. Degeorge et.al. |
2502.21318 |
null |
2025-02-28 |
FANformer: Improving Large Language Models Through Effective Periodicity Modeling |
Yihong Dong et.al. |
2502.21309 |
link |
2025-02-28 |
Contextualizing biological perturbation experiments through language |
Menghua Wu et.al. |
2502.21290 |
link |
2025-02-28 |
Does Generation Require Memorization? Creative Diffusion Models using Ambient Diffusion |
Kulin Shah et.al. |
2502.21278 |
null |
2025-02-28 |
Adaptive Keyframe Sampling for Long Video Understanding |
Xi Tang et.al. |
2502.21271 |
null |
2025-03-03 |
Foundation Models – A Panacea for Artificial Intelligence in Pathology? |
Nita Mulliqi et.al. |
2502.21264 |
null |
2025-02-28 |
Modeling Human Beliefs about AI Behavior for Scalable Oversight |
Leon Lang et.al. |
2502.21262 |
null |
2025-02-28 |
RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete |
Yuheng Ji et.al. |
2502.21257 |
null |
2025-02-28 |
TimesBERT: A BERT-Style Foundation Model for Time Series Understanding |
Haoran Zhang et.al. |
2502.21245 |
null |
2025-03-04 |
Semantic Volume: Quantifying and Detecting both External and Internal Uncertainty in LLMs |
Xiaomin Li et.al. |
2502.21239 |
null |
2025-02-28 |
Transforming Tuberculosis Care: Optimizing Large Language Models For Enhanced Clinician-Patient Communication |
Daniil Filienko et.al. |
2502.21236 |
null |
2025-02-28 |
ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs |
Hao Ge et.al. |
2502.21231 |
null |
2025-03-03 |
ECLeKTic: a Novel Challenge Set for Evaluation of Cross-Lingual Knowledge Transfer |
Omer Goldman et.al. |
2502.21228 |
null |
2025-02-28 |
Dynamic Markov Blanket Detection for Macroscopic Physics Discovery |
Jeff Beck et.al. |
2502.21217 |
link |
2025-02-28 |
Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought |
Jianhao Huang et.al. |
2502.21212 |
null |
2025-02-28 |
Chronologically Consistent Large Language Models |
Songrun He et.al. |
2502.21206 |
null |
2025-03-04 |
SYN-LUNGS: Towards Simulating Lung Nodules with Anatomy-Informed Digital Twins for AI Training |
Fakrul Islam Tushar et.al. |
2502.21187 |
null |
2025-02-28 |
$Δ$-model correction of Foundation Model based on the models own understanding |
Mads-Peter Verner Christiansen et.al. |
2502.21179 |
null |
2025-03-03 |
Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models |
Ruta Binkyte et.al. |
2502.21123 |
null |
2025-02-28 |
Optimizing Large Language Models for ESG Activity Detection in Financial Texts |
Mattia Birti et.al. |
2502.21112 |
link |
2025-02-28 |
Rare event modeling with self-regularized normalizing flows: what can we learn from a single failure? |
Charles Dawson et.al. |
2502.21110 |
null |
2025-02-28 |
Large Language Model-Based Benchmarking Experiment Settings for Evolutionary Multi-Objective Optimization |
Lie Meng Pang et.al. |
2502.21108 |
null |
2025-02-28 |
Generating patient cohorts from electronic health records using two-step retrieval-augmented text-to-SQL generation |
Angelo Ziletti et.al. |
2502.21107 |
null |
2025-02-28 |
A Non-contrast Head CT Foundation Model for Comprehensive Neuro-Trauma Triage |
Youngjin Yoo et.al. |
2502.21106 |
null |
2025-02-28 |
Re-evaluating Theory of Mind evaluation in large language models |
Jennifer Hu et.al. |
2502.21098 |
null |
2025-02-28 |
An LLM-based Delphi Study to Predict GenAI Evolution |
Francesco Bertolotti et.al. |
2502.21092 |
null |
2025-02-28 |
PASemiQA: Plan-Assisted Agent for Question Answering on Semi-Structured Data with Text and Relational Information |
Hansi Yang et.al. |
2502.21087 |
null |
2025-02-28 |
Are foundation models useful feature extractors for electroencephalography analysis? |
Özgün Turgut et.al. |
2502.21086 |
null |
2025-02-28 |
Spatial Reasoning with Denoising Models |
Christopher Wewer et.al. |
2502.21075 |
null |
2025-02-28 |
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation |
Zhenyi Shen et.al. |
2502.21074 |
null |
2025-02-28 |
GUIDE: LLM-Driven GUI Generation Decomposition for Automated Prototyping |
Kristian Kolthoff et.al. |
2502.21068 |
null |
2025-02-28 |
Synthesizing Individualized Aging Brains in Health and Disease with Generative Models and Parallel Transport |
Jingru Fu et.al. |
2502.21049 |
link |
2025-02-28 |
Incorporating Long-Range Interactions via the Multipole Expansion into Ground and Excited-State Molecular Simulations |
Rhyan Barrett et.al. |
2502.21045 |
null |
2025-02-28 |
The amplifier effect of artificial agents in social contagion |
Eric Hitz et.al. |
2502.21037 |
null |
2025-02-28 |
Beyond Words: A Latent Memory Approach to Internal Reasoning in LLMs |
José I. Orlicki et.al. |
2502.21030 |
null |
2025-02-28 |
Measuring and identifying factors of individuals’ trust in Large Language Models |
Edoardo Sebastiano De Duro et.al. |
2502.21028 |
null |
2025-02-28 |
PersuasiveToM: A Benchmark for Evaluating Machine Theory of Mind in Persuasive Dialogues |
Fangxu Yu et.al. |
2502.21017 |
null |
2025-02-28 |
Explainable Biomedical Claim Verification with Large Language Models |
Siting Liang et.al. |
2502.21014 |
null |
2025-02-28 |
Merging Clinical Knowledge into Large Language Models for Medical Research and Applications: A Survey |
Qiyuan Li et.al. |
2502.20988 |
null |
2025-02-28 |
UoR-NCL at SemEval-2025 Task 1: Using Generative LLMs and CLIP Models for Multilingual Multimodal Idiomaticity Representation |
Thanet Markchom et.al. |
2502.20984 |
null |
2025-02-28 |
Set-Theoretic Compositionality of Sentence Embeddings |
Naman Bansal et.al. |
2502.20975 |
null |
2025-02-28 |
TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval |
Chien-Yu Lin et.al. |
2502.20969 |
null |
2025-02-28 |
Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs |
Weixiang Zhao et.al. |
2502.20968 |
null |
2025-02-28 |
Fine-Grained Retrieval-Augmented Generation for Visual Question Answering |
Zhengxuan Zhang et.al. |
2502.20964 |
null |
2025-02-28 |
Efficient Jailbreaking of Large Models by Freeze Training: Lower Layers Exhibit Greater Sensitivity to Harmful Content |
Hongyuan Shen et.al. |
2502.20952 |
null |
2025-02-28 |
Generative Uncertainty in Diffusion Models |
Metod Jazbec et.al. |
2502.20946 |
null |
2025-02-28 |
A Deep User Interface for Exploring LLaMa |
Divya Perumal et.al. |
2502.20938 |
null |
2025-02-28 |
Large Language Models Are Innate Crystal Structure Generators |
Jingru Gan et.al. |
2502.20933 |
null |
2025-02-28 |
Automated Evaluation of Meter and Rhyme in Russian Generative and Human-Authored Poetry |
Ilya Koziev et.al. |
2502.20931 |
null |
2025-02-28 |
A database to support the evaluation of gender biases in GPT-4o output |
Luise Mehner et.al. |
2502.20898 |
null |
2025-02-28 |
Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals’ Subjective Text Perceptions |
Matthias Orlikowski et.al. |
2502.20897 |
null |
2025-02-28 |
PathVG: A New Benchmark and Dataset for Pathology Visual Grounding |
Chunlin Zhong et.al. |
2502.20869 |
null |
2025-02-28 |
ProBench: Benchmarking Large Language Models in Competitive Programming |
Lei Yang et.al. |
2502.20868 |
null |
2025-02-28 |
The Power of Personality: A Human Simulation Perspective to Investigate Large Language Model Agents |
Yifan Duan et.al. |
2502.20859 |
null |
2025-02-28 |
Learning to Substitute Components for Compositional Generalization |
Zhaoyi Li et.al. |
2502.20834 |
null |
2025-02-28 |
CoTMR: Chain-of-Thought Multi-Scale Reasoning for Training-Free Zero-Shot Composed Image Retrieval |
Zelong Sun et.al. |
2502.20826 |
null |
2025-02-28 |
Can We Simplify Slide-level Fine-tuning of Pathology Foundation Models? |
Jiawen Li et.al. |
2502.20823 |
null |
2025-02-28 |
Towards Reliable Vector Database Management Systems: A Software Testing Roadmap for 2030 |
Shenao Wang et.al. |
2502.20812 |
null |
2025-02-28 |
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models |
Xiao Wang et.al. |
2502.20811 |
null |
2025-02-28 |
PFD: Automatically Generating Machine Learning Force Fields from Universal Models |
Ruoyu Wang et.al. |
2502.20809 |
link |
2025-03-03 |
MV-MATH: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts |
Peijie Wang et.al. |
2502.20808 |
null |
2025-02-28 |
Digital Player: Evaluating Large Language Models based Human-like Agent in Games |
Jiawei Wang et.al. |
2502.20807 |
link |
2025-02-28 |
Plan2Align: Predictive Planning Based Test-Time Preference Alignment in Paragraph-Level Machine Translation |
Kuang-Da Wang et.al. |
2502.20795 |
null |
2025-02-28 |
Cyber Defense Reinvented: Large Language Models as Threat Intelligence Copilots |
Xiaoqun Liu et.al. |
2502.20791 |
null |
2025-02-28 |
Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision |
Dawei Zhu et.al. |
2502.20790 |
null |
2025-02-28 |
Triple Phase Transitions: Understanding the Learning Dynamics of Large Language Models from a Neuroscience Perspective |
Yuko Nakagi et.al. |
2502.20779 |
null |
2025-02-28 |
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference |
Xunhao Lai et.al. |
2502.20766 |
link |
2025-02-28 |
Collective Reasoning Among LLMs: A Framework for Answer Validation Without Ground Truth |
Seyed Pouyan Mousavi Davoudi et.al. |
2502.20758 |
null |
2025-02-28 |
The Rise of Darkness: Safety-Utility Trade-Offs in Role-Playing Dialogue Agents |
Yihong Tang et.al. |
2502.20757 |
null |
2025-02-28 |
SemiSAM+: Rethinking Semi-Supervised Medical Image Segmentation in the Era of Foundation Models |
Yichi Zhang et.al. |
2502.20749 |
link |
2025-02-28 |
Teach-to-Reason with Scoring: Self-Explainable Rationale-Driven Multi-Trait Essay Scoring |
Heejin Do et.al. |
2502.20748 |
null |
2025-02-28 |
Measuring Determinism in Large Language Models for Software Code Review |
Eugene Klishevich et.al. |
2502.20747 |
null |
2025-02-28 |
CADDreamer: CAD object Generation from Single-view Images |
Yuan Li et.al. |
2502.20732 |
null |
2025-02-28 |
SPD: Sync-Point Drop for efficient tensor parallelism of Large Language Models |
Han-Byul Kim et.al. |
2502.20727 |
null |
2025-02-28 |
Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition |
Yifei Duan et.al. |
2502.20726 |
link |
2025-02-28 |
Generating Clinically Realistic EHR Data via a Hierarchy- and Semantics-Guided Transformer |
Guanglin Zhou et.al. |
2502.20719 |
null |
2025-02-28 |
Why Trust in AI May Be Inevitable |
Nghi Truong et.al. |
2502.20701 |
null |
2025-02-28 |
Towards General Visual-Linguistic Face Forgery Detection(V2) |
Ke Sun et.al. |
2502.20698 |
link |
2025-02-28 |
WorldModelBench: Judging Video Generation Models As World Models |
Dacheng Li et.al. |
2502.20694 |
null |
2025-02-28 |
Unleashing the Potential of Two-Tower Models: Diffusion-Based Cross-Interaction for Large-Scale Matching |
Yihan Wang et.al. |
2502.20687 |
null |
2025-02-28 |
JAM: Controllable and Responsible Text Generation via Causal Reasoning and Latent Vector Manipulation |
Yingbing Huang et.al. |
2502.20684 |
null |
2025-02-28 |
STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding |
Aaryan Garg et.al. |
2502.20678 |
null |
2025-02-28 |
SciceVPR: Stable Cross-Image Correlation Enhanced Model for Visual Place Recognition |
Shanshan Wan et.al. |
2502.20676 |
null |
2025-02-28 |
Advancing AI-Powered Medical Image Synthesis: Insights from MedVQA-GI Challenge Using CLIP, Fine-Tuned Stable Diffusion, and Dream-Booth + LoRA |
Ojonugwa Oluwafemi Ejiga Peter et.al. |
2502.20667 |
null |
2025-02-28 |
Consistency Evaluation of News Article Summaries Generated by Large (and Small) Language Models |
Colleen Gilhuly et.al. |
2502.20647 |
null |
2025-02-28 |
LexRAG: Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation |
Haitao Li et.al. |
2502.20640 |
link |
2025-02-28 |
Can LLM Assist in the Evaluation of the Quality of Machine Learning Explanations? |
Bo Wang et.al. |
2502.20635 |
null |
2025-02-28 |
Are LLMs Ready for Practical Adoption for Assertion Generation? |
Vaishnavi Pulavarthi et.al. |
2502.20633 |
null |
2025-02-28 |
Rectifying Belief Space via Unlearning to Harness LLMs’ Reasoning |
Ayana Niwa et.al. |
2502.20620 |
null |
2025-02-28 |
Leveraging Large Language Models for Building Interpretable Rule-Based Data-to-Text Systems |
Jędrzej Warczyński et.al. |
2502.20609 |
null |
2025-02-28 |
NutriGen: Personalized Meal Plan Generator Leveraging Large Language Models to Enhance Dietary and Nutritional Adherence |
Saman Khamesian et.al. |
2502.20601 |
link |
2025-02-27 |
Few-Shot, No Problem: Descriptive Continual Relation Extraction |
Nguyen Xuan Thanh et.al. |
2502.20596 |
null |
2025-02-27 |
Multi$^2$: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing |
Juntai Cao et.al. |
2502.20592 |
null |
2025-02-27 |
LLMs Have Rhythm: Fingerprinting Large Language Models Using Inter-Token Times and Network Traffic Analysis |
Saeif Alhazbi et.al. |
2502.20589 |
null |
2025-03-04 |
InstaFace: Identity-Preserving Facial Editing with Single Image Inference |
MD Wahiduzzaman Khan et.al. |
2502.20577 |
null |
2025-02-27 |
ECCOS: Efficient Capability and Cost Coordinated Scheduling for Multi-LLM Serving |
Kai Mei et.al. |
2502.20576 |
link |
2025-02-27 |
Visual Reasoning at Urban Intersections: FineTuning GPT-4o for Traffic Conflict Detection |
Sari Masri et.al. |
2502.20573 |
null |
2025-02-27 |
Stochastic Rounding for LLM Training: Theory and Practice |
Kaan Ozkara et.al. |
2502.20566 |
null |
2025-02-27 |
LISArD: Learning Image Similarity to Defend Against Gray-box Adversarial Attacks |
Joana C. Costa et.al. |
2502.20562 |
link |
2025-02-27 |
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts |
Zhongyang Li et.al. |
2502.20395 |
link |
2025-02-27 |
InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions |
Sirui Xu et.al. |
2502.20390 |
link |
2025-02-27 |
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation |
Sucheng Ren et.al. |
2502.20388 |
link |
2025-02-27 |
Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis |
Jeffrey Yang Fan Chiang et.al. |
2502.20383 |
null |
2025-02-27 |
Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers |
Shalev Lifshitz et.al. |
2502.20379 |
null |
2025-02-27 |
PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation |
Albert Gong et.al. |
2502.20377 |
link |
2025-02-27 |
Constrained Generative Modeling with Manually Bridged Diffusion Models |
Saeid Naderiparizi et.al. |
2502.20371 |
null |
2025-02-27 |
Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization |
Ryan C. Barron et.al. |
2502.20364 |
link |
2025-02-27 |
Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs |
Kuan Lok Zhou et.al. |
2502.20356 |
null |
2025-02-27 |
KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model |
Kai Zhang et.al. |
2502.20350 |
null |
2025-02-27 |
Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models |
Yi Jing et.al. |
2502.20344 |
null |
2025-02-27 |
Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners |
Daniele Paliotta et.al. |
2502.20339 |
null |
2025-02-27 |
Expertise Is What We Want |
Alan Ashworth et.al. |
2502.20335 |
null |
2025-02-27 |
Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models |
Yukang Yang et.al. |
2502.20332 |
null |
2025-02-27 |
Long-Context Inference with Retrieval-Augmented Speculative Decoding |
Guanzheng Chen et.al. |
2502.20330 |
link |
2025-02-27 |
EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants |
Franck Cappello et.al. |
2502.20309 |
link |
2025-02-27 |
M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging |
Jinghao Feng et.al. |
2502.20301 |
null |
2025-02-27 |
An exploration of features to improve the generalisability of fake news detection models |
Nathaniel Hoy et.al. |
2502.20299 |
null |
2025-02-27 |
Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription |
Benjamin Gutteridge et.al. |
2502.20295 |
link |
2025-02-27 |
Conformal Tail Risk Control for Large Language Model Alignment |
Catherine Yu-Chi Chen et.al. |
2502.20285 |
null |
2025-02-27 |
Evaluating Human Trust in LLM-Based Planners: A Preliminary Study |
Shenghui Chen et.al. |
2502.20284 |
null |
2025-02-27 |
Large Language Models as Attribution Regularizers for Efficient Model Training |
Davor Vukadin et.al. |
2502.20268 |
link |
2025-02-27 |
Vector-Quantized Vision Foundation Models for Object-Centric Learning |
Rongzhen Zhao et.al. |
2502.20263 |
null |
2025-02-27 |
LLM as a Broken Telephone: Iterative Generation Distorts Information |
Amr Mohamed et.al. |
2502.20258 |
link |
2025-02-27 |
Do computer vision foundation models learn the low-level characteristics of the human visual system? |
Yancheng Cai et.al. |
2502.20256 |
null |
2025-02-27 |
Beyond Natural Language Perplexity: Detecting Dead Code Poisoning in Code Generation Datasets |
Chichien Tsai et.al. |
2502.20246 |
null |
2025-02-27 |
From Retrieval to Generation: Comparing Different Approaches |
Abdelrahman Abdallah et.al. |
2502.20245 |
null |
2025-02-27 |
FINEREASON: Evaluating and Improving LLMs’ Deliberate Reasoning through Reflective Puzzle Solving |
Guizhen Chen et.al. |
2502.20238 |
link |
2025-02-27 |
AI Will Always Love You: Studying Implicit Biases in Romantic AI Companions |
Clare Grogan et.al. |
2502.20231 |
link |
2025-02-27 |
Avat3r: Large Animatable Gaussian Reconstruction Model for High-fidelity 3D Head Avatars |
Tobias Kirschstein et.al. |
2502.20220 |
null |
2025-02-27 |
ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models |
Haibin Chen et.al. |
2502.20196 |
link |
2025-02-27 |
Model Checking Linear Temporal Logic with Standpoint Modalities |
Rajab Aghamov et.al. |
2502.20193 |
null |
2025-02-27 |
Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge |
Yan-Lun Chen et.al. |
2502.20186 |
null |
2025-02-27 |
DGFM: Full Body Dance Generation Driven by Music Foundation Models |
Xinran Liu et.al. |
2502.20176 |
null |
2025-02-27 |
An Extensive Evaluation of PDDL Capabilities in off-the-shelf LLMs |
Kaustubh Vyas et.al. |
2502.20175 |
null |
2025-02-27 |
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think |
Liang Chen et.al. |
2502.20172 |
link |
2025-02-27 |
Re-evaluating Open-ended Evaluation of Large Language Models |
Siqi Liu et.al. |
2502.20170 |
null |
2025-02-27 |
Adaptive H&E-IHC information fusion staining framework based on feature extra |
Yifan Jia et.al. |
2502.20156 |
link |
2025-02-27 |
Telephone Surveys Meet Conversational AI: Evaluating a LLM-Based Telephone Survey System at Scale |
Max M. Lang et.al. |
2502.20140 |
null |
2025-02-27 |
Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking |
Yifan Zhang et.al. |
2502.20129 |
null |
2025-02-27 |
Self-Training Elicits Concise Reasoning in Large Language Models |
Tergel Munkhbat et.al. |
2502.20122 |
link |
2025-02-27 |
LongRoPE2: Near-Lossless LLM Context Window Scaling |
Ning Shang et.al. |
2502.20082 |
link |
2025-02-27 |
Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents |
Haochen Sun et.al. |
2502.20073 |
link |
2025-02-27 |
A Generative Model Enhanced Multi-Agent Reinforcement Learning Method for Electric Vehicle Charging Navigation |
Tianyang Qi et.al. |
2502.20068 |
null |
2025-02-27 |
Polish-ASTE: Aspect-Sentiment Triplet Extraction Datasets for Polish |
Marta Lango et.al. |
2502.20046 |
null |
2025-02-27 |
3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds |
Hengshuo Chu et.al. |
2502.20041 |
null |
2025-02-27 |
AsymLoRA: Harmonizing Data Conflicts and Commonalities in MLLMs |
Xuyang Wei et.al. |
2502.20035 |
link |
2025-02-27 |
Erasing Without Remembering: Safeguarding Knowledge Forgetting in Large Language Models |
Huazheng Wang et.al. |
2502.19982 |
link |
2025-02-27 |
The Lookahead Limitation: Why Multi-Operand Addition is Hard for LLMs |
Tanja Baeumel et.al. |
2502.19981 |
null |
2025-02-27 |
Can Large Language Models Unveil the Mysteries? An Exploration of Their Ability to Unlock Information in Complex Scenarios |
Chao Wang et.al. |
2502.19973 |
null |
2025-02-27 |
Deterministic or probabilistic? The psychology of LLMs as random number generators |
Javier Coronado-Blázquez et.al. |
2502.19965 |
null |
2025-02-27 |
SeisMoLLM: Advancing Seismic Monitoring via Cross-modal Transfer with Pre-trained Large Language Model |
Xinghao Wang et.al. |
2502.19960 |
link |
2025-02-27 |
Collaborative Stance Detection via Small-Large Language Model Consistency Verification |
Yu Yan et.al. |
2502.19954 |
link |
2025-02-27 |
GeoEdit: Geometric Knowledge Editing for Large Language Models |
Yujie Feng et.al. |
2502.19953 |
null |
2025-02-27 |
Algebraic Machine Learning: Learning as computing an algebraic decomposition of a task |
Fernando Martin-Maroto et.al. |
2502.19944 |
link |
2025-02-27 |
Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation |
Xiang Geng et.al. |
2502.19941 |
null |
2025-02-27 |
Playing Pokémon Red via Deep Reinforcement Learning |
Marco Pleines et.al. |
2502.19920 |
link |
2025-02-27 |
Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models |
Yuan Sui et.al. |
2502.19918 |
null |
2025-02-27 |
Picking the Cream of the Crop: Visual-Centric Data Selection with Collaborative Agents |
Zhenyu Liu et.al. |
2502.19917 |
link |
2025-02-27 |
LLM-driven Effective Knowledge Tracing by Integrating Dual-channel Difficulty |
Jiahui Cen et.al. |
2502.19915 |
null |
2025-02-27 |
SkipPipe: Partial and Reordered Pipelining Framework for Training LLMs in Heterogeneous Networks |
Nikolay Blagoev et.al. |
2502.19913 |
link |
2025-02-27 |
Order Doesn’t Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation |
Qianxi He et.al. |
2502.19907 |
null |
2025-02-27 |
Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy |
Zaijing Li et.al. |
2502.19902 |
null |
2025-02-27 |
GenPC: Zero-shot Point Cloud Completion via 3D Generative Priors |
An Li et.al. |
2502.19896 |
null |
2025-02-27 |
Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models |
Sibo Yi et.al. |
2502.19883 |
null |
2025-02-27 |
Towards Multimodal Large-Language Models for Parent-Child Interaction: A Focus on Joint Attention |
Weiyan Shi et.al. |
2502.19877 |
null |
2025-02-27 |
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge |
Yuntao Du et.al. |
2502.19870 |
link |
2025-02-27 |
MIND: Towards Immersive Psychological Healing with Multi-agent Inner Dialogue |
Yujia Chen et.al. |
2502.19860 |
null |
2025-02-27 |
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments |
Hojae Han et.al. |
2502.19852 |
null |
2025-02-27 |
One-for-More: Continual Diffusion Model for Anomaly Detection |
Xiaofan Li et.al. |
2502.19848 |
link |
2025-02-27 |
ProAPO: Progressively Automatic Prompt Optimization for Visual Classification |
Xiangyan Qu et.al. |
2502.19844 |
link |
2025-02-27 |
Shared Stochastic Gaussian Process Latent Variable Models: A Multi-modal Generative Model for Quasar Spectra |
Vidhi Lalchand et.al. |
2502.19824 |
link |
2025-02-27 |
Foot-In-The-Door: A Multi-turn Jailbreak for LLMs |
Zixuan Weng et.al. |
2502.19820 |
link |
2025-02-27 |
Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts |
Shulai Zhang et.al. |
2502.19811 |
link |
2025-02-27 |
Implicit Search via Discrete Diffusion: A Study on Chess |
Jiacheng Ye et.al. |
2502.19805 |
link |
2025-02-27 |
Developmental Support Approach to AI’s Autonomous Growth: Toward the Realization of a Mutually Beneficial Stage Through Experiential Learning |
Taichiro Endo et.al. |
2502.19798 |
null |
2025-02-27 |
ChatMol: A Versatile Molecule Designer Based on the Numerically Enhanced Large Language Model |
Chuanliu Fan et.al. |
2502.19794 |
null |
2025-02-27 |
Mixtera: A Data Plane for Foundation Model Training |
Maximilian Böther et.al. |
2502.19790 |
link |
2025-02-27 |
Advancements in Natural Language Processing for Automatic Text Summarization |
Nevidu Jayatilleke et.al. |
2502.19773 |
null |
2025-02-27 |
Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models |
Heeseung Kim et.al. |
2502.19759 |
null |
2025-02-27 |
PolyPrompt: Automating Knowledge Extraction from Multilingual Language Models with Dynamic Prompt Generation |
Nathan Roll et.al. |
2502.19756 |
null |
2025-02-27 |
Beneath the Surface: How Large Language Models Reflect Hidden Bias |
Jinhao Pan et.al. |
2502.19749 |
link |
2025-02-27 |
HaLoRA: Hardware-aware Low-Rank Adaptation for Large Language Models Based on Hybrid Compute-in-Memory Architecture |
Taiqiang Wu et.al. |
2502.19747 |
null |
2025-02-27 |
R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning |
Minggui He et.al. |
2502.19735 |
null |
2025-02-27 |
Preference Learning Unlocks LLMs’ Psycho-Counseling Skills |
Mian Zhang et.al. |
2502.19731 |
null |
2025-02-27 |
Do Expressions Change Decisions? Exploring the Impact of AI’s Explanation Tone on Decision-Making |
Ayano Okoso et.al. |
2502.19730 |
null |
2025-02-27 |
Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training |
Toan Tran et.al. |
2502.19726 |
null |
2025-02-27 |
Few-Shot Multilingual Open-Domain QA from 5 Examples |
Fan Jiang et.al. |
2502.19722 |
link |
2025-02-27 |
Sensing and Steering Stereotypes: Extracting and Applying Gender Representation Vectors in LLMs |
Hannah Cyberey et.al. |
2502.19721 |
link |
2025-02-27 |
Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation |
Manveer Singh Tamber et.al. |
2502.19712 |
link |
2025-02-27 |
AoECR: AI-ization of Elderly Care Robot |
Linkun Zhou et.al. |
2502.19706 |
null |
2025-02-27 |
You Only Click Once: Single Point Weakly Supervised 3D Instance Segmentation for Autonomous Driving |
Guangfeng Jiang et.al. |
2502.19698 |
null |
2025-02-27 |
M-LLM Based Video Frame Selection for Efficient Video Understanding |
Kai Hu et.al. |
2502.19680 |
null |
2025-02-27 |
Old Experience Helps: Leveraging Survey Methodology to Improve AI Text Annotation Reliability in Social Sciences |
Linzhuo Li et.al. |
2502.19679 |
null |
2025-02-27 |
Improving Adversarial Transferability in MLLMs via Dynamic Vision-Language Alignment Attack |
Chenhe Gu et.al. |
2502.19672 |
null |
2025-02-27 |
SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning |
Mingsheng Cai et.al. |
2502.19668 |
null |
2025-02-27 |
Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models |
Jan Wehner et.al. |
2502.19649 |
null |
2025-02-27 |
cMIM: A Contrastive Mutual Information Framework for Unified Generative and Discriminative Representation Learning |
Micha Livne et.al. |
2502.19642 |
null |
2025-02-26 |
Agentic Mixture-of-Workflows for Multi-Modal Chemical Search |
Tiffany J. Callahan et.al. |
2502.19629 |
null |
2025-02-26 |
Treatment Non-Adherence Bias in Clinical Machine Learning: A Real-World Study on Hypertension Medication |
Zhongyuan Liang et.al. |
2502.19625 |
null |
2025-02-26 |
Norm Growth and Stability Challenges in Localized Sequential Knowledge Editing |
Akshat Gupta et.al. |
2502.19416 |
null |
2025-02-26 |
Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs |
Dayu Yang et.al. |
2502.19411 |
link |
2025-02-26 |
Less or More: Towards Glanceable Explanations for LLM Recommendations Using Ultra-Small Devices |
Xinru Wang et.al. |
2502.19410 |
null |
2025-02-26 |
ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models |
Danae Sánchez Villegas et.al. |
2502.19409 |
null |
2025-02-26 |
Learning Code-Edit Embedding to Model Student Debugging Behavior |
Hasnain Heickal et.al. |
2502.19407 |
null |
2025-02-26 |
General Reasoning Requires Learning to Reason from the Get-go |
Seungwook Han et.al. |
2502.19402 |
null |
2025-02-26 |
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding |
Max Ku et.al. |
2502.19400 |
null |
2025-02-26 |
Multi-modal Contrastive Learning for Tumor-specific Missing Modality Synthesis |
Minjoo Lim et.al. |
2502.19390 |
null |
2025-02-26 |
LiDAR Registration with Visual Foundation Models |
Niclas Vödisch et.al. |
2502.19374 |
null |
2025-02-26 |
Deep Learning For Time Series Analysis With Application On Human Motion |
Ali Ismail-Fawaz et.al. |
2502.19364 |
null |
2025-02-26 |
DataMan: Data Manager for Pre-training Large Language Models |
Ru Peng et.al. |
2502.19363 |
null |
2025-02-26 |
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? |
Yancheng He et.al. |
2502.19361 |
link |
2025-02-26 |
Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets |
Tohida Rehman et.al. |
2502.19339 |
null |
2025-02-26 |
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems |
Hao Peng et.al. |
2502.19328 |
link |
2025-02-26 |
Shh, don’t say that! Domain Certification in LLMs |
Cornelius Emde et.al. |
2502.19320 |
null |
2025-02-26 |
Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond |
Qizhou Wang et.al. |
2502.19301 |
null |
2025-02-26 |
Agent-centric Information Access |
Evangelos Kanoulas et.al. |
2502.19298 |
null |
2025-02-26 |
Complex LLM Planning via Automated Heuristics Discovery |
Hongyi Ling et.al. |
2502.19295 |
null |
2025-02-26 |
Efficient Federated Search for Retrieval-Augmented Generation |
Rachid Guerraoui et.al. |
2502.19280 |
null |
2025-02-26 |
ArtInsight: Enabling AI-Powered Artwork Engagement for Mixed Visual-Ability Families |
Arnavi Chheda-Kothary et.al. |
2502.19263 |
null |
2025-02-26 |
AI-Powered Bayesian Inference |
Veronika Ročková et.al. |
2502.19231 |
null |
2025-02-26 |
Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time |
Jiazheng Li et.al. |
2502.19230 |
null |
2025-02-26 |
A Lightweight and Extensible Cell Segmentation and Classification Model for Whole Slide Images |
Nikita Shvetsov et.al. |
2502.19217 |
null |
2025-02-26 |
A Hybrid Transformer Architecture with a Quantized Self-Attention Mechanism Applied to Molecular Generation |
Anthony M. Smaldone et.al. |
2502.19214 |
link |
2025-02-26 |
Negation-Induced Forgetting in LLMs |
Francesca Capuano et.al. |
2502.19211 |
null |
2025-02-26 |
Bi’an: A Bilingual Benchmark and Model for Hallucination Detection in Retrieval-Augmented Generation |
Zhouyu Jiang et.al. |
2502.19209 |
null |
2025-02-26 |
Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms |
Jinyu Cai et.al. |
2502.19193 |
null |
2025-02-26 |
BIG-Bench Extra Hard |
Mehran Kazemi et.al. |
2502.19187 |
link |
2025-02-26 |
INFO-SEDD: Continuous Time Markov Chains as Scalable Information Metrics Estimators |
Alberto Foresti et.al. |
2502.19183 |
null |
2025-02-26 |
UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering |
Langming Liu et.al. |
2502.19178 |
link |
2025-02-26 |
MEDDxAgent: A Unified Modular Agent Framework for Explainable Automatic Differential Diagnosis |
Daniel Rose et.al. |
2502.19175 |
null |
2025-02-26 |
A Model-Centric Review of Deep Learning for Protein Design |
Gregory W. Kyro et.al. |
2502.19173 |
null |
2025-02-26 |
CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation |
Kaiwen Yan et.al. |
2502.19166 |
link |
2025-02-26 |
TestNUC: Enhancing Test-Time Computing Approaches through Neighboring Unlabeled Data Consistency |
Henry Peng Zou et.al. |
2502.19163 |
link |
2025-02-26 |
Detecting Linguistic Indicators for Stereotype Assessment with Large Language Models |
Rebekka Görge et.al. |
2502.19160 |
null |
2025-02-26 |
A Sliding Layer Merging Method for Efficient Depth-Wise Pruning in LLMs |
Xuan Ding et.al. |
2502.19159 |
link |
2025-02-26 |
When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference Learning |
Yijiang River Dong et.al. |
2502.19158 |
null |
2025-02-26 |
Isolating Language-Coding from Problem-Solving: Benchmarking LLMs with PseudoEval |
Jiarong Wu et.al. |
2502.19149 |
null |
2025-02-26 |
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs |
Zhaowei Zhang et.al. |
2502.19148 |
null |
2025-02-26 |
Identification Under the Semantic Effective Secrecy Constraint |
Abdalla Ibrahim et.al. |
2502.19142 |
null |
2025-02-26 |
A Temporal Planning Framework for Multi-Agent Systems via LLM-Aided Knowledge Base Management |
Enrico Saccon et.al. |
2502.19135 |
null |
2025-02-26 |
Self-Memory Alignment: Mitigating Factual Hallucinations with Generalized Improvement |
Siyuan Zhang et.al. |
2502.19127 |
null |
2025-02-26 |
A Survey on Foundation-Model-Based Industrial Defect Detection |
Tianle Yang et.al. |
2502.19106 |
null |
2025-02-26 |
Evaluating Gender Bias in German Machine Translation |
Michelle Kappl et.al. |
2502.19104 |
link |
2025-02-26 |
LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm |
Siwei Wu et.al. |
2502.19103 |
null |
2025-02-26 |
Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation |
Humza Sami et.al. |
2502.19091 |
link |
2025-02-26 |
EndoMamba: An Efficient Foundation Model for Endoscopic Videos |
Qingyao Tian et.al. |
2502.19090 |
link |
2025-02-26 |
Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs |
Yiheng Yang et.al. |
2502.19078 |
null |
2025-02-26 |
IndicEval-XL: Bridging Linguistic Diversity in Code Generation Across Indic Languages |
Ujjwal Singh et.al. |
2502.19067 |
link |
2025-02-26 |
Can Large Language Models Outperform Non-Experts in Poetry Evaluation? A Comparative Study Using the Consensual Assessment Technique |
Piotr Sawicki et.al. |
2502.19064 |
null |
2025-02-26 |
MathClean: A Benchmark for Synthetic Mathematical Data Cleaning |
Hao Liang et.al. |
2502.19058 |
null |
2025-02-26 |
Beyond Surface-Level Patterns: An Essence-Driven Defense Framework Against Jailbreak Attacks in LLMs |
Shiyu Xiang et.al. |
2502.19041 |
null |
2025-02-26 |
FungalZSL: Zero-Shot Fungal Classification with Image Captioning Using a Synthetic Data Approach |
Anju Rani et.al. |
2502.19038 |
null |
2025-02-26 |
InternVQA: Advancing Compressed Video Quality Assessment with Distilling Large Foundation Model |
Fengbin Guan et.al. |
2502.19026 |
null |
2025-02-26 |
Binary Neural Networks for Large Language Model: A Survey |
Liangdong Liu et.al. |
2502.19008 |
null |
2025-02-26 |
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training |
Jinbo Wang et.al. |
2502.19002 |
null |
2025-02-26 |
MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering |
Teng Lin et.al. |
2502.18993 |
null |
2025-02-26 |
OntologyRAG: Better and Faster Biomedical Code Mapping with Retrieval-Augmented Generation (RAG) Leveraging Ontology Knowledge Graphs and Large Language Models |
Hui Feng et.al. |
2502.18992 |
link |
2025-02-26 |
GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation |
Jie He et.al. |
2502.18990 |
null |
2025-02-26 |
PEToolLLM: Towards Personalized Tool Learning in Large Language Models |
Qiancheng Xu et.al. |
2502.18980 |
null |
2025-02-26 |
Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction Tuning |
Hongyi Cal et.al. |
2502.18978 |
null |
2025-02-26 |
(Mis)Fitting: A Survey of Scaling Laws |
Margaret Li et.al. |
2502.18969 |
link |
2025-02-26 |
Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles |
Kuang Wang et.al. |
2502.18968 |
link |
2025-02-26 |
OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment |
Jiaxin Deng et.al. |
2502.18965 |
null |
2025-02-26 |
DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model |
Lei Zhao et.al. |
2502.18952 |
null |
2025-02-26 |
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models |
Yu He et.al. |
2502.18943 |
null |
2025-02-26 |
JailBench: A Comprehensive Chinese Security Assessment Benchmark for Large Language Models |
Shuyi Liu et.al. |
2502.18935 |
null |
2025-02-26 |
Talking like Piping and Instrumentation Diagrams (P&IDs) |
Achmad Anggawirya Alimin et.al. |
2502.18928 |
null |
2025-02-26 |
ClassInvGen: Class Invariant Synthesis using Large Language Models |
Chuyue Sun et.al. |
2502.18917 |
null |
2025-02-26 |
END: Early Noise Dropping for Efficient and Effective Context Denoising |
Hongye Jin et.al. |
2502.18915 |
null |
2025-02-26 |
CLLoRA: An Approach to Measure the Effects of the Context Length for LLM Fine-Tuning |
Ping Zhang et.al. |
2502.18910 |
null |
2025-02-26 |
An Empirical Study on Commit Message Generation using LLMs via In-Context Learning |
Yifan Wu et.al. |
2502.18904 |
link |
2025-02-26 |
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens |
Tong Wu et.al. |
2502.18890 |
link |
2025-02-26 |
Letters from Future Self: Augmenting the Letter-Exchange Exercise with LLM-based Agents to Enhance Young Adults’ Career Exploration |
Hayeon Jeon et.al. |
2502.18881 |
null |
2025-02-26 |
Learning to Generate Structured Output with Schema Reinforcement Learning |
Yaxi Lu et.al. |
2502.18878 |
null |
2025-02-26 |
Learning to Align Multi-Faceted Evaluation: A Unified and Robust Framework |
Kaishuai Xu et.al. |
2502.18874 |
null |
2025-02-26 |
Multi-LLM Collaborative Search for Complex Problem Solving |
Sen Yang et.al. |
2502.18873 |
null |
2025-02-26 |
A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops |
Shi Fu et.al. |
2502.18865 |
null |
2025-02-26 |
Sherlock: Towards Multi-scene Video Abnormal Event Extraction and Localization via a Global-local Spatial-sensitive LLM |
Junxiao Ma et.al. |
2502.18863 |
null |
2025-02-26 |
A Causal Lens for Evaluating Faithfulness Metrics |
Kerem Zaman et.al. |
2502.18848 |
null |
2025-02-26 |
Sliding Window Attention Training for Efficient Large Language Models |
Zichuan Fu et.al. |
2502.18845 |
null |
2025-02-26 |
Evidence-Driven Marker Extraction for Social Media Suicide Risk Detection |
Carter Adams et.al. |
2502.18823 |
null |
2025-02-26 |
Data-Efficient Multi-Agent Spatial Planning with LLMs |
Huangyuan Su et.al. |
2502.18822 |
null |
2025-02-26 |
CAMEx: Curvature-aware Merging of Experts |
Dung V. Nguyen et.al. |
2502.18821 |
link |
2025-02-26 |
Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models |
Shuliang Liu et.al. |
2502.18817 |
null |
2025-02-26 |
Holistic Audit Dataset Generation for LLM Unlearning via Knowledge Graph Traversal and Redundancy Removal |
Weipeng Jiang et.al. |
2502.18810 |
null |
2025-02-26 |
Optimal Stochastic Trace Estimation in Generative Modeling |
Xinyang Liu et.al. |
2502.18808 |
null |
2025-02-26 |
SolEval: Benchmarking Large Language Models for Repository-level Solidity Code Generation |
Zhiyuan Peng et.al. |
2502.18793 |
null |
2025-02-26 |
Active Few-Shot Learning for Text Classification |
Saeed Ahmadnia et.al. |
2502.18782 |
null |
2025-02-26 |
Towards Optimal Multi-draft Speculative Decoding |
Zhengmian Hu et.al. |
2502.18779 |
null |
2025-02-26 |
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance |
Qingpei Guo et.al. |
2502.18778 |
null |
2025-02-26 |
Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance |
Xueqing Peng et.al. |
2502.18772 |
null |
2025-02-26 |
Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation |
Yuxiang Wang et.al. |
2502.18771 |
link |
2025-02-26 |
Reward Shaping to Mitigate Reward Hacking in RLHF |
Jiayi Fu et.al. |
2502.18770 |
link |
2025-02-26 |
CommGPT: A Graph and Retrieval-Augmented Multimodal Communication Foundation Model |
Feibo Jiang et.al. |
2502.18763 |
null |
2025-02-26 |
Training Large Recommendation Models via Graph-Language Token Alignment |
Mingdai Yang et.al. |
2502.18757 |
null |
2025-02-26 |
M-ANT: Efficient Low-bit Group Quantization for LLMs via Mathematically Adaptive Numerical Type |
Weiming Hu et.al. |
2502.18755 |
null |
2025-02-26 |
AgentSociety Challenge: Designing LLM Agents for User Modeling and Recommendation on Web Platforms |
Yuwei Yan et.al. |
2502.18754 |
link |
2025-02-26 |
Spectral-Enhanced Transformers: Leveraging Large-Scale Pretrained Models for Hyperspectral Object Tracking |
Shaheer Mohamed et.al. |
2502.18748 |
null |
2025-02-26 |
Automatic Prompt Optimization via Heuristic Search: A Survey |
Wendi Cui et.al. |
2502.18746 |
null |
2025-02-25 |
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers |
Xueguang Ma et.al. |
2502.18460 |
link |
2025-02-25 |
LLM-Based Design Pattern Detection |
Christian Schindler et.al. |
2502.18458 |
null |
2025-02-25 |
FRIDA to the Rescue! Analyzing Synthetic Data Effectiveness in Object-Based Common Sense Reasoning for Disaster Response |
Mollie Shichman et.al. |
2502.18452 |
null |
2025-02-25 |
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution |
Yuxiang Wei et.al. |
2502.18449 |
null |
2025-02-25 |
MAPoRL: Multi-Agent Post-Co-Training for Collaborative Large Language Models with Reinforcement Learning |
Chanwoo Park et.al. |
2502.18439 |
null |
2025-02-25 |
TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning |
Frederikus Hudi et.al. |
2502.18431 |
link |
2025-02-25 |
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference |
Xiangyu Zhao et.al. |
2502.18411 |
link |
2025-02-25 |
Enhancing DNA Foundation Models to Address Masking Inefficiencies |
Monireh Safari et.al. |
2502.18405 |
null |
2025-02-25 |
Monte Carlo Temperature: a robust sampling strategy for LLM’s uncertainty quantification methods |
Nicola Cecere et.al. |
2502.18389 |
null |
2025-02-25 |
How Far are LLMs from Real Search? A Comprehensive Study on Efficiency, Completeness, and Inherent Capabilities |
Minhua Lin et.al. |
2502.18387 |
null |
2025-02-25 |
MindMem: Multimodal for Predicting Advertisement Memorability Using LLMs and Deep Learning |
Sepehr Asgarian et.al. |
2502.18371 |
null |
2025-02-25 |
Sparse Bayesian Generative Modeling for Joint Parameter and Channel Estimation |
Benedikt Böck et.al. |
2502.18369 |
null |
2025-02-25 |
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation |
Yifan Pu et.al. |
2502.18364 |
null |
2025-02-25 |
Responsible AI Agents |
Deven R. Desai et.al. |
2502.18359 |
null |
2025-02-25 |
Which Contributions Deserve Credit? Perceptions of Attribution in Human-AI Co-Creation |
Jessica He et.al. |
2502.18357 |
null |
2025-02-25 |
BRIDO: Bringing Democratic Order to Abstractive Summarization |
Junhyun Lee et.al. |
2502.18342 |
null |
2025-02-25 |
Mapping of Subjective Accounts into Interpreted Clusters (MOSAIC): Topic Modelling and LLM applied to Stroboscopic Phenomenology |
Romy Beauté et.al. |
2502.18318 |
null |
2025-02-25 |
GCDance: Genre-Controlled 3D Full Body Dance Generation Driven By Music |
Xinran Liu et.al. |
2502.18309 |
null |
2025-02-25 |
RefuteBench 2.0 – Agentic Benchmark for Dynamic Evaluation of LLM Responses to Refutation Instruction |
Jianhao Yan et.al. |
2502.18308 |
null |
2025-02-25 |
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation |
Pengzhi Li et.al. |
2502.18302 |
null |
2025-02-25 |
Bayesian Computation in Deep Learning |
Wenlong Chen et.al. |
2502.18300 |
null |
2025-02-25 |
DeepCircuitX: A Comprehensive Repository-Level Dataset for RTL Code Understanding, Generation, and PPA Analysis |
Zeju Li et.al. |
2502.18297 |
null |
2025-02-25 |
AMPO: Active Multi-Preference Optimization |
Taneesh Gupta et.al. |
2502.18293 |
null |
2025-02-25 |
Better Aligned with Survey Respondents or Training Data? Unveiling Political Leanings of LLMs on U.S. Supreme Court Cases |
Shanshan Xu et.al. |
2502.18282 |
null |
2025-02-25 |
Citrus: Leveraging Expert Cognitive Pathways in a Medical Language Model for Advanced Medical Decision Support |
Guoxin Wang et.al. |
2502.18274 |
link |
2025-02-25 |
Imperfect Knowledge Management (IKM) in GEFRED (GENeralized model for Fuzzy RElational Databases) |
Leoncio Jimenez et.al. |
2502.18255 |
null |
2025-02-25 |
Iterative Counterfactual Data Augmentation |
Mitchell Plyler et.al. |
2502.18249 |
link |
2025-02-25 |
Unveiling and Causalizing CoT: A Causal Perspective |
Jiarun Fu et.al. |
2502.18239 |
null |
2025-02-25 |
Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints |
Mihaela Cătălina Stoian et.al. |
2502.18237 |
link |
2025-02-25 |
Debt Collection Negotiations with Large Language Models: An Evaluation System and Optimizing Decision Making with Multi-Agent |
Xiaofeng Wang et.al. |
2502.18228 |
null |
2025-02-25 |
From ChatGPT to DeepSeek: Can LLMs Simulate Humanity? |
Qian Wang et.al. |
2502.18210 |
null |
2025-02-25 |
LAG: LLM agents for Leaderboard Auto Generation on Demanding |
Jian Wu et.al. |
2502.18209 |
null |
2025-02-25 |
Grandes modelos de lenguaje: de la predicción de palabras a la comprensión? |
Carlos Gómez-Rodríguez et.al. |
2502.18205 |
null |
2025-02-25 |
Intersubjective Model of AI-mediated Communication: Augmenting Human-Human Text Chat through LLM-based Adaptive Agent Pair |
Shutaro Aoyama et.al. |
2502.18201 |
null |
2025-02-25 |
Task-Agnostic Semantic Communication with Multimodal Foundation Models |
Jiangjing Hu et.al. |
2502.18200 |
null |
2025-02-25 |
Agnostic calculation of atomic free energies with the descriptor density of states |
Thomas D Swinburne et.al. |
2502.18191 |
link |
2025-02-25 |
ChatMotion: A Multimodal Multi-Agent for Human Motion Analysis |
Li Lei et.al. |
2502.18180 |
null |
2025-02-25 |
Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs |
Gaye Colakoglu et.al. |
2502.18179 |
link |
2025-02-25 |
CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification |
Mingkun Zhang et.al. |
2502.18176 |
link |
2025-02-25 |
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models |
Zhang Yuxuan et.al. |
2502.18168 |
null |
2025-02-25 |
Can LLMs Explain Themselves Counterfactually? |
Zahra Dehghanighobadi et.al. |
2502.18156 |
null |
2025-02-25 |
Carbon and Silicon, Coexist or Compete? A Survey on Human-AI Interactions in Agent-based Modeling and Simulation |
Ziyue Lin et.al. |
2502.18145 |
null |
2025-02-25 |
LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers |
Zhuocheng Zhang et.al. |
2502.18139 |
link |
2025-02-25 |
Large Language Model Driven Agents for Simulating Echo Chamber Formation |
Chenhao Gu et.al. |
2502.18138 |
null |
2025-02-25 |
Inverse Materials Design by Large Language Model-Assisted Generative Framework |
Yun Hao et.al. |
2502.18127 |
link |
2025-02-25 |
HyperG: Hypergraph-Enhanced LLMs for Structured Knowledge |
Sirui Huang et.al. |
2502.18125 |
null |
2025-02-25 |
Bayesian Optimization for Controlled Image Editing via LLMs |
Chengkun Cai et.al. |
2502.18116 |
null |
2025-02-25 |
PromptMID: Modal Invariant Descriptors Based on Diffusion and Vision Foundation Models for Optical-SAR Image Matching |
Han Nie et.al. |
2502.18104 |
link |
2025-02-25 |
Detecting Offensive Memes with Social Biases in Singapore Context Using Multimodal Large Language Models |
Cao Yuxuan et.al. |
2502.18101 |
link |
2025-02-25 |
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning |
Wenkai Yang et.al. |
2502.18080 |
null |
2025-02-25 |
Examining the Threat Landscape: Foundation Models and Model Stealing |
Ankita Raj et.al. |
2502.18077 |
null |
2025-02-25 |
MRBTP: Efficient Multi-Robot Behavior Tree Planning and Collaboration |
Yishuai Cai et.al. |
2502.18072 |
link |
2025-02-25 |
Golden Ratio Mixing of Real and Synthetic Data for Stabilizing Generative Model Training |
Hengzhi He et.al. |
2502.18049 |
null |
2025-02-25 |
AutoCas: Autoregressive Cascade Predictor in Social Networks via Large Language Models |
Yuhao Zheng et.al. |
2502.18040 |
null |
2025-02-25 |
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble |
Zhijun Chen et.al. |
2502.18036 |
link |
2025-02-25 |
Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference |
Zhuo Chen et.al. |
2502.18023 |
null |
2025-02-25 |
AfroXLMR-Comet: Multilingual Knowledge Distillation with Attention Matching for Low-Resource languages |
Joshua Sakthivel Raju et.al. |
2502.18020 |
null |
2025-02-25 |
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms |
Yashan Wang et.al. |
2502.18008 |
null |
2025-02-25 |
Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning |
Xinghao Chen et.al. |
2502.18001 |
link |
2025-02-25 |
Model-Free Adversarial Purification via Coarse-To-Fine Tensor Network Representation |
Guang Lin et.al. |
2502.17972 |
null |
2025-02-25 |
LLM Knows Geometry Better than Algebra: Numerical Understanding of LLM-Based Agents in A Trading Arena |
Tianmi Ma et.al. |
2502.17967 |
link |
2025-02-25 |
Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments |
Patomporn Payoungkhamdee et.al. |
2502.17956 |
null |
2025-02-25 |
DeepSeek-R1 Outperforms Gemini 2.0 Pro, OpenAI o1, and o3-mini in Bilingual Complex Ophthalmology Reasoning |
Pusheng Xu et.al. |
2502.17947 |
null |
2025-02-25 |
Assessing Large Language Models in Agentic Multilingual National Bias |
Qianying Liu et.al. |
2502.17945 |
null |
2025-02-25 |
CaseGen: A Benchmark for Multi-Stage Legal Case Documents Generation |
Haitao Li et.al. |
2502.17943 |
link |
2025-02-25 |
Advantage-Guided Distillation for Preference Alignment in Small Language Models |
Shiping Gao et.al. |
2502.17927 |
link |
2025-02-25 |
LeanProgress: Guiding Search for Neural Theorem Proving via Proof Progress Prediction |
Suozhi Huang et.al. |
2502.17925 |
null |
2025-02-25 |
FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models |
Hongzhan Lin et.al. |
2502.17924 |
link |
2025-02-25 |
Towards Sustainable Web Agents: A Plea for Transparency and Dedicated Metrics for Energy Consumption |
Lars Krupp et.al. |
2502.17903 |
null |
2025-02-25 |
Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs |
Che Liu et.al. |
2502.17900 |
null |
2025-02-25 |
Can Large Language Models Identify Implicit Suicidal Ideation? An Empirical Evaluation |
Tong Li et.al. |
2502.17899 |
null |
2025-02-25 |
FetchBot: Object Fetching in Cluttered Shelves via Zero-Shot Sim2Real |
Weiheng Liu et.al. |
2502.17894 |
null |
2025-02-25 |
RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts |
Mingyan Wu et.al. |
2502.17888 |
link |
2025-02-25 |
Science Across Languages: Assessing LLM Multilingual Translation of Scientific Papers |
Hannah Calzi Kleidermacher et.al. |
2502.17882 |
null |
2025-02-25 |
EEGM2: An Efficient Mamba-2-Based Self-Supervised Framework for Long-Sequence EEG Modeling |
Jiazhen Hong et.al. |
2502.17873 |
link |
2025-02-25 |
A Survey: Spatiotemporal Consistency in Video Generation |
Zhiyu Yin et.al. |
2502.17863 |
null |
2025-02-25 |
HRR: Hierarchical Retrospection Refinement for Generated Image Detection |
Peipei Yuan et.al. |
2502.17862 |
null |
2025-02-25 |
LR$^{2}$Bench: Evaluating Long-chain Reflective Reasoning Capabilities of Large Language Models via Constraint Satisfaction Problems |
Jianghao Chen et.al. |
2502.17848 |
null |
2025-02-25 |
Quantifying interdisciplinary synergy in higher STEM education |
Gahyoun Gim et.al. |
2502.17841 |
null |
2025-02-25 |
A Combinatorial Identities Benchmark for Theorem Proving via Automated Theorem Generation |
Beibei Xiong et.al. |
2502.17840 |
null |
2025-02-25 |
TagGAN: A Generative Model for Data Tagging |
Muhammad Nawaz et.al. |
2502.17836 |
null |
2025-02-25 |
MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks |
Hyeonjeong Ha et.al. |
2502.17832 |
link |
2025-02-25 |
A General Framework to Enhance Fine-tuning-based LLM Unlearning |
Jie Ren et.al. |
2502.17823 |
link |
2025-02-25 |
An Overview of Large Language Models for Statisticians |
Wenlong Ji et.al. |
2502.17814 |
null |
2025-02-25 |
Can Multimodal LLMs Perform Time Series Anomaly Detection? |
Xiongxiao Xu et.al. |
2502.17812 |
link |
2025-02-25 |
URO-Bench: A Comprehensive Benchmark for End-to-End Spoken Dialogue Models |
Ruiqi Yan et.al. |
2502.17810 |
null |
2025-02-25 |
DocPuzzle: A Process-Aware Benchmark for Evaluating Realistic Long-Context Reasoning Capabilities |
Tianyi Zhuang et.al. |
2502.17807 |
null |
2025-02-25 |
Your Language Model May Think Too Rigidly: Achieving Reasoning Consistency with Symmetry-Enhanced Training |
Yihang Yao et.al. |
2502.17800 |
null |
2025-02-25 |
AIR: Complex Instruction Generation via Automatic Iterative Refinement |
Wei Liu et.al. |
2502.17787 |
link |
2025-02-25 |
Exploring the Potential of Large Language Models for Estimating the Reading Comprehension Question Difficulty |
Yoshee Jain et.al. |
2502.17785 |
null |
2025-02-25 |
Tip of the Tongue Query Elicitation for Simulated Evaluation |
Yifan He et.al. |
2502.17776 |
link |
2025-02-25 |
FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks |
Tanawan Premsri et.al. |
2502.17775 |
link |
2025-02-25 |
Uncertainty Quantification for LLM-Based Survey Simulations |
Chengpiao Huang et.al. |
2502.17773 |
null |
2025-02-25 |
DeepSeek vs. ChatGPT: A Comparative Study for Scientific Computing and Scientific Machine Learning Tasks |
Qile Jiang et.al. |
2502.17764 |
null |
2025-02-25 |
Design and implementation of a distributed security threat detection system integrating federated learning and multimodal LLM |
Yuqing Wang et.al. |
2502.17763 |
null |
2025-02-25 |
Detection of LLM-Paraphrased Code and Identification of the Responsible LLM Using Coding Style Features |
Shinwoo Park et.al. |
2502.17749 |
null |
2025-02-24 |
LLM Inference Acceleration via Efficient Operation Fusion |
Mahsa Salmani et.al. |
2502.17728 |
null |
2025-02-24 |
Can Score-Based Generative Modeling Effectively Handle Medical Image Classification? |
Sushmita Sarker et.al. |
2502.17727 |
link |
2025-02-24 |
Spontaneous Giving and Calculated Greed in Language Models |
Yuxuan Li et.al. |
2502.17720 |
null |
2025-02-24 |
Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures |
Akhila Yerukola et.al. |
2502.17710 |
link |
2025-02-24 |
Fractal Generative Models |
Tianhong Li et.al. |
2502.17437 |
link |
2025-02-24 |
Introducing Visual Perception Token into Multimodal Large Language Model |
Runpeng Yu et.al. |
2502.17425 |
link |
2025-02-24 |
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs |
Jiarui Zhang et.al. |
2502.17422 |
link |
2025-02-24 |
LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification |
Penghui Yang et.al. |
2502.17421 |
link |
2025-02-24 |
The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence |
Tom Wollschläger et.al. |
2502.17420 |
null |
2025-02-24 |
From System 1 to System 2: A Survey of Reasoning Large Language Models |
Zhong-Zhi Li et.al. |
2502.17419 |
link |
2025-02-24 |
Reasoning with Latent Thoughts: On the Power of Looped Transformers |
Nikunj Saunshi et.al. |
2502.17416 |
null |
2025-02-24 |
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs |
Liming Liu et.al. |
2502.17410 |
link |
2025-02-24 |
Large Language Models are Powerful EHR Encoders |
Stefan Hegselmann et.al. |
2502.17403 |
link |
2025-02-24 |
What is a Good Question? Utility Estimation with LLM-based Simulations |
Dong-Ho Lee et.al. |
2502.17383 |
null |
2025-02-24 |
KV-Edit: Training-Free Image Editing for Precise Background Preservation |
Tianrui Zhu et.al. |
2502.17363 |
link |
2025-02-24 |
A Closer Look at TabPFN v2: Strength, Limitation, and Extension |
Han-Jia Ye et.al. |
2502.17361 |
null |
2025-02-24 |
RELICT: A Replica Detection Framework for Medical Image Generation |
Orhun Utku Aydin et.al. |
2502.17360 |
link |
2025-02-24 |
On Relation-Specific Neurons in Large Language Models |
Yihong Liu et.al. |
2502.17355 |
link |
2025-02-24 |
How Scientists Use Large Language Models to Program |
Gabrielle O’Brien et.al. |
2502.17348 |
null |
2025-02-24 |
Time series forecasting based on optimized LLM for fault prediction in distribution power grid insulators |
João Pedro Matos-Carvalho et.al. |
2502.17341 |
null |
2025-02-24 |
HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization |
Zhenghao Liu et.al. |
2502.17315 |
link |
2025-02-24 |
Delta Decompression for MoE-based LLMs Compression |
Hao Gu et.al. |
2502.17298 |
link |
2025-02-24 |
Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts |
Zhenghao Liu et.al. |
2502.17297 |
link |
2025-02-24 |
Integrating protein sequence embeddings with structure via graph-based deep learning for the prediction of single-residue properties |
Kevin Michalewicz et.al. |
2502.17294 |
link |
2025-02-24 |
Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing |
Yi-Kai Zhang et.al. |
2502.17282 |
link |
2025-02-24 |
MonoTODia: Translating Monologue Requests to Task-Oriented Dialogues |
Sebastian Steindl et.al. |
2502.17268 |
null |
2025-02-24 |
Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective |
Chengyin Xu et.al. |
2502.17262 |
null |
2025-02-24 |
Detecting Benchmark Contamination Through Watermarking |
Tom Sander et.al. |
2502.17259 |
null |
2025-02-24 |
REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective |
Simon Geisler et.al. |
2502.17254 |
link |
2025-02-24 |
Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search |
Boyan Li et.al. |
2502.17248 |
null |
2025-02-24 |
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction |
Tianpeng Li et.al. |
2502.17239 |
link |
2025-02-24 |
Making LLMs Reason? The Intermediate Language Problem in Neurosymbolic Approaches |
Alexander Beiser et.al. |
2502.17216 |
null |
2025-02-24 |
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought |
Boxuan Zhang et.al. |
2502.17214 |
link |
2025-02-24 |
Order Matters: Investigate the Position Bias in Multi-constraint Instruction Following |
Jie Zeng et.al. |
2502.17204 |
link |
2025-02-24 |
IGDA: Interactive Graph Discovery through Large Language Model Agents |
Alex Havrilla et.al. |
2502.17189 |
null |
2025-02-24 |
Evaluating Expert Contributions in a MoE LLM for Quiz-Based Tasks |
Andrei Chernov et.al. |
2502.17187 |
null |
2025-02-24 |
Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric |
Yuming Yang et.al. |
2502.17184 |
link |
2025-02-24 |
Unsupervised Accelerated MRI Reconstruction via Ground-Truth-Free Flow Matching |
Xinzhe Luo et.al. |
2502.17174 |
null |
2025-02-24 |
Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch |
Xueru Wen et.al. |
2502.17173 |
null |
2025-02-24 |
Logic Haystacks: Probing LLMs Long-Context Logical Reasoning (Without Easily Identifiable Unrelated Padding) |
Damien Sileo et.al. |
2502.17169 |
null |
2025-02-24 |
JUREX-4E: Juridical Expert-Annotated Four-Element Knowledge Base for Legal Reasoning |
Huanghai Liu et.al. |
2502.17166 |
link |
2025-02-24 |
MEMERAG: A Multilingual End-to-End Meta-Evaluation Benchmark for Retrieval Augmented Generation |
María Andrea Cruz Blandón et.al. |
2502.17163 |
link |
2025-02-24 |
Real-time Monitoring of Economic Shocks using Company Websites |
Michael Koenig et.al. |
2502.17161 |
null |
2025-02-24 |
A Pragmatic Note on Evaluating Generative Models with Fréchet Inception Distance for Retinal Image Synthesis |
Yuli Wu et.al. |
2502.17160 |
null |
2025-02-24 |
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation |
Fanhu Zeng et.al. |
2502.17159 |
null |
2025-02-24 |
CodeSwift: Accelerating LLM Inference for Efficient Code Generation |
Qianhui Zhao et.al. |
2502.17139 |
null |
2025-02-24 |
Evaluating the Effectiveness of Large Language Models in Automated News Article Summarization |
Lionel Richy Panlap Houamegni et.al. |
2502.17136 |
null |
2025-02-24 |
Applications of Large Models in Medicine |
YunHe Su et.al. |
2502.17132 |
null |
2025-02-24 |
Thus Spake Long-Context Large Language Model |
Xiaoran Liu et.al. |
2502.17129 |
null |
2025-02-24 |
Adversarial Training for Defense Against Label Poisoning Attacks |
Melis Ilayda Bal et.al. |
2502.17121 |
link |
2025-02-24 |
Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions |
Zhong Li et.al. |
2502.17119 |
link |
2025-02-24 |
SFLD: Reducing the content bias for AI-generated Image Detection |
Seoyeon Gye et.al. |
2502.17105 |
null |
2025-02-24 |
Generative Models in Decision Making: A Survey |
Yinchuan Li et.al. |
2502.17100 |
null |
2025-02-24 |
Improved Diffusion-based Generative Model with Better Adversarial Robustness |
Zekun Wang et.al. |
2502.17099 |
link |
2025-02-24 |
Conditional Diffusion-Flow models for generating 3D cosmic density fields: applications to f(R) cosmologies |
Julieth Katherine Riveros et.al. |
2502.17087 |
null |
2025-02-24 |
Automatically Evaluating the Paper Reviewing Capability of Large Language Models |
Hyungyu Shin et.al. |
2502.17086 |
null |
2025-02-24 |
Pleno-Generation: A Scalable Generative Face Video Compression Framework with Bandwidth Intelligence |
Bolin Chen et.al. |
2502.17085 |
null |
2025-02-24 |
Systematic Weight Evaluation for Pruning Large Language Models: Enhancing Performance and Sustainability |
Ashhadul Islam et.al. |
2502.17071 |
null |
2025-02-24 |
LLM-QE: Improving Query Expansion by Aligning Large Language Models with Ranking Preferences |
Sijia Yao et.al. |
2502.17057 |
link |
2025-02-24 |
PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance |
Haoran Li et.al. |
2502.17041 |
link |
2025-02-24 |
Evolution 6.0: Evolving Robotic Capabilities Through Generative Design |
Muhammad Haris Khan et.al. |
2502.17034 |
null |
2025-02-24 |
Understanding the Uncertainty of LLM Explanations: A Perspective Based on Reasoning Topology |
Longchao Da et.al. |
2502.17026 |
null |
2025-02-24 |
Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization |
Zixuan Gong et.al. |
2502.17024 |
null |
2025-02-24 |
Quantifying Logical Consistency in Transformers via Query-Key Alignment |
Eduard Tulchinskii et.al. |
2502.17017 |
null |
2025-02-24 |
Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation |
Jaskaran Singh Walia et.al. |
2502.17011 |
null |
2025-02-24 |
Be CIM or Be Memory: A Dual-mode-aware DNN Compiler for CIM Accelerators |
Shixin Zhao et.al. |
2502.17006 |
null |
2025-02-24 |
An Enhanced Large Language Model For Cross Modal Query Understanding System Using DL-KeyBERT Based CAZSSCL-MPGPT |
Shreya Singh et.al. |
2502.17000 |
null |
2025-02-24 |
Active Learning for Conditional Inverse Design with Crystal Generation and Foundation Atomic Models |
Zhuoyuan Li et.al. |
2502.16984 |
null |
2025-02-24 |
LongSafety: Evaluating Long-Context Safety of Large Language Models |
Yida Lu et.al. |
2502.16971 |
link |
2025-02-24 |
Autoregressive Image Generation Guided by Chains of Thought |
Miaomiao Cai et.al. |
2502.16965 |
null |
2025-02-24 |
Make LLM Inference Affordable to Everyone: Augmenting GPU Memory with NDP-DIMM |
Lian Liu et.al. |
2502.16963 |
null |
2025-02-24 |
UrduLLaMA 1.0: Dataset Curation, Preprocessing, and Evaluation in Low-Resource Settings |
Layba Fiaz et.al. |
2502.16961 |
null |
2025-02-24 |
Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance |
Chenghua Huang et.al. |
2502.16944 |
null |
2025-02-24 |
Reasoning Does Not Necessarily Improve Role-Playing Ability |
Xiachong Feng et.al. |
2502.16940 |
null |
2025-02-24 |
BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference |
Zewen Jin et.al. |
2502.16927 |
null |
2025-02-24 |
FilterLLM: Text-To-Distribution LLM for Billion-Scale Cold-Start Recommendation |
Ruochen Liu et.al. |
2502.16924 |
null |
2025-02-24 |
A Systematic Survey of Automatic Prompt Optimization Techniques |
Kiran Ramnath et.al. |
2502.16923 |
null |
2025-02-24 |
Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties |
Zhenglin Wang et.al. |
2502.16922 |
link |
2025-02-24 |
SS-MPC: A Sequence-Structured Multi-Party Conversation System |
Yoonjin Jang et.al. |
2502.16920 |
null |
2025-02-24 |
Multi-Dimensional Quality Assessment for Text-to-3D Assets: Dataset and Model |
Kang Fu et.al. |
2502.16915 |
null |
2025-02-24 |
SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models |
Kevin Miller et.al. |
2502.16911 |
null |
2025-02-24 |
AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models |
Qin Zhu et.al. |
2502.16906 |
link |
2025-02-24 |
GuidedBench: Equipping Jailbreak Evaluation with Guidelines |
Ruixuan Huang et.al. |
2502.16903 |
null |
2025-02-24 |
Culture-TRIP: Culturally-Aware Text-to-Image Generation with Iterative Prompt Refinement |
Suchae Jeong et.al. |
2502.16902 |
null |
2025-02-24 |
Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs |
Himanshu Beniwal et.al. |
2502.16901 |
link |
2025-02-24 |
Zero-shot Load Forecasting for Integrated Energy Systems: A Large Language Model-based Framework with Multi-task Learning |
Jiaheng Li et.al. |
2502.16896 |
null |
2025-02-24 |
Unlocking Scientific Concepts: How Effective Are LLM-Generated Analogies for Student Understanding and Classroom Practice? |
Zekai Shao et.al. |
2502.16895 |
null |
2025-02-24 |
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment |
Chenghao Fan et.al. |
2502.16894 |
link |
2025-02-24 |
Applying LLMs to Active Learning: Towards Cost-Efficient Cross-Task Text Classification without Manually Labeled Data |
Yejian Zhang et.al. |
2502.16892 |
null |
2025-02-24 |
Unveiling Institution-Specific Bias in Pathology Foundation Models: Detriments, Causes, and Potential Solutions |
Weiping Lin et.al. |
2502.16889 |
null |
2025-02-24 |
DBudgetKV: Dynamic Budget in KV Cache Compression for Ensuring Optimal Performance |
Xuanfan Ni et.al. |
2502.16886 |
null |
2025-02-24 |
CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter |
Yepeng Weng et.al. |
2502.16880 |
null |
2025-02-24 |
A Multi-LLM-Agent-Based Framework for Economic and Public Policy Analysis |
Yuzhi Hao et.al. |
2502.16879 |
null |
2025-02-24 |
Graphy’our Data: Towards End-to-End Modeling, Exploring and Generating Report from Raw Data |
Longbin Lai et.al. |
2502.16868 |
null |
2025-02-24 |
Leveraging Large Language Models for Effective and Explainable Multi-Agent Credit Assignment |
Kartik Nagpal et.al. |
2502.16863 |
null |
2025-02-24 |
LongAttn: Selecting Long-context Training Data via Token-level Attention |
Longyun Wu et.al. |
2502.16860 |
link |
2025-02-24 |
Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models |
Avinash Trivedi et.al. |
2502.16857 |
null |
2025-02-24 |
Improving LLM General Preference Alignment via Optimistic Online Mirror Descent |
Yuheng Zhang et.al. |
2502.16852 |
null |
2025-02-24 |
Exploring Causes and Mitigation of Hallucinations in Large Vision Language Models |
Yaqi Sun et.al. |
2502.16842 |
null |
2025-02-24 |
Fair Foundation Models for Medical Image Analysis: Challenges and Perspectives |
Dilermando Queiroz et.al. |
2502.16841 |
null |
2025-02-24 |
In-context learning of evolving data streams with tabular foundational models |
Afonso Lourenço et.al. |
2502.16840 |
null |
2025-02-24 |
“Actionable Help” in Crises: A Novel Dataset and Resource-Efficient Models for Identifying Request and Offer Social Media Posts |
Rabindra Lamsal et.al. |
2502.16839 |
null |
2025-02-24 |
REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction |
Omar Sharif et.al. |
2502.16838 |
null |
2025-02-24 |
Finding the Sweet Spot: Preference Data Construction for Scaling Preference Optimization |
Yao Xiao et.al. |
2502.16825 |
null |
2025-02-21 |
ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval |
Guanqi Zhan et.al. |
2502.15682 |
null |
2025-02-21 |
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training |
Jaydeep Borkar et.al. |
2502.15680 |
link |
2025-02-21 |
FLEKE: Federated Locate-then-Edit Knowledge Editing |
Zongkai Zhao et.al. |
2502.15677 |
link |
2025-02-21 |
AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind |
Zhining Zhang et.al. |
2502.15676 |
link |
2025-02-21 |
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling |
Florent Bartoccioni et.al. |
2502.15672 |
link |
2025-02-21 |
Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing |
Shoumik Saha et.al. |
2502.15666 |
link |
2025-02-21 |
Machine-generated text detection prevents language model collapse |
George Drayson et.al. |
2502.15654 |
link |
2025-02-21 |
Empowering LLMs with Logical Reasoning: A Comprehensive Survey |
Fengxiang Cheng et.al. |
2502.15652 |
null |
2025-02-21 |
Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models |
Anirudh Sundar et.al. |
2502.15639 |
null |
2025-02-21 |
Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification |
Vasilii Feofanov et.al. |
2502.15637 |
link |
2025-02-21 |
The Relationship Between Reasoning and Performance in Large Language Models – o3 (mini) Thinks Harder, Not Longer |
Marthe Ballon et.al. |
2502.15631 |
link |
2025-02-21 |
Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing |
Qi Le et.al. |
2502.15618 |
link |
2025-02-21 |
On the Robustness of Transformers against Context Hijacking for Linear Classification |
Tianle Li et.al. |
2502.15609 |
null |
2025-02-21 |
Cross-Format Retrieval-Augmented Generation in XR with LLMs for Context-Aware Maintenance Assistance |
Akos Nagy et.al. |
2502.15604 |
null |
2025-02-21 |
Do Multilingual LLMs Think In English? |
Lisa Schut et.al. |
2502.15603 |
null |
2025-02-21 |
WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents |
Xinhang Liu et.al. |
2502.15601 |
null |
2025-02-21 |
SafeInt: Shielding Large Language Models from Jailbreak Attacks via Safety-Aware Representation Intervention |
Jiaqi Wu et.al. |
2502.15594 |
null |
2025-02-21 |
Generalizing From Short to Long: Effective Data Synthesis for Long-Context Instruction Tuning |
Wenhao Zhu et.al. |
2502.15592 |
link |
2025-02-21 |
LightThinker: Thinking Step-by-Step Compression |
Jintian Zhang et.al. |
2502.15589 |
null |
2025-02-21 |
Chats-Grid: An Iterative Retrieval Q&A Optimization Scheme Leveraging Large Model and Retrieval Enhancement Generation in smart grid |
Yunfeng Li et.al. |
2502.15583 |
null |
2025-02-21 |
Fine-tuning foundation models of materials interatomic potentials with frozen transfer learning |
Mariia Radova et.al. |
2502.15582 |
null |
2025-02-21 |
Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders |
Xuansheng Wu et.al. |
2502.15576 |
null |
2025-02-21 |
DReSD: Dense Retrieval for Speculative Decoding |
Milan Gritta et.al. |
2502.15572 |
link |
2025-02-21 |
A Cautionary Tale About “Neutrally” Informative AI Tools Ahead of the 2025 Federal Elections in Germany |
Ina Dormuth et.al. |
2502.15568 |
null |
2025-02-21 |
PIP-KAG: Mitigating Knowledge Conflicts in Knowledge-Augmented Generation via Parametric Pruning |
Pengcheng Huang et.al. |
2502.15543 |
link |
2025-02-21 |
Accurate and efficient machine learning interatomic potentials for finite temperature modeling of molecular crystals |
Flaviano Della Pia et.al. |
2502.15530 |
null |
2025-02-21 |
Scaling Sparse and Dense Retrieval in Decoder-Only LLMs |
Hansi Zeng et.al. |
2502.15526 |
link |
2025-02-21 |
Towards Swift Serverless LLM Cold Starts with ParaServe |
Chiheng Lou et.al. |
2502.15524 |
null |
2025-02-21 |
Activation Steering in Neural Theorem Provers |
Shashank Kirtania et.al. |
2502.15507 |
null |
2025-02-21 |
Construction and Evaluation of LLM-based agents for Semi-Autonomous penetration testing |
Masaya Kobayashi et.al. |
2502.15506 |
null |
2025-02-21 |
Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models |
Ya Wang et.al. |
2502.15499 |
link |
2025-02-21 |
Programmers Aren’t Obsolete Yet: A Syllabus for Teaching CS Students to Responsibly Use Large Language Models for Code Generation |
Bruno Pereira Cipriano et.al. |
2502.15493 |
null |
2025-02-21 |
ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models |
Martina Miliani et.al. |
2502.15487 |
null |
2025-02-21 |
Enhancing RWKV-based Language Models for Long-Sequence Text Generation |
Xinghan Pan et.al. |
2502.15485 |
link |
2025-02-21 |
FaultGPT: Industrial Fault Diagnosis Question Answering System by Vision Language Models |
Jiao Chen et.al. |
2502.15481 |
null |
2025-02-21 |
PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System |
Yintao He et.al. |
2502.15470 |
null |
2025-02-21 |
Mitigating Data Scarcity in Time Series Analysis: A Foundation Model with Series-Symbol Data Generation |
Wenxuan Wang et.al. |
2502.15466 |
null |
2025-02-21 |
Memory Helps, but Confabulation Misleads: Understanding Streaming Events in Videos with MLLMs |
Gengyuan Zhang et.al. |
2502.15457 |
null |
2025-02-21 |
R-LoRA: Random Initialization of Multi-Head LoRA for Multi-Task Learning |
Jinda Liu et.al. |
2502.15455 |
link |
2025-02-21 |
A fast convergence algorithm based on binary integer programming for expert load balancing in MoE LLMs |
Yuan Sun et.al. |
2502.15451 |
link |
2025-02-21 |
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models |
Weilan Wang et.al. |
2502.15443 |
null |
2025-02-21 |
On the Effectiveness of Large Language Models in Writing Alloy Formulas |
Yang Hong et.al. |
2502.15441 |
null |
2025-02-21 |
Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning |
Raghav Singhal et.al. |
2502.15436 |
link |
2025-02-21 |
Single-pass Detection of Jailbreaking Input in Large Language Models |
Leyla Naz Candogan et.al. |
2502.15435 |
null |
2025-02-21 |
Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation |
Yue Zhou et.al. |
2502.15434 |
link |
2025-02-21 |
Pub-Guard-LLM: Detecting Fraudulent Biomedical Articles with Reliable Explanations |
Lihu Chen et.al. |
2502.15429 |
link |
2025-02-21 |
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs |
Giulio Zizzo et.al. |
2502.15427 |
link |
2025-02-21 |
Beyond Translation: LLM-Based Data Generation for Multilingual Fact-Checking |
Yi-Ling Chung et.al. |
2502.15419 |
link |
2025-02-21 |
MHQA: A Diverse, Knowledge Intensive Mental Health Question Answering Challenge for Language Models |
Suraj Racha et.al. |
2502.15418 |
link |
2025-02-21 |
HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings |
Rasmus Aavang et.al. |
2502.15411 |
link |
2025-02-21 |
Problem-Solving Logic Guided Curriculum In-Context Learning for LLMs Complex Reasoning |
Xuetao Ma et.al. |
2502.15401 |
null |
2025-02-21 |
Beyond Tools: Understanding How Heavy Users Integrate LLMs into Everyday Tasks and Decision-Making |
Eunhye Kim et.al. |
2502.15395 |
null |
2025-02-21 |
Chitrarth: Bridging Vision and Language for a Billion People |
Shaharukh Khan et.al. |
2502.15392 |
null |
2025-02-21 |
MOVE: A Mixture-of-Vision-Encoders Approach for Domain-Focused Vision-Language Processing |
Matvey Skripkin et.al. |
2502.15381 |
null |
2025-02-21 |
Weakly Supervised Video Scene Graph Generation via Natural Language Supervision |
Kibum Kim et.al. |
2502.15370 |
link |
2025-02-21 |
Identifying Features that Shape Perceived Consciousness in Large Language Model-based AI: A Quantitative Study of Human Responses |
Kang Bongsu et.al. |
2502.15365 |
null |
2025-02-21 |
Evaluating Social Biases in LLM Reasoning |
Xuyang Wu et.al. |
2502.15361 |
null |
2025-02-21 |
ARS: Automatic Routing Solver with Large Language Models |
Kai Li et.al. |
2502.15359 |
link |
2025-02-21 |
AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms |
Feiyang Chen et.al. |
2502.15349 |
link |
2025-02-21 |
Constructing a Norm for Children’s Scientific Drawing: Distribution Features Based on Semantic Similarity of Large Language Models |
Yi Zhang et.al. |
2502.15348 |
null |
2025-02-21 |
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices |
Lixing Lyu et.al. |
2502.15345 |
null |
2025-02-21 |
Exploring Embodied Multimodal Large Models: Development, Datasets, and Future Directions |
Shoubin Chen et.al. |
2502.15336 |
null |
2025-02-21 |
Stepwise Informativeness Search for Improving LLM Reasoning |
Siyuan Wang et.al. |
2502.15335 |
null |
2025-02-21 |
Attention Eclipse: Manipulating Attention to Bypass LLM Safety-Alignment |
Pedram Zaree et.al. |
2502.15334 |
null |
2025-02-21 |
Detecting Future-related Contexts of Entity Mentions |
Puneet Prashar et.al. |
2502.15332 |
null |
2025-02-21 |
DynamicGSG: Dynamic 3D Gaussian Scene Graphs for Environment Adaptation |
Luzhou Ge et.al. |
2502.15309 |
link |
2025-02-21 |
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention |
Hong Yankun et.al. |
2502.15304 |
null |
2025-02-21 |
Round Attention: A Novel Round-Level Attention Mechanism to Accelerate LLM Inference |
Yaohua Tang et.al. |
2502.15294 |
null |
2025-02-21 |
Bridging Bug Localization and Issue Fixing: A Hierarchical Localization Framework Leveraging Large Language Models |
Jianming Chang et.al. |
2502.15292 |
null |
2025-02-21 |
BundleFlow: Deep Menus for Combinatorial Auctions by Diffusion-Based Optimization |
Tonghan Wang et.al. |
2502.15283 |
null |
2025-02-21 |
A Training-free LLM-based Approach to General Chinese Character Error Correction |
Houquan Zhou et.al. |
2502.15266 |
link |
2025-02-21 |
Retrieval-Augmented Speech Recognition Approach for Domain Challenges |
Peng Shen et.al. |
2502.15264 |
null |
2025-02-21 |
LightMamba: Efficient Mamba Acceleration on FPGA with Quantization and Hardware Co-design |
Renjie Wei et.al. |
2502.15260 |
null |
2025-02-21 |
An approach for API synthesis using large language models |
Hua Zhong et.al. |
2502.15246 |
null |
2025-02-21 |
Comparative Analysis of Large Language Models for Context-Aware Code Completion using SAFIM Framework |
Hang Zhang et.al. |
2502.15243 |
null |
2025-02-21 |
From Documents to Dialogue: Building KG-RAG Enhanced AI Assistants |
Manisha Mukherjee et.al. |
2502.15237 |
null |
2025-02-21 |
A General Pseudonymization Framework for Cloud-Based LLMs: Replacing Privacy Information in Controlled Text Generation |
Shilong Hou et.al. |
2502.15233 |
link |
2025-02-21 |
User Experience with LLM-powered Conversational Recommendation Systems: A Case of Music Recommendation |
Sojeong Yun et.al. |
2502.15229 |
null |
2025-02-21 |
Understand User Opinions of Large Language Models via LLM-Powered In-the-Moment User Experience Interviews |
Mengqiao Liu et.al. |
2502.15226 |
null |
2025-02-21 |
Auto-Bench: An Automated Benchmark for Scientific Discovery in LLMs |
Tingting Chen et.al. |
2502.15224 |
null |
2025-02-21 |
FormalSpecCpp: A Dataset of C++ Formal Specifications created using LLMs |
Madhurima Chakraborty et.al. |
2502.15217 |
link |
2025-02-21 |
The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning |
Sheila Schoepp et.al. |
2502.15214 |
null |
2025-02-21 |
Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing |
Zhilin Wang et.al. |
2502.15208 |
null |
2025-02-21 |
Lung-DDPM: Semantic Layout-guided Diffusion Models for Thoracic CT Image Synthesis |
Yifan Jiang et.al. |
2502.15204 |
link |
2025-02-21 |
TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding |
Zhaoxuan Wu et.al. |
2502.15197 |
null |
2025-02-21 |
LEDD: Large Language Model-Empowered Data Discovery in Data Lakes |
Qi An et.al. |
2502.15182 |
null |
2025-02-21 |
Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders |
Weiqiao Shan et.al. |
2502.15178 |
null |
2025-02-21 |
Methods and Trends in Detecting Generated Images: A Comprehensive Review |
Arpan Mahara et.al. |
2502.15176 |
null |
2025-02-21 |
M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment |
Chuan Cui et.al. |
2502.15167 |
null |
2025-02-21 |
Extreme Speech Classification in the Era of LLMs: Exploring Open-Source and Proprietary Models |
Sarthak Mahajan et.al. |
2502.15155 |
null |
2025-02-21 |
Investigating the Adaptive Robustness with Knowledge Conflicts in LLM-based Multi-Agent Systems |
Tianjie Ju et.al. |
2502.15153 |
link |
2025-02-21 |
Do LLMs Make Mistakes Like Students? Exploring Natural Alignment between Language Models and Human Error Patterns |
Naiming Liu et.al. |
2502.15140 |
null |
2025-02-21 |
Chain-of-Rank: Enhancing Large Language Models for Domain-Specific RAG in Edge Device |
Juntae Lee et.al. |
2502.15134 |
null |
2025-02-21 |
TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba |
Xiuwei Chen et.al. |
2502.15130 |
null |
2025-02-20 |
LUME: LLM Unlearning with Multitask Evaluations |
Anil Ramakrishna et.al. |
2502.15097 |
null |
2025-02-20 |
Detecting Student Intent for Chat-Based Intelligent Tutoring Systems |
Ella Cutler et.al. |
2502.15096 |
null |
2025-02-20 |
Judging It, Washing It: Scoring and Greenwashing Corporate Climate Disclosures using Large Language Models |
Marianne Chuang et.al. |
2502.15094 |
null |
2025-02-20 |
Optimizing Singular Spectrum for Large Language Model Compression |
Dengjie Li et.al. |
2502.15092 |
null |
2025-02-20 |
Analyze the Neurons, not the Embeddings: Understanding When and Where LLM Representations Align with Humans |
Masha Fedzechkina et.al. |
2502.15090 |
null |
2025-02-20 |
Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models |
Yeonjun In et.al. |
2502.15086 |
link |
2025-02-20 |
LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention |
Shang Yang et.al. |
2502.14866 |
link |
2025-02-20 |
Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning |
Shuyue Stella Li et.al. |
2502.14860 |
link |
2025-02-20 |
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling |
Weilin Zhao et.al. |
2502.14856 |
null |
2025-02-20 |
Prompt-to-Leaderboard |
Evan Frick et.al. |
2502.14855 |
link |
2025-02-20 |
GATE: Graph-based Adaptive Tool Evolution Across Diverse Tasks |
Jianwen Luo et.al. |
2502.14848 |
link |
2025-02-20 |
Red-Teaming LLM Multi-Agent Systems via Communication Attacks |
Pengfei He et.al. |
2502.14847 |
null |
2025-02-20 |
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation |
Yue Yang et.al. |
2502.14846 |
null |
2025-02-20 |
Revealing and Mitigating Over-Attention in Knowledge Editing |
Pinzheng Wang et.al. |
2502.14838 |
link |
2025-02-20 |
Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs |
Danni Liu et.al. |
2502.14830 |
link |
2025-02-20 |
A Survey of Model Architectures in Information Retrieval |
Zhichao Xu et.al. |
2502.14822 |
null |
2025-02-20 |
eC-Tab2Text: Aspect-Based Text Generation from e-Commerce Product Tables |
Luis Antonio Gutiérrez Guanilo et.al. |
2502.14820 |
null |
2025-02-20 |
Dynamic Low-Rank Sparse Adaptation for Large Language Models |
Weizhong Huang et.al. |
2502.14816 |
link |
2025-02-20 |
FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis |
Fadillah Maani et.al. |
2502.14807 |
link |
2025-02-20 |
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models |
Bernal Jiménez Gutiérrez et.al. |
2502.14802 |
link |
2025-02-20 |
A Multi-Agent Perspective on Modern Information Retrieval |
Haya Nachimovsky et.al. |
2502.14796 |
null |
2025-02-20 |
Rapid Word Learning Through Meta In-Context Learning |
Wentao Wang et.al. |
2502.14791 |
null |
2025-02-20 |
DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models |
Hongji Yang et.al. |
2502.14779 |
null |
2025-02-20 |
SurveyX: Academic Survey Automation via Large Language Models |
Xun Liang et.al. |
2502.14776 |
null |
2025-02-20 |
Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective |
Weizhong Huang et.al. |
2502.14770 |
null |
2025-02-20 |
Tree-of-Debate: Multi-Persona Debate Trees Elicit Critical Thinking for Scientific Comparative Analysis |
Priyanka Kargupta et.al. |
2502.14767 |
link |
2025-02-20 |
EquivaMap: Leveraging LLMs for Automatic Equivalence Checking of Optimization Formulations |
Haotian Zhai et.al. |
2502.14760 |
link |
2025-02-20 |
On the Influence of Context Size and Model Choice in Retrieval-Augmented Generation Systems |
Juraj Vladika et.al. |
2502.14759 |
link |
2025-02-20 |
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators |
Jianling Li et.al. |
2502.14752 |
link |
2025-02-20 |
Large Language Models Struggle to Describe the Haystack without Human Help: Human-in-the-loop Evaluation of LLMs |
Zongxia Li et.al. |
2502.14748 |
null |
2025-02-20 |
Multi-Agent Coordination across Diverse Applications: A Survey |
Lijun Sun et.al. |
2502.14743 |
null |
2025-02-20 |
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines |
M-A-P Team et.al. |
2502.14739 |
null |
2025-02-20 |
EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration |
Minjie Hong et.al. |
2502.14735 |
null |
2025-02-20 |
WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models |
Yifu Chen et.al. |
2502.14727 |
null |
2025-02-20 |
I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search |
Zujie Liang et.al. |
2502.14693 |
null |
2025-02-20 |
Bridging the Gap: Transforming Natural Language Questions into SQL Queries via Abstract Query Pattern and Contextual Schema Markup |
Yonghui Kong et.al. |
2502.14682 |
null |
2025-02-20 |
How to Get Your LLM to Generate Challenging Problems for Evaluation |
Arkil Patel et.al. |
2502.14678 |
link |
2025-02-20 |
Data-Constrained Synthesis of Training Data for De-Identification |
Thomas Vakili et.al. |
2502.14677 |
null |
2025-02-20 |
Explanations of Deep Language Models Explain Language Representations in the Brain |
Maryam Rahimi et.al. |
2502.14671 |
null |
2025-02-20 |
AlphaMaze: Enhancing Large Language Models’ Spatial Intelligence via GRPO |
Alan Dao et.al. |
2502.14669 |
link |
2025-02-20 |
Beyond the Surface: Uncovering Implicit Locations with LLMs for Personalized Local News |
Gali Katz et.al. |
2502.14660 |
null |
2025-02-20 |
Edit Once, Update Everywhere: A Simple Framework for Cross-Lingual Knowledge Synchronization in LLMs |
Yuchen Wu et.al. |
2502.14645 |
null |
2025-02-20 |
LIFT: Improving Long Context Understanding of Large Language Models through Long Input Fine-Tuning |
Yansheng Mao et.al. |
2502.14644 |
null |
2025-02-20 |
Length-Controlled Margin-Based Preference Optimization without Reference Model |
Gengxu Li et.al. |
2502.14643 |
link |
2025-02-20 |
ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation |
Angxiao Yue et.al. |
2502.14637 |
link |
2025-02-20 |
CER: Confidence Enhanced Reasoning in LLMs |
Ali Razghandi et.al. |
2502.14634 |
link |
2025-02-20 |
Augmenting Coaching with GenAI: Insights into Use, Effectiveness, and Future Potential |
Jennifer Haase et.al. |
2502.14632 |
null |
2025-02-20 |
Synergistic Fusion of Multi-Source Knowledge via Evidence Theory for High-Entropy Alloy Discovery |
Minh-Quyet Ha et.al. |
2502.14631 |
null |
2025-02-20 |
PEARL: Towards Permutation-Resilient LLMs |
Liang Chen et.al. |
2502.14628 |
link |
2025-02-20 |
Reward Models Identify Consistency, Not Causality |
Yuhui Xu et.al. |
2502.14619 |
null |
2025-02-20 |
Serving Models, Fast and Slow: Optimizing Heterogeneous LLM Inferencing Workloads at Scale |
Shashwat Jaiswal et.al. |
2502.14617 |
null |
2025-02-20 |
FIND: Fine-grained Information Density Guided Adaptive Retrieval-Augmented Generation for Disease Diagnosis |
Mingyi Jia et.al. |
2502.14614 |
null |
2025-02-20 |
Behavioral Analysis of Information Salience in Large Language Models |
Jan Trienes et.al. |
2502.14613 |
link |
2025-02-20 |
“Don’t Forget the Teachers”: Towards an Educator-Centered Understanding of Harms from Large Language Models in Education |
Emma Harvey et.al. |
2502.14592 |
null |
2025-02-20 |
Vision Foundation Models in Medical Image Analysis: Advances and Challenges |
Pengchen Liang et.al. |
2502.14584 |
null |
2025-02-20 |
A Theory for Conditional Generative Modeling on Multiple Data Sources |
Rongzhen Wang et.al. |
2502.14583 |
link |
2025-02-20 |
ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification |
Hyunseok Lee et.al. |
2502.14565 |
null |
2025-02-20 |
Plan-over-Graph: Towards Parallelable LLM Agent Schedule |
Shiqi Zhang et.al. |
2502.14563 |
link |
2025-02-20 |
Can LLMs Predict Citation Intent? An Experimental Analysis of In-context Learning and Fine-tuning on Open LLMs |
Paris Koloveas et.al. |
2502.14561 |
link |
2025-02-20 |
Less is More: Improving LLM Alignment via Preference Data Selection |
Xun Deng et.al. |
2502.14560 |
null |
2025-02-20 |
Multiscale Byte Language Models – A Hierarchical Architecture for Causal Million-Length Sequence Modeling |
Eric Egli et.al. |
2502.14553 |
link |
2025-02-20 |
Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks |
Maya Bechler-Speicher et.al. |
2502.14546 |
null |
2025-02-20 |
LLM-based User Profile Management for Recommender System |
Seunghwan Bang et.al. |
2502.14541 |
null |
2025-02-20 |
LoRA-GGPO: Mitigating Double Descent in LoRA Fine-Tuning via Gradient-Guided Perturbation Optimization |
Yupeng Chang et.al. |
2502.14538 |
link |
2025-02-20 |
CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models |
Zhenhong Zhou et.al. |
2502.14529 |
link |
2025-02-20 |
Generative adversarial networks vs large language models: a comparative study on synthetic tabular data generation |
Austin A. Barr et.al. |
2502.14523 |
link |
2025-02-20 |
Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases |
Rena Gao et.al. |
2502.14507 |
link |
2025-02-20 |
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? |
Sergey Pletenev et.al. |
2502.14502 |
link |
2025-02-20 |
MLGym: A New Framework and Benchmark for Advancing AI Research Agents |
Deepak Nathani et.al. |
2502.14499 |
null |
2025-02-20 |
StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following |
Jinnan Li et.al. |
2502.14494 |
link |
2025-02-20 |
How Jailbreak Defenses Work and Ensemble? A Mechanistic Investigation |
Zhuohang Long et.al. |
2502.14486 |
null |
2025-02-20 |
NLoRA: Nyström-Initiated Low-Rank Adaptation for Large Language Models |
Chenlu Guo et.al. |
2502.14482 |
link |
2025-02-20 |
Unshackling Context Length: An Efficient Selective Attention Approach through Query-Key Compression |
Haoyu Wang et.al. |
2502.14477 |
null |
2025-02-20 |
Argument-Based Comparative Question Answering Evaluation Benchmark |
Irina Nikishina et.al. |
2502.14476 |
null |
2025-02-20 |
Enhancing Smart Environments with Context-Aware Chatbots using Large Language Models |
Aurora Polo-Rodríguez et.al. |
2502.14469 |
null |
2025-02-20 |
Narrative-Driven Travel Planning: Geoculturally-Grounded Script Generation with Evolutionary Itinerary Optimization |
Ran Ding et.al. |
2502.14456 |
link |
2025-02-20 |
Optimal word order for non-causal text generation with Large Language Models: the Spanish case |
Andrea Busto-Castiñeira et.al. |
2502.14451 |
null |
2025-02-20 |
LLM4FaaS: No-Code Application Development using LLMs and FaaS |
Minghe Wang et.al. |
2502.14450 |
null |
2025-02-20 |
PredictaBoard: Benchmarking LLM Score Predictability |
Lorenzo Pacchiardi et.al. |
2502.14445 |
link |
2025-02-20 |
Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models |
Artem Vazhentsev et.al. |
2502.14427 |
link |
2025-02-20 |
A Survey on Data Contamination for Large Language Models |
Yuxing Cheng et.al. |
2502.14425 |
link |
2025-02-20 |
ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model |
Zhongyi Zhou et.al. |
2502.14420 |
null |
2025-02-20 |
Towards Efficient Automatic Self-Pruning of Large Language Models |
Weizhong Huang et.al. |
2502.14413 |
null |
2025-02-20 |
Evaluating Precise Geolocation Inference Capabilities of Vision Language Models |
Neel Jay et.al. |
2502.14412 |
link |
2025-02-20 |
Unstructured Evidence Attribution for Long Context Query Focused Summarization |
Dustin Wright et.al. |
2502.14409 |
link |
2025-02-20 |
HPS: Hard Preference Sampling for Human Preference Alignment |
Xiandong Zou et.al. |
2502.14400 |
null |
2025-02-20 |
Enhancing Portuguese Variety Identification with Cross-Domain Approaches |
Hugo Sousa et.al. |
2502.14394 |
null |
2025-02-20 |
Leveraging Small LLMs for Argument Mining in Education: Argument Component Identification, Classification, and Assessment |
Lucile Favero et.al. |
2502.14389 |
null |
2025-02-20 |
S*: Test Time Scaling for Code Generation |
Dacheng Li et.al. |
2502.14382 |
link |
2025-02-20 |
PPO-MI: Efficient Black-Box Model Inversion via Proximal Policy Optimization |
Xinpeng Shou et.al. |
2502.14370 |
null |
2025-02-20 |
Entropy-UID: A Method for Optimizing Information Density |
Xinpeng Shou et.al. |
2502.14366 |
null |
2025-02-20 |
Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning |
Jiachen Zhu et.al. |
2502.14361 |
null |
2025-02-20 |
SR-LLM: Rethinking the Structured Representation in Large Language Model |
Jiahuan Zhang et.al. |
2502.14352 |
null |
2025-02-20 |
SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images |
Yichi Zhang et.al. |
2502.14351 |
link |
2025-02-20 |
FlowAgent: Achieving Compliance and Flexibility for Workflow Agents |
Yuchen Shi et.al. |
2502.14345 |
link |
2025-02-20 |
Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective |
Ruichen Shao et.al. |
2502.14340 |
link |
2025-02-20 |
A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics |
Ting-Ruen Wei et.al. |
2502.14333 |
null |
2025-02-20 |
SolSearch: An LLM-Driven Framework for Efficient SAT-Solving Code Generation |
Junjie Sheng et.al. |
2502.14328 |
null |
2025-02-20 |
ChemHTS: Hierarchical Tool Stacking for Enhancing Chemical Agents |
Zhucong Li et.al. |
2502.14327 |
link |
2025-02-20 |
Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems |
Bingyu Yan et.al. |
2502.14321 |
null |
2025-02-20 |
Line Goes Up? Inherent Limitations of Benchmarks for Evaluating Large Language Models |
James Fodor et.al. |
2502.14318 |
null |
2025-02-20 |
ParallelComp: Parallel Long-Context Compressor for Length Extrapolation |
Jing Xiong et.al. |
2502.14317 |
null |
2025-02-20 |
Unveiling Cultural Blind Spots: Analyzing the Limitations of mLLMs in Procedural Text Comprehension |
Amir Hossein Yari et.al. |
2502.14315 |
null |
2025-02-20 |
Efficient AI in Practice: Training and Deployment of Efficient LLMs for Industry Applications |
Kayhan Behdin et.al. |
2502.14305 |
null |
2025-02-20 |
MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models |
Shrey Pandit et.al. |
2502.14302 |
null |
2025-02-20 |
SEA-HELM: Southeast Asian Holistic Evaluation of Language Models |
Yosephine Susanto et.al. |
2502.14301 |
null |
2025-02-19 |
Where’s the Bug? Attention Probing for Scalable Fault Localization |
Adam Stein et.al. |
2502.13966 |
null |
2025-02-19 |
Autellix: An Efficient Serving Engine for LLM Agents as General Programs |
Michael Luo et.al. |
2502.13965 |
null |
2025-02-19 |
MuDAF: Long-Context Multi-Document Attention Focusing through Contrastive Learning on Attention Heads |
Weihao Liu et.al. |
2502.13963 |
link |
2025-02-19 |
Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering |
William Jurayj et.al. |
2502.13962 |
null |
2025-02-19 |
LIDDIA: Language-based Intelligent Drug Discovery Agent |
Reza Averly et.al. |
2502.13959 |
null |
2025-02-19 |
Neurosymbolic artificial intelligence via large language models and coherence-driven inference |
Steve Huntsman et.al. |
2502.13953 |
null |
2025-02-19 |
Why Safeguarded Ships Run Aground? Aligned Large Language Models’ Safety Mechanisms Tend to Be Anchored in The Template Region |
Chak Tou Leong et.al. |
2502.13946 |
null |
2025-02-19 |
Image compositing is all you need for data augmentation |
Ang Jia Ning Shermaine et.al. |
2502.13936 |
null |
2025-02-19 |
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization |
Guanzheng Chen et.al. |
2502.13922 |
link |
2025-02-19 |
Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis |
Jiahao Gai et.al. |
2502.13921 |
null |
2025-02-19 |
Exploring Personalized Health Support through Data-Driven, Theory-Guided LLMs: A Case Study in Sleep Health |
Xingbo Wang et.al. |
2502.13920 |
link |
2025-02-19 |
How Do LLMs Perform Two-Hop Reasoning in Context? |
Tianyu Guo et.al. |
2502.13913 |
null |
2025-02-19 |
Lost in Sequence: Do Large Language Models Understand Sequential Recommendation? |
Sein Kim et.al. |
2502.13909 |
link |
2025-02-19 |
Judging the Judges: A Collection of LLM-Generated Relevance Judgements |
Hossein A. Rahmani et.al. |
2502.13908 |
link |
2025-02-19 |
DataSciBench: An LLM Agent Benchmark for Data Science |
Dan Zhang et.al. |
2502.13897 |
link |
2025-02-19 |
NavigateDiff: Visual Predictors are Zero-Shot Navigation Assistants |
Yiran Qin et.al. |
2502.13894 |
null |
2025-02-19 |
Refining embeddings with fill-tuning: data-efficient generalised performance improvements for materials foundation models |
Matthew P. Wilson et.al. |
2502.13886 |
link |
2025-02-19 |
SPEX: Scaling Feature Interaction Explanations for LLMs |
Justin Singh Kang et.al. |
2502.13870 |
link |
2025-02-19 |
MagicGeo: Training-Free Text-Guided Geometric Diagram Generation |
Junxiao Wang et.al. |
2502.13855 |
null |
2025-02-19 |
Enhancing LLM-Based Recommendations Through Personalized Reasoning |
Jiahao Liu et.al. |
2502.13845 |
link |
2025-02-19 |
Enhancing Cross-Domain Recommendations with Memory-Optimized LLM-Based User Agents |
Jiahao Liu et.al. |
2502.13843 |
link |
2025-02-19 |
Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking |
Yilong Chen et.al. |
2502.13842 |
null |
2025-02-19 |
Quantifying Memorization and Retriever Performance in Retrieval-Augmented Vision-Language Models |
Peter Carragher et.al. |
2502.13836 |
null |
2025-02-19 |
Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning |
Zenan Li et.al. |
2502.13834 |
link |
2025-02-19 |
ArtMentor: AI-Assisted Evaluation of Artworks to Explore Multimodal Large Language Models Capabilities |
Chanjin Zheng et.al. |
2502.13832 |
link |
2025-02-19 |
LESA: Learnable LLM Layer Scaling-Up |
Yifei Yang et.al. |
2502.13794 |
link |
2025-02-19 |
From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions |
Nathanaël Carraz Rakotonirina et.al. |
2502.13791 |
link |
2025-02-19 |
From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education |
Yi-Fan Zhang et.al. |
2502.13789 |
null |
2025-02-19 |
Helix-mRNA: A Hybrid Foundation Model For Full Sequence mRNA Therapeutics |
Matthew Wood et.al. |
2502.13785 |
link |
2025-02-19 |
Generative Large Recommendation Models: Emerging Trends in LLMs for Recommendation |
Hao Wang et.al. |
2502.13783 |
null |
2025-02-19 |
Translation in the Hands of Many: Centering Lay Users in Machine Translation Interactions |
Beatrice Savoldi et.al. |
2502.13780 |
null |
2025-02-19 |
VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare |
Anudeex Shetty et.al. |
2502.13775 |
null |
2025-02-19 |
AI Software Engineer: Programming with Trust |
Abhik Roychoudhury et.al. |
2502.13767 |
null |
2025-02-19 |
SCALAR: Scientific Citation-based Live Assessment of Long-context Academic Reasoning |
Renxi Wang et.al. |
2502.13753 |
link |
2025-02-19 |
Reverse Markov Learning: Multi-Step Generative Models for Complex Distributions |
Xinwei Shen et.al. |
2502.13747 |
null |
2025-02-19 |
Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding |
Keqin Peng et.al. |
2502.13738 |
null |
2025-02-19 |
CARE: Confidence-Aware Regression Estimation of building density fine-tuning EO Foundation Models |
Nikolaos Dionelis et.al. |
2502.13734 |
null |
2025-02-19 |
Adapting Large Language Models for Time Series Modeling via a Novel Parameter-efficient Adaptation Method |
Juyuan Zhang et.al. |
2502.13725 |
null |
2025-02-19 |
Direct Value Optimization: Improving Chain-of-Thought Reasoning in LLMs with Refined Values |
Hongbo Zhang et.al. |
2502.13723 |
null |
2025-02-19 |
TALKPLAY: Multimodal Music Recommendation with Large Language Models |
Seungheon Doh et.al. |
2502.13713 |
null |
2025-02-19 |
Is This Collection Worth My LLM’s Time? Automatically Measuring Information Potential in Text Corpora |
Tristan Karch et.al. |
2502.13691 |
null |
2025-02-19 |
An LLM-based Agent for Reliable Docker Environment Configuration |
Ruida Hu et.al. |
2502.13681 |
link |
2025-02-19 |
SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation |
Song Duong et.al. |
2502.13674 |
null |
2025-02-19 |
Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models |
Liyang He et.al. |
2502.13656 |
link |
2025-02-19 |
C2T: A Classifier-Based Tree Construction Method in Speculative Decoding |
Feiye Huo et.al. |
2502.13652 |
null |
2025-02-19 |
Reliability Across Parametric and External Knowledge: Understanding Knowledge Handling in LLMs |
Youna Kim et.al. |
2502.13648 |
null |
2025-02-19 |
D.Va: Validate Your Demonstration First Before You Use It |
Qi Zhang et.al. |
2502.13646 |
null |
2025-02-19 |
Qorgau: Evaluating LLM Safety in Kazakh-Russian Bilingual Contexts |
Maiya Goloburda et.al. |
2502.13640 |
null |
2025-02-19 |
Concept Layers: Enhancing Interpretability and Intervenability via LLM Conceptualization |
Or Raphael Bidusa et.al. |
2502.13632 |
null |
2025-02-19 |
AI-Empowered Catalyst Discovery: A Survey from Classical Machine Learning Approaches to Large Language Models |
Yuanyuan Xu et.al. |
2502.13626 |
null |
2025-02-19 |
REFIND: Retrieval-Augmented Factuality Hallucination Detection in Large Language Models |
DongGeon Lee et.al. |
2502.13622 |
null |
2025-02-19 |
Complex Ontology Matching with Large Language Model Embeddings |
Guilherme Sousa et.al. |
2502.13619 |
null |
2025-02-19 |
LaVCa: LLM-assisted Visual Cortex Captioning |
Takuya Matsuyama et.al. |
2502.13606 |
null |
2025-02-19 |
BeamLoRA: Beam-Constraint Low-Rank Adaptation |
Naibin Gu et.al. |
2502.13604 |
null |
2025-02-19 |
MMTEB: Massive Multilingual Text Embedding Benchmark |
Kenneth Enevoldsen et.al. |
2502.13595 |
link |
2025-02-19 |
Don’t Stop the Multi-Party! On Generating Synthetic Multi-Party Conversations with Constraints |
Nicolò Penzo et.al. |
2502.13592 |
link |
2025-02-19 |
Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts |
Xin Li et.al. |
2502.13577 |
null |
2025-02-19 |
LSR-Adapt: Ultra-Efficient Parameter Tuning with Matrix Low Separation Rank Kernel Adaptation |
Xin Li et.al. |
2502.13568 |
null |
2025-02-19 |
Extracting Social Connections from Finnish Karelian Refugee Interviews Using LLMs |
Joonatan Laato et.al. |
2502.13566 |
null |
2025-02-19 |
PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models |
Guangwei Li et.al. |
2502.13564 |
link |
2025-02-19 |
Are Large Language Models In-Context Graph Learners? |
Jintang Li et.al. |
2502.13562 |
null |
2025-02-19 |
Democratizing Large Language Model-Based Graph Data Augmentation via Latent Knowledge Graphs |
Yushi Feng et.al. |
2502.13555 |
link |
2025-02-19 |
STaR-SQL: Self-Taught Reasoner for Text-to-SQL |
Mingqian He et.al. |
2502.13550 |
null |
2025-02-19 |
Detecting Linguistic Bias in Government Documents Using Large language Models |
Milena de Swart et.al. |
2502.13548 |
null |
2025-02-19 |
From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MARKERGEN |
Peiwen Yuan et.al. |
2502.13544 |
null |
2025-02-19 |
Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference |
Qingfa Xiao et.al. |
2502.13542 |
null |
2025-02-19 |
Bursting Filter Bubble: Enhancing Serendipity Recommendations with Aligned Large Language Models |
Yunjia Xi et.al. |
2502.13539 |
null |
2025-02-19 |
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models |
Jun Zhang et.al. |
2502.13533 |
link |
2025-02-19 |
Exploiting Prefix-Tree in Structured Output Interfaces for Enhancing Jailbreak Attacking |
Yanzeng Li et.al. |
2502.13527 |
link |
2025-02-19 |
SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin |
Hao Yi et.al. |
2502.13516 |
null |
2025-02-19 |
Unlocking Multimodal Integration in EHRs: A Prompt Learning Framework for Language and Time Series Fusion |
Shuai Niu et.al. |
2502.13509 |
null |
2025-02-19 |
Reproducing NevIR: Negation in Neural Information Retrieval |
Coen van Elsen et.al. |
2502.13506 |
link |
2025-02-19 |
PLDR-LLMs Learn A Generalizable Tensor Operator That Can Replace Its Own Deep Neural Net At Inference |
Burc Gokden et.al. |
2502.13502 |
link |
2025-02-19 |
Towards Geo-Culturally Grounded LLM Generations |
Piyawat Lertvittayakumjorn et.al. |
2502.13497 |
null |
2025-02-19 |
What are Models Thinking about? Understanding Large Language Model Hallucinations “Psychology” through Model Inner State Analysis |
Peiran Wang et.al. |
2502.13490 |
null |
2025-02-19 |
LLM4Tag: Automatic Tagging System for Information Retrieval via Large Language Models |
Ruiming Tang et.al. |
2502.13481 |
null |
2025-02-19 |
Integration of Agentic AI with 6G Networks for Mission-Critical Applications: Use-case and Challenges |
Sunder Ali Khowaja et.al. |
2502.13476 |
null |
2025-02-19 |
LLM should think and action as a human |
Haun Leung et.al. |
2502.13475 |
null |
2025-02-19 |
Towards Lightweight, Adaptive and Attribute-Aware Multi-Aspect Controllable Text Generation with Large Language Models |
Chenyu Zhu et.al. |
2502.13474 |
null |
2025-02-19 |
ThinkGuard: Deliberative Slow Thinking Leads to Cautious Guardrails |
Xiaofei Wen et.al. |
2502.13458 |
link |
2025-02-19 |
Interleaved Gibbs Diffusion for Constrained Generation |
Gautham Govind Anil et.al. |
2502.13450 |
null |
2025-02-19 |
Enhancing Chest X-ray Classification through Knowledge Injection in Cross-Modality Learning |
Yang Yan et.al. |
2502.13447 |
null |
2025-02-19 |
TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation |
Jialin Ouyang et.al. |
2502.13442 |
link |
2025-02-19 |
The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding? |
Yutao Sun et.al. |
2502.13441 |
null |
2025-02-19 |
MATS: An Audio Language Model under Text-only Supervision |
Wen Wang et.al. |
2502.13433 |
null |
2025-02-19 |
Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning |
Hao Ma et.al. |
2502.13430 |
null |
2025-02-19 |
MCTS-KBQA: Monte Carlo Tree Search for Knowledge Base Question Answering |
Guanming Xiong et.al. |
2502.13428 |
null |
2025-02-19 |
TabSD: Large Free-Form Table Question Answering with SQL-Based Table Decomposition |
Yuxiang Wang et.al. |
2502.13422 |
null |
2025-02-19 |
RLTHF: Targeted Human Feedback for LLM Alignment |
Yifei Xu et.al. |
2502.13417 |
null |
2025-02-19 |
Detecting LLM Fact-conflicting Hallucinations Enhanced by Temporal-logic-based Reasoning |
Ningke Li et.al. |
2502.13416 |
null |
2025-02-19 |
Explore-Construct-Filter: An Automated Framework for Rich and Reliable API Knowledge Graph Construction |
Yanbang Sun et.al. |
2502.13412 |
null |
2025-02-19 |
Generative Predictive Control: Flow Matching Policies for Dynamic and Difficult-to-Demonstrate Tasks |
Vince Kurtz et.al. |
2502.13406 |
null |
2025-02-19 |
$\mathtt{GeLLM^3O}$: Generalizing Large Language Models for Multi-property Molecule Optimization |
Vishal Dey et.al. |
2502.13398 |
link |
2025-02-19 |
Prompting a Weighting Mechanism into LLM-as-a-Judge in Two-Step: A Case Study |
Wenwen Xie et.al. |
2502.13396 |
null |
2025-02-19 |
Flow-based generative models as iterative algorithms in probability space |
Yao Xie et.al. |
2502.13394 |
null |
2025-02-19 |
Reasoning with Reinforced Functional Token Tuning |
Kongcheng Zhang et.al. |
2502.13389 |
link |
2025-02-19 |
Reflection of Episodes: Learning to Play Game from Expert and Self Experiences |
Xiaojie Xu et.al. |
2502.13388 |
null |
2025-02-19 |
MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification |
Linzhuang Sun et.al. |
2502.13383 |
link |
2025-02-19 |
AutoTEE: Automated Migration and Protection of Programs in Trusted Execution Environments |
Ruidong Han et.al. |
2502.13379 |
link |
2025-02-19 |
Task-agnostic Prompt Compression with Context-aware Sentence Embedding and Reward-guided Task Descriptor |
Barys Liskavets et.al. |
2502.13374 |
null |
2025-02-18 |
Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization |
Shuo Xing et.al. |
2502.13146 |
link |
2025-02-18 |
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation |
Bencheng Liao et.al. |
2502.13145 |
link |
2025-02-18 |
Pre-training Auto-regressive Robotic Models with 4D Representations |
Dantong Niu et.al. |
2502.13142 |
null |
2025-02-18 |
UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models |
Huawei Lin et.al. |
2502.13141 |
link |
2025-02-18 |
AIDE: AI-Driven Exploration in the Space of Code |
Zhengyao Jiang et.al. |
2502.13138 |
link |
2025-02-18 |
Theorem Prover as a Judge for Synthetic Data Generation |
Joshua Ong Jun Leang et.al. |
2502.13137 |
null |
2025-02-18 |
AV-Flow: Transforming Text to Audio-Visual Human-like Interactions |
Aggelina Chatziagapi et.al. |
2502.13133 |
null |
2025-02-18 |
Learning to Defer for Causal Discovery with Imperfect Experts |
Oscar Clivio et.al. |
2502.13132 |
null |
2025-02-18 |
Rethinking Diverse Human Preference Learning through Principal Component Analysis |
Feng Luo et.al. |
2502.13131 |
null |
2025-02-18 |
Magma: A Foundation Model for Multimodal AI Agents |
Jianwei Yang et.al. |
2502.13130 |
link |
2025-02-18 |
Is Noise Conditioning Necessary for Denoising Generative Models? |
Qiao Sun et.al. |
2502.13129 |
null |
2025-02-18 |
Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning |
Jingyang Lin et.al. |
2502.13127 |
null |
2025-02-18 |
RuozhiBench: Evaluating LLMs with Logical Fallacies and Misleading Premises |
Zenan Zhai et.al. |
2502.13125 |
link |
2025-02-18 |
Adapting Psycholinguistic Research for LLMs: Gender-inclusive Language in a Coreference Context |
Marion Bartl et.al. |
2502.13120 |
null |
2025-02-18 |
STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models |
Narun Raman et.al. |
2502.13119 |
null |
2025-02-18 |
Performance Evaluation of Large Language Models in Statistical Programming |
Xinyi Song et.al. |
2502.13117 |
link |
2025-02-18 |
MatterChat: A Multi-Modal LLM for Material Science |
Yingheng Tang et.al. |
2502.13107 |
null |
2025-02-18 |
Text2World: Benchmarking Large Language Models for Symbolic World Model Generation |
Mengkang Hu et.al. |
2502.13092 |
null |
2025-02-18 |
A Neural Difference-of-Entropies Estimator for Mutual Information |
Haoran Ni et.al. |
2502.13085 |
null |
2025-02-18 |
Personalized Image Generation with Deep Generative Models: A Decade Survey |
Yuxiang Wei et.al. |
2502.13081 |
link |
2025-02-18 |
SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models |
Xianfu Cheng et.al. |
2502.13059 |
null |
2025-02-18 |
LAMD: Context-driven Android Malware Detection and Classification with LLMs |
Xingzhi Qian et.al. |
2502.13055 |
null |
2025-02-18 |
Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction |
Nils Constantin Hellwig et.al. |
2502.13044 |
null |
2025-02-18 |
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators |
Bosi Wen et.al. |
2502.13031 |
null |
2025-02-18 |
A deep learning framework for efficient pathology image analysis |
Peter Neidlinger et.al. |
2502.13027 |
null |
2025-02-18 |
Agentic Deep Graph Reasoning Yields Self-Organizing Knowledge Networks |
Markus J. Buehler et.al. |
2502.13025 |
link |
2025-02-18 |
Oreo: A Plug-in Context Reconstructor to Enhance Retrieval-Augmented Generation |
Sha Li et.al. |
2502.13019 |
null |
2025-02-18 |
Towards a Design Guideline for RPA Evaluation: A Survey of Large Language Model-Based Role-Playing Agents |
Chaoran Chen et.al. |
2502.13012 |
null |
2025-02-18 |
Adaptive Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical Knowledge |
Mohammad Reza Rezaei et.al. |
2502.13010 |
null |
2025-02-18 |
You need to MIMIC to get FAME: Solving Meeting Transcript Scarcity with a Multi-Agent Conversations |
Frederic Kirstein et.al. |
2502.13001 |
null |
2025-02-18 |
Personalized Top-k Set Queries Over Predicted Scores |
Sohrab Namazi Nia et.al. |
2502.12998 |
null |
2025-02-18 |
Beyond Profile: From Surface-Level Facts to Deep Persona Simulation in LLMs |
Zixiao Wang et.al. |
2502.12988 |
null |
2025-02-18 |
Towards Variational Flow Matching on General Geometries |
Olga Zaghen et.al. |
2502.12981 |
null |
2025-02-18 |
Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search |
Yifan Ji et.al. |
2502.12974 |
link |
2025-02-18 |
Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking |
Junda Zhu et.al. |
2502.12970 |
link |
2025-02-18 |
Trust Me, I’m Wrong: High-Certainty Hallucinations in LLMs |
Adi Simhi et.al. |
2502.12964 |
null |
2025-02-18 |
Infinite Retrieval: Attention Enhanced LLMs in Long-Context Processing |
Xiaoju Ye et.al. |
2502.12962 |
null |
2025-02-18 |
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger |
Wenjun Li et.al. |
2502.12961 |
null |
2025-02-18 |
Guaranteed Conditional Diffusion: 3D Block-based Models for Scientific Data Compression |
Jaemoon Lee et.al. |
2502.12951 |
null |
2025-02-18 |
Fake It Till You Make It: Using Synthetic Data and Domain Knowledge for Improved Text-Based Learning for LGE Detection |
Athira J Jacob et.al. |
2502.12948 |
null |
2025-02-18 |
Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models |
Gyeongman Kim et.al. |
2502.12947 |
null |
2025-02-18 |
LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation |
Junchen Fu et.al. |
2502.12945 |
null |
2025-02-18 |
Performance of Zero-Shot Time Series Foundation Models on Cloud Data |
William Toner et.al. |
2502.12944 |
link |
2025-02-18 |
Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options |
Lakshmi Nair et.al. |
2502.12929 |
link |
2025-02-18 |
Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts |
Leiyu Pan et.al. |
2502.12928 |
null |
2025-02-18 |
SEFL: Harnessing Large Language Model Agents to Improve Educational Feedback Systems |
Mike Zhang et.al. |
2502.12927 |
link |
2025-02-18 |
Towards more Contextual Agents: An extractor-Generator Optimization Framework |
Mourad Aouini et.al. |
2502.12926 |
null |
2025-02-18 |
Keep what you need: extracting efficient subnetworks from large audio representation models |
David Genova et.al. |
2502.12925 |
link |
2025-02-18 |
Conditioning LLMs to Generate Code-Switched Text: A Methodology Grounded in Naturally Occurring Data |
Maite Heredia et.al. |
2502.12924 |
link |
2025-02-18 |
On-Device LLMs for Home Assistant: Dual Role in Intent Detection and Response Generation |
Rune Birkmose et.al. |
2502.12923 |
link |
2025-02-18 |
Q-STRUM Debate: Query-Driven Contrastive Summarization for Recommendation Comparison |
George-Kirollos Saad et.al. |
2502.12921 |
link |
2025-02-18 |
Lightweight Online Adaption for Time Series Foundation Model Forecasts |
Thomas L. Lee et.al. |
2502.12920 |
null |
2025-02-18 |
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning |
Sifan Zhou et.al. |
2502.12913 |
null |
2025-02-18 |
Probabilistic neural operators for functional uncertainty quantification |
Christopher Bülte et.al. |
2502.12902 |
link |
2025-02-18 |
Soundwave: Less is More for Speech-Text Alignment in LLMs |
Yuhao Zhang et.al. |
2502.12900 |
link |
2025-02-18 |
Multilingual European Language Models: Benchmarking Approaches and Challenges |
Fabio Barth et.al. |
2502.12895 |
null |
2025-02-18 |
CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image |
Kaixin Yao et.al. |
2502.12894 |
null |
2025-02-18 |
Are Multilingual Language Models an Off-ramp for Under-resourced Languages? Will we arrive at Digital Language Equality in Europe in 2030? |
Georg Rehm et.al. |
2502.12886 |
null |
2025-02-18 |
How desirable is alignment between LLMs and linguistically diverse human users? |
Pia Knoeferle et.al. |
2502.12884 |
null |
2025-02-18 |
Continuous Learning Conversational AI: A Personalized Agent Framework via A2C Reinforcement Learning |
Nandakishor M et.al. |
2502.12876 |
null |
2025-02-18 |
RobotIQ: Empowering Mobile Robots with Human-Level Planning for Real-World Execution |
Emmanuel K. Raptis et.al. |
2502.12862 |
link |
2025-02-18 |
PAFT: Prompt-Agnostic Fine-Tuning |
Chenxing Wei et.al. |
2502.12859 |
null |
2025-02-18 |
Rejected Dialects: Biases Against African American Language in Reward Models |
Joel Mire et.al. |
2502.12858 |
link |
2025-02-18 |
MeMo: Towards Language Models with Associative Memory Mechanisms |
Fabio Massimo Zanzotto et.al. |
2502.12851 |
null |
2025-02-18 |
MOLLM: Multi-Objective Large Language Model for Molecular Design – Optimizing with Experts |
Nian Ran et.al. |
2502.12845 |
null |
2025-02-18 |
Towards Adaptive Feedback with AI: Comparing the Feedback Quality of LLMs and Teachers on Experimentation Protocols |
Kathrin Seßler et.al. |
2502.12842 |
null |
2025-02-18 |
Towards Equitable AI: Detecting Bias in Using Large Language Models for Marketing |
Berk Yilmaz et.al. |
2502.12838 |
null |
2025-02-18 |
An LLM-Powered Agent for Physiological Data Analysis: A Case Study on PPG-based Heart Rate Estimation |
Mohammad Feli et.al. |
2502.12836 |
null |
2025-02-18 |
KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan |
Mukhammed Togmanov et.al. |
2502.12829 |
null |
2025-02-18 |
Reasoning and the Trusting Behavior of DeepSeek and GPT: An Experiment Revealing Hidden Fault Lines in Large Language Models |
Rubing Lu et.al. |
2502.12825 |
null |
2025-02-18 |
Pitfalls of Scale: Investigating the Inverse Task of Redefinition in Large Language Models |
Elena Stringli et.al. |
2502.12821 |
null |
2025-02-18 |
Simulating User Diversity in Task-Oriented Dialogue Systems using Large Language Models |
Adnan Ahmad et.al. |
2502.12813 |
null |
2025-02-18 |
Towards Text-Image Interleaved Retrieval |
Xin Zhang et.al. |
2502.12799 |
link |
2025-02-18 |
RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models |
Tanqiu Jiang et.al. |
2502.12794 |
link |
2025-02-18 |
Commonsense Reasoning in Arab Culture |
Abdelrahman Sadallah et.al. |
2502.12788 |
null |
2025-02-18 |
Portable Reward Tuning: Towards Reusable Fine-Tuning across Different Pretrained Models |
Daiki Chijiwa et.al. |
2502.12776 |
null |
2025-02-18 |
How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild |
Saad Obaid ul Islam et.al. |
2502.12769 |
link |
2025-02-18 |
R2-KG: General-Purpose Dual-Agent Framework for Reliable Reasoning on Knowledge Graphs |
Sumin Jo et.al. |
2502.12767 |
link |
2025-02-18 |
One-bit Compressed Sensing using Generative Models |
Swatantra Kafle et.al. |
2502.12762 |
null |
2025-02-18 |
Efficient Machine Translation Corpus Generation: Integrating Human-in-the-Loop Post-Editing with Large Language Models |
Kamer Ali Yuksel et.al. |
2502.12755 |
link |
2025-02-18 |
Architect of the Bits World: Masked Autoregressive Modeling for Circuit Generation Guided by Truth Table |
Haoyuan Wu et.al. |
2502.12751 |
null |
2025-02-18 |
Self-Enhanced Reasoning Training: Activating Latent Reasoning in Small Models for Enhanced Reasoning Distillation |
Yong Zhang et.al. |
2502.12744 |
null |
2025-02-18 |
“I know myself better, but not really greatly”: Using LLMs to Detect and Explain LLM-Generated Texts |
Jiazhou Ji et.al. |
2502.12743 |
null |
2025-02-18 |
Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment |
Haoyuan Wu et.al. |
2502.12732 |
null |
2025-02-18 |
TREND: A Whitespace Replacement Information Hiding Method |
Malte Hellmeier et.al. |
2502.12710 |
null |
2025-02-18 |
Multi-Novelty: Improve the Diversity and Novelty of Contents Generated by Large Language Models via inference-time Multi-Views Brainstorming |
Arash Lagzian et.al. |
2502.12700 |
null |
2025-02-18 |
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees |
Yongtao Wu et.al. |
2502.12678 |
null |
2025-02-18 |
Baichuan-M1: Pushing the Medical Capability of Large Language Models |
Bingning Wang et.al. |
2502.12671 |
null |
2025-02-18 |
Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research |
Xiang Liu et.al. |
2502.12669 |
null |
2025-02-18 |
Evaluation of Best-of-N Sampling Strategies for Language Model Alignment |
Yuki Ichihara et.al. |
2502.12668 |
null |
2025-02-18 |
A$^2$ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization |
Junhui He et.al. |
2502.12665 |
null |
2025-02-18 |
Demystifying Multilingual Chain-of-Thought in Process Reward Modeling |
Weixuan Wang et.al. |
2502.12663 |
null |
2025-02-18 |
The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1 |
Kaiwen Zhou et.al. |
2502.12659 |
null |
2025-02-18 |
R.R.: Unveiling LLM Training Privacy through Recollection and Ranking |
Wenlong Meng et.al. |
2502.12658 |
link |
2025-02-18 |
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation |
Zhiyuan Liu et.al. |
2502.12638 |
link |
2025-02-18 |
Corrupted but Not Broken: Rethinking the Impact of Corrupted Data in Visual Instruction Tuning |
Yunhao Gou et.al. |
2502.12635 |
null |
2025-02-18 |
One Size doesn’t Fit All: A Personalized Conversational Tutoring Agent for Mathematics Instruction |
Ben Liu et.al. |
2502.12633 |
null |
2025-02-18 |
Automating Prompt Leakage Attacks on Large Language Models Using Agentic Approach |
Tvrtko Sternak et.al. |
2502.12630 |
link |
2025-02-18 |
DeepResonance: Enhancing Multimodal Music Understanding via Music-centric Multi-way Instruction Tuning |
Zhuoyuan Mao et.al. |
2502.12623 |
null |
2025-02-18 |
Improving Chain-of-Thought Reasoning via Quasi-Symbolic Abstractions |
Leonardo Ranaldi et.al. |
2502.12616 |
null |
2025-02-17 |
Idiosyncrasies in Large Language Models |
Mingjie Sun et.al. |
2502.12150 |
link |
2025-02-17 |
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation |
Ling Yang et.al. |
2502.12148 |
link |
2025-02-17 |
Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control |
Jinyan Su et.al. |
2502.12145 |
link |
2025-02-17 |
Small Models Struggle to Learn from Strong Reasoners |
Yuetai Li et.al. |
2502.12143 |
null |
2025-02-17 |
SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs |
Yige Xu et.al. |
2502.12134 |
link |
2025-02-17 |
Transformer Dynamics: A neuroscientific approach to interpretability of large language models |
Jesseba Fernando et.al. |
2502.12131 |
null |
2025-02-17 |
Scaling Autonomous Agents via Automatic Reward Modeling And Planning |
Zhenfang Chen et.al. |
2502.12130 |
null |
2025-02-17 |
LaM-SLidE: Latent Space Modeling of Spatial Dynamical Systems via Linked Entities |
Florian Sestak et.al. |
2502.12128 |
link |
2025-02-17 |
Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA |
Patryk Marszałek et.al. |
2502.12122 |
link |
2025-02-17 |
LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws |
Prasanna Mayilvahanan et.al. |
2502.12120 |
null |
2025-02-17 |
PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection |
Jinhe Bi et.al. |
2502.12119 |
null |
2025-02-17 |
A-MEM: Agentic Memory for LLM Agents |
Wujiang Xu et.al. |
2502.12110 |
link |
2025-02-17 |
Personality Structured Interview for Large Language Model Simulation in Personality Research |
Pengda Wang et.al. |
2502.12109 |
null |
2025-02-17 |
Relational Norms for Human-AI Cooperation |
Brian D. Earp et.al. |
2502.12102 |
null |
2025-02-17 |
Token Communications: A Unified Framework for Cross-modal Context-aware Semantic Communications |
Li Qiao et.al. |
2502.12096 |
null |
2025-02-17 |
How compositional generalization and creativity improve as diffusion models are trained |
Alessandro Favero et.al. |
2502.12089 |
null |
2025-02-17 |
Meta-Statistical Learning: Supervised Learning of Statistical Inference |
Maxime Peyrard et.al. |
2502.12088 |
null |
2025-02-17 |
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs |
Yuxiang Huang et.al. |
2502.12085 |
link |
2025-02-17 |
Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation |
Zhongyi Qiu et.al. |
2502.12073 |
null |
2025-02-17 |
TokenSkip: Controllable Chain-of-Thought Compression in LLMs |
Heming Xia et.al. |
2502.12067 |
link |
2025-02-17 |
CONSTRUCTA: Automating Commercial Construction Schedules in Fabrication Facilities with Large Language Models |
Yifan Zhang et.al. |
2502.12066 |
null |
2025-02-17 |
AI-generated Text Detection with a GLTR-based Approach |
Lucía Yan Wu et.al. |
2502.12064 |
null |
2025-02-17 |
Designing Role Vectors to Improve LLM Inference Behaviour |
Daniele Potertì et.al. |
2502.12055 |
null |
2025-02-17 |
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning |
Xinyu Zhang et.al. |
2502.12054 |
null |
2025-02-17 |
A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond |
Shreya Shukla et.al. |
2502.12048 |
null |
2025-02-17 |
KnowPath: Knowledge-enhanced Reasoning via LLM-generated Inference Paths over Knowledge Graphs |
Qi Zhao et.al. |
2502.12029 |
null |
2025-02-17 |
SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities |
Fengqing Jiang et.al. |
2502.12025 |
null |
2025-02-17 |
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving |
Xin Xu et.al. |
2502.12022 |
null |
2025-02-17 |
Atom of Thoughts for Markov LLM Test-Time Scaling |
Fengwei Teng et.al. |
2502.12018 |
link |
2025-02-17 |
Unsupervised Structural-Counterfactual Generation under Domain Shift |
Krishn Vishwas Kher et.al. |
2502.12013 |
null |
2025-02-17 |
Design Considerations Based on Stability for a Class of TCP Algorithms |
Sreekanth Prabhakar et.al. |
2502.11983 |
null |
2025-02-17 |
Image Inversion: A Survey from GANs to Diffusion and Beyond |
Yinan Chen et.al. |
2502.11974 |
link |
2025-02-17 |
Generating Text from Uniform Meaning Representation |
Emma Markle et.al. |
2502.11973 |
link |
2025-02-17 |
A MIMO Wireless Channel Foundation Model via CIR-CSI Consistency |
Jun Jiang et.al. |
2502.11965 |
null |
2025-02-17 |
Navigating the Helpfulness-Truthfulness Trade-Off with Uncertainty-Aware Instruction Fine-Tuning |
Tianyi Wu et.al. |
2502.11962 |
null |
2025-02-17 |
On Representational Dissociation of Language and Arithmetic in Large Language Models |
Riku Kisako et.al. |
2502.11932 |
null |
2025-02-17 |
GRAPHGPT-O: Synergistic Multimodal Comprehension and Generation on Graphs |
Yi Fang et.al. |
2502.11925 |
null |
2025-02-17 |
From Text to Trust: Empowering AI-assisted Decision Making with Adaptive LLM-powered Analysis |
Zhuoyan Li et.al. |
2502.11919 |
null |
2025-02-17 |
EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models |
Jiamin Su et.al. |
2502.11916 |
link |
2025-02-17 |
Adversarial Alignment for LLMs Requires Simpler, Reproducible, and More Measurable Objectives |
Leo Schwinn et.al. |
2502.11910 |
null |
2025-02-17 |
MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation |
Haochen Xue et.al. |
2502.11903 |
null |
2025-02-17 |
DLFR-VAE: Dynamic Latent Frame Rate VAE for Video Generation |
Zhihang Yuan et.al. |
2502.11897 |
link |
2025-02-17 |
CAMEL: Continuous Action Masking Enabled by Large Language Models for Reinforcement Learning |
Yanxiao Zhao et.al. |
2502.11896 |
null |
2025-02-17 |
Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models? |
Jacob Nielsen et.al. |
2502.11895 |
null |
2025-02-17 |
Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration |
Shao Zhang et.al. |
2502.11882 |
link |
2025-02-17 |
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models |
Hyunwoo Kim et.al. |
2502.11881 |
null |
2025-02-17 |
Bitnet.cpp: Efficient Edge Inference for Ternary LLMs |
Jinheng Wang et.al. |
2502.11880 |
link |
2025-02-17 |
JoLT: Joint Probabilistic Predictions on Tabular Data Using LLMs |
Aliaksandra Shysheya et.al. |
2502.11877 |
link |
2025-02-17 |
FedEAT: A Robustness Optimization Framework for Federated LLMs |
Yahao Pang et.al. |
2502.11863 |
null |
2025-02-17 |
Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu |
Renhao Pei et.al. |
2502.11862 |
link |
2025-02-17 |
Exploring Large Language Models in Healthcare: Insights into Corpora Sources, Customization Strategies, and Evaluation Metrics |
Shuqi Yang et.al. |
2502.11861 |
null |
2025-02-17 |
StructTransform: A Scalable Attack Surface for Safety-Aligned Large Language Models |
Shehel Yoosuf et.al. |
2502.11853 |
link |
2025-02-17 |
BaxBench: Can LLMs Generate Correct and Secure Backends? |
Mark Vero et.al. |
2502.11844 |
null |
2025-02-17 |
Can LLM Agents Maintain a Persona in Discourse? |
Pranav Bhandari et.al. |
2502.11843 |
null |
2025-02-17 |
Model Generalization on Text Attribute Graphs: Principles with Large Language Models |
Haoyu Wang et.al. |
2502.11836 |
link |
2025-02-17 |
HAAN: A Holistic Approach for Accelerating Normalization Operations in Large Language Models |
Tianfan Peng et.al. |
2502.11832 |
null |
2025-02-17 |
Intuitive physics understanding emerges from self-supervised pretraining on natural videos |
Quentin Garrido et.al. |
2502.11831 |
link |
2025-02-17 |
Text Classification in the LLM Era - Where do we stand? |
Sowmya Vajjala et.al. |
2502.11830 |
null |
2025-02-17 |
Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities |
Hanbin Wang et.al. |
2502.11829 |
link |
2025-02-17 |
M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis |
Chengyan Wu et.al. |
2502.11824 |
link |
2025-02-17 |
Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis |
Xu Wang et.al. |
2502.11812 |
null |
2025-02-17 |
FineFilter: A Fine-grained Noise Filtering Mechanism for Retrieval-Augmented Large Language Models |
Qianchi Zhang et.al. |
2502.11811 |
null |
2025-02-17 |
Exploring Translation Mechanism of Large Language Models |
Hongbin Zhang et.al. |
2502.11806 |
null |
2025-02-17 |
Table-Critic: A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning |
Peiying Yu et.al. |
2502.11799 |
link |
2025-02-17 |
Personality Editing for Language Models through Relevant Knowledge Editing |
Seojin Hwang et.al. |
2502.11789 |
null |
2025-02-17 |
Efficient Response Generation Method Selection for Fine-Tuning Large Language Models |
Xuan Ren et.al. |
2502.11779 |
null |
2025-02-17 |
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model |
Guangzhi Sun et.al. |
2502.11775 |
link |
2025-02-17 |
The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It |
Leonardo Bertolazzi et.al. |
2502.11771 |
link |
2025-02-17 |
Cognitive-Aligned Document Selection for Retrieval-augmented Generation |
Bingyu Wan et.al. |
2502.11770 |
null |
2025-02-17 |
From Selection to Generation: A Survey of LLM-based Active Learning |
Yu Xia et.al. |
2502.11767 |
null |
2025-02-17 |
Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation |
Zengkui Sun et.al. |
2502.11766 |
link |
2025-02-17 |
HintsOfTruth: A Multimodal Checkworthiness Detection Dataset with Real and Synthetic Claims |
Michiel van der Meer et.al. |
2502.11753 |
null |
2025-02-17 |
Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning |
Yuqi Pang et.al. |
2502.11751 |
link |
2025-02-17 |
ILIAS: Instance-Level Image retrieval At Scale |
Giorgos Kordopatis-Zilos et.al. |
2502.11748 |
null |
2025-02-17 |
SQL-o1: A Self-Reward Heuristic Dynamic Search Method for Text-to-SQL |
Shuai Lyu et.al. |
2502.11741 |
link |
2025-02-17 |
ReviewEval: An Evaluation Framework for AI-Generated Reviews |
Chavvi Kirtani et.al. |
2502.11736 |
null |
2025-02-17 |
Plant in Cupboard, Orange on Table, Book on Shelf. Benchmarking Practical Reasoning and Situation Modelling in a Text-Simulated Situated Environment |
Jonathan Jordan et.al. |
2502.11733 |
null |
2025-02-17 |
Energy-Conscious LLM Decoding: Impact of Text Generation Strategies on GPU Energy Consumption |
Alireza Nik et.al. |
2502.11723 |
null |
2025-02-17 |
Enhancing Recommendation Explanations through User-Centric Refinement |
Jingsen Zhang et.al. |
2502.11721 |
null |
2025-02-17 |
Can you pass that tool?: Implications of Indirect Speech in Physical Human-Robot Collaboration |
Yan Zhang et.al. |
2502.11720 |
null |
2025-02-17 |
Component-aware Unsupervised Logical Anomaly Generation for Industrial Anomaly Detection |
Xuan Tong et.al. |
2502.11712 |
null |
2025-02-17 |
Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models |
Sherzod Hakimov et.al. |
2502.11707 |
null |
2025-02-17 |
LLM Agents Making Agent Tools |
Georg Wölflein et.al. |
2502.11705 |
null |
2025-02-17 |
CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation |
Guangya Yu et.al. |
2502.11703 |
null |
2025-02-17 |
MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow |
Hanzhuo Huang et.al. |
2502.11697 |
null |
2025-02-17 |
Improve LLM-as-a-Judge Ability as a General Ability |
Jiachen Yu et.al. |
2502.11689 |
null |
2025-02-17 |
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task |
Yuchen Yan et.al. |
2502.11684 |
null |
2025-02-17 |
RIDE: Enhancing Large Language Model Alignment through Restyled In-Context Learning Demonstration Exemplars |
Yuncheng Hua et.al. |
2502.11681 |
link |
2025-02-17 |
Exploring LLM-based Student Simulation for Metacognitive Cultivation |
Haoxuan Li et.al. |
2502.11678 |
null |
2025-02-17 |
Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception |
Shiyu Ni et.al. |
2502.11677 |
null |
2025-02-17 |
Diversity-Oriented Data Augmentation with Large Language Models |
Zaitian Wang et.al. |
2502.11671 |
null |
2025-02-17 |
VRoPE: Rotary Position Embedding for Video Large Language Models |
Zikang Liu et.al. |
2502.11664 |
link |
2025-02-17 |
An Innovative Brain-Computer Interface Interaction System Based on the Large Language Model |
Jing Jin et.al. |
2502.11659 |
null |
2025-02-17 |
Competing LLM Agents in a Non-Cooperative Game of Opinion Polarisation |
Amin Qasmi et.al. |
2502.11649 |
null |
2025-02-17 |
DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing |
Yi Wang et.al. |
2502.11647 |
null |
2025-02-17 |
Hyperspherical Energy Transformer with Recurrent Depth |
Yunzhe Hu et.al. |
2502.11646 |
null |
2025-02-17 |
Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI |
Yuxia Wang et.al. |
2502.11614 |
null |
2025-02-17 |
Maximum Entropy Reinforcement Learning with Diffusion Policy |
Xiaoyi Dong et.al. |
2502.11612 |
link |
2025-02-17 |
Accuracy Assessment of OpenAlex and Clarivate Scholar ID with an LLM-Assisted Benchmark |
Renyu Zhao et.al. |
2502.11610 |
null |
2025-02-17 |
GraphThought: Graph Combinatorial Optimization with Thought Generation |
Zixiao Huang et.al. |
2502.11607 |
null |
2025-02-14 |
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment |
Yi-Fan Zhang et.al. |
2502.10391 |
null |
2025-02-14 |
Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction |
WonJin Yoon et.al. |
2502.10388 |
null |
2025-02-14 |
Robustness tests for biomedical foundation models should tailor to specification |
R. Patrick Xian et.al. |
2502.10374 |
link |
2025-02-14 |
AffinityFlow: Guided Flows for Antibody Affinity Maturation |
Can Chen et.al. |
2502.10365 |
null |
2025-02-14 |
Enhancing Multilingual LLM Pretraining with Model-Based Data Selection |
Bettina Messmer et.al. |
2502.10361 |
null |
2025-02-14 |
Dimension-free Score Matching and Time Bootstrapping for Diffusion Models |
Syamantak Kumar et.al. |
2502.10354 |
null |
2025-02-14 |
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation |
Alexander Wettig et.al. |
2502.10341 |
null |
2025-02-14 |
Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering |
Nick Ferguson et.al. |
2502.10338 |
null |
2025-02-14 |
Generalised Parallel Tempering: Flexible Replica Exchange via Flows and Diffusions |
Leo Zhang et.al. |
2502.10328 |
null |
2025-02-14 |
LLM-Powered Preference Elicitation in Combinatorial Assignment |
Ermis Soumalias et.al. |
2502.10308 |
null |
2025-02-14 |
SPIRIT: Short-term Prediction of solar IRradIance for zero-shot Transfer learning using Foundation Models |
Aditya Mishra et.al. |
2502.10307 |
null |
2025-02-14 |
Open-Source AI-Powered Optimization in Scalene: Advancing Python Performance Profiling with DeepSeek-R1 and LLaMA 3.2 |
Saem Hasan et.al. |
2502.10299 |
null |
2025-02-14 |
Probabilistic Super-Resolution for High-Fidelity Physical System Simulations with Uncertainty Quantification |
Pengyu Zhang et.al. |
2502.10280 |
null |
2025-02-14 |
Are Large Language Models the future crowd workers of Linguistics? |
Iris Ferrazzo et.al. |
2502.10266 |
null |
2025-02-14 |
Large Language Models and Synthetic Data for Monitoring Dataset Mentions in Research Papers |
Aivin V. Solatorio et.al. |
2502.10263 |
link |
2025-02-14 |
VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models |
Gokul Karthik Kumar et.al. |
2502.10250 |
null |
2025-02-14 |
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model |
Guoqing Ma et.al. |
2502.10248 |
link |
2025-02-14 |
Efficient Zero-Order Federated Finetuning of Language Models for Resource-Constrained Devices |
Mohamed Aboelenien Ahmed et.al. |
2502.10239 |
null |
2025-02-14 |
Shaping Inductive Bias in Diffusion Models through Frequency-Based Noise Control |
Thomas Jiralerspong et.al. |
2502.10236 |
null |
2025-02-14 |
AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting |
Abdelhakim Benechehab et.al. |
2502.10235 |
link |
2025-02-14 |
Do Large Language Models Reason Causally Like Us? Even Better? |
Hanna M. Dettki et.al. |
2502.10215 |
null |
2025-02-14 |
Can Post-Training Quantization Benefit from an Additional QLoRA Integration? |
Xiliang Zhu et.al. |
2502.10202 |
null |
2025-02-14 |
Prediction hubs are context-informed frequent tokens in LLMs |
Beatrix M. G. Nielsen et.al. |
2502.10201 |
null |
2025-02-14 |
MathConstruct: Challenging LLM Reasoning with Constructive Proofs |
Mislav Balunović et.al. |
2502.10197 |
null |
2025-02-14 |
Translating Common Security Assertions Across Processor Designs: A RISC-V Case Study |
Sharjeel Imtiaz et.al. |
2502.10194 |
null |
2025-02-14 |
VideoDiff: Human-AI Video Co-Creation with Alternatives |
Mina Huh et.al. |
2502.10190 |
null |
2025-02-14 |
Modeling biases in binary decision-making within the generalized nonlinear q-voter model |
Maciej Doniec et.al. |
2502.10172 |
link |
2025-02-14 |
Video Soundtrack Generation by Aligning Emotions and Temporal Boundaries |
Serkan Sulun et.al. |
2502.10154 |
null |
2025-02-14 |
Semantica: Decentralized Search using a LLM-Guided Semantic Tree Overlay |
Petru Neague et.al. |
2502.10151 |
link |
2025-02-14 |
Cooperative Multi-Agent Planning with Adaptive Skill Synthesis |
Zhiyuan Li et.al. |
2502.10148 |
null |
2025-02-14 |
Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages |
Daniil Gurgurov et.al. |
2502.10140 |
null |
2025-02-14 |
Physics-Informed Generative Modeling of Wireless Channels |
Benedikt Böck et.al. |
2502.10137 |
null |
2025-02-14 |
ScamFerret: Detecting Scam Websites Autonomously with Large Language Models |
Hiroki Nakano et.al. |
2502.10110 |
link |
2025-02-14 |
NeuroXVocal: Detection and Explanation of Alzheimer’s Disease through Non-invasive Analysis of Picture-prompted Speech |
Nikolaos Ntampakis et.al. |
2502.10108 |
null |
2025-02-14 |
A novel approach to data generation in generative model |
JaeHong Kim et.al. |
2502.10092 |
null |
2025-02-14 |
Enhancing Patient Acceptance of Robotic Ultrasound through Conversational Virtual Agent and Immersive Visualizations |
Tianyu Song et.al. |
2502.10088 |
link |
2025-02-14 |
DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery |
Utkarsh Mall et.al. |
2502.10060 |
null |
2025-02-14 |
A Generalized Modeling Approach to Liquid-driven Ballooning Membranes |
Mirroyal Ismayilov et.al. |
2502.10057 |
null |
2025-02-14 |
ORI: O Routing Intelligence |
Ahmad Shadid et.al. |
2502.10051 |
null |
2025-02-14 |
A Survey on LLM-powered Agents for Recommender Systems |
Qiyao Peng et.al. |
2502.10050 |
null |
2025-02-14 |
ViRAC: A Vision-Reasoning Agent Head Movement Control Framework in Arbitrary Virtual Environments |
Juyeong Hwang et.al. |
2502.10046 |
null |
2025-02-14 |
POI-Enhancer: An LLM-based Semantic Enhancement Framework for POI Representation Learning |
Jiawei Cheng et.al. |
2502.10038 |
null |
2025-02-14 |
Probabilistic Lexical Manifold Construction in Large Language Models via Hierarchical Vector Field Interpolation |
Clive Pendleton et.al. |
2502.10013 |
null |
2025-02-14 |
ChatGPT and Deepseek: Can They Predict the Stock Market and Macroeconomy? |
Jian Chen et.al. |
2502.10008 |
null |
2025-02-14 |
EmbBERT-Q: Breaking Memory Barriers in Embedded NLP |
Riccardo Bravin et.al. |
2502.10001 |
null |
2025-02-14 |
Decision Information Meets Large Language Models: The Future of Explainable Operations Research |
Yansen Zhang et.al. |
2502.09994 |
link |
2025-02-14 |
Large Language Diffusion Models |
Shen Nie et.al. |
2502.09992 |
null |
2025-02-14 |
V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models |
Hsu-kuang Chiu et.al. |
2502.09980 |
null |
2025-02-14 |
LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs - No Silver Bullet for LC or RAG Routing |
Kuan Li et.al. |
2502.09977 |
null |
2025-02-14 |
Has My System Prompt Been Used? Large Language Model Prompt Membership Inference |
Roman Levin et.al. |
2502.09974 |
null |
2025-02-14 |
KGGen: Extracting Knowledge Graphs from Plain Text with Language Models |
Belinda Mo et.al. |
2502.09956 |
null |
2025-02-14 |
A Preliminary Exploration with GPT-4o Voice Mode |
Yu-Xiang Lin et.al. |
2502.09940 |
null |
2025-02-14 |
Precise Parameter Localization for Textual Generation in Diffusion Models |
Łukasz Staniszewski et.al. |
2502.09935 |
null |
2025-02-14 |
MIR-Bench: Benchmarking LLM’s Long-Context Intelligence via Many-Shot In-Context Inductive Reasoning |
Kai Yan et.al. |
2502.09933 |
null |
2025-02-14 |
Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence |
Granite Vision Team et.al. |
2502.09927 |
null |
2025-02-14 |
λScale: Enabling Fast Scaling for Serverless Large Language Model Inference |
Minchen Yu et.al. |
2502.09922 |
null |
2025-02-14 |
INF$^2$: High-Throughput Generative Inference of Large Language Models using Near-Storage Processing |
Hongsun Jang et.al. |
2502.09921 |
null |
2025-02-14 |
AutoS$^2$earch: Unlocking the Reasoning Potential of Large Models for Web-based Source Search |
Zhengqiu Zhu et.al. |
2502.09913 |
null |
2025-02-14 |
Insect-Foundation: A Foundation Model and Large Multimodal Dataset for Vision-Language Insect Understanding |
Thanh-Dat Truong et.al. |
2502.09906 |
null |
2025-02-14 |
The Ann Arbor Architecture for Agent-Oriented Programming |
Wei Dong et.al. |
2502.09903 |
link |
2025-02-14 |
Artificial Intelligence in Spectroscopy: Advancing Chemistry from Prediction to Generation and Beyond |
Kehan Guo et.al. |
2502.09897 |
null |
2025-02-14 |
ChatIoT: Large Language Model-based Security Assistant for Internet of Things with Retrieval-Augmented Generation |
Ye Dong et.al. |
2502.09896 |
null |
2025-02-14 |
ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation |
Shu Wang et.al. |
2502.09891 |
null |
2025-02-14 |
Video2Policy: Scaling up Manipulation Tasks in Simulation through Internet Videos |
Weirui Ye et.al. |
2502.09886 |
null |
2025-02-14 |
Solvable Dynamics of Self-Supervised Word Embeddings and the Emergence of Analogical Reasoning |
Dhruva Karkada et.al. |
2502.09863 |
null |
2025-02-14 |
Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge |
Naoyuki Kamo et.al. |
2502.09859 |
null |
2025-02-14 |
Automated Hypothesis Validation with Agentic Sequential Falsifications |
Kexin Huang et.al. |
2502.09858 |
link |
2025-02-14 |
Port-LLM: A Port Prediction Method for Fluid Antenna based on Large Language Models |
Yali Zhang et.al. |
2502.09857 |
null |
2025-02-14 |
Efficient Multitask Learning in Small Language Models Through Upside-Down Reinforcement Learning |
Yu-Chen Lin et.al. |
2502.09854 |
null |
2025-02-14 |
HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation |
Tianwei Lin et.al. |
2502.09838 |
link |
2025-02-13 |
A Solver-Aided Hierarchical Language for LLM-Driven CAD Design |
Benjamin T. Jones et.al. |
2502.09819 |
null |
2025-02-13 |
Statistical Coherence Alignment for Large Language Model Representation Learning Through Tensor Field Convergence |
Jonathan Gale et.al. |
2502.09815 |
null |
2025-02-13 |
INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages |
Hao Yu et.al. |
2502.09814 |
null |
2025-02-13 |
AgentGuard: Repurposing Agentic Orchestrator for Safety Evaluation of Tool Orchestration |
Jizhou Chen et.al. |
2502.09809 |
null |
2025-02-13 |
Unit Testing Past vs. Present: Examining LLMs’ Impact on Defect Detection and Efficiency |
Rudolf Ramler et.al. |
2502.09801 |
null |
2025-02-13 |
Co-designing Large Language Model Tools for Project-Based Learning with K12 Educators |
Prerna Ravi et.al. |
2502.09799 |
null |
2025-02-13 |
A Survey on LLM-based News Recommender Systems |
Rongyao Wang et.al. |
2502.09797 |
null |
2025-02-13 |
TableTalk: Scaffolding Spreadsheet Development with a Language Agent |
Jenny T. Liang et.al. |
2502.09787 |
null |
2025-02-13 |
Improving Acoustic Side-Channel Attacks on Keyboards Using Transformers and Large Language Models |
Jin Hyun Park et.al. |
2502.09782 |
null |
2025-02-13 |
CellFlow: Simulating Cellular Morphology Changes via Flow Matching |
Yuhui Zhang et.al. |
2502.09775 |
null |
2025-02-13 |
Non-Markovian Discrete Diffusion with Causal Language Models |
Yangtian Zhang et.al. |
2502.09767 |
null |
2025-02-13 |
LLM-Generated Microservice Implementations from RESTful API Definitions |
Saurabh Chauhan et.al. |
2502.09766 |
link |
2025-02-13 |
Enhancing Jailbreak Attacks via Compliance-Refusal-Based Initialization |
Amit Levi et.al. |
2502.09755 |
null |
2025-02-13 |
Vote-Tree-Planner: Optimizing Execution Order in LLM-based Task Planning Pipeline via Voting |
Chaoyuan Zhang et.al. |
2502.09749 |
null |
2025-02-13 |
The Widespread Adoption of Large Language Model-Assisted Writing Across Society |
Weixin Liang et.al. |
2502.09747 |
null |
2025-02-13 |
Fine-Tuning Foundation Models with Federated Learning for Privacy Preserving Medical Time Series Forecasting |
Mahad Ali et.al. |
2502.09744 |
null |
2025-02-13 |
FoNE: Precise Single-Token Number Embeddings via Fourier Features |
Tianyi Zhou et.al. |
2502.09741 |
null |
2025-02-13 |
Making Them a Malicious Database: Exploiting Query Code to Jailbreak Aligned Large Language Models |
Qingsong Zou et.al. |
2502.09723 |
link |
2025-02-13 |
NestQuant: Nested Lattice Quantization for Matrix Products and LLMs |
Semyon Savkin et.al. |
2502.09720 |
null |
2025-02-13 |
Genetic Data Governance in Crisis: Policy Recommendations for Safeguarding Privacy and Preventing Discrimination |
Vivek Ramanan et.al. |
2502.09716 |
null |
2025-02-13 |
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency |
Dongzhi Jiang et.al. |
2502.09621 |
null |
2025-02-13 |
Exploring the Potential of Encoder-free Architectures in 3D LMMs |
Yiwen Tang et.al. |
2502.09620 |
link |
2025-02-13 |
Designing a Conditional Prior Distribution for Flow-Based Generative Models |
Noam Issachar et.al. |
2502.09611 |
null |
2025-02-14 |
Score-of-Mixture Training: Training One-Step Generative Models Made Simple via Score Estimation of Mixture Distributions |
Tejas Jayashankar et.al. |
2502.09609 |
null |
2025-02-13 |
Human-LLM Coevolution: Evidence from Academic Writing |
Mingmeng Geng et.al. |
2502.09606 |
null |
2025-02-13 |
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models |
Yung-Sung Chuang et.al. |
2502.09604 |
link |
2025-02-13 |
Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs |
Siyan Zhao et.al. |
2502.09597 |
link |
2025-02-13 |
KIMAs: A Configurable Knowledge Integrated Multi-Agent System |
Zitao Li et.al. |
2502.09596 |
null |
2025-02-13 |
Logical forms complement probability in understanding language model (and human) performance |
Yixuan Wang et.al. |
2502.09589 |
null |
2025-02-13 |
Rolling Ahead Diffusion for Traffic Scene Simulation |
Yunpeng Liu et.al. |
2502.09587 |
null |
2025-02-13 |
Polymind: Parallel Visual Diagramming with Large Language Models to Support Prewriting Through Microtasks |
Qian Wan et.al. |
2502.09577 |
null |
2025-02-13 |
Zero-shot generation of synthetic neurosurgical data with large language models |
Austin A. Barr et.al. |
2502.09566 |
link |
2025-02-13 |
MDCrow: Automating Molecular Dynamics Workflows with Large Language Models |
Quintina Campbell et.al. |
2502.09565 |
link |
2025-02-13 |
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents |
Rui Yang et.al. |
2502.09560 |
null |
2025-02-13 |
Explainable AI-assisted Optimization for Feynman Integral Reduction |
Zhuo-Yang Song et.al. |
2502.09544 |
null |
2025-02-13 |
Mind the Gap! Choice Independence in Using Multilingual LLMs for Persuasive Co-Writing Tasks in Different Languages |
Shreyan Biswas et.al. |
2502.09532 |
null |
2025-02-13 |
SQ-GAN: Semantic Image Communications Using Masked Vector Quantization |
Francesco Pezone et.al. |
2502.09520 |
link |
2025-02-13 |
Diffusion Models for Molecules: A Survey of Methods and Tasks |
Liang Wang et.al. |
2502.09511 |
link |
2025-02-13 |
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling |
Theodoros Kouzelis et.al. |
2502.09509 |
null |
2025-02-13 |
Improve LLM-based Automatic Essay Scoring with Linguistic Features |
Zhaoyi Joey Hou et.al. |
2502.09497 |
null |
2025-02-13 |
Foundation Neural-Network Quantum States |
Riccardo Rende et.al. |
2502.09488 |
null |
2025-02-13 |
Objective quantification of mood states using large language models |
Jakub Onysk et.al. |
2502.09487 |
null |
2025-02-13 |
DiffRenderGAN: Addressing Training Data Scarcity in Deep Segmentation Networks for Quantitative Nanomaterial Analysis through Differentiable Rendering and Generative Modelling |
Dennis Possart et.al. |
2502.09477 |
null |
2025-02-13 |
Transformer-Enhanced Variational Autoencoder for Crystal Structure Prediction |
Ziyi Chen et.al. |
2502.09423 |
null |
2025-02-13 |
ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation |
Rotem Shalev-Arkushin et.al. |
2502.09411 |
null |
2025-02-13 |
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models |
Daniel Fleischer et.al. |
2502.09390 |
link |
2025-02-13 |
Truth Knows No Language: Evaluating Truthfulness Beyond English |
Blanca Calvo Figueras et.al. |
2502.09387 |
link |
2025-02-13 |
APT-LLM: Embedding-Based Anomaly Detection of Cyber Advanced Persistent Threats Using Large Language Models |
Sidahmed Benabderrahmane et.al. |
2502.09385 |
null |
2025-02-13 |
LoRA Training Provably Converges to a Low-Rank Global Minimum or It Fails Loudly (But it Probably Won’t Fail) |
Junsu Kim et.al. |
2502.09376 |
null |
2025-02-13 |
Inverse problems with experiment-guided AlphaFold |
Advaith Maddipatla et.al. |
2502.09372 |
null |
2025-02-13 |
Language Agents as Digital Representatives in Collective Decision-Making |
Daniel Jarrett et.al. |
2502.09369 |
null |
2025-02-13 |
Machine learning for modelling unstructured grid data in computational physics: a review |
Sibo Cheng et.al. |
2502.09346 |
null |
2025-02-13 |
ThunderServe: High-performance and Cost-efficient LLM Serving in Cloud Environments |
Youhe Jiang et.al. |
2502.09334 |
null |
2025-02-13 |
Beyond English: The Impact of Prompt Translation Strategies across Languages and Tasks in Multilingual LLMs |
Itai Mondshine et.al. |
2502.09331 |
null |
2025-02-13 |
Copilot Arena: A Platform for Code LLM Evaluation in the Wild |
Wayne Chi et.al. |
2502.09328 |
null |
2025-02-13 |
A Benchmark for Crime Surveillance Video Analysis with Large Models |
Haoran Chen et.al. |
2502.09325 |
null |
2025-02-13 |
A Judge-free LLM Open-ended Generation Benchmark Based on the Distributional Hypothesis |
Kentaro Imajo et.al. |
2502.09316 |
link |
2025-02-13 |
When the LM misunderstood the human chuckled: Analyzing garden path effects in humans and language models |
Samuel Joseph Amouyal et.al. |
2502.09307 |
null |
2025-02-13 |
Non-asymptotic Analysis of Diffusion Annealed Langevin Monte Carlo for Generative Modelling |
Paula Cordero-Encinar et.al. |
2502.09306 |
null |
2025-02-13 |
KET-RAG: A Cost-Efficient Multi-Granular Indexing Framework for Graph-RAG |
Yiqian Huang et.al. |
2502.09304 |
link |
2025-02-13 |
When do neural networks learn world models? |
Tianren Zhang et.al. |
2502.09297 |
null |
2025-02-13 |
SparQLe: Speech Queries to Text Translation Through LLMs |
Amirbek Djanibekov et.al. |
2502.09284 |
link |
2025-02-13 |
GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation |
Hongyin Zhang et.al. |
2502.09268 |
null |
2025-02-13 |
AnomalyGFM: Graph Foundation Model for Zero/Few-shot Anomaly Detection |
Hezhe Qiao et.al. |
2502.09254 |
link |
2025-02-13 |
From large language models to multimodal AI: A scoping review on the potential of generative AI in medicine |
Lukas Buess et.al. |
2502.09242 |
null |
2025-02-13 |
OpenBench: A New Benchmark and Baseline for Semantic Navigation in Smart Logistics |
Junhui Wang et.al. |
2502.09238 |
null |
2025-02-13 |
Reliable Conversational Agents under ASP Control that Understand Natural Language |
Yankai Zeng et.al. |
2502.09237 |
null |
2025-02-13 |
Data2Concept2Text: An Explainable Multilingual Framework for Data Analysis Narration |
Flavio Bertini et.al. |
2502.09218 |
null |
2025-02-13 |
LP-LM: No Hallucinations in Question Answering with Logic Programming |
Katherine Wu et.al. |
2502.09212 |
link |
2025-02-13 |
Visual Graph Question Answering with ASP and LLMs for Language Parsing |
Jakob Johannes Bauer et.al. |
2502.09211 |
null |
2025-02-13 |
On LLM-generated Logic Programs and their Inference Execution Methods |
Paul Tarau et.al. |
2502.09209 |
null |
2025-02-13 |
Logical Lease Litigation: Prolog and LLMs for Rental Law Compliance in New York |
Sanskar Sehgal et.al. |
2502.09204 |
null |
2025-02-13 |
XAInomaly: Explainable and Interpretable Deep Contractive Autoencoder for O-RAN Traffic Anomaly Detection |
Osman Tugay Basaran et.al. |
2502.09194 |
null |
2025-02-13 |
Thinking beyond the anthropomorphic paradigm benefits LLM research |
Lujain Ibrahim et.al. |
2502.09192 |
null |
2025-02-13 |
Matina: A Large-Scale 73B Token Persian Text Corpus |
Sara Bourbour Hosseinbeigi et.al. |
2502.09188 |
null |
2025-02-13 |
RefineCoder: Iterative Improving of Large Language Models via Adaptive Critique Refinement for Code Generation |
Changzhi Zhou et.al. |
2502.09183 |
null |
2025-02-13 |
FLAME: Flexible LLM-Assisted Moderation Engine |
Ivan Bakulin et.al. |
2502.09175 |
null |
2025-02-13 |
Two-Stage Representation Learning for Analyzing Movement Behavior Dynamics in People Living with Dementia |
Jin Cui et.al. |
2502.09173 |
null |
2025-02-13 |
Improving TCM Question Answering through Tree-Organized Self-Reflective Retrieval with LLMs |
Chang Liu et.al. |
2502.09156 |
null |
2025-02-13 |
Finite-Time Analysis of Discrete-Time Stochastic Interpolants |
Yuhao Liu et.al. |
2502.09130 |
null |
2025-02-13 |
One-shot Federated Learning Methods: A Practical Guide |
Xiang Liu et.al. |
2502.09104 |
null |
2025-02-13 |
Bridging the Gap Between LLMs and Human Intentions: Progresses and Challenges in Instruction Understanding, Intention Reasoning, and Reliable Generation |
Zongyu Chang et.al. |
2502.09101 |
null |
2025-02-13 |
Logical Reasoning in Large Language Models: A Survey |
Hanmeng Liu et.al. |
2502.09100 |
null |
2025-02-13 |
Show Me the Work: Fact-Checkers’ Requirements for Explainable Automated Fact-Checking |
Greta Warren et.al. |
2502.09083 |
null |
2025-02-13 |
CoSER: Coordinating LLM-Based Persona Simulation of Established Roles |
Xintao Wang et.al. |
2502.09082 |
link |
2025-02-13 |
Enhancing RAG with Active Learning on Conversation Records: Reject Incapables and Answer Capables |
Xuzhao Geng et.al. |
2502.09073 |
null |
2025-02-13 |
Unleashing the Power of Large Language Model for Denoising Recommendation |
Shuyao Wang et.al. |
2502.09058 |
null |
2025-02-13 |
An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging |
Kunat Pipatanakul et.al. |
2502.09056 |
null |
2025-02-13 |
Game Theory Meets Large Language Models: A Systematic Survey |
Haoran Sun et.al. |
2502.09053 |
null |
2025-02-13 |
Typhoon T1: An Open Thai Reasoning Model |
Pittawat Taveekitworachai et.al. |
2502.09042 |
null |
2025-02-13 |
Implementation of a Fuzzy Relational Database. Case Study: Chilean Cardboard Industry in the Maule Region |
Leoncio Jimenez et.al. |
2502.09035 |
null |
2025-02-13 |
MTDP: Modulated Transformer Diffusion Policy Model |
Qianhao Wang et.al. |
2502.09029 |
null |
2025-02-13 |
EventSTR: A Benchmark Dataset and Baselines for Event Stream based Scene Text Recognition |
Xiao Wang et.al. |
2502.09020 |
link |
2025-02-13 |
Diversity Enhances an LLM’s Performance in RAG and Long-context Task |
Zhchao Wang et.al. |
2502.09017 |
null |
2025-02-13 |
Hope vs. Hate: Understanding User Interactions with LGBTQ+ News Content in Mainstream US News Media through the Lens of Hope Speech |
Jonathan Pofcher et.al. |
2502.09004 |
null |
2025-02-13 |
RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models |
Quan Wei et.al. |
2502.09003 |
null |
2025-02-13 |
End-to-End triplet loss based fine-tuning for network embedding in effective PII detection |
Rishika Kohli et.al. |
2502.09002 |
null |
2025-02-13 |
Task Generalization With AutoRegressive Compositional Structure: Can Learning From $D$ Tasks Generalize to $D^{T}$ Tasks? |
Amirhesam Abedsoltan et.al. |
2502.08991 |
null |
2025-02-13 |
Prophet Inequalities for Bandits, Cabinets, and DAGs |
Robin Bowers et.al. |
2502.08976 |
null |
2025-02-13 |
Medicine on the Edge: Comparative Performance Analysis of On-Device LLMs for Clinical Reasoning |
Leon Nissen et.al. |
2502.08954 |
link |
2025-02-13 |
Structured Convergence in Large Language Model Representations via Hierarchical Latent Space Folding |
Fenella Harcourt et.al. |
2502.08947 |
null |
2025-02-13 |
Beyond the Singular: The Essential Role of Multiple Generations in Effective Benchmark Evaluation and Analysis |
Wenbo Zhang et.al. |
2502.08943 |
null |
2025-02-13 |
Escaping Collapse: The Strength of Weak Data for Large Language Model Training |
Kareem Amin et.al. |
2502.08924 |
null |
2025-02-13 |
Self-Consistency of the Internal Reward Models Improves Self-Rewarding Language Models |
Xin Zhou et.al. |
2502.08922 |
null |
2025-02-13 |
Detecting Malicious Concepts Without Image Generation in AIGC |
Kun Xu et.al. |
2502.08921 |
null |
2025-02-13 |
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU |
Heejun Lee et.al. |
2502.08910 |
null |
2025-02-13 |
Towards Automated Fact-Checking of Real-World Claims: Exploring Task Formulation and Assessment with LLMs |
Premtim Sahitaj et.al. |
2502.08909 |
null |
2025-02-13 |
Reinforced Large Language Model is a formal theorem prover |
Zhiling Luo et.al. |
2502.08908 |
link |
2025-02-13 |
DiffoRA: Enabling Parameter-Efficient LLM Fine-Tuning via Differential Low-Rank Matrix Adaptation |
Tangyu Jiang et.al. |
2502.08905 |
null |
2025-02-13 |
MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training |
Xinxin You et.al. |
2502.08904 |
null |
2025-02-13 |
3D-Grounded Vision-Language Framework for Robotic Task Planning: Automated Prompt Synthesis and Supervised Reasoning |
Guoqin Tang et.al. |
2502.08903 |
null |
2025-02-13 |
Communication is All You Need: Persuasion Dataset Construction via Multi-LLM Communication |
Weicheng Ma et.al. |
2502.08896 |
null |
2025-02-13 |
ShapeLib: designing a library of procedural 3D shape abstractions with Large Language Models |
R. Kenny Jones et.al. |
2502.08884 |
null |
2025-02-13 |
Utilizing Pre-trained and Large Language Models for 10-K Items Segmentation |
Hsin-Min Lu et.al. |
2502.08875 |
null |
2025-02-13 |
Harnessing Vision Models for Time Series Analysis: A Survey |
Jingchao Ni et.al. |
2502.08869 |
link |
2025-02-13 |
A Systematic Evaluation of Generative Models on Tabular Transportation Data |
Chengen Wang et.al. |
2502.08856 |
link |
2025-02-12 |
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation |
Mohammad Mahdi Abootorabi et.al. |
2502.08826 |
link |
2025-02-12 |
DejAIvu: Identifying and Explaining AI Art on the Web in Real-Time with Saliency Maps |
Jocelyn Dzuong et.al. |
2502.08821 |
link |
2025-02-12 |
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model |
Emre Can Acikgoz et.al. |
2502.08820 |
null |
2025-02-12 |
Lexical Manifold Reconfiguration in Large Language Models: A Novel Architectural Approach for Contextual Modulation |
Koinis Vassilis et.al. |
2502.08818 |
null |
2025-02-12 |
Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples |
Andrianos Michail et.al. |
2502.08638 |
null |
2025-02-12 |
Ensemble based approach to quantifying uncertainty of LLM based classifications |
Srijith Rajamohan et.al. |
2502.08631 |
null |
2025-02-12 |
Continuous Cardiac Arrest Prediction in ICU using PPG Foundation Model |
Saurabh Kataria et.al. |
2502.08612 |
null |
2025-02-12 |
Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors |
Vishwanath Pratap Singh et.al. |
2502.08587 |
null |
2025-02-12 |
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks |
Ang Li et.al. |
2502.08586 |
null |
2025-02-12 |
Statistically validated projection of bipartite signed networks |
Anna Gallo et.al. |
2502.08567 |
null |
2025-02-12 |
QA-Expand: Multi-Question Answer Generation for Enhanced Query Expansion in Information Retrieval |
Wonduk Seo et.al. |
2502.08557 |
null |
2025-02-12 |
Human-Centric Foundation Models: Perception, Generation and Agentic Modeling |
Shixiang Tang et.al. |
2502.08556 |
link |
2025-02-12 |
Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies |
Sunnie S. Y. Kim et.al. |
2502.08554 |
null |
2025-02-12 |
LLMs can implicitly learn from mistakes in-context |
Lisa Alazraki et.al. |
2502.08550 |
null |
2025-02-12 |
LLM Pretraining with Continuous Concepts |
Jihoon Tack et.al. |
2502.08524 |
null |
2025-02-12 |
FedMHO: Heterogeneous One-Shot Federated Learning Towards Resource-Constrained Edge Devices |
Dezhong Yao et.al. |
2502.08518 |
link |
2025-02-12 |
The Paradox of Stochasticity: Limited Creativity and Computational Decoupling in Temperature-Varied LLM Outputs of Structured Fictional Data |
Evgenii Evstafev et.al. |
2502.08515 |
null |
2025-02-12 |
Faithful, Unfaithful or Ambiguous? Multi-Agent Debate with Initial Stance for Summary Evaluation |
Mahnaz Koupaee et.al. |
2502.08514 |
link |
2025-02-12 |
Measuring Diversity in Synthetic Datasets |
Yuchang Zhu et.al. |
2502.08512 |
link |
2025-02-12 |
Explanation based In-Context Demonstrations Retrieval for Multilingual Grammatical Error Correction |
Wei Li et.al. |
2502.08507 |
link |
2025-02-12 |
Salamandra Technical Report |
Aitor Gonzalez-Agirre et.al. |
2502.08489 |
link |
2025-02-12 |
One-Shot Federated Learning with Classifier-Free Diffusion Models |
Obaidullah Zaland et.al. |
2502.08488 |
null |
2025-02-12 |
Computed fingertip touch for the instrumental control of musical sound with an excursion on the computed retinal afterimage |
Staas de Jong et.al. |
2502.08471 |
null |
2025-02-12 |
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data |
Haonan Chen et.al. |
2502.08468 |
link |
2025-02-12 |
From Haystack to Needle: Label Space Reduction for Zero-shot Classification |
Nathan Vandemoortele et.al. |
2502.08436 |
null |
2025-02-12 |
IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance |
Paul Röttger et.al. |
2502.08395 |
null |
2025-02-12 |
ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification |
Jiangbo Shi et.al. |
2502.08391 |
link |
2025-02-12 |
Top-Theta Attention: Sparsifying Transformers by Compensated Thresholding |
Konstantin Berestizshevsky et.al. |
2502.08363 |
link |
2025-02-12 |
Systematic Knowledge Injection into Large Language Models via Diverse Augmentation for Domain-Specific RAG |
Kushagra Bhushan et.al. |
2502.08356 |
link |
2025-02-12 |
Trustworthy GNNs with LLMs: A Systematic Review and Taxonomy |
Ruizhan Xue et.al. |
2502.08353 |
null |
2025-02-12 |
Graph Foundation Models for Recommendation: A Comprehensive Survey |
Bin Wu et.al. |
2502.08346 |
null |
2025-02-12 |
Foundation Models in Computational Pathology: A Review of Challenges, Opportunities, and Impact |
Mohsin Bilal et.al. |
2502.08333 |
null |
2025-02-12 |
Modification and Generated-Text Detection: Achieving Dual Detection Capabilities for the Outputs of LLM by Watermark |
Yuhang Cai et.al. |
2502.08332 |
null |
2025-02-12 |
Contextual Compression Encoding for Large Language Models: A Novel Framework for Multi-Layered Parameter Space Pruning |
Barnaby Schmitt et.al. |
2502.08323 |
null |
2025-02-12 |
MultiProSE: A Multi-label Arabic Dataset for Propaganda, Sentiment, and Emotion Detection |
Lubna Al-Henaki et.al. |
2502.08319 |
null |
2025-02-12 |
Word Synchronization Challenge: A Benchmark for Word Association Responses for LLMs |
Tanguy Cazalets et.al. |
2502.08312 |
null |
2025-02-12 |
Unlocking Scaling Law in Industrial Recommendation Systems with a Three-step Paradigm based Large User Model |
Bencheng Yan et.al. |
2502.08309 |
null |
2025-02-12 |
HDT: Hierarchical Discrete Transformer for Multivariate Time Series Forecasting |
Shibo Feng et.al. |
2502.08302 |
link |
2025-02-12 |
Compromising Honesty and Harmlessness in Language Models via Deception Attacks |
Laurène Vaugrante et.al. |
2502.08301 |
null |
2025-02-12 |
Improving Existing Optimization Algorithms with LLMs |
Camilo Chacón Sartori et.al. |
2502.08298 |
null |
2025-02-12 |
Redefining Simplicity: Benchmarking Large Language Models from Lexical to Document Simplification |
Jipeng Qiang et.al. |
2502.08281 |
null |
2025-02-12 |
MoLoRec: A Generalizable and Efficient Framework for LLM-Based Recommendation |
Min Hou et.al. |
2502.08271 |
null |
2025-02-12 |
Exploring the Potential of Large Language Models to Simulate Personality |
Maria Molchanova et.al. |
2502.08265 |
link |
2025-02-12 |
GenIAS: Generator for Instantiating Anomalies in time Series |
Zahra Zamanzadeh Darban et.al. |
2502.08262 |
null |
2025-02-12 |
FixDrive: Automatically Repairing Autonomous Vehicle Driving Behaviour for $0.08 per Violation |
Yang Sun et.al. |
2502.08260 |
link |
2025-02-12 |
Learning Human Skill Generators at Key-Step Levels |
Yilu Wu et.al. |
2502.08234 |
null |
2025-02-12 |
Flow-of-Action: SOP Enhanced LLM-Based Multi-Agent System for Root Cause Analysis |
Changhua Pei et.al. |
2502.08224 |
null |
2025-02-12 |
Memory Offloading for Large Language Model Inference with Latency SLO Guarantees |
Chenxiang Ma et.al. |
2502.08182 |
null |
2025-02-12 |
Enhancing LLM Character-Level Manipulation via Divide and Conquer |
Zhen Xiong et.al. |
2502.08180 |
null |
2025-02-12 |
ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented Generation |
Ruobing Yao et.al. |
2502.08178 |
null |
2025-02-12 |
SycEval: Evaluating LLM Sycophancy |
Aaron Fanous et.al. |
2502.08177 |
null |
2025-02-12 |
Intention is All You Need: Refining Your Code from Your Intention |
Qi Guo et.al. |
2502.08172 |
null |
2025-02-12 |
Force Matching with Relativistic Constraints: A Physics-Inspired Approach to Stable and Efficient Generative Modeling |
Yang Cao et.al. |
2502.08150 |
null |
2025-02-12 |
ACCESS: A Benchmark for Abstract Causal Event Discovery and Reasoning |
Vy Vo et.al. |
2502.08148 |
null |
2025-02-12 |
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers |
Siddharth Singh et.al. |
2502.08145 |
null |
2025-02-12 |
Bridging the Safety Gap: A Guardrail Pipeline for Trustworthy LLM Inferences |
Shanshan Han et.al. |
2502.08142 |
null |
2025-02-12 |
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits |
Zikai Zhou et.al. |
2502.08141 |
null |
2025-02-12 |
Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models |
Sonam Gupta et.al. |
2502.08130 |
null |
2025-02-12 |
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance |
Lingfei Qian et.al. |
2502.08127 |
link |
2025-02-12 |
HuDEx: Integrating Hallucination Detection and Explainability for Enhancing the Reliability of LLM responses |
Sujeong Lee et.al. |
2502.08109 |
null |
2025-02-12 |
Large language models perpetuate bias in palliative care: development and analysis of the Palliative Care Adversarial Dataset (PCAD) |
Naomi Akhras et.al. |
2502.08073 |
null |
2025-02-12 |
On Mechanistic Circuits for Extractive Question-Answering |
Samyadeep Basu et.al. |
2502.08059 |
null |
2025-02-12 |
Break the Checkbox: Challenging Closed-Style Evaluations of Cultural Alignment in LLMs |
Mohsinul Kabir et.al. |
2502.08045 |
null |
2025-02-12 |
Franken-Adapter: Cross-Lingual Adaptation of LLMs by Embedding Surgery |
Fan Jiang et.al. |
2502.08037 |
null |
2025-02-12 |
Stochastic Kinetics of Transcription: Analysis and Computation |
Yuntao Lu et.al. |
2502.08028 |
null |
2025-02-12 |
Contextual Subspace Manifold Projection for Structural Refinement of Large Language Model Representations |
Alistair Wren et.al. |
2502.08026 |
null |
2025-02-11 |
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding |
Ziyao Wang et.al. |
2502.08020 |
null |
2025-02-11 |
The Geometry of Prompting: Unveiling Distinct Mechanisms of Task Adaptation in Language Models |
Artem Kirsanov et.al. |
2502.08009 |
null |
2025-02-11 |
An Interactive Framework for Implementing Privacy-Preserving Federated Learning: Experiments on Large Language Models |
Kasra Ahmadi et.al. |
2502.08008 |
link |
2025-02-11 |
Towards Training One-Step Diffusion Models Without Distillation |
Mingtian Zhang et.al. |
2502.08005 |
null |
2025-02-11 |
Universal Adversarial Attack on Aligned Multimodal LLMs |
Temurbek Rahmatullaev et.al. |
2502.07987 |
null |
2025-02-11 |
Deep Semantic Graph Learning via LLM based Node Enhancement |
Chuanqi Shi et.al. |
2502.07982 |
null |
2025-02-11 |
CIRCUIT: A Benchmark for Circuit Interpretation and Reasoning Capabilities of LLMs |
Lejla Skelic et.al. |
2502.07980 |
null |
2025-02-11 |
From Hazard Identification to Controller Design: Proactive and LLM-Supported Safety Engineering for ML-Powered Systems |
Yining Hong et.al. |
2502.07974 |
null |
2025-02-11 |
Caught in the Web of Words: Do LLMs Fall for Spin in Medical Literature? |
Hye Sun Yun et.al. |
2502.07963 |
link |
2025-02-11 |
Accelerating Scientific Research Through a Multi-LLM Framework |
Joaquin Ramirez-Medina et.al. |
2502.07960 |
null |
2025-02-11 |
Bridging HCI and AI Research for the Evaluation of Conversational SE Assistants |
Jonan Richards et.al. |
2502.07956 |
null |
2025-02-11 |
Symbiotic Cooperation for Web Agents: Harnessing Complementary Strengths of Large and Small LLMs |
Ruichen Zhang et.al. |
2502.07942 |
null |
2025-02-11 |
Discrete Markov Probabilistic Models |
Le-Tuyet-Nhi Pham et.al. |
2502.07939 |
null |
2025-02-11 |
Distributed Approach to Haskell Based Applications Refactoring with LLMs Based Multi-Agent Systems |
Shahbaz Siddeeq et.al. |
2502.07928 |
null |
2025-02-11 |
Sign Operator for Coping with Heavy-Tailed Noise: High Probability Convergence Bounds with Extensions to Distributed Optimization and Comparison Oracle |
Nikita Kornilov et.al. |
2502.07923 |
null |
2025-02-11 |
Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning |
Rujing Yao et.al. |
2502.07912 |
link |
2025-02-11 |
DeepSeek on a Trip: Inducing Targeted Visual Hallucinations via Representation Vulnerabilities |
Chashi Mahiul Islam et.al. |
2502.07905 |
null |
2025-02-11 |
Intelligent Legal Assistant: An Interactive Clarification System for Legal Question Answering |
Rujing Yao et.al. |
2502.07904 |
null |
2025-02-11 |
HexGen-2: Disaggregated Generative Inference of LLMs in Heterogeneous Environment |
Youhe Jiang et.al. |
2502.07903 |
null |
2025-02-11 |
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation |
Alex Jinpeng Wang et.al. |
2502.07870 |
link |
2025-02-11 |
TransMLA: Multi-head Latent Attention Is All You Need |
Fanxu Meng et.al. |
2502.07864 |
link |
2025-02-11 |
BalanceKV: KV Cache Compression through Discrepancy Theory |
Insu Han et.al. |
2502.07861 |
null |
2025-02-11 |
Pippo: High-Resolution Multi-View Humans from a Single Image |
Yash Kant et.al. |
2502.07785 |
null |
2025-02-11 |
DarwinLM: Evolutionary Structured Pruning of Large Language Models |
Shengkun Tang et.al. |
2502.07780 |
link |
2025-02-11 |
Stay-Positive: A Case for Ignoring Real Image Features in Fake Image Detection |
Anirudh Sundara Rajan et.al. |
2502.07778 |
null |
2025-02-11 |
Auditing Prompt Caching in Language Model APIs |
Chenchen Gu et.al. |
2502.07776 |
link |
2025-02-11 |
Automatic Robot Task Planning by Integrating Large Language Model with Genetic Programming |
Azizjon Kobilov et.al. |
2502.07772 |
null |
2025-02-11 |
Great Power Brings Great Responsibility: Personalizing Conversational AI for Diverse Problem-Solvers |
Italo Santos et.al. |
2502.07763 |
null |
2025-02-11 |
Scalable Fingerprinting of Large Language Models |
Anshul Nasery et.al. |
2502.07760 |
null |
2025-02-11 |
Towards Efficient Optimizer Design for LLM via Structured Fisher Approximation with a Low-Rank Extension |
Wenbo Gong et.al. |
2502.07752 |
null |
2025-02-11 |
WHODUNIT: Evaluation benchmark for culprit detection in mystery stories |
Kshitij Gupta et.al. |
2502.07747 |
link |
2025-02-11 |
The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing |
Dirk Bergemann et.al. |
2502.07736 |
null |
2025-02-11 |
Revisiting Non-Acyclic GFlowNets in Discrete Environments |
Nikita Morozov et.al. |
2502.07735 |
link |
2025-02-11 |
Economics of Sourcing Human Data |
Sebastin Santy et.al. |
2502.07732 |
null |
2025-02-11 |
Verifying LLM-Generated Code in the Context of Software Verification with Ada/SPARK |
Marcos Cramer et.al. |
2502.07728 |
null |
2025-02-11 |
Near-Optimal Sample Complexity in Reward-Free Kernel-Based Reinforcement Learning |
Aya Kayal et.al. |
2502.07715 |
null |
2025-02-11 |
Magic 1-For-1: Generating One Minute Video Clips within One Minute |
Hongwei Yi et.al. |
2502.07701 |
link |
2025-02-11 |
A Framework for LLM-powered Design Assistants |
Swaroop Panda et.al. |
2502.07698 |
null |
2025-02-11 |
Large Language Models as Proxies for Theories of Human Linguistic Cognition |
Imry Ziv et.al. |
2502.07687 |
null |
2025-02-11 |
Steering Protein Family Design through Profile Bayesian Flow |
Jingjing Gong et.al. |
2502.07671 |
null |
2025-02-11 |
Guiding Time-Varying Generative Models with Natural Gradients on Exponential Family Manifold |
Song Liu et.al. |
2502.07650 |
null |
2025-02-11 |
SymGPT: Auditing Smart Contracts via Combining Symbolic Execution with Large Language Models |
Shihao Xia et.al. |
2502.07644 |
null |
2025-02-11 |
FoQA: A Faroese Question-Answering Dataset |
Annika Simonsen et.al. |
2502.07642 |
null |
2025-02-11 |
Distributional Instrumental Variable Method |
Anastasiia Holovchak et.al. |
2502.07641 |
link |
2025-02-11 |
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving |
Yong Lin et.al. |
2502.07640 |
link |
2025-02-11 |
Consistency Training with Physical Constraints |
Che-Chia Chang et.al. |
2502.07636 |
null |
2025-02-11 |
Exploring Mobile Touch Interaction with Large Language Models |
Tim Zindulka et.al. |
2502.07629 |
null |
2025-02-11 |
Tractable Transformers for Flexible Conditional Generation |
Anji Liu et.al. |
2502.07616 |
null |
2025-02-11 |
Beyond Prompting: Time2Lang – Bridging Time-Series Foundation Models and Large Language Models for Health Sensing |
Arvind Pillai et.al. |
2502.07608 |
link |
2025-02-11 |
Towards Zero-Shot Anomaly Detection and Reasoning with Multimodal Large Language Models |
Jiacong Xu et.al. |
2502.07601 |
null |
2025-02-11 |
Towards spatial computing: recent advances in multimodal natural interaction for XR headsets |
Zhimin Wang et.al. |
2502.07598 |
null |
2025-02-11 |
SEMU: Singular Value Decomposition for Efficient Machine Unlearning |
Marcin Sendera et.al. |
2502.07587 |
null |
2025-02-11 |
Generative Modeling with Bayesian Sample Inference |
Marten Lienen et.al. |
2502.07580 |
link |
2025-02-11 |
PIM Is All You Need: A CXL-Enabled GPU-Free System for Large Language Model Inference |
Yufeng Gu et.al. |
2502.07578 |
link |
2025-02-11 |
Automated Capability Discovery via Model Self-Exploration |
Cong Lu et.al. |
2502.07577 |
link |
2025-02-11 |
JBShield: Defending Large Language Models from Jailbreak Attacks through Activated Concept Analysis and Manipulation |
Shenyi Zhang et.al. |
2502.07557 |
link |
2025-02-11 |
O1 Embedder: Let Retrievers Think Before Action |
Ruin Yan et.al. |
2502.07555 |
null |
2025-02-11 |
Grammar Control in Dialogue Response Generation for Language Learning Chatbots |
Dominik Glandorf et.al. |
2502.07544 |
link |
2025-02-11 |
NatureLM: Deciphering the Language of Nature for Scientific Discovery |
Yingce Xia et.al. |
2502.07527 |
null |
2025-02-11 |
The Devil is in the Prompts: De-Identification Traces Enhance Memorization Risks in Synthetic Chest X-Ray Generation |
Raman Dutt et.al. |
2502.07516 |
link |
2025-02-11 |
Enhance-A-Video: Better Generated Video for Free |
Yang Luo et.al. |
2502.07508 |
link |
2025-02-11 |
Towards THz-based Obstacle Sensing: A Generative Radio Environment Awareness Framework |
Tianyu Hu et.al. |
2502.07504 |
null |
2025-02-11 |
Unified Graph Networks (UGN): A Deep Neural Framework for Solving Graph Problems |
Rudrajit Dawn et.al. |
2502.07500 |
null |
2025-02-11 |
LLM-Sketch: Enhancing Network Sketches with LLM |
Yuanpeng Li et.al. |
2502.07495 |
link |
2025-02-11 |
Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More |
Xialie Zhuang et.al. |
2502.07490 |
link |
2025-02-11 |
Improving Adaptive Moment Optimization via Preconditioner Diagonalization |
Son Nguyen et.al. |
2502.07488 |
null |
2025-02-11 |
ETimeline: An Extensive Timeline Generation Dataset based on Large Language Model |
Xiaochen Liu et.al. |
2502.07474 |
null |
2025-02-11 |
JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata |
Abhinaba Roy et.al. |
2502.07461 |
link |
2025-02-11 |
Logarithmic Regret for Online KL-Regularized Reinforcement Learning |
Heyang Zhao et.al. |
2502.07460 |
null |
2025-02-11 |
PerCul: A Story-Driven Cultural Evaluation of LLMs in Persian |
Erfan Moosavi Monazzah et.al. |
2502.07459 |
null |
2025-02-11 |
RusCode: Russian Cultural Code Benchmark for Text-to-Image Generation |
Viacheslav Vasilev et.al. |
2502.07455 |
link |
2025-02-11 |
Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon |
Nurit Cohen-Inger et.al. |
2502.07445 |
link |
2025-02-11 |
Towards a Foundation Model for Physics-Informed Neural Networks: Multi-PDE Learning with Active Sampling |
Keon Vin Park et.al. |
2502.07425 |
null |
2025-02-11 |
RomanLens: Latent Romanization and its role in Multilinguality in LLMs |
Alan Saji et.al. |
2502.07424 |
null |
2025-02-11 |
Entity Linking using LLMs for Automated Product Carbon Footprint Estimation |
Steffen Castle et.al. |
2502.07418 |
null |
2025-02-11 |
EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering |
Sheng Zhou et.al. |
2502.07411 |
link |
2025-02-11 |
MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification |
Anh-Tien Nguyen et.al. |
2502.07409 |
link |
2025-02-11 |
On Iterative Evaluation and Enhancement of Code Quality Using GPT-4o |
Rundong Liu et.al. |
2502.07399 |
link |
2025-02-11 |
FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents |
Mostapha Benhenda et.al. |
2502.07393 |
link |
2025-02-11 |
LLMs Can Easily Learn to Reason from Demonstrations: Structure, not content, is what matters! |
Dacheng Li et.al. |
2502.07374 |
link |
2025-02-11 |
EvoFlow: Evolving Diverse Agentic Workflows On The Fly |
Guibin Zhang et.al. |
2502.07373 |
null |
2025-02-11 |
LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation |
Zican Dong et.al. |
2502.07365 |
null |
2025-02-11 |
Bridging the Evaluation Gap: Leveraging Large Language Models for Topic Model Evaluation |
Zhiyin Tan et.al. |
2502.07352 |
link |
2025-02-11 |
KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems |
Jusheng Zhang et.al. |
2502.07350 |
null |
2025-02-11 |
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models |
Xu Huang et.al. |
2502.07346 |
link |
2025-02-11 |
Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering |
Shuzheng Si et.al. |
2502.07340 |
link |
2025-02-11 |
Music for All: Exploring Multicultural Representations in Music Generation Models (Camera Ready) |
Atharva Mehta et.al. |
2502.07328 |
link |
2025-02-11 |
Generative Ghost: Investigating Ranking Bias Hidden in AI-Generated Videos |
Haowen Gao et.al. |
2502.07327 |
null |
2025-02-11 |
MEMIT-Merge: Addressing MEMIT’s Key-Value Conflicts in Same-Subject Batch Editing for LLMs |
Zilu Dong et.al. |
2502.07322 |
null |
2025-02-11 |
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction |
Junlong Li et.al. |
2502.07316 |
link |
2025-02-11 |
Prompt-Based Document Modifications In Ranking Competitions |
Niv Bardas et.al. |
2502.07315 |
null |
2025-02-11 |
CreAgent: Towards Long-Term Evaluation of Recommender System under Platform-Creator Information Asymmetry |
Xiaopeng Ye et.al. |
2502.07307 |
link |
2025-02-11 |
TRAVEL: Training-Free Retrieval and Alignment for Vision-and-Language Navigation |
Navid Rajabi et.al. |
2502.07306 |
null |
2025-02-11 |
Flow Matching for Collaborative Filtering |
Chengkai Liu et.al. |
2502.07303 |
link |
2025-02-11 |
Generation of Drug-Induced Cardiac Reactions towards Virtual Clinical Trials |
Qian Shao et.al. |
2502.07297 |
null |
2025-02-11 |
Small Language Model Makes an Effective Long Text Extractor |
Yelin Chen et.al. |
2502.07286 |
link |
2025-02-11 |
Articulate That Object Part (ATOP): 3D Part Articulation from Text and Motion Personalization |
Aditya Vora et.al. |
2502.07278 |
null |
2025-02-11 |
Cost-Efficient Continual Learning with Sufficient Exemplar Memory |
Dongkyu Cho et.al. |
2502.07274 |
null |
2025-02-11 |
GENERator: A Long-Context Generative Genomic Foundation Model |
Wei Wu et.al. |
2502.07272 |
link |
2025-02-11 |
When More is Less: Understanding Chain-of-Thought Length in LLMs |
Yuyang Wu et.al. |
2502.07266 |
null |
2025-02-11 |
DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization |
Xuefeng Liu et.al. |
2502.07237 |
null |
2025-02-11 |
A Memory Efficient Randomized Subspace Optimization Method for Training Large Language Models |
Yiming Chen et.al. |
2502.07222 |
null |
2025-02-11 |
MLLM4PUE: Toward Universal Embeddings in Computational Pathology through Multimodal LLMs |
Qifeng Zhou et.al. |
2502.07221 |
null |
2025-02-11 |
LUNAR: LLM Unlearning via Neural Activation Redirection |
William F. Shen et.al. |
2502.07218 |
null |
2025-02-11 |
Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion |
Xingpei Ma et.al. |
2502.07203 |
null |
2025-02-11 |
Provably Efficient RLHF Pipeline: A Unified View from Contextual Bandits |
Long-Fei Li et.al. |
2502.07193 |
link |
2025-02-11 |
Bag of Tricks for Inference-time Computation of LLM Reasoning |
Fan Liu et.al. |
2502.07191 |
link |
2025-02-11 |
A Large-Scale Benchmark for Vietnamese Sentence Paraphrases |
Sang Quang Nguyen et.al. |
2502.07188 |
link |
2025-02-11 |
Refine Knowledge of Large Language Models via Adaptive Contrastive Learning |
Yinghui Li et.al. |
2502.07184 |
null |
2025-02-11 |
Does Training on Synthetic Data Make Models Less Robust? |
Lingze Zhang et.al. |
2502.07164 |
null |
2025-02-11 |
Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning |
Feng Chen et.al. |
2502.07154 |
link |
2025-02-11 |
Ask Patients with Patience: Enabling LLMs for Human-Centric Medical Dialogue with Grounded Reasoning |
Jiayuan Zhu et.al. |
2502.07143 |
null |
2025-02-11 |
Language-TPP: Integrating Temporal Point Processes with Language Models for Event Analysis |
Quyu Kong et.al. |
2502.07139 |
null |
2025-02-10 |
Cardiverse: Harnessing LLMs for Novel Card Game Prototyping |
Danrui Li et.al. |
2502.07128 |
null |
2025-02-10 |
Structural Reformation of Large Language Model Neuron Encapsulation for Divergent Information Aggregation |
Denis Bakushev et.al. |
2502.07124 |
null |
2025-02-10 |
Online Scheduling for LLM Inference with KV Cache Constraints |
Patrick Jaillet et.al. |
2502.07115 |
null |
2025-02-10 |
Generative Distribution Prediction: A Unified Approach to Multimodal Learning |
Xinyu Tian et.al. |
2502.07090 |
null |
2025-02-10 |
Evaluating the Systematic Reasoning Abilities of Large Language Models through Graph Coloring |
Alex Heyman et.al. |
2502.07087 |
link |
2025-02-10 |
MPFBench: A Large Scale Dataset for SciML of Multi-Phase-Flows: Droplet and Bubble Dynamics |
Mehdi Shadkhah et.al. |
2502.07080 |
null |
2025-02-10 |
Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models |
Lujain Ibrahim et.al. |
2502.07077 |
null |
2025-02-10 |
IRepair: An Intent-Aware Approach to Repair Data-Driven Errors in Large Language Models |
Sayem Mohammad Imtiaz et.al. |
2502.07072 |
null |
2025-02-10 |
Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations |
Yong Cao et.al. |
2502.07068 |
link |
2025-02-10 |
Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT |
Dongyang Liu et.al. |
2502.06782 |
null |
2025-02-10 |
Enhancing Performance of Explainable AI Models with Constrained Concept Refinement |
Geyu Liang et.al. |
2502.06775 |
null |
2025-02-10 |
Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions |
Jaeyeon Kim et.al. |
2502.06768 |
null |
2025-02-10 |
Rationalization Models for Text-to-SQL |
Gaetano Rossiello et.al. |
2502.06759 |
null |
2025-02-10 |
Accelerating Data Processing and Benchmarking of AI Models for Pathology |
Andrew Zhang et.al. |
2502.06750 |
link |
2025-02-10 |
Gradient Multi-Normalization for Stateless and Scalable LLM Training |
Meyer Scetbon et.al. |
2502.06742 |
null |
2025-02-10 |
VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data |
Thomas Zeng et.al. |
2502.06737 |
null |
2025-02-10 |
Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists |
Bojia Zi et.al. |
2502.06734 |
null |
2025-02-10 |
Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining |
Daouda Sow et.al. |
2502.06733 |
null |
2025-02-10 |
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling |
Runze Liu et.al. |
2502.06703 |
link |
2025-02-10 |
No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers |
Jiajun He et.al. |
2502.06685 |
null |
2025-02-10 |
EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Networks |
Michael Arbel et.al. |
2502.06684 |
null |
2025-02-10 |
Boosting Self-Efficacy and Performance of Large Language Models via Verbal Efficacy Stimulations |
Rui Chen et.al. |
2502.06669 |
null |
2025-02-10 |
Automatic Evaluation of Healthcare LLMs Beyond Question-Answering |
Anna Arias-Duart et.al. |
2502.06666 |
null |
2025-02-10 |
Evaluation of Deep Audio Representations for Hearables |
Fabian Gröger et.al. |
2502.06664 |
null |
2025-02-10 |
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models |
Xingrun Xing et.al. |
2502.06663 |
null |
2025-02-10 |
Unbiased Evaluation of Large Language Models from a Causal Perspective |
Meilin Chen et.al. |
2502.06655 |
null |
2025-02-10 |
In-Context Learning (and Unlearning) of Length Biases |
Stephanie Schoch et.al. |
2502.06653 |
null |
2025-02-10 |
Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A |
Anna Leschanowsky et.al. |
2502.06652 |
null |
2025-02-10 |
Automatic Annotation Augmentation Boosts Translation between Molecules and Natural Language |
Zhiqiang Zhong et.al. |
2502.06634 |
null |
2025-02-10 |
Combining Large Language Models with Static Analyzers for Code Review Generation |
Imen Jaoua et.al. |
2502.06633 |
link |
2025-02-10 |
Multi-Scale Feature Fusion with Image-Driven Spatial Integration for Left Atrium Segmentation from Cardiac MRI Images |
Bipasha Kundu et.al. |
2502.06615 |
null |
2025-02-10 |
A Large-scale AI-generated Image Inpainting Benchmark |
Paschalis Giakoumoglou et.al. |
2502.06593 |
null |
2025-02-10 |
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training |
Yuchen Zhuang et.al. |
2502.06589 |
null |
2025-02-10 |
A Survey on Video Analytics in Cloud-Edge-Terminal Collaborative Systems |
Linxiao Gong et.al. |
2502.06581 |
null |
2025-02-10 |
LawGPT: Knowledge-Guided Data Generation and Its Application to Legal LLM |
Zhi Zhou et.al. |
2502.06572 |
link |
2025-02-10 |
Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation |
Chengwen Qi et.al. |
2502.06563 |
link |
2025-02-10 |
Is API Access to LLMs Useful for Generating Private Synthetic Tabular Data? |
Marika Swanberg et.al. |
2502.06555 |
null |
2025-02-10 |
Efficient Scientific Full Text Classification: The Case of EICAT Impact Assessments |
Marc Felix Brinner et.al. |
2502.06551 |
null |
2025-02-10 |
Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning |
Jean Vassoyan et.al. |
2502.06533 |
null |
2025-02-10 |
Properties of Wasserstein Gradient Flows for the Sliced-Wasserstein Distance |
Christophe Vauthier et.al. |
2502.06525 |
null |
2025-02-10 |
GuideLLM: Exploring LLM-Guided Conversation with Applications in Autobiography Interviewing |
Jinhao Duan et.al. |
2502.06494 |
null |
2025-02-10 |
Recent Advances in Discrete Speech Tokens: A Review |
Yiwei Guo et.al. |
2502.06490 |
null |
2025-02-10 |
Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection |
Maximilian Spliethöver et.al. |
2502.06487 |
null |
2025-02-10 |
WyckoffDiff - A Generative Diffusion Model for Crystal Symmetry |
Filip Ekström Kelvinius et.al. |
2502.06485 |
link |
2025-02-10 |
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths |
Weijia Mao et.al. |
2502.06474 |
null |
2025-02-10 |
KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment |
Yuxing Lu et.al. |
2502.06472 |
link |
2025-02-10 |
A Survey of Theory of Mind in Large Language Models: Evaluations, Representations, and Safety Risks |
Hieu Minh “Jord” Nguyen et.al. |
2502.06470 |
null |
2025-02-10 |
MATH-Perturb: Benchmarking LLMs’ Math Reasoning Abilities against Hard Perturbations |
Kaixuan Huang et.al. |
2502.06453 |
null |
2025-02-10 |
FEMBA: Efficient and Scalable EEG Analysis with a Bidirectional Mamba Foundation Model |
Anna Tegon et.al. |
2502.06438 |
null |
2025-02-10 |
Prompt-SID: Learning Structural Representation Prompt via Latent Diffusion for Single-Image Denoising |
Huaqiu Li et.al. |
2502.06432 |
link |
2025-02-10 |
CoS: Chain-of-Shot Prompting for Long Video Understanding |
Jian Hu et.al. |
2502.06428 |
null |
2025-02-10 |
Generating Privacy-Preserving Personalized Advice with Zero-Knowledge Proofs and LLMs |
Hiroki Watanabe et.al. |
2502.06425 |
null |
2025-02-10 |
Occ-LLM: Enhancing Autonomous Driving with Occupancy-Based Large Language Models |
Tianshuo Xu et.al. |
2502.06419 |
null |
2025-02-10 |
Systematic Outliers in Large Language Models |
Yongqi An et.al. |
2502.06415 |
link |
2025-02-10 |
AppVLM: A Lightweight Vision Language Model for Online App Control |
Georgios Papoudakis et.al. |
2502.06395 |
null |
2025-02-10 |
How Humans Help LLMs: Assessing and Incentivizing Human Preference Annotators |
Shang Liu et.al. |
2502.06387 |
null |
2025-02-10 |
Simulation as Reality? The Effectiveness of LLM-Generated Data in Open-ended Question Assessment |
Long Zhang et.al. |
2502.06371 |
null |
2025-02-10 |
Calibrating LLMs with Information-Theoretic Evidential Deep Learning |
Yawei Li et.al. |
2502.06351 |
link |
2025-02-10 |
Can AI Examine Novelty of Patents?: Novelty Evaluation Based on the Correspondence between Patent Claim and Prior Art |
Hayato Ikoma et.al. |
2502.06316 |
null |
2025-02-10 |
Latent Convergence Modulation in Large Language Models: A Novel Approach to Iterative Contextual Realignment |
Patricia Porretta et.al. |
2502.06302 |
null |
2025-02-10 |
SeaExam and SeaBench: Benchmarking LLMs with Local Multilingual Questions in Southeast Asia |
Chaoqun Liu et.al. |
2502.06298 |
null |
2025-02-10 |
Is an Ultra Large Natural Image-Based Foundation Model Superior to a Retina-Specific Model for Detecting Ocular and Systemic Diseases? |
Qingshan Hou et.al. |
2502.06289 |
null |
2025-02-10 |
Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE |
Haiduo Huang et.al. |
2502.06282 |
link |
2025-02-10 |
DebateBench: A Challenging Long Context Reasoning Benchmark For Large Language Models |
Utkarsh Tiwari et.al. |
2502.06279 |
null |
2025-02-10 |
Emergent Response Planning in LLM |
Zhichen Dong et.al. |
2502.06258 |
null |
2025-02-10 |
K-ON: Stacking Knowledge On the Head Layer of Large Language Model |
Lingbing Guo et.al. |
2502.06257 |
null |
2025-02-10 |
Find Central Dogma Again |
Wang Liang et.al. |
2502.06253 |
null |
2025-02-10 |
Amplifying Minority Voices: AI-Mediated Devil’s Advocate System for Inclusive Group Decision-Making |
Soohwan Lee et.al. |
2502.06251 |
null |
2025-02-10 |
PiKE: Adaptive Data Mixing for Multi-Task Learning Under Low Gradient Conflicts |
Zeman Li et.al. |
2502.06244 |
null |
2025-02-10 |
Fully Exploiting Vision Foundation Model’s Profound Prior Knowledge for Generalizable RGB-Depth Driving Scene Parsing |
Sicen Guo et.al. |
2502.06219 |
null |
2025-02-10 |
LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks |
Xin Zhou et.al. |
2502.06215 |
null |
2025-02-10 |
Unveiling the Capabilities of Large Language Models in Detecting Offensive Language with Annotation Disagreement |
Junyu Lu et.al. |
2502.06207 |
link |
2025-02-10 |
C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Generation |
Guoxin Chen et.al. |
2502.06205 |
null |
2025-02-10 |
Non-literal Understanding of Number Words by Language Models |
Polina Tsvilodub et.al. |
2502.06204 |
null |
2025-02-10 |
Timing Matters: How Using LLMs at Different Timings Influences Writers’ Perceptions and Ideation Outcomes in AI-Assisted Ideation |
Peinuan Qin et.al. |
2502.06197 |
null |
2025-02-10 |
Can LLMs Replace Human Evaluators? An Empirical Study of LLM-as-a-Judge in Software Engineering |
Ruiqi Wang et.al. |
2502.06193 |
null |
2025-02-10 |
Uncertainty-Aware Adaptation of Large Language Models for Protein-Protein Interaction Analysis |
Sanket Jantre et.al. |
2502.06173 |
null |
2025-02-10 |
A Data-Efficient Pan-Tumor Foundation Model for Oncology CT Interpretation |
Wenhui Lei et.al. |
2502.06171 |
null |
2025-02-10 |
Universal Approximation of Visual Autoregressive Transformers |
Yifang Chen et.al. |
2502.06167 |
null |
2025-02-10 |
Scaling Public Health Text Annotation: Zero-Shot Learning vs. Crowdsourcing for Improved Efficiency and Labeling Accuracy |
Kamyar Kazari et.al. |
2502.06150 |
null |
2025-02-10 |
Optimizing Knowledge Integration in Retrieval-Augmented Generation with Self-Selection |
Yan Weng et.al. |
2502.06148 |
null |
2025-02-10 |
LegalViz: Legal Text Visualization by Text To Diagram Generation |
Eri Onami et.al. |
2502.06147 |
null |
2025-02-10 |
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs |
Sumin An et.al. |
2502.06139 |
null |
2025-02-10 |
Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models |
Ce Zhang et.al. |
2502.06130 |
link |
2025-02-10 |
Foundation Model of Electronic Medical Records for Adaptive Risk Estimation |
Pawel Renc et.al. |
2502.06124 |
link |
2025-02-10 |
Task-driven Layerwise Additive Activation Intervention |
Hieu Trung Nguyen et.al. |
2502.06115 |
null |
2025-02-10 |
CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories |
Yijia Xiao et.al. |
2502.06111 |
null |
2025-02-10 |
RALLRec: Improving Retrieval Augmented Large Language Model Recommendation with Representation Learning |
Jian Xu et.al. |
2502.06101 |
link |
2025-02-10 |
ConMeC: A Dataset for Metonymy Resolution with Common Nouns |
Saptarshi Ghosh et.al. |
2502.06087 |
link |
2025-02-10 |
Physics-Guided Foundation Model for Scientific Discovery: An Application to Aquatic Science |
Runlong Yu et.al. |
2502.06084 |
link |
2025-02-10 |
Debiasing Guidance for Discrete Diffusion with Sequential Monte Carlo |
Cheuk Kit Lee et.al. |
2502.06079 |
null |
2025-02-09 |
Deconstructing Depression Stigma: Integrating AI-driven Data Collection and Analysis with Causal Knowledge Graphs |
Han Meng et.al. |
2502.06075 |
null |
2025-02-09 |
Allegro-FM: Towards Equivariant Foundation Model for Exascale Molecular Dynamics Simulations |
Ken-ichi Nomura et.al. |
2502.06073 |
null |
2025-02-09 |
Benchmarking Prompt Sensitivity in Large Language Models |
Amirhossein Razavi et.al. |
2502.06065 |
null |
2025-02-09 |
Online Reward-Weighted Fine-Tuning of Flow Matching with Wasserstein Regularization |
Jiajun Fan et.al. |
2502.06061 |
null |
2025-02-09 |
Benchmarking Prompt Engineering Techniques for Secure Code Generation with GPT Models |
Marc Bruni et.al. |
2502.06039 |
null |
2025-02-09 |
Investigating Compositional Reasoning in Time Series Foundation Models |
Willa Potosnak et.al. |
2502.06037 |
link |
2025-02-09 |
A Multimodal PDE Foundation Model for Prediction and Scientific Text Descriptions |
Elisa Negrini et.al. |
2502.06026 |
link |
2025-02-09 |
Dual Caption Preference Optimization for Diffusion Models |
Amir Saeidi et.al. |
2502.06023 |
link |
2025-02-09 |
Temporal Working Memory: Query-Guided Segment Refinement for Enhanced Multimodal Understanding |
Xingjian Diao et.al. |
2502.06020 |
link |
2025-02-09 |
Media Bias Detector: Designing and Implementing a Tool for Real-Time Selection and Framing Bias Analysis in News Coverage |
Jenny S Wang et.al. |
2502.06009 |
null |
2025-02-09 |
Analysis of LLM as a grammatical feature tagger for African American English |
Rahul Porwal et.al. |
2502.06004 |
null |
2025-02-09 |
HamRaz: A Culture-Based Persian Conversation Dataset for Person-Centered Therapy Using LLM Agents |
Mohammad Amin Abbasi et.al. |
2502.05982 |
null |
2025-02-09 |
$μ$nit Scaling: Simple and Scalable FP8 LLM Training |
Saaketh Narayan et.al. |
2502.05967 |
null |
2025-02-09 |
Redefining Robot Generalization Through Interactive Intelligence |
Sharmita Dey et.al. |
2502.05963 |
null |
2025-02-09 |
MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents |
Jiabin Tang et.al. |
2502.05957 |
link |
2025-02-09 |
Cyri: A Conversational AI-based Assistant for Supporting the Human User in Detecting and Responding to Phishing Attacks |
Antonio La Torre et.al. |
2502.05951 |
null |
2025-02-09 |
Acceleration Multiple Heads Decoding for LLM via Dynamic Tree Attention |
Zhendong Zhang et.al. |
2502.05947 |
null |
2025-02-09 |
“Let the AI conspiracy begin…” Language Model coordination is just one inference-intervention away |
Paul Darm et.al. |
2502.05945 |
link |
2025-02-07 |
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy |
Yunhang Shen et.al. |
2502.05177 |
link |
2025-02-07 |
Fillerbuster: Multi-View Scene Completion for Casual Captures |
Ethan Weber et.al. |
2502.05175 |
null |
2025-02-07 |
NoLiMa: Long-Context Evaluation Beyond Literal Matching |
Ali Modarressi et.al. |
2502.05167 |
link |
2025-02-07 |
Multitwine: Multi-Object Compositing with Text and Layout Control |
Gemma Canet Tarrés et.al. |
2502.05165 |
null |
2025-02-07 |
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails |
Yihe Deng et.al. |
2502.05163 |
link |
2025-02-07 |
A Lightweight Method to Disrupt Memorized Sequences in LLM |
Parjanya Prajakta Prashant et.al. |
2502.05159 |
null |
2025-02-07 |
Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation |
Steffen Eger et.al. |
2502.05151 |
link |
2025-02-07 |
CodeSCM: Causal Analysis for Multi-Modal Code Generation |
Mukur Gupta et.al. |
2502.05150 |
link |
2025-02-07 |
An Annotated Reading of ‘The Singer of Tales’ in the LLM Era |
Kush R. Varshney et.al. |
2502.05148 |
null |
2025-02-07 |
Chest X-ray Foundation Model with Global and Local Representations Integration |
Zefan Yang et.al. |
2502.05142 |
link |
2025-02-07 |
Latent Swap Joint Diffusion for Long-Form Audio Generation |
Yusheng Dai et.al. |
2502.05130 |
null |
2025-02-07 |
Refining Integration-by-Parts Reduction of Feynman Integrals with Machine Learning |
Matt von Hippel et.al. |
2502.05121 |
null |
2025-02-07 |
Flexible and Efficient Grammar-Constrained Decoding |
Kanghee Park et.al. |
2502.05111 |
null |
2025-02-07 |
Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs |
Rohit Saxena et.al. |
2502.05092 |
null |
2025-02-07 |
Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs |
Thierry Bossy et.al. |
2502.05087 |
link |
2025-02-07 |
Causality can systematically address the monsters under the bench(marks) |
Felix Leeb et.al. |
2502.05085 |
null |
2025-02-07 |
ChallengeMe: An Adversarial Learning-enabled Text Summarization Framework |
Xiaoyu Deng et.al. |
2502.05084 |
null |
2025-02-07 |
Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures |
Tushar Pandey et.al. |
2502.05078 |
link |
2025-02-07 |
Beautiful Images, Toxic Words: Understanding and Addressing Offensive Text in Generated Images |
Aditya Kumar et.al. |
2502.05066 |
link |
2025-02-07 |
nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow |
Geliang Ouyang et.al. |
2502.05036 |
link |
2025-02-07 |
Prospects for detecting generic fast-time features in the neutrino lightcurve of nearby supernovae in neutrino telescopes |
Jakob Beise et.al. |
2502.05024 |
null |
2025-02-07 |
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations |
Andrei Panferov et.al. |
2502.05003 |
link |
2025-02-07 |
Aligning Black-box Language Models with Human Judgments |
Gerrit J. J. van den Burg et.al. |
2502.04997 |
null |
2025-02-07 |
C2GM: Cascading Conditional Generation of Multi-scale Maps from Remote Sensing Images Constrained by Geographic Features |
Chenxing Sun et.al. |
2502.04991 |
null |
2025-02-07 |
MoGraphGPT: Creating Interactive Scenes Using Modular LLM and Graphical Control |
Hui Ye et.al. |
2502.04983 |
null |
2025-02-07 |
Enhancing Pre-Trained Decision Transformers with Prompt-Tuning Bandits |
Finn Rietz et.al. |
2502.04979 |
null |
2025-02-07 |
Towards Multimodal Empathetic Response Generation: A Rich Text-Speech-Vision Avatar-based Benchmark |
Han Zhang et.al. |
2502.04976 |
null |
2025-02-07 |
CoCoA: A Generalized Approach to Uncertainty Quantification by Integrating Confidence and Consistency of LLM Outputs |
Roman Vashurin et.al. |
2502.04964 |
null |
2025-02-07 |
The Rising Threat to Emerging AI-Powered Search Engines |
Zeren Luo et.al. |
2502.04951 |
null |
2025-02-07 |
Mobile Network-specialized Large Language Models for 6G: Architectures, Innovations, Challenges, and Future Trends |
Abdelaali Chaoub et.al. |
2502.04933 |
null |
2025-02-07 |
Generative-enhanced optimization for knapsack problems: an industry-relevant study |
Yelyzaveta Vodovozova et.al. |
2502.04928 |
null |
2025-02-07 |
Classification or Prompting: A Case Study on Legal Requirements Traceability |
Romina Etezadi et.al. |
2502.04916 |
null |
2025-02-07 |
Goku: Flow Based Video Generative Foundation Models |
Shoufa Chen et.al. |
2502.04896 |
null |
2025-02-07 |
A Foundational Brain Dynamics Model via Stochastic Optimal Control |
Joonhyeong Park et.al. |
2502.04892 |
null |
2025-02-07 |
Training-free Task-oriented Grasp Generation |
Jiaming Wang et.al. |
2502.04873 |
null |
2025-02-07 |
Advancing Wasserstein Convergence Analysis of Score-Based Models: Insights from Discretization and Second-Order Acceleration |
Yifeng Yu et.al. |
2502.04849 |
null |
2025-02-07 |
Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition |
Masato Mita et.al. |
2502.04795 |
null |
2025-02-07 |
S$^2$-MAD: Breaking the Token Barrier to Enhance Multi-Agent Debate Efficiency |
Yuting Zeng et.al. |
2502.04790 |
null |
2025-02-07 |
Probing Internal Representations of Multi-Word Verbs in Large Language Models |
Hassane Kissane et.al. |
2502.04789 |
null |
2025-02-07 |
Enhancing SQL Injection Detection and Prevention Using Generative Models |
Naga Sai Dasari et.al. |
2502.04786 |
null |
2025-02-07 |
SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning |
Wanjia Zhao et.al. |
2502.04780 |
link |
2025-02-07 |
SeDi-Instruct: Enhancing Alignment of Language Models through Self-Directed Instruction Generation |
Jungwoo Kim et.al. |
2502.04774 |
null |
2025-02-07 |
Enhancing Phishing Email Identification with Large Language Models |
Catherine Lee et.al. |
2502.04759 |
null |
2025-02-07 |
Concept Navigation and Classification via Open Source Large Language Model Processing |
Maël Kubli et.al. |
2502.04756 |
null |
2025-02-07 |
Every Software as an Agent: Blueprint and Case Study |
Mengwei Xu et.al. |
2502.04747 |
null |
2025-02-07 |
PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders |
Tianyu Xie et.al. |
2502.04730 |
link |
2025-02-07 |
Generating Symbolic World Models via Test-time Scaling of Large Language Models |
Zhouliang Yu et.al. |
2502.04728 |
link |
2025-02-07 |
Evaluating Text Style Transfer Evaluation: Are There Any Reliable Metrics? |
Sourabrata Mukherjee et.al. |
2502.04718 |
null |
2025-02-07 |
Enhancing Impression Change Prediction in Speed Dating Simulations Based on Speakers’ Personalities |
Kazuya Matsuo et.al. |
2502.04706 |
null |
2025-02-07 |
STRIDE: Automating Reward Design, Deep Reinforcement Learning Training and Feedback Optimization in Humanoid Robotics Locomotion |
Zhenwei Wu et.al. |
2502.04692 |
null |
2025-02-07 |
ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning |
Yuwei Yin et.al. |
2502.04689 |
link |
2025-02-07 |
M-IFEval: Multilingual Instruction-Following Evaluation |
Antoine Dussolle et.al. |
2502.04688 |
link |
2025-02-07 |
Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization |
Zelai Xu et.al. |
2502.04686 |
null |
2025-02-07 |
G2PDiffusion: Genotype-to-Phenotype Prediction with Diffusion Models |
Mengdi Liu et.al. |
2502.04684 |
null |
2025-02-07 |
CALF-SBM: A Covariate-Assisted Latent Factor Stochastic Block Model |
Sydney Louit et.al. |
2502.04681 |
null |
2025-02-07 |
LLM Query Scheduling with Prefix Reuse and Latency Constraints |
Gregory Dexter et.al. |
2502.04677 |
null |
2025-02-07 |
AdParaphrase: Paraphrase Dataset for Analyzing Linguistic Features toward Generating Attractive Ad Texts |
Soichiro Murakami et.al. |
2502.04674 |
link |
2025-02-07 |
Unveiling the Mechanisms of Explicit CoT Training: How Chain-of-Thought Enhances Reasoning Generalization |
Xinhao Yao et.al. |
2502.04667 |
link |
2025-02-07 |
Enhancing Health Information Retrieval with RAG by Prioritizing Topical Relevance and Factual Accuracy |
Rishabh Uapadhyay et.al. |
2502.04666 |
null |
2025-02-07 |
Importance Sampling via Score-based Generative Models |
Heasung Kim et.al. |
2502.04646 |
null |
2025-02-07 |
Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research |
Junde Wu et.al. |
2502.04644 |
link |
2025-02-07 |
Confidence Elicitation: A New Attack Vector for Large Language Models |
Brian Formento et.al. |
2502.04643 |
link |
2025-02-07 |
Contrastive Learning-Enhanced Large Language Models for Monolith-to-Microservice Decomposition |
Khaled Sellami et.al. |
2502.04604 |
null |
2025-02-07 |
Extracting and Understanding the Superficial Knowledge in Alignment |
Runjin Chen et.al. |
2502.04602 |
link |
2025-02-07 |
The $α$-Alternator: Dynamic Adaptation To Varying Noise Levels In Sequences Using The Vendi Score For Improved Robustness and Performance |
Mohammad Reza Rezaei et.al. |
2502.04593 |
null |
2025-02-07 |
Position-aware Automatic Circuit Discovery |
Tal Haklay et.al. |
2502.04577 |
link |
2025-02-06 |
My LLM might Mimic AAE – But When Should it? |
Sandra C. Sandoval et.al. |
2502.04564 |
link |
2025-02-06 |
Speeding up Speculative Decoding via Approximate Verification |
Meiyu Zhong et.al. |
2502.04557 |
null |
2025-02-06 |
TruthFlow: Truthful LLM Generation via Representation Flow Correction |
Hanyu Wang et.al. |
2502.04556 |
null |
2025-02-06 |
Contextual Gradient Flow Modeling for Large Language Model Generalization in Multi-Scale Feature Spaces |
Daphne Quillington et.al. |
2502.04548 |
null |
2025-02-06 |
Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection |
Minseok Jung et.al. |
2502.04528 |
null |
2025-02-06 |
Safety is Essential for Responsible Open-Ended Systems |
Ivaxi Sheth et.al. |
2502.04512 |
null |
2025-02-06 |
ULPT: Prompt Tuning with Ultra-Low-Dimensional Optimization |
Zijun Wu et.al. |
2502.04501 |
null |
2025-02-06 |
Verifiable Format Control for Large Language Model Generations |
Zhaoyang Wang et.al. |
2502.04498 |
null |
2025-02-06 |
Multi-Agent Reinforcement Learning with Focal Diversity Optimization |
Selim Furkan Tekin et.al. |
2502.04492 |
link |
2025-02-06 |
Building A Unified AI-centric Language System: analysis, framework and future work |
Edward Hong Wang et.al. |
2502.04488 |
null |
2025-02-06 |
Active Task Disambiguation with LLMs |
Katarzyna Kobalczyk et.al. |
2502.04485 |
link |
2025-02-06 |
The ML Supply Chain in the Era of Software 2.0: Lessons Learned from Hugging Face |
Trevor Stalnaker et.al. |
2502.04484 |
null |
2025-02-06 |
Near-Optimal Sample Complexity for MDPs via Anchoring |
Jongmin Lee et.al. |
2502.04477 |
null |
2025-02-06 |
ADIFF: Explaining audio difference using natural language |
Soham Deshmukh et.al. |
2502.04476 |
link |
2025-02-06 |
Augmented Conditioning Is Enough For Effective Training Image Generation |
Jiahui Chen et.al. |
2502.04475 |
null |
2025-02-06 |
Iterative Importance Fine-tuning of Diffusion Models |
Alexander Denker et.al. |
2502.04468 |
null |
2025-02-06 |
FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks |
Luca Della Libera et.al. |
2502.04465 |
null |
2025-02-06 |
Training Language Models to Reason Efficiently |
Daman Arora et.al. |
2502.04463 |
link |
2025-02-06 |
Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization |
Yu-Neng Chuang et.al. |
2502.04428 |
null |
2025-02-06 |
Decoding AI Judgment: How LLMs Assess News Credibility and Bias |
Edoardo Loru et.al. |
2502.04426 |
null |
2025-02-06 |
EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models |
He Hu et.al. |
2502.04424 |
null |
2025-02-06 |
Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment |
Zuyan Liu et.al. |
2502.04328 |
link |
2025-02-06 |
Can Grammarly and ChatGPT accelerate language change? AI-powered technologies and their impact on the English language: wordiness vs. conciseness |
Karolina Rudnicka et.al. |
2502.04324 |
null |
2025-02-06 |
Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions |
Yik Siu Chan et.al. |
2502.04322 |
link |
2025-02-06 |
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features |
Alec Helbling et.al. |
2502.04320 |
link |
2025-02-06 |
sshELF: Single-Shot Hierarchical Extrapolation of Latent Features for 3D Reconstruction from Sparse-Views |
Eyvaz Najafli et.al. |
2502.04318 |
null |
2025-02-06 |
ChamaleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters |
Kamer Ali Yuksel et.al. |
2502.04315 |
link |
2025-02-06 |
ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization |
Yinjie Wang et.al. |
2502.04306 |
link |
2025-02-06 |
MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation |
Jinbo Xing et.al. |
2502.04299 |
null |
2025-02-06 |
Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression |
Lirui Wang et.al. |
2502.04296 |
null |
2025-02-06 |
Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization |
Yuanye Liu et.al. |
2502.04295 |
link |
2025-02-06 |
PILAF: Optimal Human Preference Sampling for Reward Modeling |
Yunzhen Feng et.al. |
2502.04270 |
null |
2025-02-06 |
Efficient Randomized Experiments Using Foundation Models |
Piersilvio De Bartolomeis et.al. |
2502.04262 |
link |
2025-02-06 |
Realistic Image-to-Image Machine Unlearning via Decoupling and Knowledge Retention |
Ayush K. Varshney et.al. |
2502.04260 |
null |
2025-02-06 |
MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion |
Xintong Hao et.al. |
2502.04235 |
null |
2025-02-06 |
Can LLMs Hack Enterprise Networks? Autonomous Assumed Breach Penetration-Testing Active Directory Networks |
Andreas Happe et.al. |
2502.04227 |
null |
2025-02-06 |
Keep It Light! Simplifying Image Clustering Via Text-Free Adapters |
Yicen Li et.al. |
2502.04226 |
null |
2025-02-06 |
Éclair – Extracting Content and Layout with Integrated Reading Order for Documents |
Ilia Karmanov et.al. |
2502.04223 |
null |
2025-02-06 |
Sports and Women’s Sports: Gender Bias in Text Generation with Olympic Data |
Laura Biester et.al. |
2502.04218 |
null |
2025-02-06 |
Algorithmic causal structure emerging through compression |
Liang Wendong et.al. |
2502.04210 |
null |
2025-02-06 |
“Short-length” Adversarial Training Helps LLMs Defend “Long-length” Jailbreak Attacks: Theoretical and Empirical Evidence |
Shaopeng Fu et.al. |
2502.04204 |
link |
2025-02-06 |
The Best Instruction-Tuning Data are Those That Fit |
Dylan Zhang et.al. |
2502.04194 |
null |
2025-02-06 |
PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models? |
Mennatullah Siam et.al. |
2502.04192 |
link |
2025-02-06 |
Automated Microservice Pattern Instance Detection Using Infrastructure-as-Code Artifacts and Large Language Models |
Carlos Eduardo Duarte et.al. |
2502.04188 |
null |
2025-02-06 |
Multi-agent Architecture Search via Agentic Supernet |
Guibin Zhang et.al. |
2502.04180 |
link |
2025-02-06 |
MRAMG-Bench: A BeyondText Benchmark for Multimodal Retrieval-Augmented Multimodal Generation |
Qinhan Yu et.al. |
2502.04176 |
link |
2025-02-06 |
Diffusion-based mass map reconstruction from weak lensing data |
Supranta S. Boruah et.al. |
2502.04158 |
null |
2025-02-06 |
UltraIF: Advancing Instruction Following from the Wild |
Kaikai An et.al. |
2502.04153 |
link |
2025-02-06 |
The Order Effect: Investigating Prompt Sensitivity in Closed-Source LLMs |
Bryan Guan et.al. |
2502.04134 |
null |
2025-02-06 |
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis |
Zhen Ye et.al. |
2502.04128 |
link |
2025-02-06 |
Generative Adversarial Networks Bridging Art and Machine Intelligence |
Junhao Song et.al. |
2502.04116 |
null |
2025-02-06 |
VTutor: An Open-Source SDK for Generative AI-Powered Animated Pedagogical Agents with Multi-Media Output |
Eason Chen et.al. |
2502.04103 |
null |
2025-02-06 |
LLMs to Support a Domain Specific Knowledge Assistant |
Maria-Flavia Lovin et.al. |
2502.04095 |
null |
2025-02-06 |
AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference |
Qingyue Yang et.al. |
2502.04077 |
link |
2025-02-06 |
Content-Rich AIGC Video Quality Assessment via Intricate Text Alignment and Motion-Aware Consistency |
Shangkun Sun et.al. |
2502.04076 |
link |
2025-02-06 |
Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training |
Changhao Jiang et.al. |
2502.04066 |
link |
2025-02-06 |
TQ-DiT: Efficient Time-Aware Quantization for Diffusion Transformers |
Younghye Hwang et.al. |
2502.04056 |
null |
2025-02-06 |
Exploring Imbalanced Annotations for Effective In-Context Learning |
Hongfu Gao et.al. |
2502.04037 |
null |
2025-02-06 |
Fine, I’ll Merge It Myself: A Multi-Fidelity Framework for Automated Model Merging |
Guinan Su et.al. |
2502.04030 |
link |
2025-02-06 |
Echo-Teddy: Preliminary Design and Development of Large Language Model-based Social Robot for Autistic Students |
Unggi Lee et.al. |
2502.04029 |
null |
2025-02-06 |
Quantification of Biodiversity from Historical Survey Text with LLM-based Best-Worst Scaling |
Thomas Haider et.al. |
2502.04022 |
null |
2025-02-06 |
Automating a Complete Software Test Process Using LLMs: An Automotive Case Study |
Shuai Wang et.al. |
2502.04008 |
null |
2025-02-06 |
CAD-Editor: A Locate-then-Infill Framework with Automated Training Data Synthesis for Text-Based CAD Editing |
Yu Yuan et.al. |
2502.03997 |
null |
2025-02-06 |
Ontology-Guided, Hybrid Prompt Learning for Generalization in Knowledge Graph Question Answering |
Longquan Jiang et.al. |
2502.03992 |
link |
2025-02-06 |
Tight Bounds on Jensen’s Gap: Novel Approach with Applications in Generative Modeling |
Marcin Mazur et.al. |
2502.03988 |
null |
2025-02-06 |
MultiFloodSynth: Multi-Annotated Flood Synthetic Dataset Generation |
YoonJe Kang et.al. |
2502.03966 |
null |
2025-02-06 |
MAQInstruct: Instruction-based Unified Event Relation Extraction |
Jun Xu et.al. |
2502.03954 |
null |
2025-02-06 |
LR0.FM: Low-Resolution Zero-shot Classification Benchmark For Foundation Models |
Priyank Pathak et.al. |
2502.03950 |
link |
2025-02-06 |
Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond |
Mardhiyah Sanni et.al. |
2502.03945 |
null |
2025-02-06 |
Unravelling Causal Genetic Biomarkers of Alzheimer’s Disease via Neuron to Gene-token Backtracking in Neural Architecture: A Groundbreaking Reverse-Gene-Finder Approach |
Victor OK Li et.al. |
2502.03938 |
null |
2025-02-06 |
Quantifying Correlations of Machine Learning Models |
Yuanyuan Li et.al. |
2502.03937 |
link |
2025-02-06 |
HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture |
Jai Bardhan et.al. |
2502.03933 |
null |
2025-02-06 |
Experiments with Large Language Models on Retrieval-Augmented Generation for Closed-Source Simulation Software |
Andreas Baumann et.al. |
2502.03916 |
null |
2025-02-06 |
No Free Lunch in Annotation either: An objective evaluation of foundation models for streamlining annotation in animal tracking |
Emil Mededovic et.al. |
2502.03907 |
link |
2025-02-06 |
LeAP: Consistent multi-domain 3D labeling using Foundation Models |
Simon Gebraad et.al. |
2502.03901 |
null |
2025-02-06 |
InfinitePOD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers |
Chenchen Shou et.al. |
2502.03885 |
null |
2025-02-06 |
Rank Also Matters: Hierarchical Configuration for Mixture of Adapter Experts in LLM Fine-Tuning |
Peizhuang Cong et.al. |
2502.03884 |
null |
2025-02-06 |
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation |
Bo Pang et.al. |
2502.03860 |
null |
2025-02-06 |
PAGNet: Pluggable Adaptive Generative Networks for Information Completion in Multi-Agent Communication |
Zhuohui Zhang et.al. |
2502.03845 |
null |
2025-02-06 |
Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis |
Lin Yuan et.al. |
2502.03843 |
null |
2025-02-06 |
FairT2I: Mitigating Social Bias in Text-to-Image Generation via Large Language Model-Assisted Detection and Attribute Rebalancing |
Jinya Sakurai et.al. |
2502.03826 |
null |
2025-02-06 |
Synthetic Poisoning Attacks: The Impact of Poisoned MRI Image on U-Net Brain Tumor Segmentation |
Tianhao Li et.al. |
2502.03825 |
null |
2025-02-06 |
PsyPlay: Personality-Infused Role-Playing Conversational Agents |
Tao Yang et.al. |
2502.03821 |
null |
2025-02-06 |
Large Language Models for Multi-Robot Systems: A Survey |
Peihan Li et.al. |
2502.03814 |
link |
2025-02-06 |
Identify Critical KV Cache in LLM Inference from an Output Perturbation Perspective |
Yuan Feng et.al. |
2502.03805 |
link |
2025-02-06 |
Understanding and Supporting Formal Email Exchange by Answering AI-Generated Questions |
Yusuke Miura et.al. |
2502.03804 |
link |
2025-02-06 |
Enhancing Hallucination Detection through Noise Injection |
Litian Liu et.al. |
2502.03799 |
null |
2025-02-06 |
Distribution learning via neural differential equations: minimal energy regularization and approximation theory |
Youssef Marzouk et.al. |
2502.03795 |
null |
2025-02-06 |
It’s All in The [MASK]: Simple Instruction-Tuning Enables BERT-like Masked Language Models As Generative Classifiers |
Benjamin Clavié et.al. |
2502.03793 |
null |
2025-02-06 |
Iterate to Accelerate: A Unified Framework for Iterative Reasoning and Feedback Convergence |
Jacob Fein-Ashley et.al. |
2502.03787 |
null |
2025-02-06 |
GistVis: Automatic Generation of Word-scale Visualizations from Data-rich Documents |
Ruishi Zou et.al. |
2502.03784 |
link |
2025-02-06 |
Adaptive Semantic Prompt Caching with VectorQ |
Luis Gaspar Schroeder et.al. |
2502.03771 |
null |
2025-02-06 |
Hierarchical Contextual Manifold Alignment for Structuring Latent Representations in Large Language Models |
Meiquan Dong et.al. |
2502.03766 |
null |
2025-02-06 |
Rethinking the Residual Distribution of Locate-then-Editing Methods in Model Editing |
Xiaopeng Li et.al. |
2502.03748 |
null |
2025-02-06 |
Speaking the Language of Teamwork: LLM-Guided Credit Assignment in Multi-Agent Reinforcement Learning |
Muhan Lin et.al. |
2502.03723 |
null |
2025-02-06 |
Boosting Knowledge Graph-based Recommendations through Confidence-Aware Augmentation with Large Language Models |
Rui Cai et.al. |
2502.03715 |
null |
2025-02-06 |
MultiQ&A: An Analysis in Measuring Robustness via Automated Crowdsourcing of Question Perturbations and Answers |
Nicole Cho et.al. |
2502.03711 |
null |
2025-02-06 |
Aggregate and conquer: detecting and steering LLM concepts by combining nonlinear predictors over multiple layers |
Daniel Beaglehole et.al. |
2502.03708 |
null |
2025-02-06 |
LLM Alignment as Retriever Optimization: An Information Retrieval Perspective |
Bowen Jin et.al. |
2502.03699 |
null |
2025-02-06 |
A Comparison of DeepSeek and Other LLMs |
Tianchen Gao et.al. |
2502.03688 |
null |
2025-02-06 |
Conditional Diffusion Models are Medical Image Classifiers that Provide Explainability and Uncertainty for Free |
Gian Mario Favero et.al. |
2502.03687 |
null |
2025-02-06 |
Controlled LLM Decoding via Discrete Auto-regressive Biasing |
Patrick Pynadath et.al. |
2502.03685 |
null |
2025-02-05 |
Reflection-Window Decoding: Text Generation with Selective Refinement |
Zeyu Tang et.al. |
2502.03678 |
null |
2025-02-05 |
Advancing Reasoning in Large Language Models: Promising Methods and Approaches |
Avinash Patil et.al. |
2502.03671 |
null |
2025-02-05 |
Unrealized Expectations: Comparing AI Methods vs Classical Algorithms for Maximum Independent Set |
Yikai Wu et.al. |
2502.03669 |
null |
2025-02-05 |
Privacy-Preserving Generative Models: A Comprehensive Survey |
Debalina Padariya et.al. |
2502.03668 |
null |
2025-02-05 |
Context-Preserving Gradient Modulation for Large Language Models: A Novel Approach to Semantic Consistency in Long-Form Text Generation |
Nirola Kobanov et.al. |
2502.03643 |
null |
2025-02-05 |
SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models |
Daniel Levy et.al. |
2502.03638 |
link |
2025-02-05 |
AdaPhish: AI-Powered Adaptive Defense and Education Resource Against Deceptive Emails |
Rei Meguro et.al. |
2502.03622 |
null |
2025-02-05 |
Bilevel ZOFO: Bridging Parameter-Efficient and Zeroth-Order Techniques for Efficient LLM Fine-Tuning and Meta-Training |
Reza Shirkavand et.al. |
2502.03604 |
null |
2025-02-05 |
HACK: Homomorphic Acceleration via Compression of the Key-Value Cache for Disaggregated LLM Inference |
Zeyu Zhang et.al. |
2502.03589 |
null |
2025-02-05 |
A Mixed-Methods Evaluation of LLM-Based Chatbots for Menopause |
Roshini Deva et.al. |
2502.03579 |
null |
2025-02-05 |
Code Simulation as a Proxy for High-order Tasks in Large Language Models |
Emanuele La Malfa et.al. |
2502.03568 |
null |
2025-02-05 |
Kronecker Mask and Interpretive Prompts are Language-Action Video Learners |
Jingyi Yang et.al. |
2502.03549 |
link |
2025-02-05 |
YINYANG-ALIGN: Benchmarking Contradictory Objectives and Proposing Multi-Objective Optimization based DPO for Text-to-Image Alignment |
Amitava Das et.al. |
2502.03512 |
null |
2025-02-05 |
Do Large Language Model Benchmarks Test Reliability? |
Joshua Vendrow et.al. |
2502.03461 |
link |
2025-02-05 |
Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training |
Boyao Wang et.al. |
2502.03460 |
null |
2025-02-05 |
A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) |
Yiye Chen et.al. |
2502.03450 |
null |
2025-02-05 |
Dress-1-to-3: Single Image to Simulation-Ready 3D Outfit with Diffusion Prior and Differentiable Physics |
Xuan Li et.al. |
2502.03449 |
null |
2025-02-05 |
BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving |
Ran Xin et.al. |
2502.03438 |
null |
2025-02-05 |
Taking a Big Step: Large Learning Rates in Denoising Score Matching Prevent Memorization |
Yu-Han Wu et.al. |
2502.03435 |
null |
2025-02-05 |
On Fairness of Unified Multimodal Large Language Model for Image Generation |
Ming Liu et.al. |
2502.03429 |
null |
2025-02-05 |
Harnessing Large Language Models for Curated Code Reviews |
Oussama Ben Sghaier et.al. |
2502.03425 |
link |
2025-02-05 |
Can Text-to-Image Generative Models Accurately Depict Age? A Comparative Study on Synthetic Portrait Generation and Age Estimation |
Alexey A. Novikov et.al. |
2502.03420 |
null |
2025-02-05 |
Think or Step-by-Step? UnZIPping the Black Box in Zero-Shot Prompts |
Nikta Gohari Sadr et.al. |
2502.03418 |
null |
2025-02-05 |
SPRI: Aligning Large Language Models with Context-Situated Principles |
Hongli Zhan et.al. |
2502.03397 |
null |
2025-02-05 |
Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications |
Issar Arab et.al. |
2502.03395 |
null |
2025-02-05 |
LIMO: Less is More for Reasoning |
Yixin Ye et.al. |
2502.03387 |
link |
2025-02-05 |
Transformers and Their Roles as Time Series Foundation Models |
Dennis Wu et.al. |
2502.03383 |
null |
2025-02-05 |
Demystifying Long Chain-of-Thought Reasoning in LLMs |
Edward Yeo et.al. |
2502.03373 |
link |
2025-02-05 |
PalimpChat: Declarative and Interactive AI analytics |
Chunwei Liu et.al. |
2502.03368 |
null |
2025-02-05 |
RadVLM: A Multitask Conversational Vision-Language Model for Radiology |
Nicolas Deperrois et.al. |
2502.03333 |
null |
2025-02-05 |
ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model |
Qiguang Chen et.al. |
2502.03325 |
null |
2025-02-05 |
Out-of-Distribution Detection using Synthetic Data Generation |
Momin Abbas et.al. |
2502.03323 |
null |
2025-02-05 |
Simplifying Formal Proof-Generating Models with ChatGPT and Basic Searching Techniques |
Sangjun Han et.al. |
2502.03321 |
null |
2025-02-05 |
Intent Representation Learning with Large Language Model for Recommendation |
Yu Wang et.al. |
2502.03307 |
link |
2025-02-05 |
Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning |
Qitao Tan et.al. |
2502.03304 |
null |
2025-02-05 |
MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters |
Amin Dada et.al. |
2502.03298 |
null |
2025-02-05 |
SymAgent: A Neural-Symbolic Self-Learning Agent Framework for Complex Reasoning over Knowledge Graphs |
Ben Liu et.al. |
2502.03283 |
null |
2025-02-05 |
Posterior SBC: Simulation-Based Calibration Checking Conditional on Data |
Teemu Säilynoja et.al. |
2502.03279 |
link |
2025-02-05 |
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning |
DiJia Su et.al. |
2502.03275 |
null |
2025-02-05 |
ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models |
Ying Zhang et.al. |
2502.03266 |
link |
2025-02-05 |
General Time-series Model for Universal Knowledge Representation of Multivariate Time-Series data |
Cheng He et.al. |
2502.03264 |
null |
2025-02-05 |
CARROT: A Cost Aware Rate Optimal Router |
Seamus Somerstep et.al. |
2502.03261 |
null |
2025-02-05 |
RiemannGFM: Learning a Graph Foundation Model from Riemannian Geometry |
Li Sun et.al. |
2502.03251 |
null |
2025-02-05 |
Exploring the Security Threats of Knowledge Base Poisoning in Retrieval-Augmented Code Generation |
Bo Lin et.al. |
2502.03233 |
null |
2025-02-05 |
Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language Models |
Jialiang Wu et.al. |
2502.03199 |
null |
2025-02-05 |
MaxInfo: A Training-Free Key-Frame Selection Method Using Maximum Volume for Enhanced Video Understanding |
Pengyi Li et.al. |
2502.03183 |
null |
2025-02-05 |
PICBench: Benchmarking LLMs for Photonic Integrated Circuits Design |
Yuchao Wu et.al. |
2502.03159 |
link |
2025-02-05 |
Strategizing with AI: Insights from a Beauty Contest Experiment |
Iuliia Alekseenko et.al. |
2502.03158 |
null |
2025-02-05 |
Scalable In-Context Learning on Tabular Data via Retrieval-Augmented Large Language Models |
Xumeng Wen et.al. |
2502.03147 |
null |
2025-02-05 |
Symmetry-Aware Bayesian Flow Networks for Crystal Generation |
Laura Ruple et.al. |
2502.03146 |
null |
2025-02-05 |
Teaching Large Language Models Number-Focused Headline Generation With Key Element Rationales |
Zhen Qian et.al. |
2502.03129 |
null |
2025-02-05 |
Metis: A Foundation Speech Generation Model with Masked Generative Pre-training |
Yuancheng Wang et.al. |
2502.03128 |
link |
2025-02-05 |
Structured Token Retention and Computational Memory Paths in Large Language Models |
Jonathan Delena et.al. |
2502.03102 |
null |
2025-02-05 |
Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms |
Xuerui Su et.al. |
2502.03095 |
null |
2025-02-05 |
Implementing Large Quantum Boltzmann Machines as Generative AI Models for Dataset Balancing |
Salvatore Sinno et.al. |
2502.03086 |
null |
2025-02-05 |
IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates |
Aissatou Diallo et.al. |
2502.03080 |
null |
2025-02-05 |
Poisson Flow Joint Model for Multiphase contrast-enhanced CT |
Rongjun Ge et.al. |
2502.03079 |
null |
2025-02-05 |
Automatic Prompt Optimization Techniques: Exploring the Potential for Synthetic Data Generation |
Nina Freise et.al. |
2502.03078 |
null |
2025-02-05 |
Optimizing Electric Vehicles Charging using Large Language Models and Graph Neural Networks |
Stavros Orfanoudakis et.al. |
2502.03067 |
null |
2025-02-05 |
Understanding and Enhancing the Transferability of Jailbreaking Attacks |
Runqi Lin et.al. |
2502.03052 |
link |
2025-02-05 |
RepLoRA: Reparameterizing Low-Rank Adaptation via the Perspective of Mixture of Experts |
Tuan Truong et.al. |
2502.03044 |
null |
2025-02-05 |
Large Language Models Are Universal Recommendation Learners |
Junguang Jiang et.al. |
2502.03041 |
null |
2025-02-05 |
FuXi-$α$: Scaling Recommendation Model with Feature Interaction Enhanced Transformer |
Yufei Ye et.al. |
2502.03036 |
null |
2025-02-05 |
Knowledge Distillation from Large Language Models for Household Energy Modeling |
Mohannad Takrouri et.al. |
2502.03034 |
null |
2025-02-05 |
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models |
Daniil Laptev et.al. |
2502.03032 |
null |
2025-02-05 |
Scaling Laws for Upcycling Mixture-of-Experts Language Models |
Seng Pei Liew et.al. |
2502.03009 |
null |
2025-02-05 |
MedBioLM: Optimizing Medical and Biological QA with Fine-Tuned Large Language Models and Retrieval-Augmented Generation |
Seonok Kim et.al. |
2502.03004 |
null |
2025-02-05 |
Training an LLM-as-a-Judge Model: Pipeline, Insights, and Practical Lessons |
Renjun Hu et.al. |
2502.02988 |
null |
2025-02-05 |
Membership Inference Attack Should Move On to Distributional Statistics for Distilled Generative Models |
Muxing Li et.al. |
2502.02970 |
null |
2025-02-05 |
The Labeled Coupon Collector Problem with Random Sample Sizes and Partial Recovery |
Shoham Shimon Berrebi et.al. |
2502.02968 |
null |
2025-02-05 |
Large Language Model Adversarial Landscape Through the Lens of Attack Objectives |
Nan Wang et.al. |
2502.02960 |
null |
2025-02-05 |
Position: Editing Large Language Models Poses Serious Safety Risks |
Paul Youssef et.al. |
2502.02958 |
null |
2025-02-05 |
Control Search Rankings, Control the World: What is a Good Search Engine? |
Simon Coghlan et.al. |
2502.02957 |
null |
2025-02-05 |
LLM-KT: Aligning Large Language Models with Knowledge Tracing using a Plug-and-Play Instruction |
Ziwei Wang et.al. |
2502.02945 |
null |
2025-02-05 |
Large Language Model Guided Self-Debugging Code Generation |
Muntasir Adnan et.al. |
2502.02928 |
null |
2025-02-05 |
SPARC: Subspace-Aware Prompt Adaptation for Robust Continual Learning in LLMs |
Dinithi Jayasuriya et.al. |
2502.02909 |
null |
2025-02-05 |
AI-driven materials design: a mini-review |
Mouyang Cheng et.al. |
2502.02905 |
null |
2025-02-05 |
A Benchmark for the Detection of Metalinguistic Disagreements between LLMs and Knowledge Graphs |
Bradley P. Allen et.al. |
2502.02896 |
null |
2025-02-05 |
Lowering the Barrier of Machine Learning: Achieving Zero Manual Labeling in Review Classification Using LLMs |
Yejian Zhang et.al. |
2502.02893 |
null |
2025-02-05 |
Expertized Caption Auto-Enhancement for Video-Text Retrieval |
Junxiang Chen et.al. |
2502.02885 |
link |
2025-02-05 |
SensorChat: Answering Qualitative and Quantitative Questions during Long-Term Multimodal Sensor Interactions |
Xiaofan Yu et.al. |
2502.02883 |
null |
2025-02-05 |
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning |
Yibo Yan et.al. |
2502.02871 |
null |
2025-02-05 |
A Systematic Approach for Assessing Large Language Models’ Test Case Generation Capability |
Hung-Fu Chang et.al. |
2502.02866 |
null |
2025-02-05 |
OceanChat: The Effect of Virtual Conversational AI Agents on Sustainable Attitude and Behavior Change |
Pat Pataranutaporn et.al. |
2502.02863 |
null |
2025-02-05 |
A Survey of Sample-Efficient Deep Learning for Change Detection in Remote Sensing: Tasks, Strategies, and Challenges |
Lei Ding et.al. |
2502.02835 |
null |
2025-02-05 |
COFFE: A Code Efficiency Benchmark for Code Generation |
Yun Peng et.al. |
2502.02827 |
link |
2025-02-05 |
Accessible and Portable LLM Inference by Compiling Computational Graphs into SQL |
Wenbo Sun et.al. |
2502.02818 |
null |
2025-02-05 |
Mol-LLM: Generalist Molecular LLM with Improved Graph Utilization |
Chanhui Lee et.al. |
2502.02810 |
null |
2025-02-05 |
CAMI: A Counselor Agent Supporting Motivational Interviewing through State Inference and Topic Exploration |
Yizhe Yang et.al. |
2502.02807 |
null |
2025-02-05 |
Leveraging the true depth of LLMs |
Ramón Calvo González et.al. |
2502.02790 |
null |
2025-02-05 |
Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation |
Jingyu Liu et.al. |
2502.02789 |
link |
2025-02-05 |
SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models |
Amirhossein Dabiriaghdam et.al. |
2502.02787 |
link |
2025-02-04 |
Classroom Simulacra: Building Contextual Student Generative Agents in Online Education for Learning Behavioral Simulation |
Songlin Xu et.al. |
2502.02780 |
link |
2025-02-04 |
3D Foundation AI Model for Generalizable Disease Detection in Head Computed Tomography |
Weicheng Zhu et.al. |
2502.02779 |
null |
2025-02-04 |
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning |
Chaofan Lin et.al. |
2502.02770 |
null |
2025-02-04 |
LLM-USO: Large Language Model-based Universal Sizing Optimizer |
Karthik Somayaji N. S et.al. |
2502.02764 |
null |
2025-02-04 |
Rethinking Vision Transformer for Object Centric Foundation Models |
Manuel Traub et.al. |
2502.02763 |
null |
2025-02-04 |
Too Noisy To Learn: Enhancing Data Quality for Code Review C |
Chunhua Liu et.al. |
2502.02757 |
null |
2025-02-04 |
PatchPilot: A Stable and Cost-Efficient Agentic Patching Framework |
Hongwei Li et.al. |
2502.02747 |
null |
2025-02-04 |
LLM Bandit: Cost-Efficient LLM Generation via Preference-Conditioned Dynamic Routing |
Yang Li et.al. |
2502.02743 |
null |
2025-02-04 |
RFMedSAM 2: Automatic Prompt Refinement for Enhanced Volumetric Medical Image Segmentation with SAM 2 |
Bin Xie et.al. |
2502.02741 |
null |
2025-02-04 |
SmolLM2: When Smol Goes Big – Data-Centric Training of a Small Language Model |
Loubna Ben Allal et.al. |
2502.02737 |
null |
2025-02-04 |
Peri-LN: Revisiting Layer Normalization in the Transformer Architecture |
Jeonghoon Kim et.al. |
2502.02732 |
null |
2025-02-04 |
Cross-Lingual Transfer for Low-Resource Natural Language Processing |
Iker García-Ferrero et.al. |
2502.02722 |
null |
2025-02-04 |
Astromer 2 |
Cristobal Donoso-Oliva et.al. |
2502.02717 |
null |
2025-02-04 |
A Unified Understanding and Evaluation of Steering Methods |
Shawn Im et.al. |
2502.02716 |
null |
2025-02-04 |
An Analysis of LLM Fine-Tuning and Few-Shot Learning for Flaky Test Detection and Classification |
Riddhi More et.al. |
2502.02715 |
null |
2025-02-04 |
Exploring LLMs Impact on Student-Created User Stories and Acceptance Testing in Software Development |
Allan Brockenbrough et.al. |
2502.02675 |
null |
2025-02-04 |
MedRAX: Medical Reasoning Agent for Chest X-ray |
Adibvafa Fallahpour et.al. |
2502.02673 |
link |
2025-02-04 |
Transformers Boost the Performance of Decision Trees on Tabular Data across Sample Sizes |
Mayuka Jayawardhana et.al. |
2502.02672 |
link |
2025-02-04 |
Machine-learning approaches to accelerating lattice simulations |
Scott Lawrence et.al. |
2502.02670 |
null |
2025-02-04 |
A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation (GALI) |
Yan Li et.al. |
2502.02659 |
link |
2025-02-04 |
Introducing the Rhea simulations of Milky-Way-like galaxies I: Effect of gravitational potential on morphology and star formation |
Junia Göller et.al. |
2502.02646 |
null |
2025-02-04 |
COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation |
Xueqing Deng et.al. |
2502.02589 |
null |
2025-02-04 |
Open Materials Generation with Stochastic Interpolants |
Philipp Hoellmer et.al. |
2502.02582 |
null |
2025-02-04 |
A comparison of translation performance between DeepL and Supertext |
Alex Flückiger et.al. |
2502.02577 |
link |
2025-02-04 |
Are Language Models Up to Sequential Optimization Problems? From Evaluation to a Hegelian-Inspired Enhancement |
Soheil Abbasloo et.al. |
2502.02573 |
null |
2025-02-04 |
Learning the RoPEs: Better 2D and 3D Position Encodings with STRING |
Connor Schenck et.al. |
2502.02562 |
null |
2025-02-04 |
Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation |
Junha Lee et.al. |
2502.02548 |
null |
2025-02-04 |
LLMs for Generation of Architectural Components: An Exploratory Empirical Study in the Serverless World |
Shrikara Arun et.al. |
2502.02539 |
null |
2025-02-04 |
Adaptive Self-improvement LLM Agentic System for ML Library Development |
Genghan Zhang et.al. |
2502.02534 |
link |
2025-02-04 |
Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies |
Han Zhou et.al. |
2502.02533 |
null |
2025-02-04 |
Generative Modeling on Lie Groups via Euclidean Generalized Score Matching |
Marco Bertolini et.al. |
2502.02513 |
null |
2025-02-04 |
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search |
Maohao Shen et.al. |
2502.02508 |
null |
2025-02-04 |
Learning to generate physical ocean states: Towards hybrid climate modeling |
Etienne Meunier et.al. |
2502.02499 |
null |
2025-02-04 |
EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization |
Yize Wu et.al. |
2502.02493 |
null |
2025-02-04 |
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study |
Menglong Cui et.al. |
2502.02481 |
null |
2025-02-04 |
Style transfer as data augmentation: evaluating unpaired image-to-image translation models in mammography |
Emir Ahmed et.al. |
2502.02475 |
null |
2025-02-04 |
Mind the Gap: Evaluating Patch Embeddings from General-Purpose and Histopathology Foundation Models for Cell Segmentation and Classification |
Valentina Vadori et.al. |
2502.02471 |
link |
2025-02-04 |
SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency |
Qianhao Yuan et.al. |
2502.02458 |
link |
2025-02-04 |
Personalization Toolkit: Training Free Personalization of Large Vision Language Models |
Soroush Seifi et.al. |
2502.02452 |
null |
2025-02-04 |
Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study |
Calvin Yixiang Cheng et.al. |
2502.02451 |
link |
2025-02-04 |
Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models |
Haoran Ye et.al. |
2502.02444 |
null |
2025-02-04 |
LLMER: Crafting Interactive Extended Reality Worlds with JSON Data Generated by Large Language Models |
Jiangong Chen et.al. |
2502.02441 |
link |
2025-02-04 |
Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment |
Yaling Shen et.al. |
2502.02438 |
null |
2025-02-04 |
TransformDAS: Mapping Φ-OTDR Signals to Riemannian Manifold for Robust Classification |
Jiaju Kang et.al. |
2502.02428 |
null |
2025-02-04 |
Activation-Informed Merging of Large Language Models |
Amin Heyrani Nobari et.al. |
2502.02421 |
link |
2025-02-04 |
Towards Fast Graph Generation via Autoregressive Noisy Filtration Modeling |
Markus Krimmel et.al. |
2502.02415 |
link |
2025-02-04 |
AI-Powered, But Power-Hungry? Energy Efficiency of LLM-Generated Code |
Lola Solovyeva et.al. |
2502.02412 |
null |
2025-02-04 |
Avoiding spurious sharpness minimization broadens applicability of SAM |
Sidak Pal Singh et.al. |
2502.02407 |
null |
2025-02-04 |
LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models |
Tzu-Tao Chang et.al. |
2502.02406 |
null |
2025-02-04 |
CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning |
Jianfeng Pan et.al. |
2502.02390 |
null |
2025-02-04 |
Hypergraph Link Prediction via Hyperedge Copying |
Xie He et.al. |
2502.02386 |
link |
2025-02-04 |
STAIR: Improving Safety Alignment with Introspective Reasoning |
Yichi Zhang et.al. |
2502.02384 |
link |
2025-02-04 |
Evaluating the Effectiveness of LLMs in Fixing Maintainability Issues in Real-World Projects |
Henrique Nunes et.al. |
2502.02368 |
null |
2025-02-04 |
Field Matching: an Electrostatic Paradigm to Generate and Transfer Data |
Alexander Kolesov et.al. |
2502.02367 |
null |
2025-02-04 |
Premise-Augmented Reasoning Chains Improve Error Identification in Math reasoning with LLMs |
Sagnik Mukherjee et.al. |
2502.02362 |
null |
2025-02-04 |
SHIELD: APT Detection and Intelligent Explanation Using LLM |
Parth Atulbhai Gandhi et.al. |
2502.02342 |
null |
2025-02-04 |
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking |
Jinyang Wu et.al. |
2502.02339 |
null |
2025-02-04 |
ReSpark: Leveraging Previous Data Reports as References to Generate New Reports with LLMs |
Yuan Tian et.al. |
2502.02329 |
null |
2025-02-04 |
Information-Theoretic Proofs for Diffusion Sampling |
Galen Reeves et.al. |
2502.02305 |
null |
2025-02-04 |
Density Ratio Estimation with Conditional Probability Paths |
Hanlin Yu et.al. |
2502.02300 |
null |
2025-02-04 |
Evalita-LLM: Benchmarking Large Language Models on Italian |
Bernardo Magnini et.al. |
2502.02289 |
null |
2025-02-04 |
Adaptive Resource Allocation Optimization Using Large Language Models in Dynamic Wireless Environments |
Hyeonho Noh et.al. |
2502.02287 |
null |
2025-02-04 |
Conversation AI Dialog for Medicare powered by Finetuning and Retrieval Augmented Generation |
Atharva Mangeshkumar Agrawal et.al. |
2502.02249 |
null |
2025-02-04 |
Flatten Graphs as Sequences: Transformers are Scalable Graph Generators |
Dexiong Chen et.al. |
2502.02216 |
null |
2025-02-04 |
When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks |
Felix Drinkall et.al. |
2502.02199 |
link |
2025-02-04 |
Large language models in climate and sustainability policy: limits and opportunities |
Francesca Larosa et.al. |
2502.02191 |
null |
2025-02-04 |
ShapeShifter: 3D Variations Using Multiscale and Sparse Point-Voxel Diffusion |
Nissim Maruani et.al. |
2502.02187 |
null |
2025-02-04 |
Generative Kernel Spectral Clustering |
David Winant et.al. |
2502.02185 |
null |
2025-02-04 |
Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge |
Daniel Tamayo et.al. |
2502.02173 |
link |
2025-02-04 |
EditIQ: Automated Cinematic Editing of Static Wide-Angle Videos via Dialogue Interpretation and Saliency Cues |
Rohit Girmaji et.al. |
2502.02172 |
null |
2025-02-04 |
Risk-Aware Driving Scenario Analysis with Large Language Models |
Yuan Gao et.al. |
2502.02145 |
link |
2025-02-04 |
IPO: Iterative Preference Optimization for Text-to-Video Generation |
Xiaomeng Yang et.al. |
2502.02088 |
null |
2025-02-04 |
Position Paper: Building Trust in Synthetic Data for Clinical AI |
Krishan Agyakari Raja Babu et.al. |
2502.02076 |
null |
2025-02-04 |
Rethinking stance detection: A theoretically-informed research agenda for user-level inference using language models |
Prasanta Bhattacharya et.al. |
2502.02074 |
null |
2025-02-04 |
ASCenD-BDS: Adaptable, Stochastic and Context-aware framework for Detection of Bias, Discrimination and Stereotyping |
Rajiv Bahl et.al. |
2502.02072 |
null |
2025-02-04 |
Robust and Secure Code Watermarking for Large Language Models via ML/Crypto Codesign |
Ruisi Zhang et.al. |
2502.02068 |
null |
2025-02-04 |
AdaptBot: Combining LLM with Knowledge Graphs and Human Input for Generic-to-Specific Task Decomposition and Knowledge Refinement |
Shivam Singh et.al. |
2502.02067 |
link |
2025-02-04 |
Anticipate & Act: Integrating LLMs and Classical Planning for Efficient Task Execution in Household Environments |
Raghav Arora et.al. |
2502.02066 |
null |
2025-02-04 |
CASIM: Composite Aware Semantic Injection for Text to Motion Generation |
Che-Jui Chang et.al. |
2502.02063 |
null |
2025-02-04 |
Large Language Models for Recommendation with Deliberative User Preference Alignment |
Yi Fang et.al. |
2502.02061 |
null |
2025-02-04 |
Efficient Domain Adaptation of Multimodal Embeddings using Contrastive Learning |
Georgios Margaritis et.al. |
2502.02048 |
null |
2025-02-04 |
Contextual Memory Reweaving in Large Language Models Using Layered Latent State Reconstruction |
Frederick Dillon et.al. |
2502.02046 |
null |
2025-02-04 |
M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference |
Nikhil Bhendawade et.al. |
2502.02040 |
null |
2025-02-04 |
ContinuouSP: Generative Model for Crystal Structure Prediction with Invariance and Continuity |
Yuji Tone et.al. |
2502.02026 |
link |
2025-02-04 |
From Accidents to Insights: Leveraging Multimodal Data for Scenario-Driven ADS Testing |
Siwei Luo et.al. |
2502.02025 |
null |
2025-02-04 |
ComplexDec: A Domain-robust High-fidelity Neural Audio Codec with Complex Spectrum Modeling |
Yi-Chiao Wu et.al. |
2502.02019 |
null |
2025-02-04 |
Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment |
Shuo Wang et.al. |
2502.02017 |
null |
2025-02-04 |
A Periodic Bayesian Flow for Material Generation |
Hanlin Wu et.al. |
2502.02016 |
link |
2025-02-04 |
Layer by Layer: Uncovering Hidden Representations in Language Models |
Oscar Skean et.al. |
2502.02013 |
null |
2025-02-04 |
LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations |
Ziyang Ye et.al. |
2502.02009 |
null |
2025-02-04 |
Reasoning Bias of Next Token Prediction Training |
Pengxiao Lin et.al. |
2502.02007 |
null |
2025-02-04 |
FinRLlama: A Solution to LLM-Engineered Signals Challenge at FinRL Contest 2024 |
Arnav Grover et.al. |
2502.01992 |
null |
2025-02-04 |
Can LLMs Assist Annotators in Identifying Morality Frames? – Case Study on Vaccination Debate on Social Media |
Tunazzina Islam et.al. |
2502.01991 |
null |
2025-02-04 |
Generative Data Mining with Longtail-Guided Diffusion |
David S. Hayden et.al. |
2502.01980 |
null |
2025-02-04 |
Gradient-Regularized Latent Space Modulation in Large Language Models for Structured Contextual Synthesis |
Derek Yotheringhay et.al. |
2502.01979 |
null |
2025-02-04 |
AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs |
Hongxin Li et.al. |
2502.01977 |
null |
2025-02-04 |
CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing |
Wenhao Zheng et.al. |
2502.01976 |
null |
2025-02-04 |
Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning |
Jinlong Pang et.al. |
2502.01968 |
null |
2025-02-04 |
MPIC: Position-Independent Multimodal Context Caching System for Efficient MLLM Serving |
Shiju Zhao et.al. |
2502.01960 |
null |
2025-02-04 |
Local minima of the empirical risk in high dimension: General theorems and convex examples |
Kiana Asgari et.al. |
2502.01953 |
null |
2025-02-04 |
DAMO: Data- and Model-aware Alignment of Multi-modal LLMs |
Jinda Lu et.al. |
2502.01943 |
link |
2025-02-04 |
Can LLMs Maintain Fundamental Abilities under KV Cache Compression? |
Xiang Liu et.al. |
2502.01941 |
null |
2025-02-04 |
Toward a Low-Cost Perception System in Autonomous Vehicles: A Spectrum Learning Approach |
Mohammed Alsakabi et.al. |
2502.01940 |
null |
2025-02-04 |
Distributionally Robust Direct Preference Optimization |
Zaiyan Xu et.al. |
2502.01930 |
null |
2025-02-04 |
PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling |
Avery Ma et.al. |
2502.01925 |
null |
2025-02-04 |
LAST SToP For Modeling Asynchronous Time Series |
Shubham Gupta et.al. |
2502.01922 |
null |
2025-02-04 |
Anomaly Detection via Autoencoder Composite Features and NCE |
Yalin Liao et.al. |
2502.01920 |
null |
2025-02-04 |
Unlocking Efficient Large Inference Models: One-Bit Unrolling Tips the Scales |
Arian Eamaz et.al. |
2502.01908 |
null |
2025-02-04 |
Rethinking Homogeneity of Vision and Text Tokens in Large Vision-and-Language Models |
Chia-Wen Kuo et.al. |
2502.01906 |
null |
2025-02-04 |
Conceptual Metaphor Theory as a Prompting Paradigm for Large Language Models |
Oliver Kramer et.al. |
2502.01901 |
null |
2025-02-03 |
Latent Lexical Projection in Large Language Models: A Novel Approach to Implicit Representation Refinement |
Ziad Shaker et.al. |
2502.01882 |
null |
2025-02-03 |
SE Arena: Benchmarking Software Engineering Chatbots with Iterative Interactions |
Zhimin Zhao et.al. |
2502.01860 |
null |
2025-02-03 |
Security and Quality in LLM-Generated Code: A Multi-Language, Multi-Model Analysis |
Mohammed Kharma et.al. |
2502.01853 |
null |
2025-02-03 |
Foundation Model-Based Apple Ripeness and Size Estimation for Selective Harvesting |
Keyi Zhu et.al. |
2502.01850 |
link |
2025-02-03 |
Relatively-Secure LLM-Based Steganography via Constrained Markov Decision Processes |
Yu-Shin Huang et.al. |
2502.01827 |
link |
2025-02-03 |
Agentic Bug Reproduction for Effective Automated Program Repair at Google |
Runxiang Cheng et.al. |
2502.01821 |
null |
2025-02-03 |
Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning |
Hanyang Zhao et.al. |
2502.01819 |
null |
2025-02-03 |
SelfCheckAgent: Zero-Resource Hallucination Detection in Generative Large Language Models |
Diyana Muhammed et.al. |
2502.01812 |
null |
2025-02-03 |
Toward Neurosymbolic Program Comprehension |
Alejandro Velasco et.al. |
2502.01806 |
null |
2025-02-03 |
Discovering Chunks in Neural Embeddings for Interpretability |
Shuchen Wu et.al. |
2502.01803 |
null |
2025-02-03 |
Harmful Terms and Where to Find Them: Measuring and Modeling Unfavorable Financial Terms and Conditions in Shopping Websites at Scale |
Elisa Tsai et.al. |
2502.01798 |
link |
2025-01-31 |
Vintix: Action Model via In-Context Reinforcement Learning |
Andrey Polubarov et.al. |
2501.19400 |
link |
2025-01-31 |
Do LLMs Strategically Reveal, Conceal, and Infer Information? A Theoretical and Empirical Analysis in The Chameleon Game |
Mustafa O. Karabag et.al. |
2501.19398 |
link |
2025-01-31 |
Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models |
Alina Shutova et.al. |
2501.19392 |
link |
2025-01-31 |
Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models |
Wenzhi Fang et.al. |
2501.19389 |
link |
2025-02-03 |
SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions |
Dominik Wagner et.al. |
2501.19377 |
null |
2025-01-31 |
Beyond Fixed Horizons: A Theoretical Framework for Adaptive Denoising Diffusions |
Sören Christensen et.al. |
2501.19373 |
null |
2025-01-31 |
We’re Different, We’re the Same: Creative Homogeneity Across LLMs |
Emily Wenger et.al. |
2501.19361 |
null |
2025-01-31 |
Mechanical Properties of the Meninges: Large Language Model Assisted Systematic Review of over 25,000 Studies |
Brandon P. Chelstrom et.al. |
2501.19359 |
null |
2025-01-31 |
The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking |
Yuchun Miao et.al. |
2501.19358 |
null |
2025-01-31 |
Addressing the correlation of Stokes-shifted photons emitted from two quantum emitters |
Adrián Juan-Delgado et.al. |
2501.19356 |
null |
2025-01-31 |
Do Large Multimodal Models Solve Caption Generation for Scientific Figures? Lessons Learned from SCICAP Challenge 2023 |
Ting-Yao E. Hsu et.al. |
2501.19353 |
null |
2025-01-31 |
Towards Adaptive Self-Improvement for Smarter Energy Systems |
Alexander Sommer et.al. |
2501.19340 |
null |
2025-01-31 |
PixelWorld: Towards Perceiving Everything as Pixels |
Zhiheng Lyu et.al. |
2501.19339 |
null |
2025-01-31 |
Homogeneity Bias as Differential Sampling Uncertainty in Language Models |
Messi H. J. Lee et.al. |
2501.19337 |
null |
2025-01-31 |
Reward-Guided Speculative Decoding for Efficient LLM Reasoning |
Baohao Liao et.al. |
2501.19324 |
null |
2025-01-31 |
MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems |
Anirudh Chari et.al. |
2501.19318 |
null |
2025-01-31 |
LLM-based Affective Text Generation Quality Based on Different Quantization Values |
Yarik Menchaca Resendiz et.al. |
2501.19317 |
null |
2025-01-31 |
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment |
Gregor Bachmann et.al. |
2501.19309 |
null |
2025-02-03 |
SETS: Leveraging Self-Verification and Self-Correction for Improved Test-Time Scaling |
Jiefeng Chen et.al. |
2501.19306 |
null |
2025-01-31 |
Beyond checkmate: exploring the creative chokepoints in AI text |
Nafis Irtiza Tripto et.al. |
2501.19301 |
link |
2025-01-31 |
Offline Learning for Combinatorial Multi-armed Bandits |
Xutong Liu et.al. |
2501.19300 |
null |
2025-01-31 |
Synthetic User Behavior Sequence Generation with Large Language Models for Smart Homes |
Zhiyao Xu et.al. |
2501.19298 |
null |
2025-01-31 |
Analysis of LLMs vs Human Experts in Requirements Engineering |
Cory Hymel et.al. |
2501.19297 |
null |
2025-01-31 |
Low-Cost and Comprehensive Non-textual Input Fuzzing with LLM-Synthesized Input Generators |
Kunpeng Zhang et.al. |
2501.19282 |
null |
2025-01-31 |
Pheromone-based Learning of Optimal Reasoning Paths |
Anirudh Chari et.al. |
2501.19278 |
null |
2025-01-31 |
From Assistance to Autonomy – A Researcher Study on the Potential of AI Support for Qualitative Data Analysis |
Elisabeth Kirsten et.al. |
2501.19275 |
null |
2025-01-31 |
Jackpot! Alignment as a Maximal Lottery |
Roberto-Rafael Maura-Rivero et.al. |
2501.19266 |
null |
2025-01-31 |
Neuro-LIFT: A Neuromorphic, LLM-based Interactive Framework for Autonomous Drone FlighT at the Edge |
Amogh Joshi et.al. |
2501.19259 |
null |
2025-01-31 |
A Zero-Shot Generalization Framework for LLM-Driven Cross-Domain Sequential Recommendation |
Yunzhe Li et.al. |
2501.19232 |
null |
2025-01-31 |
Autonomous Legacy Web Application Upgrades Using a Multi-Agent System |
Valtteri Ala-Salmi et.al. |
2501.19204 |
link |
2025-02-03 |
Improving the Robustness of Representation Misdirection for Large Language Model Unlearning |
Dang Huu-Tien et.al. |
2501.19202 |
link |
2025-01-31 |
Efficient Reasoning with Hidden Thinking |
Xuan Shen et.al. |
2501.19201 |
link |
2025-01-31 |
Enhancing Model Defense Against Jailbreaks with Proactive Safety Reasoning |
Xianglin Yang et.al. |
2501.19180 |
null |
2025-01-31 |
No Foundations without Foundations – Why semi-mechanistic models are essential for regulatory biology |
Luka Kovačević et.al. |
2501.19178 |
null |
2025-01-31 |
Position: Contextual Integrity Washing for Language Models |
Yan Shvartzshnaider et.al. |
2501.19173 |
null |
2025-01-31 |
Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs |
Kejia Zhang et.al. |
2501.19164 |
null |
2025-01-31 |
A theoretical framework for overfitting in energy-based modeling |
Giovanni Catania et.al. |
2501.19158 |
null |
2025-01-31 |
A Tensor-Train Decomposition based Compression of LLMs on Group Vector Systolic Accelerator |
Sixiao Huang et.al. |
2501.19135 |
null |
2025-01-31 |
Unraveling Zeroth-Order Optimization through the Lens of Low-Dimensional Structured Perturbations |
Sihwan Park et.al. |
2501.19099 |
null |
2025-01-31 |
Ambient Denoising Diffusion Generative Adversarial Networks for Establishing Stochastic Object Models from Noisy Image Data |
Xichen Xu et.al. |
2501.19094 |
null |
2025-01-31 |
Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models |
Jialin Zhao et.al. |
2501.19090 |
null |
2025-01-31 |
Fairness Analysis of CLIP-Based Foundation Models for X-Ray Image Classification |
Xiangyu Sun et.al. |
2501.19086 |
null |
2025-01-31 |
Enhancing Code Generation for Low-Resource Languages: No Silver Bullet |
Alessandro Giagnorio et.al. |
2501.19085 |
null |
2025-01-31 |
Concept Steerers: Leveraging K-Sparse Autoencoders for Controllable Generations |
Dahye Kim et.al. |
2501.19066 |
link |
2025-01-31 |
TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs |
Yan Sun et.al. |
2501.19057 |
null |
2025-01-31 |
Enabling Autonomic Microservice Management through Self-Learning Agents |
Fenglin Yu et.al. |
2501.19056 |
null |
2025-01-31 |
Text-to-CAD Generation Through Infusing Visual Feedback in Large Language Models |
Ruiyu Wang et.al. |
2501.19054 |
null |
2025-01-31 |
Swarm-Gen: Fast Generation of Diverse Feasible Swarm Behaviors |
Simon Idoko et.al. |
2501.19042 |
link |
2025-01-31 |
Towards the Worst-case Robustness of Large Language Models |
Huanran Chen et.al. |
2501.19040 |
null |
2025-01-31 |
Beyond Token Compression: A Training-Free Reduction Framework for Efficient Visual Processing in MLLMs |
Hongliang Li et.al. |
2501.19036 |
null |
2025-01-31 |
XRF V2: A Dataset for Action Summarization with Wi-Fi Signals, and IMUs in Phones, Watches, Earbuds, and Glasses |
Bo Lan et.al. |
2501.19034 |
link |
2025-01-31 |
Multilayer Networks in Neuroimaging |
Vesna Vuksanovic et.al. |
2501.19024 |
null |
2025-01-31 |
Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation |
Bin Zhu et.al. |
2501.19017 |
null |
2025-01-31 |
Importing Phantoms: Measuring LLM Package Hallucination Vulnerabilities |
Arjun Krishna et.al. |
2501.19012 |
null |
2025-01-31 |
Visual Autoregressive Modeling for Image Super-Resolution |
Yunpeng Qu et.al. |
2501.18993 |
link |
2025-01-31 |
Symmetric Pruning of Large Language Models |
Kai Yi et.al. |
2501.18980 |
null |
2025-01-31 |
BCAT: A Block Causal Transformer for PDE Foundation Models for Fluid Dynamics |
Yuxuan Liu et.al. |
2501.18972 |
null |
2025-01-31 |
Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Boostrapping |
Pu Yang et.al. |
2501.18962 |
link |
2025-01-31 |
Intrinsic Tensor Field Propagation in Large Language Models: A Novel Approach to Contextual Information Flow |
Alfred Bexley et.al. |
2501.18957 |
null |
2025-01-31 |
LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models |
Shenghao Fu et.al. |
2501.18954 |
link |
2025-01-31 |
TabFSBench: Tabular Benchmark for Feature Shifts in Open Environment |
Zi-Jian Cheng et.al. |
2501.18935 |
link |
2025-01-31 |
Language Games as the Pathway to Artificial Superhuman Intelligence |
Ying Wen et.al. |
2501.18924 |
null |
2025-01-31 |
KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search |
Haoran Luo et.al. |
2501.18922 |
link |
2025-01-31 |
LLM Program Optimization via Retrieval Augmented Search |
Sagnik Anupam et.al. |
2501.18916 |
null |
2025-01-31 |
Scaling Laws for Differentially Private Language Models |
Ryan McKenna et.al. |
2501.18914 |
null |
2025-01-31 |
Streamlining Security Vulnerability Triage with Large Language Models |
Mohammad Jalili Torkamani et.al. |
2501.18908 |
null |
2025-01-31 |
Trustworthy Evaluation of Generative AI Models |
Zijun Gao et.al. |
2501.18897 |
null |
2025-01-31 |
Can We Predict the Effect of Prompts? |
Jae Yong Lee et.al. |
2501.18883 |
null |
2025-01-31 |
Adaptivity and Convergence of Probability Flow ODEs in Diffusion Generative Models |
Jiaqi Tang et.al. |
2501.18863 |
null |
2025-01-31 |
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning |
Han Zhong et.al. |
2501.18858 |
null |
2025-01-31 |
Equivariant Hypergraph Diffusion for Crystal Structure Prediction |
Yang Liu et.al. |
2501.18850 |
null |
2025-01-31 |
Text Data Augmentation for Large Language Models: A Comprehensive Survey of Methods, Challenges, and Opportunities |
Yaping Chai et.al. |
2501.18845 |
null |
2025-01-31 |
Trading Inference-Time Compute for Adversarial Robustness |
Wojciech Zaremba et.al. |
2501.18841 |
null |
2025-01-31 |
Partially Rewriting a Transformer in Natural Language |
Gonçalo Paulo et.al. |
2501.18838 |
link |
2025-01-31 |
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming |
Mrinank Sharma et.al. |
2501.18837 |
null |
2025-01-31 |
Pitfalls of defacing whole-head MRI: re-identification risk with diffusion models and compromised research potential |
Chenyu Gao et.al. |
2501.18834 |
null |
2025-01-31 |
Structural Embedding Projection for Contextual Large Language Model Inference |
Vincent Enoasmo et.al. |
2501.18826 |
null |
2025-01-31 |
Bridging the Reasoning Gap: Small LLMs Can Plan with Generalised Strategies |
Andrey Borro et.al. |
2501.18817 |
link |
2025-01-31 |
Large Language Models as Common-Sense Heuristics |
Andrey Borro et.al. |
2501.18816 |
null |
2025-01-30 |
Compositional Generalization Requires More Than Disentangled Representations |
Qiyao Liang et.al. |
2501.18797 |
null |
2025-01-30 |
Rope to Nope and Back Again: A New Hybrid Attention Strategy |
Bowen Yang et.al. |
2501.18795 |
null |
2025-01-30 |
Survey and Improvement Strategies for Gene Prioritization with Large Language Models |
Matthew Neeley et.al. |
2501.18794 |
null |
2025-01-30 |
LLM-Generated Heuristics for AI Planning: Do We Even Need Domain-Independence Anymore? |
Alexander Tuisov et.al. |
2501.18784 |
null |
2025-01-30 |
Navigating the Fragrance space Via Graph Generative Models And Predicting Odors |
Mrityunjay Sharma et.al. |
2501.18777 |
link |
2025-01-30 |
Probabilistic Joint Recovery Method for CO$_2$ Plume Monitoring |
Zijun Deng et.al. |
2501.18761 |
null |
2025-01-30 |
Synthetic Data Generation for Augmenting Small Samples |
Dan Liu et.al. |
2501.18741 |
null |
2025-01-30 |
Examining the Robustness of Large Language Models across Language Complexity |
Jiayi Zhang et.al. |
2501.18738 |
null |
2025-01-30 |
Exploring Audio Editing Features as User-Centric Privacy Defenses Against Emotion Inference Attacks |
Mohd. Farhan Israk Soumik et.al. |
2501.18727 |
null |
2025-01-30 |
Strong and Controllable 3D Motion Generation |
Canxuan Gang et.al. |
2501.18726 |
null |
2025-01-30 |
Zero-shot Large Language Models for Long Clinical Text Summarization with Temporal Reasoning |
Maya Kruse et.al. |
2501.18724 |
null |
2025-02-03 |
Invisible Traces: Using Hybrid Fingerprinting to identify underlying LLMs in GenAI Apps |
Devansh Bhardwaj et.al. |
2501.18712 |
null |
2025-01-30 |
Regularized second-order optimization of tensor-network Born machines |
Matan Ben-Dov et.al. |
2501.18691 |
null |
2025-01-30 |
Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting |
Yansong Qu et.al. |
2501.18672 |
null |
2025-01-30 |
Foundational Models for 3D Point Clouds: A Survey and Outlook |
Vishal Thengane et.al. |
2501.18594 |
null |
2025-01-30 |
Diffusion Autoencoders are Scalable Image Tokenizers |
Yinbo Chen et.al. |
2501.18593 |
null |
2025-02-03 |
Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models |
Hao Dong et.al. |
2501.18592 |
link |
2025-01-30 |
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs |
Yue Wang et.al. |
2501.18585 |
null |
2025-01-30 |
Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH |
Evgenii Evstafev et.al. |
2501.18576 |
null |
2025-01-30 |
BounTCHA: A CAPTCHA Utilizing Boundary Identification in AI-extended Videos |
Lehao Lin et.al. |
2501.18565 |
null |
2025-01-30 |
SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation |
Haoquan Fang et.al. |
2501.18564 |
link |
2025-01-30 |
Semantic Web and Creative AI – A Technical Report from ISWS 2023 |
Raia Abu Ahmad et.al. |
2501.18542 |
null |
2025-01-30 |
Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges |
Manveer Singh Tamber et.al. |
2501.18536 |
link |
2025-01-30 |
Differentially Private Steering for Large Language Model Alignment |
Anmol Goel et.al. |
2501.18532 |
link |
2025-01-30 |
Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models |
Guanqun Cao et.al. |
2501.18516 |
null |
2025-01-30 |
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch |
Arthur Douillard et.al. |
2501.18512 |
null |
2025-01-30 |
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training |
Benjamin Feuer et.al. |
2501.18511 |
link |
2025-01-30 |
CLEAR: Cue Learning using Evolution for Accurate Recognition Applied to Sustainability Data Extraction |
Peter J. Bentley et.al. |
2501.18504 |
null |
2025-01-30 |
Examining the Expanding Role of Synthetic Data Throughout the AI Development Pipeline |
Shivani Kapania et.al. |
2501.18493 |
null |
2025-01-30 |
A Tool for In-depth Analysis of Code Execution Reasoning of Large Language Models |
Changshu Liu et.al. |
2501.18482 |
null |
2025-01-30 |
CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization |
Yanxia Deng et.al. |
2501.18475 |
null |
2025-01-30 |
Tuning Vision Foundation Model via Test-Time Prompt-Guided Training for VFSS Segmentations |
Chengxi Zeng et.al. |
2501.18474 |
null |
2025-01-30 |
ExeCoder: Empowering Large Language Models with Executability Representation for Code Translation |
Minghua He et.al. |
2501.18460 |
null |
2025-01-30 |
CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering |
Yumeng Wang et.al. |
2501.18457 |
null |
2025-01-30 |
GENIE: Generative Note Information Extraction model for structuring EHR data |
Huaiyuan Ying et.al. |
2501.18435 |
null |
2025-01-30 |
Exploring Potential Prompt Injection Attacks in Federated Military LLMs and Their Mitigation |
Youngjoon Lee et.al. |
2501.18416 |
null |
2025-01-30 |
RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects |
Yiteng Tu et.al. |
2501.18365 |
link |
2025-01-30 |
A Video-grounded Dialogue Dataset and Metric for Event-driven Activities |
Wiradee Imrattanatrai et.al. |
2501.18324 |
link |
2025-01-30 |
Leveraging LLM Agents for Automated Optimization Modeling for SASP Problems: A Graph-RAG based Approach |
Tianpeng Pan et.al. |
2501.18320 |
null |
2025-01-30 |
Mining for Species, Locations, Habitats, and Ecosystems from Scientific Papers in Invasion Biology: A Large-Scale Exploratory Study with Large Language Models |
Jennifer D’Souza et.al. |
2501.18287 |
null |
2025-01-30 |
Jailbreaking LLMs’ Safeguard with Universal Magic Words for Text Embedding Models |
Haoyu Liang et.al. |
2501.18280 |
null |
2025-01-30 |
Collecting Cost-Effective, High-Quality Truthfulness Assessments with LLM Summarized Evidence |
Kevin Roitero et.al. |
2501.18265 |
null |
2025-01-30 |
How to Select Datapoints for Efficient Human Evaluation of NLG Models? |
Vilém Zouhar et.al. |
2501.18251 |
link |
2025-01-30 |
Statistical multi-metric evaluation and visualization of LLM system predictive performance |
Samuel Ackerman et.al. |
2501.18243 |
null |
2025-01-30 |
Contextually Structured Token Dependency Encoding for Large Language Models |
James Blades et.al. |
2501.18205 |
null |
2025-01-30 |
Economic Rationality under Specialization: Evidence of Decision Bias in AI Agents |
ShuiDe Wen et.al. |
2501.18190 |
null |
2025-01-30 |
Investigating Tax Evasion Emergence Using Dual Large Language Model and Deep Reinforcement Learning Powered Agent-based Simulation |
Teddy Lazebnik et.al. |
2501.18177 |
null |
2025-01-30 |
Continually Evolved Multimodal Foundation Models for Cancer Prognosis |
Jie Peng et.al. |
2501.18170 |
null |
2025-01-30 |
RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing |
Jinyao Guo et.al. |
2501.18160 |
null |
2025-01-30 |
Large Language Models for Cryptocurrency Transaction Analysis: A Bitcoin Case Study |
Yuchen Lei et.al. |
2501.18158 |
null |
2025-01-30 |
Mixed-Precision Graph Neural Quantization for Low Bit Large Language Models |
Wanlong Liu et.al. |
2501.18154 |
null |
2025-01-30 |
Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models |
Qika Lin et.al. |
2501.18119 |
null |
2025-01-30 |
Scaling Inference-Efficient Language Models |
Song Bian et.al. |
2501.18107 |
null |
2025-01-30 |
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation |
Yibo Wang et.al. |
2501.18100 |
link |
2025-01-30 |
AlphaAdam: Asynchronous Masked Optimization with Dynamic Alpha for Selective Updates |
Da Chang et.al. |
2501.18094 |
null |
2025-01-30 |
Normative Evaluation of Large Language Models with Everyday Moral Dilemmas |
Pratik S. Sachdeva et.al. |
2501.18081 |
null |
2025-01-30 |
FinanceQA: A Benchmark for Evaluating Financial Analysis Capabilities of Large Language Models |
Spencer Mateega et.al. |
2501.18062 |
null |
2025-01-29 |
RL-based Query Rewriting with Distilled LLM for online E-Commerce Systems |
Duy A. Nguyen et.al. |
2501.18056 |
null |
2025-01-29 |
Current Pathology Foundation Models are unrobust to Medical Center Differences |
Edwin D. de Jong et.al. |
2501.18055 |
null |
2025-01-29 |
A Proximal Operator for Inducing 2:4-Sparsity |
Jonas M Kübler et.al. |
2501.18015 |
null |
2025-01-29 |
Large Language Models Think Too Fast To Explore Effectively |
Lan Pan et.al. |
2501.18009 |
null |
2025-01-29 |
Fault Localization via Fine-tuning Large Language Models with Mutation Generated Stack Traces |
Neetha Jambigi et.al. |
2501.18005 |
null |
2025-01-29 |
InnerThoughts: Disentangling Representations and Predictions in Large Language Models |
Didier Chételat et.al. |
2501.17994 |
null |
2025-01-29 |
Can Generative LLMs Create Query Variants for Test Collections? An Exploratory Study |
Marwah Alaofi et.al. |
2501.17981 |
link |
2025-01-29 |
Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization |
Zishun Yu et.al. |
2501.17974 |
null |
2025-01-29 |
“I Would Never Trust Anything Western”: Kumu (Educator) Perspectives on Use of LLMs for Culturally Revitalizing CS Education in Hawaiian Schools |
Manas Mhasakar et.al. |
2501.17942 |
null |
2025-01-29 |
DReSS: Data-driven Regularized Structured Streamlining for Large Language Models |
Mingkuan Feng et.al. |
2501.17905 |
null |
2025-01-29 |
Learning Beyond the Surface: How Far Can Continual Pre-Training with LoRA Enhance LLMs’ Domain-Specific Insight Learning? |
Pouya Pezeshkpour et.al. |
2501.17840 |
link |
2025-01-29 |
Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology |
Sobhan Hemati et.al. |
2501.17822 |
null |
2025-01-30 |
Leveraging Multimodal LLM for Inspirational User Interface Search |
Seokhyeon Park et.al. |
2501.17799 |
link |
2025-01-29 |
BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation – Challenges and Insights |
Chan-Jan Hsu et.al. |
2501.17790 |
null |
2025-01-29 |
AdditiveLLM: Large Language Models Predict Defects in Additive Manufacturing |
Peter Pak et.al. |
2501.17784 |
null |
2025-01-29 |
2SSP: A Two-Stage Framework for Structured Pruning of LLMs |
Fabrizio Sandri et.al. |
2501.17771 |
link |
2025-01-29 |
Generative Unordered Flow for Set-Structured Data Generation |
Yangming Li et.al. |
2501.17770 |
null |
2025-01-29 |
Hybrid Graphs for Table-and-Text based Question Answering using LLMs |
Ankush Agarwal et.al. |
2501.17767 |
null |
2025-01-29 |
On the Partitioning of GPU Power among Multi-Instances |
Tirth Vamja et.al. |
2501.17752 |
null |
2025-01-29 |
Early External Safety Testing of OpenAI’s o3-mini: Insights from the Pre-Deployment Evaluation |
Aitor Arrieta et.al. |
2501.17749 |
null |
2025-01-29 |
A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches |
Ana R. Baião et.al. |
2501.17729 |
null |
2025-01-29 |
Using Code Generation to Solve Open Instances of Combinatorial Design Problems |
Christopher D. Rosin et.al. |
2501.17725 |
link |
2025-01-29 |
RICoTA: Red-teaming of In-the-wild Conversation with Test Attempts |
Eujeong Choi et.al. |
2501.17715 |
link |
2025-01-29 |
Source-Channel Separation Theorems for Distortion Perception Coding |
Chao Tian et.al. |
2501.17706 |
null |
2025-01-29 |
Planning with Vision-Language Models and a Use Case in Robot-Assisted Teaching |
Xuzhe Dang et.al. |
2501.17665 |
null |
2025-01-30 |
In-Context Meta LoRA Generation |
Yihua Shao et.al. |
2501.17635 |
null |
2025-01-29 |
Uncertainty Quantification and Decomposition for LLM-based Recommendation |
Wonbin Kweon et.al. |
2501.17630 |
link |
2025-01-29 |
The Imitation Game According To Turing |
Sharon Temtsin et.al. |
2501.17629 |
null |
2025-01-29 |
Structured Context Recomposition for Large Language Models Using Probabilistic Layer Realignment |
Jonathan Teel et.al. |
2501.17617 |
null |
2025-01-29 |
Semantic Consistency Regularization with Large Language Models for Semi-supervised Sentiment Analysis |
Kunrong Li et.al. |
2501.17598 |
null |
2025-01-30 |
Technical report on label-informed logit redistribution for better domain generalization in low-shot classification with foundation models |
Behraj Khan et.al. |
2501.17595 |
null |
2025-01-29 |
GLLM: Self-Corrective G-Code Generation using Large Language Models with User Feedback |
Mohamed Abdelaal et.al. |
2501.17584 |
null |
2025-01-29 |
CSEval: Towards Automated, Multi-Dimensional, and Reference-Free Counterspeech Evaluation using Auto-Calibrated LLMs |
Amey Hengle et.al. |
2501.17581 |
null |
2025-01-29 |
Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding |
Marco Pasini et.al. |
2501.17578 |
null |
2025-01-29 |
Query-Aware Learnable Graph Pooling Tokens as Prompt for Large Language Models |
Wooyoung Kim et.al. |
2501.17549 |
null |
2025-01-29 |
Towards Training-Free Open-World Classification with 3D Generative Models |
Xinzhe Xia et.al. |
2501.17547 |
null |
2025-01-29 |
Is Conversational XAI All You Need? Human-AI Decision Making With a Conversational XAI Assistant |
Gaole He et.al. |
2501.17546 |
link |
2025-01-29 |
Towards Supporting Penetration Testing Education with Large Language Models: an Evaluation and Comparison |
Martin Nizon-Deladoeuille et.al. |
2501.17539 |
null |
2025-01-29 |
Neural Spelling: A Spell-Based BCI System for Language Neural Decoding |
Xiaowei Jiang et.al. |
2501.17489 |
null |
2025-01-29 |
DFPE: A Diverse Fingerprint Ensemble for Enhancing LLM Performance |
Seffi Cohen et.al. |
2501.17479 |
link |
2025-01-29 |
AugmenTest: Enhancing Tests with LLM-Driven Oracles |
Shaker Mahmud Khandaker et.al. |
2501.17461 |
link |
2025-01-29 |
Large Language Models for Single-Step and Multi-Step Flight Trajectory Prediction |
Kaiwei Luo et.al. |
2501.17459 |
null |
2025-01-29 |
Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation |
Tiansheng Huang et.al. |
2501.17433 |
link |
2025-01-29 |
Actions Speak Louder than Words: Agent Decisions Reveal Implicit Biases in Language Models |
Yuxuan Li et.al. |
2501.17420 |
null |
2025-01-29 |
MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs |
Ved Sirdeshmukh et.al. |
2501.17399 |
link |
2025-01-29 |
Learning Free Token Reduction for Multi-Modal LLM |
Zihui Zhao et.al. |
2501.17391 |
null |
2025-01-29 |
Context-Aware Semantic Recomposition Mechanism for Large Language Models |
Richard Katrix et.al. |
2501.17386 |
null |
2025-01-28 |
Deep-and-Wide Learning: Enhancing Data-Driven Inference via Synergistic Learning of Inter- and Intra-Data Representations |
Md Tauhidul Islam et.al. |
2501.17347 |
null |
2025-01-28 |
Memorize and Rank: Elevating Large Language Models for Clinical Diagnosis Prediction |
Mingyu Derek Ma et.al. |
2501.17326 |
null |
2025-01-28 |
CardiCat: a Variational Autoencoder for High-Cardinality Tabular Data |
Lee Carlin et.al. |
2501.17324 |
null |
2025-01-30 |
Probing LLM World Models: Enhancing Guesstimation with Wisdom of Crowds Decoding |
Yun-Shiuan Chuang et.al. |
2501.17310 |
null |
2025-01-28 |
“Ownership, Not Just Happy Talk”: Co-Designing a Participatory Large Language Model for Journalism |
Emily Tseng et.al. |
2501.17299 |
null |
2025-01-28 |
Mitigating Hallucinated Translations in Large Language Models with Hallucination-focused Preference Optimization |
Zilu Tang et.al. |
2501.17295 |
null |
2025-01-28 |
Fine-Tuning Open-Source Large Language Models to Improve Their Performance on Radiation Oncology Tasks: A Feasibility Study to Investigate Their Potential Clinical Applications in Radiation Oncology |
Peilong Wang et.al. |
2501.17286 |
null |
2025-01-30 |
From Natural Language to Extensive-Form Game Representations |
Shilong Deng et.al. |
2501.17282 |
link |
2025-01-28 |
Engineering Point Defects in MoS2 for Tailored Material Properties using Large Language Models |
Abdalaziz Al-Maeeni et.al. |
2501.17279 |
null |
2025-01-28 |
Tailored Truths: Optimizing LLM Persuasion with Personalization and Fabricated Statistics |
Jasper Timm et.al. |
2501.17273 |
link |
2025-01-28 |
Integrating Reinforcement Learning and AI Agents for Adaptive Robotic Interaction and Assistance in Dementia Care |
Fengpei Yuan et.al. |
2501.17206 |
null |
2025-01-28 |
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training |
Tianzhe Chu et.al. |
2501.17161 |
null |
2025-01-28 |
FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data |
Deren Lei et.al. |
2501.17144 |
link |
2025-01-28 |
ASTRAL: Automated Safety Testing of Large Language Models |
Miriam Ugarte et.al. |
2501.17132 |
null |
2025-01-28 |
Optimizing Large Language Model Training Using FP4 Quantization |
Ruizhe Wang et.al. |
2501.17116 |
null |
2025-01-28 |
Unlocking Transparent Alignment Through Enhanced Inverse Constitutional AI for Principle Extraction |
Carl-Leander Henneking et.al. |
2501.17112 |
null |
2025-01-28 |
Goodness of Fit for Bayesian Generative Models with Applications in Population Genetics |
Guillaume Le Mailloux et.al. |
2501.17107 |
link |
2025-01-28 |
Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving |
Evgenii Evstafev et.al. |
2501.17084 |
null |
2025-01-28 |
Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding |
Akash Kumar et.al. |
2501.17053 |
null |
2025-01-28 |
Enhanced Retrieval of Long Documents: Leveraging Fine-Grained Block Representations with Large Language Models |
Minghan Li et.al. |
2501.17039 |
null |
2025-01-28 |
Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies |
Manojkumar Parmar et.al. |
2501.17030 |
null |
2025-01-28 |
Automated Refactoring of Non-Idiomatic Python Code: A Differentiated Replication with LLMs |
Alessandro Midolo et.al. |
2501.17024 |
link |
2025-01-28 |
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement |
Kei Katsumata et.al. |
2501.17022 |
link |
2025-01-28 |
MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition |
Philippe Pasquier et.al. |
2501.17011 |
null |
2025-01-28 |
Large Language Models for Code Generation: The Practitioners Perspective |
Zeeshan Rasheed et.al. |
2501.16998 |
link |
2025-01-28 |
Artificial Intelligence Clones |
Annie Liang et.al. |
2501.16996 |
null |
2025-01-28 |
FedEFM: Federated Endovascular Foundation Model with Unseen Data |
Tuong Do et.al. |
2501.16992 |
null |
2025-01-28 |
Generative quantum combinatorial optimization by means of a novel conditional generative quantum eigensolver |
Shunya Minami et.al. |
2501.16986 |
null |
2025-01-28 |
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling |
Hongzhi Huang et.al. |
2501.16975 |
null |
2025-01-28 |
Instantiation-based Formalization of Logical Reasoning Tasks using Language Models and Logical Solvers |
Mohammad Raza et.al. |
2501.16961 |
null |
2025-01-28 |
Multiple Abstraction Level Retrieve Augment Generation |
Zheng Zheng et.al. |
2501.16952 |
null |
2025-01-29 |
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models |
Makoto Shing et.al. |
2501.16937 |
null |
2025-01-28 |
Detecting harassment and defamation in cyberbullying with emotion-adaptive training |
Peiling Yi et.al. |
2501.16925 |
link |
2025-01-28 |
RDMM: Fine-Tuned LLM Models for On-Device Robotic Decision Making with Enhanced Contextual Awareness in Specific Domains |
Shady Nasrat et.al. |
2501.16899 |
link |
2025-01-28 |
Machine-learning semi-local exchange-correlation functionals for Kohn-Sham density functional theory of the Hubbard model |
Eoghan Cronin et.al. |
2501.16893 |
link |
2025-01-28 |
Irony Detection, Reasoning and Understanding in Zero-shot Learning |
Peiling Yi et.al. |
2501.16884 |
null |
2025-01-28 |
Comparing Human and LLM Generated Code: The Jury is Still Out! |
Sherlock A. Licorish et.al. |
2501.16857 |
null |
2025-01-28 |
Adapting Network Information to Semantics for Generalizable and Plug-and-Play Multi-Scenario Network Diagnosis |
Tiao Tan et.al. |
2501.16842 |
null |
2025-01-28 |
Misspellings in Natural Language Processing: A survey |
Gianluca Sperduti et.al. |
2501.16836 |
null |
2025-01-28 |
DIRIGENt: End-To-End Robotic Imitation of Human Demonstrations Based on a Diffusion Model |
Josua Spisak et.al. |
2501.16800 |
null |
2025-01-28 |
Algorithm for Automatic Legislative Text Consolidation |
Matias Etcheverry et.al. |
2501.16794 |
null |
2025-01-28 |
Exponential Family Attention |
Kevin Christian Wibisono et.al. |
2501.16790 |
link |
2025-01-28 |
Exploring the Role of Explicit Temporal Modeling in Multimodal Large Language Models for Video Understanding |
Yun Li et.al. |
2501.16786 |
null |
2025-01-28 |
TORCHLIGHT: Shedding LIGHT on Real-World Attacks on Cloudless IoT Devices Concealed within the Tor Network |
Yumingzhi Pan et.al. |
2501.16784 |
null |
2025-01-28 |
A Stochastic Dynamical Theory of LLM Self-Adversariality: Modeling Severity Drift as a Critical Process |
Jack David Carson et.al. |
2501.16783 |
null |
2025-01-29 |
Beyond-Labels: Advancing Open-Vocabulary Segmentation With Vision-Language Models |
Muhammad Atta ur Rahman et.al. |
2501.16769 |
null |
2025-01-28 |
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation |
Chenguo Lin et.al. |
2501.16764 |
null |
2025-01-28 |
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns |
Xinyue Shen et.al. |
2501.16750 |
link |
2025-01-28 |
Through the Prism of Culture: Evaluating LLMs’ Understanding of Indian Subcultures and Traditions |
Garima Chhikara et.al. |
2501.16748 |
null |
2025-01-28 |
LLM Assisted Anomaly Detection Service for Site Reliability Engineers: Enhancing Cloud Infrastructure Resilience |
Nimesh Jha et.al. |
2501.16744 |
null |
2025-01-28 |
Distilling Large Language Models for Network Active Queue Management |
Deol Satish et.al. |
2501.16734 |
null |
2025-01-28 |
xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking |
Sunbowen Lee et.al. |
2501.16727 |
link |
2025-01-28 |
One Head Eight Arms: Block Matrix based Low Rank Adaptation for CLIP-based Few-Shot Learning |
Chunpeng Zhou et.al. |
2501.16720 |
null |
2025-01-28 |
Outlier Synthesis via Hamiltonian Monte Carlo for Out-of-Distribution Detection |
Hengzhuang Li et.al. |
2501.16718 |
link |
2025-01-28 |
3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow |
Yueen Ma et.al. |
2501.16698 |
null |
2025-01-28 |
MME-Industry: A Cross-Industry Multimodal Evaluation Benchmark |
Dongyi Yi et.al. |
2501.16688 |
null |
2025-01-28 |
Auto-Differentiating Any LLM Workflow: A Farewell to Manual Prompting |
Li Yin et.al. |
2501.16673 |
link |
2025-01-28 |
VeriFact: Verifying Facts in LLM-Generated Clinical Text with Electronic Health Records |
Philip Chung et.al. |
2501.16672 |
link |
2025-01-28 |
Contextual Reinforcement in Multimodal Token Compression for Large Language Models |
Naderdel Piero et.al. |
2501.16658 |
null |
2025-01-28 |
Large Language Model Critics for Execution-Free Evaluation of Code Changes |
Aashish Yadavally et.al. |
2501.16655 |
link |
2025-01-28 |
Molecular-driven Foundation Model for Oncologic Pathology |
Anurag Vaidya et.al. |
2501.16652 |
link |
2025-01-28 |
DOCS: Quantifying Weight Similarity for Deeper Insights into Large Language Models |
Zeping Min et.al. |
2501.16650 |
null |
2025-01-28 |
An LLM Benchmark for Addressee Recognition in Multi-modal Multi-party Dialogue |
Koji Inoue et.al. |
2501.16643 |
null |
2025-01-28 |
CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs |
Jinlan Fu et.al. |
2501.16629 |
link |
2025-01-28 |
Few-Shot Optimized Framework for Hallucination Detection in Resource-Limited NLP Systems |
Baraa Hikal et.al. |
2501.16616 |
null |
2025-01-28 |
Sparse Autoencoders Trained on the Same Data Learn Different Features |
Gonçalo Paulo et.al. |
2501.16615 |
null |
2025-01-28 |
Fine-Tuned Language Models as Space Systems Controllers |
Enrico M. Zucchelli et.al. |
2501.16588 |
null |
2025-01-27 |
AffectGPT: A New Dataset, Model, and Benchmark for Emotion Understanding with Multimodal Large Language Models |
Zheng Lian et.al. |
2501.16566 |
link |
2025-01-27 |
LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation |
Farzad Farhadzadeh et.al. |
2501.16559 |
null |
2025-01-27 |
Distributional Information Embedding: A Framework for Multi-bit Watermarking |
Haiyun He et.al. |
2501.16558 |
null |
2025-01-27 |
PackDiT: Joint Human Motion and Text Generation via Mutual Prompting |
Zhongyu Jiang et.al. |
2501.16551 |
null |
2025-01-27 |
PhysAnimator: Physics-Guided Generative Cartoon Animation |
Tianyi Xie et.al. |
2501.16550 |
null |
2025-01-27 |
Sample-Efficient Behavior Cloning Using General Domain Knowledge |
Feiyu Zhu et.al. |
2501.16546 |
null |
2025-01-27 |
Generalized Mission Planning for Heterogeneous Multi-Robot Teams via LLM-constructed Hierarchical Trees |
Piyush Gupta et.al. |
2501.16539 |
null |
2025-01-27 |
Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs |
Jean-Charles Noirot Ferrand et.al. |
2501.16534 |
null |
2025-01-27 |
A comparison of data filtering techniques for English-Polish LLM-based machine translation in the biomedical domain |
Jorge del Pozo Lérida et.al. |
2501.16533 |
null |
2025-01-27 |
Programming by Examples Meets Historical Linguistics: A Large Language Model Based Approach to Sound Law Induction |
Atharva Naik et.al. |
2501.16524 |
null |
2025-01-27 |
How well can LLMs Grade Essays in Arabic? |
Rayed Ghazawi et.al. |
2501.16516 |
null |
2025-01-27 |
Deception in LLMs: Self-Preservation and Autonomous Goals in Large Language Models |
Sudarshan Kamath Barkur et.al. |
2501.16513 |
null |
2025-01-27 |
Smoothed Embeddings for Robust Language Models |
Ryo Hase et.al. |
2501.16497 |
null |
2025-01-27 |
Explaining GitHub Actions Failures with Large Language Models: Challenges, Insights, and Limitations |
Pablo Valenzuela-Toledo et.al. |
2501.16495 |
null |
2025-01-27 |
Generating customized prompts for Zero-Shot Rare Event Medical Image Classification using LLM |
Payal Kamboj et.al. |
2501.16481 |
link |
2025-01-27 |
Cross-Domain Semantic Segmentation with Large Language Model-Assisted Descriptor Generation |
Philip Hughes et.al. |
2501.16467 |
null |
2025-01-27 |
CoCoNUT: Structural Code Understanding does not fall out of a tree |
Claas Beger et.al. |
2501.16456 |
link |
2025-01-27 |
Detecting Zero-Day Attacks in Digital Substations via In-Context Learning |
Faizan Manzoor et.al. |
2501.16453 |
null |
2025-01-27 |
360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation |
Hamed Firooz et.al. |
2501.16450 |
null |
2025-01-27 |
DynAlign: Unsupervised Dynamic Taxonomy Alignment for Cross-Domain Segmentation |
Han Sun et.al. |
2501.16410 |
null |
2025-01-27 |
Evaluating The Performance of Using Large Language Models to Automate Summarization of CT Simulation Orders in Radiation Oncology |
Meiyun Cao et.al. |
2501.16309 |
null |
2025-01-27 |
RAPID: Retrieval-Augmented Parallel Inference Drafting for Text-Based Video Event Retrieval |
Long Nguyen et.al. |
2501.16303 |
null |
2025-01-27 |
Matryoshka Re-Ranker: A Flexible Re-Ranking Architecture With Configurable Depth and Width |
Zheng Liu et.al. |
2501.16302 |
null |
2025-01-27 |
Large Models in Dialogue for Active Perception and Anomaly Detection |
Tzoulio Chamiti et.al. |
2501.16300 |
link |
2025-01-27 |
FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers |
Renshan Zhang et.al. |
2501.16297 |
null |
2025-01-27 |
Brain-Adapter: Enhancing Neurological Disorder Analysis with Adapter-Tuning Multimodal Large Language Models |
Jing Zhang et.al. |
2501.16282 |
null |
2025-01-27 |
Do LLMs Have Visualization Literacy? An Evaluation on Modified Visualizations to Test Generalization in Data Interpretation |
Jiayi Hong et.al. |
2501.16277 |
link |
2025-01-27 |
URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots – A Case Study at HCMUT |
Long Nguyen et.al. |
2501.16276 |
null |
2025-01-27 |
A foundation model for human-AI collaboration in medical literature mining |
Zifeng Wang et.al. |
2501.16255 |
null |
2025-01-27 |
Multi-Agent Geospatial Copilots for Remote Sensing Workflows |
Chaehong Lee et.al. |
2501.16254 |
null |
2025-01-27 |
Zero-Shot Decision Tree Construction via Large Language Models |
Lucas Carrasco et.al. |
2501.16247 |
null |
2025-01-27 |
CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation |
Xiaochuan Ma et.al. |
2501.16246 |
null |
2025-01-27 |
Phase Transitions in Large Language Models and the $O(N)$ Model |
Youran Sun et.al. |
2501.16241 |
null |
2025-01-27 |
AiGet: Transforming Everyday Moments into Hidden Knowledge Discovery with AI Assistance on Smart Glasses |
Runze Cai et.al. |
2501.16240 |
link |
2025-01-28 |
Distilling foundation models for robust and efficient models in digital pathology |
Alexandre Filiot et.al. |
2501.16239 |
null |
2025-01-27 |
Language-Based Bayesian Optimization Research Assistant (BORA) |
Abdoulatif Cissé et.al. |
2501.16224 |
null |
2025-01-27 |
Enhancing Visual Inspection Capability of Multi-Modal Large Language Models on Medical Time Series with Supportive Conformalized and Interpretable Small Specialized Models |
Huayu Li et.al. |
2501.16215 |
link |
2025-01-27 |
Provence: efficient and robust context pruning for retrieval-augmented generation |
Nadezhda Chirkova et.al. |
2501.16214 |
null |
2025-01-27 |
Raiders of the Lost Dependency: Fixing Dependency Conflicts in Python using LLMs |
Antony Bartlett et.al. |
2501.16191 |
null |
2025-01-27 |
SWIFT: Mapping Sub-series with Wavelet Decomposition Improves Time Series Forecasting |
Wenxuan Xie et.al. |
2501.16178 |
link |
2025-01-27 |
BAG: Body-Aligned 3D Wearable Asset Generation |
Zhongjin Luo et.al. |
2501.16177 |
null |
2025-01-27 |
Will Systems of LLM Agents Cooperate: An Investigation into a Social Dilemma |
Richard Willis et.al. |
2501.16173 |
link |
2025-01-27 |
MetaDecorator: Generating Immersive Virtual Tours through Multimodality |
Shuang Xie et.al. |
2501.16164 |
null |
2025-01-27 |
CITYWALK: Enhancing LLM-Based C++ Unit Test Generation via Project-Dependency Awareness and Language-Specific Knowledge |
Yuwei Zhang et.al. |
2501.16155 |
null |
2025-01-27 |
AdaCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Chain-of-Thought |
Xin Huang et.al. |
2501.16154 |
null |
2025-01-27 |
AI Agents for Computer Use: A Review of Instruction-based Computer Control, GUI Automation, and Operator Assistants |
Pascal J. Sager et.al. |
2501.16150 |
null |
2025-01-27 |
PATCH: Empowering Large Language Model with Programmer-Intent Guidance and Collaborative-Behavior Simulation for Automatic Bug Fixing |
Yuwei Zhang et.al. |
2501.16149 |
null |
2025-01-27 |
SampleLLM: Optimizing Tabular Data Synthesis in Recommendations |
Jingtong Gao et.al. |
2501.16125 |
null |
2025-01-27 |
Using Generative Models to Produce Realistic Populations of UK Windstorms |
Yee Chun Tsoi et.al. |
2501.16110 |
null |
2025-01-27 |
Integration of LLM Quality Assurance into an NLG System |
Ching-Yi Chen et.al. |
2501.16078 |
null |
2025-01-27 |
PISCO: Pretty Simple Compression for Retrieval-Augmented Generation |
Maxime Louis et.al. |
2501.16075 |
null |
2025-01-27 |
A generative material transformer using Wyckoff representation |
Pierre-Paul De Breuck et.al. |
2501.16051 |
null |
2025-01-27 |
Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation |
Xing Zhang et.al. |
2501.16050 |
null |
2025-01-27 |
PRISMe: A Novel LLM-Powered Tool for Interactive Privacy Policy Assessment |
Vincent Freiberger et.al. |
2501.16033 |
null |
2025-01-27 |
FDLLM: A Text Fingerprint Detection Method for LLMs in Multi-Language, Multi-Domain Black-Box Environments |
Zhiyuan Fu et.al. |
2501.16029 |
null |
2025-01-27 |
Transformability reveals the interplay of dynamics across different network orders |
Ming Xie et.al. |
2501.16016 |
null |
2025-01-27 |
TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference |
Jack Min Ong et.al. |
2501.16007 |
null |
2025-01-27 |
EDSep: An Effective Diffusion-Based Method for Speech Source Separation |
Jinwei Dong et.al. |
2501.15965 |
null |
2025-01-27 |
Rethinking the Bias of Foundation Model under Long-tailed Distribution |
Jiahao Chen et.al. |
2501.15955 |
null |
2025-01-27 |
Understanding Long Videos via LLM-Powered Entity Relation Graphs |
Meng Chu et.al. |
2501.15953 |
null |
2025-01-27 |
TimeHF: Billion-Scale Time Series Models Guided by Human Feedback |
Yongzhi Qi et.al. |
2501.15942 |
null |
2025-01-27 |
SkillScope: A Tool to Predict Fine-Grained Skills Needed to Solve Issues on GitHub |
Benjamin C. Carter et.al. |
2501.15922 |
null |
2025-01-27 |
Parametric Retrieval Augmented Generation |
Weihang Su et.al. |
2501.15915 |
link |
2025-01-27 |
Robust Mobile Robot Path Planning via LLM-Based Dynamic Waypoint Generation |
Muhammad Taha Tariq et.al. |
2501.15901 |
null |
2025-01-27 |
Investigating the Sensitivity of Pre-trained Audio Embeddings to Common Effects |
Victor Deng et.al. |
2501.15900 |
null |
2025-01-27 |
Adaptive Width Neural Networks |
Federico Errica et.al. |
2501.15889 |
null |
2025-01-27 |
LCTG Bench: LLM Controlled Text Generation Benchmark |
Kentaro Kurihara et.al. |
2501.15875 |
link |
2025-01-27 |
LLM-attacker: Enhancing Closed-loop Adversarial Scenario Generation for Autonomous Driving with Large Language Models |
Yuewen Mei et.al. |
2501.15850 |
null |
2025-01-27 |
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model |
Delin Qu et.al. |
2501.15830 |
null |
2025-01-27 |
Aging-aware CPU Core Management for Embodied Carbon Amortization in Cloud LLM Inference |
Tharindu B. Hewage et.al. |
2501.15829 |
link |
2025-01-27 |
MADP: Multi-Agent Deductive Planning for Enhanced Cognitive-Behavioral Mental Health Question Answer |
Qi Chen et.al. |
2501.15826 |
null |
2025-01-27 |
LemmaHead: RAG Assisted Proof Generation Using Large Language Models |
Tianbo Yang et.al. |
2501.15797 |
null |
2025-01-27 |
Can Multimodal Large Language Models be Guided to Improve Industrial Anomaly Detection? |
Zhiling Chen et.al. |
2501.15795 |
null |
2025-01-27 |
Harnessing Diverse Perspectives: A Multi-Agent Framework for Enhanced Error Detection in Knowledge Graphs |
Yu Li et.al. |
2501.15791 |
link |
2025-01-27 |
Memorization and Regularization in Generative Diffusion Models |
Ricardo Baptista et.al. |
2501.15785 |
link |
2025-01-27 |
Large Language Models to Diffusion Finetuning |
Edoardo Cetin et.al. |
2501.15781 |
null |
2025-01-27 |
Is It Navajo? Accurate Language Detection in Endangered Athabaskan Languages |
Ivory Yang et.al. |
2501.15773 |
link |
2025-01-27 |
GraphICL: Unlocking Graph Learning Potential in LLMs through Structured Prompt Design |
Yuanfu Sun et.al. |
2501.15755 |
null |
2025-01-27 |
IndicMMLU-Pro: Benchmarking the Indic Large Language Models |
Sankalp KJ et.al. |
2501.15747 |
null |
2025-01-27 |
Gensors: Authoring Personalized Visual Sensors with Multimodal Foundation Models and Reasoning |
Michael Xieyang Liu et.al. |
2501.15727 |
null |
2025-01-27 |
A Survey on Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluation Tasks |
Dong Li et.al. |
2501.15724 |
null |
2025-01-27 |
On Parallelism in Music and Language: A Perspective from Symbol Emergence Systems based on Probabilistic Generative Models |
Tadahiro Taniguchi et.al. |
2501.15721 |
null |
2025-01-26 |
Adapting Biomedical Abstracts into Plain language using Large Language Models |
Haritha Gangavarapu et.al. |
2501.15700 |
null |
2025-01-26 |
TensorLLM: Tensorising Multi-Head Attention for Enhanced Reasoning and Compression in LLMs |
Yuxuan Gu et.al. |
2501.15674 |
link |
2025-01-26 |
Bringing Characters to New Stories: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting |
Yuxin Zhang et.al. |
2501.15641 |
null |
2025-01-26 |
BoKDiff: Best-of-K Diffusion Alignment for Target-Specific 3D Molecule Generation |
Ali Khodabandeh Yalabadi et.al. |
2501.15631 |
link |
2025-01-26 |
Improving Estonian Text Simplification through Pretrained Language Models and Custom Datasets |
Eduard Barbu et.al. |
2501.15624 |
null |
2025-01-26 |
Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning |
Zeyu Gan et.al. |
2501.15602 |
link |
2025-01-26 |
Evaluating an LLM-Powered Chatbot for Cognitive Restructuring: Insights from Mental Health Professionals |
Yinzhou Wang et.al. |
2501.15599 |
null |
2025-01-26 |
Diffusion Generative Modeling for Spatially Resolved Gene Expression Inference from Histology Images |
Sichen Zhu et.al. |
2501.15598 |
link |
2025-01-26 |
SedarEval: Automated Evaluation using Self-Adaptive Rubrics |
Zhiyuan Fan et.al. |
2501.15595 |
link |
2025-01-26 |
SCP-116K: A High-Quality Problem-Solution Dataset and a Generalized Pipeline for Automated Extraction in the Higher Education Science Domain |
Dakuan Lu et.al. |
2501.15587 |
link |
2025-01-26 |
Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework |
Yuhong Sun et.al. |
2501.15581 |
null |
2025-01-26 |
Instruction Tuning for Story Understanding and Generation with Weak Supervision |
Yangshu Yuan et.al. |
2501.15574 |
null |
2025-01-26 |
Cross-Cultural Fashion Design via Interactive Large Language Models and Diffusion Models |
Spencer Ramsey et.al. |
2501.15571 |
null |
2025-01-26 |
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer |
Lin Yueyu et.al. |
2501.15570 |
link |
2025-01-26 |
Ocean-OCR: Towards General OCR Application via a Vision-Language Model |
Song Chen et.al. |
2501.15558 |
link |
2025-01-26 |
Advancing Generative Artificial Intelligence and Large Language Models for Demand Side Management with Electric Vehicles |
Hanwen Zhang et.al. |
2501.15544 |
null |
2025-01-26 |
Estimating Committor Functions via Deep Adaptive Sampling on Rare Transition Paths |
Yueyang Wang et.al. |
2501.15522 |
null |
2025-01-26 |
Domain Adaptation from Generated Multi-Weather Images for Unsupervised Maritime Object Classification |
Dan Song et.al. |
2501.15503 |
null |
2025-01-26 |
Unveiling the Potential of Multimodal Retrieval Augmented Generation with Planning |
Xiaohan Yu et.al. |
2501.15470 |
null |
2025-01-26 |
Data-adaptive Safety Rules for Training Reward Models |
Xiaomin Li et.al. |
2501.15453 |
null |
2025-01-26 |
OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas |
Xiaoyang Wang et.al. |
2501.15427 |
null |
2025-01-26 |
Visual Generation Without Guidance |
Huayu Chen et.al. |
2501.15420 |
link |
2025-01-26 |
AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement |
Junan Zhang et.al. |
2501.15417 |
null |
2025-01-26 |
The Potential of Large Language Models in Supply Chain Management: Advancing Decision-Making, Efficiency, and Innovation |
Raha Aghaei et.al. |
2501.15411 |
null |
2025-01-26 |
Semantic Layered Embedding Diffusion in Large Language Models for Multi-Contextual Consistency |
Irin Kabakum et.al. |
2501.15405 |
null |
2025-01-26 |
How Green are Neural Language Models? Analyzing Energy Consumption in Text Summarization Fine-tuning |
Tohida Rehman et.al. |
2501.15398 |
null |
2025-01-26 |
Zero-Shot Interactive Text-to-Image Retrieval via Diffusion-Augmented Representations |
Zijun Long et.al. |
2501.15379 |
null |
2025-01-26 |
How to Mitigate Information Loss in Knowledge Graphs for GraphRAG: Leveraging Triple Context Restoration and Query-Driven Feedback |
Manzong Huang et.al. |
2501.15378 |
null |
2025-01-26 |
Evaluating the Effectiveness of XAI Techniques for Encoder-Based Language Models |
Melkamu Abay Mersha et.al. |
2501.15374 |
null |
2025-01-26 |
Scaling Large Vision-Language Models for Enhanced Multimodal Comprehension In Biomedical Image Analysis |
Robinson Umeike et.al. |
2501.15370 |
null |
2025-01-26 |
Decentralized Low-Rank Fine-Tuning of Large Language Models |
Sajjad Ghiasvand et.al. |
2501.15361 |
null |
2025-01-26 |
Large Language Models as Theory of Mind Aware Generative Agents with Counterfactual Reflection |
Bo Yang et.al. |
2501.15355 |
null |
2025-01-25 |
Fairness in LLM-Generated Surveys |
Andrés Abeliuk et.al. |
2501.15351 |
null |
2025-01-25 |
Between Puppet and Actor: Reframing Authorship in this Age of AI Agents |
Yuqian Sun et.al. |
2501.15346 |
null |
2025-01-25 |
Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data |
Jiajie Li et.al. |
2501.15326 |
null |
2025-01-25 |
ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning |
Shangqian Gao et.al. |
2501.15316 |
null |
2025-01-25 |
The Multicultural Medical Assistant: Can LLMs Improve Medical ASR Errors Across Borders? |
Ayo Adedeji et.al. |
2501.15310 |
null |
2025-01-25 |
You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning |
Ayan Sengupta et.al. |
2501.15296 |
null |
2025-01-24 |
HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation |
Xin Zhou et.al. |
2501.14729 |
link |
2025-01-24 |
Do LLMs Provide Consistent Answers to Health-Related Questions across Languages? |
Ipek Baris Schlicht et.al. |
2501.14719 |
null |
2025-01-24 |
Towards Better Understanding Table Instruction Tuning: Decoupling the Effects from Data versus Models |
Naihao Deng et.al. |
2501.14717 |
null |
2025-01-24 |
FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing |
James Seale Smith et.al. |
2501.14713 |
null |
2025-01-24 |
The Karp Dataset |
Mason DiCicco et.al. |
2501.14705 |
null |
2025-01-24 |
Rethinking Table Instruction Tuning |
Naihao Deng et.al. |
2501.14693 |
null |
2025-01-24 |
Rethinking Foundation Models for Medical Image Classification through a Benchmark Study on MedMNIST |
Fuping Wu et.al. |
2501.14685 |
null |
2025-01-24 |
An Empirical Study on LLM-based Classification of Requirements-related Provisions in Food-safety Regulations |
Shabnam Hassani et.al. |
2501.14683 |
null |
2025-01-24 |
Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning |
Jisi Zhang et.al. |
2501.14680 |
null |
2025-01-24 |
MedAgentBench: Dataset for Benchmarking LLMs as Agents in Medical Applications |
Yixing Jiang et.al. |
2501.14654 |
link |
2025-01-24 |
Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion |
Ziyao Xu et.al. |
2501.14649 |
link |
2025-01-24 |
Towards Scalable Topological Regularizers |
Hiu-Tung Wong et.al. |
2501.14641 |
null |
2025-01-24 |
Recommending Actionable Strategies: A Semantic Approach to Integrating Analytical Frameworks with Decision Heuristics |
Renato Ghisellini et.al. |
2501.14634 |
null |
2025-01-24 |
Extracting Problem Structure with LLMs for Optimized SAT Local Search |
André Schilder et.al. |
2501.14630 |
null |
2025-01-24 |
Single-neuron deep generative model uncovers underlying physics of neuronal activity in Ca imaging data |
Jordi Abante et.al. |
2501.14615 |
null |
2025-01-24 |
ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations |
Tianming Liang et.al. |
2501.14607 |
null |
2025-01-24 |
Leveraging ChatGPT’s Multimodal Vision Capabilities to Rank Satellite Images by Poverty Level: Advancing Tools for Social Science Research |
Hamid Sarmadi et.al. |
2501.14546 |
null |
2025-01-24 |
VERUS-LM: a Versatile Framework for Combining LLMs with Symbolic Reasoning |
Benjamin Callewaert et.al. |
2501.14540 |
null |
2025-01-24 |
Design and Implementation of a Psychiatry Resident Training System Based on Large Language Models |
Zhenguang Zhong et.al. |
2501.14530 |
link |
2025-01-24 |
Scene Understanding Enabled Semantic Communication with Open Channel Coding |
Zhe Xiang et.al. |
2501.14520 |
null |
2025-01-24 |
Real-world Edge Neural Network Implementations Leak Private Interactions Through Physical Side Channel |
Zhuoran Liu et.al. |
2501.14512 |
null |
2025-01-24 |
Automated Assignment Grading with Large Language Models: Insights From a Bioinformatics Course |
Pavlin G. Poličar et.al. |
2501.14499 |
null |
2025-01-24 |
Evaluating and Improving Graph to Text Generation with Large Language Models |
Jie He et.al. |
2501.14497 |
link |
2025-01-24 |
RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques |
Zhengyang Tang et.al. |
2501.14492 |
link |
2025-01-24 |
Pesti-Gen: Unleashing a Generative Molecule Approach for Toxicity Aware Pesticide Design |
Taehan Kim et.al. |
2501.14469 |
null |
2025-01-24 |
Boundary Value Test Input Generation Using Prompt Engineering with LLMs: Fault Detection and Coverage Analysis |
Xiujing Guo et.al. |
2501.14465 |
null |
2025-01-24 |
Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing |
Zeping Yu et.al. |
2501.14457 |
null |
2025-01-24 |
Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains |
Xu Chu et.al. |
2501.14431 |
null |
2025-01-24 |
GraphBC: Improving LLMs for Better Graph Data Processing |
Xu Chu et.al. |
2501.14427 |
null |
2025-01-24 |
CENTS: Generating synthetic electricity consumption time series for rare and unseen scenarios |
Michael Fuest et.al. |
2501.14426 |
null |
2025-01-24 |
DeepFlow: Serverless Large Language Model Serving at Scale |
Junhao Hu et.al. |
2501.14417 |
null |
2025-01-24 |
SKIL: Semantic Keypoint Imitation Learning for Generalizable Data-efficient Manipulation |
Shengjie Wang et.al. |
2501.14400 |
null |
2025-01-24 |
ECTIL: Label-efficient Computational Tumour Infiltrating Lymphocyte (TIL) assessment in breast cancer: Multicentre validation in 2,340 patients with breast cancer |
Yoni Schirris et.al. |
2501.14379 |
link |
2025-01-24 |
DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing |
Xinyu Ma et.al. |
2501.14371 |
link |
2025-01-24 |
Uncovering the bias in the evidence for dynamical dark energy through minimal and generalized modeling approaches |
Ziad Sakr et.al. |
2501.14366 |
null |
2025-01-24 |
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration |
Kai-Tuo Xu et.al. |
2501.14350 |
link |
2025-01-24 |
Chain-of-Retrieval Augmented Generation |
Liang Wang et.al. |
2501.14342 |
null |
2025-01-24 |
Exploring the sustainable scaling of AI dilemma: A projective study of corporations’ AI environmental impacts |
Clément Desroches et.al. |
2501.14334 |
null |
2025-01-24 |
Assessing Large Language Models in Comprehending and Verifying Concurrent Programs across Memory Models |
Ridhi Jain et.al. |
2501.14326 |
null |
2025-01-24 |
PAID: A Framework of Product-Centric Advertising Image Design |
Hongyu Chen et.al. |
2501.14316 |
null |
2025-01-24 |
Locality-aware Fair Scheduling in LLM Serving |
Shiyi Cao et.al. |
2501.14312 |
null |
2025-01-24 |
A Zero-Shot LLM Framework for Automatic Assignment Grading in Higher Education |
Calvin Yeung et.al. |
2501.14305 |
link |
2025-01-24 |
MASTER: A Multi-Agent System with LLM Specialized MCTS |
Bingzheng Gan et.al. |
2501.14304 |
null |
2025-01-24 |
Fast Think-on-Graph: Wider, Deeper and Faster Reasoning of Large Language Model on Knowledge Graph |
Xujian Liang et.al. |
2501.14300 |
link |
2025-01-24 |
Multi-stage Large Language Model Pipelines Can Outperform GPT-4o in Relevance Assessment |
Julian A. Schnabel et.al. |
2501.14296 |
null |
2025-01-24 |
Examining Alignment of Large Language Models through Representative Heuristics: The Case of Political Stereotypes |
Sullam Jeoung et.al. |
2501.14294 |
link |
2025-01-24 |
Advances in Temporal Point Processes: Bayesian, Deep, and LLM Approaches |
Feng Zhou et.al. |
2501.14291 |
null |
2025-01-24 |
Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation |
Sadegh Mahdavi et.al. |
2501.14275 |
link |
2025-01-24 |
Siren: A Learning-Based Multi-Turn Attack Framework for Simulating Real-World Human Jailbreak Behaviors |
Yi Zhao et.al. |
2501.14250 |
link |
2025-01-24 |
Humanity’s Last Exam |
Long Phan et.al. |
2501.14249 |
null |
2025-01-24 |
Multi-agent KTO: Reinforcing Strategic Interactions of Large Language Model in Language Game |
Rong Ye et.al. |
2501.14225 |
null |
2025-01-24 |
Top Ten Challenges Towards Agentic Neural Graph Databases |
Jiaxin Bai et.al. |
2501.14224 |
null |
2025-01-24 |
TFG-Flow: Training-free Guidance in Multimodal Generative Flow |
Haowei Lin et.al. |
2501.14216 |
link |
2025-01-24 |
Serving Long-Context LLMs at the Mobile Edge: Test-Time Reinforcement Learning-based Model Caching and Inference Offloading |
Minrui Xu et.al. |
2501.14205 |
null |
2025-01-24 |
VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking |
Runyi Hu et.al. |
2501.14195 |
link |
2025-01-24 |
Distributed Multi-Agent Coordination Using Multi-Modal Foundation Models |
Saaduddin Mahmud et.al. |
2501.14189 |
null |
2025-01-24 |
GeoSim.AI: AI assistants for numerical simulations in geomechanics |
Yared W. Bekele et.al. |
2501.14186 |
null |
2025-01-24 |
AI Chatbots as Professional Service Agents: Developing a Professional Identity |
Wenwen Li et.al. |
2501.14179 |
null |
2025-01-24 |
Argos: Agentic Time-Series Anomaly Detection with Autonomous Rule Generation via Large Language Models |
Yile Gu et.al. |
2501.14170 |
null |
2025-01-24 |
Test-Time Code-Switching for Cross-lingual Aspect Sentiment Triplet Extraction |
Dongming Sheng et.al. |
2501.14144 |
null |
2025-01-23 |
Autonomous Structural Memory Manipulation for Large Language Models Using Hierarchical Embedding Augmentation |
Derek Yotheringhay et.al. |
2501.14119 |
null |
2025-01-23 |
Domain-Factored Untrained Deep Prior for Spectrum Cartography |
Subash Timilsina et.al. |
2501.14116 |
null |
2025-01-23 |
MedSlice: Fine-Tuned Large Language Models for Secure Clinical Note Sectioning |
Joshua Davis et.al. |
2501.14105 |
link |
2025-01-23 |
StreamingRAG: Real-time Contextual Retrieval and Generation Framework |
Murugan Sankaradas et.al. |
2501.14101 |
null |
2025-01-23 |
Enhancing Biomedical Relation Extraction with Directionality |
Po-Ting Lai et.al. |
2501.14079 |
link |
2025-01-23 |
LLMs are Vulnerable to Malicious Prompts Disguised as Scientific Language |
Yubin Ge et.al. |
2501.14073 |
null |
2025-01-23 |
Efficient 2D CT Foundation Model for Contrast Phase Classification |
Benjamin Hou et.al. |
2501.14066 |
null |
2025-01-23 |
Revisiting CLIP: Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models |
Jakob Krogh Petersen et.al. |
2501.14051 |
link |
2025-01-23 |
LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps |
Andrey Palaev et.al. |
2501.14046 |
link |
2025-01-23 |
Leveraging Large Language Models to Analyze Emotional and Contextual Drivers of Teen Substance Use in Online Discussions |
Jianfeng Zhu et.al. |
2501.14037 |
null |
2025-01-23 |
CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation |
Guofeng Cui et.al. |
2501.13927 |
null |
2025-01-23 |
Improving Video Generation with Human Feedback |
Jie Liu et.al. |
2501.13918 |
null |
2025-01-23 |
Binary Diffusion Probabilistic Model |
Vitaliy Kinakh et.al. |
2501.13915 |
null |
2025-01-23 |
Analysis of Indic Language Capabilities in LLMs |
Aatman Vaidya et.al. |
2501.13912 |
null |
2025-01-23 |
Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models |
Linh Tran et.al. |
2501.13904 |
null |
2025-01-23 |
Exploring Finetuned Audio-LLM on Heart Murmur Features |
Adrian Florea et.al. |
2501.13884 |
null |
2025-01-23 |
The machine learning platform for developers of large systems |
Alexey Naikov et.al. |
2501.13881 |
null |
2025-01-23 |
A RAG-Based Institutional Assistant |
Gustavo Kuratomi et.al. |
2501.13880 |
null |
2025-01-23 |
On the Reasoning Capacity of AI Models and How to Quantify It |
Santosh Kumar Radha et.al. |
2501.13833 |
null |
2025-01-23 |
Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing |
Hao Zhang et.al. |
2501.13831 |
null |
2025-01-23 |
Hallucinations Can Improve Large Language Models in Drug Discovery |
Shuzhou Yuan et.al. |
2501.13824 |
null |
2025-01-23 |
Large Language Model driven Policy Exploration for Recommender Systems |
Jie Wang et.al. |
2501.13816 |
null |
2025-01-23 |
Enhancing LLMs for Governance with Human Oversight: Evaluating and Aligning LLMs on Expert Classification of Climate Misinformation for Detecting False or Misleading Claims about Climate Change |
Mowafak Allaham et.al. |
2501.13802 |
null |
2025-01-23 |
Parameter-Efficient Fine-Tuning for Foundation Models |
Dan Zhang et.al. |
2501.13787 |
link |
2025-01-23 |
Not Every AI Problem is a Data Problem: We Should Be Intentional About Data Scaling |
Tanya Rodchenko et.al. |
2501.13779 |
null |
2025-01-23 |
Explainable XR: Understanding User Behaviors of XR Environments using LLM-assisted Analytics Framework |
Yoonsang Kim et.al. |
2501.13778 |
link |
2025-01-23 |
Do Large Language Models Truly Understand Geometric Structures? |
Xiaofeng Wang et.al. |
2501.13773 |
link |
2025-01-23 |
Tune In, Act Up: Exploring the Impact of Audio Modality-Specific Edits on Large Audio Language Models in Jailbreak |
Erjia Xiao et.al. |
2501.13772 |
null |
2025-01-23 |
UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models |
Xin Xu et.al. |
2501.13766 |
null |
2025-01-23 |
EICopilot: Search and Explore Enterprise Information over Large-scale Knowledge Graphs with LLM-driven Agents |
Yuhui Yun et.al. |
2501.13746 |
null |
2025-01-23 |
GPT-HTree: A Decision Tree Framework Integrating Hierarchical Clustering and Large Language Models for Explainable Classification |
Te Pei et.al. |
2501.13743 |
null |
2025-01-23 |
An Empirical Study of Retrieval-Augmented Code Generation: Challenges and Opportunities |
Zezhou Yang et.al. |
2501.13742 |
link |
2025-01-23 |
Pseudocode-Injection Magic: Enabling LLMs to Tackle Graph Computational Tasks |
Chang Gong et.al. |
2501.13731 |
null |
2025-01-23 |
RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation |
Shi-Qi Yan et.al. |
2501.13726 |
null |
2025-01-23 |
Musical ethnocentrism in Large Language Models |
Anna Kruspe et.al. |
2501.13720 |
null |
2025-01-23 |
A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation |
Dario Serez et.al. |
2501.13718 |
null |
2025-01-23 |
EventVL: Understand Event Streams via Multimodal Large Language Model |
Pengteng Li et.al. |
2501.13707 |
null |
2025-01-23 |
DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale |
Linghao Zhang et.al. |
2501.13699 |
null |
2025-01-23 |
Question Answering on Patient Medical Records with Private Fine-Tuned LLMs |
Sara Kothari et.al. |
2501.13687 |
null |
2025-01-23 |
HumorReject: Decoupling LLM Safety from Refusal Prefix via A Little Humor |
Zihui Wu et.al. |
2501.13677 |
link |
2025-01-23 |
How to Complete Domain Tuning while Keeping General Ability in LLM: Adaptive Layer-wise and Element-wise Regularization |
Shezheng Song et.al. |
2501.13669 |
null |
2025-01-23 |
LVPruning: An Effective yet Simple Language-Guided Vision Token Pruning Approach for Multi-modal Large Language Models |
Yizheng Sun et.al. |
2501.13652 |
null |
2025-01-23 |
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models |
Zhenghao Lin et.al. |
2501.13629 |
null |
2025-01-23 |
Text-to-SQL based on Large Language Models and Database Keyword Search |
Eduardo R. Nascimento et.al. |
2501.13594 |
null |
2025-01-23 |
Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization |
Lei Huang et.al. |
2501.13573 |
null |
2025-01-23 |
One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt |
Tao Liu et.al. |
2501.13554 |
link |
2025-01-23 |
LLMs Can Plan Only If We Tell Them |
Bilgehan Sel et.al. |
2501.13545 |
null |
2025-01-23 |
ReasVQA: Advancing VideoQA with Imperfect Reasoning Process |
Jianxin Liang et.al. |
2501.13536 |
null |
2025-01-23 |
RECALL: Library-Like Behavior In Language Models is Enhanced by Self-Referencing Causal Cycles |
Munachiso Nwadike et.al. |
2501.13491 |
link |
2025-01-23 |
Adaptive Testing for LLM-Based Applications: A Diversity-based Approach |
Juyeon Yoon et.al. |
2501.13480 |
null |
2025-01-23 |
LDR-Net: A Novel Framework for AI-generated Image Detection via Localized Discrepancy Representation |
JiaXin Chen et.al. |
2501.13475 |
null |
2025-01-23 |
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge |
Haomiao Xiong et.al. |
2501.13468 |
link |
2025-01-23 |
Spurious Forgetting in Continual Learning of Language Models |
Junhao Zheng et.al. |
2501.13453 |
link |
2025-01-23 |
Softplus Attention with Re-weighting Boosts Length Extrapolation in Large Language Models |
Bo Gao et.al. |
2501.13428 |
null |
2025-01-23 |
Predicting Turbulence Structure In Street-Canyon Flows using Deep Generative Modeling |
Tomek Jaroslawski et.al. |
2501.13415 |
null |
2025-01-23 |
VulnBot: Autonomous Penetration Testing for A Multi-Agent Collaborative Framework |
He Kong et.al. |
2501.13411 |
link |
2025-01-23 |
Towards Intelligent Design: A Self-driven Framework for Collocated Clothing Synthesis Leveraging Fashion Styles and Textures |
Minglong Dong et.al. |
2501.13396 |
null |
2025-01-23 |
Can Large Language Models Understand Preferences in Personalized Recommendation? |
Zhaoxuan Tan et.al. |
2501.13391 |
link |
2025-01-23 |
Do as We Do, Not as You Think: the Conformity of Large Language Models |
Zhiyuan Weng et.al. |
2501.13381 |
link |
2025-01-23 |
Scalable Evaluation Framework for Foundation Models in Musculoskeletal MRI Bridging Computational Innovation with Clinical Utility |
Gabrielle Hoyer et.al. |
2501.13376 |
link |
2025-01-23 |
Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement |
Jae-Sung Bae et.al. |
2501.13372 |
null |
2025-01-23 |
Meta-Feature Adapter: Integrating Environmental Metadata for Enhanced Animal Re-identification |
Yuzhuo Li et.al. |
2501.13368 |
null |
2025-01-23 |
50 Shades of Deceptive Patterns: A Unified Taxonomy, Multimodal Detection, and Security Implications |
Zewei Shi et.al. |
2501.13351 |
link |
2025-01-23 |
MSF: Efficient Diffusion Model Via Multi-Scale Latent Factorize |
Haohang Xu et.al. |
2501.13349 |
null |
2025-01-23 |
Full-Stack Optimized Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation |
Rong Shan et.al. |
2501.13344 |
link |
2025-01-23 |
Multi-aspect Knowledge Distillation with Large Language Model |
Taegyeong Lee et.al. |
2501.13341 |
link |
2025-01-23 |
Generative Multi-Form Bayesian Optimization |
Zhendong Guo et.al. |
2501.13337 |
null |
2025-01-23 |
SplitLLM: Hierarchical Split Learning for Large Language Model over Wireless Network |
Songge Zhang et.al. |
2501.13318 |
null |
2025-01-23 |
Representing Visualization Insights as a Dense Insight Network |
Jane Hoffswell et.al. |
2501.13309 |
null |
2025-01-23 |
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia |
Xuelong Geng et.al. |
2501.13306 |
link |
2025-01-23 |
Watching the AI Watchdogs: A Fairness and Robustness Analysis of AI Safety Moderation Classifiers |
Akshit Achara et.al. |
2501.13302 |
link |
2025-01-23 |
Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents |
Shrinidhi Kumbhar et.al. |
2501.13299 |
null |
2025-01-23 |
RAMQA: A Unified Framework for Retrieval-Augmented Multi-Modal Question Answering |
Yang Bai et.al. |
2501.13297 |
link |
2025-01-23 |
Toyteller: AI-powered Visual Storytelling Through Toy-Playing with Character Symbols |
John Joon Young Chung et.al. |
2501.13284 |
null |
2025-01-22 |
MEDFORM: A Foundation Model for Contrastive Learning of CT Imaging and Clinical Numeric Data in Multi-Cancer Analysis |
Daeun Jung et.al. |
2501.13277 |
link |
2025-01-22 |
RAG-Reward: Optimizing RAG with Reward Modeling and RLHF |
Hanning Zhang et.al. |
2501.13264 |
null |
2025-01-22 |
Exploring GPT’s Ability as a Judge in Music Understanding |
Kun Fang et.al. |
2501.13261 |
link |
2025-01-22 |
Bypassing Array Canaries via Autonomous Function Call Resolution |
Nathaniel Oh et.al. |
2501.13256 |
link |
2025-01-22 |
S-LoRA: Scalable Low-Rank Adaptation for Class Incremental Learning |
Yichen Wu et.al. |
2501.13198 |
null |
2025-01-22 |
Computational modelling of biological systems now and then: revisiting tools and visions from the beginning of the century |
Axel Loewe et.al. |
2501.13142 |
null |
2025-01-23 |
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding |
Boqiang Zhang et.al. |
2501.13106 |
link |
2025-01-22 |
Robust Representation Consistency Model via Contrastive Denoising |
Jiachen Lei et.al. |
2501.13094 |
link |
2025-01-22 |
Refining Input Guardrails: Enhancing LLM-as-a-Judge Efficiency Through Chain-of-Thought Fine-Tuning and Alignment |
Melissa Kazemi Rad et.al. |
2501.13080 |
null |
2025-01-22 |
Does Table Source Matter? Benchmarking and Improving Multimodal Scientific Table Understanding and Reasoning |
Bohao Yang et.al. |
2501.13042 |
link |
2025-01-22 |
Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament |
Yantao Liu et.al. |
2501.13007 |
link |
2025-01-22 |
Neural network enhanced cross entropy benchmark for monitored circuits |
Yangrui Hu et.al. |
2501.13005 |
null |
2025-01-22 |
Large Language Model-Based Semantic Communication System for Image Transmission |
Soheyb Ribouh et.al. |
2501.12988 |
null |
2025-01-22 |
LLM4WM: Adapting LLM for Wireless Multi-Tasking |
Xuanyu Liu et.al. |
2501.12983 |
null |
2025-01-22 |
Low-dimensional adaptation of diffusion models: Convergence in total variation |
Jiadong Liang et.al. |
2501.12982 |
null |
2025-01-22 |
OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models |
Chongren Sun et.al. |
2501.12975 |
link |
2025-01-22 |
Accessible Smart Contracts Verification: Synthesizing Formal Models with Tamed LLMs |
Jan Corazza et.al. |
2501.12972 |
null |
2025-01-22 |
It’s complicated. The relationship of algorithmic fairness and non-discrimination regulations in the EU AI Act |
Kristof Meding et.al. |
2501.12962 |
null |
2025-01-22 |
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference |
Weizhi Fei et.al. |
2501.12959 |
null |
2025-01-22 |
GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models |
Pengxiang Zhao et.al. |
2501.12956 |
null |
2025-01-22 |
3D Object Manipulation in a Single Image using Generative Models |
Ruisi Zhao et.al. |
2501.12935 |
null |
2025-01-22 |
Correctness Assessment of Code Generated by Large Language Models Using Internal Representations |
Tuan-Dung Bui et.al. |
2501.12934 |
link |
2025-01-22 |
DynamicEarth: How Far are We from Open-Vocabulary Change Detection? |
Kaiyu Li et.al. |
2501.12931 |
null |
2025-01-22 |
A Functional Software Reference Architecture for LLM-Integrated Systems |
Alessio Bucaioni et.al. |
2501.12904 |
null |
2025-01-22 |
Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration |
Offa Kingsleigh et.al. |
2501.12901 |
null |
2025-01-22 |
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback |
Yafu Li et.al. |
2501.12895 |
link |
2025-01-23 |
Generative AI Misuse Potential in Cyber Security Education: A Case Study of a UK Degree Program |
Carlton Shepherd et.al. |
2501.12883 |
null |
2025-01-22 |
WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge |
Jingyuan Chen et.al. |
2501.12877 |
null |
2025-01-22 |
ACEBench: Who Wins the Match Point in Tool Learning? |
Chen Chen et.al. |
2501.12851 |
null |
2025-01-22 |
AMM-Diff: Adaptive Multi-Modality Diffusion Network for Missing Modality Imputation |
Aghiles Kebaili et.al. |
2501.12840 |
null |
2025-01-22 |
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home |
Viktor Moskvoretskii et.al. |
2501.12835 |
null |
2025-01-22 |
Open or Closed LLM for Lesser-Resourced Languages? Lessons from Greek |
John Pavlopoulos et.al. |
2501.12826 |
link |
2025-01-22 |
Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks |
Alessio Quercia et.al. |
2501.12824 |
link |
2025-01-22 |
Certified Guidance for Planning with Deep Generative Models |
Francesco Giacomarra et.al. |
2501.12815 |
null |
2025-01-22 |
Revisit Self-Debugging with Self-Generated Tests for Code Generation |
Xiancai Chen et.al. |
2501.12793 |
null |
2025-01-22 |
LLMs as Repositories of Factual Knowledge: Limitations and Solutions |
Seyed Mahed Mousavi et.al. |
2501.12774 |
null |
2025-01-22 |
NExtLong: Toward Effective Long-Context Training without Long Documents |
Chaochen Gao et.al. |
2501.12766 |
link |
2025-01-22 |
Online Preference Alignment for Language Models via Count-based Exploration |
Chenjia Bai et.al. |
2501.12735 |
link |
2025-01-22 |
Paradigm-Based Automatic HDL Code Generation Using LLMs |
Wenhao Sun et.al. |
2501.12702 |
null |
2025-01-22 |
Training Dialogue Systems by AI Feedback for Improving Overall Dialogue Impression |
Kai Yoshida et.al. |
2501.12698 |
null |
2025-01-22 |
Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering |
Qian Tao et.al. |
2501.12697 |
null |
2025-01-22 |
SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling |
Shengshi Yao et.al. |
2501.12696 |
null |
2025-01-22 |
EchoLM: Accelerating LLM Serving with Real-time Knowledge Distillation |
Yifan Yu et.al. |
2501.12689 |
null |
2025-01-22 |
Distillation Quantification for Large Language Models |
Sunbowen Lee et.al. |
2501.12619 |
link |
2025-01-22 |
Deep Learning-Based Identification of Inconsistent Method Names: How Far Are We? |
Taiming Wang et.al. |
2501.12617 |
null |
2025-01-22 |
Kimi k1.5: Scaling Reinforcement Learning with LLMs |
Kimi Team et.al. |
2501.12599 |
null |
2025-01-22 |
Leveraging LLMs to Create a Haptic Devices’ Recommendation System |
Yang Liu et.al. |
2501.12573 |
null |
2025-01-22 |
Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review |
Rock Yuren Pang et.al. |
2501.12557 |
link |
2025-01-21 |
Human-like conceptual representations emerge from language prediction |
Ningyu Xu et.al. |
2501.12547 |
null |
2025-01-21 |
How Does the Spatial Distribution of Pre-training Data Affect Geospatial Foundation Models? |
Mirali Purohit et.al. |
2501.12535 |
null |
2025-01-21 |
An Empirically-grounded tool for Automatic Prompt Linting and Repair: A Case Study on Bias, Vulnerability, and Optimization in Developer Prompts |
Dhia Elhaq Rzig et.al. |
2501.12521 |
null |
2025-01-21 |
A Domain Adaptation Framework for Speech Recognition Systems with Only Synthetic data |
Minh Tran et.al. |
2501.12501 |
null |
2025-01-21 |
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws |
Tian Jin et.al. |
2501.12486 |
null |
2025-01-21 |
An Empirical Characterization of Outages and Incidents in Public Services for Large Language Models |
Xiaoyu Chu et.al. |
2501.12469 |
link |
2025-01-21 |
Adaptive PII Mitigation Framework for Large Language Models |
Shubhi Asthana et.al. |
2501.12465 |
null |
2025-01-21 |
Empowering AIOps: Leveraging Large Language Models for IT Operations Management |
Arthur Vitui et.al. |
2501.12461 |
link |
2025-01-21 |
Deploying Privacy Guardrails for LLMs: A Comparative Analysis of Real-World Applications |
Shubhi Asthana et.al. |
2501.12456 |
null |
2025-01-21 |
Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation |
Dongsheng Zhu et.al. |
2501.12432 |
null |
2025-01-21 |
FREYR: A Framework for Recognizing and Executing Your Requests |
Roberto Gallotta et.al. |
2501.12423 |
link |
2025-01-21 |
CroMe: Multimodal Fake News Detection using Cross-Modal Tri-Transformer and Metric Learning |
Eunjee Choi et.al. |
2501.12422 |
null |
2025-01-22 |
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling |
Yi Wang et.al. |
2501.12386 |
link |
2025-01-21 |
Accelerating Pulsar Parameter Estimation Using Convolutional Neural Networks |
Greg Olmschenk et.al. |
2501.12383 |
null |
2025-01-21 |
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding |
Yilun Zhao et.al. |
2501.12380 |
link |
2025-01-22 |
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos |
Sili Chen et.al. |
2501.12375 |
null |
2025-01-21 |
Expertise elevates AI usage: experimental evidence comparing laypeople and professional artists |
Thomas F. Eisenmann et.al. |
2501.12374 |
link |
2025-01-21 |
Is Long Context All You Need? Leveraging LLM’s Extended Context for NL2SQL |
Yeounoh Chung et.al. |
2501.12372 |
link |
2025-01-21 |
Automatic Labelling with Open-source LLMs using Dynamic Label Schema Integration |
Thomas Walshe et.al. |
2501.12332 |
null |
2025-01-21 |
Cinepro: Robust Training of Foundation Models for Cancer Detection in Prostate Ultrasound Cineloops |
Mohamed Harmanani et.al. |
2501.12331 |
link |
2025-01-21 |
VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model |
Xianwei Zhuang et.al. |
2501.12327 |
link |
2025-01-21 |
LLM-Assisted Knowledge Graph Completion for Curriculum and Domain Modelling in Personalized Higher Education Recommendations |
Hasan Abu-Rasheed et.al. |
2501.12300 |
null |
2025-01-21 |
MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks |
Qishen Zhou et.al. |
2501.12281 |
link |
2025-01-21 |
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement |
Maosong Cao et.al. |
2501.12273 |
link |
2025-01-21 |
FOCUS: First Order Concentrated Updating Scheme |
Yizhou Liu et.al. |
2501.12243 |
null |
2025-01-21 |
InsTALL: Context-aware Instructional Task Assistance with Multi-modal Large Language Models |
Pha Nguyen et.al. |
2501.12231 |
null |
2025-01-21 |
CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning |
Yuanheng Fang et.al. |
2501.12226 |
null |
2025-01-21 |
Leveraging Large Language Models for Realizing Truly Intelligent User Interfaces |
Allard Oelen et.al. |
2501.12221 |
null |
2025-01-21 |
You Can’t Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense |
Wuyuao Mai et.al. |
2501.12210 |
null |
2025-01-21 |
Explainability for Vision Foundation Models: A Survey |
Rémi Kazmierczak et.al. |
2501.12203 |
null |
2025-01-22 |
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation |
Zibo Zhao et.al. |
2501.12202 |
link |
2025-01-21 |
BiMarker: Enhancing Text Watermark Detection for Large Language Models with Bipolar Watermarks |
Zhuang Li et.al. |
2501.12174 |
null |
2025-01-21 |
Contextualizing Recommendation Explanations with LLMs: A User Study |
Yuanjun Feng et.al. |
2501.12152 |
null |
2025-01-21 |
Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities |
Qirun Dai et.al. |
2501.12147 |
null |
2025-01-21 |
Do LLMs Provide Links to Code Similar to what they Generate? A Study with Gemini and Bing CoPilot |
Daniele Bifolco et.al. |
2501.12134 |
null |
2025-01-21 |
Evaluating Efficiency and Engagement in Scripted and LLM-Enhanced Human-Robot Interactions |
Tim Schreiter et.al. |
2501.12128 |
null |
2025-01-21 |
Can open source large language models be used for tumor documentation in Germany? – An evaluation on urological doctors’ notes |
Stefan Lenz et.al. |
2501.12106 |
link |
2025-01-21 |
Dissecting the NVIDIA Hopper Architecture through Microbenchmarking and Multiple Level Analysis |
Weile Luo et.al. |
2501.12084 |
null |
2025-01-21 |
Phishing Awareness via Game-Based Learning |
Argianto Rahartomo et.al. |
2501.12077 |
link |
2025-01-21 |
PINNsAgent: Automated PDE Surrogation with Large Language Models |
Qingpo Wuwu et.al. |
2501.12053 |
null |
2025-01-21 |
Harnessing Generative Pre-Trained Transformer for Datacenter Packet Trace Generation |
Chen Griner et.al. |
2501.12033 |
null |
2025-01-21 |
Comparative Analysis of Pre-trained Deep Learning Models and DINOv2 for Cushing’s Syndrome Diagnosis in Facial Analysis |
Hongjun Liu et.al. |
2501.12023 |
null |
2025-01-21 |
Are Traditional Deep Learning Model Approaches as Effective as a Retinal-Specific Foundation Model for Ocular and Systemic Disease Detection? |
Samantha Min Er Yew et.al. |
2501.12016 |
null |
2025-01-21 |
Rate-Aware Learned Speech Compression |
Jun Xu et.al. |
2501.11999 |
null |
2025-01-21 |
Linear Feedback Control Systems for Iterative Prompt Optimization in Large Language Models |
Rupesh Raj Karn et.al. |
2501.11979 |
null |
2025-01-21 |
Leveraging Graph Structures and Large Language Models for End-to-End Synthetic Task-Oriented Dialogues |
Maya Medjad et.al. |
2501.11977 |
link |
2025-01-21 |
Bridging Visualization and Optimization: Multimodal Large Language Models on Graph-Structured Combinatorial Optimization |
Jie Zhao et.al. |
2501.11968 |
null |
2025-01-21 |
A Hybrid Attention Framework for Fake News Detection with Large Language Models |
Xiaochuan Xu et.al. |
2501.11967 |
null |
2025-01-21 |
TAD-Bench: A Comprehensive Benchmark for Embedding-Based Text Anomaly Detection |
Yang Cao et.al. |
2501.11960 |
null |
2025-01-21 |
Proverbs Run in Pairs: Evaluating Proverb Translation Capability of Large Language Model |
Minghan Wang et.al. |
2501.11953 |
null |
2025-01-21 |
ALoFTRAG: Automatic Local Fine Tuning for Retrieval Augmented Generation |
Peter Devine et.al. |
2501.11929 |
link |
2025-01-21 |
Integrate Temporal Graph Learning into LLM-based Temporal Knowledge Graph Model |
He Chang et.al. |
2501.11911 |
null |
2025-01-21 |
Panoramic Interests: Stylistic-Content Aware Personalized Headline Generation |
Junhong Lian et.al. |
2501.11900 |
link |
2025-01-22 |
Med-R$^2$: Crafting Trustworthy LLM Physicians through Retrieval and Reasoning of Evidence-Based Medicine |
Keer Lu et.al. |
2501.11885 |
link |
2025-01-21 |
From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning |
Yafu Li et.al. |
2501.11877 |
link |
2025-01-21 |
LLM-Agents Driven Automated Simulation Testing and Analysis of small Uncrewed Aerial Systems |
Venkata Sai Aswath Duvvuru et.al. |
2501.11864 |
null |
2025-01-21 |
EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents |
Zhili Cheng et.al. |
2501.11858 |
link |
2025-01-21 |
Network-informed Prompt Engineering against Organized Astroturf Campaigns under Extreme Class Imbalance |
Nikos Kanakaris et.al. |
2501.11849 |
link |
2025-01-21 |
A Survey on Memory-Efficient Large-Scale Model Training in AI for Science |
Kaiyuan Tian et.al. |
2501.11847 |
null |
2025-01-21 |
Large Language Models with Human-In-The-Loop Validation for Systematic Review Data Extraction |
Noah L. Schroeder et.al. |
2501.11840 |
null |
2025-01-21 |
PXGen: A Post-hoc Explainable Method for Generative Models |
Yen-Lung Huang et.al. |
2501.11827 |
null |
2025-01-21 |
CogMorph: Cognitive Morphing Attacks for Text-to-Image Models |
Zonglei Jing et.al. |
2501.11815 |
null |
2025-01-20 |
Benchmarking Large Language Models via Random Variables |
Zijin Hong et.al. |
2501.11790 |
null |
2025-01-20 |
Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection |
Ali Naseh et.al. |
2501.11786 |
null |
2025-01-20 |
Glinthawk: A Two-Tiered Architecture for High-Throughput LLM Inference |
Pouya Hamadanian et.al. |
2501.11779 |
link |
2025-01-20 |
The Value of Nothing: Multimodal Extraction of Human Values Expressed by TikTok Influencers |
Alina Starovolsky-Shitrit et.al. |
2501.11770 |
null |
2025-01-20 |
Poison-RAG: Adversarial Data Poisoning Attacks on Retrieval-Augmented Generation in Recommender Systems |
Fatemeh Nazary et.al. |
2501.11759 |
link |
2025-01-20 |
A generalizable 3D framework and model for self-supervised learning in medical imaging |
Tony Xu et.al. |
2501.11755 |
link |
2025-01-20 |
Are generative models fair? A study of racial bias in dermatological image generation |
Miguel López-Pérez et.al. |
2501.11752 |
null |
2025-01-20 |
Optimizing Pretraining Data Mixtures with LLM-Estimated Utility |
William Held et.al. |
2501.11747 |
null |
2025-01-20 |
MedicoSAM: Towards foundation models for medical image segmentation |
Anwai Archit et.al. |
2501.11734 |
link |
2025-01-20 |
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks |
Zhenhailong Wang et.al. |
2501.11733 |
null |
2025-01-20 |
Explain-Query-Test: Self-Evaluating LLMs Via Explanation and Comprehension Discrepancy |
Saeid Asgari Taghanaki et.al. |
2501.11721 |
link |
2025-01-20 |
YouLeQD: Decoding the Cognitive Complexity of Questions and Engagement in Online Educational Videos from Learners’ Perspectives |
Nong Ming et.al. |
2501.11712 |
link |
2025-01-20 |
Towards Detecting Prompt Knowledge Gaps for Improved LLM-guided Issue Resolution |
Ramtin Ehsani et.al. |
2501.11709 |
link |
2025-01-20 |
Trustformer: A Trusted Federated Transformer |
Ali Abbasi Tadi et.al. |
2501.11706 |
null |
2025-01-20 |
Human services organizations and the responsible integration of AI: Considering ethics and contextualizing risk(s) |
Brian E. Perron et.al. |
2501.11705 |
null |
2025-01-20 |
Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling |
Zhenyu Hou et.al. |
2501.11651 |
link |
2025-01-20 |
Trojan Detection Through Pattern Recognition for Large Language Models |
Vedant Bhasin et.al. |
2501.11621 |
null |
2025-01-20 |
Conversation Routines: A Prompt Engineering Framework for Task-Oriented Dialog Systems |
Giorgio Robino et.al. |
2501.11613 |
null |
2025-01-20 |
SR-FoT: A Syllogistic-Reasoning Framework of Thought for Large Language Models Tackling Knowledge-based Reasoning Tasks |
Wentao Wan et.al. |
2501.11599 |
link |
2025-01-20 |
Recurrent Diffusion for Large-Scale Parameter Generation |
Kai Wang et.al. |
2501.11587 |
link |
2025-01-20 |
Open Sourcing GPTs: Economics of Open Sourcing Advanced AI Models |
Mahyar Habibi et.al. |
2501.11581 |
null |
2025-01-20 |
Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution |
Zhiyuan You et.al. |
2501.11561 |
null |
2025-01-20 |
PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation |
Jinyu Wang et.al. |
2501.11551 |
link |
2025-01-20 |
UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion |
Zixuan Chen et.al. |
2501.11515 |
null |
2025-01-20 |
Generative AI and Large Language Models in Language Preservation: Opportunities and Challenges |
Vincent Koc et.al. |
2501.11496 |
null |
2025-01-20 |
Graph-defined Language Learning with LLMs |
Huachi Zhou et.al. |
2501.11478 |
null |
2025-01-20 |
Curiosity-Driven Reinforcement Learning from Human Feedback |
Haoran Sun et.al. |
2501.11463 |
link |
2025-01-20 |
Ontology Matching with Large Language Models and Prioritized Depth-First Search |
Maria Taboada et.al. |
2501.11441 |
null |
2025-01-20 |
One Does Not Simply Meme Alone: Evaluating Co-Creativity Between LLMs and Humans in the Generation of Humor |
Zhikun Wu et.al. |
2501.11433 |
null |
2025-01-20 |
A Survey on Diffusion Models for Anomaly Detection |
Jing Liu et.al. |
2501.11430 |
link |
2025-01-20 |
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training |
Siyu Yuan et.al. |
2501.11425 |
link |
2025-01-20 |
Neural Contextual Reinforcement Framework for Logical Structure Language Generation |
Marcus Irvin et.al. |
2501.11417 |
null |
2025-01-20 |
Beyond the Hype: Benchmarking LLM-Evolved Heuristics for Bin Packing |
Kevin Sim et.al. |
2501.11411 |
null |
2025-01-20 |
Revisiting Language Models in Neural News Recommender Systems |
Yuyue Zhao et.al. |
2501.11391 |
link |
2025-01-20 |
Towards Advancing Code Generation with Large Language Models: A Research Roadmap |
Haolin Jin et.al. |
2501.11354 |
null |
2025-01-20 |
EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery |
Guankun Wang et.al. |
2501.11347 |
link |
2025-01-20 |
GenVidBench: A Challenging Benchmark for Detecting AI-Generated Video |
Zhenliang Ni et.al. |
2501.11340 |
null |
2025-01-20 |
Few-shot Policy (de)composition in Conversational Question Answering |
Kyle Erwin et.al. |
2501.11335 |
null |
2025-01-20 |
Nested Annealed Training Scheme for Generative Adversarial Networks |
Chang Wan et.al. |
2501.11318 |
null |
2025-01-20 |
Advancing Multi-Party Dialogue Systems with Speaker-ware Contrastive Learning |
Zhongtian Hu et.al. |
2501.11292 |
null |
2025-01-20 |
Large Language Model Agents for Radio Map Generation and Wireless Network Planning |
Hongye Quan et.al. |
2501.11283 |
null |
2025-01-20 |
Multi-round, Chain-of-thought Post-editing for Unfaithful Summaries |
Yi-Hui Lee et.al. |
2501.11273 |
null |
2025-01-20 |
Can xLLMs Understand the Structure of Dialog? Exploring Multilingual Response Generation in Complex Scenarios |
Zhongtian Hu et.al. |
2501.11269 |
null |
2025-01-20 |
Code Readability in the Age of Large Language Models: An Industrial Case Study from Atlassian |
Wannita Takerngsaksiri et.al. |
2501.11264 |
link |
2025-01-20 |
Multivariate Wireless Link Quality Prediction Based on Pre-trained Large Language Models |
Zhuangzhuang Yan et.al. |
2501.11247 |
null |
2025-01-20 |
Irony in Emojis: A Comparative Study of Human and LLM Interpretation |
Yawen Zheng et.al. |
2501.11241 |
null |
2025-01-20 |
KPL: Training-Free Medical Knowledge Mining of Vision-Language Models |
Jiaxiang Liu et.al. |
2501.11231 |
link |
2025-01-20 |
Reasoning Language Models: A Blueprint |
Maciej Besta et.al. |
2501.11223 |
link |
2025-01-20 |
Embedding-Driven Diversity Sampling to Improve Few-Shot Synthetic Data Generation |
Ivan Lopez et.al. |
2501.11199 |
null |
2025-01-19 |
Conditional Feature Importance with Generative Modeling Using Adversarial Random Forests |
Kristin Blesch et.al. |
2501.11178 |
link |
2025-01-17 |
FaceXBench: Evaluating Multimodal LLMs on Face Understanding |
Kartik Narayan et.al. |
2501.10360 |
link |
2025-01-17 |
Zero-Shot Monocular Scene Flow Estimation in the Wild |
Yiqing Liang et.al. |
2501.10357 |
null |
2025-01-17 |
Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems |
Weibo Gao et.al. |
2501.10332 |
link |
2025-01-17 |
Large language models for automated scholarly paper review: A survey |
Zhenzhen Zhuang et.al. |
2501.10326 |
null |
2025-01-17 |
HiMix: Reducing Computational Complexity in Large Vision-Language Models |
Xuange Zhang et.al. |
2501.10318 |
null |
2025-01-17 |
Addressing Popularity Bias in Third-Party Library Recommendations Using LLMs |
Claudio Di Sipio et.al. |
2501.10313 |
null |
2025-01-17 |
Computational Protein Science in the Era of Large Language Models (LLMs) |
Wenqi Fan et.al. |
2501.10282 |
null |
2025-01-17 |
Test Wars: A Comparative Study of SBST, Symbolic Execution, and LLM-Based Approaches to Unit Test Generation |
Azat Abdullin et.al. |
2501.10200 |
null |
2025-01-17 |
Generative Artificial Intelligence: Implications for Biomedical and Health Professions Education |
William Hersh et.al. |
2501.10186 |
null |
2025-01-17 |
Multi-stage Training of Bilingual Islamic LLM for Neural Passage Retrieval |
Vera Pavlova et.al. |
2501.10175 |
null |
2025-01-17 |
Exploring the Impact of Generative Artificial Intelligence in Education: A Thematic Analysis |
Abhishek Kaushik et.al. |
2501.10134 |
null |
2025-01-17 |
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario |
Lucen Zhong et.al. |
2501.10132 |
link |
2025-01-17 |
PaSa: An LLM Agent for Comprehensive Academic Paper Search |
Yichen He et.al. |
2501.10120 |
link |
2025-01-17 |
AI-Generated Music Detection and its Challenges |
Darius Afchar et.al. |
2501.10111 |
link |
2025-01-17 |
LLM Reasoner and Automated Planner: A new NPC approach |
Israel Puerta-Merino et.al. |
2501.10106 |
null |
2025-01-17 |
Universal Actions for Enhanced Embodied Foundation Models |
Jinliang Zheng et.al. |
2501.10105 |
link |
2025-01-17 |
Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks |
Michael Schwingshackl et.al. |
2501.10080 |
link |
2025-01-17 |
FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization |
Zhaopeng Gu et.al. |
2501.10067 |
link |
2025-01-17 |
Accelerating Large Language Models through Partially Linear Feed-Forward Network |
Gansen Hu et.al. |
2501.10054 |
null |
2025-01-17 |
AirRAG: Activating Intrinsic Reasoning for Retrieval Augmented Generation via Tree-based Search |
Wenfeng Feng et.al. |
2501.10053 |
null |
2025-01-17 |
Exploring Code Comprehension in Scientific Programming: Preliminary Insights from Research Scientists |
Alyssia Chen et.al. |
2501.10037 |
null |
2025-01-17 |
Mapping scientific communities at scale |
Victor Barbier et.al. |
2501.10035 |
link |
2025-01-17 |
Mitigating Hallucinations on Object Attributes using Multiview Images and Negative Instructions |
Zhijie Tan et.al. |
2501.10011 |
null |
2025-01-17 |
Attention-guided Self-reflection for Zero-shot Hallucination Detection in Large Language Models |
Qiang Liu et.al. |
2501.09997 |
null |
2025-01-17 |
Agent-as-Judge for Factual Summarization of Long Narratives |
Yeonseok Jeong et.al. |
2501.09993 |
link |
2025-01-17 |
RichSpace: Enriching Text-to-Video Prompt Space via Text Embedding Interpolation |
Yuefan Cao et.al. |
2501.09982 |
null |
2025-01-17 |
GVMGen: A General Video-to-Music Generation Model with Hierarchical Attentions |
Heda Zuo et.al. |
2501.09972 |
null |
2025-01-17 |
Explainable artificial intelligence (XAI): from inherent explainability to large language models |
Fuseini Mumuni et.al. |
2501.09967 |
null |
2025-01-17 |
A Survey on Multi-Turn Interaction Capabilities of Large Language Models |
Chen Zhang et.al. |
2501.09959 |
null |
2025-01-17 |
FRAG: A Flexible Modular Framework for Retrieval-Augmented Generation based on Knowledge Graphs |
Zengyi Gao et.al. |
2501.09957 |
null |
2025-01-17 |
AIRCHITECT v2: Learning the Hardware Accelerator Design Space through Unified Representations |
Jamin Seo et.al. |
2501.09954 |
link |
2025-01-17 |
Sympathy over Polarization: A Computational Discourse Analysis of Social Media Posts about the July 2024 Trump Assassination Attempt |
Qingcheng Zeng et.al. |
2501.09950 |
null |
2025-01-17 |
MultiPruner: Balanced Structure Removal in Foundation Models |
J. Pablo Muñoz et.al. |
2501.09949 |
link |
2025-01-17 |
Steering Large Language Models with Feature Guided Activation Additions |
Samuel Soo et.al. |
2501.09929 |
null |
2025-01-17 |
Towards A Litmus Test for Common Sense |
Hugo Latapie et.al. |
2501.09913 |
null |
2025-01-17 |
Demo: Interactive Visualization of Semantic Relationships in a Biomedical Project’s Talent Knowledge Graph |
Jiawei Xu et.al. |
2501.09909 |
null |
2025-01-17 |
Position: Open and Closed Large Language Models in Healthcare |
Jiawei Xu et.al. |
2501.09906 |
null |
2025-01-17 |
FoundationStereo: Zero-Shot Stereo Matching |
Bowen Wen et.al. |
2501.09898 |
link |
2025-01-17 |
Evolving Deeper LLM Thinking |
Kuang-Huei Lee et.al. |
2501.09891 |
null |
2025-01-17 |
Understanding the Effectiveness of LLMs in Automated Self-Admitted Technical Debt Repayment |
Mohammad Sadegh Sheikhaei et.al. |
2501.09888 |
link |
2025-01-17 |
FLORA: Formal Language Model Enables Robust Training-free Zero-shot Object Referring Analysis |
Zhe Chen et.al. |
2501.09887 |
null |
2025-01-16 |
ASTRA: A Scene-aware TRAnsformer-based model for trajectory prediction |
Izzeddin Teeti et.al. |
2501.09878 |
null |
2025-01-16 |
Geometry-Preserving Encoder/Decoder in Latent Generative Models |
Wonjun Lee et.al. |
2501.09876 |
null |
2025-01-16 |
An LLM-Guided Tutoring System for Social Skills Training |
Michael Guevarra et.al. |
2501.09870 |
null |
2025-01-16 |
Fine-grained Testing for Autonomous Driving Software: a Study on Autoware with LLM-driven Unit Testing |
Wenhan Wang et.al. |
2501.09866 |
null |
2025-01-16 |
Optimization is Better than Generation: Optimizing Commit Message Leveraging Human-written Commit Message |
Jiawei Li et.al. |
2501.09861 |
null |
2025-01-16 |
PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery |
Shristi Das Biswas et.al. |
2501.09826 |
link |
2025-01-16 |
Bridging Language Barriers in Healthcare: A Study on Arabic LLMs |
Nada Saadi et.al. |
2501.09825 |
null |
2025-01-16 |
BN-Pool: a Bayesian Nonparametric Approach to Graph Pooling |
Daniele Castellana et.al. |
2501.09821 |
link |
2025-01-16 |
Conversational Text Extraction with Large Language Models Using Retrieval-Augmented Systems |
Soham Roy et.al. |
2501.09801 |
null |
2025-01-16 |
Computing Optimization-Based Prompt Injections Against Closed-Weights Models By Misusing a Fine-Tuning API |
Andrey Labunets et.al. |
2501.09798 |
null |
2025-01-16 |
GeoManip: Geometric Constraints as General Interfaces for Robot Manipulation |
Weiliang Tang et.al. |
2501.09783 |
null |
2025-01-16 |
SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation |
Wanqi Yin et.al. |
2501.09782 |
link |
2025-01-16 |
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos |
Zhongwei Ren et.al. |
2501.09781 |
null |
2025-01-16 |
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong |
Tairan Fu et.al. |
2501.09775 |
null |
2025-01-16 |
Distilling Multi-modal Large Language Models for Autonomous Driving |
Deepti Hegde et.al. |
2501.09757 |
null |
2025-01-16 |
Learnings from Scaling Visual Tokenizers for Reconstruction and Generation |
Philippe Hansen-Estruch et.al. |
2501.09755 |
null |
2025-01-16 |
Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues |
Youngjoon Jang et.al. |
2501.09754 |
null |
2025-01-16 |
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking |
Zekun Xi et.al. |
2501.09751 |
link |
2025-01-16 |
Enhancing Lexicon-Based Text Embeddings with Large Language Models |
Yibin Lei et.al. |
2501.09749 |
null |
2025-01-16 |
Suggesting Code Edits in Interactive Machine Learning Notebooks Using Large Language Models |
Bihui Jin et.al. |
2501.09745 |
null |
2025-01-16 |
KU AIGEN ICL EDI@BC8 Track 3: Advancing Phenotype Named Entity Recognition and Normalization for Dysmorphology Physical Examination Reports |
Hajung Kim et.al. |
2501.09744 |
null |
2025-01-16 |
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps |
Nanye Ma et.al. |
2501.09732 |
null |
2025-01-16 |
A Simple Aerial Detection Baseline of Multimodal Language Models |
Qingyun Li et.al. |
2501.09720 |
link |
2025-01-16 |
Comparative Insights from 12 Machine Learning Models in Extracting Economic Ideology from Political Text |
Jihed Ncib et.al. |
2501.09719 |
null |
2025-01-16 |
CyberMentor: AI Powered Learning Tool Platform to Address Diverse Student Needs in Cybersecurity Education |
Tianyu Wang et.al. |
2501.09709 |
link |
2025-01-16 |
Domain Adaptation of Foundation LLMs for e-Commerce |
Christian Herold et.al. |
2501.09706 |
null |
2025-01-16 |
Cueless EEG imagined speech for subject identification: dataset and benchmarks |
Ali Derakhshesh et.al. |
2501.09700 |
link |
2025-01-16 |
Simulated Interactive Debugging |
Yannic Noller et.al. |
2501.09694 |
null |
2025-01-17 |
Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities |
Fengli Xu et.al. |
2501.09686 |
null |
2025-01-16 |
Reward-Guided Controlled Generation for Inference-Time Alignment in Diffusion Models: Tutorial and Review |
Masatoshi Uehara et.al. |
2501.09685 |
null |
2025-01-16 |
Robin: a Suite of Multi-Scale Vision-Language Models and the CHIRP Evaluation Benchmark |
Alexis Roger et.al. |
2501.09672 |
null |
2025-01-16 |
A Survey of Research in Large Language Models for Electronic Design Automation |
Jingyu Pan et.al. |
2501.09655 |
null |
2025-01-16 |
The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models |
Jonathan Katzy et.al. |
2501.09653 |
null |
2025-01-16 |
CarMem: Enhancing Long-Term Memory in LLM Voice Assistants through Category-Bounding |
Johannes Kirmayr et.al. |
2501.09645 |
link |
2025-01-17 |
LLM-Based Routing in Mixture of Experts: A Novel Framework for Trading |
Kuan-Ming Liu et.al. |
2501.09636 |
null |
2025-01-16 |
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework |
Yushen Lin et.al. |
2501.09631 |
null |
2025-01-16 |
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment |
Chaoqi Wang et.al. |
2501.09620 |
link |
2025-01-16 |
From Scarcity to Capability: Empowering Fake News Detection in Low-Resource Languages with LLMs |
Hrithik Majumdar Shibu et.al. |
2501.09604 |
link |
2025-01-16 |
Atleus: Accelerating Transformers on the Edge Enabled by 3D Heterogeneous Manycore Architectures |
Pratyush Dhingra et.al. |
2501.09588 |
null |
2025-01-16 |
Text-driven Adaptation of Foundation Models for Few-shot Surgical Workflow Analysis |
Tingxuan Chen et.al. |
2501.09555 |
link |
2025-01-16 |
AI in Support of Diversity and Inclusion |
Çiçek Güven et.al. |
2501.09534 |
null |
2025-01-16 |
Confidence Estimation for Error Detection in Text-to-SQL Systems |
Oleg Somov et.al. |
2501.09527 |
link |
2025-01-16 |
Augmenting a Large Language Model with a Combination of Text and Visual Data for Conversational Visualization of Global Geospatial Data |
Omar Mena et.al. |
2501.09521 |
null |
2025-01-16 |
AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation |
Junjie He et.al. |
2501.09503 |
link |
2025-01-16 |
Omni-Emotion: Extending Video MLLM with Detailed Face and Audio Modeling for Multimodal Emotion Analysis |
Qize Yang et.al. |
2501.09502 |
null |
2025-01-16 |
Evaluating Conversational Recommender Systems with Large Language Models: A User-Centric Evaluation Framework |
Nuo Chen et.al. |
2501.09493 |
null |
2025-01-16 |
Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators |
Zhaocheng Liu et.al. |
2501.09484 |
link |
2025-01-16 |
Guided Debugging of Auto-Translated Code Using Differential Testing |
Shengnan Wu et.al. |
2501.09475 |
null |
2025-01-16 |
DEFOM-Stereo: Depth Foundation Model Based Stereo Matching |
Hualie Jiang et.al. |
2501.09466 |
link |
2025-01-16 |
Pruning for Sparse Diffusion Models based on Gradient Flow |
Ben Wan et.al. |
2501.09464 |
null |
2025-01-16 |
“A Great Start, But…”: Evaluating LLM-Generated Mind Maps for Information Mapping in Video-Based Design |
Tianhao He et.al. |
2501.09457 |
null |
2025-01-16 |
Solving the unsolvable: Translating case law in Hong Kong |
King-kui Sin et.al. |
2501.09444 |
null |
2025-01-16 |
Scaling up self-supervised learning for improved surgical foundation models |
Tim J. M. Jaspers et.al. |
2501.09436 |
link |
2025-01-16 |
CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation |
Hwan Heo et.al. |
2501.09433 |
link |
2025-01-16 |
A Survey on Responsible LLMs: Inherent Risk, Malicious Use, and Mitigation Strategy |
Huandong Wang et.al. |
2501.09431 |
null |
2025-01-16 |
AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring |
Xinyi Wang et.al. |
2501.09428 |
null |
2025-01-16 |
AutoCBT: An Autonomous Multi-agent Framework for Cognitive Behavioral Therapy in Psychological Counseling |
Ancheng Xu et.al. |
2501.09426 |
null |
2025-01-16 |
FASP: Fast and Accurate Structured Pruning of Large Language Models |
Hanyu Hu et.al. |
2501.09412 |
null |
2025-01-16 |
MoE$^2$: Optimizing Collaborative Inference for Edge Large Language Models |
Lyudong Jin et.al. |
2501.09410 |
null |
2025-01-16 |
Adaptive Contextual Caching for Mobile Edge Large Language Model Service |
Guangyuan Liu et.al. |
2501.09383 |
null |
2025-01-16 |
Aligning Instruction Tuning with Pre-training |
Yiming Liang et.al. |
2501.09368 |
null |
2025-01-16 |
PICE: A Semantic-Driven Progressive Inference System for LLM Serving in Cloud-Edge Networks |
Huiyou Zhan et.al. |
2501.09367 |
null |
2025-01-16 |
YETI (YET to Intervene) Proactive Interventions by Multimodal AI Agents in Augmented Reality Tasks |
Saptarashmi Bandyopadhyay et.al. |
2501.09355 |
null |
2025-01-16 |
UVRM: A Scalable 3D Reconstruction Model from Unposed Videos |
Shiu-hong Kao et.al. |
2501.09347 |
null |
2025-01-16 |
Rational Tuning of LLM Cascades via Probabilistic Modeling |
Michael J. Zellinger et.al. |
2501.09345 |
null |
2025-01-16 |
SOP-Agent: Empower General Purpose AI Agent with Domain-Specific SOPs |
Anbang Ye et.al. |
2501.09316 |
null |
2025-01-16 |
A Study of In-Context-Learning-Based Text-to-SQL Errors |
Jiawei Shen et.al. |
2501.09310 |
link |
2025-01-16 |
To Retrieve or Not to Retrieve? Uncertainty Detection for Dynamic Retrieval Augmented Generation |
Kaustubh D. Dhole et.al. |
2501.09292 |
null |
2025-01-16 |
LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport |
Kyeongha Rho et.al. |
2501.09291 |
link |
2025-01-16 |
Text-guided Synthetic Geometric Augmentation for Zero-shot 3D Understanding |
Kohei Torimi et.al. |
2501.09278 |
null |
2025-01-16 |
Large Language Model is Secretly a Protein Sequence Optimizer |
Yinkai Wang et.al. |
2501.09274 |
null |
2025-01-16 |
Perspective Transition of Large Language Models for Solving Subjective Tasks |
Xiaolong Wang et.al. |
2501.09265 |
null |
2025-01-16 |
Delayed Fusion: Integrating Large Language Models into First-Pass Decoding in End-to-end Speech Recognition |
Takaaki Hori et.al. |
2501.09258 |
null |
2025-01-16 |
Clone-Robust AI Alignment |
Ariel D. Procaccia et.al. |
2501.09254 |
null |
2025-01-16 |
Split Fine-Tuning for Large Language Models in Wireless Networks |
Songge Zhang et.al. |
2501.09237 |
null |
2025-01-16 |
Foundations of Large Language Models |
Tong Xiao et.al. |
2501.09223 |
link |
2025-01-16 |
Leveraging Scale-aware Representations for improved Concept-Representation Alignment in ViTs |
Sanchit Sinha et.al. |
2501.09221 |
null |
2025-01-16 |
A Simple Graph Contrastive Learning Framework for Short Text Classification |
Yonghao Liu et.al. |
2501.09219 |
link |
2025-01-16 |
Interpretable Droplet Digital PCR Assay for Trustworthy Molecular Diagnostics |
Yuanyuan Wei et.al. |
2501.09218 |
null |
2025-01-16 |
Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning |
Yonghao Liu et.al. |
2501.09214 |
link |
2025-01-16 |
FineMedLM-o1: Enhancing the Medical Reasoning Ability of LLM from Supervised Fine-Tuning to Test-Time Training |
Hongzhou Yu et.al. |
2501.09213 |
link |
2025-01-15 |
Unified Few-shot Crack Segmentation and its Precise 3D Automatic Measurement in Concrete Structures |
Pengru Deng et.al. |
2501.09203 |
null |
2025-01-15 |
Towards Semantics Lifting for Scientific Computing: A Case Study on FFT |
Naifeng Zhang et.al. |
2501.09201 |
null |
2025-01-15 |
Guiding Retrieval using LLM-based Listwise Rankers |
Mandeep Rathee et.al. |
2501.09186 |
link |
2025-01-15 |
The Veln(ia)s is in the Details: Evaluating LLM Judgment on Latvian and Lithuanian Short Answer Matching |
Yevhen Kostiuk et.al. |
2501.09164 |
null |
2025-01-15 |
Evaluating GenAI for Simplifying Texts for Education: Improving Accuracy and Consistency for Enhanced Readability |
Stephanie L. Day et.al. |
2501.09158 |
null |
2025-01-15 |
Towards Multilingual LLM Evaluation for Baltic and Nordic languages: A study on Lithuanian History |
Yevhen Kostiuk et.al. |
2501.09154 |
null |
2025-01-15 |
Few-Shot Adaptation of Training-Free Foundation Model for 3D Medical Image Segmentation |
Xingxin He et.al. |
2501.09138 |
null |
2025-01-15 |
Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG |
Aditi Singh et.al. |
2501.09136 |
link |
2025-01-15 |
HAFix: History-Augmented Large Language Models for Bug Fixing |
Yu Shi et.al. |
2501.09135 |
link |
2025-01-15 |
Multilingual LLMs Struggle to Link Orthography and Semantics in Bilingual Word Processing |
Eshaan Tanwar et.al. |
2501.09127 |
link |
2025-01-15 |
Augmenting Human-Annotated Training Data with Large Language Model Generation and Distillation in Open-Response Assessment |
Conrad Borchers et.al. |
2501.09126 |
null |
2025-01-15 |
Rethinking Post-Training Quantization: Introducing a Statistical Pre-Calibration Approach |
Alireza Ghaffari et.al. |
2501.09107 |
null |
2025-01-15 |
Tracking the Takes and Trajectories of English-Language News Narratives across Trustworthy and Worrisome Websites |
Hans W. A. Hanley et.al. |
2501.09102 |
link |
2025-01-15 |
Drama Llama: An LLM-Powered Storylets Framework for Authorable Responsiveness in Interactive Narrative |
Yuqian Sun et.al. |
2501.09099 |
null |
2025-01-15 |
SteLLA: A Structured Grading System Using LLMs with RAG |
Hefei Qiu et.al. |
2501.09092 |
null |
2025-01-15 |
Generative diffusion model with inverse renormalization group flows |
Kanta Masuki et.al. |
2501.09064 |
link |
2025-01-15 |
Decompose-ToM: Enhancing Theory of Mind Reasoning in Large Language Models through Simulation and Task Decomposition |
Sneheel Sarangi et.al. |
2501.09056 |
link |
2025-01-15 |
How Do Generative Models Draw a Software Engineer? A Case Study on Stable Diffusion Bias |
Tosin Fadahunsi et.al. |
2501.09014 |
link |
2025-01-15 |
Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians |
Ishan Amin et.al. |
2501.09009 |
link |
2025-01-15 |
Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails |
Shaona Ghosh et.al. |
2501.09004 |
null |
2025-01-15 |
Vision Foundation Models for Computed Tomography |
Suraj Pai et.al. |
2501.09001 |
link |
2025-01-15 |
CrystalGRW: Generative Modeling of Crystal Structures with Targeted Properties via Geodesic Random Walks |
Krit Tangsongcharoen et.al. |
2501.08998 |
link |
2025-01-15 |
VECT-GAN: A variationally encoded generative model for overcoming data scarcity in pharmaceutical science |
Youssef Abdalla et.al. |
2501.08995 |
link |
2025-01-15 |
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities |
Haozhe Xie et.al. |
2501.08983 |
link |
2025-01-15 |
Development and Validation of the Provider Documentation Summarization Quality Instrument for Large Language Models |
Emma Croxford et.al. |
2501.08977 |
null |
2025-01-15 |
Learning to Extract Cross-Domain Aspects and Understanding Sentiments Using Large Language Models |
Karukriti Kaushik Ghosh et.al. |
2501.08974 |
null |
2025-01-15 |
Analyzing the Ethical Logic of Six Large Language Models |
W. Russell Neuman et.al. |
2501.08951 |
null |
2025-01-15 |
Applying General Turn-taking Models to Conversational Human-Robot Interaction |
Gabriel Skantze et.al. |
2501.08946 |
null |
2025-01-15 |
Disentangling Exploration of Large Language Models by Optimal Exploitation |
Tim Grams et.al. |
2501.08925 |
null |
2025-01-15 |
GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge |
Liam Dugan et.al. |
2501.08913 |
link |
2025-01-15 |
Leveraging Large Language Models as Knowledge-Driven Agents for Reliable Retrosynthesis Planning |
Qinyu Ma et.al. |
2501.08897 |
link |
2025-01-15 |
Connecting SPDE to SGMs |
Junsu Seo et.al. |
2501.08877 |
null |
2025-01-15 |
Exploring Task-Level Optimal Prompts for Visual In-Context Learning |
Yan Zhu et.al. |
2501.08841 |
null |
2025-01-15 |
How Developers Interact with AI: A Taxonomy of Human-AI Collaboration in Software Engineering |
Christoph Treude et.al. |
2501.08774 |
null |
2025-01-15 |
Admitting Ignorance Helps the Video Question Answering Models to Answer |
Haopeng Li et.al. |
2501.08771 |
null |
2025-01-15 |
Enhanced Large Language Models for Effective Screening of Depression and Anxiety |
June M. Liu et.al. |
2501.08769 |
null |
2025-01-15 |
Few-Shot Learner Generalizes Across AI-Generated Image Detection |
Shiyu Wu et.al. |
2501.08763 |
null |
2025-01-15 |
Leveraging LLM Agents for Translating Network Configurations |
Yunze Wei et.al. |
2501.08760 |
null |
2025-01-15 |
The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities |
Irina Bigoulaeva et.al. |
2501.08716 |
link |
2025-01-15 |
Knowledge Graph-based Retrieval-Augmented Generation for Schema Matching |
Chuangtao Ma et.al. |
2501.08686 |
link |
2025-01-15 |
RealVVT: Towards Photorealistic Video Virtual Try-on via Spatio-Temporal Consistency |
Siqi Li et.al. |
2501.08682 |
null |
2025-01-15 |
Augmenting Smart Contract Decompiler Output through Fine-grained Dependency Analysis and LLM-facilitated Semantic Recovery |
Zeqin Liao et.al. |
2501.08670 |
null |
2025-01-15 |
MAGNET: Augmenting Generative Decoders with Representation Learning and Infilling Capabilities |
Savya Khosla et.al. |
2501.08648 |
null |
2025-01-15 |
Reassessing the Role of Chain-of-Thought in Sentiment Analysis: Insights and Limitations |
Kaiyuan Zheng et.al. |
2501.08641 |
null |
2025-01-15 |
SWSC: Shared Weight for Similar Channel in LLM |
Binrui Zeng et.al. |
2501.08631 |
null |
2025-01-15 |
Disjoint Processing Mechanisms of Hierarchical and Linear Grammars in Large Language Models |
Aruna Sankaranarayanan et.al. |
2501.08618 |
link |
2025-01-15 |
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation |
Kaiqu Liang et.al. |
2501.08617 |
null |
2025-01-15 |
Assessing the Alignment of FOL Closeness Metrics with Human Judgement |
Ramya Keerthy Thatikonda et.al. |
2501.08613 |
link |
2025-01-15 |
Monte Carlo Tree Search for Comprehensive Exploration in LLM-Based Automatic Heuristic Design |
Zhi Zheng et.al. |
2501.08603 |
link |
2025-01-15 |
AutoRestTest: A Tool for Automated REST API Testing Using LLMs and MARL |
Tyler Stennett et.al. |
2501.08600 |
null |
2025-01-15 |
LlamaRestTest: Effective REST API Testing with Small Language Models |
Myeongsoo Kim et.al. |
2501.08598 |
null |
2025-01-15 |
Sound Scene Synthesis at the DCASE 2024 Challenge |
Mathieu Lagrange et.al. |
2501.08587 |
null |
2025-01-15 |
LoRS: Efficient Low-Rank Adaptation for Sparse Large Language Model |
Yuxuan Hu et.al. |
2501.08582 |
null |
2025-01-15 |
Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation |
Jiaqi Huang et.al. |
2501.08580 |
link |
2025-01-15 |
Information Entropy Invariance: Enhancing Length Extrapolation in Attention Mechanisms |
Kewei Li et.al. |
2501.08570 |
link |
2025-01-15 |
Adaptive Sampled Softmax with Inverted Multi-Index: Methods, Theory and Applications |
Jin Chen et.al. |
2501.08563 |
link |
2025-01-15 |
LAMS: LLM-Driven Automatic Mode Switching for Assistive Teleoperation |
Yiran Tao et.al. |
2501.08558 |
null |
2025-01-15 |
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation |
Sitong Gong et.al. |
2501.08549 |
link |
2025-01-15 |
Comprehensive Subjective and Objective Evaluation Method for Text-generated Video |
Zelu Qi et.al. |
2501.08545 |
null |
2025-01-15 |
Doc-Guided Sent2Sent++: A Sent2Sent++ Agent with Doc-Guided memory for Document-level Machine Translation |
Jiaxin Guo et.al. |
2501.08523 |
null |
2025-01-14 |
Quantifying the Importance of Data Alignment in Downstream Model Performance |
Krrish Chawla et.al. |
2501.08496 |
null |
2025-01-14 |
Benchmarking Classical, Deep, and Generative Models for Human Activity Recognition |
Md Meem Hossain et.al. |
2501.08471 |
null |
2025-01-14 |
Selective Attention Merging for low resource tasks: A case study of Child ASR |
Natarajan Balaji Shankar et.al. |
2501.08468 |
link |
2025-01-14 |
Time series forecasting for multidimensional telemetry data using GAN and BiLSTM in a Digital Twin |
Joao Carmo de Almeida Neto et.al. |
2501.08464 |
null |
2025-01-14 |
Large Language Models For Text Classification: Case Study And Comprehensive Review |
Arina Kostina et.al. |
2501.08457 |
null |
2025-01-14 |
Tag&Tab: Pretraining Data Detection in Large Language Models Using Keyword-Based Membership Inference Attack |
Sagiv Antebi et.al. |
2501.08454 |
null |
2025-01-14 |
Religious Bias Landscape in Language and Text-to-Image Models: Analysis, Detection, and Debiasing Strategies |
Ajwad Abrar et.al. |
2501.08441 |
link |
2025-01-14 |
SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models |
Anurag Kumar et.al. |
2501.08421 |
null |
2025-01-14 |
Nonlinear Modeling of a PEM Fuel Cell System; a Practical Study with Experimental Validation |
Seyed Mehdi Rakhtala et.al. |
2501.08420 |
null |
2025-01-14 |
Ensemble of Large Language Models for Curated Labeling and Rating of Free-text Data |
Jiaxing Qiu et.al. |
2501.08413 |
link |
2025-01-14 |
OptiChat: Bridging Optimization Models and Practitioners with Large Language Models |
Hao Chen et.al. |
2501.08406 |
link |
2025-01-14 |
Towards Best Practices for Open Datasets for LLM Training |
Stefan Baack et.al. |
2501.08365 |
null |
2025-01-14 |
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise |
Ryan Burgert et.al. |
2501.08331 |
link |
2025-01-14 |
PokerBench: Training Large Language Models to become Professional Poker Players |
Richard Zhuang et.al. |
2501.08328 |
link |
2025-01-14 |
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks |
Miran Heo et.al. |
2501.08326 |
null |
2025-01-14 |
ADAM-1: AI and Bioinformatics for Alzheimer’s Detection and Microbiome-Clinical Data Integrations |
Ziyuan Huang et.al. |
2501.08324 |
null |
2025-01-14 |
Exploring Robustness of Multilingual LLMs on Real-World Noisy Data |
Amirhossein Aliakbarzadeh et.al. |
2501.08322 |
link |
2025-01-14 |
Enhancing Automated Interpretability with Output-Centric Feature Descriptions |
Yoav Gur-Arieh et.al. |
2501.08319 |
link |
2025-01-14 |
MiniMax-01: Scaling Foundation Models with Lightning Attention |
MiniMax et.al. |
2501.08313 |
null |
2025-01-14 |
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them |
Abhilasha Ravichander et.al. |
2501.08292 |
null |
2025-01-14 |
LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding |
Hongyu Li et.al. |
2501.08282 |
link |
2025-01-14 |
Exploring Robustness of LLMs to Sociodemographically-Conditioned Paraphrasing |
Pulkit Arora et.al. |
2501.08276 |
null |
2025-01-14 |
Addressing the sustainable AI trilemma: a case study on LLM agents and RAG |
Hui Wu et.al. |
2501.08262 |
link |
2025-01-14 |
Eliciting In-context Retrieval and Reasoning for Long-context Large Language Models |
Yifu Qiu et.al. |
2501.08248 |
null |
2025-01-14 |
Text-Diffusion Red-Teaming of Large Language Models: Unveiling Harmful Behaviors with Proximity Constraints |
Jonathan Nöther et.al. |
2501.08246 |
null |
2025-01-14 |
CodecFake-Omni: A Large-Scale Codec-based Deepfake Speech Dataset |
Jiawei Du et.al. |
2501.08238 |
null |
2025-01-14 |
Investigating Energy Efficiency and Performance Trade-offs in LLM Inference Across Tasks and DVFS Settings |
Paul Joe Maliakel et.al. |
2501.08219 |
null |
2025-01-14 |
ASTRID – An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems |
Mohita Chowdhury et.al. |
2501.08208 |
null |
2025-01-14 |
ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving |
Zain Ul Abedin et.al. |
2501.08203 |
null |
2025-01-14 |
CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation |
Jinjun Peng et.al. |
2501.08200 |
link |
2025-01-14 |
OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training |
Yijiong Yu et.al. |
2501.08197 |
link |
2025-01-14 |
PRESERVE: Prefetching Model Weights and KV-Cache in Distributed LLM Serving |
Ahmet Caner Yüzügüler et.al. |
2501.08192 |
null |
2025-01-14 |
A Critical Synthesis of Uncertainty Quantification and Foundation Models in Monocular Depth Estimation |
Steven Landgraf et.al. |
2501.08188 |
null |
2025-01-15 |
A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following |
Yin Fang et.al. |
2501.08187 |
link |
2025-01-14 |
Potential and Perils of Large Language Models as Judges of Unstructured Textual Data |
Rewina Bedemariam et.al. |
2501.08167 |
null |
2025-01-14 |
I Can Find You in Seconds! Leveraging Large Language Models for Code Authorship Attribution |
Soohyeon Choi et.al. |
2501.08165 |
null |
2025-01-14 |
Multiple-Input Variational Auto-Encoder for Anomaly Detection in Heterogeneous Data |
Phai Vu Dinh et.al. |
2501.08149 |
null |
2025-01-14 |
Refusal Behavior in Large Language Models: A Nonlinear Perspective |
Fabian Hildebrandt et.al. |
2501.08145 |
link |
2025-01-14 |
Bootstrapping Corner Cases: High-Resolution Inpainting for Safety Critical Detect and Avoid for Automated Flying |
Jonathan Lyhs et.al. |
2501.08142 |
null |
2025-01-14 |
Revisiting Birds Eye View Perception Models with Frozen Foundation Models: DINOv2 and Metric3Dv2 |
Seamie Hayes et.al. |
2501.08118 |
null |
2025-01-15 |
Consistency of Responses and Continuations Generated by Large Language Models on Social Media |
Wenlu Fan et.al. |
2501.08102 |
null |
2025-01-14 |
Hierarchical Autoscaling for Large Language Model Serving with Chiron |
Archit Patke et.al. |
2501.08090 |
null |
2025-01-14 |
Benchmarking Vision Foundation Models for Input Monitoring in Autonomous Driving |
Nert Keser et.al. |
2501.08083 |
null |
2025-01-14 |
CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning |
Guoliang He et.al. |
2501.08071 |
link |
2025-01-14 |
A Roadmap to Guide the Integration of LLMs in Hierarchical Planning |
Israel Puerta-Merino et.al. |
2501.08068 |
null |
2025-01-14 |
Exploring Narrative Clustering in Large Language Models: A Layerwise Analysis of BERT |
Awritrojit Banerjee et.al. |
2501.08053 |
null |
2025-01-14 |
TriAdaptLoRA: Brain-Inspired Triangular Adaptive Low-Rank Adaptation for Parameter-Efficient Fine-Tuning |
Yao Liang et.al. |
2501.08008 |
null |
2025-01-14 |
LLM-Enhanced Holonic Architecture for Ad-Hoc Scalable SoS |
Muhammad Ashfaq et.al. |
2501.07992 |
null |
2025-01-14 |
Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness |
Jiaxing Zhao et.al. |
2501.07978 |
link |
2025-01-14 |
Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models |
Yifang Xu et.al. |
2501.07972 |
null |
2025-01-14 |
Self-Instruct Few-Shot Jailbreaking: Decompose the Attack into Pattern and Behavior Learning |
Jiaqi Hua et.al. |
2501.07959 |
link |
2025-01-14 |
AI Guide Dog: Egocentric Path Prediction on Smartphone |
Aishwarya Jadhav et.al. |
2501.07957 |
null |
2025-01-14 |
Advice for Diabetes Self-Management by ChatGPT Models: Challenges and Recommendations |
Waqar Hussain et.al. |
2501.07931 |
null |
2025-01-14 |
Gandalf the Red: Adaptive Security for LLMs |
Niklas Pfister et.al. |
2501.07927 |
link |
2025-01-14 |
VENOM: Text-driven Unrestricted Adversarial Example Generation with Diffusion Models |
Hui Kuurila-Zhang et.al. |
2501.07922 |
link |
2025-01-14 |
Large Language Model Interface for Home Energy Management Systems |
François Michelon et.al. |
2501.07919 |
null |
2025-01-14 |
Bridge-SR: Schrödinger Bridge for Efficient SR |
Chang Li et.al. |
2501.07897 |
null |
2025-01-14 |
Leveraging Metamemory Mechanisms for Enhanced Data-Free Code Generation in LLMs |
Shuai Wang et.al. |
2501.07892 |
null |
2025-01-14 |
ReARTeR: Retrieval-Augmented Reasoning with Trustworthy Process Rewarding |
Zhongxiang Sun et.al. |
2501.07861 |
null |
2025-01-14 |
Optimizing Language Models for Grammatical Acceptability: A Comparative Study of Fine-Tuning Techniques |
Shobhit Ratan et.al. |
2501.07853 |
null |
2025-01-14 |
Unveiling Provider Bias in Large Language Models for Code Generation |
Xiaoyu Zhang et.al. |
2501.07849 |
null |
2025-01-14 |
Reasoning with Graphs: Structuring Implicit Knowledge to Enhance LLMs Reasoning |
Haoyu Han et.al. |
2501.07845 |
null |
2025-01-14 |
A Driver Advisory System Based on Large Language Model for High-speed Train |
Y. C. Luo et.al. |
2501.07837 |
null |
2025-01-14 |
Flow: A Modular Approach to Automated Agentic Workflow Generation |
Boye Niu et.al. |
2501.07834 |
link |
2025-01-14 |
Real-time Verification and Refinement of Language Model Text Generation |
Joonho Ko et.al. |
2501.07824 |
null |
2025-01-14 |
3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding |
Haomiao Xiong et.al. |
2501.07819 |
link |
2025-01-14 |
A Multi-Encoder Frozen-Decoder Approach for Fine-Tuning Large Language Models |
Kaustubh D. Dhole et.al. |
2501.07818 |
null |
2025-01-14 |
Agent-Centric Projection of Prompting Techniques and Implications for Synthetic Training Data for Large Language Models |
Dhruv Dhamani et.al. |
2501.07815 |
null |
2025-01-14 |
Talk to Right Specialists: Routing and Planning in Multi-agent System for Question Answering |
Feijie Wu et.al. |
2501.07813 |
null |
2025-01-14 |
CodeCoR: An LLM-Based Self-Reflective Multi-Agent Framework for Code Generation |
Ruwei Pan et.al. |
2501.07811 |
null |
2025-01-14 |
Visual Language Models as Operator Agents in the Space Domain |
Alejandro Carrasco et.al. |
2501.07802 |
null |
2025-01-14 |
Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding |
Zhaokai Wang et.al. |
2501.07783 |
link |
2025-01-14 |
Symmetry-Aware Generative Modeling through Learned Canonicalization |
Kusha Sareen et.al. |
2501.07773 |
null |
2025-01-14 |
Large Language Models for Knowledge Graph Embedding Techniques, Methods, and Challenges: A Survey |
Bingchen Liu et.al. |
2501.07766 |
null |
2025-01-14 |
On the Statistical Capacity of Deep Generative Models |
Edric Tam et.al. |
2501.07763 |
link |
2025-01-13 |
Advancing Student Writing Through Automated Syntax Feedback |
Kamyar Zeinalipour et.al. |
2501.07740 |
null |
2025-01-13 |
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens |
Dongwon Kim et.al. |
2501.07730 |
null |
2025-01-13 |
LLMic: Romanian Foundation Language Model |
Vlad-Andrei Bădoiu et.al. |
2501.07721 |
null |
2025-01-13 |
CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory |
Haokun Zhao et.al. |
2501.07674 |
null |
2025-01-13 |
Enhancing Talent Employment Insights Through Feature Extraction with LLM Finetuning |
Karishma Thakrar et.al. |
2501.07663 |
null |
2025-01-13 |
Large Language Models for Interpretable Mental Health Diagnosis |
Brian Hyeongseok Kim et.al. |
2501.07653 |
null |
2025-01-13 |
BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations |
Weixi Feng et.al. |
2501.07647 |
null |
2025-01-13 |
GPT as a Monte Carlo Language Tree: A Probabilistic Perspective |
Kun-Peng Ning et.al. |
2501.07641 |
null |
2025-01-13 |
SafePowerGraph-LLM: Novel Power Grid Graph Embedding and Optimization with Large Language Models |
Fabien Bernier et.al. |
2501.07639 |
null |
2025-01-13 |
Training-Free Motion-Guided Video Generation with Enhanced Temporal Consistency Using Motion Consistency Loss |
Xinyu Zhang et.al. |
2501.07563 |
null |
2025-01-13 |
Imagine while Reasoning in Space: Multimodal Visualization-of-Thought |
Chengzu Li et.al. |
2501.07542 |
null |
2025-01-13 |
ML Mule: Mobile-Driven Context-Aware Collaborative Learning |
Haoxiang Yu et.al. |
2501.07536 |
null |
2025-01-13 |
Investigating Large Language Models in Inferring Personality Traits from User Conversations |
Jianfeng Zhu et.al. |
2501.07532 |
null |
2025-01-13 |
RadAlign: Advancing Radiology Report Generation with Vision-Language Concept Alignment |
Difei Gu et.al. |
2501.07525 |
link |
2025-01-13 |
Parallel Key-Value Cache Fusion for Position Invariant RAG |
Philhoon Oh et.al. |
2501.07523 |
null |
2025-01-13 |
Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards |
Yangsibo Huang et.al. |
2501.07493 |
null |
2025-01-13 |
TiEBe: A Benchmark for Assessing the Current Knowledge of Large Language Models |
Thales Sales Almeida et.al. |
2501.07482 |
link |
2025-01-13 |
A Survey of Embodied AI in Healthcare: Techniques, Applications, and Opportunities |
Yihao Liu et.al. |
2501.07468 |
null |
2025-01-13 |
Understanding and Benchmarking Artificial Intelligence: OpenAI’s o3 Is Not AGI |
Rolf Pfister et.al. |
2501.07458 |
null |
2025-01-13 |
Enhancing LLM’s Ability to Generate More Repository-Aware Unit Tests Through Precise Contextual Information Injection |
Xin Yin et.al. |
2501.07425 |
null |
2025-01-13 |
Initial Findings on Sensor based Open Vocabulary Activity Recognition via Text Embedding Inversion |
Lala Shakti Swarup Ray et.al. |
2501.07408 |
null |
2025-01-13 |
OCORD: Open-Campus Object Removal Dataset |
Shuo Zhang et.al. |
2501.07397 |
null |
2025-01-13 |
Simulating the Hubbard Model with Equivariant Normalizing Flows |
Dominic Schuh et.al. |
2501.07371 |
null |
2025-01-13 |
Emergent effects of scaling on the functional hierarchies within large language models |
Paul C. Bogdan et.al. |
2501.07359 |
null |
2025-01-13 |
Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring |
Buse Sibel Korkmaz et.al. |
2501.07324 |
link |
2025-01-13 |
FinerWeb-10BT: Refining Web Data with LLM-Based Line-Level Filtering |
Erik Henriksson et.al. |
2501.07314 |
link |
2025-01-13 |
The Lessons of Developing Process Reward Models in Mathematical Reasoning |
Zhenru Zhang et.al. |
2501.07301 |
null |
2025-01-13 |
GestLLM: Advanced Hand Gesture Interpretation via Large Language Models for Human-Robot Interaction |
Oleg Kobzarev et.al. |
2501.07295 |
null |
2025-01-13 |
LLM-Net: Democratizing LLMs-as-a-Service through Blockchain-based Expert Networks |
Zan-Kai Chong et.al. |
2501.07288 |
null |
2025-01-13 |
Lifelong Learning of Large Language Model based Agents: A Roadmap |
Junhao Zheng et.al. |
2501.07278 |
link |
2025-01-13 |
Bridging Smart Meter Gaps: A Benchmark of Statistical, Machine Learning and Time Series Foundation Models for Data Imputation |
Amir Sartipi et.al. |
2501.07276 |
null |
2025-01-13 |
Transforming Role Classification in Scientific Teams Using LLMs and Advanced Predictive Analytics |
Wonduk Seo et.al. |
2501.07267 |
null |
2025-01-13 |
Skip Mamba Diffusion for Monocular 3D Semantic Scene Completion |
Li Liang et.al. |
2501.07260 |
link |
2025-01-13 |
EdgeTAM: On-Device Track Anything Model |
Chong Zhou et.al. |
2501.07256 |
link |
2025-01-13 |
Large Language Models: New Opportunities for Access to Science |
Jutta Schnabel et.al. |
2501.07250 |
null |
2025-01-13 |
Breaking Memory Limits: Gradient Wavelet Transform Enhances LLMs Training |
Ziqing Wen et.al. |
2501.07237 |
link |
2025-01-13 |
Touched by ChatGPT: Using an LLM to Drive Affective Tactile Interaction |
Qiaoqiao Ren et.al. |
2501.07224 |
link |
2025-01-13 |
Pre-Trained Large Language Model Based Remaining Useful Life Transfer Prediction of Bearing |
Laifa Tao et.al. |
2501.07191 |
null |
2025-01-13 |
Unveiling Code Clone Patterns in Open Source VR Software: An Empirical Study |
Huashan Chen et.al. |
2501.07165 |
null |
2025-01-13 |
AlphaNet: Scaling Up Local Frame-based Atomistic Foundation Model |
Bangchen Yin et.al. |
2501.07155 |
link |
2025-01-13 |
LLM360 K2: Scaling Up 360-Open-Source Large Language Models |
Zhengzhong Liu et.al. |
2501.07124 |
null |
2025-01-13 |
How GPT learns layer by layer |
Jason Du et.al. |
2501.07108 |
link |
2025-01-13 |
ADKGD: Anomaly Detection in Knowledge Graphs with Dual-Channel Training |
Jiayang Wu et.al. |
2501.07078 |
link |
2025-01-13 |
D3MES: Diffusion Transformer with multihead equivariant self-attention for 3D molecule generation |
Zhejun Zhang et.al. |
2501.07077 |
link |
2025-01-13 |
Value Compass Leaderboard: A Platform for Fundamental and Validated Evaluation of LLMs Values |
Jing Yao et.al. |
2501.07071 |
null |
2025-01-13 |
Enhancing Image Generation Fidelity via Progressive Prompts |
Zhen Xiong et.al. |
2501.07070 |
link |
2025-01-13 |
Logic Meets Magic: LLMs Cracking Smart Contract Vulnerabilities |
ZeKe Xiao et.al. |
2501.07058 |
null |
2025-01-13 |
SFC-GAN: A Generative Adversarial Network for Brain Functional and Structural Connectome Translation |
Yee-Fan Tan et.al. |
2501.07055 |
null |
2025-01-13 |
PoAct: Policy and Action Dual-Control Agent for Generalized Applications |
Guozhi Yuan et.al. |
2501.07054 |
null |
2025-01-13 |
ROSAnnotator: A Web Application for ROSBag Data Analysis in Human-Robot Interaction |
Yan Zhang et.al. |
2501.07051 |
link |
2025-01-13 |
Unveiling the Potential of Text in High-Dimensional Time Series Forecasting |
Xin Zhou et.al. |
2501.07048 |
link |
2025-01-13 |
Explore the Use of Time Series Foundation Model for Car-Following Behavior Analysis |
Luwei Zeng et.al. |
2501.07034 |
null |
2025-01-13 |
A Proposed Large Language Model-Based Smart Search for Archive System |
Ha Dung Nguyen et.al. |
2501.07024 |
null |
2025-01-13 |
Likelihood Training of Cascaded Diffusion Models via Hierarchical Volume-preserving Maps |
Henry Li et.al. |
2501.06999 |
link |
2025-01-13 |
LEO: Boosting Mixture of Vision Encoders for Multimodal Large Language Models |
Mozhgan Nasr Azadani et.al. |
2501.06986 |
link |
2025-01-13 |
Combining LLM decision and RL action selection to improve RL policy for adaptive interventions |
Karine Karine et.al. |
2501.06980 |
null |
2025-01-12 |
How is Google using AI for internal code migrations? |
Stoyan Nikolov et.al. |
2501.06972 |
null |
2025-01-12 |
Enhancing Patient-Centric Communication: Leveraging LLMs to Simulate Patient Perspectives |
Xinyao Ma et.al. |
2501.06964 |
null |
2025-01-12 |
Comparison of Autoencoders for tokenization of ASL datasets |
Vouk Praun-Petrovic et.al. |
2501.06942 |
null |
2025-01-12 |
Super-Resolution of 3D Micro-CT Images Using Generative Adversarial Networks: Enhancing Resolution and Segmentation Accuracy |
Evgeny Ugolkov et.al. |
2501.06939 |
link |
2025-01-12 |
Harnessing Large Language Models for Disaster Management: A Survey |
Zhenyu Lei et.al. |
2501.06932 |
null |
2025-01-12 |
Monolithic 3D FPGAs Utilizing Back-End-of-Line Configuration Memories |
Faaiq Waqar et.al. |
2501.06921 |
null |
2025-01-12 |
Risk-Averse Finetuning of Large Language Models |
Sapana Chaudhary et.al. |
2501.06911 |
link |
2025-01-12 |
Deep Learning and Foundation Models for Weather Prediction: A Survey |
Jimeng Shi et.al. |
2501.06907 |
link |
2025-01-12 |
A Foundational Generative Model for Breast Ultrasound Image Analysis |
Haojun Yu et.al. |
2501.06869 |
null |
2025-01-12 |
Transfer Learning of Tabular Data by Finetuning Large Language Models |
Shourav B. Rabbani et.al. |
2501.06863 |
null |
2025-01-12 |
A Comprehensive Evaluation of Large Language Models on Mental Illnesses in Arabic Context |
Noureldin Zahran et.al. |
2501.06859 |
null |
2025-01-12 |
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training |
Tianjin Huang et.al. |
2501.06842 |
link |
2025-01-12 |
An efficient approach to represent enterprise web application structure using Large Language Model in the service of Intelligent Quality Engineering |
Zaber Al Hassan Ayon et.al. |
2501.06837 |
null |
2025-01-12 |
X-LeBench: A Benchmark for Extremely Long Egocentric Video Understanding |
Wenqi Zhou et.al. |
2501.06835 |
null |
2025-01-12 |
LLMs Model Non-WEIRD Populations: Experiments with Synthetic Cultural Agents |
Augusto Gonzalez-Bonorino et.al. |
2501.06834 |
link |
2025-01-12 |
GeoPix: Multi-Modal Large Language Model for Pixel-level Image Understanding in Remote Sensing |
Ruizhe Ou et.al. |
2501.06828 |
null |
2025-01-12 |
Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification |
Shijing Chen et.al. |
2501.06827 |
null |
2025-01-12 |
Event Argument Extraction with Enriched Prompts |
Chen Liang et.al. |
2501.06825 |
link |
2025-01-12 |
A Study on Educational Data Analysis and Personalized Feedback Report Generation Based on Tags and ChatGPT |
Yizhou Zhou et.al. |
2501.06819 |
null |
2025-01-12 |
RSRefSeg: Referring Remote Sensing Image Segmentation with Foundation Models |
Keyan Chen et.al. |
2501.06809 |
link |
2025-01-12 |
Semantic-CD: Remote Sensing Image Semantic Change Detection towards Open-vocabulary Setting |
Yongshuo Zhu et.al. |
2501.06808 |
null |
2025-01-12 |
MPCache: MPC-Friendly KV Cache Eviction for Efficient Private Large Language Model Inference |
Wenxuan Zeng et.al. |
2501.06807 |
null |
2025-01-12 |
Bridging the Fairness Gap: Enhancing Pre-trained Models with LLM-Generated Sentences |
Liu Yu et.al. |
2501.06795 |
null |
2025-01-12 |
3DCoMPaT200: Language-Grounded Compositional Understanding of Parts and Materials of 3D Shapes |
Mahmoud Ahmed et.al. |
2501.06785 |
link |
2025-01-12 |
Cost-Effective Robotic Handwriting System with AI Integration |
Tianyi Huang et.al. |
2501.06783 |
null |
2025-01-12 |
Eliza: A Web3 friendly AI Agent Operating System |
Shaw Walters et.al. |
2501.06781 |
link |
2025-01-12 |
VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning |
Ji Soo Lee et.al. |
2501.06761 |
link |
2025-01-12 |
Hierarchical Divide-and-Conquer for Fine-Grained Alignment in LLM-Based Medical Evaluation |
Shunfan Zheng et.al. |
2501.06741 |
null |
2025-01-12 |
ZOQO: Zero-Order Quantized Optimization |
Noga Bar et.al. |
2501.06736 |
null |
2025-01-12 |
Better Prompt Compression Without Multi-Layer Perceptrons |
Edouardo Honig et.al. |
2501.06730 |
null |
2025-01-12 |
Measuring the Robustness of Reference-Free Dialogue Evaluation Systems |
Justin Vasselli et.al. |
2501.06728 |
link |
2025-01-12 |
Integrated Sensing and Edge AI: Realizing Intelligent Perception in 6G |
Zhiyan Liu et.al. |
2501.06726 |
null |
2025-01-12 |
DRDT3: Diffusion-Refined Decision Test-Time Training Model |
Xingshuai Huang et.al. |
2501.06718 |
null |
2025-01-12 |
ZNO-Eval: Benchmarking reasoning capabilities of large language models in Ukrainian |
Mykyta Syromiatnikov et.al. |
2501.06715 |
link |
2025-01-12 |
Mell: Memory-Efficient Large Language Model Serving via Multi-GPU KV Cache Management |
Liu Qianli et.al. |
2501.06709 |
null |
2025-01-12 |
Evaluating Sample Utility for Data Selection by Mimicking Model Weights |
Tzu-Heng Huang et.al. |
2501.06708 |
null |
2025-01-12 |
AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds |
Yinfang Chen et.al. |
2501.06706 |
null |
2025-01-12 |
Fine-tuning ChatGPT for Automatic Scoring of Written Scientific Explanations in Chinese |
Jie Yang et.al. |
2501.06704 |
null |
2025-01-12 |
Large Language Models, Knowledge Graphs and Search Engines: A Crossroads for Answering Users’ Questions |
Aidan Hogan et.al. |
2501.06699 |
null |
2025-01-12 |
DVM: Towards Controllable LLM Agents in Social Deduction Games |
Zheng Zhang et.al. |
2501.06695 |
null |
2025-01-12 |
TAPO: Task-Referenced Adaptation for Prompt Optimization |
Wenxin Luo et.al. |
2501.06689 |
link |
2025-01-12 |
Generative AI in Education: From Foundational Insights to the Socratic Playground for Learning |
Xiangen Hu et.al. |
2501.06682 |
null |
2025-01-12 |
Application of Vision-Language Model to Pedestrians Behavior and Scene Understanding in Autonomous Driving |
Haoxiang Gao et.al. |
2501.06680 |
null |
2025-01-11 |
Challenging reaction prediction models to generalize to novel chemistry |
John Bradshaw et.al. |
2501.06669 |
link |
2025-01-11 |
Comparing Few-Shot Prompting of GPT-4 LLMs with BERT Classifiers for Open-Response Assessment in Tutor Equity Training |
Sanjit Kakarla et.al. |
2501.06658 |
link |
2025-01-11 |
FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings |
Tong Liu et.al. |
2501.06645 |
null |
2025-01-11 |
Scaling Down Semantic Leakage: Investigating Associative Bias in Smaller Language Models |
Veronika Smilga et.al. |
2501.06638 |
link |
2025-01-11 |
Quantifying Relational Exploration in Cultural Heritage Knowledge Graphs with LLMs: A Neuro-Symbolic Approach |
Mohammed Maree et.al. |
2501.06628 |
null |
2025-01-11 |
Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks |
Amr Almorsi et.al. |
2501.06625 |
null |
2025-01-11 |
Denoising Diffusion Probabilistic Model for Radio Map Estimation in Generative Wireless Networks |
Xuanhao Luo et.al. |
2501.06604 |
null |
2025-01-11 |
ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation |
Xuanle Zhao et.al. |
2501.06598 |
link |
2025-01-11 |
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning |
Xiangru Tang et.al. |
2501.06590 |
link |
2025-01-11 |
Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping |
Muru Zhang et.al. |
2501.06589 |
link |
2025-01-10 |
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs |
Omkar Thawakar et.al. |
2501.06186 |
link |
2025-01-10 |
PEACE: Empowering Geologic Map Holistic Understanding with MLLMs |
Yangyu Huang et.al. |
2501.06184 |
null |
2025-01-10 |
VideoAuteur: Towards Long Narrative Video Generation |
Junfei Xiao et.al. |
2501.06173 |
null |
2025-01-10 |
GenMol: A Drug Discovery Generalist with Discrete Diffusion |
Seul Lee et.al. |
2501.06158 |
null |
2025-01-10 |
Multilingual Performance of a Multimodal Artificial Intelligence System on Multisubject Physics Concept Inventories |
Gerd Kortemeyer et.al. |
2501.06143 |
null |
2025-01-10 |
Supervision policies can shape long-term risk management in general-purpose AI models |
Manuel Cebrian et.al. |
2501.06137 |
link |
2025-01-10 |
Contextual ASR Error Handling with LLMs Augmentation for Goal-Oriented Conversational AI |
Yuya Asano et.al. |
2501.06129 |
null |
2025-01-10 |
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding |
Fabian David Schmidt et.al. |
2501.06117 |
link |
2025-01-10 |
From Conversation to Automation: Leveraging Large Language Models to Analyze Strategies in Problem Solving Therapy |
Elham Aghakhani et.al. |
2501.06101 |
null |
2025-01-10 |
Photokinetics of Photothermal Reactions |
Mounir Maafi et.al. |
2501.06057 |
null |
2025-01-10 |
AI-powered virtual tissues from spatial proteomics for clinical diagnostics and biomedical discovery |
Johann Wenckstern et.al. |
2501.06039 |
link |
2025-01-10 |
Addressing speaker gender bias in large scale speech translation systems |
Shubham Bansal et.al. |
2501.05989 |
null |
2025-01-10 |
Comparing Self-Supervised Learning Models Pre-Trained on Human Speech and Animal Vocalizations for Bioacoustics Processing |
Eklavya Sarkar et.al. |
2501.05987 |
link |
2025-01-10 |
Exploring LLMs for Automated Pre-Testing of Cross-Cultural Surveys |
Divya Mani Adhikari et.al. |
2501.05985 |
null |
2025-01-10 |
Hermit Kingdom Through the Lens of Multiple Perspectives: A Case Study of LLM Hallucination on North Korea |
Eunjung Cho et.al. |
2501.05981 |
null |
2025-01-10 |
Model Inversion in Split Learning for Personalized LLMs: New Insights from Information Bottleneck Theory |
Yunmeng Shu et.al. |
2501.05965 |
null |
2025-01-10 |
Effective faking of verbal deception detection with target-aligned adversarial attacks |
Bennett Kleinberg et.al. |
2501.05962 |
null |
2025-01-10 |
Reusable specimen-level inference in computational pathology |
Jakub R. Kaczmarzyk et.al. |
2501.05945 |
link |
2025-01-10 |
DiffuSETS: 12-lead ECG Generation Conditioned on Clinical Text Reports and Patient-Specific Information |
Yongfan Lai et.al. |
2501.05932 |
link |
2025-01-10 |
LLMs Reproduce Stereotypes of Sexual and Gender Minorities |
Ruby Ostrow et.al. |
2501.05926 |
null |
2025-01-10 |
Navigating Tomorrow: Reliably Assessing Large Language Models Performance on Future Event Prediction |
Petraq Nako et.al. |
2501.05925 |
null |
2025-01-10 |
Valley2: Exploring Multimodal Models with Scalable Vision-Language Design |
Ziheng Wu et.al. |
2501.05901 |
link |
2025-01-10 |
Prompt engineering and its implications on the energy consumption of Large Language Models |
Riccardo Rubei et.al. |
2501.05899 |
link |
2025-01-10 |
Affordably Fine-tuned LLMs Provide Better Answers to Course-specific MCQs |
Bianca Raimondi et.al. |
2501.05891 |
link |
2025-01-10 |
Text-to-Edit: Controllable End-to-End Video Ad Creation via Multimodal LLMs |
Dabing Cheng et.al. |
2501.05884 |
null |
2025-01-10 |
VideoRAG: Retrieval-Augmented Generation over Video Corpus |
Soyeong Jeong et.al. |
2501.05874 |
link |
2025-01-10 |
ConSim: Measuring Concept-Based Explanations’ Effectiveness with Automated Simulatability |
Antonin Poché et.al. |
2501.05855 |
link |
2025-01-10 |
Understanding Impact of Human Feedback via Influence Functions |
Taywon Min et.al. |
2501.05790 |
link |
2025-01-10 |
Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models |
You Li et.al. |
2501.05767 |
null |
2025-01-10 |
Controlling Large Language Models Through Concept Activation Vectors |
Hanyu Zhang et.al. |
2501.05764 |
null |
2025-01-10 |
StarGen: A Spatiotemporal Autoregression Framework with Video Diffusion Model for Scalable and Controllable Scene Generation |
Shangjin Zhai et.al. |
2501.05763 |
null |
2025-01-10 |
CognoSpeak: an automatic, remote assessment of early cognitive decline in real-world conversational speech |
Madhurananda Pahar et.al. |
2501.05755 |
null |
2025-01-10 |
Semantic Exploration with Adaptive Gating for Efficient Problem Solving with Language Models |
Sungjae Lee et.al. |
2501.05752 |
null |
2025-01-10 |
TB-Bench: Training and Testing Multi-Modal AI for Understanding Spatio-Temporal Traffic Behaviors from Dashcam Images/Videos |
Korawat Charoenpitaks et.al. |
2501.05733 |
link |
2025-01-10 |
Enabling Scalable Oversight via Self-Evolving Critic |
Zhengyang Tang et.al. |
2501.05727 |
null |
2025-01-10 |
I Can’t Share Code, but I need Translation – An Empirical Study on Code Translation through Federated LLM |
Jahnavi Kumar et.al. |
2501.05724 |
null |
2025-01-10 |
How to Enable Effective Cooperation Between Humans and NLP Models: A Survey of Principles, Formalizations, and Beyond |
Chen Huang et.al. |
2501.05714 |
null |
2025-01-10 |
Multi-Step Reasoning in Korean and the Emergent Mirage |
Guijin Son et.al. |
2501.05712 |
null |
2025-01-10 |
EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model |
Yi He et.al. |
2501.05710 |
null |
2025-01-10 |
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains |
Vighnesh Subramaniam et.al. |
2501.05707 |
null |
2025-01-10 |
Debugging Without Error Messages: How LLM Prompting Strategy Affects Programming Error Explanation Effectiveness |
Audrey Salmon et.al. |
2501.05706 |
null |
2025-01-10 |
Facilitate Collaboration between Large Language Model and Task-specific Model for Time Series Anomaly Detection |
Feiyi Chen et.al. |
2501.05675 |
null |
2025-01-10 |
Network Diffuser for Placing-Scheduling Service Function Chains with Inverse Demonstration |
Zuyuan Zhang et.al. |
2501.05673 |
null |
2025-01-10 |
Cascaded Self-Evaluation Augmented Training for Efficient Multimodal Large Language Models |
Zheqi Lv et.al. |
2501.05662 |
null |
2025-01-10 |
Collaboration of Large Language Models and Small Recommendation Models for Device-Cloud Recommendation |
Zheqi Lv et.al. |
2501.05647 |
null |
2025-01-10 |
Iconicity in Large Language Models |
Anna Marklová et.al. |
2501.05643 |
null |
2025-01-10 |
HFMF: Hierarchical Fusion Meets Multi-Stream Models for Deepfake Detection |
Anant Mehta et.al. |
2501.05631 |
link |
2025-01-10 |
The Impact of Model Scaling on Seen and Unseen Language Performance |
Rhitabrat Pokharel et.al. |
2501.05629 |
null |
2025-01-09 |
Harnessing Large Language Model for Virtual Reality Exploration Testing: A Case Study |
Zhenyu Qi et.al. |
2501.05625 |
null |
2025-01-09 |
Exploring Large Language Models for Translating Romanian Computational Problems into English |
Adrian Marius Dumitran et.al. |
2501.05601 |
null |
2025-01-09 |
Physics-Driven Learning for Inverse Problems in Quantum Chromodynamics |
Gert Aarts et.al. |
2501.05580 |
null |
2025-01-09 |
Exploring Large Language Models (LLMs) through interactive Python activities |
Eugenio Tufino et.al. |
2501.05577 |
link |
2025-01-09 |
LLMQuoter: Enhancing RAG Capabilities Through Efficient Quote Extraction From Large Contexts |
Yuri Facanha Bezerra et.al. |
2501.05554 |
link |
2025-01-09 |
The dynamics of meaning through time: Assessment of Large Language Models |
Mohamed Taher Alrefaie et.al. |
2501.05552 |
null |
2025-01-09 |
Infecting Generative AI With Viruses |
David Noever et.al. |
2501.05542 |
null |
2025-01-09 |
NSChat: A Chatbot System To Rule Them All |
Zenon Lamprou et.al. |
2501.05541 |
null |
2025-01-09 |
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding |
Xingyu Fu et.al. |
2501.05452 |
null |
2025-01-09 |
Relative Pose Estimation through Affine Corrections of Monocular Depth Priors |
Yifan Yu et.al. |
2501.05446 |
link |
2025-01-09 |
Consistent Flow Distillation for Text-to-3D Generation |
Runjie Yan et.al. |
2501.05445 |
null |
2025-01-09 |
Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark |
Yunzhuo Hao et.al. |
2501.05444 |
link |
2025-01-09 |
A survey of textual cyber abuse detection using cutting-edge language models and large language models |
Jose A. Diaz-Garcia et.al. |
2501.05443 |
null |
2025-01-09 |
Zero-1-to-G: Taming Pretrained 2D Diffusion Model for Direct 3D Generation |
Xuyi Meng et.al. |
2501.05427 |
null |
2025-01-09 |
Using LLMs to Infer Non-Binary COVID-19 Sentiments of Chinese Micro-bloggers |
Jerry Chongyi Hu et.al. |
2501.05423 |
null |
2025-01-09 |
Seeing Sound: Assembling Sounds from Visuals for Audio-to-Image Generation |
Darius Petermann et.al. |
2501.05413 |
null |
2025-01-10 |
Atlas: A Novel Pathology Foundation Model by Mayo Clinic, Charité, and Aignostics |
Maximilian Alber et.al. |
2501.05409 |
null |
2025-01-09 |
TimeDP: Learning to Generate Multi-Domain Time Series with Domain Prompts |
Yu-Hao Huang et.al. |
2501.05403 |
link |
2025-01-09 |
Mechanistic understanding and validation of large AI models with SemanticLens |
Maximilian Dreyer et.al. |
2501.05398 |
link |
2025-01-09 |
FairCode: Evaluating Social Bias of LLMs in Code Generation |
Yongkang Du et.al. |
2501.05396 |
link |
2025-01-09 |
Large Physics Models: Towards a collaborative approach with Large Language Models and Foundation Models |
Kristian G. Barman et.al. |
2501.05382 |
null |
2025-01-09 |
Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance |
Dimitrios Gerogiannis et.al. |
2501.05379 |
null |
2025-01-09 |
Accelerated Diffusion Models via Speculative Sampling |
Valentin De Bortoli et.al. |
2501.05370 |
null |
2025-01-09 |
Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction |
Hantao Lou et.al. |
2501.05336 |
link |
2025-01-09 |
“What’s Happening”- A Human-centered Multimodal Interpreter Explaining the Actions of Autonomous Vehicles |
Xuewen Luo et.al. |
2501.05322 |
null |
2025-01-09 |
Comparison Study: Glacier Calving Front Delineation in Synthetic Aperture Radar Images With Deep Learning |
Nora Gourmelon et.al. |
2501.05281 |
link |
2025-01-09 |
CellViT++: Energy-Efficient and Adaptive Cell Segmentation and Classification Using Foundation Models |
Fabian Hörst et.al. |
2501.05269 |
link |
2025-01-09 |
Patch-GAN Transfer Learning with Reconstructive Models for Cloud Removal |
Wanli Ma et.al. |
2501.05265 |
null |
2025-01-09 |
CallNavi: A Study and Challenge on Function Calling Routing and Invocation in Large Language Models |
Yewei Song et.al. |
2501.05255 |
null |
2025-01-09 |
From Scientific Texts to Verifiable Code: Automating the Process with Transformers |
Changjie Wang et.al. |
2501.05252 |
null |
2025-01-09 |
RAG-WM: An Efficient Black-Box Watermarking Approach for Retrieval-Augmented Generation of Large Language Models |
Peizhuo Lv et.al. |
2501.05249 |
null |
2025-01-09 |
Deriving Coding-Specific Sub-Models from LLMs using Resource-Efficient Pruning |
Laura Puccioni et.al. |
2501.05248 |
null |
2025-01-09 |
Online Prompt and Solver Selection for Program Synthesis |
Yixuan Li et.al. |
2501.05247 |
null |
2025-01-09 |
Optimizing Estonian TV Subtitles with Semi-supervised Learning and LLMs |
Artem Fedorchenko et.al. |
2501.05234 |
null |
2025-01-09 |
Harnessing Large Language and Vision-Language Models for Robust Out-of-Distribution Detection |
Pei-Kang Lee et.al. |
2501.05228 |
null |
2025-01-09 |
Light Transport-aware Diffusion Posterior Sampling for Single-View Reconstruction of 3D Volumes |
Ludwic Leonard et.al. |
2501.05226 |
link |
2025-01-09 |
Leveraging Large Language Models for Zero-shot Lay Summarisation in Biomedicine and Beyond |
Tomas Goldsack et.al. |
2501.05224 |
null |
2025-01-09 |
A Novel Approach to Scalable and Automatic Topic-Controlled Question Generation in Education |
Ziqing Li et.al. |
2501.05220 |
null |
2025-01-09 |
Compression with Global Guidance: Towards Training-free High-Resolution MLLMs Acceleration |
Xuyang Liu et.al. |
2501.05179 |
link |
2025-01-09 |
Emergence of human-like polarization among large language model agents |
Jinghua Piao et.al. |
2501.05171 |
null |
2025-01-09 |
Bringing Order Amidst Chaos: On the Role of Artificial Intelligence in Secure Software Engineering |
Matteo Esposito et.al. |
2501.05165 |
null |
2025-01-09 |
Biomedical Relation Extraction via Adaptive Document-Relation Cross-Mapping and Concept Unique Identifier |
Yufei Shang et.al. |
2501.05155 |
null |
2025-01-09 |
DriVLM: Domain Adaptation of Vision-Language Models in Autonomous Driving |
Xuran Zheng et.al. |
2501.05081 |
null |
2025-01-09 |
Multimodal-to-Text Prompt Engineering in Large Language Models Using Feature Embeddings for GNSS Interference Characterization |
Harshith Manjunath et.al. |
2501.05079 |
null |
2025-01-09 |
Analyzing Memorization in Large Language Models through the Lens of Model Attribution |
Tarun Ram Menta et.al. |
2501.05078 |
link |
2025-01-09 |
A Text-Based Knowledge-Embedded Soft Sensing Modeling Approach for General Industrial Process Tasks Based on Large Language Model |
Shuo Tong et.al. |
2501.05075 |
null |
2025-01-09 |
Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning |
Huabin Liu et.al. |
2501.05069 |
null |
2025-01-09 |
LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding |
Jiaxing Zhao et.al. |
2501.05067 |
null |
2025-01-09 |
Simultaneous emulation and downscaling with physically-consistent deep learning-based regional ocean emulators |
Leonard Lupin-Jimenez et.al. |
2501.05058 |
null |
2025-01-09 |
LearningFlow: Automated Policy Learning Workflow for Urban Driving with Large Language Models |
Zengqi Peng et.al. |
2501.05057 |
null |
2025-01-09 |
On the Generalizability of Transformer Models to Code Completions of Different Lengths |
Nathan Cooper et.al. |
2501.05051 |
null |
2025-01-09 |
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution |
Chengxing Xie et.al. |
2501.05040 |
link |
2025-01-09 |
Enhancing Human-Like Responses in Large Language Models |
Ethem Yağız Çalık et.al. |
2501.05032 |
null |
2025-01-09 |
ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark |
Ronghao Dang et.al. |
2501.05031 |
link |
2025-01-09 |
A General Retrieval-Augmented Generation Framework for Multimodal Case-Based Reasoning Applications |
Ofir Marom et.al. |
2501.05030 |
null |
2025-01-09 |
TreeKV: Smooth Key-Value Cache Compression with Tree Structures |
Ziwei He et.al. |
2501.04987 |
null |
2025-01-09 |
SpaLLM-Guard: Pairing SMS Spam Detection Using Open-source and Commercial LLMs |
Muhammad Salman et.al. |
2501.04985 |
null |
2025-01-09 |
V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer |
Hangzhou He et.al. |
2501.04975 |
link |
2025-01-09 |
Demystifying Domain-adaptive Post-training for Financial LLMs |
Zixuan Ke et.al. |
2501.04961 |
link |
2025-01-09 |
Seeing with Partial Certainty: Conformal Prediction for Robotic Scene Recognition in Built Environments |
Yifan Xu et.al. |
2501.04947 |
null |
2025-01-09 |
Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models |
Qingyu Ren et.al. |
2501.04945 |
link |
2025-01-09 |
Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency |
Shiji Zhao et.al. |
2501.04931 |
null |
2025-01-09 |
Investigating Numerical Translation with Large Language Models |
Wei Tang et.al. |
2501.04927 |
null |
2025-01-09 |
FLowHigh: Towards Efficient and High-Quality Audio Super-Resolution with Single-Step Flow Matching |
Jun-Hak Yun et.al. |
2501.04926 |
link |
2025-01-09 |
HaVen: Hallucination-Mitigated LLM for Verilog Code Generation Aligned with HDL Engineers |
Yiyao Yang et.al. |
2501.04908 |
link |
2025-01-09 |
JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis |
Jun-Hyeok Cha et.al. |
2501.04904 |
null |
2025-01-09 |
ThriftLLM: On Cost-Effective Selection of Large Language Models for Classification Queries |
Keke Huang et.al. |
2501.04901 |
null |
2025-01-09 |
SUGAR: Leveraging Contextual Confidence for Smarter Retrieval |
Hanna Zubkova et.al. |
2501.04899 |
null |
2025-01-08 |
Leveraging Log Probabilities in Language Models to Forecast Future Events |
Tommaso Soru et.al. |
2501.04880 |
null |
2025-01-08 |
Real-Time Textless Dialogue Generation |
Long Mai et.al. |
2501.04877 |
link |
2025-01-08 |
Modelling complex proton transport phenomena – Exploring the limits of fine-tuning and transferability of foundational machine-learned force fields |
Malte Grunert et.al. |
2501.04876 |
null |
2025-01-08 |
Exploring Large Language Models for Semantic Analysis and Categorization of Android Malware |
Brandon J Walton et.al. |
2501.04848 |
null |
2025-01-08 |
Do Code LLMs Understand Design Patterns? |
Zhenyu Pan et.al. |
2501.04835 |
null |
2025-01-08 |
On the Impact of Requirements Smells in Prompts: The Case of Automated Traceability |
Andreas Vogelsang et.al. |
2501.04810 |
null |
2025-01-08 |
IQPopt: Fast optimization of instantaneous quantum polynomial circuits in JAX |
Erik Recio-Armengol et.al. |
2501.04776 |
link |
2025-01-08 |
Efficient and Responsible Adaptation of Large Language Models for Robust and Equitable Top-k Recommendations |
Kirandeep Kaur et.al. |
2501.04762 |
null |
2025-01-08 |
Improving Human-Robot Teaching by Quantifying and Reducing Mental Model Mismatch |
Phillip Richter et.al. |
2501.04755 |
null |
2025-01-08 |
EditAR: Unified Conditional Generation with Autoregressive Models |
Jiteng Mu et.al. |
2501.04699 |
null |
2025-01-08 |
Re-ranking the Context for Multimodal Retrieval Augmented Generation |
Matin Mortaheb et.al. |
2501.04695 |
null |
2025-01-08 |
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images |
Zixuan Huang et.al. |
2501.04689 |
null |
2025-01-08 |
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics |
Ruilin Luo et.al. |
2501.04686 |
link |
2025-01-08 |
Enhancing Financial VQA in Vision Language Models using Intermediate Structured Representations |
Archita Srivastava et.al. |
2501.04675 |
null |
2025-01-08 |
Assessing Language Comprehension in Large Language Models Using Construction Grammar |
Wesley Scivetti et.al. |
2501.04661 |
null |
2025-01-08 |
Multi-task retriever fine-tuning for domain-specific and efficient RAG |
Patrice Béchard et.al. |
2501.04652 |
null |
2025-01-08 |
FlairGPT: Repurposing LLMs for Interior Designs |
Gabrielle Littlefair et.al. |
2501.04648 |
null |
2025-01-08 |
Knowledge Retrieval Based on Generative AI |
Te-Lun Yang et.al. |
2501.04635 |
null |
2025-01-08 |
“Can you be my mum?”: Manipulating Social Robots in the Large Language Models Era |
Giulio Antonio Abbo et.al. |
2501.04633 |
null |
2025-01-09 |
MedCoDi-M: A Multi-Prompt Foundation Model for Multimodal Medical Data Generation |
Daniele Molino et.al. |
2501.04614 |
null |
2025-01-08 |
Quantum-inspired Embeddings Projection and Similarity Metrics for Representation Learning |
Ivan Kankeu et.al. |
2501.04591 |
link |
2025-01-08 |
Boosting Salient Object Detection with Knowledge Distillated from Large Foundation Models |
Miaoyang He et.al. |
2501.04582 |
null |
2025-01-08 |
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection |
Yuhang Liu et.al. |
2501.04575 |
link |
2025-01-09 |
OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis |
Run Luo et.al. |
2501.04561 |
link |
2025-01-08 |
The Impostor is Among Us: Can Large Language Models Capture the Complexity of Human Personas? |
Christopher Lazik et.al. |
2501.04543 |
null |
2025-01-08 |
Improving Image Captioning by Mimicking Human Reformulation Feedback at Inference-time |
Uri Berger et.al. |
2501.04513 |
null |
2025-01-08 |
CGP-Tuning: Structure-Aware Soft Prompt Tuning for Code Vulnerability Detection |
Ruijun Feng et.al. |
2501.04510 |
null |
2025-01-08 |
Integrating remote sensing data assimilation, deep learning and large language model for interactive wheat breeding yield prediction |
Guofeng Yang et.al. |
2501.04487 |
null |
2025-01-08 |
When LLMs Struggle: Reference-less Translation Evaluation for Low-resource Languages |
Archchana Sindhujan et.al. |
2501.04473 |
null |
2025-01-08 |
Hidden Entity Detection from GitHub Leveraging Large Language Models |
Lu Gan et.al. |
2501.04455 |
link |
2025-01-08 |
Integrating LLMs with ITS: Recent Advances, Potentials, Challenges, and Future Directions |
Doaa Mahmud et.al. |
2501.04437 |
null |
2025-01-08 |
Federated Fine-Tuning of LLMs: Framework Comparison and Research Directions |
Na Yan et.al. |
2501.04436 |
null |
2025-01-08 |
End-to-End Bangla AI for Solving Math Olympiad Problem Benchmark: Leveraging Large Language Model Using Integrated Approach |
H. M. Shadman Tabib et.al. |
2501.04425 |
null |
2025-01-08 |
SEO: Stochastic Experience Optimization for Large Language Models |
Jitao Xu et.al. |
2501.04393 |
null |
2025-01-08 |
iFADIT: Invertible Face Anonymization via Disentangled Identity Transform |
Lin Yuan et.al. |
2501.04390 |
null |
2025-01-08 |
DispFormer: Pretrained Transformer for Flexible Dispersion Curve Inversion from Global Synthesis to Regional Applications |
Feng Liu et.al. |
2501.04366 |
link |
2025-01-08 |
Understanding Before Reasoning: Enhancing Chain-of-Thought with Iterative Summarization Pre-Prompting |
Dong-Hai Zhu et.al. |
2501.04341 |
link |
2025-01-09 |
Navigating the Designs of Privacy-Preserving Fine-tuning for Large Language Models |
Haonan Shi et.al. |
2501.04323 |
null |
2025-01-08 |
Who Does the Giant Number Pile Like Best: Analyzing Fairness in Hiring Contexts |
Preethi Seshadri et.al. |
2501.04316 |
link |
2025-01-08 |
RoRA: Efficient Fine-Tuning of LLM with Reliability Optimization for Rank Adaptation |
Jun Liu et.al. |
2501.04315 |
null |
2025-01-08 |
Your Fix Is My Exploit: Enabling Comprehensive DL Library API Fuzzing with Large Language Models |
Kunpeng Zhang et.al. |
2501.04312 |
null |
2025-01-08 |
LLM4SR: A Survey on Large Language Models for Scientific Research |
Ziming Luo et.al. |
2501.04306 |
link |
2025-01-08 |
Multimodal Graph Constrastive Learning and Prompt for ChartQA |
Yue Dai et.al. |
2501.04303 |
null |
2025-01-08 |
H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving |
Siran Chen et.al. |
2501.04302 |
null |
2025-01-08 |
An Analysis of Model Robustness across Concurrent Distribution Shifts |
Myeongho Jeon et.al. |
2501.04288 |
null |
2025-01-08 |
Mapping the Edge of Chaos: Fractal-Like Boundaries in The Trainability of Decoder-Only Transformer Models |
Bahman Torkamandi et.al. |
2501.04286 |
link |
2025-01-08 |
Separate Source Channel Coding Is Still What You Need: An LLM-based Rethinking |
Tianqi Ren et.al. |
2501.04285 |
null |
2025-01-08 |
OpenIN: Open-Vocabulary Instance-Oriented Navigation in Dynamic Domestic Environments |
Yujie Tang et.al. |
2501.04279 |
null |
2025-01-08 |
Exploring the Expertise of Large Language Models in Materials Science and Metallurgical Engineering |
Christophe Bajan et.al. |
2501.04277 |
link |
2025-01-08 |
Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation |
Senwei Xie et.al. |
2501.04268 |
null |
2025-01-08 |
Scaling Large Language Model Training on Frontier with Low-Bandwidth Partitioning |
Lang Xu et.al. |
2501.04266 |
null |
2025-01-08 |
IOLBENCH: Benchmarking LLMs on Linguistic Reasoning |
Satyam Goyal et.al. |
2501.04249 |
link |
2025-01-08 |
TransientVerse: A Comprehensive Real-Time Alert and Multi-Wavelength Analysis System for Transient Astronomical Events |
Jian-Hua Fang et.al. |
2501.04247 |
null |
2025-01-08 |
Statistical Uncertainty Quantification for Aggregate Performance Metrics in Machine Learning Benchmarks |
Rachel Longjohn et.al. |
2501.04234 |
null |
2025-01-07 |
Reasoning-Enhanced Self-Training for Long-Form Personalized Text Generation |
Alireza Salemi et.al. |
2501.04167 |
null |
2025-01-07 |
AdaptiveCoPilot: Design and Testing of a NeuroAdaptive LLM Cockpit Guidance System in both Novice and Expert Pilots |
Shaoyue Wen et.al. |
2501.04156 |
link |
2025-01-07 |
Multilingual Open QA on the MIA Shared Task |
Navya Yarrabelly et.al. |
2501.04153 |
null |
2025-01-07 |
The angular momentum spiral of the Milky Way disc in Gaia |
Rashid Yaaqib et.al. |
2501.04095 |
null |
2025-01-07 |
More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives |
Xiaoqing Zhang et.al. |
2501.04070 |
link |
2025-01-07 |
ChronoLLM: A Framework for Customizing Large Language Model for Digital Twins generalization based on PyChrono |
Jingquan Wang et.al. |
2501.04062 |
null |
2025-01-07 |
LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving |
Lingdong Kong et.al. |
2501.04005 |
null |
2025-01-07 |
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos |
Haobo Yuan et.al. |
2501.04001 |
link |
2025-01-07 |
RAG-Check: Evaluating Multimodal Retrieval Augmented Generation Performance |
Matin Mortaheb et.al. |
2501.03995 |
null |
2025-01-07 |
Synthetic Data for Portfolios: A Throw of the Dice Will Never Abolish Chance |
Adil Rengim Cetingoz et.al. |
2501.03993 |
null |
2025-01-07 |
Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles |
Yuxi Xia et.al. |
2501.03991 |
null |
2025-01-07 |
(De)-Indexing and the Right to be Forgotten |
Salvatore Vilella et.al. |
2501.03989 |
null |
2025-01-07 |
VLM-driven Behavior Tree for Context-aware Task Planning |
Naoki Wake et.al. |
2501.03968 |
link |
2025-01-07 |
Vision Language Models as Values Detectors |
Giulio Antonio Abbo et.al. |
2501.03957 |
null |
2025-01-07 |
Localizing AI: Evaluating Open-Weight Language Models for Languages of Baltic States |
Jurgita Kapočiūtė-Dzikienė et.al. |
2501.03952 |
null |
2025-01-07 |
Synthetic Data Privacy Metrics |
Amy Steier et.al. |
2501.03941 |
null |
2025-01-07 |
Not all tokens are created equal: Perplexity Attention Weighted Networks for AI generated text detection |
Pablo Miralles-González et.al. |
2501.03940 |
null |
2025-01-07 |
A precise asymptotic analysis of learning diffusion models: theory and insights |
Hugo Cui et.al. |
2501.03937 |
link |
2025-01-07 |
Exploring the Potential of Large Language Models in Public Transportation: San Antonio Case Study |
Ramya Jonnala et.al. |
2501.03904 |
null |
2025-01-07 |
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token |
Shaolei Zhang et.al. |
2501.03895 |
link |
2025-01-07 |
AlphaPO – Reward shape matters for LLM alignment |
Aman Gupta et.al. |
2501.03884 |
null |
2025-01-07 |
CL3DOR: Contrastive Learning for 3D Large Multimodal Models via Odds Ratio on High-Resolution Point Clouds |
Keonwoo Kim et.al. |
2501.03879 |
null |
2025-01-07 |
Progressive Document-level Text Simplification via Large Language Models |
Dengzhao Fang et.al. |
2501.03857 |
null |
2025-01-07 |
MedFocusCLIP: Improving few shot classification in medical datasets using pixel wise attention |
Aadya Arora et.al. |
2501.03839 |
null |
2025-01-07 |
Deep Sylvester Posterior Inference for Adaptive Compressed Sensing in Ultrasound Imaging |
Simon W. Penninga et.al. |
2501.03825 |
null |
2025-01-08 |
MADation: Face Morphing Attack Detection with Foundation Models |
Eduarda Caldeira et.al. |
2501.03800 |
link |
2025-01-07 |
KAnoCLIP: Zero-Shot Anomaly Detection through Knowledge-Driven Prompt Learning and Enhanced Cross-Modal Integration |
Chengyuan Li et.al. |
2501.03786 |
null |
2025-01-07 |
Context-Alignment: Activating and Enhancing LLM Capabilities in Time Series |
Yuxiao Hu et.al. |
2501.03747 |
null |
2025-01-07 |
Self-adaptive vision-language model for 3D segmentation of pulmonary artery and vein |
Xiaotong Guo et.al. |
2501.03722 |
null |
2025-01-07 |
Motion-Aware Generative Frame Interpolation |
Guozhen Zhang et.al. |
2501.03699 |
null |
2025-01-07 |
SLAM: Towards Efficient Multilingual Reasoning via Selective Language Alignment |
Yuchun Fan et.al. |
2501.03681 |
link |
2025-01-07 |
Effective and Efficient Mixed Precision Quantization of Speech Foundation Models |
Haoning Xu et.al. |
2501.03643 |
null |
2025-01-07 |
CommitShield: Tracking Vulnerability Introduction and Fix in Version Control Systems |
Zhaonan Wu et.al. |
2501.03626 |
link |
2025-01-07 |
LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment |
Gaoussou Youssouf Kebe et.al. |
2501.03624 |
null |
2025-01-07 |
Cosmos World Foundation Model Platform for Physical AI |
NVIDIA et.al. |
2501.03575 |
link |
2025-01-07 |
From Code to Compliance: Assessing ChatGPT’s Utility in Designing an Accessible Webpage – A Case Study |
Ammar Ahmed et.al. |
2501.03572 |
null |
2025-01-07 |
What Does a Software Engineer Look Like? Exploring Societal Stereotypes in LLMs |
Muneera Bano et.al. |
2501.03569 |
null |
2025-01-07 |
Applying Large Language Models in Knowledge Graph-based Enterprise Modeling: Challenges and Opportunities |
Benedikt Reitemeyer et.al. |
2501.03566 |
null |
2025-01-07 |
Bridged Semantic Alignment for Zero-shot 3D Medical Image Diagnosis |
Haoran Lai et.al. |
2501.03565 |
null |
2025-01-07 |
PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models |
Lingzhi Yuan et.al. |
2501.03544 |
null |
2025-01-07 |
Deep Learning within Tabular Data: Foundations, Challenges, Advances and Future Directions |
Weijieying Ren et.al. |
2501.03540 |
null |
2025-01-07 |
Deep Learning for Pathological Speech: A Survey |
Shakeel A. Sheikh et.al. |
2501.03536 |
null |
2025-01-08 |
SenseRAG: Constructing Environmental Knowledge Bases with Proactive Querying for LLM-Based Autonomous Driving |
Xuewen Luo et.al. |
2501.03535 |
null |
2025-01-07 |
A generative approach for lensless imaging in low-light conditions |
Ziyang Liu et.al. |
2501.03511 |
null |
2025-01-07 |
A Sequential Optimal Learning Approach to Automated Prompt Engineering in Large Language Models |
Shuyang Wang et.al. |
2501.03508 |
null |
2025-01-07 |
Textualize Visual Prompt for Image Editing via Diffusion Bridge |
Pengcheng Xu et.al. |
2501.03495 |
null |
2025-01-07 |
Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment |
Prashant Trivedi et.al. |
2501.03486 |
null |
2025-01-07 |
Reading with Intent – Neutralizing Intent |
Benjamin Reichman et.al. |
2501.03475 |
null |
2025-01-07 |
Information-Maximized Soft Variable Discretization for Self-Supervised Image Representation Learning |
Chuang Niu et.al. |
2501.03469 |
link |
2025-01-07 |
MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems |
Yannis Katsis et.al. |
2501.03468 |
link |
2025-01-07 |
ISSR: Iterative Selection with Self-Review for Vocabulary Test Distractor Generation |
Yu-Cheng Liu et.al. |
2501.03462 |
null |
2025-01-07 |
Activating Associative Disease-Aware Vision Token Memory for LLM-Based X-ray Report Generation |
Xiao Wang et.al. |
2501.03458 |
link |
2025-01-07 |
CoReQA: Uncovering Potentials of Language Models in Code Repository Question Answering |
Jialiang Chen et.al. |
2501.03447 |
null |
2025-01-07 |
LLM4CVE: Enabling Iterative Automated Vulnerability Repair with Large Language Models |
Mohamad Fakih et.al. |
2501.03446 |
null |
2025-01-07 |
Finding A Voice: Evaluating African American Dialect Generation for Chatbot Technology |
Sarah E. Finch et.al. |
2501.03441 |
link |
2025-01-06 |
SALT: Sales Autocompletion Linked Business Tables Dataset |
Tassilo Klein et.al. |
2501.03413 |
link |
2025-01-06 |
BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations |
Simone Giovannini et.al. |
2501.03403 |
null |
2025-01-06 |
DoubleDiffusion: Combining Heat Diffusion with Denoising Diffusion for Generative Learning on 3D Meshes |
Xuyang Wang et.al. |
2501.03397 |
link |
2025-01-06 |
Evolved Quantum Boltzmann Machines |
Michele Minervini et.al. |
2501.03367 |
null |
2025-01-06 |
CM3T: Framework for Efficient Multimodal Learning for Inhomogeneous Interaction Datasets |
Tanay Agrawal et.al. |
2501.03332 |
null |
2025-01-06 |
LiLMaps: Learnable Implicit Language Maps |
Evgenii Kruzhkov et.al. |
2501.03304 |
null |
2025-01-06 |
A Soft Sensor Method with Uncertainty-Awareness and Self-Explanation Based on Large Language Models Enhanced by Domain Knowledge Retrieval |
Shuo Tong et.al. |
2501.03295 |
null |
2025-01-06 |
Multi-Modal One-Shot Federated Ensemble Learning for Medical Data with Vision Large Language Model |
Naibo Wang et.al. |
2501.03292 |
null |
2025-01-06 |
ADePT: Adaptive Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning |
Pengwei Tang et.al. |
2501.03291 |
link |
2025-01-06 |
CodeVision: Detecting LLM-Generated Code Using 2D Token Probability Maps and Vision Models |
Zhenyu Xu et.al. |
2501.03288 |
null |
2025-01-06 |
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning |
Beichen Zhang et.al. |
2501.03226 |
link |
2025-01-06 |
Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLMs-Generated Text |
Ayat Najjar et.al. |
2501.03212 |
null |
2025-01-06 |
Detecting AI-Generated Text in Educational Content: Leveraging Machine Learning and Explainable AI for Academic Integrity |
Ayat A. Najjar et.al. |
2501.03203 |
null |
2025-01-06 |
CLIX: Cross-Lingual Explanations of Idiomatic Expressions |
Aaron Gluck et.al. |
2501.03191 |
null |
2025-01-06 |
Semantic Captioning: Benchmark Dataset and Graph-Aware Few-Shot In-Context Learning for SQL2Text |
Ali Al-Lawati et.al. |
2501.03166 |
link |
2025-01-06 |
Segment Anything Model for Zero-shot Single Particle Tracking in Liquid Phase Transmission Electron Microscopy |
Risha Goel et.al. |
2501.03153 |
link |
2025-01-06 |
Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches |
Alhassan Mumuni et.al. |
2501.03151 |
null |
2025-01-06 |
VicSim: Enhancing Victim Simulation with Emotional and Linguistic Fidelity |
Yerong Li et.al. |
2501.03139 |
null |
2025-01-07 |
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models |
Mingyang Song et.al. |
2501.03124 |
link |
2025-01-06 |
CAT: Content-Adaptive Image Tokenization |
Junhong Shen et.al. |
2501.03120 |
null |
2025-01-06 |
LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases |
Dylan Bouchard et.al. |
2501.03112 |
link |
2025-01-06 |
Sentiment-guided Commonsense-aware Response Generation for Mental Health Counseling |
Aseem Srivastava et.al. |
2501.03088 |
null |
2025-01-06 |
Retrieval-Augmented TLAPS Proof Generation with Large Language Models |
Yuhao Zhou et.al. |
2501.03073 |
null |
2025-01-06 |
ChronoSense: Exploring Temporal Understanding in Large Language Models with Time Intervals of Events |
Duygu Sezen Islakoglu et.al. |
2501.03040 |
null |
2025-01-06 |
Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning |
Zhen Li et.al. |
2501.03035 |
null |
2025-01-06 |
TransPixar: Advancing Text-to-Video Generation with Transparency |
Luozhou Wang et.al. |
2501.03006 |
link |
2025-01-06 |
CALM: Curiosity-Driven Auditing for Large Language Models |
Xiang Zheng et.al. |
2501.02997 |
link |
2025-01-06 |
Registering Source Tokens to Target Language Spaces in Multilingual Neural Machine Translation |
Zhi Qu et.al. |
2501.02979 |
link |
2025-01-06 |
FlipedRAG: Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models |
Zhuo Chen et.al. |
2501.02968 |
null |
2025-01-07 |
Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild |
Wanpeng Hu et.al. |
2501.02964 |
link |
2025-01-07 |
SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild |
Jiawei Liu et.al. |
2501.02962 |
null |
2025-01-06 |
The Tabular Foundation Model TabPFN Outperforms Specialized Time Series Forecasting Models Based on Simple Features |
Shi Bin Hoo et.al. |
2501.02945 |
link |
2025-01-07 |
Inhibition of bacterial growth by antibiotics |
Barnabe Ledoux et.al. |
2501.02944 |
null |
2025-01-06 |
Deep Generative Model-Aided Power System Dynamic State Estimation and Reconstruction with Unknown Control Inputs or Data Distributions |
Jianhua Pei et.al. |
2501.02928 |
null |
2025-01-06 |
DeCon: Detecting Incorrect Assertions via Postconditions Generated by a Large Language Model |
Hao Yu et.al. |
2501.02901 |
link |
2025-01-06 |
FoundPAD: Foundation Models Reloaded for Face Presentation Attack Detection |
Guray Ozgur et.al. |
2501.02892 |
link |
2025-01-06 |
MDP3: A Training-free Approach for List-wise Frame Selection in Video-LLMs |
Hui Sun et.al. |
2501.02885 |
null |
2025-01-06 |
IIMedGPT: Promoting Large Language Model Capabilities of Medical Tasks by Efficient Human Preference Alignment |
Yiming Zhang et.al. |
2501.02869 |
null |
2025-01-06 |
Large Language Models for Video Surveillance Applications |
Ulindu De Silva et.al. |
2501.02850 |
null |
2025-01-06 |
Graph-based Retrieval Augmented Generation for Dynamic Few-shot Text Classification |
Yubo Wang et.al. |
2501.02844 |
null |
2025-01-06 |
Foundations of GenIR |
Qingyao Ai et.al. |
2501.02842 |
null |
2025-01-06 |
An Infrastructure Software Perspective Toward Computation Offloading between Executable Specifications and Foundation Models |
Dezhi Ran et.al. |
2501.02829 |
null |
2025-01-06 |
InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion |
Zhaoyi Yan et.al. |
2501.02795 |
null |
2025-01-06 |
CCStereo: Audio-Visual Contextual and Contrastive Learning for Binaural Audio Generation |
Yuanhong Chen et.al. |
2501.02786 |
null |
2025-01-06 |
GeAR: Generation Augmented Retrieval |
Haoyu Liu et.al. |
2501.02772 |
null |
2025-01-06 |
Visual Large Language Models for Generalized and Specialized Applications |
Yifan Li et.al. |
2501.02765 |
link |
2025-01-06 |
Ultrasound-QBench: Can LLMs Aid in Quality Assessment of Ultrasound Imaging? |
Hongyi Miao et.al. |
2501.02751 |
null |
2025-01-06 |
Artificial Intelligence in Creative Industries: Advances Prior to 2025 |
Nantheera Anantrasirichai et.al. |
2501.02725 |
null |
2025-01-06 |
KG-CF: Knowledge Graph Completion with Context Filtering under the Guidance of Large Language Models |
Zaiyi Zheng et.al. |
2501.02711 |
null |
2025-01-06 |
QuIM-RAG: Advancing Retrieval-Augmented Generation with Inverted Question Matching for Enhanced QA Performance |
Binita Saha et.al. |
2501.02702 |
null |
2025-01-06 |
EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models |
Andrés Villa et.al. |
2501.02699 |
null |
2025-01-05 |
GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking |
Weikang Bian et.al. |
2501.02690 |
null |
2025-01-05 |
Decoding specialised feature neurons in LLMs with the final projection layer |
Harry J Davies et.al. |
2501.02688 |
null |
2025-01-05 |
From thermodynamics to protein design: Diffusion models for biomolecule generation towards autonomous protein engineering |
Wen-ran Li et.al. |
2501.02680 |
null |
2025-01-05 |
A New Interpretation of the Certainty-Equivalence Approach for PAC Reinforcement Learning with a Generative Model |
Shivaram Kalyanakrishnan et.al. |
2501.02652 |
null |
2025-01-05 |
Representation Learning of Lab Values via Masked AutoEncoder |
David Restrepo et.al. |
2501.02648 |
link |
2025-01-05 |
Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense |
Yang Ouyang et.al. |
2501.02629 |
link |
2025-01-05 |
Cracks in The Stack: Hidden Vulnerabilities and Licensing Risks in LLM Pre-Training Datasets |
Mahmoud Jahanshahi et.al. |
2501.02628 |
null |
2025-01-05 |
HALO: Hadamard-Assisted Lossless Optimization for Efficient Low-Precision LLM Training and Fine-Tuning |
Saleh Ashkboos et.al. |
2501.02625 |
link |
2025-01-05 |
LLMs Help Alleviate the Cross-Subject Variability in Brain Signal and Language Alignment |
Yifei Liu et.al. |
2501.02621 |
null |
2025-01-05 |
TAPAS: Thermal- and Power-Aware Scheduling for LLM Inference in Cloud Platforms |
Jovan Stojkovic et.al. |
2501.02600 |
null |
2025-01-05 |
LeetDecoding: A PyTorch Library for Exponentially Decaying Causal Linear Attention with CUDA Implementations |
Jiaping Wang et.al. |
2501.02573 |
link |
2025-01-05 |
Multi-LLM Collaborative Caption Generation in Scientific Documents |
Jaeyoung Kim et.al. |
2501.02552 |
link |
2025-01-05 |
Transformers Simulate MLE for Sequence Generation in Bayesian Networks |
Yuan Cao et.al. |
2501.02547 |
null |
2025-01-05 |
Evaluating Large Language Models Against Human Annotators in Latent Content Analysis: Sentiment, Political Leaning, Emotional Intensity, and Sarcasm |
Ljubisa Bojic et.al. |
2501.02532 |
null |
2025-01-05 |
Towards New Benchmark for AI Alignment & Sentiment Analysis in Socially Important Issues: A Comparative Study of Human and LLMs in the Context of AGI |
Ljubisa Bojic et.al. |
2501.02531 |
null |
2025-01-05 |
Vision-Driven Prompt Optimization for Large Language Models in Multimodal Generative Tasks |
Leo Franklin et.al. |
2501.02527 |
null |
2025-01-05 |
Unified Guidance for Geometry-Conditioned Molecular Generation |
Sirine Ayadi et.al. |
2501.02526 |
null |
2025-01-05 |
Layout2Scene: 3D Semantic Layout Guided Scene Generation via Geometry and Appearance Diffusion Priors |
Minglin Chen et.al. |
2501.02519 |
null |
2025-01-05 |
CHAIR-Classifier of Hallucination as Improver |
Ao Sun et.al. |
2501.02518 |
link |
2025-01-05 |
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use |
Junjie Ye et.al. |
2501.02506 |
null |
2025-01-05 |
Learning when to rank: Estimation of partial rankings from sparse, noisy comparisons |
Sebastian Morel-Balbi et.al. |
2501.02505 |
null |
2025-01-05 |
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling |
Chaojie Mao et.al. |
2501.02487 |
null |
2025-01-05 |
LLMPC: Large Language Model Predictive Control |
Gabriel Maher et.al. |
2501.02486 |
link |
2025-01-05 |
Decoding News Bias: Multi Bias Detection in News Articles |
Bhushan Santosh Shah et.al. |
2501.02482 |
null |
2025-01-05 |
Hengqin-RA-v1: Advanced Large Language Model for Diagnosis and Treatment of Rheumatoid Arthritis with Dataset based Traditional Chinese Medicine |
Yishen Liu et.al. |
2501.02471 |
null |
2025-01-05 |
Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera |
Yuliang Guo et.al. |
2501.02464 |
link |
2025-01-05 |
Towards Omni-RAG: Comprehensive Retrieval-Augmented Generation for Large Language Models in Medical Applications |
Zhe Chen et.al. |
2501.02460 |
null |
2025-01-05 |
Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap |
Hyunwoo Ko et.al. |
2501.02448 |
null |
2025-01-05 |
RTLMarker: Protecting LLM-Generated RTL Copyright via a Hardware Watermarking Framework |
Kun Wang et.al. |
2501.02446 |
null |
2025-01-05 |
A Statistical Hypothesis Testing Framework for Data Misappropriation Detection in Large Language Models |
Yinpeng Cai et.al. |
2501.02441 |
null |
2025-01-05 |
Efficient Deployment of Large Language Models on Resource-constrained Devices |
Zhiwei Yao et.al. |
2501.02438 |
null |
2025-01-05 |
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance |
Haicheng Wang et.al. |
2501.02430 |
link |
2025-01-05 |
GenTREC: The First Test Collection Generated by Large Language Models for Evaluating Information Retrieval Systems |
Mehmet Deniz Türkmen et.al. |
2501.02408 |
null |
2025-01-04 |
Who Wrote This? Zero-Shot Statistical Tests for LLM-Generated Text Detection using Finite Sample Concentration Inequalities |
Tara Radvand et.al. |
2501.02406 |
link |
2025-01-04 |
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers |
Markus J. Buehler et.al. |
2501.02393 |
link |
2025-01-04 |
Guiding Medical Vision-Language Models with Explicit Visual Prompts: Framework Design and Comprehensive Exploration of Prompt Variations |
Kangyu Zhu et.al. |
2501.02385 |
null |
2025-01-04 |
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison |
Tsz Kin Lam et.al. |
2501.02370 |
null |
2025-01-04 |
Thinking with Many Minds: Using Large Language Models for Multi-Perspective Problem-Solving |
Sanghyun Park et.al. |
2501.02348 |
null |
2025-01-04 |
Exploring the Capabilities and Limitations of Large Language Models for Radiation Oncology Decision Support |
Florian Putz et.al. |
2501.02346 |
null |
2025-01-04 |
UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility |
Yonglin Tian et.al. |
2501.02341 |
link |
2025-01-04 |
AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference |
Zhuomin He et.al. |
2501.02336 |
link |
2025-01-04 |
Validity Arguments For Constructed Response Scoring Using Generative Artificial Intelligence Applications |
Jodi M. Casabianca et.al. |
2501.02334 |
null |
2025-01-04 |
Beyond Log-Concavity and Score Regularity: Improved Convergence Bounds for Score-Based Generative Models in W2-distance |
Marta Gentiloni-Silveri et.al. |
2501.02298 |
null |
2025-01-04 |
Explicit vs. Implicit: Investigating Social Bias in Large Language Models through Self-Reflection |
Yachao Zhao et.al. |
2501.02295 |
null |
2025-01-04 |
Digital Deep Joint Source-Channel Coding with Blind Training for Adaptive Modulation and Power Control |
Yongjeong Oh et.al. |
2501.02273 |
null |
2025-01-04 |
What Kind of Visual Tokens Do We Need? Training-free Visual Token Pruning for Multi-modal Large Language Models from the Perspective of Graph |
Yutao Jiang et.al. |
2501.02268 |
link |
2025-01-04 |
Unsupervised Class Generation to Expand Semantic Segmentation Datasets |
Javier Montalvo et.al. |
2501.02264 |
null |
2025-01-04 |
Financial Named Entity Recognition: How Far Can LLM Go? |
Yi-Te Lu et.al. |
2501.02237 |
link |
2025-01-04 |
Survey on Question Answering over Visually Rich Documents: Methods, Challenges, and Trends |
Camille Barboule et.al. |
2501.02235 |
null |
2025-01-04 |
Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection |
S M Mostaq Hossain et.al. |
2501.02229 |
null |
2025-01-04 |
Knowledge Graph Retrieval-Augmented Generation for LLM-based Recommendation |
Shijie Wang et.al. |
2501.02226 |
null |
2025-01-04 |
Can ChatGPT implement finite element models for geotechnical engineering applications? |
Taegu Kim et.al. |
2501.02199 |
null |
2025-01-04 |
EvoPath: Evolutionary Meta-path Discovery with Large Language Models for Complex Heterogeneous Information Networks |
Shixuan Liu et.al. |
2501.02192 |
null |
2025-01-04 |
On LLM-Enhanced Mixed-Type Data Imputation with High-Order Message Passing |
Jianwei Wang et.al. |
2501.02191 |
link |
2025-01-04 |
The Application of Large Language Models in Recommendation Systems |
Peiyang Yu et.al. |
2501.02178 |
null |
2025-01-04 |
The Efficiency vs. Accuracy Trade-off: Optimizing RAG-Enhanced LLM Recommender Systems Using Multi-Head Early Exit |
Huixue Zhou et.al. |
2501.02173 |
null |
2025-01-04 |
Personalized Graph-Based Retrieval for Large Language Models |
Steven Au et.al. |
2501.02157 |
link |
2025-01-04 |
Table as Thought: Exploring Structured Thoughts in LLM Reasoning |
Zhenjie Sun et.al. |
2501.02152 |
null |
2025-01-04 |
Plasma-CycleGAN: Plasma Biomarker-Guided MRI to PET Cross-modality Translation Using Conditional CycleGAN |
Yanxi Chen et.al. |
2501.02146 |
null |
2025-01-03 |
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction |
Chaoyou Fu et.al. |
2501.01957 |
link |
2025-01-03 |
Metadata Conditioning Accelerates Language Model Pre-training |
Tianyu Gao et.al. |
2501.01956 |
link |
2025-01-03 |
MADGEN – Mass-Spec attends to De Novo Molecular generation |
Yinkai Wang et.al. |
2501.01950 |
link |
2025-01-03 |
Cold-Start Recommendation towards the Era of Large Language Models (LLMs): A Comprehensive Survey and Roadmap |
Weizhi Zhang et.al. |
2501.01945 |
link |
2025-01-03 |
Bridging Classification and Segmentation in Osteosarcoma Assessment via Foundation and Discrete Diffusion Models |
Manh Duong Nguyen et.al. |
2501.01932 |
link |
2025-01-03 |
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM |
Yifan Du et.al. |
2501.01904 |
link |
2025-01-03 |
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation |
Siyuan Huang et.al. |
2501.01895 |
null |
2025-01-03 |
Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions |
Rachneet Sachdeva et.al. |
2501.01872 |
link |
2025-01-03 |
Multi-Agent Conversational Online Learning for Adaptive LLM Response Identification |
Xiangxiang Dai et.al. |
2501.01849 |
link |
2025-01-03 |
MoColl: Agent-Based Specific and General Model Collaboration for Image Captioning |
Pu Yang et.al. |
2501.01834 |
null |
2025-01-03 |
Time Series Language Model for Descriptive Caption Generation |
Mohamed Trabelsi et.al. |
2501.01832 |
null |
2025-01-03 |
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models |
Yanjiang Liu et.al. |
2501.01830 |
null |
2025-01-03 |
SDPO: Segment-Level Direct Preference Optimization for Social Agents |
Aobo Kong et.al. |
2501.01821 |
link |
2025-01-03 |
BERT4MIMO: A Foundation Model using BERT Architecture for Massive MIMO Channel State Information Prediction |
Ferhat Ozgur Catak et.al. |
2501.01802 |
link |
2025-01-03 |
Creating Artificial Students that Never Existed: Leveraging Large Language Models and CTGANs for Synthetic Data Generation |
Mohammad Khalil et.al. |
2501.01793 |
link |
2025-01-03 |
Efficient LLM Inference with Activation Checkpointing and Hybrid Caching |
Sanghyeon Lee et.al. |
2501.01792 |
null |
2025-01-03 |
Nonparametric estimation of a factorizable density using diffusion models |
Hyeok Kyu Kwon et.al. |
2501.01783 |
null |
2025-01-03 |
SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation |
Mingjie Li et.al. |
2501.01765 |
null |
2025-01-03 |
Adverse Weather Conditions Augmentation of LiDAR Scenes with Latent Diffusion Models |
Andrea Matteazzi et.al. |
2501.01761 |
null |
2025-01-03 |
MusicGen-Stem: Multi-stem music generation and edition through autoregressive modeling |
Simon Rouard et.al. |
2501.01757 |
null |
2025-01-03 |
Automating Legal Concept Interpretation with LLMs: Retrieval, Generation, and Evaluation |
Kangcheng Luo et.al. |
2501.01743 |
null |
2025-01-03 |
How Toxic Can You Get? Search-based Toxicity Testing for Large Language Models |
Simone Corbo et.al. |
2501.01741 |
null |
2025-01-03 |
AR4D: Autoregressive 4D Generation from Monocular Videos |
Hanxin Zhu et.al. |
2501.01722 |
null |
2025-01-03 |
Interpretable Face Anti-Spoofing: Enhancing Generalization with Multimodal Large Language Models |
Guosheng Zhang et.al. |
2501.01720 |
null |
2025-01-03 |
LLMs & Legal Aid: Understanding Legal Needs Exhibited Through User Queries |
Michal Kuk et.al. |
2501.01711 |
null |
2025-01-03 |
MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders |
Jiajun Cao et.al. |
2501.01709 |
null |
2025-01-03 |
AgentRefine: Enhancing Agent Generalization through Refinement Tuning |
Dayuan Fu et.al. |
2501.01702 |
null |
2025-01-03 |
Adaptive Few-shot Prompting for Machine Translation with Pre-trained Language Models |
Lei Tang et.al. |
2501.01679 |
null |
2025-01-03 |
Practical Secure Inference Algorithm for Fine-tuned Large Language Model Based on Fully Homomorphic Encryption |
Zhang Ruoyan et.al. |
2501.01672 |
null |
2025-01-03 |
BARTPredict: Empowering IoT Security with LLM-Driven Cyber Threat Prediction |
Alaeddine Diaf et.al. |
2501.01664 |
null |
2025-01-03 |
Look Back for More: Harnessing Historical Sequential Updates for Personalized Federated Adapter Tuning |
Danni Peng et.al. |
2501.01653 |
null |
2025-01-03 |
MIRAGE: Exploring How Large Language Models Perform in Complex Social Interactive Environments |
Cai Yin et.al. |
2501.01652 |
link |
2025-01-03 |
HLV-1K: A Large-scale Hour-Long Video Benchmark for Time-Specific Long Video Understanding |
Heqing Zou et.al. |
2501.01645 |
link |
2025-01-03 |
iCBIR-Sli: Interpretable Content-Based Image Retrieval with 2D Slice Embeddings |
Shuhei Tomoshige et.al. |
2501.01642 |
null |
2025-01-03 |
Uncertainty and Energy based Loss Guided Semi-Supervised Semantic Segmentation |
Rini Smita Thakur et.al. |
2501.01640 |
null |
2025-01-03 |
A non-ergodic framework for understanding emergent capabilities in Large Language Models |
Javier Marin et.al. |
2501.01638 |
null |
2025-01-03 |
Revisiting Data Analysis with Pre-trained Foundation Models |
Chen Liang et.al. |
2501.01631 |
null |
2025-01-03 |
ICPC: In-context Prompt Compression with Faster Inference |
Ziyang Yu et.al. |
2501.01625 |
null |
2025-01-03 |
PSYCHE: A Multi-faceted Patient Simulation Framework for Evaluation of Psychiatric Assessment Conversational Agents |
Jingoo Lee et.al. |
2501.01594 |
null |
2025-01-03 |
(WhyPHI) Fine-Tuning PHI-3 for Multiple-Choice Question Answering: Methodology, Results, and Challenges |
Mohamed Hisham Abdellatif et.al. |
2501.01588 |
null |
2025-01-02 |
Predicting the Performance of Black-box LLMs through Self-Queries |
Dylan Sam et.al. |
2501.01558 |
link |
2025-01-02 |
Enhancing User Engagement in Large-Scale Social Annotation Platforms: Community-Based Design Interventions and Implications for Large Language Models (LLMs) |
Jumana Almahmoud et.al. |
2501.01545 |
null |
2025-01-02 |
Many of Your DPOs are Secretly One: Attempting Unification Through Mutual Information |
Rasul Tutnov et.al. |
2501.01544 |
null |
2025-01-02 |
Denoising Diffused Embeddings: a Generative Approach for Hypergraphs |
Shihao Wu et.al. |
2501.01541 |
null |
2025-01-02 |
BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery |
Kanishk Gandhi et.al. |
2501.01540 |
link |
2025-01-02 |
SAFER: Sharpness Aware layer-selective Finetuning for Enhanced Robustness in vision transformers |
Bhavna Gopal et.al. |
2501.01529 |
null |
2025-01-02 |
Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search |
Shuangtao Li et.al. |
2501.01478 |
null |
2025-01-02 |
Unifying Specialized Visual Encoders for Video Language Models |
Jihoon Chung et.al. |
2501.01426 |
link |
2025-01-02 |
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models |
Jingfeng Yao et.al. |
2501.01423 |
link |
2025-01-02 |
Multi-Modal Video Feature Extraction for Popularity Prediction |
Haixu Liu et.al. |
2501.01422 |
null |
2025-01-02 |
Deep Discrete Encoders: Identifiable Deep Generative Models for Rich Data with Discrete Latent Layers |
Seunghyun Lee et.al. |
2501.01414 |
null |
2025-01-02 |
On Unifying Video Generation and Camera Pose Estimation |
Chun-Hao Paul Huang et.al. |
2501.01409 |
null |
2025-01-02 |
OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios |
Xize Cheng et.al. |
2501.01384 |
null |
2025-01-02 |
ScarNet: A Novel Foundation Model for Automated Myocardial Scar Quantification from LGE in Cardiac MRI |
Neda Tavakoli et.al. |
2501.01372 |
link |
2025-01-02 |
Aligning Large Language Models for Faithful Integrity Against Opposing Argument |
Yong Zhao et.al. |
2501.01336 |
link |
2025-01-02 |
CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models |
Johan Wahréus et.al. |
2501.01335 |
link |
2025-01-02 |
Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension |
Yanbo Fang et.al. |
2501.01332 |
null |
2025-01-02 |
The Prompt Alchemist: Automated LLM-Tailored Prompt Optimization for Test Case Generation |
Shuzheng Gao et.al. |
2501.01329 |
null |
2025-01-03 |
Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking |
Xiaoxue Cheng et.al. |
2501.01306 |
null |
2025-01-02 |
Large Language Models for Mental Health Diagnostic Assessments: Exploring The Potential of Large Language Models for Assisting with Mental Health Diagnostic Assessments – The Depression and Anxiety Case |
Kaushik Roy et.al. |
2501.01305 |
null |
2025-01-02 |
Does a Large Language Model Really Speak in Human-Like Language? |
Mose Park et.al. |
2501.01273 |
null |
2025-01-02 |
ProgCo: Program Helps Self-Correction of Large Language Models |
Xiaoshuai Song et.al. |
2501.01264 |
link |
2025-01-03 |
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings |
Shanghaoran Quan et.al. |
2501.01257 |
null |
2025-01-02 |
Digital Guardians: Can GPT-4, Perspective API, and Moderation API reliably detect hate speech in reader comments of German online newspapers? |
Manuel Weber et.al. |
2501.01256 |
null |
2025-01-02 |
Large Language Model-Enhanced Symbolic Reasoning for Knowledge Base Completion |
Qiyuan He et.al. |
2501.01246 |
null |
2025-01-02 |
SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization |
Yongle Huang et.al. |
2501.01245 |
link |
2025-01-02 |
Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants |
Lixiong Qin et.al. |
2501.01243 |
null |
2025-01-02 |
Automated Self-Refinement and Self-Correction for LLM-based Product Attribute Value Extraction |
Alexander Brinkmann et.al. |
2501.01237 |
link |
2025-01-03 |
TabTreeFormer: Tabular Data Generation Using Hybrid Tree-Transformer |
Jiayu Li et.al. |
2501.01216 |
null |
2025-01-02 |
Harnessing Multi-Agent LLMs for Complex Engineering Problem-Solving: A Framework for Senior Design Projects |
Abdullah Mushtaq et.al. |
2501.01205 |
null |
2025-01-02 |
HetGCoT-Rec: Heterogeneous Graph-Enhanced Chain-of-Thought LLM Reasoning for Journal Recommendation |
Runsong Jia et.al. |
2501.01203 |
null |
2025-01-02 |
LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge |
Kyoungkook Kang et.al. |
2501.01197 |
null |
2025-01-02 |
Bridging the Early Science Gap with Artificial Intelligence: Evaluating Large Language Models as Tools for Early Childhood Science Education |
Annika Bush et.al. |
2501.01192 |
null |
2025-01-02 |
Towards Interactive Deepfake Analysis |
Lixiong Qin et.al. |
2501.01164 |
link |
2025-01-02 |
TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions |
Vriksha Srihari et.al. |
2501.01156 |
null |
2025-01-02 |
A3: Android Agent Arena for Mobile GUI Agents |
Yuxiang Chai et.al. |
2501.01149 |
null |
2025-01-03 |
BlockDialect: Block-wise Fine-grained Mixed Format for Energy-Efficient LLM Inference |
Wonsuk Jang et.al. |
2501.01144 |
link |
2025-01-02 |
Embodied AI-Enhanced Vehicular Networks: An Integrated Large Language Models and Reinforcement Learning Method |
Ruichen Zhang et.al. |
2501.01141 |
null |
2025-01-02 |
Graph2text or Graph2token: A Perspective of Large Language Models for Graph Learning |
Shuo Yu et.al. |
2501.01124 |
null |
2025-01-02 |
MalCL: Leveraging GAN-Based Generative Replay to Combat Catastrophic Forgetting in Malware Classification |
Jimin Park et.al. |
2501.01110 |
link |
2025-01-03 |
MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization |
Haina Zhu et.al. |
2501.01108 |
link |
2025-01-02 |
Graph Generative Pre-trained Transformer |
Xiaohui Chen et.al. |
2501.01073 |
null |
2025-01-02 |
Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models |
Yanwen Huang et.al. |
2501.01059 |
null |
2025-01-02 |
Risks of Cultural Erasure in Large Language Models |
Rida Qadri et.al. |
2501.01056 |
null |
2025-01-02 |
Dynamic Scaling of Unit Tests for Code Reward Modeling |
Zeyao Ma et.al. |
2501.01054 |
null |
2025-01-02 |
Image-based Multimodal Models as Intruders: Transferable Multimodal Attacks on Video-based MLLMs |
Linhao Huang et.al. |
2501.01042 |
null |
2025-01-02 |
Advancing Singlish Understanding: Bridging the Gap with Datasets and Multimodal Models |
Bin Wang et.al. |
2501.01034 |
link |
2025-01-02 |
ValuesRAG: Enhancing Cultural Alignment Through Retrieval-Augmented Contextual Learning |
Wonduk Seo et.al. |
2501.01031 |
null |
2025-01-03 |
KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model |
Xinshuo Hu et.al. |
2501.01028 |
link |
2025-01-02 |
MDSF: Context-Aware Multi-Dimensional Data Storytelling Framework based on Large language Model |
Chengze Zhang et.al. |
2501.01014 |
null |
2025-01-02 |
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving |
Zihao Ye et.al. |
2501.01005 |
link |
2025-01-02 |
Exploring Information Processing in Large Language Models: Insights from Information Bottleneck Theory |
Zhou Yang et.al. |
2501.00999 |
null |
2025-01-02 |
Optimizing Noise Schedules of Generative Models in High Dimensions |
Santiago Aranguri et.al. |
2501.00988 |
null |
2025-01-02 |
Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice |
Federico Ravenda et.al. |
2501.00982 |
link |
2025-01-01 |
IGGA: A Dataset of Industrial Guidelines and Policy Statements for Generative AIs |
Junfeng Jiao et.al. |
2501.00959 |
null |
2025-01-01 |
Generative AI and LLMs in Industry: A text-mining Analysis and Critical Evaluation of Guidelines and Policy Statements Across Fourteen Industrial Sectors |
Junfeng Jiao et.al. |
2501.00957 |
null |
2025-01-01 |
Incremental Dialogue Management: Survey, Discussion, and Implications for HRI |
Casey Kennington et.al. |
2501.00953 |
null |
2025-01-01 |
SPADE: Enhancing Adaptive Cyber Deception Strategies with Generative AI and Structured Prompt Engineering |
Shihab Ahmed et.al. |
2501.00940 |
null |
2025-01-01 |
Diffusion Policies for Generative Modeling of Spacecraft Trajectories |
Julia Briden et.al. |
2501.00915 |
null |
2025-01-01 |
Aligning LLMs with Domain Invariant Reward Models |
David Wu et.al. |
2501.00911 |
link |
2025-01-01 |
Population Aware Diffusion for Time Series Generation |
Yang Li et.al. |
2501.00910 |
link |
2025-01-01 |
Large Language Model Based Multi-Agent System Augmented Complex Event Processing Pipeline for Internet of Multimedia Things |
Talha Zeeshan et.al. |
2501.00906 |
null |
2025-01-01 |
Text2Earth: Unlocking Text-driven Remote Sensing Image Generation with a Global-Scale Dataset and a Foundation Model |
Chenyang Liu et.al. |
2501.00895 |
null |
2025-01-01 |
Evaluating Time Series Foundation Models on Noisy Periodic Time Series |
Syamantak Datta Gupta et.al. |
2501.00889 |
null |
2025-01-01 |
Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization |
Weiqi Wu et.al. |
2501.00888 |
link |
2025-01-01 |
Representation in large language models |
Cameron C. Yetman et.al. |
2501.00885 |
null |
2025-01-01 |
Agentic Systems: A Guide to Transforming Industries with Vertical AI Agents |
Fouad Bousetouane et.al. |
2501.00881 |
null |
2025-01-01 |
Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction |
Teng Hu et.al. |
2501.00880 |
null |
2025-01-01 |
TrustRAG: Enhancing Robustness and Trustworthiness in RAG |
Huichi Zhou et.al. |
2501.00879 |
link |
2025-01-01 |
LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models |
Hieu Man et.al. |
2501.00874 |
link |
2025-01-01 |
Exploring Structured Semantic Priors Underlying Diffusion Score for Test-time Adaptation |
Mingjia Li et.al. |
2501.00873 |
link |
2025-01-01 |
Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation |
Shoutao Guo et.al. |
2501.00868 |
link |
2025-01-01 |
Interactionalism: Re-Designing Higher Learning for the Large Language Agent Era |
Mihnea C. Moldoveanu et.al. |
2501.00867 |
null |
2025-01-01 |
Alzheimer’s disease detection based on large language model prompt engineering |
Tian Zheng et.al. |
2501.00861 |
null |
2025-01-01 |
LLM+AL: Bridging Large Language Models and Action Languages for Complex Reasoning about Actions |
Adam Ishay et.al. |
2501.00830 |
null |
2025-01-01 |
An LLM-Empowered Adaptive Evolutionary Algorithm For Multi-Component Deep Learning Systems |
Haoxiang Tian et.al. |
2501.00829 |
null |
2025-01-01 |
LLM-Powered Multi-Agent System for Automated Crypto Portfolio Management |
Yichen Luo et.al. |
2501.00826 |
null |
2025-01-01 |
Multimodal Large Models Are Effective Action Anticipators |
Binglu Wang et.al. |
2501.00795 |
link |
2025-01-01 |
Shifting-Merging: Secure, High-Capacity and Efficient Steganography via Large Language Models |
Minhao Bai et.al. |
2501.00786 |
null |
2025-01-01 |
NMM-HRI: Natural Multi-modal Human-Robot Interaction with Voice and Deictic Posture via Large Language Model |
Yuzhi Lai et.al. |
2501.00785 |
null |
2025-01-01 |
REM: A Scalable Reinforced Multi-Expert Framework for Multiplex Influence Maximization |
Huyen Nguyen et.al. |
2501.00779 |
null |
2025-01-01 |
FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation |
Qianli Wang et.al. |
2501.00777 |
link |
2025-01-01 |
Using Large Language Model to Support Flexible and Structural Inductive Qualitative Analysis |
Jie Gao et.al. |
2501.00775 |
null |
2025-01-01 |
An AI-powered Bayesian generative modeling approach for causal inference in observational studies |
Qiao Liu et.al. |
2501.00755 |
null |
2025-01-01 |
Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform |
Cheonsu Jeong et.al. |
2501.00750 |
null |
2025-01-01 |
DIVE: Diversified Iterative Self-Improvement |
Yiwei Qin et.al. |
2501.00747 |
link |
2025-01-01 |
Dynamics of Adversarial Attacks on Large Language Model-Based Search Engines |
Xiyang Hu et.al. |
2501.00745 |
null |
2025-01-01 |
A Distributional Evaluation of Generative Image Models |
Edric Tam et.al. |
2501.00744 |
null |
2025-01-01 |
New Agegraphic Dark Energy Model in Modified Symmetric Teleparallel Theory |
Madiha Ajmal et.al. |
2501.00721 |
null |
2025-01-01 |
Knowledge-Guided Prompt Learning for Deepfake Facial Image Detection |
Hao Wang et.al. |
2501.00700 |
null |
2025-01-01 |
Adjoint sharding for very long context training of state space models |
Xingzi Xu et.al. |
2501.00692 |
null |
2025-01-01 |
Labels Generated by Large Language Model Helps Measuring People’s Empathy in Vitro |
Md Rakibul Hasan et.al. |
2501.00691 |
null |
2025-01-01 |
IGC: Integrating a Gated Calculator into an LLM to Solve Arithmetic Tasks Reliably and Efficiently |
Florian Dietz et.al. |
2501.00684 |
null |
2024-12-31 |
Grade Inflation in Generative Models |
Phuc Nguyen et.al. |
2501.00664 |
null |
2024-12-31 |
Finding Missed Code Size Optimizations in Compilers using LLMs |
Davide Italiano et.al. |
2501.00655 |
null |
2024-12-31 |
Taming Feed-forward Reconstruction Models as Latent Encoders for 3D Generative Models |
Suttisak Wizadwongsa et.al. |
2501.00651 |
null |
2024-12-31 |
Efficient Standardization of Clinical Notes using Large Language Models |
Daniel B. Hier et.al. |
2501.00644 |
null |
2024-12-31 |
Enabling New HDLs with Agents |
Mark Zakharov et.al. |
2501.00642 |
null |
2024-12-31 |
DreamDrive: Generative 4D Scene Modeling from Street View Images |
Jiageng Mao et.al. |
2501.00601 |
null |
2024-12-31 |
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM |
Yuqian Yuan et.al. |
2501.00599 |
link |
2024-12-31 |
Setting Standards in Turkish NLP: TR-MMLU for Large Language Model Evaluation |
M. Ali Bayram et.al. |
2501.00593 |
null |
2024-12-31 |
Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method |
Zhenpeng Huang et.al. |
2501.00584 |
null |
2024-12-31 |
Causal Graph Guided Steering of LLM Values via Prompts and Sparse Autoencoders |
Yipeng Kang et.al. |
2501.00581 |
null |
2024-12-31 |
AI and Quantum Computing in Binary Photocatalytic Hydrogen Production |
Dennis Delali Kwesi Wayo et.al. |
2501.00575 |
null |
2024-12-31 |
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling |
Xinhao Li et.al. |
2501.00574 |
link |
2024-12-31 |
Probing Visual Language Priors in VLMs |
Tiange Luo et.al. |
2501.00569 |
null |
2024-12-31 |
Robust and Adaptive Optimization under a Large Language Model Lens |
Dimitris Bertsimas et.al. |
2501.00568 |
null |
2024-12-30 |
Distributed Mixture-of-Agents for Edge Inference with Large Language Models |
Purbesh Mitra et.al. |
2412.21200 |
link |
2024-12-31 |
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation |
Zhaojian Yu et.al. |
2412.21199 |
link |
2024-12-30 |
The Gaussian Kicked Rotor: Periodic forcing with finite-width pulses and the role of shifting the kick |
Jonathan Berkheim et.al. |
2412.21186 |
null |
2024-12-30 |
Facilitating large language model Russian adaptation with Learned Embedding Propagation |
Mikhail Tikhomirov et.al. |
2412.21140 |
link |
2024-12-30 |
ExpShield: Safeguarding Web Text from Unauthorized Crawling and Language Modeling Exploitation |
Ruixuan Liu et.al. |
2412.21123 |
null |
2025-01-02 |
Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation |
Yuanbo Yang et.al. |
2412.21117 |
null |
2024-12-30 |
Varformer: Adapting VAR’s Generative Prior for Image Restoration |
Siyang Wang et.al. |
2412.21063 |
link |
2024-12-30 |
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation |
Jiazheng Xu et.al. |
2412.21059 |
link |
2024-12-30 |
Toward Intelligent and Secure Cloud: Large Language Model Empowered Proactive Defense |
Yuyang Zhou et.al. |
2412.21051 |
link |
2024-12-30 |
E2EDiff: Direct Mapping from Noise to Data for Enhanced Diffusion Models |
Zhiyu Tan et.al. |
2412.21044 |
null |
2024-12-30 |
Visual Style Prompt Learning Using Diffusion Models for Blind Face Restoration |
Wanglong Lu et.al. |
2412.21042 |
link |
2024-12-30 |
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization |
Chia-Yu Hung et.al. |
2412.21037 |
link |
2024-12-30 |
GePBench: Evaluating Fundamental Geometric Perception for Multimodal Large Language Models |
Shangyu Xing et.al. |
2412.21036 |
null |
2024-12-30 |
MapQaTor: A System for Efficient Annotation of Map Query Datasets |
Mahir Labib Dihan et.al. |
2412.21015 |
link |
2024-12-31 |
Verbosity-Aware Rationale Reduction: Effective Reduction of Redundant Rationale via Principled Criteria |
Joonwon Jang et.al. |
2412.21006 |
null |
2024-12-30 |
Plug-and-Play Training Framework for Preference Optimization |
Jingyuan Ma et.al. |
2412.20996 |
null |
2024-12-30 |
KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model’s Reasoning Path Aggregation |
Siyuan Fang et.al. |
2412.20995 |
null |
2024-12-30 |
Efficiently Serving LLM Reasoning Programs with Certaindex |
Yichao Fu et.al. |
2412.20993 |
null |
2024-12-30 |
QuantumLLMInstruct: A 500k LLM Instruction-Tuning Dataset with Problem-Solution Pairs for Quantum Computing |
Shlomo Kashani et.al. |
2412.20956 |
null |
2024-12-30 |
AGON: Automated Design Framework for Customizing Processors from ISA Documents |
Chongxiao Li et.al. |
2412.20954 |
null |
2024-12-30 |
Ontology-grounded Automatic Knowledge Graph Construction by LLM under Wikidata schema |
Xiaohan Feng et.al. |
2412.20942 |
null |
2024-12-30 |
Enhanced Multimodal RAG-LLM for Accurate Visual Question Answering |
Junxiao Xue et.al. |
2412.20927 |
null |
2024-12-30 |
ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation |
Ting Zhang et.al. |
2412.20901 |
null |
2024-12-30 |
Towards Compatible Fine-tuning for Vision-Language Model Updates |
Zhengbo Wang et.al. |
2412.20895 |
null |
2024-12-30 |
DoTA: Weight-Decomposed Tensor Adaptation for Large Language Models |
Xiaolin Hu et.al. |
2412.20891 |
null |
2024-12-30 |
Enhancing Annotated Bibliography Generation with LLM Ensembles |
Sergio Bermejo et.al. |
2412.20864 |
null |
2024-12-30 |
Are LLMs Really Not Knowledgable? Mining the Submerged Knowledge in LLMs’ Memory |
Xingjian Tao et.al. |
2412.20846 |
null |
2024-12-30 |
Disentangling Preference Representation and Text Generation for Efficient Individual Preference Alignment |
Jianfei Zhang et.al. |
2412.20834 |
link |
2024-12-30 |
Retrieval-Augmented Generation for Mobile Edge Computing via Large Language Model |
Runtao Ren et.al. |
2412.20820 |
null |
2024-12-30 |
TimeRAF: Retrieval-Augmented Foundation model for Zero-shot Time Series Forecasting |
Huanyu Zhang et.al. |
2412.20810 |
null |
2024-12-30 |
Pre-trained Audio Transformer as a Foundational AI Tool for Gravitational Waves |
Chayan Chatterjee et.al. |
2412.20789 |
null |
2024-12-31 |
SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity |
Pengfei Jing et.al. |
2412.20787 |
null |
2024-12-30 |
Large Language Model Enabled Multi-Task Physical Layer Network |
Tianyue Zheng et.al. |
2412.20772 |
null |
2024-12-30 |
Attributing Culture-Conditioned Generations to Pretraining Corpora |
Huihan Li et.al. |
2412.20760 |
link |
2024-12-30 |
M$^3$oralBench: A MultiModal Moral Benchmark for LVLMs |
Bei Yan et.al. |
2412.20718 |
link |
2024-12-30 |
HFI: A unified framework for training-free detection and implicit watermarking of latent diffusion model generated images |
Sungik Choi et.al. |
2412.20704 |
null |
2024-12-30 |
UBER: Uncertainty-Based Evolution with Large Language Models for Automatic Heuristic Design |
Zijie Chen et.al. |
2412.20694 |
link |
2024-12-30 |
Learning to Rank Pre-trained Vision-Language Models for Downstream Tasks |
Yuhe Ding et.al. |
2412.20682 |
null |
2024-12-30 |
Align Attention Heads Before Merging Them: An Effective Way for Converting MHA to GQA |
Qingyun Jin et.al. |
2412.20677 |
null |
2024-12-30 |
Enhancing Table Recognition with Vision LLMs: A Benchmark and Neighbor-Guided Toolchain Reasoner |
Yitong Zhou et.al. |
2412.20662 |
link |
2024-12-30 |
Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis |
Yousef Yeganeh et.al. |
2412.20651 |
null |
2024-12-30 |
SafeSynthDP: Leveraging Large Language Models for Privacy-Preserving Synthetic Data Generation Using Differential Privacy |
Md Mahadi Hasan Nahid et.al. |
2412.20641 |
null |
2024-12-30 |
Knowledge Editing for Large Language Model with Knowledge Neuronal Ensemble |
Yongchang Li et.al. |
2412.20637 |
null |
2024-12-30 |
EVOLVE: Emotion and Visual Output Learning via LLM Evaluation |
Jordan Sinclair et.al. |
2412.20632 |
null |
2024-12-29 |
Do Current Video LLMs Have Strong OCR Abilities? A Preliminary Study |
Yulin Fei et.al. |
2412.20613 |
link |
2024-12-29 |
NLP-based Regulatory Compliance – Using GPT 4.0 to Decode Regulatory Documents |
Bimal Kumar et.al. |
2412.20602 |
null |
2024-12-29 |
MATEY: multiscale adaptive foundation models for spatiotemporal physical systems |
Pei Zhang et.al. |
2412.20601 |
null |
2024-12-29 |
Controlling Out-of-Domain Gaps in LLMs for Genre Classification and Generated Text Detection |
Dmitri Roussinov et.al. |
2412.20595 |
link |
2024-12-29 |
Towards Neural No-Resource Language Translation: A Comparative Evaluation of Approaches |
Madhavendra Thakur et.al. |
2412.20584 |
null |
2024-12-29 |
Counterfactual Samples Constructing and Training for Commonsense Statements Estimation |
Chong Liu et.al. |
2412.20563 |
null |
2024-12-29 |
Distributionally Robust Optimization via Iterative Algorithms in Continuous Probability Spaces |
Linglingzhi Zhu et.al. |
2412.20556 |
null |
2024-12-29 |
The Impact of Prompt Programming on Function-Level Code Generation |
Ranim Khojah et.al. |
2412.20545 |
link |
2024-12-29 |
Goal-Conditioned Data Augmentation for Offline Reinforcement Learning |
Xingshuai Huang et.al. |
2412.20519 |
null |
2024-12-29 |
Planning, Living and Judging: A Multi-agent LLM-based Framework for Cyclical Urban Planning |
Hang Ni et.al. |
2412.20505 |
null |
2024-12-29 |
ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding |
Xiao Wang et.al. |
2412.20504 |
link |
2024-12-29 |
TokenRing: An Efficient Parallelism Framework for Infinite-Context LLMs via Bidirectional Communication |
Zongwu Wang et.al. |
2412.20501 |
link |
2024-12-29 |
Multimodal Variational Autoencoder: a Barycentric View |
Peijie Qiu et.al. |
2412.20487 |
null |
2024-12-29 |
JADE: Joint-aware Latent Diffusion for 3D Human Generative Modeling |
Haorui Ji et.al. |
2412.20470 |
null |
2024-12-29 |
Improving Vision-Language-Action Models via Chain-of-Affordance |
Jinming Li et.al. |
2412.20451 |
null |
2024-12-29 |
Enhancing Entertainment Translation for Indian Languages using Adaptive Context, Style and LLMs |
Pratik Rakesh Singh et.al. |
2412.20440 |
null |
2024-12-29 |
Image Augmentation Agent for Weakly Supervised Semantic Segmentation |
Wangyu Wu et.al. |
2412.20439 |
null |
2024-12-29 |
Unlocking adaptive digital pathology through dynamic feature learning |
Jiawen Li et.al. |
2412.20430 |
null |
2024-12-29 |
AmalREC: A Dataset for Relation Extraction and Classification Leveraging Amalgamation of Large Language Models |
Mansi et.al. |
2412.20427 |
null |
2024-12-29 |
Bringing Objects to Life: 4D generation from 3D objects |
Ohad Rahamim et.al. |
2412.20422 |
null |
2024-12-29 |
Comparative Performance of Advanced NLP Models and LLMs in Multilingual Geo-Entity Detection |
Kalin Kopanov et.al. |
2412.20414 |
null |
2024-12-29 |
Multi-Objective Large Language Model Unlearning |
Zibin Pan et.al. |
2412.20412 |
link |
2024-12-29 |
Open-Sora: Democratizing Efficient Video Production for All |
Zangwei Zheng et.al. |
2412.20404 |
link |
2024-12-29 |
Natural Language Fine-Tuning |
Jia Liu et.al. |
2412.20382 |
link |
2024-12-29 |
Protégé: Learn and Generate Basic Makeup Styles with Generative Adversarial Networks (GANs) |
Jia Wei Sii et.al. |
2412.20381 |
null |
2024-12-29 |
FairDiffusion: Enhancing Equity in Latent Diffusion Models via Fair Bayesian Perturbation |
Yan Luo et.al. |
2412.20374 |
link |
2024-12-29 |
LLM2: Let Large Language Models Harness System 2 Reasoning |
Cheng Yang et.al. |
2412.20372 |
link |
2025-01-02 |
Enhancing Code LLMs with Reinforcement Learning in Code Generation: A Survey |
Junqiao Wang et.al. |
2412.20367 |
null |
2024-12-29 |
HindiLLM: Large Language Model for Hindi |
Sanjay Chouhan et.al. |
2412.20357 |
null |
2024-12-29 |
Distilling Desired Comments for Enhanced Code Review with Large Language Models |
Yongda Yu et.al. |
2412.20340 |
null |
2024-12-29 |
Mind the Data Gap: Bridging LLMs to Enterprise Data Integration |
Moe Kayali et.al. |
2412.20331 |
null |
2024-12-29 |
GreenLLM: Disaggregating Large Language Model Serving on Heterogeneous GPUs for Lower Carbon Emissions |
Tianyao Shi et.al. |
2412.20322 |
null |
2024-12-29 |
Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain |
Shintaro Ozaki et.al. |
2412.20309 |
link |
2024-12-28 |
FaGeL: Fabric LLMs Agent empowered Embodied Intelligence Evolution with Autonomous Human-Machine Collaboration |
Jia Liu et.al. |
2412.20297 |
null |
2024-12-28 |
Deep Generalized Schrödinger Bridges: From Image Generation to Solving Mean-Field Games |
Guan-Horng Liu et.al. |
2412.20279 |
null |
2024-12-28 |
Scoring with Large Language Models: A Study on Measuring Empathy of Responses in Dialogues |
Henry J. Xie et.al. |
2412.20264 |
link |
2024-12-28 |
Leveraging Large Language Models for Enhancing Autonomous Vehicle Perception |
Athanasios Karagounis et.al. |
2412.20230 |
null |
2024-12-28 |
LLM Reasoning Engine: Specialized Training for Enhanced Mathematical Reasoning |
Shuguang Chen et.al. |
2412.20227 |
null |
2024-12-28 |
Pushing the Envelope of Low-Bit LLM via Dynamic Error Compensation |
Yeonhong Park et.al. |
2412.20185 |
null |
2024-12-28 |
LoL-PIM: Long-Context LLM Decoding with Scalable DRAM-PIM System |
Hyucksung Kwon et.al. |
2412.20166 |
null |
2024-12-28 |
StyleAutoEncoder for manipulating image attributes using pre-trained StyleGAN |
Andrzej Bedychaj et.al. |
2412.20164 |
null |
2024-12-28 |
Topic-Aware Knowledge Graph with Large Language Models for Interoperability in Recommender Systems |
Minhye Jeon et.al. |
2412.20163 |
null |
2024-12-28 |
Multi-Modality Driven LoRA for Adverse Condition Depth Estimation |
Guanglei Yang et.al. |
2412.20162 |
null |
2024-12-28 |
Defending Against Network Attacks for Secure AI Agent Migration in Vehicular Metaverses |
Xinru Wen et.al. |
2412.20154 |
null |
2024-12-28 |
Efficient Multi-Agent Collaboration with Tool Use for Online Planning in Complex Table Question Answering |
Wei Zhou et.al. |
2412.20145 |
null |
2024-12-28 |
TradingAgents: Multi-Agents LLM Financial Trading Framework |
Yijia Xiao et.al. |
2412.20138 |
null |
2024-12-28 |
M-MAD: Multidimensional Multi-Agent Debate Framework for Fine-grained Machine Translation Evaluation |
Zhaopeng Feng et.al. |
2412.20127 |
link |
2024-12-28 |
Functional Lower Bounds in Algebraic Proofs: Symmetry, Lifting, and Barriers |
Tuomas Hakoniemi et.al. |
2412.20114 |
null |
2024-12-28 |
ST$^3$: Accelerating Multimodal Large Language Model by Spatial-Temporal Visual Token Trimming |
Jiedong Zhuang et.al. |
2412.20105 |
null |
2024-12-28 |
On the Validity of Traditional Vulnerability Scoring Systems for Adversarial Attacks against LLMs |
Atmane Ayoub Mansour Bahar et.al. |
2412.20087 |
null |
2024-12-31 |
Extract Information from Hybrid Long Documents Leveraging LLMs: A Framework and Dataset |
Chongjian Yue et.al. |
2412.20072 |
null |
2024-12-28 |
On the Compositional Generalization of Multimodal LLMs for Medical Imaging |
Zhenyang Cai et.al. |
2412.20070 |
link |
2024-12-28 |
VELoRA: A Low-Rank Adaptation Approach for Efficient RGB-Event based Recognition |
Lan Chen et.al. |
2412.20064 |
link |
2024-12-28 |
MADiff: Text-Guided Fashion Image Editing with Mask Prediction and Attention-Enhanced Diffusion |
Zechao Zhan et.al. |
2412.20062 |
null |
2024-12-28 |
Comparative Analysis of Listwise Reranking with Large Language Models in Limited-Resource Language Contexts |
Yanxin Shen et.al. |
2412.20061 |
null |
2024-12-28 |
“My life is miserable, have to sign 500 autographs everyday”: Exposing Humblebragging, the Brags in Disguise |
Sharath Naganna et.al. |
2412.20057 |
null |
2024-12-27 |
Enhancing Whisper’s Accuracy and Speed for Indian Languages through Prompt-Tuning and Tokenization |
Kumud Tripathi et.al. |
2412.19785 |
null |
2024-12-27 |
Can AI Help with Your Personal Finances? |
Oudom Hean et.al. |
2412.19784 |
null |
2024-12-27 |
Tensor Network Estimation of Distribution Algorithms |
John Gardiner et.al. |
2412.19780 |
null |
2024-12-27 |
Fortran2CPP: Automating Fortran-to-C++ Migration using LLMs via Multi-Turn Dialogue and Dual-Agent Integration |
Le Chen et.al. |
2412.19770 |
link |
2024-12-27 |
Generative Video Propagation |
Shaoteng Liu et.al. |
2412.19761 |
null |
2024-12-27 |
On dual-projectively equivalent connections associated to second order superintegrable systems |
Andreas Vollmer et.al. |
2412.19739 |
null |
2024-12-27 |
Can Large Language Models Adapt to Other Agents In-Context? |
Matthew Riemer et.al. |
2412.19726 |
null |
2024-12-27 |
From Elements to Design: A Layered Approach for Automatic Graphic Design Composition |
Jiawei Lin et.al. |
2412.19712 |
null |
2024-12-27 |
Toward Adaptive Reasoning in Large Language Models with Thought Rollback |
Sijia Chen et.al. |
2412.19707 |
link |
2024-12-27 |
A Large-scale Interpretable Multi-modality Benchmark for Facial Image Forgery Localization |
Jingchun Lian et.al. |
2412.19685 |
null |
2024-12-27 |
Boosting Private Domain Understanding of Efficient MLLMs: A Tuning-free, Adaptive, Universal Prompt Optimization Framework |
Jiang Liu et.al. |
2412.19684 |
null |
2024-12-27 |
CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs |
Siyu Wang et.al. |
2412.19663 |
null |
2024-12-27 |
Asymmetrical Reciprocity-based Federated Learning for Resolving Disparities in Medical Diagnosis |
Jiaqi Wang et.al. |
2412.19654 |
link |
2024-12-27 |
FreStega: A Plug-and-Play Method for Boosting Imperceptibility and Capacity in Generative Linguistic Steganography for Real-World Scenarios |
Kaiyi Pang et.al. |
2412.19652 |
null |
2024-12-27 |
Xmodel-2 Technical Report |
Wang Qun et.al. |
2412.19638 |
null |
2024-12-27 |
IMTP: Search-based Code Generation for In-memory Tensor Programs |
Yongwon Shin et.al. |
2412.19630 |
null |
2024-12-27 |
Signatures of prediction during natural listening in MEG data? |
Sahel Azizpour et.al. |
2412.19622 |
null |
2024-12-27 |
Gradient Weight-normalized Low-rank Projection for Efficient LLM Training |
Jia-Hong Huang et.al. |
2412.19616 |
link |
2024-12-27 |
SocRATES: Towards Automated Scenario-based Testing of Social Navigation Algorithms |
Shashank Rao Marpally et.al. |
2412.19595 |
null |
2024-12-27 |
Hindsight Planner: A Closed-Loop Few-Shot Planner for Embodied Instruction Following |
Yuxiao Yang et.al. |
2412.19562 |
null |
2024-12-27 |
Diverse Rare Sample Generation with Pretrained GANs |
Subeen Lee et.al. |
2412.19543 |
link |
2024-12-27 |
Lévy Score Function and Score-Based Particle Algorithm for Nonlinear Lévy–Fokker–Planck Equations |
Yuanfei Huang et.al. |
2412.19520 |
link |
2024-12-27 |
Estimation of System Parameters Including Repeated Cross-Sectional Data through Emulator-Informed Deep Generative Model |
Hyunwoo Cho et.al. |
2412.19517 |
null |
2024-12-27 |
Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs |
Zhe Yang et.al. |
2412.19513 |
link |
2024-12-27 |
Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging |
Hua Farn et.al. |
2412.19512 |
null |
2024-12-27 |
Parameter Efficient Fine-Tuning for Deep Learning-Based Full-Waveform Inversion |
Koustav Ghosal et.al. |
2412.19510 |
null |
2024-12-27 |
MBQ: Modality-Balanced Quantization for Large Vision-Language Models |
Shiyao Li et.al. |
2412.19509 |
link |
2024-12-27 |
DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT |
Xiaotao Hu et.al. |
2412.19505 |
link |
2024-12-27 |
Casevo: A Cognitive Agents and Social Evolution Simulator |
Zexun Jiang et.al. |
2412.19498 |
link |
2024-12-27 |
Towards Open-Vocabulary Remote Sensing Image Semantic Segmentation |
Chengyang Ye et.al. |
2412.19492 |
link |
2024-12-27 |
Focusing Image Generation to Mitigate Spurious Correlations |
Xuewei Li et.al. |
2412.19457 |
null |
2024-12-27 |
Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models |
Hyeonseok Moon et.al. |
2412.19450 |
link |
2024-12-27 |
Feature Alignment-Based Knowledge Distillation for Efficient Compression of Large Language Models |
Shuo Wang et.al. |
2412.19449 |
null |
2024-12-27 |
A Survey on Large Language Model Acceleration based on KV Cache Management |
Haoyang Li et.al. |
2412.19442 |
link |
2024-12-27 |
Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback |
Seong Jin Lee et.al. |
2412.19436 |
null |
2024-12-27 |
Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints |
Alberto Maté et.al. |
2412.19424 |
null |
2024-12-27 |
Gx2Mol: De Novo Generation of Hit-like Molecules from Gene Expression Profiles via Deep Learning |
Chen Li et.al. |
2412.19422 |
link |
2024-12-27 |
MINIMA: Modality Invariant Image Matching |
Xingyu Jiang et.al. |
2412.19412 |
link |
2024-12-27 |
MLLM-SUL: Multimodal Large Language Model for Semantic Scene Understanding and Localization in Traffic Scenarios |
Jiaqi Fan et.al. |
2412.19406 |
link |
2024-12-27 |
An Engorgio Prompt Makes Large Language Model Babble on |
Jianshuo Dong et.al. |
2412.19394 |
link |
2024-12-26 |
Large Language Models for Market Research: A Data-augmentation Approach |
Mengxin Wang et.al. |
2412.19363 |
null |
2024-12-26 |
Dynamic Skill Adaptation for Large Language Models |
Jiaao Chen et.al. |
2412.19361 |
null |
2024-12-26 |
Identifying Split Vacancies with Foundation Models and Electrostatics |
Seán R. Kavanagh et.al. |
2412.19330 |
null |
2024-12-26 |
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment |
Ziang Yan et.al. |
2412.19326 |
link |
2024-12-26 |
Performance Control in Early Exiting to Deploy Large Models at the Same Cost of Smaller Ones |
Mehrnaz Mofakhami et.al. |
2412.19325 |
null |
2024-12-26 |
From Interests to Insights: An LLM Approach to Course Recommendations Using Natural Language Queries |
Hugh Van Deventer et.al. |
2412.19312 |
link |
2024-12-26 |
Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries |
Roberto Amoroso et.al. |
2412.19304 |
null |
2024-12-26 |
RecLM: Recommendation Instruction Tuning |
Yangqin Jiang et.al. |
2412.19302 |
link |
2024-12-26 |
RAG with Differential Privacy |
Nicolas Grislain et.al. |
2412.19291 |
link |
2024-12-26 |
Time Series Foundational Models: Their Role in Anomaly Detection and Prediction |
Chathurangi Shyalika et.al. |
2412.19286 |
link |
2024-12-26 |
PearSAN: A Machine Learning Method for Inverse Design using Pearson Correlated Surrogate Annealing |
Michael Bezick et.al. |
2412.19284 |
null |
2024-12-26 |
MEDEC: A Benchmark for Medical Error Detection and Correction in Clinical Notes |
Asma Ben Abacha et.al. |
2412.19260 |
link |
2024-12-26 |
VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis |
Jaemin Jung et.al. |
2412.19259 |
null |
2024-12-26 |
Sentiment trading with large language models |
Kemal Kirtac et.al. |
2412.19245 |
null |
2024-12-26 |
SeaMo: A Multi-Seasonal and Multimodal Remote Sensing Foundation Model |
Xuyang Li et.al. |
2412.19237 |
null |
2024-12-26 |
Large Language Models Meet Graph Neural Networks: A Perspective of Graph Mining |
Yuxin You et.al. |
2412.19211 |
null |
2024-12-26 |
Multi-Attribute Constraint Satisfaction via Language Model Rewriting |
Ashutosh Baheti et.al. |
2412.19198 |
null |
2024-12-26 |
Biology Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models |
Haonan He et.al. |
2412.19191 |
null |
2024-12-26 |
Evolutionary de-homogenization using a generative model for optimizing solid-porous infill structures considering the stress concentration issue |
Shuzhi Xu et.al. |
2412.19154 |
null |
2024-12-26 |
AskChart: Universal Chart Understanding through Textual Enhancement |
Xudong Yang et.al. |
2412.19146 |
link |
2024-12-26 |
SILC-EFSA: Self-aware In-context Learning Correction for Entity-level Financial Sentiment Analysis |
Senbin Zhu et.al. |
2412.19140 |
link |
2024-12-26 |
PlanLLM: Video Procedure Planning with Refinable Large Language Models |
Dejie Yang et.al. |
2412.19139 |
link |
2024-12-26 |
Advanced Knowledge Transfer: Refined Feature Distillation for Zero-Shot Quantization in Edge Computing |
Inpyo Hong et.al. |
2412.19125 |
link |
2024-12-26 |
Discrete vs. Continuous Trade-offs for Generative Models |
Jathin Korrapati et.al. |
2412.19114 |
null |
2024-12-26 |
SketchFill: Sketch-Guided Code Generation for Imputing Derived Missing Values |
Yunfan Zhang et.al. |
2412.19113 |
null |
2024-12-26 |
Stochastic normalizing flows for Effective String Theory |
Michele Caselle et.al. |
2412.19109 |
null |
2024-12-26 |
“I’ve Heard of You!”: Generate Spoken Named Entity Recognition Data for Unseen Entities |
Jiawei Yu et.al. |
2412.19102 |
null |
2024-12-26 |
Integrating Artificial Open Generative Artificial Intelligence into Software Supply Chain Security |
Vasileios Alevizos et.al. |
2412.19088 |
null |
2024-12-26 |
Mask Factory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation |
Haotian Qian et.al. |
2412.19080 |
null |
2024-12-26 |
CL-attack: Textual Backdoor Attacks via Cross-Lingual Triggers |
Jingyi Zheng et.al. |
2412.19037 |
link |
2024-12-26 |
Repository Structure-Aware Training Makes SLMs Better Issue Resolver |
Zexiong Ma et.al. |
2412.19031 |
null |
2024-12-26 |
Modality-Projection Universal Model for Comprehensive Full-Body Medical Imaging Segmentation |
Yixin Chen et.al. |
2412.19026 |
link |
2024-12-26 |
Channel-Aware Optimal Transport: A Theoretical Framework for Generative Communication |
Xiqiang Qu et.al. |
2412.19025 |
null |
2024-12-26 |
Relation-aware Hierarchical Prompt for Open-vocabulary Scene Graph Generation |
Tao Liu et.al. |
2412.19021 |
null |
2024-12-26 |
Let the Rule Speak: Enhancing In-context Learning Debiasing with Interpretability |
Ruixi Lin et.al. |
2412.19018 |
null |
2024-12-25 |
How Propense Are Large Language Models at Producing Code Smells? A Benchmarking Study |
Alejandro Velasco et.al. |
2412.18989 |
null |
2024-12-25 |
ModelGrow: Continual Text-to-Video Pre-training with Model Expansion and Language Understanding Enhancement |
Zhefan Rao et.al. |
2412.18966 |
null |
2024-12-25 |
Musings About the Future of Search: A Return to the Past? |
Jimmy Lin et.al. |
2412.18956 |
null |
2024-12-25 |
A Power-Efficient Hardware Implementation of L-Mul |
Ruiqi Chen et.al. |
2412.18948 |
null |
2024-12-25 |
MedHallBench: A New Benchmark for Assessing Hallucination in Medical Large Language Models |
Kaiwen Zuo et.al. |
2412.18947 |
null |
2024-12-25 |
Amuse: Human-AI Collaborative Songwriting with Multimodal Inspirations |
Yewon Kim et.al. |
2412.18940 |
null |
2024-12-25 |
Dovetail: A CPU/GPU Heterogeneous Speculative Decoding for LLM inference |
Libo Zhang et.al. |
2412.18934 |
null |
2024-12-25 |
UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation |
Lunhao Duan et.al. |
2412.18928 |
null |
2024-12-25 |
Exemplar-condensed Federated Class-incremental Learning |
Rui Sun et.al. |
2412.18926 |
null |
2024-12-25 |
Open-Vocabulary Panoptic Segmentation Using BERT Pre-Training of Vision-Language Multiway Transformer Model |
Yi-Chia Chen et.al. |
2412.18917 |
link |
2024-12-25 |
AdaEAGLE: Optimizing Speculative Decoding via Explicit Modeling of Adaptive Draft Structures |
Situo Zhang et.al. |
2412.18910 |
null |
2024-12-25 |
CoEvo: Continual Evolution of Symbolic Solutions Using Large Language Models |
Ping Guo et.al. |
2412.18890 |
link |
2024-12-25 |
MotionMap: Representing Multimodality in Human Pose Forecasting |
Reyhaneh Hosseininejad et.al. |
2412.18883 |
link |
2024-12-25 |
Whose Morality Do They Speak? Unraveling Cultural Bias in Multilingual Language Models |
Meltem Aksoy et.al. |
2412.18863 |
null |
2024-12-25 |
Improving the Readability of Automatically Generated Tests using Large Language Models |
Matteo Biagiola et.al. |
2412.18843 |
null |
2024-12-25 |
LoGFiLM: Fine-Tuning A Large Language Model for Automated Generation of Log Statements |
Hao Zhang et.al. |
2412.18835 |
null |
2024-12-25 |
Structured Speaker-Deficiency Adaptation of Foundation Models for Dysarthric and Elderly Speech Recognition |
Shujie Hu et.al. |
2412.18832 |
null |
2024-12-25 |
RapGuard: Safeguarding Multimodal Large Language Models via Rationale-aware Defensive Prompting |
Yilei Jiang et.al. |
2412.18826 |
null |
2024-12-25 |
CausalTAD: Causal Implicit Generative Model for Debiased Online Trajectory Anomaly Detection |
Wenbin Li et.al. |
2412.18820 |
link |
2024-12-25 |
LLM-assisted vector similarity search |
Md Riyadh et.al. |
2412.18819 |
null |
2024-12-25 |
DCIS: Efficient Length Extrapolation of LLMs via Divide-and-Conquer Scaling Factor Search |
Lei Yang et.al. |
2412.18811 |
link |
2024-12-25 |
Improving Generated and Retrieved Knowledge Combination Through Zero-shot Generation |
Xinkai Du et.al. |
2412.18800 |
null |
2024-12-25 |
Torque-Aware Momentum |
Pranshu Malviya et.al. |
2412.18790 |
null |
2024-12-25 |
Attack-in-the-Chain: Bootstrapping Large Language Models for Attacks Against Black-box Neural Ranking Models |
Yu-An Liu et.al. |
2412.18770 |
link |
2024-12-25 |
The Impact of Input Order Bias on Large Language Models for Software Fault Localization |
Md Nakhla Rafi et.al. |
2412.18750 |
null |
2024-12-24 |
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models |
Zehan Wang et.al. |
2412.18605 |
link |
2024-12-24 |
Long-Form Speech Generation with Spoken Language Models |
Se Jin Park et.al. |
2412.18603 |
link |
2024-12-24 |
Decentralized Intelligence in GameFi: Embodied AI Agents and the Convergence of DeFi and Virtual Ecosystems |
Fernando Jia et.al. |
2412.18601 |
link |
2024-12-24 |
ZeroHSI: Zero-Shot 4D Human-Scene Interaction by Video Generation |
Hongjie Li et.al. |
2412.18600 |
null |
2024-12-24 |
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation |
Minghong Cai et.al. |
2412.18597 |
link |
2024-12-24 |
A Paragraph is All It Takes: Rich Robot Behaviors from Interacting, Trusted LLMs |
OpenMind et.al. |
2412.18588 |
null |
2024-12-24 |
Exploring Embedding Priors in Prompt-Tuning for Improved Interpretability and Control |
Sergey Sedov et.al. |
2412.18582 |
null |
2024-12-24 |
Zero-resource Speech Translation and Recognition with LLMs |
Karel Mundnich et.al. |
2412.18566 |
null |
2024-12-24 |
Distilling Fine-grained Sentiment Understanding from Large Language Models |
Yice Zhang et.al. |
2412.18552 |
link |
2024-12-24 |
Token-Budget-Aware LLM Reasoning |
Tingxu Han et.al. |
2412.18547 |
link |
2024-12-24 |
PLD-Tree: Persistent Laplacian Decision Tree for Protein-Protein Binding Free Energy Prediction |
Xingjian Xu et.al. |
2412.18541 |
null |
2024-12-24 |
Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation |
Derong Xu et.al. |
2412.18537 |
link |
2024-12-24 |
Automated Code Review In Practice |
Umut Cihan et.al. |
2412.18531 |
null |
2024-12-24 |
Large Language Model guided Deep Reinforcement Learning for Decision Making in Autonomous Driving |
Hao Pang et.al. |
2412.18511 |
null |
2024-12-24 |
Think or Remember? Detecting and Directing LLMs Towards Memorization or Generalization |
Yi-Fu Fu et.al. |
2412.18497 |
null |
2024-12-24 |
GeFL: Model-Agnostic Federated Learning with Generative Models |
Honggu Kang et.al. |
2412.18460 |
null |
2024-12-24 |
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding |
Tatiana Zemskova et.al. |
2412.18450 |
link |
2024-12-24 |
Is Large Language Model Good at Triple Set Prediction? An Empirical Study |
Yuan Yuan et.al. |
2412.18443 |
null |
2024-12-24 |
Gaussian entropic optimal transport: Schrödinger bridges and the Sinkhorn algorithm |
O. Deniz Akyildiz et.al. |
2412.18432 |
null |
2024-12-24 |
GUI Testing Arena: A Unified Benchmark for Advancing Autonomous GUI Testing Agent |
Kangjia Zhao et.al. |
2412.18426 |
null |
2024-12-24 |
Research on the Proximity Relationships of Psychosomatic Disease Knowledge Graph Modules Extracted by Large Language Models |
Zihan Zhou et.al. |
2412.18419 |
null |
2024-12-24 |
Muse: A Multimodal Conversational Recommendation Dataset with Scenario-Grounded User Profiles |
Zihan Wang et.al. |
2412.18416 |
null |
2024-12-24 |
Multilingual Mathematical Reasoning: Advancing Open-Source LLMs in Hindi and English |
Avinash Anand et.al. |
2412.18415 |
link |
2024-12-24 |
Discovery of 2D Materials via Symmetry-Constrained Diffusion Model |
Shihang Xu et.al. |
2412.18414 |
null |
2024-12-24 |
A Statistical Framework for Ranking LLM-Based Chatbots |
Siavash Ameli et.al. |
2412.18407 |
link |
2024-12-24 |
Extract Free Dense Misalignment from CLIP |
JeongYeon Nam et.al. |
2412.18404 |
link |
2024-12-24 |
RDPM: Solve Diffusion Probabilistic Models via Recurrent Token Prediction |
Wu Xiaoping et.al. |
2412.18390 |
null |
2024-12-24 |
MR-COGraphs: Communication-efficient Multi-Robot Open-vocabulary Mapping System via 3D Scene Graphs |
Qiuyi Gu et.al. |
2412.18381 |
link |
2024-12-24 |
Defining and Detecting the Defects of the Large Language Model-based Autonomous Agents |
Kaiwen Ning et.al. |
2412.18371 |
link |
2024-12-24 |
Multi-Agents Based on Large Language Models for Knowledge-based Visual Question Answering |
Zhongjian Hu et.al. |
2412.18351 |
null |
2024-12-24 |
M-Ped: Multi-Prompt Ensemble Decoding for Large Language Models |
Jiaxin Guo et.al. |
2412.18299 |
null |
2024-12-24 |
Quo Vadis, Anomaly Detection? LLMs and VLMs in the Spotlight |
Xi Ding et.al. |
2412.18298 |
link |
2024-12-24 |
Pirates of the RAG: Adaptively Attacking LLMs to Leak Knowledge Bases |
Christian Di Maio et.al. |
2412.18295 |
null |
2024-12-24 |
DeepCRCEval: Revisiting the Evaluation of Code Review Comment Generation |
Junyi Lu et.al. |
2412.18291 |
null |
2024-12-24 |
Improved Feature Generating Framework for Transductive Zero-shot Learning |
Zihan Ye et.al. |
2412.18282 |
null |
2024-12-24 |
GDM4MMIMO: Generative Diffusion Models for Massive MIMO Communications |
Zhenzhou Jin et.al. |
2412.18281 |
null |
2024-12-24 |
Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization |
Jiacai Liu et.al. |
2412.18279 |
null |
2024-12-24 |
GenAI Content Detection Task 2: AI vs. Human – Academic Essay Authenticity Challenge |
Shammur Absar Chowdhury et.al. |
2412.18274 |
null |
2024-12-24 |
Annotating References to Mythological Entities in French Literature |
Thierry Poibeau et.al. |
2412.18270 |
null |
2024-12-24 |
Investigating Large Language Models for Code Vulnerability Detection: An Experimental Study |
Xuefeng Jiang et.al. |
2412.18260 |
link |
2024-12-24 |
AdaCo: Overcoming Visual Foundation Model Noise in 3D Semantic Segmentation via Adaptive Label Correction |
Pufan Zou et.al. |
2412.18255 |
null |
2024-12-24 |
An Automatic Graph Construction Framework based on Large Language Models for Recommendation |
Rong Shan et.al. |
2412.18241 |
link |
2024-12-24 |
Combining GPT and Code-Based Similarity Checking for Effective Smart Contract Vulnerability Detection |
Jango Zhang et.al. |
2412.18225 |
null |
2024-12-24 |
Expand VSR Benchmark for VLLM to Expertize in Spatial Rules |
Peijin Xie et.al. |
2412.18224 |
link |
2024-12-24 |
ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation |
Mengyang Wu et.al. |
2412.18216 |
link |
2024-12-24 |
Adapting Large Language Models for Improving TCP Fairness over WiFi |
Shyam Kumar Shrestha et.al. |
2412.18200 |
null |
2024-12-24 |
Robustness-aware Automatic Prompt Optimization |
Zeru Shi et.al. |
2412.18196 |
link |
2024-12-24 |
VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks |
Shiduo Zhang et.al. |
2412.18194 |
null |
2024-12-24 |
TextMatch: Enhancing Image-Text Consistency Through Multimodal Optimization |
Yucong Luo et.al. |
2412.18185 |
null |
2024-12-24 |
Molar: Multimodal LLMs with Collaborative Filtering Alignment for Enhanced Sequential Recommendation |
Yucong Luo et.al. |
2412.18176 |
null |
2024-12-24 |
INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent |
Haohang Li et.al. |
2412.18174 |
null |
2024-12-24 |
Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models |
Xiaomeng Hu et.al. |
2412.18171 |
null |
2024-12-24 |
KunServe: Elastic and Efficient Large Language Model Serving with Parameter-centric Memory Management |
Rongxin Cheng et.al. |
2412.18169 |
null |
2024-12-24 |
Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence |
Yinbin Han et.al. |
2412.18164 |
null |
2024-12-24 |
VISION: A Modular AI Assistant for Natural Human-Instrument Interaction at Scientific User Facilities |
Shray Mathur et.al. |
2412.18161 |
null |
2024-12-24 |
Semantics Disentanglement and Composition for Versatile Codec toward both Human-eye Perception and Machine Vision Task |
Jinming Liu et.al. |
2412.18158 |
null |
2024-12-24 |
Smooth-Foley: Creating Continuous Sound for Video-to-Audio Generation Under Semantic Guidance |
Yaoyun Zhang et.al. |
2412.18157 |
null |
2024-12-24 |
scReader: Prompting Large Language Models to Interpret scRNA-seq Data |
Cong Li et.al. |
2412.18156 |
null |
2024-12-24 |
GeneSUM: Large Language Model-based Gene Summary Extraction |
Zhijian Chen et.al. |
2412.18154 |
null |
2024-12-24 |
CoAM: Corpus of All-Type Multiword Expressions |
Yusuke Ide et.al. |
2412.18151 |
null |
2024-12-24 |
EvalMuse-40K: A Reliable and Fine-Grained Benchmark with Comprehensive Human Annotations for Text-to-Image Generation Model Evaluation |
Shuhao Han et.al. |
2412.18150 |
link |
2024-12-24 |
Dense-Face: Personalized Face Generation Model via Dense Annotation Prediction |
Xiao Guo et.al. |
2412.18149 |
null |
2024-12-24 |
Ensuring Consistency for In-Image Translation |
Chengpeng Fu et.al. |
2412.18139 |
null |
2024-12-24 |
LSAQ: Layer-Specific Adaptive Quantization for Large Language Model Deployment |
Binrui Zeng et.al. |
2412.18135 |
null |
2024-12-24 |
VisionLLM-based Multimodal Fusion Network for Glottic Carcinoma Early Detection |
Zhaohui Jin et.al. |
2412.18124 |
null |
2024-12-24 |
AutoDroid-V2: Boosting SLM-based GUI Agents via Code Generation |
Hao Wen et.al. |
2412.18116 |
link |
2024-12-24 |
AIGT: AI Generative Table Based on Prompt |
Mingming Zhang et.al. |
2412.18111 |
null |
2024-12-24 |
SlimGPT: Layer-wise Structured Pruning for Large Language Models |
Gui Ling et.al. |
2412.18110 |
null |
2024-12-24 |
Unveiling Visual Perception in Language Models: An Attention Head Analysis Approach |
Jing Bi et.al. |
2412.18108 |
null |
2024-12-24 |
Tackling the Dynamicity in a Production LLM Serving System with SOTA Optimizations via Hybrid Prefill/Decode/Verify Scheduling on Efficient Meta-kernels |
Mingcong Song et.al. |
2412.18106 |
null |
2024-12-24 |
EvoPat: A Multi-LLM-based Patents Summarization and Analysis Agent |
Suyuan Wang et.al. |
2412.18100 |
null |
2024-12-24 |
Real-world Deployment and Evaluation of PErioperative AI CHatbot (PEACH) – a Large Language Model Chatbot for Perioperative Medicine |
Yu He Ke et.al. |
2412.18096 |
null |
2024-12-24 |
Molly: Making Large Language Model Agents Solve Python Problem More Logically |
Rui Xiao et.al. |
2412.18093 |
null |
2024-12-24 |
Generating Traffic Scenarios via In-Context Learning to Learn Better Motion Planner |
Aizierjiang Aiersilan et.al. |
2412.18086 |
link |
2024-12-24 |
Property Enhanced Instruction Tuning for Multi-task Molecule Generation with Large Language Models |
Xuan Lin et.al. |
2412.18084 |
link |
2024-12-24 |
Improving Factuality with Explicit Working Memory |
Mingda Chen et.al. |
2412.18069 |
null |
2024-12-24 |
LMRPA: Large Language Model-Driven Efficient Robotic Process Automation for OCR |
Osama Hosam Abdellatif et.al. |
2412.18063 |
link |
2024-12-24 |
Lla-VAP: LSTM Ensemble of Llama and VAP for Turn-Taking Prediction |
Hyunbae Jeon et.al. |
2412.18061 |
null |
2024-12-24 |
An Ensemble Approach to Short-form Video Quality Assessment Using Multimodal LLM |
Wen Wen et.al. |
2412.18060 |
null |
2024-12-23 |
Factuality or Fiction? Benchmarking Modern LLMs on Ambiguous QA with Citations |
Maya Patel et.al. |
2412.18051 |
null |
2024-12-23 |
AA-SGAN: Adversarially Augmented Social GAN with Synthetic Data |
Mirko Zaffaroni et.al. |
2412.18038 |
link |
2024-12-23 |
Generating refactored code accurately using reinforcement learning |
Indranil Palit et.al. |
2412.18035 |
null |
2024-12-23 |
More than Chit-Chat: Developing Robots for Small-Talk Interactions |
Rebecca Ramnauth et.al. |
2412.18023 |
null |
2024-12-23 |
Trustworthy and Efficient LLMs Meet Databases |
Kyoungmin Kim et.al. |
2412.18022 |
null |
2024-12-23 |
StructTest: Benchmarking LLMs’ Reasoning through Compositional Structured Outputs |
Hailin Chen et.al. |
2412.18011 |
null |
2024-12-23 |
CARL-GT: Evaluating Causal Reasoning Capabilities of Large Language Models |
Ruibo Tu et.al. |
2412.17970 |
link |
2024-12-23 |
LMV-RPA: Large Model Voting-based Robotic Process Automation |
Osama Abdellatif et.al. |
2412.17965 |
link |
2024-12-23 |
Dynamic Multi-Agent Orchestration and Retrieval for Multi-Source Question-Answer Systems using Large Language Models |
Antony Seabra et.al. |
2412.17964 |
null |
2024-12-23 |
Path-of-Thoughts: Extracting and Following Paths for Robust Relational Reasoning with Large Language Models |
Ge Zhang et.al. |
2412.17963 |
null |
2024-12-23 |
Contrato360 2.0: A Document and Database-Driven Question-Answer System using Large Language Models and Agents |
Antony Seabra et.al. |
2412.17942 |
null |
2024-12-23 |
BenCzechMark: A Czech-centric Multitask and Multimetric Benchmark for Large Language Models with Duel Scoring Mechanism |
Martin Fajcik et.al. |
2412.17933 |
null |
2024-12-23 |
Causal Composition Diffusion Model for Closed-loop Traffic Generation |
Haohong Lin et.al. |
2412.17920 |
null |
2024-12-23 |
Trading Devil RL: Backdoor attack via Stock market, Bayesian Optimization and Reinforcement Learning |
Orson Mengara et.al. |
2412.17908 |
null |
2024-12-23 |
LLM-Driven Feedback for Enhancing Conceptual Design Learning in Database Systems Courses |
Sara Riazi et.al. |
2412.17892 |
null |
2024-12-23 |
ChatGarment: Garment Estimation, Generation and Editing via Large Language Models |
Siyuan Bian et.al. |
2412.17811 |
null |
2024-12-23 |
Reconstructing People, Places, and Cameras |
Lea Müller et.al. |
2412.17806 |
link |
2024-12-23 |
Automating the Search for Artificial Life with Foundation Models |
Akarsh Kumar et.al. |
2412.17799 |
link |
2024-12-23 |
ResearchTown: Simulator of Human Research Community |
Haofei Yu et.al. |
2412.17767 |
link |
2024-12-23 |
ADC: Enhancing Function Calling Via Adversarial Datasets and Code Line-Level Feedback |
Wei Zhang et.al. |
2412.17754 |
null |
2024-12-23 |
Deliberation in Latent Space via Differentiable Cache Augmentation |
Luyang Liu et.al. |
2412.17747 |
null |
2024-12-23 |
YuLan-Mini: An Open Data-efficient Language Model |
Yiwen Hu et.al. |
2412.17743 |
link |
2024-12-23 |
**Reasoning to Attend: Try to Understand How Token Works** |
Rui Qian et.al. |
2412.17741 |
link |
2024-12-23 |
Knowledge Editing through Chain-of-Thought |
Changyue Wang et.al. |
2412.17727 |
link |
2024-12-23 |
Understanding the Logic of Direct Preference Alignment through Logic |
Kyle Richardson et.al. |
2412.17696 |
null |
2024-12-23 |
Large Language Model Safety: A Holistic Survey |
Dan Shi et.al. |
2412.17686 |
link |
2024-12-23 |
A Bias-Free Training Paradigm for More General AI-generated Image Detection |
Fabrizio Guillaro et.al. |
2412.17671 |
null |
2024-12-23 |
Generating Completions for Fragmented Broca’s Aphasic Sentences Using Large Language Models |
Sijbren van Vaals et.al. |
2412.17669 |
link |
2024-12-23 |
Detecting anxiety and depression in dialogues: a multi-label and explainable approach |
Francisco de Arriba-Pérez et.al. |
2412.17651 |
null |
2024-12-23 |
SCBench: A Sports Commentary Benchmark for Video LLMs |
Kuangzhi Ge et.al. |
2412.17637 |
null |
2024-12-23 |
ANID: How Far Are We? Evaluating the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance |
Renyang Liu et.al. |
2412.17632 |
link |
2024-12-23 |
Tracking the Feature Dynamics in LLM Training: A Mechanistic Study |
Yang Xu et.al. |
2412.17626 |
null |
2024-12-23 |
Be More Diverse than the Most Diverse: Online Selection of Diverse Mixtures of Generative Models |
Parham Rezaei et.al. |
2412.17622 |
link |
2024-12-23 |
Emerging Security Challenges of Large Language Models |
Herve Debar et.al. |
2412.17614 |
null |
2024-12-23 |
Towards Foundation Models on Graphs: An Analysis on Cross-Dataset Transfer of Pretrained GNNs |
Fabrizio Frasca et.al. |
2412.17609 |
null |
2024-12-23 |
EasyTime: Time Series Forecasting Made Easy |
Xiangfei Qiu et.al. |
2412.17603 |
null |
2024-12-23 |
LiveIdeaBench: Evaluating LLMs’ Scientific Creativity and Idea Generation with Minimal Context |
Kai Ruan et.al. |
2412.17596 |
link |
2024-12-23 |
Leveraging Memory Retrieval to Enhance LLM-based Generative Recommendation |
Chengbing Wang et.al. |
2412.17593 |
null |
2024-12-23 |
HumanVBench: Exploring Human-Centric Video Understanding Capabilities of MLLMs with Synthetic Benchmark Data |
Ting Zhou et.al. |
2412.17574 |
link |
2024-12-23 |
S-INF: Towards Realistic Indoor Scene Synthesis via Scene Implicit Neural Field |
Zixi Liang et.al. |
2412.17561 |
link |
2024-12-23 |
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference |
Chao Zeng et.al. |
2412.17560 |
null |
2024-12-23 |
A Survey of Query Optimization in Large Language Models |
Mingyang Song et.al. |
2412.17558 |
null |
2024-12-23 |
Resource-Aware Arabic LLM Creation: Model Adaptation, Integration, and Multi-Domain Testing |
Prakash Aryan et.al. |
2412.17548 |
link |
2024-12-23 |
Retention Score: Quantifying Jailbreak Risks for Vision Language Models |
Zaitang Li et.al. |
2412.17544 |
null |
2024-12-23 |
Constructing Fair Latent Space for Intersection of Fairness and Explainability |
Hyungjun Joo et.al. |
2412.17523 |
null |
2024-12-23 |
DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak |
Hao Wang et.al. |
2412.17522 |
null |
2024-12-23 |
Improving the Noise Estimation of Latent Neural Stochastic Differential Equations |
Linus Heck et.al. |
2412.17499 |
null |
2024-12-23 |
Is ChatGPT Massively Used by Students Nowadays? A Survey on the Use of Large Language Models such as ChatGPT in Educational Settings |
Jérémie Sublime et.al. |
2412.17486 |
null |
2024-12-23 |
Power- and Fragmentation-aware Online Scheduling for GPU Datacenters |
Francesco Lettich et.al. |
2412.17484 |
link |
2024-12-23 |
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression |
Chenlong Deng et.al. |
2412.17483 |
null |
2024-12-23 |
A Survey on Multi-Generative Agent System: Recent Advances and New Frontiers |
Shuaihang Chen et.al. |
2412.17481 |
link |
2024-12-23 |
CALLIC: Content Adaptive Learning for Lossless Image Compression |
Daxin Li et.al. |
2412.17464 |
null |
2024-12-23 |
Developmental Predictive Coding Model for Early Infancy Mono and Bilingual Vocal Continual Learning |
Xiaodan Chen et.al. |
2412.17456 |
null |
2024-12-23 |
Applying LLM and Topic Modelling in Psychotherapeutic Contexts |
Alexander Vanin et.al. |
2412.17449 |
null |
2024-12-23 |
Measuring Contextual Informativeness in Child-Directed Text |
Maria Valentini et.al. |
2412.17427 |
link |
2024-12-23 |
Multimodal Preference Data Synthetic Alignment with Reward Model |
Robert Wijaya et.al. |
2412.17417 |
link |
2024-12-23 |
VidCtx: Context-aware Video Question Answering with Image Models |
Andreas Goulas et.al. |
2412.17415 |
link |
2024-12-23 |
Just What You Desire: Constrained Timeline Summarization with Self-Reflection for Enhanced Relevance |
Muhammad Reza Qorib et.al. |
2412.17408 |
link |
2024-12-23 |
Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning |
Huchen Jiang et.al. |
2412.17397 |
null |
2024-12-23 |
WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models |
Huawen Feng et.al. |
2412.17395 |
null |
2024-12-23 |
Singular Value Scaling: Efficient Generative Model Compression via Pruned Weights Refinement |
Hyeonjin Kim et.al. |
2412.17387 |
link |
2024-12-23 |
Interweaving Memories of a Siamese Large Language Model |
Xin Song et.al. |
2412.17383 |
link |
2024-12-23 |
MineAgent: Towards Remote-Sensing Mineral Exploration with Multimodal Large Language Models |
Beibei Yu et.al. |
2412.17339 |
null |
2024-12-23 |
A Dual-Perspective Metaphor Detection Framework Using Large Language Models |
Yujie Lin et.al. |
2412.17332 |
link |
2024-12-23 |
Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance |
Nicolas Devatine et.al. |
2412.17321 |
null |
2024-12-23 |
CodeV: Issue Resolving with Visual Data |
Linhao Zhang et.al. |
2412.17315 |
link |
2024-12-23 |
Prompting in the Wild: An Empirical Study of Prompt Evolution in Software Repositories |
Mahan Tafreshipour et.al. |
2412.17298 |
null |
2024-12-23 |
Multi-Modal Grounded Planning and Efficient Replanning For Learning Embodied Agents with A Few Examples |
Taewoong Kim et.al. |
2412.17288 |
link |
2024-12-23 |
LLM4AD: A Platform for Algorithm Design with Large Language Model |
Fei Liu et.al. |
2412.17287 |
link |
2024-12-23 |
Enabling Time-series Foundation Model for Building Energy Forecasting via Contrastive Curriculum Learning |
Rui Liang et.al. |
2412.17285 |
null |
2024-12-23 |
Unlocking Cross-Lingual Sentiment Analysis through Emoji Interpretation: A Multimodal Generative AI Approach |
Rafid Ishrak Jahan et.al. |
2412.17255 |
link |
2024-12-23 |
SyNeg: LLM-Driven Synthetic Hard-Negatives for Dense Retrieval |
Xiaopeng Li et.al. |
2412.17250 |
null |
2024-12-23 |
EM-MIAs: Enhancing Membership Inference Attacks in Large Language Models through Ensemble Modeling |
Zichen Song et.al. |
2412.17249 |
null |
2024-12-23 |
On the Generalization Ability of Machine-Generated Text Detectors |
Yule Liu et.al. |
2412.17242 |
link |
2024-12-23 |
Brain-to-Text Benchmark ’24: Lessons Learned |
Francis R. Willett et.al. |
2412.17227 |
link |
2024-12-23 |
CharGen: High Accurate Character-Level Visual Text Generation Model with MultiModal Encoder |
Lichen Ma et.al. |
2412.17225 |
null |
2024-12-22 |
Better Think with Tables: Leveraging Tables to Enhance Large Language Model Comprehension |
Jio Oh et.al. |
2412.17189 |
null |
2024-12-22 |
Foundation Model for Lossy Compression of Spatiotemporal Scientific Data |
Xiao Li et.al. |
2412.17184 |
null |
2024-12-22 |
Enhancing Item Tokenization for Generative Recommendation through Self-Improvement |
Runjin Chen et.al. |
2412.17171 |
null |
2024-12-22 |
Generative Diffusion Modeling: A Practical Handbook |
Zihan Ding et.al. |
2412.17162 |
null |
2024-12-22 |
LLM-based relevance assessment still can’t replace human relevance assessment |
Charles L. A. Clarke et.al. |
2412.17156 |
null |
2024-12-22 |
LLM Agent for Fire Dynamics Simulations |
Leidong Xu et.al. |
2412.17146 |
null |
2024-12-22 |
Hate Speech Detection and Target Identification in Devanagari Languages via Parameter Efficient Fine-Tuning of LLMs |
Rushendra Sidibomma et.al. |
2412.17131 |
link |
2024-12-22 |
Lies, Damned Lies, and Distributional Language Statistics: Persuasion and Deception with Large Language Models |
Cameron R. Jones et.al. |
2412.17128 |
null |
2024-12-22 |
Learning to Adapt to Low-Resource Paraphrase Generation |
Zhigen Li et.al. |
2412.17111 |
null |
2024-12-22 |
DreamOmni: Unified Image Generation and Editing |
Bin Xia et.al. |
2412.17098 |
null |
2024-12-22 |
Analysis on LLMs Performance for Code Summarization |
Md. Ahnaf Akib et.al. |
2412.17094 |
null |
2024-12-22 |
SAIL: Sample-Centric In-Context Learning for Document Information Extraction |
Jinyu Zhang et.al. |
2412.17092 |
link |
2024-12-22 |
SubstationAI: Multimodal Large Model-Based Approaches for Analyzing Substation Equipment Faults |
Jinzhi Wang et.al. |
2412.17077 |
null |
2024-12-22 |
The HalluRAG Dataset: Detecting Closed-Domain Hallucinations in RAG Applications Using an LLM’s Internal States |
Fabian Ridder et.al. |
2412.17056 |
link |
2024-12-22 |
DR-Encoder: Encode Low-rank Gradients with Random Prior for Large Language Models Differentially Privately |
Huiwen Wu et.al. |
2412.17053 |
null |
2024-12-22 |
ViLBias: A Framework for Bias Detection using Linguistic and Visual Cues |
Shaina Raza et.al. |
2412.17052 |
link |
2024-12-22 |
Modular Conversational Agents for Surveys and Interviews |
Jiangbo Yu et.al. |
2412.17049 |
null |
2024-12-22 |
Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective |
Hankun Wang et.al. |
2412.17048 |
null |
2024-12-22 |
Adapting Image-to-Video Diffusion Models for Large-Motion Frame Interpolation |
Luoxu Jin et.al. |
2412.17042 |
null |
2024-12-22 |
HyperNet Fields: Efficiently Training Hypernetworks without Ground Truth by Learning Weight Trajectories |
Eric Hedlin et.al. |
2412.17040 |
null |
2024-12-22 |
Shadow-Frugal Expectation-Value-Sampling Variational Quantum Generative Model |
Kevin Shen et.al. |
2412.17039 |
null |
2024-12-22 |
Shaping the Safety Boundaries: Understanding and Defending Against Jailbreaks in Large Language Models |
Lang Gao et.al. |
2412.17034 |
null |
2024-12-22 |
MINTQA: A Multi-Hop Question Answering Benchmark for Evaluating LLMs on New and Tail Knowledge |
Jie He et.al. |
2412.17032 |
link |
2024-12-22 |
FriendsQA: A New Large-Scale Deep Video Understanding Dataset with Fine-grained Topic Categorization for Story Videos |
Zhengqian Wu et.al. |
2412.17022 |
link |
2024-12-22 |
GAS: Generative Auto-bidding with Post-training Search |
Yewen Li et.al. |
2412.17018 |
null |
2024-12-22 |
Robustness of Large Language Models Against Adversarial Attacks |
Yiyi Tao et.al. |
2412.17011 |
null |
2024-12-22 |
InterDance: Reactive 3D Dance Generation with Realistic Duet Interactions |
Ronghui Li et.al. |
2412.16982 |
null |
2024-12-22 |
On Fusing ChatGPT and Ensemble Learning in Discontinuous Named Entity Recognition in Health Corpora |
Tzu-Chieh Chen et.al. |
2412.16976 |
null |
2024-12-22 |
Cannot or Should Not? Automatic Analysis of Refusal Composition in IFT/RLHF Datasets and Refusal Behavior of Black-Box LLMs |
Alexander von Recum et.al. |
2412.16974 |
null |
2024-12-22 |
Multifaceted User Modeling in Recommendation: A Federated Foundation Models Approach |
Chunxu Zhang et.al. |
2412.16969 |
link |
2024-12-22 |
System-2 Mathematical Reasoning via Enriched Instruction Tuning |
Huanqia Cai et.al. |
2412.16964 |
null |
2024-12-22 |
Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework |
Jundong Xu et.al. |
2412.16953 |
null |
2024-12-22 |
A Career Interview Dialogue System using Large Language Model-based Dynamic Slot Generation |
Ekai Hashimoto et.al. |
2412.16943 |
null |
2024-12-22 |
Prompting Large Language Models with Rationale Heuristics for Knowledge-based Visual Question Answering |
Zhongjian Hu et.al. |
2412.16936 |
null |
2024-12-22 |
Towards a Unified Paradigm: Integrating Recommendation Systems as a New Language in Large Models |
Kai Zheng et.al. |
2412.16933 |
null |
2024-12-22 |
Enhancing Supply Chain Transparency in Emerging Economies Using Online Contents and LLMs |
Bohan Jin et.al. |
2412.16922 |
null |
2024-12-22 |
Detect Changes like Humans: Incorporating Semantic Priors for Improved Change Detection |
Yuhang Gan et.al. |
2412.16918 |
null |
2024-12-22 |
Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation |
Quan Dao et.al. |
2412.16906 |
null |
2024-12-22 |
Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model |
Songjun Tu et.al. |
2412.16878 |
link |
2024-12-20 |
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding |
Chenxin Tao et.al. |
2412.16158 |
null |
2024-12-20 |
Can Generative Video Models Help Pose Estimation? |
Ruojin Cai et.al. |
2412.16155 |
null |
2024-12-20 |
Offline Reinforcement Learning for LLM Multi-Step Reasoning |
Huaijie Wang et.al. |
2412.16145 |
link |
2024-12-20 |
Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation |
Seyedreza Mohseni et.al. |
2412.16135 |
null |
2024-12-20 |
Data-Driven Mechanism Design: Jointly Eliciting Preferences and Information |
Dirk Bergemann et.al. |
2412.16132 |
null |
2024-12-20 |
PromptOptMe: Error-Aware Prompt Compression for LLM-based MT Evaluation Metrics |
Daniil Larionov et.al. |
2412.16120 |
null |
2024-12-20 |
Deciphering the Underserved: Benchmarking LLM OCR for Low-Resource Scripts |
Muhammad Abdullah Sohail et.al. |
2412.16119 |
link |
2024-12-20 |
PruneVid: Visual Token Pruning for Efficient Video Large Language Models |
Xiaohu Huang et.al. |
2412.16117 |
link |
2024-12-20 |
The Content Moderator’s Dilemma: Removal of Toxic Content and Distortions to Online Discourse |
Mahyar Habibi et.al. |
2412.16114 |
null |
2024-12-20 |
Logical Consistency of Large Language Models in Fact-checking |
Bishwamittra Ghosh et.al. |
2412.16100 |
null |
2024-12-20 |
The Evolution of LLM Adoption in Industry Data Curation Practices |
Crystal Qian et.al. |
2412.16089 |
null |
2024-12-20 |
Efficient MedSAMs: Segment Anything in Medical Images on Laptop |
Jun Ma et.al. |
2412.16085 |
link |
2024-12-20 |
Formal Mathematical Reasoning: A New Frontier in AI |
Kaiyu Yang et.al. |
2412.16075 |
null |
2024-12-20 |
The Only Way is Ethics: A Guide to Ethical Research with Large Language Models |
Eddie L. Ungless et.al. |
2412.16022 |
link |
2024-12-20 |
Legommenders: A Comprehensive Content-Based Recommendation Library with LLM Support |
Qijiong Liu et.al. |
2412.15973 |
link |
2024-12-20 |
From General to Specific: Tailoring Large Language Models for Personalized Healthcare |
Ruize Shi et.al. |
2412.15957 |
null |
2024-12-20 |
Trust Calibration in IDEs: Paving the Way for Widespread Adoption of AI Refactoring |
Markus Borg et.al. |
2412.15948 |
null |
2024-12-20 |
Reframing Image Difference Captioning with BLIP2IDC and Synthetic Augmentation |
Gautier Evennou et.al. |
2412.15939 |
link |
2024-12-20 |
Large Language Model assisted Hybrid Fuzzing |
Ruijie Meng et.al. |
2412.15931 |
null |
2024-12-20 |
MiniGPT-Pancreas: Multimodal Large Language Model for Pancreas Cancer Classification and Detection |
Andrea Moglia et.al. |
2412.15925 |
link |
2024-12-20 |
RiTTA: Modeling Event Relations in Text-to-Audio Generation |
Yuhang He et.al. |
2412.15922 |
link |
2024-12-20 |
Less is More: Towards Green Code Large Language Models via Unified Structural Pruning |
Guang Yang et.al. |
2412.15921 |
null |
2024-12-20 |
Development of a Large-scale Dataset of Chest Computed Tomography Reports in Japanese and a High-performance Finding Classification Model |
Yosuke Yamagishi et.al. |
2412.15907 |
null |
2024-12-20 |
Evaluation of Reliability Criteria for News Publishers with Large Language Models |
Manuel Pratelli et.al. |
2412.15896 |
null |
2024-12-20 |
TelcoLM: collecting data, adapting, and benchmarking language models for the telecommunication domain |
Camille Barboule et.al. |
2412.15891 |
null |
2024-12-20 |
AI-in-the-loop: The future of biomedical visual analytics applications in the era of AI |
Katja Bühler et.al. |
2412.15876 |
null |
2024-12-20 |
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback |
Jiaming Ji et.al. |
2412.15838 |
link |
2024-12-20 |
WebLLM: A High-Performance In-Browser LLM Inference Engine |
Charlie F. Ruan et.al. |
2412.15803 |
link |
2024-12-20 |
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning |
Sungjin Park et.al. |
2412.15797 |
null |
2024-12-20 |
GraphSeqLM: A Unified Graph Language Framework for Omic Graph Learning |
Heming Zhang et.al. |
2412.15790 |
link |
2024-12-20 |
Linguistic Features Extracted by GPT-4 Improve Alzheimer’s Disease Detection based on Spontaneous Speech |
Jonathan Heitz et.al. |
2412.15772 |
link |
2024-12-20 |
Extracting Interpretable Task-Specific Circuits from Large Language Models for Faster Inference |
Jorge García-Carrasco et.al. |
2412.15750 |
link |
2024-12-20 |
Critique of Impure Reason: Unveiling the reasoning behaviour of medical Large Language Models |
Shamus Sim et.al. |
2412.15748 |
null |
2024-12-20 |
VORD: Visual Ordinal Calibration for Mitigating Object Hallucinations in Large Vision-Language Models |
Dexter Neo et.al. |
2412.15739 |
null |
2024-12-20 |
AutoLife: Automatic Life Journaling with Smartphones and LLMs |
Huatao Xu et.al. |
2412.15714 |
null |
2024-12-20 |
Contrastive Learning for Task-Independent SpeechLLM-Pretraining |
Maike Züfle et.al. |
2412.15712 |
link |
2024-12-20 |
Cracking the Code: Evaluating Zero-Shot Prompting Methods for Providing Programming Feedback |
Niklas Ippisch et.al. |
2412.15702 |
null |
2024-12-20 |
Code Review Automation Via Multi-task Federated LLM – An Empirical Study |
Jahnavi Kumar et.al. |
2412.15676 |
null |
2024-12-20 |
Adaptable and Precise: Enterprise-Scenario LLM Function-Calling Capability Training Pipeline |
Guancheng Zeng et.al. |
2412.15660 |
null |
2024-12-20 |
Synthetic Tabular Data Generation for Imbalanced Classification: The Surprising Effectiveness of an Overlap Class |
Annie D’souza et.al. |
2412.15657 |
link |
2024-12-20 |
MathSpeech: Leveraging Small LMs for Accurate Conversion in Mathematical Speech-to-Formula |
Sieun Hyeon et.al. |
2412.15655 |
link |
2024-12-20 |
Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution |
Wentao Tan et.al. |
2412.15650 |
link |
2024-12-20 |
Darkit: A User-Friendly Software Toolkit for Spiking Large Language Model |
Xin Du et.al. |
2412.15634 |
link |
2024-12-20 |
Can Input Attributions Interpret the Inductive Reasoning Process Elicited in In-Context Learning? |
Mengyu Ye et.al. |
2412.15628 |
null |
2024-12-20 |
JailPO: A Novel Black-box Jailbreak Framework via Preference Optimization against Aligned LLMs |
Hongyi Li et.al. |
2412.15623 |
null |
2024-12-20 |
Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage |
Zhi Gao et.al. |
2412.15606 |
null |
2024-12-20 |
Don’t Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks |
Brian J Chan et.al. |
2412.15605 |
link |
2024-12-20 |
Dynamic Label Name Refinement for Few-Shot Dialogue Intent Classification |
Gyutae Park et.al. |
2412.15603 |
null |
2024-12-20 |
Template-Driven LLM-Paraphrased Framework for Tabular Math Word Problem Generation |
Xiaoqiang Kang et.al. |
2412.15594 |
link |
2024-12-20 |
NeSyCoCo: A Neuro-Symbolic Concept Composer for Compositional Generalization |
Danial Kamali et.al. |
2412.15588 |
link |
2024-12-20 |
To Rely or Not to Rely? Evaluating Interventions for Appropriate Reliance on Large Language Models |
Jessica Y. Bo et.al. |
2412.15584 |
null |
2024-12-20 |
A Deep Probabilistic Framework for Continuous Time Dynamic Graph Generation |
Ryien Hosseini et.al. |
2412.15582 |
link |
2024-12-20 |
Score-based Generative Diffusion Models for Social Recommendations |
Chengyi Liu et.al. |
2412.15579 |
link |
2024-12-20 |
QUART-Online: Latency-Free Large Multimodal Language Model for Quadruped Robot Learning |
Xinyang Tong et.al. |
2412.15576 |
null |
2024-12-20 |
J-EDI QA: Benchmark for deep-sea organism-specific multimodal LLM |
Takero Yoshida et.al. |
2412.15574 |
null |
2024-12-20 |
Continual Learning Using a Kernel-Based Method Over Foundation Models |
Saleh Momeni et.al. |
2412.15571 |
link |
2024-12-20 |
DefFiller: Mask-Conditioned Diffusion for Salient Steel Surface Defect Generation |
Yichun Tai et.al. |
2412.15570 |
link |
2024-12-20 |
In-context Continual Learning Assisted by an External Continual Learner |
Saleh Momeni et.al. |
2412.15563 |
null |
2024-12-20 |
NGQA: A Nutritional Graph Question Answering Benchmark for Personalized Health-aware Nutritional Reasoning |
Zheyuan Zhang et.al. |
2412.15547 |
null |
2024-12-20 |
MRAG: A Modular Retrieval Framework for Time-Sensitive Question Answering |
Zhang Siyue et.al. |
2412.15540 |
null |
2024-12-20 |
XRAG: eXamining the Core – Benchmarking Foundational Components in Advanced Retrieval-Augmented Generation |
Qianren Mao et.al. |
2412.15529 |
link |
2024-12-20 |
HREF: Human Response-Guided Evaluation of Instruction Following in Language Models |
Xinxi Lyu et.al. |
2412.15524 |
link |
2024-12-20 |
PreNeT: Leveraging Computational Features to Predict Deep Neural Network Training Time |
Alireza Pourali et.al. |
2412.15519 |
link |
2024-12-20 |
Stylish and Functional: Guided Interpolation Subject to Physical Constraints |
Yan-Ying Chen et.al. |
2412.15507 |
null |
2024-12-20 |
Mitigating Social Bias in Large Language Models: A Multi-Objective Approach within a Multi-Agent Framework |
Zhenjie Xu et.al. |
2412.15504 |
link |
2024-12-20 |
Humanlike Cognitive Patterns as Emergent Phenomena in Large Language Models |
Zhisheng Tang et.al. |
2412.15501 |
null |
2024-12-20 |
TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use |
Junjie Ye et.al. |
2412.15495 |
link |
2024-12-20 |
PolySmart and VIREO @ TRECVid 2024 Ad-hoc Video Search |
Jiaxin Wu et.al. |
2412.15494 |
null |
2024-12-20 |
GCA-3D: Towards Generalized and Consistent Domain Adaptation of 3D Generators |
Hengjia Li et.al. |
2412.15491 |
null |
2024-12-20 |
Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage |
Saehyung Lee et.al. |
2412.15484 |
null |
2024-12-20 |
Continual Learning Using Only Large Language Model Prompting |
Jiabao Qiu et.al. |
2412.15479 |
null |
2024-12-19 |
TalkWithMachines: Enhancing Human-Robot Interaction for Interpretable Industrial Robotics Through Large/Vision Language Models |
Ammar N. Abbas et.al. |
2412.15462 |
null |
2024-12-19 |
Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization |
Sahil Wadhwa et.al. |
2412.15453 |
null |
2024-12-19 |
AI-Enhanced Sensemaking: Exploring the Design of a Generative AI-Based Assistant to Support Genetic Professionals |
Angela Mastrianni et.al. |
2412.15444 |
null |
2024-12-19 |
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval |
Aakash Mahalingam et.al. |
2412.15443 |
null |
2024-12-19 |
Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models |
Tianchen Zhang et.al. |
2412.15431 |
null |
2024-12-19 |
MoEtion: Efficient and Reliable Checkpointing for Mixture-of-Experts Models at Scale |
Swapnil Gandhi et.al. |
2412.15411 |
null |
2024-12-19 |
Deciphering Social Behaviour: a Novel Biological Approach For Social Users Classification |
Edoardo Allegrini et.al. |
2412.15410 |
null |
2024-12-19 |
Systematic Evaluation of Long-Context LLMs on Financial Concepts |
Lavanya Gupta et.al. |
2412.15386 |
null |
2024-12-19 |
Automatic Extraction of Metaphoric Analogies from Literary Texts: Task Formulation, Dataset Construction, and Evaluation |
Joanne Boisson et.al. |
2412.15375 |
link |
2024-12-19 |
Automated Root Cause Analysis System for Complex Data Products |
Mathieu Demarne et.al. |
2412.15374 |
null |
2024-12-19 |
Large Language Models on Small Resource-Constrained Systems: Performance Characterization, Analysis and Trade-offs |
Liam Seymour et.al. |
2412.15352 |
link |
2024-12-19 |
Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models |
Reza Shirkavand et.al. |
2412.15341 |
link |
2024-12-19 |
Complete background cosmology of parity-even quadratic metric-affine gravity |
Thomas Dyer et.al. |
2412.15329 |
null |
2024-12-19 |
OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving |
Shuo Xing et.al. |
2412.15208 |
link |
2024-12-19 |
MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark |
Qihao Zhao et.al. |
2412.15194 |
link |
2024-12-19 |
LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation |
Weijia Shi et.al. |
2412.15188 |
null |
2024-12-19 |
Tiled Diffusion |
Or Madar et.al. |
2412.15185 |
null |
2024-12-19 |
Data for Mathematical Copilots: Better Ways of Presenting Proofs for Machine Learning |
Simon Frieder et.al. |
2412.15184 |
null |
2024-12-19 |
STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning |
Marius Memmel et.al. |
2412.15182 |
null |
2024-12-19 |
HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages |
Aman Chaturvedi et.al. |
2412.15178 |
null |
2024-12-19 |
Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying |
Federico Castagna et.al. |
2412.15177 |
link |
2024-12-19 |
Rethinking Uncertainty Estimation in Natural Language Generation |
Lukas Aichberger et.al. |
2412.15176 |
null |
2024-12-19 |
Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM |
Yatai Ji et.al. |
2412.15156 |
link |
2024-12-19 |
Language Models as Continuous Self-Evolving Data Engineers |
Peidong Wang et.al. |
2412.15151 |
null |
2024-12-19 |
Jet: A Modern Transformer-Based Normalizing Flow |
Alexander Kolesnikov et.al. |
2412.15129 |
null |
2024-12-19 |
Adaptive Pruning for Large Language Models with Structural Importance Awareness |
Haotian Zheng et.al. |
2412.15127 |
null |
2024-12-19 |
Outcome-Refining Process Supervision for Code Generation |
Zhuohao Yu et.al. |
2412.15118 |
link |
2024-12-19 |
Qwen2.5 Technical Report |
Qwen et.al. |
2412.15115 |
link |
2024-12-19 |
Associative memory inspires improvements for in-context learning using a novel attention residual stream architecture |
Thomas F Burns et.al. |
2412.15113 |
link |
2024-12-19 |
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation |
Yang Tian et.al. |
2412.15109 |
link |
2024-12-19 |
Review-Then-Refine: A Dynamic Framework for Multi-Hop Question Answering with Temporal Adaptability |
Xiangsen Chen et.al. |
2412.15101 |
null |
2024-12-19 |
Nano-ESG: Extracting Corporate Sustainability Information from News Articles |
Fabian Billert et.al. |
2412.15093 |
link |
2024-12-19 |
Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation |
Haoran Liu et.al. |
2412.15086 |
null |
2024-12-19 |
ScamChatBot: An End-to-End Analysis of Fake Account Recovery on Social Media via Chatbots |
Bhupendra Acharya et.al. |
2412.15072 |
null |
2024-12-19 |
ConfliBERT: A Language Model for Political Conflict |
Patrick T. Brandt et.al. |
2412.15060 |
link |
2024-12-19 |
LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps |
Felix Friedrich et.al. |
2412.15035 |
null |
2024-12-19 |
DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space |
Mang Ning et.al. |
2412.15032 |
link |
2024-12-19 |
Large Language Models and Code Security: A Systematic Literature Review |
Enna Basic et.al. |
2412.15004 |
null |
2024-12-19 |
HSEvo: Elevating Automatic Heuristic Design with Diversity-Driven Harmony Search and Genetic Algorithm Using LLMs |
Pham Vu Tuan Dat et.al. |
2412.14995 |
link |
2024-12-19 |
RoboCup@Home 2024 OPL Winner NimbRo: Anthropomorphic Service Robots using Foundation Models for Perception and Planning |
Raphael Memmesheimer et.al. |
2412.14989 |
null |
2024-12-19 |
Chain-of-MetaWriting: Linguistic and Textual Analysis of How Small Language Models Write Young Students Texts |
Ioana Buhnila et.al. |
2412.14986 |
null |
2024-12-19 |
AI and Cultural Context: An Empirical Investigation of Large Language Models’ Performance on Chinese Social Work Professional Standards |
Zia Qi et.al. |
2412.14971 |
null |
2024-12-19 |
Movie2Story: A framework for understanding videos and telling stories in the form of novel text |
Kangning Li et.al. |
2412.14965 |
null |
2024-12-19 |
Knowledge Injection via Prompt Distillation |
Kalle Kujanpää et.al. |
2412.14964 |
null |
2024-12-19 |
Effective Method with Compression for Distributed and Federated Cocoercive Variational Inequalities |
Daniil Medyakov et.al. |
2412.14935 |
null |
2024-12-19 |
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response |
Junyu Luo et.al. |
2412.14922 |
link |
2024-12-19 |
Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation |
Zexiong Ma et.al. |
2412.14905 |
null |
2024-12-19 |
Multimodal Hypothetical Summary for Retrieval-based Multi-image Question Answering |
Peize Li et.al. |
2412.14880 |
null |
2024-12-19 |
Graph-Convolutional Networks: Named Entity Recognition and Large Language Model Embedding in Document Clustering |
Imed Keraghel et.al. |
2412.14867 |
null |
2024-12-19 |
Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling |
Junyi Li et.al. |
2412.14860 |
null |
2024-12-19 |
DS$^2$-ABSA: Dual-Stream Data Synthesis with Label Refinement for Few-Shot Aspect-Based Sentiment Analysis |
Hongling Xu et.al. |
2412.14849 |
link |
2024-12-19 |
Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas |
Pietro Bernardelle et.al. |
2412.14843 |
null |
2024-12-19 |
Helping LLMs Improve Code Generation Using Feedback from Testing and Static Analysis |
Greta Dolcetti et.al. |
2412.14841 |
null |
2024-12-19 |
Progressive Multimodal Reasoning via Active Retrieval |
Guanting Dong et.al. |
2412.14835 |
null |
2024-12-19 |
Answer Set Networks: Casting Answer Set Programming into Deep Learning |
Arseny Skryagin et.al. |
2412.14814 |
link |
2024-12-19 |
ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis |
Zeao Tu et.al. |
2412.14809 |
link |
2024-12-19 |
Disentangling Reasoning Tokens and Boilerplate Tokens For Language Model Fine-tuning |
Ziang Ye et.al. |
2412.14780 |
null |
2024-12-19 |
ALKAFI-LLAMA3: Fine-Tuning LLMs for Precise Legal Understanding in Palestine |
Rabee Qasem et.al. |
2412.14771 |
null |
2024-12-19 |
PsyDraw: A Multi-Agent Multimodal System for Mental Health Screening in Left-Behind Children |
Yiqun Zhang et.al. |
2412.14769 |
link |
2024-12-19 |
CodeRepoQA: A Large-scale Benchmark for Software Engineering Question Answering |
Ruida Hu et.al. |
2412.14764 |
link |
2024-12-19 |
Query pipeline optimization for cancer patient question answering systems |
Maolin He et.al. |
2412.14751 |
null |
2024-12-19 |
Active Inference and Human–Computer Interaction |
Roderick Murray-Smith et.al. |
2412.14741 |
null |
2024-12-19 |
On Verbalized Confidence Scores for LLMs |
Daniel Yang et.al. |
2412.14737 |
link |
2024-12-19 |
Creation of AI-driven Smart Spaces for Enhanced Indoor Environments – A Survey |
Aygün Varol et.al. |
2412.14708 |
null |
2024-12-19 |
LLMs as mediators: Can they diagnose conflicts accurately? |
Özgecan Koçak et.al. |
2412.14675 |
null |
2024-12-19 |
Analysis and Visualization of Linguistic Structures in Large Language Models: Neural Representations of Verb-Particle Constructions in BERT |
Hassane Kissane et.al. |
2412.14670 |
null |
2024-12-19 |
IOHunter: Graph Foundation Model to Uncover Online Information Operations |
Marco Minici et.al. |
2412.14663 |
link |
2024-12-19 |
Unveiling Uncertainty: A Deep Dive into Calibration and Performance of Multimodal Large Language Models |
Zijun Chen et.al. |
2412.14660 |
link |
2024-12-19 |
Length Controlled Generation for Black-box LLMs |
Yuxuan Gu et.al. |
2412.14656 |
null |
2024-12-19 |
Learning to Generate Research Idea with Dynamic Control |
Ruochen Li et.al. |
2412.14626 |
null |
2024-12-19 |
How good is GPT at writing political speeches for the White House? |
Jacques Savoy et.al. |
2412.14617 |
null |
2024-12-19 |
Beyond Guilt: Legal Judgment Prediction with Trichotomous Reasoning |
Kepu Zhang et.al. |
2412.14588 |
null |
2024-12-19 |
HiCM$^2$: Hierarchical Compact Memory Modeling for Dense Video Captioning |
Minkuk Kim et.al. |
2412.14585 |
null |
2024-12-19 |
Simulation-Free Hierarchical Latent Policy Planning for Proactive Dialogues |
Tao He et.al. |
2412.14584 |
null |
2024-12-19 |
CORD: Balancing COnsistency and Rank Distillation for Robust Retrieval-Augmented Generation |
Youngwon Lee et.al. |
2412.14581 |
null |
2024-12-19 |
DiffSim: Taming Diffusion Models for Evaluating Visual Similarity |
Yiren Song et.al. |
2412.14580 |
link |
2024-12-19 |
Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models |
Wenhan Liu et.al. |
2412.14574 |
link |
2024-12-19 |
ScaMo: Exploring the Scaling Law in Autoregressive Motion Generation Model |
Shunlin Lu et.al. |
2412.14559 |
null |
2024-12-19 |
The Current Challenges of Software Engineering in the Era of Large Language Models |
Cuiyun Gao et.al. |
2412.14554 |
null |
2024-12-19 |
Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models |
Xiao Cui et.al. |
2412.14528 |
link |
2024-12-19 |
Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment |
Teng Xiao et.al. |
2412.14516 |
link |
2024-12-19 |
Relational Programming with Foundation Models |
Ziyang Li et.al. |
2412.14515 |
null |
2024-12-19 |
PA-RAG: RAG Alignment via Multi-Perspective Preference Optimization |
Jiayi Wu et.al. |
2412.14510 |
link |
2024-12-19 |
Do Large Language Models Defend Inferentialist Semantics?: On the Logical Expressivism and Anti-Representationalism of LLMs |
Yuzuki Arai et.al. |
2412.14501 |
null |
2024-12-19 |
Guided Diffusion Model for Sensor Data Obfuscation |
Xin Yang et.al. |
2412.14499 |
null |
2024-12-19 |
FaultExplainer: Leveraging Large Language Models for Interpretable Fault Detection and Diagnosis |
Abdullah Khan et.al. |
2412.14492 |
link |
2024-12-19 |
Moving Beyond LDA: A Comparison of Unsupervised Topic Modelling Techniques for Qualitative Data Analysis of Online Communities |
Amandeep Kaur et.al. |
2412.14486 |
null |
2024-12-19 |
DirectorLLM for Human-Centric Video Generation |
Kunpeng Song et.al. |
2412.14484 |
null |
2024-12-19 |
Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs |
Koshiro Saito et.al. |
2412.14471 |
null |
2024-12-19 |
Agent-SafetyBench: Evaluating the Safety of LLM Agents |
Zhexin Zhang et.al. |
2412.14470 |
link |
2024-12-19 |
From Human Annotation to LLMs: SILICON Annotation Workflow for Management Research |
Xiang Cheng et.al. |
2412.14461 |
null |
2024-12-19 |
LEDiff: Latent Exposure Diffusion for HDR Generation |
Chao Wang et.al. |
2412.14456 |
null |
2024-12-19 |
Are Longer Prompts Always Better? Prompt Selection in Large Language Models for Recommendation Systems |
Genki Kusano et.al. |
2412.14454 |
null |
2024-12-19 |
Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation |
Shengqi Liu et.al. |
2412.14453 |
null |
2024-12-19 |
ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study |
Eric Modesitt et.al. |
2412.14436 |
link |
2024-12-19 |
All-in-One Tuning and Structural Pruning for Domain-Specific LLMs |
Lei Lu et.al. |
2412.14426 |
null |
2024-12-19 |
FedPIA – Permuting and Integrating Adapters leveraging Wasserstein Barycenters for Finetuning Foundation Models in Multi-Modal Federated Learning |
Pramit Saha et.al. |
2412.14424 |
null |
2024-12-19 |
Enhancing Diffusion Models for High-Quality Image Generation |
Jaineet Shah et.al. |
2412.14422 |
null |
2024-12-18 |
ChainRank-DPO: Chain Rank Direct Preference Optimization for LLM Rankers |
Haowei Liu et.al. |
2412.14405 |
null |
2024-12-18 |
Clinical Trials Ontology Engineering with Large Language Models |
Berkan Çakır et.al. |
2412.14387 |
null |
2024-12-18 |
ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling |
William Han et.al. |
2412.14373 |
link |
2024-12-18 |
Memorization Over Reasoning? Exposing and Mitigating Verbatim Memorization in Large Language Models’ Character Understanding Evaluation |
Yuxuan Jiang et.al. |
2412.14368 |
null |
2024-12-18 |
Surrealistic-like Image Generation with Vision-Language Models |
Elif Ayten et.al. |
2412.14366 |
link |
2024-12-18 |
ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals |
Utkarsh Saxena et.al. |
2412.14363 |
link |
2024-12-18 |
A Unifying Information-theoretic Perspective on Evaluating Generative Models |
Alexis Fox et.al. |
2412.14340 |
null |
2024-12-18 |
Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation |
Benjamin Steenhoek et.al. |
2412.14308 |
null |
2024-12-18 |
Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs |
David Restrepo et.al. |
2412.14304 |
null |
2024-12-18 |
Fake News Detection: Comparative Evaluation of BERT-like Models and Large Language Models with Generative AI-Annotated Data |
Shaina Raza et.al. |
2412.14276 |
link |
2024-12-18 |
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces |
Jihan Yang et.al. |
2412.14171 |
link |
2024-12-18 |
MetaMorph: Multimodal Understanding and Generation via Instruction Tuning |
Shengbang Tong et.al. |
2412.14164 |
null |
2024-12-18 |
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks |
Frank F. Xu et.al. |
2412.14161 |
link |
2024-12-18 |
Advanced Reasoning and Transformation Engine for Multi-Step Insight Synthesis in Data Analytics with Large Language Models |
Atin Sakkeer Hussain et.al. |
2412.14146 |
null |
2024-12-18 |
LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research |
Tianyang Gu et.al. |
2412.14141 |
null |