2024-12-10 |
Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences |
Alan Nawzad Amin et.al. |
2412.07763 |
link |
2024-12-10 |
SAT: Spatial Aptitude Training for Multimodal Language Models |
Arijit Ray et.al. |
2412.07755 |
null |
2024-12-10 |
LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation Models |
Ziqi Lu et.al. |
2412.07746 |
null |
2024-12-10 |
Zero-Shot ATC Coding with Large Language Models for Clinical Assessments |
Zijian Chen et.al. |
2412.07743 |
null |
2024-12-10 |
AI Expands Scientists’ Impact but Contracts Science’s Focus |
Qianyue Hao et.al. |
2412.07727 |
null |
2024-12-10 |
Granite Guardian |
Inkit Padhi et.al. |
2412.07724 |
link |
2024-12-10 |
Leveraging Content and Context Cues for Low-Light Image Enhancement |
Igor Morawski et.al. |
2412.07693 |
null |
2024-12-10 |
DriveMM: All-in-One Large Multimodal Model for Autonomous Driving |
Zhijian Huang et.al. |
2412.07689 |
link |
2024-12-10 |
Privacy-Preserving Customer Support: A Framework for Secure and Scalable Interactions |
Anant Prakash Awasthi et.al. |
2412.07687 |
null |
2024-12-10 |
TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation |
Alfredo Garrachón Ruiz et.al. |
2412.07682 |
null |
2024-12-10 |
RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models |
Greg Heinrich et.al. |
2412.07679 |
null |
2024-12-10 |
Ask Humans or AI? Exploring Their Roles in Visualization Troubleshooting |
Shuyu Shen et.al. |
2412.07673 |
null |
2024-12-10 |
FlexLLM: Exploring LLM Customization for Moving Target Defense on Black-Box LLMs Against Jailbreak Attacks |
Bocheng Chen et.al. |
2412.07672 |
null |
2024-12-10 |
Automating Business Intelligence Requirements with Generative AI and Semantic Search |
Nimrod Busany et.al. |
2412.07668 |
null |
2024-12-10 |
Searching for Structure: Investigating Emergent Communication with Large Language Models |
Tom Kouwenhoven et.al. |
2412.07646 |
null |
2024-12-10 |
TrojanWhisper: Evaluating Pre-trained LLMs to Detect and Localize Hardware Trojans |
Md Omar Faruque et.al. |
2412.07636 |
null |
2024-12-10 |
ChocoLlama: Lessons Learned From Teaching Llamas Dutch |
Matthieu Meeus et.al. |
2412.07633 |
null |
2024-12-10 |
Piece of Table: A Divide-and-Conquer Approach for Selecting Sub-Tables in Table Question Answering |
Wonjin Lee et.al. |
2412.07629 |
null |
2024-12-10 |
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations |
Linke Ouyang et.al. |
2412.07626 |
link |
2024-12-10 |
DRUM: Learning Demonstration Retriever for Large MUlti-modal Models |
Ellen Yi-Ge et.al. |
2412.07619 |
null |
2024-12-09 |
Delve into Visual Contrastive Decoding for Hallucination Mitigation of Large Vision-Language Models |
Yi-Lun Lee et.al. |
2412.06775 |
link |
2024-12-09 |
Visual Lexicon: Rich Image Features in Language Space |
XuDong Wang et.al. |
2412.06774 |
null |
2024-12-09 |
Training Large Language Models to Reason in a Continuous Latent Space |
Shibo Hao et.al. |
2412.06769 |
null |
2024-12-09 |
Ranking-aware adapter for text-driven image ordering with CLIP |
Wei-Hsiang Yu et.al. |
2412.06760 |
link |
2024-12-09 |
Why Do Developers Engage with ChatGPT in Issue-Tracker? Investigating Usage and Reliance on ChatGPT-Generated Code |
Joy Krishan Das et.al. |
2412.06757 |
null |
2024-12-09 |
Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models |
Neel Jain et.al. |
2412.06748 |
null |
2024-12-09 |
ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities |
Adhiraj Ghosh et.al. |
2412.06745 |
null |
2024-12-09 |
JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM |
Takuro Fujii et.al. |
2412.06738 |
null |
2024-12-09 |
AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and Benchmark |
Lan Li et.al. |
2412.06724 |
null |
2024-12-09 |
How to Merge Your Multimodal Models Over Time? |
Sebastian Dziadzio et.al. |
2412.06712 |
null |
2024-12-09 |
OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions |
Yi-Kai Zhang et.al. |
2412.06693 |
null |
2024-12-09 |
Exploring Critical Testing Scenarios for Decision-Making Policies: An LLM Approach |
Weichao Xu et.al. |
2412.06684 |
null |
2024-12-09 |
Toward LLM-Agent-Based Modeling of Transportation Systems: A Conceptual Framework |
Tianming Liu et.al. |
2412.06681 |
null |
2024-12-09 |
I Don’t Know: Explicit Modeling of Uncertainty with an [IDK] Token |
Roi Cohen et.al. |
2412.06676 |
null |
2024-12-09 |
ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance |
Chunwei Wang et.al. |
2412.06673 |
null |
2024-12-09 |
MuMu-LLaMA: Multi-modal Music Understanding and Generation via Large Language Models |
Shansong Liu et.al. |
2412.06660 |
null |
2024-12-09 |
Chatbots im Schulunterricht: Wir testen das Fobizz-Tool zur automatischen Bewertung von Hausaufgaben (Chatbots in the Classroom: We Test the Fobizz Tool for Automatic Grading of Homework) |
Rainer Mühlhoff et.al. |
2412.06651 |
null |
2024-12-09 |
The Narrow Gate: Localized Image-Text Communication in Vision-Language Models |
Alessandro Serra et.al. |
2412.06646 |
null |
2024-12-09 |
MAVias: Mitigate any Visual Bias |
Ioannis Sarridis et.al. |
2412.06632 |
null |
2024-12-09 |
Copyright-Protected Language Generation via Adaptive Model Fusion |
Javier Abad et.al. |
2412.06619 |
link |
2024-12-06 |
Birth and Death of a Rose |
Chen Geng et.al. |
2412.05278 |
null |
2024-12-06 |
Sparse autoencoders reveal selective remapping of visual concepts during adaptation |
Hyesu Lim et.al. |
2412.05276 |
link |
2024-12-06 |
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling |
Zhe Chen et.al. |
2412.05271 |
null |
2024-12-06 |
APOLLO: SGD-like Memory, AdamW-level Performance |
Hanqing Zhu et.al. |
2412.05270 |
null |
2024-12-06 |
Uncertainty Quantification for Transformer Models for Dark-Pattern Detection |
Javier Muñoz et.al. |
2412.05251 |
null |
2024-12-06 |
Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization |
Luca Masserano et.al. |
2412.05244 |
null |
2024-12-06 |
CompCap: Improving Multimodal Large Language Models with Composite Captions |
Xiaohui Chen et.al. |
2412.05243 |
null |
2024-12-06 |
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale |
Jarvis Guo et.al. |
2412.05237 |
null |
2024-12-06 |
BEExformer: A Fast Inferencing Transformer Architecture via Binarization with Multiple Early Exits |
Wazib Ansar et.al. |
2412.05225 |
null |
2024-12-06 |
100% Hallucination Elimination Using Acurai |
Michael C. Wood et.al. |
2412.05223 |
null |
2024-12-06 |
Evaluating and Aligning CodeLLMs on Human Preference |
Jian Yang et.al. |
2412.05210 |
null |
2024-12-06 |
A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges |
Aditi Singh et.al. |
2412.05208 |
null |
2024-12-06 |
Are Frontier Large Language Models Suitable for Q&A in Science Centres? |
Jacob Watson et.al. |
2412.05200 |
null |
2024-12-06 |
SurgBox: Agent-Driven Operating Room Sandbox with Surgery Copilot |
Jinlin Wu et.al. |
2412.05187 |
link |
2024-12-06 |
LinVT: Empower Your Image-level Large Language Model to Understand Videos |
Lishuai Gao et.al. |
2412.05185 |
link |
2024-12-06 |
QueEn: A Large Language Model for Quechua-English Translation |
Junhao Chen et.al. |
2412.05184 |
null |
2024-12-06 |
Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models |
Kuofeng Gao et.al. |
2412.05167 |
null |
2024-12-06 |
Enhancing Cross-Language Code Translation via Task-Specific Embedding Alignment in Retrieval-Augmented Generation |
Manish Bhattarai et.al. |
2412.05159 |
null |
2024-12-06 |
Multimodal Fact-Checking with Vision Language Models: A Probing Classifier based Solution with Embedding Strategies |
Recep Firat Cekinel et.al. |
2412.05155 |
null |
2024-12-06 |
A text-to-tabular approach to generate synthetic patient data using LLMs |
Margaux Tornqvist et.al. |
2412.05153 |
null |
2024-12-05 |
Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail |
Luca Bartolomei et.al. |
2412.04472 |
link |
2024-12-05 |
NVILA: Efficient Frontier Visual Language Models |
Zhijian Liu et.al. |
2412.04468 |
null |
2024-12-05 |
VisionZip: Longer is Better but Not Necessary in Vision Language Models |
Senqiao Yang et.al. |
2412.04467 |
link |
2024-12-05 |
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection |
Enshen Zhou et.al. |
2412.04455 |
null |
2024-12-05 |
p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay |
Jun Zhang et.al. |
2412.04449 |
link |
2024-12-05 |
EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios |
Lu Qiu et.al. |
2412.04447 |
null |
2024-12-05 |
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models |
Yizhuo Li et.al. |
2412.04446 |
null |
2024-12-05 |
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation |
Yi Chen et.al. |
2412.04445 |
null |
2024-12-05 |
Towards Real-Time Open-Vocabulary Video Instance Segmentation |
Bin Yan et.al. |
2412.04434 |
null |
2024-12-05 |
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation |
Yuying Ge et.al. |
2412.04432 |
link |
2024-12-05 |
Grounding Descriptions in Images informs Zero-Shot Visual Recognition |
Shaunak Halbe et.al. |
2412.04429 |
link |
2024-12-05 |
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion |
Jiuhai Chen et.al. |
2412.04424 |
link |
2024-12-05 |
Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation |
Xuying Li et.al. |
2412.04415 |
null |
2024-12-05 |
Establishing Task Scaling Laws via Compute-Efficient Model Ladders |
Akshita Bhagia et.al. |
2412.04403 |
null |
2024-12-05 |
SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding |
Rong Li et.al. |
2412.04383 |
null |
2024-12-05 |
Discriminative Fine-tuning of LVLMs |
Yassine Ouali et.al. |
2412.04378 |
null |
2024-12-05 |
Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting |
Edoardo Cetin et.al. |
2412.04368 |
null |
2024-12-05 |
Approximate Top-$k$ for Increased Parallelism |
Oscar Key et.al. |
2412.04358 |
null |
2024-12-05 |
Retrieval-Augmented Machine Translation with Unstructured Knowledge |
Jiaan Wang et.al. |
2412.04342 |
link |
2024-12-05 |
Liquid: Language Models are Scalable Multi-modal Generators |
Junfeng Wu et.al. |
2412.04332 |
null |
2024-12-04 |
From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents |
Xinyi Mou et.al. |
2412.03563 |
link |
2024-12-04 |
FLAIR: VLM with Fine-grained Language-informed Image Representations |
Rui Xiao et.al. |
2412.03561 |
link |
2024-12-04 |
Best-of-N Jailbreaking |
John Hughes et.al. |
2412.03556 |
link |
2024-12-04 |
PaliGemma 2: A Family of Versatile VLMs for Transfer |
Andreas Steiner et.al. |
2412.03555 |
null |
2024-12-04 |
SPICE: Smart Projection Interface for Cooking Enhancement |
Vera Prohaska et.al. |
2412.03551 |
null |
2024-12-04 |
Perception Tokens Enhance Visual Reasoning in Multimodal Language Models |
Mahtab Bigverdi et.al. |
2412.03548 |
null |
2024-12-04 |
Evaluating Gender Bias Transfer between Pre-trained and Prompt-Adapted Language Models |
Natalie Mackraz et.al. |
2412.03537 |
null |
2024-12-04 |
A Review on Scientific Knowledge Extraction using Large Language Models in Biomedical Sciences |
Gabriel Lino Garcia et.al. |
2412.03531 |
null |
2024-12-04 |
FANAL – Financial Activity News Alerting Language Modeling Framework |
Urjitkumar Patel et.al. |
2412.03527 |
null |
2024-12-04 |
You’re (Not) My Type – Can LLMs Generate Feedback of Specific Types for Introductory Programming Tasks? |
Dominic Lohr et.al. |
2412.03516 |
null |
2024-12-04 |
Distillation of Diffusion Features for Semantic Correspondence |
Frank Fundel et.al. |
2412.03512 |
null |
2024-12-04 |
Tight PAC-Bayesian Risk Certificates for Contrastive Learning |
Anna van Elst et.al. |
2412.03486 |
link |
2024-12-04 |
Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning |
Neale Ratzlaff et.al. |
2412.03467 |
null |
2024-12-04 |
Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks |
Dario Serez et.al. |
2412.03453 |
link |
2024-12-04 |
From Words to Workflows: Automating Business Processes |
Laura Minkova et.al. |
2412.03446 |
null |
2024-12-04 |
Assessing Foundation Models’ Transferability to Physiological Signals in Precision Medicine |
Matthias Christenson et.al. |
2412.03427 |
null |
2024-12-04 |
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation |
Ao Wang et.al. |
2412.03409 |
link |
2024-12-04 |
RedStone: Curating General, Code, Math, and QA Data for Large Language Models |
Yaoyao Chang et.al. |
2412.03398 |
null |
2024-12-04 |
Enhancing Supply Chain Visibility with Generative AI: An Exploratory Case Study on Relationship Prediction in Knowledge Graphs |
Ge Zheng et.al. |
2412.03390 |
null |
2024-12-04 |
WiS Platform: Enhancing Evaluation of LLM-Based Multi-Agent Systems Through Game-Based Analysis |
Chengwei Hu et.al. |
2412.03359 |
null |
2024-12-03 |
T-REG: Preference Optimization with Token-Level Reward Regularization |
Wenxuan Zhou et.al. |
2412.02685 |
null |
2024-12-03 |
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models |
Yuda Song et.al. |
2412.02674 |
null |
2024-12-03 |
LLM-Enhanced Path Planning: Safe and Efficient Autonomous Navigation with Instructional Inputs |
Pranav Doma et.al. |
2412.02655 |
null |
2024-12-03 |
Time-Reversal Provides Unsupervised Feedback to LLMs |
Yerram Varun et.al. |
2412.02626 |
null |
2024-12-03 |
Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions |
Kai Sun et.al. |
2412.02621 |
null |
2024-12-03 |
Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback |
Hiroki Furuta et.al. |
2412.02617 |
null |
2024-12-03 |
GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot |
Aohan Zeng et.al. |
2412.02612 |
link |
2024-12-03 |
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? |
Kaixiong Gong et.al. |
2412.02611 |
null |
2024-12-03 |
Interpretable Company Similarity with Sparse Autoencoders |
Marco Molinari et.al. |
2412.02605 |
null |
2024-12-03 |
CEGI: Measuring the trade-off between efficiency and carbon emissions for SLMs and VLMs |
Abhas Kumar et.al. |
2412.02602 |
null |
2024-12-03 |
PrefixLLM: LLM-aided Prefix Circuit Design |
Weihua Xiao et.al. |
2412.02594 |
null |
2024-12-03 |
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation |
Junyuan Zhang et.al. |
2412.02592 |
link |
2024-12-03 |
Explainable CTR Prediction via LLM Reasoning |
Xiaohan Yu et.al. |
2412.02588 |
null |
2024-12-03 |
Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey |
Chenyang Liu et.al. |
2412.02573 |
link |
2024-12-03 |
SJTU: Spatial judgments in multimodal models towards unified segmentation through coordinate detection |
Joongwon Chae et.al. |
2412.02565 |
link |
2024-12-03 |
Semantic Tokens in Retrieval Augmented Generation |
Joel Suro et.al. |
2412.02563 |
null |
2024-12-03 |
Patent-CR: A Dataset for Patent Claim Revision |
Lekang Jiang et.al. |
2412.02549 |
null |
2024-12-03 |
Multimodal Remote Sensing Scene Classification Using VLMs and Dual-Cross Attention Networks |
Jinjin Cai et.al. |
2412.02531 |
null |
2024-12-03 |
LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data |
Hanyu Zhang et.al. |
2412.02525 |
null |
2024-12-03 |
OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations |
Caixin Kang et.al. |
2412.02479 |
null |
2024-12-02 |
T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs |
Shukang Yin et.al. |
2411.19951 |
link |
2024-12-02 |
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM’s Reasoning Capability |
Zicheng Lin et.al. |
2411.19943 |
null |
2024-11-29 |
VLSBench: Unveiling Visual Leakage in Multimodal Safety |
Xuhao Hu et.al. |
2411.19939 |
null |
2024-11-29 |
On Domain-Specific Post-Training for Multimodal Large Language Models |
Daixuan Cheng et.al. |
2411.19930 |
null |
2024-11-29 |
SIMS: Simulating Human-Scene Interactions with Real World Script Planning |
Wenjia Wang et.al. |
2411.19921 |
null |
2024-11-29 |
FlowCLAS: Enhancing Normalizing Flow Via Contrastive Learning For Anomaly Segmentation |
Chang Won Lee et.al. |
2411.19888 |
null |
2024-11-29 |
PDDLFuse: A Tool for Generating Diverse Planning Domains |
Vedant Khandelwal et.al. |
2411.19886 |
null |
2024-12-02 |
LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states |
Luis Ibanez-Lissen et.al. |
2411.19876 |
null |
2024-11-29 |
DeMo: Decoupled Momentum Optimization |
Bowen Peng et.al. |
2411.19870 |
link |
2024-11-29 |
AIDetx: a compression-based method for identification of machine-learning generated text |
Leonardo Almeida et.al. |
2411.19869 |
link |
2024-11-29 |
Reverse Thinking Makes LLMs Stronger Reasoners |
Justin Chih-Yao Chen et.al. |
2411.19865 |
null |
2024-11-29 |
Cross-Domain Recommendation Meets Large Language Models |
Ajay Krishna Vajjala et.al. |
2411.19862 |
link |
2024-11-29 |
What fifty-one years of Linguistics and Artificial Intelligence research tell us about their correlation: A scientometric review |
Mohammed Q. Shormani et.al. |
2411.19858 |
null |
2024-11-29 |
Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation |
Dimosthenis Antypas et.al. |
2411.19832 |
null |
2024-11-29 |
Advanced System Integration: Analyzing OpenAPI Chunking for Retrieval-Augmented Generation |
Robin D. Pesl et.al. |
2411.19804 |
null |
2024-11-29 |
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge |
Angelika Romanou et.al. |
2411.19799 |
null |
2024-11-29 |
MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks |
Yiming Wu et.al. |
2411.19786 |
null |
2024-11-29 |
PerLA: Perceptive 3D Language Assistant |
Guofeng Mei et.al. |
2411.19774 |
null |
2024-11-29 |
LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos |
Tiantian Geng et.al. |
2411.19772 |
null |
2024-11-29 |
Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models |
Kaican Li et.al. |
2411.19757 |
link |
2024-11-27 |
Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation |
Yueru Jia et.al. |
2411.18623 |
null |
2024-11-27 |
Cross-modal Information Flow in Multimodal Large Language Models |
Zhi Zhang et.al. |
2411.18620 |
null |
2024-11-27 |
Diffusion Self-Distillation for Zero-Shot Customized Image Generation |
Shengqu Cai et.al. |
2411.18616 |
null |
2024-11-27 |
Automated Literature Review Using NLP Techniques and LLM-Based Retrieval-Augmented Generation |
Nurshat Fateh Ali et.al. |
2411.18583 |
null |
2024-11-27 |
Challenges in Adapting Multilingual LLMs to Low-Resource Languages using LoRA PEFT Tuning |
Omkar Khade et.al. |
2411.18571 |
null |
2024-11-27 |
A Pipeline of Neural-Symbolic Integration to Enhance Spatial Reasoning in Large Language Models |
Rong Wang et.al. |
2411.18564 |
null |
2024-11-27 |
DexDiffuser: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation |
Zhixuan Liang et.al. |
2411.18562 |
null |
2024-11-27 |
Retrofitting (Large) Language Models with Dynamic Tokenization |
Darius Feher et.al. |
2411.18553 |
null |
2024-11-27 |
AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans |
Dillon Loh et.al. |
2411.18539 |
link |
2024-11-27 |
Emergence of Self-Identity in AI: A Mathematical Framework and Empirical Study with Generative Large Language Models |
Minhyeok Lee et.al. |
2411.18530 |
link |
2024-11-27 |
LLM-ABBA: Understand time series via symbolic approximation |
Erin Carson et.al. |
2411.18506 |
null |
2024-11-27 |
GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation |
Pengfei Zhou et.al. |
2411.18499 |
null |
2024-11-27 |
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS |
Jinyang Wu et.al. |
2411.18478 |
null |
2024-11-27 |
Draft Model Knows When to Stop: A Self-Verification Length Policy for Speculative Decoding |
Ziyin Zhang et.al. |
2411.18462 |
link |
2024-11-27 |
Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator |
Frederic Kirstein et.al. |
2411.18444 |
null |
2024-11-27 |
An AI-Assisted Multi-Agent Dual Dialogue System to Support Mental Health Care Providers |
Onno P. Kampman et.al. |
2411.18429 |
null |
2024-11-27 |
FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving |
Ao Shen et.al. |
2411.18424 |
null |
2024-11-27 |
Politicians vs ChatGPT. A study of presuppositions in French and Italian political communication |
Davide Garassino et.al. |
2411.18403 |
null |
2024-11-27 |
Topic Modeling and Sentiment Analysis on Japanese Online Media’s Coverage of Nuclear Energy |
Yifan Sun et.al. |
2411.18383 |
null |
2024-11-27 |
ChatGPT as speechwriter for the French presidents |
Dominique Labbé et.al. |
2411.18382 |
null |
2024-11-26 |
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats |
Jiaxin Wen et.al. |
2411.17693 |
null |
2024-11-26 |
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens |
Xu Ouyang et.al. |
2411.17691 |
null |
2024-11-26 |
Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration |
Yuhang Han et.al. |
2411.17686 |
null |
2024-11-26 |
Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning |
Zhu Xu et.al. |
2411.17679 |
link |
2024-11-26 |
Instance-Aware Graph Prompt Learning |
Jiazheng Li et.al. |
2411.17676 |
null |
2024-11-26 |
Push the Limit of Multi-modal Emotion Recognition by Prompting LLMs with Receptive-Field-Aware Attention Weighting |
Liyun Zhang et.al. |
2411.17674 |
null |
2024-11-26 |
SketchAgent: Language-Driven Sequential Sketch Generation |
Yael Vinker et.al. |
2411.17673 |
null |
2024-11-26 |
Synthetic Data Generation with LLM for Improved Depression Prediction |
Andrea Kang et.al. |
2411.17672 |
null |
2024-11-26 |
How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations |
Hyunji Lee et.al. |
2411.17666 |
null |
2024-11-26 |
Toward High-Performance LLM Serving: A Simulation-Based Approach for Identifying Optimal Parallelism |
Yi-Chien Lin et.al. |
2411.17651 |
null |
2024-11-26 |
On Limitations of LLM as Annotator for Low Resource Languages |
Suramya Jadhav et.al. |
2411.17637 |
null |
2024-11-26 |
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation |
Harsh Singh et.al. |
2411.17636 |
null |
2024-11-26 |
Data-driven development of cycle prediction models for lithium metal batteries using multi modal mining |
Jaewoong Lee et.al. |
2411.17625 |
null |
2024-11-26 |
Scaling Speech-Text Pre-training with Synthetic Interleaved Data |
Aohan Zeng et.al. |
2411.17607 |
null |
2024-11-26 |
HyperSeg: Towards Universal Visual Segmentation with Large Language Model |
Cong Wei et.al. |
2411.17606 |
link |
2024-11-26 |
Making History Readable |
Bipasha Banerjee et.al. |
2411.17600 |
null |
2024-11-26 |
Agentic AI for Improving Precision in Identifying Contributions to Sustainable Development Goals |
William A. Ingram et.al. |
2411.17598 |
null |
2024-11-26 |
Can artificial intelligence predict clinical trial outcomes? |
Shuyi Jin et.al. |
2411.17595 |
null |
2024-11-26 |
RTL-Breaker: Assessing the Security of LLMs against Backdoor Attacks on HDL Code Generation |
Lakshmi Likhitha Mankali et.al. |
2411.17569 |
null |
2024-11-26 |
Natural Language Understanding and Inference with MLLM in Visual Question Answering: A Survey |
Jiayi Kuang et.al. |
2411.17558 |
null |
2024-11-25 |
Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts? |
Sohee Yang et.al. |
2411.16679 |
null |
2024-11-25 |
Diffusion Features for Zero-Shot 6DoF Object Pose Estimation |
Bernd Von Gimborn et.al. |
2411.16668 |
null |
2024-11-25 |
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation |
Zun Wang et.al. |
2411.16657 |
null |
2024-11-25 |
Self-Generated Critiques Boost Reward Modeling for Language Models |
Yue Yu et.al. |
2411.16646 |
null |
2024-11-25 |
Preventing Jailbreak Prompts as Malicious Tools for Cybercriminals: A Cyber Defense Perspective |
Jean Marie Tshimula et.al. |
2411.16642 |
null |
2024-11-25 |
StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training |
Kaustubh Ponkshe et.al. |
2411.16618 |
null |
2024-11-25 |
Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models |
Ronghuan Wu et.al. |
2411.16602 |
null |
2024-11-25 |
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge |
Dawei Li et.al. |
2411.16594 |
link |
2024-11-25 |
Large Language Model-based Decision-making for COLREGs and the Control of Autonomous Surface Vehicles |
Klinsmann Agyei et.al. |
2411.16587 |
null |
2024-11-25 |
MarketGPT: Developing a Pre-trained transformer (GPT) for Modeling Financial Time Series |
Aaron Wheeler et.al. |
2411.16585 |
link |
2024-11-25 |
Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision |
Zhiheng Xi et.al. |
2411.16579 |
null |
2024-11-25 |
Predictive Power of LLMs in Financial Markets |
Jerick Shi et.al. |
2411.16569 |
null |
2024-11-25 |
EnStack: An Ensemble Stacking Framework of Large Language Models for Enhanced Vulnerability Detection in Source Code |
Shahriyar Zaman Ridoy et.al. |
2411.16561 |
null |
2024-11-25 |
Generating Out-Of-Distribution Scenarios Using Language Models |
Erfan Aasi et.al. |
2411.16554 |
null |
2024-11-25 |
Representation Collapsing Problems in Vector Quantization |
Wenhao Zhao et.al. |
2411.16550 |
null |
2024-11-25 |
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics |
Chan Hee Song et.al. |
2411.16537 |
null |
2024-11-25 |
Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings |
Carolin M. Schuster et.al. |
2411.16527 |
null |
2024-11-25 |
Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency |
Jerry Yao-Chieh Hu et.al. |
2411.16525 |
null |
2024-11-25 |
LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation |
Steven Song et.al. |
2411.16523 |
null |
2024-11-25 |
Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis |
Boming Miao et.al. |
2411.16503 |
null |
2024-11-22 |
Measuring Bullshit in the Language Games played by ChatGPT |
Alessandro Trevisan et.al. |
2411.15129 |
null |
2024-11-22 |
Health AI Developer Foundations |
Atilla P. Kiraly et.al. |
2411.15128 |
null |
2024-11-22 |
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training |
Nathan Lambert et.al. |
2411.15124 |
link |
2024-11-22 |
RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts |
Hjalmar Wijk et.al. |
2411.15114 |
link |
2024-11-22 |
Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion |
Samarth N Ramesh et.al. |
2411.15113 |
null |
2024-11-22 |
AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution |
Fengyuan Liu et.al. |
2411.15102 |
link |
2024-11-22 |
What You See is Not What You Get: Neural Partial Differential Equations and The Illusion of Learning |
Arvind Mohan et.al. |
2411.15101 |
null |
2024-11-22 |
XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models |
Yixin Dong et.al. |
2411.15100 |
null |
2024-11-22 |
Context-Aware Multimodal Pretraining |
Karsten Roth et.al. |
2411.15099 |
null |
2024-11-22 |
mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA |
Tao Zhang et.al. |
2411.15041 |
null |
2024-11-22 |
One to rule them all: natural language to bind communication, perception and action |
Simone Colombani et.al. |
2411.15033 |
null |
2024-11-22 |
Time is on my sight: scene graph filtering for dynamic environment perception in an LLM-driven robot |
Simone Colombani et.al. |
2411.15027 |
null |
2024-11-22 |
DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models |
Keda Tao et.al. |
2411.15024 |
null |
2024-11-22 |
FTA generation using GenAI with an Autonomy sensor Usecase |
Sneha Sudhir Shetiya et.al. |
2411.15007 |
null |
2024-11-22 |
ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data |
Junhong Shen et.al. |
2411.15004 |
link |
2024-11-22 |
Generative AI may backfire for counterspeech |
Dominik Bär et.al. |
2411.14986 |
null |
2024-11-22 |
Exploring Foundation Models Fine-Tuning for Cytology Classification |
Manon Dausort et.al. |
2411.14975 |
link |
2024-11-22 |
Open-Amp: Synthetic Data Framework for Audio Effect Foundation Models |
Alec Wright et.al. |
2411.14972 |
link |
2024-11-22 |
SwissADT: An Audio Description Translation System for Swiss Languages |
Lukas Fischer et.al. |
2411.14967 |
null |
2024-11-22 |
LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement |
Jieming Bian et.al. |
2411.14961 |
null |
2024-11-21 |
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models |
Yuhao Dong et.al. |
2411.14432 |
link |
2024-11-21 |
Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation |
Zhuoman Liu et.al. |
2411.14423 |
null |
2024-11-21 |
From RNNs to Foundation Models: An Empirical Study on Commercial Building Energy Consumption |
Shourya Bose et.al. |
2411.14421 |
null |
2024-11-21 |
Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding |
Yiming Zhang et.al. |
2411.14401 |
null |
2024-11-21 |
Lightweight Safety Guardrails Using Fine-tuned BERT Embeddings |
Aaron Zheng et.al. |
2411.14398 |
null |
2024-11-21 |
UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages |
Bethel Melesse Tessema et.al. |
2411.14343 |
link |
2024-11-21 |
SplatR: Experience Goal Visual Rearrangement with 3D Gaussian Splatting and Dense Feature Matching |
Arjun P S et.al. |
2411.14322 |
null |
2024-11-21 |
Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training |
Zheheng Luo et.al. |
2411.14318 |
null |
2024-11-21 |
Automated Generation of Code Debugging Exercises |
Victor-Alexandru Pădurean et.al. |
2411.14303 |
null |
2024-11-21 |
Auto-SPICE: Leveraging LLMs for Dataset Creation via Automated SPICE Netlist Extraction from Analog Circuit Diagrams |
Jitendra Bhandari et.al. |
2411.14299 |
link |
2024-11-21 |
EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild |
Yumeng Liu et.al. |
2411.14280 |
null |
2024-11-21 |
Looking Beyond Text: Reducing Language bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance |
Haozhe Zhao et.al. |
2411.14279 |
null |
2024-11-21 |
Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models |
Iacopo Ghinassi et.al. |
2411.14272 |
link |
2024-11-21 |
Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective |
Ernests Lavrinovics et.al. |
2411.14258 |
null |
2024-11-21 |
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models |
Javier Ferrando et.al. |
2411.14257 |
null |
2024-11-21 |
Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs |
Zeyu Dong et.al. |
2411.14256 |
null |
2024-11-21 |
Intent-Aware Dialogue Generation and Multi-Task Contrastive Learning for Multi-Turn Intent Classification |
Junhua Liu et.al. |
2411.14252 |
null |
2024-11-21 |
Natural Language Reinforcement Learning |
Xidong Feng et.al. |
2411.14251 |
null |
2024-11-21 |
FocusLLaVA: A Coarse-to-Fine Approach for Efficient and Effective Visual Token Compression |
Yuke Zhu et.al. |
2411.14228 |
null |
2024-11-21 |
Towards Context-Rich Automated Biodiversity Assessments: Deriving AI-Powered Insights from Camera Trap Data |
Paul Fergus et.al. |
2411.14219 |
null |
2024-11-20 |
Find Any Part in 3D |
Ziqi Ma et.al. |
2411.13550 |
null |
2024-11-20 |
SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs |
Shirley Kokane et.al. |
2411.13547 |
null |
2024-11-20 |
Promoting User Data Autonomy During the Dissolution of a Monopolistic Firm |
Rushabh Solanki et.al. |
2411.13546 |
null |
2024-11-20 |
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games |
Davide Paglieri et.al. |
2411.13543 |
null |
2024-11-20 |
Metacognition for Unknown Situations and Environments (MUSE) |
Rodolfo Valiente et.al. |
2411.13537 |
null |
2024-11-20 |
Predictive Insights into LGBTQ+ Minority Stress: A Transductive Exploration of Social Media Discourse |
S. Chapagain et.al. |
2411.13534 |
link |
2024-11-20 |
Advancing Complex Medical Communication in Arabic with Sporo AraSum: Surpassing Existing Large Language Models |
Chanseo Lee et.al. |
2411.13518 |
null |
2024-11-20 |
Disentangling Memory and Reasoning Ability in Large Language Models |
Mingyu Jin et.al. |
2411.13504 |
link |
2024-11-20 |
Neural machine translation of seismic waves for petrophysical inversion |
José Cunha Teixeira et.al. |
2411.13491 |
null |
2024-11-20 |
Utilizing Large Language Models to Synthesize Product Desirability Datasets |
John D. Hastings et.al. |
2411.13485 |
null |
2024-11-20 |
PatentEdits: Framing Patent Novelty as Textual Entailment |
Ryan Lee et.al. |
2411.13477 |
null |
2024-11-20 |
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training |
Haonan Wang et.al. |
2411.13476 |
link |
2024-11-20 |
SoK: A Systems Perspective on Compound AI Threats and Countermeasures |
Sarbartha Banerjee et.al. |
2411.13459 |
null |
2024-11-20 |
LIMBA: An Open-Source Framework for the Preservation and Valorization of Low-Resource Languages using Generative Models |
Salvatore Mario Carta et.al. |
2411.13453 |
null |
2024-11-20 |
AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations |
Gaurav Verma et.al. |
2411.13451 |
null |
2024-11-20 |
WaterPark: A Robustness Assessment of Language Model Watermarking |
Jiacheng Liang et.al. |
2411.13425 |
link |
2024-11-20 |
Unleashing the Power of Large Language Models for Group POI Recommendations |
Jing Long et.al. |
2411.13415 |
null |
2024-11-20 |
A Survey On Enhancing Reinforcement Learning in Complex Environments: Insights from Human and LLM Feedback |
Alireza Rashidi Laleh et.al. |
2411.13410 |
null |
2024-11-20 |
Unification of Balti and trans-border sister dialects in the essence of LLMs and AI Technology |
Muhammad Sharif et.al. |
2411.13409 |
null |
2024-11-20 |
Transformer-Based Contextualized Language Models Joint with Neural Networks for Natural Language Inference in Vietnamese |
Dat Van-Thanh Nguyen et.al. |
2411.13407 |
null |
2024-11-19 |
ACING: Actor-Critic for Instruction Learning in Black-Box Large Language Models |
Salma Kharrat et.al. |
2411.12736 |
link |
2024-11-19 |
Information Theory of Meaningful Communication |
Doron Sivan et.al. |
2411.12728 |
null |
2024-11-19 |
CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs |
Zhehan Kan et.al. |
2411.12713 |
null |
2024-11-19 |
Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs |
Ahmed Akib Jawad Karim et.al. |
2411.12712 |
null |
2024-11-19 |
Strengthening Fake News Detection: Leveraging SVM and Sophisticated Text Vectorization Techniques. Defying BERT? |
Ahmed Akib Jawad Karim et.al. |
2411.12703 |
null |
2024-11-19 |
When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations |
Huaizhi Ge et.al. |
2411.12701 |
null |
2024-11-19 |
SparseInfer: Training-free Prediction of Activation Sparsity for Fast LLM Inference |
Jiho Shin et.al. |
2411.12692 |
null |
2024-11-19 |
Neurosymbolic Graph Enrichment for Grounded World Models |
Stefano De Giorgis et.al. |
2411.12671 |
null |
2024-11-19 |
DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models |
Vinay Kumar Sankarapu et.al. |
2411.12643 |
link |
2024-11-19 |
Improving Controllability and Editability for Pretrained Text-to-Music Generation Models |
Yixiao Zhang et.al. |
2411.12641 |
null |
2024-11-19 |
Provable unlearning in topic modeling and downstream tasks |
Stanley Wei et.al. |
2411.12600 |
null |
2024-11-19 |
AdaCM$^2$: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction |
Yuanbin Man et.al. |
2411.12593 |
null |
2024-11-19 |
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models |
Laura Ruis et.al. |
2411.12580 |
link |
2024-11-19 |
Large Language Models for Combinatorial Optimization of Design Structure Matrix |
Shuo Jiang et.al. |
2411.12571 |
null |
2024-11-19 |
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues |
Riccardo Grazzi et.al. |
2411.12537 |
link |
2024-11-19 |
Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution |
Yang Zou et.al. |
2411.12530 |
link |
2024-11-19 |
Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus |
Terufumi Morishita et.al. |
2411.12498 |
link |
2024-11-19 |
AI Flow at the Network Edge |
Jiawei Shao et.al. |
2411.12469 |
null |
2024-11-19 |
Guide-to-Explain for Controllable Summarization |
Sangwon Ryu et.al. |
2411.12460 |
null |
2024-11-19 |
Neon: News Entity-Interaction Extraction for Enhanced Question Answering |
Sneha Singhania et.al. |
2411.12449 |
null |
2024-11-18 |
Bi-Mamba: Towards Accurate 1-Bit State Space Models |
Shengkun Tang et.al. |
2411.11843 |
null |
2024-11-18 |
Tackling prediction tasks in relational databases with LLMs |
Marek Wydmuch et.al. |
2411.11829 |
null |
2024-11-18 |
Exploring adversarial robustness of JPEG AI: methodology, comparison and new methods |
Egor Kovalev et.al. |
2411.11795 |
null |
2024-11-18 |
LLM-IE: A Python Package for Generative Information Extraction with Large Language Models |
Enshuo Hsu et.al. |
2411.11779 |
null |
2024-11-18 |
sMoRe: Enhancing Object Manipulation and Organization in Mixed Reality Spaces with LLMs and Generative AI |
Yunhao Xing et.al. |
2411.11752 |
null |
2024-11-18 |
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration |
Yuzong Chen et.al. |
2411.11745 |
link |
2024-11-18 |
Moral Persuasion in Large Language Models: Evaluating Susceptibility and Ethical Alignment |
Allison Huang et.al. |
2411.11731 |
link |
2024-11-18 |
Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation |
Mingchao Qi et.al. |
2411.11714 |
link |
2024-11-18 |
FedCoLLM: A Parameter-Efficient Federated Co-tuning Framework for Large and Small Language Models |
Tao Fan et.al. |
2411.11707 |
null |
2024-11-18 |
MC-LLaVA: Multi-Concept Personalized Vision-Language Model |
Ruichuan An et.al. |
2411.11706 |
link |
2024-11-18 |
Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search |
Jinhao Jiang et.al. |
2411.11694 |
null |
2024-11-18 |
TrojanRobot: Backdoor Attacks Against Robotic Manipulation in the Physical World |
Xianlong Wang et.al. |
2411.11683 |
null |
2024-11-18 |
PSPO*: An Effective Process-supervised Policy Optimization for Reasoning Alignment |
Jiawei Li et.al. |
2411.11681 |
link |
2024-11-18 |
Dissecting Misalignment of Multimodal Large Language Models via Influence Function |
Lijie Hu et.al. |
2411.11667 |
null |
2024-11-18 |
TSINR: Capturing Temporal Continuity via Implicit Neural Representations for Time Series Anomaly Detection |
Mengxuan Li et.al. |
2411.11641 |
link |
2024-11-18 |
Chapter 7 Review of Data-Driven Generative AI Models for Knowledge Extraction from Scientific Literature in Healthcare |
Leon Kopitar et.al. |
2411.11635 |
null |
2024-11-18 |
Signaling and Social Learning in Swarms of Robots |
Leo Cazenille et.al. |
2411.11616 |
null |
2024-11-18 |
Leveraging Computational Pathology AI for Noninvasive Optical Imaging Analysis Without Retraining |
Danny Barash et.al. |
2411.11613 |
null |
2024-11-18 |
VLN-Game: Vision-Language Equilibrium Search for Zero-Shot Semantic Navigation |
Bangguo Yu et.al. |
2411.11609 |
null |
2024-11-18 |
Exploring LLMs for Verifying Technical System Specifications Against Requirements |
Lasse M. Reinpold et.al. |
2411.11582 |
null |
2024-11-15 |
VeriGraph: Scene Graphs for Execution Verifiable Robot Planning |
Daniel Ekpo et.al. |
2411.10446 |
null |
2024-11-15 |
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization |
Weiyun Wang et.al. |
2411.10442 |
null |
2024-11-15 |
LLaVA-o1: Let Vision Language Models Reason Step-by-Step |
Guowei Xu et.al. |
2411.10440 |
link |
2024-11-15 |
MARS: Unleashing the Power of Variance Reduction for Training Large Models |
Huizhuo Yuan et.al. |
2411.10438 |
link |
2024-11-15 |
Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization |
Yuhan Fu et.al. |
2411.10436 |
null |
2024-11-15 |
Evaluating Creativity and Deception in Large Language Models: A Simulation Framework for Multi-Agent Balderdash |
Parsa Hejabi et.al. |
2411.10422 |
link |
2024-11-15 |
On the Foundation Model for Cardiac MRI Reconstruction |
Chi Zhang et.al. |
2411.10403 |
null |
2024-11-15 |
Interactive Cycle Model – The Linkage Combination among Automatic Speech Recognition, Large Language Models and Smart Glasses |
Libo Wang et.al. |
2411.10362 |
null |
2024-11-15 |
Bias Unveiled: Investigating Social Bias in LLM-Generated Code |
Lin Ling et.al. |
2411.10351 |
null |
2024-11-15 |
Y-MAP-Net: Real-time depth, normals, segmentation, multi-label captioning and 2D human pose in RGB images |
Ammar Qammaz et.al. |
2411.10334 |
null |
2024-11-15 |
Number it: Temporal Grounding Videos like Flipping Manga |
Yongliang Wu et.al. |
2411.10332 |
link |
2024-11-15 |
Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting |
Ziqi Xie et.al. |
2411.10309 |
link |
2024-11-15 |
Static network structure cannot stabilize cooperation among Large Language Model agents |
Jin Han et.al. |
2411.10294 |
null |
2024-11-15 |
Scaling Law for Post-training after Model Pruning |
Xiaodong Chen et.al. |
2411.10272 |
null |
2024-11-15 |
Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning |
Jingru Yang et.al. |
2411.10252 |
null |
2024-11-15 |
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models |
Michael Aerni et.al. |
2411.10242 |
null |
2024-11-15 |
Generative AI in Multimodal User Interfaces: Trends, Challenges, and Cross-Platform Adaptability |
J. Bieniek et.al. |
2411.10234 |
null |
2024-11-15 |
An Empirical Study on LLM-based Agents for Automated Bug Fixing |
Xiangxin Meng et.al. |
2411.10213 |
null |
2024-11-15 |
Agentic LLMs in the Supply Chain: Towards Autonomous Multi-Agent Consensus-Seeking |
Valeria Jannelli et.al. |
2411.10184 |
null |
2024-11-15 |
CART: Compositional Auto-Regressive Transformer for Image Generation |
Siddharth Roheda et.al. |
2411.10180 |
null |
2024-11-14 |
MagicQuill: An Intelligent Interactive Image Editing System |
Zichen Liu et.al. |
2411.09703 |
null |
2024-11-14 |
Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models |
Wei Wang et.al. |
2411.09691 |
null |
2024-11-14 |
Squeezed Attention: Accelerating Long Context Length LLM Inference |
Coleman Hooper et.al. |
2411.09688 |
link |
2024-11-14 |
Adaptive Decoding via Latent Preference Optimization |
Shehzaad Dhuliawala et.al. |
2411.09661 |
null |
2024-11-14 |
On the Limits of Language Generation: Trade-Offs Between Hallucination and Mode Collapse |
Alkis Kalavasis et.al. |
2411.09642 |
null |
2024-11-14 |
Local deployment of large-scale music AI models on commodity hardware |
Xun Zhou et.al. |
2411.09625 |
null |
2024-11-14 |
PTR: Precision-Driven Tool Recommendation for Large Language Models |
Hang Gao et.al. |
2411.09613 |
null |
2024-11-14 |
The Moral Foundations Weibo Corpus |
Renjie Cao et.al. |
2411.09612 |
null |
2024-11-14 |
Initial Nugget Evaluation Results for the TREC 2024 RAG Track with the AutoNuggetizer Framework |
Ronak Pradeep et.al. |
2411.09607 |
null |
2024-11-14 |
Accelerating Knowledge Graph and Ontology Engineering with Large Language Models |
Cogan Shimizu et.al. |
2411.09601 |
null |
2024-11-14 |
Assessing the Performance of the DINOv2 Self-supervised Learning Vision Transformer Model for the Segmentation of the Left Atrium from MRI Images |
Bipasha Kundu et.al. |
2411.09598 |
null |
2024-11-14 |
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models |
Zhengyi Wang et.al. |
2411.09595 |
null |
2024-11-14 |
Adopting RAG for LLM-Aided Future Vehicle Design |
Vahid Zolfaghari et.al. |
2411.09590 |
null |
2024-11-14 |
BabyLM Challenge: Exploring the Effect of Variation Sets on Language Model Training Efficiency |
Akari Haga et.al. |
2411.09587 |
null |
2024-11-14 |
Software Performance Engineering for Foundation Model-Powered Software (FMware) |
Haoxiang Zhang et.al. |
2411.09580 |
null |
2024-11-14 |
Piecing It All Together: Verifying Multi-Hop Multimodal Claims |
Haoran Wang et.al. |
2411.09547 |
null |
2024-11-14 |
A Practical Guide to Fine-tuning Language Models with Limited Data |
Márton Szép et.al. |
2411.09539 |
null |
2024-11-14 |
Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents |
Yuyou Gan et.al. |
2411.09523 |
null |
2024-11-14 |
Communication Compression for Tensor Parallel LLM Inference |
Jan Hansen-Palmus et.al. |
2411.09510 |
null |
2024-11-14 |
Spider: Any-to-Many Multimodal LLM |
Jinxiang Lai et.al. |
2411.09439 |
null |
2024-11-13 |
Large Wireless Model (LWM): A Foundation Model for Wireless Channels |
Sadjad Alikhani et.al. |
2411.08872 |
link |
2024-11-13 |
The Limited Impact of Medical Adaptation of Large Language and Vision-Language Models |
Daniel P. Jeong et.al. |
2411.08870 |
link |
2024-11-13 |
CamemBERT 2.0: A Smarter French Language Model Aged to Perfection |
Wissam Antoun et.al. |
2411.08868 |
null |
2024-11-13 |
LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs |
Piyush Jha et.al. |
2411.08862 |
null |
2024-11-13 |
Multimodal Instruction Tuning with Hybrid State Space Models |
Jianing Zhou et.al. |
2411.08840 |
null |
2024-11-13 |
FinRobot: AI Agent for Equity Research and Valuation with Large Language Models |
Tianyu Zhou et.al. |
2411.08804 |
link |
2024-11-13 |
Evaluating World Models with LLM for Decision Making |
Chang Yang et.al. |
2411.08794 |
null |
2024-11-13 |
Can sparse autoencoders be used to decompose and interpret steering vectors? |
Harry Mayne et.al. |
2411.08790 |
link |
2024-11-13 |
Sharingan: Extract User Action Sequence from Desktop Recordings |
Yanting Chen et.al. |
2411.08768 |
null |
2024-11-13 |
Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers |
Clément Dumas et.al. |
2411.08745 |
link |
2024-11-13 |
A Comparative Study of Discrete Speech Tokens for Semantic-Related Tasks with Large Language Models |
Dingdong Wang et.al. |
2411.08742 |
null |
2024-11-13 |
Dynamic Rewarding with Prompt Optimization Enables Tuning-free Self-Alignment of Language Models |
Somanshu Singla et.al. |
2411.08733 |
link |
2024-11-13 |
Polymetis: Large Language Modeling for Multiple Material Domains |
Chao Huang et.al. |
2411.08728 |
null |
2024-11-13 |
Voxeland: Probabilistic Instance-Aware Semantic Mapping with Evidence-based Uncertainty Quantification |
Jose-Luis Matez-Bandera et.al. |
2411.08727 |
link |
2024-11-13 |
Theoretical Analysis of Byte-Pair Encoding |
László Kozma et.al. |
2411.08671 |
null |
2024-11-13 |
OSMLoc: Single Image-Based Visual Localization in OpenStreetMap with Geometric and Semantic Guidances |
Youqi Liao et.al. |
2411.08665 |
link |
2024-11-13 |
UniMat: Unifying Materials Embeddings through Multi-modal Learning |
Janghoon Ock et.al. |
2411.08664 |
null |
2024-11-13 |
Accelerating Quasi-Static Time Series Simulations with Foundation Models |
Alban Puech et.al. |
2411.08652 |
null |
2024-11-13 |
A System Level Performance Evaluation for Superconducting Digital Systems |
Joyjit Kundu et.al. |
2411.08645 |
null |
2024-11-13 |
Towards Secure Intelligent O-RAN Architecture: Vulnerabilities, Threats and Promising Technical Solutions using LLMs |
Mojdeh Karbalaee Motalleb et.al. |
2411.08640 |
null |
2024-11-12 |
Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data |
Juanhui Li et.al. |
2411.08028 |
null |
2024-11-12 |
LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models |
Anoop Cherian et.al. |
2411.08027 |
null |
2024-11-12 |
Language Models as Causal Effect Generators |
Lucius E. J. Bynum et.al. |
2411.08019 |
link |
2024-11-12 |
ExpressivityArena: Can LLMs Express Information Implicitly? |
Joshua Tint et.al. |
2411.08010 |
null |
2024-11-12 |
Can adversarial attacks by large language models be attributed? |
Manuel Cebrian et.al. |
2411.08003 |
null |
2024-11-12 |
Derivational Morphology Reveals Analogical Generalization in Large Language Models |
Valentin Hofmann et.al. |
2411.07990 |
null |
2024-11-12 |
JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation |
Yiyang Ma et.al. |
2411.07975 |
link |
2024-11-12 |
From General to Specific: Utilizing General Hallucination to Automatically Measure the Role Relationship Fidelity for Specific Role-Play Agents |
Chuyi Kong et.al. |
2411.07965 |
null |
2024-11-12 |
Towards Low-bit Communication for Tensor Parallel LLM Inference |
Harry Dong et.al. |
2411.07942 |
null |
2024-11-12 |
Leveraging Multimodal Models for Enhanced Neuroimaging Diagnostics in Alzheimer’s Disease |
Francesco Chiumento et.al. |
2411.07871 |
null |
2024-11-12 |
Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders |
Xiaofeng Zhu et.al. |
2411.07870 |
null |
2024-11-12 |
Verbosity $\neq$ Veracity: Demystify Verbosity Compensation Behavior of Large Language Models |
Yusen Zhang et.al. |
2411.07858 |
link |
2024-11-12 |
Tucano: Advancing Neural Text Generation for Portuguese |
Nicholas Kluge Corrêa et.al. |
2411.07854 |
link |
2024-11-12 |
NL-SLAM for OC-VLN: Natural Language Grounded SLAM for Object-Centric VLN |
Sonia Raychaudhuri et.al. |
2411.07848 |
null |
2024-11-12 |
Chain Association-based Attacking and Shielding Natural Language Processing Systems |
Jiacheng Huang et.al. |
2411.07843 |
null |
2024-11-12 |
FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training |
Philip Zmushko et.al. |
2411.07837 |
link |
2024-11-12 |
Efficient Federated Finetuning of Tiny Transformers with Resource-Constrained Devices |
Kilian Pfeiffer et.al. |
2411.07826 |
null |
2024-11-12 |
Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models |
Youan Cong et.al. |
2411.07820 |
null |
2024-11-12 |
Federated Low-Rank Adaptation with Differential Privacy over Wireless Networks |
Tianqu Kang et.al. |
2411.07806 |
null |
2024-11-12 |
Likelihood as a Performance Gauge for Retrieval-Augmented Generation |
Tianyu Liu et.al. |
2411.07773 |
link |
2024-11-11 |
UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts |
Bo Yang et.al. |
2411.07240 |
link |
2024-11-11 |
OpenThaiGPT 1.5: A Thai-Centric Open Source Large Language Model |
Sumeth Yuenyong et.al. |
2411.07238 |
null |
2024-11-11 |
Contextualized Evaluations: Taking the Guesswork Out of Language Model Evaluations |
Chaitanya Malaviya et.al. |
2411.07237 |
null |
2024-11-11 |
Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving |
Botao Yu et.al. |
2411.07228 |
null |
2024-11-11 |
TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models |
Matheus Simão et.al. |
2411.07224 |
null |
2024-11-11 |
Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks |
Madeline Brumley et.al. |
2411.07213 |
null |
2024-11-11 |
General Geospatial Inference with a Population Dynamics Foundation Model |
Mohit Agarwal et.al. |
2411.07207 |
null |
2024-11-11 |
DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID |
Nyle Siddiqui et.al. |
2411.07205 |
link |
2024-11-11 |
The Super Weight in Large Language Models |
Mengxia Yu et.al. |
2411.07191 |
link |
2024-11-11 |
NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics |
David Robinson et.al. |
2411.07186 |
null |
2024-11-11 |
SAMPart3D: Segment Any Part in 3D Objects |
Yunhan Yang et.al. |
2411.07184 |
link |
2024-11-11 |
Counterfactual Generation from Language Models |
Shauli Ravfogel et.al. |
2411.07180 |
link |
2024-11-11 |
More Expressive Attention with Negative Weights |
Ang Lv et.al. |
2411.07176 |
link |
2024-11-11 |
Continual Memorization of Factoids in Large Language Models |
Howard Chen et.al. |
2411.07175 |
link |
2024-11-11 |
A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19 |
Vedant Khandelwal et.al. |
2411.07163 |
null |
2024-11-11 |
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models |
Yancheng He et.al. |
2411.07140 |
null |
2024-11-11 |
Stronger Models are NOT Stronger Teachers for Instruction Tuning |
Zhangchen Xu et.al. |
2411.07133 |
null |
2024-11-11 |
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis |
Taihang Hu et.al. |
2411.07132 |
link |
2024-11-11 |
Retrieval or Global Context Understanding? On Many-Shot In-Context Learning for Long-Context Evaluation |
Kaijian Zou et.al. |
2411.07130 |
link |
2024-11-11 |
Benchmarking LLMs’ Judgments with No Gold Standard |
Shengwei Xu et.al. |
2411.07127 |
link |
2024-11-08 |
Recycled Attention: Efficient inference for long-context language models |
Fangyuan Xu et.al. |
2411.05787 |
null |
2024-11-08 |
Using Language Models to Disambiguate Lexical Choices in Translation |
Josh Barua et.al. |
2411.05781 |
link |
2024-11-08 |
Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths? |
Veronica Chatrath et.al. |
2411.05775 |
null |
2024-11-08 |
Multi-hop Evidence Pursuit Meets the Web: Team Papelo at FEVER 2024 |
Christopher Malon et.al. |
2411.05762 |
null |
2024-11-08 |
End-to-End Navigation with Vision Language Models: Transforming Spatial Reasoning into Question-Answering |
Dylan Goetting et.al. |
2411.05755 |
link |
2024-11-08 |
Aioli: A Unified Optimization Framework for Language Model Data Mixing |
Mayee F. Chen et.al. |
2411.05735 |
link |
2024-11-08 |
Poze: Sports Technique Feedback under Data Constraints |
Agamdeep Singh et.al. |
2411.05734 |
null |
2024-11-08 |
STARS: Sensor-agnostic Transformer Architecture for Remote Sensing |
Ethan King et.al. |
2411.05714 |
null |
2024-11-08 |
Unmasking the Limits of Large Language Models: A Systematic Evaluation of Masked Text Processing Ability through MskQA and MskCal |
Fuka Matsuzaki et.al. |
2411.05665 |
link |
2024-11-08 |
The influence of persona and conversational task on social interactions with a LLM-controlled embodied conversational agent |
Leon O. H. Kroczek et.al. |
2411.05653 |
null |
2024-11-08 |
LightVA: Lightweight Visual Analytics with LLM Agent-Based Task Planning and Execution |
Yuheng Zhao et.al. |
2411.05651 |
null |
2024-11-08 |
Harnessing High-Level Song Descriptors towards Natural Language-Based Music Recommendation |
Elena V. Epure et.al. |
2411.05649 |
link |
2024-11-08 |
Evaluating Large Language Model Capability in Vietnamese Fact-Checking Data Generation |
Long Truong To et.al. |
2411.05641 |
null |
2024-11-08 |
Assessing Open-Source Large Language Models on Argumentation Mining Subtasks |
Mohammad Yeghaneh Abkenar et.al. |
2411.05639 |
null |
2024-11-08 |
A Two-Step Concept-Based Approach for Enhanced Interpretability and Trust in Skin Lesion Diagnosis |
Cristiano Patrício et.al. |
2411.05609 |
link |
2024-11-08 |
Evaluating and Adapting Large Language Models to Represent Folktales in Low-Resource Languages |
JA Meaney et.al. |
2411.05593 |
null |
2024-11-08 |
Open-set object detection: towards unified problem formulation and benchmarking |
Hejer Ammar et.al. |
2411.05564 |
null |
2024-11-08 |
Training objective drives the consistency of representational similarity across datasets |
Laure Ciernik et.al. |
2411.05561 |
link |
2024-11-08 |
AcceLLM: Accelerating LLM Inference using Redundancy for Load Balancing and Data Locality |
Ilias Bournias et.al. |
2411.05555 |
null |
2024-11-08 |
Assessing the Answerability of Queries in Retrieval-Augmented Code Generation |
Geonmin Kim et.al. |
2411.05547 |
null |
2024-11-07 |
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models |
Muyang Li et.al. |
2411.05007 |
link |
2024-11-07 |
Analyzing The Language of Visual Tokens |
David M. Chan et.al. |
2411.05001 |
null |
2024-11-07 |
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? |
Jonathan Roberts et.al. |
2411.05000 |
null |
2024-11-07 |
DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation |
Peiqi Liu et.al. |
2411.04999 |
link |
2024-11-07 |
LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation |
Weiquan Huang et.al. |
2411.04997 |
link |
2024-11-07 |
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models |
Weixin Liang et.al. |
2411.04996 |
null |
2024-11-07 |
Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives |
Hao Sun et.al. |
2411.04991 |
link |
2024-11-07 |
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities |
Zhaofeng Wu et.al. |
2411.04986 |
null |
2024-11-07 |
Enhancing Reverse Engineering: Investigating and Benchmarking Large Language Models for Vulnerability Analysis in Decompiled Binaries |
Dylan Manuel et.al. |
2411.04981 |
null |
2024-11-07 |
SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference |
Gabriele Oliaro et.al. |
2411.04975 |
null |
2024-11-07 |
BitNet a4.8: 4-bit Activations for 1-bit LLMs |
Hongyu Wang et.al. |
2411.04965 |
null |
2024-11-07 |
Position Paper On Diagnostic Uncertainty Estimation from Large Language Models: Next-Word Probability Is Not Pre-test Probability |
Yanjun Gao et.al. |
2411.04962 |
null |
2024-11-07 |
CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM |
Jingwei Xu et.al. |
2411.04954 |
null |
2024-11-07 |
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding |
Jaemin Cho et.al. |
2411.04952 |
null |
2024-11-07 |
A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model |
Panwen Hu et.al. |
2411.04942 |
null |
2024-11-07 |
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos |
Shehan Munasinghe et.al. |
2411.04923 |
null |
2024-11-07 |
GPTKB: Building Very Large Knowledge Bases from Language Models |
Yujia Hu et.al. |
2411.04920 |
link |
2024-11-07 |
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models |
Siming Huang et.al. |
2411.04905 |
null |
2024-11-07 |
In the Era of Prompt Learning with Vision-Language Models |
Ankit Jha et.al. |
2411.04892 |
null |
2024-11-07 |
GUI Agents with Foundation Models: A Comprehensive Survey |
Shuai Wang et.al. |
2411.04890 |
null |
2024-11-06 |
Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress? |
Daniel P. Jeong et.al. |
2411.04118 |
link |
2024-11-06 |
How Transformers Solve Propositional Logic Problems: A Mechanistic Analysis |
Guan Zhe Hong et.al. |
2411.04105 |
null |
2024-11-06 |
RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models |
Maya Varma et.al. |
2411.04097 |
link |
2024-11-06 |
Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation |
Ke Fan et.al. |
2411.04079 |
null |
2024-11-06 |
H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models |
Nhi Pham et.al. |
2411.04077 |
null |
2024-11-06 |
M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models |
Chuhan Li et.al. |
2411.04075 |
null |
2024-11-06 |
Pseudo-labeling with Keyword Refining for Few-Supervised Video Captioning |
Ping Li et.al. |
2411.04059 |
link |
2024-11-06 |
Beemo: Benchmark of Expert-edited Machine-generated Outputs |
Ekaterina Artemova et.al. |
2411.04032 |
null |
2024-11-06 |
Prompt Engineering Using GPT for Word-Level Code-Mixed Language Identification in Low-Resource Dravidian Languages |
Aniket Deroy et.al. |
2411.04025 |
null |
2024-11-06 |
Select2Plan: Training-Free ICL-Based Planning through VQA and Memory Retrieval |
Davide Buoso et.al. |
2411.04006 |
null |
2024-11-06 |
Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning |
Jiawei Yao et.al. |
2411.03978 |
link |
2024-11-06 |
What Really is Commonsense Knowledge? |
Quyet V. Do et.al. |
2411.03964 |
null |
2024-11-06 |
How Does A Text Preprocessing Pipeline Affect Ontology Syntactic Matching? |
Zhangcheng Qiang et.al. |
2411.03962 |
null |
2024-11-06 |
Face Reconstruction from Face Embeddings using Adapter to a Face Foundation Model |
Hatef Otroshi Shahreza et.al. |
2411.03960 |
null |
2024-11-06 |
Fine-Grained Guidance for Retrievers: Leveraging LLMs’ Feedback in Retrieval-Augmented Generation |
Yuhang Liu et.al. |
2411.03957 |
null |
2024-11-06 |
Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks |
Felipe Marra et.al. |
2411.03948 |
null |
2024-11-06 |
Interactions Across Blocks in Post-Training Quantization of Large Language Models |
Khasmamad Shabanovi et.al. |
2411.03934 |
null |
2024-11-06 |
Multi3Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision-Language Models |
Minh Duc Bui et.al. |
2411.03888 |
link |
2024-11-06 |
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models |
Zhijian Zhuo et.al. |
2411.03884 |
link |
2024-11-06 |
MEG: Medical Knowledge-Augmented Large Language Models for Question Answering |
Laura Cabello et.al. |
2411.03883 |
link |
2024-11-05 |
Inference Optimal VLMs Need Only One Visual Token but Larger Models |
Kevin Y. Li et.al. |
2411.03312 |
link |
2024-11-05 |
LLMs for Domain Generation Algorithm Detection |
Reynier Leyva La O et.al. |
2411.03307 |
null |
2024-11-05 |
VERITAS: A Unified Approach to Reliability Evaluation |
Rajkumar Ramamurthy et.al. |
2411.03300 |
null |
2024-11-05 |
Examining Human-AI Collaboration for Co-Writing Constructive Comments Online |
Farhana Shahid et.al. |
2411.03295 |
null |
2024-11-05 |
Interaction2Code: How Far Are We From Automatic Interactive Webpage Generation? |
Jingyu Xiao et.al. |
2411.03292 |
link |
2024-11-05 |
The Future of Intelligent Healthcare: A Systematic Analysis and Discussion on the Integration and Impact of Robots Using Large Language Models for Healthcare |
Souren Pashangpour et.al. |
2411.03287 |
null |
2024-11-05 |
SMoA: Improving Multi-agent Large Language Models with Sparse Mixture-of-Agents |
Dawei Li et.al. |
2411.03284 |
link |
2024-11-05 |
Spontaneous Emergence of Agent Individuality through Social Interactions in LLM-Based Communities |
Ryosuke Takata et.al. |
2411.03252 |
null |
2024-11-05 |
DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models |
Ying Zhou et.al. |
2411.03250 |
null |
2024-11-05 |
From Pen to Prompt: How Creative Writers Integrate AI into their Writing Practice |
Alicia Guo et.al. |
2411.03137 |
null |
2024-11-05 |
“Create a Fear of Missing Out” – ChatGPT Implements Unsolicited Deceptive Designs in Generated Websites Without Warning |
Veronika Krauß et.al. |
2411.03108 |
null |
2024-11-05 |
Utilizing Precise and Complete Code Context to Guide LLM in Automatic False Positive Mitigation |
Jinbao Chen et.al. |
2411.03079 |
null |
2024-11-05 |
Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning |
Bei Li et.al. |
2411.03042 |
null |
2024-11-05 |
HumanVLM: Foundation for Human-Scene Vision-Language Model |
Dawei Dai et.al. |
2411.03034 |
null |
2024-11-05 |
Leveraging Large Language Models in Code Question Answering: Baselines and Issues |
Georgy Andryushchenko et.al. |
2411.03012 |
link |
2024-11-05 |
Controlling for Unobserved Confounding with Large Language Model Classification of Patient Smoking Status |
Samuel Lee et.al. |
2411.03004 |
null |
2024-11-05 |
Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation |
Junchen Fu et.al. |
2411.02992 |
null |
2024-11-05 |
Growing a Tail: Increasing Output Diversity in Large Language Models |
Michal Shur-Ofry et.al. |
2411.02989 |
null |
2024-11-05 |
[Vision Paper] PRObot: Enhancing Patient-Reported Outcome Measures for Diabetic Retinopathy using Chatbots and Generative AI |
Maren Pielka et.al. |
2411.02973 |
null |
2024-11-05 |
Multi-modal NeRF Self-Supervision for LiDAR Semantic Segmentation |
Xavier Timoneda et.al. |
2411.02969 |
null |
2024-11-04 |
Training-free Regional Prompting for Diffusion Transformers |
Anthony Chen et.al. |
2411.02395 |
link |
2024-11-04 |
Adaptive Length Image Tokenization via Recurrent Allocation |
Shivam Duggal et.al. |
2411.02393 |
link |
2024-11-04 |
Attacking Vision-Language Computer Agents via Pop-ups |
Yanzhe Zhang et.al. |
2411.02391 |
link |
2024-11-04 |
Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models |
Guangzhi Xiong et.al. |
2411.02382 |
null |
2024-11-04 |
Addressing Uncertainty in LLMs to Enhance Reliability in Generative AI |
Ramneet Kaur et.al. |
2411.02381 |
null |
2024-11-04 |
Learning General-Purpose Biomedical Volume Representations using Randomized Synthesis |
Neel Dey et.al. |
2411.02372 |
link |
2024-11-04 |
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution |
Yang Yue et.al. |
2411.02359 |
link |
2024-11-04 |
“Give Me BF16 or Give Me Death”? Accuracy-Performance Trade-Offs in LLM Quantization |
Eldar Kurtic et.al. |
2411.02355 |
null |
2024-11-04 |
Machine learning identification of maternal inflammatory response and histologic choroamnionitis from placental membrane whole slide images |
Abhishek Sharma et.al. |
2411.02354 |
null |
2024-11-04 |
Social-RAG: Retrieving from Group Interactions to Socially Ground Proactive AI Generation to Group Preferences |
Ruotong Wang et.al. |
2411.02353 |
null |
2024-11-04 |
Can Large Language Models generalize analogy solving like people can? |
Claire E. Stevenson et.al. |
2411.02348 |
null |
2024-11-04 |
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning |
Zehan Qi et.al. |
2411.02337 |
link |
2024-11-04 |
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity |
Yuqi Luo et.al. |
2411.02335 |
link |
2024-11-04 |
Disrupting Test Development with AI Assistants |
Vijay Joshi et.al. |
2411.02328 |
null |
2024-11-04 |
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance |
Ruyang Liu et.al. |
2411.02327 |
link |
2024-11-04 |
An Empirical Study on the Code Refactoring Capability of Large Language Models |
Jonathan Cordeiro et.al. |
2411.02320 |
null |
2024-11-04 |
Evaluating the Ability of Large Language Models to Generate Verifiable Specifications in VeriFast |
Marilyn Rego et.al. |
2411.02318 |
null |
2024-11-04 |
Defining and Evaluating Physical Safety for Large Language Models |
Yung-Chen Tang et.al. |
2411.02317 |
null |
2024-11-04 |
Evaluating Creative Short Story Generation in Humans and Large Language Models |
Mete Ismayilzada et.al. |
2411.02316 |
link |
2024-11-04 |
Taking AI Welfare Seriously |
Robert Long et.al. |
2411.00986 |
null |
2024-10-31 |
P-Masking: Power Law Masking Improves Multi-attribute Controlled Generation |
Mohamed Elgaar et.al. |
2410.24201 |
null |
2024-11-01 |
SelfCodeAlign: Self-Alignment for Code Generation |
Yuxiang Wei et.al. |
2410.24198 |
link |
2024-10-31 |
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models |
Heng-Jui Chang et.al. |
2410.24177 |
null |
2024-10-31 |
Constraint Back-translation Improves Complex Instruction Following of Large Language Models |
Yunjia Qi et.al. |
2410.24175 |
null |
2024-10-31 |
$\pi_0$: A Vision-Language-Action Flow Model for General Robot Control |
Kevin Black et.al. |
2410.24164 |
null |
2024-10-31 |
GPT or BERT: why not both? |
Lucas Georges Gabriel Charpentier et.al. |
2410.24159 |
link |
2024-10-31 |
Thought Space Explorer: Navigating and Expanding Thought Space for Large Language Model Reasoning |
Jinghan Zhang et.al. |
2410.24155 |
null |
2024-10-31 |
Language-Driven Policy Distillation for Cooperative Driving in Multi-Agent Reinforcement Learning |
Jiaqi Liu et.al. |
2410.24152 |
null |
2024-10-31 |
Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age |
Nouar AlDahoul et.al. |
2410.24148 |
null |
2024-10-31 |
Leveraging Large Language Models for Code Translation and Software Development in Scientific Computing |
Akash Dhruv et.al. |
2410.24119 |
link |
2024-10-31 |
Repository-Level Compositional Code Translation and Validation |
Ali Reza Ibrahimzada et.al. |
2410.24117 |
link |
2024-10-31 |
Matchmaker: Self-Improving Large Language Model Programs for Schema Matching |
Nabeel Seedat et.al. |
2410.24105 |
null |
2024-10-31 |
Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning |
Nabil Omi et.al. |
2410.24096 |
null |
2024-10-31 |
In-Context Fine-Tuning for Time-Series Foundation Models |
Abhimanyu Das et.al. |
2410.24087 |
null |
2024-10-31 |
Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs |
Muhammed Saeed et.al. |
2410.24049 |
null |
2024-10-31 |
Handwriting Recognition in Historical Documents with Multimodal LLM |
Lucian Li et.al. |
2410.24034 |
null |
2024-10-31 |
Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks |
Yingzhe Peng et.al. |
2410.24032 |
null |
2024-10-31 |
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents |
Yifan Xu et.al. |
2410.24024 |
link |
2024-10-31 |
SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation |
Liang He et.al. |
2410.24022 |
null |
2024-10-31 |
Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody? |
Ioannis Tsiamas et.al. |
2410.24019 |
null |
2024-10-30 |
ReferEverything: Towards Segmenting Everything We Can Speak of in Videos |
Anurag Bagchi et.al. |
2410.23287 |
null |
2024-10-30 |
A Monte Carlo Framework for Calibrated Uncertainty Estimation in Sequence Prediction |
Qidong Yang et.al. |
2410.23272 |
null |
2024-10-30 |
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models |
Ziyao Shangguan et.al. |
2410.23266 |
link |
2024-10-30 |
EMMA: End-to-End Multimodal Model for Autonomous Driving |
Jyh-Jing Hwang et.al. |
2410.23262 |
null |
2024-10-30 |
Keypoint Abstraction using Large Models for Object-Relative Imitation Learning |
Xiaolin Fang et.al. |
2410.23254 |
null |
2024-10-30 |
Evaluating Cultural and Social Awareness of LLM Web Agents |
Haoyi Qiu et.al. |
2410.23252 |
null |
2024-10-30 |
Carrot and Stick: Eliciting Comparison Data and Beyond |
Yiling Chen et.al. |
2410.23243 |
null |
2024-10-30 |
A little less conversation, a little more action, please: Investigating the physical common-sense of LLMs in a 3D embodied environment |
Matteo G. Mecattaf et.al. |
2410.23242 |
link |
2024-10-30 |
EMOTION: Expressive Motion Sequence Generation for Humanoid Robots with In-Context Learning |
Peide Huang et.al. |
2410.23234 |
null |
2024-10-30 |
COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences |
Yixin Liu et.al. |
2410.23223 |
link |
2024-10-30 |
Partial Channel Dependence with Channel Masks for Time Series Foundation Models |
Seunghan Lee et.al. |
2410.23222 |
null |
2024-10-30 |
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents |
Zhiyong Wu et.al. |
2410.23218 |
link |
2024-10-31 |
Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval |
Sheryl Hsu et.al. |
2410.23214 |
null |
2024-10-30 |
ProTransformer: Robustify Transformers via Plug-and-Play Paradigm |
Zhichao Hou et.al. |
2410.23182 |
null |
2024-10-30 |
ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning |
Millennium Bismay et.al. |
2410.23180 |
link |
2024-10-30 |
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters |
Haiyang Wang et.al. |
2410.23168 |
link |
2024-10-30 |
SciPIP: An LLM-based Scientific Paper Idea Proposer |
Wenxiao Wang et.al. |
2410.23166 |
link |
2024-10-30 |
FlexTSF: A Universal Forecasting Model for Time Series with Variable Regularities |
Jingge Xiao et.al. |
2410.23160 |
link |
2024-10-30 |
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning |
Yichao Liang et.al. |
2410.23156 |
null |
2024-10-30 |
Public Domain 12M: A Highly Aesthetic Image-Text Dataset with Novel Governance Mechanisms |
Jordan Meyer et.al. |
2410.23144 |
null |
2024-10-29 |
Local Policies Enable Zero-shot Long-horizon Manipulation |
Murtaza Dalal et.al. |
2410.22332 |
null |
2024-10-29 |
Task Vectors are Cross-Modal |
Grace Luo et.al. |
2410.22330 |
null |
2024-10-29 |
Enhancing Code Annotation Reliability: Generative AI’s Role in Comment Quality Assessment Models |
Seetharam Killivalavan et.al. |
2410.22323 |
null |
2024-10-29 |
Online Detecting LLM-Generated Texts via Sequential Hypothesis Testing by Betting |
Can Chen et.al. |
2410.22318 |
link |
2024-10-29 |
Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier |
Kai Wang et.al. |
2410.22317 |
link |
2024-10-29 |
Natural Language Inference Improves Compositionality in Vision-Language Models |
Paola Cascante-Bonilla et.al. |
2410.22315 |
null |
2024-10-29 |
Senna: Bridging Large Vision-Language Models and End-to-End Autonomous Driving |
Bo Jiang et.al. |
2410.22313 |
link |
2024-10-29 |
GPT-4o reads the mind in the eyes |
James W. A. Strachan et.al. |
2410.22309 |
null |
2024-10-29 |
SVIP: Towards Verifiable Inference of Open-source Large Language Models |
Yifan Sun et.al. |
2410.22307 |
null |
2024-10-29 |
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning |
Yihe Deng et.al. |
2410.22304 |
null |
2024-10-29 |
LLMs are Highly-Constrained Biophysical Sequence Optimizers |
Angelica Chen et.al. |
2410.22296 |
null |
2024-10-29 |
Fine-Tuning LLMs for Code Mutation: A New Era of Cyber Threats |
Mohammad Setak et.al. |
2410.22293 |
null |
2024-10-29 |
From melodic note sequences to pitches using word2vec |
Daniel Defays et.al. |
2410.22285 |
null |
2024-10-29 |
Embedding-based classifiers can detect prompt injection attacks |
Md. Ahsan Ayub et.al. |
2410.22284 |
link |
2024-10-29 |
Whose ChatGPT? Unveiling Real-World Educational Inequalities Introduced by Large Language Models |
Renzhe Yu et.al. |
2410.22282 |
null |
2024-10-29 |
Fourier Head: Helping Large Language Models Learn Complex Probability Distributions |
Nate Gillman et.al. |
2410.22269 |
null |
2024-10-29 |
Meta-Learning Adaptable Foundation Models |
Jacob L. Block et.al. |
2410.22264 |
null |
2024-10-29 |
FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation |
Farima Fatahi Bayat et.al. |
2410.22257 |
null |
2024-10-29 |
Abrupt Learning in Transformers: A Case Study on Matrix Completion |
Pulkit Gopalani et.al. |
2410.22244 |
null |
2024-10-29 |
Are Decoder-Only Large Language Models the Silver Bullet for Code Search? |
Yuxuan Chen et.al. |
2410.22240 |
link |
2024-10-28 |
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics |
Yaniv Nikankin et.al. |
2410.21272 |
link |
2024-10-28 |
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior |
Hanyu Wang et.al. |
2410.21264 |
null |
2024-10-28 |
BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference |
Changwoo Lee et.al. |
2410.21262 |
link |
2024-10-28 |
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves? |
Han Bao et.al. |
2410.21259 |
link |
2024-10-28 |
Multi-modal AI for comprehensive breast cancer prognostication |
Jan Witowski et.al. |
2410.21256 |
null |
2024-10-28 |
LongReward: Improving Long-context Large Language Models with AI Feedback |
Jiajie Zhang et.al. |
2410.21252 |
link |
2024-10-28 |
Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback |
Nour Jedidi et.al. |
2410.21242 |
null |
2024-10-28 |
Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce |
Zhantao Yang et.al. |
2410.21237 |
null |
2024-10-28 |
Flaming-hot Initiation with Regular Execution Sampling for Large Language Models |
Weizhe Chen et.al. |
2410.21236 |
null |
2024-10-28 |
LoRA vs Full Fine-tuning: An Illusion of Equivalence |
Reece Shuttleworth et.al. |
2410.21228 |
null |
2024-10-28 |
Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines |
Zhixin Zhang et.al. |
2410.21220 |
link |
2024-10-28 |
Lifting the Veil on the Large Language Model Supply Chain: Composition, Risks, and Mitigations |
Kaifeng Huang et.al. |
2410.21218 |
null |
2024-10-28 |
BongLLaMA: LLaMA for Bangla Language |
Abdullah Khan Zehady et.al. |
2410.21200 |
null |
2024-10-28 |
Belief in the Machine: Investigating Epistemological Blind Spots of Language Models |
Mirac Suzgun et.al. |
2410.21195 |
link |
2024-10-29 |
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction |
Qintong Zhang et.al. |
2410.21169 |
null |
2024-10-28 |
M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation |
Jiaheng Liu et.al. |
2410.21157 |
null |
2024-10-28 |
Palisade – Prompt Injection Detection Framework |
Sahasra Kokkula et.al. |
2410.21146 |
null |
2024-10-28 |
LLM-initialized Differentiable Causal Discovery |
Shiv Kampani et.al. |
2410.21141 |
null |
2024-10-28 |
Do LLMs generate test oracles that capture the actual or the expected program behaviour? |
Michael Konstantinou et.al. |
2410.21136 |
null |
2024-10-28 |
Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments |
Marharyta Domnich et.al. |
2410.21131 |
null |
2024-10-25 |
The Potential and Value of AI Chatbot in Personalized Cognitive Training |
Zilong Wang et.al. |
2410.19733 |
null |
2024-10-25 |
Rethinking Visual Dependency in Long-Context Reasoning for Large Vision-Language Models |
Yucheng Zhou et.al. |
2410.19732 |
null |
2024-10-25 |
Counting Ability of Large Language Models and Impact of Tokenization |
Xiang Zhang et.al. |
2410.19730 |
link |
2024-10-25 |
FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning |
Nicole Cho et.al. |
2410.19727 |
null |
2024-10-25 |
2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision |
Shilong Li et.al. |
2410.19720 |
null |
2024-10-25 |
Multi-view biomedical foundation models for molecule-target and property prediction |
Parthasarathy Suryanarayanan et.al. |
2410.19704 |
link |
2024-10-25 |
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning |
Xiangyu Zeng et.al. |
2410.19702 |
null |
2024-10-25 |
IPPON: Common Sense Guided Informative Path Planning for Object Goal Navigation |
Kaixian Qu et.al. |
2410.19697 |
null |
2024-10-25 |
Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs |
Yifei Zhang et.al. |
2410.19694 |
null |
2024-10-25 |
APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs |
Huaxiaoyue Wang et.al. |
2410.19656 |
null |
2024-10-25 |
Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models |
Shenghao Fu et.al. |
2410.19635 |
null |
2024-10-25 |
Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina |
Yuan Gao et.al. |
2410.19599 |
null |
2024-10-25 |
Diverse Sign Language Translation |
Xin Shen et.al. |
2410.19586 |
link |
2024-10-25 |
ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems |
Ritvik Aggarwal, Ishneet Sukhvinder Singh, Ibrahim Allahverdiyev et.al. |
2410.19572 |
null |
2024-10-25 |
GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing |
Hosam Elgendy et.al. |
2410.19552 |
link |
2024-10-25 |
Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad? |
Antonia Wüst et.al. |
2410.19546 |
link |
2024-10-25 |
Brain-like Functional Organization within Large Language Models |
H. Sun et.al. |
2410.19542 |
null |
2024-10-25 |
Detection of Human and Machine-Authored Fake News in Urdu |
Muhammad Zain Ali et.al. |
2410.19517 |
link |
2024-10-25 |
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models |
Jahyun Koo et.al. |
2410.19503 |
null |
2024-10-25 |
Introducing MAPO: Momentum-Aided Gradient Descent Prompt Optimization |
Anthony Cui et.al. |
2410.19499 |
null |
2024-10-24 |
Unbounded: A Generative Infinite Game of Character Life Simulation |
Jialu Li et.al. |
2410.18975 |
null |
2024-10-24 |
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques |
David Ortiz-Perez et.al. |
2410.18972 |
null |
2024-10-24 |
ConceptDrift: Uncovering Biases through the Lens of Foundational Models |
Cristian Daniel Păduraru et.al. |
2410.18970 |
null |
2024-10-24 |
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms |
Zhangheng Li et.al. |
2410.18967 |
null |
2024-10-24 |
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions |
Yujuan Fu et.al. |
2410.18966 |
null |
2024-10-24 |
On the Crucial Role of Initialization for Matrix Factorization |
Bingcong Li et.al. |
2410.18965 |
null |
2024-10-24 |
OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning |
Xiaoqiang Wang et.al. |
2410.18963 |
null |
2024-10-24 |
Context is Key: A Benchmark for Forecasting with Essential Textual Information |
Andrew Robert Williams et.al. |
2410.18959 |
link |
2024-10-24 |
Bridge-Coder: Unlocking LLMs’ Potential to Overcome Language Gaps in Low-Resource Code |
Jipeng Zhang et.al. |
2410.18957 |
null |
2024-10-24 |
BioMistral-NLU: Towards More Generalizable Medical Language Understanding through Instruction Tuning |
Yujuan Velvin Fu et.al. |
2410.18955 |
null |
2024-10-24 |
Dynamic Vocabulary Pruning in Early-Exit LLMs |
Jort Vincenti et.al. |
2410.18952 |
link |
2024-10-24 |
SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models |
Zonghao Ying et.al. |
2410.18927 |
null |
2024-10-24 |
From Blind Solvers to Logical Thinkers: Benchmarking LLMs’ Logical Integrity on Faulty Mathematical Problems |
A M Muntasir Rahman et.al. |
2410.18921 |
null |
2024-10-25 |
A Survey on Speech Large Language Models |
Jing Peng et.al. |
2410.18908 |
null |
2024-10-24 |
PRISM: A Methodology for Auditing Biases in Large Language Models |
Leif Azzopardi et.al. |
2410.18906 |
link |
2024-10-24 |
LLMs for Extremely Low-Resource Finno-Ugric Languages |
Taido Purason et.al. |
2410.18902 |
null |
2024-10-24 |
Creating and Repairing Robot Programs in Open-World Domains |
Claire Schlesinger et.al. |
2410.18893 |
null |
2024-10-24 |
Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks |
Graziano A. Manduzio et.al. |
2410.18890 |
null |
2024-10-24 |
Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance |
Omer Nahum et.al. |
2410.18889 |
null |
2024-10-24 |
Provably Robust Watermarks for Open-Source Language Models |
Miranda Christ et.al. |
2410.18861 |
null |
2024-10-23 |
TP-Eval: Tap Multimodal LLMs’ Potential in Evaluation by Customizing Prompts |
Yuxuan Xie et.al. |
2410.18071 |
null |
2024-10-23 |
CLEAR: Character Unlearning in Textual and Visual Modalities |
Alexey Dontsov et.al. |
2410.18057 |
null |
2024-10-23 |
LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering |
Qingfei Zhao et.al. |
2410.18050 |
link |
2024-10-23 |
Key Algorithms for Keyphrase Generation: Instruction-Based LLMs for Russian Scientific Keyphrases |
Anna Glazkova et.al. |
2410.18040 |
null |
2024-10-23 |
MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning |
Jingfan Zhang et.al. |
2410.18035 |
null |
2024-10-23 |
GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration |
Xin Li et.al. |
2410.18032 |
link |
2024-10-23 |
MiniFed: Integrating LLM-based Agentic-Workflow for Simulating FOMC Meeting |
Sungil Seok et.al. |
2410.18012 |
null |
2024-10-23 |
Benchmarking Foundation Models on Exceptional Cases: Dataset Creation and Validation |
Suho Kang et.al. |
2410.18001 |
link |
2024-10-23 |
MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers |
Zebin Yang et.al. |
2410.17957 |
null |
2024-10-23 |
ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference |
Xin He et.al. |
2410.17954 |
null |
2024-10-23 |
SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains |
Ran Xu et.al. |
2410.17952 |
null |
2024-10-23 |
Benchmarking Floworks against OpenAI & Anthropic: A Novel Framework for Enhanced LLM Function Calling |
Nirav Bhan et.al. |
2410.17950 |
null |
2024-10-23 |
Toward path-invariant embeddings for local distance source characterization |
Lisa Linville et.al. |
2410.17937 |
null |
2024-10-23 |
Guide for Defense (G4D): Dynamic Guidance for Robust and Balanced Defense in Large Language Models |
He Cao et.al. |
2410.17922 |
link |
2024-10-23 |
Scaling Diffusion Language Models via Adaptation from Autoregressive Models |
Shansan Gong et.al. |
2410.17891 |
link |
2024-10-23 |
R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models |
Linger Deng et.al. |
2410.17885 |
link |
2024-10-23 |
Lightweight Neural App Control |
Filippos Christianos et.al. |
2410.17883 |
null |
2024-10-23 |
AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning |
Yehonathan Refael et.al. |
2410.17881 |
null |
2024-10-23 |
Understanding Layer Significance in LLM Alignment |
Guangyuan Shi et.al. |
2410.17875 |
null |
2024-10-23 |
DataTales: A Benchmark for Real-World Intelligent Data Narration |
Yajing Yang et.al. |
2410.17859 |
link |
2024-10-22 |
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction |
Long Xing et.al. |
2410.17247 |
link |
2024-10-22 |
Towards Reliable Evaluation of Behavior Steering Interventions in LLMs |
Itamar Pres et.al. |
2410.17245 |
null |
2024-10-22 |
Frontiers in Intelligent Colonoscopy |
Ge-Peng Ji et.al. |
2410.17241 |
link |
2024-10-22 |
Large Language Models Empowered Personalized Web Agents |
Hongru Cai et.al. |
2410.17236 |
null |
2024-10-22 |
Automated Spinal MRI Labelling from Reports Using a Large Language Model |
Robin Y. Park et.al. |
2410.17235 |
link |
2024-10-22 |
Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy |
Benedict Aaron Tjandra et.al. |
2410.17234 |
null |
2024-10-22 |
Few-shot In-Context Preference Learning Using Large Language Models |
Chao Yu et.al. |
2410.17233 |
null |
2024-10-22 |
Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods |
Tsachi Blau et.al. |
2410.17222 |
null |
2024-10-22 |
MiniPLM: Knowledge Distillation for Pre-Training Language Models |
Yuxian Gu et.al. |
2410.17215 |
link |
2024-10-22 |
Exploring Possibilities of AI-Powered Legal Assistance in Bangladesh through Large Language Modeling |
Azmine Toushik Wasi et.al. |
2410.17210 |
link |
2024-10-22 |
VoiceBench: Benchmarking LLM-Based Voice Assistants |
Yiming Chen et.al. |
2410.17196 |
link |
2024-10-23 |
Non-myopic Generation of Language Model for Reasoning and Planning |
Chang Ma et.al. |
2410.17195 |
link |
2024-10-22 |
Remote Timing Attacks on Efficient Language Model Inference |
Nicholas Carlini et.al. |
2410.17175 |
null |
2024-10-22 |
From Attention to Activation: Unravelling the Enigmas of Large Language Models |
Prannay Kaul et.al. |
2410.17174 |
null |
2024-10-22 |
Self-calibration for Language Model Quantization and Pruning |
Miles Williams et.al. |
2410.17170 |
null |
2024-10-22 |
Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence |
İlker Işık et.al. |
2410.17161 |
null |
2024-10-22 |
Improving Pinterest Search Relevance Using Large Language Models |
Han Wang et.al. |
2410.17152 |
null |
2024-10-22 |
Are Visual-Language Models Effective in Action Recognition? A Comparative Study |
Mahmoud Ali et.al. |
2410.17149 |
null |
2024-10-22 |
Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation? |
Jirat Chiaranaipanich et.al. |
2410.17145 |
null |
2024-10-22 |
Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements |
Isamu Isozaki et.al. |
2410.17141 |
link |
2024-10-21 |
Reflection-Bench: probing AI intelligence with reflection |
Lingyu Li et.al. |
2410.16270 |
link |
2024-10-21 |
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree |
Shuangrui Ding et.al. |
2410.16268 |
link |
2024-10-21 |
xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs |
Michael S. Ryoo et.al. |
2410.16267 |
null |
2024-10-22 |
Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance |
Zhangwei Gao et.al. |
2410.16261 |
link |
2024-10-21 |
Elucidating the design space of language models for image generation |
Xuantong Liu et.al. |
2410.16257 |
link |
2024-10-21 |
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution |
Maosong Cao et.al. |
2410.16256 |
link |
2024-10-21 |
Can Knowledge Editing Really Correct Hallucinations? |
Baixiang Huang et.al. |
2410.16251 |
link |
2024-10-21 |
Analyzing Context Contributions in LLM-based Machine Translation |
Emmanouil Zaranis et.al. |
2410.16246 |
null |
2024-10-21 |
IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems |
Yihuan Mao et.al. |
2410.16237 |
null |
2024-10-21 |
LLaVA-KD: A Framework of Distilling Multimodal Large Language Models |
Yuxuan Cai et.al. |
2410.16236 |
link |
2024-10-21 |
ToW: Thoughts of Words Improve Reasoning in Large Language Models |
Zhikun Xu et.al. |
2410.16235 |
null |
2024-10-21 |
Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping |
Ryan Li et.al. |
2410.16232 |
null |
2024-10-21 |
Building A Coding Assistant via the Retrieval-Augmented Language Model |
Xinze Li et.al. |
2410.16229 |
link |
2024-10-21 |
A Realistic Threat Model for Large Language Model Jailbreaks |
Valentyn Boreiko et.al. |
2410.16222 |
link |
2024-10-21 |
Pre-training Distillation for Large Language Models: A Design Space Exploration |
Hao Peng et.al. |
2410.16215 |
null |
2024-10-21 |
Comprehensive benchmarking of large language models for RNA secondary structure prediction |
L. I. Zablocki et.al. |
2410.16212 |
link |
2024-10-21 |
CoT-TL: Low-Resource Temporal Knowledge Representation of Planning Instructions Using Chain-of-Thought Reasoning |
Kumar Manas et.al. |
2410.16207 |
null |
2024-10-21 |
Improve Vision Language Model Chain-of-thought Reasoning |
Ruohong Zhang et.al. |
2410.16198 |
link |
2024-10-22 |
LASER: Script Execution by Autonomous Agents for On-demand Traffic Simulation |
Hao Gao et.al. |
2410.16197 |
link |
2024-10-21 |
Contamination Report for Multilingual Benchmarks |
Sanchit Ahuja et.al. |
2410.16186 |
null |
2024-10-18 |
Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts |
German Gritsai et.al. |
2410.14677 |
null |
2024-10-18 |
SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment |
Qin Liu et.al. |
2410.14676 |
null |
2024-10-18 |
Enhancing Large Language Models’ Situated Faithfulness to External Contexts |
Yukun Huang et.al. |
2410.14675 |
link |
2024-10-18 |
Decomposing The Dark Matter of Sparse Autoencoders |
Joshua Engels et.al. |
2410.14670 |
link |
2024-10-18 |
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples |
Baiqi Li et.al. |
2410.14669 |
null |
2024-10-18 |
MiCEval: Unveiling Multimodal Chain of Thought’s Quality via Image Description and Reasoning Steps |
Xiongtao Zhou et.al. |
2410.14668 |
link |
2024-10-18 |
A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning |
Shengjie Sun et.al. |
2410.14660 |
null |
2024-10-18 |
Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens |
Zhepeng Cen et.al. |
2410.14655 |
null |
2024-10-18 |
EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search |
Oliver Sieberling et.al. |
2410.14649 |
link |
2024-10-18 |
Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs |
Runchu Tian et.al. |
2410.14641 |
link |
2024-10-18 |
GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings |
Raghuveer Thirukovalluru et.al. |
2410.14635 |
link |
2024-10-18 |
Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning |
Yuxiang Lu et.al. |
2410.14633 |
null |
2024-10-18 |
On the Regularization of Learnable Embeddings for Time Series Processing |
Luca Butera et.al. |
2410.14630 |
null |
2024-10-18 |
CELI: Controller-Embedded Language Model Interactions |
Jan-Samuel Wagner et.al. |
2410.14627 |
null |
2024-10-18 |
DiSCo Meets LLMs: A Unified Approach for Sparse Retrieval and Contextual Distillation in Conversational Search |
Simon Lupart et.al. |
2410.14609 |
null |
2024-10-18 |
Teaching Models to Balance Resisting and Accepting Persuasion |
Elias Stengel-Eskin et.al. |
2410.14596 |
link |
2024-10-18 |
Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets |
Namid R. Stillman et.al. |
2410.14587 |
null |
2024-10-18 |
Do LLMs estimate uncertainty well in instruction-following? |
Juyeon Heo et.al. |
2410.14582 |
null |
2024-10-18 |
Large Language Models Are Overparameterized Text Encoders |
Thennal D K et.al. |
2410.14578 |
null |
2024-10-18 |
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts |
Rachel S. Y. Teo et.al. |
2410.14574 |
link |
2024-10-17 |
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens |
Lijie Fan et.al. |
2410.13863 |
null |
2024-10-17 |
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation |
Rongyao Fang et.al. |
2410.13861 |
link |
2024-10-17 |
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding |
Runsen Xu et.al. |
2410.13860 |
link |
2024-10-17 |
$\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models |
Yaxin Luo et.al. |
2410.13859 |
null |
2024-10-17 |
How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs |
Guhao Feng et.al. |
2410.13857 |
null |
2024-10-17 |
Can MLLMs Understand the Deep Implication Behind Chinese Images? |
Chenhao Zhang et.al. |
2410.13854 |
link |
2024-10-17 |
Retrospective Learning from Interactions |
Zizhao Chen et.al. |
2410.13852 |
null |
2024-10-17 |
Differentiable Robot Rendering |
Ruoshi Liu et.al. |
2410.13851 |
null |
2024-10-17 |
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction |
Xuan Zhang et.al. |
2410.13846 |
link |
2024-10-17 |
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models |
Qiaoyu Tang et.al. |
2410.13841 |
null |
2024-10-17 |
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs |
Tianyu Guo et.al. |
2410.13835 |
link |
2024-10-17 |
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement |
Hui Yuan et.al. |
2410.13828 |
link |
2024-10-17 |
Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models |
Mazda Moayeri et.al. |
2410.13826 |
null |
2024-10-17 |
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents |
Ke Yang et.al. |
2410.13825 |
null |
2024-10-18 |
Harnessing Webpage UIs for Text-Rich Visual Understanding |
Junpeng Liu et.al. |
2410.13824 |
null |
2024-10-17 |
Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning |
Xiaodan Xing et.al. |
2410.13823 |
link |
2024-10-17 |
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance |
Mitsuhiko Nakamoto et.al. |
2410.13816 |
null |
2024-10-17 |
De-mark: Watermark Removal in Large Language Models |
Ruibo Chen et.al. |
2410.13808 |
null |
2024-10-17 |
A Watermark for Order-Agnostic Language Models |
Ruibo Chen et.al. |
2410.13805 |
null |
2024-10-18 |
BenTo: Benchmark Task Reduction with In-Context Transferability |
Hongyu Zhao et.al. |
2410.13804 |
link |
2024-10-16 |
Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models |
Ce Zhang et.al. |
2410.12790 |
link |
2024-10-16 |
Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception |
Jihao Zhao et.al. |
2410.12788 |
link |
2024-10-16 |
In-Context Learning Enables Robot Action Prediction in LLMs |
Yida Yin et.al. |
2410.12782 |
null |
2024-10-16 |
Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information |
Yingya Li et.al. |
2410.12774 |
null |
2024-10-16 |
Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions |
Zhenyu Jiang et.al. |
2410.12773 |
null |
2024-10-16 |
Towards Zero-Shot Camera Trap Image Categorization |
Jiří Vyskočil et.al. |
2410.12769 |
null |
2024-10-16 |
The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse |
Ekansh Sharma et.al. |
2410.12766 |
null |
2024-10-16 |
StyleDistance: Stronger Content-Independent Style Embeddings with Synthetic Parallel Examples |
Ajay Patel et.al. |
2410.12757 |
null |
2024-10-17 |
CREAM: Consistency Regularized Self-Rewarding Language Models |
Zhaoyang Wang et.al. |
2410.12735 |
null |
2024-10-16 |
WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation |
João Matos et.al. |
2410.12722 |
link |
2024-10-16 |
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression |
Zhenheng Tang et.al. |
2410.12707 |
null |
2024-10-16 |
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines |
Genta Indra Winata et.al. |
2410.12705 |
link |
2024-10-16 |
Sarcasm Detection in a Less-Resourced Language |
Lazar Đoković et.al. |
2410.12704 |
link |
2024-10-16 |
Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization |
Xingqi Wang et.al. |
2410.12700 |
link |
2024-10-16 |
VividMed: Vision Language Model with Versatile Visual Grounding for Medicine |
Lingxiao Luo et.al. |
2410.12694 |
link |
2024-10-16 |
Automatic Mapping of Anatomical Landmarks from Free-Text Using Large Language Models: Insights from Llama-2 |
Mohamad Abdi et.al. |
2410.12686 |
null |
2024-10-16 |
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation |
Dewei Zhou et.al. |
2410.12669 |
null |
2024-10-16 |
Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models |
Shicheng Xu et.al. |
2410.12662 |
null |
2024-10-16 |
Evaluating Morphological Compositional Generalization in Large Language Models |
Mete Ismayilzada et.al. |
2410.12656 |
null |
2024-10-16 |
Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals |
Orchid Chetia Phukan et.al. |
2410.12645 |
null |
2024-10-15 |
GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation |
Fei Tang et.al. |
2410.11841 |
link |
2024-10-15 |
A Hitchhiker’s Guide to Scaling Law Estimation |
Leshem Choshen et.al. |
2410.11840 |
link |
2024-10-15 |
MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding |
Yue Cao et.al. |
2410.11829 |
link |
2024-10-15 |
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws |
Yiding Jiang et.al. |
2410.11820 |
link |
2024-10-15 |
Improving Long-Text Alignment for Text-to-Image Diffusion Models |
Luping Liu et.al. |
2410.11817 |
link |
2024-10-15 |
SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing |
Zhiyuan Zhang et.al. |
2410.11815 |
null |
2024-10-15 |
NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models |
Han Han et.al. |
2410.11805 |
null |
2024-10-15 |
FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting |
Zhe Li et.al. |
2410.11802 |
null |
2024-10-15 |
Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability |
Tsz Ting Chung et.al. |
2410.11786 |
null |
2024-10-15 |
Latent BKI: Open-Dictionary Continuous Mapping in Visual-Language Latent Spaces with Quantifiable Uncertainty |
Joey Wilson et.al. |
2410.11783 |
link |
2024-10-15 |
G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks |
Guibin Zhang et.al. |
2410.11782 |
null |
2024-10-15 |
Language Models Encode Numbers Using Digit Representations in Base 10 |
Amit Arnold Levy et.al. |
2410.11781 |
link |
2024-10-15 |
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation |
Chenxi Wang et.al. |
2410.11779 |
link |
2024-10-15 |
Time-Series Foundation Model for Value-at-Risk |
Anubha Goel et.al. |
2410.11773 |
link |
2024-10-15 |
Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models |
Kai Yao et.al. |
2410.11772 |
link |
2024-10-15 |
SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding |
Ying Chen et.al. |
2410.11761 |
null |
2024-10-15 |
Latent Action Pretraining from Videos |
Seonghyeon Ye et.al. |
2410.11758 |
null |
2024-10-15 |
Personas with Attitudes: Controlling LLMs for Diverse Data Annotation |
Leon Fröhling et.al. |
2410.11745 |
link |
2024-10-15 |
DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure |
Yunfan Xiong et.al. |
2410.11744 |
null |
2024-10-15 |
Light-Weight Fault Tolerant Attention for Large Language Model Training |
Yuhang Liang et.al. |
2410.11720 |
null |
2024-10-14 |
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads |
Guangxuan Xiao et.al. |
2410.10819 |
link |
2024-10-14 |
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free |
Ziyue Li et.al. |
2410.10814 |
link |
2024-10-14 |
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory |
Di Wu et.al. |
2410.10813 |
link |
2024-10-14 |
Local and Global Decoding in Text Generation |
Daniel Gareev et.al. |
2410.10810 |
link |
2024-10-14 |
Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning |
Aakanksha et.al. |
2410.10801 |
null |
2024-10-14 |
Towards Foundation Models for 3D Vision: How Close Are We? |
Yiming Zuo et.al. |
2410.10799 |
null |
2024-10-15 |
MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling |
Jian Yang et.al. |
2410.10798 |
null |
2024-10-14 |
Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance |
Sachin Goyal et.al. |
2410.10796 |
link |
2024-10-15 |
LiveXiv – A Multi-Modal Live Benchmark Based on Arxiv Papers Content |
Nimrod Shabtay et.al. |
2410.10783 |
link |
2024-10-14 |
When Attention Sink Emerges in Language Models: An Empirical View |
Xiangming Gu et.al. |
2410.10781 |
link |
2024-10-14 |
Focused ReAct: Improving ReAct through Reiterate and Early Stop |
Shuoqiu Li et.al. |
2410.10779 |
null |
2024-10-14 |
AFlow: Automating Agentic Workflow Generation |
Jiayi Zhang et.al. |
2410.10762 |
link |
2024-10-14 |
Denial-of-Service Poisoning Attacks against Large Language Models |
Kuofeng Gao et.al. |
2410.10760 |
link |
2024-10-14 |
SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization |
Akrit Mudvari et.al. |
2410.10759 |
null |
2024-10-14 |
Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation for Classification |
Jan Cegin et.al. |
2410.10756 |
link |
2024-10-14 |
NT-LLM: A Novel Node Tokenizer for Integrating Graph Structure into Large Language Models |
Yanbiao Ji et.al. |
2410.10743 |
null |
2024-10-14 |
SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing |
Pengrui Quan et.al. |
2410.10741 |
link |
2024-10-14 |
Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs |
Ishan Jindal et.al. |
2410.10739 |
null |
2024-10-14 |
Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning |
Kuofeng Gao et.al. |
2410.10735 |
null |
2024-10-14 |
Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection |
Giorgos Iacovides et.al. |
2410.10728 |
null |
2024-10-11 |
Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models |
Qin Liu et.al. |
2410.09047 |
null |
2024-10-11 |
AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation |
Zijun Wang et.al. |
2410.09040 |
link |
2024-10-11 |
Semi-Supervised Learning of Noisy Mixture of Experts Models |
Oh-Ran Kwon et.al. |
2410.09039 |
null |
2024-10-11 |
SimpleStrat: Diversifying Language Model Generation with Stratification |
Justin Wong et.al. |
2410.09038 |
null |
2024-10-11 |
Mentor-KD: Making Small Language Models Better Multi-step Reasoners |
Hojae Lee et.al. |
2410.09037 |
link |
2024-10-11 |
PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents |
Xiangyu Yin et.al. |
2410.09034 |
link |
2024-10-11 |
MedMobile: A mobile-sized language model with expert-level clinical capabilities |
Krithik Vishwanath et.al. |
2410.09019 |
link |
2024-10-11 |
Parameter-Efficient Fine-Tuning of State Space Models |
Kevin Galim et.al. |
2410.09016 |
link |
2024-10-11 |
The Impact of Visual Information in Chinese Characters: Evaluating Large Models’ Ability to Recognize and Utilize Radicals |
Xiaofeng Wu et.al. |
2410.09013 |
null |
2024-10-11 |
Software Engineering and Foundation Models: Insights from Industry Blogs Using a Jury of Foundation Models |
Hao Li et.al. |
2410.09012 |
link |
2024-10-11 |
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights |
Ling Yang et.al. |
2410.09008 |
link |
2024-10-11 |
From Interaction to Impact: Towards Safer AI Agents Through Understanding and Evaluating UI Operation Impacts |
Zhuohao Jerry Zhang et.al. |
2410.09006 |
null |
2024-10-11 |
DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection |
Haochen Li et.al. |
2410.09004 |
null |
2024-10-11 |
Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference |
Grace Proebsting et.al. |
2410.08996 |
null |
2024-10-11 |
The structure of the token space for large language models |
Michael Robinson et.al. |
2410.08993 |
null |
2024-10-11 |
Science is Exploration: Computational Frontiers for Conceptual Metaphor Theory |
Rebecca M. M. Hicke et.al. |
2410.08991 |
link |
2024-10-11 |
SubZero: Random Subspace Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning |
Ziming Yu et.al. |
2410.08989 |
link |
2024-10-11 |
Towards Trustworthy Knowledge Graph Reasoning: An Uncertainty Aware Perspective |
Bo Ni et.al. |
2410.08985 |
null |
2024-10-11 |
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models |
Zheng Yi Ho et.al. |
2410.08970 |
null |
2024-10-11 |
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements |
Jingyu Zhang et.al. |
2410.08968 |
null |
2024-10-10 |
DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models |
Xiaoxiao He et.al. |
2410.08207 |
null |
2024-10-10 |
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training |
Gen Luo et.al. |
2410.08202 |
null |
2024-10-10 |
Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity |
Shuo Xie et.al. |
2410.08198 |
link |
2024-10-10 |
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions |
Changle Qu et.al. |
2410.08197 |
link |
2024-10-10 |
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code |
Zimu Lu et.al. |
2410.08196 |
link |
2024-10-10 |
Features are fate: a theory of transfer learning in high-dimensional regression |
Javan Tahir et.al. |
2410.08194 |
null |
2024-10-10 |
GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment |
Yuancheng Xu et.al. |
2410.08193 |
null |
2024-10-10 |
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models |
Wenbo Hu et.al. |
2410.08182 |
null |
2024-10-10 |
Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models |
Qingni Wang et.al. |
2410.08174 |
null |
2024-10-10 |
On the Evaluation of Generative Robotic Simulations |
Feng Chen et.al. |
2410.08172 |
null |
2024-10-10 |
Visual Scratchpads: Enabling Global Reasoning in Vision |
Aryo Lotfi et.al. |
2410.08165 |
null |
2024-10-10 |
Agent S: An Open Agentic Framework that Uses Computers Like a Human |
Saaket Agashe et.al. |
2410.08164 |
link |
2024-10-10 |
The Effect of Surprisal on Reading Times in Information Seeking and Repeated Reading |
Keren Gruteke Klein et.al. |
2410.08162 |
link |
2024-10-10 |
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation |
Jiatao Gu et.al. |
2410.08159 |
null |
2024-10-10 |
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning |
Amrith Setlur et.al. |
2410.08146 |
null |
2024-10-10 |
Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs |
Xiaoyuan Liu et.al. |
2410.08145 |
link |
2024-10-10 |
DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory |
Yutong Wang et.al. |
2410.08143 |
link |
2024-10-10 |
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction |
Jarrid Rector-Brooks et.al. |
2410.08134 |
null |
2024-10-10 |
Think Beyond Size: Dynamic Prompting for More Effective Reasoning |
Kamesh R et.al. |
2410.08130 |
null |
2024-10-10 |
Mars: Situated Inductive Reasoning in an Open-World Environment |
Xiaojuan Tang et.al. |
2410.08126 |
null |
2024-10-09 |
MM-Ego: Towards Building Egocentric Multimodal LLMs |
Hanrong Ye et.al. |
2410.07177 |
null |
2024-10-09 |
Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models |
Fei Wang et.al. |
2410.07176 |
null |
2024-10-09 |
Do better language models have crisper vision? |
Jona Ruthardt et.al. |
2410.07173 |
null |
2024-10-09 |
One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation |
Fabian Paischer et.al. |
2410.07170 |
link |
2024-10-09 |
Sylber: Syllabic Embedding Representation of Speech from Raw Audio |
Cheol Jun Cho et.al. |
2410.07168 |
link |
2024-10-09 |
Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate |
Qidong Huang et.al. |
2410.07167 |
link |
2024-10-09 |
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making |
Manling Li et.al. |
2410.07166 |
link |
2024-10-09 |
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning |
Chongyu Fan et.al. |
2410.07163 |
link |
2024-10-09 |
Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis |
Bohan Zeng et.al. |
2410.07155 |
link |
2024-10-09 |
Towards Interpreting Visual Information Processing in Vision-Language Models |
Clement Neo et.al. |
2410.07149 |
link |
2024-10-09 |
Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling |
Yingfa Chen et.al. |
2410.07145 |
null |
2024-10-09 |
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates |
Xiaosen Zheng et.al. |
2410.07137 |
link |
2024-10-10 |
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models |
Rui Zhao et.al. |
2410.07133 |
link |
2024-10-09 |
Mental Disorders Detection in the Era of Large Language Models |
Gleb Kuzmin et.al. |
2410.07129 |
null |
2024-10-09 |
Exploring the Readiness of Prominent Small Language Models for the Democratization of Financial Literacy |
Tagore Rao Kosireddy et.al. |
2410.07118 |
link |
2024-10-09 |
Personalized Visual Instruction Tuning |
Renjie Pi et.al. |
2410.07113 |
link |
2024-10-09 |
VHELM: A Holistic Evaluation of Vision Language Models |
Tony Lee et.al. |
2410.07112 |
link |
2024-10-09 |
I Want to Break Free! Anti-Social Behavior and Persuasion Ability of LLMs in Multi-Agent Settings with Social Hierarchy |
Gian Maria Campedelli et.al. |
2410.07109 |
link |
2024-10-09 |
Unleashing Multi-Hop Reasoning Potential in Large Language Models through Repetition of Misordered Context |
Sangwon Yu et.al. |
2410.07103 |
null |
2024-10-09 |
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering |
Jun Shern Chan et.al. |
2410.07095 |
link |
2024-10-07 |
Fine-Tuning CLIP’s Last Visual Projector: A Few-Shot Cornucopia |
Mohammad Fahes et.al. |
2410.05270 |
link |
2024-10-07 |
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models |
Fei Wang et.al. |
2410.05269 |
null |
2024-10-07 |
PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs |
Mengzhao Chen et.al. |
2410.05265 |
link |
2024-10-07 |
TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles |
Qingchen Yu et.al. |
2410.05262 |
link |
2024-10-07 |
TextHawk2: A Large Vision-Language Model Excels in Bilingual OCR and Grounding with 16x Fewer Tokens |
Ya-Qi Yu et.al. |
2410.05261 |
null |
2024-10-07 |
Differential Transformer |
Tianzhu Ye et.al. |
2410.05258 |
link |
2024-10-07 |
GLEE: A Unified Framework and Benchmark for Language-based Economic Environments |
Eilam Shapira et.al. |
2410.05254 |
link |
2024-10-07 |
Causal Micro-Narratives |
Mourad Heddaya et.al. |
2410.05252 |
null |
2024-10-07 |
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe |
Yuxin Xiao et.al. |
2410.05248 |
null |
2024-10-07 |
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents |
Boyu Gou et.al. |
2410.05243 |
link |
2024-10-08 |
TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models |
Rabin Adhikari et.al. |
2410.05239 |
link |
2024-10-07 |
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models |
Iman Mirzadeh et.al. |
2410.05229 |
null |
2024-10-07 |
Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates |
Avanika Narayan et.al. |
2410.05224 |
null |
2024-10-07 |
Precise Model Benchmarking with Only a Few Observations |
Riccardo Fogliato et.al. |
2410.05222 |
null |
2024-10-07 |
Density estimation with LLMs: a geometric investigation of in-context learning trajectories |
Toni J. B. Liu et.al. |
2410.05218 |
null |
2024-10-07 |
Organizing Unstructured Image Collections using Natural Language |
Mingxuan Liu et.al. |
2410.05217 |
null |
2024-10-07 |
Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality |
Youngtaek Oh et.al. |
2410.05210 |
link |
2024-10-07 |
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References |
Qiyuan Zhang et.al. |
2410.05193 |
null |
2024-10-07 |
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape Perspective |
Kaiyue Wen et.al. |
2410.05192 |
null |
2024-10-07 |
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation |
Zhijie Wang et.al. |
2410.05191 |
null |
2024-10-04 |
Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models |
Zhuochun Li et.al. |
2410.03663 |
null |
2024-10-04 |
Unraveling Cross-Modality Knowledge Conflict in Large Vision-Language Models |
Tinghui Zhu et.al. |
2410.03659 |
link |
2024-10-04 |
RAFT: Realistic Attacks to Fool Text Detectors |
James Wang et.al. |
2410.03658 |
link |
2024-10-04 |
Aligning LLMs with Individual Preferences via Interaction |
Shujin Wu et.al. |
2410.03642 |
link |
2024-10-04 |
Conditional Enzyme Generation Using Protein Language Models with Adapters |
Jason Yang et.al. |
2410.03634 |
null |
2024-10-04 |
Large Language Model Performance Benchmarking on Mobile Platforms: A Thorough Evaluation |
Jie Xiao et.al. |
2410.03613 |
null |
2024-10-04 |
TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and Generation |
Jonathan Cook et.al. |
2410.03608 |
null |
2024-10-04 |
LeLaN: Learning A Language-Conditioned Navigation Policy from In-the-Wild Videos |
Noriaki Hirose et.al. |
2410.03603 |
null |
2024-10-04 |
Efficiently Identifying Watermarked Segments in Mixed-Source Texts |
Xuandong Zhao et.al. |
2410.03600 |
null |
2024-10-04 |
Understanding Reasoning in Chain-of-Thought from the Hopfieldian View |
Lijie Hu et.al. |
2410.03595 |
null |
2024-10-04 |
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models |
Xin Zou et.al. |
2410.03577 |
link |
2024-10-04 |
Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs) |
Abrar Rahman et.al. |
2410.03568 |
null |
2024-10-04 |
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding |
Wei Wu et.al. |
2410.03553 |
null |
2024-10-04 |
Re-examining Sexism and Misogyny Classification with Annotator Attitudes |
Aiqi Jiang et.al. |
2410.03543 |
null |
2024-10-04 |
No Need to Talk: Asynchronous Mixture of Language Models |
Anastasiia Filippova et.al. |
2410.03529 |
null |
2024-10-04 |
Steering Large Language Models between Code Execution and Textual Reasoning |
Yongchao Chen et.al. |
2410.03524 |
null |
2024-10-04 |
A Probabilistic Perspective on Unlearning and Alignment for Large Language Models |
Yan Scholten et.al. |
2410.03523 |
null |
2024-10-04 |
CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios |
Zetian Ouyang et.al. |
2410.03502 |
link |
2024-10-04 |
FedStein: Enhancing Multi-Domain Federated Learning Through James-Stein Estimator |
Sunny Gupta et.al. |
2410.03499 |
link |
2024-10-04 |
Towards Reproducible LLM Evaluation: Quantifying Uncertainty in LLM Benchmark Scores |
Robert E. Blackwell et.al. |
2410.03492 |
null |
2024-10-03 |
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations |
Nick Jiang et.al. |
2410.02762 |
link |
2024-10-03 |
FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models |
Zhipei Xu et.al. |
2410.02761 |
link |
2024-10-03 |
Erasing Conceptual Knowledge from Language Models |
Rohit Gandikota et.al. |
2410.02760 |
link |
2024-10-03 |
Loong: Generating Minute-level Long Videos with Autoregressive Language Models |
Yuqing Wang et.al. |
2410.02757 |
null |
2024-10-03 |
SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% the Cost |
Jifan Zhang et.al. |
2410.02755 |
null |
2024-10-03 |
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis |
Ulyana Piterbarg et.al. |
2410.02749 |
link |
2024-10-03 |
CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation |
Han He et.al. |
2410.02748 |
null |
2024-10-03 |
Contrastive Localized Language-Image Pre-Training |
Hong-You Chen et.al. |
2410.02746 |
null |
2024-10-03 |
Neutral residues: revisiting adapters for model extension |
Franck Signe Talla et.al. |
2410.02744 |
null |
2024-10-03 |
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions |
Yekun Chai et.al. |
2410.02743 |
null |
2024-10-03 |
Grounding Large Language Models In Embodied Environment With Imperfect World Models |
Haolan Liu et.al. |
2410.02742 |
null |
2024-10-03 |
Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization |
Lei Xu et.al. |
2410.02741 |
link |
2024-10-03 |
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models |
Zhengfeng Lai et.al. |
2410.02740 |
null |
2024-10-04 |
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge |
Jiayi Ye et.al. |
2410.02736 |
null |
2024-10-03 |
DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects |
Zhaowei Wang et.al. |
2410.02730 |
link |
2024-10-03 |
Unified Multi-Modal Interleaved Document Representation for Information Retrieval |
Jaewoo Lee et.al. |
2410.02729 |
null |
2024-10-03 |
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation |
Rohin Manvi et.al. |
2410.02725 |
null |
2024-10-03 |
Large Language Models as Markov Chains |
Oussama Zekri et.al. |
2410.02724 |
null |
2024-10-03 |
Domain-Specific Retrieval-Augmented Generation Using Vector Stores, Knowledge Graphs, and Tensor Factorization |
Ryan C. Barron et.al. |
2410.02721 |
null |
2024-10-03 |
UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation |
Zixuan Li et.al. |
2410.02719 |
null |
2024-10-02 |
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads |
Yuxiang Huang et.al. |
2410.01805 |
link |
2024-10-02 |
Efficient $1$-bit tensor approximations |
Alex W. Neal Riasanovsky et.al. |
2410.01799 |
null |
2024-10-02 |
Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models |
Joseph Lee et.al. |
2410.01795 |
link |
2024-10-02 |
When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1 |
R. Thomas McCoy et.al. |
2410.01792 |
null |
2024-10-02 |
Investigating on RLHF methodology |
Alexey Kutalev et.al. |
2410.01789 |
null |
2024-10-02 |
OmniGenBench: Automating Large-scale in-silico Benchmarking for Genomic Foundation Models |
Heng Yang et.al. |
2410.01784 |
link |
2024-10-02 |
Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models |
Shayekh Bin Islam et.al. |
2410.01782 |
link |
2024-10-03 |
Quantifying Generalization Complexity for Large Language Models |
Zhenting Qi et.al. |
2410.01769 |
link |
2024-10-02 |
Integrating Protein Sequence and Expression Level to Analysis Molecular Characterization of Breast Cancer Subtypes |
Hossein Sholehrasa et.al. |
2410.01755 |
null |
2024-10-03 |
Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks |
Mengzhao Jia et.al. |
2410.01744 |
link |
2024-10-02 |
VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models |
Kailai Feng et.al. |
2410.01738 |
link |
2024-10-02 |
Visual Perception in Text Strings |
Qi Jia et.al. |
2410.01733 |
link |
2024-10-02 |
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing |
Yilmazcan Ozyurt et.al. |
2410.01727 |
link |
2024-10-02 |
Auto-Demo Prompting: Leveraging Generated Outputs as Demonstrations for Enhanced Batch Prompting |
Longyu Feng et.al. |
2410.01724 |
null |
2024-10-02 |
Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective |
Zeyu Gan et.al. |
2410.01720 |
link |
2024-10-02 |
Examining the Role of Relationship Alignment in Large Language Models |
Kristen M. Altenburger et.al. |
2410.01708 |
null |
2024-10-02 |
Interpretable Contrastive Monte Carlo Tree Search Reasoning |
Zitian Gao et.al. |
2410.01707 |
link |
2024-10-02 |
An Exploration of Self-Supervised Mutual Information Alignment for Multi-Task Settings |
Soham Govande et.al. |
2410.01704 |
link |
2024-10-02 |
CreDes: Causal Reasoning Enhancement and Dual-End Searching for Solving Long-Range Reasoning Problems using LLMs |
Kangsheng Wang et.al. |
2410.01696 |
null |
2024-10-02 |
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models |
Tung-Yu Wu et.al. |
2410.01692 |
null |
2024-09-30 |
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning |
Haotian Zhang et.al. |
2409.20566 |
null |
2024-09-30 |
LaMMA-P: Generalizable Multi-Agent Long-Horizon Task Allocation and Planning with LM-Driven PDDL Planner |
Xiaopan Zhang et.al. |
2409.20560 |
null |
2024-09-30 |
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos |
Md Mohaiminul Islam et.al. |
2409.20557 |
null |
2024-09-30 |
UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models |
Qiaojun Yu et.al. |
2409.20551 |
null |
2024-09-30 |
LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation |
Ziyao Zhang et.al. |
2409.20550 |
null |
2024-09-30 |
Robi Butler: Remote Multimodal Interactions with Household Robot Assistant |
Anxing Xiao et.al. |
2409.20548 |
null |
2024-09-30 |
Uncertainty-Informed Screening for Safer Solvents Used in the Synthesis of Perovskite via Language Models |
Arpan Mukherjee et.al. |
2409.20512 |
null |
2024-09-30 |
COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models |
Divyanshu Daiya et.al. |
2409.20502 |
null |
2024-09-30 |
A Weakly Supervised Data Labeling Framework for Machine Lexical Normalization in Vietnamese Social Media |
Dung Ha Nguyen et.al. |
2409.20467 |
null |
2024-09-30 |
Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments |
Mohamed Elnoor et.al. |
2409.20445 |
null |
2024-10-01 |
Instance-adaptive Zero-shot Chain-of-Thought Prompting |
Xiaosong Yuan et.al. |
2409.20441 |
null |
2024-09-30 |
HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding |
Fan Yuan et.al. |
2409.20429 |
null |
2024-09-30 |
World to Code: Multi-modal Data Generation via Self-Instructed Compositional Captioning and Filtering |
Jiacong Wang et.al. |
2409.20424 |
link |
2024-09-30 |
Anti-stereotypical Predictive Text Suggestions Do Not Reliably Yield Anti-stereotypical Writing |
Connor Baumler et.al. |
2409.20390 |
null |
2024-09-30 |
Wait, but Tylenol is Acetaminophen… Investigating and Improving Language Models’ Ability to Resist Requests for Misinformation |
Shan Chen et.al. |
2409.20385 |
null |
2024-09-30 |
Word-wise intonation model for cross-language TTS systems |
Tomilov A. A. et.al. |
2409.20374 |
null |
2024-09-30 |
The Perfect Blend: Redefining RLHF with Mixture of Judges |
Tengyu Xu et.al. |
2409.20370 |
null |
2024-09-30 |
VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs |
Ruotong Liao et.al. |
2409.20365 |
link |
2024-09-30 |
Efficient Driving Behavior Narration and Reasoning on Edge Device Using Large Language Models |
Yizhou Huang et.al. |
2409.20364 |
null |
2024-09-30 |
Rotated Runtime Smooth: Training-Free Activation Smoother for accurate INT4 inference |
Ke Yi et.al. |
2409.20361 |
null |
2024-09-27 |
Exploring Token Pruning in Vision State Space Models |
Zheng Zhan et.al. |
2409.18962 |
null |
2024-09-27 |
LML: Language Model Learning a Dataset for Data-Augmented Prediction |
Praneeth Vadlapati et.al. |
2409.18957 |
link |
2024-09-27 |
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models |
Jiaming Li et.al. |
2409.18943 |
link |
2024-09-27 |
From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding |
Heqing Zou et.al. |
2409.18938 |
null |
2024-09-27 |
Social Media Bot Policies: Evaluating Passive and Active Enforcement |
Kristina Radivojevic et.al. |
2409.18931 |
null |
2024-09-27 |
AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow |
Huizi Yu et.al. |
2409.18924 |
null |
2024-09-27 |
Soft Measures for Extracting Causal Collective Intelligence |
Maryam Berijanian et.al. |
2409.18911 |
link |
2024-09-27 |
Improving Visual Object Tracking through Visual Prompting |
Shih-Fang Chen et.al. |
2409.18901 |
link |
2024-09-27 |
IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation |
Fan Lin et.al. |
2409.18892 |
link |
2024-09-27 |
Suicide Phenotyping from Clinical Notes in Safety-Net Psychiatric Hospital Using Multi-Label Classification with Pre-Trained Language Models |
Zehan Li et.al. |
2409.18878 |
null |
2024-09-27 |
Predicting and analyzing memorization within fine-tuned Large Language Models |
Jérémie Dentan et.al. |
2409.18858 |
null |
2024-09-27 |
Mitigating Selection Bias with Node Pruning and Auxiliary Options |
Hyeong Kyu Choi et.al. |
2409.18857 |
null |
2024-09-27 |
LLMs4Synthesis: Leveraging Large Language Models for Scientific Synthesis |
Hamed Babaei Giglou et.al. |
2409.18812 |
link |
2024-09-27 |
Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs |
Yanyuan Qiao et.al. |
2409.18794 |
null |
2024-09-27 |
A Survey on the Honesty of Large Language Models |
Siheng Li et.al. |
2409.18786 |
link |
2024-09-27 |
Enhancing Explainability in Multimodal Large Language Models Using Ontological Context |
Jihen Amara et.al. |
2409.18753 |
null |
2024-09-27 |
OpenObject-NAV: Open-Vocabulary Object-Oriented Navigation Based on Dynamic Carrier-Relationship Scene Graph |
Yujie Tang et.al. |
2409.18743 |
null |
2024-09-27 |
Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs |
Gleb Mezentsev et.al. |
2409.18721 |
link |
2024-09-27 |
Read Over the Lines: Attacking LLMs and Toxicity Detection Systems with ASCII Art to Mask Profanity |
Sergey Berezin et.al. |
2409.18708 |
link |
2024-09-27 |
Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models |
Yiming Chen et.al. |
2409.18680 |
link |
2024-09-26 |
EgoLM: Multi-Modal Language Model of Egocentric Motions |
Fangzhou Hong et.al. |
2409.18127 |
null |
2024-09-26 |
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction |
Jing He et.al. |
2409.18124 |
null |
2024-09-26 |
Multi-View and Multi-Scale Alignment for Contrastive Language-Image Pre-training in Mammography |
Yuexi Du et.al. |
2409.18119 |
null |
2024-09-26 |
E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding |
Ye Liu et.al. |
2409.18111 |
link |
2024-09-26 |
Open-World Evaluation for Retrieving Diverse Perspectives |
Hung-Ting Chen et.al. |
2409.18110 |
null |
2024-09-26 |
MALPOLON: A Framework for Deep Species Distribution Modeling |
Theo Larcher et.al. |
2409.18102 |
link |
2024-09-26 |
SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation |
Xin Li et.al. |
2409.18082 |
null |
2024-09-26 |
Infer Human’s Intentions Before Following Natural Language Instructions |
Yanming Wan et.al. |
2409.18073 |
link |
2024-09-26 |
Infering Alt-text For UI Icons With Large Language Models During App Development |
Sabrina Haque et.al. |
2409.18060 |
null |
2024-09-26 |
DualAD: Dual-Layer Planning for Reasoning in Autonomous Driving |
Dingrui Wang et.al. |
2409.18053 |
link |
2024-09-26 |
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions |
Kai Chen et.al. |
2409.18042 |
null |
2024-09-26 |
Compositional Hardness of Code in Large Language Models – A Probabilistic Perspective |
Yotam Wolf et.al. |
2409.18028 |
null |
2024-09-26 |
An Adversarial Perspective on Machine Unlearning for AI Safety |
Jakub Łucki et.al. |
2409.18025 |
link |
2024-09-26 |
DARE: Diverse Visual Question Answering with Robustness Evaluation |
Hannah Sterz et.al. |
2409.18023 |
null |
2024-09-26 |
Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles |
Lewei He et.al. |
2409.18014 |
null |
2024-09-26 |
Control Industrial Automation System with Large Language Models |
Yuchen Xia et.al. |
2409.18009 |
link |
2024-09-26 |
Multilingual Evaluation of Long Context Retrieval and Reasoning |
Ameeta Agrawal et.al. |
2409.18006 |
link |
2024-09-26 |
Enhancing Tourism Recommender Systems for Sustainable City Trips Using Retrieval-Augmented Generation |
Ashmi Banerjee et.al. |
2409.18003 |
null |
2024-09-26 |
Extracting Affect Aggregates from Longitudinal Social Media Data with Temporal Adapters for Large Language Models |
Georg Ahnert et.al. |
2409.17990 |
link |
2024-09-26 |
LLM4Brain: Training a Large Language Model for Brain Video Understanding |
Ruizhe Zheng et.al. |
2409.17987 |
null |
2024-09-25 |
Attention Prompting on Image for Large Vision-Language Models |
Runpeng Yu et.al. |
2409.17143 |
link |
2024-09-25 |
FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression |
Fazal Mittu et.al. |
2409.17141 |
link |
2024-09-25 |
Turn Every Application into an Agent: Towards Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents |
Junting Lu et.al. |
2409.17140 |
null |
2024-09-25 |
Blox-Net: Generative Design-for-Robot-Assembly Using VLM Supervision, Physics Simulation, and a Robot with Reset |
Andrew Goldberg et.al. |
2409.17126 |
null |
2024-09-25 |
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale |
Fan Zhou et.al. |
2409.17115 |
link |
2024-09-25 |
Unveiling Ontological Commitment in Multi-Modal Foundation Models |
Mert Keser et.al. |
2409.17109 |
null |
2024-09-25 |
Accumulator-Aware Post-Training Quantization |
Ian Colbert et.al. |
2409.17092 |
null |
2024-09-25 |
Can Vision Language Models Learn from Visual Demonstrations of Ambiguous Spatial Reasoning? |
Bowen Zhao et.al. |
2409.17080 |
link |
2024-09-25 |
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models |
Yifei Liu et.al. |
2409.17066 |
link |
2024-09-25 |
Benchmarking Domain Generalization Algorithms in Computational Pathology |
Neda Zamanitajeddin et.al. |
2409.17063 |
null |
2024-09-25 |
Using LLM for Real-Time Transcription and Summarization of Doctor-Patient Interactions into ePuskesmas in Indonesia |
Azmul Asmar Irfan et.al. |
2409.17054 |
null |
2024-09-25 |
GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design |
Phillip Mueller et.al. |
2409.17045 |
null |
2024-09-25 |
How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not |
Francesco Verdini et.al. |
2409.17044 |
null |
2024-09-25 |
Counterfactual Token Generation in Large Language Models |
Ivi Chatzi et.al. |
2409.17027 |
link |
2024-09-25 |
LLM-CARD: Towards a Description and Landscape of Large Language Models |
Shengwei Tian et.al. |
2409.17011 |
link |
2024-09-25 |
Models Can and Should Embrace the Communicative Nature of Human-Generated Math |
Sasha Boguraev et.al. |
2409.17005 |
null |
2024-09-26 |
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization |
Shimao Chen et.al. |
2409.16997 |
link |
2024-09-25 |
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models |
Chi Zhang et.al. |
2409.16986 |
null |
2024-09-25 |
AXCEL: Automated eXplainable Consistency Evaluation using LLMs |
P Aditya Sreekar et.al. |
2409.16984 |
null |
2024-09-25 |
Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions |
Zeyneb N. Kaya et.al. |
2409.16974 |
null |
2024-09-24 |
Semantic Refocused Tuning for Open-Vocabulary Panoptic Segmentation |
Yong Xien Chng et.al. |
2409.16278 |
null |
2024-09-24 |
LLM Echo Chamber: personalized and automated disinformation |
Tony Ma et.al. |
2409.16241 |
link |
2024-09-24 |
EuroLLM: Multilingual Language Models for Europe |
Pedro Henrique Martins et.al. |
2409.16235 |
null |
2024-09-24 |
Fine-Tuning is Fine, if Calibrated |
Zheda Mai et.al. |
2409.16223 |
link |
2024-09-24 |
Towards Enhancing Linked Data Retrieval in Conversational UIs using Large Language Models |
Omar Mussa et.al. |
2409.16220 |
link |
2024-09-24 |
LLMCount: Enhancing Stationary mmWave Detection with Multimodal-LLM |
Boyan Li et.al. |
2409.16209 |
null |
2024-09-25 |
CJEval: A Benchmark for Assessing Large Language Models Using Chinese Junior High School Exam Data |
Qian-Wen Zhang et.al. |
2409.16202 |
link |
2024-09-24 |
Leveraging Estimated Transferability Over Human Intuition for Model Selection in Text Ranking |
Jun Bai et.al. |
2409.16198 |
null |
2024-09-24 |
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models |
Haoran Que et.al. |
2409.16191 |
link |
2024-09-24 |
Expert-level vision-language foundation model for real-world radiology and comprehensive evaluation |
Xiaohong Liu et.al. |
2409.16183 |
null |
2024-09-24 |
SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image |
Dimitrije Antić et.al. |
2409.16178 |
null |
2024-09-24 |
Cyber Knowledge Completion Using Large Language Models |
Braden K Webb et.al. |
2409.16176 |
null |
2024-09-24 |
Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering |
Ziyu Zhao et.al. |
2409.16167 |
null |
2024-09-24 |
EnIGMA: Enhanced Interactive Generative Model Agent for CTF Challenges |
Talor Abramovich et.al. |
2409.16165 |
link |
2024-09-24 |
ComiCap: A VLMs pipeline for dense captioning of Comic Panels |
Emanuele Vivoli et.al. |
2409.16159 |
link |
2024-09-24 |
Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework |
Lu Chen et.al. |
2409.16146 |
link |
2024-09-24 |
Evaluation of state-of-the-art ASR Models in Child-Adult Interactions |
Aditya Ashvin et.al. |
2409.16135 |
null |
2024-09-24 |
MOSS: Enabling Code-Driven Evolution and Context Management for AI Agents |
Ming Zhu et.al. |
2409.16120 |
link |
2024-09-25 |
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration |
Pin-Jui Ku et.al. |
2409.16117 |
link |
2024-09-24 |
Exploring Hint Generation Approaches in Open-Domain Question Answering |
Jamshid Mozafari et.al. |
2409.16096 |
link |
2024-09-20 |
Gender Representation and Bias in Indian Civil Service Mock Interviews |
Somonnoy Banerjee et.al. |
2409.12194 |
null |
2024-09-18 |
Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution |
Peng Wang et.al. |
2409.12191 |
link |
2024-09-18 |
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning |
Zayne Sprague et.al. |
2409.12183 |
link |
2024-09-23 |
A Controlled Study on Long Context Extension and Generalization in LLMs |
Yi Lu et.al. |
2409.12181 |
link |
2024-09-18 |
Finetuning Language Models to Emit Linguistic Expressions of Uncertainty |
Arslan Chaudhry et.al. |
2409.12180 |
null |
2024-09-18 |
Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference |
Najmeh Forouzandehmehr et.al. |
2409.12150 |
null |
2024-09-18 |
MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning |
Justin Chih-Yao Chen et.al. |
2409.12147 |
link |
2024-09-18 |
MoRAG – Multi-Fusion Retrieval Augmented Generation for Human Motion |
Kalakonda Sai Shashank et.al. |
2409.12140 |
null |
2024-09-24 |
Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models |
Sijing Chen et.al. |
2409.12139 |
null |
2024-09-18 |
GRIN: GRadient-INformed MoE |
Liyuan Liu et.al. |
2409.12136 |
null |
2024-09-18 |
Linguini: A benchmark for language-agnostic linguistic reasoning |
Eduardo Sánchez et.al. |
2409.12126 |
link |
2024-09-18 |
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement |
An Yang et.al. |
2409.12122 |
null |
2024-09-18 |
Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference |
Edresson Casanova et.al. |
2409.12117 |
null |
2024-09-18 |
Measuring Human and AI Values based on Generative Psychometrics with Large Language Models |
Haoran Ye et.al. |
2409.12106 |
link |
2024-09-19 |
Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval |
Warren Jouanneau et.al. |
2409.12097 |
null |
2024-09-19 |
The Impact of Element Ordering on LM Agent Performance |
Wayne Chi et.al. |
2409.12089 |
link |
2024-09-18 |
Dual-Layer Training and Decoding of Large Language Model with Simultaneously Thinking and Speaking |
Ningyuan Xi et.al. |
2409.12059 |
null |
2024-09-19 |
Using Large Language Models to Generate Clinical Trial Tables and Figures |
Yumeng Yang et.al. |
2409.12046 |
null |
2024-09-18 |
All-in-one foundational models learning across quantum chemical levels |
Yuxinxin Chen et.al. |
2409.12015 |
link |
2024-09-18 |
Mixture of Prompt Learning for Vision Language Models |
Yu Du et.al. |
2409.12011 |
null |
2024-09-17 |
AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs |
Basel Mousi et.al. |
2409.11404 |
null |
2024-09-17 |
NVLM: Open Frontier-Class Multimodal LLMs |
Wenliang Dai et.al. |
2409.11402 |
null |
2024-09-17 |
Says Who? Effective Zero-Shot Annotation of Focalization |
Rebecca M. M. Hicke et.al. |
2409.11390 |
null |
2024-09-17 |
Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement |
Simon Yu et.al. |
2409.11378 |
link |
2024-09-17 |
Towards Time Series Reasoning with LLMs |
Winnie Chow et.al. |
2409.11376 |
null |
2024-09-17 |
Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification |
Fatema-E- Jannat et.al. |
2409.11375 |
null |
2024-09-17 |
Learning Spatially-Aware Language and Audio Embedding |
Bhavika Devnani et.al. |
2409.11369 |
null |
2024-09-17 |
CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration |
Jiahui Gao et.al. |
2409.11365 |
null |
2024-09-17 |
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark |
Zachary S. Siegel et.al. |
2409.11363 |
link |
2024-09-17 |
AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances |
Dhruv Agarwal et.al. |
2409.11360 |
null |
2024-09-17 |
THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models |
Mengfei Liang et.al. |
2409.11353 |
link |
2024-09-17 |
LPT++: Efficient Training on Mixture of Long-tailed Experts |
Bowen Dong et.al. |
2409.11323 |
null |
2024-09-17 |
SOAP: Improving and Stabilizing Shampoo using Adam |
Nikhil Vyas et.al. |
2409.11321 |
link |
2024-09-17 |
Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models |
Divij Gupta et.al. |
2409.11302 |
null |
2024-09-17 |
Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5 |
Marcel Lamott et.al. |
2409.11282 |
null |
2024-09-17 |
P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task |
Weiye Xu et.al. |
2409.11279 |
null |
2024-09-17 |
Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments |
Maria Rigaki et.al. |
2409.11276 |
null |
2024-09-17 |
Task Arithmetic for Language Expansion in Speech Translation |
Yao-Fei Cheng et.al. |
2409.11274 |
null |
2024-09-17 |
LOLA – An Open-Source Massively Multilingual Large Language Model |
Nikit Srivastava et.al. |
2409.11272 |
link |
2024-09-17 |
Bio-Inspired Mamba: Temporal Locality and Bioplausible Learning in Selective State Space Models |
Jiahao Qin et.al. |
2409.11263 |
null |
2024-09-16 |
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval |
Di Liu et.al. |
2409.10516 |
link |
2024-09-16 |
Context-aware Code Segmentation for C-to-Rust Translation using Large Language Models |
Momoko Shiraishi et.al. |
2409.10506 |
null |
2024-09-16 |
DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction |
John Wu et.al. |
2409.10504 |
null |
2024-09-16 |
Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles |
Kulin Shah et.al. |
2409.10502 |
link |
2024-09-16 |
Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models |
Shaznin Sultana et.al. |
2409.10490 |
null |
2024-09-16 |
Do Pre-trained Vision-Language Models Encode Object States? |
Kaleb Newman et.al. |
2409.10488 |
null |
2024-09-16 |
XLM for Autonomous Driving Systems: A Comprehensive Review |
Sonda Fourati et.al. |
2409.10484 |
null |
2024-09-16 |
Schrodinger’s Memory: Large Language Models |
Wei Wang et.al. |
2409.10482 |
null |
2024-09-16 |
Towards Semantic Versioning of Open Pre-trained Language Model Releases on Hugging Face |
Adekunle Ajibode et.al. |
2409.10472 |
null |
2024-09-16 |
LLM as BT-Planner: Leveraging LLMs for Behavior Tree Generation in Robot Task Planning |
Jicong Ao et.al. |
2409.10444 |
link |
2024-09-16 |
CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera |
Jingpei Lu et.al. |
2409.10441 |
null |
2024-09-16 |
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models |
Vineet Bhat et.al. |
2409.10419 |
null |
2024-09-16 |
A Large-Scale Privacy Assessment of Android Third-Party SDKs |
Mark Huasong Meng et.al. |
2409.10411 |
null |
2024-09-16 |
A Knowledge-Enhanced Disease Diagnosis Method Based on Prompt Learning and BERT Integration |
Zhang Zheng et.al. |
2409.10403 |
null |
2024-09-17 |
Learnings from a Large-Scale Deployment of an LLM-Powered Expert-in-the-Loop Healthcare Chatbot |
Bhuvan Sachdeva et.al. |
2409.10354 |
null |
2024-09-16 |
Large Language Model Enhanced Hard Sample Identification for Denoising Recommendation |
Tianrui Song et.al. |
2409.10343 |
null |
2024-09-16 |
The 20 questions game to distinguish large language models |
Gurvan Richardeau et.al. |
2409.10338 |
null |
2024-09-16 |
MGSA: Multi-granularity Graph Structure Attention for Knowledge Graph-to-Text Generation |
Shanshan Wang et.al. |
2409.10294 |
null |
2024-09-16 |
ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework |
Jiahao Yuan et.al. |
2409.10289 |
link |
2024-09-16 |
ComplexCodeEval: A Benchmark for Evaluating Large Code Models on More Complex Code |
Jia Feng et.al. |
2409.10280 |
link |
2024-09-13 |
Agents in Software Engineering: Survey, Landscape, and Vision |
Yanxian Huang et.al. |
2409.09030 |
link |
2024-09-13 |
Contri(e)ve: Context + Retrieve for Scholarly Question Answering |
Kanchan Shivashankar et.al. |
2409.09010 |
null |
2024-09-13 |
Safeguarding Decentralized Social Media: LLM Agents for Automating Community Rule Compliance |
Lucio La Cava et.al. |
2409.08963 |
null |
2024-09-13 |
Emerging Reliance Behaviors in Human-AI Text Generation: Hallucinations, Data Quality Assessment, and Cognitive Forcing Functions |
Zahra Ashktorab et.al. |
2409.08937 |
null |
2024-09-13 |
SynSUM – Synthetic Benchmark with Structured and Unstructured Medical Records |
Paloma Rabaey et.al. |
2409.08936 |
link |
2024-09-13 |
LLM-based Weak Supervision Framework for Query Intent Classification in Video Search |
Farnoosh Javadi et.al. |
2409.08931 |
null |
2024-09-13 |
Affective Computing Has Changed: The Foundation Model Disruption |
Björn Schuller et.al. |
2409.08907 |
null |
2024-09-13 |
AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models |
Yifei Yao et.al. |
2409.08904 |
link |
2024-09-13 |
A Market for Lemons? Strategic Directions for a Vigilant Application of Artificial Intelligence in Entrepreneurship Research |
Martin Obschonka et.al. |
2409.08890 |
null |
2024-09-13 |
Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark |
Xuchen Li et.al. |
2409.08887 |
null |
2024-09-13 |
Exploring Graph Structure Comprehension Ability of Multimodal Large Language Models: Case Studies |
Zhiqiang Zhong et.al. |
2409.08864 |
null |
2024-09-13 |
FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition |
Zhenhua Xu et.al. |
2409.08846 |
null |
2024-09-13 |
AIPO: Improving Training Objective for Iterative Preference Optimization |
Yaojie Shen et.al. |
2409.08845 |
link |
2024-09-13 |
A RAG Approach for Generating Competency Questions in Ontology Engineering |
Xueli Pan et.al. |
2409.08820 |
null |
2024-09-13 |
Your Weak LLM is Secretly a Strong Teacher for Alignment |
Leitian Tao et.al. |
2409.08813 |
null |
2024-09-13 |
Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task |
Shao Zhang et.al. |
2409.08811 |
null |
2024-09-13 |
LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment |
Huan Zhang et.al. |
2409.08795 |
link |
2024-09-13 |
Optimizing Ingredient Substitution Using Large Language Models to Enhance Phytochemical Content in Recipes |
Luis Rita et.al. |
2409.08792 |
null |
2024-09-13 |
Electrocardiogram Report Generation and Question Answering via Retrieval-Augmented Self-Supervised Modeling |
Jialu Tang et.al. |
2409.08788 |
null |
2024-09-13 |
Uncertainty and Generalizability in Foundation Models for Earth Observation |
Raul Ramos-Pollan et.al. |
2409.08744 |
null |
2024-09-12 |
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale |
Rogerio Bonatti et.al. |
2409.08264 |
link |
2024-09-12 |
OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering |
Jiahao Nick Li et.al. |
2409.08250 |
null |
2024-09-12 |
Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources |
Alisia Lupidi et.al. |
2409.08239 |
null |
2024-09-12 |
LLM Honeypot: Leveraging Large Language Models as Advanced Interactive Honeypot Systems |
Hakan T. Otal et.al. |
2409.08234 |
link |
2024-09-12 |
Adaptive Language-Guided Abstraction from Contrastive Explanations |
Andi Peng et.al. |
2409.08212 |
null |
2024-09-12 |
ComAlign: Compositional Alignment in Vision-Language Models |
Ali Abdollah et.al. |
2409.08206 |
null |
2024-09-12 |
What Makes a Maze Look Like a Maze? |
Joy Hsu et.al. |
2409.08202 |
null |
2024-09-12 |
AudioBERT: Audio Knowledge Augmented Language Model |
Hyunjong Ok et.al. |
2409.08199 |
link |
2024-09-12 |
Fine-tuning Large Language Models for Entity Matching |
Aaron Steiner et.al. |
2409.08185 |
link |
2024-09-12 |
On the Role of Context in Reading Time Prediction |
Andreas Opedal et.al. |
2409.08160 |
link |
2024-09-12 |
Faster Speech-LLaMA Inference with Multi-token Prediction |
Desh Raj et.al. |
2409.08148 |
null |
2024-09-12 |
LLM-POTUS Score: A Framework of Analyzing Presidential Debates with Large Language Models |
Zhengliang Liu et.al. |
2409.08147 |
null |
2024-09-12 |
Towards a graph-based foundation model for network traffic analysis |
Louis Van Langendonck et.al. |
2409.08111 |
null |
2024-09-12 |
The Faetar Benchmark: Speech Recognition in a Very Under-Resourced Language |
Michael Ong et.al. |
2409.08103 |
null |
2024-09-12 |
The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal |
Huiyuan Xie et.al. |
2409.08098 |
null |
2024-09-12 |
Securing Large Language Models: Addressing Bias, Misinformation, and Prompt Attacks |
Benji Peng et.al. |
2409.08087 |
null |
2024-09-12 |
SimMAT: Exploring Transferability from Vision Foundation Models to Any Image Modality |
Chenyang Lei et.al. |
2409.08083 |
link |
2024-09-12 |
SoVAR: Building Generalizable Scenarios from Accident Reports for Autonomous Driving Testing |
An Guo et.al. |
2409.08081 |
null |
2024-09-12 |
TravelAgent: An AI Assistant for Personalized Travel Planning |
Aili Chen et.al. |
2409.08069 |
null |
2024-09-12 |
An Evaluation Framework for Attributed Information Retrieval using Large Language Models |
Hanane Djeddal et.al. |
2409.08014 |
link |
2024-09-11 |
“My Grade is Wrong!”: A Contestable AI Framework for Interactive Feedback in Evaluating Student Essays |
Shengxin Hong et.al. |
2409.07453 |
null |
2024-09-11 |
StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos |
Sijie Zhao et.al. |
2409.07447 |
null |
2024-09-11 |
SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories |
Ben Bogin et.al. |
2409.07440 |
link |
2024-09-11 |
A Suite for Acoustic Language Model Evaluation |
Gallil Maimon et.al. |
2409.07437 |
link |
2024-09-11 |
Synthetic continued pretraining |
Zitong Yang et.al. |
2409.07431 |
link |
2024-09-11 |
Agent Workflow Memory |
Zora Zhiruo Wang et.al. |
2409.07429 |
link |
2024-09-11 |
CLNX: Bridging Code and Natural Language for C/C++ Vulnerability-Contributing Commits Identification |
Zeqing Qin et.al. |
2409.07407 |
null |
2024-09-11 |
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge |
Han Wang et.al. |
2409.07394 |
link |
2024-09-11 |
Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination |
Daniel Zhang-Li et.al. |
2409.07372 |
null |
2024-09-11 |
Demo: SGCode: A Flexible Prompt-Optimizing System for Secure Generation of Code |
Khiem Ton et.al. |
2409.07368 |
null |
2024-09-11 |
Think Together and Work Better: Combining Humans’ and LLMs’ Think-Aloud Outcomes for Effective Text Evaluation |
SeongYeub Chu et.al. |
2409.07355 |
link |
2024-09-11 |
Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks |
Md Zarif Hossain et.al. |
2409.07353 |
link |
2024-09-11 |
Explanation, Debate, Align: A Weak-to-Strong Framework for Language Model Generalization |
Mehrdad Zakershahrak et.al. |
2409.07335 |
null |
2024-09-11 |
Learning to Compress Contexts for Efficient Knowledge-based Visual Question Answering |
Weixi Weng et.al. |
2409.07331 |
null |
2024-09-11 |
MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications |
Praveen K Kanithi et.al. |
2409.07314 |
null |
2024-09-11 |
Exploring User-level Gradient Inversion with a Diffusion Prior |
Zhuohang Li et.al. |
2409.07291 |
null |
2024-09-11 |
STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM |
Qijiong Liu et.al. |
2409.07276 |
null |
2024-09-11 |
MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving |
Enming Zhang et.al. |
2409.07267 |
link |
2024-09-11 |
Alignment of Diffusion Models: Fundamentals, Challenges, and Future |
Buhua Liu et.al. |
2409.07253 |
link |
2024-09-11 |
PiTe: Pixel-Temporal Alignment for Large Video-Language Model |
Yang Liu et.al. |
2409.07239 |
link |
2024-09-10 |
Benchmarking Sub-Genre Classification For Mainstage Dance Music |
Hongzhi Shu et.al. |
2409.06690 |
null |
2024-09-10 |
E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning |
Zihan Liao et.al. |
2409.06679 |
null |
2024-09-10 |
LLaMA-Omni: Seamless Speech Interaction with Large Language Models |
Qingkai Fang et.al. |
2409.06666 |
link |
2024-09-10 |
Human Perception of LLM-generated Text Content in Social Media Environments |
Kristina Radivojevic et.al. |
2409.06653 |
null |
2024-09-10 |
Optimal Workload Placement on Multi-Instance GPUs |
Bekir Turkkan et.al. |
2409.06646 |
null |
2024-09-10 |
EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis |
Danli Shi et.al. |
2409.06644 |
null |
2024-09-11 |
Segmenting sea ice floes in close-range optical imagery with active contour and foundation models |
Giulio Passerotti et.al. |
2409.06641 |
null |
2024-09-10 |
TeXBLEU: Automatic Metric for Evaluate LaTeX Format |
Kyudan Jung et.al. |
2409.06639 |
link |
2024-09-10 |
MoWE-Audio: Multitask AudioLLMs with Mixture of Weak Encoders |
Wenyu Zhang et.al. |
2409.06635 |
null |
2024-09-10 |
A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio |
Ningyuan Xi et.al. |
2409.06624 |
null |
2024-09-10 |
Exploring Italian sentence embeddings properties through multi-tasking |
Vivi Nastase et.al. |
2409.06622 |
link |
2024-09-10 |
Alleviating Hallucinations in Large Language Models with Scepticism Modeling |
Yetao Wu et.al. |
2409.06601 |
null |
2024-09-10 |
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering |
Sacha Muller et.al. |
2409.06595 |
link |
2024-09-10 |
Quantifying and Enabling the Interpretability of CLIP-like Models |
Avinash Madasu et.al. |
2409.06579 |
null |
2024-09-10 |
Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement |
Vivi Nastase et.al. |
2409.06567 |
null |
2024-09-10 |
MAPS: Energy-Reliability Tradeoff Management in Autonomous Vehicles Through LLMs Penetrated Science |
Mahdieh Aliazam et.al. |
2409.06558 |
null |
2024-09-10 |
Questioning Internal Knowledge Structure of Large Language Models Through the Lens of the Olympic Games |
Juhwan Choi et.al. |
2409.06518 |
link |
2024-09-10 |
Aligning Machine and Human Visual Representations across Abstraction Levels |
Lukas Muttenthaler et.al. |
2409.06509 |
null |
2024-09-10 |
Mitigating Hallucination in Visual-Language Models via Re-Balancing Contrastive Decoding |
Xiaoyu Liang et.al. |
2409.06485 |
null |
2024-09-10 |
Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles |
Qiujing Lu et.al. |
2409.06450 |
null |
2024-09-09 |
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct |
Run Luo et.al. |
2409.05840 |
null |
2024-09-09 |
Are Large Language Models a Threat to Programming Platforms? An Exploratory Study |
Md Mustakim Billah et.al. |
2409.05824 |
null |
2024-09-09 |
VFA: Vision Frequency Analysis of Foundation Models and Human |
Mohammad-Javad Darvishi-Bayazi et.al. |
2409.05817 |
null |
2024-09-09 |
Improving Pretraining Data Using Perplexity Correlations |
Tristan Thrush et.al. |
2409.05816 |
null |
2024-09-09 |
Benchmarking Chinese Knowledge Rectification in Large Language Models |
Tianhe Lu et.al. |
2409.05806 |
link |
2024-09-09 |
Evidence from fMRI Supports a Two-Phase Abstraction Process in Language Models |
Emily Cheng et.al. |
2409.05771 |
null |
2024-09-09 |
Model Input Verification of Large Scale Simulations |
Rumyana Neykova et.al. |
2409.05768 |
null |
2024-09-09 |
A Novel Idea Generation Tool using a Structured Conversational AI (CAI) System |
B. Sankar et.al. |
2409.05747 |
null |
2024-09-09 |
LLMs Will Always Hallucinate, and We Need to Live With This |
Sourav Banerjee et.al. |
2409.05746 |
null |
2024-09-09 |
A System and Benchmark for LLM-based Q&A on Heterogeneous Data |
Achille Fokoue et.al. |
2409.05735 |
null |
2024-09-09 |
Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach |
Meng Zhou et.al. |
2409.05732 |
null |
2024-09-09 |
The Influence of Task and Group Disparities over Users’ Attitudes Toward Using Large Language Models for Psychotherapy |
Qihang He et.al. |
2409.05703 |
null |
2024-09-09 |
Segmentation by Factorization: Unsupervised Semantic Segmentation for Pathology by Factorizing Foundation Model Features |
Jacob Gildenblat et.al. |
2409.05697 |
null |
2024-09-09 |
Zero-shot Outlier Detection via Prior-data Fitted Networks: Model Selection Bygone! |
Yuchen Shen et.al. |
2409.05672 |
null |
2024-09-09 |
Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case |
Vagrant Gautam et.al. |
2409.05653 |
link |
2024-09-10 |
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery |
Hongjin Qian et.al. |
2409.05591 |
link |
2024-09-09 |
Leveraging Content and Acoustic Representations for Efficient Speech Emotion Recognition |
Soumya Dutta et.al. |
2409.05566 |
null |
2024-09-09 |
CauseJudger: Identifying the Cause with LLMs for Abductive Logical Reasoning |
Jinwei He et.al. |
2409.05559 |
null |
2024-09-09 |
SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning |
Alireza Ghafarollahi et.al. |
2409.05556 |
link |
2024-09-09 |
Harmonic Reasoning in Large Language Models |
Anna Kruspe et.al. |
2409.05521 |
null |
2024-09-06 |
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation |
Yecheng Wu et.al. |
2409.04429 |
link |
2024-09-06 |
Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques |
Davide Clode da Silva et.al. |
2409.04424 |
null |
2024-09-06 |
RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs |
Jiaxing Wu et.al. |
2409.04421 |
null |
2024-09-06 |
Question-Answering Dense Video Events |
Hangyu Qin et.al. |
2409.04388 |
null |
2024-09-06 |
Learning vs Retrieval: The Role of In-Context Examples in Regression with LLMs |
Aliakbar Nafar et.al. |
2409.04318 |
link |
2024-09-06 |
An optically accelerated extreme learning machine using hot atomic vapors |
Pierre Azam et.al. |
2409.04312 |
null |
2024-09-06 |
Using Large Language Models to Generate Authentic Multi-agent Knowledge Work Datasets |
Desiree Heim et.al. |
2409.04286 |
null |
2024-09-06 |
Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models |
Yuxiao Huang et.al. |
2409.04270 |
null |
2024-09-06 |
An overview of domain-specific foundation model: key technologies, applications and challenges |
Haolong Chen et.al. |
2409.04267 |
null |
2024-09-06 |
UniDet3D: Multi-dataset Indoor 3D Object Detection |
Maksim Kolodiazhnyi et.al. |
2409.04234 |
link |
2024-09-06 |
Fast Forwarding Low-Rank Training |
Adir Rahamim et.al. |
2409.04206 |
null |
2024-09-06 |
Residual Stream Analysis with Multi-Layer SAEs |
Tim Lawson et.al. |
2409.04185 |
link |
2024-09-06 |
GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding |
Ziyin Zhang et.al. |
2409.04183 |
null |
2024-09-06 |
Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question Answering |
Larissa Pusch et.al. |
2409.04181 |
null |
2024-09-06 |
From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks |
Andreas Stephan et.al. |
2409.04168 |
null |
2024-09-06 |
Can OpenSource beat ChatGPT? – A Comparative Study of Large Language Models for Text-to-Code Generation |
Luis Mayer et.al. |
2409.04164 |
null |
2024-09-06 |
Prompt-based Personality Profiling: Reinforcement Learning for Relevance Filtering |
Jan Hofmann et.al. |
2409.04122 |
null |
2024-09-06 |
Multi-Programming Language Ensemble for Code Generation in Large Language Model |
Tengfei Xue et.al. |
2409.04114 |
link |
2024-09-06 |
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers |
Chenglei Si et.al. |
2409.04109 |
link |
2024-09-06 |
UI-JEPA: Towards Active Perception of User Intent through Onscreen User Activity |
Yicheng Fu et.al. |
2409.04081 |
null |
2024-09-05 |
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding |
Yunze Man et.al. |
2409.03757 |
link |
2024-09-05 |
Foundation Model or Finetune? Evaluation of few-shot semantic segmentation for river pollution |
Marga Don et.al. |
2409.03754 |
link |
2024-09-05 |
Attention Heads of Large Language Models: A Survey |
Zifan Zheng et.al. |
2409.03752 |
link |
2024-09-05 |
LLM-CI: Assessing Contextual Integrity Norms in Language Models |
Yan Shvartzshnaider et.al. |
2409.03735 |
null |
2024-09-05 |
Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry |
Meena Jagadeesan et.al. |
2409.03734 |
null |
2024-09-05 |
Planning In Natural Language Improves LLM Search For Code Generation |
Evan Wang et.al. |
2409.03733 |
link |
2024-09-06 |
RAG based Question-Answering for Contextual Response Prediction System |
Sriram Veturi et.al. |
2409.03708 |
null |
2024-09-05 |
LAST: Language Model Aware Speech Tokenization |
Arnon Turetzky et.al. |
2409.03701 |
null |
2024-09-05 |
TRACE-cs: Trustworthy Reasoning for Contrastive Explanations in Course Scheduling Problems |
Stylianos Loukas Vasileiou et.al. |
2409.03671 |
null |
2024-09-05 |
A Fused Large Language Model for Predicting Startup Success |
Abdurahman Maarouf et.al. |
2409.03668 |
null |
2024-09-05 |
The representation landscape of few-shot learning and fine-tuning in large language models |
Diego Doimo et.al. |
2409.03662 |
link |
2024-09-06 |
LLM-based multi-agent poetry generation in non-cooperative environments |
Ran Zhang et.al. |
2409.03659 |
link |
2024-09-05 |
On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization |
Yong Lin et.al. |
2409.03650 |
null |
2024-09-05 |
Text-Guided Mixup Towards Long-Tailed Image Categorization |
Richard Franklin et.al. |
2409.03583 |
link |
2024-09-05 |
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation |
Xi Chen et.al. |
2409.03525 |
null |
2024-09-05 |
Have Large Vision-Language Models Mastered Art History? |
Ombretta Strafforello et.al. |
2409.03521 |
null |
2024-09-05 |
Tissue Concepts: supervised foundation models in computational pathology |
Till Nicke et.al. |
2409.03519 |
link |
2024-09-05 |
From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents |
Jifan Yu et.al. |
2409.03512 |
null |
2024-09-05 |
LLM-based event abstraction and integration for IoT-sourced logs |
Mohsen Shirali et.al. |
2409.03478 |
link |
2024-09-05 |
How Much Data is Enough Data? Fine-Tuning Large Language Models for In-House Translation: Performance Evaluation Across Multiple Dataset Sizes |
Inacio Vieira et.al. |
2409.03454 |
null |
2024-09-04 |
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins (early version) |
Yao Mu et.al. |
2409.02920 |
null |
2024-09-04 |
Can LVLMs Obtain a Driver’s License? A Benchmark Towards Reliable AGI for Autonomous Driving |
Yuhang Lu et.al. |
2409.02914 |
null |
2024-09-04 |
Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling |
Kaiwen Zheng et.al. |
2409.02908 |
null |
2024-09-05 |
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA |
Jiajie Zhang et.al. |
2409.02897 |
link |
2024-09-04 |
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture |
Xidong Wang et.al. |
2409.02889 |
link |
2024-09-04 |
CanvOI, an Oncology Intelligence Foundation Model: Scaling FLOPS Differently |
Jonathan Zalach et.al. |
2409.02885 |
null |
2024-09-04 |
Benchmarking Spurious Bias in Few-Shot Image Classifiers |
Guangtao Zheng et.al. |
2409.02882 |
link |
2024-09-04 |
Configurable Foundation Models: Building LLMs from a Modular Perspective |
Chaojun Xiao et.al. |
2409.02877 |
null |
2024-09-04 |
Historical German Text Normalization Using Type- and Token-Based Language Modeling |
Anton Ehrmanntraut et.al. |
2409.02841 |
null |
2024-09-04 |
Exploring Sentiment Dynamics and Predictive Behaviors in Cryptocurrency Discussions by Few-Shot Learning with Large Language Models |
Moein Shahiki Tash et.al. |
2409.02836 |
null |
2024-09-04 |
CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models |
Wentao Liu et.al. |
2409.02834 |
link |
2024-09-04 |
ExpLLM: Towards Chain of Thought for Facial Expression Recognition |
Xing Lan et.al. |
2409.02828 |
null |
2024-09-04 |
Design Contradictions: Help or Hindrance? |
Aron E. Owen et.al. |
2409.02823 |
null |
2024-09-04 |
Language Understanding as a Constraint on Consensus Size in LLM Societies |
Giordano De Marzo et.al. |
2409.02822 |
null |
2024-09-04 |
Towards a Unified View of Preference Learning for Large Language Models: A Survey |
Bofei Gao et.al. |
2409.02795 |
link |
2024-09-05 |
Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models? |
Yixuan Tang et.al. |
2409.02727 |
link |
2024-09-04 |
Pre-training data selection for biomedical domain adaptation using journal impact metrics |
Mathieu Laï-king et.al. |
2409.02725 |
null |
2024-09-04 |
Alignment-Aware Model Extraction Attacks on Large Language Models |
Zi Liang et.al. |
2409.02718 |
link |
2024-09-04 |
Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL |
Mohammad Reshadati et.al. |
2409.02711 |
null |
2024-09-04 |
LLM-Assisted Visual Analytics: Opportunities and Challenges |
Maeve Hutchinson et.al. |
2409.02691 |
null |
2024-08-30 |
SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists |
Raoyuan Zhao et.al. |
2408.17437 |
link |
2024-08-30 |
DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model |
Mona Sheikh Zeinoddin et.al. |
2408.17433 |
link |
2024-08-30 |
Advancing Multi-talker ASR Performance with Large Language Models |
Mohan Shi et.al. |
2408.17431 |
null |
2024-08-30 |
CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models |
Jonathan Bourne et.al. |
2408.17428 |
null |
2024-09-03 |
Open-vocabulary Temporal Action Localization using VLMs |
Naoki Wake et.al. |
2408.17422 |
null |
2024-08-30 |
Getting Inspiration for Feature Elicitation: App Store- vs. LLM-based Approach |
Jialiang Wei et.al. |
2408.17404 |
link |
2024-08-30 |
EMPOWER: Embodied Multi-role Open-vocabulary Planning with Online Grounding and Execution |
Francesco Argenziano et.al. |
2408.17379 |
null |
2024-08-30 |
NDP: Next Distribution Prediction as a More Broad Target |
Junhao Ruan et.al. |
2408.17377 |
null |
2024-08-30 |
Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain |
Francesca Grasso et.al. |
2408.17362 |
link |
2024-08-30 |
Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage |
Md Rafi Ur Rashid et.al. |
2408.17354 |
null |
2024-09-02 |
LSMS: Language-guided Scale-aware MedSegmentor for Medical Image Referring Segmentation |
Shuyi Ouyang et.al. |
2408.17347 |
null |
2024-08-30 |
Investigating Neuron Ablation in Attention Heads: The Case for Peak Activation Centering |
Nicholas Pochinkov et.al. |
2408.17322 |
link |
2024-08-30 |
Bridging Domain Knowledge and Process Discovery Using Large Language Models |
Ali Norouzifar et.al. |
2408.17316 |
link |
2024-08-30 |
Flexible and Effective Mixing of Large Language Models into a Mixture of Domain Experts |
Rhui Dih Lee et.al. |
2408.17280 |
null |
2024-08-30 |
Joint Estimation and Prediction of City-wide Delivery Demand: A Large Language Model Empowered Graph-based Learning Approach |
Tong Nie et.al. |
2408.17258 |
null |
2024-08-30 |
VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters |
Mouxiang Chen et.al. |
2408.17253 |
link |
2024-08-30 |
Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study |
Shubham Agarwal et.al. |
2408.17181 |
null |
2024-08-30 |
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model |
Zhen Ye et.al. |
2408.17175 |
link |
2024-08-30 |
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning |
Xiaoye Qu et.al. |
2408.17150 |
link |
2024-08-30 |
Reasoning AI Performance Degradation in 6G Networks with Large Language Models |
Liming Huang et.al. |
2408.17097 |
null |
2024-08-29 |
PromptSmooth: Certifying Robustness of Medical Vision-Language Models via Prompt Learning |
Noor Hussein et.al. |
2408.16769 |
link |
2024-08-29 |
How Far Can Cantonese NLP Go? Benchmarking Cantonese Capabilities of Large Language Models |
Jiyue Jiang et.al. |
2408.16756 |
link |
2024-08-29 |
Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models |
Alec Solway et.al. |
2408.16753 |
null |
2024-08-29 |
A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models |
Yi-Lin Tuan et.al. |
2408.16751 |
null |
2024-08-29 |
Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge |
Beidi Dong et.al. |
2408.16749 |
null |
2024-08-29 |
Theoretical and Methodological Framework for Studying Texts Produced by Large Language Models |
Jiří Milička et.al. |
2408.16740 |
null |
2024-08-29 |
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling |
Hritik Bansal et.al. |
2408.16737 |
null |
2024-08-29 |
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation |
Shiwei Wu et.al. |
2408.16730 |
null |
2024-08-30 |
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming |
Zhifei Xie et.al. |
2408.16725 |
link |
2024-08-29 |
GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models |
Moreno D’Incà et.al. |
2408.16700 |
link |
2024-08-29 |
Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity |
Ziniu Li et.al. |
2408.16673 |
null |
2024-08-29 |
Space3D-Bench: Spatial 3D Question Answering Benchmark |
Emilia Szymanska et.al. |
2408.16662 |
null |
2024-08-29 |
DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving |
Yongjie Fu et.al. |
2408.16647 |
null |
2024-08-29 |
Examination of Code generated by Large Language Models |
Robin Beer et.al. |
2408.16601 |
link |
2024-08-29 |
Enhancing Dialogue Generation in Werewolf Game Through Situation Analysis and Persuasion Strategies |
Zhiyang Qi et.al. |
2408.16586 |
null |
2024-08-29 |
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling |
Shengpeng Ji et.al. |
2408.16532 |
link |
2024-08-29 |
CNIMA: A Universal Evaluation Framework and Automated Approach for Assessing Second Language Dialogues |
Rena Gao et.al. |
2408.16518 |
link |
2024-08-29 |
LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs? |
Jan Cegin et.al. |
2408.16502 |
null |
2024-08-29 |
CogVLM2: Visual Language Models for Image and Video Understanding |
Wenyi Hong et.al. |
2408.16500 |
link |
2024-08-29 |
A Survey on Evaluating Large Language Models in Code Generation Tasks |
Liguo Chen et.al. |
2408.16498 |
null |
2024-08-28 |
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders |
Min Shi et.al. |
2408.15998 |
link |
2024-08-29 |
Spatio-Temporal Context Prompting for Zero-Shot Action Detection |
Wei-Jhe Huang et.al. |
2408.15996 |
null |
2024-08-28 |
Perceive-IR: Learning to Perceive Degradation Better for All-in-One Image Restoration |
Xu Zhang et.al. |
2408.15994 |
null |
2024-08-28 |
BattleAgentBench: A Benchmark for Evaluating Cooperation and Competition Capabilities of Language Models in Multi-Agent Systems |
Wei Wang et.al. |
2408.15971 |
null |
2024-08-28 |
More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding |
Yuan Tang et.al. |
2408.15966 |
link |
2024-08-28 |
Atari-GPT: Investigating the Capabilities of Multimodal Large Language Models as Low-Level Policies for Atari Games |
Nicholas R. Waytowich et.al. |
2408.15950 |
null |
2024-08-28 |
DeMoBot: Deformable Mobile Manipulation with Vision-based Sub-goal Retrieval |
Yuying Zhang et.al. |
2408.15919 |
null |
2024-08-28 |
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models |
Yuncheng Yang et.al. |
2408.15915 |
link |
2024-08-28 |
Decentralized LLM Inference over Edge Networks with Energy Harvesting |
Aria Khoshsirat et.al. |
2408.15907 |
null |
2024-08-28 |
LLM-Based Multi-Hop Question Answering with Knowledge Graph Integration in Evolving Environments |
Ruirui Chen et.al. |
2408.15903 |
null |
2024-08-28 |
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts |
Nikolas Gritsch et.al. |
2408.15901 |
null |
2024-08-28 |
Bias in LLMs as Annotators: The Effect of Party Cues on Labelling Decision by Large Language Models |
Sebastian Vallejo Vera et.al. |
2408.15895 |
null |
2024-08-28 |
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation |
Fangxun Shu et.al. |
2408.15881 |
link |
2024-08-28 |
Persuasion Games using Large Language Models |
Ganesh Prasath Ramani et.al. |
2408.15879 |
null |
2024-08-28 |
Retrieval-Augmented Instruction Tuning for Automated Process Engineering Calculations: A Tool-Chaining Problem-Solving Framework with Attributable Reflection |
Sagar Srinivas Sakhinana et.al. |
2408.15866 |
null |
2024-08-28 |
Benchmarking foundation models as feature extractors for weakly-supervised computational pathology |
Peter Neidlinger et.al. |
2408.15823 |
null |
2024-08-28 |
Visual Prompt Engineering for Medical Vision Language Models in Radiology |
Stefan Denner et.al. |
2408.15802 |
null |
2024-08-28 |
Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization |
Léo Hemamou et.al. |
2408.15801 |
null |
2024-08-28 |
Evaluating Named Entity Recognition Using Few-Shot Prompting with Large Language Models |
Hédi Zhegidi et.al. |
2408.15796 |
link |
2024-08-28 |
Efficient LLM Scheduling by Learning to Rank |
Yichao Fu et.al. |
2408.15792 |
link |
2024-08-27 |
Generative Verifiers: Reward Modeling as Next-Token Prediction |
Lunjun Zhang et.al. |
2408.15240 |
null |
2024-08-27 |
The Mamba in the Llama: Distilling and Accelerating Hybrid Models |
Junxiong Wang et.al. |
2408.15237 |
link |
2024-08-27 |
Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations |
Yucheng Jiang et.al. |
2408.15232 |
null |
2024-08-27 |
LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet |
Nathaniel Li et.al. |
2408.15221 |
null |
2024-08-27 |
Investigating Coverage Criteria in Large Language Models: An In-Depth Study Through Jailbreak Attacks |
Shide Zhou et.al. |
2408.15207 |
null |
2024-08-27 |
Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation |
Jian Hu et.al. |
2408.15205 |
link |
2024-08-27 |
Can Unconfident LLM Annotations Be Used for Confident Conclusions? |
Kristina Gligorić et.al. |
2408.15204 |
link |
2024-08-27 |
Infusing Acoustic Pause Context into Text-Based Dementia Assessment |
Franziska Braun et.al. |
2408.15188 |
null |
2024-08-27 |
Unlocking Potential in Pre-Trained Music Language Models for Versatile Multi-Track Music Arrangement |
Longshen Ou et.al. |
2408.15176 |
null |
2024-08-27 |
X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation |
Hanjia Lyu et.al. |
2408.15172 |
null |
2024-08-27 |
Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation |
N. E. Kriman et.al. |
2408.15171 |
null |
2024-08-27 |
How transformers learn structured data: insights from hierarchical filtering |
Jerome Garnier-Brun et.al. |
2408.15138 |
null |
2024-08-27 |
CLIP-AGIQA: Boosting the Performance of AI-Generated Image Quality Assessment with CLIP |
Zhenchen Tang et.al. |
2408.15098 |
null |
2024-08-27 |
Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models |
Xiyu Liu et.al. |
2408.15091 |
null |
2024-08-27 |
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline |
Guosheng Dong et.al. |
2408.15079 |
null |
2024-08-27 |
Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models |
Ned Cooper et.al. |
2408.15066 |
null |
2024-08-27 |
The Benefits of Balance: From Information Projections to Variance Reduction |
Lang Liu et.al. |
2408.15065 |
null |
2024-08-28 |
DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding |
Wenhui Liao et.al. |
2408.15045 |
null |
2024-08-28 |
A Survey of Large Language Models for European Languages |
Wazir Ali et.al. |
2408.15040 |
null |
2024-08-27 |
Speech Recognition Transformers: Topological-lingualism Perspective |
Shruti Singh et.al. |
2408.14991 |
null |
2024-08-26 |
A Practitioner’s Guide to Continual Multimodal Pretraining |
Karsten Roth et.al. |
2408.14471 |
link |
2024-08-27 |
Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models |
Aradhye Agarwal et.al. |
2408.14470 |
link |
2024-08-26 |
Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos |
Qirui Chen et.al. |
2408.14469 |
null |
2024-08-26 |
Explicit Inductive Inference using Large Language Models |
Tianyang Liu et.al. |
2408.14467 |
null |
2024-08-26 |
Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study |
Liuchang Xu et.al. |
2408.14438 |
null |
2024-08-26 |
Social perception of faces in a vision-language model |
Carina I. Hausladen et.al. |
2408.14435 |
link |
2024-08-26 |
CHARTOM: A Visual Theory-of-Mind Benchmark for Multimodal Large Language Models |
Shubham Bharti et.al. |
2408.14419 |
null |
2024-08-26 |
MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues |
Kuluhan Binici et.al. |
2408.14418 |
null |
2024-08-26 |
Hyperdimensional Computing Empowered Federated Foundation Model over Wireless Networks for Metaverse |
Yahao Ding et.al. |
2408.14416 |
null |
2024-08-26 |
Language-specific Calibration for Pruning Multilingual Language Models |
Simon Kurz et.al. |
2408.14398 |
null |
2024-08-26 |
Reprogramming Foundational Large Language Models (LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning |
Sakhinana Sagar Srinivas et.al. |
2408.14387 |
null |
2024-08-26 |
Probing Causality Manipulation of Large Language Models |
Chenyang Zhang et.al. |
2408.14380 |
link |
2024-08-26 |
An Embedding is Worth a Thousand Noisy Labels |
Francesco Di Salvo et.al. |
2408.14358 |
link |
2024-08-26 |
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java |
Daoguang Zan et.al. |
2408.14354 |
link |
2024-08-26 |
Assessing Contamination in Large Language Models: Introducing the LogProber method |
Nicolas Yax et.al. |
2408.14352 |
null |
2024-08-26 |
Foundation Models for Music: A Survey |
Yinghao Ma et.al. |
2408.14340 |
link |
2024-08-26 |
Claim Verification in the Age of Large Language Models: A Survey |
Alphaeus Dmonte et.al. |
2408.14317 |
null |
2024-08-26 |
LLM-3D Print: Large Language Models To Monitor and Control 3D Printing |
Yayati Jadhav et.al. |
2408.14307 |
null |
2024-08-26 |
Investigating the Effectiveness of Bayesian Spam Filters in Detecting LLM-modified Spam Mails |
Malte Josten et.al. |
2408.14293 |
link |
2024-08-26 |
Predictability and Causality in Spanish and English Natural Language Generation |
Andrea Busto-Castiñeira et.al. |
2408.14283 |
null |
2024-08-23 |
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? |
Yi-Fan Zhang et.al. |
2408.13257 |
null |
2024-08-23 |
Domain-specific long text classification from sparse relevant information |
Célia D’Cruz et.al. |
2408.13253 |
null |
2024-08-23 |
Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption |
Sakhinana Sagar Srinivas et.al. |
2408.13248 |
null |
2024-08-23 |
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time |
Yingyu Liang et.al. |
2408.13233 |
null |
2024-08-23 |
EUR-USD Exchange Rate Forecasting Based on Information Fusion with Large Language Models and Deep Learning Methods |
Hongcheng Ding et.al. |
2408.13214 |
null |
2024-08-23 |
DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation |
Qiming Zhu et.al. |
2408.13204 |
null |
2024-08-23 |
Can LLM be a Good Path Planner based on Prompt Engineering? Mitigating the Hallucination for Path Planning |
Hourui Deng et.al. |
2408.13184 |
null |
2024-08-23 |
IntelliCare: Improving Healthcare Analysis with Variance-Controlled Patient-Level Knowledge from Large Language Models |
Zhihao Yu et.al. |
2408.13073 |
link |
2024-08-23 |
Guiding IoT-Based Healthcare Alert Systems with Large Language Models |
Yulan Gao et.al. |
2408.13071 |
null |
2024-08-23 |
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks |
Kai-Wei Chang et.al. |
2408.13040 |
null |
2024-08-23 |
VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models |
Wentao Wu et.al. |
2408.13031 |
link |
2024-08-23 |
In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting |
Haowei Du et.al. |
2408.13028 |
null |
2024-08-23 |
A Web-Based Solution for Federated Learning with LLM-Based Automation |
Chamith Mawela et.al. |
2408.13010 |
null |
2024-08-23 |
Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates |
Hui Wei et.al. |
2408.13006 |
link |
2024-08-23 |
CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution |
Ruiyang Xu et.al. |
2408.13001 |
null |
2024-08-23 |
Open Llama2 Model for the Lithuanian Language |
Artūras Nakvosas et.al. |
2408.12963 |
null |
2024-08-23 |
Multimodal Contrastive In-Context Learning |
Yosuke Miyanishi et.al. |
2408.12959 |
null |
2024-08-23 |
Image Segmentation in Foundation Model Era: A Survey |
Tianfei Zhou et.al. |
2408.12957 |
link |
2024-08-23 |
E-code: Mastering Efficient Code Generation through Pretrained Models and Expert Encoder Group |
Yue Pan et.al. |
2408.12948 |
null |
2024-08-23 |
Causal-Guided Active Learning for Debiasing Large Language Models |
Zhouhao Sun et.al. |
2408.12942 |
link |
2024-08-22 |
Controllable Text Generation for Large Language Models: A Survey |
Xun Liang et.al. |
2408.12599 |
link |
2024-08-23 |
Non-Homophilic Graph Pre-Training and Prompt Learning |
Xingtong Yu et.al. |
2408.12594 |
null |
2024-08-22 |
RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment |
Xiaohan Wang et.al. |
2408.12579 |
null |
2024-08-22 |
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind |
Haojun Shi et.al. |
2408.12574 |
link |
2024-08-22 |
Jamba-1.5: Hybrid Transformer-Mamba Models at Scale |
Jamba Team et.al. |
2408.12570 |
null |
2024-08-22 |
ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation |
Lujia Zhong et.al. |
2408.12561 |
link |
2024-08-22 |
Towards Evaluating and Building Versatile Large Language Models for Medicine |
Chaoyi Wu et.al. |
2408.12547 |
link |
2024-08-22 |
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation |
Jinheng Xie et.al. |
2408.12528 |
null |
2024-08-22 |
MEDCO: Medical Education Copilots Based on A Multi-Agent Framework |
Hao Wei et.al. |
2408.12496 |
null |
2024-08-22 |
GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models |
Kunsheng Tang et.al. |
2408.12494 |
link |
2024-08-23 |
Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese |
Khang T. Doan et.al. |
2408.12480 |
null |
2024-08-22 |
Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition |
Bozheng Li et.al. |
2408.12475 |
null |
2024-08-22 |
DLCRec: A Novel Approach for Managing Diversity in LLM-Based Recommender Systems |
Jiaju Chen et.al. |
2408.12470 |
null |
2024-08-22 |
Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning |
Mushui Liu et.al. |
2408.12469 |
null |
2024-08-22 |
Enhancing Multi-hop Reasoning through Knowledge Erasure in Large Language Model Editing |
Mengqi Zhang et.al. |
2408.12456 |
null |
2024-08-22 |
Positional Description for Numerical Normalization |
Deepanshu Gupta et.al. |
2408.12430 |
null |
2024-08-22 |
FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing |
Jue Wang et.al. |
2408.12429 |
link |
2024-08-22 |
Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification |
Sudi Murindanyi et.al. |
2408.12426 |
null |
2024-08-22 |
Unlearning Trojans in Large Language Models: A Comparison Between Natural Language and Source Code |
Mahdi Kazemi et.al. |
2408.12416 |
null |
2024-08-22 |
Generalized SAM: Efficient Fine-Tuning of SAM for Variable Input Image Sizes |
Sota Kato et.al. |
2408.12406 |
link |
2024-08-21 |
Great Memory, Shallow Reasoning: Limits of $k$NN-LMs |
Shangyi Geng et.al. |
2408.11815 |
link |
2024-08-21 |
SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs |
Yuanyang Yin et.al. |
2408.11813 |
null |
2024-08-21 |
EmbodiedSAM: Online Segment Any 3D Thing in Real Time |
Xiuwei Xu et.al. |
2408.11811 |
null |
2024-08-21 |
Approaching Deep Learning through the Spectral Dynamics of Weights |
David Yunis et.al. |
2408.11804 |
link |
2024-08-21 |
Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models |
Yuzhou Huang et.al. |
2408.11801 |
null |
2024-08-21 |
PermitQA: A Benchmark for Retrieval Augmented Generation in Wind Siting and Permitting domain |
Rounak Meyur et.al. |
2408.11800 |
null |
2024-08-21 |
Practical token pruning for foundation models in few-shot conversational virtual assistant systems |
Haode Qi et.al. |
2408.11799 |
null |
2024-08-21 |
EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model |
Feipeng Ma et.al. |
2408.11795 |
null |
2024-08-21 |
Leveraging Chemistry Foundation Models to Facilitate Structure Focused Retrieval Augmented Generation in Multi-Agent Workflows for Catalyst and Materials Design |
Nathaniel H. Park et.al. |
2408.11793 |
null |
2024-08-21 |
Critique-out-Loud Reward Models |
Zachary Ankner et.al. |
2408.11791 |
link |
2024-08-21 |
DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework |
Zhifei Xie et.al. |
2408.11788 |
null |
2024-08-21 |
Personality Alignment of Large Language Models |
Minjun Zhu et.al. |
2408.11779 |
link |
2024-08-21 |
Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards |
Omar Erak et.al. |
2408.11775 |
link |
2024-08-21 |
Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks |
Yiyi Chen et.al. |
2408.11749 |
link |
2024-08-21 |
DH-Bench: Probing Depth and Height Perception of Large Visual-Language Models |
Shehreen Azad et.al. |
2408.11748 |
link |
2024-08-21 |
Open-Ended 3D Point Cloud Instance Segmentation |
Phuc D. A. Nguyen et.al. |
2408.11747 |
null |
2024-08-21 |
Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for Transformer Pretraining |
Pihe Hu et.al. |
2408.11746 |
null |
2024-08-21 |
FocusLLM: Scaling LLM’s Context by Parallel Decoding |
Zhenyu Li et.al. |
2408.11745 |
null |
2024-08-21 |
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models |
Elias Frantar et.al. |
2408.11743 |
link |
2024-08-21 |
CluMo: Cluster-based Modality Fusion Prompt for Continual Learning in Visual Question Answering |
Yuliang Cai et.al. |
2408.11742 |
link |
2024-08-20 |
Prompt-Guided Image-Adaptive Neural Implicit Lookup Tables for Interpretable Image Enhancement |
Satoshi Kosugi et.al. |
2408.11055 |
link |
2024-08-20 |
Revisiting VerilogEval: Newer LLMs, In-Context Learning, and Specification-to-RTL Tasks |
Nathaniel Pinckney et.al. |
2408.11053 |
link |
2024-08-20 |
FLAME: Learning to Navigate with Multimodal LLM in Urban Environments |
Yunzhe Xu et.al. |
2408.11051 |
link |
2024-08-20 |
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding |
Jian Chen et.al. |
2408.11049 |
link |
2024-08-20 |
Inside the Black Box: Detecting Data Leakage in Pre-trained Language Encoders |
Yuan Xin et.al. |
2408.11046 |
null |
2024-08-20 |
Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research |
Sreyoshi Bhaduri et.al. |
2408.11043 |
null |
2024-08-20 |
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model |
Chunting Zhou et.al. |
2408.11039 |
null |
2024-08-20 |
Scaling Law with Learning Rate Annealing |
Howe Tissue et.al. |
2408.11029 |
null |
2024-08-20 |
Athena: Safe Autonomous Agents with Verbal Contrastive Learning |
Tanmana Sadhu et.al. |
2408.11021 |
null |
2024-08-20 |
While GitHub Copilot Excels at Coding, Does It Ensure Responsible Output? |
Wen Cheng et.al. |
2408.11006 |
link |
2024-08-20 |
SenPa-MAE: Sensor Parameter Aware Masked Autoencoder for Multi-Satellite Self-Supervised Pretraining |
Jonathan Prexl et.al. |
2408.11000 |
link |
2024-08-20 |
CTP-LLM: Clinical Trial Phase Transition Prediction Using Large Language Models |
Michael Reinisch et.al. |
2408.10995 |
null |
2024-08-20 |
Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models |
Yuyan Chen et.al. |
2408.10947 |
null |
2024-08-20 |
Large Language Model Driven Recommendation |
Anton Korikov et.al. |
2408.10946 |
null |
2024-08-20 |
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments |
Kazi Hasan Ibn Arif et.al. |
2408.10945 |
link |
2024-08-20 |
SysBench: Can Large Language Models Follow System Messages? |
Yanzhao Qin et.al. |
2408.10943 |
link |
2024-08-20 |
Proxona: Leveraging LLM-Driven Personas to Enhance Creators’ Understanding of Their Audience |
Yoonseo Choi et.al. |
2408.10937 |
null |
2024-08-20 |
LBC: Language-Based-Classifier for Out-Of-Variable Generalization |
Kangjun Noh et.al. |
2408.10923 |
link |
2024-08-21 |
BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model |
Yeyong Yu et.al. |
2408.10903 |
link |
2024-08-20 |
Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs |
John Mendonça et.al. |
2408.10902 |
link |
2024-08-19 |
SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP |
Yusuke Hirota et.al. |
2408.10202 |
null |
2024-08-19 |
Demystifying the Communication Characteristics for Distributed Transformer Models |
Quentin Anthony et.al. |
2408.10197 |
null |
2024-08-19 |
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models |
Aviv Bick et.al. |
2408.10189 |
null |
2024-08-19 |
LongVILA: Scaling Long-Context Visual Language Models for Long Videos |
Fuzhao Xue et.al. |
2408.10188 |
link |
2024-08-19 |
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models |
Anke Tang et.al. |
2408.10174 |
link |
2024-08-19 |
Customizing Language Models with Instance-wise LoRA for Sequential Recommendation |
Xiaoyu Kong et.al. |
2408.10159 |
link |
2024-08-19 |
Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models |
Amey Hengle et.al. |
2408.10151 |
link |
2024-08-19 |
In-Context Learning with Representations: Contextual Generalization of Trained Transformers |
Tong Yang et.al. |
2408.10147 |
null |
2024-08-19 |
Instruction Finetuning for Leaderboard Generation from Empirical AI Research |
Salomon Kabongo et.al. |
2408.10141 |
null |
2024-08-19 |
Rhyme-aware Chinese lyric generator based on GPT |
Yixiao Yuan et.al. |
2408.10130 |
null |
2024-08-19 |
Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track |
Feiyu Pan et.al. |
2408.10125 |
null |
2024-08-19 |
Molecular Graph Representation Learning Integrating Large Language Models with Domain-specific Small Models |
Tianyu Zhang et.al. |
2408.10124 |
link |
2024-08-19 |
Geometry Informed Tokenization of Molecules for Language Model Generation |
Xiner Li et.al. |
2408.10120 |
null |
2024-08-19 |
GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization |
Ran Liu et.al. |
2408.10115 |
link |
2024-08-20 |
PLUTUS: A Well Pre-trained Large Unified Transformer can Unveil Financial Time Series Regularities |
Yuanjian Xu et.al. |
2408.10111 |
null |
2024-08-19 |
ARMADA: Attribute-Based Multimodal Data Augmentation |
Xiaomeng Jin et.al. |
2408.10086 |
null |
2024-08-19 |
Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning |
Sriyash Poddar et.al. |
2408.10075 |
null |
2024-08-19 |
FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant |
Zhengchao Huang et.al. |
2408.10072 |
link |
2024-08-19 |
Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory |
Haoran Li et.al. |
2408.10053 |
null |
2024-08-19 |
Defense Priorities in the Open-Source AI Debate: A Preliminary Assessment |
Masao Dahlgren et.al. |
2408.10026 |
null |
2024-08-16 |
SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation |
Xinyu Xiong et.al. |
2408.08870 |
link |
2024-08-16 |
PEDAL: Enhancing Greedy Decoding with Large Language Models using Diverse Exemplars |
Sumanth Prabhu et.al. |
2408.08869 |
null |
2024-08-16 |
A Hassle-free Algorithm for Private Learning in Practice: Don’t Use Tree Aggregation, Use BLTs |
H. Brendan McMahan et.al. |
2408.08868 |
null |
2024-08-16 |
Visual Agents as Fast and Slow Thinkers |
Guangyan Sun et.al. |
2408.08862 |
link |
2024-08-16 |
DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models |
Eman Ali et.al. |
2408.08855 |
null |
2024-08-16 |
GeoTransformer: Enhancing Urban Forecasting with Geospatial Attention Mechanisms |
Yuhao Jia et.al. |
2408.08852 |
null |
2024-08-16 |
ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis |
Yubao Zhao et.al. |
2408.08849 |
link |
2024-08-16 |
PsychoLex: Unveiling the Psychological Mind of Large Language Models |
Mohammad Amin Abbasi et.al. |
2408.08848 |
null |
2024-08-16 |
FLEXTAF: Enhancing Table Reasoning with Flexible Tabular Formats |
Xuanliang Zhang et.al. |
2408.08841 |
link |
2024-08-16 |
EasyRec: Simple yet Effective Language Models for Recommendation |
Xubin Ren et.al. |
2408.08821 |
link |
2024-08-16 |
Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models |
Lin Zhao et.al. |
2408.08813 |
null |
2024-08-16 |
Artificial Intelligence and Strategic Decision-Making: Evidence from Entrepreneurs and Investors |
Felipe A. Csaszar et.al. |
2408.08811 |
null |
2024-08-16 |
Constructing Domain-Specific Evaluation Sets for LLM-as-a-judge |
Ravi Raju et.al. |
2408.08808 |
null |
2024-08-16 |
CIKMar: A Dual-Encoder Approach to Prompt-Based Reranking in Educational Dialogue Systems |
Joanito Agili Lopo et.al. |
2408.08805 |
null |
2024-08-16 |
A Disease-Specific Foundation Model Using Over 100K Fundus Images: Release and Validation for Abnormality and Multi-Disease Classification on Downstream Tasks |
Boa Jang et.al. |
2408.08790 |
link |
2024-08-16 |
EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics |
Chenwei Wan et.al. |
2408.08782 |
link |
2024-08-16 |
Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions |
Chenming Tang et.al. |
2408.08780 |
null |
2024-08-16 |
DAC: Decomposed Automation Correction for Text-to-SQL |
Dingzirui Wang et.al. |
2408.08779 |
link |
2024-08-16 |
Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused |
Dingwei Chen et.al. |
2408.08769 |
null |
2024-08-16 |
Rethinking Generative Semantic Communication for Multi-User Systems with Multi-Modal LLM |
Wanting Yang et.al. |
2408.08765 |
null |
2024-08-15 |
Can Large Language Models Understand Symbolic Graphics Programs? |
Zeju Qiu et.al. |
2408.08313 |
null |
2024-08-15 |
ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws |
Ruihang Li et.al. |
2408.08310 |
null |
2024-08-15 |
Towards Flexible Visual Relationship Segmentation |
Fangrui Zhu et.al. |
2408.08305 |
null |
2024-08-15 |
Benchmarking the Capabilities of Large Language Models in Transportation System Engineering: Accuracy, Consistency, and Reasoning Behaviors |
Usman Syed et.al. |
2408.08302 |
null |
2024-08-15 |
VLPG-Nav: Object Navigation Using Visual Language Pose Graph and Object Localization Probability Maps |
Senthil Hariharan Arul et.al. |
2408.08301 |
null |
2024-08-15 |
HELP: Hierarchical Embeddings-based Log Parsing |
Andy Xu et.al. |
2408.08300 |
null |
2024-08-15 |
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community |
Shachar Don-Yehiya et.al. |
2408.08291 |
null |
2024-08-15 |
Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model |
Jin Wang et.al. |
2408.08282 |
null |
2024-08-15 |
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts |
Qizhen Zhang et.al. |
2408.08274 |
null |
2024-08-15 |
DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System |
Xihong Yang et.al. |
2408.08231 |
null |
2024-08-15 |
RED-CT: A Systems Design Methodology for Using LLM-labeled Data to Train and Deploy Edge Classifiers for Computational Social Science |
David Farr et.al. |
2408.08217 |
null |
2024-08-15 |
Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models |
Javier González et.al. |
2408.08210 |
null |
2024-08-15 |
LLM4DSR: Leveraging Large Language Model for Denoising Sequential Recommendation |
Bohao Wang et.al. |
2408.08208 |
null |
2024-08-15 |
Heavy Labels Out! Dataset Distillation with Label Space Lightening |
Ruonan Yu et.al. |
2408.08201 |
null |
2024-08-15 |
Scaling Up Natural Language Understanding for Multi-Robots Through the Lens of Hierarchy |
Shaojun Xu et.al. |
2408.08188 |
null |
2024-08-15 |
General-purpose Clothes Manipulation with Semantic Keypoints |
Yuhong Deng et.al. |
2408.08160 |
null |
2024-08-15 |
EmBARDiment: an Embodied AI Agent for Productivity in XR |
Riccardo Bovo et.al. |
2408.08158 |
null |
2024-08-15 |
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search |
Huajian Xin et.al. |
2408.08152 |
link |
2024-08-15 |
P/D-Serve: Serving Disaggregated Large Language Model at Scale |
Yibo Jin et.al. |
2408.08147 |
null |
2024-08-15 |
KOALA: Enhancing Speculative Decoding for LLM via Multi-Layer Draft Heads with Adversarial Learning |
Kaiqi Zhang et.al. |
2408.08146 |
null |
2024-08-14 |
The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models |
Karime Maamari et.al. |
2408.07702 |
null |
2024-08-15 |
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities |
Enneng Yang et.al. |
2408.07666 |
link |
2024-08-14 |
Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models |
Yi-Cheng Lin et.al. |
2408.07665 |
link |
2024-08-14 |
Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions |
Quan Liu et.al. |
2408.07663 |
link |
2024-08-14 |
WeKnow-RAG: An Adaptive Approach for Retrieval-Augmented Generation Integrating Web Search and Knowledge Graphs |
Weijian Xie et.al. |
2408.07611 |
null |
2024-08-14 |
Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey |
Hamza Kheddar et.al. |
2408.07583 |
null |
2024-08-15 |
MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark |
Minxuan Zhou et.al. |
2408.07543 |
link |
2024-08-15 |
Usefulness of data flow diagrams and large language models for security threat validation: a registered report |
Winnie Bahati Mbaka et.al. |
2408.07537 |
null |
2024-08-14 |
Development of a Multi-Agent Clinical Decision Support System for Korean Triage and Acuity Scale (KTAS)-Based Triage and Treatment Planning in Emergency Departments |
Seungjun Han et.al. |
2408.07531 |
null |
2024-08-14 |
Large Language Models Know What Makes Exemplary Contexts |
Quanyu Long et.al. |
2408.07505 |
null |
2024-08-14 |
Cross-Platform Video Person ReID: A New Benchmark Dataset and Adaptation Approach |
Shizhou Zhang et.al. |
2408.07500 |
link |
2024-08-14 |
QirK: Question Answering via Intermediate Representation on Knowledge Graphs |
Jan Luca Scheerer et.al. |
2408.07494 |
null |
2024-08-14 |
Training Overhead Ratio: A Practical Reliability Metric for Large Language Model Training Systems |
Ning Lu et.al. |
2408.07482 |
null |
2024-08-14 |
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization |
Yuxin Jiang et.al. |
2408.07471 |
link |
2024-08-14 |
Domain-invariant Representation Learning via Segment Anything Model for Blood Cell Classification |
Yongcheng Li et.al. |
2408.07467 |
link |
2024-08-14 |
Large Language Models Prompting With Episodic Memory |
Dai Do et.al. |
2408.07465 |
null |
2024-08-14 |
From Brazilian Portuguese to European Portuguese |
João Sanches et.al. |
2408.07457 |
null |
2024-08-14 |
Fact or Fiction? Improving Fact Verification with Knowledge Graphs through Simplified Subgraph Retrievals |
Tobias A. Opsahl et.al. |
2408.07453 |
link |
2024-08-15 |
BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning |
Asif Hanif et.al. |
2408.07440 |
link |
2024-08-14 |
Beyond Inter-Item Relations: Dynamic Adaptive Mixture-of-Experts for LLM-Based Sequential Recommendation |
CanYi Liu et.al. |
2408.07427 |
null |
2024-08-13 |
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents |
Kexun Zhang et.al. |
2408.07060 |
null |
2024-08-13 |
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs |
Yushi Bai et.al. |
2408.07055 |
link |
2024-08-13 |
Casper: Prompt Sanitization for Protecting User Privacy in Web-Based Large Language Models |
Chun Jie Chong et.al. |
2408.07004 |
null |
2024-08-13 |
LLMs can Schedule |
Henrik Abgaryan et.al. |
2408.06993 |
link |
2024-08-13 |
DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs |
Dongyuan Li et.al. |
2408.06966 |
null |
2024-08-13 |
Towards Holistic Disease Risk Prediction using Small Language Models |
Liv Björkdahl et.al. |
2408.06943 |
null |
2024-08-13 |
OpenResearcher: Unleashing AI for Accelerated Scientific Research |
Yuxiang Zheng et.al. |
2408.06941 |
link |
2024-08-13 |
The advantages of context specific language models: the case of the Erasmian Language Model |
João Gonçalves et.al. |
2408.06931 |
link |
2024-08-13 |
Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas |
Louis Kwok et.al. |
2408.06929 |
link |
2024-08-13 |
SceneGPT: A Language Model for 3D Scene Understanding |
Shivam Chandhok et.al. |
2408.06926 |
null |
2024-08-13 |
Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives |
Zhihu Wang et.al. |
2408.06904 |
null |
2024-08-13 |
Leveraging Language Models for Emotion and Behavior Analysis in Education |
Kaito Tanaka et.al. |
2408.06874 |
null |
2024-08-13 |
LoRA$^2$: Multi-Scale Low-Rank Approximations for Fine-Tuning Large Language Models |
Jia-Chen Zhang et.al. |
2408.06854 |
null |
2024-08-13 |
Causal Agent based on Large Language Model |
Kairong Han et.al. |
2408.06849 |
link |
2024-08-13 |
DracoGPT: Extracting Visualization Design Preferences from Large Language Models |
Huichen Will Wang et.al. |
2408.06845 |
null |
2024-08-13 |
How Aligned are Human Chart Takeaways and LLM Predictions? A Case Study on Bar Charts with Varying Layouts |
Huichen Will Wang et.al. |
2408.06837 |
null |
2024-08-13 |
Efficient Search for Customized Activation Functions with Gradient Descent |
Lukas Strack et.al. |
2408.06820 |
link |
2024-08-13 |
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty |
Yongjin Yang et.al. |
2408.06816 |
null |
2024-08-13 |
HLSPilot: LLM-based High-Level Synthesis |
Chenwei Xiong et.al. |
2408.06810 |
link |
2024-08-13 |
Layerwise Recurrent Router for Mixture-of-Experts |
Zihan Qiu et.al. |
2408.06793 |
link |
2024-08-12 |
FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection |
Yufei Huang et.al. |
2408.06333 |
link |
2024-08-12 |
Animate, or Inanimate, That is the Question for Large Language Models |
Leonardo Ranaldi et.al. |
2408.06332 |
null |
2024-08-12 |
Can We Rely on LLM Agents to Draft Long-Horizon Plans? Let’s Take TravelPlanner as an Example |
Yanan Chen et.al. |
2408.06318 |
null |
2024-08-12 |
Long-Form Answers to Visual Questions from Blind and Low Vision People |
Mina Huh et.al. |
2408.06303 |
null |
2024-08-12 |
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery |
Chris Lu et.al. |
2408.06292 |
link |
2024-08-12 |
MovieSum: An Abstractive Summarization Dataset for Movie Screenplays |
Rohit Saxena et.al. |
2408.06281 |
link |
2024-08-13 |
Review-driven Personalized Preference Reasoning with Large Language Models for Recommendation |
Jieyong Kim et.al. |
2408.06276 |
null |
2024-08-12 |
FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data |
Haoran Sun et.al. |
2408.06273 |
link |
2024-08-12 |
A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution |
Sampath Rajapaksha et.al. |
2408.06272 |
null |
2024-08-12 |
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment |
Karel D’Oosterlinck et.al. |
2408.06266 |
link |
2024-08-12 |
Context-aware Visual Storytelling with Visual Prefix Tuning and Contrastive Learning |
Yingjin Song et.al. |
2408.06259 |
null |
2024-08-12 |
On Effects of Steering Latent Representation for Large Language Model Unlearning |
Dang Huu-Tien et.al. |
2408.06223 |
null |
2024-08-12 |
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers |
Zhenting Qi et.al. |
2408.06195 |
link |
2024-08-12 |
FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework |
Lukas Meyer et.al. |
2408.06190 |
link |
2024-08-12 |
Improving Structural Diversity of Blackbox LLMs via Chain-of-Specification Prompting |
Halley Young et.al. |
2408.06186 |
null |
2024-08-12 |
OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning |
Mushui Liu et.al. |
2408.06158 |
link |
2024-08-12 |
LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library |
Tianhao Yu et.al. |
2408.06150 |
null |
2024-08-12 |
Self-Supervised Learning on MeerKAT Wide-Field Continuum Images |
Erica Lastufka et.al. |
2408.06147 |
link |
2024-08-12 |
Med42-v2: A Suite of Clinical LLMs |
Clément Christophe et.al. |
2408.06142 |
null |
2024-08-12 |
Utilize Transformers for translating Wikipedia category names |
Hoang-Thang Ta et.al. |
2408.06124 |
null |
2024-08-10 |
Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions |
Michele Miranda et.al. |
2408.05212 |
link |
2024-08-09 |
VITA: Towards Open-Source Interactive Omni Multimodal LLM |
Chaoyou Fu et.al. |
2408.05211 |
link |
2024-08-09 |
Evaluating the capability of large language models to personalize science texts for diverse middle-school-age learners |
Michael Vaccaro Jr et.al. |
2408.05204 |
null |
2024-08-09 |
TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning |
Yujie Feng et.al. |
2408.05200 |
link |
2024-08-09 |
ECG-FM: An Open Electrocardiogram Foundation Model |
Kaden McKeen et.al. |
2408.05178 |
link |
2024-08-09 |
Weak-Annotation of HAR Datasets using Vision Foundation Models |
Marius Bock et.al. |
2408.05169 |
link |
2024-08-09 |
AttackER: Towards Enhancing Cyber-Attack Attribution with a Named Entity Recognition Dataset |
Pritam Deka et.al. |
2408.05149 |
null |
2024-08-09 |
A Hybrid RAG System with Comprehensive Enhancement on Complex Reasoning |
Ye Yuan et.al. |
2408.05141 |
null |
2024-08-09 |
Is ChatGPT a Good Software Librarian? An Exploratory Study on the Use of ChatGPT for Software Library Recommendations |
Jasmine Latendresse et.al. |
2408.05128 |
null |
2024-08-09 |
Large Language Models and Thematic Analysis: Human-AI Synergy in Researching Hate Speech on Social Media |
Petre Breazu et.al. |
2408.05126 |
null |
2024-08-09 |
Sportify: Question Answering with Embedded Visualizations and Personified Narratives for Sports Video |
Chunggi Lee et.al. |
2408.05123 |
null |
2024-08-09 |
A Survey of NL2SQL with Large Language Models: Where are we, and where are we going? |
Xinyu Liu et.al. |
2408.05109 |
link |
2024-08-09 |
Depth Helps: Improving Pre-trained RGB-based Policy with Depth Information Injection |
Xincheng Pang et.al. |
2408.05107 |
null |
2024-08-09 |
How Well Do LLMs Identify Cultural Unity in Diversity? |
Jialin Li et.al. |
2408.05102 |
link |
2024-08-09 |
Hyperbolic Learning with Multimodal Large Language Models |
Paolo Mandica et.al. |
2408.05097 |
null |
2024-08-09 |
Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts |
Tingchen Fu et.al. |
2408.05094 |
null |
2024-08-09 |
Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models |
Zikai Xie et.al. |
2408.05093 |
link |
2024-08-09 |
Generating novel experimental hypotheses from language models: A case study on cross-dative generalization |
Kanishka Misra et.al. |
2408.05086 |
link |
2024-08-09 |
RT-Surv: Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records |
Sangjoon Park et.al. |
2408.05074 |
null |
2024-08-09 |
Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil |
Marcelo Sartori Locatelli et.al. |
2408.05035 |
null |
2024-08-08 |
Better Alignment with Instruction Back-and-Forth Translation |
Thao Nguyen et.al. |
2408.04614 |
null |
2024-08-08 |
Code-switching in text and speech reveals information-theoretic audience design |
Debasmita Bhattacharya et.al. |
2408.04596 |
null |
2024-08-09 |
Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models |
Qirui Jiao et.al. |
2408.04594 |
link |
2024-08-08 |
Towards Resilient and Efficient LLMs: A Comparative Study of Efficiency, Performance, and Adversarial Robustness |
Xiaojing Fan et.al. |
2408.04585 |
null |
2024-08-08 |
SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More |
Tianrun Chen et.al. |
2408.04579 |
null |
2024-08-08 |
SCENE: Evaluating Explainable AI Techniques Using Soft Counterfactuals |
Haoran Zheng et.al. |
2408.04575 |
null |
2024-08-08 |
Learning Fine-Grained Grounded Citations for Attributed Large Language Models |
Lei Huang et.al. |
2408.04568 |
link |
2024-08-08 |
Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models |
Yupeng Chang et.al. |
2408.04556 |
link |
2024-08-08 |
Depth Any Canopy: Leveraging Depth Foundation Models for Canopy Height Estimation |
Daniele Rege Cambrin et.al. |
2408.04523 |
link |
2024-08-08 |
Compromesso! Italian Many-Shot Jailbreaks Undermine the Safety of Large Language Models |
Fabio Pernisi et.al. |
2408.04522 |
null |
2024-08-08 |
What You Need is What You Get: Theory of Mind for an LLM-Based Code Understanding Assistant |
Jonan Richards et.al. |
2408.04477 |
null |
2024-08-08 |
Can LLMs Beat Humans in Debating? A Dynamic Multi-agent Framework for Competitive Debate |
Yiqun Zhang et.al. |
2408.04472 |
link |
2024-08-08 |
RiskAwareBench: Towards Evaluating Physical Risk Awareness for High-level Planning of LLM-based Embodied Agents |
Zihao Zhu et.al. |
2408.04449 |
link |
2024-08-08 |
Large Language Models for cross-language code clone detection |
Micheline Bénédicte Moumoula et.al. |
2408.04430 |
null |
2024-08-08 |
Recognizing Emotion Regulation Strategies from Human Behavior with Large Language Models |
Philipp Müller et.al. |
2408.04420 |
null |
2024-08-08 |
Enhancing Robustness of Retrieval-Augmented Language Models with In-Context Learning |
Seong-Il Park et.al. |
2408.04414 |
null |
2024-08-08 |
Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers |
Moritz Scherer et.al. |
2408.04413 |
null |
2024-08-08 |
Exploring Reasoning Biases in Large Language Models Through Syllogism: Insights from the NeuBAROCO Dataset |
Kentaro Ozeki et.al. |
2408.04403 |
link |
2024-08-08 |
Automated Educational Question Generation at Different Bloom’s Skill Levels using Large Language Models: Strategies and Evaluation |
Nicy Scaria et.al. |
2408.04394 |
link |
2024-08-08 |
Open-domain Implicit Format Control for Large Language Model Generation |
Yiqun Yao et.al. |
2408.04392 |
link |
2024-08-07 |
How Well Can Vision Language Models See Image Details? |
Chenhui Gou et.al. |
2408.03940 |
null |
2024-08-07 |
SLIM-RAFT: A Novel Fine-Tuning Approach to Improve Cross-Linguistic Performance for Mercosur Common Nomenclature |
Vinícius Di Oliveira et.al. |
2408.03936 |
null |
2024-08-07 |
CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases |
Xiangyan Liu et.al. |
2408.03910 |
link |
2024-08-07 |
Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models |
Shachi H Kumar et.al. |
2408.03907 |
null |
2024-08-07 |
Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond |
Beomseok Lee et.al. |
2408.03900 |
link |
2024-08-07 |
Simplifying Scholarly Abstracts for Accessible Digital Libraries |
Haining Wang et.al. |
2408.03899 |
link |
2024-08-07 |
From Data to Story: Towards Automatic Animated Data Video Creation with LLM-based Multi-Agent Systems |
Leixian Shen et.al. |
2408.03876 |
null |
2024-08-07 |
PackMamba: Efficient Processing of Variable-Length Sequences in Mamba training |
Haoran Xu et.al. |
2408.03865 |
null |
2024-08-07 |
GAIA – A Large Language Model for Advanced Power Dispatch |
Yuheng Cheng et.al. |
2408.03847 |
null |
2024-08-07 |
MaxMind: A Memory Loop Network to Enhance Software Productivity based on Large Language Models |
Yuchen Dong et.al. |
2408.03841 |
null |
2024-08-07 |
WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models |
Prannaya Gupta et.al. |
2408.03837 |
link |
2024-08-07 |
Target Prompting for Information Extraction with Vision Language Model |
Dipankar Medhi et.al. |
2408.03834 |
null |
2024-08-07 |
Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning |
Simret Araya Gebreegziabher et.al. |
2408.03819 |
null |
2024-08-07 |
Generative Language Models with Retrieval Augmented Generation for Automated Short Answer Scoring |
Zifan Wang et.al. |
2408.03811 |
null |
2024-08-07 |
‘Finance Wizard’ at the FinLLM Challenge Task: Financial Text Summarization |
Meisin Lee et.al. |
2408.03762 |
null |
2024-08-07 |
MMSummary: Multimodal Summary Generation for Fetal Ultrasound Video |
Xiaoqing Guo et.al. |
2408.03761 |
null |
2024-08-07 |
Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation |
Jingjing Xie et.al. |
2408.03735 |
link |
2024-08-07 |
Question Rephrasing for Quantifying Uncertainty in Large Language Models: Applications in Molecular Chemistry Tasks |
Zizhang Chen et.al. |
2408.03732 |
null |
2024-08-07 |
A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models |
Pengxiang Zhao et.al. |
2408.03728 |
null |
2024-08-07 |
Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction |
Benjamin Matthias Ruppik et.al. |
2408.03706 |
null |
2024-08-06 |
CoverBench: A Challenging Benchmark for Complex Claim Verification |
Alon Jacovi et.al. |
2408.03325 |
null |
2024-08-06 |
Segment Anything in Medical Images and Videos: Benchmark and Deployment |
Jun Ma et.al. |
2408.03322 |
link |
2024-08-06 |
TextIM: Part-aware Interactive Motion Synthesis from Text |
Siyuan Fan et.al. |
2408.03302 |
null |
2024-08-06 |
KaPO: Knowledge-aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models |
Ruizhe Zhang et.al. |
2408.03297 |
null |
2024-08-06 |
Biomedical SAM 2: Segment Anything in Biomedical Images and Videos |
Zhiling Yan et.al. |
2408.03286 |
link |
2024-08-07 |
StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation |
Boxi Cao et.al. |
2408.03281 |
link |
2024-08-06 |
Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments |
Angie Boggust et.al. |
2408.03274 |
null |
2024-08-06 |
Synthesizing Text-to-SQL Data from Weak and Strong LLMs |
Jiaxi Yang et.al. |
2408.03256 |
null |
2024-08-06 |
Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons |
Yifei Wang et.al. |
2408.03247 |
link |
2024-08-06 |
Making Long-Context Language Models Better Multi-Hop Reasoners |
Yanyang Li et.al. |
2408.03246 |
link |
2024-08-06 |
Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi |
Pranita Deshmukh et.al. |
2408.03172 |
null |
2024-08-06 |
Conditioning LLMs with Emotion in Neural Machine Translation |
Charles Brazier et.al. |
2408.03150 |
null |
2024-08-06 |
Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization |
Yanghai Zhang et.al. |
2408.03149 |
link |
2024-08-06 |
Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations |
Leo Donisch et.al. |
2408.03130 |
null |
2024-08-06 |
Lisbon Computational Linguists at SemEval-2024 Task 2: Using A Mistral 7B Model and Data Augmentation |
Artur Guimarães et.al. |
2408.03127 |
link |
2024-08-06 |
Evaluating the Translation Performance of Large Language Models Based on Euas-20 |
Yan Huang et.al. |
2408.03119 |
null |
2024-08-06 |
Topic Modeling with Fine-tuning LLMs and Bag of Sentences |
Johannes Schneider et.al. |
2408.03099 |
link |
2024-08-07 |
TestART: Improving LLM-based Unit Test via Co-evolution of Automated Generation and Repair Iteration |
Siqi Gu et.al. |
2408.03095 |
null |
2024-08-06 |
500xCompressor: Generalized Prompt Compression for Large Language Models |
Zongqian Li et.al. |
2408.03094 |
link |
2024-08-06 |
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement |
Le Yu et.al. |
2408.03092 |
link |
2024-08-05 |
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining |
Dongyang Liu et.al. |
2408.02657 |
link |
2024-08-05 |
Can Reinforcement Learning Unlock the Hidden Dangers in Aligned Large Language Models? |
Mohammad Bahrami Karkevandi et.al. |
2408.02651 |
null |
2024-08-05 |
Command-line Obfuscation Detection using Small Language Models |
Vojtech Outrata et.al. |
2408.02637 |
null |
2024-08-05 |
SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models |
Muxi Diao et.al. |
2408.02632 |
null |
2024-08-05 |
Language Model Can Listen While Speaking |
Ziyang Ma et.al. |
2408.02622 |
null |
2024-08-05 |
Progressively Selective Label Enhancement for Language Model Alignment |
Biao Liu et.al. |
2408.02599 |
null |
2024-08-05 |
Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection |
Sajal Aggarwal et.al. |
2408.02595 |
null |
2024-08-05 |
Leveraging the Power of LLMs: A Fine-Tuning Approach for High-Quality Aspect-Based Summarization |
Ankan Mullick et.al. |
2408.02584 |
null |
2024-08-05 |
DanModCap: Designing a Danmaku Moderation Tool for Video-Sharing Platforms that Leverages Impact Captions |
Siying Hu et.al. |
2408.02574 |
null |
2024-08-05 |
Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information |
Yauwai Yim et.al. |
2408.02559 |
null |
2024-08-05 |
Generative AI as a Service in 6G Edge-Cloud: Generation Task Offloading by In-context Learning |
Hao Zhou et.al. |
2408.02549 |
null |
2024-08-05 |
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation |
Daniel Fleischer et.al. |
2408.02545 |
link |
2024-08-05 |
Caution for the Environment: Multimodal Agents are Susceptible to Environmental Distractions |
Xinbei Ma et.al. |
2408.02544 |
link |
2024-08-05 |
Towards Coarse-grained Visual Language Navigation Task Planning Enhanced by Event Knowledge Graph |
Zhao Kaichen et.al. |
2408.02535 |
null |
2024-08-05 |
Practical Attacks against Black-box Code Completion Engines |
Slobodan Jenko et.al. |
2408.02509 |
null |
2024-08-05 |
UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model |
Zhaowei Li et.al. |
2408.02503 |
link |
2024-08-05 |
Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation |
Aaron Imani et.al. |
2408.02502 |
null |
2024-08-05 |
A First Look at License Compliance Capability of LLMs in Code Generation |
Weiwei Xu et.al. |
2408.02487 |
link |
2024-08-05 |
Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection |
Ting Lei et.al. |
2408.02484 |
link |
2024-08-05 |
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future |
Haolin Jin et.al. |
2408.02479 |
null |
2024-08-02 |
Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting |
Xiangyu Zhao et.al. |
2408.01423 |
null |
2024-08-02 |
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs |
Jingtong Su et.al. |
2408.01420 |
null |
2024-08-02 |
DebateQA: Evaluating Question Answering on Debatable Knowledge |
Rongwu Xu et.al. |
2408.01419 |
link |
2024-08-02 |
Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs |
Yilun Hua et.al. |
2408.01417 |
null |
2024-08-02 |
Pre-trained Language Models Improve the Few-shot Prompt Ability of Decision Transformer |
Yu Yang et.al. |
2408.01402 |
null |
2024-08-02 |
Coalitions of Large Language Models Increase the Robustness of AI Agents |
Prattyush Mangal et.al. |
2408.01380 |
null |
2024-08-02 |
Toward Automatic Relevance Judgment using Vision–Language Models for Image–Text Retrieval Evaluation |
Jheng-Hong Yang et.al. |
2408.01363 |
null |
2024-08-02 |
Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs |
Peng Ding et.al. |
2408.01355 |
link |
2024-08-02 |
MCGMark: An Encodable and Robust Online Watermark for LLM-Generated Malicious Code |
Kaiwen Ning et.al. |
2408.01354 |
link |
2024-08-02 |
Prompt Refinement or Fine-tuning? Best Practices for using LLMs in Computational Social Science Tasks |
Anders Giovanni Møller et.al. |
2408.01346 |
null |
2024-08-02 |
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models |
Benno Weck et.al. |
2408.01337 |
link |
2024-08-02 |
A Backbone for Long-Horizon Robot Task Understanding |
Xiaoshuai Chen et.al. |
2408.01334 |
null |
2024-08-02 |
FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only |
He Zhu et.al. |
2408.01323 |
null |
2024-08-02 |
A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks |
Jiaqi Wang et.al. |
2408.01319 |
null |
2024-08-02 |
Reconsidering Token Embeddings with the Definitions for Pre-trained Language Models |
Ying Zhang et.al. |
2408.01308 |
null |
2024-08-02 |
The Mismeasure of Man and Models: Evaluating Allocational Harms in Large Language Models |
Hannah Chen et.al. |
2408.01285 |
null |
2024-08-02 |
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework |
Kunlun Zhu et.al. |
2408.01262 |
link |
2024-08-02 |
The Phantom Menace: Unmasking Privacy Leakages in Vision-Language Models |
Simone Caldarella et.al. |
2408.01228 |
null |
2024-08-02 |
High-Throughput Phenotyping of Clinical Text Using Large Language Models |
Daniel B. Hier et.al. |
2408.01214 |
null |
2024-08-02 |
Misinforming LLMs: vulnerabilities, challenges and opportunities |
Bo Zhou et.al. |
2408.01168 |
null |
2024-08-01 |
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation |
Mengkang Hu et.al. |
2408.00764 |
null |
2024-08-01 |
UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model |
Xiangyu Fan et.al. |
2408.00762 |
null |
2024-08-01 |
Tamper-Resistant Safeguards for Open-Weight LLMs |
Rishub Tamirisa et.al. |
2408.00761 |
link |
2024-08-01 |
Thermal Conductivity Predictions with Foundation Atomistic Models |
Balázs Póta et.al. |
2408.00755 |
link |
2024-08-01 |
Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model |
Benlin Liu et.al. |
2408.00754 |
null |
2024-08-01 |
Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation |
Siyu Jiao et.al. |
2408.00744 |
link |
2024-08-01 |
DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency |
Jovan Stojkovic et.al. |
2408.00741 |
null |
2024-08-01 |
Virchow 2: Scaling Self-Supervised Mixed Magnification Models in Pathology |
Eric Zimmermann et.al. |
2408.00738 |
null |
2024-08-01 |
Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions |
Guangzhi Xiong et.al. |
2408.00727 |
link |
2024-08-01 |
An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models |
Yangzhen Wu et.al. |
2408.00724 |
null |
2024-08-01 |
Pathway to Secure and Trustworthy 6G for LLMs: Attacks, Defense, and Opportunities |
Sunder Ali Khowaja et.al. |
2408.00722 |
null |
2024-08-01 |
SAM 2: Segment Anything in Images and Videos |
Nikhila Ravi et.al. |
2408.00714 |
link |
2024-08-01 |
Point-supervised Brain Tumor Segmentation with Box-prompted MedSAM |
Xiaofeng Liu et.al. |
2408.00706 |
null |
2024-08-01 |
Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning |
Trapoom Ukarapol et.al. |
2408.00690 |
link |
2024-08-01 |
Can Developers Prompt? A Controlled Experiment for Code Documentation Generation |
Hans-Alexander Kruse et.al. |
2408.00686 |
null |
2024-08-01 |
ExpertAF: Expert Actionable Feedback from Video |
Kumar Ashutosh et.al. |
2408.00672 |
null |
2024-08-01 |
AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models |
Daqin Luo et.al. |
2408.00665 |
link |
2024-08-01 |
Disentangling Dense Embeddings with Sparse Autoencoders |
Charles O’Neill et.al. |
2408.00657 |
null |
2024-08-02 |
SentenceVAE: Faster, Longer and More Accurate Inference with Next-sentence Prediction for Large Language Models |
Hongjun An et.al. |
2408.00655 |
link |
2024-08-01 |
Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning |
Xuri Ge et.al. |
2408.00644 |
null |
2024-07-31 |
Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey |
Atsuyuki Miyai et.al. |
2407.21794 |
null |
2024-07-31 |
Vision-Language Model Based Handwriting Verification |
Mihir Chauhan et.al. |
2407.21788 |
null |
2024-07-31 |
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling |
Bradley Brown et.al. |
2407.21787 |
null |
2024-07-31 |
The Llama 3 Herd of Models |
Abhimanyu Dubey et.al. |
2407.21783 |
null |
2024-07-31 |
Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs |
Shi Liu et.al. |
2407.21771 |
null |
2024-07-31 |
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts |
Xi Victoria Lin et.al. |
2407.21770 |
null |
2024-07-31 |
ReplanVLM: Replanning Robotic Tasks with Visual Language Models |
Aoran Mei et.al. |
2407.21762 |
null |
2024-07-31 |
Learning Video Context as Interleaved Multimodal Sequences |
Kevin Qinghong Lin et.al. |
2407.21757 |
link |
2024-07-31 |
A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation |
Mothilal Asokan et.al. |
2407.21739 |
null |
2024-07-31 |
Open-Vocabulary Audio-Visual Semantic Segmentation |
Ruohao Guo et.al. |
2407.21721 |
null |
2024-07-31 |
Adaptive Retrieval-Augmented Generation for Conversational Systems |
Xi Wang et.al. |
2407.21712 |
null |
2024-07-31 |
CEAR: Automatic construction of a knowledge graph of chemical entities and roles from scientific literature |
Stefan Langer et.al. |
2407.21708 |
null |
2024-07-31 |
TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities |
Ming Zhang et.al. |
2407.21693 |
link |
2024-07-31 |
Synth-Empathy: Towards High-Quality Synthetic Empathy Data |
Hao Liang et.al. |
2407.21669 |
link |
2024-08-01 |
Defending Jailbreak Attack in VLMs via Cross-modality Information Detector |
Yue Xu et.al. |
2407.21659 |
link |
2024-07-31 |
MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment |
Anurag Das et.al. |
2407.21654 |
null |
2024-07-31 |
Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation |
Xiang Luo et.al. |
2407.21633 |
link |
2024-07-31 |
TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods |
Gabriel Loiseau et.al. |
2407.21630 |
link |
2024-07-31 |
LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows |
Lukas Teufelberger et.al. |
2407.21593 |
null |
2024-07-31 |
A Performance Study of LLM-Generated Code on Leetcode |
Tristan Coignion et.al. |
2407.21579 |
null |
2024-07-30 |
ThinK: Thinner Key Cache by Query-Driven Pruning |
Yuhui Xu et.al. |
2407.21018 |
null |
2024-07-30 |
CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning |
Yuexi Du et.al. |
2407.21011 |
link |
2024-07-30 |
GABInsight: Exploring Gender-Activity Binding Bias in Vision-Language Models |
Ali Abdollahi et.al. |
2407.21001 |
link |
2024-07-30 |
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning |
Yupeng Chen et.al. |
2407.20999 |
null |
2024-07-30 |
From Feature Importance to Natural Language Explanations Using LLMs with RAG |
Sule Tekkesinoglu et.al. |
2407.20990 |
link |
2024-07-30 |
Large Language Models (LLMs) for Semantic Communication in Edge-based IoT Networks |
Alakesh Kalita et.al. |
2407.20970 |
null |
2024-07-30 |
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions |
Xiaowei Chi et.al. |
2407.20962 |
link |
2024-07-30 |
UniProcessor: A Text-induced Unified Low-level Image Processor |
Huiyu Duan et.al. |
2407.20928 |
link |
2024-07-30 |
SSPA: Split-and-Synthesize Prompting with Gated Alignments for Multi-Label Image Recognition |
Hao Tan et.al. |
2407.20920 |
null |
2024-07-30 |
Automated Review Generation Method Based on Large Language Models |
Shican Wu et.al. |
2407.20906 |
link |
2024-07-30 |
Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach |
Adam Wojciechowski et.al. |
2407.20899 |
link |
2024-07-30 |
ThinkRepair: Self-Directed Automated Program Repair |
Xin Yin et.al. |
2407.20898 |
link |
2024-07-30 |
Effective Black Box Testing of Sentiment Analysis Classification Networks |
Parsa Karbasizadeh et.al. |
2407.20884 |
null |
2024-07-30 |
Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification |
Boyang Zhang et.al. |
2407.20859 |
null |
2024-07-30 |
Learn by Selling: Equipping Large Language Models with Product Knowledge for Context-Driven Recommendations |
Sarthak Anand et.al. |
2407.20856 |
null |
2024-07-30 |
Large Language Model (LLM)-enabled Graphs in Dynamic Networking |
Geng Sun et.al. |
2407.20840 |
null |
2024-07-30 |
How to Measure the Intelligence of Large Language Models? |
Nils Körber et.al. |
2407.20828 |
null |
2024-07-30 |
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning |
Norman Di Palo et.al. |
2407.20798 |
null |
2024-07-30 |
Interpretable Pre-Trained Transformers for Heart Time-Series Data |
Harry J. Davies et.al. |
2407.20775 |
link |
2024-07-30 |
OmniBal: Towards Fast Instruct-tuning for Vision-Language Models via Omniverse Computation Balance |
Yongqiang Yao et.al. |
2407.20761 |
link |
2024-07-29 |
Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing |
Ekaterina Iakovleva et.al. |
2407.20232 |
null |
2024-07-29 |
Improving 2D Feature Representations by 3D-Aware Fine-Tuning |
Yuanwen Yue et.al. |
2407.20229 |
null |
2024-07-29 |
FlexAttention for Efficient High-Resolution Vision-Language Models |
Junyan Li et.al. |
2407.20228 |
null |
2024-07-29 |
Can Editing LLMs Inject Harm? |
Canyu Chen et.al. |
2407.20224 |
null |
2024-07-29 |
SANGRIA: Surgical Video Scene Graph Optimization for Surgical Workflow Prediction |
Çağhan Köksal et.al. |
2407.20214 |
null |
2024-07-29 |
QAEA-DR: A Unified Text Augmentation Framework for Dense Retrieval |
Hongming Tan et.al. |
2407.20207 |
null |
2024-07-29 |
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher |
Zehui Chen et.al. |
2407.20183 |
link |
2024-07-29 |
Theia: Distilling Diverse Vision Foundation Models for Robot Learning |
Jinghuan Shang et.al. |
2407.20179 |
link |
2024-07-29 |
AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs |
Feiyang Kang et.al. |
2407.20177 |
link |
2024-07-29 |
Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning |
Xingchen Zeng et.al. |
2407.20174 |
link |
2024-07-29 |
Diffusion Feedback Helps CLIP See Better |
Wenxuan Wang et.al. |
2407.20171 |
link |
2024-07-29 |
Language-Conditioned Offline RL for Multi-Robot Navigation |
Steven Morad et.al. |
2407.20164 |
null |
2024-07-29 |
rLLM: Relational Table Learning with LLMs |
Weichen Li et.al. |
2407.20157 |
link |
2024-07-29 |
ByteCheckpoint: A Unified Checkpointing System for LLM Development |
Borui Wan et.al. |
2407.20143 |
null |
2024-07-29 |
Strong Copyright Protection for Language Models via Adaptive Model Fusion |
Javier Abad et.al. |
2407.20105 |
null |
2024-07-29 |
Orca: Ocean Significant Wave Height Estimation with Spatio-temporally Aware Large Language Models |
Zhe Li et.al. |
2407.20053 |
null |
2024-07-29 |
Exploring Large Language Models to generate Easy to Read content |
Paloma Martínez et.al. |
2407.20046 |
null |
2024-07-29 |
MaskInversion: Localized Embeddings via Optimization of Explainability Maps |
Walid Bousselham et.al. |
2407.20034 |
null |
2024-07-29 |
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey |
Jiangfei Duan et.al. |
2407.20018 |
null |
2024-07-29 |
Rosetta Statements: Lowering the Barrier for Semantic Parsing and Increasing the Cognitive Interoperability of Knowledge Graphs |
Lars Vogt et.al. |
2407.20007 |
null |
2024-07-26 |
Wolf: Captioning Everything with a World Summarization Framework |
Boyi Li et.al. |
2407.18908 |
null |
2024-07-26 |
SHIC: Shape-Image Correspondences with no Keypoint Supervision |
Aleksandar Shtedritski et.al. |
2407.18907 |
null |
2024-07-26 |
A Flexible and Scalable Approach for Collecting Wildlife Advertisements on the Web |
Juliana Barbosa et.al. |
2407.18898 |
link |
2024-07-26 |
Small Molecule Optimization with Large Language Models |
Philipp Guevorguian et.al. |
2407.18897 |
link |
2024-07-26 |
Human-artificial intelligence teaming for scientific information extraction from data-driven additive manufacturing research using large language models |
Mutahar Safdar et.al. |
2407.18827 |
null |
2024-07-26 |
Automatic Detection of Moral Values in Music Lyrics |
Vjosa Preniqi et.al. |
2407.18787 |
link |
2024-07-26 |
The power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs |
Aleix Sant et.al. |
2407.18786 |
null |
2024-07-26 |
Foundation Models for the Digital Twin Creation of Cyber-Physical Systems |
Shaukat Ali et.al. |
2407.18779 |
null |
2024-07-26 |
TAGIFY: LLM-powered Tagging Interface for Improved Data Findability on OGD portals |
Kevin Kliimask et.al. |
2407.18764 |
null |
2024-07-26 |
Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery |
Yuni Susanti et.al. |
2407.18752 |
link |
2024-07-26 |
Towards Effective and Efficient Continual Pre-training of Large Language Models |
Jie Chen et.al. |
2407.18743 |
null |
2024-07-26 |
Towards Generalized Offensive Language Identification |
Alphaeus Dmonte et.al. |
2407.18738 |
null |
2024-07-26 |
LLASP: Fine-tuning Large Language Models for Answer Set Programming |
Erica Coppolillo et.al. |
2407.18723 |
null |
2024-07-26 |
Neurosymbolic AI for Enhancing Instructability in Generative AI |
Amit Sheth et.al. |
2407.18722 |
null |
2024-07-26 |
Cluster-norm for Unsupervised Probing of Knowledge |
Walter Laurito et.al. |
2407.18712 |
link |
2024-07-26 |
Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation |
Esteban Garces Arias et.al. |
2407.18698 |
link |
2024-07-26 |
Collaborative Evolving Strategy for Automatic Data-Centric Development |
Xu Yang et.al. |
2407.18690 |
null |
2024-07-26 |
The BIAS Detection Framework: Bias Detection in Word Embeddings and Language Models for European Languages |
Alexandre Puttick et.al. |
2407.18689 |
link |
2024-07-26 |
Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift |
Seongho Son et.al. |
2407.18676 |
null |
2024-07-26 |
Every Part Matters: Integrity Verification of Scientific Figures Based on Multimodal Large Language Models |
Xiang Shi et.al. |
2407.18626 |
link |
2024-07-25 |
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning |
Tianduo Wang et.al. |
2407.18248 |
link |
2024-07-25 |
LoRA-Pro: Are Low-Rank Adapters Properly Optimized? |
Zhengbo Wang et.al. |
2407.18242 |
link |
2024-07-25 |
Recursive Introspection: Teaching Language Model Agents How to Self-Improve |
Yuxiao Qu et.al. |
2407.18219 |
null |
2024-07-26 |
Exploring Scaling Trends in LLM Robustness |
Nikolaus Howe et.al. |
2407.18213 |
null |
2024-07-25 |
AsEP: Benchmarking Deep Learning Methods for Antibody-specific Epitope Prediction |
Chunan Liu et.al. |
2407.18184 |
link |
2024-07-25 |
Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning |
Sindhura Kommu et.al. |
2407.18181 |
null |
2024-07-25 |
Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models |
Sanae Lotfi et.al. |
2407.18158 |
null |
2024-07-25 |
$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs |
Vlad Sobal et.al. |
2407.18134 |
null |
2024-07-25 |
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic |
Fakhraddin Alwajih et.al. |
2407.18129 |
null |
2024-07-25 |
Efficient Inference of Vision Instruction-Following Models with Elastic Cache |
Zuyan Liu et.al. |
2407.18121 |
link |
2024-07-25 |
Multi-Resolution Histopathology Patch Graphs for Ovarian Cancer Subtyping |
Jack Breen et.al. |
2407.18105 |
link |
2024-07-25 |
Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow |
Tian Guo et.al. |
2407.18103 |
null |
2024-07-25 |
PEFT-U: Parameter-Efficient Fine-Tuning for User Personalization |
Christopher Clarke et.al. |
2407.18078 |
link |
2024-07-25 |
C2P: Featuring Large Language Models with Causal Reasoning |
Abdolmahdi Bagheri et.al. |
2407.18069 |
null |
2024-07-25 |
ComPeer: A Generative Conversational Agent for Proactive Peer Support |
Tianjian Liu et.al. |
2407.18064 |
link |
2024-07-25 |
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding |
Soham Deshmukh et.al. |
2407.18062 |
link |
2024-07-25 |
Difficulty Estimation and Simplification of French Text Using LLMs |
Henri Jamet et.al. |
2407.18061 |
null |
2024-07-25 |
The Geometry of Queries: Query-Based Innovations in Retrieval-Augmented Generation |
Eric Yang et.al. |
2407.18044 |
null |
2024-07-25 |
RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models |
Haoyu Chen et.al. |
2407.18035 |
null |
2024-07-25 |
GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy |
Jan Batzner et.al. |
2407.18008 |
null |
2024-07-24 |
I Could’ve Asked That: Reformulating Unanswerable Questions |
Wenting Zhao et.al. |
2407.17469 |
link |
2024-07-24 |
WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries |
Wenting Zhao et.al. |
2407.17468 |
null |
2024-07-24 |
CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models |
Jiawei Gu et.al. |
2407.17467 |
null |
2024-07-24 |
$VILA^2$: VILA Augmented VILA |
Yunhao Fang et.al. |
2407.17453 |
null |
2024-07-24 |
Fluent Student-Teacher Redteaming |
T. Ben Thompson et.al. |
2407.17447 |
link |
2024-07-24 |
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? |
Michael-Andrei Panaitescu-Liess et.al. |
2407.17417 |
null |
2024-07-24 |
(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork |
Tianjin Huang et.al. |
2407.17412 |
null |
2024-07-24 |
Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models |
Yida Zhao et.al. |
2407.17406 |
link |
2024-07-24 |
Grammar-based Game Description Generation using Large Language Models |
Tsunehiko Tanaka et.al. |
2407.17404 |
null |
2024-07-24 |
3D Question Answering for City Scene Understanding |
Penglei Sun et.al. |
2407.17398 |
null |
2024-07-24 |
PERSONA: A Reproducible Testbed for Pluralistic Alignment |
Louis Castricato et.al. |
2407.17387 |
null |
2024-07-24 |
A Comprehensive Approach to Misspelling Correction with BERT and Levenshtein Distance |
Amirreza Naziri et.al. |
2407.17383 |
null |
2024-07-24 |
MMRA: A Benchmark for Multi-granularity Multi-image Relational Association |
Siwei Wu et.al. |
2407.17379 |
link |
2024-07-24 |
ViPer: Visual Personalization of Generative Models via Individual Preference Learning |
Sogand Salehi et.al. |
2407.17365 |
null |
2024-07-24 |
Gradient-based inference of abstract task representations for generalization in neural networks |
Ali Hummos et.al. |
2407.17356 |
null |
2024-07-24 |
Scalify: scale propagation for efficient low-precision LLM training |
Paul Balança et.al. |
2407.17353 |
link |
2024-07-24 |
Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching |
Yuyang Ding et.al. |
2407.17349 |
link |
2024-07-24 |
DexGANGrasp: Dexterous Generative Adversarial Grasping Synthesis for Task-Oriented Manipulation |
Qian Feng et.al. |
2407.17348 |
null |
2024-07-24 |
Label Alignment and Reassignment with Generalist Large Language Model for Enhanced Cross-Domain Named Entity Recognition |
Ke Bao et.al. |
2407.17344 |
null |
2024-07-24 |
How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations? |
Leo Yu-Ho Lo et.al. |
2407.17291 |
null |
2024-07-23 |
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects |
Junyi Li et.al. |
2407.16696 |
link |
2024-07-23 |
Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack |
Xiaoyue Xu et.al. |
2407.16695 |
link |
2024-07-23 |
Can Large Language Models Automatically Jailbreak GPT-4V? |
Yuanwei Wu et.al. |
2407.16686 |
null |
2024-07-23 |
SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation |
Pengfei Chen et.al. |
2407.16682 |
null |
2024-07-23 |
RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent |
Huiyu Xu et.al. |
2407.16667 |
null |
2024-07-23 |
Course-Correction: Safety Alignment Using Synthetic Preferences |
Rongwu Xu et.al. |
2407.16637 |
link |
2024-07-23 |
Lawma: The Power of Specialization for Legal Tasks |
Ricardo Dominguez-Olmedo et.al. |
2407.16615 |
null |
2024-07-23 |
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data? |
Jonathan Hayase et.al. |
2407.16607 |
link |
2024-07-23 |
Shared Imagination: LLMs Hallucinate Alike |
Yilun Zhou et.al. |
2407.16604 |
null |
2024-07-23 |
A Comparative Study on Patient Language across Therapeutic Domains for Effective Patient Voice Classification in Online Health Discussions |
Giorgos Lysandrou et.al. |
2407.16593 |
null |
2024-07-23 |
Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs |
Yifan Xia et.al. |
2407.16576 |
null |
2024-07-23 |
TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback |
Eunseop Yoon et.al. |
2407.16574 |
null |
2024-07-23 |
Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models |
Ioana Buhnila et.al. |
2407.16565 |
link |
2024-07-23 |
Patched RTC: evaluating LLMs for diverse software development tasks |
Asankhaya Sharma et.al. |
2407.16557 |
link |
2024-07-24 |
MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues |
Liyun Zhang et.al. |
2407.16552 |
null |
2024-07-23 |
Quantifying the Role of Textual Predictability in Automatic Speech Recognition |
Sean Robertson et.al. |
2407.16537 |
null |
2024-07-23 |
Imperfect Vision Encoders: Efficient and Robust Tuning for Vision-Language Models |
Aristeidis Panos et.al. |
2407.16526 |
null |
2024-07-23 |
AMONGAGENTS: Evaluating Large Language Models in the Interactive Text-Based Social Deduction Game |
Yizhou Chi et.al. |
2407.16521 |
null |
2024-07-23 |
Language-Based Security for Low-Level MPC |
Christian Skalka et.al. |
2407.16504 |
null |
2024-07-23 |
Machine Translation Hallucination Detection for Low and High Resource Languages using Large Language Models |
Kenza Benkirane et.al. |
2407.16470 |
link |
2024-07-22 |
AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description |
Junyu Xie et.al. |
2407.15850 |
link |
2024-07-22 |
LLMmap: Fingerprinting For Large Language Models |
Dario Pasquini et.al. |
2407.15847 |
link |
2024-07-22 |
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models |
Mingze Xu et.al. |
2407.15841 |
link |
2024-07-22 |
MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity |
Yangzhou Liu et.al. |
2407.15838 |
link |
2024-07-22 |
dMel: Speech Tokenization made Simple |
He Bai et.al. |
2407.15835 |
null |
2024-07-22 |
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling |
Wataru Nakata et.al. |
2407.15828 |
null |
2024-07-22 |
Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight |
Ziyuan Huang et.al. |
2407.15819 |
null |
2024-07-22 |
Perceptions of Linguistic Uncertainty by Language Models and Humans |
Catarina G Belem et.al. |
2407.15814 |
link |
2024-07-22 |
AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection |
Yunkang Cao et.al. |
2407.15795 |
link |
2024-07-22 |
CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning |
Emanuele Frascaroli et.al. |
2407.15793 |
link |
2024-07-22 |
Extracting Structured Insights from Financial News: An Augmented LLM Driven Approach |
Rian Dolphin et.al. |
2407.15788 |
null |
2024-07-22 |
Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels |
Zhuorui Ye et.al. |
2407.15786 |
null |
2024-07-22 |
Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning |
Kaiwen Wang et.al. |
2407.15762 |
null |
2024-07-22 |
MoRSE: Bridging the Gap in Cybersecurity Expertise with Retrieval Augmented Generation |
Marco Simoni et.al. |
2407.15748 |
null |
2024-07-22 |
OMoS-QA: A Dataset for Cross-Lingual Extractive Question Answering in a German Migration Context |
Steffen Kleinle et.al. |
2407.15736 |
null |
2024-07-22 |
TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON |
John Chong Min Tan et.al. |
2407.15734 |
link |
2024-07-22 |
Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders |
Laura Niss et.al. |
2407.15731 |
null |
2024-07-22 |
SAM2CLIP2SAM: Vision Language Model for Segmentation of 3D CT Scans for Covid-19 Detection |
Dimitrios Kollias et.al. |
2407.15728 |
null |
2024-07-22 |
DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design |
Zhi Hao Luo et.al. |
2407.15723 |
link |
2024-07-22 |
Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability |
Zhuoyan Xu et.al. |
2407.15720 |
link |
2024-07-19 |
Internal Consistency and Self-Feedback in Large Language Models: A Survey |
Xun Liang et.al. |
2407.14507 |
link |
2024-07-19 |
On Pre-training of Multimodal Language Models Customized for Chart Understanding |
Wan-Cyuan Fan et.al. |
2407.14506 |
null |
2024-07-19 |
PD-TPE: Parallel Decoder with Text-guided Position Encoding for 3D Visual Grounding |
Chenshu Hou et.al. |
2407.14491 |
null |
2024-07-19 |
Evaluating the Reliability of Self-Explanations in Large Language Models |
Korbinian Randl et.al. |
2407.14487 |
link |
2024-07-19 |
Data-Centric Human Preference Optimization with Rationales |
Hoang Anh Just et.al. |
2407.14477 |
link |
2024-07-19 |
Contrastive Learning with Counterfactual Explanations for Radiology Report Generation |
Mingjie Li et.al. |
2407.14474 |
null |
2024-07-19 |
Check-Eval: A Checklist-based Approach for Evaluating Text Quality |
Jayr Pereira et.al. |
2407.14467 |
null |
2024-07-19 |
Undermining Mental Proof: How AI Can Make Cooperation Harder by Making Thinking Easier |
Zachary Wojtowicz et.al. |
2407.14452 |
null |
2024-07-19 |
Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding |
Renshan Zhang et.al. |
2407.14439 |
link |
2024-07-19 |
Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders |
Senthooran Rajamanoharan et.al. |
2407.14435 |
null |
2024-07-19 |
Mixture of Experts with Mixture of Precisions for Tuning Quality of Service |
HamidReza Imani et.al. |
2407.14417 |
null |
2024-07-19 |
System-1.x: Learning to Balance Fast and Slow Planning with Language Models |
Swarnadeep Saha et.al. |
2407.14414 |
link |
2024-07-19 |
DEAL: Disentangle and Localize Concept-level Explanations for VLMs |
Tang Li et.al. |
2407.14412 |
link |
2024-07-19 |
The Vision of Autonomic Computing: Can LLMs Make It a Reality? |
Zhiyang Zhang et.al. |
2407.14402 |
null |
2024-07-19 |
Frontiers of Deep Learning: From Novel Application to Real-World Deployment |
Rui Xie et.al. |
2407.14386 |
null |
2024-07-19 |
Open Artificial Knowledge |
Vadim Borisov et.al. |
2407.14371 |
null |
2024-07-19 |
Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models |
Xuenan Xu et.al. |
2407.14355 |
link |
2024-07-19 |
Improving Retrieval in Sponsored Search by Leveraging Query Context Signals |
Akash Kumar Mohankumar et.al. |
2407.14346 |
null |
2024-07-19 |
LLMs left, right, and center: Assessing GPT’s capabilities to label political bias from web domains |
Raphael Hernandes et.al. |
2407.14344 |
null |
2024-07-19 |
Multimodal Misinformation Detection using Large Vision-Language Models |
Sahar Tahmasebi et.al. |
2407.14321 |
null |
2024-07-18 |
Latent Causal Probing: A Formal Perspective on Probing with Causal Models of Data |
Charles Jin et.al. |
2407.13765 |
null |
2024-07-18 |
SegPoint: Segment Any Point Cloud via Large Language Model |
Shuting He et.al. |
2407.13761 |
null |
2024-07-18 |
Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models |
Zhuo Chen et.al. |
2407.13757 |
null |
2024-07-18 |
CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications |
Mirza Masfiqur Rahman et.al. |
2407.13742 |
null |
2024-07-18 |
Baba Is AI: Break the Rules to Beat the Benchmark |
Nathan Cloos et.al. |
2407.13729 |
null |
2024-07-18 |
CoDefeater: Using LLMs To Find Defeaters in Assurance Cases |
Usman Gohar et.al. |
2407.13717 |
link |
2024-07-18 |
Understanding Reference Policies in Direct Preference Optimization |
Yixin Liu et.al. |
2407.13709 |
link |
2024-07-18 |
A Comprehensive Review of Recommender Systems: Transitioning from Theory to Practice |
Shaina Raza et.al. |
2407.13699 |
null |
2024-07-18 |
Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation |
Yotam Perlitz et.al. |
2407.13696 |
link |
2024-07-18 |
Prover-Verifier Games improve legibility of LLM outputs |
Jan Hendrik Kirchner et.al. |
2407.13692 |
null |
2024-07-18 |
Shaded Route Planning Using Active Segmentation and Identification of Satellite Images |
Longchao Da et.al. |
2407.13689 |
null |
2024-07-18 |
FuLG: 150B Romanian Corpus for Language Model Pretraining |
Vlad-Andrei Bădoiu et.al. |
2407.13657 |
null |
2024-07-18 |
COMCAT: Leveraging Human Judgment to Improve Automatic Documentation and Summarization |
Skyler Grandel et.al. |
2407.13648 |
null |
2024-07-18 |
Weak-to-Strong Reasoning |
Yuqing Yang et.al. |
2407.13647 |
link |
2024-07-18 |
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies |
Chaofan Tao et.al. |
2407.13623 |
link |
2024-07-18 |
KNOWNET: Guided Health Information Seeking from LLMs via Knowledge Graph Integration |
Youfu Yan et.al. |
2407.13598 |
null |
2024-07-18 |
PLANTS: A Novel Problem and Dataset for Summarization of Planning-Like (PL) Tasks |
Vishal Pallagani et.al. |
2407.13597 |
null |
2024-07-18 |
EarthMarker: A Visual Prompt Learning Framework for Region-level and Point-level Remote Sensing Imagery Comprehension |
Wei Zhang et.al. |
2407.13596 |
link |
2024-07-18 |
Robust Calibration of Large Vision-Language Adapters |
Balamurali Murugesan et.al. |
2407.13588 |
link |
2024-07-18 |
Towards Zero-Shot Multimodal Machine Translation |
Matthieu Futeral et.al. |
2407.13579 |
link |
2024-07-17 |
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models |
Kaichen Zhang et.al. |
2407.12772 |
link |
2024-07-17 |
EchoSight: Advancing Visual-Language Models with Wiki Knowledge |
Yibin Yan et.al. |
2407.12735 |
null |
2024-07-17 |
NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model |
Zhongqun Zhang et.al. |
2407.12727 |
null |
2024-07-17 |
Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models? |
Ben Yao et.al. |
2407.12725 |
null |
2024-07-17 |
The Future of Learning: Large Language Models through the Lens of Students |
He Zhang et.al. |
2407.12723 |
null |
2024-07-17 |
MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models |
Leyang Shen et.al. |
2407.12709 |
link |
2024-07-17 |
Subgraph-Aware Training of Text-based Methods for Knowledge Graph Completion |
Youmin Ko et.al. |
2407.12703 |
null |
2024-07-17 |
Patch-Level Training for Large Language Models |
Chenze Shao et.al. |
2407.12665 |
link |
2024-07-17 |
Zero-shot Text-guided Infinite Image Synthesis with LLM guidance |
Soyeong Kwon et.al. |
2407.12642 |
null |
2024-07-17 |
Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification? |
Aman Sinha et.al. |
2407.12626 |
null |
2024-07-17 |
Harnessing the Power of Artificial Intelligence to Vitalize Endangered Indigenous Languages: Technologies and Experiences |
Claudio Pinhanez et.al. |
2407.12620 |
null |
2024-07-17 |
AudienceView: AI-Assisted Interpretation of Audience Feedback in Journalism |
William Brannon et.al. |
2407.12613 |
link |
2024-07-17 |
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding |
Ofir Abramovich et.al. |
2407.12594 |
null |
2024-07-18 |
Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks |
Antoni Kowalczuk et.al. |
2407.12588 |
link |
2024-07-17 |
E5-V: Universal Embeddings with Multimodal Large Language Models |
Ting Jiang et.al. |
2407.12580 |
link |
2024-07-17 |
Audio Conditioning for Music Generation via Discrete Bottleneck Features |
Simon Rouard et.al. |
2407.12563 |
null |
2024-07-17 |
Conspiracy theories and where to find them on TikTok |
Francesco Corso et.al. |
2407.12545 |
null |
2024-07-17 |
Abstraction Alignment: Comparing Model and Human Conceptual Relationships |
Angie Boggust et.al. |
2407.12543 |
link |
2024-07-17 |
Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models |
Xihe Qiu et.al. |
2407.12532 |
null |
2024-07-17 |
Crafting the Path: Robust Query Rewriting for Information Retrieval |
Ingeol Baek et.al. |
2407.12529 |
null |
2024-07-16 |
UrbanWorld: An Urban World Model for 3D City Generation |
Yu Shang et.al. |
2407.11965 |
link |
2024-07-16 |
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? |
Mo Li et.al. |
2407.11963 |
link |
2024-07-16 |
Code Documentation and Analysis to Secure Software Development |
Paul Attie et.al. |
2407.11934 |
null |
2024-07-16 |
What’s Wrong? Refining Meeting Summaries with LLM Feedback |
Frederic Kirstein et.al. |
2407.11919 |
null |
2024-07-16 |
GraphFM: A Scalable Framework for Multi-Graph Pretraining |
Divyansha Lachi et.al. |
2407.11907 |
null |
2024-07-16 |
Ascend-CC: Confidential Computing on Heterogeneous NPU for Emerging Generative AI Workloads |
Aritra Dhar et.al. |
2407.11888 |
null |
2024-07-16 |
Zero-shot Cross-Lingual Transfer for Synthetic Data Generation in Grammatical Error Detection |
Gaetan Lopez Latouche et.al. |
2407.11854 |
null |
2024-07-16 |
Schema Matching with Large Language Models: an Experimental Study |
Marcel Parciak et.al. |
2407.11852 |
link |
2024-07-16 |
LoFTI: Localization and Factuality Transfer to Indian Locales |
Sona Elza Simon et.al. |
2407.11833 |
link |
2024-07-16 |
GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text |
Kyle Hamilton et.al. |
2407.11827 |
null |
2024-07-16 |
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation |
Branden Butler et.al. |
2407.11798 |
null |
2024-07-16 |
Large Language Models as Misleading Assistants in Conversation |
Betty Li Hou et.al. |
2407.11789 |
null |
2024-07-16 |
SwitchCIT: Switching for Continual Instruction Tuning of Large Language Models |
Xinbo Wu et.al. |
2407.11780 |
null |
2024-07-16 |
Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text |
Seyedeh Fatemeh Ebrahimi et.al. |
2407.11774 |
null |
2024-07-16 |
Educational Personalized Learning Path Planning with Large Language Models |
Chee Ng et.al. |
2407.11773 |
null |
2024-07-16 |
XEdgeAI: A Human-centered Industrial Inspection Framework with Data-centric Explainable Edge AI Approach |
Truong Thanh Hung Nguyen et.al. |
2407.11771 |
link |
2024-07-16 |
Robust Utility-Preserving Text Anonymization Based on Large Language Models |
Tianyu Yang et.al. |
2407.11770 |
link |
2024-07-16 |
Vectoring Languages |
Joseph Chen et.al. |
2407.11766 |
null |
2024-07-16 |
Exploring Quantization for Efficient Pre-Training of Transformer Language Models |
Kamran Chitsaz et.al. |
2407.11722 |
link |
2024-07-16 |
Harnessing Large Language Models for Multimodal Product Bundling |
Xiaohao Liu et.al. |
2407.11712 |
null |
2024-07-15 |
VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation |
Bocheng Zou et.al. |
2407.10972 |
link |
2024-07-15 |
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated |
Hongyu Wang et.al. |
2407.10969 |
null |
2024-07-15 |
Fast Matrix Multiplications for Lookup Table-Quantized LLMs |
Han Guo et.al. |
2407.10960 |
link |
2024-07-15 |
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? |
Ruisheng Cao et.al. |
2407.10956 |
link |
2024-07-15 |
MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models |
Chengguang Gan et.al. |
2407.10953 |
null |
2024-07-15 |
Can Textual Semantics Mitigate Sounding Object Segmentation Preference? |
Yaoting Wang et.al. |
2407.10947 |
link |
2024-07-15 |
Learning from Naturally Occurring Feedback |
Shachar Don-Yehiya et.al. |
2407.10944 |
link |
2024-07-15 |
GRUtopia: Dream General Robots in a City at Scale |
Hanqing Wang et.al. |
2407.10943 |
link |
2024-07-15 |
Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together |
Dilara Soylu et.al. |
2407.10930 |
null |
2024-07-15 |
Benchmarking Vision Language Models for Cultural Understanding |
Shravan Nayak et.al. |
2407.10920 |
null |
2024-07-15 |
FinDKG: Dynamic Knowledge Graphs with Large Language Models for Detecting Global Trends in Financial Markets |
Xiaohui Victor Li et.al. |
2407.10909 |
link |
2024-07-15 |
Hey, That’s My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique |
Mark Russinovich et.al. |
2407.10887 |
null |
2024-07-15 |
SLIP: Securing LLMs IP Using Weights Decomposition |
Yehonathan Refael et.al. |
2407.10886 |
null |
2024-07-15 |
Understanding the Importance of Evolutionary Search in Automated Heuristic Design with Large Language Models |
Rui Zhang et.al. |
2407.10873 |
null |
2024-07-15 |
GPT Sonography: Hand Gesture Decoding from Forearm Ultrasound Images via VLM |
Keshav Bimbraw et.al. |
2407.10870 |
null |
2024-07-15 |
Physics-Inspired Generative Models in Medical Imaging: A Review |
Dennis Hein et.al. |
2407.10856 |
null |
2024-07-15 |
Weighted Grouped Query Attention in Transformers |
Sai Sena Chinnakonduru et.al. |
2407.10855 |
null |
2024-07-15 |
An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases |
Dylan Bouchard et.al. |
2407.10853 |
null |
2024-07-15 |
MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs |
Quang H. Nguyen et.al. |
2407.10834 |
null |
2024-07-15 |
BiasScanner: Automatic Detection and Classification of News Bias to Strengthen Democracy |
Tim Menzner et.al. |
2407.10829 |
null |
2024-07-12 |
FairyLandAI: Personalized Fairy Tales utilizing ChatGPT and DALLE-3 |
Georgios Makridis et.al. |
2407.09467 |
null |
2024-07-12 |
Human-like Episodic Memory for Infinite Context LLMs |
Zafeirios Fountas et.al. |
2407.09450 |
link |
2024-07-12 |
ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts |
Amelia F. Hardy et.al. |
2407.09447 |
link |
2024-07-12 |
MUSCLE: A Model Update Strategy for Compatible LLM Evolution |
Jessica Echterhoff et.al. |
2407.09435 |
null |
2024-07-12 |
A Perspective on Foundation Models for the Electric Power Grid |
Hendrik F. Hamann et.al. |
2407.09434 |
null |
2024-07-12 |
Open (Clinical) LLMs are Sensitive to Instruction Phrasings |
Alberto Mario Ceballos Arroyo et.al. |
2407.09429 |
link |
2024-07-12 |
TelecomGPT: A Framework to Build Telecom-Specific Large Language Models |
Hang Zou et.al. |
2407.09424 |
null |
2024-07-12 |
Mitigating Entity-Level Hallucination in Large Language Models |
Weihang Su et.al. |
2407.09417 |
link |
2024-07-12 |
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers |
Shraman Pramanick et.al. |
2407.09413 |
link |
2024-07-12 |
Deep Bag-of-Words Model: An Efficient and Interpretable Relevance Architecture for Chinese E-Commerce |
Zhe Lin et.al. |
2407.09395 |
null |
2024-07-12 |
PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents |
Saber Zerhoudi et.al. |
2407.09394 |
link |
2024-07-12 |
GAVEL: Generating Games Via Evolution and Language Models |
Graham Todd et.al. |
2407.09388 |
link |
2024-07-12 |
Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text |
Lucio La Cava et.al. |
2407.09364 |
null |
2024-07-12 |
Good Intentions, Risky Inventions: A Method for Assessing the Risks and Benefits of AI in Mobile and Wearable Uses |
Marios Constantinides et.al. |
2407.09322 |
link |
2024-07-12 |
Scalability of Bayesian Network Structure Elicitation with Large Language Models: a Novel Methodology and Comparative Analysis |
Nikolay Babakov et.al. |
2407.09311 |
null |
2024-07-12 |
Transformer Layers as Painters |
Qi Sun et.al. |
2407.09298 |
link |
2024-07-12 |
Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study |
Yulong Yang et.al. |
2407.09295 |
null |
2024-07-12 |
CEIPA: Counterfactual Explainable Incremental Prompt Attack Analysis on Large Language Models |
Dong Shu et.al. |
2407.09292 |
null |
2024-07-12 |
Structuring Authenticity Assessments on Historical Documents using LLMs |
Andrea Schimmenti et.al. |
2407.09290 |
null |
2024-07-12 |
WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation |
Robin Schön et.al. |
2407.09288 |
link |
2024-07-11 |
MAVIS: Mathematical Visual Instruction Tuning |
Renrui Zhang et.al. |
2407.08739 |
link |
2024-07-11 |
Real-Time Anomaly Detection and Reactive Planning with Large Language Models |
Rohan Sinha et.al. |
2407.08735 |
null |
2024-07-11 |
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist |
Zihao Zhou et.al. |
2407.08733 |
null |
2024-07-11 |
A Taxonomy for Data Contamination in Large Language Models |
Medha Palavalli et.al. |
2407.08716 |
null |
2024-07-11 |
GTA: A Benchmark for General Tool Agents |
Jize Wang et.al. |
2407.08713 |
link |
2024-07-11 |
eyeballvul: a future-proof benchmark for vulnerability detection in the wild |
Timothee Chauvin et.al. |
2407.08708 |
link |
2024-07-11 |
Extracting Training Data from Document-Based VQA Models |
Francesco Pinto et.al. |
2407.08707 |
null |
2024-07-11 |
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models |
Runhui Huang et.al. |
2407.08706 |
null |
2024-07-11 |
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models |
Zhening Xing et.al. |
2407.08701 |
null |
2024-07-11 |
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging |
Anton Alexandrov et.al. |
2407.08699 |
null |
2024-07-11 |
Cloud Atlas: Efficient Fault Localization for Cloud Systems using Language Models and Causal Insight |
Zhiqiang Xie et.al. |
2407.08694 |
null |
2024-07-11 |
Robotic Control via Embodied Chain-of-Thought Reasoning |
Zawalski Michał et.al. |
2407.08693 |
null |
2024-07-11 |
SEED-Story: Multimodal Long Story Generation with Large Language Model |
Shuai Yang et.al. |
2407.08683 |
link |
2024-07-11 |
NODE-Adapter: Neural Ordinary Differential Equations for Better Vision-Language Reasoning |
Yi Zhang et.al. |
2407.08672 |
null |
2024-07-11 |
Uncertainty Estimation of Large Language Models in Medical Question Answering |
Jiaxin Wu et.al. |
2407.08662 |
null |
2024-07-11 |
Towards Building Specialized Generalist AI with System 1 and System 2 Fusion |
Kaiyan Zhang et.al. |
2407.08642 |
null |
2024-07-11 |
$β$-DPO: Direct Preference Optimization with Dynamic $β$ |
Junkang Wu et.al. |
2407.08639 |
link |
2024-07-11 |
RoboMorph: Evolving Robot Morphology using Large Language Models |
Kevin Qiu et.al. |
2407.08626 |
null |
2024-07-11 |
Tamil Language Computing: the Present and the Future |
Kengatharaiyer Sarveswaran et.al. |
2407.08618 |
null |
2024-07-11 |
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision |
Jay Shah et.al. |
2407.08608 |
link |
2024-07-10 |
Training on the Test Task Confounds Evaluation and Emergence |
Ricardo Dominguez-Olmedo et.al. |
2407.07890 |
link |
2024-07-10 |
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization |
Junkang Wu et.al. |
2407.07880 |
link |
2024-07-11 |
Toto: Time Series Optimized Transformer for Observability |
Ben Cohen et.al. |
2407.07874 |
null |
2024-07-10 |
FACTS About Building Retrieval Augmented Generation-based Chatbots |
Rama Akkiraju et.al. |
2407.07858 |
null |
2024-07-10 |
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training |
Sami Jaghouar et.al. |
2407.07852 |
link |
2024-07-10 |
Natural Language Mechanisms via Self-Resolution with Foundation Models |
Nicolas Della Penna et.al. |
2407.07845 |
null |
2024-07-10 |
Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective |
Shengjia Chen et.al. |
2407.07841 |
link |
2024-07-10 |
Decompose and Compare Consistency: Measuring VLMs’ Answer Reliability via Task-Decomposition Consistency Comparison |
Qian Yang et.al. |
2407.07840 |
null |
2024-07-10 |
Transformer Alignment in Large Language Models |
Murdock Aubry et.al. |
2407.07810 |
null |
2024-07-11 |
AVCap: Leveraging Audio-Visual Features as Text Tokens for Captioning |
Jongsuk Kim et.al. |
2407.07801 |
link |
2024-07-10 |
Attribute or Abstain: Large Language Models as Long Document Assistants |
Jan Buchmann et.al. |
2407.07799 |
link |
2024-07-11 |
Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard |
Oguzhan Topsakal et.al. |
2407.07796 |
link |
2024-07-10 |
Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities |
Tianjie Ju et.al. |
2407.07791 |
link |
2024-07-10 |
WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment |
Jiefu Ou et.al. |
2407.07778 |
null |
2024-07-10 |
Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs |
Hao-Tien Lewis Chiang et.al. |
2407.07775 |
null |
2024-07-10 |
Can ChatGPT Pass a Theory of Computing Course? |
Matei A. Golesteanu et.al. |
2407.07757 |
null |
2024-07-10 |
Fine-Tuning Large Language Models with User-Level Differential Privacy |
Zachary Charles et.al. |
2407.07737 |
null |
2024-07-10 |
PaliGemma: A versatile 3B VLM for transfer |
Lucas Beyer et.al. |
2407.07726 |
link |
2024-07-10 |
Why should we ever automate moral decision making? |
Vincent Conitzer et.al. |
2407.07671 |
null |
2024-07-10 |
A Proposed S.C.O.R.E. Evaluation Framework for Large Language Models: Safety, Consensus, Objectivity, Reproducibility and Explainability |
Ting Fang Tan et.al. |
2407.07666 |
null |
2024-07-09 |
AnyTaskTune: Advanced Domain-Specific Solutions through Task-Fine-Tuning |
Jiaxi Cui et.al. |
2407.07094 |
link |
2024-07-09 |
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation |
Liqun Ma et.al. |
2407.07093 |
link |
2024-07-09 |
CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation |
Tong Chen et.al. |
2407.07087 |
link |
2024-07-09 |
Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models |
Logan Cross et.al. |
2407.07086 |
link |
2024-07-09 |
Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities |
Shaltiel Shmidman et.al. |
2407.07080 |
null |
2024-07-09 |
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps |
Yung-Sung Chuang et.al. |
2407.07071 |
link |
2024-07-09 |
Prompting Techniques for Secure Code Generation: A Systematic Investigation |
Catherine Tony et.al. |
2407.07064 |
null |
2024-07-09 |
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence |
Weize Chen et.al. |
2407.07061 |
link |
2024-07-09 |
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model |
Wenqi Zhang et.al. |
2407.07053 |
link |
2024-07-09 |
ProtoSAM – One Shot Medical Image Segmentation With Foundational Models |
Lev Ayzenberg et.al. |
2407.07042 |
link |
2024-07-09 |
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models |
Yue Zhang et.al. |
2407.07035 |
link |
2024-07-09 |
Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization |
Jeongseok Hyun et.al. |
2407.07024 |
link |
2024-07-09 |
Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies |
Inwon Kang et.al. |
2407.07019 |
null |
2024-07-09 |
End-To-End Causal Effect Estimation from Unstructured Natural Language Data |
Nikita Dhawan et.al. |
2407.07018 |
null |
2024-07-09 |
Is Large Language Model All You Need to Predict the Synthesizability and Precursors of Crystal Structures? |
Zhilong Song et.al. |
2407.07016 |
null |
2024-07-09 |
Induction Heads as an Essential Mechanism for Pattern Matching in In-context Learning |
J. Crosbie et.al. |
2407.07011 |
null |
2024-07-09 |
Metron: Holistic Performance Evaluation Framework for LLM Inference Systems |
Amey Agrawal et.al. |
2407.07000 |
link |
2024-07-09 |
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective |
Yu-An Liu et.al. |
2407.06992 |
link |
2024-07-09 |
Segment-Based Interactive Machine Translation for Pre-trained Models |
Angel Navarro et.al. |
2407.06990 |
null |
2024-07-09 |
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models |
Yi-Cheng Lin et.al. |
2407.06957 |
link |
2024-07-08 |
Multi-Object Hallucination in Vision-Language Models |
Xuweiyi Chen et.al. |
2407.06192 |
link |
2024-07-08 |
4D Contrastive Superflows are Dense 3D Representation Learners |
Xiang Xu et.al. |
2407.06190 |
link |
2024-07-08 |
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision |
Orr Zohar et.al. |
2407.06189 |
link |
2024-07-08 |
CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation |
Xinying Guo et.al. |
2407.06188 |
null |
2024-07-08 |
JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation |
Yu Zeng et.al. |
2407.06187 |
null |
2024-07-08 |
Vision-Language Models under Cultural and Inclusive Considerations |
Antonia Karamolegkou et.al. |
2407.06177 |
null |
2024-07-08 |
On Speeding Up Language Model Evaluation |
Jin Peng Zhou et.al. |
2407.06172 |
null |
2024-07-08 |
What’s Wrong with Your Code Generated by Large Language Models? An Extensive Study |
Shihan Dou et.al. |
2407.06153 |
null |
2024-07-09 |
Using Grammar Masking to Ensure Syntactic Validity in LLM-based Modeling Tasks |
Lukas Netz et.al. |
2407.06146 |
null |
2024-07-08 |
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation |
Ethan Chern et.al. |
2407.06135 |
link |
2024-07-08 |
Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization |
Hannah K. Bako et.al. |
2407.06129 |
link |
2024-07-08 |
Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities |
Avinash Anand et.al. |
2407.06125 |
null |
2024-07-08 |
Enhancing Language Model Rationality with Bi-Directional Deliberation Reasoning |
Yadong Zhang et.al. |
2407.06112 |
null |
2024-07-08 |
Artificial Intuition: Efficient Classification of Scientific Abstracts |
Harsh Sakhrani et.al. |
2407.06093 |
null |
2024-07-08 |
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models |
Jinliang Lu et.al. |
2407.06089 |
null |
2024-07-08 |
From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty |
Maor Ivgi et.al. |
2407.06071 |
link |
2024-07-08 |
Variational Best-of-N Alignment |
Afra Amini et.al. |
2407.06057 |
null |
2024-07-08 |
MST5 – Multilingual Question Answering over Knowledge Graphs |
Nikit Srivastava et.al. |
2407.06041 |
link |
2024-07-08 |
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System |
Miao Zheng et.al. |
2407.06027 |
null |
2024-07-08 |
iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement |
Aoyu Pang et.al. |
2407.06025 |
link |
2024-07-05 |
Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs |
Rudolf Laine et.al. |
2407.04694 |
link |
2024-07-05 |
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models |
Yuzhe Gu et.al. |
2407.04693 |
link |
2024-07-05 |
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge |
Yuanze Lin et.al. |
2407.04681 |
null |
2024-07-05 |
Lost in Translation: The Algorithmic Gap Between LMs and the Brain |
Tommaso Tosato et.al. |
2407.04680 |
null |
2024-07-05 |
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition |
Ye Bai et.al. |
2407.04675 |
null |
2024-07-05 |
Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement |
Yongji Wu et.al. |
2407.04656 |
null |
2024-07-05 |
Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models |
Bolaji Yusuf et.al. |
2407.04641 |
null |
2024-07-05 |
Entity Decomposition with Filtering: A Zero-Shot Clinical Named Entity Recognition Framework |
Reza Averly et.al. |
2407.04629 |
null |
2024-07-05 |
On scalable oversight with weak LLMs judging strong LLMs |
Zachary Kenton et.al. |
2407.04622 |
null |
2024-07-05 |
CountGD: Multi-Modal Open-World Counting |
Niki Amini-Naieni et.al. |
2407.04619 |
null |
2024-07-05 |
ARM: Efficient Guided Decoding with Autoregressive Reward Models |
Sergey Troshin et.al. |
2407.04615 |
null |
2024-07-05 |
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation |
Yuhan Zhu et.al. |
2407.04603 |
link |
2024-07-05 |
Written Term Detection Improves Spoken Term Detection |
Bolaji Yusuf et.al. |
2407.04601 |
link |
2024-07-05 |
Testing learning hypotheses using neural networks by manipulating learning data |
Cara Su-Yi Leong et.al. |
2407.04593 |
null |
2024-07-05 |
Leveraging Large Language Models for Integrated Satellite-Aerial-Terrestrial Networks: Recent Advances and Future Directions |
Shumaila Javaid et.al. |
2407.04581 |
null |
2024-07-05 |
VRSD: Rethinking Similarity and Diversity for Retrieval in Large Language Models |
Hang Gao et.al. |
2407.04573 |
null |
2024-07-05 |
Not (yet) the whole story: Evaluating Visual Storytelling Requires More than Measuring Coherence, Grounding, and Repetition |
Aditya K Surikuchi et.al. |
2407.04559 |
link |
2024-07-05 |
Spontaneous Reward Hacking in Iterative Self-Refinement |
Jane Pan et.al. |
2407.04549 |
null |
2024-07-05 |
PoPreRo: A New Dataset for Popularity Prediction of Romanian Reddit Posts |
Ana-Cristina Rogoz et.al. |
2407.04541 |
link |
2024-07-05 |
GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning |
Aleksander Ficek et.al. |
2407.04528 |
null |
2024-07-03 |
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages |
Max Zuo et.al. |
2407.03321 |
link |
2024-07-03 |
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output |
Pan Zhang et.al. |
2407.03320 |
link |
2024-07-03 |
BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations |
Zhantao Yang et.al. |
2407.03314 |
null |
2024-07-03 |
Universal Length Generalization with Turing Programs |
Kaiying Hou et.al. |
2407.03310 |
null |
2024-07-03 |
Large Language Models for JSON Schema Discovery |
Michael J. Mior et.al. |
2407.03286 |
null |
2024-07-03 |
LLM Internal States Reveal Hallucination Risk Faced With a Query |
Ziwei Ji et.al. |
2407.03282 |
link |
2024-07-03 |
STF: Sentence Transformer Fine-Tuning For Topic Categorization With Limited Data |
Kheir Eddine Daouadi et.al. |
2407.03253 |
null |
2024-07-03 |
Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning |
Zhili Shen et.al. |
2407.03227 |
null |
2024-07-03 |
How Does Quantization Affect Multilingual LLMs? |
Kelly Marchisio et.al. |
2407.03211 |
null |
2024-07-03 |
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts |
Ruida Wang et.al. |
2407.03203 |
link |
2024-07-03 |
Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models |
Haritz Puerto et.al. |
2407.03181 |
link |
2024-07-03 |
Investigating Decoder-only Large Language Models for Speech-to-text Translation |
Chao-Wei Huang et.al. |
2407.03169 |
null |
2024-07-03 |
SOS! Soft Prompt Attack Against Open-Source Large Language Models |
Ziqing Yang et.al. |
2407.03160 |
null |
2024-07-03 |
Let the Code LLM Edit Itself When You Edit the Code |
Zhenyu He et.al. |
2407.03157 |
null |
2024-07-03 |
Reinforcement Learning for Sequence Design Leveraging Protein Language Models |
Jithendaraa Subramanian et.al. |
2407.03154 |
null |
2024-07-03 |
Enhancing Translation Accuracy of Large Language Models through Continual Pre-Training on Parallel Data |
Minato Kondo et.al. |
2407.03145 |
null |
2024-07-03 |
Social Bias Evaluation for Large Language Models Requires Prompt Variations |
Rem Hida et.al. |
2407.03129 |
link |
2024-07-03 |
KeyVideoLLM: Towards Large-scale Video Keyframe Selection |
Hao Liang et.al. |
2407.03104 |
null |
2024-07-03 |
Cactus: Towards Psychological Counseling Conversations using Cognitive Behavioral Theory |
Suyeon Lee et.al. |
2407.03103 |
link |
2024-07-03 |
ScreenTK: Seamless Detection of Time-Killing Moments Using Continuous Mobile Screen Text Monitoring |
Le Fang et.al. |
2407.03063 |
null |
2024-07-02 |
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention |
Huiqiang Jiang et.al. |
2407.02490 |
link |
2024-07-02 |
Neurocache: Efficient Vector Retrieval for Long-range Language Modeling |
Ali Safaya et.al. |
2407.02486 |
link |
2024-07-02 |
RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs |
Yue Yu et.al. |
2407.02485 |
null |
2024-07-02 |
MMedAgent: Learning to Use Medical Tools with Multi-modal Agent |
Binxu Li et.al. |
2407.02483 |
link |
2024-07-02 |
Understanding Alignment in Multimodal LLMs: A Comprehensive Study |
Elmira Amirloo et.al. |
2407.02477 |
null |
2024-07-02 |
Open Scene Graphs for Open World Object-Goal Navigation |
Joel Loo et.al. |
2407.02473 |
null |
2024-07-02 |
ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions |
Chan Young Park et.al. |
2407.02472 |
link |
2024-07-02 |
Reliable Confidence Intervals for Information Retrieval Evaluation Using Generative A.I |
Harrie Oosterhuis et.al. |
2407.02464 |
null |
2024-07-02 |
Ensemble of pre-trained language models and data augmentation for hate speech detection from Arabic tweets |
Kheir Eddine Daouadi et.al. |
2407.02448 |
null |
2024-07-03 |
Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs |
Jinmin Li et.al. |
2407.02411 |
null |
2024-07-02 |
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models |
Song Wang et.al. |
2407.02408 |
null |
2024-07-02 |
Assessing the Code Clone Detection Capability of Large Language Models |
Zixian Zhang et.al. |
2407.02402 |
null |
2024-07-02 |
Learning to Refine with Fine-Grained Natural Language Feedback |
Manya Wadhwa et.al. |
2407.02397 |
link |
2024-07-02 |
Is Your AI-Generated Code Really Secure? Evaluating Large Language Models on Secure Code Generation with CodeSecEval |
Jiexin Wang et.al. |
2407.02395 |
null |
2024-07-02 |
TokenPacker: Efficient Visual Projector for Multimodal LLM |
Wentong Li et.al. |
2407.02392 |
link |
2024-07-02 |
Talking to Machines: do you read me? |
Lina M. Rojas-Barahona et.al. |
2407.02354 |
null |
2024-07-02 |
Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification |
Pritish Sahu et.al. |
2407.02352 |
null |
2024-07-02 |
Generative Large Language Models in Automated Fact-Checking: A Survey |
Ivan Vykopal et.al. |
2407.02351 |
null |
2024-07-02 |
Conceptual Codebook Learning for Vision-Language Models |
Yi Zhang et.al. |
2407.02350 |
null |
2024-07-02 |
MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space |
Yihong Tang et.al. |
2407.02345 |
null |
2024-06-28 |
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs |
Sukmin Yun et.al. |
2406.20098 |
link |
2024-06-28 |
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy |
Xiang Li et.al. |
2406.20095 |
link |
2024-06-28 |
Scaling Synthetic Data Creation with 1,000,000,000 Personas |
Xin Chan et.al. |
2406.20094 |
link |
2024-06-28 |
LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression |
Jieneng Chen et.al. |
2406.20092 |
link |
2024-06-28 |
ProgressGym: Alignment with a Millennium of Moral Progress |
Tianyi Qiu et.al. |
2406.20087 |
link |
2024-06-28 |
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language |
Yicheng Chen et.al. |
2406.20085 |
null |
2024-06-28 |
Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification |
Anisha Gunjal et.al. |
2406.20079 |
link |
2024-06-28 |
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model |
Yuxuan Zhang et.al. |
2406.20076 |
link |
2024-06-28 |
To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models |
Bastien Liétard et.al. |
2406.20054 |
null |
2024-06-28 |
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation |
Danny Halawi et.al. |
2406.20053 |
null |
2024-07-01 |
BMW Agents – A Framework For Task Automation Through Multi-Agent Collaboration |
Noel Crawford et.al. |
2406.20041 |
null |
2024-06-28 |
BioMNER: A Dataset for Biomedical Method Entity Recognition |
Chen Tang et.al. |
2406.20038 |
null |
2024-06-28 |
LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models |
Renzhi Wang et.al. |
2406.20030 |
null |
2024-06-28 |
ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models |
Yuxiang Zhang et.al. |
2406.20015 |
link |
2024-06-28 |
The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models |
Xinyi Chen et.al. |
2406.19999 |
link |
2024-06-28 |
Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model |
Habib Hajimolahoseini et.al. |
2406.19995 |
null |
2024-06-28 |
ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting |
Rui Pan et.al. |
2406.19976 |
null |
2024-06-28 |
STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical |
Guohao Sun et.al. |
2406.19973 |
link |
2024-06-28 |
Into the Unknown: Generating Geospatial Descriptions for New Environments |
Tzuf Paz-Argaman et.al. |
2406.19967 |
null |
2024-06-28 |
Simulating Financial Market via Large Language Model based Agents |
Shen Gao et.al. |
2406.19966 |
null |
2024-06-27 |
ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos |
Jr-Jen Chen et.al. |
2406.19392 |
link |
2024-06-27 |
The Remarkable Robustness of LLMs: Stages of Inference? |
Vedang Lad et.al. |
2406.19384 |
link |
2024-06-27 |
The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in the Era of Large Language Models |
Xiliang Zhu et.al. |
2406.19358 |
null |
2024-06-27 |
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions |
Nigel Fernandez et.al. |
2406.19356 |
link |
2024-06-27 |
Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs? |
Peter Hase et.al. |
2406.19354 |
null |
2024-06-27 |
IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language |
Lucky Susanto et.al. |
2406.19349 |
null |
2024-06-27 |
Jump Starting Bandits with LLM-Generated Prior Knowledge |
Parand A. Alamdari et.al. |
2406.19317 |
link |
2024-06-27 |
MCNC: Manifold Constrained Network Compression |
Chayne Thrash et.al. |
2406.19301 |
null |
2024-06-27 |
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data |
Zheyang Xiong et.al. |
2406.19292 |
link |
2024-06-27 |
PhysioLLM: Supporting Personalized Health Insights with Wearables and Large Language Models |
Cathy Mengying Fang et.al. |
2406.19283 |
null |
2024-06-27 |
HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale |
Junying Chen et.al. |
2406.19280 |
link |
2024-06-27 |
VERISCORE: Evaluating the factuality of verifiable claims in long-form text generation |
Yixiao Song et.al. |
2406.19276 |
link |
2024-06-27 |
AutoPureData: Automated Filtering of Web Data for LLM Fine-tuning |
Praneeth Vadlapati et.al. |
2406.19271 |
link |
2024-06-27 |
Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding |
Yue Fan et.al. |
2406.19263 |
link |
2024-06-27 |
Enhancing Video-Language Representations with Structural Spatio-Temporal Alignment |
Hao Fei et.al. |
2406.19255 |
null |
2024-06-27 |
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation |
Jia Fu et.al. |
2406.19251 |
null |
2024-06-27 |
Revealing Fine-Grained Values and Opinions in Large Language Models |
Dustin Wright et.al. |
2406.19238 |
link |
2024-06-28 |
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts |
Shubhankar Singh et.al. |
2406.19237 |
null |
2024-06-27 |
Seeing Is Believing: Black-Box Membership Inference Attacks Against Retrieval Augmented Generation |
Yuying Li et.al. |
2406.19234 |
null |
2024-06-28 |
RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs |
Ekaterina Taktasheva et.al. |
2406.19232 |
link |
2024-06-26 |
Towards Compositionality in Concept Learning |
Adam Stein et.al. |
2406.18534 |
link |
2024-06-26 |
Symbolic Learning Enables Self-Evolving Agents |
Wangchunshu Zhou et.al. |
2406.18532 |
link |
2024-06-26 |
PrExMe! Large Scale Prompt Exploration of Open Source LLMs for Machine Translation and Summarization Evaluation |
Christoph Leiter et.al. |
2406.18528 |
link |
2024-06-26 |
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs |
Zirui Wang et.al. |
2406.18521 |
link |
2024-06-26 |
“Is ChatGPT a Better Explainer than My Professor?”: Evaluating the Explanation Capabilities of LLMs in Conversation Compared to a Human Baseline |
Grace Li et.al. |
2406.18512 |
null |
2024-06-26 |
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models |
Liwei Jiang et.al. |
2406.18510 |
link |
2024-06-26 |
Mental Modeling of Reinforcement Learning Agents by Language Models |
Wenhao Lu et.al. |
2406.18505 |
null |
2024-06-26 |
Is In-Context Learning a Type of Gradient-Based Learning? Evidence from the Inverse Frequency Effect in Structural Priming |
Zhenghao Zhou et.al. |
2406.18501 |
null |
2024-06-26 |
Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation |
Ahmed Njifenjou et.al. |
2406.18460 |
null |
2024-06-26 |
Cascading Large Language Models for Salient Event Graph Generation |
Xingwei Tan et.al. |
2406.18449 |
link |
2024-06-26 |
New intelligent empowerment for digital transformation |
Peng Yifeng et.al. |
2406.18440 |
null |
2024-06-26 |
IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons |
Dan Shi et.al. |
2406.18406 |
link |
2024-06-26 |
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers |
Yibo Jiang et.al. |
2406.18400 |
null |
2024-06-26 |
Adversarial Search Engine Optimization for Large Language Models |
Fredrik Nestaas et.al. |
2406.18382 |
null |
2024-06-26 |
MALSIGHT: Exploring Malicious Source Code and Benign Pseudocode for Iterative Binary Malware Summarization |
Haolang Lu et.al. |
2406.18379 |
null |
2024-06-26 |
Themis: Towards Flexible and Interpretable NLG Evaluation |
Xinyu Hu et.al. |
2406.18365 |
link |
2024-06-26 |
AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations |
Adam Dahlgren Lindström et.al. |
2406.18346 |
null |
2024-06-26 |
PDFA Distillation via String Probability Queries |
Robert Baumgartner et.al. |
2406.18328 |
link |
2024-06-26 |
PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models |
Huixuan Zhang et.al. |
2406.18326 |
null |
2024-06-26 |
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data |
Meng Fang et.al. |
2406.18321 |
null |
2024-06-25 |
MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning |
Xiangyu Zhao et.al. |
2406.17770 |
link |
2024-06-25 |
EXTRACT: Efficient Policy Learning by Extracting Transferrable Robot Skills from Offline Data |
Jesse Zhang et.al. |
2406.17768 |
null |
2024-06-25 |
BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning |
Ercong Nie et.al. |
2406.17764 |
null |
2024-06-25 |
CaLMQA: Exploring culturally specific long-form question answering across 23 languages |
Shane Arora et.al. |
2406.17761 |
link |
2024-06-25 |
Accelerating Clinical Evidence Synthesis with Large Language Models |
Zifeng Wang et.al. |
2406.17755 |
null |
2024-06-25 |
Measuring and Benchmarking Large Language Models’ Capabilities to Generate Persuasive Language |
Amalie Brogaard Pauli et.al. |
2406.17753 |
null |
2024-06-25 |
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon |
USVSN Sai Prashanth et.al. |
2406.17746 |
link |
2024-06-25 |
Point-SAM: Promptable 3D Segmentation Model for Point Clouds |
Yuchen Zhou et.al. |
2406.17741 |
link |
2024-06-25 |
Find Parent then Label Children: A Two-stage Taxonomy Completion Method with Pre-trained Language Model |
Fei Xia et.al. |
2406.17739 |
null |
2024-06-25 |
LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users |
Elinor Poole-Dayan et.al. |
2406.17737 |
null |
2024-06-25 |
FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model |
Feijie Wu et.al. |
2406.17706 |
link |
2024-06-25 |
From Distributional to Overton Pluralism: Investigating Large Language Model Alignment |
Thom Lake et.al. |
2406.17692 |
link |
2024-06-25 |
VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation |
Kun Qian et.al. |
2406.17681 |
link |
2024-06-25 |
Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models |
Yuan Li et.al. |
2406.17675 |
null |
2024-06-25 |
LaTable: Towards Large Tabular Models |
Boris van Breugel et.al. |
2406.17673 |
null |
2024-06-25 |
LLM-ARC: Enhancing LLMs with an Automated Reasoning Critic |
Aditya Kalyanpur et.al. |
2406.17663 |
null |
2024-06-25 |
Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients |
Aashiq Muhamed et.al. |
2406.17660 |
link |
2024-06-25 |
DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning |
Xiaohan Zhang et.al. |
2406.17659 |
null |
2024-06-25 |
Leveraging Large Language Models for Software Model Completion: Results from Industrial and Public Datasets |
Christof Tinnes et.al. |
2406.17651 |
link |
2024-06-25 |
Variationist: Exploring Multifaceted Variation and Bias in Written Language Data |
Alan Ramponi et.al. |
2406.17647 |
link |
2024-06-24 |
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs |
Shengbang Tong et.al. |
2406.16860 |
link |
2024-06-24 |
EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees |
Yuhui Li et.al. |
2406.16858 |
link |
2024-06-24 |
Long Context Transfer from Language to Vision |
Peiyuan Zhang et.al. |
2406.16852 |
link |
2024-06-24 |
Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts |
Aditya Sharma et.al. |
2406.16851 |
null |
2024-06-24 |
RaTEScore: A Metric for Radiology Report Generation |
Weike Zhao et.al. |
2406.16845 |
link |
2024-06-24 |
From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models |
Sean Welleck et.al. |
2406.16838 |
null |
2024-06-24 |
USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations |
Mounika Marreddy et.al. |
2406.16833 |
null |
2024-06-24 |
Understanding and Mitigating Tokenization Bias in Language Models |
Buu Phan et.al. |
2406.16829 |
null |
2024-06-24 |
Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track |
Ronak Pradeep et.al. |
2406.16828 |
link |
2024-06-24 |
GPT-4V Explorations: Mining Autonomous Driving |
Zixuan Li et.al. |
2406.16817 |
null |
2024-06-24 |
RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale |
Beck LaBash et.al. |
2406.16801 |
link |
2024-06-24 |
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs |
Ashwinee Panda et.al. |
2406.16797 |
link |
2024-06-24 |
Adam-mini: Use Fewer Learning Rates To Gain More |
Yushun Zhang et.al. |
2406.16793 |
link |
2024-06-24 |
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models |
Rishabh Maheshwary et.al. |
2406.16783 |
null |
2024-06-24 |
It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension |
Sagi Shaier et.al. |
2406.16779 |
null |
2024-06-24 |
Finding Transformer Circuits with Edge Pruning |
Adithya Bhaskar et.al. |
2406.16778 |
link |
2024-06-24 |
Blending LLMs into Cascaded Speech Translation: KIT’s Offline Speech Translation System for IWSLT 2024 |
Sai Koneru et.al. |
2406.16777 |
null |
2024-06-24 |
WARP: On the Benefits of Weight Averaged Rewarded Policies |
Alexandre Ramé et.al. |
2406.16768 |
null |
2024-06-24 |
The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories |
Xi Yu Huang et.al. |
2406.16767 |
link |
2024-06-24 |
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters |
Euiin Yi et.al. |
2406.16758 |
link |
2024-06-21 |
GenoTEX: A Benchmark for Evaluating LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians |
Haoyang Liu et.al. |
2406.15341 |
link |
2024-06-21 |
Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance |
Haoling Li et.al. |
2406.15330 |
null |
2024-06-21 |
Bug In the Code Stack: Can LLMs Find Bugs in Large Python Code Stacks |
Hokyung Lee et.al. |
2406.15325 |
link |
2024-06-21 |
Cognitive Map for Language Models: Optimal Planning via Verbally Representing the World Model |
Doyoung Kim et.al. |
2406.15275 |
link |
2024-06-21 |
Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics |
Weijia Zhang et.al. |
2406.15264 |
null |
2024-06-21 |
Unsupervised Morphological Tree Tokenizer |
Qingyang Zhu et.al. |
2406.15245 |
null |
2024-06-21 |
Large Batch Analysis for Adagrad Under Anisotropic Smoothness |
Yuxing Liu et.al. |
2406.15244 |
null |
2024-06-21 |
Detecting Synthetic Lyrics with Few-Shot Inference |
Yanis Labrak et.al. |
2406.15231 |
null |
2024-06-21 |
A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation |
Irune Zubiaga et.al. |
2406.15227 |
link |
2024-06-21 |
Unsupervised Extraction of Dialogue Policies from Conversations |
Makesh Narsimhan Sreedhar et.al. |
2406.15214 |
null |
2024-06-21 |
Prompting Whisper for QA-driven Zero-shot End-to-end Spoken Language Understanding |
Mohan Li et.al. |
2406.15209 |
null |
2024-06-21 |
Exploring the Efficacy of Robotic Assistants with ChatGPT and Claude in Enhancing ADHD Therapy: Innovating Treatment Paradigms |
Santiago Berrezueta-Guzman et.al. |
2406.15198 |
null |
2024-06-21 |
UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-world Document Analysis |
Yulong Hui et.al. |
2406.15187 |
link |
2024-06-21 |
Hybrid Alignment Training for Large Language Models |
Chenglong Wang et.al. |
2406.15178 |
link |
2024-06-21 |
EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot |
Hao Fei et.al. |
2406.15177 |
link |
2024-06-21 |
Enhancing Idiomatic Representation in Multiple Languages via an Adaptive Contrastive Triplet Loss |
Wei He et.al. |
2406.15175 |
null |
2024-06-21 |
Évaluation des capacités de réponse de larges modèles de langage (LLM) pour des questions d’historiens |
Mathieu Chartier et.al. |
2406.15173 |
null |
2024-06-21 |
Assessing Good, Bad and Ugly Arguments Generated by ChatGPT: a New Dataset, its Methodology and Associated Tasks |
Victor Hugo Nascimento Rocha et.al. |
2406.15130 |
link |
2024-06-21 |
Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network |
Badr AlKhamissi et.al. |
2406.15109 |
link |
2024-06-21 |
PARIKSHA: A Large-Scale Investigation of Human-LLM Evaluator Agreement on Multilingual and Multi-Cultural Data |
Ishaan Watts et.al. |
2406.15053 |
null |
2024-06-20 |
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch |
Hasan Abed Al Kader Hammoud et.al. |
2406.14563 |
null |
2024-06-20 |
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities |
Sachit Menon et.al. |
2406.14562 |
null |
2024-06-20 |
How to Compute the Probability of a Word |
Tiago Pimentel et.al. |
2406.14561 |
link |
2024-06-21 |
Asynchronous Large Language Model Enhanced Planner for Autonomous Driving |
Yuan Chen et.al. |
2406.14556 |
link |
2024-06-20 |
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models |
Shilong Li et.al. |
2406.14550 |
null |
2024-06-20 |
Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Large Language Models |
Sunny Duan et.al. |
2406.14549 |
null |
2024-06-20 |
Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data |
Johannes Treutlein et.al. |
2406.14546 |
link |
2024-06-20 |
Unmasking Database Vulnerabilities: Zero-Knowledge Schema Inference Attacks in Text-to-SQL Systems |
Đorđe Klisura et.al. |
2406.14545 |
null |
2024-06-20 |
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs |
Yuxuan Qiao et.al. |
2406.14544 |
link |
2024-06-20 |
Are LLMs Naturally Good at Synthetic Tabular Data Generation? |
Shengzhe Xu et.al. |
2406.14541 |
link |
2024-06-20 |
PostMark: A Robust Blackbox Watermark for Large Language Models |
Yapei Chang et.al. |
2406.14517 |
link |
2024-06-20 |
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding |
Xinyu Fang et.al. |
2406.14515 |
link |
2024-06-20 |
Evidence of a log scaling law for political persuasion with large language models |
Kobi Hackenburg et.al. |
2406.14508 |
link |
2024-06-20 |
Overview of the CAIL 2023 Argument Mining Track |
Jingcong Liang et.al. |
2406.14503 |
null |
2024-06-20 |
Improving Expert Radiology Report Summarization by Prompting Large Language Models with a Layperson Summary |
Xingmeng Zhao et.al. |
2406.14500 |
null |
2024-06-20 |
LLaSA: Large Multimodal Agent for Human Activity Analysis Through Wearable Sensors |
Sheikh Asif Imran et.al. |
2406.14498 |
link |
2024-06-20 |
CodeRAG-Bench: Can Retrieval Augment Code Generation? |
Zora Zhiruo Wang et.al. |
2406.14497 |
link |
2024-06-20 |
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification |
Gregor Geigle et.al. |
2406.14496 |
link |
2024-06-20 |
Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? |
Gregor Geigle et.al. |
2406.14492 |
null |
2024-06-20 |
Instruction Pre-Training: Language Models are Supervised Multitask Learners |
Daixuan Cheng et.al. |
2406.14491 |
link |
2024-06-18 |
DrVideo: Document Retrieval Based Long Video Understanding |
Ziyu Ma et.al. |
2406.12846 |
null |
2024-06-18 |
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts |
Haoxiang Wang et.al. |
2406.12845 |
link |
2024-06-18 |
Synergizing Foundation Models and Federated Learning: A Survey |
Shenghui Li et.al. |
2406.12844 |
null |
2024-06-18 |
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation |
Ci-Siang Lin et.al. |
2406.12834 |
null |
2024-06-18 |
LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation |
Seyedarmin Azizi et.al. |
2406.12832 |
link |
2024-06-18 |
What Are the Odds? Language Models Are Capable of Probabilistic Reasoning |
Akshay Paruchuri et.al. |
2406.12830 |
link |
2024-06-18 |
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries |
Hitesh Wadhwa et.al. |
2406.12824 |
null |
2024-06-18 |
Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models? |
Pinzhen Chen et.al. |
2406.12822 |
null |
2024-06-18 |
Adversarial Attacks on Multimodal Agents |
Chen Henry Wu et.al. |
2406.12814 |
link |
2024-06-18 |
Can Large Language Models Always Solve Easy Problems if They Can Solve Harder Ones? |
Zhe Yang et.al. |
2406.12809 |
link |
2024-06-18 |
Identifying Performance-Sensitive Configurations in Software Systems through Code Analysis with LLM Agents |
Zehao Wang et.al. |
2406.12806 |
null |
2024-06-18 |
Supporting Human Raters with the Detection of Harmful Content using Large Language Models |
Kurt Thomas et.al. |
2406.12800 |
null |
2024-06-18 |
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools |
Team GLM et.al. |
2406.12793 |
link |
2024-06-18 |
In-Context Learning of Energy Functions |
Rylan Schaeffer et.al. |
2406.12785 |
null |
2024-06-18 |
UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions |
Xunzhi Wang et.al. |
2406.12784 |
link |
2024-06-18 |
Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries |
Eden Biran et.al. |
2406.12775 |
link |
2024-06-18 |
Towards Exact Gradient-based Training on Analog In-memory Computing |
Zhaoxian Wu et.al. |
2406.12774 |
null |
2024-06-18 |
GFM4MPM: Towards Geospatial Foundation Models for Mineral Prospectivity Mapping |
Angel Daruna et.al. |
2406.12756 |
null |
2024-06-18 |
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI |
Zhen Huang et.al. |
2406.12753 |
link |
2024-06-18 |
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning |
Bingchen Zhao et.al. |
2406.12742 |
link |
2024-06-17 |
LLaNA: Large Language and NeRF Assistant |
Andrea Amaduzzi et.al. |
2406.11840 |
null |
2024-06-17 |
mDPO: Conditional Preference Optimization for Multimodal Large Language Models |
Fei Wang et.al. |
2406.11839 |
null |
2024-06-17 |
MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs |
Ziyu Liu et.al. |
2406.11833 |
link |
2024-06-17 |
Unveiling Encoder-Free Vision-Language Models |
Haiwen Diao et.al. |
2406.11832 |
link |
2024-06-17 |
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models |
Bingqi Ma et.al. |
2406.11831 |
null |
2024-06-17 |
Language Modeling with Editable External Knowledge |
Belinda Z. Li et.al. |
2406.11830 |
link |
2024-06-17 |
WPO: Enhancing RLHF with Weighted Preference Optimization |
Wenxuan Zhou et.al. |
2406.11827 |
link |
2024-06-17 |
On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning |
Geewook Kim et.al. |
2406.11823 |
link |
2024-06-17 |
MegaScenes: Scene-Level View Synthesis at Scale |
Joseph Tung et.al. |
2406.11819 |
link |
2024-06-17 |
Embodied Instruction Following in Unknown Environments |
Zhenyu Wu et.al. |
2406.11818 |
null |
2024-06-17 |
Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level |
Jie Liu et.al. |
2406.11817 |
null |
2024-06-17 |
VideoLLM-online: Online Video Large Language Model for Streaming Video |
Joya Chen et.al. |
2406.11816 |
null |
2024-06-17 |
How Do Large Language Models Acquire Factual Knowledge During Pretraining? |
Hoyeon Chang et.al. |
2406.11813 |
link |
2024-06-17 |
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content |
Joao Monteiro et.al. |
2406.11811 |
link |
2024-06-17 |
Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations |
Rima Hazra et.al. |
2406.11801 |
link |
2024-06-17 |
DataComp-LM: In search of the next generation of training sets for language models |
Jeffrey Li et.al. |
2406.11794 |
null |
2024-06-17 |
CELL your Model: Contrastive Explanation Methods for Large Language Models |
Ronny Luss et.al. |
2406.11785 |
null |
2024-06-17 |
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs |
Swanand Ravindra Kadhe et.al. |
2406.11780 |
null |
2024-06-17 |
Improving Multi-Agent Debate with Sparse Communication Topology |
Yunxuan Li et.al. |
2406.11776 |
null |
2024-06-17 |
Task Me Anything |
Jieyu Zhang et.al. |
2406.11775 |
link |
2024-06-14 |
Quantifying Variance in Evaluation Benchmarks |
Lovish Madaan et.al. |
2406.10229 |
null |
2024-06-14 |
EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models |
Julian Straub et.al. |
2406.10224 |
link |
2024-06-14 |
Short Film Dataset (SFD): A Benchmark for Story-Level Video Understanding |
Ridouane Ghermi et.al. |
2406.10221 |
link |
2024-06-14 |
Semantic Membership Inference Attack against Large Language Models |
Hamid Mozaffari et.al. |
2406.10218 |
null |
2024-06-14 |
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs |
Rui Yang et.al. |
2406.10216 |
link |
2024-06-14 |
DevBench: A multimodal developmental benchmark for language learning |
Alvin Wei Ming Tan et.al. |
2406.10215 |
link |
2024-06-14 |
Be like a Goldfish, Don’t Memorize! Mitigating Memorization in Generative LLMs |
Abhimanyu Hans et.al. |
2406.10209 |
link |
2024-06-14 |
A Fundamental Trade-off in Aligned Language Models and its Relation to Sampling Adaptors |
Naaman Tan et.al. |
2406.10203 |
link |
2024-06-14 |
TRIP-PAL: Travel Planning with Guarantees by Combining Large Language Models and Automated Planners |
Tomas de la Rosa et.al. |
2406.10196 |
null |
2024-06-14 |
Detecting and Evaluating Medical Hallucinations in Large Vision Language Models |
Jiawei Chen et.al. |
2406.10185 |
null |
2024-06-14 |
Practical offloading for fine-tuning LLM on commodity GPU via learned subspace projectors |
Siyuan Chen et.al. |
2406.10181 |
null |
2024-06-14 |
Let the Poem Hit the Rhythm: Using a Byte-Based Transformer for Beat-Aligned Poetry Generation |
Mohamad Elzohbi et.al. |
2406.10174 |
link |
2024-06-14 |
IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce |
Wenxuan Ding et.al. |
2406.10173 |
link |
2024-06-14 |
Datasets for Multilingual Answer Sentence Selection |
Matteo Gabburo et.al. |
2406.10172 |
null |
2024-06-14 |
CarLLaVA: Vision language models for camera-only closed-loop driving |
Katrin Renz et.al. |
2406.10165 |
null |
2024-06-14 |
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models |
Carson Denison et.al. |
2406.10162 |
link |
2024-06-14 |
RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model |
Hantao Zhou et.al. |
2406.10157 |
null |
2024-06-14 |
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack |
Yuri Kuratov et.al. |
2406.10149 |
link |
2024-06-14 |
Evaluation of Large Language Models: STEM education and Gender Stereotypes |
Smilla Due et.al. |
2406.10133 |
null |
2024-06-14 |
The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models |
Yan Liu et.al. |
2406.10130 |
link |
2024-06-13 |
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding |
Muhammad Maaz et.al. |
2406.09418 |
link |
2024-06-13 |
Explore the Limits of Omni-modal Pretraining at Scale |
Yiyuan Zhang et.al. |
2406.09412 |
link |
2024-06-13 |
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities |
Roman Bachmann et.al. |
2406.09406 |
null |
2024-06-13 |
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models |
Yushi Hu et.al. |
2406.09403 |
null |
2024-06-13 |
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation |
Junke Wang et.al. |
2406.09399 |
link |
2024-06-13 |
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms |
Miaosen Zhang et.al. |
2406.09397 |
null |
2024-06-13 |
Too Many Frames, not all Useful: Efficient Strategies for Long-Form Video QA |
Jongwoo Park et.al. |
2406.09396 |
link |
2024-06-13 |
Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition |
Youngtaek Oh et.al. |
2406.09388 |
link |
2024-06-13 |
Towards Vision-Language Geo-Foundation Model: A Survey |
Yue Zhou et.al. |
2406.09385 |
link |
2024-06-13 |
Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models |
Lukas Thede et.al. |
2406.09384 |
null |
2024-06-13 |
Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs |
Zijia Zhao et.al. |
2406.09367 |
link |
2024-06-13 |
ElicitationGPT: Text Elicitation Mechanisms via Language Models |
Yifan Wu et.al. |
2406.09363 |
null |
2024-06-13 |
Enhancing Domain Adaptation through Prompt Gradient Alignment |
Hoang Phan et.al. |
2406.09353 |
link |
2024-06-13 |
Separations in the Representational Capabilities of Transformers and Recurrent Architectures |
Satwik Bhattamishra et.al. |
2406.09347 |
null |
2024-06-13 |
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding |
Suwon Shon et.al. |
2406.09345 |
null |
2024-06-13 |
ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models |
David Anugraha et.al. |
2406.09334 |
link |
2024-06-13 |
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space |
Tomer Ashuach et.al. |
2406.09325 |
null |
2024-06-13 |
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs |
Zhao Xu et.al. |
2406.09324 |
link |
2024-06-13 |
JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models |
Delong Ran et.al. |
2406.09321 |
link |
2024-06-13 |
Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases |
Meng Wang et.al. |
2406.09317 |
link |
2024-06-12 |
What If We Recaption Billions of Web Images with LLaMA-3? |
Xianhang Li et.al. |
2406.08478 |
null |
2024-06-12 |
Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens |
Ting-Ji Huang et.al. |
2406.08477 |
null |
2024-06-12 |
Real2Code: Reconstruct Articulated Objects via Code Generation |
Zhao Mandi et.al. |
2406.08474 |
null |
2024-06-12 |
PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences |
Daiwei Chen et.al. |
2406.08469 |
null |
2024-06-12 |
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing |
Zhangchen Xu et.al. |
2406.08464 |
link |
2024-06-12 |
AToM-Bot: Embodied Fulfillment of Unspoken Human Needs with Affective Theory of Mind |
Wei Ding et.al. |
2406.08455 |
null |
2024-06-12 |
OLMES: A Standard for Language Model Evaluations |
Yuling Gu et.al. |
2406.08446 |
null |
2024-06-12 |
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models |
Chun Yin et.al. |
2406.08445 |
null |
2024-06-12 |
TasTe: Teaching Large Language Models to Translate through Self-Reflection |
Yutong Wang et.al. |
2406.08434 |
link |
2024-06-12 |
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL |
Zijin Hong et.al. |
2406.08426 |
null |
2024-06-12 |
OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text |
Qingyun Li et.al. |
2406.08418 |
link |
2024-06-12 |
Discovering Preference Optimization Algorithms with and for Large Language Models |
Chris Lu et.al. |
2406.08414 |
link |
2024-06-12 |
Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference |
Christopher Wolters et.al. |
2406.08413 |
null |
2024-06-13 |
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos |
Xuehai He et.al. |
2406.08407 |
link |
2024-06-12 |
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models |
Chun-Yi Kuan et.al. |
2406.08402 |
link |
2024-06-12 |
cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers |
Anirudh Sundar et.al. |
2406.08398 |
null |
2024-06-12 |
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks |
Jiannan Wu et.al. |
2406.08394 |
link |
2024-06-12 |
Large Language Models Must Be Taught to Know What They Don’t Know |
Sanyam Kapoor et.al. |
2406.08391 |
link |
2024-06-12 |
Banal Deception Human-AI Ecosystems: A Study of People’s Perceptions of LLM-generated Deceptive Behaviour |
Xiao Zhan et.al. |
2406.08386 |
null |
2024-06-13 |
APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation |
Weizhao He et.al. |
2406.08372 |
null |
2024-06-11 |
A3VLM: Actionable Articulation-Aware Vision Language Model |
Siyuan Huang et.al. |
2406.07549 |
link |
2024-06-11 |
Image and Video Tokenization with Binary Spherical Quantization |
Yue Zhao et.al. |
2406.07548 |
link |
2024-06-11 |
Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena |
Aidar Myrzakhan et.al. |
2406.07545 |
link |
2024-06-11 |
QuickLLaMA: Query-aware Inference Acceleration for Large Language Models |
Jingyao Li et.al. |
2406.07528 |
link |
2024-06-11 |
Simple and Effective Masked Diffusion Language Models |
Subham Sekhar Sahoo et.al. |
2406.07524 |
link |
2024-06-11 |
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling |
Liliang Ren et.al. |
2406.07522 |
link |
2024-06-11 |
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement |
Yunzhen Feng et.al. |
2406.07515 |
null |
2024-06-11 |
THaLLE: Text Hyperlocally Augmented Large Language Extension – Technical Report |
KBTG Labs et.al. |
2406.07505 |
null |
2024-06-11 |
Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions |
Renjie Pi et.al. |
2406.07502 |
link |
2024-06-11 |
TextGrad: Automatic “Differentiation” via Text |
Mert Yuksekgonul et.al. |
2406.07496 |
link |
2024-06-11 |
CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization |
Frederic Kirstein et.al. |
2406.07494 |
null |
2024-06-11 |
Paraphrasing in Affirmative Terms Improves Negation Understanding |
MohammadHossein Rezaei et.al. |
2406.07492 |
null |
2024-06-11 |
PITCH: Productivity and Mental Well-being Coaching through Daily Conversational Interaction |
Adnan Abbas et.al. |
2406.07485 |
null |
2024-06-11 |
Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing |
Mao Li et.al. |
2406.07483 |
null |
2024-06-11 |
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs |
Zesen Cheng et.al. |
2406.07476 |
link |
2024-06-11 |
Anomaly Detection on Unstable Logs with GPT Models |
Fatemeh Hadadi et.al. |
2406.07467 |
null |
2024-06-11 |
Estimating the Hallucination Rate of Generative AI |
Andrew Jesson et.al. |
2406.07457 |
null |
2024-06-11 |
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis |
Qining Zhang et.al. |
2406.07455 |
null |
2024-06-11 |
On the Robustness of Document-Level Relation Extraction Models to Entity Name Variations |
Shiao Meng et.al. |
2406.07444 |
link |
2024-06-11 |
McEval: Massively Multilingual Code Evaluation |
Linzheng Chai et.al. |
2406.07436 |
null |
2024-06-10 |
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation |
Peize Sun et.al. |
2406.06525 |
link |
2024-06-10 |
UMBRELA: UMbrela is the (Open-Source Reproduction of the) Bing RELevance Assessor |
Shivani Upadhyay et.al. |
2406.06519 |
link |
2024-06-10 |
Merlin: A Vision Language Foundation Model for 3D Computed Tomography |
Louis Blankemeier et.al. |
2406.06512 |
null |
2024-06-10 |
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative |
Asmar Nadeem et.al. |
2406.06499 |
null |
2024-06-10 |
Direct Preference Optimization for Suppressing Hallucinated Prior Exams in Radiology Report Generation |
Oishi Banerjee et.al. |
2406.06496 |
null |
2024-06-10 |
Can Language Models Serve as Text-Based World Simulators? |
Ruoyao Wang et.al. |
2406.06485 |
null |
2024-06-10 |
Parallelizing Linear Transformers with the Delta Rule over Sequence Length |
Songlin Yang et.al. |
2406.06484 |
link |
2024-06-10 |
Towards a Personal Health Large Language Model |
Justin Cosentino et.al. |
2406.06474 |
null |
2024-06-10 |
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction |
Zhen Xing et.al. |
2406.06465 |
null |
2024-06-10 |
Transforming Wearable Data into Health Insights using Large Language Model Agents |
Mike A. Merrill et.al. |
2406.06464 |
null |
2024-06-10 |
VCR: Visual Caption Restoration |
Tianyu Zhang et.al. |
2406.06462 |
link |
2024-06-11 |
Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies |
Junlin Wang et.al. |
2406.06461 |
null |
2024-06-10 |
Evaluating the Retrieval Component in LLM-Based Question Answering Systems |
Ashkan Alinejad et.al. |
2406.06458 |
null |
2024-06-10 |
A Large Language Model Pipeline for Breast Cancer Oncology |
Tristen Pool et.al. |
2406.06455 |
null |
2024-06-10 |
Insights from Social Shaping Theory: The Appropriation of Large Language Models in an Undergraduate Programming Course |
Aadarsh Padiyath et.al. |
2406.06451 |
null |
2024-06-10 |
LLM Dataset Inference: Did you train on my dataset? |
Pratyush Maini et.al. |
2406.06443 |
link |
2024-06-10 |
Interpretability of Language Models via Task Spaces |
Lucas Weber et.al. |
2406.06441 |
null |
2024-06-10 |
Language Models are Alignable Decision-Makers: Dataset and Application to the Medical Triage Domain |
Brian Hu et.al. |
2406.06435 |
link |
2024-06-10 |
Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking |
Gabriel Rioux et.al. |
2406.06425 |
null |
2024-06-10 |
An Empirical Design Justice Approach to Identifying Ethical Considerations in the Intersection of Large Language Models and Social Robotics |
Alva Markelius et.al. |
2406.06400 |
null |
2024-06-07 |
3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs |
Jianing Yang et.al. |
2406.05132 |
link |
2024-06-07 |
An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models |
Xiongtao Zhou et.al. |
2406.05130 |
link |
2024-06-07 |
Towards Semantic Equivalence of Tokenization in Multimodal LLM |
Shengqiong Wu et.al. |
2406.05127 |
null |
2024-06-07 |
Large Generative Graph Models |
Yu Wang et.al. |
2406.05109 |
null |
2024-06-07 |
LINX: A Language Driven Generative System for Goal-Oriented Automated Data Exploration |
Tavor Lipman et.al. |
2406.05107 |
null |
2024-06-07 |
Corpus Poisoning via Approximate Greedy Gradient Descent |
Jinyan Su et.al. |
2406.05087 |
link |
2024-06-07 |
Multi-Head RAG: Solving Multi-Aspect Problems with LLMs |
Maciej Besta et.al. |
2406.05085 |
link |
2024-06-07 |
SUMIE: A Synthetic Benchmark for Incremental Entity Summarization |
Eunjeong Hwang et.al. |
2406.05079 |
null |
2024-06-07 |
Are Large Language Models More Empathetic than Humans? |
Anuradha Welivita et.al. |
2406.05063 |
null |
2024-06-07 |
Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions |
Shi-Yu Tian et.al. |
2406.05055 |
null |
2024-06-07 |
Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation |
Nachiket Kotalwar et.al. |
2406.05053 |
null |
2024-06-07 |
Bootstrapping Referring Multi-Object Tracking |
Yani Zhang et.al. |
2406.05039 |
link |
2024-06-07 |
Scenarios and Approaches for Situated Natural Language Explanations |
Pengshuo Qiu et.al. |
2406.05035 |
null |
2024-06-07 |
CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search |
Fengran Mo et.al. |
2406.05013 |
link |
2024-06-07 |
Compositional Generalization with Grounded Language Models |
Sondre Wold et.al. |
2406.04989 |
link |
2024-06-07 |
Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences |
Patrick Haller et.al. |
2406.04988 |
link |
2024-06-07 |
MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter |
Jitai Hao et.al. |
2406.04984 |
link |
2024-06-07 |
CityCraft: A Real Crafter for 3D City Generation |
Jie Deng et.al. |
2406.04983 |
null |
2024-06-07 |
Quantifying Geospatial in the Common Crawl Corpus |
Ilya Ilyankou et.al. |
2406.04952 |
null |
2024-06-07 |
BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense |
Baktash Ansari et.al. |
2406.04947 |
link |
2024-06-06 |
Verbalized Machine Learning: Revisiting Machine Learning with Language Models |
Tim Z. Xiao et.al. |
2406.04344 |
null |
2024-06-06 |
Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image |
Stanislaw Szymanowicz et.al. |
2406.04343 |
link |
2024-06-06 |
Learning 1D Causal Visual Representation with De-focus Attention Networks |
Chenxin Tao et.al. |
2406.04342 |
link |
2024-06-06 |
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation |
Jiaming Liu et.al. |
2406.04339 |
null |
2024-06-06 |
Coherent Zero-Shot Visual Instruction Generation |
Quynh Phung et.al. |
2406.04337 |
null |
2024-06-06 |
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs |
Lingchen Meng et.al. |
2406.04334 |
null |
2024-06-06 |
PaCE: Parsimonious Concept Engineering for Large Language Models |
Jinqi Luo et.al. |
2406.04331 |
link |
2024-06-06 |
Parameter-Inverted Image Pyramid Networks |
Xizhou Zhu et.al. |
2406.04330 |
link |
2024-06-06 |
Simplified and Generalized Masked Diffusion for Discrete Data |
Jiaxin Shi et.al. |
2406.04329 |
null |
2024-06-06 |
Causal Estimation of Memorisation Profiles |
Pietro Lesci et.al. |
2406.04327 |
link |
2024-06-06 |
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions |
Lin Chen et.al. |
2406.04325 |
null |
2024-06-06 |
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step |
Zhanhao Liang et.al. |
2406.04314 |
link |
2024-06-06 |
Improving Alignment and Robustness with Short Circuiting |
Andy Zou et.al. |
2406.04313 |
link |
2024-06-06 |
Semantically Diverse Language Generation for Uncertainty Estimation in Language Models |
Lukas Aichberger et.al. |
2406.04306 |
link |
2024-06-06 |
Quixer: A Quantum Transformer Model |
Nikhil Khatri et.al. |
2406.04305 |
null |
2024-06-06 |
Text-to-Drive: Diverse Driving Behavior Synthesis via Large Language Models |
Phat Nguyen et.al. |
2406.04300 |
null |
2024-06-06 |
VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval |
Junjie Zhou et.al. |
2406.04292 |
link |
2024-06-06 |
Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation |
Adam Fisch et.al. |
2406.04291 |
null |
2024-06-07 |
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages |
Nadav Borenstein et.al. |
2406.04289 |
null |
2024-06-06 |
Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People |
Dun-Ming Huang et.al. |
2406.04278 |
link |
2024-06-05 |
Wings: Learning Multimodal LLMs without Text-only Forgetting |
Yi-Kai Zhang et.al. |
2406.03496 |
null |
2024-06-06 |
Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training |
Ao Sun et.al. |
2406.03488 |
link |
2024-06-05 |
Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends |
Sanjana Ramprasad et.al. |
2406.03487 |
null |
2024-06-05 |
BIPED: Pedagogically Informed Tutoring System for ESL Education |
Soonwoo Kwon et.al. |
2406.03486 |
null |
2024-06-05 |
Does your data spark joy? Performance gains from domain upsampling at the end of training |
Cody Blakeney et.al. |
2406.03476 |
null |
2024-06-05 |
AD-H: Autonomous Driving with Hierarchical Agents |
Zaibin Zhang et.al. |
2406.03474 |
null |
2024-06-05 |
What is the Best Way for ChatGPT to Translate Poetry? |
Shanshan Wang et.al. |
2406.03450 |
null |
2024-06-05 |
Pre-trained Large Language Models Use Fourier Features to Compute Addition |
Tianyi Zhou et.al. |
2406.03445 |
null |
2024-06-05 |
Are language models rational? The case of coherence norms and belief revision |
Thomas Hofweber et.al. |
2406.03442 |
null |
2024-06-05 |
Cycles of Thought: Measuring LLM Confidence through Stable Explanations |
Evan Becker et.al. |
2406.03441 |
null |
2024-06-05 |
Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis |
Moein Heidari et.al. |
2406.03430 |
link |
2024-06-05 |
Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach |
Saehyung Lee et.al. |
2406.03411 |
link |
2024-06-05 |
Automating Turkish Educational Quiz Generation Using Large Language Models |
Kamyar Zeinalipour et.al. |
2406.03397 |
link |
2024-06-05 |
Log Parsing with Self-Generated In-Context Learning and Self-Correction |
Yifan Wu et.al. |
2406.03376 |
null |
2024-06-05 |
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models |
David Ifeoluwa Adelani et.al. |
2406.03368 |
null |
2024-06-05 |
CLMASP: Coupling Large Language Models with Answer Set Programming for Robotic Task Planning |
Xinrui Lin et.al. |
2406.03367 |
null |
2024-06-05 |
LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback |
Timon Ziegenbein et.al. |
2406.03363 |
null |
2024-06-05 |
Save It for the “Hot” Day: An LLM-Empowered Visual Analytics System for Heat Risk Management |
Haobo Li et.al. |
2406.03317 |
null |
2024-06-05 |
The Good, the Bad, and the Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games |
Mikhail Mozikov et.al. |
2406.03299 |
null |
2024-06-05 |
SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms |
Xingrun Xing et.al. |
2406.03287 |
link |
2024-06-04 |
Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks |
Tianyu He et.al. |
2406.02550 |
link |
2024-06-04 |
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation |
Mohamed El Amine Boudjoghra et.al. |
2406.02548 |
link |
2024-06-04 |
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning |
Alex Jinpeng Wang et.al. |
2406.02547 |
link |
2024-06-04 |
To Believe or Not to Believe Your LLM |
Yasin Abbasi Yadkori et.al. |
2406.02543 |
null |
2024-06-04 |
Loki: Low-Rank Keys for Efficient Sparse Attention |
Prajwal Singhania et.al. |
2406.02542 |
link |
2024-06-04 |
Parrot: Multilingual Visual Instruction Tuning |
Hai-Long Sun et.al. |
2406.02539 |
link |
2024-06-04 |
TopViewRS: Vision-Language Models as Top-View Spatial Reasoners |
Chengzu Li et.al. |
2406.02537 |
link |
2024-06-04 |
Mitigate Position Bias in Large Language Models via Scaling a Single Dimension |
Yijiong Yu et.al. |
2406.02536 |
link |
2024-06-04 |
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices |
Ruslan Svirschevski et.al. |
2406.02532 |
link |
2024-06-04 |
Scalable MatMul-free Language Modeling |
Rui-Jie Zhu et.al. |
2406.02528 |
link |
2024-06-04 |
CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks |
Maciej Besta et.al. |
2406.02524 |
link |
2024-06-04 |
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots |
Soroush Nasiriany et.al. |
2406.02523 |
null |
2024-06-04 |
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework |
Shwai He et.al. |
2406.02500 |
link |
2024-06-04 |
Hiding Text in Large Language Models: Introducing Unconditional Token Forcing Confusion |
Jakub Hoscilowicz et.al. |
2406.02481 |
link |
2024-06-04 |
Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding |
Zhihan Zhang et.al. |
2406.02472 |
link |
2024-06-04 |
Meta-Designing Quantum Experiments with Language Models |
Sören Arlt et.al. |
2406.02470 |
null |
2024-06-04 |
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models |
Philip Anastassiou et.al. |
2406.02430 |
link |
2024-06-04 |
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion |
Ruiqi Li et.al. |
2406.02429 |
null |
2024-06-04 |
GrootVL: Tree Topology is All You Need in State Space Model |
Yicheng Xiao et.al. |
2406.02395 |
link |
2024-06-04 |
Multiple Choice Questions and Large Languages Models: A Case Study with Fictional Medical Data |
Maxime Griot et.al. |
2406.02394 |
link |
2024-05-31 |
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis |
Chaoyou Fu et.al. |
2405.21075 |
null |
2024-05-31 |
Code Pretraining Improves Entity Tracking Abilities of Language Models |
Najoung Kim et.al. |
2405.21068 |
null |
2024-05-31 |
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality |
Tri Dao et.al. |
2405.21060 |
link |
2024-05-31 |
RydbergGPT |
David Fitzek et.al. |
2405.21052 |
link |
2024-05-31 |
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling |
Jiatao Gu et.al. |
2405.21048 |
null |
2024-05-31 |
Grammar-Aligned Decoding |
Kanghee Park et.al. |
2405.21047 |
null |
2024-05-31 |
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF |
Tengyang Xie et.al. |
2405.21046 |
null |
2024-05-31 |
Direct Alignment of Language Models via Quality-Aware Self-Refinement |
Runsheng Yu et.al. |
2405.21040 |
null |
2024-05-31 |
Standards for Belief Representations in LLMs |
Daniel A. Herrmann et.al. |
2405.21030 |
null |
2024-05-31 |
LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models |
Elias Stengel-Eskin et.al. |
2405.21028 |
link |
2024-05-31 |
You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet |
Zhen Qin et.al. |
2405.21022 |
null |
2024-05-31 |
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models |
Xiaojun Jia et.al. |
2405.21018 |
link |
2024-06-03 |
StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond |
Pengyuan Lyu et.al. |
2405.21013 |
null |
2024-05-31 |
Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models |
Yi Yang et.al. |
2405.20991 |
link |
2024-05-31 |
DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models |
Linli Yao et.al. |
2405.20985 |
link |
2024-05-31 |
Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training |
Feiteng Fang et.al. |
2405.20978 |
link |
2024-05-31 |
SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales |
Tianyang Xu et.al. |
2405.20974 |
link |
2024-05-31 |
LCQ: Low-Rank Codebook based Quantization for Large Language Models |
Wen-Pu Cai et.al. |
2405.20973 |
null |
2024-06-03 |
Large Language Models are Zero-Shot Next Location Predictors |
Ciro Beneduce et.al. |
2405.20962 |
link |
2024-06-03 |
A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs’ Humour Alignment with Comedians |
Piotr Wojciech Mirowski et.al. |
2405.20956 |
null |
2024-05-30 |
MotionLLM: Understanding Human Behaviors from Human Motions and Videos |
Ling-Hao Chen et.al. |
2405.20340 |
link |
2024-05-30 |
Visual Perception by Large Language Model’s Weights |
Feipeng Ma et.al. |
2405.20339 |
link |
2024-05-30 |
Xwin-LM: Strong and Scalable Alignment Practice for LLMs |
Bolin Ni et.al. |
2405.20335 |
link |
2024-05-31 |
ParSEL: Parameterized Shape Editing with Language |
Aditya Ganeshan et.al. |
2405.20319 |
null |
2024-05-30 |
CausalQuest: Collecting Natural Causal Questions for AI Agents |
Roberto Ceraolo et.al. |
2405.20318 |
link |
2024-05-30 |
ANAH: Analytical Annotation of Hallucinations in Large Language Models |
Ziwei Ji et.al. |
2405.20315 |
link |
2024-05-30 |
Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation |
Guillaume Huguet et.al. |
2405.20313 |
null |
2024-05-30 |
Large Language Models Can Self-Improve At Web Agent Tasks |
Ajay Patel et.al. |
2405.20309 |
link |
2024-05-30 |
Can’t make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models |
Himangi Mittal et.al. |
2405.20305 |
null |
2024-05-30 |
Group Robust Preference Optimization in Reward-free RLHF |
Shyam Sundhar Ramesh et.al. |
2405.20304 |
link |
2024-05-30 |
Who Writes the Review, Human or AI? |
Panagiotis C. Theocharopoulos et.al. |
2405.20285 |
null |
2024-05-30 |
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections |
Massimo Bini et.al. |
2405.20271 |
link |
2024-05-30 |
Evaluating Large Language Model Biases in Persona-Steered Generation |
Andy Liu et.al. |
2405.20253 |
link |
2024-05-30 |
Towards Hierarchical Multi-Agent Workflows for Zero-Shot Prompt Optimization |
Yuchi Liu et.al. |
2405.20252 |
link |
2024-05-30 |
Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use |
Franz Louis Cesista et.al. |
2405.20245 |
null |
2024-05-30 |
Context Injection Attacks on Large Language Models |
Cheng’an Wei et.al. |
2405.20234 |
null |
2024-05-30 |
Data-efficient fine-tuning of foundational models for first-principles quality sublimation enthalpies |
Harveen Kaur et.al. |
2405.20217 |
null |
2024-05-30 |
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models |
Chen Zhang et.al. |
2405.20215 |
null |
2024-05-30 |
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments |
Ke Yi et.al. |
2405.20202 |
null |
2024-05-31 |
Using Large Language Models for Humanitarian Frontline Negotiation: Opportunities and Considerations |
Zilin Ma et.al. |
2405.20195 |
null |
2024-05-29 |
X-VILA: Cross-Modality Alignment for Large Language Model |
Hanrong Ye et.al. |
2405.19335 |
null |
2024-05-29 |
LLMs Meet Multimodal Generation and Editing: A Survey |
Yingqing He et.al. |
2405.19334 |
link |
2024-05-29 |
Multi-Modal Generative Embedding Model |
Feipeng Ma et.al. |
2405.19333 |
null |
2024-05-29 |
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment |
Shenao Zhang et.al. |
2405.19332 |
link |
2024-05-29 |
Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation |
Atrisha Sarkar et.al. |
2405.19328 |
null |
2024-05-29 |
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series |
Ge Zhang et.al. |
2405.19327 |
link |
2024-05-29 |
Reasoning3D – Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models |
Tianrun Chen et.al. |
2405.19326 |
null |
2024-05-29 |
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution |
Minghan Li et.al. |
2405.19325 |
null |
2024-05-29 |
Are Large Language Models Chameleons? |
Mingmeng Geng et.al. |
2405.19323 |
null |
2024-05-29 |
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF |
Shicong Cen et.al. |
2405.19320 |
null |
2024-05-29 |
Robust Preference Optimization through Reward Model Distillation |
Adam Fisch et.al. |
2405.19316 |
null |
2024-05-29 |
Matryoshka Query Transformer for Large Vision-Language Models |
Wenbo Hu et.al. |
2405.19315 |
link |
2024-05-29 |
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice |
Jian-Qiao Zhu et.al. |
2405.19313 |
null |
2024-05-29 |
Expert-Guided Extinction of Toxic Tokens for Debiased Generation |
Xueyao Sun et.al. |
2405.19299 |
null |
2024-05-29 |
MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection |
Michael Regan et.al. |
2405.19285 |
null |
2024-05-29 |
Optimizing Foundation Model Inference on a Many-tiny-core Open-source RISC-V Platform |
Viviane Potocnik et.al. |
2405.19284 |
null |
2024-05-29 |
Programmable Motion Generation for Open-Set Motion Control Tasks |
Hanchao Liu et.al. |
2405.19283 |
null |
2024-05-29 |
PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications |
Dingkang Yang et.al. |
2405.19266 |
link |
2024-05-29 |
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data |
Zifan Song et.al. |
2405.19265 |
link |
2024-05-29 |
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models |
Zhanhui Zhou et.al. |
2405.19262 |
link |
2024-05-28 |
Why are Visually-Grounded Language Models Bad at Image Classification? |
Yuhui Zhang et.al. |
2405.18415 |
link |
2024-05-28 |
Don’t Forget to Connect! Improving RAG with Graph-based Reranking |
Jialin Dong et.al. |
2405.18414 |
null |
2024-05-28 |
WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization |
Jiawei Ma et.al. |
2405.18405 |
null |
2024-05-29 |
Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass |
Ethan Shen et.al. |
2405.18400 |
link |
2024-05-28 |
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning |
Yixiao Zhang et.al. |
2405.18386 |
link |
2024-05-28 |
OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning |
Pengxiang Li et.al. |
2405.18380 |
link |
2024-05-28 |
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models |
Anthony Sarah et.al. |
2405.18377 |
null |
2024-05-28 |
Empowering Source-Free Domain Adaptation with MLLM-driven Curriculum Learning |
Dongjie Chen et.al. |
2405.18376 |
link |
2024-05-28 |
Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning |
Phakphum Artkaew et.al. |
2405.18375 |
link |
2024-05-28 |
PromptWizard: Task-Aware Agent-driven Prompt Optimization Framework |
Eshaan Agarwal et.al. |
2405.18369 |
null |
2024-05-28 |
Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving? |
Yifan Bai et.al. |
2405.18361 |
null |
2024-05-28 |
Bridging the Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs |
Somnath Kumar et.al. |
2405.18359 |
null |
2024-05-28 |
MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning |
Somnath Kumar et.al. |
2405.18358 |
null |
2024-05-28 |
Faithful Logical Reasoning via Symbolic Chain-of-Thought |
Jundong Xu et.al. |
2405.18357 |
link |
2024-05-28 |
Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography |
Jie Liu et.al. |
2405.18356 |
link |
2024-05-28 |
Intelligent Clinical Documentation: Harnessing Generative AI for Patient-Centric Clinical Note Generation |
Anjanava Biswas et.al. |
2405.18346 |
null |
2024-05-28 |
The Battle of LLMs: A Comparative Study in Conversational QA Tasks |
Aryan Rangapur et.al. |
2405.18344 |
null |
2024-05-28 |
Frustratingly Easy Test-Time Adaptation of Vision-Language Models |
Matteo Farina et.al. |
2405.18330 |
link |
2024-05-28 |
Multi-modal Generation via Cross-Modal In-Context Learning |
Amandeep Kumar et.al. |
2405.18304 |
link |
2024-05-28 |
Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning |
Renzhi Wang et.al. |
2405.18292 |
null |
2024-05-27 |
Matryoshka Multimodal Models |
Mu Cai et.al. |
2405.17430 |
null |
2024-05-27 |
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models |
Chankyu Lee et.al. |
2405.17428 |
null |
2024-05-27 |
Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model |
Kuan-Chih Huang et.al. |
2405.17427 |
link |
2024-05-27 |
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence |
Zhuoling Li et.al. |
2405.17424 |
null |
2024-05-27 |
Privacy-Aware Visual Language Models |
Laurens Samson et.al. |
2405.17423 |
null |
2024-05-27 |
Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation |
Jiaming Liu et.al. |
2405.17418 |
null |
2024-05-27 |
THREAD: Thinking Deeper with Recursive Spawning |
Philip Schroeder et.al. |
2405.17402 |
link |
2024-05-27 |
The Expressive Capacity of State Space Models: A Formal Language Perspective |
Yash Sarrof et.al. |
2405.17394 |
null |
2024-05-27 |
MindMerger: Efficient Boosting LLM Reasoning in non-English Languages |
Zixian Huang et.al. |
2405.17386 |
link |
2024-05-27 |
Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective |
Zhen Qin et.al. |
2405.17383 |
null |
2024-05-27 |
ReMoDetect: Reward Models Recognize Aligned LLM’s Generations |
Hyunseok Lee et.al. |
2405.17382 |
link |
2024-05-27 |
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention |
Zhen Qin et.al. |
2405.17381 |
link |
2024-05-27 |
RTL-Repo: A Benchmark for Evaluating LLMs on Large-Scale RTL Design Projects |
Ahmed Allam et.al. |
2405.17378 |
link |
2024-05-28 |
Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models |
ShengYun Peng et.al. |
2405.17374 |
link |
2024-05-27 |
Prompt Optimization with Human Feedback |
Xiaoqiang Lin et.al. |
2405.17346 |
link |
2024-05-27 |
Exploring and steering the moral compass of Large Language Models |
Alejandro Tlaie et.al. |
2405.17345 |
link |
2024-05-27 |
Cost-efficient Knowledge-based Question Answering with Large Language Models |
Junnan Dong et.al. |
2405.17337 |
null |
2024-05-27 |
XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser |
Xianfu Cheng et.al. |
2405.17336 |
link |
2024-05-27 |
FedHPL: Efficient Heterogeneous Federated Learning with Prompt Tuning and Logit Distillation |
Yuting Ma et.al. |
2405.17267 |
null |
2024-05-27 |
On the Noise Robustness of In-Context Learning for Text Generation |
Hongfu Gao et.al. |
2405.17264 |
link |
2024-05-24 |
Scaling Laws for Discriminative Classification in Large Language Models |
Dean Wyatte et.al. |
2405.15765 |
null |
2024-05-24 |
Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence |
Abhinav Patil et.al. |
2405.15750 |
link |
2024-05-24 |
Sparse maximal update parameterization: A holistic approach to sparse training dynamics |
Nolan Dey et.al. |
2405.15743 |
link |
2024-05-24 |
Large Language Models Reflect Human Citation Patterns with a Heightened Citation Bias |
Andres Algaba et.al. |
2405.15739 |
link |
2024-05-24 |
LM4LV: A Frozen Large Language Model for Low-level Vision Tasks |
Boyang Zheng et.al. |
2405.15734 |
link |
2024-05-24 |
Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks |
Jerome Sieber et.al. |
2405.15731 |
link |
2024-05-24 |
Optimizing Large Language Models for OpenAPI Code Completion |
Bohdan Petryshyn et.al. |
2405.15729 |
link |
2024-05-24 |
Disease-informed Adaptation of Vision-Language Models |
Jiajin Zhang et.al. |
2405.15728 |
link |
2024-05-24 |
The Impact of Geometric Complexity on Neural Collapse in Transfer Learning |
Michael Munn et.al. |
2405.15706 |
null |
2024-05-24 |
Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models |
Yue Zhang et.al. |
2405.15684 |
null |
2024-05-24 |
VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap |
Sreyan Ghosh et.al. |
2405.15683 |
link |
2024-05-24 |
What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models |
Abdelrahman Abdelhamed et.al. |
2405.15668 |
null |
2024-05-24 |
Class Machine Unlearning for Complex Data via Concepts Inference and Data Poisoning |
Wenhan Chang et.al. |
2405.15662 |
null |
2024-05-24 |
\(\mathbf{L^2\cdot M = C^2}\) Large Language Models as Covert Channels… a Systematic Analysis |
Simen Gaure et.al. |
2405.15652 |
null |
2024-05-24 |
LLM-based Robot Task Planning with Exceptional Handling for General Purpose Service Robots |
Ruoyu Wang et.al. |
2405.15646 |
null |
2024-05-24 |
GECKO: Generative Language Model for English, Code and Korean |
Sungwoo Oh et.al. |
2405.15640 |
null |
2024-05-24 |
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models |
Hongyu Wang et.al. |
2405.15638 |
link |
2024-05-24 |
GPTZoo: A Large-scale Dataset of GPTs for the Research Community |
Xinyi Hou et.al. |
2405.15630 |
link |
2024-05-24 |
A Comparative Analysis of Distributed Training Strategies for GPT-2 |
Ishan Patwardhan et.al. |
2405.15628 |
null |
2024-05-24 |
Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment |
Hao Sun et.al. |
2405.15624 |
null |
2024-05-23 |
PuzzleAvatar: Assembling 3D Avatars from Personal Albums |
Yuliang Xiu et.al. |
2405.14869 |
link |
2024-05-23 |
A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns |
Asaf Yehudai et.al. |
2405.14863 |
null |
2024-05-23 |
Bitune: Bidirectional Instruction-Tuning |
Dawid J. Kopiczko et.al. |
2405.14862 |
null |
2024-05-23 |
Not All Language Model Features Are Linear |
Joshua Engels et.al. |
2405.14860 |
link |
2024-05-23 |
PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression |
Vladimir Malinovskii et.al. |
2405.14852 |
link |
2024-05-23 |
A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis |
Yue Yang et.al. |
2405.14839 |
null |
2024-05-23 |
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step |
Yuntian Deng et.al. |
2405.14838 |
link |
2024-05-23 |
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models |
Bernal Jiménez Gutiérrez et.al. |
2405.14831 |
link |
2024-05-23 |
Designing A Sustainable Marine Debris Clean-up Framework without Human Labels |
Raymond Wang et.al. |
2405.14815 |
link |
2024-05-23 |
As an AI Language Model, “Yes I Would Recommend Calling the Police”: Norm Inconsistency in LLM Decision-Making |
Shomik Jain et.al. |
2405.14812 |
null |
2024-05-23 |
Implicit Personalization in Language Models: A Systematic Study |
Zhijing Jin et.al. |
2405.14808 |
link |
2024-05-23 |
Can LLMs Solve longer Math Word Problems Better? |
Xin Xu et.al. |
2405.14804 |
null |
2024-05-23 |
Lessons from the Trenches on Reproducible Evaluation of Language Models |
Stella Biderman et.al. |
2405.14782 |
null |
2024-05-23 |
WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models |
Peng Wang et.al. |
2405.14768 |
link |
2024-05-23 |
FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models |
Hongyang Yang et.al. |
2405.14767 |
link |
2024-05-23 |
Evaluating Large Language Models for Public Health Classification and Extraction Tasks |
Joshua Harris et.al. |
2405.14766 |
null |
2024-05-23 |
Large language models can be zero-shot anomaly detectors for time series? |
Sarah Alnegheimish et.al. |
2405.14755 |
link |
2024-05-23 |
A Transformer-Based Approach for Smart Invocation of Automatic Code Completion |
Aral de Moor et.al. |
2405.14753 |
link |
2024-05-23 |
MultiCast: Zero-Shot Multivariate Time Series Forecasting Using LLMs |
Georgios Chatzigeorgakidis et.al. |
2405.14748 |
null |
2024-05-23 |
Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View |
Xuan Liu et.al. |
2405.14744 |
null |
2024-05-21 |
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention |
William Brandon et.al. |
2405.12981 |
null |
2024-05-21 |
OmniGlue: Generalizable Feature Matching with Foundation Model Guidance |
Hanwen Jiang et.al. |
2405.12979 |
link |
2024-05-21 |
BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once |
Theodore Zhao et.al. |
2405.12971 |
null |
2024-05-21 |
Energy Rank Alignment: Using Preference Optimization to Search Chemical Space at Scale |
Shriram Chennakesavalu et.al. |
2405.12961 |
link |
2024-05-21 |
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models |
Zhangyue Yin et.al. |
2405.12939 |
link |
2024-05-21 |
Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs |
Bilgehan Sel et.al. |
2405.12933 |
null |
2024-05-21 |
Code-mixed Sentiment and Hate-speech Prediction |
Anjali Yadav et.al. |
2405.12929 |
link |
2024-05-21 |
Streamlining Software Reviews: Efficient Predictive Modeling with Minimal Examples |
Tim Menzies et.al. |
2405.12920 |
link |
2024-05-21 |
G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation |
Xingyuan Pan et.al. |
2405.12915 |
link |
2024-05-21 |
An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation |
Zhiyu Tan et.al. |
2405.12914 |
link |
2024-05-21 |
Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment |
Holli Sargeant et.al. |
2405.12910 |
link |
2024-05-21 |
Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents |
San Kim et.al. |
2405.12900 |
null |
2024-05-21 |
Investigating Persuasion Techniques in Arabic: An Empirical Study Leveraging Large Language Models |
Abdurahmman Alzahrani et.al. |
2405.12884 |
null |
2024-05-21 |
LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language |
James Requeima et.al. |
2405.12856 |
link |
2024-05-21 |
OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models |
Zhaojian Yu et.al. |
2405.12843 |
link |
2024-05-21 |
SmartFlow: Robotic Process Automation using LLMs |
Arushi Jain et.al. |
2405.12842 |
null |
2024-05-21 |
Large Language Models Meet NLP: A Survey |
Libo Qin et.al. |
2405.12819 |
link |
2024-05-21 |
Test Oracle Automation in the era of LLMs |
Facundo Molina et.al. |
2405.12766 |
null |
2024-05-21 |
C3L: Content Correlated Vision-Language Instruction Tuning Data Generation via Contrastive Learning |
Ji Ma et.al. |
2405.12752 |
null |
2024-05-21 |
Generative AI and Large Language Models for Cyber Security: All Insights You Need |
Mohamed Amine Ferrag et.al. |
2405.12750 |
null |
2024-05-20 |
Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning |
Guanglin Zhou et.al. |
2405.12217 |
link |
2024-05-20 |
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark |
Hongwei Liu et.al. |
2405.12209 |
link |
2024-05-20 |
Developers’ Perceptions on the Impact of ChatGPT in Software Development: A Survey |
Thiago S. Vaillant et.al. |
2405.12195 |
link |
2024-05-20 |
CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models |
Haoxiang Shi et.al. |
2405.12174 |
null |
2024-05-20 |
Fennec: Fine-grained Language Model Evaluation and Correction Extended through Branching and Bridging |
Xiaobo Liang et.al. |
2405.12163 |
link |
2024-05-20 |
Eliciting Problem Specifications via Large Language Models |
Robert E. Wray et.al. |
2405.12147 |
null |
2024-05-20 |
DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM |
Xuchen Li et.al. |
2405.12139 |
null |
2024-05-20 |
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning |
Ting Jiang et.al. |
2405.12130 |
link |
2024-05-20 |
Reindex-Then-Adapt: Improving Large Language Models for Conversational Recommendation |
Zhankui He et.al. |
2405.12119 |
null |
2024-05-20 |
Imp: Highly Capable Large Multimodal Models for Mobile Devices |
Zhenwei Shao et.al. |
2405.12107 |
link |
2024-05-20 |
DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction |
Hao Chen et.al. |
2405.12100 |
null |
2024-05-20 |
Distributional Semantics, Holism, and the Instability of Meaning |
Jumbly Grindrod et.al. |
2405.12084 |
null |
2024-05-20 |
PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation |
Zhuobin Huang et.al. |
2405.12079 |
null |
2024-05-20 |
CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models |
Tong Zhang et.al. |
2405.12063 |
link |
2024-05-20 |
STYLE: Improving Domain Transferability of Asking Clarification Questions in Large Language Model Powered Conversational Agents |
Yue Chen et.al. |
2405.12059 |
null |
2024-05-20 |
KG-RAG: Bridging the Gap Between Knowledge and Creativity |
Diego Sanmartin et.al. |
2405.12035 |
null |
2024-05-20 |
Can AI Relate: Testing Large Language Model Response for Mental Health Support |
Saadia Gabriel et.al. |
2405.12021 |
link |
2024-05-20 |
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering |
Jingqun Tang et.al. |
2405.11985 |
link |
2024-05-20 |
A review on the use of large language models as virtual tutors |
Silvia García-Méndez et.al. |
2405.11983 |
null |
2024-05-20 |
Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays |
Zhichao Sun et.al. |
2405.11976 |
link |
2024-05-17 |
Observational Scaling Laws and the Predictability of Language Model Performance |
Yangjun Ruan et.al. |
2405.10938 |
link |
2024-05-17 |
A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers |
Kaiyu Huang et.al. |
2405.10936 |
link |
2024-05-17 |
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks |
Lucius Bushnaq et.al. |
2405.10928 |
link |
2024-05-17 |
Blackbox Adaptation for Medical Image Segmentation |
Jay N. Paranjape et.al. |
2405.10913 |
link |
2024-05-17 |
COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain |
Dimitrios P. Panagoulias et.al. |
2405.10893 |
null |
2024-05-17 |
Application of Artificial Intelligence in Schizophrenia Rehabilitation Management: Systematic Literature Review |
Hongyi Yang et.al. |
2405.10883 |
null |
2024-05-17 |
ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains |
Zhaopei Huang et.al. |
2405.10860 |
link |
2024-05-17 |
The Future of Large Language Model Pre-training is Federated |
Lorenzo Sani et.al. |
2405.10853 |
null |
2024-05-17 |
Open-Vocabulary Spatio-Temporal Action Detection |
Tao Wu et.al. |
2405.10832 |
null |
2024-05-17 |
Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities |
Hao Zhou et.al. |
2405.10825 |
null |
2024-05-17 |
ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios |
Markus Bayer et.al. |
2405.10808 |
null |
2024-05-17 |
The Relational Machine Calculus |
Chris Barrett et.al. |
2405.10801 |
null |
2024-05-17 |
Empowering Small-Scale Knowledge Graphs: A Strategy of Leveraging General-Purpose Knowledge Graphs for Enriched Embeddings |
Albert Sawczyn et.al. |
2405.10745 |
null |
2024-05-17 |
Efficient Multimodal Large Language Models: A Survey |
Yizhang Jin et.al. |
2405.10739 |
link |
2024-05-17 |
INDUS: Effective and Efficient Language Models for Scientific Applications |
Bishwaranjan Bhattacharjee et.al. |
2405.10725 |
null |
2024-05-17 |
SignLLM: Sign Languages Production Large Language Models |
Sen Fang et.al. |
2405.10718 |
null |
2024-05-17 |
Persian Pronoun Resolution: Leveraging Neural Networks and Language Models |
Hassan Haji Mohammadi et.al. |
2405.10714 |
null |
2024-05-17 |
SynDy: Synthetic Dynamic Dataset Generation Framework for Misinformation Tasks |
Michael Shliselberg et.al. |
2405.10700 |
null |
2024-05-17 |
Revolutionizing Process Mining: A Novel Architecture for ChatGPT Integration and Enhanced User Experience through Optimized Prompt Engineering |
Mehrdad Agha Mohammad Ali Kermani et.al. |
2405.10689 |
null |
2024-05-17 |
Realistic Evaluation of Toxicity in Large Language Models |
Tinh Son Luong et.al. |
2405.10659 |
null |
2024-05-16 |
UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models |
Sahel Sharifymoghaddam et.al. |
2405.10311 |
null |
2024-05-16 |
4D Panoptic Scene Graph Generation |
Jingkang Yang et.al. |
2405.10305 |
link |
2024-05-16 |
Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees |
Yu Gui et.al. |
2405.10301 |
link |
2024-05-16 |
HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models |
Rhea Sanjay Sukthanker et.al. |
2405.10299 |
link |
2024-05-17 |
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning |
Yuexiang Zhai et.al. |
2405.10292 |
null |
2024-05-16 |
Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction |
Jianhao Chen et.al. |
2405.10288 |
link |
2024-05-16 |
FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models |
Adrian Bulat et.al. |
2405.10286 |
null |
2024-05-16 |
Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers |
Tuo Zhang et.al. |
2405.10276 |
null |
2024-05-16 |
Keep It Private: Unsupervised Privatization of Online Text |
Calvin Bao et.al. |
2405.10260 |
link |
2024-05-16 |
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models |
Xianzheng Ma et.al. |
2405.10255 |
link |
2024-05-16 |
PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology |
George Shaikovski et.al. |
2405.10254 |
null |
2024-05-16 |
A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks |
Xuanfan Ni et.al. |
2405.10251 |
null |
2024-05-16 |
IntelliExplain: Enhancing Interactive Code Generation through Natural Language Explanations for Non-Professional Programmers |
Hao Yan et.al. |
2405.10250 |
null |
2024-05-16 |
A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts |
Xinru Zhang et.al. |
2405.10246 |
link |
2024-05-16 |
DocuMint: Docstring Generation for Python using Small Language Models |
Bibek Poudel et.al. |
2405.10243 |
link |
2024-05-16 |
Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting |
Divij Gupta et.al. |
2405.10216 |
null |
2024-05-16 |
CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations |
Jiahao Zhao et.al. |
2405.10212 |
link |
2024-05-16 |
LFED: A Literary Fiction Evaluation Dataset for Large Language Models |
Linhao Yu et.al. |
2405.10166 |
link |
2024-05-16 |
PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning |
Jiancheng Pan et.al. |
2405.10160 |
link |
2024-05-16 |
Speaker Verification in Agent-Generated Conversations |
Yizhe Yang et.al. |
2405.10150 |
null |
2024-05-15 |
Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming |
Bushi Xiao et.al. |
2405.09508 |
null |
2024-05-15 |
Constrained Learning for Causal Inference and Semiparametric Statistics |
Tiffany Tianhui Cai et.al. |
2405.09493 |
null |
2024-05-15 |
Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts |
Donya Rooein et.al. |
2405.09482 |
null |
2024-05-15 |
Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models |
Majid Zarharan et.al. |
2405.09454 |
link |
2024-05-15 |
M$^4$oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts |
Yufeng Jiang et.al. |
2405.09446 |
link |
2024-05-15 |
Facilitating Opinion Diversity through Hybrid NLP Approaches |
Michiel van der Meer et.al. |
2405.09439 |
null |
2024-05-15 |
A Survey On Text-to-3D Contents Generation In The Wild |
Chenhan Jiang et.al. |
2405.09431 |
null |
2024-05-15 |
MicroPython Testbed for Federated Learning Algorithms |
Miroslav Popovic et.al. |
2405.09423 |
link |
2024-05-15 |
Matching domain experts by training from scratch on domain knowledge |
Xiaoliang Luo et.al. |
2405.09395 |
null |
2024-05-15 |
Compositional imprecise probability |
Jack Liell-Cock et.al. |
2405.09391 |
null |
2024-05-15 |
PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models |
Devansh Jain et.al. |
2405.09373 |
link |
2024-05-15 |
SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition |
Weijie L et.al. |
2405.09365 |
link |
2024-05-15 |
Large Language Model Bias Mitigation from the Perspective of Knowledge Editing |
Ruizhe Chen et.al. |
2405.09341 |
null |
2024-05-15 |
Prompting-based Synthetic Data Generation for Few-Shot Question Answering |
Maximilian Schmidt et.al. |
2405.09335 |
link |
2024-05-15 |
Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls |
Pedro Miguel Sánchez Sánchez et.al. |
2405.09318 |
null |
2024-05-15 |
Comparing the Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support |
Birger Moell et.al. |
2405.09300 |
null |
2024-05-15 |
Do language models capture implied discourse meanings? An investigation with exhaustivity implicatures of Korean morphology |
Hagyeong Shin et.al. |
2405.09293 |
null |
2024-05-15 |
Sign of the Times: Evaluating the use of Large Language Models for Idiomaticity Detection |
Dylan Phelps et.al. |
2405.09279 |
null |
2024-05-15 |
Dynamic Activation Pitfalls in LLaMA Models: An Empirical Study |
Chi Ma et.al. |
2405.09274 |
null |
2024-05-15 |
New Textual Corpora for Serbian Language Modeling |
Mihailo Škorić et.al. |
2405.09250 |
null |
2024-05-14 |
Efficient Vision-Language Pre-training by Cluster Masking |
Zihao Wei et.al. |
2405.08815 |
link |
2024-05-14 |
Towards Enhanced RAC Accessibility: Leveraging Datasets and LLMs |
Edison Jair Bejarano Sepulveda et.al. |
2405.08792 |
link |
2024-05-14 |
Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring |
Tiantian Zhang et.al. |
2405.08786 |
link |
2024-05-14 |
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs |
Akhila Yerukola et.al. |
2405.08760 |
link |
2024-05-14 |
Distributed Threat Intelligence at the Edge Devices: A Large Language Model-Driven Approach |
Syed Mhamudul Hasan et.al. |
2405.08755 |
null |
2024-05-14 |
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding |
Zhimin Li et.al. |
2405.08748 |
link |
2024-05-14 |
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory |
Xueyan Niu et.al. |
2405.08707 |
null |
2024-05-14 |
EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera |
Beilei Cui et.al. |
2405.08672 |
link |
2024-05-14 |
Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research |
Qinglong Cao et.al. |
2405.08668 |
link |
2024-05-14 |
Thinking Tokens for Language Modeling |
David Herel et.al. |
2405.08644 |
null |
2024-05-15 |
ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation |
Dimitris Gkoumas et.al. |
2405.08619 |
null |
2024-05-14 |
A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine |
Hanguang Xiao et.al. |
2405.08603 |
null |
2024-05-15 |
EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark |
Xiaohui Zhang et.al. |
2405.08596 |
link |
2024-05-14 |
Open-Vocabulary Object Detection via Neighboring Region Attention Alignment |
Sunyuan Qiang et.al. |
2405.08593 |
null |
2024-05-14 |
Improving Transformers with Dynamically Composable Multi-Head Attention |
Da Xiao et.al. |
2405.08553 |
link |
2024-05-14 |
Self-Distillation Improves DNA Sequence Inference |
Tong Yu et.al. |
2405.08538 |
link |
2024-05-14 |
Falcon 7b for Software Mention Detection in Scholarly Documents |
AmeerAli Khan et.al. |
2405.08514 |
null |
2024-05-14 |
Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure |
Odysseas S. Chlapanis et.al. |
2405.08502 |
link |
2024-05-14 |
Is Less More? Quality, Quantity and Context in Idiom Processing with Natural Language Models |
Agne Knietaite et.al. |
2405.08497 |
link |
2024-05-14 |
Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models |
Andrea Piergentili et.al. |
2405.08477 |
null |
2024-05-13 |
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots |
Chengyue Wu et.al. |
2405.07990 |
null |
2024-05-13 |
A Generalist Learner for Multifaceted Medical Image Interpretation |
Hong-Yu Zhou et.al. |
2405.07988 |
null |
2024-05-13 |
The Platonic Representation Hypothesis |
Minyoung Huh et.al. |
2405.07987 |
link |
2024-05-13 |
Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation |
Kevin Stangl et.al. |
2405.07969 |
null |
2024-05-13 |
PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation |
Suad Alshammari et.al. |
2405.07963 |
link |
2024-05-13 |
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments |
Samuel Schmidgall et.al. |
2405.07960 |
null |
2024-05-13 |
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning |
Yinzhu Quan et.al. |
2405.07938 |
link |
2024-05-13 |
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition |
Ziyang Zhang et.al. |
2405.07932 |
link |
2024-05-13 |
Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data |
Mahdi Morafah et.al. |
2405.07925 |
null |
2024-05-13 |
Can Better Text Semantics in Prompt Tuning Improve VLM Generalization? |
Hari Chandana Kuchibhotla et.al. |
2405.07921 |
null |
2024-05-13 |
A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking |
Ferdinand Schlatt et.al. |
2405.07920 |
link |
2024-05-13 |
PLUTO: Pathology-Universal Transformer |
Dinkar Juyal et.al. |
2405.07905 |
null |
2024-05-13 |
Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers |
Alena Tsanda et.al. |
2405.07886 |
link |
2024-05-13 |
Zero-Shot Tokenizer Transfer |
Benjamin Minixhofer et.al. |
2405.07883 |
link |
2024-05-13 |
RLHF Workflow: From Reward Modeling to Online RLHF |
Hanze Dong et.al. |
2405.07863 |
link |
2024-05-13 |
Can LLMs Help Predict Elections? (Counter)Evidence from the World’s Largest Democracy |
Pratik Gujral et.al. |
2405.07828 |
null |
2024-05-13 |
A View of How Language Models Will Transform Law |
Frank Fagan et.al. |
2405.07826 |
null |
2024-05-13 |
FreeVA: Offline MLLM as Training-Free Video Assistant |
Wenhao Wu et.al. |
2405.07798 |
link |
2024-05-13 |
DEPTH: Discourse Education through Pre-Training Hierarchically |
Zachary Bamberger et.al. |
2405.07788 |
link |
2024-05-13 |
Generating Human Motion in 3D Scenes from Text Descriptions |
Zhi Cen et.al. |
2405.07784 |
null |
2024-05-10 |
Linearizing Large Language Models |
Jean Mercat et.al. |
2405.06640 |
link |
2024-05-10 |
Value Augmented Sampling for Language Model Alignment and Personalization |
Seungwook Han et.al. |
2405.06639 |
link |
2024-05-10 |
Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark |
Evan M. Williams et.al. |
2405.06634 |
link |
2024-05-10 |
Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models |
Chakshu Moar et.al. |
2405.06626 |
null |
2024-05-10 |
Explaining Text Similarity in Transformer Models |
Alexandros Vasileiou et.al. |
2405.06604 |
link |
2024-05-10 |
Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach |
Elham Ravanbakhsh et.al. |
2405.06586 |
null |
2024-05-10 |
What Can Natural Language Processing Do for Peer Review? |
Ilia Kuznetsov et.al. |
2405.06563 |
link |
2024-05-10 |
Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval |
Mengjia Niu et.al. |
2405.06545 |
null |
2024-05-10 |
Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts |
Wenyu Huang et.al. |
2405.06524 |
null |
2024-05-10 |
UniDM: A Unified Framework for Data Manipulation with Large Language Models |
Yichen Qian et.al. |
2405.06510 |
null |
2024-05-10 |
Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling |
Lyumanshan Ye et.al. |
2405.06495 |
null |
2024-05-10 |
Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification |
Yaoqin Ye et.al. |
2405.06468 |
link |
2024-05-10 |
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation |
JoonHo Lee et.al. |
2405.06424 |
link |
2024-05-10 |
Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions? |
Hunter McNichols et.al. |
2405.06414 |
link |
2024-05-10 |
Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL |
Ning Cheng et.al. |
2405.06410 |
null |
2024-05-10 |
Program Synthesis using Inductive Logic Programming for the Abstraction and Reasoning Corpus |
Filipe Marinho Rocha et.al. |
2405.06399 |
null |
2024-05-10 |
Memory Mosaics |
Jianyu Zhang et.al. |
2405.06394 |
link |
2024-05-10 |
LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play |
Li-Chun Lu et.al. |
2405.06373 |
link |
2024-05-10 |
LMD3: Language Model Data Density Dependence |
John Kirchenbauer et.al. |
2405.06331 |
null |
2024-05-10 |
Correlation Dimension of Natural Language in a Statistical Manifold |
Xin Du et.al. |
2405.06321 |
null |
2024-05-09 |
Natural Language Processing RELIES on Linguistics |
Juri Opitz et.al. |
2405.05966 |
null |
2024-05-09 |
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning |
Dan Qiao et.al. |
2405.05957 |
link |
2024-05-09 |
Probing Multimodal LLMs as World Models for Driving |
Shiva Sreeram et.al. |
2405.05956 |
link |
2024-05-09 |
Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning |
Junzhi Chen et.al. |
2405.05955 |
link |
2024-05-09 |
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts |
Jiachen Li et.al. |
2405.05949 |
link |
2024-05-09 |
DOLOMITES: Domain-Specific Long-Form Methodical Tasks |
Chaitanya Malaviya et.al. |
2405.05938 |
null |
2024-05-09 |
Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness |
Siyuan Li et.al. |
2405.05930 |
null |
2024-05-09 |
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? |
Zorik Gekhman et.al. |
2405.05904 |
null |
2024-05-09 |
Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes |
Ziang Guo et.al. |
2405.05885 |
link |
2024-05-09 |
FlockGPT: Guiding UAV Flocking with Linguistic Orchestration |
Artem Lykov et.al. |
2405.05872 |
null |
2024-05-09 |
Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control |
Gunshi Gupta et.al. |
2405.05852 |
link |
2024-05-09 |
Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning |
Artem Lykov et.al. |
2405.05824 |
link |
2024-05-09 |
Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference |
Zhihang Lin et.al. |
2405.05803 |
link |
2024-05-09 |
Towards a More Inclusive AI: Progress and Perspectives in Large Language Model Training for the Sámi Language |
Ronny Paul et.al. |
2405.05777 |
null |
2024-05-09 |
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions |
Polina Tsvilodub et.al. |
2405.05776 |
null |
2024-05-09 |
Large Language Model-Aided Evolutionary Search for Constrained Multiobjective Optimization |
Zeyi Wang et.al. |
2405.05767 |
null |
2024-05-09 |
Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social Media |
Zhizhen Zhang et.al. |
2405.05760 |
null |
2024-05-09 |
Exploring the Potential of Human-LLM Synergy in Advancing Qualitative Analysis: A Case Study on Mental-Illness Stigma |
Han Meng et.al. |
2405.05758 |
null |
2024-05-09 |
Can large language models understand uncommon meanings of common words? |
Jinyang Wu et.al. |
2405.05741 |
null |
2024-05-09 |
Evaluating Dialect Robustness of Language Models via Conversation Understanding |
Dipankar Srirag et.al. |
2405.05688 |
link |
2024-05-08 |
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models |
Prannay Kaul et.al. |
2405.05256 |
null |
2024-05-08 |
You Only Cache Once: Decoder-Decoder Architectures for Language Models |
Yutao Sun et.al. |
2405.05254 |
link |
2024-05-08 |
Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge |
Charles Koutcheme et.al. |
2405.05253 |
link |
2024-05-09 |
LLMs with Personalities in Multi-issue Negotiation Games |
Sean Noh et.al. |
2405.05248 |
null |
2024-05-08 |
EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning |
Jingfeng Yao et.al. |
2405.05237 |
link |
2024-05-08 |
SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants |
Masoud Moghani et.al. |
2405.05226 |
null |
2024-05-08 |
Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers |
Jiuxiang Gu et.al. |
2405.05219 |
null |
2024-05-08 |
FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models |
Jinglin Xu et.al. |
2405.05216 |
link |
2024-05-08 |
MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning |
Inderjeet Nair et.al. |
2405.05189 |
link |
2024-05-08 |
Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming |
Tommaso Pasini et.al. |
2405.05176 |
null |
2024-05-08 |
Air Gap: Protecting Privacy-Conscious Conversational Agents |
Eugene Bagdasaryan et.al. |
2405.05175 |
null |
2024-05-08 |
XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples |
Peiqin Lin et.al. |
2405.05116 |
link |
2024-05-08 |
QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs |
Weijia Zhang et.al. |
2405.05109 |
null |
2024-05-08 |
Concerns on Bias in Large Language Models when Creating Synthetic Personae |
Helena A. Haxvig et.al. |
2405.05080 |
null |
2024-05-08 |
Impact of Tone-Aware Explanations in Recommender Systems |
Ayano Okoso et.al. |
2405.05061 |
null |
2024-05-08 |
Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models |
Aylin Gunal et.al. |
2405.05060 |
null |
2024-05-08 |
Seeds of Stereotypes: A Large-Scale Textual Analysis of Race and Gender Associations with Diseases in Online Sources |
Lasse Hyldig Hansen et.al. |
2405.05049 |
null |
2024-05-08 |
${M^2D}$ NeRF: Multi-Modal Decomposition NeRF with 3D Feature Fields |
Ning Wang et.al. |
2405.05010 |
null |
2024-05-08 |
ADELIE: Aligning Large Language Models on Information Extraction |
Yunjia Qi et.al. |
2405.05008 |
link |
2024-05-08 |
NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair |
Ruoke Wang et.al. |
2405.04994 |
null |
2024-05-07 |
ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning |
Jing Lin et.al. |
2405.04533 |
null |
2024-05-07 |
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving |
Yujun Lin et.al. |
2405.04532 |
link |
2024-05-07 |
NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts |
Shudan Zhang et.al. |
2405.04520 |
null |
2024-05-07 |
xLSTM: Extended Long Short-Term Memory |
Maximilian Beck et.al. |
2405.04517 |
link |
2024-05-07 |
A Transformer with Stack Attention |
Jiaoda Li et.al. |
2405.04515 |
link |
2024-05-08 |
Unveiling Disparities in Web Task Handling Between Human and Web Agent |
Kihoon Son et.al. |
2405.04497 |
null |
2024-05-07 |
Toward In-Context Teaching: Adapting Examples to Students’ Misconceptions |
Alexis Ross et.al. |
2405.04495 |
null |
2024-05-07 |
Representation Learning of Daily Movement Data Using Text Encoders |
Alexander Capstick et.al. |
2405.04494 |
link |
2024-05-08 |
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model |
DeepSeek-AI et.al. |
2405.04434 |
link |
2024-05-07 |
The Silicone Ceiling: Auditing GPT’s Race and Gender Biases in Hiring |
Lena Armstrong et.al. |
2405.04412 |
null |
2024-05-07 |
Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks |
Georgios Pantazopoulos et.al. |
2405.04403 |
link |
2024-05-07 |
Large Language Models Cannot Explain Themselves |
Advait Sarkar et.al. |
2405.04382 |
null |
2024-05-07 |
A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI |
Hannah Chafetz et.al. |
2405.04333 |
null |
2024-05-07 |
Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation |
Atharvan Dogra et.al. |
2405.04325 |
null |
2024-05-07 |
Granite Code Models: A Family of Open Foundation Models for Code Intelligence |
Mayank Mishra et.al. |
2405.04324 |
link |
2024-05-07 |
Accelerating Speculative Decoding using Dynamic Speculation Length |
Jonathan Mamou et.al. |
2405.04304 |
null |
2024-05-07 |
Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework |
Xiangpeng Wan et.al. |
2405.04294 |
link |
2024-05-07 |
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore |
Junchao Wu et.al. |
2405.04286 |
null |
2024-05-07 |
On the Foundations of Earth and Climate Foundation Models |
Xiao Xiang Zhu et.al. |
2405.04285 |
null |
2024-05-07 |
Semantic API Alignment: Linking High-level User Goals to APIs |
Robert Feldt et.al. |
2405.04236 |
null |
2024-05-06 |
Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs |
Muhammad Uzair Khattak et.al. |
2405.03690 |
null |
2024-05-06 |
Pose Priors from Language Models |
Sanjay Subramanian et.al. |
2405.03689 |
null |
2024-05-06 |
Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames |
Keith Burghardt et.al. |
2405.03688 |
link |
2024-05-06 |
Language-Image Models with 3D Understanding |
Jang Hyun Cho et.al. |
2405.03685 |
null |
2024-05-06 |
AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design |
Kamal Choudhary et.al. |
2405.03680 |
link |
2024-05-06 |
When LLMs Meet Cybersecurity: A Systematic Literature Review |
Jie Zhang et.al. |
2405.03644 |
link |
2024-05-06 |
A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama |
Vlad-Andrei Cursaru et.al. |
2405.03616 |
null |
2024-05-06 |
GREEN: Generative Radiology Report Evaluation and Error Notation |
Sophie Ostmeier et.al. |
2405.03595 |
null |
2024-05-06 |
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment |
Abhinav Agarwalla et.al. |
2405.03594 |
null |
2024-05-06 |
Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing |
Han Liu et.al. |
2405.03565 |
null |
2024-05-07 |
ID-centric Pre-training for Recommendation |
Yiqing Wu et.al. |
2405.03562 |
null |
2024-05-06 |
AlphaMath Almost Zero: process Supervision without process |
Guoxin Chen et.al. |
2405.03553 |
link |
2024-05-06 |
MAmmoTH2: Scaling Instructions from the Web |
Xiang Yue et.al. |
2405.03548 |
null |
2024-05-06 |
Position Paper: Leveraging Foundational Models for Black-Box Optimization: Benefits, Challenges, and Future Directions |
Xingyou Song et.al. |
2405.03547 |
null |
2024-05-06 |
Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning |
Yubo Mai et.al. |
2405.03509 |
null |
2024-05-06 |
UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images |
Yiting Qu et.al. |
2405.03486 |
null |
2024-05-06 |
LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model |
Haowen Sun et.al. |
2405.03485 |
link |
2024-05-06 |
Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search |
Hideaki Joko et.al. |
2405.03480 |
link |
2024-05-07 |
Large Language Models (LLMs) as Agents for Augmented Democracy |
Jairo Gudiño-Rosero et.al. |
2405.03452 |
null |
2024-05-06 |
SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence |
Hangyuan Ji et.al. |
2405.03446 |
link |
2024-05-03 |
Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models |
Piotr Padlewski et.al. |
2405.02287 |
link |
2024-05-03 |
Structural Pruning of Pre-trained Language Models via Neural Architecture Search |
Aaron Klein et.al. |
2405.02267 |
link |
2024-05-03 |
On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning? |
Maxime Zanella et.al. |
2405.02266 |
link |
2024-05-03 |
Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows |
Jasmine Y. Shih et.al. |
2405.02260 |
null |
2024-05-03 |
What matters when building vision-language models? |
Hugo Laurençon et.al. |
2405.02246 |
null |
2024-05-03 |
REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs |
Deepa Tilwani et.al. |
2405.02228 |
null |
2024-05-03 |
Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks |
Lujing Zhang et.al. |
2405.02225 |
null |
2024-05-03 |
FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems |
Yashar Deldjoo et.al. |
2405.02219 |
null |
2024-05-03 |
Automatic Programming: Large Language Models and Beyond |
Michael R. Lyu et.al. |
2405.02213 |
null |
2024-05-03 |
Assessing and Verifying Task Utility in LLM-Powered Applications |
Negar Arabzadeh et.al. |
2405.02178 |
null |
2024-05-03 |
Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset |
Hsuvas Borkakoty et.al. |
2405.02175 |
link |
2024-05-03 |
Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models |
Mohamad Al Mdfaa et.al. |
2405.02162 |
null |
2024-05-03 |
Neural Context Flows for Learning Generalizable Dynamical Systems |
Roussel Desmond Nzoyem et.al. |
2405.02154 |
link |
2024-05-03 |
The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates |
Giuseppe Russo Latona et.al. |
2405.02150 |
link |
2024-05-03 |
MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain |
Chao Jiang et.al. |
2405.02144 |
null |
2024-05-03 |
Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection |
Guillem Ramírez et.al. |
2405.02134 |
null |
2024-05-03 |
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets |
Xuelong Geng et.al. |
2405.02132 |
link |
2024-05-03 |
Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph |
Vladyslav Nechakhin et.al. |
2405.02105 |
null |
2024-05-03 |
Argumentative Large Language Models for Explainable and Contestable Decision-Making |
Gabriel Freedman et.al. |
2405.02079 |
null |
2024-05-03 |
Comparative Analysis of Retrieval Systems in the Real World |
Dmytro Mozolevskyi et.al. |
2405.02048 |
null |
2024-05-02 |
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models |
Seungone Kim et.al. |
2405.01535 |
link |
2024-05-02 |
Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks |
Murtaza Dalal et.al. |
2405.01534 |
null |
2024-05-02 |
OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning |
Shihao Wang et.al. |
2405.01533 |
link |
2024-05-02 |
FLAME: Factuality-Aware Alignment for Large Language Models |
Sheng-Chieh Lin et.al. |
2405.01525 |
null |
2024-05-02 |
A separability-based approach to quantifying generalization: which layer is best? |
Luciano Dyballa et.al. |
2405.01524 |
link |
2024-05-02 |
Transformer-Aided Semantic Communications |
Matin Mortaheb et.al. |
2405.01521 |
null |
2024-05-02 |
D2PO: Discriminator-Guided DPO with Response Evaluation Models |
Prasann Singhal et.al. |
2405.01511 |
link |
2024-05-02 |
Analyzing the Role of Semantic Representations in the Era of Large Language Models |
Zhijing Jin et.al. |
2405.01502 |
link |
2024-05-02 |
Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models |
Raymond Fok et.al. |
2405.01501 |
null |
2024-05-02 |
Controllable Text Generation in the Instruction-Tuning Era |
Dhananjay Ashok et.al. |
2405.01490 |
null |
2024-05-02 |
MANTIS: Interleaved Multi-Image Instruction Tuning |
Dongfu Jiang et.al. |
2405.01483 |
link |
2024-05-02 |
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment |
Gerald Shen et.al. |
2405.01481 |
link |
2024-05-02 |
V-FLUTE: Visual Figurative Language Understanding with Textual Explanations |
Arkadiy Saakyan et.al. |
2405.01474 |
link |
2024-05-02 |
Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning |
Théo Moutakanni et.al. |
2405.01469 |
null |
2024-05-02 |
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models |
Yifei Ming et.al. |
2405.01468 |
null |
2024-05-02 |
A Systematic Literature Review on Large Language Models for Automated Program Repair |
Quanjun Zhang et.al. |
2405.01466 |
link |
2024-05-02 |
Natural Language to Verilog: Design of a Recurrent Spiking Neural Network using Large Language Models and ChatGPT |
Paola Vitolo et.al. |
2405.01419 |
null |
2024-05-02 |
MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors |
Yuan Tang et.al. |
2405.01413 |
link |
2024-05-02 |
Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving |
Xin Quan et.al. |
2405.01379 |
link |
2024-05-02 |
GAIA: A General AI Assistant for Intelligent Accelerator Operations |
Frank Mayet et.al. |
2405.01359 |
null |
2024-05-01 |
Self-Play Preference Optimization for Language Model Alignment |
Yue Wu et.al. |
2405.00675 |
link |
2024-05-01 |
Is Bigger Edit Batch Size Always Better? – An Empirical Study on Model Editing with Llama-3 |
Junsang Yoon et.al. |
2405.00664 |
link |
2024-05-01 |
HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models |
Ningke Li et.al. |
2405.00648 |
null |
2024-05-01 |
When Quantization Affects Confidence of Large Language Models? |
Irina Proskurina et.al. |
2405.00632 |
link |
2024-05-01 |
“I’m Not Sure, But…”: Examining the Impact of Large Language Models’ Uncertainty Expression on User Reliance and Trust |
Sunnie S. Y. Kim et.al. |
2405.00623 |
null |
2024-05-01 |
Causal Evaluation of Language Models |
Sirui Chen et.al. |
2405.00622 |
link |
2024-05-01 |
Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling |
Yida Mu et.al. |
2405.00611 |
link |
2024-05-01 |
Investigating Automatic Scoring and Feedback using Large Language Models |
Gloria Ashiya Katuka et.al. |
2405.00602 |
null |
2024-05-01 |
Are Models Biased on Text without Gender-related Language? |
Catarina G Belém et.al. |
2405.00588 |
link |
2024-05-01 |
The Real, the Better: Aligning Large Language Models with Online Human Behaviors |
Guanying Jiang et.al. |
2405.00578 |
null |
2024-05-01 |
EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model |
Deng Li et.al. |
2405.00574 |
null |
2024-05-01 |
NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance |
Huan-Yi Su et.al. |
2405.00566 |
null |
2024-05-01 |
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment |
Zhili Liu et.al. |
2405.00557 |
null |
2024-05-01 |
Long-Term Human Trajectory Prediction using 3D Dynamic Scene Graphs |
Nicolas Gorlo et.al. |
2405.00552 |
link |
2024-05-01 |
ChatBI: Towards Natural Language to Complex Business Intelligence SQL |
Jinqing Lian et.al. |
2405.00527 |
null |
2024-05-01 |
CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions |
Donghee Choi et.al. |
2405.00523 |
null |
2024-05-01 |
Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning |
Lucas-Andreï Thil et.al. |
2405.00516 |
null |
2024-05-01 |
GOLD: Geometry Problem Solver with Natural Language Description |
Jiaxin Zhang et.al. |
2405.00494 |
link |
2024-05-01 |
Is Temperature the Creativity Parameter of Large Language Models? |
Max Peeperkorn et.al. |
2405.00492 |
link |
2024-05-01 |
The Pyramid of Captions |
Delong Chen et.al. |
2405.00485 |
null |
2024-04-30 |
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation |
Yunhao Ge et.al. |
2404.19752 |
null |
2024-04-30 |
PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification |
Leon Garza et.al. |
2404.19744 |
null |
2024-04-30 |
Better & Faster Large Language Models via Multi-token Prediction |
Fabian Gloeckle et.al. |
2404.19737 |
null |
2024-04-30 |
A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications |
Steph Buongiorno et.al. |
2404.19729 |
null |
2024-04-30 |
PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games |
Steph Buongiorno et.al. |
2404.19721 |
null |
2024-04-30 |
Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns |
Constantinos Patsakis et.al. |
2404.19715 |
null |
2024-04-30 |
Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models |
Scott Sumpter et.al. |
2404.19713 |
null |
2024-04-30 |
When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively |
Tiziano Labruna et.al. |
2404.19705 |
link |
2024-04-30 |
Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners |
Chun Feng et.al. |
2404.19696 |
null |
2024-04-30 |
Towards Generalist Robot Learning from Internet Video: A Survey |
Robert McCarthy et.al. |
2404.19664 |
null |
2024-04-30 |
MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation |
Min Zhang et.al. |
2404.19644 |
link |
2024-04-30 |
On Training a Neural Network to Explain Binaries |
Alexander Interrante-Grant et.al. |
2404.19631 |
null |
2024-04-30 |
Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model |
Denys Godwin et.al. |
2404.19609 |
null |
2024-04-30 |
Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning |
Xuanli He et.al. |
2404.19597 |
null |
2024-04-30 |
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing |
Yucheng Hu et.al. |
2404.19543 |
link |
2024-04-30 |
MoST: Multi-modality Scene Tokenization for Motion Prediction |
Norman Mu et.al. |
2404.19531 |
null |
2024-04-30 |
Do Large Language Models Understand Conversational Implicature – A case study with a Chinese sitcom |
Shisen Yue et.al. |
2404.19509 |
link |
2024-04-30 |
More Compute Is What You Need |
Zhen Guo et.al. |
2404.19484 |
null |
2024-05-01 |
Neuro-Vision to Language: Image Reconstruction and Language enabled Interaction via Brain Recordings |
Guobin Shen et.al. |
2404.19438 |
null |
2024-04-30 |
Can Large Language Models put 2 and 2 together? Probing for Entailed Arithmetical Relationships |
D. Panas et.al. |
2404.19432 |
null |
2024-04-29 |
Hallucination of Multimodal Large Language Models: A Survey |
Zechen Bai et.al. |
2404.18930 |
link |
2024-04-29 |
Holmes: Benchmark the Linguistic Competence of Language Models |
Andreas Waldis et.al. |
2404.18923 |
null |
2024-04-29 |
DPO Meets PPO: Reinforced Token Optimization for RLHF |
Han Zhong et.al. |
2404.18922 |
null |
2024-04-29 |
TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation |
Junhao Cheng et.al. |
2404.18919 |
link |
2024-04-29 |
Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting |
Fangcheng Liu et.al. |
2404.18911 |
link |
2024-04-29 |
Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking |
Hong Jin Kang et.al. |
2404.18881 |
link |
2024-04-29 |
More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness |
Aaron J. Li et.al. |
2404.18870 |
link |
2024-04-29 |
Truth-value judgment in language models: belief directions are context sensitive |
Stefan F. Schouten et.al. |
2404.18865 |
null |
2024-04-29 |
Performance-Aligned LLMs for Generating Fast Code |
Daniel Nichols et.al. |
2404.18864 |
null |
2024-04-29 |
A Survey on Vision Mamba: Models, Applications and Challenges |
Rui Xu et.al. |
2404.18861 |
link |
2024-04-29 |
VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning |
Aidan Z. H. Yang et.al. |
2404.18852 |
null |
2024-04-29 |
FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition |
Yuxuan Yan et.al. |
2404.18848 |
null |
2024-04-29 |
It’s Difficult to be Neutral – Human and LLM-based Sentiment Annotation of Patient Comments |
Petter Mæhlum et.al. |
2404.18832 |
null |
2024-04-29 |
Benchmarking Benchmark Leakage in Large Language Models |
Ruijie Xu et.al. |
2404.18824 |
link |
2024-04-29 |
AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering |
Wenxiang Zhao et.al. |
2404.18816 |
null |
2024-04-29 |
Unknown Script: Impact of Script on Cross-Lingual Transfer |
Wondimagegnhue Tsegaye Tufa et.al. |
2404.18810 |
link |
2024-04-29 |
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models |
Pat Verga et.al. |
2404.18796 |
null |
2024-04-29 |
PECC: Problem Extraction and Coding Challenges |
Patrick Haller et.al. |
2404.18766 |
link |
2024-04-29 |
Transitive Vision-Language Prompt Learning for Domain Generalization |
Liyuan Wang et.al. |
2404.18758 |
null |
2024-04-29 |
Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models |
Hongyi Zhu et.al. |
2404.18746 |
null |
2024-04-26 |
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo |
Stephen Zhao et.al. |
2404.17546 |
link |
2024-04-26 |
Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models |
Yuhang Huang et.al. |
2404.17534 |
null |
2024-04-26 |
Large Language Model Agent as a Mechanical Designer |
Yayati Jadhav et.al. |
2404.17525 |
null |
2024-04-26 |
On the Use of Large Language Models to Generate Capability Ontologies |
Luis Miguel Vieira da Silva et.al. |
2404.17524 |
link |
2024-04-26 |
Enhancing Legal Compliance and Regulation Analysis with Large Language Models |
Shabnam Hassani et.al. |
2404.17522 |
null |
2024-04-26 |
A Comprehensive Evaluation on Event Reasoning of Large Language Models |
Zhengwei Tao et.al. |
2404.17513 |
link |
2024-04-26 |
CEval: A Benchmark for Evaluating Counterfactual Text Generation |
Van Bach Nguyen et.al. |
2404.17475 |
link |
2024-04-26 |
Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System |
Robin Schmucker et.al. |
2404.17460 |
null |
2024-04-26 |
“ChatGPT Is Here to Help, Not to Replace Anybody” – An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses |
Bruno Pereira Cipriano et.al. |
2404.17443 |
null |
2024-04-26 |
PromptCIR: Blind Compressed Image Restoration with Prompt Learning |
Bingchen Li et.al. |
2404.17433 |
link |
2024-04-26 |
Evaluation of Geographical Distortions in Language Models: A Crucial Step Towards Equitable Representations |
Rémy Decoupes et.al. |
2404.17401 |
null |
2024-04-26 |
UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning |
Maoxun Yuan et.al. |
2404.17360 |
null |
2024-04-26 |
InspectorRAGet: An Introspection Platform for RAG Evaluation |
Kshitij Fadnis et.al. |
2404.17347 |
link |
2024-04-26 |
Introducing cosmosGPT: Monolingual Training for Turkish Language Models |
H. Toprak Kesgin et.al. |
2404.17336 |
null |
2024-04-26 |
A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation |
Xin Zhang et.al. |
2404.17335 |
null |
2024-04-26 |
An Extendable Cloud-Native Alloy Property Explorer |
Zhuoyuan Li et.al. |
2404.17330 |
link |
2024-04-26 |
When to Trust LLMs: Aligning Confidence with Response Quality |
Shuchang Tao et.al. |
2404.17287 |
link |
2024-04-26 |
Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM |
Xuan Zhang et.al. |
2404.17283 |
link |
2024-04-26 |
Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT as a Pivot |
Michelle Terblanche et.al. |
2404.17216 |
null |
2024-04-26 |
Low-Rank Knowledge Decomposition for Medical Foundation Models |
Yuhang Zhou et.al. |
2404.17184 |
link |
2024-04-25 |
The Third Monocular Depth Estimation Challenge |
Jaime Spencer et.al. |
2404.16831 |
null |
2024-04-25 |
Make-it-Real: Unleashing Large Multimodal Model’s Ability for Painting 3D Objects with Realistic Materials |
Ye Fang et.al. |
2404.16829 |
null |
2024-04-25 |
V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection |
Xuanyu Zhang et.al. |
2404.16824 |
null |
2024-04-25 |
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites |
Zhe Chen et.al. |
2404.16821 |
link |
2024-04-25 |
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages |
Harman Singh et.al. |
2404.16816 |
link |
2024-04-26 |
Make Your LLM Fully Utilize the Context |
Shengnan An et.al. |
2404.16811 |
link |
2024-04-25 |
Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning |
Tianhui Zhang et.al. |
2404.16807 |
link |
2024-04-25 |
AAPL: Adding Attributes to Prompt Learning for Vision-Language Models |
Gahyeon Kim et.al. |
2404.16804 |
link |
2024-04-25 |
Weak-to-Strong Extrapolation Expedites Alignment |
Chujie Zheng et.al. |
2404.16792 |
link |
2024-04-25 |
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension |
Bohao Li et.al. |
2404.16790 |
link |
2024-04-25 |
Continual Learning of Large Language Models: A Comprehensive Survey |
Haizhou Shi et.al. |
2404.16789 |
link |
2024-04-25 |
Modeling Selective Feature Attention for Representation-based Siamese Text Matching |
Jianxiang Zang et.al. |
2404.16776 |
link |
2024-04-25 |
REBEL: Reinforcement Learning via Regressing Relative Rewards |
Zhaolin Gao et.al. |
2404.16767 |
link |
2024-04-25 |
Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model |
Runzhe Zhan et.al. |
2404.16766 |
null |
2024-04-25 |
RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis |
Xiaoman Zhang et.al. |
2404.16754 |
link |
2024-04-25 |
Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class |
Mazda Moayeri et.al. |
2404.16717 |
null |
2024-04-25 |
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding |
Mostafa Elhoushi et.al. |
2404.16710 |
link |
2024-04-25 |
Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents |
Giorgio Piatti et.al. |
2404.16698 |
link |
2024-04-25 |
Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 |
Lydia Uhler et.al. |
2404.16692 |
null |
2024-04-25 |
EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning |
Hongxia Xie et.al. |
2404.16670 |
link |
2024-04-24 |
Hybrid LLM/Rule-based Approaches to Business Insights Generation from Structured Data |
Aliaksei Vertsel et.al. |
2404.15604 |
null |
2024-04-24 |
ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction |
Henry Peng Zou et.al. |
2404.15592 |
link |
2024-04-24 |
MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis |
Jiaxin Zhuang et.al. |
2404.15580 |
null |
2024-04-24 |
Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? |
Hossein Salami et.al. |
2404.15578 |
null |
2024-04-24 |
Retrieval Head Mechanistically Explains Long-Context Factuality |
Wenhao Wu et.al. |
2404.15574 |
link |
2024-04-23 |
PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models |
Shashi Kant Gupta et.al. |
2404.15549 |
null |
2024-04-23 |
BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis |
Shuhang Lin et.al. |
2404.15532 |
link |
2024-04-23 |
Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models |
Mihir Parmar et.al. |
2404.15522 |
link |
2024-04-23 |
Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval |
Young Kyun Jang et.al. |
2404.15516 |
null |
2024-04-23 |
ToM-LM: Delegating Theory Of Mind Reasoning to External Symbolic Executors in Large Language Models |
Weizhi Tang et.al. |
2404.15515 |
null |
2024-04-23 |
IryoNLP at MEDIQA-CORR 2024: Tackling the Medical Error Detection & Correction Task On the Shoulders of Medical Agents |
Jean-Philippe Corbeil et.al. |
2404.15488 |
link |
2024-04-23 |
Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance |
Het Patel et.al. |
2404.15485 |
null |
2024-04-23 |
Can Large Language Models Learn the Physics of Metamaterials? An Empirical Study with ChatGPT |
Darui Lu et.al. |
2404.15458 |
null |
2024-04-23 |
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference |
João Monteiro et.al. |
2404.15420 |
null |
2024-04-23 |
Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs |
Davide Caffagni et.al. |
2404.15406 |
null |
2024-04-23 |
Aligning LLM Agents by Learning Latent Preference from User Edits |
Ge Gao et.al. |
2404.15269 |
link |
2024-04-23 |
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts |
Yifeng Ding et.al. |
2404.15247 |
link |
2024-04-23 |
CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies |
Weiyan Shi et.al. |
2404.15238 |
link |
2024-04-23 |
Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models |
Aidan Z. H. Yang et.al. |
2404.15236 |
null |
2024-04-23 |
Re-Thinking Inverse Graphics With Large Language Models |
Peter Kulits et.al. |
2404.15228 |
null |
2024-04-23 |
Does Instruction Tuning Make LLMs More Consistent? |
Constanza Fierro et.al. |
2404.15206 |
null |
2024-04-23 |
Setting up the Data Printer with Improved English to Ukrainian Machine Translation |
Yurii Paniv et.al. |
2404.15196 |
link |
2024-04-23 |
Regressive Side Effects of Training Language Models to Mimic Student Misconceptions |
Shashank Sonkar et.al. |
2404.15156 |
null |
2024-04-23 |
Bias patterns in the application of LLMs for clinical decision support: A comprehensive study |
Raphael Poulain et.al. |
2404.15149 |
link |
2024-04-23 |
Rethinking LLM Memorization through the Lens of Adversarial Compression |
Avi Schwarzschild et.al. |
2404.15146 |
null |
2024-04-23 |
MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning |
Sunan He et.al. |
2404.15127 |
link |
2024-04-23 |
Identifying Fairness Issues in Automatically Generated Testing Content |
Kevin Stowe et.al. |
2404.15104 |
null |
2024-04-23 |
Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation |
Xun Wu et.al. |
2404.15100 |
null |
2024-04-23 |
Detection of circular permutations by Protein Language Models |
Yue Hu et.al. |
2404.15087 |
link |
2024-04-23 |
Multi-Head Mixture-of-Experts |
Xun Wu et.al. |
2404.15045 |
link |
2024-04-23 |
TAXI: Evaluating Categorical Knowledge Editing for Language Models |
Derek Powell et.al. |
2404.15004 |
link |
2024-04-23 |
Transformers Can Represent $n$-gram Language Models |
Anej Svete et.al. |
2404.14994 |
null |
2024-04-23 |
A Short Review for Ontology Learning from Text: Stride from Shallow Learning, Deep Learning to Large Language Models Trend |
Rick Du et.al. |
2404.14991 |
null |
2024-04-23 |
$\texttt{MiniMol}$: A Parameter-Efficient Foundation Model for Molecular Learning |
Kerstin Kläser et.al. |
2404.14986 |
null |
2024-04-23 |
Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case |
Muhammad Asif Auyb et.al. |
2404.14977 |
null |
2024-04-22 |
AutoAD III: The Prequel – Back to the Pixels |
Tengda Han et.al. |
2404.14412 |
null |
2024-04-22 |
SpaceByte: Towards Deleting Tokenization from Large Language Modeling |
Kevin Slagle et.al. |
2404.14408 |
link |
2024-04-22 |
RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios? |
Adrian de Wynter et.al. |
2404.14397 |
link |
2024-04-22 |
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation |
Yuying Ge et.al. |
2404.14396 |
link |
2024-04-22 |
PARAMANU-GANITA: Language Model with Mathematical Capabilities |
Mitodru Niyogi et.al. |
2404.14395 |
null |
2024-04-22 |
A Multimodal Automated Interpretability Agent |
Tamar Rott Shaham et.al. |
2404.14394 |
null |
2024-04-22 |
A Survey on Self-Evolution of Large Language Models |
Zhengwei Tao et.al. |
2404.14387 |
link |
2024-04-22 |
Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph |
Xiaochen Kev Gao et.al. |
2404.14372 |
link |
2024-04-23 |
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data |
Fahim Tajwar et.al. |
2404.14367 |
link |
2024-04-22 |
Better Synthetic Data by Retrieving and Transforming Existing Datasets |
Saumya Gandhi et.al. |
2404.14361 |
link |
2024-04-22 |
Rethinking Legal Compliance Automation: Opportunities with Large Language Models |
Shabnam Hassani et.al. |
2404.14356 |
null |
2024-04-22 |
Calc-CMU at SemEval-2024 Task 7: Pre-Calc – Learning to Use the Calculator Improves Numeracy in Language Models |
Vishruth Veerendranath et.al. |
2404.14355 |
link |
2024-04-22 |
Automated Long Answer Grading with RiceChem Dataset |
Shashank Sonkar et.al. |
2404.14316 |
link |
2024-04-22 |
Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels |
Jan-Philipp Fränken et.al. |
2404.14313 |
link |
2024-04-22 |
Explaining Arguments’ Strength: Unveiling the Role of Attacks and Supports (Technical Report) |
Xiang Yin et.al. |
2404.14304 |
link |
2024-04-22 |
Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits |
Shashank Sonkar et.al. |
2404.14301 |
null |
2024-04-22 |
Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach |
Yao Wan et.al. |
2404.14296 |
link |
2024-04-22 |
A Survey on Efficient Inference for Large Language Models |
Zixuan Zhou et.al. |
2404.14294 |
null |
2024-04-22 |
LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots |
Dongge Han et.al. |
2404.14285 |
null |
2024-04-22 |
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback |
Wenyi Xiao et.al. |
2404.14233 |
null |
2024-04-19 |
MoVA: Adapting Mixture of Vision Experts to Multimodal Context |
Zhuofan Zong et.al. |
2404.13046 |
link |
2024-04-19 |
Unified Scene Representation and Reconstruction for 3D Large Language Models |
Tao Chu et.al. |
2404.13044 |
null |
2024-04-19 |
Data Alignment for Zero-Shot Concept Generation in Dermatology AI |
Soham Gadgil et.al. |
2404.13043 |
null |
2024-04-19 |
Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs |
Biyang Guo et.al. |
2404.13033 |
link |
2024-04-19 |
When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering |
Stephen Choi et.al. |
2404.13028 |
null |
2024-04-19 |
Stronger Random Baselines for In-Context Learning |
Gregory Yauney et.al. |
2404.13020 |
link |
2024-04-19 |
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models |
Chuofan Ma et.al. |
2404.13013 |
link |
2024-04-19 |
Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs |
Clemencia Siro et.al. |
2404.12994 |
link |
2024-04-19 |
FineRec: Exploring Fine-grained Sequential Recommendation |
Xiaokun Zhang et.al. |
2404.12975 |
link |
2024-04-19 |
Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models |
Yian Li et.al. |
2404.12966 |
null |
2024-04-19 |
Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction |
Qinyuan Wu et.al. |
2404.12957 |
null |
2024-04-19 |
Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models |
Konstantinos Vilouras et.al. |
2404.12920 |
null |
2024-04-19 |
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models |
Zhenyang Ni et.al. |
2404.12916 |
link |
2024-04-19 |
Large Language Models for Networking: Workflow, Advances and Challenges |
Chang Liu et.al. |
2404.12901 |
null |
2024-04-19 |
Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning |
Ahmed Elshabrawy et.al. |
2404.12897 |
null |
2024-04-19 |
Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation |
Guanhua Chen et.al. |
2404.12879 |
null |
2024-04-19 |
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency |
Zhaodonghui Li et.al. |
2404.12872 |
link |
2024-04-19 |
How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning? |
Yang Luo et.al. |
2404.12866 |
link |
2024-04-19 |
Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation |
Yilong Chen et.al. |
2404.12861 |
null |
2024-04-19 |
TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages |
Aleksei Dorkin et.al. |
2404.12845 |
null |
2024-04-18 |
BLINK: Multimodal Large Language Models Can See but Not Perceive |
Xingyu Fu et.al. |
2404.12390 |
null |
2024-04-18 |
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models |
Aitor Ormazabal et.al. |
2404.12387 |
null |
2024-04-18 |
MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale |
Xiaotang Gai et.al. |
2404.12372 |
null |
2024-04-18 |
When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes |
Asaf Yehudai et.al. |
2404.12365 |
link |
2024-04-18 |
From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function |
Rafael Rafailov et.al. |
2404.12358 |
null |
2024-04-18 |
Towards a Foundation Model for Partial Differential Equation: Multi-Operator Learning and Extrapolation |
Jingmin Sun et.al. |
2404.12355 |
link |
2024-04-18 |
V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning |
Hang Hua et.al. |
2404.12353 |
null |
2024-04-18 |
Evaluating AI for Law: Bridging the Gap with Open-Source Solutions |
Rohan Bhambhoria et.al. |
2404.12349 |
null |
2024-04-18 |
Large Language Models in Targeted Sentiment Analysis |
Nicolay Rusnachenko et.al. |
2404.12342 |
link |
2024-04-18 |
Normative Requirements Operationalization with Large Language Models |
Nick Feng et.al. |
2404.12335 |
null |
2024-04-18 |
Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment |
Zhaofeng Wu et.al. |
2404.12318 |
null |
2024-04-18 |
Large Language Models for Synthetic Participatory Planning of Shared Automated Electric Mobility Systems |
Jiangbo Yu et.al. |
2404.12317 |
null |
2024-04-18 |
Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair |
Yusuke Sakai et.al. |
2404.12299 |
null |
2024-04-18 |
Augmenting emotion features in irony detection with Large language modeling |
Yucheng Lin et.al. |
2404.12291 |
null |
2024-04-18 |
Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery |
Yona Falinie A. Gaus et.al. |
2404.12285 |
null |
2024-04-18 |
Enhancing Embedding Performance through Large Language Model-based Text Enrichment and Rewriting |
Nicholas Harris et.al. |
2404.12283 |
null |
2024-04-18 |
Advancing the Robustness of Large Language Models through Self-Denoised Smoothing |
Jiabao Ji et.al. |
2404.12274 |
link |
2024-04-18 |
FedEval-LLM: Federated Evaluation of Large Language Models on Downstream Tasks with Collective Wisdom |
Yuanqin He et.al. |
2404.12273 |
null |
2024-04-18 |
Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences |
Shreya Shankar et.al. |
2404.12272 |
null |
2024-04-18 |
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM |
Michelle S. Lam et.al. |
2404.12259 |
link |
2024-04-17 |
Private federated discovery of out-of-vocabulary words for Gboard |
Ziteng Sun et.al. |
2404.11607 |
null |
2024-04-17 |
VG4D: Vision-Language Model Goes 4D Video Recognition |
Zhichao Deng et.al. |
2404.11605 |
link |
2024-04-17 |
A Deep Dive into Large Language Models for Automated Bug Localization and Repair |
Soneya Binta Hossain et.al. |
2404.11595 |
null |
2024-04-17 |
Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding |
Zezhong Fan et.al. |
2404.11589 |
null |
2024-04-17 |
LLMTune: Accelerate Database Knob Tuning with Large Language Models |
Xinmei Huang et.al. |
2404.11581 |
link |
2024-04-17 |
On the Scalability of GNNs for Molecular Graphs |
Maciej Sypetkowski et.al. |
2404.11568 |
null |
2024-04-17 |
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation |
Kuan-Chieh et.al. |
2404.11565 |
null |
2024-04-17 |
Quantifying Multilingual Performance of Large Language Models Across Languages |
Zihao Li et.al. |
2404.11553 |
null |
2024-04-17 |
Evaluating Span Extraction in Generative Paradigm: A Reflection on Aspect-Based Sentiment Analysis |
Soyoung Yang et.al. |
2404.11539 |
null |
2024-04-17 |
FedPFT: Federated Proxy Fine-Tuning of Foundation Models |
Zhaopeng Peng et.al. |
2404.11536 |
link |
2024-04-17 |
Select and Reorder: A Novel Approach for Neural Sign Language Production |
Harry Walsh et.al. |
2404.11532 |
null |
2024-04-17 |
Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization |
Costas Mavromatis et.al. |
2404.11531 |
link |
2024-04-17 |
Embedding Privacy in Computational Social Science and Artificial Intelligence Research |
Keenan Jones et.al. |
2404.11515 |
null |
2024-04-17 |
Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models |
Yushuo Chen et.al. |
2404.11502 |
link |
2024-04-17 |
Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models |
Yue Zhou et.al. |
2404.11500 |
link |
2024-04-18 |
Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent |
Wei Chen et.al. |
2404.11459 |
null |
2024-04-17 |
Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models |
Sunhao Dai et.al. |
2404.11457 |
link |
2024-04-17 |
AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts |
Meng Jiang et.al. |
2404.11449 |
link |
2024-04-17 |
Open-Ended Wargames with Large Language Models |
Daniel P. Hogan et.al. |
2404.11446 |
link |
2024-04-17 |
DUPE: Detection Undermining via Prompt Engineering for Deepfake Text |
James Weichert et.al. |
2404.11408 |
null |
2024-04-16 |
Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback |
Qiwei Di et.al. |
2404.10776 |
null |
2024-04-16 |
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation |
Hongxin Zhang et.al. |
2404.10775 |
null |
2024-04-16 |
Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification |
Yu-Yang Li et.al. |
2404.10757 |
link |
2024-04-16 |
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study |
Shusheng Xu et.al. |
2404.10719 |
link |
2024-04-16 |
Dual Modalities of Text: Visual and Textual Generative Pre-training |
Yekun Chai et.al. |
2404.10710 |
link |
2024-04-16 |
Question Difficulty Ranking for Multiple-Choice Reading Comprehension |
Vatsal Raina et.al. |
2404.10704 |
null |
2024-04-16 |
An empirical study on code review activity prediction in practice |
Doriane Olewicki et.al. |
2404.10703 |
null |
2024-04-16 |
Automating REST API Postman Test Cases Using LLM |
S Deepika Sri et.al. |
2404.10678 |
null |
2024-04-16 |
Self-playing Adversarial Language Game Enhances LLM Reasoning |
Pengyu Cheng et.al. |
2404.10642 |
link |
2024-04-16 |
HLAT: High-quality Large Language Model Pre-trained on AWS Trainium |
Haozheng Fan et.al. |
2404.10630 |
link |
2024-04-16 |
Private Attribute Inference from Images with Vision-Language Models |
Batuhan Tömekçe et.al. |
2404.10618 |
null |
2024-04-16 |
Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases |
Yanze Li et.al. |
2404.10595 |
null |
2024-04-16 |
Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training |
Masanori Hirano et.al. |
2404.10555 |
null |
2024-04-16 |
Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning |
Xiao Wang et.al. |
2404.10552 |
null |
2024-04-16 |
Capturing the Macroscopic Behaviour of Molecular Dynamics with Membership Functions |
Alexander Sikorski et.al. |
2404.10523 |
link |
2024-04-16 |
CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity |
Moshe Berchansky et.al. |
2404.10513 |
null |
2024-04-16 |
White Men Lead, Black Women Help: Uncovering Gender, Racial, and Intersectional Bias in Language Agency |
Yixin Wan et.al. |
2404.10508 |
null |
2024-04-16 |
Self-Supervised Visual Preference Alignment |
Ke Zhu et.al. |
2404.10501 |
link |
2024-04-16 |
When Emotional Stimuli meet Prompt Designing: An Auto-Prompt Graphical Paradigm |
Chenggian Ma et.al. |
2404.10500 |
null |
2024-04-16 |
Spiral of Silences: How is Large Language Model Killing Information Retrieval? – A Case Study on Open Domain Question Answering |
Xiaoyang Chen et.al. |
2404.10496 |
link |
2024-04-15 |
KG-CTG: Citation Generation through Knowledge Graph-guided Large Language Models |
Avinash Anand et.al. |
2404.09763 |
null |
2024-04-15 |
Resilience of Large Language Models for Noisy Instructions |
Bin Wang et.al. |
2404.09754 |
null |
2024-04-15 |
Personalized Collaborative Fine-Tuning for On-Device Large Language Models |
Nicolas Wagner et.al. |
2404.09753 |
link |
2024-04-15 |
AMPCliff: quantitative definition and benchmarking of activity cliffs in antimicrobial peptides |
Kewei Li et.al. |
2404.09738 |
link |
2024-04-15 |
Quantization of Large Language Models with an Overdetermined Basis |
Daniil Merkulov et.al. |
2404.09737 |
null |
2024-04-15 |
Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models |
Ziwei Luo et.al. |
2404.09732 |
link |
2024-04-15 |
Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model |
Hyunsoo Cho et.al. |
2404.09717 |
null |
2024-04-15 |
Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction |
David Sobrín-Hidalgo et.al. |
2404.09705 |
null |
2024-04-15 |
Generative AI for Game Theory-based Mobile Networking |
Long He et.al. |
2404.09699 |
null |
2024-04-15 |
Are Large Language Models Reliable Argument Quality Annotators? |
Nailia Mirzakhmedova et.al. |
2404.09696 |
link |
2024-04-15 |
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models |
Guangyan Li et.al. |
2404.09695 |
null |
2024-04-15 |
Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation |
Juhwan Choi et.al. |
2404.09682 |
link |
2024-04-15 |
Learn Your Reference Model for Real Good Alignment |
Alexey Gorbatovski et.al. |
2404.09656 |
null |
2024-04-15 |
Do LLMs Understand Visual Anomalies? Uncovering LLM Capabilities in Zero-shot Anomaly Detection |
Jiaqi Zhu et.al. |
2404.09654 |
null |
2024-04-15 |
Bridging Vision and Language Spaces with Assignment Prediction |
Jungin Park et.al. |
2404.09632 |
link |
2024-04-15 |
AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception |
Yipo Huang et.al. |
2404.09624 |
link |
2024-04-15 |
UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark |
Zhaokun Zhou et.al. |
2404.09619 |
null |
2024-04-15 |
A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions |
Pengfei Liu et.al. |
2404.09606 |
link |
2024-04-15 |
Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction |
Zepeng Ding et.al. |
2404.09593 |
null |
2024-04-15 |
Modelling Language |
Jumbly Grindrod et.al. |
2404.09579 |
null |
2024-04-15 |
Transformers, Contextualism, and Polysemy |
Jumbly Grindrod et.al. |
2404.09577 |
link |
2024-04-15 |
Large language models and linguistic intentionality |
Jumbly Grindrod et.al. |
2404.09576 |
null |
2024-04-12 |
Probing the 3D Awareness of Visual Foundation Models |
Mohamed El Banani et.al. |
2404.08636 |
link |
2024-04-12 |
Pre-training Small Base LMs with Fewer Tokens |
Sunny Sanyal et.al. |
2404.08634 |
link |
2024-04-12 |
FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models |
Yanting Wang et.al. |
2404.08631 |
link |
2024-04-12 |
Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation |
Yanhao Zheng et.al. |
2404.08603 |
link |
2024-04-12 |
Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts |
Övgü Özdemir et.al. |
2404.08589 |
link |
2024-04-12 |
Pathological Primitive Segmentation Based on Visual Foundation Model with Zero-Shot Mask Generation |
Abu Bakor Hayat Arnob et.al. |
2404.08584 |
link |
2024-04-12 |
FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation |
Riza Velioglu et.al. |
2404.08582 |
link |
2024-04-12 |
Lossy Image Compression with Foundation Diffusion Models |
Lucas Relic et.al. |
2404.08580 |
null |
2024-04-12 |
Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation |
Hanlin Tian et.al. |
2404.08570 |
link |
2024-04-12 |
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs |
Shreyas Chaudhari et.al. |
2404.08555 |
null |
2024-04-12 |
Memory Traces: Are Transformers Tulving Machines? |
Jean-Marie Chauvet et.al. |
2404.08543 |
null |
2024-04-12 |
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward |
Xuan Xie et.al. |
2404.08517 |
null |
2024-04-12 |
ChatGPT and general-purpose AI count fruits in pictures surprisingly well |
Konlavach Mengsuwan et.al. |
2404.08515 |
null |
2024-04-12 |
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction |
Haoran Qiu et.al. |
2404.08509 |
link |
2024-04-12 |
LaSagnA: Language-based Segmentation Assistant for Complex Queries |
Cong Wei et.al. |
2404.08506 |
link |
2024-04-12 |
Strategic Interactions between Large Language Models-based Agents in Beauty Contests |
Siting Lu et.al. |
2404.08492 |
null |
2024-04-12 |
Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation |
Haozhe Zhao et.al. |
2404.08491 |
link |
2024-04-12 |
Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian |
Stefano De Paoli et.al. |
2404.08488 |
null |
2024-04-12 |
Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task |
Hassan Ali et.al. |
2404.08424 |
null |
2024-04-12 |
Adapting the Segment Anything Model During Usage in Novel Situations |
Robin Schön et.al. |
2404.08421 |
null |
2024-04-11 |
OpenBias: Open-set Bias Detection in Text-to-Image Generative Models |
Moreno D’Incà et.al. |
2404.07990 |
link |
2024-04-11 |
Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding |
Yiwen Tang et.al. |
2404.07989 |
link |
2024-04-11 |
Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning |
Simon Schrodi et.al. |
2404.07983 |
null |
2024-04-11 |
Language Imbalance Can Boost Cross-lingual Generalisation |
Anton Schäfer et.al. |
2404.07982 |
link |
2024-04-11 |
Manipulating Large Language Models to Increase Product Visibility |
Aounon Kumar et.al. |
2404.07981 |
link |
2024-04-11 |
LLoCO: Learning Long Contexts Offline |
Sijun Tan et.al. |
2404.07979 |
link |
2024-04-11 |
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models |
Haotian Zhang et.al. |
2404.07973 |
null |
2024-04-11 |
Rho-1: Not All Tokens Are What You Need |
Zhenghao Lin et.al. |
2404.07965 |
link |
2024-04-11 |
On Unified Prompt Tuning for Request Quality Assurance in Public Code Review |
Xinyu Chen et.al. |
2404.07942 |
null |
2024-04-11 |
Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation |
Jinkyung Park et.al. |
2404.07926 |
null |
2024-04-11 |
LaVy: Vietnamese Multimodal Large Language Model |
Chi Tran et.al. |
2404.07922 |
link |
2024-04-11 |
AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs |
Zeyi Liao et.al. |
2404.07921 |
link |
2024-04-11 |
DesignQA: A Multimodal Benchmark for Evaluating Large Language Models’ Understanding of Engineering Documentation |
Anna C. Doris et.al. |
2404.07917 |
link |
2024-04-11 |
HGRN2: Gated Linear RNNs with State Expansion |
Zhen Qin et.al. |
2404.07904 |
link |
2024-04-11 |
High-Dimension Human Value Representation in Large Language Models |
Samuel Cahyawijaya et.al. |
2404.07900 |
link |
2024-04-11 |
Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations |
Dayeon Ki et.al. |
2404.07851 |
link |
2024-04-11 |
On Training Data Influence of GPT Models |
Qingyi Liu et.al. |
2404.07840 |
link |
2024-04-11 |
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models |
Aleksandar Botev et.al. |
2404.07839 |
link |
2024-04-11 |
Streamlined Photoacoustic Image Processing with Foundation Models: A Training-Free Solution |
Handi Deng et.al. |
2404.07833 |
null |
2024-04-11 |
Heron-Bench: A Benchmark for Evaluating Vision Language Models in Japanese |
Yuichi Inoue et.al. |
2404.07824 |
link |
2024-04-10 |
BRAVE: Broadening the visual encoding of vision-language models |
Oğuzhan Fatih Kar et.al. |
2404.07204 |
null |
2024-04-10 |
UMBRAE: Unified Multimodal Decoding of Brain Signals |
Weihao Xia et.al. |
2404.07202 |
link |
2024-04-10 |
Scaling Laws for Data Filtering – Data Curation cannot be Compute Agnostic |
Sachin Goyal et.al. |
2404.07177 |
link |
2024-04-10 |
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention |
Tsendsuren Munkhdalai et.al. |
2404.07143 |
null |
2024-04-10 |
Open reaction-diffusion systems: bridging probabilistic theory across scales |
Mauricio J. del Razo et.al. |
2404.07119 |
null |
2024-04-10 |
Continuous Language Model Interpolation for Dynamic and Controllable Text Generation |
Sara Kangaslahti et.al. |
2404.07117 |
link |
2024-04-11 |
From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications |
Yongqiang Ma et.al. |
2404.07108 |
null |
2024-04-10 |
Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs |
Bowen Jin et.al. |
2404.07103 |
link |
2024-04-10 |
Dynamic Generation of Personalities with Large Language Models |
Jianzhi Liu et.al. |
2404.07084 |
link |
2024-04-10 |
VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning |
Alexandros Xenos et.al. |
2404.07078 |
link |
2024-04-10 |
Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? |
Mingyu Jin et.al. |
2404.07066 |
link |
2024-04-10 |
Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study |
Alessandro Stolfo et.al. |
2404.07060 |
null |
2024-04-10 |
Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation |
Elisa Sanchez-Bayona et.al. |
2404.07053 |
link |
2024-04-10 |
ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling |
Ege Özsoy et.al. |
2404.07031 |
link |
2024-04-10 |
Improving Language Model Reasoning with Self-motivated Learning |
Yunlong Feng et.al. |
2404.07017 |
null |
2024-04-10 |
A Mathematical Theory for Learning Semantic Languages by Abstract Learners |
Kuo-Yu Liao et.al. |
2404.07009 |
null |
2024-04-10 |
WordDecipher: Enhancing Digital Workspace Communication with Explainable AI for Non-native English Speakers |
Yuexi Chen et.al. |
2404.07005 |
null |
2024-04-10 |
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models |
Igor Tufanov et.al. |
2404.07004 |
null |
2024-04-10 |
Event Grounded Criminal Court View Generation with Cooperative (Large) Language Models |
Linan Yue et.al. |
2404.07001 |
link |
2024-04-10 |
Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study |
Hongru Du et.al. |
2404.06962 |
link |
2024-04-09 |
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD |
Xiaoyi Dong et.al. |
2404.06512 |
link |
2024-04-09 |
Can Feedback Enhance Semantic Grounding in Large Vision-Language Models? |
Yuan-Hong Liao et.al. |
2404.06510 |
null |
2024-04-09 |
On the Effect of (Near) Duplicate Subwords in Language Modelling |
Anton Schäfer et.al. |
2404.06508 |
link |
2024-04-09 |
Pitfalls of Conversational LLMs on News Debiasing |
Ipek Baris Schlicht et.al. |
2404.06488 |
null |
2024-04-10 |
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks |
Chonghua Wang et.al. |
2404.06480 |
link |
2024-04-10 |
Text-Based Reasoning About Vector Graphics |
Zhenhailong Wang et.al. |
2404.06479 |
null |
2024-04-09 |
Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models |
Zihan Fang et.al. |
2404.06448 |
null |
2024-04-09 |
Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems |
Kunal Garg et.al. |
2404.06413 |
null |
2024-04-09 |
AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents |
Luca Gioacchini et.al. |
2404.06411 |
link |
2024-04-09 |
Take a Look at it! Rethinking How to Evaluate Language Model Jailbreak |
Hongyu Cai et.al. |
2404.06407 |
link |
2024-04-09 |
Apprentices to Research Assistants: Advancing Research with Large Language Models |
M. Namvarpour et.al. |
2404.06404 |
null |
2024-04-09 |
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies |
Shengding Hu et.al. |
2404.06395 |
link |
2024-04-09 |
MuPT: A Generative Symbolic Music Pretrained Transformer |
Xingwei Qu et.al. |
2404.06393 |
null |
2024-04-09 |
Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis |
Mikel Zubillaga et.al. |
2404.06392 |
null |
2024-04-09 |
Latent Distance Guided Alignment Training for Large Language Models |
Haotian Luo et.al. |
2404.06390 |
null |
2024-04-09 |
Model Generation from Requirements with LLMs: an Exploratory Study |
Alessio Ferrari et.al. |
2404.06371 |
null |
2024-04-09 |
Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python |
Valdecy Pereira et.al. |
2404.06370 |
link |
2024-04-09 |
VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs |
Yi Gui et.al. |
2404.06369 |
null |
2024-04-09 |
ClinLinker: Medical Entity Linking of Clinical Concept Mentions in Spanish |
Fernando Gallego et.al. |
2404.06367 |
null |
2024-04-09 |
Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation |
Sidra Aleem et.al. |
2404.06362 |
link |
2024-04-08 |
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding |
Bo He et.al. |
2404.05726 |
link |
2024-04-08 |
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs |
Keen You et.al. |
2404.05719 |
null |
2024-04-08 |
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding |
Ahmad Idrissi-Yaghir et.al. |
2404.05694 |
null |
2024-04-08 |
Evaluating Mathematical Reasoning Beyond Accuracy |
Shijie Xia et.al. |
2404.05692 |
link |
2024-04-08 |
Retrieval-Augmented Open-Vocabulary Object Detection |
Jooyeon Kim et.al. |
2404.05687 |
link |
2024-04-08 |
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation |
Kunpeng Song et.al. |
2404.05674 |
link |
2024-04-08 |
CoReS: Orchestrating the Dance of Reasoning and Segmentation |
Xiaoyi Bao et.al. |
2404.05673 |
null |
2024-04-08 |
Fighting crime with Transformers: Empirical analysis of address parsing methods in payment data |
Haitham Hammami et.al. |
2404.05632 |
link |
2024-04-08 |
LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking |
Faren Yan et.al. |
2404.05624 |
null |
2024-04-08 |
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning |
Matteo Farina et.al. |
2404.05621 |
link |
2024-04-08 |
SpeechAlign: Aligning Speech Generation to Human Preferences |
Dong Zhang et.al. |
2404.05600 |
link |
2024-04-08 |
MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering |
Iñigo Alonso et.al. |
2404.05590 |
null |
2024-04-08 |
Enhancing Software Related Information Extraction with Generative Language Models through Single-Choice Question Answering |
Wolfgang Otto et.al. |
2404.05587 |
null |
2024-04-08 |
Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model |
Yue-Hua Han et.al. |
2404.05583 |
null |
2024-04-08 |
360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System |
Shen Gao et.al. |
2404.05569 |
link |
2024-04-08 |
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models |
Bowen Pan et.al. |
2404.05567 |
null |
2024-04-08 |
Chinese Sequence Labeling with Semi-Supervised Boundary-Aware Language Model Pre-training |
Longhui Zhang et.al. |
2404.05560 |
link |
2024-04-08 |
Evaluating Interventional Reasoning Capabilities of Large Language Models |
Tejas Kasetty et.al. |
2404.05545 |
null |
2024-04-08 |
OPSD: an Offensive Persian Social media Dataset and its baseline evaluations |
Mehran Safayani et.al. |
2404.05540 |
null |
2024-04-08 |
Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data |
Tim Baumgärtner et.al. |
2404.05530 |
null |
2024-04-05 |
Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2) |
Michael Saxon et.al. |
2404.04251 |
link |
2024-04-05 |
Physical Property Understanding from Language-Embedded Feature Fields |
Albert J. Zhai et.al. |
2404.04242 |
null |
2024-04-05 |
Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents |
Harsh Kohli et.al. |
2404.04237 |
null |
2024-04-05 |
player2vec: A Language Modeling Approach to Understand Player Behavior in Games |
Tianze Wang et.al. |
2404.04234 |
null |
2024-04-05 |
Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation |
Ji-Jia Wu et.al. |
2404.04231 |
link |
2024-04-05 |
Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation |
Tong Su et.al. |
2404.04212 |
null |
2024-04-05 |
Social Skill Training with Large Language Models |
Diyi Yang et.al. |
2404.04204 |
null |
2024-04-05 |
Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text? |
Ilya Ilyankou et.al. |
2404.04169 |
null |
2024-04-05 |
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model |
Xinrun Du et.al. |
2404.04167 |
null |
2024-04-05 |
Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval |
João Coelho et.al. |
2404.04163 |
link |
2024-04-05 |
BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models |
Jacek Wiland et.al. |
2404.04113 |
link |
2024-04-05 |
Large language models as oracles for instantiating ontologies with domain-specific knowledge |
Giovanni Ciatto et.al. |
2404.04108 |
link |
2024-04-05 |
Robust Preference Optimization with Provable Noise Tolerance for LLMs |
Xize Liang et.al. |
2404.04102 |
null |
2024-04-05 |
Label Propagation for Zero-shot Classification with Vision-Language Models |
Vladan Stojnić et.al. |
2404.04072 |
link |
2024-04-05 |
Assessing the quality of information extraction |
Filip Seitl et.al. |
2404.04068 |
null |
2024-04-05 |
CLUE: A Clinical Language Understanding Evaluation for LLMs |
Amin Dada et.al. |
2404.04067 |
link |
2024-04-05 |
VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots |
Akhil Padmanabha et.al. |
2404.04066 |
null |
2024-04-05 |
A Comparison of Methods for Evaluating Generative IR |
Negar Arabzadeh et.al. |
2404.04044 |
link |
2024-04-05 |
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer |
Hele-Andra Kuulmets et.al. |
2404.04042 |
link |
2024-04-05 |
Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds |
Annerose Eichel et.al. |
2404.04031 |
link |
2024-04-04 |
OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views |
Francis Engelmann et.al. |
2404.03650 |
null |
2024-04-04 |
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent |
Hanyu Lai et.al. |
2404.03648 |
link |
2024-04-04 |
Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra |
Darioush Kevian et.al. |
2404.03647 |
null |
2024-04-04 |
Locating and Editing Factual Associations in Mamba |
Arnab Sen Sharma et.al. |
2404.03646 |
link |
2024-04-04 |
Training LLMs over Neurally Compressed Text |
Brian Lester et.al. |
2404.03626 |
null |
2024-04-04 |
Standardizing Knowledge Engineering Practices with a Reference Architecture |
Bradley P. Allen et.al. |
2404.03624 |
null |
2024-04-04 |
Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph |
Marco Bronzini et.al. |
2404.03623 |
link |
2024-04-04 |
Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models |
Wenshan Wu et.al. |
2404.03622 |
null |
2024-04-04 |
DeViDe: Faceted medical knowledge for improved medical vision-language pre-training |
Haozhe Luo et.al. |
2404.03618 |
null |
2024-04-04 |
Sailor: Open Language Models for South-East Asia |
Longxu Dou et.al. |
2404.03608 |
link |
2024-04-04 |
Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization |
Aniruddha Nrusimha et.al. |
2404.03605 |
link |
2024-04-04 |
Evaluating LLMs at Detecting Errors in LLM Responses |
Ryo Kamoi et.al. |
2404.03602 |
link |
2024-04-04 |
Intent Detection and Entity Extraction from BioMedical Literature |
Ankan Mullick et.al. |
2404.03598 |
link |
2024-04-04 |
ReFT: Representation Finetuning for Language Models |
Zhengxuan Wu et.al. |
2404.03592 |
link |
2024-04-04 |
SemGrasp: Semantic Grasp Generation via Language Aligned Discretization |
Kailin Li et.al. |
2404.03590 |
null |
2024-04-04 |
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models |
Yantao Liu et.al. |
2404.03577 |
link |
2024-04-04 |
Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity |
Jake Varley et.al. |
2404.03570 |
null |
2024-04-04 |
Personalized LLM Response Generation with Parameterized Memory Injection |
Kai Zhang et.al. |
2404.03565 |
null |
2024-04-04 |
Select and Summarize: Scene Saliency for Movie Script Summarization |
Rohit Saxena et.al. |
2404.03561 |
link |
2024-04-04 |
How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes |
Harmon Bhasin et.al. |
2404.03558 |
link |
2024-04-03 |
ALOHa: A New Measure for Hallucination in Captioning Models |
Suzanne Petryk et.al. |
2404.02904 |
null |
2024-04-03 |
MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment |
Duygu Ceylan et.al. |
2404.02899 |
null |
2024-04-03 |
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline |
Yifan Xu et.al. |
2404.02893 |
link |
2024-04-03 |
MODNO: Multi Operator Learning With Distributed Neural Operators |
Zecheng Zhang et.al. |
2404.02892 |
null |
2024-04-03 |
Linear Attention Sequence Parallelism |
Weigao Sun et.al. |
2404.02882 |
link |
2024-04-03 |
Integrating Explanations in Learning LTL Specifications from Demonstrations |
Ashutosh Gupta et.al. |
2404.02872 |
null |
2024-04-03 |
Toward Inference-optimal Mixture-of-Expert Large Language Models |
Longfei Yun et.al. |
2404.02852 |
null |
2024-04-03 |
I-Design: Personalized LLM Interior Designer |
Ata Çelen et.al. |
2404.02838 |
null |
2024-04-03 |
Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models |
Wanyun Cui et.al. |
2404.02837 |
null |
2024-04-03 |
Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison |
Maxime Bouthors et.al. |
2404.02835 |
null |
2024-04-03 |
Empowering Biomedical Discovery with AI Agents |
Shanghua Gao et.al. |
2404.02831 |
null |
2024-04-03 |
BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models |
Qijun Luo et.al. |
2404.02827 |
link |
2024-04-03 |
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models |
Haoran Sun et.al. |
2404.02823 |
link |
2024-04-03 |
A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches |
Zhigen Zhao et.al. |
2404.02817 |
null |
2024-04-03 |
The RealHumanEval: Evaluating Large Language Models’ Abilities to Support Programmers |
Hussein Mozannar et.al. |
2404.02806 |
link |
2024-04-03 |
Efficient Multi-Vector Dense Retrieval Using Bit Vectors |
Franco Maria Nardini et.al. |
2404.02805 |
link |
2024-04-03 |
AI and personalized learning: bridging the gap with modern educational goals |
Kristjan-Julius Laak et.al. |
2404.02798 |
null |
2024-04-03 |
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech |
Jaehyeon Kim et.al. |
2404.02781 |
null |
2024-04-03 |
FPT: Feature Prompt Tuning for Few-shot Readability Assessment |
Ziyang Wang et.al. |
2404.02772 |
link |
2024-04-03 |
DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement |
Hao Wu et.al. |
2404.02755 |
null |
2024-04-02 |
Segment Any 3D Object with Language |
Seungjun Lee et.al. |
2404.02157 |
null |
2024-04-02 |
Iterated Learning Improves Compositionality in Large Vision-Language Models |
Chenhao Zheng et.al. |
2404.02145 |
null |
2024-04-02 |
Topic-based Watermarks for LLM-Generated Text |
Alexander Nemecek et.al. |
2404.02138 |
null |
2024-04-02 |
ViTamin: Designing Scalable Vision Models in the Vision-Language Era |
Jieneng Chen et.al. |
2404.02132 |
link |
2024-04-02 |
FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning |
Joel Niklaus et.al. |
2404.02127 |
link |
2024-04-02 |
Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models |
Wanyong Feng et.al. |
2404.02124 |
link |
2024-04-02 |
GINopic: Topic Modeling with Graph Isomorphism Network |
Suman Adhya et.al. |
2404.02115 |
link |
2024-04-02 |
CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems |
Sara Rosenthal et.al. |
2404.02103 |
link |
2024-04-02 |
Advancing LLM Reasoning Generalists with Preference Trees |
Lifan Yuan et.al. |
2404.02078 |
link |
2024-04-02 |
Red-Teaming Segment Anything Model |
Krzysztof Jankowski et.al. |
2404.02067 |
link |
2024-04-02 |
Digital Forgetting in Large Language Models: A Survey of Unlearning Methods |
Alberto Blanco-Justicia et.al. |
2404.02062 |
null |
2024-04-02 |
Long-context LLMs Struggle with Long In-context Learning |
Tianle Li et.al. |
2404.02060 |
link |
2024-04-02 |
IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT |
Junchen Fu et.al. |
2404.02059 |
link |
2024-04-02 |
Deconstructing In-Context Learning: Understanding Prompts via Corruption |
Namrata Shivagunde et.al. |
2404.02054 |
link |
2024-04-02 |
A Survey on Large Language Model-Based Game Agents |
Sihao Hu et.al. |
2404.02039 |
link |
2024-04-02 |
MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages |
Daryna Dementieva et.al. |
2404.02037 |
null |
2024-04-02 |
Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts |
Zhuo Chen et.al. |
2404.02022 |
link |
2024-04-02 |
Large Language Models for Orchestrating Bimanual Robots |
Kun Chu et.al. |
2404.02018 |
link |
2024-04-02 |
MuxServe: Flexible Multiplexing for Efficient Multiple LLM Serving |
Jiangfei Duan et.al. |
2404.02015 |
link |
2024-04-02 |
Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models |
Stephan Linzbach et.al. |
2404.01992 |
null |
2024-03-29 |
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models |
Atsuyuki Miyai et.al. |
2403.20331 |
link |
2024-03-29 |
Are We on the Right Way for Evaluating Large Vision-Language Models? |
Lin Chen et.al. |
2403.20330 |
link |
2024-03-29 |
ReALM: Reference Resolution As Language Modeling |
Joel Ruben Antony Moniz et.al. |
2403.20329 |
null |
2024-03-29 |
Gecko: Versatile Text Embeddings Distilled from Large Language Models |
Jinhyuk Lee et.al. |
2403.20327 |
null |
2024-03-29 |
Convolutional Prompting meets Language Models for Continual Learning |
Anurag Roy et.al. |
2403.20317 |
null |
2024-03-29 |
Learn “No” to Say “Yes” Better: Improving Vision-Language Models via Negations |
Jaisidh Singh et.al. |
2403.20312 |
link |
2024-03-29 |
Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference |
Jovan Stojkovic et.al. |
2403.20306 |
null |
2024-03-29 |
Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain |
Burcu Sayin et.al. |
2403.20288 |
link |
2024-03-29 |
LUQ: Long-text Uncertainty Quantification for LLMs |
Caiqi Zhang et.al. |
2403.20279 |
link |
2024-04-01 |
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want |
Weifeng Lin et.al. |
2403.20271 |
link |
2024-03-29 |
Latxa: An Open Language Model and Evaluation Suite for Basque |
Julen Etxaniz et.al. |
2403.20266 |
link |
2024-03-29 |
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models |
Thibaut Thonet et.al. |
2403.20262 |
link |
2024-03-29 |
MedCLIP-SAM: Bridging Text and Image Towards Universal Medical Image Segmentation |
Taha Koleilat et.al. |
2403.20253 |
link |
2024-03-29 |
Using LLMs to Model the Beliefs and Preferences of Targeted Populations |
Keiichi Namikoshi et.al. |
2403.20252 |
null |
2024-03-29 |
Long-Tailed Anomaly Detection with Learnable Class Names |
Chih-Hui Ho et.al. |
2403.20236 |
null |
2024-03-29 |
H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model |
Chao Pang et.al. |
2403.20213 |
link |
2024-03-29 |
Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science |
Yazheng Yang et.al. |
2403.20208 |
null |
2024-03-29 |
The Future of Combating Rumors? Retrieval, Discrimination, and Generation |
Junhao Xu et.al. |
2403.20204 |
null |
2024-03-29 |
ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models |
Shuo Liu et.al. |
2403.20194 |
null |
2024-03-29 |
HARMamba: Efficient Wearable Sensor Human Activity Recognition Based on Bidirectional Selective SSM |
Shuangjian Li et.al. |
2403.20183 |
null |
2024-03-28 |
RSMamba: Remote Sensing Image Classification with State Space Model |
Keyan Chen et.al. |
2403.19654 |
link |
2024-03-28 |
InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction |
Sirui Xu et.al. |
2403.19652 |
null |
2024-03-28 |
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions |
Kai Zhang et.al. |
2403.19651 |
link |
2024-03-28 |
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models |
Samuel Marks et.al. |
2403.19647 |
link |
2024-03-28 |
Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning |
Chenyang Liu et.al. |
2403.19646 |
link |
2024-03-28 |
Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models |
Yucheng Shi et.al. |
2403.19631 |
link |
2024-03-28 |
RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents |
Zeren Chen et.al. |
2403.19622 |
null |
2024-03-28 |
SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects |
Avinash Ummadisingu et.al. |
2403.19607 |
null |
2024-03-28 |
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation |
Zhongliang Zhou et.al. |
2403.19584 |
link |
2024-03-28 |
Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics |
Norman Di Palo et.al. |
2403.19578 |
null |
2024-03-28 |
WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models |
Piotr Molenda et.al. |
2403.19548 |
null |
2024-03-28 |
Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models |
Ang Lv et.al. |
2403.19521 |
link |
2024-03-28 |
Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data |
Shan Chen et.al. |
2403.19511 |
link |
2024-03-28 |
LLMs as Academic Reading Companions: Extending HCI Through Synthetic Personae |
Celia Chen et.al. |
2403.19506 |
null |
2024-03-28 |
Evolving Assembly Code in an Adversarial Environment |
Irina Maliukov et.al. |
2403.19489 |
link |
2024-03-28 |
JDocQA: Japanese Document Question Answering Dataset for Generative Language Models |
Eri Onami et.al. |
2403.19454 |
link |
2024-03-28 |
Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model |
Qi Gou et.al. |
2403.19443 |
null |
2024-03-28 |
OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion |
Xinyu Zhan et.al. |
2403.19417 |
null |
2024-03-28 |
BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation |
Yuhong He et.al. |
2403.19414 |
null |
2024-03-28 |
Checkpoint Merging via Bayesian Optimization in LLM Pretraining |
Deyuan Liu et.al. |
2403.19390 |
null |
2024-03-27 |
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models |
Yanwei Li et.al. |
2403.18814 |
link |
2024-03-27 |
ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation |
Suraj Patni et.al. |
2403.18807 |
link |
2024-03-27 |
Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation |
Mateusz Klimaszewski et.al. |
2403.18804 |
link |
2024-03-27 |
Projective Methods for Mitigating Gender Bias in Pre-trained Language Models |
Hillary Dawkins et.al. |
2403.18803 |
link |
2024-03-27 |
Long-form factuality in large language models |
Jerry Wei et.al. |
2403.18802 |
link |
2024-03-27 |
Towards a World-English Language Model for On-Device Virtual Assistants |
Rricha Jalota et.al. |
2403.18783 |
null |
2024-03-27 |
3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation |
Ehsan Latif et.al. |
2403.18778 |
null |
2024-03-27 |
ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object |
Chenshuang Zhang et.al. |
2403.18775 |
link |
2024-03-27 |
CheckEval: Robust Evaluation Framework using Large Language Model via Checklist |
Yukyung Lee et.al. |
2403.18771 |
null |
2024-03-27 |
MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model |
Yike Wu et.al. |
2403.18760 |
link |
2024-03-27 |
CYCLE: Learning to Self-Refine the Code Generation |
Yangruibo Ding et.al. |
2403.18746 |
link |
2024-03-27 |
Understanding the Learning Dynamics of Alignment with Human Feedback |
Shawn Im et.al. |
2403.18742 |
link |
2024-03-27 |
PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations |
Ehsan Latif et.al. |
2403.18721 |
null |
2024-03-27 |
Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding |
Xintong Wang et.al. |
2403.18715 |
link |
2024-03-27 |
The Invalsi Benchmark: measuring Language Models Mathematical and Language understanding in Italian |
Andrea Esuli et.al. |
2403.18697 |
null |
2024-03-27 |
NL-ITI: Optimizing Probing and Intervention for Improvement of ITI Method |
Jakub Hoscilowicz et.al. |
2403.18680 |
link |
2024-03-27 |
An Exploratory Study on Upper-Level Computing Students’ Use of Large Language Models as Tools in a Semester-Long Project |
Ben Arie Tanay et.al. |
2403.18679 |
null |
2024-03-27 |
SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens |
Chengbo Liu et.al. |
2403.18647 |
link |
2024-03-27 |
To Recommend or Not: Recommendability Identification in Conversations with Pre-trained Language Models |
Zhefan Wang et.al. |
2403.18628 |
link |
2024-03-27 |
Vulnerability Detection with Code Language Models: How Far Are We? |
Yangruibo Ding et.al. |
2403.18624 |
link |
2024-03-26 |
OmniVid: A Generative Framework for Universal Video Understanding |
Junke Wang et.al. |
2403.17935 |
link |
2024-03-26 |
Track Everything Everywhere Fast and Robustly |
Yunzhou Song et.al. |
2403.17931 |
null |
2024-03-26 |
MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution |
Wei Tao et.al. |
2403.17927 |
null |
2024-03-26 |
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning |
Rui Pan et.al. |
2403.17919 |
link |
2024-03-26 |
Large scale paired antibody language models |
Henry Kenlay et.al. |
2403.17889 |
null |
2024-03-26 |
Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation |
Carlos Gomes et.al. |
2403.17886 |
link |
2024-03-26 |
MIND Your Language: A Multilingual Dataset for Cross-lingual News Recommendation |
Andreea Iana et.al. |
2403.17876 |
link |
2024-03-26 |
Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach |
Andrea Ferrario et.al. |
2403.17873 |
null |
2024-03-26 |
Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications |
Philip Lippmann et.al. |
2403.17860 |
null |
2024-03-26 |
ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages |
Bhawna Piryani et.al. |
2403.17859 |
link |
2024-03-26 |
Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs |
David R. Mortensen et.al. |
2403.17856 |
null |
2024-03-26 |
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering |
Abdelrahman Abdallah et.al. |
2403.17848 |
link |
2024-03-26 |
Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation |
Abdelrhman Werby et.al. |
2403.17846 |
null |
2024-03-26 |
Mechanistic Design and Scaling of Hybrid Architectures |
Michael Poli et.al. |
2403.17844 |
link |
2024-03-26 |
ReMamber: Referring Image Segmentation with Mamba Twister |
Yuhuan Yang et.al. |
2403.17839 |
link |
2024-03-26 |
A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities |
Ibrahim Ethem Hamamci et.al. |
2403.17834 |
link |
2024-03-26 |
Assessment of Multimodal Large Language Models in Alignment with Human Values |
Zhelun Shi et.al. |
2403.17830 |
null |
2024-03-26 |
Accelerating Radio Spectrum Regulation Workflows with Large Language Models (LLMs) |
Amir Ghasemi et.al. |
2403.17819 |
null |
2024-03-26 |
Graph Language Model (GLM): A new graph-based approach to detect social instabilities |
Wallyson Lemes de Oliveira et.al. |
2403.17816 |
null |
2024-03-26 |
Are Compressed Language Models Less Subgroup Robust? |
Leonidas Gee et.al. |
2403.17811 |
link |
2024-03-25 |
Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making |
Shuai Ma et.al. |
2403.16812 |
null |
2024-03-25 |
An LLM-Based Digital Twin for Optimizing Human-in-the Loop Systems |
Hanqing Yang et.al. |
2403.16809 |
link |
2024-03-25 |
Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback |
Zhangqian Bi et.al. |
2403.16792 |
link |
2024-03-25 |
All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification |
Deepak Narayan Gadde et.al. |
2403.16750 |
null |
2024-03-25 |
A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models |
Nils Ingelhag et.al. |
2403.16730 |
null |
2024-03-25 |
ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search |
Zehan Li et.al. |
2403.16702 |
link |
2024-03-25 |
Synapse: Learning Preferential Concepts from Visual Demonstrations |
Sadanand Modak et.al. |
2403.16689 |
null |
2024-03-25 |
Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography |
Jiayue Zhang et.al. |
2403.16687 |
null |
2024-03-25 |
RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict |
Yirong Zeng et.al. |
2403.16662 |
link |
2024-03-25 |
Grammatical vs Spelling Error Correction: An Investigation into the Responsiveness of Transformer-based Language Models using BART and MarianMT |
Rohit Raju et.al. |
2403.16655 |
null |
2024-03-25 |
CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment |
Feiteng Fang et.al. |
2403.16649 |
link |
2024-03-25 |
Virtual Co-Pilot: Multimodal Large Language Model-enabled Quick-access Procedures for Single Pilot Operations |
Fan Li et.al. |
2403.16645 |
null |
2024-03-25 |
Semantically Enriched Cross-Lingual Sentence Embeddings for Crisis-related Social Media Texts |
Rabindra Lamsal et.al. |
2403.16614 |
null |
2024-03-25 |
Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units |
Biswesh Mohapatra et.al. |
2403.16609 |
null |
2024-03-25 |
TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques |
Ashok Urlana et.al. |
2403.16592 |
null |
2024-03-25 |
Can Large Language Models (or Humans) Distill Text? |
Nicolas Audinet de Pieuchon et.al. |
2403.16584 |
link |
2024-03-25 |
NSINA: A News Corpus for Sinhala |
Hansi Hettiarachchi et.al. |
2403.16571 |
link |
2024-03-25 |
Elysium: Exploring Object-level Perception in Videos via MLLM |
Han Wang et.al. |
2403.16558 |
link |
2024-03-25 |
DOrA: 3D Visual Grounding with Order-Aware Referring |
Tung-Yu Wu et.al. |
2403.16539 |
null |
2024-03-25 |
Open-Set Recognition in the Age of Vision-Language Models |
Dimity Miller et.al. |
2403.16528 |
link |
2024-03-25 |
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art |
Neeloy Chakraborty et.al. |
2403.16527 |
null |
2024-03-25 |
Harnessing the power of LLMs for normative reasoning in MASs |
Bastin Tony Roy Savarimuthu et.al. |
2403.16524 |
null |
2024-03-25 |
Norm Violation Detection in Multi-Agent Systems using Large Language Models: A Pilot Study |
Shawn He et.al. |
2403.16517 |
null |
2024-03-25 |
Linguistically Differentiating Acts and Recalls of Racial Microaggressions on Social Media |
Uma Sushmitha Gunturi et.al. |
2403.16514 |
null |
2024-03-22 |
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models |
Yuzhang Shang et.al. |
2403.15388 |
null |
2024-03-22 |
Long-CLIP: Unlocking the Long-Text Capability of CLIP |
Beichen Zhang et.al. |
2403.15378 |
link |
2024-03-22 |
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding |
Yi Wang et.al. |
2403.15377 |
link |
2024-03-22 |
Can large language models explore in-context? |
Akshay Krishnamurthy et.al. |
2403.15371 |
null |
2024-03-22 |
CoLLEGe: Concept Embedding Generation for Large Language Models |
Ryan Teehan et.al. |
2403.15362 |
null |
2024-03-22 |
Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities |
Zhitong Xiong et.al. |
2403.15356 |
link |
2024-03-22 |
Controlled Training Data Generation with Diffusion Models |
Teresa Yeo et.al. |
2403.15309 |
null |
2024-03-22 |
Sphere Neural-Networks for Rational Reasoning |
Tiansi Dong et.al. |
2403.15297 |
null |
2024-03-22 |
Measuring Gender and Racial Biases in Large Language Models |
Jiafu An et.al. |
2403.15281 |
null |
2024-03-22 |
Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review |
Jinge Wang et.al. |
2403.15274 |
null |
2024-03-22 |
Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs |
Xiaobin Zhang et.al. |
2403.15273 |
null |
2024-03-22 |
Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models |
Huanxuan Liao et.al. |
2403.15268 |
link |
2024-03-22 |
AI Exposure and Strategic Positioning on an Online Work Platform |
Shun Yiu et.al. |
2403.15262 |
null |
2024-03-22 |
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions |
Orion Weller et.al. |
2403.15246 |
link |
2024-03-22 |
Shadow Generation for Composite Image Using Diffusion model |
Qingyang Liu et.al. |
2403.15234 |
link |
2024-03-22 |
An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets |
Jonathan Katzy et.al. |
2403.15230 |
link |
2024-03-22 |
Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models |
Qiong Wu et.al. |
2403.15226 |
link |
2024-03-22 |
Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations |
Pranav Kulkarni et.al. |
2403.15218 |
link |
2024-03-22 |
InstaSynth: Opportunities and Challenges in Generating Synthetic Instagram Data with ChatGPT for Sponsored Content Detection |
Thales Bertaglia et.al. |
2403.15214 |
link |
2024-03-22 |
MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection |
Taeheon Kim et.al. |
2403.15209 |
null |
2024-03-21 |
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? |
Renrui Zhang et.al. |
2403.14624 |
null |
2024-03-21 |
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey |
Zeyu Han et.al. |
2403.14608 |
null |
2024-03-21 |
MyVLM: Personalizing VLMs for User-Specific Queries |
Yuval Alaluf et.al. |
2403.14599 |
null |
2024-03-21 |
ReAct Meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training |
Zonghan Yang et.al. |
2403.14589 |
null |
2024-03-21 |
Large Language Models for Multi-Choice Question Classification of Medical Subjects |
Víctor Ponce-López et.al. |
2403.14582 |
null |
2024-03-21 |
RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain |
William James Bolton et.al. |
2403.14578 |
link |
2024-03-21 |
A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students’ Formative Assessment Responses in Science |
Clayton Cohn et.al. |
2403.14565 |
null |
2024-03-21 |
The Era of Semantic Decoding |
Maxime Peyrard et.al. |
2403.14562 |
null |
2024-03-21 |
Lexicon-Level Contrastive Visual-Grounding Improves Language Modeling |
Chengxu Zhuang et.al. |
2403.14551 |
null |
2024-03-21 |
EDT: Improving Large Language Models’ Generation by Entropy-based Dynamic Temperature Sampling |
Shimao Zhang et.al. |
2403.14541 |
link |
2024-03-21 |
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference |
Han Zhao et.al. |
2403.14520 |
link |
2024-03-21 |
The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs) |
Joschka Haltaufderheide et.al. |
2403.14473 |
null |
2024-03-21 |
Detoxifying Large Language Models via Knowledge Editing |
Mengru Wang et.al. |
2403.14472 |
link |
2024-03-21 |
ChatGPT Alternative Solutions: Large Language Models Survey |
Hanieh Alipour et.al. |
2403.14469 |
null |
2024-03-21 |
Recourse for reclamation: Chatting with generative language models |
Jennifer Chien et.al. |
2403.14467 |
null |
2024-03-21 |
Towards Single-System Illusion in Software-Defined Vehicles – Automated, AI-Powered Workflow |
Krzysztof Lebioda et.al. |
2403.14460 |
null |
2024-03-21 |
Multi-Level Explanations for Generative Language Models |
Lucas Monteiro Paes et.al. |
2403.14459 |
null |
2024-03-21 |
gTBLS: Generating Tables from Text by Conditional Question Answering |
Anirudh Sundar et.al. |
2403.14457 |
null |
2024-03-21 |
Language Models Can Reduce Asymmetry in Information Markets |
Nasim Rahaman et.al. |
2403.14443 |
null |
2024-03-21 |
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models |
Dominik Wagner et.al. |
2403.14438 |
null |
2024-03-20 |
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition |
Ziyu Liu et.al. |
2403.13805 |
link |
2024-03-20 |
Learning from Models and Data for Visual Grounding |
Ruozhen He et.al. |
2403.13804 |
null |
2024-03-20 |
Reverse Training to Nurse the Reversal Curse |
Olga Golovneva et.al. |
2403.13799 |
null |
2024-03-20 |
Bridge the Modality and Capacity Gaps in Vision-Language Model Selection |
Chao Yi et.al. |
2403.13797 |
null |
2024-03-20 |
RewardBench: Evaluating Reward Models for Language Modeling |
Nathan Lambert et.al. |
2403.13787 |
link |
2024-03-20 |
Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts |
Guangzeng Han et.al. |
2403.13786 |
link |
2024-03-20 |
Information-Theoretic Distillation for Reference-less Summarization |
Jaehun Jung et.al. |
2403.13780 |
null |
2024-03-20 |
Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation |
Hugues Thomas et.al. |
2403.13777 |
null |
2024-03-20 |
Describe-and-Dissect: Interpreting Neurons in Vision Networks with Language Models |
Nicholas Bai et.al. |
2403.13771 |
link |
2024-03-20 |
Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model |
Diwei Wang et.al. |
2403.13756 |
null |
2024-03-20 |
Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement |
Catherine Arnett et.al. |
2403.13754 |
null |
2024-03-20 |
EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation |
Atnafu Lambebo Tonja et.al. |
2403.13737 |
null |
2024-03-20 |
Large Language Models meet Network Slicing Management and Orchestration |
Abdulhalim Dandoush et.al. |
2403.13721 |
null |
2024-03-20 |
SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning |
Hongjun Wang et.al. |
2403.13684 |
null |
2024-03-20 |
PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents |
Mitodru Niyogi et.al. |
2403.13681 |
null |
2024-03-20 |
RoleInteract: Evaluating the Social Interaction of Role-Playing Agents |
Hongzhan Chen et.al. |
2403.13679 |
link |
2024-03-20 |
Grounding Spatial Relations in Text-Only Language Models |
Gorka Azkune et.al. |
2403.13666 |
link |
2024-03-20 |
Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese |
Meet Doshi et.al. |
2403.13638 |
null |
2024-03-20 |
VL-Mamba: Exploring State Space Models for Multimodal Learning |
Yanyuan Qiao et.al. |
2403.13600 |
null |
2024-03-20 |
No more optimization rules: LLM-enabled policy-based multi-modal query optimizer (version 1) |
Yifan Wang et.al. |
2403.13597 |
null |
2024-03-19 |
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression |
Zhuoshi Pan et.al. |
2403.12968 |
link |
2024-03-19 |
Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models |
Zuyan Liu et.al. |
2403.12966 |
link |
2024-03-19 |
Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models |
Ce Zhang et.al. |
2403.12964 |
link |
2024-03-19 |
Dated Data: Tracing Knowledge Cutoffs in Large Language Models |
Jeffrey Cheng et.al. |
2403.12958 |
link |
2024-03-19 |
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models |
Elaine Sui et.al. |
2403.12952 |
link |
2024-03-19 |
Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models |
Joana Ribeiro de Faria et.al. |
2403.12936 |
null |
2024-03-19 |
Segment Anything for comprehensive analysis of grapevine cluster architecture and berry properties |
Efrain Torres-Lomas et.al. |
2403.12935 |
null |
2024-03-19 |
Rapid AIdeation: Generating Ideas With the Self and in Collaboration With Large Language Models |
Gionnieve Lim et.al. |
2403.12928 |
null |
2024-03-19 |
Supporting Energy Policy Research with Large Language Models |
Grant Buster et.al. |
2403.12924 |
null |
2024-03-19 |
Contextual AD Narration with Interleaved Multimodal Sequence |
Hanlin Wang et.al. |
2403.12922 |
null |
2024-03-19 |
Semantic Layering in Room Segmentation via LLMs |
Taehyeon Kim et.al. |
2403.12920 |
null |
2024-03-19 |
Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts |
Sai Ashish Somayajula et.al. |
2403.12918 |
link |
2024-03-19 |
Yell At Your Robot: Improving On-the-Fly from Language Corrections |
Lucy Xiaoyang Shi et.al. |
2403.12910 |
null |
2024-03-19 |
Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference |
Baolin Li et.al. |
2403.12900 |
null |
2024-03-19 |
mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding |
Anwen Hu et.al. |
2403.12895 |
link |
2024-03-20 |
MEDBind: Unifying Language and Multimodal Medical Data Embeddings |
Yuan Gao et.al. |
2403.12894 |
null |
2024-03-19 |
HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning |
Fucai Ke et.al. |
2403.12884 |
link |
2024-03-19 |
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models |
Zehui Chen et.al. |
2403.12881 |
link |
2024-03-19 |
Epistemology of Language Models: Do Language Models Have Holistic Knowledge? |
Minsu Kim et.al. |
2403.12862 |
null |
2024-03-19 |
RASP: A Drone-based Reconfigurable Actuation and Sensing Platform Towards Ambient Intelligent Systems |
Minghui Zhao et.al. |
2403.12853 |
null |
2024-03-18 |
Modality-Agnostic fMRI Decoding of Vision and Language |
Mitja Nikolaus et.al. |
2403.11771 |
null |
2024-03-18 |
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs |
M. Jehanzeb Mirza et.al. |
2403.11755 |
link |
2024-03-18 |
Revisiting The Classics: A Study on Identifying and Rectifying Gender Stereotypes in Rhymes and Poems |
Aditya Narayan Sankaran et.al. |
2403.11752 |
link |
2024-03-18 |
Embedded Named Entity Recognition using Probing Classifiers |
Nicholas Popovič et.al. |
2403.11747 |
link |
2024-03-18 |
TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models |
Lisa Weijler et.al. |
2403.11691 |
null |
2024-03-18 |
HDLdebugger: Streamlining HDL debugging with Large Language Models |
Xufeng Yao et.al. |
2403.11671 |
null |
2024-03-18 |
Prioritized Semantic Learning for Zero-shot Instance Navigation |
Xander Sun et.al. |
2403.11650 |
link |
2024-03-18 |
Arc2Face: A Foundation Model of Human Faces |
Foivos Paraperas Papantoniou et.al. |
2403.11641 |
link |
2024-03-18 |
Compositional Kronecker Context Optimization for Vision-Language Models |
Kun Ding et.al. |
2403.11631 |
null |
2024-03-18 |
Let’s Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model |
Haoyun Xu et.al. |
2403.11621 |
null |
2024-03-18 |
CRS-Diff: Controllable Generative Remote Sensing Foundation Model |
Datao Tang et.al. |
2403.11614 |
link |
2024-03-18 |
Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines |
Ekaterina Trofimova et.al. |
2403.11585 |
null |
2024-03-18 |
Reinforcement Learning with Token-level Feedback for Controllable Text Generation |
Wendi Li et.al. |
2403.11558 |
link |
2024-03-18 |
LLM^3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning |
Shu Wang et.al. |
2403.11552 |
link |
2024-03-18 |
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters |
Jiazuo Yu et.al. |
2403.11549 |
link |
2024-03-18 |
DEE: Dual-stage Explainable Evaluation Method for Text Generation |
Shenyu Zhang et.al. |
2403.11509 |
null |
2024-03-18 |
Do CLIPs Always Generalize Better than ImageNet Models? |
Qizhou Wang et.al. |
2403.11497 |
null |
2024-03-18 |
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding |
Yue Fan et.al. |
2403.11481 |
null |
2024-03-18 |
HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models |
Huy Nghiem et.al. |
2403.11456 |
link |
2024-03-18 |
Zero-shot Compound Expression Recognition with Visual Language Model at the 6th ABAW Challenge |
Jiahe Wang et.al. |
2403.11450 |
null |
2024-03-18 |
LLM Guided Evolution - The Automation of Models Advancing Models |
Clint Morris et.al. |
2403.11446 |
link |
2024-03-18 |
StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation |
Jinpeng Li et.al. |
2403.11439 |
null |
2024-03-18 |
InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions |
Yifan Wang et.al. |
2403.11435 |
null |
2024-03-18 |
A Novel Paradigm Boosting Translation Capabilities of Large Language Models |
Jiaxin Guo et.al. |
2403.11430 |
null |
2024-03-15 |
VideoAgent: Long-form Video Understanding with Large Language Model as Agent |
Xiaohan Wang et.al. |
2403.10517 |
null |
2024-03-15 |
Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization |
Ratnadira Widyasari et.al. |
2403.10507 |
null |
2024-03-15 |
ATOM: Asynchronous Training of Massive Models for Deep Learning in a Decentralized Environment |
Xiaofeng Wu et.al. |
2403.10504 |
null |
2024-03-15 |
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study |
Chenguang Wang et.al. |
2403.10499 |
link |
2024-03-15 |
Reconfigurable Robot Identification from Motion Data |
Yuhang Hu et.al. |
2403.10496 |
null |
2024-03-15 |
Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst? |
Bruno de Melo et.al. |
2403.10482 |
null |
2024-03-15 |
Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases |
Jiarui Li et.al. |
2403.10446 |
link |
2024-03-15 |
Optimal Block-Level Draft Verification for Accelerating Speculative Decoding |
Ziteng Sun et.al. |
2403.10444 |
null |
2024-03-15 |
Using an LLM to Turn Sign Spottings into Spoken Language Sentences |
Ozge Mercanoglu Sincan et.al. |
2403.10434 |
null |
2024-03-15 |
SocialGenPod: Privacy-Friendly Generative AI Social Web Applications with Decentralised Personal Data Stores |
Vidminas Vizgirda et.al. |
2403.10408 |
link |
2024-03-15 |
A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE |
Hervé Déjean et.al. |
2403.10407 |
null |
2024-03-15 |
Monotonic Representation of Numeric Properties in Language Models |
Benjamin Heinzerling et.al. |
2403.10381 |
link |
2024-03-15 |
EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models |
Rocktim Jyoti Das et.al. |
2403.10378 |
link |
2024-03-15 |
TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale |
Pengcheng Jiang et.al. |
2403.10351 |
null |
2024-03-15 |
Investigating grammatical abstraction in language models using few-shot learning of novel noun gender |
Priyanka Sukumaran et.al. |
2403.10338 |
null |
2024-03-15 |
CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model |
Shang-Hsuan Chiang et.al. |
2403.10326 |
link |
2024-03-15 |
NetBench: A Large-Scale and Comprehensive Network Traffic Benchmark Dataset for Foundation Models |
Chen Qian et.al. |
2403.10319 |
link |
2024-03-15 |
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer |
Hengxing Cai et.al. |
2403.10301 |
null |
2024-03-15 |
Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models |
Tian Meng et.al. |
2403.10287 |
null |
2024-03-15 |
Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning |
Shang-Hsuan Chiang et.al. |
2403.10281 |
link |
2024-03-14 |
GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping |
Yuhang Zheng et.al. |
2403.09637 |
link |
2024-03-14 |
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference |
Piotr Nawrot et.al. |
2403.09636 |
null |
2024-03-14 |
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models |
Akhil Kedia et.al. |
2403.09635 |
link |
2024-03-14 |
OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning |
Lingyi Hong et.al. |
2403.09634 |
null |
2024-03-14 |
3D-VLA: A 3D Vision-Language-Action Generative World Model |
Haoyu Zhen et.al. |
2403.09631 |
null |
2024-03-14 |
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking |
Eric Zelikman et.al. |
2403.09629 |
link |
2024-03-14 |
Explore In-Context Segmentation via Latent Diffusion Models |
Chaoyang Wang et.al. |
2403.09616 |
null |
2024-03-14 |
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training |
Brandon McKinzie et.al. |
2403.09611 |
null |
2024-03-14 |
Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey |
Xiaoyu Liu et.al. |
2403.09606 |
null |
2024-03-14 |
Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis |
Gregory Coppola et.al. |
2403.09599 |
null |
2024-03-14 |
Renovating Names in Open-Vocabulary Segmentation Benchmarks |
Haiwen Huang et.al. |
2403.09593 |
null |
2024-03-14 |
ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models |
Runyu Ma et.al. |
2403.09583 |
null |
2024-03-14 |
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation |
Yunhao Gou et.al. |
2403.09572 |
null |
2024-03-14 |
Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models |
Laura Fernández-Becerra et.al. |
2403.09567 |
null |
2024-03-14 |
Welcome Your New AI Teammate: On Safety Analysis by Leashing Large Language Models |
Ali Nouri et.al. |
2403.09565 |
null |
2024-03-14 |
PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps |
Ruixuan Liu et.al. |
2403.09562 |
null |
2024-03-14 |
Less is More: Data Value Estimation for Visual Instruction Tuning |
Zikang Liu et.al. |
2403.09559 |
null |
2024-03-15 |
Logits of API-Protected LLMs Leak Proprietary Information |
Matthew Finlayson et.al. |
2403.09539 |
null |
2024-03-14 |
VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding |
Chris Kelly et.al. |
2403.09530 |
null |
2024-03-15 |
WavCraft: Audio Editing and Generation with Natural Language Prompts |
Jinhua Liang et.al. |
2403.09527 |
link |
2024-03-13 |
Simple and Scalable Strategies to Continually Pre-train Large Language Models |
Adam Ibrahim et.al. |
2403.08763 |
link |
2024-03-13 |
Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework |
Jingling Li et.al. |
2403.08743 |
null |
2024-03-13 |
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models |
Carlo Nicolini et.al. |
2403.08739 |
null |
2024-03-13 |
ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation |
Sayar Ghosh Roy et.al. |
2403.08737 |
link |
2024-03-13 |
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization |
Renjie Pi et.al. |
2403.08730 |
null |
2024-03-14 |
SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents |
Ruiyi Wang et.al. |
2403.08715 |
link |
2024-03-13 |
Review of Generative AI Methods in Cybersecurity |
Yagmur Yigit et.al. |
2403.08701 |
null |
2024-03-13 |
TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning |
Shangding Gu et.al. |
2403.08694 |
link |
2024-03-13 |
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages |
Rik van Noord et.al. |
2403.08693 |
null |
2024-03-13 |
Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records |
Erlend Frayling et.al. |
2403.08664 |
null |
2024-03-13 |
Self-Supervised Learning for Covariance Estimation |
Tzvi Diskin et.al. |
2403.08662 |
null |
2024-03-13 |
Human Alignment of Large Language Models through Online Preference Optimisation |
Daniele Calandriello et.al. |
2403.08635 |
null |
2024-03-13 |
MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models |
Subash Neupane et.al. |
2403.08607 |
null |
2024-03-13 |
Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation |
Daniel Honerkamp et.al. |
2403.08605 |
link |
2024-03-13 |
DevBench: A Comprehensive Benchmark for Software Development |
Bowen Li et.al. |
2403.08604 |
link |
2024-03-13 |
Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments |
Sitao Cheng et.al. |
2403.08593 |
null |
2024-03-13 |
Non-discrimination Criteria for Generative Language Models |
Sara Sterlie et.al. |
2403.08564 |
link |
2024-03-13 |
AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models |
Yifei Gao et.al. |
2403.08542 |
link |
2024-03-13 |
Language models scale reliably with over-training and on downstream tasks |
Samir Yitzhak Gadre et.al. |
2403.08540 |
link |
2024-03-13 |
Masked Generative Story Transformer with Character Guidance and Caption Augmentation |
Christos Papadimitriou et.al. |
2403.08502 |
link |
2024-03-12 |
Beyond Text: Frozen Large Language Models in Visual Signal Comprehension |
Lei Zhu et.al. |
2403.07874 |
link |
2024-03-12 |
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension |
Fangyun Wei et.al. |
2403.07872 |
null |
2024-03-12 |
Exploring Safety Generalization Challenges of Large Language Models via Code |
Qibing Ren et.al. |
2403.07865 |
link |
2024-03-12 |
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation |
Shihao Zhao et.al. |
2403.07860 |
link |
2024-03-12 |
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric |
Haokun Lin et.al. |
2403.07839 |
null |
2024-03-12 |
DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies |
William Xie et.al. |
2403.07832 |
null |
2024-03-12 |
The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing |
Jianchen Wang et.al. |
2403.07825 |
null |
2024-03-12 |
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM |
Sainbayar Sukhbaatar et.al. |
2403.07816 |
null |
2024-03-12 |
Chronos: Learning the Language of Time Series |
Abdul Fatir Ansari et.al. |
2403.07815 |
link |
2024-03-12 |
Beyond Memorization: The Challenge of Random Memory Access in Language Models |
Tongyao Zhu et.al. |
2403.07805 |
link |
2024-03-12 |
Fine-tuning Large Language Models with Sequential Instructions |
Hanxu Hu et.al. |
2403.07794 |
link |
2024-03-12 |
Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations |
Carlos Jose Xavier Cruz et.al. |
2403.07769 |
link |
2024-03-12 |
Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings |
Sahand Sharifzadeh et.al. |
2403.07750 |
null |
2024-03-12 |
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models |
Yan Liu et.al. |
2403.07747 |
null |
2024-03-12 |
Multi-modal Auto-regressive Modeling via Visual Words |
Tianshuo Peng et.al. |
2403.07720 |
link |
2024-03-12 |
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? |
Alexandre Drouin et.al. |
2403.07718 |
link |
2024-03-12 |
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models |
Zhicheng Guo et.al. |
2403.07714 |
link |
2024-03-12 |
Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards |
Wei Shen et.al. |
2403.07708 |
null |
2024-03-12 |
Large, Small or Both: A Novel Data Augmentation Framework Based on Language Models for Debiasing Opinion Summarization |
Yanyue Zhang et.al. |
2403.07693 |
null |
2024-03-12 |
Reference-free Monolithic Preference Optimization with Odds Ratio |
Jiwoo Hong et.al. |
2403.07691 |
link |
2024-03-11 |
Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena |
Leonie Weissweiler et.al. |
2403.06965 |
null |
2024-03-11 |
Materials science in the era of large language models: a perspective |
Ge Lei et.al. |
2403.06949 |
null |
2024-03-11 |
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation |
Xinyao Li et.al. |
2403.06946 |
link |
2024-03-11 |
Naming, Describing, and Quantifying Visual Objects in Humans and LLMs |
Alberto Testoni et.al. |
2403.06935 |
link |
2024-03-11 |
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis |
Yanming Liu et.al. |
2403.06932 |
link |
2024-03-11 |
MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning |
Yichuan Li et.al. |
2403.06914 |
link |
2024-03-11 |
Application of Quantum Tensor Networks for Protein Classification |
Debarshi Kundu et.al. |
2403.06890 |
null |
2024-03-11 |
Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents |
Nishchal Prasad et.al. |
2403.06872 |
link |
2024-03-11 |
Semantic Residual Prompts for Continual Learning |
Martin Menabue et.al. |
2403.06870 |
link |
2024-03-11 |
Learning with Noisy Foundation Models |
Hao Chen et.al. |
2403.06869 |
null |
2024-03-11 |
A Geospatial Approach to Predicting Desert Locust Breeding Grounds in Africa |
Ibrahim Salihu Yusuf et.al. |
2403.06860 |
null |
2024-03-11 |
Development of a Reliable and Accessible Caregiving Language Model (CaLM) |
Bambang Parmanto et.al. |
2403.06857 |
null |
2024-03-11 |
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation |
Guosheng Zhao et.al. |
2403.06845 |
null |
2024-03-11 |
RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback |
Yanming Liu et.al. |
2403.06840 |
link |
2024-03-11 |
ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts |
Lyuye Zhang et.al. |
2403.06838 |
null |
2024-03-11 |
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? |
Egor Zverev et.al. |
2403.06833 |
link |
2024-03-11 |
The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework |
Zhuo Chen et.al. |
2403.06832 |
link |
2024-03-11 |
ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model |
Zhiwei Liu et.al. |
2403.06765 |
link |
2024-03-11 |
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models |
Liang Chen et.al. |
2403.06764 |
link |
2024-03-11 |
ALaRM: Align Language Models via Hierarchical Rewards Modeling |
Yuhang Lai et.al. |
2403.06754 |
link |
2024-03-08 |
Bayesian Preference Elicitation with Language Models |
Kunal Handa et.al. |
2403.05534 |
null |
2024-03-08 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context |
Machel Reid et.al. |
2403.05530 |
null |
2024-03-08 |
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM |
Hao Kang et.al. |
2403.05527 |
link |
2024-03-08 |
DeepSeek-VL: Towards Real-World Vision-Language Understanding |
Haoyu Lu et.al. |
2403.05525 |
link |
2024-03-08 |
Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola |
Yijiang Li et.al. |
2403.05523 |
null |
2024-03-08 |
Authorship Attribution in Bangla Literature (AABL) via Transfer Learning using ULMFiT |
Aisha Khatun et.al. |
2403.05519 |
null |
2024-03-08 |
Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought |
James Chua et.al. |
2403.05518 |
link |
2024-03-08 |
To Err Is Human, but Llamas Can Learn It Too |
Agnes Luhtaru et.al. |
2403.05493 |
link |
2024-03-08 |
Will GPT-4 Run DOOM? |
Adrian de Wynter et.al. |
2403.05468 |
null |
2024-03-08 |
Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs |
Arijit Nag et.al. |
2403.05434 |
null |
2024-03-08 |
Towards Real-World Stickers Use: A New Dataset for Multi-Tag Sticker Recognition |
Bingbing Wang et.al. |
2403.05428 |
null |
2024-03-08 |
FedFMS: Exploring Federated Foundation Models for Medical Image Segmentation |
Yuxi Liu et.al. |
2403.05408 |
link |
2024-03-08 |
Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery |
Xavier Bou et.al. |
2403.05381 |
link |
2024-03-08 |
VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model |
Junsu Kim et.al. |
2403.05346 |
null |
2024-03-08 |
Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings |
Wei Zhou et.al. |
2403.05338 |
null |
2024-03-08 |
ChatASU: Evoking LLM’s Reflexion to Truly Understand Aspect Sentiment in Dialogues |
Yiding Liu et.al. |
2403.05326 |
null |
2024-03-08 |
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation |
Zihao Wang et.al. |
2403.05313 |
null |
2024-03-08 |
Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents |
Jinyang Li et.al. |
2403.05307 |
link |
2024-03-08 |
ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications |
Sotaro Takeshita et.al. |
2403.05303 |
link |
2024-03-08 |
Modeling Dynamic (De)Allocations of Local Memory for Translation Validation |
Abhishek Rose et.al. |
2403.05302 |
null |
2024-03-07 |
iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries |
Adam Coscia et.al. |
2403.04760 |
link |
2024-03-07 |
KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts |
Adam Coscia et.al. |
2403.04758 |
link |
2024-03-07 |
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error |
Boshi Wang et.al. |
2403.04746 |
link |
2024-03-08 |
How Far Are We from Intelligent Visual Deductive Reasoning? |
Yizhe Zhang et.al. |
2403.04732 |
link |
2024-03-07 |
Common 7B Language Models Already Possess Strong Math Capabilities |
Chen Li et.al. |
2403.04706 |
link |
2024-03-07 |
ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes |
Hashmat Shadab Malik et.al. |
2403.04701 |
link |
2024-03-07 |
Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification |
Ekaterina Fadeeva et.al. |
2403.04696 |
link |
2024-03-07 |
Telecom Language Models: Must They Be Large? |
Nicola Piovesan et.al. |
2403.04666 |
null |
2024-03-07 |
Yi: Open Foundation Models by 01.AI |
01.AI et.al. |
2403.04652 |
link |
2024-03-07 |
Teaching Large Language Models to Reason with Reinforcement Learning |
Alex Havrilla et.al. |
2403.04642 |
null |
2024-03-07 |
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios |
Qilang Ye et.al. |
2403.04640 |
link |
2024-03-07 |
A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds |
Xuenan Xu et.al. |
2403.04594 |
link |
2024-03-07 |
Embodied Understanding of Driving Scenarios |
Yunsong Zhou et.al. |
2403.04593 |
link |
2024-03-07 |
Wiki-TabNER: Advancing Table Interpretation Through Named Entity Recognition |
Aneta Koleva et.al. |
2403.04577 |
link |
2024-03-07 |
Reducing self-supervised learning complexity improves weakly-supervised classification performance in computational pathology |
Tim Lenz et.al. |
2403.04558 |
null |
2024-03-07 |
Enhancing Data Quality in Federated Fine-Tuning of Foundation Models |
Wanru Zhao et.al. |
2403.04529 |
null |
2024-03-07 |
Where does In-context Translation Happen in Large Language Models |
Suzanna Sia et.al. |
2403.04510 |
null |
2024-03-07 |
GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability |
Zihan Luo et.al. |
2403.04483 |
link |
2024-03-08 |
Do Large Language Model Understand Multi-Intent Spoken Language? |
Shangjian Yin et.al. |
2403.04481 |
link |
2024-03-08 |
Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset |
Minjin Kim et.al. |
2403.04460 |
link |
2024-03-06 |
Backtracing: Retrieving the Cause of the Query |
Rose E. Wang et.al. |
2403.03956 |
link |
2024-03-06 |
Bridging Language and Items for Retrieval and Recommendation |
Yupeng Hou et.al. |
2403.03952 |
link |
2024-03-06 |
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models |
Adithya Bhaskar et.al. |
2403.03942 |
link |
2024-03-06 |
Did Translation Models Get More Robust Without Anyone Even Noticing? |
Ben Peters et.al. |
2403.03923 |
null |
2024-03-06 |
Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing |
Asmita et.al. |
2403.03897 |
link |
2024-03-06 |
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators |
Indraneil Paul et.al. |
2403.03894 |
link |
2024-03-06 |
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models |
Luiza Pozzobon et.al. |
2403.03893 |
link |
2024-03-06 |
FaaF: Facts as a Function for the evaluation of RAG systems |
Vasileios Katranidis et.al. |
2403.03888 |
link |
2024-03-06 |
SaulLM-7B: A pioneering Large Language Model for Law |
Pierre Colombo et.al. |
2403.03883 |
null |
2024-03-06 |
Learning to Decode Collaboratively with Multiple Language Models |
Shannon Zejiang Shen et.al. |
2403.03870 |
link |
2024-03-06 |
On the Origins of Linear Representations in Large Language Models |
Yibo Jiang et.al. |
2403.03867 |
null |
2024-03-06 |
KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions |
Fangyuan Xu et.al. |
2403.03866 |
null |
2024-03-06 |
Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning |
Deepanway Ghosal et.al. |
2403.03864 |
link |
2024-03-06 |
X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification |
Hanzi Xu et.al. |
2403.03863 |
link |
2024-03-06 |
Designing Informative Metrics for Few-Shot Example Selection |
Rishabh Adiga et.al. |
2403.03861 |
null |
2024-03-06 |
Emojinize: Enriching Any Text with Emoji Translations |
Lars Henning Klein et.al. |
2403.03857 |
null |
2024-03-06 |
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect |
Xin Men et.al. |
2403.03853 |
null |
2024-03-06 |
Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ |
Carolin Holtermann et.al. |
2403.03814 |
link |
2024-03-06 |
Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery |
Wei Zhang et.al. |
2403.03790 |
null |
2024-03-06 |
PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion |
Zekai Zhang et.al. |
2403.03788 |
link |
2024-03-05 |
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning |
Nathaniel Li et.al. |
2403.03218 |
null |
2024-03-05 |
CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments |
Savitha Sam Abraham et.al. |
2403.03203 |
null |
2024-03-05 |
Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement |
Rafaela Martelo et.al. |
2403.03188 |
link |
2024-03-05 |
Reliable, Adaptable, and Attributable Language Models with Retrieval |
Akari Asai et.al. |
2403.03187 |
null |
2024-03-05 |
MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting |
Fangchen Liu et.al. |
2403.03174 |
null |
2024-03-05 |
SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection |
Peng Qi et.al. |
2403.03170 |
null |
2024-03-05 |
PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset |
Arda Uzunoğlu et.al. |
2403.03167 |
link |
2024-03-05 |
Quantum Many-Body Physics Calculations with Large Language Models |
Haining Pan et.al. |
2403.03154 |
null |
2024-03-05 |
Language Guided Exploration for RL Agents in Text Environments |
Hitesh Golchha et.al. |
2403.03141 |
null |
2024-03-05 |
CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following |
Kaiyan Zhang et.al. |
2403.03129 |
null |
2024-03-05 |
Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution |
Flor Miriam Plaza-del-Arco et.al. |
2403.03121 |
link |
2024-03-05 |
“In Dialogues We Learn”: Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning |
Chuanqi Cheng et.al. |
2403.03102 |
null |
2024-03-05 |
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents |
Yuqi Zhu et.al. |
2403.03101 |
link |
2024-03-05 |
Learning to Use Tools via Cooperative and Interactive Agents |
Zhengliang Shi et.al. |
2403.03031 |
link |
2024-03-05 |
Socratic Reasoning Improves Positive Text Rewriting |
Anmol Goel et.al. |
2403.03029 |
null |
2024-03-05 |
Word Importance Explains How Prompts Affect Language Model Outputs |
Stefan Hackmann et.al. |
2403.03028 |
null |
2024-03-05 |
OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following |
Haochen Shi et.al. |
2403.03017 |
null |
2024-03-05 |
Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations |
Hasan Abu-Rasheed et.al. |
2403.03008 |
null |
2024-03-05 |
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models |
Gen Luo et.al. |
2403.03003 |
link |
2024-03-05 |
Localized Zeroth-Order Prompt Optimization |
Wenyang Hu et.al. |
2403.02993 |
null |
2024-03-02 |
LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems |
Tasnim Ahmed et.al. |
2403.01342 |
null |
2024-03-02 |
Making Hybrid Languages: A Recipe |
Leif Andersen et.al. |
2403.01335 |
null |
2024-03-02 |
Chaining thoughts and LLMs to learn DNA structural biophysics |
Tyler D. Ross et.al. |
2403.01332 |
link |
2024-03-02 |
VBART: The Turkish LLM |
Meliksah Turker et.al. |
2403.01308 |
null |
2024-03-02 |
ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation |
Moran Yanuka et.al. |
2403.01306 |
link |
2024-03-02 |
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning |
Alexander Scarlatos et.al. |
2403.01304 |
link |
2024-03-02 |
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention |
Tianyi Zhang et.al. |
2403.01273 |
link |
2024-03-02 |
Employing LLMs for Incident Response Planning and Review |
Sam Hays et.al. |
2403.01271 |
null |
2024-03-02 |
Dissecting Language Models: Machine Unlearning via Selective Pruning |
Nicholas Pochinkov et.al. |
2403.01267 |
link |
2024-03-02 |
Accelerating Greedy Coordinate Gradient via Probe Sampling |
Yiran Zhao et.al. |
2403.01251 |
link |
2024-03-02 |
SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code |
Ziniu Hu et.al. |
2403.01248 |
null |
2024-03-02 |
Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal |
Jianheng Huang et.al. |
2403.01244 |
link |
2024-03-02 |
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact |
Ruikang Liu et.al. |
2403.01241 |
link |
2024-03-02 |
Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy |
Jamie Hayes et.al. |
2403.01218 |
null |
2024-03-02 |
API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access |
Jiayuan Su et.al. |
2403.01216 |
null |
2024-03-02 |
Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning |
Shuo Yang et.al. |
2403.01209 |
null |
2024-03-02 |
The Case for Animal-Friendly AI |
Sankalpa Ghose et.al. |
2403.01199 |
null |
2024-03-02 |
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling |
Shanghaoran Quan et.al. |
2403.01197 |
link |
2024-03-02 |
RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots |
Philip Feldman, James R. Foulds et.al. |
2403.01193 |
null |
2024-03-02 |
Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding |
Ha-Thanh Nguyen et.al. |
2403.01185 |
null |
2024-02-29 |
The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations? |
Alex Gu et.al. |
2402.19475 |
null |
2024-02-29 |
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World |
Weiyun Wang et.al. |
2402.19474 |
link |
2024-02-29 |
Retrieval-Augmented Generation for AI-Generated Content: A Survey |
Penghao Zhao et.al. |
2402.19473 |
link |
2024-02-29 |
Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling |
Gabriel Grand et.al. |
2402.19471 |
null |
2024-03-01 |
TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning |
Kate Sanders et.al. |
2402.19467 |
null |
2024-02-29 |
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models |
Chen Qian et.al. |
2402.19465 |
link |
2024-02-29 |
Curiosity-driven Red-teaming for Large Language Models |
Zhang-Wei Hong et.al. |
2402.19464 |
link |
2024-02-29 |
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap |
Saurabh Srivastava et.al. |
2402.19450 |
link |
2024-02-29 |
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models |
Frederik Kunstner et.al. |
2402.19449 |
null |
2024-02-29 |
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL |
Yifei Zhou et.al. |
2402.19446 |
link |
2024-02-29 |
Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation |
Jonathan Yang et.al. |
2402.19432 |
null |
2024-02-29 |
Compositional API Recommendation for Library-Oriented Code Generation |
Zexiong Ma et.al. |
2402.19431 |
null |
2024-02-29 |
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models |
Soham De et.al. |
2402.19427 |
null |
2024-02-29 |
Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines |
Lijia Ma et.al. |
2402.19421 |
null |
2024-02-29 |
PaECTER: Patent-level Representation Learning using Citation-informed Transformers |
Mainak Ghosh et.al. |
2402.19411 |
null |
2024-02-29 |
On the Scaling Laws of Geographical Representation in Language Models |
Nathan Godey et.al. |
2402.19406 |
null |
2024-02-29 |
Entity-Aware Multimodal Alignment Framework for News Image Captioning |
Junzhe Zhang et.al. |
2402.19404 |
null |
2024-02-29 |
Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Match Human Crowd Accuracy |
Philipp Schoenegger et.al. |
2402.19379 |
null |
2024-02-29 |
OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models |
Jenish Maharjan et.al. |
2402.19371 |
null |
2024-02-29 |
SoK: Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency |
Akila Wickramasekara et.al. |
2402.19366 |
null |
2024-02-28 |
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards |
Haoxiang Wang et.al. |
2402.18571 |
link |
2024-02-28 |
Diffusion Language Models Are Versatile Protein Learners |
Xinyou Wang et.al. |
2402.18567 |
link |
2024-02-28 |
A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic |
Gregory Coppola et.al. |
2402.18566 |
null |
2024-02-28 |
Approaching Human-Level Forecasting with Language Models |
Danny Halawi et.al. |
2402.18563 |
null |
2024-02-28 |
Implicit Bias of Next-Token Prediction |
Christos Thrampoulidis et.al. |
2402.18551 |
null |
2024-02-28 |
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling |
Mahdi Karami et.al. |
2402.18508 |
null |
2024-02-28 |
Few-Shot Fairness: Unveiling LLM’s Potential for Fairness-Aware Classification |
Garima Chhikara et.al. |
2402.18502 |
null |
2024-02-28 |
Language Models Represent Beliefs of Self and Others |
Wentao Zhu et.al. |
2402.18496 |
null |
2024-02-28 |
IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding |
Lanyun Zhu et.al. |
2402.18476 |
null |
2024-02-28 |
Meta-Task Prompting Elicits Embedding from Large Language Models |
Yibin Lei et.al. |
2402.18458 |
link |
2024-02-28 |
Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization |
Deng Li et.al. |
2402.18447 |
null |
2024-02-28 |
Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication |
Weize Chen et.al. |
2402.18439 |
link |
2024-02-28 |
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models |
Xiujie Song et.al. |
2402.18409 |
link |
2024-02-28 |
Balanced Similarity with Auxiliary Prompts: Towards Alleviating Text-to-Image Retrieval Bias for CLIP in Zero-shot Learning |
Hanyao Wang et.al. |
2402.18400 |
null |
2024-02-28 |
Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models |
Ercong Nie et.al. |
2402.18397 |
null |
2024-02-28 |
The First Place Solution of WSDM Cup 2024: Leveraging Large Language Models for Conversational Multi-Doc QA |
Yiming Li et.al. |
2402.18385 |
link |
2024-02-28 |
Large Language Models As Evolution Strategies |
Robert Tjarko Lange et.al. |
2402.18381 |
null |
2024-02-28 |
Tokenization Is More Than Compression |
Craig W. Schmidt et.al. |
2402.18376 |
link |
2024-02-28 |
VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models |
Seoyeon Kim et.al. |
2402.18374 |
link |
2024-02-28 |
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning |
Jiachun Li et.al. |
2402.18344 |
link |
2024-02-27 |
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction |
Zekun Qi et.al. |
2402.17766 |
link |
2024-02-27 |
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits |
Shuming Ma et.al. |
2402.17764 |
null |
2024-02-27 |
Massive Activations in Large Language Models |
Mingjie Sun et.al. |
2402.17762 |
link |
2024-02-27 |
Towards Optimal Learning of Language Models |
Yuxian Gu et.al. |
2402.17759 |
null |
2024-02-27 |
Evaluating Very Long-Term Conversational Memory of LLM Agents |
Adyasha Maharana et.al. |
2402.17753 |
null |
2024-02-27 |
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks |
Duarte M. Alves et.al. |
2402.17733 |
link |
2024-02-27 |
AmbigNLG: Addressing Task Ambiguity in Instruction for NLG |
Ayana Niwa et.al. |
2402.17717 |
link |
2024-02-27 |
Case-Based or Rule-Based: How Do Transformers Do the Math? |
Yi Hu et.al. |
2402.17709 |
link |
2024-02-27 |
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations |
Jing Huang et.al. |
2402.17700 |
link |
2024-02-27 |
NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents |
Tamara Czinczoll et.al. |
2402.17682 |
link |
2024-02-27 |
The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks |
Ashwin Prasad Shivarpatna Venkatesh et.al. |
2402.17679 |
null |
2024-02-27 |
CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention |
Mohammad Sadil Khan et.al. |
2402.17678 |
null |
2024-02-27 |
Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models |
Yunpeng Huang et.al. |
2402.17671 |
null |
2024-02-27 |
Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs |
Tanise Ceron et.al. |
2402.17649 |
null |
2024-02-27 |
SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation |
Shuangrui Ding et.al. |
2402.17645 |
link |
2024-02-27 |
Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data |
Xiao Liu et.al. |
2402.17644 |
link |
2024-02-27 |
Variational Learning is Effective for Large Deep Networks |
Yuesong Shen et.al. |
2402.17641 |
link |
2024-02-27 |
Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling |
David S. W. Williams et.al. |
2402.17622 |
null |
2024-02-27 |
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization |
Wenqi Zhang et.al. |
2402.17574 |
link |
2024-02-27 |
Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers |
Xinyu Tang et.al. |
2402.17564 |
link |
2024-02-26 |
Integrating Large Language Models with Graphical Session-Based Recommendation |
Naicheng Guo et.al. |
2402.16539 |
null |
2024-02-26 |
LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments |
Junzhe Chen et.al. |
2402.16499 |
link |
2024-02-26 |
On Languaging a Simulation Engine |
Han Liu et.al. |
2402.16482 |
null |
2024-02-26 |
Unveiling ChatGPT’s Usage in Open Source Projects: A Mining-based Study |
Rosalia Tufano et.al. |
2402.16480 |
null |
2024-02-26 |
mEdIT: Multilingual Text Editing via Instruction Tuning |
Vipul Raheja et.al. |
2402.16472 |
link |
2024-02-26 |
Unveiling Vulnerability of Self-Attention |
Khai Jiet Liong et.al. |
2402.16470 |
link |
2024-02-26 |
Defending LLMs against Jailbreaking Attacks via Backtranslation |
Yihan Wang et.al. |
2402.16459 |
link |
2024-02-26 |
ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing |
Liuzhenghao Lv et.al. |
2402.16445 |
link |
2024-02-26 |
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors |
Zhexin Zhang et.al. |
2402.16444 |
link |
2024-02-26 |
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models |
Tianyi Tang et.al. |
2402.16438 |
link |
2024-02-26 |
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions |
Yuansen Zhang et.al. |
2402.16431 |
null |
2024-02-26 |
Predicting Sustainable Development Goals Using Course Descriptions – from LLMs to Conventional Foundation Models |
Lev Kharlashkin et.al. |
2402.16420 |
null |
2024-02-26 |
From RAGs to riches: Using large language models to write documents for clinical trials |
Nigel Markey et.al. |
2402.16406 |
null |
2024-02-26 |
MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property |
Shiwen Ni et.al. |
2402.16389 |
link |
2024-02-26 |
Immunization against harmful fine-tuning attacks |
Domenic Rosati et.al. |
2402.16382 |
null |
2024-02-26 |
Improving LLM-based Machine Translation with Systematic Self-Correction |
Zhaopeng Feng et.al. |
2402.16379 |
link |
2024-02-26 |
Unraveling Babel: Exploring Multilingual Activation Patterns within Large Language Models |
Weize Liu et.al. |
2402.16367 |
null |
2024-02-26 |
LLM Inference Unveiled: Survey and Roofline Model Insights |
Zhihang Yuan et.al. |
2402.16363 |
link |
2024-02-26 |
Layer-wise Regularized Dropout for Neural Language Models |
Shiwen Ni et.al. |
2402.16361 |
null |
2024-02-26 |
An Integrated Data Processing Framework for Pretraining Foundation Models |
Yiding Sun et.al. |
2402.16358 |
link |
2024-02-26 |
Language-guided Skill Learning with Temporal Variational Inference |
Haotian Fu et.al. |
2402.16354 |
null |
2024-02-23 |
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning |
Jianguo Zhang et.al. |
2402.15506 |
link |
2024-02-23 |
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs |
Kinjal Basu et.al. |
2402.15491 |
link |
2024-02-23 |
Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models |
Yiran Liu et.al. |
2402.15481 |
null |
2024-02-23 |
Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization |
Swaroop Nath et.al. |
2402.15473 |
link |
2024-02-23 |
Repetition Improves Language Model Embeddings |
Jacob Mitchell Springer et.al. |
2402.15449 |
link |
2024-02-23 |
A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models |
Stefan Hegselmann et.al. |
2402.15422 |
link |
2024-02-23 |
PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning |
Simon Holk et.al. |
2402.15420 |
null |
2024-02-23 |
Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy? |
Nader Asadi et.al. |
2402.15414 |
null |
2024-02-23 |
Grasp, See and Place: Efficient Unknown Object Rearrangement with Policy Structure Prior |
Kechun Xu et.al. |
2402.15402 |
link |
2024-02-23 |
Explorations of Self-Repair in Language Models |
Cody Rushing et.al. |
2402.15390 |
link |
2024-02-23 |
Safe Task Planning for Language-Instructed Multi-Robot Systems using Conformal Prediction |
Jun Wang et.al. |
2402.15368 |
null |
2024-02-23 |
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping |
Zijie J. Wang et.al. |
2402.15350 |
link |
2024-02-23 |
NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data |
Sergei Bogdanov et.al. |
2402.15343 |
link |
2024-02-23 |
Ranking Entities along Conceptual Space Dimensions with LLMs: An Analysis of Fine-Tuning Strategies |
Nitesh Kumar et.al. |
2402.15337 |
null |
2024-02-23 |
GPTVQ: The Blessing of Dimensionality for LLM Quantization |
Mart van Baalen et.al. |
2402.15319 |
null |
2024-02-23 |
ArabianGPT: Native Arabic GPT-based Large Language Model |
Anis Koubaa et.al. |
2402.15313 |
null |
2024-02-23 |
Counterfactual Generation with Identifiability Guarantees |
Hanqi Yan et.al. |
2402.15309 |
link |
2024-02-23 |
Representing Online Handwriting for Recognition in Large Vision-Language Models |
Anastasiia Fadeeva et.al. |
2402.15307 |
null |
2024-02-23 |
How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries |
Somnath Banerjee et.al. |
2402.15302 |
link |
2024-02-23 |
Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models |
Yuzhe Zhang et.al. |
2402.15301 |
null |
2024-02-22 |
PALO: A Polyglot Large Multimodal Model for 5B People |
Muhammad Maaz et.al. |
2402.14818 |
link |
2024-02-22 |
Demographic Bias of Expert-Level Vision-Language Foundation Models in Medical Imaging |
Yuzhe Yang et.al. |
2402.14815 |
link |
2024-02-22 |
WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition |
Lianghui Zhu et.al. |
2402.14812 |
link |
2024-02-22 |
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking |
Nikhil Prakash et.al. |
2402.14811 |
null |
2024-02-22 |
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning |
Zicheng Lin et.al. |
2402.14809 |
link |
2024-02-22 |
RelayAttention for Efficient Large Language Model Serving with Long System Prompts |
Lei Zhu et.al. |
2402.14808 |
link |
2024-02-22 |
A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health |
Nikhil Behari et.al. |
2402.14807 |
null |
2024-02-22 |
Identifying Multiple Personalities in Large Language Models with External Evaluation |
Xiaoyang Song et.al. |
2402.14805 |
null |
2024-02-22 |
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models |
Xudong Lu et.al. |
2402.14800 |
link |
2024-02-22 |
Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic |
Nathaniel Weir et.al. |
2402.14798 |
null |
2024-02-22 |
Zero-shot cross-lingual transfer in instruction tuning of large language model |
Nadezhda Chirkova et.al. |
2402.14778 |
null |
2024-02-22 |
2D Matryoshka Sentence Embeddings |
Xianming Li et.al. |
2402.14776 |
link |
2024-02-22 |
DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models |
Yuhang Cao et.al. |
2402.14767 |
link |
2024-02-22 |
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues |
Ge Bai et.al. |
2402.14762 |
link |
2024-02-22 |
Generalizing Reward Modeling for Out-of-Distribution Preference Learning |
Chen Jia et.al. |
2402.14760 |
link |
2024-02-22 |
Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation |
Jiawei Wang et.al. |
2402.14744 |
link |
2024-02-22 |
Dependency Annotation of Ottoman Turkish with Multilingual BERT |
Şaziye Betül Özateş et.al. |
2402.14743 |
null |
2024-02-22 |
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs |
Arash Ahmadian et.al. |
2402.14740 |
null |
2024-02-22 |
Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models |
Seungduk Kim et.al. |
2402.14714 |
link |
2024-02-22 |
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus |
Honghao Gui et.al. |
2402.14710 |
link |
2024-02-21 |
Coercing LLMs to do and reveal (almost) anything |
Jonas Geiping et.al. |
2402.14020 |
link |
2024-02-21 |
Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment |
Vyas Raina et.al. |
2402.14016 |
link |
2024-02-21 |
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems |
Chaoqun He et.al. |
2402.14008 |
link |
2024-02-21 |
Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models |
Zhiwei He et.al. |
2402.14007 |
link |
2024-02-21 |
Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models |
Aline Ioste et.al. |
2402.14002 |
null |
2024-02-21 |
Analysing The Impact of Sequence Composition on Language Model Pre-Training |
Yu Zhao et.al. |
2402.13991 |
link |
2024-02-21 |
Towards Building Multilingual Language Model for Medicine |
Pengcheng Qiu et.al. |
2402.13963 |
link |
2024-02-21 |
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality |
Rahul Zalkikar et.al. |
2402.13954 |
link |
2024-02-21 |
Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning |
Debjit Paul et.al. |
2402.13950 |
null |
2024-02-21 |
Do Efficient Transformers Really Save Computation? |
Kai Yang et.al. |
2402.13934 |
null |
2024-02-21 |
Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content |
Federico Bianchi et.al. |
2402.13926 |
null |
2024-02-21 |
SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization |
Prakamya Mishra et.al. |
2402.13919 |
link |
2024-02-21 |
What Linguistic Features and Languages are Important in LLM Translation? |
Ryandito Diandaru et.al. |
2402.13917 |
null |
2024-02-21 |
Calibrating Large Language Models with Sample Consistency |
Qing Lyu et.al. |
2402.13904 |
null |
2024-02-21 |
Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models |
Chenyang Lyu et.al. |
2402.13887 |
null |
2024-02-21 |
$\texttt{Se}^2$: $\textit{Se}$quential Example $\textit{Se}$lection for In-Context Learning |
Haoyu Liu et.al. |
2402.13874 |
link |
2024-02-21 |
An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach |
Mohammad Amaz Uddin et.al. |
2402.13871 |
null |
2024-02-21 |
Kuaiji: the First Chinese Accounting Large Language Model |
Jiayuan Luo et.al. |
2402.13866 |
null |
2024-02-21 |
RealDex: Towards Human-like Grasping for Robotic Dexterous Hand |
Yumeng Liu et.al. |
2402.13853 |
null |
2024-02-21 |
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models |
Jiawei Liang et.al. |
2402.13851 |
null |
2024-02-20 |
Towards audio language modeling – an overview |
Haibin Wu et.al. |
2402.13236 |
null |
2024-02-20 |
Unlocking Insights: Semantic Search in Jupyter Notebooks |
Lan Li et.al. |
2402.13234 |
null |
2024-02-20 |
A Touch, Vision, and Language Dataset for Multimodal Alignment |
Letian Fu et.al. |
2402.13232 |
link |
2024-02-20 |
Investigating Cultural Alignment of Large Language Models |
Badr AlKhamissi et.al. |
2402.13231 |
link |
2024-02-20 |
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive |
Arka Pal et.al. |
2402.13228 |
link |
2024-02-20 |
AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning |
Qiao Jin et.al. |
2402.13225 |
null |
2024-02-20 |
RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian |
Adrian Cosma et.al. |
2402.13222 |
link |
2024-02-20 |
How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts |
Yusu Qian et.al. |
2402.13220 |
null |
2024-02-20 |
Softmax Probabilities (Mostly) Predict Large Language Model Correctness on Multiple-Choice Q&A |
Benjamin Plaut et.al. |
2402.13213 |
link |
2024-02-20 |
Soft Self-Consistency Improves Language Model Agents |
Han Wang et.al. |
2402.13212 |
link |
2024-02-20 |
Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation |
Dongjin Kang et.al. |
2402.13211 |
null |
2024-02-20 |
Bayesian Reward Models for LLM Alignment |
Adam X. Yang et.al. |
2402.13210 |
null |
2024-02-20 |
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena |
Marco Gaido et.al. |
2402.13208 |
link |
2024-02-20 |
Question Calibration and Multi-Hop Modeling for Temporal Question Answering |
Chao Xue et.al. |
2402.13188 |
null |
2024-02-20 |
What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents |
Mingyu Jin et.al. |
2402.13184 |
link |
2024-02-20 |
DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models |
Norman Di Palo et.al. |
2402.13181 |
null |
2024-02-20 |
Benchmarking Retrieval-Augmented Generation for Medicine |
Guangzhi Xiong et.al. |
2402.13178 |
link |
2024-02-20 |
Defending Jailbreak Prompts via In-Context Adversarial Game |
Yujun Zhou et.al. |
2402.13148 |
null |
2024-02-20 |
OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog |
Adnen Abdessaied et.al. |
2402.13146 |
null |
2024-02-20 |
The Hidden Space of Transformer Language Adapters |
Jesujoba O. Alabi et.al. |
2402.13137 |
link |
2024-02-19 |
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding |
Zhuoming Chen et.al. |
2402.12374 |
link |
2024-02-19 |
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies |
Xiao Ye et.al. |
2402.12370 |
link |
2024-02-19 |
A Critical Evaluation of AI Feedback for Aligning Large Language Models |
Archit Sharma et.al. |
2402.12366 |
link |
2024-02-19 |
Emergent Word Order Universals from Cognitively-Motivated Language Models |
Tatsuki Kuribayashi et.al. |
2402.12363 |
link |
2024-02-19 |
Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge |
Julien Delile et.al. |
2402.12352 |
null |
2024-02-19 |
GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations |
Jinhao Duan et.al. |
2402.12348 |
link |
2024-02-19 |
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! |
Zhanhui Zhou et.al. |
2402.12343 |
link |
2024-02-19 |
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models |
Christian Schlarmann et.al. |
2402.12336 |
link |
2024-02-19 |
Query-Based Adversarial Prompt Generation |
Jonathan Hayase et.al. |
2402.12329 |
null |
2024-02-19 |
Shall We Talk: Exploring Spontaneous Collaborations of Competing LLM Agents |
Zengqing Wu et.al. |
2402.12327 |
link |
2024-02-19 |
ARKS: Active Retrieval in Knowledge Soup for Code Generation |
Hongjin Su et.al. |
2402.12317 |
link |
2024-02-19 |
Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports |
Felix J. Dorfner et.al. |
2402.12298 |
null |
2024-02-19 |
KARL: Knowledge-Aware Retrieval and Representations aid Retention and Learning in Students |
Matthew Shu et.al. |
2402.12291 |
null |
2024-02-19 |
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models |
Xiaoyu Tian et.al. |
2402.12289 |
null |
2024-02-19 |
Adaptive Skeleton Graph Decoding |
Shuowei Jin et.al. |
2402.12280 |
null |
2024-02-19 |
Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks |
Nadezhda Chirkova et.al. |
2402.12279 |
null |
2024-02-19 |
Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from Large Language Models |
Puxuan Yu et.al. |
2402.12276 |
link |
2024-02-19 |
High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models |
Michela Lorandi et.al. |
2402.12267 |
link |
2024-02-19 |
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles |
Oleksandr Balabanov et.al. |
2402.12264 |
null |
2024-02-19 |
NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms |
Jonathan Zheng et.al. |
2402.12261 |
link |
2024-02-16 |
PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter |
Junfei Xiao et.al. |
2402.10896 |
null |
2024-02-16 |
RLVF: Learning from Verbal Feedback without Overgeneralization |
Moritz Stephan et.al. |
2402.10893 |
link |
2024-02-16 |
Instruction Diversity Drives Generalization To Unseen Tasks |
Dylan Zhang et.al. |
2402.10891 |
null |
2024-02-16 |
When is Tree Search Useful for LLM Planning? It Depends on the Discriminator |
Ziru Chen et.al. |
2402.10890 |
link |
2024-02-16 |
Multi-modal preference alignment remedies regression of visual instruction tuning on language model |
Shengzhi Li et.al. |
2402.10884 |
link |
2024-02-16 |
EcoRank: Budget-Constrained Text Re-ranking Using Large Language Models |
Muhammad Shihab Rashid et.al. |
2402.10866 |
link |
2024-02-16 |
Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities |
Mingyu Jin et.al. |
2402.10835 |
null |
2024-02-16 |
RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model |
Jianhao Yuan et.al. |
2402.10828 |
null |
2024-02-16 |
Quantifying the Persona Effect in LLM Simulations |
Tiancheng Hu et.al. |
2402.10811 |
null |
2024-02-16 |
Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond |
Yongqi Li et.al. |
2402.10805 |
null |
2024-02-16 |
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge |
Xuan Shen et.al. |
2402.10787 |
link |
2024-02-16 |
A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models |
Mingchen Li et.al. |
2402.10779 |
null |
2024-02-16 |
AutoGPT+P: Affordance-based Task Planning with Large Language Models |
Timo Birr et.al. |
2402.10778 |
null |
2024-02-16 |
How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs? |
Ehsan Doostmohammadi et.al. |
2402.10770 |
null |
2024-02-16 |
Distillation Enhanced Generative Retrieval |
Yongqi Li et.al. |
2402.10769 |
null |
2024-02-16 |
Inference to the Best Explanation in Large Language Models |
Dhairya Dalal et.al. |
2402.10767 |
null |
2024-02-16 |
When Dataflow Analysis Meets Large Language Models |
Chengpeng Wang et.al. |
2402.10754 |
link |
2024-02-16 |
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages |
Junjie Ye et.al. |
2402.10753 |
link |
2024-02-16 |
GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models |
Pengcheng Jiang et.al. |
2402.10744 |
link |
2024-02-16 |
Let’s Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning |
Yinpeng Liu et.al. |
2402.10738 |
link |
2024-02-15 |
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation |
Huizhuo Yuan et.al. |
2402.10210 |
null |
2024-02-15 |
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment |
Rui Yang et.al. |
2402.10207 |
link |
2024-02-15 |
Chain-of-Thought Reasoning Without Prompting |
Xuezhi Wang et.al. |
2402.10200 |
null |
2024-02-15 |
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents |
Lingbo Mo et.al. |
2402.10196 |
link |
2024-02-15 |
BitDelta: Your Fine-Tune May Only Be Worth One Bit |
James Liu et.al. |
2402.10193 |
link |
2024-02-15 |
Uncertainty Decomposition and Quantification for In-Context Learning of Large Language Models |
Chen Ling et.al. |
2402.10189 |
link |
2024-02-15 |
Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective |
Tianyi Qiu et.al. |
2402.10184 |
null |
2024-02-15 |
TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation |
Yaoxiang Wang et.al. |
2402.10178 |
null |
2024-02-15 |
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset |
Shubham Toshniwal et.al. |
2402.10176 |
link |
2024-02-15 |
Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence |
Yinhong Liu et.al. |
2402.10175 |
link |
2024-02-15 |
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models |
Ali AhmadiTeshnizi et.al. |
2402.10172 |
link |
2024-02-15 |
Data Engineering for Scaling Language Models to 128K Context |
Yao Fu et.al. |
2402.10171 |
link |
2024-02-15 |
Knowledge-Infused LLM-Powered Conversational Health Agent: A Case Study for Diabetes Patients |
Mahyar Abbasian et.al. |
2402.10153 |
null |
2024-02-15 |
ControlLM: Crafting Diverse Personalities for Language Models |
Yixuan Weng et.al. |
2402.10151 |
link |
2024-02-15 |
TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles |
Yinhong Liu et.al. |
2402.10137 |
null |
2024-02-15 |
Zero-Shot Reasoning: Personalized Content Generation Without the Cold Start Problem |
Davor Hafnar et.al. |
2402.10133 |
link |
2024-02-15 |
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning |
Ming Li et.al. |
2402.10110 |
link |
2024-02-15 |
Quantized Embedding Vectors for Controllable Diffusion Language Models |
Cheng Kang et.al. |
2402.10107 |
null |
2024-02-15 |
GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving |
Jiaxin Zhang et.al. |
2402.10104 |
link |
2024-02-15 |
Any-Shift Prompting for Generalization over Distributions |
Zehao Xiao et.al. |
2402.10099 |
null |
2024-02-14 |
AQA-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability |
Siwei Yang et.al. |
2402.09404 |
link |
2024-02-14 |
Reinforcement Learning from Human Feedback with Active Queries |
Kaixuan Ji et.al. |
2402.09401 |
null |
2024-02-14 |
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference |
Harry Dong et.al. |
2402.09398 |
link |
2024-02-14 |
LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset |
Botao Yu et.al. |
2402.09391 |
link |
2024-02-14 |
HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation |
Yihao Fang et.al. |
2402.09390 |
link |
2024-02-14 |
Transformers Can Achieve Length Generalization But Not Robustly |
Yongchao Zhou et.al. |
2402.09371 |
null |
2024-02-14 |
Pseudorandom Error-Correcting Codes |
Miranda Christ et.al. |
2402.09370 |
null |
2024-02-14 |
Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking |
Yi Fung et.al. |
2402.09369 |
link |
2024-02-14 |
Copyright Traps for Large Language Models |
Matthieu Meeus et.al. |
2402.09363 |
link |
2024-02-14 |
HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference |
Yashas Samaga B L et.al. |
2402.09360 |
null |
2024-02-14 |
Developing a Framework for Auditing Large Language Models Using Human-in-the-Loop |
Maryam Amirizaniani et.al. |
2402.09346 |
null |
2024-02-14 |
Mitigating Reward Hacking via Information-Theoretic Reward Modeling |
Yuchun Miao et.al. |
2402.09345 |
link |
2024-02-14 |
AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach |
Maryam Amirizaniani et.al. |
2402.09334 |
null |
2024-02-14 |
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization |
Feifan Song et.al. |
2402.09320 |
link |
2024-02-14 |
Embracing the black box: Heading towards foundation models for causal discovery from time series data |
Gideon Stein et.al. |
2402.09305 |
link |
2024-02-14 |
Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code |
Vahid Majdinasab et.al. |
2402.09299 |
link |
2024-02-14 |
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey |
Zhichen Dong et.al. |
2402.09283 |
link |
2024-02-14 |
Leveraging Large Language Models for Enhanced NLP Task Performance through Knowledge Distillation and Optimized Training Strategies |
Yining Huang et.al. |
2402.09282 |
null |
2024-02-14 |
Personalized Large Language Models |
Stanisław Woźniak et.al. |
2402.09269 |
null |
2024-02-14 |
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation |
Xiaoying Zhang et.al. |
2402.09267 |
null |
2024-02-13 |
Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance |
Linxi Zhao et.al. |
2402.08680 |
null |
2024-02-13 |
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability |
Xingang Guo et.al. |
2402.08679 |
link |
2024-02-13 |
Human Curriculum Effects Emerge with In-Context Learning in Neural Networks |
Jacob Russin et.al. |
2402.08674 |
null |
2024-02-13 |
Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models |
Yuqing Liu et.al. |
2402.08670 |
null |
2024-02-13 |
Improving Generalization in Semantic Parsing by Increasing Natural Language Variation |
Irina Saparina et.al. |
2402.08666 |
link |
2024-02-13 |
The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in a Prospective Cardiac Rehabilitation Setting |
David Haag et.al. |
2402.08658 |
null |
2024-02-13 |
PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs |
Michael Dorkenwald et.al. |
2402.08657 |
null |
2024-02-13 |
Tandem Transformers for Inference Efficient LLMs |
Aishwarya P S et.al. |
2402.08644 |
null |
2024-02-13 |
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages |
Nedjma Ousidhoum et.al. |
2402.08638 |
null |
2024-02-13 |
Knowledge Editing on Black-box Large Language Models |
Xiaoshuai Song et.al. |
2402.08631 |
link |
2024-02-13 |
Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning |
Haeju Lee et.al. |
2402.08594 |
link |
2024-02-13 |
Test-Time Backdoor Attacks on Multimodal Large Language Models |
Dong Lu et.al. |
2402.08577 |
link |
2024-02-13 |
Online Foundation Model Selection in Robotics |
Po-han Li et.al. |
2402.08570 |
null |
2024-02-13 |
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast |
Xiangming Gu et.al. |
2402.08567 |
link |
2024-02-13 |
Artificial Intelligence for Literature Reviews: Opportunities and Challenges |
Francisco Bolanos et.al. |
2402.08565 |
null |
2024-02-13 |
Higher Layers Need More LoRA Experts |
Chongyang Gao et.al. |
2402.08562 |
link |
2024-02-13 |
Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback |
Vineet Bhat et.al. |
2402.08546 |
null |
2024-02-13 |
The Application of ChatGPT in Responding to Questions Related to the Boston Bowel Preparation Scale |
Xiaoqiang Liu et.al. |
2402.08492 |
null |
2024-02-13 |
Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models |
Shaeke Salman et.al. |
2402.08473 |
null |
2024-02-13 |
Large Language Models for the Automated Analysis of Optimization Algorithms |
Camilo Chacón Sartori et.al. |
2402.08472 |
link |
2024-02-12 |
A systematic investigation of learnability from single child linguistic input |
Yulu Qin et.al. |
2402.07899 |
link |
2024-02-12 |
Suppressing Pink Elephants with Direct Principle Feedback |
Louis Castricato et.al. |
2402.07896 |
null |
2024-02-12 |
WildfireGPT: Tailored Large Language Model for Wildfire Analysis |
Yangxinyu Xie et.al. |
2402.07877 |
null |
2024-02-12 |
Policy Improvement using Language Feedback Models |
Victor Zhong et.al. |
2402.07876 |
link |
2024-02-12 |
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs |
Soroush Nasiriany et.al. |
2402.07872 |
null |
2024-02-12 |
Scaling Laws for Fine-Grained Mixture of Experts |
Jakub Krajewski et.al. |
2402.07871 |
link |
2024-02-12 |
PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models |
Wei Zou et.al. |
2402.07867 |
link |
2024-02-12 |
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models |
Siddharth Karamcheti et.al. |
2402.07865 |
link |
2024-02-12 |
AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy |
Philipp Schoenegger et.al. |
2402.07862 |
null |
2024-02-12 |
Lissard: Long and Simple Sequential Reasoning Datasets |
Mirelle Bueno et.al. |
2402.07859 |
link |
2024-02-12 |
Mercury: An Efficiency Benchmark for LLM Code Synthesis |
Mingzhe Du et.al. |
2402.07844 |
link |
2024-02-12 |
Do Membership Inference Attacks Work on Large Language Models? |
Michael Duan et.al. |
2402.07841 |
link |
2024-02-12 |
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model |
Ahmet Üstün et.al. |
2402.07827 |
null |
2024-02-12 |
Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning |
Z Liu et.al. |
2402.07818 |
null |
2024-02-12 |
Injecting Wiktionary to improve token-level contextual representations using contrastive learning |
Anna Mosolova et.al. |
2402.07817 |
null |
2024-02-12 |
Retrieval-Augmented Thought Process as Sequential Decision Making |
Thomas Pouplin et.al. |
2402.07812 |
null |
2024-02-12 |
Empowering Federated Learning for Massive Models with NVIDIA FLARE |
Holger R. Roth et.al. |
2402.07792 |
null |
2024-02-12 |
TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection |
Hui Liu et.al. |
2402.07776 |
link |
2024-02-12 |
Quantitative knowledge retrieval from large language models |
David Selby et.al. |
2402.07770 |
link |
2024-02-12 |
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model |
Mikail Khona et.al. |
2402.07757 |
null |
2024-02-09 |
Feedback Loops With Language Models Drive In-Context Reward Hacking |
Alexander Pan et.al. |
2402.06627 |
link |
2024-02-09 |
Understanding the Effects of Iterative Prompting on Truthfulness |
Satyapriya Krishna et.al. |
2402.06625 |
null |
2024-02-09 |
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning |
Shivalika Singh et.al. |
2402.06619 |
null |
2024-02-09 |
FaBERT: Pre-training BERT on Persian Blogs |
Mostafa Masumi et.al. |
2402.06617 |
null |
2024-02-09 |
On the Out-Of-Distribution Generalization of Multimodal Large Language Models |
Xingxuan Zhang et.al. |
2402.06599 |
null |
2024-02-09 |
CigaR: Cost-efficient Program Repair with LLMs |
Dávid Hidvégi et.al. |
2402.06598 |
link |
2024-02-09 |
Understanding the Weakness of Large Language Model Agents within a Complex Android Environment |
Mingzhe Xing et.al. |
2402.06596 |
link |
2024-02-09 |
Self-consistent context aware conformer transducer for speech recognition |
Konstantin Kolokolov et.al. |
2402.06592 |
null |
2024-02-09 |
G-SciEdBERT: A Contextualized LLM for Science Assessment Tasks in German |
Ehsan Latif et.al. |
2402.06584 |
link |
2024-02-09 |
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning |
Amir Ziai et.al. |
2402.06560 |
link |
2024-02-09 |
The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model |
Gregory Coppola et.al. |
2402.06557 |
link |
2024-02-09 |
Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA |
Marek Šuppa et.al. |
2402.06549 |
link |
2024-02-09 |
Calibrating Long-form Generations from Large Language Models |
Yukun Huang et.al. |
2402.06544 |
link |
2024-02-09 |
Introspective Planning: Guiding Language-Enabled Agents to Refine Their Own Uncertainty |
Kaiqu Liang et.al. |
2402.06529 |
link |
2024-02-09 |
Multimodal Clinical Trial Outcome Prediction with Large Language Models |
Wenhao Zheng et.al. |
2402.06512 |
link |
2024-02-09 |
Iris-SAM: Iris Segmentation Using a Foundational Model |
Parisa Farmanifard et.al. |
2402.06497 |
link |
2024-02-09 |
Large Language Models for Captioning and Retrieving Remote Sensing Images |
João Daniel Silva et.al. |
2402.06475 |
null |
2024-02-09 |
V-STaR: Training Verifiers for Self-Taught Reasoners |
Arian Hosseini et.al. |
2402.06457 |
null |
2024-02-09 |
StruQ: Defending Against Prompt Injection with Structured Queries |
Sizhe Chen et.al. |
2402.06363 |
link |
2024-02-09 |
CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models |
Peiyuan Gong et.al. |
2402.06360 |
link |
2024-02-08 |
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models |
Peng Gao et.al. |
2402.05935 |
link |
2024-02-08 |
Driving Everywhere with Large Language Model Policy Adaptation |
Boyi Li et.al. |
2402.05932 |
null |
2024-02-08 |
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
Xing Han Lù et.al. |
2402.05930 |
link |
2024-02-08 |
An Interactive Agent Foundation Model |
Zane Durante et.al. |
2402.05929 |
null |
2024-02-08 |
On the Convergence of Zeroth-Order Federated Tuning in Large Language Models |
Zhenqing Ling et.al. |
2402.05926 |
link |
2024-02-08 |
Efficient Stagewise Pretraining via Progressive Subnetworks |
Abhishek Panigrahi et.al. |
2402.05913 |
null |
2024-02-08 |
FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs |
Eun Cheol Choi et.al. |
2402.05904 |
link |
2024-02-08 |
Large Language Model Meets Graph Neural Network in Knowledge Distillation |
Shengxiang Hu et.al. |
2402.05894 |
null |
2024-02-08 |
Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking |
Nikhil Sharma et.al. |
2402.05880 |
null |
2024-02-08 |
PromptCrypt: Prompt Encryption for Secure Communication with Large Language Models |
Guo Lin et.al. |
2402.05868 |
link |
2024-02-08 |
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis |
Federico Bianchi et.al. |
2402.05863 |
link |
2024-02-08 |
Let Your Graph Do the Talking: Encoding Structured Data for LLMs |
Bryan Perozzi et.al. |
2402.05862 |
link |
2024-02-08 |
Learning to Route Among Specialized Experts for Zero-Shot Generalization |
Mohammed Muqeeth et.al. |
2402.05859 |
link |
2024-02-08 |
Limitations of Agents Simulated by Predictive Models |
Raymond Douglas et.al. |
2402.05829 |
null |
2024-02-08 |
Is it Possible to Edit Large Language Models Robustly? |
Xinbei Ma et.al. |
2402.05827 |
link |
2024-02-08 |
Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models |
Lingzhi Wang et.al. |
2402.05813 |
null |
2024-02-08 |
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning |
Zhiheng Xi et.al. |
2402.05808 |
link |
2024-02-08 |
How do Transformers perform In-Context Autoregressive Learning? |
Michael E. Sander et.al. |
2402.05787 |
null |
2024-02-08 |
Limits of Transformer Language Models on Algorithmic Learning |
Jonathan Thomm et.al. |
2402.05785 |
link |
2024-02-08 |
Text-to-Code Generation with Modality-relative Pre-training |
Fenia Christopoulou et.al. |
2402.05783 |
null |
2024-02-07 |
Opening the AI black box: program synthesis via mechanistic interpretability |
Eric J. Michaud et.al. |
2402.05110 |
link |
2024-02-07 |
You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models |
Alix Decrop et.al. |
2402.05102 |
null |
2024-02-07 |
Hydragen: High-Throughput LLM Inference with Shared Prefixes |
Jordan Juravsky et.al. |
2402.05099 |
link |
2024-02-07 |
Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation |
Dennis Hoftijzer et.al. |
2402.05090 |
link |
2024-02-07 |
A Roadmap to Pluralistic Alignment |
Taylor Sorensen et.al. |
2402.05070 |
link |
2024-02-07 |
SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models |
Lijun Li et.al. |
2402.05044 |
link |
2024-02-07 |
How BERT Speaks Shakespearean English? Evaluating Historical Bias in Contextual Language Models |
Miriam Cuscito et.al. |
2402.05034 |
null |
2024-02-07 |
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? |
Agustinus Kristiadi et.al. |
2402.05015 |
link |
2024-02-07 |
Pedagogical Alignment of Large Language Models |
Shashank Sonkar et.al. |
2402.05000 |
link |
2024-02-07 |
An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration |
Yihao Li et.al. |
2402.04978 |
null |
2024-02-07 |
ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12 |
Liuqing Chen et.al. |
2402.04975 |
null |
2024-02-07 |
Reconfidencing LLMs from the Grouping Loss Perspective |
Lihu Chen et.al. |
2402.04957 |
null |
2024-02-07 |
Chatbots in Knowledge-Intensive Contexts: Comparing Intent and LLM-Based Systems |
Samuel Kernan Freire et.al. |
2402.04955 |
null |
2024-02-07 |
Prompting Implicit Discourse Relation Annotation |
Frances Yung et.al. |
2402.04918 |
null |
2024-02-07 |
Personalized Text Generation with Fine-Grained Linguistic Control |
Bashar Alhafni et.al. |
2402.04914 |
link |
2024-02-07 |
L4Q: Parameter Efficient Quantization-Aware Training on Large Language Models via LoRA-wise LSQ |
Hyesung Jeon et.al. |
2402.04902 |
null |
2024-02-07 |
Detecting Generated Native Ads in Conversational Search |
Sebastian Schmidt et.al. |
2402.04889 |
link |
2024-02-07 |
Multimodal Query Suggestion with Multi-Agent Reinforcement Learning from Human Feedback |
Zheng Wang et.al. |
2402.04867 |
null |
2024-02-07 |
Automated Smart Contract Summarization via LLMs |
Yingjie Mao et.al. |
2402.04863 |
null |
2024-02-07 |
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay |
Natasha Butt et.al. |
2402.04858 |
link |
2024-02-06 |
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls |
Yu Du et.al. |
2402.04253 |
link |
2024-02-06 |
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal |
Mantas Mazeika et.al. |
2402.04249 |
link |
2024-02-06 |
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks |
Jongho Park et.al. |
2402.04248 |
link |
2024-02-06 |
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science |
Xiangru Tang et.al. |
2402.04247 |
null |
2024-02-06 |
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations |
Ji Qi et.al. |
2402.04236 |
link |
2024-02-06 |
Can Generative Agents Predict Emotion? |
Ciaran Regan et.al. |
2402.04232 |
null |
2024-02-06 |
“Task Success” is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors |
Lin Guan et.al. |
2402.04210 |
null |
2024-02-06 |
Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models |
David Sobrín-Hidalgo et.al. |
2402.04206 |
link |
2024-02-06 |
SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models |
Yichen Shi et.al. |
2402.04178 |
link |
2024-02-06 |
Scaling Laws for Downstream Task Performance of Large Language Models |
Berivan Isik et.al. |
2402.04177 |
null |
2024-02-06 |
Harnessing the Plug-and-Play Controller by Prompting |
Hao Wang et.al. |
2402.04160 |
null |
2024-02-06 |
Multi-line AI-assisted Code Authoring |
Omer Dunay et.al. |
2402.04141 |
null |
2024-02-06 |
Advancing Legal Reasoning: The Integration of AI to Navigate Complexities and Biases in Global Jurisprudence with Semi-Automated Arbitration Processes (SAAPs) |
Michael De’Shazer et.al. |
2402.04140 |
null |
2024-02-06 |
Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science |
Pengfei Liu et.al. |
2402.04119 |
link |
2024-02-06 |
Measuring Implicit Bias in Explicitly Unbiased Large Language Models |
Xuechunzi Bai et.al. |
2402.04105 |
link |
2024-02-06 |
The Use of a Large Language Model for Cyberbullying Detection |
Bayode Ogunleye et.al. |
2402.04088 |
null |
2024-02-06 |
A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation |
Zhengbo Wang et.al. |
2402.04087 |
link |
2024-02-06 |
Provably learning a multi-head attention layer |
Sitan Chen et.al. |
2402.04084 |
null |
2024-02-06 |
Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models |
Reza Khanmohammadi et.al. |
2402.04075 |
null |
2024-02-06 |
Retrieve to Explain: Evidence-driven Predictions with Language Models |
Ravi Patel et.al. |
2402.04068 |
link |