2024-07-25 |
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning |
Tianduo Wang et.al. |
2407.18248 |
link |
2024-07-25 |
LoRA-Pro: Are Low-Rank Adapters Properly Optimized? |
Zhengbo Wang et.al. |
2407.18242 |
link |
2024-07-25 |
Recursive Introspection: Teaching Language Model Agents How to Self-Improve |
Yuxiao Qu et.al. |
2407.18219 |
null |
2024-07-25 |
Exploring Scaling Trends in LLM Robustness |
Nikolaus Howe et.al. |
2407.18213 |
null |
2024-07-25 |
AsEP: Benchmarking Deep Learning Methods for Antibody-specific Epitope Prediction |
Chunan Liu et.al. |
2407.18184 |
link |
2024-07-25 |
Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning |
Sindhura Kommu et.al. |
2407.18181 |
null |
2024-07-25 |
Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models |
Sanae Lotfi et.al. |
2407.18158 |
null |
2024-07-25 |
$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs |
Vlad Sobal et.al. |
2407.18134 |
null |
2024-07-25 |
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic |
Fakhraddin Alwajih et.al. |
2407.18129 |
null |
2024-07-25 |
Efficient Inference of Vision Instruction-Following Models with Elastic Cache |
Zuyan Liu et.al. |
2407.18121 |
link |
2024-07-25 |
Multi-Resolution Histopathology Patch Graphs for Ovarian Cancer Subtyping |
Jack Breen et.al. |
2407.18105 |
null |
2024-07-25 |
Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow |
Tian Guo et.al. |
2407.18103 |
null |
2024-07-25 |
PEFT-U: Parameter-Efficient Fine-Tuning for User Personalization |
Christopher Clarke et.al. |
2407.18078 |
null |
2024-07-25 |
C2P: Featuring Large Language Models with Causal Reasoning |
Abdolmahdi Bagheri et.al. |
2407.18069 |
null |
2024-07-25 |
ComPeer: A Generative Conversational Agent for Proactive Peer Support |
Tianjian Liu et.al. |
2407.18064 |
null |
2024-07-25 |
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding |
Soham Deshmukh et.al. |
2407.18062 |
null |
2024-07-25 |
Difficulty Estimation and Simplification of French Text Using LLMs |
Henri Jamet et.al. |
2407.18061 |
null |
2024-07-25 |
The Geometry of Queries: Query-Based Innovations in Retrieval-Augmented Generation |
Eric Yang et.al. |
2407.18044 |
null |
2024-07-25 |
RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models |
Haoyu Chen et.al. |
2407.18035 |
null |
2024-07-25 |
GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy |
Jan Batzner et.al. |
2407.18008 |
null |
2024-07-24 |
I Could’ve Asked That: Reformulating Unanswerable Questions |
Wenting Zhao et.al. |
2407.17469 |
link |
2024-07-24 |
WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries |
Wenting Zhao et.al. |
2407.17468 |
null |
2024-07-24 |
CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models |
Jiawei Gu et.al. |
2407.17467 |
null |
2024-07-24 |
$VILA^2$: VILA Augmented VILA |
Yunhao Fang et.al. |
2407.17453 |
null |
2024-07-24 |
Fluent Student-Teacher Redteaming |
T. Ben Thompson et.al. |
2407.17447 |
link |
2024-07-24 |
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? |
Michael-Andrei Panaitescu-Liess et.al. |
2407.17417 |
null |
2024-07-24 |
(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork |
Tianjin Huang et.al. |
2407.17412 |
null |
2024-07-24 |
Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models |
Yida Zhao et.al. |
2407.17406 |
link |
2024-07-24 |
Grammar-based Game Description Generation using Large Language Models |
Tsunehiko Tanaka et.al. |
2407.17404 |
null |
2024-07-24 |
3D Question Answering for City Scene Understanding |
Penglei Sun et.al. |
2407.17398 |
null |
2024-07-24 |
PERSONA: A Reproducible Testbed for Pluralistic Alignment |
Louis Castricato et.al. |
2407.17387 |
null |
2024-07-24 |
A Comprehensive Approach to Misspelling Correction with BERT and Levenshtein Distance |
Amirreza Naziri et.al. |
2407.17383 |
null |
2024-07-24 |
MMRA: A Benchmark for Multi-granularity Multi-image Relational Association |
Siwei Wu et.al. |
2407.17379 |
null |
2024-07-24 |
ViPer: Visual Personalization of Generative Models via Individual Preference Learning |
Sogand Salehi et.al. |
2407.17365 |
null |
2024-07-24 |
Gradient-based inference of abstract task representations for generalization in neural networks |
Ali Hummos et.al. |
2407.17356 |
null |
2024-07-24 |
Scalify: scale propagation for efficient low-precision LLM training |
Paul Balança et.al. |
2407.17353 |
link |
2024-07-24 |
Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching |
Yuyang Ding et.al. |
2407.17349 |
null |
2024-07-24 |
DexGANGrasp: Dexterous Generative Adversarial Grasping Synthesis for Task-Oriented Manipulation |
Qian Feng et.al. |
2407.17348 |
null |
2024-07-24 |
Label Alignment and Reassignment with Generalist Large Language Model for Enhanced Cross-Domain Named Entity Recognition |
Ke Bao et.al. |
2407.17344 |
null |
2024-07-24 |
How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations? |
Leo Yu-Ho Lo et.al. |
2407.17291 |
null |
2024-07-23 |
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects |
Junyi Li et.al. |
2407.16696 |
null |
2024-07-23 |
Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack |
Xiaoyue Xu et.al. |
2407.16695 |
null |
2024-07-23 |
Can Large Language Models Automatically Jailbreak GPT-4V? |
Yuanwei Wu et.al. |
2407.16686 |
null |
2024-07-23 |
SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation |
Pengfei Chen et.al. |
2407.16682 |
null |
2024-07-23 |
RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent |
Huiyu Xu et.al. |
2407.16667 |
null |
2024-07-23 |
Course-Correction: Safety Alignment Using Synthetic Preferences |
Rongwu Xu et.al. |
2407.16637 |
null |
2024-07-23 |
Lawma: The Power of Specialization for Legal Tasks |
Ricardo Dominguez-Olmedo et.al. |
2407.16615 |
null |
2024-07-23 |
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data? |
Jonathan Hayase et.al. |
2407.16607 |
null |
2024-07-23 |
Shared Imagination: LLMs Hallucinate Alike |
Yilun Zhou et.al. |
2407.16604 |
null |
2024-07-23 |
A Comparative Study on Patient Language across Therapeutic Domains for Effective Patient Voice Classification in Online Health Discussions |
Giorgos Lysandrou et.al. |
2407.16593 |
null |
2024-07-23 |
Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs |
Yifan Xia et.al. |
2407.16576 |
null |
2024-07-23 |
TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback |
Eunseop Yoon et.al. |
2407.16574 |
null |
2024-07-23 |
Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models |
Ioana Buhnila et.al. |
2407.16565 |
null |
2024-07-23 |
Patched RTC: evaluating LLMs for diverse software development tasks |
Asankhaya Sharma et.al. |
2407.16557 |
null |
2024-07-24 |
MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues |
Liyun Zhang et.al. |
2407.16552 |
null |
2024-07-23 |
Quantifying the Role of Textual Predictability in Automatic Speech Recognition |
Sean Robertson et.al. |
2407.16537 |
null |
2024-07-23 |
Imperfect Vision Encoders: Efficient and Robust Tuning for Vision-Language Models |
Aristeidis Panos et.al. |
2407.16526 |
null |
2024-07-23 |
AMONGAGENTS: Evaluating Large Language Models in the Interactive Text-Based Social Deduction Game |
Yizhou Chi et.al. |
2407.16521 |
null |
2024-07-23 |
Language-Based Security for Low-Level MPC |
Christian Skalka et.al. |
2407.16504 |
null |
2024-07-23 |
Machine Translation Hallucination Detection for Low and High Resource Languages using Large Language Models |
Kenza Benkirane et.al. |
2407.16470 |
null |
2024-07-22 |
AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description |
Junyu Xie et.al. |
2407.15850 |
link |
2024-07-22 |
LLMmap: Fingerprinting For Large Language Models |
Dario Pasquini et.al. |
2407.15847 |
null |
2024-07-22 |
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models |
Mingze Xu et.al. |
2407.15841 |
null |
2024-07-22 |
MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity |
Yangzhou Liu et.al. |
2407.15838 |
null |
2024-07-22 |
dMel: Speech Tokenization made Simple |
He Bai et.al. |
2407.15835 |
null |
2024-07-22 |
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling |
Wataru Nakata et.al. |
2407.15828 |
null |
2024-07-22 |
Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight |
Ziyuan Huang et.al. |
2407.15819 |
null |
2024-07-22 |
Perceptions of Linguistic Uncertainty by Language Models and Humans |
Catarina G Belem et.al. |
2407.15814 |
link |
2024-07-22 |
AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection |
Yunkang Cao et.al. |
2407.15795 |
link |
2024-07-22 |
CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning |
Emanuele Frascaroli et.al. |
2407.15793 |
link |
2024-07-22 |
Extracting Structured Insights from Financial News: An Augmented LLM Driven Approach |
Rian Dolphin et.al. |
2407.15788 |
null |
2024-07-22 |
Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels |
Zhuorui Ye et.al. |
2407.15786 |
null |
2024-07-22 |
Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning |
Kaiwen Wang et.al. |
2407.15762 |
null |
2024-07-22 |
MoRSE: Bridging the Gap in Cybersecurity Expertise with Retrieval Augmented Generation |
Marco Simoni et.al. |
2407.15748 |
null |
2024-07-22 |
OMoS-QA: A Dataset for Cross-Lingual Extractive Question Answering in a German Migration Context |
Steffen Kleinle et.al. |
2407.15736 |
null |
2024-07-22 |
TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON |
John Chong Min Tan et.al. |
2407.15734 |
null |
2024-07-22 |
Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders |
Laura Niss et.al. |
2407.15731 |
null |
2024-07-22 |
SAM2CLIP2SAM: Vision Language Model for Segmentation of 3D CT Scans for Covid-19 Detection |
Dimitrios Kollias et.al. |
2407.15728 |
null |
2024-07-22 |
DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design |
Zhi Hao Luo et.al. |
2407.15723 |
link |
2024-07-22 |
Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability |
Zhuoyan Xu et.al. |
2407.15720 |
link |
2024-07-19 |
Internal Consistency and Self-Feedback in Large Language Models: A Survey |
Xun Liang et.al. |
2407.14507 |
link |
2024-07-19 |
On Pre-training of Multimodal Language Models Customized for Chart Understanding |
Wan-Cyuan Fan et.al. |
2407.14506 |
null |
2024-07-19 |
PD-TPE: Parallel Decoder with Text-guided Position Encoding for 3D Visual Grounding |
Chenshu Hou et.al. |
2407.14491 |
null |
2024-07-19 |
Evaluating the Reliability of Self-Explanations in Large Language Models |
Korbinian Randl et.al. |
2407.14487 |
link |
2024-07-19 |
Data-Centric Human Preference Optimization with Rationales |
Hoang Anh Just et.al. |
2407.14477 |
null |
2024-07-19 |
Contrastive Learning with Counterfactual Explanations for Radiology Report Generation |
Mingjie Li et.al. |
2407.14474 |
null |
2024-07-19 |
Check-Eval: A Checklist-based Approach for Evaluating Text Quality |
Jayr Pereira et.al. |
2407.14467 |
null |
2024-07-19 |
Undermining Mental Proof: How AI Can Make Cooperation Harder by Making Thinking Easier |
Zachary Wojtowicz et.al. |
2407.14452 |
null |
2024-07-19 |
Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding |
Renshan Zhang et.al. |
2407.14439 |
link |
2024-07-19 |
Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders |
Senthooran Rajamanoharan et.al. |
2407.14435 |
null |
2024-07-19 |
Mixture of Experts with Mixture of Precisions for Tuning Quality of Service |
HamidReza Imani et.al. |
2407.14417 |
null |
2024-07-19 |
System-1.x: Learning to Balance Fast and Slow Planning with Language Models |
Swarnadeep Saha et.al. |
2407.14414 |
link |
2024-07-19 |
DEAL: Disentangle and Localize Concept-level Explanations for VLMs |
Tang Li et.al. |
2407.14412 |
null |
2024-07-19 |
The Vision of Autonomic Computing: Can LLMs Make It a Reality? |
Zhiyang Zhang et.al. |
2407.14402 |
null |
2024-07-19 |
Frontiers of Deep Learning: From Novel Application to Real-World Deployment |
Rui Xie et.al. |
2407.14386 |
null |
2024-07-19 |
Open Artificial Knowledge |
Vadim Borisov et.al. |
2407.14371 |
null |
2024-07-19 |
Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models |
Xuenan Xu et.al. |
2407.14355 |
null |
2024-07-19 |
Improving Retrieval in Sponsored Search by Leveraging Query Context Signals |
Akash Kumar Mohankumar et.al. |
2407.14346 |
null |
2024-07-19 |
LLMs left, right, and center: Assessing GPT’s capabilities to label political bias from web domains |
Raphael Hernandes et.al. |
2407.14344 |
null |
2024-07-19 |
Multimodal Misinformation Detection using Large Vision-Language Models |
Sahar Tahmasebi et.al. |
2407.14321 |
null |
2024-07-18 |
Latent Causal Probing: A Formal Perspective on Probing with Causal Models of Data |
Charles Jin et.al. |
2407.13765 |
null |
2024-07-18 |
SegPoint: Segment Any Point Cloud via Large Language Model |
Shuting He et.al. |
2407.13761 |
null |
2024-07-18 |
Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models |
Zhuo Chen et.al. |
2407.13757 |
null |
2024-07-18 |
CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications |
Mirza Masfiqur Rahman et.al. |
2407.13742 |
null |
2024-07-18 |
Baba Is AI: Break the Rules to Beat the Benchmark |
Nathan Cloos et.al. |
2407.13729 |
null |
2024-07-18 |
CoDefeater: Using LLMs To Find Defeaters in Assurance Cases |
Usman Gohar et.al. |
2407.13717 |
link |
2024-07-18 |
Understanding Reference Policies in Direct Preference Optimization |
Yixin Liu et.al. |
2407.13709 |
null |
2024-07-18 |
A Comprehensive Review of Recommender Systems: Transitioning from Theory to Practice |
Shaina Raza et.al. |
2407.13699 |
null |
2024-07-18 |
Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation |
Yotam Perlitz et.al. |
2407.13696 |
link |
2024-07-18 |
Prover-Verifier Games improve legibility of LLM outputs |
Jan Hendrik Kirchner et.al. |
2407.13692 |
null |
2024-07-18 |
Shaded Route Planning Using Active Segmentation and Identification of Satellite Images |
Longchao Da et.al. |
2407.13689 |
null |
2024-07-18 |
FuLG: 150B Romanian Corpus for Language Model Pretraining |
Vlad-Andrei Bădoiu et.al. |
2407.13657 |
null |
2024-07-18 |
COMCAT: Leveraging Human Judgment to Improve Automatic Documentation and Summarization |
Skyler Grandel et.al. |
2407.13648 |
null |
2024-07-18 |
Weak-to-Strong Reasoning |
Yuqing Yang et.al. |
2407.13647 |
link |
2024-07-18 |
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies |
Chaofan Tao et.al. |
2407.13623 |
null |
2024-07-18 |
KNOWNET: Guided Health Information Seeking from LLMs via Knowledge Graph Integration |
Youfu Yan et.al. |
2407.13598 |
null |
2024-07-18 |
PLANTS: A Novel Problem and Dataset for Summarization of Planning-Like (PL) Tasks |
Vishal Pallagani et.al. |
2407.13597 |
null |
2024-07-18 |
EarthMarker: A Visual Prompt Learning Framework for Region-level and Point-level Remote Sensing Imagery Comprehension |
Wei Zhang et.al. |
2407.13596 |
null |
2024-07-18 |
Robust Calibration of Large Vision-Language Adapters |
Balamurali Murugesan et.al. |
2407.13588 |
link |
2024-07-18 |
Towards Zero-Shot Multimodal Machine Translation |
Matthieu Futeral et.al. |
2407.13579 |
link |
2024-07-17 |
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models |
Kaichen Zhang et.al. |
2407.12772 |
link |
2024-07-17 |
EchoSight: Advancing Visual-Language Models with Wiki Knowledge |
Yibin Yan et.al. |
2407.12735 |
null |
2024-07-17 |
NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model |
Zhongqun Zhang et.al. |
2407.12727 |
null |
2024-07-17 |
Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models? |
Ben Yao et.al. |
2407.12725 |
null |
2024-07-17 |
The Future of Learning: Large Language Models through the Lens of Students |
He Zhang et.al. |
2407.12723 |
null |
2024-07-17 |
MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models |
Leyang Shen et.al. |
2407.12709 |
link |
2024-07-17 |
Subgraph-Aware Training of Text-based Methods for Knowledge Graph Completion |
Youmin Ko et.al. |
2407.12703 |
null |
2024-07-17 |
Patch-Level Training for Large Language Models |
Chenze Shao et.al. |
2407.12665 |
link |
2024-07-17 |
Zero-shot Text-guided Infinite Image Synthesis with LLM guidance |
Soyeong Kwon et.al. |
2407.12642 |
null |
2024-07-17 |
Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification? |
Aman Sinha et.al. |
2407.12626 |
null |
2024-07-17 |
Harnessing the Power of Artificial Intelligence to Vitalize Endangered Indigenous Languages: Technologies and Experiences |
Claudio Pinhanez et.al. |
2407.12620 |
null |
2024-07-17 |
AudienceView: AI-Assisted Interpretation of Audience Feedback in Journalism |
William Brannon et.al. |
2407.12613 |
link |
2024-07-17 |
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding |
Ofir Abramovich et.al. |
2407.12594 |
null |
2024-07-18 |
Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks |
Antoni Kowalczuk et.al. |
2407.12588 |
link |
2024-07-17 |
E5-V: Universal Embeddings with Multimodal Large Language Models |
Ting Jiang et.al. |
2407.12580 |
link |
2024-07-17 |
Audio Conditioning for Music Generation via Discrete Bottleneck Features |
Simon Rouard et.al. |
2407.12563 |
null |
2024-07-17 |
Conspiracy theories and where to find them on TikTok |
Francesco Corso et.al. |
2407.12545 |
null |
2024-07-17 |
Abstraction Alignment: Comparing Model and Human Conceptual Relationships |
Angie Boggust et.al. |
2407.12543 |
link |
2024-07-17 |
Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models |
Xihe Qiu et.al. |
2407.12532 |
null |
2024-07-17 |
Crafting the Path: Robust Query Rewriting for Information Retrieval |
Ingeol Baek et.al. |
2407.12529 |
null |
2024-07-16 |
UrbanWorld: An Urban World Model for 3D City Generation |
Yu Shang et.al. |
2407.11965 |
null |
2024-07-16 |
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? |
Mo Li et.al. |
2407.11963 |
link |
2024-07-16 |
Code Documentation and Analysis to Secure Software Development |
Paul Attie et.al. |
2407.11934 |
null |
2024-07-16 |
What’s Wrong? Refining Meeting Summaries with LLM Feedback |
Frederic Kirstein et.al. |
2407.11919 |
null |
2024-07-16 |
GraphFM: A Scalable Framework for Multi-Graph Pretraining |
Divyansha Lachi et.al. |
2407.11907 |
null |
2024-07-16 |
Ascend-CC: Confidential Computing on Heterogeneous NPU for Emerging Generative AI Workloads |
Aritra Dhar et.al. |
2407.11888 |
null |
2024-07-16 |
Zero-shot Cross-Lingual Transfer for Synthetic Data Generation in Grammatical Error Detection |
Gaetan Lopez Latouche et.al. |
2407.11854 |
null |
2024-07-16 |
Schema Matching with Large Language Models: an Experimental Study |
Marcel Parciak et.al. |
2407.11852 |
link |
2024-07-16 |
LoFTI: Localization and Factuality Transfer to Indian Locales |
Sona Elza Simon et.al. |
2407.11833 |
link |
2024-07-16 |
GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text |
Kyle Hamilton et.al. |
2407.11827 |
null |
2024-07-16 |
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation |
Branden Butler et.al. |
2407.11798 |
null |
2024-07-16 |
Large Language Models as Misleading Assistants in Conversation |
Betty Li Hou et.al. |
2407.11789 |
null |
2024-07-16 |
SwitchCIT: Switching for Continual Instruction Tuning of Large Language Models |
Xinbo Wu et.al. |
2407.11780 |
null |
2024-07-16 |
Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text |
Seyedeh Fatemeh Ebrahimi et.al. |
2407.11774 |
null |
2024-07-16 |
Educational Personalized Learning Path Planning with Large Language Models |
Chee Ng et.al. |
2407.11773 |
null |
2024-07-16 |
XEdgeAI: A Human-centered Industrial Inspection Framework with Data-centric Explainable Edge AI Approach |
Truong Thanh Hung Nguyen et.al. |
2407.11771 |
null |
2024-07-16 |
Robust Utility-Preserving Text Anonymization Based on Large Language Models |
Tianyu Yang et.al. |
2407.11770 |
link |
2024-07-16 |
Vectoring Languages |
Joseph Chen et.al. |
2407.11766 |
null |
2024-07-16 |
Exploring Quantization for Efficient Pre-Training of Transformer Language Models |
Kamran Chitsaz et.al. |
2407.11722 |
link |
2024-07-16 |
Harnessing Large Language Models for Multimodal Product Bundling |
Xiaohao Liu et.al. |
2407.11712 |
null |
2024-07-15 |
VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation |
Bocheng Zou et.al. |
2407.10972 |
link |
2024-07-15 |
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated |
Hongyu Wang et.al. |
2407.10969 |
null |
2024-07-15 |
Fast Matrix Multiplications for Lookup Table-Quantized LLMs |
Han Guo et.al. |
2407.10960 |
null |
2024-07-15 |
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? |
Ruisheng Cao et.al. |
2407.10956 |
link |
2024-07-15 |
MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models |
Chengguang Gan et.al. |
2407.10953 |
null |
2024-07-15 |
Can Textual Semantics Mitigate Sounding Object Segmentation Preference? |
Yaoting Wang et.al. |
2407.10947 |
link |
2024-07-15 |
Learning from Naturally Occurring Feedback |
Shachar Don-Yehiya et.al. |
2407.10944 |
link |
2024-07-15 |
GRUtopia: Dream General Robots in a City at Scale |
Hanqing Wang et.al. |
2407.10943 |
link |
2024-07-15 |
Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together |
Dilara Soylu et.al. |
2407.10930 |
null |
2024-07-15 |
Benchmarking Vision Language Models for Cultural Understanding |
Shravan Nayak et.al. |
2407.10920 |
null |
2024-07-15 |
FinDKG: Dynamic Knowledge Graphs with Large Language Models for Detecting Global Trends in Financial Markets |
Xiaohui Victor Li et.al. |
2407.10909 |
link |
2024-07-15 |
Hey, That’s My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique |
Mark Russinovich et.al. |
2407.10887 |
null |
2024-07-15 |
SLIP: Securing LLMs IP Using Weights Decomposition |
Yehonathan Refael et.al. |
2407.10886 |
null |
2024-07-15 |
Understanding the Importance of Evolutionary Search in Automated Heuristic Design with Large Language Models |
Rui Zhang et.al. |
2407.10873 |
null |
2024-07-15 |
GPT Sonography: Hand Gesture Decoding from Forearm Ultrasound Images via VLM |
Keshav Bimbraw et.al. |
2407.10870 |
null |
2024-07-15 |
Physics-Inspired Generative Models in Medical Imaging: A Review |
Dennis Hein et.al. |
2407.10856 |
null |
2024-07-15 |
Weighted Grouped Query Attention in Transformers |
Sai Sena Chinnakonduru et.al. |
2407.10855 |
null |
2024-07-15 |
An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases |
Dylan Bouchard et.al. |
2407.10853 |
null |
2024-07-15 |
MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs |
Quang H. Nguyen et.al. |
2407.10834 |
null |
2024-07-15 |
BiasScanner: Automatic Detection and Classification of News Bias to Strengthen Democracy |
Tim Menzner et.al. |
2407.10829 |
null |
2024-07-12 |
FairyLandAI: Personalized Fairy Tales utilizing ChatGPT and DALLE-3 |
Georgios Makridis et.al. |
2407.09467 |
null |
2024-07-12 |
Human-like Episodic Memory for Infinite Context LLMs |
Zafeirios Fountas et.al. |
2407.09450 |
null |
2024-07-12 |
ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts |
Amelia F. Hardy et.al. |
2407.09447 |
link |
2024-07-12 |
MUSCLE: A Model Update Strategy for Compatible LLM Evolution |
Jessica Echterhoff et.al. |
2407.09435 |
null |
2024-07-12 |
A Perspective on Foundation Models for the Electric Power Grid |
Hendrik F. Hamann et.al. |
2407.09434 |
null |
2024-07-12 |
Open (Clinical) LLMs are Sensitive to Instruction Phrasings |
Alberto Mario Ceballos Arroyo et.al. |
2407.09429 |
link |
2024-07-12 |
TelecomGPT: A Framework to Build Telecom-Specific Large Language Models |
Hang Zou et.al. |
2407.09424 |
null |
2024-07-12 |
Mitigating Entity-Level Hallucination in Large Language Models |
Weihang Su et.al. |
2407.09417 |
link |
2024-07-12 |
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers |
Shraman Pramanick et.al. |
2407.09413 |
link |
2024-07-12 |
Deep Bag-of-Words Model: An Efficient and Interpretable Relevance Architecture for Chinese E-Commerce |
Zhe Lin et.al. |
2407.09395 |
null |
2024-07-12 |
PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents |
Saber Zerhoudi et.al. |
2407.09394 |
link |
2024-07-12 |
GAVEL: Generating Games Via Evolution and Language Models |
Graham Todd et.al. |
2407.09388 |
null |
2024-07-12 |
Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text |
Lucio La Cava et.al. |
2407.09364 |
null |
2024-07-12 |
Good Intentions, Risky Inventions: A Method for Assessing the Risks and Benefits of AI in Mobile and Wearable Uses |
Marios Constantinides et.al. |
2407.09322 |
link |
2024-07-12 |
Scalability of Bayesian Network Structure Elicitation with Large Language Models: a Novel Methodology and Comparative Analysis |
Nikolay Babakov et.al. |
2407.09311 |
null |
2024-07-12 |
Transformer Layers as Painters |
Qi Sun et.al. |
2407.09298 |
null |
2024-07-12 |
Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study |
Yulong Yang et.al. |
2407.09295 |
null |
2024-07-12 |
CEIPA: Counterfactual Explainable Incremental Prompt Attack Analysis on Large Language Models |
Dong Shu et.al. |
2407.09292 |
null |
2024-07-12 |
Structuring Authenticity Assessments on Historical Documents using LLMs |
Andrea Schimmenti et.al. |
2407.09290 |
null |
2024-07-12 |
WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation |
Robin Schön et.al. |
2407.09288 |
null |
2024-07-11 |
MAVIS: Mathematical Visual Instruction Tuning |
Renrui Zhang et.al. |
2407.08739 |
link |
2024-07-11 |
Real-Time Anomaly Detection and Reactive Planning with Large Language Models |
Rohan Sinha et.al. |
2407.08735 |
null |
2024-07-11 |
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist |
Zihao Zhou et.al. |
2407.08733 |
null |
2024-07-11 |
A Taxonomy for Data Contamination in Large Language Models |
Medha Palavalli et.al. |
2407.08716 |
null |
2024-07-11 |
GTA: A Benchmark for General Tool Agents |
Jize Wang et.al. |
2407.08713 |
link |
2024-07-11 |
eyeballvul: a future-proof benchmark for vulnerability detection in the wild |
Timothee Chauvin et.al. |
2407.08708 |
link |
2024-07-11 |
Extracting Training Data from Document-Based VQA Models |
Francesco Pinto et.al. |
2407.08707 |
null |
2024-07-11 |
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models |
Runhui Huang et.al. |
2407.08706 |
null |
2024-07-11 |
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models |
Zhening Xing et.al. |
2407.08701 |
null |
2024-07-11 |
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging |
Anton Alexandrov et.al. |
2407.08699 |
null |
2024-07-11 |
Cloud Atlas: Efficient Fault Localization for Cloud Systems using Language Models and Causal Insight |
Zhiqiang Xie et.al. |
2407.08694 |
null |
2024-07-11 |
Robotic Control via Embodied Chain-of-Thought Reasoning |
Zawalski Michał et.al. |
2407.08693 |
null |
2024-07-11 |
SEED-Story: Multimodal Long Story Generation with Large Language Model |
Shuai Yang et.al. |
2407.08683 |
link |
2024-07-11 |
NODE-Adapter: Neural Ordinary Differential Equations for Better Vision-Language Reasoning |
Yi Zhang et.al. |
2407.08672 |
null |
2024-07-11 |
Uncertainty Estimation of Large Language Models in Medical Question Answering |
Jiaxin Wu et.al. |
2407.08662 |
null |
2024-07-11 |
Towards Building Specialized Generalist AI with System 1 and System 2 Fusion |
Kaiyan Zhang et.al. |
2407.08642 |
null |
2024-07-11 |
$β$-DPO: Direct Preference Optimization with Dynamic $β$ |
Junkang Wu et.al. |
2407.08639 |
link |
2024-07-11 |
RoboMorph: Evolving Robot Morphology using Large Language Models |
Kevin Qiu et.al. |
2407.08626 |
null |
2024-07-11 |
Tamil Language Computing: the Present and the Future |
Kengatharaiyer Sarveswaran et.al. |
2407.08618 |
null |
2024-07-11 |
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision |
Jay Shah et.al. |
2407.08608 |
null |
2024-07-10 |
Training on the Test Task Confounds Evaluation and Emergence |
Ricardo Dominguez-Olmedo et.al. |
2407.07890 |
link |
2024-07-10 |
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization |
Junkang Wu et.al. |
2407.07880 |
link |
2024-07-11 |
Toto: Time Series Optimized Transformer for Observability |
Ben Cohen et.al. |
2407.07874 |
null |
2024-07-10 |
FACTS About Building Retrieval Augmented Generation-based Chatbots |
Rama Akkiraju et.al. |
2407.07858 |
null |
2024-07-10 |
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training |
Sami Jaghouar et.al. |
2407.07852 |
link |
2024-07-10 |
Natural Language Mechanisms via Self-Resolution with Foundation Models |
Nicolas Della Penna et.al. |
2407.07845 |
null |
2024-07-10 |
Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective |
Shengjia Chen et.al. |
2407.07841 |
link |
2024-07-10 |
Decompose and Compare Consistency: Measuring VLMs’ Answer Reliability via Task-Decomposition Consistency Comparison |
Qian Yang et.al. |
2407.07840 |
null |
2024-07-10 |
Transformer Alignment in Large Language Models |
Murdock Aubry et.al. |
2407.07810 |
null |
2024-07-11 |
AVCap: Leveraging Audio-Visual Features as Text Tokens for Captioning |
Jongsuk Kim et.al. |
2407.07801 |
link |
2024-07-10 |
Attribute or Abstain: Large Language Models as Long Document Assistants |
Jan Buchmann et.al. |
2407.07799 |
link |
2024-07-11 |
Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard |
Oguzhan Topsakal et.al. |
2407.07796 |
link |
2024-07-10 |
Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities |
Tianjie Ju et.al. |
2407.07791 |
link |
2024-07-10 |
WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment |
Jiefu Ou et.al. |
2407.07778 |
null |
2024-07-10 |
Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs |
Hao-Tien Lewis Chiang et.al. |
2407.07775 |
null |
2024-07-10 |
Can ChatGPT Pass a Theory of Computing Course? |
Matei A. Golesteanu et.al. |
2407.07757 |
null |
2024-07-10 |
Fine-Tuning Large Language Models with User-Level Differential Privacy |
Zachary Charles et.al. |
2407.07737 |
null |
2024-07-10 |
PaliGemma: A versatile 3B VLM for transfer |
Lucas Beyer et.al. |
2407.07726 |
link |
2024-07-10 |
Why should we ever automate moral decision making? |
Vincent Conitzer et.al. |
2407.07671 |
null |
2024-07-10 |
A Proposed S.C.O.R.E. Evaluation Framework for Large Language Models: Safety, Consensus, Objectivity, Reproducibility and Explainability |
Ting Fang Tan et.al. |
2407.07666 |
null |
2024-07-09 |
AnyTaskTune: Advanced Domain-Specific Solutions through Task-Fine-Tuning |
Jiaxi Cui et.al. |
2407.07094 |
link |
2024-07-09 |
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation |
Liqun Ma et.al. |
2407.07093 |
link |
2024-07-09 |
CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation |
Tong Chen et.al. |
2407.07087 |
link |
2024-07-09 |
Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models |
Logan Cross et.al. |
2407.07086 |
link |
2024-07-09 |
Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities |
Shaltiel Shmidman et.al. |
2407.07080 |
null |
2024-07-09 |
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps |
Yung-Sung Chuang et.al. |
2407.07071 |
link |
2024-07-09 |
Prompting Techniques for Secure Code Generation: A Systematic Investigation |
Catherine Tony et.al. |
2407.07064 |
null |
2024-07-09 |
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence |
Weize Chen et.al. |
2407.07061 |
link |
2024-07-09 |
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model |
Wenqi Zhang et.al. |
2407.07053 |
link |
2024-07-09 |
ProtoSAM – One Shot Medical Image Segmentation With Foundational Models |
Lev Ayzenberg et.al. |
2407.07042 |
link |
2024-07-09 |
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models |
Yue Zhang et.al. |
2407.07035 |
null |
2024-07-09 |
Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization |
Jeongseok Hyun et.al. |
2407.07024 |
link |
2024-07-09 |
Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies |
Inwon Kang et.al. |
2407.07019 |
null |
2024-07-09 |
End-To-End Causal Effect Estimation from Unstructured Natural Language Data |
Nikita Dhawan et.al. |
2407.07018 |
null |
2024-07-09 |
Is Large Language Model All You Need to Predict the Synthesizability and Precursors of Crystal Structures? |
Zhilong Song et.al. |
2407.07016 |
null |
2024-07-09 |
Induction Heads as an Essential Mechanism for Pattern Matching in In-context Learning |
J. Crosbie et.al. |
2407.07011 |
null |
2024-07-09 |
Metron: Holistic Performance Evaluation Framework for LLM Inference Systems |
Amey Agrawal et.al. |
2407.07000 |
link |
2024-07-09 |
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective |
Yu-An Liu et.al. |
2407.06992 |
link |
2024-07-09 |
Segment-Based Interactive Machine Translation for Pre-trained Models |
Angel Navarro et.al. |
2407.06990 |
null |
2024-07-09 |
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models |
Yi-Cheng Lin et.al. |
2407.06957 |
link |
2024-07-08 |
Multi-Object Hallucination in Vision-Language Models |
Xuweiyi Chen et.al. |
2407.06192 |
null |
2024-07-08 |
4D Contrastive Superflows are Dense 3D Representation Learners |
Xiang Xu et.al. |
2407.06190 |
link |
2024-07-08 |
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision |
Orr Zohar et.al. |
2407.06189 |
link |
2024-07-08 |
CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation |
Xinying Guo et.al. |
2407.06188 |
null |
2024-07-08 |
JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation |
Yu Zeng et.al. |
2407.06187 |
null |
2024-07-08 |
Vision-Language Models under Cultural and Inclusive Considerations |
Antonia Karamolegkou et.al. |
2407.06177 |
null |
2024-07-08 |
On Speeding Up Language Model Evaluation |
Jin Peng Zhou et.al. |
2407.06172 |
null |
2024-07-08 |
What’s Wrong with Your Code Generated by Large Language Models? An Extensive Study |
Shihan Dou et.al. |
2407.06153 |
null |
2024-07-09 |
Using Grammar Masking to Ensure Syntactic Validity in LLM-based Modeling Tasks |
Lukas Netz et.al. |
2407.06146 |
null |
2024-07-08 |
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation |
Ethan Chern et.al. |
2407.06135 |
link |
2024-07-08 |
Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization |
Hannah K. Bako et.al. |
2407.06129 |
link |
2024-07-08 |
Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities |
Avinash Anand et.al. |
2407.06125 |
null |
2024-07-08 |
Enhancing Language Model Rationality with Bi-Directional Deliberation Reasoning |
Yadong Zhang et.al. |
2407.06112 |
null |
2024-07-08 |
Artificial Intuition: Efficient Classification of Scientific Abstracts |
Harsh Sakhrani et.al. |
2407.06093 |
null |
2024-07-08 |
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models |
Jinliang Lu et.al. |
2407.06089 |
null |
2024-07-08 |
From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty |
Maor Ivgi et.al. |
2407.06071 |
link |
2024-07-08 |
Variational Best-of-N Alignment |
Afra Amini et.al. |
2407.06057 |
null |
2024-07-08 |
MST5 – Multilingual Question Answering over Knowledge Graphs |
Nikit Srivastava et.al. |
2407.06041 |
link |
2024-07-08 |
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System |
Miao Zheng et.al. |
2407.06027 |
null |
2024-07-08 |
iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement |
Aoyu Pang et.al. |
2407.06025 |
link |
2024-07-05 |
Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs |
Rudolf Laine et.al. |
2407.04694 |
link |
2024-07-05 |
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models |
Yuzhe Gu et.al. |
2407.04693 |
link |
2024-07-05 |
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge |
Yuanze Lin et.al. |
2407.04681 |
null |
2024-07-05 |
Lost in Translation: The Algorithmic Gap Between LMs and the Brain |
Tommaso Tosato et.al. |
2407.04680 |
null |
2024-07-05 |
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition |
Ye Bai et.al. |
2407.04675 |
null |
2024-07-05 |
Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement |
Yongji Wu et.al. |
2407.04656 |
null |
2024-07-05 |
Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models |
Bolaji Yusuf et.al. |
2407.04641 |
null |
2024-07-05 |
Entity Decomposition with Filtering: A Zero-Shot Clinical Named Entity Recognition Framework |
Reza Averly et.al. |
2407.04629 |
null |
2024-07-05 |
On scalable oversight with weak LLMs judging strong LLMs |
Zachary Kenton et.al. |
2407.04622 |
null |
2024-07-05 |
CountGD: Multi-Modal Open-World Counting |
Niki Amini-Naieni et.al. |
2407.04619 |
null |
2024-07-05 |
ARM: Efficient Guided Decoding with Autoregressive Reward Models |
Sergey Troshin et.al. |
2407.04615 |
null |
2024-07-05 |
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation |
Yuhan Zhu et.al. |
2407.04603 |
null |
2024-07-05 |
Written Term Detection Improves Spoken Term Detection |
Bolaji Yusuf et.al. |
2407.04601 |
link |
2024-07-05 |
Testing learning hypotheses using neural networks by manipulating learning data |
Cara Su-Yi Leong et.al. |
2407.04593 |
null |
2024-07-05 |
Leveraging Large Language Models for Integrated Satellite-Aerial-Terrestrial Networks: Recent Advances and Future Directions |
Shumaila Javaid et.al. |
2407.04581 |
null |
2024-07-05 |
VRSD: Rethinking Similarity and Diversity for Retrieval in Large Language Models |
Hang Gao et.al. |
2407.04573 |
null |
2024-07-05 |
Not (yet) the whole story: Evaluating Visual Storytelling Requires More than Measuring Coherence, Grounding, and Repetition |
Aditya K Surikuchi et.al. |
2407.04559 |
null |
2024-07-05 |
Spontaneous Reward Hacking in Iterative Self-Refinement |
Jane Pan et.al. |
2407.04549 |
null |
2024-07-05 |
PoPreRo: A New Dataset for Popularity Prediction of Romanian Reddit Posts |
Ana-Cristina Rogoz et.al. |
2407.04541 |
link |
2024-07-05 |
GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning |
Aleksander Ficek et.al. |
2407.04528 |
null |
2024-07-03 |
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages |
Max Zuo et.al. |
2407.03321 |
link |
2024-07-03 |
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output |
Pan Zhang et.al. |
2407.03320 |
link |
2024-07-03 |
BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations |
Zhantao Yang et.al. |
2407.03314 |
null |
2024-07-03 |
Universal Length Generalization with Turing Programs |
Kaiying Hou et.al. |
2407.03310 |
null |
2024-07-03 |
Large Language Models for JSON Schema Discovery |
Michael J. Mior et.al. |
2407.03286 |
null |
2024-07-03 |
LLM Internal States Reveal Hallucination Risk Faced With a Query |
Ziwei Ji et.al. |
2407.03282 |
null |
2024-07-03 |
STF: Sentence Transformer Fine-Tuning For Topic Categorization With Limited Data |
Kheir Eddine Daouadi et.al. |
2407.03253 |
null |
2024-07-03 |
Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning |
Zhili Shen et.al. |
2407.03227 |
null |
2024-07-03 |
How Does Quantization Affect Multilingual LLMs? |
Kelly Marchisio et.al. |
2407.03211 |
null |
2024-07-03 |
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts |
Ruida Wang et.al. |
2407.03203 |
link |
2024-07-03 |
Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models |
Haritz Puerto et.al. |
2407.03181 |
link |
2024-07-03 |
Investigating Decoder-only Large Language Models for Speech-to-text Translation |
Chao-Wei Huang et.al. |
2407.03169 |
null |
2024-07-03 |
SOS! Soft Prompt Attack Against Open-Source Large Language Models |
Ziqing Yang et.al. |
2407.03160 |
null |
2024-07-03 |
Let the Code LLM Edit Itself When You Edit the Code |
Zhenyu He et.al. |
2407.03157 |
null |
2024-07-03 |
Reinforcement Learning for Sequence Design Leveraging Protein Language Models |
Jithendaraa Subramanian et.al. |
2407.03154 |
null |
2024-07-03 |
Enhancing Translation Accuracy of Large Language Models through Continual Pre-Training on Parallel Data |
Minato Kondo et.al. |
2407.03145 |
null |
2024-07-03 |
Social Bias Evaluation for Large Language Models Requires Prompt Variations |
Rem Hida et.al. |
2407.03129 |
link |
2024-07-03 |
KeyVideoLLM: Towards Large-scale Video Keyframe Selection |
Hao Liang et.al. |
2407.03104 |
null |
2024-07-03 |
Cactus: Towards Psychological Counseling Conversations using Cognitive Behavioral Theory |
Suyeon Lee et.al. |
2407.03103 |
link |
2024-07-03 |
ScreenTK: Seamless Detection of Time-Killing Moments Using Continuous Mobile Screen Text Monitoring |
Le Fang et.al. |
2407.03063 |
null |
2024-07-02 |
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention |
Huiqiang Jiang et.al. |
2407.02490 |
link |
2024-07-02 |
Neurocache: Efficient Vector Retrieval for Long-range Language Modeling |
Ali Safaya et.al. |
2407.02486 |
link |
2024-07-02 |
RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs |
Yue Yu et.al. |
2407.02485 |
null |
2024-07-02 |
MMedAgent: Learning to Use Medical Tools with Multi-modal Agent |
Binxu Li et.al. |
2407.02483 |
null |
2024-07-02 |
Understanding Alignment in Multimodal LLMs: A Comprehensive Study |
Elmira Amirloo et.al. |
2407.02477 |
null |
2024-07-02 |
Open Scene Graphs for Open World Object-Goal Navigation |
Joel Loo et.al. |
2407.02473 |
null |
2024-07-02 |
ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions |
Chan Young Park et.al. |
2407.02472 |
link |
2024-07-02 |
Reliable Confidence Intervals for Information Retrieval Evaluation Using Generative A.I. |
Harrie Oosterhuis et.al. |
2407.02464 |
null |
2024-07-02 |
Ensemble of pre-trained language models and data augmentation for hate speech detection from Arabic tweets |
Kheir Eddine Daouadi et.al. |
2407.02448 |
null |
2024-07-03 |
Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs |
Jinmin Li et.al. |
2407.02411 |
null |
2024-07-02 |
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models |
Song Wang et.al. |
2407.02408 |
null |
2024-07-02 |
Assessing the Code Clone Detection Capability of Large Language Models |
Zixian Zhang et.al. |
2407.02402 |
null |
2024-07-02 |
Learning to Refine with Fine-Grained Natural Language Feedback |
Manya Wadhwa et.al. |
2407.02397 |
link |
2024-07-02 |
Is Your AI-Generated Code Really Secure? Evaluating Large Language Models on Secure Code Generation with CodeSecEval |
Jiexin Wang et.al. |
2407.02395 |
null |
2024-07-02 |
TokenPacker: Efficient Visual Projector for Multimodal LLM |
Wentong Li et.al. |
2407.02392 |
link |
2024-07-02 |
Talking to Machines: do you read me? |
Lina M. Rojas-Barahona et.al. |
2407.02354 |
null |
2024-07-02 |
Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification |
Pritish Sahu et.al. |
2407.02352 |
null |
2024-07-02 |
Generative Large Language Models in Automated Fact-Checking: A Survey |
Ivan Vykopal et.al. |
2407.02351 |
null |
2024-07-02 |
Conceptual Codebook Learning for Vision-Language Models |
Yi Zhang et.al. |
2407.02350 |
null |
2024-07-02 |
MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space |
Yihong Tang et.al. |
2407.02345 |
null |
2024-06-28 |
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs |
Sukmin Yun et.al. |
2406.20098 |
link |
2024-06-28 |
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy |
Xiang Li et.al. |
2406.20095 |
link |
2024-06-28 |
Scaling Synthetic Data Creation with 1,000,000,000 Personas |
Xin Chan et.al. |
2406.20094 |
link |
2024-06-28 |
LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression |
Jieneng Chen et.al. |
2406.20092 |
link |
2024-06-28 |
ProgressGym: Alignment with a Millennium of Moral Progress |
Tianyi Qiu et.al. |
2406.20087 |
null |
2024-06-28 |
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language |
Yicheng Chen et.al. |
2406.20085 |
null |
2024-06-28 |
Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification |
Anisha Gunjal et.al. |
2406.20079 |
link |
2024-06-28 |
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model |
Yuxuan Zhang et.al. |
2406.20076 |
link |
2024-06-28 |
To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models |
Bastien Liétard et.al. |
2406.20054 |
null |
2024-06-28 |
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation |
Danny Halawi et.al. |
2406.20053 |
null |
2024-07-01 |
BMW Agents – A Framework For Task Automation Through Multi-Agent Collaboration |
Noel Crawford et.al. |
2406.20041 |
null |
2024-06-28 |
BioMNER: A Dataset for Biomedical Method Entity Recognition |
Chen Tang et.al. |
2406.20038 |
null |
2024-06-28 |
LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models |
Renzhi Wang et.al. |
2406.20030 |
null |
2024-06-28 |
ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models |
Yuxiang Zhang et.al. |
2406.20015 |
link |
2024-06-28 |
The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models |
Xinyi Chen et.al. |
2406.19999 |
link |
2024-06-28 |
Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model |
Habib Hajimolahoseini et.al. |
2406.19995 |
null |
2024-06-28 |
ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting |
Rui Pan et.al. |
2406.19976 |
null |
2024-06-28 |
STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical |
Guohao Sun et.al. |
2406.19973 |
null |
2024-06-28 |
Into the Unknown: Generating Geospatial Descriptions for New Environments |
Tzuf Paz-Argaman et.al. |
2406.19967 |
null |
2024-06-28 |
Simulating Financial Market via Large Language Model based Agents |
Shen Gao et.al. |
2406.19966 |
null |
2024-06-27 |
ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos |
Jr-Jen Chen et.al. |
2406.19392 |
link |
2024-06-27 |
The Remarkable Robustness of LLMs: Stages of Inference? |
Vedang Lad et.al. |
2406.19384 |
link |
2024-06-27 |
The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in the Era of Large Language Models |
Xiliang Zhu et.al. |
2406.19358 |
null |
2024-06-27 |
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions |
Nigel Fernandez et.al. |
2406.19356 |
null |
2024-06-27 |
Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs? |
Peter Hase et.al. |
2406.19354 |
null |
2024-06-27 |
IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language |
Lucky Susanto et.al. |
2406.19349 |
null |
2024-06-27 |
Jump Starting Bandits with LLM-Generated Prior Knowledge |
Parand A. Alamdari et.al. |
2406.19317 |
null |
2024-06-27 |
MCNC: Manifold Constrained Network Compression |
Chayne Thrash et.al. |
2406.19301 |
null |
2024-06-27 |
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data |
Zheyang Xiong et.al. |
2406.19292 |
null |
2024-06-27 |
PhysioLLM: Supporting Personalized Health Insights with Wearables and Large Language Models |
Cathy Mengying Fang et.al. |
2406.19283 |
null |
2024-06-27 |
HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale |
Junying Chen et.al. |
2406.19280 |
link |
2024-06-27 |
VERISCORE: Evaluating the factuality of verifiable claims in long-form text generation |
Yixiao Song et.al. |
2406.19276 |
link |
2024-06-27 |
AutoPureData: Automated Filtering of Web Data for LLM Fine-tuning |
Praneeth Vadlapati et.al. |
2406.19271 |
link |
2024-06-27 |
Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding |
Yue Fan et.al. |
2406.19263 |
link |
2024-06-27 |
Enhancing Video-Language Representations with Structural Spatio-Temporal Alignment |
Hao Fei et.al. |
2406.19255 |
null |
2024-06-27 |
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation |
Jia Fu et.al. |
2406.19251 |
null |
2024-06-27 |
Revealing Fine-Grained Values and Opinions in Large Language Models |
Dustin Wright et.al. |
2406.19238 |
link |
2024-06-28 |
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts |
Shubhankar Singh et.al. |
2406.19237 |
null |
2024-06-27 |
Seeing Is Believing: Black-Box Membership Inference Attacks Against Retrieval Augmented Generation |
Yuying Li et.al. |
2406.19234 |
null |
2024-06-28 |
RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs |
Ekaterina Taktasheva et.al. |
2406.19232 |
link |
2024-06-26 |
Towards Compositionality in Concept Learning |
Adam Stein et.al. |
2406.18534 |
link |
2024-06-26 |
Symbolic Learning Enables Self-Evolving Agents |
Wangchunshu Zhou et.al. |
2406.18532 |
link |
2024-06-26 |
PrExMe! Large Scale Prompt Exploration of Open Source LLMs for Machine Translation and Summarization Evaluation |
Christoph Leiter et.al. |
2406.18528 |
link |
2024-06-26 |
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs |
Zirui Wang et.al. |
2406.18521 |
link |
2024-06-26 |
“Is ChatGPT a Better Explainer than My Professor?”: Evaluating the Explanation Capabilities of LLMs in Conversation Compared to a Human Baseline |
Grace Li et.al. |
2406.18512 |
null |
2024-06-26 |
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models |
Liwei Jiang et.al. |
2406.18510 |
null |
2024-06-26 |
Mental Modeling of Reinforcement Learning Agents by Language Models |
Wenhao Lu et.al. |
2406.18505 |
null |
2024-06-26 |
Is In-Context Learning a Type of Gradient-Based Learning? Evidence from the Inverse Frequency Effect in Structural Priming |
Zhenghao Zhou et.al. |
2406.18501 |
null |
2024-06-26 |
Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation |
Ahmed Njifenjou et.al. |
2406.18460 |
null |
2024-06-26 |
Cascading Large Language Models for Salient Event Graph Generation |
Xingwei Tan et.al. |
2406.18449 |
link |
2024-06-26 |
New intelligent empowerment for digital transformation |
Peng Yifeng et.al. |
2406.18440 |
null |
2024-06-26 |
IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons |
Dan Shi et.al. |
2406.18406 |
null |
2024-06-26 |
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers |
Yibo Jiang et.al. |
2406.18400 |
null |
2024-06-26 |
Adversarial Search Engine Optimization for Large Language Models |
Fredrik Nestaas et.al. |
2406.18382 |
null |
2024-06-26 |
MALSIGHT: Exploring Malicious Source Code and Benign Pseudocode for Iterative Binary Malware Summarization |
Haolang Lu et.al. |
2406.18379 |
null |
2024-06-26 |
Themis: Towards Flexible and Interpretable NLG Evaluation |
Xinyu Hu et.al. |
2406.18365 |
link |
2024-06-26 |
AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations |
Adam Dahlgren Lindström et.al. |
2406.18346 |
null |
2024-06-26 |
PDFA Distillation via String Probability Queries |
Robert Baumgartner et.al. |
2406.18328 |
link |
2024-06-26 |
PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models |
Huixuan Zhang et.al. |
2406.18326 |
null |
2024-06-26 |
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data |
Meng Fang et.al. |
2406.18321 |
null |
2024-06-25 |
MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning |
Xiangyu Zhao et.al. |
2406.17770 |
link |
2024-06-25 |
EXTRACT: Efficient Policy Learning by Extracting Transferrable Robot Skills from Offline Data |
Jesse Zhang et.al. |
2406.17768 |
null |
2024-06-25 |
BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning |
Ercong Nie et.al. |
2406.17764 |
null |
2024-06-25 |
CaLMQA: Exploring culturally specific long-form question answering across 23 languages |
Shane Arora et.al. |
2406.17761 |
link |
2024-06-25 |
Accelerating Clinical Evidence Synthesis with Large Language Models |
Zifeng Wang et.al. |
2406.17755 |
null |
2024-06-25 |
Measuring and Benchmarking Large Language Models’ Capabilities to Generate Persuasive Language |
Amalie Brogaard Pauli et.al. |
2406.17753 |
null |
2024-06-25 |
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon |
USVSN Sai Prashanth et.al. |
2406.17746 |
link |
2024-06-25 |
Point-SAM: Promptable 3D Segmentation Model for Point Clouds |
Yuchen Zhou et.al. |
2406.17741 |
link |
2024-06-25 |
Find Parent then Label Children: A Two-stage Taxonomy Completion Method with Pre-trained Language Model |
Fei Xia et.al. |
2406.17739 |
null |
2024-06-25 |
LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users |
Elinor Poole-Dayan et.al. |
2406.17737 |
null |
2024-06-25 |
FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model |
Feijie Wu et.al. |
2406.17706 |
link |
2024-06-25 |
From Distributional to Overton Pluralism: Investigating Large Language Model Alignment |
Thom Lake et.al. |
2406.17692 |
link |
2024-06-25 |
VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation |
Kun Qian et.al. |
2406.17681 |
link |
2024-06-25 |
Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models |
Yuan Li et.al. |
2406.17675 |
null |
2024-06-25 |
LaTable: Towards Large Tabular Models |
Boris van Breugel et.al. |
2406.17673 |
null |
2024-06-25 |
LLM-ARC: Enhancing LLMs with an Automated Reasoning Critic |
Aditya Kalyanpur et.al. |
2406.17663 |
null |
2024-06-25 |
Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients |
Aashiq Muhamed et.al. |
2406.17660 |
link |
2024-06-25 |
DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning |
Xiaohan Zhang et.al. |
2406.17659 |
null |
2024-06-25 |
Leveraging Large Language Models for Software Model Completion: Results from Industrial and Public Datasets |
Christof Tinnes et.al. |
2406.17651 |
null |
2024-06-25 |
Variationist: Exploring Multifaceted Variation and Bias in Written Language Data |
Alan Ramponi et.al. |
2406.17647 |
link |
2024-06-24 |
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs |
Shengbang Tong et.al. |
2406.16860 |
link |
2024-06-24 |
EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees |
Yuhui Li et.al. |
2406.16858 |
link |
2024-06-24 |
Long Context Transfer from Language to Vision |
Peiyuan Zhang et.al. |
2406.16852 |
link |
2024-06-24 |
Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts |
Aditya Sharma et.al. |
2406.16851 |
null |
2024-06-24 |
RaTEScore: A Metric for Radiology Report Generation |
Weike Zhao et.al. |
2406.16845 |
null |
2024-06-24 |
From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models |
Sean Welleck et.al. |
2406.16838 |
null |
2024-06-24 |
USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations |
Mounika Marreddy et.al. |
2406.16833 |
null |
2024-06-24 |
Understanding and Mitigating Tokenization Bias in Language Models |
Buu Phan et.al. |
2406.16829 |
null |
2024-06-24 |
Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track |
Ronak Pradeep et.al. |
2406.16828 |
link |
2024-06-24 |
GPT-4V Explorations: Mining Autonomous Driving |
Zixuan Li et.al. |
2406.16817 |
null |
2024-06-24 |
RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale |
Beck LaBash et.al. |
2406.16801 |
link |
2024-06-24 |
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs |
Ashwinee Panda et.al. |
2406.16797 |
link |
2024-06-24 |
Adam-mini: Use Fewer Learning Rates To Gain More |
Yushun Zhang et.al. |
2406.16793 |
link |
2024-06-24 |
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models |
Rishabh Maheshwary et.al. |
2406.16783 |
null |
2024-06-24 |
It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension |
Sagi Shaier et.al. |
2406.16779 |
null |
2024-06-24 |
Finding Transformer Circuits with Edge Pruning |
Adithya Bhaskar et.al. |
2406.16778 |
link |
2024-06-24 |
Blending LLMs into Cascaded Speech Translation: KIT’s Offline Speech Translation System for IWSLT 2024 |
Sai Koneru et.al. |
2406.16777 |
null |
2024-06-24 |
WARP: On the Benefits of Weight Averaged Rewarded Policies |
Alexandre Ramé et.al. |
2406.16768 |
null |
2024-06-24 |
The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories |
Xi Yu Huang et.al. |
2406.16767 |
link |
2024-06-24 |
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters |
Euiin Yi et.al. |
2406.16758 |
link |
2024-06-21 |
GenoTEX: A Benchmark for Evaluating LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians |
Haoyang Liu et.al. |
2406.15341 |
link |
2024-06-21 |
Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance |
Haoling Li et.al. |
2406.15330 |
null |
2024-06-21 |
Bug In the Code Stack: Can LLMs Find Bugs in Large Python Code Stacks |
Hokyung Lee et.al. |
2406.15325 |
link |
2024-06-21 |
Cognitive Map for Language Models: Optimal Planning via Verbally Representing the World Model |
Doyoung Kim et.al. |
2406.15275 |
null |
2024-06-21 |
Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics |
Weijia Zhang et.al. |
2406.15264 |
null |
2024-06-21 |
Unsupervised Morphological Tree Tokenizer |
Qingyang Zhu et.al. |
2406.15245 |
null |
2024-06-21 |
Large Batch Analysis for Adagrad Under Anisotropic Smoothness |
Yuxing Liu et.al. |
2406.15244 |
null |
2024-06-21 |
Detecting Synthetic Lyrics with Few-Shot Inference |
Yanis Labrak et.al. |
2406.15231 |
null |
2024-06-21 |
A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation |
Irune Zubiaga et.al. |
2406.15227 |
null |
2024-06-21 |
Unsupervised Extraction of Dialogue Policies from Conversations |
Makesh Narsimhan Sreedhar et.al. |
2406.15214 |
null |
2024-06-21 |
Prompting Whisper for QA-driven Zero-shot End-to-end Spoken Language Understanding |
Mohan Li et.al. |
2406.15209 |
null |
2024-06-21 |
Exploring the Efficacy of Robotic Assistants with ChatGPT and Claude in Enhancing ADHD Therapy: Innovating Treatment Paradigms |
Santiago Berrezueta-Guzman et.al. |
2406.15198 |
null |
2024-06-21 |
UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-world Document Analysis |
Yulong Hui et.al. |
2406.15187 |
link |
2024-06-21 |
Hybrid Alignment Training for Large Language Models |
Chenglong Wang et.al. |
2406.15178 |
link |
2024-06-21 |
EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot |
Hao Fei et.al. |
2406.15177 |
link |
2024-06-21 |
Enhancing Idiomatic Representation in Multiple Languages via an Adaptive Contrastive Triplet Loss |
Wei He et.al. |
2406.15175 |
null |
2024-06-21 |
Évaluation des capacités de réponse de larges modèles de langage (LLM) pour des questions d’historiens |
Mathieu Chartier et.al. |
2406.15173 |
null |
2024-06-21 |
Assessing Good, Bad and Ugly Arguments Generated by ChatGPT: a New Dataset, its Methodology and Associated Tasks |
Victor Hugo Nascimento Rocha et.al. |
2406.15130 |
link |
2024-06-21 |
Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network |
Badr AlKhamissi et.al. |
2406.15109 |
link |
2024-06-21 |
PARIKSHA: A Large-Scale Investigation of Human-LLM Evaluator Agreement on Multilingual and Multi-Cultural Data |
Ishaan Watts et.al. |
2406.15053 |
null |
2024-06-20 |
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch |
Hasan Abed Al Kader Hammoud et.al. |
2406.14563 |
null |
2024-06-20 |
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities |
Sachit Menon et.al. |
2406.14562 |
null |
2024-06-20 |
How to Compute the Probability of a Word |
Tiago Pimentel et.al. |
2406.14561 |
null |
2024-06-21 |
Asynchronous Large Language Model Enhanced Planner for Autonomous Driving |
Yuan Chen et.al. |
2406.14556 |
null |
2024-06-20 |
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models |
Shilong Li et.al. |
2406.14550 |
null |
2024-06-20 |
Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Large Language Models |
Sunny Duan et.al. |
2406.14549 |
null |
2024-06-20 |
Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data |
Johannes Treutlein et.al. |
2406.14546 |
link |
2024-06-20 |
Unmasking Database Vulnerabilities: Zero-Knowledge Schema Inference Attacks in Text-to-SQL Systems |
Đorđe Klisura et.al. |
2406.14545 |
null |
2024-06-20 |
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs |
Yuxuan Qiao et.al. |
2406.14544 |
link |
2024-06-20 |
Are LLMs Naturally Good at Synthetic Tabular Data Generation? |
Shengzhe Xu et.al. |
2406.14541 |
link |
2024-06-20 |
PostMark: A Robust Blackbox Watermark for Large Language Models |
Yapei Chang et.al. |
2406.14517 |
link |
2024-06-20 |
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding |
Xinyu Fang et.al. |
2406.14515 |
link |
2024-06-20 |
Evidence of a log scaling law for political persuasion with large language models |
Kobi Hackenburg et.al. |
2406.14508 |
link |
2024-06-20 |
Overview of the CAIL 2023 Argument Mining Track |
Jingcong Liang et.al. |
2406.14503 |
null |
2024-06-20 |
Improving Expert Radiology Report Summarization by Prompting Large Language Models with a Layperson Summary |
Xingmeng Zhao et.al. |
2406.14500 |
null |
2024-06-20 |
LLaSA: Large Multimodal Agent for Human Activity Analysis Through Wearable Sensors |
Sheikh Asif Imran et.al. |
2406.14498 |
link |
2024-06-20 |
CodeRAG-Bench: Can Retrieval Augment Code Generation? |
Zora Zhiruo Wang et.al. |
2406.14497 |
link |
2024-06-20 |
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification |
Gregor Geigle et.al. |
2406.14496 |
link |
2024-06-20 |
Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? |
Gregor Geigle et.al. |
2406.14492 |
null |
2024-06-20 |
Instruction Pre-Training: Language Models are Supervised Multitask Learners |
Daixuan Cheng et.al. |
2406.14491 |
link |
2024-06-18 |
DrVideo: Document Retrieval Based Long Video Understanding |
Ziyu Ma et.al. |
2406.12846 |
null |
2024-06-18 |
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts |
Haoxiang Wang et.al. |
2406.12845 |
link |
2024-06-18 |
Synergizing Foundation Models and Federated Learning: A Survey |
Shenghui Li et.al. |
2406.12844 |
null |
2024-06-18 |
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation |
Ci-Siang Lin et.al. |
2406.12834 |
null |
2024-06-18 |
LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation |
Seyedarmin Azizi et.al. |
2406.12832 |
link |
2024-06-18 |
What Are the Odds? Language Models Are Capable of Probabilistic Reasoning |
Akshay Paruchuri et.al. |
2406.12830 |
null |
2024-06-18 |
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries |
Hitesh Wadhwa et.al. |
2406.12824 |
null |
2024-06-18 |
Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models? |
Pinzhen Chen et.al. |
2406.12822 |
null |
2024-06-18 |
Adversarial Attacks on Multimodal Agents |
Chen Henry Wu et.al. |
2406.12814 |
link |
2024-06-18 |
Can Large Language Models Always Solve Easy Problems if They Can Solve Harder Ones? |
Zhe Yang et.al. |
2406.12809 |
null |
2024-06-18 |
Identifying Performance-Sensitive Configurations in Software Systems through Code Analysis with LLM Agents |
Zehao Wang et.al. |
2406.12806 |
null |
2024-06-18 |
Supporting Human Raters with the Detection of Harmful Content using Large Language Models |
Kurt Thomas et.al. |
2406.12800 |
null |
2024-06-18 |
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools |
Team GLM et.al. |
2406.12793 |
link |
2024-06-18 |
In-Context Learning of Energy Functions |
Rylan Schaeffer et.al. |
2406.12785 |
null |
2024-06-18 |
UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions |
Xunzhi Wang et.al. |
2406.12784 |
link |
2024-06-18 |
Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries |
Eden Biran et.al. |
2406.12775 |
link |
2024-06-18 |
Towards Exact Gradient-based Training on Analog In-memory Computing |
Zhaoxian Wu et.al. |
2406.12774 |
null |
2024-06-18 |
GFM4MPM: Towards Geospatial Foundation Models for Mineral Prospectivity Mapping |
Angel Daruna et.al. |
2406.12756 |
null |
2024-06-18 |
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI |
Zhen Huang et.al. |
2406.12753 |
link |
2024-06-18 |
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning |
Bingchen Zhao et.al. |
2406.12742 |
link |
2024-06-17 |
LLaNA: Large Language and NeRF Assistant |
Andrea Amaduzzi et.al. |
2406.11840 |
null |
2024-06-17 |
mDPO: Conditional Preference Optimization for Multimodal Large Language Models |
Fei Wang et.al. |
2406.11839 |
null |
2024-06-17 |
MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs |
Ziyu Liu et.al. |
2406.11833 |
link |
2024-06-17 |
Unveiling Encoder-Free Vision-Language Models |
Haiwen Diao et.al. |
2406.11832 |
link |
2024-06-17 |
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models |
Bingqi Ma et.al. |
2406.11831 |
null |
2024-06-17 |
Language Modeling with Editable External Knowledge |
Belinda Z. Li et.al. |
2406.11830 |
link |
2024-06-17 |
WPO: Enhancing RLHF with Weighted Preference Optimization |
Wenxuan Zhou et.al. |
2406.11827 |
link |
2024-06-17 |
On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning |
Geewook Kim et.al. |
2406.11823 |
link |
2024-06-17 |
MegaScenes: Scene-Level View Synthesis at Scale |
Joseph Tung et.al. |
2406.11819 |
null |
2024-06-17 |
Embodied Instruction Following in Unknown Environments |
Zhenyu Wu et.al. |
2406.11818 |
null |
2024-06-17 |
Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level |
Jie Liu et.al. |
2406.11817 |
null |
2024-06-17 |
VideoLLM-online: Online Video Large Language Model for Streaming Video |
Joya Chen et.al. |
2406.11816 |
null |
2024-06-17 |
How Do Large Language Models Acquire Factual Knowledge During Pretraining? |
Hoyeon Chang et.al. |
2406.11813 |
null |
2024-06-17 |
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content |
Joao Monteiro et.al. |
2406.11811 |
null |
2024-06-17 |
Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations |
Rima Hazra et.al. |
2406.11801 |
link |
2024-06-17 |
DataComp-LM: In search of the next generation of training sets for language models |
Jeffrey Li et.al. |
2406.11794 |
null |
2024-06-17 |
CELL your Model: Contrastive Explanation Methods for Large Language Models |
Ronny Luss et.al. |
2406.11785 |
null |
2024-06-17 |
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs |
Swanand Ravindra Kadhe et.al. |
2406.11780 |
null |
2024-06-17 |
Improving Multi-Agent Debate with Sparse Communication Topology |
Yunxuan Li et.al. |
2406.11776 |
null |
2024-06-17 |
Task Me Anything |
Jieyu Zhang et.al. |
2406.11775 |
link |
2024-06-14 |
Quantifying Variance in Evaluation Benchmarks |
Lovish Madaan et.al. |
2406.10229 |
null |
2024-06-14 |
EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models |
Julian Straub et.al. |
2406.10224 |
null |
2024-06-14 |
Short Film Dataset (SFD): A Benchmark for Story-Level Video Understanding |
Ridouane Ghermi et.al. |
2406.10221 |
null |
2024-06-14 |
Semantic Membership Inference Attack against Large Language Models |
Hamid Mozaffari et.al. |
2406.10218 |
null |
2024-06-14 |
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs |
Rui Yang et.al. |
2406.10216 |
null |
2024-06-14 |
DevBench: A multimodal developmental benchmark for language learning |
Alvin Wei Ming Tan et.al. |
2406.10215 |
null |
2024-06-14 |
Be like a Goldfish, Don’t Memorize! Mitigating Memorization in Generative LLMs |
Abhimanyu Hans et.al. |
2406.10209 |
link |
2024-06-14 |
A Fundamental Trade-off in Aligned Language Models and its Relation to Sampling Adaptors |
Naaman Tan et.al. |
2406.10203 |
link |
2024-06-14 |
TRIP-PAL: Travel Planning with Guarantees by Combining Large Language Models and Automated Planners |
Tomas de la Rosa et.al. |
2406.10196 |
null |
2024-06-14 |
Detecting and Evaluating Medical Hallucinations in Large Vision Language Models |
Jiawei Chen et.al. |
2406.10185 |
null |
2024-06-14 |
Practical offloading for fine-tuning LLM on commodity GPU via learned subspace projectors |
Siyuan Chen et.al. |
2406.10181 |
null |
2024-06-14 |
Let the Poem Hit the Rhythm: Using a Byte-Based Transformer for Beat-Aligned Poetry Generation |
Mohamad Elzohbi et.al. |
2406.10174 |
link |
2024-06-14 |
IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce |
Wenxuan Ding et.al. |
2406.10173 |
link |
2024-06-14 |
Datasets for Multilingual Answer Sentence Selection |
Matteo Gabburo et.al. |
2406.10172 |
null |
2024-06-14 |
CarLLaVA: Vision language models for camera-only closed-loop driving |
Katrin Renz et.al. |
2406.10165 |
null |
2024-06-14 |
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models |
Carson Denison et.al. |
2406.10162 |
link |
2024-06-14 |
RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model |
Hantao Zhou et.al. |
2406.10157 |
null |
2024-06-14 |
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack |
Yuri Kuratov et.al. |
2406.10149 |
link |
2024-06-14 |
Evaluation of Large Language Models: STEM education and Gender Stereotypes |
Smilla Due et.al. |
2406.10133 |
null |
2024-06-14 |
The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models |
Yan Liu et.al. |
2406.10130 |
link |
2024-06-13 |
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding |
Muhammad Maaz et.al. |
2406.09418 |
link |
2024-06-13 |
Explore the Limits of Omni-modal Pretraining at Scale |
Yiyuan Zhang et.al. |
2406.09412 |
link |
2024-06-13 |
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities |
Roman Bachmann et.al. |
2406.09406 |
null |
2024-06-13 |
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models |
Yushi Hu et.al. |
2406.09403 |
null |
2024-06-13 |
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation |
Junke Wang et.al. |
2406.09399 |
link |
2024-06-13 |
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms |
Miaosen Zhang et.al. |
2406.09397 |
null |
2024-06-13 |
Too Many Frames, not all Useful: Efficient Strategies for Long-Form Video QA |
Jongwoo Park et.al. |
2406.09396 |
link |
2024-06-13 |
Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition |
Youngtaek Oh et.al. |
2406.09388 |
link |
2024-06-13 |
Towards Vision-Language Geo-Foundation Model: A Survey |
Yue Zhou et.al. |
2406.09385 |
link |
2024-06-13 |
Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models |
Lukas Thede et.al. |
2406.09384 |
null |
2024-06-13 |
Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs |
Zijia Zhao et.al. |
2406.09367 |
link |
2024-06-13 |
ElicitationGPT: Text Elicitation Mechanisms via Language Models |
Yifan Wu et.al. |
2406.09363 |
null |
2024-06-13 |
Enhancing Domain Adaptation through Prompt Gradient Alignment |
Hoang Phan et.al. |
2406.09353 |
null |
2024-06-13 |
Separations in the Representational Capabilities of Transformers and Recurrent Architectures |
Satwik Bhattamishra et.al. |
2406.09347 |
null |
2024-06-13 |
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding |
Suwon Shon et.al. |
2406.09345 |
null |
2024-06-13 |
ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models |
David Anugraha et.al. |
2406.09334 |
link |
2024-06-13 |
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space |
Tomer Ashuach et.al. |
2406.09325 |
null |
2024-06-13 |
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs |
Zhao Xu et.al. |
2406.09324 |
link |
2024-06-13 |
JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models |
Delong Ran et.al. |
2406.09321 |
link |
2024-06-13 |
Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases |
Meng Wang et.al. |
2406.09317 |
link |
2024-06-12 |
What If We Recaption Billions of Web Images with LLaMA-3? |
Xianhang Li et.al. |
2406.08478 |
null |
2024-06-12 |
Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens |
Ting-Ji Huang et.al. |
2406.08477 |
null |
2024-06-12 |
Real2Code: Reconstruct Articulated Objects via Code Generation |
Zhao Mandi et.al. |
2406.08474 |
null |
2024-06-12 |
PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences |
Daiwei Chen et.al. |
2406.08469 |
null |
2024-06-12 |
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing |
Zhangchen Xu et.al. |
2406.08464 |
link |
2024-06-12 |
AToM-Bot: Embodied Fulfillment of Unspoken Human Needs with Affective Theory of Mind |
Wei Ding et.al. |
2406.08455 |
null |
2024-06-12 |
OLMES: A Standard for Language Model Evaluations |
Yuling Gu et.al. |
2406.08446 |
null |
2024-06-12 |
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models |
Chun Yin et.al. |
2406.08445 |
null |
2024-06-12 |
TasTe: Teaching Large Language Models to Translate through Self-Reflection |
Yutong Wang et.al. |
2406.08434 |
link |
2024-06-12 |
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL |
Zijin Hong et.al. |
2406.08426 |
null |
2024-06-12 |
OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text |
Qingyun Li et.al. |
2406.08418 |
link |
2024-06-12 |
Discovering Preference Optimization Algorithms with and for Large Language Models |
Chris Lu et.al. |
2406.08414 |
link |
2024-06-12 |
Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference |
Christopher Wolters et.al. |
2406.08413 |
null |
2024-06-13 |
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos |
Xuehai He et.al. |
2406.08407 |
link |
2024-06-12 |
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models |
Chun-Yi Kuan et.al. |
2406.08402 |
link |
2024-06-12 |
cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers |
Anirudh Sundar et.al. |
2406.08398 |
null |
2024-06-12 |
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks |
Jiannan Wu et.al. |
2406.08394 |
link |
2024-06-12 |
Large Language Models Must Be Taught to Know What They Don’t Know |
Sanyam Kapoor et.al. |
2406.08391 |
link |
2024-06-12 |
Banal Deception Human-AI Ecosystems: A Study of People’s Perceptions of LLM-generated Deceptive Behaviour |
Xiao Zhan et.al. |
2406.08386 |
null |
2024-06-13 |
APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation |
Weizhao He et.al. |
2406.08372 |
null |
2024-06-11 |
A3VLM: Actionable Articulation-Aware Vision Language Model |
Siyuan Huang et.al. |
2406.07549 |
link |
2024-06-11 |
Image and Video Tokenization with Binary Spherical Quantization |
Yue Zhao et.al. |
2406.07548 |
link |
2024-06-11 |
Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena |
Aidar Myrzakhan et.al. |
2406.07545 |
link |
2024-06-11 |
QuickLLaMA: Query-aware Inference Acceleration for Large Language Models |
Jingyao Li et.al. |
2406.07528 |
link |
2024-06-11 |
Simple and Effective Masked Diffusion Language Models |
Subham Sekhar Sahoo et.al. |
2406.07524 |
link |
2024-06-11 |
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling |
Liliang Ren et.al. |
2406.07522 |
link |
2024-06-11 |
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement |
Yunzhen Feng et.al. |
2406.07515 |
null |
2024-06-11 |
THaLLE: Text Hyperlocally Augmented Large Language Extension – Technical Report |
KBTG Labs et.al. |
2406.07505 |
null |
2024-06-11 |
Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions |
Renjie Pi et.al. |
2406.07502 |
link |
2024-06-11 |
TextGrad: Automatic “Differentiation” via Text |
Mert Yuksekgonul et.al. |
2406.07496 |
link |
2024-06-11 |
CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization |
Frederic Kirstein et.al. |
2406.07494 |
null |
2024-06-11 |
Paraphrasing in Affirmative Terms Improves Negation Understanding |
MohammadHossein Rezaei et.al. |
2406.07492 |
null |
2024-06-11 |
PITCH: Productivity and Mental Well-being Coaching through Daily Conversational Interaction |
Adnan Abbas et.al. |
2406.07485 |
null |
2024-06-11 |
Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing |
Mao Li et.al. |
2406.07483 |
null |
2024-06-11 |
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs |
Zesen Cheng et.al. |
2406.07476 |
link |
2024-06-11 |
Anomaly Detection on Unstable Logs with GPT Models |
Fatemeh Hadadi et.al. |
2406.07467 |
null |
2024-06-11 |
Estimating the Hallucination Rate of Generative AI |
Andrew Jesson et.al. |
2406.07457 |
null |
2024-06-11 |
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis |
Qining Zhang et.al. |
2406.07455 |
null |
2024-06-11 |
On the Robustness of Document-Level Relation Extraction Models to Entity Name Variations |
Shiao Meng et.al. |
2406.07444 |
link |
2024-06-11 |
McEval: Massively Multilingual Code Evaluation |
Linzheng Chai et.al. |
2406.07436 |
null |
2024-06-10 |
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation |
Peize Sun et.al. |
2406.06525 |
link |
2024-06-10 |
UMBRELA: UMbrela is the (Open-Source Reproduction of the) Bing RELevance Assessor |
Shivani Upadhyay et.al. |
2406.06519 |
link |
2024-06-10 |
Merlin: A Vision Language Foundation Model for 3D Computed Tomography |
Louis Blankemeier et.al. |
2406.06512 |
null |
2024-06-10 |
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative |
Asmar Nadeem et.al. |
2406.06499 |
null |
2024-06-10 |
Direct Preference Optimization for Suppressing Hallucinated Prior Exams in Radiology Report Generation |
Oishi Banerjee et.al. |
2406.06496 |
null |
2024-06-10 |
Can Language Models Serve as Text-Based World Simulators? |
Ruoyao Wang et.al. |
2406.06485 |
null |
2024-06-10 |
Parallelizing Linear Transformers with the Delta Rule over Sequence Length |
Songlin Yang et.al. |
2406.06484 |
link |
2024-06-10 |
Towards a Personal Health Large Language Model |
Justin Cosentino et.al. |
2406.06474 |
null |
2024-06-10 |
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction |
Zhen Xing et.al. |
2406.06465 |
null |
2024-06-10 |
Transforming Wearable Data into Health Insights using Large Language Model Agents |
Mike A. Merrill et.al. |
2406.06464 |
null |
2024-06-10 |
VCR: Visual Caption Restoration |
Tianyu Zhang et.al. |
2406.06462 |
link |
2024-06-11 |
Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies |
Junlin Wang et.al. |
2406.06461 |
null |
2024-06-10 |
Evaluating the Retrieval Component in LLM-Based Question Answering Systems |
Ashkan Alinejad et.al. |
2406.06458 |
null |
2024-06-10 |
A Large Language Model Pipeline for Breast Cancer Oncology |
Tristen Pool et.al. |
2406.06455 |
null |
2024-06-10 |
Insights from Social Shaping Theory: The Appropriation of Large Language Models in an Undergraduate Programming Course |
Aadarsh Padiyath et.al. |
2406.06451 |
null |
2024-06-10 |
LLM Dataset Inference: Did you train on my dataset? |
Pratyush Maini et.al. |
2406.06443 |
link |
2024-06-10 |
Interpretability of Language Models via Task Spaces |
Lucas Weber et.al. |
2406.06441 |
null |
2024-06-10 |
Language Models are Alignable Decision-Makers: Dataset and Application to the Medical Triage Domain |
Brian Hu et.al. |
2406.06435 |
link |
2024-06-10 |
Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking |
Gabriel Rioux et.al. |
2406.06425 |
null |
2024-06-10 |
An Empirical Design Justice Approach to Identifying Ethical Considerations in the Intersection of Large Language Models and Social Robotics |
Alva Markelius et.al. |
2406.06400 |
null |
2024-06-07 |
3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs |
Jianing Yang et.al. |
2406.05132 |
link |
2024-06-07 |
An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models |
Xiongtao Zhou et.al. |
2406.05130 |
null |
2024-06-07 |
Towards Semantic Equivalence of Tokenization in Multimodal LLM |
Shengqiong Wu et.al. |
2406.05127 |
null |
2024-06-07 |
Large Generative Graph Models |
Yu Wang et.al. |
2406.05109 |
null |
2024-06-07 |
LINX: A Language Driven Generative System for Goal-Oriented Automated Data Exploration |
Tavor Lipman et.al. |
2406.05107 |
null |
2024-06-07 |
Corpus Poisoning via Approximate Greedy Gradient Descent |
Jinyan Su et.al. |
2406.05087 |
link |
2024-06-07 |
Multi-Head RAG: Solving Multi-Aspect Problems with LLMs |
Maciej Besta et.al. |
2406.05085 |
link |
2024-06-07 |
SUMIE: A Synthetic Benchmark for Incremental Entity Summarization |
Eunjeong Hwang et.al. |
2406.05079 |
null |
2024-06-07 |
Are Large Language Models More Empathetic than Humans? |
Anuradha Welivita et.al. |
2406.05063 |
null |
2024-06-07 |
Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions |
Shi-Yu Tian et.al. |
2406.05055 |
null |
2024-06-07 |
Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation |
Nachiket Kotalwar et.al. |
2406.05053 |
null |
2024-06-07 |
Bootstrapping Referring Multi-Object Tracking |
Yani Zhang et.al. |
2406.05039 |
link |
2024-06-07 |
Scenarios and Approaches for Situated Natural Language Explanations |
Pengshuo Qiu et.al. |
2406.05035 |
null |
2024-06-07 |
CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search |
Fengran Mo et.al. |
2406.05013 |
link |
2024-06-07 |
Compositional Generalization with Grounded Language Models |
Sondre Wold et.al. |
2406.04989 |
link |
2024-06-07 |
Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences |
Patrick Haller et.al. |
2406.04988 |
null |
2024-06-07 |
MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter |
Jitai Hao et.al. |
2406.04984 |
link |
2024-06-07 |
CityCraft: A Real Crafter for 3D City Generation |
Jie Deng et.al. |
2406.04983 |
null |
2024-06-07 |
Quantifying Geospatial in the Common Crawl Corpus |
Ilya Ilyankou et.al. |
2406.04952 |
null |
2024-06-07 |
BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense |
Baktash Ansari et.al. |
2406.04947 |
link |
2024-06-06 |
Verbalized Machine Learning: Revisiting Machine Learning with Language Models |
Tim Z. Xiao et.al. |
2406.04344 |
null |
2024-06-06 |
Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image |
Stanislaw Szymanowicz et.al. |
2406.04343 |
null |
2024-06-06 |
Learning 1D Causal Visual Representation with De-focus Attention Networks |
Chenxin Tao et.al. |
2406.04342 |
link |
2024-06-06 |
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation |
Jiaming Liu et.al. |
2406.04339 |
null |
2024-06-06 |
Coherent Zero-Shot Visual Instruction Generation |
Quynh Phung et.al. |
2406.04337 |
null |
2024-06-06 |
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs |
Lingchen Meng et.al. |
2406.04334 |
null |
2024-06-06 |
PaCE: Parsimonious Concept Engineering for Large Language Models |
Jinqi Luo et.al. |
2406.04331 |
link |
2024-06-06 |
Parameter-Inverted Image Pyramid Networks |
Xizhou Zhu et.al. |
2406.04330 |
link |
2024-06-06 |
Simplified and Generalized Masked Diffusion for Discrete Data |
Jiaxin Shi et.al. |
2406.04329 |
null |
2024-06-06 |
Causal Estimation of Memorisation Profiles |
Pietro Lesci et.al. |
2406.04327 |
link |
2024-06-06 |
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions |
Lin Chen et.al. |
2406.04325 |
null |
2024-06-06 |
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step |
Zhanhao Liang et.al. |
2406.04314 |
null |
2024-06-06 |
Improving Alignment and Robustness with Short Circuiting |
Andy Zou et.al. |
2406.04313 |
link |
2024-06-06 |
Semantically Diverse Language Generation for Uncertainty Estimation in Language Models |
Lukas Aichberger et.al. |
2406.04306 |
link |
2024-06-06 |
Quixer: A Quantum Transformer Model |
Nikhil Khatri et.al. |
2406.04305 |
null |
2024-06-06 |
Text-to-Drive: Diverse Driving Behavior Synthesis via Large Language Models |
Phat Nguyen et.al. |
2406.04300 |
null |
2024-06-06 |
VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval |
Junjie Zhou et.al. |
2406.04292 |
link |
2024-06-06 |
Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation |
Adam Fisch et.al. |
2406.04291 |
null |
2024-06-07 |
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages |
Nadav Borenstein et.al. |
2406.04289 |
null |
2024-06-06 |
Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People |
Dun-Ming Huang et.al. |
2406.04278 |
link |
2024-06-05 |
Wings: Learning Multimodal LLMs without Text-only Forgetting |
Yi-Kai Zhang et.al. |
2406.03496 |
null |
2024-06-06 |
Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training |
Ao Sun et.al. |
2406.03488 |
link |
2024-06-05 |
Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends |
Sanjana Ramprasad et.al. |
2406.03487 |
null |
2024-06-05 |
BIPED: Pedagogically Informed Tutoring System for ESL Education |
Soonwoo Kwon et.al. |
2406.03486 |
null |
2024-06-05 |
Does your data spark joy? Performance gains from domain upsampling at the end of training |
Cody Blakeney et.al. |
2406.03476 |
null |
2024-06-05 |
AD-H: Autonomous Driving with Hierarchical Agents |
Zaibin Zhang et.al. |
2406.03474 |
null |
2024-06-05 |
What is the Best Way for ChatGPT to Translate Poetry? |
Shanshan Wang et.al. |
2406.03450 |
null |
2024-06-05 |
Pre-trained Large Language Models Use Fourier Features to Compute Addition |
Tianyi Zhou et.al. |
2406.03445 |
null |
2024-06-05 |
Are language models rational? The case of coherence norms and belief revision |
Thomas Hofweber et.al. |
2406.03442 |
null |
2024-06-05 |
Cycles of Thought: Measuring LLM Confidence through Stable Explanations |
Evan Becker et.al. |
2406.03441 |
null |
2024-06-05 |
Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis |
Moein Heidari et.al. |
2406.03430 |
link |
2024-06-05 |
Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach |
Saehyung Lee et.al. |
2406.03411 |
link |
2024-06-05 |
Automating Turkish Educational Quiz Generation Using Large Language Models |
Kamyar Zeinalipour et.al. |
2406.03397 |
link |
2024-06-05 |
Log Parsing with Self-Generated In-Context Learning and Self-Correction |
Yifan Wu et.al. |
2406.03376 |
null |
2024-06-05 |
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models |
David Ifeoluwa Adelani et.al. |
2406.03368 |
null |
2024-06-05 |
CLMASP: Coupling Large Language Models with Answer Set Programming for Robotic Task Planning |
Xinrui Lin et.al. |
2406.03367 |
null |
2024-06-05 |
LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback |
Timon Ziegenbein et.al. |
2406.03363 |
null |
2024-06-05 |
Save It for the “Hot” Day: An LLM-Empowered Visual Analytics System for Heat Risk Management |
Haobo Li et.al. |
2406.03317 |
null |
2024-06-05 |
The Good, the Bad, and the Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games |
Mikhail Mozikov et.al. |
2406.03299 |
null |
2024-06-05 |
SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms |
Xingrun Xing et.al. |
2406.03287 |
link |
2024-06-04 |
Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks |
Tianyu He et.al. |
2406.02550 |
link |
2024-06-04 |
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation |
Mohamed El Amine Boudjoghra et.al. |
2406.02548 |
link |
2024-06-04 |
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning |
Alex Jinpeng Wang et.al. |
2406.02547 |
link |
2024-06-04 |
To Believe or Not to Believe Your LLM |
Yasin Abbasi Yadkori et.al. |
2406.02543 |
null |
2024-06-04 |
Loki: Low-Rank Keys for Efficient Sparse Attention |
Prajwal Singhania et.al. |
2406.02542 |
null |
2024-06-04 |
Parrot: Multilingual Visual Instruction Tuning |
Hai-Long Sun et.al. |
2406.02539 |
null |
2024-06-04 |
TopViewRS: Vision-Language Models as Top-View Spatial Reasoners |
Chengzu Li et.al. |
2406.02537 |
link |
2024-06-04 |
Mitigate Position Bias in Large Language Models via Scaling a Single Dimension |
Yijiong Yu et.al. |
2406.02536 |
link |
2024-06-04 |
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices |
Ruslan Svirschevski et.al. |
2406.02532 |
link |
2024-06-04 |
Scalable MatMul-free Language Modeling |
Rui-Jie Zhu et.al. |
2406.02528 |
link |
2024-06-04 |
CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks |
Maciej Besta et.al. |
2406.02524 |
link |
2024-06-04 |
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots |
Soroush Nasiriany et.al. |
2406.02523 |
null |
2024-06-04 |
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework |
Shwai He et.al. |
2406.02500 |
link |
2024-06-04 |
Hiding Text in Large Language Models: Introducing Unconditional Token Forcing Confusion |
Jakub Hoscilowicz et.al. |
2406.02481 |
link |
2024-06-04 |
Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding |
Zhihan Zhang et.al. |
2406.02472 |
null |
2024-06-04 |
Meta-Designing Quantum Experiments with Language Models |
Sören Arlt et.al. |
2406.02470 |
null |
2024-06-04 |
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models |
Philip Anastassiou et.al. |
2406.02430 |
link |
2024-06-04 |
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion |
Ruiqi Li et.al. |
2406.02429 |
null |
2024-06-04 |
GrootVL: Tree Topology is All You Need in State Space Model |
Yicheng Xiao et.al. |
2406.02395 |
link |
2024-06-04 |
Multiple Choice Questions and Large Languages Models: A Case Study with Fictional Medical Data |
Maxime Griot et.al. |
2406.02394 |
link |
2024-05-31 |
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis |
Chaoyou Fu et.al. |
2405.21075 |
null |
2024-05-31 |
Code Pretraining Improves Entity Tracking Abilities of Language Models |
Najoung Kim et.al. |
2405.21068 |
null |
2024-05-31 |
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality |
Tri Dao et.al. |
2405.21060 |
link |
2024-05-31 |
RydbergGPT |
David Fitzek et.al. |
2405.21052 |
link |
2024-05-31 |
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling |
Jiatao Gu et.al. |
2405.21048 |
null |
2024-05-31 |
Grammar-Aligned Decoding |
Kanghee Park et.al. |
2405.21047 |
null |
2024-05-31 |
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF |
Tengyang Xie et.al. |
2405.21046 |
null |
2024-05-31 |
Direct Alignment of Language Models via Quality-Aware Self-Refinement |
Runsheng Yu et.al. |
2405.21040 |
null |
2024-05-31 |
Standards for Belief Representations in LLMs |
Daniel A. Herrmann et.al. |
2405.21030 |
null |
2024-05-31 |
LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models |
Elias Stengel-Eskin et.al. |
2405.21028 |
link |
2024-05-31 |
You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet |
Zhen Qin et.al. |
2405.21022 |
null |
2024-05-31 |
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models |
Xiaojun Jia et.al. |
2405.21018 |
link |
2024-06-03 |
StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond |
Pengyuan Lyu et.al. |
2405.21013 |
null |
2024-05-31 |
Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models |
Yi Yang et.al. |
2405.20991 |
link |
2024-05-31 |
DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models |
Linli Yao et.al. |
2405.20985 |
link |
2024-05-31 |
Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training |
Feiteng Fang et.al. |
2405.20978 |
link |
2024-05-31 |
SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales |
Tianyang Xu et.al. |
2405.20974 |
link |
2024-05-31 |
LCQ: Low-Rank Codebook based Quantization for Large Language Models |
Wen-Pu Cai et.al. |
2405.20973 |
null |
2024-06-03 |
Large Language Models are Zero-Shot Next Location Predictors |
Ciro Beneduce et.al. |
2405.20962 |
link |
2024-06-03 |
A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs’ Humour Alignment with Comedians |
Piotr Wojciech Mirowski et.al. |
2405.20956 |
null |
2024-05-30 |
MotionLLM: Understanding Human Behaviors from Human Motions and Videos |
Ling-Hao Chen et.al. |
2405.20340 |
null |
2024-05-30 |
Visual Perception by Large Language Model’s Weights |
Feipeng Ma et.al. |
2405.20339 |
null |
2024-05-30 |
Xwin-LM: Strong and Scalable Alignment Practice for LLMs |
Bolin Ni et.al. |
2405.20335 |
link |
2024-05-31 |
ParSEL: Parameterized Shape Editing with Language |
Aditya Ganeshan et.al. |
2405.20319 |
null |
2024-05-30 |
CausalQuest: Collecting Natural Causal Questions for AI Agents |
Roberto Ceraolo et.al. |
2405.20318 |
link |
2024-05-30 |
ANAH: Analytical Annotation of Hallucinations in Large Language Models |
Ziwei Ji et.al. |
2405.20315 |
link |
2024-05-30 |
Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation |
Guillaume Huguet et.al. |
2405.20313 |
null |
2024-05-30 |
Large Language Models Can Self-Improve At Web Agent Tasks |
Ajay Patel et.al. |
2405.20309 |
link |
2024-05-30 |
Can’t make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models |
Himangi Mittal et.al. |
2405.20305 |
null |
2024-05-30 |
Group Robust Preference Optimization in Reward-free RLHF |
Shyam Sundhar Ramesh et.al. |
2405.20304 |
link |
2024-05-30 |
Who Writes the Review, Human or AI? |
Panagiotis C. Theocharopoulos et.al. |
2405.20285 |
null |
2024-05-30 |
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections |
Massimo Bini et.al. |
2405.20271 |
link |
2024-05-30 |
Evaluating Large Language Model Biases in Persona-Steered Generation |
Andy Liu et.al. |
2405.20253 |
link |
2024-05-30 |
Towards Hierarchical Multi-Agent Workflows for Zero-Shot Prompt Optimization |
Yuchi Liu et.al. |
2405.20252 |
link |
2024-05-30 |
Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use |
Franz Louis Cesista et.al. |
2405.20245 |
null |
2024-05-30 |
Context Injection Attacks on Large Language Models |
Cheng’an Wei et.al. |
2405.20234 |
null |
2024-05-30 |
Data-efficient fine-tuning of foundational models for first-principles quality sublimation enthalpies |
Harveen Kaur et.al. |
2405.20217 |
null |
2024-05-30 |
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models |
Chen Zhang et.al. |
2405.20215 |
null |
2024-05-30 |
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments |
Ke Yi et.al. |
2405.20202 |
null |
2024-05-31 |
Using Large Language Models for Humanitarian Frontline Negotiation: Opportunities and Considerations |
Zilin Ma et.al. |
2405.20195 |
null |
2024-05-29 |
X-VILA: Cross-Modality Alignment for Large Language Model |
Hanrong Ye et.al. |
2405.19335 |
null |
2024-05-29 |
LLMs Meet Multimodal Generation and Editing: A Survey |
Yingqing He et.al. |
2405.19334 |
link |
2024-05-29 |
Multi-Modal Generative Embedding Model |
Feipeng Ma et.al. |
2405.19333 |
null |
2024-05-29 |
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment |
Shenao Zhang et.al. |
2405.19332 |
link |
2024-05-29 |
Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation |
Atrisha Sarkar et.al. |
2405.19328 |
null |
2024-05-29 |
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series |
Ge Zhang et.al. |
2405.19327 |
link |
2024-05-29 |
Reasoning3D – Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models |
Tianrun Chen et.al. |
2405.19326 |
null |
2024-05-29 |
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution |
Minghan Li et.al. |
2405.19325 |
null |
2024-05-29 |
Are Large Language Models Chameleons? |
Mingmeng Geng et.al. |
2405.19323 |
null |
2024-05-29 |
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF |
Shicong Cen et.al. |
2405.19320 |
null |
2024-05-29 |
Robust Preference Optimization through Reward Model Distillation |
Adam Fisch et.al. |
2405.19316 |
null |
2024-05-29 |
Matryoshka Query Transformer for Large Vision-Language Models |
Wenbo Hu et.al. |
2405.19315 |
link |
2024-05-29 |
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice |
Jian-Qiao Zhu et.al. |
2405.19313 |
null |
2024-05-29 |
Expert-Guided Extinction of Toxic Tokens for Debiased Generation |
Xueyao Sun et.al. |
2405.19299 |
null |
2024-05-29 |
MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection |
Michael Regan et.al. |
2405.19285 |
null |
2024-05-29 |
Optimizing Foundation Model Inference on a Many-tiny-core Open-source RISC-V Platform |
Viviane Potocnik et.al. |
2405.19284 |
null |
2024-05-29 |
Programmable Motion Generation for Open-Set Motion Control Tasks |
Hanchao Liu et.al. |
2405.19283 |
null |
2024-05-29 |
PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications |
Dingkang Yang et.al. |
2405.19266 |
null |
2024-05-29 |
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data |
Zifan Song et.al. |
2405.19265 |
link |
2024-05-29 |
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models |
Zhanhui Zhou et.al. |
2405.19262 |
link |
2024-05-28 |
Why are Visually-Grounded Language Models Bad at Image Classification? |
Yuhui Zhang et.al. |
2405.18415 |
link |
2024-05-28 |
Don’t Forget to Connect! Improving RAG with Graph-based Reranking |
Jialin Dong et.al. |
2405.18414 |
null |
2024-05-28 |
WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization |
Jiawei Ma et.al. |
2405.18405 |
null |
2024-05-29 |
Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass |
Ethan Shen et.al. |
2405.18400 |
link |
2024-05-28 |
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning |
Yixiao Zhang et.al. |
2405.18386 |
link |
2024-05-28 |
OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning |
Pengxiang Li et.al. |
2405.18380 |
link |
2024-05-28 |
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models |
Anthony Sarah et.al. |
2405.18377 |
null |
2024-05-28 |
Empowering Source-Free Domain Adaptation with MLLM-driven Curriculum Learning |
Dongjie Chen et.al. |
2405.18376 |
link |
2024-05-28 |
Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning |
Phakphum Artkaew et.al. |
2405.18375 |
link |
2024-05-28 |
PromptWizard: Task-Aware Agent-driven Prompt Optimization Framework |
Eshaan Agarwal et.al. |
2405.18369 |
null |
2024-05-28 |
Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving? |
Yifan Bai et.al. |
2405.18361 |
null |
2024-05-28 |
Bridging the Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs |
Somnath Kumar et.al. |
2405.18359 |
null |
2024-05-28 |
MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning |
Somnath Kumar et.al. |
2405.18358 |
null |
2024-05-28 |
Faithful Logical Reasoning via Symbolic Chain-of-Thought |
Jundong Xu et.al. |
2405.18357 |
link |
2024-05-28 |
Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography |
Jie Liu et.al. |
2405.18356 |
link |
2024-05-28 |
Intelligent Clinical Documentation: Harnessing Generative AI for Patient-Centric Clinical Note Generation |
Anjanava Biswas et.al. |
2405.18346 |
null |
2024-05-28 |
The Battle of LLMs: A Comparative Study in Conversational QA Tasks |
Aryan Rangapur et.al. |
2405.18344 |
null |
2024-05-28 |
Frustratingly Easy Test-Time Adaptation of Vision-Language Models |
Matteo Farina et.al. |
2405.18330 |
link |
2024-05-28 |
Multi-modal Generation via Cross-Modal In-Context Learning |
Amandeep Kumar et.al. |
2405.18304 |
link |
2024-05-28 |
Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning |
Renzhi Wang et.al. |
2405.18292 |
null |
2024-05-27 |
Matryoshka Multimodal Models |
Mu Cai et.al. |
2405.17430 |
null |
2024-05-27 |
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models |
Chankyu Lee et.al. |
2405.17428 |
null |
2024-05-27 |
Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model |
Kuan-Chih Huang et.al. |
2405.17427 |
link |
2024-05-27 |
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence |
Zhuoling Li et.al. |
2405.17424 |
null |
2024-05-27 |
Privacy-Aware Visual Language Models |
Laurens Samson et.al. |
2405.17423 |
null |
2024-05-27 |
Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation |
Jiaming Liu et.al. |
2405.17418 |
null |
2024-05-27 |
THREAD: Thinking Deeper with Recursive Spawning |
Philip Schroeder et.al. |
2405.17402 |
link |
2024-05-27 |
The Expressive Capacity of State Space Models: A Formal Language Perspective |
Yash Sarrof et.al. |
2405.17394 |
null |
2024-05-27 |
MindMerger: Efficient Boosting LLM Reasoning in non-English Languages |
Zixian Huang et.al. |
2405.17386 |
link |
2024-05-27 |
Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective |
Zhen Qin et.al. |
2405.17383 |
null |
2024-05-27 |
ReMoDetect: Reward Models Recognize Aligned LLM’s Generations |
Hyunseok Lee et.al. |
2405.17382 |
null |
2024-05-27 |
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention |
Zhen Qin et.al. |
2405.17381 |
link |
2024-05-27 |
RTL-Repo: A Benchmark for Evaluating LLMs on Large-Scale RTL Design Projects |
Ahmed Allam et.al. |
2405.17378 |
link |
2024-05-28 |
Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models |
ShengYun Peng et.al. |
2405.17374 |
null |
2024-05-27 |
Prompt Optimization with Human Feedback |
Xiaoqiang Lin et.al. |
2405.17346 |
link |
2024-05-27 |
Exploring and steering the moral compass of Large Language Models |
Alejandro Tlaie et.al. |
2405.17345 |
link |
2024-05-27 |
Cost-efficient Knowledge-based Question Answering with Large Language Models |
Junnan Dong et.al. |
2405.17337 |
null |
2024-05-27 |
XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser |
Xianfu Cheng et.al. |
2405.17336 |
null |
2024-05-27 |
FedHPL: Efficient Heterogeneous Federated Learning with Prompt Tuning and Logit Distillation |
Yuting Ma et.al. |
2405.17267 |
null |
2024-05-27 |
On the Noise Robustness of In-Context Learning for Text Generation |
Hongfu Gao et.al. |
2405.17264 |
null |
2024-05-24 |
Scaling Laws for Discriminative Classification in Large Language Models |
Dean Wyatte et.al. |
2405.15765 |
null |
2024-05-24 |
Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence |
Abhinav Patil et.al. |
2405.15750 |
null |
2024-05-24 |
Sparse maximal update parameterization: A holistic approach to sparse training dynamics |
Nolan Dey et.al. |
2405.15743 |
null |
2024-05-24 |
Large Language Models Reflect Human Citation Patterns with a Heightened Citation Bias |
Andres Algaba et.al. |
2405.15739 |
link |
2024-05-24 |
LM4LV: A Frozen Large Language Model for Low-level Vision Tasks |
Boyang Zheng et.al. |
2405.15734 |
link |
2024-05-24 |
Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks |
Jerome Sieber et.al. |
2405.15731 |
link |
2024-05-24 |
Optimizing Large Language Models for OpenAPI Code Completion |
Bohdan Petryshyn et.al. |
2405.15729 |
link |
2024-05-24 |
Disease-informed Adaptation of Vision-Language Models |
Jiajin Zhang et.al. |
2405.15728 |
link |
2024-05-24 |
The Impact of Geometric Complexity on Neural Collapse in Transfer Learning |
Michael Munn et.al. |
2405.15706 |
null |
2024-05-24 |
Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models |
Yue Zhang et.al. |
2405.15684 |
null |
2024-05-24 |
VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap |
Sreyan Ghosh et.al. |
2405.15683 |
null |
2024-05-24 |
What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models |
Abdelrahman Abdelhamed et.al. |
2405.15668 |
null |
2024-05-24 |
Class Machine Unlearning for Complex Data via Concepts Inference and Data Poisoning |
Wenhan Chang et.al. |
2405.15662 |
null |
2024-05-24 |
\(\mathbf{L^2\cdot M = C^2}\) Large Language Models as Covert Channels… a Systematic Analysis |
Simen Gaure et.al. |
2405.15652 |
null |
2024-05-24 |
LLM-based Robot Task Planning with Exceptional Handling for General Purpose Service Robots |
Ruoyu Wang et.al. |
2405.15646 |
null |
2024-05-24 |
GECKO: Generative Language Model for English, Code and Korean |
Sungwoo Oh et.al. |
2405.15640 |
null |
2024-05-24 |
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models |
Hongyu Wang et.al. |
2405.15638 |
link |
2024-05-24 |
GPTZoo: A Large-scale Dataset of GPTs for the Research Community |
Xinyi Hou et.al. |
2405.15630 |
link |
2024-05-24 |
A Comparative Analysis of Distributed Training Strategies for GPT-2 |
Ishan Patwardhan et.al. |
2405.15628 |
null |
2024-05-24 |
Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment |
Hao Sun et.al. |
2405.15624 |
null |
2024-05-23 |
PuzzleAvatar: Assembling 3D Avatars from Personal Albums |
Yuliang Xiu et.al. |
2405.14869 |
null |
2024-05-23 |
A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns |
Asaf Yehudai et.al. |
2405.14863 |
null |
2024-05-23 |
Bitune: Bidirectional Instruction-Tuning |
Dawid J. Kopiczko et.al. |
2405.14862 |
null |
2024-05-23 |
Not All Language Model Features Are Linear |
Joshua Engels et.al. |
2405.14860 |
link |
2024-05-23 |
PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression |
Vladimir Malinovskii et.al. |
2405.14852 |
link |
2024-05-23 |
A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis |
Yue Yang et.al. |
2405.14839 |
null |
2024-05-23 |
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step |
Yuntian Deng et.al. |
2405.14838 |
link |
2024-05-23 |
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models |
Bernal Jiménez Gutiérrez et.al. |
2405.14831 |
link |
2024-05-23 |
Designing A Sustainable Marine Debris Clean-up Framework without Human Labels |
Raymond Wang et.al. |
2405.14815 |
link |
2024-05-23 |
As an AI Language Model, “Yes I Would Recommend Calling the Police”: Norm Inconsistency in LLM Decision-Making |
Shomik Jain et.al. |
2405.14812 |
null |
2024-05-23 |
Implicit Personalization in Language Models: A Systematic Study |
Zhijing Jin et.al. |
2405.14808 |
link |
2024-05-23 |
Can LLMs Solve longer Math Word Problems Better? |
Xin Xu et.al. |
2405.14804 |
null |
2024-05-23 |
Lessons from the Trenches on Reproducible Evaluation of Language Models |
Stella Biderman et.al. |
2405.14782 |
null |
2024-05-23 |
WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models |
Peng Wang et.al. |
2405.14768 |
link |
2024-05-23 |
FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models |
Hongyang Yang et.al. |
2405.14767 |
link |
2024-05-23 |
Evaluating Large Language Models for Public Health Classification and Extraction Tasks |
Joshua Harris et.al. |
2405.14766 |
null |
2024-05-23 |
Large language models can be zero-shot anomaly detectors for time series? |
Sarah Alnegheimish et.al. |
2405.14755 |
null |
2024-05-23 |
A Transformer-Based Approach for Smart Invocation of Automatic Code Completion |
Aral de Moor et.al. |
2405.14753 |
link |
2024-05-23 |
MultiCast: Zero-Shot Multivariate Time Series Forecasting Using LLMs |
Georgios Chatzigeorgakidis et.al. |
2405.14748 |
null |
2024-05-23 |
Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View |
Xuan Liu et.al. |
2405.14744 |
null |
2024-05-21 |
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention |
William Brandon et.al. |
2405.12981 |
null |
2024-05-21 |
OmniGlue: Generalizable Feature Matching with Foundation Model Guidance |
Hanwen Jiang et.al. |
2405.12979 |
link |
2024-05-21 |
BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once |
Theodore Zhao et.al. |
2405.12971 |
null |
2024-05-21 |
Energy Rank Alignment: Using Preference Optimization to Search Chemical Space at Scale |
Shriram Chennakesavalu et.al. |
2405.12961 |
link |
2024-05-21 |
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models |
Zhangyue Yin et.al. |
2405.12939 |
link |
2024-05-21 |
Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs |
Bilgehan Sel et.al. |
2405.12933 |
null |
2024-05-21 |
Code-mixed Sentiment and Hate-speech Prediction |
Anjali Yadav et.al. |
2405.12929 |
null |
2024-05-21 |
Streamlining Software Reviews: Efficient Predictive Modeling with Minimal Examples |
Tim Menzies et.al. |
2405.12920 |
link |
2024-05-21 |
G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation |
Xingyuan Pan et.al. |
2405.12915 |
link |
2024-05-21 |
An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation |
Zhiyu Tan et.al. |
2405.12914 |
link |
2024-05-21 |
Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment |
Holli Sargeant et.al. |
2405.12910 |
link |
2024-05-21 |
Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents |
San Kim et.al. |
2405.12900 |
null |
2024-05-21 |
Investigating Persuasion Techniques in Arabic: An Empirical Study Leveraging Large Language Models |
Abdurahmman Alzahrani et.al. |
2405.12884 |
null |
2024-05-21 |
LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language |
James Requeima et.al. |
2405.12856 |
link |
2024-05-21 |
OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models |
Zhaojian Yu et.al. |
2405.12843 |
link |
2024-05-21 |
SmartFlow: Robotic Process Automation using LLMs |
Arushi Jain et.al. |
2405.12842 |
null |
2024-05-21 |
Large Language Models Meet NLP: A Survey |
Libo Qin et.al. |
2405.12819 |
link |
2024-05-21 |
Test Oracle Automation in the era of LLMs |
Facundo Molina et.al. |
2405.12766 |
null |
2024-05-21 |
C3L: Content Correlated Vision-Language Instruction Tuning Data Generation via Contrastive Learning |
Ji Ma et.al. |
2405.12752 |
null |
2024-05-21 |
Generative AI and Large Language Models for Cyber Security: All Insights You Need |
Mohamed Amine Ferrag et.al. |
2405.12750 |
null |
2024-05-20 |
Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning |
Guanglin Zhou et.al. |
2405.12217 |
link |
2024-05-20 |
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark |
Hongwei Liu et.al. |
2405.12209 |
link |
2024-05-20 |
Developers’ Perceptions on the Impact of ChatGPT in Software Development: A Survey |
Thiago S. Vaillant et.al. |
2405.12195 |
link |
2024-05-20 |
CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models |
Haoxiang Shi et.al. |
2405.12174 |
null |
2024-05-20 |
Fennec: Fine-grained Language Model Evaluation and Correction Extended through Branching and Bridging |
Xiaobo Liang et.al. |
2405.12163 |
link |
2024-05-20 |
Eliciting Problem Specifications via Large Language Models |
Robert E. Wray et.al. |
2405.12147 |
null |
2024-05-20 |
DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM |
Xuchen Li et.al. |
2405.12139 |
null |
2024-05-20 |
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning |
Ting Jiang et.al. |
2405.12130 |
link |
2024-05-20 |
Reindex-Then-Adapt: Improving Large Language Models for Conversational Recommendation |
Zhankui He et.al. |
2405.12119 |
null |
2024-05-20 |
Imp: Highly Capable Large Multimodal Models for Mobile Devices |
Zhenwei Shao et.al. |
2405.12107 |
link |
2024-05-20 |
DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction |
Hao Chen et.al. |
2405.12100 |
null |
2024-05-20 |
Distributional Semantics, Holism, and the Instability of Meaning |
Jumbly Grindrod et.al. |
2405.12084 |
null |
2024-05-20 |
PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation |
Zhuobin Huang et.al. |
2405.12079 |
null |
2024-05-20 |
CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models |
Tong Zhang et.al. |
2405.12063 |
link |
2024-05-20 |
STYLE: Improving Domain Transferability of Asking Clarification Questions in Large Language Model Powered Conversational Agents |
Yue Chen et.al. |
2405.12059 |
null |
2024-05-20 |
KG-RAG: Bridging the Gap Between Knowledge and Creativity |
Diego Sanmartin et.al. |
2405.12035 |
null |
2024-05-20 |
Can AI Relate: Testing Large Language Model Response for Mental Health Support |
Saadia Gabriel et.al. |
2405.12021 |
null |
2024-05-20 |
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering |
Jingqun Tang et.al. |
2405.11985 |
link |
2024-05-20 |
A review on the use of large language models as virtual tutors |
Silvia García-Méndez et.al. |
2405.11983 |
null |
2024-05-20 |
Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays |
Zhichao Sun et.al. |
2405.11976 |
link |
2024-05-17 |
Observational Scaling Laws and the Predictability of Language Model Performance |
Yangjun Ruan et.al. |
2405.10938 |
link |
2024-05-17 |
A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers |
Kaiyu Huang et.al. |
2405.10936 |
link |
2024-05-17 |
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks |
Lucius Bushnaq et.al. |
2405.10928 |
link |
2024-05-17 |
Blackbox Adaptation for Medical Image Segmentation |
Jay N. Paranjape et.al. |
2405.10913 |
link |
2024-05-17 |
COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain |
Dimitrios P. Panagoulias et.al. |
2405.10893 |
null |
2024-05-17 |
Application of Artificial Intelligence in Schizophrenia Rehabilitation Management: Systematic Literature Review |
Hongyi Yang et.al. |
2405.10883 |
null |
2024-05-17 |
ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains |
Zhaopei Huang et.al. |
2405.10860 |
link |
2024-05-17 |
The Future of Large Language Model Pre-training is Federated |
Lorenzo Sani et.al. |
2405.10853 |
null |
2024-05-17 |
Open-Vocabulary Spatio-Temporal Action Detection |
Tao Wu et.al. |
2405.10832 |
null |
2024-05-17 |
Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities |
Hao Zhou et.al. |
2405.10825 |
null |
2024-05-17 |
ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios |
Markus Bayer et.al. |
2405.10808 |
null |
2024-05-17 |
The Relational Machine Calculus |
Chris Barrett et.al. |
2405.10801 |
null |
2024-05-17 |
Empowering Small-Scale Knowledge Graphs: A Strategy of Leveraging General-Purpose Knowledge Graphs for Enriched Embeddings |
Albert Sawczyn et.al. |
2405.10745 |
null |
2024-05-17 |
Efficient Multimodal Large Language Models: A Survey |
Yizhang Jin et.al. |
2405.10739 |
link |
2024-05-17 |
INDUS: Effective and Efficient Language Models for Scientific Applications |
Bishwaranjan Bhattacharjee et.al. |
2405.10725 |
null |
2024-05-17 |
SignLLM: Sign Languages Production Large Language Models |
Sen Fang et.al. |
2405.10718 |
null |
2024-05-17 |
Persian Pronoun Resolution: Leveraging Neural Networks and Language Models |
Hassan Haji Mohammadi et.al. |
2405.10714 |
null |
2024-05-17 |
SynDy: Synthetic Dynamic Dataset Generation Framework for Misinformation Tasks |
Michael Shliselberg et.al. |
2405.10700 |
null |
2024-05-17 |
Revolutionizing Process Mining: A Novel Architecture for ChatGPT Integration and Enhanced User Experience through Optimized Prompt Engineering |
Mehrdad Agha Mohammad Ali Kermani et.al. |
2405.10689 |
null |
2024-05-17 |
Realistic Evaluation of Toxicity in Large Language Models |
Tinh Son Luong et.al. |
2405.10659 |
null |
2024-05-16 |
UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models |
Sahel Sharifymoghaddam et.al. |
2405.10311 |
null |
2024-05-16 |
4D Panoptic Scene Graph Generation |
Jingkang Yang et.al. |
2405.10305 |
link |
2024-05-16 |
Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees |
Yu Gui et.al. |
2405.10301 |
null |
2024-05-16 |
HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models |
Rhea Sanjay Sukthanker et.al. |
2405.10299 |
link |
2024-05-17 |
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning |
Yuexiang Zhai et.al. |
2405.10292 |
null |
2024-05-16 |
Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction |
Jianhao Chen et.al. |
2405.10288 |
link |
2024-05-16 |
FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models |
Adrian Bulat et.al. |
2405.10286 |
null |
2024-05-16 |
Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers |
Tuo Zhang et.al. |
2405.10276 |
null |
2024-05-16 |
Keep It Private: Unsupervised Privatization of Online Text |
Calvin Bao et.al. |
2405.10260 |
link |
2024-05-16 |
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models |
Xianzheng Ma et.al. |
2405.10255 |
link |
2024-05-16 |
PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology |
George Shaikovski et.al. |
2405.10254 |
null |
2024-05-16 |
A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks |
Xuanfan Ni et.al. |
2405.10251 |
null |
2024-05-16 |
IntelliExplain: Enhancing Interactive Code Generation through Natural Language Explanations for Non-Professional Programmers |
Hao Yan et.al. |
2405.10250 |
null |
2024-05-16 |
A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts |
Xinru Zhang et.al. |
2405.10246 |
link |
2024-05-16 |
DocuMint: Docstring Generation for Python using Small Language Models |
Bibek Poudel et.al. |
2405.10243 |
link |
2024-05-16 |
Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting |
Divij Gupta et.al. |
2405.10216 |
null |
2024-05-16 |
CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations |
Jiahao Zhao et.al. |
2405.10212 |
null |
2024-05-16 |
LFED: A Literary Fiction Evaluation Dataset for Large Language Models |
Linhao Yu et.al. |
2405.10166 |
link |
2024-05-16 |
PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning |
Jiancheng Pan et.al. |
2405.10160 |
link |
2024-05-16 |
Speaker Verification in Agent-Generated Conversations |
Yizhe Yang et.al. |
2405.10150 |
null |
2024-05-15 |
Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming |
Bushi Xiao et.al. |
2405.09508 |
null |
2024-05-15 |
Constrained Learning for Causal Inference and Semiparametric Statistics |
Tiffany Tianhui Cai et.al. |
2405.09493 |
null |
2024-05-15 |
Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts |
Donya Rooein et.al. |
2405.09482 |
null |
2024-05-15 |
Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models |
Majid Zarharan et.al. |
2405.09454 |
link |
2024-05-15 |
M$^4$oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts |
Yufeng Jiang et.al. |
2405.09446 |
link |
2024-05-15 |
Facilitating Opinion Diversity through Hybrid NLP Approaches |
Michiel van der Meer et.al. |
2405.09439 |
null |
2024-05-15 |
A Survey On Text-to-3D Contents Generation In The Wild |
Chenhan Jiang et.al. |
2405.09431 |
null |
2024-05-15 |
MicroPython Testbed for Federated Learning Algorithms |
Miroslav Popovic et.al. |
2405.09423 |
link |
2024-05-15 |
Matching domain experts by training from scratch on domain knowledge |
Xiaoliang Luo et.al. |
2405.09395 |
null |
2024-05-15 |
Compositional imprecise probability |
Jack Liell-Cock et.al. |
2405.09391 |
null |
2024-05-15 |
PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models |
Devansh Jain et.al. |
2405.09373 |
null |
2024-05-15 |
SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition |
Weijie L et.al. |
2405.09365 |
null |
2024-05-15 |
Large Language Model Bias Mitigation from the Perspective of Knowledge Editing |
Ruizhe Chen et.al. |
2405.09341 |
null |
2024-05-15 |
Prompting-based Synthetic Data Generation for Few-Shot Question Answering |
Maximilian Schmidt et.al. |
2405.09335 |
null |
2024-05-15 |
Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls |
Pedro Miguel Sánchez Sánchez et.al. |
2405.09318 |
null |
2024-05-15 |
Comparing the Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support |
Birger Moell et.al. |
2405.09300 |
null |
2024-05-15 |
Do language models capture implied discourse meanings? An investigation with exhaustivity implicatures of Korean morphology |
Hagyeong Shin et.al. |
2405.09293 |
null |
2024-05-15 |
Sign of the Times: Evaluating the use of Large Language Models for Idiomaticity Detection |
Dylan Phelps et.al. |
2405.09279 |
null |
2024-05-15 |
Dynamic Activation Pitfalls in LLaMA Models: An Empirical Study |
Chi Ma et.al. |
2405.09274 |
null |
2024-05-15 |
New Textual Corpora for Serbian Language Modeling |
Mihailo Škorić et.al. |
2405.09250 |
null |
2024-05-14 |
Efficient Vision-Language Pre-training by Cluster Masking |
Zihao Wei et.al. |
2405.08815 |
link |
2024-05-14 |
Towards Enhanced RAC Accessibility: Leveraging Datasets and LLMs |
Edison Jair Bejarano Sepulveda et.al. |
2405.08792 |
link |
2024-05-14 |
Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring |
Tiantian Zhang et.al. |
2405.08786 |
link |
2024-05-14 |
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs |
Akhila Yerukola et.al. |
2405.08760 |
link |
2024-05-14 |
Distributed Threat Intelligence at the Edge Devices: A Large Language Model-Driven Approach |
Syed Mhamudul Hasan et.al. |
2405.08755 |
null |
2024-05-14 |
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding |
Zhimin Li et.al. |
2405.08748 |
link |
2024-05-14 |
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory |
Xueyan Niu et.al. |
2405.08707 |
null |
2024-05-14 |
EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera |
Beilei Cui et.al. |
2405.08672 |
link |
2024-05-14 |
Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research |
Qinglong Cao et.al. |
2405.08668 |
link |
2024-05-14 |
Thinking Tokens for Language Modeling |
David Herel et.al. |
2405.08644 |
null |
2024-05-15 |
ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation |
Dimitris Gkoumas et.al. |
2405.08619 |
null |
2024-05-14 |
A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine |
Hanguang Xiao et.al. |
2405.08603 |
null |
2024-05-15 |
EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark |
Xiaohui Zhang et.al. |
2405.08596 |
null |
2024-05-14 |
Open-Vocabulary Object Detection via Neighboring Region Attention Alignment |
Sunyuan Qiang et.al. |
2405.08593 |
null |
2024-05-14 |
Improving Transformers with Dynamically Composable Multi-Head Attention |
Da Xiao et.al. |
2405.08553 |
link |
2024-05-14 |
Self-Distillation Improves DNA Sequence Inference |
Tong Yu et.al. |
2405.08538 |
link |
2024-05-14 |
Falcon 7b for Software Mention Detection in Scholarly Documents |
AmeerAli Khan et.al. |
2405.08514 |
null |
2024-05-14 |
Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure |
Odysseas S. Chlapanis et.al. |
2405.08502 |
link |
2024-05-14 |
Is Less More? Quality, Quantity and Context in Idiom Processing with Natural Language Models |
Agne Knietaite et.al. |
2405.08497 |
link |
2024-05-14 |
Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models |
Andrea Piergentili et.al. |
2405.08477 |
null |
2024-05-13 |
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots |
Chengyue Wu et.al. |
2405.07990 |
null |
2024-05-13 |
A Generalist Learner for Multifaceted Medical Image Interpretation |
Hong-Yu Zhou et.al. |
2405.07988 |
null |
2024-05-13 |
The Platonic Representation Hypothesis |
Minyoung Huh et.al. |
2405.07987 |
link |
2024-05-13 |
Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation |
Kevin Stangl et.al. |
2405.07969 |
null |
2024-05-13 |
PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation |
Suad Alshammari et.al. |
2405.07963 |
null |
2024-05-13 |
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments |
Samuel Schmidgall et.al. |
2405.07960 |
null |
2024-05-13 |
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning |
Yinzhu Quan et.al. |
2405.07938 |
link |
2024-05-13 |
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition |
Ziyang Zhang et.al. |
2405.07932 |
link |
2024-05-13 |
Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data |
Mahdi Morafah et.al. |
2405.07925 |
null |
2024-05-13 |
Can Better Text Semantics in Prompt Tuning Improve VLM Generalization? |
Hari Chandana Kuchibhotla et.al. |
2405.07921 |
null |
2024-05-13 |
A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking |
Ferdinand Schlatt et.al. |
2405.07920 |
link |
2024-05-13 |
PLUTO: Pathology-Universal Transformer |
Dinkar Juyal et.al. |
2405.07905 |
null |
2024-05-13 |
Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers |
Alena Tsanda et.al. |
2405.07886 |
link |
2024-05-13 |
Zero-Shot Tokenizer Transfer |
Benjamin Minixhofer et.al. |
2405.07883 |
link |
2024-05-13 |
RLHF Workflow: From Reward Modeling to Online RLHF |
Hanze Dong et.al. |
2405.07863 |
link |
2024-05-13 |
Can LLMs Help Predict Elections? (Counter)Evidence from the World’s Largest Democracy |
Pratik Gujral et.al. |
2405.07828 |
null |
2024-05-13 |
A View of How Language Models Will Transform Law |
Frank Fagan et.al. |
2405.07826 |
null |
2024-05-13 |
FreeVA: Offline MLLM as Training-Free Video Assistant |
Wenhao Wu et.al. |
2405.07798 |
link |
2024-05-13 |
DEPTH: Discourse Education through Pre-Training Hierarchically |
Zachary Bamberger et.al. |
2405.07788 |
link |
2024-05-13 |
Generating Human Motion in 3D Scenes from Text Descriptions |
Zhi Cen et.al. |
2405.07784 |
null |
2024-05-10 |
Linearizing Large Language Models |
Jean Mercat et.al. |
2405.06640 |
link |
2024-05-10 |
Value Augmented Sampling for Language Model Alignment and Personalization |
Seungwook Han et.al. |
2405.06639 |
link |
2024-05-10 |
Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark |
Evan M. Williams et.al. |
2405.06634 |
link |
2024-05-10 |
Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models |
Chakshu Moar et.al. |
2405.06626 |
null |
2024-05-10 |
Explaining Text Similarity in Transformer Models |
Alexandros Vasileiou et.al. |
2405.06604 |
link |
2024-05-10 |
Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach |
Elham Ravanbakhsh et.al. |
2405.06586 |
null |
2024-05-10 |
What Can Natural Language Processing Do for Peer Review? |
Ilia Kuznetsov et.al. |
2405.06563 |
link |
2024-05-10 |
Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval |
Mengjia Niu et.al. |
2405.06545 |
null |
2024-05-10 |
Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts |
Wenyu Huang et.al. |
2405.06524 |
null |
2024-05-10 |
UniDM: A Unified Framework for Data Manipulation with Large Language Models |
Yichen Qian et.al. |
2405.06510 |
null |
2024-05-10 |
Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling |
Lyumanshan Ye et.al. |
2405.06495 |
null |
2024-05-10 |
Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification |
Yaoqin Ye et.al. |
2405.06468 |
link |
2024-05-10 |
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation |
JoonHo Lee et.al. |
2405.06424 |
link |
2024-05-10 |
Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions? |
Hunter McNichols et.al. |
2405.06414 |
link |
2024-05-10 |
Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL |
Ning Cheng et.al. |
2405.06410 |
null |
2024-05-10 |
Program Synthesis using Inductive Logic Programming for the Abstraction and Reasoning Corpus |
Filipe Marinho Rocha et.al. |
2405.06399 |
null |
2024-05-10 |
Memory Mosaics |
Jianyu Zhang et.al. |
2405.06394 |
link |
2024-05-10 |
LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play |
Li-Chun Lu et.al. |
2405.06373 |
null |
2024-05-10 |
LMD3: Language Model Data Density Dependence |
John Kirchenbauer et.al. |
2405.06331 |
null |
2024-05-10 |
Correlation Dimension of Natural Language in a Statistical Manifold |
Xin Du et.al. |
2405.06321 |
null |
2024-05-09 |
Natural Language Processing RELIES on Linguistics |
Juri Opitz et.al. |
2405.05966 |
null |
2024-05-09 |
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning |
Dan Qiao et.al. |
2405.05957 |
link |
2024-05-09 |
Probing Multimodal LLMs as World Models for Driving |
Shiva Sreeram et.al. |
2405.05956 |
link |
2024-05-09 |
Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning |
Junzhi Chen et.al. |
2405.05955 |
link |
2024-05-09 |
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts |
Jiachen Li et.al. |
2405.05949 |
link |
2024-05-09 |
DOLOMITES: Domain-Specific Long-Form Methodical Tasks |
Chaitanya Malaviya et.al. |
2405.05938 |
null |
2024-05-09 |
Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness |
Siyuan Li et.al. |
2405.05930 |
null |
2024-05-09 |
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? |
Zorik Gekhman et.al. |
2405.05904 |
null |
2024-05-09 |
Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes |
Ziang Guo et.al. |
2405.05885 |
null |
2024-05-09 |
FlockGPT: Guiding UAV Flocking with Linguistic Orchestration |
Artem Lykov et.al. |
2405.05872 |
null |
2024-05-09 |
Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control |
Gunshi Gupta et.al. |
2405.05852 |
link |
2024-05-09 |
Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning |
Artem Lykov et.al. |
2405.05824 |
link |
2024-05-09 |
Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference |
Zhihang Lin et.al. |
2405.05803 |
link |
2024-05-09 |
Towards a More Inclusive AI: Progress and Perspectives in Large Language Model Training for the Sámi Language |
Ronny Paul et.al. |
2405.05777 |
null |
2024-05-09 |
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions |
Polina Tsvilodub et.al. |
2405.05776 |
null |
2024-05-09 |
Large Language Model-Aided Evolutionary Search for Constrained Multiobjective Optimization |
Zeyi Wang et.al. |
2405.05767 |
null |
2024-05-09 |
Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social Media |
Zhizhen Zhang et.al. |
2405.05760 |
null |
2024-05-09 |
Exploring the Potential of Human-LLM Synergy in Advancing Qualitative Analysis: A Case Study on Mental-Illness Stigma |
Han Meng et.al. |
2405.05758 |
null |
2024-05-09 |
Can large language models understand uncommon meanings of common words? |
Jinyang Wu et.al. |
2405.05741 |
null |
2024-05-09 |
Evaluating Dialect Robustness of Language Models via Conversation Understanding |
Dipankar Srirag et.al. |
2405.05688 |
link |
2024-05-08 |
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models |
Prannay Kaul et.al. |
2405.05256 |
null |
2024-05-08 |
You Only Cache Once: Decoder-Decoder Architectures for Language Models |
Yutao Sun et.al. |
2405.05254 |
link |
2024-05-08 |
Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge |
Charles Koutcheme et.al. |
2405.05253 |
link |
2024-05-09 |
LLMs with Personalities in Multi-issue Negotiation Games |
Sean Noh et.al. |
2405.05248 |
null |
2024-05-08 |
EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning |
Jingfeng Yao et.al. |
2405.05237 |
link |
2024-05-08 |
SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants |
Masoud Moghani et.al. |
2405.05226 |
null |
2024-05-08 |
Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers |
Jiuxiang Gu et.al. |
2405.05219 |
null |
2024-05-08 |
FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models |
Jinglin Xu et.al. |
2405.05216 |
link |
2024-05-08 |
MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning |
Inderjeet Nair et.al. |
2405.05189 |
null |
2024-05-08 |
Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming |
Tommaso Pasini et.al. |
2405.05176 |
null |
2024-05-08 |
Air Gap: Protecting Privacy-Conscious Conversational Agents |
Eugene Bagdasaryan et.al. |
2405.05175 |
null |
2024-05-08 |
XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples |
Peiqin Lin et.al. |
2405.05116 |
link |
2024-05-08 |
QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs |
Weijia Zhang et.al. |
2405.05109 |
null |
2024-05-08 |
Concerns on Bias in Large Language Models when Creating Synthetic Personae |
Helena A. Haxvig et.al. |
2405.05080 |
null |
2024-05-08 |
Impact of Tone-Aware Explanations in Recommender Systems |
Ayano Okoso et.al. |
2405.05061 |
null |
2024-05-08 |
Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models |
Aylin Gunal et.al. |
2405.05060 |
null |
2024-05-08 |
Seeds of Stereotypes: A Large-Scale Textual Analysis of Race and Gender Associations with Diseases in Online Sources |
Lasse Hyldig Hansen et.al. |
2405.05049 |
null |
2024-05-08 |
${M^2D}$NeRF: Multi-Modal Decomposition NeRF with 3D Feature Fields |
Ning Wang et.al. |
2405.05010 |
null |
2024-05-08 |
ADELIE: Aligning Large Language Models on Information Extraction |
Yunjia Qi et.al. |
2405.05008 |
link |
2024-05-08 |
NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair |
Ruoke Wang et.al. |
2405.04994 |
null |
2024-05-07 |
ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning |
Jing Lin et.al. |
2405.04533 |
null |
2024-05-07 |
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving |
Yujun Lin et.al. |
2405.04532 |
link |
2024-05-07 |
NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts |
Shudan Zhang et.al. |
2405.04520 |
null |
2024-05-07 |
xLSTM: Extended Long Short-Term Memory |
Maximilian Beck et.al. |
2405.04517 |
null |
2024-05-07 |
A Transformer with Stack Attention |
Jiaoda Li et.al. |
2405.04515 |
link |
2024-05-08 |
Unveiling Disparities in Web Task Handling Between Human and Web Agent |
Kihoon Son et.al. |
2405.04497 |
null |
2024-05-07 |
Toward In-Context Teaching: Adapting Examples to Students’ Misconceptions |
Alexis Ross et.al. |
2405.04495 |
null |
2024-05-07 |
Representation Learning of Daily Movement Data Using Text Encoders |
Alexander Capstick et.al. |
2405.04494 |
link |
2024-05-08 |
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model |
DeepSeek-AI et.al. |
2405.04434 |
link |
2024-05-07 |
The Silicone Ceiling: Auditing GPT’s Race and Gender Biases in Hiring |
Lena Armstrong et.al. |
2405.04412 |
null |
2024-05-07 |
Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks |
Georgios Pantazopoulos et.al. |
2405.04403 |
link |
2024-05-07 |
Large Language Models Cannot Explain Themselves |
Advait Sarkar et.al. |
2405.04382 |
null |
2024-05-07 |
A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI |
Hannah Chafetz et.al. |
2405.04333 |
null |
2024-05-07 |
Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation |
Atharvan Dogra et.al. |
2405.04325 |
null |
2024-05-07 |
Granite Code Models: A Family of Open Foundation Models for Code Intelligence |
Mayank Mishra et.al. |
2405.04324 |
link |
2024-05-07 |
Accelerating Speculative Decoding using Dynamic Speculation Length |
Jonathan Mamou et.al. |
2405.04304 |
null |
2024-05-07 |
Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework |
Xiangpeng Wan et.al. |
2405.04294 |
link |
2024-05-07 |
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore |
Junchao Wu et.al. |
2405.04286 |
null |
2024-05-07 |
On the Foundations of Earth and Climate Foundation Models |
Xiao Xiang Zhu et.al. |
2405.04285 |
null |
2024-05-07 |
Semantic API Alignment: Linking High-level User Goals to APIs |
Robert Feldt et.al. |
2405.04236 |
null |
2024-05-06 |
Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs |
Muhammad Uzair Khattak et.al. |
2405.03690 |
null |
2024-05-06 |
Pose Priors from Language Models |
Sanjay Subramanian et.al. |
2405.03689 |
null |
2024-05-06 |
Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames |
Keith Burghardt et.al. |
2405.03688 |
link |
2024-05-06 |
Language-Image Models with 3D Understanding |
Jang Hyun Cho et.al. |
2405.03685 |
null |
2024-05-06 |
AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design |
Kamal Choudhary et.al. |
2405.03680 |
null |
2024-05-06 |
When LLMs Meet Cybersecurity: A Systematic Literature Review |
Jie Zhang et.al. |
2405.03644 |
link |
2024-05-06 |
A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama |
Vlad-Andrei Cursaru et.al. |
2405.03616 |
null |
2024-05-06 |
GREEN: Generative Radiology Report Evaluation and Error Notation |
Sophie Ostmeier et.al. |
2405.03595 |
null |
2024-05-06 |
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment |
Abhinav Agarwalla et.al. |
2405.03594 |
null |
2024-05-06 |
Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing |
Han Liu et.al. |
2405.03565 |
null |
2024-05-07 |
ID-centric Pre-training for Recommendation |
Yiqing Wu et.al. |
2405.03562 |
null |
2024-05-06 |
AlphaMath Almost Zero: process Supervision without process |
Guoxin Chen et.al. |
2405.03553 |
link |
2024-05-06 |
MAmmoTH2: Scaling Instructions from the Web |
Xiang Yue et.al. |
2405.03548 |
null |
2024-05-06 |
Position Paper: Leveraging Foundational Models for Black-Box Optimization: Benefits, Challenges, and Future Directions |
Xingyou Song et.al. |
2405.03547 |
null |
2024-05-06 |
Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning |
Yubo Mai et.al. |
2405.03509 |
null |
2024-05-06 |
UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images |
Yiting Qu et.al. |
2405.03486 |
null |
2024-05-06 |
LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model |
Haowen Sun et.al. |
2405.03485 |
link |
2024-05-06 |
Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search |
Hideaki Joko et.al. |
2405.03480 |
link |
2024-05-07 |
Large Language Models (LLMs) as Agents for Augmented Democracy |
Jairo Gudiño-Rosero et.al. |
2405.03452 |
null |
2024-05-06 |
SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence |
Hangyuan Ji et.al. |
2405.03446 |
link |
2024-05-03 |
Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models |
Piotr Padlewski et.al. |
2405.02287 |
link |
2024-05-03 |
Structural Pruning of Pre-trained Language Models via Neural Architecture Search |
Aaron Klein et.al. |
2405.02267 |
null |
2024-05-03 |
On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning? |
Maxime Zanella et.al. |
2405.02266 |
link |
2024-05-03 |
Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows |
Jasmine Y. Shih et.al. |
2405.02260 |
null |
2024-05-03 |
What matters when building vision-language models? |
Hugo Laurençon et.al. |
2405.02246 |
null |
2024-05-03 |
REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs |
Deepa Tilwani et.al. |
2405.02228 |
null |
2024-05-03 |
Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks |
Lujing Zhang et.al. |
2405.02225 |
null |
2024-05-03 |
FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems |
Yashar Deldjoo et.al. |
2405.02219 |
null |
2024-05-03 |
Automatic Programming: Large Language Models and Beyond |
Michael R. Lyu et.al. |
2405.02213 |
null |
2024-05-03 |
Assessing and Verifying Task Utility in LLM-Powered Applications |
Negar Arabzadeh et.al. |
2405.02178 |
null |
2024-05-03 |
Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset |
Hsuvas Borkakoty et.al. |
2405.02175 |
null |
2024-05-03 |
Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models |
Mohamad Al Mdfaa et.al. |
2405.02162 |
null |
2024-05-03 |
Neural Context Flows for Learning Generalizable Dynamical Systems |
Roussel Desmond Nzoyem et.al. |
2405.02154 |
link |
2024-05-03 |
The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates |
Giuseppe Russo Latona et.al. |
2405.02150 |
link |
2024-05-03 |
MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain |
Chao Jiang et.al. |
2405.02144 |
null |
2024-05-03 |
Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection |
Guillem Ramírez et.al. |
2405.02134 |
null |
2024-05-03 |
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets |
Xuelong Geng et.al. |
2405.02132 |
null |
2024-05-03 |
Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph |
Vladyslav Nechakhin et.al. |
2405.02105 |
null |
2024-05-03 |
Argumentative Large Language Models for Explainable and Contestable Decision-Making |
Gabriel Freedman et.al. |
2405.02079 |
null |
2024-05-03 |
Comparative Analysis of Retrieval Systems in the Real World |
Dmytro Mozolevskyi et.al. |
2405.02048 |
null |
2024-05-02 |
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models |
Seungone Kim et.al. |
2405.01535 |
link |
2024-05-02 |
Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks |
Murtaza Dalal et.al. |
2405.01534 |
null |
2024-05-02 |
OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning |
Shihao Wang et.al. |
2405.01533 |
link |
2024-05-02 |
FLAME: Factuality-Aware Alignment for Large Language Models |
Sheng-Chieh Lin et.al. |
2405.01525 |
null |
2024-05-02 |
A separability-based approach to quantifying generalization: which layer is best? |
Luciano Dyballa et.al. |
2405.01524 |
null |
2024-05-02 |
Transformer-Aided Semantic Communications |
Matin Mortaheb et.al. |
2405.01521 |
null |
2024-05-02 |
D2PO: Discriminator-Guided DPO with Response Evaluation Models |
Prasann Singhal et.al. |
2405.01511 |
link |
2024-05-02 |
Analyzing the Role of Semantic Representations in the Era of Large Language Models |
Zhijing Jin et.al. |
2405.01502 |
link |
2024-05-02 |
Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models |
Raymond Fok et.al. |
2405.01501 |
null |
2024-05-02 |
Controllable Text Generation in the Instruction-Tuning Era |
Dhananjay Ashok et.al. |
2405.01490 |
null |
2024-05-02 |
MANTIS: Interleaved Multi-Image Instruction Tuning |
Dongfu Jiang et.al. |
2405.01483 |
link |
2024-05-02 |
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment |
Gerald Shen et.al. |
2405.01481 |
link |
2024-05-02 |
V-FLUTE: Visual Figurative Language Understanding with Textual Explanations |
Arkadiy Saakyan et.al. |
2405.01474 |
link |
2024-05-02 |
Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning |
Théo Moutakanni et.al. |
2405.01469 |
null |
2024-05-02 |
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models |
Yifei Ming et.al. |
2405.01468 |
null |
2024-05-02 |
A Systematic Literature Review on Large Language Models for Automated Program Repair |
Quanjun Zhang et.al. |
2405.01466 |
link |
2024-05-02 |
Natural Language to Verilog: Design of a Recurrent Spiking Neural Network using Large Language Models and ChatGPT |
Paola Vitolo et.al. |
2405.01419 |
null |
2024-05-02 |
MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors |
Yuan Tang et.al. |
2405.01413 |
link |
2024-05-02 |
Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving |
Xin Quan et.al. |
2405.01379 |
null |
2024-05-02 |
GAIA: A General AI Assistant for Intelligent Accelerator Operations |
Frank Mayet et.al. |
2405.01359 |
null |
2024-05-01 |
Self-Play Preference Optimization for Language Model Alignment |
Yue Wu et.al. |
2405.00675 |
link |
2024-05-01 |
Is Bigger Edit Batch Size Always Better? – An Empirical Study on Model Editing with Llama-3 |
Junsang Yoon et.al. |
2405.00664 |
link |
2024-05-01 |
HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models |
Ningke Li et.al. |
2405.00648 |
null |
2024-05-01 |
When Quantization Affects Confidence of Large Language Models? |
Irina Proskurina et.al. |
2405.00632 |
link |
2024-05-01 |
“I’m Not Sure, But…”: Examining the Impact of Large Language Models’ Uncertainty Expression on User Reliance and Trust |
Sunnie S. Y. Kim et.al. |
2405.00623 |
null |
2024-05-01 |
Causal Evaluation of Language Models |
Sirui Chen et.al. |
2405.00622 |
link |
2024-05-01 |
Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling |
Yida Mu et.al. |
2405.00611 |
link |
2024-05-01 |
Investigating Automatic Scoring and Feedback using Large Language Models |
Gloria Ashiya Katuka et.al. |
2405.00602 |
null |
2024-05-01 |
Are Models Biased on Text without Gender-related Language? |
Catarina G Belém et.al. |
2405.00588 |
link |
2024-05-01 |
The Real, the Better: Aligning Large Language Models with Online Human Behaviors |
Guanying Jiang et.al. |
2405.00578 |
null |
2024-05-01 |
EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model |
Deng Li et.al. |
2405.00574 |
null |
2024-05-01 |
NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance |
Huan-Yi Su et.al. |
2405.00566 |
null |
2024-05-01 |
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment |
Zhili Liu et.al. |
2405.00557 |
null |
2024-05-01 |
Long-Term Human Trajectory Prediction using 3D Dynamic Scene Graphs |
Nicolas Gorlo et.al. |
2405.00552 |
link |
2024-05-01 |
ChatBI: Towards Natural Language to Complex Business Intelligence SQL |
Jinqing Lian et.al. |
2405.00527 |
null |
2024-05-01 |
CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions |
Donghee Choi et.al. |
2405.00523 |
null |
2024-05-01 |
Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning |
Lucas-Andreï Thil et.al. |
2405.00516 |
null |
2024-05-01 |
GOLD: Geometry Problem Solver with Natural Language Description |
Jiaxin Zhang et.al. |
2405.00494 |
link |
2024-05-01 |
Is Temperature the Creativity Parameter of Large Language Models? |
Max Peeperkorn et.al. |
2405.00492 |
null |
2024-05-01 |
The Pyramid of Captions |
Delong Chen et.al. |
2405.00485 |
null |
2024-04-30 |
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation |
Yunhao Ge et.al. |
2404.19752 |
null |
2024-04-30 |
PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification |
Leon Garza et.al. |
2404.19744 |
null |
2024-04-30 |
Better & Faster Large Language Models via Multi-token Prediction |
Fabian Gloeckle et.al. |
2404.19737 |
null |
2024-04-30 |
A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications |
Steph Buongiorno et.al. |
2404.19729 |
null |
2024-04-30 |
PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games |
Steph Buongiorno et.al. |
2404.19721 |
null |
2024-04-30 |
Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns |
Constantinos Patsakis et.al. |
2404.19715 |
null |
2024-04-30 |
Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models |
Scott Sumpter et.al. |
2404.19713 |
null |
2024-04-30 |
When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively |
Tiziano Labruna et.al. |
2404.19705 |
link |
2024-04-30 |
Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners |
Chun Feng et.al. |
2404.19696 |
null |
2024-04-30 |
Towards Generalist Robot Learning from Internet Video: A Survey |
Robert McCarthy et.al. |
2404.19664 |
null |
2024-04-30 |
MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation |
Min Zhang et.al. |
2404.19644 |
null |
2024-04-30 |
On Training a Neural Network to Explain Binaries |
Alexander Interrante-Grant et.al. |
2404.19631 |
null |
2024-04-30 |
Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model |
Denys Godwin et.al. |
2404.19609 |
null |
2024-04-30 |
Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning |
Xuanli He et.al. |
2404.19597 |
null |
2024-04-30 |
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing |
Yucheng Hu et.al. |
2404.19543 |
link |
2024-04-30 |
MoST: Multi-modality Scene Tokenization for Motion Prediction |
Norman Mu et.al. |
2404.19531 |
null |
2024-04-30 |
Do Large Language Models Understand Conversational Implicature – A case study with a chinese sitcom |
Shisen Yue et.al. |
2404.19509 |
link |
2024-04-30 |
More Compute Is What You Need |
Zhen Guo et.al. |
2404.19484 |
null |
2024-05-01 |
Neuro-Vision to Language: Image Reconstruction and Language enabled Interaction via Brain Recordings |
Guobin Shen et.al. |
2404.19438 |
null |
2024-04-30 |
Can Large Language Models put 2 and 2 together? Probing for Entailed Arithmetical Relationships |
D. Panas et.al. |
2404.19432 |
null |
2024-04-29 |
Hallucination of Multimodal Large Language Models: A Survey |
Zechen Bai et.al. |
2404.18930 |
link |
2024-04-29 |
Holmes: Benchmark the Linguistic Competence of Language Models |
Andreas Waldis et.al. |
2404.18923 |
null |
2024-04-29 |
DPO Meets PPO: Reinforced Token Optimization for RLHF |
Han Zhong et.al. |
2404.18922 |
null |
2024-04-29 |
TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation |
Junhao Cheng et.al. |
2404.18919 |
link |
2024-04-29 |
Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting |
Fangcheng Liu et.al. |
2404.18911 |
link |
2024-04-29 |
Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking |
Hong Jin Kang et.al. |
2404.18881 |
link |
2024-04-29 |
More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness |
Aaron J. Li et.al. |
2404.18870 |
link |
2024-04-29 |
Truth-value judgment in language models: belief directions are context sensitive |
Stefan F. Schouten et.al. |
2404.18865 |
null |
2024-04-29 |
Performance-Aligned LLMs for Generating Fast Code |
Daniel Nichols et.al. |
2404.18864 |
null |
2024-04-29 |
A Survey on Vision Mamba: Models, Applications and Challenges |
Rui Xu et.al. |
2404.18861 |
link |
2024-04-29 |
VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning |
Aidan Z. H. Yang et.al. |
2404.18852 |
null |
2024-04-29 |
FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition |
Yuxuan Yan et.al. |
2404.18848 |
null |
2024-04-29 |
It’s Difficult to be Neutral – Human and LLM-based Sentiment Annotation of Patient Comments |
Petter Mæhlum et.al. |
2404.18832 |
null |
2024-04-29 |
Benchmarking Benchmark Leakage in Large Language Models |
Ruijie Xu et.al. |
2404.18824 |
link |
2024-04-29 |
AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering |
Wenxiang Zhao et.al. |
2404.18816 |
null |
2024-04-29 |
Unknown Script: Impact of Script on Cross-Lingual Transfer |
Wondimagegnhue Tsegaye Tufa et.al. |
2404.18810 |
link |
2024-04-29 |
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models |
Pat Verga et.al. |
2404.18796 |
null |
2024-04-29 |
PECC: Problem Extraction and Coding Challenges |
Patrick Haller et.al. |
2404.18766 |
link |
2024-04-29 |
Transitive Vision-Language Prompt Learning for Domain Generalization |
Liyuan Wang et.al. |
2404.18758 |
null |
2024-04-29 |
Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models |
Hongyi Zhu et.al. |
2404.18746 |
null |
2024-04-26 |
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo |
Stephen Zhao et.al. |
2404.17546 |
link |
2024-04-26 |
Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models |
Yuhang Huang et.al. |
2404.17534 |
null |
2024-04-26 |
Large Language Model Agent as a Mechanical Designer |
Yayati Jadhav et.al. |
2404.17525 |
null |
2024-04-26 |
On the Use of Large Language Models to Generate Capability Ontologies |
Luis Miguel Vieira da Silva et.al. |
2404.17524 |
link |
2024-04-26 |
Enhancing Legal Compliance and Regulation Analysis with Large Language Models |
Shabnam Hassani et.al. |
2404.17522 |
null |
2024-04-26 |
A Comprehensive Evaluation on Event Reasoning of Large Language Models |
Zhengwei Tao et.al. |
2404.17513 |
link |
2024-04-26 |
CEval: A Benchmark for Evaluating Counterfactual Text Generation |
Van Bach Nguyen et.al. |
2404.17475 |
null |
2024-04-26 |
Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System |
Robin Schmucker et.al. |
2404.17460 |
null |
2024-04-26 |
“ChatGPT Is Here to Help, Not to Replace Anybody” – An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses |
Bruno Pereira Cipriano et.al. |
2404.17443 |
null |
2024-04-26 |
PromptCIR: Blind Compressed Image Restoration with Prompt Learning |
Bingchen Li et.al. |
2404.17433 |
link |
2024-04-26 |
Evaluation of Geographical Distortions in Language Models: A Crucial Step Towards Equitable Representations |
Rémy Decoupes et.al. |
2404.17401 |
null |
2024-04-26 |
UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning |
Maoxun Yuan et.al. |
2404.17360 |
null |
2024-04-26 |
InspectorRAGet: An Introspection Platform for RAG Evaluation |
Kshitij Fadnis et.al. |
2404.17347 |
link |
2024-04-26 |
Introducing cosmosGPT: Monolingual Training for Turkish Language Models |
H. Toprak Kesgin et.al. |
2404.17336 |
null |
2024-04-26 |
A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation |
Xin Zhang et.al. |
2404.17335 |
null |
2024-04-26 |
An Extendable Cloud-Native Alloy Property Explorer |
Zhuoyuan Li et.al. |
2404.17330 |
link |
2024-04-26 |
When to Trust LLMs: Aligning Confidence with Response Quality |
Shuchang Tao et.al. |
2404.17287 |
null |
2024-04-26 |
Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM |
Xuan Zhang et.al. |
2404.17283 |
link |
2024-04-26 |
Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT as a Pivot |
Michelle Terblanche et.al. |
2404.17216 |
null |
2024-04-26 |
Low-Rank Knowledge Decomposition for Medical Foundation Models |
Yuhang Zhou et.al. |
2404.17184 |
null |
2024-04-25 |
The Third Monocular Depth Estimation Challenge |
Jaime Spencer et.al. |
2404.16831 |
null |
2024-04-25 |
Make-it-Real: Unleashing Large Multimodal Model’s Ability for Painting 3D Objects with Realistic Materials |
Ye Fang et.al. |
2404.16829 |
null |
2024-04-25 |
V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection |
Xuanyu Zhang et.al. |
2404.16824 |
null |
2024-04-25 |
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites |
Zhe Chen et.al. |
2404.16821 |
link |
2024-04-25 |
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages |
Harman Singh et.al. |
2404.16816 |
link |
2024-04-26 |
Make Your LLM Fully Utilize the Context |
Shengnan An et.al. |
2404.16811 |
link |
2024-04-25 |
Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning |
Tianhui Zhang et.al. |
2404.16807 |
null |
2024-04-25 |
AAPL: Adding Attributes to Prompt Learning for Vision-Language Models |
Gahyeon Kim et.al. |
2404.16804 |
link |
2024-04-25 |
Weak-to-Strong Extrapolation Expedites Alignment |
Chujie Zheng et.al. |
2404.16792 |
link |
2024-04-25 |
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension |
Bohao Li et.al. |
2404.16790 |
link |
2024-04-25 |
Continual Learning of Large Language Models: A Comprehensive Survey |
Haizhou Shi et.al. |
2404.16789 |
link |
2024-04-25 |
Modeling Selective Feature Attention for Representation-based Siamese Text Matching |
Jianxiang Zang et.al. |
2404.16776 |
link |
2024-04-25 |
REBEL: Reinforcement Learning via Regressing Relative Rewards |
Zhaolin Gao et.al. |
2404.16767 |
link |
2024-04-25 |
Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model |
Runzhe Zhan et.al. |
2404.16766 |
null |
2024-04-25 |
RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis |
Xiaoman Zhang et.al. |
2404.16754 |
null |
2024-04-25 |
Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class |
Mazda Moayeri et.al. |
2404.16717 |
null |
2024-04-25 |
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding |
Mostafa Elhoushi et.al. |
2404.16710 |
null |
2024-04-25 |
Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents |
Giorgio Piatti et.al. |
2404.16698 |
link |
2024-04-25 |
Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 |
Lydia Uhler et.al. |
2404.16692 |
null |
2024-04-25 |
EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning |
Hongxia Xie et.al. |
2404.16670 |
link |
2024-04-24 |
Hybrid LLM/Rule-based Approaches to Business Insights Generation from Structured Data |
Aliaksei Vertsel et.al. |
2404.15604 |
null |
2024-04-24 |
ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction |
Henry Peng Zou et.al. |
2404.15592 |
link |
2024-04-24 |
MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis |
Jiaxin Zhuang et.al. |
2404.15580 |
null |
2024-04-24 |
Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? |
Hossein Salami et.al. |
2404.15578 |
null |
2024-04-24 |
Retrieval Head Mechanistically Explains Long-Context Factuality |
Wenhao Wu et.al. |
2404.15574 |
link |
2024-04-23 |
PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models |
Shashi Kant Gupta et.al. |
2404.15549 |
null |
2024-04-23 |
BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis |
Shuhang Lin et.al. |
2404.15532 |
link |
2024-04-23 |
Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models |
Mihir Parmar et.al. |
2404.15522 |
link |
2024-04-23 |
Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval |
Young Kyun Jang et.al. |
2404.15516 |
null |
2024-04-23 |
ToM-LM: Delegating Theory Of Mind Reasoning to External Symbolic Executors in Large Language Models |
Weizhi Tang et.al. |
2404.15515 |
null |
2024-04-23 |
IryoNLP at MEDIQA-CORR 2024: Tackling the Medical Error Detection & Correction Task On the Shoulders of Medical Agents |
Jean-Philippe Corbeil et.al. |
2404.15488 |
link |
2024-04-23 |
Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance |
Het Patel et.al. |
2404.15485 |
null |
2024-04-23 |
Can Large Language Models Learn the Physics of Metamaterials? An Empirical Study with ChatGPT |
Darui Lu et.al. |
2404.15458 |
null |
2024-04-23 |
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference |
João Monteiro et.al. |
2404.15420 |
null |
2024-04-23 |
Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs |
Davide Caffagni et.al. |
2404.15406 |
null |
2024-04-23 |
Aligning LLM Agents by Learning Latent Preference from User Edits |
Ge Gao et.al. |
2404.15269 |
link |
2024-04-23 |
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts |
Yifeng Ding et.al. |
2404.15247 |
link |
2024-04-23 |
CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies |
Weiyan Shi et.al. |
2404.15238 |
link |
2024-04-23 |
Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models |
Aidan Z. H. Yang et.al. |
2404.15236 |
null |
2024-04-23 |
Re-Thinking Inverse Graphics With Large Language Models |
Peter Kulits et.al. |
2404.15228 |
null |
2024-04-23 |
Does Instruction Tuning Make LLMs More Consistent? |
Constanza Fierro et.al. |
2404.15206 |
null |
2024-04-23 |
Setting up the Data Printer with Improved English to Ukrainian Machine Translation |
Yurii Paniv et.al. |
2404.15196 |
link |
2024-04-23 |
Regressive Side Effects of Training Language Models to Mimic Student Misconceptions |
Shashank Sonkar et.al. |
2404.15156 |
null |
2024-04-23 |
Bias patterns in the application of LLMs for clinical decision support: A comprehensive study |
Raphael Poulain et.al. |
2404.15149 |
link |
2024-04-23 |
Rethinking LLM Memorization through the Lens of Adversarial Compression |
Avi Schwarzschild et.al. |
2404.15146 |
null |
2024-04-23 |
MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning |
Sunan He et.al. |
2404.15127 |
null |
2024-04-23 |
Identifying Fairness Issues in Automatically Generated Testing Content |
Kevin Stowe et.al. |
2404.15104 |
null |
2024-04-23 |
Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation |
Xun Wu et.al. |
2404.15100 |
null |
2024-04-23 |
Detection of circular permutations by Protein Language Models |
Yue Hu et.al. |
2404.15087 |
link |
2024-04-23 |
Multi-Head Mixture-of-Experts |
Xun Wu et.al. |
2404.15045 |
null |
2024-04-23 |
TAXI: Evaluating Categorical Knowledge Editing for Language Models |
Derek Powell et.al. |
2404.15004 |
link |
2024-04-23 |
Transformers Can Represent $n$-gram Language Models |
Anej Svete et.al. |
2404.14994 |
null |
2024-04-23 |
A Short Review for Ontology Learning from Text: Stride from Shallow Learning, Deep Learning to Large Language Models Trend |
Rick Du et.al. |
2404.14991 |
null |
2024-04-23 |
$\texttt{MiniMol}$: A Parameter-Efficient Foundation Model for Molecular Learning |
Kerstin Kläser et.al. |
2404.14986 |
null |
2024-04-23 |
Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case |
Muhammad Asif Auyb et.al. |
2404.14977 |
null |
2024-04-22 |
AutoAD III: The Prequel – Back to the Pixels |
Tengda Han et.al. |
2404.14412 |
null |
2024-04-22 |
SpaceByte: Towards Deleting Tokenization from Large Language Modeling |
Kevin Slagle et.al. |
2404.14408 |
link |
2024-04-22 |
RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios? |
Adrian de Wynter et.al. |
2404.14397 |
link |
2024-04-22 |
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation |
Yuying Ge et.al. |
2404.14396 |
link |
2024-04-22 |
PARAMANU-GANITA: Language Model with Mathematical Capabilities |
Mitodru Niyogi et.al. |
2404.14395 |
null |
2024-04-22 |
A Multimodal Automated Interpretability Agent |
Tamar Rott Shaham et.al. |
2404.14394 |
null |
2024-04-22 |
A Survey on Self-Evolution of Large Language Models |
Zhengwei Tao et.al. |
2404.14387 |
link |
2024-04-22 |
Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph |
Xiaochen Kev Gao et.al. |
2404.14372 |
link |
2024-04-23 |
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data |
Fahim Tajwar et.al. |
2404.14367 |
link |
2024-04-22 |
Better Synthetic Data by Retrieving and Transforming Existing Datasets |
Saumya Gandhi et.al. |
2404.14361 |
link |
2024-04-22 |
Rethinking Legal Compliance Automation: Opportunities with Large Language Models |
Shabnam Hassani et.al. |
2404.14356 |
null |
2024-04-22 |
Calc-CMU at SemEval-2024 Task 7: Pre-Calc – Learning to Use the Calculator Improves Numeracy in Language Models |
Vishruth Veerendranath et.al. |
2404.14355 |
link |
2024-04-22 |
Automated Long Answer Grading with RiceChem Dataset |
Shashank Sonkar et.al. |
2404.14316 |
link |
2024-04-22 |
Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels |
Jan-Philipp Fränken et.al. |
2404.14313 |
link |
2024-04-22 |
Explaining Arguments’ Strength: Unveiling the Role of Attacks and Supports (Technical Report) |
Xiang Yin et.al. |
2404.14304 |
link |
2024-04-22 |
Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits |
Shashank Sonkar et.al. |
2404.14301 |
null |
2024-04-22 |
Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach |
Yao Wan et.al. |
2404.14296 |
link |
2024-04-22 |
A Survey on Efficient Inference for Large Language Models |
Zixuan Zhou et.al. |
2404.14294 |
null |
2024-04-22 |
LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots |
Dongge Han et.al. |
2404.14285 |
null |
2024-04-22 |
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback |
Wenyi Xiao et.al. |
2404.14233 |
null |
2024-04-19 |
MoVA: Adapting Mixture of Vision Experts to Multimodal Context |
Zhuofan Zong et.al. |
2404.13046 |
link |
2024-04-19 |
Unified Scene Representation and Reconstruction for 3D Large Language Models |
Tao Chu et.al. |
2404.13044 |
null |
2024-04-19 |
Data Alignment for Zero-Shot Concept Generation in Dermatology AI |
Soham Gadgil et.al. |
2404.13043 |
null |
2024-04-19 |
Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs |
Biyang Guo et.al. |
2404.13033 |
link |
2024-04-19 |
When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering |
Stephen Choi et.al. |
2404.13028 |
null |
2024-04-19 |
Stronger Random Baselines for In-Context Learning |
Gregory Yauney et.al. |
2404.13020 |
link |
2024-04-19 |
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models |
Chuofan Ma et.al. |
2404.13013 |
null |
2024-04-19 |
Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs |
Clemencia Siro et.al. |
2404.12994 |
link |
2024-04-19 |
FineRec: Exploring Fine-grained Sequential Recommendation |
Xiaokun Zhang et.al. |
2404.12975 |
link |
2024-04-19 |
Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models |
Yian Li et.al. |
2404.12966 |
null |
2024-04-19 |
Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction |
Qinyuan Wu et.al. |
2404.12957 |
null |
2024-04-19 |
Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models |
Konstantinos Vilouras et.al. |
2404.12920 |
null |
2024-04-19 |
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models |
Zhenyang Ni et.al. |
2404.12916 |
link |
2024-04-19 |
Large Language Models for Networking: Workflow, Advances and Challenges |
Chang Liu et.al. |
2404.12901 |
null |
2024-04-19 |
Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning |
Ahmed Elshabrawy et.al. |
2404.12897 |
null |
2024-04-19 |
Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation |
Guanhua Chen et.al. |
2404.12879 |
null |
2024-04-19 |
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency |
Zhaodonghui Li et.al. |
2404.12872 |
link |
2024-04-19 |
How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning? |
Yang Luo et.al. |
2404.12866 |
null |
2024-04-19 |
Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation |
Yilong Chen et.al. |
2404.12861 |
null |
2024-04-19 |
TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages |
Aleksei Dorkin et.al. |
2404.12845 |
null |
2024-04-18 |
BLINK: Multimodal Large Language Models Can See but Not Perceive |
Xingyu Fu et.al. |
2404.12390 |
null |
2024-04-18 |
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models |
Aitor Ormazabal et.al. |
2404.12387 |
null |
2024-04-18 |
MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale |
Xiaotang Gai et.al. |
2404.12372 |
null |
2024-04-18 |
When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes |
Asaf Yehudai et.al. |
2404.12365 |
link |
2024-04-18 |
From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function |
Rafael Rafailov et.al. |
2404.12358 |
null |
2024-04-18 |
Towards a Foundation Model for Partial Differential Equation: Multi-Operator Learning and Extrapolation |
Jingmin Sun et.al. |
2404.12355 |
link |
2024-04-18 |
V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning |
Hang Hua et.al. |
2404.12353 |
null |
2024-04-18 |
Evaluating AI for Law: Bridging the Gap with Open-Source Solutions |
Rohan Bhambhoria et.al. |
2404.12349 |
null |
2024-04-18 |
Large Language Models in Targeted Sentiment Analysis |
Nicolay Rusnachenko et.al. |
2404.12342 |
link |
2024-04-18 |
Normative Requirements Operationalization with Large Language Models |
Nick Feng et.al. |
2404.12335 |
null |
2024-04-18 |
Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment |
Zhaofeng Wu et.al. |
2404.12318 |
null |
2024-04-18 |
Large Language Models for Synthetic Participatory Planning of Shared Automated Electric Mobility Systems |
Jiangbo Yu et.al. |
2404.12317 |
null |
2024-04-18 |
Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair |
Yusuke Sakai et.al. |
2404.12299 |
null |
2024-04-18 |
Augmenting emotion features in irony detection with Large language modeling |
Yucheng Lin et.al. |
2404.12291 |
null |
2024-04-18 |
Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery |
Yona Falinie A. Gaus et.al. |
2404.12285 |
null |
2024-04-18 |
Enhancing Embedding Performance through Large Language Model-based Text Enrichment and Rewriting |
Nicholas Harris et.al. |
2404.12283 |
null |
2024-04-18 |
Advancing the Robustness of Large Language Models through Self-Denoised Smoothing |
Jiabao Ji et.al. |
2404.12274 |
link |
2024-04-18 |
FedEval-LLM: Federated Evaluation of Large Language Models on Downstream Tasks with Collective Wisdom |
Yuanqin He et.al. |
2404.12273 |
null |
2024-04-18 |
Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences |
Shreya Shankar et.al. |
2404.12272 |
null |
2024-04-18 |
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM |
Michelle S. Lam et.al. |
2404.12259 |
link |
2024-04-17 |
Private federated discovery of out-of-vocabulary words for Gboard |
Ziteng Sun et.al. |
2404.11607 |
null |
2024-04-17 |
VG4D: Vision-Language Model Goes 4D Video Recognition |
Zhichao Deng et.al. |
2404.11605 |
link |
2024-04-17 |
A Deep Dive into Large Language Models for Automated Bug Localization and Repair |
Soneya Binta Hossain et.al. |
2404.11595 |
null |
2024-04-17 |
Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding |
Zezhong Fan et.al. |
2404.11589 |
null |
2024-04-17 |
LLMTune: Accelerate Database Knob Tuning with Large Language Models |
Xinmei Huang et.al. |
2404.11581 |
link |
2024-04-17 |
On the Scalability of GNNs for Molecular Graphs |
Maciej Sypetkowski et.al. |
2404.11568 |
null |
2024-04-17 |
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation |
Kuan-Chieh et.al. |
2404.11565 |
null |
2024-04-17 |
Quantifying Multilingual Performance of Large Language Models Across Languages |
Zihao Li et.al. |
2404.11553 |
null |
2024-04-17 |
Evaluating Span Extraction in Generative Paradigm: A Reflection on Aspect-Based Sentiment Analysis |
Soyoung Yang et.al. |
2404.11539 |
null |
2024-04-17 |
FedPFT: Federated Proxy Fine-Tuning of Foundation Models |
Zhaopeng Peng et.al. |
2404.11536 |
link |
2024-04-17 |
Select and Reorder: A Novel Approach for Neural Sign Language Production |
Harry Walsh et.al. |
2404.11532 |
null |
2024-04-17 |
Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization |
Costas Mavromatis et.al. |
2404.11531 |
link |
2024-04-17 |
Embedding Privacy in Computational Social Science and Artificial Intelligence Research |
Keenan Jones et.al. |
2404.11515 |
null |
2024-04-17 |
Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models |
Yushuo Chen et.al. |
2404.11502 |
link |
2024-04-17 |
Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models |
Yue Zhou et.al. |
2404.11500 |
link |
2024-04-18 |
Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent |
Wei Chen et.al. |
2404.11459 |
null |
2024-04-17 |
Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models |
Sunhao Dai et.al. |
2404.11457 |
link |
2024-04-17 |
AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts |
Meng Jiang et.al. |
2404.11449 |
null |
2024-04-17 |
Open-Ended Wargames with Large Language Models |
Daniel P. Hogan et.al. |
2404.11446 |
link |
2024-04-17 |
DUPE: Detection Undermining via Prompt Engineering for Deepfake Text |
James Weichert et.al. |
2404.11408 |
null |
2024-04-16 |
Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback |
Qiwei Di et.al. |
2404.10776 |
null |
2024-04-16 |
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation |
Hongxin Zhang et.al. |
2404.10775 |
null |
2024-04-16 |
Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification |
Yu-Yang Li et.al. |
2404.10757 |
link |
2024-04-16 |
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study |
Shusheng Xu et.al. |
2404.10719 |
null |
2024-04-16 |
Dual Modalities of Text: Visual and Textual Generative Pre-training |
Yekun Chai et.al. |
2404.10710 |
null |
2024-04-16 |
Question Difficulty Ranking for Multiple-Choice Reading Comprehension |
Vatsal Raina et.al. |
2404.10704 |
null |
2024-04-16 |
An empirical study on code review activity prediction in practice |
Doriane Olewicki et.al. |
2404.10703 |
null |
2024-04-16 |
Automating REST API Postman Test Cases Using LLM |
S Deepika Sri et.al. |
2404.10678 |
null |
2024-04-16 |
Self-playing Adversarial Language Game Enhances LLM Reasoning |
Pengyu Cheng et.al. |
2404.10642 |
link |
2024-04-16 |
HLAT: High-quality Large Language Model Pre-trained on AWS Trainium |
Haozheng Fan et.al. |
2404.10630 |
null |
2024-04-16 |
Private Attribute Inference from Images with Vision-Language Models |
Batuhan Tömekçe et.al. |
2404.10618 |
null |
2024-04-16 |
Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases |
Yanze Li et.al. |
2404.10595 |
null |
2024-04-16 |
Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training |
Masanori Hirano et.al. |
2404.10555 |
null |
2024-04-16 |
Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning |
Xiao Wang et.al. |
2404.10552 |
null |
2024-04-16 |
Capturing the Macroscopic Behaviour of Molecular Dynamics with Membership Functions |
Alexander Sikorski et.al. |
2404.10523 |
link |
2024-04-16 |
CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity |
Moshe Berchansky et.al. |
2404.10513 |
null |
2024-04-16 |
White Men Lead, Black Women Help: Uncovering Gender, Racial, and Intersectional Bias in Language Agency |
Yixin Wan et.al. |
2404.10508 |
null |
2024-04-16 |
Self-Supervised Visual Preference Alignment |
Ke Zhu et.al. |
2404.10501 |
link |
2024-04-16 |
When Emotional Stimuli meet Prompt Designing: An Auto-Prompt Graphical Paradigm |
Chenggian Ma et.al. |
2404.10500 |
null |
2024-04-16 |
Spiral of Silences: How is Large Language Model Killing Information Retrieval? – A Case Study on Open Domain Question Answering |
Xiaoyang Chen et.al. |
2404.10496 |
link |
2024-04-15 |
KG-CTG: Citation Generation through Knowledge Graph-guided Large Language Models |
Avinash Anand et.al. |
2404.09763 |
null |
2024-04-15 |
Resilience of Large Language Models for Noisy Instructions |
Bin Wang et.al. |
2404.09754 |
null |
2024-04-15 |
Personalized Collaborative Fine-Tuning for On-Device Large Language Models |
Nicolas Wagner et.al. |
2404.09753 |
link |
2024-04-15 |
AMPCliff: quantitative definition and benchmarking of activity cliffs in antimicrobial peptides |
Kewei Li et.al. |
2404.09738 |
link |
2024-04-15 |
Quantization of Large Language Models with an Overdetermined Basis |
Daniil Merkulov et.al. |
2404.09737 |
null |
2024-04-15 |
Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models |
Ziwei Luo et.al. |
2404.09732 |
link |
2024-04-15 |
Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model |
Hyunsoo Cho et.al. |
2404.09717 |
null |
2024-04-15 |
Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction |
David Sobrín-Hidalgo et.al. |
2404.09705 |
null |
2024-04-15 |
Generative AI for Game Theory-based Mobile Networking |
Long He et.al. |
2404.09699 |
null |
2024-04-15 |
Are Large Language Models Reliable Argument Quality Annotators? |
Nailia Mirzakhmedova et.al. |
2404.09696 |
link |
2024-04-15 |
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models |
Guangyan Li et.al. |
2404.09695 |
null |
2024-04-15 |
Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation |
Juhwan Choi et.al. |
2404.09682 |
null |
2024-04-15 |
Learn Your Reference Model for Real Good Alignment |
Alexey Gorbatovski et.al. |
2404.09656 |
null |
2024-04-15 |
Do LLMs Understand Visual Anomalies? Uncovering LLM Capabilities in Zero-shot Anomaly Detection |
Jiaqi Zhu et.al. |
2404.09654 |
null |
2024-04-15 |
Bridging Vision and Language Spaces with Assignment Prediction |
Jungin Park et.al. |
2404.09632 |
link |
2024-04-15 |
AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception |
Yipo Huang et.al. |
2404.09624 |
link |
2024-04-15 |
UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark |
Zhaokun Zhou et.al. |
2404.09619 |
null |
2024-04-15 |
A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions |
Pengfei Liu et.al. |
2404.09606 |
link |
2024-04-15 |
Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction |
Zepeng Ding et.al. |
2404.09593 |
null |
2024-04-15 |
Modelling Language |
Jumbly Grindrod et.al. |
2404.09579 |
null |
2024-04-15 |
Transformers, Contextualism, and Polysemy |
Jumbly Grindrod et.al. |
2404.09577 |
null |
2024-04-15 |
Large language models and linguistic intentionality |
Jumbly Grindrod et.al. |
2404.09576 |
null |
2024-04-12 |
Probing the 3D Awareness of Visual Foundation Models |
Mohamed El Banani et.al. |
2404.08636 |
link |
2024-04-12 |
Pre-training Small Base LMs with Fewer Tokens |
Sunny Sanyal et.al. |
2404.08634 |
link |
2024-04-12 |
FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models |
Yanting Wang et.al. |
2404.08631 |
link |
2024-04-12 |
Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation |
Yanhao Zheng et.al. |
2404.08603 |
link |
2024-04-12 |
Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts |
Övgü Özdemir et.al. |
2404.08589 |
link |
2024-04-12 |
Pathological Primitive Segmentation Based on Visual Foundation Model with Zero-Shot Mask Generation |
Abu Bakor Hayat Arnob et.al. |
2404.08584 |
link |
2024-04-12 |
FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation |
Riza Velioglu et.al. |
2404.08582 |
link |
2024-04-12 |
Lossy Image Compression with Foundation Diffusion Models |
Lucas Relic et.al. |
2404.08580 |
null |
2024-04-12 |
Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation |
Hanlin Tian et.al. |
2404.08570 |
link |
2024-04-12 |
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs |
Shreyas Chaudhari et.al. |
2404.08555 |
null |
2024-04-12 |
Memory Traces: Are Transformers Tulving Machines? |
Jean-Marie Chauvet et.al. |
2404.08543 |
null |
2024-04-12 |
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward |
Xuan Xie et.al. |
2404.08517 |
null |
2024-04-12 |
ChatGPT and general-purpose AI count fruits in pictures surprisingly well |
Konlavach Mengsuwan et.al. |
2404.08515 |
null |
2024-04-12 |
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction |
Haoran Qiu et.al. |
2404.08509 |
link |
2024-04-12 |
LaSagnA: Language-based Segmentation Assistant for Complex Queries |
Cong Wei et.al. |
2404.08506 |
link |
2024-04-12 |
Strategic Interactions between Large Language Models-based Agents in Beauty Contests |
Siting Lu et.al. |
2404.08492 |
null |
2024-04-12 |
Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation |
Haozhe Zhao et.al. |
2404.08491 |
link |
2024-04-12 |
Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian |
Stefano De Paoli et.al. |
2404.08488 |
null |
2024-04-12 |
Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task |
Hassan Ali et.al. |
2404.08424 |
null |
2024-04-12 |
Adapting the Segment Anything Model During Usage in Novel Situations |
Robin Schön et.al. |
2404.08421 |
null |
2024-04-11 |
OpenBias: Open-set Bias Detection in Text-to-Image Generative Models |
Moreno D’Incà et.al. |
2404.07990 |
link |
2024-04-11 |
Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding |
Yiwen Tang et.al. |
2404.07989 |
link |
2024-04-11 |
Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning |
Simon Schrodi et.al. |
2404.07983 |
null |
2024-04-11 |
Language Imbalance Can Boost Cross-lingual Generalisation |
Anton Schäfer et.al. |
2404.07982 |
link |
2024-04-11 |
Manipulating Large Language Models to Increase Product Visibility |
Aounon Kumar et.al. |
2404.07981 |
link |
2024-04-11 |
LLoCO: Learning Long Contexts Offline |
Sijun Tan et.al. |
2404.07979 |
link |
2024-04-11 |
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models |
Haotian Zhang et.al. |
2404.07973 |
null |
2024-04-11 |
Rho-1: Not All Tokens Are What You Need |
Zhenghao Lin et.al. |
2404.07965 |
link |
2024-04-11 |
On Unified Prompt Tuning for Request Quality Assurance in Public Code Review |
Xinyu Chen et.al. |
2404.07942 |
null |
2024-04-11 |
Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation |
Jinkyung Park et.al. |
2404.07926 |
null |
2024-04-11 |
LaVy: Vietnamese Multimodal Large Language Model |
Chi Tran et.al. |
2404.07922 |
link |
2024-04-11 |
AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs |
Zeyi Liao et.al. |
2404.07921 |
link |
2024-04-11 |
DesignQA: A Multimodal Benchmark for Evaluating Large Language Models’ Understanding of Engineering Documentation |
Anna C. Doris et.al. |
2404.07917 |
link |
2024-04-11 |
HGRN2: Gated Linear RNNs with State Expansion |
Zhen Qin et.al. |
2404.07904 |
link |
2024-04-11 |
High-Dimension Human Value Representation in Large Language Models |
Samuel Cahyawijaya et.al. |
2404.07900 |
link |
2024-04-11 |
Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations |
Dayeon Ki et.al. |
2404.07851 |
link |
2024-04-11 |
On Training Data Influence of GPT Models |
Qingyi Liu et.al. |
2404.07840 |
link |
2024-04-11 |
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models |
Aleksandar Botev et.al. |
2404.07839 |
link |
2024-04-11 |
Streamlined Photoacoustic Image Processing with Foundation Models: A Training-Free Solution |
Handi Deng et.al. |
2404.07833 |
null |
2024-04-11 |
Heron-Bench: A Benchmark for Evaluating Vision Language Models in Japanese |
Yuichi Inoue et.al. |
2404.07824 |
link |
2024-04-10 |
BRAVE: Broadening the visual encoding of vision-language models |
Oğuzhan Fatih Kar et.al. |
2404.07204 |
null |
2024-04-10 |
UMBRAE: Unified Multimodal Decoding of Brain Signals |
Weihao Xia et.al. |
2404.07202 |
link |
2024-04-10 |
Scaling Laws for Data Filtering – Data Curation cannot be Compute Agnostic |
Sachin Goyal et.al. |
2404.07177 |
link |
2024-04-10 |
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention |
Tsendsuren Munkhdalai et.al. |
2404.07143 |
null |
2024-04-10 |
Open reaction-diffusion systems: bridging probabilistic theory across scales |
Mauricio J. del Razo et.al. |
2404.07119 |
null |
2024-04-10 |
Continuous Language Model Interpolation for Dynamic and Controllable Text Generation |
Sara Kangaslahti et.al. |
2404.07117 |
link |
2024-04-11 |
From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications |
Yongqiang Ma et.al. |
2404.07108 |
null |
2024-04-10 |
Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs |
Bowen Jin et.al. |
2404.07103 |
link |
2024-04-10 |
Dynamic Generation of Personalities with Large Language Models |
Jianzhi Liu et.al. |
2404.07084 |
link |
2024-04-10 |
VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning |
Alexandros Xenos et.al. |
2404.07078 |
link |
2024-04-10 |
Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? |
Mingyu Jin et.al. |
2404.07066 |
link |
2024-04-10 |
Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study |
Alessandro Stolfo et.al. |
2404.07060 |
null |
2024-04-10 |
Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation |
Elisa Sanchez-Bayona et.al. |
2404.07053 |
link |
2024-04-10 |
ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling |
Ege Özsoy et.al. |
2404.07031 |
null |
2024-04-10 |
Improving Language Model Reasoning with Self-motivated Learning |
Yunlong Feng et.al. |
2404.07017 |
null |
2024-04-10 |
A Mathematical Theory for Learning Semantic Languages by Abstract Learners |
Kuo-Yu Liao et.al. |
2404.07009 |
null |
2024-04-10 |
WordDecipher: Enhancing Digital Workspace Communication with Explainable AI for Non-native English Speakers |
Yuexi Chen et.al. |
2404.07005 |
null |
2024-04-10 |
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models |
Igor Tufanov et.al. |
2404.07004 |
null |
2024-04-10 |
Event Grounded Criminal Court View Generation with Cooperative (Large) Language Models |
Linan Yue et.al. |
2404.07001 |
link |
2024-04-10 |
Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study |
Hongru Du et.al. |
2404.06962 |
link |
2024-04-09 |
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD |
Xiaoyi Dong et.al. |
2404.06512 |
link |
2024-04-09 |
Can Feedback Enhance Semantic Grounding in Large Vision-Language Models? |
Yuan-Hong Liao et.al. |
2404.06510 |
null |
2024-04-09 |
On the Effect of (Near) Duplicate Subwords in Language Modelling |
Anton Schäfer et.al. |
2404.06508 |
link |
2024-04-09 |
Pitfalls of Conversational LLMs on News Debiasing |
Ipek Baris Schlicht et.al. |
2404.06488 |
null |
2024-04-10 |
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks |
Chonghua Wang et.al. |
2404.06480 |
link |
2024-04-10 |
Text-Based Reasoning About Vector Graphics |
Zhenhailong Wang et.al. |
2404.06479 |
null |
2024-04-09 |
Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models |
Zihan Fang et.al. |
2404.06448 |
null |
2024-04-09 |
Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems |
Kunal Garg et.al. |
2404.06413 |
null |
2024-04-09 |
AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents |
Luca Gioacchini et.al. |
2404.06411 |
link |
2024-04-09 |
Take a Look at it! Rethinking How to Evaluate Language Model Jailbreak |
Hongyu Cai et.al. |
2404.06407 |
link |
2024-04-09 |
Apprentices to Research Assistants: Advancing Research with Large Language Models |
M. Namvarpour et.al. |
2404.06404 |
null |
2024-04-09 |
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies |
Shengding Hu et.al. |
2404.06395 |
link |
2024-04-09 |
MuPT: A Generative Symbolic Music Pretrained Transformer |
Xingwei Qu et.al. |
2404.06393 |
null |
2024-04-09 |
Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis |
Mikel Zubillaga et.al. |
2404.06392 |
null |
2024-04-09 |
Latent Distance Guided Alignment Training for Large Language Models |
Haotian Luo et.al. |
2404.06390 |
null |
2024-04-09 |
Model Generation from Requirements with LLMs: an Exploratory Study |
Alessio Ferrari et.al. |
2404.06371 |
null |
2024-04-09 |
Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python |
Valdecy Pereira et.al. |
2404.06370 |
link |
2024-04-09 |
VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs |
Yi Gui et.al. |
2404.06369 |
null |
2024-04-09 |
ClinLinker: Medical Entity Linking of Clinical Concept Mentions in Spanish |
Fernando Gallego et.al. |
2404.06367 |
null |
2024-04-09 |
Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation |
Sidra Aleem et.al. |
2404.06362 |
link |
2024-04-08 |
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding |
Bo He et.al. |
2404.05726 |
link |
2024-04-08 |
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs |
Keen You et.al. |
2404.05719 |
null |
2024-04-08 |
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding |
Ahmad Idrissi-Yaghir et.al. |
2404.05694 |
null |
2024-04-08 |
Evaluating Mathematical Reasoning Beyond Accuracy |
Shijie Xia et.al. |
2404.05692 |
link |
2024-04-08 |
Retrieval-Augmented Open-Vocabulary Object Detection |
Jooyeon Kim et.al. |
2404.05687 |
link |
2024-04-08 |
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation |
Kunpeng Song et.al. |
2404.05674 |
link |
2024-04-08 |
CoReS: Orchestrating the Dance of Reasoning and Segmentation |
Xiaoyi Bao et.al. |
2404.05673 |
null |
2024-04-08 |
Fighting crime with Transformers: Empirical analysis of address parsing methods in payment data |
Haitham Hammami et.al. |
2404.05632 |
link |
2024-04-08 |
LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking |
Faren Yan et.al. |
2404.05624 |
null |
2024-04-08 |
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning |
Matteo Farina et.al. |
2404.05621 |
link |
2024-04-08 |
SpeechAlign: Aligning Speech Generation to Human Preferences |
Dong Zhang et.al. |
2404.05600 |
link |
2024-04-08 |
MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering |
Iñigo Alonso et.al. |
2404.05590 |
null |
2024-04-08 |
Enhancing Software Related Information Extraction with Generative Language Models through Single-Choice Question Answering |
Wolfgang Otto et.al. |
2404.05587 |
null |
2024-04-08 |
Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model |
Yue-Hua Han et.al. |
2404.05583 |
null |
2024-04-08 |
360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System |
Shen Gao et.al. |
2404.05569 |
null |
2024-04-08 |
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models |
Bowen Pan et.al. |
2404.05567 |
null |
2024-04-08 |
Chinese Sequence Labeling with Semi-Supervised Boundary-Aware Language Model Pre-training |
Longhui Zhang et.al. |
2404.05560 |
link |
2024-04-08 |
Evaluating Interventional Reasoning Capabilities of Large Language Models |
Tejas Kasetty et.al. |
2404.05545 |
null |
2024-04-08 |
OPSD: an Offensive Persian Social media Dataset and its baseline evaluations |
Mehran Safayani et.al. |
2404.05540 |
null |
2024-04-08 |
Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data |
Tim Baumgärtner et.al. |
2404.05530 |
null |
2024-04-05 |
Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2) |
Michael Saxon et.al. |
2404.04251 |
link |
2024-04-05 |
Physical Property Understanding from Language-Embedded Feature Fields |
Albert J. Zhai et.al. |
2404.04242 |
null |
2024-04-05 |
Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents |
Harsh Kohli et.al. |
2404.04237 |
null |
2024-04-05 |
player2vec: A Language Modeling Approach to Understand Player Behavior in Games |
Tianze Wang et.al. |
2404.04234 |
null |
2024-04-05 |
Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation |
Ji-Jia Wu et.al. |
2404.04231 |
link |
2024-04-05 |
Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation |
Tong Su et.al. |
2404.04212 |
null |
2024-04-05 |
Social Skill Training with Large Language Models |
Diyi Yang et.al. |
2404.04204 |
null |
2024-04-05 |
Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text? |
Ilya Ilyankou et.al. |
2404.04169 |
null |
2024-04-05 |
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model |
Xinrun Du et.al. |
2404.04167 |
null |
2024-04-05 |
Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval |
João Coelho et.al. |
2404.04163 |
null |
2024-04-05 |
BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models |
Jacek Wiland et.al. |
2404.04113 |
link |
2024-04-05 |
Large language models as oracles for instantiating ontologies with domain-specific knowledge |
Giovanni Ciatto et.al. |
2404.04108 |
link |
2024-04-05 |
Robust Preference Optimization with Provable Noise Tolerance for LLMs |
Xize Liang et.al. |
2404.04102 |
null |
2024-04-05 |
Label Propagation for Zero-shot Classification with Vision-Language Models |
Vladan Stojnić et.al. |
2404.04072 |
link |
2024-04-05 |
Assessing the quality of information extraction |
Filip Seitl et.al. |
2404.04068 |
null |
2024-04-05 |
CLUE: A Clinical Language Understanding Evaluation for LLMs |
Amin Dada et.al. |
2404.04067 |
link |
2024-04-05 |
VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots |
Akhil Padmanabha et.al. |
2404.04066 |
null |
2024-04-05 |
A Comparison of Methods for Evaluating Generative IR |
Negar Arabzadeh et.al. |
2404.04044 |
link |
2024-04-05 |
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer |
Hele-Andra Kuulmets et.al. |
2404.04042 |
link |
2024-04-05 |
Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds |
Annerose Eichel et.al. |
2404.04031 |
link |
2024-04-04 |
OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views |
Francis Engelmann et.al. |
2404.03650 |
null |
2024-04-04 |
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent |
Hanyu Lai et.al. |
2404.03648 |
link |
2024-04-04 |
Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra |
Darioush Kevian et.al. |
2404.03647 |
null |
2024-04-04 |
Locating and Editing Factual Associations in Mamba |
Arnab Sen Sharma et.al. |
2404.03646 |
link |
2024-04-04 |
Training LLMs over Neurally Compressed Text |
Brian Lester et.al. |
2404.03626 |
null |
2024-04-04 |
Standardizing Knowledge Engineering Practices with a Reference Architecture |
Bradley P. Allen et.al. |
2404.03624 |
null |
2024-04-04 |
Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph |
Marco Bronzini et.al. |
2404.03623 |
null |
2024-04-04 |
Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models |
Wenshan Wu et.al. |
2404.03622 |
null |
2024-04-04 |
DeViDe: Faceted medical knowledge for improved medical vision-language pre-training |
Haozhe Luo et.al. |
2404.03618 |
null |
2024-04-04 |
Sailor: Open Language Models for South-East Asia |
Longxu Dou et.al. |
2404.03608 |
link |
2024-04-04 |
Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization |
Aniruddha Nrusimha et.al. |
2404.03605 |
link |
2024-04-04 |
Evaluating LLMs at Detecting Errors in LLM Responses |
Ryo Kamoi et.al. |
2404.03602 |
link |
2024-04-04 |
Intent Detection and Entity Extraction from BioMedical Literature |
Ankan Mullick et.al. |
2404.03598 |
link |
2024-04-04 |
ReFT: Representation Finetuning for Language Models |
Zhengxuan Wu et.al. |
2404.03592 |
link |
2024-04-04 |
SemGrasp: Semantic Grasp Generation via Language Aligned Discretization |
Kailin Li et.al. |
2404.03590 |
null |
2024-04-04 |
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models |
Yantao Liu et.al. |
2404.03577 |
link |
2024-04-04 |
Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity |
Jake Varley et.al. |
2404.03570 |
null |
2024-04-04 |
Personalized LLM Response Generation with Parameterized Memory Injection |
Kai Zhang et.al. |
2404.03565 |
null |
2024-04-04 |
Select and Summarize: Scene Saliency for Movie Script Summarization |
Rohit Saxena et.al. |
2404.03561 |
link |
2024-04-04 |
How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes |
Harmon Bhasin et.al. |
2404.03558 |
link |
2024-04-03 |
ALOHa: A New Measure for Hallucination in Captioning Models |
Suzanne Petryk et.al. |
2404.02904 |
null |
2024-04-03 |
MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment |
Duygu Ceylan et.al. |
2404.02899 |
null |
2024-04-03 |
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline |
Yifan Xu et.al. |
2404.02893 |
link |
2024-04-03 |
MODNO: Multi Operator Learning With Distributed Neural Operators |
Zecheng Zhang et.al. |
2404.02892 |
null |
2024-04-03 |
Linear Attention Sequence Parallelism |
Weigao Sun et.al. |
2404.02882 |
link |
2024-04-03 |
Integrating Explanations in Learning LTL Specifications from Demonstrations |
Ashutosh Gupta et.al. |
2404.02872 |
null |
2024-04-03 |
Toward Inference-optimal Mixture-of-Expert Large Language Models |
Longfei Yun et.al. |
2404.02852 |
null |
2024-04-03 |
I-Design: Personalized LLM Interior Designer |
Ata Çelen et.al. |
2404.02838 |
null |
2024-04-03 |
Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models |
Wanyun Cui et.al. |
2404.02837 |
null |
2024-04-03 |
Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison |
Maxime Bouthors et.al. |
2404.02835 |
null |
2024-04-03 |
Empowering Biomedical Discovery with AI Agents |
Shanghua Gao et.al. |
2404.02831 |
null |
2024-04-03 |
BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models |
Qijun Luo et.al. |
2404.02827 |
link |
2024-04-03 |
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models |
Haoran Sun et.al. |
2404.02823 |
link |
2024-04-03 |
A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches |
Zhigen Zhao et.al. |
2404.02817 |
null |
2024-04-03 |
The RealHumanEval: Evaluating Large Language Models’ Abilities to Support Programmers |
Hussein Mozannar et.al. |
2404.02806 |
link |
2024-04-03 |
Efficient Multi-Vector Dense Retrieval Using Bit Vectors |
Franco Maria Nardini et.al. |
2404.02805 |
link |
2024-04-03 |
AI and personalized learning: bridging the gap with modern educational goals |
Kristjan-Julius Laak et.al. |
2404.02798 |
null |
2024-04-03 |
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech |
Jaehyeon Kim et.al. |
2404.02781 |
null |
2024-04-03 |
FPT: Feature Prompt Tuning for Few-shot Readability Assessment |
Ziyang Wang et.al. |
2404.02772 |
link |
2024-04-03 |
DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement |
Hao Wu et.al. |
2404.02755 |
null |
2024-04-02 |
Segment Any 3D Object with Language |
Seungjun Lee et.al. |
2404.02157 |
null |
2024-04-02 |
Iterated Learning Improves Compositionality in Large Vision-Language Models |
Chenhao Zheng et.al. |
2404.02145 |
null |
2024-04-02 |
Topic-based Watermarks for LLM-Generated Text |
Alexander Nemecek et.al. |
2404.02138 |
null |
2024-04-02 |
ViTamin: Designing Scalable Vision Models in the Vision-Language Era |
Jieneng Chen et.al. |
2404.02132 |
link |
2024-04-02 |
FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning |
Joel Niklaus et.al. |
2404.02127 |
link |
2024-04-02 |
Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models |
Wanyong Feng et.al. |
2404.02124 |
link |
2024-04-02 |
GINopic: Topic Modeling with Graph Isomorphism Network |
Suman Adhya et.al. |
2404.02115 |
link |
2024-04-02 |
CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems |
Sara Rosenthal et.al. |
2404.02103 |
link |
2024-04-02 |
Advancing LLM Reasoning Generalists with Preference Trees |
Lifan Yuan et.al. |
2404.02078 |
link |
2024-04-02 |
Red-Teaming Segment Anything Model |
Krzysztof Jankowski et.al. |
2404.02067 |
link |
2024-04-02 |
Digital Forgetting in Large Language Models: A Survey of Unlearning Methods |
Alberto Blanco-Justicia et.al. |
2404.02062 |
null |
2024-04-02 |
Long-context LLMs Struggle with Long In-context Learning |
Tianle Li et.al. |
2404.02060 |
link |
2024-04-02 |
IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT |
Junchen Fu et.al. |
2404.02059 |
link |
2024-04-02 |
Deconstructing In-Context Learning: Understanding Prompts via Corruption |
Namrata Shivagunde et.al. |
2404.02054 |
link |
2024-04-02 |
A Survey on Large Language Model-Based Game Agents |
Sihao Hu et.al. |
2404.02039 |
link |
2024-04-02 |
MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages |
Daryna Dementieva et.al. |
2404.02037 |
null |
2024-04-02 |
Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts |
Zhuo Chen et.al. |
2404.02022 |
link |
2024-04-02 |
Large Language Models for Orchestrating Bimanual Robots |
Kun Chu et.al. |
2404.02018 |
null |
2024-04-02 |
MuxServe: Flexible Multiplexing for Efficient Multiple LLM Serving |
Jiangfei Duan et.al. |
2404.02015 |
link |
2024-04-02 |
Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models |
Stephan Linzbach et.al. |
2404.01992 |
null |
2024-03-29 |
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models |
Atsuyuki Miyai et.al. |
2403.20331 |
link |
2024-03-29 |
Are We on the Right Way for Evaluating Large Vision-Language Models? |
Lin Chen et.al. |
2403.20330 |
link |
2024-03-29 |
ReALM: Reference Resolution As Language Modeling |
Joel Ruben Antony Moniz et.al. |
2403.20329 |
null |
2024-03-29 |
Gecko: Versatile Text Embeddings Distilled from Large Language Models |
Jinhyuk Lee et.al. |
2403.20327 |
null |
2024-03-29 |
Convolutional Prompting meets Language Models for Continual Learning |
Anurag Roy et.al. |
2403.20317 |
null |
2024-03-29 |
Learn “No” to Say “Yes” Better: Improving Vision-Language Models via Negations |
Jaisidh Singh et.al. |
2403.20312 |
link |
2024-03-29 |
Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference |
Jovan Stojkovic et.al. |
2403.20306 |
null |
2024-03-29 |
Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain |
Burcu Sayin et.al. |
2403.20288 |
link |
2024-03-29 |
LUQ: Long-text Uncertainty Quantification for LLMs |
Caiqi Zhang et.al. |
2403.20279 |
null |
2024-04-01 |
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want |
Weifeng Lin et.al. |
2403.20271 |
link |
2024-03-29 |
Latxa: An Open Language Model and Evaluation Suite for Basque |
Julen Etxaniz et.al. |
2403.20266 |
link |
2024-03-29 |
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models |
Thibaut Thonet et.al. |
2403.20262 |
null |
2024-03-29 |
MedCLIP-SAM: Bridging Text and Image Towards Universal Medical Image Segmentation |
Taha Koleilat et.al. |
2403.20253 |
link |
2024-03-29 |
Using LLMs to Model the Beliefs and Preferences of Targeted Populations |
Keiichi Namikoshi et.al. |
2403.20252 |
null |
2024-03-29 |
Long-Tailed Anomaly Detection with Learnable Class Names |
Chih-Hui Ho et.al. |
2403.20236 |
null |
2024-03-29 |
H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model |
Chao Pang et.al. |
2403.20213 |
link |
2024-03-29 |
Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science |
Yazheng Yang et.al. |
2403.20208 |
null |
2024-03-29 |
The Future of Combating Rumors? Retrieval, Discrimination, and Generation |
Junhao Xu et.al. |
2403.20204 |
null |
2024-03-29 |
ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models |
Shuo Liu et.al. |
2403.20194 |
null |
2024-03-29 |
HARMamba: Efficient Wearable Sensor Human Activity Recognition Based on Bidirectional Selective SSM |
Shuangjian Li et.al. |
2403.20183 |
null |
2024-03-28 |
RSMamba: Remote Sensing Image Classification with State Space Model |
Keyan Chen et.al. |
2403.19654 |
link |
2024-03-28 |
InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction |
Sirui Xu et.al. |
2403.19652 |
null |
2024-03-28 |
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions |
Kai Zhang et.al. |
2403.19651 |
null |
2024-03-28 |
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models |
Samuel Marks et.al. |
2403.19647 |
link |
2024-03-28 |
Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning |
Chenyang Liu et.al. |
2403.19646 |
link |
2024-03-28 |
Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models |
Yucheng Shi et.al. |
2403.19631 |
null |
2024-03-28 |
RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents |
Zeren Chen et.al. |
2403.19622 |
null |
2024-03-28 |
SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects |
Avinash Ummadisingu et.al. |
2403.19607 |
null |
2024-03-28 |
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation |
Zhongliang Zhou et.al. |
2403.19584 |
link |
2024-03-28 |
Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics |
Norman Di Palo et.al. |
2403.19578 |
null |
2024-03-28 |
WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models |
Piotr Molenda et.al. |
2403.19548 |
null |
2024-03-28 |
Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models |
Ang Lv et.al. |
2403.19521 |
link |
2024-03-28 |
Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data |
Shan Chen et.al. |
2403.19511 |
link |
2024-03-28 |
LLMs as Academic Reading Companions: Extending HCI Through Synthetic Personae |
Celia Chen et.al. |
2403.19506 |
null |
2024-03-28 |
Evolving Assembly Code in an Adversarial Environment |
Irina Maliukov et.al. |
2403.19489 |
link |
2024-03-28 |
JDocQA: Japanese Document Question Answering Dataset for Generative Language Models |
Eri Onami et.al. |
2403.19454 |
link |
2024-03-28 |
Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model |
Qi Gou et.al. |
2403.19443 |
null |
2024-03-28 |
OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion |
Xinyu Zhan et.al. |
2403.19417 |
null |
2024-03-28 |
BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation |
Yuhong He et.al. |
2403.19414 |
null |
2024-03-28 |
Checkpoint Merging via Bayesian Optimization in LLM Pretraining |
Deyuan Liu et.al. |
2403.19390 |
null |
2024-03-27 |
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models |
Yanwei Li et.al. |
2403.18814 |
link |
2024-03-27 |
ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation |
Suraj Patni et.al. |
2403.18807 |
link |
2024-03-27 |
Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation |
Mateusz Klimaszewski et.al. |
2403.18804 |
link |
2024-03-27 |
Projective Methods for Mitigating Gender Bias in Pre-trained Language Models |
Hillary Dawkins et.al. |
2403.18803 |
link |
2024-03-27 |
Long-form factuality in large language models |
Jerry Wei et.al. |
2403.18802 |
link |
2024-03-27 |
Towards a World-English Language Model for On-Device Virtual Assistants |
Rricha Jalota et.al. |
2403.18783 |
null |
2024-03-27 |
3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation |
Ehsan Latif et.al. |
2403.18778 |
null |
2024-03-27 |
ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object |
Chenshuang Zhang et.al. |
2403.18775 |
link |
2024-03-27 |
CheckEval: Robust Evaluation Framework using Large Language Model via Checklist |
Yukyung Lee et.al. |
2403.18771 |
null |
2024-03-27 |
MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model |
Yike Wu et.al. |
2403.18760 |
link |
2024-03-27 |
CYCLE: Learning to Self-Refine the Code Generation |
Yangruibo Ding et.al. |
2403.18746 |
link |
2024-03-27 |
Understanding the Learning Dynamics of Alignment with Human Feedback |
Shawn Im et.al. |
2403.18742 |
link |
2024-03-27 |
PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations |
Ehsan Latif et.al. |
2403.18721 |
null |
2024-03-27 |
Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding |
Xintong Wang et.al. |
2403.18715 |
null |
2024-03-27 |
The Invalsi Benchmark: measuring Language Models Mathematical and Language understanding in Italian |
Andrea Esuli et.al. |
2403.18697 |
null |
2024-03-27 |
NL-ITI: Optimizing Probing and Intervention for Improvement of ITI Method |
Jakub Hoscilowicz et.al. |
2403.18680 |
link |
2024-03-27 |
An Exploratory Study on Upper-Level Computing Students’ Use of Large Language Models as Tools in a Semester-Long Project |
Ben Arie Tanay et.al. |
2403.18679 |
null |
2024-03-27 |
SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens |
Chengbo Liu et.al. |
2403.18647 |
link |
2024-03-27 |
To Recommend or Not: Recommendability Identification in Conversations with Pre-trained Language Models |
Zhefan Wang et.al. |
2403.18628 |
link |
2024-03-27 |
Vulnerability Detection with Code Language Models: How Far Are We? |
Yangruibo Ding et.al. |
2403.18624 |
link |
2024-03-26 |
OmniVid: A Generative Framework for Universal Video Understanding |
Junke Wang et.al. |
2403.17935 |
link |
2024-03-26 |
Track Everything Everywhere Fast and Robustly |
Yunzhou Song et.al. |
2403.17931 |
null |
2024-03-26 |
MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution |
Wei Tao et.al. |
2403.17927 |
null |
2024-03-26 |
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning |
Rui Pan et.al. |
2403.17919 |
link |
2024-03-26 |
Large scale paired antibody language models |
Henry Kenlay et.al. |
2403.17889 |
null |
2024-03-26 |
Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation |
Carlos Gomes et.al. |
2403.17886 |
link |
2024-03-26 |
MIND Your Language: A Multilingual Dataset for Cross-lingual News Recommendation |
Andreea Iana et.al. |
2403.17876 |
link |
2024-03-26 |
Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach |
Andrea Ferrario et.al. |
2403.17873 |
null |
2024-03-26 |
Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications |
Philip Lippmann et.al. |
2403.17860 |
null |
2024-03-26 |
ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages |
Bhawna Piryani et.al. |
2403.17859 |
link |
2024-03-26 |
Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs |
David R. Mortensen et.al. |
2403.17856 |
null |
2024-03-26 |
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering |
Abdelrahman Abdallah et.al. |
2403.17848 |
link |
2024-03-26 |
Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation |
Abdelrhman Werby et.al. |
2403.17846 |
null |
2024-03-26 |
Mechanistic Design and Scaling of Hybrid Architectures |
Michael Poli et.al. |
2403.17844 |
null |
2024-03-26 |
ReMamber: Referring Image Segmentation with Mamba Twister |
Yuhuan Yang et.al. |
2403.17839 |
link |
2024-03-26 |
A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities |
Ibrahim Ethem Hamamci et.al. |
2403.17834 |
link |
2024-03-26 |
Assessment of Multimodal Large Language Models in Alignment with Human Values |
Zhelun Shi et.al. |
2403.17830 |
null |
2024-03-26 |
Accelerating Radio Spectrum Regulation Workflows with Large Language Models (LLMs) |
Amir Ghasemi et.al. |
2403.17819 |
null |
2024-03-26 |
Graph Language Model (GLM): A new graph-based approach to detect social instabilities |
Wallyson Lemes de Oliveira et.al. |
2403.17816 |
null |
2024-03-26 |
Are Compressed Language Models Less Subgroup Robust? |
Leonidas Gee et.al. |
2403.17811 |
link |
2024-03-25 |
Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making |
Shuai Ma et.al. |
2403.16812 |
null |
2024-03-25 |
An LLM-Based Digital Twin for Optimizing Human-in-the Loop Systems |
Hanqing Yang et.al. |
2403.16809 |
link |
2024-03-25 |
Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback |
Zhangqian Bi et.al. |
2403.16792 |
link |
2024-03-25 |
All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification |
Deepak Narayan Gadde et.al. |
2403.16750 |
null |
2024-03-25 |
A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models |
Nils Ingelhag et.al. |
2403.16730 |
null |
2024-03-25 |
ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search |
Zehan Li et.al. |
2403.16702 |
link |
2024-03-25 |
Synapse: Learning Preferential Concepts from Visual Demonstrations |
Sadanand Modak et.al. |
2403.16689 |
null |
2024-03-25 |
Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography |
Jiayue Zhang et.al. |
2403.16687 |
null |
2024-03-25 |
RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict |
Yirong Zeng et.al. |
2403.16662 |
link |
2024-03-25 |
Grammatical vs Spelling Error Correction: An Investigation into the Responsiveness of Transformer-based Language Models using BART and MarianMT |
Rohit Raju et.al. |
2403.16655 |
null |
2024-03-25 |
CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment |
Feiteng Fang et.al. |
2403.16649 |
link |
2024-03-25 |
Virtual Co-Pilot: Multimodal Large Language Model-enabled Quick-access Procedures for Single Pilot Operations |
Fan Li et.al. |
2403.16645 |
null |
2024-03-25 |
Semantically Enriched Cross-Lingual Sentence Embeddings for Crisis-related Social Media Texts |
Rabindra Lamsal et.al. |
2403.16614 |
null |
2024-03-25 |
Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units |
Biswesh Mohapatra et.al. |
2403.16609 |
null |
2024-03-25 |
TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques |
Ashok Urlana et.al. |
2403.16592 |
null |
2024-03-25 |
Can Large Language Models (or Humans) Distill Text? |
Nicolas Audinet de Pieuchon et.al. |
2403.16584 |
link |
2024-03-25 |
NSINA: A News Corpus for Sinhala |
Hansi Hettiarachchi et.al. |
2403.16571 |
link |
2024-03-25 |
Elysium: Exploring Object-level Perception in Videos via MLLM |
Han Wang et.al. |
2403.16558 |
link |
2024-03-25 |
DOrA: 3D Visual Grounding with Order-Aware Referring |
Tung-Yu Wu et.al. |
2403.16539 |
null |
2024-03-25 |
Open-Set Recognition in the Age of Vision-Language Models |
Dimity Miller et.al. |
2403.16528 |
link |
2024-03-25 |
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art |
Neeloy Chakraborty et.al. |
2403.16527 |
null |
2024-03-25 |
Harnessing the power of LLMs for normative reasoning in MASs |
Bastin Tony Roy Savarimuthu et.al. |
2403.16524 |
null |
2024-03-25 |
Norm Violation Detection in Multi-Agent Systems using Large Language Models: A Pilot Study |
Shawn He et.al. |
2403.16517 |
null |
2024-03-25 |
Linguistically Differentiating Acts and Recalls of Racial Microaggressions on Social Media |
Uma Sushmitha Gunturi et.al. |
2403.16514 |
null |
2024-03-22 |
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models |
Yuzhang Shang et.al. |
2403.15388 |
null |
2024-03-22 |
Long-CLIP: Unlocking the Long-Text Capability of CLIP |
Beichen Zhang et.al. |
2403.15378 |
link |
2024-03-22 |
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding |
Yi Wang et.al. |
2403.15377 |
link |
2024-03-22 |
Can large language models explore in-context? |
Akshay Krishnamurthy et.al. |
2403.15371 |
null |
2024-03-22 |
CoLLEGe: Concept Embedding Generation for Large Language Models |
Ryan Teehan et.al. |
2403.15362 |
null |
2024-03-22 |
Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities |
Zhitong Xiong et.al. |
2403.15356 |
link |
2024-03-22 |
Controlled Training Data Generation with Diffusion Models |
Teresa Yeo et.al. |
2403.15309 |
null |
2024-03-22 |
Sphere Neural-Networks for Rational Reasoning |
Tiansi Dong et.al. |
2403.15297 |
null |
2024-03-22 |
Measuring Gender and Racial Biases in Large Language Models |
Jiafu An et.al. |
2403.15281 |
null |
2024-03-22 |
Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review |
Jinge Wang et.al. |
2403.15274 |
null |
2024-03-22 |
Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs |
Xiaobin Zhang et.al. |
2403.15273 |
null |
2024-03-22 |
Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models |
Huanxuan Liao et.al. |
2403.15268 |
link |
2024-03-22 |
AI Exposure and Strategic Positioning on an Online Work Platform |
Shun Yiu et.al. |
2403.15262 |
null |
2024-03-22 |
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions |
Orion Weller et.al. |
2403.15246 |
link |
2024-03-22 |
Shadow Generation for Composite Image Using Diffusion model |
Qingyang Liu et.al. |
2403.15234 |
link |
2024-03-22 |
An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets |
Jonathan Katzy et.al. |
2403.15230 |
link |
2024-03-22 |
Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models |
Qiong Wu et.al. |
2403.15226 |
link |
2024-03-22 |
Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations |
Pranav Kulkarni et.al. |
2403.15218 |
link |
2024-03-22 |
InstaSynth: Opportunities and Challenges in Generating Synthetic Instagram Data with ChatGPT for Sponsored Content Detection |
Thales Bertaglia et.al. |
2403.15214 |
link |
2024-03-22 |
MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection |
Taeheon Kim et.al. |
2403.15209 |
null |
2024-03-21 |
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? |
Renrui Zhang et.al. |
2403.14624 |
null |
2024-03-21 |
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey |
Zeyu Han et.al. |
2403.14608 |
null |
2024-03-21 |
MyVLM: Personalizing VLMs for User-Specific Queries |
Yuval Alaluf et.al. |
2403.14599 |
null |
2024-03-21 |
ReAct Meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training |
Zonghan Yang et.al. |
2403.14589 |
null |
2024-03-21 |
Large Language Models for Multi-Choice Question Classification of Medical Subjects |
Víctor Ponce-López et.al. |
2403.14582 |
null |
2024-03-21 |
RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain |
William James Bolton et.al. |
2403.14578 |
link |
2024-03-21 |
A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students’ Formative Assessment Responses in Science |
Clayton Cohn et.al. |
2403.14565 |
null |
2024-03-21 |
The Era of Semantic Decoding |
Maxime Peyrard et.al. |
2403.14562 |
null |
2024-03-21 |
Lexicon-Level Contrastive Visual-Grounding Improves Language Modeling |
Chengxu Zhuang et.al. |
2403.14551 |
null |
2024-03-21 |
EDT: Improving Large Language Models’ Generation by Entropy-based Dynamic Temperature Sampling |
Shimao Zhang et.al. |
2403.14541 |
link |
2024-03-21 |
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference |
Han Zhao et.al. |
2403.14520 |
link |
2024-03-21 |
The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs) |
Joschka Haltaufderheide et.al. |
2403.14473 |
null |
2024-03-21 |
Detoxifying Large Language Models via Knowledge Editing |
Mengru Wang et.al. |
2403.14472 |
link |
2024-03-21 |
ChatGPT Alternative Solutions: Large Language Models Survey |
Hanieh Alipour et.al. |
2403.14469 |
null |
2024-03-21 |
Recourse for reclamation: Chatting with generative language models |
Jennifer Chien et.al. |
2403.14467 |
null |
2024-03-21 |
Towards Single-System Illusion in Software-Defined Vehicles – Automated, AI-Powered Workflow |
Krzysztof Lebioda et.al. |
2403.14460 |
null |
2024-03-21 |
Multi-Level Explanations for Generative Language Models |
Lucas Monteiro Paes et.al. |
2403.14459 |
null |
2024-03-21 |
gTBLS: Generating Tables from Text by Conditional Question Answering |
Anirudh Sundar et.al. |
2403.14457 |
null |
2024-03-21 |
Language Models Can Reduce Asymmetry in Information Markets |
Nasim Rahaman et.al. |
2403.14443 |
null |
2024-03-21 |
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models |
Dominik Wagner et.al. |
2403.14438 |
null |
2024-03-20 |
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition |
Ziyu Liu et.al. |
2403.13805 |
link |
2024-03-20 |
Learning from Models and Data for Visual Grounding |
Ruozhen He et.al. |
2403.13804 |
null |
2024-03-20 |
Reverse Training to Nurse the Reversal Curse |
Olga Golovneva et.al. |
2403.13799 |
null |
2024-03-20 |
Bridge the Modality and Capacity Gaps in Vision-Language Model Selection |
Chao Yi et.al. |
2403.13797 |
null |
2024-03-20 |
RewardBench: Evaluating Reward Models for Language Modeling |
Nathan Lambert et.al. |
2403.13787 |
link |
2024-03-20 |
Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts |
Guangzeng Han et.al. |
2403.13786 |
link |
2024-03-20 |
Information-Theoretic Distillation for Reference-less Summarization |
Jaehun Jung et.al. |
2403.13780 |
null |
2024-03-20 |
Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation |
Hugues Thomas et.al. |
2403.13777 |
null |
2024-03-20 |
Describe-and-Dissect: Interpreting Neurons in Vision Networks with Language Models |
Nicholas Bai et.al. |
2403.13771 |
link |
2024-03-20 |
Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model |
Diwei Wang et.al. |
2403.13756 |
null |
2024-03-20 |
Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement |
Catherine Arnett et.al. |
2403.13754 |
null |
2024-03-20 |
EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation |
Atnafu Lambebo Tonja et.al. |
2403.13737 |
null |
2024-03-20 |
Large Language Models meet Network Slicing Management and Orchestration |
Abdulhalim Dandoush et.al. |
2403.13721 |
null |
2024-03-20 |
SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning |
Hongjun Wang et.al. |
2403.13684 |
null |
2024-03-20 |
PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents |
Mitodru Niyogi et.al. |
2403.13681 |
null |
2024-03-20 |
RoleInteract: Evaluating the Social Interaction of Role-Playing Agents |
Hongzhan Chen et.al. |
2403.13679 |
link |
2024-03-20 |
Grounding Spatial Relations in Text-Only Language Models |
Gorka Azkune et.al. |
2403.13666 |
link |
2024-03-20 |
Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese |
Meet Doshi et.al. |
2403.13638 |
null |
2024-03-20 |
VL-Mamba: Exploring State Space Models for Multimodal Learning |
Yanyuan Qiao et.al. |
2403.13600 |
null |
2024-03-20 |
No more optimization rules: LLM-enabled policy-based multi-modal query optimizer (version 1) |
Yifan Wang et.al. |
2403.13597 |
null |
2024-03-19 |
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression |
Zhuoshi Pan et.al. |
2403.12968 |
link |
2024-03-19 |
Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models |
Zuyan Liu et.al. |
2403.12966 |
link |
2024-03-19 |
Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models |
Ce Zhang et.al. |
2403.12964 |
link |
2024-03-19 |
Dated Data: Tracing Knowledge Cutoffs in Large Language Models |
Jeffrey Cheng et.al. |
2403.12958 |
null |
2024-03-19 |
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models |
Elaine Sui et.al. |
2403.12952 |
link |
2024-03-19 |
Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models |
Joana Ribeiro de Faria et.al. |
2403.12936 |
null |
2024-03-19 |
Segment Anything for comprehensive analysis of grapevine cluster architecture and berry properties |
Efrain Torres-Lomas et.al. |
2403.12935 |
null |
2024-03-19 |
Rapid AIdeation: Generating Ideas With the Self and in Collaboration With Large Language Models |
Gionnieve Lim et.al. |
2403.12928 |
null |
2024-03-19 |
Supporting Energy Policy Research with Large Language Models |
Grant Buster et.al. |
2403.12924 |
null |
2024-03-19 |
Contextual AD Narration with Interleaved Multimodal Sequence |
Hanlin Wang et.al. |
2403.12922 |
null |
2024-03-19 |
Semantic Layering in Room Segmentation via LLMs |
Taehyeon Kim et.al. |
2403.12920 |
null |
2024-03-19 |
Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts |
Sai Ashish Somayajula et.al. |
2403.12918 |
link |
2024-03-19 |
Yell At Your Robot: Improving On-the-Fly from Language Corrections |
Lucy Xiaoyang Shi et.al. |
2403.12910 |
null |
2024-03-19 |
Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference |
Baolin Li et.al. |
2403.12900 |
null |
2024-03-19 |
mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding |
Anwen Hu et.al. |
2403.12895 |
link |
2024-03-20 |
MEDBind: Unifying Language and Multimodal Medical Data Embeddings |
Yuan Gao et.al. |
2403.12894 |
null |
2024-03-19 |
HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning |
Fucai Ke et.al. |
2403.12884 |
null |
2024-03-19 |
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models |
Zehui Chen et.al. |
2403.12881 |
link |
2024-03-19 |
Epistemology of Language Models: Do Language Models Have Holistic Knowledge? |
Minsu Kim et.al. |
2403.12862 |
null |
2024-03-19 |
RASP: A Drone-based Reconfigurable Actuation and Sensing Platform Towards Ambient Intelligent Systems |
Minghui Zhao et.al. |
2403.12853 |
null |
2024-03-18 |
Modality-Agnostic fMRI Decoding of Vision and Language |
Mitja Nikolaus et.al. |
2403.11771 |
null |
2024-03-18 |
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs |
M. Jehanzeb Mirza et.al. |
2403.11755 |
link |
2024-03-18 |
Revisiting The Classics: A Study on Identifying and Rectifying Gender Stereotypes in Rhymes and Poems |
Aditya Narayan Sankaran et.al. |
2403.11752 |
link |
2024-03-18 |
Embedded Named Entity Recognition using Probing Classifiers |
Nicholas Popovič et.al. |
2403.11747 |
null |
2024-03-18 |
TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models |
Lisa Weijler et.al. |
2403.11691 |
null |
2024-03-18 |
HDLdebugger: Streamlining HDL debugging with Large Language Models |
Xufeng Yao et.al. |
2403.11671 |
null |
2024-03-18 |
Prioritized Semantic Learning for Zero-shot Instance Navigation |
Xander Sun et.al. |
2403.11650 |
link |
2024-03-18 |
Arc2Face: A Foundation Model of Human Faces |
Foivos Paraperas Papantoniou et.al. |
2403.11641 |
link |
2024-03-18 |
Compositional Kronecker Context Optimization for Vision-Language Models |
Kun Ding et.al. |
2403.11631 |
null |
2024-03-18 |
Let’s Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model |
Haoyun Xu et.al. |
2403.11621 |
null |
2024-03-18 |
CRS-Diff: Controllable Generative Remote Sensing Foundation Model |
Datao Tang et.al. |
2403.11614 |
link |
2024-03-18 |
Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines |
Ekaterina Trofimova et.al. |
2403.11585 |
null |
2024-03-18 |
Reinforcement Learning with Token-level Feedback for Controllable Text Generation |
Wendi Li et.al. |
2403.11558 |
link |
2024-03-18 |
LLM^3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning |
Shu Wang et.al. |
2403.11552 |
link |
2024-03-18 |
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters |
Jiazuo Yu et.al. |
2403.11549 |
link |
2024-03-18 |
DEE: Dual-stage Explainable Evaluation Method for Text Generation |
Shenyu Zhang et.al. |
2403.11509 |
null |
2024-03-18 |
Do CLIPs Always Generalize Better than ImageNet Models? |
Qizhou Wang et.al. |
2403.11497 |
null |
2024-03-18 |
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding |
Yue Fan et.al. |
2403.11481 |
null |
2024-03-18 |
HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models |
Huy Nghiem et.al. |
2403.11456 |
link |
2024-03-18 |
Zero-shot Compound Expression Recognition with Visual Language Model at the 6th ABAW Challenge |
Jiahe Wang et.al. |
2403.11450 |
null |
2024-03-18 |
LLM Guided Evolution - The Automation of Models Advancing Models |
Clint Morris et.al. |
2403.11446 |
link |
2024-03-18 |
StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation |
Jinpeng Li et.al. |
2403.11439 |
null |
2024-03-18 |
InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions |
Yifan Wang et.al. |
2403.11435 |
null |
2024-03-18 |
A Novel Paradigm Boosting Translation Capabilities of Large Language Models |
Jiaxin Guo et.al. |
2403.11430 |
null |
2024-03-15 |
VideoAgent: Long-form Video Understanding with Large Language Model as Agent |
Xiaohan Wang et.al. |
2403.10517 |
null |
2024-03-15 |
Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization |
Ratnadira Widyasari et.al. |
2403.10507 |
null |
2024-03-15 |
ATOM: Asynchronous Training of Massive Models for Deep Learning in a Decentralized Environment |
Xiaofeng Wu et.al. |
2403.10504 |
null |
2024-03-15 |
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study |
Chenguang Wang et.al. |
2403.10499 |
link |
2024-03-15 |
Reconfigurable Robot Identification from Motion Data |
Yuhang Hu et.al. |
2403.10496 |
null |
2024-03-15 |
Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst? |
Bruno de Melo et.al. |
2403.10482 |
null |
2024-03-15 |
Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases |
Jiarui Li et.al. |
2403.10446 |
link |
2024-03-15 |
Optimal Block-Level Draft Verification for Accelerating Speculative Decoding |
Ziteng Sun et.al. |
2403.10444 |
null |
2024-03-15 |
Using an LLM to Turn Sign Spottings into Spoken Language Sentences |
Ozge Mercanoglu Sincan et.al. |
2403.10434 |
null |
2024-03-15 |
SocialGenPod: Privacy-Friendly Generative AI Social Web Applications with Decentralised Personal Data Stores |
Vidminas Vizgirda et.al. |
2403.10408 |
link |
2024-03-15 |
A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE |
Hervé Déjean et.al. |
2403.10407 |
null |
2024-03-15 |
Monotonic Representation of Numeric Properties in Language Models |
Benjamin Heinzerling et.al. |
2403.10381 |
link |
2024-03-15 |
EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models |
Rocktim Jyoti Das et.al. |
2403.10378 |
link |
2024-03-15 |
TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale |
Pengcheng Jiang et.al. |
2403.10351 |
null |
2024-03-15 |
Investigating grammatical abstraction in language models using few-shot learning of novel noun gender |
Priyanka Sukumaran et.al. |
2403.10338 |
null |
2024-03-15 |
CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model |
Shang-Hsuan Chiang et.al. |
2403.10326 |
link |
2024-03-15 |
NetBench: A Large-Scale and Comprehensive Network Traffic Benchmark Dataset for Foundation Models |
Chen Qian et.al. |
2403.10319 |
link |
2024-03-15 |
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer |
Hengxing Cai et.al. |
2403.10301 |
null |
2024-03-15 |
Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models |
Tian Meng et.al. |
2403.10287 |
null |
2024-03-15 |
Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning |
Shang-Hsuan Chiang et.al. |
2403.10281 |
link |
2024-03-14 |
GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping |
Yuhang Zheng et.al. |
2403.09637 |
link |
2024-03-14 |
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference |
Piotr Nawrot et.al. |
2403.09636 |
null |
2024-03-14 |
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models |
Akhil Kedia et.al. |
2403.09635 |
link |
2024-03-14 |
OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning |
Lingyi Hong et.al. |
2403.09634 |
null |
2024-03-14 |
3D-VLA: A 3D Vision-Language-Action Generative World Model |
Haoyu Zhen et.al. |
2403.09631 |
null |
2024-03-14 |
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking |
Eric Zelikman et.al. |
2403.09629 |
link |
2024-03-14 |
Explore In-Context Segmentation via Latent Diffusion Models |
Chaoyang Wang et.al. |
2403.09616 |
null |
2024-03-14 |
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training |
Brandon McKinzie et.al. |
2403.09611 |
null |
2024-03-14 |
Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey |
Xiaoyu Liu et.al. |
2403.09606 |
null |
2024-03-14 |
Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis |
Gregory Coppola et.al. |
2403.09599 |
null |
2024-03-14 |
Renovating Names in Open-Vocabulary Segmentation Benchmarks |
Haiwen Huang et.al. |
2403.09593 |
null |
2024-03-14 |
ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models |
Runyu Ma et.al. |
2403.09583 |
null |
2024-03-14 |
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation |
Yunhao Gou et.al. |
2403.09572 |
null |
2024-03-14 |
Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models |
Laura Fernández-Becerra et.al. |
2403.09567 |
null |
2024-03-14 |
Welcome Your New AI Teammate: On Safety Analysis by Leashing Large Language Models |
Ali Nouri et.al. |
2403.09565 |
null |
2024-03-14 |
PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps |
Ruixuan Liu et.al. |
2403.09562 |
null |
2024-03-14 |
Less is More: Data Value Estimation for Visual Instruction Tuning |
Zikang Liu et.al. |
2403.09559 |
null |
2024-03-15 |
Logits of API-Protected LLMs Leak Proprietary Information |
Matthew Finlayson et.al. |
2403.09539 |
null |
2024-03-14 |
VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding |
Chris Kelly et.al. |
2403.09530 |
null |
2024-03-15 |
WavCraft: Audio Editing and Generation with Natural Language Prompts |
Jinhua Liang et.al. |
2403.09527 |
link |
2024-03-13 |
Simple and Scalable Strategies to Continually Pre-train Large Language Models |
Adam Ibrahim et.al. |
2403.08763 |
link |
2024-03-13 |
Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework |
Jingling Li et.al. |
2403.08743 |
null |
2024-03-13 |
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models |
Carlo Nicolini et.al. |
2403.08739 |
null |
2024-03-13 |
ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation |
Sayar Ghosh Roy et.al. |
2403.08737 |
link |
2024-03-13 |
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization |
Renjie Pi et.al. |
2403.08730 |
null |
2024-03-14 |
SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents |
Ruiyi Wang et.al. |
2403.08715 |
link |
2024-03-13 |
Review of Generative AI Methods in Cybersecurity |
Yagmur Yigit et.al. |
2403.08701 |
null |
2024-03-13 |
TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning |
Shangding Gu et.al. |
2403.08694 |
null |
2024-03-13 |
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages |
Rik van Noord et.al. |
2403.08693 |
null |
2024-03-13 |
Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records |
Erlend Frayling et.al. |
2403.08664 |
null |
2024-03-13 |
Self-Supervised Learning for Covariance Estimation |
Tzvi Diskin et.al. |
2403.08662 |
null |
2024-03-13 |
Human Alignment of Large Language Models through Online Preference Optimisation |
Daniele Calandriello et.al. |
2403.08635 |
null |
2024-03-13 |
MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models |
Subash Neupane et.al. |
2403.08607 |
null |
2024-03-13 |
Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation |
Daniel Honerkamp et.al. |
2403.08605 |
link |
2024-03-13 |
DevBench: A Comprehensive Benchmark for Software Development |
Bowen Li et.al. |
2403.08604 |
link |
2024-03-13 |
Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments |
Sitao Cheng et.al. |
2403.08593 |
null |
2024-03-13 |
Non-discrimination Criteria for Generative Language Models |
Sara Sterlie et.al. |
2403.08564 |
null |
2024-03-13 |
AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models |
Yifei Gao et.al. |
2403.08542 |
null |
2024-03-13 |
Language models scale reliably with over-training and on downstream tasks |
Samir Yitzhak Gadre et.al. |
2403.08540 |
link |
2024-03-13 |
Masked Generative Story Transformer with Character Guidance and Caption Augmentation |
Christos Papadimitriou et.al. |
2403.08502 |
link |
2024-03-12 |
Beyond Text: Frozen Large Language Models in Visual Signal Comprehension |
Lei Zhu et.al. |
2403.07874 |
link |
2024-03-12 |
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension |
Fangyun Wei et.al. |
2403.07872 |
null |
2024-03-12 |
Exploring Safety Generalization Challenges of Large Language Models via Code |
Qibing Ren et.al. |
2403.07865 |
link |
2024-03-12 |
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation |
Shihao Zhao et.al. |
2403.07860 |
link |
2024-03-12 |
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric |
Haokun Lin et.al. |
2403.07839 |
null |
2024-03-12 |
DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies |
William Xie et.al. |
2403.07832 |
null |
2024-03-12 |
The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing |
Jianchen Wang et.al. |
2403.07825 |
null |
2024-03-12 |
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM |
Sainbayar Sukhbaatar et.al. |
2403.07816 |
null |
2024-03-12 |
Chronos: Learning the Language of Time Series |
Abdul Fatir Ansari et.al. |
2403.07815 |
link |
2024-03-12 |
Beyond Memorization: The Challenge of Random Memory Access in Language Models |
Tongyao Zhu et.al. |
2403.07805 |
link |
2024-03-12 |
Fine-tuning Large Language Models with Sequential Instructions |
Hanxu Hu et.al. |
2403.07794 |
link |
2024-03-12 |
Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations |
Carlos Jose Xavier Cruz et.al. |
2403.07769 |
link |
2024-03-12 |
Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings |
Sahand Sharifzadeh et.al. |
2403.07750 |
null |
2024-03-12 |
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models |
Yan Liu et.al. |
2403.07747 |
null |
2024-03-12 |
Multi-modal Auto-regressive Modeling via Visual Words |
Tianshuo Peng et.al. |
2403.07720 |
link |
2024-03-12 |
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? |
Alexandre Drouin et.al. |
2403.07718 |
link |
2024-03-12 |
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models |
Zhicheng Guo et.al. |
2403.07714 |
link |
2024-03-12 |
Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards |
Wei Shen et.al. |
2403.07708 |
null |
2024-03-12 |
Large, Small or Both: A Novel Data Augmentation Framework Based on Language Models for Debiasing Opinion Summarization |
Yanyue Zhang et.al. |
2403.07693 |
null |
2024-03-12 |
Reference-free Monolithic Preference Optimization with Odds Ratio |
Jiwoo Hong et.al. |
2403.07691 |
link |
2024-03-11 |
Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena |
Leonie Weissweiler et.al. |
2403.06965 |
null |
2024-03-11 |
Materials science in the era of large language models: a perspective |
Ge Lei et.al. |
2403.06949 |
null |
2024-03-11 |
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation |
Xinyao Li et.al. |
2403.06946 |
link |
2024-03-11 |
Naming, Describing, and Quantifying Visual Objects in Humans and LLMs |
Alberto Testoni et.al. |
2403.06935 |
link |
2024-03-11 |
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis |
Yanming Liu et.al. |
2403.06932 |
link |
2024-03-11 |
MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning |
Yichuan Li et.al. |
2403.06914 |
link |
2024-03-11 |
Application of Quantum Tensor Networks for Protein Classification |
Debarshi Kundu et.al. |
2403.06890 |
null |
2024-03-11 |
Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents |
Nishchal Prasad et.al. |
2403.06872 |
link |
2024-03-11 |
Semantic Residual Prompts for Continual Learning |
Martin Menabue et.al. |
2403.06870 |
link |
2024-03-11 |
Learning with Noisy Foundation Models |
Hao Chen et.al. |
2403.06869 |
null |
2024-03-11 |
A Geospatial Approach to Predicting Desert Locust Breeding Grounds in Africa |
Ibrahim Salihu Yusuf et.al. |
2403.06860 |
null |
2024-03-11 |
Development of a Reliable and Accessible Caregiving Language Model (CaLM) |
Bambang Parmanto et.al. |
2403.06857 |
null |
2024-03-11 |
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation |
Guosheng Zhao et.al. |
2403.06845 |
null |
2024-03-11 |
RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback |
Yanming Liu et.al. |
2403.06840 |
link |
2024-03-11 |
ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts |
Lyuye Zhang et.al. |
2403.06838 |
null |
2024-03-11 |
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? |
Egor Zverev et.al. |
2403.06833 |
link |
2024-03-11 |
The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework |
Zhuo Chen et.al. |
2403.06832 |
link |
2024-03-11 |
ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model |
Zhiwei Liu et.al. |
2403.06765 |
link |
2024-03-11 |
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models |
Liang Chen et.al. |
2403.06764 |
link |
2024-03-11 |
ALaRM: Align Language Models via Hierarchical Rewards Modeling |
Yuhang Lai et.al. |
2403.06754 |
link |
2024-03-08 |
Bayesian Preference Elicitation with Language Models |
Kunal Handa et.al. |
2403.05534 |
null |
2024-03-08 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context |
Machel Reid et.al. |
2403.05530 |
null |
2024-03-08 |
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM |
Hao Kang et.al. |
2403.05527 |
link |
2024-03-08 |
DeepSeek-VL: Towards Real-World Vision-Language Understanding |
Haoyu Lu et.al. |
2403.05525 |
link |
2024-03-08 |
Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola |
Yijiang Li et.al. |
2403.05523 |
null |
2024-03-08 |
Authorship Attribution in Bangla Literature (AABL) via Transfer Learning using ULMFiT |
Aisha Khatun et.al. |
2403.05519 |
null |
2024-03-08 |
Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought |
James Chua et.al. |
2403.05518 |
link |
2024-03-08 |
To Err Is Human, but Llamas Can Learn It Too |
Agnes Luhtaru et.al. |
2403.05493 |
null |
2024-03-08 |
Will GPT-4 Run DOOM? |
Adrian de Wynter et.al. |
2403.05468 |
null |
2024-03-08 |
Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs |
Arijit Nag et.al. |
2403.05434 |
null |
2024-03-08 |
Towards Real-World Stickers Use: A New Dataset for Multi-Tag Sticker Recognition |
Bingbing Wang et.al. |
2403.05428 |
null |
2024-03-08 |
FedFMS: Exploring Federated Foundation Models for Medical Image Segmentation |
Yuxi Liu et.al. |
2403.05408 |
link |
2024-03-08 |
Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery |
Xavier Bou et.al. |
2403.05381 |
link |
2024-03-08 |
VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model |
Junsu Kim et.al. |
2403.05346 |
null |
2024-03-08 |
Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings |
Wei Zhou et.al. |
2403.05338 |
null |
2024-03-08 |
ChatASU: Evoking LLM’s Reflexion to Truly Understand Aspect Sentiment in Dialogues |
Yiding Liu et.al. |
2403.05326 |
null |
2024-03-08 |
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation |
Zihao Wang et.al. |
2403.05313 |
null |
2024-03-08 |
Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents |
Jinyang Li et.al. |
2403.05307 |
link |
2024-03-08 |
ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications |
Sotaro Takeshita et.al. |
2403.05303 |
link |
2024-03-08 |
Modeling Dynamic (De)Allocations of Local Memory for Translation Validation |
Abhishek Rose et.al. |
2403.05302 |
null |
2024-03-07 |
iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries |
Adam Coscia et.al. |
2403.04760 |
link |
2024-03-07 |
KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts |
Adam Coscia et.al. |
2403.04758 |
link |
2024-03-07 |
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error |
Boshi Wang et.al. |
2403.04746 |
link |
2024-03-08 |
How Far Are We from Intelligent Visual Deductive Reasoning? |
Yizhe Zhang et.al. |
2403.04732 |
link |
2024-03-07 |
Common 7B Language Models Already Possess Strong Math Capabilities |
Chen Li et.al. |
2403.04706 |
link |
2024-03-07 |
ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes |
Hashmat Shadab Malik et.al. |
2403.04701 |
link |
2024-03-07 |
Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification |
Ekaterina Fadeeva et.al. |
2403.04696 |
link |
2024-03-07 |
Telecom Language Models: Must They Be Large? |
Nicola Piovesan et.al. |
2403.04666 |
null |
2024-03-07 |
Yi: Open Foundation Models by 01.AI |
01.AI et.al. |
2403.04652 |
link |
2024-03-07 |
Teaching Large Language Models to Reason with Reinforcement Learning |
Alex Havrilla et.al. |
2403.04642 |
null |
2024-03-07 |
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios |
Qilang Ye et.al. |
2403.04640 |
link |
2024-03-07 |
A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds |
Xuenan Xu et.al. |
2403.04594 |
link |
2024-03-07 |
Embodied Understanding of Driving Scenarios |
Yunsong Zhou et.al. |
2403.04593 |
link |
2024-03-07 |
Wiki-TabNER: Advancing Table Interpretation Through Named Entity Recognition |
Aneta Koleva et.al. |
2403.04577 |
link |
2024-03-07 |
Reducing self-supervised learning complexity improves weakly-supervised classification performance in computational pathology |
Tim Lenz et.al. |
2403.04558 |
null |
2024-03-07 |
Enhancing Data Quality in Federated Fine-Tuning of Foundation Models |
Wanru Zhao et.al. |
2403.04529 |
null |
2024-03-07 |
Where does In-context Translation Happen in Large Language Models |
Suzanna Sia et.al. |
2403.04510 |
null |
2024-03-07 |
GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability |
Zihan Luo et.al. |
2403.04483 |
link |
2024-03-08 |
Do Large Language Model Understand Multi-Intent Spoken Language? |
Shangjian Yin et.al. |
2403.04481 |
link |
2024-03-08 |
Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset |
Minjin Kim et.al. |
2403.04460 |
link |
2024-03-06 |
Backtracing: Retrieving the Cause of the Query |
Rose E. Wang et.al. |
2403.03956 |
link |
2024-03-06 |
Bridging Language and Items for Retrieval and Recommendation |
Yupeng Hou et.al. |
2403.03952 |
link |
2024-03-06 |
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models |
Adithya Bhaskar et.al. |
2403.03942 |
link |
2024-03-06 |
Did Translation Models Get More Robust Without Anyone Even Noticing? |
Ben Peters et.al. |
2403.03923 |
null |
2024-03-06 |
Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing |
Asmita et.al. |
2403.03897 |
link |
2024-03-06 |
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators |
Indraneil Paul et.al. |
2403.03894 |
link |
2024-03-06 |
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models |
Luiza Pozzobon et.al. |
2403.03893 |
link |
2024-03-06 |
FaaF: Facts as a Function for the evaluation of RAG systems |
Vasileios Katranidis et.al. |
2403.03888 |
link |
2024-03-06 |
SaulLM-7B: A pioneering Large Language Model for Law |
Pierre Colombo et.al. |
2403.03883 |
null |
2024-03-06 |
Learning to Decode Collaboratively with Multiple Language Models |
Shannon Zejiang Shen et.al. |
2403.03870 |
link |
2024-03-06 |
On the Origins of Linear Representations in Large Language Models |
Yibo Jiang et.al. |
2403.03867 |
null |
2024-03-06 |
KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions |
Fangyuan Xu et.al. |
2403.03866 |
null |
2024-03-06 |
Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning |
Deepanway Ghosal et.al. |
2403.03864 |
link |
2024-03-06 |
X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification |
Hanzi Xu et.al. |
2403.03863 |
link |
2024-03-06 |
Designing Informative Metrics for Few-Shot Example Selection |
Rishabh Adiga et.al. |
2403.03861 |
null |
2024-03-06 |
Emojinize: Enriching Any Text with Emoji Translations |
Lars Henning Klein et.al. |
2403.03857 |
null |
2024-03-06 |
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect |
Xin Men et.al. |
2403.03853 |
null |
2024-03-06 |
Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ |
Carolin Holtermann et.al. |
2403.03814 |
link |
2024-03-06 |
Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery |
Wei Zhang et.al. |
2403.03790 |
null |
2024-03-06 |
PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion |
Zekai Zhang et.al. |
2403.03788 |
link |
2024-03-05 |
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning |
Nathaniel Li et.al. |
2403.03218 |
null |
2024-03-05 |
CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments |
Savitha Sam Abraham et.al. |
2403.03203 |
null |
2024-03-05 |
Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement |
Rafaela Martelo et.al. |
2403.03188 |
link |
2024-03-05 |
Reliable, Adaptable, and Attributable Language Models with Retrieval |
Akari Asai et.al. |
2403.03187 |
null |
2024-03-05 |
MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting |
Fangchen Liu et.al. |
2403.03174 |
null |
2024-03-05 |
SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection |
Peng Qi et.al. |
2403.03170 |
null |
2024-03-05 |
PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset |
Arda Uzunoğlu et.al. |
2403.03167 |
link |
2024-03-05 |
Quantum Many-Body Physics Calculations with Large Language Models |
Haining Pan et.al. |
2403.03154 |
null |
2024-03-05 |
Language Guided Exploration for RL Agents in Text Environments |
Hitesh Golchha et.al. |
2403.03141 |
null |
2024-03-05 |
CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following |
Kaiyan Zhang et.al. |
2403.03129 |
null |
2024-03-05 |
Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution |
Flor Miriam Plaza-del-Arco et.al. |
2403.03121 |
link |
2024-03-05 |
“In Dialogues We Learn”: Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning |
Chuanqi Cheng et.al. |
2403.03102 |
null |
2024-03-05 |
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents |
Yuqi Zhu et.al. |
2403.03101 |
link |
2024-03-05 |
Learning to Use Tools via Cooperative and Interactive Agents |
Zhengliang Shi et.al. |
2403.03031 |
link |
2024-03-05 |
Socratic Reasoning Improves Positive Text Rewriting |
Anmol Goel et.al. |
2403.03029 |
null |
2024-03-05 |
Word Importance Explains How Prompts Affect Language Model Outputs |
Stefan Hackmann et.al. |
2403.03028 |
null |
2024-03-05 |
OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following |
Haochen Shi et.al. |
2403.03017 |
null |
2024-03-05 |
Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations |
Hasan Abu-Rasheed et.al. |
2403.03008 |
null |
2024-03-05 |
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models |
Gen Luo et.al. |
2403.03003 |
link |
2024-03-05 |
Localized Zeroth-Order Prompt Optimization |
Wenyang Hu et.al. |
2403.02993 |
null |
2024-03-02 |
LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems |
Tasnim Ahmed et.al. |
2403.01342 |
null |
2024-03-02 |
Making Hybrid Languages: A Recipe |
Leif Andersen et.al. |
2403.01335 |
null |
2024-03-02 |
Chaining thoughts and LLMs to learn DNA structural biophysics |
Tyler D. Ross et.al. |
2403.01332 |
link |
2024-03-02 |
VBART: The Turkish LLM |
Meliksah Turker et.al. |
2403.01308 |
null |
2024-03-02 |
ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation |
Moran Yanuka et.al. |
2403.01306 |
link |
2024-03-02 |
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning |
Alexander Scarlatos et.al. |
2403.01304 |
link |
2024-03-02 |
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention |
Tianyi Zhang et.al. |
2403.01273 |
link |
2024-03-02 |
Employing LLMs for Incident Response Planning and Review |
Sam Hays et.al. |
2403.01271 |
null |
2024-03-02 |
Dissecting Language Models: Machine Unlearning via Selective Pruning |
Nicholas Pochinkov et.al. |
2403.01267 |
link |
2024-03-02 |
Accelerating Greedy Coordinate Gradient via Probe Sampling |
Yiran Zhao et.al. |
2403.01251 |
link |
2024-03-02 |
SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code |
Ziniu Hu et.al. |
2403.01248 |
null |
2024-03-02 |
Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal |
Jianheng Huang et.al. |
2403.01244 |
link |
2024-03-02 |
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact |
Ruikang Liu et.al. |
2403.01241 |
link |
2024-03-02 |
Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy |
Jamie Hayes et.al. |
2403.01218 |
null |
2024-03-02 |
API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access |
Jiayuan Su et.al. |
2403.01216 |
null |
2024-03-02 |
Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning |
Shuo Yang et.al. |
2403.01209 |
null |
2024-03-02 |
The Case for Animal-Friendly AI |
Sankalpa Ghose et.al. |
2403.01199 |
null |
2024-03-02 |
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling |
Shanghaoran Quan et.al. |
2403.01197 |
link |
2024-03-02 |
RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots |
Philip Feldman, James R. Foulds et.al. |
2403.01193 |
null |
2024-03-02 |
Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding |
Ha-Thanh Nguyen et.al. |
2403.01185 |
null |
2024-02-29 |
The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations? |
Alex Gu et.al. |
2402.19475 |
null |
2024-02-29 |
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World |
Weiyun Wang et.al. |
2402.19474 |
link |
2024-02-29 |
Retrieval-Augmented Generation for AI-Generated Content: A Survey |
Penghao Zhao et.al. |
2402.19473 |
link |
2024-02-29 |
Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling |
Gabriel Grand et.al. |
2402.19471 |
null |
2024-03-01 |
TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning |
Kate Sanders et.al. |
2402.19467 |
null |
2024-02-29 |
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models |
Chen Qian et.al. |
2402.19465 |
link |
2024-02-29 |
Curiosity-driven Red-teaming for Large Language Models |
Zhang-Wei Hong et.al. |
2402.19464 |
link |
2024-02-29 |
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap |
Saurabh Srivastava et.al. |
2402.19450 |
link |
2024-02-29 |
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models |
Frederik Kunstner et.al. |
2402.19449 |
null |
2024-02-29 |
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL |
Yifei Zhou et.al. |
2402.19446 |
link |
2024-02-29 |
Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation |
Jonathan Yang et.al. |
2402.19432 |
null |
2024-02-29 |
Compositional API Recommendation for Library-Oriented Code Generation |
Zexiong Ma et.al. |
2402.19431 |
null |
2024-02-29 |
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models |
Soham De et.al. |
2402.19427 |
null |
2024-02-29 |
Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines |
Lijia Ma et.al. |
2402.19421 |
null |
2024-02-29 |
PaECTER: Patent-level Representation Learning using Citation-informed Transformers |
Mainak Ghosh et.al. |
2402.19411 |
null |
2024-02-29 |
On the Scaling Laws of Geographical Representation in Language Models |
Nathan Godey et.al. |
2402.19406 |
null |
2024-02-29 |
Entity-Aware Multimodal Alignment Framework for News Image Captioning |
Junzhe Zhang et.al. |
2402.19404 |
null |
2024-02-29 |
Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Match Human Crowd Accuracy |
Philipp Schoenegger et.al. |
2402.19379 |
null |
2024-02-29 |
OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models |
Jenish Maharjan et.al. |
2402.19371 |
null |
2024-02-29 |
SoK: Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency |
Akila Wickramasekara et.al. |
2402.19366 |
null |
2024-02-28 |
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards |
Haoxiang Wang et.al. |
2402.18571 |
link |
2024-02-28 |
Diffusion Language Models Are Versatile Protein Learners |
Xinyou Wang et.al. |
2402.18567 |
null |
2024-02-28 |
A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic |
Gregory Coppola et.al. |
2402.18566 |
null |
2024-02-28 |
Approaching Human-Level Forecasting with Language Models |
Danny Halawi et.al. |
2402.18563 |
null |
2024-02-28 |
Implicit Bias of Next-Token Prediction |
Christos Thrampoulidis et.al. |
2402.18551 |
null |
2024-02-28 |
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling |
Mahdi Karami et.al. |
2402.18508 |
null |
2024-02-28 |
Few-Shot Fairness: Unveiling LLM’s Potential for Fairness-Aware Classification |
Garima Chhikara et.al. |
2402.18502 |
null |
2024-02-28 |
Language Models Represent Beliefs of Self and Others |
Wentao Zhu et.al. |
2402.18496 |
null |
2024-02-28 |
IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding |
Lanyun Zhu et.al. |
2402.18476 |
null |
2024-02-28 |
Meta-Task Prompting Elicits Embedding from Large Language Models |
Yibin Lei et.al. |
2402.18458 |
null |
2024-02-28 |
Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization |
Deng Li et.al. |
2402.18447 |
null |
2024-02-28 |
Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication |
Weize Chen et.al. |
2402.18439 |
link |
2024-02-28 |
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models |
Xiujie Song et.al. |
2402.18409 |
link |
2024-02-28 |
Balanced Similarity with Auxiliary Prompts: Towards Alleviating Text-to-Image Retrieval Bias for CLIP in Zero-shot Learning |
Hanyao Wang et.al. |
2402.18400 |
null |
2024-02-28 |
Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models |
Ercong Nie et.al. |
2402.18397 |
null |
2024-02-28 |
The First Place Solution of WSDM Cup 2024: Leveraging Large Language Models for Conversational Multi-Doc QA |
Yiming Li et.al. |
2402.18385 |
link |
2024-02-28 |
Large Language Models As Evolution Strategies |
Robert Tjarko Lange et.al. |
2402.18381 |
null |
2024-02-28 |
Tokenization Is More Than Compression |
Craig W. Schmidt et.al. |
2402.18376 |
null |
2024-02-28 |
VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models |
Seoyeon Kim et.al. |
2402.18374 |
link |
2024-02-28 |
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning |
Jiachun Li et.al. |
2402.18344 |
link |
2024-02-27 |
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction |
Zekun Qi et.al. |
2402.17766 |
link |
2024-02-27 |
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits |
Shuming Ma et.al. |
2402.17764 |
null |
2024-02-27 |
Massive Activations in Large Language Models |
Mingjie Sun et.al. |
2402.17762 |
link |
2024-02-27 |
Towards Optimal Learning of Language Models |
Yuxian Gu et.al. |
2402.17759 |
null |
2024-02-27 |
Evaluating Very Long-Term Conversational Memory of LLM Agents |
Adyasha Maharana et.al. |
2402.17753 |
null |
2024-02-27 |
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks |
Duarte M. Alves et.al. |
2402.17733 |
link |
2024-02-27 |
AmbigNLG: Addressing Task Ambiguity in Instruction for NLG |
Ayana Niwa et.al. |
2402.17717 |
null |
2024-02-27 |
Case-Based or Rule-Based: How Do Transformers Do the Math? |
Yi Hu et.al. |
2402.17709 |
link |
2024-02-27 |
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations |
Jing Huang et.al. |
2402.17700 |
link |
2024-02-27 |
NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents |
Tamara Czinczoll et.al. |
2402.17682 |
link |
2024-02-27 |
The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks |
Ashwin Prasad Shivarpatna Venkatesh et.al. |
2402.17679 |
null |
2024-02-27 |
CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention |
Mohammad Sadil Khan et.al. |
2402.17678 |
null |
2024-02-27 |
Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models |
Yunpeng Huang et.al. |
2402.17671 |
null |
2024-02-27 |
Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs |
Tanise Ceron et.al. |
2402.17649 |
null |
2024-02-27 |
SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation |
Shuangrui Ding et.al. |
2402.17645 |
link |
2024-02-27 |
Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data |
Xiao Liu et.al. |
2402.17644 |
link |
2024-02-27 |
Variational Learning is Effective for Large Deep Networks |
Yuesong Shen et.al. |
2402.17641 |
link |
2024-02-27 |
Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling |
David S. W. Williams et.al. |
2402.17622 |
null |
2024-02-27 |
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization |
Wenqi Zhang et.al. |
2402.17574 |
link |
2024-02-27 |
Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers |
Xinyu Tang et.al. |
2402.17564 |
link |
2024-02-26 |
Integrating Large Language Models with Graphical Session-Based Recommendation |
Naicheng Guo et.al. |
2402.16539 |
null |
2024-02-26 |
LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments |
Junzhe Chen et.al. |
2402.16499 |
link |
2024-02-26 |
On Languaging a Simulation Engine |
Han Liu et.al. |
2402.16482 |
null |
2024-02-26 |
Unveiling ChatGPT’s Usage in Open Source Projects: A Mining-based Study |
Rosalia Tufano et.al. |
2402.16480 |
null |
2024-02-26 |
mEdIT: Multilingual Text Editing via Instruction Tuning |
Vipul Raheja et.al. |
2402.16472 |
link |
2024-02-26 |
Unveiling Vulnerability of Self-Attention |
Khai Jiet Liong et.al. |
2402.16470 |
link |
2024-02-26 |
Defending LLMs against Jailbreaking Attacks via Backtranslation |
Yihan Wang et.al. |
2402.16459 |
link |
2024-02-26 |
ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing |
Liuzhenghao Lv et.al. |
2402.16445 |
link |
2024-02-26 |
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors |
Zhexin Zhang et.al. |
2402.16444 |
link |
2024-02-26 |
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models |
Tianyi Tang et.al. |
2402.16438 |
null |
2024-02-26 |
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions |
Yuansen Zhang et.al. |
2402.16431 |
null |
2024-02-26 |
Predicting Sustainable Development Goals Using Course Descriptions – from LLMs to Conventional Foundation Models |
Lev Kharlashkin et.al. |
2402.16420 |
null |
2024-02-26 |
From RAGs to riches: Using large language models to write documents for clinical trials |
Nigel Markey et.al. |
2402.16406 |
null |
2024-02-26 |
MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property |
Shiwen Ni et.al. |
2402.16389 |
link |
2024-02-26 |
Immunization against harmful fine-tuning attacks |
Domenic Rosati et.al. |
2402.16382 |
null |
2024-02-26 |
Improving LLM-based Machine Translation with Systematic Self-Correction |
Zhaopeng Feng et.al. |
2402.16379 |
link |
2024-02-26 |
Unraveling Babel: Exploring Multilingual Activation Patterns within Large Language Models |
Weize Liu et.al. |
2402.16367 |
null |
2024-02-26 |
LLM Inference Unveiled: Survey and Roofline Model Insights |
Zhihang Yuan et.al. |
2402.16363 |
link |
2024-02-26 |
Layer-wise Regularized Dropout for Neural Language Models |
Shiwen Ni et.al. |
2402.16361 |
null |
2024-02-26 |
An Integrated Data Processing Framework for Pretraining Foundation Models |
Yiding Sun et.al. |
2402.16358 |
link |
2024-02-26 |
Language-guided Skill Learning with Temporal Variational Inference |
Haotian Fu et.al. |
2402.16354 |
null |
2024-02-23 |
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning |
Jianguo Zhang et.al. |
2402.15506 |
link |
2024-02-23 |
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs |
Kinjal Basu et.al. |
2402.15491 |
link |
2024-02-23 |
Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models |
Yiran Liu et.al. |
2402.15481 |
null |
2024-02-23 |
Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization |
Swaroop Nath et.al. |
2402.15473 |
link |
2024-02-23 |
Repetition Improves Language Model Embeddings |
Jacob Mitchell Springer et.al. |
2402.15449 |
link |
2024-02-23 |
A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models |
Stefan Hegselmann et.al. |
2402.15422 |
link |
2024-02-23 |
PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning |
Simon Holk et.al. |
2402.15420 |
null |
2024-02-23 |
Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy? |
Nader Asadi et.al. |
2402.15414 |
null |
2024-02-23 |
Grasp, See and Place: Efficient Unknown Object Rearrangement with Policy Structure Prior |
Kechun Xu et.al. |
2402.15402 |
link |
2024-02-23 |
Explorations of Self-Repair in Language Models |
Cody Rushing et.al. |
2402.15390 |
link |
2024-02-23 |
Safe Task Planning for Language-Instructed Multi-Robot Systems using Conformal Prediction |
Jun Wang et.al. |
2402.15368 |
null |
2024-02-23 |
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping |
Zijie J. Wang et.al. |
2402.15350 |
link |
2024-02-23 |
NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data |
Sergei Bogdanov et.al. |
2402.15343 |
link |
2024-02-23 |
Ranking Entities along Conceptual Space Dimensions with LLMs: An Analysis of Fine-Tuning Strategies |
Nitesh Kumar et.al. |
2402.15337 |
null |
2024-02-23 |
GPTVQ: The Blessing of Dimensionality for LLM Quantization |
Mart van Baalen et.al. |
2402.15319 |
null |
2024-02-23 |
ArabianGPT: Native Arabic GPT-based Large Language |
Anis Koubaa et.al. |
2402.15313 |
null |
2024-02-23 |
Counterfactual Generation with Identifiability Guarantees |
Hanqi Yan et.al. |
2402.15309 |
link |
2024-02-23 |
Representing Online Handwriting for Recognition in Large Vision-Language Models |
Anastasiia Fadeeva et.al. |
2402.15307 |
null |
2024-02-23 |
How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries |
Somnath Banerjee et.al. |
2402.15302 |
link |
2024-02-23 |
Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models |
Yuzhe Zhang et.al. |
2402.15301 |
null |
2024-02-22 |
PALO: A Polyglot Large Multimodal Model for 5B People |
Muhammad Maaz et.al. |
2402.14818 |
link |
2024-02-22 |
Demographic Bias of Expert-Level Vision-Language Foundation Models in Medical Imaging |
Yuzhe Yang et.al. |
2402.14815 |
link |
2024-02-22 |
WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition |
Lianghui Zhu et.al. |
2402.14812 |
link |
2024-02-22 |
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking |
Nikhil Prakash et.al. |
2402.14811 |
null |
2024-02-22 |
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning |
Zicheng Lin et.al. |
2402.14809 |
link |
2024-02-22 |
RelayAttention for Efficient Large Language Model Serving with Long System Prompts |
Lei Zhu et.al. |
2402.14808 |
link |
2024-02-22 |
A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health |
Nikhil Behari et.al. |
2402.14807 |
null |
2024-02-22 |
Identifying Multiple Personalities in Large Language Models with External Evaluation |
Xiaoyang Song et.al. |
2402.14805 |
null |
2024-02-22 |
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models |
Xudong Lu et.al. |
2402.14800 |
link |
2024-02-22 |
Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic |
Nathaniel Weir et.al. |
2402.14798 |
null |
2024-02-22 |
Zero-shot cross-lingual transfer in instruction tuning of large language model |
Nadezhda Chirkova et.al. |
2402.14778 |
null |
2024-02-22 |
2D Matryoshka Sentence Embeddings |
Xianming Li et.al. |
2402.14776 |
link |
2024-02-22 |
DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models |
Yuhang Cao et.al. |
2402.14767 |
link |
2024-02-22 |
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues |
Ge Bai et.al. |
2402.14762 |
link |
2024-02-22 |
Generalizing Reward Modeling for Out-of-Distribution Preference Learning |
Chen Jia et.al. |
2402.14760 |
link |
2024-02-22 |
Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation |
Jiawei Wang et.al. |
2402.14744 |
link |
2024-02-22 |
Dependency Annotation of Ottoman Turkish with Multilingual BERT |
Şaziye Betül Özateş et.al. |
2402.14743 |
null |
2024-02-22 |
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs |
Arash Ahmadian et.al. |
2402.14740 |
null |
2024-02-22 |
Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models |
Seungduk Kim et.al. |
2402.14714 |
link |
2024-02-22 |
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus |
Honghao Gui et.al. |
2402.14710 |
link |
2024-02-21 |
Coercing LLMs to do and reveal (almost) anything |
Jonas Geiping et.al. |
2402.14020 |
link |
2024-02-21 |
Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment |
Vyas Raina et.al. |
2402.14016 |
link |
2024-02-21 |
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems |
Chaoqun He et.al. |
2402.14008 |
link |
2024-02-21 |
Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models |
Zhiwei He et.al. |
2402.14007 |
link |
2024-02-21 |
Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models |
Aline Ioste et.al. |
2402.14002 |
null |
2024-02-21 |
Analysing The Impact of Sequence Composition on Language Model Pre-Training |
Yu Zhao et.al. |
2402.13991 |
link |
2024-02-21 |
Towards Building Multilingual Language Model for Medicine |
Pengcheng Qiu et.al. |
2402.13963 |
link |
2024-02-21 |
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality |
Rahul Zalkikar et.al. |
2402.13954 |
null |
2024-02-21 |
Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning |
Debjit Paul et.al. |
2402.13950 |
null |
2024-02-21 |
Do Efficient Transformers Really Save Computation? |
Kai Yang et.al. |
2402.13934 |
null |
2024-02-21 |
Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content |
Federico Bianchi et.al. |
2402.13926 |
null |
2024-02-21 |
SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization |
Prakamya Mishra et.al. |
2402.13919 |
link |
2024-02-21 |
What Linguistic Features and Languages are Important in LLM Translation? |
Ryandito Diandaru et.al. |
2402.13917 |
null |
2024-02-21 |
Calibrating Large Language Models with Sample Consistency |
Qing Lyu et.al. |
2402.13904 |
null |
2024-02-21 |
Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models |
Chenyang Lyu et.al. |
2402.13887 |
null |
2024-02-21 |
$\texttt{Se}^2$: $\textit{Se}$quential Example $\textit{Se}$lection for In-Context Learning |
Haoyu Liu et.al. |
2402.13874 |
link |
2024-02-21 |
An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach |
Mohammad Amaz Uddin et.al. |
2402.13871 |
null |
2024-02-21 |
Kuaiji: the First Chinese Accounting Large Language Model |
Jiayuan Luo et.al. |
2402.13866 |
null |
2024-02-21 |
RealDex: Towards Human-like Grasping for Robotic Dexterous Hand |
Yumeng Liu et.al. |
2402.13853 |
null |
2024-02-21 |
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models |
Jiawei Liang et.al. |
2402.13851 |
null |
2024-02-20 |
Towards audio language modeling – an overview |
Haibin Wu et.al. |
2402.13236 |
null |
2024-02-20 |
Unlocking Insights: Semantic Search in Jupyter Notebooks |
Lan Li et.al. |
2402.13234 |
null |
2024-02-20 |
A Touch, Vision, and Language Dataset for Multimodal Alignment |
Letian Fu et.al. |
2402.13232 |
link |
2024-02-20 |
Investigating Cultural Alignment of Large Language Models |
Badr AlKhamissi et.al. |
2402.13231 |
link |
2024-02-20 |
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive |
Arka Pal et.al. |
2402.13228 |
link |
2024-02-20 |
AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning |
Qiao Jin et.al. |
2402.13225 |
null |
2024-02-20 |
RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian |
Adrian Cosma et.al. |
2402.13222 |
link |
2024-02-20 |
How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts |
Yusu Qian et.al. |
2402.13220 |
null |
2024-02-20 |
Softmax Probabilities (Mostly) Predict Large Language Model Correctness on Multiple-Choice Q&A |
Benjamin Plaut et.al. |
2402.13213 |
link |
2024-02-20 |
Soft Self-Consistency Improves Language Model Agents |
Han Wang et.al. |
2402.13212 |
link |
2024-02-20 |
Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation |
Dongjin Kang et.al. |
2402.13211 |
null |
2024-02-20 |
Bayesian Reward Models for LLM Alignment |
Adam X. Yang et.al. |
2402.13210 |
null |
2024-02-20 |
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena |
Marco Gaido et.al. |
2402.13208 |
link |
2024-02-20 |
Question Calibration and Multi-Hop Modeling for Temporal Question Answering |
Chao Xue et.al. |
2402.13188 |
null |
2024-02-20 |
What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents |
Mingyu Jin et.al. |
2402.13184 |
null |
2024-02-20 |
DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models |
Norman Di Palo et.al. |
2402.13181 |
null |
2024-02-20 |
Benchmarking Retrieval-Augmented Generation for Medicine |
Guangzhi Xiong et.al. |
2402.13178 |
link |
2024-02-20 |
Defending Jailbreak Prompts via In-Context Adversarial Game |
Yujun Zhou et.al. |
2402.13148 |
null |
2024-02-20 |
OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog |
Adnen Abdessaied et.al. |
2402.13146 |
null |
2024-02-20 |
The Hidden Space of Transformer Language Adapters |
Jesujoba O. Alabi et.al. |
2402.13137 |
link |
2024-02-19 |
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding |
Zhuoming Chen et.al. |
2402.12374 |
link |
2024-02-19 |
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies |
Xiao Ye et.al. |
2402.12370 |
link |
2024-02-19 |
A Critical Evaluation of AI Feedback for Aligning Large Language Models |
Archit Sharma et.al. |
2402.12366 |
link |
2024-02-19 |
Emergent Word Order Universals from Cognitively-Motivated Language Models |
Tatsuki Kuribayashi et.al. |
2402.12363 |
link |
2024-02-19 |
Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge |
Julien Delile et.al. |
2402.12352 |
null |
2024-02-19 |
GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations |
Jinhao Duan et.al. |
2402.12348 |
link |
2024-02-19 |
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! |
Zhanhui Zhou et.al. |
2402.12343 |
link |
2024-02-19 |
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models |
Christian Schlarmann et.al. |
2402.12336 |
link |
2024-02-19 |
Query-Based Adversarial Prompt Generation |
Jonathan Hayase et.al. |
2402.12329 |
null |
2024-02-19 |
Shall We Talk: Exploring Spontaneous Collaborations of Competing LLM Agents |
Zengqing Wu et.al. |
2402.12327 |
link |
2024-02-19 |
ARKS: Active Retrieval in Knowledge Soup for Code Generation |
Hongjin Su et.al. |
2402.12317 |
link |
2024-02-19 |
Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports |
Felix J. Dorfner et.al. |
2402.12298 |
null |
2024-02-19 |
KARL: Knowledge-Aware Retrieval and Representations aid Retention and Learning in Students |
Matthew Shu et.al. |
2402.12291 |
null |
2024-02-19 |
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models |
Xiaoyu Tian et.al. |
2402.12289 |
null |
2024-02-19 |
Adaptive Skeleton Graph Decoding |
Shuowei Jin et.al. |
2402.12280 |
null |
2024-02-19 |
Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks |
Nadezhda Chirkova et.al. |
2402.12279 |
null |
2024-02-19 |
Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from Large Language Models |
Puxuan Yu et.al. |
2402.12276 |
link |
2024-02-19 |
High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models |
Michela Lorandi et.al. |
2402.12267 |
link |
2024-02-19 |
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles |
Oleksandr Balabanov et.al. |
2402.12264 |
null |
2024-02-19 |
NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms |
Jonathan Zheng et.al. |
2402.12261 |
null |
2024-02-16 |
PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter |
Junfei Xiao et.al. |
2402.10896 |
null |
2024-02-16 |
RLVF: Learning from Verbal Feedback without Overgeneralization |
Moritz Stephan et.al. |
2402.10893 |
link |
2024-02-16 |
Instruction Diversity Drives Generalization To Unseen Tasks |
Dylan Zhang et.al. |
2402.10891 |
null |
2024-02-16 |
When is Tree Search Useful for LLM Planning? It Depends on the Discriminator |
Ziru Chen et.al. |
2402.10890 |
link |
2024-02-16 |
Multi-modal preference alignment remedies regression of visual instruction tuning on language model |
Shengzhi Li et.al. |
2402.10884 |
link |
2024-02-16 |
EcoRank: Budget-Constrained Text Re-ranking Using Large Language Models |
Muhammad Shihab Rashid et.al. |
2402.10866 |
link |
2024-02-16 |
Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities |
Mingyu Jin et.al. |
2402.10835 |
null |
2024-02-16 |
RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model |
Jianhao Yuan et.al. |
2402.10828 |
null |
2024-02-16 |
Quantifying the Persona Effect in LLM Simulations |
Tiancheng Hu et.al. |
2402.10811 |
null |
2024-02-16 |
Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond |
Yongqi Li et.al. |
2402.10805 |
null |
2024-02-16 |
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge |
Xuan Shen et.al. |
2402.10787 |
link |
2024-02-16 |
A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models |
Mingchen Li et.al. |
2402.10779 |
null |
2024-02-16 |
AutoGPT+P: Affordance-based Task Planning with Large Language Models |
Timo Birr et.al. |
2402.10778 |
null |
2024-02-16 |
How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs? |
Ehsan Doostmohammadi et.al. |
2402.10770 |
null |
2024-02-16 |
Distillation Enhanced Generative Retrieval |
Yongqi Li et.al. |
2402.10769 |
null |
2024-02-16 |
Inference to the Best Explanation in Large Language Models |
Dhairya Dalal et.al. |
2402.10767 |
null |
2024-02-16 |
When Dataflow Analysis Meets Large Language Models |
Chengpeng Wang et.al. |
2402.10754 |
null |
2024-02-16 |
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages |
Junjie Ye et.al. |
2402.10753 |
link |
2024-02-16 |
GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models |
Pengcheng Jiang et.al. |
2402.10744 |
link |
2024-02-16 |
Let’s Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning |
Yinpeng Liu et.al. |
2402.10738 |
link |
2024-02-15 |
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation |
Huizhuo Yuan et.al. |
2402.10210 |
null |
2024-02-15 |
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment |
Rui Yang et.al. |
2402.10207 |
link |
2024-02-15 |
Chain-of-Thought Reasoning Without Prompting |
Xuezhi Wang et.al. |
2402.10200 |
null |
2024-02-15 |
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents |
Lingbo Mo et.al. |
2402.10196 |
link |
2024-02-15 |
BitDelta: Your Fine-Tune May Only Be Worth One Bit |
James Liu et.al. |
2402.10193 |
link |
2024-02-15 |
Uncertainty Decomposition and Quantification for In-Context Learning of Large Language Models |
Chen Ling et.al. |
2402.10189 |
link |
2024-02-15 |
Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective |
Tianyi Qiu et.al. |
2402.10184 |
null |
2024-02-15 |
TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation |
Yaoxiang Wang et.al. |
2402.10178 |
null |
2024-02-15 |
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset |
Shubham Toshniwal et.al. |
2402.10176 |
link |
2024-02-15 |
Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence |
Yinhong Liu et.al. |
2402.10175 |
link |
2024-02-15 |
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models |
Ali AhmadiTeshnizi et.al. |
2402.10172 |
null |
2024-02-15 |
Data Engineering for Scaling Language Models to 128K Context |
Yao Fu et.al. |
2402.10171 |
link |
2024-02-15 |
Knowledge-Infused LLM-Powered Conversational Health Agent: A Case Study for Diabetes Patients |
Mahyar Abbasian et.al. |
2402.10153 |
null |
2024-02-15 |
ControlLM: Crafting Diverse Personalities for Language Models |
Yixuan Weng et.al. |
2402.10151 |
link |
2024-02-15 |
TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles |
Yinhong Liu et.al. |
2402.10137 |
null |
2024-02-15 |
Zero-Shot Reasoning: Personalized Content Generation Without the Cold Start Problem |
Davor Hafnar et.al. |
2402.10133 |
link |
2024-02-15 |
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning |
Ming Li et.al. |
2402.10110 |
link |
2024-02-15 |
Quantized Embedding Vectors for Controllable Diffusion Language Models |
Cheng Kang et.al. |
2402.10107 |
null |
2024-02-15 |
GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving |
Jiaxin Zhang et.al. |
2402.10104 |
link |
2024-02-15 |
Any-Shift Prompting for Generalization over Distributions |
Zehao Xiao et.al. |
2402.10099 |
null |
2024-02-14 |
AQA-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability |
Siwei Yang et.al. |
2402.09404 |
link |
2024-02-14 |
Reinforcement Learning from Human Feedback with Active Queries |
Kaixuan Ji et.al. |
2402.09401 |
null |
2024-02-14 |
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference |
Harry Dong et.al. |
2402.09398 |
link |
2024-02-14 |
LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset |
Botao Yu et.al. |
2402.09391 |
link |
2024-02-14 |
HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation |
Yihao Fang et.al. |
2402.09390 |
link |
2024-02-14 |
Transformers Can Achieve Length Generalization But Not Robustly |
Yongchao Zhou et.al. |
2402.09371 |
null |
2024-02-14 |
Pseudorandom Error-Correcting Codes |
Miranda Christ et.al. |
2402.09370 |
null |
2024-02-14 |
Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking |
Yi Fung et.al. |
2402.09369 |
link |
2024-02-14 |
Copyright Traps for Large Language Models |
Matthieu Meeus et.al. |
2402.09363 |
link |
2024-02-14 |
HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference |
Yashas Samaga B L et.al. |
2402.09360 |
null |
2024-02-14 |
Developing a Framework for Auditing Large Language Models Using Human-in-the-Loop |
Maryam Amirizaniani et.al. |
2402.09346 |
null |
2024-02-14 |
Mitigating Reward Hacking via Information-Theoretic Reward Modeling |
Yuchun Miao et.al. |
2402.09345 |
null |
2024-02-14 |
AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach |
Maryam Amirizaniani et.al. |
2402.09334 |
null |
2024-02-14 |
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization |
Feifan Song et.al. |
2402.09320 |
link |
2024-02-14 |
Embracing the black box: Heading towards foundation models for causal discovery from time series data |
Gideon Stein et.al. |
2402.09305 |
link |
2024-02-14 |
Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code |
Vahid Majdinasab et.al. |
2402.09299 |
link |
2024-02-14 |
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey |
Zhichen Dong et.al. |
2402.09283 |
link |
2024-02-14 |
Leveraging Large Language Models for Enhanced NLP Task Performance through Knowledge Distillation and Optimized Training Strategies |
Yining Huang et.al. |
2402.09282 |
null |
2024-02-14 |
Personalized Large Language Models |
Stanisław Woźniak et.al. |
2402.09269 |
null |
2024-02-14 |
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation |
Xiaoying Zhang et.al. |
2402.09267 |
null |
2024-02-13 |
Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance |
Linxi Zhao et.al. |
2402.08680 |
null |
2024-02-13 |
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability |
Xingang Guo et.al. |
2402.08679 |
link |
2024-02-13 |
Human Curriculum Effects Emerge with In-Context Learning in Neural Networks |
Jacob Russin et.al. |
2402.08674 |
null |
2024-02-13 |
Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models |
Yuqing Liu et.al. |
2402.08670 |
null |
2024-02-13 |
Improving Generalization in Semantic Parsing by Increasing Natural Language Variation |
Irina Saparina et.al. |
2402.08666 |
link |
2024-02-13 |
The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in a Prospective Cardiac Rehabilitation Setting |
David Haag et.al. |
2402.08658 |
null |
2024-02-13 |
PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs |
Michael Dorkenwald et.al. |
2402.08657 |
null |
2024-02-13 |
Tandem Transformers for Inference Efficient LLMs |
Aishwarya P S et.al. |
2402.08644 |
null |
2024-02-13 |
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages |
Nedjma Ousidhoum et.al. |
2402.08638 |
null |
2024-02-13 |
Knowledge Editing on Black-box Large Language Models |
Xiaoshuai Song et.al. |
2402.08631 |
link |
2024-02-13 |
Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning |
Haeju Lee et.al. |
2402.08594 |
link |
2024-02-13 |
Test-Time Backdoor Attacks on Multimodal Large Language Models |
Dong Lu et.al. |
2402.08577 |
link |
2024-02-13 |
Online Foundation Model Selection in Robotics |
Po-han Li et.al. |
2402.08570 |
null |
2024-02-13 |
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast |
Xiangming Gu et.al. |
2402.08567 |
link |
2024-02-13 |
Artificial Intelligence for Literature Reviews: Opportunities and Challenges |
Francisco Bolanos et.al. |
2402.08565 |
null |
2024-02-13 |
Higher Layers Need More LoRA Experts |
Chongyang Gao et.al. |
2402.08562 |
link |
2024-02-13 |
Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback |
Vineet Bhat et.al. |
2402.08546 |
null |
2024-02-13 |
The Application of ChatGPT in Responding to Questions Related to the Boston Bowel Preparation Scale |
Xiaoqiang Liu et.al. |
2402.08492 |
null |
2024-02-13 |
Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models |
Shaeke Salman et.al. |
2402.08473 |
null |
2024-02-13 |
Large Language Models for the Automated Analysis of Optimization Algorithms |
Camilo Chacón Sartori et.al. |
2402.08472 |
link |
2024-02-12 |
A systematic investigation of learnability from single child linguistic input |
Yulu Qin et.al. |
2402.07899 |
link |
2024-02-12 |
Suppressing Pink Elephants with Direct Principle Feedback |
Louis Castricato et.al. |
2402.07896 |
null |
2024-02-12 |
WildfireGPT: Tailored Large Language Model for Wildfire Analysis |
Yangxinyu Xie et.al. |
2402.07877 |
null |
2024-02-12 |
Policy Improvement using Language Feedback Models |
Victor Zhong et.al. |
2402.07876 |
null |
2024-02-12 |
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs |
Soroush Nasiriany et.al. |
2402.07872 |
null |
2024-02-12 |
Scaling Laws for Fine-Grained Mixture of Experts |
Jakub Krajewski et.al. |
2402.07871 |
link |
2024-02-12 |
PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models |
Wei Zou et.al. |
2402.07867 |
link |
2024-02-12 |
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models |
Siddharth Karamcheti et.al. |
2402.07865 |
link |
2024-02-12 |
AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy |
Philipp Schoenegger et.al. |
2402.07862 |
null |
2024-02-12 |
Lissard: Long and Simple Sequential Reasoning Datasets |
Mirelle Bueno et.al. |
2402.07859 |
null |
2024-02-12 |
Mercury: An Efficiency Benchmark for LLM Code Synthesis |
Mingzhe Du et.al. |
2402.07844 |
link |
2024-02-12 |
Do Membership Inference Attacks Work on Large Language Models? |
Michael Duan et.al. |
2402.07841 |
link |
2024-02-12 |
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model |
Ahmet Üstün et.al. |
2402.07827 |
null |
2024-02-12 |
Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning |
Z Liu et.al. |
2402.07818 |
null |
2024-02-12 |
Injecting Wiktionary to improve token-level contextual representations using contrastive learning |
Anna Mosolova et.al. |
2402.07817 |
null |
2024-02-12 |
Retrieval-Augmented Thought Process as Sequential Decision Making |
Thomas Pouplin et.al. |
2402.07812 |
null |
2024-02-12 |
Empowering Federated Learning for Massive Models with NVIDIA FLARE |
Holger R. Roth et.al. |
2402.07792 |
null |
2024-02-12 |
TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection |
Hui Liu et.al. |
2402.07776 |
link |
2024-02-12 |
Quantitative knowledge retrieval from large language models |
David Selby et.al. |
2402.07770 |
link |
2024-02-12 |
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model |
Mikail Khona et.al. |
2402.07757 |
null |
2024-02-09 |
Feedback Loops With Language Models Drive In-Context Reward Hacking |
Alexander Pan et.al. |
2402.06627 |
link |
2024-02-09 |
Understanding the Effects of Iterative Prompting on Truthfulness |
Satyapriya Krishna et.al. |
2402.06625 |
null |
2024-02-09 |
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning |
Shivalika Singh et.al. |
2402.06619 |
null |
2024-02-09 |
FaBERT: Pre-training BERT on Persian Blogs |
Mostafa Masumi et.al. |
2402.06617 |
null |
2024-02-09 |
On the Out-Of-Distribution Generalization of Multimodal Large Language Models |
Xingxuan Zhang et.al. |
2402.06599 |
null |
2024-02-09 |
CigaR: Cost-efficient Program Repair with LLMs |
Dávid Hidvégi et.al. |
2402.06598 |
link |
2024-02-09 |
Understanding the Weakness of Large Language Model Agents within a Complex Android Environment |
Mingzhe Xing et.al. |
2402.06596 |
link |
2024-02-09 |
Self-consistent context aware conformer transducer for speech recognition |
Konstantin Kolokolov et.al. |
2402.06592 |
null |
2024-02-09 |
G-SciEdBERT: A Contextualized LLM for Science Assessment Tasks in German |
Ehsan Latif et.al. |
2402.06584 |
null |
2024-02-09 |
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning |
Amir Ziai et.al. |
2402.06560 |
link |
2024-02-09 |
The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model |
Gregory Coppola et.al. |
2402.06557 |
link |
2024-02-09 |
Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA |
Marek Šuppa et.al. |
2402.06549 |
link |
2024-02-09 |
Calibrating Long-form Generations from Large Language Models |
Yukun Huang et.al. |
2402.06544 |
null |
2024-02-09 |
Introspective Planning: Guiding Language-Enabled Agents to Refine Their Own Uncertainty |
Kaiqu Liang et.al. |
2402.06529 |
link |
2024-02-09 |
Multimodal Clinical Trial Outcome Prediction with Large Language Models |
Wenhao Zheng et.al. |
2402.06512 |
link |
2024-02-09 |
Iris-SAM: Iris Segmentation Using a Foundational Model |
Parisa Farmanifard et.al. |
2402.06497 |
link |
2024-02-09 |
Large Language Models for Captioning and Retrieving Remote Sensing Images |
João Daniel Silva et.al. |
2402.06475 |
null |
2024-02-09 |
V-STaR: Training Verifiers for Self-Taught Reasoners |
Arian Hosseini et.al. |
2402.06457 |
null |
2024-02-09 |
StruQ: Defending Against Prompt Injection with Structured Queries |
Sizhe Chen et.al. |
2402.06363 |
null |
2024-02-09 |
CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models |
Peiyuan Gong et.al. |
2402.06360 |
link |
2024-02-08 |
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models |
Peng Gao et.al. |
2402.05935 |
link |
2024-02-08 |
Driving Everywhere with Large Language Model Policy Adaptation |
Boyi Li et.al. |
2402.05932 |
null |
2024-02-08 |
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
Xing Han Lù et.al. |
2402.05930 |
link |
2024-02-08 |
An Interactive Agent Foundation Model |
Zane Durante et.al. |
2402.05929 |
null |
2024-02-08 |
On the Convergence of Zeroth-Order Federated Tuning in Large Language Models |
Zhenqing Ling et.al. |
2402.05926 |
link |
2024-02-08 |
Efficient Stagewise Pretraining via Progressive Subnetworks |
Abhishek Panigrahi et.al. |
2402.05913 |
null |
2024-02-08 |
FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs |
Eun Cheol Choi et.al. |
2402.05904 |
link |
2024-02-08 |
Large Language Model Meets Graph Neural Network in Knowledge Distillation |
Shengxiang Hu et.al. |
2402.05894 |
null |
2024-02-08 |
Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking |
Nikhil Sharma et.al. |
2402.05880 |
null |
2024-02-08 |
PromptCrypt: Prompt Encryption for Secure Communication with Large Language Models |
Guo Lin et.al. |
2402.05868 |
link |
2024-02-08 |
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis |
Federico Bianchi et.al. |
2402.05863 |
link |
2024-02-08 |
Let Your Graph Do the Talking: Encoding Structured Data for LLMs |
Bryan Perozzi et.al. |
2402.05862 |
null |
2024-02-08 |
Learning to Route Among Specialized Experts for Zero-Shot Generalization |
Mohammed Muqeeth et.al. |
2402.05859 |
link |
2024-02-08 |
Limitations of Agents Simulated by Predictive Models |
Raymond Douglas et.al. |
2402.05829 |
null |
2024-02-08 |
Is it Possible to Edit Large Language Models Robustly? |
Xinbei Ma et.al. |
2402.05827 |
link |
2024-02-08 |
Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models |
Lingzhi Wang et.al. |
2402.05813 |
null |
2024-02-08 |
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning |
Zhiheng Xi et.al. |
2402.05808 |
link |
2024-02-08 |
How do Transformers perform In-Context Autoregressive Learning? |
Michael E. Sander et.al. |
2402.05787 |
null |
2024-02-08 |
Limits of Transformer Language Models on Algorithmic Learning |
Jonathan Thomm et.al. |
2402.05785 |
null |
2024-02-08 |
Text-to-Code Generation with Modality-relative Pre-training |
Fenia Christopoulou et.al. |
2402.05783 |
null |
2024-02-07 |
Opening the AI black box: program synthesis via mechanistic interpretability |
Eric J. Michaud et.al. |
2402.05110 |
link |
2024-02-07 |
You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models |
Alix Decrop et.al. |
2402.05102 |
null |
2024-02-07 |
Hydragen: High-Throughput LLM Inference with Shared Prefixes |
Jordan Juravsky et.al. |
2402.05099 |
link |
2024-02-07 |
Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation |
Dennis Hoftijzer et.al. |
2402.05090 |
link |
2024-02-07 |
A Roadmap to Pluralistic Alignment |
Taylor Sorensen et.al. |
2402.05070 |
link |
2024-02-07 |
SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models |
Lijun Li et.al. |
2402.05044 |
link |
2024-02-07 |
How BERT Speaks Shakespearean English? Evaluating Historical Bias in Contextual Language Models |
Miriam Cuscito et.al. |
2402.05034 |
null |
2024-02-07 |
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? |
Agustinus Kristiadi et.al. |
2402.05015 |
link |
2024-02-07 |
Pedagogical Alignment of Large Language Models |
Shashank Sonkar et.al. |
2402.05000 |
null |
2024-02-07 |
An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration |
Yihao Li et.al. |
2402.04978 |
null |
2024-02-07 |
ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12 |
Liuqing Chen et.al. |
2402.04975 |
null |
2024-02-07 |
Reconfidencing LLMs from the Grouping Loss Perspective |
Lihu Chen et.al. |
2402.04957 |
null |
2024-02-07 |
Chatbots in Knowledge-Intensive Contexts: Comparing Intent and LLM-Based Systems |
Samuel Kernan Freire et.al. |
2402.04955 |
null |
2024-02-07 |
Prompting Implicit Discourse Relation Annotation |
Frances Yung et.al. |
2402.04918 |
null |
2024-02-07 |
Personalized Text Generation with Fine-Grained Linguistic Control |
Bashar Alhafni et.al. |
2402.04914 |
link |
2024-02-07 |
L4Q: Parameter Efficient Quantization-Aware Training on Large Language Models via LoRA-wise LSQ |
Hyesung Jeon et.al. |
2402.04902 |
null |
2024-02-07 |
Detecting Generated Native Ads in Conversational Search |
Sebastian Schmidt et.al. |
2402.04889 |
link |
2024-02-07 |
Multimodal Query Suggestion with Multi-Agent Reinforcement Learning from Human Feedback |
Zheng Wang et.al. |
2402.04867 |
null |
2024-02-07 |
Automated Smart Contract Summarization via LLMs |
Yingjie Mao et.al. |
2402.04863 |
null |
2024-02-07 |
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay |
Natasha Butt et.al. |
2402.04858 |
link |
2024-02-06 |
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls |
Yu Du et.al. |
2402.04253 |
link |
2024-02-06 |
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal |
Mantas Mazeika et.al. |
2402.04249 |
link |
2024-02-06 |
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks |
Jongho Park et.al. |
2402.04248 |
link |
2024-02-06 |
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science |
Xiangru Tang et.al. |
2402.04247 |
null |
2024-02-06 |
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations |
Ji Qi et.al. |
2402.04236 |
link |
2024-02-06 |
Can Generative Agents Predict Emotion? |
Ciaran Regan et.al. |
2402.04232 |
null |
2024-02-06 |
“Task Success” is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors |
Lin Guan et.al. |
2402.04210 |
null |
2024-02-06 |
Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models |
David Sobrín-Hidalgo et.al. |
2402.04206 |
link |
2024-02-06 |
SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models |
Yichen Shi et.al. |
2402.04178 |
link |
2024-02-06 |
Scaling Laws for Downstream Task Performance of Large Language Models |
Berivan Isik et.al. |
2402.04177 |
null |
2024-02-06 |
Harnessing the Plug-and-Play Controller by Prompting |
Hao Wang et.al. |
2402.04160 |
null |
2024-02-06 |
Multi-line AI-assisted Code Authoring |
Omer Dunay et.al. |
2402.04141 |
null |
2024-02-06 |
Advancing Legal Reasoning: The Integration of AI to Navigate Complexities and Biases in Global Jurisprudence with Semi-Automated Arbitration Processes (SAAPs) |
Michael De’Shazer et.al. |
2402.04140 |
null |
2024-02-06 |
Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science |
Pengfei Liu et.al. |
2402.04119 |
link |
2024-02-06 |
Measuring Implicit Bias in Explicitly Unbiased Large Language Models |
Xuechunzi Bai et.al. |
2402.04105 |
link |
2024-02-06 |
The Use of a Large Language Model for Cyberbullying Detection |
Bayode Ogunleye et.al. |
2402.04088 |
null |
2024-02-06 |
A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation |
Zhengbo Wang et.al. |
2402.04087 |
link |
2024-02-06 |
Provably learning a multi-head attention layer |
Sitan Chen et.al. |
2402.04084 |
null |
2024-02-06 |
Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models |
Reza Khanmohammadi et.al. |
2402.04075 |
null |
2024-02-06 |
Retrieve to Explain: Evidence-driven Predictions with Language Models |
Ravi Patel et.al. |
2402.04068 |
link |