2024-05-16 |
UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models |
Sahel Sharifymoghaddam et.al. |
2405.10311 |
null |
2024-05-16 |
4D Panoptic Scene Graph Generation |
Jingkang Yang et.al. |
2405.10305 |
link |
2024-05-16 |
Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees |
Yu Gui et.al. |
2405.10301 |
null |
2024-05-16 |
HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models |
Rhea Sanjay Sukthanker et.al. |
2405.10299 |
null |
2024-05-16 |
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning |
Yuexiang Zhai et.al. |
2405.10292 |
null |
2024-05-16 |
Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction |
Jianhao Chen et.al. |
2405.10288 |
null |
2024-05-16 |
FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models |
Adrian Bulat et.al. |
2405.10286 |
null |
2024-05-16 |
Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers |
Tuo Zhang et.al. |
2405.10276 |
null |
2024-05-16 |
Keep It Private: Unsupervised Privatization of Online Text |
Calvin Bao et.al. |
2405.10260 |
link |
2024-05-16 |
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models |
Xianzheng Ma et.al. |
2405.10255 |
null |
2024-05-16 |
PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology |
George Shaikovski et.al. |
2405.10254 |
null |
2024-05-16 |
A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks |
Xuanfan Ni et.al. |
2405.10251 |
null |
2024-05-16 |
IntelliExplain: Enhancing Interactive Code Generation through Natural Language Explanations for Non-Professional Programmers |
Hao Yan et.al. |
2405.10250 |
null |
2024-05-16 |
A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts |
Xinru Zhang et.al. |
2405.10246 |
null |
2024-05-16 |
DocuMint: Docstring Generation for Python using Small Language Models |
Bibek Poudel et.al. |
2405.10243 |
link |
2024-05-16 |
Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting |
Divij Gupta et.al. |
2405.10216 |
null |
2024-05-16 |
CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations |
Jiahao Zhao et.al. |
2405.10212 |
null |
2024-05-16 |
LFED: A Literary Fiction Evaluation Dataset for Large Language Models |
Linhao Yu et.al. |
2405.10166 |
link |
2024-05-16 |
PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning |
Jiancheng Pan et.al. |
2405.10160 |
link |
2024-05-16 |
Speaker Verification in Agent-Generated Conversations |
Yizhe Yang et.al. |
2405.10150 |
null |
2024-05-15 |
Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming |
Bushi Xiao et.al. |
2405.09508 |
null |
2024-05-15 |
Constrained Learning for Causal Inference and Semiparametric Statistics |
Tiffany Tianhui Cai et.al. |
2405.09493 |
null |
2024-05-15 |
Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts |
Donya Rooein et.al. |
2405.09482 |
null |
2024-05-15 |
Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models |
Majid Zarharan et.al. |
2405.09454 |
link |
2024-05-15 |
M$^4$oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts |
Yufeng Jiang et.al. |
2405.09446 |
null |
2024-05-15 |
Facilitating Opinion Diversity through Hybrid NLP Approaches |
Michiel van der Meer et.al. |
2405.09439 |
null |
2024-05-15 |
A Survey On Text-to-3D Contents Generation In The Wild |
Chenhan Jiang et.al. |
2405.09431 |
null |
2024-05-15 |
MicroPython Testbed for Federated Learning Algorithms |
Miroslav Popovic et.al. |
2405.09423 |
null |
2024-05-15 |
Matching domain experts by training from scratch on domain knowledge |
Xiaoliang Luo et.al. |
2405.09395 |
null |
2024-05-15 |
Compositional imprecise probability |
Jack Liell-Cock et.al. |
2405.09391 |
null |
2024-05-15 |
PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models |
Devansh Jain et.al. |
2405.09373 |
null |
2024-05-15 |
SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition |
Weijie L et.al. |
2405.09365 |
null |
2024-05-15 |
Large Language Model Bias Mitigation from the Perspective of Knowledge Editing |
Ruizhe Chen et.al. |
2405.09341 |
null |
2024-05-15 |
Prompting-based Synthetic Data Generation for Few-Shot Question Answering |
Maximilian Schmidt et.al. |
2405.09335 |
null |
2024-05-15 |
Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls |
Pedro Miguel Sánchez Sánchez et.al. |
2405.09318 |
null |
2024-05-15 |
Comparing the Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support |
Birger Moell et.al. |
2405.09300 |
null |
2024-05-15 |
Do language models capture implied discourse meanings? An investigation with exhaustivity implicatures of Korean morphology |
Hagyeong Shin et.al. |
2405.09293 |
null |
2024-05-15 |
Sign of the Times: Evaluating the use of Large Language Models for Idiomaticity Detection |
Dylan Phelps et.al. |
2405.09279 |
null |
2024-05-15 |
Dynamic Activation Pitfalls in LLaMA Models: An Empirical Study |
Chi Ma et.al. |
2405.09274 |
null |
2024-05-15 |
New Textual Corpora for Serbian Language Modeling |
Mihailo Škorić et.al. |
2405.09250 |
null |
2024-05-14 |
Efficient Vision-Language Pre-training by Cluster Masking |
Zihao Wei et.al. |
2405.08815 |
link |
2024-05-14 |
Towards Enhanced RAC Accessibility: Leveraging Datasets and LLMs |
Edison Jair Bejarano Sepulveda et.al. |
2405.08792 |
null |
2024-05-14 |
Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring |
Tiantian Zhang et.al. |
2405.08786 |
null |
2024-05-14 |
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs |
Akhila Yerukola et.al. |
2405.08760 |
link |
2024-05-14 |
Distributed Threat Intelligence at the Edge Devices: A Large Language Model-Driven Approach |
Syed Mhamudul Hasan et.al. |
2405.08755 |
null |
2024-05-14 |
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding |
Zhimin Li et.al. |
2405.08748 |
link |
2024-05-14 |
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory |
Xueyan Niu et.al. |
2405.08707 |
null |
2024-05-14 |
EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera |
Beilei Cui et.al. |
2405.08672 |
link |
2024-05-14 |
Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research |
Qinglong Cao et.al. |
2405.08668 |
link |
2024-05-14 |
Thinking Tokens for Language Modeling |
David Herel et.al. |
2405.08644 |
null |
2024-05-15 |
ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation |
Dimitris Gkoumas et.al. |
2405.08619 |
null |
2024-05-14 |
A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine |
Hanguang Xiao et.al. |
2405.08603 |
null |
2024-05-15 |
EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark |
Xiaohui Zhang et.al. |
2405.08596 |
null |
2024-05-14 |
Open-Vocabulary Object Detection via Neighboring Region Attention Alignment |
Sunyuan Qiang et.al. |
2405.08593 |
null |
2024-05-14 |
Improving Transformers with Dynamically Composable Multi-Head Attention |
Da Xiao et.al. |
2405.08553 |
link |
2024-05-14 |
Self-Distillation Improves DNA Sequence Inference |
Tong Yu et.al. |
2405.08538 |
link |
2024-05-14 |
Falcon 7b for Software Mention Detection in Scholarly Documents |
AmeerAli Khan et.al. |
2405.08514 |
null |
2024-05-14 |
Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure |
Odysseas S. Chlapanis et.al. |
2405.08502 |
null |
2024-05-14 |
Is Less More? Quality, Quantity and Context in Idiom Processing with Natural Language Models |
Agne Knietaite et.al. |
2405.08497 |
null |
2024-05-14 |
Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models |
Andrea Piergentili et.al. |
2405.08477 |
null |
2024-05-13 |
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots |
Chengyue Wu et.al. |
2405.07990 |
null |
2024-05-13 |
A Generalist Learner for Multifaceted Medical Image Interpretation |
Hong-Yu Zhou et.al. |
2405.07988 |
null |
2024-05-13 |
The Platonic Representation Hypothesis |
Minyoung Huh et.al. |
2405.07987 |
link |
2024-05-13 |
Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation |
Kevin Stangl et.al. |
2405.07969 |
null |
2024-05-13 |
PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation |
Suad Alshammari et.al. |
2405.07963 |
null |
2024-05-13 |
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments |
Samuel Schmidgall et.al. |
2405.07960 |
null |
2024-05-13 |
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning |
Yinzhu Quan et.al. |
2405.07938 |
null |
2024-05-13 |
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition |
Ziyang Zhang et.al. |
2405.07932 |
link |
2024-05-13 |
Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data |
Mahdi Morafah et.al. |
2405.07925 |
null |
2024-05-13 |
Can Better Text Semantics in Prompt Tuning Improve VLM Generalization? |
Hari Chandana Kuchibhotla et.al. |
2405.07921 |
null |
2024-05-13 |
A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking |
Ferdinand Schlatt et.al. |
2405.07920 |
null |
2024-05-13 |
PLUTO: Pathology-Universal Transformer |
Dinkar Juyal et.al. |
2405.07905 |
null |
2024-05-13 |
Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers |
Alena Tsanda et.al. |
2405.07886 |
null |
2024-05-13 |
Zero-Shot Tokenizer Transfer |
Benjamin Minixhofer et.al. |
2405.07883 |
null |
2024-05-13 |
RLHF Workflow: From Reward Modeling to Online RLHF |
Hanze Dong et.al. |
2405.07863 |
link |
2024-05-13 |
Can LLMs Help Predict Elections? (Counter)Evidence from the World’s Largest Democracy |
Pratik Gujral et.al. |
2405.07828 |
null |
2024-05-13 |
A View of How Language Models Will Transform Law |
Frank Fagan et.al. |
2405.07826 |
null |
2024-05-13 |
FreeVA: Offline MLLM as Training-Free Video Assistant |
Wenhao Wu et.al. |
2405.07798 |
link |
2024-05-13 |
DEPTH: Discourse Education through Pre-Training Hierarchically |
Zachary Bamberger et.al. |
2405.07788 |
link |
2024-05-13 |
Generating Human Motion in 3D Scenes from Text Descriptions |
Zhi Cen et.al. |
2405.07784 |
null |
2024-05-10 |
Linearizing Large Language Models |
Jean Mercat et.al. |
2405.06640 |
link |
2024-05-10 |
Value Augmented Sampling for Language Model Alignment and Personalization |
Seungwook Han et.al. |
2405.06639 |
link |
2024-05-10 |
Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark |
Evan M. Williams et.al. |
2405.06634 |
null |
2024-05-10 |
Characterizing the Accuracy-Efficiency Trade-off of Low-rank Decomposition in Language Models |
Chakshu Moar et.al. |
2405.06626 |
null |
2024-05-10 |
Explaining Text Similarity in Transformer Models |
Alexandros Vasileiou et.al. |
2405.06604 |
null |
2024-05-10 |
Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach |
Elham Ravanbakhsh et.al. |
2405.06586 |
null |
2024-05-10 |
What Can Natural Language Processing Do for Peer Review? |
Ilia Kuznetsov et.al. |
2405.06563 |
null |
2024-05-10 |
Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval |
Mengjia Niu et.al. |
2405.06545 |
null |
2024-05-10 |
Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts |
Wenyu Huang et.al. |
2405.06524 |
null |
2024-05-10 |
UniDM: A Unified Framework for Data Manipulation with Large Language Models |
Yichen Qian et.al. |
2405.06510 |
null |
2024-05-10 |
Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling |
Lyumanshan Ye et.al. |
2405.06495 |
null |
2024-05-10 |
Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification |
Yaoqin Ye et.al. |
2405.06468 |
null |
2024-05-10 |
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation |
JoonHo Lee et.al. |
2405.06424 |
link |
2024-05-10 |
Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions? |
Hunter McNichols et.al. |
2405.06414 |
null |
2024-05-10 |
Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL |
Ning Cheng et.al. |
2405.06410 |
null |
2024-05-10 |
Program Synthesis using Inductive Logic Programming for the Abstraction and Reasoning Corpus |
Filipe Marinho Rocha et.al. |
2405.06399 |
null |
2024-05-10 |
Memory Mosaics |
Jianyu Zhang et.al. |
2405.06394 |
null |
2024-05-10 |
LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play |
Li-Chun Lu et.al. |
2405.06373 |
null |
2024-05-10 |
LMD3: Language Model Data Density Dependence |
John Kirchenbauer et.al. |
2405.06331 |
null |
2024-05-10 |
Correlation Dimension of Natural Language in a Statistical Manifold |
Xin Du et.al. |
2405.06321 |
null |
2024-05-09 |
Natural Language Processing RELIES on Linguistics |
Juri Opitz et.al. |
2405.05966 |
null |
2024-05-09 |
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning |
Dan Qiao et.al. |
2405.05957 |
link |
2024-05-09 |
Probing Multimodal LLMs as World Models for Driving |
Shiva Sreeram et.al. |
2405.05956 |
link |
2024-05-09 |
Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning |
Junzhi Chen et.al. |
2405.05955 |
null |
2024-05-09 |
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts |
Jiachen Li et.al. |
2405.05949 |
link |
2024-05-09 |
DOLOMITES: Domain-Specific Long-Form Methodical Tasks |
Chaitanya Malaviya et.al. |
2405.05938 |
null |
2024-05-09 |
Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness |
Siyuan Li et.al. |
2405.05930 |
null |
2024-05-09 |
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? |
Zorik Gekhman et.al. |
2405.05904 |
null |
2024-05-09 |
Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes |
Ziang Guo et.al. |
2405.05885 |
null |
2024-05-09 |
FlockGPT: Guiding UAV Flocking with Linguistic Orchestration |
Artem Lykov et.al. |
2405.05872 |
null |
2024-05-09 |
Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control |
Gunshi Gupta et.al. |
2405.05852 |
link |
2024-05-09 |
Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning |
Artem Lykov et.al. |
2405.05824 |
link |
2024-05-09 |
Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference |
Zhihang Lin et.al. |
2405.05803 |
link |
2024-05-09 |
Towards a More Inclusive AI: Progress and Perspectives in Large Language Model Training for the Sámi Language |
Ronny Paul et.al. |
2405.05777 |
null |
2024-05-09 |
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions |
Polina Tsvilodub et.al. |
2405.05776 |
null |
2024-05-09 |
Large Language Model-Aided Evolutionary Search for Constrained Multiobjective Optimization |
Zeyi Wang et.al. |
2405.05767 |
null |
2024-05-09 |
Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social Media |
Zhizhen Zhang et.al. |
2405.05760 |
null |
2024-05-09 |
Exploring the Potential of Human-LLM Synergy in Advancing Qualitative Analysis: A Case Study on Mental-Illness Stigma |
Han Meng et.al. |
2405.05758 |
null |
2024-05-09 |
Can large language models understand uncommon meanings of common words? |
Jinyang Wu et.al. |
2405.05741 |
null |
2024-05-09 |
Evaluating Dialect Robustness of Language Models via Conversation Understanding |
Dipankar Srirag et.al. |
2405.05688 |
link |
2024-05-08 |
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models |
Prannay Kaul et.al. |
2405.05256 |
null |
2024-05-08 |
You Only Cache Once: Decoder-Decoder Architectures for Language Models |
Yutao Sun et.al. |
2405.05254 |
null |
2024-05-08 |
Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge |
Charles Koutcheme et.al. |
2405.05253 |
link |
2024-05-09 |
LLMs with Personalities in Multi-issue Negotiation Games |
Sean Noh et.al. |
2405.05248 |
null |
2024-05-08 |
EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning |
Jingfeng Yao et.al. |
2405.05237 |
link |
2024-05-08 |
SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants |
Masoud Moghani et.al. |
2405.05226 |
null |
2024-05-08 |
Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers |
Jiuxiang Gu et.al. |
2405.05219 |
null |
2024-05-08 |
FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models |
Jinglin Xu et.al. |
2405.05216 |
link |
2024-05-08 |
MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning |
Inderjeet Nair et.al. |
2405.05189 |
null |
2024-05-08 |
Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming |
Tommaso Pasini et.al. |
2405.05176 |
null |
2024-05-08 |
Air Gap: Protecting Privacy-Conscious Conversational Agents |
Eugene Bagdasaryan et.al. |
2405.05175 |
null |
2024-05-08 |
XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples |
Peiqin Lin et.al. |
2405.05116 |
link |
2024-05-08 |
QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs |
Weijia Zhang et.al. |
2405.05109 |
null |
2024-05-08 |
Concerns on Bias in Large Language Models when Creating Synthetic Personae |
Helena A. Haxvig et.al. |
2405.05080 |
null |
2024-05-08 |
Impact of Tone-Aware Explanations in Recommender Systems |
Ayano Okoso et.al. |
2405.05061 |
null |
2024-05-08 |
Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models |
Aylin Gunal et.al. |
2405.05060 |
null |
2024-05-08 |
Seeds of Stereotypes: A Large-Scale Textual Analysis of Race and Gender Associations with Diseases in Online Sources |
Lasse Hyldig Hansen et.al. |
2405.05049 |
null |
2024-05-08 |
${M^2D}$NeRF: Multi-Modal Decomposition NeRF with 3D Feature Fields |
Ning Wang et.al. |
2405.05010 |
null |
2024-05-08 |
ADELIE: Aligning Large Language Models on Information Extraction |
Yunjia Qi et.al. |
2405.05008 |
link |
2024-05-08 |
NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair |
Ruoke Wang et.al. |
2405.04994 |
null |
2024-05-07 |
ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning |
Jing Lin et.al. |
2405.04533 |
null |
2024-05-07 |
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving |
Yujun Lin et.al. |
2405.04532 |
link |
2024-05-07 |
NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts |
Shudan Zhang et.al. |
2405.04520 |
null |
2024-05-07 |
xLSTM: Extended Long Short-Term Memory |
Maximilian Beck et.al. |
2405.04517 |
null |
2024-05-07 |
A Transformer with Stack Attention |
Jiaoda Li et.al. |
2405.04515 |
link |
2024-05-08 |
Unveiling Disparities in Web Task Handling Between Human and Web Agent |
Kihoon Son et.al. |
2405.04497 |
null |
2024-05-07 |
Toward In-Context Teaching: Adapting Examples to Students’ Misconceptions |
Alexis Ross et.al. |
2405.04495 |
null |
2024-05-07 |
Representation Learning of Daily Movement Data Using Text Encoders |
Alexander Capstick et.al. |
2405.04494 |
link |
2024-05-08 |
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model |
DeepSeek-AI et.al. |
2405.04434 |
link |
2024-05-07 |
The Silicone Ceiling: Auditing GPT’s Race and Gender Biases in Hiring |
Lena Armstrong et.al. |
2405.04412 |
null |
2024-05-07 |
Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks |
Georgios Pantazopoulos et.al. |
2405.04403 |
link |
2024-05-07 |
Large Language Models Cannot Explain Themselves |
Advait Sarkar et.al. |
2405.04382 |
null |
2024-05-07 |
A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI |
Hannah Chafetz et.al. |
2405.04333 |
null |
2024-05-07 |
Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation |
Atharvan Dogra et.al. |
2405.04325 |
null |
2024-05-07 |
Granite Code Models: A Family of Open Foundation Models for Code Intelligence |
Mayank Mishra et.al. |
2405.04324 |
link |
2024-05-07 |
Accelerating Speculative Decoding using Dynamic Speculation Length |
Jonathan Mamou et.al. |
2405.04304 |
null |
2024-05-07 |
Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework |
Xiangpeng Wan et.al. |
2405.04294 |
link |
2024-05-07 |
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore |
Junchao Wu et.al. |
2405.04286 |
null |
2024-05-07 |
On the Foundations of Earth and Climate Foundation Models |
Xiao Xiang Zhu et.al. |
2405.04285 |
null |
2024-05-07 |
Semantic API Alignment: Linking High-level User Goals to APIs |
Robert Feldt et.al. |
2405.04236 |
null |
2024-05-06 |
Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs |
Muhammad Uzair Khattak et.al. |
2405.03690 |
null |
2024-05-06 |
Pose Priors from Language Models |
Sanjay Subramanian et.al. |
2405.03689 |
null |
2024-05-06 |
Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames |
Keith Burghardt et.al. |
2405.03688 |
link |
2024-05-06 |
Language-Image Models with 3D Understanding |
Jang Hyun Cho et.al. |
2405.03685 |
null |
2024-05-06 |
AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design |
Kamal Choudhary et.al. |
2405.03680 |
null |
2024-05-06 |
When LLMs Meet Cybersecurity: A Systematic Literature Review |
Jie Zhang et.al. |
2405.03644 |
link |
2024-05-06 |
A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama |
Vlad-Andrei Cursaru et.al. |
2405.03616 |
null |
2024-05-06 |
GREEN: Generative Radiology Report Evaluation and Error Notation |
Sophie Ostmeier et.al. |
2405.03595 |
null |
2024-05-06 |
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment |
Abhinav Agarwalla et.al. |
2405.03594 |
null |
2024-05-06 |
Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing |
Han Liu et.al. |
2405.03565 |
null |
2024-05-07 |
ID-centric Pre-training for Recommendation |
Yiqing Wu et.al. |
2405.03562 |
null |
2024-05-06 |
AlphaMath Almost Zero: process Supervision without process |
Guoxin Chen et.al. |
2405.03553 |
link |
2024-05-06 |
MAmmoTH2: Scaling Instructions from the Web |
Xiang Yue et.al. |
2405.03548 |
null |
2024-05-06 |
Position Paper: Leveraging Foundational Models for Black-Box Optimization: Benefits, Challenges, and Future Directions |
Xingyou Song et.al. |
2405.03547 |
null |
2024-05-06 |
Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning |
Yubo Mai et.al. |
2405.03509 |
null |
2024-05-06 |
UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images |
Yiting Qu et.al. |
2405.03486 |
null |
2024-05-06 |
LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model |
Haowen Sun et.al. |
2405.03485 |
link |
2024-05-06 |
Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search |
Hideaki Joko et.al. |
2405.03480 |
link |
2024-05-07 |
Large Language Models (LLMs) as Agents for Augmented Democracy |
Jairo Gudiño-Rosero et.al. |
2405.03452 |
null |
2024-05-06 |
SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence |
Hangyuan Ji et.al. |
2405.03446 |
null |
2024-05-03 |
Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models |
Piotr Padlewski et.al. |
2405.02287 |
link |
2024-05-03 |
Structural Pruning of Pre-trained Language Models via Neural Architecture Search |
Aaron Klein et.al. |
2405.02267 |
null |
2024-05-03 |
On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning? |
Maxime Zanella et.al. |
2405.02266 |
link |
2024-05-03 |
Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows |
Jasmine Y. Shih et.al. |
2405.02260 |
null |
2024-05-03 |
What matters when building vision-language models? |
Hugo Laurençon et.al. |
2405.02246 |
null |
2024-05-03 |
REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs |
Deepa Tilwani et.al. |
2405.02228 |
null |
2024-05-03 |
Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks |
Lujing Zhang et.al. |
2405.02225 |
null |
2024-05-03 |
FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems |
Yashar Deldjoo et.al. |
2405.02219 |
null |
2024-05-03 |
Automatic Programming: Large Language Models and Beyond |
Michael R. Lyu et.al. |
2405.02213 |
null |
2024-05-03 |
Assessing and Verifying Task Utility in LLM-Powered Applications |
Negar Arabzadeh et.al. |
2405.02178 |
null |
2024-05-03 |
Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset |
Hsuvas Borkakoty et.al. |
2405.02175 |
null |
2024-05-03 |
Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models |
Mohamad Al Mdfaa et.al. |
2405.02162 |
null |
2024-05-03 |
Neural Context Flows for Learning Generalizable Dynamical Systems |
Roussel Desmond Nzoyem et.al. |
2405.02154 |
link |
2024-05-03 |
The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates |
Giuseppe Russo Latona et.al. |
2405.02150 |
link |
2024-05-03 |
MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain |
Chao Jiang et.al. |
2405.02144 |
null |
2024-05-03 |
Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection |
Guillem Ramírez et.al. |
2405.02134 |
null |
2024-05-03 |
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets |
Xuelong Geng et.al. |
2405.02132 |
null |
2024-05-03 |
Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph |
Vladyslav Nechakhin et.al. |
2405.02105 |
null |
2024-05-03 |
Argumentative Large Language Models for Explainable and Contestable Decision-Making |
Gabriel Freedman et.al. |
2405.02079 |
null |
2024-05-03 |
Comparative Analysis of Retrieval Systems in the Real World |
Dmytro Mozolevskyi et.al. |
2405.02048 |
null |
2024-05-02 |
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models |
Seungone Kim et.al. |
2405.01535 |
link |
2024-05-02 |
Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks |
Murtaza Dalal et.al. |
2405.01534 |
null |
2024-05-02 |
OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning |
Shihao Wang et.al. |
2405.01533 |
link |
2024-05-02 |
FLAME: Factuality-Aware Alignment for Large Language Models |
Sheng-Chieh Lin et.al. |
2405.01525 |
null |
2024-05-02 |
A separability-based approach to quantifying generalization: which layer is best? |
Luciano Dyballa et.al. |
2405.01524 |
null |
2024-05-02 |
Transformer-Aided Semantic Communications |
Matin Mortaheb et.al. |
2405.01521 |
null |
2024-05-02 |
D2PO: Discriminator-Guided DPO with Response Evaluation Models |
Prasann Singhal et.al. |
2405.01511 |
link |
2024-05-02 |
Analyzing the Role of Semantic Representations in the Era of Large Language Models |
Zhijing Jin et.al. |
2405.01502 |
link |
2024-05-02 |
Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models |
Raymond Fok et.al. |
2405.01501 |
null |
2024-05-02 |
Controllable Text Generation in the Instruction-Tuning Era |
Dhananjay Ashok et.al. |
2405.01490 |
null |
2024-05-02 |
MANTIS: Interleaved Multi-Image Instruction Tuning |
Dongfu Jiang et.al. |
2405.01483 |
null |
2024-05-02 |
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment |
Gerald Shen et.al. |
2405.01481 |
link |
2024-05-02 |
V-FLUTE: Visual Figurative Language Understanding with Textual Explanations |
Arkadiy Saakyan et.al. |
2405.01474 |
link |
2024-05-02 |
Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning |
Théo Moutakanni et.al. |
2405.01469 |
null |
2024-05-02 |
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models |
Yifei Ming et.al. |
2405.01468 |
null |
2024-05-02 |
A Systematic Literature Review on Large Language Models for Automated Program Repair |
Quanjun Zhang et.al. |
2405.01466 |
link |
2024-05-02 |
Natural Language to Verilog: Design of a Recurrent Spiking Neural Network using Large Language Models and ChatGPT |
Paola Vitolo et.al. |
2405.01419 |
null |
2024-05-02 |
MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors |
Yuan Tang et.al. |
2405.01413 |
link |
2024-05-02 |
Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving |
Xin Quan et.al. |
2405.01379 |
null |
2024-05-02 |
GAIA: A General AI Assistant for Intelligent Accelerator Operations |
Frank Mayet et.al. |
2405.01359 |
null |
2024-05-01 |
Self-Play Preference Optimization for Language Model Alignment |
Yue Wu et.al. |
2405.00675 |
null |
2024-05-01 |
Is Bigger Edit Batch Size Always Better? – An Empirical Study on Model Editing with Llama-3 |
Junsang Yoon et.al. |
2405.00664 |
link |
2024-05-01 |
HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models |
Ningke Li et.al. |
2405.00648 |
null |
2024-05-01 |
When Quantization Affects Confidence of Large Language Models? |
Irina Proskurina et.al. |
2405.00632 |
link |
2024-05-01 |
“I’m Not Sure, But…”: Examining the Impact of Large Language Models’ Uncertainty Expression on User Reliance and Trust |
Sunnie S. Y. Kim et.al. |
2405.00623 |
null |
2024-05-01 |
Causal Evaluation of Language Models |
Sirui Chen et.al. |
2405.00622 |
link |
2024-05-01 |
Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling |
Yida Mu et.al. |
2405.00611 |
null |
2024-05-01 |
Investigating Automatic Scoring and Feedback using Large Language Models |
Gloria Ashiya Katuka et.al. |
2405.00602 |
null |
2024-05-01 |
Are Models Biased on Text without Gender-related Language? |
Catarina G Belém et.al. |
2405.00588 |
link |
2024-05-01 |
The Real, the Better: Aligning Large Language Models with Online Human Behaviors |
Guanying Jiang et.al. |
2405.00578 |
null |
2024-05-01 |
EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model |
Deng Li et.al. |
2405.00574 |
null |
2024-05-01 |
NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance |
Huan-Yi Su et.al. |
2405.00566 |
null |
2024-05-01 |
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment |
Zhili Liu et.al. |
2405.00557 |
null |
2024-05-01 |
Long-Term Human Trajectory Prediction using 3D Dynamic Scene Graphs |
Nicolas Gorlo et.al. |
2405.00552 |
link |
2024-05-01 |
ChatBI: Towards Natural Language to Complex Business Intelligence SQL |
Jinqing Lian et.al. |
2405.00527 |
null |
2024-05-01 |
CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions |
Donghee Choi et.al. |
2405.00523 |
null |
2024-05-01 |
Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning |
Lucas-Andreï Thil et.al. |
2405.00516 |
null |
2024-05-01 |
GOLD: Geometry Problem Solver with Natural Language Description |
Jiaxin Zhang et.al. |
2405.00494 |
link |
2024-05-01 |
Is Temperature the Creativity Parameter of Large Language Models? |
Max Peeperkorn et.al. |
2405.00492 |
null |
2024-05-01 |
The Pyramid of Captions |
Delong Chen et.al. |
2405.00485 |
null |
2024-04-30 |
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation |
Yunhao Ge et.al. |
2404.19752 |
null |
2024-04-30 |
PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification |
Leon Garza et.al. |
2404.19744 |
null |
2024-04-30 |
Better & Faster Large Language Models via Multi-token Prediction |
Fabian Gloeckle et.al. |
2404.19737 |
null |
2024-04-30 |
A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications |
Steph Buongiorno et.al. |
2404.19729 |
null |
2024-04-30 |
PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games |
Steph Buongiorno et.al. |
2404.19721 |
null |
2024-04-30 |
Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns |
Constantinos Patsakis et.al. |
2404.19715 |
null |
2024-04-30 |
Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models |
Scott Sumpter et.al. |
2404.19713 |
null |
2024-04-30 |
When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively |
Tiziano Labruna et.al. |
2404.19705 |
link |
2024-04-30 |
Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners |
Chun Feng et.al. |
2404.19696 |
null |
2024-04-30 |
Towards Generalist Robot Learning from Internet Video: A Survey |
Robert McCarthy et.al. |
2404.19664 |
null |
2024-04-30 |
MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation |
Min Zhang et.al. |
2404.19644 |
null |
2024-04-30 |
On Training a Neural Network to Explain Binaries |
Alexander Interrante-Grant et.al. |
2404.19631 |
null |
2024-04-30 |
Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model |
Denys Godwin et.al. |
2404.19609 |
null |
2024-04-30 |
Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning |
Xuanli He et.al. |
2404.19597 |
null |
2024-04-30 |
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing |
Yucheng Hu et.al. |
2404.19543 |
link |
2024-04-30 |
MoST: Multi-modality Scene Tokenization for Motion Prediction |
Norman Mu et.al. |
2404.19531 |
null |
2024-04-30 |
Do Large Language Models Understand Conversational Implicature – A case study with a chinese sitcom |
Shisen Yue et.al. |
2404.19509 |
link |
2024-04-30 |
More Compute Is What You Need |
Zhen Guo et.al. |
2404.19484 |
null |
2024-05-01 |
Neuro-Vision to Language: Image Reconstruction and Language enabled Interaction via Brain Recordings |
Guobin Shen et.al. |
2404.19438 |
null |
2024-04-30 |
Can Large Language Models put 2 and 2 together? Probing for Entailed Arithmetical Relationships |
D. Panas et.al. |
2404.19432 |
null |
2024-04-29 |
Hallucination of Multimodal Large Language Models: A Survey |
Zechen Bai et.al. |
2404.18930 |
link |
2024-04-29 |
Holmes: Benchmark the Linguistic Competence of Language Models |
Andreas Waldis et.al. |
2404.18923 |
null |
2024-04-29 |
DPO Meets PPO: Reinforced Token Optimization for RLHF |
Han Zhong et.al. |
2404.18922 |
null |
2024-04-29 |
TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation |
Junhao Cheng et.al. |
2404.18919 |
link |
2024-04-29 |
Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting |
Fangcheng Liu et.al. |
2404.18911 |
link |
2024-04-29 |
Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking |
Hong Jin Kang et.al. |
2404.18881 |
link |
2024-04-29 |
More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness |
Aaron J. Li et.al. |
2404.18870 |
link |
2024-04-29 |
Truth-value judgment in language models: belief directions are context sensitive |
Stefan F. Schouten et.al. |
2404.18865 |
null |
2024-04-29 |
Performance-Aligned LLMs for Generating Fast Code |
Daniel Nichols et.al. |
2404.18864 |
null |
2024-04-29 |
A Survey on Vision Mamba: Models, Applications and Challenges |
Rui Xu et.al. |
2404.18861 |
link |
2024-04-29 |
VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning |
Aidan Z. H. Yang et.al. |
2404.18852 |
null |
2024-04-29 |
FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition |
Yuxuan Yan et.al. |
2404.18848 |
null |
2024-04-29 |
It’s Difficult to be Neutral – Human and LLM-based Sentiment Annotation of Patient Comments |
Petter Mæhlum et.al. |
2404.18832 |
null |
2024-04-29 |
Benchmarking Benchmark Leakage in Large Language Models |
Ruijie Xu et.al. |
2404.18824 |
link |
2024-04-29 |
AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering |
Wenxiang Zhao et.al. |
2404.18816 |
null |
2024-04-29 |
Unknown Script: Impact of Script on Cross-Lingual Transfer |
Wondimagegnhue Tsegaye Tufa et.al. |
2404.18810 |
link |
2024-04-29 |
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models |
Pat Verga et.al. |
2404.18796 |
null |
2024-04-29 |
PECC: Problem Extraction and Coding Challenges |
Patrick Haller et.al. |
2404.18766 |
link |
2024-04-29 |
Transitive Vision-Language Prompt Learning for Domain Generalization |
Liyuan Wang et.al. |
2404.18758 |
null |
2024-04-29 |
Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models |
Hongyi Zhu et.al. |
2404.18746 |
null |
2024-04-26 |
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo |
Stephen Zhao et.al. |
2404.17546 |
link |
2024-04-26 |
Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models |
Yuhang Huang et.al. |
2404.17534 |
null |
2024-04-26 |
Large Language Model Agent as a Mechanical Designer |
Yayati Jadhav et.al. |
2404.17525 |
null |
2024-04-26 |
On the Use of Large Language Models to Generate Capability Ontologies |
Luis Miguel Vieira da Silva et.al. |
2404.17524 |
null |
2024-04-26 |
Enhancing Legal Compliance and Regulation Analysis with Large Language Models |
Shabnam Hassani et.al. |
2404.17522 |
null |
2024-04-26 |
A Comprehensive Evaluation on Event Reasoning of Large Language Models |
Zhengwei Tao et.al. |
2404.17513 |
link |
2024-04-26 |
CEval: A Benchmark for Evaluating Counterfactual Text Generation |
Van Bach Nguyen et.al. |
2404.17475 |
null |
2024-04-26 |
Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System |
Robin Schmucker et.al. |
2404.17460 |
null |
2024-04-26 |
“ChatGPT Is Here to Help, Not to Replace Anybody” – An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses |
Bruno Pereira Cipriano et.al. |
2404.17443 |
null |
2024-04-26 |
PromptCIR: Blind Compressed Image Restoration with Prompt Learning |
Bingchen Li et.al. |
2404.17433 |
link |
2024-04-26 |
Evaluation of Geographical Distortions in Language Models: A Crucial Step Towards Equitable Representations |
Rémy Decoupes et.al. |
2404.17401 |
null |
2024-04-26 |
UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning |
Maoxun Yuan et.al. |
2404.17360 |
null |
2024-04-26 |
InspectorRAGet: An Introspection Platform for RAG Evaluation |
Kshitij Fadnis et.al. |
2404.17347 |
link |
2024-04-26 |
Introducing cosmosGPT: Monolingual Training for Turkish Language Models |
H. Toprak Kesgin et.al. |
2404.17336 |
null |
2024-04-26 |
A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation |
Xin Zhang et.al. |
2404.17335 |
null |
2024-04-26 |
An Extendable Cloud-Native Alloy Property Explorer |
Zhuoyuan Li et.al. |
2404.17330 |
link |
2024-04-26 |
When to Trust LLMs: Aligning Confidence with Response Quality |
Shuchang Tao et.al. |
2404.17287 |
null |
2024-04-26 |
Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM |
Xuan Zhang et.al. |
2404.17283 |
link |
2024-04-26 |
Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT as a Pivot |
Michelle Terblanche et.al. |
2404.17216 |
null |
2024-04-26 |
Low-Rank Knowledge Decomposition for Medical Foundation Models |
Yuhang Zhou et.al. |
2404.17184 |
null |
2024-04-25 |
The Third Monocular Depth Estimation Challenge |
Jaime Spencer et.al. |
2404.16831 |
null |
2024-04-25 |
Make-it-Real: Unleashing Large Multimodal Model’s Ability for Painting 3D Objects with Realistic Materials |
Ye Fang et.al. |
2404.16829 |
null |
2024-04-25 |
V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection |
Xuanyu Zhang et.al. |
2404.16824 |
null |
2024-04-25 |
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites |
Zhe Chen et.al. |
2404.16821 |
link |
2024-04-25 |
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages |
Harman Singh et.al. |
2404.16816 |
link |
2024-04-26 |
Make Your LLM Fully Utilize the Context |
Shengnan An et.al. |
2404.16811 |
link |
2024-04-25 |
Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning |
Tianhui Zhang et.al. |
2404.16807 |
null |
2024-04-25 |
AAPL: Adding Attributes to Prompt Learning for Vision-Language Models |
Gahyeon Kim et.al. |
2404.16804 |
link |
2024-04-25 |
Weak-to-Strong Extrapolation Expedites Alignment |
Chujie Zheng et.al. |
2404.16792 |
link |
2024-04-25 |
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension |
Bohao Li et.al. |
2404.16790 |
link |
2024-04-25 |
Continual Learning of Large Language Models: A Comprehensive Survey |
Haizhou Shi et.al. |
2404.16789 |
link |
2024-04-25 |
Modeling Selective Feature Attention for Representation-based Siamese Text Matching |
Jianxiang Zang et.al. |
2404.16776 |
link |
2024-04-25 |
REBEL: Reinforcement Learning via Regressing Relative Rewards |
Zhaolin Gao et.al. |
2404.16767 |
link |
2024-04-25 |
Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model |
Runzhe Zhan et.al. |
2404.16766 |
null |
2024-04-25 |
RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis |
Xiaoman Zhang et.al. |
2404.16754 |
null |
2024-04-25 |
Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class |
Mazda Moayeri et.al. |
2404.16717 |
null |
2024-04-25 |
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding |
Mostafa Elhoushi et.al. |
2404.16710 |
null |
2024-04-25 |
Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents |
Giorgio Piatti et.al. |
2404.16698 |
null |
2024-04-25 |
Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 |
Lydia Uhler et.al. |
2404.16692 |
null |
2024-04-25 |
EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning |
Hongxia Xie et.al. |
2404.16670 |
link |
2024-04-24 |
Hybrid LLM/Rule-based Approaches to Business Insights Generation from Structured Data |
Aliaksei Vertsel et.al. |
2404.15604 |
null |
2024-04-24 |
ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction |
Henry Peng Zou et.al. |
2404.15592 |
link |
2024-04-24 |
MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis |
Jiaxin Zhuang et.al. |
2404.15580 |
null |
2024-04-24 |
Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? |
Hossein Salami et.al. |
2404.15578 |
null |
2024-04-24 |
Retrieval Head Mechanistically Explains Long-Context Factuality |
Wenhao Wu et.al. |
2404.15574 |
link |
2024-04-23 |
PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models |
Shashi Kant Gupta et.al. |
2404.15549 |
null |
2024-04-23 |
BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis |
Shuhang Lin et.al. |
2404.15532 |
link |
2024-04-23 |
Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models |
Mihir Parmar et.al. |
2404.15522 |
link |
2024-04-23 |
Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval |
Young Kyun Jang et.al. |
2404.15516 |
null |
2024-04-23 |
ToM-LM: Delegating Theory Of Mind Reasoning to External Symbolic Executors in Large Language Models |
Weizhi Tang et.al. |
2404.15515 |
null |
2024-04-23 |
IryoNLP at MEDIQA-CORR 2024: Tackling the Medical Error Detection & Correction Task On the Shoulders of Medical Agents |
Jean-Philippe Corbeil et.al. |
2404.15488 |
link |
2024-04-23 |
Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance |
Het Patel et.al. |
2404.15485 |
null |
2024-04-23 |
Can Large Language Models Learn the Physics of Metamaterials? An Empirical Study with ChatGPT |
Darui Lu et.al. |
2404.15458 |
null |
2024-04-23 |
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference |
João Monteiro et.al. |
2404.15420 |
null |
2024-04-23 |
Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs |
Davide Caffagni et.al. |
2404.15406 |
null |
2024-04-23 |
Aligning LLM Agents by Learning Latent Preference from User Edits |
Ge Gao et.al. |
2404.15269 |
link |
2024-04-23 |
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts |
Yifeng Ding et.al. |
2404.15247 |
link |
2024-04-23 |
CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies |
Weiyan Shi et.al. |
2404.15238 |
link |
2024-04-23 |
Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models |
Aidan Z. H. Yang et.al. |
2404.15236 |
null |
2024-04-23 |
Re-Thinking Inverse Graphics With Large Language Models |
Peter Kulits et.al. |
2404.15228 |
null |
2024-04-23 |
Does Instruction Tuning Make LLMs More Consistent? |
Constanza Fierro et.al. |
2404.15206 |
null |
2024-04-23 |
Setting up the Data Printer with Improved English to Ukrainian Machine Translation |
Yurii Paniv et.al. |
2404.15196 |
link |
2024-04-23 |
Regressive Side Effects of Training Language Models to Mimic Student Misconceptions |
Shashank Sonkar et.al. |
2404.15156 |
null |
2024-04-23 |
Bias patterns in the application of LLMs for clinical decision support: A comprehensive study |
Raphael Poulain et.al. |
2404.15149 |
link |
2024-04-23 |
Rethinking LLM Memorization through the Lens of Adversarial Compression |
Avi Schwarzschild et.al. |
2404.15146 |
null |
2024-04-23 |
MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning |
Sunan He et.al. |
2404.15127 |
null |
2024-04-23 |
Identifying Fairness Issues in Automatically Generated Testing Content |
Kevin Stowe et.al. |
2404.15104 |
null |
2024-04-23 |
Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation |
Xun Wu et.al. |
2404.15100 |
null |
2024-04-23 |
Detection of circular permutations by Protein Language Models |
Yue Hu et.al. |
2404.15087 |
link |
2024-04-23 |
Multi-Head Mixture-of-Experts |
Xun Wu et.al. |
2404.15045 |
null |
2024-04-23 |
TAXI: Evaluating Categorical Knowledge Editing for Language Models |
Derek Powell et.al. |
2404.15004 |
link |
2024-04-23 |
Transformers Can Represent $n$-gram Language Models |
Anej Svete et.al. |
2404.14994 |
null |
2024-04-23 |
A Short Review for Ontology Learning from Text: Stride from Shallow Learning, Deep Learning to Large Language Models Trend |
Rick Du et.al. |
2404.14991 |
null |
2024-04-23 |
$\texttt{MiniMol}$: A Parameter-Efficient Foundation Model for Molecular Learning |
Kerstin Kläser et.al. |
2404.14986 |
null |
2024-04-23 |
Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case |
Muhammad Asif Auyb et.al. |
2404.14977 |
null |
2024-04-22 |
AutoAD III: The Prequel – Back to the Pixels |
Tengda Han et.al. |
2404.14412 |
null |
2024-04-22 |
SpaceByte: Towards Deleting Tokenization from Large Language Modeling |
Kevin Slagle et.al. |
2404.14408 |
link |
2024-04-22 |
RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios? |
Adrian de Wynter et.al. |
2404.14397 |
link |
2024-04-22 |
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation |
Yuying Ge et.al. |
2404.14396 |
link |
2024-04-22 |
PARAMANU-GANITA: Language Model with Mathematical Capabilities |
Mitodru Niyogi et.al. |
2404.14395 |
null |
2024-04-22 |
A Multimodal Automated Interpretability Agent |
Tamar Rott Shaham et.al. |
2404.14394 |
null |
2024-04-22 |
A Survey on Self-Evolution of Large Language Models |
Zhengwei Tao et.al. |
2404.14387 |
link |
2024-04-22 |
Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph |
Xiaochen Kev Gao et.al. |
2404.14372 |
link |
2024-04-23 |
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data |
Fahim Tajwar et.al. |
2404.14367 |
link |
2024-04-22 |
Better Synthetic Data by Retrieving and Transforming Existing Datasets |
Saumya Gandhi et.al. |
2404.14361 |
link |
2024-04-22 |
Rethinking Legal Compliance Automation: Opportunities with Large Language Models |
Shabnam Hassani et.al. |
2404.14356 |
null |
2024-04-22 |
Calc-CMU at SemEval-2024 Task 7: Pre-Calc – Learning to Use the Calculator Improves Numeracy in Language Models |
Vishruth Veerendranath et.al. |
2404.14355 |
link |
2024-04-22 |
Automated Long Answer Grading with RiceChem Dataset |
Shashank Sonkar et.al. |
2404.14316 |
link |
2024-04-22 |
Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels |
Jan-Philipp Fränken et.al. |
2404.14313 |
link |
2024-04-22 |
Explaining Arguments’ Strength: Unveiling the Role of Attacks and Supports (Technical Report) |
Xiang Yin et.al. |
2404.14304 |
null |
2024-04-22 |
Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits |
Shashank Sonkar et.al. |
2404.14301 |
null |
2024-04-22 |
Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach |
Yao Wan et.al. |
2404.14296 |
link |
2024-04-22 |
A Survey on Efficient Inference for Large Language Models |
Zixuan Zhou et.al. |
2404.14294 |
null |
2024-04-22 |
LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots |
Dongge Han et.al. |
2404.14285 |
null |
2024-04-22 |
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback |
Wenyi Xiao et.al. |
2404.14233 |
null |
2024-04-19 |
MoVA: Adapting Mixture of Vision Experts to Multimodal Context |
Zhuofan Zong et.al. |
2404.13046 |
link |
2024-04-19 |
Unified Scene Representation and Reconstruction for 3D Large Language Models |
Tao Chu et.al. |
2404.13044 |
null |
2024-04-19 |
Data Alignment for Zero-Shot Concept Generation in Dermatology AI |
Soham Gadgil et.al. |
2404.13043 |
null |
2024-04-19 |
Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs |
Biyang Guo et.al. |
2404.13033 |
link |
2024-04-19 |
When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering |
Stephen Choi et.al. |
2404.13028 |
null |
2024-04-19 |
Stronger Random Baselines for In-Context Learning |
Gregory Yauney et.al. |
2404.13020 |
link |
2024-04-19 |
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models |
Chuofan Ma et.al. |
2404.13013 |
null |
2024-04-19 |
Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs |
Clemencia Siro et.al. |
2404.12994 |
link |
2024-04-19 |
FineRec: Exploring Fine-grained Sequential Recommendation |
Xiaokun Zhang et.al. |
2404.12975 |
link |
2024-04-19 |
Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models |
Yian Li et.al. |
2404.12966 |
null |
2024-04-19 |
Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction |
Qinyuan Wu et.al. |
2404.12957 |
null |
2024-04-19 |
Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models |
Konstantinos Vilouras et.al. |
2404.12920 |
null |
2024-04-19 |
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models |
Zhenyang Ni et.al. |
2404.12916 |
link |
2024-04-19 |
Large Language Models for Networking: Workflow, Advances and Challenges |
Chang Liu et.al. |
2404.12901 |
null |
2024-04-19 |
Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning |
Ahmed Elshabrawy et.al. |
2404.12897 |
null |
2024-04-19 |
Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation |
Guanhua Chen et.al. |
2404.12879 |
null |
2024-04-19 |
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency |
Zhaodonghui Li et.al. |
2404.12872 |
link |
2024-04-19 |
How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning? |
Yang Luo et.al. |
2404.12866 |
null |
2024-04-19 |
Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation |
Yilong Chen et.al. |
2404.12861 |
null |
2024-04-19 |
TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages |
Aleksei Dorkin et.al. |
2404.12845 |
null |
2024-04-18 |
BLINK: Multimodal Large Language Models Can See but Not Perceive |
Xingyu Fu et.al. |
2404.12390 |
null |
2024-04-18 |
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models |
Aitor Ormazabal et.al. |
2404.12387 |
null |
2024-04-18 |
MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale |
Xiaotang Gai et.al. |
2404.12372 |
null |
2024-04-18 |
When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes |
Asaf Yehudai et.al. |
2404.12365 |
link |
2024-04-18 |
From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function |
Rafael Rafailov et.al. |
2404.12358 |
null |
2024-04-18 |
Towards a Foundation Model for Partial Differential Equation: Multi-Operator Learning and Extrapolation |
Jingmin Sun et.al. |
2404.12355 |
link |
2024-04-18 |
V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning |
Hang Hua et.al. |
2404.12353 |
null |
2024-04-18 |
Evaluating AI for Law: Bridging the Gap with Open-Source Solutions |
Rohan Bhambhoria et.al. |
2404.12349 |
null |
2024-04-18 |
Large Language Models in Targeted Sentiment Analysis |
Nicolay Rusnachenko et.al. |
2404.12342 |
link |
2024-04-18 |
Normative Requirements Operationalization with Large Language Models |
Nick Feng et.al. |
2404.12335 |
null |
2024-04-18 |
Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment |
Zhaofeng Wu et.al. |
2404.12318 |
null |
2024-04-18 |
Large Language Models for Synthetic Participatory Planning of Shared Automated Electric Mobility Systems |
Jiangbo Yu et.al. |
2404.12317 |
null |
2024-04-18 |
Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair |
Yusuke Sakai et.al. |
2404.12299 |
null |
2024-04-18 |
Augmenting emotion features in irony detection with Large language modeling |
Yucheng Lin et.al. |
2404.12291 |
null |
2024-04-18 |
Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery |
Yona Falinie A. Gaus et.al. |
2404.12285 |
null |
2024-04-18 |
Enhancing Embedding Performance through Large Language Model-based Text Enrichment and Rewriting |
Nicholas Harris et.al. |
2404.12283 |
null |
2024-04-18 |
Advancing the Robustness of Large Language Models through Self-Denoised Smoothing |
Jiabao Ji et.al. |
2404.12274 |
link |
2024-04-18 |
FedEval-LLM: Federated Evaluation of Large Language Models on Downstream Tasks with Collective Wisdom |
Yuanqin He et.al. |
2404.12273 |
null |
2024-04-18 |
Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences |
Shreya Shankar et.al. |
2404.12272 |
null |
2024-04-18 |
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM |
Michelle S. Lam et.al. |
2404.12259 |
link |
2024-04-17 |
Private federated discovery of out-of-vocabulary words for Gboard |
Ziteng Sun et.al. |
2404.11607 |
null |
2024-04-17 |
VG4D: Vision-Language Model Goes 4D Video Recognition |
Zhichao Deng et.al. |
2404.11605 |
link |
2024-04-17 |
A Deep Dive into Large Language Models for Automated Bug Localization and Repair |
Soneya Binta Hossain et.al. |
2404.11595 |
null |
2024-04-17 |
Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding |
Zezhong Fan et.al. |
2404.11589 |
null |
2024-04-17 |
LLMTune: Accelerate Database Knob Tuning with Large Language Models |
Xinmei Huang et.al. |
2404.11581 |
link |
2024-04-17 |
On the Scalability of GNNs for Molecular Graphs |
Maciej Sypetkowski et.al. |
2404.11568 |
null |
2024-04-17 |
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation |
Kuan-Chieh et.al. |
2404.11565 |
null |
2024-04-17 |
Quantifying Multilingual Performance of Large Language Models Across Languages |
Zihao Li et.al. |
2404.11553 |
null |
2024-04-17 |
Evaluating Span Extraction in Generative Paradigm: A Reflection on Aspect-Based Sentiment Analysis |
Soyoung Yang et.al. |
2404.11539 |
null |
2024-04-17 |
FedPFT: Federated Proxy Fine-Tuning of Foundation Models |
Zhaopeng Peng et.al. |
2404.11536 |
link |
2024-04-17 |
Select and Reorder: A Novel Approach for Neural Sign Language Production |
Harry Walsh et.al. |
2404.11532 |
null |
2024-04-17 |
Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization |
Costas Mavromatis et.al. |
2404.11531 |
link |
2024-04-17 |
Embedding Privacy in Computational Social Science and Artificial Intelligence Research |
Keenan Jones et.al. |
2404.11515 |
null |
2024-04-17 |
Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models |
Yushuo Chen et.al. |
2404.11502 |
link |
2024-04-17 |
Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models |
Yue Zhou et.al. |
2404.11500 |
link |
2024-04-18 |
Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent |
Wei Chen et.al. |
2404.11459 |
null |
2024-04-17 |
Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models |
Sunhao Dai et.al. |
2404.11457 |
link |
2024-04-17 |
AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts |
Meng Jiang et.al. |
2404.11449 |
null |
2024-04-17 |
Open-Ended Wargames with Large Language Models |
Daniel P. Hogan et.al. |
2404.11446 |
link |
2024-04-17 |
DUPE: Detection Undermining via Prompt Engineering for Deepfake Text |
James Weichert et.al. |
2404.11408 |
null |
2024-04-16 |
Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback |
Qiwei Di et.al. |
2404.10776 |
null |
2024-04-16 |
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation |
Hongxin Zhang et.al. |
2404.10775 |
null |
2024-04-16 |
Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification |
Yu-Yang Li et.al. |
2404.10757 |
link |
2024-04-16 |
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study |
Shusheng Xu et.al. |
2404.10719 |
null |
2024-04-16 |
Dual Modalities of Text: Visual and Textual Generative Pre-training |
Yekun Chai et.al. |
2404.10710 |
null |
2024-04-16 |
Question Difficulty Ranking for Multiple-Choice Reading Comprehension |
Vatsal Raina et.al. |
2404.10704 |
null |
2024-04-16 |
An empirical study on code review activity prediction in practice |
Doriane Olewicki et.al. |
2404.10703 |
null |
2024-04-16 |
Automating REST API Postman Test Cases Using LLM |
S Deepika Sri et.al. |
2404.10678 |
null |
2024-04-16 |
Self-playing Adversarial Language Game Enhances LLM Reasoning |
Pengyu Cheng et.al. |
2404.10642 |
link |
2024-04-16 |
HLAT: High-quality Large Language Model Pre-trained on AWS Trainium |
Haozheng Fan et.al. |
2404.10630 |
null |
2024-04-16 |
Private Attribute Inference from Images with Vision-Language Models |
Batuhan Tömekçe et.al. |
2404.10618 |
null |
2024-04-16 |
Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases |
Yanze Li et.al. |
2404.10595 |
null |
2024-04-16 |
Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training |
Masanori Hirano et.al. |
2404.10555 |
null |
2024-04-16 |
Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning |
Xiao Wang et.al. |
2404.10552 |
null |
2024-04-16 |
Capturing the Macroscopic Behaviour of Molecular Dynamics with Membership Functions |
Alexander Sikorski et.al. |
2404.10523 |
null |
2024-04-16 |
CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity |
Moshe Berchansky et.al. |
2404.10513 |
null |
2024-04-16 |
White Men Lead, Black Women Help: Uncovering Gender, Racial, and Intersectional Bias in Language Agency |
Yixin Wan et.al. |
2404.10508 |
null |
2024-04-16 |
Self-Supervised Visual Preference Alignment |
Ke Zhu et.al. |
2404.10501 |
link |
2024-04-16 |
When Emotional Stimuli meet Prompt Designing: An Auto-Prompt Graphical Paradigm |
Chenggian Ma et.al. |
2404.10500 |
null |
2024-04-16 |
Spiral of Silences: How is Large Language Model Killing Information Retrieval? – A Case Study on Open Domain Question Answering |
Xiaoyang Chen et.al. |
2404.10496 |
link |
2024-04-15 |
KG-CTG: Citation Generation through Knowledge Graph-guided Large Language Models |
Avinash Anand et.al. |
2404.09763 |
null |
2024-04-15 |
Resilience of Large Language Models for Noisy Instructions |
Bin Wang et.al. |
2404.09754 |
null |
2024-04-15 |
Personalized Collaborative Fine-Tuning for On-Device Large Language Models |
Nicolas Wagner et.al. |
2404.09753 |
link |
2024-04-15 |
AMPCliff: quantitative definition and benchmarking of activity cliffs in antimicrobial peptides |
Kewei Li et.al. |
2404.09738 |
link |
2024-04-15 |
Quantization of Large Language Models with an Overdetermined Basis |
Daniil Merkulov et.al. |
2404.09737 |
null |
2024-04-15 |
Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models |
Ziwei Luo et.al. |
2404.09732 |
link |
2024-04-15 |
Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model |
Hyunsoo Cho et.al. |
2404.09717 |
null |
2024-04-15 |
Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction |
David Sobrín-Hidalgo et.al. |
2404.09705 |
null |
2024-04-15 |
Generative AI for Game Theory-based Mobile Networking |
Long He et.al. |
2404.09699 |
null |
2024-04-15 |
Are Large Language Models Reliable Argument Quality Annotators? |
Nailia Mirzakhmedova et.al. |
2404.09696 |
null |
2024-04-15 |
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models |
Guangyan Li et.al. |
2404.09695 |
null |
2024-04-15 |
Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation |
Juhwan Choi et.al. |
2404.09682 |
null |
2024-04-15 |
Learn Your Reference Model for Real Good Alignment |
Alexey Gorbatovski et.al. |
2404.09656 |
null |
2024-04-15 |
Do LLMs Understand Visual Anomalies? Uncovering LLM Capabilities in Zero-shot Anomaly Detection |
Jiaqi Zhu et.al. |
2404.09654 |
null |
2024-04-15 |
Bridging Vision and Language Spaces with Assignment Prediction |
Jungin Park et.al. |
2404.09632 |
link |
2024-04-15 |
AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception |
Yipo Huang et.al. |
2404.09624 |
link |
2024-04-15 |
UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark |
Zhaokun Zhou et.al. |
2404.09619 |
null |
2024-04-15 |
A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions |
Pengfei Liu et.al. |
2404.09606 |
link |
2024-04-15 |
Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction |
Zepeng Ding et.al. |
2404.09593 |
null |
2024-04-15 |
Modelling Language |
Jumbly Grindrod et.al. |
2404.09579 |
null |
2024-04-15 |
Transformers, Contextualism, and Polysemy |
Jumbly Grindrod et.al. |
2404.09577 |
null |
2024-04-15 |
Large language models and linguistic intentionality |
Jumbly Grindrod et.al. |
2404.09576 |
null |
2024-04-12 |
Probing the 3D Awareness of Visual Foundation Models |
Mohamed El Banani et.al. |
2404.08636 |
link |
2024-04-12 |
Pre-training Small Base LMs with Fewer Tokens |
Sunny Sanyal et.al. |
2404.08634 |
link |
2024-04-12 |
FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models |
Yanting Wang et.al. |
2404.08631 |
link |
2024-04-12 |
Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation |
Yanhao Zheng et.al. |
2404.08603 |
link |
2024-04-12 |
Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts |
Övgü Özdemir et.al. |
2404.08589 |
link |
2024-04-12 |
Pathological Primitive Segmentation Based on Visual Foundation Model with Zero-Shot Mask Generation |
Abu Bakor Hayat Arnob et.al. |
2404.08584 |
link |
2024-04-12 |
FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation |
Riza Velioglu et.al. |
2404.08582 |
null |
2024-04-12 |
Lossy Image Compression with Foundation Diffusion Models |
Lucas Relic et.al. |
2404.08580 |
null |
2024-04-12 |
Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation |
Hanlin Tian et.al. |
2404.08570 |
null |
2024-04-12 |
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs |
Shreyas Chaudhari et.al. |
2404.08555 |
null |
2024-04-12 |
Memory Traces: Are Transformers Tulving Machines? |
Jean-Marie Chauvet et.al. |
2404.08543 |
null |
2024-04-12 |
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward |
Xuan Xie et.al. |
2404.08517 |
null |
2024-04-12 |
ChatGPT and general-purpose AI count fruits in pictures surprisingly well |
Konlavach Mengsuwan et.al. |
2404.08515 |
null |
2024-04-12 |
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction |
Haoran Qiu et.al. |
2404.08509 |
link |
2024-04-12 |
LaSagnA: Language-based Segmentation Assistant for Complex Queries |
Cong Wei et.al. |
2404.08506 |
link |
2024-04-12 |
Strategic Interactions between Large Language Models-based Agents in Beauty Contests |
Siting Lu et.al. |
2404.08492 |
null |
2024-04-12 |
Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation |
Haozhe Zhao et.al. |
2404.08491 |
link |
2024-04-12 |
Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian |
Stefano De Paoli et.al. |
2404.08488 |
null |
2024-04-12 |
Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task |
Hassan Ali et.al. |
2404.08424 |
null |
2024-04-12 |
Adapting the Segment Anything Model During Usage in Novel Situations |
Robin Schön et.al. |
2404.08421 |
null |
2024-04-11 |
OpenBias: Open-set Bias Detection in Text-to-Image Generative Models |
Moreno D’Incà et.al. |
2404.07990 |
link |
2024-04-11 |
Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding |
Yiwen Tang et.al. |
2404.07989 |
link |
2024-04-11 |
Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning |
Simon Schrodi et.al. |
2404.07983 |
null |
2024-04-11 |
Language Imbalance Can Boost Cross-lingual Generalisation |
Anton Schäfer et.al. |
2404.07982 |
link |
2024-04-11 |
Manipulating Large Language Models to Increase Product Visibility |
Aounon Kumar et.al. |
2404.07981 |
link |
2024-04-11 |
LLoCO: Learning Long Contexts Offline |
Sijun Tan et.al. |
2404.07979 |
link |
2024-04-11 |
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models |
Haotian Zhang et.al. |
2404.07973 |
null |
2024-04-11 |
Rho-1: Not All Tokens Are What You Need |
Zhenghao Lin et.al. |
2404.07965 |
link |
2024-04-11 |
On Unified Prompt Tuning for Request Quality Assurance in Public Code Review |
Xinyu Chen et.al. |
2404.07942 |
null |
2024-04-11 |
Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation |
Jinkyung Park et.al. |
2404.07926 |
null |
2024-04-11 |
LaVy: Vietnamese Multimodal Large Language Model |
Chi Tran et.al. |
2404.07922 |
link |
2024-04-11 |
AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs |
Zeyi Liao et.al. |
2404.07921 |
link |
2024-04-11 |
DesignQA: A Multimodal Benchmark for Evaluating Large Language Models’ Understanding of Engineering Documentation |
Anna C. Doris et.al. |
2404.07917 |
link |
2024-04-11 |
HGRN2: Gated Linear RNNs with State Expansion |
Zhen Qin et.al. |
2404.07904 |
link |
2024-04-11 |
High-Dimension Human Value Representation in Large Language Models |
Samuel Cahyawijaya et.al. |
2404.07900 |
null |
2024-04-11 |
Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations |
Dayeon Ki et.al. |
2404.07851 |
link |
2024-04-11 |
On Training Data Influence of GPT Models |
Qingyi Liu et.al. |
2404.07840 |
link |
2024-04-11 |
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models |
Aleksandar Botev et.al. |
2404.07839 |
link |
2024-04-11 |
Streamlined Photoacoustic Image Processing with Foundation Models: A Training-Free Solution |
Handi Deng et.al. |
2404.07833 |
null |
2024-04-11 |
Heron-Bench: A Benchmark for Evaluating Vision Language Models in Japanese |
Yuichi Inoue et.al. |
2404.07824 |
link |
2024-04-10 |
BRAVE: Broadening the visual encoding of vision-language models |
Oğuzhan Fatih Kar et.al. |
2404.07204 |
null |
2024-04-10 |
UMBRAE: Unified Multimodal Decoding of Brain Signals |
Weihao Xia et.al. |
2404.07202 |
null |
2024-04-10 |
Scaling Laws for Data Filtering – Data Curation cannot be Compute Agnostic |
Sachin Goyal et.al. |
2404.07177 |
link |
2024-04-10 |
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention |
Tsendsuren Munkhdalai et.al. |
2404.07143 |
null |
2024-04-10 |
Open reaction-diffusion systems: bridging probabilistic theory across scales |
Mauricio J. del Razo et.al. |
2404.07119 |
null |
2024-04-10 |
Continuous Language Model Interpolation for Dynamic and Controllable Text Generation |
Sara Kangaslahti et.al. |
2404.07117 |
link |
2024-04-11 |
From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications |
Yongqiang Ma et.al. |
2404.07108 |
null |
2024-04-10 |
Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs |
Bowen Jin et.al. |
2404.07103 |
link |
2024-04-10 |
Dynamic Generation of Personalities with Large Language Models |
Jianzhi Liu et.al. |
2404.07084 |
link |
2024-04-10 |
VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning |
Alexandros Xenos et.al. |
2404.07078 |
link |
2024-04-10 |
Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? |
Mingyu Jin et.al. |
2404.07066 |
link |
2024-04-10 |
Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study |
Alessandro Stolfo et.al. |
2404.07060 |
null |
2024-04-10 |
Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation |
Elisa Sanchez-Bayona et.al. |
2404.07053 |
link |
2024-04-10 |
ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling |
Ege Özsoy et.al. |
2404.07031 |
null |
2024-04-10 |
Improving Language Model Reasoning with Self-motivated Learning |
Yunlong Feng et.al. |
2404.07017 |
null |
2024-04-10 |
A Mathematical Theory for Learning Semantic Languages by Abstract Learners |
Kuo-Yu Liao et.al. |
2404.07009 |
null |
2024-04-10 |
WordDecipher: Enhancing Digital Workspace Communication with Explainable AI for Non-native English Speakers |
Yuexi Chen et.al. |
2404.07005 |
null |
2024-04-10 |
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models |
Igor Tufanov et.al. |
2404.07004 |
null |
2024-04-10 |
Event Grounded Criminal Court View Generation with Cooperative (Large) Language Models |
Linan Yue et.al. |
2404.07001 |
link |
2024-04-10 |
Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study |
Hongru Du et.al. |
2404.06962 |
link |
2024-04-09 |
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD |
Xiaoyi Dong et.al. |
2404.06512 |
link |
2024-04-09 |
Can Feedback Enhance Semantic Grounding in Large Vision-Language Models? |
Yuan-Hong Liao et.al. |
2404.06510 |
null |
2024-04-09 |
On the Effect of (Near) Duplicate Subwords in Language Modelling |
Anton Schäfer et.al. |
2404.06508 |
link |
2024-04-09 |
Pitfalls of Conversational LLMs on News Debiasing |
Ipek Baris Schlicht et.al. |
2404.06488 |
null |
2024-04-10 |
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks |
Chonghua Wang et.al. |
2404.06480 |
link |
2024-04-10 |
Text-Based Reasoning About Vector Graphics |
Zhenhailong Wang et.al. |
2404.06479 |
null |
2024-04-09 |
Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models |
Zihan Fang et.al. |
2404.06448 |
null |
2024-04-09 |
Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems |
Kunal Garg et.al. |
2404.06413 |
null |
2024-04-09 |
AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents |
Luca Gioacchini et.al. |
2404.06411 |
link |
2024-04-09 |
Take a Look at it! Rethinking How to Evaluate Language Model Jailbreak |
Hongyu Cai et.al. |
2404.06407 |
link |
2024-04-09 |
Apprentices to Research Assistants: Advancing Research with Large Language Models |
M. Namvarpour et.al. |
2404.06404 |
null |
2024-04-09 |
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies |
Shengding Hu et.al. |
2404.06395 |
link |
2024-04-09 |
MuPT: A Generative Symbolic Music Pretrained Transformer |
Xingwei Qu et.al. |
2404.06393 |
null |
2024-04-09 |
Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis |
Mikel Zubillaga et.al. |
2404.06392 |
null |
2024-04-09 |
Latent Distance Guided Alignment Training for Large Language Models |
Haotian Luo et.al. |
2404.06390 |
null |
2024-04-09 |
Model Generation from Requirements with LLMs: an Exploratory Study |
Alessio Ferrari et.al. |
2404.06371 |
null |
2024-04-09 |
Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python |
Valdecy Pereira et.al. |
2404.06370 |
link |
2024-04-09 |
VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs |
Yi Gui et.al. |
2404.06369 |
null |
2024-04-09 |
ClinLinker: Medical Entity Linking of Clinical Concept Mentions in Spanish |
Fernando Gallego et.al. |
2404.06367 |
null |
2024-04-09 |
Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation |
Sidra Aleem et.al. |
2404.06362 |
link |
2024-04-08 |
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding |
Bo He et.al. |
2404.05726 |
link |
2024-04-08 |
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs |
Keen You et.al. |
2404.05719 |
null |
2024-04-08 |
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding |
Ahmad Idrissi-Yaghir et.al. |
2404.05694 |
null |
2024-04-08 |
Evaluating Mathematical Reasoning Beyond Accuracy |
Shijie Xia et.al. |
2404.05692 |
link |
2024-04-08 |
Retrieval-Augmented Open-Vocabulary Object Detection |
Jooyeon Kim et.al. |
2404.05687 |
link |
2024-04-08 |
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation |
Kunpeng Song et.al. |
2404.05674 |
link |
2024-04-08 |
CoReS: Orchestrating the Dance of Reasoning and Segmentation |
Xiaoyi Bao et.al. |
2404.05673 |
null |
2024-04-08 |
Fighting crime with Transformers: Empirical analysis of address parsing methods in payment data |
Haitham Hammami et.al. |
2404.05632 |
link |
2024-04-08 |
LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking |
Faren Yan et.al. |
2404.05624 |
null |
2024-04-08 |
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning |
Matteo Farina et.al. |
2404.05621 |
link |
2024-04-08 |
SpeechAlign: Aligning Speech Generation to Human Preferences |
Dong Zhang et.al. |
2404.05600 |
link |
2024-04-08 |
MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering |
Iñigo Alonso et.al. |
2404.05590 |
null |
2024-04-08 |
Enhancing Software Related Information Extraction with Generative Language Models through Single-Choice Question Answering |
Wolfgang Otto et.al. |
2404.05587 |
null |
2024-04-08 |
Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model |
Yue-Hua Han et.al. |
2404.05583 |
null |
2024-04-08 |
360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System |
Shen Gao et.al. |
2404.05569 |
null |
2024-04-08 |
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models |
Bowen Pan et.al. |
2404.05567 |
null |
2024-04-08 |
Chinese Sequence Labeling with Semi-Supervised Boundary-Aware Language Model Pre-training |
Longhui Zhang et.al. |
2404.05560 |
link |
2024-04-08 |
Evaluating Interventional Reasoning Capabilities of Large Language Models |
Tejas Kasetty et.al. |
2404.05545 |
null |
2024-04-08 |
OPSD: an Offensive Persian Social media Dataset and its baseline evaluations |
Mehran Safayani et.al. |
2404.05540 |
null |
2024-04-08 |
Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data |
Tim Baumgärtner et.al. |
2404.05530 |
null |
2024-04-05 |
Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2) |
Michael Saxon et.al. |
2404.04251 |
link |
2024-04-05 |
Physical Property Understanding from Language-Embedded Feature Fields |
Albert J. Zhai et.al. |
2404.04242 |
null |
2024-04-05 |
Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents |
Harsh Kohli et.al. |
2404.04237 |
null |
2024-04-05 |
player2vec: A Language Modeling Approach to Understand Player Behavior in Games |
Tianze Wang et.al. |
2404.04234 |
null |
2024-04-05 |
Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation |
Ji-Jia Wu et.al. |
2404.04231 |
link |
2024-04-05 |
Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation |
Tong Su et.al. |
2404.04212 |
null |
2024-04-05 |
Social Skill Training with Large Language Models |
Diyi Yang et.al. |
2404.04204 |
null |
2024-04-05 |
Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text? |
Ilya Ilyankou et.al. |
2404.04169 |
null |
2024-04-05 |
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model |
Xinrun Du et.al. |
2404.04167 |
null |
2024-04-05 |
Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval |
João Coelho et.al. |
2404.04163 |
null |
2024-04-05 |
BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models |
Jacek Wiland et.al. |
2404.04113 |
link |
2024-04-05 |
Large language models as oracles for instantiating ontologies with domain-specific knowledge |
Giovanni Ciatto et.al. |
2404.04108 |
link |
2024-04-05 |
Robust Preference Optimization with Provable Noise Tolerance for LLMs |
Xize Liang et.al. |
2404.04102 |
null |
2024-04-05 |
Label Propagation for Zero-shot Classification with Vision-Language Models |
Vladan Stojnić et.al. |
2404.04072 |
link |
2024-04-05 |
Assessing the quality of information extraction |
Filip Seitl et.al. |
2404.04068 |
null |
2024-04-05 |
CLUE: A Clinical Language Understanding Evaluation for LLMs |
Amin Dada et.al. |
2404.04067 |
link |
2024-04-05 |
VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots |
Akhil Padmanabha et.al. |
2404.04066 |
null |
2024-04-05 |
A Comparison of Methods for Evaluating Generative IR |
Negar Arabzadeh et.al. |
2404.04044 |
link |
2024-04-05 |
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer |
Hele-Andra Kuulmets et.al. |
2404.04042 |
null |
2024-04-05 |
Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds |
Annerose Eichel et.al. |
2404.04031 |
null |
2024-04-04 |
OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views |
Francis Engelmann et.al. |
2404.03650 |
null |
2024-04-04 |
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent |
Hanyu Lai et.al. |
2404.03648 |
link |
2024-04-04 |
Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra |
Darioush Kevian et.al. |
2404.03647 |
null |
2024-04-04 |
Locating and Editing Factual Associations in Mamba |
Arnab Sen Sharma et.al. |
2404.03646 |
link |
2024-04-04 |
Training LLMs over Neurally Compressed Text |
Brian Lester et.al. |
2404.03626 |
null |
2024-04-04 |
Standardizing Knowledge Engineering Practices with a Reference Architecture |
Bradley P. Allen et.al. |
2404.03624 |
null |
2024-04-04 |
Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph |
Marco Bronzini et.al. |
2404.03623 |
null |
2024-04-04 |
Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models |
Wenshan Wu et.al. |
2404.03622 |
null |
2024-04-04 |
DeViDe: Faceted medical knowledge for improved medical vision-language pre-training |
Haozhe Luo et.al. |
2404.03618 |
null |
2024-04-04 |
Sailor: Open Language Models for South-East Asia |
Longxu Dou et.al. |
2404.03608 |
link |
2024-04-04 |
Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization |
Aniruddha Nrusimha et.al. |
2404.03605 |
link |
2024-04-04 |
Evaluating LLMs at Detecting Errors in LLM Responses |
Ryo Kamoi et.al. |
2404.03602 |
link |
2024-04-04 |
Intent Detection and Entity Extraction from BioMedical Literature |
Ankan Mullick et.al. |
2404.03598 |
link |
2024-04-04 |
ReFT: Representation Finetuning for Language Models |
Zhengxuan Wu et.al. |
2404.03592 |
link |
2024-04-04 |
SemGrasp: Semantic Grasp Generation via Language Aligned Discretization |
Kailin Li et.al. |
2404.03590 |
null |
2024-04-04 |
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models |
Yantao Liu et.al. |
2404.03577 |
link |
2024-04-04 |
Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity |
Jake Varley et.al. |
2404.03570 |
null |
2024-04-04 |
Personalized LLM Response Generation with Parameterized Memory Injection |
Kai Zhang et.al. |
2404.03565 |
null |
2024-04-04 |
Select and Summarize: Scene Saliency for Movie Script Summarization |
Rohit Saxena et.al. |
2404.03561 |
null |
2024-04-04 |
How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes |
Harmon Bhasin et.al. |
2404.03558 |
link |
2024-04-03 |
ALOHa: A New Measure for Hallucination in Captioning Models |
Suzanne Petryk et.al. |
2404.02904 |
null |
2024-04-03 |
MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment |
Duygu Ceylan et.al. |
2404.02899 |
null |
2024-04-03 |
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline |
Yifan Xu et.al. |
2404.02893 |
link |
2024-04-03 |
MODNO: Multi Operator Learning With Distributed Neural Operators |
Zecheng Zhang et.al. |
2404.02892 |
null |
2024-04-03 |
Linear Attention Sequence Parallelism |
Weigao Sun et.al. |
2404.02882 |
link |
2024-04-03 |
Integrating Explanations in Learning LTL Specifications from Demonstrations |
Ashutosh Gupta et.al. |
2404.02872 |
null |
2024-04-03 |
Toward Inference-optimal Mixture-of-Expert Large Language Models |
Longfei Yun et.al. |
2404.02852 |
null |
2024-04-03 |
I-Design: Personalized LLM Interior Designer |
Ata Çelen et.al. |
2404.02838 |
null |
2024-04-03 |
Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models |
Wanyun Cui et.al. |
2404.02837 |
null |
2024-04-03 |
Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison |
Maxime Bouthors et.al. |
2404.02835 |
null |
2024-04-03 |
Empowering Biomedical Discovery with AI Agents |
Shanghua Gao et.al. |
2404.02831 |
null |
2024-04-03 |
BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models |
Qijun Luo et.al. |
2404.02827 |
link |
2024-04-03 |
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models |
Haoran Sun et.al. |
2404.02823 |
link |
2024-04-03 |
A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches |
Zhigen Zhao et.al. |
2404.02817 |
null |
2024-04-03 |
The RealHumanEval: Evaluating Large Language Models’ Abilities to Support Programmers |
Hussein Mozannar et.al. |
2404.02806 |
link |
2024-04-03 |
Efficient Multi-Vector Dense Retrieval Using Bit Vectors |
Franco Maria Nardini et.al. |
2404.02805 |
link |
2024-04-03 |
AI and personalized learning: bridging the gap with modern educational goals |
Kristjan-Julius Laak et.al. |
2404.02798 |
null |
2024-04-03 |
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech |
Jaehyeon Kim et.al. |
2404.02781 |
null |
2024-04-03 |
FPT: Feature Prompt Tuning for Few-shot Readability Assessment |
Ziyang Wang et.al. |
2404.02772 |
link |
2024-04-03 |
DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement |
Hao Wu et.al. |
2404.02755 |
null |
2024-04-02 |
Segment Any 3D Object with Language |
Seungjun Lee et.al. |
2404.02157 |
null |
2024-04-02 |
Iterated Learning Improves Compositionality in Large Vision-Language Models |
Chenhao Zheng et.al. |
2404.02145 |
null |
2024-04-02 |
Topic-based Watermarks for LLM-Generated Text |
Alexander Nemecek et.al. |
2404.02138 |
null |
2024-04-02 |
ViTamin: Designing Scalable Vision Models in the Vision-Language Era |
Jieneng Chen et.al. |
2404.02132 |
link |
2024-04-02 |
FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning |
Joel Niklaus et.al. |
2404.02127 |
link |
2024-04-02 |
Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models |
Wanyong Feng et.al. |
2404.02124 |
link |
2024-04-02 |
GINopic: Topic Modeling with Graph Isomorphism Network |
Suman Adhya et.al. |
2404.02115 |
link |
2024-04-02 |
CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems |
Sara Rosenthal et.al. |
2404.02103 |
link |
2024-04-02 |
Advancing LLM Reasoning Generalists with Preference Trees |
Lifan Yuan et.al. |
2404.02078 |
link |
2024-04-02 |
Red-Teaming Segment Anything Model |
Krzysztof Jankowski et.al. |
2404.02067 |
link |
2024-04-02 |
Digital Forgetting in Large Language Models: A Survey of Unlearning Methods |
Alberto Blanco-Justicia et.al. |
2404.02062 |
null |
2024-04-02 |
Long-context LLMs Struggle with Long In-context Learning |
Tianle Li et.al. |
2404.02060 |
link |
2024-04-02 |
IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT |
Junchen Fu et.al. |
2404.02059 |
link |
2024-04-02 |
Deconstructing In-Context Learning: Understanding Prompts via Corruption |
Namrata Shivagunde et.al. |
2404.02054 |
link |
2024-04-02 |
A Survey on Large Language Model-Based Game Agents |
Sihao Hu et.al. |
2404.02039 |
link |
2024-04-02 |
MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages |
Daryna Dementieva et.al. |
2404.02037 |
null |
2024-04-02 |
Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts |
Zhuo Chen et.al. |
2404.02022 |
null |
2024-04-02 |
Large Language Models for Orchestrating Bimanual Robots |
Kun Chu et.al. |
2404.02018 |
null |
2024-04-02 |
MuxServe: Flexible Multiplexing for Efficient Multiple LLM Serving |
Jiangfei Duan et.al. |
2404.02015 |
null |
2024-04-02 |
Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models |
Stephan Linzbach et.al. |
2404.01992 |
null |
2024-03-29 |
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models |
Atsuyuki Miyai et.al. |
2403.20331 |
link |
2024-03-29 |
Are We on the Right Way for Evaluating Large Vision-Language Models? |
Lin Chen et.al. |
2403.20330 |
link |
2024-03-29 |
ReALM: Reference Resolution As Language Modeling |
Joel Ruben Antony Moniz et.al. |
2403.20329 |
null |
2024-03-29 |
Gecko: Versatile Text Embeddings Distilled from Large Language Models |
Jinhyuk Lee et.al. |
2403.20327 |
null |
2024-03-29 |
Convolutional Prompting meets Language Models for Continual Learning |
Anurag Roy et.al. |
2403.20317 |
null |
2024-03-29 |
Learn “No” to Say “Yes” Better: Improving Vision-Language Models via Negations |
Jaisidh Singh et.al. |
2403.20312 |
link |
2024-03-29 |
Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference |
Jovan Stojkovic et.al. |
2403.20306 |
null |
2024-03-29 |
Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain |
Burcu Sayin et.al. |
2403.20288 |
link |
2024-03-29 |
LUQ: Long-text Uncertainty Quantification for LLMs |
Caiqi Zhang et.al. |
2403.20279 |
null |
2024-04-01 |
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want |
Weifeng Lin et.al. |
2403.20271 |
link |
2024-03-29 |
Latxa: An Open Language Model and Evaluation Suite for Basque |
Julen Etxaniz et.al. |
2403.20266 |
link |
2024-03-29 |
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models |
Thibaut Thonet et.al. |
2403.20262 |
null |
2024-03-29 |
MedCLIP-SAM: Bridging Text and Image Towards Universal Medical Image Segmentation |
Taha Koleilat et.al. |
2403.20253 |
null |
2024-03-29 |
Using LLMs to Model the Beliefs and Preferences of Targeted Populations |
Keiichi Namikoshi et.al. |
2403.20252 |
null |
2024-03-29 |
Long-Tailed Anomaly Detection with Learnable Class Names |
Chih-Hui Ho et.al. |
2403.20236 |
null |
2024-03-29 |
H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model |
Chao Pang et.al. |
2403.20213 |
link |
2024-03-29 |
Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science |
Yazheng Yang et.al. |
2403.20208 |
null |
2024-03-29 |
The Future of Combating Rumors? Retrieval, Discrimination, and Generation |
Junhao Xu et.al. |
2403.20204 |
null |
2024-03-29 |
ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models |
Shuo Liu et.al. |
2403.20194 |
null |
2024-03-29 |
HARMamba: Efficient Wearable Sensor Human Activity Recognition Based on Bidirectional Selective SSM |
Shuangjian Li et.al. |
2403.20183 |
null |
2024-03-28 |
RSMamba: Remote Sensing Image Classification with State Space Model |
Keyan Chen et.al. |
2403.19654 |
link |
2024-03-28 |
InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction |
Sirui Xu et.al. |
2403.19652 |
null |
2024-03-28 |
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions |
Kai Zhang et.al. |
2403.19651 |
null |
2024-03-28 |
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models |
Samuel Marks et.al. |
2403.19647 |
link |
2024-03-28 |
Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning |
Chenyang Liu et.al. |
2403.19646 |
link |
2024-03-28 |
Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models |
Yucheng Shi et.al. |
2403.19631 |
null |
2024-03-28 |
RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents |
Zeren Chen et.al. |
2403.19622 |
null |
2024-03-28 |
SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects |
Avinash Ummadisingu et.al. |
2403.19607 |
null |
2024-03-28 |
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation |
Zhongliang Zhou et.al. |
2403.19584 |
null |
2024-03-28 |
Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics |
Norman Di Palo et.al. |
2403.19578 |
null |
2024-03-28 |
WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models |
Piotr Molenda et.al. |
2403.19548 |
null |
2024-03-28 |
Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models |
Ang Lv et.al. |
2403.19521 |
link |
2024-03-28 |
Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data |
Shan Chen et.al. |
2403.19511 |
link |
2024-03-28 |
LLMs as Academic Reading Companions: Extending HCI Through Synthetic Personae |
Celia Chen et.al. |
2403.19506 |
null |
2024-03-28 |
Evolving Assembly Code in an Adversarial Environment |
Irina Maliukov et.al. |
2403.19489 |
null |
2024-03-28 |
JDocQA: Japanese Document Question Answering Dataset for Generative Language Models |
Eri Onami et.al. |
2403.19454 |
link |
2024-03-28 |
Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model |
Qi Gou et.al. |
2403.19443 |
null |
2024-03-28 |
OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion |
Xinyu Zhan et.al. |
2403.19417 |
null |
2024-03-28 |
BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation |
Yuhong He et.al. |
2403.19414 |
null |
2024-03-28 |
Checkpoint Merging via Bayesian Optimization in LLM Pretraining |
Deyuan Liu et.al. |
2403.19390 |
null |
2024-03-27 |
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models |
Yanwei Li et.al. |
2403.18814 |
link |
2024-03-27 |
ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation |
Suraj Patni et.al. |
2403.18807 |
link |
2024-03-27 |
Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation |
Mateusz Klimaszewski et.al. |
2403.18804 |
null |
2024-03-27 |
Projective Methods for Mitigating Gender Bias in Pre-trained Language Models |
Hillary Dawkins et.al. |
2403.18803 |
link |
2024-03-27 |
Long-form factuality in large language models |
Jerry Wei et.al. |
2403.18802 |
link |
2024-03-27 |
Towards a World-English Language Model for On-Device Virtual Assistants |
Rricha Jalota et.al. |
2403.18783 |
null |
2024-03-27 |
3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation |
Ehsan Latif et.al. |
2403.18778 |
null |
2024-03-27 |
ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object |
Chenshuang Zhang et.al. |
2403.18775 |
link |
2024-03-27 |
CheckEval: Robust Evaluation Framework using Large Language Model via Checklist |
Yukyung Lee et.al. |
2403.18771 |
null |
2024-03-27 |
MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model |
Yike Wu et.al. |
2403.18760 |
link |
2024-03-27 |
CYCLE: Learning to Self-Refine the Code Generation |
Yangruibo Ding et.al. |
2403.18746 |
link |
2024-03-27 |
Understanding the Learning Dynamics of Alignment with Human Feedback |
Shawn Im et.al. |
2403.18742 |
link |
2024-03-27 |
PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations |
Ehsan Latif et.al. |
2403.18721 |
null |
2024-03-27 |
Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding |
Xintong Wang et.al. |
2403.18715 |
null |
2024-03-27 |
The Invalsi Benchmark: measuring Language Models Mathematical and Language understanding in Italian |
Andrea Esuli et.al. |
2403.18697 |
null |
2024-03-27 |
NL-ITI: Optimizing Probing and Intervention for Improvement of ITI Method |
Jakub Hoscilowicz et.al. |
2403.18680 |
link |
2024-03-27 |
An Exploratory Study on Upper-Level Computing Students’ Use of Large Language Models as Tools in a Semester-Long Project |
Ben Arie Tanay et.al. |
2403.18679 |
null |
2024-03-27 |
SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens |
Chengbo Liu et.al. |
2403.18647 |
link |
2024-03-27 |
To Recommend or Not: Recommendability Identification in Conversations with Pre-trained Language Models |
Zhefan Wang et.al. |
2403.18628 |
link |
2024-03-27 |
Vulnerability Detection with Code Language Models: How Far Are We? |
Yangruibo Ding et.al. |
2403.18624 |
link |
2024-03-26 |
OmniVid: A Generative Framework for Universal Video Understanding |
Junke Wang et.al. |
2403.17935 |
link |
2024-03-26 |
Track Everything Everywhere Fast and Robustly |
Yunzhou Song et.al. |
2403.17931 |
null |
2024-03-26 |
MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution |
Wei Tao et.al. |
2403.17927 |
null |
2024-03-26 |
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning |
Rui Pan et.al. |
2403.17919 |
link |
2024-03-26 |
Large scale paired antibody language models |
Henry Kenlay et.al. |
2403.17889 |
null |
2024-03-26 |
Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation |
Carlos Gomes et.al. |
2403.17886 |
null |
2024-03-26 |
MIND Your Language: A Multilingual Dataset for Cross-lingual News Recommendation |
Andreea Iana et.al. |
2403.17876 |
link |
2024-03-26 |
Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach |
Andrea Ferrario et.al. |
2403.17873 |
null |
2024-03-26 |
Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications |
Philip Lippmann et.al. |
2403.17860 |
null |
2024-03-26 |
ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages |
Bhawna Piryani et.al. |
2403.17859 |
link |
2024-03-26 |
Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs |
David R. Mortensen et.al. |
2403.17856 |
null |
2024-03-26 |
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering |
Abdelrahman Abdallah et.al. |
2403.17848 |
link |
2024-03-26 |
Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation |
Abdelrhman Werby et.al. |
2403.17846 |
null |
2024-03-26 |
Mechanistic Design and Scaling of Hybrid Architectures |
Michael Poli et.al. |
2403.17844 |
null |
2024-03-26 |
ReMamber: Referring Image Segmentation with Mamba Twister |
Yuhuan Yang et.al. |
2403.17839 |
null |
2024-03-26 |
A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities |
Ibrahim Ethem Hamamci et.al. |
2403.17834 |
link |
2024-03-26 |
Assessment of Multimodal Large Language Models in Alignment with Human Values |
Zhelun Shi et.al. |
2403.17830 |
null |
2024-03-26 |
Accelerating Radio Spectrum Regulation Workflows with Large Language Models (LLMs) |
Amir Ghasemi et.al. |
2403.17819 |
null |
2024-03-26 |
Graph Language Model (GLM): A new graph-based approach to detect social instabilities |
Wallyson Lemes de Oliveira et.al. |
2403.17816 |
null |
2024-03-26 |
Are Compressed Language Models Less Subgroup Robust? |
Leonidas Gee et.al. |
2403.17811 |
link |
2024-03-25 |
Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making |
Shuai Ma et.al. |
2403.16812 |
null |
2024-03-25 |
An LLM-Based Digital Twin for Optimizing Human-in-the Loop Systems |
Hanqing Yang et.al. |
2403.16809 |
link |
2024-03-25 |
Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback |
Zhangqian Bi et.al. |
2403.16792 |
null |
2024-03-25 |
All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification |
Deepak Narayan Gadde et.al. |
2403.16750 |
null |
2024-03-25 |
A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models |
Nils Ingelhag et.al. |
2403.16730 |
null |
2024-03-25 |
ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search |
Zehan Li et.al. |
2403.16702 |
link |
2024-03-25 |
Synapse: Learning Preferential Concepts from Visual Demonstrations |
Sadanand Modak et.al. |
2403.16689 |
null |
2024-03-25 |
Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography |
Jiayue Zhang et.al. |
2403.16687 |
null |
2024-03-25 |
RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict |
Yirong Zeng et.al. |
2403.16662 |
link |
2024-03-25 |
Grammatical vs Spelling Error Correction: An Investigation into the Responsiveness of Transformer-based Language Models using BART and MarianMT |
Rohit Raju et.al. |
2403.16655 |
null |
2024-03-25 |
CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment |
Feiteng Fang et.al. |
2403.16649 |
link |
2024-03-25 |
Virtual Co-Pilot: Multimodal Large Language Model-enabled Quick-access Procedures for Single Pilot Operations |
Fan Li et.al. |
2403.16645 |
null |
2024-03-25 |
Semantically Enriched Cross-Lingual Sentence Embeddings for Crisis-related Social Media Texts |
Rabindra Lamsal et.al. |
2403.16614 |
null |
2024-03-25 |
Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units |
Biswesh Mohapatra et.al. |
2403.16609 |
null |
2024-03-25 |
TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques |
Ashok Urlana et.al. |
2403.16592 |
null |
2024-03-25 |
Can Large Language Models (or Humans) Distill Text? |
Nicolas Audinet de Pieuchon et.al. |
2403.16584 |
null |
2024-03-25 |
NSINA: A News Corpus for Sinhala |
Hansi Hettiarachchi et.al. |
2403.16571 |
link |
2024-03-25 |
Elysium: Exploring Object-level Perception in Videos via MLLM |
Han Wang et.al. |
2403.16558 |
link |
2024-03-25 |
DOrA: 3D Visual Grounding with Order-Aware Referring |
Tung-Yu Wu et.al. |
2403.16539 |
null |
2024-03-25 |
Open-Set Recognition in the Age of Vision-Language Models |
Dimity Miller et.al. |
2403.16528 |
null |
2024-03-25 |
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art |
Neeloy Chakraborty et.al. |
2403.16527 |
null |
2024-03-25 |
Harnessing the power of LLMs for normative reasoning in MASs |
Bastin Tony Roy Savarimuthu et.al. |
2403.16524 |
null |
2024-03-25 |
Norm Violation Detection in Multi-Agent Systems using Large Language Models: A Pilot Study |
Shawn He et.al. |
2403.16517 |
null |
2024-03-25 |
Linguistically Differentiating Acts and Recalls of Racial Microaggressions on Social Media |
Uma Sushmitha Gunturi et.al. |
2403.16514 |
null |
2024-03-22 |
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models |
Yuzhang Shang et.al. |
2403.15388 |
null |
2024-03-22 |
Long-CLIP: Unlocking the Long-Text Capability of CLIP |
Beichen Zhang et.al. |
2403.15378 |
link |
2024-03-22 |
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding |
Yi Wang et.al. |
2403.15377 |
link |
2024-03-22 |
Can large language models explore in-context? |
Akshay Krishnamurthy et.al. |
2403.15371 |
null |
2024-03-22 |
CoLLEGe: Concept Embedding Generation for Large Language Models |
Ryan Teehan et.al. |
2403.15362 |
null |
2024-03-22 |
Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities |
Zhitong Xiong et.al. |
2403.15356 |
link |
2024-03-22 |
Controlled Training Data Generation with Diffusion Models |
Teresa Yeo et.al. |
2403.15309 |
null |
2024-03-22 |
Sphere Neural-Networks for Rational Reasoning |
Tiansi Dong et.al. |
2403.15297 |
null |
2024-03-22 |
Measuring Gender and Racial Biases in Large Language Models |
Jiafu An et.al. |
2403.15281 |
null |
2024-03-22 |
Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review |
Jinge Wang et.al. |
2403.15274 |
null |
2024-03-22 |
Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs |
Xiaobin Zhang et.al. |
2403.15273 |
null |
2024-03-22 |
Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models |
Huanxuan Liao et.al. |
2403.15268 |
link |
2024-03-22 |
AI Exposure and Strategic Positioning on an Online Work Platform |
Shun Yiu et.al. |
2403.15262 |
null |
2024-03-22 |
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions |
Orion Weller et.al. |
2403.15246 |
link |
2024-03-22 |
Shadow Generation for Composite Image Using Diffusion model |
Qingyang Liu et.al. |
2403.15234 |
link |
2024-03-22 |
An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets |
Jonathan Katzy et.al. |
2403.15230 |
link |
2024-03-22 |
Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models |
Qiong Wu et.al. |
2403.15226 |
null |
2024-03-22 |
Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations |
Pranav Kulkarni et.al. |
2403.15218 |
link |
2024-03-22 |
InstaSynth: Opportunities and Challenges in Generating Synthetic Instagram Data with ChatGPT for Sponsored Content Detection |
Thales Bertaglia et.al. |
2403.15214 |
link |
2024-03-22 |
MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection |
Taeheon Kim et.al. |
2403.15209 |
null |
2024-03-21 |
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? |
Renrui Zhang et.al. |
2403.14624 |
null |
2024-03-21 |
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey |
Zeyu Han et.al. |
2403.14608 |
null |
2024-03-21 |
MyVLM: Personalizing VLMs for User-Specific Queries |
Yuval Alaluf et.al. |
2403.14599 |
null |
2024-03-21 |
ReAct Meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training |
Zonghan Yang et.al. |
2403.14589 |
null |
2024-03-21 |
Large Language Models for Multi-Choice Question Classification of Medical Subjects |
Víctor Ponce-López et.al. |
2403.14582 |
null |
2024-03-21 |
RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain |
William James Bolton et.al. |
2403.14578 |
link |
2024-03-21 |
A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students’ Formative Assessment Responses in Science |
Clayton Cohn et.al. |
2403.14565 |
null |
2024-03-21 |
The Era of Semantic Decoding |
Maxime Peyrard et.al. |
2403.14562 |
null |
2024-03-21 |
Lexicon-Level Contrastive Visual-Grounding Improves Language Modeling |
Chengxu Zhuang et.al. |
2403.14551 |
null |
2024-03-21 |
EDT: Improving Large Language Models’ Generation by Entropy-based Dynamic Temperature Sampling |
Shimao Zhang et.al. |
2403.14541 |
link |
2024-03-21 |
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference |
Han Zhao et.al. |
2403.14520 |
null |
2024-03-21 |
The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs) |
Joschka Haltaufderheide et.al. |
2403.14473 |
null |
2024-03-21 |
Detoxifying Large Language Models via Knowledge Editing |
Mengru Wang et.al. |
2403.14472 |
link |
2024-03-21 |
ChatGPT Alternative Solutions: Large Language Models Survey |
Hanieh Alipour et.al. |
2403.14469 |
null |
2024-03-21 |
Recourse for reclamation: Chatting with generative language models |
Jennifer Chien et.al. |
2403.14467 |
null |
2024-03-21 |
Towards Single-System Illusion in Software-Defined Vehicles – Automated, AI-Powered Workflow |
Krzysztof Lebioda et.al. |
2403.14460 |
null |
2024-03-21 |
Multi-Level Explanations for Generative Language Models |
Lucas Monteiro Paes et.al. |
2403.14459 |
null |
2024-03-21 |
gTBLS: Generating Tables from Text by Conditional Question Answering |
Anirudh Sundar et.al. |
2403.14457 |
null |
2024-03-21 |
Language Models Can Reduce Asymmetry in Information Markets |
Nasim Rahaman et.al. |
2403.14443 |
null |
2024-03-21 |
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models |
Dominik Wagner et.al. |
2403.14438 |
null |
2024-03-20 |
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition |
Ziyu Liu et.al. |
2403.13805 |
link |
2024-03-20 |
Learning from Models and Data for Visual Grounding |
Ruozhen He et.al. |
2403.13804 |
null |
2024-03-20 |
Reverse Training to Nurse the Reversal Curse |
Olga Golovneva et.al. |
2403.13799 |
null |
2024-03-20 |
Bridge the Modality and Capacity Gaps in Vision-Language Model Selection |
Chao Yi et.al. |
2403.13797 |
null |
2024-03-20 |
RewardBench: Evaluating Reward Models for Language Modeling |
Nathan Lambert et.al. |
2403.13787 |
link |
2024-03-20 |
Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts |
Guangzeng Han et.al. |
2403.13786 |
link |
2024-03-20 |
Information-Theoretic Distillation for Reference-less Summarization |
Jaehun Jung et.al. |
2403.13780 |
null |
2024-03-20 |
Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation |
Hugues Thomas et.al. |
2403.13777 |
null |
2024-03-20 |
Describe-and-Dissect: Interpreting Neurons in Vision Networks with Language Models |
Nicholas Bai et.al. |
2403.13771 |
link |
2024-03-20 |
Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model |
Diwei Wang et.al. |
2403.13756 |
null |
2024-03-20 |
Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement |
Catherine Arnett et.al. |
2403.13754 |
null |
2024-03-20 |
EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation |
Atnafu Lambebo Tonja et.al. |
2403.13737 |
null |
2024-03-20 |
Large Language Models meet Network Slicing Management and Orchestration |
Abdulhalim Dandoush et.al. |
2403.13721 |
null |
2024-03-20 |
SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning |
Hongjun Wang et.al. |
2403.13684 |
null |
2024-03-20 |
PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents |
Mitodru Niyogi et.al. |
2403.13681 |
null |
2024-03-20 |
RoleInteract: Evaluating the Social Interaction of Role-Playing Agents |
Hongzhan Chen et.al. |
2403.13679 |
link |
2024-03-20 |
Grounding Spatial Relations in Text-Only Language Models |
Gorka Azkune et.al. |
2403.13666 |
link |
2024-03-20 |
Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese |
Meet Doshi et.al. |
2403.13638 |
null |
2024-03-20 |
VL-Mamba: Exploring State Space Models for Multimodal Learning |
Yanyuan Qiao et.al. |
2403.13600 |
null |
2024-03-20 |
No more optimization rules: LLM-enabled policy-based multi-modal query optimizer (version 1) |
Yifan Wang et.al. |
2403.13597 |
null |
2024-03-19 |
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression |
Zhuoshi Pan et.al. |
2403.12968 |
link |
2024-03-19 |
Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models |
Zuyan Liu et.al. |
2403.12966 |
link |
2024-03-19 |
Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models |
Ce Zhang et.al. |
2403.12964 |
link |
2024-03-19 |
Dated Data: Tracing Knowledge Cutoffs in Large Language Models |
Jeffrey Cheng et.al. |
2403.12958 |
null |
2024-03-19 |
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models |
Elaine Sui et.al. |
2403.12952 |
link |
2024-03-19 |
Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models |
Joana Ribeiro de Faria et.al. |
2403.12936 |
null |
2024-03-19 |
Segment Anything for comprehensive analysis of grapevine cluster architecture and berry properties |
Efrain Torres-Lomas et.al. |
2403.12935 |
null |
2024-03-19 |
Rapid AIdeation: Generating Ideas With the Self and in Collaboration With Large Language Models |
Gionnieve Lim et.al. |
2403.12928 |
null |
2024-03-19 |
Supporting Energy Policy Research with Large Language Models |
Grant Buster et.al. |
2403.12924 |
null |
2024-03-19 |
Contextual AD Narration with Interleaved Multimodal Sequence |
Hanlin Wang et.al. |
2403.12922 |
null |
2024-03-19 |
Semantic Layering in Room Segmentation via LLMs |
Taehyeon Kim et.al. |
2403.12920 |
null |
2024-03-19 |
Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts |
Sai Ashish Somayajula et.al. |
2403.12918 |
link |
2024-03-19 |
Yell At Your Robot: Improving On-the-Fly from Language Corrections |
Lucy Xiaoyang Shi et.al. |
2403.12910 |
null |
2024-03-19 |
Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference |
Baolin Li et.al. |
2403.12900 |
null |
2024-03-19 |
mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding |
Anwen Hu et.al. |
2403.12895 |
link |
2024-03-20 |
MEDBind: Unifying Language and Multimodal Medical Data Embeddings |
Yuan Gao et.al. |
2403.12894 |
null |
2024-03-19 |
HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning |
Fucai Ke et.al. |
2403.12884 |
null |
2024-03-19 |
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models |
Zehui Chen et.al. |
2403.12881 |
link |
2024-03-19 |
Epistemology of Language Models: Do Language Models Have Holistic Knowledge? |
Minsu Kim et.al. |
2403.12862 |
null |
2024-03-19 |
RASP: A Drone-based Reconfigurable Actuation and Sensing Platform Towards Ambient Intelligent Systems |
Minghui Zhao et.al. |
2403.12853 |
null |
2024-03-18 |
Modality-Agnostic fMRI Decoding of Vision and Language |
Mitja Nikolaus et.al. |
2403.11771 |
null |
2024-03-18 |
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs |
M. Jehanzeb Mirza et.al. |
2403.11755 |
link |
2024-03-18 |
Revisiting The Classics: A Study on Identifying and Rectifying Gender Stereotypes in Rhymes and Poems |
Aditya Narayan Sankaran et.al. |
2403.11752 |
null |
2024-03-18 |
Embedded Named Entity Recognition using Probing Classifiers |
Nicholas Popovič et.al. |
2403.11747 |
null |
2024-03-18 |
TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models |
Lisa Weijler et.al. |
2403.11691 |
null |
2024-03-18 |
HDLdebugger: Streamlining HDL debugging with Large Language Models |
Xufeng Yao et.al. |
2403.11671 |
null |
2024-03-18 |
Prioritized Semantic Learning for Zero-shot Instance Navigation |
Xander Sun et.al. |
2403.11650 |
null |
2024-03-18 |
Arc2Face: A Foundation Model of Human Faces |
Foivos Paraperas Papantoniou et.al. |
2403.11641 |
link |
2024-03-18 |
Compositional Kronecker Context Optimization for Vision-Language Models |
Kun Ding et.al. |
2403.11631 |
null |
2024-03-18 |
Let’s Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model |
Haoyun Xu et.al. |
2403.11621 |
null |
2024-03-18 |
CRS-Diff: Controllable Generative Remote Sensing Foundation Model |
Datao Tang et.al. |
2403.11614 |
link |
2024-03-18 |
Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines |
Ekaterina Trofimova et.al. |
2403.11585 |
null |
2024-03-18 |
Reinforcement Learning with Token-level Feedback for Controllable Text Generation |
Wendi Li et.al. |
2403.11558 |
link |
2024-03-18 |
LLM^3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning |
Shu Wang et.al. |
2403.11552 |
link |
2024-03-18 |
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters |
Jiazuo Yu et.al. |
2403.11549 |
link |
2024-03-18 |
DEE: Dual-stage Explainable Evaluation Method for Text Generation |
Shenyu Zhang et.al. |
2403.11509 |
null |
2024-03-18 |
Do CLIPs Always Generalize Better than ImageNet Models? |
Qizhou Wang et.al. |
2403.11497 |
null |
2024-03-18 |
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding |
Yue Fan et.al. |
2403.11481 |
null |
2024-03-18 |
HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models |
Huy Nghiem et.al. |
2403.11456 |
link |
2024-03-18 |
Zero-shot Compound Expression Recognition with Visual Language Model at the 6th ABAW Challenge |
Jiahe Wang et.al. |
2403.11450 |
null |
2024-03-18 |
LLM Guided Evolution - The Automation of Models Advancing Models |
Clint Morris et.al. |
2403.11446 |
null |
2024-03-18 |
StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation |
Jinpeng Li et.al. |
2403.11439 |
null |
2024-03-18 |
InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions |
Yifan Wang et.al. |
2403.11435 |
null |
2024-03-18 |
A Novel Paradigm Boosting Translation Capabilities of Large Language Models |
Jiaxin Guo et.al. |
2403.11430 |
null |
2024-03-15 |
VideoAgent: Long-form Video Understanding with Large Language Model as Agent |
Xiaohan Wang et.al. |
2403.10517 |
null |
2024-03-15 |
Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization |
Ratnadira Widyasari et.al. |
2403.10507 |
null |
2024-03-15 |
ATOM: Asynchronous Training of Massive Models for Deep Learning in a Decentralized Environment |
Xiaofeng Wu et.al. |
2403.10504 |
null |
2024-03-15 |
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study |
Chenguang Wang et.al. |
2403.10499 |
link |
2024-03-15 |
Reconfigurable Robot Identification from Motion Data |
Yuhang Hu et.al. |
2403.10496 |
null |
2024-03-15 |
Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst? |
Bruno de Melo et.al. |
2403.10482 |
null |
2024-03-15 |
Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases |
Jiarui Li et.al. |
2403.10446 |
link |
2024-03-15 |
Optimal Block-Level Draft Verification for Accelerating Speculative Decoding |
Ziteng Sun et.al. |
2403.10444 |
null |
2024-03-15 |
Using an LLM to Turn Sign Spottings into Spoken Language Sentences |
Ozge Mercanoglu Sincan et.al. |
2403.10434 |
null |
2024-03-15 |
SocialGenPod: Privacy-Friendly Generative AI Social Web Applications with Decentralised Personal Data Stores |
Vidminas Vizgirda et.al. |
2403.10408 |
link |
2024-03-15 |
A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE |
Hervé Déjean et.al. |
2403.10407 |
null |
2024-03-15 |
Monotonic Representation of Numeric Properties in Language Models |
Benjamin Heinzerling et.al. |
2403.10381 |
link |
2024-03-15 |
EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models |
Rocktim Jyoti Das et.al. |
2403.10378 |
link |
2024-03-15 |
TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale |
Pengcheng Jiang et.al. |
2403.10351 |
null |
2024-03-15 |
Investigating grammatical abstraction in language models using few-shot learning of novel noun gender |
Priyanka Sukumaran et.al. |
2403.10338 |
null |
2024-03-15 |
CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model |
Shang-Hsuan Chiang et.al. |
2403.10326 |
link |
2024-03-15 |
NetBench: A Large-Scale and Comprehensive Network Traffic Benchmark Dataset for Foundation Models |
Chen Qian et.al. |
2403.10319 |
link |
2024-03-15 |
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer |
Hengxing Cai et.al. |
2403.10301 |
null |
2024-03-15 |
Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models |
Tian Meng et.al. |
2403.10287 |
null |
2024-03-15 |
Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning |
Shang-Hsuan Chiang et.al. |
2403.10281 |
link |
2024-03-14 |
GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping |
Yuhang Zheng et.al. |
2403.09637 |
link |
2024-03-14 |
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference |
Piotr Nawrot et.al. |
2403.09636 |
null |
2024-03-14 |
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models |
Akhil Kedia et.al. |
2403.09635 |
link |
2024-03-14 |
OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning |
Lingyi Hong et.al. |
2403.09634 |
null |
2024-03-14 |
3D-VLA: A 3D Vision-Language-Action Generative World Model |
Haoyu Zhen et.al. |
2403.09631 |
null |
2024-03-14 |
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking |
Eric Zelikman et.al. |
2403.09629 |
link |
2024-03-14 |
Explore In-Context Segmentation via Latent Diffusion Models |
Chaoyang Wang et.al. |
2403.09616 |
null |
2024-03-14 |
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training |
Brandon McKinzie et.al. |
2403.09611 |
null |
2024-03-14 |
Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey |
Xiaoyu Liu et.al. |
2403.09606 |
null |
2024-03-14 |
Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis |
Gregory Coppola et.al. |
2403.09599 |
null |
2024-03-14 |
Renovating Names in Open-Vocabulary Segmentation Benchmarks |
Haiwen Huang et.al. |
2403.09593 |
null |
2024-03-14 |
ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models |
Runyu Ma et.al. |
2403.09583 |
null |
2024-03-14 |
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation |
Yunhao Gou et.al. |
2403.09572 |
null |
2024-03-14 |
Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models |
Laura Fernández-Becerra et.al. |
2403.09567 |
null |
2024-03-14 |
Welcome Your New AI Teammate: On Safety Analysis by Leashing Large Language Models |
Ali Nouri et.al. |
2403.09565 |
null |
2024-03-14 |
PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps |
Ruixuan Liu et.al. |
2403.09562 |
null |
2024-03-14 |
Less is More: Data Value Estimation for Visual Instruction Tuning |
Zikang Liu et.al. |
2403.09559 |
null |
2024-03-15 |
Logits of API-Protected LLMs Leak Proprietary Information |
Matthew Finlayson et.al. |
2403.09539 |
null |
2024-03-14 |
VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding |
Chris Kelly et.al. |
2403.09530 |
null |
2024-03-15 |
WavCraft: Audio Editing and Generation with Natural Language Prompts |
Jinhua Liang et.al. |
2403.09527 |
link |
2024-03-13 |
Simple and Scalable Strategies to Continually Pre-train Large Language Models |
Adam Ibrahim et.al. |
2403.08763 |
link |
2024-03-13 |
Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework |
Jingling Li et.al. |
2403.08743 |
null |
2024-03-13 |
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models |
Carlo Nicolini et.al. |
2403.08739 |
null |
2024-03-13 |
ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation |
Sayar Ghosh Roy et.al. |
2403.08737 |
link |
2024-03-13 |
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization |
Renjie Pi et.al. |
2403.08730 |
null |
2024-03-14 |
SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents |
Ruiyi Wang et.al. |
2403.08715 |
link |
2024-03-13 |
Review of Generative AI Methods in Cybersecurity |
Yagmur Yigit et.al. |
2403.08701 |
null |
2024-03-13 |
TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning |
Shangding Gu et.al. |
2403.08694 |
null |
2024-03-13 |
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages |
Rik van Noord et.al. |
2403.08693 |
null |
2024-03-13 |
Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records |
Erlend Frayling et.al. |
2403.08664 |
null |
2024-03-13 |
Self-Supervised Learning for Covariance Estimation |
Tzvi Diskin et.al. |
2403.08662 |
null |
2024-03-13 |
Human Alignment of Large Language Models through Online Preference Optimisation |
Daniele Calandriello et.al. |
2403.08635 |
null |
2024-03-13 |
MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models |
Subash Neupane et.al. |
2403.08607 |
null |
2024-03-13 |
Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation |
Daniel Honerkamp et.al. |
2403.08605 |
link |
2024-03-13 |
DevBench: A Comprehensive Benchmark for Software Development |
Bowen Li et.al. |
2403.08604 |
link |
2024-03-13 |
Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments |
Sitao Cheng et.al. |
2403.08593 |
null |
2024-03-13 |
Non-discrimination Criteria for Generative Language Models |
Sara Sterlie et.al. |
2403.08564 |
null |
2024-03-13 |
AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models |
Yifei Gao et.al. |
2403.08542 |
null |
2024-03-13 |
Language models scale reliably with over-training and on downstream tasks |
Samir Yitzhak Gadre et.al. |
2403.08540 |
link |
2024-03-13 |
Masked Generative Story Transformer with Character Guidance and Caption Augmentation |
Christos Papadimitriou et.al. |
2403.08502 |
link |
2024-03-12 |
Beyond Text: Frozen Large Language Models in Visual Signal Comprehension |
Lei Zhu et.al. |
2403.07874 |
link |
2024-03-12 |
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension |
Fangyun Wei et.al. |
2403.07872 |
null |
2024-03-12 |
Exploring Safety Generalization Challenges of Large Language Models via Code |
Qibing Ren et.al. |
2403.07865 |
null |
2024-03-12 |
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation |
Shihao Zhao et.al. |
2403.07860 |
link |
2024-03-12 |
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric |
Haokun Lin et.al. |
2403.07839 |
null |
2024-03-12 |
DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies |
William Xie et.al. |
2403.07832 |
null |
2024-03-12 |
The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing |
Jianchen Wang et.al. |
2403.07825 |
null |
2024-03-12 |
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM |
Sainbayar Sukhbaatar et.al. |
2403.07816 |
null |
2024-03-12 |
Chronos: Learning the Language of Time Series |
Abdul Fatir Ansari et.al. |
2403.07815 |
link |
2024-03-12 |
Beyond Memorization: The Challenge of Random Memory Access in Language Models |
Tongyao Zhu et.al. |
2403.07805 |
link |
2024-03-12 |
Fine-tuning Large Language Models with Sequential Instructions |
Hanxu Hu et.al. |
2403.07794 |
link |
2024-03-12 |
Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations |
Carlos Jose Xavier Cruz et.al. |
2403.07769 |
link |
2024-03-12 |
Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings |
Sahand Sharifzadeh et.al. |
2403.07750 |
null |
2024-03-12 |
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models |
Yan Liu et.al. |
2403.07747 |
null |
2024-03-12 |
Multi-modal Auto-regressive Modeling via Visual Words |
Tianshuo Peng et.al. |
2403.07720 |
link |
2024-03-12 |
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? |
Alexandre Drouin et.al. |
2403.07718 |
link |
2024-03-12 |
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models |
Zhicheng Guo et.al. |
2403.07714 |
link |
2024-03-12 |
Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards |
Wei Shen et.al. |
2403.07708 |
null |
2024-03-12 |
Large, Small or Both: A Novel Data Augmentation Framework Based on Language Models for Debiasing Opinion Summarization |
Yanyue Zhang et.al. |
2403.07693 |
null |
2024-03-12 |
Reference-free Monolithic Preference Optimization with Odds Ratio |
Jiwoo Hong et.al. |
2403.07691 |
link |
2024-03-11 |
Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena |
Leonie Weissweiler et.al. |
2403.06965 |
null |
2024-03-11 |
Materials science in the era of large language models: a perspective |
Ge Lei et.al. |
2403.06949 |
null |
2024-03-11 |
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation |
Xinyao Li et.al. |
2403.06946 |
link |
2024-03-11 |
Naming, Describing, and Quantifying Visual Objects in Humans and LLMs |
Alberto Testoni et.al. |
2403.06935 |
link |
2024-03-11 |
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis |
Yanming Liu et.al. |
2403.06932 |
link |
2024-03-11 |
MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning |
Yichuan Li et.al. |
2403.06914 |
link |
2024-03-11 |
Application of Quantum Tensor Networks for Protein Classification |
Debarshi Kundu et.al. |
2403.06890 |
null |
2024-03-11 |
Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents |
Nishchal Prasad et.al. |
2403.06872 |
link |
2024-03-11 |
Semantic Residual Prompts for Continual Learning |
Martin Menabue et.al. |
2403.06870 |
null |
2024-03-11 |
Learning with Noisy Foundation Models |
Hao Chen et.al. |
2403.06869 |
null |
2024-03-11 |
A Geospatial Approach to Predicting Desert Locust Breeding Grounds in Africa |
Ibrahim Salihu Yusuf et.al. |
2403.06860 |
null |
2024-03-11 |
Development of a Reliable and Accessible Caregiving Language Model (CaLM) |
Bambang Parmanto et.al. |
2403.06857 |
null |
2024-03-11 |
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation |
Guosheng Zhao et.al. |
2403.06845 |
null |
2024-03-11 |
RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback |
Yanming Liu et.al. |
2403.06840 |
link |
2024-03-11 |
ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts |
Lyuye Zhang et.al. |
2403.06838 |
null |
2024-03-11 |
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? |
Egor Zverev et.al. |
2403.06833 |
link |
2024-03-11 |
The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework |
Zhuo Chen et.al. |
2403.06832 |
link |
2024-03-11 |
ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model |
Zhiwei Liu et.al. |
2403.06765 |
link |
2024-03-11 |
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models |
Liang Chen et.al. |
2403.06764 |
link |
2024-03-11 |
ALaRM: Align Language Models via Hierarchical Rewards Modeling |
Yuhang Lai et.al. |
2403.06754 |
null |
2024-03-08 |
Bayesian Preference Elicitation with Language Models |
Kunal Handa et.al. |
2403.05534 |
null |
2024-03-08 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context |
Machel Reid et.al. |
2403.05530 |
null |
2024-03-08 |
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM |
Hao Kang et.al. |
2403.05527 |
link |
2024-03-08 |
DeepSeek-VL: Towards Real-World Vision-Language Understanding |
Haoyu Lu et.al. |
2403.05525 |
link |
2024-03-08 |
Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola |
Yijiang Li et.al. |
2403.05523 |
null |
2024-03-08 |
Authorship Attribution in Bangla Literature (AABL) via Transfer Learning using ULMFiT |
Aisha Khatun et.al. |
2403.05519 |
null |
2024-03-08 |
Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought |
James Chua et.al. |
2403.05518 |
link |
2024-03-08 |
To Err Is Human, but Llamas Can Learn It Too |
Agnes Luhtaru et.al. |
2403.05493 |
null |
2024-03-08 |
Will GPT-4 Run DOOM? |
Adrian de Wynter et.al. |
2403.05468 |
null |
2024-03-08 |
Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs |
Arijit Nag et.al. |
2403.05434 |
null |
2024-03-08 |
Towards Real-World Stickers Use: A New Dataset for Multi-Tag Sticker Recognition |
Bingbing Wang et.al. |
2403.05428 |
null |
2024-03-08 |
FedFMS: Exploring Federated Foundation Models for Medical Image Segmentation |
Yuxi Liu et.al. |
2403.05408 |
link |
2024-03-08 |
Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery |
Xavier Bou et.al. |
2403.05381 |
link |
2024-03-08 |
VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model |
Junsu Kim et.al. |
2403.05346 |
null |
2024-03-08 |
Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings |
Wei Zhou et.al. |
2403.05338 |
null |
2024-03-08 |
ChatASU: Evoking LLM’s Reflexion to Truly Understand Aspect Sentiment in Dialogues |
Yiding Liu et.al. |
2403.05326 |
null |
2024-03-08 |
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation |
Zihao Wang et.al. |
2403.05313 |
null |
2024-03-08 |
Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents |
Jinyang Li et.al. |
2403.05307 |
null |
2024-03-08 |
ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications |
Sotaro Takeshita et.al. |
2403.05303 |
link |
2024-03-08 |
Modeling Dynamic (De)Allocations of Local Memory for Translation Validation |
Abhishek Rose et.al. |
2403.05302 |
null |
2024-03-07 |
iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries |
Adam Coscia et.al. |
2403.04760 |
link |
2024-03-07 |
KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts |
Adam Coscia et.al. |
2403.04758 |
link |
2024-03-07 |
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error |
Boshi Wang et.al. |
2403.04746 |
link |
2024-03-08 |
How Far Are We from Intelligent Visual Deductive Reasoning? |
Yizhe Zhang et.al. |
2403.04732 |
link |
2024-03-07 |
Common 7B Language Models Already Possess Strong Math Capabilities |
Chen Li et.al. |
2403.04706 |
null |
2024-03-07 |
ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes |
Hashmat Shadab Malik et.al. |
2403.04701 |
link |
2024-03-07 |
Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification |
Ekaterina Fadeeva et.al. |
2403.04696 |
null |
2024-03-07 |
Telecom Language Models: Must They Be Large? |
Nicola Piovesan et.al. |
2403.04666 |
null |
2024-03-07 |
Yi: Open Foundation Models by 01.AI |
01.AI et.al. |
2403.04652 |
link |
2024-03-07 |
Teaching Large Language Models to Reason with Reinforcement Learning |
Alex Havrilla et.al. |
2403.04642 |
null |
2024-03-07 |
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios |
Qilang Ye et.al. |
2403.04640 |
link |
2024-03-07 |
A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds |
Xuenan Xu et.al. |
2403.04594 |
null |
2024-03-07 |
Embodied Understanding of Driving Scenarios |
Yunsong Zhou et.al. |
2403.04593 |
link |
2024-03-07 |
Wiki-TabNER: Advancing Table Interpretation Through Named Entity Recognition |
Aneta Koleva et.al. |
2403.04577 |
link |
2024-03-07 |
Reducing self-supervised learning complexity improves weakly-supervised classification performance in computational pathology |
Tim Lenz et.al. |
2403.04558 |
null |
2024-03-07 |
Enhancing Data Quality in Federated Fine-Tuning of Foundation Models |
Wanru Zhao et.al. |
2403.04529 |
null |
2024-03-07 |
Where does In-context Translation Happen in Large Language Models |
Suzanna Sia et.al. |
2403.04510 |
null |
2024-03-07 |
GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability |
Zihan Luo et.al. |
2403.04483 |
link |
2024-03-08 |
Do Large Language Model Understand Multi-Intent Spoken Language? |
Shangjian Yin et.al. |
2403.04481 |
link |
2024-03-08 |
Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset |
Minjin Kim et.al. |
2403.04460 |
null |
2024-03-06 |
Backtracing: Retrieving the Cause of the Query |
Rose E. Wang et.al. |
2403.03956 |
link |
2024-03-06 |
Bridging Language and Items for Retrieval and Recommendation |
Yupeng Hou et.al. |
2403.03952 |
link |
2024-03-06 |
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models |
Adithya Bhaskar et.al. |
2403.03942 |
link |
2024-03-06 |
Did Translation Models Get More Robust Without Anyone Even Noticing? |
Ben Peters et.al. |
2403.03923 |
null |
2024-03-06 |
Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing |
Asmita et.al. |
2403.03897 |
link |
2024-03-06 |
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators |
Indraneil Paul et.al. |
2403.03894 |
link |
2024-03-06 |
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models |
Luiza Pozzobon et.al. |
2403.03893 |
link |
2024-03-06 |
FaaF: Facts as a Function for the evaluation of RAG systems |
Vasileios Katranidis et.al. |
2403.03888 |
link |
2024-03-06 |
SaulLM-7B: A pioneering Large Language Model for Law |
Pierre Colombo et.al. |
2403.03883 |
null |
2024-03-06 |
Learning to Decode Collaboratively with Multiple Language Models |
Shannon Zejiang Shen et.al. |
2403.03870 |
link |
2024-03-06 |
On the Origins of Linear Representations in Large Language Models |
Yibo Jiang et.al. |
2403.03867 |
null |
2024-03-06 |
KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions |
Fangyuan Xu et.al. |
2403.03866 |
null |
2024-03-06 |
Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning |
Deepanway Ghosal et.al. |
2403.03864 |
link |
2024-03-06 |
X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification |
Hanzi Xu et.al. |
2403.03863 |
link |
2024-03-06 |
Designing Informative Metrics for Few-Shot Example Selection |
Rishabh Adiga et.al. |
2403.03861 |
null |
2024-03-06 |
Emojinize: Enriching Any Text with Emoji Translations |
Lars Henning Klein et.al. |
2403.03857 |
null |
2024-03-06 |
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect |
Xin Men et.al. |
2403.03853 |
null |
2024-03-06 |
Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ |
Carolin Holtermann et.al. |
2403.03814 |
link |
2024-03-06 |
Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery |
Wei Zhang et.al. |
2403.03790 |
null |
2024-03-06 |
PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion |
Zekai Zhang et.al. |
2403.03788 |
link |
2024-03-05 |
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning |
Nathaniel Li et.al. |
2403.03218 |
null |
2024-03-05 |
CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments |
Savitha Sam Abraham et.al. |
2403.03203 |
null |
2024-03-05 |
Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement |
Rafaela Martelo et.al. |
2403.03188 |
link |
2024-03-05 |
Reliable, Adaptable, and Attributable Language Models with Retrieval |
Akari Asai et.al. |
2403.03187 |
null |
2024-03-05 |
MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting |
Fangchen Liu et.al. |
2403.03174 |
null |
2024-03-05 |
SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection |
Peng Qi et.al. |
2403.03170 |
null |
2024-03-05 |
PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset |
Arda Uzunoğlu et.al. |
2403.03167 |
link |
2024-03-05 |
Quantum Many-Body Physics Calculations with Large Language Models |
Haining Pan et.al. |
2403.03154 |
null |
2024-03-05 |
Language Guided Exploration for RL Agents in Text Environments |
Hitesh Golchha et.al. |
2403.03141 |
null |
2024-03-05 |
CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following |
Kaiyan Zhang et.al. |
2403.03129 |
null |
2024-03-05 |
Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution |
Flor Miriam Plaza-del-Arco et.al. |
2403.03121 |
null |
2024-03-05 |
“In Dialogues We Learn”: Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning |
Chuanqi Cheng et.al. |
2403.03102 |
null |
2024-03-05 |
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents |
Yuqi Zhu et.al. |
2403.03101 |
link |
2024-03-05 |
Learning to Use Tools via Cooperative and Interactive Agents |
Zhengliang Shi et.al. |
2403.03031 |
null |
2024-03-05 |
Socratic Reasoning Improves Positive Text Rewriting |
Anmol Goel et.al. |
2403.03029 |
null |
2024-03-05 |
Word Importance Explains How Prompts Affect Language Model Outputs |
Stefan Hackmann et.al. |
2403.03028 |
null |
2024-03-05 |
OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following |
Haochen Shi et.al. |
2403.03017 |
null |
2024-03-05 |
Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations |
Hasan Abu-Rasheed et.al. |
2403.03008 |
null |
2024-03-05 |
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models |
Gen Luo et.al. |
2403.03003 |
link |
2024-03-05 |
Localized Zeroth-Order Prompt Optimization |
Wenyang Hu et.al. |
2403.02993 |
null |
2024-03-02 |
LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems |
Tasnim Ahmed et.al. |
2403.01342 |
null |
2024-03-02 |
Making Hybrid Languages: A Recipe |
Leif Andersen et.al. |
2403.01335 |
null |
2024-03-02 |
Chaining thoughts and LLMs to learn DNA structural biophysics |
Tyler D. Ross et.al. |
2403.01332 |
link |
2024-03-02 |
VBART: The Turkish LLM |
Meliksah Turker et.al. |
2403.01308 |
null |
2024-03-02 |
ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation |
Moran Yanuka et.al. |
2403.01306 |
null |
2024-03-02 |
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning |
Alexander Scarlatos et.al. |
2403.01304 |
link |
2024-03-02 |
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention |
Tianyi Zhang et.al. |
2403.01273 |
link |
2024-03-02 |
Employing LLMs for Incident Response Planning and Review |
Sam Hays et.al. |
2403.01271 |
null |
2024-03-02 |
Dissecting Language Models: Machine Unlearning via Selective Pruning |
Nicholas Pochinkov et.al. |
2403.01267 |
null |
2024-03-02 |
Accelerating Greedy Coordinate Gradient via Probe Sampling |
Yiran Zhao et.al. |
2403.01251 |
link |
2024-03-02 |
SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code |
Ziniu Hu et.al. |
2403.01248 |
null |
2024-03-02 |
Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal |
Jianheng Huang et.al. |
2403.01244 |
null |
2024-03-02 |
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact |
Ruikang Liu et.al. |
2403.01241 |
null |
2024-03-02 |
Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy |
Jamie Hayes et.al. |
2403.01218 |
null |
2024-03-02 |
API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access |
Jiayuan Su et.al. |
2403.01216 |
null |
2024-03-02 |
Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning |
Shuo Yang et.al. |
2403.01209 |
null |
2024-03-02 |
The Case for Animal-Friendly AI |
Sankalpa Ghose et.al. |
2403.01199 |
null |
2024-03-02 |
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling |
Shanghaoran Quan et.al. |
2403.01197 |
link |
2024-03-02 |
RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots |
Philip Feldman, James R. Foulds et.al. |
2403.01193 |
null |
2024-03-02 |
Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding |
Ha-Thanh Nguyen et.al. |
2403.01185 |
null |
2024-02-29 |
The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations? |
Alex Gu et.al. |
2402.19475 |
null |
2024-02-29 |
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World |
Weiyun Wang et.al. |
2402.19474 |
link |
2024-02-29 |
Retrieval-Augmented Generation for AI-Generated Content: A Survey |
Penghao Zhao et.al. |
2402.19473 |
link |
2024-02-29 |
Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling |
Gabriel Grand et.al. |
2402.19471 |
null |
2024-03-01 |
TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning |
Kate Sanders et.al. |
2402.19467 |
null |
2024-02-29 |
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models |
Chen Qian et.al. |
2402.19465 |
link |
2024-02-29 |
Curiosity-driven Red-teaming for Large Language Models |
Zhang-Wei Hong et.al. |
2402.19464 |
link |
2024-02-29 |
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap |
Saurabh Srivastava et.al. |
2402.19450 |
link |
2024-02-29 |
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models |
Frederik Kunstner et.al. |
2402.19449 |
null |
2024-02-29 |
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL |
Yifei Zhou et.al. |
2402.19446 |
link |
2024-02-29 |
Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation |
Jonathan Yang et.al. |
2402.19432 |
null |
2024-02-29 |
Compositional API Recommendation for Library-Oriented Code Generation |
Zexiong Ma et.al. |
2402.19431 |
null |
2024-02-29 |
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models |
Soham De et.al. |
2402.19427 |
null |
2024-02-29 |
Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines |
Lijia Ma et.al. |
2402.19421 |
null |
2024-02-29 |
PaECTER: Patent-level Representation Learning using Citation-informed Transformers |
Mainak Ghosh et.al. |
2402.19411 |
null |
2024-02-29 |
On the Scaling Laws of Geographical Representation in Language Models |
Nathan Godey et.al. |
2402.19406 |
null |
2024-02-29 |
Entity-Aware Multimodal Alignment Framework for News Image Captioning |
Junzhe Zhang et.al. |
2402.19404 |
null |
2024-02-29 |
Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Match Human Crowd Accuracy |
Philipp Schoenegger et.al. |
2402.19379 |
null |
2024-02-29 |
OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models |
Jenish Maharjan et.al. |
2402.19371 |
null |
2024-02-29 |
SoK: Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency |
Akila Wickramasekara et.al. |
2402.19366 |
null |
2024-02-28 |
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards |
Haoxiang Wang et.al. |
2402.18571 |
link |
2024-02-28 |
Diffusion Language Models Are Versatile Protein Learners |
Xinyou Wang et.al. |
2402.18567 |
null |
2024-02-28 |
A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic |
Gregory Coppola et.al. |
2402.18566 |
null |
2024-02-28 |
Approaching Human-Level Forecasting with Language Models |
Danny Halawi et.al. |
2402.18563 |
null |
2024-02-28 |
Implicit Bias of Next-Token Prediction |
Christos Thrampoulidis et.al. |
2402.18551 |
null |
2024-02-28 |
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling |
Mahdi Karami et.al. |
2402.18508 |
null |
2024-02-28 |
Few-Shot Fairness: Unveiling LLM’s Potential for Fairness-Aware Classification |
Garima Chhikara et.al. |
2402.18502 |
null |
2024-02-28 |
Language Models Represent Beliefs of Self and Others |
Wentao Zhu et.al. |
2402.18496 |
null |
2024-02-28 |
IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding |
Lanyun Zhu et.al. |
2402.18476 |
null |
2024-02-28 |
Meta-Task Prompting Elicits Embedding from Large Language Models |
Yibin Lei et.al. |
2402.18458 |
null |
2024-02-28 |
Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization |
Deng Li et.al. |
2402.18447 |
null |
2024-02-28 |
Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication |
Weize Chen et.al. |
2402.18439 |
link |
2024-02-28 |
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models |
Xiujie Song et.al. |
2402.18409 |
null |
2024-02-28 |
Balanced Similarity with Auxiliary Prompts: Towards Alleviating Text-to-Image Retrieval Bias for CLIP in Zero-shot Learning |
Hanyao Wang et.al. |
2402.18400 |
null |
2024-02-28 |
Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models |
Ercong Nie et.al. |
2402.18397 |
null |
2024-02-28 |
The First Place Solution of WSDM Cup 2024: Leveraging Large Language Models for Conversational Multi-Doc QA |
Yiming Li et.al. |
2402.18385 |
link |
2024-02-28 |
Large Language Models As Evolution Strategies |
Robert Tjarko Lange et.al. |
2402.18381 |
null |
2024-02-28 |
Tokenization Is More Than Compression |
Craig W. Schmidt et.al. |
2402.18376 |
null |
2024-02-28 |
VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models |
Seoyeon Kim et.al. |
2402.18374 |
null |
2024-02-28 |
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning |
Jiachun Li et.al. |
2402.18344 |
null |
2024-02-27 |
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction |
Zekun Qi et.al. |
2402.17766 |
link |
2024-02-27 |
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits |
Shuming Ma et.al. |
2402.17764 |
null |
2024-02-27 |
Massive Activations in Large Language Models |
Mingjie Sun et.al. |
2402.17762 |
link |
2024-02-27 |
Towards Optimal Learning of Language Models |
Yuxian Gu et.al. |
2402.17759 |
null |
2024-02-27 |
Evaluating Very Long-Term Conversational Memory of LLM Agents |
Adyasha Maharana et.al. |
2402.17753 |
null |
2024-02-27 |
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks |
Duarte M. Alves et.al. |
2402.17733 |
link |
2024-02-27 |
AmbigNLG: Addressing Task Ambiguity in Instruction for NLG |
Ayana Niwa et.al. |
2402.17717 |
null |
2024-02-27 |
Case-Based or Rule-Based: How Do Transformers Do the Math? |
Yi Hu et.al. |
2402.17709 |
link |
2024-02-27 |
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations |
Jing Huang et.al. |
2402.17700 |
link |
2024-02-27 |
NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents |
Tamara Czinczoll et.al. |
2402.17682 |
link |
2024-02-27 |
The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks |
Ashwin Prasad Shivarpatna Venkatesh et.al. |
2402.17679 |
null |
2024-02-27 |
CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention |
Mohammad Sadil Khan et.al. |
2402.17678 |
null |
2024-02-27 |
Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models |
Yunpeng Huang et.al. |
2402.17671 |
null |
2024-02-27 |
Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs |
Tanise Ceron et.al. |
2402.17649 |
null |
2024-02-27 |
SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation |
Shuangrui Ding et.al. |
2402.17645 |
link |
2024-02-27 |
Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data |
Xiao Liu et.al. |
2402.17644 |
link |
2024-02-27 |
Variational Learning is Effective for Large Deep Networks |
Yuesong Shen et.al. |
2402.17641 |
link |
2024-02-27 |
Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling |
David S. W. Williams et.al. |
2402.17622 |
null |
2024-02-27 |
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization |
Wenqi Zhang et.al. |
2402.17574 |
link |
2024-02-27 |
Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers |
Xinyu Tang et.al. |
2402.17564 |
link |
2024-02-26 |
Integrating Large Language Models with Graphical Session-Based Recommendation |
Naicheng Guo et.al. |
2402.16539 |
null |
2024-02-26 |
LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments |
Junzhe Chen et.al. |
2402.16499 |
null |
2024-02-26 |
On Languaging a Simulation Engine |
Han Liu et.al. |
2402.16482 |
null |
2024-02-26 |
Unveiling ChatGPT’s Usage in Open Source Projects: A Mining-based Study |
Rosalia Tufano et.al. |
2402.16480 |
null |
2024-02-26 |
mEdIT: Multilingual Text Editing via Instruction Tuning |
Vipul Raheja et.al. |
2402.16472 |
link |
2024-02-26 |
Unveiling Vulnerability of Self-Attention |
Khai Jiet Liong et.al. |
2402.16470 |
link |
2024-02-26 |
Defending LLMs against Jailbreaking Attacks via Backtranslation |
Yihan Wang et.al. |
2402.16459 |
link |
2024-02-26 |
ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing |
Liuzhenghao Lv et.al. |
2402.16445 |
link |
2024-02-26 |
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors |
Zhexin Zhang et.al. |
2402.16444 |
link |
2024-02-26 |
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models |
Tianyi Tang et.al. |
2402.16438 |
null |
2024-02-26 |
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions |
Yuansen Zhang et.al. |
2402.16431 |
null |
2024-02-26 |
Predicting Sustainable Development Goals Using Course Descriptions – from LLMs to Conventional Foundation Models |
Lev Kharlashkin et.al. |
2402.16420 |
null |
2024-02-26 |
From RAGs to riches: Using large language models to write documents for clinical trials |
Nigel Markey et.al. |
2402.16406 |
null |
2024-02-26 |
MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property |
Shiwen Ni et.al. |
2402.16389 |
link |
2024-02-26 |
Immunization against harmful fine-tuning attacks |
Domenic Rosati et.al. |
2402.16382 |
null |
2024-02-26 |
Improving LLM-based Machine Translation with Systematic Self-Correction |
Zhaopeng Feng et.al. |
2402.16379 |
link |
2024-02-26 |
Unraveling Babel: Exploring Multilingual Activation Patterns within Large Language Models |
Weize Liu et.al. |
2402.16367 |
null |
2024-02-26 |
LLM Inference Unveiled: Survey and Roofline Model Insights |
Zhihang Yuan et.al. |
2402.16363 |
link |
2024-02-26 |
Layer-wise Regularized Dropout for Neural Language Models |
Shiwen Ni et.al. |
2402.16361 |
null |
2024-02-26 |
An Integrated Data Processing Framework for Pretraining Foundation Models |
Yiding Sun et.al. |
2402.16358 |
link |
2024-02-26 |
Language-guided Skill Learning with Temporal Variational Inference |
Haotian Fu et.al. |
2402.16354 |
null |
2024-02-23 |
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning |
Jianguo Zhang et.al. |
2402.15506 |
link |
2024-02-23 |
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs |
Kinjal Basu et.al. |
2402.15491 |
null |
2024-02-23 |
Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models |
Yiran Liu et.al. |
2402.15481 |
null |
2024-02-23 |
Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization |
Swaroop Nath et.al. |
2402.15473 |
link |
2024-02-23 |
Repetition Improves Language Model Embeddings |
Jacob Mitchell Springer et.al. |
2402.15449 |
link |
2024-02-23 |
A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models |
Stefan Hegselmann et.al. |
2402.15422 |
link |
2024-02-23 |
PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning |
Simon Holk et.al. |
2402.15420 |
null |
2024-02-23 |
Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy? |
Nader Asadi et.al. |
2402.15414 |
null |
2024-02-23 |
Grasp, See and Place: Efficient Unknown Object Rearrangement with Policy Structure Prior |
Kechun Xu et.al. |
2402.15402 |
link |
2024-02-23 |
Explorations of Self-Repair in Language Models |
Cody Rushing et.al. |
2402.15390 |
link |
2024-02-23 |
Safe Task Planning for Language-Instructed Multi-Robot Systems using Conformal Prediction |
Jun Wang et.al. |
2402.15368 |
null |
2024-02-23 |
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping |
Zijie J. Wang et.al. |
2402.15350 |
link |
2024-02-23 |
NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data |
Sergei Bogdanov et.al. |
2402.15343 |
link |
2024-02-23 |
Ranking Entities along Conceptual Space Dimensions with LLMs: An Analysis of Fine-Tuning Strategies |
Nitesh Kumar et.al. |
2402.15337 |
null |
2024-02-23 |
GPTVQ: The Blessing of Dimensionality for LLM Quantization |
Mart van Baalen et.al. |
2402.15319 |
null |
2024-02-23 |
ArabianGPT: Native Arabic GPT-based Large Language Model |
Anis Koubaa et.al. |
2402.15313 |
null |
2024-02-23 |
Counterfactual Generation with Identifiability Guarantees |
Hanqi Yan et.al. |
2402.15309 |
link |
2024-02-23 |
Representing Online Handwriting for Recognition in Large Vision-Language Models |
Anastasiia Fadeeva et.al. |
2402.15307 |
null |
2024-02-23 |
How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries |
Somnath Banerjee et.al. |
2402.15302 |
link |
2024-02-23 |
Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models |
Yuzhe Zhang et.al. |
2402.15301 |
null |
2024-02-22 |
PALO: A Polyglot Large Multimodal Model for 5B People |
Muhammad Maaz et.al. |
2402.14818 |
link |
2024-02-22 |
Demographic Bias of Expert-Level Vision-Language Foundation Models in Medical Imaging |
Yuzhe Yang et.al. |
2402.14815 |
link |
2024-02-22 |
WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition |
Lianghui Zhu et.al. |
2402.14812 |
link |
2024-02-22 |
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking |
Nikhil Prakash et.al. |
2402.14811 |
null |
2024-02-22 |
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning |
Zicheng Lin et.al. |
2402.14809 |
link |
2024-02-22 |
RelayAttention for Efficient Large Language Model Serving with Long System Prompts |
Lei Zhu et.al. |
2402.14808 |
link |
2024-02-22 |
A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health |
Nikhil Behari et.al. |
2402.14807 |
null |
2024-02-22 |
Identifying Multiple Personalities in Large Language Models with External Evaluation |
Xiaoyang Song et.al. |
2402.14805 |
null |
2024-02-22 |
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models |
Xudong Lu et.al. |
2402.14800 |
link |
2024-02-22 |
Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic |
Nathaniel Weir et.al. |
2402.14798 |
null |
2024-02-22 |
Zero-shot cross-lingual transfer in instruction tuning of large language model |
Nadezhda Chirkova et.al. |
2402.14778 |
null |
2024-02-22 |
2D Matryoshka Sentence Embeddings |
Xianming Li et.al. |
2402.14776 |
null |
2024-02-22 |
DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models |
Yuhang Cao et.al. |
2402.14767 |
link |
2024-02-22 |
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues |
Ge Bai et.al. |
2402.14762 |
null |
2024-02-22 |
Generalizing Reward Modeling for Out-of-Distribution Preference Learning |
Chen Jia et.al. |
2402.14760 |
null |
2024-02-22 |
Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation |
Jiawei Wang et.al. |
2402.14744 |
null |
2024-02-22 |
Dependency Annotation of Ottoman Turkish with Multilingual BERT |
Şaziye Betül Özateş et.al. |
2402.14743 |
null |
2024-02-22 |
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs |
Arash Ahmadian et.al. |
2402.14740 |
null |
2024-02-22 |
Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models |
Seungduk Kim et.al. |
2402.14714 |
link |
2024-02-22 |
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus |
Honghao Gui et.al. |
2402.14710 |
link |
2024-02-21 |
Coercing LLMs to do and reveal (almost) anything |
Jonas Geiping et.al. |
2402.14020 |
link |
2024-02-21 |
Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment |
Vyas Raina et.al. |
2402.14016 |
null |
2024-02-21 |
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems |
Chaoqun He et.al. |
2402.14008 |
link |
2024-02-21 |
Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models |
Zhiwei He et.al. |
2402.14007 |
null |
2024-02-21 |
Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models |
Aline Ioste et.al. |
2402.14002 |
null |
2024-02-21 |
Analysing The Impact of Sequence Composition on Language Model Pre-Training |
Yu Zhao et.al. |
2402.13991 |
link |
2024-02-21 |
Towards Building Multilingual Language Model for Medicine |
Pengcheng Qiu et.al. |
2402.13963 |
link |
2024-02-21 |
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality |
Rahul Zalkikar et.al. |
2402.13954 |
null |
2024-02-21 |
Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning |
Debjit Paul et.al. |
2402.13950 |
null |
2024-02-21 |
Do Efficient Transformers Really Save Computation? |
Kai Yang et.al. |
2402.13934 |
null |
2024-02-21 |
Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content |
Federico Bianchi et.al. |
2402.13926 |
null |
2024-02-21 |
SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization |
Prakamya Mishra et.al. |
2402.13919 |
link |
2024-02-21 |
What Linguistic Features and Languages are Important in LLM Translation? |
Ryandito Diandaru et.al. |
2402.13917 |
null |
2024-02-21 |
Calibrating Large Language Models with Sample Consistency |
Qing Lyu et.al. |
2402.13904 |
null |
2024-02-21 |
Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models |
Chenyang Lyu et.al. |
2402.13887 |
null |
2024-02-21 |
$\texttt{Se}^2$: $\textit{Se}$quential Example $\textit{Se}$lection for In-Context Learning |
Haoyu Liu et.al. |
2402.13874 |
null |
2024-02-21 |
An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach |
Mohammad Amaz Uddin et.al. |
2402.13871 |
null |
2024-02-21 |
Kuaiji: the First Chinese Accounting Large Language Model |
Jiayuan Luo et.al. |
2402.13866 |
null |
2024-02-21 |
RealDex: Towards Human-like Grasping for Robotic Dexterous Hand |
Yumeng Liu et.al. |
2402.13853 |
null |
2024-02-21 |
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models |
Jiawei Liang et.al. |
2402.13851 |
null |
2024-02-20 |
Towards audio language modeling – an overview |
Haibin Wu et.al. |
2402.13236 |
null |
2024-02-20 |
Unlocking Insights: Semantic Search in Jupyter Notebooks |
Lan Li et.al. |
2402.13234 |
null |
2024-02-20 |
A Touch, Vision, and Language Dataset for Multimodal Alignment |
Letian Fu et.al. |
2402.13232 |
link |
2024-02-20 |
Investigating Cultural Alignment of Large Language Models |
Badr AlKhamissi et.al. |
2402.13231 |
link |
2024-02-20 |
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive |
Arka Pal et.al. |
2402.13228 |
link |
2024-02-20 |
AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning |
Qiao Jin et.al. |
2402.13225 |
null |
2024-02-20 |
RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian |
Adrian Cosma et.al. |
2402.13222 |
link |
2024-02-20 |
How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts |
Yusu Qian et.al. |
2402.13220 |
null |
2024-02-20 |
Softmax Probabilities (Mostly) Predict Large Language Model Correctness on Multiple-Choice Q&A |
Benjamin Plaut et.al. |
2402.13213 |
link |
2024-02-20 |
Soft Self-Consistency Improves Language Model Agents |
Han Wang et.al. |
2402.13212 |
link |
2024-02-20 |
Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation |
Dongjin Kang et.al. |
2402.13211 |
null |
2024-02-20 |
Bayesian Reward Models for LLM Alignment |
Adam X. Yang et.al. |
2402.13210 |
null |
2024-02-20 |
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena |
Marco Gaido et.al. |
2402.13208 |
link |
2024-02-20 |
Question Calibration and Multi-Hop Modeling for Temporal Question Answering |
Chao Xue et.al. |
2402.13188 |
null |
2024-02-20 |
What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents |
Mingyu Jin et.al. |
2402.13184 |
null |
2024-02-20 |
DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models |
Norman Di Palo et.al. |
2402.13181 |
null |
2024-02-20 |
Benchmarking Retrieval-Augmented Generation for Medicine |
Guangzhi Xiong et.al. |
2402.13178 |
link |
2024-02-20 |
Defending Jailbreak Prompts via In-Context Adversarial Game |
Yujun Zhou et.al. |
2402.13148 |
null |
2024-02-20 |
OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog |
Adnen Abdessaied et.al. |
2402.13146 |
null |
2024-02-20 |
The Hidden Space of Transformer Language Adapters |
Jesujoba O. Alabi et.al. |
2402.13137 |
null |
2024-02-19 |
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding |
Zhuoming Chen et.al. |
2402.12374 |
link |
2024-02-19 |
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies |
Xiao Ye et.al. |
2402.12370 |
link |
2024-02-19 |
A Critical Evaluation of AI Feedback for Aligning Large Language Models |
Archit Sharma et.al. |
2402.12366 |
link |
2024-02-19 |
Emergent Word Order Universals from Cognitively-Motivated Language Models |
Tatsuki Kuribayashi et.al. |
2402.12363 |
null |
2024-02-19 |
Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge |
Julien Delile et.al. |
2402.12352 |
null |
2024-02-19 |
GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations |
Jinhao Duan et.al. |
2402.12348 |
link |
2024-02-19 |
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! |
Zhanhui Zhou et.al. |
2402.12343 |
link |
2024-02-19 |
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models |
Christian Schlarmann et.al. |
2402.12336 |
link |
2024-02-19 |
Query-Based Adversarial Prompt Generation |
Jonathan Hayase et.al. |
2402.12329 |
null |
2024-02-19 |
Shall We Talk: Exploring Spontaneous Collaborations of Competing LLM Agents |
Zengqing Wu et.al. |
2402.12327 |
link |
2024-02-19 |
ARKS: Active Retrieval in Knowledge Soup for Code Generation |
Hongjin Su et.al. |
2402.12317 |
null |
2024-02-19 |
Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports |
Felix J. Dorfner et.al. |
2402.12298 |
null |
2024-02-19 |
KARL: Knowledge-Aware Retrieval and Representations aid Retention and Learning in Students |
Matthew Shu et.al. |
2402.12291 |
null |
2024-02-19 |
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models |
Xiaoyu Tian et.al. |
2402.12289 |
null |
2024-02-19 |
Adaptive Skeleton Graph Decoding |
Shuowei Jin et.al. |
2402.12280 |
null |
2024-02-19 |
Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks |
Nadezhda Chirkova et.al. |
2402.12279 |
null |
2024-02-19 |
Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from Large Language Models |
Puxuan Yu et.al. |
2402.12276 |
link |
2024-02-19 |
High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models |
Michela Lorandi et.al. |
2402.12267 |
link |
2024-02-19 |
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles |
Oleksandr Balabanov et.al. |
2402.12264 |
null |
2024-02-19 |
NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms |
Jonathan Zheng et.al. |
2402.12261 |
null |
2024-02-16 |
PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter |
Junfei Xiao et.al. |
2402.10896 |
null |
2024-02-16 |
RLVF: Learning from Verbal Feedback without Overgeneralization |
Moritz Stephan et.al. |
2402.10893 |
link |
2024-02-16 |
Instruction Diversity Drives Generalization To Unseen Tasks |
Dylan Zhang et.al. |
2402.10891 |
null |
2024-02-16 |
When is Tree Search Useful for LLM Planning? It Depends on the Discriminator |
Ziru Chen et.al. |
2402.10890 |
link |
2024-02-16 |
Multi-modal preference alignment remedies regression of visual instruction tuning on language model |
Shengzhi Li et.al. |
2402.10884 |
link |
2024-02-16 |
EcoRank: Budget-Constrained Text Re-ranking Using Large Language Models |
Muhammad Shihab Rashid et.al. |
2402.10866 |
null |
2024-02-16 |
Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities |
Mingyu Jin et.al. |
2402.10835 |
null |
2024-02-16 |
RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model |
Jianhao Yuan et.al. |
2402.10828 |
null |
2024-02-16 |
Quantifying the Persona Effect in LLM Simulations |
Tiancheng Hu et.al. |
2402.10811 |
null |
2024-02-16 |
Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond |
Yongqi Li et.al. |
2402.10805 |
null |
2024-02-16 |
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge |
Xuan Shen et.al. |
2402.10787 |
link |
2024-02-16 |
A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models |
Mingchen Li et.al. |
2402.10779 |
null |
2024-02-16 |
AutoGPT+P: Affordance-based Task Planning with Large Language Models |
Timo Birr et.al. |
2402.10778 |
null |
2024-02-16 |
How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs? |
Ehsan Doostmohammadi et.al. |
2402.10770 |
null |
2024-02-16 |
Distillation Enhanced Generative Retrieval |
Yongqi Li et.al. |
2402.10769 |
null |
2024-02-16 |
Inference to the Best Explanation in Large Language Models |
Dhairya Dalal et.al. |
2402.10767 |
null |
2024-02-16 |
When Dataflow Analysis Meets Large Language Models |
Chengpeng Wang et.al. |
2402.10754 |
null |
2024-02-16 |
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages |
Junjie Ye et.al. |
2402.10753 |
link |
2024-02-16 |
GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models |
Pengcheng Jiang et.al. |
2402.10744 |
link |
2024-02-16 |
Let’s Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning |
Yinpeng Liu et.al. |
2402.10738 |
link |
2024-02-15 |
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation |
Huizhuo Yuan et.al. |
2402.10210 |
null |
2024-02-15 |
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment |
Rui Yang et.al. |
2402.10207 |
link |
2024-02-15 |
Chain-of-Thought Reasoning Without Prompting |
Xuezhi Wang et.al. |
2402.10200 |
null |
2024-02-15 |
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents |
Lingbo Mo et.al. |
2402.10196 |
link |
2024-02-15 |
BitDelta: Your Fine-Tune May Only Be Worth One Bit |
James Liu et.al. |
2402.10193 |
link |
2024-02-15 |
Uncertainty Decomposition and Quantification for In-Context Learning of Large Language Models |
Chen Ling et.al. |
2402.10189 |
link |
2024-02-15 |
Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective |
Tianyi Qiu et.al. |
2402.10184 |
null |
2024-02-15 |
TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation |
Yaoxiang Wang et.al. |
2402.10178 |
null |
2024-02-15 |
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset |
Shubham Toshniwal et.al. |
2402.10176 |
link |
2024-02-15 |
Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence |
Yinhong Liu et.al. |
2402.10175 |
link |
2024-02-15 |
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models |
Ali AhmadiTeshnizi et.al. |
2402.10172 |
null |
2024-02-15 |
Data Engineering for Scaling Language Models to 128K Context |
Yao Fu et.al. |
2402.10171 |
link |
2024-02-15 |
Knowledge-Infused LLM-Powered Conversational Health Agent: A Case Study for Diabetes Patients |
Mahyar Abbasian et.al. |
2402.10153 |
null |
2024-02-15 |
ControlLM: Crafting Diverse Personalities for Language Models |
Yixuan Weng et.al. |
2402.10151 |
link |
2024-02-15 |
TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles |
Yinhong Liu et.al. |
2402.10137 |
null |
2024-02-15 |
Zero-Shot Reasoning: Personalized Content Generation Without the Cold Start Problem |
Davor Hafnar et.al. |
2402.10133 |
null |
2024-02-15 |
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning |
Ming Li et.al. |
2402.10110 |
link |
2024-02-15 |
Quantized Embedding Vectors for Controllable Diffusion Language Models |
Cheng Kang et.al. |
2402.10107 |
null |
2024-02-15 |
GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving |
Jiaxin Zhang et.al. |
2402.10104 |
link |
2024-02-15 |
Any-Shift Prompting for Generalization over Distributions |
Zehao Xiao et.al. |
2402.10099 |
null |
2024-02-14 |
AQA-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability |
Siwei Yang et.al. |
2402.09404 |
link |
2024-02-14 |
Reinforcement Learning from Human Feedback with Active Queries |
Kaixuan Ji et.al. |
2402.09401 |
null |
2024-02-14 |
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference |
Harry Dong et.al. |
2402.09398 |
link |
2024-02-14 |
LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset |
Botao Yu et.al. |
2402.09391 |
link |
2024-02-14 |
HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation |
Yihao Fang et.al. |
2402.09390 |
link |
2024-02-14 |
Transformers Can Achieve Length Generalization But Not Robustly |
Yongchao Zhou et.al. |
2402.09371 |
null |
2024-02-14 |
Pseudorandom Error-Correcting Codes |
Miranda Christ et.al. |
2402.09370 |
null |
2024-02-14 |
Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking |
Yi Fung et.al. |
2402.09369 |
link |
2024-02-14 |
Copyright Traps for Large Language Models |
Matthieu Meeus et.al. |
2402.09363 |
null |
2024-02-14 |
HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference |
Yashas Samaga B L et.al. |
2402.09360 |
null |
2024-02-14 |
Developing a Framework for Auditing Large Language Models Using Human-in-the-Loop |
Maryam Amirizaniani et.al. |
2402.09346 |
null |
2024-02-14 |
Mitigating Reward Hacking via Information-Theoretic Reward Modeling |
Yuchun Miao et.al. |
2402.09345 |
null |
2024-02-14 |
AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach |
Maryam Amirizaniani et.al. |
2402.09334 |
null |
2024-02-14 |
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization |
Feifan Song et.al. |
2402.09320 |
link |
2024-02-14 |
Embracing the black box: Heading towards foundation models for causal discovery from time series data |
Gideon Stein et.al. |
2402.09305 |
link |
2024-02-14 |
Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code |
Vahid Majdinasab et.al. |
2402.09299 |
link |
2024-02-14 |
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey |
Zhichen Dong et.al. |
2402.09283 |
link |
2024-02-14 |
Leveraging Large Language Models for Enhanced NLP Task Performance through Knowledge Distillation and Optimized Training Strategies |
Yining Huang et.al. |
2402.09282 |
null |
2024-02-14 |
Personalized Large Language Models |
Stanisław Woźniak et.al. |
2402.09269 |
null |
2024-02-14 |
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation |
Xiaoying Zhang et.al. |
2402.09267 |
null |
2024-02-13 |
Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance |
Linxi Zhao et.al. |
2402.08680 |
null |
2024-02-13 |
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability |
Xingang Guo et.al. |
2402.08679 |
link |
2024-02-13 |
Human Curriculum Effects Emerge with In-Context Learning in Neural Networks |
Jacob Russin et.al. |
2402.08674 |
null |
2024-02-13 |
Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models |
Yuqing Liu et.al. |
2402.08670 |
null |
2024-02-13 |
Improving Generalization in Semantic Parsing by Increasing Natural Language Variation |
Irina Saparina et.al. |
2402.08666 |
link |
2024-02-13 |
The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in a Prospective Cardiac Rehabilitation Setting |
David Haag et.al. |
2402.08658 |
null |
2024-02-13 |
PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs |
Michael Dorkenwald et.al. |
2402.08657 |
null |
2024-02-13 |
Tandem Transformers for Inference Efficient LLMs |
Aishwarya P S et.al. |
2402.08644 |
null |
2024-02-13 |
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages |
Nedjma Ousidhoum et.al. |
2402.08638 |
null |
2024-02-13 |
Knowledge Editing on Black-box Large Language Models |
Xiaoshuai Song et.al. |
2402.08631 |
link |
2024-02-13 |
Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning |
Haeju Lee et.al. |
2402.08594 |
link |
2024-02-13 |
Test-Time Backdoor Attacks on Multimodal Large Language Models |
Dong Lu et.al. |
2402.08577 |
link |
2024-02-13 |
Online Foundation Model Selection in Robotics |
Po-han Li et.al. |
2402.08570 |
null |
2024-02-13 |
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast |
Xiangming Gu et.al. |
2402.08567 |
link |
2024-02-13 |
Artificial Intelligence for Literature Reviews: Opportunities and Challenges |
Francisco Bolanos et.al. |
2402.08565 |
null |
2024-02-13 |
Higher Layers Need More LoRA Experts |
Chongyang Gao et.al. |
2402.08562 |
link |
2024-02-13 |
Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback |
Vineet Bhat et.al. |
2402.08546 |
null |
2024-02-13 |
The Application of ChatGPT in Responding to Questions Related to the Boston Bowel Preparation Scale |
Xiaoqiang Liu et.al. |
2402.08492 |
null |
2024-02-13 |
Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models |
Shaeke Salman et.al. |
2402.08473 |
null |
2024-02-13 |
Large Language Models for the Automated Analysis of Optimization Algorithms |
Camilo Chacón Sartori et.al. |
2402.08472 |
link |
2024-02-12 |
A systematic investigation of learnability from single child linguistic input |
Yulu Qin et.al. |
2402.07899 |
link |
2024-02-12 |
Suppressing Pink Elephants with Direct Principle Feedback |
Louis Castricato et.al. |
2402.07896 |
null |
2024-02-12 |
WildfireGPT: Tailored Large Language Model for Wildfire Analysis |
Yangxinyu Xie et.al. |
2402.07877 |
null |
2024-02-12 |
Policy Improvement using Language Feedback Models |
Victor Zhong et.al. |
2402.07876 |
null |
2024-02-12 |
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs |
Soroush Nasiriany et.al. |
2402.07872 |
null |
2024-02-12 |
Scaling Laws for Fine-Grained Mixture of Experts |
Jakub Krajewski et.al. |
2402.07871 |
link |
2024-02-12 |
PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models |
Wei Zou et.al. |
2402.07867 |
link |
2024-02-12 |
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models |
Siddharth Karamcheti et.al. |
2402.07865 |
link |
2024-02-12 |
AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy |
Philipp Schoenegger et.al. |
2402.07862 |
null |
2024-02-12 |
Lissard: Long and Simple Sequential Reasoning Datasets |
Mirelle Bueno et.al. |
2402.07859 |
null |
2024-02-12 |
Mercury: An Efficiency Benchmark for LLM Code Synthesis |
Mingzhe Du et.al. |
2402.07844 |
link |
2024-02-12 |
Do Membership Inference Attacks Work on Large Language Models? |
Michael Duan et.al. |
2402.07841 |
link |
2024-02-12 |
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model |
Ahmet Üstün et.al. |
2402.07827 |
null |
2024-02-12 |
Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning |
Z Liu et.al. |
2402.07818 |
null |
2024-02-12 |
Injecting Wiktionary to improve token-level contextual representations using contrastive learning |
Anna Mosolova et.al. |
2402.07817 |
null |
2024-02-12 |
Retrieval-Augmented Thought Process as Sequential Decision Making |
Thomas Pouplin et.al. |
2402.07812 |
null |
2024-02-12 |
Empowering Federated Learning for Massive Models with NVIDIA FLARE |
Holger R. Roth et.al. |
2402.07792 |
null |
2024-02-12 |
TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection |
Hui Liu et.al. |
2402.07776 |
link |
2024-02-12 |
Quantitative knowledge retrieval from large language models |
David Selby et.al. |
2402.07770 |
link |
2024-02-12 |
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model |
Mikail Khona et.al. |
2402.07757 |
null |
2024-02-09 |
Feedback Loops With Language Models Drive In-Context Reward Hacking |
Alexander Pan et.al. |
2402.06627 |
link |
2024-02-09 |
Understanding the Effects of Iterative Prompting on Truthfulness |
Satyapriya Krishna et.al. |
2402.06625 |
null |
2024-02-09 |
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning |
Shivalika Singh et.al. |
2402.06619 |
null |
2024-02-09 |
FaBERT: Pre-training BERT on Persian Blogs |
Mostafa Masumi et.al. |
2402.06617 |
null |
2024-02-09 |
On the Out-Of-Distribution Generalization of Multimodal Large Language Models |
Xingxuan Zhang et.al. |
2402.06599 |
null |
2024-02-09 |
CigaR: Cost-efficient Program Repair with LLMs |
Dávid Hidvégi et.al. |
2402.06598 |
link |
2024-02-09 |
Understanding the Weakness of Large Language Model Agents within a Complex Android Environment |
Mingzhe Xing et.al. |
2402.06596 |
link |
2024-02-09 |
Self-consistent context aware conformer transducer for speech recognition |
Konstantin Kolokolov et.al. |
2402.06592 |
null |
2024-02-09 |
G-SciEdBERT: A Contextualized LLM for Science Assessment Tasks in German |
Ehsan Latif et.al. |
2402.06584 |
null |
2024-02-09 |
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning |
Amir Ziai et.al. |
2402.06560 |
link |
2024-02-09 |
The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model |
Gregory Coppola et.al. |
2402.06557 |
link |
2024-02-09 |
Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA |
Marek Šuppa et.al. |
2402.06549 |
link |
2024-02-09 |
Calibrating Long-form Generations from Large Language Models |
Yukun Huang et.al. |
2402.06544 |
null |
2024-02-09 |
Introspective Planning: Guiding Language-Enabled Agents to Refine Their Own Uncertainty |
Kaiqu Liang et.al. |
2402.06529 |
link |
2024-02-09 |
Multimodal Clinical Trial Outcome Prediction with Large Language Models |
Wenhao Zheng et.al. |
2402.06512 |
link |
2024-02-09 |
Iris-SAM: Iris Segmentation Using a Foundational Model |
Parisa Farmanifard et.al. |
2402.06497 |
link |
2024-02-09 |
Large Language Models for Captioning and Retrieving Remote Sensing Images |
João Daniel Silva et.al. |
2402.06475 |
null |
2024-02-09 |
V-STaR: Training Verifiers for Self-Taught Reasoners |
Arian Hosseini et.al. |
2402.06457 |
null |
2024-02-09 |
StruQ: Defending Against Prompt Injection with Structured Queries |
Sizhe Chen et.al. |
2402.06363 |
null |
2024-02-09 |
CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models |
Peiyuan Gong et.al. |
2402.06360 |
link |
2024-02-08 |
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models |
Peng Gao et.al. |
2402.05935 |
link |
2024-02-08 |
Driving Everywhere with Large Language Model Policy Adaptation |
Boyi Li et.al. |
2402.05932 |
null |
2024-02-08 |
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
Xing Han Lù et.al. |
2402.05930 |
link |
2024-02-08 |
An Interactive Agent Foundation Model |
Zane Durante et.al. |
2402.05929 |
null |
2024-02-08 |
On the Convergence of Zeroth-Order Federated Tuning in Large Language Models |
Zhenqing Ling et.al. |
2402.05926 |
null |
2024-02-08 |
Efficient Stagewise Pretraining via Progressive Subnetworks |
Abhishek Panigrahi et.al. |
2402.05913 |
null |
2024-02-08 |
FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs |
Eun Cheol Choi et.al. |
2402.05904 |
link |
2024-02-08 |
Large Language Model Meets Graph Neural Network in Knowledge Distillation |
Shengxiang Hu et.al. |
2402.05894 |
null |
2024-02-08 |
Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking |
Nikhil Sharma et.al. |
2402.05880 |
null |
2024-02-08 |
PromptCrypt: Prompt Encryption for Secure Communication with Large Language Models |
Guo Lin et.al. |
2402.05868 |
link |
2024-02-08 |
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis |
Federico Bianchi et.al. |
2402.05863 |
link |
2024-02-08 |
Let Your Graph Do the Talking: Encoding Structured Data for LLMs |
Bryan Perozzi et.al. |
2402.05862 |
null |
2024-02-08 |
Learning to Route Among Specialized Experts for Zero-Shot Generalization |
Mohammed Muqeeth et.al. |
2402.05859 |
link |
2024-02-08 |
Limitations of Agents Simulated by Predictive Models |
Raymond Douglas et.al. |
2402.05829 |
null |
2024-02-08 |
Is it Possible to Edit Large Language Models Robustly? |
Xinbei Ma et.al. |
2402.05827 |
link |
2024-02-08 |
Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models |
Lingzhi Wang et.al. |
2402.05813 |
null |
2024-02-08 |
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning |
Zhiheng Xi et.al. |
2402.05808 |
link |
2024-02-08 |
How do Transformers perform In-Context Autoregressive Learning? |
Michael E. Sander et.al. |
2402.05787 |
null |
2024-02-08 |
Limits of Transformer Language Models on Algorithmic Learning |
Jonathan Thomm et.al. |
2402.05785 |
null |
2024-02-08 |
Text-to-Code Generation with Modality-relative Pre-training |
Fenia Christopoulou et.al. |
2402.05783 |
null |
2024-02-07 |
Opening the AI black box: program synthesis via mechanistic interpretability |
Eric J. Michaud et.al. |
2402.05110 |
link |
2024-02-07 |
You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models |
Alix Decrop et.al. |
2402.05102 |
null |
2024-02-07 |
Hydragen: High-Throughput LLM Inference with Shared Prefixes |
Jordan Juravsky et.al. |
2402.05099 |
null |
2024-02-07 |
Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation |
Dennis Hoftijzer et.al. |
2402.05090 |
null |
2024-02-07 |
A Roadmap to Pluralistic Alignment |
Taylor Sorensen et.al. |
2402.05070 |
link |
2024-02-07 |
SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models |
Lijun Li et.al. |
2402.05044 |
link |
2024-02-07 |
How BERT Speaks Shakespearean English? Evaluating Historical Bias in Contextual Language Models |
Miriam Cuscito et.al. |
2402.05034 |
null |
2024-02-07 |
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? |
Agustinus Kristiadi et.al. |
2402.05015 |
link |
2024-02-07 |
Pedagogical Alignment of Large Language Models |
Shashank Sonkar et.al. |
2402.05000 |
null |
2024-02-07 |
An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration |
Yihao Li et.al. |
2402.04978 |
null |
2024-02-07 |
ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12 |
Liuqing Chen et.al. |
2402.04975 |
null |
2024-02-07 |
Reconfidencing LLMs from the Grouping Loss Perspective |
Lihu Chen et.al. |
2402.04957 |
null |
2024-02-07 |
Chatbots in Knowledge-Intensive Contexts: Comparing Intent and LLM-Based Systems |
Samuel Kernan Freire et.al. |
2402.04955 |
null |
2024-02-07 |
Prompting Implicit Discourse Relation Annotation |
Frances Yung et.al. |
2402.04918 |
null |
2024-02-07 |
Personalized Text Generation with Fine-Grained Linguistic Control |
Bashar Alhafni et.al. |
2402.04914 |
link |
2024-02-07 |
L4Q: Parameter Efficient Quantization-Aware Training on Large Language Models via LoRA-wise LSQ |
Hyesung Jeon et.al. |
2402.04902 |
null |
2024-02-07 |
Detecting Generated Native Ads in Conversational Search |
Sebastian Schmidt et.al. |
2402.04889 |
link |
2024-02-07 |
Multimodal Query Suggestion with Multi-Agent Reinforcement Learning from Human Feedback |
Zheng Wang et.al. |
2402.04867 |
null |
2024-02-07 |
Automated Smart Contract Summarization via LLMs |
Yingjie Mao et.al. |
2402.04863 |
null |
2024-02-07 |
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay |
Natasha Butt et.al. |
2402.04858 |
null |
2024-02-06 |
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls |
Yu Du et.al. |
2402.04253 |
link |
2024-02-06 |
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal |
Mantas Mazeika et.al. |
2402.04249 |
link |
2024-02-06 |
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks |
Jongho Park et.al. |
2402.04248 |
link |
2024-02-06 |
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science |
Xiangru Tang et.al. |
2402.04247 |
null |
2024-02-06 |
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations |
Ji Qi et.al. |
2402.04236 |
link |
2024-02-06 |
Can Generative Agents Predict Emotion? |
Ciaran Regan et.al. |
2402.04232 |
null |
2024-02-06 |
“Task Success” is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors |
Lin Guan et.al. |
2402.04210 |
null |
2024-02-06 |
Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models |
David Sobrín-Hidalgo et.al. |
2402.04206 |
null |
2024-02-06 |
SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models |
Yichen Shi et.al. |
2402.04178 |
link |
2024-02-06 |
Scaling Laws for Downstream Task Performance of Large Language Models |
Berivan Isik et.al. |
2402.04177 |
null |
2024-02-06 |
Harnessing the Plug-and-Play Controller by Prompting |
Hao Wang et.al. |
2402.04160 |
null |
2024-02-06 |
Multi-line AI-assisted Code Authoring |
Omer Dunay et.al. |
2402.04141 |
null |
2024-02-06 |
Advancing Legal Reasoning: The Integration of AI to Navigate Complexities and Biases in Global Jurisprudence with Semi-Automated Arbitration Processes (SAAPs) |
Michael De’Shazer et.al. |
2402.04140 |
null |
2024-02-06 |
Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science |
Pengfei Liu et.al. |
2402.04119 |
link |
2024-02-06 |
Measuring Implicit Bias in Explicitly Unbiased Large Language Models |
Xuechunzi Bai et.al. |
2402.04105 |
null |
2024-02-06 |
The Use of a Large Language Model for Cyberbullying Detection |
Bayode Ogunleye et.al. |
2402.04088 |
null |
2024-02-06 |
A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation |
Zhengbo Wang et.al. |
2402.04087 |
link |
2024-02-06 |
Provably learning a multi-head attention layer |
Sitan Chen et.al. |
2402.04084 |
null |
2024-02-06 |
Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models |
Reza Khanmohammadi et.al. |
2402.04075 |
null |
2024-02-06 |
Retrieve to Explain: Evidence-driven Predictions with Language Models |
Ravi Patel et.al. |
2402.04068 |
link |