Updated on 2024.05.18

Single Object & Visual Language Tracking

Publish Date	Title	Authors	PDF	Code
2024-05-08	TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking	Pengcheng Shao et.al.	2405.05004	link
2024-04-22	360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos	Yinzhe Xu et.al.	2404.13953	null
2024-04-18	Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training	Jin Gao et.al.	2404.12210	link
2024-04-16	Attention-Aware Visualization: Tracking and Responding to User Perception Over Time	Arvind Srinivasan et.al.	2404.10732	null
2024-04-15	Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL	Fangwei Zhong et.al.	2404.09857	null
2024-04-15	Learning Tracking Representations from Single Point Annotations	Qiangqiang Wu et.al.	2404.09504	null
2024-04-11	PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds	Weisheng Xu et.al.	2404.07495	link
2024-05-02	Longitudinal Analysis and Quantitative Assessment of Child Development through Mobile Interaction	Juan Carlos Ruiz-Garcia et.al.	2404.06919	null
2024-04-09	LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks	Jianlang Chen et.al.	2404.06247	link
2024-04-08	Semi-Supervised Novelty Detection for Precise Ultra-Wideband Error Signal Prediction	Umberto Albertin et.al.	2404.05351	null
2024-03-29	Context-Aware Integration of Language and Visual References for Natural Language Tracking	Yanyan Shao et.al.	2403.19975	null
2024-03-27	TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes	Liangyu Xu et.al.	2403.18238	null
2024-03-26	OmniVid: A Generative Framework for Universal Video Understanding	Junke Wang et.al.	2403.17935	link
2024-03-26	Exploring Dynamic Transformer for Efficient Object Tracking	Jiawen Zhu et.al.	2403.17651	null
2024-03-29	Elysium: Exploring Object-level Perception in Videos via MLLM	Han Wang et.al.	2403.16558	link
2024-03-25	Multi-attention Associate Prediction Network for Visual Tracking	Xinglong Sun et.al.	2403.16395	null
2024-03-28	SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking	Xiaojun Hou et.al.	2403.16002	link
2024-03-23	Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking	Shaoyu Sun et.al.	2403.15831	null
2024-03-19	TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO	Chaoran Xiong et.al.	2403.12504	null
2024-03-18	Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model	Jan Krejčí et.al.	2403.11978	null
2024-03-16	A Spectrum-based Image Denoising Method with Edge Feature Enhancement	Peter Luvton et.al.	2403.11036	null
2024-03-15	Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers	Jinxia Xie et.al.	2403.10574	null
2024-03-14	OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning	Lingyi Hong et.al.	2403.09634	null
2024-02-27	ACTrack: Adding Spatio-Temporal Condition for Visual Object Tracking	Yushan Han et.al.	2403.07914	null
2024-04-03	Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline	Xiao Wang et.al.	2403.05839	link
2024-03-08	Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance	Liting Lin et.al.	2403.05231	null
2024-03-08	Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy	Yuelin Zhang et.al.	2403.05146	link
2024-03-06	VastTrack: Vast Category Visual Object Tracking	Liang Peng et.al.	2403.03493	link
2024-02-28	Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks	Zhewei Wu et.al.	2402.17976	null
2024-02-26	SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking	Yu Lin et.al.	2402.16249	link
2024-02-26	Reading Relevant Feature from Global Representation Memory for Visual Object Tracking	Xinyu Zhou et.al.	2402.14392	null
2024-02-13	Optimized Information Flow for Transformer Tracking	Janani Kugarajeevan et.al.	2402.08195	link
2024-02-07	BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision	Xin Zhao et.al.	2402.04519	null
2024-02-04	Spatio-temporal Prompting Network for Robust Video Feature Extraction	Guanxiong Sun et.al.	2402.02574	link
2024-01-24	Small Object Tracking in LiDAR Point Cloud: Learning the Target-awareness Prototype and Fine-grained Search Region	Shengjing Tian et.al.	2401.13285	null
2024-01-23	Correlation-Embedded Transformer Tracking: A Single-Branch Framework	Fei Xie et.al.	2401.12743	link
2024-01-20	Unifying Visual and Vision-Language Tracking via Contrastive Learning	Yinchao Ma et.al.	2401.11228	link
2024-01-20	Towards Category Unification of 3D Single Object Tracking on Point Clouds	Jiahao Nie et.al.	2401.11204	null
2024-01-18	Multi-task Learning for Joint Re-identification, Team Affiliation, and Role Classification for Sports Visual Tracking	Amir M. Mansourian et.al.	2401.09942	null
2024-01-12	Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements	Muhammad Wasim Nawaz et.al.	2401.06396	null
2024-01-18	Hold ’em and Fold ’em: Towards Human-scale, Feedback-Controlled Soft Origami Robots	Immanuel Ampomah Mensah et.al.	2401.04650	null
2024-01-06	Explicit Visual Prompts for Visual Object Tracking	Liangtao Shi et.al.	2401.03142	link
2024-01-03	ODTrack: Online Dense Temporal Token Learning for Visual Tracking	Yaozong Zheng et.al.	2401.01686	link
2023-12-27	X Modality Assisting RGBT Object Tracking	Zhaisheng Ding et.al.	2312.17273	null
2023-12-22	Cross-Modal Object Tracking via Modality-Aware Fusion Network and A Large-Scale Dataset	Lei Liu et.al.	2312.14446	link
2023-12-18	Multi-Correlation Siamese Transformer Network with Dense Connection for 3D Single Object Tracking	Shihao Feng et.al.	2312.11051	link
2023-12-17	Robust 3D Tracking with Quality-Aware Shape Completion	Jingwen Zhang et.al.	2312.10608	null
2023-12-15	Tracking Skiers from the Top to the Bottom	Matteo Dunnhofer et.al.	2312.09723	null
2023-12-11	M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking	Jiaming Liu et.al.	2312.06117	link
2023-12-07	Instance Tracking in 3D Scenes from Egocentric Videos	Yunhan Zhao et.al.	2312.04117	link
2024-02-19	Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking	Jiawei Ge et.al.	2311.17085	null
2023-11-21	Visual tracking brain computer interface	Changxing Huang et.al.	2311.12592	null
2024-01-10	ViKi-HyCo: A Hybrid-Control approach for complex car-like maneuvers	Edison P. Velasco Sánchez et.al.	2311.07268	null

Large Language Model

Publish Date	Title	Authors	PDF	Code
2024-05-16	UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models	Sahel Sharifymoghaddam et.al.	2405.10311	null
2024-05-16	4D Panoptic Scene Graph Generation	Jingkang Yang et.al.	2405.10305	link
2024-05-16	Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees	Yu Gui et.al.	2405.10301	null
2024-05-16	HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models	Rhea Sanjay Sukthanker et.al.	2405.10299	null
2024-05-16	Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning	Yuexiang Zhai et.al.	2405.10292	null
2024-05-16	Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction	Jianhao Chen et.al.	2405.10288	null
2024-05-16	FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models	Adrian Bulat et.al.	2405.10286	null
2024-05-16	Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers	Tuo Zhang et.al.	2405.10276	null
2024-05-16	Keep It Private: Unsupervised Privatization of Online Text	Calvin Bao et.al.	2405.10260	link
2024-05-16	When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models	Xianzheng Ma et.al.	2405.10255	null
2024-05-16	PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology	George Shaikovski et.al.	2405.10254	null
2024-05-16	A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks	Xuanfan Ni et.al.	2405.10251	null
2024-05-16	IntelliExplain: Enhancing Interactive Code Generation through Natural Language Explanations for Non-Professional Programmers	Hao Yan et.al.	2405.10250	null
2024-05-16	A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts	Xinru Zhang et.al.	2405.10246	null
2024-05-16	DocuMint: Docstring Generation for Python using Small Language Models	Bibek Poudel et.al.	2405.10243	link
2024-05-16	Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting	Divij Gupta et.al.	2405.10216	null
2024-05-16	CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations	Jiahao Zhao et.al.	2405.10212	null
2024-05-16	LFED: A Literary Fiction Evaluation Dataset for Large Language Models	Linhao Yu et.al.	2405.10166	link
2024-05-16	PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning	Jiancheng Pan et.al.	2405.10160	link
2024-05-16	Speaker Verification in Agent-Generated Conversations	Yizhe Yang et.al.	2405.10150	null
2024-05-15	Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming	Bushi Xiao et.al.	2405.09508	null
2024-05-15	Constrained Learning for Causal Inference and Semiparametric Statistics	Tiffany Tianhui Cai et.al.	2405.09493	null
2024-05-15	Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts	Donya Rooein et.al.	2405.09482	null
2024-05-15	Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models	Majid Zarharan et.al.	2405.09454	link
2024-05-15	M$^4$oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts	Yufeng Jiang et.al.	2405.09446	null
2024-05-15	Facilitating Opinion Diversity through Hybrid NLP Approaches	Michiel van der Meer et.al.	2405.09439	null
2024-05-15	A Survey On Text-to-3D Contents Generation In The Wild	Chenhan Jiang et.al.	2405.09431	null
2024-05-15	MicroPython Testbed for Federated Learning Algorithms	Miroslav Popovic et.al.	2405.09423	null
2024-05-15	Matching domain experts by training from scratch on domain knowledge	Xiaoliang Luo et.al.	2405.09395	null
2024-05-15	Compositional imprecise probability	Jack Liell-Cock et.al.	2405.09391	null
2024-05-15	PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models	Devansh Jain et.al.	2405.09373	null
2024-05-15	SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition	Weijie L et.al.	2405.09365	null
2024-05-15	Large Language Model Bias Mitigation from the Perspective of Knowledge Editing	Ruizhe Chen et.al.	2405.09341	null
2024-05-15	Prompting-based Synthetic Data Generation for Few-Shot Question Answering	Maximilian Schmidt et.al.	2405.09335	null
2024-05-15	Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls	Pedro Miguel Sánchez Sánchez et.al.	2405.09318	null
2024-05-15	Comparing the Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support	Birger Moell et.al.	2405.09300	null
2024-05-15	Do language models capture implied discourse meanings? An investigation with exhaustivity implicatures of Korean morphology	Hagyeong Shin et.al.	2405.09293	null
2024-05-15	Sign of the Times: Evaluating the use of Large Language Models for Idiomaticity Detection	Dylan Phelps et.al.	2405.09279	null
2024-05-15	Dynamic Activation Pitfalls in LLaMA Models: An Empirical Study	Chi Ma et.al.	2405.09274	null
2024-05-15	New Textual Corpora for Serbian Language Modeling	Mihailo Škorić et.al.	2405.09250	null
2024-05-14	Efficient Vision-Language Pre-training by Cluster Masking	Zihao Wei et.al.	2405.08815	link
2024-05-14	Towards Enhanced RAC Accessibility: Leveraging Datasets and LLMs	Edison Jair Bejarano Sepulveda et.al.	2405.08792	null
2024-05-14	Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring	Tiantian Zhang et.al.	2405.08786	null
2024-05-14	Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs	Akhila Yerukola et.al.	2405.08760	link
2024-05-14	Distributed Threat Intelligence at the Edge Devices: A Large Language Model-Driven Approach	Syed Mhamudul Hasan et.al.	2405.08755	null
2024-05-14	Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding	Zhimin Li et.al.	2405.08748	link
2024-05-14	Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory	Xueyan Niu et.al.	2405.08707	null
2024-05-14	EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera	Beilei Cui et.al.	2405.08672	link
2024-05-14	Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research	Qinglong Cao et.al.	2405.08668	link
2024-05-14	Thinking Tokens for Language Modeling	David Herel et.al.	2405.08644	null
2024-05-15	ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation	Dimitris Gkoumas et.al.	2405.08619	null
2024-05-14	A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine	Hanguang Xiao et.al.	2405.08603	null
2024-05-15	EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark	Xiaohui Zhang et.al.	2405.08596	null
2024-05-14	Open-Vocabulary Object Detection via Neighboring Region Attention Alignment	Sunyuan Qiang et.al.	2405.08593	null
2024-05-14	Improving Transformers with Dynamically Composable Multi-Head Attention	Da Xiao et.al.	2405.08553	link
2024-05-14	Self-Distillation Improves DNA Sequence Inference	Tong Yu et.al.	2405.08538	link
2024-05-14	Falcon 7b for Software Mention Detection in Scholarly Documents	AmeerAli Khan et.al.	2405.08514	null
2024-05-14	Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure	Odysseas S. Chlapanis et.al.	2405.08502	null
2024-05-14	Is Less More? Quality, Quantity and Context in Idiom Processing with Natural Language Models	Agne Knietaite et.al.	2405.08497	null
2024-05-14	Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models	Andrea Piergentili et.al.	2405.08477	null
2024-05-13	Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots	Chengyue Wu et.al.	2405.07990	null
2024-05-13	A Generalist Learner for Multifaceted Medical Image Interpretation	Hong-Yu Zhou et.al.	2405.07988	null
2024-05-13	The Platonic Representation Hypothesis	Minyoung Huh et.al.	2405.07987	link
2024-05-13	Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation	Kevin Stangl et.al.	2405.07969	null
2024-05-13	PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation	Suad Alshammari et.al.	2405.07963	null
2024-05-13	AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments	Samuel Schmidgall et.al.	2405.07960	null
2024-05-13	EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning	Yinzhu Quan et.al.	2405.07938	null
2024-05-13	PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition	Ziyang Zhang et.al.	2405.07932	link
2024-05-13	Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data	Mahdi Morafah et.al.	2405.07925	null
2024-05-13	Can Better Text Semantics in Prompt Tuning Improve VLM Generalization?	Hari Chandana Kuchibhotla et.al.	2405.07921	null
2024-05-13	A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking	Ferdinand Schlatt et.al.	2405.07920	null
2024-05-13	PLUTO: Pathology-Universal Transformer	Dinkar Juyal et.al.	2405.07905	null
2024-05-13	Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers	Alena Tsanda et.al.	2405.07886	null
2024-05-13	Zero-Shot Tokenizer Transfer	Benjamin Minixhofer et.al.	2405.07883	null
2024-05-13	RLHF Workflow: From Reward Modeling to Online RLHF	Hanze Dong et.al.	2405.07863	link
2024-05-13	Can LLMs Help Predict Elections? (Counter)Evidence from the World’s Largest Democracy	Pratik Gujral et.al.	2405.07828	null
2024-05-13	A View of How Language Models Will Transform Law	Frank Fagan et.al.	2405.07826	null
2024-05-13	FreeVA: Offline MLLM as Training-Free Video Assistant	Wenhao Wu et.al.	2405.07798	link
2024-05-13	DEPTH: Discourse Education through Pre-Training Hierarchically	Zachary Bamberger et.al.	2405.07788	link
2024-05-13	Generating Human Motion in 3D Scenes from Text Descriptions	Zhi Cen et.al.	2405.07784	null
2024-05-10	Linearizing Large Language Models	Jean Mercat et.al.	2405.06640	link
2024-05-10	Value Augmented Sampling for Language Model Alignment and Personalization	Seungwook Han et.al.	2405.06639	link
2024-05-10	Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark	Evan M. Williams et.al.	2405.06634	null
2024-05-10	Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models	Chakshu Moar et.al.	2405.06626	null
2024-05-10	Explaining Text Similarity in Transformer Models	Alexandros Vasileiou et.al.	2405.06604	null
2024-05-10	Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach	Elham Ravanbakhsh et.al.	2405.06586	null
2024-05-10	What Can Natural Language Processing Do for Peer Review?	Ilia Kuznetsov et.al.	2405.06563	null
2024-05-10	Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval	Mengjia Niu et.al.	2405.06545	null
2024-05-10	Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts	Wenyu Huang et.al.	2405.06524	null
2024-05-10	UniDM: A Unified Framework for Data Manipulation with Large Language Models	Yichen Qian et.al.	2405.06510	null
2024-05-10	Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling	Lyumanshan Ye et.al.	2405.06495	null
2024-05-10	Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification	Yaoqin Ye et.al.	2405.06468	null
2024-05-10	Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation	JoonHo Lee et.al.	2405.06424	link
2024-05-10	Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions?	Hunter McNichols et.al.	2405.06414	null
2024-05-10	Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL	Ning Cheng et.al.	2405.06410	null
2024-05-10	Program Synthesis using Inductive Logic Programming for the Abstraction and Reasoning Corpus	Filipe Marinho Rocha et.al.	2405.06399	null
2024-05-10	Memory Mosaics	Jianyu Zhang et.al.	2405.06394	null
2024-05-10	LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play	Li-Chun Lu et.al.	2405.06373	null
2024-05-10	LMD3: Language Model Data Density Dependence	John Kirchenbauer et.al.	2405.06331	null
2024-05-10	Correlation Dimension of Natural Language in a Statistical Manifold	Xin Du et.al.	2405.06321	null
2024-05-09	Natural Language Processing RELIES on Linguistics	Juri Opitz et.al.	2405.05966	null
2024-05-09	OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning	Dan Qiao et.al.	2405.05957	link
2024-05-09	Probing Multimodal LLMs as World Models for Driving	Shiva Sreeram et.al.	2405.05956	link
2024-05-09	Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning	Junzhi Chen et.al.	2405.05955	null
2024-05-09	CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts	Jiachen Li et.al.	2405.05949	link
2024-05-09	DOLOMITES: Domain-Specific Long-Form Methodical Tasks	Chaitanya Malaviya et.al.	2405.05938	null
2024-05-09	Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness	Siyuan Li et.al.	2405.05930	null
2024-05-09	Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?	Zorik Gekhman et.al.	2405.05904	null
2024-05-09	Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes	Ziang Guo et.al.	2405.05885	null
2024-05-09	FlockGPT: Guiding UAV Flocking with Linguistic Orchestration	Artem Lykov et.al.	2405.05872	null
2024-05-09	Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control	Gunshi Gupta et.al.	2405.05852	link
2024-05-09	Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning	Artem Lykov et.al.	2405.05824	link
2024-05-09	Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference	Zhihang Lin et.al.	2405.05803	link
2024-05-09	Towards a More Inclusive AI: Progress and Perspectives in Large Language Model Training for the Sámi Language	Ronny Paul et.al.	2405.05777	null
2024-05-09	Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions	Polina Tsvilodub et.al.	2405.05776	null
2024-05-09	Large Language Model-Aided Evolutionary Search for Constrained Multiobjective Optimization	Zeyi Wang et.al.	2405.05767	null
2024-05-09	Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social Media	Zhizhen Zhang et.al.	2405.05760	null
2024-05-09	Exploring the Potential of Human-LLM Synergy in Advancing Qualitative Analysis: A Case Study on Mental-Illness Stigma	Han Meng et.al.	2405.05758	null
2024-05-09	Can large language models understand uncommon meanings of common words?	Jinyang Wu et.al.	2405.05741	null
2024-05-09	Evaluating Dialect Robustness of Language Models via Conversation Understanding	Dipankar Srirag et.al.	2405.05688	link
2024-05-08	THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models	Prannay Kaul et.al.	2405.05256	null
2024-05-08	You Only Cache Once: Decoder-Decoder Architectures for Language Models	Yutao Sun et.al.	2405.05254	null
2024-05-08	Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge	Charles Koutcheme et.al.	2405.05253	link
2024-05-09	LLMs with Personalities in Multi-issue Negotiation Games	Sean Noh et.al.	2405.05248	null
2024-05-08	EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning	Jingfeng Yao et.al.	2405.05237	link
2024-05-08	SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants	Masoud Moghani et.al.	2405.05226	null
2024-05-08	Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers	Jiuxiang Gu et.al.	2405.05219	null
2024-05-08	FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models	Jinglin Xu et.al.	2405.05216	link
2024-05-08	MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning	Inderjeet Nair et.al.	2405.05189	null
2024-05-08	Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming	Tommaso Pasini et.al.	2405.05176	null
2024-05-08	Air Gap: Protecting Privacy-Conscious Conversational Agents	Eugene Bagdasaryan et.al.	2405.05175	null
2024-05-08	XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples	Peiqin Lin et.al.	2405.05116	link
2024-05-08	QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs	Weijia Zhang et.al.	2405.05109	null
2024-05-08	Concerns on Bias in Large Language Models when Creating Synthetic Personae	Helena A. Haxvig et.al.	2405.05080	null
2024-05-08	Impact of Tone-Aware Explanations in Recommender Systems	Ayano Okoso et.al.	2405.05061	null
2024-05-08	Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models	Aylin Gunal et.al.	2405.05060	null
2024-05-08	Seeds of Stereotypes: A Large-Scale Textual Analysis of Race and Gender Associations with Diseases in Online Sources	Lasse Hyldig Hansen et.al.	2405.05049	null
2024-05-08	$M^2D$NeRF: Multi-Modal Decomposition NeRF with 3D Feature Fields	Ning Wang et.al.	2405.05010	null
2024-05-08	ADELIE: Aligning Large Language Models on Information Extraction	Yunjia Qi et.al.	2405.05008	link
2024-05-08	NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair	Ruoke Wang et.al.	2405.04994	null
2024-05-07	ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning	Jing Lin et.al.	2405.04533	null
2024-05-07	QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving	Yujun Lin et.al.	2405.04532	link
2024-05-07	NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts	Shudan Zhang et.al.	2405.04520	null
2024-05-07	xLSTM: Extended Long Short-Term Memory	Maximilian Beck et.al.	2405.04517	null
2024-05-07	A Transformer with Stack Attention	Jiaoda Li et.al.	2405.04515	link
2024-05-08	Unveiling Disparities in Web Task Handling Between Human and Web Agent	Kihoon Son et.al.	2405.04497	null
2024-05-07	Toward In-Context Teaching: Adapting Examples to Students’ Misconceptions	Alexis Ross et.al.	2405.04495	null
2024-05-07	Representation Learning of Daily Movement Data Using Text Encoders	Alexander Capstick et.al.	2405.04494	link
2024-05-08	DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model	DeepSeek-AI et.al.	2405.04434	link
2024-05-07	The Silicone Ceiling: Auditing GPT’s Race and Gender Biases in Hiring	Lena Armstrong et.al.	2405.04412	null
2024-05-07	Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks	Georgios Pantazopoulos et.al.	2405.04403	link
2024-05-07	Large Language Models Cannot Explain Themselves	Advait Sarkar et.al.	2405.04382	null
2024-05-07	A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI	Hannah Chafetz et.al.	2405.04333	null
2024-05-07	Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation	Atharvan Dogra et.al.	2405.04325	null
2024-05-07	Granite Code Models: A Family of Open Foundation Models for Code Intelligence	Mayank Mishra et.al.	2405.04324	link
2024-05-07	Accelerating Speculative Decoding using Dynamic Speculation Length	Jonathan Mamou et.al.	2405.04304	null
2024-05-07	Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework	Xiangpeng Wan et.al.	2405.04294	link
2024-05-07	Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore	Junchao Wu et.al.	2405.04286	null
2024-05-07	On the Foundations of Earth and Climate Foundation Models	Xiao Xiang Zhu et.al.	2405.04285	null
2024-05-07	Semantic API Alignment: Linking High-level User Goals to APIs	Robert Feldt et.al.	2405.04236	null
2024-05-06	Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs	Muhammad Uzair Khattak et.al.	2405.03690	null
2024-05-06	Pose Priors from Language Models	Sanjay Subramanian et.al.	2405.03689	null
2024-05-06	Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames	Keith Burghardt et.al.	2405.03688	link
2024-05-06	Language-Image Models with 3D Understanding	Jang Hyun Cho et.al.	2405.03685	null
2024-05-06	AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design	Kamal Choudhary et.al.	2405.03680	null
2024-05-06	When LLMs Meet Cybersecurity: A Systematic Literature Review	Jie Zhang et.al.	2405.03644	link
2024-05-06	A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama	Vlad-Andrei Cursaru et.al.	2405.03616	null
2024-05-06	GREEN: Generative Radiology Report Evaluation and Error Notation	Sophie Ostmeier et.al.	2405.03595	null
2024-05-06	Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment	Abhinav Agarwalla et.al.	2405.03594	null
2024-05-06	Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing	Han Liu et.al.	2405.03565	null
2024-05-07	ID-centric Pre-training for Recommendation	Yiqing Wu et.al.	2405.03562	null
2024-05-06	AlphaMath Almost Zero: process Supervision without process	Guoxin Chen et.al.	2405.03553	link
2024-05-06	MAmmoTH2: Scaling Instructions from the Web	Xiang Yue et.al.	2405.03548	null
2024-05-06	Position Paper: Leveraging Foundational Models for Black-Box Optimization: Benefits, Challenges, and Future Directions	Xingyou Song et.al.	2405.03547	null
2024-05-06	Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning	Yubo Mai et.al.	2405.03509	null
2024-05-06	UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images	Yiting Qu et.al.	2405.03486	null
2024-05-06	LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model	Haowen Sun et.al.	2405.03485	link
2024-05-06	Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search	Hideaki Joko et.al.	2405.03480	link
2024-05-07	Large Language Models (LLMs) as Agents for Augmented Democracy	Jairo Gudiño-Rosero et.al.	2405.03452	null
2024-05-06	SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence	Hangyuan Ji et.al.	2405.03446	null
2024-05-03	Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models	Piotr Padlewski et.al.	2405.02287	link
2024-05-03	Structural Pruning of Pre-trained Language Models via Neural Architecture Search	Aaron Klein et.al.	2405.02267	null
2024-05-03	On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning?	Maxime Zanella et.al.	2405.02266	link
2024-05-03	Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows	Jasmine Y. Shih et.al.	2405.02260	null
2024-05-03	What matters when building vision-language models?	Hugo Laurençon et.al.	2405.02246	null
2024-05-03	REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs	Deepa Tilwani et.al.	2405.02228	null
2024-05-03	Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks	Lujing Zhang et.al.	2405.02225	null
2024-05-03	FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems	Yashar Deldjoo et.al.	2405.02219	null
2024-05-03	Automatic Programming: Large Language Models and Beyond	Michael R. Lyu et.al.	2405.02213	null
2024-05-03	Assessing and Verifying Task Utility in LLM-Powered Applications	Negar Arabzadeh et.al.	2405.02178	null
2024-05-03	Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset	Hsuvas Borkakoty et.al.	2405.02175	null
2024-05-03	Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models	Mohamad Al Mdfaa et.al.	2405.02162	null
2024-05-03	Neural Context Flows for Learning Generalizable Dynamical Systems	Roussel Desmond Nzoyem et.al.	2405.02154	link
2024-05-03	The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates	Giuseppe Russo Latona et.al.	2405.02150	link
2024-05-03	MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain	Chao Jiang et.al.	2405.02144	null
2024-05-03	Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection	Guillem Ramírez et.al.	2405.02134	null
2024-05-03	Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets	Xuelong Geng et.al.	2405.02132	null
2024-05-03	Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph	Vladyslav Nechakhin et.al.	2405.02105	null
2024-05-03	Argumentative Large Language Models for Explainable and Contestable Decision-Making	Gabriel Freedman et.al.	2405.02079	null
2024-05-03	Comparative Analysis of Retrieval Systems in the Real World	Dmytro Mozolevskyi et.al.	2405.02048	null
2024-05-02	Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models	Seungone Kim et.al.	2405.01535	link
2024-05-02	Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks	Murtaza Dalal et.al.	2405.01534	null
2024-05-02	OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning	Shihao Wang et.al.	2405.01533	link
2024-05-02	FLAME: Factuality-Aware Alignment for Large Language Models	Sheng-Chieh Lin et.al.	2405.01525	null
2024-05-02	A separability-based approach to quantifying generalization: which layer is best?	Luciano Dyballa et.al.	2405.01524	null
2024-05-02	Transformer-Aided Semantic Communications	Matin Mortaheb et.al.	2405.01521	null
2024-05-02	D2PO: Discriminator-Guided DPO with Response Evaluation Models	Prasann Singhal et.al.	2405.01511	link
2024-05-02	Analyzing the Role of Semantic Representations in the Era of Large Language Models	Zhijing Jin et.al.	2405.01502	link
2024-05-02	Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models	Raymond Fok et.al.	2405.01501	null
2024-05-02	Controllable Text Generation in the Instruction-Tuning Era	Dhananjay Ashok et.al.	2405.01490	null
2024-05-02	MANTIS: Interleaved Multi-Image Instruction Tuning	Dongfu Jiang et.al.	2405.01483	null
2024-05-02	NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment	Gerald Shen et.al.	2405.01481	link
2024-05-02	V-FLUTE: Visual Figurative Language Understanding with Textual Explanations	Arkadiy Saakyan et.al.	2405.01474	link
2024-05-02	Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning	Théo Moutakanni et.al.	2405.01469	null
2024-05-02	Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models	Yifei Ming et.al.	2405.01468	null
2024-05-02	A Systematic Literature Review on Large Language Models for Automated Program Repair	Quanjun Zhang et.al.	2405.01466	link
2024-05-02	Natural Language to Verilog: Design of a Recurrent Spiking Neural Network using Large Language Models and ChatGPT	Paola Vitolo et.al.	2405.01419	null
2024-05-02	MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors	Yuan Tang et.al.	2405.01413	link
2024-05-02	Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving	Xin Quan et.al.	2405.01379	null
2024-05-02	GAIA: A General AI Assistant for Intelligent Accelerator Operations	Frank Mayet et.al.	2405.01359	null
2024-05-01	Self-Play Preference Optimization for Language Model Alignment	Yue Wu et.al.	2405.00675	null
2024-05-01	Is Bigger Edit Batch Size Always Better? – An Empirical Study on Model Editing with Llama-3	Junsang Yoon et.al.	2405.00664	link
2024-05-01	HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models	Ningke Li et.al.	2405.00648	null
2024-05-01	When Quantization Affects Confidence of Large Language Models?	Irina Proskurina et.al.	2405.00632	link
2024-05-01	“I’m Not Sure, But…”: Examining the Impact of Large Language Models’ Uncertainty Expression on User Reliance and Trust	Sunnie S. Y. Kim et.al.	2405.00623	null
2024-05-01	Causal Evaluation of Language Models	Sirui Chen et.al.	2405.00622	link
2024-05-01	Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling	Yida Mu et.al.	2405.00611	null
2024-05-01	Investigating Automatic Scoring and Feedback using Large Language Models	Gloria Ashiya Katuka et.al.	2405.00602	null
2024-05-01	Are Models Biased on Text without Gender-related Language?	Catarina G Belém et.al.	2405.00588	link
2024-05-01	The Real, the Better: Aligning Large Language Models with Online Human Behaviors	Guanying Jiang et.al.	2405.00578	null
2024-05-01	EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model	Deng Li et.al.	2405.00574	null
2024-05-01	NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance	Huan-Yi Su et.al.	2405.00566	null
2024-05-01	Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment	Zhili Liu et.al.	2405.00557	null
2024-05-01	Long-Term Human Trajectory Prediction using 3D Dynamic Scene Graphs	Nicolas Gorlo et.al.	2405.00552	link
2024-05-01	ChatBI: Towards Natural Language to Complex Business Intelligence SQL	Jinqing Lian et.al.	2405.00527	null
2024-05-01	CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions	Donghee Choi et.al.	2405.00523	null
2024-05-01	Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning	Lucas-Andreï Thil et.al.	2405.00516	null
2024-05-01	GOLD: Geometry Problem Solver with Natural Language Description	Jiaxin Zhang et.al.	2405.00494	link
2024-05-01	Is Temperature the Creativity Parameter of Large Language Models?	Max Peeperkorn et.al.	2405.00492	null
2024-05-01	The Pyramid of Captions	Delong Chen et.al.	2405.00485	null
2024-04-30	Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation	Yunhao Ge et.al.	2404.19752	null
2024-04-30	PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification	Leon Garza et.al.	2404.19744	null
2024-04-30	Better & Faster Large Language Models via Multi-token Prediction	Fabian Gloeckle et.al.	2404.19737	null
2024-04-30	A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications	Steph Buongiorno et.al.	2404.19729	null
2024-04-30	PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games	Steph Buongiorno et.al.	2404.19721	null
2024-04-30	Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns	Constantinos Patsakis et.al.	2404.19715	null
2024-04-30	Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models	Scott Sumpter et.al.	2404.19713	null
2024-04-30	When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively	Tiziano Labruna et.al.	2404.19705	link
2024-04-30	Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners	Chun Feng et.al.	2404.19696	null
2024-04-30	Towards Generalist Robot Learning from Internet Video: A Survey	Robert McCarthy et.al.	2404.19664	null
2024-04-30	MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation	Min Zhang et.al.	2404.19644	null
2024-04-30	On Training a Neural Network to Explain Binaries	Alexander Interrante-Grant et.al.	2404.19631	null
2024-04-30	Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model	Denys Godwin et.al.	2404.19609	null
2024-04-30	Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning	Xuanli He et.al.	2404.19597	null
2024-04-30	RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing	Yucheng Hu et.al.	2404.19543	link
2024-04-30	MoST: Multi-modality Scene Tokenization for Motion Prediction	Norman Mu et.al.	2404.19531	null
2024-04-30	Do Large Language Models Understand Conversational Implicature – A case study with a Chinese sitcom	Shisen Yue et.al.	2404.19509	link
2024-04-30	More Compute Is What You Need	Zhen Guo et.al.	2404.19484	null
2024-05-01	Neuro-Vision to Language: Image Reconstruction and Language enabled Interaction via Brain Recordings	Guobin Shen et.al.	2404.19438	null
2024-04-30	Can Large Language Models put 2 and 2 together? Probing for Entailed Arithmetical Relationships	D. Panas et.al.	2404.19432	null
2024-04-29	Hallucination of Multimodal Large Language Models: A Survey	Zechen Bai et.al.	2404.18930	link
2024-04-29	Holmes: Benchmark the Linguistic Competence of Language Models	Andreas Waldis et.al.	2404.18923	null
2024-04-29	DPO Meets PPO: Reinforced Token Optimization for RLHF	Han Zhong et.al.	2404.18922	null
2024-04-29	TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation	Junhao Cheng et.al.	2404.18919	link
2024-04-29	Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting	Fangcheng Liu et.al.	2404.18911	link
2024-04-29	Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking	Hong Jin Kang et.al.	2404.18881	link
2024-04-29	More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness	Aaron J. Li et.al.	2404.18870	link
2024-04-29	Truth-value judgment in language models: belief directions are context sensitive	Stefan F. Schouten et.al.	2404.18865	null
2024-04-29	Performance-Aligned LLMs for Generating Fast Code	Daniel Nichols et.al.	2404.18864	null
2024-04-29	A Survey on Vision Mamba: Models, Applications and Challenges	Rui Xu et.al.	2404.18861	link
2024-04-29	VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning	Aidan Z. H. Yang et.al.	2404.18852	null
2024-04-29	FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition	Yuxuan Yan et.al.	2404.18848	null
2024-04-29	It’s Difficult to be Neutral – Human and LLM-based Sentiment Annotation of Patient Comments	Petter Mæhlum et.al.	2404.18832	null
2024-04-29	Benchmarking Benchmark Leakage in Large Language Models	Ruijie Xu et.al.	2404.18824	link
2024-04-29	AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering	Wenxiang Zhao et.al.	2404.18816	null
2024-04-29	Unknown Script: Impact of Script on Cross-Lingual Transfer	Wondimagegnhue Tsegaye Tufa et.al.	2404.18810	link
2024-04-29	Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models	Pat Verga et.al.	2404.18796	null
2024-04-29	PECC: Problem Extraction and Coding Challenges	Patrick Haller et.al.	2404.18766	link
2024-04-29	Transitive Vision-Language Prompt Learning for Domain Generalization	Liyuan Wang et.al.	2404.18758	null
2024-04-29	Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models	Hongyi Zhu et.al.	2404.18746	null
2024-04-26	Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo	Stephen Zhao et.al.	2404.17546	link
2024-04-26	Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models	Yuhang Huang et.al.	2404.17534	null
2024-04-26	Large Language Model Agent as a Mechanical Designer	Yayati Jadhav et.al.	2404.17525	null
2024-04-26	On the Use of Large Language Models to Generate Capability Ontologies	Luis Miguel Vieira da Silva et.al.	2404.17524	null
2024-04-26	Enhancing Legal Compliance and Regulation Analysis with Large Language Models	Shabnam Hassani et.al.	2404.17522	null
2024-04-26	A Comprehensive Evaluation on Event Reasoning of Large Language Models	Zhengwei Tao et.al.	2404.17513	link
2024-04-26	CEval: A Benchmark for Evaluating Counterfactual Text Generation	Van Bach Nguyen et.al.	2404.17475	null
2024-04-26	Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System	Robin Schmucker et.al.	2404.17460	null
2024-04-26	“ChatGPT Is Here to Help, Not to Replace Anybody” – An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses	Bruno Pereira Cipriano et.al.	2404.17443	null
2024-04-26	PromptCIR: Blind Compressed Image Restoration with Prompt Learning	Bingchen Li et.al.	2404.17433	link
2024-04-26	Evaluation of Geographical Distortions in Language Models: A Crucial Step Towards Equitable Representations	Rémy Decoupes et.al.	2404.17401	null
2024-04-26	UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning	Maoxun Yuan et.al.	2404.17360	null
2024-04-26	InspectorRAGet: An Introspection Platform for RAG Evaluation	Kshitij Fadnis et.al.	2404.17347	link
2024-04-26	Introducing cosmosGPT: Monolingual Training for Turkish Language Models	H. Toprak Kesgin et.al.	2404.17336	null
2024-04-26	A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation	Xin Zhang et.al.	2404.17335	null
2024-04-26	An Extendable Cloud-Native Alloy Property Explorer	Zhuoyuan Li et.al.	2404.17330	link
2024-04-26	When to Trust LLMs: Aligning Confidence with Response Quality	Shuchang Tao et.al.	2404.17287	null
2024-04-26	Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM	Xuan Zhang et.al.	2404.17283	link
2024-04-26	Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT as a Pivot	Michelle Terblanche et.al.	2404.17216	null
2024-04-26	Low-Rank Knowledge Decomposition for Medical Foundation Models	Yuhang Zhou et.al.	2404.17184	null
2024-04-25	The Third Monocular Depth Estimation Challenge	Jaime Spencer et.al.	2404.16831	null
2024-04-25	Make-it-Real: Unleashing Large Multimodal Model’s Ability for Painting 3D Objects with Realistic Materials	Ye Fang et.al.	2404.16829	null
2024-04-25	V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection	Xuanyu Zhang et.al.	2404.16824	null
2024-04-25	How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites	Zhe Chen et.al.	2404.16821	link
2024-04-25	IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages	Harman Singh et.al.	2404.16816	link
2024-04-26	Make Your LLM Fully Utilize the Context	Shengnan An et.al.	2404.16811	link
2024-04-25	Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning	Tianhui Zhang et.al.	2404.16807	null
2024-04-25	AAPL: Adding Attributes to Prompt Learning for Vision-Language Models	Gahyeon Kim et.al.	2404.16804	link
2024-04-25	Weak-to-Strong Extrapolation Expedites Alignment	Chujie Zheng et.al.	2404.16792	link
2024-04-25	SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension	Bohao Li et.al.	2404.16790	link
2024-04-25	Continual Learning of Large Language Models: A Comprehensive Survey	Haizhou Shi et.al.	2404.16789	link
2024-04-25	Modeling Selective Feature Attention for Representation-based Siamese Text Matching	Jianxiang Zang et.al.	2404.16776	link
2024-04-25	REBEL: Reinforcement Learning via Regressing Relative Rewards	Zhaolin Gao et.al.	2404.16767	link
2024-04-25	Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model	Runzhe Zhan et.al.	2404.16766	null
2024-04-25	RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis	Xiaoman Zhang et.al.	2404.16754	null
2024-04-25	Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class	Mazda Moayeri et.al.	2404.16717	null
2024-04-25	Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding	Mostafa Elhoushi et.al.	2404.16710	null
2024-04-25	Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents	Giorgio Piatti et.al.	2404.16698	null
2024-04-25	Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4	Lydia Uhler et.al.	2404.16692	null
2024-04-25	EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning	Hongxia Xie et.al.	2404.16670	link
2024-04-24	Hybrid LLM/Rule-based Approaches to Business Insights Generation from Structured Data	Aliaksei Vertsel et.al.	2404.15604	null
2024-04-24	ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction	Henry Peng Zou et.al.	2404.15592	link
2024-04-24	MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis	Jiaxin Zhuang et.al.	2404.15580	null
2024-04-24	Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations?	Hossein Salami et.al.	2404.15578	null
2024-04-24	Retrieval Head Mechanistically Explains Long-Context Factuality	Wenhao Wu et.al.	2404.15574	link
2024-04-23	PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models	Shashi Kant Gupta et.al.	2404.15549	null
2024-04-23	BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis	Shuhang Lin et.al.	2404.15532	link
2024-04-23	Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models	Mihir Parmar et.al.	2404.15522	link
2024-04-23	Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval	Young Kyun Jang et.al.	2404.15516	null
2024-04-23	ToM-LM: Delegating Theory Of Mind Reasoning to External Symbolic Executors in Large Language Models	Weizhi Tang et.al.	2404.15515	null
2024-04-23	IryoNLP at MEDIQA-CORR 2024: Tackling the Medical Error Detection & Correction Task On the Shoulders of Medical Agents	Jean-Philippe Corbeil et.al.	2404.15488	link
2024-04-23	Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance	Het Patel et.al.	2404.15485	null
2024-04-23	Can Large Language Models Learn the Physics of Metamaterials? An Empirical Study with ChatGPT	Darui Lu et.al.	2404.15458	null
2024-04-23	XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference	João Monteiro et.al.	2404.15420	null
2024-04-23	Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs	Davide Caffagni et.al.	2404.15406	null
2024-04-23	Aligning LLM Agents by Learning Latent Preference from User Edits	Ge Gao et.al.	2404.15269	link
2024-04-23	XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts	Yifeng Ding et.al.	2404.15247	link
2024-04-23	CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies	Weiyan Shi et.al.	2404.15238	link
2024-04-23	Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models	Aidan Z. H. Yang et.al.	2404.15236	null
2024-04-23	Re-Thinking Inverse Graphics With Large Language Models	Peter Kulits et.al.	2404.15228	null
2024-04-23	Does Instruction Tuning Make LLMs More Consistent?	Constanza Fierro et.al.	2404.15206	null
2024-04-23	Setting up the Data Printer with Improved English to Ukrainian Machine Translation	Yurii Paniv et.al.	2404.15196	link
2024-04-23	Regressive Side Effects of Training Language Models to Mimic Student Misconceptions	Shashank Sonkar et.al.	2404.15156	null
2024-04-23	Bias patterns in the application of LLMs for clinical decision support: A comprehensive study	Raphael Poulain et.al.	2404.15149	link
2024-04-23	Rethinking LLM Memorization through the Lens of Adversarial Compression	Avi Schwarzschild et.al.	2404.15146	null
2024-04-23	MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning	Sunan He et.al.	2404.15127	null
2024-04-23	Identifying Fairness Issues in Automatically Generated Testing Content	Kevin Stowe et.al.	2404.15104	null
2024-04-23	Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation	Xun Wu et.al.	2404.15100	null
2024-04-23	Detection of circular permutations by Protein Language Models	Yue Hu et.al.	2404.15087	link
2024-04-23	Multi-Head Mixture-of-Experts	Xun Wu et.al.	2404.15045	null
2024-04-23	TAXI: Evaluating Categorical Knowledge Editing for Language Models	Derek Powell et.al.	2404.15004	link
2024-04-23	Transformers Can Represent $n$-gram Language Models	Anej Svete et.al.	2404.14994	null
2024-04-23	A Short Review for Ontology Learning from Text: Stride from Shallow Learning, Deep Learning to Large Language Models Trend	Rick Du et.al.	2404.14991	null
2024-04-23	$\texttt{MiniMol}$: A Parameter-Efficient Foundation Model for Molecular Learning	Kerstin Kläser et.al.	2404.14986	null
2024-04-23	Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case	Muhammad Asif Auyb et.al.	2404.14977	null
2024-04-22	AutoAD III: The Prequel – Back to the Pixels	Tengda Han et.al.	2404.14412	null
2024-04-22	SpaceByte: Towards Deleting Tokenization from Large Language Modeling	Kevin Slagle et.al.	2404.14408	link
2024-04-22	RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?	Adrian de Wynter et.al.	2404.14397	link
2024-04-22	SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation	Yuying Ge et.al.	2404.14396	link
2024-04-22	PARAMANU-GANITA: Language Model with Mathematical Capabilities	Mitodru Niyogi et.al.	2404.14395	null
2024-04-22	A Multimodal Automated Interpretability Agent	Tamar Rott Shaham et.al.	2404.14394	null
2024-04-22	A Survey on Self-Evolution of Large Language Models	Zhengwei Tao et.al.	2404.14387	link
2024-04-22	Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph	Xiaochen Kev Gao et.al.	2404.14372	link
2024-04-23	Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data	Fahim Tajwar et.al.	2404.14367	link
2024-04-22	Better Synthetic Data by Retrieving and Transforming Existing Datasets	Saumya Gandhi et.al.	2404.14361	link
2024-04-22	Rethinking Legal Compliance Automation: Opportunities with Large Language Models	Shabnam Hassani et.al.	2404.14356	null
2024-04-22	Calc-CMU at SemEval-2024 Task 7: Pre-Calc – Learning to Use the Calculator Improves Numeracy in Language Models	Vishruth Veerendranath et.al.	2404.14355	link
2024-04-22	Automated Long Answer Grading with RiceChem Dataset	Shashank Sonkar et.al.	2404.14316	link
2024-04-22	Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels	Jan-Philipp Fränken et.al.	2404.14313	link
2024-04-22	Explaining Arguments’ Strength: Unveiling the Role of Attacks and Supports (Technical Report)	Xiang Yin et.al.	2404.14304	null
2024-04-22	Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits	Shashank Sonkar et.al.	2404.14301	null
2024-04-22	Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach	Yao Wan et.al.	2404.14296	link
2024-04-22	A Survey on Efficient Inference for Large Language Models	Zixuan Zhou et.al.	2404.14294	null
2024-04-22	LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots	Dongge Han et.al.	2404.14285	null
2024-04-22	Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback	Wenyi Xiao et.al.	2404.14233	null
2024-04-19	MoVA: Adapting Mixture of Vision Experts to Multimodal Context	Zhuofan Zong et.al.	2404.13046	link
2024-04-19	Unified Scene Representation and Reconstruction for 3D Large Language Models	Tao Chu et.al.	2404.13044	null
2024-04-19	Data Alignment for Zero-Shot Concept Generation in Dermatology AI	Soham Gadgil et.al.	2404.13043	null
2024-04-19	Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs	Biyang Guo et.al.	2404.13033	link
2024-04-19	When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering	Stephen Choi et.al.	2404.13028	null
2024-04-19	Stronger Random Baselines for In-Context Learning	Gregory Yauney et.al.	2404.13020	link
2024-04-19	Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models	Chuofan Ma et.al.	2404.13013	null
2024-04-19	Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs	Clemencia Siro et.al.	2404.12994	link
2024-04-19	FineRec: Exploring Fine-grained Sequential Recommendation	Xiaokun Zhang et.al.	2404.12975	link
2024-04-19	Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models	Yian Li et.al.	2404.12966	null
2024-04-19	Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction	Qinyuan Wu et.al.	2404.12957	null
2024-04-19	Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models	Konstantinos Vilouras et.al.	2404.12920	null
2024-04-19	Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models	Zhenyang Ni et.al.	2404.12916	link
2024-04-19	Large Language Models for Networking: Workflow, Advances and Challenges	Chang Liu et.al.	2404.12901	null
2024-04-19	Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning	Ahmed Elshabrawy et.al.	2404.12897	null
2024-04-19	Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation	Guanhua Chen et.al.	2404.12879	null
2024-04-19	LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency	Zhaodonghui Li et.al.	2404.12872	link
2024-04-19	How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning?	Yang Luo et.al.	2404.12866	null
2024-04-19	Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation	Yilong Chen et.al.	2404.12861	null
2024-04-19	TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages	Aleksei Dorkin et.al.	2404.12845	null
2024-04-18	BLINK: Multimodal Large Language Models Can See but Not Perceive	Xingyu Fu et.al.	2404.12390	null
2024-04-18	Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models	Aitor Ormazabal et.al.	2404.12387	null
2024-04-18	MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale	Xiaotang Gai et.al.	2404.12372	null
2024-04-18	When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes	Asaf Yehudai et.al.	2404.12365	link
2024-04-18	From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function	Rafael Rafailov et.al.	2404.12358	null
2024-04-18	Towards a Foundation Model for Partial Differential Equation: Multi-Operator Learning and Extrapolation	Jingmin Sun et.al.	2404.12355	link
2024-04-18	V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning	Hang Hua et.al.	2404.12353	null
2024-04-18	Evaluating AI for Law: Bridging the Gap with Open-Source Solutions	Rohan Bhambhoria et.al.	2404.12349	null
2024-04-18	Large Language Models in Targeted Sentiment Analysis	Nicolay Rusnachenko et.al.	2404.12342	link
2024-04-18	Normative Requirements Operationalization with Large Language Models	Nick Feng et.al.	2404.12335	null
2024-04-18	Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment	Zhaofeng Wu et.al.	2404.12318	null
2024-04-18	Large Language Models for Synthetic Participatory Planning of Shared Automated Electric Mobility Systems	Jiangbo Yu et.al.	2404.12317	null
2024-04-18	Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair	Yusuke Sakai et.al.	2404.12299	null
2024-04-18	Augmenting emotion features in irony detection with Large language modeling	Yucheng Lin et.al.	2404.12291	null
2024-04-18	Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery	Yona Falinie A. Gaus et.al.	2404.12285	null
2024-04-18	Enhancing Embedding Performance through Large Language Model-based Text Enrichment and Rewriting	Nicholas Harris et.al.	2404.12283	null
2024-04-18	Advancing the Robustness of Large Language Models through Self-Denoised Smoothing	Jiabao Ji et.al.	2404.12274	link
2024-04-18	FedEval-LLM: Federated Evaluation of Large Language Models on Downstream Tasks with Collective Wisdom	Yuanqin He et.al.	2404.12273	null
2024-04-18	Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences	Shreya Shankar et.al.	2404.12272	null
2024-04-18	Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM	Michelle S. Lam et.al.	2404.12259	link
2024-04-17	Private federated discovery of out-of-vocabulary words for Gboard	Ziteng Sun et.al.	2404.11607	null
2024-04-17	VG4D: Vision-Language Model Goes 4D Video Recognition	Zhichao Deng et.al.	2404.11605	link
2024-04-17	A Deep Dive into Large Language Models for Automated Bug Localization and Repair	Soneya Binta Hossain et.al.	2404.11595	null
2024-04-17	Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding	Zezhong Fan et.al.	2404.11589	null
2024-04-17	LLMTune: Accelerate Database Knob Tuning with Large Language Models	Xinmei Huang et.al.	2404.11581	link
2024-04-17	On the Scalability of GNNs for Molecular Graphs	Maciej Sypetkowski et.al.	2404.11568	null
2024-04-17	MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation	Kuan-Chieh et.al.	2404.11565	null
2024-04-17	Quantifying Multilingual Performance of Large Language Models Across Languages	Zihao Li et.al.	2404.11553	null
2024-04-17	Evaluating Span Extraction in Generative Paradigm: A Reflection on Aspect-Based Sentiment Analysis	Soyoung Yang et.al.	2404.11539	null
2024-04-17	FedPFT: Federated Proxy Fine-Tuning of Foundation Models	Zhaopeng Peng et.al.	2404.11536	link
2024-04-17	Select and Reorder: A Novel Approach for Neural Sign Language Production	Harry Walsh et.al.	2404.11532	null
2024-04-17	Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization	Costas Mavromatis et.al.	2404.11531	link
2024-04-17	Embedding Privacy in Computational Social Science and Artificial Intelligence Research	Keenan Jones et.al.	2404.11515	null
2024-04-17	Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models	Yushuo Chen et.al.	2404.11502	link
2024-04-17	Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models	Yue Zhou et.al.	2404.11500	link
2024-04-18	Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent	Wei Chen et.al.	2404.11459	null
2024-04-17	Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models	Sunhao Dai et.al.	2404.11457	link
2024-04-17	AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts	Meng Jiang et.al.	2404.11449	null
2024-04-17	Open-Ended Wargames with Large Language Models	Daniel P. Hogan et.al.	2404.11446	link
2024-04-17	DUPE: Detection Undermining via Prompt Engineering for Deepfake Text	James Weichert et.al.	2404.11408	null
2024-04-16	Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback	Qiwei Di et.al.	2404.10776	null
2024-04-16	COMBO: Compositional World Models for Embodied Multi-Agent Cooperation	Hongxin Zhang et.al.	2404.10775	null
2024-04-16	Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification	Yu-Yang Li et.al.	2404.10757	link
2024-04-16	Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study	Shusheng Xu et.al.	2404.10719	null
2024-04-16	Dual Modalities of Text: Visual and Textual Generative Pre-training	Yekun Chai et.al.	2404.10710	null
2024-04-16	Question Difficulty Ranking for Multiple-Choice Reading Comprehension	Vatsal Raina et.al.	2404.10704	null
2024-04-16	An empirical study on code review activity prediction in practice	Doriane Olewicki et.al.	2404.10703	null
2024-04-16	Automating REST API Postman Test Cases Using LLM	S Deepika Sri et.al.	2404.10678	null
2024-04-16	Self-playing Adversarial Language Game Enhances LLM Reasoning	Pengyu Cheng et.al.	2404.10642	link
2024-04-16	HLAT: High-quality Large Language Model Pre-trained on AWS Trainium	Haozheng Fan et.al.	2404.10630	null
2024-04-16	Private Attribute Inference from Images with Vision-Language Models	Batuhan Tömekçe et.al.	2404.10618	null
2024-04-16	Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases	Yanze Li et.al.	2404.10595	null
2024-04-16	Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training	Masanori Hirano et.al.	2404.10555	null
2024-04-16	Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning	Xiao Wang et.al.	2404.10552	null
2024-04-16	Capturing the Macroscopic Behaviour of Molecular Dynamics with Membership Functions	Alexander Sikorski et.al.	2404.10523	null
2024-04-16	CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity	Moshe Berchansky et.al.	2404.10513	null
2024-04-16	White Men Lead, Black Women Help: Uncovering Gender, Racial, and Intersectional Bias in Language Agency	Yixin Wan et.al.	2404.10508	null
2024-04-16	Self-Supervised Visual Preference Alignment	Ke Zhu et.al.	2404.10501	link
2024-04-16	When Emotional Stimuli meet Prompt Designing: An Auto-Prompt Graphical Paradigm	Chenggian Ma et.al.	2404.10500	null
2024-04-16	Spiral of Silences: How is Large Language Model Killing Information Retrieval? – A Case Study on Open Domain Question Answering	Xiaoyang Chen et.al.	2404.10496	link
2024-04-15	KG-CTG: Citation Generation through Knowledge Graph-guided Large Language Models	Avinash Anand et.al.	2404.09763	null
2024-04-15	Resilience of Large Language Models for Noisy Instructions	Bin Wang et.al.	2404.09754	null
2024-04-15	Personalized Collaborative Fine-Tuning for On-Device Large Language Models	Nicolas Wagner et.al.	2404.09753	link
2024-04-15	AMPCliff: quantitative definition and benchmarking of activity cliffs in antimicrobial peptides	Kewei Li et.al.	2404.09738	link
2024-04-15	Quantization of Large Language Models with an Overdetermined Basis	Daniil Merkulov et.al.	2404.09737	null
2024-04-15	Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models	Ziwei Luo et.al.	2404.09732	link
2024-04-15	Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model	Hyunsoo Cho et.al.	2404.09717	null
2024-04-15	Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction	David Sobrín-Hidalgo et.al.	2404.09705	null
2024-04-15	Generative AI for Game Theory-based Mobile Networking	Long He et.al.	2404.09699	null
2024-04-15	Are Large Language Models Reliable Argument Quality Annotators?	Nailia Mirzakhmedova et.al.	2404.09696	null
2024-04-15	LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models	Guangyan Li et.al.	2404.09695	null
2024-04-15	Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation	Juhwan Choi et.al.	2404.09682	null
2024-04-15	Learn Your Reference Model for Real Good Alignment	Alexey Gorbatovski et.al.	2404.09656	null
2024-04-15	Do LLMs Understand Visual Anomalies? Uncovering LLM Capabilities in Zero-shot Anomaly Detection	Jiaqi Zhu et.al.	2404.09654	null
2024-04-15	Bridging Vision and Language Spaces with Assignment Prediction	Jungin Park et.al.	2404.09632	link
2024-04-15	AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception	Yipo Huang et.al.	2404.09624	link
2024-04-15	UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark	Zhaokun Zhou et.al.	2404.09619	null
2024-04-15	A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions	Pengfei Liu et.al.	2404.09606	link
2024-04-15	Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction	Zepeng Ding et.al.	2404.09593	null
2024-04-15	Modelling Language	Jumbly Grindrod et.al.	2404.09579	null
2024-04-15	Transformers, Contextualism, and Polysemy	Jumbly Grindrod et.al.	2404.09577	null
2024-04-15	Large language models and linguistic intentionality	Jumbly Grindrod et.al.	2404.09576	null
2024-04-12	Probing the 3D Awareness of Visual Foundation Models	Mohamed El Banani et.al.	2404.08636	link
2024-04-12	Pre-training Small Base LMs with Fewer Tokens	Sunny Sanyal et.al.	2404.08634	link
2024-04-12	FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models	Yanting Wang et.al.	2404.08631	link
2024-04-12	Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation	Yanhao Zheng et.al.	2404.08603	link
2024-04-12	Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts	Övgü Özdemir et.al.	2404.08589	link
2024-04-12	Pathological Primitive Segmentation Based on Visual Foundation Model with Zero-Shot Mask Generation	Abu Bakor Hayat Arnob et.al.	2404.08584	link
2024-04-12	FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation	Riza Velioglu et.al.	2404.08582	null
2024-04-12	Lossy Image Compression with Foundation Diffusion Models	Lucas Relic et.al.	2404.08580	null
2024-04-12	Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation	Hanlin Tian et.al.	2404.08570	null
2024-04-12	RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs	Shreyas Chaudhari et.al.	2404.08555	null
2024-04-12	Memory Traces: Are Transformers Tulving Machines?	Jean-Marie Chauvet et.al.	2404.08543	null
2024-04-12	Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward	Xuan Xie et.al.	2404.08517	null
2024-04-12	ChatGPT and general-purpose AI count fruits in pictures surprisingly well	Konlavach Mengsuwan et.al.	2404.08515	null
2024-04-12	Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction	Haoran Qiu et.al.	2404.08509	link
2024-04-12	LaSagnA: Language-based Segmentation Assistant for Complex Queries	Cong Wei et.al.	2404.08506	link
2024-04-12	Strategic Interactions between Large Language Models-based Agents in Beauty Contests	Siting Lu et.al.	2404.08492	null
2024-04-12	Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation	Haozhe Zhao et.al.	2404.08491	link
2024-04-12	Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian	Stefano De Paoli et.al.	2404.08488	null
2024-04-12	Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task	Hassan Ali et.al.	2404.08424	null
2024-04-12	Adapting the Segment Anything Model During Usage in Novel Situations	Robin Schön et.al.	2404.08421	null
2024-04-11	OpenBias: Open-set Bias Detection in Text-to-Image Generative Models	Moreno D’Incà et.al.	2404.07990	link
2024-04-11	Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding	Yiwen Tang et.al.	2404.07989	link
2024-04-11	Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning	Simon Schrodi et.al.	2404.07983	null
2024-04-11	Language Imbalance Can Boost Cross-lingual Generalisation	Anton Schäfer et.al.	2404.07982	link
2024-04-11	Manipulating Large Language Models to Increase Product Visibility	Aounon Kumar et.al.	2404.07981	link
2024-04-11	LLoCO: Learning Long Contexts Offline	Sijun Tan et.al.	2404.07979	link
2024-04-11	Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models	Haotian Zhang et.al.	2404.07973	null
2024-04-11	Rho-1: Not All Tokens Are What You Need	Zhenghao Lin et.al.	2404.07965	link
2024-04-11	On Unified Prompt Tuning for Request Quality Assurance in Public Code Review	Xinyu Chen et.al.	2404.07942	null
2024-04-11	Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation	Jinkyung Park et.al.	2404.07926	null
2024-04-11	LaVy: Vietnamese Multimodal Large Language Model	Chi Tran et.al.	2404.07922	link
2024-04-11	AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs	Zeyi Liao et.al.	2404.07921	link
2024-04-11	DesignQA: A Multimodal Benchmark for Evaluating Large Language Models’ Understanding of Engineering Documentation	Anna C. Doris et.al.	2404.07917	link
2024-04-11	HGRN2: Gated Linear RNNs with State Expansion	Zhen Qin et.al.	2404.07904	link
2024-04-11	High-Dimension Human Value Representation in Large Language Models	Samuel Cahyawijaya et.al.	2404.07900	null
2024-04-11	Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations	Dayeon Ki et.al.	2404.07851	link
2024-04-11	On Training Data Influence of GPT Models	Qingyi Liu et.al.	2404.07840	link
2024-04-11	RecurrentGemma: Moving Past Transformers for Efficient Open Language Models	Aleksandar Botev et.al.	2404.07839	link
2024-04-11	Streamlined Photoacoustic Image Processing with Foundation Models: A Training-Free Solution	Handi Deng et.al.	2404.07833	null
2024-04-11	Heron-Bench: A Benchmark for Evaluating Vision Language Models in Japanese	Yuichi Inoue et.al.	2404.07824	link
2024-04-10	BRAVE: Broadening the visual encoding of vision-language models	Oğuzhan Fatih Kar et.al.	2404.07204	null
2024-04-10	UMBRAE: Unified Multimodal Decoding of Brain Signals	Weihao Xia et.al.	2404.07202	null
2024-04-10	Scaling Laws for Data Filtering – Data Curation cannot be Compute Agnostic	Sachin Goyal et.al.	2404.07177	link
2024-04-10	Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention	Tsendsuren Munkhdalai et.al.	2404.07143	null
2024-04-10	Open reaction-diffusion systems: bridging probabilistic theory across scales	Mauricio J. del Razo et.al.	2404.07119	null
2024-04-10	Continuous Language Model Interpolation for Dynamic and Controllable Text Generation	Sara Kangaslahti et.al.	2404.07117	link
2024-04-11	From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications	Yongqiang Ma et.al.	2404.07108	null
2024-04-10	Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs	Bowen Jin et.al.	2404.07103	link
2024-04-10	Dynamic Generation of Personalities with Large Language Models	Jianzhi Liu et.al.	2404.07084	link
2024-04-10	VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning	Alexandros Xenos et.al.	2404.07078	link
2024-04-10	Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?	Mingyu Jin et.al.	2404.07066	link
2024-04-10	Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study	Alessandro Stolfo et.al.	2404.07060	null
2024-04-10	Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation	Elisa Sanchez-Bayona et.al.	2404.07053	link
2024-04-10	ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling	Ege Özsoy et.al.	2404.07031	null
2024-04-10	Improving Language Model Reasoning with Self-motivated Learning	Yunlong Feng et.al.	2404.07017	null
2024-04-10	A Mathematical Theory for Learning Semantic Languages by Abstract Learners	Kuo-Yu Liao et.al.	2404.07009	null
2024-04-10	WordDecipher: Enhancing Digital Workspace Communication with Explainable AI for Non-native English Speakers	Yuexi Chen et.al.	2404.07005	null
2024-04-10	LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models	Igor Tufanov et.al.	2404.07004	null
2024-04-10	Event Grounded Criminal Court View Generation with Cooperative (Large) Language Models	Linan Yue et.al.	2404.07001	link
2024-04-10	Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study	Hongru Du et.al.	2404.06962	link
2024-04-09	InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD	Xiaoyi Dong et.al.	2404.06512	link
2024-04-09	Can Feedback Enhance Semantic Grounding in Large Vision-Language Models?	Yuan-Hong Liao et.al.	2404.06510	null
2024-04-09	On the Effect of (Near) Duplicate Subwords in Language Modelling	Anton Schäfer et.al.	2404.06508	link
2024-04-09	Pitfalls of Conversational LLMs on News Debiasing	Ipek Baris Schlicht et.al.	2404.06488	null
2024-04-10	Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks	Chonghua Wang et.al.	2404.06480	link
2024-04-10	Text-Based Reasoning About Vector Graphics	Zhenhailong Wang et.al.	2404.06479	null
2024-04-09	Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models	Zihan Fang et.al.	2404.06448	null
2024-04-09	Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems	Kunal Garg et.al.	2404.06413	null
2024-04-09	AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents	Luca Gioacchini et.al.	2404.06411	link
2024-04-09	Take a Look at it! Rethinking How to Evaluate Language Model Jailbreak	Hongyu Cai et.al.	2404.06407	link
2024-04-09	Apprentices to Research Assistants: Advancing Research with Large Language Models	M. Namvarpour et.al.	2404.06404	null
2024-04-09	MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies	Shengding Hu et.al.	2404.06395	link
2024-04-09	MuPT: A Generative Symbolic Music Pretrained Transformer	Xingwei Qu et.al.	2404.06393	null
2024-04-09	Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis	Mikel Zubillaga et.al.	2404.06392	null
2024-04-09	Latent Distance Guided Alignment Training for Large Language Models	Haotian Luo et.al.	2404.06390	null
2024-04-09	Model Generation from Requirements with LLMs: an Exploratory Study	Alessio Ferrari et.al.	2404.06371	null
2024-04-09	Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python	Valdecy Pereira et.al.	2404.06370	link
2024-04-09	VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs	Yi Gui et.al.	2404.06369	null
2024-04-09	ClinLinker: Medical Entity Linking of Clinical Concept Mentions in Spanish	Fernando Gallego et.al.	2404.06367	null
2024-04-09	Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation	Sidra Aleem et.al.	2404.06362	link
2024-04-08	MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding	Bo He et.al.	2404.05726	link
2024-04-08	Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs	Keen You et.al.	2404.05719	null
2024-04-08	Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding	Ahmad Idrissi-Yaghir et.al.	2404.05694	null
2024-04-08	Evaluating Mathematical Reasoning Beyond Accuracy	Shijie Xia et.al.	2404.05692	link
2024-04-08	Retrieval-Augmented Open-Vocabulary Object Detection	Jooyeon Kim et.al.	2404.05687	link
2024-04-08	MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation	Kunpeng Song et.al.	2404.05674	link
2024-04-08	CoReS: Orchestrating the Dance of Reasoning and Segmentation	Xiaoyi Bao et.al.	2404.05673	null
2024-04-08	Fighting crime with Transformers: Empirical analysis of address parsing methods in payment data	Haitham Hammami et.al.	2404.05632	link
2024-04-08	LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking	Faren Yan et.al.	2404.05624	null
2024-04-08	MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning	Matteo Farina et.al.	2404.05621	link
2024-04-08	SpeechAlign: Aligning Speech Generation to Human Preferences	Dong Zhang et.al.	2404.05600	link
2024-04-08	MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering	Iñigo Alonso et.al.	2404.05590	null
2024-04-08	Enhancing Software Related Information Extraction with Generative Language Models through Single-Choice Question Answering	Wolfgang Otto et.al.	2404.05587	null
2024-04-08	Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model	Yue-Hua Han et.al.	2404.05583	null
2024-04-08	360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System	Shen Gao et.al.	2404.05569	null
2024-04-08	Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models	Bowen Pan et.al.	2404.05567	null
2024-04-08	Chinese Sequence Labeling with Semi-Supervised Boundary-Aware Language Model Pre-training	Longhui Zhang et.al.	2404.05560	link
2024-04-08	Evaluating Interventional Reasoning Capabilities of Large Language Models	Tejas Kasetty et.al.	2404.05545	null
2024-04-08	OPSD: an Offensive Persian Social media Dataset and its baseline evaluations	Mehran Safayani et.al.	2404.05540	null
2024-04-08	Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data	Tim Baumgärtner et.al.	2404.05530	null
2024-04-05	Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)	Michael Saxon et.al.	2404.04251	link
2024-04-05	Physical Property Understanding from Language-Embedded Feature Fields	Albert J. Zhai et.al.	2404.04242	null
2024-04-05	Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents	Harsh Kohli et.al.	2404.04237	null
2024-04-05	player2vec: A Language Modeling Approach to Understand Player Behavior in Games	Tianze Wang et.al.	2404.04234	null
2024-04-05	Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation	Ji-Jia Wu et.al.	2404.04231	link
2024-04-05	Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation	Tong Su et.al.	2404.04212	null
2024-04-05	Social Skill Training with Large Language Models	Diyi Yang et.al.	2404.04204	null
2024-04-05	Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text?	Ilya Ilyankou et.al.	2404.04169	null
2024-04-05	Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model	Xinrun Du et.al.	2404.04167	null
2024-04-05	Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval	João Coelho et.al.	2404.04163	null
2024-04-05	BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models	Jacek Wiland et.al.	2404.04113	link
2024-04-05	Large language models as oracles for instantiating ontologies with domain-specific knowledge	Giovanni Ciatto et.al.	2404.04108	link
2024-04-05	Robust Preference Optimization with Provable Noise Tolerance for LLMs	Xize Liang et.al.	2404.04102	null
2024-04-05	Label Propagation for Zero-shot Classification with Vision-Language Models	Vladan Stojnić et.al.	2404.04072	link
2024-04-05	Assessing the quality of information extraction	Filip Seitl et.al.	2404.04068	null
2024-04-05	CLUE: A Clinical Language Understanding Evaluation for LLMs	Amin Dada et.al.	2404.04067	link
2024-04-05	VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots	Akhil Padmanabha et.al.	2404.04066	null
2024-04-05	A Comparison of Methods for Evaluating Generative IR	Negar Arabzadeh et.al.	2404.04044	link
2024-04-05	Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer	Hele-Andra Kuulmets et.al.	2404.04042	null
2024-04-05	Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds	Annerose Eichel et.al.	2404.04031	null
2024-04-04	OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views	Francis Engelmann et.al.	2404.03650	null
2024-04-04	AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent	Hanyu Lai et.al.	2404.03648	link
2024-04-04	Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra	Darioush Kevian et.al.	2404.03647	null
2024-04-04	Locating and Editing Factual Associations in Mamba	Arnab Sen Sharma et.al.	2404.03646	link
2024-04-04	Training LLMs over Neurally Compressed Text	Brian Lester et.al.	2404.03626	null
2024-04-04	Standardizing Knowledge Engineering Practices with a Reference Architecture	Bradley P. Allen et.al.	2404.03624	null
2024-04-04	Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph	Marco Bronzini et.al.	2404.03623	null
2024-04-04	Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models	Wenshan Wu et.al.	2404.03622	null
2024-04-04	DeViDe: Faceted medical knowledge for improved medical vision-language pre-training	Haozhe Luo et.al.	2404.03618	null
2024-04-04	Sailor: Open Language Models for South-East Asia	Longxu Dou et.al.	2404.03608	link
2024-04-04	Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization	Aniruddha Nrusimha et.al.	2404.03605	link
2024-04-04	Evaluating LLMs at Detecting Errors in LLM Responses	Ryo Kamoi et.al.	2404.03602	link
2024-04-04	Intent Detection and Entity Extraction from BioMedical Literature	Ankan Mullick et.al.	2404.03598	link
2024-04-04	ReFT: Representation Finetuning for Language Models	Zhengxuan Wu et.al.	2404.03592	link
2024-04-04	SemGrasp: Semantic Grasp Generation via Language Aligned Discretization	Kailin Li et.al.	2404.03590	null
2024-04-04	Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models	Yantao Liu et.al.	2404.03577	link
2024-04-04	Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity	Jake Varley et.al.	2404.03570	null
2024-04-04	Personalized LLM Response Generation with Parameterized Memory Injection	Kai Zhang et.al.	2404.03565	null
2024-04-04	Select and Summarize: Scene Saliency for Movie Script Summarization	Rohit Saxena et.al.	2404.03561	null
2024-04-04	How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes	Harmon Bhasin et.al.	2404.03558	link
2024-04-03	ALOHa: A New Measure for Hallucination in Captioning Models	Suzanne Petryk et.al.	2404.02904	null
2024-04-03	MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment	Duygu Ceylan et.al.	2404.02899	null
2024-04-03	ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline	Yifan Xu et.al.	2404.02893	link
2024-04-03	MODNO: Multi Operator Learning With Distributed Neural Operators	Zecheng Zhang et.al.	2404.02892	null
2024-04-03	Linear Attention Sequence Parallelism	Weigao Sun et.al.	2404.02882	link
2024-04-03	Integrating Explanations in Learning LTL Specifications from Demonstrations	Ashutosh Gupta et.al.	2404.02872	null
2024-04-03	Toward Inference-optimal Mixture-of-Expert Large Language Models	Longfei Yun et.al.	2404.02852	null
2024-04-03	I-Design: Personalized LLM Interior Designer	Ata Çelen et.al.	2404.02838	null
2024-04-03	Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models	Wanyun Cui et.al.	2404.02837	null
2024-04-03	Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison	Maxime Bouthors et.al.	2404.02835	null
2024-04-03	Empowering Biomedical Discovery with AI Agents	Shanghua Gao et.al.	2404.02831	null
2024-04-03	BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models	Qijun Luo et.al.	2404.02827	link
2024-04-03	Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models	Haoran Sun et.al.	2404.02823	link
2024-04-03	A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches	Zhigen Zhao et.al.	2404.02817	null
2024-04-03	The RealHumanEval: Evaluating Large Language Models’ Abilities to Support Programmers	Hussein Mozannar et.al.	2404.02806	link
2024-04-03	Efficient Multi-Vector Dense Retrieval Using Bit Vectors	Franco Maria Nardini et.al.	2404.02805	link
2024-04-03	AI and personalized learning: bridging the gap with modern educational goals	Kristjan-Julius Laak et.al.	2404.02798	null
2024-04-03	CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech	Jaehyeon Kim et.al.	2404.02781	null
2024-04-03	FPT: Feature Prompt Tuning for Few-shot Readability Assessment	Ziyang Wang et.al.	2404.02772	link
2024-04-03	DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement	Hao Wu et.al.	2404.02755	null
2024-04-02	Segment Any 3D Object with Language	Seungjun Lee et.al.	2404.02157	null
2024-04-02	Iterated Learning Improves Compositionality in Large Vision-Language Models	Chenhao Zheng et.al.	2404.02145	null
2024-04-02	Topic-based Watermarks for LLM-Generated Text	Alexander Nemecek et.al.	2404.02138	null
2024-04-02	ViTamin: Designing Scalable Vision Models in the Vision-Language Era	Jieneng Chen et.al.	2404.02132	link
2024-04-02	FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning	Joel Niklaus et.al.	2404.02127	link
2024-04-02	Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models	Wanyong Feng et.al.	2404.02124	link
2024-04-02	GINopic: Topic Modeling with Graph Isomorphism Network	Suman Adhya et.al.	2404.02115	link
2024-04-02	CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems	Sara Rosenthal et.al.	2404.02103	link
2024-04-02	Advancing LLM Reasoning Generalists with Preference Trees	Lifan Yuan et.al.	2404.02078	link
2024-04-02	Red-Teaming Segment Anything Model	Krzysztof Jankowski et.al.	2404.02067	link
2024-04-02	Digital Forgetting in Large Language Models: A Survey of Unlearning Methods	Alberto Blanco-Justicia et.al.	2404.02062	null
2024-04-02	Long-context LLMs Struggle with Long In-context Learning	Tianle Li et.al.	2404.02060	link
2024-04-02	IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT	Junchen Fu et.al.	2404.02059	link
2024-04-02	Deconstructing In-Context Learning: Understanding Prompts via Corruption	Namrata Shivagunde et.al.	2404.02054	link
2024-04-02	A Survey on Large Language Model-Based Game Agents	Sihao Hu et.al.	2404.02039	link
2024-04-02	MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages	Daryna Dementieva et.al.	2404.02037	null
2024-04-02	Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts	Zhuo Chen et.al.	2404.02022	null
2024-04-02	Large Language Models for Orchestrating Bimanual Robots	Kun Chu et.al.	2404.02018	null
2024-04-02	MuxServe: Flexible Multiplexing for Efficient Multiple LLM Serving	Jiangfei Duan et.al.	2404.02015	null
2024-04-02	Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models	Stephan Linzbach et.al.	2404.01992	null
2024-03-29	Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models	Atsuyuki Miyai et.al.	2403.20331	link
2024-03-29	Are We on the Right Way for Evaluating Large Vision-Language Models?	Lin Chen et.al.	2403.20330	link
2024-03-29	ReALM: Reference Resolution As Language Modeling	Joel Ruben Antony Moniz et.al.	2403.20329	null
2024-03-29	Gecko: Versatile Text Embeddings Distilled from Large Language Models	Jinhyuk Lee et.al.	2403.20327	null
2024-03-29	Convolutional Prompting meets Language Models for Continual Learning	Anurag Roy et.al.	2403.20317	null
2024-03-29	Learn “No” to Say “Yes” Better: Improving Vision-Language Models via Negations	Jaisidh Singh et.al.	2403.20312	link
2024-03-29	Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference	Jovan Stojkovic et.al.	2403.20306	null
2024-03-29	Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain	Burcu Sayin et.al.	2403.20288	link
2024-03-29	LUQ: Long-text Uncertainty Quantification for LLMs	Caiqi Zhang et.al.	2403.20279	null
2024-04-01	Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want	Weifeng Lin et.al.	2403.20271	link
2024-03-29	Latxa: An Open Language Model and Evaluation Suite for Basque	Julen Etxaniz et.al.	2403.20266	link
2024-03-29	ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models	Thibaut Thonet et.al.	2403.20262	null
2024-03-29	MedCLIP-SAM: Bridging Text and Image Towards Universal Medical Image Segmentation	Taha Koleilat et.al.	2403.20253	null
2024-03-29	Using LLMs to Model the Beliefs and Preferences of Targeted Populations	Keiichi Namikoshi et.al.	2403.20252	null
2024-03-29	Long-Tailed Anomaly Detection with Learnable Class Names	Chih-Hui Ho et.al.	2403.20236	null
2024-03-29	H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model	Chao Pang et.al.	2403.20213	link
2024-03-29	Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science	Yazheng Yang et.al.	2403.20208	null
2024-03-29	The Future of Combating Rumors? Retrieval, Discrimination, and Generation	Junhao Xu et.al.	2403.20204	null
2024-03-29	ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models	Shuo Liu et.al.	2403.20194	null
2024-03-29	HARMamba: Efficient Wearable Sensor Human Activity Recognition Based on Bidirectional Selective SSM	Shuangjian Li et.al.	2403.20183	null
2024-03-28	RSMamba: Remote Sensing Image Classification with State Space Model	Keyan Chen et.al.	2403.19654	link
2024-03-28	InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction	Sirui Xu et.al.	2403.19652	null
2024-03-28	MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions	Kai Zhang et.al.	2403.19651	null
2024-03-28	Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models	Samuel Marks et.al.	2403.19647	link
2024-03-28	Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning	Chenyang Liu et.al.	2403.19646	link
2024-03-28	Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models	Yucheng Shi et.al.	2403.19631	null
2024-03-28	RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents	Zeren Chen et.al.	2403.19622	null
2024-03-28	SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects	Avinash Ummadisingu et.al.	2403.19607	null
2024-03-28	Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation	Zhongliang Zhou et.al.	2403.19584	null
2024-03-28	Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics	Norman Di Palo et.al.	2403.19578	null
2024-03-28	WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models	Piotr Molenda et.al.	2403.19548	null
2024-03-28	Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models	Ang Lv et.al.	2403.19521	link
2024-03-28	Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data	Shan Chen et.al.	2403.19511	link
2024-03-28	LLMs as Academic Reading Companions: Extending HCI Through Synthetic Personae	Celia Chen et.al.	2403.19506	null
2024-03-28	Evolving Assembly Code in an Adversarial Environment	Irina Maliukov et.al.	2403.19489	null
2024-03-28	JDocQA: Japanese Document Question Answering Dataset for Generative Language Models	Eri Onami et.al.	2403.19454	link
2024-03-28	Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model	Qi Gou et.al.	2403.19443	null
2024-03-28	OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion	Xinyu Zhan et.al.	2403.19417	null
2024-03-28	BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation	Yuhong He et.al.	2403.19414	null
2024-03-28	Checkpoint Merging via Bayesian Optimization in LLM Pretraining	Deyuan Liu et.al.	2403.19390	null
2024-03-27	Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models	Yanwei Li et.al.	2403.18814	link
2024-03-27	ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation	Suraj Patni et.al.	2403.18807	link
2024-03-27	Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation	Mateusz Klimaszewski et.al.	2403.18804	null
2024-03-27	Projective Methods for Mitigating Gender Bias in Pre-trained Language Models	Hillary Dawkins et.al.	2403.18803	link
2024-03-27	Long-form factuality in large language models	Jerry Wei et.al.	2403.18802	link
2024-03-27	Towards a World-English Language Model for On-Device Virtual Assistants	Rricha Jalota et.al.	2403.18783	null
2024-03-27	3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation	Ehsan Latif et.al.	2403.18778	null
2024-03-27	ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object	Chenshuang Zhang et.al.	2403.18775	link
2024-03-27	CheckEval: Robust Evaluation Framework using Large Language Model via Checklist	Yukyung Lee et.al.	2403.18771	null
2024-03-27	MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model	Yike Wu et.al.	2403.18760	link
2024-03-27	CYCLE: Learning to Self-Refine the Code Generation	Yangruibo Ding et.al.	2403.18746	link
2024-03-27	Understanding the Learning Dynamics of Alignment with Human Feedback	Shawn Im et.al.	2403.18742	link
2024-03-27	PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations	Ehsan Latif et.al.	2403.18721	null
2024-03-27	Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding	Xintong Wang et.al.	2403.18715	null
2024-03-27	The Invalsi Benchmark: measuring Language Models Mathematical and Language understanding in Italian	Andrea Esuli et.al.	2403.18697	null
2024-03-27	NL-ITI: Optimizing Probing and Intervention for Improvement of ITI Method	Jakub Hoscilowicz et.al.	2403.18680	link
2024-03-27	An Exploratory Study on Upper-Level Computing Students’ Use of Large Language Models as Tools in a Semester-Long Project	Ben Arie Tanay et.al.	2403.18679	null
2024-03-27	SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens	Chengbo Liu et.al.	2403.18647	link
2024-03-27	To Recommend or Not: Recommendability Identification in Conversations with Pre-trained Language Models	Zhefan Wang et.al.	2403.18628	link
2024-03-27	Vulnerability Detection with Code Language Models: How Far Are We?	Yangruibo Ding et.al.	2403.18624	link
2024-03-26	OmniVid: A Generative Framework for Universal Video Understanding	Junke Wang et.al.	2403.17935	link
2024-03-26	Track Everything Everywhere Fast and Robustly	Yunzhou Song et.al.	2403.17931	null
2024-03-26	MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution	Wei Tao et.al.	2403.17927	null
2024-03-26	LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning	Rui Pan et.al.	2403.17919	link
2024-03-26	Large scale paired antibody language models	Henry Kenlay et.al.	2403.17889	null
2024-03-26	Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation	Carlos Gomes et.al.	2403.17886	null
2024-03-26	MIND Your Language: A Multilingual Dataset for Cross-lingual News Recommendation	Andreea Iana et.al.	2403.17876	link
2024-03-26	Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach	Andrea Ferrario et.al.	2403.17873	null
2024-03-26	Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications	Philip Lippmann et.al.	2403.17860	null
2024-03-26	ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages	Bhawna Piryani et.al.	2403.17859	link
2024-03-26	Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs	David R. Mortensen et.al.	2403.17856	null
2024-03-26	ArabicaQA: A Comprehensive Dataset for Arabic Question Answering	Abdelrahman Abdallah et.al.	2403.17848	link
2024-03-26	Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation	Abdelrhman Werby et.al.	2403.17846	null
2024-03-26	Mechanistic Design and Scaling of Hybrid Architectures	Michael Poli et.al.	2403.17844	null
2024-03-26	ReMamber: Referring Image Segmentation with Mamba Twister	Yuhuan Yang et.al.	2403.17839	null
2024-03-26	A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities	Ibrahim Ethem Hamamci et.al.	2403.17834	link
2024-03-26	Assessment of Multimodal Large Language Models in Alignment with Human Values	Zhelun Shi et.al.	2403.17830	null
2024-03-26	Accelerating Radio Spectrum Regulation Workflows with Large Language Models (LLMs)	Amir Ghasemi et.al.	2403.17819	null
2024-03-26	Graph Language Model (GLM): A new graph-based approach to detect social instabilities	Wallyson Lemes de Oliveira et.al.	2403.17816	null
2024-03-26	Are Compressed Language Models Less Subgroup Robust?	Leonidas Gee et.al.	2403.17811	link
2024-03-25	Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making	Shuai Ma et.al.	2403.16812	null
2024-03-25	An LLM-Based Digital Twin for Optimizing Human-in-the Loop Systems	Hanqing Yang et.al.	2403.16809	link
2024-03-25	Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback	Zhangqian Bi et.al.	2403.16792	null
2024-03-25	All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification	Deepak Narayan Gadde et.al.	2403.16750	null
2024-03-25	A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models	Nils Ingelhag et.al.	2403.16730	null
2024-03-25	ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search	Zehan Li et.al.	2403.16702	link
2024-03-25	Synapse: Learning Preferential Concepts from Visual Demonstrations	Sadanand Modak et.al.	2403.16689	null
2024-03-25	Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography	Jiayue Zhang et.al.	2403.16687	null
2024-03-25	RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict	Yirong Zeng et.al.	2403.16662	link
2024-03-25	Grammatical vs Spelling Error Correction: An Investigation into the Responsiveness of Transformer-based Language Models using BART and MarianMT	Rohit Raju et.al.	2403.16655	null
2024-03-25	CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment	Feiteng Fang et.al.	2403.16649	link
2024-03-25	Virtual Co-Pilot: Multimodal Large Language Model-enabled Quick-access Procedures for Single Pilot Operations	Fan Li et.al.	2403.16645	null
2024-03-25	Semantically Enriched Cross-Lingual Sentence Embeddings for Crisis-related Social Media Texts	Rabindra Lamsal et.al.	2403.16614	null
2024-03-25	Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units	Biswesh Mohapatra et.al.	2403.16609	null
2024-03-25	TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques	Ashok Urlana et.al.	2403.16592	null
2024-03-25	Can Large Language Models (or Humans) Distill Text?	Nicolas Audinet de Pieuchon et.al.	2403.16584	null
2024-03-25	NSINA: A News Corpus for Sinhala	Hansi Hettiarachchi et.al.	2403.16571	link
2024-03-25	Elysium: Exploring Object-level Perception in Videos via MLLM	Han Wang et.al.	2403.16558	link
2024-03-25	DOrA: 3D Visual Grounding with Order-Aware Referring	Tung-Yu Wu et.al.	2403.16539	null
2024-03-25	Open-Set Recognition in the Age of Vision-Language Models	Dimity Miller et.al.	2403.16528	null
2024-03-25	Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art	Neeloy Chakraborty et.al.	2403.16527	null
2024-03-25	Harnessing the power of LLMs for normative reasoning in MASs	Bastin Tony Roy Savarimuthu et.al.	2403.16524	null
2024-03-25	Norm Violation Detection in Multi-Agent Systems using Large Language Models: A Pilot Study	Shawn He et.al.	2403.16517	null
2024-03-25	Linguistically Differentiating Acts and Recalls of Racial Microaggressions on Social Media	Uma Sushmitha Gunturi et.al.	2403.16514	null
2024-03-22	LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models	Yuzhang Shang et.al.	2403.15388	null
2024-03-22	Long-CLIP: Unlocking the Long-Text Capability of CLIP	Beichen Zhang et.al.	2403.15378	link
2024-03-22	InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding	Yi Wang et.al.	2403.15377	link
2024-03-22	Can large language models explore in-context?	Akshay Krishnamurthy et.al.	2403.15371	null
2024-03-22	CoLLEGe: Concept Embedding Generation for Large Language Models	Ryan Teehan et.al.	2403.15362	null
2024-03-22	Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities	Zhitong Xiong et.al.	2403.15356	link
2024-03-22	Controlled Training Data Generation with Diffusion Models	Teresa Yeo et.al.	2403.15309	null
2024-03-22	Sphere Neural-Networks for Rational Reasoning	Tiansi Dong et.al.	2403.15297	null
2024-03-22	Measuring Gender and Racial Biases in Large Language Models	Jiafu An et.al.	2403.15281	null
2024-03-22	Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review	Jinge Wang et.al.	2403.15274	null
2024-03-22	Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs	Xiaobin Zhang et.al.	2403.15273	null
2024-03-22	Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models	Huanxuan Liao et.al.	2403.15268	link
2024-03-22	AI Exposure and Strategic Positioning on an Online Work Platform	Shun Yiu et.al.	2403.15262	null
2024-03-22	FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions	Orion Weller et.al.	2403.15246	link
2024-03-22	Shadow Generation for Composite Image Using Diffusion model	Qingyang Liu et.al.	2403.15234	link
2024-03-22	An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets	Jonathan Katzy et.al.	2403.15230	link
2024-03-22	Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models	Qiong Wu et.al.	2403.15226	null
2024-03-22	Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations	Pranav Kulkarni et.al.	2403.15218	link
2024-03-22	InstaSynth: Opportunities and Challenges in Generating Synthetic Instagram Data with ChatGPT for Sponsored Content Detection	Thales Bertaglia et.al.	2403.15214	link
2024-03-22	MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection	Taeheon Kim et.al.	2403.15209	null
2024-03-21	MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?	Renrui Zhang et.al.	2403.14624	null
2024-03-21	Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey	Zeyu Han et.al.	2403.14608	null
2024-03-21	MyVLM: Personalizing VLMs for User-Specific Queries	Yuval Alaluf et.al.	2403.14599	null
2024-03-21	ReAct Meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training	Zonghan Yang et.al.	2403.14589	null
2024-03-21	Large Language Models for Multi-Choice Question Classification of Medical Subjects	Víctor Ponce-López et.al.	2403.14582	null
2024-03-21	RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain	William James Bolton et.al.	2403.14578	link
2024-03-21	A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students’ Formative Assessment Responses in Science	Clayton Cohn et.al.	2403.14565	null
2024-03-21	The Era of Semantic Decoding	Maxime Peyrard et.al.	2403.14562	null
2024-03-21	Lexicon-Level Contrastive Visual-Grounding Improves Language Modeling	Chengxu Zhuang et.al.	2403.14551	null
2024-03-21	EDT: Improving Large Language Models’ Generation by Entropy-based Dynamic Temperature Sampling	Shimao Zhang et.al.	2403.14541	link
2024-03-21	Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference	Han Zhao et.al.	2403.14520	null
2024-03-21	The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs)	Joschka Haltaufderheide et.al.	2403.14473	null
2024-03-21	Detoxifying Large Language Models via Knowledge Editing	Mengru Wang et.al.	2403.14472	link
2024-03-21	ChatGPT Alternative Solutions: Large Language Models Survey	Hanieh Alipour et.al.	2403.14469	null
2024-03-21	Recourse for reclamation: Chatting with generative language models	Jennifer Chien et.al.	2403.14467	null
2024-03-21	Towards Single-System Illusion in Software-Defined Vehicles – Automated, AI-Powered Workflow	Krzysztof Lebioda et.al.	2403.14460	null
2024-03-21	Multi-Level Explanations for Generative Language Models	Lucas Monteiro Paes et.al.	2403.14459	null
2024-03-21	gTBLS: Generating Tables from Text by Conditional Question Answering	Anirudh Sundar et.al.	2403.14457	null
2024-03-21	Language Models Can Reduce Asymmetry in Information Markets	Nasim Rahaman et.al.	2403.14443	null
2024-03-21	A Multimodal Approach to Device-Directed Speech Detection with Large Language Models	Dominik Wager et.al.	2403.14438	null
2024-03-20	RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition	Ziyu Liu et.al.	2403.13805	link
2024-03-20	Learning from Models and Data for Visual Grounding	Ruozhen He et.al.	2403.13804	null
2024-03-20	Reverse Training to Nurse the Reversal Curse	Olga Golovneva et.al.	2403.13799	null
2024-03-20	Bridge the Modality and Capacity Gaps in Vision-Language Model Selection	Chao Yi et.al.	2403.13797	null
2024-03-20	RewardBench: Evaluating Reward Models for Language Modeling	Nathan Lambert et.al.	2403.13787	link
2024-03-20	Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts	Guangzeng Han et.al.	2403.13786	link
2024-03-20	Information-Theoretic Distillation for Reference-less Summarization	Jaehun Jung et.al.	2403.13780	null
2024-03-20	Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation	Hugues Thomas et.al.	2403.13777	null
2024-03-20	Describe-and-Dissect: Interpreting Neurons in Vision Networks with Language Models	Nicholas Bai et.al.	2403.13771	link
2024-03-20	Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model	Diwei Wang et.al.	2403.13756	null
2024-03-20	Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement	Catherine Arnett et.al.	2403.13754	null
2024-03-20	EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation	Atnafu Lambebo Tonja et.al.	2403.13737	null
2024-03-20	Large Language Models meet Network Slicing Management and Orchestration	Abdulhalim Dandoush et.al.	2403.13721	null
2024-03-20	SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning	Hongjun Wang et.al.	2403.13684	null
2024-03-20	PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents	Mitodru Niyogi et.al.	2403.13681	null
2024-03-20	RoleInteract: Evaluating the Social Interaction of Role-Playing Agents	Hongzhan Chen et.al.	2403.13679	link
2024-03-20	Grounding Spatial Relations in Text-Only Language Models	Gorka Azkune et.al.	2403.13666	link
2024-03-20	Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese	Meet Doshi et.al.	2403.13638	null
2024-03-20	VL-Mamba: Exploring State Space Models for Multimodal Learning	Yanyuan Qiao et.al.	2403.13600	null
2024-03-20	No more optimization rules: LLM-enabled policy-based multi-modal query optimizer (version 1)	Yifan Wang et.al.	2403.13597	null
2024-03-19	LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression	Zhuoshi Pan et.al.	2403.12968	link
2024-03-19	Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models	Zuyan Liu et.al.	2403.12966	link
2024-03-19	Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models	Ce Zhang et.al.	2403.12964	link
2024-03-19	Dated Data: Tracing Knowledge Cutoffs in Large Language Models	Jeffrey Cheng et.al.	2403.12958	null
2024-03-19	Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models	Elaine Sui et.al.	2403.12952	link
2024-03-19	Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models	Joana Ribeiro de Faria et.al.	2403.12936	null
2024-03-19	Segment Anything for comprehensive analysis of grapevine cluster architecture and berry properties	Efrain Torres-Lomas et.al.	2403.12935	null
2024-03-19	Rapid AIdeation: Generating Ideas With the Self and in Collaboration With Large Language Models	Gionnieve Lim et.al.	2403.12928	null
2024-03-19	Supporting Energy Policy Research with Large Language Models	Grant Buster et.al.	2403.12924	null
2024-03-19	Contextual AD Narration with Interleaved Multimodal Sequence	Hanlin Wang et.al.	2403.12922	null
2024-03-19	Semantic Layering in Room Segmentation via LLMs	Taehyeon Kim et.al.	2403.12920	null
2024-03-19	Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts	Sai Ashish Somayajula et.al.	2403.12918	link
2024-03-19	Yell At Your Robot: Improving On-the-Fly from Language Corrections	Lucy Xiaoyang Shi et.al.	2403.12910	null
2024-03-19	Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference	Baolin Li et.al.	2403.12900	null
2024-03-19	mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding	Anwen Hu et.al.	2403.12895	link
2024-03-20	MEDBind: Unifying Language and Multimodal Medical Data Embeddings	Yuan Gao et.al.	2403.12894	null
2024-03-19	HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning	Fucai Ke et.al.	2403.12884	null
2024-03-19	Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models	Zehui Chen et.al.	2403.12881	link
2024-03-19	Epistemology of Language Models: Do Language Models Have Holistic Knowledge?	Minsu Kim et.al.	2403.12862	null
2024-03-19	RASP: A Drone-based Reconfigurable Actuation and Sensing Platform Towards Ambient Intelligent Systems	Minghui Zhao et.al.	2403.12853	null
2024-03-18	Modality-Agnostic fMRI Decoding of Vision and Language	Mitja Nikolaus et.al.	2403.11771	null
2024-03-18	Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs	M. Jehanzeb Mirza et.al.	2403.11755	link
2024-03-18	Revisiting The Classics: A Study on Identifying and Rectifying Gender Stereotypes in Rhymes and Poems	Aditya Narayan Sankaran et.al.	2403.11752	null
2024-03-18	Embedded Named Entity Recognition using Probing Classifiers	Nicholas Popovič et.al.	2403.11747	null
2024-03-18	TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models	Lisa Weijler et.al.	2403.11691	null
2024-03-18	HDLdebugger: Streamlining HDL debugging with Large Language Models	Xufeng Yao et.al.	2403.11671	null
2024-03-18	Prioritized Semantic Learning for Zero-shot Instance Navigation	Xander Sun et.al.	2403.11650	null
2024-03-18	Arc2Face: A Foundation Model of Human Faces	Foivos Paraperas Papantoniou et.al.	2403.11641	link
2024-03-18	Compositional Kronecker Context Optimization for Vision-Language Models	Kun Ding et.al.	2403.11631	null
2024-03-18	Let’s Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model	Haoyun Xu et.al.	2403.11621	null
2024-03-18	CRS-Diff: Controllable Generative Remote Sensing Foundation Model	Datao Tang et.al.	2403.11614	link
2024-03-18	Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines	Ekaterina Trofimova et.al.	2403.11585	null
2024-03-18	Reinforcement Learning with Token-level Feedback for Controllable Text Generation	Wendi Li et.al.	2403.11558	link
2024-03-18	LLM^3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning	Shu Wang et.al.	2403.11552	link
2024-03-18	Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters	Jiazuo Yu et.al.	2403.11549	link
2024-03-18	DEE: Dual-stage Explainable Evaluation Method for Text Generation	Shenyu Zhang et.al.	2403.11509	null
2024-03-18	Do CLIPs Always Generalize Better than ImageNet Models?	Qizhou Wang et.al.	2403.11497	null
2024-03-18	VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding	Yue Fan et.al.	2403.11481	null
2024-03-18	HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models	Huy Nghiem et.al.	2403.11456	link
2024-03-18	Zero-shot Compound Expression Recognition with Visual Language Model at the 6th ABAW Challenge	Jiahe Wang et.al.	2403.11450	null
2024-03-18	LLM Guided Evolution - The Automation of Models Advancing Models	Clint Morris et.al.	2403.11446	null
2024-03-18	StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation	Jinpeng Li et.al.	2403.11439	null
2024-03-18	InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions	Yifan Wang et.al.	2403.11435	null
2024-03-18	A Novel Paradigm Boosting Translation Capabilities of Large Language Models	Jiaxin Guo et.al.	2403.11430	null
2024-03-15	VideoAgent: Long-form Video Understanding with Large Language Model as Agent	Xiaohan Wang et.al.	2403.10517	null
2024-03-15	Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization	Ratnadira Widyasari et.al.	2403.10507	null
2024-03-15	ATOM: Asynchronous Training of Massive Models for Deep Learning in a Decentralized Environment	Xiaofeng Wu et.al.	2403.10504	null
2024-03-15	Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study	Chenguang Wang et.al.	2403.10499	link
2024-03-15	Reconfigurable Robot Identification from Motion Data	Yuhang Hu et.al.	2403.10496	null
2024-03-15	Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst?	Bruno de Melo et.al.	2403.10482	null
2024-03-15	Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases	Jiarui Li et.al.	2403.10446	link
2024-03-15	Optimal Block-Level Draft Verification for Accelerating Speculative Decoding	Ziteng Sun et.al.	2403.10444	null
2024-03-15	Using an LLM to Turn Sign Spottings into Spoken Language Sentences	Ozge Mercanoglu Sincan et.al.	2403.10434	null
2024-03-15	SocialGenPod: Privacy-Friendly Generative AI Social Web Applications with Decentralised Personal Data Stores	Vidminas Vizgirda et.al.	2403.10408	link
2024-03-15	A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE	Hervé Déjean et.al.	2403.10407	null
2024-03-15	Monotonic Representation of Numeric Properties in Language Models	Benjamin Heinzerling et.al.	2403.10381	link
2024-03-15	EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models	Rocktim Jyoti Das et.al.	2403.10378	link
2024-03-15	TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale	Pengcheng Jiang et.al.	2403.10351	null
2024-03-15	Investigating grammatical abstraction in language models using few-shot learning of novel noun gender	Priyanka Sukumaran et.al.	2403.10338	null
2024-03-15	CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model	Shang-Hsuan Chiang et.al.	2403.10326	link
2024-03-15	NetBench: A Large-Scale and Comprehensive Network Traffic Benchmark Dataset for Foundation Models	Chen Qian et.al.	2403.10319	link
2024-03-15	Uni-SMART: Universal Science Multimodal Analysis and Research Transformer	Hengxing Cai et.al.	2403.10301	null
2024-03-15	Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models	Tian Meng et.al.	2403.10287	null
2024-03-15	Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning	Shang-Hsuan Chiang et.al.	2403.10281	link
2024-03-14	GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping	Yuhang Zheng et.al.	2403.09637	link
2024-03-14	Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference	Piotr Nawrot et.al.	2403.09636	null
2024-03-14	Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models	Akhil Kedia et.al.	2403.09635	link
2024-03-14	OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning	Lingyi Hong et.al.	2403.09634	null
2024-03-14	3D-VLA: A 3D Vision-Language-Action Generative World Model	Haoyu Zhen et.al.	2403.09631	null
2024-03-14	Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking	Eric Zelikman et.al.	2403.09629	link
2024-03-14	Explore In-Context Segmentation via Latent Diffusion Models	Chaoyang Wang et.al.	2403.09616	null
2024-03-14	MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training	Brandon McKinzie et.al.	2403.09611	null
2024-03-14	Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey	Xiaoyu Liu et.al.	2403.09606	null
2024-03-14	Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis	Gregory Coppola et.al.	2403.09599	null
2024-03-14	Renovating Names in Open-Vocabulary Segmentation Benchmarks	Haiwen Huang et.al.	2403.09593	null
2024-03-14	ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models	Runyu Ma et.al.	2403.09583	null
2024-03-14	Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation	Yunhao Gou et.al.	2403.09572	null
2024-03-14	Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models	Laura Fernández-Becerra et.al.	2403.09567	null
2024-03-14	Welcome Your New AI Teammate: On Safety Analysis by Leashing Large Language Models	Ali Nouri et.al.	2403.09565	null
2024-03-14	PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps	Ruixuan Liu et.al.	2403.09562	null
2024-03-14	Less is More: Data Value Estimation for Visual Instruction Tuning	Zikang Liu et.al.	2403.09559	null
2024-03-15	Logits of API-Protected LLMs Leak Proprietary Information	Matthew Finlayson et.al.	2403.09539	null
2024-03-14	VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding	Chris Kelly et.al.	2403.09530	null
2024-03-15	WavCraft: Audio Editing and Generation with Natural Language Prompts	Jinhua Liang et.al.	2403.09527	link
2024-03-13	Simple and Scalable Strategies to Continually Pre-train Large Language Models	Adam Ibrahim et.al.	2403.08763	link
2024-03-13	Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework	Jingling Li et.al.	2403.08743	null
2024-03-13	The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models	Carlo Nicolini et.al.	2403.08739	null
2024-03-13	ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation	Sayar Ghosh Roy et.al.	2403.08737	link
2024-03-13	Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization	Renjie Pi et.al.	2403.08730	null
2024-03-14	SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents	Ruiyi Wang et.al.	2403.08715	link
2024-03-13	Review of Generative AI Methods in Cybersecurity	Yagmur Yigit et.al.	2403.08701	null
2024-03-13	TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning	Shangding Gu et.al.	2403.08694	null
2024-03-13	Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages	Rik van Noord et.al.	2403.08693	null
2024-03-13	Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records	Erlend Frayling et.al.	2403.08664	null
2024-03-13	Self-Supervised Learning for Covariance Estimation	Tzvi Diskin et.al.	2403.08662	null
2024-03-13	Human Alignment of Large Language Models through Online Preference Optimisation	Daniele Calandriello et.al.	2403.08635	null
2024-03-13	MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models	Subash Neupane et.al.	2403.08607	null
2024-03-13	Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation	Daniel Honerkamp et.al.	2403.08605	link
2024-03-13	DevBench: A Comprehensive Benchmark for Software Development	Bowen Li et.al.	2403.08604	link
2024-03-13	Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments	Sitao Cheng et.al.	2403.08593	null
2024-03-13	Non-discrimination Criteria for Generative Language Models	Sara Sterlie et.al.	2403.08564	null
2024-03-13	AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models	Yifei Gao et.al.	2403.08542	null
2024-03-13	Language models scale reliably with over-training and on downstream tasks	Samir Yitzhak Gadre et.al.	2403.08540	link
2024-03-13	Masked Generative Story Transformer with Character Guidance and Caption Augmentation	Christos Papadimitriou et.al.	2403.08502	link
2024-03-12	Beyond Text: Frozen Large Language Models in Visual Signal Comprehension	Lei Zhu et.al.	2403.07874	link
2024-03-12	Rethinking Generative Large Language Model Evaluation for Semantic Comprehension	Fangyun Wei et.al.	2403.07872	null
2024-03-12	Exploring Safety Generalization Challenges of Large Language Models via Code	Qibing Ren et.al.	2403.07865	null
2024-03-12	Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation	Shihao Zhao et.al.	2403.07860	link
2024-03-12	MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric	Haokun Lin et.al.	2403.07839	null
2024-03-12	DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies	William Xie et.al.	2403.07832	null
2024-03-12	The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing	Jianchen Wang et.al.	2403.07825	null
2024-03-12	Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM	Sainbayar Sukhbaatar et.al.	2403.07816	null
2024-03-12	Chronos: Learning the Language of Time Series	Abdul Fatir Ansari et.al.	2403.07815	link
2024-03-12	Beyond Memorization: The Challenge of Random Memory Access in Language Models	Tongyao Zhu et.al.	2403.07805	link
2024-03-12	Fine-tuning Large Language Models with Sequential Instructions	Hanxu Hu et.al.	2403.07794	link
2024-03-12	Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations	Carlos Jose Xavier Cruz et.al.	2403.07769	link
2024-03-12	Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings	Sahand Sharifzadeh et.al.	2403.07750	null
2024-03-12	FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models	Yan Liu et.al.	2403.07747	null
2024-03-12	Multi-modal Auto-regressive Modeling via Visual Words	Tianshuo Peng et.al.	2403.07720	link
2024-03-12	WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?	Alexandre Drouin et.al.	2403.07718	link
2024-03-12	StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models	Zhicheng Guo et.al.	2403.07714	link
2024-03-12	Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards	Wei Shen et.al.	2403.07708	null
2024-03-12	Large, Small or Both: A Novel Data Augmentation Framework Based on Language Models for Debiasing Opinion Summarization	Yanyue Zhang et.al.	2403.07693	null
2024-03-12	Reference-free Monolithic Preference Optimization with Odds Ratio	Jiwoo Hong et.al.	2403.07691	link
2024-03-11	Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena	Leonie Weissweiler et.al.	2403.06965	null
2024-03-11	Materials science in the era of large language models: a perspective	Ge Lei et.al.	2403.06949	null
2024-03-11	Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation	Xinyao Li et.al.	2403.06946	link
2024-03-11	Naming, Describing, and Quantifying Visual Objects in Humans and LLMs	Alberto Testoni et.al.	2403.06935	link
2024-03-11	ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis	Yanming Liu et.al.	2403.06932	link
2024-03-11	MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning	Yichuan Li et.al.	2403.06914	link
2024-03-11	Application of Quantum Tensor Networks for Protein Classification	Debarshi Kundu et.al.	2403.06890	null
2024-03-11	Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents	Nishchal Prasad et.al.	2403.06872	link
2024-03-11	Semantic Residual Prompts for Continual Learning	Martin Menabue et.al.	2403.06870	null
2024-03-11	Learning with Noisy Foundation Models	Hao Chen et.al.	2403.06869	null
2024-03-11	A Geospatial Approach to Predicting Desert Locust Breeding Grounds in Africa	Ibrahim Salihu Yusuf et.al.	2403.06860	null
2024-03-11	Development of a Reliable and Accessible Caregiving Language Model (CaLM)	Bambang Parmanto et.al.	2403.06857	null
2024-03-11	DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation	Guosheng Zhao et.al.	2403.06845	null
2024-03-11	RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback	Yanming Liu et.al.	2403.06840	link
2024-03-11	ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts	Lyuye Zhang et.al.	2403.06838	null
2024-03-11	Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?	Egor Zverev et.al.	2403.06833	link
2024-03-11	The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework	Zhuo Chen et.al.	2403.06832	link
2024-03-11	ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model	Zhiwei Liu et.al.	2403.06765	link
2024-03-11	An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models	Liang Chen et.al.	2403.06764	link
2024-03-11	ALaRM: Align Language Models via Hierarchical Rewards Modeling	Yuhang Lai et.al.	2403.06754	null
2024-03-08	Bayesian Preference Elicitation with Language Models	Kunal Handa et.al.	2403.05534	null
2024-03-08	Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	Machel Reid et.al.	2403.05530	null
2024-03-08	GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM	Hao Kang et.al.	2403.05527	link
2024-03-08	DeepSeek-VL: Towards Real-World Vision-Language Understanding	Haoyu Lu et.al.	2403.05525	link
2024-03-08	Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola	Yijiang Li et.al.	2403.05523	null
2024-03-08	Authorship Attribution in Bangla Literature (AABL) via Transfer Learning using ULMFiT	Aisha Khatun et.al.	2403.05519	null
2024-03-08	Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought	James Chua et.al.	2403.05518	link
2024-03-08	To Err Is Human, but Llamas Can Learn It Too	Agnes Luhtaru et.al.	2403.05493	null
2024-03-08	Will GPT-4 Run DOOM?	Adrian de Wynter et.al.	2403.05468	null
2024-03-08	Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs	Arijit Nag et.al.	2403.05434	null
2024-03-08	Towards Real-World Stickers Use: A New Dataset for Multi-Tag Sticker Recognition	Bingbing Wang et.al.	2403.05428	null
2024-03-08	FedFMS: Exploring Federated Foundation Models for Medical Image Segmentation	Yuxi Liu et.al.	2403.05408	link
2024-03-08	Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery	Xavier Bou et.al.	2403.05381	link
2024-03-08	VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model	Junsu Kim et.al.	2403.05346	null
2024-03-08	Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings	Wei Zhou et.al.	2403.05338	null
2024-03-08	ChatASU: Evoking LLM’s Reflexion to Truly Understand Aspect Sentiment in Dialogues	Yiding Liu et.al.	2403.05326	null
2024-03-08	RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation	Zihao Wang et.al.	2403.05313	null
2024-03-08	Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents	Jinyang Li et.al.	2403.05307	null
2024-03-08	ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications	Sotaro Takeshita et.al.	2403.05303	link
2024-03-08	Modeling Dynamic (De)Allocations of Local Memory for Translation Validation	Abhishek Rose et.al.	2403.05302	null
2024-03-07	iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries	Adam Coscia et.al.	2403.04760	link
2024-03-07	KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts	Adam Coscia et.al.	2403.04758	link
2024-03-07	LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error	Boshi Wang et.al.	2403.04746	link
2024-03-08	How Far Are We from Intelligent Visual Deductive Reasoning?	Yizhe Zhang et.al.	2403.04732	link
2024-03-07	Common 7B Language Models Already Possess Strong Math Capabilities	Chen Li et.al.	2403.04706	null
2024-03-07	ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes	Hashmat Shadab Malik et.al.	2403.04701	link
2024-03-07	Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification	Ekaterina Fadeeva et.al.	2403.04696	null
2024-03-07	Telecom Language Models: Must They Be Large?	Nicola Piovesan et.al.	2403.04666	null
2024-03-07	Yi: Open Foundation Models by 01.AI	01.AI et.al.	2403.04652	link
2024-03-07	Teaching Large Language Models to Reason with Reinforcement Learning	Alex Havrilla et.al.	2403.04642	null
2024-03-07	CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios	Qilang Ye et.al.	2403.04640	link
2024-03-07	A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds	Xuenan Xu et.al.	2403.04594	null
2024-03-07	Embodied Understanding of Driving Scenarios	Yunsong Zhou et.al.	2403.04593	link
2024-03-07	Wiki-TabNER: Advancing Table Interpretation Through Named Entity Recognition	Aneta Koleva et.al.	2403.04577	link
2024-03-07	Reducing self-supervised learning complexity improves weakly-supervised classification performance in computational pathology	Tim Lenz et.al.	2403.04558	null
2024-03-07	Enhancing Data Quality in Federated Fine-Tuning of Foundation Models	Wanru Zhao et.al.	2403.04529	null
2024-03-07	Where does In-context Translation Happen in Large Language Models	Suzanna Sia et.al.	2403.04510	null
2024-03-07	GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability	Zihan Luo et.al.	2403.04483	link
2024-03-08	Do Large Language Model Understand Multi-Intent Spoken Language?	Shangjian Yin et.al.	2403.04481	link
2024-03-08	Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset	Minjin Kim et.al.	2403.04460	null
2024-03-06	Backtracing: Retrieving the Cause of the Query	Rose E. Wang et.al.	2403.03956	link
2024-03-06	Bridging Language and Items for Retrieval and Recommendation	Yupeng Hou et.al.	2403.03952	link
2024-03-06	The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models	Adithya Bhaskar et.al.	2403.03942	link
2024-03-06	Did Translation Models Get More Robust Without Anyone Even Noticing?	Ben Peters et.al.	2403.03923	null
2024-03-06	Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing	Asmita et.al.	2403.03897	link
2024-03-06	IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators	Indraneil Paul et.al.	2403.03894	link
2024-03-06	From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models	Luiza Pozzobon et.al.	2403.03893	link
2024-03-06	FaaF: Facts as a Function for the evaluation of RAG systems	Vasileios Katranidis et.al.	2403.03888	link
2024-03-06	SaulLM-7B: A pioneering Large Language Model for Law	Pierre Colombo et.al.	2403.03883	null
2024-03-06	Learning to Decode Collaboratively with Multiple Language Models	Shannon Zejiang Shen et.al.	2403.03870	link
2024-03-06	On the Origins of Linear Representations in Large Language Models	Yibo Jiang et.al.	2403.03867	null
2024-03-06	KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions	Fangyuan Xu et.al.	2403.03866	null
2024-03-06	Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning	Deepanway Ghosal et.al.	2403.03864	link
2024-03-06	X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification	Hanzi Xu et.al.	2403.03863	link
2024-03-06	Designing Informative Metrics for Few-Shot Example Selection	Rishabh Adiga et.al.	2403.03861	null
2024-03-06	Emojinize: Enriching Any Text with Emoji Translations	Lars Henning Klein et.al.	2403.03857	null
2024-03-06	ShortGPT: Layers in Large Language Models are More Redundant Than You Expect	Xin Men et.al.	2403.03853	null
2024-03-06	Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ	Carolin Holtermann et.al.	2403.03814	link
2024-03-06	Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery	Wei Zhang et.al.	2403.03790	null
2024-03-06	PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion	Zekai Zhang et.al.	2403.03788	link
2024-03-05	The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning	Nathaniel Li et.al.	2403.03218	null
2024-03-05	CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments	Savitha Sam Abraham et.al.	2403.03203	null
2024-03-05	Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement	Rafaela Martelo et.al.	2403.03188	link
2024-03-05	Reliable, Adaptable, and Attributable Language Models with Retrieval	Akari Asai et.al.	2403.03187	null
2024-03-05	MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting	Fangchen Liu et.al.	2403.03174	null
2024-03-05	SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection	Peng Qi et.al.	2403.03170	null
2024-03-05	PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset	Arda Uzunoğlu et.al.	2403.03167	link
2024-03-05	Quantum Many-Body Physics Calculations with Large Language Models	Haining Pan et.al.	2403.03154	null
2024-03-05	Language Guided Exploration for RL Agents in Text Environments	Hitesh Golchha et.al.	2403.03141	null
2024-03-05	CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following	Kaiyan Zhang et.al.	2403.03129	null
2024-03-05	Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution	Flor Miriam Plaza-del-Arco et.al.	2403.03121	null
2024-03-05	“In Dialogues We Learn”: Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning	Chuanqi Cheng et.al.	2403.03102	null
2024-03-05	KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents	Yuqi Zhu et.al.	2403.03101	link
2024-03-05	Learning to Use Tools via Cooperative and Interactive Agents	Zhengliang Shi et.al.	2403.03031	null
2024-03-05	Socratic Reasoning Improves Positive Text Rewriting	Anmol Goel et.al.	2403.03029	null
2024-03-05	Word Importance Explains How Prompts Affect Language Model Outputs	Stefan Hackmann et.al.	2403.03028	null
2024-03-05	OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following	Haochen Shi et.al.	2403.03017	null
2024-03-05	Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations	Hasan Abu-Rasheed et.al.	2403.03008	null
2024-03-05	Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models	Gen Luo et.al.	2403.03003	link
2024-03-05	Localized Zeroth-Order Prompt Optimization	Wenyang Hu et.al.	2403.02993	null
2024-03-02	LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems	Tasnim Ahmed et.al.	2403.01342	null
2024-03-02	Making Hybrid Languages: A Recipe	Leif Andersen et.al.	2403.01335	null
2024-03-02	Chaining thoughts and LLMs to learn DNA structural biophysics	Tyler D. Ross et.al.	2403.01332	link
2024-03-02	VBART: The Turkish LLM	Meliksah Turker et.al.	2403.01308	null
2024-03-02	ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation	Moran Yanuka et.al.	2403.01306	null
2024-03-02	Improving the Validity of Automatically Generated Feedback via Reinforcement Learning	Alexander Scarlatos et.al.	2403.01304	link
2024-03-02	NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention	Tianyi Zhang et.al.	2403.01273	link
2024-03-02	Employing LLMs for Incident Response Planning and Review	Sam Hays et.al.	2403.01271	null
2024-03-02	Dissecting Language Models: Machine Unlearning via Selective Pruning	Nicholas Pochinkov et.al.	2403.01267	null
2024-03-02	Accelerating Greedy Coordinate Gradient via Probe Sampling	Yiran Zhao et.al.	2403.01251	link
2024-03-02	SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code	Ziniu Hu et.al.	2403.01248	null
2024-03-02	Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal	Jianheng Huang et.al.	2403.01244	null
2024-03-02	IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact	Ruikang Liu et.al.	2403.01241	null
2024-03-02	Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy	Jamie Hayes et.al.	2403.01218	null
2024-03-02	API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access	Jiayuan Su et.al.	2403.01216	null
2024-03-02	Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning	Shuo Yang et.al.	2403.01209	null
2024-03-02	The Case for Animal-Friendly AI	Sankalpa Ghose et.al.	2403.01199	null
2024-03-02	DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling	Shanghaoran Quan et.al.	2403.01197	link
2024-03-02	RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots	Philip Feldman et.al.	2403.01193	null
2024-03-02	Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding	Ha-Thanh Nguyen et.al.	2403.01185	null
2024-02-29	The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?	Alex Gu et.al.	2402.19475	null
2024-02-29	The All-Seeing Project V2: Towards General Relation Comprehension of the Open World	Weiyun Wang et.al.	2402.19474	link
2024-02-29	Retrieval-Augmented Generation for AI-Generated Content: A Survey	Penghao Zhao et.al.	2402.19473	link
2024-02-29	Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling	Gabriel Grand et.al.	2402.19471	null
2024-03-01	TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning	Kate Sanders et.al.	2402.19467	null
2024-02-29	Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models	Chen Qian et.al.	2402.19465	link
2024-02-29	Curiosity-driven Red-teaming for Large Language Models	Zhang-Wei Hong et.al.	2402.19464	link
2024-02-29	Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap	Saurabh Srivastava et.al.	2402.19450	link
2024-02-29	Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models	Frederik Kunstner et.al.	2402.19449	null
2024-02-29	ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL	Yifei Zhou et.al.	2402.19446	link
2024-02-29	Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation	Jonathan Yang et.al.	2402.19432	null
2024-02-29	Compositional API Recommendation for Library-Oriented Code Generation	Zexiong Ma et.al.	2402.19431	null
2024-02-29	Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models	Soham De et.al.	2402.19427	null
2024-02-29	Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines	Lijia Ma et.al.	2402.19421	null
2024-02-29	PaECTER: Patent-level Representation Learning using Citation-informed Transformers	Mainak Ghosh et.al.	2402.19411	null
2024-02-29	On the Scaling Laws of Geographical Representation in Language Models	Nathan Godey et.al.	2402.19406	null
2024-02-29	Entity-Aware Multimodal Alignment Framework for News Image Captioning	Junzhe Zhang et.al.	2402.19404	null
2024-02-29	Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Match Human Crowd Accuracy	Philipp Schoenegger et.al.	2402.19379	null
2024-02-29	OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models	Jenish Maharjan et.al.	2402.19371	null
2024-02-29	SoK: Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency	Akila Wickramasekara et.al.	2402.19366	null
2024-02-28	Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards	Haoxiang Wang et.al.	2402.18571	link
2024-02-28	Diffusion Language Models Are Versatile Protein Learners	Xinyou Wang et.al.	2402.18567	null
2024-02-28	A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic	Gregory Coppola et.al.	2402.18566	null
2024-02-28	Approaching Human-Level Forecasting with Language Models	Danny Halawi et.al.	2402.18563	null
2024-02-28	Implicit Bias of Next-Token Prediction	Christos Thrampoulidis et.al.	2402.18551	null
2024-02-28	Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling	Mahdi Karami et.al.	2402.18508	null
2024-02-28	Few-Shot Fairness: Unveiling LLM’s Potential for Fairness-Aware Classification	Garima Chhikara et.al.	2402.18502	null
2024-02-28	Language Models Represent Beliefs of Self and Others	Wentao Zhu et.al.	2402.18496	null
2024-02-28	IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding	Lanyun Zhu et.al.	2402.18476	null
2024-02-28	Meta-Task Prompting Elicits Embedding from Large Language Models	Yibin Lei et.al.	2402.18458	null
2024-02-28	Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization	Deng Li et.al.	2402.18447	null
2024-02-28	Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication	Weize Chen et.al.	2402.18439	link
2024-02-28	A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models	Xiujie Song et.al.	2402.18409	null
2024-02-28	Balanced Similarity with Auxiliary Prompts: Towards Alleviating Text-to-Image Retrieval Bias for CLIP in Zero-shot Learning	Hanyao Wang et.al.	2402.18400	null
2024-02-28	Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models	Ercong Nie et.al.	2402.18397	null
2024-02-28	The First Place Solution of WSDM Cup 2024: Leveraging Large Language Models for Conversational Multi-Doc QA	Yiming Li et.al.	2402.18385	link
2024-02-28	Large Language Models As Evolution Strategies	Robert Tjarko Lange et.al.	2402.18381	null
2024-02-28	Tokenization Is More Than Compression	Craig W. Schmidt et.al.	2402.18376	null
2024-02-28	VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models	Seoyeon Kim et.al.	2402.18374	null
2024-02-28	Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning	Jiachun Li et.al.	2402.18344	null
2024-02-27	ShapeLLM: Universal 3D Object Understanding for Embodied Interaction	Zekun Qi et.al.	2402.17766	link
2024-02-27	The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits	Shuming Ma et.al.	2402.17764	null
2024-02-27	Massive Activations in Large Language Models	Mingjie Sun et.al.	2402.17762	link
2024-02-27	Towards Optimal Learning of Language Models	Yuxian Gu et.al.	2402.17759	null
2024-02-27	Evaluating Very Long-Term Conversational Memory of LLM Agents	Adyasha Maharana et.al.	2402.17753	null
2024-02-27	Tower: An Open Multilingual Large Language Model for Translation-Related Tasks	Duarte M. Alves et.al.	2402.17733	link
2024-02-27	AmbigNLG: Addressing Task Ambiguity in Instruction for NLG	Ayana Niwa et.al.	2402.17717	null
2024-02-27	Case-Based or Rule-Based: How Do Transformers Do the Math?	Yi Hu et.al.	2402.17709	link
2024-02-27	RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations	Jing Huang et.al.	2402.17700	link
2024-02-27	NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents	Tamara Czinczoll et.al.	2402.17682	link
2024-02-27	The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks	Ashwin Prasad Shivarpatna Venkatesh et.al.	2402.17679	null
2024-02-27	CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention	Mohammad Sadil Khan et.al.	2402.17678	null
2024-02-27	Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models	Yunpeng Huang et.al.	2402.17671	null
2024-02-27	Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs	Tanise Ceron et.al.	2402.17649	null
2024-02-27	SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation	Shuangrui Ding et.al.	2402.17645	link
2024-02-27	Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data	Xiao Liu et.al.	2402.17644	link
2024-02-27	Variational Learning is Effective for Large Deep Networks	Yuesong Shen et.al.	2402.17641	link
2024-02-27	Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling	David S. W. Williams et.al.	2402.17622	null
2024-02-27	Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization	Wenqi Zhang et.al.	2402.17574	link
2024-02-27	Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers	Xinyu Tang et.al.	2402.17564	link
2024-02-26	Integrating Large Language Models with Graphical Session-Based Recommendation	Naicheng Guo et.al.	2402.16539	null
2024-02-26	LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments	Junzhe Chen et.al.	2402.16499	null
2024-02-26	On Languaging a Simulation Engine	Han Liu et.al.	2402.16482	null
2024-02-26	Unveiling ChatGPT’s Usage in Open Source Projects: A Mining-based Study	Rosalia Tufano et.al.	2402.16480	null
2024-02-26	mEdIT: Multilingual Text Editing via Instruction Tuning	Vipul Raheja et.al.	2402.16472	link
2024-02-26	Unveiling Vulnerability of Self-Attention	Khai Jiet Liong et.al.	2402.16470	link
2024-02-26	Defending LLMs against Jailbreaking Attacks via Backtranslation	Yihan Wang et.al.	2402.16459	link
2024-02-26	ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing	Liuzhenghao Lv et.al.	2402.16445	link
2024-02-26	ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors	Zhexin Zhang et.al.	2402.16444	link
2024-02-26	Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models	Tianyi Tang et.al.	2402.16438	null
2024-02-26	RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions	Yuansen Zhang et.al.	2402.16431	null
2024-02-26	Predicting Sustainable Development Goals Using Course Descriptions – from LLMs to Conventional Foundation Models	Lev Kharlashkin et.al.	2402.16420	null
2024-02-26	From RAGs to riches: Using large language models to write documents for clinical trials	Nigel Markey et.al.	2402.16406	null
2024-02-26	MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property	Shiwen Ni et.al.	2402.16389	link
2024-02-26	Immunization against harmful fine-tuning attacks	Domenic Rosati et.al.	2402.16382	null
2024-02-26	Improving LLM-based Machine Translation with Systematic Self-Correction	Zhaopeng Feng et.al.	2402.16379	link
2024-02-26	Unraveling Babel: Exploring Multilingual Activation Patterns within Large Language Models	Weize Liu et.al.	2402.16367	null
2024-02-26	LLM Inference Unveiled: Survey and Roofline Model Insights	Zhihang Yuan et.al.	2402.16363	link
2024-02-26	Layer-wise Regularized Dropout for Neural Language Models	Shiwen Ni et.al.	2402.16361	null
2024-02-26	An Integrated Data Processing Framework for Pretraining Foundation Models	Yiding Sun et.al.	2402.16358	link
2024-02-26	Language-guided Skill Learning with Temporal Variational Inference	Haotian Fu et.al.	2402.16354	null
2024-02-23	AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning	Jianguo Zhang et.al.	2402.15506	link
2024-02-23	API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs	Kinjal Basu et.al.	2402.15491	null
2024-02-23	Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models	Yiran Liu et.al.	2402.15481	null
2024-02-23	Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization	Swaroop Nath et.al.	2402.15473	link
2024-02-23	Repetition Improves Language Model Embeddings	Jacob Mitchell Springer et.al.	2402.15449	link
2024-02-23	A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models	Stefan Hegselmann et.al.	2402.15422	link
2024-02-23	PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning	Simon Holk et.al.	2402.15420	null
2024-02-23	Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy?	Nader Asadi et.al.	2402.15414	null
2024-02-23	Grasp, See and Place: Efficient Unknown Object Rearrangement with Policy Structure Prior	Kechun Xu et.al.	2402.15402	link
2024-02-23	Explorations of Self-Repair in Language Models	Cody Rushing et.al.	2402.15390	link
2024-02-23	Safe Task Planning for Language-Instructed Multi-Robot Systems using Conformal Prediction	Jun Wang et.al.	2402.15368	null
2024-02-23	Farsight: Fostering Responsible AI Awareness During AI Application Prototyping	Zijie J. Wang et.al.	2402.15350	link
2024-02-23	NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data	Sergei Bogdanov et.al.	2402.15343	link
2024-02-23	Ranking Entities along Conceptual Space Dimensions with LLMs: An Analysis of Fine-Tuning Strategies	Nitesh Kumar et.al.	2402.15337	null
2024-02-23	GPTVQ: The Blessing of Dimensionality for LLM Quantization	Mart van Baalen et.al.	2402.15319	null
2024-02-23	ArabianGPT: Native Arabic GPT-based Large Language	Anis Koubaa et.al.	2402.15313	null
2024-02-23	Counterfactual Generation with Identifiability Guarantees	Hanqi Yan et.al.	2402.15309	link
2024-02-23	Representing Online Handwriting for Recognition in Large Vision-Language Models	Anastasiia Fadeeva et.al.	2402.15307	null
2024-02-23	How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries	Somnath Banerjee et.al.	2402.15302	link
2024-02-23	Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models	Yuzhe Zhang et.al.	2402.15301	null
2024-02-22	PALO: A Polyglot Large Multimodal Model for 5B People	Muhammad Maaz et.al.	2402.14818	link
2024-02-22	Demographic Bias of Expert-Level Vision-Language Foundation Models in Medical Imaging	Yuzhe Yang et.al.	2402.14815	link
2024-02-22	WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition	Lianghui Zhu et.al.	2402.14812	link
2024-02-22	Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking	Nikhil Prakash et.al.	2402.14811	null
2024-02-22	CriticBench: Benchmarking LLMs for Critique-Correct Reasoning	Zicheng Lin et.al.	2402.14809	link
2024-02-22	RelayAttention for Efficient Large Language Model Serving with Long System Prompts	Lei Zhu et.al.	2402.14808	link
2024-02-22	A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health	Nikhil Behari et.al.	2402.14807	null
2024-02-22	Identifying Multiple Personalities in Large Language Models with External Evaluation	Xiaoyang Song et.al.	2402.14805	null
2024-02-22	Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models	Xudong Lu et.al.	2402.14800	link
2024-02-22	Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic	Nathaniel Weir et.al.	2402.14798	null
2024-02-22	Zero-shot cross-lingual transfer in instruction tuning of large language model	Nadezhda Chirkova et.al.	2402.14778	null
2024-02-22	2D Matryoshka Sentence Embeddings	Xianming Li et.al.	2402.14776	null
2024-02-22	DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models	Yuhang Cao et.al.	2402.14767	link
2024-02-22	MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues	Ge Bai et.al.	2402.14762	null
2024-02-22	Generalizing Reward Modeling for Out-of-Distribution Preference Learning	Chen Jia et.al.	2402.14760	null
2024-02-22	Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation	Jiawei Wang et.al.	2402.14744	null
2024-02-22	Dependency Annotation of Ottoman Turkish with Multilingual BERT	Şaziye Betül Özateş et.al.	2402.14743	null
2024-02-22	Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs	Arash Ahmadian et.al.	2402.14740	null
2024-02-22	Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models	Seungduk Kim et.al.	2402.14714	link
2024-02-22	IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus	Honghao Gui et.al.	2402.14710	link
2024-02-21	Coercing LLMs to do and reveal (almost) anything	Jonas Geiping et.al.	2402.14020	link
2024-02-21	Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment	Vyas Raina et.al.	2402.14016	null
2024-02-21	OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems	Chaoqun He et.al.	2402.14008	link
2024-02-21	Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models	Zhiwei He et.al.	2402.14007	null
2024-02-21	Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models	Aline Ioste et.al.	2402.14002	null
2024-02-21	Analysing The Impact of Sequence Composition on Language Model Pre-Training	Yu Zhao et.al.	2402.13991	link
2024-02-21	Towards Building Multilingual Language Model for Medicine	Pengcheng Qiu et.al.	2402.13963	link
2024-02-21	Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality	Rahul Zalkikar et.al.	2402.13954	null
2024-02-21	Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning	Debjit Paul et.al.	2402.13950	null
2024-02-21	Do Efficient Transformers Really Save Computation?	Kai Yang et.al.	2402.13934	null
2024-02-21	Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content	Federico Bianchi et.al.	2402.13926	null
2024-02-21	SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization	Prakamya Mishra et.al.	2402.13919	link
2024-02-21	What Linguistic Features and Languages are Important in LLM Translation?	Ryandito Diandaru et.al.	2402.13917	null
2024-02-21	Calibrating Large Language Models with Sample Consistency	Qing Lyu et.al.	2402.13904	null
2024-02-21	Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models	Chenyang Lyu et.al.	2402.13887	null
2024-02-21	$\texttt{Se}^2$: $\textit{Se}$quential Example $\textit{Se}$lection for In-Context Learning	Haoyu Liu et.al.	2402.13874	null
2024-02-21	An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach	Mohammad Amaz Uddin et.al.	2402.13871	null
2024-02-21	Kuaiji: the First Chinese Accounting Large Language Model	Jiayuan Luo et.al.	2402.13866	null
2024-02-21	RealDex: Towards Human-like Grasping for Robotic Dexterous Hand	Yumeng Liu et.al.	2402.13853	null
2024-02-21	VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models	Jiawei Liang et.al.	2402.13851	null
2024-02-20	Towards audio language modeling – an overview	Haibin Wu et.al.	2402.13236	null
2024-02-20	Unlocking Insights: Semantic Search in Jupyter Notebooks	Lan Li et.al.	2402.13234	null
2024-02-20	A Touch, Vision, and Language Dataset for Multimodal Alignment	Letian Fu et.al.	2402.13232	link
2024-02-20	Investigating Cultural Alignment of Large Language Models	Badr AlKhamissi et.al.	2402.13231	link
2024-02-20	Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive	Arka Pal et.al.	2402.13228	link
2024-02-20	AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning	Qiao Jin et.al.	2402.13225	null
2024-02-20	RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian	Adrian Cosma et.al.	2402.13222	link
2024-02-20	How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts	Yusu Qian et.al.	2402.13220	null
2024-02-20	Softmax Probabilities (Mostly) Predict Large Language Model Correctness on Multiple-Choice Q&A	Benjamin Plaut et.al.	2402.13213	link
2024-02-20	Soft Self-Consistency Improves Language Model Agents	Han Wang et.al.	2402.13212	link
2024-02-20	Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation	Dongjin Kang et.al.	2402.13211	null
2024-02-20	Bayesian Reward Models for LLM Alignment	Adam X. Yang et.al.	2402.13210	null
2024-02-20	How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena	Marco Gaido et.al.	2402.13208	link
2024-02-20	Question Calibration and Multi-Hop Modeling for Temporal Question Answering	Chao Xue et.al.	2402.13188	null
2024-02-20	What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents	Mingyu Jin et.al.	2402.13184	null
2024-02-20	DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models	Norman Di Palo et.al.	2402.13181	null
2024-02-20	Benchmarking Retrieval-Augmented Generation for Medicine	Guangzhi Xiong et.al.	2402.13178	link
2024-02-20	Defending Jailbreak Prompts via In-Context Adversarial Game	Yujun Zhou et.al.	2402.13148	null
2024-02-20	OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog	Adnen Abdessaied et.al.	2402.13146	null
2024-02-20	The Hidden Space of Transformer Language Adapters	Jesujoba O. Alabi et.al.	2402.13137	null
2024-02-19	Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding	Zhuoming Chen et.al.	2402.12374	link
2024-02-19	AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies	Xiao Ye et.al.	2402.12370	link
2024-02-19	A Critical Evaluation of AI Feedback for Aligning Large Language Models	Archit Sharma et.al.	2402.12366	link
2024-02-19	Emergent Word Order Universals from Cognitively-Motivated Language Models	Tatsuki Kuribayashi et.al.	2402.12363	null
2024-02-19	Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge	Julien Delile et.al.	2402.12352	null
2024-02-19	GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations	Jinhao Duan et.al.	2402.12348	link
2024-02-19	Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!	Zhanhui Zhou et.al.	2402.12343	link
2024-02-19	Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models	Christian Schlarmann et.al.	2402.12336	link
2024-02-19	Query-Based Adversarial Prompt Generation	Jonathan Hayase et.al.	2402.12329	null
2024-02-19	Shall We Talk: Exploring Spontaneous Collaborations of Competing LLM Agents	Zengqing Wu et.al.	2402.12327	link
2024-02-19	ARKS: Active Retrieval in Knowledge Soup for Code Generation	Hongjin Su et.al.	2402.12317	null
2024-02-19	Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports	Felix J. Dorfner et.al.	2402.12298	null
2024-02-19	KARL: Knowledge-Aware Retrieval and Representations aid Retention and Learning in Students	Matthew Shu et.al.	2402.12291	null
2024-02-19	DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models	Xiaoyu Tian et.al.	2402.12289	null
2024-02-19	Adaptive Skeleton Graph Decoding	Shuowei Jin et.al.	2402.12280	null
2024-02-19	Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks	Nadezhda Chirkova et.al.	2402.12279	null
2024-02-19	Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from Large Language Models	Puxuan Yu et.al.	2402.12276	link
2024-02-19	High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models	Michela Lorandi et.al.	2402.12267	link
2024-02-19	Uncertainty quantification in fine-tuned LLMs using LoRA ensembles	Oleksandr Balabanov et.al.	2402.12264	null
2024-02-19	NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms	Jonathan Zheng et.al.	2402.12261	null
2024-02-16	PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter	Junfei Xiao et.al.	2402.10896	null
2024-02-16	RLVF: Learning from Verbal Feedback without Overgeneralization	Moritz Stephan et.al.	2402.10893	link
2024-02-16	Instruction Diversity Drives Generalization To Unseen Tasks	Dylan Zhang et.al.	2402.10891	null
2024-02-16	When is Tree Search Useful for LLM Planning? It Depends on the Discriminator	Ziru Chen et.al.	2402.10890	link
2024-02-16	Multi-modal preference alignment remedies regression of visual instruction tuning on language model	Shengzhi Li et.al.	2402.10884	link
2024-02-16	EcoRank: Budget-Constrained Text Re-ranking Using Large Language Models	Muhammad Shihab Rashid et.al.	2402.10866	null
2024-02-16	Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities	Mingyu Jin et.al.	2402.10835	null
2024-02-16	RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model	Jianhao Yuan et.al.	2402.10828	null
2024-02-16	Quantifying the Persona Effect in LLM Simulations	Tiancheng Hu et.al.	2402.10811	null
2024-02-16	Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond	Yongqi Li et.al.	2402.10805	null
2024-02-16	EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge	Xuan Shen et.al.	2402.10787	link
2024-02-16	A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models	Mingchen Li et.al.	2402.10779	null
2024-02-16	AutoGPT+P: Affordance-based Task Planning with Large Language Models	Timo Birr et.al.	2402.10778	null
2024-02-16	How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs?	Ehsan Doostmohammadi et.al.	2402.10770	null
2024-02-16	Distillation Enhanced Generative Retrieval	Yongqi Li et.al.	2402.10769	null
2024-02-16	Inference to the Best Explanation in Large Language Models	Dhairya Dalal et.al.	2402.10767	null
2024-02-16	When Dataflow Analysis Meets Large Language Models	Chengpeng Wang et.al.	2402.10754	null
2024-02-16	ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages	Junjie Ye et.al.	2402.10753	link
2024-02-16	GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models	Pengcheng Jiang et.al.	2402.10744	link
2024-02-16	Let’s Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning	Yinpeng Liu et.al.	2402.10738	link
2024-02-15	Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation	Huizhuo Yuan et.al.	2402.10210	null
2024-02-15	Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment	Rui Yang et.al.	2402.10207	link
2024-02-15	Chain-of-Thought Reasoning Without Prompting	Xuezhi Wang et.al.	2402.10200	null
2024-02-15	A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents	Lingbo Mo et.al.	2402.10196	link
2024-02-15	BitDelta: Your Fine-Tune May Only Be Worth One Bit	James Liu et.al.	2402.10193	link
2024-02-15	Uncertainty Decomposition and Quantification for In-Context Learning of Large Language Models	Chen Ling et.al.	2402.10189	link
2024-02-15	Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective	Tianyi Qiu et.al.	2402.10184	null
2024-02-15	TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation	Yaoxiang Wang et.al.	2402.10178	null
2024-02-15	OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset	Shubham Toshniwal et.al.	2402.10176	link
2024-02-15	Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence	Yinhong Liu et.al.	2402.10175	link
2024-02-15	OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models	Ali AhmadiTeshnizi et.al.	2402.10172	null
2024-02-15	Data Engineering for Scaling Language Models to 128K Context	Yao Fu et.al.	2402.10171	link
2024-02-15	Knowledge-Infused LLM-Powered Conversational Health Agent: A Case Study for Diabetes Patients	Mahyar Abbasian et.al.	2402.10153	null
2024-02-15	ControlLM: Crafting Diverse Personalities for Language Models	Yixuan Weng et.al.	2402.10151	link
2024-02-15	TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles	Yinhong Liu et.al.	2402.10137	null
2024-02-15	Zero-Shot Reasoning: Personalized Content Generation Without the Cold Start Problem	Davor Hafnar et.al.	2402.10133	null
2024-02-15	Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning	Ming Li et.al.	2402.10110	link
2024-02-15	Quantized Embedding Vectors for Controllable Diffusion Language Models	Cheng Kang et.al.	2402.10107	null
2024-02-15	GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving	Jiaxin Zhang et.al.	2402.10104	link
2024-02-15	Any-Shift Prompting for Generalization over Distributions	Zehao Xiao et.al.	2402.10099	null
2024-02-14	AQA-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability	Siwei Yang et.al.	2402.09404	link
2024-02-14	Reinforcement Learning from Human Feedback with Active Queries	Kaixuan Ji et.al.	2402.09401	null
2024-02-14	Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference	Harry Dong et.al.	2402.09398	link
2024-02-14	LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset	Botao Yu et.al.	2402.09391	link
2024-02-14	HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation	Yihao Fang et.al.	2402.09390	link
2024-02-14	Transformers Can Achieve Length Generalization But Not Robustly	Yongchao Zhou et.al.	2402.09371	null
2024-02-14	Pseudorandom Error-Correcting Codes	Miranda Christ et.al.	2402.09370	null
2024-02-14	Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking	Yi Fung et.al.	2402.09369	link
2024-02-14	Copyright Traps for Large Language Models	Matthieu Meeus et.al.	2402.09363	null
2024-02-14	HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference	Yashas Samaga B L et.al.	2402.09360	null
2024-02-14	Developing a Framework for Auditing Large Language Models Using Human-in-the-Loop	Maryam Amirizaniani et.al.	2402.09346	null
2024-02-14	Mitigating Reward Hacking via Information-Theoretic Reward Modeling	Yuchun Miao et.al.	2402.09345	null
2024-02-14	AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach	Maryam Amirizaniani et.al.	2402.09334	null
2024-02-14	ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization	Feifan Song et.al.	2402.09320	link
2024-02-14	Embracing the black box: Heading towards foundation models for causal discovery from time series data	Gideon Stein et.al.	2402.09305	link
2024-02-14	Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code	Vahid Majdinasab et.al.	2402.09299	link
2024-02-14	Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey	Zhichen Dong et.al.	2402.09283	link
2024-02-14	Leveraging Large Language Models for Enhanced NLP Task Performance through Knowledge Distillation and Optimized Training Strategies	Yining Huang et.al.	2402.09282	null
2024-02-14	Personalized Large Language Models	Stanisław Woźniak et.al.	2402.09269	null
2024-02-14	Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation	Xiaoying Zhang et.al.	2402.09267	null
2024-02-13	Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance	Linxi Zhao et.al.	2402.08680	null
2024-02-13	COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability	Xingang Guo et.al.	2402.08679	link
2024-02-13	Human Curriculum Effects Emerge with In-Context Learning in Neural Networks	Jacob Russin et.al.	2402.08674	null
2024-02-13	Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models	Yuqing Liu et.al.	2402.08670	null
2024-02-13	Improving Generalization in Semantic Parsing by Increasing Natural Language Variation	Irina Saparina et.al.	2402.08666	link
2024-02-13	The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in a Prospective Cardiac Rehabilitation Setting	David Haag et.al.	2402.08658	null
2024-02-13	PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs	Michael Dorkenwald et.al.	2402.08657	null
2024-02-13	Tandem Transformers for Inference Efficient LLMs	Aishwarya P S et.al.	2402.08644	null
2024-02-13	SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages	Nedjma Ousidhoum et.al.	2402.08638	null
2024-02-13	Knowledge Editing on Black-box Large Language Models	Xiaoshuai Song et.al.	2402.08631	link
2024-02-13	Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning	Haeju Lee et.al.	2402.08594	link
2024-02-13	Test-Time Backdoor Attacks on Multimodal Large Language Models	Dong Lu et.al.	2402.08577	link
2024-02-13	Online Foundation Model Selection in Robotics	Po-han Li et.al.	2402.08570	null
2024-02-13	Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast	Xiangming Gu et.al.	2402.08567	link
2024-02-13	Artificial Intelligence for Literature Reviews: Opportunities and Challenges	Francisco Bolanos et.al.	2402.08565	null
2024-02-13	Higher Layers Need More LoRA Experts	Chongyang Gao et.al.	2402.08562	link
2024-02-13	Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback	Vineet Bhat et.al.	2402.08546	null
2024-02-13	The Application of ChatGPT in Responding to Questions Related to the Boston Bowel Preparation Scale	Xiaoqiang Liu et.al.	2402.08492	null
2024-02-13	Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models	Shaeke Salman et.al.	2402.08473	null
2024-02-13	Large Language Models for the Automated Analysis of Optimization Algorithms	Camilo Chacón Sartori et.al.	2402.08472	link
2024-02-12	A systematic investigation of learnability from single child linguistic input	Yulu Qin et.al.	2402.07899	link
2024-02-12	Suppressing Pink Elephants with Direct Principle Feedback	Louis Castricato et.al.	2402.07896	null
2024-02-12	WildfireGPT: Tailored Large Language Model for Wildfire Analysis	Yangxinyu Xie et.al.	2402.07877	null
2024-02-12	Policy Improvement using Language Feedback Models	Victor Zhong et.al.	2402.07876	null
2024-02-12	PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs	Soroush Nasiriany et.al.	2402.07872	null
2024-02-12	Scaling Laws for Fine-Grained Mixture of Experts	Jakub Krajewski et.al.	2402.07871	link
2024-02-12	PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models	Wei Zou et.al.	2402.07867	link
2024-02-12	Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models	Siddharth Karamcheti et.al.	2402.07865	link
2024-02-12	AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy	Philipp Schoenegger et.al.	2402.07862	null
2024-02-12	Lissard: Long and Simple Sequential Reasoning Datasets	Mirelle Bueno et.al.	2402.07859	null
2024-02-12	Mercury: An Efficiency Benchmark for LLM Code Synthesis	Mingzhe Du et.al.	2402.07844	link
2024-02-12	Do Membership Inference Attacks Work on Large Language Models?	Michael Duan et.al.	2402.07841	link
2024-02-12	Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model	Ahmet Üstün et.al.	2402.07827	null
2024-02-12	Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning	Z Liu et.al.	2402.07818	null
2024-02-12	Injecting Wiktionary to improve token-level contextual representations using contrastive learning	Anna Mosolova et.al.	2402.07817	null
2024-02-12	Retrieval-Augmented Thought Process as Sequential Decision Making	Thomas Pouplin et.al.	2402.07812	null
2024-02-12	Empowering Federated Learning for Massive Models with NVIDIA FLARE	Holger R. Roth et.al.	2402.07792	null
2024-02-12	TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection	Hui Liu et.al.	2402.07776	link
2024-02-12	Quantitative knowledge retrieval from large language models	David Selby et.al.	2402.07770	link
2024-02-12	Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model	Mikail Khona et.al.	2402.07757	null
2024-02-09	Feedback Loops With Language Models Drive In-Context Reward Hacking	Alexander Pan et.al.	2402.06627	link
2024-02-09	Understanding the Effects of Iterative Prompting on Truthfulness	Satyapriya Krishna et.al.	2402.06625	null
2024-02-09	Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning	Shivalika Singh et.al.	2402.06619	null
2024-02-09	FaBERT: Pre-training BERT on Persian Blogs	Mostafa Masumi et.al.	2402.06617	null
2024-02-09	On the Out-Of-Distribution Generalization of Multimodal Large Language Models	Xingxuan Zhang et.al.	2402.06599	null
2024-02-09	CigaR: Cost-efficient Program Repair with LLMs	Dávid Hidvégi et.al.	2402.06598	link
2024-02-09	Understanding the Weakness of Large Language Model Agents within a Complex Android Environment	Mingzhe Xing et.al.	2402.06596	link
2024-02-09	Self-consistent context aware conformer transducer for speech recognition	Konstantin Kolokolov et.al.	2402.06592	null
2024-02-09	G-SciEdBERT: A Contextualized LLM for Science Assessment Tasks in German	Ehsan Latif et.al.	2402.06584	null
2024-02-09	Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning	Amir Ziai et.al.	2402.06560	link
2024-02-09	The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model	Gregory Coppola et.al.	2402.06557	link
2024-02-09	Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA	Marek Šuppa et.al.	2402.06549	link
2024-02-09	Calibrating Long-form Generations from Large Language Models	Yukun Huang et.al.	2402.06544	null
2024-02-09	Introspective Planning: Guiding Language-Enabled Agents to Refine Their Own Uncertainty	Kaiqu Liang et.al.	2402.06529	link
2024-02-09	Multimodal Clinical Trial Outcome Prediction with Large Language Models	Wenhao Zheng et.al.	2402.06512	link
2024-02-09	Iris-SAM: Iris Segmentation Using a Foundational Model	Parisa Farmanifard et.al.	2402.06497	link
2024-02-09	Large Language Models for Captioning and Retrieving Remote Sensing Images	João Daniel Silva et.al.	2402.06475	null
2024-02-09	V-STaR: Training Verifiers for Self-Taught Reasoners	Arian Hosseini et.al.	2402.06457	null
2024-02-09	StruQ: Defending Against Prompt Injection with Structured Queries	Sizhe Chen et.al.	2402.06363	null
2024-02-09	CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models	Peiyuan Gong et.al.	2402.06360	link
2024-02-08	SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models	Peng Gao et.al.	2402.05935	link
2024-02-08	Driving Everywhere with Large Language Model Policy Adaptation	Boyi Li et.al.	2402.05932	null
2024-02-08	WebLINX: Real-World Website Navigation with Multi-Turn Dialogue	Xing Han Lù et.al.	2402.05930	link
2024-02-08	An Interactive Agent Foundation Model	Zane Durante et.al.	2402.05929	null
2024-02-08	On the Convergence of Zeroth-Order Federated Tuning in Large Language Models	Zhenqing Ling et.al.	2402.05926	null
2024-02-08	Efficient Stagewise Pretraining via Progressive Subnetworks	Abhishek Panigrahi et.al.	2402.05913	null
2024-02-08	FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs	Eun Cheol Choi et.al.	2402.05904	link
2024-02-08	Large Language Model Meets Graph Neural Network in Knowledge Distillation	Shengxiang Hu et.al.	2402.05894	null
2024-02-08	Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking	Nikhil Sharma et.al.	2402.05880	null
2024-02-08	PromptCrypt: Prompt Encryption for Secure Communication with Large Language Models	Guo Lin et.al.	2402.05868	link
2024-02-08	How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis	Federico Bianchi et.al.	2402.05863	link
2024-02-08	Let Your Graph Do the Talking: Encoding Structured Data for LLMs	Bryan Perozzi et.al.	2402.05862	null
2024-02-08	Learning to Route Among Specialized Experts for Zero-Shot Generalization	Mohammed Muqeeth et.al.	2402.05859	link
2024-02-08	Limitations of Agents Simulated by Predictive Models	Raymond Douglas et.al.	2402.05829	null
2024-02-08	Is it Possible to Edit Large Language Models Robustly?	Xinbei Ma et.al.	2402.05827	link
2024-02-08	Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models	Lingzhi Wang et.al.	2402.05813	null
2024-02-08	Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning	Zhiheng Xi et.al.	2402.05808	link
2024-02-08	How do Transformers perform In-Context Autoregressive Learning?	Michael E. Sander et.al.	2402.05787	null
2024-02-08	Limits of Transformer Language Models on Algorithmic Learning	Jonathan Thomm et.al.	2402.05785	null
2024-02-08	Text-to-Code Generation with Modality-relative Pre-training	Fenia Christopoulou et.al.	2402.05783	null
2024-02-07	Opening the AI black box: program synthesis via mechanistic interpretability	Eric J. Michaud et.al.	2402.05110	link
2024-02-07	You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models	Alix Decrop et.al.	2402.05102	null
2024-02-07	Hydragen: High-Throughput LLM Inference with Shared Prefixes	Jordan Juravsky et.al.	2402.05099	null
2024-02-07	Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation	Dennis Hoftijzer et.al.	2402.05090	null
2024-02-07	A Roadmap to Pluralistic Alignment	Taylor Sorensen et.al.	2402.05070	link
2024-02-07	SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models	Lijun Li et.al.	2402.05044	link
2024-02-07	How BERT Speaks Shakespearean English? Evaluating Historical Bias in Contextual Language Models	Miriam Cuscito et.al.	2402.05034	null
2024-02-07	A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?	Agustinus Kristiadi et.al.	2402.05015	link
2024-02-07	Pedagogical Alignment of Large Language Models	Shashank Sonkar et.al.	2402.05000	null
2024-02-07	An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration	Yihao Li et.al.	2402.04978	null
2024-02-07	ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12	Liuqing Chen et.al.	2402.04975	null
2024-02-07	Reconfidencing LLMs from the Grouping Loss Perspective	Lihu Chen et.al.	2402.04957	null
2024-02-07	Chatbots in Knowledge-Intensive Contexts: Comparing Intent and LLM-Based Systems	Samuel Kernan Freire et.al.	2402.04955	null
2024-02-07	Prompting Implicit Discourse Relation Annotation	Frances Yung et.al.	2402.04918	null
2024-02-07	Personalized Text Generation with Fine-Grained Linguistic Control	Bashar Alhafni et.al.	2402.04914	link
2024-02-07	L4Q: Parameter Efficient Quantization-Aware Training on Large Language Models via LoRA-wise LSQ	Hyesung Jeon et.al.	2402.04902	null
2024-02-07	Detecting Generated Native Ads in Conversational Search	Sebastian Schmidt et.al.	2402.04889	link
2024-02-07	Multimodal Query Suggestion with Multi-Agent Reinforcement Learning from Human Feedback	Zheng Wang et.al.	2402.04867	null
2024-02-07	Automated Smart Contract Summarization via LLMs	Yingjie Mao et.al.	2402.04863	null
2024-02-07	CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay	Natasha Butt et.al.	2402.04858	null
2024-02-06	AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls	Yu Du et.al.	2402.04253	link
2024-02-06	HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal	Mantas Mazeika et.al.	2402.04249	link
2024-02-06	Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks	Jongho Park et.al.	2402.04248	link
2024-02-06	Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science	Xiangru Tang et.al.	2402.04247	null
2024-02-06	CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations	Ji Qi et.al.	2402.04236	link
2024-02-06	Can Generative Agents Predict Emotion?	Ciaran Regan et.al.	2402.04232	null
2024-02-06	“Task Success” is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors	Lin Guan et.al.	2402.04210	null
2024-02-06	Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models	David Sobrín-Hidalgo et.al.	2402.04206	null
2024-02-06	SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models	Yichen Shi et.al.	2402.04178	link
2024-02-06	Scaling Laws for Downstream Task Performance of Large Language Models	Berivan Isik et.al.	2402.04177	null
2024-02-06	Harnessing the Plug-and-Play Controller by Prompting	Hao Wang et.al.	2402.04160	null
2024-02-06	Multi-line AI-assisted Code Authoring	Omer Dunay et.al.	2402.04141	null
2024-02-06	Advancing Legal Reasoning: The Integration of AI to Navigate Complexities and Biases in Global Jurisprudence with Semi-Automated Arbitration Processes (SAAPs)	Michael De’Shazer et.al.	2402.04140	null
2024-02-06	Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science	Pengfei Liu et.al.	2402.04119	link
2024-02-06	Measuring Implicit Bias in Explicitly Unbiased Large Language Models	Xuechunzi Bai et.al.	2402.04105	null
2024-02-06	The Use of a Large Language Model for Cyberbullying Detection	Bayode Ogunleye et.al.	2402.04088	null
2024-02-06	A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation	Zhengbo Wang et.al.	2402.04087	link
2024-02-06	Provably learning a multi-head attention layer	Sitan Chen et.al.	2402.04084	null
2024-02-06	Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models	Reza Khanmohammadi et.al.	2402.04075	null
2024-02-06	Retrieve to Explain: Evidence-driven Predictions with Language Models	Ravi Patel et.al.	2402.04068	link

Video Understanding

Publish Date	Title	Authors	PDF	Code
2024-05-14	Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis	Yao Fu et.al.	2405.08944	null
2024-05-14	CinePile: A Long Video Question Answering Dataset and Benchmark	Ruchit Rawal et.al.	2405.08813	null
2024-05-14	No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding	Yingjie Zhai et.al.	2405.08344	link
2024-05-13	FreeVA: Offline MLLM as Training-Free Video Assistant	Wenhao Wu et.al.	2405.07798	link
2024-05-11	Memory-Maze: Scenario Driven Benchmark and Visual Language Navigation Model for Guiding Blind People	Masaki Kuribayashi et.al.	2405.07060	null
2024-05-11	Retrieval Enhanced Zero-Shot Video Captioning	Yunchuan Ma et.al.	2405.07046	null
2024-05-11	Global Motion Understanding in Large-Scale Video Object Segmentation	Volodymyr Fedynyak et.al.	2405.07031	null
2024-05-09	A Survey on Backbones for Deep Video Action Recognition	Zixuan Tang et.al.	2405.05584	null
2024-05-08	Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios	Chirag Parikh et.al.	2405.05354	null
2024-05-07	Vision Mamba: A Comprehensive Survey and Taxonomy	Xiao Liu et.al.	2405.04404	link
2024-05-06	Foundation Models for Video Understanding: A Survey	Neelu Madan et.al.	2405.03770	link
2024-05-08	How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs	Muhammad Uzair Khattak et.al.	2405.03690	null
2024-05-06	WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning	Yuanhan Zhang et.al.	2405.03272	null
2024-04-30	Cross-Block Fine-Grained Semantic Cascade for Skeleton-Based Sports Action Recognition	Zhendong Liu et.al.	2404.19383	null
2024-05-01	Capabilities of Gemini Models in Medicine	Khaled Saab et.al.	2404.18416	null
2024-04-26	Learning text-to-video retrieval from image captioning	Lucas Ventura et.al.	2404.17498	null
2024-04-26	MovieChat+: Question-aware Sparse Memory for Long Video Question Answering	Enxin Song et.al.	2404.17176	link
2024-04-26	Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting	Yuanyuan Liu et.al.	2404.17100	null
2024-04-29	PLLaVA: Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning	Lin Xu et.al.	2404.16994	link
2024-04-25	SFMViT: SlowFast Meet ViT in Chaotic World	Jiaying Lin et.al.	2404.16609	link
2024-04-23	IPAD: Industrial Process Anomaly Detection Dataset	Jinfan Liu et.al.	2404.15033	null
2024-04-23	Pegasus-v1 Technical Report	Raehyuk Jung et.al.	2404.14687	null
2024-04-26	Narrative Action Evaluation with Prompt-Guided Multimodal Interaction	Shiyi Zhang et.al.	2404.14471	link
2024-04-20	Movie101v2: Improved Movie Narration Benchmark	Zihao Yue et.al.	2404.13370	null
2024-04-18	Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models	Reka Team et.al.	2404.12387	null
2024-04-18	From Image to Video, what do we need in multimodal LLMs?	Suyuan Huang et.al.	2404.11865	null
2024-04-17	VG4D: Vision-Language Model Goes 4D Video Recognition	Zhichao Deng et.al.	2404.11605	link
2024-04-15	Leveraging Temporal Contextualization for Video Action Recognition	Minji Kim et.al.	2404.09490	null
2024-04-15	The 8th AI City Challenge	Shuo Wang et.al.	2404.09432	null
2024-04-16	Human-in-the-Loop Segmentation of Multi-species Coral Imagery	Scarlett Raine et.al.	2404.09406	link
2024-04-14	In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition	Wiktor Mucha et.al.	2404.09308	null
2024-04-14	TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning	Quang Minh Dinh et.al.	2404.09275	link
2024-04-14	Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection	Jin Yang et.al.	2404.09263	link
2024-04-12	Enhancing Traffic Safety with Parallel Dense Video Captioning for End-to-End Event Analysis	Maged Shoman et.al.	2404.08229	link
2024-04-11	Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval	Minkuk Kim et.al.	2404.07610	link
2024-04-10	A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos	Suleyman Ozdel et.al.	2404.07351	null
2024-04-10	Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention	Suleyman Ozdel et.al.	2404.07347	null
2024-04-09	MoReVQA: Exploring Modular Reasoning Models for Video Question Answering	Juhong Min et.al.	2404.06511	null
2024-04-07	X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model	Jan Held et.al.	2404.06332	null
2024-04-24	MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding	Bo He et.al.	2404.05726	link
2024-04-06	SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos	Tao Wu et.al.	2404.04565	null
2024-04-19	Koala: Key frame-conditioned long video-LLM	Reuben Tan et.al.	2404.04346	null
2024-04-05	Neural-Symbolic VideoQA: Learning Compositional Spatio-Temporal Reasoning for Real-world Video Question Answering	Lili Liang et.al.	2404.04007	null
2024-04-04	OW-VISCap: Open-World Video Instance Segmentation and Captioning	Anwesa Choudhuri et.al.	2404.03657	null
2024-04-04	MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens	Kirolos Ataallah et.al.	2404.03413	null
2024-04-10	LongVLM: Efficient Long Video Understanding via Large Language Models	Yuetian Weng et.al.	2404.03384	link
2024-04-03	DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement	Hao Wu et.al.	2404.02755	null
2024-04-05	SnAG: Scalable and Accurate Video Grounding	Fangzhou Mu et.al.	2404.02257	null
2024-04-01	TraveLER: A Multi-LMM Agent Framework for Video Question-Answering	Chuyi Shang et.al.	2404.01476	null
2024-04-01	CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes	Ting En Lam et.al.	2404.01299	null
2024-04-01	Streaming Dense Video Captioning	Xingyi Zhou et.al.	2404.01297	link
2024-04-02	Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward	Ruohong Zhang et.al.	2404.01258	link
2024-04-01	VideoDistill: Language-aware Vision Distillation for Video Question Answering	Bo Zou et.al.	2404.00973	null
2024-03-31	$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding	Ye Liu et.al.	2404.00801	link
2024-03-30	Instrument-tissue Interaction Detection Framework for Surgical Video Understanding	Wenjun Lin et.al.	2404.00322	null
2024-03-30	ST-LLM: Large Language Models Are Effective Temporal Learners	Ruyang Liu et.al.	2404.00308	link
2024-03-29	A Unified Framework for Human-centric Point Cloud Video Understanding	Yiteng Xu et.al.	2403.20031	null
2024-03-28	Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality	Sishuo Chen et.al.	2403.19221	link
2024-03-27	An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM	Wonkyun Kim et.al.	2403.18406	link
2024-03-26	OmniVid: A Generative Framework for Universal Video Understanding	Junke Wang et.al.	2403.17935	link
2024-03-25	Understanding Long Videos in One Multimodal Language Model Pass	Kanchana Ranasinghe et.al.	2403.16998	link
2024-03-24	AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue	Yunlong Tang et.al.	2403.16276	null
2024-03-22	InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding	Yi Wang et.al.	2403.15377	link
2024-03-25	VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding	Ahmad Mahmood et.al.	2403.14743	null
2024-03-21	Language Repository for Long Video Understanding	Kumara Kahatapitiya et.al.	2403.14622	link
2024-03-21	Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels	Tianming Liang et.al.	2403.14430	null
2024-03-18	Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation	Zixin Zhu et.al.	2403.12042	link
2024-03-18	Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation	Wangbo Zhao et.al.	2403.11808	link
2024-03-27	LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model	Yuxin Cao et.al.	2403.11656	null
2024-03-18	VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding	Yue Fan et.al.	2403.11481	null
2024-03-15	VideoAgent: Long-form Video Understanding with Large Language Model as Agent	Xiaohan Wang et.al.	2403.10517	null
2024-03-14	Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding	Guo Chen et.al.	2403.09626	link
2024-03-25	Don’t Judge by the Look: Towards Motion Coherent Video Representation	Yitian Zhang et.al.	2403.09506	link
2024-03-13	DAM: Dynamic Adapter Merging for Continual Video QA Learning	Feng Cheng et.al.	2403.08755	link
2024-03-11	Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions	Lan Wang et.al.	2403.07198	null
2024-03-12	VideoMamba: State Space Model for Efficient Video Understanding	Kunchang Li et.al.	2403.06977	link
2024-03-25	An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models	Liang Chen et.al.	2403.06764	link
2024-03-08	Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation	Joseph Cho et.al.	2403.05131	null
2024-03-11	Beyond MOT: Semantic Multi-Object Tracking	Yunhao Li et.al.	2403.05021	null
2024-03-08	Pix2Gif: Motion-Guided Diffusion for GIF Generation	Hitesh Kandala et.al.	2403.04634	null
2024-03-05	A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives	Simone Alberto Peirone et.al.	2403.03037	null
2024-03-03	MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies	Zhende Song et.al.	2403.01422	null
2024-03-01	Abductive Ego-View Accident Video Understanding for Safe Driving Perception	Jianwu Fang et.al.	2403.00436	null
2024-02-29	Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers	Tsai-Shien Chen et.al.	2402.19479	null
2024-03-11	TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning	Kate Sanders et.al.	2402.19467	null
2024-02-29	Percept, Chat, and then Adapt: Multimodal Knowledge Transfer of Foundation Models for Open-World Video Recognition	Boyu Chen et.al.	2402.18951	null
2024-02-27	MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning	Huiyu Xiong et.al.	2402.17680	null
2024-02-25	LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding	Yuxuan Wang et.al.	2402.16050	link
2024-02-22	Think before You Leap: Content-Aware Low-Cost Edge-Assisted Video Semantic Segmentation	Mingxuan Yan et.al.	2402.14326	null
2024-02-21	LLMs Meet Long Video: Advancing Long Video Comprehension with An Interactive Visual Adapter in LLMs	Yunxin Li et.al.	2402.13546	null
2024-02-28	Video ReCap: Recursive Captioning of Hour-Long Videos	Md Mohaiminul Islam et.al.	2402.13250	null
2024-02-20	VideoPrism: A Foundational Visual Encoder for Video Understanding	Long Zhao et.al.	2402.13217	null
2024-02-20	Slot-VLM: SlowFast Slots for Video-Language Modeling	Jiaqi Xu et.al.	2402.13088	null
2024-02-19	System Identification of Neural Systems: Going Beyond Images to Modelling Dynamics	Mai Gamal et.al.	2402.12519	null
2024-02-19	LVCHAT: Facilitating Long Video Comprehension	Yu Wang et.al.	2402.12079	link
2024-02-28	Are you Struggling? Dataset and Baselines for Struggle Determination in Assembly Videos	Shijia Feng et.al.	2402.11057	null
2024-02-16	Question-Instructed Visual Descriptions for Zero-Shot Video Question Answering	David Romero et.al.	2402.10698	null
2024-02-13	World Model on Million-Length Video And Language With RingAttention	Hao Liu et.al.	2402.08268	link
2024-02-12	BDIQA: A New Dataset for Video Question Answering to Explore Cognitive Reasoning through Theory of Mind	Yuanyuan Mao et.al.	2402.07402	null
2024-02-09	Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning	Amir Ziai et.al.	2402.06560	link
2024-02-09	Dynamic swarms regulate the morphology and distribution of soft membrane domains	Aakanksha Gubbala et.al.	2402.06518	null
2024-02-08	Memory Consolidation Enables Long-Context Video Understanding	Ivana Balažević et.al.	2402.05861	null
2024-02-06	Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization	Yang Jin et.al.	2402.03161	null
2024-02-04	Spatio-temporal Prompting Network for Robust Video Feature Extraction	Guanxiong Sun et.al.	2402.02574	link
2024-02-02	Simulator-Free Visual Domain Randomization via Video Games	Chintan Trivedi et.al.	2402.01335	link
2024-01-30	YTCommentQA: Video Question Answerability in Instructional Videos	Saelyne Yang et.al.	2401.17343	link
2024-01-30	Multi-granularity Correspondence Learning from Long-term Noisy Videos	Yijie Lin et.al.	2401.16702	null
2024-01-29	Cutup and Detect: Human Fall Detection on Cutup Untrimmed Videos Using a Large Foundational Video Understanding Model	Till Grutschus et.al.	2401.16280	null
2024-01-25	Knowledge Graph Supported Benchmark and Video Captioning for Basketball	Zeyu Xi et.al.	2401.13888	null
2024-01-22	ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition	Jiaming Zhou et.al.	2401.11654	null
2024-01-21	Exploring Missing Modality in Multimodal Egocentric Datasets	Merey Ramazanova et.al.	2401.11470	null
2024-01-19	Learning to Visually Connect Actions and their Effects	Eric Peh et.al.	2401.10805	null
2024-01-28	Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering	Haibo Wang et.al.	2401.10711	null
2024-01-17	CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding	Yunze Liu et.al.	2401.09057	null
2024-01-16	Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data	Yuhui Zhang et.al.	2401.08567	link
2024-01-16	Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization	Chongzhi Zhang et.al.	2401.08232	null
2024-01-11	Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition	Yukun Zuo et.al.	2401.06287	null
2024-01-10	HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition	Qian Wu et.al.	2401.04975	link
2024-01-10	SnapCap: Efficient Snapshot Compressive Video Captioning	Jianqiao Sun et.al.	2401.04903	null
2024-01-08	Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification	Wentao Zhu et.al.	2401.04154	null
2024-01-08	Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning	Chen Zhao et.al.	2401.04105	link
2024-01-08	STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering	Yueqian Wang et.al.	2401.03901	link

Publish Date	Title	Authors	PDF	Code
2024-05-16	Biomarker Selection for Adaptive Systems	Joshua Pickard et.al.	2405.09809	null
2024-05-14	No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding	Yingjie Zhai et.al.	2405.08344	link
2024-05-13	Improved Bound for Robust Causal Bandits with Linear Models	Zirui Yan et.al.	2405.07795	null
2024-05-10	Residual-based Attention Physics-informed Neural Networks for Efficient Spatio-Temporal Lifetime Assessment of Transformers Operated in Renewable Power Plants	Ibai Ramirez et.al.	2405.06443	null
2024-05-10	A Multi-Channel Spatial-Temporal Transformer Model for Traffic Flow Forecasting	Jianli Xiao et.al.	2405.06266	null
2024-05-07	DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving	Chen Min et.al.	2405.04390	null
2024-05-07	Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling	Jiawei Shi et.al.	2405.04309	null
2024-05-06	Hierarchical Space-Time Attention for Micro-Expression Recognition	Haihong Hao et.al.	2405.03202	link
2024-05-02	RSCaMa: Remote Sensing Image Change Captioning with State Space Model	Chenyang Liu et.al.	2404.18895	link
2024-04-24	Deep Predictive Model Learning with Parametric Bias: Handling Modeling Difficulties and Temporal Model Changes	Kento Kawaharazuka et.al.	2404.15726	null
2024-04-19	MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model	Kang Zeng et.al.	2404.12794	link
2024-04-13	Understanding Human-COVID-19 Dynamics using Geospatial Big Data: A Systematic Literature Review	Binbin Lin et.al.	2404.10013	null
2024-04-15	A spatio-temporal model to detect potential outliers in disease mapping	Victoire Michal et.al.	2404.09882	null
2024-04-11	Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos	Soumyabrata Chaudhuri et.al.	2404.07645	null
2024-04-05	Low-Rank Robust Subspace Tensor Clustering for Metro Passenger Flow Modeling	Jiuyun Hu et.al.	2404.04403	null
2024-04-03	Spatio-temporal Modeling of Count Data	Steffen Maletz et.al.	2404.02982	link
2024-03-31	$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding	Ye Liu et.al.	2404.00801	link
2024-03-30	ST-LLM: Large Language Models Are Effective Temporal Learners	Ruyang Liu et.al.	2404.00308	link
2024-03-28	X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization	Anna Kukleva et.al.	2403.19811	link
2024-03-25	TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models	Zhongwei Zhang et.al.	2403.17005	null
2024-04-13	Recursive Joint Cross-Modal Attention for Multimodal Fusion in Dimensional Emotion Recognition	R. Gnana Praveen et.al.	2403.13659	link
2024-03-19	SUN Team’s Contribution to ABAW 2024 Competition: Audio-visual Valence-Arousal Estimation and Expression Recognition	Denis Dresvyanskiy et.al.	2403.12609	null
2024-03-18	Bayesian Optimization Sequential Surrogate (BOSS) Algorithm: Fast Bayesian Inference for a Broad Class of Bayesian Hierarchical Models	Dayi Li et.al.	2403.12250	null
2024-03-19	Exploring Facial Expression Recognition through Semi-Supervised Pretraining and Temporal Modeling	Jun Yu et.al.	2403.11942	null
2024-03-15	Spatio-temporal Occupancy Models with INLA	Jafet Belmont et.al.	2403.10680	null
2024-03-15	Multivariate Bayesian models with flexible shared interactions for analyzing spatio-temporal patterns of rare cancers	Garazi Retegui et.al.	2403.10440	link
2024-03-13	Leveraging Non-Decimated Wavelet Packet Features and Transformer Models for Time Series Forecasting	Guy P Nason et.al.	2403.08630	null
2024-03-10	Coherent Temporal Synthesis for Incremental Action Segmentation	Guodong Ding et.al.	2403.06102	null
2024-04-26	Audio-Visual Person Verification based on Recursive Fusion of Joint Cross-Attention	R. Gnana Praveen et.al.	2403.04654	link
