Updated on 2024.12.12

Single Object & Visual Language Tracking

Publish Date Title Authors PDF Code
2024-12-03 MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues Zhaofeng Hu et.al. 2412.02734 link
2024-12-03 GSOT3D: Towards Generic 3D Single Object Tracking in the Wild Yifan Jiao et.al. 2412.02129 link
2024-11-28 Improving Accuracy and Generalization for Efficient Visual Tracking Ram Zaveri et.al. 2411.18855 null
2024-11-27 A comparison of extended object tracking with multi-modal sensors in indoor environment Jiangtao Shuai et.al. 2411.18476 null
2024-12-04 A Distractor-Aware Memory for Visual Object Tracking with SAM2 Jovana Videnovic et.al. 2411.17576 link
2024-11-23 How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking Xuchen Li et.al. 2411.15600 null
2024-11-24 ClickTrack: Towards Real-time Interactive Single Object Tracking Kuiran Wang et.al. 2411.13183 null
2024-11-30 SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory Cheng-Yen Yang et.al. 2411.11922 link
2024-12-09 Vision Eagle Attention: a new lens for advancing image classification Mahmudul Hasan et.al. 2411.10564 link
2024-11-14 MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation Jonas Serych et.al. 2411.09551 link
2024-11-12 Visual Tracking with Intermittent Visibility: Switched Control Design and Implementation Yangge Li et.al. 2411.08144 null
2024-11-04 ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model Yiming Sun et.al. 2411.01756 null
2024-10-30 IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking Run Luo et.al. 2410.23907 null
2024-10-27 NT-VOT211: A Large-Scale Benchmark for Night-time Visual Object Tracking Yu Liu et.al. 2410.20421 link
2024-10-19 The Solution for Single Object Tracking Task of Perception Test Challenge 2024 Zhiqiang Zhong et.al. 2410.16329 null
2024-10-13 Gaussian Splatting Visual MPC for Granular Media Manipulation Wei-Cheng Tseng et.al. 2410.09740 null
2024-10-09 DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM Xuchen Li et.al. 2410.02492 null
2024-09-30 Opt-in Camera: Person Identification in Video via UWB Localization and Its Application to Opt-in Systems Matthew Ishige et.al. 2409.19891 null
2024-09-27 Improving Visual Object Tracking through Visual Prompting Shih-Fang Chen et.al. 2409.18901 link
2024-09-26 General Compression Framework for Efficient Transformer Object Tracking Lingyi Hong et.al. 2409.17564 null
2024-09-25 Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2 Chunhui Zhang et.al. 2409.16902 link
2024-09-25 Conditional Generative Denoiser for Nighttime UAV Tracking Yucheng Wang et.al. 2409.16834 link
2024-09-25 Progressive Representation Learning for Real-Time UAV Tracking Changhong Fu et.al. 2409.16652 link
2024-09-25 Enhancing Nighttime UAV Tracking with Light Distribution Suppression Liangliang Yao et.al. 2409.16631 link
2024-09-19 WeHelp: A Shared Autonomy System for Wheelchair Users Abulikemu Abuduweili et.al. 2409.12159 link
2024-09-18 Distilling Channels for Efficient Deep Tracking Shiming Ge et.al. 2409.11785 null
2024-09-13 Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark Xuchen Li et.al. 2409.08887 null
2024-09-10 VBIT: Towards Enhancing Privacy Control Over IoT Devices Jad Al Aaraj et.al. 2409.06233 null
2024-09-03 Ultra-broadband room-temperature Fourier transform spectrometer with watt-level power consumption Jakub Mnich et.al. 2409.01875 null
2024-08-25 Camouflaged Object Tracking: A Benchmark Xiaoyu Guo et.al. 2408.13877 null
2024-08-21 Low-Light Object Tracking: A Benchmark Pengzhi Zhong et.al. 2408.11463 link
2024-08-20 MambaEVT: Event Stream based Visual Object Tracking using State Space Model Xiao Wang et.al. 2408.10487 link
2024-08-05 VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking Yuxuan Lu et.al. 2408.02263 null
2024-09-06 3D Single-object Tracking in Point Clouds with High Temporal Variation Qiao Wu et.al. 2408.02049 null
2024-09-09 SiamMo: Siamese Motion-Centric 3D Object Tracking Yuxiang Yang et.al. 2408.01688 link
2024-08-02 Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach Yabin Zhu et.al. 2408.00969 link
2024-08-06 Broadband THz wave generation and detection in organic crystal PNPA at MHz repetition rates Lukasz A. Sterczewski et.al. 2407.20745 null
2024-07-16 Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers Zhengbo Zhang et.al. 2407.08394 null
2024-07-11 PINN-Ray: A Physics-Informed Neural Network to Model Soft Robotic Fin Ray Fingers Xing Wang et.al. 2407.08222 null
2024-07-07 Addressing single object tracking in satellite imagery through prompt-engineered solutions Athena Psalta et.al. 2407.05518 null
2024-07-07 Learning Motion Blur Robust Vision Transformers with Dynamic Early Exit for Real-Time UAV Tracking You Wu et.al. 2407.05383 null
2024-07-09 P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds Jiahao Nie et.al. 2407.05238 link
2024-07-07 Tracking Reflected Objects: A Benchmark Xiaoyu Guo et.al. 2407.05235 null
2024-07-04 TrackPGD: A White-box Attack using Binary Masks against Robust Transformer Trackers Fatemeh Nourilenjan Nokabadi et.al. 2407.03946 link
2024-07-02 FlowTrack: Point-level Flow Network for 3D Single Object Tracking Shuo Li et.al. 2407.01959 null
2024-09-07 eMoE-Tracker: Environmental MoE-based Transformer for Robust Event-guided Object Tracking Yucheng Chen et.al. 2406.20024 null
2024-06-14 Constrained Motion Planning for a Robotic Endoscope Holder based on Hierarchical Quadratic Programming Jacinto Colan et.al. 2406.09982 null
2024-06-14 Robust compressive tracking via online weighted multiple instance learning Sandeep Singh Sengar et.al. 2406.09914 null
2024-07-01 Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking Xiangyang Yang et.al. 2406.08037 null
2024-06-07 Multi-Granularity Language-Guided Multi-Object Tracking Yuhao Li et.al. 2406.04844 link
2024-06-02 Robust Visual Tracking via Iterative Gradient Descent and Threshold Selection Zhuang Qi et.al. 2406.00589 null
2024-05-28 Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion Hongze Sun et.al. 2405.17903 link
2024-05-27 LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking Shaohua Dong et.al. 2405.17660 null
2024-05-31 Awesome Multi-modal Object Tracking Chunhui Zhang et.al. 2405.14200 link
2024-05-20 DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM Xuchen Li et.al. 2405.12139 null
2024-05-16 A Novel Bounding Box Regression Method for Single Object Tracking Omar Abdelaziz et.al. 2405.10444 null
2024-05-16 Beyond Traditional Single Object Tracking: A Survey Omar Abdelaziz et.al. 2405.10439 null
2024-05-08 TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking Pengcheng Shao et.al. 2405.05004 link
2024-04-22 360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos Yinzhe Xu et.al. 2404.13953 null
2024-05-25 An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training Jin Gao et.al. 2404.12210 link
2024-04-16 Attention-Aware Visualization: Tracking and Responding to User Perception Over Time Arvind Srinivasan et.al. 2404.10732 null
2024-04-15 Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL Fangwei Zhong et.al. 2404.09857 null
2024-04-15 Learning Tracking Representations from Single Point Annotations Qiangqiang Wu et.al. 2404.09504 null
2024-04-11 PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds Weisheng Xu et.al. 2404.07495 link
2024-05-02 Longitudinal Analysis and Quantitative Assessment of Child Development through Mobile Interaction Juan Carlos Ruiz-Garcia et.al. 2404.06919 link
2024-04-09 LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks Jianlang Chen et.al. 2404.06247 link
2024-04-08 Semi-Supervised Novelty Detection for Precise Ultra-Wideband Error Signal Prediction Umberto Albertin et.al. 2404.05351 null
2024-03-29 Context-Aware Integration of Language and Visual References for Natural Language Tracking Yanyan Shao et.al. 2403.19975 null
2024-03-27 TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes Liangyu Xu et.al. 2403.18238 null
2024-03-26 OmniVid: A Generative Framework for Universal Video Understanding Junke Wang et.al. 2403.17935 link
2024-03-26 Exploring Dynamic Transformer for Efficient Object Tracking Jiawen Zhu et.al. 2403.17651 null
2024-03-29 Elysium: Exploring Object-level Perception in Videos via MLLM Han Wang et.al. 2403.16558 link
2024-03-25 Multi-attention Associate Prediction Network for Visual Tracking Xinglong Sun et.al. 2403.16395 null
2024-03-28 SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking Xiaojun Hou et.al. 2403.16002 link
2024-03-23 Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking Shaoyu Sun et.al. 2403.15831 null
2024-03-19 TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO Chaoran Xiong et.al. 2403.12504 null
2024-03-18 Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model Jan Krejčí et.al. 2403.11978 null
2024-03-16 A Spectrum-based Image Denoising Method with Edge Feature Enhancement Peter Luvton et.al. 2403.11036 null
2024-03-15 Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers Jinxia Xie et.al. 2403.10574 null
2024-03-14 OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning Lingyi Hong et.al. 2403.09634 null
2024-02-27 ACTrack: Adding Spatio-Temporal Condition for Visual Object Tracking Yushan Han et.al. 2403.07914 null
2024-04-03 Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline Xiao Wang et.al. 2403.05839 link
2024-03-08 Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance Liting Lin et.al. 2403.05231 link
2024-03-08 Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy Yuelin Zhang et.al. 2403.05146 link
2024-03-06 VastTrack: Vast Category Visual Object Tracking Liang Peng et.al. 2403.03493 link
2024-02-28 Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks Zhewei Wu et.al. 2402.17976 null
2024-02-26 SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking Yu Lin et.al. 2402.16249 link
2024-02-26 Reading Relevant Feature from Global Representation Memory for Visual Object Tracking Xinyu Zhou et.al. 2402.14392 null
2024-02-13 Optimized Information Flow for Transformer Tracking Janani Kugarajeevan et.al. 2402.08195 link
2024-02-07 BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision Xin Zhao et.al. 2402.04519 null
2024-02-04 Spatio-temporal Prompting Network for Robust Video Feature Extraction Guanxiong Sun et.al. 2402.02574 link
2024-01-24 Small Object Tracking in LiDAR Point Cloud: Learning the Target-awareness Prototype and Fine-grained Search Region Shengjing Tian et.al. 2401.13285 null
2024-01-23 Correlation-Embedded Transformer Tracking: A Single-Branch Framework Fei Xie et.al. 2401.12743 link
2024-01-20 Unifying Visual and Vision-Language Tracking via Contrastive Learning Yinchao Ma et.al. 2401.11228 link
2024-01-20 Towards Category Unification of 3D Single Object Tracking on Point Clouds Jiahao Nie et.al. 2401.11204 null
2024-01-18 Multi-task Learning for Joint Re-identification, Team Affiliation, and Role Classification for Sports Visual Tracking Amir M. Mansourian et.al. 2401.09942 null
2024-01-12 Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements Muhammad Wasim Nawaz et.al. 2401.06396 null
2024-01-18 Hold ’em and Fold ’em: Towards Human-scale, Feedback-Controlled Soft Origami Robots Immanuel Ampomah Mensah et.al. 2401.04650 null
2024-01-06 Explicit Visual Prompts for Visual Object Tracking Liangtao Shi et.al. 2401.03142 link
2024-01-03 ODTrack: Online Dense Temporal Token Learning for Visual Tracking Yaozong Zheng et.al. 2401.01686 link
2023-12-27 X Modality Assisting RGBT Object Tracking Zhaisheng Ding et.al. 2312.17273 null
2023-12-22 Cross-Modal Object Tracking via Modality-Aware Fusion Network and A Large-Scale Dataset Lei Liu et.al. 2312.14446 link
2023-12-18 Multi-Correlation Siamese Transformer Network with Dense Connection for 3D Single Object Tracking Shihao Feng et.al. 2312.11051 link
2023-12-17 Robust 3D Tracking with Quality-Aware Shape Completion Jingwen Zhang et.al. 2312.10608 null
2023-12-15 Tracking Skiers from the Top to the Bottom Matteo Dunnhofer et.al. 2312.09723 null
2023-12-11 M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking Jiaming Liu et.al. 2312.06117 link
2023-12-07 Instance Tracking in 3D Scenes from Egocentric Videos Yunhan Zhao et.al. 2312.04117 link
2024-02-19 Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking Jiawei Ge et.al. 2311.17085 null
2023-11-21 Visual tracking brain computer interface Changxing Huang et.al. 2311.12592 null
2024-01-10 ViKi-HyCo: A Hybrid-Control approach for complex car-like maneuvers Edison P. Velasco Sánchez et.al. 2311.07268 null

Large Language Model

Publish Date Title Authors PDF Code
2024-12-10 Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences Alan Nawzad Amin et.al. 2412.07763 link
2024-12-10 SAT: Spatial Aptitude Training for Multimodal Language Models Arijit Ray et.al. 2412.07755 null
2024-12-10 LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation Models Ziqi Lu et.al. 2412.07746 null
2024-12-10 Zero-Shot ATC Coding with Large Language Models for Clinical Assessments Zijian Chen et.al. 2412.07743 null
2024-12-10 AI Expands Scientists’ Impact but Contracts Science’s Focus Qianyue Hao et.al. 2412.07727 null
2024-12-10 Granite Guardian Inkit Padhi et.al. 2412.07724 link
2024-12-10 Leveraging Content and Context Cues for Low-Light Image Enhancement Igor Morawski et.al. 2412.07693 null
2024-12-10 DriveMM: All-in-One Large Multimodal Model for Autonomous Driving Zhijian Huang et.al. 2412.07689 link
2024-12-10 Privacy-Preserving Customer Support: A Framework for Secure and Scalable Interactions Anant Prakash Awasthi et.al. 2412.07687 null
2024-12-10 TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation Alfredo Garrachón Ruiz et.al. 2412.07682 null
2024-12-10 RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models Greg Heinrich et.al. 2412.07679 null
2024-12-10 Ask Humans or AI? Exploring Their Roles in Visualization Troubleshooting Shuyu Shen et.al. 2412.07673 null
2024-12-10 FlexLLM: Exploring LLM Customization for Moving Target Defense on Black-Box LLMs Against Jailbreak Attacks Bocheng Chen et.al. 2412.07672 null
2024-12-10 Automating Business Intelligence Requirements with Generative AI and Semantic Search Nimrod Busany et.al. 2412.07668 null
2024-12-10 Searching for Structure: Investigating Emergent Communication with Large Language Models Tom Kouwenhoven et.al. 2412.07646 null
2024-12-10 TrojanWhisper: Evaluating Pre-trained LLMs to Detect and Localize Hardware Trojans Md Omar Faruque et.al. 2412.07636 null
2024-12-10 ChocoLlama: Lessons Learned From Teaching Llamas Dutch Matthieu Meeus et.al. 2412.07633 null
2024-12-10 Piece of Table: A Divide-and-Conquer Approach for Selecting Sub-Tables in Table Question Answering Wonjin Lee et.al. 2412.07629 null
2024-12-10 OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations Linke Ouyang et.al. 2412.07626 link
2024-12-10 DRUM: Learning Demonstration Retriever for Large MUlti-modal Models Ellen Yi-Ge et.al. 2412.07619 null
2024-12-09 Delve into Visual Contrastive Decoding for Hallucination Mitigation of Large Vision-Language Models Yi-Lun Lee et.al. 2412.06775 link
2024-12-09 Visual Lexicon: Rich Image Features in Language Space XuDong Wang et.al. 2412.06774 null
2024-12-09 Training Large Language Models to Reason in a Continuous Latent Space Shibo Hao et.al. 2412.06769 null
2024-12-09 Ranking-aware adapter for text-driven image ordering with CLIP Wei-Hsiang Yu et.al. 2412.06760 link
2024-12-09 Why Do Developers Engage with ChatGPT in Issue-Tracker? Investigating Usage and Reliance on ChatGPT-Generated Code Joy Krishan Das et.al. 2412.06757 null
2024-12-09 Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models Neel Jain et.al. 2412.06748 null
2024-12-09 ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities Adhiraj Ghosh et.al. 2412.06745 null
2024-12-09 JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM Takuro Fujii et.al. 2412.06738 null
2024-12-09 AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and Benchmark Lan Li et.al. 2412.06724 null
2024-12-09 How to Merge Your Multimodal Models Over Time? Sebastian Dziadzio et.al. 2412.06712 null
2024-12-09 OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions Yi-Kai Zhang et.al. 2412.06693 null
2024-12-09 Exploring Critical Testing Scenarios for Decision-Making Policies: An LLM Approach Weichao Xu et.al. 2412.06684 null
2024-12-09 Toward LLM-Agent-Based Modeling of Transportation Systems: A Conceptual Framework Tianming Liu et.al. 2412.06681 null
2024-12-09 I Don’t Know: Explicit Modeling of Uncertainty with an [IDK] Token Roi Cohen et.al. 2412.06676 null
2024-12-09 ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance Chunwei Wang et.al. 2412.06673 null
2024-12-09 MuMu-LLaMA: Multi-modal Music Understanding and Generation via Large Language Models Shansong Liu et.al. 2412.06660 null
2024-12-09 Chatbots im Schulunterricht: Wir testen das Fobizz-Tool zur automatischen Bewertung von Hausaufgaben Rainer Mühlhoff et.al. 2412.06651 null
2024-12-09 The Narrow Gate: Localized Image-Text Communication in Vision-Language Models Alessandro Serra et.al. 2412.06646 null
2024-12-09 MAVias: Mitigate any Visual Bias Ioannis Sarridis et.al. 2412.06632 null
2024-12-09 Copyright-Protected Language Generation via Adaptive Model Fusion Javier Abad et.al. 2412.06619 link
2024-12-06 Birth and Death of a Rose Chen Geng et.al. 2412.05278 null
2024-12-06 Sparse autoencoders reveal selective remapping of visual concepts during adaptation Hyesu Lim et.al. 2412.05276 link
2024-12-06 Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Zhe Chen et.al. 2412.05271 null
2024-12-06 APOLLO: SGD-like Memory, AdamW-level Performance Hanqing Zhu et.al. 2412.05270 null
2024-12-06 Uncertainty Quantification for Transformer Models for Dark-Pattern Detection Javier Muñoz et.al. 2412.05251 null
2024-12-06 Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization Luca Masserano et.al. 2412.05244 null
2024-12-06 CompCap: Improving Multimodal Large Language Models with Composite Captions Xiaohui Chen et.al. 2412.05243 null
2024-12-06 MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale Jarvis Guo et.al. 2412.05237 null
2024-12-06 BEExformer: A Fast Inferencing Transformer Architecture via Binarization with Multiple Early Exits Wazib Ansar et.al. 2412.05225 null
2024-12-06 100% Hallucination Elimination Using Acurai Michael C. Wood et.al. 2412.05223 null
2024-12-06 Evaluating and Aligning CodeLLMs on Human Preference Jian Yang et.al. 2412.05210 null
2024-12-06 A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges Aditi Singh et.al. 2412.05208 null
2024-12-06 Are Frontier Large Language Models Suitable for Q&A in Science Centres? Jacob Watson et.al. 2412.05200 null
2024-12-06 SurgBox: Agent-Driven Operating Room Sandbox with Surgery Copilot Jinlin Wu et.al. 2412.05187 link
2024-12-06 LinVT: Empower Your Image-level Large Language Model to Understand Videos Lishuai Gao et.al. 2412.05185 link
2024-12-06 QueEn: A Large Language Model for Quechua-English Translation Junhao Chen et.al. 2412.05184 null
2024-12-06 Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models Kuofeng Gao et.al. 2412.05167 null
2024-12-06 Enhancing Cross-Language Code Translation via Task-Specific Embedding Alignment in Retrieval-Augmented Generation Manish Bhattarai et.al. 2412.05159 null
2024-12-06 Multimodal Fact-Checking with Vision Language Models: A Probing Classifier based Solution with Embedding Strategies Recep Firat Cekinel et.al. 2412.05155 null
2024-12-06 A text-to-tabular approach to generate synthetic patient data using LLMs Margaux Tornqvist et.al. 2412.05153 null
2024-12-05 Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail Luca Bartolomei et.al. 2412.04472 link
2024-12-05 NVILA: Efficient Frontier Visual Language Models Zhijian Liu et.al. 2412.04468 null
2024-12-05 VisionZip: Longer is Better but Not Necessary in Vision Language Models Senqiao Yang et.al. 2412.04467 link
2024-12-05 Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection Enshen Zhou et.al. 2412.04455 null
2024-12-05 p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay Jun Zhang et.al. 2412.04449 link
2024-12-05 EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios Lu Qiu et.al. 2412.04447 null
2024-12-05 DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models Yizhuo Li et.al. 2412.04446 null
2024-12-05 Moto: Latent Motion Token as the Bridging Language for Robot Manipulation Yi Chen et.al. 2412.04445 null
2024-12-05 Towards Real-Time Open-Vocabulary Video Instance Segmentation Bin Yan et.al. 2412.04434 null
2024-12-05 Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Yuying Ge et.al. 2412.04432 link
2024-12-05 Grounding Descriptions in Images informs Zero-Shot Visual Recognition Shaunak Halbe et.al. 2412.04429 link
2024-12-05 Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Jiuhai Chen et.al. 2412.04424 link
2024-12-05 Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation Xuying Li et.al. 2412.04415 null
2024-12-05 Establishing Task Scaling Laws via Compute-Efficient Model Ladders Akshita Bhagia et.al. 2412.04403 null
2024-12-05 SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding Rong Li et.al. 2412.04383 null
2024-12-05 Discriminative Fine-tuning of LVLMs Yassine Ouali et.al. 2412.04378 null
2024-12-05 Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting Edoardo Cetin et.al. 2412.04368 null
2024-12-05 Approximate Top-$k$ for Increased Parallelism Oscar Key et.al. 2412.04358 null
2024-12-05 Retrieval-Augmented Machine Translation with Unstructured Knowledge Jiaan Wang et.al. 2412.04342 link
2024-12-05 Liquid: Language Models are Scalable Multi-modal Generators Junfeng Wu et.al. 2412.04332 null
2024-12-04 From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents Xinyi Mou et.al. 2412.03563 link
2024-12-04 FLAIR: VLM with Fine-grained Language-informed Image Representations Rui Xiao et.al. 2412.03561 link
2024-12-04 Best-of-N Jailbreaking John Hughes et.al. 2412.03556 link
2024-12-04 PaliGemma 2: A Family of Versatile VLMs for Transfer Andreas Steiner et.al. 2412.03555 null
2024-12-04 SPICE: Smart Projection Interface for Cooking Enhancement Vera Prohaska et.al. 2412.03551 null
2024-12-04 Perception Tokens Enhance Visual Reasoning in Multimodal Language Models Mahtab Bigverdi et.al. 2412.03548 null
2024-12-04 Evaluating Gender Bias Transfer between Pre-trained and Prompt-Adapted Language Models Natalie Mackraz et.al. 2412.03537 null
2024-12-04 A Review on Scientific Knowledge Extraction using Large Language Models in Biomedical Sciences Gabriel Lino Garcia et.al. 2412.03531 null
2024-12-04 FANAL – Financial Activity News Alerting Language Modeling Framework Urjitkumar Patel et.al. 2412.03527 null
2024-12-04 You’re (Not) My Type – Can LLMs Generate Feedback of Specific Types for Introductory Programming Tasks? Dominic Lohr et.al. 2412.03516 null
2024-12-04 Distillation of Diffusion Features for Semantic Correspondence Frank Fundel et.al. 2412.03512 null
2024-12-04 Tight PAC-Bayesian Risk Certificates for Contrastive Learning Anna van Elst et.al. 2412.03486 link
2024-12-04 Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning Neale Ratzlaff et.al. 2412.03467 null
2024-12-04 Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks Dario Serez et.al. 2412.03453 link
2024-12-04 From Words to Workflows: Automating Business Processes Laura Minkova et.al. 2412.03446 null
2024-12-04 Assessing Foundation Models’ Transferability to Physiological Signals in Precision Medicine Matthias Christenson et.al. 2412.03427 null
2024-12-04 PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation Ao Wang et.al. 2412.03409 link
2024-12-04 RedStone: Curating General, Code, Math, and QA Data for Large Language Models Yaoyao Chang et.al. 2412.03398 null
2024-12-04 Enhancing Supply Chain Visibility with Generative AI: An Exploratory Case Study on Relationship Prediction in Knowledge Graphs Ge Zheng et.al. 2412.03390 null
2024-12-04 WiS Platform: Enhancing Evaluation of LLM-Based Multi-Agent Systems Through Game-Based Analysis Chengwei Hu et.al. 2412.03359 null
2024-12-03 T-REG: Preference Optimization with Token-Level Reward Regularization Wenxuan Zhou et.al. 2412.02685 null
2024-12-03 Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models Yuda Song et.al. 2412.02674 null
2024-12-03 LLM-Enhanced Path Planning: Safe and Efficient Autonomous Navigation with Instructional Inputs Pranav Doma et.al. 2412.02655 null
2024-12-03 Time-Reversal Provides Unsupervised Feedback to LLMs Yerram Varun et.al. 2412.02626 null
2024-12-03 Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions Kai Sun et.al. 2412.02621 null
2024-12-03 Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback Hiroki Furuta et.al. 2412.02617 null
2024-12-03 GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot Aohan Zeng et.al. 2412.02612 link
2024-12-03 AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? Kaixiong Gong et.al. 2412.02611 null
2024-12-03 Interpretable Company Similarity with Sparse Autoencoders Marco Molinari et.al. 2412.02605 null
2024-12-03 CEGI: Measuring the trade-off between efficiency and carbon emissions for SLMs and VLMs Abhas Kumar et.al. 2412.02602 null
2024-12-03 PrefixLLM: LLM-aided Prefix Circuit Design Weihua Xiao et.al. 2412.02594 null
2024-12-03 OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation Junyuan Zhang et.al. 2412.02592 link
2024-12-03 Explainable CTR Prediction via LLM Reasoning Xiaohan Yu et.al. 2412.02588 null
2024-12-03 Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey Chenyang Liu et.al. 2412.02573 link
2024-12-03 SJTU: Spatial judgments in multimodal models towards unified segmentation through coordinate detection Joongwon Chae et.al. 2412.02565 link
2024-12-03 Semantic Tokens in Retrieval Augmented Generation Joel Suro et.al. 2412.02563 null
2024-12-03 Patent-CR: A Dataset for Patent Claim Revision Lekang Jiang et.al. 2412.02549 null
2024-12-03 Multimodal Remote Sensing Scene Classification Using VLMs and Dual-Cross Attention Networks Jinjin Cai et.al. 2412.02531 null
2024-12-03 LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data Hanyu Zhang et.al. 2412.02525 null
2024-12-03 OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations Caixin Kang et.al. 2412.02479 null
2024-12-02 T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs Shukang Yin et.al. 2411.19951 link
2024-12-02 Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM’s Reasoning Capability Zicheng Lin et.al. 2411.19943 null
2024-11-29 VLSBench: Unveiling Visual Leakage in Multimodal Safety Xuhao Hu et.al. 2411.19939 null
2024-11-29 On Domain-Specific Post-Training for Multimodal Large Language Models Daixuan Cheng et.al. 2411.19930 null
2024-11-29 SIMS: Simulating Human-Scene Interactions with Real World Script Planning Wenjia Wang et.al. 2411.19921 null
2024-11-29 FlowCLAS: Enhancing Normalizing Flow Via Contrastive Learning For Anomaly Segmentation Chang Won Lee et.al. 2411.19888 null
2024-11-29 PDDLFuse: A Tool for Generating Diverse Planning Domains Vedant Khandelwal et.al. 2411.19886 null
2024-12-02 LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states Luis Ibanez-Lissen et.al. 2411.19876 null
2024-11-29 DeMo: Decoupled Momentum Optimization Bowen Peng et.al. 2411.19870 link
2024-11-29 AIDetx: a compression-based method for identification of machine-learning generated text Leonardo Almeida et.al. 2411.19869 link
2024-11-29 Reverse Thinking Makes LLMs Stronger Reasoners Justin Chih-Yao Chen et.al. 2411.19865 null
2024-11-29 Cross-Domain Recommendation Meets Large Language Models Ajay Krishna Vajjala et.al. 2411.19862 link
2024-11-29 What fifty-one years of Linguistics and Artificial Intelligence research tell us about their correlation: A scientometric review Mohammed Q. Shormani et.al. 2411.19858 null
2024-11-29 Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation Dimosthenis Antypas et.al. 2411.19832 null
2024-11-29 Advanced System Integration: Analyzing OpenAPI Chunking for Retrieval-Augmented Generation Robin D. Pesl et.al. 2411.19804 null
2024-11-29 INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge Angelika Romanou et.al. 2411.19799 null
2024-11-29 MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks Yiming Wu et.al. 2411.19786 null
2024-11-29 PerLA: Perceptive 3D Language Assistant Guofeng Mei et.al. 2411.19774 null
2024-11-29 LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos Tiantian Geng et.al. 2411.19772 null
2024-11-29 Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models Kaican Li et.al. 2411.19757 link
2024-11-27 Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation Yueru Jia et.al. 2411.18623 null
2024-11-27 Cross-modal Information Flow in Multimodal Large Language Models Zhi Zhang et.al. 2411.18620 null
2024-11-27 Diffusion Self-Distillation for Zero-Shot Customized Image Generation Shengqu Cai et.al. 2411.18616 null
2024-11-27 Automated Literature Review Using NLP Techniques and LLM-Based Retrieval-Augmented Generation Nurshat Fateh Ali et.al. 2411.18583 null
2024-11-27 Challenges in Adapting Multilingual LLMs to Low-Resource Languages using LoRA PEFT Tuning Omkar Khade et.al. 2411.18571 null
2024-11-27 A Pipeline of Neural-Symbolic Integration to Enhance Spatial Reasoning in Large Language Models Rong Wang et.al. 2411.18564 null
2024-11-27 DexDiffuser: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation Zhixuan Liang et.al. 2411.18562 null
2024-11-27 Retrofitting (Large) Language Models with Dynamic Tokenization Darius Feher et.al. 2411.18553 null
2024-11-27 AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans Dillon Loh et.al. 2411.18539 link
2024-11-27 Emergence of Self-Identity in AI: A Mathematical Framework and Empirical Study with Generative Large Language Models Minhyeok Lee et.al. 2411.18530 link
2024-11-27 LLM-ABBA: Understand time series via symbolic approximation Erin Carson et.al. 2411.18506 null
2024-11-27 GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation Pengfei Zhou et.al. 2411.18499 null
2024-11-27 Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS Jinyang Wu et.al. 2411.18478 null
2024-11-27 Draft Model Knows When to Stop: A Self-Verification Length Policy for Speculative Decoding Ziyin Zhang et.al. 2411.18462 link
2024-11-27 Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator Frederic Kirstein et.al. 2411.18444 null
2024-11-27 An AI-Assisted Multi-Agent Dual Dialogue System to Support Mental Health Care Providers Onno P. Kampman et.al. 2411.18429 null
2024-11-27 FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving Ao Shen et.al. 2411.18424 null
2024-11-27 Politicians vs ChatGPT. A study of presuppositions in French and Italian political communication Davide Garassino et.al. 2411.18403 null
2024-11-27 Topic Modeling and Sentiment Analysis on Japanese Online Media’s Coverage of Nuclear Energy Yifan Sun et.al. 2411.18383 null
2024-11-27 ChatGPT as speechwriter for the French presidents Dominique Labbé et.al. 2411.18382 null
2024-11-26 Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats Jiaxin Wen et.al. 2411.17693 null
2024-11-26 Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens Xu Ouyang et.al. 2411.17691 null
2024-11-26 Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration Yuhang Han et.al. 2411.17686 null
2024-11-26 Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning Zhu Xu et.al. 2411.17679 link
2024-11-26 Instance-Aware Graph Prompt Learning Jiazheng Li et.al. 2411.17676 null
2024-11-26 Push the Limit of Multi-modal Emotion Recognition by Prompting LLMs with Receptive-Field-Aware Attention Weighting Liyun Zhang et.al. 2411.17674 null
2024-11-26 SketchAgent: Language-Driven Sequential Sketch Generation Yael Vinker et.al. 2411.17673 null
2024-11-26 Synthetic Data Generation with LLM for Improved Depression Prediction Andrea Kang et.al. 2411.17672 null
2024-11-26 How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations Hyunji Lee et.al. 2411.17666 null
2024-11-26 Toward High-Performance LLM Serving: A Simulation-Based Approach for Identifying Optimal Parallelism Yi-Chien Lin et.al. 2411.17651 null
2024-11-26 On Limitations of LLM as Annotator for Low Resource Languages Suramya Jadhav et.al. 2411.17637 null
2024-11-26 MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation Harsh Singh et.al. 2411.17636 null
2024-11-26 Data-driven development of cycle prediction models for lithium metal batteries using multi modal mining Jaewoong Lee et.al. 2411.17625 null
2024-11-26 Scaling Speech-Text Pre-training with Synthetic Interleaved Data Aohan Zeng et.al. 2411.17607 null
2024-11-26 HyperSeg: Towards Universal Visual Segmentation with Large Language Model Cong Wei et.al. 2411.17606 link
2024-11-26 Making History Readable Bipasha Banerjee et.al. 2411.17600 null
2024-11-26 Agentic AI for Improving Precision in Identifying Contributions to Sustainable Development Goals William A. Ingram et.al. 2411.17598 null
2024-11-26 Can artificial intelligence predict clinical trial outcomes? Shuyi Jin et.al. 2411.17595 null
2024-11-26 RTL-Breaker: Assessing the Security of LLMs against Backdoor Attacks on HDL Code Generation Lakshmi Likhitha Mankali et.al. 2411.17569 null
2024-11-26 Natural Language Understanding and Inference with MLLM in Visual Question Answering: A Survey Jiayi Kuang et.al. 2411.17558 null
2024-11-25 Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts? Sohee Yang et.al. 2411.16679 null
2024-11-25 Diffusion Features for Zero-Shot 6DoF Object Pose Estimation Bernd Von Gimborn et.al. 2411.16668 null
2024-11-25 DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation Zun Wang et.al. 2411.16657 null
2024-11-25 Self-Generated Critiques Boost Reward Modeling for Language Models Yue Yu et.al. 2411.16646 null
2024-11-25 Preventing Jailbreak Prompts as Malicious Tools for Cybercriminals: A Cyber Defense Perspective Jean Marie Tshimula et.al. 2411.16642 null
2024-11-25 StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training Kaustubh Ponkshe et.al. 2411.16618 null
2024-11-25 Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models Ronghuan Wu et.al. 2411.16602 null
2024-11-25 From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge Dawei Li et.al. 2411.16594 link
2024-11-25 Large Language Model-based Decision-making for COLREGs and the Control of Autonomous Surface Vehicles Klinsmann Agyei et.al. 2411.16587 null
2024-11-25 MarketGPT: Developing a Pre-trained transformer (GPT) for Modeling Financial Time Series Aaron Wheeler et.al. 2411.16585 link
2024-11-25 Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision Zhiheng Xi et.al. 2411.16579 null
2024-11-25 Predictive Power of LLMs in Financial Markets Jerick Shi et.al. 2411.16569 null
2024-11-25 EnStack: An Ensemble Stacking Framework of Large Language Models for Enhanced Vulnerability Detection in Source Code Shahriyar Zaman Ridoy et.al. 2411.16561 null
2024-11-25 Generating Out-Of-Distribution Scenarios Using Language Models Erfan Aasi et.al. 2411.16554 null
2024-11-25 Representation Collapsing Problems in Vector Quantization Wenhao Zhao et.al. 2411.16550 null
2024-11-25 RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics Chan Hee Song et.al. 2411.16537 null
2024-11-25 Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings Carolin M. Schuster et.al. 2411.16527 null
2024-11-25 Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency Jerry Yao-Chieh Hu et.al. 2411.16525 null
2024-11-25 LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation Steven Song et.al. 2411.16523 null
2024-11-25 Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis Boming Miao et.al. 2411.16503 null
2024-11-22 Measuring Bullshit in the Language Games played by ChatGPT Alessandro Trevisan et.al. 2411.15129 null
2024-11-22 Health AI Developer Foundations Atilla P. Kiraly et.al. 2411.15128 null
2024-11-22 TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Nathan Lambert et.al. 2411.15124 link
2024-11-22 RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts Hjalmar Wijk et.al. 2411.15114 link
2024-11-22 Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion Samarth N Ramesh et.al. 2411.15113 null
2024-11-22 AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution Fengyuan Liu et.al. 2411.15102 link
2024-11-22 What You See is Not What You Get: Neural Partial Differential Equations and The Illusion of Learning Arvind Mohan et.al. 2411.15101 null
2024-11-22 XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models Yixin Dong et.al. 2411.15100 null
2024-11-22 Context-Aware Multimodal Pretraining Karsten Roth et.al. 2411.15099 null
2024-11-22 mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA Tao Zhang et.al. 2411.15041 null
2024-11-22 One to rule them all: natural language to bind communication, perception and action Simone Colombani et.al. 2411.15033 null
2024-11-22 Time is on my sight: scene graph filtering for dynamic environment perception in an LLM-driven robot Simone Colombani et.al. 2411.15027 null
2024-11-22 DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models Keda Tao et.al. 2411.15024 null
2024-11-22 FTA generation using GenAI with an Autonomy sensor Usecase Sneha Sudhir Shetiya et.al. 2411.15007 null
2024-11-22 ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data Junhong Shen et.al. 2411.15004 link
2024-11-22 Generative AI may backfire for counterspeech Dominik Bär et.al. 2411.14986 null
2024-11-22 Exploring Foundation Models Fine-Tuning for Cytology Classification Manon Dausort et.al. 2411.14975 link
2024-11-22 Open-Amp: Synthetic Data Framework for Audio Effect Foundation Models Alec Wright et.al. 2411.14972 link
2024-11-22 SwissADT: An Audio Description Translation System for Swiss Languages Lukas Fischer et.al. 2411.14967 null
2024-11-22 LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement Jieming Bian et.al. 2411.14961 null
2024-11-21 Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models Yuhao Dong et.al. 2411.14432 link
2024-11-21 Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation Zhuoman Liu et.al. 2411.14423 null
2024-11-21 From RNNs to Foundation Models: An Empirical Study on Commercial Building Energy Consumption Shourya Bose et.al. 2411.14421 null
2024-11-21 Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding Yiming Zhang et.al. 2411.14401 null
2024-11-21 Lightweight Safety Guardrails Using Fine-tuned BERT Embeddings Aaron Zheng et.al. 2411.14398 null
2024-11-21 UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages Bethel Melesse Tessema et.al. 2411.14343 link
2024-11-21 SplatR: Experience Goal Visual Rearrangement with 3D Gaussian Splatting and Dense Feature Matching Arjun P S et.al. 2411.14322 null
2024-11-21 Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training Zheheng Luo et.al. 2411.14318 null
2024-11-21 Automated Generation of Code Debugging Exercises Victor-Alexandru Pădurean et.al. 2411.14303 null
2024-11-21 Auto-SPICE: Leveraging LLMs for Dataset Creation via Automated SPICE Netlist Extraction from Analog Circuit Diagrams Jitendra Bhandari et.al. 2411.14299 link
2024-11-21 EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild Yumeng Liu et.al. 2411.14280 null
2024-11-21 Looking Beyond Text: Reducing Language bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance Haozhe Zhao et.al. 2411.14279 null
2024-11-21 Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models Iacopo Ghinassi et.al. 2411.14272 link
2024-11-21 Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective Ernests Lavrinovics et.al. 2411.14258 null
2024-11-21 Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models Javier Ferrando et.al. 2411.14257 null
2024-11-21 Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs Zeyu Dong et.al. 2411.14256 null
2024-11-21 Intent-Aware Dialogue Generation and Multi-Task Contrastive Learning for Multi-Turn Intent Classification Junhua Liu et.al. 2411.14252 null
2024-11-21 Natural Language Reinforcement Learning Xidong Feng et.al. 2411.14251 null
2024-11-21 FocusLLaVA: A Coarse-to-Fine Approach for Efficient and Effective Visual Token Compression Yuke Zhu et.al. 2411.14228 null
2024-11-21 Towards Context-Rich Automated Biodiversity Assessments: Deriving AI-Powered Insights from Camera Trap Data Paul Fergus et.al. 2411.14219 null
2024-11-20 Find Any Part in 3D Ziqi Ma et.al. 2411.13550 null
2024-11-20 SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs Shirley Kokane et.al. 2411.13547 null
2024-11-20 Promoting User Data Autonomy During the Dissolution of a Monopolistic Firm Rushabh Solanki et.al. 2411.13546 null
2024-11-20 BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Davide Paglieri et.al. 2411.13543 null
2024-11-20 Metacognition for Unknown Situations and Environments (MUSE) Rodolfo Valiente et.al. 2411.13537 null
2024-11-20 Predictive Insights into LGBTQ+ Minority Stress: A Transductive Exploration of Social Media Discourse S. Chapagain et.al. 2411.13534 link
2024-11-20 Advancing Complex Medical Communication in Arabic with Sporo AraSum: Surpassing Existing Large Language Models Chanseo Lee et.al. 2411.13518 null
2024-11-20 Disentangling Memory and Reasoning Ability in Large Language Models Mingyu Jin et.al. 2411.13504 link
2024-11-20 Neural machine translation of seismic waves for petrophysical inversion José Cunha Teixeira et.al. 2411.13491 null
2024-11-20 Utilizing Large Language Models to Synthesize Product Desirability Datasets John D. Hastings et.al. 2411.13485 null
2024-11-20 PatentEdits: Framing Patent Novelty as Textual Entailment Ryan Lee et.al. 2411.13477 null
2024-11-20 When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training Haonan Wang et.al. 2411.13476 link
2024-11-20 SoK: A Systems Perspective on Compound AI Threats and Countermeasures Sarbartha Banerjee et.al. 2411.13459 null
2024-11-20 LIMBA: An Open-Source Framework for the Preservation and Valorization of Low-Resource Languages using Generative Models Salvatore Mario Carta et.al. 2411.13453 null
2024-11-20 AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations Gaurav Verma et.al. 2411.13451 null
2024-11-20 WaterPark: A Robustness Assessment of Language Model Watermarking Jiacheng Liang et.al. 2411.13425 link
2024-11-20 Unleashing the Power of Large Language Models for Group POI Recommendations Jing Long et.al. 2411.13415 null
2024-11-20 A Survey On Enhancing Reinforcement Learning in Complex Environments: Insights from Human and LLM Feedback Alireza Rashidi Laleh et.al. 2411.13410 null
2024-11-20 Unification of Balti and trans-border sister dialects in the essence of LLMs and AI Technology Muhammad Sharif et.al. 2411.13409 null
2024-11-20 Transformer-Based Contextualized Language Models Joint with Neural Networks for Natural Language Inference in Vietnamese Dat Van-Thanh Nguyen et.al. 2411.13407 null
2024-11-19 ACING: Actor-Critic for Instruction Learning in Black-Box Large Language Models Salma Kharrat et.al. 2411.12736 link
2024-11-19 Information Theory of Meaningful Communication Doron Sivan et.al. 2411.12728 null
2024-11-19 CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs Zhehan Kan et.al. 2411.12713 null
2024-11-19 Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs Ahmed Akib Jawad Karim et.al. 2411.12712 null
2024-11-19 Strengthening Fake News Detection: Leveraging SVM and Sophisticated Text Vectorization Techniques. Defying BERT? Ahmed Akib Jawad Karim et.al. 2411.12703 null
2024-11-19 When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations Huaizhi Ge et.al. 2411.12701 null
2024-11-19 SparseInfer: Training-free Prediction of Activation Sparsity for Fast LLM Inference Jiho Shin et.al. 2411.12692 null
2024-11-19 Neurosymbolic Graph Enrichment for Grounded World Models Stefano De Giorgis et.al. 2411.12671 null
2024-11-19 DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models Vinay Kumar Sankarapu et.al. 2411.12643 link
2024-11-19 Improving Controllability and Editability for Pretrained Text-to-Music Generation Models Yixiao Zhang et.al. 2411.12641 null
2024-11-19 Provable unlearning in topic modeling and downstream tasks Stanley Wei et.al. 2411.12600 null
2024-11-19 AdaCM$^2$: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction Yuanbin Man et.al. 2411.12593 null
2024-11-19 Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models Laura Ruis et.al. 2411.12580 link
2024-11-19 Large Language Models for Combinatorial Optimization of Design Structure Matrix Shuo Jiang et.al. 2411.12571 null
2024-11-19 Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues Riccardo Grazzi et.al. 2411.12537 link
2024-11-19 Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution Yang Zou et.al. 2411.12530 link
2024-11-19 Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus Terufumi Morishita et.al. 2411.12498 link
2024-11-19 AI Flow at the Network Edge Jiawei Shao et.al. 2411.12469 null
2024-11-19 Guide-to-Explain for Controllable Summarization Sangwon Ryu et.al. 2411.12460 null
2024-11-19 Neon: News Entity-Interaction Extraction for Enhanced Question Answering Sneha Singhania et.al. 2411.12449 null
2024-11-18 Bi-Mamba: Towards Accurate 1-Bit State Space Models Shengkun Tang et.al. 2411.11843 null
2024-11-18 Tackling prediction tasks in relational databases with LLMs Marek Wydmuch et.al. 2411.11829 null
2024-11-18 Exploring adversarial robustness of JPEG AI: methodology, comparison and new methods Egor Kovalev et.al. 2411.11795 null
2024-11-18 LLM-IE: A Python Package for Generative Information Extraction with Large Language Models Enshuo Hsu et.al. 2411.11779 null
2024-11-18 sMoRe: Enhancing Object Manipulation and Organization in Mixed Reality Spaces with LLMs and Generative AI Yunhao Xing et.al. 2411.11752 null
2024-11-18 BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration Yuzong Chen et.al. 2411.11745 link
2024-11-18 Moral Persuasion in Large Language Models: Evaluating Susceptibility and Ethical Alignment Allison Huang et.al. 2411.11731 link
2024-11-18 Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation Mingchao Qi et.al. 2411.11714 link
2024-11-18 FedCoLLM: A Parameter-Efficient Federated Co-tuning Framework for Large and Small Language Models Tao Fan et.al. 2411.11707 null
2024-11-18 MC-LLaVA: Multi-Concept Personalized Vision-Language Model Ruichuan An et.al. 2411.11706 link
2024-11-18 Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search Jinhao Jiang et.al. 2411.11694 null
2024-11-18 TrojanRobot: Backdoor Attacks Against Robotic Manipulation in the Physical World Xianlong Wang et.al. 2411.11683 null
2024-11-18 PSPO*: An Effective Process-supervised Policy Optimization for Reasoning Alignment Jiawei Li et.al. 2411.11681 link
2024-11-18 Dissecting Misalignment of Multimodal Large Language Models via Influence Function Lijie Hu et.al. 2411.11667 null
2024-11-18 TSINR: Capturing Temporal Continuity via Implicit Neural Representations for Time Series Anomaly Detection Mengxuan Li et.al. 2411.11641 link
2024-11-18 Chapter 7 Review of Data-Driven Generative AI Models for Knowledge Extraction from Scientific Literature in Healthcare Leon Kopitar et.al. 2411.11635 null
2024-11-18 Signaling and Social Learning in Swarms of Robots Leo Cazenille et.al. 2411.11616 null
2024-11-18 Leveraging Computational Pathology AI for Noninvasive Optical Imaging Analysis Without Retraining Danny Barash et.al. 2411.11613 null
2024-11-18 VLN-Game: Vision-Language Equilibrium Search for Zero-Shot Semantic Navigation Bangguo Yu et.al. 2411.11609 null
2024-11-18 Exploring LLMs for Verifying Technical System Specifications Against Requirements Lasse M. Reinpold et.al. 2411.11582 null
2024-11-15 VeriGraph: Scene Graphs for Execution Verifiable Robot Planning Daniel Ekpo et.al. 2411.10446 null
2024-11-15 Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Weiyun Wang et.al. 2411.10442 null
2024-11-15 LLaVA-o1: Let Vision Language Models Reason Step-by-Step Guowei Xu et.al. 2411.10440 link
2024-11-15 MARS: Unleashing the Power of Variance Reduction for Training Large Models Huizhuo Yuan et.al. 2411.10438 link
2024-11-15 Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization Yuhan Fu et.al. 2411.10436 null
2024-11-15 Evaluating Creativity and Deception in Large Language Models: A Simulation Framework for Multi-Agent Balderdash Parsa Hejabi et.al. 2411.10422 link
2024-11-15 On the Foundation Model for Cardiac MRI Reconstruction Chi Zhang et.al. 2411.10403 null
2024-11-15 Interactive Cycle Model – The Linkage Combination among Automatic Speech Recognition, Large Language Models and Smart Glasses Libo Wang et.al. 2411.10362 null
2024-11-15 Bias Unveiled: Investigating Social Bias in LLM-Generated Code Lin Ling et.al. 2411.10351 null
2024-11-15 Y-MAP-Net: Real-time depth, normals, segmentation, multi-label captioning and 2D human pose in RGB images Ammar Qammaz et.al. 2411.10334 null
2024-11-15 Number it: Temporal Grounding Videos like Flipping Manga Yongliang Wu et.al. 2411.10332 link
2024-11-15 Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting Ziqi Xie et.al. 2411.10309 link
2024-11-15 Static network structure cannot stabilize cooperation among Large Language Model agents Jin Han et.al. 2411.10294 null
2024-11-15 Scaling Law for Post-training after Model Pruning Xiaodong Chen et.al. 2411.10272 null
2024-11-15 Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning Jingru Yang et.al. 2411.10252 null
2024-11-15 Measuring Non-Adversarial Reproduction of Training Data in Large Language Models Michael Aerni et.al. 2411.10242 null
2024-11-15 Generative AI in Multimodal User Interfaces: Trends, Challenges, and Cross-Platform Adaptability J. Bieniek et.al. 2411.10234 null
2024-11-15 An Empirical Study on LLM-based Agents for Automated Bug Fixing Xiangxin Meng et.al. 2411.10213 null
2024-11-15 Agentic LLMs in the Supply Chain: Towards Autonomous Multi-Agent Consensus-Seeking Valeria Jannelli et.al. 2411.10184 null
2024-11-15 CART: Compositional Auto-Regressive Transformer for Image Generation Siddharth Roheda et.al. 2411.10180 null
2024-11-14 MagicQuill: An Intelligent Interactive Image Editing System Zichen Liu et.al. 2411.09703 null
2024-11-14 Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models Wei Wang et.al. 2411.09691 null
2024-11-14 Squeezed Attention: Accelerating Long Context Length LLM Inference Coleman Hooper et.al. 2411.09688 link
2024-11-14 Adaptive Decoding via Latent Preference Optimization Shehzaad Dhuliawala et.al. 2411.09661 null
2024-11-14 On the Limits of Language Generation: Trade-Offs Between Hallucination and Mode Collapse Alkis Kalavasis et.al. 2411.09642 null
2024-11-14 Local deployment of large-scale music AI models on commodity hardware Xun Zhou et.al. 2411.09625 null
2024-11-14 PTR: Precision-Driven Tool Recommendation for Large Language Models Hang Gao et.al. 2411.09613 null
2024-11-14 The Moral Foundations Weibo Corpus Renjie Cao et.al. 2411.09612 null
2024-11-14 Initial Nugget Evaluation Results for the TREC 2024 RAG Track with the AutoNuggetizer Framework Ronak Pradeep et.al. 2411.09607 null
2024-11-14 Accelerating Knowledge Graph and Ontology Engineering with Large Language Models Cogan Shimizu et.al. 2411.09601 null
2024-11-14 Assessing the Performance of the DINOv2 Self-supervised Learning Vision Transformer Model for the Segmentation of the Left Atrium from MRI Images Bipasha Kundu et.al. 2411.09598 null
2024-11-14 LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models Zhengyi Wang et.al. 2411.09595 null
2024-11-14 Adopting RAG for LLM-Aided Future Vehicle Design Vahid Zolfaghari et.al. 2411.09590 null
2024-11-14 BabyLM Challenge: Exploring the Effect of Variation Sets on Language Model Training Efficiency Akari Haga et.al. 2411.09587 null
2024-11-14 Software Performance Engineering for Foundation Model-Powered Software (FMware) Haoxiang Zhang et.al. 2411.09580 null
2024-11-14 Piecing It All Together: Verifying Multi-Hop Multimodal Claims Haoran Wang et.al. 2411.09547 null
2024-11-14 A Practical Guide to Fine-tuning Language Models with Limited Data Márton Szép et.al. 2411.09539 null
2024-11-14 Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents Yuyou Gan et.al. 2411.09523 null
2024-11-14 Communication Compression for Tensor Parallel LLM Inference Jan Hansen-Palmus et.al. 2411.09510 null
2024-11-14 Spider: Any-to-Many Multimodal LLM Jinxiang Lai et.al. 2411.09439 null
2024-11-13 Large Wireless Model (LWM): A Foundation Model for Wireless Channels Sadjad Alikhani et.al. 2411.08872 link
2024-11-13 The Limited Impact of Medical Adaptation of Large Language and Vision-Language Models Daniel P. Jeong et.al. 2411.08870 link
2024-11-13 CamemBERT 2.0: A Smarter French Language Model Aged to Perfection Wissam Antoun et.al. 2411.08868 null
2024-11-13 LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs Piyush Jha et.al. 2411.08862 null
2024-11-13 Multimodal Instruction Tuning with Hybrid State Space Models Jianing Zhou et.al. 2411.08840 null
2024-11-13 FinRobot: AI Agent for Equity Research and Valuation with Large Language Models Tianyu Zhou et.al. 2411.08804 link
2024-11-13 Evaluating World Models with LLM for Decision Making Chang Yang et.al. 2411.08794 null
2024-11-13 Can sparse autoencoders be used to decompose and interpret steering vectors? Harry Mayne et.al. 2411.08790 link
2024-11-13 Sharingan: Extract User Action Sequence from Desktop Recordings Yanting Chen et.al. 2411.08768 null
2024-11-13 Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers Clément Dumas et.al. 2411.08745 link
2024-11-13 A Comparative Study of Discrete Speech Tokens for Semantic-Related Tasks with Large Language Models Dingdong Wang et.al. 2411.08742 null
2024-11-13 Dynamic Rewarding with Prompt Optimization Enables Tuning-free Self-Alignment of Language Models Somanshu Singla et.al. 2411.08733 link
2024-11-13 Polymetis: Large Language Modeling for Multiple Material Domains Chao Huang et.al. 2411.08728 null
2024-11-13 Voxeland: Probabilistic Instance-Aware Semantic Mapping with Evidence-based Uncertainty Quantification Jose-Luis Matez-Bandera et.al. 2411.08727 link
2024-11-13 Theoretical Analysis of Byte-Pair Encoding László Kozma et.al. 2411.08671 null
2024-11-13 OSMLoc: Single Image-Based Visual Localization in OpenStreetMap with Geometric and Semantic Guidances Youqi Liao et.al. 2411.08665 link
2024-11-13 UniMat: Unifying Materials Embeddings through Multi-modal Learning Janghoon Ock et.al. 2411.08664 null
2024-11-13 Accelerating Quasi-Static Time Series Simulations with Foundation Models Alban Puech et.al. 2411.08652 null
2024-11-13 A System Level Performance Evaluation for Superconducting Digital Systems Joyjit Kundu et.al. 2411.08645 null
2024-11-13 Towards Secure Intelligent O-RAN Architecture: Vulnerabilities, Threats and Promising Technical Solutions using LLMs Mojdeh Karbalaee Motalleb et.al. 2411.08640 null
2024-11-12 Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data Juanhui Li et.al. 2411.08028 null
2024-11-12 LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models Anoop Cherian et.al. 2411.08027 null
2024-11-12 Language Models as Causal Effect Generators Lucius E. J. Bynum et.al. 2411.08019 link
2024-11-12 ExpressivityArena: Can LLMs Express Information Implicitly? Joshua Tint et.al. 2411.08010 null
2024-11-12 Can adversarial attacks by large language models be attributed? Manuel Cebrian et.al. 2411.08003 null
2024-11-12 Derivational Morphology Reveals Analogical Generalization in Large Language Models Valentin Hofmann et.al. 2411.07990 null
2024-11-12 JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation Yiyang Ma et.al. 2411.07975 link
2024-11-12 From General to Specific: Utilizing General Hallucination to Automatically Measure the Role Relationship Fidelity for Specific Role-Play Agents Chuyi Kong et.al. 2411.07965 null
2024-11-12 Towards Low-bit Communication for Tensor Parallel LLM Inference Harry Dong et.al. 2411.07942 null
2024-11-12 Leveraging Multimodal Models for Enhanced Neuroimaging Diagnostics in Alzheimer’s Disease Francesco Chiumento et.al. 2411.07871 null
2024-11-12 Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders Xiaofeng Zhu et.al. 2411.07870 null
2024-11-12 Verbosity $\neq$ Veracity: Demystify Verbosity Compensation Behavior of Large Language Models Yusen Zhang et.al. 2411.07858 link
2024-11-12 Tucano: Advancing Neural Text Generation for Portuguese Nicholas Kluge Corrêa et.al. 2411.07854 link
2024-11-12 NL-SLAM for OC-VLN: Natural Language Grounded SLAM for Object-Centric VLN Sonia Raychaudhuri et.al. 2411.07848 null
2024-11-12 Chain Association-based Attacking and Shielding Natural Language Processing Systems Jiacheng Huang et.al. 2411.07843 null
2024-11-12 FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training Philip Zmushko et.al. 2411.07837 link
2024-11-12 Efficient Federated Finetuning of Tiny Transformers with Resource-Constrained Devices Kilian Pfeiffer et.al. 2411.07826 null
2024-11-12 Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models Youan Cong et.al. 2411.07820 null
2024-11-12 Federated Low-Rank Adaptation with Differential Privacy over Wireless Networks Tianqu Kang et.al. 2411.07806 null
2024-11-12 Likelihood as a Performance Gauge for Retrieval-Augmented Generation Tianyu Liu et.al. 2411.07773 link
2024-11-11 UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts Bo Yang et.al. 2411.07240 link
2024-11-11 OpenThaiGPT 1.5: A Thai-Centric Open Source Large Language Model Sumeth Yuenyong et.al. 2411.07238 null
2024-11-11 Contextualized Evaluations: Taking the Guesswork Out of Language Model Evaluations Chaitanya Malaviya et.al. 2411.07237 null
2024-11-11 Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving Botao Yu et.al. 2411.07228 null
2024-11-11 TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models Matheus Simão et.al. 2411.07224 null
2024-11-11 Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks Madeline Brumley et.al. 2411.07213 null
2024-11-11 General Geospatial Inference with a Population Dynamics Foundation Model Mohit Agarwal et.al. 2411.07207 null
2024-11-11 DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID Nyle Siddiqui et.al. 2411.07205 link
2024-11-11 The Super Weight in Large Language Models Mengxia Yu et.al. 2411.07191 link
2024-11-11 NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics David Robinson et.al. 2411.07186 null
2024-11-11 SAMPart3D: Segment Any Part in 3D Objects Yunhan Yang et.al. 2411.07184 link
2024-11-11 Counterfactual Generation from Language Models Shauli Ravfogel et.al. 2411.07180 link
2024-11-11 More Expressive Attention with Negative Weights Ang Lv et.al. 2411.07176 link
2024-11-11 Continual Memorization of Factoids in Large Language Models Howard Chen et.al. 2411.07175 link
2024-11-11 A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19 Vedant Khandelwal et.al. 2411.07163 null
2024-11-11 Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models Yancheng He et.al. 2411.07140 null
2024-11-11 Stronger Models are NOT Stronger Teachers for Instruction Tuning Zhangchen Xu et.al. 2411.07133 null
2024-11-11 Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis Taihang Hu et.al. 2411.07132 link
2024-11-11 Retrieval or Global Context Understanding? On Many-Shot In-Context Learning for Long-Context Evaluation Kaijian Zou et.al. 2411.07130 link
2024-11-11 Benchmarking LLMs’ Judgments with No Gold Standard Shengwei Xu et.al. 2411.07127 link
2024-11-08 Recycled Attention: Efficient inference for long-context language models Fangyuan Xu et.al. 2411.05787 null
2024-11-08 Using Language Models to Disambiguate Lexical Choices in Translation Josh Barua et.al. 2411.05781 link
2024-11-08 Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths? Veronica Chatrath et.al. 2411.05775 null
2024-11-08 Multi-hop Evidence Pursuit Meets the Web: Team Papelo at FEVER 2024 Christopher Malon et.al. 2411.05762 null
2024-11-08 End-to-End Navigation with Vision Language Models: Transforming Spatial Reasoning into Question-Answering Dylan Goetting et.al. 2411.05755 link
2024-11-08 Aioli: A Unified Optimization Framework for Language Model Data Mixing Mayee F. Chen et.al. 2411.05735 link
2024-11-08 Poze: Sports Technique Feedback under Data Constraints Agamdeep Singh et.al. 2411.05734 null
2024-11-08 STARS: Sensor-agnostic Transformer Architecture for Remote Sensing Ethan King et.al. 2411.05714 null
2024-11-08 Unmasking the Limits of Large Language Models: A Systematic Evaluation of Masked Text Processing Ability through MskQA and MskCal Fuka Matsuzaki et.al. 2411.05665 link
2024-11-08 The influence of persona and conversational task on social interactions with a LLM-controlled embodied conversational agent Leon O. H. Kroczek et.al. 2411.05653 null
2024-11-08 LightVA: Lightweight Visual Analytics with LLM Agent-Based Task Planning and Execution Yuheng Zhao et.al. 2411.05651 null
2024-11-08 Harnessing High-Level Song Descriptors towards Natural Language-Based Music Recommendation Elena V. Epure et.al. 2411.05649 link
2024-11-08 Evaluating Large Language Model Capability in Vietnamese Fact-Checking Data Generation Long Truong To et.al. 2411.05641 null
2024-11-08 Assessing Open-Source Large Language Models on Argumentation Mining Subtasks Mohammad Yeghaneh Abkenar et.al. 2411.05639 null
2024-11-08 A Two-Step Concept-Based Approach for Enhanced Interpretability and Trust in Skin Lesion Diagnosis Cristiano Patrício et.al. 2411.05609 link
2024-11-08 Evaluating and Adapting Large Language Models to Represent Folktales in Low-Resource Languages JA Meaney et.al. 2411.05593 null
2024-11-08 Open-set object detection: towards unified problem formulation and benchmarking Hejer Ammar et.al. 2411.05564 null
2024-11-08 Training objective drives the consistency of representational similarity across datasets Laure Ciernik et.al. 2411.05561 link
2024-11-08 AcceLLM: Accelerating LLM Inference using Redundancy for Load Balancing and Data Locality Ilias Bournias et.al. 2411.05555 null
2024-11-08 Assessing the Answerability of Queries in Retrieval-Augmented Code Generation Geonmin Kim et.al. 2411.05547 null
2024-11-07 SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models Muyang Li et.al. 2411.05007 link
2024-11-07 Analyzing The Language of Visual Tokens David M. Chan et.al. 2411.05001 null
2024-11-07 Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? Jonathan Roberts et.al. 2411.05000 null
2024-11-07 DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation Peiqi Liu et.al. 2411.04999 link
2024-11-07 LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation Weiquan Huang et.al. 2411.04997 link
2024-11-07 Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Weixin Liang et.al. 2411.04996 null
2024-11-07 Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives Hao Sun et.al. 2411.04991 link
2024-11-07 The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities Zhaofeng Wu et.al. 2411.04986 null
2024-11-07 Enhancing Reverse Engineering: Investigating and Benchmarking Large Language Models for Vulnerability Analysis in Decompiled Binaries Dylan Manuel et.al. 2411.04981 null
2024-11-07 SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference Gabriele Oliaro et.al. 2411.04975 null
2024-11-07 BitNet a4.8: 4-bit Activations for 1-bit LLMs Hongyu Wang et.al. 2411.04965 null
2024-11-07 Position Paper On Diagnostic Uncertainty Estimation from Large Language Models: Next-Word Probability Is Not Pre-test Probability Yanjun Gao et.al. 2411.04962 null
2024-11-07 CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM Jingwei Xu et.al. 2411.04954 null
2024-11-07 M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding Jaemin Cho et.al. 2411.04952 null
2024-11-07 A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model Panwen Hu et.al. 2411.04942 null
2024-11-07 VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos Shehan Munasinghe et.al. 2411.04923 null
2024-11-07 GPTKB: Building Very Large Knowledge Bases from Language Models Yujia Hu et.al. 2411.04920 link
2024-11-07 OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Siming Huang et.al. 2411.04905 null
2024-11-07 In the Era of Prompt Learning with Vision-Language Models Ankit Jha et.al. 2411.04892 null
2024-11-07 GUI Agents with Foundation Models: A Comprehensive Survey Shuai Wang et.al. 2411.04890 null
2024-11-06 Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress? Daniel P. Jeong et.al. 2411.04118 link
2024-11-06 How Transformers Solve Propositional Logic Problems: A Mechanistic Analysis Guan Zhe Hong et.al. 2411.04105 null
2024-11-06 RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models Maya Varma et.al. 2411.04097 link
2024-11-06 Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation Ke Fan et.al. 2411.04079 null
2024-11-06 H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models Nhi Pham et.al. 2411.04077 null
2024-11-06 M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models Chuhan Li et.al. 2411.04075 null
2024-11-06 Pseudo-labeling with Keyword Refining for Few-Supervised Video Captioning Ping Li et.al. 2411.04059 link
2024-11-06 Beemo: Benchmark of Expert-edited Machine-generated Outputs Ekaterina Artemova et.al. 2411.04032 null
2024-11-06 Prompt Engineering Using GPT for Word-Level Code-Mixed Language Identification in Low-Resource Dravidian Languages Aniket Deroy et.al. 2411.04025 null
2024-11-06 Select2Plan: Training-Free ICL-Based Planning through VQA and Memory Retrieval Davide Buoso et.al. 2411.04006 null
2024-11-06 Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning Jiawei Yao et.al. 2411.03978 link
2024-11-06 What Really is Commonsense Knowledge? Quyet V. Do et.al. 2411.03964 null
2024-11-06 How Does A Text Preprocessing Pipeline Affect Ontology Syntactic Matching? Zhangcheng Qiang et.al. 2411.03962 null
2024-11-06 Face Reconstruction from Face Embeddings using Adapter to a Face Foundation Model Hatef Otroshi Shahreza et.al. 2411.03960 null
2024-11-06 Fine-Grained Guidance for Retrievers: Leveraging LLMs’ Feedback in Retrieval-Augmented Generation Yuhang Liu et.al. 2411.03957 null
2024-11-06 Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks Felipe Marra et.al. 2411.03948 null
2024-11-06 Interactions Across Blocks in Post-Training Quantization of Large Language Models Khasmamad Shabanovi et.al. 2411.03934 null
2024-11-06 Multi3Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision-Language Models Minh Duc Bui et.al. 2411.03888 link
2024-11-06 Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models Zhijian Zhuo et.al. 2411.03884 link
2024-11-06 MEG: Medical Knowledge-Augmented Large Language Models for Question Answering Laura Cabello et.al. 2411.03883 link
2024-11-05 Inference Optimal VLMs Need Only One Visual Token but Larger Models Kevin Y. Li et.al. 2411.03312 link
2024-11-05 LLMs for Domain Generation Algorithm Detection Reynier Leyva La O et.al. 2411.03307 null
2024-11-05 VERITAS: A Unified Approach to Reliability Evaluation Rajkumar Ramamurthy et.al. 2411.03300 null
2024-11-05 Examining Human-AI Collaboration for Co-Writing Constructive Comments Online Farhana Shahid et.al. 2411.03295 null
2024-11-05 Interaction2Code: How Far Are We From Automatic Interactive Webpage Generation? Jingyu Xiao et.al. 2411.03292 link
2024-11-05 The Future of Intelligent Healthcare: A Systematic Analysis and Discussion on the Integration and Impact of Robots Using Large Language Models for Healthcare Souren Pashangpour et.al. 2411.03287 null
2024-11-05 SMoA: Improving Multi-agent Large Language Models with Sparse Mixture-of-Agents Dawei Li et.al. 2411.03284 link
2024-11-05 Spontaneous Emergence of Agent Individuality through Social Interactions in LLM-Based Communities Ryosuke Takata et.al. 2411.03252 null
2024-11-05 DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models Ying Zhou et.al. 2411.03250 null
2024-11-05 From Pen to Prompt: How Creative Writers Integrate AI into their Writing Practice Alicia Guo et.al. 2411.03137 null
2024-11-05 “Create a Fear of Missing Out” – ChatGPT Implements Unsolicited Deceptive Designs in Generated Websites Without Warning Veronika Krauß et.al. 2411.03108 null
2024-11-05 Utilizing Precise and Complete Code Context to Guide LLM in Automatic False Positive Mitigation Jinbao Chen et.al. 2411.03079 null
2024-11-05 Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning Bei Li et.al. 2411.03042 null
2024-11-05 HumanVLM: Foundation for Human-Scene Vision-Language Model Dawei Dai et.al. 2411.03034 null
2024-11-05 Leveraging Large Language Models in Code Question Answering: Baselines and Issues Georgy Andryushchenko et.al. 2411.03012 link
2024-11-05 Controlling for Unobserved Confounding with Large Language Model Classification of Patient Smoking Status Samuel Lee et.al. 2411.03004 null
2024-11-05 Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation Junchen Fu et.al. 2411.02992 null
2024-11-05 Growing a Tail: Increasing Output Diversity in Large Language Models Michal Shur-Ofry et.al. 2411.02989 null
2024-11-05 [Vision Paper] PRObot: Enhancing Patient-Reported Outcome Measures for Diabetic Retinopathy using Chatbots and Generative AI Maren Pielka et.al. 2411.02973 null
2024-11-05 Multi-modal NeRF Self-Supervision for LiDAR Semantic Segmentation Xavier Timoneda et.al. 2411.02969 null
2024-11-04 Training-free Regional Prompting for Diffusion Transformers Anthony Chen et.al. 2411.02395 link
2024-11-04 Adaptive Length Image Tokenization via Recurrent Allocation Shivam Duggal et.al. 2411.02393 link
2024-11-04 Attacking Vision-Language Computer Agents via Pop-ups Yanzhe Zhang et.al. 2411.02391 link
2024-11-04 Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models Guangzhi Xiong et.al. 2411.02382 null
2024-11-04 Addressing Uncertainty in LLMs to Enhance Reliability in Generative AI Ramneet Kaur et.al. 2411.02381 null
2024-11-04 Learning General-Purpose Biomedical Volume Representations using Randomized Synthesis Neel Dey et.al. 2411.02372 link
2024-11-04 DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution Yang Yue et.al. 2411.02359 link
2024-11-04 “Give Me BF16 or Give Me Death”? Accuracy-Performance Trade-Offs in LLM Quantization Eldar Kurtic et.al. 2411.02355 null
2024-11-04 Machine learning identification of maternal inflammatory response and histologic chorioamnionitis from placental membrane whole slide images Abhishek Sharma et.al. 2411.02354 null
2024-11-04 Social-RAG: Retrieving from Group Interactions to Socially Ground Proactive AI Generation to Group Preferences Ruotong Wang et.al. 2411.02353 null
2024-11-04 Can Large Language Models generalize analogy solving like people can? Claire E. Stevenson et.al. 2411.02348 null
2024-11-04 WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning Zehan Qi et.al. 2411.02337 link
2024-11-04 Sparsing Law: Towards Large Language Models with Greater Activation Sparsity Yuqi Luo et.al. 2411.02335 link
2024-11-04 Disrupting Test Development with AI Assistants Vijay Joshi et.al. 2411.02328 null
2024-11-04 PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance Ruyang Liu et.al. 2411.02327 link
2024-11-04 An Empirical Study on the Code Refactoring Capability of Large Language Models Jonathan Cordeiro et.al. 2411.02320 null
2024-11-04 Evaluating the Ability of Large Language Models to Generate Verifiable Specifications in VeriFast Marilyn Rego et.al. 2411.02318 null
2024-11-04 Defining and Evaluating Physical Safety for Large Language Models Yung-Chen Tang et.al. 2411.02317 null
2024-11-04 Evaluating Creative Short Story Generation in Humans and Large Language Models Mete Ismayilzada et.al. 2411.02316 link
2024-11-04 Taking AI Welfare Seriously Robert Long et.al. 2411.00986 null
2024-10-31 P-Masking: Power Law Masking Improves Multi-attribute Controlled Generation Mohamed Elgaar et.al. 2410.24201 null
2024-11-01 SelfCodeAlign: Self-Alignment for Code Generation Yuxiang Wei et.al. 2410.24198 link
2024-10-31 DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models Heng-Jui Chang et.al. 2410.24177 null
2024-10-31 Constraint Back-translation Improves Complex Instruction Following of Large Language Models Yunjia Qi et.al. 2410.24175 null
2024-10-31 $π_0$: A Vision-Language-Action Flow Model for General Robot Control Kevin Black et.al. 2410.24164 null
2024-10-31 GPT or BERT: why not both? Lucas Georges Gabriel Charpentier et.al. 2410.24159 link
2024-10-31 Thought Space Explorer: Navigating and Expanding Thought Space for Large Language Model Reasoning Jinghan Zhang et.al. 2410.24155 null
2024-10-31 Language-Driven Policy Distillation for Cooperative Driving in Multi-Agent Reinforcement Learning Jiaqi Liu et.al. 2410.24152 null
2024-10-31 Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age Nouar AlDahoul et.al. 2410.24148 null
2024-10-31 Leveraging Large Language Models for Code Translation and Software Development in Scientific Computing Akash Dhruv et.al. 2410.24119 link
2024-10-31 Repository-Level Compositional Code Translation and Validation Ali Reza Ibrahimzada et.al. 2410.24117 link
2024-10-31 Matchmaker: Self-Improving Large Language Model Programs for Schema Matching Nabeel Seedat et.al. 2410.24105 null
2024-10-31 Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning Nabil Omi et.al. 2410.24096 null
2024-10-31 In-Context Fine-Tuning for Time-Series Foundation Models Abhimanyu Das et.al. 2410.24087 null
2024-10-31 Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs Muhammed Saeed et.al. 2410.24049 null
2024-10-31 Handwriting Recognition in Historical Documents with Multimodal LLM Lucian Li et.al. 2410.24034 null
2024-10-31 Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks Yingzhe Peng et.al. 2410.24032 null
2024-10-31 AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents Yifan Xu et.al. 2410.24024 link
2024-10-31 SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation Liang He et.al. 2410.24022 null
2024-10-31 Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody? Ioannis Tsiamas et.al. 2410.24019 null
2024-10-30 ReferEverything: Towards Segmenting Everything We Can Speak of in Videos Anurag Bagchi et.al. 2410.23287 null
2024-10-30 A Monte Carlo Framework for Calibrated Uncertainty Estimation in Sequence Prediction Qidong Yang et.al. 2410.23272 null
2024-10-30 TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models Ziyao Shangguan et.al. 2410.23266 link
2024-10-30 EMMA: End-to-End Multimodal Model for Autonomous Driving Jyh-Jing Hwang et.al. 2410.23262 null
2024-10-30 Keypoint Abstraction using Large Models for Object-Relative Imitation Learning Xiaolin Fang et.al. 2410.23254 null
2024-10-30 Evaluating Cultural and Social Awareness of LLM Web Agents Haoyi Qiu et.al. 2410.23252 null
2024-10-30 Carrot and Stick: Eliciting Comparison Data and Beyond Yiling Chen et.al. 2410.23243 null
2024-10-30 A little less conversation, a little more action, please: Investigating the physical common-sense of LLMs in a 3D embodied environment Matteo G. Mecattaf et.al. 2410.23242 link
2024-10-30 EMOTION: Expressive Motion Sequence Generation for Humanoid Robots with In-Context Learning Peide Huang et.al. 2410.23234 null
2024-10-30 COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences Yixin Liu et.al. 2410.23223 link
2024-10-30 Partial Channel Dependence with Channel Masks for Time Series Foundation Models Seunghan Lee et.al. 2410.23222 null
2024-10-30 OS-ATLAS: A Foundation Action Model for Generalist GUI Agents Zhiyong Wu et.al. 2410.23218 link
2024-10-31 Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval Sheryl Hsu et.al. 2410.23214 null
2024-10-30 ProTransformer: Robustify Transformers via Plug-and-Play Paradigm Zhichao Hou et.al. 2410.23182 null
2024-10-30 ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning Millennium Bismay et.al. 2410.23180 link
2024-10-30 TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters Haiyang Wang et.al. 2410.23168 link
2024-10-30 SciPIP: An LLM-based Scientific Paper Idea Proposer Wenxiao Wang et.al. 2410.23166 link
2024-10-30 FlexTSF: A Universal Forecasting Model for Time Series with Variable Regularities Jingge Xiao et.al. 2410.23160 link
2024-10-30 VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning Yichao Liang et.al. 2410.23156 null
2024-10-30 Public Domain 12M: A Highly Aesthetic Image-Text Dataset with Novel Governance Mechanisms Jordan Meyer et.al. 2410.23144 null
2024-10-29 Local Policies Enable Zero-shot Long-horizon Manipulation Murtaza Dalal et.al. 2410.22332 null
2024-10-29 Task Vectors are Cross-Modal Grace Luo et.al. 2410.22330 null
2024-10-29 Enhancing Code Annotation Reliability: Generative AI’s Role in Comment Quality Assessment Models Seetharam Killivalavan et.al. 2410.22323 null
2024-10-29 Online Detecting LLM-Generated Texts via Sequential Hypothesis Testing by Betting Can Chen et.al. 2410.22318 link
2024-10-29 Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier Kai Wang et.al. 2410.22317 link
2024-10-29 Natural Language Inference Improves Compositionality in Vision-Language Models Paola Cascante-Bonilla et.al. 2410.22315 null
2024-10-29 Senna: Bridging Large Vision-Language Models and End-to-End Autonomous Driving Bo Jiang et.al. 2410.22313 link
2024-10-29 GPT-4o reads the mind in the eyes James W. A. Strachan et.al. 2410.22309 null
2024-10-29 SVIP: Towards Verifiable Inference of Open-source Large Language Models Yifan Sun et.al. 2410.22307 null
2024-10-29 Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning Yihe Deng et.al. 2410.22304 null
2024-10-29 LLMs are Highly-Constrained Biophysical Sequence Optimizers Angelica Chen et.al. 2410.22296 null
2024-10-29 Fine-Tuning LLMs for Code Mutation: A New Era of Cyber Threats Mohammad Setak et.al. 2410.22293 null
2024-10-29 From melodic note sequences to pitches using word2vec Daniel Defays et.al. 2410.22285 null
2024-10-29 Embedding-based classifiers can detect prompt injection attacks Md. Ahsan Ayub et.al. 2410.22284 link
2024-10-29 Whose ChatGPT? Unveiling Real-World Educational Inequalities Introduced by Large Language Models Renzhe Yu et.al. 2410.22282 null
2024-10-29 Fourier Head: Helping Large Language Models Learn Complex Probability Distributions Nate Gillman et.al. 2410.22269 null
2024-10-29 Meta-Learning Adaptable Foundation Models Jacob L. Block et.al. 2410.22264 null
2024-10-29 FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation Farima Fatahi Bayat et.al. 2410.22257 null
2024-10-29 Abrupt Learning in Transformers: A Case Study on Matrix Completion Pulkit Gopalani et.al. 2410.22244 null
2024-10-29 Are Decoder-Only Large Language Models the Silver Bullet for Code Search? Yuxuan Chen et.al. 2410.22240 link
2024-10-28 Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics Yaniv Nikankin et.al. 2410.21272 link
2024-10-28 LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior Hanyu Wang et.al. 2410.21264 null
2024-10-28 BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference Changwoo Lee et.al. 2410.21262 link
2024-10-28 AutoBench-V: Can Large Vision-Language Models Benchmark Themselves? Han Bao et.al. 2410.21259 link
2024-10-28 Multi-modal AI for comprehensive breast cancer prognostication Jan Witowski et.al. 2410.21256 null
2024-10-28 LongReward: Improving Long-context Large Language Models with AI Feedback Jiajie Zhang et.al. 2410.21252 link
2024-10-28 Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback Nour Jedidi et.al. 2410.21242 null
2024-10-28 Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce Zhantao Yang et.al. 2410.21237 null
2024-10-28 Flaming-hot Initiation with Regular Execution Sampling for Large Language Models Weizhe Chen et.al. 2410.21236 null
2024-10-28 LoRA vs Full Fine-tuning: An Illusion of Equivalence Reece Shuttleworth et.al. 2410.21228 null
2024-10-28 Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines Zhixin Zhang et.al. 2410.21220 link
2024-10-28 Lifting the Veil on the Large Language Model Supply Chain: Composition, Risks, and Mitigations Kaifeng Huang et.al. 2410.21218 null
2024-10-28 BongLLaMA: LLaMA for Bangla Language Abdullah Khan Zehady et.al. 2410.21200 null
2024-10-28 Belief in the Machine: Investigating Epistemological Blind Spots of Language Models Mirac Suzgun et.al. 2410.21195 link
2024-10-29 Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction Qintong Zhang et.al. 2410.21169 null
2024-10-28 M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation Jiaheng Liu et.al. 2410.21157 null
2024-10-28 Palisade – Prompt Injection Detection Framework Sahasra Kokkula et.al. 2410.21146 null
2024-10-28 LLM-initialized Differentiable Causal Discovery Shiv Kampani et.al. 2410.21141 null
2024-10-28 Do LLMs generate test oracles that capture the actual or the expected program behaviour? Michael Konstantinou et.al. 2410.21136 null
2024-10-28 Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments Marharyta Domnich et.al. 2410.21131 null
2024-10-25 The Potential and Value of AI Chatbot in Personalized Cognitive Training Zilong Wang et.al. 2410.19733 null
2024-10-25 Rethinking Visual Dependency in Long-Context Reasoning for Large Vision-Language Models Yucheng Zhou et.al. 2410.19732 null
2024-10-25 Counting Ability of Large Language Models and Impact of Tokenization Xiang Zhang et.al. 2410.19730 link
2024-10-25 FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning Nicole Cho et.al. 2410.19727 null
2024-10-25 2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision Shilong Li et.al. 2410.19720 null
2024-10-25 Multi-view biomedical foundation models for molecule-target and property prediction Parthasarathy Suryanarayanan et.al. 2410.19704 link
2024-10-25 TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning Xiangyu Zeng et.al. 2410.19702 null
2024-10-25 IPPON: Common Sense Guided Informative Path Planning for Object Goal Navigation Kaixian Qu et.al. 2410.19697 null
2024-10-25 Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs Yifei Zhang et.al. 2410.19694 null
2024-10-25 APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs Huaxiaoyue Wang et.al. 2410.19656 null
2024-10-25 Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models Shenghao Fu et.al. 2410.19635 null
2024-10-25 Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina Yuan Gao et.al. 2410.19599 null
2024-10-25 Diverse Sign Language Translation Xin Shen et.al. 2410.19586 link
2024-10-25 ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems Ritvik Aggarwal Ishneet Sukhvinder Singh Ibrahim Allahverdiyev et.al. 2410.19572 null
2024-10-25 GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing Hosam Elgendy et.al. 2410.19552 link
2024-10-25 Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad? Antonia Wüst et.al. 2410.19546 link
2024-10-25 Brain-like Functional Organization within Large Language Models H. Sun et.al. 2410.19542 null
2024-10-25 Detection of Human and Machine-Authored Fake News in Urdu Muhammad Zain Ali et.al. 2410.19517 link
2024-10-25 SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models Jahyun Koo et.al. 2410.19503 null
2024-10-25 Introducing MAPO: Momentum-Aided Gradient Descent Prompt Optimization Anthony Cui et.al. 2410.19499 null
2024-10-24 Unbounded: A Generative Infinite Game of Character Life Simulation Jialu Li et.al. 2410.18975 null
2024-10-24 Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques David Ortiz-Perez et.al. 2410.18972 null
2024-10-24 ConceptDrift: Uncovering Biases through the Lens of Foundational Models Cristian Daniel Păduraru et.al. 2410.18970 null
2024-10-24 Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms Zhangheng Li et.al. 2410.18967 null
2024-10-24 Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions Yujuan Fu et.al. 2410.18966 null
2024-10-24 On the Crucial Role of Initialization for Matrix Factorization Bingcong Li et.al. 2410.18965 null
2024-10-24 OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning Xiaoqiang Wang et.al. 2410.18963 null
2024-10-24 Context is Key: A Benchmark for Forecasting with Essential Textual Information Andrew Robert Williams et.al. 2410.18959 link
2024-10-24 Bridge-Coder: Unlocking LLMs’ Potential to Overcome Language Gaps in Low-Resource Code Jipeng Zhang et.al. 2410.18957 null
2024-10-24 BioMistral-NLU: Towards More Generalizable Medical Language Understanding through Instruction Tuning Yujuan Velvin Fu et.al. 2410.18955 null
2024-10-24 Dynamic Vocabulary Pruning in Early-Exit LLMs Jort Vincenti et.al. 2410.18952 link
2024-10-24 SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models Zonghao Ying et.al. 2410.18927 null
2024-10-24 From Blind Solvers to Logical Thinkers: Benchmarking LLMs’ Logical Integrity on Faulty Mathematical Problems A M Muntasir Rahman et.al. 2410.18921 null
2024-10-25 A Survey on Speech Large Language Models Jing Peng et.al. 2410.18908 null
2024-10-24 PRISM: A Methodology for Auditing Biases in Large Language Models Leif Azzopardi et.al. 2410.18906 link
2024-10-24 LLMs for Extremely Low-Resource Finno-Ugric Languages Taido Purason et.al. 2410.18902 null
2024-10-24 Creating and Repairing Robot Programs in Open-World Domains Claire Schlesinger et.al. 2410.18893 null
2024-10-24 Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks Graziano A. Manduzio et.al. 2410.18890 null
2024-10-24 Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance Omer Nahum et.al. 2410.18889 null
2024-10-24 Provably Robust Watermarks for Open-Source Language Models Miranda Christ et.al. 2410.18861 null
2024-10-23 TP-Eval: Tap Multimodal LLMs’ Potential in Evaluation by Customizing Prompts Yuxuan Xie et.al. 2410.18071 null
2024-10-23 CLEAR: Character Unlearning in Textual and Visual Modalities Alexey Dontsov et.al. 2410.18057 null
2024-10-23 LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering Qingfei Zhao et.al. 2410.18050 link
2024-10-23 Key Algorithms for Keyphrase Generation: Instruction-Based LLMs for Russian Scientific Keyphrases Anna Glazkova et.al. 2410.18040 null
2024-10-23 MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning Jingfan Zhang et.al. 2410.18035 null
2024-10-23 GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration Xin Li et.al. 2410.18032 link
2024-10-23 MiniFed: Integrating LLM-based Agentic-Workflow for Simulating FOMC Meeting Sungil Seok et.al. 2410.18012 null
2024-10-23 Benchmarking Foundation Models on Exceptional Cases: Dataset Creation and Validation Suho Kang et.al. 2410.18001 link
2024-10-23 MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers Zebin Yang et.al. 2410.17957 null
2024-10-23 ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference Xin He et.al. 2410.17954 null
2024-10-23 SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains Ran Xu et.al. 2410.17952 null
2024-10-23 Benchmarking Floworks against OpenAI & Anthropic: A Novel Framework for Enhanced LLM Function Calling Nirav Bhan et.al. 2410.17950 null
2024-10-23 Toward path-invariant embeddings for local distance source characterization Lisa Linville et.al. 2410.17937 null
2024-10-23 Guide for Defense (G4D): Dynamic Guidance for Robust and Balanced Defense in Large Language Models He Cao et.al. 2410.17922 link
2024-10-23 Scaling Diffusion Language Models via Adaptation from Autoregressive Models Shansan Gong et.al. 2410.17891 link
2024-10-23 R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models Linger Deng et.al. 2410.17885 link
2024-10-23 Lightweight Neural App Control Filippos Christianos et.al. 2410.17883 null
2024-10-23 AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning Yehonathan Refael et.al. 2410.17881 null
2024-10-23 Understanding Layer Significance in LLM Alignment Guangyuan Shi et.al. 2410.17875 null
2024-10-23 DataTales: A Benchmark for Real-World Intelligent Data Narration Yajing Yang et.al. 2410.17859 link
2024-10-22 PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction Long Xing et.al. 2410.17247 link
2024-10-22 Towards Reliable Evaluation of Behavior Steering Interventions in LLMs Itamar Pres et.al. 2410.17245 null
2024-10-22 Frontiers in Intelligent Colonoscopy Ge-Peng Ji et.al. 2410.17241 link
2024-10-22 Large Language Models Empowered Personalized Web Agents Hongru Cai et.al. 2410.17236 null
2024-10-22 Automated Spinal MRI Labelling from Reports Using a Large Language Model Robin Y. Park et.al. 2410.17235 link
2024-10-22 Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy Benedict Aaron Tjandra et.al. 2410.17234 null
2024-10-22 Few-shot In-Context Preference Learning Using Large Language Models Chao Yu et.al. 2410.17233 null
2024-10-22 Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods Tsachi Blau et.al. 2410.17222 null
2024-10-22 MiniPLM: Knowledge Distillation for Pre-Training Language Models Yuxian Gu et.al. 2410.17215 link
2024-10-22 Exploring Possibilities of AI-Powered Legal Assistance in Bangladesh through Large Language Modeling Azmine Toushik Wasi et.al. 2410.17210 link
2024-10-22 VoiceBench: Benchmarking LLM-Based Voice Assistants Yiming Chen et.al. 2410.17196 link
2024-10-23 Non-myopic Generation of Language Model for Reasoning and Planning Chang Ma et.al. 2410.17195 link
2024-10-22 Remote Timing Attacks on Efficient Language Model Inference Nicholas Carlini et.al. 2410.17175 null
2024-10-22 From Attention to Activation: Unravelling the Enigmas of Large Language Models Prannay Kaul et.al. 2410.17174 null
2024-10-22 Self-calibration for Language Model Quantization and Pruning Miles Williams et.al. 2410.17170 null
2024-10-22 Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence İlker Işık et.al. 2410.17161 null
2024-10-22 Improving Pinterest Search Relevance Using Large Language Models Han Wang et.al. 2410.17152 null
2024-10-22 Are Visual-Language Models Effective in Action Recognition? A Comparative Study Mahmoud Ali et.al. 2410.17149 null
2024-10-22 Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation? Jirat Chiaranaipanich et.al. 2410.17145 null
2024-10-22 Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements Isamu Isozaki et.al. 2410.17141 link
2024-10-21 Reflection-Bench: probing AI intelligence with reflection Lingyu Li et.al. 2410.16270 link
2024-10-21 SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree Shuangrui Ding et.al. 2410.16268 link
2024-10-21 xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs Michael S. Ryoo et.al. 2410.16267 null
2024-10-22 Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance Zhangwei Gao et.al. 2410.16261 link
2024-10-21 Elucidating the design space of language models for image generation Xuantong Liu et.al. 2410.16257 link
2024-10-21 CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution Maosong Cao et.al. 2410.16256 link
2024-10-21 Can Knowledge Editing Really Correct Hallucinations? Baixiang Huang et.al. 2410.16251 link
2024-10-21 Analyzing Context Contributions in LLM-based Machine Translation Emmanouil Zaranis et.al. 2410.16246 null
2024-10-21 IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems Yihuan Mao et.al. 2410.16237 null
2024-10-21 LLaVA-KD: A Framework of Distilling Multimodal Large Language Models Yuxuan Cai et.al. 2410.16236 link
2024-10-21 ToW: Thoughts of Words Improve Reasoning in Large Language Models Zhikun Xu et.al. 2410.16235 null
2024-10-21 Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping Ryan Li et.al. 2410.16232 null
2024-10-21 Building A Coding Assistant via the Retrieval-Augmented Language Model Xinze Li et.al. 2410.16229 link
2024-10-21 A Realistic Threat Model for Large Language Model Jailbreaks Valentyn Boreiko et.al. 2410.16222 link
2024-10-21 Pre-training Distillation for Large Language Models: A Design Space Exploration Hao Peng et.al. 2410.16215 null
2024-10-21 Comprehensive benchmarking of large language models for RNA secondary structure prediction L. I. Zablocki et.al. 2410.16212 link
2024-10-21 CoT-TL: Low-Resource Temporal Knowledge Representation of Planning Instructions Using Chain-of-Thought Reasoning Kumar Manas et.al. 2410.16207 null
2024-10-21 Improve Vision Language Model Chain-of-thought Reasoning Ruohong Zhang et.al. 2410.16198 link
2024-10-22 LASER: Script Execution by Autonomous Agents for On-demand Traffic Simulation Hao Gao et.al. 2410.16197 link
2024-10-21 Contamination Report for Multilingual Benchmarks Sanchit Ahuja et.al. 2410.16186 null
2024-10-18 Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts German Gritsai et.al. 2410.14677 null
2024-10-18 SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment Qin Liu et.al. 2410.14676 null
2024-10-18 Enhancing Large Language Models’ Situated Faithfulness to External Contexts Yukun Huang et.al. 2410.14675 link
2024-10-18 Decomposing The Dark Matter of Sparse Autoencoders Joshua Engels et.al. 2410.14670 link
2024-10-18 NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Baiqi Li et.al. 2410.14669 null
2024-10-18 MiCEval: Unveiling Multimodal Chain of Thought’s Quality via Image Description and Reasoning Steps Xiongtao Zhou et.al. 2410.14668 link
2024-10-18 A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning Shengjie Sun et.al. 2410.14660 null
2024-10-18 Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens Zhepeng Cen et.al. 2410.14655 null
2024-10-18 EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search Oliver Sieberling et.al. 2410.14649 link
2024-10-18 Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs Runchu Tian et.al. 2410.14641 link
2024-10-18 GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings Raghuveer Thirukovalluru et.al. 2410.14635 link
2024-10-18 Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning Yuxiang Lu et.al. 2410.14633 null
2024-10-18 On the Regularization of Learnable Embeddings for Time Series Processing Luca Butera et.al. 2410.14630 null
2024-10-18 CELI: Controller-Embedded Language Model Interactions Jan-Samuel Wagner et.al. 2410.14627 null
2024-10-18 DiSCo Meets LLMs: A Unified Approach for Sparse Retrieval and Contextual Distillation in Conversational Search Simon Lupart et.al. 2410.14609 null
2024-10-18 Teaching Models to Balance Resisting and Accepting Persuasion Elias Stengel-Eskin et.al. 2410.14596 link
2024-10-18 Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets Namid R. Stillman et.al. 2410.14587 null
2024-10-18 Do LLMs estimate uncertainty well in instruction-following? Juyeon Heo et.al. 2410.14582 null
2024-10-18 Large Language Models Are Overparameterized Text Encoders Thennal D K et.al. 2410.14578 null
2024-10-18 MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts Rachel S. Y. Teo et.al. 2410.14574 link
2024-10-17 Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens Lijie Fan et.al. 2410.13863 null
2024-10-17 PUMA: Empowering Unified MLLM with Multi-granular Visual Generation Rongyao Fang et.al. 2410.13861 link
2024-10-17 VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding Runsen Xu et.al. 2410.13860 link
2024-10-17 $\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models Yaxin Luo et.al. 2410.13859 null
2024-10-17 How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs Guhao Feng et.al. 2410.13857 null
2024-10-17 Can MLLMs Understand the Deep Implication Behind Chinese Images? Chenhao Zhang et.al. 2410.13854 link
2024-10-17 Retrospective Learning from Interactions Zizhao Chen et.al. 2410.13852 null
2024-10-17 Differentiable Robot Rendering Ruoshi Liu et.al. 2410.13851 null
2024-10-17 SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction Xuan Zhang et.al. 2410.13846 link
2024-10-17 A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models Qiaoyu Tang et.al. 2410.13841 null
2024-10-17 Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs Tianyu Guo et.al. 2410.13835 link
2024-10-17 A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement Hui Yuan et.al. 2410.13828 link
2024-10-17 Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models Mazda Moayeri et.al. 2410.13826 null
2024-10-17 AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents Ke Yang et.al. 2410.13825 null
2024-10-18 Harnessing Webpage UIs for Text-Rich Visual Understanding Junpeng Liu et.al. 2410.13824 null
2024-10-17 Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning Xiaodan Xing et.al. 2410.13823 link
2024-10-17 Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance Mitsuhiko Nakamoto et.al. 2410.13816 null
2024-10-17 De-mark: Watermark Removal in Large Language Models Ruibo Chen et.al. 2410.13808 null
2024-10-17 A Watermark for Order-Agnostic Language Models Ruibo Chen et.al. 2410.13805 null
2024-10-18 BenTo: Benchmark Task Reduction with In-Context Transferability Hongyu Zhao et.al. 2410.13804 link
2024-10-16 Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models Ce Zhang et.al. 2410.12790 link
2024-10-16 Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception Jihao Zhao et.al. 2410.12788 link
2024-10-16 In-Context Learning Enables Robot Action Prediction in LLMs Yida Yin et.al. 2410.12782 null
2024-10-16 Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information Yingya Li et.al. 2410.12774 null
2024-10-16 Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions Zhenyu Jiang et.al. 2410.12773 null
2024-10-16 Towards Zero-Shot Camera Trap Image Categorization Jiří Vyskočil et.al. 2410.12769 null
2024-10-16 The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse Ekansh Sharma et.al. 2410.12766 null
2024-10-16 StyleDistance: Stronger Content-Independent Style Embeddings with Synthetic Parallel Examples Ajay Patel et.al. 2410.12757 null
2024-10-17 CREAM: Consistency Regularized Self-Rewarding Language Models Zhaoyang Wang et.al. 2410.12735 null
2024-10-16 WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation João Matos et.al. 2410.12722 link
2024-10-16 FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression Zhenheng Tang et.al. 2410.12707 null
2024-10-16 WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Genta Indra Winata et.al. 2410.12705 link
2024-10-16 Sarcasm Detection in a Less-Resourced Language Lazar Đoković et.al. 2410.12704 link
2024-10-16 Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization Xingqi Wang et.al. 2410.12700 link
2024-10-16 VividMed: Vision Language Model with Versatile Visual Grounding for Medicine Lingxiao Luo et.al. 2410.12694 link
2024-10-16 Automatic Mapping of Anatomical Landmarks from Free-Text Using Large Language Models: Insights from Llama-2 Mohamad Abdi et.al. 2410.12686 null
2024-10-16 3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation Dewei Zhou et.al. 2410.12669 null
2024-10-16 Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models Shicheng Xu et.al. 2410.12662 null
2024-10-16 Evaluating Morphological Compositional Generalization in Large Language Models Mete Ismayilzada et.al. 2410.12656 null
2024-10-16 Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals Orchid Chetia Phukan et.al. 2410.12645 null
2024-10-15 GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation Fei Tang et.al. 2410.11841 link
2024-10-15 A Hitchhiker’s Guide to Scaling Law Estimation Leshem Choshen et.al. 2410.11840 link
2024-10-15 MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding Yue Cao et.al. 2410.11829 link
2024-10-15 Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws Yiding Jiang et.al. 2410.11820 link
2024-10-15 Improving Long-Text Alignment for Text-to-Image Diffusion Models Luping Liu et.al. 2410.11817 link
2024-10-15 SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing Zhiyuan Zhang et.al. 2410.11815 null
2024-10-15 NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models Han Han et.al. 2410.11805 null
2024-10-15 FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting Zhe Li et.al. 2410.11802 null
2024-10-15 Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability Tsz Ting Chung et.al. 2410.11786 null
2024-10-15 Latent BKI: Open-Dictionary Continuous Mapping in Visual-Language Latent Spaces with Quantifiable Uncertainty Joey Wilson et.al. 2410.11783 link
2024-10-15 G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks Guibin Zhang et.al. 2410.11782 null
2024-10-15 Language Models Encode Numbers Using Digit Representations in Base 10 Amit Arnold Levy et.al. 2410.11781 link
2024-10-15 MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation Chenxi Wang et.al. 2410.11779 link
2024-10-15 Time-Series Foundation Model for Value-at-Risk Anubha Goel et.al. 2410.11773 link
2024-10-15 Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models Kai Yao et.al. 2410.11772 link
2024-10-15 SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding Ying Chen et.al. 2410.11761 null
2024-10-15 Latent Action Pretraining from Videos Seonghyeon Ye et.al. 2410.11758 null
2024-10-15 Personas with Attitudes: Controlling LLMs for Diverse Data Annotation Leon Fröhling et.al. 2410.11745 link
2024-10-15 DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure Yunfan Xiong et.al. 2410.11744 null
2024-10-15 Light-Weight Fault Tolerant Attention for Large Language Model Training Yuhang Liang et.al. 2410.11720 null
2024-10-14 DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Guangxuan Xiao et.al. 2410.10819 link
2024-10-14 Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Ziyue Li et.al. 2410.10814 link
2024-10-14 LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory Di Wu et.al. 2410.10813 link
2024-10-14 Local and Global Decoding in Text Generation Daniel Gareev et.al. 2410.10810 link
2024-10-14 Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning Aakanksha et.al. 2410.10801 null
2024-10-14 Towards Foundation Models for 3D Vision: How Close Are We? Yiming Zuo et.al. 2410.10799 null
2024-10-15 MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling Jian Yang et.al. 2410.10798 null
2024-10-14 Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance Sachin Goyal et.al. 2410.10796 link
2024-10-15 LiveXiv – A Multi-Modal Live Benchmark Based on Arxiv Papers Content Nimrod Shabtay et.al. 2410.10783 link
2024-10-14 When Attention Sink Emerges in Language Models: An Empirical View Xiangming Gu et.al. 2410.10781 link
2024-10-14 Focused ReAct: Improving ReAct through Reiterate and Early Stop Shuoqiu Li et.al. 2410.10779 null
2024-10-14 AFlow: Automating Agentic Workflow Generation Jiayi Zhang et.al. 2410.10762 link
2024-10-14 Denial-of-Service Poisoning Attacks against Large Language Models Kuofeng Gao et.al. 2410.10760 link
2024-10-14 SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization Akrit Mudvari et.al. 2410.10759 null
2024-10-14 Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation for Classification Jan Cegin et.al. 2410.10756 link
2024-10-14 NT-LLM: A Novel Node Tokenizer for Integrating Graph Structure into Large Language Models Yanbiao Ji et.al. 2410.10743 null
2024-10-14 SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing Pengrui Quan et.al. 2410.10741 link
2024-10-14 Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs Ishan Jindal et.al. 2410.10739 null
2024-10-14 Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning Kuofeng Gao et.al. 2410.10735 null
2024-10-14 Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection Giorgos Iacovides et.al. 2410.10728 null
2024-10-11 Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models Qin Liu et.al. 2410.09047 null
2024-10-11 AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation Zijun Wang et.al. 2410.09040 link
2024-10-11 Semi-Supervised Learning of Noisy Mixture of Experts Models Oh-Ran Kwon et.al. 2410.09039 null
2024-10-11 SimpleStrat: Diversifying Language Model Generation with Stratification Justin Wong et.al. 2410.09038 null
2024-10-11 Mentor-KD: Making Small Language Models Better Multi-step Reasoners Hojae Lee et.al. 2410.09037 link
2024-10-11 PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents Xiangyu Yin et.al. 2410.09034 link
2024-10-11 MedMobile: A mobile-sized language model with expert-level clinical capabilities Krithik Vishwanath et.al. 2410.09019 link
2024-10-11 Parameter-Efficient Fine-Tuning of State Space Models Kevin Galim et.al. 2410.09016 link
2024-10-11 The Impact of Visual Information in Chinese Characters: Evaluating Large Models’ Ability to Recognize and Utilize Radicals Xiaofeng Wu et.al. 2410.09013 null
2024-10-11 Software Engineering and Foundation Models: Insights from Industry Blogs Using a Jury of Foundation Models Hao Li et.al. 2410.09012 link
2024-10-11 SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights Ling Yang et.al. 2410.09008 link
2024-10-11 From Interaction to Impact: Towards Safer AI Agents Through Understanding and Evaluating UI Operation Impacts Zhuohao Jerry Zhang et.al. 2410.09006 null
2024-10-11 DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection Haochen Li et.al. 2410.09004 null
2024-10-11 Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference Grace Proebsting et.al. 2410.08996 null
2024-10-11 The structure of the token space for large language models Michael Robinson et.al. 2410.08993 null
2024-10-11 Science is Exploration: Computational Frontiers for Conceptual Metaphor Theory Rebecca M. M. Hicke et.al. 2410.08991 link
2024-10-11 SubZero: Random Subspace Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning Ziming Yu et.al. 2410.08989 link
2024-10-11 Towards Trustworthy Knowledge Graph Reasoning: An Uncertainty Aware Perspective Bo Ni et.al. 2410.08985 null
2024-10-11 NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models Zheng Yi Ho et.al. 2410.08970 null
2024-10-11 Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements Jingyu Zhang et.al. 2410.08968 null
2024-10-10 DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models Xiaoxiao He et.al. 2410.08207 null
2024-10-10 Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training Gen Luo et.al. 2410.08202 null
2024-10-10 Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity Shuo Xie et.al. 2410.08198 link
2024-10-10 From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions Changle Qu et.al. 2410.08197 link
2024-10-10 MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code Zimu Lu et.al. 2410.08196 link
2024-10-10 Features are fate: a theory of transfer learning in high-dimensional regression Javan Tahir et.al. 2410.08194 null
2024-10-10 GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment Yuancheng Xu et.al. 2410.08193 null
2024-10-10 MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models Wenbo Hu et.al. 2410.08182 null
2024-10-10 Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models Qingni Wang et.al. 2410.08174 null
2024-10-10 On the Evaluation of Generative Robotic Simulations Feng Chen et.al. 2410.08172 null
2024-10-10 Visual Scratchpads: Enabling Global Reasoning in Vision Aryo Lotfi et.al. 2410.08165 null
2024-10-10 Agent S: An Open Agentic Framework that Uses Computers Like a Human Saaket Agashe et.al. 2410.08164 link
2024-10-10 The Effect of Surprisal on Reading Times in Information Seeking and Repeated Reading Keren Gruteke Klein et.al. 2410.08162 link
2024-10-10 DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation Jiatao Gu et.al. 2410.08159 null
2024-10-10 Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning Amrith Setlur et.al. 2410.08146 null
2024-10-10 Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs Xiaoyuan Liu et.al. 2410.08145 link
2024-10-10 DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory Yutong Wang et.al. 2410.08143 link
2024-10-10 Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction Jarrid Rector-Brooks et.al. 2410.08134 null
2024-10-10 Think Beyond Size: Dynamic Prompting for More Effective Reasoning Kamesh R et.al. 2410.08130 null
2024-10-10 Mars: Situated Inductive Reasoning in an Open-World Environment Xiaojuan Tang et.al. 2410.08126 null
2024-10-09 MM-Ego: Towards Building Egocentric Multimodal LLMs Hanrong Ye et.al. 2410.07177 null
2024-10-09 Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models Fei Wang et.al. 2410.07176 null
2024-10-09 Do better language models have crisper vision? Jona Ruthardt et.al. 2410.07173 null
2024-10-09 One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation Fabian Paischer et.al. 2410.07170 link
2024-10-09 Sylber: Syllabic Embedding Representation of Speech from Raw Audio Cheol Jun Cho et.al. 2410.07168 link
2024-10-09 Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate Qidong Huang et.al. 2410.07167 link
2024-10-09 Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making Manling Li et.al. 2410.07166 link
2024-10-09 Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning Chongyu Fan et.al. 2410.07163 link
2024-10-09 Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis Bohan Zeng et.al. 2410.07155 link
2024-10-09 Towards Interpreting Visual Information Processing in Vision-Language Models Clement Neo et.al. 2410.07149 link
2024-10-09 Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling Yingfa Chen et.al. 2410.07145 null
2024-10-09 Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates Xiaosen Zheng et.al. 2410.07137 link
2024-10-10 EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models Rui Zhao et.al. 2410.07133 link
2024-10-09 Mental Disorders Detection in the Era of Large Language Models Gleb Kuzmin et.al. 2410.07129 null
2024-10-09 Exploring the Readiness of Prominent Small Language Models for the Democratization of Financial Literacy Tagore Rao Kosireddy et.al. 2410.07118 link
2024-10-09 Personalized Visual Instruction Tuning Renjie Pi et.al. 2410.07113 link
2024-10-09 VHELM: A Holistic Evaluation of Vision Language Models Tony Lee et.al. 2410.07112 link
2024-10-09 I Want to Break Free! Anti-Social Behavior and Persuasion Ability of LLMs in Multi-Agent Settings with Social Hierarchy Gian Maria Campedelli et.al. 2410.07109 link
2024-10-09 Unleashing Multi-Hop Reasoning Potential in Large Language Models through Repetition of Misordered Context Sangwon Yu et.al. 2410.07103 null
2024-10-09 MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering Jun Shern Chan et.al. 2410.07095 link
2024-10-07 Fine-Tuning CLIP’s Last Visual Projector: A Few-Shot Cornucopia Mohammad Fahes et.al. 2410.05270 link
2024-10-07 Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models Fei Wang et.al. 2410.05269 null
2024-10-07 PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs Mengzhao Chen et.al. 2410.05265 link
2024-10-07 TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles Qingchen Yu et.al. 2410.05262 link
2024-10-07 TextHawk2: A Large Vision-Language Model Excels in Bilingual OCR and Grounding with 16x Fewer Tokens Ya-Qi Yu et.al. 2410.05261 null
2024-10-07 Differential Transformer Tianzhu Ye et.al. 2410.05258 link
2024-10-07 GLEE: A Unified Framework and Benchmark for Language-based Economic Environments Eilam Shapira et.al. 2410.05254 link
2024-10-07 Causal Micro-Narratives Mourad Heddaya et.al. 2410.05252 null
2024-10-07 SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe Yuxin Xiao et.al. 2410.05248 null
2024-10-07 Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents Boyu Gou et.al. 2410.05243 link
2024-10-08 TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models Rabin Adhikari et.al. 2410.05239 link
2024-10-07 GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models Iman Mirzadeh et.al. 2410.05229 null
2024-10-07 Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates Avanika Narayan et.al. 2410.05224 null
2024-10-07 Precise Model Benchmarking with Only a Few Observations Riccardo Fogliato et.al. 2410.05222 null
2024-10-07 Density estimation with LLMs: a geometric investigation of in-context learning trajectories Toni J. B. Liu et.al. 2410.05218 null
2024-10-07 Organizing Unstructured Image Collections using Natural Language Mingxuan Liu et.al. 2410.05217 null
2024-10-07 Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality Youngtaek Oh et.al. 2410.05210 link
2024-10-07 RevisEval: Improving LLM-as-a-Judge via Response-Adapted References Qiyuan Zhang et.al. 2410.05193 null
2024-10-07 Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape Perspective Kaiyue Wen et.al. 2410.05192 null
2024-10-07 LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation Zhijie Wang et.al. 2410.05191 null
2024-10-04 Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models Zhuochun Li et.al. 2410.03663 null
2024-10-04 Unraveling Cross-Modality Knowledge Conflict in Large Vision-Language Models Tinghui Zhu et.al. 2410.03659 link
2024-10-04 RAFT: Realistic Attacks to Fool Text Detectors James Wang et.al. 2410.03658 link
2024-10-04 Aligning LLMs with Individual Preferences via Interaction Shujin Wu et.al. 2410.03642 link
2024-10-04 Conditional Enzyme Generation Using Protein Language Models with Adapters Jason Yang et.al. 2410.03634 null
2024-10-04 Large Language Model Performance Benchmarking on Mobile Platforms: A Thorough Evaluation Jie Xiao et.al. 2410.03613 null
2024-10-04 TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and Generation Jonathan Cook et.al. 2410.03608 null
2024-10-04 LeLaN: Learning A Language-Conditioned Navigation Policy from In-the-Wild Videos Noriaki Hirose et.al. 2410.03603 null
2024-10-04 Efficiently Identifying Watermarked Segments in Mixed-Source Texts Xuandong Zhao et.al. 2410.03600 null
2024-10-04 Understanding Reasoning in Chain-of-Thought from the Hopfieldian View Lijie Hu et.al. 2410.03595 null
2024-10-04 Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models Xin Zou et.al. 2410.03577 link
2024-10-04 Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs) Abrar Rahman et.al. 2410.03568 null
2024-10-04 Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding Wei Wu et.al. 2410.03553 null
2024-10-04 Re-examining Sexism and Misogyny Classification with Annotator Attitudes Aiqi Jiang et.al. 2410.03543 null
2024-10-04 No Need to Talk: Asynchronous Mixture of Language Models Anastasiia Filippova et.al. 2410.03529 null
2024-10-04 Steering Large Language Models between Code Execution and Textual Reasoning Yongchao Chen et.al. 2410.03524 null
2024-10-04 A Probabilistic Perspective on Unlearning and Alignment for Large Language Models Yan Scholten et.al. 2410.03523 null
2024-10-04 CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios Zetian Ouyang et.al. 2410.03502 link
2024-10-04 FedStein: Enhancing Multi-Domain Federated Learning Through James-Stein Estimator Sunny Gupta et.al. 2410.03499 link
2024-10-04 Towards Reproducible LLM Evaluation: Quantifying Uncertainty in LLM Benchmark Scores Robert E. Blackwell et.al. 2410.03492 null
2024-10-03 Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations Nick Jiang et.al. 2410.02762 link
2024-10-03 FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models Zhipei Xu et.al. 2410.02761 link
2024-10-03 Erasing Conceptual Knowledge from Language Models Rohit Gandikota et.al. 2410.02760 link
2024-10-03 Loong: Generating Minute-level Long Videos with Autoregressive Language Models Yuqing Wang et.al. 2410.02757 null
2024-10-03 SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% the Cost Jifan Zhang et.al. 2410.02755 null
2024-10-03 Training Language Models on Synthetic Edit Sequences Improves Code Synthesis Ulyana Piterbarg et.al. 2410.02749 link
2024-10-03 CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation Han He et.al. 2410.02748 null
2024-10-03 Contrastive Localized Language-Image Pre-Training Hong-You Chen et.al. 2410.02746 null
2024-10-03 Neutral residues: revisiting adapters for model extension Franck Signe Talla et.al. 2410.02744 null
2024-10-03 MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions Yekun Chai et.al. 2410.02743 null
2024-10-03 Grounding Large Language Models In Embodied Environment With Imperfect World Models Haolan Liu et.al. 2410.02742 null
2024-10-03 Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization Lei Xu et.al. 2410.02741 link
2024-10-03 Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Zhengfeng Lai et.al. 2410.02740 null
2024-10-04 Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge Jiayi Ye et.al. 2410.02736 null
2024-10-03 DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects Zhaowei Wang et.al. 2410.02730 link
2024-10-03 Unified Multi-Modal Interleaved Document Representation for Information Retrieval Jaewoo Lee et.al. 2410.02729 null
2024-10-03 Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation Rohin Manvi et.al. 2410.02725 null
2024-10-03 Large Language Models as Markov Chains Oussama Zekri et.al. 2410.02724 null
2024-10-03 Domain-Specific Retrieval-Augmented Generation Using Vector Stores, Knowledge Graphs, and Tensor Factorization Ryan C. Barron et.al. 2410.02721 null
2024-10-03 UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation Zixuan Li et.al. 2410.02719 null
2024-10-02 Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads Yuxiang Huang et.al. 2410.01805 link
2024-10-02 Efficient $1$-bit tensor approximations Alex W. Neal Riasanovsky et.al. 2410.01799 null
2024-10-02 Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models Joseph Lee et.al. 2410.01795 link
2024-10-02 When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1 R. Thomas McCoy et.al. 2410.01792 null
2024-10-02 Investigating on RLHF methodology Alexey Kutalev et.al. 2410.01789 null
2024-10-02 OmniGenBench: Automating Large-scale in-silico Benchmarking for Genomic Foundation Models Heng Yang et.al. 2410.01784 link
2024-10-02 Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models Shayekh Bin Islam et.al. 2410.01782 link
2024-10-03 Quantifying Generalization Complexity for Large Language Models Zhenting Qi et.al. 2410.01769 link
2024-10-02 Integrating Protein Sequence and Expression Level to Analysis Molecular Characterization of Breast Cancer Subtypes Hossein Sholehrasa et.al. 2410.01755 null
2024-10-03 Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks Mengzhao Jia et.al. 2410.01744 link
2024-10-02 VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models Kailai Feng et.al. 2410.01738 link
2024-10-02 Visual Perception in Text Strings Qi Jia et.al. 2410.01733 link
2024-10-02 Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing Yilmazcan Ozyurt et.al. 2410.01727 link
2024-10-02 Auto-Demo Prompting: Leveraging Generated Outputs as Demonstrations for Enhanced Batch Prompting Longyu Feng et.al. 2410.01724 null
2024-10-02 Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective Zeyu Gan et.al. 2410.01720 link
2024-10-02 Examining the Role of Relationship Alignment in Large Language Models Kristen M. Altenburger et.al. 2410.01708 null
2024-10-02 Interpretable Contrastive Monte Carlo Tree Search Reasoning Zitian Gao et.al. 2410.01707 link
2024-10-02 An Exploration of Self-Supervised Mutual Information Alignment for Multi-Task Settings Soham Govande et.al. 2410.01704 link
2024-10-02 CreDes: Causal Reasoning Enhancement and Dual-End Searching for Solving Long-Range Reasoning Problems using LLMs Kangsheng Wang et.al. 2410.01696 null
2024-10-02 U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models Tung-Yu Wu et.al. 2410.01692 null
2024-09-30 MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Haotian Zhang et.al. 2409.20566 null
2024-09-30 LaMMA-P: Generalizable Multi-Agent Long-Horizon Task Allocation and Planning with LM-Driven PDDL Planner Xiaopan Zhang et.al. 2409.20560 null
2024-09-30 Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos Md Mohaiminul Islam et.al. 2409.20557 null
2024-09-30 UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models Qiaojun Yu et.al. 2409.20551 null
2024-09-30 LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation Ziyao Zhang et.al. 2409.20550 null
2024-09-30 Robi Butler: Remote Multimodal Interactions with Household Robot Assistant Anxing Xiao et.al. 2409.20548 null
2024-09-30 Uncertainty-Informed Screening for Safer Solvents Used in the Synthesis of Perovskite via Language Models Arpan Mukherjee et.al. 2409.20512 null
2024-09-30 COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models Divyanshu Daiya et.al. 2409.20502 null
2024-09-30 A Weakly Supervised Data Labeling Framework for Machine Lexical Normalization in Vietnamese Social Media Dung Ha Nguyen et.al. 2409.20467 null
2024-09-30 Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments Mohamed Elnoor et.al. 2409.20445 null
2024-10-01 Instance-adaptive Zero-shot Chain-of-Thought Prompting Xiaosong Yuan et.al. 2409.20441 null
2024-09-30 HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding Fan Yuan et.al. 2409.20429 null
2024-09-30 World to Code: Multi-modal Data Generation via Self-Instructed Compositional Captioning and Filtering Jiacong Wang et.al. 2409.20424 link
2024-09-30 Anti-stereotypical Predictive Text Suggestions Do Not Reliably Yield Anti-stereotypical Writing Connor Baumler et.al. 2409.20390 null
2024-09-30 Wait, but Tylenol is Acetaminophen… Investigating and Improving Language Models’ Ability to Resist Requests for Misinformation Shan Chen et.al. 2409.20385 null
2024-09-30 Word-wise intonation model for cross-language TTS systems Tomilov A. A. et.al. 2409.20374 null
2024-09-30 The Perfect Blend: Redefining RLHF with Mixture of Judges Tengyu Xu et.al. 2409.20370 null
2024-09-30 VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs Ruotong Liao et.al. 2409.20365 link
2024-09-30 Efficient Driving Behavior Narration and Reasoning on Edge Device Using Large Language Models Yizhou Huang et.al. 2409.20364 null
2024-09-30 Rotated Runtime Smooth: Training-Free Activation Smoother for accurate INT4 inference Ke Yi et.al. 2409.20361 null
2024-09-27 Exploring Token Pruning in Vision State Space Models Zheng Zhan et.al. 2409.18962 null
2024-09-27 LML: Language Model Learning a Dataset for Data-Augmented Prediction Praneeth Vadlapati et.al. 2409.18957 link
2024-09-27 Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models Jiaming Li et.al. 2409.18943 link
2024-09-27 From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding Heqing Zou et.al. 2409.18938 null
2024-09-27 Social Media Bot Policies: Evaluating Passive and Active Enforcement Kristina Radivojevic et.al. 2409.18931 null
2024-09-27 AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow Huizi Yu et.al. 2409.18924 null
2024-09-27 Soft Measures for Extracting Causal Collective Intelligence Maryam Berijanian et.al. 2409.18911 link
2024-09-27 Improving Visual Object Tracking through Visual Prompting Shih-Fang Chen et.al. 2409.18901 link
2024-09-27 IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation Fan Lin et.al. 2409.18892 link
2024-09-27 Suicide Phenotyping from Clinical Notes in Safety-Net Psychiatric Hospital Using Multi-Label Classification with Pre-Trained Language Models Zehan Li et.al. 2409.18878 null
2024-09-27 Predicting and analyzing memorization within fine-tuned Large Language Models Jérémie Dentan et.al. 2409.18858 null
2024-09-27 Mitigating Selection Bias with Node Pruning and Auxiliary Options Hyeong Kyu Choi et.al. 2409.18857 null
2024-09-27 LLMs4Synthesis: Leveraging Large Language Models for Scientific Synthesis Hamed Babaei Giglou et.al. 2409.18812 link
2024-09-27 Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs Yanyuan Qiao et.al. 2409.18794 null
2024-09-27 A Survey on the Honesty of Large Language Models Siheng Li et.al. 2409.18786 link
2024-09-27 Enhancing Explainability in Multimodal Large Language Models Using Ontological Context Jihen Amara et.al. 2409.18753 null
2024-09-27 OpenObject-NAV: Open-Vocabulary Object-Oriented Navigation Based on Dynamic Carrier-Relationship Scene Graph Yujie Tang et.al. 2409.18743 null
2024-09-27 Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs Gleb Mezentsev et.al. 2409.18721 link
2024-09-27 Read Over the Lines: Attacking LLMs and Toxicity Detection Systems with ASCII Art to Mask Profanity Sergey Berezin et.al. 2409.18708 link
2024-09-27 Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models Yiming Chen et.al. 2409.18680 link
2024-09-26 EgoLM: Multi-Modal Language Model of Egocentric Motions Fangzhou Hong et.al. 2409.18127 null
2024-09-26 Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction Jing He et.al. 2409.18124 null
2024-09-26 Multi-View and Multi-Scale Alignment for Contrastive Language-Image Pre-training in Mammography Yuexi Du et.al. 2409.18119 null
2024-09-26 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding Ye Liu et.al. 2409.18111 link
2024-09-26 Open-World Evaluation for Retrieving Diverse Perspectives Hung-Ting Chen et.al. 2409.18110 null
2024-09-26 MALPOLON: A Framework for Deep Species Distribution Modeling Theo Larcher et.al. 2409.18102 link
2024-09-26 SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation Xin Li et.al. 2409.18082 null
2024-09-26 Infer Human’s Intentions Before Following Natural Language Instructions Yanming Wan et.al. 2409.18073 link
2024-09-26 Infering Alt-text For UI Icons With Large Language Models During App Development Sabrina Haque et.al. 2409.18060 null
2024-09-26 DualAD: Dual-Layer Planning for Reasoning in Autonomous Driving Dingrui Wang et.al. 2409.18053 link
2024-09-26 EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions Kai Chen et.al. 2409.18042 null
2024-09-26 Compositional Hardness of Code in Large Language Models – A Probabilistic Perspective Yotam Wolf et.al. 2409.18028 null
2024-09-26 An Adversarial Perspective on Machine Unlearning for AI Safety Jakub Łucki et.al. 2409.18025 link
2024-09-26 DARE: Diverse Visual Question Answering with Robustness Evaluation Hannah Sterz et.al. 2409.18023 null
2024-09-26 Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles Lewei He et.al. 2409.18014 null
2024-09-26 Control Industrial Automation System with Large Language Models Yuchen Xia et.al. 2409.18009 link
2024-09-26 Multilingual Evaluation of Long Context Retrieval and Reasoning Ameeta Agrawal et.al. 2409.18006 link
2024-09-26 Enhancing Tourism Recommender Systems for Sustainable City Trips Using Retrieval-Augmented Generation Ashmi Banerjee et.al. 2409.18003 null
2024-09-26 Extracting Affect Aggregates from Longitudinal Social Media Data with Temporal Adapters for Large Language Models Georg Ahnert et.al. 2409.17990 link
2024-09-26 LLM4Brain: Training a Large Language Model for Brain Video Understanding Ruizhe Zheng et.al. 2409.17987 null
2024-09-25 Attention Prompting on Image for Large Vision-Language Models Runpeng Yu et.al. 2409.17143 link
2024-09-25 FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression Fazal Mittu et.al. 2409.17141 link
2024-09-25 Turn Every Application into an Agent: Towards Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents Junting Lu et.al. 2409.17140 null
2024-09-25 Blox-Net: Generative Design-for-Robot-Assembly Using VLM Supervision, Physics Simulation, and a Robot with Reset Andrew Goldberg et.al. 2409.17126 null
2024-09-25 Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Fan Zhou et.al. 2409.17115 link
2024-09-25 Unveiling Ontological Commitment in Multi-Modal Foundation Models Mert Keser et.al. 2409.17109 null
2024-09-25 Accumulator-Aware Post-Training Quantization Ian Colbert et.al. 2409.17092 null
2024-09-25 Can Vision Language Models Learn from Visual Demonstrations of Ambiguous Spatial Reasoning? Bowen Zhao et.al. 2409.17080 link
2024-09-25 VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models Yifei Liu et.al. 2409.17066 link
2024-09-25 Benchmarking Domain Generalization Algorithms in Computational Pathology Neda Zamanitajeddin et.al. 2409.17063 null
2024-09-25 Using LLM for Real-Time Transcription and Summarization of Doctor-Patient Interactions into ePuskesmas in Indonesia Azmul Asmar Irfan et.al. 2409.17054 null
2024-09-25 GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design Phillip Mueller et.al. 2409.17045 null
2024-09-25 How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not Francesco Verdini et.al. 2409.17044 null
2024-09-25 Counterfactual Token Generation in Large Language Models Ivi Chatzi et.al. 2409.17027 link
2024-09-25 LLM-CARD: Towards a Description and Landscape of Large Language Models Shengwei Tian et.al. 2409.17011 link
2024-09-25 Models Can and Should Embrace the Communicative Nature of Human-Generated Math Sasha Boguraev et.al. 2409.17005 null
2024-09-26 INT-FlashAttention: Enabling Flash Attention for INT8 Quantization Shimao Chen et.al. 2409.16997 link
2024-09-25 Harnessing Diversity for Important Data Selection in Pretraining Large Language Models Chi Zhang et.al. 2409.16986 null
2024-09-25 AXCEL: Automated eXplainable Consistency Evaluation using LLMs P Aditya Sreekar et.al. 2409.16984 null
2024-09-25 Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions Zeyneb N. Kaya et.al. 2409.16974 null
2024-09-24 Semantic Refocused Tuning for Open-Vocabulary Panoptic Segmentation Yong Xien Chng et.al. 2409.16278 null
2024-09-24 LLM Echo Chamber: personalized and automated disinformation Tony Ma et.al. 2409.16241 link
2024-09-24 EuroLLM: Multilingual Language Models for Europe Pedro Henrique Martins et.al. 2409.16235 null
2024-09-24 Fine-Tuning is Fine, if Calibrated Zheda Mai et.al. 2409.16223 link
2024-09-24 Towards Enhancing Linked Data Retrieval in Conversational UIs using Large Language Models Omar Mussa et.al. 2409.16220 link
2024-09-24 LLMCount: Enhancing Stationary mmWave Detection with Multimodal-LLM Boyan Li et.al. 2409.16209 null
2024-09-25 CJEval: A Benchmark for Assessing Large Language Models Using Chinese Junior High School Exam Data Qian-Wen Zhang et.al. 2409.16202 link
2024-09-24 Leveraging Estimated Transferability Over Human Intuition for Model Selection in Text Ranking Jun Bai et.al. 2409.16198 null
2024-09-24 HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Haoran Que et.al. 2409.16191 link
2024-09-24 Expert-level vision-language foundation model for real-world radiology and comprehensive evaluation Xiaohong Liu et.al. 2409.16183 null
2024-09-24 SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image Dimitrije Antić et.al. 2409.16178 null
2024-09-24 Cyber Knowledge Completion Using Large Language Models Braden K Webb et.al. 2409.16176 null
2024-09-24 Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering Ziyu Zhao et.al. 2409.16167 null
2024-09-24 EnIGMA: Enhanced Interactive Generative Model Agent for CTF Challenges Talor Abramovich et.al. 2409.16165 link
2024-09-24 ComiCap: A VLMs pipeline for dense captioning of Comic Panels Emanuele Vivoli et.al. 2409.16159 link
2024-09-24 Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework Lu Chen et.al. 2409.16146 link
2024-09-24 Evaluation of state-of-the-art ASR Models in Child-Adult Interactions Aditya Ashvin et.al. 2409.16135 null
2024-09-24 MOSS: Enabling Code-Driven Evolution and Context Management for AI Agents Ming Zhu et.al. 2409.16120 link
2024-09-25 Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration Pin-Jui Ku et.al. 2409.16117 link
2024-09-24 Exploring Hint Generation Approaches in Open-Domain Question Answering Jamshid Mozafari et.al. 2409.16096 link
2024-09-20 Gender Representation and Bias in Indian Civil Service Mock Interviews Somonnoy Banerjee et.al. 2409.12194 null
2024-09-18 Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution Peng Wang et.al. 2409.12191 link
2024-09-18 To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Zayne Sprague et.al. 2409.12183 link
2024-09-23 A Controlled Study on Long Context Extension and Generalization in LLMs Yi Lu et.al. 2409.12181 link
2024-09-18 Finetuning Language Models to Emit Linguistic Expressions of Uncertainty Arslan Chaudhry et.al. 2409.12180 null
2024-09-18 Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference Najmeh Forouzandehmehr et.al. 2409.12150 null
2024-09-18 MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning Justin Chih-Yao Chen et.al. 2409.12147 link
2024-09-18 MoRAG – Multi-Fusion Retrieval Augmented Generation for Human Motion Kalakonda Sai Shashank et.al. 2409.12140 null
2024-09-24 Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models Sijing Chen et.al. 2409.12139 null
2024-09-18 GRIN: GRadient-INformed MoE Liyuan Liu et.al. 2409.12136 null
2024-09-18 Linguini: A benchmark for language-agnostic linguistic reasoning Eduardo Sánchez et.al. 2409.12126 link
2024-09-18 Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement An Yang et.al. 2409.12122 null
2024-09-18 Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference Edresson Casanova et.al. 2409.12117 null
2024-09-18 Measuring Human and AI Values based on Generative Psychometrics with Large Language Models Haoran Ye et.al. 2409.12106 link
2024-09-19 Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval Warren Jouanneau et.al. 2409.12097 null
2024-09-19 The Impact of Element Ordering on LM Agent Performance Wayne Chi et.al. 2409.12089 link
2024-09-18 Dual-Layer Training and Decoding of Large Language Model with Simultaneously Thinking and Speaking Ningyuan Xi et.al. 2409.12059 null
2024-09-19 Using Large Language Models to Generate Clinical Trial Tables and Figures Yumeng Yang et.al. 2409.12046 null
2024-09-18 All-in-one foundational models learning across quantum chemical levels Yuxinxin Chen et.al. 2409.12015 link
2024-09-18 Mixture of Prompt Learning for Vision Language Models Yu Du et.al. 2409.12011 null
2024-09-17 AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs Basel Mousi et.al. 2409.11404 null
2024-09-17 NVLM: Open Frontier-Class Multimodal LLMs Wenliang Dai et.al. 2409.11402 null
2024-09-17 Says Who? Effective Zero-Shot Annotation of Focalization Rebecca M. M. Hicke et.al. 2409.11390 null
2024-09-17 Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement Simon Yu et.al. 2409.11378 link
2024-09-17 Towards Time Series Reasoning with LLMs Winnie Chow et.al. 2409.11376 null
2024-09-17 Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification Fatema-E-Jannat et.al. 2409.11375 null
2024-09-17 Learning Spatially-Aware Language and Audio Embedding Bhavika Devnani et.al. 2409.11369 null
2024-09-17 CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration Jiahui Gao et.al. 2409.11365 null
2024-09-17 CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark Zachary S. Siegel et.al. 2409.11363 link
2024-09-17 AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances Dhruv Agarwal et.al. 2409.11360 null
2024-09-17 THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models Mengfei Liang et.al. 2409.11353 link
2024-09-17 LPT++: Efficient Training on Mixture of Long-tailed Experts Bowen Dong et.al. 2409.11323 null
2024-09-17 SOAP: Improving and Stabilizing Shampoo using Adam Nikhil Vyas et.al. 2409.11321 link
2024-09-17 Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models Divij Gupta et.al. 2409.11302 null
2024-09-17 Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5 Marcel Lamott et.al. 2409.11282 null
2024-09-17 P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task Weiye Xu et.al. 2409.11279 null
2024-09-17 Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments Maria Rigaki et.al. 2409.11276 null
2024-09-17 Task Arithmetic for Language Expansion in Speech Translation Yao-Fei Cheng et.al. 2409.11274 null
2024-09-17 LOLA – An Open-Source Massively Multilingual Large Language Model Nikit Srivastava et.al. 2409.11272 link
2024-09-17 Bio-Inspired Mamba: Temporal Locality and Bioplausible Learning in Selective State Space Models Jiahao Qin et.al. 2409.11263 null
2024-09-16 RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval Di Liu et.al. 2409.10516 link
2024-09-16 Context-aware Code Segmentation for C-to-Rust Translation using Large Language Models Momoko Shiraishi et.al. 2409.10506 null
2024-09-16 DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction John Wu et.al. 2409.10504 null
2024-09-16 Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles Kulin Shah et.al. 2409.10502 link
2024-09-16 Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models Shaznin Sultana et.al. 2409.10490 null
2024-09-16 Do Pre-trained Vision-Language Models Encode Object States? Kaleb Newman et.al. 2409.10488 null
2024-09-16 XLM for Autonomous Driving Systems: A Comprehensive Review Sonda Fourati et.al. 2409.10484 null
2024-09-16 Schrodinger’s Memory: Large Language Models Wei Wang et.al. 2409.10482 null
2024-09-16 Towards Semantic Versioning of Open Pre-trained Language Model Releases on Hugging Face Adekunle Ajibode et.al. 2409.10472 null
2024-09-16 LLM as BT-Planner: Leveraging LLMs for Behavior Tree Generation in Robot Task Planning Jicong Ao et.al. 2409.10444 link
2024-09-16 CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera Jingpei Lu et.al. 2409.10441 null
2024-09-16 HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models Vineet Bhat et.al. 2409.10419 null
2024-09-16 A Large-Scale Privacy Assessment of Android Third-Party SDKs Mark Huasong Meng et.al. 2409.10411 null
2024-09-16 A Knowledge-Enhanced Disease Diagnosis Method Based on Prompt Learning and BERT Integration Zhang Zheng et.al. 2409.10403 null
2024-09-17 Learnings from a Large-Scale Deployment of an LLM-Powered Expert-in-the-Loop Healthcare Chatbot Bhuvan Sachdeva et.al. 2409.10354 null
2024-09-16 Large Language Model Enhanced Hard Sample Identification for Denoising Recommendation Tianrui Song et.al. 2409.10343 null
2024-09-16 The 20 questions game to distinguish large language models Gurvan Richardeau et.al. 2409.10338 null
2024-09-16 MGSA: Multi-granularity Graph Structure Attention for Knowledge Graph-to-Text Generation Shanshan Wang et.al. 2409.10294 null
2024-09-16 ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework Jiahao Yuan et.al. 2409.10289 link
2024-09-16 ComplexCodeEval: A Benchmark for Evaluating Large Code Models on More Complex Code Jia Feng et.al. 2409.10280 link
2024-09-13 Agents in Software Engineering: Survey, Landscape, and Vision Yanxian Huang et.al. 2409.09030 link
2024-09-13 Contri(e)ve: Context + Retrieve for Scholarly Question Answering Kanchan Shivashankar et.al. 2409.09010 null
2024-09-13 Safeguarding Decentralized Social Media: LLM Agents for Automating Community Rule Compliance Lucio La Cava et.al. 2409.08963 null
2024-09-13 Emerging Reliance Behaviors in Human-AI Text Generation: Hallucinations, Data Quality Assessment, and Cognitive Forcing Functions Zahra Ashktorab et.al. 2409.08937 null
2024-09-13 SynSUM – Synthetic Benchmark with Structured and Unstructured Medical Records Paloma Rabaey et.al. 2409.08936 link
2024-09-13 LLM-based Weak Supervision Framework for Query Intent Classification in Video Search Farnoosh Javadi et.al. 2409.08931 null
2024-09-13 Affective Computing Has Changed: The Foundation Model Disruption Björn Schuller et.al. 2409.08907 null
2024-09-13 AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models Yifei Yao et.al. 2409.08904 link
2024-09-13 A Market for Lemons? Strategic Directions for a Vigilant Application of Artificial Intelligence in Entrepreneurship Research Martin Obschonka et.al. 2409.08890 null
2024-09-13 Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark Xuchen Li et.al. 2409.08887 null
2024-09-13 Exploring Graph Structure Comprehension Ability of Multimodal Large Language Models: Case Studies Zhiqiang Zhong et.al. 2409.08864 null
2024-09-13 FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition Zhenhua Xu et.al. 2409.08846 null
2024-09-13 AIPO: Improving Training Objective for Iterative Preference Optimization Yaojie Shen et.al. 2409.08845 link
2024-09-13 A RAG Approach for Generating Competency Questions in Ontology Engineering Xueli Pan et.al. 2409.08820 null
2024-09-13 Your Weak LLM is Secretly a Strong Teacher for Alignment Leitian Tao et.al. 2409.08813 null
2024-09-13 Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task Shao Zhang et.al. 2409.08811 null
2024-09-13 LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment Huan Zhang et.al. 2409.08795 link
2024-09-13 Optimizing Ingredient Substitution Using Large Language Models to Enhance Phytochemical Content in Recipes Luis Rita et.al. 2409.08792 null
2024-09-13 Electrocardiogram Report Generation and Question Answering via Retrieval-Augmented Self-Supervised Modeling Jialu Tang et.al. 2409.08788 null
2024-09-13 Uncertainty and Generalizability in Foundation Models for Earth Observation Raul Ramos-Pollan et.al. 2409.08744 null
2024-09-12 Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale Rogerio Bonatti et.al. 2409.08264 link
2024-09-12 OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering Jiahao Nick Li et.al. 2409.08250 null
2024-09-12 Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources Alisia Lupidi et.al. 2409.08239 null
2024-09-12 LLM Honeypot: Leveraging Large Language Models as Advanced Interactive Honeypot Systems Hakan T. Otal et.al. 2409.08234 link
2024-09-12 Adaptive Language-Guided Abstraction from Contrastive Explanations Andi Peng et.al. 2409.08212 null
2024-09-12 ComAlign: Compositional Alignment in Vision-Language Models Ali Abdollah et.al. 2409.08206 null
2024-09-12 What Makes a Maze Look Like a Maze? Joy Hsu et.al. 2409.08202 null
2024-09-12 AudioBERT: Audio Knowledge Augmented Language Model Hyunjong Ok et.al. 2409.08199 link
2024-09-12 Fine-tuning Large Language Models for Entity Matching Aaron Steiner et.al. 2409.08185 link
2024-09-12 On the Role of Context in Reading Time Prediction Andreas Opedal et.al. 2409.08160 link
2024-09-12 Faster Speech-LLaMA Inference with Multi-token Prediction Desh Raj et.al. 2409.08148 null
2024-09-12 LLM-POTUS Score: A Framework of Analyzing Presidential Debates with Large Language Models Zhengliang Liu et.al. 2409.08147 null
2024-09-12 Towards a graph-based foundation model for network traffic analysis Louis Van Langendonck et.al. 2409.08111 null
2024-09-12 The Faetar Benchmark: Speech Recognition in a Very Under-Resourced Language Michael Ong et.al. 2409.08103 null
2024-09-12 The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal Huiyuan Xie et.al. 2409.08098 null
2024-09-12 Securing Large Language Models: Addressing Bias, Misinformation, and Prompt Attacks Benji Peng et.al. 2409.08087 null
2024-09-12 SimMAT: Exploring Transferability from Vision Foundation Models to Any Image Modality Chenyang Lei et.al. 2409.08083 link
2024-09-12 SoVAR: Building Generalizable Scenarios from Accident Reports for Autonomous Driving Testing An Guo et.al. 2409.08081 null
2024-09-12 TravelAgent: An AI Assistant for Personalized Travel Planning Aili Chen et.al. 2409.08069 null
2024-09-12 An Evaluation Framework for Attributed Information Retrieval using Large Language Models Hanane Djeddal et.al. 2409.08014 link
2024-09-11 “My Grade is Wrong!”: A Contestable AI Framework for Interactive Feedback in Evaluating Student Essays Shengxin Hong et.al. 2409.07453 null
2024-09-11 StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos Sijie Zhao et.al. 2409.07447 null
2024-09-11 SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories Ben Bogin et.al. 2409.07440 link
2024-09-11 A Suite for Acoustic Language Model Evaluation Gallil Maimon et.al. 2409.07437 link
2024-09-11 Synthetic continued pretraining Zitong Yang et.al. 2409.07431 link
2024-09-11 Agent Workflow Memory Zora Zhiruo Wang et.al. 2409.07429 link
2024-09-11 CLNX: Bridging Code and Natural Language for C/C++ Vulnerability-Contributing Commits Identification Zeqing Qin et.al. 2409.07407 null
2024-09-11 AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge Han Wang et.al. 2409.07394 link
2024-09-11 Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination Daniel Zhang-Li et.al. 2409.07372 null
2024-09-11 Demo: SGCode: A Flexible Prompt-Optimizing System for Secure Generation of Code Khiem Ton et.al. 2409.07368 null
2024-09-11 Think Together and Work Better: Combining Humans’ and LLMs’ Think-Aloud Outcomes for Effective Text Evaluation SeongYeub Chu et.al. 2409.07355 link
2024-09-11 Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks Md Zarif Hossain et.al. 2409.07353 link
2024-09-11 Explanation, Debate, Align: A Weak-to-Strong Framework for Language Model Generalization Mehrdad Zakershahrak et.al. 2409.07335 null
2024-09-11 Learning to Compress Contexts for Efficient Knowledge-based Visual Question Answering Weixi Weng et.al. 2409.07331 null
2024-09-11 MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications Praveen K Kanithi et.al. 2409.07314 null
2024-09-11 Exploring User-level Gradient Inversion with a Diffusion Prior Zhuohang Li et.al. 2409.07291 null
2024-09-11 STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM Qijiong Liu et.al. 2409.07276 null
2024-09-11 MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving Enming Zhang et.al. 2409.07267 link
2024-09-11 Alignment of Diffusion Models: Fundamentals, Challenges, and Future Buhua Liu et.al. 2409.07253 link
2024-09-11 PiTe: Pixel-Temporal Alignment for Large Video-Language Model Yang Liu et.al. 2409.07239 link
2024-09-10 Benchmarking Sub-Genre Classification For Mainstage Dance Music Hongzhi Shu et.al. 2409.06690 null
2024-09-10 E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning Zihan Liao et.al. 2409.06679 null
2024-09-10 LLaMA-Omni: Seamless Speech Interaction with Large Language Models Qingkai Fang et.al. 2409.06666 link
2024-09-10 Human Perception of LLM-generated Text Content in Social Media Environments Kristina Radivojevic et.al. 2409.06653 null
2024-09-10 Optimal Workload Placement on Multi-Instance GPUs Bekir Turkkan et.al. 2409.06646 null
2024-09-10 EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis Danli Shi et.al. 2409.06644 null
2024-09-11 Segmenting sea ice floes in close-range optical imagery with active contour and foundation models Giulio Passerotti et.al. 2409.06641 null
2024-09-10 TeXBLEU: Automatic Metric for Evaluate LaTeX Format Kyudan Jung et.al. 2409.06639 link
2024-09-10 MoWE-Audio: Multitask AudioLLMs with Mixture of Weak Encoders Wenyu Zhang et.al. 2409.06635 null
2024-09-10 A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio Ningyuan Xi et.al. 2409.06624 null
2024-09-10 Exploring Italian sentence embeddings properties through multi-tasking Vivi Nastase et.al. 2409.06622 link
2024-09-10 Alleviating Hallucinations in Large Language Models with Scepticism Modeling Yetao Wu et.al. 2409.06601 null
2024-09-10 GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering Sacha Muller et.al. 2409.06595 link
2024-09-10 Quantifying and Enabling the Interpretability of CLIP-like Models Avinash Madasu et.al. 2409.06579 null
2024-09-10 Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement Vivi Nastase et.al. 2409.06567 null
2024-09-10 MAPS: Energy-Reliability Tradeoff Management in Autonomous Vehicles Through LLMs Penetrated Science Mahdieh Aliazam et.al. 2409.06558 null
2024-09-10 Questioning Internal Knowledge Structure of Large Language Models Through the Lens of the Olympic Games Juhwan Choi et.al. 2409.06518 link
2024-09-10 Aligning Machine and Human Visual Representations across Abstraction Levels Lukas Muttenthaler et.al. 2409.06509 null
2024-09-10 Mitigating Hallucination in Visual-Language Models via Re-Balancing Contrastive Decoding Xiaoyu Liang et.al. 2409.06485 null
2024-09-10 Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles Qiujing Lu et.al. 2409.06450 null
2024-09-09 MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct Run Luo et.al. 2409.05840 null
2024-09-09 Are Large Language Models a Threat to Programming Platforms? An Exploratory Study Md Mustakim Billah et.al. 2409.05824 null
2024-09-09 VFA: Vision Frequency Analysis of Foundation Models and Human Mohammad-Javad Darvishi-Bayazi et.al. 2409.05817 null
2024-09-09 Improving Pretraining Data Using Perplexity Correlations Tristan Thrush et.al. 2409.05816 null
2024-09-09 Benchmarking Chinese Knowledge Rectification in Large Language Models Tianhe Lu et.al. 2409.05806 link
2024-09-09 Evidence from fMRI Supports a Two-Phase Abstraction Process in Language Models Emily Cheng et.al. 2409.05771 null
2024-09-09 Model Input Verification of Large Scale Simulations Rumyana Neykova et.al. 2409.05768 null
2024-09-09 A Novel Idea Generation Tool using a Structured Conversational AI (CAI) System B. Sankar et.al. 2409.05747 null
2024-09-09 LLMs Will Always Hallucinate, and We Need to Live With This Sourav Banerjee et.al. 2409.05746 null
2024-09-09 A System and Benchmark for LLM-based Q&A on Heterogeneous Data Achille Fokoue et.al. 2409.05735 null
2024-09-09 Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach Meng Zhou et.al. 2409.05732 null
2024-09-09 The Influence of Task and Group Disparities over Users’ Attitudes Toward Using Large Language Models for Psychotherapy Qihang He et.al. 2409.05703 null
2024-09-09 Segmentation by Factorization: Unsupervised Semantic Segmentation for Pathology by Factorizing Foundation Model Features Jacob Gildenblat et.al. 2409.05697 null
2024-09-09 Zero-shot Outlier Detection via Prior-data Fitted Networks: Model Selection Bygone! Yuchen Shen et.al. 2409.05672 null
2024-09-09 Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case Vagrant Gautam et.al. 2409.05653 link
2024-09-10 MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery Hongjin Qian et.al. 2409.05591 link
2024-09-09 Leveraging Content and Acoustic Representations for Efficient Speech Emotion Recognition Soumya Dutta et.al. 2409.05566 null
2024-09-09 CauseJudger: Identifying the Cause with LLMs for Abductive Logical Reasoning Jinwei He et.al. 2409.05559 null
2024-09-09 SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning Alireza Ghafarollahi et.al. 2409.05556 link
2024-09-09 Harmonic Reasoning in Large Language Models Anna Kruspe et.al. 2409.05521 null
2024-09-06 VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation Yecheng Wu et.al. 2409.04429 link
2024-09-06 Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques Davide Clode da Silva et.al. 2409.04424 null
2024-09-06 RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs Jiaxing Wu et.al. 2409.04421 null
2024-09-06 Question-Answering Dense Video Events Hangyu Qin et.al. 2409.04388 null
2024-09-06 Learning vs Retrieval: The Role of In-Context Examples in Regression with LLMs Aliakbar Nafar et.al. 2409.04318 link
2024-09-06 An optically accelerated extreme learning machine using hot atomic vapors Pierre Azam et.al. 2409.04312 null
2024-09-06 Using Large Language Models to Generate Authentic Multi-agent Knowledge Work Datasets Desiree Heim et.al. 2409.04286 null
2024-09-06 Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models Yuxiao Huang et.al. 2409.04270 null
2024-09-06 An overview of domain-specific foundation model: key technologies, applications and challenges Haolong Chen et.al. 2409.04267 null
2024-09-06 UniDet3D: Multi-dataset Indoor 3D Object Detection Maksim Kolodiazhnyi et.al. 2409.04234 link
2024-09-06 Fast Forwarding Low-Rank Training Adir Rahamim et.al. 2409.04206 null
2024-09-06 Residual Stream Analysis with Multi-Layer SAEs Tim Lawson et.al. 2409.04185 link
2024-09-06 GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding Ziyin Zhang et.al. 2409.04183 null
2024-09-06 Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question Answering Larissa Pusch et.al. 2409.04181 null
2024-09-06 From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks Andreas Stephan et.al. 2409.04168 null
2024-09-06 Can OpenSource beat ChatGPT? – A Comparative Study of Large Language Models for Text-to-Code Generation Luis Mayer et.al. 2409.04164 null
2024-09-06 Prompt-based Personality Profiling: Reinforcement Learning for Relevance Filtering Jan Hofmann et.al. 2409.04122 null
2024-09-06 Multi-Programming Language Ensemble for Code Generation in Large Language Model Tengfei Xue et.al. 2409.04114 link
2024-09-06 Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers Chenglei Si et.al. 2409.04109 link
2024-09-06 UI-JEPA: Towards Active Perception of User Intent through Onscreen User Activity Yicheng Fu et.al. 2409.04081 null
2024-09-05 Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding Yunze Man et.al. 2409.03757 link
2024-09-05 Foundation Model or Finetune? Evaluation of few-shot semantic segmentation for river pollution Marga Don et.al. 2409.03754 link
2024-09-05 Attention Heads of Large Language Models: A Survey Zifan Zheng et.al. 2409.03752 link
2024-09-05 LLM-CI: Assessing Contextual Integrity Norms in Language Models Yan Shvartzshnaider et.al. 2409.03735 null
2024-09-05 Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry Meena Jagadeesan et.al. 2409.03734 null
2024-09-05 Planning In Natural Language Improves LLM Search For Code Generation Evan Wang et.al. 2409.03733 link
2024-09-06 RAG based Question-Answering for Contextual Response Prediction System Sriram Veturi et.al. 2409.03708 null
2024-09-05 LAST: Language Model Aware Speech Tokenization Arnon Turetzky et.al. 2409.03701 null
2024-09-05 TRACE-cs: Trustworthy Reasoning for Contrastive Explanations in Course Scheduling Problems Stylianos Loukas Vasileiou et.al. 2409.03671 null
2024-09-05 A Fused Large Language Model for Predicting Startup Success Abdurahman Maarouf et.al. 2409.03668 null
2024-09-05 The representation landscape of few-shot learning and fine-tuning in large language models Diego Doimo et.al. 2409.03662 link
2024-09-06 LLM-based multi-agent poetry generation in non-cooperative environments Ran Zhang et.al. 2409.03659 link
2024-09-05 On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization Yong Lin et.al. 2409.03650 null
2024-09-05 Text-Guided Mixup Towards Long-Tailed Image Categorization Richard Franklin et.al. 2409.03583 link
2024-09-05 FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation Xi Chen et.al. 2409.03525 null
2024-09-05 Have Large Vision-Language Models Mastered Art History? Ombretta Strafforello et.al. 2409.03521 null
2024-09-05 Tissue Concepts: supervised foundation models in computational pathology Till Nicke et.al. 2409.03519 link
2024-09-05 From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents Jifan Yu et.al. 2409.03512 null
2024-09-05 LLM-based event abstraction and integration for IoT-sourced logs Mohsen Shirali et.al. 2409.03478 link
2024-09-05 How Much Data is Enough Data? Fine-Tuning Large Language Models for In-House Translation: Performance Evaluation Across Multiple Dataset Sizes Inacio Vieira et.al. 2409.03454 null
2024-09-04 RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins (early version) Yao Mu et.al. 2409.02920 null
2024-09-04 Can LVLMs Obtain a Driver’s License? A Benchmark Towards Reliable AGI for Autonomous Driving Yuhang Lu et.al. 2409.02914 null
2024-09-04 Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling Kaiwen Zheng et.al. 2409.02908 null
2024-09-05 LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA Jiajie Zhang et.al. 2409.02897 link
2024-09-04 LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture Xidong Wang et.al. 2409.02889 link
2024-09-04 CanvOI, an Oncology Intelligence Foundation Model: Scaling FLOPS Differently Jonathan Zalach et.al. 2409.02885 null
2024-09-04 Benchmarking Spurious Bias in Few-Shot Image Classifiers Guangtao Zheng et.al. 2409.02882 link
2024-09-04 Configurable Foundation Models: Building LLMs from a Modular Perspective Chaojun Xiao et.al. 2409.02877 null
2024-09-04 Historical German Text Normalization Using Type- and Token-Based Language Modeling Anton Ehrmanntraut et.al. 2409.02841 null
2024-09-04 Exploring Sentiment Dynamics and Predictive Behaviors in Cryptocurrency Discussions by Few-Shot Learning with Large Language Models Moein Shahiki Tash et.al. 2409.02836 null
2024-09-04 CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models Wentao Liu et.al. 2409.02834 link
2024-09-04 ExpLLM: Towards Chain of Thought for Facial Expression Recognition Xing Lan et.al. 2409.02828 null
2024-09-04 Design Contradictions: Help or Hindrance? Aron E. Owen et.al. 2409.02823 null
2024-09-04 Language Understanding as a Constraint on Consensus Size in LLM Societies Giordano De Marzo et.al. 2409.02822 null
2024-09-04 Towards a Unified View of Preference Learning for Large Language Models: A Survey Bofei Gao et.al. 2409.02795 link
2024-09-05 Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models? Yixuan Tang et.al. 2409.02727 link
2024-09-04 Pre-training data selection for biomedical domain adaptation using journal impact metrics Mathieu Laï-king et.al. 2409.02725 null
2024-09-04 Alignment-Aware Model Extraction Attacks on Large Language Models Zi Liang et.al. 2409.02718 link
2024-09-04 Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL Mohammad Reshadati et.al. 2409.02711 null
2024-09-04 LLM-Assisted Visual Analytics: Opportunities and Challenges Maeve Hutchinson et.al. 2409.02691 null
2024-08-30 SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists Raoyuan Zhao et.al. 2408.17437 link
2024-08-30 DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model Mona Sheikh Zeinoddin et.al. 2408.17433 link
2024-08-30 Advancing Multi-talker ASR Performance with Large Language Models Mohan Shi et.al. 2408.17431 null
2024-08-30 CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models Jonathan Bourne et.al. 2408.17428 null
2024-09-03 Open-vocabulary Temporal Action Localization using VLMs Naoki Wake et.al. 2408.17422 null
2024-08-30 Getting Inspiration for Feature Elicitation: App Store- vs. LLM-based Approach Jialiang Wei et.al. 2408.17404 link
2024-08-30 EMPOWER: Embodied Multi-role Open-vocabulary Planning with Online Grounding and Execution Francesco Argenziano et.al. 2408.17379 null
2024-08-30 NDP: Next Distribution Prediction as a More Broad Target Junhao Ruan et.al. 2408.17377 null
2024-08-30 Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain Francesca Grasso et.al. 2408.17362 link
2024-08-30 Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage Md Rafi Ur Rashid et.al. 2408.17354 null
2024-09-02 LSMS: Language-guided Scale-aware MedSegmentor for Medical Image Referring Segmentation Shuyi Ouyang et.al. 2408.17347 null
2024-08-30 Investigating Neuron Ablation in Attention Heads: The Case for Peak Activation Centering Nicholas Pochinkov et.al. 2408.17322 link
2024-08-30 Bridging Domain Knowledge and Process Discovery Using Large Language Models Ali Norouzifar et.al. 2408.17316 link
2024-08-30 Flexible and Effective Mixing of Large Language Models into a Mixture of Domain Experts Rhui Dih Lee et.al. 2408.17280 null
2024-08-30 Joint Estimation and Prediction of City-wide Delivery Demand: A Large Language Model Empowered Graph-based Learning Approach Tong Nie et.al. 2408.17258 null
2024-08-30 VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters Mouxiang Chen et.al. 2408.17253 link
2024-08-30 Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study Shubham Agarwal et.al. 2408.17181 null
2024-08-30 Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model Zhen Ye et.al. 2408.17175 link
2024-08-30 Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning Xiaoye Qu et.al. 2408.17150 link
2024-08-30 Reasoning AI Performance Degradation in 6G Networks with Large Language Models Liming Huang et.al. 2408.17097 null
2024-08-29 PromptSmooth: Certifying Robustness of Medical Vision-Language Models via Prompt Learning Noor Hussein et.al. 2408.16769 link
2024-08-29 How Far Can Cantonese NLP Go? Benchmarking Cantonese Capabilities of Large Language Models Jiyue Jiang et.al. 2408.16756 link
2024-08-29 Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models Alec Solway et.al. 2408.16753 null
2024-08-29 A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models Yi-Lin Tuan et.al. 2408.16751 null
2024-08-29 Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge Beidi Dong et.al. 2408.16749 null
2024-08-29 Theoretical and Methodological Framework for Studying Texts Produced by Large Language Models Jiří Milička et.al. 2408.16740 null
2024-08-29 Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling Hritik Bansal et.al. 2408.16737 null
2024-08-29 VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation Shiwei Wu et.al. 2408.16730 null
2024-08-30 Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Zhifei Xie et.al. 2408.16725 link
2024-08-29 GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models Moreno D’Incà et.al. 2408.16700 link
2024-08-29 Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity Ziniu Li et.al. 2408.16673 null
2024-08-29 Space3D-Bench: Spatial 3D Question Answering Benchmark Emilia Szymanska et.al. 2408.16662 null
2024-08-29 DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving Yongjie Fu et.al. 2408.16647 null
2024-08-29 Examination of Code generated by Large Language Models Robin Beer et.al. 2408.16601 link
2024-08-29 Enhancing Dialogue Generation in Werewolf Game Through Situation Analysis and Persuasion Strategies Zhiyang Qi et.al. 2408.16586 null
2024-08-29 WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Shengpeng Ji et.al. 2408.16532 link
2024-08-29 CNIMA: A Universal Evaluation Framework and Automated Approach for Assessing Second Language Dialogues Rena Gao et.al. 2408.16518 link
2024-08-29 LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs? Jan Cegin et.al. 2408.16502 null
2024-08-29 CogVLM2: Visual Language Models for Image and Video Understanding Wenyi Hong et.al. 2408.16500 link
2024-08-29 A Survey on Evaluating Large Language Models in Code Generation Tasks Liguo Chen et.al. 2408.16498 null
2024-08-28 Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders Min Shi et.al. 2408.15998 link
2024-08-29 Spatio-Temporal Context Prompting for Zero-Shot Action Detection Wei-Jhe Huang et.al. 2408.15996 null
2024-08-28 Perceive-IR: Learning to Perceive Degradation Better for All-in-One Image Restoration Xu Zhang et.al. 2408.15994 null
2024-08-28 BattleAgentBench: A Benchmark for Evaluating Cooperation and Competition Capabilities of Language Models in Multi-Agent Systems Wei Wang et.al. 2408.15971 null
2024-08-28 More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding Yuan Tang et.al. 2408.15966 link
2024-08-28 Atari-GPT: Investigating the Capabilities of Multimodal Large Language Models as Low-Level Policies for Atari Games Nicholas R. Waytowich et.al. 2408.15950 null
2024-08-28 DeMoBot: Deformable Mobile Manipulation with Vision-based Sub-goal Retrieval Yuying Zhang et.al. 2408.15919 null
2024-08-28 Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models Yuncheng Yang et.al. 2408.15915 link
2024-08-28 Decentralized LLM Inference over Edge Networks with Energy Harvesting Aria Khoshsirat et.al. 2408.15907 null
2024-08-28 LLM-Based Multi-Hop Question Answering with Knowledge Graph Integration in Evolving Environments Ruirui Chen et.al. 2408.15903 null
2024-08-28 Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts Nikolas Gritsch et.al. 2408.15901 null
2024-08-28 Bias in LLMs as Annotators: The Effect of Party Cues on Labelling Decision by Large Language Models Sebastian Vallejo Vera et.al. 2408.15895 null
2024-08-28 LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation Fangxun Shu et.al. 2408.15881 link
2024-08-28 Persuasion Games using Large Language Models Ganesh Prasath Ramani et.al. 2408.15879 null
2024-08-28 Retrieval-Augmented Instruction Tuning for Automated Process Engineering Calculations: A Tool-Chaining Problem-Solving Framework with Attributable Reflection Sagar Srinivas Sakhinana et.al. 2408.15866 null
2024-08-28 Benchmarking foundation models as feature extractors for weakly-supervised computational pathology Peter Neidlinger et.al. 2408.15823 null
2024-08-28 Visual Prompt Engineering for Medical Vision Language Models in Radiology Stefan Denner et.al. 2408.15802 null
2024-08-28 Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization Léo Hemamou et.al. 2408.15801 null
2024-08-28 Evaluating Named Entity Recognition Using Few-Shot Prompting with Large Language Models Hédi Zhegidi et.al. 2408.15796 link
2024-08-28 Efficient LLM Scheduling by Learning to Rank Yichao Fu et.al. 2408.15792 link
2024-08-27 Generative Verifiers: Reward Modeling as Next-Token Prediction Lunjun Zhang et.al. 2408.15240 null
2024-08-27 The Mamba in the Llama: Distilling and Accelerating Hybrid Models Junxiong Wang et.al. 2408.15237 link
2024-08-27 Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations Yucheng Jiang et.al. 2408.15232 null
2024-08-27 LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet Nathaniel Li et.al. 2408.15221 null
2024-08-27 Investigating Coverage Criteria in Large Language Models: An In-Depth Study Through Jailbreak Attacks Shide Zhou et.al. 2408.15207 null
2024-08-27 Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation Jian Hu et.al. 2408.15205 link
2024-08-27 Can Unconfident LLM Annotations Be Used for Confident Conclusions? Kristina Gligorić et.al. 2408.15204 link
2024-08-27 Infusing Acoustic Pause Context into Text-Based Dementia Assessment Franziska Braun et.al. 2408.15188 null
2024-08-27 Unlocking Potential in Pre-Trained Music Language Models for Versatile Multi-Track Music Arrangement Longshen Ou et.al. 2408.15176 null
2024-08-27 X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation Hanjia Lyu et.al. 2408.15172 null
2024-08-27 Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation N. E. Kriman et.al. 2408.15171 null
2024-08-27 How transformers learn structured data: insights from hierarchical filtering Jerome Garnier-Brun et.al. 2408.15138 null
2024-08-27 CLIP-AGIQA: Boosting the Performance of AI-Generated Image Quality Assessment with CLIP Zhenchen Tang et.al. 2408.15098 null
2024-08-27 Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models Xiyu Liu et.al. 2408.15091 null
2024-08-27 BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline Guosheng Dong et.al. 2408.15079 null
2024-08-27 Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models Ned Cooper et.al. 2408.15066 null
2024-08-27 The Benefits of Balance: From Information Projections to Variance Reduction Lang Liu et.al. 2408.15065 null
2024-08-28 DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding Wenhui Liao et.al. 2408.15045 null
2024-08-28 A Survey of Large Language Models for European Languages Wazir Ali et.al. 2408.15040 null
2024-08-27 Speech Recognition Transformers: Topological-lingualism Perspective Shruti Singh et.al. 2408.14991 null
2024-08-26 A Practitioner’s Guide to Continual Multimodal Pretraining Karsten Roth et.al. 2408.14471 link
2024-08-27 Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models Aradhye Agarwal et.al. 2408.14470 link
2024-08-26 Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos Qirui Chen et.al. 2408.14469 null
2024-08-26 Explicit Inductive Inference using Large Language Models Tianyang Liu et.al. 2408.14467 null
2024-08-26 Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study Liuchang Xu et.al. 2408.14438 null
2024-08-26 Social perception of faces in a vision-language model Carina I. Hausladen et.al. 2408.14435 link
2024-08-26 CHARTOM: A Visual Theory-of-Mind Benchmark for Multimodal Large Language Models Shubham Bharti et.al. 2408.14419 null
2024-08-26 MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues Kuluhan Binici et.al. 2408.14418 null
2024-08-26 Hyperdimensional Computing Empowered Federated Foundation Model over Wireless Networks for Metaverse Yahao Ding et.al. 2408.14416 null
2024-08-26 Language-specific Calibration for Pruning Multilingual Language Models Simon Kurz et.al. 2408.14398 null
2024-08-26 Reprogramming Foundational Large Language Models (LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning Sakhinana Sagar Srinivas et.al. 2408.14387 null
2024-08-26 Probing Causality Manipulation of Large Language Models Chenyang Zhang et.al. 2408.14380 link
2024-08-26 An Embedding is Worth a Thousand Noisy Labels Francesco Di Salvo et.al. 2408.14358 link
2024-08-26 SWE-bench-java: A GitHub Issue Resolving Benchmark for Java Daoguang Zan et.al. 2408.14354 link
2024-08-26 Assessing Contamination in Large Language Models: Introducing the LogProber method Nicolas Yax et.al. 2408.14352 null
2024-08-26 Foundation Models for Music: A Survey Yinghao Ma et.al. 2408.14340 link
2024-08-26 Claim Verification in the Age of Large Language Models: A Survey Alphaeus Dmonte et.al. 2408.14317 null
2024-08-26 LLM-3D Print: Large Language Models To Monitor and Control 3D Printing Yayati Jadhav et.al. 2408.14307 null
2024-08-26 Investigating the Effectiveness of Bayesian Spam Filters in Detecting LLM-modified Spam Mails Malte Josten et.al. 2408.14293 link
2024-08-26 Predictability and Causality in Spanish and English Natural Language Generation Andrea Busto-Castiñeira et.al. 2408.14283 null
2024-08-23 MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? Yi-Fan Zhang et.al. 2408.13257 null
2024-08-23 Domain-specific long text classification from sparse relevant information Célia D’Cruz et.al. 2408.13253 null
2024-08-23 Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption Sakhinana Sagar Srinivas et.al. 2408.13248 null
2024-08-23 Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time Yingyu Liang et.al. 2408.13233 null
2024-08-23 EUR-USD Exchange Rate Forecasting Based on Information Fusion with Large Language Models and Deep Learning Methods Hongcheng Ding et.al. 2408.13214 null
2024-08-23 DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation Qiming Zhu et.al. 2408.13204 null
2024-08-23 Can LLM be a Good Path Planner based on Prompt Engineering? Mitigating the Hallucination for Path Planning Hourui Deng et.al. 2408.13184 null
2024-08-23 IntelliCare: Improving Healthcare Analysis with Variance-Controlled Patient-Level Knowledge from Large Language Models Zhihao Yu et.al. 2408.13073 link
2024-08-23 Guiding IoT-Based Healthcare Alert Systems with Large Language Models Yulan Gao et.al. 2408.13071 null
2024-08-23 SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks Kai-Wei Chang et.al. 2408.13040 null
2024-08-23 VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models Wentao Wu et.al. 2408.13031 link
2024-08-23 In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting Haowei Du et.al. 2408.13028 null
2024-08-23 A Web-Based Solution for Federated Learning with LLM-Based Automation Chamith Mawela et.al. 2408.13010 null
2024-08-23 Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates Hui Wei et.al. 2408.13006 link
2024-08-23 CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution Ruiyang Xu et.al. 2408.13001 null
2024-08-23 Open Llama2 Model for the Lithuanian Language Artūras Nakvosas et.al. 2408.12963 null
2024-08-23 Multimodal Contrastive In-Context Learning Yosuke Miyanishi et.al. 2408.12959 null
2024-08-23 Image Segmentation in Foundation Model Era: A Survey Tianfei Zhou et.al. 2408.12957 link
2024-08-23 E-code: Mastering Efficient Code Generation through Pretrained Models and Expert Encoder Group Yue Pan et.al. 2408.12948 null
2024-08-23 Causal-Guided Active Learning for Debiasing Large Language Models Zhouhao Sun et.al. 2408.12942 link
2024-08-22 Controllable Text Generation for Large Language Models: A Survey Xun Liang et.al. 2408.12599 link
2024-08-23 Non-Homophilic Graph Pre-Training and Prompt Learning Xingtong Yu et.al. 2408.12594 null
2024-08-22 RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment Xiaohan Wang et.al. 2408.12579 null
2024-08-22 MuMA-ToM: Multi-modal Multi-Agent Theory of Mind Haojun Shi et.al. 2408.12574 link
2024-08-22 Jamba-1.5: Hybrid Transformer-Mamba Models at Scale Jamba Team et.al. 2408.12570 null
2024-08-22 ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation Lujia Zhong et.al. 2408.12561 link
2024-08-22 Towards Evaluating and Building Versatile Large Language Models for Medicine Chaoyi Wu et.al. 2408.12547 link
2024-08-22 Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Jinheng Xie et.al. 2408.12528 null
2024-08-22 MEDCO: Medical Education Copilots Based on A Multi-Agent Framework Hao Wei et.al. 2408.12496 null
2024-08-22 GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models Kunsheng Tang et.al. 2408.12494 link
2024-08-23 Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese Khang T. Doan et.al. 2408.12480 null
2024-08-22 Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition Bozheng Li et.al. 2408.12475 null
2024-08-22 DLCRec: A Novel Approach for Managing Diversity in LLM-Based Recommender Systems Jiaju Chen et.al. 2408.12470 null
2024-08-22 Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning Mushui Liu et.al. 2408.12469 null
2024-08-22 Enhancing Multi-hop Reasoning through Knowledge Erasure in Large Language Model Editing Mengqi Zhang et.al. 2408.12456 null
2024-08-22 Positional Description for Numerical Normalization Deepanshu Gupta et.al. 2408.12430 null
2024-08-22 FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing Jue Wang et.al. 2408.12429 link
2024-08-22 Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification Sudi Murindanyi et.al. 2408.12426 null
2024-08-22 Unlearning Trojans in Large Language Models: A Comparison Between Natural Language and Source Code Mahdi Kazemi et.al. 2408.12416 null
2024-08-22 Generalized SAM: Efficient Fine-Tuning of SAM for Variable Input Image Sizes Sota Kato et.al. 2408.12406 link
2024-08-21 Great Memory, Shallow Reasoning: Limits of $k$NN-LMs Shangyi Geng et.al. 2408.11815 link
2024-08-21 SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs Yuanyang Yin et.al. 2408.11813 null
2024-08-21 EmbodiedSAM: Online Segment Any 3D Thing in Real Time Xiuwei Xu et.al. 2408.11811 null
2024-08-21 Approaching Deep Learning through the Spectral Dynamics of Weights David Yunis et.al. 2408.11804 link
2024-08-21 Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models Yuzhou Huang et.al. 2408.11801 null
2024-08-21 PermitQA: A Benchmark for Retrieval Augmented Generation in Wind Siting and Permitting domain Rounak Meyur et.al. 2408.11800 null
2024-08-21 Practical token pruning for foundation models in few-shot conversational virtual assistant systems Haode Qi et.al. 2408.11799 null
2024-08-21 EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model Feipeng Ma et.al. 2408.11795 null
2024-08-21 Leveraging Chemistry Foundation Models to Facilitate Structure Focused Retrieval Augmented Generation in Multi-Agent Workflows for Catalyst and Materials Design Nathaniel H. Park et.al. 2408.11793 null
2024-08-21 Critique-out-Loud Reward Models Zachary Ankner et.al. 2408.11791 link
2024-08-21 DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework Zhifei Xie et.al. 2408.11788 null
2024-08-21 Personality Alignment of Large Language Models Minjun Zhu et.al. 2408.11779 link
2024-08-21 Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards Omar Erak et.al. 2408.11775 link
2024-08-21 Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks Yiyi Chen et.al. 2408.11749 link
2024-08-21 DH-Bench: Probing Depth and Height Perception of Large Visual-Language Models Shehreen Azad et.al. 2408.11748 link
2024-08-21 Open-Ended 3D Point Cloud Instance Segmentation Phuc D. A. Nguyen et.al. 2408.11747 null
2024-08-21 Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for Transformer Pretraining Pihe Hu et.al. 2408.11746 null
2024-08-21 FocusLLM: Scaling LLM’s Context by Parallel Decoding Zhenyu Li et.al. 2408.11745 null
2024-08-21 MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models Elias Frantar et.al. 2408.11743 link
2024-08-21 CluMo: Cluster-based Modality Fusion Prompt for Continual Learning in Visual Question Answering Yuliang Cai et.al. 2408.11742 link
2024-08-20 Prompt-Guided Image-Adaptive Neural Implicit Lookup Tables for Interpretable Image Enhancement Satoshi Kosugi et.al. 2408.11055 link
2024-08-20 Revisiting VerilogEval: Newer LLMs, In-Context Learning, and Specification-to-RTL Tasks Nathaniel Pinckney et.al. 2408.11053 link
2024-08-20 FLAME: Learning to Navigate with Multimodal LLM in Urban Environments Yunzhe Xu et.al. 2408.11051 link
2024-08-20 MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding Jian Chen et.al. 2408.11049 link
2024-08-20 Inside the Black Box: Detecting Data Leakage in Pre-trained Language Encoders Yuan Xin et.al. 2408.11046 null
2024-08-20 Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research Sreyoshi Bhaduri et.al. 2408.11043 null
2024-08-20 Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Chunting Zhou et.al. 2408.11039 null
2024-08-20 Scaling Law with Learning Rate Annealing Howe Tissue et.al. 2408.11029 null
2024-08-20 Athena: Safe Autonomous Agents with Verbal Contrastive Learning Tanmana Sadhu et.al. 2408.11021 null
2024-08-20 While GitHub Copilot Excels at Coding, Does It Ensure Responsible Output? Wen Cheng et.al. 2408.11006 link
2024-08-20 SenPa-MAE: Sensor Parameter Aware Masked Autoencoder for Multi-Satellite Self-Supervised Pretraining Jonathan Prexl et.al. 2408.11000 link
2024-08-20 CTP-LLM: Clinical Trial Phase Transition Prediction Using Large Language Models Michael Reinisch et.al. 2408.10995 null
2024-08-20 Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models Yuyan Chen et.al. 2408.10947 null
2024-08-20 Large Language Model Driven Recommendation Anton Korikov et.al. 2408.10946 null
2024-08-20 HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments Kazi Hasan Ibn Arif et.al. 2408.10945 link
2024-08-20 SysBench: Can Large Language Models Follow System Messages? Yanzhao Qin et.al. 2408.10943 link
2024-08-20 Proxona: Leveraging LLM-Driven Personas to Enhance Creators’ Understanding of Their Audience Yoonseo Choi et.al. 2408.10937 null
2024-08-20 LBC: Language-Based-Classifier for Out-Of-Variable Generalization Kangjun Noh et.al. 2408.10923 link
2024-08-21 BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model Yeyong Yu et.al. 2408.10903 link
2024-08-20 Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs John Mendonça et.al. 2408.10902 link
2024-08-19 SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP Yusuke Hirota et.al. 2408.10202 null
2024-08-19 Demystifying the Communication Characteristics for Distributed Transformer Models Quentin Anthony et.al. 2408.10197 null
2024-08-19 Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Aviv Bick et.al. 2408.10189 null
2024-08-19 LongVILA: Scaling Long-Context Visual Language Models for Long Videos Fuzhao Xue et.al. 2408.10188 link
2024-08-19 SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models Anke Tang et.al. 2408.10174 link
2024-08-19 Customizing Language Models with Instance-wise LoRA for Sequential Recommendation Xiaoyu Kong et.al. 2408.10159 link
2024-08-19 Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models Amey Hengle et.al. 2408.10151 link
2024-08-19 In-Context Learning with Representations: Contextual Generalization of Trained Transformers Tong Yang et.al. 2408.10147 null
2024-08-19 Instruction Finetuning for Leaderboard Generation from Empirical AI Research Salomon Kabongo et.al. 2408.10141 null
2024-08-19 Rhyme-aware Chinese lyric generator based on GPT Yixiao Yuan et.al. 2408.10130 null
2024-08-19 Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track Feiyu Pan et.al. 2408.10125 null
2024-08-19 Molecular Graph Representation Learning Integrating Large Language Models with Domain-specific Small Models Tianyu Zhang et.al. 2408.10124 link
2024-08-19 Geometry Informed Tokenization of Molecules for Language Model Generation Xiner Li et.al. 2408.10120 null
2024-08-19 GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization Ran Liu et.al. 2408.10115 link
2024-08-20 PLUTUS: A Well Pre-trained Large Unified Transformer can Unveil Financial Time Series Regularities Yuanjian Xu et.al. 2408.10111 null
2024-08-19 ARMADA: Attribute-Based Multimodal Data Augmentation Xiaomeng Jin et.al. 2408.10086 null
2024-08-19 Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning Sriyash Poddar et.al. 2408.10075 null
2024-08-19 FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant Zhengchao Huang et.al. 2408.10072 link
2024-08-19 Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory Haoran Li et.al. 2408.10053 null
2024-08-19 Defense Priorities in the Open-Source AI Debate: A Preliminary Assessment Masao Dahlgren et.al. 2408.10026 null
2024-08-16 SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation Xinyu Xiong et.al. 2408.08870 link
2024-08-16 PEDAL: Enhancing Greedy Decoding with Large Language Models using Diverse Exemplars Sumanth Prabhu et.al. 2408.08869 null
2024-08-16 A Hassle-free Algorithm for Private Learning in Practice: Don’t Use Tree Aggregation, Use BLTs H. Brendan McMahan et.al. 2408.08868 null
2024-08-16 Visual Agents as Fast and Slow Thinkers Guangyan Sun et.al. 2408.08862 link
2024-08-16 DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models Eman Ali et.al. 2408.08855 null
2024-08-16 GeoTransformer: Enhancing Urban Forecasting with Geospatial Attention Mechanisms Yuhao Jia et.al. 2408.08852 null
2024-08-16 ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis Yubao Zhao et.al. 2408.08849 link
2024-08-16 PsychoLex: Unveiling the Psychological Mind of Large Language Models Mohammad Amin Abbasi et.al. 2408.08848 null
2024-08-16 FLEXTAF: Enhancing Table Reasoning with Flexible Tabular Formats Xuanliang Zhang et.al. 2408.08841 link
2024-08-16 EasyRec: Simple yet Effective Language Models for Recommendation Xubin Ren et.al. 2408.08821 link
2024-08-16 Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models Lin Zhao et.al. 2408.08813 null
2024-08-16 Artificial Intelligence and Strategic Decision-Making: Evidence from Entrepreneurs and Investors Felipe A. Csaszar et.al. 2408.08811 null
2024-08-16 Constructing Domain-Specific Evaluation Sets for LLM-as-a-judge Ravi Raju et.al. 2408.08808 null
2024-08-16 CIKMar: A Dual-Encoder Approach to Prompt-Based Reranking in Educational Dialogue Systems Joanito Agili Lopo et.al. 2408.08805 null
2024-08-16 A Disease-Specific Foundation Model Using Over 100K Fundus Images: Release and Validation for Abnormality and Multi-Disease Classification on Downstream Tasks Boa Jang et.al. 2408.08790 link
2024-08-16 EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics Chenwei Wan et.al. 2408.08782 link
2024-08-16 Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions Chenming Tang et.al. 2408.08780 null
2024-08-16 DAC: Decomposed Automation Correction for Text-to-SQL Dingzirui Wang et.al. 2408.08779 link
2024-08-16 Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused Dingwei Chen et.al. 2408.08769 null
2024-08-16 Rethinking Generative Semantic Communication for Multi-User Systems with Multi-Modal LLM Wanting Yang et.al. 2408.08765 null
2024-08-15 Can Large Language Models Understand Symbolic Graphics Programs? Zeju Qiu et.al. 2408.08313 null
2024-08-15 ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws Ruihang Li et.al. 2408.08310 null
2024-08-15 Towards Flexible Visual Relationship Segmentation Fangrui Zhu et.al. 2408.08305 null
2024-08-15 Benchmarking the Capabilities of Large Language Models in Transportation System Engineering: Accuracy, Consistency, and Reasoning Behaviors Usman Syed et.al. 2408.08302 null
2024-08-15 VLPG-Nav: Object Navigation Using Visual Language Pose Graph and Object Localization Probability Maps Senthil Hariharan Arul et.al. 2408.08301 null
2024-08-15 HELP: Hierarchical Embeddings-based Log Parsing Andy Xu et.al. 2408.08300 null
2024-08-15 The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community Shachar Don-Yehiya et.al. 2408.08291 null
2024-08-15 Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model Jin Wang et.al. 2408.08282 null
2024-08-15 BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts Qizhen Zhang et.al. 2408.08274 null
2024-08-15 DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System Xihong Yang et.al. 2408.08231 null
2024-08-15 RED-CT: A Systems Design Methodology for Using LLM-labeled Data to Train and Deploy Edge Classifiers for Computational Social Science David Farr et.al. 2408.08217 null
2024-08-15 Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models Javier González et.al. 2408.08210 null
2024-08-15 LLM4DSR: Leveraging Large Language Model for Denoising Sequential Recommendation Bohao Wang et.al. 2408.08208 null
2024-08-15 Heavy Labels Out! Dataset Distillation with Label Space Lightening Ruonan Yu et.al. 2408.08201 null
2024-08-15 Scaling Up Natural Language Understanding for Multi-Robots Through the Lens of Hierarchy Shaojun Xu et.al. 2408.08188 null
2024-08-15 General-purpose Clothes Manipulation with Semantic Keypoints Yuhong Deng et.al. 2408.08160 null
2024-08-15 EmBARDiment: an Embodied AI Agent for Productivity in XR Riccardo Bovo et.al. 2408.08158 null
2024-08-15 DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Huajian Xin et.al. 2408.08152 link
2024-08-15 P/D-Serve: Serving Disaggregated Large Language Model at Scale Yibo Jin et.al. 2408.08147 null
2024-08-15 KOALA: Enhancing Speculative Decoding for LLM via Multi-Layer Draft Heads with Adversarial Learning Kaiqi Zhang et.al. 2408.08146 null
2024-08-14 The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models Karime Maamari et.al. 2408.07702 null
2024-08-15 Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities Enneng Yang et.al. 2408.07666 link
2024-08-14 Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models Yi-Cheng Lin et.al. 2408.07665 link
2024-08-14 Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions Quan Liu et.al. 2408.07663 link
2024-08-14 WeKnow-RAG: An Adaptive Approach for Retrieval-Augmented Generation Integrating Web Search and Knowledge Graphs Weijian Xie et.al. 2408.07611 null
2024-08-14 Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey Hamza Kheddar et.al. 2408.07583 null
2024-08-15 MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark Minxuan Zhou et.al. 2408.07543 link
2024-08-15 Usefulness of data flow diagrams and large language models for security threat validation: a registered report Winnie Bahati Mbaka et.al. 2408.07537 null
2024-08-14 Development of a Multi-Agent Clinical Decision Support System for Korean Triage and Acuity Scale (KTAS)-Based Triage and Treatment Planning in Emergency Departments Seungjun Han et.al. 2408.07531 null
2024-08-14 Large Language Models Know What Makes Exemplary Contexts Quanyu Long et.al. 2408.07505 null
2024-08-14 Cross-Platform Video Person ReID: A New Benchmark Dataset and Adaptation Approach Shizhou Zhang et.al. 2408.07500 link
2024-08-14 QirK: Question Answering via Intermediate Representation on Knowledge Graphs Jan Luca Scheerer et.al. 2408.07494 null
2024-08-14 Training Overhead Ratio: A Practical Reliability Metric for Large Language Model Training Systems Ning Lu et.al. 2408.07482 null
2024-08-14 Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization Yuxin Jiang et.al. 2408.07471 link
2024-08-14 Domain-invariant Representation Learning via Segment Anything Model for Blood Cell Classification Yongcheng Li et.al. 2408.07467 link
2024-08-14 Large Language Models Prompting With Episodic Memory Dai Do et.al. 2408.07465 null
2024-08-14 From Brazilian Portuguese to European Portuguese João Sanches et.al. 2408.07457 null
2024-08-14 Fact or Fiction? Improving Fact Verification with Knowledge Graphs through Simplified Subgraph Retrievals Tobias A. Opsahl et.al. 2408.07453 link
2024-08-15 BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning Asif Hanif et.al. 2408.07440 link
2024-08-14 Beyond Inter-Item Relations: Dynamic Adaptive Mixture-of-Experts for LLM-Based Sequential Recommendation CanYi Liu et.al. 2408.07427 null
2024-08-13 Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents Kexun Zhang et.al. 2408.07060 null
2024-08-13 LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Yushi Bai et.al. 2408.07055 link
2024-08-13 Casper: Prompt Sanitization for Protecting User Privacy in Web-Based Large Language Models Chun Jie Chong et.al. 2408.07004 null
2024-08-13 LLMs can Schedule Henrik Abgaryan et.al. 2408.06993 link
2024-08-13 DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs Dongyuan Li et.al. 2408.06966 null
2024-08-13 Towards Holistic Disease Risk Prediction using Small Language Models Liv Björkdahl et.al. 2408.06943 null
2024-08-13 OpenResearcher: Unleashing AI for Accelerated Scientific Research Yuxiang Zheng et.al. 2408.06941 link
2024-08-13 The advantages of context specific language models: the case of the Erasmian Language Model João Gonçalves et.al. 2408.06931 link
2024-08-13 Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas Louis Kwok et.al. 2408.06929 link
2024-08-13 SceneGPT: A Language Model for 3D Scene Understanding Shivam Chandhok et.al. 2408.06926 null
2024-08-13 Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives Zhihu Wang et.al. 2408.06904 null
2024-08-13 Leveraging Language Models for Emotion and Behavior Analysis in Education Kaito Tanaka et.al. 2408.06874 null
2024-08-13 LoRA$^2$: Multi-Scale Low-Rank Approximations for Fine-Tuning Large Language Models Jia-Chen Zhang et.al. 2408.06854 null
2024-08-13 Causal Agent based on Large Language Model Kairong Han et.al. 2408.06849 link
2024-08-13 DracoGPT: Extracting Visualization Design Preferences from Large Language Models Huichen Will Wang et.al. 2408.06845 null
2024-08-13 How Aligned are Human Chart Takeaways and LLM Predictions? A Case Study on Bar Charts with Varying Layouts Huichen Will Wang et.al. 2408.06837 null
2024-08-13 Efficient Search for Customized Activation Functions with Gradient Descent Lukas Strack et.al. 2408.06820 link
2024-08-13 MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty Yongjin Yang et.al. 2408.06816 null
2024-08-13 HLSPilot: LLM-based High-Level Synthesis Chenwei Xiong et.al. 2408.06810 link
2024-08-13 Layerwise Recurrent Router for Mixture-of-Experts Zihan Qiu et.al. 2408.06793 link
2024-08-12 FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection Yufei Huang et.al. 2408.06333 link
2024-08-12 Animate, or Inanimate, That is the Question for Large Language Models Leonardo Ranaldi et.al. 2408.06332 null
2024-08-12 Can We Rely on LLM Agents to Draft Long-Horizon Plans? Let’s Take TravelPlanner as an Example Yanan Chen et.al. 2408.06318 null
2024-08-12 Long-Form Answers to Visual Questions from Blind and Low Vision People Mina Huh et.al. 2408.06303 null
2024-08-12 The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Chris Lu et.al. 2408.06292 link
2024-08-12 MovieSum: An Abstractive Summarization Dataset for Movie Screenplays Rohit Saxena et.al. 2408.06281 link
2024-08-13 Review-driven Personalized Preference Reasoning with Large Language Models for Recommendation Jieyong Kim et.al. 2408.06276 null
2024-08-12 FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data Haoran Sun et.al. 2408.06273 link
2024-08-12 A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution Sampath Rajapaksha et.al. 2408.06272 null
2024-08-12 Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment Karel D’Oosterlinck et.al. 2408.06266 link
2024-08-12 Context-aware Visual Storytelling with Visual Prefix Tuning and Contrastive Learning Yingjin Song et.al. 2408.06259 null
2024-08-12 On Effects of Steering Latent Representation for Large Language Model Unlearning Dang Huu-Tien et.al. 2408.06223 null
2024-08-12 Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers Zhenting Qi et.al. 2408.06195 link
2024-08-12 FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework Lukas Meyer et.al. 2408.06190 link
2024-08-12 Improving Structural Diversity of Blackbox LLMs via Chain-of-Specification Prompting Halley Young et.al. 2408.06186 null
2024-08-12 OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning Mushui Liu et.al. 2408.06158 link
2024-08-12 LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library Tianhao Yu et.al. 2408.06150 null
2024-08-12 Self-Supervised Learning on MeerKAT Wide-Field Continuum Images Erica Lastufka et.al. 2408.06147 link
2024-08-12 Med42-v2: A Suite of Clinical LLMs Clément Christophe et.al. 2408.06142 null
2024-08-12 Utilize Transformers for translating Wikipedia category names Hoang-Thang Ta et.al. 2408.06124 null
2024-08-10 Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions Michele Miranda et.al. 2408.05212 link
2024-08-09 VITA: Towards Open-Source Interactive Omni Multimodal LLM Chaoyou Fu et.al. 2408.05211 link
2024-08-09 Evaluating the capability of large language models to personalize science texts for diverse middle-school-age learners Michael Vaccaro Jr et.al. 2408.05204 null
2024-08-09 TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning Yujie Feng et.al. 2408.05200 link
2024-08-09 ECG-FM: An Open Electrocardiogram Foundation Model Kaden McKeen et.al. 2408.05178 link
2024-08-09 Weak-Annotation of HAR Datasets using Vision Foundation Models Marius Bock et.al. 2408.05169 link
2024-08-09 AttackER: Towards Enhancing Cyber-Attack Attribution with a Named Entity Recognition Dataset Pritam Deka et.al. 2408.05149 null
2024-08-09 A Hybrid RAG System with Comprehensive Enhancement on Complex Reasoning Ye Yuan et.al. 2408.05141 null
2024-08-09 Is ChatGPT a Good Software Librarian? An Exploratory Study on the Use of ChatGPT for Software Library Recommendations Jasmine Latendresse et.al. 2408.05128 null
2024-08-09 Large Language Models and Thematic Analysis: Human-AI Synergy in Researching Hate Speech on Social Media Petre Breazu et.al. 2408.05126 null
2024-08-09 Sportify: Question Answering with Embedded Visualizations and Personified Narratives for Sports Video Chunggi Lee et.al. 2408.05123 null
2024-08-09 A Survey of NL2SQL with Large Language Models: Where are we, and where are we going? Xinyu Liu et.al. 2408.05109 link
2024-08-09 Depth Helps: Improving Pre-trained RGB-based Policy with Depth Information Injection Xincheng Pang et.al. 2408.05107 null
2024-08-09 How Well Do LLMs Identify Cultural Unity in Diversity? Jialin Li et.al. 2408.05102 link
2024-08-09 Hyperbolic Learning with Multimodal Large Language Models Paolo Mandica et.al. 2408.05097 null
2024-08-09 Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts Tingchen Fu et.al. 2408.05094 null
2024-08-09 Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models Zikai Xie et.al. 2408.05093 link
2024-08-09 Generating novel experimental hypotheses from language models: A case study on cross-dative generalization Kanishka Misra et.al. 2408.05086 link
2024-08-09 RT-Surv: Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records Sangjoon Park et.al. 2408.05074 null
2024-08-09 Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil Marcelo Sartori Locatelli et.al. 2408.05035 null
2024-08-08 Better Alignment with Instruction Back-and-Forth Translation Thao Nguyen et.al. 2408.04614 null
2024-08-08 Code-switching in text and speech reveals information-theoretic audience design Debasmita Bhattacharya et.al. 2408.04596 null
2024-08-09 Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models Qirui Jiao et.al. 2408.04594 link
2024-08-08 Towards Resilient and Efficient LLMs: A Comparative Study of Efficiency, Performance, and Adversarial Robustness Xiaojing Fan et.al. 2408.04585 null
2024-08-08 SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More Tianrun Chen et.al. 2408.04579 null
2024-08-08 SCENE: Evaluating Explainable AI Techniques Using Soft Counterfactuals Haoran Zheng et.al. 2408.04575 null
2024-08-08 Learning Fine-Grained Grounded Citations for Attributed Large Language Models Lei Huang et.al. 2408.04568 link
2024-08-08 Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models Yupeng Chang et.al. 2408.04556 link
2024-08-08 Depth Any Canopy: Leveraging Depth Foundation Models for Canopy Height Estimation Daniele Rege Cambrin et.al. 2408.04523 link
2024-08-08 Compromesso! Italian Many-Shot Jailbreaks Undermine the Safety of Large Language Models Fabio Pernisi et.al. 2408.04522 null
2024-08-08 What You Need is What You Get: Theory of Mind for an LLM-Based Code Understanding Assistant Jonan Richards et.al. 2408.04477 null
2024-08-08 Can LLMs Beat Humans in Debating? A Dynamic Multi-agent Framework for Competitive Debate Yiqun Zhang et.al. 2408.04472 link
2024-08-08 RiskAwareBench: Towards Evaluating Physical Risk Awareness for High-level Planning of LLM-based Embodied Agents Zihao Zhu et.al. 2408.04449 link
2024-08-08 Large Language Models for cross-language code clone detection Micheline Bénédicte Moumoula et.al. 2408.04430 null
2024-08-08 Recognizing Emotion Regulation Strategies from Human Behavior with Large Language Models Philipp Müller et.al. 2408.04420 null
2024-08-08 Enhancing Robustness of Retrieval-Augmented Language Models with In-Context Learning Seong-Il Park et.al. 2408.04414 null
2024-08-08 Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers Moritz Scherer et.al. 2408.04413 null
2024-08-08 Exploring Reasoning Biases in Large Language Models Through Syllogism: Insights from the NeuBAROCO Dataset Kentaro Ozeki et.al. 2408.04403 link
2024-08-08 Automated Educational Question Generation at Different Bloom’s Skill Levels using Large Language Models: Strategies and Evaluation Nicy Scaria et.al. 2408.04394 link
2024-08-08 Open-domain Implicit Format Control for Large Language Model Generation Yiqun Yao et.al. 2408.04392 link
2024-08-07 How Well Can Vision Language Models See Image Details? Chenhui Gou et.al. 2408.03940 null
2024-08-07 SLIM-RAFT: A Novel Fine-Tuning Approach to Improve Cross-Linguistic Performance for Mercosur Common Nomenclature Vinícius Di Oliveira et.al. 2408.03936 null
2024-08-07 CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases Xiangyan Liu et.al. 2408.03910 link
2024-08-07 Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models Shachi H Kumar et.al. 2408.03907 null
2024-08-07 Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond Beomseok Lee et.al. 2408.03900 link
2024-08-07 Simplifying Scholarly Abstracts for Accessible Digital Libraries Haining Wang et.al. 2408.03899 link
2024-08-07 From Data to Story: Towards Automatic Animated Data Video Creation with LLM-based Multi-Agent Systems Leixian Shen et.al. 2408.03876 null
2024-08-07 PackMamba: Efficient Processing of Variable-Length Sequences in Mamba training Haoran Xu et.al. 2408.03865 null
2024-08-07 GAIA – A Large Language Model for Advanced Power Dispatch Yuheng Cheng et.al. 2408.03847 null
2024-08-07 MaxMind: A Memory Loop Network to Enhance Software Productivity based on Large Language Models Yuchen Dong et.al. 2408.03841 null
2024-08-07 WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models Prannaya Gupta et.al. 2408.03837 link
2024-08-07 Target Prompting for Information Extraction with Vision Language Model Dipankar Medhi et.al. 2408.03834 null
2024-08-07 Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning Simret Araya Gebreegziabher et.al. 2408.03819 null
2024-08-07 Generative Language Models with Retrieval Augmented Generation for Automated Short Answer Scoring Zifan Wang et.al. 2408.03811 null
2024-08-07 ‘Finance Wizard’ at the FinLLM Challenge Task: Financial Text Summarization Meisin Lee et.al. 2408.03762 null
2024-08-07 MMSummary: Multimodal Summary Generation for Fetal Ultrasound Video Xiaoqing Guo et.al. 2408.03761 null
2024-08-07 Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation Jingjing Xie et.al. 2408.03735 link
2024-08-07 Question Rephrasing for Quantifying Uncertainty in Large Language Models: Applications in Molecular Chemistry Tasks Zizhang Chen et.al. 2408.03732 null
2024-08-07 A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models Pengxiang Zhao et.al. 2408.03728 null
2024-08-07 Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction Benjamin Matthias Ruppik et.al. 2408.03706 null
2024-08-06 CoverBench: A Challenging Benchmark for Complex Claim Verification Alon Jacovi et.al. 2408.03325 null
2024-08-06 Segment Anything in Medical Images and Videos: Benchmark and Deployment Jun Ma et.al. 2408.03322 link
2024-08-06 TextIM: Part-aware Interactive Motion Synthesis from Text Siyuan Fan et.al. 2408.03302 null
2024-08-06 KaPO: Knowledge-aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models Ruizhe Zhang et.al. 2408.03297 null
2024-08-06 Biomedical SAM 2: Segment Anything in Biomedical Images and Videos Zhiling Yan et.al. 2408.03286 link
2024-08-07 StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation Boxi Cao et.al. 2408.03281 link
2024-08-06 Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments Angie Boggust et.al. 2408.03274 null
2024-08-06 Synthesizing Text-to-SQL Data from Weak and Strong LLMs Jiaxi Yang et.al. 2408.03256 null
2024-08-06 Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons Yifei Wang et.al. 2408.03247 link
2024-08-06 Making Long-Context Language Models Better Multi-Hop Reasoners Yanyang Li et.al. 2408.03246 link
2024-08-06 Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi Pranita Deshmukh et.al. 2408.03172 null
2024-08-06 Conditioning LLMs with Emotion in Neural Machine Translation Charles Brazier et.al. 2408.03150 null
2024-08-06 Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization Yanghai Zhang et.al. 2408.03149 link
2024-08-06 Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations Leo Donisch et.al. 2408.03130 null
2024-08-06 Lisbon Computational Linguists at SemEval-2024 Task 2: Using A Mistral 7B Model and Data Augmentation Artur Guimarães et.al. 2408.03127 link
2024-08-06 Evaluating the Translation Performance of Large Language Models Based on Euas-20 Yan Huang et.al. 2408.03119 null
2024-08-06 Topic Modeling with Fine-tuning LLMs and Bag of Sentences Johannes Schneider et.al. 2408.03099 link
2024-08-07 TestART: Improving LLM-based Unit Test via Co-evolution of Automated Generation and Repair Iteration Siqi Gu et.al. 2408.03095 null
2024-08-06 500xCompressor: Generalized Prompt Compression for Large Language Models Zongqian Li et.al. 2408.03094 link
2024-08-06 Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement Le Yu et.al. 2408.03092 link
2024-08-05 Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining Dongyang Liu et.al. 2408.02657 link
2024-08-05 Can Reinforcement Learning Unlock the Hidden Dangers in Aligned Large Language Models? Mohammad Bahrami Karkevandi et.al. 2408.02651 null
2024-08-05 Command-line Obfuscation Detection using Small Language Models Vojtech Outrata et.al. 2408.02637 null
2024-08-05 SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models Muxi Diao et.al. 2408.02632 null
2024-08-05 Language Model Can Listen While Speaking Ziyang Ma et.al. 2408.02622 null
2024-08-05 Progressively Selective Label Enhancement for Language Model Alignment Biao Liu et.al. 2408.02599 null
2024-08-05 Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection Sajal Aggarwal et.al. 2408.02595 null
2024-08-05 Leveraging the Power of LLMs: A Fine-Tuning Approach for High-Quality Aspect-Based Summarization Ankan Mullick et.al. 2408.02584 null
2024-08-05 DanModCap: Designing a Danmaku Moderation Tool for Video-Sharing Platforms that Leverages Impact Captions Siying Hu et.al. 2408.02574 null
2024-08-05 Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information Yauwai Yim et.al. 2408.02559 null
2024-08-05 Generative AI as a Service in 6G Edge-Cloud: Generation Task Offloading by In-context Learning Hao Zhou et.al. 2408.02549 null
2024-08-05 RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation Daniel Fleischer et.al. 2408.02545 link
2024-08-05 Caution for the Environment: Multimodal Agents are Susceptible to Environmental Distractions Xinbei Ma et.al. 2408.02544 link
2024-08-05 Towards Coarse-grained Visual Language Navigation Task Planning Enhanced by Event Knowledge Graph Zhao Kaichen et.al. 2408.02535 null
2024-08-05 Practical Attacks against Black-box Code Completion Engines Slobodan Jenko et.al. 2408.02509 null
2024-08-05 UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model Zhaowei Li et.al. 2408.02503 link
2024-08-05 Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation Aaron Imani et.al. 2408.02502 null
2024-08-05 A First Look at License Compliance Capability of LLMs in Code Generation Weiwei Xu et.al. 2408.02487 link
2024-08-05 Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection Ting Lei et.al. 2408.02484 link
2024-08-05 From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future Haolin Jin et.al. 2408.02479 null
2024-08-02 Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting Xiangyu Zhao et.al. 2408.01423 null
2024-08-02 Mission Impossible: A Statistical Perspective on Jailbreaking LLMs Jingtong Su et.al. 2408.01420 null
2024-08-02 DebateQA: Evaluating Question Answering on Debatable Knowledge Rongwu Xu et.al. 2408.01419 link
2024-08-02 Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs Yilun Hua et.al. 2408.01417 null
2024-08-02 Pre-trained Language Models Improve the Few-shot Prompt Ability of Decision Transformer Yu Yang et.al. 2408.01402 null
2024-08-02 Coalitions of Large Language Models Increase the Robustness of AI Agents Prattyush Mangal et.al. 2408.01380 null
2024-08-02 Toward Automatic Relevance Judgment using Vision–Language Models for Image–Text Retrieval Evaluation Jheng-Hong Yang et.al. 2408.01363 null
2024-08-02 Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs Peng Ding et.al. 2408.01355 link
2024-08-02 MCGMark: An Encodable and Robust Online Watermark for LLM-Generated Malicious Code Kaiwen Ning et.al. 2408.01354 link
2024-08-02 Prompt Refinement or Fine-tuning? Best Practices for using LLMs in Computational Social Science Tasks Anders Giovanni Møller et.al. 2408.01346 null
2024-08-02 MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models Benno Weck et.al. 2408.01337 link
2024-08-02 A Backbone for Long-Horizon Robot Task Understanding Xiaoshuai Chen et.al. 2408.01334 null
2024-08-02 FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only He Zhu et.al. 2408.01323 null
2024-08-02 A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks Jiaqi Wang et.al. 2408.01319 null
2024-08-02 Reconsidering Token Embeddings with the Definitions for Pre-trained Language Models Ying Zhang et.al. 2408.01308 null
2024-08-02 The Mismeasure of Man and Models: Evaluating Allocational Harms in Large Language Models Hannah Chen et.al. 2408.01285 null
2024-08-02 RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework Kunlun Zhu et.al. 2408.01262 link
2024-08-02 The Phantom Menace: Unmasking Privacy Leakages in Vision-Language Models Simone Caldarella et.al. 2408.01228 null
2024-08-02 High-Throughput Phenotyping of Clinical Text Using Large Language Models Daniel B. Hier et.al. 2408.01214 null
2024-08-02 Misinforming LLMs: vulnerabilities, challenges and opportunities Bo Zhou et.al. 2408.01168 null
2024-08-01 AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation Mengkang Hu et.al. 2408.00764 null
2024-08-01 UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model Xiangyu Fan et.al. 2408.00762 null
2024-08-01 Tamper-Resistant Safeguards for Open-Weight LLMs Rishub Tamirisa et.al. 2408.00761 link
2024-08-01 Thermal Conductivity Predictions with Foundation Atomistic Models Balázs Póta et.al. 2408.00755 link
2024-08-01 Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model Benlin Liu et.al. 2408.00754 null
2024-08-01 Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation Siyu Jiao et.al. 2408.00744 link
2024-08-01 DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency Jovan Stojkovic et.al. 2408.00741 null
2024-08-01 Virchow 2: Scaling Self-Supervised Mixed Magnification Models in Pathology Eric Zimmermann et.al. 2408.00738 null
2024-08-01 Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions Guangzhi Xiong et.al. 2408.00727 link
2024-08-01 An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models Yangzhen Wu et.al. 2408.00724 null
2024-08-01 Pathway to Secure and Trustworthy 6G for LLMs: Attacks, Defense, and Opportunities Sunder Ali Khowaja et.al. 2408.00722 null
2024-08-01 SAM 2: Segment Anything in Images and Videos Nikhila Ravi et.al. 2408.00714 link
2024-08-01 Point-supervised Brain Tumor Segmentation with Box-prompted MedSAM Xiaofeng Liu et.al. 2408.00706 null
2024-08-01 Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning Trapoom Ukarapol et.al. 2408.00690 link
2024-08-01 Can Developers Prompt? A Controlled Experiment for Code Documentation Generation Hans-Alexander Kruse et.al. 2408.00686 null
2024-08-01 ExpertAF: Expert Actionable Feedback from Video Kumar Ashutosh et.al. 2408.00672 null
2024-08-01 AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models Daqin Luo et.al. 2408.00665 link
2024-08-01 Disentangling Dense Embeddings with Sparse Autoencoders Charles O’Neill et.al. 2408.00657 null
2024-08-02 SentenceVAE: Faster, Longer and More Accurate Inference with Next-sentence Prediction for Large Language Models Hongjun An et.al. 2408.00655 link
2024-08-01 Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning Xuri Ge et.al. 2408.00644 null
2024-07-31 Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey Atsuyuki Miyai et.al. 2407.21794 null
2024-07-31 Vision-Language Model Based Handwriting Verification Mihir Chauhan et.al. 2407.21788 null
2024-07-31 Large Language Monkeys: Scaling Inference Compute with Repeated Sampling Bradley Brown et.al. 2407.21787 null
2024-07-31 The Llama 3 Herd of Models Abhimanyu Dubey et.al. 2407.21783 null
2024-07-31 Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs Shi Liu et.al. 2407.21771 null
2024-07-31 MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts Xi Victoria Lin et.al. 2407.21770 null
2024-07-31 ReplanVLM: Replanning Robotic Tasks with Visual Language Models Aoran Mei et.al. 2407.21762 null
2024-07-31 Learning Video Context as Interleaved Multimodal Sequences Kevin Qinghong Lin et.al. 2407.21757 link
2024-07-31 A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation Mothilal Asokan et.al. 2407.21739 null
2024-07-31 Open-Vocabulary Audio-Visual Semantic Segmentation Ruohao Guo et.al. 2407.21721 null
2024-07-31 Adaptive Retrieval-Augmented Generation for Conversational Systems Xi Wang et.al. 2407.21712 null
2024-07-31 CEAR: Automatic construction of a knowledge graph of chemical entities and roles from scientific literature Stefan Langer et.al. 2407.21708 null
2024-07-31 TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities Ming Zhang et.al. 2407.21693 link
2024-07-31 Synth-Empathy: Towards High-Quality Synthetic Empathy Data Hao Liang et.al. 2407.21669 link
2024-08-01 Defending Jailbreak Attack in VLMs via Cross-modality Information Detector Yue Xu et.al. 2407.21659 link
2024-07-31 MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment Anurag Das et.al. 2407.21654 null
2024-07-31 Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation Xiang Luo et.al. 2407.21633 link
2024-07-31 TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods Gabriel Loiseau et.al. 2407.21630 link
2024-07-31 LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows Lukas Teufelberger et.al. 2407.21593 null
2024-07-31 A Performance Study of LLM-Generated Code on Leetcode Tristan Coignion et.al. 2407.21579 null
2024-07-30 ThinK: Thinner Key Cache by Query-Driven Pruning Yuhui Xu et.al. 2407.21018 null
2024-07-30 CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning Yuexi Du et.al. 2407.21011 link
2024-07-30 GABInsight: Exploring Gender-Activity Binding Bias in Vision-Language Models Ali Abdollahi et.al. 2407.21001 link
2024-07-30 MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning Yupeng Chen et.al. 2407.20999 null
2024-07-30 From Feature Importance to Natural Language Explanations Using LLMs with RAG Sule Tekkesinoglu et.al. 2407.20990 link
2024-07-30 Large Language Models (LLMs) for Semantic Communication in Edge-based IoT Networks Alakesh Kalita et.al. 2407.20970 null
2024-07-30 MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions Xiaowei Chi et.al. 2407.20962 link
2024-07-30 UniProcessor: A Text-induced Unified Low-level Image Processor Huiyu Duan et.al. 2407.20928 link
2024-07-30 SSPA: Split-and-Synthesize Prompting with Gated Alignments for Multi-Label Image Recognition Hao Tan et.al. 2407.20920 null
2024-07-30 Automated Review Generation Method Based on Large Language Models Shican Wu et.al. 2407.20906 link
2024-07-30 Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach Adam Wojciechowski et.al. 2407.20899 link
2024-07-30 ThinkRepair: Self-Directed Automated Program Repair Xin Yin et.al. 2407.20898 link
2024-07-30 Effective Black Box Testing of Sentiment Analysis Classification Networks Parsa Karbasizadeh et.al. 2407.20884 null
2024-07-30 Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification Boyang Zhang et.al. 2407.20859 null
2024-07-30 Learn by Selling: Equipping Large Language Models with Product Knowledge for Context-Driven Recommendations Sarthak Anand et.al. 2407.20856 null
2024-07-30 Large Language Model (LLM)-enabled Graphs in Dynamic Networking Geng Sun et.al. 2407.20840 null
2024-07-30 How to Measure the Intelligence of Large Language Models? Nils Körber et.al. 2407.20828 null
2024-07-30 Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning Norman Di Palo et.al. 2407.20798 null
2024-07-30 Interpretable Pre-Trained Transformers for Heart Time-Series Data Harry J. Davies et.al. 2407.20775 link
2024-07-30 OmniBal: Towards Fast Instruct-tuning for Vision-Language Models via Omniverse Computation Balance Yongqiang Yao et.al. 2407.20761 link
2024-07-29 Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing Ekaterina Iakovleva et.al. 2407.20232 null
2024-07-29 Improving 2D Feature Representations by 3D-Aware Fine-Tuning Yuanwen Yue et.al. 2407.20229 null
2024-07-29 FlexAttention for Efficient High-Resolution Vision-Language Models Junyan Li et.al. 2407.20228 null
2024-07-29 Can Editing LLMs Inject Harm? Canyu Chen et.al. 2407.20224 null
2024-07-29 SANGRIA: Surgical Video Scene Graph Optimization for Surgical Workflow Prediction Çağhan Köksal et.al. 2407.20214 null
2024-07-29 QAEA-DR: A Unified Text Augmentation Framework for Dense Retrieval Hongming Tan et.al. 2407.20207 null
2024-07-29 MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Zehui Chen et.al. 2407.20183 link
2024-07-29 Theia: Distilling Diverse Vision Foundation Models for Robot Learning Jinghuan Shang et.al. 2407.20179 link
2024-07-29 AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs Feiyang Kang et.al. 2407.20177 link
2024-07-29 Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning Xingchen Zeng et.al. 2407.20174 link
2024-07-29 Diffusion Feedback Helps CLIP See Better Wenxuan Wang et.al. 2407.20171 link
2024-07-29 Language-Conditioned Offline RL for Multi-Robot Navigation Steven Morad et.al. 2407.20164 null
2024-07-29 rLLM: Relational Table Learning with LLMs Weichen Li et.al. 2407.20157 link
2024-07-29 ByteCheckpoint: A Unified Checkpointing System for LLM Development Borui Wan et.al. 2407.20143 null
2024-07-29 Strong Copyright Protection for Language Models via Adaptive Model Fusion Javier Abad et.al. 2407.20105 null
2024-07-29 Orca: Ocean Significant Wave Height Estimation with Spatio-temporally Aware Large Language Models Zhe Li et.al. 2407.20053 null
2024-07-29 Exploring Large Language Models to generate Easy to Read content Paloma Martínez et.al. 2407.20046 null
2024-07-29 MaskInversion: Localized Embeddings via Optimization of Explainability Maps Walid Bousselham et.al. 2407.20034 null
2024-07-29 Efficient Training of Large Language Models on Distributed Infrastructures: A Survey Jiangfei Duan et.al. 2407.20018 null
2024-07-29 Rosetta Statements: Lowering the Barrier for Semantic Parsing and Increasing the Cognitive Interoperability of Knowledge Graphs Lars Vogt et.al. 2407.20007 null
2024-07-26 Wolf: Captioning Everything with a World Summarization Framework Boyi Li et.al. 2407.18908 null
2024-07-26 SHIC: Shape-Image Correspondences with no Keypoint Supervision Aleksandar Shtedritski et.al. 2407.18907 null
2024-07-26 A Flexible and Scalable Approach for Collecting Wildlife Advertisements on the Web Juliana Barbosa et.al. 2407.18898 link
2024-07-26 Small Molecule Optimization with Large Language Models Philipp Guevorguian et.al. 2407.18897 link
2024-07-26 Human-artificial intelligence teaming for scientific information extraction from data-driven additive manufacturing research using large language models Mutahar Safdar et.al. 2407.18827 null
2024-07-26 Automatic Detection of Moral Values in Music Lyrics Vjosa Preniqi et.al. 2407.18787 link
2024-07-26 The power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs Aleix Sant et.al. 2407.18786 null
2024-07-26 Foundation Models for the Digital Twin Creation of Cyber-Physical Systems Shaukat Ali et.al. 2407.18779 null
2024-07-26 TAGIFY: LLM-powered Tagging Interface for Improved Data Findability on OGD portals Kevin Kliimask et.al. 2407.18764 null
2024-07-26 Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery Yuni Susanti et.al. 2407.18752 link
2024-07-26 Towards Effective and Efficient Continual Pre-training of Large Language Models Jie Chen et.al. 2407.18743 null
2024-07-26 Towards Generalized Offensive Language Identification Alphaeus Dmonte et.al. 2407.18738 null
2024-07-26 LLASP: Fine-tuning Large Language Models for Answer Set Programming Erica Coppolillo et.al. 2407.18723 null
2024-07-26 Neurosymbolic AI for Enhancing Instructability in Generative AI Amit Sheth et.al. 2407.18722 null
2024-07-26 Cluster-norm for Unsupervised Probing of Knowledge Walter Laurito et.al. 2407.18712 link
2024-07-26 Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation Esteban Garces Arias et.al. 2407.18698 link
2024-07-26 Collaborative Evolving Strategy for Automatic Data-Centric Development Xu Yang et.al. 2407.18690 null
2024-07-26 The BIAS Detection Framework: Bias Detection in Word Embeddings and Language Models for European Languages Alexandre Puttick et.al. 2407.18689 link
2024-07-26 Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift Seongho Son et.al. 2407.18676 null
2024-07-26 Every Part Matters: Integrity Verification of Scientific Figures Based on Multimodal Large Language Models Xiang Shi et.al. 2407.18626 link
2024-07-25 Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Tianduo Wang et.al. 2407.18248 link
2024-07-25 LoRA-Pro: Are Low-Rank Adapters Properly Optimized? Zhengbo Wang et.al. 2407.18242 link
2024-07-25 Recursive Introspection: Teaching Language Model Agents How to Self-Improve Yuxiao Qu et.al. 2407.18219 null
2024-07-26 Exploring Scaling Trends in LLM Robustness Nikolaus Howe et.al. 2407.18213 null
2024-07-25 AsEP: Benchmarking Deep Learning Methods for Antibody-specific Epitope Prediction Chunan Liu et.al. 2407.18184 link
2024-07-25 Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning Sindhura Kommu et.al. 2407.18181 null
2024-07-25 Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models Sanae Lotfi et.al. 2407.18158 null
2024-07-25 $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs Vlad Sobal et.al. 2407.18134 null
2024-07-25 Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic Fakhraddin Alwajih et.al. 2407.18129 null
2024-07-25 Efficient Inference of Vision Instruction-Following Models with Elastic Cache Zuyan Liu et.al. 2407.18121 link
2024-07-25 Multi-Resolution Histopathology Patch Graphs for Ovarian Cancer Subtyping Jack Breen et.al. 2407.18105 link
2024-07-25 Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow Tian Guo et.al. 2407.18103 null
2024-07-25 PEFT-U: Parameter-Efficient Fine-Tuning for User Personalization Christopher Clarke et.al. 2407.18078 link
2024-07-25 C2P: Featuring Large Language Models with Causal Reasoning Abdolmahdi Bagheri et.al. 2407.18069 null
2024-07-25 ComPeer: A Generative Conversational Agent for Proactive Peer Support Tianjian Liu et.al. 2407.18064 link
2024-07-25 Audio Entailment: Assessing Deductive Reasoning for Audio Understanding Soham Deshmukh et.al. 2407.18062 link
2024-07-25 Difficulty Estimation and Simplification of French Text Using LLMs Henri Jamet et.al. 2407.18061 null
2024-07-25 The Geometry of Queries: Query-Based Innovations in Retrieval-Augmented Generation Eric Yang et.al. 2407.18044 null
2024-07-25 RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models Haoyu Chen et.al. 2407.18035 null
2024-07-25 GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy Jan Batzner et.al. 2407.18008 null
2024-07-24 I Could’ve Asked That: Reformulating Unanswerable Questions Wenting Zhao et.al. 2407.17469 link
2024-07-24 WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries Wenting Zhao et.al. 2407.17468 null
2024-07-24 CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models Jiawei Gu et.al. 2407.17467 null
2024-07-24 $VILA^2$: VILA Augmented VILA Yunhao Fang et.al. 2407.17453 null
2024-07-24 Fluent Student-Teacher Redteaming T. Ben Thompson et.al. 2407.17447 link
2024-07-24 Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? Michael-Andrei Panaitescu-Liess et.al. 2407.17417 null
2024-07-24 (PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork Tianjin Huang et.al. 2407.17412 null
2024-07-24 Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models Yida Zhao et.al. 2407.17406 link
2024-07-24 Grammar-based Game Description Generation using Large Language Models Tsunehiko Tanaka et.al. 2407.17404 null
2024-07-24 3D Question Answering for City Scene Understanding Penglei Sun et.al. 2407.17398 null
2024-07-24 PERSONA: A Reproducible Testbed for Pluralistic Alignment Louis Castricato et.al. 2407.17387 null
2024-07-24 A Comprehensive Approach to Misspelling Correction with BERT and Levenshtein Distance Amirreza Naziri et.al. 2407.17383 null
2024-07-24 MMRA: A Benchmark for Multi-granularity Multi-image Relational Association Siwei Wu et.al. 2407.17379 link
2024-07-24 ViPer: Visual Personalization of Generative Models via Individual Preference Learning Sogand Salehi et.al. 2407.17365 null
2024-07-24 Gradient-based inference of abstract task representations for generalization in neural networks Ali Hummos et.al. 2407.17356 null
2024-07-24 Scalify: scale propagation for efficient low-precision LLM training Paul Balança et.al. 2407.17353 link
2024-07-24 Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching Yuyang Ding et.al. 2407.17349 link
2024-07-24 DexGANGrasp: Dexterous Generative Adversarial Grasping Synthesis for Task-Oriented Manipulation Qian Feng et.al. 2407.17348 null
2024-07-24 Label Alignment and Reassignment with Generalist Large Language Model for Enhanced Cross-Domain Named Entity Recognition Ke Bao et.al. 2407.17344 null
2024-07-24 How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations? Leo Yu-Ho Lo et.al. 2407.17291 null
2024-07-23 PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects Junyi Li et.al. 2407.16696 link
2024-07-23 Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack Xiaoyue Xu et.al. 2407.16695 link
2024-07-23 Can Large Language Models Automatically Jailbreak GPT-4V? Yuanwei Wu et.al. 2407.16686 null
2024-07-23 SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation Pengfei Chen et.al. 2407.16682 null
2024-07-23 RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent Huiyu Xu et.al. 2407.16667 null
2024-07-23 Course-Correction: Safety Alignment Using Synthetic Preferences Rongwu Xu et.al. 2407.16637 link
2024-07-23 Lawma: The Power of Specialization for Legal Tasks Ricardo Dominguez-Olmedo et.al. 2407.16615 null
2024-07-23 Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data? Jonathan Hayase et.al. 2407.16607 link
2024-07-23 Shared Imagination: LLMs Hallucinate Alike Yilun Zhou et.al. 2407.16604 null
2024-07-23 A Comparative Study on Patient Language across Therapeutic Domains for Effective Patient Voice Classification in Online Health Discussions Giorgos Lysandrou et.al. 2407.16593 null
2024-07-23 Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs Yifan Xia et.al. 2407.16576 null
2024-07-23 TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback Eunseop Yoon et.al. 2407.16574 null
2024-07-23 Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models Ioana Buhnila et.al. 2407.16565 link
2024-07-23 Patched RTC: evaluating LLMs for diverse software development tasks Asankhaya Sharma et.al. 2407.16557 link
2024-07-24 MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues Liyun Zhang et.al. 2407.16552 null
2024-07-23 Quantifying the Role of Textual Predictability in Automatic Speech Recognition Sean Robertson et.al. 2407.16537 null
2024-07-23 Imperfect Vision Encoders: Efficient and Robust Tuning for Vision-Language Models Aristeidis Panos et.al. 2407.16526 null
2024-07-23 AMONGAGENTS: Evaluating Large Language Models in the Interactive Text-Based Social Deduction Game Yizhou Chi et.al. 2407.16521 null
2024-07-23 Language-Based Security for Low-Level MPC Christian Skalka et.al. 2407.16504 null
2024-07-23 Machine Translation Hallucination Detection for Low and High Resource Languages using Large Language Models Kenza Benkirane et.al. 2407.16470 link
2024-07-22 AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description Junyu Xie et.al. 2407.15850 link
2024-07-22 LLMmap: Fingerprinting For Large Language Models Dario Pasquini et.al. 2407.15847 link
2024-07-22 SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models Mingze Xu et.al. 2407.15841 link
2024-07-22 MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity Yangzhou Liu et.al. 2407.15838 link
2024-07-22 dMel: Speech Tokenization made Simple He Bai et.al. 2407.15835 null
2024-07-22 J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling Wataru Nakata et.al. 2407.15828 null
2024-07-22 Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight Ziyuan Huang et.al. 2407.15819 null
2024-07-22 Perceptions of Linguistic Uncertainty by Language Models and Humans Catarina G Belem et.al. 2407.15814 link
2024-07-22 AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection Yunkang Cao et.al. 2407.15795 link
2024-07-22 CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning Emanuele Frascaroli et.al. 2407.15793 link
2024-07-22 Extracting Structured Insights from Financial News: An Augmented LLM Driven Approach Rian Dolphin et.al. 2407.15788 null
2024-07-22 Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels Zhuorui Ye et.al. 2407.15786 null
2024-07-22 Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning Kaiwen Wang et.al. 2407.15762 null
2024-07-22 MoRSE: Bridging the Gap in Cybersecurity Expertise with Retrieval Augmented Generation Marco Simoni et.al. 2407.15748 null
2024-07-22 OMoS-QA: A Dataset for Cross-Lingual Extractive Question Answering in a German Migration Context Steffen Kleinle et.al. 2407.15736 null
2024-07-22 TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON John Chong Min Tan et.al. 2407.15734 link
2024-07-22 Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders Laura Niss et.al. 2407.15731 null
2024-07-22 SAM2CLIP2SAM: Vision Language Model for Segmentation of 3D CT Scans for Covid-19 Detection Dimitrios Kollias et.al. 2407.15728 null
2024-07-22 DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design Zhi Hao Luo et.al. 2407.15723 link
2024-07-22 Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability Zhuoyan Xu et.al. 2407.15720 link
2024-07-19 Internal Consistency and Self-Feedback in Large Language Models: A Survey Xun Liang et.al. 2407.14507 link
2024-07-19 On Pre-training of Multimodal Language Models Customized for Chart Understanding Wan-Cyuan Fan et.al. 2407.14506 null
2024-07-19 PD-TPE: Parallel Decoder with Text-guided Position Encoding for 3D Visual Grounding Chenshu Hou et.al. 2407.14491 null
2024-07-19 Evaluating the Reliability of Self-Explanations in Large Language Models Korbinian Randl et.al. 2407.14487 link
2024-07-19 Data-Centric Human Preference Optimization with Rationales Hoang Anh Just et.al. 2407.14477 link
2024-07-19 Contrastive Learning with Counterfactual Explanations for Radiology Report Generation Mingjie Li et.al. 2407.14474 null
2024-07-19 Check-Eval: A Checklist-based Approach for Evaluating Text Quality Jayr Pereira et.al. 2407.14467 null
2024-07-19 Undermining Mental Proof: How AI Can Make Cooperation Harder by Making Thinking Easier Zachary Wojtowicz et.al. 2407.14452 null
2024-07-19 Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding Renshan Zhang et.al. 2407.14439 link
2024-07-19 Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders Senthooran Rajamanoharan et.al. 2407.14435 null
2024-07-19 Mixture of Experts with Mixture of Precisions for Tuning Quality of Service HamidReza Imani et.al. 2407.14417 null
2024-07-19 System-1.x: Learning to Balance Fast and Slow Planning with Language Models Swarnadeep Saha et.al. 2407.14414 link
2024-07-19 DEAL: Disentangle and Localize Concept-level Explanations for VLMs Tang Li et.al. 2407.14412 link
2024-07-19 The Vision of Autonomic Computing: Can LLMs Make It a Reality? Zhiyang Zhang et.al. 2407.14402 null
2024-07-19 Frontiers of Deep Learning: From Novel Application to Real-World Deployment Rui Xie et.al. 2407.14386 null
2024-07-19 Open Artificial Knowledge Vadim Borisov et.al. 2407.14371 null
2024-07-19 Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models Xuenan Xu et.al. 2407.14355 link
2024-07-19 Improving Retrieval in Sponsored Search by Leveraging Query Context Signals Akash Kumar Mohankumar et.al. 2407.14346 null
2024-07-19 LLMs left, right, and center: Assessing GPT’s capabilities to label political bias from web domains Raphael Hernandes et.al. 2407.14344 null
2024-07-19 Multimodal Misinformation Detection using Large Vision-Language Models Sahar Tahmasebi et.al. 2407.14321 null
2024-07-18 Latent Causal Probing: A Formal Perspective on Probing with Causal Models of Data Charles Jin et.al. 2407.13765 null
2024-07-18 SegPoint: Segment Any Point Cloud via Large Language Model Shuting He et.al. 2407.13761 null
2024-07-18 Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models Zhuo Chen et.al. 2407.13757 null
2024-07-18 CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications Mirza Masfiqur Rahman et.al. 2407.13742 null
2024-07-18 Baba Is AI: Break the Rules to Beat the Benchmark Nathan Cloos et.al. 2407.13729 null
2024-07-18 CoDefeater: Using LLMs To Find Defeaters in Assurance Cases Usman Gohar et.al. 2407.13717 link
2024-07-18 Understanding Reference Policies in Direct Preference Optimization Yixin Liu et.al. 2407.13709 link
2024-07-18 A Comprehensive Review of Recommender Systems: Transitioning from Theory to Practice Shaina Raza et.al. 2407.13699 null
2024-07-18 Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation Yotam Perlitz et.al. 2407.13696 link
2024-07-18 Prover-Verifier Games improve legibility of LLM outputs Jan Hendrik Kirchner et.al. 2407.13692 null
2024-07-18 Shaded Route Planning Using Active Segmentation and Identification of Satellite Images Longchao Da et.al. 2407.13689 null
2024-07-18 FuLG: 150B Romanian Corpus for Language Model Pretraining Vlad-Andrei Bădoiu et.al. 2407.13657 null
2024-07-18 COMCAT: Leveraging Human Judgment to Improve Automatic Documentation and Summarization Skyler Grandel et.al. 2407.13648 null
2024-07-18 Weak-to-Strong Reasoning Yuqing Yang et.al. 2407.13647 link
2024-07-18 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies Chaofan Tao et.al. 2407.13623 link
2024-07-18 KNOWNET: Guided Health Information Seeking from LLMs via Knowledge Graph Integration Youfu Yan et.al. 2407.13598 null
2024-07-18 PLANTS: A Novel Problem and Dataset for Summarization of Planning-Like (PL) Tasks Vishal Pallagani et.al. 2407.13597 null
2024-07-18 EarthMarker: A Visual Prompt Learning Framework for Region-level and Point-level Remote Sensing Imagery Comprehension Wei Zhang et.al. 2407.13596 link
2024-07-18 Robust Calibration of Large Vision-Language Adapters Balamurali Murugesan et.al. 2407.13588 link
2024-07-18 Towards Zero-Shot Multimodal Machine Translation Matthieu Futeral et.al. 2407.13579 link
2024-07-17 LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Kaichen Zhang et.al. 2407.12772 link
2024-07-17 EchoSight: Advancing Visual-Language Models with Wiki Knowledge Yibin Yan et.al. 2407.12735 null
2024-07-17 NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model Zhongqun Zhang et.al. 2407.12727 null
2024-07-17 Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models? Ben Yao et.al. 2407.12725 null
2024-07-17 The Future of Learning: Large Language Models through the Lens of Students He Zhang et.al. 2407.12723 null
2024-07-17 MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models Leyang Shen et.al. 2407.12709 link
2024-07-17 Subgraph-Aware Training of Text-based Methods for Knowledge Graph Completion Youmin Ko et.al. 2407.12703 null
2024-07-17 Patch-Level Training for Large Language Models Chenze Shao et.al. 2407.12665 link
2024-07-17 Zero-shot Text-guided Infinite Image Synthesis with LLM guidance Soyeong Kwon et.al. 2407.12642 null
2024-07-17 Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification? Aman Sinha et.al. 2407.12626 null
2024-07-17 Harnessing the Power of Artificial Intelligence to Vitalize Endangered Indigenous Languages: Technologies and Experiences Claudio Pinhanez et.al. 2407.12620 null
2024-07-17 AudienceView: AI-Assisted Interpretation of Audience Feedback in Journalism William Brannon et.al. 2407.12613 link
2024-07-17 VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding Ofir Abramovich et.al. 2407.12594 null
2024-07-18 Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks Antoni Kowalczuk et.al. 2407.12588 link
2024-07-17 E5-V: Universal Embeddings with Multimodal Large Language Models Ting Jiang et.al. 2407.12580 link
2024-07-17 Audio Conditioning for Music Generation via Discrete Bottleneck Features Simon Rouard et.al. 2407.12563 null
2024-07-17 Conspiracy theories and where to find them on TikTok Francesco Corso et.al. 2407.12545 null
2024-07-17 Abstraction Alignment: Comparing Model and Human Conceptual Relationships Angie Boggust et.al. 2407.12543 link
2024-07-17 Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models Xihe Qiu et.al. 2407.12532 null
2024-07-17 Crafting the Path: Robust Query Rewriting for Information Retrieval Ingeol Baek et.al. 2407.12529 null
2024-07-16 UrbanWorld: An Urban World Model for 3D City Generation Yu Shang et.al. 2407.11965 link
2024-07-16 NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? Mo Li et.al. 2407.11963 link
2024-07-16 Code Documentation and Analysis to Secure Software Development Paul Attie et.al. 2407.11934 null
2024-07-16 What’s Wrong? Refining Meeting Summaries with LLM Feedback Frederic Kirstein et.al. 2407.11919 null
2024-07-16 GraphFM: A Scalable Framework for Multi-Graph Pretraining Divyansha Lachi et.al. 2407.11907 null
2024-07-16 Ascend-CC: Confidential Computing on Heterogeneous NPU for Emerging Generative AI Workloads Aritra Dhar et.al. 2407.11888 null
2024-07-16 Zero-shot Cross-Lingual Transfer for Synthetic Data Generation in Grammatical Error Detection Gaetan Lopez Latouche et.al. 2407.11854 null
2024-07-16 Schema Matching with Large Language Models: an Experimental Study Marcel Parciak et.al. 2407.11852 link
2024-07-16 LoFTI: Localization and Factuality Transfer to Indian Locales Sona Elza Simon et.al. 2407.11833 link
2024-07-16 GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text Kyle Hamilton et.al. 2407.11827 null
2024-07-16 PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation Branden Butler et.al. 2407.11798 null
2024-07-16 Large Language Models as Misleading Assistants in Conversation Betty Li Hou et.al. 2407.11789 null
2024-07-16 SwitchCIT: Switching for Continual Instruction Tuning of Large Language Models Xinbo Wu et.al. 2407.11780 null
2024-07-16 Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text Seyedeh Fatemeh Ebrahimi et.al. 2407.11774 null
2024-07-16 Educational Personalized Learning Path Planning with Large Language Models Chee Ng et.al. 2407.11773 null
2024-07-16 XEdgeAI: A Human-centered Industrial Inspection Framework with Data-centric Explainable Edge AI Approach Truong Thanh Hung Nguyen et.al. 2407.11771 link
2024-07-16 Robust Utility-Preserving Text Anonymization Based on Large Language Models Tianyu Yang et.al. 2407.11770 link
2024-07-16 Vectoring Languages Joseph Chen et.al. 2407.11766 null
2024-07-16 Exploring Quantization for Efficient Pre-Training of Transformer Language Models Kamran Chitsaz et.al. 2407.11722 link
2024-07-16 Harnessing Large Language Models for Multimodal Product Bundling Xiaohao Liu et.al. 2407.11712 null
2024-07-15 VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation Bocheng Zou et.al. 2407.10972 link
2024-07-15 Q-Sparse: All Large Language Models can be Fully Sparsely-Activated Hongyu Wang et.al. 2407.10969 null
2024-07-15 Fast Matrix Multiplications for Lookup Table-Quantized LLMs Han Guo et.al. 2407.10960 link
2024-07-15 Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? Ruisheng Cao et.al. 2407.10956 link
2024-07-15 MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models Chengguang Gan et.al. 2407.10953 null
2024-07-15 Can Textual Semantics Mitigate Sounding Object Segmentation Preference? Yaoting Wang et.al. 2407.10947 link
2024-07-15 Learning from Naturally Occurring Feedback Shachar Don-Yehiya et.al. 2407.10944 link
2024-07-15 GRUtopia: Dream General Robots in a City at Scale Hanqing Wang et.al. 2407.10943 link
2024-07-15 Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together Dilara Soylu et.al. 2407.10930 null
2024-07-15 Benchmarking Vision Language Models for Cultural Understanding Shravan Nayak et.al. 2407.10920 null
2024-07-15 FinDKG: Dynamic Knowledge Graphs with Large Language Models for Detecting Global Trends in Financial Markets Xiaohui Victor Li et.al. 2407.10909 link
2024-07-15 Hey, That’s My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique Mark Russinovich et.al. 2407.10887 null
2024-07-15 SLIP: Securing LLMs IP Using Weights Decomposition Yehonathan Refael et.al. 2407.10886 null
2024-07-15 Understanding the Importance of Evolutionary Search in Automated Heuristic Design with Large Language Models Rui Zhang et.al. 2407.10873 null
2024-07-15 GPT Sonograpy: Hand Gesture Decoding from Forearm Ultrasound Images via VLM Keshav Bimbraw et.al. 2407.10870 null
2024-07-15 Physics-Inspired Generative Models in Medical Imaging: A Review Dennis Hein et.al. 2407.10856 null
2024-07-15 Weighted Grouped Query Attention in Transformers Sai Sena Chinnakonduru et.al. 2407.10855 null
2024-07-15 An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases Dylan Bouchard et.al. 2407.10853 null
2024-07-15 MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs Quang H. Nguyen et.al. 2407.10834 null
2024-07-15 BiasScanner: Automatic Detection and Classification of News Bias to Strengthen Democracy Tim Menzner et.al. 2407.10829 null
2024-07-12 FairyLandAI: Personalized Fairy Tales utilizing ChatGPT and DALLE-3 Georgios Makridis et.al. 2407.09467 null
2024-07-12 Human-like Episodic Memory for Infinite Context LLMs Zafeirios Fountas et.al. 2407.09450 link
2024-07-12 ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts Amelia F. Hardy et.al. 2407.09447 link
2024-07-12 MUSCLE: A Model Update Strategy for Compatible LLM Evolution Jessica Echterhoff et.al. 2407.09435 null
2024-07-12 A Perspective on Foundation Models for the Electric Power Grid Hendrik F. Hamann et.al. 2407.09434 null
2024-07-12 Open (Clinical) LLMs are Sensitive to Instruction Phrasings Alberto Mario Ceballos Arroyo et.al. 2407.09429 link
2024-07-12 TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models Hang Zou et.al. 2407.09424 null
2024-07-12 Mitigating Entity-Level Hallucination in Large Language Models Weihang Su et.al. 2407.09417 link
2024-07-12 SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers Shraman Pramanick et.al. 2407.09413 link
2024-07-12 Deep Bag-of-Words Model: An Efficient and Interpretable Relevance Architecture for Chinese E-Commerce Zhe Lin et.al. 2407.09395 null
2024-07-12 PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents Saber Zerhoudi et.al. 2407.09394 link
2024-07-12 GAVEL: Generating Games Via Evolution and Language Models Graham Todd et.al. 2407.09388 link
2024-07-12 Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text Lucio La Cava et.al. 2407.09364 null
2024-07-12 Good Intentions, Risky Inventions: A Method for Assessing the Risks and Benefits of AI in Mobile and Wearable Uses Marios Constantinides et.al. 2407.09322 link
2024-07-12 Scalability of Bayesian Network Structure Elicitation with Large Language Models: a Novel Methodology and Comparative Analysis Nikolay Babakov et.al. 2407.09311 null
2024-07-12 Transformer Layers as Painters Qi Sun et.al. 2407.09298 link
2024-07-12 Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study Yulong Yang et.al. 2407.09295 null
2024-07-12 CEIPA: Counterfactual Explainable Incremental Prompt Attack Analysis on Large Language Models Dong Shu et.al. 2407.09292 null
2024-07-12 Structuring Authenticity Assessments on Historical Documents using LLMs Andrea Schimmenti et.al. 2407.09290 null
2024-07-12 WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation Robin Schön et.al. 2407.09288 link
2024-07-11 MAVIS: Mathematical Visual Instruction Tuning Renrui Zhang et.al. 2407.08739 link
2024-07-11 Real-Time Anomaly Detection and Reactive Planning with Large Language Models Rohan Sinha et.al. 2407.08735 null
2024-07-11 Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist Zihao Zhou et.al. 2407.08733 null
2024-07-11 A Taxonomy for Data Contamination in Large Language Models Medha Palavalli et.al. 2407.08716 null
2024-07-11 GTA: A Benchmark for General Tool Agents Jize Wang et.al. 2407.08713 link
2024-07-11 eyeballvul: a future-proof benchmark for vulnerability detection in the wild Timothee Chauvin et.al. 2407.08708 link
2024-07-11 Extracting Training Data from Document-Based VQA Models Francesco Pinto et.al. 2407.08707 null
2024-07-11 HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models Runhui Huang et.al. 2407.08706 null
2024-07-11 Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models Zhening Xing et.al. 2407.08701 null
2024-07-11 Mitigating Catastrophic Forgetting in Language Transfer via Model Merging Anton Alexandrov et.al. 2407.08699 null
2024-07-11 Cloud Atlas: Efficient Fault Localization for Cloud Systems using Language Models and Causal Insight Zhiqiang Xie et.al. 2407.08694 null
2024-07-11 Robotic Control via Embodied Chain-of-Thought Reasoning Zawalski Michał et.al. 2407.08693 null
2024-07-11 SEED-Story: Multimodal Long Story Generation with Large Language Model Shuai Yang et.al. 2407.08683 link
2024-07-11 NODE-Adapter: Neural Ordinary Differential Equations for Better Vision-Language Reasoning Yi Zhang et.al. 2407.08672 null
2024-07-11 Uncertainty Estimation of Large Language Models in Medical Question Answering Jiaxin Wu et.al. 2407.08662 null
2024-07-11 Towards Building Specialized Generalist AI with System 1 and System 2 Fusion Kaiyan Zhang et.al. 2407.08642 null
2024-07-11 $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ Junkang Wu et.al. 2407.08639 link
2024-07-11 RoboMorph: Evolving Robot Morphology using Large Language Models Kevin Qiu et.al. 2407.08626 null
2024-07-11 Tamil Language Computing: the Present and the Future Kengatharaiyer Sarveswaran et.al. 2407.08618 null
2024-07-11 FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision Jay Shah et.al. 2407.08608 link
2024-07-10 Training on the Test Task Confounds Evaluation and Emergence Ricardo Dominguez-Olmedo et.al. 2407.07890 link
2024-07-10 Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization Junkang Wu et.al. 2407.07880 link
2024-07-11 Toto: Time Series Optimized Transformer for Observability Ben Cohen et.al. 2407.07874 null
2024-07-10 FACTS About Building Retrieval Augmented Generation-based Chatbots Rama Akkiraju et.al. 2407.07858 null
2024-07-10 OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training Sami Jaghouar et.al. 2407.07852 link
2024-07-10 Natural Language Mechanisms via Self-Resolution with Foundation Models Nicolas Della Penna et.al. 2407.07845 null
2024-07-10 Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective Shengjia Chen et.al. 2407.07841 link
2024-07-10 Decompose and Compare Consistency: Measuring VLMs’ Answer Reliability via Task-Decomposition Consistency Comparison Qian Yang et.al. 2407.07840 null
2024-07-10 Transformer Alignment in Large Language Models Murdock Aubry et.al. 2407.07810 null
2024-07-11 AVCap: Leveraging Audio-Visual Features as Text Tokens for Captioning Jongsuk Kim et.al. 2407.07801 link
2024-07-10 Attribute or Abstain: Large Language Models as Long Document Assistants Jan Buchmann et.al. 2407.07799 link
2024-07-11 Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard Oguzhan Topsakal et.al. 2407.07796 link
2024-07-10 Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities Tianjie Ju et.al. 2407.07791 link
2024-07-10 WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment Jiefu Ou et.al. 2407.07778 null
2024-07-10 Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs Hao-Tien Lewis Chiang et.al. 2407.07775 null
2024-07-10 Can ChatGPT Pass a Theory of Computing Course? Matei A. Golesteanu et.al. 2407.07757 null
2024-07-10 Fine-Tuning Large Language Models with User-Level Differential Privacy Zachary Charles et.al. 2407.07737 null
2024-07-10 PaliGemma: A versatile 3B VLM for transfer Lucas Beyer et.al. 2407.07726 link
2024-07-10 Why should we ever automate moral decision making? Vincent Conitzer et.al. 2407.07671 null
2024-07-10 A Proposed S.C.O.R.E. Evaluation Framework for Large Language Models: Safety, Consensus, Objectivity, Reproducibility and Explainability Ting Fang Tan et.al. 2407.07666 null
2024-07-09 AnyTaskTune: Advanced Domain-Specific Solutions through Task-Fine-Tuning Jiaxi Cui et.al. 2407.07094 link
2024-07-09 FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation Liqun Ma et.al. 2407.07093 link
2024-07-09 CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation Tong Chen et.al. 2407.07087 link
2024-07-09 Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models Logan Cross et.al. 2407.07086 link
2024-07-09 Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities Shaltiel Shmidman et.al. 2407.07080 null
2024-07-09 Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps Yung-Sung Chuang et.al. 2407.07071 link
2024-07-09 Prompting Techniques for Secure Code Generation: A Systematic Investigation Catherine Tony et.al. 2407.07064 null
2024-07-09 Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence Weize Chen et.al. 2407.07061 link
2024-07-09 Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model Wenqi Zhang et.al. 2407.07053 link
2024-07-09 ProtoSAM – One Shot Medical Image Segmentation With Foundational Models Lev Ayzenberg et.al. 2407.07042 link
2024-07-09 Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models Yue Zhang et.al. 2407.07035 link
2024-07-09 Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization Jeongseok Hyun et.al. 2407.07024 link
2024-07-09 Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies Inwon Kang et.al. 2407.07019 null
2024-07-09 End-To-End Causal Effect Estimation from Unstructured Natural Language Data Nikita Dhawan et.al. 2407.07018 null
2024-07-09 Is Large Language Model All You Need to Predict the Synthesizability and Precursors of Crystal Structures? Zhilong Song et.al. 2407.07016 null
2024-07-09 Induction Heads as an Essential Mechanism for Pattern Matching in In-context Learning J. Crosbie et.al. 2407.07011 null
2024-07-09 Metron: Holistic Performance Evaluation Framework for LLM Inference Systems Amey Agrawal et.al. 2407.07000 link
2024-07-09 Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective Yu-An Liu et.al. 2407.06992 link
2024-07-09 Segment-Based Interactive Machine Translation for Pre-trained Models Angel Navarro et.al. 2407.06990 null
2024-07-09 Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models Yi-Cheng Lin et.al. 2407.06957 link
2024-07-08 Multi-Object Hallucination in Vision-Language Models Xuweiyi Chen et.al. 2407.06192 link
2024-07-08 4D Contrastive Superflows are Dense 3D Representation Learners Xiang Xu et.al. 2407.06190 link
2024-07-08 Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision Orr Zohar et.al. 2407.06189 link
2024-07-08 CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation Xinying Guo et.al. 2407.06188 null
2024-07-08 JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation Yu Zeng et.al. 2407.06187 null
2024-07-08 Vision-Language Models under Cultural and Inclusive Considerations Antonia Karamolegkou et.al. 2407.06177 null
2024-07-08 On Speeding Up Language Model Evaluation Jin Peng Zhou et.al. 2407.06172 null
2024-07-08 What’s Wrong with Your Code Generated by Large Language Models? An Extensive Study Shihan Dou et.al. 2407.06153 null
2024-07-09 Using Grammar Masking to Ensure Syntactic Validity in LLM-based Modeling Tasks Lukas Netz et.al. 2407.06146 null
2024-07-08 ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation Ethan Chern et.al. 2407.06135 link
2024-07-08 Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization Hannah K. Bako et.al. 2407.06129 link
2024-07-08 Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities Avinash Anand et.al. 2407.06125 null
2024-07-08 Enhancing Language Model Rationality with Bi-Directional Deliberation Reasoning Yadong Zhang et.al. 2407.06112 null
2024-07-08 Artificial Intuition: Efficient Classification of Scientific Abstracts Harsh Sakhrani et.al. 2407.06093 null
2024-07-08 Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models Jinliang Lu et.al. 2407.06089 null
2024-07-08 From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty Maor Ivgi et.al. 2407.06071 link
2024-07-08 Variational Best-of-N Alignment Afra Amini et.al. 2407.06057 null
2024-07-08 MST5 – Multilingual Question Answering over Knowledge Graphs Nikit Srivastava et.al. 2407.06041 link
2024-07-08 PAS: Data-Efficient Plug-and-Play Prompt Augmentation System Miao Zheng et.al. 2407.06027 null
2024-07-08 iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement Aoyu Pang et.al. 2407.06025 link
2024-07-05 Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs Rudolf Laine et.al. 2407.04694 link
2024-07-05 ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models Yuzhe Gu et.al. 2407.04693 link
2024-07-05 Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge Yuanze Lin et.al. 2407.04681 null
2024-07-05 Lost in Translation: The Algorithmic Gap Between LMs and the Brain Tommaso Tosato et.al. 2407.04680 null
2024-07-05 Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition Ye Bai et.al. 2407.04675 null
2024-07-05 Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement Yongji Wu et.al. 2407.04656 null
2024-07-05 Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models Bolaji Yusuf et.al. 2407.04641 null
2024-07-05 Entity Decomposition with Filtering: A Zero-Shot Clinical Named Entity Recognition Framework Reza Averly et.al. 2407.04629 null
2024-07-05 On scalable oversight with weak LLMs judging strong LLMs Zachary Kenton et.al. 2407.04622 null
2024-07-05 CountGD: Multi-Modal Open-World Counting Niki Amini-Naieni et.al. 2407.04619 null
2024-07-05 ARM: Efficient Guided Decoding with Autoregressive Reward Models Sergey Troshin et.al. 2407.04615 null
2024-07-05 AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation Yuhan Zhu et.al. 2407.04603 link
2024-07-05 Written Term Detection Improves Spoken Term Detection Bolaji Yusuf et.al. 2407.04601 link
2024-07-05 Testing learning hypotheses using neural networks by manipulating learning data Cara Su-Yi Leong et.al. 2407.04593 null
2024-07-05 Leveraging Large Language Models for Integrated Satellite-Aerial-Terrestrial Networks: Recent Advances and Future Directions Shumaila Javaid et.al. 2407.04581 null
2024-07-05 VRSD: Rethinking Similarity and Diversity for Retrieval in Large Language Models Hang Gao et.al. 2407.04573 null
2024-07-05 Not (yet) the whole story: Evaluating Visual Storytelling Requires More than Measuring Coherence, Grounding, and Repetition Aditya K Surikuchi et.al. 2407.04559 link
2024-07-05 Spontaneous Reward Hacking in Iterative Self-Refinement Jane Pan et.al. 2407.04549 null
2024-07-05 PoPreRo: A New Dataset for Popularity Prediction of Romanian Reddit Posts Ana-Cristina Rogoz et.al. 2407.04541 link
2024-07-05 GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning Aleksander Ficek et.al. 2407.04528 null
2024-07-03 Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages Max Zuo et.al. 2407.03321 link
2024-07-03 InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Pan Zhang et.al. 2407.03320 link
2024-07-03 BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations Zhantao Yang et.al. 2407.03314 null
2024-07-03 Universal Length Generalization with Turing Programs Kaiying Hou et.al. 2407.03310 null
2024-07-03 Large Language Models for JSON Schema Discovery Michael J. Mior et.al. 2407.03286 null
2024-07-03 LLM Internal States Reveal Hallucination Risk Faced With a Query Ziwei Ji et.al. 2407.03282 link
2024-07-03 STF: Sentence Transformer Fine-Tuning For Topic Categorization With Limited Data Kheir Eddine Daouadi et.al. 2407.03253 null
2024-07-03 Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning Zhili Shen et.al. 2407.03227 null
2024-07-03 How Does Quantization Affect Multilingual LLMs? Kelly Marchisio et.al. 2407.03211 null
2024-07-03 TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts Ruida Wang et.al. 2407.03203 link
2024-07-03 Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models Haritz Puerto et.al. 2407.03181 link
2024-07-03 Investigating Decoder-only Large Language Models for Speech-to-text Translation Chao-Wei Huang et.al. 2407.03169 null
2024-07-03 SOS! Soft Prompt Attack Against Open-Source Large Language Models Ziqing Yang et.al. 2407.03160 null
2024-07-03 Let the Code LLM Edit Itself When You Edit the Code Zhenyu He et.al. 2407.03157 null
2024-07-03 Reinforcement Learning for Sequence Design Leveraging Protein Language Models Jithendaraa Subramanian et.al. 2407.03154 null
2024-07-03 Enhancing Translation Accuracy of Large Language Models through Continual Pre-Training on Parallel Data Minato Kondo et.al. 2407.03145 null
2024-07-03 Social Bias Evaluation for Large Language Models Requires Prompt Variations Rem Hida et.al. 2407.03129 link
2024-07-03 KeyVideoLLM: Towards Large-scale Video Keyframe Selection Hao Liang et.al. 2407.03104 null
2024-07-03 Cactus: Towards Psychological Counseling Conversations using Cognitive Behavioral Theory Suyeon Lee et.al. 2407.03103 link
2024-07-03 ScreenTK: Seamless Detection of Time-Killing Moments Using Continuous Mobile Screen Text Monitoring Le Fang et.al. 2407.03063 null
2024-07-02 MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention Huiqiang Jiang et.al. 2407.02490 link
2024-07-02 Neurocache: Efficient Vector Retrieval for Long-range Language Modeling Ali Safaya et.al. 2407.02486 link
2024-07-02 RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs Yue Yu et.al. 2407.02485 null
2024-07-02 MMedAgent: Learning to Use Medical Tools with Multi-modal Agent Binxu Li et.al. 2407.02483 link
2024-07-02 Understanding Alignment in Multimodal LLMs: A Comprehensive Study Elmira Amirloo et.al. 2407.02477 null
2024-07-02 Open Scene Graphs for Open World Object-Goal Navigation Joel Loo et.al. 2407.02473 null
2024-07-02 ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions Chan Young Park et.al. 2407.02472 link
2024-07-02 Reliable Confidence Intervals for Information Retrieval Evaluation Using Generative A.I. Harrie Oosterhuis et.al. 2407.02464 null
2024-07-02 Ensemble of pre-trained language models and data augmentation for hate speech detection from Arabic tweets Kheir Eddine Daouadi et.al. 2407.02448 null
2024-07-03 Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs Jinmin Li et.al. 2407.02411 null
2024-07-02 CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models Song Wang et.al. 2407.02408 null
2024-07-02 Assessing the Code Clone Detection Capability of Large Language Models Zixian Zhang et.al. 2407.02402 null
2024-07-02 Learning to Refine with Fine-Grained Natural Language Feedback Manya Wadhwa et.al. 2407.02397 link
2024-07-02 Is Your AI-Generated Code Really Secure? Evaluating Large Language Models on Secure Code Generation with CodeSecEval Jiexin Wang et.al. 2407.02395 null
2024-07-02 TokenPacker: Efficient Visual Projector for Multimodal LLM Wentong Li et.al. 2407.02392 link
2024-07-02 Talking to Machines: do you read me? Lina M. Rojas-Barahona et.al. 2407.02354 null
2024-07-02 Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification Pritish Sahu et.al. 2407.02352 null
2024-07-02 Generative Large Language Models in Automated Fact-Checking: A Survey Ivan Vykopal et.al. 2407.02351 null
2024-07-02 Conceptual Codebook Learning for Vision-Language Models Yi Zhang et.al. 2407.02350 null
2024-07-02 MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space Yihong Tang et.al. 2407.02345 null
2024-06-28 Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs Sukmin Yun et.al. 2406.20098 link
2024-06-28 LLaRA: Supercharging Robot Learning Data for Vision-Language Policy Xiang Li et.al. 2406.20095 link
2024-06-28 Scaling Synthetic Data Creation with 1,000,000,000 Personas Xin Chan et.al. 2406.20094 link
2024-06-28 LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression Jieneng Chen et.al. 2406.20092 link
2024-06-28 ProgressGym: Alignment with a Millennium of Moral Progress Tianyi Qiu et.al. 2406.20087 link
2024-06-28 Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language Yicheng Chen et.al. 2406.20085 null
2024-06-28 Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification Anisha Gunjal et.al. 2406.20079 link
2024-06-28 EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model Yuxuan Zhang et.al. 2406.20076 link
2024-06-28 To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models Bastien Liétard et.al. 2406.20054 null
2024-06-28 Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation Danny Halawi et.al. 2406.20053 null
2024-07-01 BMW Agents – A Framework For Task Automation Through Multi-Agent Collaboration Noel Crawford et.al. 2406.20041 null
2024-06-28 BioMNER: A Dataset for Biomedical Method Entity Recognition Chen Tang et.al. 2406.20038 null
2024-06-28 LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models Renzhi Wang et.al. 2406.20030 null
2024-06-28 ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models Yuxiang Zhang et.al. 2406.20015 link
2024-06-28 The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models Xinyi Chen et.al. 2406.19999 link
2024-06-28 Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model Habib Hajimolahoseini et.al. 2406.19995 null
2024-06-28 ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting Rui Pan et.al. 2406.19976 null
2024-06-28 STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Guohao Sun et.al. 2406.19973 link
2024-06-28 Into the Unknown: Generating Geospatial Descriptions for New Environments Tzuf Paz-Argaman et.al. 2406.19967 null
2024-06-28 Simulating Financial Market via Large Language Model based Agents Shen Gao et.al. 2406.19966 null
2024-06-27 ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos Jr-Jen Chen et.al. 2406.19392 link
2024-06-27 The Remarkable Robustness of LLMs: Stages of Inference? Vedang Lad et.al. 2406.19384 link
2024-06-27 The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in the Era of Large Language Models Xiliang Zhu et.al. 2406.19358 null
2024-06-27 DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions Nigel Fernandez et.al. 2406.19356 link
2024-06-27 Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs? Peter Hase et.al. 2406.19354 null
2024-06-27 IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language Lucky Susanto et.al. 2406.19349 null
2024-06-27 Jump Starting Bandits with LLM-Generated Prior Knowledge Parand A. Alamdari et.al. 2406.19317 link
2024-06-27 MCNC: Manifold Constrained Network Compression Chayne Thrash et.al. 2406.19301 null
2024-06-27 From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data Zheyang Xiong et.al. 2406.19292 link
2024-06-27 PhysioLLM: Supporting Personalized Health Insights with Wearables and Large Language Models Cathy Mengying Fang et.al. 2406.19283 null
2024-06-27 HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale Junying Chen et.al. 2406.19280 link
2024-06-27 VERISCORE: Evaluating the factuality of verifiable claims in long-form text generation Yixiao Song et.al. 2406.19276 link
2024-06-27 AutoPureData: Automated Filtering of Web Data for LLM Fine-tuning Praneeth Vadlapati et.al. 2406.19271 link
2024-06-27 Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding Yue Fan et.al. 2406.19263 link
2024-06-27 Enhancing Video-Language Representations with Structural Spatio-Temporal Alignment Hao Fei et.al. 2406.19255 null
2024-06-27 AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation Jia Fu et.al. 2406.19251 null
2024-06-27 Revealing Fine-Grained Values and Opinions in Large Language Models Dustin Wright et.al. 2406.19238 link
2024-06-28 FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts Shubhankar Singh et.al. 2406.19237 null
2024-06-27 Seeing Is Believing: Black-Box Membership Inference Attacks Against Retrieval Augmented Generation Yuying Li et.al. 2406.19234 null
2024-06-28 RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs Ekaterina Taktasheva et.al. 2406.19232 link
2024-06-26 Towards Compositionality in Concept Learning Adam Stein et.al. 2406.18534 link
2024-06-26 Symbolic Learning Enables Self-Evolving Agents Wangchunshu Zhou et.al. 2406.18532 link
2024-06-26 PrExMe! Large Scale Prompt Exploration of Open Source LLMs for Machine Translation and Summarization Evaluation Christoph Leiter et.al. 2406.18528 link
2024-06-26 CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs Zirui Wang et.al. 2406.18521 link
2024-06-26 “Is ChatGPT a Better Explainer than My Professor?”: Evaluating the Explanation Capabilities of LLMs in Conversation Compared to a Human Baseline Grace Li et.al. 2406.18512 null
2024-06-26 WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models Liwei Jiang et.al. 2406.18510 link
2024-06-26 Mental Modeling of Reinforcement Learning Agents by Language Models Wenhao Lu et.al. 2406.18505 null
2024-06-26 Is In-Context Learning a Type of Gradient-Based Learning? Evidence from the Inverse Frequency Effect in Structural Priming Zhenghao Zhou et.al. 2406.18501 null
2024-06-26 Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation Ahmed Njifenjou et.al. 2406.18460 null
2024-06-26 Cascading Large Language Models for Salient Event Graph Generation Xingwei Tan et.al. 2406.18449 link
2024-06-26 New intelligent empowerment for digital transformation Peng Yifeng et.al. 2406.18440 null
2024-06-26 IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons Dan Shi et.al. 2406.18406 link
2024-06-26 Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers Yibo Jiang et.al. 2406.18400 null
2024-06-26 Adversarial Search Engine Optimization for Large Language Models Fredrik Nestaas et.al. 2406.18382 null
2024-06-26 MALSIGHT: Exploring Malicious Source Code and Benign Pseudocode for Iterative Binary Malware Summarization Haolang Lu et.al. 2406.18379 null
2024-06-26 Themis: Towards Flexible and Interpretable NLG Evaluation Xinyu Hu et.al. 2406.18365 link
2024-06-26 AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations Adam Dahlgren Lindström et.al. 2406.18346 null
2024-06-26 PDFA Distillation via String Probability Queries Robert Baumgartner et.al. 2406.18328 link
2024-06-26 PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models Huixuan Zhang et.al. 2406.18326 null
2024-06-26 MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data Meng Fang et.al. 2406.18321 null
2024-06-25 MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning Xiangyu Zhao et.al. 2406.17770 link
2024-06-25 EXTRACT: Efficient Policy Learning by Extracting Transferrable Robot Skills from Offline Data Jesse Zhang et.al. 2406.17768 null
2024-06-25 BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning Ercong Nie et.al. 2406.17764 null
2024-06-25 CaLMQA: Exploring culturally specific long-form question answering across 23 languages Shane Arora et.al. 2406.17761 link
2024-06-25 Accelerating Clinical Evidence Synthesis with Large Language Models Zifeng Wang et.al. 2406.17755 null
2024-06-25 Measuring and Benchmarking Large Language Models’ Capabilities to Generate Persuasive Language Amalie Brogaard Pauli et.al. 2406.17753 null
2024-06-25 Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon USVSN Sai Prashanth et.al. 2406.17746 link
2024-06-25 Point-SAM: Promptable 3D Segmentation Model for Point Clouds Yuchen Zhou et.al. 2406.17741 link
2024-06-25 Find Parent then Label Children: A Two-stage Taxonomy Completion Method with Pre-trained Language Model Fei Xia et.al. 2406.17739 null
2024-06-25 LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users Elinor Poole-Dayan et.al. 2406.17737 null
2024-06-25 FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model Feijie Wu et.al. 2406.17706 link
2024-06-25 From Distributional to Overton Pluralism: Investigating Large Language Model Alignment Thom Lake et.al. 2406.17692 link
2024-06-25 VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation Kun Qian et.al. 2406.17681 link
2024-06-25 Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models Yuan Li et.al. 2406.17675 null
2024-06-25 LaTable: Towards Large Tabular Models Boris van Breugel et.al. 2406.17673 null
2024-06-25 LLM-ARC: Enhancing LLMs with an Automated Reasoning Critic Aditya Kalyanpur et.al. 2406.17663 null
2024-06-25 Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients Aashiq Muhamed et.al. 2406.17660 link
2024-06-25 DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning Xiaohan Zhang et.al. 2406.17659 null
2024-06-25 Leveraging Large Language Models for Software Model Completion: Results from Industrial and Public Datasets Christof Tinnes et.al. 2406.17651 link
2024-06-25 Variationist: Exploring Multifaceted Variation and Bias in Written Language Data Alan Ramponi et.al. 2406.17647 link
2024-06-24 Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs Shengbang Tong et.al. 2406.16860 link
2024-06-24 EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees Yuhui Li et.al. 2406.16858 link
2024-06-24 Long Context Transfer from Language to Vision Peiyuan Zhang et.al. 2406.16852 link
2024-06-24 Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts Aditya Sharma et.al. 2406.16851 null
2024-06-24 RaTEScore: A Metric for Radiology Report Generation Weike Zhao et.al. 2406.16845 link
2024-06-24 From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models Sean Welleck et.al. 2406.16838 null
2024-06-24 USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations Mounika Marreddy et.al. 2406.16833 null
2024-06-24 Understanding and Mitigating Tokenization Bias in Language Models Buu Phan et.al. 2406.16829 null
2024-06-24 Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track Ronak Pradeep et.al. 2406.16828 link
2024-06-24 GPT-4V Explorations: Mining Autonomous Driving Zixuan Li et.al. 2406.16817 null
2024-06-24 RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale Beck LaBash et.al. 2406.16801 link
2024-06-24 Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs Ashwinee Panda et.al. 2406.16797 link
2024-06-24 Adam-mini: Use Fewer Learning Rates To Gain More Yushun Zhang et.al. 2406.16793 link
2024-06-24 M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models Rishabh Maheshwary et.al. 2406.16783 null
2024-06-24 It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension Sagi Shaier et.al. 2406.16779 null
2024-06-24 Finding Transformer Circuits with Edge Pruning Adithya Bhaskar et.al. 2406.16778 link
2024-06-24 Blending LLMs into Cascaded Speech Translation: KIT’s Offline Speech Translation System for IWSLT 2024 Sai Koneru et.al. 2406.16777 null
2024-06-24 WARP: On the Benefits of Weight Averaged Rewarded Policies Alexandre Ramé et.al. 2406.16768 null
2024-06-24 The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories Xi Yu Huang et.al. 2406.16767 link
2024-06-24 Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters Euiin Yi et.al. 2406.16758 link
2024-06-21 GenoTEX: A Benchmark for Evaluating LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians Haoyang Liu et.al. 2406.15341 link
2024-06-21 Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance Haoling Li et.al. 2406.15330 null
2024-06-21 Bug In the Code Stack: Can LLMs Find Bugs in Large Python Code Stacks Hokyung Lee et.al. 2406.15325 link
2024-06-21 Cognitive Map for Language Models: Optimal Planning via Verbally Representing the World Model Doyoung Kim et.al. 2406.15275 link
2024-06-21 Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics Weijia Zhang et.al. 2406.15264 null
2024-06-21 Unsupervised Morphological Tree Tokenizer Qingyang Zhu et.al. 2406.15245 null
2024-06-21 Large Batch Analysis for Adagrad Under Anisotropic Smoothness Yuxing Liu et.al. 2406.15244 null
2024-06-21 Detecting Synthetic Lyrics with Few-Shot Inference Yanis Labrak et.al. 2406.15231 null
2024-06-21 A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation Irune Zubiaga et.al. 2406.15227 link
2024-06-21 Unsupervised Extraction of Dialogue Policies from Conversations Makesh Narsimhan Sreedhar et.al. 2406.15214 null
2024-06-21 Prompting Whisper for QA-driven Zero-shot End-to-end Spoken Language Understanding Mohan Li et.al. 2406.15209 null
2024-06-21 Exploring the Efficacy of Robotic Assistants with ChatGPT and Claude in Enhancing ADHD Therapy: Innovating Treatment Paradigms Santiago Berrezueta-Guzman et.al. 2406.15198 null
2024-06-21 UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-world Document Analysis Yulong Hui et.al. 2406.15187 link
2024-06-21 Hybrid Alignment Training for Large Language Models Chenglong Wang et.al. 2406.15178 link
2024-06-21 EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot Hao Fei et.al. 2406.15177 link
2024-06-21 Enhancing Idiomatic Representation in Multiple Languages via an Adaptive Contrastive Triplet Loss Wei He et.al. 2406.15175 null
2024-06-21 Évaluation des capacités de réponse de larges modèles de langage (LLM) pour des questions d’historiens Mathieu Chartier et.al. 2406.15173 null
2024-06-21 Assessing Good, Bad and Ugly Arguments Generated by ChatGPT: a New Dataset, its Methodology and Associated Tasks Victor Hugo Nascimento Rocha et.al. 2406.15130 link
2024-06-21 Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network Badr AlKhamissi et.al. 2406.15109 link
2024-06-21 PARIKSHA: A Large-Scale Investigation of Human-LLM Evaluator Agreement on Multilingual and Multi-Cultural Data Ishaan Watts et.al. 2406.15053 null
2024-06-20 Model Merging and Safety Alignment: One Bad Model Spoils the Bunch Hasan Abed Al Kader Hammoud et.al. 2406.14563 null
2024-06-20 Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities Sachit Menon et.al. 2406.14562 null
2024-06-20 How to Compute the Probability of a Word Tiago Pimentel et.al. 2406.14561 link
2024-06-21 Asynchronous Large Language Model Enhanced Planner for Autonomous Driving Yuan Chen et.al. 2406.14556 link
2024-06-20 GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models Shilong Li et.al. 2406.14550 null
2024-06-20 Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Large Language Models Sunny Duan et.al. 2406.14549 null
2024-06-20 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data Johannes Treutlein et.al. 2406.14546 link
2024-06-20 Unmasking Database Vulnerabilities: Zero-Knowledge Schema Inference Attacks in Text-to-SQL Systems Đorđe Klisura et.al. 2406.14545 null
2024-06-20 Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs Yuxuan Qiao et.al. 2406.14544 link
2024-06-20 Are LLMs Naturally Good at Synthetic Tabular Data Generation? Shengzhe Xu et.al. 2406.14541 link
2024-06-20 PostMark: A Robust Blackbox Watermark for Large Language Models Yapei Chang et.al. 2406.14517 link
2024-06-20 MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding Xinyu Fang et.al. 2406.14515 link
2024-06-20 Evidence of a log scaling law for political persuasion with large language models Kobi Hackenburg et.al. 2406.14508 link
2024-06-20 Overview of the CAIL 2023 Argument Mining Track Jingcong Liang et.al. 2406.14503 null
2024-06-20 Improving Expert Radiology Report Summarization by Prompting Large Language Models with a Layperson Summary Xingmeng Zhao et.al. 2406.14500 null
2024-06-20 LLaSA: Large Multimodal Agent for Human Activity Analysis Through Wearable Sensors Sheikh Asif Imran et.al. 2406.14498 link
2024-06-20 CodeRAG-Bench: Can Retrieval Augment Code Generation? Zora Zhiruo Wang et.al. 2406.14497 link
2024-06-20 African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification Gregor Geigle et.al. 2406.14496 link
2024-06-20 Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? Gregor Geigle et.al. 2406.14492 null
2024-06-20 Instruction Pre-Training: Language Models are Supervised Multitask Learners Daixuan Cheng et.al. 2406.14491 link
2024-06-18 DrVideo: Document Retrieval Based Long Video Understanding Ziyu Ma et.al. 2406.12846 null
2024-06-18 Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts Haoxiang Wang et.al. 2406.12845 link
2024-06-18 Synergizing Foundation Models and Federated Learning: A Survey Shenghui Li et.al. 2406.12844 null
2024-06-18 GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation Ci-Siang Lin et.al. 2406.12834 null
2024-06-18 LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation Seyedarmin Azizi et.al. 2406.12832 link
2024-06-18 What Are the Odds? Language Models Are Capable of Probabilistic Reasoning Akshay Paruchuri et.al. 2406.12830 link
2024-06-18 From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries Hitesh Wadhwa et.al. 2406.12824 null
2024-06-18 Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models? Pinzhen Chen et.al. 2406.12822 null
2024-06-18 Adversarial Attacks on Multimodal Agents Chen Henry Wu et.al. 2406.12814 link
2024-06-18 Can Large Language Models Always Solve Easy Problems if They Can Solve Harder Ones? Zhe Yang et.al. 2406.12809 link
2024-06-18 Identifying Performance-Sensitive Configurations in Software Systems through Code Analysis with LLM Agents Zehao Wang et.al. 2406.12806 null
2024-06-18 Supporting Human Raters with the Detection of Harmful Content using Large Language Models Kurt Thomas et.al. 2406.12800 null
2024-06-18 ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools Team GLM et.al. 2406.12793 link
2024-06-18 In-Context Learning of Energy Functions Rylan Schaeffer et.al. 2406.12785 null
2024-06-18 UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions Xunzhi Wang et.al. 2406.12784 link
2024-06-18 Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries Eden Biran et.al. 2406.12775 link
2024-06-18 Towards Exact Gradient-based Training on Analog In-memory Computing Zhaoxian Wu et.al. 2406.12774 null
2024-06-18 GFM4MPM: Towards Geospatial Foundation Models for Mineral Prospectivity Mapping Angel Daruna et.al. 2406.12756 null
2024-06-18 OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI Zhen Huang et.al. 2406.12753 link
2024-06-18 Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning Bingchen Zhao et.al. 2406.12742 link
2024-06-17 LLaNA: Large Language and NeRF Assistant Andrea Amaduzzi et.al. 2406.11840 null
2024-06-17 mDPO: Conditional Preference Optimization for Multimodal Large Language Models Fei Wang et.al. 2406.11839 null
2024-06-17 MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs Ziyu Liu et.al. 2406.11833 link
2024-06-17 Unveiling Encoder-Free Vision-Language Models Haiwen Diao et.al. 2406.11832 link
2024-06-17 Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models Bingqi Ma et.al. 2406.11831 null
2024-06-17 Language Modeling with Editable External Knowledge Belinda Z. Li et.al. 2406.11830 link
2024-06-17 WPO: Enhancing RLHF with Weighted Preference Optimization Wenxuan Zhou et.al. 2406.11827 link
2024-06-17 On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning Geewook Kim et.al. 2406.11823 link
2024-06-17 MegaScenes: Scene-Level View Synthesis at Scale Joseph Tung et.al. 2406.11819 link
2024-06-17 Embodied Instruction Following in Unknown Environments Zhenyu Wu et.al. 2406.11818 null
2024-06-17 Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level Jie Liu et.al. 2406.11817 null
2024-06-17 VideoLLM-online: Online Video Large Language Model for Streaming Video Joya Chen et.al. 2406.11816 null
2024-06-17 How Do Large Language Models Acquire Factual Knowledge During Pretraining? Hoyeon Chang et.al. 2406.11813 link
2024-06-17 RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content Joao Monteiro et.al. 2406.11811 link
2024-06-17 Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations Rima Hazra et.al. 2406.11801 link
2024-06-17 DataComp-LM: In search of the next generation of training sets for language models Jeffrey Li et.al. 2406.11794 null
2024-06-17 CELL your Model: Contrastive Explanation Methods for Large Language Models Ronny Luss et.al. 2406.11785 null
2024-06-17 Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs Swanand Ravindra Kadhe et.al. 2406.11780 null
2024-06-17 Improving Multi-Agent Debate with Sparse Communication Topology Yunxuan Li et.al. 2406.11776 null
2024-06-17 Task Me Anything Jieyu Zhang et.al. 2406.11775 link
2024-06-14 Quantifying Variance in Evaluation Benchmarks Lovish Madaan et.al. 2406.10229 null
2024-06-14 EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models Julian Straub et.al. 2406.10224 link
2024-06-14 Short Film Dataset (SFD): A Benchmark for Story-Level Video Understanding Ridouane Ghermi et.al. 2406.10221 link
2024-06-14 Semantic Membership Inference Attack against Large Language Models Hamid Mozaffari et.al. 2406.10218 null
2024-06-14 Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs Rui Yang et.al. 2406.10216 link
2024-06-14 DevBench: A multimodal developmental benchmark for language learning Alvin Wei Ming Tan et.al. 2406.10215 link
2024-06-14 Be like a Goldfish, Don’t Memorize! Mitigating Memorization in Generative LLMs Abhimanyu Hans et.al. 2406.10209 link
2024-06-14 A Fundamental Trade-off in Aligned Language Models and its Relation to Sampling Adaptors Naaman Tan et.al. 2406.10203 link
2024-06-14 TRIP-PAL: Travel Planning with Guarantees by Combining Large Language Models and Automated Planners Tomas de la Rosa et.al. 2406.10196 null
2024-06-14 Detecting and Evaluating Medical Hallucinations in Large Vision Language Models Jiawei Chen et.al. 2406.10185 null
2024-06-14 Practical offloading for fine-tuning LLM on commodity GPU via learned subspace projectors Siyuan Chen et.al. 2406.10181 null
2024-06-14 Let the Poem Hit the Rhythm: Using a Byte-Based Transformer for Beat-Aligned Poetry Generation Mohamad Elzohbi et.al. 2406.10174 link
2024-06-14 IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce Wenxuan Ding et.al. 2406.10173 link
2024-06-14 Datasets for Multilingual Answer Sentence Selection Matteo Gabburo et.al. 2406.10172 null
2024-06-14 CarLLaVA: Vision language models for camera-only closed-loop driving Katrin Renz et.al. 2406.10165 null
2024-06-14 Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models Carson Denison et.al. 2406.10162 link
2024-06-14 RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model Hantao Zhou et.al. 2406.10157 null
2024-06-14 BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack Yuri Kuratov et.al. 2406.10149 link
2024-06-14 Evaluation of Large Language Models: STEM education and Gender Stereotypes Smilla Due et.al. 2406.10133 null
2024-06-14 The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models Yan Liu et.al. 2406.10130 link
2024-06-13 VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding Muhammad Maaz et.al. 2406.09418 link
2024-06-13 Explore the Limits of Omni-modal Pretraining at Scale Yiyuan Zhang et.al. 2406.09412 link
2024-06-13 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities Roman Bachmann et.al. 2406.09406 null
2024-06-13 Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models Yushi Hu et.al. 2406.09403 null
2024-06-13 OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation Junke Wang et.al. 2406.09399 link
2024-06-13 Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms Miaosen Zhang et.al. 2406.09397 null
2024-06-13 Too Many Frames, not all Useful: Efficient Strategies for Long-Form Video QA Jongwoo Park et.al. 2406.09396 link
2024-06-13 Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition Youngtaek Oh et.al. 2406.09388 link
2024-06-13 Towards Vision-Language Geo-Foundation Model: A Survey Yue Zhou et.al. 2406.09385 link
2024-06-13 Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models Lukas Thede et.al. 2406.09384 null
2024-06-13 Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs Zijia Zhao et.al. 2406.09367 link
2024-06-13 ElicitationGPT: Text Elicitation Mechanisms via Language Models Yifan Wu et.al. 2406.09363 null
2024-06-13 Enhancing Domain Adaptation through Prompt Gradient Alignment Hoang Phan et.al. 2406.09353 link
2024-06-13 Separations in the Representational Capabilities of Transformers and Recurrent Architectures Satwik Bhattamishra et.al. 2406.09347 null
2024-06-13 DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding Suwon Shon et.al. 2406.09345 null
2024-06-13 ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models David Anugraha et.al. 2406.09334 link
2024-06-13 REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space Tomer Ashuach et.al. 2406.09325 null
2024-06-13 Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs Zhao Xu et.al. 2406.09324 link
2024-06-13 JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models Delong Ran et.al. 2406.09321 link
2024-06-13 Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases Meng Wang et.al. 2406.09317 link
2024-06-12 What If We Recaption Billions of Web Images with LLaMA-3? Xianhang Li et.al. 2406.08478 null
2024-06-12 Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens Ting-Ji Huang et.al. 2406.08477 null
2024-06-12 Real2Code: Reconstruct Articulated Objects via Code Generation Zhao Mandi et.al. 2406.08474 null
2024-06-12 PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences Daiwei Chen et.al. 2406.08469 null
2024-06-12 Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Zhangchen Xu et.al. 2406.08464 link
2024-06-12 AToM-Bot: Embodied Fulfillment of Unspoken Human Needs with Affective Theory of Mind Wei Ding et.al. 2406.08455 null
2024-06-12 OLMES: A Standard for Language Model Evaluations Yuling Gu et.al. 2406.08446 null
2024-06-12 SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models Chun Yin et.al. 2406.08445 null
2024-06-12 TasTe: Teaching Large Language Models to Translate through Self-Reflection Yutong Wang et.al. 2406.08434 link
2024-06-12 Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL Zijin Hong et.al. 2406.08426 null
2024-06-12 OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text Qingyun Li et.al. 2406.08418 link
2024-06-12 Discovering Preference Optimization Algorithms with and for Large Language Models Chris Lu et.al. 2406.08414 link
2024-06-12 Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference Christopher Wolters et.al. 2406.08413 null
2024-06-13 MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos Xuehai He et.al. 2406.08407 link
2024-06-12 Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models Chun-Yi Kuan et.al. 2406.08402 link
2024-06-12 cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers Anirudh Sundar et.al. 2406.08398 null
2024-06-12 VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks Jiannan Wu et.al. 2406.08394 link
2024-06-12 Large Language Models Must Be Taught to Know What They Don’t Know Sanyam Kapoor et.al. 2406.08391 link
2024-06-12 Banal Deception Human-AI Ecosystems: A Study of People’s Perceptions of LLM-generated Deceptive Behaviour Xiao Zhan et.al. 2406.08386 null
2024-06-13 APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation Weizhao He et.al. 2406.08372 null
2024-06-11 A3VLM: Actionable Articulation-Aware Vision Language Model Siyuan Huang et.al. 2406.07549 link
2024-06-11 Image and Video Tokenization with Binary Spherical Quantization Yue Zhao et.al. 2406.07548 link
2024-06-11 Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena Aidar Myrzakhan et.al. 2406.07545 link
2024-06-11 QuickLLaMA: Query-aware Inference Acceleration for Large Language Models Jingyao Li et.al. 2406.07528 link
2024-06-11 Simple and Effective Masked Diffusion Language Models Subham Sekhar Sahoo et.al. 2406.07524 link
2024-06-11 Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling Liliang Ren et.al. 2406.07522 link
2024-06-11 Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement Yunzhen Feng et.al. 2406.07515 null
2024-06-11 THaLLE: Text Hyperlocally Augmented Large Language Extension – Technical Report KBTG Labs et.al. 2406.07505 null
2024-06-11 Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions Renjie Pi et.al. 2406.07502 link
2024-06-11 TextGrad: Automatic “Differentiation” via Text Mert Yuksekgonul et.al. 2406.07496 link
2024-06-11 CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization Frederic Kirstein et.al. 2406.07494 null
2024-06-11 Paraphrasing in Affirmative Terms Improves Negation Understanding MohammadHossein Rezaei et.al. 2406.07492 null
2024-06-11 PITCH: Productivity and Mental Well-being Coaching through Daily Conversational Interaction Adnan Abbas et.al. 2406.07485 null
2024-06-11 Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing Mao Li et.al. 2406.07483 null
2024-06-11 VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Zesen Cheng et.al. 2406.07476 link
2024-06-11 Anomaly Detection on Unstable Logs with GPT Models Fatemeh Hadadi et.al. 2406.07467 null
2024-06-11 Estimating the Hallucination Rate of Generative AI Andrew Jesson et.al. 2406.07457 null
2024-06-11 Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis Qining Zhang et.al. 2406.07455 null
2024-06-11 On the Robustness of Document-Level Relation Extraction Models to Entity Name Variations Shiao Meng et.al. 2406.07444 link
2024-06-11 McEval: Massively Multilingual Code Evaluation Linzheng Chai et.al. 2406.07436 null
2024-06-10 Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Peize Sun et.al. 2406.06525 link
2024-06-10 UMBRELA: UMbrela is the (Open-Source Reproduction of the) Bing RELevance Assessor Shivani Upadhyay et.al. 2406.06519 link
2024-06-10 Merlin: A Vision Language Foundation Model for 3D Computed Tomography Louis Blankemeier et.al. 2406.06512 null
2024-06-10 NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative Asmar Nadeem et.al. 2406.06499 null
2024-06-10 Direct Preference Optimization for Suppressing Hallucinated Prior Exams in Radiology Report Generation Oishi Banerjee et.al. 2406.06496 null
2024-06-10 Can Language Models Serve as Text-Based World Simulators? Ruoyao Wang et.al. 2406.06485 null
2024-06-10 Parallelizing Linear Transformers with the Delta Rule over Sequence Length Songlin Yang et.al. 2406.06484 link
2024-06-10 Towards a Personal Health Large Language Model Justin Cosentino et.al. 2406.06474 null
2024-06-10 AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction Zhen Xing et.al. 2406.06465 null
2024-06-10 Transforming Wearable Data into Health Insights using Large Language Model Agents Mike A. Merrill et.al. 2406.06464 null
2024-06-10 VCR: Visual Caption Restoration Tianyu Zhang et.al. 2406.06462 link
2024-06-11 Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies Junlin Wang et.al. 2406.06461 null
2024-06-10 Evaluating the Retrieval Component in LLM-Based Question Answering Systems Ashkan Alinejad et.al. 2406.06458 null
2024-06-10 A Large Language Model Pipeline for Breast Cancer Oncology Tristen Pool et.al. 2406.06455 null
2024-06-10 Insights from Social Shaping Theory: The Appropriation of Large Language Models in an Undergraduate Programming Course Aadarsh Padiyath et.al. 2406.06451 null
2024-06-10 LLM Dataset Inference: Did you train on my dataset? Pratyush Maini et.al. 2406.06443 link
2024-06-10 Interpretability of Language Models via Task Spaces Lucas Weber et.al. 2406.06441 null
2024-06-10 Language Models are Alignable Decision-Makers: Dataset and Application to the Medical Triage Domain Brian Hu et.al. 2406.06435 link
2024-06-10 Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking Gabriel Rioux et.al. 2406.06425 null
2024-06-10 An Empirical Design Justice Approach to Identifying Ethical Considerations in the Intersection of Large Language Models and Social Robotics Alva Markelius et.al. 2406.06400 null
2024-06-07 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs Jianing Yang et.al. 2406.05132 link
2024-06-07 An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models Xiongtao Zhou et.al. 2406.05130 link
2024-06-07 Towards Semantic Equivalence of Tokenization in Multimodal LLM Shengqiong Wu et.al. 2406.05127 null
2024-06-07 Large Generative Graph Models Yu Wang et.al. 2406.05109 null
2024-06-07 LINX: A Language Driven Generative System for Goal-Oriented Automated Data Exploration Tavor Lipman et.al. 2406.05107 null
2024-06-07 Corpus Poisoning via Approximate Greedy Gradient Descent Jinyan Su et.al. 2406.05087 link
2024-06-07 Multi-Head RAG: Solving Multi-Aspect Problems with LLMs Maciej Besta et.al. 2406.05085 link
2024-06-07 SUMIE: A Synthetic Benchmark for Incremental Entity Summarization Eunjeong Hwang et.al. 2406.05079 null
2024-06-07 Are Large Language Models More Empathetic than Humans? Anuradha Welivita et.al. 2406.05063 null
2024-06-07 Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions Shi-Yu Tian et.al. 2406.05055 null
2024-06-07 Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation Nachiket Kotalwar et.al. 2406.05053 null
2024-06-07 Bootstrapping Referring Multi-Object Tracking Yani Zhang et.al. 2406.05039 link
2024-06-07 Scenarios and Approaches for Situated Natural Language Explanations Pengshuo Qiu et.al. 2406.05035 null
2024-06-07 CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search Fengran Mo et.al. 2406.05013 link
2024-06-07 Compositional Generalization with Grounded Language Models Sondre Wold et.al. 2406.04989 link
2024-06-07 Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences Patrick Haller et.al. 2406.04988 link
2024-06-07 MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter Jitai Hao et.al. 2406.04984 link
2024-06-07 CityCraft: A Real Crafter for 3D City Generation Jie Deng et.al. 2406.04983 null
2024-06-07 Quantifying Geospatial in the Common Crawl Corpus Ilya Ilyankou et.al. 2406.04952 null
2024-06-07 BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense Baktash Ansari et.al. 2406.04947 link
2024-06-06 Verbalized Machine Learning: Revisiting Machine Learning with Language Models Tim Z. Xiao et.al. 2406.04344 null
2024-06-06 Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image Stanislaw Szymanowicz et.al. 2406.04343 link
2024-06-06 Learning 1D Causal Visual Representation with De-focus Attention Networks Chenxin Tao et.al. 2406.04342 link
2024-06-06 RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation Jiaming Liu et.al. 2406.04339 null
2024-06-06 Coherent Zero-Shot Visual Instruction Generation Quynh Phung et.al. 2406.04337 null
2024-06-06 DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs Lingchen Meng et.al. 2406.04334 null
2024-06-06 PaCE: Parsimonious Concept Engineering for Large Language Models Jinqi Luo et.al. 2406.04331 link
2024-06-06 Parameter-Inverted Image Pyramid Networks Xizhou Zhu et.al. 2406.04330 link
2024-06-06 Simplified and Generalized Masked Diffusion for Discrete Data Jiaxin Shi et.al. 2406.04329 null
2024-06-06 Causal Estimation of Memorisation Profiles Pietro Lesci et.al. 2406.04327 link
2024-06-06 ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Lin Chen et.al. 2406.04325 null
2024-06-06 Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step Zhanhao Liang et.al. 2406.04314 link
2024-06-06 Improving Alignment and Robustness with Short Circuiting Andy Zou et.al. 2406.04313 link
2024-06-06 Semantically Diverse Language Generation for Uncertainty Estimation in Language Models Lukas Aichberger et.al. 2406.04306 link
2024-06-06 Quixer: A Quantum Transformer Model Nikhil Khatri et.al. 2406.04305 null
2024-06-06 Text-to-Drive: Diverse Driving Behavior Synthesis via Large Language Models Phat Nguyen et.al. 2406.04300 null
2024-06-06 VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval Junjie Zhou et.al. 2406.04292 link
2024-06-06 Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation Adam Fisch et.al. 2406.04291 null
2024-06-07 What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages Nadav Borenstein et.al. 2406.04289 null
2024-06-06 Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People Dun-Ming Huang et.al. 2406.04278 link
2024-06-05 Wings: Learning Multimodal LLMs without Text-only Forgetting Yi-Kai Zhang et.al. 2406.03496 null
2024-06-06 Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training Ao Sun et.al. 2406.03488 link
2024-06-05 Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends Sanjana Ramprasad et.al. 2406.03487 null
2024-06-05 BIPED: Pedagogically Informed Tutoring System for ESL Education Soonwoo Kwon et.al. 2406.03486 null
2024-06-05 Does your data spark joy? Performance gains from domain upsampling at the end of training Cody Blakeney et.al. 2406.03476 null
2024-06-05 AD-H: Autonomous Driving with Hierarchical Agents Zaibin Zhang et.al. 2406.03474 null
2024-06-05 What is the Best Way for ChatGPT to Translate Poetry? Shanshan Wang et.al. 2406.03450 null
2024-06-05 Pre-trained Large Language Models Use Fourier Features to Compute Addition Tianyi Zhou et.al. 2406.03445 null
2024-06-05 Are language models rational? The case of coherence norms and belief revision Thomas Hofweber et.al. 2406.03442 null
2024-06-05 Cycles of Thought: Measuring LLM Confidence through Stable Explanations Evan Becker et.al. 2406.03441 null
2024-06-05 Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis Moein Heidari et.al. 2406.03430 link
2024-06-05 Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach Saehyung Lee et.al. 2406.03411 link
2024-06-05 Automating Turkish Educational Quiz Generation Using Large Language Models Kamyar Zeinalipour et.al. 2406.03397 link
2024-06-05 Log Parsing with Self-Generated In-Context Learning and Self-Correction Yifan Wu et.al. 2406.03376 null
2024-06-05 IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models David Ifeoluwa Adelani et.al. 2406.03368 null
2024-06-05 CLMASP: Coupling Large Language Models with Answer Set Programming for Robotic Task Planning Xinrui Lin et.al. 2406.03367 null
2024-06-05 LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback Timon Ziegenbein et.al. 2406.03363 null
2024-06-05 Save It for the “Hot” Day: An LLM-Empowered Visual Analytics System for Heat Risk Management Haobo Li et.al. 2406.03317 null
2024-06-05 The Good, the Bad, and the Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games Mikhail Mozikov et.al. 2406.03299 null
2024-06-05 SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms Xingrun Xing et.al. 2406.03287 link
2024-06-04 Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks Tianyu He et.al. 2406.02550 link
2024-06-04 Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation Mohamed El Amine Boudjoghra et.al. 2406.02548 link
2024-06-04 Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning Alex Jinpeng Wang et.al. 2406.02547 link
2024-06-04 To Believe or Not to Believe Your LLM Yasin Abbasi Yadkori et.al. 2406.02543 null
2024-06-04 Loki: Low-Rank Keys for Efficient Sparse Attention Prajwal Singhania et.al. 2406.02542 link
2024-06-04 Parrot: Multilingual Visual Instruction Tuning Hai-Long Sun et.al. 2406.02539 link
2024-06-04 TopViewRS: Vision-Language Models as Top-View Spatial Reasoners Chengzu Li et.al. 2406.02537 link
2024-06-04 Mitigate Position Bias in Large Language Models via Scaling a Single Dimension Yijiong Yu et.al. 2406.02536 link
2024-06-04 SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices Ruslan Svirschevski et.al. 2406.02532 link
2024-06-04 Scalable MatMul-free Language Modeling Rui-Jie Zhu et.al. 2406.02528 link
2024-06-04 CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks Maciej Besta et.al. 2406.02524 link
2024-06-04 RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots Soroush Nasiriany et.al. 2406.02523 null
2024-06-04 Demystifying the Compression of Mixture-of-Experts Through a Unified Framework Shwai He et.al. 2406.02500 link
2024-06-04 Hiding Text in Large Language Models: Introducing Unconditional Token Forcing Confusion Jakub Hoscilowicz et.al. 2406.02481 link
2024-06-04 Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding Zhihan Zhang et.al. 2406.02472 link
2024-06-04 Meta-Designing Quantum Experiments with Language Models Sören Arlt et.al. 2406.02470 null
2024-06-04 Seed-TTS: A Family of High-Quality Versatile Speech Generation Models Philip Anastassiou et.al. 2406.02430 link
2024-06-04 Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion Ruiqi Li et.al. 2406.02429 null
2024-06-04 GrootVL: Tree Topology is All You Need in State Space Model Yicheng Xiao et.al. 2406.02395 link
2024-06-04 Multiple Choice Questions and Large Languages Models: A Case Study with Fictional Medical Data Maxime Griot et.al. 2406.02394 link
2024-05-31 Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis Chaoyou Fu et.al. 2405.21075 null
2024-05-31 Code Pretraining Improves Entity Tracking Abilities of Language Models Najoung Kim et.al. 2405.21068 null
2024-05-31 Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Tri Dao et.al. 2405.21060 link
2024-05-31 RydbergGPT David Fitzek et.al. 2405.21052 link
2024-05-31 Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling Jiatao Gu et.al. 2405.21048 null
2024-05-31 Grammar-Aligned Decoding Kanghee Park et.al. 2405.21047 null
2024-05-31 Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF Tengyang Xie et.al. 2405.21046 null
2024-05-31 Direct Alignment of Language Models via Quality-Aware Self-Refinement Runsheng Yu et.al. 2405.21040 null
2024-05-31 Standards for Belief Representations in LLMs Daniel A. Herrmann et.al. 2405.21030 null
2024-05-31 LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models Elias Stengel-Eskin et.al. 2405.21028 link
2024-05-31 You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet Zhen Qin et.al. 2405.21022 null
2024-05-31 Improved Techniques for Optimization-Based Jailbreaking on Large Language Models Xiaojun Jia et.al. 2405.21018 link
2024-06-03 StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond Pengyuan Lyu et.al. 2405.21013 null
2024-05-31 Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models Yi Yang et.al. 2405.20991 link
2024-05-31 DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models Linli Yao et.al. 2405.20985 link
2024-05-31 Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training Feiteng Fang et.al. 2405.20978 link
2024-05-31 SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales Tianyang Xu et.al. 2405.20974 link
2024-05-31 LCQ: Low-Rank Codebook based Quantization for Large Language Models Wen-Pu Cai et.al. 2405.20973 null
2024-06-03 Large Language Models are Zero-Shot Next Location Predictors Ciro Beneduce et.al. 2405.20962 link
2024-06-03 A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs’ Humour Alignment with Comedians Piotr Wojciech Mirowski et.al. 2405.20956 null
2024-05-30 MotionLLM: Understanding Human Behaviors from Human Motions and Videos Ling-Hao Chen et.al. 2405.20340 link
2024-05-30 Visual Perception by Large Language Model’s Weights Feipeng Ma et.al. 2405.20339 link
2024-05-30 Xwin-LM: Strong and Scalable Alignment Practice for LLMs Bolin Ni et.al. 2405.20335 link
2024-05-31 ParSEL: Parameterized Shape Editing with Language Aditya Ganeshan et.al. 2405.20319 null
2024-05-30 CausalQuest: Collecting Natural Causal Questions for AI Agents Roberto Ceraolo et.al. 2405.20318 link
2024-05-30 ANAH: Analytical Annotation of Hallucinations in Large Language Models Ziwei Ji et.al. 2405.20315 link
2024-05-30 Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation Guillaume Huguet et.al. 2405.20313 null
2024-05-30 Large Language Models Can Self-Improve At Web Agent Tasks Ajay Patel et.al. 2405.20309 link
2024-05-30 Can’t make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models Himangi Mittal et.al. 2405.20305 null
2024-05-30 Group Robust Preference Optimization in Reward-free RLHF Shyam Sundhar Ramesh et.al. 2405.20304 link
2024-05-30 Who Writes the Review, Human or AI? Panagiotis C. Theocharopoulos et.al. 2405.20285 null
2024-05-30 ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections Massimo Bini et.al. 2405.20271 link
2024-05-30 Evaluating Large Language Model Biases in Persona-Steered Generation Andy Liu et.al. 2405.20253 link
2024-05-30 Towards Hierarchical Multi-Agent Workflows for Zero-Shot Prompt Optimization Yuchi Liu et.al. 2405.20252 link
2024-05-30 Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use Franz Louis Cesista et.al. 2405.20245 null
2024-05-30 Context Injection Attacks on Large Language Models Cheng’an Wei et.al. 2405.20234 null
2024-05-30 Data-efficient fine-tuning of foundational models for first-principles quality sublimation enthalpies Harveen Kaur et.al. 2405.20217 null
2024-05-30 TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models Chen Zhang et.al. 2405.20215 null
2024-05-30 One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments Ke Yi et.al. 2405.20202 null
2024-05-31 Using Large Language Models for Humanitarian Frontline Negotiation: Opportunities and Considerations Zilin Ma et.al. 2405.20195 null
2024-05-29 X-VILA: Cross-Modality Alignment for Large Language Model Hanrong Ye et.al. 2405.19335 null
2024-05-29 LLMs Meet Multimodal Generation and Editing: A Survey Yingqing He et.al. 2405.19334 link
2024-05-29 Multi-Modal Generative Embedding Model Feipeng Ma et.al. 2405.19333 null
2024-05-29 Self-Exploring Language Models: Active Preference Elicitation for Online Alignment Shenao Zhang et.al. 2405.19332 link
2024-05-29 Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation Atrisha Sarkar et.al. 2405.19328 null
2024-05-29 MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Ge Zhang et.al. 2405.19327 link
2024-05-29 Reasoning3D – Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models Tianrun Chen et.al. 2405.19326 null
2024-05-29 Nearest Neighbor Speculative Decoding for LLM Generation and Attribution Minghan Li et.al. 2405.19325 null
2024-05-29 Are Large Language Models Chameleons? Mingmeng Geng et.al. 2405.19323 null
2024-05-29 Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF Shicong Cen et.al. 2405.19320 null
2024-05-29 Robust Preference Optimization through Reward Model Distillation Adam Fisch et.al. 2405.19316 null
2024-05-29 Matryoshka Query Transformer for Large Vision-Language Models Wenbo Hu et.al. 2405.19315 link
2024-05-29 Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice Jian-Qiao Zhu et.al. 2405.19313 null
2024-05-29 Expert-Guided Extinction of Toxic Tokens for Debiased Generation Xueyao Sun et.al. 2405.19299 null
2024-05-29 MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection Michael Regan et.al. 2405.19285 null
2024-05-29 Optimizing Foundation Model Inference on a Many-tiny-core Open-source RISC-V Platform Viviane Potocnik et.al. 2405.19284 null
2024-05-29 Programmable Motion Generation for Open-Set Motion Control Tasks Hanchao Liu et.al. 2405.19283 null
2024-05-29 PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications Dingkang Yang et.al. 2405.19266 link
2024-05-29 AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data Zifan Song et.al. 2405.19265 link
2024-05-29 Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models Zhanhui Zhou et.al. 2405.19262 link
2024-05-28 Why are Visually-Grounded Language Models Bad at Image Classification? Yuhui Zhang et.al. 2405.18415 link
2024-05-28 Don’t Forget to Connect! Improving RAG with Graph-based Reranking Jialin Dong et.al. 2405.18414 null
2024-05-28 WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization Jiawei Ma et.al. 2405.18405 null
2024-05-29 Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass Ethan Shen et.al. 2405.18400 link
2024-05-28 Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning Yixiao Zhang et.al. 2405.18386 link
2024-05-28 OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning Pengxiang Li et.al. 2405.18380 link
2024-05-28 LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models Anthony Sarah et.al. 2405.18377 null
2024-05-28 Empowering Source-Free Domain Adaptation with MLLM-driven Curriculum Learning Dongjie Chen et.al. 2405.18376 link
2024-05-28 Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning Phakphum Artkaew et.al. 2405.18375 link
2024-05-28 PromptWizard: Task-Aware Agent-driven Prompt Optimization Framework Eshaan Agarwal et.al. 2405.18369 null
2024-05-28 Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving? Yifan Bai et.al. 2405.18361 null
2024-05-28 Bridging the Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs Somnath Kumar et.al. 2405.18359 null
2024-05-28 MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning Somnath Kumar et.al. 2405.18358 null
2024-05-28 Faithful Logical Reasoning via Symbolic Chain-of-Thought Jundong Xu et.al. 2405.18357 link
2024-05-28 Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography Jie Liu et.al. 2405.18356 link
2024-05-28 Intelligent Clinical Documentation: Harnessing Generative AI for Patient-Centric Clinical Note Generation Anjanava Biswas et.al. 2405.18346 null
2024-05-28 The Battle of LLMs: A Comparative Study in Conversational QA Tasks Aryan Rangapur et.al. 2405.18344 null
2024-05-28 Frustratingly Easy Test-Time Adaptation of Vision-Language Models Matteo Farina et.al. 2405.18330 link
2024-05-28 Multi-modal Generation via Cross-Modal In-Context Learning Amandeep Kumar et.al. 2405.18304 link
2024-05-28 Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning Renzhi Wang et.al. 2405.18292 null
2024-05-27 Matryoshka Multimodal Models Mu Cai et.al. 2405.17430 null
2024-05-27 NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models Chankyu Lee et.al. 2405.17428 null
2024-05-27 Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model Kuan-Chih Huang et.al. 2405.17427 link
2024-05-27 LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence Zhuoling Li et.al. 2405.17424 null
2024-05-27 Privacy-Aware Visual Language Models Laurens Samson et.al. 2405.17423 null
2024-05-27 Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation Jiaming Liu et.al. 2405.17418 null
2024-05-27 THREAD: Thinking Deeper with Recursive Spawning Philip Schroeder et.al. 2405.17402 link
2024-05-27 The Expressive Capacity of State Space Models: A Formal Language Perspective Yash Sarrof et.al. 2405.17394 null
2024-05-27 MindMerger: Efficient Boosting LLM Reasoning in non-English Languages Zixian Huang et.al. 2405.17386 link
2024-05-27 Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective Zhen Qin et.al. 2405.17383 null
2024-05-27 ReMoDetect: Reward Models Recognize Aligned LLM’s Generations Hyunseok Lee et.al. 2405.17382 link
2024-05-27 Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention Zhen Qin et.al. 2405.17381 link
2024-05-27 RTL-Repo: A Benchmark for Evaluating LLMs on Large-Scale RTL Design Projects Ahmed Allam et.al. 2405.17378 link
2024-05-28 Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models ShengYun Peng et.al. 2405.17374 link
2024-05-27 Prompt Optimization with Human Feedback Xiaoqiang Lin et.al. 2405.17346 link
2024-05-27 Exploring and steering the moral compass of Large Language Models Alejandro Tlaie et.al. 2405.17345 link
2024-05-27 Cost-efficient Knowledge-based Question Answering with Large Language Models Junnan Dong et.al. 2405.17337 null
2024-05-27 XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser Xianfu Cheng et.al. 2405.17336 link
2024-05-27 FedHPL: Efficient Heterogeneous Federated Learning with Prompt Tuning and Logit Distillation Yuting Ma et.al. 2405.17267 null
2024-05-27 On the Noise Robustness of In-Context Learning for Text Generation Hongfu Gao et.al. 2405.17264 link
2024-05-24 Scaling Laws for Discriminative Classification in Large Language Models Dean Wyatte et.al. 2405.15765 null
2024-05-24 Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence Abhinav Patil et.al. 2405.15750 link
2024-05-24 Sparse maximal update parameterization: A holistic approach to sparse training dynamics Nolan Dey et.al. 2405.15743 link
2024-05-24 Large Language Models Reflect Human Citation Patterns with a Heightened Citation Bias Andres Algaba et.al. 2405.15739 link
2024-05-24 LM4LV: A Frozen Large Language Model for Low-level Vision Tasks Boyang Zheng et.al. 2405.15734 link
2024-05-24 Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks Jerome Sieber et.al. 2405.15731 link
2024-05-24 Optimizing Large Language Models for OpenAPI Code Completion Bohdan Petryshyn et.al. 2405.15729 link
2024-05-24 Disease-informed Adaptation of Vision-Language Models Jiajin Zhang et.al. 2405.15728 link
2024-05-24 The Impact of Geometric Complexity on Neural Collapse in Transfer Learning Michael Munn et.al. 2405.15706 null
2024-05-24 Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models Yue Zhang et.al. 2405.15684 null
2024-05-24 VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap Sreyan Ghosh et.al. 2405.15683 link
2024-05-24 What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models Abdelrahman Abdelhamed et.al. 2405.15668 null
2024-05-24 Class Machine Unlearning for Complex Data via Concepts Inference and Data Poisoning Wenhan Chang et.al. 2405.15662 null
2024-05-24 \(\mathbf{L^2\cdot M = C^2}\) Large Language Models as Covert Channels… a Systematic Analysis Simen Gaure et.al. 2405.15652 null
2024-05-24 LLM-based Robot Task Planning with Exceptional Handling for General Purpose Service Robots Ruoyu Wang et.al. 2405.15646 null
2024-05-24 GECKO: Generative Language Model for English, Code and Korean Sungwoo Oh et.al. 2405.15640 null
2024-05-24 M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models Hongyu Wang et.al. 2405.15638 link
2024-05-24 GPTZoo: A Large-scale Dataset of GPTs for the Research Community Xinyi Hou et.al. 2405.15630 link
2024-05-24 A Comparative Analysis of Distributed Training Strategies for GPT-2 Ishan Patwardhan et.al. 2405.15628 null
2024-05-24 Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment Hao Sun et.al. 2405.15624 null
2024-05-23 PuzzleAvatar: Assembling 3D Avatars from Personal Albums Yuliang Xiu et.al. 2405.14869 link
2024-05-23 A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns Asaf Yehudai et.al. 2405.14863 null
2024-05-23 Bitune: Bidirectional Instruction-Tuning Dawid J. Kopiczko et.al. 2405.14862 null
2024-05-23 Not All Language Model Features Are Linear Joshua Engels et.al. 2405.14860 link
2024-05-23 PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression Vladimir Malinovskii et.al. 2405.14852 link
2024-05-23 A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis Yue Yang et.al. 2405.14839 null
2024-05-23 From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step Yuntian Deng et.al. 2405.14838 link
2024-05-23 HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models Bernal Jiménez Gutiérrez et.al. 2405.14831 link
2024-05-23 Designing A Sustainable Marine Debris Clean-up Framework without Human Labels Raymond Wang et.al. 2405.14815 link
2024-05-23 As an AI Language Model, “Yes I Would Recommend Calling the Police”: Norm Inconsistency in LLM Decision-Making Shomik Jain et.al. 2405.14812 null
2024-05-23 Implicit Personalization in Language Models: A Systematic Study Zhijing Jin et.al. 2405.14808 link
2024-05-23 Can LLMs Solve longer Math Word Problems Better? Xin Xu et.al. 2405.14804 null
2024-05-23 Lessons from the Trenches on Reproducible Evaluation of Language Models Stella Biderman et.al. 2405.14782 null
2024-05-23 WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models Peng Wang et.al. 2405.14768 link
2024-05-23 FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models Hongyang Yang et.al. 2405.14767 link
2024-05-23 Evaluating Large Language Models for Public Health Classification and Extraction Tasks Joshua Harris et.al. 2405.14766 null
2024-05-23 Large language models can be zero-shot anomaly detectors for time series? Sarah Alnegheimish et.al. 2405.14755 link
2024-05-23 A Transformer-Based Approach for Smart Invocation of Automatic Code Completion Aral de Moor et.al. 2405.14753 link
2024-05-23 MultiCast: Zero-Shot Multivariate Time Series Forecasting Using LLMs Georgios Chatzigeorgakidis et.al. 2405.14748 null
2024-05-23 Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View Xuan Liu et.al. 2405.14744 null
2024-05-21 Reducing Transformer Key-Value Cache Size with Cross-Layer Attention William Brandon et.al. 2405.12981 null
2024-05-21 OmniGlue: Generalizable Feature Matching with Foundation Model Guidance Hanwen Jiang et.al. 2405.12979 link
2024-05-21 BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once Theodore Zhao et.al. 2405.12971 null
2024-05-21 Energy Rank Alignment: Using Preference Optimization to Search Chemical Space at Scale Shriram Chennakesavalu et.al. 2405.12961 link
2024-05-21 Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models Zhangyue Yin et.al. 2405.12939 link
2024-05-21 Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs Bilgehan Sel et.al. 2405.12933 null
2024-05-21 Code-mixed Sentiment and Hate-speech Prediction Anjali Yadav et.al. 2405.12929 link
2024-05-21 Streamlining Software Reviews: Efficient Predictive Modeling with Minimal Examples Tim Menzies et.al. 2405.12920 link
2024-05-21 G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation Xingyuan Pan et.al. 2405.12915 link
2024-05-21 An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation Zhiyu Tan et.al. 2405.12914 link
2024-05-21 Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment Holli Sargeant et.al. 2405.12910 link
2024-05-21 Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents San Kim et.al. 2405.12900 null
2024-05-21 Investigating Persuasion Techniques in Arabic: An Empirical Study Leveraging Large Language Models Abdurahmman Alzahrani et.al. 2405.12884 null
2024-05-21 LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language James Requeima et.al. 2405.12856 link
2024-05-21 OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models Zhaojian Yu et.al. 2405.12843 link
2024-05-21 SmartFlow: Robotic Process Automation using LLMs Arushi Jain et.al. 2405.12842 null
2024-05-21 Large Language Models Meet NLP: A Survey Libo Qin et.al. 2405.12819 link
2024-05-21 Test Oracle Automation in the era of LLMs Facundo Molina et.al. 2405.12766 null
2024-05-21 C3L: Content Correlated Vision-Language Instruction Tuning Data Generation via Contrastive Learning Ji Ma et.al. 2405.12752 null
2024-05-21 Generative AI and Large Language Models for Cyber Security: All Insights You Need Mohamed Amine Ferrag et.al. 2405.12750 null
2024-05-20 Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning Guanglin Zhou et.al. 2405.12217 link
2024-05-20 MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark Hongwei Liu et.al. 2405.12209 link
2024-05-20 Developers’ Perceptions on the Impact of ChatGPT in Software Development: A Survey Thiago S. Vaillant et.al. 2405.12195 link
2024-05-20 CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models Haoxiang Shi et.al. 2405.12174 null
2024-05-20 Fennec: Fine-grained Language Model Evaluation and Correction Extended through Branching and Bridging Xiaobo Liang et.al. 2405.12163 link
2024-05-20 Eliciting Problem Specifications via Large Language Models Robert E. Wray et.al. 2405.12147 null
2024-05-20 DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM Xuchen Li et.al. 2405.12139 null
2024-05-20 MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Ting Jiang et.al. 2405.12130 link
2024-05-20 Reindex-Then-Adapt: Improving Large Language Models for Conversational Recommendation Zhankui He et.al. 2405.12119 null
2024-05-20 Imp: Highly Capable Large Multimodal Models for Mobile Devices Zhenwei Shao et.al. 2405.12107 link
2024-05-20 DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction Hao Chen et.al. 2405.12100 null
2024-05-20 Distributional Semantics, Holism, and the Instability of Meaning Jumbly Grindrod et.al. 2405.12084 null
2024-05-20 PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation Zhuobin Huang et.al. 2405.12079 null
2024-05-20 CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models Tong Zhang et.al. 2405.12063 link
2024-05-20 STYLE: Improving Domain Transferability of Asking Clarification Questions in Large Language Model Powered Conversational Agents Yue Chen et.al. 2405.12059 null
2024-05-20 KG-RAG: Bridging the Gap Between Knowledge and Creativity Diego Sanmartin et.al. 2405.12035 null
2024-05-20 Can AI Relate: Testing Large Language Model Response for Mental Health Support Saadia Gabriel et.al. 2405.12021 link
2024-05-20 MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering Jingqun Tang et.al. 2405.11985 link
2024-05-20 A review on the use of large language models as virtual tutors Silvia García-Méndez et.al. 2405.11983 null
2024-05-20 Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays Zhichao Sun et.al. 2405.11976 link
2024-05-17 Observational Scaling Laws and the Predictability of Language Model Performance Yangjun Ruan et.al. 2405.10938 link
2024-05-17 A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers Kaiyu Huang et.al. 2405.10936 link
2024-05-17 The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks Lucius Bushnaq et.al. 2405.10928 link
2024-05-17 Blackbox Adaptation for Medical Image Segmentation Jay N. Paranjape et.al. 2405.10913 link
2024-05-17 COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain Dimitrios P. Panagoulias et.al. 2405.10893 null
2024-05-17 Application of Artificial Intelligence in Schizophrenia Rehabilitation Management: Systematic Literature Review Hongyi Yang et.al. 2405.10883 null
2024-05-17 ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains Zhaopei Huang et.al. 2405.10860 link
2024-05-17 The Future of Large Language Model Pre-training is Federated Lorenzo Sani et.al. 2405.10853 null
2024-05-17 Open-Vocabulary Spatio-Temporal Action Detection Tao Wu et.al. 2405.10832 null
2024-05-17 Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities Hao Zhou et.al. 2405.10825 null
2024-05-17 ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios Markus Bayer et.al. 2405.10808 null
2024-05-17 The Relational Machine Calculus Chris Barrett et.al. 2405.10801 null
2024-05-17 Empowering Small-Scale Knowledge Graphs: A Strategy of Leveraging General-Purpose Knowledge Graphs for Enriched Embeddings Albert Sawczyn et.al. 2405.10745 null
2024-05-17 Efficient Multimodal Large Language Models: A Survey Yizhang Jin et.al. 2405.10739 link
2024-05-17 INDUS: Effective and Efficient Language Models for Scientific Applications Bishwaranjan Bhattacharjee et.al. 2405.10725 null
2024-05-17 SignLLM: Sign Languages Production Large Language Models Sen Fang et.al. 2405.10718 null
2024-05-17 Persian Pronoun Resolution: Leveraging Neural Networks and Language Models Hassan Haji Mohammadi et.al. 2405.10714 null
2024-05-17 SynDy: Synthetic Dynamic Dataset Generation Framework for Misinformation Tasks Michael Shliselberg et.al. 2405.10700 null
2024-05-17 Revolutionizing Process Mining: A Novel Architecture for ChatGPT Integration and Enhanced User Experience through Optimized Prompt Engineering Mehrdad Agha Mohammad Ali Kermani et.al. 2405.10689 null
2024-05-17 Realistic Evaluation of Toxicity in Large Language Models Tinh Son Luong et.al. 2405.10659 null
2024-05-16 UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models Sahel Sharifymoghaddam et.al. 2405.10311 null
2024-05-16 4D Panoptic Scene Graph Generation Jingkang Yang et.al. 2405.10305 link
2024-05-16 Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees Yu Gui et.al. 2405.10301 link
2024-05-16 HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Rhea Sanjay Sukthanker et.al. 2405.10299 link
2024-05-17 Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning Yuexiang Zhai et.al. 2405.10292 null
2024-05-16 Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction Jianhao Chen et.al. 2405.10288 link
2024-05-16 FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models Adrian Bulat et.al. 2405.10286 null
2024-05-16 Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers Tuo Zhang et.al. 2405.10276 null
2024-05-16 Keep It Private: Unsupervised Privatization of Online Text Calvin Bao et.al. 2405.10260 link
2024-05-16 When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models Xianzheng Ma et.al. 2405.10255 link
2024-05-16 PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology George Shaikovski et.al. 2405.10254 null
2024-05-16 A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks Xuanfan Ni et.al. 2405.10251 null
2024-05-16 IntelliExplain: Enhancing Interactive Code Generation through Natural Language Explanations for Non-Professional Programmers Hao Yan et.al. 2405.10250 null
2024-05-16 A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts Xinru Zhang et.al. 2405.10246 link
2024-05-16 DocuMint: Docstring Generation for Python using Small Language Models Bibek Poudel et.al. 2405.10243 link
2024-05-16 Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting Divij Gupta et.al. 2405.10216 null
2024-05-16 CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations Jiahao Zhao et.al. 2405.10212 link
2024-05-16 LFED: A Literary Fiction Evaluation Dataset for Large Language Models Linhao Yu et.al. 2405.10166 link
2024-05-16 PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning Jiancheng Pan et.al. 2405.10160 link
2024-05-16 Speaker Verification in Agent-Generated Conversations Yizhe Yang et.al. 2405.10150 null
2024-05-15 Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming Bushi Xiao et.al. 2405.09508 null
2024-05-15 Constrained Learning for Causal Inference and Semiparametric Statistics Tiffany Tianhui Cai et.al. 2405.09493 null
2024-05-15 Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts Donya Rooein et.al. 2405.09482 null
2024-05-15 Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models Majid Zarharan et.al. 2405.09454 link
2024-05-15 M$^4$oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts Yufeng Jiang et.al. 2405.09446 link
2024-05-15 Facilitating Opinion Diversity through Hybrid NLP Approaches Michiel van der Meer et.al. 2405.09439 null
2024-05-15 A Survey On Text-to-3D Contents Generation In The Wild Chenhan Jiang et.al. 2405.09431 null
2024-05-15 MicroPython Testbed for Federated Learning Algorithms Miroslav Popovic et.al. 2405.09423 link
2024-05-15 Matching domain experts by training from scratch on domain knowledge Xiaoliang Luo et.al. 2405.09395 null
2024-05-15 Compositional imprecise probability Jack Liell-Cock et.al. 2405.09391 null
2024-05-15 PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models Devansh Jain et.al. 2405.09373 link
2024-05-15 SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition Weijie L et.al. 2405.09365 link
2024-05-15 Large Language Model Bias Mitigation from the Perspective of Knowledge Editing Ruizhe Chen et.al. 2405.09341 null
2024-05-15 Prompting-based Synthetic Data Generation for Few-Shot Question Answering Maximilian Schmidt et.al. 2405.09335 link
2024-05-15 Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls Pedro Miguel Sánchez Sánchez et.al. 2405.09318 null
2024-05-15 Comparing the Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support Birger Moell et.al. 2405.09300 null
2024-05-15 Do language models capture implied discourse meanings? An investigation with exhaustivity implicatures of Korean morphology Hagyeong Shin et.al. 2405.09293 null
2024-05-15 Sign of the Times: Evaluating the use of Large Language Models for Idiomaticity Detection Dylan Phelps et.al. 2405.09279 null
2024-05-15 Dynamic Activation Pitfalls in LLaMA Models: An Empirical Study Chi Ma et.al. 2405.09274 null
2024-05-15 New Textual Corpora for Serbian Language Modeling Mihailo Škorić et.al. 2405.09250 null
2024-05-14 Efficient Vision-Language Pre-training by Cluster Masking Zihao Wei et.al. 2405.08815 link
2024-05-14 Towards Enhanced RAC Accessibility: Leveraging Datasets and LLMs Edison Jair Bejarano Sepulveda et.al. 2405.08792 link
2024-05-14 Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring Tiantian Zhang et.al. 2405.08786 link
2024-05-14 Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs Akhila Yerukola et.al. 2405.08760 link
2024-05-14 Distributed Threat Intelligence at the Edge Devices: A Large Language Model-Driven Approach Syed Mhamudul Hasan et.al. 2405.08755 null
2024-05-14 Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Zhimin Li et.al. 2405.08748 link
2024-05-14 Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory Xueyan Niu et.al. 2405.08707 null
2024-05-14 EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera Beilei Cui et.al. 2405.08672 link
2024-05-14 Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research Qinglong Cao et.al. 2405.08668 link
2024-05-14 Thinking Tokens for Language Modeling David Herel et.al. 2405.08644 null
2024-05-15 ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation Dimitris Gkoumas et.al. 2405.08619 null
2024-05-14 A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine Hanguang Xiao et.al. 2405.08603 null
2024-05-15 EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark Xiaohui Zhang et.al. 2405.08596 link
2024-05-14 Open-Vocabulary Object Detection via Neighboring Region Attention Alignment Sunyuan Qiang et.al. 2405.08593 null
2024-05-14 Improving Transformers with Dynamically Composable Multi-Head Attention Da Xiao et.al. 2405.08553 link
2024-05-14 Self-Distillation Improves DNA Sequence Inference Tong Yu et.al. 2405.08538 link
2024-05-14 Falcon 7b for Software Mention Detection in Scholarly Documents AmeerAli Khan et.al. 2405.08514 null
2024-05-14 Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure Odysseas S. Chlapanis et.al. 2405.08502 link
2024-05-14 Is Less More? Quality, Quantity and Context in Idiom Processing with Natural Language Models Agne Knietaite et.al. 2405.08497 link
2024-05-14 Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models Andrea Piergentili et.al. 2405.08477 null
2024-05-13 Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Chengyue Wu et.al. 2405.07990 null
2024-05-13 A Generalist Learner for Multifaceted Medical Image Interpretation Hong-Yu Zhou et.al. 2405.07988 null
2024-05-13 The Platonic Representation Hypothesis Minyoung Huh et.al. 2405.07987 link
2024-05-13 Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation Kevin Stangl et.al. 2405.07969 null
2024-05-13 PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation Suad Alshammari et.al. 2405.07963 link
2024-05-13 AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments Samuel Schmidgall et.al. 2405.07960 null
2024-05-13 EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning Yinzhu Quan et.al. 2405.07938 link
2024-05-13 PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition Ziyang Zhang et.al. 2405.07932 link
2024-05-13 Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data Mahdi Morafah et.al. 2405.07925 null
2024-05-13 Can Better Text Semantics in Prompt Tuning Improve VLM Generalization? Hari Chandana Kuchibhotla et.al. 2405.07921 null
2024-05-13 A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking Ferdinand Schlatt et.al. 2405.07920 link
2024-05-13 PLUTO: Pathology-Universal Transformer Dinkar Juyal et.al. 2405.07905 null
2024-05-13 Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers Alena Tsanda et.al. 2405.07886 link
2024-05-13 Zero-Shot Tokenizer Transfer Benjamin Minixhofer et.al. 2405.07883 link
2024-05-13 RLHF Workflow: From Reward Modeling to Online RLHF Hanze Dong et.al. 2405.07863 link
2024-05-13 Can LLMs Help Predict Elections? (Counter)Evidence from the World’s Largest Democracy Pratik Gujral et.al. 2405.07828 null
2024-05-13 A View of How Language Models Will Transform Law Frank Fagan et.al. 2405.07826 null
2024-05-13 FreeVA: Offline MLLM as Training-Free Video Assistant Wenhao Wu et.al. 2405.07798 link
2024-05-13 DEPTH: Discourse Education through Pre-Training Hierarchically Zachary Bamberger et.al. 2405.07788 link
2024-05-13 Generating Human Motion in 3D Scenes from Text Descriptions Zhi Cen et.al. 2405.07784 null
2024-05-10 Linearizing Large Language Models Jean Mercat et.al. 2405.06640 link
2024-05-10 Value Augmented Sampling for Language Model Alignment and Personalization Seungwook Han et.al. 2405.06639 link
2024-05-10 Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark Evan M. Williams et.al. 2405.06634 link
2024-05-10 Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models Chakshu Moar et.al. 2405.06626 null
2024-05-10 Explaining Text Similarity in Transformer Models Alexandros Vasileiou et.al. 2405.06604 link
2024-05-10 Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach Elham Ravanbakhsh et.al. 2405.06586 null
2024-05-10 What Can Natural Language Processing Do for Peer Review? Ilia Kuznetsov et.al. 2405.06563 link
2024-05-10 Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval Mengjia Niu et.al. 2405.06545 null
2024-05-10 Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts Wenyu Huang et.al. 2405.06524 null
2024-05-10 UniDM: A Unified Framework for Data Manipulation with Large Language Models Yichen Qian et.al. 2405.06510 null
2024-05-10 Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling Lyumanshan Ye et.al. 2405.06495 null
2024-05-10 Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification Yaoqin Ye et.al. 2405.06468 link
2024-05-10 Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation JoonHo Lee et.al. 2405.06424 link
2024-05-10 Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions? Hunter McNichols et.al. 2405.06414 link
2024-05-10 Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL Ning Cheng et.al. 2405.06410 null
2024-05-10 Program Synthesis using Inductive Logic Programming for the Abstraction and Reasoning Corpus Filipe Marinho Rocha et.al. 2405.06399 null
2024-05-10 Memory Mosaics Jianyu Zhang et.al. 2405.06394 link
2024-05-10 LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play Li-Chun Lu et.al. 2405.06373 link
2024-05-10 LMD3: Language Model Data Density Dependence John Kirchenbauer et.al. 2405.06331 null
2024-05-10 Correlation Dimension of Natural Language in a Statistical Manifold Xin Du et.al. 2405.06321 null
2024-05-09 Natural Language Processing RELIES on Linguistics Juri Opitz et.al. 2405.05966 null
2024-05-09 OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning Dan Qiao et.al. 2405.05957 link
2024-05-09 Probing Multimodal LLMs as World Models for Driving Shiva Sreeram et.al. 2405.05956 link
2024-05-09 Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning Junzhi Chen et.al. 2405.05955 link
2024-05-09 CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts Jiachen Li et.al. 2405.05949 link
2024-05-09 DOLOMITES: Domain-Specific Long-Form Methodical Tasks Chaitanya Malaviya et.al. 2405.05938 null
2024-05-09 Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness Siyuan Li et.al. 2405.05930 null
2024-05-09 Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? Zorik Gekhman et.al. 2405.05904 null
2024-05-09 Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes Ziang Guo et.al. 2405.05885 link
2024-05-09 FlockGPT: Guiding UAV Flocking with Linguistic Orchestration Artem Lykov et.al. 2405.05872 null
2024-05-09 Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control Gunshi Gupta et.al. 2405.05852 link
2024-05-09 Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning Artem Lykov et.al. 2405.05824 link
2024-05-09 Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference Zhihang Lin et.al. 2405.05803 link
2024-05-09 Towards a More Inclusive AI: Progress and Perspectives in Large Language Model Training for the Sámi Language Ronny Paul et.al. 2405.05777 null
2024-05-09 Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions Polina Tsvilodub et.al. 2405.05776 null
2024-05-09 Large Language Model-Aided Evolutionary Search for Constrained Multiobjective Optimization Zeyi Wang et.al. 2405.05767 null
2024-05-09 Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social Media Zhizhen Zhang et.al. 2405.05760 null
2024-05-09 Exploring the Potential of Human-LLM Synergy in Advancing Qualitative Analysis: A Case Study on Mental-Illness Stigma Han Meng et.al. 2405.05758 null
2024-05-09 Can large language models understand uncommon meanings of common words? Jinyang Wu et.al. 2405.05741 null
2024-05-09 Evaluating Dialect Robustness of Language Models via Conversation Understanding Dipankar Srirag et.al. 2405.05688 link
2024-05-08 THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models Prannay Kaul et.al. 2405.05256 null
2024-05-08 You Only Cache Once: Decoder-Decoder Architectures for Language Models Yutao Sun et.al. 2405.05254 link
2024-05-08 Open Source Language Models Can Provide Feedback: Evaluating LLMs’ Ability to Help Students Using GPT-4-As-A-Judge Charles Koutcheme et.al. 2405.05253 link
2024-05-09 LLMs with Personalities in Multi-issue Negotiation Games Sean Noh et.al. 2405.05248 null
2024-05-08 EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning Jingfeng Yao et.al. 2405.05237 link
2024-05-08 SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants Masoud Moghani et.al. 2405.05226 null
2024-05-08 Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers Jiuxiang Gu et.al. 2405.05219 null
2024-05-08 FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models Jinglin Xu et.al. 2405.05216 link
2024-05-08 MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning Inderjeet Nair et.al. 2405.05189 link
2024-05-08 Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming Tommaso Pasini et.al. 2405.05176 null
2024-05-08 Air Gap: Protecting Privacy-Conscious Conversational Agents Eugene Bagdasaryan et.al. 2405.05175 null
2024-05-08 XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples Peiqin Lin et.al. 2405.05116 link
2024-05-08 QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs Weijia Zhang et.al. 2405.05109 null
2024-05-08 Concerns on Bias in Large Language Models when Creating Synthetic Personae Helena A. Haxvig et.al. 2405.05080 null
2024-05-08 Impact of Tone-Aware Explanations in Recommender Systems Ayano Okoso et.al. 2405.05061 null
2024-05-08 Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models Aylin Gunal et.al. 2405.05060 null
2024-05-08 Seeds of Stereotypes: A Large-Scale Textual Analysis of Race and Gender Associations with Diseases in Online Sources Lasse Hyldig Hansen et.al. 2405.05049 null
2024-05-08 $M^2D$NeRF: Multi-Modal Decomposition NeRF with 3D Feature Fields Ning Wang et.al. 2405.05010 null
2024-05-08 ADELIE: Aligning Large Language Models on Information Extraction Yunjia Qi et.al. 2405.05008 link
2024-05-08 NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair Ruoke Wang et.al. 2405.04994 null
2024-05-07 ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning Jing Lin et.al. 2405.04533 null
2024-05-07 QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving Yujun Lin et.al. 2405.04532 link
2024-05-07 NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts Shudan Zhang et.al. 2405.04520 null
2024-05-07 xLSTM: Extended Long Short-Term Memory Maximilian Beck et.al. 2405.04517 link
2024-05-07 A Transformer with Stack Attention Jiaoda Li et.al. 2405.04515 link
2024-05-08 Unveiling Disparities in Web Task Handling Between Human and Web Agent Kihoon Son et.al. 2405.04497 null
2024-05-07 Toward In-Context Teaching: Adapting Examples to Students’ Misconceptions Alexis Ross et.al. 2405.04495 null
2024-05-07 Representation Learning of Daily Movement Data Using Text Encoders Alexander Capstick et.al. 2405.04494 link
2024-05-08 DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model DeepSeek-AI et.al. 2405.04434 link
2024-05-07 The Silicone Ceiling: Auditing GPT’s Race and Gender Biases in Hiring Lena Armstrong et.al. 2405.04412 null
2024-05-07 Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks Georgios Pantazopoulos et.al. 2405.04403 link
2024-05-07 Large Language Models Cannot Explain Themselves Advait Sarkar et.al. 2405.04382 null
2024-05-07 A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI Hannah Chafetz et.al. 2405.04333 null
2024-05-07 Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation Atharvan Dogra et.al. 2405.04325 null
2024-05-07 Granite Code Models: A Family of Open Foundation Models for Code Intelligence Mayank Mishra et.al. 2405.04324 link
2024-05-07 Accelerating Speculative Decoding using Dynamic Speculation Length Jonathan Mamou et.al. 2405.04304 null
2024-05-07 Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework Xiangpeng Wan et.al. 2405.04294 link
2024-05-07 Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore Junchao Wu et.al. 2405.04286 null
2024-05-07 On the Foundations of Earth and Climate Foundation Models Xiao Xiang Zhu et.al. 2405.04285 null
2024-05-07 Semantic API Alignment: Linking High-level User Goals to APIs Robert Feldt et.al. 2405.04236 null
2024-05-06 Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs Muhammad Uzair Khattak et.al. 2405.03690 null
2024-05-06 Pose Priors from Language Models Sanjay Subramanian et.al. 2405.03689 null
2024-05-06 Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames Keith Burghardt et.al. 2405.03688 link
2024-05-06 Language-Image Models with 3D Understanding Jang Hyun Cho et.al. 2405.03685 null
2024-05-06 AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design Kamal Choudhary et.al. 2405.03680 link
2024-05-06 When LLMs Meet Cybersecurity: A Systematic Literature Review Jie Zhang et.al. 2405.03644 link
2024-05-06 A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama Vlad-Andrei Cursaru et.al. 2405.03616 null
2024-05-06 GREEN: Generative Radiology Report Evaluation and Error Notation Sophie Ostmeier et.al. 2405.03595 null
2024-05-06 Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment Abhinav Agarwalla et.al. 2405.03594 null
2024-05-06 Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing Han Liu et.al. 2405.03565 null
2024-05-07 ID-centric Pre-training for Recommendation Yiqing Wu et.al. 2405.03562 null
2024-05-06 AlphaMath Almost Zero: process Supervision without process Guoxin Chen et.al. 2405.03553 link
2024-05-06 MAmmoTH2: Scaling Instructions from the Web Xiang Yue et.al. 2405.03548 null
2024-05-06 Position Paper: Leveraging Foundational Models for Black-Box Optimization: Benefits, Challenges, and Future Directions Xingyou Song et.al. 2405.03547 null
2024-05-06 Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning Yubo Mai et.al. 2405.03509 null
2024-05-06 UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images Yiting Qu et.al. 2405.03486 null
2024-05-06 LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model Haowen Sun et.al. 2405.03485 link
2024-05-06 Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search Hideaki Joko et.al. 2405.03480 link
2024-05-07 Large Language Models (LLMs) as Agents for Augmented Democracy Jairo Gudiño-Rosero et.al. 2405.03452 null
2024-05-06 SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence Hangyuan Ji et.al. 2405.03446 link
2024-05-03 Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models Piotr Padlewski et.al. 2405.02287 link
2024-05-03 Structural Pruning of Pre-trained Language Models via Neural Architecture Search Aaron Klein et.al. 2405.02267 link
2024-05-03 On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning? Maxime Zanella et.al. 2405.02266 link
2024-05-03 Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows Jasmine Y. Shih et.al. 2405.02260 null
2024-05-03 What matters when building vision-language models? Hugo Laurençon et.al. 2405.02246 null
2024-05-03 REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs Deepa Tilwani et.al. 2405.02228 null
2024-05-03 Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks Lujing Zhang et.al. 2405.02225 null
2024-05-03 FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems Yashar Deldjoo et.al. 2405.02219 null
2024-05-03 Automatic Programming: Large Language Models and Beyond Michael R. Lyu et.al. 2405.02213 null
2024-05-03 Assessing and Verifying Task Utility in LLM-Powered Applications Negar Arabzadeh et.al. 2405.02178 null
2024-05-03 Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset Hsuvas Borkakoty et.al. 2405.02175 link
2024-05-03 Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models Mohamad Al Mdfaa et.al. 2405.02162 null
2024-05-03 Neural Context Flows for Learning Generalizable Dynamical Systems Roussel Desmond Nzoyem et.al. 2405.02154 link
2024-05-03 The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates Giuseppe Russo Latona et.al. 2405.02150 link
2024-05-03 MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain Chao Jiang et.al. 2405.02144 null
2024-05-03 Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection Guillem Ramírez et.al. 2405.02134 null
2024-05-03 Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets Xuelong Geng et.al. 2405.02132 link
2024-05-03 Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph Vladyslav Nechakhin et.al. 2405.02105 null
2024-05-03 Argumentative Large Language Models for Explainable and Contestable Decision-Making Gabriel Freedman et.al. 2405.02079 null
2024-05-03 Comparative Analysis of Retrieval Systems in the Real World Dmytro Mozolevskyi et.al. 2405.02048 null
2024-05-02 Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Seungone Kim et.al. 2405.01535 link
2024-05-02 Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks Murtaza Dalal et.al. 2405.01534 null
2024-05-02 OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning Shihao Wang et.al. 2405.01533 link
2024-05-02 FLAME: Factuality-Aware Alignment for Large Language Models Sheng-Chieh Lin et.al. 2405.01525 null
2024-05-02 A separability-based approach to quantifying generalization: which layer is best? Luciano Dyballa et.al. 2405.01524 link
2024-05-02 Transformer-Aided Semantic Communications Matin Mortaheb et.al. 2405.01521 null
2024-05-02 D2PO: Discriminator-Guided DPO with Response Evaluation Models Prasann Singhal et.al. 2405.01511 link
2024-05-02 Analyzing the Role of Semantic Representations in the Era of Large Language Models Zhijing Jin et.al. 2405.01502 link
2024-05-02 Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models Raymond Fok et.al. 2405.01501 null
2024-05-02 Controllable Text Generation in the Instruction-Tuning Era Dhananjay Ashok et.al. 2405.01490 null
2024-05-02 MANTIS: Interleaved Multi-Image Instruction Tuning Dongfu Jiang et.al. 2405.01483 link
2024-05-02 NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Gerald Shen et.al. 2405.01481 link
2024-05-02 V-FLUTE: Visual Figurative Language Understanding with Textual Explanations Arkadiy Saakyan et.al. 2405.01474 link
2024-05-02 Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning Théo Moutakanni et.al. 2405.01469 null
2024-05-02 Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models Yifei Ming et.al. 2405.01468 null
2024-05-02 A Systematic Literature Review on Large Language Models for Automated Program Repair Quanjun Zhang et.al. 2405.01466 link
2024-05-02 Natural Language to Verilog: Design of a Recurrent Spiking Neural Network using Large Language Models and ChatGPT Paola Vitolo et.al. 2405.01419 null
2024-05-02 MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors Yuan Tang et.al. 2405.01413 link
2024-05-02 Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving Xin Quan et.al. 2405.01379 link
2024-05-02 GAIA: A General AI Assistant for Intelligent Accelerator Operations Frank Mayet et.al. 2405.01359 null
2024-05-01 Self-Play Preference Optimization for Language Model Alignment Yue Wu et.al. 2405.00675 link
2024-05-01 Is Bigger Edit Batch Size Always Better? – An Empirical Study on Model Editing with Llama-3 Junsang Yoon et.al. 2405.00664 link
2024-05-01 HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models Ningke Li et.al. 2405.00648 null
2024-05-01 When Quantization Affects Confidence of Large Language Models? Irina Proskurina et.al. 2405.00632 link
2024-05-01 “I’m Not Sure, But…”: Examining the Impact of Large Language Models’ Uncertainty Expression on User Reliance and Trust Sunnie S. Y. Kim et.al. 2405.00623 null
2024-05-01 Causal Evaluation of Language Models Sirui Chen et.al. 2405.00622 link
2024-05-01 Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling Yida Mu et.al. 2405.00611 link
2024-05-01 Investigating Automatic Scoring and Feedback using Large Language Models Gloria Ashiya Katuka et.al. 2405.00602 null
2024-05-01 Are Models Biased on Text without Gender-related Language? Catarina G Belém et.al. 2405.00588 link
2024-05-01 The Real, the Better: Aligning Large Language Models with Online Human Behaviors Guanying Jiang et.al. 2405.00578 null
2024-05-01 EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model Deng Li et.al. 2405.00574 null
2024-05-01 NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance Huan-Yi Su et.al. 2405.00566 null
2024-05-01 Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment Zhili Liu et.al. 2405.00557 null
2024-05-01 Long-Term Human Trajectory Prediction using 3D Dynamic Scene Graphs Nicolas Gorlo et.al. 2405.00552 link
2024-05-01 ChatBI: Towards Natural Language to Complex Business Intelligence SQL Jinqing Lian et.al. 2405.00527 null
2024-05-01 CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions Donghee Choi et.al. 2405.00523 null
2024-05-01 Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning Lucas-Andreï Thil et.al. 2405.00516 null
2024-05-01 GOLD: Geometry Problem Solver with Natural Language Description Jiaxin Zhang et.al. 2405.00494 link
2024-05-01 Is Temperature the Creativity Parameter of Large Language Models? Max Peeperkorn et.al. 2405.00492 link
2024-05-01 The Pyramid of Captions Delong Chen et.al. 2405.00485 null
2024-04-30 Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Yunhao Ge et.al. 2404.19752 null
2024-04-30 PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification Leon Garza et.al. 2404.19744 null
2024-04-30 Better & Faster Large Language Models via Multi-token Prediction Fabian Gloeckle et.al. 2404.19737 null
2024-04-30 A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications Steph Buongiorno et.al. 2404.19729 null
2024-04-30 PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games Steph Buongiorno et.al. 2404.19721 null
2024-04-30 Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns Constantinos Patsakis et.al. 2404.19715 null
2024-04-30 Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models Scott Sumpter et.al. 2404.19713 null
2024-04-30 When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively Tiziano Labruna et.al. 2404.19705 link
2024-04-30 Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners Chun Feng et.al. 2404.19696 null
2024-04-30 Towards Generalist Robot Learning from Internet Video: A Survey Robert McCarthy et.al. 2404.19664 null
2024-04-30 MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation Min Zhang et.al. 2404.19644 link
2024-04-30 On Training a Neural Network to Explain Binaries Alexander Interrante-Grant et.al. 2404.19631 null
2024-04-30 Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model Denys Godwin et.al. 2404.19609 null
2024-04-30 Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning Xuanli He et.al. 2404.19597 null
2024-04-30 RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing Yucheng Hu et.al. 2404.19543 link
2024-04-30 MoST: Multi-modality Scene Tokenization for Motion Prediction Norman Mu et.al. 2404.19531 null
2024-04-30 Do Large Language Models Understand Conversational Implicature – A case study with a Chinese sitcom Shisen Yue et.al. 2404.19509 link
2024-04-30 More Compute Is What You Need Zhen Guo et.al. 2404.19484 null
2024-05-01 Neuro-Vision to Language: Image Reconstruction and Language enabled Interaction via Brain Recordings Guobin Shen et.al. 2404.19438 null
2024-04-30 Can Large Language Models put 2 and 2 together? Probing for Entailed Arithmetical Relationships D. Panas et.al. 2404.19432 null
2024-04-29 Hallucination of Multimodal Large Language Models: A Survey Zechen Bai et.al. 2404.18930 link
2024-04-29 Holmes: Benchmark the Linguistic Competence of Language Models Andreas Waldis et.al. 2404.18923 null
2024-04-29 DPO Meets PPO: Reinforced Token Optimization for RLHF Han Zhong et.al. 2404.18922 null
2024-04-29 TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation Junhao Cheng et.al. 2404.18919 link
2024-04-29 Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting Fangcheng Liu et.al. 2404.18911 link
2024-04-29 Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking Hong Jin Kang et.al. 2404.18881 link
2024-04-29 More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness Aaron J. Li et.al. 2404.18870 link
2024-04-29 Truth-value judgment in language models: belief directions are context sensitive Stefan F. Schouten et.al. 2404.18865 null
2024-04-29 Performance-Aligned LLMs for Generating Fast Code Daniel Nichols et.al. 2404.18864 null
2024-04-29 A Survey on Vision Mamba: Models, Applications and Challenges Rui Xu et.al. 2404.18861 link
2024-04-29 VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning Aidan Z. H. Yang et.al. 2404.18852 null
2024-04-29 FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition Yuxuan Yan et.al. 2404.18848 null
2024-04-29 It’s Difficult to be Neutral – Human and LLM-based Sentiment Annotation of Patient Comments Petter Mæhlum et.al. 2404.18832 null
2024-04-29 Benchmarking Benchmark Leakage in Large Language Models Ruijie Xu et.al. 2404.18824 link
2024-04-29 AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering Wenxiang Zhao et.al. 2404.18816 null
2024-04-29 Unknown Script: Impact of Script on Cross-Lingual Transfer Wondimagegnhue Tsegaye Tufa et.al. 2404.18810 link
2024-04-29 Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Pat Verga et.al. 2404.18796 null
2024-04-29 PECC: Problem Extraction and Coding Challenges Patrick Haller et.al. 2404.18766 link
2024-04-29 Transitive Vision-Language Prompt Learning for Domain Generalization Liyuan Wang et.al. 2404.18758 null
2024-04-29 Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models Hongyi Zhu et.al. 2404.18746 null
2024-04-26 Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo Stephen Zhao et.al. 2404.17546 link
2024-04-26 Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models Yuhang Huang et.al. 2404.17534 null
2024-04-26 Large Language Model Agent as a Mechanical Designer Yayati Jadhav et.al. 2404.17525 null
2024-04-26 On the Use of Large Language Models to Generate Capability Ontologies Luis Miguel Vieira da Silva et.al. 2404.17524 link
2024-04-26 Enhancing Legal Compliance and Regulation Analysis with Large Language Models Shabnam Hassani et.al. 2404.17522 null
2024-04-26 A Comprehensive Evaluation on Event Reasoning of Large Language Models Zhengwei Tao et.al. 2404.17513 link
2024-04-26 CEval: A Benchmark for Evaluating Counterfactual Text Generation Van Bach Nguyen et.al. 2404.17475 link
2024-04-26 Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System Robin Schmucker et.al. 2404.17460 null
2024-04-26 “ChatGPT Is Here to Help, Not to Replace Anybody” – An Evaluation of Students’ Opinions On Integrating ChatGPT In CS Courses Bruno Pereira Cipriano et.al. 2404.17443 null
2024-04-26 PromptCIR: Blind Compressed Image Restoration with Prompt Learning Bingchen Li et.al. 2404.17433 link
2024-04-26 Evaluation of Geographical Distortions in Language Models: A Crucial Step Towards Equitable Representations Rémy Decoupes et.al. 2404.17401 null
2024-04-26 UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning Maoxun Yuan et.al. 2404.17360 null
2024-04-26 InspectorRAGet: An Introspection Platform for RAG Evaluation Kshitij Fadnis et.al. 2404.17347 link
2024-04-26 Introducing cosmosGPT: Monolingual Training for Turkish Language Models H. Toprak Kesgin et.al. 2404.17336 null
2024-04-26 A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation Xin Zhang et.al. 2404.17335 null
2024-04-26 An Extendable Cloud-Native Alloy Property Explorer Zhuoyuan Li et.al. 2404.17330 link
2024-04-26 When to Trust LLMs: Aligning Confidence with Response Quality Shuchang Tao et.al. 2404.17287 link
2024-04-26 Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM Xuan Zhang et.al. 2404.17283 link
2024-04-26 Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT as a Pivot Michelle Terblanche et.al. 2404.17216 null
2024-04-26 Low-Rank Knowledge Decomposition for Medical Foundation Models Yuhang Zhou et.al. 2404.17184 link
2024-04-25 The Third Monocular Depth Estimation Challenge Jaime Spencer et.al. 2404.16831 null
2024-04-25 Make-it-Real: Unleashing Large Multimodal Model’s Ability for Painting 3D Objects with Realistic Materials Ye Fang et.al. 2404.16829 null
2024-04-25 V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection Xuanyu Zhang et.al. 2404.16824 null
2024-04-25 How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Zhe Chen et.al. 2404.16821 link
2024-04-25 IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages Harman Singh et.al. 2404.16816 link
2024-04-26 Make Your LLM Fully Utilize the Context Shengnan An et.al. 2404.16811 link
2024-04-25 Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning Tianhui Zhang et.al. 2404.16807 link
2024-04-25 AAPL: Adding Attributes to Prompt Learning for Vision-Language Models Gahyeon Kim et.al. 2404.16804 link
2024-04-25 Weak-to-Strong Extrapolation Expedites Alignment Chujie Zheng et.al. 2404.16792 link
2024-04-25 SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension Bohao Li et.al. 2404.16790 link
2024-04-25 Continual Learning of Large Language Models: A Comprehensive Survey Haizhou Shi et.al. 2404.16789 link
2024-04-25 Modeling Selective Feature Attention for Representation-based Siamese Text Matching Jianxiang Zang et.al. 2404.16776 link
2024-04-25 REBEL: Reinforcement Learning via Regressing Relative Rewards Zhaolin Gao et.al. 2404.16767 link
2024-04-25 Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model Runzhe Zhan et.al. 2404.16766 null
2024-04-25 RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis Xiaoman Zhang et.al. 2404.16754 link
2024-04-25 Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class Mazda Moayeri et.al. 2404.16717 null
2024-04-25 Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding Mostafa Elhoushi et.al. 2404.16710 link
2024-04-25 Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents Giorgio Piatti et.al. 2404.16698 link
2024-04-25 Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 Lydia Uhler et.al. 2404.16692 null
2024-04-25 EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning Hongxia Xie et.al. 2404.16670 link
2024-04-24 Hybrid LLM/Rule-based Approaches to Business Insights Generation from Structured Data Aliaksei Vertsel et.al. 2404.15604 null
2024-04-24 ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction Henry Peng Zou et.al. 2404.15592 link
2024-04-24 MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis Jiaxin Zhuang et.al. 2404.15580 null
2024-04-24 Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Hossein Salami et.al. 2404.15578 null
2024-04-24 Retrieval Head Mechanistically Explains Long-Context Factuality Wenhao Wu et.al. 2404.15574 link
2024-04-23 PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models Shashi Kant Gupta et.al. 2404.15549 null
2024-04-23 BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis Shuhang Lin et.al. 2404.15532 link
2024-04-23 Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models Mihir Parmar et.al. 2404.15522 link
2024-04-23 Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval Young Kyun Jang et.al. 2404.15516 null
2024-04-23 ToM-LM: Delegating Theory Of Mind Reasoning to External Symbolic Executors in Large Language Models Weizhi Tang et.al. 2404.15515 null
2024-04-23 IryoNLP at MEDIQA-CORR 2024: Tackling the Medical Error Detection & Correction Task On the Shoulders of Medical Agents Jean-Philippe Corbeil et.al. 2404.15488 link
2024-04-23 Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance Het Patel et.al. 2404.15485 null
2024-04-23 Can Large Language Models Learn the Physics of Metamaterials? An Empirical Study with ChatGPT Darui Lu et.al. 2404.15458 null
2024-04-23 XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference João Monteiro et.al. 2404.15420 null
2024-04-23 Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs Davide Caffagni et.al. 2404.15406 null
2024-04-23 Aligning LLM Agents by Learning Latent Preference from User Edits Ge Gao et.al. 2404.15269 link
2024-04-23 XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts Yifeng Ding et.al. 2404.15247 link
2024-04-23 CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies Weiyan Shi et.al. 2404.15238 link
2024-04-23 Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models Aidan Z. H. Yang et.al. 2404.15236 null
2024-04-23 Re-Thinking Inverse Graphics With Large Language Models Peter Kulits et.al. 2404.15228 null
2024-04-23 Does Instruction Tuning Make LLMs More Consistent? Constanza Fierro et.al. 2404.15206 null
2024-04-23 Setting up the Data Printer with Improved English to Ukrainian Machine Translation Yurii Paniv et.al. 2404.15196 link
2024-04-23 Regressive Side Effects of Training Language Models to Mimic Student Misconceptions Shashank Sonkar et.al. 2404.15156 null
2024-04-23 Bias patterns in the application of LLMs for clinical decision support: A comprehensive study Raphael Poulain et.al. 2404.15149 link
2024-04-23 Rethinking LLM Memorization through the Lens of Adversarial Compression Avi Schwarzschild et.al. 2404.15146 null
2024-04-23 MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning Sunan He et.al. 2404.15127 link
2024-04-23 Identifying Fairness Issues in Automatically Generated Testing Content Kevin Stowe et.al. 2404.15104 null
2024-04-23 Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation Xun Wu et.al. 2404.15100 null
2024-04-23 Detection of circular permutations by Protein Language Models Yue Hu et.al. 2404.15087 link
2024-04-23 Multi-Head Mixture-of-Experts Xun Wu et.al. 2404.15045 link
2024-04-23 TAXI: Evaluating Categorical Knowledge Editing for Language Models Derek Powell et.al. 2404.15004 link
2024-04-23 Transformers Can Represent $n$-gram Language Models Anej Svete et.al. 2404.14994 null
2024-04-23 A Short Review for Ontology Learning from Text: Stride from Shallow Learning, Deep Learning to Large Language Models Trend Rick Du et.al. 2404.14991 null
2024-04-23 $\texttt{MiniMol}$: A Parameter-Efficient Foundation Model for Molecular Learning Kerstin Kläser et.al. 2404.14986 null
2024-04-23 Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case Muhammad Asif Auyb et.al. 2404.14977 null
2024-04-22 AutoAD III: The Prequel – Back to the Pixels Tengda Han et.al. 2404.14412 null
2024-04-22 SpaceByte: Towards Deleting Tokenization from Large Language Modeling Kevin Slagle et.al. 2404.14408 link
2024-04-22 RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios? Adrian de Wynter et.al. 2404.14397 link
2024-04-22 SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation Yuying Ge et.al. 2404.14396 link
2024-04-22 PARAMANU-GANITA: Language Model with Mathematical Capabilities Mitodru Niyogi et.al. 2404.14395 null
2024-04-22 A Multimodal Automated Interpretability Agent Tamar Rott Shaham et.al. 2404.14394 null
2024-04-22 A Survey on Self-Evolution of Large Language Models Zhengwei Tao et.al. 2404.14387 link
2024-04-22 Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph Xiaochen Kev Gao et.al. 2404.14372 link
2024-04-23 Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data Fahim Tajwar et.al. 2404.14367 link
2024-04-22 Better Synthetic Data by Retrieving and Transforming Existing Datasets Saumya Gandhi et.al. 2404.14361 link
2024-04-22 Rethinking Legal Compliance Automation: Opportunities with Large Language Models Shabnam Hassani et.al. 2404.14356 null
2024-04-22 Calc-CMU at SemEval-2024 Task 7: Pre-Calc – Learning to Use the Calculator Improves Numeracy in Language Models Vishruth Veerendranath et.al. 2404.14355 link
2024-04-22 Automated Long Answer Grading with RiceChem Dataset Shashank Sonkar et.al. 2404.14316 link
2024-04-22 Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels Jan-Philipp Fränken et.al. 2404.14313 link
2024-04-22 Explaining Arguments’ Strength: Unveiling the Role of Attacks and Supports (Technical Report) Xiang Yin et.al. 2404.14304 link
2024-04-22 Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits Shashank Sonkar et.al. 2404.14301 null
2024-04-22 Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach Yao Wan et.al. 2404.14296 link
2024-04-22 A Survey on Efficient Inference for Large Language Models Zixuan Zhou et.al. 2404.14294 null
2024-04-22 LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots Dongge Han et.al. 2404.14285 null
2024-04-22 Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback Wenyi Xiao et.al. 2404.14233 null
2024-04-19 MoVA: Adapting Mixture of Vision Experts to Multimodal Context Zhuofan Zong et.al. 2404.13046 link
2024-04-19 Unified Scene Representation and Reconstruction for 3D Large Language Models Tao Chu et.al. 2404.13044 null
2024-04-19 Data Alignment for Zero-Shot Concept Generation in Dermatology AI Soham Gadgil et.al. 2404.13043 null
2024-04-19 Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs Biyang Guo et.al. 2404.13033 link
2024-04-19 When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering Stephen Choi et.al. 2404.13028 null
2024-04-19 Stronger Random Baselines for In-Context Learning Gregory Yauney et.al. 2404.13020 link
2024-04-19 Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Chuofan Ma et.al. 2404.13013 link
2024-04-19 Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs Clemencia Siro et.al. 2404.12994 link
2024-04-19 FineRec: Exploring Fine-grained Sequential Recommendation Xiaokun Zhang et.al. 2404.12975 link
2024-04-19 Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models Yian Li et.al. 2404.12966 null
2024-04-19 Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction Qinyuan Wu et.al. 2404.12957 null
2024-04-19 Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models Konstantinos Vilouras et.al. 2404.12920 null
2024-04-19 Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models Zhenyang Ni et.al. 2404.12916 link
2024-04-19 Large Language Models for Networking: Workflow, Advances and Challenges Chang Liu et.al. 2404.12901 null
2024-04-19 Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning Ahmed Elshabrawy et.al. 2404.12897 null
2024-04-19 Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation Guanhua Chen et.al. 2404.12879 null
2024-04-19 LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency Zhaodonghui Li et.al. 2404.12872 link
2024-04-19 How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning? Yang Luo et.al. 2404.12866 link
2024-04-19 Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation Yilong Chen et.al. 2404.12861 null
2024-04-19 TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages Aleksei Dorkin et.al. 2404.12845 null
2024-04-18 BLINK: Multimodal Large Language Models Can See but Not Perceive Xingyu Fu et.al. 2404.12390 null
2024-04-18 Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models Aitor Ormazabal et.al. 2404.12387 null
2024-04-18 MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale Xiaotang Gai et.al. 2404.12372 null
2024-04-18 When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes Asaf Yehudai et.al. 2404.12365 link
2024-04-18 From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function Rafael Rafailov et.al. 2404.12358 null
2024-04-18 Towards a Foundation Model for Partial Differential Equation: Multi-Operator Learning and Extrapolation Jingmin Sun et.al. 2404.12355 link
2024-04-18 V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning Hang Hua et.al. 2404.12353 null
2024-04-18 Evaluating AI for Law: Bridging the Gap with Open-Source Solutions Rohan Bhambhoria et.al. 2404.12349 null
2024-04-18 Large Language Models in Targeted Sentiment Analysis Nicolay Rusnachenko et.al. 2404.12342 link
2024-04-18 Normative Requirements Operationalization with Large Language Models Nick Feng et.al. 2404.12335 null
2024-04-18 Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment Zhaofeng Wu et.al. 2404.12318 null
2024-04-18 Large Language Models for Synthetic Participatory Planning of Shared Automated Electric Mobility Systems Jiangbo Yu et.al. 2404.12317 null
2024-04-18 Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair Yusuke Sakai et.al. 2404.12299 null
2024-04-18 Augmenting emotion features in irony detection with Large language modeling Yucheng Lin et.al. 2404.12291 null
2024-04-18 Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery Yona Falinie A. Gaus et.al. 2404.12285 null
2024-04-18 Enhancing Embedding Performance through Large Language Model-based Text Enrichment and Rewriting Nicholas Harris et.al. 2404.12283 null
2024-04-18 Advancing the Robustness of Large Language Models through Self-Denoised Smoothing Jiabao Ji et.al. 2404.12274 link
2024-04-18 FedEval-LLM: Federated Evaluation of Large Language Models on Downstream Tasks with Collective Wisdom Yuanqin He et.al. 2404.12273 null
2024-04-18 Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences Shreya Shankar et.al. 2404.12272 null
2024-04-18 Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM Michelle S. Lam et.al. 2404.12259 link
2024-04-17 Private federated discovery of out-of-vocabulary words for Gboard Ziteng Sun et.al. 2404.11607 null
2024-04-17 VG4D: Vision-Language Model Goes 4D Video Recognition Zhichao Deng et.al. 2404.11605 link
2024-04-17 A Deep Dive into Large Language Models for Automated Bug Localization and Repair Soneya Binta Hossain et.al. 2404.11595 null
2024-04-17 Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding Zezhong Fan et.al. 2404.11589 null
2024-04-17 LLMTune: Accelerate Database Knob Tuning with Large Language Models Xinmei Huang et.al. 2404.11581 link
2024-04-17 On the Scalability of GNNs for Molecular Graphs Maciej Sypetkowski et.al. 2404.11568 null
2024-04-17 MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation Kuan-Chieh et.al. 2404.11565 null
2024-04-17 Quantifying Multilingual Performance of Large Language Models Across Languages Zihao Li et.al. 2404.11553 null
2024-04-17 Evaluating Span Extraction in Generative Paradigm: A Reflection on Aspect-Based Sentiment Analysis Soyoung Yang et.al. 2404.11539 null
2024-04-17 FedPFT: Federated Proxy Fine-Tuning of Foundation Models Zhaopeng Peng et.al. 2404.11536 link
2024-04-17 Select and Reorder: A Novel Approach for Neural Sign Language Production Harry Walsh et.al. 2404.11532 null
2024-04-17 Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization Costas Mavromatis et.al. 2404.11531 link
2024-04-17 Embedding Privacy in Computational Social Science and Artificial Intelligence Research Keenan Jones et.al. 2404.11515 null
2024-04-17 Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models Yushuo Chen et.al. 2404.11502 link
2024-04-17 Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models Yue Zhou et.al. 2404.11500 link
2024-04-18 Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent Wei Chen et.al. 2404.11459 null
2024-04-17 Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models Sunhao Dai et.al. 2404.11457 link
2024-04-17 AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts Meng Jiang et.al. 2404.11449 link
2024-04-17 Open-Ended Wargames with Large Language Models Daniel P. Hogan et.al. 2404.11446 link
2024-04-17 DUPE: Detection Undermining via Prompt Engineering for Deepfake Text James Weichert et.al. 2404.11408 null
2024-04-16 Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback Qiwei Di et.al. 2404.10776 null
2024-04-16 COMBO: Compositional World Models for Embodied Multi-Agent Cooperation Hongxin Zhang et.al. 2404.10775 null
2024-04-16 Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification Yu-Yang Li et.al. 2404.10757 link
2024-04-16 Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study Shusheng Xu et.al. 2404.10719 link
2024-04-16 Dual Modalities of Text: Visual and Textual Generative Pre-training Yekun Chai et.al. 2404.10710 link
2024-04-16 Question Difficulty Ranking for Multiple-Choice Reading Comprehension Vatsal Raina et.al. 2404.10704 null
2024-04-16 An empirical study on code review activity prediction in practice Doriane Olewicki et.al. 2404.10703 null
2024-04-16 Automating REST API Postman Test Cases Using LLM S Deepika Sri et.al. 2404.10678 null
2024-04-16 Self-playing Adversarial Language Game Enhances LLM Reasoning Pengyu Cheng et.al. 2404.10642 link
2024-04-16 HLAT: High-quality Large Language Model Pre-trained on AWS Trainium Haozheng Fan et.al. 2404.10630 link
2024-04-16 Private Attribute Inference from Images with Vision-Language Models Batuhan Tömekçe et.al. 2404.10618 null
2024-04-16 Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases Yanze Li et.al. 2404.10595 null
2024-04-16 Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training Masanori Hirano et.al. 2404.10555 null
2024-04-16 Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning Xiao Wang et.al. 2404.10552 null
2024-04-16 Capturing the Macroscopic Behaviour of Molecular Dynamics with Membership Functions Alexander Sikorski et.al. 2404.10523 link
2024-04-16 CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Moshe Berchansky et.al. 2404.10513 null
2024-04-16 White Men Lead, Black Women Help: Uncovering Gender, Racial, and Intersectional Bias in Language Agency Yixin Wan et.al. 2404.10508 null
2024-04-16 Self-Supervised Visual Preference Alignment Ke Zhu et.al. 2404.10501 link
2024-04-16 When Emotional Stimuli meet Prompt Designing: An Auto-Prompt Graphical Paradigm Chenggian Ma et.al. 2404.10500 null
2024-04-16 Spiral of Silences: How is Large Language Model Killing Information Retrieval? – A Case Study on Open Domain Question Answering Xiaoyang Chen et.al. 2404.10496 link
2024-04-15 KG-CTG: Citation Generation through Knowledge Graph-guided Large Language Models Avinash Anand et.al. 2404.09763 null
2024-04-15 Resilience of Large Language Models for Noisy Instructions Bin Wang et.al. 2404.09754 null
2024-04-15 Personalized Collaborative Fine-Tuning for On-Device Large Language Models Nicolas Wagner et.al. 2404.09753 link
2024-04-15 AMPCliff: quantitative definition and benchmarking of activity cliffs in antimicrobial peptides Kewei Li et.al. 2404.09738 link
2024-04-15 Quantization of Large Language Models with an Overdetermined Basis Daniil Merkulov et.al. 2404.09737 null
2024-04-15 Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models Ziwei Luo et.al. 2404.09732 link
2024-04-15 Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model Hyunsoo Cho et.al. 2404.09717 null
2024-04-15 Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction David Sobrín-Hidalgo et.al. 2404.09705 null
2024-04-15 Generative AI for Game Theory-based Mobile Networking Long He et.al. 2404.09699 null
2024-04-15 Are Large Language Models Reliable Argument Quality Annotators? Nailia Mirzakhmedova et.al. 2404.09696 link
2024-04-15 LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models Guangyan Li et.al. 2404.09695 null
2024-04-15 Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation Juhwan Choi et.al. 2404.09682 link
2024-04-15 Learn Your Reference Model for Real Good Alignment Alexey Gorbatovski et.al. 2404.09656 null
2024-04-15 Do LLMs Understand Visual Anomalies? Uncovering LLM Capabilities in Zero-shot Anomaly Detection Jiaqi Zhu et.al. 2404.09654 null
2024-04-15 Bridging Vision and Language Spaces with Assignment Prediction Jungin Park et.al. 2404.09632 link
2024-04-15 AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception Yipo Huang et.al. 2404.09624 link
2024-04-15 UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark Zhaokun Zhou et.al. 2404.09619 null
2024-04-15 A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions Pengfei Liu et.al. 2404.09606 link
2024-04-15 Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction Zepeng Ding et.al. 2404.09593 null
2024-04-15 Modelling Language Jumbly Grindrod et.al. 2404.09579 null
2024-04-15 Transformers, Contextualism, and Polysemy Jumbly Grindrod et.al. 2404.09577 link
2024-04-15 Large language models and linguistic intentionality Jumbly Grindrod et.al. 2404.09576 null
2024-04-12 Probing the 3D Awareness of Visual Foundation Models Mohamed El Banani et.al. 2404.08636 link
2024-04-12 Pre-training Small Base LMs with Fewer Tokens Sunny Sanyal et.al. 2404.08634 link
2024-04-12 FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models Yanting Wang et.al. 2404.08631 link
2024-04-12 Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation Yanhao Zheng et.al. 2404.08603 link
2024-04-12 Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts Övgü Özdemir et.al. 2404.08589 link
2024-04-12 Pathological Primitive Segmentation Based on Visual Foundation Model with Zero-Shot Mask Generation Abu Bakor Hayat Arnob et.al. 2404.08584 link
2024-04-12 FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation Riza Velioglu et.al. 2404.08582 link
2024-04-12 Lossy Image Compression with Foundation Diffusion Models Lucas Relic et.al. 2404.08580 null
2024-04-12 Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation Hanlin Tian et.al. 2404.08570 link
2024-04-12 RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs Shreyas Chaudhari et.al. 2404.08555 null
2024-04-12 Memory Traces: Are Transformers Tulving Machines? Jean-Marie Chauvet et.al. 2404.08543 null
2024-04-12 Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward Xuan Xie et.al. 2404.08517 null
2024-04-12 ChatGPT and general-purpose AI count fruits in pictures surprisingly well Konlavach Mengsuwan et.al. 2404.08515 null
2024-04-12 Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction Haoran Qiu et.al. 2404.08509 link
2024-04-12 LaSagnA: Language-based Segmentation Assistant for Complex Queries Cong Wei et.al. 2404.08506 link
2024-04-12 Strategic Interactions between Large Language Models-based Agents in Beauty Contests Siting Lu et.al. 2404.08492 null
2024-04-12 Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation Haozhe Zhao et.al. 2404.08491 link
2024-04-12 Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian Stefano De Paoli et.al. 2404.08488 null
2024-04-12 Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task Hassan Ali et.al. 2404.08424 null
2024-04-12 Adapting the Segment Anything Model During Usage in Novel Situations Robin Schön et.al. 2404.08421 null
2024-04-11 OpenBias: Open-set Bias Detection in Text-to-Image Generative Models Moreno D’Incà et.al. 2404.07990 link
2024-04-11 Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding Yiwen Tang et.al. 2404.07989 link
2024-04-11 Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning Simon Schrodi et.al. 2404.07983 null
2024-04-11 Language Imbalance Can Boost Cross-lingual Generalisation Anton Schäfer et.al. 2404.07982 link
2024-04-11 Manipulating Large Language Models to Increase Product Visibility Aounon Kumar et.al. 2404.07981 link
2024-04-11 LLoCO: Learning Long Contexts Offline Sijun Tan et.al. 2404.07979 link
2024-04-11 Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models Haotian Zhang et.al. 2404.07973 null
2024-04-11 Rho-1: Not All Tokens Are What You Need Zhenghao Lin et.al. 2404.07965 link
2024-04-11 On Unified Prompt Tuning for Request Quality Assurance in Public Code Review Xinyu Chen et.al. 2404.07942 null
2024-04-11 Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation Jinkyung Park et.al. 2404.07926 null
2024-04-11 LaVy: Vietnamese Multimodal Large Language Model Chi Tran et.al. 2404.07922 link
2024-04-11 AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs Zeyi Liao et.al. 2404.07921 link
2024-04-11 DesignQA: A Multimodal Benchmark for Evaluating Large Language Models’ Understanding of Engineering Documentation Anna C. Doris et.al. 2404.07917 link
2024-04-11 HGRN2: Gated Linear RNNs with State Expansion Zhen Qin et.al. 2404.07904 link
2024-04-11 High-Dimension Human Value Representation in Large Language Models Samuel Cahyawijaya et.al. 2404.07900 link
2024-04-11 Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations Dayeon Ki et.al. 2404.07851 link
2024-04-11 On Training Data Influence of GPT Models Qingyi Liu et.al. 2404.07840 link
2024-04-11 RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Aleksandar Botev et.al. 2404.07839 link
2024-04-11 Streamlined Photoacoustic Image Processing with Foundation Models: A Training-Free Solution Handi Deng et.al. 2404.07833 null
2024-04-11 Heron-Bench: A Benchmark for Evaluating Vision Language Models in Japanese Yuichi Inoue et.al. 2404.07824 link
2024-04-10 BRAVE: Broadening the visual encoding of vision-language models Oğuzhan Fatih Kar et.al. 2404.07204 null
2024-04-10 UMBRAE: Unified Multimodal Decoding of Brain Signals Weihao Xia et.al. 2404.07202 link
2024-04-10 Scaling Laws for Data Filtering – Data Curation cannot be Compute Agnostic Sachin Goyal et.al. 2404.07177 link
2024-04-10 Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Tsendsuren Munkhdalai et.al. 2404.07143 null
2024-04-10 Open reaction-diffusion systems: bridging probabilistic theory across scales Mauricio J. del Razo et.al. 2404.07119 null
2024-04-10 Continuous Language Model Interpolation for Dynamic and Controllable Text Generation Sara Kangaslahti et.al. 2404.07117 link
2024-04-11 From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications Yongqiang Ma et.al. 2404.07108 null
2024-04-10 Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs Bowen Jin et.al. 2404.07103 link
2024-04-10 Dynamic Generation of Personalities with Large Language Models Jianzhi Liu et.al. 2404.07084 link
2024-04-10 VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning Alexandros Xenos et.al. 2404.07078 link
2024-04-10 Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? Mingyu Jin et.al. 2404.07066 link
2024-04-10 Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study Alessandro Stolfo et.al. 2404.07060 null
2024-04-10 Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation Elisa Sanchez-Bayona et.al. 2404.07053 link
2024-04-10 ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling Ege Özsoy et.al. 2404.07031 link
2024-04-10 Improving Language Model Reasoning with Self-motivated Learning Yunlong Feng et.al. 2404.07017 null
2024-04-10 A Mathematical Theory for Learning Semantic Languages by Abstract Learners Kuo-Yu Liao et.al. 2404.07009 null
2024-04-10 WordDecipher: Enhancing Digital Workspace Communication with Explainable AI for Non-native English Speakers Yuexi Chen et.al. 2404.07005 null
2024-04-10 LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models Igor Tufanov et.al. 2404.07004 null
2024-04-10 Event Grounded Criminal Court View Generation with Cooperative (Large) Language Models Linan Yue et.al. 2404.07001 link
2024-04-10 Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study Hongru Du et.al. 2404.06962 link
2024-04-09 InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD Xiaoyi Dong et.al. 2404.06512 link
2024-04-09 Can Feedback Enhance Semantic Grounding in Large Vision-Language Models? Yuan-Hong Liao et.al. 2404.06510 null
2024-04-09 On the Effect of (Near) Duplicate Subwords in Language Modelling Anton Schäfer et.al. 2404.06508 link
2024-04-09 Pitfalls of Conversational LLMs on News Debiasing Ipek Baris Schlicht et.al. 2404.06488 null
2024-04-10 Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks Chonghua Wang et.al. 2404.06480 link
2024-04-10 Text-Based Reasoning About Vector Graphics Zhenhailong Wang et.al. 2404.06479 null
2024-04-09 Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models Zihan Fang et.al. 2404.06448 null
2024-04-09 Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems Kunal Garg et.al. 2404.06413 null
2024-04-09 AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents Luca Gioacchini et.al. 2404.06411 link
2024-04-09 Take a Look at it! Rethinking How to Evaluate Language Model Jailbreak Hongyu Cai et.al. 2404.06407 link
2024-04-09 Apprentices to Research Assistants: Advancing Research with Large Language Models M. Namvarpour et.al. 2404.06404 null
2024-04-09 MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies Shengding Hu et.al. 2404.06395 link
2024-04-09 MuPT: A Generative Symbolic Music Pretrained Transformer Xingwei Qu et.al. 2404.06393 null
2024-04-09 Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis Mikel Zubillaga et.al. 2404.06392 null
2024-04-09 Latent Distance Guided Alignment Training for Large Language Models Haotian Luo et.al. 2404.06390 null
2024-04-09 Model Generation from Requirements with LLMs: an Exploratory Study Alessio Ferrari et.al. 2404.06371 null
2024-04-09 Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python Valdecy Pereira et.al. 2404.06370 link
2024-04-09 VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs Yi Gui et.al. 2404.06369 null
2024-04-09 ClinLinker: Medical Entity Linking of Clinical Concept Mentions in Spanish Fernando Gallego et.al. 2404.06367 null
2024-04-09 Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation Sidra Aleem et.al. 2404.06362 link
2024-04-08 MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding Bo He et.al. 2404.05726 link
2024-04-08 Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Keen You et.al. 2404.05719 null
2024-04-08 Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Ahmad Idrissi-Yaghir et.al. 2404.05694 null
2024-04-08 Evaluating Mathematical Reasoning Beyond Accuracy Shijie Xia et.al. 2404.05692 link
2024-04-08 Retrieval-Augmented Open-Vocabulary Object Detection Jooyeon Kim et.al. 2404.05687 link
2024-04-08 MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation Kunpeng Song et.al. 2404.05674 link
2024-04-08 CoReS: Orchestrating the Dance of Reasoning and Segmentation Xiaoyi Bao et.al. 2404.05673 null
2024-04-08 Fighting crime with Transformers: Empirical analysis of address parsing methods in payment data Haitham Hammami et.al. 2404.05632 link
2024-04-08 LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking Faren Yan et.al. 2404.05624 null
2024-04-08 MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning Matteo Farina et.al. 2404.05621 link
2024-04-08 SpeechAlign: Aligning Speech Generation to Human Preferences Dong Zhang et.al. 2404.05600 link
2024-04-08 MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering Iñigo Alonso et.al. 2404.05590 null
2024-04-08 Enhancing Software Related Information Extraction with Generative Language Models through Single-Choice Question Answering Wolfgang Otto et.al. 2404.05587 null
2024-04-08 Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model Yue-Hua Han et.al. 2404.05583 null
2024-04-08 360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System Shen Gao et.al. 2404.05569 link
2024-04-08 Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models Bowen Pan et.al. 2404.05567 null
2024-04-08 Chinese Sequence Labeling with Semi-Supervised Boundary-Aware Language Model Pre-training Longhui Zhang et.al. 2404.05560 link
2024-04-08 Evaluating Interventional Reasoning Capabilities of Large Language Models Tejas Kasetty et.al. 2404.05545 null
2024-04-08 OPSD: an Offensive Persian Social media Dataset and its baseline evaluations Mehran Safayani et.al. 2404.05540 null
2024-04-08 Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data Tim Baumgärtner et.al. 2404.05530 null
2024-04-05 Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2) Michael Saxon et.al. 2404.04251 link
2024-04-05 Physical Property Understanding from Language-Embedded Feature Fields Albert J. Zhai et.al. 2404.04242 null
2024-04-05 Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents Harsh Kohli et.al. 2404.04237 null
2024-04-05 player2vec: A Language Modeling Approach to Understand Player Behavior in Games Tianze Wang et.al. 2404.04234 null
2024-04-05 Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation Ji-Jia Wu et.al. 2404.04231 link
2024-04-05 Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation Tong Su et.al. 2404.04212 null
2024-04-05 Social Skill Training with Large Language Models Diyi Yang et.al. 2404.04204 null
2024-04-05 Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text? Ilya Ilyankou et.al. 2404.04169 null
2024-04-05 Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model Xinrun Du et.al. 2404.04167 null
2024-04-05 Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval João Coelho et.al. 2404.04163 link
2024-04-05 BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models Jacek Wiland et.al. 2404.04113 link
2024-04-05 Large language models as oracles for instantiating ontologies with domain-specific knowledge Giovanni Ciatto et.al. 2404.04108 link
2024-04-05 Robust Preference Optimization with Provable Noise Tolerance for LLMs Xize Liang et.al. 2404.04102 null
2024-04-05 Label Propagation for Zero-shot Classification with Vision-Language Models Vladan Stojnić et.al. 2404.04072 link
2024-04-05 Assessing the quality of information extraction Filip Seitl et.al. 2404.04068 null
2024-04-05 CLUE: A Clinical Language Understanding Evaluation for LLMs Amin Dada et.al. 2404.04067 link
2024-04-05 VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots Akhil Padmanabha et.al. 2404.04066 null
2024-04-05 A Comparison of Methods for Evaluating Generative IR Negar Arabzadeh et.al. 2404.04044 link
2024-04-05 Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer Hele-Andra Kuulmets et.al. 2404.04042 link
2024-04-05 Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds Annerose Eichel et.al. 2404.04031 link
2024-04-04 OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views Francis Engelmann et.al. 2404.03650 null
2024-04-04 AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Hanyu Lai et.al. 2404.03648 link
2024-04-04 Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra Darioush Kevian et.al. 2404.03647 null
2024-04-04 Locating and Editing Factual Associations in Mamba Arnab Sen Sharma et.al. 2404.03646 link
2024-04-04 Training LLMs over Neurally Compressed Text Brian Lester et.al. 2404.03626 null
2024-04-04 Standardizing Knowledge Engineering Practices with a Reference Architecture Bradley P. Allen et.al. 2404.03624 null
2024-04-04 Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph Marco Bronzini et.al. 2404.03623 link
2024-04-04 Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models Wenshan Wu et.al. 2404.03622 null
2024-04-04 DeViDe: Faceted medical knowledge for improved medical vision-language pre-training Haozhe Luo et.al. 2404.03618 null
2024-04-04 Sailor: Open Language Models for South-East Asia Longxu Dou et.al. 2404.03608 link
2024-04-04 Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization Aniruddha Nrusimha et.al. 2404.03605 link
2024-04-04 Evaluating LLMs at Detecting Errors in LLM Responses Ryo Kamoi et.al. 2404.03602 link
2024-04-04 Intent Detection and Entity Extraction from BioMedical Literature Ankan Mullick et.al. 2404.03598 link
2024-04-04 ReFT: Representation Finetuning for Language Models Zhengxuan Wu et.al. 2404.03592 link
2024-04-04 SemGrasp: Semantic Grasp Generation via Language Aligned Discretization Kailin Li et.al. 2404.03590 null
2024-04-04 Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models Yantao Liu et.al. 2404.03577 link
2024-04-04 Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity Jake Varley et.al. 2404.03570 null
2024-04-04 Personalized LLM Response Generation with Parameterized Memory Injection Kai Zhang et.al. 2404.03565 null
2024-04-04 Select and Summarize: Scene Saliency for Movie Script Summarization Rohit Saxena et.al. 2404.03561 link
2024-04-04 How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes Harmon Bhasin et.al. 2404.03558 link
2024-04-03 ALOHa: A New Measure for Hallucination in Captioning Models Suzanne Petryk et.al. 2404.02904 null
2024-04-03 MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment Duygu Ceylan et.al. 2404.02899 null
2024-04-03 ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline Yifan Xu et.al. 2404.02893 link
2024-04-03 MODNO: Multi Operator Learning With Distributed Neural Operators Zecheng Zhang et.al. 2404.02892 null
2024-04-03 Linear Attention Sequence Parallelism Weigao Sun et.al. 2404.02882 link
2024-04-03 Integrating Explanations in Learning LTL Specifications from Demonstrations Ashutosh Gupta et.al. 2404.02872 null
2024-04-03 Toward Inference-optimal Mixture-of-Expert Large Language Models Longfei Yun et.al. 2404.02852 null
2024-04-03 I-Design: Personalized LLM Interior Designer Ata Çelen et.al. 2404.02838 null
2024-04-03 Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models Wanyun Cui et.al. 2404.02837 null
2024-04-03 Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison Maxime Bouthors et.al. 2404.02835 null
2024-04-03 Empowering Biomedical Discovery with AI Agents Shanghua Gao et.al. 2404.02831 null
2024-04-03 BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models Qijun Luo et.al. 2404.02827 link
2024-04-03 Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models Haoran Sun et.al. 2404.02823 link
2024-04-03 A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches Zhigen Zhao et.al. 2404.02817 null
2024-04-03 The RealHumanEval: Evaluating Large Language Models’ Abilities to Support Programmers Hussein Mozannar et.al. 2404.02806 link
2024-04-03 Efficient Multi-Vector Dense Retrieval Using Bit Vectors Franco Maria Nardini et.al. 2404.02805 link
2024-04-03 AI and personalized learning: bridging the gap with modern educational goals Kristjan-Julius Laak et.al. 2404.02798 null
2024-04-03 CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech Jaehyeon Kim et.al. 2404.02781 null
2024-04-03 FPT: Feature Prompt Tuning for Few-shot Readability Assessment Ziyang Wang et.al. 2404.02772 link
2024-04-03 DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement Hao Wu et.al. 2404.02755 null
2024-04-02 Segment Any 3D Object with Language Seungjun Lee et.al. 2404.02157 null
2024-04-02 Iterated Learning Improves Compositionality in Large Vision-Language Models Chenhao Zheng et.al. 2404.02145 null
2024-04-02 Topic-based Watermarks for LLM-Generated Text Alexander Nemecek et.al. 2404.02138 null
2024-04-02 ViTamin: Designing Scalable Vision Models in the Vision-Language Era Jieneng Chen et.al. 2404.02132 link
2024-04-02 FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning Joel Niklaus et.al. 2404.02127 link
2024-04-02 Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models Wanyong Feng et.al. 2404.02124 link
2024-04-02 GINopic: Topic Modeling with Graph Isomorphism Network Suman Adhya et.al. 2404.02115 link
2024-04-02 CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems Sara Rosenthal et.al. 2404.02103 link
2024-04-02 Advancing LLM Reasoning Generalists with Preference Trees Lifan Yuan et.al. 2404.02078 link
2024-04-02 Red-Teaming Segment Anything Model Krzysztof Jankowski et.al. 2404.02067 link
2024-04-02 Digital Forgetting in Large Language Models: A Survey of Unlearning Methods Alberto Blanco-Justicia et.al. 2404.02062 null
2024-04-02 Long-context LLMs Struggle with Long In-context Learning Tianle Li et.al. 2404.02060 link
2024-04-02 IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT Junchen Fu et.al. 2404.02059 link
2024-04-02 Deconstructing In-Context Learning: Understanding Prompts via Corruption Namrata Shivagunde et.al. 2404.02054 link
2024-04-02 A Survey on Large Language Model-Based Game Agents Sihao Hu et.al. 2404.02039 link
2024-04-02 MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages Daryna Dementieva et.al. 2404.02037 null
2024-04-02 Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts Zhuo Chen et.al. 2404.02022 link
2024-04-02 Large Language Models for Orchestrating Bimanual Robots Kun Chu et.al. 2404.02018 link
2024-04-02 MuxServe: Flexible Multiplexing for Efficient Multiple LLM Serving Jiangfei Duan et.al. 2404.02015 link
2024-04-02 Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models Stephan Linzbach et.al. 2404.01992 null
2024-03-29 Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models Atsuyuki Miyai et.al. 2403.20331 link
2024-03-29 Are We on the Right Way for Evaluating Large Vision-Language Models? Lin Chen et.al. 2403.20330 link
2024-03-29 ReALM: Reference Resolution As Language Modeling Joel Ruben Antony Moniz et.al. 2403.20329 null
2024-03-29 Gecko: Versatile Text Embeddings Distilled from Large Language Models Jinhyuk Lee et.al. 2403.20327 null
2024-03-29 Convolutional Prompting meets Language Models for Continual Learning Anurag Roy et.al. 2403.20317 null
2024-03-29 Learn “No” to Say “Yes” Better: Improving Vision-Language Models via Negations Jaisidh Singh et.al. 2403.20312 link
2024-03-29 Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference Jovan Stojkovic et.al. 2403.20306 null
2024-03-29 Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain Burcu Sayin et.al. 2403.20288 link
2024-03-29 LUQ: Long-text Uncertainty Quantification for LLMs Caiqi Zhang et.al. 2403.20279 link
2024-04-01 Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want Weifeng Lin et.al. 2403.20271 link
2024-03-29 Latxa: An Open Language Model and Evaluation Suite for Basque Julen Etxaniz et.al. 2403.20266 link
2024-03-29 ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models Thibaut Thonet et.al. 2403.20262 link
2024-03-29 MedCLIP-SAM: Bridging Text and Image Towards Universal Medical Image Segmentation Taha Koleilat et.al. 2403.20253 link
2024-03-29 Using LLMs to Model the Beliefs and Preferences of Targeted Populations Keiichi Namikoshi et.al. 2403.20252 null
2024-03-29 Long-Tailed Anomaly Detection with Learnable Class Names Chih-Hui Ho et.al. 2403.20236 null
2024-03-29 H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model Chao Pang et.al. 2403.20213 link
2024-03-29 Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science Yazheng Yang et.al. 2403.20208 null
2024-03-29 The Future of Combating Rumors? Retrieval, Discrimination, and Generation Junhao Xu et.al. 2403.20204 null
2024-03-29 ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models Shuo Liu et.al. 2403.20194 null
2024-03-29 HARMamba: Efficient Wearable Sensor Human Activity Recognition Based on Bidirectional Selective SSM Shuangjian Li et.al. 2403.20183 null
2024-03-28 RSMamba: Remote Sensing Image Classification with State Space Model Keyan Chen et.al. 2403.19654 link
2024-03-28 InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction Sirui Xu et.al. 2403.19652 null
2024-03-28 MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions Kai Zhang et.al. 2403.19651 link
2024-03-28 Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models Samuel Marks et.al. 2403.19647 link
2024-03-28 Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning Chenyang Liu et.al. 2403.19646 link
2024-03-28 Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models Yucheng Shi et.al. 2403.19631 link
2024-03-28 RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents Zeren Chen et.al. 2403.19622 null
2024-03-28 SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects Avinash Ummadisingu et.al. 2403.19607 null
2024-03-28 Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation Zhongliang Zhou et.al. 2403.19584 link
2024-03-28 Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics Norman Di Palo et.al. 2403.19578 null
2024-03-28 WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models Piotr Molenda et.al. 2403.19548 null
2024-03-28 Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models Ang Lv et.al. 2403.19521 link
2024-03-28 Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data Shan Chen et.al. 2403.19511 link
2024-03-28 LLMs as Academic Reading Companions: Extending HCI Through Synthetic Personae Celia Chen et.al. 2403.19506 null
2024-03-28 Evolving Assembly Code in an Adversarial Environment Irina Maliukov et.al. 2403.19489 link
2024-03-28 JDocQA: Japanese Document Question Answering Dataset for Generative Language Models Eri Onami et.al. 2403.19454 link
2024-03-28 Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model Qi Gou et.al. 2403.19443 null
2024-03-28 OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion Xinyu Zhan et.al. 2403.19417 null
2024-03-28 BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation Yuhong He et.al. 2403.19414 null
2024-03-28 Checkpoint Merging via Bayesian Optimization in LLM Pretraining Deyuan Liu et.al. 2403.19390 null
2024-03-27 Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Yanwei Li et.al. 2403.18814 link
2024-03-27 ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation Suraj Patni et.al. 2403.18807 link
2024-03-27 Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation Mateusz Klimaszewski et.al. 2403.18804 link
2024-03-27 Projective Methods for Mitigating Gender Bias in Pre-trained Language Models Hillary Dawkins et.al. 2403.18803 link
2024-03-27 Long-form factuality in large language models Jerry Wei et.al. 2403.18802 link
2024-03-27 Towards a World-English Language Model for On-Device Virtual Assistants Rricha Jalota et.al. 2403.18783 null
2024-03-27 3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation Ehsan Latif et.al. 2403.18778 null
2024-03-27 ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object Chenshuang Zhang et.al. 2403.18775 link
2024-03-27 CheckEval: Robust Evaluation Framework using Large Language Model via Checklist Yukyung Lee et.al. 2403.18771 null
2024-03-27 MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model Yike Wu et.al. 2403.18760 link
2024-03-27 CYCLE: Learning to Self-Refine the Code Generation Yangruibo Ding et.al. 2403.18746 link
2024-03-27 Understanding the Learning Dynamics of Alignment with Human Feedback Shawn Im et.al. 2403.18742 link
2024-03-27 PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Ehsan Latif et.al. 2403.18721 null
2024-03-27 Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding Xintong Wang et.al. 2403.18715 link
2024-03-27 The Invalsi Benchmark: measuring Language Models Mathematical and Language understanding in Italian Andrea Esuli et.al. 2403.18697 null
2024-03-27 NL-ITI: Optimizing Probing and Intervention for Improvement of ITI Method Jakub Hoscilowicz et.al. 2403.18680 link
2024-03-27 An Exploratory Study on Upper-Level Computing Students’ Use of Large Language Models as Tools in a Semester-Long Project Ben Arie Tanay et.al. 2403.18679 null
2024-03-27 SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens Chengbo Liu et.al. 2403.18647 link
2024-03-27 To Recommend or Not: Recommendability Identification in Conversations with Pre-trained Language Models Zhefan Wang et.al. 2403.18628 link
2024-03-27 Vulnerability Detection with Code Language Models: How Far Are We? Yangruibo Ding et.al. 2403.18624 link
2024-03-26 OmniVid: A Generative Framework for Universal Video Understanding Junke Wang et.al. 2403.17935 link
2024-03-26 Track Everything Everywhere Fast and Robustly Yunzhou Song et.al. 2403.17931 null
2024-03-26 MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution Wei Tao et.al. 2403.17927 null
2024-03-26 LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning Rui Pan et.al. 2403.17919 link
2024-03-26 Large scale paired antibody language models Henry Kenlay et.al. 2403.17889 null
2024-03-26 Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation Carlos Gomes et.al. 2403.17886 link
2024-03-26 MIND Your Language: A Multilingual Dataset for Cross-lingual News Recommendation Andreea Iana et.al. 2403.17876 link
2024-03-26 Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach Andrea Ferrario et.al. 2403.17873 null
2024-03-26 Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications Philip Lippmann et.al. 2403.17860 null
2024-03-26 ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages Bhawna Piryani et.al. 2403.17859 link
2024-03-26 Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs David R. Mortensen et.al. 2403.17856 null
2024-03-26 ArabicaQA: A Comprehensive Dataset for Arabic Question Answering Abdelrahman Abdallah et.al. 2403.17848 link
2024-03-26 Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation Abdelrhman Werby et.al. 2403.17846 null
2024-03-26 Mechanistic Design and Scaling of Hybrid Architectures Michael Poli et.al. 2403.17844 link
2024-03-26 ReMamber: Referring Image Segmentation with Mamba Twister Yuhuan Yang et.al. 2403.17839 link
2024-03-26 A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities Ibrahim Ethem Hamamci et.al. 2403.17834 link
2024-03-26 Assessment of Multimodal Large Language Models in Alignment with Human Values Zhelun Shi et.al. 2403.17830 null
2024-03-26 Accelerating Radio Spectrum Regulation Workflows with Large Language Models (LLMs) Amir Ghasemi et.al. 2403.17819 null
2024-03-26 Graph Language Model (GLM): A new graph-based approach to detect social instabilities Wallyson Lemes de Oliveira et.al. 2403.17816 null
2024-03-26 Are Compressed Language Models Less Subgroup Robust? Leonidas Gee et.al. 2403.17811 link
2024-03-25 Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making Shuai Ma et.al. 2403.16812 null
2024-03-25 An LLM-Based Digital Twin for Optimizing Human-in-the Loop Systems Hanqing Yang et.al. 2403.16809 link
2024-03-25 Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback Zhangqian Bi et.al. 2403.16792 link
2024-03-25 All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification Deepak Narayan Gadde et.al. 2403.16750 null
2024-03-25 A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models Nils Ingelhag et.al. 2403.16730 null
2024-03-25 ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search Zehan Li et.al. 2403.16702 link
2024-03-25 Synapse: Learning Preferential Concepts from Visual Demonstrations Sadanand Modak et.al. 2403.16689 null
2024-03-25 Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography Jiayue Zhang et.al. 2403.16687 null
2024-03-25 RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict Yirong Zeng et.al. 2403.16662 link
2024-03-25 Grammatical vs Spelling Error Correction: An Investigation into the Responsiveness of Transformer-based Language Models using BART and MarianMT Rohit Raju et.al. 2403.16655 null
2024-03-25 CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment Feiteng Fang et.al. 2403.16649 link
2024-03-25 Virtual Co-Pilot: Multimodal Large Language Model-enabled Quick-access Procedures for Single Pilot Operations Fan Li et.al. 2403.16645 null
2024-03-25 Semantically Enriched Cross-Lingual Sentence Embeddings for Crisis-related Social Media Texts Rabindra Lamsal et.al. 2403.16614 null
2024-03-25 Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units Biswesh Mohapatra et.al. 2403.16609 null
2024-03-25 TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques Ashok Urlana et.al. 2403.16592 null
2024-03-25 Can Large Language Models (or Humans) Distill Text? Nicolas Audinet de Pieuchon et.al. 2403.16584 link
2024-03-25 NSINA: A News Corpus for Sinhala Hansi Hettiarachchi et.al. 2403.16571 link
2024-03-25 Elysium: Exploring Object-level Perception in Videos via MLLM Han Wang et.al. 2403.16558 link
2024-03-25 DOrA: 3D Visual Grounding with Order-Aware Referring Tung-Yu Wu et.al. 2403.16539 null
2024-03-25 Open-Set Recognition in the Age of Vision-Language Models Dimity Miller et.al. 2403.16528 link
2024-03-25 Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art Neeloy Chakraborty et.al. 2403.16527 null
2024-03-25 Harnessing the power of LLMs for normative reasoning in MASs Bastin Tony Roy Savarimuthu et.al. 2403.16524 null
2024-03-25 Norm Violation Detection in Multi-Agent Systems using Large Language Models: A Pilot Study Shawn He et.al. 2403.16517 null
2024-03-25 Linguistically Differentiating Acts and Recalls of Racial Microaggressions on Social Media Uma Sushmitha Gunturi et.al. 2403.16514 null
2024-03-22 LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models Yuzhang Shang et.al. 2403.15388 null
2024-03-22 Long-CLIP: Unlocking the Long-Text Capability of CLIP Beichen Zhang et.al. 2403.15378 link
2024-03-22 InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Yi Wang et.al. 2403.15377 link
2024-03-22 Can large language models explore in-context? Akshay Krishnamurthy et.al. 2403.15371 null
2024-03-22 CoLLEGe: Concept Embedding Generation for Large Language Models Ryan Teehan et.al. 2403.15362 null
2024-03-22 Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities Zhitong Xiong et.al. 2403.15356 link
2024-03-22 Controlled Training Data Generation with Diffusion Models Teresa Yeo et.al. 2403.15309 null
2024-03-22 Sphere Neural-Networks for Rational Reasoning Tiansi Dong et.al. 2403.15297 null
2024-03-22 Measuring Gender and Racial Biases in Large Language Models Jiafu An et.al. 2403.15281 null
2024-03-22 Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review Jinge Wang et.al. 2403.15274 null
2024-03-22 Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs Xiaobin Zhang et.al. 2403.15273 null
2024-03-22 Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models Huanxuan Liao et.al. 2403.15268 link
2024-03-22 AI Exposure and Strategic Positioning on an Online Work Platform Shun Yiu et.al. 2403.15262 null
2024-03-22 FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions Orion Weller et.al. 2403.15246 link
2024-03-22 Shadow Generation for Composite Image Using Diffusion model Qingyang Liu et.al. 2403.15234 link
2024-03-22 An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets Jonathan Katzy et.al. 2403.15230 link
2024-03-22 Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models Qiong Wu et.al. 2403.15226 link
2024-03-22 Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations Pranav Kulkarni et.al. 2403.15218 link
2024-03-22 InstaSynth: Opportunities and Challenges in Generating Synthetic Instagram Data with ChatGPT for Sponsored Content Detection Thales Bertaglia et.al. 2403.15214 link
2024-03-22 MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection Taeheon Kim et.al. 2403.15209 null
2024-03-21 MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Renrui Zhang et.al. 2403.14624 null
2024-03-21 Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey Zeyu Han et.al. 2403.14608 null
2024-03-21 MyVLM: Personalizing VLMs for User-Specific Queries Yuval Alaluf et.al. 2403.14599 null
2024-03-21 ReAct Meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training Zonghan Yang et.al. 2403.14589 null
2024-03-21 Large Language Models for Multi-Choice Question Classification of Medical Subjects Víctor Ponce-López et.al. 2403.14582 null
2024-03-21 RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain William James Bolton et.al. 2403.14578 link
2024-03-21 A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students’ Formative Assessment Responses in Science Clayton Cohn et.al. 2403.14565 null
2024-03-21 The Era of Semantic Decoding Maxime Peyrard et.al. 2403.14562 null
2024-03-21 Lexicon-Level Contrastive Visual-Grounding Improves Language Modeling Chengxu Zhuang et.al. 2403.14551 null
2024-03-21 EDT: Improving Large Language Models’ Generation by Entropy-based Dynamic Temperature Sampling Shimao Zhang et.al. 2403.14541 link
2024-03-21 Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference Han Zhao et.al. 2403.14520 link
2024-03-21 The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs) Joschka Haltaufderheide et.al. 2403.14473 null
2024-03-21 Detoxifying Large Language Models via Knowledge Editing Mengru Wang et.al. 2403.14472 link
2024-03-21 ChatGPT Alternative Solutions: Large Language Models Survey Hanieh Alipour et.al. 2403.14469 null
2024-03-21 Recourse for reclamation: Chatting with generative language models Jennifer Chien et.al. 2403.14467 null
2024-03-21 Towards Single-System Illusion in Software-Defined Vehicles – Automated, AI-Powered Workflow Krzysztof Lebioda et.al. 2403.14460 null
2024-03-21 Multi-Level Explanations for Generative Language Models Lucas Monteiro Paes et.al. 2403.14459 null
2024-03-21 gTBLS: Generating Tables from Text by Conditional Question Answering Anirudh Sundar et.al. 2403.14457 null
2024-03-21 Language Models Can Reduce Asymmetry in Information Markets Nasim Rahaman et.al. 2403.14443 null
2024-03-21 A Multimodal Approach to Device-Directed Speech Detection with Large Language Models Dominik Wager et.al. 2403.14438 null
2024-03-20 RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition Ziyu Liu et.al. 2403.13805 link
2024-03-20 Learning from Models and Data for Visual Grounding Ruozhen He et.al. 2403.13804 null
2024-03-20 Reverse Training to Nurse the Reversal Curse Olga Golovneva et.al. 2403.13799 null
2024-03-20 Bridge the Modality and Capacity Gaps in Vision-Language Model Selection Chao Yi et.al. 2403.13797 null
2024-03-20 RewardBench: Evaluating Reward Models for Language Modeling Nathan Lambert et.al. 2403.13787 link
2024-03-20 Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts Guangzeng Han et.al. 2403.13786 link
2024-03-20 Information-Theoretic Distillation for Reference-less Summarization Jaehun Jung et.al. 2403.13780 null
2024-03-20 Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation Hugues Thomas et.al. 2403.13777 null
2024-03-20 Describe-and-Dissect: Interpreting Neurons in Vision Networks with Language Models Nicholas Bai et.al. 2403.13771 link
2024-03-20 Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model Diwei Wang et.al. 2403.13756 null
2024-03-20 Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement Catherine Arnett et.al. 2403.13754 null
2024-03-20 EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation Atnafu Lambebo Tonja et.al. 2403.13737 null
2024-03-20 Large Language Models meet Network Slicing Management and Orchestration Abdulhalim Dandoush et.al. 2403.13721 null
2024-03-20 SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning Hongjun Wang et.al. 2403.13684 null
2024-03-20 PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents Mitodru Niyogi et.al. 2403.13681 null
2024-03-20 RoleInteract: Evaluating the Social Interaction of Role-Playing Agents Hongzhan Chen et.al. 2403.13679 link
2024-03-20 Grounding Spatial Relations in Text-Only Language Models Gorka Azkune et.al. 2403.13666 link
2024-03-20 Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese Meet Doshi et.al. 2403.13638 null
2024-03-20 VL-Mamba: Exploring State Space Models for Multimodal Learning Yanyuan Qiao et.al. 2403.13600 null
2024-03-20 No more optimization rules: LLM-enabled policy-based multi-modal query optimizer (version 1) Yifan Wang et.al. 2403.13597 null
2024-03-19 LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression Zhuoshi Pan et.al. 2403.12968 link
2024-03-19 Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models Zuyan Liu et.al. 2403.12966 link
2024-03-19 Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models Ce Zhang et.al. 2403.12964 link
2024-03-19 Dated Data: Tracing Knowledge Cutoffs in Large Language Models Jeffrey Cheng et.al. 2403.12958 link
2024-03-19 Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models Elaine Sui et.al. 2403.12952 link
2024-03-19 Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models Joana Ribeiro de Faria et.al. 2403.12936 null
2024-03-19 Segment Anything for comprehensive analysis of grapevine cluster architecture and berry properties Efrain Torres-Lomas et.al. 2403.12935 null
2024-03-19 Rapid AIdeation: Generating Ideas With the Self and in Collaboration With Large Language Models Gionnieve Lim et.al. 2403.12928 null
2024-03-19 Supporting Energy Policy Research with Large Language Models Grant Buster et.al. 2403.12924 null
2024-03-19 Contextual AD Narration with Interleaved Multimodal Sequence Hanlin Wang et.al. 2403.12922 null
2024-03-19 Semantic Layering in Room Segmentation via LLMs Taehyeon Kim et.al. 2403.12920 null
2024-03-19 Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts Sai Ashish Somayajula et.al. 2403.12918 link
2024-03-19 Yell At Your Robot: Improving On-the-Fly from Language Corrections Lucy Xiaoyang Shi et.al. 2403.12910 null
2024-03-19 Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference Baolin Li et.al. 2403.12900 null
2024-03-19 mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding Anwen Hu et.al. 2403.12895 link
2024-03-20 MEDBind: Unifying Language and Multimodal Medical Data Embeddings Yuan Gao et.al. 2403.12894 null
2024-03-19 HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning Fucai Ke et.al. 2403.12884 link
2024-03-19 Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models Zehui Chen et.al. 2403.12881 link
2024-03-19 Epistemology of Language Models: Do Language Models Have Holistic Knowledge? Minsu Kim et.al. 2403.12862 null
2024-03-19 RASP: A Drone-based Reconfigurable Actuation and Sensing Platform Towards Ambient Intelligent Systems Minghui Zhao et.al. 2403.12853 null
2024-03-18 Modality-Agnostic fMRI Decoding of Vision and Language Mitja Nikolaus et.al. 2403.11771 null
2024-03-18 Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs M. Jehanzeb Mirza et.al. 2403.11755 link
2024-03-18 Revisiting The Classics: A Study on Identifying and Rectifying Gender Stereotypes in Rhymes and Poems Aditya Narayan Sankaran et.al. 2403.11752 link
2024-03-18 Embedded Named Entity Recognition using Probing Classifiers Nicholas Popovič et.al. 2403.11747 link
2024-03-18 TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models Lisa Weijler et.al. 2403.11691 null
2024-03-18 HDLdebugger: Streamlining HDL debugging with Large Language Models Xufeng Yao et.al. 2403.11671 null
2024-03-18 Prioritized Semantic Learning for Zero-shot Instance Navigation Xander Sun et.al. 2403.11650 link
2024-03-18 Arc2Face: A Foundation Model of Human Faces Foivos Paraperas Papantoniou et.al. 2403.11641 link
2024-03-18 Compositional Kronecker Context Optimization for Vision-Language Models Kun Ding et.al. 2403.11631 null
2024-03-18 Let’s Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model Haoyun Xu et.al. 2403.11621 null
2024-03-18 CRS-Diff: Controllable Generative Remote Sensing Foundation Model Datao Tang et.al. 2403.11614 link
2024-03-18 Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines Ekaterina Trofimova et.al. 2403.11585 null
2024-03-18 Reinforcement Learning with Token-level Feedback for Controllable Text Generation Wendi Li et.al. 2403.11558 link
2024-03-18 LLM^3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning Shu Wang et.al. 2403.11552 link
2024-03-18 Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters Jiazuo Yu et.al. 2403.11549 link
2024-03-18 DEE: Dual-stage Explainable Evaluation Method for Text Generation Shenyu Zhang et.al. 2403.11509 null
2024-03-18 Do CLIPs Always Generalize Better than ImageNet Models? Qizhou Wang et.al. 2403.11497 null
2024-03-18 VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding Yue Fan et.al. 2403.11481 null
2024-03-18 HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models Huy Nghiem et.al. 2403.11456 link
2024-03-18 Zero-shot Compound Expression Recognition with Visual Language Model at the 6th ABAW Challenge Jiahe Wang et.al. 2403.11450 null
2024-03-18 LLM Guided Evolution - The Automation of Models Advancing Models Clint Morris et.al. 2403.11446 link
2024-03-18 StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation Jinpeng Li et.al. 2403.11439 null
2024-03-18 InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions Yifan Wang et.al. 2403.11435 null
2024-03-18 A Novel Paradigm Boosting Translation Capabilities of Large Language Models Jiaxin Guo et.al. 2403.11430 null
2024-03-15 VideoAgent: Long-form Video Understanding with Large Language Model as Agent Xiaohan Wang et.al. 2403.10517 null
2024-03-15 Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization Ratnadira Widyasari et.al. 2403.10507 null
2024-03-15 ATOM: Asynchronous Training of Massive Models for Deep Learning in a Decentralized Environment Xiaofeng Wu et.al. 2403.10504 null
2024-03-15 Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study Chenguang Wang et.al. 2403.10499 link
2024-03-15 Reconfigurable Robot Identification from Motion Data Yuhang Hu et.al. 2403.10496 null
2024-03-15 Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst? Bruno de Melo et.al. 2403.10482 null
2024-03-15 Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases Jiarui Li et.al. 2403.10446 link
2024-03-15 Optimal Block-Level Draft Verification for Accelerating Speculative Decoding Ziteng Sun et.al. 2403.10444 null
2024-03-15 Using an LLM to Turn Sign Spottings into Spoken Language Sentences Ozge Mercanoglu Sincan et.al. 2403.10434 null
2024-03-15 SocialGenPod: Privacy-Friendly Generative AI Social Web Applications with Decentralised Personal Data Stores Vidminas Vizgirda et.al. 2403.10408 link
2024-03-15 A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE Hervé Déjean et.al. 2403.10407 null
2024-03-15 Monotonic Representation of Numeric Properties in Language Models Benjamin Heinzerling et.al. 2403.10381 link
2024-03-15 EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models Rocktim Jyoti Das et.al. 2403.10378 link
2024-03-15 TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale Pengcheng Jiang et.al. 2403.10351 null
2024-03-15 Investigating grammatical abstraction in language models using few-shot learning of novel noun gender Priyanka Sukumaran et.al. 2403.10338 null
2024-03-15 CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model Shang-Hsuan Chiang et.al. 2403.10326 link
2024-03-15 NetBench: A Large-Scale and Comprehensive Network Traffic Benchmark Dataset for Foundation Models Chen Qian et.al. 2403.10319 link
2024-03-15 Uni-SMART: Universal Science Multimodal Analysis and Research Transformer Hengxing Cai et.al. 2403.10301 null
2024-03-15 Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models Tian Meng et.al. 2403.10287 null
2024-03-15 Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning Shang-Hsuan Chiang et.al. 2403.10281 link
2024-03-14 GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping Yuhang Zheng et.al. 2403.09637 link
2024-03-14 Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference Piotr Nawrot et.al. 2403.09636 null
2024-03-14 Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models Akhil Kedia et.al. 2403.09635 link
2024-03-14 OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning Lingyi Hong et.al. 2403.09634 null
2024-03-14 3D-VLA: A 3D Vision-Language-Action Generative World Model Haoyu Zhen et.al. 2403.09631 null
2024-03-14 Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Eric Zelikman et.al. 2403.09629 link
2024-03-14 Explore In-Context Segmentation via Latent Diffusion Models Chaoyang Wang et.al. 2403.09616 null
2024-03-14 MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Brandon McKinzie et.al. 2403.09611 null
2024-03-14 Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey Xiaoyu Liu et.al. 2403.09606 null
2024-03-14 Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis Gregory Coppola et.al. 2403.09599 null
2024-03-14 Renovating Names in Open-Vocabulary Segmentation Benchmarks Haiwen Huang et.al. 2403.09593 null
2024-03-14 ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models Runyu Ma et.al. 2403.09583 null
2024-03-14 Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation Yunhao Gou et.al. 2403.09572 null
2024-03-14 Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models Laura Fernández-Becerra et.al. 2403.09567 null
2024-03-14 Welcome Your New AI Teammate: On Safety Analysis by Leashing Large Language Models Ali Nouri et.al. 2403.09565 null
2024-03-14 PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps Ruixuan Liu et.al. 2403.09562 null
2024-03-14 Less is More: Data Value Estimation for Visual Instruction Tuning Zikang Liu et.al. 2403.09559 null
2024-03-15 Logits of API-Protected LLMs Leak Proprietary Information Matthew Finlayson et.al. 2403.09539 null
2024-03-14 VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding Chris Kelly et.al. 2403.09530 null
2024-03-15 WavCraft: Audio Editing and Generation with Natural Language Prompts Jinhua Liang et.al. 2403.09527 link
2024-03-13 Simple and Scalable Strategies to Continually Pre-train Large Language Models Adam Ibrahim et.al. 2403.08763 link
2024-03-13 Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework Jingling Li et.al. 2403.08743 null
2024-03-13 The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models Carlo Nicolini et.al. 2403.08739 null
2024-03-13 ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation Sayar Ghosh Roy et.al. 2403.08737 link
2024-03-13 Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization Renjie Pi et.al. 2403.08730 null
2024-03-14 SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents Ruiyi Wang et.al. 2403.08715 link
2024-03-13 Review of Generative AI Methods in Cybersecurity Yagmur Yigit et.al. 2403.08701 null
2024-03-13 TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning Shangding Gu et.al. 2403.08694 link
2024-03-13 Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages Rik van Noord et.al. 2403.08693 null
2024-03-13 Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records Erlend Frayling et.al. 2403.08664 null
2024-03-13 Self-Supervised Learning for Covariance Estimation Tzvi Diskin et.al. 2403.08662 null
2024-03-13 Human Alignment of Large Language Models through Online Preference Optimisation Daniele Calandriello et.al. 2403.08635 null
2024-03-13 MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models Subash Neupane et.al. 2403.08607 null
2024-03-13 Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation Daniel Honerkamp et.al. 2403.08605 link
2024-03-13 DevBench: A Comprehensive Benchmark for Software Development Bowen Li et.al. 2403.08604 link
2024-03-13 Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments Sitao Cheng et.al. 2403.08593 null
2024-03-13 Non-discrimination Criteria for Generative Language Models Sara Sterlie et.al. 2403.08564 link
2024-03-13 AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models Yifei Gao et.al. 2403.08542 link
2024-03-13 Language models scale reliably with over-training and on downstream tasks Samir Yitzhak Gadre et.al. 2403.08540 link
2024-03-13 Masked Generative Story Transformer with Character Guidance and Caption Augmentation Christos Papadimitriou et.al. 2403.08502 link
2024-03-12 Beyond Text: Frozen Large Language Models in Visual Signal Comprehension Lei Zhu et.al. 2403.07874 link
2024-03-12 Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Fangyun Wei et.al. 2403.07872 null
2024-03-12 Exploring Safety Generalization Challenges of Large Language Models via Code Qibing Ren et.al. 2403.07865 link
2024-03-12 Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation Shihao Zhao et.al. 2403.07860 link
2024-03-12 MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric Haokun Lin et.al. 2403.07839 null
2024-03-12 DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies William Xie et.al. 2403.07832 null
2024-03-12 The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing Jianchen Wang et.al. 2403.07825 null
2024-03-12 Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM Sainbayar Sukhbaatar et.al. 2403.07816 null
2024-03-12 Chronos: Learning the Language of Time Series Abdul Fatir Ansari et.al. 2403.07815 link
2024-03-12 Beyond Memorization: The Challenge of Random Memory Access in Language Models Tongyao Zhu et.al. 2403.07805 link
2024-03-12 Fine-tuning Large Language Models with Sequential Instructions Hanxu Hu et.al. 2403.07794 link
2024-03-12 Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations Carlos Jose Xavier Cruz et.al. 2403.07769 link
2024-03-12 Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings Sahand Sharifzadeh et.al. 2403.07750 null
2024-03-12 FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models Yan Liu et.al. 2403.07747 null
2024-03-12 Multi-modal Auto-regressive Modeling via Visual Words Tianshuo Peng et.al. 2403.07720 link
2024-03-12 WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? Alexandre Drouin et.al. 2403.07718 link
2024-03-12 StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models Zhicheng Guo et.al. 2403.07714 link
2024-03-12 Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards Wei Shen et.al. 2403.07708 null
2024-03-12 Large, Small or Both: A Novel Data Augmentation Framework Based on Language Models for Debiasing Opinion Summarization Yanyue Zhang et.al. 2403.07693 null
2024-03-12 Reference-free Monolithic Preference Optimization with Odds Ratio Jiwoo Hong et.al. 2403.07691 link
2024-03-11 Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena Leonie Weissweiler et.al. 2403.06965 null
2024-03-11 Materials science in the era of large language models: a perspective Ge Lei et.al. 2403.06949 null
2024-03-11 Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation Xinyao Li et.al. 2403.06946 link
2024-03-11 Naming, Describing, and Quantifying Visual Objects in Humans and LLMs Alberto Testoni et.al. 2403.06935 link
2024-03-11 ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis Yanming Liu et.al. 2403.06932 link
2024-03-11 MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning Yichuan Li et.al. 2403.06914 link
2024-03-11 Application of Quantum Tensor Networks for Protein Classification Debarshi Kundu et.al. 2403.06890 null
2024-03-11 Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents Nishchal Prasad et.al. 2403.06872 link
2024-03-11 Semantic Residual Prompts for Continual Learning Martin Menabue et.al. 2403.06870 link
2024-03-11 Learning with Noisy Foundation Models Hao Chen et.al. 2403.06869 null
2024-03-11 A Geospatial Approach to Predicting Desert Locust Breeding Grounds in Africa Ibrahim Salihu Yusuf et.al. 2403.06860 null
2024-03-11 Development of a Reliable and Accessible Caregiving Language Model (CaLM) Bambang Parmanto et.al. 2403.06857 null
2024-03-11 DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation Guosheng Zhao et.al. 2403.06845 null
2024-03-11 RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback Yanming Liu et.al. 2403.06840 link
2024-03-11 ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts Lyuye Zhang et.al. 2403.06838 null
2024-03-11 Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? Egor Zverev et.al. 2403.06833 link
2024-03-11 The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework Zhuo Chen et.al. 2403.06832 link
2024-03-11 ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model Zhiwei Liu et.al. 2403.06765 link
2024-03-11 An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models Liang Chen et.al. 2403.06764 link
2024-03-11 ALaRM: Align Language Models via Hierarchical Rewards Modeling Yuhang Lai et.al. 2403.06754 link
2024-03-08 Bayesian Preference Elicitation with Language Models Kunal Handa et.al. 2403.05534 null
2024-03-08 Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Machel Reid et.al. 2403.05530 null
2024-03-08 GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM Hao Kang et.al. 2403.05527 link
2024-03-08 DeepSeek-VL: Towards Real-World Vision-Language Understanding Haoyu Lu et.al. 2403.05525 link
2024-03-08 Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola Yijiang Li et.al. 2403.05523 null
2024-03-08 Authorship Attribution in Bangla Literature (AABL) via Transfer Learning using ULMFiT Aisha Khatun et.al. 2403.05519 null
2024-03-08 Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought James Chua et.al. 2403.05518 link
2024-03-08 To Err Is Human, but Llamas Can Learn It Too Agnes Luhtaru et.al. 2403.05493 link
2024-03-08 Will GPT-4 Run DOOM? Adrian de Wynter et.al. 2403.05468 null
2024-03-08 Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs Arijit Nag et.al. 2403.05434 null
2024-03-08 Towards Real-World Stickers Use: A New Dataset for Multi-Tag Sticker Recognition Bingbing Wang et.al. 2403.05428 null
2024-03-08 FedFMS: Exploring Federated Foundation Models for Medical Image Segmentation Yuxi Liu et.al. 2403.05408 link
2024-03-08 Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery Xavier Bou et.al. 2403.05381 link
2024-03-08 VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model Junsu Kim et.al. 2403.05346 null
2024-03-08 Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings Wei Zhou et.al. 2403.05338 null
2024-03-08 ChatASU: Evoking LLM’s Reflexion to Truly Understand Aspect Sentiment in Dialogues Yiding Liu et.al. 2403.05326 null
2024-03-08 RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation Zihao Wang et.al. 2403.05313 null
2024-03-08 Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents Jinyang Li et.al. 2403.05307 link
2024-03-08 ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications Sotaro Takeshita et.al. 2403.05303 link
2024-03-08 Modeling Dynamic (De)Allocations of Local Memory for Translation Validation Abhishek Rose et.al. 2403.05302 null
2024-03-07 iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries Adam Coscia et.al. 2403.04760 link
2024-03-07 KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts Adam Coscia et.al. 2403.04758 link
2024-03-07 LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error Boshi Wang et.al. 2403.04746 link
2024-03-08 How Far Are We from Intelligent Visual Deductive Reasoning? Yizhe Zhang et.al. 2403.04732 link
2024-03-07 Common 7B Language Models Already Possess Strong Math Capabilities Chen Li et.al. 2403.04706 link
2024-03-07 ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes Hashmat Shadab Malik et.al. 2403.04701 link
2024-03-07 Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification Ekaterina Fadeeva et.al. 2403.04696 link
2024-03-07 Telecom Language Models: Must They Be Large? Nicola Piovesan et.al. 2403.04666 null
2024-03-07 Yi: Open Foundation Models by 01.AI 01. AI et.al. 2403.04652 link
2024-03-07 Teaching Large Language Models to Reason with Reinforcement Learning Alex Havrilla et.al. 2403.04642 null
2024-03-07 CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios Qilang Ye et.al. 2403.04640 link
2024-03-07 A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds Xuenan Xu et.al. 2403.04594 link
2024-03-07 Embodied Understanding of Driving Scenarios Yunsong Zhou et.al. 2403.04593 link
2024-03-07 Wiki-TabNER: Advancing Table Interpretation Through Named Entity Recognition Aneta Koleva et.al. 2403.04577 link
2024-03-07 Reducing self-supervised learning complexity improves weakly-supervised classification performance in computational pathology Tim Lenz et.al. 2403.04558 null
2024-03-07 Enhancing Data Quality in Federated Fine-Tuning of Foundation Models Wanru Zhao et.al. 2403.04529 null
2024-03-07 Where does In-context Translation Happen in Large Language Models Suzanna Sia et.al. 2403.04510 null
2024-03-07 GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability Zihan Luo et.al. 2403.04483 link
2024-03-08 Do Large Language Model Understand Multi-Intent Spoken Language? Shangjian Yin et.al. 2403.04481 link
2024-03-08 Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset Minjin Kim et.al. 2403.04460 link
2024-03-06 Backtracing: Retrieving the Cause of the Query Rose E. Wang et.al. 2403.03956 link
2024-03-06 Bridging Language and Items for Retrieval and Recommendation Yupeng Hou et.al. 2403.03952 link
2024-03-06 The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models Adithya Bhaskar et.al. 2403.03942 link
2024-03-06 Did Translation Models Get More Robust Without Anyone Even Noticing? Ben Peters et.al. 2403.03923 null
2024-03-06 Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing Asmita et.al. 2403.03897 link
2024-03-06 IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators Indraneil Paul et.al. 2403.03894 link
2024-03-06 From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models Luiza Pozzobon et.al. 2403.03893 link
2024-03-06 FaaF: Facts as a Function for the evaluation of RAG systems Vasileios Katranidis et.al. 2403.03888 link
2024-03-06 SaulLM-7B: A pioneering Large Language Model for Law Pierre Colombo et.al. 2403.03883 null
2024-03-06 Learning to Decode Collaboratively with Multiple Language Models Shannon Zejiang Shen et.al. 2403.03870 link
2024-03-06 On the Origins of Linear Representations in Large Language Models Yibo Jiang et.al. 2403.03867 null
2024-03-06 KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions Fangyuan Xu et.al. 2403.03866 null
2024-03-06 Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning Deepanway Ghosal et.al. 2403.03864 link
2024-03-06 X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification Hanzi Xu et.al. 2403.03863 link
2024-03-06 Designing Informative Metrics for Few-Shot Example Selection Rishabh Adiga et.al. 2403.03861 null
2024-03-06 Emojinize: Enriching Any Text with Emoji Translations Lars Henning Klein et.al. 2403.03857 null
2024-03-06 ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Xin Men et.al. 2403.03853 null
2024-03-06 Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ Carolin Holtermann et.al. 2403.03814 link
2024-03-06 Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery Wei Zhang et.al. 2403.03790 null
2024-03-06 PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion Zekai Zhang et.al. 2403.03788 link
2024-03-05 The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning Nathaniel Li et.al. 2403.03218 null
2024-03-05 CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments Savitha Sam Abraham et.al. 2403.03203 null
2024-03-05 Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement Rafaela Martelo et.al. 2403.03188 link
2024-03-05 Reliable, Adaptable, and Attributable Language Models with Retrieval Akari Asai et.al. 2403.03187 null
2024-03-05 MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting Fangchen Liu et.al. 2403.03174 null
2024-03-05 SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection Peng Qi et.al. 2403.03170 null
2024-03-05 PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset Arda Uzunoğlu et.al. 2403.03167 link
2024-03-05 Quantum Many-Body Physics Calculations with Large Language Models Haining Pan et.al. 2403.03154 null
2024-03-05 Language Guided Exploration for RL Agents in Text Environments Hitesh Golchha et.al. 2403.03141 null
2024-03-05 CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following Kaiyan Zhang et.al. 2403.03129 null
2024-03-05 Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution Flor Miriam Plaza-del-Arco et.al. 2403.03121 link
2024-03-05 “In Dialogues We Learn”: Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning Chuanqi Cheng et.al. 2403.03102 null
2024-03-05 KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents Yuqi Zhu et.al. 2403.03101 link
2024-03-05 Learning to Use Tools via Cooperative and Interactive Agents Zhengliang Shi et.al. 2403.03031 link
2024-03-05 Socratic Reasoning Improves Positive Text Rewriting Anmol Goel et.al. 2403.03029 null
2024-03-05 Word Importance Explains How Prompts Affect Language Model Outputs Stefan Hackmann et.al. 2403.03028 null
2024-03-05 OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following Haochen Shi et.al. 2403.03017 null
2024-03-05 Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations Hasan Abu-Rasheed et.al. 2403.03008 null
2024-03-05 Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models Gen Luo et.al. 2403.03003 link
2024-03-05 Localized Zeroth-Order Prompt Optimization Wenyang Hu et.al. 2403.02993 null
2024-03-02 LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems Tasnim Ahmed et.al. 2403.01342 null
2024-03-02 Making Hybrid Languages: A Recipe Leif Andersen et.al. 2403.01335 null
2024-03-02 Chaining thoughts and LLMs to learn DNA structural biophysics Tyler D. Ross et.al. 2403.01332 link
2024-03-02 VBART: The Turkish LLM Meliksah Turker et.al. 2403.01308 null
2024-03-02 ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation Moran Yanuka et.al. 2403.01306 link
2024-03-02 Improving the Validity of Automatically Generated Feedback via Reinforcement Learning Alexander Scarlatos et.al. 2403.01304 link
2024-03-02 NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention Tianyi Zhang et.al. 2403.01273 link
2024-03-02 Employing LLMs for Incident Response Planning and Review Sam Hays et.al. 2403.01271 null
2024-03-02 Dissecting Language Models: Machine Unlearning via Selective Pruning Nicholas Pochinkov et.al. 2403.01267 link
2024-03-02 Accelerating Greedy Coordinate Gradient via Probe Sampling Yiran Zhao et.al. 2403.01251 link
2024-03-02 SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code Ziniu Hu et.al. 2403.01248 null
2024-03-02 Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal Jianheng Huang et.al. 2403.01244 link
2024-03-02 IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact Ruikang Liu et.al. 2403.01241 link
2024-03-02 Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy Jamie Hayes et.al. 2403.01218 null
2024-03-02 API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access Jiayuan Su et.al. 2403.01216 null
2024-03-02 Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning Shuo Yang et.al. 2403.01209 null
2024-03-02 The Case for Animal-Friendly AI Sankalpa Ghose et.al. 2403.01199 null
2024-03-02 DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling Shanghaoran Quan et.al. 2403.01197 link
2024-03-02 RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots Philip Feldman et.al. 2403.01193 null
2024-03-02 Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding Ha-Thanh Nguyen et.al. 2403.01185 null
2024-02-29 The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations? Alex Gu et.al. 2402.19475 null
2024-02-29 The All-Seeing Project V2: Towards General Relation Comprehension of the Open World Weiyun Wang et.al. 2402.19474 link
2024-02-29 Retrieval-Augmented Generation for AI-Generated Content: A Survey Penghao Zhao et.al. 2402.19473 link
2024-02-29 Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling Gabriel Grand et.al. 2402.19471 null
2024-03-01 TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning Kate Sanders et.al. 2402.19467 null
2024-02-29 Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models Chen Qian et.al. 2402.19465 link
2024-02-29 Curiosity-driven Red-teaming for Large Language Models Zhang-Wei Hong et.al. 2402.19464 link
2024-02-29 Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap Saurabh Srivastava et.al. 2402.19450 link
2024-02-29 Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models Frederik Kunstner et.al. 2402.19449 null
2024-02-29 ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL Yifei Zhou et.al. 2402.19446 link
2024-02-29 Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation Jonathan Yang et.al. 2402.19432 null
2024-02-29 Compositional API Recommendation for Library-Oriented Code Generation Zexiong Ma et.al. 2402.19431 null
2024-02-29 Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Soham De et.al. 2402.19427 null
2024-02-29 Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines Lijia Ma et.al. 2402.19421 null
2024-02-29 PaECTER: Patent-level Representation Learning using Citation-informed Transformers Mainak Ghosh et.al. 2402.19411 null
2024-02-29 On the Scaling Laws of Geographical Representation in Language Models Nathan Godey et.al. 2402.19406 null
2024-02-29 Entity-Aware Multimodal Alignment Framework for News Image Captioning Junzhe Zhang et.al. 2402.19404 null
2024-02-29 Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Match Human Crowd Accuracy Philipp Schoenegger et.al. 2402.19379 null
2024-02-29 OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models Jenish Maharjan et.al. 2402.19371 null
2024-02-29 SoK: Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency Akila Wickramasekara et.al. 2402.19366 null
2024-02-28 Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards Haoxiang Wang et.al. 2402.18571 link
2024-02-28 Diffusion Language Models Are Versatile Protein Learners Xinyou Wang et.al. 2402.18567 link
2024-02-28 A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic Gregory Coppola et.al. 2402.18566 null
2024-02-28 Approaching Human-Level Forecasting with Language Models Danny Halawi et.al. 2402.18563 null
2024-02-28 Implicit Bias of Next-Token Prediction Christos Thrampoulidis et.al. 2402.18551 null
2024-02-28 Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling Mahdi Karami et.al. 2402.18508 null
2024-02-28 Few-Shot Fairness: Unveiling LLM’s Potential for Fairness-Aware Classification Garima Chhikara et.al. 2402.18502 null
2024-02-28 Language Models Represent Beliefs of Self and Others Wentao Zhu et.al. 2402.18496 null
2024-02-28 IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding Lanyun Zhu et.al. 2402.18476 null
2024-02-28 Meta-Task Prompting Elicits Embedding from Large Language Models Yibin Lei et.al. 2402.18458 link
2024-02-28 Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization Deng Li et.al. 2402.18447 null
2024-02-28 Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication Weize Chen et.al. 2402.18439 link
2024-02-28 A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models Xiujie Song et.al. 2402.18409 link
2024-02-28 Balanced Similarity with Auxiliary Prompts: Towards Alleviating Text-to-Image Retrieval Bias for CLIP in Zero-shot Learning Hanyao Wang et.al. 2402.18400 null
2024-02-28 Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models Ercong Nie et.al. 2402.18397 null
2024-02-28 The First Place Solution of WSDM Cup 2024: Leveraging Large Language Models for Conversational Multi-Doc QA Yiming Li et.al. 2402.18385 link
2024-02-28 Large Language Models As Evolution Strategies Robert Tjarko Lange et.al. 2402.18381 null
2024-02-28 Tokenization Is More Than Compression Craig W. Schmidt et.al. 2402.18376 link
2024-02-28 VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models Seoyeon Kim et.al. 2402.18374 link
2024-02-28 Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning Jiachun Li et.al. 2402.18344 link
2024-02-27 ShapeLLM: Universal 3D Object Understanding for Embodied Interaction Zekun Qi et.al. 2402.17766 link
2024-02-27 The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Shuming Ma et.al. 2402.17764 null
2024-02-27 Massive Activations in Large Language Models Mingjie Sun et.al. 2402.17762 link
2024-02-27 Towards Optimal Learning of Language Models Yuxian Gu et.al. 2402.17759 null
2024-02-27 Evaluating Very Long-Term Conversational Memory of LLM Agents Adyasha Maharana et.al. 2402.17753 null
2024-02-27 Tower: An Open Multilingual Large Language Model for Translation-Related Tasks Duarte M. Alves et.al. 2402.17733 link
2024-02-27 AmbigNLG: Addressing Task Ambiguity in Instruction for NLG Ayana Niwa et.al. 2402.17717 link
2024-02-27 Case-Based or Rule-Based: How Do Transformers Do the Math? Yi Hu et.al. 2402.17709 link
2024-02-27 RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations Jing Huang et.al. 2402.17700 link
2024-02-27 NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents Tamara Czinczoll et.al. 2402.17682 link
2024-02-27 The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks Ashwin Prasad Shivarpatna Venkatesh et.al. 2402.17679 null
2024-02-27 CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention Mohammad Sadil Khan et.al. 2402.17678 null
2024-02-27 Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models Yunpeng Huang et.al. 2402.17671 null
2024-02-27 Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs Tanise Ceron et.al. 2402.17649 null
2024-02-27 SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation Shuangrui Ding et.al. 2402.17645 link
2024-02-27 Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data Xiao Liu et.al. 2402.17644 link
2024-02-27 Variational Learning is Effective for Large Deep Networks Yuesong Shen et.al. 2402.17641 link
2024-02-27 Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling David S. W. Williams et.al. 2402.17622 null
2024-02-27 Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization Wenqi Zhang et.al. 2402.17574 link
2024-02-27 Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers Xinyu Tang et.al. 2402.17564 link
2024-02-26 Integrating Large Language Models with Graphical Session-Based Recommendation Naicheng Guo et.al. 2402.16539 null
2024-02-26 LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments Junzhe Chen et.al. 2402.16499 link
2024-02-26 On Languaging a Simulation Engine Han Liu et.al. 2402.16482 null
2024-02-26 Unveiling ChatGPT’s Usage in Open Source Projects: A Mining-based Study Rosalia Tufano et.al. 2402.16480 null
2024-02-26 mEdIT: Multilingual Text Editing via Instruction Tuning Vipul Raheja et.al. 2402.16472 link
2024-02-26 Unveiling Vulnerability of Self-Attention Khai Jiet Liong et.al. 2402.16470 link
2024-02-26 Defending LLMs against Jailbreaking Attacks via Backtranslation Yihan Wang et.al. 2402.16459 link
2024-02-26 ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing Liuzhenghao Lv et.al. 2402.16445 link
2024-02-26 ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors Zhexin Zhang et.al. 2402.16444 link
2024-02-26 Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models Tianyi Tang et.al. 2402.16438 link
2024-02-26 RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions Yuansen Zhang et.al. 2402.16431 null
2024-02-26 Predicting Sustainable Development Goals Using Course Descriptions – from LLMs to Conventional Foundation Models Lev Kharlashkin et.al. 2402.16420 null
2024-02-26 From RAGs to riches: Using large language models to write documents for clinical trials Nigel Markey et.al. 2402.16406 null
2024-02-26 MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property Shiwen Ni et.al. 2402.16389 link
2024-02-26 Immunization against harmful fine-tuning attacks Domenic Rosati et.al. 2402.16382 null
2024-02-26 Improving LLM-based Machine Translation with Systematic Self-Correction Zhaopeng Feng et.al. 2402.16379 link
2024-02-26 Unraveling Babel: Exploring Multilingual Activation Patterns within Large Language Models Weize Liu et.al. 2402.16367 null
2024-02-26 LLM Inference Unveiled: Survey and Roofline Model Insights Zhihang Yuan et.al. 2402.16363 link
2024-02-26 Layer-wise Regularized Dropout for Neural Language Models Shiwen Ni et.al. 2402.16361 null
2024-02-26 An Integrated Data Processing Framework for Pretraining Foundation Models Yiding Sun et.al. 2402.16358 link
2024-02-26 Language-guided Skill Learning with Temporal Variational Inference Haotian Fu et.al. 2402.16354 null
2024-02-23 AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning Jianguo Zhang et.al. 2402.15506 link
2024-02-23 API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs Kinjal Basu et.al. 2402.15491 link
2024-02-23 Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models Yiran Liu et.al. 2402.15481 null
2024-02-23 Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization Swaroop Nath et.al. 2402.15473 link
2024-02-23 Repetition Improves Language Model Embeddings Jacob Mitchell Springer et.al. 2402.15449 link
2024-02-23 A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models Stefan Hegselmann et.al. 2402.15422 link
2024-02-23 PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning Simon Holk et.al. 2402.15420 null
2024-02-23 Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy? Nader Asadi et.al. 2402.15414 null
2024-02-23 Grasp, See and Place: Efficient Unknown Object Rearrangement with Policy Structure Prior Kechun Xu et.al. 2402.15402 link
2024-02-23 Explorations of Self-Repair in Language Models Cody Rushing et.al. 2402.15390 link
2024-02-23 Safe Task Planning for Language-Instructed Multi-Robot Systems using Conformal Prediction Jun Wang et.al. 2402.15368 null
2024-02-23 Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang et.al. 2402.15350 link
2024-02-23 NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data Sergei Bogdanov et.al. 2402.15343 link
2024-02-23 Ranking Entities along Conceptual Space Dimensions with LLMs: An Analysis of Fine-Tuning Strategies Nitesh Kumar et.al. 2402.15337 null
2024-02-23 GPTVQ: The Blessing of Dimensionality for LLM Quantization Mart van Baalen et.al. 2402.15319 null
2024-02-23 ArabianGPT: Native Arabic GPT-based Large Language Anis Koubaa et.al. 2402.15313 null
2024-02-23 Counterfactual Generation with Identifiability Guarantees Hanqi Yan et.al. 2402.15309 link
2024-02-23 Representing Online Handwriting for Recognition in Large Vision-Language Models Anastasiia Fadeeva et.al. 2402.15307 null
2024-02-23 How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries Somnath Banerjee et.al. 2402.15302 link
2024-02-23 Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models Yuzhe Zhang et.al. 2402.15301 null
2024-02-22 PALO: A Polyglot Large Multimodal Model for 5B People Muhammad Maaz et.al. 2402.14818 link
2024-02-22 Demographic Bias of Expert-Level Vision-Language Foundation Models in Medical Imaging Yuzhe Yang et.al. 2402.14815 link
2024-02-22 WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition Lianghui Zhu et.al. 2402.14812 link
2024-02-22 Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking Nikhil Prakash et.al. 2402.14811 null
2024-02-22 CriticBench: Benchmarking LLMs for Critique-Correct Reasoning Zicheng Lin et.al. 2402.14809 link
2024-02-22 RelayAttention for Efficient Large Language Model Serving with Long System Prompts Lei Zhu et.al. 2402.14808 link
2024-02-22 A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health Nikhil Behari et.al. 2402.14807 null
2024-02-22 Identifying Multiple Personalities in Large Language Models with External Evaluation Xiaoyang Song et.al. 2402.14805 null
2024-02-22 Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models Xudong Lu et.al. 2402.14800 link
2024-02-22 Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic Nathaniel Weir et.al. 2402.14798 null
2024-02-22 Zero-shot cross-lingual transfer in instruction tuning of large language model Nadezhda Chirkova et.al. 2402.14778 null
2024-02-22 2D Matryoshka Sentence Embeddings Xianming Li et.al. 2402.14776 link
2024-02-22 DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models Yuhang Cao et.al. 2402.14767 link
2024-02-22 MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues Ge Bai et.al. 2402.14762 link
2024-02-22 Generalizing Reward Modeling for Out-of-Distribution Preference Learning Chen Jia et.al. 2402.14760 link
2024-02-22 Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation Jiawei Wang et.al. 2402.14744 link
2024-02-22 Dependency Annotation of Ottoman Turkish with Multilingual BERT Şaziye Betül Özateş et.al. 2402.14743 null
2024-02-22 Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs Arash Ahmadian et.al. 2402.14740 null
2024-02-22 Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models Seungduk Kim et.al. 2402.14714 link
2024-02-22 IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus Honghao Gui et.al. 2402.14710 link
2024-02-21 Coercing LLMs to do and reveal (almost) anything Jonas Geiping et.al. 2402.14020 link
2024-02-21 Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment Vyas Raina et.al. 2402.14016 link
2024-02-21 OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems Chaoqun He et.al. 2402.14008 link
2024-02-21 Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models Zhiwei He et.al. 2402.14007 link
2024-02-21 Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models Aline Ioste et.al. 2402.14002 null
2024-02-21 Analysing The Impact of Sequence Composition on Language Model Pre-Training Yu Zhao et.al. 2402.13991 link
2024-02-21 Towards Building Multilingual Language Model for Medicine Pengcheng Qiu et.al. 2402.13963 link
2024-02-21 Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality Rahul Zalkikar et.al. 2402.13954 link
2024-02-21 Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning Debjit Paul et.al. 2402.13950 null
2024-02-21 Do Efficient Transformers Really Save Computation? Kai Yang et.al. 2402.13934 null
2024-02-21 Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content Federico Bianchi et.al. 2402.13926 null
2024-02-21 SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Prakamya Mishra et.al. 2402.13919 link
2024-02-21 What Linguistic Features and Languages are Important in LLM Translation? Ryandito Diandaru et.al. 2402.13917 null
2024-02-21 Calibrating Large Language Models with Sample Consistency Qing Lyu et.al. 2402.13904 null
2024-02-21 Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models Chenyang Lyu et.al. 2402.13887 null
2024-02-21 $\texttt{Se}^2$: $\textit{Se}$quential Example $\textit{Se}$lection for In-Context Learning Haoyu Liu et.al. 2402.13874 link
2024-02-21 An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach Mohammad Amaz Uddin et.al. 2402.13871 null
2024-02-21 Kuaiji: the First Chinese Accounting Large Language Model Jiayuan Luo et.al. 2402.13866 null
2024-02-21 RealDex: Towards Human-like Grasping for Robotic Dexterous Hand Yumeng Liu et.al. 2402.13853 null
2024-02-21 VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models Jiawei Liang et.al. 2402.13851 null
2024-02-20 Towards audio language modeling – an overview Haibin Wu et.al. 2402.13236 null
2024-02-20 Unlocking Insights: Semantic Search in Jupyter Notebooks Lan Li et.al. 2402.13234 null
2024-02-20 A Touch, Vision, and Language Dataset for Multimodal Alignment Letian Fu et.al. 2402.13232 link
2024-02-20 Investigating Cultural Alignment of Large Language Models Badr AlKhamissi et.al. 2402.13231 link
2024-02-20 Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive Arka Pal et.al. 2402.13228 link
2024-02-20 AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning Qiao Jin et.al. 2402.13225 null
2024-02-20 RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian Adrian Cosma et.al. 2402.13222 link
2024-02-20 How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts Yusu Qian et.al. 2402.13220 null
2024-02-20 Softmax Probabilities (Mostly) Predict Large Language Model Correctness on Multiple-Choice Q&A Benjamin Plaut et.al. 2402.13213 link
2024-02-20 Soft Self-Consistency Improves Language Model Agents Han Wang et.al. 2402.13212 link
2024-02-20 Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation Dongjin Kang et.al. 2402.13211 null
2024-02-20 Bayesian Reward Models for LLM Alignment Adam X. Yang et.al. 2402.13210 null
2024-02-20 How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena Marco Gaido et.al. 2402.13208 link
2024-02-20 Question Calibration and Multi-Hop Modeling for Temporal Question Answering Chao Xue et.al. 2402.13188 null
2024-02-20 What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents Mingyu Jin et.al. 2402.13184 link
2024-02-20 DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models Norman Di Palo et.al. 2402.13181 null
2024-02-20 Benchmarking Retrieval-Augmented Generation for Medicine Guangzhi Xiong et.al. 2402.13178 link
2024-02-20 Defending Jailbreak Prompts via In-Context Adversarial Game Yujun Zhou et.al. 2402.13148 null
2024-02-20 OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog Adnen Abdessaied et.al. 2402.13146 null
2024-02-20 The Hidden Space of Transformer Language Adapters Jesujoba O. Alabi et.al. 2402.13137 link
2024-02-19 Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding Zhuoming Chen et.al. 2402.12374 link
2024-02-19 AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies Xiao Ye et.al. 2402.12370 link
2024-02-19 A Critical Evaluation of AI Feedback for Aligning Large Language Models Archit Sharma et.al. 2402.12366 link
2024-02-19 Emergent Word Order Universals from Cognitively-Motivated Language Models Tatsuki Kuribayashi et.al. 2402.12363 link
2024-02-19 Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge Julien Delile et.al. 2402.12352 null
2024-02-19 GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations Jinhao Duan et.al. 2402.12348 link
2024-02-19 Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! Zhanhui Zhou et.al. 2402.12343 link
2024-02-19 Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models Christian Schlarmann et.al. 2402.12336 link
2024-02-19 Query-Based Adversarial Prompt Generation Jonathan Hayase et.al. 2402.12329 null
2024-02-19 Shall We Talk: Exploring Spontaneous Collaborations of Competing LLM Agents Zengqing Wu et.al. 2402.12327 link
2024-02-19 ARKS: Active Retrieval in Knowledge Soup for Code Generation Hongjin Su et.al. 2402.12317 link
2024-02-19 Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports Felix J. Dorfner et.al. 2402.12298 null
2024-02-19 KARL: Knowledge-Aware Retrieval and Representations aid Retention and Learning in Students Matthew Shu et.al. 2402.12291 null
2024-02-19 DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models Xiaoyu Tian et.al. 2402.12289 null
2024-02-19 Adaptive Skeleton Graph Decoding Shuowei Jin et.al. 2402.12280 null
2024-02-19 Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks Nadezhda Chirkova et.al. 2402.12279 null
2024-02-19 Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from Large Language Models Puxuan Yu et.al. 2402.12276 link
2024-02-19 High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models Michela Lorandi et.al. 2402.12267 link
2024-02-19 Uncertainty quantification in fine-tuned LLMs using LoRA ensembles Oleksandr Balabanov et.al. 2402.12264 null
2024-02-19 NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms Jonathan Zheng et.al. 2402.12261 link
2024-02-16 PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter Junfei Xiao et.al. 2402.10896 null
2024-02-16 RLVF: Learning from Verbal Feedback without Overgeneralization Moritz Stephan et.al. 2402.10893 link
2024-02-16 Instruction Diversity Drives Generalization To Unseen Tasks Dylan Zhang et.al. 2402.10891 null
2024-02-16 When is Tree Search Useful for LLM Planning? It Depends on the Discriminator Ziru Chen et.al. 2402.10890 link
2024-02-16 Multi-modal preference alignment remedies regression of visual instruction tuning on language model Shengzhi Li et.al. 2402.10884 link
2024-02-16 EcoRank: Budget-Constrained Text Re-ranking Using Large Language Models Muhammad Shihab Rashid et.al. 2402.10866 link
2024-02-16 Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities Mingyu Jin et.al. 2402.10835 null
2024-02-16 RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model Jianhao Yuan et.al. 2402.10828 null
2024-02-16 Quantifying the Persona Effect in LLM Simulations Tiancheng Hu et.al. 2402.10811 null
2024-02-16 Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond Yongqi Li et.al. 2402.10805 null
2024-02-16 EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge Xuan Shen et.al. 2402.10787 link
2024-02-16 A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models Mingchen Li et.al. 2402.10779 null
2024-02-16 AutoGPT+P: Affordance-based Task Planning with Large Language Models Timo Birr et.al. 2402.10778 null
2024-02-16 How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs? Ehsan Doostmohammadi et.al. 2402.10770 null
2024-02-16 Distillation Enhanced Generative Retrieval Yongqi Li et.al. 2402.10769 null
2024-02-16 Inference to the Best Explanation in Large Language Models Dhairya Dalal et.al. 2402.10767 null
2024-02-16 When Dataflow Analysis Meets Large Language Models Chengpeng Wang et.al. 2402.10754 link
2024-02-16 ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages Junjie Ye et.al. 2402.10753 link
2024-02-16 GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models Pengcheng Jiang et.al. 2402.10744 link
2024-02-16 Let’s Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning Yinpeng Liu et.al. 2402.10738 link
2024-02-15 Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation Huizhuo Yuan et.al. 2402.10210 null
2024-02-15 Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment Rui Yang et.al. 2402.10207 link
2024-02-15 Chain-of-Thought Reasoning Without Prompting Xuezhi Wang et.al. 2402.10200 null
2024-02-15 A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents Lingbo Mo et.al. 2402.10196 link
2024-02-15 BitDelta: Your Fine-Tune May Only Be Worth One Bit James Liu et.al. 2402.10193 link
2024-02-15 Uncertainty Decomposition and Quantification for In-Context Learning of Large Language Models Chen Ling et.al. 2402.10189 link
2024-02-15 Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective Tianyi Qiu et.al. 2402.10184 null
2024-02-15 TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation Yaoxiang Wang et.al. 2402.10178 null
2024-02-15 OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset Shubham Toshniwal et.al. 2402.10176 link
2024-02-15 Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence Yinhong Liu et.al. 2402.10175 link
2024-02-15 OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models Ali AhmadiTeshnizi et.al. 2402.10172 link
2024-02-15 Data Engineering for Scaling Language Models to 128K Context Yao Fu et.al. 2402.10171 link
2024-02-15 Knowledge-Infused LLM-Powered Conversational Health Agent: A Case Study for Diabetes Patients Mahyar Abbasian et.al. 2402.10153 null
2024-02-15 ControlLM: Crafting Diverse Personalities for Language Models Yixuan Weng et.al. 2402.10151 link
2024-02-15 TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles Yinhong Liu et.al. 2402.10137 null
2024-02-15 Zero-Shot Reasoning: Personalized Content Generation Without the Cold Start Problem Davor Hafnar et.al. 2402.10133 link
2024-02-15 Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning Ming Li et.al. 2402.10110 link
2024-02-15 Quantized Embedding Vectors for Controllable Diffusion Language Models Cheng Kang et.al. 2402.10107 null
2024-02-15 GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving Jiaxin Zhang et.al. 2402.10104 link
2024-02-15 Any-Shift Prompting for Generalization over Distributions Zehao Xiao et.al. 2402.10099 null
2024-02-14 AQA-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability Siwei Yang et.al. 2402.09404 link
2024-02-14 Reinforcement Learning from Human Feedback with Active Queries Kaixuan Ji et.al. 2402.09401 null
2024-02-14 Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference Harry Dong et.al. 2402.09398 link
2024-02-14 LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset Botao Yu et.al. 2402.09391 link
2024-02-14 HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation Yihao Fang et.al. 2402.09390 link
2024-02-14 Transformers Can Achieve Length Generalization But Not Robustly Yongchao Zhou et.al. 2402.09371 null
2024-02-14 Pseudorandom Error-Correcting Codes Miranda Christ et.al. 2402.09370 null
2024-02-14 Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking Yi Fung et.al. 2402.09369 link
2024-02-14 Copyright Traps for Large Language Models Matthieu Meeus et.al. 2402.09363 link
2024-02-14 HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference Yashas Samaga B L et.al. 2402.09360 null
2024-02-14 Developing a Framework for Auditing Large Language Models Using Human-in-the-Loop Maryam Amirizaniani et.al. 2402.09346 null
2024-02-14 Mitigating Reward Hacking via Information-Theoretic Reward Modeling Yuchun Miao et.al. 2402.09345 link
2024-02-14 AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach Maryam Amirizaniani et.al. 2402.09334 null
2024-02-14 ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization Feifan Song et.al. 2402.09320 link
2024-02-14 Embracing the black box: Heading towards foundation models for causal discovery from time series data Gideon Stein et.al. 2402.09305 link
2024-02-14 Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code Vahid Majdinasab et.al. 2402.09299 link
2024-02-14 Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey Zhichen Dong et.al. 2402.09283 link
2024-02-14 Leveraging Large Language Models for Enhanced NLP Task Performance through Knowledge Distillation and Optimized Training Strategies Yining Huang et.al. 2402.09282 null
2024-02-14 Personalized Large Language Models Stanisław Woźniak et.al. 2402.09269 null
2024-02-14 Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation Xiaoying Zhang et.al. 2402.09267 null
2024-02-13 Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance Linxi Zhao et.al. 2402.08680 null
2024-02-13 COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability Xingang Guo et.al. 2402.08679 link
2024-02-13 Human Curriculum Effects Emerge with In-Context Learning in Neural Networks Jacob Russin et.al. 2402.08674 null
2024-02-13 Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models Yuqing Liu et.al. 2402.08670 null
2024-02-13 Improving Generalization in Semantic Parsing by Increasing Natural Language Variation Irina Saparina et.al. 2402.08666 link
2024-02-13 The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in a Prospective Cardiac Rehabilitation Setting David Haag et.al. 2402.08658 null
2024-02-13 PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs Michael Dorkenwald et.al. 2402.08657 null
2024-02-13 Tandem Transformers for Inference Efficient LLMs Aishwarya P S et.al. 2402.08644 null
2024-02-13 SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages Nedjma Ousidhoum et.al. 2402.08638 null
2024-02-13 Knowledge Editing on Black-box Large Language Models Xiaoshuai Song et.al. 2402.08631 link
2024-02-13 Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning Haeju Lee et.al. 2402.08594 link
2024-02-13 Test-Time Backdoor Attacks on Multimodal Large Language Models Dong Lu et.al. 2402.08577 link
2024-02-13 Online Foundation Model Selection in Robotics Po-han Li et.al. 2402.08570 null
2024-02-13 Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast Xiangming Gu et.al. 2402.08567 link
2024-02-13 Artificial Intelligence for Literature Reviews: Opportunities and Challenges Francisco Bolanos et.al. 2402.08565 null
2024-02-13 Higher Layers Need More LoRA Experts Chongyang Gao et.al. 2402.08562 link
2024-02-13 Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback Vineet Bhat et.al. 2402.08546 null
2024-02-13 The Application of ChatGPT in Responding to Questions Related to the Boston Bowel Preparation Scale Xiaoqiang Liu et.al. 2402.08492 null
2024-02-13 Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models Shaeke Salman et.al. 2402.08473 null
2024-02-13 Large Language Models for the Automated Analysis of Optimization Algorithms Camilo Chacón Sartori et.al. 2402.08472 link
2024-02-12 A systematic investigation of learnability from single child linguistic input Yulu Qin et.al. 2402.07899 link
2024-02-12 Suppressing Pink Elephants with Direct Principle Feedback Louis Castricato et.al. 2402.07896 null
2024-02-12 WildfireGPT: Tailored Large Language Model for Wildfire Analysis Yangxinyu Xie et.al. 2402.07877 null
2024-02-12 Policy Improvement using Language Feedback Models Victor Zhong et.al. 2402.07876 link
2024-02-12 PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs Soroush Nasiriany et.al. 2402.07872 null
2024-02-12 Scaling Laws for Fine-Grained Mixture of Experts Jakub Krajewski et.al. 2402.07871 link
2024-02-12 PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models Wei Zou et.al. 2402.07867 link
2024-02-12 Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models Siddharth Karamcheti et.al. 2402.07865 link
2024-02-12 AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy Philipp Schoenegger et.al. 2402.07862 null
2024-02-12 Lissard: Long and Simple Sequential Reasoning Datasets Mirelle Bueno et.al. 2402.07859 link
2024-02-12 Mercury: An Efficiency Benchmark for LLM Code Synthesis Mingzhe Du et.al. 2402.07844 link
2024-02-12 Do Membership Inference Attacks Work on Large Language Models? Michael Duan et.al. 2402.07841 link
2024-02-12 Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Ahmet Üstün et.al. 2402.07827 null
2024-02-12 Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning Z Liu et.al. 2402.07818 null
2024-02-12 Injecting Wiktionary to improve token-level contextual representations using contrastive learning Anna Mosolova et.al. 2402.07817 null
2024-02-12 Retrieval-Augmented Thought Process as Sequential Decision Making Thomas Pouplin et.al. 2402.07812 null
2024-02-12 Empowering Federated Learning for Massive Models with NVIDIA FLARE Holger R. Roth et.al. 2402.07792 null
2024-02-12 TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection Hui Liu et.al. 2402.07776 link
2024-02-12 Quantitative knowledge retrieval from large language models David Selby et.al. 2402.07770 link
2024-02-12 Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model Mikail Khona et.al. 2402.07757 null
2024-02-09 Feedback Loops With Language Models Drive In-Context Reward Hacking Alexander Pan et.al. 2402.06627 link
2024-02-09 Understanding the Effects of Iterative Prompting on Truthfulness Satyapriya Krishna et.al. 2402.06625 null
2024-02-09 Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning Shivalika Singh et.al. 2402.06619 null
2024-02-09 FaBERT: Pre-training BERT on Persian Blogs Mostafa Masumi et.al. 2402.06617 null
2024-02-09 On the Out-Of-Distribution Generalization of Multimodal Large Language Models Xingxuan Zhang et.al. 2402.06599 null
2024-02-09 CigaR: Cost-efficient Program Repair with LLMs Dávid Hidvégi et.al. 2402.06598 link
2024-02-09 Understanding the Weakness of Large Language Model Agents within a Complex Android Environment Mingzhe Xing et.al. 2402.06596 link
2024-02-09 Self-consistent context aware conformer transducer for speech recognition Konstantin Kolokolov et.al. 2402.06592 null
2024-02-09 G-SciEdBERT: A Contextualized LLM for Science Assessment Tasks in German Ehsan Latif et.al. 2402.06584 link
2024-02-09 Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning Amir Ziai et.al. 2402.06560 link
2024-02-09 The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model Gregory Coppola et.al. 2402.06557 link
2024-02-09 Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA Marek Šuppa et.al. 2402.06549 link
2024-02-09 Calibrating Long-form Generations from Large Language Models Yukun Huang et.al. 2402.06544 link
2024-02-09 Introspective Planning: Guiding Language-Enabled Agents to Refine Their Own Uncertainty Kaiqu Liang et.al. 2402.06529 link
2024-02-09 Multimodal Clinical Trial Outcome Prediction with Large Language Models Wenhao Zheng et.al. 2402.06512 link
2024-02-09 Iris-SAM: Iris Segmentation Using a Foundational Model Parisa Farmanifard et.al. 2402.06497 link
2024-02-09 Large Language Models for Captioning and Retrieving Remote Sensing Images João Daniel Silva et.al. 2402.06475 null
2024-02-09 V-STaR: Training Verifiers for Self-Taught Reasoners Arian Hosseini et.al. 2402.06457 null
2024-02-09 StruQ: Defending Against Prompt Injection with Structured Queries Sizhe Chen et.al. 2402.06363 link
2024-02-09 CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models Peiyuan Gong et.al. 2402.06360 link
2024-02-08 SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models Peng Gao et.al. 2402.05935 link
2024-02-08 Driving Everywhere with Large Language Model Policy Adaptation Boyi Li et.al. 2402.05932 null
2024-02-08 WebLINX: Real-World Website Navigation with Multi-Turn Dialogue Xing Han Lù et.al. 2402.05930 link
2024-02-08 An Interactive Agent Foundation Model Zane Durante et.al. 2402.05929 null
2024-02-08 On the Convergence of Zeroth-Order Federated Tuning in Large Language Models Zhenqing Ling et.al. 2402.05926 link
2024-02-08 Efficient Stagewise Pretraining via Progressive Subnetworks Abhishek Panigrahi et.al. 2402.05913 null
2024-02-08 FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs Eun Cheol Choi et.al. 2402.05904 link
2024-02-08 Large Language Model Meets Graph Neural Network in Knowledge Distillation Shengxiang Hu et.al. 2402.05894 null
2024-02-08 Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking Nikhil Sharma et.al. 2402.05880 null
2024-02-08 PromptCrypt: Prompt Encryption for Secure Communication with Large Language Models Guo Lin et.al. 2402.05868 link
2024-02-08 How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis Federico Bianchi et.al. 2402.05863 link
2024-02-08 Let Your Graph Do the Talking: Encoding Structured Data for LLMs Bryan Perozzi et.al. 2402.05862 link
2024-02-08 Learning to Route Among Specialized Experts for Zero-Shot Generalization Mohammed Muqeeth et.al. 2402.05859 link
2024-02-08 Limitations of Agents Simulated by Predictive Models Raymond Douglas et.al. 2402.05829 null
2024-02-08 Is it Possible to Edit Large Language Models Robustly? Xinbei Ma et.al. 2402.05827 link
2024-02-08 Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models Lingzhi Wang et.al. 2402.05813 null
2024-02-08 Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning Zhiheng Xi et.al. 2402.05808 link
2024-02-08 How do Transformers perform In-Context Autoregressive Learning? Michael E. Sander et.al. 2402.05787 null
2024-02-08 Limits of Transformer Language Models on Algorithmic Learning Jonathan Thomm et.al. 2402.05785 link
2024-02-08 Text-to-Code Generation with Modality-relative Pre-training Fenia Christopoulou et.al. 2402.05783 null
2024-02-07 Opening the AI black box: program synthesis via mechanistic interpretability Eric J. Michaud et.al. 2402.05110 link
2024-02-07 You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models Alix Decrop et.al. 2402.05102 null
2024-02-07 Hydragen: High-Throughput LLM Inference with Shared Prefixes Jordan Juravsky et.al. 2402.05099 link
2024-02-07 Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation Dennis Hoftijzer et.al. 2402.05090 link
2024-02-07 A Roadmap to Pluralistic Alignment Taylor Sorensen et.al. 2402.05070 link
2024-02-07 SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models Lijun Li et.al. 2402.05044 link
2024-02-07 How BERT Speaks Shakespearean English? Evaluating Historical Bias in Contextual Language Models Miriam Cuscito et.al. 2402.05034 null
2024-02-07 A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? Agustinus Kristiadi et.al. 2402.05015 link
2024-02-07 Pedagogical Alignment of Large Language Models Shashank Sonkar et.al. 2402.05000 link
2024-02-07 An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration Yihao Li et.al. 2402.04978 null
2024-02-07 ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12 Liuqing Chen et.al. 2402.04975 null
2024-02-07 Reconfidencing LLMs from the Grouping Loss Perspective Lihu Chen et.al. 2402.04957 null
2024-02-07 Chatbots in Knowledge-Intensive Contexts: Comparing Intent and LLM-Based Systems Samuel Kernan Freire et.al. 2402.04955 null
2024-02-07 Prompting Implicit Discourse Relation Annotation Frances Yung et.al. 2402.04918 null
2024-02-07 Personalized Text Generation with Fine-Grained Linguistic Control Bashar Alhafni et.al. 2402.04914 link
2024-02-07 L4Q: Parameter Efficient Quantization-Aware Training on Large Language Models via LoRA-wise LSQ Hyesung Jeon et.al. 2402.04902 null
2024-02-07 Detecting Generated Native Ads in Conversational Search Sebastian Schmidt et.al. 2402.04889 link
2024-02-07 Multimodal Query Suggestion with Multi-Agent Reinforcement Learning from Human Feedback Zheng Wang et.al. 2402.04867 null
2024-02-07 Automated Smart Contract Summarization via LLMs Yingjie Mao et.al. 2402.04863 null
2024-02-07 CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay Natasha Butt et.al. 2402.04858 link
2024-02-06 AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls Yu Du et.al. 2402.04253 link
2024-02-06 HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal Mantas Mazeika et.al. 2402.04249 link
2024-02-06 Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks Jongho Park et.al. 2402.04248 link
2024-02-06 Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science Xiangru Tang et.al. 2402.04247 null
2024-02-06 CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations Ji Qi et.al. 2402.04236 link
2024-02-06 Can Generative Agents Predict Emotion? Ciaran Regan et.al. 2402.04232 null
2024-02-06 “Task Success” is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors Lin Guan et.al. 2402.04210 null
2024-02-06 Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models David Sobrín-Hidalgo et.al. 2402.04206 link
2024-02-06 SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models Yichen Shi et.al. 2402.04178 link
2024-02-06 Scaling Laws for Downstream Task Performance of Large Language Models Berivan Isik et.al. 2402.04177 null
2024-02-06 Harnessing the Plug-and-Play Controller by Prompting Hao Wang et.al. 2402.04160 null
2024-02-06 Multi-line AI-assisted Code Authoring Omer Dunay et.al. 2402.04141 null
2024-02-06 Advancing Legal Reasoning: The Integration of AI to Navigate Complexities and Biases in Global Jurisprudence with Semi-Automated Arbitration Processes (SAAPs) Michael De’Shazer et.al. 2402.04140 null
2024-02-06 Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science Pengfei Liu et.al. 2402.04119 link
2024-02-06 Measuring Implicit Bias in Explicitly Unbiased Large Language Models Xuechunzi Bai et.al. 2402.04105 link
2024-02-06 The Use of a Large Language Model for Cyberbullying Detection Bayode Ogunleye et.al. 2402.04088 null
2024-02-06 A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation Zhengbo Wang et.al. 2402.04087 link
2024-02-06 Provably learning a multi-head attention layer Sitan Chen et.al. 2402.04084 null
2024-02-06 Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models Reza Khanmohammadi et.al. 2402.04075 null
2024-02-06 Retrieve to Explain: Evidence-driven Predictions with Language Models Ravi Patel et.al. 2402.04068 link

Video Understanding

Publish Date Title Authors PDF Code
2024-12-10 GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-grained Video-language Learning Yicheng Wang et.al. 2412.07704 null
2024-12-10 Multi-Scale Contrastive Learning for Video Temporal Grounding Thong Thanh Nguyen et.al. 2412.07157 null
2024-12-09 VidMusician: Video-to-Music Generation with Semantic-Rhythmic Alignment via Hierarchical Visual Features Sifei Li et.al. 2412.06296 null
2024-12-09 Towards Long Video Understanding via Fine-detailed Video Story Generation Zeng You et.al. 2412.06182 null
2024-12-06 Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Zhe Chen et.al. 2412.05271 null
2024-12-06 LinVT: Empower Your Image-level Large Language Model to Understand Videos Lishuai Gao et.al. 2412.05185 link
2024-12-06 Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection Khurram Azeem Hashmi et.al. 2412.04915 null
2024-12-06 Espresso: High Compression For Rich Extraction From Videos for Your Vision-Language Model Keunwoo Peter Yu et.al. 2412.04729 null
2024-12-05 VisionZip: Longer is Better but Not Necessary in Vision Language Models Senqiao Yang et.al. 2412.04467 link
2024-12-04 VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding Chaoyu Li et.al. 2412.03735 null
2024-12-04 Streaming Detection of Queried Event Start Cristobal Eyzaguirre et.al. 2412.03567 link
2024-12-04 Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning Wujian Peng et.al. 2412.03565 link
2024-12-04 AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning Yiwu Zhong et.al. 2412.03248 link
2024-12-04 Video LLMs for Temporal Reasoning in Long Videos Fawad Javed Fateh et.al. 2412.02930 null
2024-12-03 VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding Kangsan Kim et.al. 2412.02186 link
2024-12-03 Progress-Aware Video Frame Captioning Zihui Xue et.al. 2412.02071 null
2024-12-04 Towards Universal Soccer Video Understanding Jiayuan Rao et.al. 2412.01820 link
2024-12-02 PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos Meng Cao et.al. 2412.01800 null
2024-12-05 SEAL: Semantic Attention Learning for Long Video Representation Lan Wang et.al. 2412.01798 null
2024-12-02 Unlocking Video-LLM via Agent-of-Thoughts Distillation Yudi Shi et.al. 2412.01694 null
2024-12-02 Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation Xin Yan et.al. 2412.01316 null
2024-12-02 Eyes on the Road: State-of-the-Art Video Question Answering Models Assessment for Traffic Monitoring Tasks Joseph Raj Vishal et.al. 2412.01132 link
2024-12-01 VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation Weiming Ren et.al. 2412.00927 null
2024-12-01 VideoSAVi: Self-Aligned Video Language Models without Human Supervision Yogesh Kulkarni et.al. 2412.00624 null
2024-11-30 Empowering the Deaf and Hard of Hearing Community: Enhancing Video Captions Using Large Language Models Nadeen Fathallah et.al. 2412.00342 null
2024-11-29 STEP: Enhancing Video-LLMs’ Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training Haiyi Qiu et.al. 2412.00161 null
2024-12-02 T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs Shukang Yin et.al. 2411.19951 link
2024-11-29 Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark Joseph Heyward et.al. 2411.19941 null
2024-11-29 LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos Tiantian Geng et.al. 2411.19772 null
2024-11-29 Look Every Frame All at Once: Video-Ma$^2$mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing Hosu Lee et.al. 2411.19460 null
2024-11-29 Actions and Objects Pathways for Domain Adaptation in Video Question Answering Safaa Abdullahi Moallim Mohamud et.al. 2411.19434 null
2024-11-27 AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans Dillon Loh et.al. 2411.18539 link
2024-11-27 TimeMarker: A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability Shimin Chen et.al. 2411.18211 link
2024-11-27 HyperGLM: HyperGraph for Video Scene Graph Generation and Anticipation Trong-Thuan Nguyen et.al. 2411.18042 null
2024-11-27 VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format Yueqian Wang et.al. 2411.17991 link
2024-11-25 Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding Andong Deng et.al. 2411.16932 null
2024-11-25 SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context Jungang Li et.al. 2411.16213 null
2024-11-25 VideoOrion: Tokenizing Object Dynamics in Videos Yicheng Feng et.al. 2411.16156 null
2024-11-23 ReWind: Understanding Long Videos with Instructed Learnable Memory Anxhelo Diko et.al. 2411.15556 null
2024-11-23 FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity Hang Hua et.al. 2411.15411 null
2024-11-22 VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection Songhao Han et.al. 2411.14794 link
2024-11-22 Whats in a Video: Factorized Autoregressive Decoding for Online Dense Video Captioning AJ Piergiovanni et.al. 2411.14688 null
2024-11-21 Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding Yiming Zhang et.al. 2411.14401 null
2024-11-20 Extending Video Masked Autoencoders to 128 frames Nitesh Bharadwaj Gundavarapu et.al. 2411.13683 null
2024-11-20 Principles of Visual Tokens for Efficient Video Understanding Xinyue Hao et.al. 2411.13626 null
2024-11-20 Teaching VLMs to Localize Specific Objects from In-context Examples Sivan Doveh et.al. 2411.13317 link
2024-11-20 VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation Ziyang Luo et.al. 2411.13281 null
2024-11-20 Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension Yongdong Luo et.al. 2411.13093 link
2024-11-19 AdaCM$^2$: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction Yuanbin Man et.al. 2411.12593 null
2024-11-19 DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding Yudong Han et.al. 2411.12355 null
2024-11-17 TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models Tingyu Qu et.al. 2411.11066 link
2024-11-16 ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models Vipula Rawte et.al. 2411.10867 null
2024-11-13 Can MLLMs Guide Weakly-Supervised Temporal Action Localization Tasks? Quan Zhang et.al. 2411.08466 null
2024-11-12 Grounded Video Caption Generation Evangelos Kazakos et.al. 2411.07584 null
2024-11-11 EVQAScore: Efficient Video Question Answering Data Evaluation Hao Liang et.al. 2411.06908 null
2024-11-11 Multi-Modal interpretable automatic video captioning Antoine Hanna-Asaad et.al. 2411.06872 null
2024-11-08 Poze: Sports Technique Feedback under Data Constraints Agamdeep Singh et.al. 2411.05734 null
2024-11-08 Video RWKV: Video Action Recognition Based RWKV Zhuowen Yin et.al. 2411.05636 null
2024-11-06 Pseudo-labeling with Keyword Refining for Few-Supervised Video Captioning Ping Li et.al. 2411.04059 link
2024-11-06 StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding Junming Lin et.al. 2411.03628 link
2024-11-05 Personalized Video Summarization by Multimodal Video Understanding Brian Chen et.al. 2411.03531 null
2024-11-05 PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance Ruyang Liu et.al. 2411.02327 link
2024-11-04 SPECTRUM: Semantic Processing and Emotion-informed video-Captioning Through Retrieval and Understanding Modalities Ehsan Faghihi et.al. 2411.01975 null
2024-11-02 Designing a Robust Radiology Report Generation System Sonit Singh et.al. 2411.01153 null
2024-10-31 Technical Report for Soccernet 2023 – Dense Video Captioning Zheng Ruan et.al. 2411.00882 null
2024-10-31 Video Token Merging for Long-form Video Understanding Seon-Ho Lee et.al. 2410.23782 null
2024-10-30 TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models Ziyao Shangguan et.al. 2410.23266 link
2024-10-30 Situational Scene Graph for Structured Human-centric Situation Understanding Chinthani Sugandhika et.al. 2410.22829 null
2024-10-29 Standardization Trends on Safety and Trustworthiness Technology for Advanced AI Jonghong Jeon et.al. 2410.22151 null
2024-10-28 Zero-Shot Action Recognition in Surveillance Videos Joao Pereira et.al. 2410.21113 null
2024-10-26 Adaptive Video Understanding Agent: Enhancing efficiency with dynamic frame sampling and feedback-driven reasoning Sullam Jeoung et.al. 2410.20252 null
2024-10-25 FLAASH: Flow-Attention Adaptive Semantic Hierarchical Fusion for Multi-Modal Tobacco Content Analysis Naga VS Raviteja Chappa et.al. 2410.19896 null
2024-10-25 TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning Xiangyu Zeng et.al. 2410.19702 null
2024-10-24 VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks Lawrence Jang et.al. 2410.19100 null
2024-10-24 CAMEL-Bench: A Comprehensive Arabic LMM Benchmark Sara Ghaboura et.al. 2410.18976 link
2024-10-22 LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding Xiaoqian Shen et.al. 2410.17434 link
2024-10-22 Order Matters: Exploring Order Sensitivity in Multimodal Large Language Models Zhijie Tan et.al. 2410.16983 null
2024-10-22 EVC-MF: End-to-end Video Captioning Network with Multi-scale Features Tian-Zi Niu et.al. 2410.16624 null
2024-10-21 xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs Michael S. Ryoo et.al. 2410.16267 null
2024-10-20 EVA: An Embodied World Model for Future Video Anticipation Xiaowei Chi et.al. 2410.15461 null
2024-10-20 ContextDet: Temporal Action Detection with Adaptive Context Aggregation Ning Wang et.al. 2410.15279 null
2024-10-20 Can LVLMs Describe Videos like Humans? A Five-in-One Video Annotations Benchmark for Better Human-Machine Comparison Shiyu Hu et.al. 2410.15270 null
2024-10-19 Making Every Frame Matter: Continuous Video Understanding for Large Models via Adaptive State Modeling Hao Wu et.al. 2410.14993 null
2024-10-18 Zero-shot Action Localization via the Confidence of Large Vision-Language Models Josiah Aklilu et.al. 2410.14340 null
2024-10-15 It’s Just Another Day: Unique Video Captioning by Discriminative Prompting Toby Perrett et.al. 2410.11702 null
2024-10-15 VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI Sijie Cheng et.al. 2410.11623 null
2024-10-15 VidCompress: Memory-Enhanced Temporal Compression for Video Understanding in Large Language Models Xiaohan Lan et.al. 2410.11417 null
2024-10-15 TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models Mu Cai et.al. 2410.10818 link
2024-10-14 LVD-2M: A Long-take Video Dataset with Temporally Dense Captions Tianwei Xiong et.al. 2410.10816 link
2024-10-14 MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer Minghao Zhu et.al. 2410.10589 link
2024-10-16 Free Video-LLM: Prompt-guided Visual Perception for Efficient Training-free Video LLMs Kai Han et.al. 2410.10441 link
2024-10-13 ViFi-ReID: A Two-Stream Vision-WiFi Multimodal Approach for Person Re-identification Chen Mao et.al. 2410.09875 null
2024-10-13 MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models Hang Hua et.al. 2410.09733 null
2024-10-12 Prompting Video-Language Foundation Models with Domain-specific Fine-grained Heuristics for Video Question Answering Ting Yu et.al. 2410.09380 null
2024-10-12 Multi-granularity Contrastive Cross-modal Collaborative Generation for End-to-End Long-term Video Question Answering Ting Yu et.al. 2410.09379 link
2024-10-11 VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding Houlun Chen et.al. 2410.08593 link
2024-10-10 Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models Qingni Wang et.al. 2410.08174 null
2024-10-10 Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations Yiyuan Zhang et.al. 2410.08049 link
2024-10-10 TVBench: Redesigning Video-Language Evaluation Daniel Cores et.al. 2410.07752 null
2024-10-09 MM-Ego: Towards Building Egocentric Multimodal LLMs Hanrong Ye et.al. 2410.07177 null
2024-10-11 Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization Changli Tang et.al. 2410.06682 null
2024-10-15 ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition Mohammadreza Salehi et.al. 2410.05774 null
2024-10-08 Enhancing Temporal Modeling of Video LLMs via Time Gating Zi-Yuan Hu et.al. 2410.05714 link
2024-10-08 TRACE: Temporal Grounding Video LLM via Causal Event Modeling Yongxin Guo et.al. 2410.05643 link
2024-10-09 SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference Yuan Zhang et.al. 2410.04417 link
2024-10-04 SONIQUE: Video Background Music Generation Using Unpaired Audio-Visual Data Liqian Zhang et.al. 2410.03879 link
2024-10-04 Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models Haibo Wang et.al. 2410.03290 link
2024-10-07 Frame-Voyager: Learning to Query Frames for Video Large Language Models Sicheng Yu et.al. 2410.03226 null
2024-10-04 AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark Wenhao Chai et.al. 2410.03051 null
2024-10-03 AirLetters: An Open Video Dataset of Characters Drawn in the Air Rishit Dagli et.al. 2410.02921 null
2024-10-01 YouTube Video Analytics for Patient Engagement: Evidence from Colonoscopy Preparation Videos Yawen Guo et.al. 2410.02830 null
2024-10-03 Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos Jianrui Zhang et.al. 2410.02763 null
2024-10-09 DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM Xuchen Li et.al. 2410.02492 null
2024-10-02 Deep learning for action spotting in association football videos Silvio Giancola et.al. 2410.01304 null
2024-10-02 UAL-Bench: The First Comprehensive Unusual Activity Localization Benchmark Hasnat Md Abdullah et.al. 2410.01180 link
2024-10-01 ScVLM: a Vision-Language Model for Driving Safety Critical Event Understanding Liang Shi et.al. 2410.00982 null
2024-10-01 Empowering Large Language Model for Continual Video Question Answering with Collaborative Prompting Chen Cai et.al. 2410.00771 null
2024-09-30 MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Haotian Zhang et.al. 2409.20566 null
2024-10-04 VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs Ruotong Liao et.al. 2409.20365 link
2024-09-30 Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs Zicheng Zhang et.al. 2409.20063 null
2024-10-02 Visual Context Window Extension: A New Perspective for Long Video Understanding Hongchen Wei et.al. 2409.20018 null
2024-09-29 Video DataFlywheel: Resolving the Impossible Data Trinity in Video-Language Understanding Xiao Wang et.al. 2409.19532 null
2024-09-27 From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding Heqing Zou et.al. 2409.18938 null
2024-09-27 Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks Min Yang et.al. 2409.18478 null
2024-09-26 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding Ye Liu et.al. 2409.18111 link
2024-09-26 IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning Soeun Lee et.al. 2409.18046 link
2024-09-26 LLM4Brain: Training a Large Language Model for Brain Video Understanding Ruizhe Zheng et.al. 2409.17987 null
2024-09-26 EAGLE: Egocentric AGgregated Language-video Engine Jing Bi et.al. 2409.17523 null
2024-09-23 Can CLIP Count Stars? An Empirical Study on Quantity Bias in CLIP Zeliang Zhang et.al. 2409.15035 null
2024-09-24 Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding Yan Shu et.al. 2409.14485 null
2024-09-22 Scene-Text Grounding for Text-Based Video Question Answering Sheng Zhou et.al. 2409.14319 link
2024-09-20 ReMEmbR: Building and Reasoning Over Long-Horizon Spatio-Temporal Memory for Robot Navigation Abrar Anwar et.al. 2409.13682 link
2024-09-20 Towards Child-Inclusive Clinical Video Understanding for Autism Spectrum Disorder Aditya Kommineni et.al. 2409.13606 null
2024-09-20 First Place Solution to the Multiple-choice Video QA Track of The Second Perception Test Challenge Yingzhe Peng et.al. 2409.13538 null
2024-09-19 Interpretable Action Recognition on Hard to Classify Actions Anastasia Anichenko et.al. 2409.13091 null
2024-09-17 AMEGO: Active Memory from long EGOcentric videos Gabriele Goletto et.al. 2409.10917 null
2024-09-16 HAVANA: Hierarchical stochastic neighbor embedding for Accelerated Video ANnotAtions Alexandru Bobe et.al. 2409.10641 null
2024-09-16 SoccerNet 2024 Challenges Results Anthony Cioppa et.al. 2409.10587 link
2024-09-14 QTG-VQA: Question-Type-Guided Architectural for VideoQA Systems Zhixian He et.al. 2409.09348 null
2024-09-12 Top-down Activity Representation Learning for Video Question Answering Yanan Wang et.al. 2409.07748 null
2024-09-12 Multi-object event graph representation learning for Video Question Answering Yanan Wang et.al. 2409.07747 null
2024-09-10 Enhancing Long Video Understanding via Hierarchical Event-Based Memory Dingxin Cheng et.al. 2409.06299 null
2024-09-11 VidLPRO: A Video-Language Pre-training Framework for Robotic and Laparoscopic Surgery Mohammadmahdi Honarmand et.al. 2409.04732 null
2024-09-06 Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment Keyne Oei et.al. 2409.04607 link
2024-09-05 TC-LLaVA: Rethinking the Transfer from Image to Video Understanding with Temporal Considerations Mingze Gao et.al. 2409.03206 null
2024-09-04 LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture Xidong Wang et.al. 2409.02889 link
2024-09-03 A Novel Audio-Visual Information Fusion System for Mental Disorders Detection Yichun Li et.al. 2409.02243 null
2024-09-02 VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges Yuxuan Wang et.al. 2409.01071 null
2024-08-31 Streamlining Forest Wildfire Surveillance: AI-Enhanced UAVs Utilizing the FLAME Aerial Video Dataset for Lightweight and Efficient Monitoring Lemeng Zhao et.al. 2409.00510 null
2024-08-31 StimuVAR: Spatiotemporal Stimuli-aware Video Affective Reasoning with Multimodal Large Language Models Yuxiang Guo et.al. 2409.00304 null
2024-09-20 HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics Gueter Josmy Faure et.al. 2408.17443 link
2024-08-29 CogVLM2: Visual Language Models for Image and Video Understanding Wenyi Hong et.al. 2408.16500 link
2024-08-29 DLM-VMTL: A Double Layer Mapper for heterogeneous data video Multi-task prompt learning Zeyi Bo et.al. 2408.16195 null
2024-08-28 Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input Jiajun Liu et.al. 2408.15542 null
2024-08-27 Fine-grained length controllable video captioning with ordinal embeddings Tomoya Nitta et.al. 2408.15447 null
2024-08-27 GenRec: Unifying Video Generation and Recognition with Diffusion Models Zejia Weng et.al. 2408.15241 link
2024-08-27 Sec2Sec Co-attention for Video-Based Apparent Affective Prediction Mingwei Sun et.al. 2408.15209 link
2024-08-26 Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos Qirui Chen et.al. 2408.14469 null
2024-08-26 Attend-Fusion: Efficient Audio-Visual Fusion for Video Classification Mahrukh Awan et.al. 2408.14441 null
2024-08-26 Video-CCAM: Enhancing Video-Language Understanding with Causal Cross-Attention Masks for Short and Long Videos Jiajun Fei et.al. 2408.14023 link
2024-08-26 LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models Qihang Ge et.al. 2408.14008 null
2024-08-23 Cap2Sum: Learning to Summarize Videos by Generating Captions Cairong Zhao et.al. 2408.12800 null
2024-08-22 Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models Jean Park et.al. 2408.12763 null
2024-08-21 Audio Description Customization Rosiana Natalie et.al. 2408.11406 null
2024-08-21 LongVILA: Scaling Long-Context Visual Language Models for Long Videos Fuzhao Xue et.al. 2408.10188 link
2024-08-17 Flatten: Video Action Recognition is an Image Classification task Junlin Chen et.al. 2408.09220 null
2024-07-31 Segment Anything for Videos: A Systematic Survey Chunhui Zhang et.al. 2408.08315 link
2024-08-15 VLPG-Nav: Object Navigation Using Visual Language Pose Graph and Object Localization Probability Maps Senthil Hariharan Arul et.al. 2408.08301 null
2024-08-15 LLaVA-Surg: Towards Multimodal Surgical Assistant via Structured Surgical Video Learning Jiajie Li et.al. 2408.07981 null
2024-08-15 Continuous Perception Benchmark Zeyu Wang et.al. 2408.07867 null
2024-08-14 Disentangle and denoise: Tackling context misalignment for video moment retrieval Kaijing Ma et.al. 2408.07600 null
2024-08-12 HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization Sakib Reza et.al. 2408.06437 link
2024-08-12 OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning Mushui Liu et.al. 2408.06158 link
2024-08-12 CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer Zhuoyi Yang et.al. 2408.06072 link
2024-08-09 Spherical World-Locking for Audio-Visual Localization in Egocentric Videos Heeseung Yun et.al. 2408.05364 null
2024-08-08 VideoQA in the Era of LLMs: An Empirical Study Junbin Xiao et.al. 2408.04223 link
2024-08-06 LLaVA-OneVision: Easy Visual Task Transfer Bo Li et.al. 2408.03326 link
2024-08-06 Dual-path Collaborative Generation Network for Emotional Video Captioning Cheng Ye et.al. 2408.03006 link
2024-08-05 Towards Coarse-grained Visual Language Navigation Task Planning Enhanced by Event Knowledge Graph Zhao Kaichen et.al. 2408.02535 null
2024-08-05 FE-Adapter: Adapting Image-based Emotion Classifiers to Videos Shreyank N Gowda et.al. 2408.02421 null
2024-08-05 COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark Koki Maeda et.al. 2408.02272 link
2024-08-01 Text-Guided Video Masked Autoencoder David Fan et.al. 2408.00759 null
2024-08-01 Multimodal Fusion and Coherence Modeling for Video Topic Segmentation Hai Yu et.al. 2408.00365 null
2024-07-31 Learning Video Context as Interleaved Multimodal Sequences Kevin Qinghong Lin et.al. 2407.21757 link
2024-07-30 Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos Dhruv Verma et.al. 2407.20642 link
2024-07-23 Causal Understanding For Video Question Answering Bhanu Prakash Reddy Guda et.al. 2407.20257 null
2024-07-29 Adversarial Robustness in RGB-Skeleton Action Recognition: Leveraging Attention Modality Reweighter Chao Liu et.al. 2407.19981 null
2024-07-28 Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation Tz-Ying Wu et.al. 2407.19520 null
2024-07-26 Wolf: Captioning Everything with a World Summarization Framework Boyi Li et.al. 2407.18908 null
2024-07-26 Harnessing Temporal Causality for Advanced Temporal Action Detection Shuming Liu et.al. 2407.17792 link
2024-07-23 EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval Thomas Hummel et.al. 2407.16658 link
2024-07-22 LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding Haoning Wu et.al. 2407.15754 link
2024-07-23 End-to-End Video Question Answering with Frame Scoring Mechanisms and Adaptive Sampling Jianxin Liang et.al. 2407.15047 null
2024-07-21 Audio-visual training for improved grounding in video-text LLMs Shivprasad Sagare et.al. 2407.15046 null
2024-07-19 EVLM: An Efficient Vision-Language Model for Visual Understanding Kaibing Chen et.al. 2407.14177 null
2024-07-19 Reexamining Racial Disparities in Automatic Speech Recognition Performance: The Role of Confounding by Provenance Changye Li et.al. 2407.13982 null
2024-07-18 Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data Wufei Ma et.al. 2407.13094 null
2024-07-17 Goldfish: Vision-Language Understanding of Arbitrarily Long Videos Kirolos Ataallah et.al. 2407.12679 null
2024-07-16 Scaling Sign Language Translation Biao Zhang et.al. 2407.11855 null
2024-07-23 Video-Language Alignment via Spatio-Temporal Graph Transformer Shi-Xue Zhang et.al. 2407.11677 link
2024-07-04 Purification Of Contaminated Convolutional Neural Networks Via Robust Recovery: An Approach with Theoretical Guarantee in One-Hidden-Layer Case Hanxiao Lu et.al. 2407.11031 null
2024-07-15 TripletViNet: Mitigating Misinformation Video Spread Across Platforms Petar Smolovic et.al. 2407.10644 null
2024-07-12 Open Vocabulary Multi-Label Video Classification Rohit Gupta et.al. 2407.09073 null
2024-07-11 VideoMamba: Spatio-Temporal Selective State Space Model Jinyoung Park et.al. 2407.08476 link
2024-07-16 Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding Minghui Wu et.al. 2407.08150 link
2024-07-10 Malicious Path Manipulations via Exploitation of Representation Vulnerabilities of Vision-Language Navigation Systems Chashi Mahiul Islam et.al. 2407.07392 null
2024-07-09 Rethinking Image-to-Video Adaptation: An Object-centric Perspective Rui Qian et.al. 2407.06871 null
2024-07-09 VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model Xinhao Li et.al. 2407.06491 link
2024-07-08 Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision Orr Zohar et.al. 2407.06189 link
2024-07-06 OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding Tiancheng Zhao et.al. 2407.04923 null
2024-07-20 Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning Thong Nguyen et.al. 2407.03788 link
2024-07-04 VDMA: Video Question Answering with Dynamically Generated Multi-Agents Noriyuki Kugo et.al. 2407.03610 null
2024-07-03 InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Pan Zhang et.al. 2407.03320 link
2024-07-03 KeyVideoLLM: Towards Large-scale Video Keyframe Selection Hao Liang et.al. 2407.03104 null
2024-07-03 Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering Zhaohe Liao et.al. 2407.03008 null
2024-07-03 PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition Yanbin Hao et.al. 2407.02934 link
2024-07-03 Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs Jinmin Li et.al. 2407.02411 null
2024-07-02 The Solution for the ICCV 2023 Perception Test Challenge 2023 – Task 6 – Grounded videoQA Hailiang Zhang et.al. 2407.01907 null
2024-07-10 Referring Atomic Video Action Recognition Kunyu Peng et.al. 2407.01872 link
2024-06-30 Tarsier: Recipes for Training and Evaluating Large Video Description Models Jiawei Wang et.al. 2407.00634 link
2024-06-30 Hierarchical Memory for Long Video QA Yiqin Wang et.al. 2407.00603 null
2024-06-28 InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding Kirolos Ataallah et.al. 2406.19875 link
2024-06-27 Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads Ali Khaleghi Rahimian et.al. 2406.19391 link
2024-06-27 OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding Tao Zhang et.al. 2406.19389 null
2024-06-27 VideoMambaPro: A Leap Forward for Mamba in Video Understanding Hui Lu et.al. 2406.19006 link
2024-06-25 Zero-Shot Long-Form Video Understanding through Screenplay Yongliang Wu et.al. 2406.17309 null
2024-06-24 PVUW 2024 Challenge on Complex Video Understanding: Methods and Results Henghui Ding et.al. 2406.17005 link
2024-06-25 OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer Lu Zhang et.al. 2406.16620 link
2024-06-24 Directed Domain Fine-Tuning: Tailoring Separate Modalities for Specific Training Tasks Daniel Wen et.al. 2406.16346 null
2024-06-24 VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models Yuxuan Wang et.al. 2406.16338 null
2024-06-22 HCQA @ Ego4D EgoSchema Challenge 2024 Haoyu Zhang et.al. 2406.15771 link
2024-06-22 video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models Guangzhi Sun et.al. 2406.15704 link
2024-06-20 MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding Xinyu Fang et.al. 2406.14515 link
2024-06-20 Live Video Captioning Eduardo Blanco-Fernández et.al. 2406.14206 link
2024-06-20 Towards Event-oriented Long Video Understanding Yifan Du et.al. 2406.14129 link
2024-06-19 Towards Holistic Language-video Representation: the language model-enhanced MSR-Video to Text Dataset Yuchen Yang et.al. 2406.13809 null
2024-06-21 AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video Understanding Alessandro Suglia et.al. 2406.13807 link
2024-06-19 GUI Action Narrator: Where and When Did That Action Take Place? Qinchen Wu et.al. 2406.13719 null
2024-06-19 GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement Hao Wang et.al. 2406.13136 null
2024-06-18 DrVideo: Document Retrieval Based Long Video Understanding Ziyu Ma et.al. 2406.12846 null
2024-06-18 VoCo-LLaMA: Towards Vision Compression with Large Language Models Xubing Ye et.al. 2406.12275 link
2024-06-26 Slot State Space Models Jindong Jiang et.al. 2406.12272 link
2024-06-18 Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM Huaxin Zhang et.al. 2406.12235 link
2024-06-17 Task Me Anything Jieyu Zhang et.al. 2406.11775 link
2024-06-17 Hallucination Mitigation Prompts Long-term Video Understanding Yiwei Sun et.al. 2406.11333 null
2024-06-17 VideoVista: A Versatile Benchmark for Video Understanding and Reasoning Yunxin Li et.al. 2406.11303 null
2024-06-17 i-SRT: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective Judgment Daechul Ahn et.al. 2406.11280 link
2024-06-16 VELOCITI: Can Video-Language Models Bind Semantic Concepts through Time? Darshana Saravanan et.al. 2406.10889 null
2024-06-15 EchoGuide: Active Acoustic Guidance for LLM-Based Eating Event Analysis from Egocentric Videos Vineet Parikh et.al. 2406.10750 null
2024-06-15 Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model Lu Xu et.al. 2406.10484 link
2024-06-14 Short Film Dataset (SFD): A Benchmark for Story-Level Video Understanding Ridouane Ghermi et.al. 2406.10221 link
2024-06-22 Localizing Events in Videos with Multimodal Queries Gengyuan Zhang et.al. 2406.10079 null
2024-06-14 GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding Yiqi Wu et.al. 2406.09781 null
2024-06-14 A Survey of Video Datasets for Grounded Event Understanding Kate Sanders et.al. 2406.09646 link
2024-06-13 VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding Muhammad Maaz et.al. 2406.09418 link
2024-06-17 Too Many Frames, not all Useful: Efficient Strategies for Long-Form Video QA Jongwoo Park et.al. 2406.09396 link
2024-06-13 Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs Zijia Zhao et.al. 2406.09367 link
2024-06-13 MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos Xuehai He et.al. 2406.08407 link
2024-06-12 Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams Haoji Zhang et.al. 2406.08085 link
2024-06-12 LVBench: An Extreme Long Video Understanding Benchmark Weihan Wang et.al. 2406.08035 link
2024-06-12 Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models Shimin Chen et.al. 2406.08024 null
2024-06-12 Labeling Comic Mischief Content in Online Videos with a Multimodal Hierarchical-Cross-Attention Model Elaheh Baharlouei et.al. 2406.07841 link
2024-06-17 VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Zesen Cheng et.al. 2406.07476 link
2024-06-11 MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD Ioanna Ntinou et.al. 2406.07191 null
2024-06-10 NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative Asmar Nadeem et.al. 2406.06499 null
2024-06-10 Vript: A Video Is Worth Thousands of Words Dongjie Yang et.al. 2406.06040 link
2024-06-08 1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR’24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation Qingfeng Liu et.al. 2406.05352 null
2024-06-07 Semantic Segmentation on VSPW Dataset through Masked Video Consistency Chen Liang et.al. 2406.04979 null
2024-06-06 ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Lin Chen et.al. 2406.04325 null
2024-06-06 MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding Junjie Zhou et.al. 2406.04264 link
2024-06-07 3rd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation Ruipu Wu et.al. 2406.04002 null
2024-06-04 Story Generation from Visual Inputs: Techniques, Related Tasks, and Challenges Daniel A. P. Oliveira et.al. 2406.02748 null
2024-06-04 Contrastive Language Video Time Pre-training Hengyue Liu et.al. 2406.02631 null
2024-05-21 Backpropogation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration Wei Ji et.al. 2406.01601 null
2024-06-03 Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos Luigi Seminara et.al. 2406.01486 link
2024-06-02 Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering Xingrui Wang et.al. 2406.00622 link
2024-06-01 2nd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation Biao Wu et.al. 2406.00500 null
2024-06-06 HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model Khoa Vo et.al. 2406.00307 null
2024-05-31 Shotluck Holmes: A Family of Efficient Small-Scale Large Language Vision Models For Video Captioning and Summarization Richard Luo et.al. 2405.20648 link
2024-05-30 Video Question Answering for People with Visual Impairments Using an Egocentric 360-Degree Camera Inpyo Song et.al. 2405.19794 null
2024-05-30 Encoding and Controlling Global Semantics for Long-form Video Question Answering Thong Thanh Nguyen et.al. 2405.19723 link
2024-05-30 EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos Ryo Fujii et.al. 2405.19644 link
2024-05-29 VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos Ziyang Wang et.al. 2405.19209 link
2024-05-28 MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning Somnath Kumar et.al. 2405.18358 null
2024-05-28 Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions Rui Zhang et.al. 2405.17729 null
2024-05-27 Video Enriched Retrieval Augmented Generation Using Aligned Video Captions Kevin Dela Rosa et.al. 2405.17706 link
2024-05-25 Streaming Long Video Understanding with Large Language Models Rui Qian et.al. 2405.16009 null
2024-05-23 MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models Jiuming Liu et.al. 2405.14338 null
2024-05-22 Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline Dingyi Yang et.al. 2405.14040 null
2024-05-22 TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment Wei Li et.al. 2405.13911 link
2024-05-22 Dense Connector for MLLMs Huanjin Yao et.al. 2405.13800 link
2024-05-22 VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding Yongxin Guo et.al. 2405.13382 link
2024-05-21 Anticipating Object State Changes Victoria Manousaki et.al. 2405.12789 null
2024-05-17 Open-Vocabulary Spatio-Temporal Action Detection Tao Wu et.al. 2405.10832 null
2024-05-14 Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis Yao Fu et.al. 2405.08944 null
2024-05-14 CinePile: A Long Video Question Answering Dataset and Benchmark Ruchit Rawal et.al. 2405.08813 null
2024-05-14 No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding Yingjie Zhai et.al. 2405.08344 link
2024-05-13 FreeVA: Offline MLLM as Training-Free Video Assistant Wenhao Wu et.al. 2405.07798 link
2024-05-11 Memory-Maze: Scenario Driven Benchmark and Visual Language Navigation Model for Guiding Blind People Masaki Kuribayashi et.al. 2405.07060 null
2024-05-11 Retrieval Enhanced Zero-Shot Video Captioning Yunchuan Ma et.al. 2405.07046 null
2024-05-11 Global Motion Understanding in Large-Scale Video Object Segmentation Volodymyr Fedynyak et.al. 2405.07031 null
2024-05-09 A Survey on Backbones for Deep Video Action Recognition Zixuan Tang et.al. 2405.05584 null
2024-05-08 Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios Chirag Parikh et.al. 2405.05354 null
2024-05-07 Vision Mamba: A Comprehensive Survey and Taxonomy Xiao Liu et.al. 2405.04404 link
2024-05-06 Foundation Models for Video Understanding: A Survey Neelu Madan et.al. 2405.03770 link
2024-05-08 How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs Muhammad Uzair Khattak et.al. 2405.03690 null
2024-05-06 WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning Yuanhan Zhang et.al. 2405.03272 null
2024-04-30 Cross-Block Fine-Grained Semantic Cascade for Skeleton-Based Sports Action Recognition Zhendong Liu et.al. 2404.19383 null
2024-05-01 Capabilities of Gemini Models in Medicine Khaled Saab et.al. 2404.18416 null
2024-04-26 Learning text-to-video retrieval from image captioning Lucas Ventura et.al. 2404.17498 null
2024-04-26 MovieChat+: Question-aware Sparse Memory for Long Video Question Answering Enxin Song et.al. 2404.17176 link
2024-04-26 Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting Yuanyuan Liu et.al. 2404.17100 null
2024-04-29 PLLaVA: Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning Lin Xu et.al. 2404.16994 link
2024-04-25 SFMViT: SlowFast Meet ViT in Chaotic World Jiaying Lin et.al. 2404.16609 link
2024-04-23 IPAD: Industrial Process Anomaly Detection Dataset Jinfan Liu et.al. 2404.15033 null
2024-04-23 Pegasus-v1 Technical Report Raehyuk Jung et.al. 2404.14687 null
2024-04-26 Narrative Action Evaluation with Prompt-Guided Multimodal Interaction Shiyi Zhang et.al. 2404.14471 link
2024-04-20 Movie101v2: Improved Movie Narration Benchmark Zihao Yue et.al. 2404.13370 null
2024-04-18 Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models Reka Team et.al. 2404.12387 null
2024-04-18 From Image to Video, what do we need in multimodal LLMs? Suyuan Huang et.al. 2404.11865 null
2024-04-17 VG4D: Vision-Language Model Goes 4D Video Recognition Zhichao Deng et.al. 2404.11605 link
2024-04-15 Leveraging Temporal Contextualization for Video Action Recognition Minji Kim et.al. 2404.09490 link
2024-04-15 The 8th AI City Challenge Shuo Wang et.al. 2404.09432 null
2024-04-16 Human-in-the-Loop Segmentation of Multi-species Coral Imagery Scarlett Raine et.al. 2404.09406 link
2024-04-14 In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition Wiktor Mucha et.al. 2404.09308 link
2024-04-14 TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning Quang Minh Dinh et.al. 2404.09275 link
2024-04-14 Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection Jin Yang et.al. 2404.09263 link
2024-04-12 Enhancing Traffic Safety with Parallel Dense Video Captioning for End-to-End Event Analysis Maged Shoman et.al. 2404.08229 link
2024-04-11 Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval Minkuk Kim et.al. 2404.07610 link
2024-04-10 A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos Suleyman Ozdel et.al. 2404.07351 null
2024-04-10 Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention Suleyman Ozdel et.al. 2404.07347 null
2024-04-09 MoReVQA: Exploring Modular Reasoning Models for Video Question Answering Juhong Min et.al. 2404.06511 null
2024-04-07 X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model Jan Held et.al. 2404.06332 null
2024-04-24 MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding Bo He et.al. 2404.05726 link
2024-04-06 SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos Tao Wu et.al. 2404.04565 link
2024-04-19 Koala: Key frame-conditioned long video-LLM Reuben Tan et.al. 2404.04346 null
2024-04-05 Neural-Symbolic VideoQA: Learning Compositional Spatio-Temporal Reasoning for Real-world Video Question Answering Lili Liang et.al. 2404.04007 null
2024-04-04 OW-VISCap: Open-World Video Instance Segmentation and Captioning Anwesa Choudhuri et.al. 2404.03657 null
2024-04-04 MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens Kirolos Ataallah et.al. 2404.03413 link
2024-04-10 LongVLM: Efficient Long Video Understanding via Large Language Models Yuetian Weng et.al. 2404.03384 link
2024-04-03 DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement Hao Wu et.al. 2404.02755 null
2024-04-05 SnAG: Scalable and Accurate Video Grounding Fangzhou Mu et.al. 2404.02257 null
2024-04-01 TraveLER: A Multi-LMM Agent Framework for Video Question-Answering Chuyi Shang et.al. 2404.01476 link
2024-04-01 CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes Ting En Lam et.al. 2404.01299 link
2024-04-01 Streaming Dense Video Captioning Xingyi Zhou et.al. 2404.01297 link
2024-04-02 Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward Ruohong Zhang et.al. 2404.01258 link
2024-04-01 VideoDistill: Language-aware Vision Distillation for Video Question Answering Bo Zou et.al. 2404.00973 null
2024-03-31 $R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding Ye Liu et.al. 2404.00801 link
2024-03-30 Instrument-tissue Interaction Detection Framework for Surgical Video Understanding Wenjun Lin et.al. 2404.00322 null
2024-03-30 ST-LLM: Large Language Models Are Effective Temporal Learners Ruyang Liu et.al. 2404.00308 link
2024-03-29 A Unified Framework for Human-centric Point Cloud Video Understanding Yiteng Xu et.al. 2403.20031 null
2024-03-28 Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality Sishuo Chen et.al. 2403.19221 link
2024-03-27 An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM Wonkyun Kim et.al. 2403.18406 link
2024-03-26 OmniVid: A Generative Framework for Universal Video Understanding Junke Wang et.al. 2403.17935 link
2024-03-25 Understanding Long Videos in One Multimodal Language Model Pass Kanchana Ranasinghe et.al. 2403.16998 link
2024-03-24 AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue Yunlong Tang et.al. 2403.16276 null
2024-03-22 InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Yi Wang et.al. 2403.15377 link
2024-03-25 VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding Ahmad Mahmood et.al. 2403.14743 link
2024-03-21 Language Repository for Long Video Understanding Kumara Kahatapitiya et.al. 2403.14622 link
2024-03-21 Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels Tianming Liang et.al. 2403.14430 null
2024-03-18 Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation Zixin Zhu et.al. 2403.12042 link
2024-03-18 Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation Wangbo Zhao et.al. 2403.11808 link
2024-03-27 LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model Yuxin Cao et.al. 2403.11656 null
2024-03-18 VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding Yue Fan et.al. 2403.11481 null
2024-03-15 VideoAgent: Long-form Video Understanding with Large Language Model as Agent Xiaohan Wang et.al. 2403.10517 null
2024-03-14 Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding Guo Chen et.al. 2403.09626 link
2024-03-25 Don’t Judge by the Look: Towards Motion Coherent Video Representation Yitian Zhang et.al. 2403.09506 link
2024-03-13 DAM: Dynamic Adapter Merging for Continual Video QA Learning Feng Cheng et.al. 2403.08755 link
2024-03-11 Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions Lan Wang et.al. 2403.07198 null
2024-03-12 VideoMamba: State Space Model for Efficient Video Understanding Kunchang Li et.al. 2403.06977 link
2024-03-25 An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models Liang Chen et.al. 2403.06764 link
2024-03-08 Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation Joseph Cho et.al. 2403.05131 null
2024-03-11 Beyond MOT: Semantic Multi-Object Tracking Yunhao Li et.al. 2403.05021 link
2024-03-08 Pix2Gif: Motion-Guided Diffusion for GIF Generation Hitesh Kandala et.al. 2403.04634 link
2024-03-05 A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives Simone Alberto Peirone et.al. 2403.03037 null
2024-03-03 MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies Zhende Song et.al. 2403.01422 null
2024-03-01 Abductive Ego-View Accident Video Understanding for Safe Driving Perception Jianwu Fang et.al. 2403.00436 null
2024-02-29 Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers Tsai-Shien Chen et.al. 2402.19479 null
2024-03-11 TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning Kate Sanders et.al. 2402.19467 null
2024-02-29 Percept, Chat, and then Adapt: Multimodal Knowledge Transfer of Foundation Models for Open-World Video Recognition Boyu Chen et.al. 2402.18951 null
2024-02-27 MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning Huiyu Xiong et.al. 2402.17680 null
2024-02-25 LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding Yuxuan Wang et.al. 2402.16050 link
2024-02-22 Think before You Leap: Content-Aware Low-Cost Edge-Assisted Video Semantic Segmentation Mingxuan Yan et.al. 2402.14326 null
2024-02-21 LLMs Meet Long Video: Advancing Long Video Comprehension with An Interactive Visual Adapter in LLMs Yunxin Li et.al. 2402.13546 null
2024-02-28 Video ReCap: Recursive Captioning of Hour-Long Videos Md Mohaiminul Islam et.al. 2402.13250 link
2024-02-20 VideoPrism: A Foundational Visual Encoder for Video Understanding Long Zhao et.al. 2402.13217 null
2024-02-20 Slot-VLM: SlowFast Slots for Video-Language Modeling Jiaqi Xu et.al. 2402.13088 null
2024-02-19 System Identification of Neural Systems: Going Beyond Images to Modelling Dynamics Mai Gamal et.al. 2402.12519 null
2024-02-19 LVCHAT: Facilitating Long Video Comprehension Yu Wang et.al. 2402.12079 link
2024-02-28 Are you Struggling? Dataset and Baselines for Struggle Determination in Assembly Videos Shijia Feng et.al. 2402.11057 null
2024-02-16 Question-Instructed Visual Descriptions for Zero-Shot Video Question Answering David Romero et.al. 2402.10698 link
2024-02-13 World Model on Million-Length Video And Language With RingAttention Hao Liu et.al. 2402.08268 link
2024-02-12 BDIQA: A New Dataset for Video Question Answering to Explore Cognitive Reasoning through Theory of Mind Yuanyuan Mao et.al. 2402.07402 null
2024-02-09 Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning Amir Ziai et.al. 2402.06560 link
2024-02-09 Dynamic swarms regulate the morphology and distribution of soft membrane domains Aakanksha Gubbala et.al. 2402.06518 null
2024-02-08 Memory Consolidation Enables Long-Context Video Understanding Ivana Balažević et.al. 2402.05861 null
2024-02-06 Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization Yang Jin et.al. 2402.03161 null
2024-02-04 Spatio-temporal Prompting Network for Robust Video Feature Extraction Guanxiong Sun et.al. 2402.02574 link
2024-02-02 Simulator-Free Visual Domain Randomization via Video Games Chintan Trivedi et.al. 2402.01335 link
2024-01-30 YTCommentQA: Video Question Answerability in Instructional Videos Saelyne Yang et.al. 2401.17343 link
2024-01-30 Multi-granularity Correspondence Learning from Long-term Noisy Videos Yijie Lin et.al. 2401.16702 null
2024-01-29 Cutup and Detect: Human Fall Detection on Cutup Untrimmed Videos Using a Large Foundational Video Understanding Model Till Grutschus et.al. 2401.16280 null
2024-01-25 Knowledge Graph Supported Benchmark and Video Captioning for Basketball Zeyu Xi et.al. 2401.13888 null
2024-01-22 ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition Jiaming Zhou et.al. 2401.11654 null
2024-01-21 Exploring Missing Modality in Multimodal Egocentric Datasets Merey Ramazanova et.al. 2401.11470 null
2024-01-19 Learning to Visually Connect Actions and their Effects Eric Peh et.al. 2401.10805 null
2024-01-28 Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering Haibo Wang et.al. 2401.10711 link
2024-01-17 CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding Yunze Liu et.al. 2401.09057 null
2024-01-16 Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data Yuhui Zhang et.al. 2401.08567 link
2024-01-16 Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization Chongzhi Zhang et.al. 2401.08232 null
2024-01-11 Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition Yukun Zuo et.al. 2401.06287 link
2024-01-10 HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition Qian Wu et.al. 2401.04975 link
2024-01-10 SnapCap: Efficient Snapshot Compressive Video Captioning Jianqiao Sun et.al. 2401.04903 null
2024-01-08 Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification Wentao Zhu et.al. 2401.04154 null
2024-01-08 Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning Chen Zhao et.al. 2401.04105 link
2024-01-08 STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering Yueqian Wang et.al. 2401.03901 link

Multi-modal Learning

Publish Date Title Authors PDF Code
2024-12-10 Modeling High-Resolution Spatio-Temporal Wind with Deep Echo State Networks and Stochastic Partial Differential Equations Kesen Wang et.al. 2412.07265 null
2024-12-09 LMS-AutoTSF: Learnable Multi-Scale Decomposition and Integrated Autocorrelation for Time Series Forecasting Ibrahim Delibasoglu et.al. 2412.06866 link
2024-12-09 How to Merge Your Multimodal Models Over Time? Sebastian Dziadzio et.al. 2412.06712 null
2024-12-05 MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation Longtao Zheng et.al. 2412.04448 null
2024-12-03 Towards the efficacy of federated prediction for epidemics on networks Chengpeng Fu et.al. 2412.02161 null
2024-12-02 Navigating Challenges in Spatio-temporal Modelling of Antarctic Krill Abundance: Addressing Zero-inflated Data and Misaligned Covariates André Victor Ribeiro Amaral et.al. 2412.01399 link
2024-11-30 PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation Qiyao Xue et.al. 2412.00596 link
2024-11-27 Predicting Extubation Failure in Intensive Care: The Development of a Novel, End-to-End Actionable and Interpretable Prediction System Akram Yoosoofsah et.al. 2412.00105 null
2024-11-27 TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video Jinyuan Qu et.al. 2411.18671 null
2024-11-26 Temporal Models for Demographic and Global Health Outcomes in Multiple Populations: Introducing the Normal-with-Optional-Shrinkage Data Model Class Leontine Alkema et.al. 2411.18646 null
2024-11-26 SAMWISE: Infusing wisdom in SAM2 for Text-Driven Video Segmentation Claudia Cuttano et.al. 2411.17646 link
2024-11-25 GAST: Sequential Gaussian Avatars with Hierarchical Spatio-temporal Context Wangze Xu et.al. 2411.16768 null
2024-11-20 MambaDETR: Query-based Temporal Modeling using State Space Model for Multi-View 3D Object Detection Tong Ning et.al. 2411.13628 null
2024-11-19 Hierarchical Spatio-Temporal Uncertainty Quantification for Distributed Energy Adoption Wenbin Zhou et.al. 2411.12193 null
2024-11-15 TESGNN: Temporal Equivariant Scene Graph Neural Networks for Efficient and Robust Multi-View 3D Scene Understanding Quang P. M. Pham et.al. 2411.10509 link
2024-11-15 MDHP-Net: Detecting Injection Attacks on In-vehicle Network using Multi-Dimensional Hawkes Process and Temporal Model Qi Liu et.al. 2411.10258 null
2024-11-11 HSTrack: Bootstrap End-to-End Multi-Camera 3D Multi-object Tracking with Hybrid Supervision Shubo Lin et.al. 2411.06780 null
2024-11-14 Gaussian process modelling of infectious diseases using the Greta software package and GPUs Eva Gunn et.al. 2411.05556 null
2024-11-07 Multi-temporal crack segmentation in concrete structure using deep learning approaches Said Harb et.al. 2411.04620 null
2024-11-07 TrajGPT: Controlled Synthetic Trajectory Generation Using a Multitask Transformer-Based Spatiotemporal Model Shang-Ling Hsu et.al. 2411.04381 link
2024-11-05 FilterNet: Harnessing Frequency Filters for Time Series Forecasting Kun Yi et.al. 2411.01623 link
2024-10-31 Self-Ensembling Gaussian Splatting for Few-shot Novel View Synthesis Chen Zhao et.al. 2411.00144 null
2024-10-30 LGU-SLAM: Learnable Gaussian Uncertainty Matching with Deformable Correlation Sampling for Deep Visual SLAM Yucheng Huang et.al. 2410.23231 link
2024-10-27 Neural rendering enables dynamic tomography Ivan Grega et.al. 2410.20558 null
2024-10-25 UbiHR: Resource-efficient Long-range Heart Rate Sensing on Ubiquitous Devices Haoyu Bian et.al. 2410.19279 null
2024-10-24 Classifying Bicycle Infrastructure Using On-Bike Street-Level Images Kal Backman et.al. 2410.19194 null
2024-10-24 Spatio-spectral-temporal Modelling of Two Young Pulsar Wind Nebulae A. Kundu et.al. 2410.18386 null
2024-10-25 Beyond position: how rotary embeddings shape representations and memory in autoregressive transformers Valeria Ruscio et.al. 2410.18067 null
2024-10-22 A Survey on Deep Learning-based Gaze Direction Regression: Searching for the State-of-the-art Franko Šikić et.al. 2410.17082 null
2024-11-27 Spectrum and location of ongoing extreme particle acceleration in Cassiopeia A Jooyun Woo et.al. 2410.16522 null
2024-10-18 Context-Enhanced Multi-View Trajectory Representation Learning: Bridging the Gap through Self-Supervised Models Tangwen Qian et.al. 2410.13196 null
2024-10-14 Fed-piLot: Optimizing LoRA Assignment for Efficient Federated Foundation Model Fine-Tuning Zikai Zhang et.al. 2410.10200 null
2024-10-09 Causal Representation Learning in Temporal Data via Single-Parent Decoding Philippe Brouillard et.al. 2410.07013 link
2024-10-08 Enhancing Temporal Modeling of Video LLMs via Time Gating Zi-Yuan Hu et.al. 2410.05714 link
2024-10-04 Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models Haibo Wang et.al. 2410.03290 link
2024-10-04 Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach Yaofang Liu et.al. 2410.03160 link
2024-10-04 AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark Wenhao Chai et.al. 2410.03051 null
2024-10-03 A Spatio-Temporal Machine Learning Model for Mortgage Credit Risk: Default Probabilities and Loan Portfolios Pascal Kündig et.al. 2410.02846 link
2024-09-30 Masked Autoregressive Model for Weather Forecasting Doyi Kim et.al. 2409.20117 null
2024-09-30 SurgPETL: Parameter-Efficient Image-to-Surgical-Video Transfer Learning for Surgical Phase Recognition Shu Yang et.al. 2409.20083 null
2024-09-29 PPLNs: Parametric Piecewise Linear Networks for Event-Based Temporal Modeling and Beyond Chen Song et.al. 2409.19772 link
2024-09-26 PGN: The RNN’s New Successor is Effective for Long-Range Time Series Forecasting Yuxin Jia et.al. 2409.17703 link
2024-09-26 MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling Weihao Yuan et.al. 2409.17686 null
2024-09-23 Automated Spatio-Temporal Weather Modeling for Load Forecasting Julie Keisler et.al. 2409.16326 null
2024-09-24 Self-Supervised Representation Learning with Augmentations of Continuous Training Data Improves the Feel and Performance of Myoelectric Control Shriram Tallam Puranam Raghu et.al. 2409.16015 null
2024-09-24 DepMamba: Progressive Fusion Mamba for Multimodal Depression Detection Jiaxin Ye et.al. 2409.15936 link
2024-09-18 SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with Mamba Xiangning Zhang et.al. 2409.12108 null
2024-09-18 DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech Xin Qi et.al. 2409.11835 null
2024-09-21 Self-Supervised Learning via VICReg Enables Training of EMG Pattern Recognition Using Continuous Data with Unclear Labels Shriram Tallam Puranam Raghu et.al. 2409.11632 null
2024-09-14 QTG-VQA: Question-Type-Guided Architectural for VideoQA Systems Zhixian He et.al. 2409.09348 null
2024-09-08 Estimating velocities of infectious disease spread through spatio-temporal log-Gaussian Cox point processes Fernando Rodriguez Avellaneda et.al. 2409.05036 null
2024-09-05 TC-LLaVA: Rethinking the Transfer from Image to Video Understanding with Temporal Considerations Mingze Gao et.al. 2409.03206 null
2024-09-01 Searching for MeV-scale Axion-like Particles and Dark Photons with PandaX-4T PandaX Collaboration et.al. 2409.00773 null
2024-09-17 Robo-GS: A Physics Consistent Spatial-Temporal Model for Robotic Arm with Hybrid Representation Haozhe Lou et.al. 2408.14873 null
2024-08-23 Multivariate Time-Series Anomaly Detection based on Enhancing Graph Attention Networks with Topological Analysis Zhe Liu et.al. 2408.13082 link
2024-08-23 Animal Identification with Independent Foreground and Background Modeling Lukas Picek et.al. 2408.12930 null
2024-08-22 Deep Analysis of Time Series Data for Smart Grid Startup Strategies: A Transformer-LSTM-PSO Model Approach Zecheng Zhang et.al. 2408.12129 null
2024-08-20 TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning Bin Wang et.al. 2408.10688 link
2024-08-20 DemMamba: Alignment-free Raw Video Demoireing with Frequency-assisted Spatio-Temporal Mamba Shuning Xu et.al. 2408.10679 null
2024-08-20 Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended? Chen Liang et.al. 2408.10627 null
2024-08-19 Uncertainty Quantification of Pre-Trained and Fine-Tuned Surrogate Models using Conformal Prediction Vignesh Gopakumar et.al. 2408.09881 link
2024-08-14 Limit Theorems for Weakly Dependent Non-stationary Random Field Arrays and Asymptotic Inference of Dynamic Spatio-temporal Models Yue Pan et.al. 2408.07429 null
2024-08-12 OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning Mushui Liu et.al. 2408.06158 link
2024-08-12 Spacetime $E(n)$-Transformer: Equivariant Attention for Spatio-temporal Graphs Sergio G. Charles et.al. 2408.06039 null
2024-08-16 Performance and Non-adversarial Robustness of the Segment Anything Model 2 in Surgical Video Segmentation Yiqing Shen et.al. 2408.04098 null
2024-08-07 Surgformer: Surgical Transformer with Hierarchical Temporal Attention for Surgical Phase Recognition Shu Yang et.al. 2408.03867 link
2024-08-07 PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model Yunlong Huang et.al. 2408.03540 null
2024-09-09 SiamMo: Siamese Motion-Centric 3D Object Tracking Yuxiang Yang et.al. 2408.01688 link
2024-09-11 RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining Hongtao Wu et.al. 2407.21773 link
2024-08-03 Unveiling land use dynamics: Insights from a hierarchical Bayesian spatio-temporal modelling of Compositional Data Mario Figueira et.al. 2407.21695 null
2024-07-30 Autogenic Language Embedding for Coherent Point Tracking Zikai Song et.al. 2407.20730 link
2024-07-26 UniForensics: Face Forgery Detection via General Facial Representation Ziyuan Fang et.al. 2407.19079 null
2024-07-26 Harnessing Temporal Causality for Advanced Temporal Action Detection Shuming Liu et.al. 2407.17792 link
2024-07-24 PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction Nan Peng et.al. 2407.17378 link
2024-07-24 DVPE: Divided View Position Embedding for Multi-View 3D Object Detection Jiasen Wang et.al. 2407.16955 link
2024-07-22 A divide-and-conquer approach for spatio-temporal analysis of large house price data from Greater London Kapil Gupta et.al. 2407.15905 null
2024-07-03 Digital Twin-based Driver Risk-Aware Intelligent Mobility Analytics for Urban Transportation Management Tao Li et.al. 2407.15025 null
2024-08-06 Physics-guided Active Sample Reweighting for Urban Flow Prediction Wei Jiang et.al. 2407.13605 link
2024-07-15 Human-Centric Transformer for Domain Adaptive Action Recognition Kun-Yu Lin et.al. 2407.10860 null
2024-07-15 Spatio-temporal neural distance fields for conditional generative modeling of the heart Kristine Sørensen et.al. 2407.10663 link
2024-07-12 Open Vocabulary Multi-Label Video Classification Rohit Gupta et.al. 2407.09073 null
2024-07-09 Rethinking Image-to-Video Adaptation: An Object-centric Perspective Rui Qian et.al. 2407.06871 null
2024-07-07 Efficient Bayesian dynamic closed skew-normal model preserving mean and covariance for spatio-temporal data Hajime Kuno et.al. 2407.05288 link
2024-07-03 Graph and Skipped Transformer: Exploiting Spatial and Temporal Modeling Capacities for Efficient 3D Human Pose Estimation Mengmeng Cui et.al. 2407.02990 null
2024-07-03 PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition Yanbin Hao et.al. 2407.02934 link
2024-07-16 Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion Bohan Li et.al. 2407.02077 link
2024-07-29 Three-Stream Temporal-Shift Attention Network Based on Self-Knowledge Distillation for Micro-Expression Recognition Guanghao Zhu et.al. 2406.17538 null
2024-06-23 Multi-Scale Temporal Difference Transformer for Video-Text Retrieval Ni Wang et.al. 2406.16111 null
2024-06-20 ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning Zhongjie Duan et.al. 2406.14130 link
2024-06-20 LGmap: Local-to-Global Mapping Network for Online Long-Range Vectorized HD Map Construction Kuang Wu et.al. 2406.13988 null
2024-06-18 RIGL: A Unified Reciprocal Approach for Tracing the Independent and Group Learning Processes Xiaoshan Yu et.al. 2406.12465 link
2024-06-18 Translation Equivariant Transformer Neural Processes Matthew Ashman et.al. 2406.12409 null
2024-06-18 LiCAF: LiDAR-Camera Asymmetric Fusion for Gait Recognition Yunze Deng et.al. 2406.12355 null
2024-06-15 X-Ray spectral and temporal properties of LMXB 4U 1608-52 -- observed with AstroSat and NICER Sree Bhattacherjee et.al. 2406.10666 null
2024-06-13 OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation Junke Wang et.al. 2406.09399 link
2024-06-13 Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs Zijia Zhao et.al. 2406.09367 link
2024-06-17 VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Zesen Cheng et.al. 2406.07476 link
2024-06-11 RecMoDiffuse: Recurrent Flow Diffusion for Human Motion Generation Mirgahney Mohamed et.al. 2406.07169 null
2024-06-11 AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding Xing Zhang et.al. 2406.07091 null
2024-06-07 Joint Spatial-Temporal Modeling and Contrastive Learning for Self-supervised Heart Rate Measurement Wei Qian et.al. 2406.04942 null
2024-06-07 Bayesian inference of Latent Spectral Shapes Hiu Ching Yip et.al. 2406.04915 null
2024-06-07 MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome Yixin Huang et.al. 2406.04680 link
2024-06-05 Non-stationary Spatio-Temporal Modeling Using the Stochastic Advection-Diffusion Equation Martin Outzen Berild et.al. 2406.03400 link
2024-06-04 I4VGen: Image as Stepping Stone for Text-to-Video Generation Xiefan Guo et.al. 2406.02230 null
2024-06-03 UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation Xiang Wang et.al. 2406.01188 null
2024-06-01 DSCA: A Digital Subtraction Angiography Sequence Dataset and Spatio-Temporal Model for Cerebral Artery Segmentation Qihang Xie et.al. 2406.00341 null
2024-06-01 A Review of Pulse-Coupled Neural Network Applications in Computer Vision and Image Processing Nurul Rafi et.al. 2406.00239 null
2024-05-31 Streamflow Prediction with Uncertainty Quantification for Water Management: A Constrained Reasoning and Learning Approach Mohammed Amine Gharsallaoui et.al. 2406.00133 null
2024-05-31 4Diffusion: Multi-view Video Diffusion Model for 4D Generation Haiyu Zhang et.al. 2405.20674 null
2024-05-30 Streaming Video Diffusion: Online Video Editing with Diffusion Models Feng Chen et.al. 2405.19726 link
2024-05-30 Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training Jinxia Yang et.al. 2405.19654 link
2024-05-30 FTS: A Framework to Find a Faithful TimeSieve Songning Lai et.al. 2405.19647 null
2024-05-24 Dynamical Analysis of a Cocaine-Heroin Epidemiological Model with Spatial Distributions Achraf Zinihi et.al. 2405.15532 null
2024-05-20 Biomarker Selection for Adaptive Systems Joshua Pickard et.al. 2405.09809 null
2024-05-14 No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding Yingjie Zhai et.al. 2405.08344 link
2024-05-13 Improved Bound for Robust Causal Bandits with Linear Models Zirui Yan et.al. 2405.07795 null
2024-05-10 Residual-based Attention Physics-informed Neural Networks for Efficient Spatio-Temporal Lifetime Assessment of Transformers Operated in Renewable Power Plants Ibai Ramirez et.al. 2405.06443 null
2024-05-10 A Multi-Channel Spatial-Temporal Transformer Model for Traffic Flow Forecasting Jianli Xiao et.al. 2405.06266 null
2024-05-07 DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving Chen Min et.al. 2405.04390 null
2024-05-07 Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling Jiawei Shi et.al. 2405.04309 null
2024-05-06 Hierarchical Space-Time Attention for Micro-Expression Recognition Haihong Hao et.al. 2405.03202 link
2024-05-21 RSCaMa: Remote Sensing Image Change Captioning with State Space Model Chenyang Liu et.al. 2404.18895 link
2024-04-24 Deep Predictive Model Learning with Parametric Bias: Handling Modeling Difficulties and Temporal Model Changes Kento Kawaharazuka et.al. 2404.15726 null
2024-04-19 MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model Kang Zeng et.al. 2404.12794 link
2024-04-13 Understanding Human-COVID-19 Dynamics using Geospatial Big Data: A Systematic Literature Review Binbin Lin et.al. 2404.10013 null
2024-04-15 A spatio-temporal model to detect potential outliers in disease mapping Victoire Michal et.al. 2404.09882 null
2024-04-11 Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos Soumyabrata Chaudhuri et.al. 2404.07645 link
2024-04-05 Low-Rank Robust Subspace Tensor Clustering for Metro Passenger Flow Modeling Jiuyun Hu et.al. 2404.04403 null
2024-04-03 Spatio-temporal Modeling of Count Data Steffen Maletz et.al. 2404.02982 link
2024-03-31 $R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding Ye Liu et.al. 2404.00801 link
2024-03-30 ST-LLM: Large Language Models Are Effective Temporal Learners Ruyang Liu et.al. 2404.00308 link
2024-03-28 X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization Anna Kukleva et.al. 2403.19811 link
2024-03-25 TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models Zhongwei Zhang et.al. 2403.17005 null
2024-04-13 Recursive Joint Cross-Modal Attention for Multimodal Fusion in Dimensional Emotion Recognition R. Gnana Praveen et.al. 2403.13659 link
2024-03-19 SUN Team’s Contribution to ABAW 2024 Competition: Audio-visual Valence-Arousal Estimation and Expression Recognition Denis Dresvyanskiy et.al. 2403.12609 null
2024-03-18 Bayesian Optimization Sequential Surrogate (BOSS) Algorithm: Fast Bayesian Inference for a Broad Class of Bayesian Hierarchical Models Dayi Li et.al. 2403.12250 null
2024-03-19 Exploring Facial Expression Recognition through Semi-Supervised Pretraining and Temporal Modeling Jun Yu et.al. 2403.11942 null
2024-03-15 Spatio-temporal Occupancy Models with INLA Jafet Belmont et.al. 2403.10680 null
2024-03-15 Multivariate Bayesian models with flexible shared interactions for analyzing spatio-temporal patterns of rare cancers Garazi Retegui et.al. 2403.10440 link
2024-03-13 Leveraging Non-Decimated Wavelet Packet Features and Transformer Models for Time Series Forecasting Guy P Nason et.al. 2403.08630 null
2024-03-10 Coherent Temporal Synthesis for Incremental Action Segmentation Guodong Ding et.al. 2403.06102 null
2024-04-26 Audio-Visual Person Verification based on Recursive Fusion of Joint Cross-Attention R. Gnana Praveen et.al. 2403.04654 link