ASR-TTS-paper-daily

🎯 ASR-TTS Paper Daily

Automatically curated collection of the latest research papers in Speech & Language Technology

📅 Updated on 2025.09.25

🌟 About This Repository

This repository provides a daily-updated collection of the latest research papers from arXiv in the following domains:

🎤 Automatic Speech Recognition (ASR)
🗣️ Text-to-Speech (TTS)
🌐 Machine Translation
⚡ Small Language Models
🔄 Data Augmentation
🎨 Synthetic Generation
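
As a rough illustration of how a daily list like this could be pulled from arXiv, the sketch below builds query URLs for the public arXiv API, sorted by submission date. It is not this repository's actual script: the `TOPIC_QUERIES` keyword map and the `build_query_url` helper are hypothetical names introduced for the example, and the search phrases are assumptions, not the repository's real configuration.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint (returns an Atom feed).
ARXIV_API = "http://export.arxiv.org/api/query"

# Hypothetical keyword map: one search phrase per domain listed above.
TOPIC_QUERIES = {
    "ASR": 'all:"speech recognition"',
    "TTS": 'all:"text to speech"',
    "Machine Translation": 'all:"machine translation"',
}


def build_query_url(search_query: str, max_results: int = 10) -> str:
    """Return an arXiv API URL listing the newest submissions first."""
    params = {
        "search_query": search_query,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    # Print one URL per topic; fetching and parsing the feed is left out.
    for topic, query in TOPIC_QUERIES.items():
        print(topic, build_query_url(query, max_results=5))
```

The example stops at URL construction so it runs without network access; turning the returned Atom feed into table rows would be a separate step.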

📖 Usage instructions: here
🌐 Web version: GitHub Pages

💡 This page is inspired by cv-arxiv-daily

🎤 ASR

📊 117 papers

📅 Publish Date	📝 Title	👥 Authors	📄 PDF	💻 Code
2025-09-23	WolBanking77: Wolof Banking Speech Intent Classification Dataset	Abdou Karim Kandji et.al.	2509.19271	null
2025-09-23	SloPalSpeech: A 2,800-Hour Slovak Speech Corpus from Parliamentary Data	Erik Božík et.al.	2509.19270	null
2025-09-23	LOTUSDIS: A Thai far-field meeting corpus for robust conversational ASR	Pattara Tipaksorn et.al.	2509.18722	null
2025-09-22	Speech Vecalign: an Embedding-based Method for Aligning Parallel Speech Documents	Chutong Meng et.al.	2509.18360	link
2025-09-20	Conversational Orientation Reasoning: Egocentric-to-Allocentric Navigation with Multimodal Chain-of-Thought	Yu Ti Huang et.al.	2509.18200	null
2025-09-19	MNV-17: A High-Quality Performative Mandarin Dataset for Nonverbal Vocalization Recognition in Speech	Jialong Mai et.al.	2509.18196	null
2025-09-22	Transformer-Encoder Trees for Efficient Multilingual Machine Translation and Speech Translation	Yiwen Guan et.al.	2509.17930	null
2025-09-22	Qwen3-Omni Technical Report	Jin Xu et.al.	2509.17765	null
2025-09-22	Leveraging Audio-Visual Data to Reduce the Multilingual Gap in Self-Supervised Speech Models	María Andrea Cruz Blandón et.al.	2509.17523	null
2025-09-20	Idiosyncratic Versus Normative Modeling of Atypical Speech Recognition: Dysarthric Case Studies	Vishnu Raja et.al.	2509.16718	null
2025-09-20	Audio-Conditioned Diffusion LLMs for ASR and Deliberation Processing	Mengqi Wang et.al.	2509.16622	null
2025-09-19	Whisper-UT: A Unified Translation Framework for Speech and Text	Cihan Xiao et.al.	2509.16375	null
2025-09-19	GLip: A Global-Local Integrated Progressive Framework for Robust Visual Speech Recognition	Tianyue Wang et.al.	2509.16031	null
2025-09-19	Session-Level Spoken Language Assessment with a Multimodal Foundation Model via Multi-Target Learning	Hong-Yun Lin et.al.	2509.16025	null
2025-09-19	Interpreting the Role of Visemes in Audio-Visual Speech Recognition	Aristeidis Papadopoulos et.al.	2509.16023	null
2025-09-19	VOX-KRIKRI: Unifying Speech and Language through Continuous Fusion	Dimitrios Damianos et.al.	2509.15667	null
2025-09-19	Layer-wise Minimal Pair Probing Reveals Contextual Grammatical-Conceptual Hierarchy in Speech Representations	Linyang He et.al.	2509.15655	null
2025-09-19	Thinking in cocktail party: Chain-of-Thought and reinforcement learning for target speaker automatic speech recognition	Yiru Zhang et.al.	2509.15612	null
2025-09-19	Chunk Based Speech Pre-training with High Resolution Finite Scalar Quantization	Yun Tang et.al.	2509.15579	null
2025-09-19	State-of-the-Art Dysarthric Speech Recognition with MetaICL for on-the-fly Personalization	Dhruuv Agarwal et.al.	2509.15516	null
2025-09-18	BiRQ: Bi-Level Self-Labeling Random Quantization for Self-Supervised Speech Recognition	Liuyuan Jiang et.al.	2509.15430	null
2025-09-18	Speech Language Models for Under-Represented Languages: Insights from Wolof	Yaya Sy et.al.	2509.15362	null
2025-09-20	Listening, Imagining & Refining: A Heuristic Optimized ASR Correction Framework with LLMs	Yutong Liu et.al.	2509.15095	null
2025-09-19	From Hype to Insight: Rethinking Large Language Model Integration in Visual Speech Recognition	Rishabh Jain et.al.	2509.14880	null
2025-09-18	Towards Building Speech Large Language Models for Multitask Understanding in Low-Resource Languages	Mingchen Shao et.al.	2509.14804	null
2025-09-18	UMA-Split: unimodal aggregation for both English and Mandarin non-autoregressive speech recognition	Ying Fang et.al.	2509.14653	null
2025-09-17	Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses	Yufeng Yang et.al.	2509.14430	null
2025-09-13	Context-Enhanced Granular Edit Representation for Efficient and Accurate ASR Post-editing	Luan Vejsiu et.al.	2509.14263	null
2025-09-17	Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST	Monica Sekoyan et.al.	2509.14128	null
2025-09-17	Language Conditioning Improves Accuracy of Aircraft Goal Prediction in Untowered Airspace	Sundhar Vinodh Sangeetha et.al.	2509.14063	null
2025-09-17	Conducting Mission-Critical Voice Experiments with Automated Speech Recognition and Crowdsourcing	Jan Janak et.al.	2509.13724	null
2025-09-16	Invisible Ears at Your Fingertips: Acoustic Eavesdropping via Mouse Sensors	Mohamad Fakih et.al.	2509.13581	null
2025-09-16	TICL: Text-Embedding KNN For Speech In-Context Learning Unlocks Speech Recognition Abilities of Large Multimodal Models	Haolong Zheng et.al.	2509.13395	null
2025-09-22	GLAD: Global-Local Aware Dynamic Mixture-of-Experts for Multi-Talker ASR	Yujie Guo et.al.	2509.13093	null
2025-09-16	PAC: Pronunciation-Aware Contextualized Large Language Model-based Automatic Speech Recognition	Li Fu et.al.	2509.12647	null
2025-09-17	FunAudio-ASR Technical Report	Keyu An et.al.	2509.12508	null
2025-09-15	In-domain SSL pre-training and streaming ASR	Jarod Duret et.al.	2509.12101	null
2025-09-12	Improving Audio Event Recognition with Consistency Regularization	Shanmuka Sadhu et.al.	2509.10391	null
2025-09-12	Data-independent Beamforming for End-to-end Multichannel Multi-speaker ASR	Can Cui et.al.	2509.10234	null
2025-09-12	Prominence-aware automatic speech recognition for conversational speech	Julian Linke et.al.	2509.10116	null
2025-09-12	Unified Learnable 2D Convolutional Feature Extraction for ASR	Peter Vieting et.al.	2509.10031	null
2025-09-11	Combining Textual and Spectral Features for Robust Classification of Pilot Communications	Abdullah All Tanvir et.al.	2509.09752	null
2025-09-11	Improving Synthetic Data Training for Contextual Biasing Models with a Keyword-Aware Cost Function	Chin Yuen Kwok et.al.	2509.09197	null
2025-09-11	Efficient Trie-based Biasing using K-step Prediction for Rare Word Recognition	Chin Yuen Kwok et.al.	2509.09196	null
2025-09-09	A Bottom-up Framework with Language-universal Speech Attribute Modeling for Syllable-based ASR	Hao Yen et.al.	2509.08173	null
2025-09-09	EnvX: Agentize Everything with Agentic AI	Linyao Chen et.al.	2509.08088	null
2025-09-08	Identifying and Calibrating Overconfidence in Noisy Speech Recognition	Mingyue Huo et.al.	2509.07195	null
2025-09-08	The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties	William Chen et.al.	2509.07139	null
2025-09-20	TSPC: A Two-Stage Phoneme-Centric Architecture for code-switching Vietnamese-English Speech Recognition	Minh N. H. Nguyen et.al.	2509.05983	null
2025-09-07	Enhancing the Robustness of Contextual ASR to Varying Biasing Information Volumes Through Purified Semantic Correlation Joint Modeling	Yue Gu et.al.	2509.05908	null
2025-09-06	New Insights into Optimal Alignment of Acoustic and Linguistic Representations for Knowledge Transfer in ASR	Xugang Lu et.al.	2509.05609	null
2025-09-05	Graph Connectionist Temporal Classification for Phoneme Recognition	Henry Grafé et.al.	2509.05399	null
2025-09-05	Layer-wise Analysis for Quality of Multilingual Synthesized Speech	Erica Cooper et.al.	2509.04830	null
2025-09-02	From Silent Signals to Natural Language: A Dual-Stage Transformer-LLM Approach	Nithyashree Sivasubramaniam et.al.	2509.04507	null
2025-09-01	Refining Transcripts With TV Subtitles by Prompt-Based Weakly Supervised Training of ASR	Xinnian Zhao et.al.	2509.04491	null
2025-09-01	Serialized Output Prompting for Large Language Model-based Multi-Talker Speech Recognition	Hao Shi et.al.	2509.04488	null
2025-08-29	SpeechLLM: Unified Speech and Language Model for Enhanced Multi-Task Understanding in Low Resource Settings	Jaekwon Yoo et.al.	2509.04473	null
2025-09-04	Contextualized Token Discrimination for Speech Search Query Correction	Junyu Lu et.al.	2509.04393	null
2025-09-04	Denoising GER: A Noise-Robust Generative Error Correction with LLM for Speech Recognition	Yanyan Liu et.al.	2509.04392	null
2025-09-04	PARCO: Phoneme-Augmented Robust Contextual ASR via Contrastive Entity Disambiguation	Jiajun He et.al.	2509.04357	null
2025-09-04	Enhancing Self-Supervised Speaker Verification Using Similarity-Connected Graphs and GCN	Zhaorui Sun et.al.	2509.04147	null
2025-08-27	An Effective Strategy for Modeling Score Ordinality and Non-uniform Intervals in Automated Speaking Assessment	Tien-Hong Lo et.al.	2509.03372	null
2025-09-05	Exploring persuasive interactions with generative social robots: An experimental framework	Stephan Vonschallen et.al.	2509.03231	null
2025-09-03	Beyond Words: Interjection Classification for Improved Human-Computer Interaction	Yaniv Goren et.al.	2509.03181	null
2025-09-03	A Study on Zero-Shot Non-Intrusive Speech Intelligibility for Hearing Aids Using Large Language Models	Ryandhimas E. Zezario et.al.	2509.03021	null
2025-09-04	Speech Intelligibility Assessment with Uncertainty-Aware Whisper Embeddings and sLSTM	Ryandhimas E. Zezario et.al.	2509.03013	null
2025-09-02	SSVD: Structured SVD for Parameter-Efficient Fine-Tuning and Benchmarking under Domain Shift in ASR	Pu Wang et.al.	2509.02830	null
2025-09-02	Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices	Evan King et.al.	2509.02523	null
2025-09-04	AudioCodecBench: A Comprehensive Benchmark for Audio Codec Evaluation	Lu Wang et.al.	2509.02349	null
2025-09-03	NADI 2025: The First Multidialectal Arabic Speech Processing Shared Task	Bashar Talafha et.al.	2509.02038	null
2025-09-02	Group Relative Policy Optimization for Speech Recognition	Prashanth Gurunath Shivakumar et.al.	2509.01939	null
2025-09-02	Multilingual Speech Recognition Using Discrete Tokens with a Two-step Training Strategy	Zehan Li et.al.	2509.01900	null
2025-09-01	Mic Drop or Data Flop? Evaluating the Fitness for Purpose of AI Voice Interviewers for Data Collection within Quantitative & Qualitative Research Contexts	Shreyas Tirumala et.al.	2509.01814	null
2025-09-01	Characterization of Speech Similarity Between Australian Aboriginal and High-Resource Languages: A Case Study on Dharawal	Ting Dang et.al.	2509.01419	null
2025-09-01	CabinSep: IR-Augmented Mask-Based MVDR for Real-Time In-Car Speech Separation with Distributed Heterogeneous Arrays	Runduo Han et.al.	2509.01399	null
2025-09-01	Analysing the Language of Neural Audio Codecs	Joonyong Park et.al.	2509.01390	null
2025-09-01	Noisy Disentanglement with Tri-stage Training for Noise-Robust Speech Recognition	Shuangyuan Chen et.al.	2509.01087	null
2025-08-31	A Unified Denoising and Adaptation Framework for Self-Supervised Bengali Dialectal ASR	Swadhin Biswas et.al.	2509.00988	null
2025-08-30	Entropy-based Coarse and Compressed Semantic Speech Representation Learning	Jialong Zuo et.al.	2509.00503	null
2025-08-27	Automatic Pronunciation Error Detection and Correction of the Holy Quran’s Learners Using Deep Learning	Abdullah Abdelfattah et.al.	2509.00094	null
2025-08-29	NSPDI-SNN: An efficient lightweight SNN based on nonlinear synaptic pruning and dendritic integration	Wuque Cai et.al.	2508.21566	null
2025-09-02	AHELM: A Holistic Evaluation of Audio-Language Models	Tony Lee et.al.	2508.21376	null
2025-08-28	Can Layer-wise SSL Features Improve Zero-Shot ASR Performance for Children’s Speech?	Abhijit Sinha et.al.	2508.21225	null
2025-08-28	Benchmarking Large Pretrained Multilingual Models on Québec French Speech Recognition	Coralie Serrand et.al.	2508.21193	null
2025-08-28	OLMoASR: Open Models and Data for Training Robust Speech Recognition Models	Huong Ngo et.al.	2508.20869	null
2025-08-28	Generative Annotation for ASR Named Entity Correction	Yuanchang Luo et.al.	2508.20700	null
2025-08-28	Towards Inclusive Communication: A Unified LLM-Based Framework for Sign Language, Lip Movements, and Audio Understanding	Jeong Hun Yeo et.al.	2508.20476	null
2025-09-08	Heterogeneous Self-Supervised Acoustic Pre-Training with Local Constraints	Xiaodong Cui et.al.	2508.19990	null
2025-08-27	TokenVerse++: Towards Flexible Multitask Learning with Dynamic Task Activation	Shashi Kumar et.al.	2508.19856	null
2025-08-27	CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese	Carlos Carvalho et.al.	2508.19721	null
2025-08-27	Hybrid Decoding: Rapid Pass and Selective Detailed Correction for Sequence Models	Yunkyu Lim et.al.	2508.19671	null
2025-08-27	Towards stable AI systems for Evaluating Arabic Pronunciations	Hadi Zaatiti et.al.	2508.19587	null
2025-08-22	Whisper based Cross-Lingual Phoneme Recognition between Vietnamese and English	Nguyen Huu Nhat Minh et.al.	2508.19270	null
2025-08-26	MOSA: Mixtures of Simple Adapters Outperform Monolithic Approaches in LLM-based Multilingual ASR	Junjie Li et.al.	2508.18998	null
2025-08-26	TaiBai: A fully programmable brain-inspired processor with topology-aware efficiency	Qianpeng Li et.al.	2508.18961	null
2025-08-26	DESAMO: A Device for Elder-Friendly Smart Homes Powered by Embedded LLM with Audio Modality	Youngwon Choi et.al.	2508.18918	null
2025-08-26	Improving Noise Robust Audio-Visual Speech Recognition via Router-Gated Cross-Modal Feature Fusion	DongHoon Lim et.al.	2508.18734	null
2025-08-26	Cross-Learning Fine-Tuning Strategy for Dysarthric Speech Recognition Via CDSD database	Qing Xiao et.al.	2508.18732	null
2025-08-26	Attention2Probability: Attention-Driven Terminology Probability Estimation for Robust Speech-to-Text System	Yanfan Du et.al.	2508.18701	null
2025-08-22	H-PRM: A Pluggable Hotword Pre-Retrieval Module for Various Speech Recognition Systems	Huangyu Dai et.al.	2508.18295	null
2025-08-20	Toward Responsible ASR for African American English Speakers: A Scoping Review of Bias and Equity in Speech Technology	Jay L. Cunningham et.al.	2508.18288	null
2025-08-25	Evaluating the Representation of Vowels in Wav2Vec Feature Extractor: A Layer-Wise Analysis Using MFCCs	Domenico De Cristofaro et.al.	2508.17914	null
2025-08-25	Designing Practical Models for Isolated Word Visual Speech Recognition	Iason Ioannis Panagos et.al.	2508.17894	null
2025-08-25	Talking to Robots: A Practical Examination of Speech Foundation Models for HRI Applications	Theresa Pekarek Rosin et.al.	2508.17753	null
2025-08-24	AI-Powered Legal Intelligence System Architecture: A Comprehensive Framework for Automated Legal Consultation and Analysis	Sean Kalaycioglu et.al.	2508.17499	null
2025-08-22	Benchmarking Training Paradigms, Dataset Composition, and Model Scaling for Child ASR in ESPnet	Anyu Ying et.al.	2508.16576	null
2025-08-21	Beyond Transcription: Mechanistic Interpretability in ASR	Neta Glazer et.al.	2508.15882	null
2025-08-20	MGSC: A Multi-granularity Consistency Framework for Robust End-to-end ASR	Xuwen Yang et.al.	2508.15853	null
2025-08-21	UniCoM: A Universal Code-Switching Speech Generator	Sangmin Lee et.al.	2508.15244	null
2025-08-20	A Study of the Scale Invariant Signal to Distortion Ratio in Speech Separation with Noisy References	Simon Dahl Jepsen et.al.	2508.14623	null
2025-08-18	Whispering Context: Distilling Syntax and Semantics for Long Speech Transcripts	Duygu Altinok et.al.	2508.13376	null
2025-08-18	Overcoming Latency Bottlenecks in On-Device Speech Translation: A Cascaded Approach with Alignment-Based Streaming MT	Zeeshan Ahmed et.al.	2508.13358	null
2025-08-18	Evaluating ASR robustness to spontaneous speech errors: A study of WhisperX using a Speech Error Database	John Alderete et.al.	2508.13060	null
2025-08-18	Arabic ASR on the SADA Large-Scale Arabic Speech Corpus with Transformer-Based Models	Branislav Gerazov et.al.	2508.12968	null
2025-08-17	CarelessWhisper: Turning Whisper into a Causal Streaming Model	Tomer Krichli et.al.	2508.12301	null
2025-08-17	HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization	Hyebin Ahn et.al.	2508.12292	null
2025-08-17	What do Speech Foundation Models Learn? Analysis and Applications	Ankita Pasad et.al.	2508.12255	null

🗣️ TTS

📊 97 papers

📅 Publish Date	📝 Title	👥 Authors	📄 PDF	💻 Code
2025-09-23	Finding My Voice: Generative Reconstruction of Disordered Speech for Automated Clinical Evaluation	Karen Rosero et.al.	2509.19231	null
2025-09-23	Investigating Test-Time Scaling with Reranking for Machine Translation	Shaomu Tan et.al.	2509.19020	null
2025-09-23	No Verifiable Reward for Prosody: Toward Preference-Guided Prosody Learning in TTS	Seungyoun Shin et.al.	2509.18531	null
2025-09-22	Discrete-time diffusion-like models for speech synthesis	Xiaozhou Tan et.al.	2509.18470	null
2025-09-22	TMD-TTS: A Unified Tibetan Multi-Dialect Text-to-Speech Synthesis for Ü-Tsang, Amdo and Kham Speech Dataset Generation	Yutong Liu et.al.	2509.18060	null
2025-09-22	Variation in Verification: Understanding Verification Dynamics in Large Language Models	Yefan Zhou et.al.	2509.17995	null
2025-09-22	Nord-Parl-TTS: Finnish and Swedish TTS Dataset from Parliament Speech	Zirui Li et.al.	2509.17988	null
2025-09-23	Mitigating Strategy-Selection Bias in Reasoning for More Effective Test-Time Scaling	Zongqian Wu et.al.	2509.17905	null
2025-09-22	Audiobook-CC: Controllable Long-context Speech Generation for Multicast Audiobook	Min Liu et.al.	2509.17516	null
2025-09-21	Bridging the gap between training and inference in LM-based TTS models	Ruonan Zhang et.al.	2509.17021	null
2025-09-21	MBCodec: Thorough disentangle for high-fidelity audio compression	Ruonan Zhang et.al.	2509.17006	null
2025-09-19	Fed-PISA: Federated Voice Cloning via Personalized Identity-Style Adaptation	Qi Wang et.al.	2509.16010	null
2025-09-19	VoXtream: Full-Stream Text-to-Speech with Extremely Low Latency	Nikita Torgashov et.al.	2509.15969	null
2025-09-19	Deep Dubbing: End-to-End Auto-Audiobook System with Text-to-Timbre and Context-Aware Instruct-TTS	Ziqi Dai et.al.	2509.15845	null
2025-09-19	Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech	Xinlei Niu et.al.	2509.15492	null
2025-09-18	Real-Time Streaming Mel Vocoding with Generative Flow Matching	Simon Welker et.al.	2509.15085	null
2025-09-18	DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis	Ye-Xin Lu et.al.	2509.14684	null
2025-09-20	Cross-Lingual F5-TTS: Towards Language-Agnostic Voice Cloning and Speech Synthesis	Qingyu Liu et.al.	2509.14579	null
2025-09-15	SpeechWeave: Diverse Multilingual Synthetic Text & Audio Data Generation Pipeline for Training Text to Speech Models	Karan Dua et.al.	2509.14270	null
2025-09-17	Slim-SC: Thought Pruning for Efficient Scaling with Self-Consistency	Colin Hong et.al.	2509.13990	null
2025-09-22	Do You Hear What I Mean? Quantifying the Instruction-Perception Gap in Instruction-Guided Expressive Text-To-Speech Systems	Yi-Cheng Lin et.al.	2509.13989	null
2025-09-16	MSR-Codec: A Low-Bitrate Multi-Stream Residual Codec for High-Fidelity Speech Generation with Information Disentanglement	Jingyu Li et.al.	2509.13068	null
2025-09-21	LTA-thinker: Latent Thought-Augmented Training Framework for Large Language Models on Complex Reasoning	Jiaqi Wang et.al.	2509.12875	null
2025-09-16	Towards personalized, precise and survey-free environment recognition: AI-enhanced sensor fusion without pre-deployment	Ruichen Wang et.al.	2509.12870	null
2025-09-16	A Lightweight Pipeline for Noisy Speech Voice Cloning and Accurate Lip Sync Synthesis	Javeria Amir et.al.	2509.12831	null
2025-09-21	Building Coding Agents via Entropy-Enhanced Multi-Turn Preference Optimization	Jiahao Yu et.al.	2509.12434	null
2025-09-15	Preservation of Language Understanding Capabilities in Speech-aware Large Language Models	Marek Kubis et.al.	2509.12171	null
2025-09-14	FuseCodec: Semantic-Contextual Fusion and Supervision for Neural Codecs	Md Mubtasim Ahasan et.al.	2509.11425	null
2025-09-14	Length-Aware Rotary Position Embedding for Text-Speech Alignment	Hyeongju Kim et.al.	2509.11084	null
2025-09-12	Towards Data Drift Monitoring for Speech Deepfake Detection in the context of MLOps	Xin Wang et.al.	2509.10086	null
2025-09-11	DiTReducio: A Training-Free Acceleration for DiT-Based TTS via Progressive Calibration	Yanru Huo et.al.	2509.09748	null
2025-09-12	DiFlow-TTS: Discrete Flow Matching with Factorized Speech Tokens for Low-Latency Zero-Shot Text-To-Speech	Ngoc-Son Nguyen et.al.	2509.09631	link
2025-09-11	HISPASpoof: A New Dataset For Spanish Speech Forensics	Maria Risques et.al.	2509.09155	null
2025-09-10	Accelerating Diffusion Transformer-Based Text-to-Speech with Transformer Layer Caching	Siratish Sakpiboonchit et.al.	2509.08696	null
2025-09-14	Progressive Facial Granularity Aggregation with Bilateral Attribute-based Enhancement for Face-to-Speech Synthesis	Yejin Jeon et.al.	2509.07376	null
2025-09-09	When Fine-Tuning is Not Enough: Lessons from HSAD on Hybrid and Adversarial Audio Spoof Detection	Bin Hu et.al.	2509.07323	null
2025-09-08	Controllable Singing Voice Synthesis using Phoneme-Level Energy Sequence	Yerin Ryu et.al.	2509.07038	null
2025-09-07	Multimodal Fine-grained Context Interaction Graph Modeling for Conversational Speech Synthesis	Zhenqi Jia et.al.	2509.06074	null
2025-09-06	LatinX: Aligning a Multilingual TTS Model with Direct Preference Optimization	Luis Felipe Chary et.al.	2509.05863	null
2025-09-08	Sticker-TTS: Learn to Utilize Historical Experience with a Sticker-driven Test-Time Scaling Framework	Jie Chen et.al.	2509.05007	null
2025-09-04	Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding	Rui-Chen Zheng et.al.	2509.04685	null
2025-09-04	DarkStream: real-time speech anonymization with low latency	Waris Quamer et.al.	2509.04667	null
2025-09-04	AUDETER: A Large-scale Dataset for Deepfake Audio Detection in Open Worlds	Qizhou Wang et.al.	2509.04345	null
2025-09-04	Open-Source Full-Duplex Conversational Datasets for Natural and Interactive Speech Synthesis	Zhitong Zhou et.al.	2509.04093	null
2025-09-04	LibriQuote: A Speech Dataset of Fictional Character Utterances for Expressive Zero-Shot Speech Synthesis	Gaspard Michel et.al.	2509.04072	null
2025-09-16	SwinSRGAN: Swin Transformer-based Generative Adversarial Network for High-Fidelity Speech Super-Resolution	Jiajun Yuan et.al.	2509.03913	null
2025-09-03	Multi-level SSL Feature Gating for Audio Deepfake Detection	Hoan My Tran et.al.	2509.03409	null
2025-09-03	Improving Perceptual Audio Aesthetic Assessment via Triplet Loss and Self-Supervised Embeddings	Dyah A. M. G. Wisnu et.al.	2509.03292	null
2025-09-03	AIVA: An AI-based Virtual Companion for Emotion-aware Interaction	Chenxi Li et.al.	2509.03212	null
2025-09-02	Scale, Don’t Fine-tune: Guiding Multimodal LLMs for Efficient Visual Place Recognition at Test-Time	Jintao Cheng et.al.	2509.02129	null
2025-09-04	FireRedTTS-2: Towards Long Conversational Speech Generation for Podcast and Chatbot	Kun Xie et.al.	2509.02020	null
2025-09-01	MixedG2P-T5: G2P-free Speech Synthesis for Mixed-script texts using Speech Self-Supervised Learning and Language Model	Joonyong Park et.al.	2509.01391	null
2025-08-31	MPO: Multidimensional Preference Optimization for Language Model-based Text-to-Speech	Kangxiang Xia et.al.	2509.00685	null
2025-08-31	Speaker-Conditioned Phrase Break Prediction for Text-to-Speech with Phoneme-Level Pre-trained Language Model	Dong Yang et.al.	2509.00675	null
2025-08-29	Democratizing Agentic AI with Fast Test-Time Scaling on the Edge	Hao Mark Chen et.al.	2509.00195	null
2025-08-27	Learning to Refine: Self-Refinement of Parallel Reasoning in LLMs	Qibin Wang et.al.	2509.00084	null
2025-08-28	Multilingual Dataset Integration Strategies for Robust Audio Deepfake Detection: A SAFE Challenge System	Hashim Ali et.al.	2508.20983	null
2025-08-26	Predicting the optimal noise strength for solving optimization problems with analog Ising machines	Leen Mys et.al.	2508.19107	null
2025-08-26	CLEAR: Continuous Latent Autoregressive Modeling for High-quality and Low-latency Speech Synthesis	Chun Yat Wu et.al.	2508.19098	null
2025-08-25	SwiftF0: Fast and Accurate Monophonic Pitch Detection	Lars Nieradzik et.al.	2508.18440	null
2025-08-25	Unseen Speaker and Language Adaptation for Lightweight Text-To-Speech with Adapters	Alessio Falai et.al.	2508.18006	null
2025-08-27	Vocoder-Projected Feature Discriminator	Takuhiro Kaneko et.al.	2508.17874	null
2025-08-25	ClearMask: Noise-Free and Naturalness-Preserving Protection Against Voice Deepfake Attacks	Yuanda Wang et.al.	2508.17660	null
2025-08-24	Improving French Synthetic Speech Quality via SSML Prosody Control	Nassima Ould Ouali et.al.	2508.17494	null
2025-08-23	WildSpoof Challenge Evaluation Plan	Yihan Wu et.al.	2508.16858	null
2025-09-09	Trust but Verify! A Survey on Verification Design for Test-time Scaling	V Venktesh et.al.	2508.16665	null
2025-09-05	Mitigating Hallucinations in LM-Based TTS Models via Distribution Alignment Using GFlowNets	Chenlin Liu et.al.	2508.15442	null
2025-08-25	Linear Preference Optimization: Decoupled Gradient Control via Absolute Regularization	Rui Wang et.al.	2508.14947	null
2025-08-20	Long-Context Speech Synthesis with Context-Aware Memory	Zhipeng Li et.al.	2508.14713	null
2025-08-20	Improving Resource-Efficient Speech Enhancement via Neural Differentiable DSP Vocoder Refinement	Heitor R. Guimarães et.al.	2508.14709	null
2025-08-22	Your Reward Function for RL is Your Best PRM for Search: Unifying RL and Search-Based TTS	Can Jin et.al.	2508.14313	null
2025-08-19	Who Gets the Mic? Investigating Gender Bias in the Speaker Assignment of a Speech-LLM	Dariia Puhach et.al.	2508.13603	null
2025-08-18	Integrating Feedback Loss from Bi-modal Sarcasm Detector for Sarcastic Speech Synthesis	Zhu Li et.al.	2508.13028	null
2025-08-18	Cooperative Sensing-Assisted Predictive Beam Tracking for MIMO-OFDM Networked ISAC Systems	Xiaoyu Yang et.al.	2508.12723	null
2025-08-18	Real-Time Sign Language Gestures to Speech Transcription using Deep Learning	Brandone Fonya et.al.	2508.12713	null
2025-08-19	FNH-TTS: A Fast, Natural, and Human-Like Speech Synthesis System with advanced prosodic modeling based on Mixture of Experts	Qingliang Meng et.al.	2508.12001	null
2025-08-15	MoE-TTS: Enhancing Out-of-Domain Text Understanding for Description-based TTS via Mixture-of-Experts	Heyang Xue et.al.	2508.11326	null
2025-08-15	EmoSSLSphere: Multilingual Emotional Speech Synthesis with Spherical Vectors and Discrete Speech Tokens	Joonyong Park et.al.	2508.11273	null
2025-08-14	Facilitating Personalized TTS for Dysarthric Speakers Using Knowledge Anchoring and Curriculum Learning	Yejin Jeon et.al.	2508.10412	null
2025-08-14	Towards Frame-level Quality Predictions of Synthetic Speech	Michael Kuhlmann et.al.	2508.10374	null
2025-08-15	Training-Free Multimodal Large Language Model Orchestration	Tianyu Xie et.al.	2508.10016	null
2025-09-16	UtterTune: LoRA-Based Target-Language Pronunciation Edit and Control in Multilingual Text-to-Speech	Shuhei Kato et.al.	2508.09767	null
2025-08-12	ProMode: A Speech Prosody Model Conditioned on Acoustic and Textual Inputs	Eray Eren et.al.	2508.09389	null
2025-08-12	Fake-Mamba: Real-Time Speech Deepfake Detection Using Bidirectional Mamba as Self-Attention’s Alternative	Xi Xuan et.al.	2508.09294	null
2025-08-12	HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis	Timo Teufel et.al.	2508.09137	null
2025-08-12	QAMRO: Quality-aware Adaptive Margin Ranking Optimization for Human-aligned Assessment of Audio Generation Systems	Chien-Chun Wang et.al.	2508.08957	null
2025-08-10	Scalable Controllable Accented TTS	Henry Li Xinyuan et.al.	2508.07426	null
2025-08-10	KLASSify to Verify: Audio-Visual Deepfake Detection Using SSL-based Audio and Handcrafted Visual Features	Ivan Kukanov et.al.	2508.07337	null
2025-08-12	XEmoRAG: Cross-Lingual Emotion Transfer with Controllable Intensity Using Retrieval-Augmented Generation	Tianlun Zuo et.al.	2508.07302	null
2025-08-09	Maestro-EVC: Controllable Emotional Voice Conversion Guided by References and Explicit Prosody	Jinsung Yoon et.al.	2508.06890	null
2025-08-09	Text to Speech System for Meitei Mayek Script	Gangular Singh Irengbam et.al.	2508.06870	null
2025-08-08	Llasa+: Free Lunch for Accelerated and Streaming Llama-Based Speech Synthesis	Wenjie Tian et.al.	2508.06262	null
2025-08-08	NEP: Autoregressive Image Editing via Next Editing Token Prediction	Huimin Wu et.al.	2508.06044	null
2025-08-07	A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understanding	Runchuan Ye et.al.	2508.05385	null
2025-08-15	Fairness in Dysarthric Speech Synthesis: Understanding Intrinsic Bias in Dysarthric Speech Cloning using F5-TTS	M Anuprabha et.al.	2508.05102	null
2025-08-07	UniTalker: Conversational Speech-Visual Synthesis	Yifan Hu et.al.	2508.04585	null
2025-08-06	The State Of TTS: A Case Study with Human Fooling Rates	Praveen Srinivasa Varadhan et.al.	2508.04179	null

🌐 Machine Translation

📊 117 papers

📅 Publish Date	📝 Title	👥 Authors	📄 PDF	💻 Code
2025-09-22	Transformer-Encoder Trees for Efficient Multilingual Machine Translation and Speech Translation	Yiwen Guan et.al.	2509.17930	null
2025-09-22	Specification-Aware Machine Translation and Evaluation for Purpose Alignment	Yoko Kayano et.al.	2509.17559	null
2025-09-22	Enhancing Cross-Lingual Transfer through Reversible Transliteration: A Huffman-Based Approach for Low-Resource Languages	Wenhao Zhuang et.al.	2509.17493	null
2025-09-22	Filling in the Clinical Gaps in Benchmark: Case for HealthBench for the Japanese medical system	Shohei Hisada et.al.	2509.17444	null
2025-09-22	Scaling, Simplification, and Adaptation: Lessons from Pretraining on Machine-Translated Text	Dan John Velasco et.al.	2509.17317	null
2025-09-22	JPResUnet: A Joint Probability Density Function Translation Model in Partially Premixed Flames	Hanying Yang et.al.	2509.17297	null
2025-09-21	Extending Automatic Machine Translation Evaluation to Book-Length Documents	Kuang-Da Wang et.al.	2509.17249	null
2025-09-21	CUTE: A Multilingual Dataset for Enhancing Cross-Lingual Knowledge Transfer in Low-Resource Languages	Wenhao Zhuang et.al.	2509.16914	null
2025-09-20	Angular Dispersion Accelerates $k$-Nearest Neighbors Machine Translation	Evgeniia Tokarchuk et.al.	2509.16729	null
2025-09-19	Whisper-UT: A Unified Translation Framework for Speech and Text	Cihan Xiao et.al.	2509.16375	null
2025-09-19	UPRPRC: Unified Pipeline for Reproducing Parallel Resources – Corpus from the United Nations	Qiuyang Lu et.al.	2509.15789	null
2025-09-19	Multilingual LLM Prompting Strategies for Medical English-Vietnamese Machine Translation	Nhu Vo et.al.	2509.15640	null
2025-09-18	RulER: Automated Rule-Based Semantic Error Localization and Repair for Code Translation	Shuo Jin et.al.	2509.14829	null
2025-09-18	Evaluating Large Language Models for Cross-Lingual Retrieval	Longfei Zuo et.al.	2509.14749	null
2025-09-17	Translate, then Detect: Leveraging Machine Translation for Cross-Lingual Toxicity Classification	Samuel J. Bell et.al.	2509.14493	null
2025-09-17	You Are What You Train: Effects of Data Composition on Training Context-aware Machine Translation Models	Paweł Mąka et.al.	2509.14031	null
2025-09-17	Audio-Based Crowd-Sourced Evaluation of Machine Translation Quality	Sami Ul Haq et.al.	2509.14023	null
2025-09-17	Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale	Hasan Abed Al Kader Hammoud et.al.	2509.14008	null
2025-09-17	Long-context Reference-based MT Quality Estimation	Sami Ul Haq et.al.	2509.13980	null
2025-09-20	Data Augmentation for Maltese NLP using Transliterated and Machine Translated Arabic Data	Kurt Micallef et.al.	2509.12853	null
2025-09-17	Human + AI for Accelerating Ad Localization Evaluation	Harshit Rajgarhia et.al.	2509.12543	null
2025-09-15	A comparison of pipelines for the translation of a low resource language based on transformers	Chiara Bonfanti et.al.	2509.12514	null
2025-09-14	PATIMT-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language Models	Wanru Zhuang et.al.	2509.12278	null
2025-09-15	XplaiNLP at CheckThat! 2025: Multilingual Subjectivity Detection with Finetuned Transformers and Prompt-Based Inference with Large Language Models	Ariana Sahitaj et.al.	2509.12130	null
2025-09-04	Optimal Multi-Task Learning at Regularization Horizon for Speech Translation Task	JungHo Jung et.al.	2509.09701	null
2025-09-11	Mitigating Language Barriers in Education: Developing Multilingual Digital Learning Materials with Machine Translation	Lucie Poláková et.al.	2509.09473	null
2025-09-09	Small Open Models Achieve Near Parity with Large Models in Low Resource Literary Translation at a Fraction of the Cost	Mihai Nadas et.al.	2509.07829	null
2025-09-09	From Scarcity to Efficiency: Investigating the Effects of Data Augmentation on African Machine Translation	Mardiyyah Oduwole et.al.	2509.07471	null
2025-09-09	Hunyuan-MT Technical Report	Mao Zheng et.al.	2509.05209	null
2025-09-05	PRIM: Towards Practical In-Image Multilingual Machine Translation	Yanzhi Tian et.al.	2509.05146	null
2025-09-03	Artificially Fluent: Swahili AI Performance Benchmarks Between English-Trained and Natively-Trained Datasets	Sophie Jaffer et.al.	2509.04516	null
2025-09-04	Exploring NLP Benchmarks in an Extremely Low-Resource Setting	Ulin Nuha et.al.	2509.03962	null
2025-09-04	Align-then-Slide: A complete evaluation framework for Ultra-Long Document-Level Machine Translation	Jiaxin Guo et.al.	2509.03809	null
2025-09-24	Expanding the WMT24++ Benchmark with Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader	Jannis Vamvas et.al.	2509.03148	null
2025-09-02	The Forgotten Code: Validating a Century-Old Translation System with AI	Jean-Marie Le Ray et.al.	2509.02506	null
2025-09-18	CSRM-LLM: Embracing Multilingual LLMs for Cold-Start Relevance Matching in Emerging E-commerce Markets	Yujing Wang et.al.	2509.01566	null
2025-08-28	The Uneven Impact of Post-Training Quantization in Machine Translation	Benjamin Marie et.al.	2508.20893	null
2025-08-28	Languages Still Left Behind: Toward a Better Multilingual Machine Translation Benchmark	Chihiro Taguchi et.al.	2508.20511	null
2025-09-06	FlowMalTrans: Unsupervised Binary Code Translation for Malware Detection Using Flow-Adapter Architecture	Minghao Hu et.al.	2508.20212	null
2025-08-26	Improving Low-Resource Translation with Dictionary-Guided Fine-Tuning and RL: A Spanish-to-Wayuunaiki Study	Manuel Mosquera et.al.	2508.19481	null
2025-09-03	The Ramon Llull’s Thinking Machine for Automated Ideation	Xinran Zhao et.al.	2508.19200	null
2025-08-26	LaTeXTrans: Structured LaTeX Translation with Multi-Agent Coordination	Ziming Zhu et.al.	2508.18791	null
2025-08-26	A New NMT Model for Translating Clinical Texts from English to Spanish	Rumeng Li et.al.	2508.18607	null
2025-08-25	COMET-poly: Machine Translation Metric Grounded in Other Candidates	Maike Züfle et.al.	2508.18549	null
2025-08-24	Evaluating the Impact of Verbal Multiword Expressions on Machine Translation	Linfeng Liu et.al.	2508.17458	null
2025-08-22	Cetvel: A Unified Benchmark for Evaluating Language Understanding, Generation and Cultural Capacity of LLMs for Turkish	Yakup Abrek Er et.al.	2508.16431	null
2025-08-22	The Mediomatix Corpus: Parallel Data for Romansh Idioms via Comparable Schoolbooks	Zachary Hopton et.al.	2508.16371	null
2025-09-23	OpenWHO: A Document-Level Parallel Corpus for Health Translation in Low-Resource Languages	Raphaël Merx et.al.	2508.16048	null
2025-08-21	Confidence-Modulated Speculative Decoding for Large Language Models	Jaydip Sen et.al.	2508.15371	null
2025-08-20	Improving LLMs for Machine Translation Using Synthetic Preference Data	Dario Vajda et.al.	2508.14951	null
2025-08-24	Preliminary Ranking of WMT25 General Machine Translation Systems	Tom Kocmi et.al.	2508.14909	null
2025-08-20	Filling the Gap for Uzbek: Creating Translation Resources for Southern Uzbek	Mukhammadsaid Mamasaidov et.al.	2508.14586	null
2025-08-20	In2x at WMT25 Translation Task	Lei Pang et.al.	2508.14472	null
2025-08-18	Overcoming Latency Bottlenecks in On-Device Speech Translation: A Cascaded Approach with Alignment-Based Streaming MT	Zeeshan Ahmed et.al.	2508.13358	null
2025-08-18	DocHPLT: A Massively Multilingual Document-Level Translation Dataset	Dayyán O’Brien et.al.	2508.13079	null
2025-08-18	From SALAMANDRA to SALAMANDRATA: BSC Submission for WMT25 General Machine Translation Shared Task	Javier Garcia Gilabert et.al.	2508.12774	null
2025-08-25	SEA-BED: Southeast Asia Embedding Benchmark	Wuttikorn Ponwitayarat et.al.	2508.12243	null
2025-08-14	Neural Machine Translation for Coptic-French: Strategies for Low-Resource Ancient Languages	Nasma Chaoui et.al.	2508.10683	null
2025-08-14	Evaluating LLMs on Chinese Idiom Translation	Cai Yang et.al.	2508.10421	null
2025-08-28	Estimating Machine Translation Difficulty	Lorenzo Proietti et.al.	2508.10175	null
2025-08-12	TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation	Armel Zebaze et.al.	2508.08680	null
2025-08-12	UWB at WASSA-2024 Shared Task 2: Cross-lingual Emotion Detection	Jakub Šmíd et.al.	2508.08650	null
2025-08-11	Toward Machine Interpreting: Lessons from Human Interpreting Studies	Matthias Sperber et.al.	2508.07964	null
2025-08-10	ALOPE: Adaptive Layer Optimization for Translation Quality Estimation using Large Language Models	Archchana Sindhujan et.al.	2508.07484	null
2025-08-08	Testing the Limits of Machine Translation from One Book	Jonathan Shaw et.al.	2508.06665	null
2025-08-08	Train It and Forget It: Merge Lists are Unnecessary for BPE Inference in Language Models	Tomohiro Sawada et.al.	2508.06621	null
2025-08-07	PEACH: A sentence-aligned Parallel English-Arabic Corpus for Healthcare	Rania Al-Sabbagh et.al.	2508.05722	null
2025-08-07	MELLA: Bridging Linguistic Capability and Cultural Groundedness for Low-Resource Language MLLMs	Yufei Gao et.al.	2508.05502	null
2025-08-07	Optimal Corpus Aware Training for Neural Machine Translation	Yi-Hsiu Liao et.al.	2508.05364	null
2025-08-11	REINA: Regularized Entropy Information-Based Loss for Efficient Simultaneous Speech Translation	Nameer Hirschkind et.al.	2508.04946	null
2025-08-05	Marito: Structuring and Building Open Multilingual Terminologies for South African NLP	Vukosi Marivate et.al.	2508.03529	null
2025-08-05	Investigation on deep learning-based galaxy image translation models	Hengxin Ruan et.al.	2508.03291	null
2025-08-05	Cross-lingual Opinions and Emotions Mining in Comparable Documents	Motaz Saad et.al.	2508.03112	null
2025-08-04	A Survey on Data Security in Large Language Models	Kang Chen et.al.	2508.02312	null
2025-08-04	A French Version of the OLDI Seed Corpus	Malik Marmonier et.al.	2508.02290	null
2025-08-04	SHAMI-MT: A Syrian Arabic Dialect to Modern Standard Arabic Bidirectional Machine Translation System	Serry Sibaee et.al.	2508.02268	null
2025-08-25	CultureGuard: Towards Culturally-Aware Dataset and Guard Model for Multilingual Safety Applications	Raviraj Joshi et.al.	2508.01710	null
2025-08-02	ArzEn-MultiGenre: An aligned parallel dataset of Egyptian Arabic song lyrics, novels, and subtitles, with English translations	Rania Al-Sabbagh et.al.	2508.01411	null
2025-09-16	Sample-Aware Test-Time Adaptation for Medical Image-to-Image Translation	Irene Iele et.al.	2508.00766	null
2025-07-31	Arabic Hate Speech Identification and Masking in Social Media using Deep Learning Models and Pre-trained Models Fine-tuning	Salam Thabet Doghmash et.al.	2507.23661	null
2025-07-31	Beyond the Cloud: Assessing the Benefits and Drawbacks of Local LLM Deployment for Translators	Peter Sandrini et.al.	2507.23399	null
2025-07-29	RL from Teacher-Model Refinement: Gradual Imitation Learning for Machine Translation	Dongyub Jude Lee et.al.	2507.22219	null
2025-07-31	Multi-Hypothesis Distillation of Multilingual Neural Translation Models for Low-Resource Languages	Aarón Galiano-Jiménez et.al.	2507.21568	null
2025-07-07	iLSU-T: an Open Dataset for Uruguayan Sign Language Translation	Ariel E. Stassi et.al.	2507.21104	null
2025-07-28	Multilingual Self-Taught Faithfulness Evaluators	Carlo Alfano et.al.	2507.20752	null
2025-09-02	Advancing Dialectal Arabic to Modern Standard Arabic Machine Translation	Abdullah Alabdullah et.al.	2507.20301	null
2025-07-29	Mind the Language Gap in Digital Humanities: LLM-Aided Translation of SKOS Thesauri	Felix Kraus et.al.	2507.19537	null
2025-07-25	LLaVA-NeuMT: Selective Layer-Neuron Modulation for Efficient Multilingual Multimodal Translation	Jingxuan Wei et.al.	2507.18940	null
2025-07-24	GIIFT: Graph-guided Inductive Image-free Multimodal Machine Translation	Jiafeng Xiong et.al.	2507.18562	null
2025-07-24	Uncertainty Quantification for Evaluating Machine Translation Bias	Ieva Raminta Staliūnaitė et.al.	2507.18338	null
2025-07-25	Natural Language Processing for Tigrinya: Current State and Future Directions	Fitsum Gaim et.al.	2507.17974	null
2025-07-23	Dual-branch Prompting for Multimodal Machine Translation	Jie Wang et.al.	2507.17588	null
2025-07-22	Introducing Quality Estimation to Machine Translation Post-editing Workflow: An Empirical Study on Its Usefulness	Siqi Liu et.al.	2507.16515	null
2025-07-22	GG-BBQ: German Gender Bias Benchmark for Question Answering	Shalaka Satheesh et.al.	2507.16410	null
2025-07-21	Evaluating Text Style Transfer: A Nine-Language Benchmark for Text Detoxification	Vitaly Protasov et.al.	2507.15557	null
2025-07-20	A Case Against Implicit Standards: Homophone Normalization in Machine Translation for Languages that use the Ge’ez Script	Hellina Hailu Nigatu et.al.	2507.15142	null
2025-08-21	Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters	Shanbo Cheng et.al.	2507.13618	null
2025-07-16	Mitigating Stylistic Biases of Machine Translation Systems via Monolingual Corpora Only	Xuanqi Gao et.al.	2507.13395	null
2025-07-16	The first open machine translation system for the Chechen language	Abu-Viskhan A. Umishov et.al.	2507.12672	null
2025-09-19	Translationese-index: Using Likelihood Ratios for Graded and Generalizable Measurement of Translationese	Yikang Liu et.al.	2507.12260	null
2025-07-16	Marco-Bench-MIF: On Multilingual Instruction-Following Capability of Large Language Models	Bo Zeng et.al.	2507.11882	null
2025-07-31	ILID: Native Script Language Identification for Indian Languages	Yash Ingle et.al.	2507.11832	null
2025-08-30	How Important is ‘Perfect’ English for Machine Translation Prompts?	Patrícia Schmidtová et.al.	2507.09509	null
2025-07-11	Improving MLLM’s Document Image Machine Translation via Synchronously Self-reviewing Its OCR Proficiency	Yupu Liang et.al.	2507.08309	null
2025-07-10	Conditional Unigram Tokenization with Parallel Data	Gianluca Vico et.al.	2507.07824	null
2025-07-10	Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation	Yupu Liang et.al.	2507.07572	null
2025-07-09	Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation	Kazi Mahathir Rahman et.al.	2507.06530	null
2025-07-09	Pun Intended: Multi-Agent Translation of Wordplay with Contrastive Learning and Phonetic-Semantic Embeddings	Russell Taylor et.al.	2507.06506	null
2025-07-07	A Tale of Two Scripts: Transliteration and Post-Correction for Judeo-Arabic	Juan Moreno Gonzalez et.al.	2507.04746	null
2025-07-09	Losing our Tail – Again: On (Un)Natural Selection And Multilingual Large Language Models	Eva Vanmassenhove et.al.	2507.03933	null
2025-07-17	Learning to Translate Ambiguous Terminology by Preference Optimization on Post-Edits	Nathaniel Berger et.al.	2507.03580	null
2025-07-04	GRAFT: A Graph-based Flow-aware Agentic Framework for Document-level Machine Translation	Himanshu Dutta et.al.	2507.03311	null
2025-07-01	TransLaw: Benchmarking Large Language Models in Multi-Agent Simulation of the Collaborative Translation	Xi Xuan et.al.	2507.00875	null
2025-07-01	Neural translation for Stokes inversion and synthesis	A. Asensio Ramos et.al.	2507.00594	null
2025-06-30	Natural language processing for African languages	David Ifeoluwa Adelani et.al.	2507.00297	link
2025-06-30	Bridging the Gap with Retrieval-Augmented Generation: Making Prosthetic Device User Manuals Available in Marginalised Languages	Ikechukwu Ogbonna et.al.	2506.23958	null
2025-07-07	CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation	Yi Liu et.al.	2506.23347	null

⚡ Small Language Models

📊 180 papers

📅 Publish Date	📝 Title	👥 Authors	📄 PDF	💻 Code
2025-09-23	Adversarially-Refined VQ-GAN with Dense Motion Tokenization for Spatio-Temporal Heatmaps	Gabriel Maldonado et.al.	2509.19252	null
2025-09-23	PPG-Distill: Efficient Photoplethysmography Signals Analysis via Foundation Model Distillation	Juntong Ni et.al.	2509.19215	null
2025-09-23	Exact WKB Formulation of Quantization and Particle Production in Time-Dependent Backgrounds	Ryo Namba et.al.	2509.19194	null
2025-09-23	Data-Free Knowledge Distillation for LiDAR-Aided Beam Tracking in MmWave Systems	Abolfazl Zakeri et.al.	2509.19092	null
2025-09-23	Enhancing Noise Robustness for Neural Speech Codecs through Resource-Efficient Progressive Quantization Perturbation Simulation	Rui-Chen Zheng et.al.	2509.19025	null
2025-09-23	Otters: An Energy-Efficient SpikingTransformer via Optical Time-to-First-Spike Encoding	Zhanglu Yan et.al.	2509.18968	null
2025-09-23	VGGT-DP: Generalizable Robot Control via Vision Foundation Models	Shijia Ge et.al.	2509.18778	null
2025-09-23	DiSSECT: Structuring Transfer-Ready Medical Image Representations through Discrete Self-Supervision	Azad Singh et.al.	2509.18765	null
2025-09-23	Bi-VLM: Pushing Ultra-Low Precision Post-Training Quantization Boundaries in Vision-Language Models	Xijun Wang et.al.	2509.18763	null
2025-09-23	Enhanced Survival Trees	Ruiwen Zhou et.al.	2509.18494	null
2025-09-23	Codebook-Based Adaptive Feature Compression With Semantic Enhancement for Edge-Cloud Systems	Xinyu Wang et.al.	2509.18481	null
2025-09-22	Individualized non-uniform quantization for vector search	Mariano Tepper et.al.	2509.18471	null
2025-09-22	TinyBEV: Cross Modal Knowledge Distillation for Efficient Multi Task Bird’s Eye View Perception and Planning	Reeshad Khan et.al.	2509.18372	null
2025-09-21	nDNA – the Semantic Helix of Artificial Cognition	Amitava Das et.al.	2509.18216	null
2025-09-19	MMCD: Multi-Modal Collaborative Decision-Making for Connected Autonomy with Knowledge Distillation	Rui Liu et.al.	2509.18198	null
2025-09-19	TinyEcoWeedNet: Edge Efficient Real-Time Aerial Agricultural Weed Detection	Omar H. Khater et.al.	2509.18193	null
2025-09-22	Visual Detector Compression via Location-Aware Discriminant Analysis	Qizhen Lan et.al.	2509.17968	null
2025-09-23	Optimizing Inference in Transformer-Based Models: A Multi-Method Benchmark	Siu Hang Ho et.al.	2509.17894	null
2025-09-23	Breaking Token Into Concepts: Exploring Extreme Compression in Token Representation Via Compositional Shared Semantics	Kavin R V et.al.	2509.17737	null
2025-09-22	RCTDistill: Cross-Modal Knowledge Distillation Framework for Radar-Camera 3D Object Detection with Temporal Fusion	Geonho Bang et.al.	2509.17712	null
2025-09-22	Stratification of the half-density quantization of the Jeffrey-Weitsman-Witten invariants	Adrian Chitan et.al.	2509.17656	null
2025-09-22	Evaluating the Energy Efficiency of NPU-Accelerated Machine Learning Inference on Embedded Microcontrollers	Anastasios Fanariotis et.al.	2509.17533	null
2025-09-22	MapCoder-Lite: Squeezing Multi-Agent Coding into a Single Small LLM	Woongkyu Lee et.al.	2509.17489	null
2025-09-22	Learning Dexterous Manipulation with Quantized Hand State	Ying Feng et.al.	2509.17450	null
2025-09-23	QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models	Hyesung Jeon et.al.	2509.17428	null
2025-09-22	Physics-Informed Operator Learning for Hemodynamic Modeling	Ryan Chappell et.al.	2509.17293	null
2025-09-21	On the Quantization of the Electromagnetic Field with Magnetic Monopoles	Kanan Anwar et.al.	2509.17284	null
2025-09-21	PTQTP: Post-Training Quantization to Trit-Planes for Large Language Models	He Xiao et.al.	2509.16989	null
2025-09-21	Equip Pre-ranking with Target Attention by Residual Quantization	Yutong Li et.al.	2509.16931	null
2025-09-21	PRISM: Precision-Recall Informed Data-Free Knowledge Distillation via Generative Diffusion	Xuewan He et.al.	2509.16897	null
2025-09-20	Knowledge Distillation for Variational Quantum Convolutional Neural Networks on Heterogeneous Data	Kai Yu et.al.	2509.16699	null
2025-09-20	When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs	Abhirama Subramanyam Penamakuri et.al.	2509.16633	null
2025-09-20	The Role of Vocabularies in Learning Sparse Representations for Ranking	Hiun Kim et.al.	2509.16621	null
2025-09-20	Federated Learning with Ad-hoc Adapter Insertions: The Case of Soft-Embeddings for Training Classifier-as-Retriever	Marijan Fofonjka et.al.	2509.16508	null
2025-09-20	PrediPrune: Reducing Verification Overhead in Souper with Machine Learning Driven Pruning	Ange-Thierry Ishimwe et.al.	2509.16497	null
2025-09-20	Eye Gaze Tells You Where to Compute: Gaze-Driven Efficient VLMs	Qinyu Chen et.al.	2509.16476	null
2025-09-19	Locally Purified Maximally Mixed States At Scale: Entanglement Pruning and Symmetries	Amit Jamadagni et.al.	2509.16439	null
2025-09-19	Pico: A Modular Framework for Hypothesis-Driven Small Language Model Research	Richard Diehl Martinez et.al.	2509.16413	null
2025-09-19	A Unified AI Approach for Continuous Monitoring of Human Health and Diseases from Intensive Care Unit to Home with Physiological Foundation Models (UNIPHY+)	Minxiao Wang et.al.	2509.16348	null
2025-09-19	The Role of High-Performance GPU Resources in Large Language Model Based Radiology Imaging Diagnosis	Jyun-Ping Kao et.al.	2509.16328	null
2025-09-18	Language Modeling with Learned Meta-Tokens	Alok N. Shah et.al.	2509.16278	null
2025-09-19	DiEP: Adaptive Mixture-of-Experts Compression through Differentiable Expert Pruning	Sikai Bai et.al.	2509.16105	null
2025-09-19	DistillMatch: Leveraging Knowledge Distillation from Vision Foundation Model for Multimodal Image Matching	Meng Yang et.al.	2509.16017	null
2025-09-19	DISPATCH: Distilling Selective Patches for Speech Enhancement	Dohwan Kim et.al.	2509.15922	null
2025-09-19	RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation	Davide Ettori et.al.	2509.15724	null
2025-09-19	Once Upon a Time: Interactive Learning for Storytelling with Small Language Models	Jonas Mayer Martins et.al.	2509.15714	null
2025-09-19	Training-Free Pyramid Token Pruning for Efficient Large Vision-Language Models via Region, Token, and Instruction-Guided Importance	Yuxuan Liang et.al.	2509.15704	null
2025-09-19	pFedSAM: Personalized Federated Learning of Segment Anything Model for Medical Image Segmentation	Tong Wang et.al.	2509.15638	null
2025-09-19	MEC-Quant: Maximum Entropy Coding for Extremely Low Bit Quantization-Aware Training	Junbiao Pang et.al.	2509.15514	null
2025-09-19	Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers	Zahra Aref et.al.	2509.15498	null
2025-09-19	Backdoor Mitigation via Invertible Pruning Masks	Kealan Dunnett et.al.	2509.15497	null
2025-09-18	IMPQ: Interaction-Aware Layerwise Mixed Precision Quantization for LLMs	Junchen Zhao et.al.	2509.15455	null
2025-09-18	Fair-GPTQ: Bias-Aware Quantization for Large Language Models	Irina Proskurina et.al.	2509.15206	null
2025-09-18	MaRVIn: A Cross-Layer Mixed-Precision RISC-V Framework for DNN Inference, from ISA Extension to Hardware Acceleration	Giorgos Armeniakos et.al.	2509.15187	null
2025-09-18	No Modality Left Behind: Adapting to Missing Modalities via Knowledge Distillation for Brain Tumor Segmentation	Shenghao Zhu et.al.	2509.15017	null
2025-09-19	MeanFlowSE: one-step generative speech enhancement via conditional mean flow	Duojia Li et.al.	2509.14858	null
2025-09-18	Delta Knowledge Distillation for Large Language Models	Yihan Cao et.al.	2509.14526	null
2025-09-17	NIRVANA: Structured pruning reimagined for large language models compression	Mengting Ai et.al.	2509.14230	null
2025-09-17	Where Do Tokens Go? Understanding Pruning Behaviors in STEP at High Resolutions	Michal Szczepanski et.al.	2509.14165	null
2025-09-17	SV-Mixer: Replacing the Transformer Encoder with Lightweight MLPs for Self-Supervised Model Compression in Speaker Verification	Jungwoo Heo et.al.	2509.14136	null
2025-09-17	MOCHA: Multi-modal Objects-aware Cross-arcHitecture Alignment	Elena Camuffo et.al.	2509.14001	null
2025-09-17	Asymptotic Analysis of Nonlinear One-Bit Precoding in Massive MIMO Systems via Approximate Message Passing	Zheyu Wu et.al.	2509.13955	null
2025-09-19	Efficient Quantization-Aware Neural Receivers: Beyond Post-Training Quantization	SaiKrishna Saketh Yellapragada et.al.	2509.13786	null
2025-09-17	TENET: An Efficient Sparsity-Aware LUT-Centric Architecture for Ternary LLM Inference On Edge	Zhirui Huang et.al.	2509.13765	null
2025-09-18	DSPC: Dual-Stage Progressive Compression Framework for Efficient Long-Context Reasoning	Yaxin Gao et.al.	2509.13723	null
2025-09-17	InfraMind: A Novel Exploration-based GUI Agentic Framework for Mission-critical Industrial Management	Liangtao Lin et.al.	2509.13704	null
2025-09-17	A High-Quality and Low-Complexity Streamable Neural Speech Codec with Knowledge Distillation	En-Wei Zhang et.al.	2509.13670	null
2025-09-16	AQUA-LLM: Evaluating Accuracy, Quantization, and Adversarial Robustness Trade-offs in LLMs for Cybersecurity Question Answering	Onat Gungor et.al.	2509.13514	null
2025-09-16	Improving 3D Gaussian Splatting Compression by Scene-Adaptive Lattice Vector Quantization	Hao Xu et.al.	2509.13482	null
2025-09-16	LLMs for energy and macronutrients estimation using only text data from 24-hour dietary recalls: a parameter-efficient fine-tuning experiment using a 10-shot prompt	Rodrigo M Carrillo-Larco et.al.	2509.13268	null
2025-09-18	HAM: Hierarchical Adapter Merging for Scalable Continual Learning	Eric Nuertey Coleman et.al.	2509.13211	null
2025-09-16	Vi-SAFE: A Spatial-Temporal Framework for Efficient Violence Detection in Public Surveillance	Ligang Chang et.al.	2509.13210	null
2025-09-16	Multi-Model Synthetic Training for Mission-Critical Small Language Models	Nolan Platt et.al.	2509.13047	null
2025-09-16	Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models	Yuval Weiss et.al.	2509.12960	null
2025-09-17	A Novel Compression Framework for YOLOv8: Achieving Real-Time Aerial Object Detection on Edge Devices via Structured Pruning and Channel-Wise Distillation	Melika Sabaghian et.al.	2509.12918	null
2025-09-16	Energy-Efficient Quantized Federated Learning for Resource-constrained IoT devices	Wilfrid Sougrinoma Compaoré et.al.	2509.12814	null
2025-09-16	NEFT: A Unified Transformer Framework for Efficient Near-Field CSI Feedback in XL-MIMO Systems	Haiyang Li et.al.	2509.12748	null
2025-09-16	Effective Gaussian Management for High-fidelity Object Reconstruction	Jiateng Liu et.al.	2509.12742	null
2025-09-16	ZTree: A Subgroup Identification Based Decision Tree Learning Framework	Eric Cheng et.al.	2509.12688	null
2025-09-16	The Better You Learn, The Smarter You Prune: Towards Efficient Vision-language-action Models via Differentiable Token Pruning	Titong Jiang et.al.	2509.12594	null
2025-09-16	iCD: A Implicit Clustering Distillation Mathod for Structural Information Mining	Xiang Xue et.al.	2509.12553	null
2025-09-16	LEAF: Knowledge Distillation of Text Embedding Models with Teacher-Aligned Representations	Robin Vujanic et.al.	2509.12539	null
2025-09-15	Reasoning Models Can be Accurately Pruned Via Chain-of-Thought Reconstruction	Ryan Lucas et.al.	2509.12464	null
2025-09-15	GhostNetV3-Small: A Tailored Architecture and Comparative Study of Distillation Strategies for Tiny Images	Florian Zager et.al.	2509.12380	null
2025-09-15	Unsupervised Atomic Data Mining via Multi-Kernel Graph Autoencoders for Machine Learning Force Fields	Hong Sun et.al.	2509.12358	null
2025-09-15	SAQ: Pushing the Limits of Vector Quantization through Code Adjustment and Dimension Segmentation	Hui Li et.al.	2509.12086	null
2025-09-15	AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models	Sangjun Lee et.al.	2509.12019	null
2025-09-15	CLAIRE: A Dual Encoder Network with RIFT Loss and Phi-3 Small Language Model Based Interpretability for Cross-Modality Synthetic Aperture Radar and Optical Land Cover Segmentation	Debopom Sutradhar et.al.	2509.11952	null
2025-09-16	Enriched text-guided variational multimodal knowledge distillation network (VMD) for automated diagnosis of plaque vulnerability in 3D carotid artery MRI	Bo Cao et.al.	2509.11924	null
2025-09-15	SpecVLM: Fast Speculative Decoding in Vision-Language Models	Haiduo Huang et.al.	2509.11815	null
2025-09-15	Visualization and Analysis of the Loss Landscape in Graph Neural Networks	Samir Moustafa et.al.	2509.11792	null
2025-09-15	Quantization Errors, Human–AI Interaction, and Approximate Fixed Points in $L^1(μ)$	Faruk Alpay et.al.	2509.11700	null
2025-09-15	DARD: Dice Adversarial Robustness Distillation against Adversarial Attacks	Jing Zou et.al.	2509.11525	null
2025-09-14	Knowledge Distillation for Sensing-Assisted Long-Term Beam Tracking in mmWave Communications	Mengyuan Ma et.al.	2509.11419	null
2025-09-14	Investigating the Lottery Ticket Hypothesis for Variational Quantum Circuits	Michael Kölle et.al.	2509.11190	null
2025-09-16	Optimal Brain Restoration for Joint Quantization and Sparsification of LLMs	Hang Guo et.al.	2509.11177	null
2025-09-14	SVR-GS: Spatially Variant Regularization for Probabilistic Masks in 3D Gaussian Splatting	Ashkan Taghipour et.al.	2509.11116	null
2025-09-13	GAPrune: Gradient-Alignment Pruning for Domain-Aware Embeddings	Yixuan Tang et.al.	2509.10844	null
2025-09-12	Automated MCQA Benchmarking at Scale: Evaluating Reasoning Traces as Retrieval Sources for Domain Adaptation of Small Language Models	Ozan Gokdemir et.al.	2509.10744	null
2025-09-12	Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs	Yixiao Zhou et.al.	2509.10377	null
2025-09-12	Efficient Learned Image Compression Through Knowledge Distillation	Fabien Allemand et.al.	2509.10366	null
2025-09-12	I-Segmenter: Integer-Only Vision Transformer for Efficient Semantic Segmentation	Jordan Sassoon et.al.	2509.10334	null
2025-09-12	Investigating Language Model Capabilities to Represent and Process Formal Knowledge: A Preliminary Study to Assist Ontology Engineering	Hanna Abi Akl et.al.	2509.10249	null
2025-09-12	FedBiF: Communication-Efficient Federated Learning via Bits Freezing	Shiwei Li et.al.	2509.10161	null
2025-09-12	Scalable Training for Vector-Quantized Networks with 100% Codebook Utilization	Yifan Chang et.al.	2509.10140	null
2025-09-12	Efficient and Accurate Downfacing Visual Inertial Odometry	Jonas Kühne et.al.	2509.10021	null
2025-09-12	Toward Green Code: Prompting Small Language Models for Energy-Efficient Code Generation	Humza Ashraf et.al.	2509.09947	null
2025-09-12	Acoustic Scene Classification Using CNN-GRU Model Without Knowledge Distillation	Ee-Leng Tan et.al.	2509.09931	null
2025-09-11	ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms	Bingxin Xu et.al.	2509.09679	null
2025-09-11	ReBaNO: Reduced Basis Neural Operator Mitigating Generalization Gaps and Achieving Discretization Invariance	Haolan Zheng et.al.	2509.09611	null
2025-09-11	Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference	Haoran Wu et.al.	2509.09505	null
2025-09-11	Unified Start, Personalized End: Progressive Pruning for Efficient 3D Medical Image Segmentation	Linhao Li et.al.	2509.09267	link
2025-09-11	Adaptive Knowledge Distillation using a Device-Aware Teacher for Low-Complexity Acoustic Scene Classification	Seung Gyu Jeong et.al.	2509.09262	null
2025-09-11	SQAP-VLA: A Synergistic Quantization-Aware Pruning Framework for High-Performance Vision-Language-Action Models	Hengyu Fang et.al.	2509.09090	null
2025-09-10	CSI Compression Beyond Latents: End-to-End Hybrid Attention-CNN Networks with Entropy Regularization	Maryam Ansarifard et.al.	2509.08776	null
2025-09-10	Compressing CNN models for resource-constrained systems by channel and layer pruning	Ahmed Sadaqa et.al.	2509.08714	null
2025-09-10	BitROM: Weight Reload-Free CiROM Architecture Towards Billion-Parameter 1.58-bit LLM Inference	Wenlun Zhang et.al.	2509.08542	null
2025-09-12	SINDI: an Efficient Index for Approximate Maximum Inner Product Search on Sparse Vectors	Ruoxuan Li et.al.	2509.08395	null
2025-09-10	Mitigating Catastrophic Forgetting in Large Language Models with Forgetting-aware Pruning	Wei Huang et.al.	2509.08255	null
2025-09-10	Strategies for Improving Communication Efficiency in Distributed and Federated Learning: Compression, Local Training, and Personalization	Kai Yi et.al.	2509.08233	null
2025-09-09	Risk-Bounded Multi-Agent Visual Navigation via Dynamic Budget Allocation	Viraj Parimi et.al.	2509.08157	null
2025-09-09	Tensor-Train Operator Inference	Engin Danis et.al.	2509.08071	null
2025-09-09	SA-OOSC: A Multimodal LLM-Distilled Semantic Communication Framework for Enhanced Coding Efficiency with Scenario Understanding	Feifan Zhang et.al.	2509.07436	null
2025-09-09	The Role of Exploration Modules in Small Language Models for Knowledge Graph Question Answering	Yi-Jie Cheng et.al.	2509.07399	null
2025-09-09	Knowledge Distillation Driven Semantic NOMA for Image Transmission with Diffusion Model	Qifei Wang et.al.	2509.07363	null
2025-09-09	Word2Spike: Poisson Rate Coding for Associative Memories and Neuromorphic Algorithms	Archit Kalra et.al.	2509.07361	null
2025-09-09	Quantization of the electromagnetic fields from single atomic or molecular radiators	Valerica Raicu et.al.	2509.07359	null
2025-09-08	Recursive algorithm for constructing antisymmetric fermionic states in first quantization mapping	E. Rule et.al.	2509.07279	null
2025-09-08	HealthSLM-Bench: Benchmarking Small Language Models for Mobile and Wearable Healthcare Monitoring	Xin Wang et.al.	2509.07260	null
2025-09-08	Efficient Multi-Agent Coordination via Dynamic Joint-State Graph Construction	Yanlin Zhou et.al.	2509.07234	null
2025-09-08	Efficient Low-Memory Fast Stack Decoding with Variance Polarization for PAC Codes	Mohsen Moradi et.al.	2509.07231	null
2025-09-08	Explaining How Quantization Disparately Skews a Model	Abhimanyu Bellam et.al.	2509.07222	null
2025-09-07	MEGS$^{2}$: Memory-Efficient Gaussian Splatting via Spherical Gaussians and Unified Pruning	Jiarui Chen et.al.	2509.07021	null
2025-09-08	H$_{2}$OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers	Wenhao Li et.al.	2509.06956	null
2025-09-08	COMPACT: Common-token Optimized Model Pruning Across Channels and Tokens	Eugene Kwek et.al.	2509.06836	null
2025-09-08	Tree of Agents: Improving Long-Context Capabilities of Large Language Models through Multi-Perspective Reasoning	Song Yu et.al.	2509.06436	null
2025-09-08	Index-Preserving Lightweight Token Pruning for Efficient Document Understanding in Vision-Language Models	Jaemin Son et.al.	2509.06415	null
2025-09-08	3DOF+Quantization: 3DGS quantization for large scenes with limited Degrees of Freedom	Matthieu Gendrin et.al.	2509.06400	null
2025-09-08	Variational Garrote for Statistical Physics-based Sparse and Robust Variable Selection	Hyungjoon Soh et.al.	2509.06383	null
2025-09-08	Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?	Junjie Mu et.al.	2509.06350	null
2025-09-08	LoaQ: Layer-wise Output Approximation Quantization	Li Lin et.al.	2509.06297	null
2025-09-15	FineServe: Precision-Aware KV Slab and Two-Level Scheduling for Heterogeneous Precision LLM Serving	Kyungmin Bin et.al.	2509.06261	null
2025-09-10	BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models	Yuming Li et.al.	2509.06040	null
2025-09-07	StripDet: Strip Attention-Based Lightweight 3D Object Detection from Point Cloud	Weichao Wang et.al.	2509.05954	null
2025-09-07	Quantization of bounded symplectic domains associated with compact Lie groups	Alexey A. Sharapov et.al.	2509.05931	null
2025-09-06	Batalin-Fradkin-Vilkovisky Quantization of FLPR model	Ansha S. Nair et.al.	2509.05632	null
2025-09-06	Quantization of spin circular photogalvanic effect in altermagnetic Weyl semimetals	Hiroki Yoshida et.al.	2509.05620	null
2025-09-06	SpecPrune-VLA: Accelerating Vision-Language-Action Models via Action-Aware Self-Speculative Pruning	Hanzhen Wang et.al.	2509.05614	null
2025-09-09	Mitigating Spurious Correlations Between Question and Answer via Chain-of-Thought Correctness Perception Distillation	Hongyan Xie et.al.	2509.05602	null
2025-09-06	ProfilingAgent: Profiling-Guided Agentic Reasoning for Adaptive Model Optimization	Sadegh Jafari et.al.	2509.05584	null
2025-09-06	Sensitivity-Aware Post-Training Quantization for Deep Neural Networks	Zekang Zheng et.al.	2509.05576	null
2025-09-05	SuperSNN: A Hardware-Aware Framework for Physically Realizable, High-Performance Superconducting Spiking Neural Network Chips	Changxu Song et.al.	2509.05532	null
2025-09-05	Dynamic Sensitivity Filter Pruning using Multi-Agent Reinforcement Learning For DCNN’s	Iftekhar Haider Chowdhury et.al.	2509.05446	null
2025-09-05	Accuracy-Constrained CNN Pruning for Efficient and Reliable EEG-Based Seizure Detection	Mounvik K et.al.	2509.05190	null
2025-09-05	FLOWER: Democratizing Generalist Robot Policies with Efficient Vision-Language-Action Flow Policies	Moritz Reuss et.al.	2509.04996	null
2025-09-05	PLaMo 2 Technical Report	Preferred Networks et.al.	2509.04897	null
2025-09-05	AI-Driven Fronthaul Link Compression in Wireless Communication Systems: Review and Method Design	Keqin Zhang et.al.	2509.04805	null
2025-09-05	STADI: Fine-Grained Step-Patch Diffusion Parallelism for Heterogeneous GPUs	Han Liang et.al.	2509.04719	null
2025-09-08	Advancing SLM Tool-Use Capability using Reinforcement Learning	Dhruvi Paprunia et.al.	2509.04518	null
2025-09-02	ProST: Progressive Sub-task Training for Pareto-Optimal Multi-agent Systems Using Small Language Models	Biddut Sarker Bijoy et.al.	2509.04508	null
2025-09-04	PagedEviction: Structured Block-wise KV Cache Pruning for Efficient Large Language Model Inference	Krishna Teja Chitty-Venkata et.al.	2509.04377	null
2025-09-04	Integrating Pruning with Quantization for Efficient Deep Neural Networks Compression	Sara Makenali et.al.	2509.04244	null
2025-09-04	Real Time FPGA Based Transformers & VLMs for Vision Tasks: SOTA Designs and Optimizations	Safa Mohammed Sali et.al.	2509.04162	null
2025-09-04	Real Time FPGA Based CNNs for Detection, Classification, and Tracking in Autonomous Systems: State of the Art Designs and Optimizations	Safa Mohammed Sali et.al.	2509.04153	null
2025-09-04	Duality between polyhedral approximation of value functions and optimal quantization of measures	Abdellah Bulaich Mehamdi et.al.	2509.04101	null
2025-09-04	Robust MIMO Semantic Communication with Imperfect CSI via Knowledge Distillation	Mingze Gong et.al.	2509.04005	null
2025-09-04	Data-Augmented Quantization-Aware Knowledge Distillation	Justin Kur et.al.	2509.03850	null
2025-09-03	QuantV2X: A Fully Quantized Multi-Agent System for Cooperative Perception	Seth Z. Zhao et.al.	2509.03704	null
2025-09-03	DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling	Yubo Gao et.al.	2509.03472	null
2025-09-08	Amplifying Effective CXL Memory Bandwidth for LLM Inference via Transparent Near-Data Processing	Rui Xie et.al.	2509.03377	null
2025-09-03	NeurStore: Efficient In-database Deep Learning Model Management System	Siqi Xiang et.al.	2509.03228	null
2025-09-03	BAMG: A Block-Aware Monotonic Graph Index for Disk-Based Approximate Nearest Neighbor Search	Huiling Li et.al.	2509.03226	null
2025-09-03	CapsBeam: Accelerating Capsule Network based Beamformer for Ultrasound Non-Steered Plane Wave Imaging on Field Programmable Gate Array	Abdul Rahoof et.al.	2509.03201	null
2025-09-03	Deep Self-knowledge Distillation: A hierarchical supervised learning for coronary artery segmentation	Mingfeng Lin et.al.	2509.03173	null
2025-09-03	FastCaps: A Design Methodology for Accelerating Capsule Network on Field Programmable Gate Arrays	Abdul Rahoof et.al.	2509.03103	null
2025-09-03	Binary Quantization For LLMs Through Dynamic Grouping	Xinzhe Zheng et.al.	2509.03054	null
2025-09-02	LExI: Layer-Adaptive Active Experts for Efficient MoE Model Inference	Krishna Teja Chitty-Venkata et.al.	2509.02753	null
2025-09-02	A quantization of the $\operatorname{SL}_2(\mathbb{C})$-Chern-Simons invariant of tangle exteriors	Calvin McPhail-Snyder et.al.	2509.02365	null
2025-09-02	All-optical band structure reconstruction and onset of Landau quantization of Dirac fermions	Josef Riepl et.al.	2509.02362	null
2025-09-02	Operator Algebras and Third Quantization	Yidong Chen et.al.	2509.02293	null

🔄 Data Augmentation

📊 130 papers

📅 Publish Date	📝 Title	👥 Authors	📄 PDF	💻 Code
2025-09-23	Generative data augmentation for biliary tract detection on intraoperative images	Cristina Iacono et.al.	2509.18958	null
2025-09-23	PIE: Perception and Interaction Enhanced End-to-End Motion Planning for Autonomous Driving	Chengran Yuan et.al.	2509.18609	null
2025-09-23	SynSonic: Augmenting Sound Event Detection through Text-to-Audio Diffusion ControlNet and Effective Sample Filtering	Jiarui Hai et.al.	2509.18603	null
2025-09-23	Efficient Breast and Ovarian Cancer Classification via ViT-Based Preprocessing and Transfer Learning	Richa Rawat et.al.	2509.18553	null
2025-09-23	Reverse-Complement Consistency for DNA Language Models	Mingqian Ma et.al.	2509.18529	null
2025-09-21	Automatic Classification of Magnetic Chirality of Solar Filaments from H-Alpha Observations	Alexis Chalmers et.al.	2509.18214	null
2025-09-22	Intra-Cluster Mixup: An Effective Data Augmentation Technique for Complementary-Label Learning	Tan-Ha Mai et.al.	2509.17971	null
2025-09-22	SeqUDA-Rec: Sequential User Behavior Enhanced Recommendation via Global Unsupervised Data Augmentation for Personalized Content Marketing	Ruihan Luo et.al.	2509.17361	null
2025-09-21	Enhanced Detection of Tiny Objects in Aerial Images	Kihyun Kim et.al.	2509.17078	null
2025-09-23	Penalizing Boundary Activation for Object Completeness in Diffusion Models	Haoyang Xu et.al.	2509.16968	null
2025-09-20	IPF-RDA: An Information-Preserving Framework for Robust Data Augmentation	Suorong Yang et.al.	2509.16678	null
2025-09-20	MedCutMix: A Data-Centric Approach to Improve Radiology Vision-Language Pre-training with Disease Awareness	Sinuo Wang et.al.	2509.16673	null
2025-09-20	AISTAT lab system for DCASE2025 Task6: Language-based audio retrieval	Hyun Jun Kim et.al.	2509.16649	null
2025-09-19	Intrinsic Meets Extrinsic Fairness: Assessing the Downstream Impact of Bias Mitigation in Large Language Models	Mina Arzaghi et.al.	2509.16462	null
2025-09-19	Evaluating the Effectiveness and Scalability of LLM-Based Data Augmentation for Retrieval	Pranjal A. Chitale et.al.	2509.16442	null
2025-09-19	DistillMatch: Leveraging Knowledge Distillation from Vision Foundation Model for Multimodal Image Matching	Meng Yang et.al.	2509.16017	null
2025-09-19	Chunk Based Speech Pre-training with High Resolution Finite Scalar Quantization	Yun Tang et.al.	2509.15579	null
2025-09-19	Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection	Xinxin Meng et.al.	2509.15570	null
2025-09-18	Generative AI Meets Wireless Sensing: Towards Wireless Foundation Model	Zheng Yang et.al.	2509.15258	null
2025-09-17	GenCAD-3D: CAD Program Generation using Multimodal Latent Space Alignment and Synthetic Dataset Balancing	Nomi Yu et.al.	2509.15246	null
2025-09-18	Synthetic-to-Real Object Detection using YOLOv11 and Domain Randomization Strategies	Luisa Torquato Niño et.al.	2509.15045	null
2025-09-18	Data Augmentation via Latent Diffusion Models for Detecting Smell-Related Objects in Historical Artworks	Ahmed Sheta et.al.	2509.14755	null
2025-09-18	SpeechMLC: Speech Multi-label Classification	Miseul Kim et.al.	2509.14677	null
2025-09-18	How Does Instrumental Music Help SingFake Detection?	Xuanjun Chen et.al.	2509.14675	null
2025-09-18	SWE-QA: Can Language Models Answer Repository-level Code Questions?	Weihan Peng et.al.	2509.14635	null
2025-09-18	Mitigating Intra-Speaker Variability in Diarization with Style-Controllable Speech Augmentation	Miseul Kim et.al.	2509.14632	null
2025-09-18	LSTC-MDA: A Unified Framework for Long-Short Term Temporal Convolution and Mixed Data Augmentation in Skeleton-Based Action Recognition	Feng Ding et.al.	2509.14619	null
2025-09-18	Leveraging IndoBERT and DistilBERT for Indonesian Emotion Classification in E-Commerce Reviews	William Christian et.al.	2509.14611	null
2025-09-18	VisMoDAl: Visual Analytics for Evaluating and Improving Corruption Robustness of Vision-Language Models	Huanchen Wang et.al.	2509.14571	null
2025-09-18	Learning to Retrieve for Environmental Knowledge Discovery: An Augmentation-Adaptive Self-Supervised Learning Framework	Shiyuan Luo et.al.	2509.14563	null
2025-09-18	Data coarse graining can improve model performance	Alex Nguyen et.al.	2509.14498	null
2025-09-17	Sequential Data Augmentation for Generative Recommendation	Geon Lee et.al.	2509.13648	null
2025-09-17	Multimodal signal fusion for stress detection using deep neural networks: a novel approach for converting 1D signals to unified 2D images	Yasin Hasanpoor et.al.	2509.13636	null
2025-09-16	Adversarial Appearance Learning in Augmented Cityscapes for Pedestrian Recognition in Autonomous Driving	Artem Savkin et.al.	2509.13507	null
2025-09-16	Contrastive timbre representations for musical instrument and synthesizer retrieval	Gwendal Le Vaillant et.al.	2509.13285	null
2025-09-16	Time-step Mixup for Efficient Spiking Knowledge Transfer from Appearance to Event Domain	Yuqi Xie et.al.	2509.12959	null
2025-09-16	Synthetic Protein-Ligand Complex Generation for Deep Molecular Docking	Sofiene Khiari et.al.	2509.12915	null
2025-09-16	Cumulative Consensus Score: Label-Free and Model-Agnostic Evaluation of Object Detectors in Deployment	Avinaash Manoharan et.al.	2509.12871	null
2025-09-20	Data Augmentation for Maltese NLP using Transliterated and Machine Translated Arabic Data	Kurt Micallef et.al.	2509.12853	null
2025-09-16	Double Helix Diffusion for Cross-Domain Anomaly Image Generation	Linchun Wu et.al.	2509.12787	null
2025-09-15	Robust Fetal Pose Estimation across Gestational Ages via Cross-Population Augmentation	Sebastian Diaz et.al.	2509.12062	null
2025-09-15	Learning to Generate 4D LiDAR Sequences	Ao Liang et.al.	2509.11959	null
2025-09-15	Automated training of neural-network interatomic potentials	Davide Bidoggia et.al.	2509.11703	null
2025-09-15	DTGen: Generative Diffusion-Based Few-Shot Data Augmentation for Fine-Grained Dirty Tableware Recognition	Lifei Hao et.al.	2509.11661	null
2025-09-15	Task Decoding based on Eye Movements using Synthetic Data Augmentation	Shanmuka Sadhu et.al.	2509.11547	null
2025-09-14	An Entropy-Guided Curriculum Learning Strategy for Data-Efficient Acoustic Scene Classification under Domain Shift	Peihong Zhang et.al.	2509.11168	null
2025-09-14	An Advanced Convolutional Neural Network for Bearing Fault Diagnosis under Limited Data	Shengke Sun et.al.	2509.11053	null
2025-09-13	Point-Plane Projections for Accurate LiDAR Semantic Segmentation in Small Data Scenarios	Simone Mosco et.al.	2509.10841	null
2025-09-01	MIDOG 2025 Track 2: A Deep Learning Model for Classification of Atypical and Normal Mitotic Figures under Class and Hardness Imbalances	Sujatha Kotte et.al.	2509.10502	null
2025-09-12	Improving Audio Event Recognition with Consistency Regularization	Shanmuka Sadhu et.al.	2509.10391	null
2025-09-12	Scaling Arabic Medical Chatbots Using Synthetic Data: Enhancing Generative AI with Synthetic Patient Records	Abdulrahman Allam et.al.	2509.10108	null
2025-09-11	Combining Textual and Spectral Features for Robust Classification of Pilot Communications	Abdullah All Tanvir et.al.	2509.09752	null
2025-09-24	Structure Matters: Brain Graph Augmentation via Learnable Edge Masking for Data-efficient Psychiatric Diagnosis	Mujie Liu et.al.	2509.09744	null
2025-09-11	Virtual staining for 3D X-ray histology of bone implants	Sarah C. Irvine et.al.	2509.09235	null
2025-09-11	Target-oriented Multimodal Sentiment Classification with Counterfactual-enhanced Debiasing	Zhiyue Liu et.al.	2509.09160	null
2025-09-10	Handling Open-Vocabulary Constructs in Formalizing Specifications: Retrieval-Augmented Parsing with Expert Knowledge	Mohammad Saqib Hasan et.al.	2509.08808	null
2025-09-10	ADHDeepNet From Raw EEG to Diagnosis: Improving ADHD Diagnosis through Temporal-Spatial Processing, Adaptive Attention Mechanisms, and Explainability in Raw EEG Signals	Ali Amini et.al.	2509.08779	null
2025-09-10	Ensemble Distribution Distillation for Self-Supervised Human Activity Recognition	Matthew Nolan et.al.	2509.08225	null
2025-09-09	Transformer-Based Approach to Optimal Sensor Placement for Structural Health Monitoring of Probe Cards	Mehdi Bejani et.al.	2509.07603	null
2025-09-09	From Scarcity to Efficiency: Investigating the Effects of Data Augmentation on African Machine Translation	Mardiyyah Oduwole et.al.	2509.07471	null
2025-09-08	Breast Cancer Detection in Thermographic Images via Diffusion-Based Augmentation and Nonlinear Feature Fusion	Sepehr Salem et.al.	2509.07277	null
2025-09-08	Pothole Detection and Recognition based on Transfer Learning	Mang Hu et.al.	2509.06750	null
2025-09-08	Contrastive Self-Supervised Network Intrusion Detection using Augmented Negative Pairs	Jack Wilkie et.al.	2509.06550	null
2025-09-08	IGAff: Benchmarking Adversarial Iterative and Genetic Affine Algorithms on Deep Neural Networks	Sebastian-Vasile Echim et.al.	2509.06459	null
2025-09-08	CAPMix: Robust Time Series Anomaly Detection Based on Abnormal Assumptions with Dual-Space Mixup	Xudong Mou et.al.	2509.06419	null
2025-09-08	PL-CA: A Parametric Legal Case Augmentation Framework	Ao Chang et.al.	2509.06356	null
2025-09-07	Exploring Light-Weight Object Recognition for Real-Time Document Detection	Lucas Wojcik et.al.	2509.06246	null
2025-09-07	Learning in ImaginationLand: Omnidirectional Policies through 3D Generative Models (OP-Gen)	Yifei Ren et.al.	2509.06191	null
2025-09-06	CardiacFlow: 3D+t Four-Chamber Cardiac Shape Completion and Generation via Flow Matching	Qiang Ma et.al.	2509.05754	null
2025-09-05	DuoCLR: Dual-Surrogate Contrastive Learning for Skeleton-based Human Action Segmentation	Haitao Tian et.al.	2509.05543	null
2025-09-05	Handling Data Gaps for the Next Generation of Gravitational-Wave Observatories	Noah Pearson et.al.	2509.05479	null
2025-09-01	Handling imbalance and few-sample size in ML based Onion disease classification	Abhijeet Manoj Pal et.al.	2509.05341	null
2025-08-30	A Dataset Generation Scheme Based on Video2EEG-SPGN-Diffusion for SEED-VD	Yunfei Guo et.al.	2509.05321	null
2025-09-05	Uncertain but Useful: Leveraging CNN Variability into Data Augmentation	Inés Gonzalez-Pepe et.al.	2509.05238	null
2025-09-05	SL-SLR: Self-Supervised Representation Learning for Sign Language Recognition	Ariel Basso Madjoukeng et.al.	2509.05188	null
2025-09-05	Hybrid Matrix Factorization Based Graph Contrastive Learning for Recommendation System	Hao Chen et.al.	2509.05115	null
2025-09-05	Leveraging Transfer Learning and Mobile-enabled Convolutional Neural Networks for Improved Arabic Handwritten Character Recognition	Mohsine El Khayati et.al.	2509.05019	null
2025-09-05	Optimizing Small Transformer-Based Language Models for Multi-Label Sentiment Analysis in Short Texts	Julius Neumann et.al.	2509.04982	null
2025-09-05	DeGuV: Depth-Guided Visual Reinforcement Learning for Generalization and Interpretability in Manipulation	Tien Pham et.al.	2509.04970	null
2025-09-05	A transformer-BiGRU-based framework with data augmentation and confident learning for network intrusion detection	Jiale Zhang et.al.	2509.04925	null
2025-09-05	Evaluating Multiple Instance Learning Strategies for Automated Sebocyte Droplet Counting	Maryam Adelipour et.al.	2509.04895	null
2025-08-29	MOSAIC: A Multilingual, Taxonomy-Agnostic, and Computationally Efficient Approach for Radiological Report Classification	Alice Schiavone et.al.	2509.04471	null
2025-09-04	TauGenNet: Plasma-Driven Tau PET Image Synthesis via Text-Guided 3D Diffusion Models	Yuxin Gong et.al.	2509.04269	null
2025-09-04	How many patients could we save with LLM priors?	Shota Arai et.al.	2509.04250	null
2025-09-04	Explicit and Implicit Data Augmentation for Social Event Detection	Congbo Ma et.al.	2509.04202	null
2025-09-04	Chest X-ray Pneumothorax Segmentation Using EfficientNet-B4 Transfer Learning in a U-Net Architecture	Alvaro Aranibar Roque et.al.	2509.03950	null
2025-09-04	A Generative Foundation Model for Chest Radiography	Yuanfeng Ji et.al.	2509.03903	null
2025-09-04	Data-Augmented Quantization-Aware Knowledge Distillation	Justin Kur et.al.	2509.03850	null
2025-09-03	Lightweight image segmentation for echocardiography	Anders Kjelsrud et.al.	2509.03631	null
2025-09-04	Invariant Features for Global Crop Type Classification	Xin-Yi Tong et.al.	2509.03497	null
2025-09-03	Joint Training of Image Generator and Detector for Road Defect Detection	Kuan-Chuan Peng et.al.	2509.03465	null
2025-09-02	Enhancing Machine Learning for Imbalanced Medical Data: A Quantum-Inspired Approach to Synthetic Oversampling (QI-SMOTE)	Vikas Kashtriya et.al.	2509.02863	null
2025-08-29	Foundation Model-Driven Classification of Atypical Mitotic Figures with Domain-Aware Training Strategies	Piotr Giedziun et.al.	2509.02601	null
2025-09-02	PalmX 2025: The First Shared Task on Benchmarking LLMs on Arabic and Islamic Culture	Fakhraddin Alwajih et.al.	2509.02550	null
2025-09-02	EmoPerso: Enhancing Personality Detection with Self-Supervised Emotion-Aware Modelling	Lingzhi Shen et.al.	2509.02450	null
2025-09-02	Improving Electroencephalogram-Based Deception Detection in Concealed Information Test under Low Stimulus Heterogeneity	Suhye Kim et.al.	2509.02234	null
2025-09-02	Enhancing Zero-Shot Pedestrian Attribute Recognition with Synthetic Data Generation: A Comparative Study with Image-To-Image Diffusion Models	Pablo Ayuso-Albizu et.al.	2509.02161	null
2025-09-02	A Data-Centric Approach to Pedestrian Attribute Recognition: Synthetic Augmentation via Prompt-driven Diffusion Models	Alejandro Alonso et.al.	2509.02099	null
2025-09-16	Abex-rat: Synergizing Abstractive Augmentation and Adversarial Training for Classification of Occupational Accident Reports	Jian Chen et.al.	2509.02072	null
2025-09-01	CabinSep: IR-Augmented Mask-Based MVDR for Real-Time In-Car Speech Separation with Distributed Heterogeneous Arrays	Runduo Han et.al.	2509.01399	null
2025-09-01	MARS: Modality-Aligned Retrieval for Sequence Augmented CTR Prediction	Yutian Xiao et.al.	2509.01184	null
2025-08-31	A Unified Denoising and Adaptation Framework for Self-Supervised Bengali Dialectal ASR	Swadhin Biswas et.al.	2509.00988	null
2025-09-05	Semi-Supervised Bayesian GANs with Log-Signatures for Uncertainty-Aware Credit Card Fraud Detection	David Hirnschall et.al.	2509.00931	null
2025-08-30	NoiseCutMix: A Novel Data Augmentation Approach by Mixing Estimated Noise in Diffusion Models	Shumpei Takezaki et.al.	2509.00378	null
2025-08-26	Amplifying Emotional Signals: Data-Efficient Deep Learning for Robust Speech Emotion Recognition	Tai Vu et.al.	2509.00077	null
2025-08-29	A Multi-Stage Fine-Tuning and Ensembling Strategy for Pancreatic Tumor Segmentation in Diagnostic and Therapeutic MRI	Omer Faruk Durugol et.al.	2508.21775	null
2025-08-29	QZhou-Embedding Technical Report	Peng Yu et.al.	2508.21632	null
2025-08-29	Towards On-Device Personalization: Cloud-device Collaborative Data Augmentation for Efficient On-device Language Model	Zhaofeng Zhong et.al.	2508.21313	null
2025-08-28	Reverse Imaging for Wide-spectrum Generalization of Cardiac MRI Segmentation	Yidong Zhao et.al.	2508.21254	null
2025-08-26	CoBA: Counterbias Text Augmentation for Mitigating Various Spurious Correlations via Semantic Triples	Kyohoon Jin et.al.	2508.21083	null
2025-08-28	Improved photometric redshift estimations through self-organising map-based data augmentation	Yun-Hao Zhang et.al.	2508.20903	null
2025-08-28	Re4: Scientific Computing Agent with Rewriting, Resolution, Review and Revision	Ao Cheng et.al.	2508.20729	null
2025-08-28	Compositionality in Time Series: A Proof of Concept using Symbolic Dynamics and Compositional Data Augmentation	Michael Hagmann et.al.	2508.20656	null
2025-08-28	Mask-Guided Multi-Channel SwinUNETR Framework for Robust MRI Classification	Smriti Joshi et.al.	2508.20621	null
2025-08-28	KCS: Diversify Multi-hop Question Generation with Knowledge Composition Sampling	Yangfan Wang et.al.	2508.20567	null
2025-08-28	Enhancing Health Fact-Checking with LLM-Generated Synthetic Data	Jingze Zhang et.al.	2508.20525	null
2025-08-27	IELDG: Suppressing Domain-Specific Noise with Inverse Evolution Layers for Domain Generalized Semantic Segmentation	Qizhe Fan et.al.	2508.19604	null
2025-08-27	Improving Recommendation Fairness via Graph Structure and Representation Augmentation	Tongxin Xu et.al.	2508.19547	null
2025-08-26	Database Entity Recognition with Data Augmentation and Deep Learning	Zikun Fu et.al.	2508.19372	null
2025-08-26	HuBE: Cross-Embodiment Human-like Behavior Execution for Humanoid Robots	Shipeng Lyu et.al.	2508.19002	null
2025-08-26	Enhancing compact convolutional transformers with super attention	Simpenzwe Honore Leandre et.al.	2508.18960	null
2025-08-26	SegReConcat: A Data Augmentation Method for Voice Anonymization Attack	Ridwan Arefeen et.al.	2508.18907	null
2025-08-26	Enhancing Video-Based Robot Failure Detection Using Task Knowledge	Santosh Thoduka et.al.	2508.18705	null
2025-08-26	Auditing Approximate Machine Unlearning for Differentially Private Models	Yuechun Gu et.al.	2508.18671	null
2025-08-25	Analise de Desaprendizado de Maquina em Modelos de Classificacao de Imagens Medicas	Andreza M. C. Falcao et.al.	2508.18509	null
2025-08-25	Data Augmentation Improves Machine Unlearning	Andreza M. C. Falcao et.al.	2508.18502	null
2025-08-29	German4All – A Dataset and Model for Readability-Controlled Paraphrasing in German	Miriam Anschütz et.al.	2508.17973	null
2025-08-25	Diffusion-Based Data Augmentation for Medical Image Segmentation	Maham Nazir et.al.	2508.17844	null
2025-08-25	LLMulator: Generalizable Cost Modeling for Dataflow Accelerators with Input-Adaptive Control Flow	Kaiyan Chang et.al.	2508.17826	null
2025-08-24	LodeStar: Long-horizon Dexterity via Synthetic Data Augmentation from Human Demonstrations	Weikang Wan et.al.	2508.17547	null

🎨 Synthetic Generation

📊 161 papers

📅 Publish Date	📝 Title	👥 Authors	📄 PDF	💻 Code
2025-09-23	CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching	Chen Chen et.al.	2509.19300	null
2025-09-23	Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation	Sherwin Bahmani et.al.	2509.19296	null
2025-09-23	Enabling Plant Phenotyping in Weedy Environments using Multi-Modal Imagery via Synthetic and Generated Training Data	Earl Ranario et.al.	2509.19208	null
2025-09-23	GSTM-HMU: Generative Spatio-Temporal Modeling for Human Mobility Understanding	Wenying Luo et.al.	2509.19135	null
2025-09-23	Extractive Fact Decomposition for Interpretable Natural Language Inference in one Forward Pass	Nicholas Popovič et.al.	2509.18901	null
2025-09-22	Hierarchical Semi-Markov Models with Duration-Aware Dynamics for Activity Sequences	Rohit Dube et.al.	2509.18414	null
2025-09-22	Evaluating the Creativity of LLMs in Persian Literary Text Generation	Armin Tourajmehr et.al.	2509.18401	null
2025-09-22	StereoFoley: Object-Aware Stereo Audio Generation from Video	Tornike Karchkhadze et.al.	2509.18272	null
2025-09-22	Synth-MIA: A Testbed for Auditing Privacy Leakage in Tabular Data Synthesis	Joshua Ward et.al.	2509.18014	null
2025-09-22	Autoregressive-Gaussian Mixture Models: Efficient Generative Modeling of WSS Signals	Kathrin Klein et.al.	2509.17953	null
2025-09-22	Unsupervised Learning and Representation of Mandarin Tonal Categories by a Generative CNN	Kai Schenck et.al.	2509.17859	null
2025-09-22	Semantic and Visual Crop-Guided Diffusion Models for Heterogeneous Tissue Synthesis in Histopathology	Saghir Alfasly et.al.	2509.17847	null
2025-09-22	GEM-T: Generative Tabular Data via Fitting Moments	Miao Li et.al.	2509.17752	null
2025-09-23	A Generative Framework for Personalized Sticker Retrieval	Changjiang Zhou et.al.	2509.17749	null
2025-09-22	PG-CE: A Progressive Generation Dataset with Constraint Enhancement for Controllable Text Generation	Yan Zhuang et.al.	2509.17669	null
2025-09-22	Is It Certainly a Deepfake? Reliability Analysis in Detection & Generation Ecosystem	Neslihan Kose et.al.	2509.17550	null
2025-09-22	Audiobook-CC: Controllable Long-context Speech Generation for Multicast Audiobook	Min Liu et.al.	2509.17516	null
2025-09-21	Echo-Path: Pathology-Conditioned Echo Video Generation	Kabir Hamzah Muhammad et.al.	2509.17190	null
2025-09-21	STAR: Speech-to-Audio Generation via Representation Learning	Zeyu Xie et.al.	2509.17164	null
2025-09-21	ScenGAN: Attention-Intensive Generative Model for Uncertainty-Aware Renewable Scenario Forecasting	Yifei Wu et.al.	2509.17119	null
2025-09-21	Deep Synthetic Cross-Project Approaches for Software Reliability Growth Modeling	Taehyoun Kim et.al.	2509.16939	null
2025-09-21	PRISM: Precision-Recall Informed Data-Free Knowledge Distillation via Generative Diffusion	Xuewan He et.al.	2509.16897	null
2025-09-20	DoubleGen: Debiased Generative Modeling of Counterfactuals	Alex Luedtke et.al.	2509.16842	null
2025-09-23	Pain in 3D: Generating Controllable Synthetic Faces for Automated Pain Assessment	Xin Lei Lin et.al.	2509.16727	null
2025-09-20	Semi-Supervised Synthetic Data Generation with Fine-Grained Relevance Control for Short Video Search Relevance Modeling	Haoran Li et.al.	2509.16717	null
2025-09-20	An Octave-based Multi-Resolution CQT Architecture for Diffusion-based Audio Generation	Maurício do V. M. da Costa et.al.	2509.16603	null
2025-09-20	A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis	Antonio Scardace et.al.	2509.16582	link
2025-09-20	SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning	Yuyang Ding et.al.	2509.16548	link
2025-09-20	ChemOrch: Empowering LLMs with Chemical Intelligence via Synthetic Instructions	Yue Huang et.al.	2509.16543	link
2025-09-20	mmExpert: Integrating Large Language Models for Comprehensive mmWave Data Synthesis and Understanding	Yifan Yan et.al.	2509.16521	null
2025-09-20	RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation	Tianyi Yan et.al.	2509.16500	null
2025-09-19	SynthIPD: assumption-lean synthetic individual patient data generation	Zixuan Zhao et.al.	2509.16466	null
2025-09-19	Entropic Causal Inference: Graph Identifiability	Spencer Compton et.al.	2509.16463	null
2025-09-19	Introducing Resizable Region Packing Problem in Image Generation, with a Heuristic Solution	Hrishikesh Sharma et.al.	2509.16363	null
2025-09-19	Guided Sequence-Structure Generative Modeling for Iterative Antibody Optimization	Aniruddh Raghu et.al.	2509.16357	null
2025-09-19	Rethinking Molecule Synthesizability with Chain-of-Reaction	Seul Lee et.al.	2509.16084	null
2025-09-19	Sampling String Vacua Using Generative Models	Moritz Walden et.al.	2509.16029	null
2025-09-19	Fed-PISA: Federated Voice Cloning via Personalized Identity-Style Adaptation	Qi Wang et.al.	2509.16010	null
2025-09-19	On Optimal Steering to Achieve Exact Fairness	Mohit Sharma et.al.	2509.15759	null
2025-09-19	TrueMoE: Dual-Routing Mixture of Discriminative Experts for Synthetic Image Detection	Laixin Zhang et.al.	2509.15741	null
2025-09-19	Toward Medical Deepfake Detection: A Comprehensive Dataset and Novel Method	Shuaibo Li et.al.	2509.15711	null
2025-09-19	Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification	Zinan Lin et.al.	2509.15591	null
2025-09-19	LiteLong: Resource-Efficient Long-Context Data Synthesis for LLMs	Junlong Jia et.al.	2509.15568	null
2025-09-19	Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech	Xinlei Niu et.al.	2509.15492	null
2025-09-18	Discrete Flow-Based Generative Models for Measurement Optimization in Quantum Computing	Isaac L. Huidobro-Meezs et.al.	2509.15486	null
2025-09-18	Efficient Multimodal Dataset Distillation via Generative Models	Zhenghao Zhao et.al.	2509.15472	null
2025-09-18	PILOT: Steering Synthetic Data Generation with Psychological & Linguistic Output Targeting	Caitlin Cisar et.al.	2509.15447	null
2025-09-18	Causal Fingerprints of AI Generative Models	Hui Xu et.al.	2509.15406	null
2025-09-18	Autoguided Online Data Curation for Diffusion Model Training	Valeria Pais et.al.	2509.15267	null
2025-09-18	Emotion-Aware Speech Generation with Character-Specific Voices for Comics	Zhiwen Qian et.al.	2509.15253	null
2025-09-18	Fair-GPTQ: Bias-Aware Quantization for Large Language Models	Irina Proskurina et.al.	2509.15206	null
2025-09-18	Learning Mechanistic Subtypes of Neurodegeneration with a Physics-Informed Variational Autoencoder Mixture Model	Sanduni Pinnawala et.al.	2509.15124	null
2025-09-19	Sea-ing Through Scattered Rays: Revisiting the Image Formation Model for Realistic Underwater Image Generation	Vasiliki Ismiroglou et.al.	2509.15011	null
2025-09-20	SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding	Bingsong Bai et.al.	2509.14946	null
2025-09-18	Mitigating data replication in text-to-audio generative diffusion models through anti-memorization guidance	Francisco Messina et.al.	2509.14934	null
2025-09-19	MeanFlowSE: one-step generative speech enhancement via conditional mean flow	Duojia Li et.al.	2509.14858	null
2025-09-18	SynBench: A Benchmark for Differentially Private Text Generation	Yidan Sun et.al.	2509.14594	null
2025-09-18	Cross-Lingual F5-TTS: Towards Language-Agnostic Voice Cloning and Speech Synthesis	Qingyu Liu et.al.	2509.14579	null
2025-09-17	A generative model of function growth explains hidden self-similarities across biological and social systems	James Holehouse et.al.	2509.14468	null
2025-09-15	SpeechWeave: Diverse Multilingual Synthetic Text & Audio Data Generation Pipeline for Training Text to Speech Models	Karan Dua et.al.	2509.14270	null
2025-09-17	Quantum Reinforcement Learning-Guided Diffusion Model for Image Synthesis via Hybrid Quantum-Classical Generative Model Architectures	Chi-Sheng Chen et.al.	2509.14163	null
2025-09-19	FlightDiffusion: Revolutionising Autonomous Drone Training with Diffusion Models Generating FPV Video	Valerii Serpiva et.al.	2509.14082	null
2025-09-17	Lightweight Implicit Neural Network for Binaural Audio Synthesis	Xikun Lu et.al.	2509.14069	null
2025-09-17	Enhancing Time Awareness in Generative Recommendation	Sunkyung Lee et.al.	2509.13957	null
2025-09-17	Synthetic Data Generation for Screen Time and App Usage	Gustavo Kruger et.al.	2509.13892	null
2025-09-17	EDITS: Enhancing Dataset Distillation with Implicit Textual Semantics	Qianxin Xia et.al.	2509.13858	null
2025-09-17	CraftMesh: High-Fidelity Generative Mesh Manipulation via Poisson Seamless Fusion	James Jincheng et.al.	2509.13688	null
2025-09-17	AgentCTG: Harnessing Multi-Agent Collaboration for Fine-Grained Precise Control in Text Generation	Xinxu Zhou et.al.	2509.13677	null
2025-09-17	LLM-I: LLMs are Naturally Interleaved Multimodal Creators	Zirun Guo et.al.	2509.13642	null
2025-09-17	Privacy-Aware In-Context Learning for Large Language Models	Bishnu Bhusal et.al.	2509.13625	null
2025-09-14	Synthetic Data and the Shifting Ground of Truth	Dietmar Offenhuber et.al.	2509.13355	null
2025-09-16	SURGIN: SURrogate-guided Generative INversion for subsurface multiphase flow with quantified uncertainty	Zhao Feng et.al.	2509.13189	null
2025-09-17	TeraSim-World: Worldwide Safety-Critical Data Synthesis for End-to-End Autonomous Driving	Jiawei Wang et.al.	2509.13164	null
2025-09-16	A Synthetic Data Pipeline for Supporting Manufacturing SMEs in Visual Assembly Control	Jonas Werheid et.al.	2509.13089	null
2025-09-16	MSR-Codec: A Low-Bitrate Multi-Stream Residual Codec for High-Fidelity Speech Generation with Information Disentanglement	Jingyu Li et.al.	2509.13068	null
2025-09-16	MIA-EPT: Membership Inference Attack via Error Prediction for Tabular Data	Eyal German et.al.	2509.13046	null
2025-09-16	A Lightweight Pipeline for Noisy Speech Voice Cloning and Accurate Lip Sync Synthesis	Javeria Amir et.al.	2509.12831	null
2025-09-16	ConvergeWriter: Data-Driven Bottom-Up Article Construction	Binquan Ji et.al.	2509.12811	null
2025-09-16	Toward Ownership Understanding of Objects: Active Question Generation with Large Language Model and Probabilistic Generative Model	Saki Hashimoto et.al.	2509.12754	null
2025-09-16	Chat-Driven Text Generation and Interaction for Person Retrieval	Zequn Xie et.al.	2509.12662	null
2025-09-15	MTEB-NL and E5-NL: Embedding Benchmark and Models for Dutch	Nikolay Banar et.al.	2509.12340	null
2025-09-15	VADER: A Variational Autoencoder to Infer Planetary Masses and Gas-Dust Disk Properties Around Young Stars	Sayed Shafaat Mahmud et.al.	2509.12324	null
2025-09-14	Prediction of Stocks Index Price using Quantum GANs	Sangram Deshpande et.al.	2509.12286	null
2025-09-15	OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling	Yang Zhou et.al.	2509.12201	null
2025-09-15	Learning Majority-to-Minority Transformations with MMD and Triplet Loss for Imbalanced Classification	Suman Cha et.al.	2509.11511	null
2025-09-14	Scaling Up Forest Vision with Synthetic Data	Yihang She et.al.	2509.11201	null
2025-09-14	Differentially-private text generation degrades output language quality	Erion Çano et.al.	2509.11176	null
2025-09-14	STASE: A spatialized text-to-audio synthesis engine for music generation	Tutti Chi et.al.	2509.11124	null
2025-09-14	Filling the Gaps: A Multitask Hybrid Multiscale Generative Framework for Missing Modality in Remote Sensing Semantic Segmentation	Nhi Kieu et.al.	2509.11102	null
2025-09-14	Patient-Zero: A Unified Framework for Real-Record-Free Patient Agent Generation	Yunghwei Lai et.al.	2509.11078	null
2025-09-13	Term2Note: Synthesising Differentially Private Clinical Notes from Medical Terms	Yuping Wu et.al.	2509.10882	null
2025-09-13	CogGNN: Cognitive Graph Neural Networks in Generative Connectomics	Mayssa Soussia et.al.	2509.10864	null
2025-09-12	Struct-Bench: A Benchmark for Differentially Private Structured Text Generation	Shuaiqi Wang et.al.	2509.10696	null
2025-09-12	Humanizing Automated Programming Feedback: Fine-Tuning Generative Models with Student-Written Feedback	Victor-Alexandru Pădurean et.al.	2509.10647	null
2025-09-11	The Coding Limits of Robust Watermarking for Generative Models	Danilo Francati et.al.	2509.10577	null
2025-09-12	Differentially Private Decentralized Dataset Synthesis Through Randomized Mixing with Correlated Noise	Utsab Saha et.al.	2509.10385	null
2025-09-12	Merging Physics-Based Synthetic Data and Machine Learning for Thermal Monitoring of Lithium-ion Batteries: The Role of Data Fidelity	Yusheng Zheng et.al.	2509.10380	null
2025-09-12	Arabic Large Language Models for Medical Text Generation	Abdulrahman Allam et.al.	2509.10095	null
2025-09-11	A Modular and Multimodal Generative AI Framework for Urban Building Energy Data: Generating Synthetic Homes	Jackson Eshbaugh et.al.	2509.09794	null
2025-09-11	OpenFake: An Open Dataset and Platform Toward Large-Scale Deepfake Detection	Victor Livernoche et.al.	2509.09495	null
2025-09-11	Diabatic quantum annealing for training energy-based generative models	Gilhan Kim et.al.	2509.09374	null
2025-09-11	HISPASpoof: A New Dataset For Spanish Speech Forensics	Maria Risques et.al.	2509.09155	null
2025-09-10	Generative quantum advantage for classical and quantum problems	Hsin-Yuan Huang et.al.	2509.09033	null
2025-09-12	ForTIFAI: Fending Off Recursive Training Induced Failure for AI Models	Soheil Zibakhsh Shabgahi et.al.	2509.08972	null
2025-09-10	PromptGuard: An Orchestrated Prompting Framework for Principled Synthetic Text Generation for Vulnerable Populations using LLMs with Enhanced Safety, Fairness, and Controllability	Tung Vu et.al.	2509.08910	null
2025-09-10	GeneVA: A Dataset of Human Annotations for Generative Text to Video Artifacts	Jenna Kang et.al.	2509.08818	null
2025-09-10	Learning Turbulent Flows with Generative Models: Super-resolution, Forecasting, and Sparse Flow Reconstruction	Vivek Oommen et.al.	2509.08752	null
2025-09-10	Design-GenNO: A Physics-Informed Generative Model with Neural Operators for Inverse Microstructure Design	Yaohua Zang et.al.	2509.08749	null
2025-09-11	Generative Data Refinement: Just Ask for Better Data	Minqi Jiang et.al.	2509.08653	null
2025-09-10	Variational Rank Reduction Autoencoders for Generative Thermal Design	Alicia Tierz et.al.	2509.08515	null
2025-09-10	A Structured Review of Underwater Object Detection Challenges and Solutions: From Traditional to Large Vision Language Models	Edwine Nabahirwa et.al.	2509.08490	null
2025-09-10	Joint Learning using Mixture-of-Expert-Based Representation for Enhanced Speech Generation and Robust Emotion Recognition	Jing-Tong Tzeng et.al.	2509.08470	null
2025-09-10	LLM-Guided Ansätze Design for Quantum Circuit Born Machines in Financial Generative Modeling	Yaswitha Gujju et.al.	2509.08385	null
2025-09-10	Persistent-DPO: A novel loss function and hybrid learning for generative quantum eigensolver	Junya Nakamura et.al.	2509.08351	null
2025-09-09	Performance Assessment Strategies for Generative AI Applications in Healthcare	Victor Garcia et.al.	2509.08087	null
2025-09-09	One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation	Zheng Geng et.al.	2509.07978	null
2025-09-09	Enhancements in Score-based Channel Estimation for Real-Time Wireless Systems	Florian Strasser et.al.	2509.07839	null
2025-09-09	A Generalisable Generative Model for Multi-Detector Calorimeter Simulation	Piyush Raikwar et.al.	2509.07700	null
2025-09-09	Spectral Masking and Interpolation Attack (SMIA): A Black-box Adversarial Attack against Voice Authentication and Anti-Spoofing Systems	Kamel Kamel et.al.	2509.07677	null
2025-09-09	Target matching based generative model for speech enhancement	Taihui Wang et.al.	2509.07521	null
2025-09-09	Synthetic Data Generation with Lorenzetti for Time Series Anomaly Detection in High-Energy Physics Calorimeters	Laura Boggia et.al.	2509.07451	null
2025-09-09	When Fine-Tuning is Not Enough: Lessons from HSAD on Hybrid and Adversarial Audio Spoof Detection	Bin Hu et.al.	2509.07323	null
2025-09-08	A transformer-based generative model for planetary systems	Yann Alibert et.al.	2509.07226	null
2025-09-08	Neurocognitive Modeling for Text Generation: Deep Learning Architecture for EEG Data	Khushiyant et.al.	2509.07202	null
2025-09-04	K-Syn: K-space Data Synthesis in Ultra Low-data Regimes	Guan Yu et.al.	2509.06997	null
2025-09-08	SynthDrive: Scalable Real2Sim2Real Sensor Simulation Pipeline for High-Fidelity Asset Generation and Driving Data Synthesis	Zhengqing Chen et.al.	2509.06798	null
2025-09-15	A Statistical 3D Stomach Shape Model for Anatomical Analysis	Erez Posner et.al.	2509.06464	null
2025-09-08	MeanFlow-Accelerated Multimodal Video-to-Audio Synthesis via One-Step Generation	Xiaoran Yang et.al.	2509.06389	null
2025-09-08	Text4Seg++: Advancing Image Segmentation via Generative Language Modeling	Mengcheng Lan et.al.	2509.06321	null
2025-09-07	If generative AI is the answer, what is the question?	Ambuj Tewari et.al.	2509.06120	null
2025-09-07	DreamAudio: Customized Text-to-Audio Generation with Diffusion Models	Yi Yuan et.al.	2509.06027	null
2025-09-06	GUIDe: Generative and Uncertainty-Informed Inverse Design for On-Demand Nonlinear Functional Responses	Haoxuan Dylan Mu et.al.	2509.05641	null
2025-09-04	SasAgent: Multi-Agent AI System for Small-Angle Scattering Data Analysis	Lijie Ding et.al.	2509.05363	null
2025-09-02	Ensembling Membership Inference Attacks Against Tabular Generative Models	Joshua Ward et.al.	2509.05350	null
2025-09-04	Improved 3D Scene Stylization via Text-Guided Generative Image Editing with Region-Based Control	Haruo Fujiwara et.al.	2509.05285	null
2025-09-05	Recomposer: Event-roll-guided generative audio editing	Daniel P. W. Ellis et.al.	2509.05256	null
2025-09-08	Probabilistic operator learning: generative modeling and uncertainty quantification for foundation models of differential equations	Benjamin J. Zhang et.al.	2509.05186	null
2025-09-05	Painting the market: generative diffusion models for financial limit order book simulation and forecasting	Alfred Backhouse et.al.	2509.05107	null
2025-09-05	QCA-MolGAN: Quantum Circuit Associative Molecular GAN with Multi-Agent Reinforcement Learning	Aaron Mark Thomas et.al.	2509.05051	null
2025-09-05	Efficient Video-to-Audio Generation via Multiple Foundation Models Mapper	Gehui Chen et.al.	2509.04957	null
2025-09-05	SynGen-Vision: Synthetic Data Generation for training industrial vision models	Alpana Dubey et.al.	2509.04894	null
2025-09-04	Transition Models: Rethinking the Generative Learning Objective	Zidong Wang et.al.	2509.04394	null
2025-09-04	AUDETER: A Large-scale Dataset for Deepfake Audio Detection in Open Worlds	Qizhou Wang et.al.	2509.04345	null
2025-09-04	Synthetic Survival Data Generation for Heart Failure Prognosis Using Deep Generative Models	Chanon Puttanawarut et.al.	2509.04245	null
2025-09-04	Synthesizing Sheet Music Problems for Evaluation and Reinforcement Learning	Zhilin Wang et.al.	2509.04059	null
2025-09-04	An invertible generative model for forward and inverse problems	Tristan van Leeuwen et.al.	2509.03910	null
2025-09-04	Diffusion Generative Models Meet Compressed Sensing, with Applications to Image Data and Financial Time Series	Zhengyi Guo et.al.	2509.03898	null
2025-09-03	LuxDiT: Lighting Estimation with Video Diffusion Transformer	Ruofan Liang et.al.	2509.03680	null
2025-09-05	CEHR-XGPT: A Scalable Multi-Task Foundation Model for Electronic Health Records	Chao Pang et.al.	2509.03643	null
2025-09-03	Multi-level SSL Feature Gating for Audio Deepfake Detection	Hoan My Tran et.al.	2509.03409	null
2025-09-03	Generative Auto-Bidding in Large-Scale Competitive Auctions via Diffusion Completer-Aligner	Yewen Li et.al.	2509.03348	null
2025-09-03	A Comprehensive Guide to Differential Privacy: From Theory to User Expectations	Napsu Karmitsa et.al.	2509.03294	null
2025-09-03	Improving Perceptual Audio Aesthetic Assessment via Triplet Loss and Self-Supervised Embeddings	Dyah A. M. G. Wisnu et.al.	2509.03292	null
2025-09-03	RTGMFF: Enhanced fMRI-based Brain Disorder Diagnosis via ROI-driven Text Generation and Multimodal Feature Fusion	Junhao Jia et.al.	2509.03214	null
2025-09-03	Eigendecompositions of temporal networks	Lucas Lacasa et.al.	2509.03135	null
2025-09-03	Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers	Xingyue Huang et.al.	2509.03059	null
2025-09-03	Scale-Adaptive Generative Flows for Multiscale Scientific Data	Yifan Chen et.al.	2509.02971	null
2025-09-02	Generative AI for Crystal Structures: A Review	Pierre-Paul De Breuck et.al.	2509.02723	null
2025-09-02	Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation	Erfan Baghaei Potraghloo et.al.	2509.02510	null
2025-09-02	Exploring Variational Graph Autoencoders for Distribution Grid Data Generation	Syed Zain Abbas et.al.	2509.02469	null
2025-09-02	Exploring Diffusion Models for Generative Forecasting of Financial Charts	Taegyeong Lee et.al.	2509.02308	null

🤝 Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

⭐ Star History

If you find this repository useful, please consider giving it a star!
