matlok's Collections
Papers - Embeddings - Freq n-gram Hash - Vocabulary Impacts
Papers - Embeddings - n-gram Hash - Vocabulary
Papers - Text - Eval - Character Level - CUTE
Papers - Multilingual - Encoders - Bytes
Papers - Training - Bytes - Dynamic Patch Sizes
Papers - Text - Dataset - Classification - Multitask - MMLU
Datasets - Text - Classification - Multitask
Papers - Text - Dataset - Coding - MBPP
Papers - Text - Eval - Coding - Python
Papers - Embeddings - Bytes - BPB - Larger Patches than BPE
Papers - Text - Dataset - Datacomp-LM
Papers - Embeddings - Bytes - Tokenizer Free
Papers - Training - Text - Datasets - Coding - GitHub
Papers - Text - Character Level Transformers
Papers - Text - Character Level RNNs
Papers - Training - Bytes - Lookup - Rolling Poly Hashing
Papers - Training - Scaling - Bytes - BLT >= BPE Tokenizer
Papers - Training - Scaling - Compute Optimal
Papers - Attention - Flex Attention
Papers - Embeddings - Bytes - BPB - Tokenzr Free Perplexity
Papers - Embeddings - Bytes - Flops - Input Layer Lookup
Papers - Training - Embeddings Model - Bytes - Entropy Model
Papers - Attention - Bytes - Patch Cross Attention
Papers - Attention - Bytes - MHA Cross Attention - Perceiver
Papers - Embeddings - Text - Byte - Hash ngrams
Papers - Attention - Block Causal
Papers - Tokenizers - Bytes - Incremental Patching
Papers - Tokenizers - Bytes - Entropy Patching - Threshold
Papers - Tokenizers - Bytes - Space - First Char - Patch Len
Papers - Tokenizers - Bytes - Patches - Space Detection
Papers - Tokenizers - Bytes - Patches - Entropy-based
Papers - Tokenizers - Bytes - Strided Patches - MegaByte
Papers - Text - Tokenizer - Bytes - Strided Patches
Papers - Training Research - Bytes - No Vocabulary
Papers - Audio - STT - ASR - wav2vec
Papers - Audio - Contrastive Task - Quantized - Speech
Papers - Audio - Training - Mask Len Distribution - Ablation
Papers - Audio - Training - Masking - Time Steps
Papers - Audio - Viz - Phoneme - Conditional Probability
Papers - Audio - Training - Self-Supervised - Unlabeled Data
Papers - Audio - Fine-tuning - Decoder only - SpecAugment
Papers - Audio - Fine-tuning - Metric - WER
Papers - Audio - Pretraining - Fairseq
Papers - Audio - Dataset - Phoneme Recognition - TIMIT
Papers - Audio - Dataset - LibriVox
Papers - Audio - Dataset - Librispeech
Papers - Audio - Fine-tuning - Loss - CTC
Papers - Audio - Training - Activation - Gumbel Softmax
Papers - Audio - Training - Activation - Gelu
Papers - Audio - Training - Loss - CTC
Spaces - Reasoning
Models - Image - ViT
Papers - Encodings - BBPE - Byte level byte pair
Papers - Tokenizer - Qwen
Papers - Attention - QKV Bias - RMSNorm with Pre-normalizatn
Papers - Training - Activation Function - SwiGLU
Models - Qwen
Papers - Training - Algorithm - SGD vs Adam vs Prodigy
Papers - Training - SGD - SGDM - SGD with Momentum
Papers - Training - CNN
Papers - Training - Eval - Mix of Show
Papers - Training - LR - Optimizer - SGD-Sal
Papers - Training - LR - Optimizer - Prodigy
Papers - Pretraining - Image - ViT
Papers - Pretraining - Image
Datasets - Text - E2E
Papers - Training - SGD - Regularization
Papers - Training - SGD - Decoupled Weight Decay
Papers - Training - PyTorch
Papers - Training - LR - Gradient Local Gain - Variance
Papers - Training - LR - Gradient Signal to Noise Ratio
Papers - Training - Layer Initialization
Papers - Training - LR - Learning Rate
Papers - Training - Adam
Papers - Training - Optimizers
Papers - Training - Dataset Selection - Spectrogram Features
Papers - Training - Backward Masking
Papers - Training - Feature Extraction - Frequency - STFT
Papers - KV Cache - Spectrogram
Papers - Attention - Spectrogram - KV Cache
Papers - Text - Midtraining - Rag - Recall - Rerank - ICL
Papers - Training - Midtraining - Context Length
Papers - Text - Training - Dataset Selection - Filtering
Papers - Text - Training - Mixture
Papers - Text - Datasets - Math - AMC
Papers - Training - Eval - Out of Distribution
Papers - Training - Overfitting - Decontamination
Papers - Pretraining - Synthetic Data - Problem Solving
Papers - Pretraining - Synthetic Data - Reasoning
Papers - Fine-tuning - DPO - Pivotal Token Search
Models - Video - Understanding
Papers - Training - Scaling Laws - Scaling Consistency
Models - Image - Sketch - Pencil
Papers - Training - Text - Vocabulary - SentencePiece
Papers - Encoders - Bytes - More Depth than Decoder
Papers - Training - Token Free - Bytes or Characters
Papers - Training - Bytes - No Tokenizer
Papers - Audio - Encoders - Bert
Papers - Reinforcement Learning - Video Games
Papers - Video Games - Starcraft 2
Papers - Training - Speed - Reduced Training Time
Papers - Fine-tuning - Decoder Only - Frozen Encoder Weights
Papers - CoT - Latent Search Tree
Papers - Reasoning - CoT - Tree Search - BFS
Papers - 3D - SLAT - Structure Latents
Papers - Reasoning - CoT - MCTS
Papers - Training - Sparse Learning - k-Sparse Autoencoder
Models - Biology - Protein - SAE
Models - Text - SAE
Papers - Robotics - Lie Groups
Papers - Robotics
Papers - Math - Differential Geometry - Lie Theory
Papers - Math - Differential Geometry
Papers - NEDL - Differential Geometry - Visualizations
Papers - Training - Convergence - Gaussian Kernel
Papers - Math - SGD - Stochastic Gradient Descent
Papers - Training - Convergence - SoftMax vs SGD
Papers - Training - Convergence - Stoch Gradient Descent
Papers - Training - Convergence - Kernel - Gaussian
Papers - Training - Convergence - SoftMax
Papers - Image - Normalizing Flows
Papers - Image - Rectified Flows
Papers - Image - Diffusion - SBDM
Papers - Image - DDPM - SDE
Papers - Image - Diffusion Coefficient - Fokker-Planck
Papers - Image - Diffusion Coefficient - Deterministic
Papers - Image - Diffusion Coefficient - Stochastic
Papers - Image - Training - Sampler - SDE
Papers - Image - Training - Sampler - ODE
Papers - Image - Datasets - Oxford Flowers
Papers - Training - Gaussian Mixtures - Bridging
Papers - Image - Diffusion - Stochastic Interpolants
Papers - Video - Generator - Multiple Views
Papers - Text - SAE - Sparse Autoencoders
Papers - SAT Solver - GNN
Papers - SAT Solver
Papers - Training - Supervised - Classification
Papers - BitNet - Research - Classification - SAT Solvers
Papers - SAT Solver - NeuroSAT
Papers - Training - Classification - Bit - SAT Solver
Spaces - Image - Generation - High Res - Wide
Papers - Visualizations - GPU Programming - Memory
Papers - Attention - GPU Programming - Kernel - Cuda
Papers - ICL - Text - Classification - Label - Unique Words
Papers - ICL - Text - Prompts - Learning Unique Words
Papers - Text - Encoders - DeBERTa
Papers - RL - Text - Prompts - ASCII ART - Game Board
Papers - RL - Text - Prompts - Navigation - Maze Running
Papers - RL - PC Board Games - Chess - TicTacToe
Papers - RL - Monte Carlo
Papers - RL - Natural Language
Papers - NEDL - Training - Hyperparameters
Papers - Math - Distance - Spearman Correlation
Papers - Math - Distance - Pearson Correlation
Papers - Math - Distance - Chebyshev Polynomials
Papers - NEDL - Embedding - Potential Distance
Papers - NEDL - Embedding - Potential Distance - PHATE
Papers - NEDL - Train - Diffusion - Geodesic
Papers - NEDL - Geodesic Symmetry - Harnack Inequality
Papers - NEDL - Embedding - Geodesic - Euclidean Distance
Papers - Biology - Dataset - RNA - Swiss Roll
Papers - Biology - RNA - Sequencing
Papers - NEDL - Embedding - Heat Geodesic
Papers - Reasoning - Visualization - Pearson’s R
Papers - Training - Scaling - Influence Functions
Papers - Training - Influence Functions - EK-FAC
Papers - Inference - CPU - Apple
Papers - Inference - CPU - Intel
Papers - Inference - CPU - Intel vs Apple - BitNet
Papers - NEDL - Visualization - Non-linearity - tSNE
Papers - Fine-tuning - Multimodal - Contrastive Learning
Papers - NEDL - Fine-tuning - Multimodal Mixup
Papers - NEDL - Fine-tuning - Embedding Shift
Papers - NEDL - Fine-tuning - Geometric Contrastive Learning
Papers - Image - Datasets - SIMAT
Papers - NEDL - Hypersphere
Papers - NEDL - Geodesic
Papers - Training - Cauchy-Schwarz Inequality
Papers - Training - PCA - Kernel
Papers - Training - Non-linear Learning - Lipschitz
Papers - Training - Gradient Descent - Kernel
Papers - Training - Non-linear Learning - Kernel
Papers - Training - Non-linear Learning
Papers - Training - Kernel
Papers - Text - 3D Mesh - Fine-tuning - LLaMa
Papers - Text - Fine-tuning - Loss - CCE - Triton
Papers - Fine-tuning - Memory Reduction Techniques - Text
Papers - Gemma 2 - Fine-tuning
Papers - Mistral - NeMo - Fine-tuning
Papers - Text - Training - Vocabulary Sorting
Papers - Text - Training - Gradient Filtering
Papers - Text - Train - Vocab - Dense Blocks Common Tokens
Papers - Text - Training - Loss - Cuda - Triton - SRAM
Papers - Triton
Papers - Text - Training - Loss - Cut Cross Entropy
Papers - Text - Training - Batch Scaling - Cut Cross Entropy
Papers - Text - Training - Large Vocabulary - CCE
Spaces - Image - Editing a Picture
Papers - Image - Fine-tuning - Dataset - Hand Drawn - DCI
Papers - Image - Fine-tuning - Editing - LAION-Aesthetics
Papers - Fine-tuning - Image - LLaVA
Papers - Image - Benchmarks - Editing - BrushBench
Papers - Image - Editing - BrushNet
Papers - Image - Generation Quality Models - Aesthetic Score
Papers - Image - Generation Quality Models - HPS
Papers - Image - Generation Quality Models - Image Reward
Papers - Image - Guidance - Masked Image Guidance
Papers - Image - BrushNet
Papers - Text - Bit Strings - Hamming Distance
Datasets - Coding - GitHub Issues
Models - Embedding - Multimodal
Papers - Text - Embedding - Noise - In-Batch Deduplication
Papers - Fine-tuning - Text - Embedding
Papers - Fine-tuning - LoRA - Text - Embedding - Sentence
Papers - Text - Embedding - Angle Optimization
Papers - Text - Datasets - GitHub Issues
Datasets - Text - GitHub Issues
Models - Text - Embedding - Multilingual
Models - Text - Embedding - Sentence - German and English
Models - Text - Reranker
Papers - Text - Embedding - Sentence
Models - Text - Embedding - MRL
Models - Text - Embedding - Matryoshka Representation Lang
Papers - Embedding - Text - Sentence - 2DMSE
Papers - Embeddings - Text - Sentence - Matryoshka
Models - Text - Sentence Embedding - Binary Quantization
Models - Text - Sentence Embedding
Papers - CoT - Arch - Reasoning - Layer Depth vs Wider Layer
Papers - Math - Generate - Synthetic Data - CoT
Papers - Math - Generate Synthetic Data
Papers - Benchmarks - Math - Reasoning - GSM-Symbolic
Papers - Flow Matching - Data Generation - XGBoost
Papers - Text - Datasets - Math - Reasoning - iGSM
Papers - Fine-tune - Text - Retry
Papers - Text - Training - Retry
Papers - Audio - Tokenizer
Papers - Image - MiniCPM
Papers - Text - Embedding - Sentence - R-BM25
Papers - Text - Embedding - Sentence - BM25
Papers - Text - Embedding - Sentence - SONAR
Papers - Text - Datasets - Flores-200
Papers - Text - Encodings - Roberta
Papers - Inria
Papers - Text - Machine Translation
Papers - Healthcare - Image Segmentation
Papers - Image - OOD - Out of Distribution
Papers - Image - Guidance - PAG - Perturbed Attention Guidan
Papers - Image - Guidance - Smooth Energy Guidance (SEG)
Papers - Text - Training - Complex Vector Token Representati
Papers - Text - Training - Wave Net
Papers - Text - Encodings - Complex Vectors
Papers - Text - Embedding - Fixed Token - Skip-gram
Papers - Text - Embedding - Fixed Token - CBOW
Papers - Fine-tuning - LoRA - Intruder Dimensions
Papers - Image - Fine-tuning - Clip - Self-supervision
Papers - Text - Inference - Early Stop - Filter Layers
Papers - Image - Training - Contrastive Loss - Batch Size
Papers - Image - Training - Batch
Papers - Image - Fine-tuning - DPO
Papers - Image - Datasets - XM-3600
Papers - Text - Tokens - Vocabulary - Zipfian
Papers - Text - Tokens - Vocabulary - Herdan’s Law
Papers - Text - Tokens - Vocabulary - Heaps Law
Papers - Image - Visual Tokens
Papers - Image - Zipf
Papers - Datasets - Multimodal - YFCC100M
Papers - Training - Image - SLIP
Papers - Datasets - Visualization - WizMap
Papers - Fine-tuning - Video - Video Masked Encoder
Papers - Datasets - Image to Video
Papers - Datasets - Text to Image
Papers - Fine-tuning - Self-Consistency - ScPO
Papers - Training - Self-Alignment
Papers - Healthcare - CoT
Papers - Healthcare - Reasoning
Papers - Healthcare - Benchmarks
Papers - Quantization - BitNet
Papers - Training - Scaling Properties
Models - Image - Autoregressive
Papers - Image - Autoregressive Visual Generation
Papers - Benchmarks - Math - VQA
Papers - Mobile - Android
Papers - Interpretability - Sparse Autoencoder (SAE)
Papers - Fine-tuning - Machine Unlearning
Papers - Fine-tuning - ResNet
Papers - Training - Knowledge Distillation - Tool Usage
Papers - Training - Knowledge Distillation - World
Papers - Image - CoT
Papers - Custom Layers - Persistent Key-Value Vectors
Papers - Attention - Token Parameter - Pattention
Models - Video Games - Gameplay
Papers - Flow Matching
Papers - Video - MovieGen
Papers - Training - Differential Transformer
matlok - Python Copilot Image Datasets
matlok - Python Code Instruction Datasets
matlok - Python Copilot Audio Datasets
matlok - Python Src Code Datasets (base)
How to build a Python Coding Model with Alpaca Instructions
Dataset - Python Coding Alpaca Instructions
Image Papers
Audio Papers
Text Instruction Papers
Multimodal Papers
Mixture of Experts Papers
Coding Papers
Coding Models
Embedding Papers
Transformer Arch
LMM
LoRA
Non-English Embeddings and Models
Fine-Tuning
More Alpaca Instruction Datasets
Model Benchmarking
Actor Critic Papers
Gaming Reinforcement Learning
Search papers from a url
Chat datasets
Audio models
Datasets - DPO
Datasets - Geospatial
Models - Geospatial
Papers - Geospatial
Models - Biotech
Datasets - Financial
Models - Video Editing
Models - Testing
Papers - Attention
Papers - Context
Papers - Synthetic Data
Tuning - Dora
Models - Fintech
Models - Multimodal
Models - MultiAgent
Models - n-gram and Kneser-Ney
Papers - NLP Research
Papers - Multi-turn Conversations
Datasets - Synthetic - Instruct
Models - Watermarking
Models - Captions
Papers - Fintech - Benchmarks
Models - Touch and Image
Models - Video
Datasets - Image - Text
Models - NeRFs - Image Radiance Fields
Models - Parameter Testing
Models - Predicting Models
Models - Robotics
Datasets - Coding
Models - ReAct - Reasoning and Action
Models - Text
Models - Custom-Training
Papers - Decoders
Papers - Testing a Coding Model
Datasets - Text
Datasets - Multimodal - Text and Images
Models - Large Scale
Papers - Coding
Papers - Transfer Learning
Datasets - Text - Multiple Choice
Datasets - Binarized
Models - Math
Papers - Pipeline - Multimodal
Papers - Reasoning
Models - Gaming
Papers - IoT
Papers - Learning and Compression
Papers - Conversations
Models - Quants
Models - Image - Geometric Algebra
Models - Image
Papers - Video
Spaces - Math
Datasets - Math
Models - Base - 7B
Models - Base - 1B
Spaces - Vision
Datasets - Image and Bounding Box
Models - Science
Papers - Sampling
Models - Byte Transformer
Models - Cooking
Papers - RoPE
Papers - Math - GSM8K
Papers - Model Scaling
Papers - Training Research
Models - UI - Front-End
Papers - Reasoning - Vision
Models - Text - Explanation
Papers - Multi-Agent
Papers - QLoRA
Papers - Ring Attention
Papers - Sequence Parallelism
Models - Legal
Helpful - VRAM Calculator
Models - Audio - Translation
Models - Video Generation
Models - Image - Long Context
Papers - Masked Sequence Packing
Papers - Speculative Decoding
Papers - Fine-tuning - Multimodal
Datasets - Math - Word Problems
Spaces - Coding
Models - Audio - Music Generation
Datasets - Audio
Datasets - Audio - Fine-tuning
Models - Audio - Sheet Music Gen
Papers - Striped Attention
Datasets - Text - Instruction (non-Alpaca)
Models - Images - Instruct
Papers - Benchmarks - Image and Text
Papers - Image - Not-using CLIP
Models - Suggest - Audiobooks from Playlist
Models - MoE
Papers - MoE
Models - MoE - IoT
Models - IoT
Models - Mamba
Models - MoE - Multimodal
Papers - MoE - Research
Papers - Image - Knowledge Graphs
Papers - MoE - Training
Papers - Image - MoE
Models - Image - MoE
Papers - Lora - LCM
Models - Image - Drone Photography
Models - Image - Lora
Models - MoE - Principles
Models - MoE - Constitutional Experts
Models - MoE - Visual Relationship Detection
Models - MoE - Training using Lora
Papers - Training with Lora
Papers - MoE - Prompt Immunity
Papers - MoE - Router
Models - MoE - Audio - Underwater Acoustics
Models - MoE - Audio
Papers - MoE - Malicious Queries
Papers - MoE - Image
Models - MoE - Image
Papers - MoE - Training - Blocks
Papers - MoE - Scaling
Papers - MoE - Adversary Queries
Papers - MoE - Deny an Expert
Papers - MoE - Custom Layers
Papers - MoE - Frankenmerge
Papers - Multimodal
Papers - Image - Bounding Box
Papers - Multimodal - Documents
Papers - Exploit - Model Layer Retrieval
Papers - Image
Papers - Image - Dataset Generator
Datasets - Text and Video
Papers - Video - Mamba
Papers - Performance Trends in AI
Papers - Fine-tuning - Home Lab
Papers - MoE - Audio
Papers - MoE - Attention
Papers - Quants
Papers - Image - MoE - IoT
Papers - MoE - Speech Recognition
Papers - MoE - Router - Task
Papers - MoE - Multilingual
Papers - MoE - Federated Learning
Papers - MoE - Training - Weight Sharing
Papers - MoE - Router - Research
Papers - Image - Handwriting Recognition
Papers - MoE - IoT
Papers - MoE - Handwriting Recognition
Papers - Image - OCR Handwriting
Papers - Image - Adversarial
Papers - Image - Segment - Handwriting
Papers - Image - Handwriting and Online Gestures
Papers - Image - Handwritten Characters
Papers - Image - Fine-tuning
Papers - Image - HTR - Math Gestures and Symbols
Papers - Image - Handwritten Generation
Models - Text - Multilingual
Models - Image - Diffusion Probabilistic Models
Papers - Benchmark - Handwriting Recognition
Papers - Image - Handwriting Recognition - Lexical Features
Datasets - Image - Handwritten Recognition
Papers - Image - Custom Layers
Papers - Image - Handwriting Recognition - Tetrolets
Papers - Image - Handwriting Recognition - Near-Realtime
Papers - Text - Encoders
Papers - Text - Decoders
Papers - Text - Bidirectional - Bio
Papers - Text - Bidirectional Encoders
Papers - Text - Pre-training
Papers - Text - Pre-training - Research
Papers - Text - Pre-training - Decoder Multi-Steps
Papers - Text - Benchmarks - Quality Diversity
Papers - Image - Multimodal - Handwriting Recognition
Papers - Text - Research
Papers - Text - Multilingual
Papers - Multimodal - Speech and Text
Papers - Multimodal - Speech and Text - Multilingual
Papers - Multimodal - Training and Tuning
Papers - Multimodal - Document Analysis
Papers - Video - Motion Control
Papers - Video - Entity Recognition
Papers - Video - Pre-training
Papers - Image - Pre-training
Papers - Image - Caption Generation
Papers - Image - Synthetic Data Generator
Papers - Transformer Research - Custom Layers
Papers - ResNet
Papers - SuperNets
Papers - Federated Learning
Papers - Mamba - Structured State Space Model
Papers - Image - Human Motion Generator
Papers - MoE - Multimodal
Papers - Autonomous Drones
Papers - Multimodal - Drone
Papers - Multimodal - Drone - Object Manipulation
Papers - Training Research - Time series
Papers - Pre-training - Time Series
Papers - Neural Architecture Search
Papers - Training - Hardware Detection
Papers - Image - Split Computing
Papers - Image - IoT - Split Computing
Papers - U-Net
Papers - Image - Segmentation
Papers - Image - Segmentation - Cancer
Papers - Video - Synthetic Data Generator
Papers - Image - Segmentation - Drone
Papers - Image - Segmentation - Report
Papers - Image - Segmentation - Adversarial
Papers - Image - Segmentation - MRI
Papers - Image - Segmentation - Stroke Brain Lesions
Papers - Image - SkipNet
Papers - Image - IoT
Papers - Image - Hybrid
Papers - Image - Hybrid - ResNet - U-Net
Papers - Image - Hybrid - Swin - U-Net
Papers - Image - Segmentation - Bio Cell
Papers - Image - Segmentation - Quantum
Papers - Image - Hybrid - Graph Net - U-Net
Papers - Image - Hybrid - Patient Meta Data - U-Net
Papers - Image - CSWin - Cross-Shaped Windows
Papers - Image - Encoders
Papers - Image - Encoders - LePE - Local-Enhanced Pos Enc
Papers - Image - Attention - BOAT - Bilateral Local Attn
Papers - Image - Attention - Multi-Scale
Papers - Image - Swin
Papers - Text - Fine-tuning - Math
Papers - BYOL
Papers - Robot - Tasks - Boss
Papers - Text - Model Guided Training
Papers - Robot - Research
Papers - Image - Hybrid - Hybrid Task Cascade (HTC) - Swin
Papers - Image - GasHis
Papers - Image - Dino
Papers - Text - Architecture - Scaling to 1000 Layers
Papers - DenseNet
Papers - Adversarial Testing
Papers - Image - EfficientNet
Papers - Image - Compound Scaling Method
Papers - Base Models - Text - Coding
Papers - Image - Visualization - Splatting
Papers - AI - Social Risks
Papers - AI - Safety
Papers - Testing - Single Layer Model
Papers - Custom Layers
Papers - Pre-training
Papers - Motion Control
Papers - Fine-tuning
Papers - Fine-tuning - LoRA
Papers - Reinforcement Learning
Papers - AI - Self-refinement - Training and Tuning
Papers - Training
Papers - Audio
Papers - Text - Math
Papers - Observability and Interpretability
Papers - Multimodal - Healthcare
Papers - Interpretability - DAS
Papers - Named Entity Extraction - Healthcare
Papers - Multilingual
Papers - Healthcare
Papers - Watermark
Papers - Image - Clip
Papers - Proof of Learning
Papers - Disaster Recovery
Papers - Named Entity Extraction and Disambiguation
Papers - Neural Architecture Search - Report
Papers - Neural Architecture Search - One-shot
Papers - Neural Architecture Search - Tabular Data
Papers - Hyperparameter Architecture Search
Papers - Image - Neural Architecture Search
Papers - Neural Architecture Search - RNN
Papers - Neural Architecture Search - Reinforcement Learning
Papers - Neural Architecture Search - Quantization - FLIQS
Papers - AutoML
Papers - Neural Architecture Search - AutoML
Papers - Testing - Speech and Text
Papers - AI - Are models similar to a human brain?
Papers - Automated Training - Self Discover
Papers - Math - Automated Discovery
Papers - Math - Research
Papers - Alpaca
Papers - Critical Thinking - Step Back
Papers - Critical Thinking
Papers - Text - Length Generalization
Papers - Text - Encoders - Fire
Papers - Image - Multi-Image Reasoning
Papers - Image - Chain of Thought
Papers - Image - Text and Symbolic Image Generator
Models - Fine-tuning - Mixture of Loras
Papers - Multimodal - Text to 2D to 3D Mesh
Datasets - HTML
Datasets - Multimodal - Text and Image
Papers - Image - Mamba
Papers - Image - Selective Scan
Papers - Healthcare - Mental Health
Papers - Encoders
Papers - Encoders - Fire
Papers - Video - Understanding with Many Models
Papers - Video - Understanding
Papers - Image - Understanding
Papers - Multimodal - Encoders
Papers - Image - GiT
Papers - Text - Star
Papers - QFormer
Papers - Image - Near Real Time
Papers - Image - Attention - Window
Papers - Image - Editing
Papers - Image - Training - Noise
Papers - Image - LCM
Papers - Image - Training - Quantized Mask
Papers - Image - Editing - Glide
Papers - Image - Training - Seed Vector
Papers - Image - Semantic Palette
Papers - Blockwise Parallel
Papers - Training - Distributed
Papers - Training - Masked Sequence Packing
Datasets - Chess
Papers - Semantic Segmentation
Papers - Training - FixMatch
Papers - Training - Self-Training - Student and Teacher
Papers - Task Assistant - ExploreLLM
Papers - Training - Guided Task Flow
Papers - Training - Problem Solving
Papers - Structured Thoughts
Papers - GUI - Task Assistants
Papers - Chinchilla
Papers - Model Scaling - Effective Parameter Count
Papers - Custom Layers - Hash Layers
Papers - Scaling
Papers - Hallucination - Reduction
Papers - Chain of Verification
Papers - Reading Comprehension
Datasets - Text - Multilingual
Papers - Training - Chain of Thought
Papers - CoT - Chain of Thought
Papers - Ethics
Papers - Fine-tuning - QA-LoRA
Papers - Fine-tuning - Understanding Tables
Papers - Text - Perform Tasks on Tabular Data
Datasets - Text - Tabular
Papers - Text - Dataset - TabLib - Tabular
Papers - Qwen
Papers - Qwen - Report
Papers - Multimodal - Report
Papers - MoE - Quantization
Papers - Attention - Custom Encoder
Papers - Research - Replacing Attention
Papers - Research - Safety
Embeddings - C4 - Jina
Papers - Reduce Model Size - SliceGPT
Papers - Decoders - CoT Decoding
Papers - Rag
Papers - Rag - Multi-hop Queries
Papers - Encoders - Coding
Embeddings - Coding
Embeddings - Coding - CodeBert
Papers - Training - Synthetic Noise
Papers - Coding - Fill in the Middle - Infilling
Papers - Text - Pre-training - Synthetic Noise
Papers - Training - Knowledge Graphs
Papers - Image - Training - Knowledge Graphs
Papers - Image - Training - Adversarial
Papers - Multimodal - Fine-tuning - Report
Papers - Text - Tabular - Conditional Formatting
Papers - Text - Training - Code - Byte Pair Encoding
Papers - Coding - Out of Vocabulary
Papers - Coding - BPE vs Pointer Mixture Network
Papers - Automatic Speech Recognition
Papers - Automatic Speech Recognition - Beam Search
Papers - Beam Search
Papers - Explainability
Papers - Training - Synthetic Data - Sycophancy
Papers - Training - DoReMi
Papers - Training - Domain Reweighting
Papers - Training - AI training AI
Papers - Training - Proxy Model - Group DRO
Papers - Adafactor
Papers - Coding - Decoding with Static Analysis
Papers - MoE - Hashing instead of a Router
Papers - UDOP
Datasets - Multimodal - Image and Text
Papers - Multimodal - Document and Text
Datasets - Multimodal - Document and Image
Papers - Encoder - Byte-Pair Encoding
Papers - Text - SQL
Papers - Science - Research Analysis
Papers - Training - Speculative Decoding - Single Model
Papers - Attention - Tree Attention
Papers - Fine-tuning - Rag
Models - Table - Extraction
Papers - Video - Agent
Papers - Audio - GAN - Upsampling
Papers - Audio - GAN
Papers - Image - Illumination
Papers - Decoders - 3D Nerf
Papers - Image - Edit
Papers - ControlNet
Papers - Fine-tuning - Parameter Efficiency
Papers - Image - Lightning
Papers - Text - 3D Mesh - Volumetric
Papers - Text - Label Generator
Papers - Image - Limited-Training
Papers - Image - Chart to Table
Papers - Image - Plot - Understanding and Reasoning
Papers - Image - 3D Asset Enhancement
Papers - Text - Taxonomy Generator
Papers - Training - Reward Model
Papers - Fine-tuning - Language Model Policy with LoRA
Papers - Fine-tuning - Mixture of LoRA (MoL)
Papers - Robotic - Observational Learning
Papers - Attention - Cross
Papers - Training - Skill Learning
Papers - Fine-tuning - Multi-Agent
Papers - mPlug-Owl
Papers - Image - Document - mPlugOwl
Papers - Document - mPlugOwl
Papers - Structured Learning - Document
Papers - Prompt - Prompt Compression - Report
Papers - Prompt
Papers - Image - Gaussian Splatting and NeRF
Models - Reverse Engineering - Decompiler
Models - Reverse Engineering
Papers - Text - 3D
Models - Table - Structure - Recognition
Papers - Image - Table - Extraction
Papers - Image - Table
Papers - Tabular
Papers - Image - Object Detection
Models - Image - Object Detection
Papers - Benchmarks - Reward Models
Papers - 3D - Text
Papers - Science - Molecule
Papers - Frankenmerging
Papers - Image - Frankenmerging
Papers - Image - Model Merging
Papers - Attention - Grouped-Query Attention (GQA)
Papers - Image - Math
Papers - Benchmarks - Math
Papers - Image - Reward Model
Papers - Multimodal - Mamba
Papers - Video - Editing
Papers - Image - Personalization
Papers - Image - Personalization - Captions
Papers - Image - Blip
Papers - 3D - Reconstruction
Papers - Image - Video Generator
Papers - Video - Upsampler
Papers - Video - Time Reversal Fusion
Papers - Image - Adversarial (GAN)
Papers - Image - Video - Adversarial (GAN)
Papers - Toxicity
Papers - Fine-tuning - Toxicity
Papers - Video - Content Motion Latent Diffusion
Papers - Decoders - Chain of Thought
Papers - Image - Depth Estimation
Papers - Image - Flow Matching
Papers - Image - Training
Papers - Text - Classification - Social Media
Papers - Text - Classification
Papers - Text - Training - Classification
Papers - Audio - Training
Papers - Multimodal - Audio
Papers - Audio - Whisper vs Clap - Whisper wins with ASR
Papers - Encoders - Audio
Papers - ICL - In-Context Learning
Papers - Math - Derive New Math - Function Class
Papers - Agent - Architecture
Papers - Agent - Memory
Papers - Fine-tuning - DPO
Papers - Critic Models
Papers - Training - Critic Model
Papers - Security
Papers - Security - Fuzzing
Papers - Reasoning - Critic Pattern
Papers - Benchmarks - Reasoning
Papers - Sports
Papers - Music
Papers - Pop Culture
Papers - Coding - Chain of Thought
Papers - Coding - Training
Papers - Coding - Fine-tuning
Papers - Coding - Reasoning
Papers - Fine-tuning - Reasoning
Papers - Video - Streaming
Papers - Mamba - FFT - EinFFT
Papers - Encoders - Video
Papers - Multimodal - Video - Text - Audio
Papers - Multimodal - Captions - Audio
Papers - Multimodal - Captions - Speech
Papers - Multimodal - Captions - Video
Papers - Synthetic Data - Multimodal
Papers - 3D
Papers - 3D - Synthetic Data
Papers - Document - Understanding
Papers - Documents - Fine-tuning
Papers - Compiler
Papers - Coding - Compiler
Papers - LLVM
Papers - Training - Teacher Model
Papers - Tree of Thoughts
Papers - Searchformer
Papers - Coding - Stack Traces
Papers - Training Research - Stack Traces
Papers - Fine-tuning - Search Based
Papers - Fine-tuning - Procedure Cloning
Papers - Encoders - T5
Papers - Decoders - T5
Papers - T5
Papers - DenseFormer
Papers - Training - Weighted Average
Papers - Encoders - Image - Clip
Papers - Training - Fitness Score
Papers - Training Research - Exemplary Prompts
Papers - Fine-tuning - Prompts
Models - TTS
Models - T5
Models - Documents
Papers - Encoders - VAE
Papers - Agent - Operating Systems
Papers - Image - Synthetic Data - Human Faces
Papers - Multilingual - Japanese
Papers - Fine-tuning - Multilingual
Papers - Document - Understanding - Historical Images Text
Papers - SAM - Segment Anything Model
Papers - Image - Historical
Papers - Image - Explainability
Papers - Image - VGG
Papers - Image - Pattern Recognition
Papers - Image - Historical - Symbolic and Artistic
Papers - Training - Distribution-based
Papers - Research - Emergent Properties
Papers - Image - In-Context Learning
Papers - DeepMind - ICL vs RNN vs LSTM
Papers - DeepMind - ICL Rule-based Classification
Papers - DeepMind - ICL Small Models are More Exemplar-Based
Spaces - Decoders - Beam Search Visualizer
Spaces - Decoders - Beam
Spaces - Decoders
Papers - Video - NeRF
Papers - FAIR
Papers - Fine-tuning - Model Layer Pruning
Papers - Healthcare - Text - Antibodies
Papers - Intel - MLP
Papers - Performance - Intel
Papers - Image - Prompt
Papers - VQA
Papers - Fine-tuning - SFT
Papers - Fine-tuning - Report
Papers - Text - Video Generator
Papers - Video - Enhance
Spaces - LangChain
Papers - Image - Gaussian Splatting - 2D
Papers - Meta
Papers - Audio - Image
Papers - Image - Avatar Generator
Papers - Training Research - Audio
Papers - Healthcare - Synthetic Data Generator - 3D
Models - Image - Streaming
Datasets - Fine-tuning
Datasets - Meta
Papers - University - MIT
Papers - Google
Papers - Image - MultiDiffusion
Papers - Imagen
Papers - Convert - T2I to T2V
Papers - University - University of California Berkeley
Papers - OpenAI
Papers - Adobe
Papers - RWKV
Papers - 3DGS
Papers - Text - Fact Checking
Papers - Text - Factuality
Papers - Healthcare - Text
Papers - Healthcare - Training Research
Papers - University - Stanford University
Papers - DataBricks
Models - Healthcare
Papers - Image - Generator - Large Resolution
Papers - Encoders - Synthetic Noise
Papers - Apple
Papers - Video - Clothing
Papers - Encoders - Video - MetaCLIP
Papers - IoT - Assistant
Papers - Training Research - Mixture FOFE
Papers - Training Research - AD FOFE
Papers - Image - Editing - Object Removal
Papers - Image - Editing - Object Insertion
Papers - Image - Editing - Counterfactual Supervision
Papers - 3DGS - Feature Rendering
Papers - 3DGS - Open-world Segmentation
Papers - 3DGS - Security Camera Object Detection
Papers - Microsoft
Papers - University - Carnegie Mellon University
Papers - Healthcare - Image Analysis
Papers - Healthcare - Image - SynthRAD2023
Papers - Healthcare - Image - CT
Models - MoE - GQA
Papers - Image - Segmentation - Bounding Box Infilling
Models - MoE - Coding
Papers - Image - Translation
Papers - Text - Translation
Papers - Multilingual - German
Papers - Image - Synthetic Noise
Papers - Multilingual - Translation
Papers - Johns Hopkins
Papers - Multilingual - Synthetic Noise
Papers - Intel
Papers - Fine-tuning - Text - U-Net
Papers - Image - Encoders - Text
Papers - Image - Encoders - Clip
Papers - Video - Reasoning - Time of Events
Papers - Video - Encoders
Papers - Video - Training - Understanding Time
Papers - Nvidia
Papers - U-Net - 3D
Papers - 3DGS - 3D Mesh Generator
Models - Fine-tuning
Papers - Model - SFT - Alpaca and DPO - Solar
Papers - Fine-tuning - Preference-based RL (PbRL)
Papers - University - Cornell University
Papers - Robotics - Fine-tuning - PbRL
Papers - Fine-tuning - DPO - Reward Model Training
Papers - Reward Model
Papers - Reward Model - Bradley-Terry
Papers - Reward Model - Training
Papers - University of Chicago
Papers - Reward Models - KL Regularization - RL
Papers - KL Regularization - ADP - Con/Divergence Error Rate
Models - Fine-tuning - PPO
Papers - Fine-tuning - Factuality
Papers - Fine-tuning - Emulator
Datasets - RLHF
Datasets - Fine-tuning - RLHF
Papers - top-p - Nucleus Sampling
Papers - top-k - Flat (good) vs Peaked (bad) Dist Sampling
Papers - Distribution - Zipf Analysis
Papers - Institute - Allen Institute
Papers - University - University of Washington
Models - 1bit
Models - Bitnet - Text
Papers - Coding - Unit Tests
Papers - Tacotron 2
Papers - Audio - WaveNet
Papers - Audio - Time Domain Waveforms
Papers - Audio - TTS
Papers - Audio - Mel Spectrogram
Papers - Decoders - Audio
Papers - GAN
Papers - Image - GAN
Papers - GAN - Compression - Bitstream
Papers - GAN - Compression
Papers - Audio - STT - ASR
Papers - Audio - Speech Transcription
Papers - Audio - WhisperX
Papers - Audio - Voice Activity Detection
Papers - Audio - VoiceCraft
Models - Audio - TTS
Papers - Audio - Compression
Models - Audio
Models - Audio - Codec
Models - Audio - Encoders
Models - Audio - Decoders
Models - FAIR
Models - Meta - FAIR
Models - Audio - Music Generator
Models - Getting Started - Pre-training
Models - TinyLlama
Models - Reward Model
Models - Starling
Datasets - Chat - RLHF
Datasets - Starling
Papers - Audio - Masked Language Model
Papers - Audio - Residual Vector Quantization
Papers - Audio - Encoders
Models - Image - Object Detection - DETR
Models - ResNet
Papers - Audio - Inference - Rescore Models
Papers - Inference - Rescore Models
Inference - Autoregressive and Non-Autoregressive Models
Papers - Kyutai
Models - Text - Music Generator
Models - Audio - Hybrid - AR with NAR Models
Papers - Touch
Papers - MoE - Mamba
Papers - Flan-T5
Papers - IoT - Screen Usage Understanding and Context
Papers - Mobile - User Entity Context Understanding
Papers - Mamba - Limitations - In-Context Learning (ICL)
Models - MoE - Mamba
Papers - AI21 Labs
Papers - University of Tokyo
Papers - S-Lab
Papers - Duke
Papers - University of Wisconsin
Papers - Image - Report
Papers - Hallucinations
Papers - Trustworthiness
Papers - University of Bristol
Papers - Healthcare - Surgical Gestures
Papers - Vanderbilt
Papers - Fine-tuning - Dataset - Few-Shot Retrieval (FRet)
Papers - University - New York University
Papers - Embeddings
Papers - Embeddings - Text
Papers - Text - Memorization
Papers - Training a 2.8B Model in 38 days
Papers - Huawei
Papers - vLLM
Papers - Inference - vLLM
Papers - Attention - PagedAttention
Papers - Fine-tuning - Model Merge
Papers - Frankenmerge - Model Stock - Use Fine-tuned Models
Papers - Naver
Models - Model Stock
Models - Frankenmerge
Models - Frankenmerge - Model Stock
Papers - Benchmarks
Papers - Benchmarks - Financials
Papers - 1bit
Models - 2bit
Papers - Video - Fine-tuning
Papers - Video - Reward Model
Models - Spright
Papers - ASU
Papers - Hugging Face
Papers - University of Maryland
Papers - University - Tsinghua University
Papers - Chinese Academy of Sciences
Papers - Xidian University
Papers - 3D - FlexiCubes
Papers - ShengShu
Papers - Fine-tuning - Llava - DPO
Papers - Non-Autoregressive Transformers
Papers - Salesforce
Papers - Safety
Papers - Speech - Chain of Thought
Papers - Audio - Chain of Thought
Papers - Chinese University of Hong Kong
Papers - Audio - Fine-tuning
Papers - Audio - Fine-tuning - Lora
Papers - Image - Continual Training Framework
Papers - Documents - LayoutLM
Papers - Documents - FormNet
Papers - Document - OCR
Papers - Ohio State
Papers - Video - Captions
Papers - Video - Streaming - Captions
Papers - Decoders - Training Decoding Point Supervision
Papers - Healthcare - Cardiac MRI - CMRxRecon Challenge 2023
Papers - Image - Healthcare - Cardiac MRI
Papers - Image - Healthcare
Papers - Training Research - Optimizers
Papers - Coding - C/C++ - Memory
Papers - Coding - C/C++
Papers - Coding - Annotations, Decorators and Captions
Papers - Coding - Operating Systems - Memory
Papers - Image - Contrastive Graph Learning
Papers - Extended Transformer Construction
Papers - Documents - Tabular
Papers - Graph Convolutional Network
Papers - Documents - Graph Convolutional Network
Papers - Training Research - Contrastive Predictive Coding
Papers - Decoders - Bert
Papers - Optimizers - Adafactor
Papers - T5 - MoE
Papers - University of Georgia Tech
Papers - Image - Extract Style
Papers - Image - Contrastive Style Descriptors
Papers - Image - Use a Model to find a similar image
Papers - Ellis Institute
Papers - Shanghai AI Laboratory
Papers - Image - Security Cameras
Papers - Government - USA
Papers - University - University of Waterloo
Papers - Vector Institute
Papers - Benchmarks - Text
Papers - Benchmarks - In-Context Learning
Papers - Benchmarks - Text - Long Context
Models - Documents - OCR
Models - Text - Classifier - Zero-Shot
Models - Text - Classifier - Deberta
Papers - Network - Adaptive BitRate Algorithms
Papers - Network Traffic - 4G and 5G - OTA - Packet Shaping
Papers - Network Traffic - 4G and 5G - OTA
Papers - Network Traffic - 4G and 5G
Papers - Network Traffic - OTA
Papers - Network Traffic - Packet Shaping
Papers - Network Traffic - Transport Optimization
Papers - Network Traffic
Papers - University of Texas
Papers - University of Peking
Papers - Coding - Preference Trees
Papers - Coding - Understanding Tree Structures
Papers - Math - Reasoning
Papers - University - University of Illinois
Papers - University - Northeastern University
Papers - Multilingual - Finnish
Papers - Multilingual - Encoders - BPE
Papers - LLaVA
Papers - Gemma
Papers - Multimodal - Training
Papers - Encoders - DinoV2
Papers - Image - Encoders - DinoV2
Papers - Training Research - Scaling Properties - T2I
Papers - Training Research - Smaller vs Larger Models
Papers - Pre-training - In-filling - PSM and SPM ordering
Papers - Pre-training - Dynamic Context Length
Papers - Text - Supervised Fine-tuning
Papers - Text - Supervised Fine-tuning - Batch Grouping
Papers - Fine-tuning - PPO
Papers - Multilingual - Benchmarks
Papers - Amazon
Papers - Image - SDXL
Papers - ByteDance
Papers - Video - Autoregressive Model
Papers - Inference - Performance
Papers - Coding - Algorithmic Reasoning
Papers - Coding - Think and Execute vs CoT and PoTs
Papers - Coding - Program of Thoughts (PoT)
Papers - Coding - Think and Execute - 7B vs 13B vs GPT
Papers - Prompts - Detailed Examples
Papers - Infra - Cost - Automatic Compute Planning
Papers - Mixture of Depths - MLP, residuals, router, tokens
Papers - MoD - Router
Papers - Yonsei University
Papers - Image - NeRF
Papers - Alibaba
Papers - University - Fudan University
Papers - Image - Frequency Decomposition
Papers - Image - Demosaic
Papers - University - Hong Kong University of Science and Te
Papers - Image - Interior Design
Papers - 3D - Interior Design
Papers - ETH Zurich
Papers - 3D - Indoor Scene Synthesis
Datasets - Reasoning
Papers - Reasoning - Self-Reference Metalinguistic
Papers - University - University of California San Diego
Papers - PlayTest AI
Papers - Contextual AI
Papers - Reasoning - MRGSM8k - Meta Math Multi Step
Papers - Reasoning - GSM8k
Papers - Tencent
Papers - Benchmarks - GSM8k
Datasets - Reasoning - Meta Math Multi-Step - GSM8k
Datasets - Math - Meta Context Reasoning
Papers - University of Cambridge
Papers - Southern University of Science and Technology
Papers - Alan Turing Institute
Papers - Max Planck Institute
Datasets - Text - QA
Datasets - Text - System Chat
Models - Image - Handwriting Comprehension
Models - Table - Handwriting Comprehension
Papers - Arctic University of Norway
Papers - Document - Tabular - Manual Review
Papers - Documents - Tabular - Census
Papers - Image - Custom Annotation and Labeling Tools
Papers - Documents - Custom Annotation and Labeling Tools
Papers - Image - Tabular
Papers - CascadeTabNet
Papers - Image - OCR
Papers - Pune Institute
Papers - Image - Table Structure Recognition
Papers - Documents - Table Recognition - Fine-tuning
Papers - Image - Fine-tuning - Tables
Papers - Image - OCR - Tesseract for Text Location
Papers - Document AI
Papers - Harbin Institute
Papers - Coding - Benchmarks - Report
Papers - Coding - OpenCodeInterpreter
Papers - Benchmarks - Coding
Papers - Coding - Training - Equal-Info Windows
Papers - Coding - Multi-Model Inference
Papers - Coding - Distributed - Adaptive Computation Time
Papers - Anthropic
Papers - Training Research - Compression and Multi-Model Inf
Papers - Coding - Encoders
Papers - Encoders - Compression
Papers - Coding - Compression
Papers - Tokenizer - Neural Compression
Papers - Inference - Multi-Model
Papers - Fine-tuning - ReFT
Papers - Fine-tuning - Report - Llama 7B and 13B
Datasets - Reasoning - Commonsense
Papers - Tokenizers - Roberta
Papers - Reasoning - Commonsense
Papers - Reasoning - Social IQ
Papers - University of Houston
Papers - Image - Classifier - Label Quality Assessment
Datasets - Reasoning - Math
Papers - Benchmarks - Image - Labels
Papers - Benchmarks - Image
Papers - Reasoning - Math
Papers - Reasoning - Math - AQuA
Papers - University of Oxford
Papers - University of IAIR Xi’an Jiaotong
Papers - Training - Instruction-Following
Datasets - Text - Instruction-following
Papers - RLHF
Papers - Benchmarks - Text - General Language Understanding
Papers - Benchmarks - Text - Glue
Datasets - Benchmarks - Glue
Datasets - Benchmarks - Text
Papers - Encoders - Roberta
Papers - Reasoning - Program of Thoughts
Papers - University of California Santa Barbara
Papers - StructLM - Understanding Structured Data
Models - StructLM
Datasets - Text - StructLM
Papers - Prompts - System Chat
Papers - Prompts - Chain of Thought
Papers - Tokenizers - LLaMA Byte Pair Encoding (BPE)
Datasets - OCR - Image with Text from Textract
Datasets - Documents - OCR - Image with Text from Textract
Papers - Benchmarks - Web Browsing Tasks
Papers - University - Harvard University
Papers - Kaust
Papers - Image - Point Cloud
Papers - Video - MultiView Compressive Coding (MCC)
Papers - Image - Encoders - RGB-D
Papers - Image - Training - Low Res Predicts High Res
Papers - University - Beihang University
Papers - Tokenizers - Documents - TrOCR
Papers - Tokenizers - Image - TrOCR
Papers - Tokenizers - Image - Handwriting
Spaces - Image - Handwriting Recognition
Papers - University of Zhejiang
Papers - Audio - Text to Speech
Papers - Audio - TTS - VALL-E
Papers - Audio - TTS - RALL-E
Papers - Security - Jailbreak
Papers - Benchmark - Security
Papers - LMU Munich
Papers - Siemens
Papers - University of Wuhan
Papers - Munich Center for Machine Learning (MCML)
Papers - Benchmarks - Website Navigation
Papers - Web Navigation - Chrome Extension
Papers - Web - Recognition
Papers - Web - Training - Curriculum Learning
Papers - Fine-tuning - Rejection Sampling (RFT)
Papers - Zhipu AI
Models - General Purpose
Datasets - Benchmarks - CodeEditorBench - OCI
Models - Chat
Models - Text - Image
Models - Multimodal - Chat
Models - Audio - Understanding
Models - Synthetic Data - Audio
Models - Audio - Edit with Text
Models - Audio - Classification and Segmentation
Models - Image - Chat
Models - Image - Synthetic Data
Spaces - Image - Chat
Papers - Audio - Understanding
Papers - Audio - Captions
Spaces - Qwen - Image
Datasets - SQL
Models - Audio - STT - ASR
Papers - Redwood Research
Papers - Automated Interpretability
Models - Encoders - Bidirectional
Models - Encoders - Bert
Papers - Text - Encoders - Image - Clip
Papers - Training Research - Rank-One Model Editing
Papers - Training Research - Mamba
Papers - Training Research - Ablation - Mamba
Papers - Training Research - Ablation - Factuality
Papers - Training Research - Weights - Activation Patching
Papers - Training Research - Interpretability
Papers - Interpretability - Rome - Factuality Editing
Papers - Interpretability
Papers - University of Tel-Aviv
Papers - Interpretability - Attention
Papers - University of Brown
Papers - Training Research - Layer Understanding
Papers - Interpretability - Prompts
Papers - Image - Imagen
Papers - Training Research - Control Attention Reweighting
Papers - Attention - Weights - Re-Weighting
Papers - Training Research - Text - Token Visualization
Datasets - Image - ImageNet
Datasets - Image
Papers - Recommendation - Cloze Task
Papers - Recommendation - Encoders - Bert
Papers - Recommendation
Papers - Recommendation - Multi-Task Learning
Papers - Recommendation - Bert4rec - SASRec
Papers - Recommendation - RTG Balancing
Papers - University of Zurich
Papers - Healthcare - Radiology
Papers - University - Shanghai Jiao Tong University
Papers - Training Research - Pre-training - ALBEF
Papers - Training Research - Vision Language Pre-training
Papers - Pre-training - ALBEF - Multimodal Encoder
Papers - Multimodal - Encoders - ALBEF
Papers - Dataset - MultiModal - MultiLingual - Wiki
Papers - Fine-tuning - RLHF - Direct Nash Optimization (DNO)
Papers - RLHF - Iterative Contrastive Self-Improvement
Datasets - Text - Alpaca
Papers - RL - Consistency Model (RLCM)
Papers - Fine-tuning - Image - Prompt Image Alignment
Papers - Harvey Mudd
Papers - Fine-tuning - Stream of Search
Papers - Training Research - Search Based (BFS / DFS)
Models - Text - Science
Papers - University of Tubingen
Papers - HKUST
Papers - Kuaishou
Papers - Text - Dialog Inpainting
Papers - 3DGS - Motion Blur
Papers - 3DGS - Color Transformation
Papers - Image - Encoders - RGB-T (Thermal)
Papers - University of Dalian
Models - Image - Stock Market - Pattern Detection
Papers - Audio - Encoders - HuBert with EnCodec
Papers - Audio - Bark
Papers - Mobile - Multimodal - Screen Image with Captions
Papers - Training Research - DeiT
Papers - Healthcare - DeiT
Papers - Image - Object Detection - YoloV8
Papers - Healthcare - Image - Cancer - Brain
Papers - Image - Hybrid - DeiT and YoloV8
Papers - Image - Healthcare - DICOM
Papers - Image - Healthcare - PTP Metrics
Papers - Image - DeiT
Papers - Custom Layers - MLP
Papers - University of Melbourne
Papers - Multilingual - Image - Greek
Papers - Indian Institute of Technology
Papers - Indian Institute of Science
Papers - University of Sorbonne
Papers - Regularization - LayerScale
Papers - Regularization - Binary Cross Entropy
Models - Image - DeiT
Models - Image - Classification
Papers - Image - Report - VQA
Papers - Image - Training - Mistral
Papers - AIRI Institute
Papers - Sber AI
Papers - Skoltech
Papers - Image - LLaVA
Papers - Image - Coco Testing
Papers - Image - Clip - Coco Testing
Papers - Image - Frechet Inception Distance (FID)
Papers - Training - Long Context
Papers - Benchmark - Context
Papers - Benchmarks - Context - Ruler
Papers - Image - Decoders
Papers - Image - Decoders - ViT
Papers - Training - Image - Causal Self Attention
Papers - Image - Training - AS2D RoPE and SwiGLU
Papers - Training - Detailed Appendices
Papers - Image - Encoders - ViT
Papers - 3D - Panoramic View Generator
Papers - Image - Training - Self Refinement
Papers - Training - Noisy or Unseen Data Drops Accuracy 6%
Papers - Image - Object Detection - DETR
Spaces - Healthcare - Multimodal
Papers - Text - Social Skills
Papers - Fine-tuning - Orpo
Papers - KAIST AI
Papers - Image - Fourier Neural Operators (FNO) vs CNNs
Papers - Image - FNO - Low and High Frequency Data
Papers - Image - Training - Training with an Ensemble
Papers - Image - FNO - SpecBoost Ensemble
Papers - Image - Differential Equations - FNO - ReLU
Papers - Image - Spectral Analysis
Papers - Rag - Prompts
Papers - Rag - Multiple Documents in Parallel
Papers - Tokens - Path Equilibrium Positioning
Papers - Tokens - Real-Valued Positioning
Papers - Model - Griffin
Papers - Models - Griffin - RecurrentGemma
Models - Mistral - Orpo
Papers - Fine-tuning - ControlNet
Papers - University of Central Florida
Papers - Reward Model - Consistency Loss - ControlNet
Papers - Audio - Datasets - Dialog
Papers - Qwen - Audio
Papers - Advanced Micro Devices
Papers - Image - Auto - Lane Detection
Papers - Image - Auto - Lane - Training Segmentation
Papers - Operating Systems
Papers - Agents - Operating Systems
Papers - Benchmarks - Agent - Multimodal - Tasks
Papers - University of Aalto
Papers - University - Princeton University
Papers - Megatron
Papers - Attention - Mixture of Attention Heads (MoA)
Papers - DiffusionDet
Papers - Image - Generator - Gaussian Noise - Bounding Boxes
Papers - Image - Ordinary Differential Equations (ODE)
Papers - Image - Object Detection - Bounding Boxes
Papers - Image - Bounding Boxes - Loss - Timeseries
Datasets - Image - Coco - Obj Det, Segmentation, Captions
Models - Image - Image Segmentation - Coco
Models - Image - DPT - Dino
Papers - Image - ConsistencyDet
Papers - Image - TrOCR
Models - Rag
Models - Mistral
Models - Image - Clip
Models - Image - Dino
Models - Agent
Models - Agent - On-Device
Spaces - Comics
Papers - Chain of Thoughts - Visualization
Papers - Visualization of Thought (VoT) - Mind’s Eye
Papers - Benchmarks - Documentation
Papers - Benchmark - Multimodal - Image Documentation
Papers - AutoDesk
Papers - Investing - Stock Forecasting
Papers - University of Shenzhen
Papers - Investing - AceFormer - ACEEMD
Papers - Image - Knowledge Graph
Papers - Agent
Papers - Knowledge Graph - Tasks
Papers - Panasonic
Papers - University of Xiamen
Papers - Selective Language Modeling vs Causal
Papers - Fine-tuning - Math
Datasets - Chat
Papers - Image - VQA
Papers - Image - VQA - Captions High Res Alignment
Papers - University - University of Santa Barbara
Papers - University - Columbia University
Papers - Image - VQA - Ferret
Papers - Image - Encoders - Dual Vision MLP projectors
Papers - Image - Referring Object Classification (ROC)
Papers - Image - Dataset - LVIS
Papers - Image - Grounding
Papers - Image - Training - OCR - High-Res Dense Alignment
Papers - Image - Captioning
Papers - Documents - UDOP
Papers - Documents - Fine-tuning - LayoutLM and UDOP
Papers - Image - Scientific Charts
Papers - Documents - Scientific Charts
Papers - University of Ulm
Papers - Image - Fine-tuning - ICPR22 dataset
Papers - Image - Fine-tuning - CHIME-R and EconBiz datasets
Papers - Image - Fine-tuning - DeGruyter dataset
Papers - Embeddings - Text - RoBERTa and BPE
Papers - Embeddings - Image
Papers - Embeddings - Image - DiT and dVAE
Papers - LayoutLM - Fine-tuning - Word Patch Alignment
Papers - Tokenizers - Text - T5
Papers - Fine-tuning - Hyperparameter - FUNSD
Papers - Classification - F1 Macro and F1 Micro
Papers - Timeseries
Papers - University of Panjab
Papers - Image - Report - Training - CNN RNN LSTM MLP
Papers - Image - Connectionist Temporal Classification (CTC)
Papers - Image - Climate - SHAP
Papers - Courant Institute
Papers - Image - Climate - ERA5
Papers - Image - Coco - Annotation Pipeline
Papers - Image - Mask - box-kMaX over kMaX-DeepLab
Papers - Image - Coco - Annotation RLHF
Papers - Image - Coco - Panoptic
Papers - Video - NeRF - Real Estate Walkthroughs
Papers - NeRF - Training - Photometric Consistency Patches
Papers - Image - Datasets - ETH3D
Papers - Image - Datasets - TanksAndTemples
Papers - Image - NeRF - Mesh - TSDF fusion RGBD sequences
Papers - Image - Evaluation Metrics - PSNR SSIM LPIPS
Datasets - Research Papers - ARXIV QA
Papers - University of Alberta
Papers - University of Auburn
Papers - Explainability - Image - VQA
Papers - Explainability - Image - VQA - CHM-Corr++
Spaces - Chat - QA - Research Papers on Arxiv read by Claude
Audio Reading - 2404.08639 - COCONut
Audio Reading - 2403.07691 - ORPO Fine-tuning
Audio Reading - 2212.05525 - Extending TrOCR
Audio Reading - 2404.06209 - Elephants Never Forget
Audio Reading - 2404.07773 - ConsistencyDet
Models - Reasoning
Datasets - Audio - Large
Datasets - Audio - Multilingual
Datasets - Audio - Multilingual - Large
Spaces - Audio - TTS
Models - WizardLM
Datasets - Benchmark - Tasks
Models - Image - QA
Datasets - Chat - Persuasion
Papers - Training Research - Dataset Ordering
Papers - Training - Curriculum Learning
Papers - Training - Education Stage then Cognitive Hierarchy
Papers - Training - Curriculum Instruction Tuning
Papers - Llama 2
Papers - Training - AI2 Reasoning
Papers - Training - Out of Vocabulary
Papers - Training - Multilingual - Out of Vocabulary
Papers - University of Charles
Papers - Training - Report - LSTM vs LLM vs Ensemble
Papers - Training - Filter Low Quality with Contriever
Papers - University of Seoul National
Papers - University of Ewha Womans
Papers - University - National University of Singapore
Papers - University - University of Michigan
Papers - Audio - Fine-tuning - DPO
Papers - Audio - Fine-tuning - Alpaca
Papers - Audio - Clap
Papers - Audio - Encoder - Variational Auto-Encoder (VAE)
Papers - Audio - Frechet Audio Distance (FAD) like FID
Papers - University of North Carolina Chapel Hill
Papers - University of Southern California
Papers - Megalodon - Unlimited Context
Papers - Multimodal - Long Context - Megalodon
Papers - 3DGS - Compression
Papers - Multimodal - Speculative Decoding
Papers - Inference - Multimodal
Papers - Qualcomm
Papers - Inference - Speculative Decoding - Draft Model
Papers - Dataset Grooming - Report
Papers - Dataset Generation - Guide
Papers - Image - Hyperspectral Images (HSI)
Papers - Mamba - Bidirectional
Papers - Healthcare - Image - Cancer
Papers - Healthcare - Image - Cancer - Prostate
Papers - Agent - Research
Papers - Research - Automated Research
Papers - Fine-tuning - DPO - KL Divergence vs Learning Rates
Papers - Tinkoff AI
Papers - Embeddings - Scalable Positional Encodings
Papers - University of Pennsylvania
Papers - Image - Layer Pruning
Papers - Inference - Image
Papers - Inference - Image - Layer Pruning
Audio Reading - 2402.16827 - Survey on Data Selection ~3.5h
Audio Reading - 2404.08011 - Review Handwriting Recognition
Papers - Pre-training - Warm-Start - Encoder and Decoders
Papers - Pre-training - Pegasus
Papers - Imperial College of London
Papers - Pre-training - Text - Masked Language Models (MLM)
Papers - Pre-training - Self-Supervised for Downstream Tasks
Papers - Pre-training - Warm-Start - Encoders - BPE
Papers - Pre-training - Warm-Start - Encoders - Unigram
Papers - Pre-training - Summarization
Papers - Pre-training - Encoders - Bert
Papers - Pre-training - Encoders - Roberta
Papers - Pre-training - Warm-Start
Papers - Pre-training - Unsupervised
Papers - Pre-training - Checkpoints
Models - Fintech - Financial Summarization
Audio Reading - 2310.09518 - Instruct with Human Curriculum
Datasets - Image - Multilingual - VQA
Datasets - Image - VQA
Papers - Inference
Papers - Inference - KV Cache
Models - Encoders - Multimodal - Clip - SigLIP
Models - Image - Embeddings
Spaces - Multimodal - Image and Chat
Papers - Stability AI
Papers - Audio - Activation - Snake
Papers - Audio - Decoders - DAC - No tanh activation
Papers - Audio - RoPE
Papers - Audio - Embedding - Time - Sinusoidal Cross Attensi
Papers - Audio - Embedding - Text - Clap - Cross Attention
Papers - Audio - Embedding - Clap - Timestep - Prepended
Papers - Audio - Encoders - Clap - HTSAT audio RoBERTa text
Papers - Attention - Block-wise
Papers - Audio - Encoders - Clap - Training - Metadata
Papers - Audio - Musical Structure Analysis
Papers - Audio - Encoders - Laion-Clap
Papers - Agent - Sima
Papers - World Sim - Agent - Tasks
Papers - Training - Video Games
Papers - Video Games
Papers - Video Games - Survival
Papers - Video Games - Crafting
Papers - Video Games - Survival - Valheim
Papers - Video Games - Navigation
Papers - Video Games - Object Tools
Papers - Video Games - Farming
Papers - Video Games - Environment Resource Planning
Papers - World Sim - Encoder - Image - Sparc
Papers - World Sim - Encoder - Video - Phenaki
Papers - World Sim - OCR
Papers - World Sim - Training - Classifier-Free Guidance
Papers - World Sim - Cognitive Architectures
Papers - Video - Phenaki
Papers - Video - Encoders - C-ViViT
Papers - Video - Encoders - C-ViViT - MaskGiT
Papers - Embeddings - Text - T5X
Papers - World Sim - Embeddings - Text - T5X
Papers - JAX
Papers - GNN
Papers - Training - GNN
Papers - GNN - Dataset - LargeMix
Papers - GNN - Fine-tuning
Papers - GNN - Benchmark - TDC
Papers - GNN - Benchmark - Polaris
Papers - GNN - Benchmark - MoleculeNet
Papers - Hybrid Arch - Skip Connections
Papers - GNN - MPNN
Papers - GNN - Encoders
Papers - GNN - Encoders - Positional and Structural Encoding
Papers - GNN - Fine-tuning - Custom Layer - MLP
Papers - GNN - MoIE
Papers - Healthcare - Molecules - GNN
Papers - Healthcare - Molecules
Papers - Healthcare - GNN
Papers - GNN - Ensemble
Papers - Healthcare - Drug Discovery
Papers - Healthcare - Drug Discovery - GNN
Papers - Valence Labs
Papers - University of Montreal
Papers - University - University of Toronto
Papers - University of McGill
Papers - Healthcare - Image - X-ray
Papers - Healthcare - Image - Chest - X-ray
Papers - Healthcare - Image - Lung Disease
Papers - XAI
Papers - XAI - Gradient Weighted Class Activation Mapping
Papers - XAI - Loc Interpretable Model Agnostic Explanation
Papers - XAI - Fine-tuning
Papers - University of Ahsanullah
Papers - Healthcare - Image - Covid-19
Papers - Image - Visual Feature Extractor
Papers - Inference - Batch - Hierarchical Sharing Pattern
Papers - Optimizer - Lamb
Papers - Attention - Sliding Window
Papers - Training - 3D Parallelism - Back - Reduce-Scatter
Papers - Training - 3D Parallelism - Forward - All-Gather
Papers - Custom Layers - Feedforward Neural Network (FFN)
Papers - Training Research - Model FLOPs Utilization (MFU)
Papers - Training Research - Fault Tolerance
Papers - Custom Layers - Decoders - No FFN
Papers - Training - Parameter Reduction - FFN
Papers - Equall AI
Papers - Multilingual - Spanish
Datasets - Fine-tuning - Orpo
Papers - Emergent Properties
Papers - Emergent Properties - Multiple Choice Grade
Papers - Emergent Properties - Exact String Match
Papers - Emergent Properties - Image
Papers - Training - Epoch - 4 Epochs by Default
Papers - Attention - Mixture-of-Attention (MoA)
Papers - Surge Global
Papers - Benchmarks - Safety
Papers - Benchmarks - Toxicity
Papers - Reward Model - Fine-tuning
Papers - Fine-tuning - Reward Model
Papers - Reward Model - Cross-Lingual
Papers - Datasets - Multilingual - Documents - Seahorse
Papers - Datasets - Multilingual - OpenAssistant
Papers - Inference - Speculative Decoding - KV Cache
Papers - Speculative Decoding - KV Cache
Papers - KV Cache
Papers - Inference - Speculative Decoding - Draft - KV Cache
Papers - Speculative Decoding - Draft - Base Model - JF68M
Papers - Speculative Decoding - Long Context
Papers - Speculative Decoding - Draft - Model - SpecInfer
Models - Speculative Decoding - Draft - Base Model
Models - Speculative Decoding - Draft - SpecInfer
Papers - Speculative Decoding - Token Tree Verification
Papers - Speculative Decoding - Token Verification
Papers - TensorRT-LLM - FasterTransformer - deprecated
Papers - Multimodal - Reka - Image Video Text Audio
Papers - Tokenizers - tiktoken
Papers - Animation - Text
Papers - Animation - Text - Kinetic Typography
Papers - Video - Text Animation
Papers - Image - LPIPS
Papers - Video - Score Distillation Sampling
Models - Fine-tuning - Orpo
Papers - Samsung
Papers - Nota
Papers - 3D - Mesh Generator
Papers - Training - 3D - NeRF
Papers - Games - AlphaGo
Papers - Training - Self-Improvement
Papers - University of Turku
Datasets - Benchmarks - Image - QA - Real World Objects
Papers - Benchmarks - Image - QA - Abstract
Papers - Benchmarks - Image - Visual Commonsense
Datasets - Benchmarks - Image
Datasets - Benchmarks - Image - QA
Datasets - Benchmarks - Image - Blink
Papers - Context - NoPE
Papers - International Human Phenome Institute
Papers - University - East China Normal University
Papers - Datasets - Training - Context - LongBench
Papers - Context - Length Generalization
Papers - Attention - NoPE - Long Context with SoftMax Temp
Papers - Attention - Training - Context - Head-based Scaling
Papers - TinyLlama
Papers - Datasets - Training - Context - SlimPajama
Papers - Training - Eval - Sliding Window Perplexity
Papers - Datasets - Training - Context - Starcoderdata
Papers - Training - Eval - Sliding Window - PG19
Papers - Training - Eval - Sliding Window - Proof-pile
Papers - Context - NoPE vs RoPE - Passkey Retrieval Viz
Papers - Transformers Without Positional Encoding - NoPE
Papers - Mila
Papers - IBM
Papers - ServiceNow
Papers - Attention - Multi-Head Attention (MHA)
Papers - Training - Residual Connections
Papers - Text - Encoders - Bert
Papers - Positional Encodings
Papers - Embeddings - Absolute Position Embedding (APE)
Papers - Embeddings - ALiBi
Papers - Encodings - Rotary - RoPE
Papers - Encodings - No Positional Encodings - NoPE
Papers - Embeddings - T5 Relative Bias
Papers - Chain of Thought - Scratchpad
Papers - Text - Classification - FastFit
Datasets - Text - Argument Topics
Datasets - Text - FinTech
Papers - University - Hebrew University of Jerusalem
Papers - Text - Datasets - Classification and Labels
Papers - Benchmarks - Text - Classification - FewMany
Papers - Weather
Papers - Datasets - Weather
Papers - Datasets - Weather - ERA5
Papers - Historical - Weather
Papers - University of Aarhus
Papers - University - Berlin Technical University
Datasets - Coding - Code Reviews
Datasets - Benchmarks - Coding
Datasets - Text - Web
Datasets - Text - CommonCrawl
Datasets - Text - QA - Web
Datasets - Text - Research Papers - QA - QASPER
Papers - Image - Graph - Understanding
Papers - Knowledge Graphs
Papers - Image - Glip
Papers - University - UCLA
Papers - International Digital Economy Academy (IDEA)
Papers - Image - Phrase Grounding
Papers - Image - Bounding Box - Coco - Teacher and Student
Papers - Image - Grounded Captions
Models - Image - GLIGEN
Papers - Text - Instruct - Grounding and Captions
Papers - Image - UMAP
Papers - Text - Legal - Remove Redaction
Papers - University - University of Padua
Papers - Benchmarks - Text - Text Anonymization Benchmark
Papers - Text - Named Entity Recognition (NER)
Papers - Text - Encoders - Sentence Transformers (SBERT)
Papers - Text - Eval - SMOTE
Papers - ML - XGBoost
Papers - Text - Remove Redaction - Countermeasures
Papers - University - Delft University
Papers - FDM Business Services
Papers - Attention - Gated Self-Attention - Spatial Grounding
Papers - Inference - Scheduled Sampling
Papers - Image - Object Detection - YOLO
Papers - Image - Inpainting
Papers - Image - Keypoint
Papers - 3DGS - Structure from Motion
Papers - SQL - Database Migrations
Papers - SQL - Knowledge Graphs
Papers - SQL - Query Tree
Papers - SQL - Curriculum Learning
Papers - Web - Agent
Papers - University - Simon Fraser University
Papers - University - University of British Columbia
Papers - Coding - Git Commits
Papers - Coding - Defects
Papers - 3DGS - Material Point Method (MPM)
Papers - 3DGS - Motion
Papers - Video - Simulated Material Dynamics - MLS-MPM
Papers - 3DGS - K-Means Clustering
Papers - University - Huazhong University
Papers - Phi - Technical Report
Papers - Text - Mobile
Papers - Audio - Classifier-Free Guidance (CFG)
Papers - Kunlun
Papers - Image - Fine-tuning - LoRA
Papers - Multimodal - XAI
Papers - XAI - Eval - Synthetic Vision Neuron
Papers - XAI - Research in Appendix
Papers - XAI - MAIA
Papers - Llama 3
Papers - Llama 3 - Fine-tuning - Quantization
Papers - Llama 3 - Fine-tuning
Papers - Llama 3 - GPTQ AWQ PB-LLM BiLLM - 1.1-8 bits LoRA
Papers - Image - NeRF - Structure from Motion (SfM)
Papers - Niantic
Papers - Benchmarks - Fintech
Papers - Coding - Automated Workflows
Papers - Fintech - Datasets - SEC - Edgar Filings DB - N-CEN
Papers - Investing - Document QA - SEC Filings
Papers - JP Morgan Chase
Papers - Image - Consistency Trajectory Model (CTM)
Papers - KL Regularization - Diffusion Matching Distillation
Papers - Security - Prompt Injection
Papers - Prompts - Security - Instruction Prioritization
Papers - Image - Multi-Concept Customization (MCC)
Papers - Image - Adaptive Concept Normalization (ACN)
Papers - Image - Encoder - Single-Concept Learning - QFormer
Papers - Image - Synthetic Generator - Canny
Papers - Image - Synthetic Generator - Depth
Datasets - Image - Classification
Papers - Image - Datasets - CIFAR
Papers - Image - Datasets - MNIST
Papers - Activation Functions
Papers - Pre-training - Layer Initialization
Papers - Pre-training - Layer Initialization - LSUV
Papers - Image - Datasets - ImageNet
Papers - University - Czech Technical University
Papers - Pre-training - Weight Initialization
Models - Instruct - Context - 128k
Models - Phi-3
Models - Text - Long Context
Papers - Audio - Attention - FlashSpeech
Papers - Command-R
Papers - Cohere
Papers - Pre-training - Text - Cross-lingual
Papers - Training - KL-divergence Upper bound (KLUB)
Papers - Twelve Labs
Papers - Audio - Latent Consistency Model (LCM)
Papers - Audio - Discriminator - Adversarial Loss
Papers - Audio - Prosody Generator
Papers - Audio - Voice Conversion
Papers - MSRA
Papers - University - Inner Mongolia University
Papers - University - Beijing University
Papers - Attention - Flash Attention
Papers - OLMo
Papers - MobiLlama
Papers - Fine-tuning - Dataset - Instruct - UltraFeedback
Papers - Fine-tuning - PEFT
Papers - Fine-tuning - DoRA
Papers - OpenELM
Papers - Fine-tuning - Text - Bottleneck - RMSNorm
Models - OpenELM
Papers - Training Research - Flash Memory - DRAM
Papers - Attention - Sparse Attention
Papers - Attention - Hard Attention
Papers - Image - Mask2Former
Papers - Training - Early Exit - Gating Network
Papers - Image - Detectron2
Papers - Image - Segmentation - Cost vs Quality - Gating Net
Papers - University - University of California Riverside
Papers - NEC Laboratories
Papers - Image - Cost Reduction - Early Exit
Papers - Custom Layers - No Dropout - Batch Normalization
Papers - Model - Inception
Papers - Pre-training - Batch Normalization
Papers - Image - Training - Per-class Regressor (PCR)
Papers - Healthcare - DNA
Papers - Healthcare - Mamba
Papers - University - University of Massachusetts
Papers - Fine-tuning - Multilingual - Multi-task
Papers - Fine-tuning - Transfer Learning - Cross-Lingual
Papers - Fine-tuning - Named Entity Recognition (NER)
Papers - Fine-tuning - Part of Speech (POS)
Papers - University - University of Groningen
Papers - Cross-lingual
Models - Multilingual - Rag - Catalan, Spanish, English
Datasets - Text - Multilingual - Catalan, Spanish, English
Models - Audio - TTS - Catalan
Datasets - Text - Web, Medical Journals
Spaces - CoT
Spaces - Image - Clothing
Papers - Coding - Knowledge Graphs
Papers - Knowledge Graphs - Construction and Validation
Papers - Rag - Knowledge Graphs
Papers - Documents - Knowledge Graphs
Papers - Data Extraction - OpenIE
Papers - Prompt - Knowledge Graphs
Papers - Knowledge Graphs - Prompts
Papers - Knowledge Graphs - Validation - Pydantic
Papers - Quantexa
Papers - Knowledge Graphs - Llama 2
Papers - Apple - CoreNet
Papers - Training - Contrastive Loss - CatLIP
Papers - Image - Classification - WordNet synsets
Papers - Image - Pre-training - Transfer Learning
Papers - Mixture of Data Experts (MoDE)
Papers - Pre-training - Continual - Expert Onboarding
Papers - Image - MoDE - Clip
Papers - Training - Image - MoE - Clip
Papers - Image - Pre-training - Distribution Clustering
Papers - Embeddings - Clustering
Papers - Image - Encoders - MetaClip
Papers - Embeddings - Text - SimCSE
Papers - Embeddings - Text - TF-IDF
Papers - Pre-training - MoE - Flexible Expert Ensembles
Papers - MoE - Training - Expert Prioritization
Papers - Pre-training - MoE - Continual Learning
Papers - Pre-Training - MoE - Train One Expert
Papers - Inference - MoE - Routing with Task Metadata
Papers - MoE - Inference - Routing with Task Metadata
Papers - Image - Datasets - Flickr
Papers - Image - Encoders - OpenClip
Papers - MoE - Image - MoDE
Papers - Image - Benchmarks - Clip
Papers - Image - Datasets - LAION
Papers - MoE - Routing - Softmax Normalization
Papers - Attention - BASS
Papers - 3DGS - Segmentation
Papers - 3D - NeRF
Papers - 3D - Interactive - Semantic Editing based on Loss
Papers - 3D - Interactive
Papers - 3D - Gaussian Splatting and NeRF
Papers - University - The Chinese University of Hong Kong
Papers - SenseTime Research
Papers - Benchmarks - Multimodal
Papers - Benchmarks - Multimodal - SEED-Bench
Papers - Multimodal - Benchmarks - Report
Papers - ARC Lab
Papers - Inference - Early Exit
Papers - Inference - Draft Model - Early Exit - Dropout
Papers - Prompts - Adversarial
Papers - Agent - Image
Papers - Agent - Robotics
Papers - Agent - Tasks
Papers - Healthcare - Fine-tuning
Papers - Healthcare - Chain of Reasoning (CoR)
Papers - Chain of Reasoning (CoR)
Papers - Training - Self-Guided with Search
Papers - Inference - Uncertainty-Guided Search
Papers - Healthcare - VQA - Understanding
Papers - Healthcare - Multimodal
Papers - Healthcare - Biomedical Research
Papers - Gemini
Papers - Healthcare - Prompts
Papers - Healthcare - Surgery - VQA
Papers - Healthcare - Radiology Objects in Context (ROCO)
Papers - Healthcare - Benchmarks - Text - NEJM
Papers - Healthcare - Benchmarks - Text - MMMU-HM
Papers - Healthcare - Benchmarks - Long Context - MIMIC-III
Papers - Healthcare - Benchmarks - VQA - MedVidQA
Papers - Healthcare - Benchmarks - Video - Cholec80
Papers - Healthcare - Benchmarks - Video - Cholec80-CVS
Papers - Healthcare - Report
Papers - Speculative Decoding - Early Exit
Papers - 3D - Garment
Papers - Training - Multi-Model Evaluation
Papers - Training - Multi-Model Evaluation - PoLL
Papers - Training - Evaluation - Multi-Hop QA
Papers - Prompts - Training - Evaluation - Multi-Hop QA
Papers - Agent - Training
Papers - Agent - Fine-tuning
Papers - Agent - Evaluation
Papers - Blender
Papers - 3D - Blender
Papers - 3D - Mesh Editing
Papers - 3D - Texture Editing
Papers - 3D - Lighting
Papers - Video - Robot Simulator - VQA
Papers - World Sim - VQA
Papers - World Sim - Scene Generation
Papers - University - Central South University
Papers - Image - Fine-tuning - Dataset - StylusDocs
Papers - Image - Multi-Model Evaluation
Papers - Image - Datasets - DOCCI
Papers - Image - Annotation Pipeline
Papers - Image - Annotation UI
Papers - 3DGS - Test - Dataset - RealEstate10k
Papers - 3DGS - Test - Dataset - Objaverse
Papers - Image - Detailed Multi-Object Generation
Papers - SK Telecom
Papers - 3DGS - Point Cloud - COLMAP
Papers - 3DGS - Tabular Structure Detection
Papers - 3DGS - Test - PSNR
Papers - 3DGS - Structure Preservation
Papers - University - Imperial College London
Papers - Custom Layers - KAN
Papers - Octopus
Papers - Nexa AI
Papers - Alternative Layers - KAN instead of MLP
Papers - California Institute of Technology
Papers - National Science Foundation (NSF)
Papers - University - University College London
Papers - ICL - Induction Head
Papers - ICL - Induction Circuit
Papers - ICL - Training - Activations - Clamping
Papers - Ensemble
Papers - Audio - Codec - Bitrate - Low
Papers - Model Editing
Papers - Image - Comics
Papers - Image - Multi-Caption Generation
Papers - University - Nankai University
Papers - Institute - Nankai Int Advanced Research Institute
Papers - Fine-tuning - LoRA - LoRAX
Spaces - Comics and Cartoons
Papers - Training - Datasets - Few-Shot Learning - OmniGlot
Papers - Emergent Properties - ICL - Induction Heads
Papers - Custom Layers - No Dropout - Dropout Regularization
Papers - Custom Layers - Residual Connection - Ablation
Papers - Ablation - Attention - Head Pruning
Papers - XAI - Attention - Induction Heads
Papers - Attention - Induction Heads
Papers - Attention - Ablation
Papers - Training - Ablation
Papers - Attention - Previous Token Head
Papers - Training Research - Loss Dynamics - Clamping
Papers - Training Research - Clamping
Papers - XAI - Induction Head - Phase Change - Components
Papers - ICL - Induction Head - Num Labels vs Classes - Loss
Papers - ICL - Induction Circuit - Data Dependent Learning
Papers - Training - ICL - Induction Circuit Evolution
Papers - ICL - Induction Head - Copy vs QK Match
Papers - ICL - Phase Change - Delay - Classes and Labels
Papers - XAI - Framework - pyvene
Papers - Pr(Ai)2R Group
Papers - Training - Interventions - Understanding
Papers - ICL - Locating Early and Late Fact Associations
Papers - ICL - Training - Distributed Alignment Search
Papers - ICL - Phase Change Delay - Large Vocabulary Size
Papers - XAI - Attention - LayerNorm
Models - MoE - Reward Model
Papers - Reward Model - Preference Collection Construction
Papers - Reward Model - Model Merging vs Joint Training
Papers - Model Merging - DARE better than TIES
Datasets - Reward Model - Preference Collection
Papers - LG
Papers - ICL - Residual Head Hypothesis
Papers - Dataset Storage - Orc vs Parquet
Papers - Dataset Storage - Parquet
Papers - Dataset Storage - Orc
Papers - Voltron Data
Papers - Dataset Storage - cuDF - Parquet and Orc
Papers - Dataset Storage - Zarr
Papers - Dataset Storage - Technical Report
Papers - Dataset Storage - Lessons Learned
Papers - BitNet
Papers - Pre-training - Rerankers
Papers - Fine-tuning - Rerankers
Models - Coding - Code Interpreter - Agent - Multi-shot
Papers - Training - Math
Papers - FNet - Fourier Transformers
Papers - SSMs
Papers - Mamba
Papers - Mamba - Mamba 2
Papers - Training - Cost Estimates
Papers - Epoch AI
Papers - Training - Historical GPU Cost Trends
Papers - Training - Report - Historical Cost Estimates
Papers - Reasoning - Complex - Alice in Wonderland - AIW
Papers - Reasoning - Complex - TACT
Papers - Text - Table Generation - Pandas DataFrames
Papers - Reasoning - Prompt - Table and Calculations
Papers - Coding - Table and Calculations using Pandas
Papers - Reasoning - Datasets - TACT
Models - Video - Captions
Datasets - Video - Captions
Papers - Chain of Thoughts - Multi-Shot - Buffer of Thoughts
Papers - Training - Piecewise Affine Multiplication
Papers - Training - PAM faster vs MatMul - CPU
Papers - Training - Multiplication Free
Papers - Training - Distribution Estimation - Autoregressive
Papers - Training - CNN - Binarized MNIST - Code Examples
Papers - Audio - Distribution Estimate - Spectrogram
Papers - Unsupervised - Distribution Estimation
Papers - Datasets - Biology - SMILES
Papers - Healthcare - Virus Detection - Classification
Papers - Healthcare - Text - Biology QA
Papers - University - Hong Kong University
Papers - Image - Fine-tuning - Llama
Papers - Text to Image - Encoders - Flan-T5 XL
Papers - Image - Training Metrics - PSNR
Papers - Image - Training Metrics - SSIM
Papers - Image - Tokenizers - VQGAN
Papers - Image - Tokenizers - ViT-VQGAN
Papers - Image - Metrics - Inception Score (IS)
Papers - Image - Training - AutoRegressive
Papers - Inference - Image - vLLM
Papers - Image - Training - Captions created with LLaVA
Datasets - Text - Characters
Papers - Image - Tokenizer - L2 Normalization
Papers - Image - Training - Loss - Gradient Estimator
Papers - Image - Training - Loss - PatchGAN
Papers - Image - Training - Arch - 2D RoPE and SwiGLU
Papers - Image - Training - Detailed Training Tables
Papers - Image - Classifier-Free Guidance (CFG)
Papers - Image - BigGAN
Papers - University - University of Heriot-Watt
Papers - Image - Metrics - FID and IS
Papers - Image - Sampling - Variety, Fidelity, Truncation
Papers - Image - Classifier - Inception v2 - JFT-300M
Papers - Training - Process Reward Model
Papers - RL - Monte Carlo Tree Search (MCTS)
Papers - Image - InceptionResNet-v2
Papers - Image - Datasets - JFT-300M
Papers - Image - Semantic Segmentation - Benchmark - PASCAL
Papers - Image - U-Net - Mask Augmentation
Papers - Ant Group
Papers - Monte Carlo Tree Search (MCTS) - Self-Refine MCTSr
Papers - Monte Carlo Tree Search (MCTS) - Math Reasoning
Papers - University - Hong Kong Polytechnic University
Papers - Image - Faster RCNN
Papers - Image - Region Proposal Network (RPN)
Papers - Image - Faster RCNN - 2nd Stage - Box Classifier
Papers - Image - Faster RCNN - Region Proposal Network (RPN)
Papers - Image - Human Pose Estimation - Coco
Papers - Image - InceptionResNet
Papers - Image - Deep Fakes - Detecting Video Forgeries
Papers - University - Drexel University
Papers - Prompts - Report
Papers - Agent - Security
Papers - Security - Pen Testing
Papers - Security - OWASP Testing
Papers - Image - Diffusion - Parallel Denoising
Papers - Image - Inference - Model Segmentation
Papers - Image - Denoising - Stride Denoising
Papers - Video - SDXL - Multi-GPU
Papers - Training - Multi-GPU
Papers - Coding - Benchmarks - McEval
Papers - Coding - Training - Annotations
Papers - Coding - Prompts
Papers - Coding - Inference - vLLM
Papers - Coding - Training - Distributed - PyTorch FSDP
Papers - Coding - Tokenizer - CodeBert
Papers - Coding - Tokenizer - Visualization - t-SNE
Papers - Coding - Tokenizer - Viz - Hierarchical Clustering
Papers - CCSE
Papers - Coding - MCoder
Papers - Coding - Classification - Categories Easy Med Hard
Papers - University - Beijing Information Science and Tech
Papers - Coding - Fine-tuning - CodeQwen
Papers - Coding - Fine-tuning - DeepSeekCoder
Papers - World Sim - Video - Benchmarks - MMWorld
Papers - University - University of California Santa Cruz
Papers - SSMs - Chimera
Papers - SSMs - Testing - Time Series Forecasting Report
Papers - SSMs - 2D Mamba
Papers - SSMs - Classification
Papers - SSMs - Time Series Anomaly Detection
Papers - Image - Augmentation - Edge Detection - HED
Papers - Image - ControlNet
Models - Image - ControlNet - Canny
Models - Image - ControlNet - Training Annotator
Papers - Image - Datasets - BSDS 500 - Berkeley Segmentation
Papers - Image - Datasets - NYUD - NYU Depth
Papers - Image - VGGNet
Papers - Image - Pipeline - HED
Papers - Image - CFG - CFG Resolution Weighting (CFG-RW)
Papers - 3DGS - Enhancement - Lighting
Papers - 3DGS - Cone Scatter Initialization
Papers - 3DGS - Security Camera - Image Enhancement
Papers - Training - Preference Optimization - DiscoPOP
Papers - Training - Preference Optimization - Code Samples
Papers - Training - Synthetic - Loss Functions
Papers - Image - Augmentation - Binarization - NAF-DPM
Papers - Image - OCR - Binarization - Otsu
Papers - Image - OCR - Binarization - Sauvola
Papers - Image - OCR - Binarization - DE-GAN
Papers - Image - OCR - Binarization - D2BFormer
Papers - Image - OCR - Binarization - DocDiff
Papers - Image - OCR - Binarization - DocEnTr
Papers - Image - OCR - CER (Character Error Rate)
Papers - Image - Datasets - OCR - DIBCO
Papers - Image - OCR - Metrics - PSNR, F-Measure, Fps
Papers - Image - DPM - Diffusion Probabilistic Model
Papers - Image - OCR - Fine-tuning - CTC Loss Function
Papers - Document - Deblurring
Papers - Datasets - Multimodal
Models - Abliterated - Refusal Direction Editing
Papers - Image - Augmentation - Depth - MDE
Models - Image - Augmentation - Depth Estimation
Papers - Image - Augmentation - Plasma Fractals
Papers - XAI - Text - WordNet - Noun and Verb Hierarchy
Papers - Text - Training - Estimation - LDA
Papers - Quantization
Spaces - Image - Stable Diffusion - 3 - Medium
Papers - 3D - Artist-Created Meshes (AMs)
Papers - Inference - Speculative Decoding
Papers - Duplex Models
Papers - Embed - Duplex Models - Time-Division Multiplexing
Papers - XAI - Confidence Regulation
Papers - Image - Charts - QA - Reasoning
Papers - Image - Charts
Papers - Image - Benchmarks - Charts
Papers - In-Context Learning - Concept Learning Geometry
Papers - ICL - Prompt - Out of Distribution (OOD) Emergence
Papers - ICL - Concept Spaces
Papers - NTT Research
Papers - Coding - Programming by Example
Papers - Coding - List Functions, Editing, Logos ASCII Art
Papers - Coding - Eval - LambdaBeam Problems
Papers - Coding - Building Using Multi-shot Prompts
Spaces - Biology - ESM - Proteins
Models - Biology - Proteins - ESM
Papers - Healthcare - Datasets - Image - PubMedVision
Papers - Image - Fine-tuning - LLaVA
Papers - Image - Datasets - Biology - Arboretum
Papers - Math - TabMWP
Papers - Training - Brier Score - Probabilistic Accuracy
Datasets - Text - Personas
Papers - Rag - Benchmarks
Papers - Rag - Long Context
Models - Text - Multi-token Prediction
Papers - 3D - AssetGen
Papers - Benchmarks - Tables
Papers - Text - Inferential Adversaries
Papers - Image - Region Zoom
Papers - Multimodal - Embeddings
Papers - Image - Florence 2
Papers - 3DGS - Text - Enhance
Papers - 3DGS - Geometry-Bound
Papers - 3DGS - Loss - Interval Score Matching (ISM)
Papers - 3DGS - Classifier-Free Guidance (CFG)
Papers - 3DGS - Training - Model - Stable Diffusion 2
Papers - 3DGS - Text - Image - Mesh
Papers - Multimodal - 3DGS and Text
Models - Text - Research
Papers - Multimodal - Training - Joint Example Selection
Papers - Image - Training - Optimization - SigLIP
Models - Text - Fine-tuning - Axolotl
Models - Text - Chemistry
Models - SAE - Sparse Auto Encoders
Papers - Multimodal - Training - Decoder Only
Papers - Multimodal - Training - Patch Aligning Layer
Papers - Attention - Decoder Only
Papers - Multimodal - Training - Loss - Cross Entropy
Papers - Multimodal - Training - LLM Guided Pre-training
Papers - Agent - Tools
Papers - Agent - Math
Papers - Agent - Math - Reasoning
Papers - Text - Decoding - Truthful
Papers - Decoders - Report
Datasets - CoT - Math
Datasets - CoT
Papers - XAI - Attention - MLP - Partitioning - Affine Maps
Papers - XAI - Token Tracing - Model MLP Layers Plots
Papers - Decoders - Strategy - Beam Search - Report
Papers - RL - Gradient-Boosting
Papers - Markov Decision Process
Papers - RL - Actor-Critic
Papers - RL - GBT vs GBRL vs XGBoost
Papers - RL - Structured Data - Gradient Boosting
Papers - XGBoost
Papers - Datasets - Multimodal - Creator Guide
Papers - Decoders - Deterministic - FSD
Papers - Decoders - Deterministic - Diverse Beam Search
Papers - Decoders - Deterministic
Papers - Decoders - Stochastic
Papers - Decoders - Deterministic - DoLa
Papers - Decoders - Deterministic - Greedy Search
Papers - Decoders - Deterministic - Contrastive Search
Papers - Decoders - Deterministic - Contrastive Decoding
Papers - Decoders - Stochastic - Mirostat Sampling
Papers - Decoders - Stochastic - Typical Sampling
Papers - Decoders - Stochastic - Temperature Sampling
Papers - Decoders - Stochastic - Top-p Sampling
Papers - Decoders - Stochastic - Top-k Sampling
Papers - Decoders - Stochastic - n-Sampling
Papers - Benchmark - Coding - HumanEval
Papers - Coding - Datasets - MBPP
Papers - Text - Datasets - Translation - WMT22
Papers - Text - Benchmark - Translation - BLEU
Papers - Text - Benchmark - Factual Knowledge - FActScore
Papers - Text - Benchmark - Instructions - AlpacaEval
Papers - Fine-tuning - Math - QA
Papers - Image - Reasoning
Papers - Encodings - SPE - Sinusoidal Position Encoding
Papers - Encodings - LPE - Learnable Position Encodings
Papers - Text - Reasoning - Causal Chains
Papers - Text - Dataset - Knowledge Graph - WordNet
Papers - Knowledge Graph - Dataset - Text - WordNet
Papers - Knowledge Graph - GraphRag - WordNet
Papers - CoT - Intermediate Thoughts
Papers - CoT - Branch Solve Merge (BSM)
Papers - Training - Text - Continual Learning
Models - Text - Embedding
Datasets - Text - Wiki - Embeddings - SBert
Papers - ICV - In-Context Vectors (controllable ICL)
Papers - Positive Geometries - Report
Papers - ICL - Attention
Papers - ICV - PCA - Directional Alignment
Papers - Text - Detoxification
Papers - Text - Datasets - Toxicity - ParaDetox
Papers - Text - Toxicity - Feature Shifting
Papers - Text - Safety
Papers - Fine-tuning - Text - Detoxification - LoRA
Papers - Text - Personalization - ICV
Papers - Text - Datasets - Formality - Yahoo Answers
Papers - Text - Role-Play - Shakespeare - Romeo and Juliet
Papers - Text - Datasets - Sentiment Transfer - Yelp Reviews
Papers - Text - Role-Play - Ranking Responses - ChatGPT
Papers - Vicuna
Papers - Text - Benchmarks - Similarity - Text - ROUGE-1
Papers - Text - Benchmark - Similar - Feature - Bert-Score
Papers - ICL - Detox - ICL Fine-tuning vs In-Context Vectors
Papers - Text - Personalization - Positivity
Papers - Text - Safety - Diagonal Safety for Unsafe Queries
Papers - ICV - Strength - Tradeoffs Similarity and Fluency
Papers - Text - Jail break - ICV
Papers - Text - Role-Play - Style - Speaking
Papers - Text - Datasets - AGNews
Papers - Text - Activation Editing
Papers - Activation Editing - ICV
Papers - Text - Task Arithmetics - Fine-tune vs Base
Papers - ICV - Task Arithmetics
Papers - Text - Formality - Classifier - XLM-RoBERTa
Papers - Text - Sentiment - Classification
Papers - Attention - Dual Chunk
Papers - Attention - Rescale Weights - YARN
Papers - Activation - SwiGLU
Papers - Text - Training - Long Context
Papers - Training - Data Annotation
Papers - Benchmarks - Alignment - MT-Bench
Papers - Benchmarks - Text - Long Context - LV-Eval
Papers - Benchmarks - Long Context - Needle in a Haystack
Papers - Benchmarks - Text - Long Context - NeedleBench
Papers - Benchmarks - Biology
Papers - Text - Long Context
Papers - Text - Benchmarks - Reasoning - Long Context - ATC
Papers - 3DGS - Benchmarks - LPIPS
Papers - 3DGS - Scene Editing - Day vs Night - t-SNE
Papers - 3DGS - Editing - Appearance Interpolation
Papers - 3DGS - Datasets - Photo Tourism
Papers - 3DGS - Benchmarks - SSIM
Papers - 3DGS - Datasets - NeRF on-the-go
Papers - 3DGS - Fibonacci Sphere Sampling - Sky Handling
Papers - 3DGS - Uncertainty - Per-Pixel Binary Mask
Papers - Quantization - EfficientQAT
Papers - Ternary
Papers - Audio - Text - Music Generator
Papers - Quantization - AQLM
Papers - Security - Red Team - Agents
Papers - Multimodal - Benchmarks
Papers - Text - Linguistic Agency - Algospeak
Papers - Text - Cognitive Science - Participation
Papers - Text - Cognitive Science - Linguistic Agency
Papers - Text - Linguistics - Precarity - Conflict - Tension
Papers - Text - Linguistics - CYOA Game Exploration
Papers - Visualizations - Non-Euclidean Structures
Papers - Visualizations - Report
Papers - Visualizations - Topological, Geometric, Algebraic
Papers - Visualizations - High Dimensional Approximations
Papers - Image - Segmentation - High Dimensional Objects
Papers - Visualizations - Graphical Taxonomy
Papers - Visualizations - Dimensionality Reduction
Papers - Math - Non-Euclidean Geometry
Papers - Math - Visualizations
Papers - Math - Topology - Discrete Topological Structures
Papers - Math - Geometry - Distance - Riemannian Manifold
Papers - Math - Geometry - Distance - Riemannian Metric
Papers - Math - Geometry - Riemannian Geodesic
Papers - Math - Research - Training Loss - Riemannian Metric
Papers - Coding - Science
Papers - Math - Geometry - Continuous Geometric Structures
Papers - Math - Algebra - Algebraic Transformations
Papers - Training - Energy - Carbon Footprint
Papers - Coding - Verilog
Papers - Coding - Hardware - FPGA
Papers - Coding - Agentic - Summarization - Prompting
Papers - Attention - Topology, Geometry and Algebra
Papers - Math - Structures - Topology, Geometry and Algebra
Papers - Training - Math - PCA
Models - Attention - GQA
Papers - Image - PhotoMaker
Papers - Math - Algebra - Lie Group - SO(3)
Papers - Math - Group Action - Translate, Rotate, Reflect
Papers - Math - Structures in Data
Papers - Healthcare - CoT - Diagnosis
Papers - Healthcare - Medical Assistant - Diagnosis
Papers - Math - Training - Topological Structures
Papers - Topological Deep Learning - Structures in Data
Papers - Graphs
Papers - Training - Research - Data as Signals
Papers - Training - Noise - Labels
Papers - Attention - Algebra SE(d) - Fourier Nonlinearities
Papers - Attention - Algebra - Equivariant
Papers - Math - Fourier Components - Fourier Space
Papers - Encodings - Equivariant Positional Encodings
Models - Embedding - Text - BGE M3
Models - Text - Fine-tuning - SPPO
Models - Text - Fine-tuning - SPPO - Reranker
Papers - KAN
Papers - MLP
Papers - Training - Activation - Nonlinear - B-spline
Papers - Multilingual - Greek
Papers - Multilingual - Malaysian
Papers - Multilingual - Hebrew
Papers - Training - Research - Data as Coordinates
Papers - Math - Non-Euclidean Spaces - Domain and Codomain
Papers - Math - Non-Euclidean - Covariance Matrix - SPD
Papers - Math - Visualization - Non-Linear - t-SNE
Models - Bitnet - Layer Conversion
Models - Bitnet - Frankenmerge
Papers - Reasoning - Grokking
Papers - Non-Euclidean - Sphere - Frechet Mean - Geodesic
Papers - Math - Riemannian Manifolds - PCA
Papers - Math - PCA - Barycentric Subspace Analysis (BSA)
Papers - NEML - Frechet Mean - Consistency Bias
Papers - Math - Manifold - Metric Space - Quotient Space
Models - Image - Rectified Flow Transformers
Papers - Image - Rectified Flow Transformers
Papers - Math - Self-Compressing Models
Papers - Fine-tuning - LlamaFactory
Papers - Coding - DBA
Papers - Multimodal - Storytelling
Papers - Audio - Segmentation - Music - Vocals
Papers - Netflix
Papers - Georgia Institute of Technology
Papers - Audio - Segmentation - Cinematic Music
Papers - Image - Training - Instruct - VQA - Multi-Image
Papers - NTU
Papers - Fine-tuning - LoRA - Rank Stabilized Adapters
Papers - NEML - Manifold - Tangent Space - Exponential Map
Papers - NEML - Math - KNN with Geodesics and Frechet Mean
Papers - Math - Non-Euclidean Machine Learning (NEML)
Models - Coding - Compiler
Papers - NEML - Preprocessing - Topological Data Analysis
Papers - NEML - Latent Manifold - Topological Data Analysis
Papers - NEML - Preprocessing - Algebra - Group Learning
Papers - NEML - Latent Structure - Algebra - Group Learning
Papers - NEML - Transform - Euclidean to Manifold
Papers - NEML - Manifold - Local Geodesic Regression
Papers - NEML - Manifold - Bayesian - Kernel Regression
Papers - Image - Multi-Image
Papers - Benchmark - Distractions
Papers - mPLUG
Papers - Attention - Topology
Papers - Math - Regression - Geometric Structures
Papers - NEML - Manifolds Geometric - Polynomial Regression
Papers - NEML - Manifolds Geometric - Bezier Splines
Papers - NEML - Frechet Regression - Geodesic Regression
Papers - NEML - Regression - Manifold - Weighted Frechet
Papers - NEML - Regression - Stochastic - Non-Geodesic
Papers - NEML - Regression - Local Frechet Regression
Papers - NEML - Bayesian - Non-Parametric - Gaussian Process
Papers - NEML - Manifold Random Forest
Papers - NEML - Regression - Local Extrinsic
Papers - NEML - Manifold IO - Steinke Regular Splines
Papers - NEML - Manifold IO - Banerjee Kernel Regression
Papers - NEML - Geometric Structures - Dim Reduction - tSNE
Papers - NEML - Geometric Structures - Dim Reduction - UMAP
Papers - NEML - Geometric - Dimension Reduction - Isomap
Papers - NEML - Geometric - Dim Reduction - Barycentric Subs
Papers - NEML - Geometric - Dimension Reduction - Rie-SNE
Papers - NEML - Linear - Embeddings - Tangent Space PCA
Papers - NEML - Geometric - Dim Reduction - Poincare Embeds
Papers - NEML - Manifolds - VAE
Papers - NEML - Non-Euclidean Machine Learning
Papers - NEML - Hyperbolic - Frechet Mean - Poincare
Papers - NEML - Hyperbolic Learning - Poincare Ball
Papers - NEML - Poincare Ball
Papers - NEML - Datasets - WordNet
Papers - Science - Discovery
Papers - NEML - Euclidean Latents - Decoder - Riemannian LLE
Papers - NEML - Euclid Latents - Nongeodesic Sub Man - VAE
Papers - NEML - Manifold Latents - Hypersphere VAE
Papers - NEML - Manifold Latents - Lie Group Latent Space
Papers - NEML - Manifold Latents - Toroidal Latent Space
Papers - NEML - Manifold - Nonparametric Decoder - GPLVM
Papers - NEDL - Non-Euclidean Deep Learning
Papers - NEDL - Model Layer - Euclidean - MLP
Papers - NEDL - Layer - Perceptron-Exp - Riemannian Expo Map
Papers - NEDL - Layer - Log Perceptron - Riemannian Log Map
Papers - NEDL - Model Layers - Topology, Geometry, Algebra
Papers - NEDL - Benchmarks - Topology Deep Learning (TDL)
Papers - NEDL - Attention - Equivar - Steerable Transformers
Papers - NEDL - Attention - Equivariance - LieTransformer
Papers - NEDL - Geometry - Layers - ManifoldNet
Papers - Monte-Carlo Tree Search - MCTS
Papers - Function Calling
Papers - Function Calling - LLM Compiler - Parallel
Papers - Agent - Web Navigation
Papers - Video Games - Image - Understanding - QA
Papers - Video - Segmentation
Papers - Multimodal - Blip-3
Papers - Image - Summarize as JSON
Papers - Math - Polynomial Symmetry - Galois Theory
Papers - NEDL - Research - Symmetry - Group - Galois Groups
Papers - NEDL - Topological Deep Learning (TDL)
Papers - NEDL - Topology - Persistent Homology
Papers - Security - Benchmark
Papers - Music - Piano - Performer - Robot - Motion
Papers - Music - Training - Performer - Finger Location
Papers - Music - Training - Segmentation - Piano
Papers - Music - Training - Annotation - Piano
Papers - Audio - Pipeline - Annotation - Finger Placement
Papers - NEDL - Equivariant Transformers
Papers - NEDL - Lie Groups
Papers - NEDL - Hyperbolic Rotation
Papers - NEDL - Embeddings - Hyperbolic
Papers - NEDL - Dim Redct - Principal Geodesics Analysis PGA
Papers - Benchmark - Tables - Reasoning - QA
Papers - Normalization - NLP - Power vs Batch
Papers - Normalization - NLP - Layer vs Batch
Papers - Normalization - Embedding Layer - SVD
Papers - Normalization - No Normalization - Fixup
Papers - Training - Initialization - Regularization - Fixup
Papers - ResNet - Training - Init - Exploding Gradients
Papers - ResNet - Activation - nonlinear ReLU
Papers - Training - Layers - Scalar - Bias and Multipliers
Papers - Training - Regularization - MixUp Regularizer
Papers - Training - Feature Space Cluster - Fisher Criterion
Papers - NEDL - Topology - Attention - Set Transformer
Papers - NEDL - Topology - Attn - Point Cloud Transformer
Papers - NEDL - Topology - Attention - Geodesic Transformer
Papers - NEDL - Topology - Attn - Graph Attn Transformer
Papers - NEDL - Topology - Attention - SE(3) Transformer
Papers - NEDL - Dim Reduction - Principal Geodesic Analysis
Papers - Text - Controllable Text Generation (CTG)
Papers - NEDL - Latent Space Manipulation
Papers - Training - Unlearning
Papers - Text - Survey
Papers - MoE - Jamba
Papers - Text - Benchmark - QA - Knowledge Conflicts
Spaces - Image - Segmentation
Spaces - Image - Prompt with LoRA
Spaces - Multimodal - Image Generation - Text and Image
Papers - NEDL - Geometry - Wasserstein Manifold
Papers - Multimodal - Alignment Correspondence Policy
Models - Image - Llava
Models - Image - SDXL
Papers - Training - Multi-Task Learning - Jacobian Descent
Papers - Training - Loss - Multiple Loss - Jacobian Descent
Papers - Training - Hardware - Survey
Papers - NEDL - Topology - Lifting Topological Domains
Papers - Benchmarks - Data Science
Papers - Training - Eval - Out of Distribution
Updated 10 days ago
Phi-4 Technical Report
Paper • 2412.08905 • Published 15 days ago • 92