CVPR 2023 Papers with Open-Source Code: Table of Contents
Backbone
CLIP
MAE
GAN
GNN
MLP
NAS
OCR
NeRF
DETR
Diffusion Models
Avatars
ReID (Re-Identification)
Long-Tail Distribution
Vision Transformer
Vision-Language
Self-supervised Learning
Data Augmentation
Object Detection
Object Tracking
Semantic Segmentation
Instance Segmentation
Panoptic Segmentation
Medical Image Segmentation
Video Object Segmentation
Referring Image Segmentation
Image Matting
Image Editing
Low-level Vision
Super-Resolution
Deblur
3D Point Cloud
3D Object Detection
3D Semantic Segmentation
3D Object Tracking
3D Human Pose Estimation
3D Semantic Scene Completion
Medical Image
Image Generation
Video Generation
Video Understanding
Action Detection
Text Detection
Knowledge Distillation
Model Pruning
Image Compression
Anomaly Detection
3D Reconstruction
Depth Estimation
Trajectory Prediction
Image Captioning
Visual Question Answering
Sign Language Recognition
Video Prediction
Novel View Synthesis
Zero-Shot Learning
Stereo Matching
Scene Graph Generation
Datasets
New Tasks
Others
Backbone
Integrally Pre-Trained Transformer Pyramid Networks
Paper: https://arxiv.org/abs/2211.12735
Code: https://github.com/sunsmarterjie/iTPN
Stitchable Neural Networks
Homepage: https://snnet.github.io/
Paper: https://arxiv.org/abs/2302.06586
Code: https://github.com/ziplab/SN-Net
Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks
Paper: https://arxiv.org/abs/2303.03667
Code: https://github.com/JierunChen/FasterNet
BiFormer: Vision Transformer with Bi-Level Routing Attention
Paper: https://arxiv.org/abs/2303.08810
Code: https://github.com/rayleizhu/BiFormer
DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network
Paper: https://arxiv.org/abs/2303.02165
Code: https://github.com/alibaba/lightweight-neural-architecture-search
Vision Transformer with Super Token Sampling
Paper: https://arxiv.org/abs/2211.11167
Code: https://github.com/hhb072/SViT
Hard Patches Mining for Masked Image Modeling
Paper: None
Code: None
CLIP
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Paper: https://arxiv.org/abs/2301.12959
Code: https://github.com/tobran/GALIP
DeltaEdit: Exploring Text-free Training for Text-driven Image Manipulation
Paper: https://arxiv.org/abs/2303.06285
Code: https://github.com/Yueming6568/DeltaEdit
MAE
Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders
Paper: https://arxiv.org/abs/2212.06785
Code: https://github.com/ZrrSkywalker/I2P-MAE
Generic-to-Specific Distillation of Masked Autoencoders
Paper: https://arxiv.org/abs/2302.14771
Code: https://github.com/pengzhiliang/G2SD
GAN
DeltaEdit: Exploring Text-free Training for Text-driven Image Manipulation
Paper: https://arxiv.org/abs/2303.06285
Code: https://github.com/Yueming6568/DeltaEdit
NeRF
NoPe-NeRF: Optimising Neural Radiance Field with No Pose Prior
Homepage: https://nope-nerf.active.vision/
Paper: https://arxiv.org/abs/2212.07388
Code: None
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
Paper: https://arxiv.org/abs/2211.07600
Code: https://github.com/eladrich/latent-nerf
NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics via Novel-View Synthesis
Paper: https://arxiv.org/abs/2301.08556
Code: None
Panoptic Lifting for 3D Scene Understanding with Neural Fields
Homepage: https://nihalsid.github.io/panoptic-lifting/
Paper: https://arxiv.org/abs/2212.09802
Code: None
NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer
Homepage: https://redrock303.github.io/nerflix/
Paper: https://arxiv.org/abs/2303.06919
Code: None
DETR
DETRs with Hybrid Matching
Paper: https://arxiv.org/abs/2207.13080
Code: https://github.com/HDETR
NAS
PA&DA: Jointly Sampling PAth and DAta for Consistent NAS
Paper: https://arxiv.org/abs/2302.14772
Code: https://github.com/ShunLu91/PA-DA
Avatars
Structured 3D Features for Reconstructing Relightable and Animatable Avatars
Homepage: https://enriccorona.github.io/s3f/
Paper: https://arxiv.org/abs/2212.06820
Code: None
Demo: https://www.youtube.com/watch?v=mcZGcQ6L-2s
ReID (Re-Identification)
Clothing-Change Feature Augmentation for Person Re-Identification
Paper: None
Code: None
MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID
Paper: https://arxiv.org/abs/2303.07065
Code: https://github.com/vimar-gu/MSINet
Diffusion Models
Video Probabilistic Diffusion Models in Projected Latent Space
Homepage: https://sihyun.me/PVDM/
Paper: https://arxiv.org/abs/2302.07685
Code: https://github.com/sihyun-yu/PVDM
Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models
Paper: https://arxiv.org/abs/2211.10655
Code: None
Imagic: Text-Based Real Image Editing with Diffusion Models
Homepage: https://imagic-editing.github.io/
Paper: https://arxiv.org/abs/2210.09276
Code: None
Parallel Diffusion Models of Operator and Image for Blind Inverse Problems
Paper: https://arxiv.org/abs/2211.10656
Code: None
DiffRF: Rendering-guided 3D Radiance Field Diffusion
Homepage: https://sirwyver.github.io/DiffRF/
Paper: https://arxiv.org/abs/2212.01206
Code: None
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Paper: https://arxiv.org/abs/2212.09478
Code: https://github.com/researchmm/MM-Diffusion
HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising
Homepage: https://aminshabani.github.io/housediffusion/
Paper: https://arxiv.org/abs/2211.13287
Code: https://github.com/aminshabani/house_diffusion
TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
Paper: https://arxiv.org/abs/2303.05762
Code: https://github.com/chenweixin107/TrojDiff
Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption
Paper: https://arxiv.org/abs/2207.03442
Code: https://github.com/shiyegao/DDA
DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration
Paper: https://arxiv.org/abs/2303.06885
Code: None
Vision Transformer
Integrally Pre-Trained Transformer Pyramid Networks
Paper: https://arxiv.org/abs/2211.12735
Code: https://github.com/sunsmarterjie/iTPN
Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors
Homepage: https://niessnerlab.org/projects/hou2023mask3d.html
Paper: https://arxiv.org/abs/2302.14746
Code: None
Learning Trajectory-Aware Transformer for Video Super-Resolution
Paper: https://arxiv.org/abs/2204.04216
Code: https://github.com/researchmm/TTVSR
Vision Transformers are Parameter-Efficient Audio-Visual Learners
Homepage: https://yanbo.ml/project_page/LAVISH/
Code: https://github.com/GenjiB/LAVISH
Where We Are and What We’re Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes
Paper: https://arxiv.org/abs/2303.04249
Code: None
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
Paper: https://arxiv.org/abs/2301.06051
Code: https://github.com/Haiyang-W/DSVT
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
Paper: https://arxiv.org/abs/2211.10772
Code: https://github.com/ViTAE-Transformer/DeepSolo
BiFormer: Vision Transformer with Bi-Level Routing Attention
Paper: https://arxiv.org/abs/2303.08810
Code: https://github.com/rayleizhu/BiFormer
Vision Transformer with Super Token Sampling
Paper: https://arxiv.org/abs/2211.11167
Code: https://github.com/hhb072/SViT
BEVFormer v2: Adapting Modern Image Backbones to Bird’s-Eye-View Recognition via Perspective Supervision
Paper: https://arxiv.org/abs/2211.10439
Code: None
BAEFormer: Bi-directional and Early Interaction Transformers for Bird’s Eye View Semantic Segmentation
Paper: None
Code: None
Vision-Language
GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods
Paper: https://arxiv.org/abs/2301.01893
Code: None
Teaching Structured Vision&Language Concepts to Vision&Language Models
Paper: https://arxiv.org/abs/2211.11733
Code: None
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
Paper: https://arxiv.org/abs/2211.09808
Code: https://github.com/fundamentalvision/Uni-Perceiver
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Paper: https://arxiv.org/abs/2303.00040
Code: None
CapDet: Unifying Dense Captioning and Open-World Detection Pretraining
Paper: https://arxiv.org/abs/2303.02489
Code: None
Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
Homepage: https://rllab-snu.github.io/projects/Meta-Explore/doc.html
Paper: https://arxiv.org/abs/2303.04077
Code: None
All in One: Exploring Unified Video-Language Pre-training
Paper: https://arxiv.org/abs/2203.07303
Code: https://github.com/showlab/all-in-one
Position-guided Text Prompt for Vision Language Pre-training
Paper: https://arxiv.org/abs/2212.09737
Code: https://github.com/sail-sg/ptp
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Paper: https://arxiv.org/abs/2209.14941
Code: https://github.com/yanmin-wu/EDA
FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks
Paper: https://arxiv.org/abs/2303.02483
Code: https://github.com/BrandonHanx/FAME-ViL
Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Homepage: https://boheumd.github.io/A2Summ/
Paper: https://arxiv.org/abs/2303.07284
Code: https://github.com/boheumd/A2Summ
Object Detection
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Paper: https://arxiv.org/abs/2207.02696
Code: https://github.com/WongKinYiu/yolov7
DETRs with Hybrid Matching
Paper: https://arxiv.org/abs/2207.13080
Code: https://github.com/HDETR
Enhanced Training of Query-Based Object Detection via Selective Query Recollection
Paper: https://arxiv.org/abs/2212.07593
Code: https://github.com/Fangyi-Chen/SQR
Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
Paper: https://arxiv.org/abs/2303.05892
Code: https://github.com/LutingWang/OADP
Object Tracking
Simple Cues Lead to a Strong Multi-Object Tracker
Paper: https://arxiv.org/abs/2206.04656
Code: None
Semantic Segmentation
Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos
Paper: https://arxiv.org/abs/2303.07224
Code: https://github.com/THU-LYJ-Lab/AR-Seg
Medical Image Segmentation
Label-Free Liver Tumor Segmentation
Paper: https://arxiv.org/abs/2210.14845
Code: https://github.com/MrGiovanni/SyntheticTumors
Video Object Segmentation
Two-shot Video Object Segmentation
Paper: https://arxiv.org/abs/2303.12078
Code: https://github.com/yk-pku/Two-shot-Video-Object-Segmentation
Referring Image Segmentation
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
Paper: https://arxiv.org/abs/2302.07387
Code: None
3D Point Cloud
Physical-World Optical Adversarial Attacks on 3D Face Recognition
Paper: https://arxiv.org/abs/2205.13412
Code: https://github.com/PolyLiYJ/SLAttack.git
3D Object Detection
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
Paper: https://arxiv.org/abs/2301.06051
Code: https://github.com/Haiyang-W/DSVT
FrustumFormer: Adaptive Instance-aware Resampling for Multi-view 3D Detection
Paper: https://arxiv.org/abs/2301.04467
Code: None
3D Video Object Detection with Learnable Object-Centric Global Optimization
Paper: None
Code: None
3D Semantic Segmentation
Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation
Paper: https://arxiv.org/abs/2303.11203
Code: https://github.com/l1997i/lim3d
3D Semantic Scene Completion
VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion
Paper: https://arxiv.org/abs/2302.12251
Code: https://github.com/NVlabs/VoxFormer
Low-level Vision
Causal-IR: Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective
Paper: https://arxiv.org/abs/2303.06859
Code: https://github.com/lixinustc/Casual-IR-DIL
Super-Resolution
Super-Resolution Neural Operator
Paper: https://arxiv.org/abs/2303.02584
Code: https://github.com/2y7c3/Super-Resolution-Neural-Operator
Video Super-Resolution
Learning Trajectory-Aware Transformer for Video Super-Resolution
Paper: https://arxiv.org/abs/2204.04216
Code: https://github.com/researchmm/TTVSR
Image Generation
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Paper: https://arxiv.org/abs/2301.12959
Code: https://github.com/tobran/GALIP
MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
Paper: https://arxiv.org/abs/2211.09117
Code: https://github.com/LTH14/mage
Video Generation
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Paper: https://arxiv.org/abs/2212.09478
Code: https://github.com/researchmm/MM-Diffusion
Video Understanding
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Paper: https://arxiv.org/abs/2209.15280
Code: https://github.com/TencentARC/TVTS
Action Detection
TriDet: Temporal Action Detection with Relative Boundary Modeling
Paper: https://arxiv.org/abs/2303.07347
Code: https://github.com/dingfengshi/TriDet
Text Detection
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
Paper: https://arxiv.org/abs/2211.10772
Code: https://github.com/ViTAE-Transformer/DeepSolo
Knowledge Distillation
Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation
Paper: https://arxiv.org/abs/2302.14290
Code: None
Generic-to-Specific Distillation of Masked Autoencoders
Paper: https://arxiv.org/abs/2302.14771
Code: https://github.com/pengzhiliang/G2SD
Model Pruning
DepGraph: Towards Any Structural Pruning
Paper: https://arxiv.org/abs/2301.12900
Code: https://github.com/VainF/Torch-Pruning
Image Compression
Context-Based Trit-Plane Coding for Progressive Image Compression
Paper: https://arxiv.org/abs/2303.05715
Code: https://github.com/seungminjeon-github/CTC
Anomaly Detection
Deep Feature In-painting for Unsupervised Anomaly Detection in X-ray Images
Paper: https://arxiv.org/abs/2111.13495
Code: https://github.com/tiangexiang/SQUID
3D Reconstruction
OReX: Object Reconstruction from Planar Cross-sections Using Neural Fields
Paper: https://arxiv.org/abs/2211.12886
Code: None
SparsePose: Sparse-View Camera Pose Regression and Refinement
Paper: https://arxiv.org/abs/2211.16991
Code: None
NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
Paper: https://arxiv.org/abs/2303.02375
Code: None
Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition
Homepage: https://moygcc.github.io/vid2avatar/
Paper: https://arxiv.org/abs/2302.11566
Code: https://github.com/MoyGcc/vid2avatar
Demo: https://youtu.be/EGi47YeIeGQ
To fit or not to fit: Model-based Face Reconstruction and Occlusion Segmentation from Weak Supervision
Paper: https://arxiv.org/abs/2106.09614
Code: https://github.com/unibas-gravis/Occlusion-Robust-MoFA
Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
Paper: https://arxiv.org/abs/2303.05937
Code: None
3D Cinemagraphy from a Single Image
Homepage: https://xingyi-li.github.io/3d-cinemagraphy/
Paper: https://arxiv.org/abs/2303.05724
Code: https://github.com/xingyi-li/3d-cinemagraphy
Revisiting Rotation Averaging: Uncertainties and Robust Losses
Paper: https://arxiv.org/abs/2303.05195
Code: https://github.com/zhangganlin/GlobalSfMpy
FFHQ-UV: Normalized Facial UV-Texture Dataset for 3D Face Reconstruction
Paper: https://arxiv.org/abs/2211.13874
Code: https://github.com/csbhr/FFHQ-UV
Depth Estimation
Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
Paper: https://arxiv.org/abs/2211.13202
Code: https://github.com/noahzn/Lite-Mono
Trajectory Prediction
IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint Multi-Agent Trajectory Prediction
Paper: https://arxiv.org/abs/2303.00575
Code: None
Image Captioning
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Paper: https://arxiv.org/abs/2303.02437
Code: None
Visual Question Answering
MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering
Paper: https://arxiv.org/abs/2303.01239
Code: https://github.com/jingjing12110/MixPHM
Sign Language Recognition
Continuous Sign Language Recognition with Correlation Network
Paper: https://arxiv.org/abs/2303.03202
Code: https://github.com/hulianyuyy/CorrNet
Video Prediction
MOSO: Decomposing MOtion, Scene and Object for Video Prediction
Paper: https://arxiv.org/abs/2303.03684
Code: https://github.com/anonymous202203/MOSO
Novel View Synthesis
3D Video Loops from Asynchronous Input
Homepage: https://limacv.github.io/VideoLoop3D_web/
Paper: https://arxiv.org/abs/2303.05312
Code: https://github.com/limacv/VideoLoop3D
Zero-Shot Learning
Bi-directional Distribution Alignment for Transductive Zero-Shot Learning
Paper: https://arxiv.org/abs/2303.08698
Code: https://github.com/Zhicaiwww/Bi-VAEGAN
Semantic Prompt for Few-Shot Learning
Paper: None
Code: None
Stereo Matching
Iterative Geometry Encoding Volume for Stereo Matching
Paper: https://arxiv.org/abs/2303.06615
Code: https://github.com/gangweiX/IGEV
Scene Graph Generation
Prototype-based Embedding Network for Scene Graph Generation
Paper: https://arxiv.org/abs/2303.07096
Code: None
Datasets
Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes
Paper: https://arxiv.org/abs/2303.02760
Code: None
Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Homepage: https://boheumd.github.io/A2Summ/
Paper: https://arxiv.org/abs/2303.07284
Code: https://github.com/boheumd/A2Summ
Others
Interactive Segmentation as Gaussian Process Classification
Paper: https://arxiv.org/abs/2302.14578
Code: None
Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger
Paper: https://arxiv.org/abs/2302.14677
Code: None
SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries
Homepage: http://bit.ly/splinecam
Paper: https://arxiv.org/abs/2302.12828
Code: None
SCOTCH and SODA: A Transformer Video Shadow Detection Framework
Paper: https://arxiv.org/abs/2211.06885
Code: None
DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization
Homepage: https://ai4ce.github.io/DeepMapping2/
Paper: https://arxiv.org/abs/2212.06331
Code: https://github.com/ai4ce/DeepMapping2
Token Turing Machines
Paper: https://arxiv.org/abs/2211.09119
Code: None
Single Image Backdoor Inversion via Robust Smoothed Classifiers
Paper: https://arxiv.org/abs/2303.00215
Code: https://github.com/locuslab/smoothinv
To fit or not to fit: Model-based Face Reconstruction and Occlusion Segmentation from Weak Supervision
Paper: https://arxiv.org/abs/2106.09614
Code: https://github.com/unibas-gravis/Occlusion-Robust-MoFA
HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics
Homepage: https://dolorousrtur.github.io/hood/
Paper: https://arxiv.org/abs/2212.07242
Code: https://github.com/dolorousrtur/hood
Demo: https://www.youtube.com/watch?v=cBttMDPrUYY
A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others
Paper: https://arxiv.org/abs/2212.04825
Code: https://github.com/facebookresearch/Whac-A-Mole.git
RelightableHands: Efficient Neural Relighting of Articulated Hand Models
Homepage: https://sh8.io/#/relightable_hands
Paper: https://arxiv.org/abs/2302.04866
Code: None
Demo: https://sh8.io/static/media/teacher_video.923d87957fe0610730c2.mp4
Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation
Paper: https://arxiv.org/abs/2303.00914
Code: None
Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression
Paper: https://arxiv.org/abs/2303.01052
Code: None
UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy
Paper: https://arxiv.org/abs/2303.00938
Code: None
Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness
Paper: https://arxiv.org/abs/2303.00971
Code: https://github.com/zhijieshen-bjtu/DOPNet
Learning Neural Parametric Head Models
Homepage: https://simongiebenhain.github.io/NPHM
Paper: https://arxiv.org/abs/2212.02761
Code: None
A Meta-Learning Approach to Predicting Performance and Data Requirements
Paper: https://arxiv.org/abs/2303.01598
Code: None
MACARONS: Mapping And Coverage Anticipation with RGB Online Self-Supervision
Homepage: https://imagine.enpc.fr/~guedona/MACARONS/
Paper: https://arxiv.org/abs/2303.03315
Code: None
Masked Images Are Counterfactual Samples for Robust Fine-tuning
Paper: https://arxiv.org/abs/2303.03052
Code: None
HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling
Paper: https://arxiv.org/abs/2303.02700
Code: None
Decompose, Adjust, Compose: Effective Normalization by Playing with Frequency for Domain Generalization
Paper: https://arxiv.org/abs/2303.02328
Code: None
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
Paper: https://arxiv.org/abs/2303.03108
Code: None
Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples
Paper: https://arxiv.org/abs/2301.01217
Code: https://github.com/jiamingzhang94/Unlearnable-Clusters
Where We Are and What We’re Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes
Paper: https://arxiv.org/abs/2303.04249
Code: None
UniHCP: A Unified Model for Human-Centric Perceptions
Paper: https://arxiv.org/abs/2303.02936
Code: https://github.com/OpenGVLab/UniHCP
CUDA: Convolution-based Unlearnable Datasets
Paper: https://arxiv.org/abs/2303.04278
Code: https://github.com/vinusankars/Convolution-based-Unlearnability
AdaptiveMix: Robust Feature Representation via Shrinking Feature Space
Paper: https://arxiv.org/abs/2303.01559
Code: https://github.com/WentianZhang-ML/AdaptiveMix
Physical-World Optical Adversarial Attacks on 3D Face Recognition
Paper: https://arxiv.org/abs/2205.13412
Code: https://github.com/PolyLiYJ/SLAttack.git
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing
Homepage: https://carlyx.github.io/DPE/
Paper: https://arxiv.org/abs/2301.06281
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Paper: https://arxiv.org/abs/2211.12194
Code: https://github.com/Winfredy/SadTalker
Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models
Paper: None
Code: None
Sharpness-Aware Gradient Matching for Domain Generalization
Paper: None
Code: https://github.com/Wang-pengfei/SAGM
Mind the Label-shift for Augmentation-based Graph Out-of-distribution Generalization
Paper: None
Code: None
Blind Video Deflickering by Neural Filtering with a Flawed Atlas
Homepage: https://chenyanglei.github.io/deflicker
Paper: None
Code: None
RiDDLE: Reversible and Diversified De-identification with Latent Encryptor
Paper: None
Code: https://github.com/ldz666666/RiDDLE
PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation
Paper: https://arxiv.org/abs/2303.07337
Code: None
Upcycling Models under Domain and Category Shift
Paper: https://arxiv.org/abs/2303.07110
Code: https://github.com/ispc-lab/GLC
Modality-Agnostic Debiasing for Single Domain Generalization
Paper: https://arxiv.org/abs/2303.07123
Code: None
Progressive Open Space Expansion for Open-Set Model Attribution
Paper: https://arxiv.org/abs/2303.06877
Code: None
Dynamic Neural Network for Multi-Task Learning Searching across Diverse Network Topologies
Paper: https://arxiv.org/abs/2303.06856
Code: None
GFPose: Learning 3D Human Pose Prior with Gradient Fields
Paper: https://arxiv.org/abs/2212.08641
Code: https://github.com/Embracing/GFPose
PRISE: Demystifying Deep Lucas-Kanade with Strongly Star-Convex Constraints for Multimodel Image Alignment
Paper: https://arxiv.org/abs/2303.11526
Code: https://github.com/Zhang-VISLab
Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings
Paper: https://arxiv.org/abs/2303.11502
Code: None
Boundary Unlearning
Paper: https://arxiv.org/abs/2303.11570