
Multimodal Paper Collection

A list of multimodal papers from the past three years (2018-2020)

Keywords

  • multi-modal or multimodal
  • cross-domain or cross-modal
  • multi-view
  • Multivariate
  • Generative model
  • Collaborative

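Multimodal application scenarios

Research direction | Application scenario | Explanation
Cross-modal generative models | Machine translation (text-to-text) | Converting and generating between modalities such as images, text, and speech
Multimodal face anti-spoofing | |
Dynamic hand-gesture recognition | |
Visual understanding | VQA, TextVQA | Object referring: given a query sentence, find the corresponding information in an image or video
Cross-modal retrieval | Image-to-image search, speech (language)-to-image search | Mostly hash-based retrieval
Multimodal relational reasoning | |
Vision-Language Navigation (VLN) | Intelligent robots | A reinforcement learning task: natural-language (NLP) instructions guide an agent to navigate a real environment
Multimodal fusion architecture search (NAS) | | Searches the space of all possible fusion architectures for the one that performs best on a given dataset
Basic tasks: prediction, classification, clustering, etc. | Trajectory prediction, pedestrian recognition |
Visual question answering (VQA) | Intelligent robots |
Live captioning | |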
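
The table notes that cross-modal retrieval is mostly hash-based. As a rough illustration only, the sketch below ranks database items by Hamming distance to a query code. It assumes the binary codes have already been produced by some learned hash functions, one per modality; the code values and the names `hamming_distance` and `rank_by_hamming` are hypothetical and not taken from any paper in this list.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, db_codes: list) -> list:
    """Return database indices ordered from nearest to farthest code.

    sorted() is stable, so ties keep their original database order.
    """
    return sorted(
        range(len(db_codes)),
        key=lambda i: hamming_distance(query_code, db_codes[i]),
    )


# Hypothetical 8-bit codes: the query might come from text and the
# database from images; a real system would learn these codes.
query = 0b10110010
database = [
    0b10110010,  # identical to the query
    0b01001101,  # every bit flipped
    0b10100010,  # one bit differs
]

ranking = rank_by_hamming(query, database)
distances = [hamming_distance(query, code) for code in database]
```

Comparing short binary codes with XOR and a bit count is what makes hashing attractive for large-scale retrieval: the cost per comparison is tiny compared with computing distances between real-valued embeddings.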

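Jump to section

  • T-PAMI (8)
    • 2020: 0 papers
    • 2019: 3 papers
    • 2018: 5 papers
  • IJCV (4)
    • 2020: 2 papers
    • 2019: 1 paper
    • 2018: 1 paper
  • JMLR
    • 2020: 1 paper (multimodal recommender system)
    • 2019: 0 papers
    • 2017-2018: 1 paper (bag-of-words tool)
    • 2016: 1 paper (gesture recognition)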

International Conference on Learning Representations (ICLR)

Year | Author/Institution | Title | Summary | Valuable
2020 Learning Robust Representations via Multi-View Information Bottleneck (poster)
2019 Harmonizing Maximum Likelihood with GANs for Multimodal Conditional Generation (poster)
2019 Disjoint Mapping Network for Cross-modal Matching of Voices and Faces (poster)
2019 Learning Multimodal Graph-to-Graph Translation for Molecule Optimization (poster)
2019 DialogWAE: Multimodal Response Generation with Conditional Wasserstein Auto-Encoder (poster)
2019 Learning Factorized Multimodal Representations (poster)
2018 Spectral Normalization for Generative Adversarial Networks (oral)
2018 Wasserstein Auto-Encoders (oral)
2018 MGAN: Training Generative Adversarial Nets with Multiple Generators (poster)
2018 Multi-Mention Learning for Reading Comprehension with Neural Cascades (poster)
2018 Learning to cluster in order to transfer across domains and tasks (poster)
2017 Gated Multimodal Units for Information Fusion (Workshop Track)
2017 Is a picture worth a thousand words? A Deep Multi-Modal Fusion Architecture for Product Classification in e-commerce (Reject)
2017 Joint Multimodal Learning with Deep Generative Models (Reject)
2017 Multi-modal Variational Encoder-Decoders (Reject)



International Conference on Machine Learning (ICML)

Year | Author/Institution | Title | Summary | Valuable
2020 | Yoshua Bengio | Perceptual Generative Autoencoders
2020 | Google | Interpretable, Multidimensional, Multimodal Anomaly Detection with Negative Sampling for Detection of Device Failure
2020 Graph Optimal Transport for Cross-Domain Alignment
2020 The Differentiable Cross-Entropy Method
2020 Graph Representation Learning by Maximizing Mutual Information Between Spatial and Spectral Views
2020 InfoGAN-CR: Disentangling Generative Adversarial Networks with Contrastive Regularizers
2020 Conditional Augmentation for Generative Modeling
2019 Scalable Nonparametric Sampling from Multimodal Posteriors with the Posterior Bootstrap
2019 Learning Generative Models across Incomparable Spaces
2019 Wasserstein of Wasserstein Loss for Learning Generative Models
2019 EMI: Exploration with Mutual Information
2019 Self-Attention Generative Adversarial Networks
2019 Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching
2019 Entropic GANs meet VAEs: A Statistical Approach to Compute Sample Likelihoods in GANs
2019 Non-Parametric Priors For Generative Adversarial Networks
2019 Lipschitz Generative Adversarial Nets
2019 Disentangling Disentanglement in Variational Autoencoders
2019 Hierarchical Decompositional Mixtures of Variational Autoencoders
2019 Sparse Multi-Channel Variational Autoencoder for the Joint Analysis of Heterogeneous Data
2019 MIWAE: Deep Generative Modelling and Imputation of Incomplete Data Sets
2018 Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations
2018 End-to-End Learning for the Deep Multivariate Probit Model
2018 A probabilistic framework for multi-view feature learning with many-to-many associations via neural networks



Neural Information Processing Systems (NIPS)

Year | Author/Institution | Title | Summary | Valuable
2019 Adaptive Cross-Modal Few-shot Learning
2019 Cross-Modal Learning with Adversarial Samples
2019 Deep Multimodal Multilinear Fusion with High-order Polynomial Pooling
2019 Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models
2019 Cross Attention Network for Few-shot Classification
2019 Cross-Domain Transferability of Adversarial Perturbations
2019 Learning Representations by Maximizing Mutual Information Across Views
2018 Multimodal Generative Models for Scalable Weakly-Supervised Learning
2018 Mental Sampling in Multimodal Representations
2018 Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces
2018 Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels
2018 Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies
2018 Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language
2018 Flexible and accurate inference and learning for deep generative models



Computer Vision and Pattern Recognition (CVPR)

Year | Author/Institution | Title | Summary | Valuable
2020 | University of Bristol (UK) | Multi-Modal Domain Adaptation for Fine-Grained Action Recognition | Domain adaptation (classification)
2020 | Hefei University of Technology | Creating Something From Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing | Cross-modal retrieval (hash-based)
2020 | University of Freiburg (Germany) | Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior | Multimodal prediction
2020 | Inria (France) | Cross-Modal Deep Face Normals With Deactivable Skip Connections | Cross-modal generation (both modalities are images)
2020 | Tsinghua University | Monocular Real-Time Hand Shape and Motion Capture Using Multi-Modal Data | Pose estimation
2020 | Huazhong University of Science and Technology & Peking University | Semantically Multi-Modal Image Synthesis | Semantic image synthesis (controlling the synthesized result with different semantics)
2020 | Rutgers University (USA) | Knowledge As Priors: Cross-Modal Knowledge Generalization for Datasets Without Superior Knowledge | Cross-modal knowledge distillation
2020 | Nanjing University of Science and Technology | Cross-Modal Pattern-Propagation for RGB-T Tracking | Tracking
2020 | Facebook & UC Berkeley | Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA | Text-based visual question answering (TextVQA)
2020 | KAIST (Korea Advanced Institute of Science and Technology) & Samsung | Modality Shifting Attention Network for Multi-Modal Video Question Answering | Visual question answering (VQA)
2020 | CUHK-SenseTime Joint Lab | A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation | Segmentation
2020 | KAIST (Korea Advanced Institute of Science and Technology) | Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification | Person ReID
2020 | University of Oxford & Google & DeepMind | Speech2Action: Cross-Modal Supervision for Action Recognition | Action recognition
2020 | University of Surrey (UK) & Queen Mary University of London | Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval | Jigsaw puzzle (image retrieval)
2020 | Institute of Information Engineering, Chinese Academy of Sciences & USTC | Referring Image Segmentation via Cross-Modal Progressive Comprehension | Segmentation
2020 | Institute of Automation, Chinese Academy of Sciences | Cross-Modal Cross-Domain Moment Alignment Network for Person Search | Cross-modal retrieval
2020 | University of Science and Technology of China (USTC) | Vision-Dialog Navigation by Exploring Cross-Modal Memory | Vision-dialog navigation
2020 | Beihang University | A Real-Time Cross-Modality Correlation Filtering Method for Referring Expression Comprehension | Visual understanding
2020 | University of Science and Technology of China (USTC) | Multi-Modality Cross Attention Network for Image and Sentence Matching | Image-text matching
2020 | Aptiv (USA, automotive company) | nuScenes: A Multimodal Dataset for Autonomous Driving | Autonomous driving
2020 | Mercedes-Benz | Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather | Multi-sensor fusion
2020 | Inria (France) | xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation | Image segmentation (3D point clouds and 2D images)
2020 | Tsinghua University | IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text Retrieval | Cross-modal retrieval
2020 | Facebook | What Makes Training Multi-Modal Classification Networks Hard? | Multimodal classification
2020 | Institute of Computing Technology, Chinese Academy of Sciences | Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text | Visual question answering
2020 | University of Electronic Science and Technology of China (UESTC) | Universal Weighting Metric Learning for Cross-Modal Matching | Cross-modal matching
2020 | Microsoft & Georgia Tech (USA) | MMTM: Multimodal Transfer Module for CNN Fusion | Multimodal fusion
2020 | University of Science and Technology of China (USTC) | Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer | Person ReID
2020 | Tel Aviv University (Israel) | Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation | Multimodal registration (image registration)
2020 | Shanghai Jiao Tong University | Where, What, Whether: Multi-Modal Learning Meets Pedestrian Detection | Pedestrian detection
2020 | Caltech (USA) & Aptiv (automotive company) | CoverNet: Multimodal Behavior Prediction Using Trajectory Sets | Behavior prediction
2020 | University of Maryland (USA) | EmotiCon: Context-Aware Multimodal Emotion Recognition Using Frege’s Principle | Emotion classification
2020 | Xpeng Motors (Chinese electric-vehicle startup) | Discriminative Multi-Modality Speech Recognition | Speech recognition (video and audio)
2020 | Zhejiang University | MCEN: Bridging Cross-Modal Gap between Cooking Recipes and Dish Images with Latent Variable Model | Cross-modal retrieval (food retrieval)
2020 | Kakao Brain (Korean messaging-app company) | Hypergraph Attention Networks for Multimodal Learning | Multimodal question answering
2020 | Institute of Software, Chinese Academy of Sciences | End-to-End Adversarial-Attention Network for Multi-Modal Clustering | Multimodal clustering
2020 | | Multimodal Categorization of Crisis Events in Social Media | Multimodal classification
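2019 | King Abdullah University of Science and Technology (KAUST, Saudi Arabia) | Latent Filter Scaling for Multimodal Unsupervised Image-To-Image Translation | Unsupervised image-to-image translation
2019 | University of Southern California & Alibaba (US) | Unsupervised Multi-Modal Neural Machine Translation | Unsupervised learning of image-grounded translation between languages
2019 | JD.com | A Dataset and Benchmark for Large-Scale Multi-Modal Face Anti-Spoofing | Face anti-spoofing
2019 | Chinese University of Hong Kong (CUHK) & SenseTime | Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition With Multimodal Training | Dynamic hand-gesture recognition (3D video)
2019 | Chinese University of Hong Kong (CUHK) & SenseTime | Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing | Visual understanding
2019 | Microsoft Cloud & AI | Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval | Cross-modal retrieval
2019 | Sorbonne University (France) | MUREL: Multimodal Relational Reasoning for Visual Question Answering | Visual question answering
2019 | JD.com | Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering | Visual question answering
2019 | Hong Kong University of Science and Technology (HKUST) | ContextDesc: Local Descriptor Augmentation With Cross-Modality Context | Cross-modal feature matching
2019 | University of Hong Kong | Cross-Modal Relationship Inference for Grounding Referring Expressions | Visual understanding
2019 | University of Pittsburgh (USA) | Cross-Modality Personalization for Retrieval | Cross-modal retrieval
2019 | UC Santa Barbara & Microsoft | Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation | Vision-language navigation (reinforcement learning)
2019 | CUHK-SenseTime Joint Lab | Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering | Visual question answering
2019 | University of Caen Normandy (France) | MFAS: Multimodal Fusion Architecture Search | Multimodal fusion architecture search (NAS)
2019 | University of Freiburg (Germany) | Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction | Future prediction
2019 | University of Rochester (USA) | Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss | Cross-modal generation (speech to video)
2019 | Preferred Networks (Japanese AI unicorn backed by Toyota) & University of Tokyo | Multimodal Explanations by Predicting Counterfactuality in Videos | Multimodal explanation (explaining classification results in text)
2019 | Northwestern Polytechnical University (Feiping Nie's group) | Deep Multimodal Clustering for Unsupervised Audiovisual Learning | Multimodal clustering (images and audio)
2019 | Sichuan University | Deep Supervised Cross-Modal Retrieval | Cross-modal retrieval (images and text)
2019 | University of Manitoba (Canada) & Shanghai University | Cross-Modal Self-Attention Network for Referring Image Segmentation | Image segmentation from text descriptions
2019 | MIT CSAIL | Connecting Touch and Vision via Cross-Modal Prediction | Cross-modal generation (touch and vision)
2019 | City University of Hong Kong & National University of Singapore | R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network | Cross-modal retrieval (images and text)
2019 | Singapore Management University | Learning Cross-Modal Embeddings With Adversarial Networks for Cooking Recipes and Food Images | Cross-modal generation (images and text)
2019 | Columbia University (New York) | Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding | Visual understanding (images and text)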
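2018 | ETH Zurich (regarded as Europe's top university; Einstein's alma mater) | Cross-Modal Deep Variational Hand Pose Estimation | Cross-modal generation (both modalities are images)
2018 | Facebook | Stacked Latent Attention for Multimodal Reasoning | Visual question answering
2018 | Villanova University (USA) | Deep Sparse Coding for Invariant Multimodal Halle Berry Neurons | Improved neurons
2018 | Xidian University & Tencent | Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval | Cross-modal retrieval
2018 | National Technical University of Athens (Greece) | Multimodal Visual Concept Learning With Weakly Supervised Techniques | Video understanding (video and text)
2018 | Nanyang Technological University (Singapore) & Alibaba (Hangzhou) | Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval With Generative Models | Cross-modal retrieval (images and text)
2018 | Institute of Automation, Chinese Academy of Sciences | M3: Multimodal Memory Modelling for Video Captioning | Cross-modal generation (video captioning)
2018 | University of Oxford | Seeing Voices and Hearing Faces: Cross-Modal Biometric Matching | Cross-modal retrieval (voices and face images)
2018 | UC Berkeley | Multimodal Explanations: Justifying Decisions and Pointing to the Evidence | Visual understanding (visual question answering)
2018 | Siemens Healthineers | Translating and Segmenting Multimodal Medical Volumes With Cycle- and Shape-Consistency Generative Adversarial Network | Cross-modal generation and segmentation with missing data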
2017 Dual Attention Networks for Multimodal Reasoning and Matching
2017 Discriminative Bimodal Networks for Visual Localization and Detection With Natural Language Queries
2017 Missing Modalities Imputation via Cascaded Residual Autoencoder
2017 Multi-Modal Mean-Fields via Cardinality-Based Clamping
2017 Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals
2017 Instance-Aware Image and Sentence Matching With Selective Multimodal LSTM
2017 AMC: Attention guided Multi-modal Correlation Learning for Image Search
2017 Learning Cross-Modal Embeddings for Cooking Recipes and Food Images
2017 Hierarchical Multimodal Metric Learning for Multimodal Classification
2017 Deep Cross-Modal Hashing
2017 Generalized Semantic Preserving Hashing for N-Label Cross-Modal Retrieval
2017 Online Asymmetric Similarity Learning for Cross-Modal Retrieval
2017 Human Shape From Silhouettes Using Generative HKS Descriptors and Cross-Modal Neural Networks
2017 Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension
2017 Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer
2017 Learning to Extract Semantic Structure From Documents Using Multimodal Fully Convolutional Neural Networks
2017 Learning Cross-Modal Deep Representations for Robust Pedestrian Detection
2017 Deep Multimodal Representation Learning From Temporal Data
2017 GuessWhat?! Visual Object Discovery Through Multi-Modal Dialogue
2017 Amodal Detection of 3D Objects: Inferring 3D Bounding Boxes From 2D Ones in RGB-Depth Images
2017 Simultaneous Super-Resolution and Cross-Modality Synthesis of 3D Medical Images Using Weakly-Supervised Joint Convolutional Sparse Coding
2017 Joint Sequence Learning and Cross-Modality Convolution for 3D Biomedical Segmentation
2017 Fast Boosting Based Detection Using Scale Invariant Multimodal Multiresolution Filtered Features
2017 Cross-Modality Binary Code Learning via Fusion Similarity Hashing
2016 Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images
2016 Collaborative Quantization for Cross-Modal Similarity Search
2016 MDL-CW: A Multimodal Deep Learning Framework With Cross Weights
2016 Cross Modal Distillation for Supervision Transfer
2016 Learning Aligned Cross-Modal Representations From Weakly Aligned Data
2016 Discriminative Multi-Modal Feature Fusion for RGBD Indoor Scene Recognition
2016 Multimodal Spontaneous Emotion Corpus for Human Behavior Analysis
2016 Temporal Multimodal Learning in Audiovisual Speech Recognition
2016 Geospatial Correspondences for Multimodal Registration
2016 Modality and Component Aware Feature Fusion For RGB-D Scene Classification



International Conference on Computer Vision (ICCV)

Year | Author/Institution | Title | Summary | Valuable
2019 | | Robust Multi-Modality Multi-Object Tracking | Object tracking
2019 | | Drive&Act: A Multi-Modal Dataset for Fine-Grained Driver Behavior Recognition in Autonomous Vehicles | Action recognition
2019 | | Deep Joint-Semantics Reconstructing Hashing for Large-Scale Unsupervised Cross-Modal Retrieval | Multimodal retrieval
2019 | | RGB-Infrared Cross-Modality Person Re-Identification via Joint Pixel and Feature Alignment | Person ReID
2019 | | A Deep Step Pattern Representation for Multimodal Retinal Image Registration | Image registration
2019 | | Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection | Pedestrian detection
2019 CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
2019 ACMM: Aligned Cross-Modal Memory for Few-Shot Image and Sentence Matching
2019 Multi-Modality Latent Interaction Network for Visual Question Answering
2019 Multimodal Style Transfer via Graph Cuts
2019 Towards Unsupervised Image Captioning With Shared Multimodal Embeddings
2019 Unpaired Image-to-Speech Synthesis With Multimodal Information Bottleneck
2019 GAN-Tree: An Incrementally Learned Hierarchical Generative Framework for Multi-Modal Data Distributions
2019 MMAct: A Large-Scale Dataset for Cross Modal Human Action Understanding
2019 Watch, Listen and Tell: Multi-Modal Weakly Supervised Dense Event Captioning
2019 DUAL-GLOW: Conditional Flow-Based Generative Model for Modality Transfer
2017 Recurrent Multimodal Interaction for Referring Image Segmentation
2017 Multi-Modal Factorized Bilinear Pooling With Co-Attention Learning for Visual Question Answering
2017 Hierarchical Multimodal LSTM for Dense Visual-Semantic Embedding
2017 MUTAN: Multimodal Tucker Fusion for Visual Question Answering
2017 Misalignment-Robust Joint Filter for Cross-Modal Image Pairs
2017 Cross-Modal Deep Variational Hashing
2017 Learning a Recurrent Residual Fusion Network for Multimodal Matching
2017 Attention-Based Multimodal Fusion for Video Description
2017 Multimodal Gaussian Process Latent Variable Models With Harmonization
2017 A Multimodal Deep Regression Bayesian Network for Affective Video Content Analyses
2017 RGB-Infrared Cross-Modality Person Re-Identification



European Conference on Computer Vision (ECCV)

Year | Author/Institution | Title | Summary | Valuable
2018 Learnable PINs: Cross-Modal Embeddings for Person Identity
2018 Multimodal Unsupervised Image-to-image Translation
2018 Cross-Modal Hamming Hashing
2018 Cross-Modal and Hierarchical Modeling of Video and Text
2018 Deep Cross-modality Adaptation via Semantics Preserving Adversarial Learning for Sketch-based 3D Shape Retrieval
2018 MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics
2018 Attention-aware Deep Adversarial Hashing for Cross-Modal Retrieval
2018 Deep Cross-Modal Projection Learning for Image-Text Matching
2018 Multimodal Dual Attention Memory for Video Story Question Answering
2018 Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries
2018 Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking
2018 Pivot Correlational Neural Network for Multimodal Video Categorization
2018 Multi-modal Cycle-consistent Generalized Zero-Shot Learning
2018 Multimodal image alignment through a multiscale chain of neural networks with application to remote sensing



International Joint Conference on Artificial Intelligence (IJCAI)

Year | Author/Institution | Title | Summary | Valuable
2020 A Similarity Inference Metric for RGB-Infrared Cross-Modality Person Re-identification
2020 EViLBERT: Learning Task-Agnostic Multimodal Sense Embeddings
2020 Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
2020 Set and Rebase: Determining the Semantic Graph Connectivity for Unsupervised Cross-Modal Hashing
2020 Embodied Multimodal Multitask Learning
2020 Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative Adversarial Nets
2020 Modeling Dense Cross-Modal Interactions for Joint Entity-Relation Extraction
2020 Interpretable Multimodal Learning for Intelligent Regulation in Online Payment Systems
2019 Graph Convolutional Network Hashing for Cross-Modal Retrieval
2019 Solving the Satisfiability Problem of Modal Logic S5 Guided by Graph Coloring
2019 Extensible Cross-Modal Hashing
2019 Success Prediction on Crowdfunding with Multimodal Deep Learning
2019 AttnSense: Multi-level Attention Mechanism For Multimodal Human Activity Recognition
2019 Metric Learning on Healthcare Data with Incomplete Modalities
2019 DeepCU: Integrating both Common and Unique Latent Information for Multimodal Sentiment Analysis
2019 Comprehensive Semi-Supervised Multi-Modal Learning
2019 Equally-Guided Discriminative Hashing for Cross-modal Retrieval
2019 Exploring and Distilling Cross-Modal Information for Image Captioning
2019 Adapting BERT for Target-Oriented Multimodal Sentiment Classification
2019 MNN: Multimodal Attentional Neural Networks for Diagnosis Prediction
2019 Embodied Conversational AI Agents in a Multi-modal Multi-agent Competitive Dialogue
2018 Cross-Modality Person Re-Identification with Generative Adversarial Training
2018 Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss
2018 MEGAN: Mixture of Experts of Generative Adversarial Networks for Multimodal Image Generation
2018 Multi-modal Circulant Fusion for Video-to-Language and Backward
2018 Deep Learning Based Multi-modal Addressee Recognition in Visual Scenes with Utterances
2018 SDMCH: Supervised Discrete Manifold-Embedded Cross-Modal Hashing
2018 Cross-modal Bidirectional Translation via Reinforcement Learning
2018 Unsupervised Deep Hashing via Binary Latent Factor Models for Large-scale Cross-modal Retrieval
2018 Semi-Supervised Multi-Modal Learning with Incomplete Modalities
2018 Multi-modality Sensor Data Classification with Selective Attention
2018 Interpretable Recommendation via Attraction Modeling: Learning Multilevel Attractiveness over Multimodal Movie Contents
2018 Multi-modal Sentence Summarization with Modality Attention and Image Filtering
2018 Multi-modal Predicate Identification using Dynamically Learned Robot Controllers
2017 Extracting Visual Knowledge from the Web with Multimodal Learning
2017 Cross-modal Common Representation Learning by Hybrid Transfer Network
2017 Modal Consistency based Pre-Trained Multi-Model Reuse
2017 Adaptively Unified Semi-supervised Learning for Cross-Modal Retrieval
2017 Hashtag Recommendation for Multimodal Microblog Using Co-Attention Network
2017 Multimodal Linear Discriminant Analysis via Structural Sparsity
2017 DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
2017 Depression Detection via Harvesting Social Media: A Multimodal Dictionary Learning Solution
2017 Multimodal Storytelling via Generative Adversarial Imitation Learning
2017 MAT: A Multimodal Attentive Translator for Image Captioning
2017 Multi-Modal Word Synset Induction
2017 Dual Track Multimodal Automatic Learning through Human-Robot Interaction
2017 Approximating Discrete Probability Distribution of Image Emotions by Multi-Modal Features Fusion
2017 KSP: A Resolution-based Prover for Multimodal K, Abridged Report
2017 Multimodal News Article Analysis
2016 Group-Invariant Cross-Modal Subspace Learning
2016 Supervised Matrix Factorization for Cross-Modality Hashing
2016 Multi-Grained Role Labeling Based on Multi-Modality Information for Real Customer Service Telephone Conversation
2016 Multi-Modal Bayesian Embeddings for Learning Social Knowledge Graphs
2016 Incomplete Multi-Modal Visual Data Grouping
2016 Semi-Supervised Multimodal Deep Learning for RGB-D Object Recognition
2016 Learning Multi-Modal Grounded Linguistic Semantics by Playing “I Spy”
2015 Auxiliary Information Regularized Machine for Multiple Modality Feature Learning
2015 Multi-Modality Tracker Aggregation: From Generative to Discriminative
2015 Social Image Parsing by Cross-Modal Data Refinement
2015 Deep Multimodal Hashing with Orthogonal Regularization
2015 Semantic Topic Multimodal Hashing for Cross-Media Retrieval
2015 Learning to Hash on Partial Multi-Modal Data
2015 Quantized Correlation Hashing for Fast Cross-Modal Search



Association for the Advancement of Artificial Intelligence (AAAI)

Year | Author/Institution | Title | Summary | Valuable
2020 MULE: Multimodal Universal Language Embedding
2020 Infrared-Visible Cross-Modal Person Re-Identification with an X Modality
2020 Aspect-Aware Multimodal Summarization for Chinese E-commerce Products
2020 Multimodal Summarization with Guidance of Multimodal Reference
2020 Learning Cross-modal Context Graph Networks for Visual Grounding
2020 Adaptive Cross-modal Embeddings for Image-Text Alignment
2020 Cross-Modality Paired-Images Generation for RGB-Infrared Person Re-Identification
2020 Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification
2020 Learning Long- and Short-Term User Literal-Preference with Multimodal Hierarchical Transformer Network for Personalized Image Caption
2020 Towards Cross-modality Medical Image Segmentation with Online Mutual Knowledge Distillation
2020 Multimodal Structure-Consistent Image-to-Image Translation
2020 Cross-Modal Subspace Clustering via Deep Canonical Correlation Analysis
2020 Semi-supervised Multi-modal Learning with Balanced Spectral Decomposition
2020 Cross-Modality Attention Network for Temporal Inconsistent Audio-Visual Event Localization
2020 Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
2020 Privacy Enhanced Multimodal Neural Representations for Emotion Recognition
2020 Factorized Inference in Deep Markov Models for Incomplete Multimodal Time Series
2020 M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues
2020 Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval
2020 Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding
2020 Learning Relationships between Text, Audio, and Video via Deep Canonical Correlation for Multimodal Language Analysis
2020 Attention-based Multi-modal Fusion Network for Semantic Scene Completion
2020 Modality-Balanced Models for Visual Dialogue
2020 Visual Agreement Regularized Training for Multi-Modal Machine Translation
2020 ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
2020 Learning Multi-Modal Biomarker Representations via Globally Aligned Longitudinal Enrichments
2020 Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion
2020 Multimodal Interaction-Aware Trajectory Prediction in Crowded Space
2020 Mining on Heterogeneous Manifolds for Zero-shot Cross-modal Image Retrieval
2019 Cooperative Multimodal Approach to Depression Detection in Twitter
2019 Y2Seq2Seq: Cross-Modal Representation Learning for 3D Shape and Text by Joint Reconstruction and Prediction of View and Word Sequences
2019 Coupled CycleGAN: Unsupervised Hashing Network for Cross-Modal Retrieval
2019 VistaNet: Visual Aspect Attention Network for Multimodal Sentiment Analysis
2019 Multi-Interactive Memory Network for Aspect Based Multimodal Sentiment Analysis
2019 Synergistic Image and Feature Adaptation: Towards Cross-Modality Domain Adaptation for Medical Image Segmentation
2019 Joint Representation Learning for Multi-Modal Transportation Recommendation
2019 Play as You Like: Timbre-Enhanced Multi-Modal Music Style Transfer
2019 On the Time Complexity of Algorithm Selection Hyper-Heuristics for Multimodal Optimisation
2019 Disjunctive Normal Form for Multi-Agent Modal Logics Based on Logical Separability
2019 Ranking-Based Deep Cross-Modal Hashing
2019 An Efficient Approach to Informative Feature Extraction from Multimodal Data
2019 Deep Robust Unsupervised Multi-Modal Network
2019 Found in Translation: Learning Robust Joint Representations by Cyclic Translations between Modalities
2019 Unsupervised Bilingual Lexicon Induction from Mono-Lingual Multimodal Data
2019 ACM: Adaptive Cross-Modal Graph Convolutional Neural Networks for RGB-D Scene Recognition
2019 Dynamically Identifying Deep Multimodal Features for Image Privacy Prediction
2018 Synthesis of Programs from Multimodal Datasets
2018 Dual Deep Neural Networks Cross-Modal Hashing
2018 Spatiotemporal Activity Modeling Under Data Scarcity: A Graph-Regularized Cross-Modal Embedding Approach
2018 Unsupervised Generative Adversarial Cross-Modal Hashing
2018 Towards Building Large Scale Multimodal Domain-Aware Conversation Systems
2018 Multi-Modal Multi-Task Learning for Automatic Dietary Assessment
2018 Multimodal Poisson Gamma Belief Network
2018 AJILE Movement Prediction: Multimodal Deep Learning for Natural Human Neural Recordings and Video
2018 Perception Coordination Network: A Framework for Online Multi-Modal Concept Acquisition and Binding
2018 Placing Objects in Gesture Space: Toward Incremental Interpretation of Multimodal Spatial Descriptions
2018 Efficient Large-Scale Multi-Modal Classification
2018 Guiding Exploratory Behaviors for Multi-Modal Grounding of Linguistic Descriptions
2018 Learning Multi-Modal Word Representation Grounded in Visual Context
2018 Investigating Inner Properties of Multimodal Representation and Semantic Compositionality with Brain-Based Componential Semantics
2018 Learning Multimodal Word Representation via Dynamic Fusion Methods
2018 CMCGAN: A Uniform Framework for Cross-Modal Visual-Audio Mutual Generation
2018 Multimodal Keyless Attention Fusion for Video Classification
2018 Co-Attending Free-Form Regions and Detections with Multi-Modal Multiplicative Feature Embedding for Visual Question Answering
2018 Gesture Annotation with a Visual Search Engine for Multimodal Communication Research
2018 Is a Picture Worth a Thousand Words? A Deep Multi-Modal Architecture for Product Classification in E-Commerce
2018 Predicting Depression Severity by Multi-Modal Feature Engineering and Fusion
2018 Water Advisor - A Data-Driven, Multi-Modal, Contextual Assistant to Help with Water Usage Decisions
2017 Towards Better Understanding the Clothing Fashion Styles: A Multimodal Deep Learning Approach
2017 Pairwise Relationship Guided Deep Hashing for Cross-Modal Retrieval
2017 Exploring Commonality and Individuality for Multi-Modal Curriculum Learning
2017 Collective Deep Quantization for Efficient Cross-Modal Retrieval
2017 Imagined Visual Representations as Multimodal Embeddings
2017 Multimodal Fusion of EEG and Musical Features in Music-Emotion Recognition
2017 Webly-Supervised Learning of Multimodal Video Detectors
2016 Business-Aware Visual Concept Discovery from Social Media for Multimodal Business Venue Recognition
2016 Online Cross-Modal Hashing for Web Image Retrieval
2016 Co-Regularized PLSA for Multi-Modal Learning
2016 Zero-Shot Event Detection by Multimodal Distributional Semantic Embedding of Videos
2016 Look, Listen and Learn — A Multimodal LSTM for Speaker Identification
2016 Multi-Modal Learning over User-Contributed Content from Cross-Domain Social Media
2015 Cross-Modal Image Clustering via Canonical Correlation Analysis
2015 Tackling Mental Health by Integrating Unobtrusive Multimodal Sensing
2015 Cross-Modal Similarity Learning via Pairs, Preferences, and Active Supervision
2015 “Is It Rectangular?” Using I Spy as an Interactive, Game-Based Approach to Multimodal Robot Learning



IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI)

Year | Author/Institution | Title | Summary | Valuable
2019 Label Consistent Matrix Factorization Hashing for Large-Scale Cross-Modal Similarity Search
2019 Towards Personalized Image Captioning via Multimodal Memory Networks
2019 Multimodal Machine Learning: A Survey and Taxonomy
2018 Learning Compositional Sparse Bimodal Models
2018 Deep Multimodal Feature Analysis for Action Recognition in RGB+D Videos
2018 Hetero-Manifold Regularisation for Cross-Modal Hashing
2018 A Multi-Modal, Discriminative and Spatially Invariant CNN for RGB-D Object Labeling
2018 Cross-Modal Scene Networks



International Journal of Computer Vision (IJCV)

Year | Author/Institution | Title | Summary | Valuable
2020 RGB-IR Person Re-identification by Cross-Modality Similarity Preservation
2020 Self-Supervised Model Adaptation for Multimodal Semantic Segmentation
2019 Motion-Compensated Spatio-Temporal Filtering for Multi-Image and Multimodal Super-Resolution
2018 Deep Multimodal Fusion: A Hybrid Approach


