A Cross-modal Multi-task Learning Framework for Image Annotation
Authors
Liang Xie, Peng Pan, Yansheng Lu, Shixun Wang
Journal
Journal name: ACM New York, NY, USA
Publication date: 2014
Pages: 431-440
Abstract
With the advance of the Internet, multi-modal data can be easily collected from many social websites such as Wikipedia, Flickr, and YouTube. Images shared on the web are usually associated with social tags or other textual information. Although existing multi-modal methods can exploit the associated text to improve image annotation, their disadvantage is that associated text is also required for any new image to be predicted. In this paper, we propose the cross-modal multi-task learning (CMMTL) framework for image annotation. CMMTL leverages both labeled and unlabeled multi-modal data for training, and it ultimately obtains visual classifiers that can predict concepts for a single image without any associated information. CMMTL integrates graph learning, multi-task learning, and cross-modal learning into a joint framework, in which a shared subspace is learned to preserve both cross-modal correlation and concept correlation. The optimal solution of the proposed framework can be obtained by solving a generalized eigenvalue problem. We conduct comprehensive experiments on two real-world image datasets, MIR Flickr and NUS-WIDE, to evaluate the performance of the proposed framework. Experimental results demonstrate that CMMTL achieves a significant improvement over several representative methods for cross-modal image annotation.
Keywords
Cross-modal learning; Image annotation; Multi-task learning; Semi-supervised learning |
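
Note: the abstract states that the optimal solution of CMMTL is obtained by solving a generalized eigenvalue problem that yields a shared subspace preserving cross-modal correlation. The paper's full objective (with its graph-learning and multi-task terms) is not reproduced here; the snippet below is only a minimal sketch of that general recipe, assuming a simplified CCA-style cross-modal objective, synthetic data, and illustrative variable names (X for visual features, T for textual features), not the authors' actual formulation.

# Illustrative sketch only: a CCA-like shared subspace obtained from a
# generalized eigenvalue problem; variable names and objective are assumptions.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n, d_img, d_txt, k = 200, 50, 30, 10   # samples, visual dim, text dim, subspace dim

X = rng.standard_normal((n, d_img))    # visual features (e.g., image descriptors)
T = rng.standard_normal((n, d_txt))    # textual features (e.g., tag vectors)

# Cross-covariance and regularized auto-covariance blocks.
Cxt = X.T @ T / n
Cxx = X.T @ X / n + 1e-3 * np.eye(d_img)
Ctt = T.T @ T / n + 1e-3 * np.eye(d_txt)

# Symmetric generalized eigenvalue problem  A w = lambda B w; its leading
# eigenvectors give per-modality projections into a shared subspace.
A = np.block([[np.zeros((d_img, d_img)), Cxt],
              [Cxt.T, np.zeros((d_txt, d_txt))]])
B = np.block([[Cxx, np.zeros((d_img, d_txt))],
              [np.zeros((d_txt, d_img)), Ctt]])

eigvals, eigvecs = eigh(A, B)          # eigenvalues returned in ascending order
W = eigvecs[:, -k:]                    # keep the k leading directions
Wx, Wt = W[:d_img], W[d_img:]          # projection matrices for each modality

Z_img = X @ Wx                         # images mapped into the shared subspace
Z_txt = T @ Wt                         # text mapped into the same subspace
print(Z_img.shape, Z_txt.shape)        # (200, 10) (200, 10)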