Heterogeneous convolutional neural networks for visual recognition

Xiangyang Li, Luis Herranz, Shuqiang Jiang
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, CAS

• Deep convolutional neural networks (CNNs) have shown impressive performance for image recognition.
• A number of works have focused on how to improve the performance of CNNs:
  • training and testing CNNs at multiple scales [1].
  • increasing the depth and width of CNNs [2][3].
  • ensembling: averaging the results obtained from multiple instances of the same network [1][2][3].

[1] Sermanet et al. OverFeat: Integrated recognition, localization and detection using convolutional networks. In ICLR 2014.
[2] Simonyan et al. Very deep convolutional networks for large-scale image recognition. In ICLR 2015.
[3] Szegedy et al. Going deeper with convolutions. In CVPR 2015.

• Features extracted from different architectures have different characteristics:
  • different numbers of layers and filter designs (related to different receptive fields).
  • grandmother cells and distributed codes [1].

[Figure panels: D, VD]

[1] Agrawal et al. Analyzing the performance of multilayer neural networks for object recognition. In ECCV 2014.

• Here we study combining heterogeneous convolutional neural networks (HCNNs).
• Higher-level features resulting from the combination of features from heterogeneous CNNs lead to richer and more discriminative feature representations.
• The combination network strategy is flexible, and it can benefit from large datasets and from fine-tuning to specific tasks.

• Heterogeneous convolutional neural networks:
  • pre-training the base networks
  • pre-training the combination network
  • fine-tuning

• Base networks:

• Combination network:
  • (a) Two fully-connected layers
  • (b) Three fully-connected layers
  • (c) Three fully-connected layers, with one additional top layer removed from the base networks
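Option (a) can be illustrated with a minimal NumPy sketch: the high-layer features of the two base networks are concatenated and passed through two fully-connected layers. This is only a sketch under stated assumptions, not the authors' Caffe implementation. The function names, the hidden size, the ReLU and softmax choices, the random weights, and the small toy dimensions are all hypothetical (the fc7 layers of AlexNet and VGG are 4096-dimensional in practice).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def init_combination_net(in_dim, hidden_dim, num_classes, rng):
    # Two fully-connected layers with small random weights (assumed init).
    return {
        "W1": rng.normal(0.0, 0.01, size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.01, size=(hidden_dim, num_classes)),
        "b2": np.zeros(num_classes),
    }

def combine(feat_d, feat_vd, params):
    # Concatenate the features of both base networks along the feature axis,
    # then apply the two fully-connected layers.
    joint = np.concatenate([feat_d, feat_vd], axis=-1)
    hidden = relu(joint @ params["W1"] + params["b1"])
    return softmax(hidden @ params["W2"] + params["b2"])

# Toy dimensions so the example runs quickly (all values are assumptions).
rng = np.random.default_rng(0)
feat_dim, hidden_dim, num_classes, batch = 64, 32, 10, 4

# Stand-ins for the fc7 activations of D and VD on a batch of images.
feat_d = relu(rng.normal(size=(batch, feat_dim)))
feat_vd = relu(rng.normal(size=(batch, feat_dim)))

params = init_combination_net(2 * feat_dim, hidden_dim, num_classes, rng)
probs = combine(feat_d, feat_vd, params)
print(probs.shape)
```

Options (b) and (c) would follow the same pattern with a third fully-connected layer, and, for (c), with the features taken from one layer lower in each base network.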

• Training the network:
  • First, the base networks are pre-trained on a suitable large-scale dataset.
  • Second, the combination network is trained, typically with backpropagation over a large-scale dataset.
  • Third, the network is fine-tuned.
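The three steps above can be read as a schedule of which parameter groups are updated at each stage. The sketch below is a hypothetical illustration in plain Python. The group names are invented, and two points are assumptions the slides do not state: that the base networks are kept fixed while the combination network is trained, and that fine-tuning updates all groups.

```python
# Hypothetical parameter groups for the two base networks and the combination network.
ALL_GROUPS = ("base_D", "base_VD", "combination")

# Which groups receive gradient updates in each stage (assumed, see the text above).
STAGES = [
    {"name": "pretrain_base", "trainable": {"base_D", "base_VD"}},
    {"name": "pretrain_combination", "trainable": {"combination"}},
    {"name": "fine_tune", "trainable": {"base_D", "base_VD", "combination"}},
]

def frozen_groups(stage, all_groups=ALL_GROUPS):
    # Groups whose weights stay fixed during the given stage.
    return [g for g in all_groups if g not in stage["trainable"]]

for stage in STAGES:
    print(stage["name"], "trains", sorted(stage["trainable"]),
          "and freezes", frozen_groups(stage))
```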

• Experiments:
  • We conduct our experiments with the Caffe [1] framework.
  • D denotes AlexNet and VD denotes VGG. We train the models on ImageNet.
  • In addition to the network D, we trained from scratch another network D’ with the same architecture, for comparison with homogeneous architectures.
  • For testing, we use only the central 227x227 crop of each image, which is first resized to 256x256.

[1] Jia et al. Caffe: Convolutional architecture for fast feature embedding. In ACM Multimedia 2014.
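The test-time crop reduces to simple arithmetic: a 227x227 window centred in a 256x256 image starts 14 pixels from the top and left edges. A minimal plain-Python sketch follows. The helper name `central_crop_box` and the nested-list image are hypothetical stand-ins, not part of Caffe.

```python
def central_crop_box(image_size, crop_size):
    # Offset of the crop window from the top-left corner.
    offset = (image_size - crop_size) // 2
    # Box as (top, left, bottom, right), with exclusive bottom/right bounds.
    return offset, offset, offset + crop_size, offset + crop_size

box = central_crop_box(256, 227)
print(box)  # (14, 14, 241, 241)

# Toy "image": each pixel stores its own (row, column) coordinates.
image = [[(r, c) for c in range(256)] for r in range(256)]
top, left, bottom, right = box
crop = [row[left:right] for row in image[top:bottom]]
print(len(crop), len(crop[0]))  # 227 227
```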

• Combination network architecture:

Table 1. Comparison of different architectures for the combination network.

• Understanding the combination mechanism: we back-propagate the optimal output for each class.

• Visualizing features obtained from different models: (a) fc7 of D; (b) fc7 of VD

• Visualizing features obtained from different models: (b) fc7 of VD; (c) the joint network

• Object recognition performance:

Table 2. Accuracy (%) for object recognition.

• Scene recognition performance:

Table 3. Accuracy (%) for scene recognition.

• Impact of fine-tuning:

• Impact of combination architecture and fine-tuning:

[Figure panels: [D,VD]+2CL, [Dp,VDp]+2CL]

• Conclusion:
  • We studied the unusual case of combining deep networks with heterogeneous architectures, and proposed HCNNs, which combine them by concatenating high-layer features and stacking another (combination) network on top.
  • The proposed method achieves strong performance on challenging object and scene recognition benchmarks.
  • The combination network strategy is flexible, and it can benefit from large datasets and from fine-tuning to specific tasks.

Thank you!
