MultiMedia Modeling [electronic resource] : 26th International Conference, MMM 2020, Daejeon, South Korea, January 5-8, 2020, Proceedings, Part I /
edited by Yong Man Ro, Wen-Huang Cheng, Junmo Kim, Wei-Ta Chu, Peng Cui, Jung-Woo Choi, Min-Chun Hu, Wesley De Neve.
- 1st ed. 2020.
- XXIX, 844 p. 461 illus., 324 illus. in color. online resource.
- Information Systems and Applications, incl. Internet/Web, and HCI, 2946-1642 ; 11961.
Audio and Signal Processing -- Light Field Reconstruction using Dynamically Generated Filters -- Speaker-Aware Speech Emotion Recognition by Fusing Amplitude and Phase Information -- Gen-Res-Net: a Novel Generative Model for Singing Voice Separation -- A Distinct Synthesizer Convolutional TasNet for Singing Voice Separation -- Exploiting the Importance of Personalization When Selecting Music for Relaxation --
Coding and HVS -- An Efficient Encoding Method for Video Compositing in HEVC -- VHS to HDTV Video Translation using Multi-task Adversarial Learning -- Improving Just Noticeable Difference Model by Leveraging Temporal HVS Perception Characteristics -- Down-Sampling Based Video Coding with Degradation-aware Restoration-Reconstruction Deep Neural Network -- Beyond Literal Visual Modeling: Understanding Image Metaphor based on Literal-Implied Concept Mapping --
Color Processing and Art -- Deep Palette-based Color Decomposition for Image Recoloring with Aesthetic Suggestion -- On Creating Multimedia Interfaces for Hybrid Biological-Digital Art Installations -- Image Captioning based on Visual and Semantic Attention -- An Illumination Insensitive and Structure-aware Image Color Layer Decomposition Method -- CartoonRenderer: An Instance-based Multi-Style Cartoon Image Translator --
Detection and Classification -- Multi-Condition Place Generator for Robust Place Recognition -- Guided Refine-Head for Object Detection -- Towards Accurate Panel Detection in Manga: A Combined Effort of CNN and Heuristics -- Subclass Deep Neural Networks: Re-enabling Neglected Classes in Deep Network Training for Multimedia Classification -- Automatic Material Classification using Thermal Finger Impression --
Face -- Face Attributes Recognition Based on One-way Inferential Correlation between Attributes -- Eulerian Motion Based 3DCNN Architecture for Facial Micro-expression Recognition -- Emotion Recognition with Facial Landmark Heatmaps -- One-shot Face Recognition with Feature Rectification via Adversarial Learning -- Visual Sentiment Analysis by Leveraging Local Regions and Human Faces --
Image Processing -- Prediction-error Value Ordering for High-fidelity Reversible Data Hiding -- Classroom Attention Analysis Based on Multiple Euler Angles Constraint and Head Pose Estimation -- Multi-branch Body Region Alignment Network for Person Re-Identification -- DeepStroke: Understanding Glyph Structure with Semantic Segmentation and Tabu Search -- 3D Spatial Coverage Measurement of Aerial Images --
Learning and Knowledge Representation -- Instance Image Retrieval with Generative Adversarial Training -- An Effective Way to Boost Black-box Adversarial Attack -- Crowd Knowledge Enhanced Multimodal Conversational Assistant in Travel Domain -- Improved Model Structure with Cosine Margin OIM Loss For End-to-End Person Search -- Effective Barcode Hunter via Semantic Segmentation in the Wild --
Video Processing -- Wonderful Clips of Playing Basketball: A Database for Localizing Wonderful Actions -- Structural Pyramid Network for Cascaded Optical Flow Estimation -- Real-time Multiple Pedestrians Tracking in Multi-camera System -- Learning Multi-feature based Spatially Regularized and Scale Adaptive Correlation Filters for Visual Tracking -- Unsupervised Video Summarization via Attention-Driven Adversarial Learning --
Poster Papers -- Efficient HEVC Downscale Transcoding Based on Coding Unit Information Mapping -- Fine-grain level sports video search engine -- The Korean Sign Language Dataset for Action Recognition -- SEE-LPR: A Semantic Segmentation based End-to-End System for Unconstrained License Plate Detection and Recognition -- Action Co-Localization in an Untrimmed Video by Graph Neural Networks -- A Novel Attention Enhanced Dense Network For Image Super-Resolution -- Marine Biometric Recognition Algorithm Based on YOLOv3-GAN Network -- Multi-scale Spatial Location Preference for Semantic Segmentation -- HRTF Representation with Convolutional Auto-Encoder -- Unsupervised Feature Propagation for Video Object Detection using Generative Adversarial Networks -- OmniEyes: Analysis and Synthesis of Artistically Painted Eyes -- LDSNE: Learning Structural Network Embeddings by Encoding Local Distances -- FurcaNeXt: End-to-End Monaural Speech Separation with Dynamic Gated Dilated Temporal Convolutional Networks -- Multi-step Coding Structure of Spatial Audio Object Coding -- Thermal Face Recognition based on Transformation by Residual U-Net and Pixel Shuffle Upsampling -- K-SVD Based Point Cloud Coding for RGB-D Video Compression Using 3D Super-point Clustering -- Resolution Booster: Global Structure Preserving Stitching Method For Ultra-High Resolution Image Translation -- Cross Fusion for Egocentric Interactive Action Recognition -- Improving Brain Tumor Segmentation with Dilated Pseudo-3D Convolution and Multi-direction Fusion -- Texture-based Fast CU Size Decision and Intra Mode Decision Algorithm for VVC -- An Efficient Hierarchical Near-Duplicate Video Detection Algorithm Based on Deep Semantic Features -- Meta Transfer Learning for Adaptive Vehicle Tracking in UAV Videos -- Adversarial Query-by-Image Video Retrieval Based on Attention Mechanism -- Joint Sketch-Attribute Learning for Fine-Grained Face Synthesis -- High Accuracy Perceptual video hashing via Low-Rank decomposition and DWT -- HMM-Based Person Re-Identification in Large-scale Open Scenario -- No Reference Image Quality Assessment by Information Decomposition.
The two-volume set LNCS 11961 and 11962 constitutes the thoroughly refereed proceedings of the 26th International Conference on MultiMedia Modeling, MMM 2020, held in Daejeon, South Korea, in January 2020. Of the 171 submitted full research papers, 40 papers were selected for oral presentation and 46 for poster presentation; 28 special session papers were selected for oral presentation and 8 for poster presentation; in addition, 9 demonstration papers and 6 papers for the Video Browser Showdown 2020 were accepted. The papers of LNCS 11961 are organized in the following topical sections: audio and signal processing; coding and HVS; color processing and art; detection and classification; face; image processing; learning and knowledge representation; video processing; and poster papers. The papers of LNCS 11962 are organized in the following topical sections: poster papers; AI-powered 3D vision; multimedia analytics: perspectives, tools and applications; multimedia datasets for repeatable experimentation; multi-modal affective computing of large-scale multimedia data; multimedia and multimodal analytics in the medical domain and pervasive environments; intelligent multimedia security; demo papers; and VBS papers.
9783030377311
10.1007/978-3-030-37731-1 doi
Multimedia systems.
Computer vision.
Artificial intelligence.
Application software.
User interfaces (Computer systems).
Human-computer interaction.
Multimedia Information Systems.
Computer Vision.
Artificial Intelligence.
Computer and Information Systems Applications.
User Interfaces and Human Computer Interaction.
QA76.575
006.7