Research on an Improved DeepLab V3+ Object Segmentation Method Based on Contextual Semantic Aggregation

許世昌; 蔡貫傑

Title	基於上下文語義聚合的改進型 DeepLab V3+ 對象分割方法研究
Parallel title	Research on an Improved DeepLab V3+ Object Segmentation Method Based on Contextual Semantic Aggregation
Authors	許世昌 (Shih-Chang Shei), 蔡貫傑
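Chinese abstract (translated)	This paper studies object segmentation methods based on contextual semantic aggregation. It compares DeepLab V2 and DeepLab V3+ on the VOC2012 dataset and modifies DeepLab V3+ to reduce its parameter count while maintaining a certain level of segmentation accuracy. The goal of object segmentation is to label the target objects in an image as foreground and all remaining pixels as background. In recent years, advances in deep learning have driven significant progress in object segmentation. Methods based on convolutional neural networks (CNNs) perform well, for example DeepLab V2 and DeepLab V3+, which builds on it. Both use techniques such as atrous convolution and multi-scale pooling to capture object information at different scales, and DeepLab V3+ additionally introduces a multi-scale atrous convolution module to further improve segmentation accuracy. Comparative experiments on VOC2012 show that DeepLab V3+ achieves higher segmentation accuracy than DeepLab V2. However, DeepLab V3+ also has more parameters and therefore needs modification to improve its efficiency. To this end, the paper proposes an improved method based on MobileNetV2 with two main changes: (1) the sampling-rate combination of the Atrous Spatial Pyramid Pooling (ASPP) structure is changed and the CBAM hybrid attention mechanism is added; (2) a parallel self-attention mechanism is added to the decoder of DeepLab V3+, and feature fusion is reworked into a multi-branch feature fusion module. With these changes, the parameter count of DeepLab V3+ is reduced while a certain level of segmentation accuracy is maintained. The experimental results show that the improved DeepLab V3+ method is efficient and segments accurately, and can play an important role in object segmentation tasks. The work is relevant to improving both the efficiency and the accuracy of object segmentation, and offers a reference for other CNN-based object segmentation tasks.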
English abstract	This article examines object segmentation methods based on contextual semantic aggregation. It compares DeepLab V2 and DeepLab V3+ on the VOC2012 dataset and modifies DeepLab V3+ to reduce its parameter count while maintaining a certain level of segmentation accuracy. Object segmentation is a crucial task in computer vision that aims to separate the target object from the background in an image. Advances in deep learning have recently led to significant improvements in object segmentation, especially with convolutional neural network (CNN)-based methods such as DeepLab V2 and DeepLab V3+. These methods employ techniques such as atrous convolution and multi-scale pooling to capture object information at various scales. DeepLab V3+ additionally introduces a multi-scale atrous convolution module to further improve segmentation accuracy. However, DeepLab V3+ has a higher parameter count and needs modification to improve its efficiency. We therefore propose an improved method based on MobileNetV2, which adjusts the sampling-rate combination of the Atrous Spatial Pyramid Pooling (ASPP) module and incorporates the Convolutional Block Attention Module (CBAM) mixed attention mechanism. In addition, we add parallel self-attention mechanisms to the decoder of the DeepLab V3+ structure and rework feature fusion into a multi-branch feature fusion module. With these modifications, we reduce the parameter count of DeepLab V3+ while maintaining a certain level of segmentation accuracy. The experimental results indicate that the improved DeepLab V3+ method is both efficient and accurate, making it effective for object segmentation tasks.
Pages	001-017
Keywords	Context semantic aggregation, DeepLab V3+, CBAM attention mechanism, Self-attention mechanism, Multi-branch feature fusion
Journal	理工研究國際期刊
Issue	202604 (Vol. 16, No. 1)
Publisher	國立臺南大學 (National University of Tainan)
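
The abstract refers to atrous convolution and to changing the sampling-rate combination of the ASPP module. As a rough illustration of why the rates matter, the sketch below computes the effective kernel size of an atrous convolution, applies a tiny 1-D atrous convolution, and counts the weights of a 2-D convolution. It is a minimal sketch under stated assumptions: the function names are hypothetical, the rates (6, 12, 18) are the commonly used DeepLab V3+ defaults rather than the modified combination proposed in the paper (which the abstract does not state), and the channel sizes (320 in, 256 out) are illustrative only.

```python
def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Span covered by a kernel whose taps are spaced `rate` samples apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def atrous_conv1d(signal, kernel, rate):
    """1-D atrous convolution (cross-correlation form), stride 1, no padding."""
    span = effective_kernel_size(len(kernel), rate)
    output = []
    for start in range(len(signal) - span + 1):
        # Each kernel tap reads the input `rate` positions after the previous tap.
        output.append(sum(w * signal[start + i * rate] for i, w in enumerate(kernel)))
    return output


def conv2d_params(c_in: int, c_out: int, kernel_size: int) -> int:
    """Weight count of a square 2-D convolution without bias.

    The atrous rate does not appear here: it changes the receptive field,
    not the number of weights.
    """
    return kernel_size * kernel_size * c_in * c_out


if __name__ == "__main__":
    # Assumed rates: the usual DeepLab V3+ defaults, not the paper's values.
    for rate in (1, 6, 12, 18):
        print(
            f"rate={rate:2d}  effective kernel={effective_kernel_size(3, rate):2d}  "
            f"weights={conv2d_params(320, 256, 3)}"
        )
    print(atrous_conv1d([1, 2, 3, 4, 5], [1, 0, -1], rate=2))
```

Every rate yields the same weight count for a given branch, so changing the rate combination alone only changes how much context each ASPP branch sees. Any reduction in parameters would have to come from other choices, such as the backbone or the number and width of branches.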