Title
文本意圖的多模態分析:以Instagram為例
Parallel title
An Analysis of Multimodal Document Intent in Instagram Posts
Authors Ying-Yu Chen (陳膺宇); Shu-Kai Hsieh
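Chinese abstract (English translation)
Today, social media (e.g., Instagram) increasingly combines image and text representations, constructing a new "multimodal" mode of communication. Analyzing multimodal relationships with computational methods has become a popular topic; however, no study has yet analyzed document intent and image-text relationships in the multimodal image-caption pairs posted by Taiwan's top 100 influencers. Using multimodal representations of text and images, this study adopts the image-text relationship taxonomy of Kruk et al. (2019) (contextual relationship / semiotic relationship / author's intent), proposes new image-text representations for these three taxonomies (Sentence-BERT and image embedding), and uses computational models (Random Forest, Decision Tree Classifier) to classify the three image-text relationships. The results show an accuracy as high as 86.23%.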
English abstract
Today, much of the content on social media (e.g., Instagram) combines visual and textual material in the same message, creating a modern way of communicating. Multimodal messages are essential to almost every type of social interaction, especially for social multimedia content online. Effective computational approaches for understanding documents with multiple modalities are therefore needed to identify the relationships between those modalities. This study extends recent advances in author's intent classification by proposing an approach based on image-caption pairs (ICPs). Several machine learning algorithms, such as Decision Tree Classifier (DTC) and Random Forest (RF), and encoders, such as Sentence-BERT and image embedding, are applied to classify the relationships between modalities: 1) contextual relationship, 2) semiotic relationship, and 3) author's intent. The study reports two findings. First, although prior studies hold that incorporating the two synergistic modalities in a combined model improves accuracy in the relationship classification task, this study found that a simple fusion strategy that linearly projects encoded vectors from both modalities into the same embedding space may not substantially improve on the performance of a single modality. The results suggest that more effort is needed for text and image to complement each other. Second, we show that these text-image relationships can be classified with high accuracy (86.23%) using the text modality alone. In sum, this study may help demonstrate a computational approach to accessing multimodal documents and provide a better understanding of how to classify the relationships between modalities.
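The "simple fusion strategy" named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record does not give the paper's dimensions, weights, or whether the projected vectors are summed or concatenated. The sketch uses random placeholder vectors in place of real Sentence-BERT and image embeddings, untrained random projection matrices, and element-wise addition in the shared space. The helper names (`make_projection`, `project`, `fuse`) are hypothetical.

```python
import random


def make_projection(in_dim, out_dim, rng):
    """Return an out_dim x in_dim matrix of small random weights (untrained placeholder)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def project(matrix, vector):
    """Matrix-vector product: linearly map a vector into the shared space."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def fuse(text_vec, image_vec, w_text, w_image):
    """Project both modalities into the shared space and add them element-wise.

    Addition is an assumption; concatenating the two projections is another common choice.
    """
    t = project(w_text, text_vec)
    i = project(w_image, image_vec)
    return [a + b for a, b in zip(t, i)]


rng = random.Random(0)
TEXT_DIM, IMAGE_DIM, SHARED_DIM = 8, 12, 4  # toy sizes; real encoder outputs are far larger

# Stand-ins for a caption embedding and an image embedding of one image-caption pair.
text_vec = [rng.random() for _ in range(TEXT_DIM)]
image_vec = [rng.random() for _ in range(IMAGE_DIM)]

w_text = make_projection(TEXT_DIM, SHARED_DIM, rng)
w_image = make_projection(IMAGE_DIM, SHARED_DIM, rng)

fused = fuse(text_vec, image_vec, w_text, w_image)
print(len(fused))  # length of the fused vector equals SHARED_DIM
```

In a full pipeline, a fused vector like this would be the feature input to a classifier such as a Random Forest or Decision Tree; the text-only setting mentioned in the abstract corresponds to passing the caption embedding to the classifier without the image branch.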
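Pages 1-15
Keywords multimodal document analysis (多模態文本分析); natural language processing (自然語言處理); decision tree (決策樹); random forest (隨機森林); multimodal documents understanding; contextual relationship; semiotic relationship; author's intent; Natural Language Processing; Decision Tree Classifier; Random Forest; Sentence-BERT; image embedding
Journal ROCLING論文集 (ROCLING Proceedings)
Issue 2020
Publisher 中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)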
 
