Semiotic Analysis of Texts and Interpretation of Sign Systems in the Digital Era: Sentiment-analysis Using the KNIME Platform
https://doi.org/10.32603/2412-8562-2025-11-4-121-138
Abstract
Introduction. The aim of the article is to study the feasibility of integrating semiotic approaches and machine learning methods for sentiment analysis, a popular area of linguistics at the interface with computer science and data analysis. The novelty of the paper lies in the attempt to interpret machine learning results obtained from review texts as sign systems, revealing their lexical, syntactic, and pragmatic characteristics.
Methodology and sources. The research is based on the fundamental principles of semantics, syntactics, and pragmatics, as well as on modern approaches to automating the processing of textual information and applying mathematical methods to substantiate speech phenomena. The research material is a freely available data set of film reviews from the IMDB platform. KNIME, a data analysis system that follows the no-code paradigm, is used as the automation tool. The paper presents a workflow comprising data preprocessing, construction of classification models, and evaluation of their effectiveness, and proposes a linguistic interpretation of the errors made in automatic review classification.
Results and discussion. The results demonstrate high classification accuracy (up to 92.0%) and the ability of the algorithms to identify key lexical and syntactic markers that form the emotional colouring of the text. The study extends the boundaries of traditional semiotics by integrating methods of machine learning and big data analysis, and emphasises the practical value of using KNIME in natural language processing tasks.
Conclusion. This paper provides a detailed description of an algorithm for automating sentiment analysis of film reviews, taking into account the advantages and potential challenges of this approach for text interpretation. Prospects for further research include applying the proposed methods to multilingual corpora and analysing multimodal data, which opens up new opportunities for studying sign systems in digital communication. The proposed methodology can be applied in the commercial sphere to identify user attitudes to goods, services, applications, books, films, etc., which increases interest in linguistic services such as sentiment analysis.
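The three workflow stages named in the methodology (preprocessing, building a classification model, evaluating it) can be illustrated with a minimal sketch. It is not the authors' KNIME workflow: the study uses KNIME nodes and an IMDB data set, whereas the code below assumes a multinomial Naive Bayes classifier, a handful of invented toy reviews, and a hypothetical stop-word list, all chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy reviews (assumption); the study itself uses IMDB film reviews.
TRAIN = [
    ("a wonderful film with brilliant acting", "pos"),
    ("great story and a wonderful cast", "pos"),
    ("brilliant direction and great music", "pos"),
    ("a boring film with terrible acting", "neg"),
    ("terrible plot and a boring cast", "neg"),
    ("awful direction and boring music", "neg"),
]
TEST = [
    ("wonderful music and brilliant story", "pos"),
    ("awful plot and terrible acting", "neg"),
]

# Hypothetical stop-word list (assumption), kept tiny on purpose.
STOP_WORDS = {"a", "and", "with", "the"}


def preprocess(text):
    """Stage 1: lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def train(samples):
    """Stage 2: collect per-class token counts for multinomial Naive Bayes."""
    word_counts = {}
    class_counts = Counter()
    vocab = set()
    for text, label in samples:
        tokens = preprocess(text)
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability for the text."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for token in preprocess(text):
            # Laplace (add-one) smoothing so unseen tokens do not zero out a class.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, samples):
    """Stage 3: share of samples whose predicted label matches the gold label."""
    correct = sum(predict(model, text) == label for text, label in samples)
    return correct / len(samples)


model = train(TRAIN)
print(accuracy(model, TEST))
```

Misclassified reviews can be collected with the same `predict` function, which is the point where a linguistic interpretation of classification errors, as proposed in the paper, would begin.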
About the Authors
E. V. Isaeva
Russian Federation
Ekaterina V. Isaeva – Cand. Sci. (Philology, 2013), Docent (2019), Head of the Department of English for Professional Communication
15 Bukireva str., Perm 614068.
The author of 86 scientific publications.
Area of expertise: discursive linguistics, cognitive term science, text mining, digital linguistics.
S. V. Semenov
Russian Federation
Sergey V. Semenov – Student (4th year, Linguistics)
15 Bukireva str., Perm 614068.
Area of expertise: linguistics, translation studies, sentiment analysis.
D. L. Chernykh
Russian Federation
Denis L. Chernykh – Student (4th year, Linguistics)
15 Bukireva str., Perm 614068.
Area of expertise: digital linguistics, sentiment analysis.
A. V. Gudovshikov
Russian Federation
Alexei V. Gudovshikov – Student (4th year, Linguistics)
15 Bukireva str., Perm 614068.
Area of expertise: linguistics, translation studies, sentiment analysis, text interpretation.
For citations:
Isaeva E.V., Semenov S.V., Chernykh D.L., Gudovshikov A.V. Semiotic Analysis of Texts and Interpretation of Sign Systems in the Digital Era: Sentiment-analysis Using the KNIME Platform. Discourse. 2025;11(4):121-138. (In Russ.) https://doi.org/10.32603/2412-8562-2025-11-4-121-138