{"id":20,"date":"2022-04-29T14:32:54","date_gmt":"2022-04-29T11:32:54","guid":{"rendered":"https:\/\/traces.gate-ai.eu\/?page_id=20"},"modified":"2023-03-14T13:40:47","modified_gmt":"2023-03-14T10:40:47","slug":"results","status":"publish","type":"page","link":"https:\/\/traces.gate-ai.eu\/?page_id=20","title":{"rendered":"Results"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Videos<\/h2>\n\n\n\n<p>You can watch a short video about the project&#8217;s results at this <a href=\"https:\/\/www.youtube.com\/watch?v=rK67LgUfIEo\">link<\/a>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.youtube.com\/watch?v=rK67LgUfIEo\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"625\" src=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/image-5-1024x625.png\" alt=\"\" class=\"wp-image-685\" srcset=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/image-5-1024x625.png 1024w, https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/image-5-300x183.png 300w, https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/image-5-768x469.png 768w, https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/image-5-1536x938.png 1536w, https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/image-5.png 1752w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Models &amp; Tools<\/h2>\n\n\n\n<p>We implemented and trained three machine learning models: one that detects Bulgarian texts generated by the GPT-2 and ChatGPT language models, one that detects disinformation, and one that detects untrue information. All three models were trained on Bulgarian-language data.<\/p>\n\n\n\n<p>We also created a tool that uses the three models. The tool has three versions. 
All three versions use the same model for disinformation detection but different models for textual deepfake detection; only the latest version also includes a model for detecting untrue information.<\/p>\n\n\n\n<p><em>Current release:<\/em><\/p>\n\n\n\n<p><strong><em>TRACES-tool v1.2<\/em> :<\/strong> This version can detect texts generated with the GPT-2 and ChatGPT models, disinformation, and untrue information. If a text is recognized both as generated by a language model and as containing untrue information or disinformation, it can be considered a potential textual deepfake. The model for detecting untrue information has an F1-Score of 0.96.<\/p>\n\n\n\n<p><em>Previous releases:<\/em><\/p>\n\n\n\n<p><strong><em>TRACES-tool v1.1<\/em> :<\/strong> This version can detect texts generated with the GPT-2 and ChatGPT models, as well as disinformation. We used 10-fold cross-validation and achieved an F1-Score of 0.88.<\/p>\n\n\n\n<p><strong><em>TRACES-tool v1.0<\/em> :<\/strong> The deepfake model can detect texts generated by GPT-2 with an F1-Score of 0.9339.<\/p>\n\n\n\n<p>To test the tool, please follow the <a rel=\"noreferrer noopener\" href=\"https:\/\/huggingface.co\/spaces\/TRACES\/traces-tool\" target=\"_blank\"><strong>link to the DEMO<\/strong><\/a>. The screenshot below shows the tool with an input text in Bulgarian, which translates into English as &#8220;I have no idea where the cookies disappeared&#8221;.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/image-3.png\" alt=\"\" class=\"wp-image-669\" width=\"506\" height=\"592\" srcset=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/image-3.png 646w, https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/image-3-256x300.png 256w\" sizes=\"auto, (max-width: 506px) 100vw, 506px\" 
\/><\/figure>\n\n\n\n<p><em><strong>Note:<\/strong> The code of the tool contains an unpublished third-party model for detecting disinformation that will be released in the next few months.<\/em> <em>For this reason, the tool cannot yet be publicly released, but it is available for testing.<\/em><\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>We have publicly released two machine learning models (the ML models with the highest F1-Scores): <\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Machine learning model for identifying Bulgarian texts automatically generated by the GPT-2 and ChatGPT language models, achieving an F1-Score of 0.88. The model can be accessed under legal restrictions on <a href=\"https:\/\/zenodo.org\/record\/7713672\">Zenodo<\/a>.<\/li>\n\n\n\n<li>Machine learning model for identifying Bulgarian texts potentially containing untrue information, achieving an F1-Score of 0.96. The model can be accessed under strict legal conditions on <a href=\"https:\/\/zenodo.org\/record\/7713572\">Zenodo<\/a>.<\/li>\n<\/ol>\n\n\n\n<p><\/p>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Datasets<\/h2>\n\n\n\n<p><a href=\"https:\/\/zenodo.org\/record\/7614247\">Bulgarian Twitter dataset on Covid-19, annotated with linguistic markers of lies<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/zenodo.org\/record\/7614294\">Bulgarian Telegram dataset, annotated with linguistic markers of lies<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/zenodo.org\/record\/7359055\">Bulgarian Twitter dataset on lies and manipulation, annotated with linguistic markers of lies<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/zenodo.org\/record\/7614357\">Bulgarian Twitter Dataset on Famous Bulgarian Political Cases of Suspected Lies, Annotated with Linguistic Markers of Lies<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/zenodo.org\/record\/7359055\">Bulgarian sentiment analysis Twitter dataset<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/zenodo.org\/record\/7657029\">Python Scripts for Downloading from Twitter, 
Cleaning, and Annotating with the Linguistic Markers of Deception New (Bulgarian) Text Datasets<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/zenodo.org\/record\/7656905\">Hierarchical Classification of Categories of Linguistic and Psycholinguistic Markers of Deception with Bulgarian Expression Lists for Disinformation Detection<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/zenodo.org\/record\/7702054#.ZAYBWT1BxPY\" target=\"_blank\" rel=\"noreferrer noopener\">TRACES Telegram and Twitter Dataset with Bulgarian Journalists Manual Annotations of True\/Untrue and Disinformation\/Not and Automatic Annotations for Markers of Lies<\/a><\/p>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Deliverables<\/h2>\n\n\n\n<p><strong>Data Management Plan<\/strong>:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><a href=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2022\/11\/AI4Media-OC1-Data-Management-Plan-V3.0.pdf\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/deliverable_1-724x1024.png\" alt=\"\" class=\"wp-image-676\" width=\"438\" height=\"619\" srcset=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/deliverable_1-724x1024.png 724w, https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/deliverable_1-212x300.png 212w, https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/deliverable_1-768x1086.png 768w, https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/deliverable_1-1086x1536.png 1086w, https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/deliverable_1-1448x2048.png 1448w\" sizes=\"auto, (max-width: 438px) 100vw, 438px\" \/><\/a><figcaption class=\"wp-element-caption\"> <\/figcaption><\/figure>\n\n\n\n<p>The Data Management Plan can be downloaded <a 
href=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2022\/11\/AI4Media-OC1-Data-Management-Plan-V3.0.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<p><strong>Annotation Guidelines<\/strong>:<\/p>\n\n\n\n<p>Manual annotation guidelines for journalists on annotating texts with the categories &#8220;true\/untrue&#8221; and &#8220;disinformation\/no disinformation&#8221;. The guidelines also contain instructions on how to use an annotation tool.<\/p>\n\n\n\n<p>Bulgarian version of the guidelines:<\/p>\n\n\n<a href=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/TRACES_Dts10_Manual_Annotation_Guidelines_for_Journalists_BG_1.0.pdf\" class=\"pdfemb-viewer\" style=\"\" data-width=\"max\" data-height=\"max\"  data-toolbar=\"top\" data-toolbar-fixed=\"off\">TRACES_Dts10_Manual_Annotation_Guidelines_for_Journalists_BG_1.0<br\/><\/a>\n<p class=\"wp-block-pdfemb-pdf-embedder-viewer\"><\/p>\n\n\n\n<p>English version of the guidelines (with Bulgarian examples):<\/p>\n\n\n<a href=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2023\/03\/TRACES_Dts10_Manual_Annotation_Guidelines_for_Journalists_EN_1.0.pdf\" class=\"pdfemb-viewer\" style=\"\" data-width=\"max\" data-height=\"max\"  data-toolbar=\"top\" data-toolbar-fixed=\"off\">TRACES_Dts10_Manual_Annotation_Guidelines_for_Journalists_EN_1.0<br\/><\/a>\n<p class=\"wp-block-pdfemb-pdf-embedder-viewer\"><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Research Publications<\/h2>\n\n\n\n<p><strong>1 research article accepted at the Language Technologies Conference 2023 (LTC&#8217;23):<\/strong><\/p>\n\n\n\n<p>Irina Temnikova, Silvia Gargova, Ruslana Margova, Veneta Kireva, Ivo Dzhumerov, Tsvetelina Stefanova, and Hristiana Krasteva (2023, Forthcoming). <strong>New Bulgarian Resources for Detecting Disinformation.<\/strong> 10th Language &amp; Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics. 
Pozna\u0144, Poland.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>1 research article accepted at the conference Computational Linguistics in Bulgaria (CLIB&#8217;22): <\/strong><\/p>\n\n\n\n<p>Silvia Gargova, Irina Temnikova, Ivo Dzhumerov, Hristiana Nikolaeva&nbsp;(September 2022). <strong>Evaluation of Off-the-Shelf Language Identification Tools on Bulgarian Social Media Posts<\/strong>. In Proceedings of Computational Linguistics in Bulgaria (<a href=\"https:\/\/dcl.bas.bg\/clib\/program-accepted-papers\/\" target=\"_blank\" rel=\"noreferrer noopener\">CLIB&#8217;22<\/a>).<\/p>\n\n\n\n<p>The article addresses the lack of up-to-date information about which off-the-shelf language identification tools work best for Bulgarian social media posts. The paper reports findings that are useful both for Bulgarian users, especially those with no programming skills, and for researchers working on other languages. For more information, read our paper once it is uploaded to the conference <a rel=\"noreferrer noopener\" href=\"https:\/\/dcl.bas.bg\/clib\/programme\/\" data-type=\"URL\" data-id=\"https:\/\/dcl.bas.bg\/clib\/programme\/\" target=\"_blank\">website<\/a>, or follow our presentation (in English) online on September 8th or 9th, 2022 (participation is free, but <a rel=\"noreferrer noopener\" href=\"https:\/\/dcl.bas.bg\/clib\/registration-2\/\" data-type=\"URL\" data-id=\"https:\/\/dcl.bas.bg\/clib\/registration-2\/\" target=\"_blank\">registration<\/a> is compulsory).<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Presentations<\/h2>\n\n\n\n<p>Presentation of the project at the AI4Media Open Call 1 funded projects kick-off meeting:<\/p>\n\n\n<a href=\"https:\/\/traces.gate-ai.eu\/wp-content\/uploads\/2022\/05\/TRACES-Project-Presentation-website.pdf\" class=\"pdfemb-viewer\" style=\"\" data-width=\"max\" data-height=\"max\"  data-toolbar=\"both\" data-toolbar-fixed=\"off\">TRACES-Project-Presentation-website<br\/><\/a>\n<p 
class=\"wp-block-pdfemb-pdf-embedder-viewer\"><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Dissemination events<\/h2>\n\n\n\n<p>The project was presented at the <a href=\"https:\/\/gate-ai.eu\/en\/news\/gates-week-of-events-united-against-disinformation\/\">Open doors event<\/a> of the GATE Institute&#8217;s Disinformation (detection) research group on January 23, 2023.<\/p>\n\n\n\n<p>The project was presented in Bulgarian at an interactive game\/presentation event during the European Researchers&#8217; Night on September 30, 2023.<\/p>\n\n\n\n<p>The project was promoted at a two-day educational event on <a href=\"https:\/\/gate-ai.eu\/en\/news\/gate-trains-librarians-to-detect-disinformation-2\/\">information literacy for librarians<\/a> in the Bulgarian town of Pazardzhik.<\/p>\n\n\n\n<p>Information about the project was shared at the <a href=\"https:\/\/ibl.bas.bg\/en\/mezhdunarodna-yubileyna-konferentsiya-na-instituta-za-balgarski-ezik-vprof-lyubomir-andreytchinv-2022\/\">International Jubilee Conference of the Institute for Bulgarian Language 2022<\/a>, which took place on 15-17 May 2022 in Sofia.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Media articles<\/h2>\n\n\n\n<p>A <a href=\"https:\/\/it.dir.bg\/tehnologii\/balgari-razrabotvat-softuer-za-otkrivane-na-diypfeyk-tekstove\">news article<\/a> in Bulgarian about the TRACES team&#8217;s work on recognizing deepfake texts appeared on March 10, 2023.<\/p>\n\n\n\n<p>The project members Irina Temnikova and Silvia Gargova were interviewed by two Bulgarian radio stations and one TV channel in January 2023.<\/p>\n\n\n\n<p>A news article in Bulgarian, entirely dedicated to the TRACES project, has appeared in an <a href=\"https:\/\/it.dir.bg\/komunikatsii\/inovativen-proekt-otkriva-sledi-ot-dezinformatsiya-na-balgarski\">online Bulgarian media outlet<\/a>. 
<\/p>\n\n\n\n<p>TRACES has been mentioned in an <a href=\"https:\/\/www.24chasa.bg\/region\/article\/11557641\">article<\/a> in the major Bulgarian daily newspaper <a href=\"https:\/\/www.24chasa.bg\/\">24chasa<\/a> (in Bulgarian).<\/p>\n\n\n\n<p>TRACES has also been mentioned in an <a href=\"https:\/\/www.marica.bg\/region\/pazardjik\/specialisti-shte-pomagat-na-horata-da-ne-se-podvejdat-po-dezinformaciq\">article<\/a> in the regional Plovdiv daily newspaper <a href=\"https:\/\/www.marica.bg\/\" data-type=\"URL\" data-id=\"https:\/\/www.marica.bg\/\">Marica<\/a> (in Bulgarian).<\/p>\n\n\n\n<p>The project has also been mentioned in an <a href=\"https:\/\/www.geomedia.bg\/beyond-geodesy\/bibliotekari-shte-sadeystvat-grazhdanite-da-ne-se-podvezhdat-po-dezinformatsiya\/\">article<\/a> in the geodesy magazine <a href=\"https:\/\/www.geomedia.bg\/en\/homepage-en\/\">Geomedia<\/a> (in Bulgarian).<\/p>\n\n\n\n<p>Information about the project has been shared in a <a href=\"https:\/\/libpz.eu\/%D0%B1%D0%B8%D0%B1%D0%BB%D0%B8%D0%BE%D1%82%D0%B5%D1%87%D0%BD%D0%B8%D1%82%D0%B5-%D1%81%D0%BF%D0%B5%D1%86%D0%B8%D0%B0%D0%BB%D0%B8%D1%81%D1%82%D0%B8-%D0%B2-%D0%BF%D0%B0%D0%B7%D0%B0%D1%80%D0%B4%D0%B6\/\">news article<\/a> on the <a href=\"https:\/\/libpz.eu\/\">website<\/a> of the Regional Library Nikola Furnadzhiev.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Videos You can watch a short video about the project&#8217;s results on this link. Models &amp; Tools We implemented and trained three machine learning models. One that can detect Bulgarian texts, generated by the GPT-2 and ChatGPT language models, one for detecting disinformation, and one for detecting untrue information. 
All three models are trained with&hellip;&nbsp;<a href=\"https:\/\/traces.gate-ai.eu\/?page_id=20\" class=\"\" rel=\"bookmark\">Read More &raquo;<span class=\"screen-reader-text\">Results<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":347,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"neve_meta_sidebar":"","neve_meta_container":"","neve_meta_enable_content_width":"","neve_meta_content_width":0,"neve_meta_title_alignment":"","neve_meta_author_avatar":"","neve_post_elements_order":"","neve_meta_disable_header":"","neve_meta_disable_footer":"","neve_meta_disable_title":"","footnotes":""},"class_list":["post-20","page","type-page","status-publish","has-post-thumbnail","hentry"],"_links":{"self":[{"href":"https:\/\/traces.gate-ai.eu\/index.php?rest_route=\/wp\/v2\/pages\/20","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/traces.gate-ai.eu\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/traces.gate-ai.eu\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/traces.gate-ai.eu\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/traces.gate-ai.eu\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=20"}],"version-history":[{"count":74,"href":"https:\/\/traces.gate-ai.eu\/index.php?rest_route=\/wp\/v2\/pages\/20\/revisions"}],"predecessor-version":[{"id":690,"href":"https:\/\/traces.gate-ai.eu\/index.php?rest_route=\/wp\/v2\/pages\/20\/revisions\/690"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/traces.gate-ai.eu\/index.php?rest_route=\/wp\/v2\/media\/347"}],"wp:attachment":[{"href":"https:\/\/traces.gate-ai.eu\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=20"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}