Linguistic steganalysis using the features derived from synonym frequency

Xiang, Lingyun; Sun, Xingming; Luo, Gang; Xia, Bin

doi:10.1007/s11042-012-1313-8

Published: 18 December 2012

Multimedia Tools and Applications, volume 71, pages 1893–1911 (2014)

Abstract

A linguistic steganalysis method is proposed to detect synonym substitution (SS) steganography, which embeds a secret message in a text by substituting words with their synonyms. First, the attribute pair of a synonym is introduced to represent two things: its position in its synonym set, ordered by descending frequency, and the number of its synonyms. As a result of synonym substitutions, the number of high-frequency attribute pairs may be reduced while the number of low-frequency attribute pairs increases. Based on a theoretical analysis of how SS steganography changes the statistical characteristics of attribute pairs, a feature vector built from the differences between the relative frequencies of different attribute pairs is used to detect the secret message. Finally, the impact of synonym coding strategies on the extracted feature vector is analyzed. Experimental results demonstrate that the proposed linguistic steganalysis method achieves better detection performance than previous methods.
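The attribute-pair idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact feature definition, so the toy synsets, their frequencies, the use of synset size as the second attribute, the assumption that each word belongs to a single synset, and the helper names `build_attribute_pairs`, `relative_frequencies`, and `high_low_difference` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy synsets with made-up corpus frequencies (not from the paper).
SYNSETS = [
    {"big": 900, "large": 700, "huge": 200},
    {"quick": 500, "fast": 800},
]


def build_attribute_pairs(synsets):
    """Map each word to (position, size).

    position: 0-based rank of the word in its synset sorted by descending
    frequency. size: number of words in that synset (an assumed stand-in
    for "the number of its synonyms").
    """
    pairs = {}
    for synset in synsets:
        ordered = sorted(synset, key=synset.get, reverse=True)
        for position, word in enumerate(ordered):
            pairs[word] = (position, len(ordered))
    return pairs


def relative_frequencies(tokens, pairs):
    """Relative frequency of each attribute pair among the synonyms in tokens."""
    counts = Counter(pairs[t] for t in tokens if t in pairs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


def high_low_difference(tokens, pairs):
    """Share of most-frequent synonyms (position 0) minus share of all others.

    A simplified one-dimensional stand-in for the paper's feature vector.
    """
    freqs = relative_frequencies(tokens, pairs)
    high = sum(f for (pos, _), f in freqs.items() if pos == 0)
    low = sum(f for (pos, _), f in freqs.items() if pos > 0)
    return high - low


pairs = build_attribute_pairs(SYNSETS)
cover = ["big", "fast", "big", "large"]    # mostly high-frequency synonyms
stego = ["huge", "quick", "large", "big"]  # substitutions favor rarer synonyms
print(high_low_difference(cover, pairs), high_low_difference(stego, pairs))
```

In this toy setting the cover text scores higher than the substituted text, which mirrors the abstract's observation that substitution shifts mass from high-frequency to low-frequency attribute pairs. A real detector would feed a richer feature vector to a trained classifier.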

Notes

A synset is defined as a set of words with identical or similar meanings.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Nos. 60973128, 61073191, 61070196, 61070195, 61103215, 61173141, 61173142, and 61232016), the National Basic Research Program 973 of China (Nos. 2010CB334706 and 2011CB311808), 2011GK2009, GYHY201206033, 201301030, 0S2013GR0445, and the PAPD fund.

Author information

Authors and Affiliations

College of Computer and Communication Engineering, Changsha University of Science & Technology, Changsha, Hunan, 410004, China
Lingyun Xiang
Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Xingming Sun
College of Information Science and Engineering, Hunan University, Changsha, 410082, China
Gang Luo & Bin Xia

Corresponding author

Correspondence to Xingming Sun.

About this article

Cite this article

Xiang, L., Sun, X., Luo, G. et al. Linguistic steganalysis using the features derived from synonym frequency. Multimed Tools Appl 71, 1893–1911 (2014). https://doi.org/10.1007/s11042-012-1313-8

Issue Date: August 2014