Digital Media Steganography: Principles, Algorithms, and Advances [1 ed.] 0128194383, 9780128194386

The common use of the Internet and cloud services in transmission of large amounts of data over open networks and insecu


English Pages 386 [376] Year 2020


Table of contents:
Cover
Digital Media Steganography:
Principles, Algorithms, and Advances
Copyright
Contents
List of contributors
About the editor
Preface
Acknowledgments
1 Introduction to digital image steganography
1.1 Introduction
1.2 Applications of steganography
1.3 Challenges facing steganography
1.4 Steganographic approaches
1.4.1 Spread spectrum approaches
1.4.2 Spatial domain approaches
1.4.2.1 Gray level modification
1.4.2.2 Pixel value differencing (PVD)
1.4.2.3 Least significant bit substitution (LSB)
1.4.2.4 Exploiting modification direction (EMD)
1.4.2.5 Quantization-based approaches
1.4.2.6 Multiple bit-planes-based approaches
1.4.3 Adaptive-based approaches
1.4.4 Transform domain approaches
1.5 Performance evaluation
1.5.1 Payload capacity
1.5.2 Visual stego image quality analysis
1.5.3 Security analysis
1.5.3.1 Pixel difference histogram analysis
1.5.3.2 Universal steganalysis
1.5.3.3 Regular and singular steganalysis
1.6 Conclusion
References
2 A color image steganography method based on ADPVD and HOG techniques
2.1 Introduction
2.2 Review of the ADPVD method
2.3 The pixel-based adaptive directional PVD steganography
2.3.1 Histogram of oriented gradients
2.3.2 Pixel-of-interest (POI)
2.3.3 Embedding algorithm
2.3.4 Extraction algorithm
2.4 Results and discussion
2.4.1 Embedding direction analysis using HOG
2.4.2 Embedding direction analysis using POI
2.4.3 Impact of threshold value on POI
2.4.4 Impact of threshold on capacity and visual quality
2.4.5 Visual quality analysis
2.4.6 Comparison with other adaptive PVD-based methods
2.4.7 Comparison with color image-based methods
2.4.8 Comparison with edge-based methods
2.4.9 Security against pixel difference histogram analysis
2.4.10 Security against statistical RS-steganalysis
2.5 Conclusion
References
3 An improved method for high hiding capacity based on LSB and PVD
3.1 Introduction
3.2 Related work
3.2.1 Pixel value differencing (PVD) steganography [13]
3.2.1.1 The PVD embedding procedure
3.2.1.2 The PVD extraction steps
3.2.1.3 Illustration of the PVD method
3.2.2 Khodaei et al.'s method [20]
3.2.2.1 An illustration of incorrect data extraction in Khodaei et al.'s method
3.2.3 Jung's method [15]
3.2.3.1 Embedding algorithm
3.2.3.2 Extraction algorithm
3.2.3.3 FOBP in Jung's method
3.2.3.4 Extraction problem in Jung's method
3.3 The proposed method
3.3.1 Embedding procedure
Case 1: Pixel shifting process for overflow condition
Case 2: Pixel shifting process for underflow condition
3.3.2 Extraction procedure
3.3.3 Example of the proposed method
3.3.3.1 Embedding side
3.3.3.2 Extraction side
3.4 Results and discussion
3.4.1 Analysis of PSNR, capacity, BPP, FOBP, and SSIM
3.4.2 Security check using RS analysis
3.4.3 Security check using Pixel Difference Histogram (PDH) analysis
3.5 Conclusion
References
4 An efficient image steganography method using multiobjective differential evolution
4.1 Introduction
4.2 Literature review
4.3 Background
4.3.1 LSB substitution method
4.3.2 Differential evolution
4.4 The proposed method
4.4.1 Embedding process
4.4.2 Extraction process
4.5 Experimental results
4.5.1 Peak signal-to-noise ratio
4.5.2 Structural similarity index measure
4.5.3 Bit error rate
4.6 Conclusion
References
5 Image steganography using add-sub based QVD and side match
5.1 Introduction
5.2 Proposed ASQVD+SM technique
5.2.1 The embedding procedure
5.2.2 Extraction procedure
5.2.3 Example of embedding and extraction
5.3 Experimental analysis
5.4 Conclusion
References
6 A high-capacity invertible steganography method for stereo image
6.1 Introduction
6.2 Preliminaries
6.2.1 Discrete cosine transforms (DCT) and quantized DCT (QDCT)
6.2.2 Yang and Chen's method
6.3 The proposed method
6.3.1 Generation of the embedding direction histogram (EDH)
6.3.2 Stereo image embedding algorithm
6.3.2.1 Similar block searching
6.3.2.2 Based-2-D histogram shifting with EDH data embedding
6.3.2.3 Example of embedding
6.3.3 Information extracting and stereo image recovering algorithm
6.3.4 Evaluation metrics
6.4 Experimental results
6.5 Conclusion
Acknowledgment
References
7 An adaptive and clustering-based steganographic method: OSteg
7.1 Introduction
7.2 Related works
7.3 OSteg embedding
7.3.1 Preparation
7.3.2 Otsu clustering
7.3.3 Pretreatment: fake embedding
7.3.4 Scrambling selection: Ikeda system
7.3.5 Secret shared key and key space
7.3.6 Effective embedding
7.4 Experimental results and discussion
7.5 Conclusion
Acknowledgments
References
8 A steganography method based on decomposition of the Catalan numbers
8.1 Introduction
8.2 Related works
8.3 Decomposition of Catalan numbers
8.4 Implementation of the proposed method
Module for embedded data
Module for extract data
8.5 Steganalysis and security testing
Security analysis of stego key
Steganalysis of the proposed method
8.6 Conclusion
References
9 A steganography approach for hiding privacy in video surveillance systems
9.1 Introduction
9.2 Related works
9.3 Hiding privacy information using video compression concept
9.3.1 Background model generator
9.3.2 Deidentification private details
9.3.3 H.264 compression preprocessing
9.3.4 The proposed quantization hiding technique
9.3.5 The extraction module
9.4 Experimental results
9.4.1 Data payload
9.4.2 Invisibility performance
Conclusion
References
10 Reversible steganography techniques: A survey
10.1 Introduction
10.1.1 Reversible Steganography Scheme (RSS)
10.1.2 Measurements of RSS
10.1.3 Categories of RSS
10.2 Difference Expansion (DE) schemes
10.2.1 Embedding procedure of Tian's method
10.2.2 Extraction procedure of Tian's method
10.2.3 Embedding procedure of Alattar's method
10.2.4 Extraction procedure of Alattar's method
10.2.5 Recovery procedure of Alattar's method
10.3 Histogram-Shifting (HS) schemes
10.3.1 Embedding procedure of HS
10.3.2 Extraction and recovery procedures of HS
10.3.3 Extra information of HS
10.3.4 Experimental results of HS
10.4 Pixel-Value-Ordering (PVO) schemes
10.4.1 Embedding procedure of PVO
10.4.2 Embedding procedure of IPVO
10.4.3 Experimental results of PVO-based schemes
10.5 Dual-image-based schemes
10.5.1 Center-folding strategy
10.5.2 Experimental results of dual-based RSS
10.6 Interpolation-based schemes
10.6.1 Embedding procedure of NMI
10.6.2 Extraction procedure of NMI
10.6.3 Comparison results
10.7 Conclusion
Acknowledgments
References
11 Quantum steganography
11.1 Introduction
11.1.1 The idea of steganography
11.1.2 Quantum error-correcting codes
11.2 Goals and tools of quantum steganography
11.3 Quantum steganography with depolarizing noise
11.3.1 The depolarizing channel
11.3.2 A local steganographic encoding
11.3.3 Key usage
11.3.4 Weaknesses of the local encoding
11.4 Steganographic encoding in error syndromes
11.4.1 The encoding and decoding procedure
11.4.2 Communication and key usage rates
11.5 Encoding in the binary symmetric channel
11.6 Encoding in the 5-qubit "perfect" code
11.6.1 Encoding with one-qubit errors
11.6.2 Two error encodings
11.6.3 Rate of secret qubit transmission
11.6.4 Comparison to encoding across blocks
Steganographic communication rate
Key usage rate
11.7 Secrecy and security
11.7.1 Diamond norm distance for the binary symmetric channel
11.7.2 Diamond norm distance for the depolarizing channel
11.7.3 Conditions for secrecy
11.7.4 Secret key vs. shared entanglement
11.8 Asymptotic rates in the noiseless case
11.8.1 Direct coding theorem (achievability)
The binary symmetric channel
The depolarizing channel
Random unitary channels
General channels
Secret key consumption
11.8.2 Converse theorem (upper bound)
Upper bound on steganographic rate
11.9 Asymptotic rates in the noisy case
11.9.1 Direct coding in the noisy case
Achievable rate for the BSC
Secret key consumption
Depolarizing channel
General channels
11.9.2 Converse theorem in the noisy case
Upper bound on steganographic rate
11.10 Discussion and future directions
11.11 Conclusion
Acknowledgments
References
12 Digital media steganalysis
12.1 Introduction
12.2 Image steganalysis
12.2.1 Signature steganalysis
12.2.2 Statistical steganalysis
12.2.3 Deep learning applied to steganalysis of digital images
12.2.4 Summary and perspectives
12.3 Audio steganalysis
12.3.1 Methods
12.3.1.1 Noncompressed audio formats
12.3.1.2 Compressed audio formats
12.3.1.3 Modern audio steganalysis
12.3.2 Summary and perspectives
12.4 Video steganalysis
12.4.1 General context
12.4.2 Previous methods
12.4.3 Recent method
12.4.4 Summary and perspectives
12.5 Text steganalysis
12.5.1 Methods
12.5.1.1 Statistical algorithms
12.5.1.2 Modern text steganalysis
12.5.2 Summary and perspectives
12.6 Conclusion
References
13 Unsupervised steganographer identification via clustering and outlier detection
13.1 Introduction
13.2 Primary concepts and techniques
13.2.1 JPEG compression
13.2.2 JPEG steganalysis features
13.2.2.1 PEV-274 features
13.2.2.2 LI-250 features
13.2.3 Batch steganography and pooled steganalysis
13.2.4 Agglomerative clustering
13.2.5 Local outlier factor
13.2.6 Maximum mean discrepancy
13.3 General frameworks
13.3.1 Clustering-based detection
13.3.2 Outlier-based detection
13.3.3 Performance evaluation and analysis
13.3.3.1 Clustering-based detection
13.3.3.2 Outlier-based detection
13.4 Ensemble and dimensionality reduction
13.4.1 Clustering ensemble
13.4.2 Dimensionality reduction
13.4.2.1 Feature selection
13.4.2.2 Feature projection
13.5 Conclusion
Acknowledgment
References
14 Deep learning in steganography and steganalysis
14.1 Introduction
14.2 The building blocks of a deep neuronal network
14.2.1 Global view of a Convolutional Neural Network
14.2.2 The preprocessing module
14.2.3 The convolution module
14.2.4 The classification module
14.3 The different networks used over the period 2015-2018
14.3.1 The spatial steganalysis Not-Side-Channel-Aware (Not-SCA)
14.3.2 The spatial steganalysis Side-Channel-Informed (SCA)
14.3.3 The JPEG steganalysis
14.3.4 Discussion about the Mismatch phenomenon scenario
14.4 Steganography by deep learning
14.4.1 The family by synthesis
14.4.2 The family by generation of the modifications probability map
14.4.3 The family by adversarial-embedding iterated (approaches misleading a discriminant)
14.4.4 The family by 3-player game
14.5 Conclusion
References
Index
Back Cover


Digital Media Steganography

Digital Media Steganography
Principles, Algorithms, and Advances

Edited by
Mahmoud Hassaballah
Department of Computer Science, Faculty of Computers and Information, South Valley University, Qena, Egypt

Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

Copyright © 2020 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-819438-6

For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Stacy Masucci
Acquisitions Editor: Elizabeth Brown
Editorial Project Manager: Gabriela D. Capille
Production Project Manager: Maria Bernard
Designer: Matthew Limbert
Typeset by VTeX

Contents

List of contributors  xi
About the editor  xv
Preface  xvii
Acknowledgments  xxi

1. Introduction to digital image steganography (M. Hassaballah, Mohamed Abdel Hameed, Monagi H. Alkinani)  1
2. A color image steganography method based on ADPVD and HOG techniques (M. Hassaballah, Mohamed Abdel Hameed, Saleh Aly, A.S. AbdelRady)  17
3. An improved method for high hiding capacity based on LSB and PVD (Aditya Kumar Sahu, Gandharba Swain)  41
4. An efficient image steganography method using multiobjective differential evolution (Manjit Kaur, Vijay Kumar, Dilbag Singh)  65
5. Image steganography using add-sub based QVD and side match (Anita Pradhan, K. Raja Sekhar, Gandharba Swain)  81
6. A high-capacity invertible steganography method for stereo image (Phuoc-Hung Vo, Thai-Son Nguyen, Van-Thanh Huynh, Thanh-Nghi Do)  99
7. An adaptive and clustering-based steganographic method: OSteg (Marwa Saidi, Rhouma Rhouma, Rasheed Hussain, Olfa Mannai)  123
8. A steganography method based on decomposition of the Catalan numbers (Muzafer Saračević, Samed Jukić, Adnan Hasanović)  145
9. A steganography approach for hiding privacy in video surveillance systems (Ahmed Elhadad, Safwat Hamad, Amal Khalifa, Hussein Abulkasim)  165
10. Reversible steganography techniques: A survey (Tzu-Chuen Lu, Thanh Nhan Vo)  189
11. Quantum steganography (Todd A. Brun)  215
12. Digital media steganalysis (Reinel Tabares-Soto, Raúl Ramos-Pollán, Gustavo Isaza, Simon Orozco-Arias, Mario Alejandro Bravo Ortíz, Harold Brayan Arteaga Arteaga, Alejandro Mora Rubio, Jesus Alejandro Alzate Grisales)  259
13. Unsupervised steganographer identification via clustering and outlier detection (Hanzhou Wu)  295
14. Deep learning in steganography and steganalysis (Marc Chaumont)  321

Index  351

List of contributors

A.S. AbdelRady: South Valley University, Faculty of Science, Department of Mathematics, Qena, Egypt
Aditya Kumar Sahu: Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India; Department of Computer Science and Engineering, GMRIT, Rajam, Andhra Pradesh, India
Adnan Hasanović: Department of Philological Sciences, University of Novi Pazar, Novi Pazar, Serbia
Ahmed Elhadad: Faculty of Science, South Valley University, Department of Mathematics and Computer Science, Qena, Egypt; Faculty of Science and Art, Jouf University, Department of Computer Science and Information, Al Qurayyat, Saudi Arabia
Alejandro Mora Rubio: Universidad Autónoma de Manizales, Department of Electronics and Automation, Manizales, Caldas, Colombia
Amal Khalifa: Purdue Fort Wayne University, Department of Computer Science, West Lafayette, IN, United States
Anita Pradhan: Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India
Dilbag Singh: Manipal University Jaipur, Computer Science and Engineering, School of Computing and Information Technology, Jaipur, India
Gandharba Swain: Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India
Gustavo Isaza: Universidad de Caldas, Department of Systems and Informatics, Manizales, Caldas, Colombia
Hanzhou Wu: School of Communication and Information Engineering, Shanghai University, Shanghai, China
Harold Brayan Arteaga Arteaga: Universidad Autónoma de Manizales, Department of Electronics and Automation, Manizales, Caldas, Colombia
Hussein Abulkasim: Faculty of Science, New Valley University, Department of Mathematics, El-Kharja, Egypt
Jesus Alejandro Alzate Grisales: Universidad Autónoma de Manizales, Department of Electronics and Automation, Manizales, Caldas, Colombia
K. Raja Sekhar: Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India
M. Hassaballah: South Valley University, Faculty of Computers and Information, Department of Computer Science, Qena, Egypt
Manjit Kaur: Manipal University Jaipur, Computer Communication and Engineering, School of Computing and Information Technology, Jaipur, India
Marc Chaumont: Montpellier University, LIRMM (UMR5506)/CNRS, Nîmes University, Montpellier, France; LIRMM/ICAR, Montpellier, France
Mario Alejandro Bravo Ortíz: Universidad Autónoma de Manizales, Department of Electronics and Automation, Manizales, Caldas, Colombia
Marwa Saidi: Laboratoire RISC, Ecole Nationale d'Ingénieurs de Tunis, University of Tunis El Manar, Tunis, Tunisia
Mohamed Abdel Hameed: Luxor University, Faculty of Computers and Information, Department of Computer Science, Luxor, Egypt
Monagi H. Alkinani: University of Jeddah, College of Computer Science and Engineering, Department of Computer Science and Artificial Intelligence, Jeddah, Saudi Arabia
Muzafer Saračević: Department of Computer Sciences, University of Novi Pazar, Novi Pazar, Serbia
Olfa Mannai: Laboratoire RISC, Ecole Nationale d'Ingénieurs de Tunis, University of Tunis El Manar, Tunis, Tunisia
Phuoc-Hung Vo: School of Engineering and Technology, Tra Vinh University, Tra Vinh City, Tra Vinh Province, Vietnam; College of Information Technology, Can Tho University, Can Tho, Vietnam
Rasheed Hussain: Institute of Information Systems, University of Innopolis, Tatarstan, Russia
Raúl Ramos-Pollán: Universidad de Antioquia, Department of Systems Engineering, Medellín, Antioquia, Colombia
Reinel Tabares-Soto: Universidad Autónoma de Manizales, Department of Electronics and Automation, Manizales, Caldas, Colombia
Rhouma Rhouma: Laboratoire RISC, Ecole Nationale d'Ingénieurs de Tunis, University of Tunis El Manar, Tunis, Tunisia; College of Applied Sciences, Salalah, Sultanate of Oman
Safwat Hamad: Faculty of Computer and Information Sciences, Ain Shams University, Department of Scientific Computing, Cairo, Egypt
Saleh Aly: Aswan University, Faculty of Engineering, Department of Electrical Engineering, Aswan, Egypt
Samed Jukić: Faculty of Information Tech., International Burch University, Ilidža, Sarajevo, BIH
Simon Orozco-Arias: Universidad Autónoma de Manizales, Department of Computer Science, Manizales, Caldas, Colombia; Universidad de Caldas, Department of Systems and Informatics, Manizales, Caldas, Colombia
Thai-Son Nguyen: School of Engineering and Technology, Tra Vinh University, Tra Vinh City, Tra Vinh Province, Vietnam
Thanh Nhan Vo: Chaoyang University of Technology, Department of Information Management, Taichung, Taiwan, R.O.C.
Thanh-Nghi Do: College of Information Technology, Can Tho University, Can Tho, Vietnam
Todd A. Brun: University of Southern California, Ming Hsieh Department of Electrical and Computer Engineering, Los Angeles, CA, United States
Tzu-Chuen Lu: Chaoyang University of Technology, Department of Information Management, Taichung, Taiwan, R.O.C.
Van-Thanh Huynh: School of Engineering and Technology, Tra Vinh University, Tra Vinh City, Tra Vinh Province, Vietnam
Vijay Kumar: NIT Hamirpur, Computer Science and Engineering, Hamirpur, HP, India

About the editor

Mahmoud Hassaballah was born in 1974, Qena, Egypt. He received his B.Sc. degree in mathematics in 1997 and his M.Sc. degree in computer science in 2003, both from South Valley University, Egypt, and his Doctor of Engineering (D.Eng.) in computer science from Ehime University, Japan, in 2011. He was a visiting scholar with the Department of Computer & Communication Science, Wakayama University, Japan, in 2013 and the GREAH laboratory, Le Havre Normandie University, France, in 2019. He is currently an associate professor of computer science at the Faculty of Computers and Information, South Valley University, Egypt. He has served as a reviewer for several journals, such as IEEE Transactions on Image Processing, IEEE Transactions on Fuzzy Systems, IEEE Transactions on Parallel and Distributed Systems, Pattern Recognition, Pattern Recognition Letters, Egyptian Informatics Journal, IET Image Processing, IET Computer Vision, IET Biometrics, Journal of Real-Time Image Processing, The Computer Journal, Journal of Electronic Imaging, and Optical Engineering. He has published 5 books and over 50 research papers in refereed international journals and conferences. His research interests include feature extraction, object detection/recognition, artificial intelligence, biometrics, image processing, computer vision, machine learning, and data hiding.


Preface

In the information age and the digital communication revolution, the Internet and cloud services are widely used in the transmission of large amounts of data over social media like Facebook, Instagram, WhatsApp, and several other insecure networks, exposing private and secret data to serious risks. Also, novel technologies and new applications such as the Internet of Things and artificial intelligence bring new threats. Consequently, ensuring that information transmission over these media is safe and secure has become one of the most important issues in the field of data security. To keep unauthorized persons away from the transmitted information, a variety of techniques have been introduced, and steganography is one of them. Steganography, as a technique for covert communication, aims to hide secret messages in a normal message, achieving the least possible statistical detectability and without drawing suspicion during data communication. Steganography differs from other data security techniques (cryptography and watermarking). For example, cryptography conceals only the content of the message through encryption, whereas steganography conceals the presence of the message itself. In a broader sense, watermarking, as an old data security technique, and steganography share some common features, but fundamentally they are quite different. Cryptography and watermarking are beyond the scope of the present work.

The purpose of this book is to provide researchers, scholars, postgraduate students, possibly senior undergraduate students who are taking an advanced course in related topics, and professionals with the foundations, basic principles, and technical information regarding digital media steganography (i.e., image, video, text, and audio data). Besides, it is intended to be a comprehensive reference volume and to provide a bird's eye view of recent state-of-the-art methods on the topic of steganography. Further, several new methods are presented in this book, such as quantum steganography. The emergence of deep learning in steganography and steganalysis is also discussed.

The book consists of fourteen high-quality chapters written by renowned experts in the field. Each chapter provides the principles and fundamentals of a specific method, reviews up-to-date techniques, and presents outcomes. In each chapter, figures, tables, and sometimes examples are used to improve the presentation and analysis of the proposed method. Furthermore, bibliographic references are included in each chapter, providing a good starting point for deeper research and further exploration of the book topics. The book is structured so that each chapter is self-contained within reasonable limits and can be read independently from the others, as follows.

Chapter 1: Introduction to digital image steganography presents a general and brief introduction to the concepts, applications, challenges, and methods of image steganography. It also provides short descriptions of several metrics used in the performance evaluation of steganography methods.


Chapter 2: A color image steganography method based on ADPVD and HOG techniques introduces a new pixel-based adaptive directional pixel value differencing (P-ADPVD) data hiding method consisting of five algorithms. The main advantage of the P-ADPVD method is embedding secret data in three different edge directions, rather than only in one direction as in the PVD-based methods. This methodology significantly improves the quality and security of stego images without sacrificing the embedding capacity.

Chapter 3: An improved method for high hiding capacity based on LSB and PVD introduces an image steganography method using the concepts of least significant bit substitution and pixel value differencing. The major contributions of the method are increasing the hiding capacity, avoiding the fall off boundary problem (FOBP), and resistance to the regular and singular (RS) attack. It achieves a high hiding capacity of 800,007 bits while maintaining a good PSNR of 36.03 dB and shows great resistance to attacks.

Chapter 4: An efficient image steganography method using multiobjective differential evolution proposes a steganography method based on differential evolution and the least significant bit (LSB) substitution scheme. Differential evolution is utilized to optimize the mask assignment list of the LSB scheme. Several evaluation results and comparisons are reported.

Chapter 5: Image steganography using add-sub based QVD and side match presents a new steganography method to address the FOBP. It performs two stages of embedding on 3×3 disjoint pixel blocks. In the first stage, it performs ASQVD and remainder substitution on the central pixel and its left, right, lower, and upper neighbor pixels. Based on the new values of these five pixels, in the second stage it performs the SM embedding approach on the four corner neighbors. The average hiding capacity achieved by this method is 3.55 bpp.

Chapter 6: A high-capacity invertible steganography method for stereo image proposes a new invertible steganography method for high embedding capacity using two-dimensional histogram shifting in the transform domain. It investigates the DCT-quantized coefficients of each similar block pair in the left and right views of a stereo image. Secret bits are first partitioned into 3-bit groups and encoded into decimal form. The embedding direction histogram is then built and used for embedding. This method achieves a trade-off between imperceptibility and embedding capacity.

Chapter 7: An adaptive and clustering-based steganographic method: OSteg proposes a new adaptive spatial steganographic method based on the Otsu clustering technique. It uses clustering of pixel values to restrict the embedding alteration to rich-textured blocks only. The security of the method is enhanced by exploiting a nonlinear Ikeda system. The embedding of a secret message is carried out in preprocessing and alteration phases.

Chapter 8: A steganography method based on decomposition of the Catalan numbers discusses some mathematical concepts of number theory and applications of combinatorial mathematics in the area of steganography. It then proposes a steganography method based on decomposition of the Catalan numbers.


Chapter 9: A steganography approach for hiding privacy in video surveillance systems presents a method for embedding video captured by a surveillance camera into another, processed form of the video from which private information has been removed. The hiding process is carried out in the discrete cosine transform domain of the cover video based on the H.264 video compression concept. Experimental results showed that this method achieved low distortion in the stego video while maintaining an acceptable visual quality for the retrieved frames.

Chapter 10: Reversible steganography techniques: A survey presents a detailed survey of state-of-the-art reversible steganography techniques to provide new researchers with a concise introduction to the reversible steganography field.

Chapter 11: Quantum steganography inspects quantum steganography protocols in detail, describes different encoding methods, and proves bounds on the asymptotic rate of quantum steganographic communication and the rate of secret key consumption. It then examines one particularly well-developed approach, in which the messages are disguised as errors on a quantum error-correcting code arising from a noisy quantum channel.

Chapter 12: Digital media steganalysis is devoted to recent trends in digital media steganalysis. Several steganalysis methods are discussed, and several opportunities are identified as future research directions. It is a good starting point for research projects on digital media steganalysis, as useful methods can be isolated and past errors can be avoided.

Chapter 13: Unsupervised steganographer identification via clustering and outlier detection presents concepts and advanced methodologies for the steganographer identification problem (SIP), where, among a set of users called actors, one or multiple actors are guilty of using steganography, and the goal is to identify the user who sends many steganographic images among the other innocent users. It is self-contained and intended as a tutorial introducing the SIP in the context of digital media steganography.

Chapter 14: Deep learning in steganography and steganalysis deals with deep learning in steganalysis. It presents the structure of deep neural networks in a generic way, discusses the networks proposed in the literature for different scenarios of steganalysis, and describes steganography using deep learning. Some promising future lines of research are also introduced.

Finally, it is necessary to mention here that the book is a small piece in the puzzle of data security and digital media steganography. We hope that readers will find the chapters presented in the book interesting and that they will inspire future research, from both theoretical and practical viewpoints, to spur further advances in the field of data security.

Mahmoud Hassaballah Department of Computer Science Faculty of Computers and Information South Valley University, Qena, Egypt March 3, 2020

Acknowledgments

The editor would like to take this opportunity to express his sincere gratitude to the contributors for extending their wholehearted support in sharing some of their latest results and findings. Without their contributions, it would not have been possible for the book to come into existence. The reviewers of the chapters deserve special thanks for their constructive and timely input. Finally, the editor is deeply grateful for the dedicated and professional work of the staff at Elsevier and for giving him the opportunity to edit a book on digital media steganography. In particular, I would like to thank Elizabeth Brown, senior acquisitions editor, and Gabriela Capille, editorial project manager, for initiating this project and for their kind and timely support in publishing the book and handling the publication. The editorial staff at Elsevier has done a meticulous job, and working with them was a pleasant experience, with special thanks to Maria Bernard (Project Manager).

Mahmoud Hassaballah Department of Computer Science Faculty of Computers and Information South Valley University, Qena, Egypt March 3, 2020


1 Introduction to digital image steganography

M. Hassaballah (a), Mohamed Abdel Hameed (b), Monagi H. Alkinani (c)

(a) South Valley University, Faculty of Computers and Information, Department of Computer Science, Qena, Egypt
(b) Luxor University, Faculty of Computers and Information, Department of Computer Science, Luxor, Egypt
(c) University of Jeddah, College of Computer Science and Engineering, Department of Computer Science and Artificial Intelligence, Jeddah, Saudi Arabia

1.1 Introduction

Steganography has been used for ages and has its roots in ancient civilizations (e.g., Greece, Egypt). The word steganography is a composite of two Greek words: Steganos, which means "covered", and Graphia, which means "writing". In the 5th century BC, Histiaeus shaved a slave's head and tattooed a message on his skull, and the slave was dispatched with the message after his hair grew back [1]. Cardan (1501–1576) reinvented an ancient Chinese method of secret writing, where a paper mask with holes is shared between two parties: the mask is placed over a blank paper, the sender writes the secret message through the holes, then takes the mask off and fills in the blanks so that the message appears as an innocuous text. This method is attributed to Cardan as the Cardan grille [2]. Null ciphers, microdots, and invisible ink were also very popular steganographic methods during World War II. Such methods of hiding secret messages have been used in various forms for thousands of years [3].

Nowadays, digital steganography can be defined as the art of hiding secret messages behind innocent-looking digital media [4]. Jessica Fridrich [2] defined steganography as the art of concealed communication where the very existence of a message is secret. Other researchers define digital steganography as the task of hiding digital information in covert channels so that one can conceal the information and prevent the detection of the hidden message [5]. Steganography can also be defined as a science of obscuring a message in a carrier (host object) with the intent of not drawing suspicion to the context in which the message is transferred [6].

In general, there are several types of digital media that can be used for hiding secret information, such as image, video, audio, and text (linguistics), as shown in Fig. 1.1. These media have different characteristics for embedding secret information [7]. The best medium for embedding secret information must have two features: the medium should be popular, and the modification in this host (cover) should be invisible to any unauthorized third party.


FIGURE 1.1 Categories of digital media steganography.

FIGURE 1.2 A general image steganography pipeline.

To the best of our knowledge, in the literature of digital steganography the image is the most popular medium among steganographers [8]. Reasons for this popularity are the abundance of digital images on the Internet, the fact that images provide enough redundancy for embedding, and the attributes of the Human Visual System (HVS), which motivate researchers to exploit them in data hiding systems. Though images are popular in steganography, other media such as text are also a choice for performing steganography [9]. There are several data hiding applications related to text. For instance, in the context of copyright protection, text watermarking may be used, whereas steganography can be employed for adding a "hash" to a text file to protect it from tampering. Unfortunately, the lack of redundancy in texts compared to digital images makes steganography using texts a nontrivial challenge [10–13].

Fig. 1.2 shows the general pipeline of image steganography approaches, where the term "cover image" denotes the image that is used to embed the "secret message" [14,15]. Normally, any image steganography approach is composed of two algorithms: an "embedding" algorithm, which is the procedure used to hide the secret message within the cover image, and an extraction algorithm, which is used to recover the secret message from the stego image [16]. Thus the stego image, as the final output image, embeds the secret information [17].

1.2 Applications of steganography

Simply put, steganography can be employed anytime one wants to conceal some data. There are many reasons to hide data, but they all come down to the desire to prevent unauthorized individuals from reaching the data or from becoming aware of the existence of a message. Steganography can be used quite effectively in the automatic monitoring of radio advertisements or music: an automated system can be set up to watch for a specific stego message [3]. Modern computer and networking technology gives individuals access to the basic applications of steganography related to secret communication. For example, groups and companies can host a web page that contains secret information meant for another party. Anyone can download the web page; however, the hidden information is invisible and does not attract any attention [18].

Some modern applications of steganography arise in medical imaging systems [19], where, for confidentiality, a separation is considered necessary between patients' image data or DNA sequences and their captions such as the physician's name, the patient's name, address, and other particulars. Using steganography may help to avoid the leakage of patients' private data into unauthorized hands.

Inspired by the notion that steganography can be embedded as part of the normal printing process, the Japanese firm Fujitsu is developing technology for encoding data in a printed picture that is invisible to the human eye and can later be retrieved by a mobile phone with a camera [20]. The process takes less than one second, as the embedded data is a mere 12 bytes; hence users are able to use their cell phones to capture the encoded data. The basic idea is transforming the image color scheme before printing into its hue, saturation, and value (HSV) components and then embedding the data into the hue component, to which human eyes are not sensitive. Mobile cameras can see and decode the coded data [21].

There are several other applications that can use steganography to keep their communications secret [3], including:
• Intelligence services or intellectual property protection.
• Securing multimodal biometric data.
• Corporations with trade secrets to protect.
• Governments, which have claimed that criminals can use steganography to communicate, so it may become limited under laws.
• Military and defence communication.

In the business world, steganography can be used to hide a secret chemical formula or plans for new inventions. It can also be used for corporate espionage by sending out trade secrets. Also, it can be used in the noncommercial sector to hide information that someone wants to keep private.

1.3 Challenges facing steganography

In steganography techniques the statistics of the cover image are used to embed the secret information into it without changing its properties [14]. The resulting image is called a stego image. The stego image must be free from observable changes, so that no third party is able to discover these changes, the cover is handled as a normal image, and the secret data transmitted through it remain secure. Any image steganographic system faces the following major challenges, shown in Fig. 1.3:

• Size of payload: how can a maximum embedding capacity be achieved? Steganography aims at a sufficient embedding capacity, but the requirements of higher payload and secure communication are often contradictory [22,23].
• Visual image quality: how perceptually identical is the stego image to its cover image? Image steganography techniques should produce a highly imperceptible stego image [24].
• Robustness: how can a stego image resist different steganalysis and detection attacks? The stego image should provide robustness against image processing operations like compression, cropping, resizing, and so on; that is, when any of these operations is performed on the stego image, the secret information should not be completely destroyed [25].

Therefore the ideal steganographic method must fulfill the above objectives simultaneously: high capacity, good visual image quality, and undetectability. Most often, however, high-payload steganographic approaches introduce distortion artifacts in stego images that make them vulnerable to steganalysis, whereas steganographic methods with good visual image quality suffer from low payload. Thus, how to achieve high payload, good visual quality, and undetectability simultaneously is a really challenging research issue due to the contradictions between them [17].

FIGURE 1.3 Trade-off between capacity, imperceptibility, and robustness.


1.4 Steganographic approaches

In this section, we attempt to give an overview of the most important steganographic approaches that use digital images as cover media by addressing the classification of image steganography. Based on the nature of embedding, the steganographic techniques available for image cover media fall into the spatial, transform, spread spectrum, and adaptive domains. In the spread spectrum approach the secret data is multiplied by a pseudonoise (PN) sequence and then modulated before embedding in the cover object. Spatial (or image) domain techniques use bitwise methods that apply bit insertion and noise manipulation using simple mechanisms, whereas the transform domain is defined as the transformation of an image into its frequency representation followed by modification of the spectral components of the image. Adaptivity can be introduced into data embedding schemes in several ways, such as the selection of the target pixels of the cover image, the nature of the modification to be made, and the number of bits embedded in a pixel [4]. The classification of image steganography techniques is illustrated in Fig. 1.1.

1.4.1 Spread spectrum approaches

These systems hide and recover a message of substantial length within digital imagery while maintaining the original image size and dynamic range. The embedded secret message can be recovered using appropriate keys without any knowledge of the original image. The embedding relies on image restoration, error-control coding, and techniques similar to spread spectrum communication. A message embedded by this method can be in the form of text, image, or any other digital signal. Applications for such schemes include in-band captioning, covert communication, image tamper-proofing, authentication, embedded control, and revision tracking [26].
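As a hedged illustration of the idea (not the full scheme of [26], which additionally applies image restoration and error-control coding), the following Python sketch modulates one message bit per block with a key-seeded pseudonoise sequence, adds it to the cover, and detects each bit blindly from the sign of the correlation with the regenerated sequence. The flat synthetic cover, the gain, and the block size are illustrative assumptions that keep the demonstration deterministic.

import numpy as np

def ss_embed(cover, bits, key=7, gain=2.0, block=64):
    """Add a bit-modulated pseudonoise (PN) sequence to successive pixel blocks."""
    rng = np.random.default_rng(key)
    stego = cover.astype(np.float64).flatten()
    for i, b in enumerate(bits):
        pn = rng.choice([-1.0, 1.0], size=block)           # key-seeded chip sequence
        seg = slice(i * block, (i + 1) * block)
        stego[seg] += gain * (1 if b == '1' else -1) * pn  # modulate and superimpose
    return np.clip(np.rint(stego), 0, 255).astype(np.uint8).reshape(cover.shape)

def ss_extract(stego, n_bits, key=7, block=64):
    """Detect each bit from the sign of the correlation with the regenerated PN."""
    rng = np.random.default_rng(key)
    flat = stego.astype(np.float64).flatten()
    out = []
    for i in range(n_bits):
        pn = rng.choice([-1.0, 1.0], size=block)
        seg = flat[i * block:(i + 1) * block]
        corr = np.dot(seg - seg.mean(), pn)                # remove the local DC level
        out.append('1' if corr > 0 else '0')
    return ''.join(out)

cover = np.full((64, 64), 128, dtype=np.uint8)             # flat synthetic cover for the demo
msg = '10110'
print(ss_extract(ss_embed(cover, msg), len(msg)))          # '10110'

On real images the image content interferes with the correlation, which is exactly what the restoration and error-control coding steps of the full schemes are meant to handle.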

1.4.2 Spatial domain approaches

Spatial domain schemes are better adapted to the human visual system (HVS) and can provide more embedding capacity than transform domain schemes with an acceptable image quality [27]. This is the simplest way of data embedding in digital images, in which pixel values are modified directly to encode the secret message bits. The main steganographic schemes coming under the spatial domain technique include Pixel Value Differencing (PVD), Least Significant Bit substitution (LSB), Exploiting Modification Direction (EMD), quantization-based, gray level modification, multiple bit-planes, and palette-based steganography schemes [28–39].

1.4.2.1 Gray level modification

The gray level values of the pixels are checked and contrasted with the bit stream that is to be mapped into the image. First, the gray level values of the chosen pixels that are odd are made even by changing the gray level by one unit. When all the chosen pixels have an even gray level, they are contrasted with the bit stream that must be mapped. The first bit from the bit stream is compared with the first chosen pixel. When this bit is even, the pixel is not altered, as all the chosen pixels already have an even gray level value. Whenever the bit is odd, the gray level value of the pixel is decremented by one unit to make its value odd, which then represents an odd bit mapping. This is done for all bits in the bit stream, and every single bit is mapped by changing the gray level values accordingly. This method provides better-quality stego images compared to other methods [40].
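A minimal Python sketch of this mapping is given below; the fixed-stride pixel selection and the lower-boundary guard are illustrative simplifications rather than details taken from [40].

import numpy as np

def glm_embed(cover, bits, stride=4):
    """Gray level modification: map one secret bit onto each selected pixel."""
    stego = cover.astype(np.int16).flatten()
    positions = range(0, stride * len(bits), stride)       # simplified pixel selection
    for bit, i in zip(bits, positions):
        if stego[i] % 2 == 1:                              # first make the gray level even
            stego[i] -= 1
        if bit == '1':                                      # an odd gray level represents a 1
            stego[i] += (1 if stego[i] == 0 else -1)        # guard the lower boundary
    return stego.astype(np.uint8).reshape(cover.shape)

def glm_extract(stego, n_bits, stride=4):
    """Read the bits back from the parity of the selected pixels."""
    flat = stego.flatten()
    return ''.join('1' if flat[i] % 2 else '0'
                   for i in range(0, stride * n_bits, stride))

cover = np.random.randint(0, 256, (16, 16), dtype=np.uint8)
msg = '0110101'
assert glm_extract(glm_embed(cover, msg), len(msg)) == msg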

1.4.2.2 Pixel value differencing (PVD)

This technique subdivides the cover image into nonoverlapping blocks consisting of two connected pixels. It hides the data by altering the difference between these two pixels. The area in which the pixels lie determines the hiding capacity of this technique: in edge areas the difference between the connected pixels is high, whereas in smooth areas the difference is low. Thus the best choice is to select edge areas for embedding the secret message, as they offer more embedding capacity [29], as shown in Fig. 1.4.

FIGURE 1.4 Steps of the pixel value differencing method.
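The sketch below illustrates the general PVD idea on a single two-pixel block. The quantization range table (range widths 8, 8, 16, 32, 64, 128) is an assumed example commonly used in the PVD literature rather than a value taken from this chapter, and overflow/underflow handling is omitted.

# Assumed range table: (lower, upper) bounds of each quantization range.
RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]

def pvd_embed_pair(p1, p2, bitstream):
    """Embed bits into the difference of one two-pixel block."""
    d = p2 - p1
    lo, hi = next(r for r in RANGES if r[0] <= abs(d) <= r[1])
    t = (hi - lo + 1).bit_length() - 1            # number of bits this block can hold
    bits, rest = bitstream[:t], bitstream[t:]
    b = int(bits, 2) if bits else 0
    new_d = (lo + b) if d >= 0 else -(lo + b)     # new difference encodes the bits
    m = new_d - d                                 # spread the change over both pixels
    if d % 2 == 0:
        p1n, p2n = p1 - m // 2, p2 + (m - m // 2)
    else:
        p1n, p2n = p1 - (m - m // 2), p2 + m // 2
    return p1n, p2n, rest

def pvd_extract_pair(p1, p2):
    """Recover the bits hidden in one stego block from its pixel difference."""
    d = p2 - p1
    lo, hi = next(r for r in RANGES if r[0] <= abs(d) <= r[1])
    t = (hi - lo + 1).bit_length() - 1
    return format(abs(d) - lo, f'0{t}b')

p1, p2 = 98, 115                                  # |d| = 17 falls in range (16, 31): 4 bits
n1, n2, _ = pvd_embed_pair(p1, p2, '1011')
print(pvd_extract_pair(n1, n2))                   # '1011'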

1.4.2.3 Least significant bit substitution (LSB)

Least significant bit (LSB) steganography is one of the fundamental and conventional methods capable of hiding a large amount of secret information in a cover image [28]. This technique replaces the LSB bits of pixels within the cover image with secret bits: fixed-length secret bits are embedded in the same fixed-length LSBs of the pixels, as shown in Fig. 1.5. Although this technique is simple, it generally causes noticeable distortion when the number of embedded bits per pixel exceeds four [41,42].
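A minimal NumPy sketch of k-bit LSB substitution and extraction; the function names and the raster-order pixel traversal are illustrative choices, not part of any specific scheme cited above.

import numpy as np

def lsb_embed(cover, bits, k=1):
    """Replace the k least significant bits of consecutive pixels with secret bits."""
    stego = cover.astype(np.uint8).flatten()
    keep_mask = 0xFF ^ ((1 << k) - 1)                  # clears the k LSBs, keeps the rest
    n = len(bits) // k                                 # number of pixels actually used
    for i in range(n):
        value = int(bits[i * k:(i + 1) * k], 2)        # k secret bits as an integer
        stego[i] = (int(stego[i]) & keep_mask) | value
    return stego.reshape(cover.shape)

def lsb_extract(stego, n_bits, k=1):
    """Read back n_bits secret bits from the k LSBs of consecutive pixels."""
    flat = stego.flatten()
    chunks = [format(int(flat[i]) & ((1 << k) - 1), f'0{k}b')
              for i in range((n_bits + k - 1) // k)]
    return ''.join(chunks)[:n_bits]

cover = np.random.randint(0, 256, (8, 8), dtype=np.uint8)
secret = '101100111000'
stego = lsb_embed(cover, secret, k=2)
assert lsb_extract(stego, len(secret), k=2) == secret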

FIGURE 1.5 Hiding data in images using LSB method.

1.4.2.4 Exploiting modification direction (EMD)

Exploiting modification direction (EMD) uses a group of n pixels to carry one hidden digit in a (2n + 1)-ary notational system, which lessens the stego image distortion. Embedding requires at most a decrease or increase by one of a single pixel value within the group. For this method, it is necessary to choose the value of n before embedding. The highest image quality is achieved when n is equal to 2, where one secret digit is embedded within each pair of pixels [43].
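The following sketch shows the standard EMD embedding rule for one group of n pixels: the extraction function is the weighted sum modulo 2n + 1, and at most one pixel is changed by ±1. Boundary pixel values (0 and 255) are not handled in this simplified illustration.

def emd_f(group):
    """Extraction function: weighted sum of the pixels modulo (2n + 1)."""
    n = len(group)
    return sum((i + 1) * x for i, x in enumerate(group)) % (2 * n + 1)

def emd_embed(group, digit):
    """Embed one base-(2n+1) digit by changing at most one pixel by +/- 1."""
    n = len(group)
    g = list(group)
    s = (digit - emd_f(g)) % (2 * n + 1)
    if s == 0:
        return g                        # digit already encoded, nothing to change
    if s <= n:
        g[s - 1] += 1                   # increase pixel s by one
    else:
        g[(2 * n + 1 - s) - 1] -= 1     # decrease pixel (2n + 1 - s) by one
    return g

pair = [143, 97]                        # n = 2 pixels carry one base-5 digit
for d in range(5):
    assert emd_f(emd_embed(pair, d)) == d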

1.4.2.5 Quantization-based approaches

The steganographic system of this category uses any sort of encoding system to hide secret data bits. The encoding system is any standard compression codec like JPEG, vector quantization, and so on. The secret data are divided into small pieces of data, and these small data pieces are embedded along with the encoded carrier image. These systems are used for enhancing the capacity while minimizing the distortion of the stego image. Unfortunately, these systems are not sufficient to handle the geometrical attacks and steganalysis [4].

1.4.2.6 Multiple bit-planes-based approaches

These methods are introduced as an extension to the LSB substitution method, where bit planes are utilized for hiding secret data bits [44]. Usually, bit plane stego approaches are used along with other methods to boost the performance of the overall system. The expanded bit-plane encoding brings two advantages: it can host more secret bits than the 8-bit LSB techniques, and the degree of randomness of embedding is high [45].

1.4.3 Adaptive-based approaches

Adaptive steganography is known as "Statistics-aware embedding" [1] or "Masking" [46]. In other words, the statistics of the cover image are used to embed secret information without changing its properties. This embedding can be done by a random adaptive selection of pixels according to the cover image and the selection of pixels in a block with large local STD (Standard Deviation) [47]. The pixels that carry secret bits are selected adaptively depending on the content of the cover image [48].
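As a hedged sketch of such content-adaptive selection, the code below marks blocks whose local standard deviation exceeds a threshold as embeddable; the block size and threshold are arbitrary illustrative values, not parameters taken from [47,48].

import numpy as np

def adaptive_mask(cover, block=8, std_threshold=12.0):
    """Return a boolean mask of pixels lying in textured (high-STD) blocks."""
    h, w = cover.shape
    mask = np.zeros_like(cover, dtype=bool)
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            patch = cover[i:i + block, j:j + block].astype(np.float64)
            if patch.std() > std_threshold:          # textured block: candidate for embedding
                mask[i:i + block, j:j + block] = True
    return mask

cover = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
mask = adaptive_mask(cover)
print('embeddable pixels:', int(mask.sum()))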

1.4.4 Transform domain approaches

Several transform domain methods are utilized in the field of steganography; the most popular schemes include the Discrete Wavelet Transform (DWT), Discrete Cosine Transform (DCT), Discrete Fourier Transform (DFT), Integer Wavelet Transform (IWT), and Complex Wavelet Transform (CWT) [49–51]. Basically, this type of technique is more robust with regard to common image processing operations and lossy compression. DCT-based steganography is conveniently applied within the Joint Photographic Experts Group (JPEG) compression standard, whereas DWT-based steganography is conveniently applied within the JPEG2000 compression standard. A block diagram of hiding data using DWT-based methods is shown in Fig. 1.6.

FIGURE 1.6 Block diagram of hiding data using DWT.
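To make the transform domain idea concrete, the sketch below hides bits in the diagonal detail (HH) subband of a one-level DWT by quantization index modulation. It assumes the PyWavelets package (import pywt), and the bits are read back from the modified coefficients themselves; exact recovery after the inverse float-valued transform and pixel rounding is not guaranteed, which is one reason integer wavelet transforms (IWT) appear among the practical choices above.

import numpy as np
import pywt

def dwt_qim_embed(cover, bits, q=16.0):
    """Quantize HH coefficients to multiples of q whose parity carries the bits."""
    ll, (lh, hl, hh) = pywt.dwt2(cover.astype(np.float64), 'haar')
    flat = hh.flatten()
    for i, b in enumerate(bits):
        k = np.round(flat[i] / q)
        if int(k) % 2 != int(b):          # parity of the quantizer index encodes the bit
            k += 1
        flat[i] = k * q
    hh = flat.reshape(hh.shape)
    stego = pywt.idwt2((ll, (lh, hl, hh)), 'haar')
    return stego, hh                       # stego image (float) and modified subband

def dwt_qim_extract(hh, n_bits, q=16.0):
    """Read the bits back from the parity of the quantized HH coefficients."""
    flat = hh.flatten()
    return ''.join(str(int(np.round(flat[i] / q)) % 2) for i in range(n_bits))

cover = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
msg = '110010'
stego, hh = dwt_qim_embed(cover, msg)
print(dwt_qim_extract(hh, len(msg)))       # '110010'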

1.5 Performance evaluation

1.5.1 Payload capacity

In fact, the embedding capacity (or payload capacity) depends on the steganography scheme and the nature of the selected cover image. The capacity can be defined as the maximum size of secret data that can be embedded in the cover image without damaging the integrity of the cover image. In other words, the capacity is the number of bits embedded in each pixel and is represented in bits per pixel (bpp) or as a relative percentage as follows:

\mathrm{Capacity\ (bpp)} = \frac{\text{No. of embedded bits}}{\text{Total pixels in a cover image}}.  (1.1)

1.5.2 Visual stego image quality analysis

There are several measures for the assessment of stego image quality, and any of them can be used [52–54]. The most common measures used for comparing stego S and cover C images of size M × N are:

• Peak signal-to-noise ratio (PSNR) is a statistical image quality estimate used for measuring the distortion between the cover and stego images. High values of PSNR mean a small amount of distortion and lead to high visual quality, whereas low values indicate remarkable changes in stego images, which can easily be detected by the HVS. The PSNR is given by

  \mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}},  (1.2)

  where the mean square error (MSE) is given by

  \mathrm{MSE} = \frac{1}{M \times N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \big(S(i,j) - C(i,j)\big)^2.  (1.3)

• Normalized cross-correlation (NCC) illustrates how strongly the stego image is correlated with the cover image. The value of NCC lies between 0 and 1. If the NCC value is equal to 1, then the stego image is completely robust against various image processing attacks [55]. The NCC is given by

  \mathrm{NCC} = \frac{\sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \big(S(i,j) \times C(i,j)\big)}{\sum_{i=0}^{M-1} \sum_{j=0}^{N-1} S(i,j)^2}.  (1.4)

• Structural similarity index measure (SSIM) [56] is an image quality estimation metric that compares two images (stego and cover) to obtain the similarity between them. It is suggested as an improvement over PSNR [57]. The SSIM takes the form

  \mathrm{SSIM}(C, S) = \frac{(2\mu_C \mu_S + c_1)(2\sigma_{CS} + c_2)}{(\mu_C^2 + \mu_S^2 + c_1)(\sigma_C^2 + \sigma_S^2 + c_2)},  (1.5)

  where \mu_C is the average of the cover image C pixels, \mu_S is the average of the stego image S pixels, \sigma_C^2 is the variance of C, \sigma_S^2 is the variance of S, \sigma_{CS} is the covariance of C and S, c_1 = (K_1 L)^2 and c_2 = (K_2 L)^2 are two variables stabilizing the division with a weak denominator, L is the dynamic range of the pixel values, and K_1 = 0.01 and K_2 = 0.03 by default.

• Universal image quality index Q [58] is utilized to evaluate the visual quality of images. High values of Q mean that the stego and cover images are highly correlated and the differences between them are very small. The universal quality index Q can be computed using

  Q = \frac{4\,\sigma_{xy}\,\bar{x}\,\bar{y}}{(\sigma_x^2 + \sigma_y^2)\,\big[(\bar{x})^2 + (\bar{y})^2\big]},  (1.6)

  where

  \bar{x} = \frac{1}{M \times N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} x_{ij}, \qquad
  \bar{y} = \frac{1}{M \times N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} y_{ij},

  \sigma_x^2 = \frac{1}{(M \times N) - 1} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} (x_{ij} - \bar{x})^2, \qquad
  \sigma_y^2 = \frac{1}{(M \times N) - 1} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} (y_{ij} - \bar{y})^2,  (1.7)

  \sigma_{xy} = \frac{1}{M \times N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} (x_{ij} - \bar{x})(y_{ij} - \bar{y}),

  where x denotes the cover image pixels, y denotes the stego image pixels, \bar{x} is the mean value of x, \bar{y} is the mean value of y, and \sigma_x^2, \sigma_y^2, and \sigma_{xy} are the variances and covariance of the x and y images, respectively.
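The metrics above translate directly into NumPy. The sketch below computes MSE, PSNR, NCC, and the universal quality index Q for a cover/stego pair; SSIM is omitted for brevity, and a library implementation (such as the one in scikit-image) can be used for it instead. The synthetic ±1 "stego" image is only a stand-in for a real embedding result.

import numpy as np

def mse(cover, stego):
    c, s = cover.astype(np.float64), stego.astype(np.float64)
    return np.mean((s - c) ** 2)

def psnr(cover, stego):
    m = mse(cover, stego)
    return float('inf') if m == 0 else 10 * np.log10(255.0 ** 2 / m)

def ncc(cover, stego):
    c, s = cover.astype(np.float64), stego.astype(np.float64)
    return np.sum(s * c) / np.sum(s ** 2)

def quality_index(cover, stego):
    x, y = cover.astype(np.float64), stego.astype(np.float64)
    xm, ym = x.mean(), y.mean()
    sx2, sy2 = x.var(ddof=1), y.var(ddof=1)            # sample variances, as in Eq. (1.7)
    sxy = ((x - xm) * (y - ym)).mean()                 # covariance term
    return 4 * sxy * xm * ym / ((sx2 + sy2) * (xm ** 2 + ym ** 2))

cover = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
stego = np.clip(cover.astype(np.int16) + np.random.randint(-1, 2, cover.shape), 0, 255).astype(np.uint8)
print(f'PSNR = {psnr(cover, stego):.2f} dB, NCC = {ncc(cover, stego):.4f}, Q = {quality_index(cover, stego):.4f}')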

1.5.3 Security analysis

Evaluating the robustness of steganography methods is not a trivial or simple task due to several factors. Two chapters of this book are devoted to the steganalysis topic; here the most traditional steganalysis methods are briefly discussed.

1.5.3.1 Pixel difference histogram analysis

A histogram measures the number of occurrences of pixels with a particular pixel value. Pixel values in the cover image change during the embedding process, and these changes can be used to detect steganography. Hence a small difference between the histograms of a cover and its stego image makes detection by attackers more difficult. The pixel difference histogram can be a telling characteristic for exposing the hidden message of stego images produced by PVD-based steganographic methods. Zhang and Wang [59] proved that the original PVD scheme [29] inevitably introduces some undesired steps in the histogram of the differences between two consecutive pixels in each embedding unit due to its fixed division of embedding units and its fixed quantization steps. By detecting and analyzing such artifacts, it is possible to estimate the size of the hidden message, especially when the embedding rate is high [60].
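A hedged sketch of how such an analysis can be computed: build the histogram of differences between horizontally adjacent pixels for the cover and the stego image and compare the two curves. The ±3 synthetic perturbation below only stands in for a real stego image, and only the raw histograms are produced; detecting the "steps" described above is then a matter of inspecting them.

import numpy as np

def pixel_difference_histogram(img, max_diff=50):
    """Histogram of differences between horizontally adjacent pixels."""
    d = img[:, 1:].astype(np.int32) - img[:, :-1].astype(np.int32)
    bins = np.arange(-max_diff, max_diff + 2)        # integer bins covering [-max_diff, max_diff]
    hist, _ = np.histogram(d, bins=bins)
    return bins[:-1], hist

cover = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
stego = np.clip(cover.astype(np.int16) + np.random.randint(-3, 4, cover.shape), 0, 255).astype(np.uint8)

centers, h_cover = pixel_difference_histogram(cover)
_, h_stego = pixel_difference_histogram(stego)
print('L1 distance between the two difference histograms:', int(np.abs(h_cover - h_stego).sum()))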

1.5.3.2 Universal steganalysis Universal steganalysis is a metadetection method in the sense that it can be adjusted after training on cover and stego images to detect any steganographic method regardless of the embedding domain. The trick is finding an appropriate set of sensitive statistical quantities (a feature vector) with “distinguishing” capabilities. Neural networks, clustering algorithms, and other tools of soft computing can be used to find the right thresholds


and construct the detection model from the collected data [61]. Universal steganalysis is also known as blind steganalysis, which is a modern approach to attacking stego images without any prior knowledge about the type of the used steganographic algorithm. These blind detectors are built using machine learning, such as a classifier trained on features extracted from cover and stego images to identify the differences between the cover and stego features. There are many steganalysis features that are suitable for detection of spatial and JPEG steganography [62]. Among spatial domain feature sets, the second-order subtractive pixel adjacency matrix (SPAM) [63] and the spatial rich model (SRM) [64] were proposed. In [65] a feature set named discrete cosine transform residual (DCTR) was proposed for steganalysis of JPEG images, where the detection accuracy is measured using the minimal total error probability under equal priors, given by

P_E = \min_{P_{FA}} \frac{1}{2} (P_{FA} + P_{MD}),   (1.8)

where P_{FA} and P_{MD} are the false alarm and missed detection probabilities, respectively.
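As an illustration of Eq. (1.8), the sketch below estimates P_E from detector scores by sweeping the decision threshold. The score arrays and their distributions are assumptions made for the example, not the output of any specific steganalyzer described here.

```python
import numpy as np

def min_total_error(cover_scores, stego_scores):
    """P_E = min over thresholds of 0.5*(P_FA + P_MD), as in Eq. (1.8).

    Scores are detector outputs; larger values mean 'more likely stego'.
    """
    thresholds = np.unique(np.concatenate([cover_scores, stego_scores]))
    best = 0.5  # random guessing
    for t in thresholds:
        p_fa = np.mean(cover_scores >= t)   # cover images flagged as stego
        p_md = np.mean(stego_scores < t)    # stego images missed
        best = min(best, 0.5 * (p_fa + p_md))
    return best

# Toy example: scores drawn from two overlapping distributions.
rng = np.random.default_rng(0)
print(min_total_error(rng.normal(0.0, 1.0, 1000), rng.normal(1.0, 1.0, 1000)))
```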

1.5.3.3 Regular and singular steganalysis
It is not easy to uncover and quantify the weak relationship between some pseudorandom components present in an image (e.g., the LSB plane) and the image itself. Consider a cover image with M × N pixels and with pixel values from the set P = {0, ..., 255} for an 8-bit gray-scale image. The spatial correlation is captured using a discrimination function f that assigns a real number f(x_1, ..., x_n) ∈ R to a group of pixels G = (x_1, ..., x_n). The function f is defined as

f(x_1, \ldots, x_n) = \sum_{i=1}^{n-1} |x_{i+1} - x_i|,   (1.9)

which measures the smoothness of G: the noisier the group G, the larger the value of the discrimination function f. In the LSB embedding algorithm, as noisiness is increased in the image, the value of f will increase after embedding. In typical images, flipping the group G will more frequently lead to an increase in the discrimination function f rather than a decrease. Thus the total number of regular groups is larger than the total number of singular groups. Let us denote the relative number of regular groups for a nonnegative mask m as R_m (in percent of all groups), and let S_m be the relative number of singular groups. The steganalysis method assumes that for a cover image the value of R_m is approximately equal to that of R_{-m}, and the same should hold for S_m and S_{-m}, regardless of the payload rate:

R_m \cong R_{-m} \quad \text{and} \quad S_m \cong S_{-m}.   (1.10)

Briefly, the main idea of the RS-steganalysis exploits the correlation of images in the spatial domain and is applicable to most commercial steganographic software products. Also, the principles of RS-steganalysis can be extended to variants of LSB embedding in indices of palette images and quantized DCT coefficients in JPEG files [61].
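To make the discrimination-function idea concrete, here is a small sketch that computes f over groups of four pixels, applies LSB flipping where a mask is 1, and counts the fractions of regular and singular groups (R_m, S_m). The group size, the mask [0, 1, 1, 0], the synthetic test image, and the helper names are assumptions chosen for illustration of the general RS idea, not a specific published implementation.

```python
import numpy as np

def f_smoothness(group):
    """Discrimination function of Eq. (1.9): sum of absolute adjacent differences."""
    return np.abs(np.diff(group.astype(np.int16))).sum()

def flip_with_mask(group, mask):
    """Apply LSB flipping (0<->1, 2<->3, ...) to the pixels where the mask is 1."""
    g = group.astype(np.int16).copy()
    g[mask == 1] += 1 - 2 * (g[mask == 1] % 2)
    return g

def regular_singular(image, mask=np.array([0, 1, 1, 0])):
    """Fractions (R_m, S_m) of 4-pixel groups that become noisier / smoother when flipped."""
    regular = singular = total = 0
    for row in image:
        for k in range(0, row.size - 3, 4):
            g = row[k:k + 4]
            before, after = f_smoothness(g), f_smoothness(flip_with_mask(g, mask))
            total += 1
            regular += after > before
            singular += after < before
    return regular / total, singular / total

# A smooth synthetic cover and the same image after random LSB embedding:
# for the cover, R_m clearly exceeds S_m; heavy LSB embedding tends to pull them closer.
x = np.arange(256)
cover = (np.add.outer(x, x) // 2).astype(np.uint8)   # smooth diagonal ramp
stego = cover ^ np.random.randint(0, 2, cover.shape).astype(np.uint8)
print(regular_singular(cover), regular_singular(stego))
```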


1.6 Conclusion
In this chapter, we reviewed the concepts, applications, challenges, and methods of image steganography. First, the basic concepts and the development history of image steganography were briefly revisited. Then we gave an overview of the most important steganographic approaches that use digital images as cover media by addressing the classification of image steganography. Finally, we provided short descriptions of several metrics used in the performance evaluation of steganography methods.

References [1] Niels Provos, Peter Honeyman, Hide and seek: an introduction to steganography, IEEE Security & Privacy 1 (3) (2003) 32–44. [2] Jessica Fridrich, Steganography in Digital Media: Principles, Algorithms, and Applications, Cambridge University Press, 2009. [3] Gregory Kipper, Investigator’s Guide to Steganography, CRC Press, 2003. [4] Inas Jawad Kadhim, Prashan Premaratne, Peter James Vial, Brendan Halloran, Comprehensive survey of image steganography: techniques, evaluations, and trends in future research, Neurocomputing 335 (2019) 299–326. [5] Abid Yahya, Steganography Techniques for Digital Images, Springer, 2019. [6] Donovan Artz, Digital steganography: hiding data within data, IEEE Internet Computing 5 (3) (2001) 75–80. [7] Mohamed Abdel Hameed, M. Hassaballah, Saleh Aly, Ali Ismail Awad, An adaptive image steganography method based on histogram of oriented gradient and PVD-LSB techniques, IEEE Access 7 (2019) 185189–185204. ´ [8] El˙zbieta Zielinska, Wojciech Mazurczyk, Krzysztof Szczypiorski, Trends in steganography, Communications of the ACM 57 (3) (2014) 86–95. [9] Mustafa Cem Kasapba¸si, A new chaotic image steganography technique based on Huffman compression of Turkish texts and fractal encryption with post-quantum security, IEEE Access 7 (2019) 148495–148510. [10] Behrooz Khosravi, Behnam Khosravi, Bahman Khosravi, Khashayar Nazarkardeh, A new method for pdf steganography in justified texts, Journal of Information Security and Applications 45 (2019) 61–70. [11] Milad Taleby Ahvanooey, Qianmu Li, Xuefang Zhu, Mamoun Alazab, Jing Zhang, ANiTW: a novel intelligent text watermarking technique for forensic identification of spurious information on social media, Computers & Security (2019) 101702. [12] Milad Taleby Ahvanooey, Qianmu Li, Jun Hou, Ahmed Raza Rajput, Chen Yini, Modern text hiding, text steganalysis, and applications: a comparative analysis, Entropy 21 (4) (2019) 355. [13] Kuo-Chen Wu, Chung-Ming Wang, Steganography using reversible texture synthesis, IEEE Transactions on Image Processing 24 (1) (2014) 130–139. [14] Paulo Vinicius Koerich Borges, Joceli Mayer, Ebroul Izquierdo, Robust and transparent color modulation for text data hiding, IEEE Transactions on Multimedia 10 (8) (2008) 1479–1489. [15] Mohamed Abdel Hameed, Saleh Aly, M. Hassaballah, An efficient data hiding method based on adaptive directional pixel value differencing (ADPVD), Multimedia Tools and Applications 77 (12) (2018) 14705–14723. [16] Sunil Lee, Chang D. Yoo, Ton Kalker, Reversible image watermarking based on integer-to-integer wavelet transform, IEEE Transactions on Information Forensics and Security 2 (3) (2007) 321–330. [17] Mehdi Hussain, Ainuddin Wahid Abdul Wahab, Yamani Idna Bin Idris, Anthony T.S. Ho, Ki-Hyun Jung, Image steganography in spatial domain: a survey, Signal Processing: Image Communication 65 (2018) 46–66. [18] Frank Y. Shih, Digital Watermarking and Steganography: Fundamentals and Techniques, CRC Press, 2017.


[19] Stefan Katzenbeisser, Fabien Petitcolas, Information Hiding Techniques for Steganography and Digital Watermarking, Artech House, 2000. [20] Dominic Bucerzan, Crina Ra¸tiu, Testing methods for the efficiency of modern steganography solutions for mobile platforms, in: 6th International Conference on Computers Communications and Control, IEEE, 2016, pp. 30–36. [21] David Frith, Steganography approaches, options, and implications, Network Security 8 (20) (2007) 4–7. [22] Mohamed Abdel Hameed, M. Hassaballah, Saleh Aly, A.S. Abdel Rady, A high payload steganography method based on pixel value differencing, in: 11th International Conference on Informatics and Systems (INFOS), Cairo, Egypt, 2018. [23] Gandharba Swain, High capacity image steganography using modified LSB substitution and PVD against pixel difference histogram analysis, Security and Communication Networks (2018) 2018. [24] Inas Jawad Kadhim, Prashan Premaratne, Peter James Vial, High capacity adaptive image steganography with cover region selection using dual-tree complex wavelet transform, Cognitive Systems Research 60 (2020) 20–32. [25] Valery Korzhik, Nguyen Duy Cuong, Guillermo Morales-Luna, Cipher modification against steganalysis based on NIST tests, in: 24th Conference of Open Innovations Association (FRUCT), IEEE, 2019, pp. 179–186. [26] Lisa M. Marvel, Charles G. Boncelet, Charles T. Retter, Spread spectrum image steganography, IEEE Transactions on Image Processing 8 (8) (1999) 1075–1083. [27] Sara Sajasi, Amir-Masoud Eftekhari Moghadam, An adaptive image steganographic scheme based on noise visibility function and an optimal chaotic based encryption method, Applied Soft Computing 30 (2015) 375–389. [28] Chi-Kwong Chan, Lee-Ming Cheng, Hiding data in images by simple LSB substitution, Pattern Recognition 37 (3) (2004) 469–474. [29] Da-Chun Wu, Wen-Hsiang Tsai, A steganographic method for images by pixel-value differencing, Pattern Recognition Letters 24 (9) (2003) 1613–1626. [30] H-C. Wu, N-I. Wu, C-S. Tsai, M-S. Hwang, Image steganographic scheme based on pixel-value differencing and LSB replacement methods, IEEE Proceedings-Vision, Image and Signal Processing 152 (5) (2005) 611–615. [31] Cheng-Hsing Yang, Chi-Yao Weng, Shiuh-Jeng Wang, Hung-Min Sun, Adaptive data hiding in edge areas of images with spatial LSB domain systems, IEEE Transactions on Information Forensics and Security 3 (3) (2008) 488–497. [32] Cheng-Hsing Yang, Chi-Yao Weng, Shiuh-Jeng Wang, Hung-Min Sun, Varied PVD+ LSB evading detection programs to spatial domain in data embedding systems, Journal of Systems and Software 83 (10) (2010) 1635–1643. [33] Xin Liao, Qiao-yan Wen, Jie Zhang, A steganographic method for digital images with four-pixel differencing and modified LSB substitution, Journal of Visual Communication and Image Representation 22 (1) (2011) 1–8. [34] Yen-Po Lee, Jen-Chun Lee, Wei-Kuei Chen, Ko-Chin Chang, Jiunn Su, Chien-Ping Chang, Highpayload image hiding with quality recovery using tri-way pixel-value differencing, Information Sciences 191 (2012) 214–225. [35] M. Khodaei, K. Faez, New adaptive steganographic method using least-significant-bit substitution and pixel-value differencing, IET Image Processing 6 (6) (2012) 677–686. [36] C. Balasubramanian, S. Selvakumar, S. Geetha, High payload image steganography with reduced distortion using octonary pixel pairing scheme, Multimedia Tools and Applications 73 (3) (2014) 2223–2245. 
[37] Jeanne Chen, A PVD-based data hiding method with histogram preserving using pixel pair matching, Signal Processing: Image Communication 29 (3) (2014) 375–384. [38] Ki-Hyun Jung, Kee-Young Yoo, High-capacity index based data hiding method, Multimedia Tools and Applications 74 (6) (2015) 2179–2193. [39] Gandharba Swain, Adaptive pixel value differencing steganography using both vertical and horizontal edges, Multimedia Tools and Applications 75 (21) (2016) 13541–13556.


[40] Wien Hong, Tung-Shou Chen, A novel data embedding method using adaptive pixel pair matching, IEEE Transactions on Information Forensics and Security 7 (1) (2011) 176–184. [41] Liang Zhang, Haili Wang, Renbiao Wu, A high-capacity steganography scheme for JPEG2000 baseline system, IEEE Transactions on Image Processing 18 (8) (2009) 1797–1803. [42] Po-Chyi Su, C.-C.J. Kuo, Steganography in JPEG2000 compressed images, IEEE Transactions on Consumer Electronics 49 (4) (2003) 824–832. [43] Xinpeng Zhang, Shuozhong Wang, Efficient steganographic embedding by exploiting modification direction, IEEE Communications Letters 10 (11) (2006) 781–783. [44] K. Sathish Shet, A.R. Aswath, M.C. Hanumantharaju, Xiao-Zhi Gao, Design and development of new reconfigurable architectures for LSB/multi-bit image steganography system, Multimedia Tools and Applications 76 (11) (2017) 13197–13219. [45] Tuan Duc Nguyen, Somjit Arch-Int, Ngamnij Arch-Int, An adaptive multi bit-plane image steganography using block data-hiding, Multimedia Tools and Applications 75 (14) (2016) 8319–8345. [46] Neil F. Johnson, Sushil Jajodia, Exploring steganography: seeing the unseen, Computer 31 (2) (1998). [47] Abbas Cheddad, Joan Condell, Kevin Curran, Paul Mc Kevitt, Digital image steganography: survey and analysis of current methods, Signal Processing 90 (3) (2010) 727–752. [48] Jiri Fridrich, Rui Du, Secure steganographic methods for palette images, in: International Workshop on Information Hiding, Springer, 1999, pp. 47–60. [49] Nabin Ghoshal, Jyotsna Kumar Mandal, A novel technique for Image Authentication in Frequency Domain using Discrete Fourier Transformation Technique (IAFDDFTT), Malaysian Journal of Computer Science 21 (1) (2008) 24. [50] Michela Cancellaro, Federica Battisti, Marco Carli, Giulia Boato, Francesco G.B. De Natale, Alessandro Neri, A commutative digital image watermarking and encryption method in the tree structured Haar transform domain, Signal Processing: Image Communication 26 (1) (2011) 1–12. [51] Chun-peng Wang, Xing-yuan Wang, Zhi-qiu Xia, Geometrically invariant image watermarking based on fast radial harmonic Fourier moments, Signal Processing: Image Communication 45 (2016) 10–23. [52] Lihuo He, Fei Gao, Weilong Hou, Lei Hao, Objective image quality assessment: a survey, International Journal of Computer Mathematics 91 (11) (2014) 2374–2388. [53] Damon M. Chandler, Seven challenges in image quality assessment: past, present, and future research, ISRN Signal Processing (2013) 2013. [54] Weisi Lin, C.-C. Jay Kuo, Perceptual visual quality metrics: a survey, Journal of Visual Communication and Image Representation 22 (4) (2011) 297–312. [55] P. Thiyagarajan, G. Aghila, Reversible dynamic secure steganography for medical image using graph coloring, Health Policy and Technology 2 (3) (2013) 151–161. [56] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, Eero P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing 13 (4) (2004) 600–612. [57] Zhou Wang, Alan C. Bovik, Modern image quality assessment, Synthesis Lectures on Image, Video, and Multimedia Processing 2 (1) (2006) 1–156. [58] Zhou Wang, Alan C. Bovik, A universal image quality index, IEEE Signal Processing Letters 9 (3) (2002) 81–84. [59] Xinpeng Zhang, Shuozhong Wang, Vulnerability of pixel-value differencing steganography to histogram analysis and modification for enhanced security, Pattern Recognition Letters 25 (3) (2004) 331–339. 
[60] Weiqi Luo, Fangjun Huang, Jiwu Huang, A more secure steganography based on adaptive pixel-value differencing scheme, Multimedia Tools and Applications 52 (2–3) (2011) 407–430. [61] Jessica Fridrich, Miroslav Goljan, Practical steganalysis of digital images: state of the art, in: Security and Watermarking of Multimedia Contents IV, vol. 4675, 2002, pp. 1–14. [62] Randa Atta, Mohammad Ghanbari, A high payload steganography mechanism based on wavelet packet transformation and neutrosophic set, Journal of Visual Communication and Image Representation 53 (1) (2018) 42–54.


[63] Tom Pevny, Patrick Bas, Jessica Fridrich, Steganalysis by subtractive pixel adjacency matrix, IEEE Transactions on Information Forensics and Security 5 (2) (2010) 215–224. [64] Jessica Fridrich, Jan Kodovsky, Rich models for steganalysis of digital images, IEEE Transactions on Information Forensics and Security 7 (3) (2012) 868–882. [65] Vojtˇech Holub, Jessica Fridrich, Low-complexity features for JPEG steganalysis using undecimated DCT, IEEE Transactions on Information Forensics and Security 10 (2) (2015) 219–228.

2 A color image steganography method based on ADPVD and HOG techniques
M. Hassaballah (a), Mohamed Abdel Hameed (b), Saleh Aly (c), A.S. AbdelRady (d)
(a) South Valley University, Faculty of Computers and Information, Department of Computer Science, Qena, Egypt
(b) Luxor University, Faculty of Computers and Information, Department of Computer Science, Luxor, Egypt
(c) Aswan University, Faculty of Engineering, Department of Electrical Engineering, Aswan, Egypt
(d) South Valley University, Faculty of Science, Department of Mathematics, Qena, Egypt

2.1 Introduction
Recently, information security has become one of the most important issues for human communities because of increased data transmission over social networks and cloud services [1]. The common use of the Internet involves the transmission of large amounts of data over open networks and insecure channels, exposing private and secret data to serious risks. As digital information and data are transmitted more often than ever over the Internet, technologies for protecting and securing sensitive messages need to be discovered and continuously developed [2,3]. To maintain the confidentiality of secret data and keep unauthorized users away from the transmitted information, the data hiding concept was introduced using three mechanisms: steganography, cryptography, and watermarking. Since steganography provides covert communication, it is a more effective technique than the other two data hiding mechanisms [4,5]. In fact, steganographic methods for digital images have been widely studied. Various techniques are employed to embed secret data into cover objects so that anyone other than the intended participant cannot detect the hidden information in the cover object. Steganography has been utilized in many types of applications, such as copyright control of materials, smart IDs, electronic watermarking, medical imaging systems, and secure communication. Images are the most popular cover objects used for steganography [6,7]. The statistics of the cover image can be used to embed the secret information into the cover image without changing its properties. This embedding can be done by a random adaptive selection of pixels according to the cover image and the selection of pixels in a block with a large local standard deviation. The pixels carrying secret bits are selected adaptively depending on the content of the cover image [8].


Most existing steganography methods suffer from detectable artifacts due to the high embedding rate of the hidden secret message, which degrades the stegoimage quality [9,10]. A third party could use such artifacts as an indication that a secret message exists. The pixel value differencing (PVD) method is one of the efficient and powerful methods for image steganography. However, this method handles only one embedding direction for all pixels of the cover image, which leads to exact detection of the embedded secret data [11,12]. In this chapter, text is considered as the type of secret data hidden within the cover image; that is, images are used to provide safe communication and conceal confidential data so that an unauthorized receiver cannot suspect the existence of confidential information. This chapter proposes a pixel-based adaptive directional pixel value differencing (P-ADPVD) method for embedding a secret message in various edge directions of cover images. The P-ADPVD method depends on pixels-of-interest (POI) to hide secret data bits in the dominant edge direction of each pixel. The histogram of oriented gradients (HOG) algorithm is employed to find the dominant edge direction for each block of the POI using the gradient magnitude and angle information calculated from the cover image, which can be determined by a threshold value.

2.2 Review of the ADPVD method
The ADPVD-based method [13] is employed to embed secret data in a cover image along three directions (horizontal, vertical, and diagonal) by selecting the appropriate embedding direction for each color channel. Color image steganography is more attractive than gray-scale steganography because a color image can hold a larger amount of secret information, as it has three color components. Furthermore, using color images makes it very hard for the human eye to predict and detect the existence of any secret data inside the cover image after embedding. Eight bits are used to represent the pixel value of each of the red, green, and blue components. Part of the secret data is embedded in each of the three channels (R, G, B) according to their optimum embedding directions: horizontal, vertical, and diagonal. Figs. 2.1 and 2.2 summarize two examples of the embedding and extraction phases of the ADPVD-based method, respectively. The proposed method adopts the same idea as the ADPVD method for hiding secret data, whereas the selection of pixels is based on the HOG algorithm, which is discussed in the remaining sections of this chapter.

2.3 The pixel-based adaptive directional PVD steganography
As shown in Fig. 2.3, the proposed P-ADPVD method is composed of two algorithms: one for embedding and the other for extracting secret data. Both algorithms are based on selecting a set of pixels-of-interest (POIs) using the HOG algorithm to embed and extract secret data from any color cover image. More details are given below to explain the steps of calculating the HOG and finding the POIs of the cover image.


FIGURE 2.1 An illustration example for the ADPVD embedding phase.

FIGURE 2.2 An illustration example for the ADPVD extraction phase.

2.3.1 Histogram of oriented gradients
Gradients of the input cover image in the x- and y-directions are first computed as illustrated in Fig. 2.4. Then the gradient magnitude and angle are calculated from the x and y gradient images. The gradient angle is quantized to make all angles fall within a specified fixed range, which helps to handle small angle variations. To find the dominant edge direction for each pixel, the histogram of oriented gradients (HOG) is calculated using a 2 × 2 block size from the quantized angle image.


FIGURE 2.3 Outline of the proposed pixel-based adaptive directional PVD steganography method.

FIGURE 2.4 An example of applied HOG algorithm on a color image.

It is clear that not all pixels carry useful information for embedding. Thus the POIs can be adaptively determined by adjusting the threshold value Th. This value is computed adaptively according to the secret message length.


The HOG algorithm is an edge-based local shape descriptor introduced by Dalal and Triggs [14]. It was originally used to solve the object detection problem by describing the local edge information of object shapes. In this work a similar idea is employed to find the dominant gradient direction for each block. The gradient magnitude values of the pixels are accumulated according to their corresponding angles. To find the dominant angle for each pixel in the cover image, we select the gradient angle that corresponds to the maximum accumulated gradient magnitude value. The steps of the HOG algorithm for extracting the dominant edge direction are outlined in Algorithm 1.

Algorithm 1 HOG computation algorithm.
Input: Input cover image C of size M × N
Output: Histogram of oriented gradients H
1: Calculate the gradient of the input image in the x- and y-directions:

G_x = I * K_x, \qquad G_y = I * K_y,   (2.1)

where K_x = [-1 \; 1] and K_y = [-1 \; 1]^T.
2: The magnitude and edge direction are computed using

G = \sqrt{G_x^2 + G_y^2}, \qquad \theta = \tan^{-1}\!\left(\frac{G_y}{G_x}\right).   (2.2)

3: Normalize the magnitude value to the range [0, 1] using

G_n = \frac{G}{\max(G)}.   (2.3)

4: Quantize the angle of the edge direction θ_q according to the quantization ranges given in Table 2.1.
5: Divide the cover image into a set of 2 × 2 nonoverlapping blocks and calculate the histogram of oriented gradients (HOG) as

H_{i,j} = \sum_{x,y \in B(i,j)} s(\theta_q = Q), \quad i = 0, 1, \ldots, N-1, \; j = 0, 1, \ldots, M-1,   (2.4)

where Q = {1, 2, 3, 4, 5} is the set of quantized angle labels, and

s(x) = \begin{cases} G_m, & x \text{ is true}, \\ 0, & x \text{ is false}, \end{cases}   (2.5)

where s(x) is an indicator function, and H_{i,j} accumulates the gradient magnitude G_m inside each block B(i,j).


Table 2.1 Angle quantization range.

Angle range θ                Quantized value
[0–22.5] or [157.5–180]      1 or 5
[22.5–67.5]                  2
[67.5–112.5]                 3
[112.5–157.5]                4
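As a rough illustration of Algorithm 1 and the quantization of Table 2.1, the following Python/NumPy sketch computes a gradient magnitude and quantized angle and then picks, for each 2 × 2 block, the angle label with the largest accumulated magnitude. It is a simplified reading of the algorithm under the assumption that the kernels are the simple [-1 1] differences, and the function names are illustrative.

```python
import numpy as np

ANGLE_BINS = [(0.0, 22.5, 1), (22.5, 67.5, 2), (67.5, 112.5, 3),
              (112.5, 157.5, 4), (157.5, 180.0, 5)]   # Table 2.1

def quantize_angle(theta_deg):
    """Map an angle in [0, 180) degrees to its label 1..5 from Table 2.1."""
    for lo, hi, label in ANGLE_BINS:
        if lo <= theta_deg < hi:
            return label
    return 5

def dominant_directions(cover):
    """For each 2x2 block: (max accumulated magnitude, dominant quantized angle label)."""
    img = cover.astype(np.float64)
    gx = np.diff(img, axis=1, prepend=img[:, :1])   # horizontal [-1 1] gradient
    gy = np.diff(img, axis=0, prepend=img[:1, :])   # vertical   [-1 1]^T gradient
    mag = np.hypot(gx, gy)
    mag = mag / (mag.max() + 1e-12)                 # normalize to [0, 1], Eq. (2.3)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0    # fold angles into [0, 180)
    M, N = img.shape
    out = np.zeros((M // 2, N // 2, 2))
    for i in range(0, M - 1, 2):
        for j in range(0, N - 1, 2):
            hist = np.zeros(6)                      # index 1..5 used; 0 means a flat block
            for m, a in zip(mag[i:i+2, j:j+2].ravel(), ang[i:i+2, j:j+2].ravel()):
                hist[quantize_angle(a)] += m        # accumulate magnitude per label, Eq. (2.4)
            out[i // 2, j // 2] = hist.max(), hist.argmax()
    return out

cover = np.random.randint(0, 256, (8, 8), dtype=np.uint8)
print(dominant_directions(cover)[..., 1])           # dominant angle label per block
```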

2.3.2 Pixel-of-interest (POI)
The dominant magnitude value for each block of the cover image is obtained by finding the maximum value of the histogram of accumulated gradient magnitudes. A POI is defined as a pixel whose accumulated gradient magnitude value is greater than the selected threshold parameter Th. The value of Th is determined adaptively according to the number of bits in the secret message. These pixels, called pixels-of-interest (POIs), are found in the edge regions, whereas the other pixels located in smooth regions of the cover image are called pixels-of-noninterest (PONIs). The POIs are used to embed/extract the secret message, and the direction of embedding is the corresponding quantized dominant angle. The first stage in determining the POIs is to compute the threshold value Th using Algorithm 2, which finds a suitable high threshold value for the HOG edge detector. Initially, tmin is set to 0, and tmax is set to 1. The threshold value Th, located in [0, 1], has to be stored in the last pixel of the stegoimage sent to the intended receiver. Let limit be set to 1% of the message length, and let Ne be the number of edge pixels returned by the HOG edge detector for the given threshold, which is the midpoint between tmin and tmax. It is quite possible that the number of edge pixels Ne is not exactly the same as the length of the secret message Sm. To alleviate this problem, the terminating condition of the search is modified so that it returns an edge-pixel capacity Se greater than or equal to Sm, and limit is used to set an upper bound on Ne. The POIs of the cover image are then allocated by applying the steps of Algorithm 3.

2.3.3 Embedding algorithm
The first stage of the embedding algorithm is applying the previously explained HOG steps to the cover image to determine the dominant magnitude and angle for each pixel, as in Algorithm 1. Then the threshold value Th is calculated adaptively depending on the length of the secret message according to Algorithm 2. After that, the appropriate number of edge POIs can be determined according to Algorithm 3. Clearly, the number of embedded secret bits depends on the quantization range table, which is divided into two levels, a lower level and a higher level, as shown in Fig. 2.5. Assume that the top left pixel in each POI is the reference edge pixel and the other one is selected according to the dominant edge direction. To implement the embedding algorithm, which exploits the POIs for hiding the secret message, the steps of Algorithm 4 should be carried out.


Algorithm 2 Threshold Th computation algorithm.
Input: Cover image C, length of secret message Sm, capacity of edge pixels Se
Output: Threshold value Th giving the required POIs
1. Initialize limit = 0.01 × Se, tmin = 0, tmax = 1, Ti = (tmin + tmax)/2.
2. The number of edge pixels is computed by

N_e = \text{Count}(\text{HOG}(C, T)),   (2.6)

where N_e is the number of edge pixels returned by the HOG computation.
3. Calculate the difference diff between Se and Sm to decide the number of edge pixels required for embedding:

\text{diff} = S_e - S_m.   (2.7)

4. Adjust the value of Th according to the following conditions:
   if diff < limit ⇒ Th = tmin;
   if diff < 0 ⇒ Th = Ti;
   if diff < limit, calculate the required threshold value Th using

T_h = \frac{\text{limit}}{S_m};   (2.8)

   if Th > tmax ⇒ Th = (tmin + tmax)/2.
5. Repeat steps (1–4) until the required value Th is adjusted to accommodate the whole length of the secret message.

Algorithm 3 POI identification.
Input: Cover image C, threshold Th
Output: Number of POIs required for embedding the secret message
1. For each pixel in the cover image, the dominant magnitude G = \sqrt{F_x^2 + F_y^2} and angle values are selected as the maximum value of the 2 × 2 HOG block (i.e., POI = Max(G)).
2. Calculate the threshold value Th according to Algorithm 2.
3. Identify the number of POIs required for embedding based on the value of Th.
4. Determine the embedding edge direction for each selected POI using the corresponding quantized dominant angle according to Table 2.1.
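The interplay between Algorithms 2 and 3 can be pictured as a bisection-style search over Th that keeps just enough edge pixels for the message. The sketch below is a simplified interpretation of that search: it assumes the per-block accumulated magnitudes have already been computed (e.g., with the HOG sketch above), it replaces the exact conditions of Algorithm 2 by a plain bisection, and the parameter bits_per_poi is an illustrative assumption.

```python
import numpy as np

def select_pois(block_strength, th):
    """Boolean mask of blocks whose accumulated magnitude exceeds the threshold Th."""
    return block_strength > th

def find_threshold(block_strength, message_bits, bits_per_poi=3, max_iter=30):
    """Bisection on Th in [0, 1] so that the selected POIs can hold the whole message."""
    t_min, t_max = 0.0, 1.0
    th = 0.5 * (t_min + t_max)
    for _ in range(max_iter):
        capacity = select_pois(block_strength, th).sum() * bits_per_poi
        if capacity < message_bits:
            t_max = th            # too few POIs: lower the threshold
        else:
            t_min = th            # enough POIs: try a higher (more selective) threshold
        th = 0.5 * (t_min + t_max)
    # Return the largest tested threshold that still accommodates the message.
    if select_pois(block_strength, t_max).sum() * bits_per_poi >= message_bits:
        return t_max
    return t_min

strength = np.random.rand(256, 256)          # stand-in for accumulated HOG magnitudes
th = find_threshold(strength, message_bits=280_000)
print(th, select_pois(strength, th).sum())
```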


Algorithm 4 P-ADPVD embedding algorithm.
Input: Cover image C of size M × N and secret message Sm of size N
Output: Stegoimage S of size M × N
1. Separate the color image into three color components (R, G, B).
2. Partition the cover image into a set of 2 × 2 blocks for each color channel.
3. Find the POIs of the cover image by applying the steps in Algorithm 3.
4. For each POI, calculate the difference values for the blocks based on the dominant direction:

d_h = |p_{i,j} - p_{i,j+1}|, \quad d_v = |p_{i,j} - p_{i+1,j}|, \quad d_d = |p_{i,j} - p_{i+1,j+1}|,   (2.9)

where d_h, d_v, and d_d represent the horizontal, vertical, and diagonal embedding directions, respectively.
5. For each color channel, apply the PVD technique [15] to hide secret bits in the four edge pixels of each POI, modifying the pixel values according to the dominant angle direction into (p'_{i,j}, p'_{i,j+1}), (p'_{i,j}, p'_{i+1,j}), and (p'_{i,j}, p'_{i+1,j+1}) for the three directions.
6. Calculate the total embedding capacity (EC) as the sum over the three color channels:

EC = \sum_{i=1}^{3} (EC_h^i + EC_v^i + EC_d^i),   (2.10)

where EC_h, EC_v, and EC_d are the embedding capacities in the horizontal, vertical, and diagonal directions, respectively.
7. According to Algorithm 2, if EC is less than the required length of the secret message, then modify the value of Th and repeat steps (3–7). Otherwise, go to step 8.
8. Generate the stegoimage S after embedding the whole secret message.

FIGURE 2.5 Quantization range table with number of extracted bits.
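Step 5 of Algorithm 4 relies on the classical PVD embedding of Wu and Tsai [15] applied along the chosen direction. The sketch below shows that per-pair operation for one pixel pair using an assumed quantization range table with widths 8, 8, 16, 32, 64, and 128 (the exact ranges of Fig. 2.5 may differ); it is an illustrative rendering of the general PVD rule, not the chapter's exact implementation, and overflow/underflow checks are omitted.

```python
import numpy as np

# Assumed quantization ranges [lower, upper]; a width w hides t = log2(w) bits per pair.
RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]

def find_range(d):
    return next((lo, hi) for lo, hi in RANGES if lo <= d <= hi)

def pvd_embed_pair(p1, p2, bitstream):
    """Hide bits in one pixel pair; returns the new pair and the number of bits used."""
    d = abs(p1 - p2)
    lo, hi = find_range(d)
    t = int(np.log2(hi - lo + 1))          # number of bits this pair can hold
    b = int(bitstream[:t], 2)              # assumes at least t bits remain in the stream
    d_new = lo + b                         # new difference inside the same range
    m = d_new - d                          # total change to distribute over the pair
    if p1 >= p2:                           # adjust the two pixels symmetrically
        p1_new, p2_new = p1 + int(np.ceil(m / 2)), p2 - int(np.floor(m / 2))
    else:
        p1_new, p2_new = p1 - int(np.floor(m / 2)), p2 + int(np.ceil(m / 2))
    # Note: boundary (overflow/underflow) handling of the original method is omitted here.
    return (p1_new, p2_new), t

# Example: embed the first bits of '101100' into the pair (120, 101).
pair, used = pvd_embed_pair(120, 101, "101100")
print(pair, used, abs(pair[0] - pair[1]))   # new pair, 4 bits used, new difference 27
```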


2.3.4 Extraction algorithm
In this procedure, the hidden secret bits can be extracted from the stegoimage at the receiver side. When HOG detection is applied to the stegoimage, the gradient magnitude is not changed after embedding bits in the POIs. Using the quantization range table shown in Fig. 2.5, the following steps are applied to extract the secret bit stream.

Algorithm 5 P-ADPVD extraction algorithm.
Input: Stegoimage S of size M × N
Output: Secret message Sm of size N
1. Extract the threshold value Th from the stegoimage.
2. Separate the stegoimage into three color components (R, G, B).
3. For each color channel, partition the stegoimage into nonoverlapping blocks of 2 × 2 pixels.
4. Find the POIs using the steps in Algorithm 3.
5. For each POI of the stegoimage, calculate the new difference value in each channel using

d'_h = |p'_{i,j} - p'_{i,j+1}|, \quad d'_v = |p'_{i,j} - p'_{i+1,j}|, \quad d'_d = |p'_{i,j} - p'_{i+1,j+1}|,   (2.11)

where d'_h, d'_v, and d'_d are the differences along the horizontal, vertical, and diagonal directions, respectively.
6. For each pixel in the color channel, find the optimum range according to the quantization range table of Fig. 2.5 for each dominant direction (horizontal, vertical, and diagonal). Therefore

b_h = d'_h - l_h, \quad b_v = d'_v - l_v, \quad b_d = d'_d - l_d,   (2.12)

where l_h, l_v, and l_d are the lower boundaries of the ranges for each pixel block used for embedding the secret message in the horizontal, vertical, and diagonal directions, respectively.
7. Convert b_h, b_v, and b_d into the corresponding binary hidden secret bits of length t for each block direction in each color channel (R, G, B) using

t = \log_2 w_{i,j},   (2.13)

where w_{i,j} is the range width of the pixel differences d'_h, d'_v, and d'_d.
8. Concatenate the extracted bits to obtain the secret message Sm.
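The per-pair inverse of the embedding sketch above, matching steps 5–7 of Algorithm 5 (compute the stego difference, subtract the lower bound of its range, and emit t = log2(w) bits), could look as follows; the range table is the same assumed one as before.

```python
import numpy as np

RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]  # assumed widths

def pvd_extract_pair(p1, p2):
    """Recover the hidden bits from one stego pixel pair (Algorithm 5, steps 5-7)."""
    d = abs(p1 - p2)                       # Eq. (2.11) for the relevant direction
    lo, hi = next((l, h) for l, h in RANGES if l <= d <= h)
    t = int(np.log2(hi - lo + 1))          # number of bits hidden in this pair, Eq. (2.13)
    b = d - lo                             # Eq. (2.12)
    return format(b, "0{}b".format(t))     # binary string of exactly t bits

print(pvd_extract_pair(124, 97))           # recovers '1011' from the embedding example
```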


2.4 Results and discussion
This section investigates the performance of the proposed P-ADPVD data hiding method by carrying out several experiments, where the embedding capacity, visual quality, and security benchmarks are utilized as performance measures. The proposed method is implemented and tested using MATLAB® on six color images of size 512 × 512 pixels as cover images, shown in Fig. 2.6.

FIGURE 2.6 Original cover images.

2.4.1 Embedding direction analysis using HOG
We investigate the embedding direction of the cover image using the histogram of oriented gradients for each color channel (red, green, and blue). Since a major drawback of previous PVD-based steganography methods is the use of one embedding direction to hide data in all pixels, the proposed method overcomes this problem by employing different embedding directions for each pixel using HOG. Digital images contain different edge directions that represent the shapes of objects in the image; exploiting this property by finding the dominant direction of edges in small blocks not only helps to hide data properly but also helps to increase security.


Table 2.2 Three embedding directions according to the number of POIs.

Cover images    Horizontal direction          Vertical direction            Diagonal direction
512 × 512       No. POIs   Capacity (bits)    No. POIs   Capacity (bits)    No. POIs   Capacity (bits)
Image (A)       21,387     513,288            12,358     288,216            28,325     679,800
Image (B)       31,844     764,256            6,690      160,560            24,758     594,192
Image (C)       24,455     586,920            11,170     268,080            29,228     701,472
Image (D)       27,512     660,288            7,800      187,200            30,061     721,464
Image (E)       21,680     520,320            13,035     312,840            30,816     739,584
Image (F)       24,652     591,648            9,770      234,480            31,043     745,032
Average         25,255     606,120            10,137     241,896            29,039     696,924

2.4.2 Embedding direction analysis using POI
Another major disadvantage of previous PVD-based steganography methods is their low embedding capacity compared with the LSB substitution method. We can overcome this problem by employing different embedding directions. In Table 2.2 the comparison results of the three edge directions (horizontal, vertical, and diagonal) are investigated according to the number of POIs and their corresponding embedding capacities for the six RGB images. From the obtained results we can conclude that the diagonal edge direction has the largest number of POIs, which leads to the highest embedding capacity compared with the other two directions. Besides, the diagonal direction gives an average of approximately 29,039 POIs and a capacity of 696,924 bits, where the threshold Th = 0 is used to obtain the maximum capacity for each POI direction. This is caused by the low correlation between diagonal pixels compared with the high correlation of pixels along the vertical and horizontal directions. Basically, the number of POIs depends on the texture and properties of the cover image and can be determined adaptively for embedding the secret message. For example, image (B) has the largest number of POIs, with the highest embedding capacity along the horizontal edge direction, unlike the other five test images. Fig. 2.7 shows the embedding direction of the cover images throughout the three directions (horizontal, vertical, and diagonal) for each POI, where red, green, and yellow refer to the horizontal, diagonal, and vertical edge directions, respectively. It is clear that the HOG-based algorithm can automatically capture the dominant edge directions of the cover image, which guide the PVD algorithm to embed secret data in the appropriate direction.

2.4.3 Impact of threshold value on POI
In this experiment, we show that the number of POIs changes according to the value of the threshold parameter Th. For small threshold values, the number of POIs increases; otherwise, the number of POIs decreases. Fig. 2.8 shows two sets of pixels, pixels-of-interest (POIs) and pixels-of-noninterest (PONIs), of the cover image at the threshold values Th = 0, 0.01, and 0.1, respectively.


FIGURE 2.7 The embedding direction of the cover images.

FIGURE 2.8 POIs at different threshold values Th = 0, 0.01, and 0.1, where red pixels represent POIs, and blue ones belong to PONIs.

Blue pixels refer to pixels-of-noninterest (PONIs), whereas red ones belong to the POIs. Thus the POIs are determined adaptively according to the value of Th, which can be assigned between 0 as the minimum and 1 as the maximum value. Also, this value is computed using Algorithm 2 depending on the length of the secret message.


2.4.4 Impact of threshold on capacity and visual quality
In this experiment, we study the impact of changing the threshold value on the embedding capacity and visual quality, which are used as benchmark evaluations of steganography methods [16,17]. The embedding capacity is defined as the number of secret bits that can be embedded in the cover image pixels. It can be given either as a relative percentage of the total number of pixels in the cover image or as an absolute measure such as bits per pixel (bpp). Besides, the PSNR value is used as a benchmark to measure the imperceptibility of the stegoimage, measured in decibels (dB). Table 2.3 illustrates the effects of changing the value of Th from 0 to 1, divided into the three ranges 0–0.01, 0.01–0.1, and 0.1–1. The obtained results for the first range Th = 0–0.01 give an average embedding capacity of 25% and a PSNR value of 42.89 dB, where a large secret message length is accommodated. The second range Th = 0.01–0.1 gives an average embedding capacity of 11% and a PSNR value of 45.12 dB for a medium secret message length. The third range Th = 0.1–1.00 gives an average of 6.5% embedding capacity and 50.10 dB PSNR, where a small secret message length is targeted for embedding. Thus decreasing Th allows weak edge pixels to be selected for embedding secret information, which leads to a high embedding capacity while maintaining visual quality. On the other hand, increasing Th allows only sharp edges to be used for embedding secret data, which leads to high imperceptibility of the stegoimage with acceptable embedding capacity.

Table 2.3 The effect of changing the threshold value on the embedding capacity.

Cover images    Th range = 0–0.01          Th range = 0.01–0.1        Th range = 0.1–1.00
512 × 512       PSNR (dB)  Capacity %      PSNR (dB)  Capacity %      PSNR (dB)  Capacity %
Image (A)       42.79      25.0            43.34      10.77           49.16      6.5
Image (B)       40.94      25.0            42.94      10.79           46.86      6.0
Image (C)       46.36      25.0            50.90      10.39           54.75      6.8
Image (D)       44.12      25.0            47.63      12.35           51.98      6.6
Image (E)       40.18      25.0            42.75      10.50           48.92      6.7
Image (F)       42.96      25.0            43.13      10.94           48.94      6.4
Average         42.89      25.0            45.12      11.0            50.10      6.50

Fig. 2.9 illustrates the effects of changing the value of Th from 0 to 1 on the embedding capacity (bpp) for the six stegoimages. The obtained results reveal that the number of embedded data bits depends on the amount of edge information existing in the image and increases as the value of Th decreases. The stegoimage (E) has a high embedding capacity as it contains rich edges, which results in a large number of POIs. Fig. 2.10 shows the effects of changing the value of Th from 0 to 1 on the PSNR value for the six stegoimages. The stegoimage (C) has a high PSNR value, which means better visual quality. The obtained results show that the PSNR value increases as Th increases. We conclude from this experiment that there is a trade-off between the PSNR value and the embedding capacity, which can be controlled by the threshold value Th. In addition, the threshold value can be calculated adaptively depending on the amount of data in the secret message.


FIGURE 2.9 Impact of changing threshold values on the embedding capacity.

FIGURE 2.10 Impact of changing threshold values on the visual quality.

For small sizes of the secret message, Th takes large values, whereas for a large secret message, the value of Th is small enough to accommodate all the data.

2.4.5 Visual quality analysis
In this experiment, four different image quality assessment metrics, the mean square error (MSE), peak signal-to-noise ratio (PSNR), normalized cross-correlation (NCC), and structural similarity index measurement (SSIM), are used for measuring the visual quality of a stegoimage. These four metrics are calculated to depict different quality aspects and to evaluate the efficiency of the proposed method. The MSE is used to calculate the error between the cover and stego images [18].


The PSNR can be defined as a statistical image quality estimation measure of the distortion between the cover and stego images. Higher values of PSNR mean smaller amounts of distortion and lead to high visual quality. On the other hand, lower values of PSNR indicate remarkable changes in the stegoimages, which are easily detectable by the HVS [19]. The NCC shows how strongly the stegoimage is correlated with the cover image. The value of NCC lies between 0 and 1. NCC values equal to 1 mean that the stegoimages generated by the proposed method are completely robust against various image processing attacks such as cropping, rotation, and scaling [20]. The SSIM is an image quality estimation method that compares the cover image with the stegoimage to obtain the similarity between them. It was suggested as an improvement over the two standard measures (PSNR and MSE), which have proved to be incompatible with the HVS [21]. The MSE, PSNR, NCC, and SSIM are calculated using the following formulas:

MSE = \frac{1}{M \times N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} (S - C)^2,   (2.14)

PSNR = 10 \log_{10} \frac{255^2}{MSE},   (2.15)

NCC = \frac{\sum_{i=0}^{M-1} \sum_{j=0}^{N-1} S(i,j) \times C(i,j)}{\sum_{i=0}^{M-1} \sum_{j=0}^{N-1} S(i,j)^2},   (2.16)

SSIM(C, S) = \frac{(2\mu_C \mu_S + c_1)(2\sigma_{CS} + c_2)}{(\mu_C^2 + \mu_S^2 + c_1)(\sigma_C^2 + \sigma_S^2 + c_2)},   (2.17)

where \mu_C is the average of C, \mu_S is the average of S, \sigma_C^2 is the variance of C, \sigma_S^2 is the variance of S, \sigma_{CS} is the covariance of C and S, c_1 = (K_1 L)^2 and c_2 = (K_2 L)^2 are two variables stabilizing the division with a weak denominator, L is the dynamic range of the pixel values, K_1 = 0.01 and K_2 = 0.03 by default, C is a pixel of the cover image before embedding, and S is the corresponding pixel of the stegoimage after embedding.
Another statistic-based metric utilized to show the quality of stegoimages is the universal image quality index Q [22]. High values of Q mean that the cover and stego images are highly correlated and the differences between them are very small. The universal quality index Q can be calculated as

Q = \frac{4 \sigma_{xy} \bar{x} \bar{y}}{(\sigma_x^2 + \sigma_y^2)\left[(\bar{x})^2 + (\bar{y})^2\right]},   (2.18)

where

\bar{x} = \frac{1}{M \times N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} x_{ij}, \qquad \bar{y} = \frac{1}{M \times N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} y_{ij},

\sigma_x^2 = \frac{1}{(M \times N) - 1} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} (x_{ij} - \bar{x})^2, \qquad \sigma_y^2 = \frac{1}{(M \times N) - 1} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} (y_{ij} - \bar{y})^2,   (2.19)

\sigma_{xy} = \frac{1}{M \times N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} (x_{ij} - \bar{x})(y_{ij} - \bar{y}),

where M × N is the size of the cover image, x is the value of the pixels in the cover image, y is the value of the pixels in the stegoimage, \bar{x} is the mean value of x, \bar{y} is the mean value of y, and \sigma_x^2, \sigma_y^2, and \sigma_{xy} are the variances and covariance of the x and y images, respectively.
Table 2.4 shows the results obtained by the proposed P-ADPVD method based on PSNR, MSE, NCC, and SSIM. These metrics are used for image quality assessment. In this experiment, the high values of PSNR, SSIM, and NCC show the good imperceptibility of the proposed method, which leads to stegoimages of high visual quality. The last row of the table shows the average value of each metric over the six color images. All metrics confirm the effectiveness of the proposed P-ADPVD method in hiding secret data without any remarkable changes in the stegoimage. It is clear that the stegoimages are quite imperceptible and no visual artifacts can be observed, which makes it difficult for a steganalyst to discover the existence of the hidden message, as shown in Fig. 2.11.

Table 2.4 Experimental results of the visual quality benchmarks of the proposed P-ADPVD method.

Cover images 512 × 512    PSNR     MSE     NCC      SSIM     Q
Image (A)                 52.07    0.4     0.9999   0.9999   0.9999
Image (B)                 48.45    0.92    0.9999   0.9999   0.9999
Image (C)                 61.99    0.04    1        1        0.9999
Image (D)                 56.91    0.13    1        1        0.9999
Image (E)                 48.77    0.86    0.9999   0.9999   0.9999
Image (F)                 51.73    0.42    0.9999   0.9999   0.9999
Average                   53.23    0.46    1        1        0.9999
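For completeness, a minimal Python/NumPy sketch of the MSE, PSNR, and NCC of Eqs. (2.14)–(2.16), which produce the kind of numbers reported in Table 2.4, is given below; the array names are illustrative, and per-channel handling of color images is left out.

```python
import numpy as np

def mse(cover, stego):
    """Mean square error, Eq. (2.14)."""
    c, s = cover.astype(np.float64), stego.astype(np.float64)
    return np.mean((s - c) ** 2)

def psnr(cover, stego):
    """Peak signal-to-noise ratio in dB, Eq. (2.15)."""
    err = mse(cover, stego)
    return np.inf if err == 0 else 10 * np.log10(255.0 ** 2 / err)

def ncc(cover, stego):
    """Normalized cross-correlation, Eq. (2.16)."""
    c, s = cover.astype(np.float64), stego.astype(np.float64)
    return np.sum(s * c) / np.sum(s ** 2)

cover = np.random.randint(0, 256, (512, 512))
stego = np.clip(cover + np.random.randint(-1, 2, cover.shape), 0, 255)
print(psnr(cover, stego), mse(cover, stego), ncc(cover, stego))
```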

2.4.6 Comparison with other adaptive PVD-based methods
In this experiment, we compare the proposed P-ADPVD method with other well-known adaptive PVD-based methods, including [23], [24], and [12], in terms of NCC and SSIM values using six color images from the USC-SIPI image database, namely Lena, Baboon, Jet, Peppers, Boat, and House. These images are commonly used, and a secret message of 280,000 bits is hidden in each of them for the image quality assessment. The experimental results of the proposed P-ADPVD method reported in Table 2.5 confirm the superior visual quality of the proposed method, which gives values very close to unity for the NCC and SSIM metrics. This result means that the cover and stego images are highly correlated and the differences between them are very small.


FIGURE 2.11 Stegoimages of the proposed P-ADPVD method at 25% embedding capacity.

Table 2.5 Comparison of the proposed P-ADPVD method with the adaptive PVD-based methods according to NCC and SSIM.

Cover image     1 × 2 adaptive PVD     Pradhan's method       2 × 2 adaptive PVD     Proposed method
(512 × 512)     NCC      SSIM          NCC      SSIM          NCC      SSIM          NCC      SSIM
Lena            0.9981   0.9976        0.9990   0.9989        0.9992   0.9989        0.9999   0.9996
Baboon          0.9981   0.9951        0.9992   0.9982        0.9993   0.9987        0.9998   0.9995
Peppers         0.9981   0.9976        0.9992   0.9987        0.9993   0.9989        0.9999   0.9998
Jet             0.9962   0.9958        0.9995   0.9986        0.9996   0.9989        0.9994   0.9997
Boat            0.9981   0.9975        0.9996   0.9982        0.9997   0.9989        0.9995   0.9998
House           0.9971   0.9966        0.9993   0.9991        0.9995   0.9991        0.9999   0.9998
Average         0.9976   0.9967        0.9993   0.9986        0.9994   0.9989        0.9997   0.9997

Clearly, the NCC and SSIM values for the proposed P-ADPVD method in all cases are greater than those of the methods in [23], [12], and [24]. This provides high imperceptibility for the stego color images without any noticeable changes detectable by the human visual system (HVS). The average value of each metric over the six images is given at the end of the table. This average shows that the proposed P-ADPVD method provides better results than these methods with respect to the NCC and SSIM metrics, with averages of 0.9997 and 0.9997, respectively.


Table 2.6 Comparison of the proposed P-ADPVD method with other adaptive PVD-based methods according to capacity and PSNR.

Cover image      Luo et al.'s 1 × 3 pixel      Mandal & Das' method          Swain's 2 × 2 pixel
512 × 512 × 3    block adaptive PVD            adaptive PVD                  block adaptive PVD
                 Capacity (bits)   PSNR        Capacity (bits)   PSNR        Capacity (bits)   PSNR
Lena             229,037           48.79       1,234,394         40.21       1,341,191         45.04
Baboon           611,197           48.03       1,406,405         37.14       1,489,945         47.13
Peppers          264,058           48.32       1,236,715         40.64       1,350,251         45.73
Jet              145,755           48.76       1,224,178         39.35       1,267,690         44.86
Boat             389,588           48.20       1,289,871         40.37       1,424,967         46.08
House            259,413           48.41       1,263,038         39.62       1,339,985         43.58
Average          316,508           48.42       1,275,767         39.56       1,369,005         45.40

Cover image      Pradhan's 2 × 3 pixel         Pradhan's 3 × 2 pixel         Proposed P-ADPVD
512 × 512 × 3    block adaptive PVD            block adaptive PVD            method
                 Capacity (bits)   PSNR        Capacity (bits)   PSNR        Capacity (bits)   PSNR
Lena             1,445,784         50.89       1,425,521         50.61       1,548,888         46.76
Baboon           1,532,417         52.29       1,527,208         52.36       1,571,784         43.75
Peppers          1,418,101         51.29       1,409,621         51.22       1,572,864         46.71
Jet              1,381,432         50.65       1,362,765         50.77       1,570,800         45.54
Boat             1,479,835         51.42       1,474,106         51.43       1,502,864         45.58
House            1,431,346         49.09       1,429,845         49.18       1,562,160         44.74
Average          1,448,153         50.94       1,438,178         50.93       1,554,893         45.51

2.4.7 Comparison with color image-based methods
Table 2.6 summarizes the comparison of the proposed P-ADPVD method with other well-known adaptive PVD-based methods, including the 1 × 3 pixel block [25], 2 × 2 pixel block [23], 2 × 3 and 3 × 2 pixel block [12], and Mandal and Das' [26] methods, in terms of capacity and PSNR values using six color images. The results obtained by the proposed P-ADPVD method confirm its superior embedding capacity while maintaining visual quality, with average values of 1,554,893 bits and 45.51 dB, respectively. This result means that a high amount of secret data can be embedded as all edge regions are used for hiding, and the stegoimages remain quite imperceptible. The PSNR values of the proposed method are better than those of Mandal and Das' [26] and Swain's 2 × 2 pixel block [23] methods. This also verifies its effectiveness in embedding secret data without any changes in the stegoimages that the HVS can detect with respect to their original versions, which leads to more security in comparison with other adaptive PVD methods. Generally, adaptive steganography involves a trade-off between embedding capacity and PSNR value: when a high PSNR is provided, the embedding capacity is sacrificed, and vice versa. Note that the stegoimages obtained using the proposed P-ADPVD method are visually indistinguishable from their original versions by human vision, which leads to more security in comparison with previous works.


2.4.8 Comparison with edge-based methods
This experiment is carried out to explore the performance of the proposed P-ADPVD method with respect to the number of edge pixels available for embedding using BOSSbase 1.01. In addition, the difference in edge pixels between the cover and stego images after embedding secret data is compared. The database contains 10,000 gray-scale 512 × 512 images, and 1000 images are selected randomly from it. Table 2.7 lists the total number of edge pixels, their average in percentage, and the difference in edge pixels between the cover and stego images. From the obtained results we can see that the HOG algorithm gives more edge pixels than the other three edge detectors, namely Canny, Sobel, and Prewitt, which can be used for embedding more data. The average number of edge pixels is approximately 64,638 (24.66%), whereas the average difference between the cover and stego images for the HOG edge detector is limited to 1.55%, which is less than that of the Canny edge detector. This shows that the proposed P-ADPVD method is better than the other detectors with respect to hiding capacity and security.

Table 2.7 Average difference of edge pixels between an image and its stegoimage according to four edge detection algorithms.

Algorithm    Total edge pixels    Edge pixels (average)    Edge pixels %    Difference average    Difference %
Canny        262,144              23,818                   9.1              383                   1.61
Sobel        262,144              8,451                    3.2              7                     0.09
Prewitt      262,144              8,407                    3.2              8                     0.1
HOG          262,144              64,638                   24.66            1000                  1.55

2.4.9 Security against pixel difference histogram analysis
Since PVD-based methods can be detected by a steganalyst using PVD histogram analysis, the proposed P-ADPVD method has been tested and analyzed using the pixel difference histogram (PDH). Previous works [15,27] introduced PVD histogram analysis to detect and analyze the artifacts resulting from steganography algorithms; when the embedding rate is high, it is even possible to estimate the size of the secret message. The PDH analysis shown in Figs. 2.12, 2.13, and 2.14 is performed on both the cover and stego versions of the six color images. The obtained results show that there are no remarkable changes in the histograms of the color images before and after embedding a secret message of approximately 280,000 bits. Furthermore, there is only a small distortion in the pixel values of the stegoimages, which makes it difficult for a steganalyst to discover the existence of the hidden message. The histograms obtained after applying the proposed P-ADPVD method are almost identical to those of the cover images. This analysis clarifies the robustness of the proposed method, as it hides different numbers of bits in adaptively selected pixels in all channels of the input color image.


FIGURE 2.12 PDH analysis of the proposed P-ADPVD method for Baboon and Peppers images.

FIGURE 2.13 PDH analysis of the proposed P-ADPVD method for Lena and Jet images.

FIGURE 2.14 PDH analysis of the proposed P-ADPVD method for Car and Boat images.


2.4.10 Security against statistical RS-steganalysis
The statistical methods of Westfeld [28] and Provos [29] neglect a large amount of very important information, namely the placement of pixels in the stegoimage. It is intuitively clear that by utilizing the spatial correlations in the stegoimage we should be able to build much more reliable and accurate detection. However, it is not easy to uncover and quantify the weak relationship between some pseudorandom components present in the image and the image itself. For a gray-scale cover image with M × N pixels and pixel values from the set P = {0, ..., 255}, the spatial correlations are captured using a discrimination function f that assigns a real number f(x_1, ..., x_n) ∈ R to a group of pixels G = (x_1, ..., x_n). The function f is defined as

f(x_1, \ldots, x_n) = \sum_{i=1}^{n-1} |x_{i+1} - x_i|,   (2.20)

which measures the smoothness of G: the noisier the group G, the larger the value of the discrimination function f. In the LSB embedding algorithm, as noisiness is increased in the image, the value of f increases after embedding. In typical images, flipping the group G will more frequently lead to an increase in the discrimination function f rather than a decrease. Thus the total number of regular groups is larger than the total number of singular groups. Let us denote the relative number of regular groups for a nonnegative mask m as R_m (in percent of all groups), and let S_m be the relative number of singular groups. The steganalysis method assumes that for a cover image the value of R_m is approximately equal to that of R_{-m}, and the same should hold for S_m and S_{-m}, regardless of the payload rate:

R_m \cong R_{-m} \quad \text{and} \quad S_m \cong S_{-m}.   (2.21)

It is clear that this dual-statistics steganalysis method is commonly used to explore the correlation of images in the spatial domain [30]. Figs. 2.15, 2.16, and 2.17 present the RS-analysis curves for the six color images Lena, Baboon, Jet, Peppers, Boat, and Car. The curves for R_m and R_{-m} are nearly straight lines and almost overlap with each other. Also, the curves for S_m and S_{-m} are nearly straight lines and almost overlap with each other. Thus the relation R_m ≅ R_{-m} > S_m ≅ S_{-m} holds, and the RS analysis does not detect the hidden message in the cover image when the proposed P-ADPVD method is used to hide secret data bits at different percentages of the embedding POI payload. This percentage is calculated according to the total number of edge regions N_e that are available for embedding secret data as follows:

P_{payload} = \frac{POIs}{N_e} \times 100,   (2.22)

where P_{payload} is the percentage of embedded data, POIs is the number of pixels used for hiding secret messages of various lengths to achieve different percentages of embedding, and N_e is the total number of edge regions available for hiding data.
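The sketch below illustrates how the data behind curves such as those in Figs. 2.15–2.17 can be generated: for a sweep of payload percentages computed as in Eq. (2.22), a simple stand-in embedding (random LSB changes on the selected pixels, used here only for illustration) is applied, and the R_m, R_{-m}, S_m, S_{-m} fractions are evaluated with a vectorized discrimination-function helper. The POI map, the mask [0, 1, 1, 0], and all helper names are assumptions, not the chapter's implementation.

```python
import numpy as np

def rs_fractions(img, mask):
    """R_m and S_m for groups of 4 adjacent pixels under mask-driven flipping."""
    g = img.astype(np.int16).reshape(-1, 4)              # nonoverlapping groups of 4
    f0 = np.abs(np.diff(g, axis=1)).sum(axis=1)          # discrimination function, Eq. (2.20)
    m = np.asarray(mask)
    flipped = g + np.where(m == 1, 1 - 2 * (g % 2), 0) \
                + np.where(m == -1, 2 * (g % 2) - 1, 0)  # F1 / F-1 flipping per mask entry
    f1 = np.abs(np.diff(flipped, axis=1)).sum(axis=1)
    return np.mean(f1 > f0), np.mean(f1 < f0)            # (R_m, S_m)

def rs_sweep(cover, poi_indices, n_edge_regions, rates=(0.25, 0.5, 0.75, 1.0)):
    """P_payload of Eq. (2.22) and the four RS statistics at several embedding rates."""
    rng, mask = np.random.default_rng(1), [0, 1, 1, 0]
    rows = []
    for rate in rates:
        used = poi_indices[: int(rate * len(poi_indices))]
        stego = cover.copy()
        stego.ravel()[used] ^= rng.integers(0, 2, len(used), dtype=np.uint8)  # toy LSB embedding
        p_payload = 100.0 * len(used) / n_edge_regions                         # Eq. (2.22)
        rm, sm = rs_fractions(stego, mask)
        r_neg, s_neg = rs_fractions(stego, [-v for v in mask])
        rows.append((p_payload, rm, r_neg, sm, s_neg))
    return rows

cover = np.random.randint(0, 256, (128, 128), dtype=np.uint8)
poi = np.flatnonzero(np.random.rand(128, 128) > 0.75)    # stand-in POI locations
for row in rs_sweep(cover, poi, n_edge_regions=len(poi)):
    print("payload %5.1f%%  Rm %.3f  R-m %.3f  Sm %.3f  S-m %.3f" % row)
```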


FIGURE 2.15 RS-analysis of the proposed P-ADPVD method over Lena and Baboon images.

FIGURE 2.16 RS-analysis of the proposed P-ADPVD method over Airplane and Peppers images.

FIGURE 2.17 RS-analysis of the proposed P-ADPVD method over Car and Boat images.


From the obtained results shown in these figures we see that the proposed P-ADPVD method is robust against statistical RS-analysis at different embedding capacity rates (1–100)%, which leads to more security against steganalysis attackers.

2.5 Conclusion
In this chapter, we have introduced a pixel-based adaptive directional pixel value differencing (P-ADPVD) data hiding method consisting of five algorithms: HOG edge detection, threshold computation, POI identification, embedding, and extraction. The main advantage of the proposed P-ADPVD method is embedding secret data in three different edge directions rather than in only one direction as in the PVD-based methods. Experimental results and discussion are provided to show the efficiency of the proposed P-ADPVD method using different evaluation metrics, including hiding capacity, visual quality, and security. The obtained results show that the P-ADPVD method resists steganalysis attacks such as pixel difference histogram analysis and statistical RS-analysis on various image databases.


3 An improved method for high hiding capacity based on LSB and PVD

Aditya Kumar Sahu a,b, Gandharba Swain a

a Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India
b Department of Computer Science and Engineering, GMRIT, Rajam, Andhra Pradesh, India

3.1 Introduction

The Internet has always been the preferred medium for digital data transmission. However, transmitting sensitive data through a public channel is not secure [1]. Hence data hiding approaches are essential to protect the data during transmission. Data hiding can be broadly classified into two major categories, cryptography and steganography [2]. Steganography differs from cryptography in that the very existence of the data is hidden from the adversary. It uses cover media in the form of digital objects, such as digital video, image, audio, and text. In image steganography, secret data is hidden inside digital images. The image with the secret data inside it is known as the stego image. Image steganography can be performed in two domains, the spatial domain and the frequency domain. Spatial domain methods directly embed the secret data in the pixels of the cover image. Least significant bit (LSB) substitution, pixel value differencing (PVD), exploiting modification directions (EMD), histogram shift, spread spectrum, and pixel indicator are some of the popular methods in the spatial domain [3].

The LSB substitution techniques directly replace the LSB bits of the cover image pixels to hide data [4]. Although LSB substitution techniques perform better in terms of hiding capacity, the quality of the stego image degrades beyond a certain threshold capacity. Wu and Hwang [5] proposed a modified LSB substitution to reduce the distortion in the stego image. This method hides three bits in a block of three pixels with a ±1 modification to each pixel. The expected number of modifications per pixel (ENMPP) is significantly reduced compared to the conventional LSB substitution method. Sahu and Swain [6] proposed an improved LSB matching method to increase the capacity while retaining the image quality. Various steganography methods using LSB substitution and LSB matching with higher hiding capacity and better visual quality have been described in [7–12].

Wu and Tsai [13] introduced a way of embedding data in a cover image called pixel value differencing (PVD).


Here the difference value is calculated from a block consisting of two consecutive pixels. The number of bits to be embedded in a block of two pixels is decided by the range to which the difference value belongs. They proposed two different quantized range tables covering 0 to 255. The one with widths of {2, 2, 4, 4, 4, 8, 8, 16, 16, 32, 32, 64, 64} gives better image quality, and the other with widths of {8, 8, 16, 32, 64, 128} gives better hiding capacity. A detailed explanation of Wu and Tsai's [13] method is given in Section 3.2.1. The PVD method provides better security, but it suffers from FOBP, that is, some stego-pixel values fall outside the range of 0 to 255. Various PVD-based methods [17–19,21–24] exist in the literature. The PVD methods are resilient to RS attacks, but they have limitations with respect to the hiding capacity.

In this respect, to achieve better hiding capacity with less distortion of the stego image, Wu et al. [14] suggested a data hiding method using LSB substitution and PVD. The difference value between two consecutive pixels decides whether an area of the image is smooth or an edge area; LSB substitution is applied in smooth areas, and PVD in edge areas, for data embedding. Swain [25] proposed a combined LSB and PVD method, which improves the hiding capacity compared to Wu et al. [14]. Darabkh et al. [26] proposed a combined PVD and LSB substitution method called multidirectional PVD with 2 × 3 and 3 × 3 pixel blocks. The experimental results show that this method provides better capacity than PVD [13], TPVD [22], and Octa PVD [23] while maintaining acceptable visual quality of the stego image. Liao et al. [27] suggested an improved method using four-pixel differencing and LSB substitution. The pixels are divided into blocks of four pixels, and the average difference value is used to identify the smooth and edge areas of a block for data embedding.

The LSB substitution and PVD steganography methods suffer from three major problems: (i) the fall off boundary problem (FOBP), (ii) incorrect data extraction, and (iii) weakness to the RS attack. Hence, to address these problems, we propose an improved steganographic method using the principles of LSB substitution and PVD. The major contributions of the proposed method are highlighted below.

1. The hiding capacity and the PSNR have been improved significantly as compared to the existing methods considered in this work.
2. The proposed method evades the FOBP, which usually occurs in most of the PVD methods.
3. The proposed method successfully resists the RS and pixel difference histogram (PDH) attacks.

3.2 Related work

3.2.1 Pixel value differencing (PVD) steganography [13]

We consider a nonoverlapping block consisting of two consecutive pixels for data embedding. The embedding and extraction procedures are described below.


3.2.1.1 The PVD embedding procedure

Step 1: Suppose Pi and Pi+1 are the two consecutive pixels of a nonoverlapping block. A difference value dold is obtained as dold = |Pi+1 − Pi|.
Step 2: The range Ri for dold is identified, and the corresponding lower bound Lb and upper bound Ub for the range Ri are obtained. The range table for Wu and Tsai [13] is shown in Table 3.1.

Table 3.1 Range table for Wu and Tsai [13].

Range (Ri) = (Lb, Ub):   R1 = [0, 7]   R2 = [8, 15]   R3 = [16, 31]   R4 = [32, 63]   R5 = [64, 127]   R6 = [128, 255]
Width:                   8             8              16              32               64               128

Step 3: The number of bits b that can be hidden in a block is calculated by Eq. (3.1). The decimal value of the b secret bits is represented as bd:

    b = ⌊log2(Ub − Lb + 1)⌋.                                            (3.1)

Step 4: After hiding the data, the new difference value dnew is calculated as dnew = bd + Lb.
Step 5: The difference value m = |dnew − dold|.
Step 6: The stego-pixels P′i and P′i+1 are obtained using the equation

    (P′i, P′i+1) = (Pi + m/2, Pi+1 − m/2)   if Pi ≥ Pi+1 and dnew > dold,
                   (Pi − m/2, Pi+1 + m/2)   if Pi < Pi+1 and dnew > dold,
                   (Pi − m/2, Pi+1 + m/2)   if Pi ≥ Pi+1 and dnew ≤ dold,
                   (Pi + m/2, Pi+1 − m/2)   if Pi < Pi+1 and dnew ≤ dold.     (3.2)

3.2.1.2 The PVD extraction steps

Step 1: The stego-pixels in a block of the stego image are P′i and P′i+1.
Step 2: Compute the difference dstego = |P′i − P′i+1|.
Step 3: The secret data can be obtained by converting |dstego − Lb| to b binary bits. The value of b is obtained as in the embedding procedure.

3.2.1.3 Illustration of the PVD method

Step 1: Let the two consecutive pixels of a cover image be Pi = 80 and Pi+1 = 100.
Step 2: The original difference value dold = |Pi − Pi+1| = 20 ∈ R3. The corresponding values are Lb = 16 and Ub = 31.
Step 3: The hiding capacity for this block is b = log2(Ub − Lb + 1) = 4. Suppose the 4 bits of binary secret data are 1000₂. Their decimal value is 8, that is, bd = 8.
Step 4: The new difference value dnew = bd + Lb = 24, where bd = 8 and Lb = 16.
Step 5: Now m = |dnew − dold| = 4.
Step 6: The stego-pixels P′i and P′i+1 are computed using Eq. (3.2) as P′i = 78 and P′i+1 = 102.
Step 7: At the receiver side, the stego difference is found as dstego = |78 − 102| = 24 ∈ R3. The secret data is recovered as dstego − Lb = 24 − 16 = 8 = 1000₂.
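To make these steps concrete, the following Python sketch (an illustration written for this chapter's example, not the authors' implementation; the ceiling/floor split of m is one common convention) embeds and extracts a single two-pixel block using the coarse range table of Table 3.1 and reproduces the numbers of the illustration above.

```python
# Minimal sketch of Wu-Tsai PVD on one two-pixel block (illustrative only).
from math import floor, log2

# Range table of Table 3.1: the coarse table with widths 8, 8, 16, 32, 64, 128.
RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]

def find_range(d):
    """Return (lower bound, upper bound) of the range containing difference d."""
    for lb, ub in RANGES:
        if lb <= d <= ub:
            return lb, ub

def pvd_embed(p1, p2, secret_bits):
    """Embed as many leading secret bits as the block allows; return the stego pixels and bits used."""
    d_old = abs(p2 - p1)
    lb, ub = find_range(d_old)
    b = floor(log2(ub - lb + 1))                  # Eq. (3.1)
    bd = int(secret_bits[:b], 2)
    d_new = bd + lb                               # new difference value
    m = abs(d_new - d_old)
    # Eq. (3.2): spread m over the two pixels, preserving the sign pattern of the block.
    if (p1 >= p2) == (d_new > d_old):
        q1, q2 = p1 + (m + 1) // 2, p2 - m // 2   # ceil/floor split of m
    else:
        q1, q2 = p1 - (m + 1) // 2, p2 + m // 2
    return q1, q2, b

def pvd_extract(q1, q2):
    """Recover the embedded bits from a stego block."""
    d = abs(q1 - q2)
    lb, ub = find_range(d)
    b = floor(log2(ub - lb + 1))
    return format(d - lb, '0{}b'.format(b))

# Reproduces the illustration of Section 3.2.1.3: (80, 100) with secret 1000 gives (78, 102).
print(pvd_embed(80, 100, "1000"))   # (78, 102, 4)
print(pvd_extract(78, 102))         # "1000"
```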

3.2.2 Khodaei et al.'s method [20]

Khodaei et al. proposed an adaptive data hiding method using the combination of LSB substitution and PVD. A block of two consecutive pixels is chosen for embedding the secret data. We outline the embedding and extraction procedures as follows.

Step 1: The pixel value of a gray image can range from 0 to 255. This range is partitioned into two separate subranges: (i) 0 to 191 and (ii) 192 to 255.
Step 2: If both Px and Px+1 are smaller than 191, then using 3-LSB substitution, the secret bits are embedded in both pixels.
Step 3: In case any of the pixels of a block is larger than 191, the difference value d = |Px − Px+1| is computed, and using the suggested ranges in Table 3.2, the number of embedding bits b for the block is decided.

Table 3.2 Range for Khodaei et al.

Range (Ri):          R1 = [0, 15]   R2 = [16, 63]   R3 = [64, 127]   R4 = [128, 255]
Width:               16             48              64               128
Block capacity, b:   8              10              12               14

Step 4: Let P′x and P′x+1 be the new pixels after embedding the secret data. Obtain the new difference value d′ = |P′x − P′x+1|. If d and d′ belong to the same range in Table 3.2, then these are the obtained stego-pixels; rename them as P∗x and P∗x+1. Otherwise, the stego-pixels are obtained using the pixel adjustment process as follows. Eq. (3.3) is used to obtain two new modified values from each of P′x and P′x+1:

    P″x = P′x + 2^(b/2),   P‴x = P′x − 2^(b/2),   P″x+1 = P′x+1 + 2^(b/2),   P‴x+1 = P′x+1 − 2^(b/2).     (3.3)

Step 5: There can be eight possible stego-pixel pairs: (P′x, P″x+1), (P′x, P‴x+1), (P″x, P′x+1), (P″x, P″x+1), (P″x, P‴x+1), (P‴x, P′x+1), (P‴x, P″x+1), and (P‴x, P‴x+1). The optimal pair is the one producing the lowest difference from the original pair.
Step 6: At extraction time, if both P∗x and P∗x+1 are less than 191, then extract the 3 LSBs from each stego-pixel. Otherwise, compute the difference d∗ = |P∗x − P∗x+1|. If d∗ ∈ R1, then extract the 4 LSBs of both stego-pixels. Similarly, obtain the 5, 6, and 7 LSBs of the stego-pixels if d∗ ∈ R2, R3, and R4, respectively.

3.2.2.1 An illustration of incorrect data extraction in Khodaei et al.'s method

It has been observed that Khodaei et al.'s method fails to extract the correct data at the receiver side due to the presence of error blocks (EBs). Suppose the original pixels of a block are Px = 240 and Px+1 = 90. As Px > 191, d = |Px − Px+1| = 150. Since 150 ∈ R4, b = 14, so 7 bits of secret data will be hidden in each of the two pixels. Let the 14 bits of secret data be 00110000111001₂. After substituting 7 bits each into the LSBs of the cover image pixels, the new pixels are P′x = 10011000₂ = 152 and P′x+1 = 00111001₂ = 57. The new difference value d′ = |P′x − P′x+1| = |152 − 57| = 95 ∈ R3. We can observe that d and d′ belong to different ranges, so pixel readjustment is required. The modified pixels obtained from P′x and P′x+1 using Eq. (3.3) are P″x = P′x + 2^(b/2) = 152 + 128 = 280, P‴x = P′x − 2^(b/2) = 152 − 128 = 24, P″x+1 = P′x+1 + 2^(b/2) = 57 + 128 = 185, and P‴x+1 = P′x+1 − 2^(b/2) = 57 − 128 = −71. The eight possible pairs are (P′x, P″x+1) = (152, 185), (P′x, P‴x+1) = (152, −71), (P″x, P′x+1) = (280, 57), (P″x, P″x+1) = (280, 185), (P″x, P‴x+1) = (280, −71), (P‴x, P′x+1) = (24, 57), (P‴x, P″x+1) = (24, 185), and (P‴x, P‴x+1) = (24, −71). We can observe that none of the obtained eight pairs is optimal, since no pixel from the obtained pairs belongs to the range [192, 255]. Hence such blocks cannot be utilized for data embedding. In such a case the stego-pixel pair is the same as the original pixel pair, that is, P∗x = 240 and P∗x+1 = 90. At the receiver side, the receiver has no idea about such blocks. Since d∗ = |P∗x − P∗x+1| = |240 − 90| = 150 ∈ R4, 1110000₂ will be extracted from P∗x = 240, and 1011010₂ will be extracted from 90. Therefore the extracted bits are 11100001011010₂, whereas the embedded bits are 00110000111001₂. Since no data has been hidden inside the block, the extracted bits are incorrect. Therefore Khodaei et al.'s method suffers from the incorrect data extraction problem. This issue arises due to the invalid partition of the subranges and the fact that seven of the eight bits of the pixels are used for embedding the secret data.
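This failure can be reproduced with a few lines of Python. The sketch below (illustrative only; the helper names are ours) substitutes 7 LSBs into each pixel of the example block, detects that the difference has left its original range of Table 3.2, and enumerates the eight adjustment candidates, none of which is a usable pixel pair.

```python
# Sketch of the error-block situation in Khodaei et al.'s method (illustrative only).
RANGES = [(0, 15, 4), (16, 63, 5), (64, 127, 6), (128, 255, 7)]   # (Lb, Ub, bits per pixel), per Table 3.2

def range_of(d):
    return next((lb, ub, n) for lb, ub, n in RANGES if lb <= d <= ub)

def sub_lsbs(p, bits):
    """Replace the len(bits) least significant bits of p with the given bit string."""
    return (p >> len(bits) << len(bits)) | int(bits, 2)

px, px1 = 240, 90                        # original block; Px > 191, so the PVD branch is used
secret = "00110000111001"                # 14 secret bits
lb, ub, n = range_of(abs(px - px1))      # d = 150 lies in R4, so n = 7 bits per pixel

qx, qx1 = sub_lsbs(px, secret[:n]), sub_lsbs(px1, secret[n:])     # 152 and 57
print(qx, qx1, range_of(abs(qx - qx1)) == (lb, ub, n))            # difference moved to R3: adjustment needed

step = 2 ** n                            # 2^(b/2) = 128 for b = 14
candidates = [(a, b) for a in (qx, qx + step, qx - step)
                     for b in (qx1, qx1 + step, qx1 - step)
                     if (a, b) != (qx, qx1)]
# None of the eight candidates is a valid pixel pair with a component in [192, 255],
# so the block is left unchanged and the receiver later extracts the wrong bits.
print(candidates)
```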

3.2.3 Jung's method [15]

In this section, we explain the embedding and extraction algorithms of Jung's method. First, the cover image is divided into blocks, each block containing a pair of pixels. Each pixel has two bit planes, where one plane uses LSB substitution, and the other plane uses PVD. The embedding and extraction algorithms are discussed below.

3.2.3.1 Embedding algorithm

Step 1: Let Pi and Pi+1 be the original pixels. Suppose the two bit planes for Pi are (Px and Py) and for Pi+1 are (Px+1 and Py+1), which can be computed using the equation

    (Px, Px+1) = (Pi div 2^k, Pi+1 div 2^k)   and   (Py, Py+1) = (Pi mod 2^k, Pi+1 mod 2^k).     (3.4)

Step 2: Let di be the difference value, computed as

    di = |Px+1 − Px|.     (3.5)

Step 3: Suppose b is the number of bits that can be embedded, computed using the equation

    b = ⌊log2(Ub − Lb + 1)⌋,     (3.6)

where Lb and Ub are the respective lower and upper bounds of the range for the difference value di. The ranges are shown in Table 3.3.


Table 3.3 Ranges of Jung's technique [15].

Range (Ri) = (Lb, Ub):               R1 = [0, 7]   R2 = [8, 15]   R3 = [16, 31]   R4 = [32, 63]
No. of bits hidden in a block, b:    3             3              4               5

Step 4: The new difference value d′i can be computed using the equality

    d′i = Lb + bdj,     (3.7)

where bdj is the decimal value of the b secret bits.

Step 5: Now calculate m = |d′i − di| and obtain the new pixel pair (P′x, P′x+1) using the equation

    (P′x, P′x+1) = (Px − ⌈m/2⌉, Px+1 + ⌊m/2⌋)   if di is even,
                   (Px − ⌊m/2⌋, Px+1 + ⌈m/2⌉)   if di is odd.     (3.8)

Step 6: Take k bits of binary secret data and convert them to decimal; let it be b1. Similarly, take another k bits of binary data and convert them to decimal; let it be b2. Here k can be 2 or 3.

Step 7: The stego-pixel pair (P∗i, P∗i+1) can be computed using the equation

    (P∗i, P∗i+1) = (P′x × 2^k + b1, P′x+1 × 2^k + b2).     (3.9)

3.2.3.2 Extraction algorithm

Step 1: The stego-pixel pair is (P∗i, P∗i+1). Obtain the values of (P∗x, P∗x+1) using the equation

    P∗x = (P∗i − (P∗i mod 2^k)) div 2^k   and   P∗x+1 = (P∗i+1 − (P∗i+1 mod 2^k)) div 2^k.     (3.10)

Step 2: Using the difference value d∗i = |P∗x+1 − P∗x|, find bd∗j, the decimal value of the embedded secret data, using the equation

    bd∗j = d∗i − Lb,     (3.11)

where Lb is the lower bound of the range for d∗i. Convert bd∗j to b binary bits and append these bits to the extracted binary bit stream.

Step 3: Obtain b3 and b4 using the equation

    b3 = P∗i mod 2^k   and   b4 = P∗i+1 mod 2^k.     (3.12)

Convert b3 and b4 to k-bit binary and append these bits to the extracted binary bit stream. The extraction is done.

3.2.3.3 FOBP in Jung's method

An image is a combination of pixels. A pixel of a gray image consists of 8 bits, so the minimum pixel value in a gray image is 0, and the maximum is 255. If any of the stego-pixels goes outside this permissible range, then we say that the approach suffers from FOBP. This section proves the presence of FOBP in Jung's method.

Step 1: Suppose the original pixels are Pi = 132 and Pi+1 = 252, and k = 2. Obtain the two bit planes (Px and Py) and (Px+1 and Py+1) for Pi and Pi+1 from Eq. (3.4) as (Px, Px+1) = (33, 63) and (Py, Py+1) = (0, 0).
Step 2: Find the difference value di = |63 − 33| = 30 ∈ R3.
Step 3: Let the secret data in binary be 00100000₂. The number of bits of secret data b that can be embedded is computed using Eq. (3.6): b = log2(31 − 16 + 1) = 4. Let bdj be the decimal value of the b binary bits, i.e., the decimal value of 0010₂. Thus bdj = 2.
Step 4: The new difference value d′i is computed using Eq. (3.7) as d′i = 16 + 2 = 18.
Step 5: The pixel pair P′x and P′x+1 is computed using Eq. (3.8) as P′x = 33 − 6 = 27 and P′x+1 = 63 + 6 = 69, where m = |18 − 30| = 12.
Step 6: Let k = 2. So b1 = 00₂ and b2 = 00₂.
Step 7: The stego-pixel pair (P∗i, P∗i+1) is computed using Eq. (3.9). So P∗i = (27 × 4) + 0 = 108 and P∗i+1 = (69 × 4) + 0 = 276. The stego-pixel P∗i+1 exceeds the maximum value of 255. Hence FOBP arises in Jung's method.
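The overflow can be reproduced directly. The sketch below is an illustrative Python reading of Eqs. (3.4)–(3.9) (the even/odd rounding detail of Eq. (3.8) is simplified here to a single ceiling/floor split), applied to the block of this example.

```python
# Sketch reproducing the FOBP example for Jung's method (illustrative only).
from math import floor, log2

RANGES = [(0, 7), (8, 15), (16, 31), (32, 63)]       # Table 3.3

def jung_embed_block(p1, p2, bits, k=2):
    """Embed on one block following Eqs. (3.4)-(3.9); returns the stego pair (which may overflow)."""
    px, px1 = p1 // 2**k, p2 // 2**k                  # upper bit planes, Eq. (3.4)
    d = abs(px1 - px)
    lb, ub = next(r for r in RANGES if r[0] <= d <= r[1])
    b = floor(log2(ub - lb + 1))                      # Eq. (3.6)
    d_new = lb + int(bits[:b], 2)                     # Eq. (3.7)
    m = abs(d_new - d)
    qx, qx1 = px - (m + 1) // 2, px1 + m // 2         # Eq. (3.8), simplified rounding
    b1 = int(bits[b:b + k], 2)                        # k bits for the first pixel's lower plane
    b2 = int(bits[b + k:b + 2 * k], 2)                # k bits for the second pixel's lower plane
    return qx * 2**k + b1, qx1 * 2**k + b2            # Eq. (3.9)

stego = jung_embed_block(132, 252, "00100000", k=2)
print(stego)        # (108, 276): the second stego-pixel exceeds 255, i.e., FOBP
```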

3.2.3.4 Extraction problem in Jung's method

This section explains how Jung's method fails to extract the embedded data. Let the gray values of the two original pixels Pi and Pi+1 be 160 and 156, respectively. So Px = 40 and Px+1 = 39. The difference value di = |40 − 39| = 1 ∈ R1. Let the secret bits be 1111000₂. The number of secret bits that can be embedded is b = log2(7 − 0 + 1) = 3. Now bdj = 111₂ = 7. The new difference value d′i = 0 + 7 = 7, and m = |d′i − di| = |7 − 1| = 6. Using Eq. (3.8), we obtain P′x = 37 and P′x+1 = 42. Let k = 2. So b1 = 10₂ and b2 = 00₂. Finally, using Eq. (3.9), the two stego-pixels are computed as P∗i = 150 and P∗i+1 = 168. At the extraction side the values of P∗x and P∗x+1 are found using Eq. (3.10): P∗x = 37 and P∗x+1 = 42. The difference value d∗i = |P∗x+1 − P∗x| = 5 ∈ R1, so the recovered value is bd∗j = 5 − 0 = 5, whose binary form is 101₂; furthermore, b3 = P∗i mod 2^k = 150 mod 4 = 2 = 10₂ and b4 = P∗i+1 mod 2^k = 168 mod 4 = 0 = 00₂. Finally, the extracted bits are 1011000₂, but the embedded bits are 1111000₂. Hence the extraction problem occurs in Jung's method.

3.3 The proposed method

A nonoverlapping block consisting of three consecutive pixels is considered for data embedding. The proposed method has been tested with the two separate range tables shown in Tables 3.4 and 3.5. When the ranges specified in Table 3.4 are used for embedding the secret data, the proposed method is said to be of type 1. Similarly, if the ranges in Table 3.5 are used, the proposed method is said to be of type 2. The embedding and extraction procedures of the proposed method are discussed in Sections 3.3.1 and 3.3.2, respectively. Figs. 3.1 and 3.2 show the block diagrams of the embedding and extraction procedures.


FIGURE 3.1 Block diagram for the embedding procedure.


Table 3.4 Ranges for the proposed method, type 1.

Range (Ri) = (Lb, Ub):   R1 = [0, 7]   R2 = [8, 15]   R3 = [16, 31]   R4 = [32, 63]   R5 = [64, 255]
Capacity:                3             3              3               4               4

Table 3.5 Ranges for the proposed method, type 2.

Range (Ri) = (Lb, Ub):   R1 = [0, 7]   R2 = [8, 15]   R3 = [16, 31]   R4 = [32, 63]   R5 = [64, 255]
Capacity:                3             3              4               5               6

FIGURE 3.2 Block diagram for the extraction procedure.

3.3.1 Embedding procedure

Step 1: Let the three pixels of a block be Pi, Pi+1, and Pi+2.

Step 2: Suppose Pi ≤ Pi+1 and Pi ≤ Pi+2. Then the reference pixel is Pi, and steps 3 to 7 are applied. Otherwise, the reference pixel is Pi+1, and steps 8 to 12 are applied to produce the stego-pixels.

Step 3: Read k bits of secret data and substitute them in the k LSBs of Pi to obtain gi. The value of k can be either 3 or 4. Let kold be the decimal value of the k LSBs of Pi, and let knew be the decimal value of the k LSBs of gi. Compute the difference value kdif = kold − knew. Now update Pi using the equation

    Pi = gi + 2^k   if kdif > 2^(k−1) and 0 ≤ gi + 2^k ≤ 255,
         gi − 2^k   if kdif < −2^(k−1) and 0 ≤ gi − 2^k ≤ 255,
         gi          otherwise.     (3.13)

Find the difference values v1 = |Pi+1 − Pi| and v2 = |Pi+2 − Pi|. From the proposed range table obtain the ranges for v1 and v2 and select the numbers of secret bits to embed as n1 and n2. Convert the n1 and n2 secret bits to their corresponding decimal values decn1 and decn2, respectively. Now compute the new difference values v′1 = Lb1 + decn1 and v′2 = Lb2 + decn2, where Lb1 and Lb2 are the lower bounds of the respective ranges. Obtain two new values P′i+1 and P″i+1 for Pi+1 and, similarly, two new values P′i+2 and P″i+2 for Pi+2 using the equation

    P′i+1 = Pi + v′1,   P″i+1 = Pi − v′1,   P′i+2 = Pi + v′2,   P″i+2 = Pi − v′2.     (3.14)

Now update Pi+1 and Pi+2 using the following equations:

    Pi+1 = P′i+1 if |Pi+1 − P′i+1| < |Pi+1 − P″i+1|, and Pi+1 = P″i+1 otherwise,     (3.15)

    Pi+2 = P′i+2 if |Pi+2 − P′i+2| < |Pi+2 − P″i+2|, and Pi+2 = P″i+2 otherwise.     (3.16)

Step 4: Now find the smallest of the pixels Pi, Pi+1, and Pi+2 and store it in Psmall. Compute the difference values d1 = Pi − Pi+1 and d2 = Pi − Pi+2. Update Pi, Pi+1, and Pi+2 using Eqs. (3.17), (3.18), and (3.19):

    Pi = Pi                   if Psmall = Pi and Pi ≤ Pi+1 and Pi ≤ Pi+2,
         Pi                   if Psmall = Pi or Pi+1 or Pi+2 and Pi = Pi+1 = Pi+2,
         Pi − 2 × |d1|        if Psmall = Pi+1 and d1 > d2,
         Pi − 2 × |d2|        if Psmall = Pi+1 and d1 = d2,
         Pi − 2 × |d2|        if Psmall = Pi+2 and d1 < d2,     (3.17)

    Pi+1 = Pi+1                              if Psmall = Pi and Pi ≤ Pi+1 and Pi ≤ Pi+2,
           Pi+1                              if Psmall = Pi or Pi+1 or Pi+2 and Pi = Pi+1 = Pi+2,
           Pi+1                              if Psmall = Pi+1 and d1 > d2,
           Pi+1 − (|Pi+1 − Pi| − |d1|)       if Psmall = Pi+1 and d1 = d2,
           Pi+1 − (|Pi+1 − Pi| − |d1|)       if Psmall = Pi+2 and d1 < d2,     (3.18)

    Pi+2 = Pi+2                              if Psmall = Pi and Pi ≤ Pi+1 and Pi ≤ Pi+2,
           Pi+2                              if Psmall = Pi or Pi+1 or Pi+2 and Pi = Pi+1 = Pi+2,
           Pi+2 − (|Pi+2 − Pi| − |d2|)       if Psmall = Pi+1 and d1 > d2,
           Pi+2                              if Psmall = Pi+1 and d1 = d2,
           Pi+2                              if Psmall = Pi+2 and d1 < d2.     (3.19)

Step 5: If the updated value of Pi does not exceed the original pixel Pi, go to step 6. Otherwise, go to step 7.

Step 6: Compute d3 as the absolute difference between the original and updated values of Pi, and d4 = 2^k − d3. Obtain the updated values using Eq. (3.20):

    Pi = Pi + d4,   Pi+1 = Pi+1 + d4,   Pi+2 = Pi+2 + d4.     (3.20)

Now, if the values of Pi, Pi+1, and Pi+2 lie below 255, then these are the stego-pixels, and we rename them as P∗i, P∗i+1, and P∗i+2. Otherwise, if any of the pixels goes beyond 255, then perform case 1 to readjust the stego-pixels.

Step 7: Compute d3 as the absolute difference between the original and updated values of Pi, and d4 = 2^k − d3. Obtain the updated values using Eq. (3.21):

    Pi = Pi − d4,   Pi+1 = Pi+1 − d4,   Pi+2 = Pi+2 − d4.     (3.21)

Now, if the values of Pi, Pi+1, and Pi+2 are above 0, then these are the stego-pixels, and we rename them as P∗i, P∗i+1, and P∗i+2. Otherwise, if any of the pixels goes below 0, then perform case 2 to readjust the stego-pixels.

Step 8: Read k bits of secret data and substitute them in the k LSBs of Pi+1 to obtain gi+1. The value of k can be either 3 or 4. Let kold be the decimal value of the k LSBs of Pi+1, and let knew be the decimal value of the k LSBs of gi+1. Compute the difference value kdif = kold − knew. Now update Pi+1 using Eq. (3.22):

    Pi+1 = gi+1 + 2^k   if kdif > 2^(k−1) and 0 ≤ gi+1 + 2^k ≤ 255,
           gi+1 − 2^k   if kdif < −2^(k−1) and 0 ≤ gi+1 − 2^k ≤ 255,
           gi+1          otherwise.     (3.22)

Find the difference values v11 = |Pi − Pi+1| and v22 = |Pi+2 − Pi+1|. From the proposed range table obtain the ranges for v11 and v22 and select the numbers of secret bits to embed as n11 and n22. Convert the n11 and n22 secret bits to their corresponding decimal values decn11 and decn22, respectively. Now compute the new difference values v′11 = Lb1 + decn11 and v′22 = Lb2 + decn22, where Lb1 and Lb2 are the lower bounds of the respective ranges. Obtain two new values P′i and P″i for Pi and, similarly, two new values P′i+2 and P″i+2 for Pi+2 using the equations

    P′i = Pi+1 + v′11,   P″i = Pi+1 − v′11,   P′i+2 = Pi+1 + v′22,   P″i+2 = Pi+1 − v′22.     (3.23)

Now update Pi and Pi+2 using Eqs. (3.24) and (3.25):

    Pi = P′i if |Pi − P′i| < |Pi − P″i|, and Pi = P″i otherwise,     (3.24)

    Pi+2 = P′i+2 if |Pi+2 − P′i+2| < |Pi+2 − P″i+2|, and Pi+2 = P″i+2 otherwise.     (3.25)

Step 9: Obtain the smallest of the pixels Pi, Pi+1, and Pi+2 and store it in Psmall. Compute the difference values t1 = Pi+1 − Pi and t2 = Pi+1 − Pi+2. Update Pi, Pi+1, and Pi+2 using Eqs. (3.26), (3.27), and (3.28):

    Pi = Pi (Pi is left unchanged in each of the four cases below),     (3.26)

    Pi+1 = Pi+1                 if Psmall = Pi+1 or Pi+2 and (Pi+1 ≤ Pi or Pi+2 ≤ Pi),
           Pi+1                 if Psmall = Pi or Pi+1 or Pi+2 and Pi = Pi+1 = Pi+2,
           Pi+1 − 2 × |t1|      if Psmall = Pi and t1 > t2,
           Pi+1 − 2 × |t1|      if Psmall = Pi and t1 ≤ t2,     (3.27)

    Pi+2 = Pi+2                              if Psmall = Pi+1 or Pi+2 and (Pi+1 ≤ Pi or Pi+2 ≤ Pi),
           Pi+2                              if Psmall = Pi or Pi+1 or Pi+2 and Pi = Pi+1 = Pi+2,
           Pi+2 − (|Pi+2 − Pi+1| − |t2|)     if Psmall = Pi and t1 > t2,
           Pi+2 − (|Pi+2 − Pi+1| − |t2|)     if Psmall = Pi and t1 ≤ t2.     (3.28)

Step 10: If the updated value of Pi+1 is less than the original pixel Pi+1, go to step 11. Otherwise, go to step 12.

Step 11: Compute t3 as the absolute difference between the original and updated values of Pi+1, and t4 = 2^k − t3. Obtain the updated values using the equations

    Pi = Pi + t4,   Pi+1 = Pi+1 + t4,   Pi+2 = Pi+2 + t4.     (3.29)

Now, if the values of Pi, Pi+1, and Pi+2 lie below 255, then these are the stego-pixels, and we rename them as P∗i, P∗i+1, and P∗i+2. Otherwise, if any of the pixels goes beyond 255, then perform case 1 to readjust the stego-pixels.

Step 12: Compute t3 as the absolute difference between the original and updated values of Pi+1, and t4 = 2^k − t3. Obtain the updated values using the equations

    Pi = Pi − t4,   Pi+1 = Pi+1 − t4,   Pi+2 = Pi+2 − t4.     (3.30)

Now, if the values of Pi, Pi+1, and Pi+2 are above 0, then these are the stego-pixels, and we rename them as P∗i, P∗i+1, and P∗i+2. Otherwise, if any of the pixels goes below 0, then perform case 2 to readjust the stego-pixels.

Case 1: Pixel shifting process for overflow condition
In case any of the obtained pixels Pi, Pi+1, and Pi+2 goes beyond the value 255, find the largest pixel among Pi, Pi+1, and Pi+2. Generate a new difference value m1 as the difference between 255 and the largest pixel, i.e., m1 = |255 − Pi| or |255 − Pi+1| or |255 − Pi+2|. Obtain the upper bound Ub of the range for m1. Now obtain the stego-pixels P∗i, P∗i+1, and P∗i+2 using the equations

    P∗i = Pi − Ub + 1,   P∗i+1 = Pi+1 − Ub + 1,   P∗i+2 = Pi+2 − Ub + 1.     (3.31)

Case 2: Pixel shifting process for underflow condition
Suppose any of the obtained stego-pixel values is below the minimum value, i.e., 0. Then find the smallest pixel among Pi, Pi+1, and Pi+2. Generate a new difference value m2 as the difference between 0 and the smallest pixel, i.e., m2 = |0 − Pi| or m2 = |0 − Pi+1| or m2 = |0 − Pi+2|. Obtain the upper bound Ub of the range for m2. Now obtain the stego-pixels P∗i, P∗i+1, and P∗i+2 using the equations

    P∗i = Pi + Ub + 1,   P∗i+1 = Pi+1 + Ub + 1,   P∗i+2 = Pi+2 + Ub + 1.     (3.32)

3.3.2 Extraction procedure

Step 1: The three stego-pixels of the block are P∗i, P∗i+1, and P∗i+2. Obtain the smallest pixel in the block and store it in Psmall. If Psmall = P∗i, then the smallest pixel is P∗i, and go to step 2; otherwise, choose the smallest pixel as P∗i+1 and go to step 3.

Step 2: Convert Psmall to its corresponding binary bits. Now extract the k LSBs from Psmall and store them in sk. Compute the differences sd1 = |P∗i+1 − P∗i| and sd2 = |P∗i+2 − P∗i|. Obtain the lower bounds Lb for sd1 and sd2, respectively, using the proposed range table. Find sk1 = sd1 − Lb and sk2 = sd2 − Lb. Convert sk1 and sk2 to their respective binary forms and append them to sk. The secret data is sk.

Step 3: The smallest pixel is P∗i+1; store it in Psmall. Convert Psmall to its corresponding binary bits. Now extract the k LSBs from Psmall and store them in sk. Compute the differences sd3 = |P∗i − P∗i+1| and sd4 = |P∗i+2 − P∗i+1|. Obtain the lower bounds Lb for sd3 and sd4, respectively. Find sk3 = |sd3 − Lb| and sk4 = |sd4 − Lb|. Convert sk3 and sk4 to binary and append them to sk. The secret data is sk.
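As a concrete reading of this extraction side, the following Python sketch (illustrative only; it assumes the type 1 ranges of Table 3.4 and a fixed k = 3, and the block values in the usage line are arbitrary) recovers the bit stream hidden in one stego block.

```python
# Sketch of the extraction procedure of Section 3.3.2 (type 1 ranges assumed; illustrative only).
TYPE1 = [(0, 7, 3), (8, 15, 3), (16, 31, 3), (32, 63, 4), (64, 255, 4)]   # (Lb, Ub, capacity), Table 3.4

def range_of(d):
    """Return the lower bound and bit capacity of the range containing difference d."""
    return next((lb, n) for lb, ub, n in TYPE1 if lb <= d <= ub)

def extract_block(s_i, s_i1, s_i2, k=3):
    """Recover the secret bits hidden in one stego block (P*_i, P*_i+1, P*_i+2)."""
    # Step 1: the reference is P*_i if it is the smallest pixel, otherwise P*_i+1.
    ref, others = ((s_i, (s_i1, s_i2)) if s_i == min(s_i, s_i1, s_i2)
                   else (s_i1, (s_i, s_i2)))
    # Steps 2/3: the k LSBs of the reference pixel come first ...
    bits = format(ref % 2**k, '0{}b'.format(k))
    # ... followed by the range offset of each difference, converted back to binary.
    for p in others:
        d = abs(p - ref)
        lb, n = range_of(d)
        bits += format(abs(d - lb), '0{}b'.format(n))
    return bits

# Example usage with an arbitrary stego block (values chosen only for illustration):
print(extract_block(118, 127, 140, k=3))
```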

3.3.3 Example of the proposed method

3.3.3.1 Embedding side

Step 1: Consider the original pixels Pi = 235, Pi+1 = 237, and Pi+2 = 239. As Pi ≤ Pi+1 and Pi ≤ Pi+2, the reference pixel is Pi.

If the condition R−M − S−M > RM − SM is true, then the method is exposed to the RS attack. The results for the Lena and Zelda images for the proposed method, type 1, are shown in Fig. 3.9. Similarly, the RS plots for Jung [15], Khodaei and Faez, type 1 [16], Wu and Tsai [13], and Khodaei et al. [20] are shown in Figs. 3.10, 3.11, 3.12, and 3.13, respectively. We can observe from the obtained plots that the condition RM ≈ R−M > SM ≈ S−M is satisfied for both images for the proposed method, Khodaei and Faez's method [16], and Wu and Tsai's method [13]. At the same time, Jung's method [15] and Khodaei et al.'s method [20] are exposed to the RS attack. Hence we can conclude that the proposed method is undetectable by the RS analysis test.
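For reference, the RS statistics themselves can be computed with a short script. The following Python sketch is a generic outline of RM, R−M, SM, and S−M (the group size, flipping mask, and scan order are illustrative choices made here, not necessarily the experimental settings used for Figs. 3.9–3.13).

```python
# Generic sketch of RS analysis for a grayscale image (illustrative only).
import numpy as np

def smoothness(group):
    """Discrimination function: sum of absolute differences of neighboring pixels."""
    return np.abs(np.diff(group.astype(int))).sum()

def flip(group, mask):
    """Apply F1 (0<->1, 2<->3, ...) where mask = +1 and F-1 (-1<->0, 1<->2, ...) where mask = -1."""
    g = group.astype(int).copy()
    g[mask == 1] ^= 1
    g[mask == -1] = ((g[mask == -1] + 1) ^ 1) - 1
    return g

def rs_statistics(image, mask=np.array([1, 0, 0, 1])):
    """Return (RM, SM, R-M, S-M) as fractions of all pixel groups."""
    flat = image.flatten()
    flat = flat[: len(flat) // len(mask) * len(mask)].reshape(-1, len(mask))
    counts = [0, 0, 0, 0]                              # RM, SM, R-M, S-M
    for group in flat:
        f0 = smoothness(group)
        for j, m in enumerate((mask, -mask)):
            f1 = smoothness(flip(group, m))
            if f1 > f0:
                counts[2 * j] += 1                     # regular group
            elif f1 < f0:
                counts[2 * j + 1] += 1                 # singular group
    return [c / len(flat) for c in counts]
```

Computing these statistics for the cover image and for stego images at increasing embedding rates, and checking whether RM ≈ R−M > SM ≈ S−M still holds, yields plots of the kind shown in Figs. 3.9–3.13.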


FIGURE 3.9 RS plot for proposed method, type 1. (A) Lena, (B) Zelda.

FIGURE 3.10 RS plot for Jung [15]. (A) Lena, (B) Zelda.

FIGURE 3.11 RS plot for Khodaei and Faez, type 1 [16]. (A) Lena, (B) Zelda.


FIGURE 3.12 RS plot for Wu and Tsai [13]. (A) Lena, (B) Zelda.

FIGURE 3.13 RS plot for Khodaei et al. [20]. (A) Lena, (B) Zelda.

3.4.3 Security check using Pixel Difference Histogram (PDH) analysis

In this section, we examine the security of the proposed method against PDH analysis. A PDH plot is obtained by computing the differences between adjacent pixels of an image and plotting each difference value on the x-axis against its frequency on the y-axis; the curves of the original image and the corresponding stego image are then compared [41–44]. If the curves of both the original and stego images are smooth in nature, then the method passes the PDH analysis. On the other hand, if the curves are zig-zag in nature, then the presence of secret data is detected. Fig. 3.14 shows the PDH plots of the Lena image for the proposed method, type 1. We can observe that the curves for the original and stego images are almost identical, so the proposed method successfully resists the PDH attack.
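The PDH curves themselves are straightforward to compute. The sketch below is a small illustrative helper (not the authors' plotting code) that builds the histogram of differences between horizontally adjacent pixels of a grayscale image; plotting it for the cover and stego images on the same axes gives a figure of the kind shown in Fig. 3.14.

```python
# Sketch: pixel difference histogram (PDH) of a grayscale image (illustrative only).
import numpy as np

def pdh(image):
    """Return (difference values, frequencies) for horizontally adjacent pixel pairs."""
    diffs = np.diff(image.astype(int), axis=1).ravel()
    values = np.arange(-255, 256)
    freqs = np.bincount(diffs + 255, minlength=511)   # shift differences into [0, 510] for counting
    return values, freqs

# Computing pdh(cover) and pdh(stego) and comparing the smoothness of the two curves
# reproduces the kind of analysis shown in Fig. 3.14.
```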


FIGURE 3.14 PDH plot for proposed method, type 1.

3.5 Conclusion

In this chapter, we have proposed an improved steganography method using LSB substitution and PVD. First, data embedding is performed in a block consisting of three pixels. After successful embedding, a pixel shifting process is performed to identify and readjust the overflow and underflow pixels. The proposed type 1 and type 2 methods achieve high hiding capacities of 800,007 bits and 831,830 bits while maintaining good PSNR values of 36.03 dB and 35.06 dB, respectively. The observed results show that the proposed method outperforms the other methods considered in this chapter. In addition, the proposed method shows strong resistance to the RS and PDH attacks.

References

[1] J. Fridrich, M. Goljan, Practical steganalysis of digital images: state of the art, in: Security and Watermarking of Multimedia Contents IV, vol. 4675, 2002, pp. 1–14.
[2] A. Cheddad, J. Condell, K. Curran, P. Mc Kevitt, Digital image steganography: survey and analysis of current methods, Signal Processing 90 (3) (2010) 727–752.
[3] C.K. Chan, L.M. Cheng, Hiding data in images by simple LSB substitution, Pattern Recognition 37 (3) (2004) 469–474.
[4] R.Z. Wang, C.F. Lin, J.C. Lin, Image hiding by optimal LSB substitution and genetic algorithm, Pattern Recognition 34 (3) (2001) 671–683.
[5] N.I. Wu, M.S. Hwang, A novel LSB data hiding scheme with the lowest distortion, The Imaging Science Journal 65 (6) (2017) 371–378.
[6] A.K. Sahu, G. Swain, An improved data hiding technique using bit differencing and LSB matching, Internetworking Indonesia Journal 10 (1) (2018) 17–21.
[7] A.K. Sahu, G. Swain, Information hiding using group of bits substitution, International Journal on Communications Antenna and Propagation 7 (2) (2017) 162–167.
[8] A.K. Sahu, G. Swain, E.S. Babu, Digital image steganography using bit flipping, Cybernetics and Information Technologies 18 (1) (2018) 69–80.
[9] H. Yang, X. Sun, G. Sun, A high-capacity image data hiding scheme using adaptive LSB substitution, Radioengineering 18 (4) (2009) 509–516.
[10] A.K. Sahu, G. Swain, A novel multi stego-image based data hiding method for gray scale image, Pertanika Journal of Science & Technology 27 (2) (2019) 753–768.
[11] A.K. Sahu, M. Sahu, Digital image steganography techniques in spatial domain: a study, International Journal of Pharmacy and Technology (IJPT) 8 (4) (2016) 5205–5217.
[12] A.K. Sahu, G. Swain, Dual stego-imaging based reversible data hiding using improved LSB matching, International Journal of Intelligent Engineering and Systems 12 (5) (2019) 63–73.


[13] D.C. Wu, W.H. Tsai, A steganographic method for images by pixel-value differencing, Pattern Recognition Letters 24 (9–10) (2003) 1613–1626.
[14] H.C. Wu, N.I. Wu, C.S. Tsai, M.S. Hwang, Image steganographic scheme based on pixel-value differencing and LSB replacement methods, IEE Proceedings. Vision, Image and Signal Processing 152 (5) (2005) 611–615.
[15] K.H. Jung, Data hiding scheme improving embedding capacity using mixed PVD and LSB on bit plane, Journal of Real-Time Image Processing 14 (1) (2018) 127–136.
[16] M. Khodaei, K. Faez, New adaptive steganographic method using least-significant-bit substitution and pixel-value differencing, IET Image Processing 6 (6) (2012) 677.
[17] G. Swain, Adaptive pixel value differencing steganography using both vertical and horizontal edges, Multimedia Tools and Applications 75 (21) (2016) 13541–13556.
[18] H.C. Lu, Y.P. Chu, M.S. Hwang, New steganographic method of pixel value differencing, Journal of Imaging Science and Technology 50 (5) (2006) 424–426.
[19] R. Anushiadevi, P. Praveenkumar, J.B.B. Rayappan, R. Amirtharajan, Reversible secret data hiding based on adjacency pixel difference, Journal of Artificial Intelligence 10 (1) (2017) 22–31.
[20] M. Khodaei, B. Sadeghi Bigham, K. Faez, Adaptive data hiding, using pixel-value-differencing and LSB substitution, Cybernetics and Systems 47 (8) (2016) 617–628.
[21] M.A. Hameed, S. Aly, M. Hassaballah, An efficient data hiding method based on adaptive directional pixel value differencing (ADPVD), Multimedia Tools and Applications 77 (2018) 14705–14723.
[22] Y.P. Lee, J.C. Lee, W.K. Chen, K.C. Chang, J. Su, C.P. Chang, High-payload image hiding with quality recovery using tri-way pixel-value differencing, Information Sciences 191 (2012) 214–225.
[23] S.A. Thanekar, S.S. Pawar, OCTA (STAR) PVD: a different approach of image steganography, in: 2013 IEEE International Conference on Computational Intelligence and Computing Research, 2013, pp. 1–5.
[24] C.H. Yang, C.Y. Weng, H.K. Tso, S.J. Wang, A data hiding scheme using the varieties of pixel-value differencing in multimedia images, Journal of Systems and Software 84 (4) (2011) 669–678.
[25] G. Swain, Digital image steganography using nine-pixel differencing and modified LSB substitution, Indian Journal of Science and Technology 7 (9) (2014) 1444–1450.
[26] K.A. Darabkh, A.K. Al-Dhamari, I.F. Jafar, A new steganographic algorithm based on multi directional PVD and modified LSB, Information Technology and Control 46 (1) (2017) 16–36.
[27] X. Liao, Q.Y. Wen, J. Zhang, A steganographic method for digital images with four-pixel differencing and modified LSB substitution, Journal of Visual Communication and Image Representation 22 (1) (2011) 1–8.
[28] USC-SIPI image database. [Online]. Available: http://sipi.usc.edu/database/database.php?volume=misc.
[29] M. Hussain, A.W. Abdul Wahab, N. Javed, K.H. Jung, Hybrid data hiding scheme using right-most digit replacement and adaptive least significant bit for digital images, Symmetry 8 (6) (2016) 41.
[30] M. Hussain, A.W.A. Wahab, A.T. Ho, N. Javed, K.H. Jung, A data hiding scheme using parity-bit pixel value differencing and improved rightmost digit replacement, Signal Processing: Image Communication 50 (2017) 44–57.
[31] X. Liao, S. Guo, J. Yin, H. Wang, X. Li, A.K. Sangaiah, New cubic reference table based image steganography, Multimedia Tools and Applications (2017) 1–18.
[32] X. Liao, Z. Qin, L. Ding, Data embedding in digital images using critical functions, Signal Processing: Image Communication 58 (2017) 146–156.
[33] X. Liao, Q. Wen, J. Zhang, Improving the adaptive steganographic methods based on modulus function, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 96 (12) (2013) 2731–2734.
[34] G. Swain, Adaptive and non-adaptive PVD steganography using overlapped pixel blocks, Arabian Journal for Science and Engineering 43 (12) (2018) 7549–7562.
[35] X. Liao, Q.Y. Wen, Z.L. Zhao, J. Zhang, A novel steganographic method with four-pixel differencing and modulus function, Fundamenta Informaticae 118 (3) (2012) 281–289.
[36] A.K. Sahu, G. Swain, Pixel overlapping image steganography using PVD and modulus function, 3D Research 9 (3) (2018) 1–14.
[37] X. Liao, Q. Wen, J. Zhang, A novel steganographic method with four-pixel differencing and exploiting modification direction, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 95 (7) (2012) 1189–1192.


[38] M. Hussain, A.W.A. Wahab, Y.I.B. Idris, A.T. Ho, K.H. Jung, Image steganography in spatial domain: a survey, Signal Processing: Image Communication 65 (2018) 46–66.
[39] A.K. Sahu, G. Swain, A novel n-rightmost bit replacement image steganography technique, 3D Research 10 (1) (2019) 1–18.
[40] A.K. Sahu, G. Swain, Digital image steganography using PVD and modulo operation, Internetworking Indonesia Journal 10 (2) (2018) 3–13.
[41] A.K. Sahu, G. Swain, An optimal information hiding approach based on pixel value differencing and modulus function, Wireless Personal Communications 108 (1) (2019) 159–174.
[42] A.K. Sahu, G. Swain, High fidelity based reversible data hiding using modified LSB matching and pixel difference, Journal of King Saud University: Computer and Information Sciences (2019), https://doi.org/10.1016/j.jksuci.2019.07.004.
[43] A.K. Sahu, G. Swain, Data hiding using adaptive LSB and PVD technique resisting PDH and RS analysis, International Journal of Electronic Security and Digital Forensics 11 (4) (2019) 458–476.
[44] A.K. Sahu, G. Swain, Reversible image steganography using dual-layer LSB matching, Sensing and Imaging 21 (2020) 1.

4 An efficient image steganography method using multiobjective differential evolution

Manjit Kaur a, Vijay Kumar b, Dilbag Singh c

a Manipal University Jaipur, Computer Communication and Engineering, School of Computing and Information Technology, Jaipur, India
b NIT Hamirpur, Computer Science and Engineering, Hamirpur, HP, India
c Manipal University Jaipur, Computer Science and Engineering, School of Computing and Information Technology, Jaipur, India

4.1 Introduction

In the era of high-speed technology, anything can be sent and received with a single click of a button, anywhere and anytime in the world. With these advancements, which have made life very easy and smooth, the security of digital information has become a major concern for everyone [1]. Digital information includes text documents, digital images, videos, and audio signals. Several methods are available to protect significant information from attackers. The most popular among them are cryptography, steganography, and watermarking [2]. Cryptography protects meaningful information from attackers by converting it into an unreadable form [3]. Only authorized users who possess the correct key can recover the original message. Cryptographic methods involve encryption, decryption, and keys to enable secure communication [4]. Fig. 4.1 shows the general model of cryptography. Encryption algorithms convert a plain image into a cipher image using secret keys, and decryption algorithms retrieve the plain image using the same secret keys [5]. Although cryptographic methods cannot be broken easily, cipher images draw the attention of eavesdroppers. Sometimes, invisible communication is required, without anyone noticing it. This is the reason why information hiding methods are needed [6].

Information hiding methods are decomposed into two categories, watermarking and steganography [6]. Both conceal secret information, but with different objectives. Watermarking is used to protect the integrity of private information; concealing the existence of the communication from attackers is optional in the case of watermarking [7].


FIGURE 4.1 General model of cryptography.

FIGURE 4.2 General model of image steganography.

In contrast, steganography is another popular method to protect or hide information. It is also known as cover writing [8] since it conceals the presence of potential information inside an image, audio, video, or text file [9]. The general model of image steganography is shown in Fig. 4.2. The image that hides the secret message is known as a "cover image". The procedure used to hide the secret message inside the cover image is known as the "embedding method". The use of a stego-key is optional and depends on the embedding method. The final output of the embedding process is a "stegoimage", which hides the secret message. Similarly, the "extraction method" is used at the receiver side to extract the secret message from the stegoimage using the optional stego-key.

In this chapter, we propose a steganography method based on the least significant bit (LSB) substitution method and differential evolution. In the LSB method the process of mask assignment for embedding a secret image into a cover image is a tedious task. It can be optimized using metaheuristic algorithms. In this chapter, we use differential evolution to optimize the mask assignment process; the list of mask assignments is treated as a solution in differential evolution. The reason for choosing differential evolution is its good convergence speed and its lower tendency to get stuck in local optima compared with other metaheuristic algorithms.


FIGURE 4.3 Types of image steganography methods.

4.2 Literature review

Various image steganography methods have been proposed in the literature, and researchers have adopted different approaches to implement steganography. These methods can be decomposed into subcategories such as spatial and transform domain methods. In the spatial domain the data is directly embedded into the pixels of the cover image, whereas in the transform domain the cover image is first converted into another form, and then the embedding process is performed. The steganographic methods can be further decomposed based on the coded format and the secret data format, as shown in Fig. 4.3.

Zhang et al. [10] proposed a steganography method based on joint distortion capacity for binary images. In this method a syndrome-trellis code was used to embed the secret message and to reduce the embedding distortion. Sarmah and Kulkarni [11] used cohort intelligence and the reversible data hiding method to encrypt and hide secret information inside a JPEG image. Wu et al. [12] studied the effect of noise on optical steganography. In this study the extra dispersion created by noise was used as a key to encrypt the information and then to hide the encrypted information. The main benefit of this method is that it enhances the security of transmission. Zhou et al. [13] proposed an attacking method to extract the secret information and detect stegoimages, exploiting the fact that steganography methods based on texture synthesis suffer from poor mirror operation.


Brandao and Jorge [14] presented an image steganography method using artificial neural networks. In this method the secret images were hidden in cover images using the least significant bits, and the secret keys were generated using artificial neural networks. Zhang et al. [15] proposed a coverless image steganography to resist steganalysis. In this method, latent Dirichlet allocation was used to classify the image database, and thereafter the discrete cosine transform was applied to the images. Sarreshtedari and Akhaee [16] proposed a method known as one-third least significant bit embedding steganography. This method carries the bits of the secret information using three adjacent pixels of a cover image. The main advantage of this method is that the histogram of the image remains unchanged. Zhang et al. [17] improved the performance of adaptive steganography methods against scaling attacks using Zernike moments. This method provides better detection resistance than other methods based on quantization index. Hu et al. [18] used generative adversarial networks to propose a steganography method without embedding. In this method, there is no need to alter the data of a cover image. Rajput et al. [19] implemented a data-hiding algorithm that uses secret image sharing and steganography methods. This method has been used for both gray and color images; during encryption, n + 1 shared images are created using n cover images and a private image.

Guo et al. [20] proposed a steganography method using a modification of uniform embedding. In this method the coefficients of the discrete cosine transform are used as cover elements, and the performance of the proposed method was tested using the BOSSbase database. Wu and Wang [21] used reversible texture synthesis to implement steganography. It provides an embedding capacity that is approximately equal to the size of the stegoimage; therefore the method cannot be easily detected by attackers and provides better security. Zhang et al. [22] studied nonadditive distortion steganography using joint distortion. In this method the joint distortion was decomposed into distortions on individual pixels. Chen et al. [23] brought out the details of an image with a microscope-like operator in adaptive steganography to refine the distortion. In this method, linear unsharp masking was used as the microscope, and the security was further improved using an interspreading rule. Li et al. [24] proposed a steganalysis method based on split vector quantization, in which a support vector machine was used to build the detector. Su et al. [25] used a generalized uniform embedding distortion function for steganography. In this method, two distortion parameters were used, based on the discrete cosine transform and the alternating current mode. Jero and Ramu [26] implemented a steganography method using the curvelet transform and the ECG signal to hide patient data. The curvelet transform hides the data in the coefficients of the ECG signal, which overcomes the problems of random location-based methods. Liu et al. [27] proposed a new steganography method using the pixel-value differencing and side match methods. The main benefit of this method is that it provides high embedding capacity with an acceptable peak signal-to-noise ratio. Hameed et al. [28] used the adaptive directional pixel value differencing method to hide color images. Sahu et al. [29] implemented an image steganography method using the n rightmost bits of each pixel and the secret data.


The main benefit of this method is that it addresses PSNR, embedding capacity, the boundary problem, and noise attacks. Swain [30] used LSB substitution and quotient value differencing methods to implement an image steganography method in which embedding is done at two levels to remove the boundary problem.

The image steganography methods are generally evaluated using parameters such as embedding capacity, visual image quality, and security. Therefore an ideal steganography method must have high capacity, undetectability, and good visual image quality. However, there is a contradiction between these parameters: high-payload methods are more prone to steganalysis, and methods with good visual image quality suffer from low payload. Therefore designing a steganography method that can achieve all of the above-stated goals is still a challenging issue.

4.3 Background

4.3.1 LSB substitution method

Least significant bit (LSB) substitution [31] is a very easy and simple method to hide secret information in a cover image. In LSB steganography the least significant bits of the cover image are used to hide the secret image. The basic framework of the LSB method is demonstrated in Fig. 4.4.

FIGURE 4.4 Framework of least significant bit method.

To understand the concept of LSB, assume that there is a secret image of 8 bits and that the cover image has 8 pixels. During the embedding process of LSB, each bit of the secret image is embedded into the least significant bit of a cover image pixel. The order of embedding can be sequential, or it can be determined by a stego-key. After embedding a single bit into the first pixel of the cover image, the pixel value changes, for example, from 0110010 to 0110011. After embedding all the bits of the secret image into the cover image, an output image known as a stegoimage is obtained. The extraction of the secret bits from a stegoimage is very simple: it can be done by extracting the LSB of each pixel.
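A minimal Python illustration of this bit-replacement idea is shown below (illustrative only; the chapter's method additionally optimizes where the secret bits are placed).

```python
# Sketch of sequential 1-bit LSB embedding and extraction (illustrative only).
import numpy as np

def lsb_embed(cover, secret_bits):
    """Replace the LSB of the first len(secret_bits) cover pixels with the secret bits."""
    stego = cover.flatten()
    bits = np.array([int(b) for b in secret_bits], dtype=stego.dtype)
    stego[: len(bits)] = (stego[: len(bits)] & 0xFE) | bits
    return stego.reshape(cover.shape)

def lsb_extract(stego, n_bits):
    """Read back the first n_bits least significant bits."""
    return ''.join(str(p & 1) for p in stego.flatten()[:n_bits])

cover = np.array([[150, 27, 64, 201], [90, 33, 17, 250]], dtype=np.uint8)
stego = lsb_embed(cover, "10110100")
print(lsb_extract(stego, 8))    # "10110100"
```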


FIGURE 4.5 Differential evolution process.

4.3.2 Differential evolution

Differential evolution (DE) (see Fig. 4.5) provides optimized solutions for problems whose objective functions are nonlinear, nondifferentiable, noncontinuous, multidimensional, and so on [32]. It also provides better convergence than other evolutionary algorithms, and it can be implemented in parallel to manage computationally intensive objective functions [33]. DE consists of the following steps to obtain the optimal solution.

1. Population initialization: In this step the initial population is randomly generated. The values of the parameters depend upon the upper and lower bounds. The normal distribution can be used to obtain random solutions whose values lie between 0 and 1.
2. Mutation: In this step a donor solution is developed from three randomly selected solutions. It increases the search space, which helps to find a more optimized solution.
3. Recombination: In this step, elements of solutions are combined to generate a trial solution. It combines the elements of the donor solution obtained through mutation and the target solution (i.e., the best solution of the previous generation).
4. Selection: In this step the best solution is selected on the basis of the fitness function. The best solution is chosen between the trial solution and the target solution using the fitness function, and the comparison depends upon whether the problem is a maximization or minimization problem.


If the fitness function is a minimization problem, the trial solution survives only when its fitness value is less than that of the target solution; otherwise, the target solution survives to the next generation.
5. Stopping criteria: In this step the process of differential evolution is stopped on the basis of some criterion such as the maximum number of generations, an acceptable error, the number of fitness function evaluations, and so on. Otherwise, steps (2) to (4) are repeated. A minimal sketch of this loop is shown below.
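The sketch below is a minimal, generic DE loop in Python for a continuous minimization problem (an illustration of the five steps above, not the chapter's implementation; the parameter values and the sphere objective are placeholder choices).

```python
# Minimal DE/rand/1/bin sketch for a continuous minimization problem (illustrative only).
import numpy as np

def differential_evolution(fitness, dim, bounds=(0.0, 1.0), pop_size=20,
                           F=0.8, CR=0.9, generations=100, seed=0):
    rng = np.random.default_rng(seed)
    low, high = bounds
    pop = rng.uniform(low, high, size=(pop_size, dim))            # 1. population initialization
    fit = np.array([fitness(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = pop[rng.choice([j for j in range(pop_size) if j != i], 3, replace=False)]
            donor = np.clip(a + F * (b - c), low, high)           # 2. mutation
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True                       # keep at least one donor component
            trial = np.where(cross, donor, pop[i])                # 3. recombination
            f_trial = fitness(trial)
            if f_trial < fit[i]:                                  # 4. selection (minimization)
                pop[i], fit[i] = trial, f_trial
    return pop[np.argmin(fit)], fit.min()                         # 5. stop after a fixed budget

# Example: minimize the sphere function in 5 dimensions.
best, best_fit = differential_evolution(lambda x: float(np.sum(x ** 2)), dim=5, bounds=(-1.0, 1.0))
print(best_fit)
```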

4.4 The proposed method

In this section, we present the embedding and extraction processes of the proposed technique.

4.4.1 Embedding process

In this method the embedding outcome is determined by a mask assignment list, which is represented as a solution in DE. Each element of a solution represents the assignment of one mask of the secret image to a position in the cover image. Every mask appears exactly once in an assignment list. Therefore, for an n-mask secret image, every solution includes n dimensions corresponding to the n masks. An example of a mask assignment list for an 8-mask secret image with a 16-mask cover image is shown in Fig. 4.6.

FIGURE 4.6 8-mask secret image with 16-mask cover image.

By reading the assignment list from left to right, H2 corresponds to the first address of the cover image (i.e., mask number 2 of the cover image), where the first mask of the secret image (S0) is going to be embedded. H5 corresponds to the next address of the cover image (i.e., mask number 5 of the cover image), where the next mask of the secret image (S1) is to be allocated; H8 corresponds to the next address of the cover image (i.e., mask number 8 of the cover image), where the next mask of the secret image (S2) is to be embedded; and so on. Therefore a probable stegoimage may be created as shown in Fig. 4.7. We briefly discuss the steps of the proposed method:

1. Get all LSBs of the cover image and keep them in array H.
2. Change the secret image to a binary sequence and keep it in array S.
3. Decompose H and S into masks of size L. Therefore the numbers of masks for H and S are computed as

       Nbh = (length of H) / L   and   Nbs = (length of S) / L.     (4.1)

4. DE is then utilized to tune the assignment list for embedding the secret image into the cover image. This method is depicted in Fig. 4.8.

72

Digital Media Steganography

FIGURE 4.7 Construction of stegoimage from the mask assignment list.

FIGURE 4.8 Optimization of mask assignment list using differential evolution.

Thus we try to optimize the assignment list that minimizes the sum of differing bits over all assigned host and secret masks, evaluated by Eq. (4.2). This evaluation is depicted in Fig. 4.4:

    MSE = (1 / (Nbs × L)) Σ_{i=1}^{Nbs} Σ_{j=1}^{L} (hj − sj)²,     (4.2)

where hj and sj denote the pixel gray values in each pair of assigned cover and secret masks, respectively. Greedy selection is utilized to update the solutions, as depicted in Fig. 4.9. In greedy selection, based on a comparison of the sums of differing bits D, a new position is produced by selecting each dimension from the assignment lists of the old and neighboring solutions.


FIGURE 4.9 Exchanging information using greedy selection method.

The difference is evaluated as

\[ D = \sum_{j=1}^{L} (h_j - s_j)^2. \tag{4.3} \]

In Fig. 4.9, at the first dimension, assume that the sum of different bits of H2 in the old solution is less than that of H3 in the neighboring solution. Therefore H4 will be elected in the new solution. The masks that have already been selected from the old solution cannot be elected again; in that case the new value is randomly chosen from the masks that have not yet been selected in either the old or the neighboring solution. For example, at the fifth dimension, assume that the sum of different bits of H10 in the old solution is less than that of H0 in the neighboring solution. But H10 has already been elected at the fourth dimension. Therefore a mask that has not been selected so far will be randomly elected; for the given case, it is H1. To avoid suboptimal solutions, existing solutions that cannot enhance the fitness value are discarded, and new outcomes are searched randomly by regenerating a new assignment list.
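As a rough illustration only, the greedy combination of two assignment lists could be sketched as follows; the diff lookup, the tie-breaking, and the fallback rule are assumptions, since the chapter does not spell them out.

```python
import random

def greedy_combine(old, neighbor, diff, n_masks, rng=random):
    """Sketch of the greedy selection in Fig. 4.9: build a new assignment list
    dimension by dimension, picking at each position the cover mask (from the
    old or the neighboring solution) with the smaller sum of different bits D.
    `diff[(h, s)]` is assumed to hold D for cover mask h at secret position s."""
    new, used = [], set()
    for s, (h_old, h_nb) in enumerate(zip(old, neighbor)):
        # Prefer the candidate with the smaller D that has not been used yet.
        candidates = sorted([h_old, h_nb], key=lambda h: diff[(h, s)])
        pick = next((h for h in candidates if h not in used), None)
        if pick is None:
            # Both candidates already taken: fall back to a random unused mask.
            pick = rng.choice([h for h in range(n_masks) if h not in used])
        new.append(pick)
        used.add(pick)
    return new
```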

4.4.2 Extraction process

In this process, extraction of the embedded secret image is performed. Initially, the embedded image and the hyperparameters obtained from the differential evolution process are taken as input to the extraction process. Thereafter all LSBs of the embedded image are obtained, and a binary sequence is formed from these LSBs. Finally, this binary sequence is converted to its actual form, that is, the secret image.
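A hypothetical sketch of this extraction step is given below; the mask reordering by the optimized assignment list and the bit-packing order are assumptions, not the authors' exact procedure.

```python
import numpy as np

def extract_lsb_sequence(stego, assignment, mask_len):
    """Hypothetical sketch: read the LSB plane of the stego image, split it into
    masks of length `mask_len`, and reorder them with the optimized `assignment`
    list to recover the secret bit sequence."""
    lsb = stego.flatten() & 1                       # full LSB plane as a bit array
    n = (len(lsb) // mask_len) * mask_len
    masks = lsb[:n].reshape(-1, mask_len)           # cover masks of size L
    secret_bits = np.concatenate([masks[h] for h in assignment])
    return secret_bits

def bits_to_image(bits, shape):
    """Pack the recovered bit sequence back into 8-bit pixels of the given shape
    (MSB-first packing is assumed)."""
    pixels = np.packbits(bits.astype(np.uint8))
    return pixels[:np.prod(shape)].reshape(shape)
```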

4.5 Experimental results

In this section, various experiments are carried out to test the effectiveness of the proposed method. The proposed method is implemented in a simulation environment using MATLAB® 2017a. To test the proposed method, well-known benchmark images of size 256 × 256 are used. The proposed method is compared with competitive steganography methods: Stirling transform-based image steganography (STS) [34], genetic algorithm-based image steganography (GAS) [35], modified logistic chaotic map-based image steganography (MLCM) [36], and particle swarm optimization-based image steganography (PSOS) [37].


FIGURE 4.10 Visual analysis of the proposed method. Visual analysis of benchmark gray and color images: (A) input gray image, (B) embedded gray image, (C) stego input, (D) extracted image, (E) cover color image, (F) embedded color image, (G) stego input, and (H) extracted image.

Fig. 4.10 shows the visual effects of the proposed method on benchmark gray and color images. We can clearly observe from the figures that the input and embedded images are visually identical to each other. Similarly, the stego and extracted images are also identical to each other. Therefore the proposed method provides good visual quality.

4.5.1 Peak signal-to-noise ratio

To quantitatively evaluate the visual quality of the cover images under the proposed method, the peak signal-to-noise ratio (PSNR) [38] metric is evaluated. It is important to check whether the proposed method is perceptually transparent or not. PSNR is evaluated between the stegoimage and cover image to assess the image quality as follows [6]:

\[ \mathrm{PSNR} = 10 \log_{10} \frac{(255)^2}{\mathrm{MSE}}, \tag{4.4} \]

where MSE is the mean square error between the cover- and stegoimages. It is mathematically defined as

\[ \mathrm{MSE} = \frac{1}{L \times W} \sum_{i=1}^{L} \sum_{j=1}^{W} (a_{i,j} - b_{i,j})^2, \tag{4.5} \]

where L and W represent the length and width of the cover image, respectively, and a_{i,j} and b_{i,j} denote the pixel values of the cover- and stegoimages, respectively. The comparative PSNR results of the proposed method are shown in Table 4.1. A higher PSNR value indicates better quality of the stegoimage. We can see from Table 4.1 that the proposed method has a better PSNR than the other existing steganography methods. Thus the proposed method provides better image quality.
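For reference, Eqs. (4.4)–(4.5) translate directly into a few lines of code (a generic sketch, not the authors' MATLAB implementation):

```python
import numpy as np

def psnr(cover, stego):
    """PSNR between an 8-bit cover and stego image, per Eqs. (4.4)-(4.5)."""
    cover = cover.astype(np.float64)
    stego = stego.astype(np.float64)
    mse = np.mean((cover - stego) ** 2)      # Eq. (4.5)
    if mse == 0:
        return float("inf")                  # identical images
    return 10 * np.log10(255.0 ** 2 / mse)   # Eq. (4.4)
```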


Table 4.1 Comparative analysis based on PSNR.

Method        Lena     Text     Grapes   Hello    House
STS [34]      42.230   42.995   42.560   42.359   42.432
GAS [35]      43.720   43.540   43.544   43.575   43.600
MLCM [36]     44.352   44.339   44.554   44.359   44.361
PSOS [37]     44.693   44.753   44.888   44.871   44.869
Proposed      47.998   47.997   47.909   47.233   47.921

FIGURE 4.11 Structural similarity index analysis for gray images.

4.5.2 Structural similarity index measure

The structural similarity index measure (SSIM) [39] is used to evaluate the perceptual difference between cover and stego images. Fig. 4.11 presents the SSIM analysis for gray cover images. From Fig. 4.11 we can see that the proposed method has a better SSIM than the other methods. GAS has the lowest SSIM compared to STS, MLCM, PSOS, and the proposed method. MLCM also has a significant SSIM compared to STS, GAS, and PSOS; however, it is not better than the proposed method. Fig. 4.12 presents the SSIM analysis for color cover images. We can observe that the proposed method achieves better structural similarity between cover- and stegoimages. Therefore it is difficult for an attacker to detect the secret image from the stegoimage.

4.5.3 Bit error rate

During transmission, the stegoimages may be corrupted by some types of noise. Therefore it is necessary to check the robustness of the proposed method against such distortion.


FIGURE 4.12 Structural similarity index analysis for color images.

FIGURE 4.13 Distortion analysis. Comparative analysis of extracted images from noisy stegoimages using methods (A) STS, (B) GAS, (C) MLCM, (D) PSOS, and (E) Proposed.

To test this, the bit error rate (BER) is used in this experiment. BER is mathematically calculated as

\[ \mathrm{BER} = \frac{\sum_{i=1}^{N} E_i \oplus E'_i}{N} \times 100\%, \tag{4.6} \]


where E and E′ denote the embedded and extracted secret images, respectively, and ⊕ is the XOR operation. The number of bits of the secret image is represented by N. A lower BER value indicates better accuracy of the extracted secret image. To carry out this experiment, we added 2% impulse noise to a stegoimage. The images extracted from the noisy stegoimage using the existing and proposed methods are shown in Fig. 4.13. We can observe that the secret image extracted by the proposed method is less noisy than those of the other methods.
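Eq. (4.6) is likewise straightforward to compute; the sketch below assumes the two secret images are available as flat bit arrays:

```python
import numpy as np

def bit_error_rate(embedded_bits, extracted_bits):
    """BER as in Eq. (4.6): percentage of mismatched bits between the
    embedded and extracted secret bit sequences."""
    e = np.asarray(embedded_bits, dtype=np.uint8)
    e_prime = np.asarray(extracted_bits, dtype=np.uint8)
    return np.sum(e ^ e_prime) / e.size * 100.0
```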

4.6 Conclusion

In this chapter, we proposed a steganography method using the LSB method and differential evolution. Differential evolution is used to optimize the mask assignment of the LSB method. Using an optimized mask assignment to hide the secret image in the cover image makes it difficult for an attacker to detect the secret image from the stegoimage. The experimental analysis shows that the proposed method has better PSNR, SSIM, and BER than the existing steganography methods, which implies better stegoimage quality, robustness against noise attacks, and payload capacity.

References [1] Manjit Kaur, Vijay Kumar, Parallel non-dominated sorting genetic algorithm-II-based image encryption technique, The Imaging Science Journal 66 (8) (2018) 453–462. [2] Manjit Kaur, Vijay Kumar, A comprehensive review on image encryption techniques, Archives of Computational Methods in Engineering (2018) 1–29. [3] Manjit Kaur, Vijay Kumar, Fourier–Mellin moment-based intertwining map for image encryption, Modern Physics Letters B 32 (09) (2018) 1850115. [4] Manjit Kaur, Vijay Kumar, Li Li, Color image encryption approach based on memetic differential evolution, Neural Computing and Applications (2018) 1–13. [5] Manjit Kaur, Vijay Kumar, Beta chaotic map based image encryption using genetic algorithm, International Journal of Bifurcation and Chaos 28 (11) (2018) 1850132. [6] Mehdi Hussain, Ainuddin Wahid Abdul Wahab, Yamani Idna Bin Idris, Anthony TS Ho, Ki-Hyun Jung, Image steganography in spatial domain: a survey, Signal Processing. Image Communication 65 (2018) 46–66. [7] Aloni Cohen, Justin Holmgren, Ryo Nishimaki, Vinod Vaikuntanathan, Daniel Wichs, Watermarking cryptographic capabilities, SIAM Journal on Computing 47 (6) (2018) 2157–2202. [8] Muhammad Khan, Muhammad Sajjad, Irfan Mehmood, Seungmin Rho, Sung Wook Baik, Image steganography using uncorrelated color space and its application for security of visual contents in online social networks, Future Generation Computer Systems 86 (2018) 951–960. [9] Shadi Elshare, Nameer N. El-Emam, Modified multi-level steganography to enhance data security, International Journal of Communication Networks and Information Security 10 (3) (2018) 509. [10] Junhong Zhang, Wei Lu, Xiaolin Yin, Wanteng Liu, Yuileong Yeung, Binary image steganography based on joint distortion measurement, Journal of Visual Communication and Image Representation 58 (2019) 600–605. [11] Dipti Kapoor Sarmah, Anand J. Kulkarni, Improved cohort intelligence—a high capacity, swift and secure approach on JPEG image steganography, Journal of Information Security and Applications 45 (2019) 90–106. [12] B. Wu, M.P. Chang, B.J. Shastri, P.Y. Ma, P.R. Prucnal, Dispersion deployment and compensation for optical steganography based on noise, IEEE Photonics Technology Letters 28 (4) (2016) 421–424.


[13] H. Zhou, K. Chen, W. Zhang, N. Yu, Comments on “Steganography using reversible texture synthesis”, IEEE Transactions on Image Processing 26 (4) (2017) 1623–1625. [14] A. Santos Brandao, D. Calhau Jorge, Artificial neural networks applied to image steganography, IEEE Latin America Transactions 14 (3) (2016) 1361–1366. [15] X. Zhang, F. Peng, M. Long, Robust coverless image steganography based on DCT and LDA topic classification, IEEE Transactions on Multimedia 20 (12) (2018) 3223–3238. [16] S. Sarreshtedari, M.A. Akhaee, One-third probability embedding: a new ±1 histogram compensating image least significant bit steganography scheme, IET Image Processing 8 (2) (2014) 78–89. [17] Y. Zhang, X. Luo, Y. Guo, C. Qin, F. Liu, Zernike moment-based spatial image steganography resisting scaling attack and statistic detection, IEEE Access 7 (2019) 24282–24289. [18] D. Hu, L. Wang, W. Jiang, S. Zheng, B. Li, A novel image steganography method via deep convolutional generative adversarial networks, IEEE Access 6 (2018) 38303–38314. [19] M. Rajput, M. Deshmukh, N. Nain, M. Ahmed, Securing data through steganography and secret sharing schemes: trapping and misleading potential attackers, IEEE Consumer Electronics Magazine 7 (5) (2018) 40–45. [20] L. Guo, J. Ni, W. Su, C. Tang, Y. Shi, Using statistical image model for JPEG steganography: uniform embedding revisited, IEEE Transactions on Information Forensics and Security 10 (12) (2015) 2669–2680. [21] K. Wu, C. Wang, Steganography using reversible texture synthesis, IEEE Transactions on Image Processing 24 (1) (2015) 130–139. [22] W. Zhang, Z. Zhang, L. Zhang, H. Li, N. Yu, Decomposing joint distortion for adaptive steganography, IEEE Transactions on Circuits and Systems for Video Technology 27 (10) (2017) 2274–2280. [23] K. Chen, H. Zhou, W. Zhou, W. Zhang, N. Yu, Defining cost functions for adaptive JPEG steganography at the microscale, IEEE Transactions on Information Forensics and Security 14 (4) (2019) 1052–1066. [24] S. Li, Y. Jia, C.J. Kuo, Steganalysis of QIM steganography in low-bit-rate speech signals, IEEE/ACM Transactions on Audio, Speech and Language Processing 25 (5) (2017) 1011–1022. [25] W. Su, J. Ni, X. Li, Y. Shi, A new distortion function design for JPEG steganography using the generalized uniform embedding strategy, IEEE Transactions on Circuits and Systems for Video Technology 28 (12) (2018) 3545–3549. [26] S.E. Jero, P. Ramu, Curvelets-based ECG steganography for data security, Electronics Letters 52 (4) (2016) 283–285. [27] Hsing-Han Liu, Yuh-Chi Lin, Chia-Ming Lee, A digital data hiding scheme based on pixel-value differencing and side match method, Multimedia Tools and Applications 78 (9) (2019) 12157–12181. [28] Mohamed Abdel Hameed, Saleh Aly, M. Hassaballah, An efficient data hiding method based on adaptive directional pixel value differencing (ADPVD), Multimedia Tools and Applications 77 (12) (2018) 14705–14723. [29] Aditya Kumar Sahu, Gandharba Swain, A novel n-rightmost bit replacement image steganography technique, 3D Research 10 (1) (2018) 2. [30] Gandharba Swain, Very high capacity image steganography technique using quotient value differencing and LSB substitution, Arabian Journal for Science and Engineering 44 (4) (2019) 2995–3004. [31] Anan Banharnsakun, Artificial bee colony approach for enhancing LSB based image steganography, Multimedia Tools and Applications 77 (20) (2018) 27491–27504. [32] A. Kai Qin, Vicky Ling Huang, Ponnuthurai N. 
Suganthan, Differential evolution algorithm with strategy adaptation for global numerical optimization, IEEE Transactions on Evolutionary Computation 13 (2) (2009) 398–417. [33] Swagatam Das, Ponnuthurai Nagaratnam Suganthan, Differential evolution: a survey of the state-ofthe-art, IEEE Transactions on Evolutionary Computation 15 (1) (2011) 4–31. [34] S.K. Ghosal, J.K. Mandal, On the use of the Stirling transform in image steganography, Journal of Information Security and Applications 46 (1) (2019) 320–330. [35] Hamidreza Rashidy Kanan, Bahram Nazeri, A novel image steganography scheme with high embedding capacity and tunable visual image quality based on a genetic algorithm, Expert Systems with Applications 41 (14) (2014) 6123–6130. [36] Milad Yousefi Valandar, Peyman Ayubi, Milad Jafari Barani, A new transform domain steganography based on modified logistic chaotic map for color images, Journal of Information Security and Applications 34 (2017) 142–151.


[37] Zhaotong Li, Ying He, Steganography with pixel-value differencing and modulus function based on PSO, Journal of Information Security and Applications 43 (2018) 47–52. [38] Manjit Kaur, Vijay Kumar, Adaptive differential evolution-based Lorenz chaotic system for image encryption, Arabian Journal for Science and Engineering 43 (12) (2018) 8127–8144. [39] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, Eero P. Simoncelli, et al., Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing 13 (4) (2004) 600–612.

5 Image steganography using add-sub based QVD and side match

Anita Pradhan, K. Raja Sekhar, Gandharba Swain

Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India

5.1 Introduction

In recent years the Internet has been used as a preferred medium for data communication. However, data in transit over the Internet faces security challenges. Hence techniques like cryptography and steganography have attracted the attention of researchers. Steganography provides security to the data by burying it inside another file like an image, video, or audio [1]. Data hiding in an image should be performed without altering its inherent properties, so that the image looks innocuous [2].

Since the inception of digital image steganography, researchers have been altering the least significant bits (LSBs) to hide the data in an image. But this approach is traced by the regular–singular (RS) test [3]. Another popular steganography scheme in images is pixel value differencing (PVD), developed by Wu and Tsai [4]. In this scheme the difference between the two pixel values in a 1 × 2 size pixel block determines the data hiding capacity (HC) of that block. But this scheme is traced by the pixel difference histogram (PDH) test [5]. By using PVD steganography in 2 × 2 and 3 × 3 size blocks, the HC has been further improved [6–8]. In Wu & Tsai's original PVD, every pixel block utilizes a fixed and predefined range table. Luo et al. [9] made the range table adaptive and used 1 × 3 size blocks. The HC of a target pixel depends on its lower bound (LB) and upper bound (UB). Its LB and UB values are derived from the other pixels in the block, so the range table is adaptive. This adaptive PVD concept has been further extended to 2 × 2 and 3 × 3 size blocks to hide larger amounts of data [10]. Pradhan et al. [11] used 2 × 3 and 3 × 2 size pixel blocks to obtain better results compared to those of 2 × 2 and 3 × 3 size blocks. The HC of these schemes is lower than the HC of the traditional PVD schemes. Furthermore, PVD-dependent adaptive LSB alteration steganography has been proposed [12,13]. In these techniques the PVD concept is used to determine the HC of the pixels in a block, based on which the number of LSB bits is utilized to hide the data. Wu et al. [14] viewed that by applying PVD in textured regions and LSB replacement in non-textured regions, the distortion decreases, but the HC increases. Yang et al. [15] expressed that Wu et al.'s scheme classifies a majority of the blocks into the smooth category and a smaller number of blocks into the textured category. Hence, mainly LSB substitution is performed in


a majority of the blocks, so RS analysis can detect it. Khodaei & Faez [16] utilized 1 × 3 size pixel blocks to apply LSB replacement on the center pixel and PVD on its two neighbors. This technique is detectable by PDH test and weakened by fall off boundary problem (FOBP). Swain [17] proposed an improved version of this technique using 2 × 2 size pixel blocks to acquire higher HC. Wang et al. [18] proposed modulus function-based PVD (MFPVD), wherein they utilized the remainder value instead of the difference value. Zhao et al. [19] proposed an improved version of it. Swain [20] proved Wang et al.’s technique possesses range mismatch issue. Furthermore, he proposed an improved version of it in 2 × 3 size pixel blocks and successfully avoided the range mismatch problem. Yang et al. [21] made horizontal, vertical, and diagonal pairing in a 2 × 2 size pixel block and applied PVD in various appropriate directions. They achieved higher HC with lesser distortion. Pradhan et al. [22] classified the blocks into two categories, (i) edge regions and (ii) smooth regions. They applied PVD with LSB replacement in edge regions, but applied a combination of exploiting modification direction (EMD) and LSB replacement in smooth regions. Tang et al. [23] viewed that while using the three channels of a color pixel, all three channels should be either positive oriented or negative oriented, so that the statistical correlation among the three channels can be preserved. Li et al. [24] also viewed that to improve the security, the pixel values in highly textured regions of the image should be changed toward the same direction. There are many real-life applications of data hiding techniques. It has been viewed that steganography can be used to transfer medical data in IoT-based applications [25]. Steganography can also be applied to hide data in a batch of images in social networks [26]. Jung [27] derived higher bit planes and lower bit planes from a 1 × 2 size pixel block and then applied LSB alteration in lower order bit planes and PVD approach in higher order bit planes. The higher bit planes comprise of six most significant bits (MSBs), and the lower bit planes comprise of two LSBs. Finally, after embedding is performed, the lower and higher bit planes are combined together to form a stego-block. Swain [28] discussed through a case study that Jung’s scheme has FOBP and developed an improved version of it using 3 × 3 blocks. It avoided FOBP and resisted to both RS and PDH tests. Pradhan et al. [7] expanded Wu & Tsai’s PVD approach to multiple directions using 3 × 3 blocks. But the HC was only 2.39 bits per byte (BPB). Swain [29] extended Khodaei & Faez’s technique to 3 × 3 size blocks to obtain relatively improved performance and protection from PDH analysis. Liu et al. [30] developed a steganography scheme in 3 × 3 size pixel blocks. They applied PVD on the central pixel and its left, right, top, and bottom neighboring pixels. After getting their stego-values, they applied the side match method to embed in remaining four corner pixels of the block. Hameed et al. [31] proposed a steganography method in 2 × 2 size pixel blocks by choosing suitable hiding directions for each color channel in an RGB image to achieve the largest HC. The security has been improved because different embedding directions are used dynamically to hide the changeable number of data bits in different channels. 
Sahu & Swain [32] proposed a steganography technique, where the stego-value of a pixel depends upon the decimal value of its n-LSBs of the cover image pixel and the


decimal value of n secret data bits. It does not suffer from FOBP and possesses a higher embedding capacity. Sahu & Swain [33] developed a new kind of steganography scheme using dual stegoimage-based embedding, wherein two mirror images are formed from an original image, and then data are hidden in these images using LSB matching. Although it does not provide a much higher embedding capacity, it is reversible in nature. In general, the PVD techniques suffer from problems like FOBP and detection by the PDH test. To address these problems, this chapter integrates add-sub-based quotient value differencing (ASQVD) and the side match (SM) principle. It gives an average HC of 3.55 BPB with an average peak signal-to-noise ratio (PSNR) of 33.89 decibels (dB). Furthermore, RS and PDH tests cannot trace this steganography.

5.2 Proposed ASQVD+SM technique

5.2.1 The embedding procedure

The pixels of the original image are raster-scanned and divided logically into 3 × 3 size disjoint pixel blocks; see Fig. 5.1A. The embedding operation is applied on each block in two stages, (i) stage 1 and (ii) stage 2. Stage 1 performs ASQVD and remainder substitution, and stage 2 performs side match embedding.

FIGURE 5.1 The 3 × 3 pixel blocks. (A) The original block. (B) The stego-block after ASQVD. (C) Four SMs. (D) The stego-block after SM.

Stage 1 of the embedding procedure applies ASQVD on pixels Pc, P2, P4, P6, and P8. It is narrated in steps 1–8.

Step 1: Pc is the middle pixel in a block. Its upper, left, right, and lower neighboring pixels are P2, P4, P6, and P8, respectively. Apply 4-bit LSB replacement on Pc to get its stego-value P′c.


Step 2: Let dec_old denote the four LSBs of Pc and dec_new denote the four LSBs of P′c. Now determine the variation var = dec_old − dec_new and calculate an altered value of P′c as in Eq. (5.1). This process is called a 4-bit modified LSB (MLSB) substitution.

\[ P'_c = \begin{cases} P'_c + 2^4 & \text{if } var > 2^3 \text{ and } 0 \le P'_c + 2^4 \le 255, \\ P'_c - 2^4 & \text{if } var < -2^3 \text{ and } 0 \le P'_c - 2^4 \le 255, \\ P'_c & \text{otherwise.} \end{cases} \tag{5.1} \]

Step 3: From the pixels P′c, P2, P4, P6, and P8 the quotients and remainders are calculated using Eq. (5.2) and Eq. (5.3), respectively:

\[ Q_c = P'_c \ \mathrm{div}\ 4 \quad \text{and} \quad R_c = P'_c \ \mathrm{mod}\ 4, \tag{5.2} \]
\[ Q_i = P_i \ \mathrm{div}\ 4 \quad \text{and} \quad R_i = P_i \ \mathrm{mod}\ 4 \quad \text{for } i = 2, 4, 6, 8. \tag{5.3} \]

Step 4: Utilize Qc and Qi in Eq. (5.4) to determine the four differences

\[ d_i = |Q_c - Q_i| \quad \text{for } i = 2, 4, 6, 8. \tag{5.4} \]

For i = 2, 4, 6, 8, d_i belongs to a range of Table 5.1. Let L_i and t_i denote the LB and HC of this range, respectively. For i = 2, 4, 6, 8, determine S_i, the decimal value of t_i secret bits.

Table 5.1 Range table for proposed ASQVD technique.

Pixel difference range   {0, 7}   {8, 15}   {16, 31}   {32, 63}
Capacity of the range    2        2         3          3

Step 5: Utilize Eq. (5.5) to determine d′i, the new values of di:

\[ d'_i = L_i + S_i \quad \text{for } i = 2, 4, 6, 8. \tag{5.5} \]

Step 6: For each quotient value Qi, calculate the two candidates Q′i and Q″i using Eq. (5.6) and determine the new quotient value Q*i by Eq. (5.7):

\[ Q'_i = Q_c - d'_i \quad \text{and} \quad Q''_i = Q_c + d'_i \quad \text{for } i = 2, 4, 6, 8; \tag{5.6} \]

\[ Q^{*}_i = \begin{cases} Q'_i & \text{if } (Q''_i > 63) \text{ or } \big((0 \le Q'_i \le 63) \text{ and } (0 \le Q''_i \le 63) \text{ and } (|Q_i - Q'_i| \le |Q_i - Q''_i|)\big), \\ Q''_i & \text{if } (Q'_i < 0) \text{ or } \big((0 \le Q'_i \le 63) \text{ and } (0 \le Q''_i \le 63) \text{ and } (|Q_i - Q''_i| < |Q_i - Q'_i|)\big), \end{cases} \quad \text{for } i = 2, 4, 6, 8. \tag{5.7} \]

Step 7: For each of the remainders R2, R4, R6, and R8, their stego-values are to be found. For i = 2, 4, 6, 8, determine R′i, the decimal value of 2 secret bits; R′i is the stego-value of Ri.


Step 8: Utilizing Eq. (5.8), determine P′i, the stego-value for Pi:

\[ P'_i = Q^{*}_i \times 4 + R'_i \quad \text{for } i = 2, 4, 6, 8. \tag{5.8} \]

In this way, we have hidden the data in pixels Pc, P2, P4, P6, and P8 (see Fig. 5.2). After applying the above steps, the original block is transformed into an intermediate stego-block; see Fig. 5.1B. Stage 2 of the embedding procedure applies the SM approach on pixels P1, P3, P7, and P9. Fig. 5.1C represents the two side pixels for each of these pixels. Steps 9–12 describe stage 2 of the embedding procedure.

Step 9: The two neighboring pixels of P1 are P2 and P4. Similarly, the two neighboring pixels of P3 are P2 and P6. The two neighboring pixels of P7 are P4 and P8. The two neighboring pixels of P9 are P6 and P8. Calculate four difference values d1, d3, d7, and d9 using Eq. (5.9):

\[ d_1 = |P'_2 - P'_4|, \quad d_3 = |P'_2 - P'_6|, \quad d_7 = |P'_4 - P'_8|, \quad d_9 = |P'_6 - P'_8|. \tag{5.9} \]

Step 10: For i = 1, 3, 7, 9, calculate ni using Eq. (5.10). If ni ≥ 4, then set ni = 4.

\[ n_i = \begin{cases} 1 & \text{if } 0 \le d_i \le 1, \\ \lfloor \log_2 d_i \rfloor & \text{if } d_i > 1. \end{cases} \tag{5.10} \]

Step 11: For i = 1, 3, 7, 9, determine bi, a decimal value for ni secret bits. After embedding bi in Pi, the stego-value P′i is calculated using Eq. (5.11):

\[ P'_i = P_i - P_i \bmod 2^{n_i} + b_i. \tag{5.11} \]

Step 12: For i = 1, 3, 7, 9, calculate the difference values Δi = P′i − Pi. Apply Eq. (5.12) for further adjustment of P′i:

\[ P'_i = \begin{cases} P'_i - 2^{n_i} & \text{if } (2^{n_i - 1} < \Delta_i < 2^{n_i}) \text{ and } (P'_i \ge 2^{n_i}), \\ P'_i + 2^{n_i} & \text{if } (-2^{n_i} < \Delta_i < -2^{n_i - 1}) \text{ and } (P'_i \le 255 - 2^{n_i}). \end{cases} \tag{5.12} \]

After applying steps 9–12, we get the final stego-block; see Fig. 5.1D.
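As an informal illustration of stage 2 (Eqs. (5.9)–(5.12)), the following sketch embeds bits into the four corner pixels of one block; the data structures (a pixel map, a neighbor map, and a bit iterator) are assumptions made for compactness, not the authors' implementation.

```python
import math

def side_match_embed(P, corner_neighbors, bit_iter):
    """Sketch of stage 2 (Eqs. (5.9)-(5.12)): embed bits into corner pixels
    P[1], P[3], P[7], P[9] of a 3x3 block using the side match principle.
    `P` maps pixel index -> value (stage-1 stego-values for 2, 4, 6, 8),
    `corner_neighbors` maps corner index -> its two side-neighbor indices,
    `bit_iter` yields secret bits (0/1)."""
    for i, (a, b) in corner_neighbors.items():
        d = abs(P[a] - P[b])                              # Eq. (5.9)
        n = 1 if d <= 1 else min(int(math.log2(d)), 4)    # Eq. (5.10), capped at 4
        bits = [next(bit_iter) for _ in range(n)]
        b_val = int("".join(map(str, bits)), 2)           # decimal value of n bits
        stego = P[i] - (P[i] % (2 ** n)) + b_val          # Eq. (5.11)
        delta = stego - P[i]
        if 2 ** (n - 1) < delta < 2 ** n and stego >= 2 ** n:              # Eq. (5.12)
            stego -= 2 ** n
        elif -(2 ** n) < delta < -(2 ** (n - 1)) and stego <= 255 - 2 ** n:
            stego += 2 ** n
        P[i] = stego
    return P

# Corner -> (side neighbor, side neighbor), matching Fig. 5.1C.
corner_neighbors = {1: (2, 4), 3: (2, 6), 7: (4, 8), 9: (6, 8)}
```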

5.2.2 Extraction procedure

Like the embedding procedure, the stego-image is split into 3 × 3 size pixel blocks as in Fig. 5.1D. Data extraction is performed from each block in two stages, (i) stage 1 and (ii) stage 2. Stage 1 of the extraction procedure is applied on pixels Pc, P2, P4, P6, and P8. It is performed in steps 1–4.


FIGURE 5.2 Flowchart for data hiding.


Step 1: Extract the 4 LSBs of Pc and keep them in the extracted binary data stream (EBDS). For i = 2, 4, 6, 8, utilize Eq. (5.13) to determine the remainders and quotients of Pi. Further, utilize Eq. (5.14) to determine the quotient of Pc:

\[ Q_i = P_i \ \mathrm{div}\ 4 \quad \text{and} \quad R_i = P_i \ \mathrm{mod}\ 4, \tag{5.13} \]
\[ Q_c = P_c \ \mathrm{div}\ 4. \tag{5.14} \]

Step 2: Now use Eq. (5.15) to calculate the four difference values

\[ d_i = |Q_c - Q_i| \quad \text{for } i = 2, 4, 6, 8. \tag{5.15} \]

Step 3: Look at Table 5.1. The value di falls in one of its ranges. The HC of that range is ti, and the LB is Li. For i = 2, 4, 6, 8, use Eq. (5.16) to determine Si, the decimal value of the ti binary bits:

\[ S_i = d_i - L_i. \tag{5.16} \]

Now for i = 2, 4, 6, 8, convert Si to ti binary bits and append to EBDS.

Step 4: For i = 2, 4, 6, 8, transform Ri to 2 bits and append to EBDS. Thus stage 1 of data extraction is completed.

Stage 2 of the extraction procedure is applied on pixels P1, P3, P7, and P9. It is performed in steps 5–7.

Step 5: The two neighboring pixels of P1 are P2 and P4. Similarly, the two neighboring pixels of P3 are P2 and P6. The two neighboring pixels of P7 are P4 and P8. The two neighboring pixels of P9 are P6 and P8. Compute the four difference values d1, d3, d7, and d9 utilizing Eq. (5.17):

\[ d_1 = |P_2 - P_4|, \quad d_3 = |P_2 - P_6|, \quad d_7 = |P_4 - P_8|, \quad d_9 = |P_6 - P_8|. \tag{5.17} \]

Step 6: For i = 1, 3, 7, 9, calculate ni using Eq. (5.18). If ni ≥ 4, then set ni = 4.

\[ n_i = \begin{cases} 1 & \text{if } 0 \le d_i \le 1, \\ \lfloor \log_2 d_i \rfloor & \text{if } d_i > 1. \end{cases} \tag{5.18} \]

Step 7: For i = 1, 3, 7, 9, calculate b_i = P_i mod 2^{n_i}, convert each b_i into n_i binary data bits, and append to EBDS. Thus stage 2 of the extraction procedure is completed. (See Fig. 5.3.)
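The corresponding stage-2 extraction can be sketched in the same style; the pixel and neighbor maps are assumptions carried over from the embedding sketch above.

```python
import math

def side_match_extract(P, corner_neighbors):
    """Sketch of stage 2 extraction (Eqs. (5.17)-(5.18) and step 7): recover the
    bits hidden in the corner pixels of a 3x3 stego-block. `P` maps pixel
    index -> stego value; `corner_neighbors` maps corner index -> its two
    side-neighbor indices."""
    bits = []
    for i, (a, b) in corner_neighbors.items():
        d = abs(P[a] - P[b])                              # Eq. (5.17)
        n = 1 if d <= 1 else min(int(math.log2(d)), 4)    # Eq. (5.18)
        b_val = P[i] % (2 ** n)                           # step 7
        bits.extend(int(x) for x in format(b_val, f"0{n}b"))
    return bits
```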

5.2.3 Example of embedding and extraction

Consider an example block shown in Fig. 5.4A. The decimal values of the pixels are P1 = 100, P2 = 101, P3 = 102, P4 = 101, Pc = 102, P6 = 103, P7 = 101, P8 = 102, and P9 = 104. Suppose the secret data to be buried is (0111 01 00 01 10 01 00 10 11 1 0 10 1)₂. The embedding of this data into the block is explained below step-by-step.


FIGURE 5.3 Flowchart for data extraction.

FIGURE 5.4 (A) A 3 × 3 size original block. (B) The 3 × 3 size stego-block.

Step 1: Pc = 102 = (01100110)₂. Take 4 bits of data (i.e., 0111) from the secret binary data stream (SBDS) and apply 4-bit LSB alteration on Pc. We obtain its stego-value P′c = (01100111)₂ = 103.


Step 2: The four LSBs of Pc are (0110)₂, so dec_old = 6. The four LSBs of P′c are (0111)₂, so dec_new = 7. The variation is var = 6 − 7 = −1. Let us apply Eq. (5.1). The first two cases are not applicable, and the third case is applicable. Thus P′c = 103.

Step 3: Using Eq. (5.2) and Eq. (5.3), we get Qc = 25, Q2 = 25, Q4 = 25, Q6 = 25, Q8 = 25, Rc = 3, R2 = 1, R4 = 1, R6 = 3, and R8 = 2.

Step 4: Using Eq. (5.4), we get d2 = 0, d4 = 0, d6 = 0, d8 = 0. From Table 5.1, d2 ∈ {0, 7}, so L2 = 0 and t2 = 2; d4 ∈ {0, 7}, so L4 = 0 and t4 = 2; d6 ∈ {0, 7}, so L6 = 0 and t6 = 2; d8 ∈ {0, 7}, so L8 = 0 and t8 = 2. Collect the next 2 bits from the SBDS and change to decimal, so S2 = 1. Collect the next 2 bits and change to decimal, so S4 = 0. Collect the next 2 bits and change to decimal, so S6 = 1. Collect the next 2 bits and change to decimal, so S8 = 2.

Step 5: Using Eq. (5.5), we get d′2 = 1, d′4 = 0, d′6 = 1, and d′8 = 2.

Step 6: Using Eq. (5.6), we get Q′2 = 24, Q′4 = 25, Q′6 = 24, and Q′8 = 23, and Q″2 = 26, Q″4 = 25, Q″6 = 26, and Q″8 = 27. Furthermore, applying Eq. (5.7), we get Q*2 = 24, Q*4 = 25, Q*6 = 24, and Q*8 = 23.

Step 7: Collecting the next 2 bits from the SBDS and changing to decimal, we get R′2 = 1. Collecting the next 2 bits, we get R′4 = 0. Collecting the next 2 bits, we get R′6 = 2. Collecting the next 2 bits, we get R′8 = 3.

Step 8: Using Eq. (5.8), we get P′2 = 24 × 4 + 1 = 97, P′4 = 25 × 4 + 0 = 100, P′6 = 24 × 4 + 2 = 98, and P′8 = 23 × 4 + 3 = 95.

Step 9: Using Eq. (5.9), we get d1 = 3, d3 = 1, d7 = 5, and d9 = 3.

Step 10: Using Eq. (5.10), we get n1 = 1, n3 = 1, n7 = 2, and n9 = 1.

Step 11: n1 = 1, so collecting 1 bit of data from the SBDS and changing to decimal, we get b1 = 1. n3 = 1, so collecting 1 bit, we get b3 = 0. n7 = 2, so collecting 2 bits, we get b7 = 2. n9 = 1, so collecting 1 bit, we get b9 = 1. Now using Eq. (5.11), we get P′1 = 100 − 100 mod 2^1 + 1 = 101, P′3 = 102 − 102 mod 2^1 + 0 = 102, P′7 = 101 − 101 mod 2^2 + 2 = 102, and P′9 = 104 − 104 mod 2^1 + 1 = 105.

Step 12: Δ1 = P′1 − P1 = 1, Δ3 = P′3 − P3 = 0, Δ7 = P′7 − P7 = 1, and Δ9 = P′9 − P9 = 1. Now by applying Eq. (5.12) we get no change in the values of P′1, P′3, P′7, and P′9. Thus their final values are P′1 = 101, P′3 = 102, P′7 = 102, and P′9 = 105. The stego-block is depicted in Fig. 5.4B.

Consider the stego-block depicted in Fig. 5.4B: P1 = 101, P2 = 97, P3 = 102, P4 = 100, Pc = 103, P6 = 98, P7 = 102, P8 = 95, and P9 = 105. The extraction from this block is explained step-by-step below.

Step 1: Using Eq. (5.13) and Eq. (5.14), we get Qc = 25, Q2 = 24, Q4 = 25, Q6 = 24, Q8 = 23, and R2 = 1, R4 = 0, R6 = 2, R8 = 3. Pc = 103 = (01100111)₂. Extract the 4 LSBs of Pc and concatenate with the extracted binary data stream (EBDS). Thus EBDS = 0111.


Step 2: Using Eq. (5.15), we get d2 = 1, d4 = 0, d6 = 1, and d8 = 2.

Step 3: d2 ∈ {0, 7}, so L2 = 0 and t2 = 2; d4 ∈ {0, 7}, so L4 = 0 and t4 = 2; d6 ∈ {0, 7}, so L6 = 0 and t6 = 2; d8 ∈ {0, 7}, so L8 = 0 and t8 = 2. Using Eq. (5.16), we get S2 = d2 − L2 = 1, S4 = d4 − L4 = 0, S6 = d6 − L6 = 1, and S8 = d8 − L8 = 2. Converting S2 to t2 binary bits, we get (01)₂. Converting S4 to t4 binary bits, we get (00)₂. Converting S6 to t6 binary bits, we get (01)₂. Converting S8 to t8 binary bits, we get (10)₂. These eight bits are appended to the EBDS. Now EBDS = 0111 01 00 01 10.

Step 4: Convert R2 to 2 binary bits and append to the EBDS. Convert R4 to 2 binary bits and append to the EBDS. Convert R6 to 2 binary bits and append to the EBDS. Convert R8 to 2 binary bits and append to the EBDS. Now EBDS = 0111 01 00 01 10 01 00 10 11.

Step 5: Using Eq. (5.17), we get d1 = |P2 − P4| = 3, d3 = |P2 − P6| = 1, d7 = |P4 − P8| = 5, and d9 = |P6 − P8| = 3.

Step 6: Using Eq. (5.18), we get n1 = 1, n3 = 1, n7 = 2, and n9 = 1.

Step 7: Now we calculate b1 = P1 mod 2^{n1} = 101 mod 2 = 1, b3 = P3 mod 2^{n3} = 102 mod 2 = 0, b7 = P7 mod 2^{n7} = 102 mod 4 = 2, and b9 = P9 mod 2^{n9} = 105 mod 2 = 1. Convert b1 to n1 binary bits and append to the EBDS. Convert b3 to n3 binary bits and append to the EBDS. Convert b7 to n7 binary bits and append to the EBDS. Convert b9 to n9 binary bits and append to the EBDS. Thus EBDS = 0111 01 00 01 10 01 00 10 11 1 0 10 1. These bits are the same as the embedded bits. Extraction is completed.

5.3 Experimental analysis

The proposed steganography technique has been programmed in MATLAB® R2015a. Experiments are performed with various images from the SIPI image database [34]. The results for eight samples are recorded in Table 5.2. Fig. 5.5 lists four sample test images, and Fig. 5.6 depicts their four stego-images. These stego-images look innocent, and no visible marks are present in them.

Table 5.2 Performance measurement of the proposed scheme.

Test images (Size = 512 × 512 × 3)   PSNR    Capacity    QI       BPB
Pot                                  34.30   2,502,088   0.9978   3.18
House                                34.38   2,864,210   0.9961   3.64
Boat                                 33.36   2,863,565   0.9968   3.64
Jet                                  34.51   2,727,179   0.9941   3.47
Peppers                              33.32   2,714,012   0.9965   3.45
Tiffany                              33.69   2,727,278   0.9918   3.47
Baboon                               32.77   3,190,636   0.9945   4.06
Lena                                 34.80   2,765,439   0.9969   3.52
Average                              33.89   2,794,301   0.9956   3.55


FIGURE 5.5 Original images: (A) Lena, (B) Baboon, (C) Tiffany, (D) Peppers, (E) Jet, (F) Boat, (G) House, (H) Pot.

FIGURE 5.6 The stego-images: (A) PSNR = 34.80, (B) PSNR = 32.77, (C) PSNR = 33.69, (D) PSNR = 33.32.

The efficiency of this scheme has been measured by three parameters. The first parameter is PSNR, which calculates the distortion of the stego-image. The next important parameter is the HC, which specifies the total amount of data embeddable in the image. Sometimes the HC per byte is termed bits per byte (BPB). The third parameter is the quality index QI, which estimates the resemblance between two images. The PSNR calculation is represented in Eq. (5.19), where P_{ij} is the original pixel, Q_{ij} is the stego-pixel, and m × n represents the image size. For a color image, every byte can be taken as a unit of calculation, and Eq. (5.19) can be applied. For the test images of Fig. 5.5, m = 512 and n = 1536.

\[ \mathrm{PSNR} = 10 \times \log_{10} \frac{255 \times 255 \times m \times n}{\sum_{i=1}^{m} \sum_{j=1}^{n} (P_{ij} - Q_{ij})^2}. \tag{5.19} \]


Table 5.3 Performance comparison of various existing schemes with the proposed scheme.

Steganography technique            Average PSNR   Average HC   Average QI   Average BPB
7-Directional PVD in [7]           39.78          1,885,371    0.9985       2.39
LSB+PVD in [16] type 1             40.11          2,385,386    0.9988       3.03
LSB+PVD in [16] type 2             38.57          2,466,275    0.9984       3.13
8-Directional PVD in [29] type 1   39.55          2,361,368    0.9986       3.00
8-Directional PVD in [29] type 2   37.22          2,603,604    0.9977       3.31
Proposed ASQVD+SM Scheme           33.89          2,794,301    0.9956       3.55

To measure the QI, Eq. (5.20) can be used. In Eq. (5.20), \( \bar{P} \) and \( \bar{Q} \) denote the mean pixel values of the original and stego-images, respectively.

\[ \mathrm{QI} = \frac{4 \times \bar{P} \times \bar{Q} \times \sum_{i=1}^{m} \sum_{j=1}^{n} (P_{ij} - \bar{P})(Q_{ij} - \bar{Q})}{\left\{ \sum_{i=1}^{m} \sum_{j=1}^{n} (P_{ij} - \bar{P})^2 + \sum_{i=1}^{m} \sum_{j=1}^{n} (Q_{ij} - \bar{Q})^2 \right\} \times \left\{ (\bar{P})^2 + (\bar{Q})^2 \right\}}. \tag{5.20} \]
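A compact sketch of Eq. (5.20) is shown below (computed over whole images, which is an assumption; block-wise variants of the quality index also exist):

```python
import numpy as np

def quality_index(cover, stego):
    """Quality index QI as in Eq. (5.20), computed over whole images."""
    p = cover.astype(np.float64)
    q = stego.astype(np.float64)
    p_mean, q_mean = p.mean(), q.mean()
    dp, dq = p - p_mean, q - q_mean
    numerator = 4 * p_mean * q_mean * np.sum(dp * dq)
    denominator = (np.sum(dp ** 2) + np.sum(dq ** 2)) * (p_mean ** 2 + q_mean ** 2)
    return numerator / denominator
```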

The PSNR, HC, QI, and BPB values of the proposed scheme are presented in Table 5.2. The comparison of these parameters among different techniques is presented in Table 5.3. The PSNR and QI calculations are performed using stegoimages with 700,000 bits of data hidden in them. We can notice from this table that the average HC of our scheme is 3.55 BPB. The average HC of technique [7] is 2.39 BPB. The average HC of technique [16] is 3.03 BPB and 3.13 BPB for variants 1 and 2, respectively. Similarly, the average HC of technique [29] is 3.00 BPB and 3.31 BPB for variants 1 and 2, respectively. The PSNR of our scheme is 33.89 dB, which is in the acceptable range of {30, 40}. Ideally, a stegoimage with a PSNR value larger than 40 dB is imperceptible. A stegoimage with a PSNR lower than 30 dB is unacceptable because of high distortion. A stegoimage with a PSNR value lower than 40 dB but greater than 30 dB has moderate distortion and is acceptable. Fig. 5.7 gives a bar-chart to compare BPB and PSNR. The BPB values of the proposed scheme are found to be greater than those of the other techniques.

The developed scheme uses the principles of differencing and substitution. So it is customary to evaluate its robustness by analyzing its security through PDH and RS tests. The PDH is a two-dimensional curve: the x-axis depicts the pixel difference, and the y-axis depicts the frequency of the pixel difference [37,38]. It has been found that for natural images, the PDH curve is smooth in nature, which means that the curve does not have a zig-zag shape. But in a stegoimage of the traditional PVD technique, this zig-zag shape is present [35]. Fig. 5.8 represents the PDH analysis of eight test images. There are eight subdiagrams, one for each image. In each subdiagram the PDH of the original image is depicted by a solid line, and the PDH of the stegoimage is depicted by a dotted line. Note that the dotted curve in each subdiagram is free from zig-zag appearance, which implies that the PDH test cannot detect the proposed scheme.

The RS test was applied to the proposed technique by referring to the four estimations Rm, R−m, Sm, and S−m [36]. The procedure is as follows. Divide the image M into a


FIGURE 5.7 PSNR (Series 1) and BPB (Series 2) of the proposed scheme and existing schemes.

finite number of equisized blocks. Let G be one block, and let its constituent pixels be g1, g2, g3, ..., gn. Then use the function \( f(g_1, g_2, g_3, \ldots, g_n) = \sum_{i=1}^{n-1} |g_{i+1} - g_i| \) to compute the smoothness of G. Furthermore, define two functions F1: 2n ↔ 2n + 1 and F−1: 2n ↔ 2n − 1. F1 indicates that a pixel value 2n alters to 2n + 1 and vice versa because of LSB substitution. Similarly, F−1 indicates that a pixel value 2n alters to 2n − 1 and vice versa because of LSB substitution. Then apply F1 and F−1 over all the blocks of M to compute Rm, Sm, R−m, and S−m using Eqs. (5.21)–(5.24):

\[ R_m = \frac{\text{No. of blocks satisfying } f(F_1(G)) > f(G)}{\text{Total number of blocks}}, \tag{5.21} \]
\[ S_m = \frac{\text{No. of blocks satisfying } f(F_1(G)) < f(G)}{\text{Total number of blocks}}, \tag{5.22} \]
\[ R_{-m} = \frac{\text{No. of blocks satisfying } f(F_{-1}(G)) > f(G)}{\text{Total number of blocks}}, \tag{5.23} \]
\[ S_{-m} = \frac{\text{No. of blocks satisfying } f(F_{-1}(G)) < f(G)}{\text{Total number of blocks}}. \tag{5.24} \]

As per the concept in RS test, if R−m − S−m > Rm − Sm , then there is data inside the image. If Rm ≈ R−m > Sm ≈ S−m is true, then there are no data inside the image. Fig. 5.9 represents the RS test of four images. There are four subdiagrams, one for each image. We can observe that Rm is close to R−m , and they are almost parallel to each other, which means that Rm ≈ R−m is achieved. Similarly, Sm is close to S−m , and they are almost parallel to each other, which means that Sm ≈ S−m is achieved. Furthermore, the lines represented by Rm or R−m are above the lines represented by Sm or S−m . Hence we can conclude that Rm ≈ R−m > Sm ≈ S−m , which means that the proposed scheme is not detectable by RS test.
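A simplified sketch of how the four statistics of Eqs. (5.21)–(5.24) might be computed is given below; the group size, scan order, and boundary handling are assumptions, and the chapter's exact RS masks are not reproduced here.

```python
import numpy as np

def rs_statistics(image, block_size=4):
    """Sketch of R_m, S_m, R_-m, S_-m (Eqs. (5.21)-(5.24)) over non-overlapping
    groups of `block_size` consecutive pixels."""
    def smoothness(g):
        return np.sum(np.abs(np.diff(g)))     # f(g1, ..., gn)

    def f1(g):         # F1: 2n <-> 2n + 1 (flip the LSB)
        return g ^ 1

    def f_minus1(g):   # F-1: 2n <-> 2n - 1
        return ((g + 1) ^ 1) - 1

    pixels = image.flatten().astype(np.int32)
    n = (pixels.size // block_size) * block_size
    groups = pixels[:n].reshape(-1, block_size)
    counts = {"Rm": 0, "Sm": 0, "R-m": 0, "S-m": 0}
    for g in groups:
        base = smoothness(g)
        counts["Rm"] += smoothness(f1(g)) > base
        counts["Sm"] += smoothness(f1(g)) < base
        counts["R-m"] += smoothness(f_minus1(g)) > base
        counts["S-m"] += smoothness(f_minus1(g)) < base
    total = len(groups)
    return {k: v / total for k, v in counts.items()}
```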


FIGURE 5.8 PDH analysis over (A) Lena, (B) Baboon, (C) Tiffany, (D) Peppers, (E) Jet, (F) Boat, (G) House, and (H) Pot images.


FIGURE 5.9 RS analysis over (A) Lena, (B) Baboon, (C) Peppers, and (D) Boat images.

5.4 Conclusion

A steganography mechanism combining two principles, ASQVD and SM, has been advanced in this chapter. It performs two stages of hiding in 3 × 3 size disjoint pixel blocks. In stage 1, it performs ASQVD and remainder substitution on the central pixel and its four side-neighboring pixels, i.e., the left, right, lower, and upper neighbors. Based on the new values of these five pixels, in stage 2, it performs the SM embedding approach on the four corner pixels. The developed scheme has been programmed in MATLAB and compared with related existing schemes in terms of PSNR, HC, and QI. It is evident from the recorded observations that these parameter values for the proposed technique are satisfactory. Furthermore, the HC is greater than the HC of related existing techniques. To judge the security of the proposed scheme, RS and PDH tests have been performed. From the experimental PDH curves, we observed that the PDH test fails to recognize the proposed steganography scheme. Similarly, from the RS curves we observed that Rm ≈ R−m > Sm ≈ S−m, so the RS test also fails to recognize the proposed scheme.


References [1] A. Cheddad, J. Condell, K. Curran, P.M. Kevitt, Digital image steganography: survey and analysis of current methods, Signal Processing 90 (2010) 727–752. [2] A. Martin, G. Sapiro, G. Seroussi, Is image steganography natural?, IEEE Transactions on Image Processing 14 (12) (2005) 2040–2050. [3] J. Fridrich, M. Goljian, R. Du, Detecting LSB steganography in color and gray-scale images, Magazine of IEEE Multimedia and Security 8 (4) (2001) 22–28. [4] D.C. Wu, W.H. Tsai, A steganographic method for images by pixel value differencing, Pattern Recognition Letters 24 (9) (2003) 1613–1626. [5] X. Zhang, S. Wang, Vulnerability of pixel-value differencing steganography to histogram analysis and modification for enhanced security, Pattern Recognition Letters 25 (2004) 331–339. [6] Y.P. Lee, J.C. Lee, W.K. Chen, K.C. Chang, I.J. Su, C.P. Chang, High-payload image hiding with quality recovery using tri-way pixel-value differencing, Information Sciences 191 (2012) 214–225. [7] A. Pradhan, K.R. Sekhar, G. Swain, Digital image steganography based on seven way pixel value differencing, Indian Journal of Science and Technology 9 (37) (2016) 1–11. [8] K.A. Darabkh, A.K. Al-Dhamari, I.F. Jafar, A new steganographic algorithm based on multi directional PVD and modified LSB, Journal of Information Technology and Control 46 (1) (2017) 16–36. [9] W. Luo, F. Huang, J. Huang, A more secure steganography based on adaptive pixel-value differencing scheme, Multimedia Tools and Applications 52 (2010) 407–430. [10] G. Swain, Adaptive pixel value differencing steganography using both vertical and horizontal edges, Multimedia Tools and Applications 75 (2016) 13541–13556. [11] A. Pradhan, K.R. Sekhar, G. Swain, Adaptive PVD steganography using horizontal, vertical, and diagonal edges in six-pixel blocks, Security and Communication Networks 2017 (2017) 1–13. [12] X. Liao, Q.Y. Wen, J. Zhang, A steganographic method for digital images with four-pixel differencing and modified LSB substitution, Journal of Visual Communication and Image Representation 22 (1) (2011) 1–8. [13] G. Swain, Digital image steganography using nine-pixel differencing and modified LSB substitution, Indian Journal of Science and Technology 7 (9) (2014) 1444–1450. [14] H.C. Wu, N.I. Wu, C.S. Tsai, M.S. Hwang, Image steganographic scheme based on pixel-value differencing and LSB replacement methods, IEEE Proceedings Vision, Image and Signal Processing 152 (5) (2005) 611–615. [15] C.H. Yang, C.Y. Weng, S.J. Wang, H.M. Sun, Varied PVD+LSB evading programs to spatial domain in data embedding systems, The Journal of Systems and Software 83 (10) (2010) 1635–1643. [16] M. Khodaei, K. Faez, New adaptive steganographic method using least-significant-bit substitution and pixel-value differencing, IET Image Processing 6 (6) (2012) 677–686. [17] G. Swain, A steganographic method combining LSB substitution and PVD in a block, Procedia Computer Science 85 (2016) 39–44. [18] C.M. Wang, N.I. Wu, C.S. Tsai, M.S. Hwang, A high quality steganographic method with pixel-value differencing and modulus function, The Journal of Systems and Software 81 (2008) 150–158. [19] W. Zhao, Z. Jie, L. Xin, W. Qiaoyan, Data embedding based on pixel value differencing and modulus function using indeterminate equation, The Journal of China Universities of Posts and Telecommunications 22 (1) (2015) 95–100. [20] G. 
Swain, Two new steganography techniques based on quotient value differencing with additionsubtraction logic and PVD with modulus function, Optik – International Journal for Light and Electron Optics 180 (2019) 807–823. [21] C.H. Yang, C.Y. Weng, H.K. Tso, S.J. Wang, A data hiding scheme using the varieties of pixel-value differencing in multimedia images, The Journal of Systems and Software 84 (2011) 669–678. [22] A. Pradhan, K.R. Sekhar, G. Swain, Digital image steganography using LSB substitution, PVD, and EMD, Mathematical Problems in Engineering 2018 (2018) 1–11. [23] W. Tang, B. Li, W. Luo, J. Huang, Clustering steganographic modification directions for color components, IEEE Signal Processing Letters 23 (2) (2016) 197–201. [24] B. Li, M. Wang, X. Li, S. Tan, J. Huang, A strategy of clustering modification directions in spatial image steganography, IEEE Transactions on Information Forensics and Security 10 (9) (2015) 1905–1917.


[25] M. Elhoseny, G. Ramirez-Gonzalez, O.M. Abu-Elnasr, S.A. Shawkat, N. Arunkumar, A. Farouk, Secure medical data transmission model for IoT-based healthcare systems, IEEE Access 6 (2018) 20596–20608. [26] F. Li, K. Wu, X. Zhang, J. Yu, J. Lei, M. Wen, Robust batch steganography in social networks with nonuniform payload and data decomposition, IEEE Access 6 (2018) 29912–29914. [27] K.H. Jung, Data hiding scheme improving embedding capacity using mixed PVD and LSB on bit plane, Journal of Real-Time Image Processing 14 (1) (2018) 127–136. [28] G. Swain, Very high capacity image steganography technique using quotient value differencing and LSB substitution, Arabian Journal for Science and Engineering 44 (4) (2019) 2995–3004. [29] G. Swain, Digital image steganography using eight-directional PVD against RS analysis and PDH analysis, Advances in Multimedia 2018 (2018) 1–13. [30] H.-H. Liu, Y.-C. Lin, A digital data hiding scheme based on pixel-value differencing and side match method, Multimedia Tools and Applications 78 (9) (2019) 12157–12181. [31] M.A. Hameed, S. Aly, M. Hassaballah, An efficient data hiding method based on adaptive directional pixel value differencing (ADPVD), Multimedia Tools and Applications 77 (12) (2018) 14705–14723. [32] A.K. Sahu, G. Swain, A novel n-rightmost bit replacement image steganography technique, 3D Research 10 (2) (2019) 1–18. [33] A.K. Sahu, G. Swain, Dual stego-imaging based reversible data hiding using improved LSB matching, International Journal of Intelligent Engineering and Systems 12 (5) (2019) 63–73. [34] The USC-SIPI image database [Online]. Available: http://sipi.usc.edu/database. (Accessed 20 October 2018). [35] A. Pradhan, A.K. Sahu, G. Swain, K.R. Sekhar, Performance evaluation parameters of image steganography techniques, in: IEEE International Conference on Research Advances in Integrated Navigation Systems, 2016, pp. 1–8. [36] G. Swain, Advanced Digital Image Steganography Using LSB, PVD, and EMD: Emerging Research and Opportunities, IGI Global, 2019. [37] A.K. Sahu, G. Swain, An optimal information hiding approach based on pixel value differencing and modulus function, Wireless Personal Communications 108 (1) (2019) 159–174. [38] A.K. Sahu, G. Swain, High fidelity based reversible data hiding using modified LSB matching and pixel difference, Journal of King Saud University: Computer and Information Sciences (2019), https://doi. org/10.1016/j.jksuci.2019.07.004.

6 A high-capacity invertible steganography method for stereo image

Phuoc-Hung Vo(a,b), Thai-Son Nguyen(a), Van-Thanh Huynh(a), Thanh-Nghi Do(b)

(a) School of Engineering and Technology, Tra Vinh University, Tra Vinh City, Tra Vinh Province, Vietnam
(b) College of Information Technology, Can Tho University, Can Tho, Vietnam

6.1 Introduction

The rapid development of the Internet has made digital communication easy. It has a great influence on our daily life, such as multimedia communication; meanwhile, it also brings many challenges for securing information over open networks. To protect secret messages from being stolen during transmission, numerous methods have been proposed. Overall, these approaches fall under information encryption and information hiding [1]. Information encryption is well known as cryptography, which is used to encode the secret message in such a way that it becomes meaningless to eavesdroppers. However, the scrambled appearance of encrypted information can attract attacks. Therefore an invisible communication system is required. Information hiding, referred to as steganography, fulfills such a requirement. Steganography is a technique of embedding information into digital media aiming to conceal the existence of the information. The digital media is also referred to as a cover object, whereas the digital media with hidden information is called a stego-object [2]. The stego-object is an entirely meaningful object and looks completely like the cover object. Therefore the stego-object can easily pass eavesdroppers when it is transmitted to legal users over the network.

In fact, data hiding includes steganography and watermarking approaches, which are closely related to each other. However, there is a little difference in application and evaluation between these two approaches. Digital watermarking is the technology used to embed information into the actual information for protecting it from unauthorized use [3–5]. Focused upon the issue of protecting secret information from being stolen during transmission, steganography, a method that hides a secret message into a carrier and makes such a message unnoticed and less attractive, is one of the solutions for applications in the military, forensic, and medical areas [6,7]. Because of the secret message


concealing, a steganography system is considered failed if the existence of the secret information is disclosed. In general, several types of channels, for example, image, video, text, audio, and network protocol, are used as cover objects. The image channel is the most popular one [8]. Image steganography can be broadly classified into two categories, the spatial domain techniques and the frequency domain techniques. Typically, in the spatial domain techniques the secret information is directly embedded into the pixel values of the cover images. The most common digital steganography technique is the least significant bit (LSB) replacement method, which hides the secret data by replacing some LSB bits of cover pixels [9–12]. There is a wide variety of existing image steganography methods in the spatial domain, such as Tian's scheme [13], which exploits the difference expansion (DE) of two pixels in the cover image to embed the secret information. In 2006 an efficient steganography method based on exploiting modification direction was first introduced by Zhang and Wang [14]; in its main embedding formula the secret data are converted into a (2n + 1)-ary notational system, and then each secret digit is embedded into n adjacent pixels. Abdel Hameed et al. [15] observed that the traditional pixel-value differencing (PVD) was applied only to two horizontal consecutive pixels of grayscale images, whereas in color images the edge information is different for each channel, which allows hiding more bits in the diagonal and vertical directions. Thus they proposed a new adaptive steganography method for color images using adaptive directional pixel-value differencing (ADPVD). Liu et al. [16] proposed a method in which the PVD scheme and the side match method are combined to enable maximum embedding capacity while maintaining an acceptable PSNR. Sahu and Swain [17] proposed a novel n-rightmost bit replacement image steganography method to embed the secret information in the image, in which a trade-off between PSNR and embedding capacity is achieved. Another approach to digital steganography in the spatial domain is revealed by Ni et al. [6], who proposed invertible data hiding based on histogram shifting, in which the zero or minimum points of the image histogram are mainly concerned for embedding the secret message. Nguyen et al. [18], carefully selecting image regions and using the optimal embedding modification direction table, developed a matrix for the modification of the difference histogram for embedding the secret data. In the frequency domain techniques the pixel values are first transformed into the frequency domain. Then the obtained frequency coefficients are modified to embed the secret messages. There are many transform domain steganography methods: the discrete cosine transform (DCT) [7,19], singular value decomposition (SVD) [3,20–22], and discrete wavelet transform (DWT) [23,24]. In addition, image steganography can be divided into two main groups, noninvertible steganography and invertible steganography, based on whether the cover image can be reconstructed after the embedded information is extracted from the stegoimage. Many noninvertible steganography techniques, known as irreversible steganography techniques, were proposed in the literature [25,26]. In these schemes the cover image experiences some distortion. This means that the image cannot be inverted back to the original form.
In contrast, the invertible steganography techniques, also named


as reversible or lossless, are widely researched for use in real-life applications in the medical, military, and forensic fields. The main reason is that the original image is required to be completely restored in these applications. As a consequence, the invertible steganography techniques are sufficient to satisfy these constraints [6,27–29].

In recent times, with the development of high-technology devices, stereo images that are acquired from two eyes/cameras for depth perception have been widely used in many applications, such as flight simulators for pilot training, robot vision, medical surgery, virtual reality games, and autonomous navigation [30]. In fact, a stereo image includes two images, a left image and a right image, and is similar to the human visual system; the difference between them is just a binocular disparity. Due to the proliferation of stereo images in various applications, they attract more attention from researchers in the data hiding field. Several previous works have made some progress in watermarking techniques for stereo images, especially in the copyright protection field [30–33]. In [30], Zhou et al. introduced a fragile watermarking scheme based on binocular visual characteristics of stereo images for authentication and tamper detection. Yang and Chen [27] proposed a lossless DCT-based data hiding method. Based on the characteristics of stereo images, the scheme searches similar block pairs in the left and right images for embedding secret messages in the DCT domain. Vo et al. [7] proposed an invertible steganography method based on two-dimensional histogram shifting for quantized DCT (QDCT) coefficients. The scheme is mostly focused on the upper right corner of a two-dimensional histogram for embedding the secret messages. In general, these schemes reach a trade-off between human visual quality and embedding capacity. Although these methods can achieve a high embedding capacity and good visual imperceptibility, the Yang and Chen method [27] depends upon the number of difference values between the QDCT coefficients in the left and right images that are equal to zero, whereas Vo et al.'s scheme [7] is completely focused on the top-right corner of the 2-D histogram. Therefore the embedding capacity is slightly restricted because the embedding process skips over the other corners of the 2-D histogram.

To achieve a balanced relationship between the embedding capacity and the visual quality of a stego-stereo image, this work adaptively changes the embedding and extracting processes based on 2-D histogram shifting by building an embedding direction histogram (EDH). It exploits the EDH to embed the secret data in three directions of the 2-D histogram. Each pair of coefficients in the middle-frequency DCT band embeds 3 bits based on the EDH. The embedding capacity is slightly increased, and the visual quality of the stego-stereo image is preserved as well.

The rest of this chapter is organized as follows. In Section 6.2, we briefly describe the theoretical background of the discrete cosine transform and related work. The proposed model, which shifts QDCT coefficients in the 2-D histogram with the EDH, is given in Section 6.3. Experimental results obtained in tests of the proposed algorithm are presented in Section 6.4. Finally, Section 6.5 concludes the chapter.


6.2 Preliminaries

In this section, we briefly introduce basic knowledge and related work. First, we present the mathematical definition of the discrete cosine transform (DCT). Then we review Yang and Chen's method in detail.

6.2.1 Discrete cosine transform (DCT) and quantized DCT (QDCT)

The discrete cosine transform is an important technique for transforming a signal or an image from the spatial domain to the frequency domain. It is best known from the JPEG (Joint Photographic Experts Group) image compression standard [34]. The DCT separates the image into nonoverlapping blocks of size N × N. Each block is then transformed through the forward DCT (FDCT) and recovered through the inverse DCT (IDCT), defined as follows.

Forward DCT:
\[
C(u,v) = \frac{2}{N}\, w(u)\, w(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\!\left(\frac{\pi (2x+1) u}{2N}\right) \cos\!\left(\frac{\pi (2y+1) v}{2N}\right), \tag{6.1}
\]
where N = 8, and
\[
w(e) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{if } e = 0, \\[2pt] 1 & \text{otherwise}. \end{cases}
\]

Inverse DCT:
\[
f(x,y) = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} w(u)\, w(v)\, C(u,v)\, \cos\!\left(\frac{\pi (2x+1) u}{2N}\right) \cos\!\left(\frac{\pi (2y+1) v}{2N}\right), \tag{6.2}
\]
where N = 8, C(u,v) with u and v running from 0 to N − 1 is the DCT coefficient at the coordinate (u,v), and f(x,y) with x and y running from 0 to N − 1 is the pixel value at the coordinate (x,y). After the FDCT, the uppermost left element is the DC coefficient, and the remaining elements are the AC coefficients, as depicted in Fig. 6.1. The data can be precisely recovered through the IDCT.

Quantization of the DCT is the step where information that is not visually significant is discarded: every coefficient in the 8 × 8 block is divided by the corresponding value of a quantization matrix. The QDCT coefficient is calculated by Eq. (6.3), and dequantization is the inverse operation defined in Eq. (6.4):
\[
C(u,v)_Q = \operatorname{round}\!\left(\frac{C(u,v)}{Q(u,v)}\right), \tag{6.3}
\]
\[
C(u,v)_{deQ} = C(u,v)_Q \times Q(u,v), \tag{6.4}
\]
where Q is the quantization matrix. Fig. 6.2 displays the quantization matrix with the quality factor QF equal to 50.
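To make these steps concrete, the following minimal Java sketch implements the 8 × 8 forward DCT of Eq. (6.1) and the quantization of Eq. (6.3). The synthetic test block and the flat quantization matrix are illustrative assumptions; a real JPEG pipeline would use the table of Fig. 6.2.

// Minimal sketch of the 8x8 forward DCT (Eq. (6.1)) and quantization (Eq. (6.3)).
// The flat quantization matrix used here is an illustrative assumption; JPEG uses
// the standard luminance table scaled by the quality factor (Fig. 6.2).
public class DctDemo {
    static final int N = 8;

    static double w(int e) { return e == 0 ? 1.0 / Math.sqrt(2.0) : 1.0; }

    // Forward DCT of one 8x8 block of pixel values f(x, y).
    static double[][] fdct(double[][] f) {
        double[][] c = new double[N][N];
        for (int u = 0; u < N; u++)
            for (int v = 0; v < N; v++) {
                double sum = 0.0;
                for (int x = 0; x < N; x++)
                    for (int y = 0; y < N; y++)
                        sum += f[x][y]
                             * Math.cos(Math.PI * (2 * x + 1) * u / (2.0 * N))
                             * Math.cos(Math.PI * (2 * y + 1) * v / (2.0 * N));
                c[u][v] = 2.0 / N * w(u) * w(v) * sum;
            }
        return c;
    }

    // Quantization: round each coefficient divided by the matching entry of Q.
    static int[][] quantize(double[][] c, int[][] q) {
        int[][] cq = new int[N][N];
        for (int u = 0; u < N; u++)
            for (int v = 0; v < N; v++)
                cq[u][v] = (int) Math.round(c[u][v] / q[u][v]);
        return cq;
    }

    public static void main(String[] args) {
        double[][] block = new double[N][N];
        int[][] q = new int[N][N];
        for (int x = 0; x < N; x++)
            for (int y = 0; y < N; y++) {
                block[x][y] = (x * N + y) % 256;  // synthetic test block
                q[x][y] = 16;                     // assumed flat quantization value
            }
        int[][] cq = quantize(fdct(block), q);
        System.out.println("QDCT DC coefficient: " + cq[0][0]);
    }
}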


FIGURE 6.1 The distribution of DC and AC coefficients and Zig zag sequence in a block of size 8 × 8.

FIGURE 6.2 JPEG quantization matrix with QF = 50.

6.2.2 Yang and Chen's method

Based on the fact that a pair of stereo images looks significantly similar, Yang and Chen [27] proposed reversible data hiding in the frequency domain for stereo images. They first split each image into blocks of size 8 × 8 and transform them into the DCT domain. Each block is separated into three parts with different DCT frequency ranges. To increase the embedding capacity, each 3-bit secret message b2 b1 b0 is coded into two integers i0 and i1 in the interval [−1, 1] through the following steps:


Step 1: Compute
\[
D = 4 \times b_2 + 2 \times b_1 + b_0. \tag{6.5}
\]

Step 2: Code D as a base-3 integer with two digits d0 and d1:
\[
d_0 = D \bmod 3, \tag{6.6}
\]
\[
d_1 = (D - d_0)/3. \tag{6.7}
\]

Step 3: Obtain i0 and i1 from d0 and d1:
\[
i_k = \begin{cases} d_k & \text{if } d_k \le 1, \\ -1 & \text{if } d_k = 2, \end{cases} \qquad k = 0, 1. \tag{6.8}
\]

To embed the secret integers, the difference values Dif_B(u,v) of the QDCT coefficients C(u,v) in the embedding area between the left block (B^L) and the most similar right block (B^R) are computed by the following equation:
\[
\mathit{Dif}_B(u,v) = C_{B^L}(u,v) - C_{B^R}(u,v), \qquad 5 \le (u+v) \le 7. \tag{6.9}
\]

The secret digits i ∈ [−1, 1] are embedded into Dif_B(u,v) by using the following equation:
\[
\mathit{Dif}_B'(u,v) = \begin{cases} \mathit{Dif}_B(u,v) + 1 & \text{if } \mathit{Dif}_B(u,v) > 0, \\ i & \text{if } \mathit{Dif}_B(u,v) = 0, \\ \mathit{Dif}_B(u,v) - 1 & \text{if } \mathit{Dif}_B(u,v) < 0. \end{cases} \tag{6.10}
\]

Subsequently, the new QDCT coefficients are adjusted based on the Dif_B'(u,v) values. After that, dequantization and the inverse DCT are applied to obtain the stego-stereo image. In general, the method proposed by Yang and Chen distributes the reversible DCT-based data hiding over the stereo pair and hence provides good visual quality of the stego images. However, only difference values equal to zero can be used to embed data, whereas all difference values Dif_B(u,v) are changed, which leads to a somewhat restricted embedding capacity and a noticeably affected visual quality of the stego images.
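A minimal Java sketch of Yang and Chen's coding and embedding rules (Eqs. (6.5)–(6.8) and (6.10)) is given below; the difference values in the example are made up and only illustrate that data are carried solely by zero differences.

// Sketch of Yang and Chen's 3-bit coding (Eqs. (6.5)-(6.8)) and the
// difference-shifting rule of Eq. (6.10). The coefficient differences below are
// illustrative assumptions, not taken from a real stereo pair.
public class YangChenCoding {

    // Code a 3-bit message b2 b1 b0 into two digits i0, i1 in {-1, 0, 1}.
    static int[] codeBits(int b2, int b1, int b0) {
        int d = 4 * b2 + 2 * b1 + b0;        // Eq. (6.5)
        int d0 = d % 3;                      // Eq. (6.6)
        int d1 = (d - d0) / 3;               // Eq. (6.7)
        int i0 = (d0 <= 1) ? d0 : -1;        // Eq. (6.8)
        int i1 = (d1 <= 1) ? d1 : -1;
        return new int[] { i0, i1 };
    }

    // Embed one secret digit i into a QDCT difference value (Eq. (6.10)).
    static int embed(int dif, int i) {
        if (dif > 0) return dif + 1;
        if (dif < 0) return dif - 1;
        return i;                            // only zero differences carry data
    }

    public static void main(String[] args) {
        int[] digits = codeBits(1, 0, 1);    // message "101" -> D = 5 -> (d1, d0) = (1, 2)
        int[] difs = { 0, 3, 0, -2 };        // assumed difference values of one block
        for (int k = 0, used = 0; k < difs.length; k++) {
            int i = (used < digits.length && difs[k] == 0) ? digits[used++] : 0;
            System.out.println(difs[k] + " -> " + embed(difs[k], i));
        }
    }
}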

6.3 The proposed method

In Yang and Chen's method, difference values over the embedding area between two similar blocks are computed, and only difference values equal to zero can be used to embed data, which results in a low embedding capacity. To achieve a balanced relationship between the embedding capacity and the visual quality, our method utilizes an embedding direction histogram (EDH) encoded by analyzing the secret data. The EDH plays an important role in shifting the 2-D histogram and embedding the information. Fig. 6.3 exhibits the main processes of the embedding procedure. First, the EDH is generated according to the secret data. Next, based on a preselected threshold T, the left and right blocks in the DCT domain are compared for similarity.


FIGURE 6.3 Flowchart of embedding process.

Then a 2-D histogram is constructed for embedding the secret data based on the EDH. The rest of this section details these steps.

6.3.1 Generation of the embedding direction histogram (EDH)

The embedded secret data is first split into 3-bit groups and represented in decimal form to make a sequence S = {s1, s2, ..., s|n|/3}, so each si is a digit in the range from 0 to 7. Next, a frequency histogram is built by counting the occurrences of each digit. Then the histogram is arranged in descending order of frequency. An example of a frequency histogram and the sorted frequency histogram is displayed in Figs. 6.4A and 6.4B, respectively. From the sorted frequency histogram in Fig. 6.4B, to reach the highest embedding capacity with the lowest distortion, the EDH is constructed following the rules below. Fig. 6.5 shows the generation of the EDH according to Fig. 6.4.


FIGURE 6.4 An example of a frequency histogram. (A) Frequency histogram before sorting. (B) Frequency histogram after sorting.

FIGURE 6.5 Generation drawing of EDH according to Fig. 6.4.

Rule 1: The peak digit is placed at the origin (e.g., s1 at the coordinate (0, 0)). Rule 2: The four next highest frequency digits are distributed on the axes in the order left, up, right, and down (e.g., s4, s2, s6, and s3 at the coordinates (−1, 0), (0, 1), (1, 0), and (0, −1), respectively). This set of digits is called the 4-neighbors of the peak digit. Rule 3: The remaining digits are scattered at the diagonal positions around the origin (e.g., s5, s0, and s7 at the coordinates (−1, 1), (1, 1), and (1, −1), respectively). This set of digits is called the diagonal neighbors of the peak digit. The main idea of this distribution of the si in the EDH is to minimize the modification needed for data embedding. For instance, whenever the secret digit is s1, no coefficient needs to be modified at all.
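As an illustration of this construction, the following Java sketch builds the frequency histogram of the 3-bit secret digits, sorts it, and assigns the 2-D coordinates in the order suggested by Rules 1–3. The secret bit stream and the exact assignment order are illustrative assumptions based on the example of Figs. 6.4 and 6.5.

// Minimal sketch of EDH generation: split the secret bit string into 3-bit
// groups, count digit frequencies, sort them in descending order, and assign
// 2-D histogram coordinates in the order given by Rules 1-3. The coordinate
// order below follows the example in the text; the secret bits are made up.
import java.util.*;

public class EdhDemo {
    public static void main(String[] args) {
        String secret = "010001110100010101010100110"; // assumed secret bit stream
        int[] freq = new int[8];
        for (int k = 0; k + 3 <= secret.length(); k += 3)
            freq[Integer.parseInt(secret.substring(k, k + 3), 2)]++;

        // Sort digits 0..7 by decreasing frequency.
        Integer[] digits = { 0, 1, 2, 3, 4, 5, 6, 7 };
        Arrays.sort(digits, (a, b) -> freq[b] - freq[a]);

        // Rule 1: origin; Rule 2: left, up, right, down; Rule 3: diagonals.
        int[][] coords = { {0, 0}, {-1, 0}, {0, 1}, {1, 0}, {0, -1},
                           {-1, 1}, {1, 1}, {1, -1} };
        Map<Integer, int[]> edh = new HashMap<>();
        for (int r = 0; r < digits.length; r++)
            edh.put(digits[r], coords[r]);

        for (int d = 0; d < 8; d++)
            System.out.println("digit " + d + " (freq " + freq[d] + ") -> ("
                    + edh.get(d)[0] + ", " + edh.get(d)[1] + ")");
    }
}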


6.3.2 Stereo image embedding algorithm

6.3.2.1 Similar block searching

The left and right images of a stereo pair are considered similar. Each image is decomposed into M × N nonoverlapping blocks of size 8 × 8, and the QDCT coefficients of each block are generated by applying the DCT and then quantization. Each block Bm,n [m ∈ (0, M − 1), n ∈ (0, N − 1)] can be divided into three sectors: (a) the searching sector, which contains some lower-frequency QDCT coefficients, (b) the embedding sector, which contains some middle-frequency QDCT coefficients, and (c) the nonused sector, which contains the remaining coefficients. Fig. 6.6 demonstrates a sample block. Let B^L and B^R be corresponding blocks of the left and right views. The block B^R most similar to B^L is found according to the following steps:

FIGURE 6.6 Three sectors of the 8 × 8 block in the QDCT domain: (a) Circle shape: searching sector, (b) square shape: embedding sector, (c) blank: nonused sector.

Step 1: For all neighboring right blocks B^R_{m+i,n+j} (−k ≤ i, j ≤ k), calculate the difference value from the left block B^L_{m,n} over the searching sector by the formula
\[
\mathit{Dif}_{m+i,n+j} = \sum_{(u,v)\,\in\,\text{searching sector}} \left( B^L_{m,n}(u,v) - B^R_{m+i,n+j}(u,v) \right)^2 .
\]

The threshold Th = α × ξ separates the candidate blocks (σi > α × ξ) from the noncandidate ones (σi ≤ α × ξ). Fig. 7.5 illustrates the resulting clustered standard deviations and the witness image, where white blocks stand for candidate blocks belonging to rich zones with high σ, and red blocks stand for noncandidate spots. In the following, we introduce a mathematical and theoretical description of the procedure applied by the sender and the receiver. Both of them exploit the Otsu segmentation method to identify the candidate blocks where the secret message should be embedded/extracted. After the preparation step, where the image is divided into n × n blocks and the standard deviations are calculated and stored in a vector {σi, i = 1, 2, ..., K}, the probability set {pi, i = 1, 2, ..., K} is constructed as
\[
\Pr[\mathrm{std}(\mathrm{block}_i)] = \Pr(\sigma_i) = p_i. \tag{7.2}
\]

This probability denotes the probability of appearance of each σi within the image blocks. In essence, the Otsu separation looks for an optimal threshold based on the discriminant criterion to maximize the separability of the resultant classes, denoted by C0 and C1. In our case the class C0 is composed of noncandidate blocks with relatively low σ compared to the optimal threshold ξ, whereas the class C1 contains blocks with relatively high σ, which are considered the preliminary candidate blocks for embedding. Denote by L the level at which the two classes within {σi, i = 1, 2, ..., K} are optimally separable. In other words, the optimal threshold in {σi, i = 1, 2, ..., K} is equal to the standard deviation at level L, ξ = σL. The probabilities of class occurrence for the two classes C0 and C1 are given by
\[
P_0 = \Pr(C_0) = \sum_{i=1}^{L} p_i, \qquad P_1 = \Pr(C_1) = \sum_{i=L+1}^{K} p_i. \tag{7.3}
\]

FIGURE 7.5 Clustered candidate blocks of a given cover image using Otsu method.


Furthermore, the mean values for classes C0 and C1, denoted by μ0 and μ1, are respectively given by
\[
\mu_0 = \frac{\sum_{i=1}^{L} i \times p_i}{P_0}, \qquad \mu_1 = \frac{\sum_{i=L+1}^{K} i \times p_i}{P_1}. \tag{7.4}
\]
Let us denote the between-class variance by σ_B^2, which computes the distance between the two classes weighted by their corresponding probabilities of class occurrence for variable thresholds. σ_B^2 can be calculated as
\[
\sigma_B^2 = P_0 \times \left( \mu_0 - \sum_{i=1}^{K} i \times p_i \right)^2 + P_1 \times \left( \mu_1 - \sum_{i=1}^{K} i \times p_i \right)^2 . \tag{7.5}
\]
The quantity \(\sum_{i=1}^{K} i \times p_i\) is the total probabilistic mean (denoted by μT) of the set {σi, i = 1, 2, ..., K} and is given by
\[
\mu_T = \sum_{i=1}^{K} i \times p_i , \tag{7.6}
\]
and we can easily verify that
\[
\mu_T = P_0 \times \mu_0 + P_1 \times \mu_1 . \tag{7.7}
\]
Thus the "between class" variance σ_B^2 can be expressed as
\[
\sigma_B^2 = P_0 \times P_1 \times (\mu_1 - \mu_0)^2 . \tag{7.8}
\]
Hence Otsu searches for the optimal value L corresponding to the optimal threshold ξ = σL by maximizing the between-class variance as follows:
\[
\underset{L,\,\xi}{\operatorname{argmax}}\ \sigma_B^2 = \underset{L,\,\xi}{\operatorname{argmax}}\ P_0 \times P_1 \times (\mu_1 - \mu_0)^2 . \tag{7.9}
\]
By varying ξ (similarly, L), the expression in Eq. (7.9) finds the maximal σ_B^2, and then the optimal threshold ξ* corresponding to the optimal ordering L* can be found. Eventually, the distinction between the two classes is established, and a witness image denoted by ω is constructed by assigning a white color to the embedding candidate blocks belonging to C1 and a red color to the noncandidate blocks belonging to C0. The witness image ω is formed as
\[
\omega = \bigcup_k \omega_k, \quad \text{where } \omega_k = \begin{cases} \text{red } (n \times n) \text{ pixels} & \text{if Block}_k \subset C_0, \\ \text{white } (n \times n) \text{ pixels} & \text{if Block}_k \subset C_1. \end{cases} \tag{7.10}
\]
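The following Java sketch illustrates Otsu's search for the optimal level L over a small set of block standard deviations. For simplicity it assumes that every level is equally probable; in OSteg the probabilities pi would come from the histogram of the measured σ values, and the sigma values below are made up.

// Minimal sketch of Otsu's threshold selection (Eqs. (7.3)-(7.9)) applied to a
// set of block standard deviations, assuming uniform level probabilities.
import java.util.Arrays;

public class OtsuDemo {

    // Returns the level L that maximizes the between-class variance (Eq. (7.9)).
    static int otsuLevel(double[] sortedSigmas) {
        int k = sortedSigmas.length;
        double p = 1.0 / k;                  // each level assumed equally probable
        double muT = 0.0;                    // total probabilistic mean (Eq. (7.6))
        for (int i = 0; i < k; i++) muT += (i + 1) * p;

        int bestL = 1;
        double bestVar = -1.0;
        double p0 = 0.0, sum0 = 0.0;
        for (int L = 1; L < k; L++) {
            p0 += p;                         // P0 = sum of p_i for i <= L (Eq. (7.3))
            sum0 += L * p;
            double p1 = 1.0 - p0;
            double mu0 = sum0 / p0;          // class means (Eq. (7.4))
            double mu1 = (muT - sum0) / p1;
            double varB = p0 * p1 * (mu1 - mu0) * (mu1 - mu0);   // Eq. (7.8)
            if (varB > bestVar) { bestVar = varB; bestL = L; }
        }
        return bestL;
    }

    public static void main(String[] args) {
        double[] sigmas = { 0.4, 0.5, 0.7, 1.1, 5.2, 6.3, 7.0, 8.4 };  // assumed block stds
        Arrays.sort(sigmas);
        int L = otsuLevel(sigmas);
        System.out.println("Otsu threshold xi = sigma_L = " + sigmas[L - 1]);
    }
}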


7.3.3 Pretreatment: fake embedding

The main purpose of this step is to use the Otsu clustering method recursively to correctly and blindly identify the class of the candidate blocks {σi, i = 1, 2, ..., K} even after embedding (whose artifacts are negligible). Such a condition cannot be verified unless a fake embedding is performed on the candidate blocks. In each round of this recursive search, two witness images ωc and ωs are generated by applying the Otsu method to the image before fake embedding (the original cover) and to the same image after embedding (the stego), until the same potentially embedded blocks are spotted in both witnesses, meaning that ωc = ωs. A ±1 embedding change of one pixel in a given block has an impact on the standard deviation σ, as shown in Fig. 7.6; it also directly affects the membership of that block in class C0 or C1. Initially, the Otsu clustering method is applied to the set of σ measured over n × n blocks to determine the threshold value ξ. Then the first iteration starts with αk=0 = 1. From an experimental point of view, during the first round, the Otsu threshold computed for the cover image ξcover is slightly different from that computed for the stego image ξstego. This phenomenon causes an extraction problem (Fig. 7.7), resulting in missed candidate blocks (MD blocks) or false positive blocks (FP blocks). That is why the main goal is to select the candidate blocks identically in both scenarios, embedding and extracting, and to completely eliminate the possibility of MD blocks or of detecting noncandidate blocks as candidates (FP blocks). It is therefore crucial to choose blocks having σ greater than a combined threshold, denoted by Th = α × ξ, where ξ is the computed Otsu threshold, and α (Eq. (7.11)) is a configurable factor whose main purpose is to allow perfect extraction of the secret message from the stego object within a recursive pretreatment:
\[
\alpha_{k+1} = 1 + (k \times \epsilon), \qquad k \in [0, N]. \tag{7.11}
\]

In each iteration, ωc and ωs are compared to check whether the candidate blocks are identically selected in both witness images. If ωs and ωc show the same candidate blocks, then the last computed αk value is taken as the optimal factor for distinguishing between

FIGURE 7.6 ±1 embedding impact on the standard deviation measured over a block of 3 × 3 pixels.


candidate and noncandidate blocks. If this is not the case, then a new threshold is formed with an incremented value of α, denoted as αk+1 = αk + ε. Thus another

FIGURE 7.7 Otsu spotting of candidate blocks without fake embedding and without searching for the optimal parameter α: candidate blocks of cover and stego images and the difference between the witness images ωs and ωc over the red layer.


round of fake embedding is performed to model the new witness image containing all blocks that satisfy σi > Th = αk+1 × ξ. Without loss of generality, we choose ε = 0.1 in our experiments.

7.3.4 Scrambling selection: Ikeda system

In this section, we introduce and analyze the Ikeda system with different properties and different parameters to prove the effectiveness of its application within the information security domain. In essence, our algorithm is a key-based steganographic method, and therefore we choose the exchanged secret key to hold only the initial condition x0 and the factor α. The Ikeda system is used initially to generate a vector assigned to the neighboring pixels in an image, where 8 bits correspond to the 8 directions used in checking the parity with respect to the focal pixel of a given n × n block. Then the rest of the generated sequence is exploited to pad bits in order to increase the complexity of our method against attacks. The Ikeda system was initially proposed in [24] and can be described by the equation
\[
\frac{dx(t)}{dt} = -\mu x(t) + m \sin\!\big(x(t-T)\big), \tag{7.12}
\]
where x(t), μ, and m are the control parameters. To solve Eq. (7.12), it is necessary to choose the initial condition of x, denoted as x0(t), within an interval [−T, 0]. Then the solution can be approximated by N samples in each period T, as described by Eq. (7.13) in [25]. Thus, while discretizing, T is taken as T = Nh, where N is the number of samples, and h is the sampling step. In such a case the initial condition is represented by N values within one column vector x0(t) = {x0^0, x0^1, x0^2, ..., x0^(N−1)}, and
\[
x_n(i+1) = x_n(i) + F\big(x_n(i), x_0(i)\big) \times h = x_n(i) + \big(-\mu\, x_n(i) + m \sin(x_0(i))\big) \times h, \tag{7.13}
\]
where μ, h, and m are control parameters, and x0 is the initial condition vector of size (N, 1). For N = 3, Eq. (7.13) can be described in the following matrix format:
\[
\begin{pmatrix}
x_0^0 & x_0^1 & x_0^2 & x_0^3 \\
x_0^1 & x_0^2 & x_0^3 & f(x_0^1, x_0^3, m, h, \mu) \\
x_0^2 & x_0^3 & f(x_0^1, x_0^3, m, h, \mu) & f\big(x_0^2, f(x_0^1, x_0^3, m, h, \mu), m, h, \mu\big)
\end{pmatrix}. \tag{7.14}
\]
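A minimal Java sketch of iterating the discretized system of Eq. (7.13) is given below. The initial condition vector and the rule used to turn the real-valued orbit into keystream bits are illustrative assumptions, since the chapter derives them from the shared secret key.

// Minimal sketch of iterating the discretized Ikeda delay system of Eq. (7.13):
// x(i+1) = x(i) + (-mu * x(i) + m * sin(x_delayed)) * h,
// where the delayed sample comes from a buffer of the last N values (T = N*h).
public class IkedaDemo {
    public static void main(String[] args) {
        double mu = 1.0, h = 0.5, m = 20.0;      // control parameters used in the chapter
        int N = 3;                               // samples per delay period
        double[] delayed = { 0.1, 0.2, 0.3 };    // assumed initial condition x0(t) on [-T, 0]
        double x = delayed[N - 1];

        StringBuilder bits = new StringBuilder();
        for (int i = 0; i < 32; i++) {
            double xNew = x + (-mu * x + m * Math.sin(delayed[i % N])) * h;
            delayed[i % N] = xNew;               // the new sample becomes a future delayed value
            x = xNew;
            // One illustrative way to turn the real-valued orbit into keystream bits.
            bits.append((Math.abs(xNew) * 1e6) % 2 < 1 ? '0' : '1');
        }
        System.out.println("keystream bits: " + bits);
    }
}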

In the experiments, we have chosen the following settings: μ = 1, h = 0.5, m = 20, and N = 3. It is worth mentioning that with these parameter values the Ikeda system exhibits high randomness. The effectiveness of this system depends on its ability to generate a random sequence with a long period and good statistical properties to ensure security. Fig. 7.8 shows that the complexity increases for certain values of the control parameters μ and m. More precisely, for sufficiently large values of μ and m, a


hyperchaotic behavior (high chaotic complexity) can be observed (as shown in Fig. 7.8). Such a regime helps OSteg maintain a higher complexity and ensures the randomness of the generated sequences. Table 7.1 outlines the sensibility ranges and illustrates the

FIGURE 7.8 (A) One parameter bifurcation diagram in (μ, xn ) plane, (B) One parameter bifurcation diagram in (m, xn ) plane.

performance of the Ikeda system toward changes in its control parameters. In other words, if the value of p1 is changed by ±10^−15, then the output sequence will be totally different from the sequence generated with the original value of p1.

Table 7.1 Sensibility measurement of the control parameters of the Ikeda system.

Parameters   Interval Ip_i   Sensibility S_i   Subkey space Ip_i × S_i^−1
p1 = μ       [1, 6]          10^−15            6 × 10^15
p2 = m       [17, 20.4]      10^−14            4.4 × 10^14
p3 = h       [0.1, 0.5]      10^−3             0.5 × 10^3
p4 = N       [2, 100]        10^−13            99 × 10^13

Key space = \(\prod_{i=1}^{4} (Ip_i \times S_i^{-1})\) = 1.3068 × 10^48

7.3.5 Secret shared key and key space

To extract the secret message efficiently, the two communicating entities have to exchange a secret key containing two elements: the initial condition of the used chaotic function, denoted by x0, and the optimal threshold factor establishing the separation of the two classes:
\[
\begin{cases}
\text{Given a secret key } (x_0, \alpha_{RGB}) = (x_0, \alpha_R, \alpha_G, \alpha_B) = \big(x_0,\; 1 + (5 \cdot 0.1),\; 1 + (8 \cdot 0.1),\; 1 + (6 \cdot 0.1)\big), \\
\text{the effectively sent secret key is } (x_0, \mathit{factors}) = (x_0, 5, 8, 6).
\end{cases} \tag{7.15}
\]


Algorithm 1 gives a detailed description of the pretreatment phase, which consists of a recursive search for the optimal value of the threshold factor by using fake embedding and comparing the witness images of the cover and the stego.

Algorithm 1: PreTreatment(Xc, α).
Data: Cover RGB image Xc, parameters α and ε
Result: αopt
1  initialization;
2  α = 1 (only for the first round);
3  for each color layer in Xc do
4      σc = std(Xc);
5      ξ = Otsu(σc);
6      ωc = Witness(ξ, α);
7      Xs = FakeEmbedding(Xc, ωc);
8      σs = std(Xs);
9      ξs = Otsu(σs);
10     ωs = Witness(ξs, α);
11     if ωs = ωc then
12         αopt = α;
13     else
14         α = α + ε;
15         αopt = PreTreatment(Xc, α);
16     end
17  end

As mentioned previously, the use of a secret key (x0, α) is crucial for security and for the correct extraction of the secret message. In fact, the use of a nonlinear system (the Ikeda system), parameterized through an initial condition x0 and a set of control parameters μ, m, h, and N, serves mainly the random generation of pixels for the embedding phase and also serves the padding process. In Eq. (7.16) the total number of brute force attack attempts is estimated as
\[
KS = \sum_{k=0}^{8} C_8^k \times 2^{128-(k+1)\times 8}, \tag{7.16}
\]
where \(C_8^k = \frac{8!}{k!\,(8-k)!}\).
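For reference, the estimate of Eq. (7.16) can be evaluated exactly with a few lines of Java; the code below is only a sanity-check computation of the keyspace size.

// Exact evaluation of Eq. (7.16): KS = sum_{k=0}^{8} C(8,k) * 2^(128-(k+1)*8).
import java.math.BigInteger;

public class KeySpaceDemo {
    static BigInteger binomial(int n, int k) {
        BigInteger r = BigInteger.ONE;
        for (int i = 1; i <= k; i++)   // r equals C(n, i) after each step (exact division)
            r = r.multiply(BigInteger.valueOf(n - i + 1)).divide(BigInteger.valueOf(i));
        return r;
    }

    public static void main(String[] args) {
        BigInteger ks = BigInteger.ZERO;
        for (int k = 0; k <= 8; k++)
            ks = ks.add(binomial(8, k).multiply(BigInteger.valueOf(2).pow(128 - (k + 1) * 8)));
        System.out.println("KS = " + ks);   // dominated by the k = 0 term, 2^120
    }
}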

7.3.6 Effective embedding

After the preparation and the recursive fake-embedding phases, the sender knows the exact candidate blocks where the secret vector ϑ is concealed. For each candidate block, a pseudorandom keystream is generated by iterating the chaotic Ikeda system [24].


The length of the subkeystream per block is equal to n × n − 1, indicating the pixels that should be considered in computing the parity check of a given block. Based on this information, it is decided whether this candidate block holds a secret bit of value "1" or "0". These neighboring pixels can be described by the following offsets and directions:
\[
\mathit{Offset} = [\{e, w\}, \{n, s\}, \{se, nw\}, \{sw, ne\}], \tag{7.17}
\]

where the letters (e, w, n, s) stand for east, west, north, and south, respectively. An example of a keystream denoted by ψ is shown in Fig. 7.9, where ψi, i ∈ [1, 2], corresponds to two candidate blocks:
\[
\psi_1 = [\{e, w\}, \{n, s\}, \{se, nw\}, \{sw, ne\}] = [\{1, 1\}, \{1, 1\}, \{1, 1\}, \{1, 1\}], \tag{7.18}
\]
\[
\psi_2 = [\{e, w\}, \{n, s\}, \{se, nw\}, \{sw, ne\}] = [\{0, 1\}, \{0, 0\}, \{0, 1\}, \{0, 1\}]. \tag{7.19}
\]

This means that for the parity check test, the west, north–west, and north–east pixels should be included.

FIGURE 7.9 Detailed Embedding method: neighbors choice of the pixel of interest xc for a candidate block.

The binary sequence, which will be considered in each candidate block, contains the union (concatenation) of the binary representations of the chosen neighboring pixels and the focal pixel (pixel of interest c). This union is given by
\[
B_i = \bigcup_{\text{neighbor}} \mathrm{dec2bin}(x_{\text{neighbor}}) \,\cup\, \mathrm{dec2bin}(x_c). \tag{7.20}
\]

Afterwards, the Ikeda system is iterated to generate additional bits so that the parity is checked over a binary sequence of 128 bits for each block. Originally, there are at least 9 pixels in each block, each representable over 8 bits, which makes the length of the original binary sequence up to 72 bits (see Fig. 7.10). As explained earlier, the final length of the sequence has to be 128 bits, which implies that the number of extra Ikeda bits is in (56, 120):
\[
B_i = B_i \cup \mathrm{Ikeda}\{128 - \mathrm{length}(B_i)\}, \tag{7.21}
\]


FIGURE 7.10 Detailed effective Embedding method: generation of the binary sequence Bi from the central pixel, its chosen neighbors, and Ikeda system.

where Ikeda{·} stands for generating binary bits from the Ikeda system using Eq. (7.13). The formed binary sequence Bi extracted from each candidate block is used to conceal one bit of the secret message ϑ depending on the number of ones it contains, that is, whether this number is even ((Σℓ Bi(ℓ)) mod 2 = 0) or odd ((Σℓ Bi(ℓ)) mod 2 = 1).

Algorithm 2: Effective embedding algorithm: parity check.
Data: Bi, xc, ϑi
Result: xc
1  if ((Σℓ Bi(ℓ)) mod 2 = 0 & ϑi = 0) OR ((Σℓ Bi(ℓ)) mod 2 = 1 & ϑi = 1) then
2      xc = xc;
3  else
4      flip_lsb(xc);
5  end

The embedding also depends on the value of the currently processed bit of ϑ. With these two conditions, the value of the central pixel xc of a given block is either kept the same or its least significant bit is flipped (flip_lsb(xc)). The details of the effective embedding are shown in Algorithm 2. The extraction process is carried out by checking whether the parity of the sequence Bi is odd or even, following Algorithm 3.


Algorithm 3: Extraction algorithm: parity check.
Data: Bi
Result: ϑi
1  if (Σℓ Bi(ℓ)) mod 2 = 0 then
2      ϑi = 0;
3  else
4      ϑi = 1;
5  end
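The following Java sketch combines Algorithms 2 and 3 for a single candidate block. The neighbor and padding bits are a shortened stand-in for the 128-bit sequence Bi, and only the focal pixel bits are modeled explicitly, because flipping its LSB is what changes the parity read by the receiver.

// Sketch of the parity-check embedding (Algorithm 2) and extraction (Algorithm 3)
// for one candidate block. Values of the focal pixel and the secret bit are assumed.
public class ParityCheckDemo {

    static String buildBi(String neighborAndPaddingBits, int xc) {
        return neighborAndPaddingBits
                + String.format("%8s", Integer.toBinaryString(xc)).replace(' ', '0');
    }

    static int parity(String bi) {                       // number of ones modulo 2
        int ones = 0;
        for (char c : bi.toCharArray()) if (c == '1') ones++;
        return ones % 2;
    }

    // Algorithm 2: keep xc if parity(Bi) already equals the secret bit, else flip its LSB.
    static int embed(String otherBits, int xc, int secretBit) {
        return parity(buildBi(otherBits, xc)) == secretBit ? xc : (xc ^ 1);
    }

    // Algorithm 3: the receiver rebuilds Bi from the stego block and reads its parity.
    static int extract(String otherBits, int stegoXc) {
        return parity(buildBi(otherBits, stegoXc));
    }

    public static void main(String[] args) {
        String otherBits = "101101110010";               // assumed neighbor + Ikeda padding bits
        int xc = 143, secretBit = 1;                     // assumed focal pixel and secret bit
        int stegoXc = embed(otherBits, xc, secretBit);
        System.out.println("focal pixel " + xc + " -> " + stegoXc
                + ", extracted bit = " + extract(otherBits, stegoXc));
    }
}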

7.4 Experimental results and discussion

In this section, we evaluate the performance of OSteg and compare it with existing schemes. In practice, embedding schemes have to be evaluated by empirical measures rather than information-theoretic approaches. Recent steganalysis methods are based on binary classifiers that separate cover objects from stego objects. More precisely, we describe the security evaluation metrics (the probability of error PE and the receiver operating characteristic (ROC) curve), the feature set, the machine learning tool, and the image database used. Experiments are made over a database of color images with different textures in bitmap (BMP) format taken from [14]. The images are resized to 512 × 512. OSteg is compared with four well-known state-of-the-art schemes in the spatial domain: WOW [15], MVG [16], S-UNIWARD [17], and MVGG [18]. Furthermore, a spatial domain feature extractor designed for the rich model [13], available at [26], is exploited. Fig. 7.11 illustrates the difference between the histogram of a given cover image and its stego version over the red layer. It is clearly visible that our embedding method keeps the same patterns of the histogram for lower values and slightly changes the histogram at higher values depending on the payload and the selected candidate blocks. In Fig. 7.12 the selected focal pixels holding the secret message over the three layers of the color image are spotted. For each layer (RGB), the effective embedding spots are presented, and the histograms of both cover and stego are subplotted. Note that the artifacts of the embedding method OSteg are adjusted to each layer so as to maintain the same patterns between the two histograms. To evaluate the performance, security, and classification accuracy of OSteg, we use the rich model of features SCRM with a total dimensionality of 18157 [27], designed for color images and computed from all three color channels in the spatial domain. This feature model is composed of submodels measuring dependencies among neighboring noise residuals of pixels across color channels. These dependencies are expressed in terms of symmetric joint probability distributions. SCRM is built within the training phase, where a satisfying trade-off is established between the model dimensionality and the detection accuracy, allowing it to flexibly detect various embedding artifacts in the spatial domain.


FIGURE 7.11 Histogram of the measured standard deviation σ over 3 × 3 blocks of pixels and the difference between the histograms of the cover and stego images.

In machine learning the use of ensembles [28] is known to be an effective method to obtain a highly accurate classifier by combining less accurate ones. We use the ensemble classifiers (ECs) proposed in [12] with base learners implemented as Fisher linear discriminants (FLDs), independently trained several times, each time with a different subset containing m samples randomly drawn from the original training set. The final decision is made by aggregating the decisions of the individual base learners. This method of manipulating the training examples is known in the machine learning community as bootstrap aggregation, from which the term bagging stems. By default the ensemble minimizes the total classification error probability
\[
P_E = \frac{1}{2}\,(P_{FA} + P_{MD}), \tag{7.22}
\]

where PFA and PMD denote the false-alarm and missed-detection probabilities, respectively. The security is evaluated by averaging PE measured over two separate sets (194 images: 130 for training and 64 for testing). Table 7.2 shows the PE obtained over 10 disjoint training and testing subsets (splits) for five different approaches, including OSteg. Note that PE is stable for all approaches and is approximately in the range [0.4, 0.5], which reflects the inability of the EC to distinguish the various changed patterns in highly rich-textured zones of the image. Note also that OSteg has the highest value of the averaged PE, which


FIGURE 7.12 Selected embedding position for the red, green, and blue components with their corresponding histograms (cover, stego).

Table 7.2 Measured PE over 10 splits of the five described spatial domain schemes using SCRM features. The mean probability of error PE and its standard deviation over those 10 splits are also given.

Features: SCRM   PE over the 10 splits                                                                    Mean PE
WOW              0.4844 0.4635 0.4635 0.4740 0.4740 0.4740 0.4844 0.4740 0.4792 0.4792   0.4750 (+/- 0.0073)
MVG              0.4896 0.4688 0.4948 0.4844 0.4740 0.4635 0.4688 0.4688 0.4635 0.4844   0.4760 (+/- 0.0113)
S-UNIWARD        0.5000 0.5052 0.5000 0.4844 0.5104 0.4896 0.4948 0.4948 0.5000 0.4688   0.4948 (+/- 0.0118)
MVGG             0.4896 0.4896 0.5000 0.4740 0.4740 0.4844 0.4740 0.5000 0.5208 0.4792   0.4885 (+/- 0.0151)
OSteg            0.4948 0.4740 0.5000 0.4844 0.5365 0.5052 0.4948 0.4948 0.4844 0.4844   0.4953 (+/- 0.0171)

indicates a higher difficulty in classifying images produced by OSteg. Generally, the ROC curve is used to evaluate the performance of a given classifier; it plots the true positive rate vs. the false positive rate for various classification thresholds in the interval [0, 1]. From an attacker's point of view, the closer the curve comes to the upper left corner of the plot, the better, because in such a scenario the classifier successfully separates the two classes (cover, stego). The ROC of OSteg is illustrated in comparison with four of the most well-known spatial steganographic approaches in the literature, that is, WOW, MVG, S-UNIWARD, and MVGG, in Fig. 7.13. The figure shows that the proposed method achieves the smallest area under the curve and the highest error probability PE. It is noticeable from the figure that despite the recognition of the ensemble classifiers as an effective tool for ensuring a better detection accuracy, they perform poorly in separating the two distributions of the cover and stego objects for all given steganographic approaches. This phenomenon implies that the classification process does no better than random guessing since all curves are close to the diagonal line.


FIGURE 7.13 Receiver Operating Characteristic (ROC) curves of OSteg, WOW, MVG, MVGG, and S-UNIWARD.

As an evaluation metric, the greater the area under the curve (AUC), the more detectable the steganographic method. It has been found that the ensemble classifier places the OSteg method within the same performance range as the other approaches. Comparing their AUCs, note that OSteg outperforms all of the approaches and has the smallest area under the curve (AUC = 0.5073), which ensures its undetectability.

7.5 Conclusion

In this chapter, we introduced an alternative, lightweight, efficient, and simple spatial steganography method, called OSteg, for information hiding. OSteg is based on applying a clustering method to effectively select the blocks of pixels that have a high probability of undetectability. The embedding method is designed to adaptively hide a certain number of bits per image (each layer apart: the Red, Green, and Blue layers). Such adaptivity exposes the weak performance of the ensemble classifiers in detecting the hidden data and thus increases its applicability as a steganographic method. The embedding scheme contains two major steps: the main role of the first step is preparing and selecting the spots where the cover is supposed to hold the secret bits, and the second step alters the pixel value depending on the bit to be embedded. Since the embedding of the secret bits is performed for each layer individually, OSteg is adjustable even for gray images. The alterations are made in a way that maintains the histograms of the three components (Red, Green, and Blue) of color images. A chaotic map is exploited in hiding the data bits to increase OSteg's security against steganalysis attacks. OSteg outperforms the existing well-known methods such as WOW, MVG, S-UNIWARD, and MVGG in terms of the classification error probability when an attacker tries to classify whether a given set of images is cover or stego. Besides that, a comparison in terms of the AUC of the ROC curves gives equivalent values. Furthermore, OSteg is resilient against attackers who aim to spot the placement of the secret bits. This means that targeted steganalysis against OSteg is equivalent to a brute force attack searching in a keyspace of size \(\sum_{k=0}^{8} C_8^k \times 2^{128-(k+1)\times 8}\).


Acknowledgments

The second author, Rhouma Rhouma, is supported by The Research Council (TRC) in the Sultanate of Oman under a Research Grant (RG) with reference number BFP/RGP/ICT/19/140.

References

[1] Jessica Fridrich, Steganography in Digital Media: Principles, Algorithms, and Applications, Cambridge University Press, 2009.
[2] Abbas Cheddad, Joan Condell, Kevin Curran, Paul Mc Kevitt, Digital image steganography: survey and analysis of current methods, Signal Processing 90 (3) (2010) 727–752.
[3] Kan Wang, Zhe-Ming Lu, Yong-Jian Hu, A high capacity lossless data hiding scheme for JPEG images, Journal of Systems and Software 86 (7) (2013) 1965–1975.
[4] M. Ghebleh, A. Kanso, A robust chaotic algorithm for digital image steganography, Communications in Nonlinear Science and Numerical Simulation 19 (6) (2014) 1898–1907.
[5] Diqun Yan, Rangding Wang, Xianmin Yu, Jie Zhu, Steganography for MP3 audio by exploiting the rule of window switching, Computers & Security 31 (5) (2012) 704–716.
[6] Xian-Ting Zeng, Ling-Di Ping, Xue-Zeng Pan, A lossless robust data hiding scheme, Pattern Recognition 43 (4) (2010) 1656–1667.
[7] Chung-Min Yu, Kuo-Chen Wu, Chung-Ming Wang, A distortion-free data hiding scheme for high dynamic range images, Displays 32 (5) (2011) 225–236.
[8] Shabir A. Parah, Javaid A. Sheikh, Abdul M. Hafiz, Ghulam Mohiuddin Bhat, Data hiding in scrambled images: a new double layer security data hiding technique, Computers & Electrical Engineering 40 (1) (2014) 70–82.
[9] Xiang-Yang Luo, Dao-Shun Wang, Ping Wang, Fen-Lin Liu, A review on blind detection for image steganography, Signal Processing 88 (9) (2008) 2138–2157.
[10] Romana Machado, EzStego, Software downloadable from http://www.stego.com, 2001.
[11] D. Upham, JPEG-JSTEG-Modifications of the independent JPEG groups JPEG software for 1-bit steganography in JFIF output files, 1997.
[12] Jan Kodovsky, Jessica Fridrich, Vojtěch Holub, Ensemble classifiers for steganalysis of digital media, IEEE Transactions on Information Forensics and Security 7 (2) (2012) 432–444.
[13] Miroslav Goljan, Jessica Fridrich, Rémi Cogranne, Rich model for steganalysis of color images, in: Information Forensics and Security (WIFS), 2014 IEEE International Workshop on, pp. 185–190.
[14] Adriana Olmos, Frederick A.A. Kingdom, A biologically inspired algorithm for the recovery of shading and reflectance images, Perception 33 (12) (2004) 1463–1473.
[15] Vojtech Holub, Jessica Fridrich, Designing steganographic distortion using directional filters, in: Information Forensics and Security (WIFS), 2012 IEEE International Workshop on, pp. 234–239.
[16] Jessica J. Fridrich, Jan Kodovsky, Multivariate Gaussian model for designing additive distortion for steganography, in: ICASSP 2013, pp. 2949–2953.
[17] Vojtěch Holub, Jessica Fridrich, Tomáš Denemark, Universal distortion function for steganography in an arbitrary domain, EURASIP Journal on Information Security (2014) 1.
[18] Vahid Sedighi, Jessica J. Fridrich, Remi Cogranne, Content-adaptive pentary steganography using the multivariate generalized Gaussian cover model, in: Media Watermarking, Security, and Forensics, 2015, p. 94090H.
[19] Mohamed Abdel Hameed, Saleh Aly, M. Hassaballah, An efficient data hiding method based on adaptive directional pixel value differencing (ADPVD), Multimedia Tools and Applications (ISSN 1573-7721) 77 (12) (2018) 14705–14723.
[20] Hsing-Han Liu, Yuh-Chi Lin, Chia-Ming Lee, A digital data hiding scheme based on pixel-value differencing and side match method, Multimedia Tools and Applications (ISSN 1573-7721) 78 (9) (2019) 12157–12181.
[21] Gandharba Swain, Very high capacity image steganography technique using quotient value differencing and LSB substitution, Arabian Journal for Science and Engineering (ISSN 2191-4281) 44 (4) (2019) 2995–3004.


[22] Aditya Kumar Sahu, Gandharba Swain, A novel n-rightmost bit replacement image steganography technique, 3D Research (ISSN 2092-6731) 10 (1) (2018) 2.
[23] Nobuyuki Otsu, A threshold selection method from gray-level histograms, IEEE Transactions on Systems, Man, and Cybernetics 9 (1) (1979) 62–66.
[24] K. Ikeda, Multiple-valued stationary state and its instability of the transmitted light by a ring cavity system, Optics Communications 3 (1979) 257–261.
[25] O. Mannai, R. Bechikh, H. Hermassi, R. Rhouma, S. Belghith, A new image encryption scheme based on a simple first-order time-delay system with appropriate nonlinearity, Nonlinear Dynamics 82 (2015) 100–117.
[26] Binghamton University, http://dde.binghamton.edu/download, 1999. (Accessed 19 July 2008).
[27] Jessica Fridrich, Jan Kodovsky, Rich models for steganalysis of digital images, IEEE Transactions on Information Forensics and Security 7 (3) (2012) 868–882.
[28] Thomas G. Dietterich, Ensemble methods in machine learning, Multiple Classifier Systems 1857 (2000) 1–15.

8 A steganography method based on decomposition of the Catalan numbers

Muzafer Saračević a, Samed Jukić b, Adnan Hasanović c
a Department of Computer Sciences, University of Novi Pazar, Novi Pazar, Serbia
b Faculty of Information Tech., International Burch University, Ilidža, Sarajevo, BIH
c Department of Philological Sciences, University of Novi Pazar, Novi Pazar, Serbia

8.1 Introduction

Digital steganography applies popular techniques such as hiding messages in the least significant image bits, hiding a message in an existing cipher, or mapping the statistical properties of one file onto another file to make the first one look like the other. The main goal of modern steganography is the use of powerful computer algorithms and methods that make it almost impossible to isolate or detect a secret message. The ultimate goal of modern steganography is hiding data in such a way that it is not possible to prove their existence in other data. Data hiding is a procedure performed for security reasons, and today it is common practice in all organizations whose business is based on secure data transmission. Business systems that operate through a computer network increase the interdependence between information and communication channels and thus become a target of many malicious users. This certainly leads to the need for data protection in the business environment. To solve this problem, in addition to the application of certain cryptographic algorithms, it is possible to use steganographic methods that allow hiding data behind objects in images, audio, and video, but in such a way that the original data carrier remains authentic. Some encryption tools in business applications offer the ability to create invisible (steganographic) partitions that can be used, or whose presence in the system can be determined with certainty, only after entering a special stego key (or password); before that, such a partition cannot be distinguished from the empty space of the encrypted disk. The proposed solution is based precisely on this principle, and the hidden message can never be generated from the data carrier without the exact stego key. In this chapter, we propose a steganographic method based on decomposition of the Catalan numbers. In this case, the generating technique does not change the data carrier but generates a hidden message based on the stego key itself. The data carrier remains the


original file, which is completely immune to comparison with other files because it does not undergo any modification; this justifies the application of the decomposition of Catalan numbers in steganography. The proposed solution is implemented in the Java programming language (NetBeans environment). The main highlights of the proposed method are: 1) the data hiding method uses specific properties of the Catalan numbers and their decomposition process; 2) the method ensures that the data carrier retains its original shape; and 3) several security tests and steganalysis methods are used in the security analysis. The chapter is organized as follows. In Section 8.2, we review relevant research in the field of steganography and the application of the Catalan numbers and combinatorial problems in steganography. Section 8.3 deals with the basic properties of the Catalan numbers and their decomposition. In Section 8.4, we detail the implementation of the proposed steganographic method, which consists of two segments: the first segment refers to the process of embedding data and generating a complex stego key, and the second segment refers to the process of extracting a hidden message. Steganalysis and security testing of the proposed method, with a focus on its advantages, are given in Section 8.5. Finally, Section 8.6 provides concluding observations and suggestions for further research in the field of public-key steganography.

8.2 Related works

In previous investigations, the authors have encountered the Catalan numbers in many contexts and studied their close relationships with computational geometry, combinatorial problems, steganography, and cryptography. In [1] a few examples of the application of combinatorial mathematics in cryptography and steganography are presented. Saracevic et al. [2] present the possibilities of applying the Catalan numbers and appropriate combinatorial problems (balanced parentheses, stack permutations, and the ballot problem) in data encryption. The application of the Catalan numbers in generating hidden cryptographic Catalan keys from one segment of an image is presented in [3]. This procedure consists of three phases: 1) selection of one segment from an image; 2) conversion into the binary record that represents the Catalan key; and 3) application of the Catalan key in data encryption. Also, in [4] the authors analyze the properties of the Catalan numbers and the Lattice Path problem in cryptography. Catalan numbers play an important role in data hiding and steganography. The purpose of Saracevic et al. [5] is investigating the specific properties of the Catalan numbers and their possible application in steganography. Another technique for data hiding using Catalan numbers and binary (Dyck) words is introduced in [6]. The research presented in [7] analyzes steganography based on intensity value decomposition of pixels using famous number sequences (such as Fibonacci, Lucas, and Catalan–Fibonacci). In that paper the authors present a method using two generators of random numbers: one generates Catalan numbers, and the second generates Lucas sequences. Besides the aforementioned paper, image steganography techniques using known


sequences of numbers (Catalan numbers as well) are listed in [8]. These methods use a combination of Lucas–Catalan–Fibonacci and Catalan–Lucas numbers, which is far superior to the Fibonacci number technique for data hiding. Bhaskari et al. [9] present a combinatorial approach for data hiding using steganography. In [10] the authors present a method for secret-key steganography using random numbers. The advantage of this technique is its security against classic steganalysis attacks because it leaves the original and stego images unchanged. An image steganography method is proposed in [11] based on pixel value differencing (PVD) and the modulus function. The authors provide an effective method to improve the peak signal-to-noise ratio and the hiding capacity. The proposed approach has two variants, and it is important to emphasize that both variants embed the secret data using the difference between a pair of consecutive pixels. The experimental results prove that the fall-off-boundary problem present in most PVD approaches has been avoided. Also, it has been experimentally verified that the proposed method is resistant against RS attacks. Sahu and Swain [12] proposed a new bit replacement image steganography method to hide the secret data in an image. The major objective of the proposed technique is improving the embedding capacity. In addition, the benefits of this approach are avoiding the fall-off-boundary problem and avoiding the RS attack. From the experimental results it is observed that the proposed technique is resistant to some specific attacks. The authors of [13] propose an image steganography method to improve the embedding capacity in stego objects, which is based on the principle of pixel overlapping. The proposed technique has two variants, overlapped PVD with the modulus function and overlapped PVD, and it is compared with the existing methods in terms of bits per pixel and execution time. Also, the security of the proposed technique has been demonstrated using RS analysis. In [14] a multi-stego-image method is introduced using the principle of modified least significant bit (LSB) matching to improve the embedding capacity and image quality. The proposed technique successfully withstands some specific methods of steganalysis. Each original pixel produces four new pixels, and the secret data is hidden in all four produced pixels. It is important to emphasize that with this method each stego object (image) hides one bit per pixel. In [15] a bit flipping technique is proposed to hide secret data in the original image. Each block consists of two pixels, and 1 or 2 LSBs of the pixels are flipped to hide the secret data. The method raises the hiding capacity in terms of bits per pixel of the stego image. It is compared to some existing bit flipping methods, and a steganalysis is presented. The steganographic parameters, such as the hiding capacity and the quality index, of the proposed technique have been compared with the results of the existing techniques and some methods of more recent date.


8.3 Decomposition of Catalan numbers

The Catalan numbers Cn, n > 0, represent a sequence of natural numbers that occurs as a solution to a large number of known combinatorial problems (binary trees, ballot problem, polygon triangulation, lattice path, Dyck words, balanced parentheses, etc.). The Catalan numbers are defined in [16] as
\[
C_n = \frac{(2n)!}{(n+1)!\,n!} \tag{8.1}
\]
or
\[
C_n = \frac{1}{n+1}\binom{2n}{n}, \qquad n \ge 0. \tag{8.2}
\]
The other way for defining Cn is
\[
\binom{2n}{n} - \binom{2n}{n-1} = \frac{(2n)!}{(n!)^2} - \frac{(2n)!}{(n-1)!\,(n+1)!} = \frac{(2n)!}{n!\,(n+1)!} = C_n. \tag{8.3}
\]
The Catalan numbers Cn have a wide application in solving many problems in combinatorial mathematics. In [16], concrete representations of Catalan numbers and applications are listed. A group of problems that describe over 60 different representations of the Catalan numbers is listed in [17]. The property of recursiveness of the Catalan numbers and their application to steganography are discussed in [5,18]. A method for decomposition of the Catalan numbers is provided in [5], which uses the procedure of dynamic programming and memoization. A recursive definition of the Catalan numbers can be represented as
\[
C_n = \frac{4n-2}{n+1}\, C_{n-1}, \qquad n \ge 1. \tag{8.4}
\]
Relationships for nonconsecutive Cn can be defined:
\[
C_n = \frac{(4n-2)(4n-6)}{(n+1)\,n}\, C_{n-2}, \tag{8.5}
\]
\[
C_n = \frac{(4n-2)(4n-6)(4n-10)}{(n+1)\,n\,(n-1)}\, C_{n-3}, \tag{8.6}
\]
\[
C_n = \frac{(4n-2)(4n-6)(4n-10)\cdots 6 \cdot 2}{(n+1)\,n\,(n-1)\cdots 3 \cdot 2}\, C_0. \tag{8.7}
\]
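As a quick check of the recursion in Eq. (8.4), the following short Java sketch computes the first few Catalan numbers; the printed values (1, 1, 2, 5, 14, 42, 132) include those used in the later examples.

// Sketch of computing Catalan numbers with the recursion of Eq. (8.4),
// C_n = (4n - 2) / (n + 1) * C_{n-1}, starting from C_0 = 1.
public class CatalanDemo {
    static long[] catalan(int max) {
        long[] c = new long[max + 1];
        c[0] = 1;
        for (int n = 1; n <= max; n++)
            c[n] = c[n - 1] * (4L * n - 2) / (n + 1);   // division is exact
        return c;
    }

    public static void main(String[] args) {
        long[] c = catalan(6);
        for (int n = 0; n <= 6; n++)
            System.out.println("C_" + n + " = " + c[n]);   // 1, 1, 2, 5, 14, 42, 132
    }
}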

The authors provided an effective method of decomposition of Cn, denoted by δ(Cn), represented by segments (2 + i). The number 2 can be represented by one initial segment (2 + 0). The decomposition of the Catalan number Cn can be derived from the already made decomposition of Cn−1 and the transformation f defined by
\[
(2+i) \;\rightarrow\; (2+0) + (2+1) + \cdots + (2+i+1) = f(2+i), \qquad i \ge 0. \tag{8.8}
\]


Lemma 8.1. For each n ≥ 4, the recurrent relation δ(Cn) = f(δ(Cn−1)) is valid.

In [18], for simplicity, the authors use the notation
\[
\sigma_i = (2+0) + (2+1) + \cdots + (2+i). \tag{8.9}
\]
Then Eq. (8.8) can be simplified to f(σ0) = σ1 and f(2 + i) = σi+1, i ≥ 1. Further, using f(σi) = σ1 + ··· + σi+1 = \(\sum_{k=1}^{i+1} \sigma_k\), i ≥ 0, the decompositions can be further simplified:
\[
\begin{aligned}
\delta(C_2) &= (2+0) = \sigma_0, \\
\delta(C_3) &= \sigma_1, \\
\delta(C_4) &= \sum_{l=1}^{2} \sigma_l, \\
\delta(C_5) &= \sum_{l=1}^{2} \sigma_l + \sum_{l=1}^{3} \sigma_l, \\
\delta(C_6) &= \sum_{l=1}^{2} \sigma_l + \sum_{l=1}^{3} \sigma_l + \left( \sum_{l=1}^{2} \sigma_l + \sum_{l=1}^{3} \sigma_l + \sum_{l=1}^{4} \sigma_l \right).
\end{aligned} \tag{8.10}
\]
Now, replacing
\[
f\!\left( \sum_{l=1}^{i} \sigma_l \right) = \sum_{l=1}^{i} f(\sigma_l) = \sum_{l=1}^{i} \sum_{l_1=1}^{l+1} \sigma_{l_1}, \qquad i \ge 1, \tag{8.11}
\]
the authors simplify Eq. (8.10) and get
\[
\begin{aligned}
\delta(C_2) &= \sigma_0, \\
\delta(C_3) &= f(\sigma_0) = \sigma_1, \\
\delta(C_4) &= f(\sigma_1) = \sum_{l=1}^{2} \sigma_l, \\
\delta(C_5) &= \sum_{l=1}^{2} f(\sigma_l) = \sum_{l=1}^{2} \sum_{l_1=1}^{l+1} \sigma_{l_1}, \\
\delta(C_6) &= \sum_{l=1}^{2} \sum_{l_1=1}^{l+1} f(\sigma_{l_1}) = \sum_{l=1}^{2} \sum_{l_1=1}^{l+1} \sum_{l_2=1}^{l_1+1} \sigma_{l_2}.
\end{aligned} \tag{8.12}
\]

Theorem 8.1. Decomposition of Cn is defined by summation of all segments (2 + i).


It suffices to verify the inductive step. From the inductive hypothesis and Lemma 8.1 we get
\[
\delta(C_n) = f(\delta(C_{n-1})) = f\!\left( \sum_{l=1}^{i} \sum_{l_1=1}^{l+1} \cdots \sum_{l_{n-5}=1}^{l_{n-6}+1} \sigma_{l_{n-5}} \right)
= \sum_{l=1}^{i} \sum_{l_1=1}^{l+1} \cdots \sum_{l_{n-5}=1}^{l_{n-6}+1} f(\sigma_{l_{n-5}})
= \sum_{l=1}^{i} \sum_{l_1=1}^{l+1} \cdots \sum_{l_{n-4}=1}^{l_{n-5}+1} \sigma_{l_{n-4}}, \tag{8.13}
\]

which completes the proof. Besides δ(Cn) = f(δ(Cn−1)), the following general recurrent relation is also satisfied: for each n ≥ 4, δ(Cn) = f^k(δ(Cn−k)), k ≥ 1. Example 8.1. For n = 4, it follows that C4 = 14. In the first step of the algorithm, the initial expression is δ(C2) = (2 + 0). In the second step of the algorithm for this case, there are two cycles. First, the number of segments of the form (2 + i) is counted in the expression δ(C2). The number of such segments is 1, and the sum in this segment is 2. The following expression δ(C3) will therefore have two segments of the form (2 + i), from which it follows that δ(C3) = (2 + 0) + (2 + 1). Now δ(C3) is taken as the initial expression, and it is calculated how many segments of the form (2 + i) it possesses. That number is 2, and the sums are 2 and 3, respectively. This means that the expression δ(C4) will have five segments of the form (2 + i) in total. Finally, δ(C4) = (2 + 0) + (2 + 1) + (2 + 0) + (2 + 1) + (2 + 2). See Fig. 8.1.

FIGURE 8.1 Graphic presentation of decomposition of the Catalan numbers.

The algorithm takes the base n as input and outputs the expression that represents the sum of segments of the form (2 + i) whose value is Cn, that is, the decomposition of the Catalan number. Algorithm 8.1. Decomposition of the Catalan numbers in the form of (2 + i), where i = 0 to sum(s).




• First step: In the initial phase, enter the parameter n and set the initial expression expr(k) to (2 + 0).
• Second step: For k = 2 to n, calculate the number of segments (2 + i) in expr(k) and return count(k). Then calculate the sum in each segment (2 + i) and return sum(s).
• Third step: For s = 1 to count(k), create the segment of the expression (2 + i) and connect the expressions with the addition operation +, where i = 0 to sum(s).
• Fourth step: Create the output expression expr[n] representing the decomposition for the entered n.
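As a rough, independent sketch of these steps (the full Java implementation is shown in Fig. 8.2), the following code keeps a decomposition as the list of i values of its segments (2 + i) and applies the transformation f of Eq. (8.8) repeatedly; for n = 4 it reproduces the expression of Example 8.1.

// Rough sketch of the decomposition of Algorithm 8.1. A decomposition is kept
// as the list of the i values of its segments (2 + i); the transformation f of
// Eq. (8.8) replaces every segment (2 + i) by (2 + 0) + (2 + 1) + ... + (2 + i + 1).
import java.util.ArrayList;
import java.util.List;

public class CatalanDecomposition {
    static List<Integer> decompose(int n) {
        List<Integer> expr = new ArrayList<>();
        expr.add(0);                                  // delta(C2) = (2 + 0)
        for (int k = 3; k <= n; k++) {
            List<Integer> next = new ArrayList<>();
            for (int i : expr)
                for (int j = 0; j <= i + 1; j++)      // expand f(2 + i)
                    next.add(j);
            expr = next;
        }
        return expr;
    }

    public static void main(String[] args) {
        List<Integer> d = decompose(4);
        StringBuilder sb = new StringBuilder();
        int sum = 0;
        for (int i : d) {
            sb.append(sb.length() == 0 ? "" : " + ").append("(2+").append(i).append(")");
            sum += 2 + i;
        }
        System.out.println("delta(C4) = " + sb + " = " + sum);   // sum equals C4 = 14
    }
}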

Implementation of Algorithm 8.1 in Java programming language is presented in Fig. 8.2.

FIGURE 8.2 Java source code for Algorithm 8.1.


8.4 Implementation of the proposed method

We will present an example of applying the Catalan number decomposition in steganography. The purpose of the previous works [5,6] was to analyze possible applications of the Catalan numbers in data hiding. We will use sequences in the form of combinations of a Catalan number and its decomposition (Cn and δ(Cn)). The decomposition method generates variables based on the Catalan numbers (e.g., 1, 2, 5, 14, 42, etc.). The value Cn serves as a generator for the first sequence of numbers, and the decomposition δ(Cn) serves as a generator for the second sequence. These two sequences are interconnected, that is, the second sequence is conditioned by the first.

Module for embedding data

The module for embedding a secret message into an image or text is based on the following phases:
a) Phase of converting: In this phase, the carrier data and the secret message are loaded and converted to binary notation. If the carrier is an image, it first needs to be converted to a Base64 string and then to binary notation.
b) Phase of Catalan number definition: In this phase, the bases n are defined for generating the Catalan numbers Cn.
c) Phase of Catalan number decomposition: In this phase, the Catalan number decomposition is performed; the expression δ(Cn) is generated based on Algorithm 8.1.
d) Phase of selection: In this phase, the block size (parameter Size_block) is defined, and bits are selected based on the Catalan number decomposition. In addition, a rule for the rows of the block (parameter Set_Cn_row) is defined.
e) Phase of comparison: In this phase, the selected bits from the carrier data are compared with the bits of the secret message, and the set of positions where a difference occurs (parameter Dp) is generated.

Fig. 8.3 illustrates the general scenario for the data embedding method based on the Catalan number decomposition. For a better understanding of the general scenario containing five phases, consider the following example of embedding a message into some carrier data.

Example 8.2. Let the carrier data be the text "Steganography is very interesting", and let the secret message be "GET". Through the five phases we explain how to hide the message and how to generate the stego key used for extracting it.

Phase of converting

• Carrier data = "Steganography is very interesting".
• Binary(Carrier data) = 01010011011101000110010101100111011000010110111001101111011001110111001001100001011100000110100001111001001000000110100101110011001000000111011001100101011100100111100100100000011010010110111001110100011001010111001001100101011100110111010001101001011011100110011100101110.


FIGURE 8.3 General scenario for data embedding (generating a stego key).

• Short secret message = "GET".
• Binary(Secret message) = 010001110100010101010100.

Phase of Catalan number definition

In this phase, we define the bases for Catalan number generation and use the following parameters:
• n = {2, 3, 4, 5}, which gives C2 = 2, C3 = 5, C4 = 14, C5 = 42.
• The obtained set, which will be used in the decomposition phase, is Cn = {2, 5, 14, 42}.

Phase of Catalan number decomposition

In this phase, based on Algorithm 8.1, the expressions of the Catalan number decomposition for the set n are the following:
• δ(C2) = (2 + 0).
• δ(C3) = (2 + 0) + (2 + 1).
• δ(C4) = (2 + 0) + (2 + 1) + (2 + 0) + (2 + 1) + (2 + 2).
• δ(C5) = (2 + 0) + (2 + 1) + (2 + 0) + (2 + 1) + (2 + 2) + (2 + 0) + (2 + 1) + (2 + 0) + (2 + 1) + (2 + 2) + (2 + 0) + (2 + 1) + (2 + 2) + (2 + 3).

Phase of selection

In this phase, we define the block size. Let us define the parameter Size_block = 32, which means that the binary notation of the carrier data is divided into rows containing at most 32 bits. Also, an additional rule for the rows in the block is defined. Let us define the following rule: Set_Cn_row = {3, 4, 2, 4, 0, 5}. After that, bit selection is applied based on the Catalan number decomposition and on the defined rule Set_Cn_row:


Table 8.1 Phase of comparison (Stego_bits and Secret_bits).

A        1 0 1 0 0 1 1 1 1 1 0 0 0 1 1  0  0 0  0 0 1  1 0 1
B        0 1 0 0 0 1 1 1 0 1 0 0 0 1 0  1  0 1  0 1 0  1 0 0
D(A, B)  * * * – – – – – * – – – – – *  *  – *  – – *  – – *
Dp       1 2 3 – – – – – 9 – – – – – 15 16 – 18 – – 21 – – 24

• The first row uses the base n = 3, that is, the decomposition δ(C3) = (2 + 0) + (2 + 1), which means that the second bit in the first row is selected based on the segment (2 + 0) and then the bit three positions further based on the segment (2 + 1).
• In the same manner the selection in the second row is done based on δ(C4), in the third row based on δ(C2), and in the fourth row based on δ(C4); the fifth row has no base for bit selection, and the sixth row uses the decomposition expression for δ(C5), as explained in Fig. 8.4.

FIGURE 8.4 Example of bit selection in the data carrier based on the parameters Size_block = 32 and Set_Cn_row = {3, 4, 2, 4, 0, 5}.

Phase of comparison

In this phase, the selected bits from the carrier data (A = Stego_bits) are compared with the bits of the secret message (B = Secret_bits). In that way a set of positions where differences occur is obtained (D = differences, Dp = positions of different bits), as shown in Table 8.1. Based on the data obtained from this last phase, the stego key can be generated according to the model Sk = {Size_block, Set_Cn_row, Dp}. In this example the obtained stego key


has the parameters Sk = {32, {3, 4, 2, 4, 0, 5}, {1, 2, 3, 9, 15, 16, 18, 21, 24}}. In the Java application the process of hiding information is reduced to the following steps:
1. Load the carrier data.
2. Load the secret message.
3. Select the bases Cn for generating the decomposition expressions.
4. Insert the block size.
5. Insert the rule for rows in a block.
Fig. 8.5 shows the data embedding module of the Java application; Fig. 8.6 shows how the Dp value of the complex stego key is generated while hiding text in an image.

FIGURE 8.5 Java application: Module for embedded data.

FIGURE 8.6 Dp value generation in stego key.
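To make the comparison phase concrete, the following sketch (an illustration in Python, not the Java code behind Fig. 8.6) compares the selected carrier bits A with the secret bits B of Table 8.1 and records the 1-based positions at which they differ, which is exactly the Dp segment of the stego key.

def generate_dp(stego_bits: str, secret_bits: str) -> list[int]:
    """Return the 1-based positions where the selected carrier bits differ from the secret bits."""
    assert len(stego_bits) == len(secret_bits)
    return [i + 1 for i, (a, b) in enumerate(zip(stego_bits, secret_bits)) if a != b]

# Values from Table 8.1 (A = Stego_bits, B = Secret_bits for "GET")
A = "101001111100011000011101"
B = "010001110100010101010100"
print(generate_dp(A, B))   # [1, 2, 3, 9, 15, 16, 18, 21, 24]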

Module for data extraction
The module for extracting a secret message from an image or text is based on the following phases:
a) Phase of converting the stego object: In this phase, the stego object (carrier) is loaded and converted to binary notation. If the stego object is an image, it is first converted to a Base64 string and then to binary notation.
b) Phase of stego-key analysis: In this phase the three basic segments of the stego key, Size_block, Set_Cn_row, and Dp, are analyzed. Based on the first parameter, the stego object is


divided into blocks; based on the second parameter, the decomposition expressions are generated; and the third segment, the set Dp, determines the positions at which bit modification is applied.
c) Phase of selection: In this phase, bits are selected based on the generated Catalan number decomposition. In addition, the parameter Set_Cn_row is applied to the rows of each block.
d) Phase of modification: In this phase the selected bits of the stego object are complemented at the positions defined in the set Dp.
e) Phase of generating the secret message: In this phase the data extraction is completed (the secret message is generated) based on the three input parameters {Size_block, Set_Cn_row, Dp}.
Fig. 8.7 represents the general scenario of the data extraction method based on the Catalan number decomposition; a small sketch of phases d) and e) is given after the figure.

FIGURE 8.7 General scenario for data extract (generating hidden messages).
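Continuing the example, the modification and message-generation phases can be sketched as follows. This is our reading of phases d) and e) (complement the selected stego bits at the positions listed in Dp), not the authors' Java implementation.

def recover_secret(selected_bits: str, dp: list[int]) -> str:
    """Flip the selected stego bits at the 1-based positions in Dp to obtain the secret bits."""
    bits = list(selected_bits)
    for p in dp:
        bits[p - 1] = '1' if bits[p - 1] == '0' else '0'
    return ''.join(bits)

secret = recover_secret("101001111100011000011101", [1, 2, 3, 9, 15, 16, 18, 21, 24])
# Group into bytes and decode: this yields the hidden message "GET"
print(bytes(int(secret[i:i + 8], 2) for i in range(0, len(secret), 8)).decode())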

The necessary parameters for data extraction from the stego object in the Java application are:
• an adequate Catalan stego key, loaded in the form of a special file;
• the image (or text) that carries the information;
• the selected Cn (one or more) for data extraction.
Fig. 8.8 illustrates the data extraction module of the Java application. Fig. 8.9 presents a case where an invalid Catalan key is loaded: we chose to extract data for C2 and C3, but the message comes out garbled because the wrong stego key points to the wrong bits in the image, so the message is not readable.


FIGURE 8.8 Java application: Module for data extraction.

FIGURE 8.9 Loading a nonvalid key and wrong selection of bits in the image.

8.5 Steganalysis and security testing
Steganalysis techniques try to detect the existence of a hidden message within another medium. In essence, steganalysis has to solve three basic tasks: detection, definition, and decoding of the hidden message. One of the main criteria for the effectiveness of a steganographic system is the relationship between the hidden content and the supporting content. In [6], steganalysis is performed to check whether the solution is safe enough with respect to easy removal of confidential information from the picture. Advanced machine learning methods are applied for classification and clustering. The authors also gave a comparative analysis showing that the classifier works equally


well or even better than the existing algorithms. Steganalysis works on suspicious data sets, none of which is known with certainty to carry a secret message. The goal has two main points: 1) to detect the message and 2) to read the message after the key is detected. In the proposed method, representing an image through the Catalan sequence (numbers) allows increasing the number of bit planes from 8 to 12. In [19] the representation of an image in the Fibonacci sequence likewise increases the number of bit planes to 12. The results of that method, compared with other existing techniques, show that it achieves a high embedding rate of secret data and a high quality of stego objects (images, texts). In [20] an approach using Fibonacci and Catalan numbers was presented to increase the number of bit planes that can be used for hiding information. These techniques alter the LSBs of the cover data and thus increase the possibility of attack by visual methods. The robustness of hidden data (text, image) embedded by the algorithms in [20] is low, because the message bits are still embedded into LSBs. In the proposed method the message bit permutation is performed prior to embedding, resulting in a uniform bit distribution. Popular LSB-based steganography tools differ significantly in their approach to information concealment. In addition, in the proposed method the distribution of bits over the R, G, B channels is fairly even, which means that all channels are equally burdened.
Time and space complexity for Catalan keys
Many authors have discussed the time and memory complexity of generating Catalan numbers in different forms (balanced parentheses, binary or ballot notation). In [21] the program run space and time complexity is analyzed. The asymptotic growth of Cn is estimated by the formula

Cn ∼ 4^n / (n^(3/2) √π).   (8.14)

This means that the algorithm uses O(4^n / n^(3/2)) = O(4^n) memory. It is important to emphasize that the total number of output sequences is exactly Cn = O(4^n).

Security analysis of the stego key
In this case, there is a complex stego key, that is, an ordered triple Sk = {Size_block, Set_Cn_row, Dp}. An important advantage of the proposed solution is that the data carrier (image or text) remains unchanged, which makes it completely resistant to statistical tests. The overall security lies in the stego key, that is, in the key that generates the secret message from the data carrier. The security analysis is divided into three segments:
1. the existence of multiple bases for generating the Catalan numbers;
2. the existence of the additional parameters, the block size and the rule for rows in a block;
3. analysis from the aspect of different bits.
1) The existence of multiple bases for generating the Catalan numbers: The complexity of steganalysis increases with the ability to load a set of bases {n1, n2, . . ., nm}, which


makes it even more difficult to find the complex stego key. The attacker does not know whether only one base was used, two consecutive bases, three nonconsecutive bases, and so on. All this gives the proposed solution a considerable advantage in terms of security and the inability to penetrate this first segment of the complex key by chance.
2) Existence of the additional parameters (block size and rule for rows in a block): The proposed steganographic method accepts the key (it determines the additional parameters, block size and rule for rows in a block), based on which so-called bit indexing is performed in one or more data carriers. These indices matter when a secret message generated from multiple documents is transmitted, because they allow the parts of the message (or of some cipher) to be reconnected in the appropriate order. It is important to note that this solution offers the ability to hide the secret message in several files: when the bit length of the encrypted message is greater than the bit length of the bit-selection space of the supporting document, the steganographic system takes the next supporting document. This depends on the length of the secret message and on the parameters block size and rule for rows in a block. All of this further complicates steganalysis.
3) Analysis from the aspect of different bits: Attacks on hidden information can take several forms: detection, separation, destruction, or decryption of the hidden information. In the proposed solution, if an attacker has the image (or text) that constitutes the message carrier, he will never be able to generate the exact secret message without all the segments of the stego key (and especially the third segment, the set Dp). This third key segment provides the information about which bits are taken unchanged from the carrier and about which bits are complemented. What further complicates steganalysis is the fact that the set Dp records iteration counters, not the actual positions of the bits in the data carrier. The real positions of the bits can only be found if all three sets of Sk = {Size_block, Set_Cn_row, Dp} are given. So even if the attacker finds the values in the set Dp, this means nothing: the actual positions of the selected bits in the data carrier can only be obtained together with the preceding two segments. Based on all this, the key can be distributed by segments, which further increases the security of the steganographic system itself.

Steganalysis of the proposed method
Steganalysis was based on the following tests:
1. the amount of information per pixel in the stego image;
2. approximate entropy, original vs. stego image;
3. bit distribution in the stego image.
1) The amount of information per pixel: In our previous research, we determined acceptable parameters of the LSB algorithm for a safe steganographic channel. Below we show how the bits of the confidential information are distributed over the R, G,


and B channels in the picture. Testing for the purposes of steganalysis was carried out in the following way:
• selection of 150 images that served as data carriers;
• use of 24-bit images for the steganalysis;
• determination of the amount of information to hide.
We were guided by the recommendations of previous research on steganalysis, which found that no tool can detect an LSB algorithm when the embedded information is 0.005 bpp (bits per pixel). When embedding the secret information in the image, we implemented additional options, one of which displays parameters indicating whether the process was carried out successfully and how much information was embedded per pixel. Fig. 8.10 shows the dialog that appears during the "embedding data" procedure. The system user always gets feedback on whether the stego image and the corresponding stego key were successfully created, as well as details about other parameters: the image size, the size of the secret information, and the very important parameter BPP (bits per pixel).

FIGURE 8.10 Some parameters in the process of embedding confidential data.

2) Approximate entropy: This test revealed another strength of the proposed solution: the entropy of both classes of images remains practically unchanged. The test examines the frequency of occurrence of all possible overlapping n-bit patterns in the series. Fig. 8.11 shows one test in the Octave GUI environment comparing the original image with the stego image; the entropy is 7.3696 for the original image and 7.3690 for the stego image.
3) Bit distribution: The message bit permutation is performed prior to embedding, resulting in a uniform bit distribution. Popular LSB-based steganography tools differ significantly in their approach to information concealment. Fig. 8.12 shows the distribution of bits for a single selected image in which secret information is embedded with the Catalan stego method. In this case, only the embedded bits with value 1 are shown, whereas the zero bits of the message are omitted (Octave GUI environment). We now show that the proposed method meets another condition, namely that the three channels of the image (R, G, B) are loaded evenly. Fig. 8.13 shows the distribution of bits in


FIGURE 8.11 Comparative analysis of the entropy of the original and stego images.

FIGURE 8.12 Distribution of bits in the image.

the R, G, B channels; we can see that it is fairly even, which means that all channels are equally burdened (the right image is a zoomed part of the left one).
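The entropy comparison reported above (7.3696 vs. 7.3690, obtained in Octave) can be reproduced in spirit with a plain Shannon entropy of the pixel histogram. This is only a rough stand-in for the approximate-entropy test, and the file names are placeholders.

import numpy as np
from PIL import Image

def pixel_entropy(path: str) -> float:
    """Shannon entropy (bits per pixel) of the grayscale pixel histogram."""
    pixels = np.asarray(Image.open(path).convert("L")).ravel()
    hist = np.bincount(pixels, minlength=256).astype(float)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

# Hypothetical file names; compare the original carrier with its stego version
print(pixel_entropy("original.png"), pixel_entropy("stego.png"))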

8.6 Conclusion
In this chapter, we covered several mathematical concepts and contributed to the application of number theory and combinatorial mathematics to steganography and data hiding. The theoretical foundations of the research are laid out: the basic properties of Catalan numbers are investigated, and an algorithm for their decomposition is analyzed. The theoretical part is followed by experimental testing. More specifically, a case study is presented that includes a hiding module implemented in the Java NetBeans environment. A GUI application is implemented that has all the elements needed for easy and efficient data hiding. Steganalysis of the proposed solution is also carried out with respect to the following aspects: the amount of information per pixel in the stego image, the approximate entropy of the original vs. the stego image, and the bit distribution in the stego image. An attacker who does not know the parameters of the stego key cannot find out whether one or more bases were used to generate the decomposition and


FIGURE 8.13 Uniform distribution on R, G, B channels.

therefore cannot determine the positions of the bits in the carrier, nor can he determine how many data carriers contain the hidden message. Further research in the field of steganography will be based on the application of Catalan numbers and certain combinatorial problems in the domain of public-key steganography and the exchange of hidden keys.

References
[1] M. Saracevic, Application of combinatorial mathematics in cryptography and steganography, in: BALCOR – XIII Balkan Conference on Operational Research, May 2018, Serbia.
[2] M. Saracevic, E. Koricanin, E. Bisevac, Encryption based on ballot, stack permutations and balanced parentheses using Catalan-keys, Journal of Information Technology and Applications 7 (2) (2017) 69–77.
[3] M. Saracevic, A. Selimi, F. Selimovic, Generation of cryptographic keys with algorithm of polygon triangulation and Catalan numbers, Computer Science – AGH 19 (3) (2018) 243–256.
[4] M. Saracevic, S. Adamovic, E. Bisevac, Applications of Catalan numbers and lattice path combinatorial problem in cryptography, Acta Polytechnica Hungarica: Journal of Applied Sciences 15 (7) (2018) 91–110.
[5] M. Saracevic, M. Hadzic, E. Koricanin, Generating Catalan-keys based on dynamic programming and their application in steganography, International Journal of Industrial Engineering and Management 8 (4) (2017) 219–227.
[6] M. Saracevic, S. Adamović, V. Mišković, N. Maček, M. Šarac, A novel approach to steganography based on the properties of Catalan numbers and Dyck words, Future Generation Computer Systems 100 (2019) 186–197.
[7] S. Pund-Dange, C.G. Desai, Data hiding technique using Catalan–Lucas number sequence, Indian Journal of Science and Technology 10 (4) (2017) 12–17.
[8] N. Aroukatos, K. Manes, S. Zimeras, F. Georgiakodis, Techniques in image steganography using famous number sequences, International Journal of Computers & Technology 11 (3) (2013) 2321–2329.


[9] D.L. Bhaskari, P.S. Avadhani, A. Damodaram, Combinatorial approach for information hiding using steganography and godelization techniques, International Journal of Systemics, Cybernetics and Informatics (2007) 21–24.
[10] J.M. Gutierrez-Cardenas, Secret key steganography with message obfuscation by pseudo-random number generators, in: 38th IEEE International Computer Software and Applications Conference Workshops, Sweden, July 2014, pp. 164–168.
[11] A.K. Sahu, G. Swain, An optimal information hiding approach based on pixel value differencing and modulus function, Wireless Personal Communications 108 (1) (2019) 159–174.
[12] A.K. Sahu, G. Swain, A novel n-rightmost bit replacement image steganography technique, 3D Research 10 (1) (2019) 1–18.
[13] A.K. Sahu, G. Swain, Pixel overlapping image steganography using PVD and modulus function, 3D Research 9 (3) (2018) 1–14.
[14] G. Swain, A.K. Sahu, A novel multi stego-image based data hiding method for gray scale image, Pertanika Journal of Science and Technology 27 (2019) 753–768.
[15] A.K. Sahu, G. Swain, E. Babu, Digital image steganography using Bit Flipping, Cybernetics and Information Technologies 18 (1) (2018).
[16] T. Koshy, Catalan Numbers with Applications, Oxford University Press, New York, 2009.
[17] R.P. Stanley, Catalan addendum to Enumerative Combinatorics, available at: http://www-math.mit.edu/~rstan/ec/catadd.pdf, 2013. (Accessed 3 June 2019).
[18] P. Stanimirović, P. Krtolica, M. Saračević, S. Mašović, Decomposition of Catalan numbers and Convex Polygon Triangulations, International Journal of Computer Mathematics 91 (6) (2014) 1315–1328.
[19] A. Rehman, T. Saba, T. Mahmood, Z. Mehmood, M. Shah, A. Anjum, Data hiding technique in steganography for information security using number theory, Journal of Information Science 45 (6) (2019) 767–778.
[20] N. Aroukatos, K. Manes, S. Zimeras, F. Georgiakodis, Data hiding techniques in steganography using Fibonacci and Catalan numbers, https://doi.org/10.1109/ITNG.2012.96, 2012, pp. 392–396.
[21] S. Saba, Generating all balanced parentheses: a deep dive into an interview question, MathCode. Available on https://sahandsaba.com/interview-question-generating-all-balanced, 2018.

9 A steganography approach for hiding privacy in video surveillance systems Ahmed Elhadada,b , Safwat Hamadc , Amal Khalifad , Hussein Abulkasime a Faculty of Science, South Valley University, Department of Mathematics and Computer Science, Qena, Egypt b Faculty of Science and Art, Jouf University, Department of Computer Science and Information,

Al Qurayyat, Saudi Arabia c Faculty of Computer and Information Sciences, Ain Shams University, Department of Scientific Computing,

Cairo, Egypt d Purdue Fort Wayne University, Department of Computer Science, West Lafayette, IN, United States e Faculty of Science, New Valley University, Department of Mathematics, El-Kharja, Egypt

9.1 Introduction
Surveillance systems have been used everywhere for decades to protect facilities and people from criminal activities such as theft, fraud, and violence. In addition, organizations are utilizing surveillance technology to help improve business operations [1–6]. For example, Digital Video Surveillance (DVS) enables clients to establish effective risk management strategies that help manage and safeguard business information and technology assets and anticipate vulnerabilities and risks [7,8]. Despite its importance, surveillance imposes a number of privacy issues. In general, privacy can be defined as the ability of an individual or group to keep their personal information and affairs secluded from others and to disclose them as they choose [9,10]. In this context, privacy concerns arise wherever personal data or personally identifiable information is collected and stored, in digital form or otherwise. In fact, privacy advocates worry whether the potential abuses of video surveillance outweigh its benefits [9]. Therefore it is a fundamental challenge to design surveillance systems in such a way that security needs are met without violating the right to privacy. One solution to this problem is deidentification. Deidentification in multimedia content can be defined as the process of concealing the identities of individuals captured in a given set of data (images, video, audio, text) for the purpose of protecting their privacy. This can be done simply by adding noise or by removing and/or replacing objects using black boxes, blurring, or in-painting [11]. On the other hand, reidentification is one of the more important processing actions for effective monitoring in the case of crimes or authorized access; therefore hiding the privacy details is required [12].


Such techniques actually modify the original scene and destroy the integrity of the original video, which may lead to errors in the reidentification process. Hence it may be necessary to find another way to work around this problem. In this chapter, we propose using information hiding principles to imperceptibly embed the privacy information within the modified video itself. Most existing techniques [13–16] were actually designed with watermarking applications in mind; for purposes such as authentication or copyright protection, the size of the embedded information is usually very small, which is not very useful in our case. In this chapter, we propose a framework to hide the privacy information captured by video surveillance systems. According to the proposed framework, the original video is modified to remove the target object from the scene using a background image and an in-painting technique. Then a data hiding technique is used to embed the original video frame into the in-painted one. The proposed hiding technique operates in the discrete cosine transform (DCT) domain based on the H.264 video compression concept and embeds video into video. To ensure correct recovery of the hidden information, a preprocessing step is applied to the cover video before embedding; this step relies on the spatial redundancy reduction of the H.264 video compression technique. Once embedded, the stego video can go through an extraction process to blindly reconstruct the original frame. In this context, blind extraction means that the cover video is not needed to retrieve the hidden one.

9.2 Related works
Yi-Chun Liao et al. [17] proposed a data hiding method for video using adaptive LSB (least significant bit) substitution. The method decodes the video file into frames, each of which is a color image described as pixels. The scheme uses three components: the original frame (original image), a background image, and a secret image. The secret image is extracted from the original image using a color clustering technique that selects the secret image dimensions and location. The adaptive LSB substitutes two bits in each pixel of the cover image with bits of the secret message. Finally, a pasting technique puts the secret object smoothly back into the video frame. Almost the same approach was proposed in [18], where the main data required to extract the secret data correctly are embedded into the first frame. The cover video is then decoded into frames, and the secret message is split into byte streams embedded using LSB with the help of a random byte allocator. The secret data consist of textual messages, and the cover file is in AVI format. Chae and Manjunath [19] proposed another technique for hiding data in video files or images: image into image, image into video, and/or video into video. The data hiding technique depends on a domain transformation of the host cover and the secret message using the 8 × 8 block DCT. The secret message coefficients are then quantized and encoded using multidimensional lattices before being embedded into


the DCT coefficients of the host cover. The technique succeeded in hiding video into video of the same dimensions and showed robustness against MPEG-2 compression, but the experimental results produced noticeably degraded stego frames with PSNR below 35 dB. From another standpoint, data hiding in video has been used to develop video quality metrics, as illustrated in [20]. A binary mark is embedded in the video frames, and the resulting video is encoded and transmitted through the channel. After receiving and decoding the signal, the binary mark bits are extracted blindly and used to estimate the quality of the received video by measuring its degradation. Privacy hiding methods first remove authorized persons from the scene. Wickramasuriya et al. [21] introduced a full framework that combines sensors and video streams to select the privacy details and the scene view, respectively. Depending on authorization, the stream is further processed by removing objects with a noise/blur filter, pixel coloring, or a bounding box to hide the privacy details. This approach is strong in terms of privacy, but in the case of a crime, it is difficult to retrieve the original scenario. Wei Zhang et al. [13] proposed storing the privacy information in the surveillance video as a watermark, which satisfies the requirement of recording everything in the area: any person can only see the unauthorized behavior, and the privacy information remains entirely invisible. On the other hand, an authorized person can use a secret key to view the original video, including the privacy information if any, as reidentification. Unauthorized users cannot reconstruct the privacy details without the secret key, even if the exact algorithm of the surveillance system is known to the attacker. The proposed method first uses an identity sensor (e.g., RFID), so that the policy engine can check the authority of the incoming person. The resulting video is then compressed into another bit stream, and the embedding takes place as a watermark via a privacy-embedding scheme. Cheung et al. [22] described privacy protection in a video surveillance system that uses camera systems and an object identification and tracking unit with the help of RFID sensors. After segmentation, the privacy information is encrypted, and the data hiding scheme is applied in combination with a standard video compression encoder.

9.3 Hiding privacy information using the video compression concept
In this section the proposed framework hides privacy details in video using data hiding principles along with the H.264 video compression concept. The original video is modified to remove an object from the scene using a background image and the private information details, which we call the hide in-paint technique. The proposed data hiding technique then hides the original frame in the modified one and can blindly reconstruct the original frame from the fake one. Hiding privacy in video involves many challenges. In fact, modifying the original scene to obtain a resultant video without private information details requires many computational tasks, such as detecting, classifying, tracking, and removing objects. Object detection aims to segment an object in a scene. Subsequent processes are required,


such as object classification and tracking to recognize/identify and follow the object from one frame to another.

FIGURE 9.1 Hiding In-paint Privacy Information Details Framework.

Fig. 9.1 shows a full diagram of the hide in-paint privacy information framework in its basic layer view. The framework includes three main parts: the presets layer, the hide in-paint layer, and the recapture-original-video layer. In the first layer, presets, the background modeling and the privacy management system receive the original video from the surveillance scene cameras, and the background model is reconstructed from the scene. Here it is assumed that the cameras are fixed, so the background can be generated using the Gaussian background model. At the same time, the privacy management system receives the original video and reconstructs the private information video: its frames are black, and the privacy details are the colored areas. The privacy management system described by Cheung et al. [23] is assumed to be used. This research focuses on the second layer, the hide in-paint layer, which removes private information based on the generated background model and the privacy information frames. The result of this layer is a faked video from which the private information details are removed and in which the original frames are securely embedded. With this capability, in the case of crimes or urgent situations, only an authorized person with the secret parameters can blindly extract the original frames from the faked ones. The following sections illustrate the concepts of the proposed framework. First, a background modeling technique is explained to generate a background image from a


sequence of frames. Next, a deidentification procedure is used to remove the private information from the original video using the hide in-paint technique. Finally, a compression preprocessing step takes place to prepare the cover video for hiding the original video using the proposed quantization hiding technique.

9.3.1 Background model generator
Background modeling generators are methods for segmenting the scene into foreground and background objects. In video, background subtraction is a simple background modeling generator that computes the difference between the current and reference frames. For instance, let f(x, y, t) and f(x, y, t + 1) be the pixel intensities at location (x, y) in the current and next frames of the video; the frame difference is [24]

Diff(t + 1) = |f(x, y, t + 1) − f(x, y, t)|.   (9.1)

This difference mask shows the change between sequential frames at time t, assuming that the background is static and the foreground is moving; hence the background is removed in the difference mask. In addition, a threshold value is used to improve the elimination:

Diff(t + 1) = |f(x, y, t + 1) − f(x, y, t)| > Threshold.   (9.2)

Instead of using sequential frames, the average of several frames can be used to obtain the background and the foreground at any time t, as shown in the equations

Avg(x, y) = (1/N) Σ_{i=1}^{N} f(x, y, t − i),   (9.3)

|f(x, y, t) − Avg(x, y)| > Threshold,   (9.4)

where N is the number of sample frames used to calculate the average. Alternatively, a Gaussian probability density function can generate a background model by calculating the mean μ and variance σ² of each pixel over the recent sequence of frames, as shown in the equations [24]

|f_t − μ| / σ > threshold → Foreground,   (9.5)

|f_t − μ| / σ ≤ threshold → Background.   (9.6)

In the proposed framework the background model generator uses the Gaussian mixture model, which decides the most probable background and foreground pixels with a simple heuristic algorithm applied to the video sequence frames.
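As an illustration of Eqs. (9.3)–(9.6), the sketch below builds an averaging background model and a per-pixel mean/standard-deviation test from a stack of frames. It is a simplified stand-in for the Gaussian mixture model actually suggested by the framework (an off-the-shelf implementation such as OpenCV's createBackgroundSubtractorMOG2 could be used in practice), and the threshold values and random frames are arbitrary.

import numpy as np

def background_average(frames: np.ndarray) -> np.ndarray:
    """Eq. (9.3): average of the last N grayscale frames, frames has shape (N, H, W)."""
    return frames.mean(axis=0)

def foreground_mask(frame: np.ndarray, frames: np.ndarray, threshold: float = 25.0) -> np.ndarray:
    """Eq. (9.4): pixels whose deviation from the average exceeds the threshold are foreground."""
    return np.abs(frame - background_average(frames)) > threshold

def gaussian_foreground(frame: np.ndarray, frames: np.ndarray, threshold: float = 2.5) -> np.ndarray:
    """Eqs. (9.5)-(9.6): per-pixel |f_t - mu| / sigma test using the mean and std of recent frames."""
    mu = frames.mean(axis=0)
    sigma = frames.std(axis=0) + 1e-6          # avoid division by zero for perfectly static pixels
    return np.abs(frame - mu) / sigma > threshold

# Example with random data standing in for N = 20 grayscale frames of size 288 x 352
frames = np.random.randint(0, 256, size=(20, 288, 352)).astype(float)
current = frames[-1]
print(foreground_mask(current, frames).mean(), gaussian_foreground(current, frames).mean())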


9.3.2 Deidentification of private details
Deidentification is the process of concealing the identities of individuals captured in a given set of data (text, audio, images, video) to protect their privacy. The deidentification process is an interdisciplinary challenge involving scientific areas such as image processing, speech analysis, video tracking, and biometrics. The traditional privacy protection method in video (e.g., videos on TV channels) is person masking, in which people's features are blocked or blurred in the video frames to protect their identity. In the proposed framework the original video is modified to remove the private information details by the hide in-paint technique, which replaces the privacy information details with background model regions in symmetrical areas. The resultant video shows the original scene without the private details of the protected objects' activities. In addition, to improve the homogeneity and smoothness of the modified scene, neighboring pixels are included in the replacement process.

9.3.3 H.264 compression preprocessing
At this point, there are two videos of the same size. The first contains the original frames, whereas the second contains the public frames in which the private information details have been deidentified. As a preprocessing step, for the sake of spatial redundancy reduction, the frames with removed privacy information go through an H.264 compression process using an integer DCT. In this way, compression is intended to reduce the file size in preparation for adding the private information. As shown in Fig. 9.2, for each color band (Red, Green, and Blue) of the video, each frame is divided into 16 × 16 blocks, and each block is divided into 4 × 4 microblocks (MB). A 4 × 4 integer DCT is then applied to each MB independently, resulting in a 16-element DCT transform that consists of one DC coefficient and 15 AC coefficients. The DC coefficient represents the average color of the 4 × 4 region, whereas the 15 AC coefficients represent color changes across the block [25–27]. Thereafter, a quantization step scales down the transformed coefficients using a quantization parameter (QP). Finally, a dequantization and an inverse integer DCT are applied to reconstruct the compressed cover frame. Note that a 4 × 4 Hadamard transform is applied to the 16 DC coefficients in each block before the reconstruction is done. Fig. 9.3 gives an example of a 4 × 4 block of frame data X. Applying the integer DCT gives a 4 × 4 coefficient matrix W: its upper left portion represents the lower-frequency components of X, whereas its lower right portion gives the higher-frequency components. Z is the quantized version of W, and the amount of data in Z is much smaller than that in the original frame data X. W′ is the scaled-up (inversely quantized) version of Z. After applying the inverse integer DCT (IDCT) to W′, we get X′, the decoded frame data. Note that X′ is not exactly identical to X, that is, this process is lossy due to the irreversibility of quantization.
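The transform/quantize/reconstruct round trip of Fig. 9.3 can be sketched as follows, using the standard H.264 4 × 4 forward core transform matrix. The scalar quantizer and the QP-to-step mapping used here are simplifications of the real H.264 scheme, which applies position-dependent integer scaling tables; the example block values are arbitrary.

import numpy as np

# H.264 4x4 forward integer core transform matrix
CF = np.array([[1, 1, 1, 1],
               [2, 1, -1, -2],
               [1, -1, -1, 1],
               [1, -2, 2, -1]], dtype=float)

def transform_quantize(block: np.ndarray, qp: int):
    """Forward core transform of a 4x4 block, then a simplified scalar quantization."""
    w = CF @ block @ CF.T                      # unscaled transform coefficients W
    qstep = 0.625 * 2 ** (qp / 6)              # approximate H.264 step size (doubles every 6 QP steps)
    z = np.round(w / qstep)                    # quantized coefficients Z
    return w, z, qstep

def dequantize_inverse(z: np.ndarray, qstep: float) -> np.ndarray:
    """Rescale Z and invert the unscaled core transform (H.264 itself uses integer scaling tables)."""
    w_rec = z * qstep
    inv = np.linalg.inv(CF)
    return inv @ w_rec @ inv.T                 # reconstructed block X', close to but not equal to X

block = np.array([[58, 64, 51, 58],
                  [52, 64, 56, 66],
                  [62, 63, 61, 64],
                  [59, 51, 63, 69]], dtype=float)
w, z, qstep = transform_quantize(block, qp=27)
print(np.round(dequantize_inverse(z, qstep)))  # lossy reconstruction of the original block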


FIGURE 9.2 H.264 compression concept for preprocessing step.

9.3.4 The proposed quantization hiding technique
In the proposed framework the compressed cover frames and the original frames are both assumed to be true-color videos. The overall technique is depicted in Fig. 9.4. First, the compressed cover frames are normalized to obtain float pixel values in the range 0.0 to 1.0 instead of the integer range 0–255; the original frames are normalized in the same way. The next step carries out the hiding process on the normalized coefficients of the compressed cover in the discrete cosine transform (DCT) domain. The proposed embedding technique replaces these coefficients by the pixel values of the original frames using the equation

Stego = (2/β)(Msg + i), where 2i/β ≤ Cover < 2(i + 1)/β,   (9.7)

where Msg refers to an adjusted normalized pixel of the original frame, Stego refers to the resultant coefficient of the stego frame, Cover is the corresponding compressed cover frame coefficient, and i = 0, 1, 2, . . ., β − 1.


FIGURE 9.3 Simple example illustrates transformation and quantization.

The proposed hiding equation also specifies an additional parameter β, which is used for embedding and indicates the number of intervals into which the range of the compressed cover coefficients is divided, where the range is determined by the minimum and maximum coefficient values. For example, when β = 6, the stego coefficients are computed according to the following rules:

Stego = (1/3) Msg,        if 0 ≤ Cover < 1/3,
        (1/3)(Msg + 1),   if 1/3 ≤ Cover < 2/3,
        (1/3)(Msg + 2),   if 2/3 ≤ Cover < 1,
        (1/3)(Msg + 3),   if 1 ≤ Cover < 4/3,
        (1/3)(Msg + 4),   if 4/3 ≤ Cover < 5/3,
        (1/3)(Msg + 5),   if 5/3 ≤ Cover < 2,   (9.8)

that is, which case of the preceding equation is used depends on the value of the cover coefficient. To illustrate this point, Fig. 9.5 shows an example where the compressed cover coefficient lies in the third region, which implies that the corresponding stego coefficient is computed as (1/3)(0.4 + 2) = 0.8. Note that the computed values introduce only slight changes to the original ones, which reduces the degradation caused by the hiding process.
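A minimal sketch of the coefficient mapping of Eqs. (9.7)-(9.8) follows (a Python illustration; the chapter's implementation is in MATLAB). The interval index i is derived from the compressed cover coefficient, and the message value is assumed to have already been adjusted into [α, 1 − α] so that the stego value stays inside the selected interval.

import math

def embed_coeff(cover: float, msg: float, beta: int) -> float:
    """Eq. (9.7): Stego = (2 / beta) * (msg + i), where 2i/beta <= cover < 2(i+1)/beta."""
    i = int(math.floor(cover * beta / 2))      # index of the interval containing the cover coefficient
    i = min(max(i, 0), beta - 1)               # clamp to 0..beta-1 for safety
    return (2.0 / beta) * (msg + i)

# Example in the spirit of Fig. 9.5: 0.7 is an assumed cover value inside the third region [2/3, 1)
print(embed_coeff(0.7, 0.4, 6))                # ~0.8, i.e. (1/3) * (0.4 + 2)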


FIGURE 9.4 The proposed Quantization Hiding technique.

FIGURE 9.5 An example of the hiding process using β = 6.


In addition, an adjustment step takes place on the normalized pixel values of the original frame using α. This step assures that saturated pixel values will not eventually result in an overflow in the embedded stego coefficients. Finally, the stego frame is obtained by an inverse discrete cosine transform (IDCT). Since the pixel values were actually normalized, they need to be de-normalized to convert the pixel values back to their original integer domain. The detailed steps of the embedding process are listed in Algorithm 1.

Algorithm 1: The Embedding Process.
Input: Original and private detail frames, α, β, and QP
Output: Stego frame
1) Read original frame → Org frame.
2) Read private details frame → Priv frame.
3) Background extraction using Org and the Gaussian mixture model → BG model.
4) Deidentify private details in Org frame using Priv frame and BG model → Cover frame.
5) Compress Cover frame using the H.264 concept:
   5.1) Transform Cover frame into the integer DCT domain for each microblock (MB).
   5.2) Quantize and dequantize the AC coefficients in each MB using QP.
   5.3) Pick out the DC coefficient of each MB and apply the following:
        5.3.1) Apply the Hadamard transform.
        5.3.2) Quantize the Hadamard coefficients.
        5.3.3) Apply the inverse Hadamard transform.
        5.3.4) Dequantize the resultant coefficients.
   5.4) Apply the inverse integer DCT to the resultant MB coefficients → Cover frame.
   5.5) Set the compressed cover → Comp Cover frame.
6) Set Org frame → Msg frame.
7) Normalize both Msg frame and Comp Cover frame.
8) Adjust extreme pixel values in Msg frame using the function
   Msg_frame = 1 − α if Msg_frame ≥ 1 − α; Msg_frame = α if Msg_frame < α.
9) For each MB, transform the normalized Comp Cover frame into the DCT domain.
10) Embed the Msg frame pixels into the Comp Cover frame coefficients using
    Stego = (2/β)(Msg + i), where 2i/β ≤ Cover < 2(i + 1)/β, i = 0, 1, 2, . . ., β − 1.
11) Apply the inverse DCT to the resultant Stego coefficients → Stego frame.
12) Denormalize the Stego frame.
13) Return the Stego frame.


FIGURE 9.6 A flowchart illustrates the proposed extraction process.

9.3.5 The extraction module
As illustrated in Fig. 9.6, the steps of the extraction process are exactly the inverse of those followed during the embedding phase. The process starts by normalizing the stego frames. Next, the DCT decomposition of the normalized stego frame is computed. Then, for each coefficient utilized for embedding, the range is subdivided according to β using the equation

Msg = (β/2)(Stego_frame − 2i/β), where 2i/β ≤ Stego_frame < 2(i + 1)/β,   (9.9)

where Msg is the pixel value of the reconstructed original frame extracted from the stego coefficient, and i = 0, 1, 2, . . ., β − 1. Fig. 9.7 shows the inverse of the embedding process illustrated in Fig. 9.5: the value of the stego coefficient determines which case of the equation is used for extraction. Note that the extraction is blind, so the original frame can be extracted
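The inverse mapping of Eq. (9.9) can be sketched in the same style as the embedding sketch above; together they show that the message value is recovered from the stego coefficient alone, given β.

import math

def extract_coeff(stego: float, beta: int) -> float:
    """Eq. (9.9): Msg = (beta / 2) * (stego - 2i/beta), with 2i/beta <= stego < 2(i+1)/beta."""
    i = int(math.floor(stego * beta / 2))
    i = min(max(i, 0), beta - 1)
    return (beta / 2.0) * (stego - 2.0 * i / beta)

# Round trip with the Fig. 9.5/9.7 example: beta = 6, stego = 0.8 lies in [2/3, 1), so i = 2
print(extract_coeff(0.8, 6))                   # ~0.4, the embedded message value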


FIGURE 9.7 An example of the extraction process using β = 6.

correctly from the stego frame using only the value of β. The detailed steps of the extraction process are listed in Algorithm 2.
Algorithm 2: The Extraction Process.
Input: Stego frames and β
Output: Reconstructed Original Frame
1. Read Stego frame → Stego frame.
2. Normalize the Stego frame.
3. Transform the normalized Stego frame into the DCT domain.
4. Apply the inverse of the parameterization function to extract the Msg frame from the coefficients:
   Msg = (β/2)(Stego_frame − 2i/β), where 2i/β ≤ Stego_frame < 2(i + 1)/β, i = 0, 1, 2, . . ., β − 1.
5. Apply a correction to the reconstructed Msg frame as follows:
   Msg(i, j) = Stego(i, j) if Stego(i, j) > threshold; Msg(i, j) otherwise.
6. Denormalize the Msg frame.
7. Set Msg frame → Reconstructed Original Frame.
8. Return the Reconstructed Original Frame.

9.4 Experimental results
We apply the proposed technique to the hall monitor frames, which contain the movement of only two persons, as shown in Fig. 9.8. The hall monitor sequence consists of 299 frames in


CIF (352 × 288) 4:2:0 YCbCr format. All frames are concatenated into a single file, with each frame consisting of the Y plane followed by the Cb and Cr planes. Fig. 9.9 shows samples of the equivalent privacy mask sequence, which selects only one person in the scene as private information.

FIGURE 9.8 Various samples of hall monitor frames.

FIGURE 9.9 Privacy mask video samples.

YUVTools is a set of tools that allows playing, converting, editing, and analyzing YUV video data in its raw formats. YUV format conversion is one of the tools that can be used


to convert one YUV/RGB/BMP file to another. The source and destination files of the conversion can be of any valid YUV format, RGB 4:4:4 format, or a sequence of BMP files. Table 9.1 shows the details of the hall monitor sequence after the AVI conversion with YUV format conversion, which was used for the proposed technique. The proposed technique was implemented and evaluated on an Intel(R) Core(TM) i7-4700MQ CPU at 2.40 GHz with 8 GB of RAM, running the 64-bit Windows 8 operating system. MATLAB® version 8.0.0.783 (R2012b) was used for the implementation.

Table 9.1 Full details of AVI hall monitor frames.
                      Original video     Privacy mask video
Source name:          hall-org.avi       hall-private.avi
Frame size:           288 H × 352 W      288 H × 352 W
Color format:         RGB                RGB
Source data type:     uint8              uint8
Display data type:    uint8              uint8
Source rate:          30 fps             30 fps
Frame count:          299                299
Bits per pixel:       24                 24

9.4.1 Data payload
The hiding capacity offered by a technique is defined as the maximum amount of information that can be hidden within a frame. It is usually measured in bits per pixel. However, for easier comparison, the capacity is computed here as a percentage of the cover frame size (with sizes measured in bytes):

Data Payload = (Max size of hidden data / size of frame) × 100 = ((288 × 352) / (288 × 352)) × 100 = 100%.   (9.10)

Since the proposed technique can hide one byte in each cover coefficient, this equation evaluates to 100% in the case of 8-bit cover frames.

9.4.2 Invisibility performance
This subsection describes the metric used to evaluate the invisibility performance of the proposed technique. It is essential to have a measure by which one can judge how much a frame is degraded after embedding. Usually, the invisibility of the hidden message is measured in terms of the peak signal-to-noise ratio (PSNR). PSNR is measured in decibels (dB) and can be computed by the equations

PSNR = 10 × log10( (max p(x, y))² / MSE ),   (9.11)

MSE = (1 / (X × Y)) Σ_{(x,y)} (p(x, y) − p′(x, y))²,   (9.12)

where p(x, y) represents the gray level of the pixel with coordinates (x, y) in the original frame, and p′(x, y) represents the same pixel in the distorted frame. Obviously, a high PSNR value indicates that the frame is less distorted. Fig. 9.10 shows the distortion of the resultant stego frames, compared with the original frames with removed private information details, in terms of the PSNR of each frame. Fig. 9.11 then illustrates the average PSNR value of each video for different values of the parameter β using QP = 27. Notice that the invisibility performance of the proposed technique increases for higher values of β.
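For reference, Eqs. (9.11)–(9.12) can be computed as in the following sketch, assuming 8-bit grayscale frames stored as NumPy arrays and taking the peak value as the maximum of the original frame, as in the text; the synthetic frames are placeholders.

import numpy as np

def psnr(original: np.ndarray, distorted: np.ndarray) -> float:
    """Eqs. (9.11)-(9.12): MSE over all pixels, then PSNR in dB using the peak of the original frame."""
    original = original.astype(float)
    distorted = distorted.astype(float)
    mse = np.mean((original - distorted) ** 2)
    peak = original.max()
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Example with a synthetic 288 x 352 frame and a slightly noisy copy
frame = np.random.randint(0, 256, size=(288, 352))
noisy = np.clip(frame + np.random.randint(-2, 3, size=frame.shape), 0, 255)
print(psnr(frame, noisy))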

FIGURE 9.10 PSNR values of stego frames using QP = 27.

On the other hand, Fig. 9.12 compares the PSNR values of each frame using β = 80 and different values of QP, and Fig. 9.13 illustrates the averages for this case


FIGURE 9.11 PSNR average for stego frames using QP = 27.

FIGURE 9.12 PSNR value of stego frames using β = 80.

study. Note how the invisibility performance of the proposed technique behaves for the various values of QP.


FIGURE 9.13 PSNR average for stego frames using β = 80.

Table 9.2 Descriptive Statistics.
           N total   Mean    Standard Deviation   Minimum   Median   Maximum
β = 70
  QP=24    299       33.86   0.0470               33.74     33.85    33.97
  QP=25    299       32.90   0.0541               32.77     32.90    33.04
  QP=26    299       32.59   0.0595               32.47     32.58    32.73
  QP=27    299       32.55   0.0537               32.42     32.54    32.69
  QP=28    299       38.15   0.0260               38.06     38.15    38.22
β = 80
  QP=24    299       34.13   0.0474               34.01     34.12    34.26
  QP=25    299       32.92   0.0564               32.80     32.91    33.06
  QP=26    299       32.43   0.0613               32.30     32.41    32.57
  QP=27    299       38.96   0.0243               38.89     38.96    39.03
  QP=28    299       38.86   0.0257               38.78     38.86    38.93
β = 90
  QP=24    299       34.36   0.0474               34.25     34.35    34.48
  QP=25    299       32.94   0.0581               32.81     32.93    33.09
  QP=26    299       32.52   0.0611               32.38     32.50    32.67
  QP=27    299       39.16   0.0255               39.09     39.16    39.22
  QP=28    299       39.39   0.0267               39.31     39.39    39.45
β = 100
  QP=24    299       34.06   0.0500               33.94     34.05    34.19
  QP=25    299       32.76   0.0576               32.63     32.74    32.90
  QP=26    299       32.45   0.0632               32.33     32.44    32.61
  QP=27    299       39.30   0.0275               39.21     39.30    39.36
  QP=28    299       39.34   0.0291               39.25     39.34    39.41

Table 9.2 shows descriptive statistics of the frame distortion, measured by PSNR, for different values of the parameters β and QP. In this case study the total number of frames is 299. The arithmetic mean (or simply the average) refers to the central value of a discrete set of numbers, specifically the sum of the values divided by the number


of values. In addition, the median is the numerical value separating the higher half of a data sample from the lower half. The standard deviation (SD) measures the amount of variation from the average PSNR. A low standard deviation indicates that the data points tend to be very close to the mean; a high standard deviation indicates that the PSNR values are spread out over a large range of values.

FIGURE 9.14 The mean PSNR values for stego frames.

Fig. 9.14 illustrates the mean PSNR values of the stego frames for QP values from 18 to 28 and β values from 30 to 100; the higher values of β with QP = 27 and 28 recorded high invisibility performance and low stego distortion. On the other hand, Fig. 9.15 shows the counts of the PSNR values, illustrating how these values are distributed over the frames for β = 80, 90, and 100 with QP = 27 and 28. Furthermore, since the extracted frame is only an estimate of the original one, the similarity between the original secret frame and the extracted one needs to be quantified. Here the normalized correlation (NC) coefficient is employed to indicate how much of the original frame was successfully extracted. It can be computed as follows:

Sim(x, x*) = [ (x · x*) / √(x · x*) ] ÷ [ (x · x) / √(x · x) ] × 100,   (9.13)

where x is the vector of the original frame components, and x* is the recovered vector. Obviously, a higher similarity means a better quality of the retrieved hidden frame.


FIGURE 9.15 PSNR counts for β = 80, 90, and 100.


FIGURE 9.16 The similarity of the extracted frames.

Throughout the hall monitor video, 299 frames are used as covers, and the equivalent 299 color frames are used as the secret message. We need to investigate the effect of the parameters β and QP on the fidelity of the extracted frames. The average similarity of the extracted frames in this joint study is shown in Fig. 9.16. Note that the frames extracted by the proposed technique provide an acceptable visual quality for the retrieved original frames. Finally, Fig. 9.17 gives a closer look at the results for stego frames embedded with original frames of the same size. The experiment used β = 30, 50, and 80 with QP = 19, 23, and 27, respectively. Simple visual inspection of the results shows that the quality of the stego frames remains very high (over 35 dB) while an acceptable visual quality of the retrieved original frames is maintained.

Conclusion
In this chapter, we introduced a data hiding technique that connects the deidentification and reidentification approaches. Data hiding is commonly referred to as steganography, the art and science of embedding secret data within other, seemingly innocent data, the carrier; the carrier can take the form of any medium used to convey information. The proposed hiding technique embeds the video captured by the surveillance camera into another, processed video from which the private information has been removed. The hiding process is carried out in the discrete cosine transform (DCT) domain


FIGURE 9.17 Sample results obtained by applying the proposed method.

of the cover. The hiding function is based on a parameter β, which divides the range of coefficient values into nonoverlapping regions. In addition, the cover video undergoes H.264 compression as a preprocessing step for spatial redundancy reduction. The extraction can be done blindly: the cover video is not needed to retrieve the hidden video. Experimental results showed that the proposed technique can successfully hide an original video, as large as the cover itself, into the same video with the private information details removed. Simple visual inspection of the results shows that the quality of the stego frames remains very high (over 35 dB) while an acceptable visual quality of the retrieved original frames is maintained. Some issues remain open to future exploration, such as identifying individuals, information assimilation, and domain-data transformation modeling.

References [1] David Lyon, Surveillance technology and surveillance society, Modernity and Technology (2003) 161–183.

186

Digital Media Steganography

[2] Su-yu Wang, Lan-sun Shen, Intelligent visual surveillance technology: a survey, Journal of Image and Graphics 9 (2007). [3] Arun Hampapur, Lisa Brown, Jonathan Connell, Sharat Pankanti, Andrew Senior, Yingli Tian, Smart surveillance: applications, technologies and implications, in: Fourth International Conference on Information, Communications and Signal and the Fourth Pacific Rim Conference on Multimedia, vol. 2, IEEE, 2003, pp. 1133–1138. [4] David D. Ferris, Nicholas C. Currie, Survey of current technologies for through-the-wall surveillance (TWS), in: Sensors, C3I, Information, and Training Technologies for Law Enforcement, vol. 3577, International Society for Optics and Photonics, 1999, pp. 62–73. [5] Michelle Cayford, Wolter Pieters, The effectiveness of surveillance technology: what intelligence officials are saying, The Information Society 34 (2) (2018) 88–103. [6] Nouf Al-Juaid, Adnan Gutub, Combining RSA and audio steganography on personal computers for enhancing security, SN Applied Sciences 1 (8) (2019) 830. [7] Anthony C. Caputo, Digital Video Surveillance and Security, Butterworth-Heinemann, 2014. [8] Arjun Prakash, Santosh Verma, Shivam Vijay, Real-time video surveillance for safety line and pedestrian breach detection in a dynamic environment, in: Advances in Signal Processing and Communication, Springer, 2019, pp. 233–246. [9] Rakesh Agrawal, Ramakrishnan Srikant, Privacy-preserving data mining, in: ACM Sigmod Record, vol. 29, 2000, pp. 439–450. [10] Robert S. Laufer, Maxine Wolfe, Privacy as a concept and a social issue: a multidimensional developmental theory, Journal of Social Issues 33 (3) (1977) 22–42. [11] Vassilios S. Verykios, Elisa Bertino, Igor Nai Fovino, Loredana Parasiliti Provenza, Yucel Saygin, Yannis Theodoridis, State-of-the-art in privacy preserving data mining, ACM Sigmod Record 33 (1) (2004) 50–57. [12] Slobodan Ribaric, Aladdin Ariyaeeinia, Nikola Pavesic, De-identification for privacy protection in multimedia content: a survey, Signal Processing: Image Communication 47 (2016) 131–151. [13] Wei Zhang, Sen-Ching S. Cheung, Minghua Chen, Hiding privacy information in video surveillance system, in: IEEE International Conference on Image Processing 2005, vol. 3, 2005, pp. II–868. [14] Adnan Gutub, Maimoona Al-Ghamdi, Image based steganography to facilitate improving countingbased secret sharing, 3D Research 10 (1) (2019) 6. [15] Adnan Gutub, Nouf Al-Juaid, Multi-bits stego-system for hiding text in multimedia images based on user security priority, Journal of Computer Hardware Engineering 1 (2) (2018) 1–9. [16] Walaa Abu-Marie, Adnan Gutub, Hussein Abu-Mansour, Image based steganography using truth table based and determinate array on RGB indicator, International Journal of Signal & Image Processing 1 (3) (2010). [17] Yi-Chun Liao, Chung-Han Chen, Timothy K. Shih, Nick C. Tang, Data hiding in video using adaptive LSB, in: 2009 Joint Conferences on Pervasive Computing (JCPC), IEEE, 2009, pp. 185–190. [18] Ashish T. Bhole, Rachna Patel, Steganography over video file using Random Byte Hiding and LSB technique, in: 2012 IEEE International Conference on Computational Intelligence and Computing Research, IEEE, 2012, pp. 1–6. [19] Jong Jin Chae, B.S. Manjunath, Data Hiding in Video, in: Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348), vol. 1, IEEE, 1999, pp. 311–315. [20] Mylène C.Q. Farias, Marco Carli, Sanjit K. 
Mitra, Objective video quality metric based on data hiding, IEEE Transactions on Consumer Electronics 51 (3) (2005) 983–992. [21] Jehan Wickramasuriya, Mahesh Datt, Sharad Mehrotra, Nalini Venkatasubramanian, Privacy protecting data collection in media spaces, in: The 12th Annual ACM International Conference on Multimedia, ACM, 2004, pp. 48–55. [22] S.-C.S. Cheung, Mahalingam Vijay Venkatesh, J.K. Paruchuri, Jian Zhao, Thinh Nguyen, Protecting and managing privacy information in video surveillance systems, in: Protecting Privacy in Video Surveillance, Springer, 2009, pp. 11–33. [23] S. Cheung Sen-ching, Jithendra K. Paruchuri, Thinh P. Nguyen, Managing privacy data in pervasive camera networks, in: The 15th IEEE International Conference on Image Processing 2008, 2008, pp. 1676–1679. [24] Massimo Piccardi, Background subtraction techniques: a review, in: IEEE International Conference on Systems, Man and Cybernetics 2004 (IEEE Cat. No. 04CH37583), vol. 4, 2004, pp. 3099–3104.


[25] Iain E. Richardson, The H. 264 Advanced Video Compression Standard, John Wiley & Sons, 2011. [26] Iain E. Richardson, H. 264 and MPEG-4 Video Compression: Video Coding for Next-Generation Multimedia, John Wiley & Sons, 2004. [27] Lajos Hanzo, Peter Cherriman, Jurgen Streit, Video Compression and Communications: From Basics to H. 261, H. 263, H. 264, MPEG4 for DVB and HSDPA-Style Adaptive Turbo-Transceivers, John Wiley & Sons, 2007.

10 Reversible steganography techniques: A survey Tzu-Chuen Lu, Thanh Nhan Vo Chaoyang University of Technology, Department of Information Management, Taichung, Taiwan, R.O.C.

10.1 Introduction
Steganography is the technique of embedding secret data into another multimedia type, such as an image, video, or audio, to generate the stego-media. The purposes of steganography include secret sharing, copyright protection, image authentication, tamper detection, and so on [1–6]. In secret sharing, a sender tries to send a secret message to a receiver through the Internet. However, the Internet environment is insecure: an illegal party might monitor the network to steal the transmitted message. Cryptography is one method to solve this problem: the secret data are encrypted by a cryptographic method to generate the ciphertext. However, the ciphertext is unreadable, which may draw the attention of an illegal party who then tries to decrypt it. Scholars therefore proposed another technology, called steganography, which conceals the secret data in a cover media to generate the stego-media through a hiding procedure. The sender transmits the stego-media to the receiver through the Internet. The stego-media is so similar to the original media that an illegal party pays no attention to it. After receiving the stego-media, the receiver extracts the secret data from the stego-media using an extraction procedure. Because of page limitations, in this chapter we consider the image as the main cover media [7,8]. The word steganography combines the two Greek words "steganos" and "graphein", which jointly mean "covered writing". In a general steganography scheme a sender embeds some secret data into a host image to generate the stego-image. The benefit of steganography over cryptography alone is that the stego-image is almost the same as the host image and thus does not attract attention from an illegal user. In addition, encrypted messages tend to arouse interest, no matter how unbreakable they are; furthermore, in some countries encryption may be illegal. Therefore the steganography technique is more practical compared to cryptography [9–13].

10.1.1 Reversible Steganography Scheme (RSS)
A stego-image is useless after the secret data are extracted, since the media is then different from the original. To solve this problem, the reversible steganography scheme (RSS), also known as the lossless steganography method, has been proposed [14–17].


FIGURE 10.1 Embedding procedure of RSS.

FIGURE 10.2 Extraction and recovery procedure of RSS.

An RSS uses image features to hide the secret data in the host image so that the image can later be recovered. An RSS contains three procedures:
• Embedding procedure: the sender conceals the secret data and the image feature in the cover (or host) image to generate the stego-image.
• Extraction procedure: the receiver extracts the secret message from the stego-image.
• Recovery procedure: the receiver restores the stego-image to its original state.
In the embedding procedure of RSS the recovery information, such as image features, is extracted and concatenated with the secret message, and the combined stream is hidden in the host image to generate the stego-image. The basic concept of RSS is shown in Fig. 10.1. In the extraction procedure the concealed information is extracted from the stego-image, which can then be restored to its original status. The basic concept of the extraction and recovery procedure of RSS is shown in Fig. 10.2: the secret message and the image features are extracted from the stego-image, and the image feature is used to recover the host image. The reversibility of RSS makes it particularly suitable for processing military or medical images. Because of its practicability and importance, it has attracted the interest of many scholars and experts, and many methods based on this scheme have been proposed.


FIGURE 10.3 The categories of RSS.

10.1.2 Measurements of RSS
Three measurements are used to judge a steganography method:
• Hiding payload: how much secret data can be embedded into the cover image.
• Image quality: the visual quality of the stego-image.
• Security: only legal parties can extract the secret data.
A steganography scheme tries to hide as much data as possible in an image. However, a greater hiding payload causes more image distortion. Hence achieving an acceptable balance between image quality and information payload is an important research topic in the field of steganography.

10.1.3 Categories of RSS
According to the type of technique used, RSS can be further divided into several categories, such as difference expansion (DE), histogram-shifting (HS), pixel-value-ordering (PVO), dual-image-based, interpolation-based, and so on. Fig. 10.3 shows the various categories of RSS. The following sections describe these schemes in more detail.

10.2 Difference Expansion (DE) schemes
Tian proposed a DE-based scheme in 2003 that can hide a large amount of data and can keep the distortion of the stego-image within a certain range [18–21]. Two adjacent pixels are used to hide one secret bit, so the average hiding payload is 1/2 = 0.5 bit per pixel (bpp). However, the scheme requires additional information to recover the original image, and hence the pure hiding payload is less than 0.5 bpp. To increase the payload, the sender can perform multiple hiding operations on the stego-image. This kind of method, called multilevel information hiding, decreases the image quality of the stego-image as more hiding levels are performed.


10.2.1 Embedding procedure of Tian's method
In Tian's scheme, two adjacent pixels are used to hide one bit. The secret data is concealed in the difference between the two pixels. Suppose that the host pixels are p1 = 20 and p2 = 15, where 0 ≤ p1, p2 ≤ 255, and the secret bit is s = 1. First, Tian computes the difference between p1 and p2 as d = p1 − p2 = 20 − 15 = 5 and the average value of p1 and p2 as α = ⌊(p1 + p2)/2⌋ = ⌊(20 + 15)/2⌋ = 17. The secret bit is embedded into the difference by computing d′ = 2 × d + s = 2 × 5 + 1 = 11. The stego-pixels are then calculated as p1′ = α + ⌊(d′ + 1)/2⌋ = 17 + ⌊12/2⌋ = 23 and p2′ = α − ⌊d′/2⌋ = 17 − ⌊11/2⌋ = 12.

10.2.2 Extraction procedure of Tian's method
For the extraction procedure, the expanded difference is computed as d′ = p1′ − p2′ = 23 − 12 = 11, and the average value as α = ⌊(p1′ + p2′)/2⌋ = ⌊(23 + 12)/2⌋ = 17. The original difference is d = ⌊d′/2⌋ = 5, and the secret bit is extracted as s = d′ − 2 × ⌊d′/2⌋ = 11 − 10 = 1. The original pixels are subsequently recovered as p1 = α + ⌊(d + 1)/2⌋ = 17 + ⌊6/2⌋ = 20 and p2 = α − ⌊d/2⌋ = 17 − ⌊5/2⌋ = 15.
In this example, if the pixels satisfy 0 ≤ α − ⌊(2 × d + s)/2⌋ ≤ 255 and 0 ≤ α + ⌊(2 × d + s + 1)/2⌋ ≤ 255, then the stego-pixels have no underflow or overflow problem. Such a pixel pair is called expandable; otherwise, it is nonexpandable. Tian used an alternative strategy, called changeable pixels, to hide the secret data. With the changeable strategy, the hiding equation becomes d′ = 2 × ⌊d/2⌋ + s. If the pixels satisfy 0 ≤ α − ⌊d′/2⌋ ≤ 255 and 0 ≤ α + ⌊(d′ + 1)/2⌋ ≤ 255, then the pair is changeable, and one secret bit can be concealed in it. In this case the receiver cannot tell whether the stego-pixels were generated by d′ = 2 × d + s or by d′ = 2 × ⌊d/2⌋ + s. Therefore the sender needs to generate a location map that records the strategy used by each pixel pair: if the pair is expandable, then the sender records 1 in the location map; otherwise, the sender records 0. Furthermore, if the pair is changeable, then the sender also keeps the original least significant bit (OLSB) of the difference d for the recovery procedure. The location map and the OLSBs are concatenated with the secret data to produce the hidden stream. To reduce the length of the stream, compression methods are used to narrow down its size, and the compressed stream, instead of the original secret data, is concealed in the cover image.
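To make the arithmetic of the embedding and extraction steps concrete, here is a minimal Python sketch for a single expandable pixel pair. The function names and the simple range check are illustrative additions, not part of Tian's original formulation, and the location map and changeable-pixel handling are omitted.

```python
def de_embed(p1, p2, s):
    """Hide one secret bit s in the pixel pair (p1, p2) by difference expansion."""
    d = p1 - p2                      # difference
    a = (p1 + p2) // 2               # integer average
    d_new = 2 * d + s                # expanded difference carrying the bit
    q1 = a + (d_new + 1) // 2        # stego-pixels
    q2 = a - d_new // 2
    if not (0 <= q1 <= 255 and 0 <= q2 <= 255):
        raise ValueError("pair is not expandable")
    return q1, q2

def de_extract(q1, q2):
    """Recover the secret bit and the original pixel pair."""
    d_new = q1 - q2
    a = (q1 + q2) // 2
    s = d_new - 2 * (d_new // 2)     # LSB of the expanded difference
    d = d_new // 2                   # original difference
    return s, a + (d + 1) // 2, a - d // 2

print(de_embed(20, 15, 1))           # (23, 12), as in the example above
print(de_extract(23, 12))            # (1, 20, 15)
```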

10.2.3 Embedding procedure of Alattar's method
To improve the hiding capacity of Tian's scheme, Alattar [22] used a reversible integer transform to process a vector of pixels and conceal secret data in it. For example, suppose a vector contains four pixels $P = \{p_1, p_2, p_3, p_4\} = \{8, 12, 15, 12\}$, and three bits are to be concealed into the vector, where the secret bits are $s_1 = 1$, $s_2 = 0$, and $s_3 = 1$. In the first step, Alattar uses a reversible integer transformation to compute the weighted average of the vector,
\[
v_1 = \left\lfloor \frac{a_1 p_1 + a_2 p_2 + a_3 p_3 + a_4 p_4}{a_1 + a_2 + a_3 + a_4} \right\rfloor = \left\lfloor \frac{1 \times 8 + 2 \times 12 + 2 \times 15 + 1 \times 12}{1 + 2 + 2 + 1} \right\rfloor = 12,
\]
and the differences $v_2 = p_2 - p_1 = 4$, $v_3 = p_3 - p_1 = 7$, and $v_4 = p_4 - p_1 = 4$. The variables $a_1, a_2, a_3, a_4$ in the equation for $v_1$ are four constants; in this example, they are set to $a_1 = 1$, $a_2 = 2$, $a_3 = 2$, and $a_4 = 1$. The scheme then keeps the weighted average and conceals the secret bits in the differences as follows:
\[
\hat{v}_1 = v_1 = 12,  \tag{10.1}
\]
\[
\hat{v}_2 = 2 \times v_2 + s_1 = 2 \times 4 + 1 = 9,  \tag{10.2}
\]
\[
\hat{v}_3 = 2 \times v_3 + s_2 = 2 \times 7 + 0 = 14,  \tag{10.3}
\]
\[
\hat{v}_4 = 2 \times v_4 + s_3 = 2 \times 4 + 1 = 9.  \tag{10.4}
\]
Finally, the stego-pixels are calculated by the following equations:
\[
p_1' = \hat{v}_1 - \left\lfloor \frac{a_2 \hat{v}_2 + a_3 \hat{v}_3 + a_4 \hat{v}_4}{a_1 + a_2 + a_3 + a_4} \right\rfloor = 12 - \left\lfloor \frac{2 \times 9 + 2 \times 14 + 1 \times 9}{1 + 2 + 2 + 1} \right\rfloor = 3,  \tag{10.5}
\]
\[
p_2' = \hat{v}_2 + \hat{v}_1 - \left\lfloor \frac{a_2 \hat{v}_2 + a_3 \hat{v}_3 + a_4 \hat{v}_4}{a_1 + a_2 + a_3 + a_4} \right\rfloor = 9 + 12 - 9 = 12,  \tag{10.6}
\]
\[
p_3' = \hat{v}_3 + \hat{v}_1 - \left\lfloor \frac{a_2 \hat{v}_2 + a_3 \hat{v}_3 + a_4 \hat{v}_4}{a_1 + a_2 + a_3 + a_4} \right\rfloor = 14 + 12 - 9 = 17,  \tag{10.7}
\]
\[
p_4' = \hat{v}_4 + \hat{v}_1 - \left\lfloor \frac{a_2 \hat{v}_2 + a_3 \hat{v}_3 + a_4 \hat{v}_4}{a_1 + a_2 + a_3 + a_4} \right\rfloor = 9 + 12 - 9 = 12.  \tag{10.8}
\]

10.2.4 Extraction procedure of Alattar's method
In the extraction process, the receiver obtains the stego-pixels $p_1'$, $p_2'$, $p_3'$, and $p_4'$. First, the weighted average and the differences are computed as follows:
\[
\hat{v}_1 = \left\lfloor \frac{a_1 p_1' + a_2 p_2' + a_3 p_3' + a_4 p_4'}{a_1 + a_2 + a_3 + a_4} \right\rfloor = \left\lfloor \frac{1 \times 3 + 2 \times 12 + 2 \times 17 + 1 \times 12}{1 + 2 + 2 + 1} \right\rfloor = 12,  \tag{10.9}
\]
\[
\hat{v}_2 = p_2' - p_1' = 12 - 3 = 9,  \tag{10.10}
\]
\[
\hat{v}_3 = p_3' - p_1' = 17 - 3 = 14,  \tag{10.11}
\]
\[
\hat{v}_4 = p_4' - p_1' = 12 - 3 = 9.  \tag{10.12}
\]
Next, the scheme extracts the hidden bits from the expanded differences:
\[
s_1 = \hat{v}_2 - 2 \left\lfloor \hat{v}_2 / 2 \right\rfloor = 9 - 8 = 1,  \tag{10.13}
\]
\[
s_2 = \hat{v}_3 - 2 \left\lfloor \hat{v}_3 / 2 \right\rfloor = 14 - 14 = 0,  \tag{10.14}
\]
\[
s_3 = \hat{v}_4 - 2 \left\lfloor \hat{v}_4 / 2 \right\rfloor = 9 - 8 = 1.  \tag{10.15}
\]
The original average and differences are then computed as follows:
\[
v_1 = \hat{v}_1 = 12,  \tag{10.16}
\]
\[
v_2 = \left\lfloor \hat{v}_2 / 2 \right\rfloor = \left\lfloor 9/2 \right\rfloor = 4,  \tag{10.17}
\]
\[
v_3 = \left\lfloor \hat{v}_3 / 2 \right\rfloor = \left\lfloor 14/2 \right\rfloor = 7,  \tag{10.18}
\]
\[
v_4 = \left\lfloor \hat{v}_4 / 2 \right\rfloor = \left\lfloor 9/2 \right\rfloor = 4.  \tag{10.19}
\]

10.2.5 Recovery procedure of Alattar's method
The original pixel values are subsequently recovered by:
\[
p_1 = v_1 - \left\lfloor \frac{a_2 v_2 + a_3 v_3 + a_4 v_4}{a_1 + a_2 + a_3 + a_4} \right\rfloor = 12 - \left\lfloor \frac{2 \times 4 + 2 \times 7 + 1 \times 4}{1 + 2 + 2 + 1} \right\rfloor = 8,  \tag{10.20}
\]
\[
p_2 = v_2 + p_1 = 4 + 8 = 12,  \tag{10.21}
\]
\[
p_3 = v_3 + p_1 = 7 + 8 = 15,  \tag{10.22}
\]
\[
p_4 = v_4 + p_1 = 4 + 8 = 12.  \tag{10.23}
\]
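The generalized integer transform above can be sketched in a few lines of Python for the quad example with weights (1, 2, 2, 1). This is a simplified illustration that reproduces the worked example only; the location map, the expandable/changeable tests, and overflow handling are omitted, and the function names are hypothetical.

```python
A = (1, 2, 2, 1)                         # weights a1..a4 used in the example

def alattar_embed(p, s):
    """Embed 3 bits s = (s1, s2, s3) into a quad p = (p1, p2, p3, p4)."""
    v1 = sum(a * x for a, x in zip(A, p)) // sum(A)   # weighted average
    v = [p[i] - p[0] for i in (1, 2, 3)]              # differences v2..v4
    vh = [2 * vi + si for vi, si in zip(v, s)]        # expanded differences
    t = sum(a * x for a, x in zip(A[1:], vh)) // sum(A)
    q1 = v1 - t                                       # stego p1'
    return (q1, vh[0] + q1, vh[1] + q1, vh[2] + q1)

def alattar_extract(q):
    """Recover the 3 bits and the original quad from the stego quad."""
    v1 = sum(a * x for a, x in zip(A, q)) // sum(A)
    vh = [q[i] - q[0] for i in (1, 2, 3)]
    s = tuple(x - 2 * (x // 2) for x in vh)           # embedded bits (LSBs)
    v = [x // 2 for x in vh]
    t = sum(a * x for a, x in zip(A[1:], v)) // sum(A)
    p1 = v1 - t
    return s, (p1, v[0] + p1, v[1] + p1, v[2] + p1)

print(alattar_embed((8, 12, 15, 12), (1, 0, 1)))      # (3, 12, 17, 12)
print(alattar_extract((3, 12, 17, 12)))               # ((1, 0, 1), (8, 12, 15, 12))
```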

The hiding payload of Alattar's method is 3/4 = 0.75 bpp, which is higher than that of Tian's scheme. As with Tian's scheme, Alattar's scheme also requires a location map to record whether each vector is expandable or changeable. The main weakness of the DE-based schemes is their low image quality: the average image quality of Tian's scheme is about 37 dB at a hiding payload of 0.46 bpp, whereas the average image quality of Alattar's scheme is about 37 dB at a hiding payload of 1.5 bpp. To improve the image quality of DE-based schemes, Ni et al. proposed the HS scheme.


FIGURE 10.4 The histogram of the test image Lena of size 512 × 512.

10.3 Histogram-Shifting (HS) schemes
Ni et al. [23] proposed the concept of HS in 2006. First, the number of occurrences of each pixel value in the host image is counted and used to plot a histogram. The pixel value with the most occurrences is determined from the histogram and denoted as the peak point, whereas a pixel value with zero occurrences is called the zero point. For example, Fig. 10.4 is the histogram of the image Lena of size 512 × 512. In this histogram the occurrence number of pixel value 154 is 2,890, which is the peak point of the histogram (denoted as point A). The occurrence number of pixel value 246 is 0, which is the zero point (denoted as point B).

10.3.1 Embedding procedure of HS
To generate a space for hiding the secret data, the occurrences of bin A + 1 are moved to bin A + 2 so that the count of A + 1 becomes zero. The shifting operator adds 1 to every pixel whose value lies in the range [A + 1, B − 1]. After shifting, the occurrences of A + 1 move to A + 2, those of A + 2 move to A + 3, and so on; the values in the range [A + 1, B − 1] thus move to the range [A + 2, B]. Fig. 10.5 shows the histogram of Lena after shifting; the count of bin A + 1 has become 0.
If a shifted pixel value is equal to the peak point A, then the pixel is embeddable. For an embeddable pixel, if the secret bit is 0, then the pixel value need not be modified, and the stego-pixel equals the original pixel; otherwise, the stego-pixel is A + 1. If a pixel value is not equal to A, then the pixel is nonembeddable, and the stego-pixel equals the shifted pixel value.
Let us consider an example to demonstrate the HS scheme. Fig. 10.6(A) is a 5 × 5 image. As the first step, the number of occurrences of each pixel intensity in Fig. 10.6(A) is counted, and the corresponding histogram is plotted, as shown in Fig. 10.7. In this example the peak point A is 13 with occurrence frequency 8. The zero point B is 16 with


FIGURE 10.5 The histogram of the image after shifting.

FIGURE 10.6 An example of the HS scheme.

occurrence frequency 0. Thus 8 secret bits can be concealed into the image. The pixels that fall within the range [14, 15] are increased by 1. This procedure is known as shifting. Fig. 10.6(B) presents the shifted image. The pixel values equal to 14 are changed to 15 and pixel values equal to 15 are changed to 16. Let the secret data be 11001101. If the pixel is equal to the peak point A = 13, then the pixel is embeddable. If the secret data is equal to 0, then the stego-pixel is the same as the original value. If the secret data is equal to 1, then the stego-pixel is equal to A + 1 = 14. The final stego-image is shown in Fig. 10.6(C).


FIGURE 10.7 The histogram of Fig. 10.6(A).

10.3.2 Extraction and recovery procedures of HS The stego-image and the peak and zero points are all sent to the receiver. When the receiver receives all the information, the extraction and the recovery procedures are used to extract the secret message and recover the original image. If the stego-pixel is equal to A (or A + 1), then one secret bit 0 (or 1) is extracted. Following the same example, the peak point is A = 13, and the zero point is B = 16. If the stego-pixel is equal to 13 (or 14), then the secret bit 0 (or 1) is extracted. The extracted secret data are 11001101. The next step is shifting the pixel so that it returns to its original value. If the value lies within the range [A + 1, B], then 1 is subtracted from the value. In the example the values of 14 are changed to 13, and the values of 15 and 16 are changed to 14 and 15, respectively. The final recovered image is shown in Fig. 10.6(D).
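A compact Python sketch of the HS embedding and extraction described above is given below. It assumes a single peak point A and zero point B with A < B, pads unused embeddable pixels with 0 bits, and ignores the location map for boundary pixels; these simplifications and the function names are assumptions of the sketch, not part of Ni et al.'s specification.

```python
def hs_embed(pixels, bits, A, B):
    """Histogram-shifting embedding for one peak/zero pair (assumes A < B)."""
    out, it = [], iter(bits)
    for p in pixels:
        if A < p < B:                 # shift bins A+1 .. B-1 up by one
            out.append(p + 1)
        elif p == A:                  # embeddable pixel
            out.append(p + next(it, 0))
        else:
            out.append(p)
    return out

def hs_extract(stego, A, B):
    """Recover the hidden bits and the original pixels."""
    bits, out = [], []
    for p in stego:
        if p == A:
            bits.append(0); out.append(A)
        elif p == A + 1:
            bits.append(1); out.append(A)
        elif A + 1 < p <= B:          # undo the shift
            out.append(p - 1)
        else:
            out.append(p)
    return bits, out

stego = hs_embed([13, 14, 13, 15, 13], [1, 1, 0], A=13, B=16)
print(stego)                          # [14, 15, 14, 16, 13]
print(hs_extract(stego, A=13, B=16))  # ([1, 1, 0], [13, 14, 13, 15, 13])
```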

10.3.3 Extra information of HS
The main drawback of the HS scheme is the overflow and underflow problem. If a pixel equal to 255 or 0 needs to be shifted, then it would become 256 (overflow) or −1 (underflow). To avoid this, pixels equal to 0 are changed to 1, those equal to 255 are changed to 254, and a location map records whether each such pixel has been changed or not. The location map is compressed and concatenated with the secret data to be hidden in the host image. The pure payload of the HS scheme is obtained by subtracting the size of this extra information from the hiding capacity; the extra information includes the location map, the peak point, and the zero point. The hiding payload of the HS scheme is limited by the number of pixels at the peak point. To increase the number of embedded bits, many researchers use multiple pairs of peak and zero points; Ni et al. used two pairs of peak and zero points to increase the hiding payload.

10.3.4 Experimental results of HS
Ni et al. used several test images to evaluate the performance of the HS scheme. The test images are shown in Fig. 10.8, and the experimental results are reported in Table 10.1.


FIGURE 10.8 The test images used to test the performance of the HS scheme.

Table 10.1 The experimental results of Ni et al.'s scheme.

Image      PSNR (dB)   Hiding Payload   bpp
Airplane   48.3        16,171           0.062
Baboon     48.2        5,421            0.021
Boat       48.2        7,301            0.028
House      48.3        14,310           0.055
Lena       48.2        5,460            0.022
Tiffany    48.2        8,782            0.034

The average peak signal-to-noise ratio (PSNR) of the HS scheme is approximately 48 dB. The PSNR value is calculated using the formula
\[
PSNR = 10 \times \log_{10} \left( \frac{255^2}{\frac{1}{w \times h} \sum_{i=1}^{w} \sum_{j=1}^{h} \left( p_{i,j} - p'_{i,j} \right)^2} \right),  \tag{10.24}
\]


and the payload is computed by
\[
bpp = \frac{\text{total number of hidden bits}}{\text{image size}} \ \text{(bit per pixel)}.  \tag{10.25}
\]

For example, the first peak point of Lena is 154 with occurrence frequency 2,890, and the second peak point is 157 with occurrence frequency 2,868. The total number of hidden bits is 2,890 + 2,868 = 5,758 bits. The image size of Lena is 512 × 512, so the payload is 5758/(512 × 512) ≈ 0.022 bpp. The hiding payload of the HS-based scheme is limited by the total number of peak-point pixels, and hence its payload is very low. Li et al. [24] enhanced the HS-based method by proposing a pixel-value-ordering (PVO) RSS.
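Eqs. (10.24) and (10.25) translate directly into a short NumPy helper; the function names here are illustrative.

```python
import numpy as np

def psnr(cover, stego):
    """PSNR in dB between two grayscale images, Eq. (10.24)."""
    mse = np.mean((cover.astype(np.float64) - stego.astype(np.float64)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse)

def bpp(total_hidden_bits, width, height):
    """Hiding payload in bits per pixel, Eq. (10.25)."""
    return total_hidden_bits / (width * height)

print(bpp(2890 + 2868, 512, 512))    # about 0.022 bpp, as for Lena above
```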

10.4 Pixel-Value-Ordering (PVO) schemes
In Li et al.'s scheme [24] an image is divided into several subsets, and each subset is a candidate set in which a peak is sought for hiding the secret data. The scheme sorts the pixel values in the set in ascending order. The first and second smallest (or largest) pixel values are used to compute a difference. If the difference is equal to 0 or 1, then the smallest (or largest) pixel value is embeddable. Otherwise, the smallest (or largest) pixel value needs to be shifted so that it can later be recovered.

10.4.1 Embedding procedure of PVO
Let us consider an example. Fig. 10.9(A) is the original host image. The image is divided into several sets of size 1 × 5. The first set is p = {12, 15, 13, 11, 11}. The scheme sorts the set in ascending order to get p̂ = {11, 11, 12, 13, 15}, as shown in Fig. 10.9(B). The first and second smallest pixels are 11 and 11, and the first and second largest pixels are 15 and 13. The difference between the first and second smallest pixels is dmin = 11 − 11 = 0, and the difference between the first and second largest pixels is dmax = 15 − 13 = 2. Because dmin is equal to 0, the smallest pixel 11 is embeddable. Suppose that the secret bit is s = 1. The stego-pixel is computed as p̂1′ = p̂1 − s = 11 − 1 = 10. The largest pixel is nonembeddable, but it still needs to be shifted: p̂5′ = p̂5 + 1 = 15 + 1 = 16. Finally, the scheme rearranges the stego-pixels to their original locations to get the stego-set shown in Fig. 10.9(C). In this example, only one secret bit is concealed in the set.
The optimal situation for PVO is that both the smallest and largest pixels are embeddable; the hiding capacity of the PVO-based scheme is thus up to twice that of the HS-based scheme. Furthermore, the middle pixels in a PVO block need not be shifted for recovery, so the image quality of PVO-based schemes is better than that of HS-based schemes. In Li et al.'s PVO scheme a pixel is embeddable only when the difference is equal to 0 or 1, and thus the embeddable range is very small.
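The embedding decision for one PVO block can be sketched as follows. The sketch reproduces only the embedding side of the example above (difference 0 or 1 means embeddable, otherwise shift); extraction, the location map, and any tie-handling beyond scan order are omitted and would be needed in a full implementation.

```python
def pvo_embed_block(block, bits):
    """Embedding decision for one PVO block, following the example above."""
    order = sorted(range(len(block)), key=lambda i: block[i])   # ascending order
    lo1, lo2 = order[0], order[1]            # smallest, second smallest (positions)
    hi1, hi2 = order[-1], order[-2]          # largest, second largest (positions)
    out = list(block)
    d_min = block[lo2] - block[lo1]
    d_max = block[hi1] - block[hi2]
    if d_min in (0, 1):                      # embed one bit into the minimum
        out[lo1] -= bits.pop(0)
    else:                                    # shift the minimum for recovery
        out[lo1] -= 1
    if d_max in (0, 1):                      # embed one bit into the maximum
        out[hi1] += bits.pop(0)
    else:
        out[hi1] += 1
    return out

print(pvo_embed_block([12, 15, 13, 11, 11], [1]))   # [12, 16, 13, 10, 11]
```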


FIGURE 10.9 The hiding example of Li et al.'s PVO scheme.

10.4.2 Embedding procedure of IPVO
Peng et al. [24] proposed an improved pixel-value-ordering (IPVO) method to enlarge the embeddable range of Li et al.'s PVO scheme. In this scheme a bimodal strategy is used to expand the embeddable range, and the original location of each pixel is a factor that affects the hiding strategy. Let σ denote the sorted order, let u and v be the smaller and larger of the original indices of the two largest pixels, respectively, and let s and t be the smaller and larger of the original indices of the two smallest pixels. Fig. 10.10 shows an example of Peng et al.'s scheme. The first set of the image is p = {3, 2, 15, 19}, which is sorted to p̂ = {2, 3, 15, 19}. The sorted indices record the original location of each pixel in p̂: σ(1) = 2, σ(2) = 1, σ(3) = 3, and σ(4) = 4. In this example, s = min(σ(1), σ(2)) = 1, t = max(σ(1), σ(2)) = 2, u = min(σ(3), σ(4)) = 3, and v = max(σ(3), σ(4)) = 4. The scheme then computes dmax = pu − pv = p3 − p4 = 15 − 19 = −4 and dmin = ps − pt = p1 − p2 = 3 − 2 = 1. If dmax (respectively, dmin) is equal to 1 or 0, then the largest (respectively, smallest) pixel is embeddable; otherwise, it is nonembeddable and must be shifted. In this case the largest pixel is nonembeddable and is shifted to p4′ = p4 + 1 = 19 + 1 = 20, whereas the smallest pixel is embeddable; for a secret bit of 1, it becomes p2′ = p2 − 1 = 2 − 1 = 1. More detail is provided in Fig. 10.10. In this example, two secret bits are concealed in the image.
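The index bookkeeping of IPVO is easy to get wrong, so the following sketch computes dmax, dmin, and the modified block for the example above. It follows the description in this section (indices are 0-based in the code) and reproduces only the embedding decision; it is not a complete implementation of Peng et al.'s method.

```python
def ipvo_block(p, bit):
    """IPVO quantities and embedding decision for one block (0-based indices)."""
    order = sorted(range(len(p)), key=lambda i: p[i])    # sigma, 0-based
    s, t = min(order[0], order[1]), max(order[0], order[1])
    u, v = min(order[-2], order[-1]), max(order[-2], order[-1])
    d_min = p[s] - p[t]
    d_max = p[u] - p[v]
    out = list(p)
    i_min, i_max = order[0], order[-1]                   # positions of min and max
    out[i_min] += -bit if d_min in (0, 1) else -1        # embed into or shift the minimum
    out[i_max] += bit if d_max in (0, 1) else 1          # embed into or shift the maximum
    return d_min, d_max, out

print(ipvo_block([3, 2, 15, 19], 1))    # (1, -4, [3, 1, 15, 20])
```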

10.4.3 Experimental results of PVO-based schemes
Table 10.2 reports the experimental results of the PVO-based schemes. The test images are shown in Fig. 10.11. The average PSNR value of Peng et al.'s scheme is 51.84 dB, and the hiding payload is 0.15 bpp. As with the HS scheme, the PVO scheme also needs a location map to record the boundary pixels, which may have an underflow or overflow problem, and hence the pure hiding payload is still limited. Therefore scholars proposed the dual-image-based scheme to address this limitation.

10.5 Dual-image-based schemes The dual-image-based hiding scheme is another type of RSS [23,25–29]. The secret data is hidden into a host image to generate two stego-images. A diagram of the dual-image-based RSS is shown in Fig. 10.12(A).


FIGURE 10.10 The hiding example of Peng et al.'s scheme.

Table 10.2 The experimental results of the PVO-based schemes.

Image      PSNR    Hiding Payload   bpp
Airplane   51.84   38,938           0.15
Mandrill   51.37   13,656           0.05
Lena       51.84   38,938           0.15
Tiffany    51.94   44,047           0.17

In the extraction procedure, the owner can extract the secret data only when both stego-images are available at the same time; an illegal user who owns only one stego-image cannot extract the correct secret data. A diagram of the extraction procedure is shown in Fig. 10.12(B). Furthermore, the secret data can be dispersed over the two stego-images, so the hiding capacity is twice that of other RSS. Hence the advantages of dual-image-based RSS are security, a high hiding payload, reversibility, and strong robustness [26–29].
Chang et al. [8] proposed a dual-image-based RSS along with the exploiting modification direction (EMD) method. Lee et al. transform secret messages into quinary-based symbols and concatenate two symbols as a set to embed in the dual images [30]. Chang et al. consider the right diagonal line of a magic matrix to determine the correct position for hiding secret data.


FIGURE 10.11 The test images used to test the performance of the PVO scheme.

Qin et al. use different hiding strategies for the two stego-images. For the first stego-image, Qin et al. [19] apply the EMD method; the first stego-image is then used to decide the hiding rule for the second stego-image. Lu et al. [28] modify the LSB matching method and make it reversible by using a rule table.

10.5.1 Center-folding strategy
Most hiding methods focus on enhancing the performance of the hiding mechanism. Lu et al. [29] observed that the secret data themselves are another important factor that influences the performance of an RSS. Hence they proposed a center-folding strategy that folds the secret data before hiding them in the host image. In their scheme, K secret bits are taken as a group and transformed into a decimal number DN, whose value range is R = {0, 1, 2, . . ., 2^K − 1}. To narrow down the value of DN, the center value of the range is subtracted from DN so that the range becomes R̂ = {−2^(K−1), −2^(K−1) + 1, . . ., −1, 0, 1, . . ., 2^(K−1) − 2, 2^(K−1) − 1}. The value after the folding process is denoted as $\widehat{DN}$.


FIGURE 10.12 The diagram of the dual-image-based RSS.


FIGURE 10.13 A hiding example of Lu et al.'s scheme.

The folded value $\widehat{DN}$ is given as follows:
\[
\widehat{DN} = DN - 2^{K-1},  \tag{10.26}
\]

where $2^{K-1}$ is the intermediate (center) value of the original range table. For example, let K = 3. The original range table is R = {0, 1, 2, . . ., 2^3 − 1} = {0, 1, 2, 3, 4, 5, 6, 7}, and the intermediate value of the range table is $2^{K-1} = 2^{3-1} = 4$. The scheme subtracts the intermediate value from the values in the range table to generate the new range table R̂ = {−2^(K−1), −2^(K−1) + 1, . . ., −1, 0, 1, . . ., 2^(K−1) − 2, 2^(K−1) − 1} = {−4, −3, −2, −1, 0, 1, 2, 3}. The folded value is then hidden in the pixel to generate the stego-pixels using the equations
\[
p_i' = p_i - \left\lfloor \widehat{DN}/2 \right\rfloor,  \tag{10.27}
\]
\[
p_i'' = p_i + \left\lceil \widehat{DN}/2 \right\rceil,  \tag{10.28}
\]
where $p_i$ is the ith pixel, $p_i'$ is the ith stego-pixel of the first stego-image, and $p_i''$ is the ith stego-pixel of the second stego-image.
Fig. 10.13(A) shows an example image for Lu et al.'s scheme. Suppose that the secret data are 110 011 011 110. The scheme first takes three bits from the secret data and transforms the bit string into the decimal number $DN = (110)_2 = (6)_{10}$. The value is folded by $\widehat{DN} = DN - 2^{K-1} = 6 - 4 = 2$. The DN values are shown in Fig. 10.13(B), and their corresponding folded values $\widehat{DN}$ are shown in Fig. 10.13(C).
The first pixel in Fig. 10.13(A) is $p_1 = 12$, and the first stego-pixel in the first stego-image is $p_1' = p_1 - \lfloor \widehat{DN}/2 \rfloor = 12 - \lfloor 2/2 \rfloor = 11$. The first stego-pixel in the second stego-image is $p_1'' = p_1 + \lceil 2/2 \rceil = 13$. Figs. 10.13(D) and 10.13(E) show the stego-images I and II, respectively.
In the extraction procedure the scheme uses a simple subtraction and an averaging operator to extract the secret data and recover the original pixel value.


The hidden data are calculated using the equation
\[
\widehat{DN} = p_i'' - p_i'.  \tag{10.29}
\]
The $\widehat{DN}$ value is added to the intermediate value to recover the original DN as follows:
\[
DN = \widehat{DN} + 2^{K-1}.  \tag{10.30}
\]

The DN value is then transformed back to binary to obtain the original secret bits. In the recovery procedure the scheme computes the average value from the two stego-images by
\[
p_i = \left\lfloor \frac{p_i' + p_i''}{2} \right\rfloor.  \tag{10.31}
\]
For example, the first stego-pixels $p_1'$ and $p_1''$ are 11 and 13, respectively. The $\widehat{DN}$ value is computed as $\widehat{DN} = p_1'' - p_1' = 13 - 11 = 2$, and the original DN value is $DN = \widehat{DN} + 2^{K-1} = 2 + 4 = 6$. The binary string is $(6)_{10} = (110)_2$, so the first three secret bits are 110. The original pixel value is computed as $p_1 = \lfloor (p_1' + p_1'')/2 \rfloor = \lfloor (13 + 11)/2 \rfloor = 12$.
In Lu et al.'s scheme, if the pixel is not in the range $[-2^{K-2}, 2^{K-2} - 1]$, then an overflow or underflow problem may occur. If the original pixel falls in this range, it is embeddable; otherwise, it is nonembeddable, and the stego-pixel is set to be the same as the original pixel. In the extraction procedure, the scheme checks whether the two stego-pixels at the same location lie outside the range $[-2^{K-2}, 2^{K-2} - 1]$. If both pixels do not fall in the range, then the pixel is nonembeddable, no secret data is placed in it, and the original pixel is the same as the stego-pixel.
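A small Python sketch of the center-folding embedding and extraction for one pixel may help; it follows Eqs. (10.26)–(10.31). The split of the folded value into a floor half and a ceiling half for odd values is an assumption made here so that Eqs. (10.29) and (10.31) recover the data and the pixel exactly; overflow and underflow handling is omitted.

```python
def cfs_embed(pixel, DN, K=3):
    """Center folding: hide one K-bit value DN in a pixel copied into two
    stego-images (Eqs. (10.26)-(10.28))."""
    folded = DN - 2 ** (K - 1)                # Eq. (10.26)
    half_down = folded // 2                   # floor(folded / 2)
    half_up = -(-folded // 2)                 # ceil(folded / 2)
    return pixel - half_down, pixel + half_up # stego-pixels p', p''

def cfs_extract(p1, p2, K=3):
    """Recover the hidden value and the original pixel from the two stego-pixels."""
    folded = p2 - p1                          # Eq. (10.29)
    DN = folded + 2 ** (K - 1)                # Eq. (10.30)
    original = (p1 + p2) // 2                 # Eq. (10.31)
    return DN, original

print(cfs_embed(12, 0b110))                   # (11, 13), as in Fig. 10.13
print(cfs_extract(11, 13))                    # (6, 12) -> bits 110, pixel 12
```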

10.5.2 Experimental results of dual-based RSS
Table 10.3 presents some comparisons of dual-based RSS using the test image Lena. As Table 10.3 shows, Lu et al.'s center-folding strategy increases both the image quality and the hiding capacity. At the same hiding payload of 524,288 bits, the PSNR value of Lu et al.'s scheme with K = 2 is 51.40 dB, which is higher than the 48.14 dB of Chang et al.'s scheme. At a similar image quality of about 46 dB, the hiding capacity of Lu et al.'s scheme with K = 3 is 786,432 bits, which is higher than the 557,052 bits of Qin et al.'s scheme. The dual-image-based RSS methods are simple and effective. However, a major drawback of these schemes is the need for two stego-images. To overcome this problem, interpolation-based RSS has been proposed.

10.6 Interpolation-based schemes Interpolation-based RSS generate an extra pixel between two adjacent pixels to hide the secret data instead of creating an additional stego-image. Fig. 10.14 presents a diagram of the interpolation-based RSS [30–33]. Fig. 10.14(A) is the hiding procedure, and Fig. 10.14(B)


Table 10.3 Experimental results of Lu et al.'s scheme.

Method                                   PSNR    Capacity
Chang et al. (2007)                      45.13   524,288
Chang et al. (2009)                      48.14   524,288
Lee et al. (2009)                        52.65   392,880
Lee and Huang (2013)                     49.66   560,801
Chang et al. (2013)                      39.89   802,895
Qin et al. (2014)                        46.85   557,052
Lu et al. (2015, K = 2) center folding   51.40   524,288
Lu et al. (2015, K = 3) center folding   46.86   786,432

FIGURE 10.14 Diagram of the interpolation-based RSS.

is the extraction procedure. In Fig. 10.14(A) an input image of size w × h is reduced to w/2 × h/2. The reduced image is called the original image. An interpolation operator is then used to generate the virtual pixels and extend the image, producing the host image of size w × h. The secret data are subsequently concealed in the virtual pixels to produce the stego-image. In the extraction procedure the secret data are extracted from the virtual pixels of the stego-image. The scheme then reduces the stego-image to recover the original image of size w/2 × h/2.
Malik et al. [33] proposed an image interpolation-based RSS that applies a pixel-value adjusting feature. Lee et al. [30] used reduplicated exploiting modification direction and edge detection to propose an interpolation RSS. Lu et al. [32] adopted the interpolation technique with neighboring pixels (INP) and a center-folding strategy for an advanced interpolation-based RSS.


FIGURE 10.15 The extension procedure for the NMI scheme.

FIGURE 10.16 Example of Jung and Yoo's scheme.

10.6.1 Embedding procedure of NMI
For the interpolation-based RSS schemes, the key issues are how to extend the image and how to generate the virtual pixels. Many techniques have been proposed for these tasks. A popular scheme is neighbor mean interpolation (NMI), proposed by Jung and Yoo. In their scheme a virtual pixel is calculated as the mean value of its adjacent pixels:
\[
v^{NMI}_{(i,j)} =
\begin{cases}
\left\lfloor \dfrac{p_{(i,j-1)} + p_{(i,j+1)}}{2} \right\rfloor, & \text{if the two horizontal neighbors are original pixels},\\[2mm]
\left\lfloor \dfrac{p_{(i-1,j)} + p_{(i+1,j)}}{2} \right\rfloor, & \text{if the two vertical neighbors are original pixels},\\[2mm]
\left\lfloor \dfrac{p_{(i-1,j-1)} + v^{NMI}_{(i-1,j)} + v^{NMI}_{(i,j-1)}}{3} \right\rfloor, & \text{otherwise},
\end{cases}
\tag{10.32}
\]
where $v^{NMI}_{(i,j)}$ is the virtual pixel, and $p_{(i,j-1)}$ is the original pixel at location $(i, j-1)$. Fig. 10.15 shows the extension diagram for NMI, and Fig. 10.16 shows an example of the NMI scheme. The virtual pixels are given as $v^{NMI}_{(i-1,j)} = \lfloor (p_{(i-1,j-1)} + p_{(i-1,j+1)})/2 \rfloor = \lfloor (74 + 76)/2 \rfloor = 75$, $v^{NMI}_{(i,j-1)} = \lfloor (p_{(i-1,j-1)} + p_{(i+1,j-1)})/2 \rfloor = \lfloor (74 + 78)/2 \rfloor = 76$, and $v^{NMI}_{(i,j)} = \lfloor (74 + 75 + 76)/3 \rfloor = 75$. Fig. 10.16(B) presents the host image.

In the embedding procedure the host image is divided into several blocks of size 2 × 2 to conceal the secret message. The pixel $p_{(i-1,j-1)}$ is the base pixel used to calculate the distance to the other virtual pixels. The equations are as follows:
\[
d_1^{NMI} = |v^{NMI}_{(i-1,j)} - p_{(i-1,j-1)}|,  \tag{10.33}
\]
\[
d_2^{NMI} = |v^{NMI}_{(i,j-1)} - p_{(i-1,j-1)}|,  \tag{10.34}
\]
\[
d_3^{NMI} = |v^{NMI}_{(i,j)} - p_{(i-1,j-1)}|.  \tag{10.35}
\]


FIGURE 10.17 The reducing procedure of Jung and Yoo's scheme.

Each distance is then used to judge how many secret bits can be concealed in the corresponding virtual pixel. The number of secret bits is computed as:
\[
Len_1^{NMI} = \left\lfloor \log_2 (d_1^{NMI}) \right\rfloor,  \tag{10.36}
\]
\[
Len_2^{NMI} = \left\lfloor \log_2 (d_2^{NMI}) \right\rfloor,  \tag{10.37}
\]
\[
Len_3^{NMI} = \left\lfloor \log_2 (d_3^{NMI}) \right\rfloor.  \tag{10.38}
\]

The scheme then takes $Len^{NMI}$ bits from the secret data and transforms the bit string into a decimal number, which is added to the virtual pixel to generate the stego-pixel. Fig. 10.16 shows an example of Jung and Yoo's scheme. The distances in Fig. 10.16(B) are $d_1^{NMI} = |75 - 74| = 1$, $d_2^{NMI} = |76 - 74| = 2$, and $d_3^{NMI} = |75 - 74| = 1$. The numbers of secret bits are $Len_1^{NMI} = \lfloor \log_2(1) \rfloor = 0$, $Len_2^{NMI} = \lfloor \log_2(2) \rfloor = 1$, and $Len_3^{NMI} = \lfloor \log_2(1) \rfloor = 0$. In this example, since $Len_1^{NMI}$ and $Len_3^{NMI}$ are both 0, no secret data can be embedded into the pixels $v^{NMI}_{(i-1,j)}$ and $v^{NMI}_{(i,j)}$. The value of $Len_2^{NMI}$ is 1, which means that one secret bit can be hidden in $v^{NMI}_{(i,j-1)}$.
Let the secret data be 110. The first secret bit $(1)_2$ is transformed into the decimal number $(1)_{10}$ and added to the virtual pixel $v^{NMI}_{(i,j-1)}$ to generate the stego-pixel $v'^{NMI}_{(i,j-1)} = v^{NMI}_{(i,j-1)} + s = 76 + 1 = 77$. The final stego-image is shown in Fig. 10.16(C).
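The NMI interpolation and capacity computation for one 2 × 2 block can be sketched as follows. The function names are illustrative, the zero-distance case is assigned zero capacity as an assumption (the logarithm in Eq. (10.36) is undefined there), and the actual bit insertion and block traversal are omitted.

```python
from math import floor, log2

def nmi_block(p00, p01, p10):
    """Virtual pixels for one block, following Eq. (10.32):
    p00, p01, p10 are the original pixels at (i-1,j-1), (i-1,j+1), (i+1,j-1)."""
    v_row = (p00 + p01) // 2            # v(i-1, j): between two originals in a row
    v_col = (p00 + p10) // 2            # v(i, j-1): between two originals in a column
    v_mid = (p00 + v_row + v_col) // 3  # v(i, j)
    return v_row, v_col, v_mid

def nmi_capacities(p00, virtuals):
    """Bits each virtual pixel can carry, Eqs. (10.33)-(10.38);
    a zero distance is treated as zero capacity (an assumption)."""
    return [int(floor(log2(abs(v - p00)))) if v != p00 else 0 for v in virtuals]

v = nmi_block(74, 76, 78)
print(v)                          # (75, 76, 75), as in Fig. 10.16
print(nmi_capacities(74, v))      # [0, 1, 0]: one bit fits in v(i, j-1)
```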

10.6.2 Extraction procedure of NMI
In the extraction procedure the receiver uses Eq. (10.32) to regenerate the virtual pixels $\hat{v}^{NMI}_{(i-1,j)}$, $\hat{v}^{NMI}_{(i,j-1)}$, and $\hat{v}^{NMI}_{(i,j)}$ by referring to the original pixels $p_{(i-1,j-1)}$, $p_{(i-1,j+1)}$, $p_{(i+1,j-1)}$, and $p_{(i+1,j+1)}$, which are unchanged in the stego-image. These values are then compared with the stego-pixels at the virtual positions to first compute the number of hidden bits and then to extract them. Finally, the stego-image is reduced to obtain the original image. The extraction and reducing procedures are shown in Fig. 10.17.
Following the same example, the values of Fig. 10.16(C) at the virtual positions are 75, 77, and 75. Subtracting the base pixel gives $\widehat{Len}_1^{NMI} = \lfloor \log_2(75 - 74) \rfloor = 0$, $\widehat{Len}_2^{NMI} = \lfloor \log_2(77 - 74) \rfloor = 1$, and $\widehat{Len}_3^{NMI} = \lfloor \log_2(75 - 74) \rfloor = 0$. Since $\widehat{Len}_2^{NMI} = 1$, one secret bit is concealed in that pixel. The secret bit is $s = v'^{NMI}_{(i,j-1)} - \hat{v}^{NMI}_{(i,j-1)} = 77 - 76 = (1)_{10} = (1)_2$. Lastly, the scheme reduces the stego-image to produce the original image, as shown in Fig. 10.16(A). The experimental results of NMI are shown in Table 10.4.

Table 10.4 The experimental results of NMI.

Image      PSNR    Hiding Payload   bpp
Airplane   33.05   275,251          1.05
Mandrill   32.40   288,358          1.10
Lena       34.89   288,358          1.10
Tiffany    37.77   243,794          0.93

Table 10.5 The experimental results of different kinds of RSS.

Image      Method     PSNR    bpp
Airplane   DE         38.42   0.90
Airplane   HS         48.30   0.06
Airplane   PVO        51.84   0.15
Airplane   CFS dual   41.42   1.50
Airplane   NMI        33.05   1.05
Lena       DE         38.04   0.91
Lena       HS         48.20   0.02
Lena       PVO        51.84   0.15
Lena       CFS dual   45.38   1.50
Lena       NMI        34.89   1.10
Mandrill   DE         36.02   0.54
Mandrill   HS         48.20   0.02
Mandrill   PVO        51.37   0.05
Mandrill   CFS dual   45.39   1.50
Mandrill   NMI        32.40   1.10
Tiffany    DE         37.10   0.86
Tiffany    HS         48.20   0.03
Tiffany    PVO        51.94   0.17
Tiffany    CFS dual   45.48   1.47
Tiffany    NMI        37.77   0.93


FIGURE 10.18 The average values of PSNR and bpp of each kind of RSS.

al., based on the deep learning technique, proposed a U-Net structure for increasing the hiding capacity. A neural network is used to hide the secret data. According to the experimental results, the embedding capacity can be improved [35]. Pixel value differencing (PVD) is another way to increase the hiding capacity. In a PVD scheme a cover image is divided into several 2 × 1 blocks. Each block has two neighboring pixels. The scheme computes the difference between the pixels and calculates how many secret bits can be concealed into the pixels. The scheme then shifts the pixels to make the difference of the shifted pixels match with the secret data. In PVD scheme the scholars usually scan the image with zig-zag direction and considers three directions, horizontal, vertical, and diagonal, to find the neighboring pixels. In 2018, Hameed et al. adaptively selects embedding directions to enlarge the embedding capacity. The proposed scheme is called adaptive directional pixel-value differencing (ADPVD) [36]. In 2019, Liu et al. [37] proposed an enhanced PVD method. In their scheme, the cover image is divided into several 3 × 3 blocks. They apply the side match method to generate eight differences for hiding secret data. The scheme produces more hiding spaces so that the hiding capacity can be increased. In CFS the secret data are reencoded by using two folding zones. The first zone contains negative numbers, and the second zone contains positive numbers. In 2019, Lu et al. [38] enhanced CFS by using multiple folding zones to further narrow down the code of the secret data. In their scheme the number of zones and the maximum distortion are used to control the image quality and the hiding capacity. The hiding payload of Lu et al.’s scheme is 1.75 bpp, and the image quality is 43.22 dB.

10.7 Conclusion This chapter introduces several RSS, for example, DE-based RSS, HS-based RSS, PVO, dualimage-based RSS, and interpolation-based RSS. The distortion made by DE-based RSS is huge, and the overflow and underflow problems may be present. Hence this kind of scheme requires a location map to record the extra information needed to recover the original values. HS-based RSS can obtain an improved image quality. However, the hid-


Furthermore, the scheme also requires a location map to record the boundary pixels. The PVO-based scheme hides the secret data in the maximum and minimum values of each block, and its hiding capacity is greater than that of the HS-based scheme. The hiding payload of the dual-image-based scheme is the greatest, and its image quality is acceptable; however, this kind of scheme requires two stego-images. The interpolation-based scheme was therefore proposed to solve the two-stego-image problem. In the interpolation-based scheme, the secret data are concealed in the virtual pixels and thus do not interfere with the original pixels, so the original image can be restored simply by using the reducing operator. One drawback of the interpolation-based scheme is that it only recovers the stego-image to the status of the original (reduced) image: in the NMI scheme the original image of size w/2 × h/2 is not the true image, which is the input image of size w × h. This chapter briefly describes various reversible steganography schemes. The schemes have different advantages and disadvantages, and thus the user is able to choose a suitable solution according to their requirements.

Acknowledgments This research is partially sponsored by Ministry of Science and Technology (108-2221-E-324-017) and Chaoyang University of Technology (CYUT) and Higher Education Sprout Project, Ministry of Education (MOE), Taiwan, under the project “The R&D and the cultivation of talent for health-enhancement products”.

References
[1] F.H. Wang, H.C. Huang, J.S. Pan, Efficient and robust watermarking algorithm with vector quantisation, Electronic Letters 37 (13) (2001) 826–828.
[2] J.J. Hwang, Digital image watermarking employing codebook in vector quantisation, Electronic Letters 39 (1) (2003) 840–841.
[3] M. Jo, H. Kim, A digital image watermarking scheme based on vector quantization, IEICE Transactions on Information and System (2002) 1054–1056.
[4] T. Kim, Side match and overlap match vector quantizers for images, IEEE Transactions on Image Processing 1 (1992) 170–185.
[5] A. Buzo, Y. Linde, R.M. Gray, An algorithm for vector quantizer design, IEEE Transactions on Communications 28 (1) (1980) 84–95.
[6] Z.M. Lu, S.H. Sun, Digital image watermarking technique based on vector quantisation, Electronic Letters 36 (4) (2000) 303–305.
[7] N. Morimoto, W. Bender, D. Gruhl, A. Lu, Techniques for data hiding, IBM System Journal 35 (1996) 313–336.
[8] J.Y. Hsiao, C.C. Chang, C.S. Chan, Finding optimal least-significant-bit substitution in image hiding by dynamic programming strategy, Pattern Recognition 36 (2003) 1583–1595.
[9] B. Chen, G.W. Wornell, Quantization index modulation: a class of provable good methods for digital watermarking and information embedding, IEEE Transactions on Information Theory 47 (2001) 1428–1443.
[10] R. Tzschoppe, J.J. Eggers, R. Bauml, B. Girod, Scalar Costa scheme for information embedding, IEEE Transactions on Signal Processing 51 (2003) 1003–1019.
[11] J. Goljan, J. Fridrich, R. Du, Invertible authentication watermark for JPEG images, in: Proceedings of the SPIE Conference on Security and Watermarking of Multimedia Content, 2001, pp. 223–227.


[12] J.F. Delaigle, C.D. Vleeschouwer, B. Macq, Circular interpretation of bijective transformation in lossless watermarking for media asset management, IEEE Transactions on Multimedia 5 (1) (2003) 97–105.
[13] M.M. Yeung, F. Mintzer, An invisible watermarking technique for image verification, in: Proceedings of IEEE International Conference on Image Processing, vol. 2, 1997, pp. 680–683.
[14] A.M. Tekalp, M.U. Celik, G. Sharma, E. Saber, Lossless generalized-LSB data embedding, IEEE Transactions on Image Processing 14 (2) (2005) 253–266.
[15] A.M. Tekalp, M.U. Celik, G. Sharma, E. Saber, Localized lossless authentication watermark, in: Proceedings of SPIE Security and Watermarking of Multimedia Contents, vol. 5020, 2003, pp. 689–698.
[16] A.M. Tekalp, M.U. Celik, G. Sharma, E. Saber, Reversible data hiding, in: Proceedings of IEEE International Conference on Image Processing, vol. II, 2002, pp. 157–160.
[17] M. Rabbani, C.W. Honsinger, P.W. Jones, J.C. Stoffel, Lossless recovery of an original image containing embedded data, US Patent Application 6 (278) (2001) 791.
[18] J. Tian, Reversible data embedding and content authentication using difference expansion, IEEE Transactions on Circuits and Systems for Video Technology 13 (8) (2003) 831–841.
[19] C.C. Chang, C. Qin, T.J. Hsu, Reversible data hiding scheme based on exploiting modification direction with two steganographic images, Multimedia Tools and Applications 74 (15) (2014) 5861–5872.
[20] H.M. Wong, S.K. Yip, O.C. Au, C.W. Ho, Generalized lossless data hiding by multiple predictors, in: Proceedings of IEEE International Symposium on Circuits and Systems, 2006, pp. 1426–1429.
[21] M. Radford, I.H. Witten, J.G. Cleary, Arithmetic coding for data compression, Communications on ACM 30 (6) (1987) 520–540.
[22] A.M. Alattar, Reversible watermark using the difference expansion of a generalized integer transform, IEEE Transactions on Image Processing 13 (8) (2004) 1147–1156.
[23] N. Ansari, Z. Ni, Y.Q. Shi, W. Su, Reversible data hiding, IEEE Transactions on Circuits and Systems for Video Technology 16 (3) (2006) 354–362.
[24] X. Li, F. Peng, B. Yang, Improved PVO-based reversible data hiding, Digital Signal Process 25 (2014) 255–265.
[25] T.D. Kieu, C.C. Chang, Y.C. Chou, Reversible data hiding scheme using two steganographic images, in: Proceedings of IEEE Region 10 International Conference (TENCON), vol. 117–118, 2007, pp. 1–4.
[26] C.F. Lee, Y.L. Huang, Reversible data hiding scheme based on dual stegano-images using orientation combinations, Telecommunication Systems 52 (4) (2013) 2237–2247.
[27] C.C. Chang, C.F. Lee, K.H. Wang, Y.L. Huang, A reversible data hiding scheme based on dual steganographic images, in: Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication, ICUIMC '09, 2009, pp. 228–237.
[28] C.Y. Tseng, T.C. Lu, J.H. Wu, Dual imaging-based reversible hiding technique using LSB matching, Signal Processing 108 (2015) 77–89.
[29] J.H. Wu, T.C. Lu, C.C. Huang, Dual-image-based reversible data hiding method using center folding strategy, Signal Processing 115 (2015) 195–213.
[30] K.C. Chen, C.F. Lee, C.Y. Weng, An efficient reversible data hiding with reduplicated exploiting modification direction using image interpolation and edge detection, Multimedia Tools and Applications 76 (2017) 9993–10016.
[31] M.S. Kumar, J. Biswapati, G. Debasis, Weighted matrix based reversible data hiding scheme using image interpolation, Computational Intelligence in Data Mining 2 (2016) 239–248.
[32] T.C. Lu, Adaptive (k, F1) interpolation-based hiding scheme using center folding strategy, Multimedia Tools and Application 76 (2017) 1827–1855.
[33] H.K. Verma, A. Malik, G. Sikka, An image interpolation based reversible data hiding scheme using pixel value adjusting feature, Multimedia Tools and Applications 76 (2017) 13025–13046.
[34] C.M. Pun, Z.L. Liu, Reversible data-hiding in encrypted images by redundant space transfer, Information Sciences 433–434 (2018) 188–203.
[35] X.T. Duan, K. Jia, B. Li, et al., Reversible image steganography scheme based on a U-Net structure, IEEE Access 7 (2019) 9315–9323.
[36] S. Aly, M.A. Hameed, M. Hassaballah, An efficient data hiding method based on adaptive directional pixel value differencing (ADPVD), Multimedia Tools and Applications 77 (12) (2018) 14705–14723.
[37] Y.C. Lin, H.H. Liu, C.M. Lee, Techniques for data hiding, Multimedia Tools and Applications 78 (9) (2019) 12157–12181.
[38] Y.C. Lu, T.C. Lu, T.N. Vo, Dual-image based high image quality reversible hiding scheme with multiple folding zones, Multimedia Tools and Applications (2019) 1–39.

11 Quantum steganography
Todd A. Brun
University of Southern California, Ming Hsieh Department of Electrical and Computer Engineering, Los Angeles, CA, United States

11.1 Introduction
11.1.1 The idea of steganography
Steganography is the science of hiding a secret message (the stegotext) within a larger, innocent-looking message (the covertext), and transmitting the result so that the steganographic message is readable only by the intended receiver. "Steganography" comes from the Greek words steganos (meaning "covered") and graphia (meaning "writing"). The art of information hiding dates back to the Greeks in 440 BCE [1]. The word "steganography" was first used in 1499 by Johannes Trithemius in his Steganographia, one of the first treatises on cryptographic and steganographic techniques [2]. During WWII, a Japanese spy, Velvalee Dickinson, sent classified information to neutral South America. She was a dealer in dolls, and her letters discussed the quantity and type of doll to ship. The covertext was the doll orders, concealing an encoded stegotext about battleship movements [3].
The modern study of steganography began with Simmons [4]. In his paradigm, Alice and Bob are imprisoned in two cells that are far apart from each other. They would like to devise an escape plan, but the only way they can communicate is through a courier who works for the warden of the prison (Eve, the eavesdropper; in steganographic literature, she is sometimes called Willie, the warden). The courier reveals all messages to Eve. If Eve suspects that Alice and Bob are conspiring to escape, then she will cut off all communication between them. Prior to their incarceration, Alice and Bob shared a secret key (assumed to be a long string of random bits), which they later use to send secret messages hidden in a covertext. Can Alice and Bob devise an escape plan without arousing the suspicion of the warden?
It is important to recognize the difference between steganography and cryptography. In cryptography the sender (Alice) encrypts a secret message (the plaintext) using a shared secret key, and the resulting ciphertext is then sent to the receiver (Bob) to be decoded. If an eavesdropper (Eve) observes the ciphertext, then she cannot decode it without the secret key. However, she will know that there is a secret message, since Alice is sending apparent gibberish to Bob. The secrecy of steganography comes from concealing the fact that there is any message at all. In many cases, if Eve becomes aware that a secret message exists, then she can read it without difficulty.


The two paradigms also give a different role to the eavesdropper, or adversary. In standard cryptography the eavesdropper is assumed to operate secretly and (perhaps) illegitimately. In steganography the eavesdropper can operate openly and is often in a position of authority. If Eve is the prison warden, then she could prevent secret communication by the simple expedient of banning all communication. But generally, she wishes to allow certain kinds of approved communication while banning others. Cryptography is a defense against spies; steganography, against censors and secret police.
Quantum cryptography has been widely studied [5]. The study of quantum steganography, however, is still in a relatively early stage. Generalizing the idea of steganography to quantum information can take a number of different forms. One could send a secret classical message through a quantum channel, hidden in a quantum communication protocol. One could similarly send secret quantum messages (i.e., quantum states) through a quantum channel. Quantum steganography can also include using quantum resources other than channels (such as entanglement) to send secret classical or quantum messages. These possibilities require the development of new methods of encoding, suitable for quantum communication channels, and of a sensible notion of an "innocent" quantum message.
Curty et al. proposed three different quantum steganographic protocols [6]. However, none of these protocols addressed the issue of communicating an innocent message over a noisy classical channel or a general quantum channel, or gave key-consumption rates. Natori [7] provided a simple treatment of quantum steganography that is a modification of superdense coding. Martin [8] also introduced a notion of quantum steganographic communication. His protocol is a variation of Bennett and Brassard's quantum-key distribution protocol (QKD), in which he hides a steganographic channel in the QKD protocol. Banerjee [9] gave a classical steganographic method inspired by the use of reversible quantum gates. This chapter examines one promising approach to quantum steganography, based on hiding messages in quantum states by disguising them as errors in a quantum channel [10–14].

11.1.2 Quantum error-correcting codes
The approach studied in this chapter was introduced by Gea-Banacloche [10], who gave a protocol to hide secret classical messages in the form of quantum error syndromes by deliberately applying correctable errors to a quantum state encoded in the three-qubit quantum error-correcting code (QECC). His paper, however, did not address the issue of making the message look innocent: in the protocol the message would not resemble any plausible quantum channel. This approach to quantum steganography has been studied in detail by Shaw and Brun [11,12], with explicit encoding and decoding procedures and calculated rates of communication and secret key consumption. They showed that such schemes can hide both quantum and classical information, with a quantitative measure of secrecy, even in the presence of a noisy physical channel.
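The flavor of syndrome-based hiding can be illustrated with the three-qubit bit-flip code in plain NumPy. This toy sketch hides a two-bit classical message by choosing which (if any) qubit to flip, and the receiver reads it back from the error syndrome; it is a simplified illustration of the idea, not the protocol as specified in [10], and the helper names are hypothetical.

```python
import numpy as np

def apply_x(state, qubit):
    """Apply a bit flip X on one qubit of a 3-qubit state vector (qubit 0 = leftmost)."""
    out = np.zeros_like(state)
    for idx in range(8):
        out[idx ^ (1 << (2 - qubit))] = state[idx]
    return out

# Encode a logical qubit a|0> + b|1> into the bit-flip code a|000> + b|111>.
a, b = 0.8, 0.6
encoded = np.zeros(8)
encoded[0b000], encoded[0b111] = a, b

# Alice hides a 2-bit message by her choice of "error": 0 -> none, 1..3 -> flip that qubit.
message = 2
stego = encoded if message == 0 else apply_x(encoded, message - 1)

# Bob reads the two parity checks Z1Z2 and Z2Z3 (here taken classically from a
# basis state in the support of the state vector; both support states agree).
support = [idx for idx in range(8) if abs(stego[idx]) > 1e-12]
bits = [(support[0] >> k) & 1 for k in (2, 1, 0)]
syndrome = (bits[0] ^ bits[1], bits[1] ^ bits[2])
recovered = {(0, 0): 0, (1, 0): 1, (1, 1): 2, (0, 1): 3}[syndrome]
print(recovered)    # 2: the hidden message, read from the error syndrome
```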


Building on the above work, Sutherland and Brun [13,14] derived asymptotic communication rates for quantum steganography. When the error rate of the physical channel is lower than the eavesdropper’s expectation, it is possible to achieve nonzero asymptotic rates of communication. (If the eavesdropper has exact knowledge of the channel, then secret communication may still be possible, but the amount of secret information that can be transmitted in general grows sublinearly with the number of channel uses.) In this chapter, we will examine these techniques. Recently, an idea closely related to quantum steganography has been studied under the name of quantum covert communication [15–19]. Many of the ideas in this literature are closely related to steganographic requirements, such as secrecy and recoverability. This is not surprising, since covert quantum communication can be seen as a particular case of quantum steganography over noisy quantum channels in the case where the eavesdropper has exact knowledge of the channel, and where Eve assumes that the channel is idle (so only noise is being transmitted). Similarly, quantum steganography is a type of covert quantum communication where Eve knows about the covertext communication but not the hidden stegotext, and where Eve may not have perfect knowledge of the channel. Work on covert communication has found that if Eve has exact knowledge of the channel, then the amount of secret communication that can be done generally grows as the square root of the number of channel uses. For the purposes of this chapter, one only needs to know a small amount about quantum error-correcting codes. A QECC is characterized by three parameters {n, k, d}. It encodes k logical qubits into n physical qubits with a minimum distance d. Such a code can correct errors acting on t qubits if t ≤ (d − 1)/2. The most widely used class of QECCs is the stabilizer codes [20], based on classical linear codes. Errors are diagnosed by measuring a set of error syndromes, analogous to classical parity checks. The one other distinction that is needed is between degenerate and nondegenerate codes. A QECC is nondegenerate if all correctable errors have unique error syndromes. A degenerate QECC has two or more correctable errors with the same error syndrome, which can be corrected by the same operation. Degeneracy is a property unique to quantum codes; there is no analogous property of classical linear codes. In this chapter the QECCs used are assumed to be nondegenerate, but it should be recognized that many QECCs are degenerate, and these protocols will have to be modified for that case.

11.2 Goals and tools of quantum steganography There are two major goals in quantum steganography. Communication: Alice wants to send classical or quantum information to Bob over a quantum channel. Most protocols that have been proposed for quantum steganography are for sending classical information, but Shaw and Brun showed that it is also possible to send quantum information, given certain assumptions about the eavesdropper’s behavior. Secrecy: Eve should be unable to detect the presence of the secret message. Ideally, a protocol should maximize the rate of communication, consistent with the secrecy requirement. In this chapter, we will show


that it is possible to make a hidden message indistinguishable from the covertext passing through the appropriate channel. A third requirement may or may not be imposed: Security. In some cases, we may require that Eve be unable to read the steganographic message even if she knows it is present. This requirement is, in a sense, separate from the methods of steganography itself and can be achieved by combining steganography with cryptography.
Shaw and Brun presented a set of protocols that achieve the above goals. These protocols have the following structure. An "innocent" quantum message (or state) ρc is encoded in a QECC by Alice. This ρc is the covertext. In this chapter, we will usually assume that ρc is a pure state, ρc = |ψc⟩⟨ψc|. Alice then performs a second operation on the encoded covertext, which embeds the steganographic message in the codeword. This stegotext is another state |φs⟩. One bit or qubit of the stegotext is called a stego bit or stego qubit, respectively. The modified codeword is sent over a quantum channel to Bob, who can (with high probability) decode it and extract the stegotext |φs⟩. The encoding is done in such a way that if an eavesdropper Eve intercepts the codeword, then it will look exactly like the encoded state ρc after it has passed through a noisy channel. In other words, Eve cannot distinguish the encoded steganographic message from noise in the channel.
To prove the efficacy of a quantum steganography protocol, we must make a number of assumptions about what Alice, Bob, and Eve know, and the resources on which they can draw:
1. Alice and Bob know (with sufficient accuracy) the physical channel, which may or may not have intrinsic noise (both cases will be considered).
2. Eve has beliefs about the physical channel, which may or may not be accurate. But this chapter assumes that Alice and Bob have some knowledge of Eve's expectations. This can be plausible if Alice and Bob have been systematically making the channel appear more noisy than it actually is by adding extra errors to all their transmissions. But (as will be seen) secret communication is still possible even if Eve's knowledge of the channel is accurate.
3. Alice and Bob share either a secret key or shared entanglement. A secret key is a long binary string drawn from an equally weighted i.i.d. distribution. Shared entanglement can be used to generate such a key, but it can also be used, for example, as a quantum resource for teleportation [21].
4. Eve can make measurements of any message that passes on the channel, but she will not necessarily always do so. If Eve intercepts a message, then she can demand from Alice and Bob information about the covertext ρc, the QECC used, and so on, and make measurements to verify their information.
A steganographic protocol succeeds if Alice and Bob can communicate a nonzero amount of information while satisfying the secrecy requirement. The requirement is that if Eve intercepts the message, then with high probability she cannot tell whether it contains a secret message. The message should appear exactly like the encoded state ρc that has passed through the channel that Eve expects.


We can go further and demand that even if Eve knows that the codeword contains secret information, she is unable to read it—the condition called security. This can be achieved by encrypting the message before embedding it in the covertext. Alice and Bob want to maximize their rate of steganographic communication while minimizing their rate of key usage subject to the secrecy condition.

11.3 Quantum steganography with depolarizing noise

In this section, we will present a simple steganographic protocol that shows how quantum information can be hidden in the noise of a depolarizing channel using a classical secret key shared between Alice and Bob. First, the case will be considered where the physical channel is noiseless (i.e., all noise is controlled by Alice), but Eve expects some level of depolarizing noise. This simple protocol can be extended to the case where the channel does have intrinsic depolarizing noise (not controlled by Alice and Bob). The amount of secret key consumed can be calculated. This protocol is not optimal, but proves that quantum steganography is possible and gives a simple and robust encoding. Later in the chapter, more efficient quantum steganographic protocols will be considered against more general quantum channels, which hide quantum information in the space of typical error syndromes. This encoding is asymptotically optimal for certain classes of QECCs and quantum channels.

11.3.1 The depolarizing channel

Classically, a standard error model is the binary symmetric channel (BSC), in which each bit has an independent probability p of being flipped 0 ↔ 1. In quantum notation this corresponds to the channel

    ρ → N_p^{BSC}(ρ) = (1 − p)ρ + p XρX,    (11.1)

where ρ is a density matrix representing the state of a single qubit, and the X operator flips the bit (see further). The quantum generalization is the depolarizing channel (DC), one of the most widely used quantum channel models:

    ρ → N_p^{DC}(ρ) = (1 − p)ρ + (p/3) XρX + (p/3) YρY + (p/3) ZρZ.    (11.2)

Each qubit has an equal probability of undergoing an X, Y, or Z error, where these are the canonical Pauli matrices:

    X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},  Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},  Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.    (11.3)

Applying this channel repeatedly to a qubit will map it eventually to the maximally mixed state I/2. This channel can be written in a different but equivalent form:

    N_p^{DC} = (1 − 4p/3) I + (4p/3) T,    (11.4)


where I(ρ) = ρ and T(ρ) = (1/4)(ρ + XρX + YρY + ZρZ). The operation T is twirling: it takes a qubit in any state ρ to the maximally mixed state I/2. If the channel is written this way, instead of applying X, Y, or Z errors with probability p/3, we can think of removing the qubit with probability 4p/3 and replacing it with a maximally mixed state. This picture makes the steganographic protocol more transparent.
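As a concrete illustration of Eqs. (11.2)-(11.4), the following minimal numerical sketch (not part of the protocol itself; the error rate p = 0.1 is arbitrary) applies the depolarizing channel to a single-qubit density matrix and shows that repeated application drives the state toward the maximally mixed state I/2.

```python
import numpy as np

# Pauli matrices of Eq. (11.3)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def depolarize(rho, p):
    """One use of the depolarizing channel, Eq. (11.2)."""
    return ((1 - p) * rho
            + (p / 3) * (X @ rho @ X + Y @ rho @ Y + Z @ rho @ Z))

# Start from the pure state |0><0| and apply the channel many times.
rho = np.array([[1, 0], [0, 0]], dtype=complex)
for _ in range(200):
    rho = depolarize(rho, p=0.1)
print(np.round(rho.real, 3))   # approaches the maximally mixed state I/2
```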

FIGURE 11.1 Local encoding. This figure illustrates the scheme for hiding secret message qubits locally in a quantum codeword while emulating the depolarizing channel. The secret message qubits are “twirled” and inserted at random locations in the codeword by Alice, using the secret key shared with Bob. The codeword is sent through the channel. Knowing the secret key, Bob can extract the secret message qubits and “untwirl” them. If Eve looks at the codeword, then it will look like an innocent codeword that has passed through the depolarizing channel.

11.3.2 A local steganographic encoding

To begin with, assume that the actual physical channel between Alice and Bob is noiseless. All the noise that Eve sees is due to errors that Alice deliberately applies to her codewords. The procedure is illustrated in Fig. 11.1.

1. Alice encodes a covertext of k_c qubits into N qubits with an {N, k_c} quantum error-correcting code (QECC).

2. From (11.4), the DC would maximally mix Q qubits with probability

    p_Q = \binom{N}{Q} (4p/3)^Q (1 − 4p/3)^{N−Q}.    (11.5)

For large N, Alice can send M = (4/3)pN(1 − δ) stego qubits, where 1 ≫ δ ≫ \sqrt{(1 − 4p/3)/((4p/3)N)}. (The chance of fewer than M errors is negligibly small.)

3. Using the shared random key (or shared ebits), Alice chooses a random subset of M qubits out of the N and swaps her M stego qubits for those qubits of the codeword. She also replaces a random number m of qubits outside this subset with maximally mixed qubits, so that the total Q = M + m matches the binomial distribution (11.5) to high accuracy.

4. Alice "twirls" her M stego qubits using 2M bits of secret key or 2M shared ebits. To each qubit, she applies one of I, X, Y, or Z chosen at random, so ρ → T(ρ). To Eve, who does not have the key, these qubits appear maximally mixed. (Twirling can be thought of as the quantum equivalent of a one-time pad.)


5. Alice transmits the codeword to Bob. From the secret key, he knows the correct subset of M qubits and the one-time pad to decode ("untwirl") them.

This protocol transmits (4/3)pN(1 − δ) secret qubits from Alice to Bob. The secrecy follows from the fact that, without the key, Eve cannot distinguish a stego qubit from a maximally mixed qubit; these maximally mixed qubits are distributed exactly as would be expected from the depolarizing channel with error rate p. If the rate p matches Eve's expectations, then she will detect nothing suspicious even if she intercepts the codeword and measures its error syndromes.

If the channel contains intrinsic noise, then Alice will first have to encode her k_s stego qubits in an {M, k_s} QECC, swap those M qubits for a random subset of M qubits in the codeword, and apply the twirling procedure. This twirling does not interfere with the error-correcting power of the QECC if Bob knows the key. The rate of transmission k_s/N will depend on the rate of the QECC used to protect the stego qubits. For a BSC, this would be upper-bounded by (1 − δ)(1 − h(p)) δp/(1 − 2p). However, for most quantum channels (including the DC), the highest achievable rate is not known. Assuming that the physical channel is also a DC with error rate p and that Alice emulates a DC with error rate q, the effective channel will appear to Eve like a DC with error rate p + q(1 − 4p/3) ≡ p + δp. So long as p + δp is sufficiently close to Eve's expectation of the error rate, the communication will remain secret. (This notion of secrecy will be made rigorous shortly.) The rate of communication is k_s/N ≈ (4/3) c δp/(1 − 4p/3), where c = k_s/M is the achievable rate of the code for the DC with error rate p. Although the asymptotic limit of c (the quantum capacity) is not known for the DC, a lower bound on this limit is known, called the hashing bound [22]. This is c_hashing = 1 − h(p) − p log2 3, where h(p) = −p log2 p − (1 − p) log2(1 − p) is the binary entropy.
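The rate quoted above, k_s/N ≈ (4/3) c δp/(1 − 4p/3), is easy to evaluate with the hashing bound standing in for the unknown quantum capacity c. The sketch below does exactly that; the numerical values of p and δp are illustrative and not taken from the chapter.

```python
from math import log2

def h(p):
    """Binary entropy."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

def stego_rate_local(p, dp):
    """Local-encoding rate k_s/N ~ (4/3) c dp / (1 - 4p/3) over a depolarizing
    channel with intrinsic error rate p, with the hashing bound
    c = 1 - h(p) - p*log2(3) used as a stand-in for the capacity."""
    c_hashing = 1 - h(p) - p * log2(3)
    return (4 / 3) * c_hashing * dp / (1 - 4 * p / 3)

print(stego_rate_local(p=0.05, dp=0.01))   # secret qubits per channel use
```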

11.3.3 Key usage

The secret key is used at two points in this protocol. In step 3, Alice chooses a random subset of M qubits out of the N-qubit codeword. There are C(N, M) subsets, so roughly log2 C(N, M) bits are needed to choose one. In step 4, 2M bits of key are used for twirling. This gives

    n_k ≈ log2 C(N, M) + 2M    (11.6)

bits of secret key used. Define the key consumption rate K = n_k/N to be the number of bits of key consumed per qubit that Alice sends through the channel, and use M ≈ 4qN/3 and q ≈ δp/(1 − 4p/3) to express K in terms of p, δp, and N:

    K ≈ log2 [(4/β)^β (1 − β)^{β−1}],  β ≡ 4δp/(3 − 4p).    (11.7)

Alice can consume fewer bits of key if she has a source that is close to a maximally mixed state when averaged over all possible messages, for instance, if Alice first compresses the


state |φ_s⟩ before sending it [23]. This would allow Alice and Bob to skip the twirling/untwirling procedure. This difference in key consumption rate illustrates the distinction between secrecy and security. If the source is compressed, and twirling is omitted, then the encoding still emulates the effects of the depolarizing channel and satisfies the requirement of secrecy. However, if Eve becomes aware of the message, then she may be able to read it (or read most of it) without needing the shared key. By contrast, if twirling is used, the message is randomized and cannot be recovered without the key.
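As a sanity check on Eqs. (11.6) and (11.7), one can compare the per-qubit key cost (log2 C(N, M) + 2M)/N against the closed-form expression of Eq. (11.7). The sketch below uses hypothetical parameters; for large N the two values agree closely.

```python
from math import comb, log2

def key_rate_exact(N, p, dp):
    """Key bits per channel use from Eq. (11.6): log2 C(N, M) to pick the
    stego positions plus 2M bits for twirling, divided by N."""
    q = dp / (1 - 4 * p / 3)        # extra depolarizing rate Alice emulates
    M = round(4 * q * N / 3)        # number of stego qubits
    return (log2(comb(N, M)) + 2 * M) / N

def key_rate_closed(p, dp):
    """Closed-form estimate of Eq. (11.7), with beta = 4*dp/(3 - 4p)."""
    beta = 4 * dp / (3 - 4 * p)
    return log2((4 / beta) ** beta * (1 - beta) ** (beta - 1))

print(key_rate_exact(N=100000, p=0.05, dp=0.01))
print(key_rate_closed(p=0.05, dp=0.01))
```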

11.3.4 Weaknesses of the local encoding

The local encoding protocol given above performs well in emulating a depolarizing channel. However, there are far more general channels, and the protocol may not work well, or at all, in other cases. If we have a channel that can be written

    ρ → N(ρ) = (1 − p_T − p_E) I(ρ) + p_T T(ρ) + p_E E(ρ),    (11.8)

where E is an arbitrary error operation, we can still use the previous protocols to hide approximately pT N stego bits or qubits while generating pE N random errors of type E. But for some channels, pT may be very small or zero. Furthermore, hiding stego qubits locally sacrifices some potential extra secret information that could be transmitted. That is, the location of the errors—the choice of the subset holding the errors—can also be used to convey information, potentially increasing the steganographic communication rate and reducing the amount of secret key or shared entanglement required. A different approach is instead to encode information in the error syndromes.

11.4 Steganographic encoding in error syndromes

11.4.1 The encoding and decoding procedure

For simplicity, consider the case where N is large. In this case, it suffices to consider only typical errors [24,25]. First, consider the case where the physical channel is noise-free. For large N, almost all (with probability 1 − ε) combinations of errors on the individual qubits will correspond to one of the set of typical errors. There are roughly 2^{sN} of these, and their probabilities p_e are all bounded within the range 2^{−N(s+δ)} ≤ p_e ≤ 2^{−N(s−δ)}. The number s is the entropy of the channel on one qubit; for the BSC, s = h(p) = −p log2 p − (1 − p) log2(1 − p), and for the DC, s = −(1 − p) log2(1 − p) − p log2(p/3). For more general error models, the one-qubit entropy s may not be well defined; this possibility will be addressed in Section 11.8. The typical error operators are labeled E_0, E_1, ..., E_{2^{sN}−1}, and their corresponding probabilities are p_j:

    ρ → N^{⊗N}(ρ) ≈ \sum_j p_j E_j ρ E_j^†.    (11.9)


A good choice of QECC for the covertext will be able to correct all these errors. (If Alice and Bob used a QECC that was not strong enough for the supposed error model of the channel, then that might be grounds for suspicion for Eve.) Assume that they use an {N, k_c, d} QECC, where d is large enough to correct all the typical errors, and make the further simplifying assumption that the QECC is nondegenerate, so each typical error E_j has a distinct error syndrome labeled s_j. Now we can see how it is possible to construct a steganographic encoding that will simulate the errors in this channel. Before the protocol begins, Alice and Bob partition the typical errors into C roughly equiprobable sets S_k, so that

    \sum_{E_j ∈ S_k} p_j ≈ 1/C,  ∀k.    (11.10)

As far as possible, the errors in a given set should be chosen to have equal probabilities. The maximum of C is roughly C ≈ 2^{N(s−δ)} for some small δ > 0, and k = 0, ..., C − 1. This partition enables a new quantum steganographic protocol, using error syndromes to store information (rather than storing it locally).

1. Alice prepares the k_c qubits of the covertext in a state ρ_c = |ψ_c⟩⟨ψ_c|.

2. Alice's secret message is a string of M = log2 C ≈ N(s − δ) qubits in a state

    |ψ_s⟩ = \sum_{k=0}^{C−1} α_k |k⟩.    (11.11)

She "twirls" each qubit of this string using 2N(s − δ) bits of the secret key or shared ebits to get a maximally mixed state. To this, she appends N − k_c − (s − δ)N extra ancilla qubits in the state |0⟩ to make up a total register of N − k_c qubits. (Note that, as discussed before, if the message qubits have been compressed so that they are already close to maximally mixed after averaging over all messages, then this step can be omitted.)

3. Using the shared secret key, Alice chooses from each set S_k a typical error E_{j_k} with syndrome s_{j_k}. She applies a unitary U_S to the register of N − k_c qubits that maps

    U_S (|k⟩ ⊗ |0⟩^{⊗(N−k_c−(s−δ)N)}) = |s_{j_k}⟩.

She appends this register to the cover qubits in state |ψ_c⟩ and then applies the encoding unitary U_E. Averaging over the secret key, the resulting state will appear to Eve like the noisy state given in Eq. (11.9), which is effectively indistinguishable from the simulated channel acting on the encoded covertext.

4. Alice sends this codeword to Bob. If Eve examines its syndrome, then she will find a typical error for the channel being emulated.

5. Bob applies the decoding unitary U_D = U_E^† and then applies U_S^† (which he knows from the shared secret key). He discards the covertext and the last N − k_c − (s − δ)N ancilla qubits and undoes the twirling operation on the remaining qubits, again using the secret key. If Eve has not measured the qubits, then he will have recovered the state encoded by Alice.


FIGURE 11.2 Encoding in error syndromes. The scheme for hiding secret message qubits as error syndromes in a quantum codeword while emulating an underlying noisy channel (like the depolarizing channel). Alice encodes the secret message qubits together with the cover state as a quantum codeword with typical errors from the expected channel, using the secret key shared with Bob. They are sent through the channel (which is actually noiseless). Bob can decode the secret message qubits and the cover state with the secret key. If Eve looks at the codeword, then it will look like an innocent codeword that has passed through the expected channel.

The procedure is illustrated schematically in Fig. 11.2. This protocol may easily be used to send classical information by using a single basis state, rather than a superposition as in (11.11). So the protocol works for transmitting either classical or quantum information.

11.4.2 Communication and key usage rates

The steganographic transmission rate R is roughly R ≈ s − δ → s. The rate of transmission s is higher than the rate 4p/3 of the local encoding. This protocol used 2N(s − δ) bits of secret key (or ebits) for twirling in step 2 and roughly Nδ bits of secret key in choosing representative errors E_{j_k} from each set S_k in step 3. So the key rate is roughly K ≈ 2s − δ → 2s, better than the first protocol in key usage per stego qubit transmitted. Since almost all the key usage goes to the twirling operation, for sources that are maximally mixed on average, the rate of key usage can actually go to zero as N → ∞.

In the previous analysis, we assumed that the actual physical channel was noiseless, and that Eve believed that the channel was noisy because she had been systematically misled by Alice and Bob deliberately adding noise to all their messages. Needless to say, this is not a realistic assumption: real channels will generally have noise. Whether encoding into error syndromes is possible is much less obvious in the case where the channel contains intrinsic noise. It turns out that it is, indeed, possible; this case will be considered in Section 11.9. The next two sections will first give two examples of quantum steganographic encoding in the noiseless case.

11.5 Encoding in the binary symmetric channel

In this section, we will first look at a simple classical example of an n-bit repetition code and the binary symmetric channel (BSC). The repetition code is the conceptually simplest error-correcting code; a single bit 0 or 1 is encoded by being repeated n times: 0 → 00···0, 1 → 11···1. If one of these codewords is sent through the BSC, then each bit independently


has a probability p of being flipped 0 ↔ 1 and a probability 1 − p of being left unchanged. Alice and Bob want to communicate steganographically by applying bit-flips to codewords in such a way that the statistics of their messages matches the statistics that Eve would expect from a BSC with error probability p. (The actual channel is noiseless.)

Assume that n ≫ 1 and the average number of errors is pN ≫ 1, but p is significantly smaller than 1/2. Then the typical errors will have a number of bit-flips w (the weight of the error) lying in the range N(p − δ) ≤ w ≤ N(p + δ). There is some arbitrariness in the choice of δ, but since w has a binomial distribution, it makes sense to choose δ = D\sqrt{p(1 − p)/N}, where D is a coefficient of O(1) and \sqrt{Np(1 − p)} is the standard deviation of w.

If there are C possible messages, then Alice and Bob must agree on C distinct codewords. They can do this by randomly choosing C errors from the typical errors. If this is a single-shot communication, then Alice and Bob can make this choice ahead of time. But if they are going to send messages repeatedly, then at each step they should choose a new set of random errors using their shared secret key to make the choice, so they always know which set of errors is being used.

After that, the procedure is straightforward. Alice prepares a codeword. If she wishes to send message m to Bob, then she applies the typical error corresponding to m to the codeword before sending it to Bob. When Bob receives the codeword, he measures the error syndrome to find out which error was applied and translates that back to m. If Eve looks at the codeword being transmitted, then it will look exactly like a codeword that has passed through a BSC. Averaging over all messages m and all ways of choosing the random errors, the error statistics match the statistics of the BSC with error rate p.

This protocol can be turned straightforwardly into a quantum protocol. The error-free codewords become quantum states |00···0⟩ and |11···1⟩, and the covertext qubit is encoded in the state |ψ_c⟩ = α|00···0⟩ + β|11···1⟩. The BSC becomes the quantum channel N^{⊗N}, where N acting on a single qubit is given in Eq. (11.1). The errors of weight w are operators E_m that act as X on those qubits that are flipped and as I on the ones that are not. The major difference from the classical case is that Alice can now send superpositions of the messages m. If she wants to transmit a superposition state, then she just does the encoding

    |ψ⟩ = \sum_m α_m |m⟩ → \sum_m α_m E_m |ψ_c⟩.    (11.12)

Consider now a fully quantum example, which shows how such an encoding can be done.
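The classical half of this protocol is simple enough to simulate directly. The sketch below uses hypothetical parameters (a 20-bit repetition codeword, p = 0.2, and C = 8 messages) and a seeded random generator standing in for the shared secret key; Bob recovers the message index from the pattern of flipped bits.

```python
import random

def stego_send(message_index, codeword, typical_errors):
    """Alice flips the bits of the codeword given by the typical error
    assigned (via the shared key) to her message."""
    flips = typical_errors[message_index]
    return [bit ^ (i in flips) for i, bit in enumerate(codeword)]

def stego_receive(received, codeword, typical_errors):
    """Bob compares against the error-free codeword (equivalently, he measures
    the repetition-code syndrome) and looks up the message index."""
    flips = frozenset(i for i, (a, b) in enumerate(zip(received, codeword)) if a != b)
    return typical_errors.index(flips)

# Hypothetical parameters: 20-bit codeword, BSC rate p = 0.2, C = 8 messages,
# each assigned one distinct typical (weight-4) error string.
N, p, C = 20, 0.2, 8
rng = random.Random(1234)          # stands in for the shared secret key
codeword = [0] * N                 # the encoded covertext bit "0"
typical = []
while len(typical) < C:            # choose C distinct typical errors
    err = frozenset(rng.sample(range(N), int(p * N)))
    if err not in typical:
        typical.append(err)

sent = stego_send(5, codeword, typical)
print(stego_receive(sent, codeword, typical))   # recovers message 5
```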

11.6 Encoding in the 5-qubit "perfect" code

This encoding was worked out in detail in [12]. In this example, Alice uses the {5, 1, 3} code (the "perfect code") to encode her stego qubits in the syndromes of the code. The perfect code encodes one logical qubit into five physical qubits and is the smallest quantum error-correcting code that can correct an arbitrary single-qubit error [26]. First consider the following problem: Alice wishes to embed secret quantum information in a message


FIGURE 11.3 Encoding circuit for the perfect code. The H gate is a Hadamard gate, and the two-qubit gates are controlled-NOT (CNOT) gates. The first four qubits are ancilla qubits, and the fifth qubit |ψc  is the cover qubit.

Table 11.1 Stabilizer generators for the perfect code. Stabilizer generators g1, ..., g4, and logical operators X̄ and Z̄ for the perfect code.

g1:  X Z Z X I
g2:  I X Z Z X
g3:  X I X Z Z
g4:  Z X I X Z
X̄:  X X X X X
Z̄:  Z Z Z Z Z

to Bob by disguising it as an error in a single codeword of the {5, 1, 3} code. In this section, we will demonstrate encodings that allow up to four qubits to be transmitted, disguised as either a single (weight one) error or a double (weight two) error. We will then treat the problem of encoding quantum information in a sequence of transmitted codewords, such that the error statistics match those of a depolarizing channel. The perfect code is nondegenerate: each single-qubit Pauli error is mapped to a unique syndrome. The code is a stabilizer code with n − k = 5 − 1 = 4 stabilizer generators; they are listed in Table 11.1 along with the logical operators. Alice and Bob can hide up to four qubits of information in this code, which they send over a channel that will look to Eve like a depolarizing channel. Fig. 11.3 shows the encoding unitary circuit for the perfect code.
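Table 11.2 (below) can be reproduced from the generators in Table 11.1 by checking, for each single-qubit Pauli error, which generators it anticommutes with. The sketch below assumes the syndrome bits are ordered (g1, g2, g3, g4); with that convention it reproduces the syndrome assignments listed in Table 11.2.

```python
from itertools import product

# Stabilizer generators of the {5,1,3} code (Table 11.1), one string per generator.
GENERATORS = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]

def anticommute(a, b):
    """1 if the single-qubit Paulis a and b anticommute, else 0."""
    return int(a != "I" and b != "I" and a != b)

def syndrome(error):
    """Syndrome of a 5-qubit Pauli string: one bit per generator, set when the
    error anticommutes with that generator."""
    return "".join(
        str(sum(anticommute(e, g) for e, g in zip(error, gen)) % 2)
        for gen in GENERATORS
    )

# Reproduce the single-qubit rows of Table 11.2.
for pauli, qubit in product("XYZ", range(5)):
    err = "".join(pauli if i == qubit else "I" for i in range(5))
    print(err, syndrome(err))
```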

11.6.1 Encoding with one-qubit errors

The perfect code has a total of fifteen single-qubit errors X_1, X_2, ..., X_5, Z_1, Z_2, ..., Z_5, and Y_1, Y_2, ..., Y_5. Together with the "no error" operator IIIII, these sixteen operators have sixteen distinct syndromes, which are listed in Table 11.2. Alice can encode four steganographic qubits into the one-qubit error syndromes of the code. The protocol is somewhat involved because Alice can hide quantum information, not just classical information, and therefore one must take care not to disturb the qubits to be hidden. Label Alice's subsystem that contains the five-qubit codeword as A, and the subsystem holding the stego qubits (before encoding) as S. Using the classical secret key


Table 11.2 Syndrome table of the perfect code. A list of all the one-qubit Pauli errors and their corresponding syndromes for the perfect code. The error operators are ordered by syndrome values.

Error    Syndrome      Error    Syndrome
IIIII    0000          XIIII    0001
IIZII    0010          IIIIX    0011
IIIIZ    0100          IZIII    0101
IIIXI    0110          IIIIY    0111
IXIII    1000          IIIZI    1001
ZIIII    1010          YIIII    1011
IIXII    1100          IYIII    1101
IIYII    1110          IIIYI    1111

she shares with Bob, Alice first applies a "twirling" operation to each of the four stego qubits, randomly applying I, X, Y, or Z to each qubit. To do this to four qubits requires 8 bits of secret key. (Note once again that if Alice's stego qubits are drawn from a compressed source that looks like the maximally mixed state on average, then twirling is not necessary.) The combined A and S subsystems, after twirling but before encoding, are in the state

    ρ^{SA} = (1/16) \sum_{k=0}^{15} |k⟩_S⟨k| ⊗ |0000⟩_A⟨0000| ⊗ |ψ_c⟩_A⟨ψ_c|,    (11.13)

where |ψ_c⟩ is the one-qubit covertext state. Denote the ket states |k⟩, k = 0, ..., 15, in place of their binary representations. Alice now applies the SWAP unitary U_SWAP = U_2 U_1, where these two unitaries are defined as

    U_1 ≡ \sum_{i=0}^{15} |i⟩_S⟨i| ⊗ E_i^A ⊗ O_i^A,    (11.14)

    U_2 ≡ \sum_{j=0}^{15} (X^j)_S ⊗ |j⟩_A⟨j| ⊗ I_C^A,    (11.15)

where E_i and O_i are the error operators listed in Table 11.3, ordered from top to bottom. The operators E_i ⊗ O_i transform under the encoding unitary of the five-qubit code into the correctable one-qubit errors of Table 11.2. Note that the E_i operators act on the ancilla qubits so that E_i|0⟩ ∝ |i⟩. In Eq. (11.15), X^j is a shorthand notation for X^{j_1} ⊗ X^{j_2} ⊗ X^{j_3} ⊗ X^{j_4}, where j_1 ··· j_4 is the binary expression for j. After applying the unitary U_SWAP, the system


Table 11.3 Encoded error operators of the perfect code. This table lists, for each of the one-qubit Pauli errors, the corresponding operators E_i that act on the syndrome space and O_i that act on the logical qubit.

E_i      O_i    Encoded Error    Syndrome i    Syndrome Label
IIII     I      IIIII            0000          s0
XIII     I      XIIII            0001          s1
IXZZ     X      IIZII            0010          s2
XIZI     X      IIIIX            0011          s3
IIZX     Z      IIIIZ            0100          s4
XZZX     I      IZIII            0101          s5
IIZX     I      IIIXI            0110          s6
XIIX     Y      IIIIY            0111          s7
IXIII    I      IXIII            1000          s8
IXZZ     X      IIIZI            1001          s9
ZXIX     Z      ZIIII            1010          s10
YXIX     Z      YIIII            1011          s11
IIZI     I      IIXII            1100          s12
XYZX     I      IYIII            1101          s13
−XIYZ    Z      IIYII            1110          s14
YXIX     Z      YIIII            1111          s15

is left in the state

    ρ′ = (1/16) \sum_{k=0}^{15} |0000⟩_S⟨0000| ⊗ |k⟩_A⟨k| ⊗ O_k|ψ_c⟩_A⟨ψ_c|O_k.    (11.16)

Note that the original stego qubits are all now in the state |0⟩. The quantum information they contained has been swapped into the syndrome subsystem. Alice applies the encoding unitary U_E to subsystem A and discards subsystem S. This encoding unitary U_E is shown as a quantum circuit in Fig. 11.3. The resulting codeword must look plausible to Eve. In this case, Alice tries to match the depolarizing channel from Eq. (11.2). The encoded, error-free state would be

    ρ_c = U_E (|0000⟩⟨0000| ⊗ |ψ_c⟩⟨ψ_c|) U_E^†.    (11.17)

By swapping the stego qubits into the syndrome space as described before, Alice has effectively prepared the state

    ρ_c′ = (1/16) \sum_{k=0}^{15} E_k ρ_c E_k,    (11.18)

where the Ek are the single-qubit errors from Table 11.2 (including the “no error” operator E0 = I I I I I ). To plausibly use the five-qubit code, the error rate p must be sufficiently low


so that it is rare for more than one single-qubit error to occur. So if Eve intercepts this state and checks the error syndromes, then the result should look reasonable for a codeword that has passed through the depolarizing channel.

11.6.2 Two error encodings

The perfect code can correct any error on a single qubit of its codeword. So long as the error rate p is sufficiently low, this code transmits quantum information correctly most of the time. But at least occasionally (at a rate that scales like p^2) an error of weight two or higher will occur. These errors are uncorrectable, but they will happen from time to time. It therefore makes sense to consider steganographic encodings that correspond to errors of weight two, just as was done in the previous subsection for errors of weight one.

Consider the set of all two-qubit errors. There are ninety such errors. These ninety two-qubit errors naturally divide into six sets of fifteen errors each, where each set has one error corresponding to each of the fifteen nonzero syndromes. By also including the "no error" operator, each set corresponds to an encoding that can transmit four stego qubits. These six sets are listed in Table 11.4. When Alice sends four stego qubits to Bob, she must use all sixteen distinct syndromes. A single row of this table, spanning all sixteen syndromes, corresponds to a single encoding. Each encoding proceeds exactly as described before for the single-error case, except that the operators E_i and O_i from Table 11.4 are used in Eq. (11.14) instead of those from Table 11.3.

In a quite analogous way, we could find encodings corresponding to errors of weight three or more. However, Alice and Bob would only plausibly use the {5, 1, 3} code if the error rate p is sufficiently low as to make three-qubit errors very unlikely, and the number of possible encodings is very high, so they are omitted. In any case, we can find those encodings by a procedure exactly similar to that used for the single-error and double-error encodings above.

11.6.3 Rate of secret qubit transmission

The encodings given in the previous two subsections assumed a one-shot protocol, in which steganographic qubits were encoded in a single 5-qubit codeword. They would not do if Alice wants to send many messages to Bob. It would quickly look suspicious if almost every codeword had one or two single-qubit errors. For longer messages, most of the blocks should have no errors; from time to time a block should have a single error, and occasionally a block will have two errors (which is an uncorrectable error). The pattern of errors should match the statistics of the quantum channel they are emulating. This subsection assumes a depolarizing channel with error probability p, but a similar analysis can be done for other channels.

The most straightforward approach for Alice is to vary which encoding she uses (if any) from block to block, where the choice for each block is determined by the shared secret key. Suppose Alice has some number of secret qubits that she wishes to transmit to Bob. She sends a succession of five-qubit codewords to Bob, each encoding an innocuous cover


Table 11.4 Table of double-qubit errors. The table represents a total of six different encodings, where each encoding uses sixteen distinct error operators and their corresponding distinct syndromes; for each syndrome value i it lists the operators E_i ⊗ O_i and the two-qubit encoded errors they produce. The error operators in bold are an example of a single encoding.

qubit in the {5, 1, 3} code. Based on the secret key, she selects a subset of codewords to contain "errors" and encodes four qubits in each of those codewords, using one of the encodings from the previous subsections. From his copy of the secret key, Bob knows which codewords contain the hidden qubits and which encoding was used, so he can extract them when he receives them. This is a more sophisticated version of the local encoding described in Sec. 11.3.

We would like to maximize the number of stego qubits that Alice can send to Bob under the constraint that Alice's encoding scheme should match the probability distribution of errors in the depolarizing channel. For a 5-qubit codeword, the probability of no errors is p_0 = (1 − p)^5; there are 15 equally likely one-qubit errors, with total probability p_1 = 5p(1 − p)^4, and 90 equally likely two-qubit errors, with total probability p_2 = 10p^2(1 − p)^3. There are of course three-, four-, and five-qubit errors, but it is very unlikely that such errors will occur, and so they are omitted.

For each 5-qubit block, there are three distinct cases. Case 0 uses no encoding and transmits no hidden information. This always has the syndrome s_0 ≡ 0000 in Table 11.3. Case 1 corresponds to all the single-qubit errors, plus no error, with equal probability. These are the syndromes s_0 ≡ 0000 to s_15 ≡ 1111 in the same table. This is the encoding described in Subsection 11.6.1. This case includes a single encoding and transmits four stego qubits. Case 2 corresponds to the set of all two-qubit errors. As described in Subsection 11.6.2, there are six such encodings, each of which hides four qubits, and each encoding should be used with equal probability. These encodings are summarized in Table 11.4.

We can now solve for how often each case should be used to match the channel statistics. Let Q_0, Q_1, and Q_2 be the fractions of times Alice uses each of the three cases. The channel distribution constraints are

    p_0 = (1 − p)^5 = Q_0 + (1/16)(Q_1 + Q_2),    (11.19)


    p_1 = 5p(1 − p)^4 = (15/16) Q_1,    (11.20)

    p_2 = 10p^2(1 − p)^3 = (15/16) Q_2,    (11.21)

which can be solved to get

    Q_0 = p_0 − (p_1 + p_2)/15,
    Q_1 = (16/15) p_1,    (11.22)
    Q_2 = (16/15) p_2.

Note that these numbers do not quite add up to 1, because errors of weight three or more have been neglected. For small p, they will come close, and the fractions can be renormalized; or if Alice and Bob are going to do a large enough amount of secret communication that 3-qubit and higher-weight errors should not be neglected, then more encodings can be worked out for those cases.

The average number of steganographic qubits per channel use that Alice can send to Bob under the above constraints is

    R_{5-qubit} = 4(Q_1 + Q_2)/5 = (64/75)(p_1 + p_2) = (64/15) p (1 − p^2)(1 − p)^2.    (11.23)
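The block fractions of Eqs. (11.19)-(11.22) and the rate of Eq. (11.23) are easy to evaluate numerically; the sketch below uses an illustrative error rate p = 0.01.

```python
def block_fractions(p):
    """Fractions of 5-qubit blocks using each case, Eqs. (11.19)-(11.22)."""
    p0 = (1 - p) ** 5
    p1 = 5 * p * (1 - p) ** 4
    p2 = 10 * p ** 2 * (1 - p) ** 3
    Q0 = p0 - (p1 + p2) / 15
    Q1 = (16 / 15) * p1
    Q2 = (16 / 15) * p2
    return Q0, Q1, Q2

def stego_rate_5qubit(p):
    """Secret qubits per channel use, Eq. (11.23)."""
    _, Q1, Q2 = block_fractions(p)
    return 4 * (Q1 + Q2) / 5

p = 0.01
print(block_fractions(p))
# the two forms of Eq. (11.23) agree:
print(stego_rate_5qubit(p), (64 / 15) * p * (1 - p ** 2) * (1 - p) ** 2)
```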

11.6.4 Comparison to encoding across blocks

Steganographic communication rate

If one compares the above encoding scheme to the local encoding described in Sec. 11.3, the 5-qubit code scheme transmits secret qubits at an asymptotic rate of R_{5-qubit} ≈ (64/15)p (to leading order), whereas the purely local encoding transmits secret qubits at an asymptotic rate R_local ≈ p. This is a more-than-fourfold improvement in rate, which shows the advantage of using syndromes for encoding. However, both rates scale linearly in p.

In principle, it is possible to do much better. Instead of treating each five-qubit codeword as a separate encoding problem, Alice and Bob can treat an entire sequence of M codewords as one big code block of size N = 5M. This large codeword will have error syndromes that are sequences of the error syndromes of the individual blocks. Alice and Bob can choose a selection of typical sequences of error syndromes as their codewords and encode their hidden qubits in these nonlocal states. This scheme has an asymptotic rate of R_syndrome ≈ s(p) = −p log2(p/3) − (1 − p) log2(1 − p) secret qubits per channel use, which is much better than O(p). In the 5-qubit encoding scheme, the information is stored nonlocally within each block, but locally in the entire sequence of M blocks. Again, this encoding does not make use of the error positions to convey information. For the case where the underlying physical channel is actually noiseless, encoding in typical error sequences is clearly superior.


Key usage rate

There are similar advantages in key usage. For purposes of comparison, ignore the secret key used in twirling and consider only the secret key consumed by the encoding process itself. In the purely local encoding the asymptotic key consumption rate per channel use is

    K_local = lim_{N→∞} (1/N) log2 \binom{N}{pN} = −p log2 p − (1 − p) log2(1 − p) = h(p).    (11.24)

For the encoding into the syndromes of local 5-qubit blocks, the asymptotic rate of key consumption per channel use becomes

    K_{5-qubit} = −(1/5) [Q_0 log2 Q_0 + Q_1 log2 Q_1 + Q_2 log2(Q_2/6)],    (11.25)

which is the asymptotic amount of information needed to choose the encodings for each 5-qubit block, divided by 5. For small p, this is approximately (1/5)h(16p/3) (since the two-qubit encodings are rarely used), so the scaling is similar to the local encoding, though the absolute level is lower. By contrast, the key consumption rate K_syndrome (exclusive of twirling) for the syndrome encoding across blocks goes to zero asymptotically in the noiseless case. The advantage to this encoding is obvious. (In Section 11.9, we will show that the advantage is not so stark in the case where the underlying channel is noisy, but still there.)
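For comparison, the key consumption rates of Eqs. (11.24) and (11.25) can be evaluated side by side; again p = 0.01 is only an illustrative choice.

```python
from math import log2

def h(x):
    """Binary entropy."""
    return -x * log2(x) - (1 - x) * log2(1 - x)

def key_rate_5qubit(p):
    """Key bits per channel use for the 5-qubit block-syndrome encoding,
    Eq. (11.25), using the block fractions of Eqs. (11.19)-(11.22)."""
    p0 = (1 - p) ** 5
    p1 = 5 * p * (1 - p) ** 4
    p2 = 10 * p ** 2 * (1 - p) ** 3
    Q0 = p0 - (p1 + p2) / 15
    Q1 = (16 / 15) * p1
    Q2 = (16 / 15) * p2
    return -(Q0 * log2(Q0) + Q1 * log2(Q1) + Q2 * log2(Q2 / 6)) / 5

p = 0.01
print(h(p))                 # purely local encoding, Eq. (11.24)
print(key_rate_5qubit(p))   # 5-qubit block encoding, Eq. (11.25)
print(h(16 * p / 3) / 5)    # small-p approximation quoted in the text
```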

11.7 Secrecy and security

For steganographic protocols, we must make a distinction between secrecy (the indistinguishability of the message with hidden information from an innocent message) and security (the inability of the eavesdropper to read the hidden message even if she knows it is there). The conditions for security are the same as in ordinary cryptography, and quantifying them is well understood. Twirling, which consumes two shared key bits per transmitted steganographic qubit, guarantees security for hidden quantum information. This is the quantum equivalent of a one-time pad and is optimal if perfect security is required [22].

What is the quantitative condition for secrecy in a quantum steganography protocol? The most important question is: can Alice and Bob avoid arousing Eve's suspicions that they are transmitting secret messages? To do this, the messages that Alice sends must emulate as closely as possible the channel that Eve expects. This requirement can be made quantitative. Let E_C be the channel on N qubits that Eve expects, and let E_S be the effective channel that Alice and Bob produce with their steganographic protocol. The protocol is ε-secret if E_S is ε-close to E_C in diamond norm distance for some small ε > 0:

    ‖E_S − E_C‖_⋄ ≤ ε.    (11.26)

The diamond norm is directly related to the probability for Eve to distinguish EC from ES under ideal circumstances (i.e., when she controls both inputs and outputs) [27] and so


puts an upper bound on her ability to distinguish them in practice. The optimal probability to correctly distinguish two channels E_S and E_C is

    P_opt = 1/2 + (1/4) ‖E_S − E_C‖_⋄.    (11.27)

The diamond norm distance can therefore be used as a quality measure for the "innocence" of the quantum message from Alice to Bob: if Eve cannot distinguish a channel containing the steganographic message from the channel that she expects, then the steganographic encoding satisfies the secrecy criterion.

The diamond norm is defined as follows [28]. Let N : L(V) → L(W) be an arbitrary superoperator, where L(V) is the space of linear operators on a Hilbert space V. The diamond norm of N is

    ‖N‖_⋄ ≡ ‖I_{L(V)} ⊗ N‖_tr,    (11.28)

where ‖N‖_tr is defined as

    ‖N‖_tr ≡ max_ρ { ‖N(ρ)‖_tr : ρ ∈ L(V), ‖ρ‖_tr = 1 }.    (11.29)

The maximization in (11.29) is over all density matrices ρ. (When the Hilbert space is infinite dimensional, we take the supremum instead of the maximum.) In principle, the diamond norm distance can be calculated for any channel, but the calculation is not necessarily simple. However, for certain standard channels, such as the BSC and DC, the calculation can be done in closed form.
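The single-qubit BSC case worked out below (Eq. (11.33)) can also be checked numerically by applying the two channels to the test state χ ⊗ |0⟩⟨0| and taking the trace norm of the difference. The sketch below picks χ = |+⟩⟨+| and hypothetical rates p = 0.10, r = 0.12.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def trace_norm(A):
    """Trace norm = sum of singular values."""
    return np.sum(np.linalg.svd(A, compute_uv=False))

def id_tensor_bsc(rho, p):
    """(I (x) N_p) acting on a two-qubit state, with the BSC of Eq. (11.1)
    applied to the second qubit."""
    IX = np.kron(I2, X)
    return (1 - p) * rho + p * (IX @ rho @ IX)

# Test input chi (x) |0><0|, with chi = |+><+| as an example.
chi = 0.5 * np.ones((2, 2), dtype=complex)
rho_in = np.kron(chi, np.diag([1.0, 0.0]).astype(complex))

p, r = 0.10, 0.12
diff = id_tensor_bsc(rho_in, r) - id_tensor_bsc(rho_in, p)
print(trace_norm(diff), 2 * (r - p))   # both ~0.04, matching Eq. (11.33)
```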

11.7.1 Diamond norm distance for the binary symmetric channel

Compare two BSCs, N_p with error rate p (probability to flip each bit) and N_r with error rate r = p + δp. The BSC is given by Eq. (11.1). Assume that 0 < p < r < 1/2. In the context of steganography, p could be the error rate of the underlying channel, and δp could represent the extra noise added by the encoded secret message. A nice property of BSCs is that they form a one-parameter family that is closed under composition, so N_p ∘ N_q = N_q ∘ N_p = N_{p+q−2pq}. Alice and Bob would be adding errors with a rate q = δp/(1 − 2p). The difference of the two channels can now be expressed as

    (N_r − N_p)(ρ) = (p − r)ρ + (r − p)XρX = δp(−ρ + XρX).    (11.30)

The diamond norm of the difference of the channels N_p and N_r is

    ‖N_r − N_p‖_⋄ = max_ρ ‖(I ⊗ (N_r − N_p))(ρ)‖_tr    (11.31)
                  = (r − p) × max_ρ ‖(I ⊗ I)ρ(I ⊗ I) − (I ⊗ X)ρ(I ⊗ X)‖_tr.    (11.32)


The maximum is achieved by substituting ρ = χ ⊗ |0⟩⟨0| (where χ is an arbitrary density operator) into the above equation:

    ‖N_r − N_p‖_⋄ = (r − p) ‖χ ⊗ |0⟩⟨0| − χ ⊗ |1⟩⟨1|‖_tr
                  ≤ (r − p) (‖χ‖_tr ‖|0⟩⟨0|‖_tr + ‖χ‖_tr ‖|1⟩⟨1|‖_tr)
                  = (r − p)(1 + 1) = 2(r − p) = 2δp.    (11.33)

The second line of Eq. (11.33) uses the triangle inequality and the fact that for any two operators A and B,

    ‖A ⊗ B‖_tr = ‖A‖_tr ‖B‖_tr.    (11.34)

In fact, this bound is tight because the two terms are orthogonal to each other. So from Eq. (11.27) and Eq. (11.33) we see that, using the channel on a single qubit, the optimal probability to distinguish the two BSCs is

    P_opt = (1/2)(1 + δp).    (11.35)

Two uses of the channel N_p on two qubits give the map

    (N_p ⊗ N_p)(ρ) = (1 − p)^2 ρ + p(1 − p) X_1ρX_1 + p(1 − p) X_2ρX_2 + p^2 X_1X_2 ρ X_1X_2,    (11.36)

where X_1 ≡ X ⊗ I, X_2 ≡ I ⊗ X, and X_1X_2 ≡ X ⊗ X. There is a similar expression for N_r ⊗ N_r. Then the difference between the two channels is

    (N_r ⊗ N_r − N_p ⊗ N_p)(ρ) = (r^2 − 2r + 2p − p^2)ρ + (r − r^2 − p + p^2)(X_1ρX_1 + X_2ρX_2) + (r^2 − p^2) X_1X_2 ρ X_1X_2.    (11.37)

The diamond norm of the difference between two BSCs on two qubits is

    ‖N_r ⊗ N_r − N_p ⊗ N_p‖_⋄ = max_ρ ‖(I ⊗ (N_r ⊗ N_r − N_p ⊗ N_p))(ρ)‖_tr.    (11.38)

A construction similar to the single-qubit case maximizes the right-hand side of (11.38): substituting ρ = χ ⊗ |00⟩⟨00| into (11.38) yields

    ‖N_r ⊗ N_r − N_p ⊗ N_p‖_⋄ = |(1 − r)^2 − (1 − p)^2| + 2|r(1 − r) − p(1 − p)| + |r^2 − p^2|
                              = 2(r − p)(2 − r − p) = 2δp(2 − 2p − δp),    (11.39)


where in the second line of Eq. (11.39) we used the constraints 0 < p < r < 1/2. So in the double-qubit case,

    P_opt = 1/2 + (1/2) δp(2 − 2p − δp).    (11.40)

Examining Eq. (11.39) carefully reveals that the terms are distributed binomially. Generalizing to the N-qubit case, the state ρ = χ ⊗ |00···0⟩⟨00···0| maximizes the diamond norm for N uses of the BSC:

    ‖N_r^{⊗N} − N_p^{⊗N}‖_⋄ = \sum_{j=0}^{N} \binom{N}{j} |r^j (1 − r)^{N−j} − p^j (1 − p)^{N−j}|.    (11.41)
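Eq. (11.41) is straightforward to evaluate. The sketch below computes the diamond norm distance and Eve's optimal distinguishing probability (Eq. (11.27)) for a few values of N, using illustrative rates p = 0.10 and δp = 0.01.

```python
from math import comb

def diamond_distance_bsc(N, p, r):
    """|| N_r^(tensor N) - N_p^(tensor N) ||_diamond for two BSCs, Eq. (11.41)."""
    return sum(
        comb(N, j) * abs(r ** j * (1 - r) ** (N - j) - p ** j * (1 - p) ** (N - j))
        for j in range(N + 1)
    )

p, dp = 0.10, 0.01
for N in (1, 10, 100, 1000):
    d = diamond_distance_bsc(N, p, p + dp)
    print(N, d, 0.5 + d / 4)   # distance and Eve's optimal success probability
```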

11.7.2 Diamond norm distance for the depolarizing channel

The calculation of the diamond norm of the difference between N uses of two depolarizing channels (DC) is similar to the calculation for the BSC in the previous subsection. The expression for the channel is given by Eq. (11.2), and this is compared to another depolarizing channel with higher rate r = p + δp. As in the BSC case, assume that 0 < p < r < 1/2. Just as for the BSC, depolarizing channels form a one-parameter family that is closed under composition. This is easier to see if we rewrite the channel in the form (used in Eq. (11.4))

    N_p(ρ) = (1 − p′)ρ + p′ (I/2),    (11.42)

where p′ = (4/3)p. Then successively applying two maps of the form (11.42) with parameters p′ and q′ is the same as applying one map with parameter p′ + q′ − p′q′. Translating this back to the original form in Eq. (11.2) shows that N_p ∘ N_q = N_q ∘ N_p = N_{p+q−4pq/3}. The difference between two depolarizing maps is

    (N_r − N_p)(ρ) = (p − r)ρ + ((r − p)/3)(XρX + YρY + ZρZ) = δp(−ρ + (1/3)(XρX + YρY + ZρZ)).    (11.43)

To calculate the diamond norm, we must find a state of two qubits such that the trace norm of the map (I ⊗ (N_r − N_p))(ρ) is maximized. It is not too hard to see that this is achieved by the maximally entangled state Φ_+ = |Φ_+⟩⟨Φ_+|:

    ‖(I ⊗ (N_r − N_p))(Φ_+)‖_tr = (r − p) ‖−Φ_+ + (1/3)(Ψ_+ + Ψ_− + Φ_−)‖_tr
                                = (r − p) (‖Φ_+‖_tr + (1/3)(‖Ψ_+‖_tr + ‖Ψ_−‖_tr + ‖Φ_−‖_tr))
                                = (r − p)(1 + (1/3) × 3) = 2(r − p) = 2δp,    (11.44)


where Φ_± and Ψ_± denote the four states of the Bell basis:

    |Φ_±⟩ = (1/√2)(|00⟩ ± |11⟩),   |Ψ_±⟩ = (1/√2)(|01⟩ ± |10⟩),    (11.45)

and the second line of Eq. (11.44) follows because all four terms are orthogonal to each other. Note that this distance is exactly the same as for two BSCs. For the case N = 2, the difference between the two depolarizing channels is

    (N_r ⊗ N_r − N_p ⊗ N_p)(ρ) = ((1 − r)^2 − (1 − p)^2)ρ + ((1 − r)(r/3) − (1 − p)(p/3))(X_1ρX_1 + ··· + Z_2ρZ_2) + ((r/3)^2 − (p/3)^2)(X_1X_2 ρ X_1X_2 + ··· + Z_1Z_2 ρ Z_1Z_2).    (11.46)

The density matrix that maximizes the trace norm in this case is ρ = Φ_+ ⊗ Φ_+ ≡ Φ_+^{⊗2}, where each of the two depolarizing channels acts on one half of one of the entangled pairs of qubits. If we expand the state after applying the two maps, then it is not hard to see that all 16 terms produced are orthogonal, so the diamond norm becomes

    ‖N_r ⊗ N_r − N_p ⊗ N_p‖_⋄ = |(1 − r)^2 − (1 − p)^2| + 6|(1 − r)(r/3) − (1 − p)(p/3)| + 9|(r/3)^2 − (p/3)^2|
                              = |(1 − r)^2 − (1 − p)^2| + 2|(1 − r)r − (1 − p)p| + |r^2 − p^2|.    (11.47)

After evaluating the absolute value terms, we get

    ‖N_r ⊗ N_r − N_p ⊗ N_p‖_⋄ = 2(r − p)(2 − r − p) = 2δp(2 − 2p − δp),    (11.48)

which implies that

    P_opt = 1/2 + (1/2) δp(2 − 2p − δp).    (11.49)

For the general case of N uses of the depolarizing channel, the optimal state is Φ_+^{⊗N}, and the N depolarizing maps produce 4^N orthogonal terms. We can write the diamond norm as

    ‖N_r^{⊗N} − N_p^{⊗N}‖_⋄ = \sum_{n_0+n_1+n_2+n_3=N} \frac{N!}{n_0! n_1! n_2! n_3!} |(1 − r)^{n_0} (r/3)^{N−n_0} − (1 − p)^{n_0} (p/3)^{N−n_0}|
                            = \sum_{j=0}^{N} \binom{N}{j} 3^j |(r/3)^j (1 − r)^{N−j} − (p/3)^j (1 − p)^{N−j}|
                            = \sum_{j=0}^{N} \binom{N}{j} |r^j (1 − r)^{N−j} − p^j (1 − p)^{N−j}|,    (11.50)

which is exactly the same expression as for the BSC.

11.7.3 Conditions for secrecy

Then what are the conditions for secrecy for the BSC and DC? The simplest case is where Eve does not have an exact knowledge of the channel, either because the channel has not been fully characterized, or because (as suggested before) Alice and Bob have been systematically deceiving Eve by making the channel appear noisier than it really is. From the previous consideration, if the actual physical channel is a depolarizing channel with error rate p, but Eve believes that the error rate is p + δp, then Alice and Bob can effectively apply an extra DC with error rate q = δp/(1 − 4p/3). (For a BSC, it would be q = δp/(1 − 2p).) This extra noise represents the secret information they are hiding in the codeword.

Suppose, however, that Eve has exact knowledge of the channel. Can Alice and Bob still hide information in codewords? Using expression (11.50) for the diamond norm with r = p + δp, the secrecy requirement is

    ‖N_r^{⊗N} − N_p^{⊗N}‖_⋄ ≤ ε,    (11.51)

where ε > 0 is a small value. In the limit where δp is small compared to p, we can approximate this diamond distance by expanding Eq. (11.50) to leading order and using a Gaussian approximation:

    ‖N_r^{⊗N} − N_p^{⊗N}‖_⋄ ≈ δp \sqrt{\frac{N}{p(1 − p)}} \int_0^∞ \frac{2x}{\sqrt{2π}} e^{−x^2/2} dx = δp \sqrt{\frac{2N}{π p(1 − p)}}.    (11.52)

If δp ≤ ε \sqrt{p(1 − p)/N}, then this distance is less than ε. This result is already enough to show that it is possible (in principle) to transmit an arbitrarily large amount of hidden information with an arbitrarily low detection probability even if Eve exactly knows the channel. Recall from Section 11.3.2 that for the DC, the local encoding allows us to transmit an amount of secret information

    k_s = \frac{4N c_{hashing} δp}{3 − 4p} = \frac{4 c_{hashing} ε \sqrt{p(1 − p)}}{3 − 4p} \sqrt{N},    (11.53)

in the limit of large N, where c_hashing = 1 − h(p) − p log2 3 is the hashing bound for the DC, and the upper bound on δp from above has been used. So while it is not possible, in this case, to communicate secretly at a constant asymptotic rate, it is possible to transmit an arbitrarily large amount of secret information by spreading it out over a number of channel


uses that grows like the square of the number of qubits to be transmitted. In the literature on covert communication [15–19], this is called the square root law. The local encoding gives only a lower bound on the achievable amount of steganographic communication, but more efficient encodings change only the multiplicative factor, not the scaling with √N. It has not been shown whether it is possible to transmit unlimited amounts of secret quantum information over arbitrary quantum channels, but it seems likely that the square root law will apply in most cases where the channel is exactly known and the quantum capacity is nonzero.
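The square root law can be made concrete with a short calculation based on Eq. (11.53); the values of p and ε below are illustrative only.

```python
from math import log2, sqrt

def h(p):
    """Binary entropy."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

def hidden_qubits(N, p, eps):
    """Secret qubits the local encoding can hide over N uses of a depolarizing
    channel that Eve knows exactly, Eq. (11.53)."""
    c_hashing = 1 - h(p) - p * log2(3)
    dp = eps * sqrt(p * (1 - p) / N)   # keeps the diamond distance below eps
    return 4 * N * c_hashing * dp / (3 - 4 * p)

p, eps = 0.05, 0.01
for N in (10**4, 10**6, 10**8):
    print(N, hidden_qubits(N, p, eps))   # grows like sqrt(N)
```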

11.7.4 Secret key vs. shared entanglement

If Alice and Bob share a secret, random key, then they can use the steganographic encodings described in the chapter. Shared entanglement (ebits) can act as a resource in the same way: by measuring the two halves of a maximally entangled pair of qubits (|00⟩ + |11⟩)/√2, Alice and Bob can generate a shared secret bit. Therefore k shared ebits are at least as useful for quantum steganography as k classical secret key bits. However, shared entanglement can go beyond what can be done with a classical secret key in at least two ways.

First, instead of sending quantum information through the channel, Alice can instead teleport qubits to Bob [21]. Teleportation consumes one ebit and requires the transmission of two classical bits for each qubit teleported. These classical bits can be sent through the channel steganographically. Because these bits are perfectly random, no one-time pad or twirling is needed, and because they are purely classical information, they are not disrupted if Eve chooses to measure the error syndromes, as a general quantum state would be. In this sense, quantum steganography with shared ebits is more powerful than quantum steganography with a shared classical key.

Second, if the actual physical channel is noisy, then by using additional shared entanglement it is possible to increase the capacity of the channel by the use of entanglement-assisted quantum error-correcting codes [29,30]. This can increase the rate for transmitting either quantum information directly through the channel [31] or classical information to be used in conjunction with quantum teleportation. While the work in this chapter calculates rates of steganographic communication using a classical shared secret key, the achievable rates using shared entanglement are at least as large when the underlying physical channel is noiseless, and almost certainly larger in almost all cases where the underlying physical channel is noisy.

11.8 Asymptotic rates in the noiseless case

In this chapter so far, we have looked at steganographic encodings into quantum error-correcting codes, where the hidden information is made to appear like errors from a particular quantum channel. We have particularly concentrated on two especially well-understood quantum channels, the binary symmetric channel (BSC) and the depolarizing channel (DC). In this section, we will look at the information-theoretic limits on quantum


steganography in the simplifying case where the underlying physical channel is actually noiseless, but the eavesdropper Eve has been systematically deceived into believing that the channel is noisy. In the next section, we will generalize this, at least for a few cases, to the much more complicated situation where the underlying physical channel is noisy.

Before the protocol begins, Alice and Bob share with each other a secret key, an arbitrarily long string of random bits. This key is known only to them. But after the protocol begins, they cannot communicate except through channels that can be monitored by Eve. Alice sends an innocent-looking message to Bob over the channel. This is a covertext state ρ_c, encoded into an error-correcting code; it is assumed that the choice of code is known to Eve, and this code should be a plausible choice for the noisy channel that Eve believes exists. The general procedure is the same as that illustrated in Fig. 11.2.

An important assumption is that this section only considers the case where the quantum error-correcting code (QECC) that Alice uses is nondegenerate. That is, the N channel uses are dominated by a discrete set of errors (i.e., these errors have total probability greater than 1 − ε for some suitably small ε), all these errors are correctable, and each error corresponds to a unique error syndrome. These are the typical errors [24,25]. This allows Alice to communicate as much steganographic information as possible, and it allows one to ignore the details of which QECC is being used. Methods similar to those in this section should also work for degenerate codes; but in that case the encoding depends on the properties of the particular code, since the typical errors must first be grouped into equivalent sets, and then the possible messages mapped into these sets. The proper treatment of degenerate QECCs is an open question.

11.8.1 Direct coding theorem (achievability)

The two particular cases considered in this chapter, the BSC and DC, will be looked at first, and then these examples will be generalized to the broader class of random unitary channels. From these results we can derive a statement about achievable rates in the case of general channels when the actual physical channel is noiseless. This is a direct coding theorem for quantum steganography. In the next subsection, we prove a converse theorem, giving an upper bound on the steganographic communication rate and showing that it asymptotically coincides with the achievable rate.

The binary symmetric channel

Suppose Eve believes that the channel from Alice to Bob is a BSC with probability p of a bit flip per qubit transmitted. Alice sends a codeword of length N to Bob, encoding an innocent covertext state ρ_c. The number of errors in the codewords that Alice sends to Bob should be binomially distributed with mean pN and variance (1 − p)pN. The total probability that there is an error of weight w should be

    p_w = \binom{N}{w} p^w (1 − p)^{N−w}.    (11.54)

There are

    \binom{N}{w} ≡ \frac{N!}{w!(N − w)!}    (11.55)

such errors, all with equal probability p^w (1 − p)^{N−w}. For large N, the typical errors will have a weight w within a narrow range about the mean pN. Alice encodes her messages into these typical errors. For each w from Np(1 − δ) to Np(1 + δ), where \sqrt{(1 − p)/(pN)} ≪ δ ≪ 1, Alice chooses at random a set of C_w possible error strings of weight w. (An error string of weight w is a string of N bits, with a 1 at every location with a bit flip and 0 at every location with no error.) This random choice is made using the secret key shared with Bob, so that Bob also knows which set of errors is being used to encode secret messages, but Eve (who does not share the key) could not know this. Note that this encoding is a variation of the idea presented earlier in this chapter of encoding in error syndromes, since each typical error has a unique error syndrome; since this is an information-theoretic argument, any practical difficulties of such an encoding are ignored. Let these sets of error strings of weight w be called {S_w}; the set of all strings used in the encoding is

    S = \bigcup_w S_w.    (11.56)

Sum up the total number of codewords:

    C = \sum_{w=Np(1−δ)}^{Np(1+δ)} C_w = |S|.    (11.57)

So the total number of strings in the set S is C. This number C is the total number of possible distinct secret messages that Alice can send to Bob (though she may also send superpositions of these messages). All these messages are assumed to be equally likely. The message encodes M = log2 C bits (or qubits) of information. Define the probability q = 1/C. These error strings S are typical errors. Eve should not be suspicious at seeing such an error string, since it matches a probable result for the channel that she expects. For this encoding to be indistinguishable from the BSC, the probability of the message being an error string of weight w should equal the value from the distribution in Eq. (11.54):

    q C_w = C_w/C = p_w.    (11.58)

Necessarily, C_w ≤ \binom{N}{w}


for all w in the typical range, since one cannot choose more errors of weight w than exist. Putting these together yields

    C_w p^w (1 − p)^{N−w} ≤ \binom{N}{w} p^w (1 − p)^{N−w} = C_w q  ⟹  q ≥ p^w (1 − p)^{N−w}.    (11.59)

To maximize the number of possible steganographic messages, q = 1/C should be as small as possible. The constraint in Eq. (11.59) then yields

    q = p^{Np(1−δ)} (1 − p)^{N(1−p+pδ)}.    (11.60)

So Alice can send M secret qubits to Bob, where

    M = log2 C = log2(1/q)
      = N(−p log2 p − (1 − p) log2(1 − p) + δ(p log2 p − p log2(1 − p)))
      = N(h(p) − δ p log2((1 − p)/p))
      ≈ N h(p).    (11.61)

With this encoding, Alice can send (almost) N h(p) bits. The calculation in Section 11.7.1 shows that the diamond norm distance between the channel (N_p^{BSC})^{⊗N} and Alice's encoding is exponentially small in N. This justifies the claim that this protocol will not arouse suspicion from Eve and satisfies the steganographic criterion.

The depolarizing channel

Suppose the channel to be emulated is the depolarizing channel (DC). The encoding in this case looks quite similar to that of the BSC. Recall that the DC acting on a single qubit ρ is given by Eq. (11.2). Applied to N qubits, the total probability of errors with exactly n_1 X, n_2 Y, and n_3 Z errors (and n_4 = N − n_1 − n_2 − n_3 qubits without errors) is

    p(n_1, n_2, n_3, n_4) = \frac{N!}{n_1! n_2! n_3! n_4!} (p/3)^{n_1+n_2+n_3} (1 − p)^{n_4}.    (11.62)

Instead of specifying n_1, n_2, and n_3 exactly, we can instead talk about errors with weight w = n_1 + n_2 + n_3. It follows by a simple calculation that the total probability of all errors of weight w is

    p(w) = \binom{N}{w} 3^w (p/3)^w (1 − p)^{N−w} = \binom{N}{w} p^w (1 − p)^{N−w},    (11.63)

which is just a binomial distribution in w. Specify the typical errors to be those with weights w that lie between Np(1 − δ) and Np(1 + δ) for \sqrt{(1 − p)/(pN)} ≪ δ ≪ 1, and follow exactly the same encoding procedure given for the bit flip code using errors with weight w, except that the set of errors of weight w is now of size 3^w \binom{N}{w}, and errors of weight w have probability (p/3)^w (1 − p)^{N−w}. Then this leads to the following encoding rate:

    M = N(−p log2(p/3) − (1 − p) log2(1 − p) + δ(p log2(p/3) − p log2(1 − p)))
      = N(s(p) + δ(p log2(p/3) − p log2(1 − p)))
      ≈ N s(p),    (11.64)

where $s(p) = -p\log_2(p/3) - (1-p)\log_2(1-p)$ is the entropy of the depolarizing channel on one qubit. So the asymptotically achievable rate is $s(p)$.

Random unitary channels

The class of random unitary channels includes the BSC, DC, and all Pauli channels, among many others. Consider a single-qubit channel of the form
$$\mathcal{N}(\rho) = \sum_{i=1}^{k} p_i\, U_i \rho U_i^\dagger, \tag{11.65}$$
where the operators $U_i$ are all unitary: $U_i U_i^\dagger = U_i^\dagger U_i = I$. The set of Kraus operators $\{\sqrt{p_i}\, U_i\}$ can be interpreted as a set of single-qubit unitary errors $\{U_i\}$ occurring with probabilities $\{p_i\}$. The channel acts on an N-qubit encoded state ρ as $\mathcal{N}^{\otimes N}(\rho)$. For any state, the total probability of all errors with $n_1$ $U_1$ errors, $n_2$ $U_2$ errors, and so forth, is given by the multinomial distribution:
$$p(n_1, \dots, n_k) = \frac{N!}{n_1!\cdots n_k!}\, p_1^{n_1}\cdots p_k^{n_k}. \tag{11.66}$$

Now consider all errors with weights $n_j$ in the range from $Np_j(1-\delta)$ to $Np_j(1+\delta)$ for all j. This is the set of strongly typical sequences of errors. Assume that these typical errors are all correctable by the QECC being used and that they are nondegenerate—that is, each of these typical errors has a unique error syndrome. We can then use these syndromes for our encoding, just as in the BSC and DC cases. Randomly choose $C_{n_1,\dots,n_k}$ error strings with weights $n_1, n_2, \dots, n_k$ in this range such that $n_1 + \cdots + n_k = N$. As with the bit flip and depolarizing channels, let these sets of strings be called $S_{n_1,\dots,n_k}$, and let S denote the union of all these sets of strings, which are a subset of the typical strings. For all weights $n_1, \dots, n_k$ outside the typical set, let $C_{n_1,\dots,n_k} = 0$. The total number of strings in the set S is
$$C = \sum_{n_1,\dots,n_k} C_{n_1,\dots,n_k}. \tag{11.67}$$
Defining $q \equiv 1/C$, one must satisfy
$$C_{n_1,\dots,n_k}\, q = C_{n_1,\dots,n_k}/C = p(n_1,\dots,n_k) \tag{11.68}$$

for all weights $n_1, \dots, n_k$ in the typical set, so that Eve does not become suspicious. Also, clearly $C_{n_1,\dots,n_k}$ must be less than $\frac{N!}{n_1!\cdots n_k!}$. This implies that
$$\begin{aligned} C_{n_1,\dots,n_k}\, p_1^{n_1}\cdots p_k^{n_k} &\le \frac{N!}{n_1!\cdots n_k!}\, p_1^{n_1}\cdots p_k^{n_k},\\ C_{n_1,\dots,n_k}\, p_1^{n_1}\cdots p_k^{n_k} &\le C_{n_1,\dots,n_k}\, q,\\ p_1^{n_1}\cdots p_k^{n_k} &\le q. \end{aligned} \tag{11.69}$$

We cannot simply use the lower bounds of the sums for $n_j$, as in the DC and BSC cases, because there is an additional constraint that $n_1 + \cdots + n_k = N$. However, the same general argument applies. Inside the set of typical weights, there is a string $\tilde n_1, \dots, \tilde n_k$ with $|\tilde n_j/N - p_j| \le \delta p_j$ for all j that maximizes the probability:
$$p_{\max} \equiv p_1^{\tilde n_1} p_2^{\tilde n_2} \cdots p_k^{\tilde n_k}. \tag{11.70}$$
Choose $q = p_{\max}$. This bounds the number of stego qubits M that Alice can send to Bob:
$$\begin{aligned} M = \log_2 C = -\log_2(q) = -\log_2 p_{\max} &= -\tilde n_1 \log_2(p_1) - \cdots - \tilde n_k \log_2(p_k)\\ &= N\Bigl(-\frac{\tilde n_1}{N}\log_2(p_1) - \cdots - \frac{\tilde n_k}{N}\log_2(p_k)\Bigr)\\ &\ge N(1-\delta)\Bigl(-\sum_{i=1}^{k} p_i \log_2(p_i)\Bigr) = N(1-\delta)\, H(p_1,\dots,p_k). \end{aligned} \tag{11.71}$$

In the large N limit the rate approaches $H(p_1,\dots,p_k)$ with this encoding.

General channels

The advantage of the random unitary channels is that every single-qubit error has a definite probability, independent of the state or the choice of QECC. This means that for an N-qubit codeword, we can use the multinomial distribution for the probabilities of the typical errors. Such a distribution may hold for a general channel as well, but it is not so straightforward to make that argument. Instead, we can perform the steganographic encoding across multiple code blocks, where the errors on the individual code blocks can be treated as random unitary errors, and the multinomial distribution can be recovered on the entire sequence of code blocks. Consider a general quantum channel acting on a single qubit:
$$\mathcal{N}(\rho) = \sum_{i=1}^{k} A_i \rho A_i^\dagger. \tag{11.72}$$
The channel acts on an N-qubit encoded state ρ as $\mathcal{N}^{\otimes N}(\rho)$. Let N become large. By the previous assumptions that the code is nondegenerate and that almost all errors are correctable we can well approximate this N-qubit channel by a sum over the correctable errors:
$$\mathcal{N}^{\otimes N}(\rho) = \sum_{i} \bigl(A_{i_1}\otimes A_{i_2}\otimes\cdots\otimes A_{i_N}\bigr)\,\rho\,\bigl(A_{i_1}^\dagger\otimes\cdots\otimes A_{i_N}^\dagger\bigr) \approx \sum_{k} p_k\, E_k \rho E_k^\dagger, \tag{11.73}$$
where ρ is now the N-qubit codeword, the index i in the first sum is $i = i_1 i_2 \dots i_N$, and the index k in the second sum is an arbitrary labeling of the correctable errors. This set of errors $\{E_k\}$ is determined jointly by the N-fold channel and the choice of QECC. The operators $\{E_k\}$ will be linear combinations of the tensor-product operators $A_{i_1}\otimes A_{i_2}\otimes\cdots\otimes A_{i_N}$ (at least to a very good approximation). Assume that the total probability of all correctable errors is greater than $1 - \epsilon$ for some very small $0 < \epsilon \ll 1$. Note that the Kraus map in Eq. (11.72) is not unique. For some combinations of channels and QECC, we can choose the Kraus decomposition such that the correctable errors have the product form $E_i = A_{i_1}\otimes A_{i_2}\otimes\cdots\otimes A_{i_N}$, for a subset of sequences $\{i\}$ [22]. For instance, we can do this for the BSC and DC if our QECC is a stabilizer code. Each of these errors $E_k$ has a unique error syndrome and acts on the code space of the QECC as a unitary that moves the state to a distinct, orthogonal subspace labeled by k. This means that error $E_k$ occurs with a fixed probability $p_k$ for all valid codewords of the QECC, that is, we can write
$$\sum_k p_k\, E_k \rho E_k^\dagger = \sum_k p_k\, U_k \rho U_k^\dagger \tag{11.74}$$

for some set of unitaries $\{U_k\}$ if ρ is in the code space of the QECC. We can now essentially repeat the argument that leads to Eq. (11.71), but using the probabilities $p_k$. For this argument to work, we must go to a limit where the number of code blocks $J \gg 1$ and where the block size N is large enough that the probability of an uncorrectable error is very small, that is, $J\epsilon \ll 1$. In those limits the rate approaches
$$-\frac{1}{N}\sum_k p_k \log_2 p_k \equiv \bar H, \tag{11.75}$$

where $\bar H$ is an effective entropy per qubit from the channel.

Secret key consumption

How much secret key is required to carry out these protocols? Assume that all the details of the encoding have been decided between Alice and Bob ahead of time, so the only place where secret key is consumed is in picking the subsets of errors used in the encoding. Consider the BSC. The possible messages are mapped onto a set of C error syndromes, representing errors of weights $(1-\delta)Np \le w \le (1+\delta)Np$. For each error weight w in that range, a subset of $C_w$ errors is chosen to represent possible messages. Alice and Bob can agree before the protocol begins to divide the set of errors of weight w into $n_w$ nonoverlapping subsets of $C_w$ errors each, where
$$n_w = \frac{1}{C_w}\binom{N}{w} = \left(\frac{1-p}{p}\right)^{w - Np(1-\delta)}. \tag{11.76}$$

For each transmitted block, Alice and Bob must randomly choose one of these $n_w$ subsets for each weight w in the typical range. Choosing a subset requires $\log_2 n_w$ random bits, which are drawn from their shared key. Since any given message is encoded as an error of some specific weight w, the same secret key bits can be used to choose the subset for each error weight w. (Or to put it another way, the choice of message itself provides an additional source of randomness.) The number of key bits consumed to transmit one block is therefore the maximum of $\log_2 n_w$ for $(1-\delta)Np \le w \le (1+\delta)Np$:
$$K = \max_{Np(1-\delta)\le w\le Np(1+\delta)} \log_2 n_w = \max_{Np(1-\delta)\le w\le Np(1+\delta)} \log_2\left(\frac{1-p}{p}\right)^{w-Np(1-\delta)} = (2Np\delta)\log_2\left(\frac{1-p}{p}\right). \tag{11.77}$$
Since this is a binomial distribution, δ takes the form
$$\delta = D\sqrt{\frac{1}{N}\left(\frac{1-p}{p}\right)}, \tag{11.78}$$
where D is a fixed constant determining what fraction of all errors are included in the typical set. The key consumption therefore is
$$K = 2D\sqrt{\left(\frac{1-p}{p}\right)N}\;\log_2\left(\frac{1-p}{p}\right). \tag{11.79}$$

Key consumption scales like $\sqrt{N}$, and the key consumption rate goes to zero as $N \to \infty$. Although the details will vary, this sublinear scaling of K with N should be generic for all channels and encodings when the actual physical channel is noiseless. Recall the distinction, discussed before, between secrecy and security: a steganographic protocol is secret if an eavesdropper without the secret key cannot distinguish between an encoded message being sent and the noisy channel being applied. It is secure if the eavesdropper cannot learn anything about the message, even if she knows that a message is being sent. Using a sublinear amount K of shared secret key is sufficient to make the steganographic protocol secret, but in general it is not secure. Since the number of qubits M transmitted is typically larger than the number of secret key bits K consumed, we would generically expect an eavesdropper to be able to learn on the order of M − K bits of information about the message if she became aware of its existence.
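To make this scaling concrete, the following minimal Python sketch (an illustration added here, not part of the protocols above) evaluates the noiseless-BSC bookkeeping: the stego capacity $M \approx N h(p)$ from Eq. (11.61) and the key consumption $K = 2Np\delta\,\log_2((1-p)/p)$ from Eq. (11.77), with δ taken from Eq. (11.78). The constant D and the sample values of N and p are arbitrary choices for the illustration.

```python
import math

def h(p):
    """Binary entropy h(p) = -p log2 p - (1-p) log2 (1-p)."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def noiseless_bsc_stego(N, p, D=2.0):
    """Stego qubits M ~ N h(p) (Eq. 11.61) and key bits K = 2*N*p*delta*log2((1-p)/p)
    (Eq. 11.77), with delta = D*sqrt((1-p)/(p*N)) as in Eq. (11.78)."""
    delta = D * math.sqrt((1 - p) / (p * N))
    M = N * h(p)                                     # achievable stego qubits (approx.)
    K = 2 * N * p * delta * math.log2((1 - p) / p)   # secret key bits per block
    return M, K

for N in (10**3, 10**5, 10**7):
    M, K = noiseless_bsc_stego(N, p=0.1)
    print(f"N={N:>8}: M ~ {M:,.0f} stego qubits, K ~ {K:,.0f} key bits, K/M ~ {K/M:.4f}")
```

Running this shows K/M shrinking toward zero as N grows, which is the sublinear key consumption discussed above.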

11.8.2 Converse theorem (upper bound)

In this subsection, we will prove an upper bound on the rate of steganographic communication via the task known as entanglement transmission. Since the ability to transmit quantum states implies the ability to share entanglement, an upper bound on this task is also an upper bound on the rate of quantum communication. This proof closely follows the discussion of quantum communication in [22].


Alice has a quantum system of dimension $C = 2^M$ (i.e., M qubits), which is in a maximally entangled state $\Phi_{A_1 R}$ with a reference system R. She prepares a pure covertext $\rho_c$ that will be encoded into an N-qubit QECC in subsystems $A^N$, and she and Bob share a secret key k. Her encoded state is
$$\omega_{k,A^N R} \equiv \mathcal{E}_{k,A_1 C\to A^N}(\rho_c \otimes \Phi_{A_1 R}). \tag{11.80}$$

The dependence of the encoding on the secret key k corresponds to choosing among the different sets of error strings S in the protocols from the previous subsection. To Eve, who does not know the secret key k and has no access to the subsystem R, the state is effectively
$$\omega_{A^N} \equiv \sum_k p_k \operatorname{Tr}_R\bigl[\omega_{k,A^N R}\bigr], \tag{11.81}$$

where $\omega_{A^N}$ is the state with R traced out and averaged over all possible values of the secret key k with probabilities $p_k$. One can choose this probability to be uniform for simplicity, $p_k = p$ for all k, if desired. The following secrecy condition must be satisfied:
$$\frac{1}{2}\bigl\|\omega_{A^N} - \mathcal{N}^{\otimes N}(V\rho_c V^\dagger)\bigr\|_1 \le \delta, \tag{11.82}$$

where $\mathcal{N}$ is the channel Alice is emulating, V is an isometry representing the encoding of the covertext into a suitably chosen codeword (that can correct the errors induced by the channel $\mathcal{N}$ with high probability), and δ > 0 is a small parameter. This condition implies that Eve cannot distinguish the state from the encoded covertext with noise from the quantum channel $\mathcal{N}^{\otimes N}$. Another requirement of the steganographic encoding is recoverability. Bob receives N qubits through the channel; call these qubits the subsystem $B^N$. He applies a decoder $\mathcal{D}_{k,B^N\to B_1 C}$, which depends on the key k, to obtain the maximally entangled state $\rho_c \otimes \Phi_{B_1 R}$. The input and output states must be ε-close:
$$\frac{1}{2}\bigl\|\mathcal{D}_{k,B^N\to B_1 C}(\omega_{k,B^N R}) - \rho_c \otimes \Phi_{B_1 R}\bigr\|_1 \le \epsilon, \quad\forall k, \tag{11.83}$$

where ε > 0 is a small parameter.

Upper bound on steganographic rate

These requirements of secrecy and recoverability give a bound on the number of qubits M that can be sent steganographically from Alice to Bob. Defining $\sigma_E \equiv \mathcal{N}^{\otimes N}(V\rho_c V^\dagger)$ and applying the Fannes–Audenaert inequality to the secrecy condition (11.82) yields
$$H(\operatorname{Tr}_R(\omega_{A^N R})) \le H(\sigma_E) + \delta N + h(\delta), \tag{11.84}$$
where h is the binary entropy function.


The recoverability condition gives
$$\begin{aligned} M = I(R\rangle B_1)_{\Phi_{B_1 R}} &\le I(R\rangle B_1)_{\mathcal{D}_k(\omega)} + N\epsilon + (1+\epsilon)\,h(\epsilon/[1+\epsilon])\\ &\le I(R\rangle A^N)_{\omega_k} + f(N,\epsilon)\\ &\le H(\operatorname{Tr}_R(\omega_{k,A^N R})) + f(N,\epsilon). \end{aligned} \tag{11.85}$$

The first equality follows from the fact that the coherent information of a maximally entangled state is just the logarithm of the dimension of one of the subsystems. The first inequality follows from the AFW inequality applied to (11.83). The second inequality is the data processing inequality for coherent information. The last inequality follows from the definition of coherent information. The concavity of entropy implies that
$$\sum_k p_k\, H(\omega_{k,A^N}) \le H\Bigl(\sum_k p_k\, \omega_{k,A^N}\Bigr) = H(\omega_{A^N}). \tag{11.86}$$

The encodings $\mathcal{E}_{k,A_1 C\to A^N}$ are isometries, which means that $H(\omega_{k,A^N})$ has the same value for every k. We can therefore sum over the probabilities $p_k$ on the left-hand side of (11.86) to get
$$H(\operatorname{Tr}_R(\omega_{k,A^N R})) \le H(\operatorname{Tr}_R(\omega_{A^N R})). \tag{11.87}$$

Now putting (11.84) and (11.85) together yields an upper bound on M:
$$M \le H(\operatorname{Tr}_R(\omega_{R A^N})) + f(N,\epsilon) \le H(\sigma_E) + g(N,\delta) + f(N,\epsilon), \tag{11.88}$$

where g(N, δ) ≡ δN + h(δ). Thus, if we can compute a maximum for H (N ⊗N (ρ)) when ρ is pure (because V is an isometric encoding and ρc is pure), then we have a tight upper bound on the number of qubits M that can be sent steganographically over a noiseless quantum channel. We can evaluate this quantity for certain specific channels. For the BSC it is not hard to see that the maximum entropy of an N -qubit pure state passing through the channel is Nh(p), which means that in the limit N → ∞ the achievable rate from the previous subsection matches the upper bound from this subsection, h(p). For the DC, this is not quite so obvious; but if we impose the restriction that the QECC is nondegenerate, as assumed in the previous channel, with the typical errors each having a unique error syndrome, then the entropy of the state will be N s(p), and the achievable rate will again asymptotically match the upper bound.


FIGURE 11.4 Encoding for a noisy channel. The scheme for hiding secret message qubits in a quantum codeword to pass through a noisy channel. Using the secret key, Alice encodes the secret message qubits together with the cover state as a quantum codeword with typical errors from a channel that, when composed with the actual physical channel, yields the expected channel. The codeword is sent through the noisy channel. Bob can decode the secret message qubits and the cover state with the secret key and correct any errors from the physical channel. If Eve looks at the codeword, then it will look like an innocent codeword that has passed through the expected channel.

11.9 Asymptotic rates in the noisy case

The case where the underlying channel is noisy is significantly more complicated than the noiseless case because the recoverability of the information becomes nontrivial. In addition to satisfying the secrecy condition (of emulating the desired noisy channel), the steganographic encoding must also protect the encoded information from the noise of the actual channel. This problem has not been fully solved, but in this section, we will look at the relatively simple cases of the BSC and DC. In the case where the eavesdropper has been systematically deceived into believing that the channel is noisier than it really is, it is possible to communicate steganographically at a nonvanishing rate. That this is possible has already been shown above for the DC using the local encoding. This section will show that it is possible to use more efficient encodings into error syndromes even when the underlying channel is noisy. The general scheme is illustrated in Fig. 11.4.

11.9.1 Direct coding in the noisy case

Achievable rate for the BSC

Alice wishes to send stego qubits to Bob. The channel connecting Alice and Bob is $\mathcal{N}_p^{BSC}$ from Eq. (11.1) with error parameter p. However, Eve believes the quantum channel to be $\mathcal{N}_{p+\delta p}^{BSC}$ with error parameter $p + \delta p$. Alice and Bob make use of the following encoding. First, Alice encodes the covertext $\rho_c$ into a nondegenerate QECC that can correct all typical errors induced by the channel $(\mathcal{N}_{p+\delta p}^{BSC})^{\otimes N}$. Second, depending on the secret key k and message m that she would like to send, she applies the error
$$E^N(k,m) \equiv E_1(k,m)\otimes\cdots\otimes E_N(k,m) \tag{11.89}$$

to her state, where each operator Ej (k, m) is either X or I . These errors are drawn from an independent and identically distributed (i.i.d.) distribution over N bits, pE N (eN ), where


$p_E(e)$ is given by
$$p_E(X) = q, \qquad p_E(I) = 1 - q, \tag{11.90}$$

with 0 < q < 1. The states with these errors form the codewords for the possible messages. To send quantum information, Alice prepares the system in a superposition of these codewords. The set of codewords corresponding to each key value k is generated and agreed between Alice and Bob ahead of time, so that for a given k, Bob knows which codeword corresponds to which message m. The errors in Eq. (11.89) that define the codewords are typical errors associated with the channel $(\mathcal{N}_q^{BSC})^{\otimes N}$ [24,25]. By the asymptotic equipartition theorem [22], for large enough N, it is highly likely that each of these codewords that Alice generates is a typical sequence with a sample entropy close to $H(E) = -(1-q)\log_2(1-q) - q\log_2 q = h(q)$. Furthermore, $\mathcal{N}_p \circ \mathcal{N}_q = \mathcal{N}_{p+\delta p}$ if we set $q = \delta p/(1-2p)$. Having prepared the codeword as described, Alice sends the state through the channel $(\mathcal{N}_p^{BSC})^{\otimes N}$. Bob's decoding process essentially follows classical random coding over a classical BSC with parameter p. By the asymptotic equipartition theorem for conditionally typical sequences [22], for each input sequence (i.e., error $E^N(k,m)$ applied to the encoded covertext), there is a corresponding conditionally typical set of errors $\{F^N(k,m)\}$ that has the following properties: it has probability greater than $1-\epsilon$, its size is $\approx 2^{NH(F|E)}$, and the probability of each conditionally typical error, given knowledge of the input error $E^N(k,m)$, is $\approx 2^{-NH(F|E)}$. With high probability, the error $F^N$ Bob observes will be a typical error of the channel $(\mathcal{N}_{p+\delta p}^{BSC})^{\otimes N}$. From Shannon's noisy channel coding theorem, if the number of messages $C = 2^M = 2^{NR}$ satisfies
$$2^{NR} \approx \frac{2^{NH(F)}}{2^{NH(F|E)}} = 2^{N(H(F)-H(F|E))}, \tag{11.91}$$

then Bob will be able to decode correctly with high probability [32,33,22] the error $E^N(k,m)$ that was applied by Alice, as long as the code is nondegenerate. For the steganographic protocol, $H(F) = h(p + q - 2pq) = h(p+\delta p)$ for $q = \delta p/(1-2p)$, and $H(F|E) = h(p)$. Therefore Alice can communicate
$$M = \log_2(C) \approx N\bigl(h(p+\delta p) - h(p)\bigr) \tag{11.92}$$

bits of information to Bob steganographically, which corresponds to a secret communication rate R = h(p + δp) − h(p).

(11.93)

Moreover, this protocol satisfies the secrecy criterion, because the state passing through the channel is to a good approximation the state Eve would expect:
$$\begin{aligned} (\mathcal{N}_p^{BSC})^{\otimes N}\Bigl(\sum_{k\in K}\sum_{m\in M} E^N(k,m)\,V\rho_c V^\dagger\,E^N(k,m)\Bigr) &= \sum_{k\in K}\sum_{m\in M} (\mathcal{N}_p^{BSC})^{\otimes N}\bigl(E^N(k,m)\,V\rho_c V^\dagger\,E^N(k,m)\bigr)\\ &\approx \bigl(\mathcal{N}_p^{BSC}\circ\mathcal{N}_{\delta p/(1-2p)}^{BSC}\bigr)^{\otimes N}\bigl(V\rho_c V^\dagger\bigr) = \bigl(\mathcal{N}_{p+\delta p}^{BSC}\bigr)^{\otimes N}(V\rho_c V^\dagger), \end{aligned} \tag{11.94}$$

where V is the encoding isometry for the QECC Alice and Bob are using. The first equality follows from linearity of quantum operations. The approximate equality follows from the fact that if one applies the errors corresponding to codewords and averages over the key and possible messages, then one is applying all the typical errors of the channel $(\mathcal{N}_q^{BSC})^{\otimes N}$ and hence applying a good approximation of the full channel [24,25]. The final equality follows because the composition of two BSCs is another BSC. This state is exactly the state Eve expects to observe (up to the approximation of the channel), and therefore the steganographic protocol satisfies the secrecy criterion.

Secret key consumption

For the above encoding, how much secret key must be consumed? Assume that the details of the encoding—how each message m is mapped to a codeword for a particular key element k—have been decided between Alice and Bob ahead of time. So in the protocol as described before, the only place where secret key is consumed is in picking the subset of errors used in the encoding. Before the protocol begins, Alice and Bob divide the set of typical errors of the channel $(\mathcal{N}_{\delta p/(1-2p)}^{BSC})^{\otimes N}$ into n nonoverlapping subsets of size $C = 2^{N(h(p+\delta p)-h(p))}$ each, where
$$n = \frac{2^{N h(\delta p/(1-2p))}}{2^{N(h(p+\delta p)-h(p))}}. \tag{11.95}$$

For each transmitted block, Alice and Bob must randomly choose one of these n subsets to encode her messages. This requires K bits of secret key: K = log2 (n) = N (h(δp/(1 − 2p)) − h(p + δp) + h(p)) ≡ N RK .

(11.96)

The key consumption $K = N R_K$ scales linearly with N, so the key is consumed at a fixed asymptotic rate $R_K = h(\delta p/(1-2p)) - h(p+\delta p) + h(p)$. Notice that in the limit where the physical channel is noiseless (p = 0), this formula gives $R_K = 0$, which agrees with the result above for the noiseless channel case where it was shown that only a sublinear amount of key is needed for encoding across noiseless channels.

Depolarizing channel

In most respects the encoding for the depolarizing channel (DC) is similar to that for the BSC above. Both are one-parameter families of channels that are closed under composition. Suppose the actual physical channel between Alice and Bob is $\mathcal{N}_p^{DC}$ from Eq. (11.2), whereas Eve believes that the channel is $\mathcal{N}_{p+\delta p}^{DC}$. Alice encodes a pure covertext $\rho_c$ into a nondegenerate QECC on N qubits. This code should be able to correct typical errors of


$(\mathcal{N}_{p+\delta p}^{DC})^{\otimes N}$. Next, she applies the error
$$G^N(k,m) \equiv G_1(k,m)\otimes\cdots\otimes G_N(k,m) \tag{11.97}$$

to her state, which depends on the secret key k and the message m. The resulting state is the codeword corresponding to m. If her message is a quantum state, then she prepares the system in a superposition of codewords. These codewords are generated by applying errors drawn randomly from the channel (NqDC )⊗N . That is, the errors X, Y , Z, or I on each qubit are drawn from the product distribution pGN (g N ), where pG (g) is given by pG (X) = pG (Y ) = pG (Z) = q/3,

pG (I ) = 1 − q,

(11.98)

and $q = \delta p/(1-4p/3)$. For each value of k, a set of errors is generated and agreed on ahead of time by Alice and Bob, and k is used to choose the encoding used for a particular message m. Since the set of errors is selected using the shared secret key k, Bob knows what codeword corresponds to each message m. The errors given by Eq. (11.97) are typical errors associated with the channel $(\mathcal{N}_q^{DC})^{\otimes N}$ [24,25]. By the asymptotic equipartition theorem [22], for large enough N, it is highly likely that each codeword has a typical error with a sample entropy close to $H(G) = -(1-q)\log_2(1-q) - q\log_2(q/3) \equiv s(q)$, where s(q) is the entropy of a qubit passed through the depolarizing channel with error parameter q. Moreover, $\mathcal{N}_p \circ \mathcal{N}_q = \mathcal{N}_{p+\delta p}$ for $q = \delta p/(1-4p/3)$, which satisfies the secrecy criterion similarly to the way it was shown in Eq. (11.94) for the BSC above. Alice then sends the state through the channel $(\mathcal{N}_p^{DC})^{\otimes N}$. Following the same random coding argument described for the bit-flip channel, with high probability, the error $J^N$ Bob observes will be a typical error of the channel $(\mathcal{N}_{p+\delta p}^{DC})^{\otimes N}$. If the number of messages is $C = 2^M = 2^{NR}$, where
$$2^{NR} \approx \frac{2^{NH(J)}}{2^{NH(J|G)}} = 2^{N(H(J)-H(J|G))}, \tag{11.99}$$

then Bob can decode correctly which error GN (k, m) was applied by Alice with high probability [32,33,22] as long as the code is nondegenerate. Here H (J ) = s(p + q − 4qp/3) = s(p + δp) for q = δp/(1 − 4p/3), and H (J |G) = s(p). Therefore Alice can steganographically communicate M = log2 (C) ≈ N (s(p + δp) − s(p))

(11.100)

classical or quantum bits of information to Bob. This corresponds to a secret communication rate R = s(p + δp) − s(p).

(11.101)
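As a quick numerical illustration (added here, not part of the original derivation), the following Python sketch evaluates the noisy-case formulas derived above: the BSC stego rate and key rate from Eqs. (11.93) and (11.96), and the DC stego rate from Eq. (11.101) together with the analogous DC key rate. The sample values of p and δp are arbitrary.

```python
import math

def h(p):
    """Binary entropy of the bit-flip channel."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def s(p):
    """Entropy of the depolarizing channel: -p log2(p/3) - (1-p) log2(1-p)."""
    return -p * math.log2(p / 3) - (1 - p) * math.log2(1 - p)

p, dp = 0.05, 0.02                    # physical error rate and Eve's excess belief (example values)

q_bsc = dp / (1 - 2 * p)              # composition parameter so that N_p o N_q = N_{p+dp}
R_bsc = h(p + dp) - h(p)              # Eq. (11.93): BSC stego rate
RK_bsc = h(q_bsc) - h(p + dp) + h(p)  # Eq. (11.96): BSC key consumption rate

q_dc = dp / (1 - 4 * p / 3)
R_dc = s(p + dp) - s(p)               # Eq. (11.101): DC stego rate
RK_dc = s(q_dc) - s(p + dp) + s(p)    # DC key consumption rate (by analogy with Eq. 11.96)

print(f"BSC: stego rate {R_bsc:.4f}, key rate {RK_bsc:.4f}  (per channel use)")
print(f"DC : stego rate {R_dc:.4f}, key rate {RK_dc:.4f}")
```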

General channels

The assumption of nondegeneracy is natural in the case of the BSC, which is essentially classical (since all of the errors commute), but not so much for the DC, where the errors are


noncommuting. Generic QECCs are usually degenerate. This general approach to encoding will probably work for degenerate codes as well, but the achievable rate may be lower, and it will require an analysis specific to the code in question. At present, no formula is known for the achievable rate of secret communication (classical or quantum) of a general channel and a general QECC. Some results have been shown for general channels. Tahmasbi and Bloch [34] show that for a physical channel $\mathcal{N}_{A\to B}$, when the eavesdropper believes the channel is noisier, with a form that can be written $\mathcal{N}_{A\to B}\circ\mathcal{M}_{A\to A}$, Alice can secretly send classical information to Bob by embedding it in a cover protocol for sending an innocent classical message. Further, in the noiseless case where $\mathcal{N}_{A\to B} = I$, they also prove that classical randomness can be secretly shared as part of a cover protocol for entanglement sharing, and classical messages can be secretly sent as part of a cover protocol for quantum communication. Stronger results have been shown in this chapter, but with more restrictive assumptions.

11.9.2 Converse theorem in the noisy case

This subsection again bounds the rate of secret communication from above by the task of entanglement transmission [22]. Assume explicitly that the physical channel $\mathcal{N}_p$ is a member of a one-parameter family of channels that is closed under composition and that Eve has been deceived into believing that the true channel is $\mathcal{N}_{p+\delta p}$. (This includes the cases of the BSC and DC but is not completely general.) Alice has a subsystem $A_1$ of M qubits that is in a maximally entangled state $\Phi_{A_1 R}$ with reference system R. She encodes a pure covertext $\rho_c$ together with her subsystem $A_1$ into an N-qubit QECC with errors. This N-qubit encoded system is called $A^N$. The encoded state depends on the secret key element k:
$$\omega_{k,A^N R} \equiv \mathcal{E}_{k,A_1 C\to A^N}(\rho_c \otimes \Phi_{A_1 R}). \tag{11.102}$$

Then this state is sent through the channel. To Eve, who does not know the secret key k and has no access to the reference system R, the received state is effectively
$$\sum_k p_k \operatorname{Tr}_R\bigl[\mathcal{N}_p^{\otimes N}(\omega_{k,A^N R})\bigr] = \mathcal{N}_p^{\otimes N}(\omega_{A^N}), \tag{11.103}$$

where we have used the linearity of quantum operations and the fact that the channel commutes with the partial trace over R. The secrecy condition is
$$\frac{1}{2}\bigl\|\mathcal{N}_p^{\otimes N}(\omega_{A^N}) - \mathcal{N}_{p+\delta p}^{\otimes N}(V\rho_c V^\dagger)\bigr\|_1 \le \delta, \tag{11.104}$$

where $\mathcal{N}_{p+\delta p}$ is what Eve believes the physical channel to be, V is the encoding isometry for the QECC (which is assumed to correct typical errors induced by the channel $\mathcal{N}_{p+\delta p}^{\otimes N}$), and δ > 0 is a small parameter. This condition means that if Eve observes the quantum state, then it will be effectively indistinguishable from the encoded covertext after being sent through the noisy quantum channel $\mathcal{N}_{p+\delta p}^{\otimes N}$.


Bob receives the N qubits through the channel; call this subsystem $B^N$. He applies his decoder $\mathcal{D}_{k,B^N\to C B_1}$, which depends on the key k, to obtain the original state $\rho_c \otimes \Phi_{B_1 R}$. The recoverability condition is
$$\frac{1}{2}\bigl\|\mathcal{D}_{k,B^N\to B_1 C}\bigl((\mathcal{N}_p^{\otimes N}\otimes I_R)(\omega_{k,A^N R})\bigr) - \rho_c \otimes \Phi_{B_1 R}\bigr\|_1 \le \epsilon \tag{11.105}$$

for all k, where ε > 0 is a small parameter.

Upper bound on steganographic rate

We can now put a bound on the number of qubits M that can be sent secretly from Alice to Bob. Defining $\sigma_E = \mathcal{N}_{p+\delta p}^{\otimes N}(V\rho_c V^\dagger)$ and applying the Fannes–Audenaert inequality [35] to the secrecy condition (11.104) yields
$$H\bigl(\mathcal{N}_p^{\otimes N}(\omega_{A^N})\bigr) \le H(\sigma_E) + g(N,\delta), \tag{11.106}$$

where $g(N,\delta) \equiv \delta N + h(\delta)$. Furthermore,
$$\begin{aligned} M = I(R\rangle B_1)_{\Phi_{B_1 R}} &\le I(R\rangle B_1)_{\mathcal{D}_k(\mathcal{N}_p^{\otimes N}(\omega_k))} + N\epsilon + (1+\epsilon)\,h(\epsilon/(1+\epsilon))\\ &\le I(R\rangle A^N)_{\mathcal{N}_p^{\otimes N}(\omega_k)} + f(N,\epsilon)\\ &= H\bigl(\mathcal{N}_p^{\otimes N}(\operatorname{Tr}_R[\omega_{k,A^N R}])\bigr) - H\bigl((\mathcal{N}_p^{\otimes N}\otimes I_R)(\omega_{k,A^N R})\bigr) + f(N,\epsilon), \end{aligned} \tag{11.107}$$

where $f(N,\epsilon) \equiv N\epsilon + (1+\epsilon)\,h(\epsilon/(1+\epsilon))$. The first equality follows from the fact that the coherent information of a maximally entangled state is just the logarithm of the dimension of one of the subsystems. The first inequality follows from the Alicki–Fannes–Audenaert inequality [36] applied to the recoverability condition (11.105). The second inequality is the quantum data processing inequality for coherent information [22]. The last equality follows from the definition of the coherent information. Using the concavity of von Neumann entropy and linearity of quantum operations yields
$$\min_k H\bigl(\mathcal{N}_p^{\otimes N}(\operatorname{Tr}_R[\omega_{k,A^N R}])\bigr) \le \sum_k p_k\, H\bigl(\mathcal{N}_p^{\otimes N}(\operatorname{Tr}_R[\omega_{k,A^N R}])\bigr) \le H\Bigl(\mathcal{N}_p^{\otimes N}\Bigl(\operatorname{Tr}_R\Bigl[\sum_k p_k\, \omega_{k,A^N R}\Bigr]\Bigr)\Bigr) = H\bigl(\mathcal{N}_p^{\otimes N}(\operatorname{Tr}_R[\omega_{A^N R}])\bigr). \tag{11.108}$$

For many cases, we expect $H\bigl(\mathcal{N}_p^{\otimes N}(\operatorname{Tr}_R[\omega_{k,A^N R}])\bigr)$ to be the same for every k. If that is true, then
$$H\bigl(\mathcal{N}_p^{\otimes N}(\operatorname{Tr}_R[\omega_{k,A^N R}])\bigr) \le H\bigl(\mathcal{N}_p^{\otimes N}(\operatorname{Tr}_R[\omega_{A^N R}])\bigr) \tag{11.109}$$
for all k. Now putting Eqs. (11.106), (11.107), and (11.109) together produces the main result for this subsection, which states that Alice can secretly and reliably send M qubits to Bob,

where M is bounded above by
$$M \le H(\sigma_E) - H\bigl((\mathcal{N}_p^{\otimes N}\otimes I_R)(\omega_{k,A^N R})\bigr) + g(N,\delta) + f(N,\epsilon). \tag{11.110}$$

11.10 Discussion and future directions

The protocols presented in this chapter whose performance is known are summarized in Table 11.5. This is a limited set of results, though as discussed in the last two sections, these techniques should generalize to a larger set of quantum channels and QECCs. The results in this chapter are largely derived from [11–14]. Although a number of papers have been written on quantum steganography over the last several years, relatively few combine a rigorous definition of what it means for information to be secret with an encoding protocol that is proven to satisfy that secrecy criterion. Also, most either restrict themselves to transmitting classical information or do not make a proper distinction between the requirements for classical and quantum communication. In the meantime, however, there has been interesting work on the closely related idea of covert communication, which has been extended to quantum channels (though so far mostly for communicating classical secret messages) [15–19]. Covert communication mostly assumes a noisy channel (such as a bosonic channel at finite temperature) in which there are signals (e.g., photons) in the channel even when the channel is idle (i.e., no message is being sent). The goal is to send secret messages that cannot be distinguished from the intrinsic noise in the channel. The models used have both similarities to and differences from the models presented in this chapter. As in this chapter, it is assumed that Alice and Bob agree on an encoding ahead of time and share a key (a string of shared random bits) that is consumed in the course of the protocol; but rather than assuming that the eavesdropper can, at her discretion, measure what Bob receives, the inputs to the channel (including intrinsic noise plus any signals sent by Alice) are split into two channels, one to Bob and one to Eve. It is generally assumed that Eve has exact knowledge of the channel, and the generic result is the square root law: to transmit n bits of secret information from Alice to Bob requires a number of channel uses N that scales like $n^2$. This is similar to the result in Section 11.7.3. Such square-root scaling makes intuitive sense in a model where Eve knows the channel exactly. Even an exact knowledge of the channel does not allow one to predict what one will actually measure; over many channel uses, the "amount" of noise (e.g., the number of bit flips in a BSC or the number of thermal photons in a bosonic channel) will generally follow a Gaussian distribution with a standard deviation that grows like the square root of the number of channel uses. Alice can hide secret information by disguising it as a slightly-larger-than-average fluctuation, which requires spreading it over a number of channel uses that grows like the square of the amount of communication. To communicate at a nonzero asymptotic rate generally requires that Eve have a distorted belief about the channel, except in rather contrived cases where the information Eve receives from the channel is so limited that she cannot tell what Bob might be receiving [16].


Table 11.5 Summary of quantum steganography protocols. This table summarizes the asymptotic performance of the main quantum steganography protocols presented in this chapter. The key usage rate is that required to satisfy the condition of secrecy, but not necessarily the condition of security. In the table, $q = \delta p/(1-2p)$, $h(p) = -p\log_2(p) - (1-p)\log_2(1-p)$ is the binary entropy function, $s(p) = -p\log_2(p/3) - (1-p)\log_2(1-p)$ is the entropy of the depolarizing channel, and $c(p)$ is the (unknown) capacity of the depolarizing channel.

| Protocol | Physical Channel | Expected Channel | Stego Rate | Key Usage Rate |
| --- | --- | --- | --- | --- |
| Local Encoding | Noiseless Channel | $\mathcal{N}_p^{DC}$ | $(4p/3)$ | $-(4p/3)\log_2(2p/3)$ |
| 5-qubit code Encoding | Noiseless Channel | $\mathcal{N}_p^{DC}$ | $(64/15)p$ | $(1/5)h(16p/3)$ |
| Syndrome Encoding | Noiseless Channel | $\mathcal{N}_p^{BSC}$ | $h(p)$ | 0 |
| Syndrome Encoding | Noiseless Channel | $\mathcal{N}_p^{DC}$ | $s(p)$ | 0 |
| Local Encoding | $\mathcal{N}_p^{DC}$ | $\mathcal{N}_{p+\delta p}^{DC}$ | $(4q/3)c(p)$ | $-(4q/3)\log_2(2q/3)$ |
| Syndrome Encoding | $\mathcal{N}_p^{BSC}$ | $\mathcal{N}_{p+\delta p}^{BSC}$ | $h(p+\delta p)-h(p)$ | $h(\delta p/(1-2p))-h(p+\delta p)+h(p)$ |
| Syndrome Encoding | $\mathcal{N}_p^{DC}$ | $\mathcal{N}_{p+\delta p}^{DC}$ | $s(p+\delta p)-s(p)$ | $s(\delta p/(1-4p/3))-s(p+\delta p)+s(p)$ |

There are many open questions about quantum steganography. For example, there is a qualitative difference between shared secret key and shared entanglement. Protocols based on secret key can be simulated using shared entanglement; but in general, we would expect entanglement to be more powerful. It should give a greater ability both to fool Eve and to protect secret information from intrinsic noise in the channel. The case where the underlying physical channel is noiseless is fairly simple and not too hard to understand. The encodings presented in this chapter, which are predicated on the assumption that nondegenerate QECCs are used, can almost certainly be readily generalized to the degenerate case. The case with a noisy channel is much more complicated, since Alice and Bob must satisfy both secrecy and recoverability criteria where recoverability is much less trivial in the presence of noise. In this chapter, we have presented some limited examples of the noisy case, which argue that quantum steganography is possible over noisy channels at least for some channels and some QECCs. But broader results should be possible, both giving specific encodings that work against particular channels and proving asymptotic limits for the possible rates of secret communication. There is also the question of whether the square root law applies to arbitrary quantum channels with nonzero capacity when Eve exactly knows the channel. A different direction of inquiry is in the adversarial case of quantum steganography, where Alice and Bob are using untrustworthy communication equipment that may try to leak information to the eavesdropper. This topic might be related to recent work on deviceindependent quantum protocols [37].


11.11 Conclusion

Quantum steganography is a set of protocols that hide classical or quantum information in an innocent-seeming quantum communication. This chapter has examined one particularly well-developed approach, in which the messages are disguised as errors on a quantum error-correcting code from a noisy quantum channel. With the help of a shared secret key, the messages can be encoded by the sender Alice in such a way that they match the channel effects expected by the eavesdropper Eve and can be decoded by the receiver Bob. Specific encodings and communication rates have been worked out for a number of channels for nondegenerate quantum codes, but a great deal remains unknown. Although many basic ideas have been developed, we have only scratched the surface of quantum steganography.

Acknowledgments

The work presented in this chapter was done in collaboration with my former students Bilal Shaw and Christopher Sutherland. Their hard work and many contributions are gratefully acknowledged. This work was supported in part by NSF Grants CCF-0448658, CCF-1421078, and QIS-1719778. Additional thanks are due to the Kavli Institute for Theoretical Physics at the University of California Santa Barbara and the Institute for Advanced Study in Princeton, NJ, for their hospitality during some phases of this work.

References [1] Herodotus, The Histories, Penguin Books, 1996. [2] J. Trithemius, Steganographia, first printed edition, 1606, Frankfurt. [3] FBI, Velvalee Dickinson, the “Doll Woman”, Available from: https://www.fbi.gov/history/famouscases/velvalee-dickinson-the-doll-woman, 1943. [4] G.J. Simmons, The prisoners problem and the subliminal channel, in: D. Chaum (Ed.), Advances in Cryptology – CRYPTO 83, 1983, pp. 51–67. [5] M. Dušek, N. Lütkenhaus, M. Hendrych, Quantum cryptography, Progress in Optics 49 (2006) 381–454. [6] M. Curty, D.J. Santos, Quantum steganography, in: 2nd Bielefeld Workshop on Quantum Information and Complexity, Bielefeld, Germany, 2004, pp. 12–14. [7] S. Natori, Why Quantum Steganography Can Be Stronger Than Classical Steganography, Quantum Computation and Information, vol. 102, Springer, 2006, pp. 235–240. [8] Martin K. Steganographic, Communication with quantum information, in: T. Furon, F. Cayre, G. Doërr, P. Bas (Eds.), Information Hiding, IH 2007, in: Lecture Notes in Computer Science, vol. 4567, Springer, Berlin, Heidelberg, 2007. [9] I. Banerjee, S. Bhattacharyya, G. Sanyal, A procedure of text steganography using Indian regional language, International Journal of Computer Network and Information Security 4 (8) (2012) 65. [10] J. Gea-Banacloche, Hiding messages in quantum data, Journal of Mathematical Physics 43 (9) (2002) 4531–4536. [11] B.A. Shaw, T.A. Brun, Quantum steganography with noisy quantum channels, Physical Review A 83 (2) (2011) 022310. [12] B.A. Shaw, T.A. Brun, Hiding quantum information in the perfect code, arXiv:1007.0793, 2010. [13] C. Sutherland, T.A. Brun, Quantum steganography over noiseless channels: achievability and bounds, arXiv:1805.01599, 2018. [14] C. Sutherland, T.A. Brun, Quantum steganography over noisy channels: achievability and bounds, arXiv:1808.03183, 2018. [15] B.A. Bash, A.H. Gheorghe, M. Patel, J.L. Habif, D. Goeckel, D. Towsley, et al., Quantum-secure covert communication on bosonic channels, Nature Communications (2015) 6.


[16] A. Sheikholeslami, B.A. Bash, D. Towsley, D. Goeckel, S. Guha, Covert communication over classicalquantum channels, in: IEEE International Symposium on Information Theory (ISIT), IEEE, 2016, pp. 2064–2068. [17] L. Wang, Optimal throughput for covert communication over a classical-quantum channel, in: IEEE Information Theory Workshop (ITW), IEEE, 2016, pp. 364–368. [18] K. Bradler, T. Kalajdzievski, G. Siopsis, C. Weedbrook, Absolutely covert quantum communication, arXiv:1607.05916, 2016. [19] J.M. Arrazola, V. Scarani, Covert quantum communication, Physical Review Letters 117 (25) (2016) 250503. [20] D. Gottesman, Stabilizer Codes and Quantum Error Correction, California Institute of Technology, 1997, arXiv:quant-ph/9705052. [21] C.H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, W.K. Wootters, Teleporting an unknown quantum state via dual classical and Einstein–Podolsky–Rosen channels, Physical Review Letters 70 (1993) 1895. [22] M.M. Wilde, Quantum Information Theory, Cambridge University Press, 2013. [23] B. Schumacher, Quantum coding, Physical Review A 51 (1995) 2738. [24] R. Klesse, Approximate quantum error correction, random codes, and quantum channel capacity, Physical Review A 75 (6) (2007) 062315. [25] R. Klesse, A random coding based proof for the quantum coding theorem, Open Systems & Information Dynamics 15 (01) (2008) 21–45. [26] R. Laflamme, C. Miquel, J.P. Paz, W.H. Zurek, Perfect quantum error correcting code, Physical Review Letters 77 (1) (1996 Jul) 198–201. [27] A.Y. Kitaev, A.H. Shen, M.N. Vyalyi, Classical and Quantum Computation, Graduate Studies in Mathematics, vol. 47, American Mathematical Society, Providence, Rhode Island, 2002. [28] J. Watrous, Advanced topics in quantum information processing, Lecture notes, https://cs.uwaterloo. ca/~watrous/LectureNotes.html, 2004. [29] T.A. Brun, I. Devetak, M.H. Hsieh, Correcting quantum errors with entanglement, Science 314 (2006) 436–439. [30] T.A. Brun, I. Devetak, M.H. Hsieh, Catalytic quantum error correction, IEEE Transactions on Information Theory 60 (2014) 3073–3089. [31] T. Mihara, Quantum steganography using prior entanglement, Physics Letters A 379 (2015) 952–955. [32] T.M. Cover, J.A. Thomas, Elements of Information Theory, John Wiley & Sons, 2012. [33] M.A. Nielsen, I.L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000. [34] M. Tahmasbi, M.R. Bloch, Steganography protocols for quantum channels, in: International Symposium on Information Theory (ISIT), 2019, p. TH3.R5.4. [35] K.M. Audenaert, A sharp continuity estimate for the von Neumann entropy, Journal of Physics A: Mathematical and Theoretical 40 (28) (2007) 8127. [36] R. Alicki, M. Fannes, Continuity of quantum conditional information, Journal of Physics A: Mathematical and General 37 (5) (2004) L55. [37] B.W. Reichardt, F. Unger, U. Vazirani, A classical leash for a quantum system: command of quantum systems via rigidity of CHSH games, arXiv:1209.0448, 2012.

12 Digital media steganalysis

Reinel Tabares-Soto^a, Raúl Ramos-Pollán^c, Gustavo Isaza^d, Simon Orozco-Arias^{b,d}, Mario Alejandro Bravo Ortíz^a, Harold Brayan Arteaga Arteaga^a, Alejandro Mora Rubio^a, Jesus Alejandro Alzate Grisales^a

a Universidad Autónoma de Manizales, Department of Electronics and Automation, Manizales, Caldas, Colombia
b Universidad Autónoma de Manizales, Department of Computer Science, Manizales, Caldas, Colombia
c Universidad de Antioquia, Department of Systems Engineering, Medellín, Antioquia, Colombia
d Universidad de Caldas, Department of Systems and Informatics, Manizales, Caldas, Colombia

12.1 Introduction

The first recorded appearance of steganography dates back to ancient Greece. The story, told by Herodotus, describes how Demaratus sent a message to Sparta to warn of the intention of Xerxes to invade Greece. The message was sent so that it would be hidden, making it imperceptible to inspection. To camouflage the message, the information was written directly on a wooden board and then covered with wax, on which a regular message was written. At first glance, one could see only the regular message, but if the wax was removed, then the hidden information on the wood was revealed. During World War II, the most common steganography technique was to microfilm a message and reduce it to the size of a small dot, appearing as a punctuation mark or the dot of a character within a text. For example, the dot on the vowel (i) could be a microfilm carrying a message [1]. This technique has provided an alternative to traditional methods used to hide information, such as cryptography, which is banned in some countries [2]. The standard formulation of steganography comes from the famous Prisoners' Problem, explained in [3]. The problem presents two prisoners, Alice and Bob, who wish to exchange messages while being under constant surveillance by the prison warden Eve. If Eve considers the messages suspicious, then she will not allow them to reach their receiver. Industrial steganography has been used to control the illegal copying of digital material; to this end, copyright societies embed information containing evidence of who owns the material or to whom it has been sold or sent. This is done by modifying the digital content in a way that is imperceptible to the human eye [4]. At the military level, steganography has been used to transmit important messages without attracting the attention of the adversary. It is also believed that steganography could have been used by illegal groups and terrorists to send information about attacks or targets [4].


FIGURE 12.1 Steganography process. Example of embedding a message in the least significant bits (LSB).

Steganography can be done from several domains: spatial, frequency, compressed, and file-structural. In particular, the last two domains may be easily detected. For this reason, the main focuses of steganography are the spatial and frequency domains. From the spatial domain, the algorithms directly change certain information of the digital media that will be undetectable to the human eye. One way to do this is introducing the message by sequentially or randomly changing the least significant bits (LSB) of data samples [5,6]. Fig. 12.1 explains the general steganography process that begins with a clean digital media file (cover), for example, an image. Then a message is introduced to this file by changing some bits using a steganographic algorithm. Following this process, a new file is obtained that contains the hidden message and does not show perceptible changes (stego). In image steganography the most employed algorithms in this domain are HUGO [7], HILL [8], MiPOD [9], S-UNIWARD [10], and WOW [11]. All these algorithms have different payloads; usually, the most used payload for experiments is 0.4 bits per pixel (bpp). From the frequency domain, several transformations are used, such as discrete cosine transform (DCT), discrete wavelet transform (DWT), and singular value decomposition (SVD); these are explained in detail in [12]. Coefficients generated with these transforms are changed to insert messages in the frequency domain such that it is imperceptible to the human eye. The most employed algorithms in this domain for images are J-UNIWARD [10], F5 [13], UED [14], and UERD [15]. These algorithms have a commonly used payload of 0.4 bpnzAC (bits per nonzero cover AC DCT coefficient). Steganalysis faces different tasks related to hidden messages in digital media, including the prediction of the payload used to introduce the message, the prediction of the steganographic algorithm used, and the classification of files with or without messages; the last one is of greater importance and development. There are two common approaches to solve the classification task of detecting whether there is a hidden message in a digital media file (cover stego). The first one uses Machine Learning (ML), where manual fea-
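As a concrete illustration of the spatial-domain LSB embedding sketched in Fig. 12.1 (a minimal toy example added here, not one of the adaptive algorithms listed above), the following Python snippet hides a short bit string by overwriting the least significant bits of an 8-bit grayscale image stored as a NumPy array and then reads it back:

```python
import numpy as np

def lsb_embed(cover: np.ndarray, bits: list[int]) -> np.ndarray:
    """Sequentially replace the LSB of the first len(bits) pixels with the message bits."""
    stego = cover.copy().ravel()
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | b   # clear the LSB, then set it to the message bit
    return stego.reshape(cover.shape)

def lsb_extract(stego: np.ndarray, n_bits: int) -> list[int]:
    """Read back the LSBs of the first n_bits pixels."""
    return [int(v & 1) for v in stego.ravel()[:n_bits]]

cover = np.random.randint(0, 256, size=(8, 8), dtype=np.uint8)  # stand-in cover image
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = lsb_embed(cover, message)
assert lsb_extract(stego, len(message)) == message
print("max pixel change:", int(np.abs(stego.astype(int) - cover.astype(int)).max()))  # at most 1
```

The adaptive algorithms listed above (HUGO, WOW, S-UNIWARD, etc.) instead choose where and how to modify pixels so as to minimize an embedding-distortion cost, rather than overwriting LSBs sequentially as in this toy example.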

Chapter 12 • Digital media steganalysis

261

FIGURE 12.2 Steganalysis process. Steganalysis based on ML (top part) and DL techniques (bottom part).

ture extraction is done, followed by the use of a traditional classifier in a separate process. The second involves Deep Learning (DL), where feature extraction is automated using convolutional or recurrent layers; then classification is done using fully connected neural networks. Both processes are performed in an end-to-end manner. Fig. 12.2 shows both steganalysis approaches, in which classification follows a feature extraction stage [16]. This chapter aims to explain the most contemporary and relevant techniques of steganalysis applied to digital media in the following order: Section 12.1 provides an introduction, Sections 12.2, 12.3, 12.4, and 12.5 explain steganalysis on images, audio, video, and text, respectively. Section 12.6 includes the conclusions of the chapter.

12.2 Image steganalysis Research on image steganalysis began in the late 1990s when Johnson and Jajodia [17] and Chandramouli et al. [18] conducted the first studies. Steganalysis has gone through different facets, starting with visual detections up to the use of convolutional neural networks (CNNs). From its beginnings, it was divided into two domains, spatial and frequency. In the spatial domain the random or adaptive LSB insertion method is used [19,6]. For the frequency domain (JPEG, Joint Photographic Experts Group), transformations, such as DCT, DWT, and SVD, are required for steganography. The next sections, 12.2.1 and 12.2.2, refer to traditional image steganalysis techniques that involve visual analysis or hand-crafted features. Section 12.2.3 includes aspects related to modern steganalysis, in which different algorithms and computational models are used to extract features and perform automatic classification.

12.2.1 Signature steganalysis Signature identification is one of the first methods used to detect images with hidden messages. The goal is to search for repetitive patterns to identify steganographic tool signatures. For example, in [17] the authors discovered that the steganographic algorithm of Hide and Seek made all pixels in the image divisible by four. When the steganographic al-


algorithm is applied to an RGB image with pixel values from 0 to 255, it generates an image with the same general appearance, but the pixel values now range only from 0 to 252. This type of signature is visually identified in the image histogram because the brightest value will always be 252.
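A signature of this kind can also be checked programmatically; the following short sketch (illustrative, based on the divisible-by-four observation reported in [17]) flags an image whose pixel values are all multiples of four:

```python
import numpy as np

def has_hide_and_seek_signature(img: np.ndarray) -> bool:
    """True if every pixel value is divisible by 4 (so the maximum value is at most 252),
    the signature attributed to the Hide and Seek tool in [17]."""
    return bool(np.all(img % 4 == 0))

clean = np.random.randint(0, 256, size=(16, 16), dtype=np.uint8)
suspect = (clean // 4) * 4           # force every value onto a multiple of 4
print(has_hide_and_seek_signature(clean), has_hide_and_seek_signature(suspect))  # False True
```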

12.2.2 Statistical steganalysis

Statistical steganalysis is more robust than signature steganalysis since mathematical analysis is more accurate than visual analysis. The images can be seen as matrices; therefore statistics can be obtained from them. Accordingly, if there is a modification in the matrix or image, then there will be a statistical change. Statistical steganalysis is subdivided into the following types.

• LSB embedding steganalysis: LSB steganography [20] consists in embedding messages in the least significant bits of digital images. One of the first papers on LSB embedding steganalysis was published by Zhi et al. [21], who proposed a detection method based on the loss of energy in the gradient. The relationship between the length of the embedded message and the energy of the gradient allows classifying the images into cover and stego. First, the energy of the cover image gradient is computed, and then the energy of the stego image gradient is calculated at different embedding rates. Afterward, the energy of the stego gradient is plotted, and the length of the stego message is estimated. Another method was proposed by Fridrich et al. [22], which involves selecting the pixels that display sudden color changes; for example, if there is a color reduction from bit 1 to bit 2, then these values are grouped as (1,2). All the dimensions of color are ordered and concatenated, and the homogeneity between them is evaluated. Fridrich demonstrated that the homogeneity is a quadratic function of the length of the secret message, and the best results are obtained on 8-bit GIF images. Avcibas et al. [23] created a specific algorithm for the detection of LSB embedding, based on calculations of binary similarity and characteristics of binary texture within bit planes. Guided by these features, a similarity measurement classifier was created, which classifies the image as cover or stego depending on the variance of similarity between the two images. This research showed that the steganographic algorithms of the time altered the texture of the image. Therefore Avcibas conducted further research, which evaluated the texture of the image, starting from the cooccurrence matrix.

• LSB matching steganalysis: Steganography based on LSB matching [24] is more difficult to detect than LSB embedding steganography [20]. One of the most relevant investigations in LSB matching steganalysis is [25]. The authors worked with grayscale images, applying the histogram characteristic function (HCF) and calibrating its center of mass (COM) using a downsampled image instead of the traditional histogram (a short sketch of the HCF center-of-mass feature follows at the end of this subsection). The fundamental problem of this research is the length of the introduced message because the algorithm only works if the embedded message is smaller than the number of pixels in the image. Another problem is being able to determine whether there is a hidden message in color images. To address this problem, Liu et al. [26] proposed a technique based on pixel correlation features and pattern recognition. The statistical pattern recognition algorithms are Fisher linear discriminant (FLD), an optimized Parzen classifier (ParzenC), the naive Bayes classifier (NBC), support vector machines (SVM), the linear Bayes normal classifier (LDC), and the quadratic Bayes normal classifier (QDC). These algorithms are trained to classify the image into cover and stego.

• Spread-spectrum steganalysis: Spread-spectrum steganography embeds messages in images as signals combined with Gaussian noise [27]. Due to its features, this type of steganography is more robust and has a low probability of detection. Despite this difficulty, detection methods were proposed in [28] by exploring the properties of the HCF center of mass, which serves as the main feature. With the HCF, hidden-noise analysis is possible, allowing one to analyze the effects of the embedded message based on the histogram. A simple Bayesian multivariate classifier [29] was used in this research. Another proposed method, based on the DCT, is described in [30]. This method relies on detecting the dispersion difference per block. First, the stego image is restored using spatial filters. Then the spread spectrum is simulated several times, and the variance is estimated from the low-frequency coefficients of the DCT and the cover image. Accordingly, the difference of the two dispersions is used to determine whether the image has a hidden message. Another method used in spread-spectrum steganalysis aims to find the correlation between pixels. This method was proposed by Sullivan et al. [31], who used a Markov chain model to compute the correlation between pixels, as well as an SVM (Joachims [32]) as a classifier. The classifier is trained with stego and cover images, providing outstanding results.

• Transform domain steganalysis: Wavelet quantization modulation steganalysis [33] was introduced by Liu et al. [34]. In the histogram analysis the cover image histogram is smoother than that of the stego image. The authors demonstrated that the energy difference for stego images produced with the quantization modulation method is much higher than for cover images; therefore it can be determined whether the image is stego or cover. One of the most relevant advances made in steganalysis occurred in [35], in which steganalysis was performed using a neural network. The digital images, both cover and stego, are analyzed in the DFT (discrete Fourier transform), DCT, and DWT transform domains. The neural network then computes the statistical features of the cover and stego images. This method showed promising classification percentages at the time.

• Additive noise steganalysis: Additive noise steganography [36] relies on generated noise to decrease the probability of detecting embedded messages. To counteract additive noise steganography, Jiang et al. [37] provide a steganalysis technique for binary images. The success of this method is based on the compression rate and the data insertion rate. It models steganographic insertion as an additive noise process; the compression index is used as the main statistic, which helps discriminate between stego and cover images since the data compression rate increases when a message is embedded.

Based on the above, we describe the most representative classification percentages for steganographic images. For LSB embedding steganalysis [23], it was shown that embedding a 5000-word message at 1 bpp produces an accuracy of 78.23%. For LSB matching steganalysis [24], embedding a 64 × 64-bit message generates an accuracy of 43%. Spread-spectrum steganalysis [30] has an accuracy of 90%, and transform domain steganalysis [35] has an accuracy of 85%. Note that, at the time, the steganographic algorithms were weaker than those existing today. In [38,39] it was demonstrated that statistical steganalysis algorithms based on ML techniques are not effective against recent steganographic techniques. Therefore DL methods are currently used.
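As an illustration of the HCF center-of-mass feature used in the LSB matching and spread-spectrum steganalysis methods above (a minimal sketch of the standard computation only, not the full calibrated detector of [25]):

```python
import numpy as np

def hcf_center_of_mass(img: np.ndarray) -> float:
    """Center of mass of the histogram characteristic function (HCF): the DFT of the
    grayscale histogram, summarized by sum(k*|H_k|)/sum(|H_k|) over the first half
    of the spectrum."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    hcf = np.abs(np.fft.fft(hist))[:128]   # magnitude of the first half of the DFT
    k = np.arange(128)
    return float((k * hcf).sum() / hcf.sum())

img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
noisy = np.clip(img.astype(int) + np.random.choice([-1, 0, 1], img.shape), 0, 255).astype(np.uint8)
print(hcf_center_of_mass(img), hcf_center_of_mass(noisy))  # ±1 embedding tends to lower the COM
```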

12.2.3 Deep learning applied to steganalysis of digital images

The design and implementation of different Convolutional Neural Networks (CNNs) is considered the main contribution of DL to steganalysis. The CNN architectures have evolved from previous work on neural networks. The proposed CNNs can be listed in chronological order as follows: Qian-Net or GNCNN [40], Xu-Net [41], Ye-Net [42], Yedroudj-Net [43], SRNet [44], and Zhu-Net [45]. The first studies in this field included an unsupervised learning implementation using stacks of autoencoders. Then several publications showed advances in supervised learning following three fundamental principles for steganalysis: i) reinforcement of the steganographic noise using a fixed high-pass filter, ii) feature extraction, and iii) classification. The proposed architectures unified these principles into a single network so that its parameters are optimized simultaneously. Developments were carried out first in the spatial domain and then in the frequency domain (JPEG). In the JPEG domain, steganalysis is performed on sizes 512 × 512 or 256 × 256 depending on the available hardware. It can be done on grayscale or color images, mainly with quality factors (QF) 100, 95, 90, 85, 80, and 75.
Researchers have proposed several changes in CNNs aiming to improve performance, such as:
• increasing the network depth or using fully connected networks [46];
• using custom activation functions to ensure network convergence and improve steganographic image detection rates [40,42,43];
• using CNNs with skips between convolutional layers (residual or dense networks) to design very deep networks (20 or more layers), achieve network convergence, and improve detection percentages [47,44,48–52];
• training sets of CNNs and transferring the learned parameters to CNNs with difficult convergence or low detection percentage [53–55];
• training CNNs with a given database and testing the network with a completely different database to determine the reliability of the designed CNNs (cover-source mismatch) [46,54];
• strengthening statistical modeling with an absolute value layer (ABS) [41,43,45,53];
• improving steganographic noise extraction using filters designed in SRM and performing feature extraction and classification with CNNs [56,42,43,45];
• using real-world databases, such as ImageNet, to determine how well a CNN can adapt to datasets with diverse resolution and capture characteristics [56,40,47,44,48,49];
• placing two CNNs in competition, where the first network performs steganography and the second steganalysis, to obtain an automatic steganographic process through the feature learning of both processes [57–64];
• training a network to distinguish high-resolution images from low-resolution images [65];
• predicting the payload (quantitative steganalysis) of a steganographic image using DL in the spatial and JPEG domains [66,67];
• augmenting the database through trimming, rotation, and interpolation operations, as well as using cameras with similar or different features for image acquisition, while being cautious about resizing [56,42,68,45];
• placing three CNNs to work in parallel [69], where each network uses different activation functions (ReLU, Sigmoid, and TanH) and different filters in the preprocessing layer inspired by Gabor filters [70] and SRM (linear and nonlinear) [71];
• and, finally, performing similar parallel work with color images [72], among others.
The high-pass filter shown in Eq. (12.1) is widely used by the proposed networks, although its parameters are not optimized during the training process. This filter was developed in [71], and it was first used in steganalysis in [40]. Since the high-pass filter performs image preprocessing to enhance the steganographic noise, it decreases the impact of the image content. The filter also helps the CNN training process to converge. However, not all networks use this filter; for example, SRNet learns all its parameters automatically without the need for heuristic approaches.
$$
K = \frac{1}{12}
\begin{pmatrix}
-1 & 2 & -2 & 2 & -1\\
2 & -6 & 8 & -6 & 2\\
-2 & 8 & -12 & 8 & -2\\
2 & -6 & 8 & -6 & 2\\
-1 & 2 & -2 & 2 & -1
\end{pmatrix}.
\tag{12.1}
$$
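As an illustration, such a fixed preprocessing step can be sketched as a plain 2-D convolution with the kernel of Eq. (12.1); in an actual CNN the same kernel would simply initialize a non-trainable convolutional layer. SciPy is assumed here only for the convolution.

```python
# A minimal sketch of the preprocessing step built around the filter of
# Eq. (12.1): the kernel is convolved with the image so that steganographic
# noise is amplified and image content suppressed before the CNN sees it.
import numpy as np
from scipy.signal import convolve2d

KV = (1.0 / 12.0) * np.array([
    [-1,  2,  -2,  2, -1],
    [ 2, -6,   8, -6,  2],
    [-2,  8, -12,  8, -2],
    [ 2, -6,   8, -6,  2],
    [-1,  2,  -2,  2, -1],
], dtype=np.float32)

def preprocess(image: np.ndarray) -> np.ndarray:
    """Return the high-pass residual of a grayscale image (H x W array)."""
    return convolve2d(image.astype(np.float32), KV, mode="same", boundary="symm")
```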

The general procedure of the CNN is shown in Eq. (12.2), where $M^l$ is each of the feature maps of the $l$th layer, $M_i^{l-1}$ is the $i$th feature map of the previous layer, $K_i^l$ is the $i$th kernel of the $l$th layer, $b^l$ is the bias parameter of the $l$th layer, $*$ is the convolution operation, $f$ is a nonlinear operation known as the activation function, pool is the pooling operation, and norm is the normalization operation. Operations in convolutional layers are performed in the following order: convolution, normalization, activation function, and pooling. Feature maps obtained by the last layer are used as input to the classification module composed of one or several layers of fully connected neurons and a Softmax layer. The last fully connected layer aims to normalize the CNN values between [0, 1], which correspond to the probability that the image is cover or stego.
$$
M^{l} = \operatorname{pool}\left( f\left( \operatorname{norm}\left( \sum_{i=1}^{n} M_i^{l-1} * K_i^{l} + b^{l} \right) \right) \right).
\tag{12.2}
$$
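The following is a minimal tf.keras sketch of one such block, applying the operations in the order given for Eq. (12.2); the kernel counts and input size are illustrative and do not correspond to any of the published architectures.

```python
# A minimal tf.keras sketch of one convolutional block following the order
# stated above for Eq. (12.2): convolution, normalization, activation, pooling.
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, n_kernels: int, kernel_size: int = 3):
    x = layers.Conv2D(n_kernels, kernel_size, padding="same")(x)  # sum_i M_i^{l-1} * K_i^l + b^l
    x = layers.BatchNormalization()(x)                            # norm(.)
    x = layers.Activation("relu")(x)                              # f(.)
    x = layers.AveragePooling2D(pool_size=2)(x)                   # pool(.)
    return x

inputs = tf.keras.Input(shape=(256, 256, 1))
x = conv_block(inputs, 8)
x = conv_block(x, 16)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2, activation="softmax")(x)   # cover / stego probabilities
model = tf.keras.Model(inputs, outputs)
```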

The most commonly used nonlinear activation functions in CNNs are: i) the rectified linear unit (ReLU) [73], ii) the hyperbolic tangent (TanH) [74], iii) the Gaussian, and iv) the truncated linear unit (TLU) [75]. The last one is exclusive to DL applied to steganalysis; it limits the range of values so that the network does not have to model large values. Usually, TanH is used in the first layers and ReLU in the last ones. The operation used for data normalization is batch normalization (BN), which is summarized in Eq. (12.3) [76]. BN works by first normalizing the distribution of each characteristic in the feature map to have zero mean and unit variance, allowing rescaling and retranslation of the distribution, if needed.


Given a random variable $X$ whose realization is a value $x \in \mathbb{R}$ of the feature map, the BN of this value $x$ is
$$
BN(x, \gamma, \beta) = \beta + \gamma\,\frac{x - E[X]}{\sqrt{\operatorname{Var}[X] + \epsilon}}
\tag{12.3}
$$

with $E[X]$ the expectation, $\operatorname{Var}[X]$ the variance, and $\gamma$ and $\beta$ two scalars representing a rescaling and a retranslation. The expectation $E[X]$ and variance $\operatorname{Var}[X]$ are updated at each batch, whereas $\gamma$ and $\beta$ are learned by back-propagation. In practice, BN makes learning less sensitive to the initial parameters [76] and therefore allows one to use a higher learning rate, which speeds up the learning process and improves classification accuracy [54]. BN was not used in the first proposed CNNs.
Steganographic noise embedded in stego images is usually very weak; thus average pooling [77] is very often used in CNNs, since this operation favors the propagation and preservation of this type of noise, which is not possible using max pooling [77]. Moreover, pooling is most commonly applied as a local operation computed over a neighborhood of values.
The most common performance metric reported by steganalysis researchers is classification accuracy [78], or its complement, the error percentage. Accuracy is the proportion of correct predictions over a given dataset:
$$
Acc = \frac{\#\,\text{Correct Predictions}}{\#\,\text{Total Predictions}} \cdot 100\%.
\tag{12.4}
$$
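A minimal NumPy sketch of Eqs. (12.3) and (12.4), with the batch statistics computed directly from the data, is given below; gamma, beta, and the placeholder arrays are illustrative.

```python
# A minimal NumPy sketch of Eqs. (12.3) and (12.4). `feature` is one feature-map
# channel across a mini-batch; gamma and beta are the learned scalars.
import numpy as np

def batch_norm(feature: np.ndarray, gamma: float, beta: float, eps: float = 1e-5):
    return beta + gamma * (feature - feature.mean()) / np.sqrt(feature.var() + eps)

def accuracy(predictions: np.ndarray, labels: np.ndarray) -> float:
    """Classification accuracy of Eq. (12.4), in percent."""
    return 100.0 * np.mean(predictions == labels)
```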

On the other hand, the error percentage is computed as Er = 100% − Acc. These are simple metrics to determine the performance of a model, in this case, a steganalysis scheme. However, given the binary classification task, in which classes are always balanced (cover–stego pairs), these metrics are representative enough to allow making decisions for model improvement.
Fig. 12.3 shows several CNN architectures applied to image steganalysis. Purple indicates the pixels that are input to the network, which, in most experiments, is 256 × 256 pixels due to processing and memory limitations. The preprocessing layer is shown in yellow; its aim is to increase the power of the noise introduced by the steganography process and decrease the image content. The convolutional layers, which perform the hierarchical feature extraction, are displayed in green. In blue, the activation functions, scaling, absolute value layers, and normalization are indicated. White shows the pooling operation, which reduces the dimensionality of the feature map and the computational complexity. All the CNNs designed so far use the average pooling operation due to the low power of the steganographic noise; this requires using all the pixels of the region where the pooling operation takes place to avoid losing information. Red and cyan show the classification module consisting of fully connected neuron layers and a Softmax, which is responsible for delivering a probability distribution between 0 and 1 for each class, defining whether the image is cover or stego.


FIGURE 12.3 Architecture of CNNs applied to steganalysis of digital images. Most used CNN architectures. The data inside the boxes is structured as follows: number of kernels x (height x width x number of feature maps as input). The data outside the box is structured as follows: number of feature maps x (height x width). If the Stride or Padding is not specified, Stride=1 and Padding=0 are assumed. Based on [16].


Table 12.1 Error percentage of architectures for two steganographic algorithms. Error percentage of the CNNs and SRM for two steganographic algorithms with payloads of 0.4 bpp and 0.2 bpp.

Algorithm             WOW 0.2 bpp   WOW 0.4 bpp   S-UNIWARD 0.2 bpp   S-UNIWARD 0.4 bpp
SRM+EC (2012)         36.5          25.5          36.6                24.7
Qian-Net (2015)       38.6          29.3          46.3                30.9
Xu-Net (2016)         32.4          20.7          39.1                27.2
Ye-Net (2017)         33.1          23.2          40.0                31.2
Yedroudj-Net (2018)   27.8          14.1          36.7                22.8
SRNet (2018)          24.6          13.1          32.6                18.4
Zhu-Net (2019)        23.3          11.8          28.5                15.3

SRNet [44] showed the best performance in the JPEG domain since it reduces the use of the manual devices and heuristics employed by other networks to capture steganographic noise; this network operates in both the spatial and frequency domains. However, in the spatial domain, Zhu-Net [79] has the best results. This network characteristically uses an SRM-inspired filter bank to initialize the preprocessing layer weights, which are then optimized during training to enhance the noise introduced by the steganography process and decrease the image content. Zhu-Net uses separable convolutions to improve the feature extraction process, as well as multilevel average pooling known as spatial pyramid pooling (SPP) [80] to allow the network to analyze arbitrarily sized images. Table 12.1 shows the error percentages of the described CNNs and SRM+EC when detecting two steganographic algorithms in the spatial domain (S-UNIWARD and WOW) with payloads of 0.4 bpp and 0.2 bpp.
Most of the reported architectures (Fig. 12.3) apply the Clairvoyant scenario [46], which is described as follows (a minimal sketch of the corresponding data split is given after this list):
• The steganalyst knows which algorithm was used to embed the messages.
• The steganalyst has good knowledge of the statistical distribution of the image databases used by the steganographer.
• The payload of the embedded messages is known.
• The same image size is always used.
• The steganalyst has access to a set of cover–stego images used by the steganographer.
• The BOSSBase database of 10,000 images is used, of dimensions 512 × 512 or 256 × 256, depending on the hardware available.
• From BOSSBase, another 10,000 images are constructed by embedding messages with one of the existing steganographic algorithms (stego), so that the complete set has 10,000 pairs of images (cover–stego). From this set, 5000 pairs are randomly selected; the CNN is trained with 4000 pairs, and validation is done with 1000 pairs; the remaining 5000 pairs are used for CNN evaluation.
• The initialization of the filter weights is done by Xavier's method [81].
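The split described in the BOSSBase items above can be sketched as follows; only pair indices are generated, and loading the actual images is assumed to happen elsewhere.

```python
# A minimal sketch of the data split used in the Clairvoyant scenario:
# 10,000 cover-stego pairs, of which 5000 randomly chosen pairs are split into
# 4000 training and 1000 validation pairs; the remaining 5000 pairs are kept
# for evaluation.
import numpy as np

rng = np.random.default_rng(seed=42)
pair_ids = rng.permutation(10_000)          # one id per cover-stego pair

train_ids = pair_ids[:4000]
val_ids   = pair_ids[4000:5000]
test_ids  = pair_ids[5000:]

print(len(train_ids), len(val_ids), len(test_ids))   # 4000 1000 5000
```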


Experiments done under this scenario often use the BOSSBase V1.01 database [82,83], which contains 10,000 8-bit images in Portable Gray Map (PGM) format of size 512 × 512. The second most widely used database is BOWS2 [84], which consists of 10,000 8-bit images in PGM format of size 512 × 512. Additionally, the extensive ImageNet [85] is commonly used, which is composed of more than 14 million images of different sizes. The ImageNet database is normally employed for experiments conducted in the frequency domain (JPEG). In some experiments, the previous databases were resized or trimmed to 256 × 256 due to the computational cost and memory limitations of the research teams.
Now that we have completed this image steganalysis overview and described the current DL methods, we will share some useful ideas for improving Ye-Net. Ye et al. [42] proposed this architecture in June of 2017 (Fig. 12.3). Improvements in convergence and classification accuracy were achieved by applying the ideas discussed below.
• Improved Ye-Net: The improvements in Ye-Net have allowed generating a more optimal and accurate Ye-Net architecture. An essential step relies on its development in TensorFlow and Keras, since the original is in Caffe. This allows researchers in steganalysis to work more easily with new tools for DL. We reproduced the results reported in [43,16] and conducted several experiments using S-UNIWARD at 0.4 bpp. Improvements in the different layers are shown below.
• Activation function: The original network uses the TLU activation function in the first layer. We performed tests using this function, as well as ReLU and TanH. The best results were achieved using the TanH activation function multiplied by three and setting the first layer to be nontrainable. This is because TLU and TanH have similar shapes, the latter showing a smoother curve. The results using the ReLU function were not significant.
• Batch normalization layer: The original network does not use BN layers. BN allows much faster and more efficient convergence. To this end, we added this type of layer after every convolutional layer.
• Optimizer: In terms of the optimizer or optimization algorithm, Ye-Net was designed using the AdaDelta optimizer. We tested different optimization algorithms provided by the Keras library and achieved the best results when using the RMSprop optimizer.
• Dense layer optimization: The data enters after having passed through a global average pooling layer; the network does not use the common flatten layer. Initially, we used the dense layers from Yedroudj-Net of 256 and 1024 neurons, which converge rapidly. However, this did not reach better accuracy, and the loss was more significant. We built our own stack of dense layers (128, 64, and 32 neurons) using a ReLU activation function. In the first two dense layers, we also applied dropout (value = 0.2).
The architecture with the proposed modifications achieved an accuracy higher than 78% on the S-UNIWARD (0.4 bpp) steganographic algorithm. The results are better than the original Ye-Net (2017) (68.8%) and Yedroudj-Net (2018) (77.2%), yet lower than SRNet (2018) (81.96%) and Zhu-Net (2019) (84.7%). The results are based on [16,43]. We overcame Yedroudj-Net in only 12 epochs. Improvements to this and other networks will soon be published. A minimal sketch of the modifications listed above follows.
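Below is a minimal tf.keras sketch of the modifications just listed: a fixed first convolution followed by 3·tanh, BN after convolutions, global average pooling instead of flatten, a 128/64/32 dense head with ReLU and dropout 0.2, and the RMSprop optimizer. The Ye-Net convolutional trunk is only hinted at by placeholder layers, and the filter counts are illustrative assumptions rather than the exact published configuration.

```python
# A minimal tf.keras sketch of the modifications described above.
# This is an illustration of the ideas, not the authors' exact code.
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(256, 256, 1))

# First (preprocessing) convolution kept non-trainable, followed by 3*tanh.
# In practice its weights would be set to an SRM-style filter bank (omitted).
pre = layers.Conv2D(30, 5, padding="same", trainable=False)(inputs)
pre = layers.Lambda(lambda t: 3.0 * tf.math.tanh(t))(pre)

# Placeholder for the remaining Ye-Net convolutional layers, each followed by BN.
x = layers.Conv2D(32, 3, padding="same")(pre)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.GlobalAveragePooling2D()(x)          # instead of Flatten

x = layers.Dense(128, activation="relu")(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(64, activation="relu")(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(32, activation="relu")(x)
outputs = layers.Dense(2, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```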

12.2.4 Summary and perspectives

Regarding resources and materials for steganalysis researchers, the implementation of CNNs is done with Cuda-convnet [86], Caffe [87], and TensorFlow [88], which are frameworks that allow researchers to create CNNs quickly and flexibly. In terms of dedicated steganalysis work, there is a large number of tools on the Binghamton University website [89,90], such as algorithms for steganography and steganalysis (both in the spatial and frequency domains), traditional steganalyzers, applications of DL techniques, digital image databases for experiments, and several publications. On the website of the Laboratory of Informatics, Robotics, and Microelectronics of Montpellier, France (LIRMM) [91], there are several DL projects applied to steganalysis, whose algorithms can be downloaded.
In this section, we discussed the application of steganalysis to digital images and the results of several experiments. We propose the following possibilities as future work:
• Generating new CNNs that unify the advantages of existing networks, or generating an entirely new architecture (dense, shallow, and/or deeper architectures), to improve detection percentages in the spatial and frequency domains.
• Using different digital image databases, considering, for instance, the use of different cameras, to run more experiments and further study the cover-source mismatch effect.
• Training existing CNNs with large-scale databases and larger image sizes. To do this, it is necessary to train under a CPU and GPU cluster architecture to meet the processing and memory demands.
• Training CNNs with a given steganographic algorithm and testing on another algorithm to study how much transfer can occur from one algorithm to another.
• Generating new CNNs and designing new computational elements that allow extracting the noise generated by the steganographic process more efficiently, improving feature representation, classifying images in the spatial or frequency domain, and processing arbitrary images. This task should be achieved in the most automatic way possible.

12.3 Audio steganalysis

The purpose of audio steganalysis is to determine whether an audio signal has a hidden message. There are two main groups of techniques used in this task, compressed and noncompressed methods, which refer to the format of the audio signals, such as WAV (noncompressed) or MP3 and AAC (compressed). To better understand the problem, we must discuss the steganographic algorithms used for hiding a message in an audio file and how they work. There are two widely implemented ways of hiding messages in audio; one of them is modifying the least significant bit (LSB) of a data sample in the file. The second refers to non-LSB methods that modify different parameters of the signal, such as its amplitude or coefficients obtained via a frequency transform like the wavelet or Fourier transform. One of the most common LSB algorithms is Steghide, which exchanges the position of the LSB rather than overwriting it. More information can be found in [92]. On the other hand, the most mentioned non-LSB methods are FreqSteg, which hides the message at the highest audio signal frequencies not audible to humans [93], and integer wavelets, in which the detail wavelet coefficients of the signal are modified with the hidden message [94].

12.3.1 Methods

12.3.1.1 Noncompressed audio formats

Traditional audio steganalysis methods rely on sophisticated hand-crafted features. There are two main approaches to solving the problem for noncompressed signals [95]. Calibrated methods estimate the cover signal and compare it to the stego signal to extract the features that will later be passed to the classifier. In contrast, noncalibrated methods extract features directly from the signal. Most of the extracted features belong to the frequency or time domains.
• Calibrated methods: In terms of calibrated methods, the main idea is that steganography can be seen as the process of adding noise to an audio file. Thus the steganalysis procedure may start with denoising the received signal to estimate the cover, or clean, signal. A comparison of the stego and denoised signals, or more specifically, the subtraction of one from the other, generates a representative value of the noise or embedded data in a given signal. The scheme of this procedure is presented in Fig. 12.4.

FIGURE 12.4 Block diagram for calibrated steganalysis. The main idea is to compare noise quantity in the original signal and estimated cover.

The most effective method for cover estimation is wavelet-based denoising [96], which computes the wavelet coefficients and reconstructs the signal after applying a threshold to the transform coefficients. The threshold value is determined by the standard deviation of the coefficients and the signal length [97]. In the same paper, Özer et al. [96] computed statistical distance metrics on denoised audio signals to extract features, based on the idea that additive noise affects these metrics. Using these features and SVM as the classifier, the authors obtained an average discrimination performance of 88% (a minimal sketch of this calibration step is given after the feature list below).
• Noncalibrated methods: As mentioned, for noncalibrated methods most of the features extracted from the audio signal belong to the frequency or time domains. Usually, these features are statistical measures of the data samples or transform coefficients, such as the mean, standard deviation, and variance. More discriminative features are:
• Correlation between two consecutive data samples.
• Markov features, based on a mathematical generalization for a random process (i.e., signal) in which there is no dependence between future and past data.

• Mel-frequency cepstral coefficients (MFCC), which are widely used in speech recognition applications. They are based on human hearing. The coefficients are computed using the frequency components of the signal and a set of triangular weighting functions (a filter bank) on the Mel scale. The MFCCs are given by Eq. (12.5), where $S$ stands for the original signal and $FT$ indicates the Fourier transform:
$$
MFCC = FT(\log(FT(S))).
\tag{12.5}
$$
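A minimal sketch of this pipeline is shown below, assuming PyWavelets for the denoising step: the cover is estimated by soft-thresholding the wavelet coefficients, the residual serves as the noise estimate, and cepstrum-like coefficients follow the simplified formula of Eq. (12.5). The wavelet, threshold rule, and feature choices are illustrative assumptions, not the exact settings of the cited papers.

```python
# A minimal sketch of the calibrated pipeline and the cepstral features of
# Eq. (12.5): estimate the cover by wavelet denoising, take the residual as
# the "noise" signal, and compute cepstrum-like coefficients with plain FFTs.
import numpy as np
import pywt

def estimate_cover(signal: np.ndarray, wavelet: str = "db8", level: int = 4):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Soft-threshold detail coefficients; the threshold is derived from their
    # standard deviation and the signal length, as described above.
    sigma = np.std(coeffs[-1])
    thr = sigma * np.sqrt(2.0 * np.log(len(signal)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

def cepstral_features(signal: np.ndarray, n_coeffs: int = 13):
    # Simplified form of Eq. (12.5): FT(log(|FT(S)|)).
    spectrum = np.abs(np.fft.rfft(signal)) + 1e-12
    cepstrum = np.abs(np.fft.rfft(np.log(spectrum)))
    return cepstrum[:n_coeffs]

audio = np.random.randn(16_000)              # placeholder for a loaded audio frame
residual = audio - estimate_cover(audio)     # calibrated "noise" estimate
features = np.concatenate([cepstral_features(audio),
                           [residual.std(), residual.var()]])
```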

More information about the features used for audio steganalysis can be found in [95]. Kraetzer et al. [98] used a set of features that includes statistical features computed from different frames or windows of the signal. These include empirical variance, covariance, entropy, LSB ratio, LSB flipping rate, mean and median of the samples, and Mel-frequency cepstral coefficients describing the rates of change in different spectrum bands. Using this set of features to identify messages hidden by different steganographic algorithms and SVM as classifier, the prediction rate varied between 52% and 100%. Other works that applied the calibrated schemes include Rekik et al. [99] and Liu et al. [100].

12.3.1.2 Compressed audio formats

Compressed audio formats such as MP3 or AAC are broadly used in different audio applications and devices. Their main feature is the ability to provide high-quality audio while maintaining a high compression rate and small file sizes. Due to their popularity, these formats have become an ideal way to send hidden messages. Thus new steganographic algorithms such as MP3Stego [101] and Huffman-coding-based methods [102] were created. Most of these steganographic algorithms change the modified discrete cosine transform (MDCT) coefficients, which are used during the compression process of audio files. To detect hidden messages in such file types, Jin et al. [103] used Markov features based on MDCT coefficients. These coefficients were extracted from MP3 audio signals, achieving high detection accuracy with a low embedding rate. The main idea behind their approach is that steganographic algorithms affect the quantization step between frames during compression, even at low embedding rates. The same idea can be applied to audio files in AAC format. For instance, Ren et al. [104] used statistical features of different frames of the signal to classify cover and stego with an ensemble classifier, obtaining a detection accuracy of 85.34% at an embedding rate above 50%.
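As an illustration of the Markov-feature idea, the sketch below builds a first-order transition-probability matrix from a sequence of quantized coefficients; extracting the MDCT coefficients from an MP3 or AAC bitstream requires a decoder and is assumed to be done elsewhere.

```python
# A minimal sketch of first-order Markov transition features computed from a
# sequence of quantized coefficients (e.g., MDCT coefficients of an audio frame).
import numpy as np

def markov_features(coeffs: np.ndarray, clip: int = 4) -> np.ndarray:
    """Return the flattened transition-probability matrix of clipped coefficients."""
    q = np.clip(coeffs.astype(int), -clip, clip) + clip      # map to 0..2*clip
    size = 2 * clip + 1
    counts = np.zeros((size, size))
    for a, b in zip(q[:-1], q[1:]):
        counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1                                       # avoid division by zero
    return (counts / rows).ravel()
```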

12.3.1.3 Modern audio steganalysis

In more recent years, owing to the huge advancements made in DL techniques, audio steganalysis has taken advantage of these models, usually outperforming the traditional methods. The most remarkable benefit of using DL is that manual feature extraction is no longer necessary. Here we present some of the latest works published on audio steganalysis using DL models.

• Deep Belief Network (DBN): Deep belief networks are a special kind of feed-forward networks composed of stacked restricted Boltzmann machines (RBMs). RBMs are essentially neural networks with two layers, one visible and one hidden. In these networks a given probability distribution activates a neuron in the hidden layer. The diagram of an RBM is shown in Fig. 12.5.

FIGURE 12.5 Scheme of a restricted Boltzmann machine (RBM). An RBM is made of two layers, one visible and one hidden, with all units connected between layers. The activation of a unit in the hidden layer is given by a probability distribution.

Paulin et al. [105] built a DBN by stacking several RBMs in such a way that the first visible layer receives the input data and each subsequent RBM is trained with the hidden layer of the previous one. They used MFCCs as features and a DBN as classifier to solve two different tasks. The first one was to determine whether a signal has a hidden message, and the second one was to identify which steganographic algorithm was used to embed the data. They compared their DBN architecture against SVM and a Gaussian mixture model (GMM), finding that, in general terms, SVM achieved the best results on the first task and DBN outperformed SVM and GMM by 5% precision in the second task.
• Convolutional Neural Networks (CNNs): These networks use the convolution operation to extract features from the input data. The general idea is that the network has several filters that are slid over the signal while computing coefficients that can be used as features. Using this type of network, Chen et al. [106] achieved a classification accuracy of 88.3% on signals with an embedding rate of 0.5 bps (bits per sample). Similarly, Lin et al. [107] used a CNN with an improvement: they used four filters that were manually initialized as high-pass filters, achieving better testing accuracy when using the LSB matching steganographic algorithm with different embedding rates.
• Residual Neural Networks (ResNets): These are specially designed to solve the vanishing gradient problem that appears when training very deep neural networks. ResNets overcome this issue by allowing skip connections or shortcuts between layers. Formally, this shortcut modifies the function F(x) that is being mapped from the data by adding the identity function x. Therefore the mapping function is now F(x) + x, which is easier to optimize since it is referenced to the input data. A diagram of a residual block is presented in Fig. 12.6. The benefit of training neural networks with a higher number of layers is that the network will be able to extract or learn more complex and discriminative features from the data, which may lead to better performance. Ren et al. [108] designed a steganalysis scheme in which a ResNet is used for extracting the features from the data. Since this work is fairly different from the general schemes described in this section and shows outstanding results, we will describe it in detail.


FIGURE 12.6 Diagram of a residual block. A residual block is the basic unit of a ResNet architecture. The arch around the layers represents the shortcut.

FIGURE 12.7 S-ResNet architecture. This neural network is made of 31 convolutional layers, a nontrainable layer, and three groups of ten layers. The first group (Conv-1) uses ten filters in each layer, the second group (Conv-2) uses 20 filters, and the third group (Conv-3) uses 40 filters. The layers in group one have a stride = 1 in both directions, and the layers in the second and third groups have a stride = 2 in the vertical and horizontal axes.

The authors used the spectrogram of the audio signal as input to the neural network. A spectrogram is a visual representation of the frequency components of a signal over time. To plot a spectrogram, the signal must be divided into many frames. The horizontal axis of the plot represents time, the vertical axis represents frequency, and the color of the lines or dots represents the power of the signal at a given time and frequency. In this way the spectrogram can be seen as an image of size n × m, where n is half the frame size, and m is the number of frames for a given window size. Consequently, the authors passed the spectrogram of a signal to a neural network, which they called S-ResNet, shown in Fig. 12.7. This particular architecture is composed of 31 convolutional layers. The first layer has four fixed filters, and the other filters in the network are updated during training. All the filters in these layers have a size of 3 × 3. After each convolutional layer, there are a batch normalization and a ReLU layer. The former accelerates the learning process by normalizing the activations of the previous layer, and the ReLU activation provides nonlinearity to the function that is being mapped, allowing the model to learn more complex patterns. A skip connection, which defines the residual blocks and allows training such a deep neural network, is found every two convolutional layers. Since convolutional layers tend to make the data volume bigger by adding more channels, two average pooling layers reduce this data volume after every five residual blocks. There is a global average pooling layer at the end of the network to flatten the volume into a 1-D vector of length 40, called the feature vector. The first four filters of the network were designed by hand to amplify the noise of the signal, which can be associated with hidden messages. The implementation of these filters was motivated by the work on image steganalysis using spatial rich model features [71]. Finally, the feature vector was passed to an SVM to execute the final training and classification steps. Using this scheme and testing different window sizes to generate the spectrogram, the authors achieved an average classification accuracy of 94.98% on the AAC format and 99.93% on the MP3 format (a minimal spectrogram sketch is given after this list).
• Recurrent Neural Networks (RNNs): These networks are specifically designed to detect and extract useful information from sequence data or time series. Thus, RNNs can be used on audio files, such as speech recordings, to detect correlation patterns. With this in mind, Lin et al. [109] proposed an RNN architecture to detect hidden messages in voice-over-IP (VoIP) streams, which require a short response time and high detection accuracy. With this model, they achieved 90% detection accuracy, even when the sample was only 0.1 s long, and an average testing time of 0.15% of the sample length.
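A minimal sketch of the spectrogram input used by S-ResNet is given below, computed here with SciPy; the frame size and overlap are illustrative and would be tuned as in the original experiments.

```python
# A minimal sketch of a spectrogram "image" for audio steganalysis.
import numpy as np
from scipy.signal import spectrogram

def audio_to_spectrogram(signal: np.ndarray, sample_rate: int, frame_size: int = 512):
    freqs, times, sxx = spectrogram(signal, fs=sample_rate,
                                    nperseg=frame_size, noverlap=frame_size // 2)
    # Log scale keeps the weak steganographic noise from being swamped by
    # high-energy components; the result is an (n_freqs x n_frames) array.
    return np.log(sxx + 1e-12)
```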

12.3.2 Summary and perspectives

Several important ideas regarding audio steganalysis have been introduced, such as audio formats, steganographic algorithms, and how the algorithms affect certain signal features. More importantly, we discussed how ML and DL methods have been applied to this task. In audio steganalysis, there is no clear baseline for researchers, as there is for image steganalysis. In the latter, every researcher tests their work on the BOSSbase and BOWS2 databases, and most of them use the S-UNIWARD steganographic algorithm with an embedding rate of 0.4 bpp [16]. In view of this, it is difficult to compare the results published on audio steganalysis due to the lack of a unified database for testing; most researchers use a different database. As discussed in [95], the works on audio steganalysis have focused on solving or breaking a specific steganographic algorithm, among which LSB techniques are the most used. In this sense, it would be interesting to research and design additional steganalysis schemes for non-LSB steganographic algorithms.

12.4 Video steganalysis

Currently, the speed of the Internet allows the comfortable use of videos. The human eye is not sensitive to small changes in digital media, so videos can be modified to send hidden information (steganography techniques). Therefore methods to detect these changes become necessary (steganalysis techniques). Video steganalysis aims to determine the possibility of hidden data in these files [110].
Video steganalysis attacks steganographic architectures whose main methods include motion vectors (MVs) and intra- and inter-frame embedding. These have been applied to videos in the H.264/AVC standard and, more recently, in the HEVC standard. The reader will learn about the main methods in this field of study and how video steganalysis works. Under previous methods (based on MVs), we describe formerly used approaches; under recent methods (based on intra- and inter-frame embedding), a current algorithm is described in detail.

12.4.1 General context

Fig. 12.8 shows how confidential information (a secret message) is embedded into the original video (cover video). The result is a video with a hidden message (stego video). Video steganalysis searches for secret information in the video files. These are composed of frames used to embed data. Video sequences are processed frame by frame.

FIGURE 12.8 A secret message with a cover video produces a stego video. This figure shows a system in which video steganalysis is applied to identify whether the tested video is cover or stego.

12.4.2 Previous methods

Initially, image steganalysis techniques were directly applied, focusing on detecting the noise generated by the embedded message. These methods did not provide good results due to the redundant information from one frame to another: the images of a video do not change much from one frame to the next. Therefore video steganalysis methods have notable differences compared to image steganalysis. The steganalytic approach focuses on motion estimation schemes and inter- and intra-frame embedding. We will discuss MV-based video steganalysis methods, which were applied until 2018.

• MoViSteg algorithm: In 2007, Jainsky et al. [111] proposed an algorithm for video steganalysis based on movement, called MoViSteg (motion-based video steganalysis). It is suitable when only part of the frames have hidden information. The method is shown in Fig. 12.9. The architecture has two distinct stages: i) signal processing by means of motion estimation, and ii) detection based on asymptotic relative efficiency (ARE). It employs an efficient detector, which uses many weak samples and signals required to classify the media as cover or stego video.
• Kancherla and Mukkamala algorithm: In 2009, Kancherla and Mukkamala [112] presented a methodology for video steganalysis that explores spatial and temporal redundancies. This method uses neural networks and SVM. The database contained


FIGURE 12.9 MoViSteg algorithm: motion-based video steganalysis. This figure shows a video steganalysis methodology. The architecture has a signal processing stage and a classification stage.

42 video samples in AVI format, each with a duration of 10 seconds. The message embedding was performed with a spread-spectrum steganography tool from Moscow State University; the secret information can be extracted with an embedding key. Grouping techniques such as k-means and EM (expectation maximization) produced low results. This work showed that the performance obtained with SVM, neural networks, k-nearest neighbors, and random forests is similar; however, SVM produces the best results, with accuracy values around 99% [112].
• Cao, Zhao, and Feng algorithm: In 2012, Cao et al. [113] proposed an improvement to video steganalysis. This method attacks MV-based steganography. The system is designed for videos compressed in MPEG, and the algorithm works with MVs due to compression limitations. Schemes for MVs operate as targeted attacks, so the system will only work on a specific MV scheme and can thus fail against an advanced steganographic system. The MPEG-4 video codec Xvid tool was used to create the database. This algorithm confronted the steganographic algorithms of Aly [114], Xu [115], and Fang and Chang [116], and their own method [117] (Tar1, Tar2, Tar3, Tar4). The database contained 22 video sequences in CIF resolution, each one with 75 nonoverlapping frames; therefore the total number of subsequences increased to 111. This method sought to remove arbitrariness in the modification of MVs. The approach was designed based on calibration and features obtained from the reversion of MVs: the authors demonstrated that perturbation in regular motion estimation causes the reversal of MVs during recompression. The method was planned to improve the adaptability of the proposed features and can effectively counter some MV-based steganography.
• Wang, Zhao, and Hongxia algorithm: In 2014, Wang et al. [118] presented a method of video steganalysis based on motion vectors, by adding or subtracting a value from them. The experiments were conducted on cover videos processed with different steganography methods and encoded by motion estimation methods at various bit rates. They managed to improve the issues with MVs derived from the technique proposed in [113]. The MVs of cover videos are less optimal at the local level, according to the values of the sum of absolute differences (SAD). The slight influence that modified MVs impose on the SAD allows extracting features by calculating the optimal values of the SAD in a localized region of MVs. The add-or-subtract-one (AoSO) operation on MVs serves to analyze the influence resulting from steganography and, subsequently, to extract the new AoSO feature. The databases have two sets in YUV format with a frame rate of 30 fps. The first set consists of 44 CIF (352 × 288) video sequences, of which the first 60 frames are used for experimenting. The other database has 1157 videos of CIF size at 30 fps, and the first 36 frames were used in the experiments.
• Zarmehi and Ali algorithm: Zarmehi and Ali [119] developed a digital video steganalysis algorithm to attack spread-spectrum (SS) data hiding. Fig. 12.10 shows the diagram of the algorithm.

FIGURE 12.10 Zarmehi and Ali algorithm. The figure shows a block diagram of the steganalysis framework. The algorithm detects whether the video is cover or stego. It also estimates the gain factor, the hidden message, and the original frame.

The method estimates both the hidden message and the gain factor of the embedding rules for SS. The cover frames are estimated and compared to the received video frames, and the algorithm calculates the residual matrix. Features are extracted from this matrix, as well as from the video frames and the estimated frames. The features are then fed to an SVM, which determines whether a video is cover or stego. If the video is cover, then the process ends. However, if the video is suspicious, then the gain factors of the embedding process and the hidden message are estimated. Finally, the original video is reconstructed. This method proved to be accurate in experiments with different versions of SS data-hiding schemes [119] (a minimal sketch of this kind of residual-based pipeline is given after this list).
• Wang, Cao, and Zhao algorithm: Wang et al. [120] developed an innovative method of steganalysis. They noted that current steganalytic techniques did not take advantage of content diversity since they only extracted features from sections of fixed length. The method attacks adaptive MV steganography, in particular that of Cao [121], Yao [122], and Wang [123], in videos with low bit rate and low embedding ratio. The database consisted of 100 YUV sequences in CIF format, each at 30 fps and 150 to 300 frames long. The database was processed in the H.264/AVC standard using the x264 tool. This algorithm divided the video into subsequences, allowing one to extract features with a similar intensity of movement. An independent classifier received these features. Finally, the classifiers were integrated to decide whether the video is cover or stego.


• Sadat, Faez, and Saffari algorithm: In 2018, Sadat et al. [124] presented a work on steganalysis based on MV. This algorithm used entropy, combined with features of the motion vector. The authors extracted intrinsic and statistical features obtained by local optimization of the cost function. The video determined the texture and precision in the MV. They used the H.264/AVC standard due to its popularity and widespread use in steganography [125]. Furthermore, the authors employed 284 uncompressed video sequences with CIF resolution (352 × 288) and color in YUV. They attacked three methods of steganography, including those of Aly [114] (accuracy 79.45%), Cao [121] (accuracy 71.05%), and Xuansen [126] (accuracy 72.74%).
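In the spirit of the residual-based pipeline described above, the following minimal sketch estimates the cover frame with a simple denoiser, collects statistics of the residual, and trains an SVM; the denoiser, statistics, and data are illustrative placeholders rather than the authors' exact choices.

```python
# A minimal sketch of a calibrated, residual-based video steganalysis pipeline:
# denoise each frame to estimate the cover, compute the residual matrix, and
# collect simple statistics as features for an SVM.
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.svm import SVC

def frame_residual_features(frame: np.ndarray) -> np.ndarray:
    estimate = gaussian_filter(frame.astype(np.float32), sigma=1.0)  # cover estimate
    residual = frame - estimate
    return np.array([residual.mean(), residual.std(),
                     np.abs(residual).mean(), residual.max() - residual.min()])

def video_features(frames) -> np.ndarray:
    """Average the per-frame residual statistics over a video."""
    return np.mean([frame_residual_features(f) for f in frames], axis=0)

# Placeholder training data: one feature vector per video, label 0=cover, 1=stego.
X = np.random.rand(40, 4)
y = np.random.randint(0, 2, size=40)
clf = SVC(kernel="rbf").fit(X, y)
```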

12.4.3 Recent method

Li, Meng, Xu, Yunqing, and Yuanchang [127] proposed a method of steganalysis based on PU partition modes for videos in the HEVC standard. Generally, researchers have worked with the H.264/AVC standard to hide information in videos. Currently, the latest video coding standard, namely high-efficiency video coding (HEVC) [128] (presented by the Video Coding Experts Group (VCEG) and the Moving Picture Experts Group (MPEG)), has been used to send hidden information in HD and Ultra HD videos.

• General context and definitions: In the HEVC standard a video has groups of pictures (GOPs). Each picture can be subdivided into several square coding tree units (CTUs) of the same size. CTUs are divided into smaller coding units (CUs), and each CU has a transform unit (TU) and a prediction unit (PU). There are 25 possible PU partition modes in P pictures (PM-25D), and the probability of each of them is denoted PoPUPM-25D. The algorithm uses PU partition modes [129] that can be optimized to reduce the features from 25 to 3 dimensions (PoPUPM-3D).
• Method: This method was developed by Li, Meng, Xu, Yunqing, and Yuanchang [127]. The architecture, known as HEVC video steganalysis, is shown in Fig. 12.11.

FIGURE 12.11 HEVC video steganalysis algorithm. This architecture shows the stage of feature extraction for training and testing (generating PoPUPM probabilities), as well as the classification stage using SVM.


The method involves:
• Feature extraction using manual techniques: the statistical distribution of PU partition modes in P pictures.
• Classification using ML techniques: an SVM to discriminate between cover and stego videos.
The algorithm uses the probability of each PU partition mode in P pictures (PoPUPM). The architecture is described below.
• Data and preprocessing: The steganographic algorithm [130] of Xie, Li, Zhang, and Yang generates the stego videos. This algorithm hides information for the HEVC standard based on differences in the intraprediction and coding modes of the sample sequences, with the cover videos encoded with HM 16.15 [131]. The video sets were organized as follows: 1) thirty-three videos with resolution 1280 × 720, each sequence split into parts of 80 frames each; 2) thirty videos with resolution 1920 × 1080, each sequence divided into ten parts of 50 frames each.
• Feature extraction: The P-pictures of the cover and stego videos produce all PU partition modes. The probability distribution is obtained and chosen as a feature according to the equation
$$
P_i = \frac{N_i}{\sum_{i=1}^{25} N_i},
\tag{12.6}
$$

where the range [1, 25] of i corresponds to the 25 PU partition modes extracted from the P pictures, and Ni represents the total count of the ith PU partition mode in a video sequence. The PU partition modes (8×4, 4×8, 8×16) may be used as a three-dimensional feature (PoPUPM-3D). A minimal sketch of this feature and the classification stage is given after this list.
• Classification stage: The features extracted from the cover and stego videos (PoPUPM) are used to train the SVM. The features extracted from the test videos are sent to the SVM classifier to validate the model. The kernel function of the SVM classifier is polynomial. Five out of six (5/6) parts of the video sequences are used for training the SVM, and 1/6 are randomly chosen for testing. The optimal range and cost for the kernel are calculated based on the validation function; the process is repeated 20 times, and the accuracy is averaged.
• Results: The experiments were carried out on video sequences with resolutions of 720p and 1080p and bit rates of 4M, 8M, and 12M for 720p and 10M, 30M, and 50M for 1080p. The 25-dimensional features were optimized to reduce them to three dimensions. Accuracy values higher than 96% were obtained when the bit rate remained fixed, whereas accuracy values greater than 93% were achieved when videos with different bit rates were pooled. The algorithm of Sheng, Wang, and Huang [132] only managed to obtain a maximum of 55.9% accuracy on the 1080p video set with a 10M bit rate. The steganographic algorithm is therefore detectable with this method. Li, Meng, Xu, Yunqing, and Yuanchang plan to test their algorithm on new emerging steganography methods.
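A minimal sketch of the PoPUPM feature of Eq. (12.6) and the polynomial-kernel SVM stage is given below; obtaining the PU partition-mode counts from an HEVC bitstream requires a decoder (e.g., the HM reference software) and is assumed to be done elsewhere, and the training data here are random placeholders.

```python
# A minimal sketch of the PoPUPM feature (Eq. (12.6)) and SVM classification.
import numpy as np
from sklearn.svm import SVC

def popupm(mode_counts: np.ndarray, reduced_modes=None) -> np.ndarray:
    """mode_counts: length-25 vector of PU partition-mode counts for one video."""
    probs = mode_counts / mode_counts.sum()          # Eq. (12.6)
    return probs if reduced_modes is None else probs[list(reduced_modes)]

# Placeholder training data: one PoPUPM vector per video, label 0=cover, 1=stego.
X = np.random.rand(60, 25)
X = X / X.sum(axis=1, keepdims=True)
y = np.random.randint(0, 2, size=60)

clf = SVC(kernel="poly", degree=3).fit(X, y)
```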


12.4.4 Summary and perspectives

Video steganalysis works for videos under the H.264/AVC [129] standard and, recently, the HEVC [128] standard. It attacks steganographic algorithms based on motion vectors and on inter- and intraprediction modes. Nowadays, ML techniques and, in particular, DL play a vital role in handling large amounts of data [133]. The vast majority of steganalysis algorithms focus only on images; therefore it is crucial to develop more methods for video steganalysis [127]. Modern DL techniques will help video steganalysis. Also, researchers currently generate their own videos; consequently, fixed databases are necessary to provide investigators with a research baseline. Finally, a more generalized methodology must be designed for videos, regardless of the type of steganography, format, or compression, to provide a more general application in media.

12.5 Text steganalysis

Due to the rise of digital technologies and media, the art of hiding messages in digital text, also called text steganography, has gained importance since it can be applied in security, communication, and copyright protection [134]. For this reason, detecting whether a text file contains a secret message has drawn the attention of many researchers, and the techniques used to achieve this aim are called text steganalysis [134,135].
Text steganography techniques are divided into three groups of methods: format-based or structural, random and statistical, and linguistic. The last group is more studied than the other two because it uses natural language to hide the message. This is done by changing properties of the text while preserving its meaning as much as possible, making this method the most difficult to recognize. Additional information about these techniques can be found in [134,135].
Text steganalysis is defined as the detection of hidden messages in a text. This is possible considering that when a text is modified, its statistical properties are also modified [134,136]. However, this is a difficult task compared to other types of digital media (images, audio, or video) [135] due to the wide variety of text characteristics that can be changed, such as spaces, synonyms, or even embedded symbols. The techniques used in text steganalysis can be divided into three types: visual, structural, and statistical [134].
• Visual: These are related to the human factor, that is, when a person can visually detect something unusual in a text, such as the words used and their coherence. The hidden message can be detected by making some changes [134].
• Structural: This method involves changing the format or layout of a text to find something unusual. It also consists in changing the encoding (ASCII, UTF-8, UTF-9) to detect the hidden message [134].
• Statistical: When the hidden message cannot be detected by visual or structural techniques, it is convenient to use statistical methods. These involve calculating the number of possible solutions to find the secret message using the following equation:
$$
NP = k \times 2^{NS}, \qquad SM = C_1, C_2, \ldots, C_{NS},
\tag{12.7}
$$


where NP is the number of possible solutions, NS is the length of the secret message, and SM is the secret message. It is also possible to estimate the number of hidden symbols and the probability of guessing the correct secret message using
$$
P(NH, NC) = \binom{NC}{NH}, \quad NH \le NC,
\tag{12.8}
$$
$$
P(SM) = \frac{1}{NP} \times \frac{1}{P(NH, NC)},
\tag{12.9}
$$

where SM is the secret message, NH is the number of hidden symbols, and NC is the number of characters in the complete text. If P(SM) is equal to zero, then the message is encrypted with a secret key [134].
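A minimal sketch of Eqs. (12.7)–(12.9) in Python follows; the constant k is kept as a parameter, and the example numbers are arbitrary.

```python
# A minimal sketch of Eqs. (12.7)-(12.9): the number of possible solutions for a
# hidden message of NS symbols, the number of ways NH hidden symbols can be
# placed among NC characters, and the probability of guessing the secret message.
from math import comb

def num_possible_solutions(ns: int, k: int = 1) -> int:
    return k * 2 ** ns                     # Eq. (12.7)

def placement_count(nh: int, nc: int) -> int:
    assert nh <= nc
    return comb(nc, nh)                    # Eq. (12.8)

def guess_probability(ns: int, nh: int, nc: int, k: int = 1) -> float:
    return 1.0 / (num_possible_solutions(ns, k) * placement_count(nh, nc))  # Eq. (12.9)

print(guess_probability(ns=8, nh=8, nc=200))
```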

12.5.1 Methods

Recently, many methods of text steganalysis have been proposed, mainly based on statistical methods and ML. However, most statistical methods are used as features for ML algorithms such as SVM. Moreover, a relatively new field based on DL algorithms has shown promising results.

12.5.1.1 Statistical algorithms

Linguistic steganography algorithms are the most studied, and many steganalysis methods have been developed to detect messages hidden by linguistic algorithms. Such methods aim to classify every target text as cover or stego. For example, recent studies analyze the synonym substitution (SS) method, in which some words are replaced by their synonyms, thereby keeping the meaning of the text almost unchanged. However, as some studies propose, these changes affect the statistical and semantic properties of the text [137–141]. Some models employ features such as synonym frequency and semantic relations, using classifier models like word2text [137,142]. With this classifier, a vector of synonyms (synset) and each synonym in the text are analyzed. If the synonym in the synset and the synonym in the text are not the same, this is called a mismatch. The number of mismatches is used to detect a stego-text, and this process is termed WEF [137]. Other approaches using synsets can be found in [138], in which synonym frequency was applied. Also, [140] used a similar approach based on SS but focused on improving the poor accuracy in detecting semantic distortion. Thus, by calculating the word correlation and then removing high-frequency words (which carry little useful information), the authors computed the context fitness of each synonym to extract the features.
Finally, statistical analysis can also address steganography algorithms based on translation by calculating statistical features [141]. Since word frequency is important when translating to another language, normal texts show more high-frequency words than stego-texts. Also, one-to-one words (words that have only one translation in the target language) are generated using different translators, and words repeated in all translations are considered one-to-one words, to expand the word frequency difference. Finally, a 12-feature vector is made by taking six features from the frequency differences of single words (1-grams) and six features from the frequency differences between two adjacent words (2-grams) [141]. Fig. 12.12 depicts this method.

FIGURE 12.12 Method to address translation-based steganography. By extracting features from 1-gram and 2-gram twelve frequency features are obtained for analysis or use in ML methods such as SVM.

There are also algorithms that use only statistical analysis to find stego-texts. For example, the method proposed in [139] is used to detect stego-texts created by synonym run-length encoding. First, an intermediate sequence is obtained from the message, and the frequency of a synonym in the English language is analyzed, considering that the probabilistic distributions of odd encoding runs (sequences of bits) of synonyms change when a message is embedded. Then the intermediate sequence is split into elements, and probabilistic distributions are obtained. If the obtained distribution is closer to the distribution of a text without a hidden message, then the text is also empty; otherwise, it is a stego-text [139]. More information about these techniques can be consulted in [137,138,140,141].
Regarding ML algorithms, SVM is the most used due to its high performance. In this algorithm, features are an essential factor. One approach targets messages hidden in the spaces of a text; the proposed method shows an accuracy of 99% in five different combinations of text type (Word, PDF), layout (double, single), and font (Calibri, Times New Roman, Arial). To extract the features, each line of the text is divided into two groups with the same number of words. Then the sum of the inter-word distances of each group, known as the group distance (GP), is calculated. Furthermore, the N-window variance group distance (WVGP) is also considered a possible feature, estimated through the variance and relying on the fact that the variance between text lines must be similar. Histograms have shown that GP is more sensitive to hidden information than WVGP as a feature. Also, 263 lines were taken from two papers; the data was divided into training and test sets, GP was taken as the feature, and an SVM classifier was tested [143].
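A minimal sketch of the GP feature and the SVM stage is shown below; how the inter-word distances are measured from the rendered document is assumed to be handled by a separate extraction step, and the data here are random placeholders.

```python
# A minimal sketch of the group-distance (GP) feature: the inter-word distances
# of each text line are split into two equal halves and summed, and the
# per-line features train an SVM.
import numpy as np
from sklearn.svm import SVC

def group_distance(interword_distances: np.ndarray) -> np.ndarray:
    half = len(interword_distances) // 2
    return np.array([interword_distances[:half].sum(),
                     interword_distances[half:].sum()])

# Placeholder data: one array of inter-word distances per line, label 0=cover, 1=stego.
lines = [np.random.rand(12) for _ in range(100)]
X = np.vstack([group_distance(d) for d in lines])
y = np.random.randint(0, 2, size=100)

clf = SVC(kernel="rbf").fit(X, y)
```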

12.5.1.2 Modern text steganalysis

In text steganalysis, the use of DL algorithms is very recent and is mainly based on CNNs and RNNs [144–146].

• Convolutional Neural Networks: Wen et al. [144] proposed an approach using CNNs. The authors performed different processes for short and long texts using an architecture


FIGURE 12.13 LS-CNN architecture. From the embedding layer, semantic and syntactic features are extracted as input to a convolutional layer for the feature learning task, which is followed by a fully connected layer with a softmax activation function for multiclass classification.

called LS-CNN. For short texts, they created a dictionary with the words in the text, and the words were encoded as the indexes of their positions in the dictionary. As shown in Fig. 12.13, the proposed architecture for short texts consists of three parts: i) a word embedding layer to extract the semantic and syntactic features of each word, ii) convolutional layers fed with the word features extracted in the previous part to learn sentence features, and iii) a fully connected layer with a softmax activation function to perform the classification. For long texts, a dictionary with sentences is created, and each sentence is analyzed as a short text to create an array with the result for each sentence. Finally, the decision is made by computing the ratio shown in Eq. (12.10); if the ratio is higher than a threshold t, then the text is considered a stego-text. The proposed architecture achieves high accuracy for short and long texts, outperforming the models described in the state of the art.
$$
R = \frac{\operatorname{Count}(stego)}{\operatorname{Count}(cover)}.
\tag{12.10}
$$
Yang et al. [145] proposed a method very similar to [144]. The developed architecture was called TS-CNN; however, it can be divided into TS-CNN (simple) and TS-CNN (multi). In TS-CNN (simple) the text is processed as a single sentence, whereas in TS-CNN (multi) the text is separated and processed as in the previous method [144]. The results showed that TS-CNN (simple) is more effective for long texts, whereas TS-CNN (multi) is more effective for short texts.
• Recurrent Neural Networks: The most recent approach using RNNs was proposed by Yang et al. [146]. This publication proposed an architecture, called TS-RNN, to extract features and classify the message (stego–cover). First, the text is modeled as a signal, since an embedded message can be seen as additive noise that affects the probabilistic distribution of the text. Next, an RNN with LSTM units is proposed to extract the probabilistic features of each word. To this end, a bidirectional RNN is introduced (i.e., forward and backward RNNs). The forward section of the network extracts the correlation between the present word and the previous one, and the backward section extracts the correlation between the present word and the next one. The features obtained are fused and, finally, passed to a final layer with a softmax activation function. Fig. 12.14 shows the proposed architecture, which achieves close to 100% accuracy (a minimal sketch in this spirit is given after Fig. 12.14).


FIGURE 12.14 TS-RNN architecture. The recurrent part of this architecture has a backward (bottom) and a forward (top) direction, two hidden layers, and 100 LSTM units.
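The following is a minimal tf.keras sketch in the spirit of the TS-RNN description above: embedded word indices pass through two bidirectional LSTM layers of 100 units and a softmax output. The vocabulary size, sequence length, and embedding dimension are illustrative assumptions.

```python
# A minimal tf.keras sketch of a TS-RNN-style bidirectional LSTM text classifier.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, seq_len, embed_dim = 20_000, 50, 128

inputs = tf.keras.Input(shape=(seq_len,), dtype="int32")
x = layers.Embedding(vocab_size, embed_dim)(inputs)
x = layers.Bidirectional(layers.LSTM(100, return_sequences=True))(x)  # hidden layer 1
x = layers.Bidirectional(layers.LSTM(100))(x)                         # hidden layer 2
outputs = layers.Dense(2, activation="softmax")(x)                    # cover / stego

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```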

12.5.2 Summary and perspectives

Text steganalysis algorithms are mainly developed to detect stego-texts created with linguistic steganography techniques, which produce apparently almost unchanged texts [134]. The main way to address the detection of stego-texts is using statistical approaches such as synonym frequency and semantic correlation. The data analyzed with statistical methods can be used as features to feed ML classifiers such as SVM. DL approaches have also been implemented, which mostly use convolutional and recurrent neural networks to extract the text features. The methods discussed are promising across statistical algorithms, ML, and DL. However, a defined baseline is lacking, and the resources currently available for text steganalysis are minimal.

12.6 Conclusion

In this chapter, we have explained the vital elements of steganalysis applied to digital media so that the reader gains a comprehensive and detailed vision of this field. The scientific community has focused its efforts in the last 10 years on applying steganography and steganalysis to digital media, generating sophisticated and challenging solutions such as those presented in this chapter and the supporting bibliography. The trend in steganalysis of digital media is to use neural networks that allow the entire process of feature extraction and classification to run automatically, avoiding the need for complex filters manually created by experts. The results of these networks have exceeded the classification accuracy percentages of traditional methods on steganographic digital media. As we continue to advance in artificial intelligence and DL, the results will be even better. If the reader wishes to work on this subject, we recommend addressing the following problems: steganalysis on real-world images, known as the cover-source mismatch effect; automatic steganography using generative adversarial networks (GANs); and database enrichment (data augmentation) to obtain a set of images large enough to ensure better learning of the network. Finally, we encourage studying transfer learning and dense-residual networks as tools to further consolidate steganalysis on digital media.

References

[1] Ballesté Antoni Martínez, Navarro Naranjo, Victor Adolfo, Estenografía en contenido multimedia, UOC, 2007, pp. 1–48.
[2] Crypto Law Survey - Page 2, http://www.cryptolaw.org.
[3] Gustavus J. Simmons, The Prisoner's problem and the subliminal channel, in: David Chaum (Ed.), Advances in Cryptology: Proceedings of Crypto 83, Springer US, Boston, MA, 1984, pp. 51–67.
[4] Hectro Fabio Villada Estrada, Juan Camilo Jaramillo Perez, Aplicaciones de la esteganografía en la seguridad informática, 2015.
[5] Jessica Fridrich, Steganography in Digital Media: Principles, Algorithms, and Applications, Cambridge University Press, Cambridge, 2009.
[6] Jessica Fridrich, Miroslav Goljan, Rui Du, Detecting LSB steganography in color, and gray-scale images, IEEE Multimedia 8 (4) (2001) 22–28.
[7] Tomas Pevny, Tomas Filler, Patrick Bas, Using high-dimensional image models to perform highly undetectable steganography, in: Rainer Bohme, Philip W.L. Fong, Reihaneh Safavi-Naini (Eds.), Information Hiding, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. 161–177.
[8] Bin Li, Ming Wang, Jiwu Huang, Xiaolong Li, A new cost function for spatial image steganography, in: IEEE International Conference on Image Processing, ICIP, 2014, pp. 4206–4210.
[9] Vahid Sedighi, Rémi Cogranne, Jessica Fridrich, Content-adaptive steganography by minimizing statistical detectability, IEEE Transactions on Information Forensics and Security 11 (2) (2016) 221–234.
[10] Vojtěch Holub, Jessica Fridrich, Tomáš Denemark, Universal distortion function for steganography in an arbitrary domain, EURASIP Journal on Information Security 2014 (1) (2014) 1.
[11] V. Holub, J. Fridrich, Designing steganographic distortion using directional filters, in: IEEE International Workshop on Information Forensics and Security, WIFS, December 2012, pp. 234–239.
[12] Yambem Jina Chanu, Kh. Manglem Singh, Themrichon Tuithung, Image steganography and steganalysis: a survey, International Journal of Computer Applications 52 (2) (2012) 1–11.
[13] Andreas Westfeld, F5—a steganographic algorithm, in: Ira S. Moskowitz (Ed.), Information Hiding, Springer Berlin Heidelberg, Berlin, Heidelberg, 2001, pp. 289–302.
[14] Linjie Guo, Jiangqun Ni, Yun-Qing Q. Shi, Uniform embedding for efficient JPEG steganography, IEEE Transactions on Information Forensics and Security 9 (5) (2014) 814–825.
[15] Linjie Guo, Jiangqun Ni, Wenkang Su, Chengpei Tang, Yun-Qing Shi, Using statistical image model for JPEG steganography: uniform embedding revisited, IEEE Transactions on Information Forensics and Security 10 (12) (2015) 2669–2680.
[16] T. Reinel, R. Raúl, I. Gustavo, Deep learning applied to steganalysis of digital images: a systematic review, IEEE Access 7 (2019) 68970–68990.
[17] Neil F. Johnson, Sushil Jajodia, Steganalysis of images created using current steganography software, in: International Workshop on Information Hiding, Springer, 1998, pp. 273–289.
[18] Rajarathnam Chandramouli, Grace Li, Nasir D. Memon, Adaptive steganography, in: Security and Watermarking of Multimedia Contents IV, vol. 4675, International Society for Optics and Photonics, 2002, pp. 69–78.
[19] Neil F. Johnson, Sushil Jajodia, Exploring steganography: seeing the unseen, Computer 31 (2) (1998) 26–34.
[20] Neil F. Johnson, Stefan Katzenbeisser, A survey of steganographic techniques, in: Information Hiding, Artech House, Norwood, Mass., 2000, pp. 43–78.
[21] Li Zhi, Sui Ai Fen, Yang Yi Xian, A LSB steganography detection algorithm, in: 14th IEEE Proceedings on Personal, Indoor and Mobile Radio Communications, 2003, PIMRC 2003, vol. 3, IEEE, 2003, pp. 2780–2783.
[22] Jessica Fridrich, Miroslav Goljan, David Soukal, Higher-order statistical steganalysis of palette images, in: Security and Watermarking of Multimedia Contents V, vol. 5020, International Society for Optics and Photonics, 2003, pp. 178–190.
[23] Ismail Avcibaş, Mehdi Kharrazi, Nasir Memon, Bülent Sankur, Image steganalysis with binary similarity measures, EURASIP Journal on Advances in Signal Processing 2005 (2005) 2749–2757.
[24] Siwei Lyu, Hany Farid, Detecting hidden messages using higher-order statistics and support vector machines, in: International Workshop on Information Hiding, Springer, 2002, pp. 340–354.
[25] Andrew D. Ker, Steganalysis of LSB matching in grayscale images, IEEE Signal Processing Letters 12 (6) (2005) 441–444.
[26] Qingzhong Liu, Andrew H. Sung, Jianyun Xu, Bernardete M. Ribeiro, Image complexity and feature extraction for steganalysis of LSB matching steganography, in: 18th International Conference on Pattern Recognition, ICPR'06, vol. 2, IEEE, 2006, pp. 267–270.
[27] Charles G. Boncelet Jr., Lisa M. Marvel, Charles T. Retter, Spread spectrum image steganography, 2003, patent 6,557,103.
[28] Jeremiah Joseph Harmsen, William A. Pearlman, Steganalysis of additive-noise modelable information hiding, in: Security and Watermarking of Multimedia Contents V, vol. 5020, International Society for Optics and Photonics, 2003, pp. 131–142.
[29] Richard O. Duda, Peter E. Hart, David G. Stork, Pattern Classification, chapter 10, 2001.
[30] Rongrong Ji, Hongxun Yao, Shaohui Liu, Liang Wang, Jianchao Sun, A new steganalysis method for adaptive spread spectrum steganography, in: 2006 International Conference on Intelligent Information Hiding and Multimedia, IEEE, 2006, pp. 365–368.
[31] Kenneth Sullivan, Upamanyu Madhow, Shivkumar Chandrasekaran, Bangalore S. Manjunath, Steganalysis of spread spectrum data hiding exploiting cover memory, in: Security, Steganography, and Watermarking of Multimedia Contents VII, vol. 5681, International Society for Optics and Photonics, 2005, pp. 38–46.
[32] Joachims Thorsten, Making large-scale SVM learning practical. Advances in kernel methods – support vector learning, http://svmlight.joachims.org/, 1999.
[33] Brian Chen, G. Wornell, A class of provably good methods for digital watermarking and information embedding, IEEE Transactions on Information Theory 47 (4) (2001) 291–314.
[34] Shaohui Liu, Hongxun Yao, Wen Gao, Steganalysis of data hiding techniques in wavelet domain, in: International Conference on Information Technology: Coding and Computing, Proceedings, vol. 1, ITCC 2004, IEEE, 2004, pp. 751–754.
[35] Liu Shaohui, Yao Hongxun, Gao Wen, Neural network based steganalysis in still images, in: 2003 International Conference on Multimedia and Expo, ICME'03, Proceedings (Cat. No. 03TH8698), vol. 2, IEEE, 2003, pp. 509–512.
[36] Lisa M. Marvel, Charles G. Boncelet, Charles T. Retter, Spread spectrum image steganography, IEEE Transactions on Image Processing 8 (8) (1999) 1075–1083.
[37] Ming Jiang, N. Menion, Edward Wong, Xiaolin Wu, Quantitative steganalysis of binary images, in: International Conference on Image Processing, 2004, ICIP'04, vol. 1, IEEE, 2004, pp. 29–32.
[38] Yun Q. Shi, Patchara Sutthiwan, Licong Chen, Textural features for steganalysis, in: International Workshop on Information Hiding, Springer, 2012, pp. 63–77.
[39] Licong Chen, Yun-Qing Shi, Patchara Sutthiwan, Variable multi-dimensional co-occurrence for steganalysis, in: International Workshop on Digital Watermarking, Springer, 2014, pp. 559–573.
[35] Liu Shaohui, Yao Hongxun, Gao Wen, Neural network based steganalysis in still images, in: 2003 International Conference on Multimedia and Expo. ICME’03, Proceedings (Cat. No. 03TH8698), vol. 2, IEEE, 2003, pp. 509–512. [36] Lisa M. Marvel, Charles G. Boncelet, Charles T. Retter, Spread spectrum image steganography, IEEE Transactions on Image Processing 8 (8) (1999) 1075–1083. [37] Ming Jiang, N. Menion, Edward Wong, Xiaolin Wu, Quantitative steganalysis of binary images, in: International Conference on Image Processing, 2004, ICIP’04, vol. 1, IEEE, 2004, pp. 29–32. [38] Yun Q. Shi, Patchara Sutthiwan, Licong Chen, Textural features for steganalysis, in: International Workshop on Information Hiding, Springer, 2012, pp. 63–77. [39] Licong Chen, Yun-Qing Shi, Patchara Sutthiwan, Variable multi-dimensional co-occurrence for steganalysis, in: International Workshop on Digital Watermarking, Springer, 2014, pp. 559–573.

288

Digital Media Steganography

[40] Yinlong Qian, Jing Dong, Wei Wang, Tieniu Tan, Deep learning for steganalysis via convolutional neural networks, in: IS&T International Symposium on Electronic Imaging, EI 2015, vol. 9409, 2015, 94090J. [41] Guanshuo Xu, Han-Zhou Wu, Yun-Qing Shi, Structural design of convolutional neural networks for steganalysis, IEEE Signal Processing Letters 23 (5) (2016) 708–712. [42] Jian Ye, Jiangqun Ni, Yang Yi, Deep learning hierarchical representations for image steganalysis, IEEE Transactions on Information Forensics and Security 12 (11) (2017) 2545–2557. [43] Mehdi Yedroudj, Frédéric Comby, Marc Chaumont, Yedrouj-Net: an efficient {CNN} for spatial steganalysis, in: International Conference on Acoustics, Speech, and Signal Processing, (April) 2018, abs/1803.0. [44] M. Boroumand, M. Chen, J. Fridrich, Deep residual network for steganalysis of digital images, IEEE Transactions on Information Forensics and Security 14 (5) (2019) 1181–1193. [45] Ru Zhang, Feng Zhu, Jianyi Liu, Gongshen Liu, Efficient feature learning and multi-size image steganalysis based on CNN, CoRR, arXiv:1807.11428v1, (November) 2018. [46] Lionel Pibre, Pasquet Jerome, Dino Ienco, Marc Chaumont, Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch, in: Media Watermarking, Security, and Forensics 2016, San Francisco, California, USA, February 14–18, 2016. [47] Guanshuo Xu, Deep convolutional neural network to detect J-UNIWARD, in: Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, 2017, pp. 67–73. [48] Jianhua Yang, Xiangui Kang, Edward K. Wong, Yun Qing Shi, JPEG steganalysis with combined dense connected CNNs and SCA-GFR, Multimedia Tools and Applications 78 (2019) 8481–8495. [49] X. Huang, S. Wang, T. Sun, G. Liu, X. Lin, Steganalysis of adaptive JPEG steganography based on ResDet, in: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC, 2018, pp. 549–553. [50] Jianhua Yang, Yun-Qing Shi, Edward K. Wong, Xiangui Kang, JPEG steganalysis based on DenseNet, CoRR, arXiv:1711.09335, 2017. [51] Songtao Wu, Shenghua Zhong, Yan Liu, Deep residual learning for image steganalysis, Multimedia Tools and Applications 77 (9) (2018) 10437–10453. [52] Songtao Wu, Sheng Hua Zhong, Yan Liu, Steganalysis via deep residual network, in: Proceedings of the International Conference on Parallel and Distributed Systems, ICPADS, 2017, pp. 1233–1236. [53] Guanshuo Xu, Han-Zhou Wu, Yun Q. Shi, Ensemble of CNNs for steganalysis: an empirical study, in: Proceedings of the 4th ACM Workshop on Information Hiding and Multimedia Security, 2016, pp. 103–107. [54] Mo Chen, Vahid Sedighi, Mehdi Boroumand, Jessica Fridrich, JPEG-phase-aware convolutional neural network for steganalysis of JPEG images, in: Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, IHMMSec ’17, 2017, pp. 75–84. [55] Y. Qian, J. Dong, W. Wang, T. Tan, Learning and transferring representations for image steganalysis using convolutional neural network, in: 2016 IEEE International Conference on Image Processing, ICIP, 2016, pp. 2752–2756. [56] J. Zeng, S. Tan, B. Li, J. Huang, Large-scale JPEG image steganalysis using hybrid deeplearning framework, IEEE Transactions on Information Forensics and Security 13 (5) (2018) 1200–1214. [57] W. Tang, S. Tan, B. Li, J. 
Huang, Automatic steganographic distortion learning using a generative adversarial network, IEEE Signal Processing Letters 24 (10) (2017) 1547–1551. [58] Yiwei Zhang, Weiming Zhang, Kejiang Chen, Jiayang Liu, Yujia Liu, Nenghai Yu, Adversarial examples against deep neural network based steganalysis, in: Proceedings of the 6th ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec ’18, ACM, New York, NY, USA, 2018, pp. 67–72. [59] Jamie Hayes, George Danezis, Generating steganographic images via adversarial training, arXiv eprints, arXiv:1703.00371, 2017.

Chapter 12 • Digital media steganalysis

289

[60] Jianhua Yang, Kai Liu, Xiangui Kang, Edward K. Wong, Yun-Qing Shi, Spatial image steganography based on generative adversarial network, CoRR, arXiv:1804.07939, 2018. [61] Weixuan Tang, Bin Li, Shunquan Tan, Mauro Barni, Jiwu Huang, CNN based adversarial embedding with minimum alteration for image steganography, arXiv e-prints, arXiv:1803.09043, 2018. [62] Jiayang Liu, Weiming Zhang, Yiwei Zhang, Dongdong Hou, Yujia Liu, Hongyue Zha, Nenghai Yu, Detection based defense against adversarial examples from the steganalysis point of view, arXiv e-prints, arXiv:1806.09186, 2018. [63] Jiren Zhu, Russell Kaplan, Justin Johnson, Li Fei-Fei, HiDDeN: Hiding Data with Deep Networks, arXiv e-prints, Lecture Notes in Computer Science, vol. 11219, 2018. [64] D. Hu, L. Wang, W. Jiang, S. Zheng, B. Li, A novel image steganography method via deep convolutional generative adversarial networks, IEEE Access 6 (2018) 38303–38314. [65] Clement Fuji Tsang, Jessica J. Fridrich, Steganalyzing images of arbitrary size with CNNs, in: Media Watermarking, Security, and Forensics 2018, Burlingame, CA, USA, 28 January 2018 – 1 February 2018, 2018. [66] Mo Chen, Mehdi Boroumand, Jessica Fridrich, Deep learning regressors for quantitative steganalysis, vol. 2018(7), Society for Imaging Science and Technology, 2017, pp. 1–7. [67] Ahmad Zakaria, Marc Chaumont, Gérard Subsol, Quantitative and binary steganalysis in JPEG: a comparative study, in: 26th European Signal Processing Conference, EUSIPCO 2018, Roma, Italy, September 3–7, 2018, 2018, pp. 1422–1426. [68] Mehdi Yedroudj, Marc Chaumont, Frédéric Comby, How to augment a small learning set for improving the performances of a CNN-based steganalyzer?, Electronic Imaging 2018 (2018). [69] Bin Li, Weihang Wei, Anselmo Ferreira, Shunquan Tan, ReST-Net: diverse activation modules and parallel subnets-based CNN for spatial image steganalysis, IEEE Signal Processing Letters 25 (5) (2018) 650–654. [70] Xiaofeng Song, Fenlin Liu, Chunfang Yang, Xiangyang Luo, Yi Zhang, Steganalysis of adaptive JPEG steganography using 2D Gabor filters, in: Proceedings of the 3rd ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec ’15, ACM, New York, NY, USA, 2015, pp. 15–23. [71] J. Fridrich, J. Kodovsky, Rich models for steganalysis of digital images, IEEE Transactions on Information Forensics and Security 7 (3) (2012) 868–882. [72] Jishen Zeng, Shunquan Tan, Guangqing Liu, Bin Li, Jiwu Huang, WISERNet: wider separate-thenreunion network for steganalysis of color images, CoRR, arXiv:1803.04805, 2018, pp. 1–11. [73] Vinod Nair, Geoffrey E. Hinton, Rectified linear units improve restricted Boltzmann machines, in: Proceedings of the 27th International Conference on International Conference on Machine Learning, ICML’10, vol. 3, Omnipress, USA, 2010, pp. 807–814. [74] Karlik Bekir, A. Vehbi Olgac, Performance analysis of various activation functions in generalized MLP architectures of neural networks, International Journal of Artificial Intelligence and Expert Systems 1 (4) (2015) 111–122. [75] Y.Y. Lu, Z.L.O. Yang, L. Zheng, Y. Zhang, Importance of truncation activation in pre-processing for spatial and Jpeg image steganalysis, in: 2019 IEEE International Conference on Image Processing, ICIP, 2019, pp. 689–693. [76] Sergey Ioffe, Christian Szegedy, Batch normalization: accelerating deep network training by reducing internal covariate shift, in: Proceedings of the 32Nd International Conference on International Conference on Machine Learning, ICML’15, vol. 37, JMLR.org, 2015, pp. 
448–456. [77] Y-Lan Boureau, Jean Ponce, Yann LeCun, A theoretical analysis of feature pooling in visual recognition, in: Proceedings of the 27th International Conference on International Conference on Machine Learning, ICML’10, Omnipress, USA, 2010, pp. 111–118. [78] Marina Sokolova, Guy Lapalme, A systematic analysis of performance measures for classification tasks, Information Processing & Management 45 (4) (2009) 427–437. [79] Ru Zhang, Feng Zhu, Jianyi Liu, Gongshen Liu, Depth-wise separable convolutions and multi-level pooling for an efficient spatial CNN-based steganalysis, IEEE Transactions on Information Forensics and Security 15 ((September) 2019) 1138–1150. [80] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, in: D. Fleet, T. Pajdla, B. Schiele, T. Tuytelaars (Eds.), Computer Vision – ECCV 2014, ECCV 2014, in: Lecture Notes in Computer Science, vol. 8691, Springer, Cham, 2014.

290

Digital Media Steganography

[81] Xavier Glorot, Yoshua Bengio, Understanding the difficulty of training deep feedforward neural networks, in: Yee Whye Teh, Mike Titterington (Eds.), Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, in: Proceedings of Machine Learning Research, vol. 9, PMLR, Chia Laguna Resort, Sardinia, Italy, 2010, pp. 249–256. [82] Patrick Bas, Tomas Filler, Tomas Pevny, “Break our steganographic system”: the ins and outs of organizing BOSS, in: Information Hiding, 2011, Czech Republic, in: Lecture Notes in Computer Science, vol. 6958, 2011, pp. 59–70. [83] BOSS Web page, http://agents.fel.cvut.cz/boss/index.php?mode=VIEW&tmpl=materials. [84] BOWS-2 Web page, http://bows2.ec-lille.fr/index.php?mode=VIEW&tmpl=index1, 2007. [85] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet classification with deep convolutional neural networks, in: Proceedings of the 25th International Conference on Neural Information Processing Systems, Vol. 1, NIPS’12, Curran Associates Inc., USA, 2012, pp. 1097–1105. [86] Google Code Archive – long-term storage for Google Code Project Hosting, https://code.google.com/ archive/p/cuda-convnet/. [87] Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell, Caffe: convolutional architecture for fast feature embedding, in: Proceedings of the 22nd ACM International Conference on Multimedia, MM ’14, ACM, New York, NY, USA, 2014, pp. 675–678. [88] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek Gordon Murray, Benoit Steiner, Paul A. Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zhang, TensorFlow: a system for large-scale machine learning, CoRR, arXiv:1605.08695, 2016. [89] Binghamton University Downloads, http://dde.binghamton.edu/download/. [90] Jessica Fridrich Binghamton University, http://www.ws.binghamton.edu/fridrich/. [91] LIRMM Chaumont, http://www.lirmm.fr/~chaumont/. [92] Stefan Hetzl, Petra Mutzel, A graph–theoretic approach to steganography, in: Jana Dittmann, Stefan Katzenbeisser, Andreas Uhl (Eds.), Communications and Multimedia Security, Springer Berlin Heidelberg, 2005, pp. 119–128. [93] E. Swanson, C. Ganier, R. Holman, J. Rosser, Steganography in WAV files, https://www.clear.rice.edu/ elec301/Projects01/smokey_steg/group.html, 2002. [94] S. Shirali-Shahreza, M.T. Manzuri-Shalmani, High capacity error free wavelet domain speech steganography, in: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008, pp. 1729–1732. [95] H. Ghasemzadeh, M.H. Kayvanrad, Comprehensive review of audio steganalysis methods, IET Signal Processing 12 (6) (2018) 673–687. [96] Hamza Özer, Bülent Sankur, Nasir Memon, ˙Ismail Avcıba¸s, Detection of audio covert channels using statistical footprints of hidden messages, Digital Signal Processing 16 (4) (2006) 389–401. [97] R.R. Coifman, D.L. Donoho, Translation-Invariant De-Noising, Springer New York, New York, NY, 1995, pp. 125–150. [98] Christian Kraetzer, Jana Dittmann, Mel-cepstrum based steganalysis for VoIP steganography, in: Edward J. Delp III, Ping Wah Wong (Eds.), Security, Steganography, and Watermarking of Multimedia Contents IX, vol. 6505, International Society for Optics and Photonics, SPIE, 2007, pp. 54–65. [99] S. Rekik, S. Selouani, D. Guerchi, H. 
Hamam, An autoregressive time delay neural network for speech steganalysis, in: 2012 11th International Conference on Information Science, Signal Processing and their Applications, ISSPA, 2012, pp. 54–58. [100] Qingzhong Liu, Andrew H. Sung, Mengyu Qiao, Derivative-based audio steganalysis, ACM Transactions on Multimedia Computing Communications and Applications 7 (3) (2011) 18. [101] Fabien A.P. Petitcolas, MP3Stego, https://www.petitcolas.net/steganography/mp3stego/. [102] Y. Wang, L. Guo, Y. Wei, C. Wang, A steganography method for AAC audio based on escape sequences, in: 2010 International Conference on Multimedia Information Networking and Security, 2010, pp. 841–845.

Chapter 12 • Digital media steganalysis

291

[103] Chao Jin, Rangding Wang, Diqun Yan, Steganalysis of MP3Stego with low embedding-rate using Markov feature, Multimedia Tools and Applications 76 (5) (2017) 6143–6158. [104] Yanzhen Ren, Qiaochu Xiong, Lina Wang, A steganalysis scheme for AAC audio based on MDCT difference between intra and inter frame, in: Christian Kraetzer, Yun-Qing Shi, Jana Dittmann, Hyoung Joong Kim (Eds.), Digital Forensics and Watermarking, Springer International Publishing, Cham, 2017, pp. 217–231. [105] Catherine Paulin, Sid Ahmed Selouani, Éric Hervet, Audio steganalysis using deep belief networks, International Journal of Speech Technology 19 (3) (2016) 585–591. [106] Bolin Chen, Weiqi Luo, Haodong Li, Audio steganalysis with convolutional neural network, in: Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec ’17, ACM, New York, NY, USA, 2017, pp. 85–90. [107] Yuzhen Lin, Rangding Wang, Diqun Yan, Li Dong, Xueyuan Zhang, Audio steganalysis with improved convolutional neural network, in: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’19, ACM, New York, NY, USA, 2019, pp. 210–215. [108] Yanzhen Ren, Dengkai Liu, Qiaochu Xiong, Jianming Fu, Lina Wang, Spec-ResNet: a general audio steganalysis scheme based on deep residual network of spectrogram, CoRR, arXiv:1901.06838, 2019. [109] Zinan Lin, Yongfeng Huang, Jilong Wang, RNN-SM: fast steganalysis of VoIP streams using recurrent neural network, IEEE Transactions on Information Forensics and Security 13 (7) (2018) 1854–1868. [110] Yunxia Liu, Shuyang Liu, Yonghao Wang, Hongguo Zhao, Si Liu, Video steganography: a review, Neurocomputing 335 (2019) 238–250. [111] Julien S. Jainsky, Deepa Kundur, Don R. Halverson, Towards digital video steganalysis using asymptotic memoryless detection, in: MM and Sec’07 – Proceedings of the Multimedia and Security Workshop 2007, 2007, pp. 161–168. [112] K. Kancherla, S. Mukkamala, Video steganalysis using spatial and temporal redundancies, in: Proceedings of the 2009 International Conference on High Performance Computing and Simulation, HPCS 2009, 2009, pp. 200–207. [113] Yun Cao, Xianfeng Zhao, Dengguo Feng, Video steganalysis exploiting motion vector reversion-based features, IEEE Signal Processing Letters 19 (1) (2012) 35–38. [114] Hussein A. Aly, Data hiding in motion vectors of compressed video based on their associated prediction error, IEEE Transactions on Information Forensics and Security 6 (1) (2011) 14–18. [115] Changyong Xu, Xijian Ping, Tao Zhang, Steganography in compressed video stream, in: First International Conference on Innovative Computing, Information and Control 2006, ICICIC’06, 2006, pp. 269–272. [116] Ding Yu Fang, Long Wen Chang, Data hiding for digital video with phase of motion vector, in: Proceedings - IEEE International Symposium on Circuits and Systems, 2006, pp. 1422–1425. [117] Yun Cao, Xianfeng Zhao, Dengguo Feng, Rennong Sheng, Video Steganography with Perturbed Motion Estimation, Lecture Notes in Computer Science: Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, vol. 6958, 2011, pp. 193–207. [118] Keren Wang, Hong Zhao, Hongxia Wang, Video steganalysis against motion vector-based steganography by adding or subtracting one motion vector value, IEEE Transactions on Information Forensics and Security 9 (5) (2014) 741–751. [119] Nematollah Zarmehi, Mohammad Ali Akhaee, Digital video steganalysis toward spread spectrum data hiding, IET Image Processing 10 (1) (2016) 1–8. 
[120] Peipei Wang, Yun Cao, Xianfeng Zhao, Segmentation based video steganalysis to detect motion vector modification, Security and Communication Networks 2017 (2017). [121] Yun Cao, Hong Zhang, Xianfeng Zhao, Haibo Yu, Video steganography based on optimized motion estimation perturbation, in: IH and MMSec 2015 – Proceedings of the 2015 ACM Workshop on Information Hiding and Multimedia Security, 2015, pp. 25–31. [122] Yuanzhi Yao, Weiming Zhang, Nenghai Yu, Xianfeng Zhao, Defining embedding distortion for motion vector-based video steganography, Multimedia Tools and Applications 74 (24) (2015) 11163–11186.

292

Digital Media Steganography

[123] Peipei Wang, Hong Zhang, Yun Cao, Xianfeng Zhao, A novel embedding distortion for motion vectorbased steganography considering motion characteristic, local optimality and statistical distribution, in: IH and MMSec 2016 – Proceedings of the 2016 ACM Information Hiding and Multimedia Security Workshop, 2016, pp. 127–137. [124] Elaheh Sadat Sadat, Karim Faez, Mohsen Saffari Pour, Entropy-based video steganalysis of motion vectors, Entropy 20 (4) (2018) 1–14. [125] Yiqi Tew, Koksheik Wong, An overview of information hiding in H.264/AVC compressed video, IEEE Transactions on Circuits and Systems for Video Technology 24 (2) (2014) 305–319. [126] He Xuansen, Luo Zhun, A novel steganographic algorithm based on the motion vector phase, in: Proceedings - International Conference on Computer Science and Software Engineering, CSSE 2008, vol. 3, 2008, pp. 822–825. [127] Zhonghao Li, Laijing Meng, Shutong Xu, Zhaohong Li, Shi Yunqing, Yuanchang Liang, A HEVC video steganalysis algorithm based on PU partition modes, Computers, Materials & Continua 58 (2) (2019) 563–574. [128] Gary J. Sullivan, Jens Rainer Ohm, Woo Jin Han, Thomas Wiegand, Overview of the high efficiency video coding (HEVC) standard, IEEE Transactions on Circuits and Systems for Video Technology 22 (12) (2012) 1649–1668. [129] Thomas Wiegand, Gary J. Sullivan, Gisle Bjøntegaard, Ajay Luthra, Overview of the H.264/AVC video coding standard, IEEE Transactions on Circuits and Systems for Video Technology 13 (7) (2003) 560–576. [130] Yiyuan Yang, Zhaohong Li, Wenchao Xie, Zhenzhen Zhang, High capacity and multilevel information hiding algorithm based on PU partition modes for HEVC videos, Multimedia Tools and Applications 78 (7) (2018) 8423–8446. [131] Zoran M. Miliˇcevi´c, Jovan G. Mihajlovi´c, Zoran S. Bojkovi´c, Testing HEVC model HM-16 15 on objective and subjective way, International Journal of Computers and Communications 12 (2018). [132] M. Huang, Q. Sheng, R. Wang, A prediction mode steganalysis detection algorithm for HEVC, Journal of Optoelectronics Laser 28 (2017) 433–440. [133] Roberto Caldelli, Marc Chaumont, Chang Tsun Li, Irene Amerini, Special issue on deep learning in image and video forensics, Signal Processing: Image Communication 75 (2019) 199–200. [134] Milad Taleby Ahvanooey, Qianmu Li, Jun Hou, Ahmed Raza Rajput, Chen Yini, Modern text hiding, text steganalysis, and applications: a comparative analysis, Entropy 21 (4) (2019) 355. [135] Souvik Bhattacharyya, A survey of steganography and steganalysis technique in image, text, audio and video as cover carrier, International Journal of Global Research in Computer Science (JGRCS) 2 (2011). [136] S. Samanta, S. Dutta, G. Sanyal, A real time text steganalysis by using statistical method, in: 2016 IEEE International Conference on Engineering and Technology, ICETECH, 2016, pp. 264–268. [137] Xin Zuo, Huanhuan Hu, Weiming Zhang, Nenghai Yu, Text semantic steganalysis based on word embedding, in: Xingming Sun, Zhaoqing Pan, Elisa Bertino (Eds.), Cloud Computing and Security, Springer International Publishing, Cham, 2018, pp. 485–495. [138] Lingyun Xiang, Xingming Sun, Gang Luo, Bin Xia, Linguistic steganalysis using the features derived from synonym frequency, Multimedia Tools and Applications 71 (3) (2014) 1893–1911. [139] I.V. Nechta, New steganalysis method for text data produced by synonym run-length encoding, in: 2018 XIV International Scientific-Technical Conference on Actual Problems of Electronics Instrument Engineering, APEIE, 2018, pp. 188–190. [140] L. Xiang, J. 
Yu, C. Yang, D. Zeng, X. Shen, A word-embedding-based steganalysis method for linguistic steganography via synonym substitution, IEEE Access 6 (2018) 64131–64141. [141] Peng Meng, Liusheng Hang, Zhili Chen, Yuchong Hu, Wei Yang, STBS: a statistical algorithm for steganalysis of translation-based steganography, in: Rainer Böhme, Philip W.L. Fong, Reihaneh Safavi-Naini (Eds.), Information Hiding, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. 208–220. [142] Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean, Efficient estimation of word representations in vector space, arXiv:1301.3781, 2013. [143] Y. Yang, M. Lei, J. Wang, B. Liu, A SVM based text steganalysis algorithm for spacing coding, China Communications 11 (13) (2014) 108–113.

Chapter 12 • Digital media steganalysis

293

[144] J. Wen, X. Zhou, P. Zhong, Y. Xue, Convolutional neural network based text steganalysis, IEEE Signal Processing Letters 26 (3) (2019) 460–464. [145] Zhongliang Yang, Nan Wei, Junyi Sheng, Yongfeng Huang, Yu-Jin Zhang, TS-CNN: text steganalysis from semantic space based on convolutional neural network, arXiv:1810.08136, 2018. [146] Z. Yang, K. Wang, J. Li, Y. Huang, Y. Zhang, TS-RNN: text steganalysis based on recurrent neural networks, IEEE Signal Processing Letters 26 (12) (2019) 1743–1747.

13 Unsupervised steganographer identification via clustering and outlier detection

Hanzhou Wu
School of Communication and Information Engineering, Shanghai University, Shanghai, China

13.1 Introduction

As a means of secret communication, steganography [1] hides from untrusted parties the fact that any secret is being communicated. It can be realized by embedding the secret message into an innocent object (e.g., a digital image) called the cover. The resultant object, called the stego, is sent to the trusted receiver, who can perfectly retrieve the secret message from the stego according to the secret key. The most important requirement for a secure steganographic system is that it should be impossible for the channel monitor (attacker) to distinguish between ordinary objects and objects containing hidden information. From the perspective of the opponent, steganalysis aims to reveal the existence of hidden information. The relationship between steganography and steganalysis can be described as follows. As shown in Fig. 13.1, Alice wants to hide a secret message m into a randomly selected cover object x. She uses a well-designed embedding procedure with a key to generate the stego object y, which is sent through an insecure channel such as the Internet and may be inspected by Eve's detection algorithm. Eve gathers evidence about the type of communication from the output of the detector and decides whether or not to shut the communication channel down. If Bob successfully receives y, then he will be able to extract m with the secret key.

FIGURE 13.1 Framework of steganographic communication.


FIGURE 13.2 Steganalysis as a binary classification problem.

There are two common principles for designing secure steganographic systems [2]. One is model-preserving steganography, which aims to preserve a selected model of the cover source during data embedding [3,4]. The other treats steganography as a rate-distortion optimization task [5–7]: for a given payload, the introduced distortion should be minimized; in other words, the data hider wants to embed as many message bits as possible for a given distortion. Current advanced steganographic algorithms first assign a cost to each element based on its local neighborhood and then hide the secret message using coding techniques with near-minimal total cost [8–14]. Steganalysis has been greatly developed as well. As shown in Fig. 13.2, steganalysis extracts discriminative features from the given objects, builds training sets of both cover and stego objects, and runs an effective machine learning algorithm (e.g., an SVM [15]). By creating a decision function, unknown objects can be classified as either cover or stego. Since modern steganography modifies noise-like components of the cover, advanced feature extractors mine the statistical differences in high-frequency areas for detection. In traditional steganalysis, each object is treated in isolation; that is, cover objects are separated from stego objects by processing each object individually. However, such a restriction is unrealistic: in practice, we may face a complex scenario in which multiple network actors send sets of media files while one or some of them are using steganography. To make matters even more complicated, an actor performing steganography will sometimes behave innocently, mixing stego objects with natural covers [16]. Therefore any steganalyst has to consider multiple actors, each transmitting multiple objects, which is referred to as the steganographer identification problem (SIP). One might use traditional steganalysis to find stegos and then identify the guilty actors. However, the guilty actors may be lost among numerous false alarms because most traditional steganalysis algorithms involve a binary classifier, which has to be pretrained on large local data sets. In practice, the cover source of the actors will not be identical to that used


FIGURE 13.3 An example to identify the guilty actor by unsupervised learning.

for training, and thus the training phase itself becomes an error source if a normal actor is falsely accused due to the mismatch problem [17,18]. It is therefore necessary to design reliable detectors for the aforementioned complex scenario. Fortunately, although the SIP needs further investigation, some effective works have been reported in the literature. Fig. 13.3 shows an example of identifying the guilty actor among multiple actors by unsupervised learning. Each actor holds multiple images, from which multiple feature sets can be extracted. Each feature set can be regarded as a "point". By using an unsupervised learning-based method such as clustering, the most suspicious point, the feature set D, can be identified. Accordingly, Kim will be judged as the guilty actor. Following this line, in this chapter we introduce the foundational concepts and detail the methodologies of SIP. We expect that this chapter can help the readers to master basic concepts and advanced technologies for SIP. This chapter is organized as follows. In Section 13.2, we review the primary concepts and techniques. Then, in Section 13.3, we introduce two mainstream technical frameworks for SIP. In Section 13.4, we analyze the ensemble and dimensionality reduction techniques for improving identification performance. This chapter is finally concluded in Section 13.5.

13.2 Primary concepts and techniques

13.2.1 JPEG compression

JPEG is a common method of lossy compression for digital images. The degree of compression can be adjusted so as to ensure a controllable tradeoff between storage size and image quality. Unless otherwise mentioned, we will use JPEG images as the covers for steganography throughout this chapter because JPEG is one of the most popular cover types over social networks.


The JPEG compression algorithm is designed specifically for the human visual system (HVS). It exploits the biological properties of human sight: people are more sensitive to the luminance of an image than to its chromatic values, and people are not particularly sensitive to high-frequency content in images. The data encoding process of the JPEG compression algorithm has five main steps:
1. The representation of colors in the given image is converted from RGB to YC_BC_R, having one luminance component Y and two chrominance components C_B and C_R.
2. The resolution of the chroma data is then reduced since human eyes are less sensitive to fine color details than to fine brightness details.
3. The image is divided into pixel-blocks of size 8 × 8, and for each block, the Y, C_B, and C_R data undergo the discrete cosine transform (DCT).
4. The amplitudes of the frequency coefficients are thereafter quantized. The quality setting of the encoder determines to what extent the resolution of every frequency coefficient is reduced. The magnitudes of high-frequency coefficients are stored with a lower accuracy than low-frequency coefficients.
5. All the processed 8 × 8 pixel-blocks are further losslessly compressed with a variant of Huffman encoding.
The decoding process reverses these steps, except the quantization, because it is irreversible. Y, C_B, and C_R can be determined as
\[
\begin{cases}
Y = 0.299R + 0.587G + 0.114B,\\
C_B = 0.564(B - Y),\\
C_R = 0.713(R - Y).
\end{cases} \tag{13.1}
\]
For a pixel-block of size 8 × 8, the DCT can be described as
\[
F(u,v) = \frac{1}{4} C(u)C(v) \sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\cos\frac{(2x+1)u\pi}{16}\cos\frac{(2y+1)v\pi}{16}, \tag{13.2}
\]
\[
f(x,y) = \frac{1}{4} \sum_{u=0}^{7}\sum_{v=0}^{7} C(u)C(v)F(u,v)\cos\frac{(2x+1)u\pi}{16}\cos\frac{(2y+1)v\pi}{16}, \tag{13.3}
\]
where C(z) = 1/\sqrt{2} for z = 0 and C(z) = 1 otherwise. Here f(x,y) denotes the spatial value at position (x,y) to be transformed, and F(u,v) is the DCT coefficient. Note that once the DCT is finished, JPEG moves to quantization, in which the less important coefficients are reduced. For quantization, a quantization table T ∈ Z^{8×8} is needed. After quantization, F(u,v) is updated as the ratio F(u,v)/T(u,v) rounded to the nearest integer.
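As a concrete illustration of steps 3 and 4, the following minimal NumPy sketch applies Eq. (13.2) to a single 8 × 8 block and then quantizes it. The level shift by 128 and the quality-50 luminance quantization table are standard JPEG conventions assumed here for illustration; they are not taken from this chapter, and the function name is hypothetical.

```python
import numpy as np

def dct2_8x8(block):
    """2-D DCT of one 8x8 block, following Eq. (13.2)."""
    def C(z):
        return 1.0 / np.sqrt(2.0) if z == 0 else 1.0
    F = np.zeros((8, 8))
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += block[x, y] \
                         * np.cos((2 * x + 1) * u * np.pi / 16) \
                         * np.cos((2 * y + 1) * v * np.pi / 16)
            F[u, v] = 0.25 * C(u) * C(v) * s
    return F

# A commonly used JPEG luminance quantization table (quality 50); assumed here.
Q50 = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99]])

block = np.random.randint(0, 256, size=(8, 8)).astype(float) - 128  # usual level shift
F = dct2_8x8(block)
F_quant = np.round(F / Q50).astype(int)   # quantized DCT coefficients
```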

13.2.2 JPEG steganalysis features

Each actor's JPEG images are represented by low-dimensional feature vectors, which are called JPEG steganalysis features. In this section, we review two JPEG feature extractors that will be used later. We point out that designing efficient feature extractors is a core topic for SIP.

13.2.2.1 PEV-274 features

Pevný and Fridrich [19] merged the DCT features and Markov features to produce a 274-D feature vector called PEV-274 for JPEG steganalysis. Let d_{i,j}(k), 1 ≤ i ≤ 8, 1 ≤ j ≤ 8, 1 ≤ k ≤ n_B, denote the (i,j)th quantized DCT coefficient in the kth block. Let H be the histogram of all 64 × n_B luminance DCT coefficients, that is, H = (H_L, ..., H_R), where L = min_{i,j,k} d_{i,j}(k) and R = max_{i,j,k} d_{i,j}(k). Let h^{i,j} = (h_L^{i,j}, ..., h_R^{i,j}) be the histogram of coefficients for DCT mode (i,j). Let the dual histograms with 8 × 8 matrices be g_d^{i,j} = \sum_{k=1}^{n_B} δ(d, d_{i,j}(k)), where 1 ≤ i ≤ 8, 1 ≤ j ≤ 8, −5 ≤ d ≤ 5, and δ(x,y) = 1 if x = y and 0 otherwise.
The interblock dependency among DCT coefficients can also be captured. First, the variation V is determined as
\[
V = \frac{\sum_{i,j=1}^{8}\sum_{k=1}^{|I_r|-1} |d_{i,j}(I_r(k)) - d_{i,j}(I_r(k+1))|}{|I_r| + |I_c|}
  + \frac{\sum_{i,j=1}^{8}\sum_{k=1}^{|I_c|-1} |d_{i,j}(I_c(k)) - d_{i,j}(I_c(k+1))|}{|I_r| + |I_c|}, \tag{13.4}
\]
where I_r and I_c are the vectors of block indexes 1, ..., n_B obtained while scanning the JPEG image by rows and columns, respectively. Then two scalars determined from the decompressed JPEG image are collected:
\[
B_\alpha = \frac{\sum_{i=1}^{(M-1)/8}\sum_{j=1}^{N} |c_{8i,j} - c_{8i+1,j}|^\alpha + \sum_{j=1}^{(N-1)/8}\sum_{i=1}^{M} |c_{i,8j} - c_{i,8j+1}|^\alpha}{N(M-1)/8 + M(N-1)/8}, \tag{13.5}
\]
where M and N are the height and width, c_{i,j} is the grayscale value of the decompressed JPEG image, and α = 1, 2. Calibrated cooccurrence features are also collected, for example, C_{0,0}(J_1) − C_{0,0}(J_2), where C_{s,t} = \sum_{i,j=1}^{8} E_{i,j} and
\[
E_{i,j} = \frac{\sum_{k=1}^{|I_r|-1} δ(s, d_{i,j}(I_r(k)))\, δ(t, d_{i,j}(I_r(k+1)))}{|I_r| + |I_c|}
        + \frac{\sum_{k=1}^{|I_c|-1} δ(s, d_{i,j}(I_c(k)))\, δ(t, d_{i,j}(I_c(k+1)))}{|I_r| + |I_c|}. \tag{13.6}
\]

Let J_1 be the stego image, and let J_2 be its calibrated version. Calibration estimates macroscopic properties of the cover from the stego. During calibration, J_1 is decompressed to the spatial domain, cropped by a few pixels in both directions, and compressed again with the same quantization matrix as J_1. The newly obtained J_2 has macroscopic features similar to those of the original cover because the cropped image is visually similar to the original image. The cropping operation brings the 8 × 8 DCT grid "out of sync" with the previous compression, which effectively suppresses the influence of the previous JPEG compression and of the embedding changes [19].


The previous analysis allows a steganalyzer to construct 193-D DCT features. In detail, 66-D histogram features are first collected, that is,
\[
H_l(J_1) - H_l(J_2), \quad l \in \{-5, -4, \ldots, 5\}, \tag{13.7}
\]
and, for all (i,j) ∈ L = {(1,2), (2,1), (3,1), (2,2), (1,3)},
\[
h_l^{i,j}(J_1) - h_l^{i,j}(J_2), \quad l \in \{-5, -4, \ldots, 5\}. \tag{13.8}
\]
For the dual histograms g_d, d ∈ {−5, −4, ..., 5}, the differences of the 9 lowest AC modes are determined:
\[
g_d^{i,j}(J_1) - g_d^{i,j}(J_2), \tag{13.9}
\]
where (i,j) ∈ {(2,1), (3,1), (4,1), (1,2), (2,2), (3,2), (1,3), (2,3), (1,4)}. This allows the steganalyzer to collect 99-D DCT features. For the cooccurrence matrix, the central elements in the range [−2,2] × [−2,2] are used, yielding the 25-D features
\[
C_{s,t}(J_1) - C_{s,t}(J_2), \quad (s,t) \in [-2,2] \times [-2,2]. \tag{13.10}
\]

(13.12)

Chapter 13 • Unsupervised steganographer identification

301

If the matrices were taken directly as features, then the dimensionality would be too large. By using the central [−4, 4] portion of the matrices the dimension then is 4 × 92 = 324. Obviously, these Markov features can be calibrated, that is, M(c) = M(J1 ) − M(J2 ). The (c) (c) (c) dimension of the calibrated Markov feature set {Mh , M(c) v , Md , Mm } remains the same as (c) (c) ¯ = (M(c) + M(c) its original version. PEV-274 uses the average M v + Md + Mm )/4, which has h a dimension of 81. Accordingly, for PEV-274, the dimension is 193 + 81 = 274.

13.2.2.2 LI-250 features Li et al. [21] proposed high-order joint features for SIP. Their motivation is that most JPEG steganographic systems change the correlations of neighboring coefficients due to the modification of DCT coefficients, resulting in that Markov transition probability will be changed. However, Markov transition probabilities may be incapable of fully exploiting the changes in DCT coefficients and fully capturing the neighboring joint relation. Therefore, Li et al. developed 250-D features by employing joint density matrices for pooled JPEG steganalysis. The high-order joint features are driven from high-order joint density matrices of DCT coefficients. The steganalysis features consist of two parts: one is computed from mean joint density matrices of intrablock, and the other one is derived from the mean joint density matrices of interblock. The details of extracting the features are described as follows. Denoting the quantized DCT coefficients in the blocks as cm,n (u, v), where m ∈ [1, M] and n ∈ [1, N ] mean the block indexes, and (u, v) shows the location of each coefficient (h) in each block. The intrablock joint density matrices along three directions Fia (x, y, z), (v) (d) Fia (x, y, z), Fia (x, y, z) are calculated:  m,n,u,v δ(|cm,n (u, v)| = x, |cm,n (u, v + 1)| = y, |cm,n (u, v + 2)| = z) , (13.13) 48MN  m,n,u,v δ(|cm,n (u, v)| = x, |cm,n (u + 1, v)| = y, |cm,n (u + 2, v)| = z) , (13.14) 48MN  m,n,u,v δ(|cm,n (u, v)| = x, |cm,n (u + 1, v + 1)| = y, |cm,n (u + 2, v + 2)| = z) . (13.15) 36MN The intrablock joint features Fia (x, y, z) are then calculated by 1 (h) (v) (d) (x, y, z) + Fia (x, y, z) + Fia (x, y, z)}, Fia (x, y, z) = {Fia 3

(13.16)

where x, y, z ∈ [0, 4]. Thus the dimension of intrablock joint features is 125. Similarly, (h) the interblock second-order joint density matrices along three directions Fir (x, y, z), (v) (d) Fir (x, y, z), Fir (x, y, z) can be calculated by  m,n,u,v δ(|cm,n (u, v)| = x, |cm,n+1 (u, v)| = y, |cm,n+2 (u, v)| = z) , (13.17) 64M(N − 2)

302

Digital Media Steganography



m,n,u,v δ(|cm,n (u, v)| = x, |cm+1,n (u, v)| = y, |cm+2,n (u, v)| = z)

64(M − 2)N

,

(13.18)



m,n,u,v δ(|cm,n (u, v)| = x, |cm+1,n+1 (u, v)| = y, |cm+2,n+2 (u, v)| = z)

64(M − 2)(N − 2)

.

(13.19)

The interblock joint features Fir (x, y, z) are calculated by 1 (h) (v) (d) Fir (x, y, z) = {Fir (x, y, z) + Fir (x, y, z) + Fir (x, y, z)}, 3

(13.20)

where x, y, z ∈ [0, 4]. Thus the dimension of interblock joint features is 125. By combining the intrablock and interblock features, 250-D high-order joint features can be collected, which can be used for SIP.

13.2.3 Batch steganography and pooled steganalysis In SIP, each actor holds multiple digital objects. Normal actors take no action to their own digital objects, whereas a guilty actor, that is, the steganographer, distributes a secret payload into the objects held by himself. It involves the concept of batch steganography [16], which can be described as follows: given a total of n cover objects, the steganographer hides data in m ≤ n of them and leaves the other covers unchanged. Obviously, to secure steganography, how to best spread payload between multiple covers is a critical problem to the steganographer. For a steganalysis expert, how to pool evidence from multiple objects of suspicion is a key topic, which is called pooled steganalysis [16]. The methods for SIP can be regarded as pooled steganalysis. The steganographer has to spread a message sized L into images (I1 , I2 , . . . , In ) with capacities (c1 , c2 , . . . , cn ). He needs to determine the message fragment lengths (l1 , l2 , . . . , ln )  meeting the requirement L = ni=1 li . There are four common strategies reported in the literature [22]: max-greedy, max-random, linear, and even. In the max-greedy strategy, the steganographer embeds the message into the fewest number of covers. Assuming that the images are ordered by capacity c1 > c2 > · · · > cn , it leads to the following message lengths:  li = ci , i ∈ [1, m − 1], lm = L − m−1 i=1 li , and li = 0, i ∈ [m + 1, n]. The max-random strategy is the same as max-greedy, except that the images to be embedded are chosen in random order. In the linear strategy the message is distributed into all images proportionately to their capacities, that is, li = nci L c . For the even strategy, the secret message is distributed j =1 j

evenly, that is, li = L/n. The existing works [16,23,24] point that, for the steganographer, the optimal behavior is likely to be extreme concentration of payload into as few covers as possible, or the opposite in which payload is spread as thinly as possible. However, these theoretical results cannot be confirmed without practical pooled steganalyzers to test against [22]. The aforementioned strategies require the steganographer to estimate the capacities. Ker et al. [22] presented a method to estimating the maximum message length for each


available image for practical steganography. They first query the implementation of an algorithm to provide an initial estimate of the maximum message length. This is done either by embedding a very short message into the given image or by asking for information about the given image. Once an initial estimate is available, they try to embed a random string of this length. If the embedding fails, then the estimate of the capacity is decreased by ten bytes, and the procedure is repeated. Otherwise, they deem the current estimate of the capacity to be the final one. It should be pointed out that the capacity of an image is actually not easy to define in terms of security. One may exploit steganalysis to estimate the secure capacity, rather than simply taking the maximum embeddable payload as the capacity.
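The sketch below implements the four payload-spreading strategies described above. The function name, the integer rounding used in the linear and even strategies, and the example capacities are illustrative assumptions; the strategies themselves follow the description in [22].

```python
import random

def spread_payload(L, capacities, strategy="linear"):
    """Split a payload of L bits over images with the given capacities using
    the max-greedy, max-random, linear, or even strategy."""
    n = len(capacities)
    if strategy in ("max-greedy", "max-random"):
        order = sorted(range(n), key=lambda i: -capacities[i])
        if strategy == "max-random":
            order = random.sample(range(n), n)
        lengths, remaining = [0] * n, L
        for i in order:                      # fill images one by one
            lengths[i] = min(capacities[i], remaining)
            remaining -= lengths[i]
            if remaining == 0:
                break
        return lengths
    if strategy == "linear":                 # proportional to capacity
        total = sum(capacities)
        return [c * L // total for c in capacities]
    if strategy == "even":                   # equal share for every image
        return [L // n] * n
    raise ValueError("unknown strategy")

print(spread_payload(10000, [4000, 6000, 2000, 8000], "max-greedy"))
# -> [0, 2000, 0, 8000]
```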

13.2.4 Agglomerative clustering

The hierarchical cluster analysis [25] seeks to build a hierarchy of clusters, allowing one to partition a set of objects such that "similar" objects are clustered together. As a type of hierarchical cluster analysis, agglomerative clustering is a "bottom-up" approach in which each object starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy. As shown in Algorithm 1, basic agglomerative clustering maintains an "active set" of clusters (initially, every object forms its own cluster in the active set), and at each stage the two nearest clusters are merged. When two clusters are merged, they are removed from the active set, and their union is added to the active set. The procedure iterates until there is only one cluster in the active set. The tree is formed by keeping track of which clusters were merged. All that is needed is a method to compute a distance between two clusters, for which there are many options.

Algorithm 1 Basic agglomerative clustering.
1: Collect objects {x_i}_{i=1}^{n}, and set the distance measure D(·,·).
2: A ← ∅  (Active set starts out empty.)
3: for each i ∈ [1, n] do
4:   A ← A ∪ {{x_i}}  (Add each object as its own cluster.)
5: end for
6: T ← A  (Store the tree as a sequence of merges.)
7: while |A| > 1 do
8:   G_1^*, G_2^* ← arg min_{G_1, G_2 ∈ A} D(G_1, G_2)  (Choose the pair in A with the best distance.)
9:   A ← A \ {G_1^*} \ {G_2^*}  (Remove both from the active set.)
10:  A ← A ∪ {G_1^* ∪ G_2^*}  (Add the union to the active set.)
11:  T ← T ∪ {G_1^* ∪ G_2^*}  (Add the union to the tree.)
12: end while
13: return tree T

Let d(x,y) denote the distance between two objects x and y, and let D(X,Y) denote the distance between two clusters X and Y. The single linkage uses the distance between the nearest points in the two clusters:
\[
D_{SL} = \min_{x \in X,\, y \in Y} d(x,y). \tag{13.21}
\]
The complete linkage uses the farthest points:
\[
D_{CL} = \max_{x \in X,\, y \in Y} d(x,y). \tag{13.22}
\]
The single linkage can cause long chains of clusters, and the complete linkage prefers compact clusters [17]; other agglomerative clustering algorithms are intermediate, such as centroid clustering
\[
D_{CEN} = \frac{1}{|X| \cdot |Y|} \sum_{x \in X}\sum_{y \in Y} d(x,y) \tag{13.23}
\]
and average linkage
\[
D_{AL} = \frac{1}{|X \cup Y|^2 - |X \cup Y|} \sum_{u,v \in X \cup Y,\, u \neq v} d(u,v). \tag{13.24}
\]

The input to agglomerative clustering is a distance matrix between objects. The output can be displayed in a dendrogram showing the hierarchical relationship between objects, which is a good way to allocate objects to clusters.
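As a minimal sketch of how this is typically done in practice, the snippet below runs single-linkage agglomerative clustering on a precomputed distance matrix with SciPy and cuts the resulting tree into two clusters; the random matrix merely stands in for real inter-object distances such as MMD values, and the variable names are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Symmetric matrix of pairwise distances between objects (placeholder values).
rng = np.random.default_rng(0)
A = rng.random((7, 7))
D = (A + A.T) / 2
np.fill_diagonal(D, 0.0)

Z = linkage(squareform(D), method="single")      # single-linkage agglomeration
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the dendrogram into two clusters
print(labels)
```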

13.2.5 Local outlier factor

The local outlier factor (LOF) [26] is an outlier detection method that measures the local deviation of a sample point with respect to its neighbors; it requires a distance measure d : X × X → R. Assuming that there is a set of feature points C, all elements of which are vectors, we want to estimate the degree of outlyingness of a point p ∈ C. LOF uses an integer 1 < k < |C| to specify the number of nearest neighbors. In LOF the k-distance of p, denoted by d_k(p), is defined as a distance between p and some o ∈ C \ {p} such that [18]:
• for at least k points o' ∈ C \ {p}, d(p, o') ≤ d_k(p), and
• for at most k − 1 points o' ∈ C \ {p}, d(p, o') < d_k(p).
The k-neighborhood of p is denoted by N_k(p) = {o ∈ C | d(p,o) ≤ d_k(p)}. The reachability distance of p with respect to o is defined as r_k(p,o) = max{d_k(o), d(p,o)}. The local reachability density of p is defined as
\[
\mathrm{LRD}_k(p) = \frac{|N_k(p)|}{\sum_{o \in N_k(p)} r_k(p,o)}. \tag{13.25}
\]
The LOF value of p is finally defined as
\[
\mathrm{LOF}_k(p) = \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \frac{\mathrm{LRD}_k(o)}{\mathrm{LRD}_k(p)}. \tag{13.26}
\]


LOF_k(p) captures the degree to which p can be called an outlier. A higher value indicates that p is more likely to be an outlier.
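In practice, Eqs. (13.25)–(13.26) are available off the shelf; the minimal sketch below uses scikit-learn's LocalOutlierFactor, whose negative_outlier_factor_ attribute stores −LOF, so the most outlying point has the largest score. The synthetic data and parameter choice are assumptions for illustration only.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(1)
points = rng.normal(size=(50, 4))
points[0] += 6.0                         # make one point an obvious outlier

lof = LocalOutlierFactor(n_neighbors=10)
lof.fit(points)
scores = -lof.negative_outlier_factor_   # larger score = more outlying
print(int(np.argmax(scores)))            # index of the most suspicious point
```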

13.2.6 Maximum mean discrepancy

In SIP a measure of distance between two actors is required. Since each actor holds a set of feature vectors, a distance measure between two sets of feature vectors is necessary. The maximum mean discrepancy (MMD) [27–29] has been empirically shown to be quite effective for distance measurement and can be used for SIP. Given observations X = {x_i}_{i=1}^{|X|} and Y = {y_i}_{i=1}^{|Y|}, which are i.i.d. drawn from p(x) and q(y) defined on R^d, let F be a class of functions f : R^d → R; the MMD and its empirical estimate are
\[
\mathrm{MMD}[\mathcal{F}, p, q] = \sup_{f \in \mathcal{F}} \; \mathrm{E}_{x \sim p(x)} f(x) - \mathrm{E}_{y \sim q(y)} f(y), \tag{13.27}
\]
\[
\mathrm{MMD}[\mathcal{F}, X, Y] = \sup_{f \in \mathcal{F}} \; \frac{1}{|X|} \sum_{x \in X} f(x) - \frac{1}{|Y|} \sum_{y \in Y} f(y). \tag{13.28}
\]
Usually, F is selected as a unit ball in a universal RKHS H defined on the compact metric space R^d with kernel k(·,·) and feature mapping φ(·). The Gaussian and Laplacian kernels are universal. It has been proven that
\[
\mathrm{MMD}^2[\mathcal{F}, p, q] = \bigl\| \mathrm{E}_{x \sim p(x)} \varphi(x) - \mathrm{E}_{y \sim q(y)} \varphi(y) \bigr\|_{\mathcal{H}}^2. \tag{13.29}
\]
An unbiased estimate of MMD is
\[
\mathrm{MMD}[\mathcal{F}, X, Y] = \Bigl( \frac{1}{|X|^2 - |X|} \sum_{i \neq j} h[i,j] \Bigr)^{1/2}, \tag{13.30}
\]
where it is assumed that |X| = |Y| and
\[
h[i,j] = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i). \tag{13.31}
\]

For any two sets, the unbiased estimate of MMD can be used to measure their distance. However, note that when a set has only one feature vector, we cannot use MMD since its value then always equals zero. In this case, we may use the Euclidean metric d(x,y) = ||x − y||_2 or other metrics. A kernel function is required when using MMD. There are several choices for k, for example, the linear kernel, which is simply a scalar product,
\[
k(x,y) = x \cdot y, \tag{13.32}
\]
and the Gaussian kernel defined as
\[
k(x,y) = \exp(-\gamma \|x - y\|^2) \tag{13.33}
\]
with parameter γ. Typically, γ is set to η^{-2}, where η is the median of the L_2-distances between features in the set of images being considered; this means that the exponents are close to −1 on average [17].
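A minimal NumPy sketch of the unbiased estimate in Eqs. (13.30)–(13.31) with a Gaussian kernel and the median heuristic for γ is given below; the function name, the O(n²) double loop, and the synthetic inputs are illustrative simplifications.

```python
import numpy as np

def mmd_unbiased(X, Y, gamma=None):
    """Unbiased MMD estimate of Eqs. (13.30)-(13.31) with a Gaussian kernel.
    X and Y must contain the same number of feature vectors."""
    assert len(X) == len(Y)
    if gamma is None:                      # gamma = eta^(-2), eta = median distance
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        eta = np.median(d[d > 0])
        gamma = 1.0 / (eta ** 2)
    k = lambda a, b: np.exp(-gamma * np.sum((a - b) ** 2))
    n, s = len(X), 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                s += k(X[i], X[j]) + k(Y[i], Y[j]) - k(X[i], Y[j]) - k(X[j], Y[i])
    return np.sqrt(max(s, 0.0) / (n * n - n))   # guard: the estimate can be negative

X = np.random.randn(20, 274)
Y = np.random.randn(20, 274) + 0.2
print(mmd_unbiased(X, Y))
```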


13.3 General frameworks

In this section, we review two general frameworks used in SIP. Most of the present state-of-the-art methods are based on these frameworks. Both were proposed by Ker et al. and can be found in [17,18,30].

13.3.1 Clustering-based detection

Suppose that multiple network actors, each of which transmits multiple objects, have been intercepted. Each actor may use multiple sources of objects. For simplicity, assume that each actor has only one source of objects, but these sources could differ from each other. For each actor, feature vectors are extracted from the objects held by this actor. We can think of this as many clouds of points in the feature space, one cloud for each actor. A guilty actor is one using steganography in some of his or her transmitted objects. It is hoped that the guilty actors can be identified if their clouds of feature vectors stand out from those of the normal actors.
Applying steganalysis to each given object individually may be valuable. However, it is likely that any guilty party will be lost due to false positives. Moreover, if a model trained on local data is used for the unknown objects to be tested, then the mismatch problem will arise in applications. It is also possible to perform clustering on the individual objects, but this provides little information. This inspires us to consider the objects of each actor as a whole to characterize their source. In this way, we can cluster the actors, hoping to separate an innocent majority from a guilty minority. Here clustering the actors is equivalent to clustering the clouds mentioned before, where each cloud consists of a set of feature vectors.
Hierarchical clustering can be combined with the MMD distance measure for SIP. The distances between actors are defined as the MMD distances between their sets of feature vectors. If the features are well chosen so that the difference between the actors' sources is smaller than the difference between guilty and normal actors, then the final agglomeration will be between the normal and guilty actors. In this way a list of suspected guilty actors can be extracted from the cluster dendrogram. As well as depending on the features, the detection accuracy will depend on the proportion of embedded objects and the amount they embed. Such a detector assumes that most actors are normal and tries to identify the guilty actors as an outlier cluster.
Preprocessing the directly extracted features is necessary to guarantee the accuracy of the detection method since the raw features have different scales. Feature normalization is a good choice, and other preprocessing methods such as the principal component transformation may be suitable as well. By normalization we mean linear scaling of the features such that each column of the data matrix X has zero mean and unit variance. This means that
\[
\frac{1}{nl} \sum_{i=1}^{nl} X_{ir} = 0, \quad \forall r, \tag{13.34}
\]
and
\[
\frac{1}{nl} \sum_{i=1}^{nl} X_{ir}^2 = 1, \quad \forall r, \tag{13.35}
\]

where n is the number of actors, and l is the number of objects held by each actor. The preprocessing makes the distance measure more meaningful and less affected by noisy components.
Mathematically, let A = {a_1, a_2, ..., a_n} and S(a_i) = {I_1^{(i)}, I_2^{(i)}, ..., I_m^{(i)}} respectively represent the actors and the objects held by actor a_i. Based on the aforementioned analysis, the detailed steps can be described as follows; a code sketch of this pipeline is given at the end of this subsection.
1. For each actor a_i, 1 ≤ i ≤ n, with a well-designed feature extractor E_fea, extract the feature vectors from the objects in S(a_i), denoted by F(a_i) = {f_1^{(i)}, f_2^{(i)}, ..., f_m^{(i)}}.
2. For each actor a_i, 1 ≤ i ≤ n, normalize F(a_i) by the above method.
3. For any two different actors a_i and a_j, 1 ≤ i < j ≤ n, determine the MMD distance between the normalized F(a_i) and the normalized F(a_j).
4. Apply hierarchical clustering and collect the two clusters C_1 and C_2 at the final stage of merging. The actors belonging to the cluster with the smaller size are considered as the guilty actors. Assume that there are k ≤ |C_1| + |C_2| suspicious actors and |C_1| ≤ |C_2|. If k ≤ |C_1|, then we can randomly choose k actors from C_1 as the guilty actors. Otherwise, all actors in C_1 and k − |C_1| randomly selected actors from C_2 are judged as guilty ones.
For example, in Fig. 13.4 the upper plots show the MDS representation of the actors, and the lower plots show the cluster dendrograms, from which the final two clusters are encircled in the MDS plots. The left example gives C_1 = {C} and C_2 = {A, B, D, E, F, G}; for the right example, C_1 = {A, G} and C_2 = {B, C, D, E, F}. Suppose that there is only one guilty actor. Then, for the left example, C will be judged as the suspicious actor, and for the right example, either A or G will be judged as the steganographer. If there are three guilty actors, then for the right example, {A, G} and an actor randomly chosen from C_2 are selected as the guilty actors.
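The following is a minimal sketch of the four steps above, assuming that an MMD function such as the mmd_unbiased helper sketched in Section 13.2.6 is passed in; the normalization, the single-linkage choice, and the two-cluster cut follow the description, while the function and variable names are illustrative and the padding of the suspect list from C_2 is omitted.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def identify_suspects(features_per_actor, k, mmd):
    """features_per_actor: list of (m x d) arrays, one array per actor.
    Returns the indices of up to k suspicious actors (the smaller final cluster)."""
    stacked = np.vstack(features_per_actor)
    mu, sd = stacked.mean(axis=0), stacked.std(axis=0) + 1e-12
    feats = [(F - mu) / sd for F in features_per_actor]   # Eqs. (13.34)-(13.35)

    n = len(feats)
    D = np.zeros((n, n))
    for i in range(n):                                    # pairwise MMD distances
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = mmd(feats[i], feats[j])

    Z = linkage(squareform(D), method="single")           # hierarchical clustering
    labels = fcluster(Z, t=2, criterion="maxclust")       # final two clusters
    small = min(set(labels), key=lambda c: int((labels == c).sum()))
    return list(np.flatnonzero(labels == small))[:k]
```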

13.3.2 Outlier-based detection

The advantage of the hierarchical-clustering-based framework is that, by pooling objects together, it improves the "signal-to-noise" ratio compared with processing each object individually, leading to better accuracy in identifying the steganographer. The reason for using a clustering algorithm was that the group of guilty actors should form a cluster different from the normal one. Ker et al. [18] have pointed out that the actors' different cover sources imply that there is already variation between actors, and the steganographic embedding may exhibit itself as a shift of the feature vectors in some direction(s). Thus the guilty actors can be treated as outliers, allowing us to switch attention from clustering to outlier detection.

FIGURE 13.4 Examples for MDS representation and cluster dendrogram.

Many outlier detection methods can be found in the literature. The local outlier factor (LOF) is a good choice because [18] it requires only the pairwise distances between points, it is not tied to any particular application domain, it directly provides a measure of how much of an outlier each point is, and it relies on a single hyperparameter. Here LOF is used for detection, although any effective outlier detection method may be applied instead. The pseudocode for applying LOF to SIP is shown in Algorithm 2; the steps are similar to those of the hierarchical clustering method. The LOF method does not provide a threshold above which an element is considered an outlier; it instead yields a ranking, and a further advantage is that it is scale invariant. It is worth mentioning that outlier detection overcomes another shortcoming of clustering: the lack of any direct measure of "being an outlier". If a steganalysis expert wants to identify a single guilty actor, then he can take the one with the largest LOF value; if he wants a short list of the k most suspicious actors, then he can choose those with the top-k LOF values.

Algorithm 2 LOF-based detection method for SIP.
Input: A = {a_1, ..., a_n}, S(a_i) = {I_1^{(i)}, ..., I_m^{(i)}}, i ∈ [1, n].
Output: A ranking list r, where r_k ∈ r reveals the kth most suspicious actor.
1: Extract feature vectors for each actor.
2: Preprocess the feature vectors by normalization (other methods also work).
3: Apply LOF to the n points (i.e., the sets of normalized feature vectors).
4: Return a ranking list r = (r_1, r_2, ..., r_n) based on the LOF values.
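As a minimal sketch of Algorithm 2, the code below applies scikit-learn's LocalOutlierFactor to a precomputed distance matrix between actors (e.g., MMD distances between their normalized feature sets) and returns the actors ranked by LOF score; the neighborhood size k = 10 follows the text, while the function name and data layout are assumptions.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def lof_ranking(distance_matrix, k=10):
    """Return actor indices sorted from most to least suspicious.

    distance_matrix: (n, n) symmetric array of pairwise actor distances.
    """
    lof = LocalOutlierFactor(n_neighbors=k, metric="precomputed")
    lof.fit(distance_matrix)
    scores = -lof.negative_outlier_factor_  # larger LOF = more outlying
    return np.argsort(-scores)
```

The first entry of the returned ranking is the single most suspicious actor; the first k entries give the short list mentioned above.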

13.3.3 Performance evaluation and analysis

13.3.3.1 Clustering-based detection

Suppose that there are n actors, each holding m images, and that there is only one guilty actor. The actors are simulated by images taken from n different cameras, all JPEG compressed with an identical quality factor (QF). nsF5 [31], an improved version of F5 [32], is used as the steganographic algorithm. F5 is a steganographic algorithm aiming to preserve the shape of the histogram of quantized DCT coefficients: the secret message is embedded by changing the absolute values of DCT coefficients toward zero. F5 is also the first algorithm to use matrix embedding, a coding scheme that increases the embedding efficiency measured by the number of bits embedded per change. As its variant, nsF5 uses the same type of embedding changes as F5 but employs wet paper codes [33,34] to avoid introducing additional zeros. Each image is represented by a 274-D feature vector, called PEV-274, designed for JPEG steganalysis and previously shown to be effective against nsF5; we admit, however, that nsF5 is detectable by modern steganalysis features. We use the hierarchical clustering technique with the MMD distance between actors and expect the final agglomeration to be between the one guilty actor and a cluster of n − 1 normal actors. We stress that there is no training phase and that no knowledge of the actors' cameras or of the steganographic algorithm is assumed. Ker et al. [17] simulated n = 7 actors. Each actor converted RAW photos to JPEG with QF = 80 prior to transmission and used images at the default resolution of the camera. Each experiment was repeated 100 times per guilty actor, and each actor transmitted m = 50 JPEG images. Moreover, single linkage and the linear MMD were used. Their experimental results show that when the guilty actor embedded 0.25 bits per nonzero coefficient (bpnc) in 25% of his images, the overall detection accuracy was 90.3%, and when the guilty actor embedded 0.3 bpnc in 30% of his images, the overall detection accuracy was 99.9%. This indicates that the clustering algorithm identifies the guilty actor well. Performance falls off sharply for smaller payloads, however, since the guilty actor fades into the normal cluster and cannot be identified accurately. Ker et al. have also conducted experiments with different parameter settings. Their results show that clustering based on the linear MMD worked better than clustering based on the Gaussian MMD. Also, the choice of agglomeration algorithm did not make a substantial difference.

As expected, there is a dependency on m: more evidence allows the detector to make accurate detections from smaller payloads. In the case m = 200 the linear MMD combined with any linkage achieved perfect detection with payloads as small as 0.3 bpnc embedded in only 30% of the images.

13.3.3.2 Outlier-based detection

Compared to clustering-based detection, outlier-based detection allows us to consider thousands of actors, each transmitting hundreds of images. In [30] a randomly selected subset of 4000 actors with 200 images per actor was used in each experiment. The images were obtained from a social network site, which automatically resized large images to approximately 1 Mpix and then JPEG compressed them with QF = 85 at the time of crawling. Exactly one guilty actor was simulated with nsF5. Four different embedding strategies, max-greedy, max-random, linear, and even, were used to insert Lp bits into the images held by the guilty actor, where L is the total number of nonzero coefficients in the images, and p is the number of secret bits per nonzero coefficient. The PEV-274 features were extracted from the nm JPEG images and then normalized, the MMD distance between each pair of actors was determined, and finally the LOF detection method was applied. For each combination of parameters, the experiment was repeated 500 times with a random selection of the actors and of the guilty actor. The average rank of the guilty actor was used to reflect how well the guilty actor was identified: an average rank of one means that the guilty actor is always ranked the most suspicious, whereas an average rank of (n + 1)/2 corresponds to random guessing. For the MMD distance, the centroid "kernel" [30] was used, and for the LOF detection, the parameter k was set to 10 by default. The experimental results showed that the max-greedy strategy is the most secure for the hider: the average rank of the guilty actor is higher at all payloads. The second most secure strategy is max-random, followed by linear, with even being the most insecure. The reason why the max-greedy/max-random strategies are more secure than linear/even has been explained in [22]; in brief, it is caused by the whitening of the features in the preprocessing stage. Overall, with relatively large payloads, the LOF method achieves efficient detection performance.

13.4 Ensemble and dimensionality reduction

13.4.1 Clustering ensemble

The clustering ensemble method combines the detection results of multiple clustering submodels. As shown in Fig. 13.5, the clustering ensemble framework comprises three parts: preprocessing, feature extraction, and clustering ensemble. The preprocessing procedure requires us to analyze the characteristics of the steganographic algorithm so that the sensitive components concealing the steganographic traces can be separated from the stego objects. The preprocessing could be cropping, high-pass filtering, or other operations that improve the statistical discriminability between the steganographic noise and the cover content.

FIGURE 13.5 General framework for clustering ensemble.

After preprocessing, we can extract steganalysis features from the preprocessed objects, which reduces them to vectors of small dimension. Suppose that T clustering submodels {C_1, C_2, ..., C_T} are used for the clustering ensemble. For each submodel C_i, we can directly apply the hierarchical clustering method, by which a suspicious actor, the steganographer candidate, is collected. By merging the results from these submodels, the most suspicious actor, which will be chosen as the steganographer, can be determined. For a better understanding, we review here a related work by Li et al. [21] exploiting a clustering ensemble for SIP. In this work the images held by the actors are randomly cropped

to a smaller size to build actor subsets. Then the LI-250 features mentioned in the previous section are extracted from these subsets. After performing agglomerative clustering in the (normalized) feature space, a most suspicious actor can be separated from the innocent ones. By repeating these steps, multiple decisions are made, and the final guilty actor is identified by majority voting (a sketch of this procedure is given after the steps below). Note that in Li et al.'s work the clustering operation differs slightly from the basic agglomerative clustering algorithm described above, and the Euclidean distance between feature vectors is used. The algorithm is detailed as follows. Let {A_1, A_2, ..., A_n} be a set of n actors that includes one guilty actor and n − 1 normal actors, each of whom transmits m JPEG images. The clustering ensemble proceeds according to the following steps.

1. Denote the image sets of the n actors by {I_1, I_2, ..., I_n}. Each I_i (1 ≤ i ≤ n) contains m JPEG images of size M × N. Randomly crop each JPEG image in I_i (1 ≤ i ≤ n) to size m′ × n′ (m′ < M, n′ < N). The cropped images form n subsets I′_1, I′_2, ..., I′_n, where I′_i (1 ≤ i ≤ n) includes m JPEG images of size m′ × n′.
2. For each image subset I′_i (1 ≤ i ≤ n), extract the LI-250 features to form feature sets F_i (1 ≤ i ≤ n), where F_i = (f_i^{(1)}, f_i^{(2)}, ..., f_i^{(m)}) and each f_i^{(j)} represents a 250-D feature vector. Note that each F_i carries a second subscript j (1 ≤ j ≤ T), that is, F_{i,j}, to denote the F_i used in the jth clustering submodel, since there are T submodels.
3. Normalize the feature sets {F_1, F_2, ..., F_n} such that every column of the feature matrix has zero mean and unit variance. Then the distance between different pairs of feature sets can be calculated as

$$D(F_i, F_j) = \frac{1}{m^2}\sum_{p=1}^{m}\sum_{q=1}^{m} d\bigl(f_i^{(p)}, f_j^{(q)}\bigr), \quad i, j \in [1, n], \tag{13.36}$$

where d(f_i^{(p)}, f_j^{(q)}) represents the Euclidean distance between two feature vectors.

4. Perform hierarchical clustering with the distance measure (13.36). First, the two actors with the minimum distance are combined into a new cluster, denoted by $\mathcal{X}$, and the remaining actors form a set denoted by $\mathcal{Y}$. Repeatedly select an actor from $\mathcal{Y}$ and add it to $\mathcal{X}$ until only one actor is left in $\mathcal{Y}$. At each selection step, the actor Y ∈ $\mathcal{Y}$ with the smallest distance to $\mathcal{X}$ is selected, where the distance between an actor Y ∈ $\mathcal{Y}$ and the cluster $\mathcal{X}$ is defined as

$$D(Y, \mathcal{X}) = \frac{1}{|\mathcal{X}|}\sum_{X \in \mathcal{X}} D(Y, X). \tag{13.37}$$

The last remaining actor in $\mathcal{Y}$ is considered a suspicious actor.

5. Repeat Steps 1–4 T times. Each repetition identifies a suspicious actor, denoted by C_i, i = 1, 2, ..., T. By majority voting over the T submodels, the actor identified as suspicious most frequently is determined to be the final culprit; when there is a tie, one of the tied actors is selected at random.
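The following Python sketch illustrates Steps 3–5 under the assumption that the LI-250 features have already been extracted and normalized for each cropping round; the set distance implements Eq. (13.36), the greedy merging implements Eq. (13.37), and all function names are placeholders rather than the authors' code.

```python
import numpy as np
from scipy.spatial.distance import cdist

def set_distance(Fi, Fj):
    # Eq. (13.36): mean pairwise Euclidean distance between two (m, 250) feature sets.
    return cdist(Fi, Fj, metric="euclidean").mean()

def single_round_suspect(feature_sets):
    """One clustering submodel (Step 4): grow the cluster X until one actor remains in Y."""
    n = len(feature_sets)
    dist = np.array([[set_distance(feature_sets[i], feature_sets[j])
                      for j in range(n)] for i in range(n)])
    masked = dist + np.diag(np.full(n, np.inf))      # ignore self-distances
    i, j = divmod(int(masked.argmin()), n)           # initial pair with minimum distance
    cluster, remaining = {i, j}, set(range(n)) - {i, j}
    while len(remaining) > 1:
        # Eq. (13.37): average distance from a remaining actor to the cluster.
        closest = min(remaining, key=lambda y: np.mean([dist[y, x] for x in cluster]))
        cluster.add(closest)
        remaining.remove(closest)
    return remaining.pop()                           # last actor left in Y

def ensemble_suspect(rounds):
    """Step 5: majority vote over T submodels; `rounds` holds one list of
    normalized per-actor feature sets for each cropping round."""
    votes = [single_round_suspect(feature_sets) for feature_sets in rounds]
    return int(np.bincount(votes).argmax())
```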

The detection performance is evaluated by the overall identification accuracy, calculated as the ratio between the number of correctly detected steganographic actors and the total number of steganographic actors. In Li et al.'s method the original images are cropped to images of a smaller size. Li et al. point out that there is generally an optimal range for m′ (= n′); when the value of m′ is far from this range, the detection performance degrades. This phenomenon is explained as follows: when the size of the cropped images is too large, the diversity among the randomly cropped images is not significant, and thus each subclustering tends to generate an identical result; when the size is too small, the statistical features become unstable, leading to poor detection in each subclustering. Experiments reported in [21] show that LI-250 outperforms the previously popular JPEG steganalysis features PEV-274 [19], LIU-144 [35], DCTR-8000 [36], and PHARM-12600 [37]. This is explained as follows: on the one hand, compared with PEV-274 and LIU-144, LI-250 better reveals the changes of cover elements caused by the tested steganographic algorithms; on the other hand, although DCTR-8000 and PHARM-12600 are more sensitive for supervised binary classification, they have a high dimension and contain a number of weak features, which may lead to worse performance for SIP [18,38]. Usually, for a single hierarchical clustering, better performance is obtained when the images are of large size [39]. However, good performance with a large image size in the single hierarchical clustering method does not imply good performance of the clustering ensemble using the same large image size in Li et al.'s method: when the size of the cropped images is too large, the diversity among the randomly cropped images is reduced, multiple clustering rounds are likely to generate the same result, and the benefit of the clustering ensemble disappears. Therefore the preprocessing operation has a significant impact on the detection performance. Intuitively, the larger the T, the higher the overall accuracy, but the longer the running time. In [21] the overall accuracy tends toward stability as T increases and does not change appreciably once T exceeds a threshold. It is believed that the cropped images have a lot of overlapping areas as T increases, so the diversity is reduced, implying that multiple submodels may generate the same results. We can infer that the clustering ensemble framework shown in Fig. 13.5 can be applied to other detection methods (e.g., LOF-based) by making the corresponding adjustments.

13.4.2 Dimensionality reduction

In contrast to the clustering ensemble, dimensionality reduction focuses on reducing the number of random variables under consideration by obtaining a set of principal variables. Dimensionality reduction exploits salient features and eliminates irrelevant feature fluctuations by representing the discriminative information in a lower-dimensional manifold [40]. There are two common reduction approaches: feature selection and feature projection.

13.4.2.1 Feature selection

Feature selection attempts to select a feature subspace from the original full feature space. It is often used in domains where there are many features and comparatively few sample points. Feature selection simplifies the model and thus results in a shorter running time; it also helps avoid the curse of dimensionality. For SIP, mainstream works determine the distances between actors' features in a high-dimensional space to find the abnormal cluster or outlier corresponding to the steganographer. However, in a high-dimensional space the distances between feature points may become similar to each other, making it difficult to separate the abnormal points from the normal ones. To this end, Wu [41] uses a simple feature-selection-based ensemble algorithm for SIP. The method merges the results from multiple detection submodels, each of whose feature spaces is randomly sampled from the raw full-dimensional space. Unlike the basic framework, the images held by a single actor are divided into multiple disjoint sets, so that each actor is represented by multiple sets of feature vectors. In detail, let A = {a_1, a_2, ..., a_n} and S(a_i) = {I_1^{(i)}, I_2^{(i)}, ..., I_m^{(i)}} respectively represent the actors and the images held by actor a_i. A detector computes the preprocessed features for each a_i, that is, F(a_i) = {f_1^{(i)}, f_2^{(i)}, ..., f_m^{(i)}}. All F(a_i) (1 ≤ i ≤ n) are divided into disjoint sets of identical size, that is,

$$P(a_i) = \bigcup_{j=1}^{p} P_j(a_i), \quad 1 \le i \le n, \tag{13.38}$$

where m = p · q and P_j(a_i) = {f_{jq-q+1}^{(i)}, f_{jq-q+2}^{(i)}, ..., f_{jq}^{(i)}}. Accordingly, each actor a_i can be represented by p sets of feature vectors, and these p sets are called "p points". A total of p · n points is collected, each of which belongs to one of the n actors. It is natural to assume that the distance between an abnormal point and a normal point is larger than that between two normal points; in other words, normal points are densely distributed, whereas abnormal ones are sparsely distributed. For any two points, the MMD or other metrics can be used to measure their distance. However, when a point contains only one feature vector, we cannot use the MMD since its value always equals zero; in this case, we may adopt the Euclidean distance or another suitable metric. Then, by anomaly detection (e.g., LOF detection), a ranking list for the pn points is determined according to their anomaly scores. Let the pn triples {(u_i, v_i, w_i)}_{i=1}^{pn} denote the sorted information, where u_1 ≥ u_2 ≥ ··· ≥ u_{pn} are the anomaly scores, v_i denotes the corresponding actor, and w_i is the point index, that is, P_{w_i}(v_i) ∈ P(v_i). For each actor a_i, a fusion score is determined as

$$s(a_i) = \frac{1}{p}\sum_{j=1}^{pn} (pn + 1 - j)\,\delta(v_j, a_i), \quad 1 \le i \le n, \tag{13.39}$$

where δ(x, y) = 1 if x = y and δ(x, y) = 0 otherwise. By sorting the fusion scores the final ranking list can be generated, where the actor with the largest score will be the most suspicious.

In this way a single anomaly detection system operating on the full feature space can be constructed. Obviously, T submodels {M_1, M_2, ..., M_T} can be built, whose feature dimensions are {d_1, d_2, ..., d_T}. Each d_i (1 ≤ i ≤ T) is chosen from [H/2, H − 1], where H is the dimension of the raw full feature space. Each submodel is exactly the same as the single anomaly detection system, except that the features it uses are randomly sampled from the full feature space. For each M_i, a ranking list r_i = {r_{i,1}, ..., r_{i,n}} is collected, where r_{i,j} denotes the actor with the jth largest anomaly score, that is, r_{i,1} is the most suspicious, and r_{i,n} is the least suspicious. Accordingly, by further processing {r_1, r_2, ..., r_T} the final fusion score for a_i is

$$s_F(a_i) = \frac{1}{T}\sum_{j=1}^{T} \Bigl( n + 1 - \sum_{k=1}^{n} k\,\delta(r_{j,k}, a_i) \Bigr). \tag{13.40}$$

By sorting {s_F(a_1), ..., s_F(a_n)}, the actor with the largest score is deemed the most suspicious, and the actor with the smallest score the least suspicious. Experiments show that this scheme can improve the performance, although the improvement is not significant and can be affected by the parameter settings. For performance optimization, new feature selection algorithms may be designed to choose more effective feature components for detection.
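As a small illustration of the fusion rule (13.40), the sketch below assumes that each submodel has already produced a ranking of actor indices ordered from most to least suspicious; the function name and data layout are illustrative only.

```python
import numpy as np

def fuse_rankings(rankings, n):
    """Eq. (13.40): average (n + 1 - rank) of each actor over the T submodels.

    rankings: list of T sequences of actor indices, each ordered from the
    most to the least suspicious actor.
    """
    scores = np.zeros(n)
    for ranking in rankings:
        for position, actor in enumerate(ranking, start=1):
            scores[actor] += n + 1 - position
    scores /= len(rankings)
    order = np.argsort(-scores)      # most suspicious first
    return order, scores
```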

13.4.2.2 Feature projection

Feature projection should be distinguished from feature selection: whereas feature selection returns a subset of the original features, feature projection creates new features as functions of the original ones. We expect the projections of the original high-dimensional features to exhibit maximal information about the class label. Like feature selection, feature projection reduces the complexity and helps avoid overfitting. The feature transformation can be either linear or nonlinear. For example, linear principal component analysis (PCA) [42] performs a linear mapping of the raw features to a lower-dimensional space such that the variance of the features in the lower-dimensional representation is maximized. Kernel PCA [43], a nonlinear extension of PCA, uses a kernel so that the originally linear operations of PCA are performed in an RKHS. Pevný and Ker [38] have provided a profound study of feature projection for SIP. They limit themselves to linear feature projections because the steganalysis problem is believed to be essentially linear "in some sense"; for example, experiments in [44] demonstrate that accurate estimators of payload size can be created as linear functions of feature vectors. As pointed out in [38], we can expect neither that all stegos move in the same direction, certainly not when created with different steganographic algorithms, nor that covers begin from nearby points if they arise from different sources. The boundary between covers and stegos may be nonlinear. A good set of features consists of different linear projections, which can capture the characteristics of different embedding algorithms, cover objects, and sources.

Therefore, to achieve better performance, they exploit the principal component transformation (PCT), the maximum covariance transformation, ordinary least squares regression, and calibrated least squares regression for feature projection. The PCT can be defined as an iterative algorithm: in the kth iteration, we seek a projection vector w_k ∈ R^d that best explains the raw features and is orthogonal to all previous projections {w_i}_{i=1}^{k-1}. It is formulated as

$$w_k = \arg\max_{\|w\|=1} w^T X^T X w \tag{13.41}$$

subject to w^T w_i = 0, i ∈ [1, k − 1], where w_k is the kth eigenvector of X^T X. A weakness of PCT is that it does not take the objective into account. Pevný and Ker define the objective as the presence of steganography, signaled by the steganographic change rate. This is to some extent addressed by finding a direction maximizing the covariance between X_s w and Y_s, with w_k again orthogonal to {w_i}_{i=1}^{k-1}. The resulting maximum covariance (MCV) transformation method can be formulated as

$$w_k = \arg\max_{\|w\|=1} Y_s^T X_s w \tag{13.42}$$

subject to w^T w_i = 0, i ∈ [1, k − 1]. The analytical solution is w_k = Y_s^T X_{s,k}, where X_{s,k} = X_{s,k-1}(I − w_{k-1} w_{k-1}^T) and X_{s,1} = X_s. Ordinary least squares (OLS) regression finds a single direction w minimizing the total squared error between X_s w and Y_s. Minimizing the squared error ‖X_s w − Y_s‖² is equivalent to solving

$$w = \arg\max_{w \in \mathbb{R}^d} \; 2 Y_s^T X_s w - w^T X_s^T X_s w, \tag{13.43}$$

the analytical solution of which is

$$w = (X_s^T X_s)^{-1} X_s^T Y_s, \tag{13.44}$$

which assumes that X_s^T X_s is regular and that the inversion is numerically stable. In practice this is not always true, and steganalysis features may lead to a nearly singular matrix. To alleviate this, a small diagonal matrix is added to ensure nonsingularity and increase the stability of the solution:

$$w = (X_s^T X_s + \lambda I)^{-1} X_s^T Y_s. \tag{13.45}$$
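As a small illustration, the regularized solution (13.45) can be computed directly with NumPy; here X_s is an N × d matrix of stego feature vectors and Y_s a length-N vector of change rates, and the variable names are chosen for this sketch only.

```python
import numpy as np

def ols_projection(X_s, Y_s, lam=1e-3):
    """Regularized OLS direction of Eq. (13.45): (X_s^T X_s + lam*I)^{-1} X_s^T Y_s."""
    d = X_s.shape[1]
    return np.linalg.solve(X_s.T @ X_s + lam * np.eye(d), X_s.T @ Y_s)
```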

Good steganographic projections should be sensitive to embedding changes yet insensitive to the content, a property that is not optimized in the above methods. To this end, Pevný and Ker propose the calibrated least squares (CLS) regression method, which finds the projections by iteratively solving

$$w = \arg\max_{w \in \mathbb{R}^d} \; 2 Y_s^T X_s w - w^T X_c^T X_c w - \lambda \|w\|^2 \tag{13.46}$$

subject to w^T w_i = 0, i ∈ [1, k − 1]. The analytical solution is

$$w_k = (X_{c,k}^T X_{c,k} + \lambda I)^{-1} X_{s,k}^T Y_s, \tag{13.47}$$

where X_{s,k} = X_{s,k-1}(I − w_{k-1} w_{k-1}^T), X_{s,1} = X_s, and X_{c,k} = X_{c,k-1}(I − w_{k-1} w_{k-1}^T), X_{c,1} = X_c. Instead of PEV-274, Pevný and Ker use CF-7850 [45] and outlier detection [18] in their experiments. The experimental results show that in terms of identification accuracy (i.e., the average rank of the single guilty actor) the unsupervised PCT method is the worst, the MCV method is inferior to OLS in most cases, and CLS significantly outperforms the others. The explanation is as follows [38]: PCT focuses on the variance of the data, which is not what we want, since the variance in the features can be dominated by the cover content. MCV finds directions maximally correlated with the explained variable; although it is better than PCT, the image content is still not suppressed. OLS maximizes the covariance of a projection with the payload while minimizing the variance of the projection, but it is unclear whether that variance comes from the image content or from the embedding changes; CLS removes this ambiguity. The outlier-based analyzer is not compatible with rich features containing many weak features, because the weak features contain a lot of noise (caused by the cover content), resulting in inferior performance. The above supervised feature reduction can improve the performance, meaning that the complete outlier detector is no longer unsupervised. However, the features showed a high level of robustness to varying the embedding algorithm, meaning that the outlier detector retained its universal behavior. Finding the balance between supervision and universality is an important challenge for SIP, which needs further investigation. In addition, Pevný and Nikolaev [46] attempted to learn an optimal pooling function for pooled steganalysis. Although experiments show that the learned combining functions are superior to the prior art, many interesting phenomena were found, pointing to directions for further research, for example: are there better strategies for the steganographer to distribute the message into multiple objects, or is the linear strategy the best he can do despite the steganalyzer knowing it? We believe that challenging problems will arise as the study of SIP moves ahead [47].

13.5 Conclusion

In this chapter, we investigated the problem of steganographer identification, the goal of which is to identify the user who sends many steganographic images among many other innocent users. An analyzer has to deal with multiple users and multiple images per user and, in particular, with the differences between cover sources and between covers and stegos. We presented two general technical frameworks, based on clustering and on outlier detection. In both frameworks, the objects held by a single user are considered as a whole to gather better evidence for identification. We then demonstrated that both ensembles and dimensionality reduction can further improve the detection performance. We believe that steganographer identification will become quite important in the future from an application point of view.

Acknowledgment

This work was partly supported by the National Natural Science Foundation of China (NSFC) under grant number 61902235. It was also supported by "Chen Guang" project under grant number 19CG46, co-funded by the Shanghai Municipal Education Commission and Shanghai Education Development Foundation. The main academic contributions belong to the authors of the cited references.

References [1] J. Fridrich, Steganography in Digital Media: Principles, Algorithms, and Applications, Cambridge University Press, 2010. [2] H. Wu, W. Wang, J. Dong, H. Wang, New graph-theoretic approach to social steganography, in: Proc. IS&T Electronic Imaging, Media Watermarking, Security, and Forensics, 2019, 539. [3] P. Sallee, Model-based steganography, in: Proc. International Workshop on Information Hiding, 2003, pp. 154–167. [4] S. Hetzl, P. Mutzel, A graph-theoretic approach to steganography, in: Proc. International Conference on Communications and Multimedia Security, 2005, pp. 119–128. [5] Y. Chen, H. Wang, H. Wu, Z. Wu, T. Li, A. Malik, Adaptive video data hiding through cost assignment and STCs, IEEE Transactions on Dependable and Secure Computing, Early Access, available, 10.1109/ TDSC.2019.2932983, 2019. [6] H. Wu, H. Wang, H. Zhao, X. Yu, Multi-layer assignment steganography using graph-theoretic approach, Multimedia Tools and Applications 74 (18) (2015) 8171–8196. [7] A.K. Sahu, G. Swain, An optimal information hiding approach based on pixel value differencing and modulus function, Wireless Personal Communications 108 (1) (2019) 159–174. [8] T. Pevny, T. Filler, P. Bas, Using high-dimensional image models to perform highly undetectable steganography, in: Proc. International Workshop on Information Hiding, 2010, pp. 161–177. [9] V. Holub, J. Fridrich, T. Denemark, Universal distortion function for steganography in an arbitrary domain, EURASIP Journal on Information Security 2014 (2014) 1–13. [10] L. Guo, J. Ni, Y. Shi, An efficient JPEG steganographic scheme using uniform embedding, in: Proc. International Workshop on Information Forensics and Security, 2012, pp. 169–174. [11] V. Holub, J. Fridrich, Designing steganographic distortion using directional filters, in: Proc. IEEE International Workshop on Information Forensics and Security, 2012, pp. 234–239. [12] B. Li, M. Wang, J. Huang, X. Li, A new cost function for spatial image steganography, in: Proc. IEEE International Conference on Image Processing, 2014, pp. 4206–4210. [13] T. Filler, J. Fridrich, Gibbs construction in steganography, IEEE Transactions on Information Forensics and Security 5 (4) (2010) 705–720. [14] T. Filler, J. Judas, J. Fridrich, Minimizing additive distortion in steganography using syndrome-trellis codes, IEEE Transactions on Information Forensics and Security 6 (3) (2011) 920–935. [15] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (3) (1995) 273–297. [16] A. Ker, Batch steganography and pooled steganalysis, in: Proc. Information Hiding Workshop, vol. 4437, 2006, pp. 265–281. [17] A. Ker, T. Pevný, A new paradigm for steganalysis via clustering, in: Proc. SPIE, Media Watermarking, Security, and Forensics XIII, vol. 7880, 2011, pp. 312–324. [18] A. Ker, T. Pevný, Identifying a steganographer in realistic and heterogeneous data sets, in: Proc. SPIE, Media Watermarking, Security, and Forensics XIV, vol. 8303, 2012, pp. 182–194. [19] T. Pevný, J. Fridrich, Merging Markov and DCT features for multiclass JPEG steganalysis, in: Proc. SPIE, Security, Steganography, and Watermarking of Multimedia Contents IX, vol. 6505, 2007, pp. 28–40. [20] Y. Shi, C. Chen, W. Chen, A Markov process based approach to effective attacking JPEG steganography, in: Proc. Information Hiding Workshop, vol. 4437, 2006, pp. 249–264. [21] F. Li, K. Wu, J. Lei, M. Wen, Z. Bi, C. 
Gu, Steganalysis over large-scale social networks with high-order joint features and clustering ensembles, IEEE Transactions on Information Forensics and Security 11 (2) (2016) 344–357. [22] A. Ker, T. Pevný, Batch steganography in the real world, in: Proc. Multimedia and Security Workshop, 2012, pp. 1–10.

[23] A. Ker, Batch steganography and the threshold game, in: Proc. SPIE, Security, Steganography, and Watermarking of Multimedia Contents IX, vol. 6505, 2007, pp. 41–53. [24] A. Ker, A capacity result for batch steganography, IEEE Signal Processing Letters 14 (8) (2007) 525–528. [25] L. Rokach, O. Maimon, Clustering methods, in: Data Mining and Knowledge Discovery Handbook, 2005, pp. 321–352. [26] M. Breunig, H. Kriegel, R. Ng, J. Sander, LOF: identifying density-based local outliers, in: Proc. ACM SIGMOD International Conference on Management of Data, 2000, pp. 93–104. [27] L. Song, Learning via Hilbert space embedding of distributions, PhD Thesis, University of Sydney, 2008. [28] A. Gretton, K. Borgwardt, M. Rasch, B. Scholkopf, A. Smola, A kernel two-sample test, Journal of Machine Learning Research 13 (1) (2012) 723–773. [29] K. Muandet, K. Fukumizu, B. Sriperumbudur, B. Scholkopf, Kernel mean embedding of distributions: a review and beyond, Foundations and Trends in Machine Learning 10 (1–2) (2017) 1–141. [30] A. Ker, T. Pevný, The steganographer is the outlier: realistic large-scale steganalysis, IEEE Transactions on Information Forensics and Security 9 (9) (2014) 1424–1435. [31] J. Fridrich, T. Pevný, J. Kodovsky, Statistically undetectable JPEG steganography: dead ends, challenges, and opportunities, in: Proc. ACM Multimedia and Security Workshop, 2007, pp. 3–14. [32] A. Westfeld, F5: a steganographic algorithm, in: Proc. Information Hiding Workshop, 2001, pp. 289–302. [33] J. Fridrich, M. Goljan, D. Soukal, Efficient wet paper codes, in: Proc. Information Hiding Workshop, 2005, pp. 204–218. [34] J. Fridrich, M. Goljan, D. Soukal, Wet paper codes with improved embedding efficiency, IEEE Transactions on Information Forensics and Security 1 (1) (2006) 102–110. [35] Q. Liu, Z. Chen, Improved approaches with calibrated neighboring joint density to steganalysis and seam-carved forgery detection in JPEG images, ACM Transactions on Intelligent Systems and Technology 5 (4) (2015) 63. [36] V. Holub, J. Fridrich, Low-complexity features for JPEG steganalysis using undecimated DCT, IEEE Transactions on Information Forensics and Security 10 (2) (2015) 219–228. [37] V. Holub, J. Fridrich, Phase-aware projection model for steganalysis of JPEG images, in: Proc. SPIE, Media Watermarking, Security, and Forensics, vol. 9409, 2015, pp. 259–269. [38] T. Pevný, A. Ker, The challenges of rich features in universal steganalysis, in: Proc. SPIE, Media Watermarking, Security, and Forensics, vol. 8665, 2013, pp. 203–217. [39] T. Filler, A. Ker, J. Fridrich, The square root law of steganographic capacity for Markov covers, in: Proc. SPIE, Media Watermarking, Security, and Forensics, vol. 7254, 2009, pp. 62–72. [40] D. Erdogmus, U. Ozertem, T. Lan, Information theoretic feature selection and projection, in: B. Prasad, S.R.M. Prasanna (Eds.), Speech, Audio, Image and Biomedical Signal Processing using Neural Networks, vol. 83, Springer, 2008, pp. 1–22. [41] H. Wu, Feature bagging for steganography identification, arXiv:1810.11973, 2018. [42] K. Pearson, On lines and planes of closest fit to systems of points in space, Philosophical Magazine 2 (11) (1901) 559–572. [43] B. Scholkopf, A. Smola, K. Muller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation 10 (5) (1998) 1299–1319. [44] T. Pevný, J. Fridrich, A. Ker, From blind to quantitative steganalysis, IEEE Transactions on Information Forensics and Security 7 (2) (2012) 445–454. [45] J. Kodovsky, J. Fridrich, V. 
Holub, Ensemble classifiers for steganalysis of digital media, IEEE Transactions on Information Forensics and Security 7 (2) (2012) 432–444. [46] T. Pevný, I. Nikolaev, Optimizing pooling function for pooled steganalysis, in: IEEE International Workshop on Information Forensics and Security, 2015, pp. 1–6. [47] A. Ker, P. Bas, R. Bohme, R. Cogranne, S. Craver, T. Filler, J. Fridrich, T. Pevný, Moving steganography and steganalysis from the laboratory into the real world, in: Proc. ACM Workshop on Information Hiding and Multimedia Security, 2013, pp. 45–58.

14 Deep learning in steganography and steganalysis

Marc Chaumont^{a,b}
^{a} Montpellier University, LIRMM (UMR5506)/CNRS, Nîmes University, Montpellier, France
^{b} LIRMM/ICAR, Montpellier, France

14.1 Introduction

Neural networks have been studied since the 1950s. Initially, they were proposed to model the behavior of the brain. In computer science, especially in artificial intelligence, they have been used for around 30 years for learning purposes. Ten or so years ago [1], neural networks were considered to have a lengthy learning time and to be less effective than classifiers such as SVMs or random forests. With recent advances in the field of neural networks [2], thanks to the computing power provided by graphics cards (GPUs), and because of the profusion of available data, deep learning approaches have been proposed as a natural extension of neural networks. Since 2012, these deep networks have profoundly marked the fields of signal processing and artificial intelligence, because their performances make it possible to surpass current methods and also to solve problems that scientists had not managed to solve until now [3]. In steganalysis, for the last 10 years, the detection of a hidden message in an image was mainly carried out by calculating rich models (RMs) [4] followed by classification using an ensemble classifier (EC) [5]. In 2015 the first study using a convolutional neural network (CNN) obtained the first results of deep-learning steganalysis approaching the performances of the two-step approach (EC + RM^1) [6]. During the period 2015–2018, many publications have shown that it is possible to obtain improved performance in spatial steganalysis, JPEG steganalysis, side-informed steganalysis, quantitative steganalysis, and so on. In Section 14.2, we present the structure of deep neural networks generically. This section is focused on the existing publications in steganalysis and should be supplemented by reading about artificial learning and, in particular, gradient descent and stochastic gradient descent. In three additional sections, not present in this chapter but available on arXiv (https://arxiv.org/abs/1904.01444), we explain the different steps of the convolution module, tackle the complexity and learning times, and present the links between deep learning and previous approaches.

1. We will note EC + RM to indicate the two-step approach based on the calculation of RMs and the use of an EC.


In Section 14.3, we revisit the different networks proposed during the period 2015–2018 for different scenarios of steganalysis. Finally, in Section 14.4, we discuss steganography by deep learning, which sets up a game between two networks in the manner of the precursor algorithm ASO [7].

14.2 The building blocks of a deep neuronal network

In the following subsections, we look back at the major concepts of a convolutional neural network (CNN). Specifically, we recall the basic building blocks of a network based on the Yedroudj-Net^2 network, which was published in 2018 [8] (see Fig. 14.1) and which takes up the ideas present in Alex-Net [9], as well as the concepts present in networks developed for steganalysis, including the very first network of Qian et al. [6] and the networks Xu-Net [10] and Ye-Net [11].

FIGURE 14.1 Yedroudj-Net network [8].

14.2.1 Global view of a Convolutional Neural Network

Before describing the structure of a neural network and its elementary components, it is useful to remember that a neural network belongs to the machine-learning family. In the case of supervised learning, which is the case that most concerns us, it is necessary to have a database of images with, for each image, its label, that is, its class. Deep learning networks are large neural networks that can directly take raw input data. In image processing the network is directly fed with the pixels forming the image. Therefore a deep learning network jointly learns both the compact intrinsic characteristics of an image (we speak of a feature map or of a latent space) and, at the same time, the separation boundary allowing the classification (we also speak of separating planes). The learning protocol is similar to that of classical machine learning methods. Each image is given as an input to the network, and each pixel value is transmitted to one or more neurons. The network consists of a given number of blocks. A block consists of neurons that take real input values, perform calculations, and then transmit the computed real values to the next block. Therefore a neural network can be represented by an oriented graph

2. GitHub link on Yedroudj-Net: https://github.com/yedmed/steganalysis_with_CNN_Yedroudj-Net.

where each node represents a computing unit. The learning is then completed by supplying the network with examples composed of an image and its label, and the network modifies the parameters of these calculation units (it learns) thanks to the mechanism of back-propagation. The CNNs used for steganalysis are mainly built in three parts, which we will call modules: the preprocessing module, the convolution module, and the classification module. As an illustration, Fig. 14.1 schematizes the network proposed by Yedroudj et al. [8] in 2018. The network processes grayscale images of 256 × 256 pixels.

14.2.2 The preprocessing module

We can observe in Fig. 14.1 that in the preprocessing module the image is filtered by 30 high-pass filters. The use of one or more high-pass filters as preprocessing is present in the majority of the networks used for steganalysis during the period 2015–2018. An example of a high-pass filter kernel, the square S5a filter [4], is given by

$$F^{(0)} = \frac{1}{12}\begin{pmatrix} -1 & 2 & -2 & 2 & -1 \\ 2 & -6 & 8 & -6 & 2 \\ -2 & 8 & -12 & 8 & -2 \\ 2 & -6 & 8 & -6 & 2 \\ -1 & 2 & -2 & 2 & -1 \end{pmatrix}. \tag{14.1}$$

FIGURE 14.2 Principle of a convolution.

An illustration of the filtering (convolution) principle is given in Fig. 14.2. This preliminary filtering step allows the network to converge faster and is probably needed to obtain good performance when the learning database is too small [12] (only 4,000 pairs of cover/stego images of size 256 × 256 pixels). The filtered images are then transmitted to the first convolution block of the network. Note that the recent SRNet [13] network does not use any fixed prefilters but learns the filters. It therefore requires a much larger database (more than 15,000 pairs of cover/stego images of size 256 × 256 pixels) and strong know-how for its initialization. Note that there is a debate in the community about whether one should use fixed filters, initialize the filters with prechosen values and then continue the learning, or learn the filters from a random initialization. At the beginning of 2019, in practice (real-world situation [14]), the best choice probably depends on the size of the learning database

(which is not necessarily BOSS [15] or BOWS2 [16]) and on the possibility to use transfer learning.
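As a minimal sketch (not taken from any of the chapter's networks), the kernel of Eq. (14.1) can be applied to a grayscale image to produce one of the high-pass residuals fed to the first convolution block; the helper name and the symmetric boundary handling are illustrative choices.

```python
import numpy as np
from scipy.signal import convolve2d

# High-pass kernel of Eq. (14.1) (the square S5a filter).
S5A_KERNEL = np.array([[-1,  2,  -2,  2, -1],
                       [ 2, -6,   8, -6,  2],
                       [-2,  8, -12,  8, -2],
                       [ 2, -6,   8, -6,  2],
                       [-1,  2,  -2,  2, -1]], dtype=np.float32) / 12.0

def highpass_residual(image):
    """One residual of the preprocessing module; a real filter bank would
    stack several such residuals (30 in Yedroudj-Net)."""
    return convolve2d(image.astype(np.float32), S5A_KERNEL,
                      mode="same", boundary="symm")
```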

14.2.3 The convolution module Within the convolution module, we find several macroscopic computation units, which we will call blocks. A block is composed of calculation units that take real input values, perform calculations, and return real values, which are supplied to the next block. Specifically, a block takes a set of feature maps (= a set of images) as input and returns a set of feature maps as output (= a set of images). Inside a block, there are a number of operations including the following four: the convolution, the activation, the pooling, and the normalization (details are given at https://arxiv.org/abs/1904.01444). Note that the concept of neuron, as defined in the existing literature, before the emergence of convolutional networks, is still present, but it no longer exists as a data structure in neural network libraries. In convolution modules, we must imagine a neuron as a computing unit, which, for a position in the feature map taken by the convolution kernel during the convolution operation, performs the weighted sum between the kernel and the group of considered pixels. The concept of neuron corresponds to the scalar product between the input data (the pixels) and data specific to the neuron (the weight of the convolution kernel), followed by the application of a function from R in R, called the activation function. Then by extension we can consider that pooling and normalization are operations specific to neurons. Thus the notion of block corresponds conceptually to a “layer” of neurons. Note that in deep learning libraries, we call a layer any elementary operation such as convolution, activation, pooling, normalization, and so on. To remove any ambiguity, for the convolution module, we will talk about block and operations, and we will avoid using the term layer. Without counting the preprocessing block, the Yedroudj-Net network [8] has a convolution module made of five convolution blocks, like the networks of Qian et al. [6] and Xu et al. [10]. The Ye-Net network [11] has a convolution module composed of 8 convolution blocks, and SRNet network [13] has a convolution module built with 11 convolution blocks.

14.2.4 The classification module The last block of the convolution module (see the previous section) is connected to the classification module, which is usually a fully connected neural network composed of one to three blocks. This classification module is often a traditional neural network, where each neuron is fully connected to the previous block of neurons and to the next block of neurons. The fully connected blocks often end with a softmax function, which normalizes the outputs delivered by the network between [0, 1] so that the sum of the outputs equals one. The outputs are named imprecisely “probabilities”. We will keep this denomination. So in the usual binary steganalysis scenario the network delivers two values as output: one giving the probability of classifying into the first class (e.g., the cover class) and the other

giving the probability of classifying into the second class (e.g., the stego class). The classification decision is then obtained by returning the class with the highest probability. Note that in front of this classification module, we can find a particular pooling operation such as a global average pooling, a spatial pyramid pooling (SPP) [17], a statistical moments extractor [18], and so on. Such pooling operations return a fixed-size vector of values, that is, a feature map of fixed dimensions. The next block to this pooling operation is thus always connected to a vector of fixed size. So this block has a fixed input number of parameters. It is thus possible to present to the network images of any size without having to modify the topology of the network. For example, this property is available in the Yedroudj-Net [8] network, the Zhu-Net [19] network, or the Tsang et al. network [18]. Also note that [18] is the only paper, at the time of writing this chapter, that has seriously considered the viability of an invariant network to the dimension of the input images. The problem remains open. The solution proposed in [18] is a variant of the concept of average pooling. For the moment, there has not been enough studies on the subject to determine what is the correct topology of the network, how to build the learning data-base, how much the number of embedded bits influences the learning, or if we should take into account the square root law for learning at a fixed security level or any payload size, and so on.
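To make the size-invariance argument concrete, here is a hedged NumPy sketch of a global average pooling followed by a softmax: whatever the spatial dimensions of the feature maps, the pooled vector has a fixed length, so the classification module always receives an input of the same size. The array shapes are assumptions for illustration.

```python
import numpy as np

def global_average_pooling(feature_maps):
    """Collapse (C, H, W) feature maps to a length-C vector; the output size
    does not depend on H or W, so images of any size can be handled."""
    return feature_maps.mean(axis=(1, 2))

def softmax(logits):
    # Normalized "probabilities" returned by the last fully connected block.
    e = np.exp(logits - logits.max())
    return e / e.sum()
```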

14.3 The different networks used over the period 2015–2018

A chronology of the main CNNs proposed for steganography and steganalysis from 2015 to 2018 is given in Fig. 14.3. The first attempt to use deep learning methods for steganalysis

FIGURE 14.3 Chronology of the main CNNs for steganography and steganalysis from 2015 to 2018.

dates back to the end of 2014 [20] with autoencoders. At the beginning of 2015, Qian et al. [6] proposed to use CNNs. One year later, Pibre et al. [21] proposed to pursue the study. In 2016 the first results close to those of the state-of-the-art methods of the time (EC + RMs) were obtained with an ensemble of CNNs [22], as shown in Fig. 14.4. The Xu-Net^3 [10] CNN is used as a base learner of an ensemble of CNNs.

FIGURE 14.4 Xu-Net overall architecture.

Other networks were proposed in 2017, this time for JPEG steganalysis. In [23,24] (Figs. 14.5 and 14.6) the authors proposed a preprocessing inspired by RMs and the use of a large learning database. The results were close to those of the existing state-of-the-art methods (EC + RMs). In [25] the network is built with a phase split inspired by the JPEG compression process. An ensemble of CNNs was required to obtain results that were slightly better than those obtained by the best approach of the time. In Xu-Net-Jpeg [26] a CNN inspired by ResNet [27], with the shortcut connection trick and 20 blocks, also improved the results in terms of accuracy. Note that in 2018 ResDet [28] proposed a variant of Xu-Net-Jpeg [26] with similar results. These results were highly encouraging, but considering the gains obtained in other image processing tasks using deep learning methods [3], the steganalysis results represented less than a 10% improvement compared to the classical approaches that use an EC [5] with RMs

3. In this chapter, we refer by Xu-Net to a CNN similar to the one given in [10], and not to the ensemble version [22].

FIGURE 14.5 ReST-Net overall architecture.

FIGURE 14.6 ReST-Net subnetwork.

[4,49] or RMs with selection-channel awareness [29], [30], [31]. The revolutionary gain from deep learning observed in other areas of signal processing was not yet present for steganalysis. In 2017 the main trends to improve CNN results were using an ensemble of CNNs, modifying the topology by mimicking the RM extraction process, or using ResNet. In most cases the design or experimental effort was very high for a very limited improvement of performance in comparison to the networks that inspired this research, such as AlexNet [9], VGG16 [32], GoogleNet [33], ResNet [27], and so on. By the end of 2017 and early 2018 the studies had strongly concentrated on spatial steganalysis. Ye-Net [11] (Fig. 14.7), Yedroudj-Net^4 [12,8] (Fig. 14.8), ReST-Net [34] (Figs. 14.5 and 14.6), and SRNet^5 [13] (Fig. 14.9) were published respectively in November 2017,

4. Yedroudj-Net source code: https://github.com/yedmed/steganalysis_with_CNN_Yedroudj-Net.
5. SRNet source code: https://github.com/Steganalysis-CNN/residual-steganalysis.

FIGURE 14.7 Ye-Net overall architecture.

FIGURE 14.8 Comparison of Yedroudj-Net, Xu-Net, and Ye-Net architectures.

January 2018, May 2018, and May 2019 (with an online version in September 2018). All these networks clearly surpass the “old” two-step machine learning paradigm that used EC [5] and RMs [4]. Most of these networks can learn with a modest database size (i.e.,

FIGURE 14.9 SRNet network.

around 15,000 cover/stego pairs of 8-bit-coded images of size 256 × 256 pixels from BOSS+BOWS2). In 2018 the best networks were Yedroudj-Net [8], ReST-Net [34], and SRNet [13]. Yedroudj-Net is a small network that can learn on a very small database and can be effective even without using the tricks known to improve performance, such as transfer learning [35] or virtual augmentation of the database [11]. This network is a good candidate when working on GANs. It is better than Ye-Net [11] and can be improved to face other more recent networks [19]. ReST-Net [34] is a huge network made of three subnetworks, which uses various preprocessing filter banks. SRNet [13] is a network that can be adapted to spatial or JPEG steganalysis. It requires various tricks such as virtual augmentation and transfer learning and therefore requires a bigger database compared to Yedroudj-Net. These three networks are described in Section 14.3.1. To summarize, from 2015 to 2016 the publications concerned spatial steganalysis, in 2017 they were mainly on JPEG steganalysis, and in 2018 they were again mainly concentrated on spatial steganalysis. Finally, at the end of 2017 the first publications using GANs appeared. In Section 14.4, we present new propositions using steganography by deep learning and give a classification per family. In the next subsection, we report on the most successful networks until the end of 2018 for various scenarios. In Section 14.3.1, we describe the not-side-channel-aware (Not-SCA) scenario, in Section 14.3.2, we discuss the scenario known as side-channel-aware (SCA), and in Section 14.3.3, we deal with JPEG steganalysis in the Not-SCA and SCA scenarios. In Section 14.3.4, we very briefly discuss cover-source mismatch, although for the moment proposals using a CNN do not exist. We will not tackle the scenario of a CNN invariant to the size of the images because it is not yet mature enough; this scenario is briefly discussed in Section 14.2.4, and the papers on Yedroudj-Net [8], Zhu-Net [19], and Tsang et al. [18] give the first solutions. We will not approach the scenario of quantitative steganalysis by CNN, which consists in estimating the embedded payload size. This scenario is very well examined in the paper [36], which serves as the new state-of-the-art method; the approach surpasses the previous state-of-the-art approaches [37,38] that rely on RMs, an ensemble of trees, and an efficient normalization of features. Nor will we discuss batch steganography and pooled steganalysis with CNNs, which have not yet been addressed, although the work presented in [39] using two-stage machine learning can be extended to deep learning.

14.3.1 The spatial steganalysis Not-Side-Channel-Aware (Not-SCA)

In early 2018 the most successful spatial steganalysis approach was the Yedroudj-Net [8] method (Fig. 14.7). The experimental comparisons were carried out on the BOSS database, which contains 10,000 images subsampled to 256 × 256 pixels. For a fair comparison, the experiments compared the approach to Xu-Net without EC [10], to the Ye-Net network in its Not-SCA version [11], and also to an EC [5] fed by spatial RMs [4]. Note that Zhu-Net [19] (not yet published at the time of writing this chapter) offers three improvements to Yedroudj-Net, which allow it to be even more efficient. The improvements reported by Zhu-Net [19] are updating the kernel filters of the preprocessing module (in the same vein as what has been proposed by Matthew Stamm's team in forensics [40]), replacing the first two convolution blocks with two modules of depthwise separable convolutions as proposed in [41], and finally replacing the global average pooling with a spatial pyramid pooling (SPP) module as in [17]. In May 2018 the ReST-Net [34] approach was proposed (see Figs. 14.5 and 14.6). It consists of agglomerating three networks to form a supernetwork. Each subnetwork is a modified Xu-Net-like network [10] resembling the Yedroudj-Net [8] network, with an Inception module on blocks 2 and 4. This Inception module contains filters of the same size, with a different activation function for each "path" (TanH, ReLU, Sigmoid). The first subnetwork performs preprocessing with 16 Gabor filters, the second subnetwork performs preprocessing with 16 SRM linear filters, and the third network performs preprocessing with 14 nonlinear residuals (min and max calculated on SRM). The learning process requires four steps (one step per subnetwork and then one step for the supernetwork). The results are 2–5% better than Xu-Net for S-UNIWARD [42], HILL [43], and CMD-HILL [44] on BOSSBase v1.01 [15] 512 × 512. Looking at the results, it is the concept of ensemble that improves the performance; taken separately, each subnetwork has a lower performance. At the moment, no comparison in a fair framework has been made between an ensemble of Yedroudj-Net and ReST-Net. In September 2018 the SRNet [13] approach became available online (see Fig. 14.9). It proposes a deeper network than the previous ones, composed of 12 convolution blocks. The network does not perform preprocessing (the filters are learned) and subsamples the signal only from the 8th convolution block onward. To avoid the problem of vanishing gradients, blocks 2–11 use the shortcut mechanism. The Inception mechanism is also implemented from block 8 during the pooling (subsampling) phase. The learning database is augmented with the BOWS2 database as in [11] or [12], and a curriculum training mechanism [11] is used to change from a standard payload size of 0.4 bpp to other payload sizes. Finally, gradient descent is performed with Adamax [45]. The network can be used for spatial steganalysis (Not-SCA), informed (SCA) spatial steganalysis (as discussed in Section 14.3.2), and JPEG steganalysis (see Section 14.3.3, Not-SCA or SCA). Overall the philosophy remains similar to that of the previous networks, with three parts: preprocessing (with learned filters), convolution blocks, and classification blocks. With a simplified vision, the network corresponds to the addition of 5 convolution blocks without pooling just after the first convolution block of the Yedroudj-Net network. To be able to use this large number of blocks on a modern GPU, the authors must reduce the number of feature maps to 16,

and to avoid the problem of vanishing gradients, they must use the trick of residual shortcuts within the blocks as proposed in [27]. Note that preserving the size of the signal in the first seven blocks is a radical approach. This idea had been put forward in [21], where the suppression of pooling had clearly improved the results. The use of modern bricks like shortcuts or Inception modules also enhances the performance. It should also be noted that the training is completed end-to-end without a particular initialization (except when there is a curriculum training mechanism). In the initial publication [13], SRNet was not compared to Yedroudj-Net [8] or to Zhu-Net [19], but later, in 2019, all these networks were compared in [19]: the update of Yedroudj-Net, that is, Zhu-Net, gives an improvement of 1% to 4% over SRNet and of 4% to 9% over Yedroudj-Net when using the usual comparison protocol. Note that Zhu-Net is also better than the Cov-Pool network published at IH&MMSec'2019 [46], whose performances are similar to those of SRNet.

14.3.2 The spatial steganalysis Side-Channel-Informed (SCA)

At the end of 2018, two approaches integrated the knowledge of the selection channel: SCA-Ye-Net (the SCA version of Ye-Net) [11] and SCA-SRNet (the SCA version of SRNet) [13]. The idea is to take a network designed for noninformed steganalysis and to inject not only the image to be steganalyzed but also the modification probability map. It is thus assumed that Eve knows, or can obtain a good estimation of [47], the modification probability map, that is, Eve has access to side-channel information. The modification probability map is given to the preprocessing block of SCA-Ye-Net [11] and, equivalently, to the first convolution block of SCA-SRNet [13], but with the kernel values replaced by their absolute values. After the convolution, each feature map is summed pointwise with the corresponding convolved "modification probability map" (Fig. 14.10). Note that the activation functions of the first convolutions in SCA-Ye-Net, that is, the truncation activation function (truncated linear unit (TLU) in the paper), are replaced by a ReLU. This makes it possible to propagate (forward pass) "virtually" throughout the network both an information related to the image and another related to the modification probability map.

FIGURE 14.10 Integration of the modification probability map in a CNN.

Note that this procedure for transforming a Not-SCA CNN into an SCA CNN is inspired by the propagation of the modification probability map proposed in [30] and [31]. These two papers improve on the earlier maxSRM RMs [29]. In maxSRM, instead of accumulating the number of occurrences in the cooccurrence matrix, the maximum of a local probability is accumulated. In [30] and [31] the idea was to transform the modification probability map in the same way as the image is filtered and then to update the cooccurrence matrix using the transformed version of the modification probability map instead of the original one. The imitation of this principle was initially integrated into Ye-Net for CNN steganalysis, and the concept is easily transposable to most modern CNNs.
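The injection mechanism described above can be sketched in a few lines. The following Python/PyTorch snippet is only illustrative and not the exact SCA-Ye-Net or SCA-SRNet code: the filter bank, its size, and the exact position of the activation are assumptions, but the principle follows the text, namely convolving the probability map with the absolute values of the kernels and summing it pointwise with the image feature maps.

import torch
import torch.nn.functional as F

def sca_preprocessing(image, prob_map, kernels):
    # image, prob_map: tensors of shape (N, 1, H, W); kernels: (C, 1, k, k) preprocessing filters.
    pad = kernels.shape[-1] // 2
    feat = F.conv2d(image, kernels, padding=pad)             # residuals of the image
    beta = F.conv2d(prob_map, kernels.abs(), padding=pad)    # same convolution, |kernel| values
    # Pointwise summation of each feature map with its convolved probability map,
    # followed by a ReLU (which replaces the TLU of the Not-SCA version).
    return torch.relu(feat + beta)

image = torch.randn(1, 1, 256, 256)      # image to steganalyze
prob_map = torch.rand(1, 1, 256, 256)    # Eve's estimate of the modification probability map
kernels = torch.randn(30, 1, 5, 5)       # e.g. a bank of 30 high-pass filters (illustrative)
print(sca_preprocessing(image, prob_map, kernels).shape)   # torch.Size([1, 30, 256, 256])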

14.3.3 The JPEG steganalysis

The best JPEG CNN at the end of 2018 was SRNet [13]. Note that, at that time, this network was the only one proposed with a Side-Channel-Aware (SCA) version. It is interesting to list and briefly discuss the previous CNNs used for JPEG steganalysis. The first network, published in February 2017, was that of Zeng et al.; it was evaluated on a million images and includes a limited evaluation of stego-mismatch [23,24]. Then, in June 2017 at IH&MMSec'2017, two networks were proposed: PNet [25] and Xu-Net-Jpeg [26]. Finally, SRNet [13] was put online in September 2018.

In the network of Zeng et al. [23,24] the preprocessing block takes a dequantized (real-valued) image as input, convolves it with 25 DCT bases, and then quantizes and truncates the 25 filtered images. This preprocessing block uses handcrafted filter kernels (the DCT bases), the kernel values are fixed, and these filters are inspired by the DCTR RMs [48]. There are three different quantizations, so the preprocessing block outputs 3 × 25 residual images. The CNN is then made of three subnetworks, each producing a feature vector of dimension 512. The subnetworks are inspired by Xu-Net [10]. The three feature vectors outputted by the subnetworks are given to a fully connected structure, and the network ends with a softmax layer. Similarly to what has been done for spatial steganalysis, this network uses a preprocessing block inspired by RMs [48]. Note that the most efficient RMs today are the Gabor filter RMs [49]. Also note that this network takes advantage of the notion of an ensemble of features coming from the three different subnetworks. The network of Zeng et al. is less efficient than Xu-Net-Jpeg [26] but gives an interesting first approach guided by RMs.

The main idea of PNet [25] (and also of VNet, which is less efficient but requires less memory) is to imitate phase-aware RMs, such as DCTR [48], PHARM [50], or GFR [49], and therefore to decompose an input image into 64 feature maps representing the 64 phases of a JPEG image. The preprocessing block takes a dequantized (real-valued) image as input and convolves it with four filters: the "SQUARE5×5" from the Spatial Rich Models [4], a "point" high-pass filter (referred to as the "catalyst kernel"), which complements the "SQUARE5×5", and two directional Gabor filters (angles 0 and π/2). Just after the second convolution block, a "PhaseSplit Module" splits the residual image into 64 feature maps (one map = one phase), similarly to what is done in RMs. Some interesting techniques were used, such as (1) the succession of the fixed convolutions of the preprocessing block and a second convolution with learnable values, (2) a clever update of the BN parameters, (3) the use of the "Filter Group Option", which virtually builds subnetworks, (4) bagging on a 5-fold cross-validation, (5) taking the 5 last evaluations to give the mean error of a network, (6) shuffling the database at the beginning of each epoch to obtain a better BN behavior and to help generalization, and (7) eventually using an ensemble. With such know-how, PNet beats the classical two-step machine learning approaches (EC + GFR) in both its Not-SCA and SCA versions.

Xu-Net-Jpeg [26] is even more attractive since it is slightly better than PNet and does not require strong domain inspiration as PNet does. Xu-Net-Jpeg is strongly inspired by ResNet [27], a well-established network from the machine learning community. ResNet allows the use of deeper networks thanks to shortcuts. In Xu-Net-Jpeg the preprocessing block takes a dequantized (real-valued) image as input, convolves it with 16 DCT bases (in the same spirit as the Zeng et al. network [23,24]), and then applies an absolute value, a truncation, and a set of convolutions, BN, and ReLU until a feature vector of dimension 384 is obtained, which is given to a fully connected block. Note that max pooling and average pooling are replaced by convolutions. This network is really simple and was the state-of-the-art method in 2017. In a way, this kind of result shows that the networks proposed by the machine learning community are very competitive and that not much domain knowledge has to be integrated into the topology of a network to obtain a very efficient one.

In 2018 the state-of-the-art CNN for JPEG steganalysis (which can also be used for spatial steganalysis) was SRNet [13]. This network was presented in Section 14.3.1. Note that for the side-channel-aware version of SRNet, the embedding change probability per DCT coefficient is first mapped back to the spatial domain using the absolute values of the DCT basis. This side-channel map then enters the network and is convolved with each kernel (this first convolution acts as a preprocessing block). The convolutions applied to this side-channel map in the first block use the absolute values of the filter kernels. After the convolution, the feature maps are summed with the square root of the values of the convolved side-channel map. Note that this idea is similar to the integration of a side-channel map in the SCA Ye-Net version (SCA-TLU-CNN) [11], and to the recent proposition for side-channel-aware JPEG steganalysis with RMs [31], where the construction of the side-channel map, and especially the quantity δuSA^(1/2), was defined.6 Note that a similar solution with more convolutions applied to the side-channel map was proposed at IH&MMSec'2019 [51].

6 uSA stands for Upper bounded Sum of Absolute values.
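Since several of the JPEG networks above (the Zeng et al. network and Xu-Net-Jpeg) start by convolving the dequantized image with fixed DCT bases, it may help to see how such kernels are built. The following Python sketch generates the 8 × 8 two-dimensional DCT basis patterns; which subset is kept (25, 16, ...) and the subsequent quantization and truncation are choices made in the respective papers and are not reproduced here.

import numpy as np

def dct_basis(size=8):
    # Return the size*size 2-D DCT-II basis patterns, shape (size*size, size, size).
    w = np.full(size, np.sqrt(2.0 / size))
    w[0] = np.sqrt(1.0 / size)
    m = np.arange(size)
    bases = []
    for k in range(size):
        for l in range(size):
            ck = w[k] * np.cos((2 * m[:, None] + 1) * k * np.pi / (2 * size))
            cl = w[l] * np.cos((2 * m[None, :] + 1) * l * np.pi / (2 * size))
            bases.append(ck * cl)
    return np.stack(bases)

kernels = dct_basis(8)                             # 64 fixed kernels of size 8 x 8
print(kernels.shape)                               # (64, 8, 8)
print(round(float((kernels[0] ** 2).sum()), 6))    # 1.0: each basis pattern has unit L2 norm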

14.3.4 Discussion about the mismatch phenomenon scenario

Mismatch (cover-source mismatch or stego-mismatch) is a phenomenon present in machine learning in which classification performance decreases because of the inconsistency between the distribution of the learning database and the distribution of the test database. The problem is not due to an inability of machine learning algorithms to generalize, but to the lack of similar examples occurring in the training and test databases. The problem of mismatch therefore goes well beyond the scope of steganalysis.

In steganalysis the phenomenon can be caused by many factors. The cover-source mismatch can be caused by the use of different photosensors, different digital processing, different camera settings (focal length, ISO, lens, etc.), different image sizes, different image resolutions, and so on [52,53]. The stego-mismatch can be caused by different amounts of embedded bits or by different embedding algorithms. Even if not yet fully explored and understood, the mismatch (cover-source mismatch (CSM) or stego-mismatch) is a major area of examination for the discipline in the coming years. The results of the Alaska challenge [54],7 published at the ACM conference IH&MMSec'2019, will continue these considerations. In 2018, CSM had been identified for 10 years [55]. There are two major current schools of thought, as well as a third, more exotic one:

• The first school of thought is the so-called holistic approach (that is, global, macroscopic, or systemic) and consists of learning all distributions [56,57]. The use of a single CNN with millions of images [24] is the logical continuation of this school of thought. Note that this scenario does not consider that the test set can be used during learning. It can be assimilated to an online scenario, where the last player (from a game theory point of view) is the steganographer, because in an online scenario the steganographer can change her strategy, whereas the steganalyst cannot.

• The second school of thought is atomistic (partitioned, microscopic, analytical, of divide-and-conquer type, or individualized) and consists of partitioning the distribution [58], that is, creating a partition and associating a classifier with each cell of the partition. Note that an example of an atomistic approach for stego-mismatch management, using a CNN multiclassifier, is presented in [59] (a class is associated with each embedding algorithm, so there is a latent partition). This idea [59], among others, has been used by the winners of the Alaska challenge [60]. Again, this scenario does not consider that the test set can be used during learning, and it can also be assimilated to an online scenario where the last player (from a game theory point of view) is the steganographer, because in an online scenario the steganographer can change her strategy, whereas the steganalyst cannot.

• Finally, the third, more exotic school of thought considers that there is a test database (with much more than one image) and that this database is available and usable (without labels) during learning. This scenario can be assimilated to an offline scenario where the last player (from a game theory point of view) is the steganalyst, because in this offline scenario the steganalyst plays a more forensic role. In this situation, there are domain adaptation approaches, or feature-transfer approaches such as GTCA [61], IMFA [62], and CFT [63], where the idea is to define an invariant latent space. Another approach is ATS [64], which performs an unsupervised classification using only the test database and requires the embedding algorithm to reembed a payload in the images from the test database.

These three schools of thought can help derive CNN approaches that integrate the ideas presented here. That said, the ultimate solution may be to detect the mismatch phenomenon and to raise an alarm or withhold the decision [65], in short, to integrate a mechanism more intelligent than a purely holistic or atomistic one.

7 Alaska: A challenge of steganalysis into the wilderness of the real world. https://alaska.utt.fr/.

14.4 Steganography by deep learning

In Simmons' founding paper [66], steganography and steganalysis are defined as a 3-player game. The steganographers, usually named Alice and Bob, want to exchange a message without being suspected by a third party. They must use a harmless medium, such as an image, and hide the message in this medium. The steganalyst, usually called Eve, observes the exchanges between Alice and Bob. Eve must check whether these images are natural, that is, cover images, or whether they hide a message, that is, stego images.

This notion of a game between Alice, Bob, and Eve corresponds to that found in game theory. Each player tries to find a strategy that maximizes their chance of winning. For this, the problem is expressed as a min–max problem that we seek to optimize. The optimal solution, if it exists, is called the solution at the Nash equilibrium. When all players use a strategy at the Nash equilibrium, any change of strategy by one player leads to a counterattack from the other players allowing them to increase their gains.

In 2012, Schöttle and Böhme [67,68] modeled, under simplifying hypotheses, a problem of steganography and steganalysis and proposed a formal solution. Schöttle and Böhme called this approach optimal adaptive steganography or strategic adaptive steganography, in opposition to so-called naive adaptive steganography, which corresponds to what is currently used in algorithms such as HUGO (2010) [69], WOW (2012) [70], S-UNIWARD / J-UNIWARD / SI-UNIWARD (2013) [42], HILL (2014) [43], MiPOD (2016) [71], Synch-Hill (2015) [72], UED (2012) [73], IUERD (2016) [74], IUERD-UpDist-Dejoin2 (2018) [75], and so on.

A mathematical formalization of the steganography/steganalysis problem by game theory is difficult and often far from practice. Another way to determine a Nash equilibrium is to "simulate" the game. From a practical point of view, Alice plays the entire game alone, meaning that she does not interact with Bob or Eve to build her embedding algorithm. The idea is that she uses three algorithms (two in the simplified version), which we call agents. Each agent plays the role of Alice, Bob,8 or Eve, and each agent runs at Alice's home. We denote these three algorithms running at Alice's home Agent-Alice, Agent-Bob, and Agent-Eve. With these notations, we make a distinction with the human users, Alice (sender), Bob (receiver), and Eve (warden), and highlight the fact that the three agents are executed on Alice's side. Agent-Alice's role is thus to embed a message into an image so that the resulting stego image is undetectable by Agent-Eve and such that Agent-Bob can extract the message. Alice can launch the game, that is, the simulation, and the agents "fight".9 Once the agents have reached a Nash equilibrium, Alice stops the simulation; she can now keep Agent-Alice, which is her strategic adaptive embedding algorithm, and send Agent-Bob, that is, the extraction algorithm (or any equivalent information), to Bob.10 The secret communication between Alice and Bob is now possible through the use of the Agent-Alice algorithm for embedding and the Agent-Bob algorithm for extraction.

The first precursor approaches aimed at simulating a strategic adaptive equilibrium, and therefore at proposing strategic embedding algorithms, date from 2011 and 2012. The two approaches are MOD [76] and ASO [7,77]; see Fig. 14.11. Whether for MOD or ASO, the game is played by pitting Agent-Alice and Agent-Eve against each other. In this game, Agent-Bob is not used since Agent-Alice simply generates a cost map, which is then used for coding and embedding the message thanks to an STC [78]. Alice can generate a cost map for a source image with Agent-Alice and then easily use the STC algorithm [78] to embed her message and obtain the stego image. On his side, Bob only has to use the STC algorithm [78] to retrieve the message from the stego image.

FIGURE 14.11 General scheme of ASO [7,77].

8 Bob is deleted in the simplified version.
9 The reader should be aware that, from a game theory point of view, there are only two competing teams (Agent-Alice plus Agent-Bob on one side and Agent-Eve on the other) in a zero-sum game.
10 Note that the exchange of any secret information between Alice and Bob, prior to the use of Agent-Alice and Agent-Bob, requires another steganographic channel. Also note that this initial sending from Alice to Bob, before being able to use Agent-Alice and Agent-Bob, is equivalent to the classical stego-key exchange problem.

In both MOD and ASO, the "simulation" iterates the following two actions until a stopping criterion is reached:

i) Agent-Alice updates its embedding cost map by asking an Oracle (Agent-Eve) how best to update each embedding cost to become even less detectable. In MOD (2011) [76], Agent-Eve is an SVM, and Agent-Alice updates its embedding costs by reducing the SVM margin separating the covers and the stegos. In ASO (2012) [7], Agent-Eve is an EC [5], named the Oracle, and Agent-Alice updates the embedding costs by transforming a stego into a cover. In both cases the idea is to find a displacement in the latent (feature) space collinear with the axis orthogonal to the hyperplane separating the cover and stego classes. Note that in the current terminology, introduced by Ian Goodfellow in 2014 [79], Agent-Alice runs an adversarial attack, and the Oracle (Agent-Eve), called a discriminator (or the classifier to be deceived), must learn to counter this attack.

ii) The Oracle (Agent-Eve) updates its classifier. Reformulated in machine learning terminology, this amounts to updating the discriminant by relearning it to steganalyze once more the stego images generated by Agent-Alice.

In 2014, Goodfellow et al. [79] used neural networks to "simulate" a game between an image generator network and a discriminating network whose role is to decide whether an image is real or synthesized. The authors named this approach generative adversarial networks (GAN). The terminology used in this paper was subsequently widely adopted. Moreover, the use of neural networks makes the expression of the min–max problem easy (see the formulation recalled after the list below), and the optimization is then carried out via the back-propagation process. Thanks to deep-learning libraries, it is now easy to build a GAN-type system. As already mentioned, the concept of game simulation existed in steganography/steganalysis with MOD [76] and ASO [7], but the implementation and optimization become easier with neural networks.

From 2017, after a period of five years of stagnation, the concept of the simulated game has once again been studied in the field of steganography/steganalysis, thanks to the emergence of deep learning and GAN approaches. At the end of 2018, we can define four groups or four families11 of approaches, some of which will probably merge:

• The family by synthesis;
• The family by generation of the modifications probability map;
• The family by adversarial-embedding iterated (approaches misleading a discriminant);
• The family by 3-player game.
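For reference, and only to make the min–max game explicit, the objective optimized by a GAN [79] can be written as follows, where G is the generator, D the discriminator, p_data the distribution of real images, and p_z the distribution of the generator input; this is the standard formulation of [79], not a steganography-specific one:

\min_{G}\,\max_{D}\;\; \mathbb{E}_{x\sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr] \;+\; \mathbb{E}_{z\sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

Roughly speaking, in the steganographic transpositions discussed in the following subsections, the generator side is played by Agent-Alice (possibly helped by Agent-Bob), and the discriminator side by Agent-Eve.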

11 “Deep Learning in Steganography and Steganalysis since 2015”, tutorial given at the “Image Signal & Security Mini-Workshop”, the 30th of October 2018, IRISA/Inria Rennes, France, DOI: 10.13140/RG.2.2.25683.22567, http://www.lirmm.fr/~chaumont/publications. Look at the slides (http://www.lirmm.fr/~chaumont/publications/Deep_Learning_in_Steganography_and_Steganalysis_since_2015_Tutorial_Meeting-FranceCHAUMONT_30_10_2018.pdf) and the video of the talk (https://videos-rennes.inria.fr/video/H1YrIaFTQ).

14.4.1 The family by synthesis

The first approaches based on image synthesis via a GAN generator [79] proposed generating cover images and then using them to perform insertion by modification. These early propositions were therefore still approaches by modification. The argument put forward for such approaches is that the generated database would be safer. A reference often cited is SGAN [80], found on ArXiv, which was rejected at ICLR'2017 and subsequently never published; this unpublished paper contains many errors and lacks proof. One should rather prefer the reference SSGAN [81], published in September 2017, which proposes the same thing: generating images and then hiding messages in them. However, this protocol seems to complicate the matter. It is more logical that Alice herself chooses natural images that are safe for embedding, that is, images that are innocuous, never broadcast before, adapted to the context, with lots of noise or textures [82], not well classified by a classifier [77], or with a small deflection coefficient [71], rather than generating images and then using them to hide a message.

A much more interesting use of synthesis is to directly generate images that will be considered stego. To my knowledge, the first approach exploiting the GAN mechanism for image synthesis using the principle of steganography without modification [83] was proposed in the paper of Hu et al. [84], published in July 2018; see Fig. 14.12.

FIGURE 14.12 Hu et al. [84] approach by synthesis without modification.

The first step consists of deriving a network able to synthesize images. In this paper the DCGAN generator [85] is used to synthesize images after a preliminary learning following the GAN methodology. When fed with a fixed-size vector uniformly distributed in [−1, 1], the generator synthesizes an image. The second step consists of learning another network to extract a vector from a synthesized image; the extracted vector must correspond to the vector given at the input of the generator that synthesized the image. Finally, the last step consists of sending the extraction network to Bob. Now Alice can map a message to a fixed-size uniformly distributed vector, synthesize an image from this vector, and send it to Bob. Bob can extract the vector and retrieve the corresponding message.

The approaches without modification have been around for many years, and it is known that one of their problems is that the number of bits that can be communicated is lower than for the approaches with modification. That said, the gap between the approaches by modification and those without modification is beginning to narrow. Here is a rapid analysis of the efficiency of the method. In the paper of Hu et al. [84] the capacity is around 0.018 bits per pixel (bpp) with images of 64 × 64 pixels.12 In the experiments carried out, the synthesized images are either faces or photos of food. An algorithm like HILL [43] (one of the most powerful algorithms on the BOSS database [82]) is detected by SRNet [13] (one of the most successful steganalysis approaches at the end of 2018) with error probability Pe = 31.3% (note that a Pe of 50% is equivalent to a random detector) on the 256 × 256 pixel BOSS database for a payload size of 0.1 bpp. Due to the square root law, the Pe would be higher for a 64 × 64 pixel BOSS database. Therefore there is around 0.02 bpp for the unmodified synthetic approach of Hu et al. [84], whose security has not yet been evaluated enough, against roughly 0.1 bpp for HILL with less than one chance in three of being detected by a clairvoyant steganalysis, that is, a laboratory steganalysis (unrealistic and much more efficient than a "real-world"/"into the wild" steganalysis [14,54]). Therefore there is still a margin, in terms of the number of bits transmitted, between no-modification synthesis-based approaches such as that of Hu et al. [84] and modification approaches such as S-UNIWARD [42], HILL [43], MiPOD [71], or even Synch-Hill [72], but this margin has been reduced.13

Also note that there are still some issues to be addressed to ensure that approaches such as the one proposed by Hu et al. are entirely safe. In particular, it must be ensured that the detection of synthetic images [86] does not compromise the communication channel in the long term. It must also be ensured that the absence of a secret key does not jeopardize the approach. Indeed, if one considers that the generator is public, is it possible to use this information to deduce that a synthesis approach without modification has been used?

12 The vector dimension is 100. This vector is used to synthesize images of size 64 × 64 × 3. There are 100 × 3 bits (see the mapping) per image, that is, about 0.02 bits per pixel (bpp). The bit error rate is BER = 1 − 0.94 = 6%. It is therefore necessary to add an error-correcting code (ECC) so that the approach is error-free. With the use of a Hamming code [15, 11, 3], which corrects at best 6% of errors, the payload size is therefore around 0.018 bpp.
13 The other families of steganography by deep learning, which are modification based, will probably help to maintain this performance gap for a few more years.
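To fix ideas on the message-to-vector mapping mentioned in footnote 12, here is one simple way such a mapping could be realized in Python: each of the 100 coordinates of the generator input carries 3 bits, encoded as one of 8 equally spaced levels in [−1, 1] and decoded back to the nearest level. This is only an illustrative assumption; the exact mapping of Hu et al. [84] may differ, and the error-correcting layer is not shown.

import numpy as np

LEVELS = np.linspace(-1.0, 1.0, 8)        # 8 levels per coordinate <=> 3 bits per coordinate

def bits_to_vector(bits):
    # bits: array of 0/1 values whose length is a multiple of 3.
    triples = bits.reshape(-1, 3)
    indices = triples[:, 0] * 4 + triples[:, 1] * 2 + triples[:, 2]
    return LEVELS[indices]

def vector_to_bits(vec):
    indices = np.abs(vec[:, None] - LEVELS[None, :]).argmin(axis=1)   # nearest-level decoding
    return np.stack([(indices >> 2) & 1, (indices >> 1) & 1, indices & 1], axis=1).ravel()

rng = np.random.default_rng(0)
msg = rng.integers(0, 2, 300)             # 100 x 3 bits, as in the capacity analysis above
z = bits_to_vector(msg)                   # 100-dimensional generator input, values in [-1, 1]
assert np.array_equal(vector_to_bits(z), msg)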

14.4.2 The family by generation of the modifications probability map

At the end of 2018 the family by generation of the modification probability map is summarized by two papers: ASDL-GAN [87] and UT-6HPF-GAN [88]; see Fig. 14.13. In this approach, there is a generator network and a discriminant network. From a cover, the generator network generates a map called the modification probability map. This modification probability map is then passed to an equivalent of the random draw function used in the STC simulator [78]. We then obtain a map whose values belong to {−1, 0, +1}; this map is called the modification map and corresponds to the so-called stego-noise. The discriminant network takes as input either a cover or an image resulting from the pointwise summation of the cover and the stego-noise generated by the generator. The discriminant's objective is to distinguish between the cover and the "cover + stego-noise" image, whereas the generator's objective is to generate a modification map that misleads the discriminant as much as possible. Of course, the generator is forced to generate a nonzero probability map by adding to the loss, in addition to the term misleading the discriminant, a term constraining the size of the payload.

FIGURE 14.13 ASDL approach; generation of the modifications probability map.

In practice, taking the most recent approach, UT-6HPF-GAN [88], the generator is a U-Net type network, the draw function is approximated by a differentiable double-Tanh function, and the discriminant is Xu-Net [10] enriched with six high-pass filters for the preprocessing, in the same spirit as Ye-Net [11] or Yedroudj-Net [8]. The system learns on a first database, and then security comparisons are made on the 256 × 256 pixel BOSS [15], LIRMMBase [21], and BOWS2 [16] databases. The steganalysis is done with the EC [5] fed by SRM [4], with the EC fed by MaxSRM [29], and with Xu-Net [10]. Note that using Xu-Net is not a good choice, since it is less efficient than EC+SRM or EC+MaxSRM and also because it is the discriminant in UT-6HPF-GAN (there is a risk of falling into an "incompleteness" issue; see [89,90]). So, looking only at the results with EC+SRM on the BOSS database, with real embedding using STC [78], the performances are equivalent to those of HILL [43], which is one of the most efficient embedding algorithms on BOSS [82]. It is therefore a very promising family. Additionally, the generator does not seem to be affected when used on a database different from the learning database. Nevertheless, curriculum learning has to be used when the target payload is changed, which seems to indicate a kind of sensitivity to mismatch. Further work is also needed on the generator's loss and on the mixing of a security-related term with a payload-size term: usually, one of the two criteria is fixed, so that we are either in a payload-limited sender scenario or in a security-limited sender scenario. Note that a version for JPEG, JS-GAN [91], has been proposed at IH&MMSec'2019.
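The differentiable "draw" function mentioned above can be sketched as follows. This is a generic double-Tanh staircase in the spirit of the simulators used by ASDL-GAN and UT-6HPF-GAN; the exact parameterization in [87,88] may differ, and lambda_ is an assumed steepness hyperparameter.

import torch

def double_tanh_draw(p, lambda_=60.0):
    # p: modification probability map with values in [0, 1].
    # Differentiable approximation of the hard sampler: -1 if n < p/2, +1 if n > 1 - p/2, else 0.
    n = torch.rand_like(p)                # uniform random draw in [0, 1]
    return (-0.5 * torch.tanh(lambda_ * (p - 2.0 * n))
            + 0.5 * torch.tanh(lambda_ * (p - 2.0 * (1.0 - n))))

p = torch.full((1, 1, 4, 4), 0.4)         # toy modification probability map
stego_noise = double_tanh_draw(p)
print(stego_noise.round())                # values close to -1, 0, or +1

Because this function is differentiable with respect to p, the payload and security terms of the generator's loss can be back-propagated through the draw, which is precisely why the hard sampler is replaced.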

14.4.3 The family by adversarial-embedding iterated (approaches misleading a discriminant)

The family by adversarial-embedding iterated reuses the concept of game simulation presented at the beginning of Section 14.4, with a simplification of the problem since there are only two players, Agent-Alice and Agent-Eve. Historically, MOD [76] and ASO [7] were the first algorithms of this type. Recently, some papers have used the adversarial concept14 by generating a deceiving example (see [92]), but these approaches are neither adversarial-embedding iterated nor dynamic: they contain no game simulation, they do not try to reach a Nash equilibrium, and there is no learning alternation between the embedder and the steganalyzer.

A paper whose spirit is more in tune with a simulation of a game, which takes up the principle of ASO [7] and whose objective is to update the cost map, is the ADV-EMB algorithm [93] (previously called AMA on ArXiv, arXiv:1803.09043). In this paper the authors propose an iterated adversarial embedding in which Agent-Alice accesses the gradient of the loss of Agent-Eve (similarly to ASO, where Agent-Alice has access to its Oracle, the Agent-Eve). In ADV-EMB, Agent-Alice uses the gradient of the direction to the class frontier (between the cover and stego classes) to modify the cost map, whereas in ASO, Agent-Alice directly uses the direction to the class frontier to modify the cost map.

In ADV-EMB [93] the cost map is initialized with the costs of S-UNIWARD (for ASO, it was the costs of HUGO [69]). During the iterations, the cost map is updated, but only a fraction β of the values are updated.15 When the ADV-EMB iterations are stopped, the cost map is composed of a proportion 1 − β of positions having a cost defined by S-UNIWARD and a proportion β of positions having a cost coming from a change in the initial cost given by S-UNIWARD. Note that updating a cost introduces a cost asymmetry, since the cost of a +1 change is no longer equal to the cost of a −1 change, as in ASO. Besides, the update of the two costs of a pixel is rather rough, since it is a simple division by 2 for one direction (+1 or −1) and a multiplication by 2 for the other direction. The sign of the gradient of the loss, calculated by choosing the cover label, for a given pixel, makes it possible to determine, for each of the two directions (+1/−1), whether the cost is reduced or increased. The idea, as in ASO, is to deceive the discriminant: when the value of a cost is reduced, the direction of modification associated with this cost is favored, and thus getting closer to the cover class is promoted.

With such a scheme, security is improved. The fact that it is preferable to have a small number of modifications to the initial cost map probably makes it possible to preserve the initial embedding approach and therefore not to introduce too many traces that could be detected by another steganalyzer [90]. That said, the update of the costs should probably be refined to better take into account the value of the gradient. For the moment, the selection of the β fraction of pixels that will be modified is suboptimal, and this selection should eventually be done by looking at the initial costs of all the pixels. Finally, as is the case for ASO, if the discriminant is not powerful enough to carry out a steganalysis, then it can be totally counterproductive for Agent-Alice. Therefore there are many open questions regarding the convergence criterion, the stopping criterion, the number of iterations in the alternation between Agent-Alice and Agent-Eve, the definition of a metric for measuring the relevance of Agent-Eve, and so on. Note that an iterated adversarial embedding in which Agent-Alice counters multiple versions of Agent-Eve has been proposed at IH&MMSec'2019 [94].

14 An adversarial attack does not necessarily require the use of a deep learning classifier.
15 In STC, before coding the message, the pixel positions of the image are shuffled thanks to a pseudorandom shuffler seeded by the secret stego-key. Note that this stego-key is shared between Alice and Bob. After the shuffling step, ADV-EMB selects the last β percent of pixels of the shuffled image and modifies their associated costs, and only those.
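The asymmetric cost update just described can be sketched as follows. This is only an illustration of the principle given in the text: the pixel selection, the variable names, and the sign convention relating the gradient (computed with the cover label) to the favored modification direction are assumptions, and the actual algorithm should be taken from [93].

import numpy as np

def adjust_costs(rho_plus, rho_minus, grad, beta=0.1, rng=None):
    # rho_plus / rho_minus: cost maps of the +1 / -1 changes; grad: dLoss/dPixel for the cover label.
    rng = np.random.default_rng(0) if rng is None else rng
    rp = rho_plus.ravel().copy()
    rm = rho_minus.ravel().copy()
    g = grad.ravel()
    idx = rng.permutation(rp.size)[: int(beta * rp.size)]   # the beta fraction of (shuffled) positions
    favor_plus = g[idx] < 0                                  # increasing the pixel value lowers the loss
    # Favored direction: cost divided by 2; opposite direction: cost multiplied by 2.
    rp[idx] = np.where(favor_plus, rp[idx] / 2, rp[idx] * 2)
    rm[idx] = np.where(favor_plus, rm[idx] * 2, rm[idx] / 2)
    return rp.reshape(rho_plus.shape), rm.reshape(rho_minus.shape)

rho = np.ones((4, 4))                     # in practice the initial costs come from S-UNIWARD
grad = np.random.default_rng(1).standard_normal((4, 4))
rho_p, rho_m = adjust_costs(rho, rho, grad, beta=0.25)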

14.4.4 The family by 3-player game

The 3-player game concept is an extension of the previous family (the family by adversarial-embedding iterated), but this time with three agents, all of them neural networks. Here the three agents Agent-Alice, Agent-Bob, and Agent-Eve are present. Note that Agent-Alice and Agent-Bob are "linked" since Agent-Bob is there only to add a constraint on the solution obtained by Agent-Alice. Thus the primary "game" is an antagonistic (or adversarial) game between Agent-Alice and Agent-Eve, whereas the "game" between Agent-Alice and Agent-Bob is rather cooperative, since these two agents share the common purpose of communicating (Agent-Alice and Agent-Bob both want Agent-Bob to be able to extract the message without errors).

Fig. 14.14 from [95] summarizes the principle of the 3-player game. Agent-Alice takes a cover image, a message, and a stego-key and, after a discretization step, generates a stego image. This stego image is used by Agent-Bob to retrieve the message. On the other side, Agent-Eve has to decide whether an image is cover or stego; this agent outputs a score.

FIGURE 14.14 The overall architecture of the 3-player game.

Historically, after MOD and ASO, which only included two players, the premise of the three-player idea appeared in 2016 with the paper of Abadi and Andersen [96]. In this paper, Abadi and Andersen, from Google Brain, proposed a cryptographic toy example of encryption based on the use of three neural networks. The use of neural networks makes it easy to obtain a strategic equilibrium since the problem is expressed as a min–max problem, and its optimization can be carried out by the back-propagation process. Naturally, this 3-player game concept can be transposed to steganography with the use of deep learning.

In December 2017 (GSIVAT; [97]) and in September 2018 (HiDDeN; [98]), two different teams from the machine learning community proposed, at NIPS'2017 and then at ECCV'2018, to achieve strategic embedding thanks to three CNNs, iteratively updated, playing the roles of Agent-Alice, Agent-Bob, and Agent-Eve. These two papers do not rigorously define the concept of the 3-player game, and they contain erroneous assertions, mainly because the security and its evaluation are not correctly handled. If we place ourselves in the standard framework for evaluating the empirical security of an embedding algorithm, that is, with a clairvoyant Eve, then the two approaches are very detectable. The most significant issues with these two papers are: first, neither of the two approaches uses a stego-key, which is equivalent to always using the same key and leads to very detectable schemes [21]; second, there is no discretization of the pixel values issued from Agent-Alice; third, the computational complexity, due to the use of fully connected blocks, leads to impractical approaches; and fourth, the security evaluation is not carried out with a state-of-the-art steganalyzer.

At the beginning of 2019, Yedroudj et al. [95] redefined the 3-player concept by integrating the possibility of using a stego-key, treating the problem of discretization, going through convolution modules to obtain a scalable solution, and using a suitable steganalyzer. The proposition does not yet compare to classical adaptive embedding approaches, but there is real potential in such an approach. The bit error rate is sufficiently small to be nullified, the embedding is done in the textured parts, and security could be improved in the future. As an example, the probability of error of a steganalysis by Yedroudj-Net [8] under equal priors, for a real payload size of 0.3 bpp,16 for images from the BOWS2 database, is 10.8%. This can, for example, be compared to the steganalysis of WOW [70] under the same conditions, which gives an error probability of 22.4%. There is still a security gap, but this approach paves the way for much research. There are still open questions on the link between Agent-Alice and Agent-Bob, on the use of GANs, on the definition of the losses, and on the tuning of the compromises between the different constraints.

16 A Hamming error-correcting code theoretically ensures a null BER for most of the images, and thus a rate of 0.3 bpp for these images.
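To make the cooperative and adversarial structure concrete, here is a toy Python/PyTorch sketch of how the three agents and their losses could be wired together. The three networks are deliberately tiny placeholders, and the losses, their weighting, and the absence of a discretization step are assumptions made only for illustration; they are not the architectures or losses of [95], [97], or [98].

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAlice(nn.Module):
    # Toy embedder: mixes the cover, a message map, and a key map into a stego image.
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 1, kernel_size=3, padding=1)
    def forward(self, cover, message, key):
        return cover + 0.01 * self.net(torch.cat([cover, message, key], dim=1))

class TinyBob(nn.Module):
    # Toy extractor: predicts one bit per pixel from the stego image and the key.
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(2, 1, kernel_size=3, padding=1)
    def forward(self, stego, key):
        return self.net(torch.cat([stego, key], dim=1))

class TinyEve(nn.Module):
    # Toy steganalyzer: outputs a cover/stego score per image.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))
    def forward(self, x):
        return self.net(x)

alice, bob, eve = TinyAlice(), TinyBob(), TinyEve()
cover = torch.randn(4, 1, 32, 32)
message = torch.randint(0, 2, (4, 1, 32, 32)).float()
key = torch.rand(4, 1, 32, 32)

stego = alice(cover, message, key)
loss_bob = F.binary_cross_entropy_with_logits(bob(stego, key), message)   # cooperative term
labels = torch.cat([torch.zeros(4), torch.ones(4)]).long()
loss_eve = F.cross_entropy(eve(torch.cat([cover, stego.detach()])), labels)
# Agent-Alice wants Agent-Bob to decode correctly AND Agent-Eve to label her stego images as covers.
loss_alice = loss_bob + F.cross_entropy(eve(stego), torch.zeros(4).long())

In an actual simulation the three losses would be minimized in alternation (Agent-Eve on loss_eve; Agent-Alice and Agent-Bob on loss_alice and loss_bob), which is the min–max game discussed at the beginning of Section 14.4.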

14.5 Conclusion

In this chapter we have given a nearly complete presentation of deep learning in steganography and steganalysis since its appearance in 2015. We recalled the main elements of a CNN and discussed memory, time complexity, and the practical issues affecting efficiency. We then explored the link between some past approaches and what is currently carried out in a CNN. The various networks proposed until the beginning of 2019, together with their multiple scenarios, were presented. We also touched on the recent approaches to steganography with deep learning. As mentioned in this chapter, many things have not been solved yet, and the major issue is to be able to experiment with more realistic hypotheses, that is, to be more "into the wild". The "holy grail" is the cover-source mismatch and the stego-mismatch, but in a way, mismatch is a problem shared by the whole machine learning community. CNNs are now very present in the steganalysis community, and the next question is probably how to go a step further and produce cleverer networks. Finally, we think and hope that this chapter will help the community to understand what has been done and what the next directions to explore are.

References [1] G.E. Hinton, R.R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313 (5786) (Jul. 2006) 504–507. [2] Y. Bengio, A.C. Courville, P. Vincent, Representation learning: a review and new perspectives, IEEE Transaction on Pattern Analysis and Machine Intelligence, PAMI 35 (8) (2013) 1798–1828. [3] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (May 2015) 436–444. [4] J. Fridrich, J. Kodovsky, Rich models for steganalysis of digital images, IEEE Transactions on Information Forensics and Security, TIFS 7 (3) (June 2012) 868–882. [5] J. Kodovský, J. Fridrich, V. Holub, Ensemble classifiers for steganalysis of digital media, IEEE Transactions on Information Forensics and Security 7 (2) (2012) 432–444. [6] Y. Qian, J. Dong, W. Wang, T. Tan, Deep learning for steganalysis via convolutional neural networks, in: Proceedings of Media Watermarking, Security, and Forensics 2015, MWSF’2015, Part of IS&T/SPIE Annual Symposium on Electronic Imaging, SPIE’2015, San Francisco, California, USA, vol. 9409, Feb. 2015, 94090J. [7] S. Kouider, M. Chaumont, W. Puech, Adaptive steganography by oracle (ASO), in: Proceedings of the IEEE International Conference on Multimedia and Expo, ICME’2013, San Jose, California, USA, Jul. 2013, pp. 1–6. [8] M. Yedroudj, F. Comby, M. Chaumont, Yedrouj-Net: an efficient CNN for spatial steganalysis, in: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP’2018, Calgary, Alberta, Canada, Apr. 2018, pp. 2092–2096. [9] A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks, in: Proceeding of Advances in Neural Information Processing Systems 25, NIPS’2012, Lake Tahoe, Nevada, USA, Curran Associates, Inc., Dec. 2012, pp. 1097–1105. [10] G. Xu, H.Z. Wu, Y.Q. Shi, Structural design of convolutional neural networks for steganalysis, IEEE Signal Processing Letters 23 (5) (May 2016) 708–712. [11] J. Ye, J. Ni, Y. Yi, Deep learning hierarchical representations for image steganalysis, IEEE Transactions on Information Forensics and Security, TIFS 12 (11) (Nov. 2017) 2545–2557.

[12] M. Yedroudj, M. Chaumont, F. Comby, How to augment a small learning set for improving the performances of a CNN-based steganalyzer?, in: Proceedings of Media Watermarking, Security, and Forensics, MWSF’2018, Part of IS&T International Symposium on Electronic Imaging, EI’2018, Burlingame, California, USA, 28 January – 2 February 2018, p. 7. [13] M. Boroumand, M. Chen, J. Fridrich, Deep residual network for steganalysis of digital images, IEEE Transactions on Information Forensics and Security 14 (5) (May 2019) 1181–1193. [14] A.D. Ker, P. Bas, R. Böhme, R. Cogranne, S. Craver, T. Filler, J. Fridrich, T. Pevný, Moving steganography and steganalysis from the laboratory into the real world, in: Proceedings of the 1st ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2013, Montpellier, France, Jun. 2013, pp. 45–58. [15] P. Bas, T. Filler, T. Pevný, ‘Break our steganographic system’: the ins and outs of organizing BOSS, in: Proceedings of 13th International Conference on Information Hiding, IH’2011, in: Lecture Notes in Computer Science, vol. 6958, Springer, Prague, Czech Republic, May 2011, pp. 59–70. [16] P. Bas, T. Furon, BOWS-2 contest (break our watermarking system), organised within the activity of the Watermarking Virtual Laboratory (Wavila) of the European Network of Excellence ECRYPT, 2008, organized between the 17th of July 2007 and the 17th of April 2008, http://bows2.ec-lille.fr/. [17] K. He, X. Zhang, S. Ren, J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, in: Proceedings of the European Conference on Computer Vision, ECCV’2014, Zurich, Switzerland, Sep. 2014, pp. 346–361. [18] C.F. Tsang, J.J. Fridrich, Steganalyzing images of arbitrary size with CNNs, in: Proceedings of Media Watermarking, Security, and Forensics, MWSF’2018, Part of IS&T International Symposium on Electronic Imaging, EI’2018, Burlingame, California, USA, 28 January–2 February, 2018, 121. [19] R. Zhang, F. Zhu, J. Liu, G. Liu, Depth-wise separable convolutions and multi-level pooling for an efficient spatial CNN-based steganalysis (previously named “efficient feature learning and multi-size image steganalysis based on CNN” on ArXiv), IEEE Transactions on Information Forensics and Security, TIFS (2020). [20] S. Tan, B. Li, Stacked convolutional auto-encoders for steganalysis of digital images, in: Proceedings of Signal and Information Processing Association Annual Summit and Conference, APSIPA’2014, Chiang Mai, Thailand, Dec. 2014, pp. 1–4. [21] L. Pibre, J. Pasquet, D. Ienco, M. Chaumont, Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch, in: Proceedings of Media Watermarking, Security, and Forensics, MWSF’2016, Part of I&ST International Symposium on Electronic Imaging, EI’2016, San Francisco, California, USA, Feb. 2016, pp. 1–11. [22] G. Xu, H.-Z. Wu, Y.Q. Shi, Ensemble of CNNs for steganalysis: an empirical study, in: Proceedings of the 4th ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2016, Vigo, Galicia, Spain, Jun. 2016, pp. 103–107. [23] J. Zeng, S. Tan, B. Li, J. Huang, Pre-training via fitting deep neural network to rich-model features extraction procedure and its effect on deep learning for steganalysis, in: Proceedings of Media Watermarking, Security, and Forensics 2017, MWSF’2017, Part of IS&T Symposium on Electronic Imaging, EI’2017, Burlingame, California, USA, Jan. 2017, p. 6. [24] J. Zeng, S. Tan, B. Li, J. 
Huang, Large-scale JPEG image steganalysis using hybrid deep-learning framework, IEEE Transactions on Information Forensics and Security 13 (5) (May 2018) 1200–1214. [25] M. Chen, V. Sedighi, M. Boroumand, J. Fridrich, JPEG-phase-aware convolutional neural network for steganalysis of JPEG images, in: Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2017, Drexel University, Philadelphia, PA, Jun. 2017, pp. 75–84. [26] G. Xu, Deep convolutional neural network to detect J-UNIWARD, in: Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2017, Drexel University, Philadelphia, PA, Jun. 2017, pp. 67–73. [27] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, CVPR’2016, Las Vegas, Nevada, Jun. 2016, pp. 770–778. [28] X. Huang, S. Wang, T. Sun, G. Liu, X. Lin, Steganalysis of adaptive JPEG steganography based on ResDet, in: Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA’2018, Hononulu, Hawaii, vol. 2018, Nov. 2018, pp. 12–15.

[29] T. Denemark, V. Sedighi, V. Holub, R. Cogranne, J. Fridrich, Selection-channel-aware rich model for steganalysis of digital images, in: Proceedings of IEEE International Workshop on Information Forensics and Security, WIFS’2014, Atlanta, Georgia, USA, Dec. 2014, pp. 48–53. [30] T. Denemark, J.J. Fridrich, P.C. Alfaro, Improving selection-channel-aware steganalysis features, in: Proceedings of Media Watermarking, Security, and Forensics, MWSF’2018, Part of IS&T International Symposium on Electronic Imaging, EI’2016, San Francisco, California, USA, Feb. 2016, pp. 1–8. [31] T. Denemark, M. Boroumand, J. Fridrich, Steganalysis features for content-adaptive JPEG steganography, IEEE Transactions on Information Forensics and Security 11 (8) (Aug. 2016) 1736–1746. [32] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, in: Proceeding of International Conference on Learning Representations, ICLR’2015, San Diego, CA, May 2015, p. 12. [33] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR’2015, Boston, MA, USA, Jun. 2015, pp. 1–9. [34] B. Li, W. Wei, A. Ferreira, S. Tan, ReST-net: diverse activation modules and parallel subnets-based CNN for spatial image steganalysis, IEEE Signal Processing Letters 25 (5) (May 2018) 650–654. [35] Y. Qian, J. Dong, W. Wang, T. Tan, Learning and transferring representations for image steganalysis using convolutional neural network, in: Proceedings of IEEE International Conference on Image Processing, ICIP’2016, Phoenix, Arizona, Sep. 2016, pp. 2752–2756. [36] M. Chen, M. Boroumand, J.J. Fridrich, Deep learning regressors for quantitative steganalysis, in: Proceedings of Media Watermarking, Security, and Forensics, MWSF’2018, Part of IS&T International Symposium on Electronic Imaging, EI’2018, Burlingame, California, USA, 28 January–2 February, 2018, 160. [37] J. Kodovský, J.J. Fridrich, Quantitative steganalysis using rich models, in: Proceeding of SPIE Media Watermarking, Security, and Forensics, Part of IS&T/SPIE 23th Annual Symposium on Electronic Imaging, SPIE Proceedings, SPIE’2013, San Francisco, California, USA, vol. 8665, Feb. 2013, pp. 1–11. [38] A. Zakaria, M. Chaumont, G. Subsol, Quantitative and binary steganalysis in JPEG: a comparative study, in: Proceedings of the European Signal Processing Conference, EUSIPCO’2018, Roma, Italy, Sep. 2018, pp. 1422–1426. [39] A. Zakaria, M. Chaumont, G. Subsol, Pooled steganalysis in JPEG: how to deal with the spreading strategy?, in: Proceedings of the IEEE International Workshop on Information Forensics and Security, WIFS’2019, Delft, the Netherlands, Dec. 2019, p. 6. [40] B. Bayar, M.C. Stamm, A deep learning approach to universal image manipulation detection using a new convolutional layer, in: Proceedings of the 4th ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2016, Vigo, Galicia, Spain, Jun. 2016, pp. 5–10. [41] F. Chollet, Xception: deep learning with depthwise separable convolutions, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, CVPR’2017, Honolulu, HI, USA, Jul. 2017, pp. 1800–1807. [42] V. Holub, J. Fridrich, T. Denemark, Universal distortion function for steganography in an arbitrary domain, EURASIP Journal on Information Security, JIS 2014 (1) (2014). [43] B. Li, M. Wang, J. Huang, X. 
Li, A new cost function for spatial image steganography, in: Proceedings of IEEE International Conference on Image Processing, ICIP’2014, Paris, France, Oct. 2014, pp. 4206–4210. [44] B. Li, M. Wang, X. Li, S. Tan, J. Huang, A strategy of clustering modification directions in spatial image steganography, IEEE Transactions on Information Forensics and Security 10 (9) (2015) 1905–1917. [45] D.P. Kingma, J.L. Ba, Adam: a method for stochastic optimization, in: Proceedings of Conference on Learning Representations, ICLR’2015, San Diego, CA, May 2015, p. 13. [46] X. Deng, B. Chen, W. Luo, D. Luo, Fast and effective global covariance pooling network for image steganalysis, in: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2019, Paris, France, Jul. 2019, pp. 230–234. [47] V. Sedighi, J. Fridrich, Effect of imprecise knowledge of the selection channel on steganalysis, in: Proceedings of the 3rd ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2015, Portland, Oregon, USA, 2015, pp. 33–42. [48] V. Holub, J. Fridrich, Low-complexity features for JPEG steganalysis using undecimated DCT, IEEE Transactions on Information Forensics and Security, TIFS 10 (2) (Feb. 2015) 219–228.

[49] C. Xia, Q. Guan, X. Zhao, Z. Xu, Y. Ma, Improving GFR steganalysis features by using Gabor symmetry and weighted histograms, in: Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2017, Philadelphia, Pennsylvania, USA, 2017, pp. 55–66. [50] V. Holub, J. Fridrich, Phase-aware projection model for steganalysis of JPEG images, in: Proceedings of SPIE Media Watermarking, Security, and Forensics 2015, Part of IS&T/SPIE Annual Symposium on Electronic Imaging, SPIE’2015, San Francisco, California, USA, vol. 9409, Feb. 2015, p. 11. [51] J. Huang, J. Ni, L. Wan, J. Yan, A customized convolutional neural network with low model complexity for JPEG steganalysis, in: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2019, Paris, France, Jul. 2019, pp. 198–203. [52] Q. Gibouloto, R. Cogranne, P. Bas, Steganalysis into the wild: How to define a source?, in: Proceedings of Media Watermarking, Security, and Forensics, MWSF’2018, Part of IS&T International Symposium on Electronic Imaging, EI’2018, Burlingame, California, USA, 28 January–2 February, 2018. [53] D. Borghys, P. Bas, H. Bruyninckx, Facing the cover-source mismatch on JPHide using training-set design, in: Proceedings of the 6th ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2018, Innsbruck, Austria, Jun. 2018, pp. 17–22. [54] R. Cogranne, Q. Giboulot, P. Bas, The ALASKA steganalysis challenge: a first step towards steganalysis, in: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2019, Paris, France, Jul. 2019, pp. 125–137. [55] G. Cancelli, G.J. Doërr, M. Barni, I.J. Cox, A comparative study of +/−1 steganalyzers, in: Proceedings of Workshop Multimedia Signal Processing, MMSP’2008, Cairns, Queensland, Australia, Oct. 2008, pp. 791–796. [56] I. Lubenko, A.D. Ker, Steganalysis with mismatched covers: Do simple classifiers help?, in: Proceedings of the 14th ACM Multimedia and Security Workshop, MM&Sec’2008, MM&Sec’2012, Coventry, United Kingdom, Sep. 2012, pp. 11–18. [57] I. Lubenko, A.D. Ker, Going from small to large data in steganalysis, in: Proceedings of Media Watermarking, Security, and Forensics III, Part of IS&T/SPIE 22th Annual Symposium on Electronic Imaging, SPIE’2012, San Francisco, California, USA, vol. 8303, Feb. 2012. [58] J. Pasquet, S. Bringay, M. Chaumont, Steganalysis with cover-source mismatch and a small learning database, in: Proceedings of the 22nd European Signal Processing Conference, EUSIPCO’2014, Lisbon, Portugal, Sep. 2014, pp. 2425–2429. [59] J. Butora, J.J. Fridrich, Detection of diversified stego sources with CNNs, in: Proceedings of Media Watermarking, Security, and Forensics, MWSF’2019, Part of IS&T International Symposium on Electronic Imaging, EI’2019, Burlingame, California, USA, Jan. 2019, 534. [60] Y. Yousfi, J. Butora, J. Fridrich, Q. Giboulot, Breaking ALASKA: color separation for steganalysis in JPEG domain, in: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec’2019, Paris, France, Jul. 2019, pp. 138–149. [61] X. Li, X. Kong, B. Wang, Y. Guo, X. You, Generalized transfer component analysis for mismatched JPEG steganalysis, in: Proceedings of IEEE International Conference on Image Processing, ICIP’2013, Melbourne, Australia, Sep. 2013, pp. 4432–4436. [62] X. Kong, C. Feng, M. Li, Y. Guo, Iterative multi-order feature alignment for JPEG mismatched steganalysis, Journal of Neurocomputing 214 (C) (Nov. 2016) 458–470. [63] C. 
Feng, X.W. Kong, M. Li, Y. Yang, Y. Guo, Contribution-based feature transfer for JPEG mismatched steganalysis, in: Proceedings of IEEE International Conference on Image Processing, ICIP’2017, Beijing, China, Sep. 2017, pp. 500–504. [64] D. Lerch-Hostalot, D. Megías, Unsupervised steganalysis based on artificial training sets, Engineering Applications of Artificial Intelligence 50 (C) (Apr. 2016) 45–59. [65] M.A. Koçak, D. Ramirez, E. Erkip, D. Shasha, SafePredict: a meta-algorithm for machine learning that uses refusals to guarantee correctness, arXiv:1708.06425, 2017. [Online]. Available: http://arxiv.org/ abs/1708.06425. [66] G.J. Simmons, The subliminal channel and digital signatures, in: Proceeding of Crypto’83, Santa Barbara, CA, Plenum Press, New York, Aug. 1983, pp. 51–67. [67] P. Schöttle, R. Böhme, A game-theoretic approach to content-adaptive steganography, in: Proceedings of the 14th International Conference on Information Hiding, IH’12, Berkeley, CA, USA, vol. 7692, 2012, pp. 125–141.




Index A Activation, 324 function, 269 Actors suspected guilty, 306 suspicious, 307, 308, 311, 312 Adaptive steganography, 7, 34, 68 methods, 68, 100, 126, 127 Add or subtract one (AoSO) operation, 277 Add-sub-based quotient value differencing (ASQVD), 83 Additive noise steganalysis, 263 Advanced steganographic algorithms, 296 Agents, 335 Algorithms Cao, Zhao, and Feng, 277 embedding, 22, 45, 107, 315, 317, 334–336, 341, 343 Kancherla and Mukkamala, 276 linguistic steganography, 282 MoViSteg, 276 Sadat, Faez, and Saffari, 279 steganographic, 11, 260, 262, 264, 268–270, 272, 273, 275, 277, 280, 281, 309, 310, 315 advanced, 296 strategic adaptive embedding, 336 Wang, Cao, and Zhao, 278 Wang, Zhao, and Hongxia, 277 Zarmehi and Ali, 278 Amount distortion, 9 of communication, 255 of data, 29, 170 of embedded bits, 197 of information, 159–161, 218, 233 existing edge, 29

maximum, 178 of key, 251 of noise, 255 of variation, 182 Ancilla qubits, 223, 227 Anomaly detection, 314, 315 Approach holistic, 334 Approximate entropy, 160 Asymptotic relative efficiency (ARE), 276 Atomistic, 334 Attacker, 65, 68, 75, 77, 124, 141, 159, 161, 167, 295 Audio applications, 272 devices, 272 files, 270–272, 275 formats, 275 signal, 65, 270–272, 274 denoised, 271 steganalysis, 270, 272, 275 methods, 271 Automatic steganography, 286 Average rank, 310 B Batch normalization (BN), 265 layer, 269 Batch steganography, 302, 329 Binary bits, 43, 46, 47, 53, 87, 90, 138 data, 87 mark, 167 classification task, 266 classifiers, 139 secret data, 43, 46 steganalysis scenario, 324 Binary symmetric channel (BSC), 219, 224, 239


Bit error rate (BER), 76 Bits binary, 43, 46, 47, 53, 87, 90, 138 data, 82, 142 distribution, 160 embedding, 44 hidden, 199 secret, 25 LSB, 41, 44, 81, 100 message, 158, 160, 263, 296 per byte (BPB), 82, 91 per pixel (BPP), 8, 29, 54, 56, 115, 147, 160, 178, 260, 339 quantum, 252 secret, 6, 7, 25, 29, 43, 44, 46, 47, 50, 51, 53, 84, 85, 109, 110, 126, 142, 196, 200, 202, 208, 210 data, 18, 47, 83 message, 5, 152 selection, 154, 159 Blind extraction, 126, 166 steganalysis, 11, 123 Block, 324 candidate, 132, 136–138 for embedding, 130 code, 232, 244, 245 diagram, 8, 47, 123, 124 indexes, 299, 301 noncandidate, 130, 131, 133 nonoverlapping, 6, 42, 43, 47, 107, 113, 127 pairs, 101 pixels, 34, 42, 59 preprocessing, 324, 332, 333 searching, 107 size, 20, 152, 153, 155, 245 transmitted, 245, 251 C Calibrated cooccurrence features, 299 least squares (CLS), 316 methods, 271 Candidate block, 132, 136–138 for embedding, 130

Cao, Zhao, and Feng algorithm, 277 Carrier, 1, 99, 155, 156, 159, 162, 184 data, 145, 146, 152–154, 158, 159 image, 7 message, 159 Catalan numbers, 145–148, 150, 152, 158, 161, 162 decomposition, 150, 152 in data hiding, 152 in steganography, 146 Catalan stego key, 156 Center of mass (COM), 262 Central pixel, 82, 95, 138 Challenges facing steganography, 4 Changeable pixels, 192 Channel Alice, 247 insecure, 295 Ciphertext, 215 Clairvoyant steganalysis, 339 Classic steganalysis attacks, 147 Classification, 123, 124, 157, 260, 261, 264, 275, 280, 284, 286, 321, 322 accuracy, 266, 269, 273, 275 blocks, 330 decision, 325 error probability, 140 image steganography, 5, 12 methods, 127 module, 265, 266, 323–325 percentages, 263 performances, 334 probability error, 142 process, 141 stage, 280 steganographic images, 263 task, 260 thresholds, 141 Classifier, 125, 127, 140, 141, 157, 262, 263, 271–273, 278, 282, 321, 334, 337, 338 SVM, 280, 283 Clustering ensemble, 310–313 CNNs, 261, 264–266, 270, 273, 283, 323, 326, 327, 329, 344 multiclassifier, 334


steganalysis, 332 Code blocks, 232, 244, 245 Code tree units (CTU), 279 Color images, 18, 26, 32, 34, 35, 37, 68, 74, 100, 139, 142, 264, 265 histogram, 35 steganography, 18 pixel, 82 Complex wavelet transform (CWT), 8 Compressed audio formats, 272 cover, 171 cover frames, 170–172 Compression codec, 7 Consecutive pixels, 57, 147 Constituent pixels, 93 Convolution, 324 blocks, 323, 324, 330, 331 module, 321, 323, 324 Convolutional neural network (CNN), 261, 273, 283, 321, 322 Corner pixels, 82 Correctable errors, 216, 217, 244, 245 Cov-Pool, 331 Cover, 271, 295 file, 166 Cover/stego images, 323 Coverless image steganography, 68 Covert communication, 5, 217, 239, 255 quantum, 217 Covertext, 215, 218–220, 223, 247, 249 communication, 217 encoded, 218, 223, 247, 250, 253 qubit, 225 state, 240 Criteria stopping, 71 Cryptography, 17, 41, 65, 81, 99, 123, 146, 189, 215, 216, 218, 233 quantum, 216 D Data, 280 bits, 82, 142


carriers, 159, 162 embedding, 5, 35, 41, 108, 109, 115, 160 secret, 18, 29, 34, 35, 37, 39, 184, 189 extraction, 85, 87, 156 hidden, 3, 26, 41, 82, 92, 127, 158, 204, 276 secret, 32, 195 hiding, 2, 17, 37, 41, 42, 44, 57, 81, 99–101, 104, 145–147, 161, 166, 184, 278 capacity, 81 in video, 167 method, 39, 127 principals, 167 schemes, 115, 167 secret, 123 technique, 17, 82, 166, 184 DCT coefficients, 116, 167, 299–301, 309 block, 300 features, 299, 300 Decimal value, 43, 46, 47, 49–51, 54, 82–85, 87, 89 Decomposition of the Catalan numbers, 150 Decompressed JPEG image, 299 Deep Belief Network (DBN), 272 Deep learning, 321, 322, 327, 329, 335, 337, 343, 344 in steganalysis, 344 in steganography, 344 libraries, 324 methods, 325, 326 networks, 322 technique, 210 Denoised audio signals, 271 Dense layer optimization, 269 Depolarizing channel (DC), 219, 221, 222, 226, 228, 229, 231, 236–239, 242, 243, 251, 252 Depthwise separable convolutions, 330 Detection accuracy, 11, 139, 272, 275, 306 algorithm, 295 anomaly, 314, 315 features, 124 HOG, 25



methods, 262, 263, 306, 313 model, 11 outlier, 304, 307, 308, 317 performance, 310, 313 rates, 264 Diamond norm, 233–238 distance, 233, 234, 236, 242 Difference histogram, 10 for embedding, 100 mask, 169 methods, 69 pixels, 92 stego, 44 value, 42, 45, 82, 87, 107 Difference Expansion (DE), 191 Different expansion, 100 Differential evolution, 70 Digital Video Surveillance (DVS), 165 Discrete cosine transform (DCT), 8, 68, 100, 102, 166, 171, 184, 260, 298 Discrete cosine transform residual (DCTR), 11 Discrete Fourier transform (DFT), 8 Discrete wavelet transform (DWT), 8, 100, 260 Discriminating network, 337 Discrimination function (DF), 59 Discriminative features, 271, 273 Disjoint pixel blocks, 83, 95 Distortion, 7, 9, 31, 35, 41, 67, 68, 81, 91, 92, 120, 181, 191, 210 amount, 9 artifacts, 4 embedding, 67, 68, 115 maximum, 210 parameters, 68 stego, 182 image, 6 Distribution bit, 160 Double Tanh, 340 E Eavesdropper, 215–217, 233, 246, 249, 253, 255, 256 Eve, 218, 240, 257

Ebits, 224, 239 EDH data embedding, 108 histogram, 109, 114 Effective embedding, 128 Embeddable payload, 303 pixel, 195 Embedding algorithm, 22, 45, 107, 315, 317, 334–336, 341, 343 approach, 342 artifacts, 124, 126, 132, 139 bits, 25, 44 changes, 299, 316, 317 probability, 333 cost, 337 data, 5, 35, 41, 108, 109, 115, 160 bits, 29 direction, 18, 26, 27 distortion, 67, 68, 115 effective, 128 efficiency, 124 fake, 128, 132, 136 function type, 123 LSB, 11, 37 messages, 262 method, 66, 126, 139 OSteg, 127, 139 performance, 120 phase, 175 POIs payload, 37 procedure, 43, 49, 83, 85, 104, 113, 190, 192, 195, 199, 200, 207 process, 10, 66, 67, 69, 71, 101, 127, 128, 166, 167, 174, 175, 278 rate, 59, 115, 272, 273 scheme, 126, 139, 142, 209 secret data, 18, 29, 34, 35, 37, 39, 41, 184, 189 information, 1, 29 message, 18, 27, 35, 101 steganographic, 307 steganography, 68


strategies, 310 unit, 10 video, 166 Embedding capacity (EC), 116 Embedding direction histogram (EDH), 101, 104, 105 Encoded covertext, 218, 223, 247, 250, 253 message, 246 steganographic, 218 payload, 123 Ensemble classifiers, 126, 140–142, 272 clustering, 310–313 Entanglement transmission, 253 Entropy approximate, 160 Error block (EB), 44, 56 Error syndromes, 217, 221–225, 229, 232, 239–241, 243, 245, 248, 249 quantum, 216 Estimate unbiased, 305 Exploiting modification direction (EMD), 5, 6, 41, 82, 201 Extracted binary data stream (EBDS), 87, 89 Extraction, 46, 73, 87, 89, 90, 175, 190, 197, 208, 336 algorithms, 25, 39, 45, 46, 336 data, 85, 87, 156 secret, 18 features, 124, 277, 280 method, 66 module, 175 network, 339 phases, 18 problem, 47, 132 procedure, 42, 44, 47, 53, 85, 87, 189, 190, 192, 193, 201, 204–206, 208 process, 71, 73, 138, 166, 175, 176, 185, 193, 327 PVD, 43 side, 47, 54 time, 44


F Fake embedding, 128, 132, 136 Fall off boundary problem (FOBP), 42, 82, 127 Features calibrated cooccurrence, 299 detection, 124 extraction, 124, 280 histogram, 300 interblock, 302 map, 265, 266, 322, 324, 325, 330, 331, 333 Markov, 299 preprocessed, 314 raw, 306, 315, 316 steganalysis, 11, 316 stego, 11 text, 285 vectors, 10, 275, 283, 299, 305–307, 309, 312, 314, 315, 332, 333 weak, 313, 317 Fisher linear discriminant (FLD), 140, 263 Float pixel values, 171 FOBP, 42, 46, 47, 54, 56, 58, 82, 83, 127 Focal pixel, 134, 137 Foreground pixels, 169 Forward DCT (FDCT), 102 Frames difference formula, 169 stego, 175, 182, 184 video, 166, 167, 169, 170, 278 Frequency histogram, 105 G Game simulation, 341 Gaussian mixture model (GMM), 273 Generative adversarial network (GAN), 286 Generic QECCs, 253 Global average pooling, 325 Graphia, 215 Groups of pictures (GOP), 279 Guilty actor, 296, 297, 302, 306–310, 312, 317 H HEVC video steganalysis, 279 Hidden bits, 199



undetectability, 127 content, 157 data, 3, 26, 41, 82, 92, 127, 158, 204, 276 information, 1, 3, 4, 17, 99, 114, 159, 166, 194, 231, 233, 238, 239, 259, 275, 276, 279, 283, 295 layer, 273 message, 10, 32, 35, 37, 114, 145, 146, 157, 162, 178, 218, 233, 260–263, 270–273, 275, 276, 278, 281–283, 321 detection, 281 qubits, 231, 232 secret bits, 25 classical messages, 216 data, 32, 147, 195 data bits, 7, 18, 37 information, 255 message, 18 stegotext, 217 video, 185 Hiding capacity, 6, 35, 39, 41–43, 54, 59, 62, 127, 147, 178, 192, 197, 199, 201, 205, 209–211 data, 2, 17, 37, 41, 42, 44, 57, 81, 100, 101, 104, 145–147, 161, 166, 184, 278 details, 165 equation, 192 information, 65, 99, 142, 155, 158, 166, 191, 215 mechanism, 202 messages, 145, 216, 281, 338 in audio, 270 methods, 202 module, 161 payload, 191, 194, 197, 199–201, 205, 209, 211 privacy in video, 167 information, 167 procedure, 189, 205 process, 171, 172, 184 rule, 202 secret data, 18, 123, 202, 210

data bits, 7 information, 1 message, 1, 120 secrete data, 199 stego qubits, 222 strategy, 200, 202 technique, 166 video, 167 Histogram, 10 analysis, 263 difference, 10 features, 300 frequency, 105 sorted, 105 patterns, 125 PVD, 35 shifting, 100, 101, 108, 120 Histogram characteristic function (HCF), 262 Histogram of oriented gradient (HOG), 18, 19 algorithm, 18, 21, 35 detection, 25 edge, 39 Holistic approach, 334 HS scheme, 194, 195, 197, 198, 200 Human visual system (HVS), 5, 33, 298 I Ikeda system, 126, 134–138 Image generator network, 337 Image synthesis, 338 Improved Ye-Net, 269 Incorrect data extraction, 42, 45 Industrial steganography, 259 Information embedding secret, 1, 29 hidden, 1, 3, 4, 17, 99, 114, 159, 166, 194, 231, 233, 238, 239, 259, 275, 276, 279, 283, 295 hiding, 65, 99, 142, 155, 158, 166, 191, 215 secret, 1 payload, 191 secret, 3, 4, 6, 17, 18, 65, 67–69, 100, 113, 160, 219, 222, 238, 255, 276, 277 security, 17, 123, 134


Initialization population, 70 Innocent covertext, 240 message, 216 Insecure channel, 295 Integer Wavelet Transform (IWT), 8 Interblock, 301 dependency, 299 features, 302 joint, 302 Intrablock, 301, 302 joint density matrices, 301 features, 301 Inverse discrete cosine transform (IDCT), 102, 174 Irreversible steganography techniques, 100 J Joint distortion, 68 features, 301 interblock, 302 Joint Photographic Experts Group (JPEG), 8, 102, 261, 264, 269, 297, 298, 333, 341 CNN, 332 compression, 297–299, 326 domain, 264, 268 files, 11 images, 297–299, 309, 310, 312, 332 decompressed, 299 steganalysis, 11 steganalysis, 299, 301, 309, 313, 321, 326, 329, 330, 332, 333 features, 298 steganographic systems, 301 steganography, 11 K Kancherla and Mukkamala algorithm, 276 L Latent space, 322 Layer, 324


batch normalization, 269 Least significant bit (LSB), 5, 6, 41, 68, 69, 81, 100, 109, 124, 127, 138, 147, 166, 260, 262, 270 Linear Bayes normal classifier, 263 Linguistic steganography algorithms, 282 techniques, 285 Local outlier factor (LOF), 304, 308 detection, 310, 314 Logical qubits, 217 Lossless steganography method, 190 Lower bound (LB), 81 LSB, 81, 158 alteration steganography, 81 bits, 41, 44, 81, 100 embedding, 11, 37 steganalysis, 262, 263 steganography, 262 matching, 41, 83, 202, 262 steganalysis, 262, 263 steganographic algorithm, 273 steganography, 69 substitution, 7, 27, 41, 42, 44, 54, 62, 69, 81, 93 Luminance DCT coefficients, 299 M Macroscopic features, 299 Map features, 322, 324 Markov features, 299, 301 Matrix embedding, 309 Maximum amount, 178 covariance transformation, 316 distortion, 210 Maximum mean discrepancy (MMD), 305 Mean square error (MSE), 9, 30, 115 Message bits, 158, 160, 263, 296 permutation, 158, 160 carrier, 159 embedding secret, 18, 27, 35, 101 encoded, 246



hidden, 10, 32, 35, 37, 114, 145, 146, 157, 178, 218, 233, 260–263, 270–273, 275, 276, 278, 281–283, 321 secret, 1, 18, 120 length, 302, 303 quantum, 216, 218, 234 qubits, 223 secret, 1–3, 18, 66, 99, 124, 145, 166, 189, 215, 262, 295 steganographic, 215, 218, 234 stego, 3, 262 suspicious, 259 Metadetection method, 10 Methods calibrated, 271 metadetection, 10 noncalibrated, 271 Modified discrete cosine transform (MDCT), 272 Modules, 323 convolution, 323, 324 preprocessing, 323 Most significant bits (MSB), 82 Motion vectors (MV), 276 Moving Pictures Expert Group (MPEG), 279 MoViSteg algorithm, 276 Mutation, 70 MV steganography, 278 N Naive adaptive steganography, 335 Naive Bayes classifier (NBC), 263 Neighbor mean interpolation (NMI), 207 Neighboring pixels, 82, 83, 85, 87, 95, 134, 137, 206, 210 Network discriminating, 337 image generator, 337 neural, 10, 261, 264, 273, 276, 277, 285, 321, 322, 337, 342 Yedroudj-Net, 324 Neural networks, 10, 261, 264, 273, 276, 277, 285, 321, 322, 337, 342 Noise quantity, 271 Noisy stegoimage, 77

Nonadditive distortion steganography, 68 Noncalibrated methods, 271 Noncandidate blocks, 130, 131, 133 Noncompressed audio formats, 271 Nondegenerate QECCs, 240, 249, 251, 256 Nonembeddable pixel, 205 Noninvertible steganography, 100 Nonoverlapping blocks, 6, 42, 43, 47, 107, 113, 127 Normalization, 324 Normalized correlation (NC), 182 Not-side-channel-aware (Not-SCA) scenario, 329 O Object detection, 21 Offline scenario, 334 Online scenario, 334 Operations, 324 Optimization dense layer, 269 task rate-distortion, 296 Optimizer, 269 Optimum adaptive steganography, 335 Ordinary least square (OLS), 316 Original least significant bit (OLSB), 192 OSteg, 125, 139–142 adjustable, 142 classification accuracy, 139 embedding, 127 outperforms, 125, 142 Otsu clustering, 128 Outlier detection, 304, 307, 308, 317 P Parzen classifier, 263 Payload, 4, 11, 37, 115, 116, 125, 139, 199, 260, 264, 268, 296, 302, 310, 317, 335, 340 capacity, 8 encoded, 123 hiding, 191, 194, 197, 199–201, 205, 209, 211 information, 191 secret, 302 size, 315, 325, 330, 339


steganographic approaches, 4 PDH analysis, 54, 61, 82, 92 Peak-signal-to-noise ratio (PSNR), 9, 30, 74, 83, 115, 127, 178, 198 Phase converting stego object, 155 of Catalan number decomposition, 152, 153 definition, 152, 153 of comparison, 152, 154 of converting, 152 of generating secret message, 156 of modification, 156 of selection, 152, 153, 156 of stego-key analysis, 155 Phase-split, 326 Pixel difference histogram (PDH), 35, 42, 61, 81 Pixel value differencing (PVD), 5, 6, 18, 41, 42, 81, 147, 210 Pixels block, 34, 42, 59, 125, 127, 128, 142 blue, 28 BOSS, 340 database, 339 boundary, 200, 211 central, 82, 95, 138 changeable, 192 color, 82 consecutive, 57, 147 constituent, 93 corner, 82 difference, 92 difference histogram, 10, 35, 39, 61, 81 analysis, 10 embeddable, 195 focal, 134, 137 foreground, 169 gray values, 72 indicator, 41 indices, 200 intensity, 169, 195 neighboring, 82, 83, 85, 87, 95, 134, 137, 206, 210 nonembeddable, 205 overlapping, 147


pair, 46, 47, 192 shifted, 195, 210 size, 22, 329 stego image, 10 target, 81 underflow, 62 value adjusting feature, 206 virtual, 206–208, 211 Pixels-of-interest (POIs), 20, 22, 27–29 Pixels-of-noninterest (PONIs), 28 Plaintext, 215 Pooled steganalysis, 302, 317, 329 steganalyzers, 302 Pooling, 324 operation, 265, 266, 325 Population initialization, 70 Prediction unit (PU), 279 Preparation, 128 Preprocessing, 128, 280, 306, 307, 310, 323, 326, 330, 340 block, 324, 332, 333 features, 314 filter banks, 329 layer, 265, 266 weights, 268 methods, 306 module, 323, 330 operation, 313 phase, 126 procedure, 310 stage, 310 Pretreatment, 128 Principal component transformation (PCT), 316 Principle components analysis (PCA), 315 Privacy hiding methods, 167 Privacy information, 166, 167, 170 frames, 168 hiding, 167 Probabilistic features, 284 Probability embedding change, 333 Probable stegoimage, 71



PSNR calculation, 91 value, 29, 34, 54, 92, 179, 182, 198 PU partition, 279, 280 PVD embedding procedure, 43 extraction, 43 histogram, 35 steganography, 18, 81 methods, 42 Q QDCT coefficients, 101, 104, 107–109 Quadratic Bayes normal classifier, 263 Qualitative difference, 256 Quality factor (QF), 264, 309 Quantitative steganalysis, 264, 321, 329 Quantization hiding technique, 169, 171 Quantization parameter (QP), 170 Quantized DCT (QDCT), 102 coefficients, 301 histogram, 309 Quantum bits, 252 covert communication, 217 cryptography, 216 error syndromes, 216 message, 216, 218 steganographic communication, 216 protocols, 216, 219 steganography, 216, 217, 219, 256, 257 protocol, 218, 233 Quantum error-correcting code (QECC), 216, 217, 220, 240 Qubits ancilla, 223, 227 hidden, 231, 232 message, 223 secret, 229, 232, 242 steganographic, 226, 229, 232 stego, 220, 221, 225–229, 231, 249 teleport, 239 without errors, 242

R Random bits, 240, 245, 255 Rate-distortion optimization task, 296 Raw features, 306, 315, 316 Receiver operating characteristic (ROC), 139 Recombination, 70 Recurrent Neural Networks (RNN), 275, 284 Redundant space transfer (RST), 209 ReLU, 265, 269, 332 activation, 269, 275 function, 269 layer, 274 Residual Neural Networks (ResNets), 273 Restricted Boltzmann machines (RBM), 272 Resultant classes, 130 stego frame distortion, 179 video, 167, 170 Reversible data hiding, 103 Reversible steganography scheme (RSS), 189 Rich models (RM), 321 RS analysis, 37, 54, 59, 82, 147 attack, 42, 59, 127, 147 S S-ResNet, 274 Sadat, Faez, and Saffari algorithm, 279 Scenario not-side-channel-aware (Not-SCA), 329 offline, 334 online, 334 side-channel-aware (SCA), 329 Secrecy, 233, 246 Secret, 1, 3, 246 bits, 6, 7, 25, 29, 43, 44, 46, 47, 50, 51, 53, 84, 85, 109, 110, 126, 142, 196, 200, 202, 208, 210 extraction, 69 string, 110 classical message, 216 communication, 3, 216–218, 250, 252, 253, 256, 295, 336 amount, 217, 232


data bits, 18, 47, 83 embedding, 41 format, 67 hiding, 18, 147, 202, 210 digit, 7, 100, 104, 106, 111, 114 image, 66, 68, 69, 71–73, 75, 77, 166 information, 3, 4, 6, 17, 18, 65, 67–69, 100, 113, 160, 219, 222, 238, 255, 276, 277 amount, 217, 238 integers, 104 key, 65, 68, 126, 167, 218, 239, 245, 251, 295, 339 amount, 219, 222 bits, 221, 224, 239, 246 message, 1–3, 18, 66, 99, 124, 145, 166, 189, 215, 262, 295 bits, 5, 152 concealing, 100 payload, 302 quantum messages, 216 qubits, 221, 229, 232, 242 transmission, 229 Secret binary data stream (SBDS), 88 Security, 35, 61, 65, 82, 146, 157, 191, 201, 218, 219, 222, 233, 246 analysis, 59, 146, 158 challenges, 81 check, 54, 59, 61 communication, 17 evaluation, 343 improvement, 127 information, 17, 123, 134 level, 325 reasons, 145 Selection, 70 Separator plans, 322 Sequence frames, 169 video, 169 Sequential frames, 169 Shifted pixels, 195, 210 value, 195 Shortcut connection, 326 Side channel aware (SCA), 332 scenario, 329


Side match (SM), 83 Side-channel map, 333 Signature steganalysis, 261, 262 Singular value decomposition (SVD), 100, 260 Sorted frequency histogram, 105 Spatial pyramid pooling (SPP), 268, 325, 330 Spatial rich model (SRM), 11 Spread spectrum (SS), 278 steganalysis, 263, 264 Square root law, 325 SRNet, 265, 269, 331, 333 network, 331 SSIM metrics, 32, 33 values, 32, 33, 56 Standard cryptography, 216 Standard deviation (SD), 182 Statistical moments extractor, 325 Statistical steganalysis, 262 Steganalysis, 4, 123, 146, 157, 159, 261, 295, 296, 321, 325, 335, 337 additive noise, 263 algorithms, 281 approaches, 261 attacks, 125 audio, 270, 272, 275 clairvoyant, 339 detection attacks, 4 expert, 302, 308 features, 11, 301, 316 JPEG, 299, 301, 309, 313, 321, 326, 329, 330, 332, 333 methods, 11, 68, 139, 146 problem, 127, 315 procedure, 271 researchers, 266, 270 results, 326 scheme, 266, 273, 275 signature, 261, 262 spread-spectrum, 263, 264 statistical, 262 targeted, 123 against OSteg, 142 techniques, 4, 157, 263, 276 text, 281–283, 285



transform domain, 263, 264 universal, 10, 11 video, 275–277, 281 Steganalyst, 32, 35, 123, 124, 127, 268, 334, 335 Steganalyzer, 300, 317, 334, 335 Steganographer, 124, 268, 302, 307, 311, 314, 317, 334, 335 candidate, 311 identification, 317 identification problem, 296 Steganographic algorithm, 11, 260, 262, 264, 268–270, 272, 273, 275, 277, 280, 281, 309, 310, 315 approaches, 5, 12, 124, 141 communication, 123, 219, 222, 239, 240, 246 embedding, 307 encoding, 220, 222–224, 229, 234, 239, 244, 247, 249 message, 215, 218, 234, 242 method, 4, 10, 42, 125, 142, 145, 216 noise, 264–266, 268, 310 protocol, 218–220, 223, 233, 246, 250, 251 secret, 246 qubits, 226, 229, 232 Steganography, 17 adaptive, 7, 34, 68 algorithms, 35 automatic, 286 embedding, 68 formulation, 259 industrial, 259 JPEG, 11 LSB, 69 mechanism, 95 method, 10, 12, 29, 62, 66–69, 77, 82, 100, 191, 277, 280 naive adaptive, 335 optimum adaptive, 335 process, 260, 266, 268 PVD, 18, 81 quantum, 216, 217, 219, 256, 257 scheme, 8, 82, 83, 189, 191 strategic adaptive, 335 system, 100 technique, 4, 82, 189, 259, 275

technology, 191 text, 281 Steganos, 215 Steghide, 271 Stego, 271, 295 class, 325 color images, 33 difference, 44 distortion, 182 features, 11 frames, 175, 182, 184 gradient, 262 image, 3, 4, 25, 160, 263, 299, 335, 342 distortion, 6 pixels, 10 quality, 8, 18 key, 145, 146, 152, 154, 155, 158, 160 security analysis, 158 message, 3, 262 mismatch, 334 object, 123, 124, 132, 139, 141, 155, 156, 295, 296 qubits, 220, 221, 225–229, 231, 244, 249 signal, 271 texts, 283 version, 139 videos, 276, 280 Stegotext, 215 Stopping criteria, 71 Strategic adaptive embedding algorithm, 336 equilibrium, 336 steganography, 335 embedding, 343 equilibrium, 342 Structural similarity index measurement, 9, 30, 75 Structural similarity index metric, 54, 56 Subtractive pixel adjacency matrix (SPAM), 11 Supernetwork, 330 Support vector machines (SVM), 263 Suspected guilty actors, 306 Suspicious actor, 307, 308, 311, 312


SVM classifier, 280, 283 Synonym substitution (SS), 282 Synthesis, 338 T TanH, 265, 269 Target payload, 341 pixel, 81 Targeted steganalysis, 123 against OSteg, 142 Teleport qubits, 239 Term bagging stems, 140 Text characteristics, 281 features, 285 file, 2, 281 steganalysis, 281–283, 285 algorithms, 285 steganography, 281 techniques, 281 watermarking, 2 Textual messages, 166 Texture, 262, 279 3-player game, 335, 343 Time and space complexity for Catalan keys, 158 Trade secrets, 3, 4 Transform domain steganalysis, 263, 264 steganography methods, 100 Transformation unit (TU), 279 Transmission entanglement, 253 Transmit important messages, 259 secret messages, 233 Transmitted block, 245, 251 Truncated linear unit (TLU), 332 Twirling, 220–222, 224, 227, 233, 239 operation, 223, 224 procedure, 221

Typical errors, 222 U Ultra HD videos, 279 Unbiased estimate, 305 Underflow pixels, 62 Undetectability, 4, 69, 127, 142 Undetectable payload, 125 Universal steganalysis, 10, 11 Unsupervised classification, 335 Upper bound (UB), 81 V Video coding standard, 279 compression concept, 166, 167 encoder, 167 technique, 166 covers, 169, 276, 277 embedding, 166 files, 166, 276 frames, 166, 167, 170, 278 hidden, 185 hiding, 167 quality metrics, 167 samples, 277 sequences, 276–278, 280 frames, 169 steganalysis, 275–277, 281 methods, 276 surveillance, 165–167 tracking, 170 Video Code Expert Group (VCEG), 279 Virtual pixel, 206–208, 211 W Wang, Cao, and Zhao algorithm, 278 Wang, Zhao, and Hongxia algorithm, 277 Watermarking, 17, 65 Wavelet obtained weights (WOW), 125 Weak features, 313, 317 Word2text, 282




X
Xu-Net, 326

Y
Ye-Net, 324; improved, 269
Yedroudj-Net network, 324

Z
Zarmehi and Ali algorithm, 278
Zhu-Net, 268